diff --git a/CHANGELOG.md b/CHANGELOG.md
index 4e1c1d59b99..950bdc7e374 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -1,3 +1,196 @@
+## ClickHouse release 20.7
+
+### ClickHouse release v20.7.2.30-stable, 2020-08-31
+
+#### Backward Incompatible Change
+
+* Function `modulo` (operator `%`) with at least one floating point number as an argument will calculate the remainder of division directly on floating point numbers, without converting both arguments to integers. This makes the behaviour compatible with most DBMS. It is also applicable to the Date and DateTime data types. Added alias `mod`. This closes [#7323](https://github.com/ClickHouse/ClickHouse/issues/7323). [#12585](https://github.com/ClickHouse/ClickHouse/pull/12585) ([alexey-milovidov](https://github.com/alexey-milovidov)).
+* Deprecate special printing of zero Date/DateTime values as `0000-00-00` and `0000-00-00 00:00:00`. [#12442](https://github.com/ClickHouse/ClickHouse/pull/12442) ([alexey-milovidov](https://github.com/alexey-milovidov)).
+* The function `groupArrayMoving*` was not working for distributed queries. Its result was calculated with an incorrect data type (without promotion to the largest type). The function `groupArrayMovingAvg` returned an integer number that was inconsistent with the `avg` function. This fixes [#12568](https://github.com/ClickHouse/ClickHouse/issues/12568). [#12622](https://github.com/ClickHouse/ClickHouse/pull/12622) ([alexey-milovidov](https://github.com/alexey-milovidov)).
+* Add sanity check for MergeTree settings. If the settings are incorrect, the server will refuse to start or to create a table, printing a detailed explanation to the user. [#13153](https://github.com/ClickHouse/ClickHouse/pull/13153) ([alexey-milovidov](https://github.com/alexey-milovidov)).
+* Protect from the cases when a user may set `background_pool_size` to a value lower than `number_of_free_entries_in_pool_to_execute_mutation` or `number_of_free_entries_in_pool_to_lower_max_size_of_merge`. In these cases ALTERs won't work, or the maximum size of merge will be too limited. An exception explaining what to do will be thrown. This closes [#10897](https://github.com/ClickHouse/ClickHouse/issues/10897). [#12728](https://github.com/ClickHouse/ClickHouse/pull/12728) ([alexey-milovidov](https://github.com/alexey-milovidov)).
+
+#### New Feature
+
+* Polygon dictionary type that provides efficient "reverse geocoding" lookups - to find the region by coordinates in a dictionary of many polygons (world map). It uses a carefully optimized algorithm with recursive grids to maintain low CPU and memory usage. [#9278](https://github.com/ClickHouse/ClickHouse/pull/9278) ([achulkov2](https://github.com/achulkov2)).
+* Added support for LDAP authentication of preconfigured users ("Simple Bind" method). [#11234](https://github.com/ClickHouse/ClickHouse/pull/11234) ([Denis Glazachev](https://github.com/traceon)).
+* Introduce setting `alter_partition_verbose_result` which outputs information about touched parts for some types of `ALTER TABLE ... PARTITION ...` queries (currently `ATTACH` and `FREEZE`). Closes [#8076](https://github.com/ClickHouse/ClickHouse/issues/8076). [#13017](https://github.com/ClickHouse/ClickHouse/pull/13017) ([alesapin](https://github.com/alesapin)).
+* Add `bayesAB` function for Bayesian A/B testing. [#12327](https://github.com/ClickHouse/ClickHouse/pull/12327) ([achimbab](https://github.com/achimbab)).
+* Added `system.crash_log` table into which stack traces for fatal errors are collected. This table should be empty. [#12316](https://github.com/ClickHouse/ClickHouse/pull/12316) ([alexey-milovidov](https://github.com/alexey-milovidov)).
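+  A quick way to check it (illustrative query; the table name is the one introduced by this entry):
+  ```sql
+  SELECT * FROM system.crash_log; -- expected to return zero rows on a healthy server
+  ```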
+* Added HTTP headers `X-ClickHouse-Database` and `X-ClickHouse-Format` which may be used to set the default database and output format. [#12981](https://github.com/ClickHouse/ClickHouse/pull/12981) ([hcz](https://github.com/hczhcz)).
+* Add `minMap` and `maxMap` functions support to `SimpleAggregateFunction`. [#12662](https://github.com/ClickHouse/ClickHouse/pull/12662) ([Ildus Kurbangaliev](https://github.com/ildus)).
+* Add setting `allow_non_metadata_alters` which restricts execution of `ALTER` queries that modify data on disk. Disabled by default. Closes [#11547](https://github.com/ClickHouse/ClickHouse/issues/11547). [#12635](https://github.com/ClickHouse/ClickHouse/pull/12635) ([alesapin](https://github.com/alesapin)).
+* A function `formatRow` is added to support turning arbitrary expressions into a string via a given format. It's useful for manipulating SQL outputs and is quite versatile combined with the `columns` function. [#12574](https://github.com/ClickHouse/ClickHouse/pull/12574) ([Amos Bird](https://github.com/amosbird)).
+* Add `FROM_UNIXTIME` function for compatibility with MySQL, related to [#12149](https://github.com/ClickHouse/ClickHouse/issues/12149). [#12484](https://github.com/ClickHouse/ClickHouse/pull/12484) ([flynn](https://github.com/ucasFL)).
+* Allow Nullable types as keys in MergeTree tables if the `allow_nullable_key` table setting is enabled. Closes [#5319](https://github.com/ClickHouse/ClickHouse/issues/5319). [#12433](https://github.com/ClickHouse/ClickHouse/pull/12433) ([Amos Bird](https://github.com/amosbird)).
+* Integration with [COS](https://intl.cloud.tencent.com/product/cos). [#12386](https://github.com/ClickHouse/ClickHouse/pull/12386) ([fastio](https://github.com/fastio)).
+* Add `mapAdd` and `mapSubtract` functions for adding/subtracting key-mapped values. [#11735](https://github.com/ClickHouse/ClickHouse/pull/11735) ([Ildus Kurbangaliev](https://github.com/ildus)).
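+  A small illustration of `mapAdd` (values are summed per key; the arguments here are arbitrary):
+  ```sql
+  SELECT mapAdd(([toUInt8(1), 2], [10, 10]), ([toUInt8(1), 3], [5, 5])) AS res;
+  -- res = ([1, 2, 3], [15, 10, 5])
+  ```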
+
+#### Bug Fix
+
+* Fix premature `ON CLUSTER` timeouts for queries that must be executed on a single replica. Fixes [#6704](https://github.com/ClickHouse/ClickHouse/issues/6704), [#7228](https://github.com/ClickHouse/ClickHouse/issues/7228), [#13361](https://github.com/ClickHouse/ClickHouse/issues/13361), [#11884](https://github.com/ClickHouse/ClickHouse/issues/11884). [#13450](https://github.com/ClickHouse/ClickHouse/pull/13450) ([alesapin](https://github.com/alesapin)).
+* Fix crash in mark inclusion search introduced in https://github.com/ClickHouse/ClickHouse/pull/12277. [#14225](https://github.com/ClickHouse/ClickHouse/pull/14225) ([Amos Bird](https://github.com/amosbird)).
+* Fix race condition in external dictionaries with cache layout which can lead to a server crash. [#12566](https://github.com/ClickHouse/ClickHouse/pull/12566) ([alesapin](https://github.com/alesapin)).
+* Fix visible data clobbering by the progress bar in the client in interactive mode. This fixes [#12562](https://github.com/ClickHouse/ClickHouse/issues/12562), [#13369](https://github.com/ClickHouse/ClickHouse/issues/13369), [#13584](https://github.com/ClickHouse/ClickHouse/issues/13584) and [#12964](https://github.com/ClickHouse/ClickHouse/issues/12964). [#13691](https://github.com/ClickHouse/ClickHouse/pull/13691) ([alexey-milovidov](https://github.com/alexey-milovidov)).
+* Fixed incorrect sorting order for `LowCardinality` columns when ORDER BY multiple columns is used. This fixes [#13958](https://github.com/ClickHouse/ClickHouse/issues/13958). [#14223](https://github.com/ClickHouse/ClickHouse/pull/14223) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
+* Removed a hardcoded timeout, which wrongly overruled the `query_wait_timeout_milliseconds` setting for cache-dictionary. [#14105](https://github.com/ClickHouse/ClickHouse/pull/14105) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
+* Fixed wrong mount point in extra info for `Poco::Exception: no space left on device`. [#14050](https://github.com/ClickHouse/ClickHouse/pull/14050) ([tavplubix](https://github.com/tavplubix)).
+* Fix wrong query optimization of select queries with the `DISTINCT` keyword when subqueries also have `DISTINCT` and the `optimize_duplicate_order_by_and_distinct` setting is enabled. [#13925](https://github.com/ClickHouse/ClickHouse/pull/13925) ([Artem Zuikov](https://github.com/4ertus2)).
+* Fixed potential deadlock when renaming a `Distributed` table. [#13922](https://github.com/ClickHouse/ClickHouse/pull/13922) ([tavplubix](https://github.com/tavplubix)).
+* Fix incorrect sorting for `FixedString` columns when ORDER BY multiple columns is used. Fixes [#13182](https://github.com/ClickHouse/ClickHouse/issues/13182). [#13887](https://github.com/ClickHouse/ClickHouse/pull/13887) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
+* Fix potentially lower precision of `topK`/`topKWeighted` aggregations (with non-default parameters). [#13817](https://github.com/ClickHouse/ClickHouse/pull/13817) ([Azat Khuzhin](https://github.com/azat)).
+* Fix reading from a MergeTree table with an INDEX of type SET, which failed when the index was compared against NULL. This fixes [#13686](https://github.com/ClickHouse/ClickHouse/issues/13686). [#13793](https://github.com/ClickHouse/ClickHouse/pull/13793) ([Amos Bird](https://github.com/amosbird)).
+* Fix step overflow in function `range()`. [#13790](https://github.com/ClickHouse/ClickHouse/pull/13790) ([Azat Khuzhin](https://github.com/azat)).
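+  For reference, the three-argument (stepped) form this fix concerns (illustrative):
+  ```sql
+  SELECT range(0, 10, 3) AS r; -- r = [0, 3, 6, 9]
+  ```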
+* Fixed `Directory not empty` error when concurrently executing `DROP DATABASE` and `CREATE TABLE`. [#13756](https://github.com/ClickHouse/ClickHouse/pull/13756) ([alexey-milovidov](https://github.com/alexey-milovidov)).
+* Add range check for the `h3KRing` function. This fixes [#13633](https://github.com/ClickHouse/ClickHouse/issues/13633). [#13752](https://github.com/ClickHouse/ClickHouse/pull/13752) ([alexey-milovidov](https://github.com/alexey-milovidov)).
+* Fix race condition between DETACH and background merges. Parts may revive after detach. This is a continuation of [#8602](https://github.com/ClickHouse/ClickHouse/issues/8602) that did not fix the issue but introduced a test that started to fail in very rare cases, demonstrating the issue. [#13746](https://github.com/ClickHouse/ClickHouse/pull/13746) ([alexey-milovidov](https://github.com/alexey-milovidov)).
+* Fix logging of Settings.Names/Values when `log_queries_min_type` is greater than `QUERY_START`. [#13737](https://github.com/ClickHouse/ClickHouse/pull/13737) ([Azat Khuzhin](https://github.com/azat)).
+* Fix incorrect message in `clickhouse-server.init` while checking user and group. [#13711](https://github.com/ClickHouse/ClickHouse/pull/13711) ([ylchou](https://github.com/ylchou)).
+* Do not optimize `any(arrayJoin())` to `arrayJoin()` under `optimize_move_functions_out_of_any`. [#13681](https://github.com/ClickHouse/ClickHouse/pull/13681) ([Azat Khuzhin](https://github.com/azat)).
+* Fixed possible deadlock in concurrent `ALTER ... REPLACE/MOVE PARTITION ...` queries. [#13626](https://github.com/ClickHouse/ClickHouse/pull/13626) ([tavplubix](https://github.com/tavplubix)).
+* Fixed the behaviour when the cache-dictionary sometimes returned the default value instead of the value present in the source. [#13624](https://github.com/ClickHouse/ClickHouse/pull/13624) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
+* Fix secondary indices corruption in compact parts (compact parts are an experimental feature). [#13538](https://github.com/ClickHouse/ClickHouse/pull/13538) ([Anton Popov](https://github.com/CurtizJ)).
+* Fix wrong code in function `netloc`. This fixes [#13335](https://github.com/ClickHouse/ClickHouse/issues/13335). [#13446](https://github.com/ClickHouse/ClickHouse/pull/13446) ([alexey-milovidov](https://github.com/alexey-milovidov)).
+* Fix error in the `parseDateTimeBestEffort` function when a unix timestamp was passed as an argument. This fixes [#13362](https://github.com/ClickHouse/ClickHouse/issues/13362). [#13441](https://github.com/ClickHouse/ClickHouse/pull/13441) ([alexey-milovidov](https://github.com/alexey-milovidov)).
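+  The kind of call affected (illustrative):
+  ```sql
+  SELECT parseDateTimeBestEffort('1568650811') AS dt; -- a string holding a unix timestamp now parses into a DateTime
+  ```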
+* Fix invalid return type for comparison of tuples with `NULL` elements. Fixes [#12461](https://github.com/ClickHouse/ClickHouse/issues/12461). [#13420](https://github.com/ClickHouse/ClickHouse/pull/13420) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
+* Fix wrong optimization that caused the `aggregate function any(x) is found inside another aggregate function in query` error with `SET optimize_move_functions_out_of_any = 1` and aliases inside `any()`. [#13419](https://github.com/ClickHouse/ClickHouse/pull/13419) ([Artem Zuikov](https://github.com/4ertus2)).
+* Fix possible race in `StorageMemory`. [#13416](https://github.com/ClickHouse/ClickHouse/pull/13416) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
+* Fix empty output for the `Arrow` and `Parquet` formats in case the query returns zero rows. It was done because empty output is not valid for these formats. [#13399](https://github.com/ClickHouse/ClickHouse/pull/13399) ([hcz](https://github.com/hczhcz)).
+* Fix select queries with constant columns and a prefix of the primary key in the `ORDER BY` clause. [#13396](https://github.com/ClickHouse/ClickHouse/pull/13396) ([Anton Popov](https://github.com/CurtizJ)).
+* Fix `PrettyCompactMonoBlock` for clickhouse-local. Fix extremes/totals with `PrettyCompactMonoBlock`. Fixes [#7746](https://github.com/ClickHouse/ClickHouse/issues/7746). [#13394](https://github.com/ClickHouse/ClickHouse/pull/13394) ([Azat Khuzhin](https://github.com/azat)).
+* Fixed deadlock in system.text_log. [#12452](https://github.com/ClickHouse/ClickHouse/pull/12452) ([alexey-milovidov](https://github.com/alexey-milovidov)). It is a part of [#12339](https://github.com/ClickHouse/ClickHouse/issues/12339). This fixes [#12325](https://github.com/ClickHouse/ClickHouse/issues/12325). [#13386](https://github.com/ClickHouse/ClickHouse/pull/13386) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
+* Fixed `File(TSVWithNames*)` (the header was written multiple times), fixed `clickhouse-local --format CSVWithNames*` (lacked the header, broken after [#12197](https://github.com/ClickHouse/ClickHouse/issues/12197)), fixed `clickhouse-local --format CSVWithNames*` with zero rows (lacked the header). [#13343](https://github.com/ClickHouse/ClickHouse/pull/13343) ([Azat Khuzhin](https://github.com/azat)).
+* Fix segfault when the function `groupArrayMovingSum` deserializes empty state. Fixes [#13339](https://github.com/ClickHouse/ClickHouse/issues/13339). [#13341](https://github.com/ClickHouse/ClickHouse/pull/13341) ([alesapin](https://github.com/alesapin)).
+* Throw an error on use of the `arrayJoin()` function in a `JOIN ON` section. [#13330](https://github.com/ClickHouse/ClickHouse/pull/13330) ([Artem Zuikov](https://github.com/4ertus2)).
+* Fix crash in `LEFT ASOF JOIN` with `join_use_nulls=1`. [#13291](https://github.com/ClickHouse/ClickHouse/pull/13291) ([Artem Zuikov](https://github.com/4ertus2)).
+* Fix possible error `Totals having transform was already added to pipeline` in case of a query from a delayed replica. [#13290](https://github.com/ClickHouse/ClickHouse/pull/13290) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
+* The server could crash if a user passed specially crafted arguments to the function `h3ToChildren`. This fixes [#13275](https://github.com/ClickHouse/ClickHouse/issues/13275). [#13277](https://github.com/ClickHouse/ClickHouse/pull/13277) ([alexey-milovidov](https://github.com/alexey-milovidov)).
+* Fix potentially low performance and slightly incorrect result for `uniqExact`, `topK`, `sumDistinct` and similar aggregate functions called on Float types with `NaN` values. It also triggered an assert in debug build. This fixes [#12491](https://github.com/ClickHouse/ClickHouse/issues/12491). [#13254](https://github.com/ClickHouse/ClickHouse/pull/13254) ([alexey-milovidov](https://github.com/alexey-milovidov)).
+* Fix assertion in KeyCondition when the primary key contains an expression with a monotonic function and the query contains a comparison with a constant whose type is different. This fixes [#12465](https://github.com/ClickHouse/ClickHouse/issues/12465). [#13251](https://github.com/ClickHouse/ClickHouse/pull/13251) ([alexey-milovidov](https://github.com/alexey-milovidov)).
+* Return the passed number for numbers with MSB set in the function roundUpToPowerOfTwoOrZero(). It prevents potential errors in case of overflow of array sizes. [#13234](https://github.com/ClickHouse/ClickHouse/pull/13234) ([Azat Khuzhin](https://github.com/azat)).
+* Fix function `if` with a Nullable constexpr condition that is not literal NULL. Fixes [#12463](https://github.com/ClickHouse/ClickHouse/issues/12463). [#13226](https://github.com/ClickHouse/ClickHouse/pull/13226) ([alexey-milovidov](https://github.com/alexey-milovidov)).
+* Fix assert in the `arrayElement` function in case array elements are Nullable and the array subscript is also Nullable. This fixes [#12172](https://github.com/ClickHouse/ClickHouse/issues/12172). [#13224](https://github.com/ClickHouse/ClickHouse/pull/13224) ([alexey-milovidov](https://github.com/alexey-milovidov)).
+* Fix DateTime64 conversion functions with constant argument. [#13205](https://github.com/ClickHouse/ClickHouse/pull/13205) ([Azat Khuzhin](https://github.com/azat)).
+* Fix parsing row policies from users.xml when names of databases or tables contain dots. This fixes [#5779](https://github.com/ClickHouse/ClickHouse/issues/5779) and [#12527](https://github.com/ClickHouse/ClickHouse/issues/12527). [#13199](https://github.com/ClickHouse/ClickHouse/pull/13199) ([Vitaly Baranov](https://github.com/vitlibar)).
+* Fix access to a `redis` dictionary after the connection was dropped once. It may happen with `cache` and `direct` dictionary layouts. [#13082](https://github.com/ClickHouse/ClickHouse/pull/13082) ([Anton Popov](https://github.com/CurtizJ)).
+* Fix wrong index analysis with functions. It could lead to some data parts being skipped when reading from `MergeTree` tables. Fixes [#13060](https://github.com/ClickHouse/ClickHouse/issues/13060). Fixes [#12406](https://github.com/ClickHouse/ClickHouse/issues/12406). [#13081](https://github.com/ClickHouse/ClickHouse/pull/13081) ([Anton Popov](https://github.com/CurtizJ)).
+* Fix error `Cannot convert column because it is constant but values of constants are different in source and result` for remote queries which use functions that are deterministic in the scope of a query, but not deterministic between queries, like `now()`, `now64()`, `randConstant()`. Fixes [#11327](https://github.com/ClickHouse/ClickHouse/issues/11327). [#13075](https://github.com/ClickHouse/ClickHouse/pull/13075) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
+* Fix crash which was possible for queries with `ORDER BY` tuple and small `LIMIT`. Fixes [#12623](https://github.com/ClickHouse/ClickHouse/issues/12623). [#13009](https://github.com/ClickHouse/ClickHouse/pull/13009) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
+* Fix `Block structure mismatch` error for queries with `UNION` and `JOIN`. Fixes [#12602](https://github.com/ClickHouse/ClickHouse/issues/12602). [#12989](https://github.com/ClickHouse/ClickHouse/pull/12989) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
+* Corrected `merge_with_ttl_timeout` logic which did not work well when expiration affected more than one partition over one time interval. (Authored by @excitoon). [#12982](https://github.com/ClickHouse/ClickHouse/pull/12982) ([Alexander Kazakov](https://github.com/Akazz)).
+* Fix columns duplication for a range hashed dictionary created from a DDL query. This fixes [#10605](https://github.com/ClickHouse/ClickHouse/issues/10605). [#12857](https://github.com/ClickHouse/ClickHouse/pull/12857) ([alesapin](https://github.com/alesapin)).
+* Fix unnecessary limiting of the number of threads for selects from a local replica. [#12840](https://github.com/ClickHouse/ClickHouse/pull/12840) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
+* Fix rare bug when `ALTER DELETE` and `ALTER MODIFY COLUMN` queries are executed simultaneously as a single mutation. The bug led to an incorrect amount of rows in `count.txt` and, as a consequence, incorrect data in the part. Also, fix a small bug with simultaneous `ALTER RENAME COLUMN` and `ALTER ADD COLUMN`. [#12760](https://github.com/ClickHouse/ClickHouse/pull/12760) ([alesapin](https://github.com/alesapin)).
+* Fix wrong credentials being used when the `clickhouse` dictionary source is used to query remote tables. [#12756](https://github.com/ClickHouse/ClickHouse/pull/12756) ([sundyli](https://github.com/sundy-li)).
+* Fix `CAST(Nullable(String), Enum())`. [#12745](https://github.com/ClickHouse/ClickHouse/pull/12745) ([Azat Khuzhin](https://github.com/azat)).
+* Fix performance with large tuples, which are interpreted as functions in the `IN` section. The case when a user writes `WHERE x IN tuple(1, 2, ...)` instead of `WHERE x IN (1, 2, ...)` for some obscure reason. [#12700](https://github.com/ClickHouse/ClickHouse/pull/12700) ([Anton Popov](https://github.com/CurtizJ)).
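+  The affected pattern (table and column are hypothetical):
+  ```sql
+  SELECT count() FROM t WHERE x IN tuple(1, 2, 3); -- now performs like the usual form below
+  SELECT count() FROM t WHERE x IN (1, 2, 3);
+  ```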
+* Fix memory tracking for input_format_parallel_parsing (by attaching the thread to the group). [#12672](https://github.com/ClickHouse/ClickHouse/pull/12672) ([Azat Khuzhin](https://github.com/azat)).
+* Fix wrong optimization `optimize_move_functions_out_of_any=1` in case of `any(func())`. [#12664](https://github.com/ClickHouse/ClickHouse/pull/12664) ([Artem Zuikov](https://github.com/4ertus2)).
+* Fixed bloom filter index with const expression. This fixes [#10572](https://github.com/ClickHouse/ClickHouse/issues/10572). [#12659](https://github.com/ClickHouse/ClickHouse/pull/12659) ([Winter Zhang](https://github.com/zhang2014)).
+* Fix SIGSEGV in StorageKafka when the broker is unavailable (and not only then). [#12658](https://github.com/ClickHouse/ClickHouse/pull/12658) ([Azat Khuzhin](https://github.com/azat)).
+* Add support for function `if` with `Array(UUID)` arguments. This fixes [#11066](https://github.com/ClickHouse/ClickHouse/issues/11066). [#12648](https://github.com/ClickHouse/ClickHouse/pull/12648) ([alexey-milovidov](https://github.com/alexey-milovidov)).
+* CREATE USER IF NOT EXISTS now doesn't throw an exception if the user exists. This fixes [#12507](https://github.com/ClickHouse/ClickHouse/issues/12507). [#12646](https://github.com/ClickHouse/ClickHouse/pull/12646) ([Vitaly Baranov](https://github.com/vitlibar)).
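+  For example (the user name is arbitrary):
+  ```sql
+  CREATE USER IF NOT EXISTS alice;
+  CREATE USER IF NOT EXISTS alice; -- previously threw an exception, now a no-op
+  ```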
+* Exception `There is no supertype...` could be thrown during `ALTER ... UPDATE` in unexpected cases (e.g. when subtracting from a UInt64 column). This fixes [#7306](https://github.com/ClickHouse/ClickHouse/issues/7306). This fixes [#4165](https://github.com/ClickHouse/ClickHouse/issues/4165). [#12633](https://github.com/ClickHouse/ClickHouse/pull/12633) ([alexey-milovidov](https://github.com/alexey-milovidov)).
+* Fix possible `Pipeline stuck` error for queries with external sorting. Fixes [#12617](https://github.com/ClickHouse/ClickHouse/issues/12617). [#12618](https://github.com/ClickHouse/ClickHouse/pull/12618) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
+* Fix error `Output of TreeExecutor is not sorted` for `OPTIMIZE DEDUPLICATE`. Fixes [#11572](https://github.com/ClickHouse/ClickHouse/issues/11572). [#12613](https://github.com/ClickHouse/ClickHouse/pull/12613) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
+* Fix the issue when an alias on the result of the function `any` can be lost during query optimization. [#12593](https://github.com/ClickHouse/ClickHouse/pull/12593) ([Anton Popov](https://github.com/CurtizJ)).
+* Remove data for Distributed tables (blocks from async INSERTs) on DROP TABLE. [#12556](https://github.com/ClickHouse/ClickHouse/pull/12556) ([Azat Khuzhin](https://github.com/azat)).
+* Now ClickHouse will recalculate checksums for parts when the file `checksums.txt` is absent. Broken since [#9827](https://github.com/ClickHouse/ClickHouse/issues/9827). [#12545](https://github.com/ClickHouse/ClickHouse/pull/12545) ([alesapin](https://github.com/alesapin)).
+* Fix bug which led to broken old parts after an `ALTER DELETE` query when `enable_mixed_granularity_parts=1`. Fixes [#12536](https://github.com/ClickHouse/ClickHouse/issues/12536). [#12543](https://github.com/ClickHouse/ClickHouse/pull/12543) ([alesapin](https://github.com/alesapin)).
+* Fixed race condition in live view tables which could cause data duplication. LIVE VIEW is an experimental feature. [#12519](https://github.com/ClickHouse/ClickHouse/pull/12519) ([vzakaznikov](https://github.com/vzakaznikov)).
+* Fix backwards compatibility in the binary format of `AggregateFunction(avg, ...)` values. This fixes [#12342](https://github.com/ClickHouse/ClickHouse/issues/12342). [#12486](https://github.com/ClickHouse/ClickHouse/pull/12486) ([alexey-milovidov](https://github.com/alexey-milovidov)).
+* Fix crash in JOIN with a dictionary when we are joining over an expression of the dictionary key: `t JOIN dict ON expr(dict.id) = t.id`. Disable dictionary join optimisation for this case. [#12458](https://github.com/ClickHouse/ClickHouse/pull/12458) ([Artem Zuikov](https://github.com/4ertus2)).
+* Fix overflow when a very large LIMIT or OFFSET is specified. This fixes [#10470](https://github.com/ClickHouse/ClickHouse/issues/10470). This fixes [#11372](https://github.com/ClickHouse/ClickHouse/issues/11372). [#12427](https://github.com/ClickHouse/ClickHouse/pull/12427) ([alexey-milovidov](https://github.com/alexey-milovidov)).
+* Kafka: fix SIGSEGV if there is a message with an error in the middle of the batch. [#12302](https://github.com/ClickHouse/ClickHouse/pull/12302) ([Azat Khuzhin](https://github.com/azat)).
+
+#### Improvement
+
+* Keep a smaller amount of logs in ZooKeeper. Avoid excessive growth of ZooKeeper nodes in case of offline replicas when having many servers/tables/inserts. [#13100](https://github.com/ClickHouse/ClickHouse/pull/13100) ([alexey-milovidov](https://github.com/alexey-milovidov)).
+* Now exceptions are forwarded to the client if an error happens during ALTER or mutation. Closes [#11329](https://github.com/ClickHouse/ClickHouse/issues/11329). [#12666](https://github.com/ClickHouse/ClickHouse/pull/12666) ([alesapin](https://github.com/alesapin)).
+* Add `QueryTimeMicroseconds`, `SelectQueryTimeMicroseconds` and `InsertQueryTimeMicroseconds` to `system.events`, along with system.metrics, processes, query_log, etc. [#13028](https://github.com/ClickHouse/ClickHouse/pull/13028) ([ianton-ru](https://github.com/ianton-ru)).
+* Added `SelectedRows` and `SelectedBytes` to `system.events`, along with system.metrics, processes, query_log, etc. [#12638](https://github.com/ClickHouse/ClickHouse/pull/12638) ([ianton-ru](https://github.com/ianton-ru)).
+* Added `current_database` information to `system.query_log`. [#12652](https://github.com/ClickHouse/ClickHouse/pull/12652) ([Amos Bird](https://github.com/amosbird)).
+* Allow `TabSeparatedRaw` as input format. [#12009](https://github.com/ClickHouse/ClickHouse/pull/12009) ([hcz](https://github.com/hczhcz)).
+* Now `joinGet` supports multi-key lookup. [#12418](https://github.com/ClickHouse/ClickHouse/pull/12418) ([Amos Bird](https://github.com/amosbird)).
+* Allow `*Map` aggregate functions to work on Arrays with NULLs. Fixes [#13157](https://github.com/ClickHouse/ClickHouse/issues/13157). [#13225](https://github.com/ClickHouse/ClickHouse/pull/13225) ([alexey-milovidov](https://github.com/alexey-milovidov)).
+* Avoid overflow in parsing of DateTime values that would lead to a negative unix timestamp in their timezone (for example, `1970-01-01 00:00:00` in Moscow). Saturate to zero instead. This fixes [#3470](https://github.com/ClickHouse/ClickHouse/issues/3470). This fixes [#4172](https://github.com/ClickHouse/ClickHouse/issues/4172). [#12443](https://github.com/ClickHouse/ClickHouse/pull/12443) ([alexey-milovidov](https://github.com/alexey-milovidov)).
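+  The case from this entry (illustrative):
+  ```sql
+  SELECT toDateTime('1970-01-01 00:00:00', 'Europe/Moscow') AS t; -- saturates to unix timestamp 0 instead of overflowing
+  ```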
+* AvroConfluent: skip Kafka tombstone records and support skipping broken records. [#13203](https://github.com/ClickHouse/ClickHouse/pull/13203) ([Andrew Onyshchuk](https://github.com/oandrew)).
+* Fix wrong error for long queries. It was possible to get a syntax error other than `Max query size exceeded` for a correct query. [#13928](https://github.com/ClickHouse/ClickHouse/pull/13928) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
+* Fix data race in the `lgamma` function. This race was caught only in `tsan`, no side effects really happened. [#13842](https://github.com/ClickHouse/ClickHouse/pull/13842) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
+* Fix 'Week' interval formatting for ATTACH/ALTER/CREATE QUOTA statements. [#13417](https://github.com/ClickHouse/ClickHouse/pull/13417) ([vladimir-golovchenko](https://github.com/vladimir-golovchenko)).
+* Now broken parts are also reported when encountered in compact part processing. Compact parts are an experimental feature. [#13282](https://github.com/ClickHouse/ClickHouse/pull/13282) ([Amos Bird](https://github.com/amosbird)).
+* Fix assert in `geohashesInBox`. This fixes [#12554](https://github.com/ClickHouse/ClickHouse/issues/12554). [#13229](https://github.com/ClickHouse/ClickHouse/pull/13229) ([alexey-milovidov](https://github.com/alexey-milovidov)).
+* Fix assert in `parseDateTimeBestEffort`. This fixes [#12649](https://github.com/ClickHouse/ClickHouse/issues/12649). [#13227](https://github.com/ClickHouse/ClickHouse/pull/13227) ([alexey-milovidov](https://github.com/alexey-milovidov)).
+* Minor optimization in Processors/PipelineExecutor: breaking out of a loop because it makes sense to do so. [#13058](https://github.com/ClickHouse/ClickHouse/pull/13058) ([Mark Papadakis](https://github.com/markpapadakis)).
+* Support TRUNCATE table without the TABLE keyword. [#12653](https://github.com/ClickHouse/ClickHouse/pull/12653) ([Winter Zhang](https://github.com/zhang2014)).
+* Fix EXPLAIN query format being overwritten by default. This fixes [#12432](https://github.com/ClickHouse/ClickHouse/issues/12432). [#12541](https://github.com/ClickHouse/ClickHouse/pull/12541) ([BohuTANG](https://github.com/BohuTANG)).
+* Allow setting JOIN kind and type in a more standard way: `LEFT SEMI JOIN` instead of `SEMI LEFT JOIN`. For now both are correct. [#12520](https://github.com/ClickHouse/ClickHouse/pull/12520) ([Artem Zuikov](https://github.com/4ertus2)).
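+  Both spellings are accepted (the tables are hypothetical):
+  ```sql
+  SELECT t1.id FROM t1 LEFT SEMI JOIN t2 ON t1.id = t2.id; -- standard-style spelling
+  SELECT t1.id FROM t1 SEMI LEFT JOIN t2 ON t1.id = t2.id; -- older spelling, still works
+  ```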
+* Change the default value of `multiple_joins_rewriter_version` to 2. It enables the new multiple-joins rewriter that knows about column names. [#12469](https://github.com/ClickHouse/ClickHouse/pull/12469) ([Artem Zuikov](https://github.com/4ertus2)).
+* Add several metrics for requests to S3 storages. [#12464](https://github.com/ClickHouse/ClickHouse/pull/12464) ([ianton-ru](https://github.com/ianton-ru)).
+* Use the correct default secure port for clickhouse-benchmark with the `--secure` argument. This fixes [#11044](https://github.com/ClickHouse/ClickHouse/issues/11044). [#12440](https://github.com/ClickHouse/ClickHouse/pull/12440) ([alexey-milovidov](https://github.com/alexey-milovidov)).
+* Rollback insertion errors in `Log`, `TinyLog`, `StripeLog` engines. In previous versions an insertion error led to an inconsistent table state (this works as documented, and it is normal for these table engines). This fixes [#12402](https://github.com/ClickHouse/ClickHouse/issues/12402). [#12426](https://github.com/ClickHouse/ClickHouse/pull/12426) ([alexey-milovidov](https://github.com/alexey-milovidov)).
+* Implement `RENAME DATABASE` and `RENAME DICTIONARY` for the `Atomic` database engine. - Add an implicit `{uuid}` macro, which can be used in the ZooKeeper path for `ReplicatedMergeTree`. It works with `CREATE ... ON CLUSTER ...` queries. Set `show_table_uuid_in_table_create_query_if_not_nil` to `true` to use it. - Make `ReplicatedMergeTree` engine arguments optional, `/clickhouse/tables/{uuid}/{shard}/` and `{replica}` are used by default. Closes [#12135](https://github.com/ClickHouse/ClickHouse/issues/12135). - Minor fixes. - These changes break backward compatibility of the `Atomic` database engine. Previously created `Atomic` databases must be manually converted to the new format. Atomic database is an experimental feature. [#12343](https://github.com/ClickHouse/ClickHouse/pull/12343) ([tavplubix](https://github.com/tavplubix)).
+* Separated `AWSAuthV4Signer` into a different logger, removed excessive `AWSClient: AWSClient` from log messages. [#12320](https://github.com/ClickHouse/ClickHouse/pull/12320) ([Vladimir Chebotarev](https://github.com/excitoon)).
+* Better exception message in disk access storage. [#12625](https://github.com/ClickHouse/ClickHouse/pull/12625) ([alesapin](https://github.com/alesapin)).
+* Better exception for function `in` with an invalid number of arguments. [#12529](https://github.com/ClickHouse/ClickHouse/pull/12529) ([Anton Popov](https://github.com/CurtizJ)).
+* Fix error message about adaptive granularity. [#12624](https://github.com/ClickHouse/ClickHouse/pull/12624) ([alesapin](https://github.com/alesapin)).
+* Fix SETTINGS parsing after FORMAT. [#12480](https://github.com/ClickHouse/ClickHouse/pull/12480) ([Azat Khuzhin](https://github.com/azat)).
+* If a MergeTree table did not contain ORDER BY or PARTITION BY, it was possible to request ALTER to CLEAR all the columns and such ALTER would get stuck. Fixed [#7941](https://github.com/ClickHouse/ClickHouse/issues/7941). [#12382](https://github.com/ClickHouse/ClickHouse/pull/12382) ([alexey-milovidov](https://github.com/alexey-milovidov)).
+* Avoid re-loading completion from the history file after each query (to avoid history overlaps with other client sessions). [#13086](https://github.com/ClickHouse/ClickHouse/pull/13086) ([Azat Khuzhin](https://github.com/azat)).
+
+#### Performance Improvement
+
+* Lower memory usage for some operations up to 2 times. [#12424](https://github.com/ClickHouse/ClickHouse/pull/12424) ([alexey-milovidov](https://github.com/alexey-milovidov)).
+* Optimize PK lookup for queries that match an exact PK range. [#12277](https://github.com/ClickHouse/ClickHouse/pull/12277) ([Ivan Babrou](https://github.com/bobrik)).
+* Slightly optimize very short queries with `LowCardinality`. [#14129](https://github.com/ClickHouse/ClickHouse/pull/14129) ([Anton Popov](https://github.com/CurtizJ)).
+* Slightly improve performance of aggregation by UInt8/UInt16 keys. [#13091](https://github.com/ClickHouse/ClickHouse/pull/13091) and [#13055](https://github.com/ClickHouse/ClickHouse/pull/13055) ([alexey-milovidov](https://github.com/alexey-milovidov)).
+* Push down the `LIMIT` step for query plan (inside subqueries). [#13016](https://github.com/ClickHouse/ClickHouse/pull/13016) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
+* Parallel primary key lookup and skipping index stages on parts, as described in [#11564](https://github.com/ClickHouse/ClickHouse/issues/11564). [#12589](https://github.com/ClickHouse/ClickHouse/pull/12589) ([Ivan Babrou](https://github.com/bobrik)).
+* Convert String-type arguments of functions `if` and `transform` into Enum if `set optimize_if_transform_strings_to_enum = 1`. [#12515](https://github.com/ClickHouse/ClickHouse/pull/12515) ([Artem Zuikov](https://github.com/4ertus2)).
+* Replace monotonic functions with their argument in `ORDER BY` if `set optimize_monotonous_functions_in_order_by=1`. [#12467](https://github.com/ClickHouse/ClickHouse/pull/12467) ([Artem Zuikov](https://github.com/4ertus2)).
+* Add an ORDER BY optimization that rewrites `ORDER BY x, f(x)` to `ORDER BY x` if `set optimize_redundant_functions_in_order_by = 1`. [#12404](https://github.com/ClickHouse/ClickHouse/pull/12404) ([Artem Zuikov](https://github.com/4ertus2)).
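+  For example (table and column are hypothetical):
+  ```sql
+  SET optimize_redundant_functions_in_order_by = 1;
+  SELECT * FROM t ORDER BY x, exp(x); -- effectively ORDER BY x
+  ```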
+* Allow predicate pushdown when the subquery contains a `WITH` clause. This fixes [#12293](https://github.com/ClickHouse/ClickHouse/issues/12293). [#12663](https://github.com/ClickHouse/ClickHouse/pull/12663) ([Winter Zhang](https://github.com/zhang2014)).
+* Improve performance of reading from compact parts. Compact parts are an experimental feature. [#12492](https://github.com/ClickHouse/ClickHouse/pull/12492) ([Anton Popov](https://github.com/CurtizJ)).
+* Attempt to implement streaming optimization in `DiskS3`. DiskS3 is an experimental feature. [#12434](https://github.com/ClickHouse/ClickHouse/pull/12434) ([Vladimir Chebotarev](https://github.com/excitoon)).
+
+#### Build/Testing/Packaging Improvement
+
+* Use `shellcheck` for sh test linting. [#13200](https://github.com/ClickHouse/ClickHouse/pull/13200) [#13207](https://github.com/ClickHouse/ClickHouse/pull/13207) ([alexey-milovidov](https://github.com/alexey-milovidov)).
+* Add a script which sets labels for pull requests in the GitHub hook. [#13183](https://github.com/ClickHouse/ClickHouse/pull/13183) ([alesapin](https://github.com/alesapin)).
+* Remove some of the recursive submodules. See [#13378](https://github.com/ClickHouse/ClickHouse/issues/13378). [#13379](https://github.com/ClickHouse/ClickHouse/pull/13379) ([alexey-milovidov](https://github.com/alexey-milovidov)).
+* Ensure that all the submodules are from proper URLs. Continuation of [#13379](https://github.com/ClickHouse/ClickHouse/issues/13379). This fixes [#13378](https://github.com/ClickHouse/ClickHouse/issues/13378). [#13397](https://github.com/ClickHouse/ClickHouse/pull/13397) ([alexey-milovidov](https://github.com/alexey-milovidov)).
+* Added support for user-declared settings, which can be accessed from inside queries. This is needed when the ClickHouse engine is used as a component of another system. [#13013](https://github.com/ClickHouse/ClickHouse/pull/13013) ([Vitaly Baranov](https://github.com/vitlibar)).
+* Added testing for RBAC functionality of the INSERT privilege in TestFlows. Expanded tables on which SELECT is being tested. Added Requirements to match new table engine tests. [#13340](https://github.com/ClickHouse/ClickHouse/pull/13340) ([MyroTk](https://github.com/MyroTk)).
+* Fix timeout error during server restart in the stress test. [#13321](https://github.com/ClickHouse/ClickHouse/pull/13321) ([alesapin](https://github.com/alesapin)).
+* Now the fast test will wait for the server with retries. [#13284](https://github.com/ClickHouse/ClickHouse/pull/13284) ([alesapin](https://github.com/alesapin)).
+* Function `materialize()` (the function for ClickHouse testing) will work for NULL as expected, by transforming it into a non-constant column. [#13212](https://github.com/ClickHouse/ClickHouse/pull/13212) ([alexey-milovidov](https://github.com/alexey-milovidov)).
+* Fix libunwind build on AArch64. This fixes [#13204](https://github.com/ClickHouse/ClickHouse/issues/13204). [#13208](https://github.com/ClickHouse/ClickHouse/pull/13208) ([alexey-milovidov](https://github.com/alexey-milovidov)).
+* Even more retries in zkutil gtest to prevent test flakiness. [#13165](https://github.com/ClickHouse/ClickHouse/pull/13165) ([alexey-milovidov](https://github.com/alexey-milovidov)).
+* Small fixes to the RBAC TestFlows. [#13152](https://github.com/ClickHouse/ClickHouse/pull/13152) ([vzakaznikov](https://github.com/vzakaznikov)).
+* Fix the `00960_live_view_watch_events_live.py` test. [#13108](https://github.com/ClickHouse/ClickHouse/pull/13108) ([vzakaznikov](https://github.com/vzakaznikov)).
+* Improve cache purge in the documentation deploy script. [#13107](https://github.com/ClickHouse/ClickHouse/pull/13107) ([alesapin](https://github.com/alesapin)).
+* Rewrote some orphan tests to gtest. Removed useless includes from tests. [#13073](https://github.com/ClickHouse/ClickHouse/pull/13073) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
+* Added tests for RBAC functionality of the `SELECT` privilege in TestFlows. [#13061](https://github.com/ClickHouse/ClickHouse/pull/13061) ([Ritaank Tiwari](https://github.com/ritaank)).
+* Rerun some tests in fast test check. [#12992](https://github.com/ClickHouse/ClickHouse/pull/12992) ([alesapin](https://github.com/alesapin)).
+* Fix MSan error in the "rdkafka" library. This closes [#12990](https://github.com/ClickHouse/ClickHouse/issues/12990). Updated `rdkafka` to version 1.5 (master). [#12991](https://github.com/ClickHouse/ClickHouse/pull/12991) ([alexey-milovidov](https://github.com/alexey-milovidov)).
+* Fix UBSan report in base64 if tests were run on a server with AVX-512. This fixes [#12318](https://github.com/ClickHouse/ClickHouse/issues/12318). Author: @qoega. [#12441](https://github.com/ClickHouse/ClickHouse/pull/12441) ([alexey-milovidov](https://github.com/alexey-milovidov)).
+* Fix UBSan report in the HDFS library. This closes [#12330](https://github.com/ClickHouse/ClickHouse/issues/12330). [#12453](https://github.com/ClickHouse/ClickHouse/pull/12453) ([alexey-milovidov](https://github.com/alexey-milovidov)).
+* Check that we are able to restore a backup from an old version to the new version. This closes [#8979](https://github.com/ClickHouse/ClickHouse/issues/8979). [#12959](https://github.com/ClickHouse/ClickHouse/pull/12959) ([alesapin](https://github.com/alesapin)).
+* Do not build the helper_container image inside integration tests. Build the docker container in CI and use the pre-built helper_container in integration tests. [#12953](https://github.com/ClickHouse/ClickHouse/pull/12953) ([Ilya Yatsishin](https://github.com/qoega)).
+* Add a test for the `ALTER TABLE CLEAR COLUMN` query for primary key columns. [#12951](https://github.com/ClickHouse/ClickHouse/pull/12951) ([alesapin](https://github.com/alesapin)).
+* Increased timeouts in testflows tests. [#12949](https://github.com/ClickHouse/ClickHouse/pull/12949) ([vzakaznikov](https://github.com/vzakaznikov)).
+* Fix build of tests under Mac OS X. This closes [#12767](https://github.com/ClickHouse/ClickHouse/issues/12767). [#12772](https://github.com/ClickHouse/ClickHouse/pull/12772) ([alexey-milovidov](https://github.com/alexey-milovidov)).
+* Connector-ODBC updated to mysql-connector-odbc-8.0.21. [#12739](https://github.com/ClickHouse/ClickHouse/pull/12739) ([Ilya Yatsishin](https://github.com/qoega)).
+* Adding RBAC syntax tests in TestFlows. [#12642](https://github.com/ClickHouse/ClickHouse/pull/12642) ([vzakaznikov](https://github.com/vzakaznikov)).
+* Improve performance of TestKeeper. This will speed up tests with heavy usage of Replicated tables. [#12505](https://github.com/ClickHouse/ClickHouse/pull/12505) ([alexey-milovidov](https://github.com/alexey-milovidov)).
+* Now we check that the server is able to start after stress tests run. This fixes [#12473](https://github.com/ClickHouse/ClickHouse/issues/12473). [#12496](https://github.com/ClickHouse/ClickHouse/pull/12496) ([alesapin](https://github.com/alesapin)).
+* Update fmtlib to master (7.0.1). [#12446](https://github.com/ClickHouse/ClickHouse/pull/12446) ([alexey-milovidov](https://github.com/alexey-milovidov)).
+* Add docker image for fast tests. [#12294](https://github.com/ClickHouse/ClickHouse/pull/12294) ([alesapin](https://github.com/alesapin)).
+* Rework configuration paths for integration tests. [#12285](https://github.com/ClickHouse/ClickHouse/pull/12285) ([Ilya Yatsishin](https://github.com/qoega)).
+* Add a compiler option to control that stack frames are not too large. This will help to run the code in fibers with a small stack size. [#11524](https://github.com/ClickHouse/ClickHouse/pull/11524) ([alexey-milovidov](https://github.com/alexey-milovidov)).
+* Update gitignore-files. [#13447](https://github.com/ClickHouse/ClickHouse/pull/13447) ([vladimir-golovchenko](https://github.com/vladimir-golovchenko)).
+
+
 ## ClickHouse release 20.6
 
 ### ClickHouse release v20.6.3.28-stable
diff --git a/README.md b/README.md
index 5daf152109f..7f6a102a2dd 100644
--- a/README.md
+++ b/README.md
@@ -17,5 +17,5 @@ ClickHouse is an open-source column-oriented database management system that all
 
 ## Upcoming Events
 
-* [ClickHouse at ByteDance (in Chinese)](https://mp.weixin.qq.com/s/Em-HjPylO8D7WPui4RREAQ) on August 28, 2020.
 * [ClickHouse Data Integration Virtual Meetup](https://www.eventbrite.com/e/clickhouse-september-virtual-meetup-data-integration-tickets-117421895049) on September 10, 2020.
+* [ClickHouse talk at Ya.Subbotnik (in Russian)](https://ya.cc/t/cIBI-3yECj5JF) on September 12, 2020.
diff --git a/base/common/arithmeticOverflow.h b/base/common/arithmeticOverflow.h
index 3dfbdbc1346..e228af287e2 100644
--- a/base/common/arithmeticOverflow.h
+++ b/base/common/arithmeticOverflow.h
@@ -38,18 +38,18 @@ namespace common
     }
 
     template <>
-    inline bool addOverflow(bInt256 x, bInt256 y, bInt256 & res)
+    inline bool addOverflow(wInt256 x, wInt256 y, wInt256 & res)
     {
         res = x + y;
-        return (y > 0 && x > std::numeric_limits<bInt256>::max() - y) ||
-            (y < 0 && x < std::numeric_limits<bInt256>::min() - y);
+        return (y > 0 && x > std::numeric_limits<wInt256>::max() - y) ||
+            (y < 0 && x < std::numeric_limits<wInt256>::min() - y);
     }
 
     template <>
-    inline bool addOverflow(bUInt256 x, bUInt256 y, bUInt256 & res)
+    inline bool addOverflow(wUInt256 x, wUInt256 y, wUInt256 & res)
     {
         res = x + y;
-        return x > std::numeric_limits<bUInt256>::max() - y;
+        return x > std::numeric_limits<wUInt256>::max() - y;
     }
 
     template <typename T>
@@ -86,15 +86,15 @@
     }
 
     template <>
-    inline bool subOverflow(bInt256 x, bInt256 y, bInt256 & res)
+    inline bool subOverflow(wInt256 x, wInt256 y, wInt256 & res)
     {
         res = x - y;
-        return (y < 0 && x > std::numeric_limits<bInt256>::max() + y) ||
-            (y > 0 && x < std::numeric_limits<bInt256>::min() + y);
+        return (y < 0 && x > std::numeric_limits<wInt256>::max() + y) ||
+            (y > 0 && x < std::numeric_limits<wInt256>::min() + y);
     }
 
     template <>
-    inline bool subOverflow(bUInt256 x, bUInt256 y, bUInt256 & res)
+    inline bool subOverflow(wUInt256 x, wUInt256 y, wUInt256 & res)
     {
         res = x - y;
         return x < y;
@@ -137,19 +137,19 @@
     }
 
     template <>
-    inline bool mulOverflow(bInt256 x, bInt256 y, bInt256 & res)
+    inline bool mulOverflow(wInt256 x, wInt256 y, wInt256 & res)
     {
         res = x * y;
         if (!x || !y)
             return false;
-        bInt256 a = (x > 0) ? x : -x;
-        bInt256 b = (y > 0) ? y : -y;
+        wInt256 a = (x > 0) ? x : -x;
+        wInt256 b = (y > 0) ? y : -y;
         return (a * b) / b != a;
     }
 
     template <>
-    inline bool mulOverflow(bUInt256 x, bUInt256 y, bUInt256 & res)
+    inline bool mulOverflow(wUInt256 x, wUInt256 y, wUInt256 & res)
     {
         res = x * y;
         if (!x || !y)
diff --git a/base/common/types.h b/base/common/types.h
index c49e9334bf5..682fe94366c 100644
--- a/base/common/types.h
+++ b/base/common/types.h
@@ -6,7 +6,7 @@
 #include
 #include
 
-#include <boost/multiprecision/cpp_int.hpp>
+#include <common/wide_integer.h>
 
 using Int8 = int8_t;
 using Int16 = int16_t;
@@ -25,12 +25,11 @@ using UInt64 = uint64_t;
 
 using Int128 = __int128;
 
-/// We have to use 127 and 255 bit integers to safe a bit for a sign serialization
-//using bInt256 = boost::multiprecision::int256_t;
-using bInt256 = boost::multiprecision::number<boost::multiprecision::cpp_int_backend<
-    255, 255, boost::multiprecision::signed_magnitude, boost::multiprecision::unchecked, void> >;
-using bUInt256 = boost::multiprecision::uint256_t;
+using wInt256 = std::wide_integer<256, signed>;
+using wUInt256 = std::wide_integer<256, unsigned>;
+static_assert(sizeof(wInt256) == 32);
+static_assert(sizeof(wUInt256) == 32);
 
 using String = std::string;
 
@@ -44,7 +43,7 @@ struct is_signed
 };
 
 template <> struct is_signed<Int128> { static constexpr bool value = true; };
-template <> struct is_signed<bInt256> { static constexpr bool value = true; };
+template <> struct is_signed<wInt256> { static constexpr bool value = true; };
 
 template <typename T>
 inline constexpr bool is_signed_v = is_signed<T>::value;
@@ -55,7 +54,7 @@ struct is_unsigned
     static constexpr bool value = std::is_unsigned_v<T>;
 };
 
-template <> struct is_unsigned<bUInt256> { static constexpr bool value = true; };
+template <> struct is_unsigned<wUInt256> { static constexpr bool value = true; };
 
 template <typename T>
 inline constexpr bool is_unsigned_v = is_unsigned<T>::value;
@@ -69,8 +68,8 @@ struct is_integer
 };
 
 template <> struct is_integer<Int128> { static constexpr bool value = true; };
-template <> struct is_integer<bInt256> { static constexpr bool value = true; };
-template <> struct is_integer<bUInt256> { static constexpr bool value = true; };
+template <> struct is_integer<wInt256> { static constexpr bool value = true; };
+template <> struct is_integer<wUInt256> { static constexpr bool value = true; };
 
 template <typename T>
 inline constexpr bool is_integer_v = is_integer<T>::value;
@@ -93,9 +92,9 @@ struct make_unsigned
     typedef std::make_unsigned_t<T> type;
 };
 
-template <> struct make_unsigned<__int128> { using type = unsigned __int128; };
-template <> struct make_unsigned<bInt256> { using type = bUInt256; };
-template <> struct make_unsigned<bUInt256> { using type = bUInt256; };
+template <> struct make_unsigned<Int128> { using type = unsigned __int128; };
+template <> struct make_unsigned<wInt256> { using type = wUInt256; };
+template <> struct make_unsigned<wUInt256> { using type = wUInt256; };
 
 template <typename T>
 using make_unsigned_t = typename make_unsigned<T>::type;
@@ -105,8 +104,8 @@ struct make_signed
     typedef std::make_signed_t<T> type;
 };
 
-template <> struct make_signed<bInt256> { typedef bInt256 type; };
-template <> struct make_signed<bUInt256> { typedef bInt256 type; };
+template <> struct make_signed<wInt256> { using type = wInt256; };
+template <> struct make_signed<wUInt256> { using type = wInt256; };
 
 template <typename T>
 using make_signed_t = typename make_signed<T>::type;
@@ -116,8 +115,20 @@ struct is_big_int
     static constexpr bool value = false;
 };
 
-template <> struct is_big_int<bInt256> { static constexpr bool value = true; };
-template <> struct is_big_int<bUInt256> { static constexpr bool value = true; };
+template <> struct is_big_int<wInt256> { static constexpr bool value = true; };
+template <> struct is_big_int<wUInt256> { static constexpr bool value = true; };
 
 template <typename T>
 inline constexpr bool is_big_int_v = is_big_int<T>::value;
+
+template <typename T>
+inline std::string bigintToString(const T & x)
+{
+    return to_string(x);
+}
+
+template <typename To, typename From>
+inline To bigint_cast(const From & x [[maybe_unused]])
+{
+    return static_cast<To>(x);
+}
diff --git a/base/common/wide_integer.h b/base/common/wide_integer.h
new file mode 100644
index 00000000000..67d0b3f04da
--- /dev/null
+++ b/base/common/wide_integer.h
@@ -0,0 +1,249 @@
+#pragma once
+
+///////////////////////////////////////////////////////////////
+//  Distributed under the Boost Software License, Version 1.0.
+//  (See at http://www.boost.org/LICENSE_1_0.txt)
+///////////////////////////////////////////////////////////////
+
+/*  Divide and multiply
+ *
+ *
+ * Copyright (c) 2008
+ * Evan Teran
+ *
+ * Permission to use, copy, modify, and distribute this software and its
+ * documentation for any purpose and without fee is hereby granted, provided
+ * that the above copyright notice appears in all copies and that both the
+ * copyright notice and this permission notice appear in supporting
+ * documentation, and that the same name not be used in advertising or
+ * publicity pertaining to distribution of the software without specific,
+ * written prior permission. We make no representations about the
+ * suitability this software for any purpose. It is provided "as is"
+ * without express or implied warranty.
+ */
+
+#include <climits> // CHAR_BIT
+#include
+#include
+#include
+#include
+
+namespace std
+{
+template <size_t Bits, typename Signed>
+class wide_integer;
+
+template <size_t Bits, typename Signed, size_t Bits2, typename Signed2>
+struct common_type<wide_integer<Bits, Signed>, wide_integer<Bits2, Signed2>>;
+
+template <size_t Bits, typename Signed, typename Arithmetic>
+struct common_type<wide_integer<Bits, Signed>, Arithmetic>;
+
+template <typename Arithmetic, size_t Bits, typename Signed>
+struct common_type<Arithmetic, wide_integer<Bits, Signed>>;
+
+template <size_t Bits, typename Signed>
+class wide_integer
+{
+public:
+    using base_type = uint8_t;
+    using signed_base_type = int8_t;
+
+    // ctors
+    wide_integer() = default;
+
+    template <typename T>
+    constexpr wide_integer(T rhs) noexcept;
+    template <typename T>
+    constexpr wide_integer(std::initializer_list<T> il) noexcept;
+
+    // assignment
+    template <size_t Bits2, typename Signed2>
+    constexpr wide_integer<Bits, Signed> & operator=(const wide_integer<Bits2, Signed2> & rhs) noexcept;
+
+    template <typename Arithmetic>
+    constexpr wide_integer<Bits, Signed> & operator=(Arithmetic rhs) noexcept;
+
+    template <typename Arithmetic>
+    constexpr wide_integer<Bits, Signed> & operator*=(const Arithmetic & rhs);
+
+    template <typename Arithmetic>
+    constexpr wide_integer<Bits, Signed> & operator/=(const Arithmetic & rhs);
+
+    template <typename Arithmetic>
+    constexpr wide_integer<Bits, Signed> & operator+=(const Arithmetic & rhs) noexcept(is_same<Signed, unsigned>::value);
+
+    template <typename Arithmetic>
+    constexpr wide_integer<Bits, Signed> & operator-=(const Arithmetic & rhs) noexcept(is_same<Signed, unsigned>::value);
+
+    template <typename Integral>
+    constexpr wide_integer<Bits, Signed> & operator%=(const Integral & rhs);
+
+    template <typename Integral>
+    constexpr wide_integer<Bits, Signed> & operator&=(const Integral & rhs) noexcept;
+
+    template <typename Integral>
+    constexpr wide_integer<Bits, Signed> & operator|=(const Integral & rhs) noexcept;
+
+    template <typename Integral>
+    constexpr wide_integer<Bits, Signed> & operator^=(const Integral & rhs) noexcept;
+
+    constexpr wide_integer<Bits, Signed> & operator<<=(int n);
+    constexpr wide_integer<Bits, Signed> & operator>>=(int n) noexcept;
+
+    constexpr wide_integer<Bits, Signed> & operator++() noexcept(is_same<Signed, unsigned>::value);
+    constexpr wide_integer<Bits, Signed> operator++(int) noexcept(is_same<Signed, unsigned>::value);
+    constexpr wide_integer<Bits, Signed> & operator--() noexcept(is_same<Signed, unsigned>::value);
+    constexpr wide_integer<Bits, Signed> operator--(int) noexcept(is_same<Signed, unsigned>::value);
+
+    // observers
+
+    constexpr explicit operator bool() const noexcept;
+
+    template <class T>
+    using __integral_not_wide_integer_class = typename std::enable_if<std::is_arithmetic<T>::value, T>::type;
+
+    template <class T, class = __integral_not_wide_integer_class<T>>
+    constexpr operator T() const noexcept;
+
+    constexpr operator long double() const noexcept;
+    constexpr operator double() const noexcept;
+    constexpr operator float() const noexcept;
+
+    struct _impl;
+
+private:
+    template <size_t Bits2, typename Signed2>
+    friend class wide_integer;
+
+    friend class numeric_limits<wide_integer<Bits, signed>>;
+    friend class numeric_limits<wide_integer<Bits, unsigned>>;
+
+    base_type m_arr[_impl::arr_size];
+};
+
+template <typename T>
+static constexpr bool ArithmeticConcept() noexcept;
+template <class T1, class T2>
+using __only_arithmetic = typename std::enable_if<ArithmeticConcept<T1>() && ArithmeticConcept<T2>()>::type;
+
+template <typename T>
+static constexpr bool IntegralConcept() noexcept;
+template <class T1, class T2>
+using __only_integer = typename std::enable_if<IntegralConcept<T1>() && IntegralConcept<T2>()>::type;
+
+// Unary operators
+template <size_t Bits, typename Signed>
+constexpr wide_integer<Bits, Signed> operator~(const wide_integer<Bits, Signed> & lhs) noexcept;
+
+template <size_t Bits, typename Signed>
+constexpr wide_integer<Bits, Signed> operator-(const wide_integer<Bits, Signed> & lhs) noexcept(is_same<Signed, unsigned>::value);
+
+template <size_t Bits, typename Signed>
+constexpr wide_integer<Bits, Signed> operator+(const wide_integer<Bits, Signed> & lhs) noexcept(is_same<Signed, unsigned>::value);
+
+// Binary operators
+template <size_t Bits, typename Signed, size_t Bits2, typename Signed2>
+std::common_type_t<wide_integer<Bits, Signed>, wide_integer<Bits2, Signed2>> constexpr
+operator*(const wide_integer<Bits, Signed> & lhs, const wide_integer<Bits2, Signed2> & rhs);
+template <typename Arithmetic, typename Arithmetic2, class = __only_arithmetic<Arithmetic, Arithmetic2>>
+std::common_type_t<Arithmetic, Arithmetic2> constexpr operator*(const Arithmetic & rhs, const Arithmetic2 & lhs);
+
+template <size_t Bits, typename Signed, size_t Bits2, typename Signed2>
+std::common_type_t<wide_integer<Bits, Signed>, wide_integer<Bits2, Signed2>> constexpr
+operator/(const wide_integer<Bits, Signed> & lhs, const wide_integer<Bits2, Signed2> & rhs);
+template <typename Arithmetic, typename Arithmetic2, class = __only_arithmetic<Arithmetic, Arithmetic2>>
+std::common_type_t<Arithmetic, Arithmetic2> constexpr operator/(const Arithmetic & rhs, const Arithmetic2 & lhs);
+
+template <size_t Bits, typename Signed, size_t Bits2, typename Signed2>
+std::common_type_t<wide_integer<Bits, Signed>, wide_integer<Bits2, Signed2>> constexpr
+operator+(const wide_integer<Bits, Signed> & lhs, const wide_integer<Bits2, Signed2> & rhs);
+template <typename Arithmetic, typename Arithmetic2, class = __only_arithmetic<Arithmetic, Arithmetic2>>
+std::common_type_t<Arithmetic, Arithmetic2> constexpr operator+(const Arithmetic & rhs, const Arithmetic2 & lhs);
+
+template <size_t Bits, typename Signed, size_t Bits2, typename Signed2>
+std::common_type_t<wide_integer<Bits, Signed>, wide_integer<Bits2, Signed2>> constexpr
+operator-(const wide_integer<Bits, Signed> & lhs, const wide_integer<Bits2, Signed2> & rhs);
+template <typename Arithmetic, typename Arithmetic2, class = __only_arithmetic<Arithmetic, Arithmetic2>>
+std::common_type_t<Arithmetic, Arithmetic2> constexpr operator-(const Arithmetic & rhs, const Arithmetic2 & lhs);
+
+template <size_t Bits, typename Signed, size_t Bits2, typename Signed2>
+std::common_type_t<wide_integer<Bits, Signed>, wide_integer<Bits2, Signed2>> constexpr
+operator%(const wide_integer<Bits, Signed> & lhs, const wide_integer<Bits2, Signed2> & rhs);
+template <typename Integral, typename Integral2, class = __only_integer<Integral, Integral2>>
+std::common_type_t<Integral, Integral2> constexpr operator%(const Integral & rhs, const Integral2 & lhs);
+
+template <size_t Bits, typename Signed, size_t Bits2, typename Signed2>
+std::common_type_t<wide_integer<Bits, Signed>, wide_integer<Bits2, Signed2>> constexpr
+operator&(const wide_integer<Bits, Signed> & lhs, const wide_integer<Bits2, Signed2> & rhs);
+template <typename Integral, typename Integral2, class = __only_integer<Integral, Integral2>>
+std::common_type_t<Integral, Integral2> constexpr operator&(const Integral & rhs, const Integral2 & lhs);
+
+template <size_t Bits, typename Signed, size_t Bits2, typename Signed2>
+std::common_type_t<wide_integer<Bits, Signed>, wide_integer<Bits2, Signed2>> constexpr
+operator|(const wide_integer<Bits, Signed> & lhs, const wide_integer<Bits2, Signed2> & rhs);
+template <typename Integral, typename Integral2, class = __only_integer<Integral, Integral2>>
+std::common_type_t<Integral, Integral2> constexpr operator|(const Integral & rhs, const Integral2 & lhs);
+
+template <size_t Bits, typename Signed, size_t Bits2, typename Signed2>
+std::common_type_t<wide_integer<Bits, Signed>, wide_integer<Bits2, Signed2>> constexpr
+operator^(const wide_integer<Bits, Signed> & lhs, const wide_integer<Bits2, Signed2> & rhs);
+template <typename Integral, typename Integral2, class = __only_integer<Integral, Integral2>>
+std::common_type_t<Integral, Integral2> constexpr operator^(const Integral & rhs, const Integral2 & lhs);
+
+// TODO: Integral
+template <size_t Bits, typename Signed>
+constexpr wide_integer<Bits, Signed> operator<<(const wide_integer<Bits, Signed> & lhs, int n) noexcept;
+template <size_t Bits, typename Signed>
+constexpr wide_integer<Bits, Signed> operator>>(const wide_integer<Bits, Signed> & lhs, int n) noexcept;
+
+template <size_t Bits, typename Signed, typename Int, typename = std::enable_if_t<!std::is_same_v<Int, int>>>
+constexpr wide_integer<Bits, Signed> operator<<(const wide_integer<Bits, Signed> & lhs, Int n) noexcept
+{
+    return lhs << int(n);
+}
+template <size_t Bits, typename Signed, typename Int, typename = std::enable_if_t<!std::is_same_v<Int, int>>>
+constexpr wide_integer<Bits, Signed> operator>>(const wide_integer<Bits, Signed> & lhs, Int n) noexcept
+{
+    return lhs >> int(n);
+}
+
+template <size_t Bits, typename Signed, size_t Bits2, typename Signed2>
+constexpr bool operator<(const wide_integer<Bits, Signed> & lhs, const wide_integer<Bits2, Signed2> & rhs);
+template <typename Arithmetic, typename Arithmetic2, class = __only_arithmetic<Arithmetic, Arithmetic2>>
+constexpr bool operator<(const Arithmetic & rhs, const Arithmetic2 & lhs);
+
+template <size_t Bits, typename Signed, size_t Bits2, typename Signed2>
+constexpr bool operator>(const wide_integer<Bits, Signed> & lhs, const wide_integer<Bits2, Signed2> & rhs);
+template <typename Arithmetic, typename Arithmetic2, class = __only_arithmetic<Arithmetic, Arithmetic2>>
+constexpr bool operator>(const Arithmetic & rhs, const Arithmetic2 & lhs);
+
+template <size_t Bits, typename Signed, size_t Bits2, typename Signed2>
+constexpr bool operator<=(const wide_integer<Bits, Signed> & lhs, const wide_integer<Bits2, Signed2> & rhs);
+template <typename Arithmetic, typename Arithmetic2, class = __only_arithmetic<Arithmetic, Arithmetic2>>
+constexpr bool operator<=(const Arithmetic & rhs, const Arithmetic2 & lhs);
+
+template <size_t Bits, typename Signed, size_t Bits2, typename Signed2>
+constexpr bool operator>=(const wide_integer<Bits, Signed> & lhs, const wide_integer<Bits2, Signed2> & rhs);
+template <typename Arithmetic, typename Arithmetic2, class = __only_arithmetic<Arithmetic, Arithmetic2>>
+constexpr bool operator>=(const Arithmetic & rhs, const Arithmetic2 & lhs);
+
+template <size_t Bits, typename Signed, size_t Bits2, typename Signed2>
+constexpr bool operator==(const wide_integer<Bits, Signed> & lhs, const wide_integer<Bits2, Signed2> & rhs);
+template <typename Arithmetic, typename Arithmetic2, class = __only_arithmetic<Arithmetic, Arithmetic2>>
+constexpr bool operator==(const Arithmetic & rhs, const Arithmetic2 & lhs);
+
+template <size_t Bits, typename Signed, size_t Bits2, typename Signed2>
+constexpr bool operator!=(const wide_integer<Bits, Signed> & lhs, const wide_integer<Bits2, Signed2> & rhs);
+template <typename Arithmetic, typename Arithmetic2, class = __only_arithmetic<Arithmetic, Arithmetic2>>
+constexpr bool operator!=(const Arithmetic & rhs, const Arithmetic2 & lhs);
+
+template <size_t Bits, typename Signed>
+std::string to_string(const wide_integer<Bits, Signed> & n);
+
+template <size_t Bits, typename Signed>
+struct hash<wide_integer<Bits, Signed>>;
+
+}
+
+#include "wide_integer_impl.h"
diff --git a/base/common/wide_integer_impl.h b/base/common/wide_integer_impl.h
new file mode 100644
index 00000000000..c77a9120a55
--- /dev/null
+++ b/base/common/wide_integer_impl.h
@@ -0,0 +1,1301 @@
+/// Original is here https://github.com/cerevra/int
+#pragma once
+
+#include "wide_integer.h"
+
+#include
+#include
+
+namespace std
+{
+#define CT(x) \
+    std::common_type_t<std::decay_t<decltype(lhs)>, std::decay_t<decltype(rhs)>> { x }
+
+// numeric limits
+template <size_t Bits, typename Signed>
+class numeric_limits<wide_integer<Bits, Signed>>
+{
+public:
+    static constexpr bool is_specialized = true;
+    static constexpr bool is_signed = is_same<Signed, signed>::value;
+    static constexpr bool is_integer = true;
+    static constexpr bool is_exact = true;
+    static constexpr bool has_infinity = false;
+    static constexpr bool has_quiet_NaN = false;
+    static constexpr bool has_signaling_NaN = true;
+    static constexpr std::float_denorm_style has_denorm = std::denorm_absent;
+    static constexpr bool has_denorm_loss = false;
+    static constexpr std::float_round_style round_style = std::round_toward_zero;
+    static constexpr bool is_iec559 = false;
+    static constexpr bool is_bounded = true;
+    static constexpr bool is_modulo = true;
+    static constexpr int digits = Bits - (is_same<Signed, signed>::value ? 1 : 0);
+    static constexpr int digits10 = digits * 0.30103 /*std::log10(2)*/;
+    static constexpr int max_digits10 = 0;
+    static constexpr int radix = 2;
+    static constexpr int min_exponent = 0;
+    static constexpr int min_exponent10 = 0;
+    static constexpr int max_exponent = 0;
+    static constexpr int max_exponent10 = 0;
+    static constexpr bool traps = true;
+    static constexpr bool tinyness_before = false;
+
+    static constexpr wide_integer<Bits, Signed> min() noexcept
+    {
+        if (is_same<Signed, signed>::value)
+        {
+            using T = wide_integer<Bits, signed>;
+            T res{};
+            res.m_arr[T::_impl::big(0)] = std::numeric_limits<typename wide_integer<Bits, Signed>::signed_base_type>::min();
+            return res;
+        }
+        return 0;
+    }
+
+    static constexpr wide_integer<Bits, Signed> max() noexcept
+    {
+        using T = wide_integer<Bits, Signed>;
+        T res{};
+        res.m_arr[T::_impl::big(0)] = is_same<Signed, signed>::value
std::numeric_limits::signed_base_type>::max() + : std::numeric_limits::base_type>::max(); + for (int i = 1; i < wide_integer::_impl::arr_size; ++i) + { + res.m_arr[T::_impl::big(i)] = std::numeric_limits::base_type>::max(); + } + return res; + } + + static constexpr wide_integer lowest() noexcept { return min(); } + static constexpr wide_integer epsilon() noexcept { return 0; } + static constexpr wide_integer round_error() noexcept { return 0; } + static constexpr wide_integer infinity() noexcept { return 0; } + static constexpr wide_integer quiet_NaN() noexcept { return 0; } + static constexpr wide_integer signaling_NaN() noexcept { return 0; } + static constexpr wide_integer denorm_min() noexcept { return 0; } +}; + +template +struct IsWideInteger +{ + static const constexpr bool value = false; +}; + +template +struct IsWideInteger> +{ + static const constexpr bool value = true; +}; + +template +static constexpr bool ArithmeticConcept() noexcept +{ + return std::is_arithmetic_v || IsWideInteger::value; +} + +template +static constexpr bool IntegralConcept() noexcept +{ + return std::is_integral_v || IsWideInteger::value; +} + +// type traits +template +struct common_type, wide_integer> +{ + using type = std::conditional_t < Bits == Bits2, + wide_integer< + Bits, + std::conditional_t<(std::is_same::value && std::is_same::value), signed, unsigned>>, + std::conditional_t, wide_integer>>; +}; + +template +struct common_type, Arithmetic> +{ + static_assert(ArithmeticConcept(), ""); + + using type = std::conditional_t< + std::is_floating_point::value, + Arithmetic, + std::conditional_t< + sizeof(Arithmetic) < Bits * sizeof(long), + wide_integer, + std::conditional_t< + Bits * sizeof(long) < sizeof(Arithmetic), + Arithmetic, + std::conditional_t< + Bits * sizeof(long) == sizeof(Arithmetic) && (is_same::value || std::is_signed::value), + Arithmetic, + wide_integer>>>>; +}; + +template +struct common_type> : std::common_type, Arithmetic> +{ +}; + +template +struct wide_integer::_impl +{ + static_assert(Bits % CHAR_BIT == 0, "=)"); + + // utils + static const int base_bits = sizeof(base_type) * CHAR_BIT; + static const int arr_size = Bits / base_bits; + static constexpr size_t _Bits = Bits; + static constexpr bool _is_wide_integer = true; + + // The original implementation is big-endian. We need little one. + static constexpr unsigned little(unsigned idx) { return idx; } + static constexpr unsigned big(unsigned idx) { return arr_size - 1 - idx; } + static constexpr unsigned any(unsigned idx) { return idx; } + + template + constexpr static bool is_negative(const wide_integer & n) noexcept + { + if constexpr (std::is_same_v) + return static_cast(n.m_arr[big(0)]) < 0; + else + return false; + } + + template + constexpr static wide_integer make_positive(const wide_integer & n) noexcept + { + return is_negative(n) ? 
operator_unary_minus(n) : n; + } + + template + constexpr static auto to_Integral(T f) noexcept + { + if constexpr (std::is_same_v) + return f; + else if constexpr (std::is_signed_v) + return static_cast(f); + else + return static_cast(f); + } + + template + constexpr static void wide_integer_from_bultin(wide_integer & self, Integral rhs) noexcept + { + auto r = _impl::to_Integral(rhs); + + int r_idx = 0; + for (; static_cast(r_idx) < sizeof(Integral) && r_idx < arr_size; ++r_idx) + { + base_type & curr = self.m_arr[little(r_idx)]; + base_type curr_rhs = (r >> (r_idx * CHAR_BIT)) & std::numeric_limits::max(); + curr = curr_rhs; + } + + for (; r_idx < arr_size; ++r_idx) + { + base_type & curr = self.m_arr[little(r_idx)]; + curr = r < 0 ? std::numeric_limits::max() : 0; + } + } + + constexpr static void wide_integer_from_bultin(wide_integer & self, double rhs) noexcept + { + if ((rhs > 0 && rhs < std::numeric_limits::max()) || (rhs < 0 && rhs > std::numeric_limits::min())) + { + self = to_Integral(rhs); + return; + } + + long double r = rhs; + if (r < 0) + r = -r; + + size_t count = r / std::numeric_limits::max(); + self = count; + self *= std::numeric_limits::max(); + long double to_diff = count; + to_diff *= std::numeric_limits::max(); + + self += to_Integral(r - to_diff); + + if (rhs < 0) + self = -self; + } + + template + constexpr static void + wide_integer_from_wide_integer(wide_integer & self, const wide_integer & rhs) noexcept + { + // int Bits_to_copy = std::min(arr_size, rhs.arr_size); + auto rhs_arr_size = wide_integer::_impl::arr_size; + int base_elems_to_copy = _impl::arr_size < rhs_arr_size ? _impl::arr_size : rhs_arr_size; + for (int i = 0; i < base_elems_to_copy; ++i) + { + self.m_arr[little(i)] = rhs.m_arr[little(i)]; + } + for (int i = 0; i < arr_size - base_elems_to_copy; ++i) + { + self.m_arr[big(i)] = is_negative(rhs) ? 
std::numeric_limits::max() : 0; + } + } + + template + constexpr static bool should_keep_size() + { + return sizeof(T) * CHAR_BIT <= Bits; + } + + constexpr static wide_integer shift_left(const wide_integer & rhs, int n) + { + if (static_cast(n) >= base_bits * arr_size) + return 0; + if (n <= 0) + return rhs; + + wide_integer lhs = rhs; + int bit_shift = n % base_bits; + unsigned n_bytes = n / base_bits; + if (bit_shift) + { + lhs.m_arr[big(0)] <<= bit_shift; + for (int i = 1; i < arr_size; ++i) + { + lhs.m_arr[big(i - 1)] |= lhs.m_arr[big(i)] >> (base_bits - bit_shift); + lhs.m_arr[big(i)] <<= bit_shift; + } + } + if (n_bytes) + { + for (unsigned i = 0; i < arr_size - n_bytes; ++i) + { + lhs.m_arr[big(i)] = lhs.m_arr[big(i + n_bytes)]; + } + for (unsigned i = arr_size - n_bytes; i < arr_size; ++i) + lhs.m_arr[big(i)] = 0; + } + return lhs; + } + + constexpr static wide_integer shift_left(const wide_integer & rhs, int n) + { + // static_assert(is_negative(rhs), "shift left for negative numbers is undefined!"); + if (is_negative(rhs)) + throw std::runtime_error("shift left for negative numbers is undefined!"); + + return wide_integer(shift_left(wide_integer(rhs), n)); + } + + constexpr static wide_integer shift_right(const wide_integer & rhs, int n) noexcept + { + if (static_cast(n) >= base_bits * arr_size) + return 0; + if (n <= 0) + return rhs; + + wide_integer lhs = rhs; + int bit_shift = n % base_bits; + unsigned n_bytes = n / base_bits; + if (bit_shift) + { + lhs.m_arr[little(0)] >>= bit_shift; + for (int i = 1; i < arr_size; ++i) + { + lhs.m_arr[little(i - 1)] |= lhs.m_arr[little(i)] << (base_bits - bit_shift); + lhs.m_arr[little(i)] >>= bit_shift; + } + } + if (n_bytes) + { + for (unsigned i = 0; i < arr_size - n_bytes; ++i) + { + lhs.m_arr[little(i)] = lhs.m_arr[little(i + n_bytes)]; + } + for (unsigned i = arr_size - n_bytes; i < arr_size; ++i) + lhs.m_arr[little(i)] = 0; + } + return lhs; + } + + constexpr static wide_integer shift_right(const wide_integer & rhs, int n) noexcept + { + if (static_cast(n) >= base_bits * arr_size) + return 0; + if (n <= 0) + return rhs; + + bool is_neg = is_negative(rhs); + if (!is_neg) + return shift_right(wide_integer(rhs), n); + + wide_integer lhs = rhs; + int bit_shift = n % base_bits; + unsigned n_bytes = n / base_bits; + if (bit_shift) + { + lhs = shift_right(wide_integer(lhs), bit_shift); + lhs.m_arr[big(0)] |= std::numeric_limits::max() << (base_bits - bit_shift); + } + if (n_bytes) + { + for (unsigned i = 0; i < arr_size - n_bytes; ++i) + { + lhs.m_arr[little(i)] = lhs.m_arr[little(i + n_bytes)]; + } + for (unsigned i = arr_size - n_bytes; i < arr_size; ++i) + { + lhs.m_arr[little(i)] = std::numeric_limits::max(); + } + } + return lhs; + } + + template + constexpr static wide_integer + operator_plus_T(const wide_integer & lhs, T rhs) noexcept(is_same::value) + { + if (rhs < 0) + return _operator_minus_T(lhs, -rhs); + else + return _operator_plus_T(lhs, rhs); + } + +private: + template + constexpr static wide_integer + _operator_minus_T(const wide_integer & lhs, T rhs) noexcept(is_same::value) + { + wide_integer res = lhs; + + bool is_underflow = false; + int r_idx = 0; + for (; static_cast(r_idx) < sizeof(T) && r_idx < arr_size; ++r_idx) + { + base_type & res_i = res.m_arr[little(r_idx)]; + base_type curr_rhs = (rhs >> (r_idx * CHAR_BIT)) & std::numeric_limits::max(); + + if (is_underflow) + { + --res_i; + is_underflow = res_i == std::numeric_limits::max(); + } + + if (res_i < curr_rhs) + is_underflow = true; + res_i -= curr_rhs; + } + + 
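+        // The loop above subtracts rhs limb by limb over the low limbs that
+        // overlap sizeof(T); the block below propagates any borrow that is
+        // still pending into the higher limbs. Illustrative sketch (not part
+        // of the original source): for a hypothetical 128-bit value with
+        // 64-bit limbs, subtracting 1 from 2^64 turns the low limb from 0
+        // into the all-ones limb value and decrements the high limb by one.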
if (is_underflow && r_idx < arr_size) + { + --res.m_arr[little(r_idx)]; + for (int i = arr_size - 1 - r_idx - 1; i >= 0; --i) + { + if (res.m_arr[big(i + 1)] == std::numeric_limits::max()) + --res.m_arr[big(i)]; + else + break; + } + } + + return res; + } + + template + constexpr static wide_integer + _operator_plus_T(const wide_integer & lhs, T rhs) noexcept(is_same::value) + { + wide_integer res = lhs; + + bool is_overflow = false; + int r_idx = 0; + for (; static_cast(r_idx) < sizeof(T) && r_idx < arr_size; ++r_idx) + { + base_type & res_i = res.m_arr[little(r_idx)]; + base_type curr_rhs = (rhs >> (r_idx * CHAR_BIT)) & std::numeric_limits::max(); + + if (is_overflow) + { + ++res_i; + is_overflow = res_i == 0; + } + + res_i += curr_rhs; + if (res_i < curr_rhs) + is_overflow = true; + } + + if (is_overflow && r_idx < arr_size) + { + ++res.m_arr[little(r_idx)]; + for (int i = arr_size - 1 - r_idx - 1; i >= 0; --i) + { + if (res.m_arr[big(i + 1)] == 0) + ++res.m_arr[big(i)]; + else + break; + } + } + + return res; + } + +public: + constexpr static wide_integer operator_unary_tilda(const wide_integer & lhs) noexcept + { + wide_integer res{}; + + for (int i = 0; i < arr_size; ++i) + res.m_arr[any(i)] = ~lhs.m_arr[any(i)]; + return res; + } + + constexpr static wide_integer + operator_unary_minus(const wide_integer & lhs) noexcept(is_same::value) + { + return operator_plus_T(operator_unary_tilda(lhs), 1); + } + + template + constexpr static auto operator_plus(const wide_integer & lhs, const T & rhs) noexcept(is_same::value) + { + if constexpr (should_keep_size()) + { + wide_integer t = rhs; + if (is_negative(t)) + return _operator_minus_wide_integer(lhs, operator_unary_minus(t)); + else + return _operator_plus_wide_integer(lhs, t); + } + else + { + static_assert(T::_impl::_is_wide_integer, ""); + return std::common_type_t, wide_integer>::_impl::operator_plus( + wide_integer(lhs), rhs); + } + } + + template + constexpr static auto operator_minus(const wide_integer & lhs, const T & rhs) noexcept(is_same::value) + { + if constexpr (should_keep_size()) + { + wide_integer t = rhs; + if (is_negative(t)) + return _operator_plus_wide_integer(lhs, operator_unary_minus(t)); + else + return _operator_minus_wide_integer(lhs, t); + } + else + { + static_assert(T::_impl::_is_wide_integer, ""); + return std::common_type_t, wide_integer>::_impl::operator_minus( + wide_integer(lhs), rhs); + } + } + +private: + constexpr static wide_integer _operator_minus_wide_integer( + const wide_integer & lhs, const wide_integer & rhs) noexcept(is_same::value) + { + wide_integer res = lhs; + + bool is_underflow = false; + for (int idx = 0; idx < arr_size; ++idx) + { + base_type & res_i = res.m_arr[little(idx)]; + const base_type rhs_i = rhs.m_arr[little(idx)]; + + if (is_underflow) + { + --res_i; + is_underflow = res_i == std::numeric_limits::max(); + } + + if (res_i < rhs_i) + is_underflow = true; + + res_i -= rhs_i; + } + + return res; + } + + constexpr static wide_integer _operator_plus_wide_integer( + const wide_integer & lhs, const wide_integer & rhs) noexcept(is_same::value) + { + wide_integer res = lhs; + + bool is_overflow = false; + for (int idx = 0; idx < arr_size; ++idx) + { + base_type & res_i = res.m_arr[little(idx)]; + const base_type rhs_i = rhs.m_arr[little(idx)]; + + if (is_overflow) + { + ++res_i; + is_overflow = res_i == 0; + } + + res_i += rhs_i; + + if (res_i < rhs_i) + is_overflow = true; + } + + return res; + } + +public: + template + constexpr static auto operator_star(const wide_integer & lhs, const 
T & rhs) + { + if constexpr (should_keep_size()) + { + const wide_integer a = make_positive(lhs); + wide_integer t = make_positive(wide_integer(rhs)); + + wide_integer res = 0; + + for (size_t i = 0; i < arr_size * base_bits; ++i) + { + if (t.m_arr[little(0)] & 1) + res = operator_plus(res, shift_left(a, i)); + + t = shift_right(t, 1); + } + + if (is_same::value && is_negative(wide_integer(rhs)) != is_negative(lhs)) + res = operator_unary_minus(res); + + return res; + } + else + { + static_assert(T::_impl::_is_wide_integer, ""); + return std::common_type_t, T>::_impl::operator_star(T(lhs), rhs); + } + } + + template + constexpr static bool operator_more(const wide_integer & lhs, const T & rhs) noexcept + { + if constexpr (should_keep_size()) + { + // static_assert(Signed == std::is_signed::value, + // "warning: operator_more: comparison of integers of different signs"); + + wide_integer t = rhs; + + if (std::numeric_limits::is_signed && (is_negative(lhs) != is_negative(t))) + return is_negative(t); + + for (int i = 0; i < arr_size; ++i) + { + if (lhs.m_arr[big(i)] != t.m_arr[big(i)]) + return lhs.m_arr[big(i)] > t.m_arr[big(i)]; + } + + return false; + } + else + { + static_assert(T::_impl::_is_wide_integer, ""); + return std::common_type_t, T>::_impl::operator_more(T(lhs), rhs); + } + } + + template + constexpr static bool operator_less(const wide_integer & lhs, const T & rhs) noexcept + { + if constexpr (should_keep_size()) + { + // static_assert(Signed == std::is_signed::value, + // "warning: operator_less: comparison of integers of different signs"); + + wide_integer t = rhs; + + if (std::numeric_limits::is_signed && (is_negative(lhs) != is_negative(t))) + return is_negative(lhs); + + for (int i = 0; i < arr_size; ++i) + if (lhs.m_arr[big(i)] != t.m_arr[big(i)]) + return lhs.m_arr[big(i)] < t.m_arr[big(i)]; + + return false; + } + else + { + static_assert(T::_impl::_is_wide_integer, ""); + return std::common_type_t, T>::_impl::operator_less(T(lhs), rhs); + } + } + + template + constexpr static bool operator_eq(const wide_integer & lhs, const T & rhs) noexcept + { + if constexpr (should_keep_size()) + { + wide_integer t = rhs; + + for (int i = 0; i < arr_size; ++i) + if (lhs.m_arr[any(i)] != t.m_arr[any(i)]) + return false; + + return true; + } + else + { + static_assert(T::_impl::_is_wide_integer, ""); + return std::common_type_t, T>::_impl::operator_eq(T(lhs), rhs); + } + } + + template + constexpr static auto operator_pipe(const wide_integer & lhs, const T & rhs) noexcept + { + if constexpr (should_keep_size()) + { + wide_integer t = rhs; + wide_integer res = lhs; + + for (int i = 0; i < arr_size; ++i) + res.m_arr[any(i)] |= t.m_arr[any(i)]; + return res; + } + else + { + static_assert(T::_impl::_is_wide_integer, ""); + return std::common_type_t, T>::_impl::operator_pipe(T(lhs), rhs); + } + } + + template + constexpr static auto operator_amp(const wide_integer & lhs, const T & rhs) noexcept + { + if constexpr (should_keep_size()) + { + wide_integer t = rhs; + wide_integer res = lhs; + + for (int i = 0; i < arr_size; ++i) + res.m_arr[any(i)] &= t.m_arr[any(i)]; + return res; + } + else + { + static_assert(T::_impl::_is_wide_integer, ""); + return std::common_type_t, T>::_impl::operator_amp(T(lhs), rhs); + } + } + +private: + template + constexpr static void divide(const T & numerator, const T & denominator, T & quotient, T & remainder) + { + bool is_zero = true; + for (auto c : denominator.m_arr) + { + if (c != 0) + { + is_zero = false; + break; + } + } + + if (is_zero) + throw std::domain_error("divide by zero"); + + T n = numerator; + T d = denominator; + T x = 1; + T answer = 0; + + while (!operator_more(d, n) && operator_eq(operator_amp(shift_right(d, base_bits * arr_size - 1), 1), 0)) + { + x = shift_left(x, 1); + d = shift_left(d, 1); + } + + while (!operator_eq(x, 0)) + { + if (!operator_more(d, n)) + { + n = operator_minus(n, d); + answer = operator_pipe(answer, x); + } + + x = shift_right(x, 1); + d = shift_right(d, 1); + } + + quotient = answer; + remainder = n; + } + +public: + template + constexpr static auto operator_slash(const wide_integer & lhs, const T & rhs) + { + if constexpr (should_keep_size()) + { + wide_integer o = rhs; + wide_integer quotient{}, remainder{}; + divide(make_positive(lhs), make_positive(o), quotient, remainder); + + if (is_same::value && is_negative(o) != is_negative(lhs)) + quotient = operator_unary_minus(quotient); + + return quotient; + } + else + { + static_assert(T::_impl::_is_wide_integer, ""); + return std::common_type_t, wide_integer>::operator_slash(T(lhs), rhs); + } + } + + template + constexpr static auto operator_percent(const wide_integer & lhs, const T & rhs) + { + if constexpr (should_keep_size()) + { + wide_integer o = rhs; + wide_integer quotient{}, remainder{}; + divide(make_positive(lhs), make_positive(o), quotient, remainder); + + if (is_same::value && is_negative(lhs)) + remainder = operator_unary_minus(remainder); + + return remainder; + } + else + { + static_assert(T::_impl::_is_wide_integer, ""); + return std::common_type_t, wide_integer>::operator_percent(T(lhs), rhs); + } + } + + // ^ + template + constexpr static auto operator_circumflex(const wide_integer & lhs, const T & rhs) noexcept + { + if constexpr (should_keep_size()) + { + wide_integer t(rhs); + wide_integer res = lhs; + + for (int i = 0; i < arr_size; ++i) + res.m_arr[any(i)] ^= t.m_arr[any(i)]; + return res; + } + else + { + static_assert(T::_impl::_is_wide_integer, ""); + return T::operator_circumflex(T(lhs), rhs); + } + } + + constexpr static wide_integer from_str(const char * c) + { + wide_integer res = 0; + + bool is_neg = is_same::value && *c == '-'; + if (is_neg) + ++c; + + if (*c == '0' && (*(c + 1) == 'x' || *(c + 1) == 'X')) + { // hex + ++c; + ++c; + while (*c) + { + if (*c >= '0' && *c <= '9') + { + res = operator_star(res, 16U); + res = operator_plus_T(res, *c - '0'); + ++c; + } + else if (*c >= 'a' && *c <= 'f') + { + res = operator_star(res, 16U); + res = operator_plus_T(res, *c - 'a' + 10U); + ++c; + } + else if (*c >= 'A' && *c <= 'F') + { // tolower must be used, but it is not constexpr + res = operator_star(res, 16U); + res = operator_plus_T(res, *c - 'A' + 10U); + ++c; + } + else + throw std::runtime_error("invalid char"); + } + } + else + { // dec + while (*c) + { + if (*c < '0' || *c > '9') + throw std::runtime_error("invalid char"); + + res = operator_star(res, 10U); + res = operator_plus_T(res, *c - '0'); + ++c; + } + } + + if (is_neg) + res = operator_unary_minus(res); + + return res; + } +}; + +// Members + +template +template +constexpr wide_integer::wide_integer(T rhs) noexcept + : m_arr{} +{ + if constexpr (IsWideInteger::value) + _impl::wide_integer_from_wide_integer(*this, rhs); + else + _impl::wide_integer_from_bultin(*this, rhs); +} + +template +template +constexpr wide_integer::wide_integer(std::initializer_list il) noexcept + : m_arr{} +{ + if (il.size() == 1) + { + if constexpr (IsWideInteger::value) + _impl::wide_integer_from_wide_integer(*this, *il.begin()); + else + 
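+            // A single scalar initializer goes through the same built-in
+            // conversion path as assignment from an integral or floating
+            // value; initializer lists of any other length fall through to
+            // zero-initialization below.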
_impl::wide_integer_from_bultin(*this, *il.begin()); + } + else + _impl::wide_integer_from_bultin(*this, 0); +} + +template +template +constexpr wide_integer & wide_integer::operator=(const wide_integer & rhs) noexcept +{ + _impl::wide_integer_from_wide_integer(*this, rhs); + return *this; +} + +template +template +constexpr wide_integer & wide_integer::operator=(T rhs) noexcept +{ + _impl::wide_integer_from_bultin(*this, rhs); + return *this; +} + +template +template +constexpr wide_integer & wide_integer::operator*=(const T & rhs) +{ + *this = *this * rhs; + return *this; +} + +template +template +constexpr wide_integer & wide_integer::operator/=(const T & rhs) +{ + *this = *this / rhs; + return *this; +} + +template +template +constexpr wide_integer & wide_integer::operator+=(const T & rhs) noexcept(is_same::value) +{ + *this = *this + rhs; + return *this; +} + +template +template +constexpr wide_integer & wide_integer::operator-=(const T & rhs) noexcept(is_same::value) +{ + *this = *this - rhs; + return *this; +} + +template +template +constexpr wide_integer & wide_integer::operator%=(const T & rhs) +{ + *this = *this % rhs; + return *this; +} + +template +template +constexpr wide_integer & wide_integer::operator&=(const T & rhs) noexcept +{ + *this = *this & rhs; + return *this; +} + +template +template +constexpr wide_integer & wide_integer::operator|=(const T & rhs) noexcept +{ + *this = *this | rhs; + return *this; +} + +template +template +constexpr wide_integer & wide_integer::operator^=(const T & rhs) noexcept +{ + *this = *this ^ rhs; + return *this; +} + +template +constexpr wide_integer & wide_integer::operator<<=(int n) +{ + *this = _impl::shift_left(*this, n); + return *this; +} + +template +constexpr wide_integer & wide_integer::operator>>=(int n) noexcept +{ + *this = _impl::shift_right(*this, n); + return *this; +} + +template +constexpr wide_integer & wide_integer::operator++() noexcept(is_same::value) +{ + *this = _impl::operator_plus(*this, 1); + return *this; +} + +template +constexpr wide_integer wide_integer::operator++(int) noexcept(is_same::value) +{ + auto tmp = *this; + *this = _impl::operator_plus(*this, 1); + return tmp; +} + +template +constexpr wide_integer & wide_integer::operator--() noexcept(is_same::value) +{ + *this = _impl::operator_minus(*this, 1); + return *this; +} + +template +constexpr wide_integer wide_integer::operator--(int) noexcept(is_same::value) +{ + auto tmp = *this; + *this = _impl::operator_minus(*this, 1); + return tmp; +} + +template +constexpr wide_integer::operator bool() const noexcept +{ + return !_impl::operator_eq(*this, 0); +} + +template +template +constexpr wide_integer::operator T() const noexcept +{ + static_assert(std::numeric_limits::is_integer, ""); + T res = 0; + for (size_t r_idx = 0; r_idx < _impl::arr_size && r_idx < sizeof(T); ++r_idx) + { + res |= (T(m_arr[_impl::little(r_idx)]) << (_impl::base_bits * r_idx)); + } + return res; +} + +template +constexpr wide_integer::operator long double() const noexcept +{ + if (_impl::operator_eq(*this, 0)) + return 0; + + wide_integer tmp = *this; + if (_impl::is_negative(*this)) + tmp = -tmp; + + long double res = 0; + for (size_t idx = 0; idx < _impl::arr_size; ++idx) + { + long double t = res; + res *= std::numeric_limits::max(); + res += t; + res += tmp.m_arr[_impl::big(idx)]; + } + + if (_impl::is_negative(*this)) + res = -res; + + return res; +} + +template +constexpr wide_integer::operator double() const noexcept +{ + return static_cast(*this); +} + +template +constexpr 
wide_integer::operator float() const noexcept +{ + return static_cast(*this); +} + +// Unary operators +template +constexpr wide_integer operator~(const wide_integer & lhs) noexcept +{ + return wide_integer::_impl::operator_unary_tilda(lhs); +} + +template +constexpr wide_integer operator-(const wide_integer & lhs) noexcept(is_same::value) +{ + return wide_integer::_impl::operator_unary_minus(lhs); +} + +template +constexpr wide_integer operator+(const wide_integer & lhs) noexcept(is_same::value) +{ + return lhs; +} + +// Binary operators +template +std::common_type_t, wide_integer> constexpr +operator*(const wide_integer & lhs, const wide_integer & rhs) +{ + return std::common_type_t, wide_integer>::_impl::operator_star(lhs, rhs); +} + +template +std::common_type_t constexpr operator*(const Arithmetic & lhs, const Arithmetic2 & rhs) +{ + return CT(lhs) * CT(rhs); +} + +template +std::common_type_t, wide_integer> constexpr +operator/(const wide_integer & lhs, const wide_integer & rhs) +{ + return std::common_type_t, wide_integer>::_impl::operator_slash(lhs, rhs); +} +template +std::common_type_t constexpr operator/(const Arithmetic & lhs, const Arithmetic2 & rhs) +{ + return CT(lhs) / CT(rhs); +} + +template +std::common_type_t, wide_integer> constexpr +operator+(const wide_integer & lhs, const wide_integer & rhs) +{ + return std::common_type_t, wide_integer>::_impl::operator_plus(lhs, rhs); +} +template +std::common_type_t constexpr operator+(const Arithmetic & lhs, const Arithmetic2 & rhs) +{ + return CT(lhs) + CT(rhs); +} + +template +std::common_type_t, wide_integer> constexpr +operator-(const wide_integer & lhs, const wide_integer & rhs) +{ + return std::common_type_t, wide_integer>::_impl::operator_minus(lhs, rhs); +} +template +std::common_type_t constexpr operator-(const Arithmetic & lhs, const Arithmetic2 & rhs) +{ + return CT(lhs) - CT(rhs); +} + +template +std::common_type_t, wide_integer> constexpr +operator%(const wide_integer & lhs, const wide_integer & rhs) +{ + return std::common_type_t, wide_integer>::_impl::operator_percent(lhs, rhs); +} +template +std::common_type_t constexpr operator%(const Integral & lhs, const Integral2 & rhs) +{ + return CT(lhs) % CT(rhs); +} + +template +std::common_type_t, wide_integer> constexpr +operator&(const wide_integer & lhs, const wide_integer & rhs) +{ + return std::common_type_t, wide_integer>::_impl::operator_amp(lhs, rhs); +} +template +std::common_type_t constexpr operator&(const Integral & lhs, const Integral2 & rhs) +{ + return CT(lhs) & CT(rhs); +} + +template +std::common_type_t, wide_integer> constexpr +operator|(const wide_integer & lhs, const wide_integer & rhs) +{ + return std::common_type_t, wide_integer>::_impl::operator_pipe(lhs, rhs); +} +template +std::common_type_t constexpr operator|(const Integral & lhs, const Integral2 & rhs) +{ + return CT(lhs) | CT(rhs); +} + +template +std::common_type_t, wide_integer> constexpr +operator^(const wide_integer & lhs, const wide_integer & rhs) +{ + return std::common_type_t, wide_integer>::_impl::operator_circumflex(lhs, rhs); +} +template +std::common_type_t constexpr operator^(const Integral & lhs, const Integral2 & rhs) +{ + return CT(lhs) ^ CT(rhs); +} + +template +constexpr wide_integer operator<<(const wide_integer & lhs, int n) noexcept +{ + return wide_integer::_impl::shift_left(lhs, n); +} +template +constexpr wide_integer operator>>(const wide_integer & lhs, int n) noexcept +{ + return wide_integer::_impl::shift_right(lhs, n); +} + +template +constexpr bool operator<(const 
wide_integer & lhs, const wide_integer & rhs) +{ + return std::common_type_t, wide_integer>::_impl::operator_less(lhs, rhs); +} +template +constexpr bool operator<(const Arithmetic & lhs, const Arithmetic2 & rhs) +{ + return CT(lhs) < CT(rhs); +} + +template +constexpr bool operator>(const wide_integer & lhs, const wide_integer & rhs) +{ + return std::common_type_t, wide_integer>::_impl::operator_more(lhs, rhs); +} +template +constexpr bool operator>(const Arithmetic & lhs, const Arithmetic2 & rhs) +{ + return CT(lhs) > CT(rhs); +} + +template +constexpr bool operator<=(const wide_integer & lhs, const wide_integer & rhs) +{ + return std::common_type_t, wide_integer>::_impl::operator_less(lhs, rhs) + || std::common_type_t, wide_integer>::_impl::operator_eq(lhs, rhs); +} +template +constexpr bool operator<=(const Arithmetic & lhs, const Arithmetic2 & rhs) +{ + return CT(lhs) <= CT(rhs); +} + +template +constexpr bool operator>=(const wide_integer & lhs, const wide_integer & rhs) +{ + return std::common_type_t, wide_integer>::_impl::operator_more(lhs, rhs) + || std::common_type_t, wide_integer>::_impl::operator_eq(lhs, rhs); +} +template +constexpr bool operator>=(const Arithmetic & lhs, const Arithmetic2 & rhs) +{ + return CT(lhs) >= CT(rhs); +} + +template +constexpr bool operator==(const wide_integer & lhs, const wide_integer & rhs) +{ + return std::common_type_t, wide_integer>::_impl::operator_eq(lhs, rhs); +} +template +constexpr bool operator==(const Arithmetic & lhs, const Arithmetic2 & rhs) +{ + return CT(lhs) == CT(rhs); +} + +template +constexpr bool operator!=(const wide_integer & lhs, const wide_integer & rhs) +{ + return !std::common_type_t, wide_integer>::_impl::operator_eq(lhs, rhs); +} +template +constexpr bool operator!=(const Arithmetic & lhs, const Arithmetic2 & rhs) +{ + return CT(lhs) != CT(rhs); +} + +template +inline std::string to_string(const wide_integer & n) +{ + std::string res; + if (wide_integer::_impl::operator_eq(n, 0U)) + return "0"; + + wide_integer t; + bool is_neg = wide_integer::_impl::is_negative(n); + if (is_neg) + t = wide_integer::_impl::operator_unary_minus(n); + else + t = n; + + while (!wide_integer::_impl::operator_eq(t, 0U)) + { + res.insert(res.begin(), '0' + char(wide_integer::_impl::operator_percent(t, 10U))); + t = wide_integer::_impl::operator_slash(t, 10U); + } + + if (is_neg) + res.insert(res.begin(), '-'); + return res; +} + +template +struct hash> +{ + std::size_t operator()(const wide_integer & lhs) const + { + static_assert(Bits % (sizeof(size_t) * 8) == 0); + + const auto * ptr = reinterpret_cast(lhs.m_arr); + unsigned count = Bits / (sizeof(size_t) * 8); + + size_t res = 0; + for (unsigned i = 0; i < count; ++i) + res ^= ptr[i]; + return hash()(res); + } +}; + +#undef CT +} diff --git a/base/common/ya.make b/base/common/ya.make index 7cac8f2c9a5..2bd08afbf3a 100644 --- a/base/common/ya.make +++ b/base/common/ya.make @@ -32,6 +32,8 @@ PEERDIR( contrib/restricted/cityhash-1.0.2 ) +CFLAGS(-g0) + SRCS( argsToConfig.cpp coverage.cpp diff --git a/base/common/ya.make.in b/base/common/ya.make.in index e841648692c..89c075da309 100644 --- a/base/common/ya.make.in +++ b/base/common/ya.make.in @@ -31,6 +31,8 @@ PEERDIR( contrib/restricted/cityhash-1.0.2 ) +CFLAGS(-g0) + SRCS( ) diff --git a/base/daemon/ya.make b/base/daemon/ya.make index 125417adca5..75ea54b6021 100644 --- a/base/daemon/ya.make +++ b/base/daemon/ya.make @@ -6,6 +6,8 @@ PEERDIR( clickhouse/src/Common ) +CFLAGS(-g0) + SRCS( BaseDaemon.cpp GraphiteWriter.cpp diff --git 
a/base/loggers/ya.make b/base/loggers/ya.make index b1c84042eee..6cb95633c72 100644 --- a/base/loggers/ya.make +++ b/base/loggers/ya.make @@ -4,6 +4,8 @@ PEERDIR( clickhouse/src/Common ) +CFLAGS(-g0) + SRCS( ExtendedLogChannel.cpp Loggers.cpp diff --git a/base/mysqlxx/Connection.cpp b/base/mysqlxx/Connection.cpp index 7ba14c9baba..8c7e11eb4a1 100644 --- a/base/mysqlxx/Connection.cpp +++ b/base/mysqlxx/Connection.cpp @@ -116,7 +116,7 @@ void Connection::connect(const char* db, throw ConnectionFailed(errorMessage(driver.get()), mysql_errno(driver.get())); /// Enables auto-reconnect. - my_bool reconnect = true; + bool reconnect = true; if (mysql_options(driver.get(), MYSQL_OPT_RECONNECT, reinterpret_cast(&reconnect))) throw ConnectionFailed(errorMessage(driver.get()), mysql_errno(driver.get())); diff --git a/base/readpassphrase/ya.make b/base/readpassphrase/ya.make index 80ad197e5d4..46f7f5983e3 100644 --- a/base/readpassphrase/ya.make +++ b/base/readpassphrase/ya.make @@ -1,5 +1,7 @@ LIBRARY() +CFLAGS(-g0) + SRCS( readpassphrase.c ) diff --git a/base/widechar_width/ya.make b/base/widechar_width/ya.make index fa0b4f705db..180aea001c1 100644 --- a/base/widechar_width/ya.make +++ b/base/widechar_width/ya.make @@ -2,6 +2,8 @@ LIBRARY() ADDINCL(GLOBAL clickhouse/base/widechar_width) +CFLAGS(-g0) + SRCS( widechar_width.cpp ) diff --git a/cmake/autogenerated_versions.txt b/cmake/autogenerated_versions.txt index ebb9bdcf568..6ca3999ff7f 100644 --- a/cmake/autogenerated_versions.txt +++ b/cmake/autogenerated_versions.txt @@ -1,9 +1,9 @@ # This strings autochanged from release_lib.sh: -SET(VERSION_REVISION 54438) +SET(VERSION_REVISION 54440) SET(VERSION_MAJOR 20) -SET(VERSION_MINOR 8) +SET(VERSION_MINOR 10) SET(VERSION_PATCH 1) -SET(VERSION_GITHASH 5d60ab33a511efd149c7c3de77c0dd4b81e65b13) -SET(VERSION_DESCRIBE v20.8.1.1-prestable) -SET(VERSION_STRING 20.8.1.1) +SET(VERSION_GITHASH 11a247d2f42010c1a17bf678c3e00a4bc89b23f8) +SET(VERSION_DESCRIBE v20.10.1.1-prestable) +SET(VERSION_STRING 20.10.1.1) # end of autochange diff --git a/cmake/find/ccache.cmake b/cmake/find/ccache.cmake index 59211e9d304..270db1b4e66 100644 --- a/cmake/find/ccache.cmake +++ b/cmake/find/ccache.cmake @@ -6,6 +6,11 @@ endif() if ((ENABLE_CCACHE OR NOT DEFINED ENABLE_CCACHE) AND NOT COMPILER_MATCHES_CCACHE) find_program (CCACHE_FOUND ccache) + if (CCACHE_FOUND) + set(ENABLE_CCACHE_BY_DEFAULT 1) + else() + set(ENABLE_CCACHE_BY_DEFAULT 0) + endif() endif() if (NOT CCACHE_FOUND AND NOT DEFINED ENABLE_CCACHE AND NOT COMPILER_MATCHES_CCACHE) @@ -13,7 +18,7 @@ if (NOT CCACHE_FOUND AND NOT DEFINED ENABLE_CCACHE AND NOT COMPILER_MATCHES_CCAC "Setting it up will significantly reduce compilation time for 2nd and consequent builds") endif() -option(ENABLE_CCACHE "Speedup re-compilations using ccache" ${CCACHE_FOUND}) +option(ENABLE_CCACHE "Speedup re-compilations using ccache" ${ENABLE_CCACHE_BY_DEFAULT}) if (NOT ENABLE_CCACHE) return() @@ -24,7 +29,7 @@ if (CCACHE_FOUND AND NOT COMPILER_MATCHES_CCACHE) string(REGEX REPLACE "ccache version ([0-9\\.]+).*" "\\1" CCACHE_VERSION ${CCACHE_VERSION}) if (CCACHE_VERSION VERSION_GREATER "3.2.0" OR NOT CMAKE_CXX_COMPILER_ID STREQUAL "Clang") - #message(STATUS "Using ${CCACHE_FOUND} ${CCACHE_VERSION}") + message(STATUS "Using ${CCACHE_FOUND} ${CCACHE_VERSION}") set_property (GLOBAL PROPERTY RULE_LAUNCH_COMPILE ${CCACHE_FOUND}) set_property (GLOBAL PROPERTY RULE_LAUNCH_LINK ${CCACHE_FOUND}) else () diff --git a/contrib/boost-cmake/CMakeLists.txt b/contrib/boost-cmake/CMakeLists.txt index 
62379f8c2dc..fd860c9f9b0 100644 --- a/contrib/boost-cmake/CMakeLists.txt +++ b/contrib/boost-cmake/CMakeLists.txt @@ -2,8 +2,8 @@ option (USE_INTERNAL_BOOST_LIBRARY "Use internal Boost library" ${NOT_UNBUNDLED} if (NOT USE_INTERNAL_BOOST_LIBRARY) # 1.70 like in contrib/boost - # 1.67 on CI - set(BOOST_VERSION 1.67) + # 1.71 on CI + set(BOOST_VERSION 1.71) find_package(Boost ${BOOST_VERSION} COMPONENTS system diff --git a/contrib/capnproto-cmake/CMakeLists.txt b/contrib/capnproto-cmake/CMakeLists.txt index 8bdac0beec0..949481e7ef5 100644 --- a/contrib/capnproto-cmake/CMakeLists.txt +++ b/contrib/capnproto-cmake/CMakeLists.txt @@ -74,12 +74,12 @@ target_link_libraries(capnpc PUBLIC capnp) # The library has substandard code if (COMPILER_GCC) - set (SUPPRESS_WARNINGS -Wno-non-virtual-dtor -Wno-sign-compare -Wno-strict-aliasing -Wno-maybe-uninitialized - -Wno-deprecated-declarations -Wno-class-memaccess) + set (SUPPRESS_WARNINGS -w) elseif (COMPILER_CLANG) - set (SUPPRESS_WARNINGS -Wno-non-virtual-dtor -Wno-sign-compare -Wno-strict-aliasing -Wno-deprecated-declarations) + set (SUPPRESS_WARNINGS -w) + set (CAPNP_PRIVATE_CXX_FLAGS -fno-char8_t) endif () -target_compile_options(kj PRIVATE ${SUPPRESS_WARNINGS}) -target_compile_options(capnp PRIVATE ${SUPPRESS_WARNINGS}) -target_compile_options(capnpc PRIVATE ${SUPPRESS_WARNINGS}) +target_compile_options(kj PRIVATE ${SUPPRESS_WARNINGS} ${CAPNP_PRIVATE_CXX_FLAGS}) +target_compile_options(capnp PRIVATE ${SUPPRESS_WARNINGS} ${CAPNP_PRIVATE_CXX_FLAGS}) +target_compile_options(capnpc PRIVATE ${SUPPRESS_WARNINGS} ${CAPNP_PRIVATE_CXX_FLAGS}) diff --git a/debian/changelog b/debian/changelog index c82a3c6657b..244b2b1fde4 100644 --- a/debian/changelog +++ b/debian/changelog @@ -1,5 +1,5 @@ -clickhouse (20.8.1.1) unstable; urgency=low +clickhouse (20.10.1.1) unstable; urgency=low * Modified source code - -- clickhouse-release Fri, 07 Aug 2020 21:45:46 +0300 + -- clickhouse-release Tue, 08 Sep 2020 17:04:39 +0300 diff --git a/debian/clickhouse-server.init b/debian/clickhouse-server.init index b82c70bd6e0..f56164759bf 100755 --- a/debian/clickhouse-server.init +++ b/debian/clickhouse-server.init @@ -67,13 +67,6 @@ if uname -mpi | grep -q 'x86_64'; then fi -SUPPORTED_COMMANDS="{start|stop|status|restart|forcestop|forcerestart|reload|condstart|condstop|condrestart|condreload|initdb}" -is_supported_command() -{ - echo "$SUPPORTED_COMMANDS" | grep -E "(\{|\|)$1(\||})" &> /dev/null -} - - is_running() { pgrep --pidfile "$CLICKHOUSE_PIDFILE" $(echo "${PROGRAM}" | cut -c1-15) 1> /dev/null 2> /dev/null @@ -283,13 +276,12 @@ use_cron() fi return 0 } - +# returns false if cron disabled (with systemd) enable_cron() { use_cron && sed -i 's/^#*//' "$CLICKHOUSE_CRONFILE" } - - +# returns false if cron disabled (with systemd) disable_cron() { use_cron && sed -i 's/^#*/#/' "$CLICKHOUSE_CRONFILE" @@ -312,15 +304,14 @@ main() EXIT_STATUS=0 case "$1" in start) - start && enable_cron + service_or_func start && enable_cron ;; stop) - # disable_cron returns false if cron disabled (with systemd) - not checking return status disable_cron - stop + service_or_func stop ;; restart) - restart && enable_cron + service_or_func restart && enable_cron ;; forcestop) disable_cron @@ -330,7 +321,7 @@ main() forcerestart && enable_cron ;; reload) - restart + service_or_func restart ;; condstart) is_running || service_or_func start @@ -354,7 +345,7 @@ main() disable_cron ;; *) - echo "Usage: $0 $SUPPORTED_COMMANDS" + echo "Usage: $0 
{start|stop|status|restart|forcestop|forcerestart|reload|condstart|condstop|condrestart|condreload|initdb}" exit 2 ;; esac diff --git a/debian/rules b/debian/rules index 5b271a8691f..ffe1f9e1228 100755 --- a/debian/rules +++ b/debian/rules @@ -18,7 +18,7 @@ ifeq ($(CCACHE_PREFIX),distcc) THREADS_COUNT=$(shell distcc -j) endif ifeq ($(THREADS_COUNT),) - THREADS_COUNT=$(shell nproc || grep -c ^processor /proc/cpuinfo || sysctl -n hw.ncpu || echo 4) + THREADS_COUNT=$(shell echo $$(( $$(nproc || grep -c ^processor /proc/cpuinfo || sysctl -n hw.ncpu || echo 8) / 2 )) ) endif DEB_BUILD_OPTIONS+=parallel=$(THREADS_COUNT) diff --git a/docker/builder/Dockerfile b/docker/builder/Dockerfile index b7dadc3ec6d..d4a121d13eb 100644 --- a/docker/builder/Dockerfile +++ b/docker/builder/Dockerfile @@ -6,7 +6,7 @@ RUN apt-get update \ && apt-get install ca-certificates lsb-release wget gnupg apt-transport-https \ --yes --no-install-recommends --verbose-versions \ && export LLVM_PUBKEY_HASH="bda960a8da687a275a2078d43c111d66b1c6a893a3275271beedf266c1ff4a0cdecb429c7a5cccf9f486ea7aa43fd27f" \ - && wget -O /tmp/llvm-snapshot.gpg.key https://apt.llvm.org/llvm-snapshot.gpg.key \ + && wget -nv -O /tmp/llvm-snapshot.gpg.key https://apt.llvm.org/llvm-snapshot.gpg.key \ && echo "${LLVM_PUBKEY_HASH} /tmp/llvm-snapshot.gpg.key" | sha384sum -c \ && apt-key add /tmp/llvm-snapshot.gpg.key \ && export CODENAME="$(lsb_release --codename --short | tr 'A-Z' 'a-z')" \ diff --git a/docker/client/Dockerfile b/docker/client/Dockerfile index fa7e3816959..5ce506aafa3 100644 --- a/docker/client/Dockerfile +++ b/docker/client/Dockerfile @@ -1,7 +1,7 @@ FROM ubuntu:18.04 ARG repository="deb https://repo.clickhouse.tech/deb/stable/ main/" -ARG version=20.8.1.* +ARG version=20.10.1.* RUN apt-get update \ && apt-get install --yes --no-install-recommends \ diff --git a/docker/packager/binary/Dockerfile b/docker/packager/binary/Dockerfile index b8650b945e1..45c35c2e0f3 100644 --- a/docker/packager/binary/Dockerfile +++ b/docker/packager/binary/Dockerfile @@ -1,5 +1,5 @@ # docker build -t yandex/clickhouse-binary-builder . 
-FROM ubuntu:19.10 +FROM ubuntu:20.04 ENV DEBIAN_FRONTEND=noninteractive LLVM_VERSION=10 @@ -7,7 +7,7 @@ RUN apt-get update \ && apt-get install ca-certificates lsb-release wget gnupg apt-transport-https \ --yes --no-install-recommends --verbose-versions \ && export LLVM_PUBKEY_HASH="bda960a8da687a275a2078d43c111d66b1c6a893a3275271beedf266c1ff4a0cdecb429c7a5cccf9f486ea7aa43fd27f" \ - && wget -O /tmp/llvm-snapshot.gpg.key https://apt.llvm.org/llvm-snapshot.gpg.key \ + && wget -nv -O /tmp/llvm-snapshot.gpg.key https://apt.llvm.org/llvm-snapshot.gpg.key \ && echo "${LLVM_PUBKEY_HASH} /tmp/llvm-snapshot.gpg.key" | sha384sum -c \ && apt-key add /tmp/llvm-snapshot.gpg.key \ && export CODENAME="$(lsb_release --codename --short | tr 'A-Z' 'a-z')" \ @@ -32,6 +32,8 @@ RUN apt-get update \ curl \ gcc-9 \ g++-9 \ + gcc-10 \ + g++-10 \ llvm-${LLVM_VERSION} \ clang-${LLVM_VERSION} \ lld-${LLVM_VERSION} \ @@ -55,7 +57,6 @@ RUN apt-get update \ cmake \ gdb \ rename \ - wget \ build-essential \ --yes --no-install-recommends @@ -83,14 +84,14 @@ RUN git clone https://github.com/tpoechtrager/cctools-port.git \ && rm -rf cctools-port # Download toolchain for Darwin -RUN wget https://github.com/phracker/MacOSX-SDKs/releases/download/10.14-beta4/MacOSX10.14.sdk.tar.xz +RUN wget -nv https://github.com/phracker/MacOSX-SDKs/releases/download/10.14-beta4/MacOSX10.14.sdk.tar.xz # Download toolchain for ARM # It contains all required headers and libraries. Note that it's named as "gcc" but actually we are using clang for cross compiling. -RUN wget "https://developer.arm.com/-/media/Files/downloads/gnu-a/8.3-2019.03/binrel/gcc-arm-8.3-2019.03-x86_64-aarch64-linux-gnu.tar.xz?revision=2e88a73f-d233-4f96-b1f4-d8b36e9bb0b9&la=en" -O gcc-arm-8.3-2019.03-x86_64-aarch64-linux-gnu.tar.xz +RUN wget -nv "https://developer.arm.com/-/media/Files/downloads/gnu-a/8.3-2019.03/binrel/gcc-arm-8.3-2019.03-x86_64-aarch64-linux-gnu.tar.xz?revision=2e88a73f-d233-4f96-b1f4-d8b36e9bb0b9&la=en" -O gcc-arm-8.3-2019.03-x86_64-aarch64-linux-gnu.tar.xz # Download toolchain for FreeBSD 11.3 -RUN wget https://clickhouse-datasets.s3.yandex.net/toolchains/toolchains/freebsd-11.3-toolchain.tar.xz +RUN wget -nv https://clickhouse-datasets.s3.yandex.net/toolchains/toolchains/freebsd-11.3-toolchain.tar.xz COPY build.sh / CMD ["/bin/bash", "/build.sh"] diff --git a/docker/packager/deb/Dockerfile b/docker/packager/deb/Dockerfile index 6d0fdca2310..87f4582f8e2 100644 --- a/docker/packager/deb/Dockerfile +++ b/docker/packager/deb/Dockerfile @@ -7,7 +7,7 @@ RUN apt-get update \ && apt-get install ca-certificates lsb-release wget gnupg apt-transport-https \ --yes --no-install-recommends --verbose-versions \ && export LLVM_PUBKEY_HASH="bda960a8da687a275a2078d43c111d66b1c6a893a3275271beedf266c1ff4a0cdecb429c7a5cccf9f486ea7aa43fd27f" \ - && wget -O /tmp/llvm-snapshot.gpg.key https://apt.llvm.org/llvm-snapshot.gpg.key \ + && wget -nv -O /tmp/llvm-snapshot.gpg.key https://apt.llvm.org/llvm-snapshot.gpg.key \ && echo "${LLVM_PUBKEY_HASH} /tmp/llvm-snapshot.gpg.key" | sha384sum -c \ && apt-key add /tmp/llvm-snapshot.gpg.key \ && export CODENAME="$(lsb_release --codename --short | tr 'A-Z' 'a-z')" \ @@ -34,7 +34,7 @@ RUN curl -O https://clickhouse-builds.s3.yandex.net/utils/1/dpkg-deb \ ENV APACHE_PUBKEY_HASH="bba6987b63c63f710fd4ed476121c588bc3812e99659d27a855f8c4d312783ee66ad6adfce238765691b04d62fa3688f" RUN export CODENAME="$(lsb_release --codename --short | tr 'A-Z' 'a-z')" \ - && wget -O /tmp/arrow-keyring.deb 
"https://apache.bintray.com/arrow/ubuntu/apache-arrow-archive-keyring-latest-${CODENAME}.deb" \ + && wget -nv -O /tmp/arrow-keyring.deb "https://apache.bintray.com/arrow/ubuntu/apache-arrow-archive-keyring-latest-${CODENAME}.deb" \ && echo "${APACHE_PUBKEY_HASH} /tmp/arrow-keyring.deb" | sha384sum -c \ && dpkg -i /tmp/arrow-keyring.deb diff --git a/docker/packager/packager b/docker/packager/packager index 251efb097f5..5874bedd17a 100755 --- a/docker/packager/packager +++ b/docker/packager/packager @@ -143,7 +143,7 @@ def parse_env_variables(build_type, compiler, sanitizer, package_type, image_typ if unbundled: # TODO: fix build with ENABLE_RDKAFKA - cmake_flags.append('-DUNBUNDLED=1 -DUSE_INTERNAL_RDKAFKA_LIBRARY=1') # too old version in ubuntu 19.10 + cmake_flags.append('-DUNBUNDLED=1 -DUSE_INTERNAL_RDKAFKA_LIBRARY=1 -DENABLE_ARROW=0 -DENABLE_ORC=0 -DENABLE_PARQUET=0') if split_binary: cmake_flags.append('-DUSE_STATIC_LIBRARIES=0 -DSPLIT_SHARED_LIBRARIES=1 -DCLICKHOUSE_SPLIT_BINARY=1') diff --git a/docker/server/Dockerfile b/docker/server/Dockerfile index 1ba00bf299d..c15bd89b646 100644 --- a/docker/server/Dockerfile +++ b/docker/server/Dockerfile @@ -1,7 +1,7 @@ FROM ubuntu:20.04 ARG repository="deb https://repo.clickhouse.tech/deb/stable/ main/" -ARG version=20.8.1.* +ARG version=20.10.1.* ARG gosu_ver=1.10 RUN apt-get update \ diff --git a/docker/test/Dockerfile b/docker/test/Dockerfile index c9144230da9..ae588af2459 100644 --- a/docker/test/Dockerfile +++ b/docker/test/Dockerfile @@ -1,7 +1,7 @@ FROM ubuntu:18.04 ARG repository="deb https://repo.clickhouse.tech/deb/stable/ main/" -ARG version=20.8.1.* +ARG version=20.10.1.* RUN apt-get update && \ apt-get install -y apt-transport-https dirmngr && \ diff --git a/docker/test/base/Dockerfile b/docker/test/base/Dockerfile index c9b0700ecfc..8117d2907bc 100644 --- a/docker/test/base/Dockerfile +++ b/docker/test/base/Dockerfile @@ -7,7 +7,7 @@ RUN apt-get update \ && apt-get install ca-certificates lsb-release wget gnupg apt-transport-https \ --yes --no-install-recommends --verbose-versions \ && export LLVM_PUBKEY_HASH="bda960a8da687a275a2078d43c111d66b1c6a893a3275271beedf266c1ff4a0cdecb429c7a5cccf9f486ea7aa43fd27f" \ - && wget -O /tmp/llvm-snapshot.gpg.key https://apt.llvm.org/llvm-snapshot.gpg.key \ + && wget -nv -O /tmp/llvm-snapshot.gpg.key https://apt.llvm.org/llvm-snapshot.gpg.key \ && echo "${LLVM_PUBKEY_HASH} /tmp/llvm-snapshot.gpg.key" | sha384sum -c \ && apt-key add /tmp/llvm-snapshot.gpg.key \ && export CODENAME="$(lsb_release --codename --short | tr 'A-Z' 'a-z')" \ diff --git a/docker/test/codebrowser/Dockerfile b/docker/test/codebrowser/Dockerfile index f9d239ef8ef..cb3462cad0e 100644 --- a/docker/test/codebrowser/Dockerfile +++ b/docker/test/codebrowser/Dockerfile @@ -15,7 +15,7 @@ RUN apt-get --allow-unauthenticated update -y \ gpg-agent \ git -RUN wget -O - https://apt.kitware.com/keys/kitware-archive-latest.asc 2>/dev/null | sudo apt-key add - +RUN wget -nv -O - https://apt.kitware.com/keys/kitware-archive-latest.asc | sudo apt-key add - RUN sudo apt-add-repository 'deb https://apt.kitware.com/ubuntu/ bionic main' RUN sudo echo "deb [trusted=yes] http://apt.llvm.org/bionic/ llvm-toolchain-bionic-8 main" >> /etc/apt/sources.list diff --git a/docker/test/fasttest/Dockerfile b/docker/test/fasttest/Dockerfile index 49845d72f1d..9b4bb574f8f 100644 --- a/docker/test/fasttest/Dockerfile +++ b/docker/test/fasttest/Dockerfile @@ -7,7 +7,7 @@ RUN apt-get update \ && apt-get install ca-certificates lsb-release wget gnupg 
apt-transport-https \ --yes --no-install-recommends --verbose-versions \ && export LLVM_PUBKEY_HASH="bda960a8da687a275a2078d43c111d66b1c6a893a3275271beedf266c1ff4a0cdecb429c7a5cccf9f486ea7aa43fd27f" \ - && wget -O /tmp/llvm-snapshot.gpg.key https://apt.llvm.org/llvm-snapshot.gpg.key \ + && wget -nv -O /tmp/llvm-snapshot.gpg.key https://apt.llvm.org/llvm-snapshot.gpg.key \ && echo "${LLVM_PUBKEY_HASH} /tmp/llvm-snapshot.gpg.key" | sha384sum -c \ && apt-key add /tmp/llvm-snapshot.gpg.key \ && export CODENAME="$(lsb_release --codename --short | tr 'A-Z' 'a-z')" \ @@ -61,7 +61,6 @@ RUN apt-get update \ software-properties-common \ tzdata \ unixodbc \ - wget \ --yes --no-install-recommends # This symlink required by gcc to find lld compiler @@ -70,7 +69,7 @@ RUN ln -s /usr/bin/lld-${LLVM_VERSION} /usr/bin/ld.lld ARG odbc_driver_url="https://github.com/ClickHouse/clickhouse-odbc/releases/download/v1.1.4.20200302/clickhouse-odbc-1.1.4-Linux.tar.gz" RUN mkdir -p /tmp/clickhouse-odbc-tmp \ - && wget --quiet -O - ${odbc_driver_url} | tar --strip-components=1 -xz -C /tmp/clickhouse-odbc-tmp \ + && wget -nv -O - ${odbc_driver_url} | tar --strip-components=1 -xz -C /tmp/clickhouse-odbc-tmp \ && cp /tmp/clickhouse-odbc-tmp/lib64/*.so /usr/local/lib/ \ && odbcinst -i -d -f /tmp/clickhouse-odbc-tmp/share/doc/clickhouse-odbc/config/odbcinst.ini.sample \ && odbcinst -i -s -l -f /tmp/clickhouse-odbc-tmp/share/doc/clickhouse-odbc/config/odbc.ini.sample \ diff --git a/docker/test/fasttest/run.sh b/docker/test/fasttest/run.sh index 0152f9c5cfd..3317bb06043 100755 --- a/docker/test/fasttest/run.sh +++ b/docker/test/fasttest/run.sh @@ -5,8 +5,7 @@ trap "exit" INT TERM trap 'kill $(jobs -pr) ||:' EXIT # This script is separated into two stages, cloning and everything else, so -# that we can run the "everything else" stage from the cloned source (we don't -# do this yet). +# that we can run the "everything else" stage from the cloned source. stage=${stage:-} # A variable to pass additional flags to CMake. @@ -16,7 +15,6 @@ stage=${stage:-} # empty parameter. read -ra FASTTEST_CMAKE_FLAGS <<< "${FASTTEST_CMAKE_FLAGS:-}" -ls -la function kill_clickhouse { @@ -60,6 +58,7 @@ function clone_root git clone https://github.com/ClickHouse/ClickHouse.git | ts '%Y-%m-%d %H:%M:%S' | tee /test_output/clone_log.txt cd ClickHouse CLICKHOUSE_DIR=$(pwd) +export CLICKHOUSE_DIR if [ "$PULL_REQUEST_NUMBER" != "0" ]; then @@ -211,6 +210,9 @@ TESTS_TO_SKIP=( # to make some progress. 00646_url_engine 00974_query_profiler + + # Look at DistributedFilesToInsert, so cannot run in parallel. + 01460_DistributedFilesToInsert ) clickhouse-test -j 4 --no-long --testname --shard --zookeeper --skip "${TESTS_TO_SKIP[@]}" 2>&1 | ts '%Y-%m-%d %H:%M:%S' | tee /test_output/test_log.txt @@ -248,12 +250,20 @@ fi case "$stage" in "") + ls -la ;& + "clone_root") clone_root - # TODO bootstrap into the cloned script here. Add this on Sep 1 2020 or - # later, so that most of the old branches are updated with this code. + + # Pass control to the script from cloned sources, unless asked otherwise. + if ! [ -v FASTTEST_LOCAL_SCRIPT ] + then + stage=run "$CLICKHOUSE_DIR/docker/test/fasttest/run.sh" + exit $? 
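+    # Re-executing the freshly cloned copy (above) means CI always runs the
+    # "everything else" stage with the script version from the checked-out
+    # sources, so changes to this script in a PR take effect in the same run.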
+ fi ;& + "run") run ;& diff --git a/docker/test/fuzzer/query-fuzzer-tweaks-users.xml b/docker/test/fuzzer/query-fuzzer-tweaks-users.xml index 8d430aa5c54..356d3212932 100644 --- a/docker/test/fuzzer/query-fuzzer-tweaks-users.xml +++ b/docker/test/fuzzer/query-fuzzer-tweaks-users.xml
@@ -2,6 +2,15 @@
         <max_execution_time>10</max_execution_time>
+
+        <!-- Don't let the fuzzer change this setting (I've actually seen it
+             do this). -->
+        <constraints>
+            <max_execution_time>
+                <max>10</max>
+            </max_execution_time>
+        </constraints>
+
 diff --git a/docker/test/fuzzer/run-fuzzer.sh b/docker/test/fuzzer/run-fuzzer.sh index 8cfe1a87408..3d70faca5e0 100755 --- a/docker/test/fuzzer/run-fuzzer.sh +++ b/docker/test/fuzzer/run-fuzzer.sh @@ -91,7 +91,7 @@ function fuzz ./clickhouse-client --query "select elapsed, query from system.processes" ||: killall clickhouse-server ||: - for x in {1..10} + for _ in {1..10} do if ! pgrep -f clickhouse-server then @@ -172,8 +172,59 @@ case "$stage" in echo "failure" > status.txt echo "Fuzzer failed ($fuzzer_exit_code). See the logs" > description.txt fi + ;& +"report")
+cat > report.html <<EOF
+<!DOCTYPE html>
+<html lang="en">
+<head>
+<title>AST Fuzzer for PR #${PR_TO_TEST} @ ${SHA_TO_TEST}</title>
+</head>
+<body>
+<h1>AST Fuzzer for PR #${PR_TO_TEST} @ ${SHA_TO_TEST}</h1>
+<table>
+<tr><th>Test name</th><th>Test status</th><th>Description</th></tr>
+<tr><td>AST Fuzzer</td><td>$(cat status.txt)</td><td>$(cat description.txt)</td></tr>
+</table>
+</body>
+</html>
+EOF
+;& esac +exit $task_exit_code \ No newline at end of file diff --git a/docker/test/integration/runner/Dockerfile index 95ab516cdaa..bfbe8da816f 100644 --- a/docker/test/integration/runner/Dockerfile +++ b/docker/test/integration/runner/Dockerfile @@ -46,7 +46,7 @@ RUN set -eux; \ \ # this "case" statement is generated via "update.sh" \ - if ! wget -O docker.tgz "https://download.docker.com/linux/static/${DOCKER_CHANNEL}/x86_64/docker-${DOCKER_VERSION}.tgz"; then \ + if ! wget -nv -O docker.tgz "https://download.docker.com/linux/static/${DOCKER_CHANNEL}/x86_64/docker-${DOCKER_VERSION}.tgz"; then \ echo >&2 "error: failed to download 'docker-${DOCKER_VERSION}' from '${DOCKER_CHANNEL}' for '${x86_64}'"; \ exit 1; \ fi; \ diff --git a/docker/test/integration/runner/compose/docker_compose_mysql.yml b/docker/test/integration/runner/compose/docker_compose_mysql.yml index 2e3afce117d..2f09c2c01e3 100644 --- a/docker/test/integration/runner/compose/docker_compose_mysql.yml +++ b/docker/test/integration/runner/compose/docker_compose_mysql.yml @@ -7,3 +7,4 @@ services: MYSQL_ROOT_PASSWORD: clickhouse ports: - 3308:3306 + command: --server_id=100 --log-bin='mysql-bin-1.log' --default-time-zone='+3:00' --gtid-mode="ON" --enforce-gtid-consistency \ No newline at end of file diff --git a/docker/test/integration/runner/compose/docker_compose_mysql_5_7.yml b/docker/test/integration/runner/compose/docker_compose_mysql_5_7.yml deleted file mode 100644 index f42d2c6dd79..00000000000 --- a/docker/test/integration/runner/compose/docker_compose_mysql_5_7.yml +++ /dev/null @@ -1,10 +0,0 @@ -version: '2.3' -services: - mysql5_7: - image: mysql:5.7 - restart: always - environment: - MYSQL_ROOT_PASSWORD: clickhouse - ports: - - 33307:3306 - command: --server_id=100 --log-bin='mysql-bin-1.log' --default-time-zone='+3:00' --gtid-mode="ON" --enforce-gtid-consistency diff --git a/docker/test/integration/runner/compose/docker_compose_redis.yml b/docker/test/integration/runner/compose/docker_compose_redis.yml index 2dc79ed5910..2c9ace96d0c 100644 --- a/docker/test/integration/runner/compose/docker_compose_redis.yml +++ b/docker/test/integration/runner/compose/docker_compose_redis.yml @@ -5,3 +5,4 @@ services: restart: always ports: - 6380:6379 + command: redis-server --requirepass "clickhouse" diff --git a/docker/test/performance-comparison/compare.sh b/docker/test/performance-comparison/compare.sh index d3b9fc2214e..364e9994ab7 100755 --- a/docker/test/performance-comparison/compare.sh +++ b/docker/test/performance-comparison/compare.sh @@ -536,7 +536,9 @@ create table queries engine File(TSVWithNamesAndTypes, 'report/queries.tsv') left join query_display_names on query_metric_stats.test = query_display_names.test and query_metric_stats.query_index = query_display_names.query_index - where metric_name = 'server_time' + -- 'server_time' is rounded down to ms, which might be bad for very short queries. + -- Use 'client_time' instead.
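+        -- ('client_time' is the wall-clock time measured around each query run
+        -- by perf.py, so it keeps its precision even for sub-millisecond queries.)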
+ where metric_name = 'client_time' order by test, query_index, metric_name ; @@ -563,40 +565,54 @@ create table unstable_queries_report engine File(TSV, 'report/unstable-queries.t toDecimal64(stat_threshold, 3), unstable_fail, test, query_index, query_display_name from queries where unstable_show order by stat_threshold desc; -create table test_time_changes engine File(TSV, 'report/test-time-changes.tsv') as - select test, queries, average_time_change from ( - select test, count(*) queries, - sum(left) as left, sum(right) as right, - (right - left) / right average_time_change - from queries - group by test - order by abs(average_time_change) desc - ) - ; -create table unstable_tests engine File(TSV, 'report/unstable-tests.tsv') as - select test, sum(unstable_show) total_unstable, sum(changed_show) total_changed +create view test_speedup as + select + test, + exp2(avg(log2(left / right))) times_speedup, + count(*) queries, + unstable + changed bad, + sum(changed_show) changed, + sum(unstable_show) unstable from queries group by test - order by total_unstable + total_changed desc + order by times_speedup desc + ; + +create view total_speedup as + select + 'Total' test, + exp2(avg(log2(times_speedup))) times_speedup, + sum(queries) queries, + unstable + changed bad, + sum(changed) changed, + sum(unstable) unstable + from test_speedup ; create table test_perf_changes_report engine File(TSV, 'report/test-perf-changes.tsv') as - select test, - queries, - coalesce(total_unstable, 0) total_unstable, - coalesce(total_changed, 0) total_changed, - total_unstable + total_changed total_bad, - coalesce(toString(toDecimal64(average_time_change, 3)), '??') average_time_change_str - from test_time_changes - full join unstable_tests - using test - where (abs(average_time_change) > 0.05 and queries > 5) - or (total_bad > 0) - order by total_bad desc, average_time_change desc - settings join_use_nulls = 1 + with + (times_speedup >= 1 + ? '-' || toString(toDecimal64(times_speedup, 3)) || 'x' + : '+' || toString(toDecimal64(1 / times_speedup, 3)) || 'x') + as times_speedup_str + select test, times_speedup_str, queries, bad, changed, unstable + -- Not sure what's the precedence of UNION ALL vs WHERE & ORDER BY, hence all + -- the braces. + from ( + ( + select * from total_speedup + ) union all ( + select * from test_speedup + where + (times_speedup >= 1 ? times_speedup : (1 / times_speedup)) >= 1.005 + or bad + ) + ) + order by test = 'Total' desc, times_speedup desc ; + create view total_client_time_per_query as select * from file('analyze/client-times.tsv', TSV, 'test text, query_index int, client float, server float'); @@ -888,7 +904,10 @@ for log in *-err.log do test=$(basename "$log" "-err.log") { - grep -H -m2 -i '\(Exception\|Error\):[^:]' "$log" \ + # The second grep is a heuristic for error messages like + # "socket.timeout: timed out". 
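+        # Fallback chain: look for explicit Exception/Error lines first, then
+        # for bare "name: message" lines, and finally just take the first two
+        # lines of the log.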
+ grep -h -m2 -i '\(Exception\|Error\):[^:]' "$log" \ + || grep -h -m2 -i '^[^ ]\+: ' "$log" \ || head -2 "$log" } | sed "s/^/$test\t/" >> run-errors.tsv ||: done diff --git a/docker/test/performance-comparison/perf.py b/docker/test/performance-comparison/perf.py index a659326b068..e1476d9aeb4 100755 --- a/docker/test/performance-comparison/perf.py +++ b/docker/test/performance-comparison/perf.py @@ -262,6 +262,13 @@ for query_index, q in enumerate(test_queries): print(f'query\t{query_index}\t{run_id}\t{conn_index}\t{c.last_query.elapsed}') server_seconds += c.last_query.elapsed + if c.last_query.elapsed > 10: + # Stop processing pathologically slow queries, to avoid timing out + # the entire test task. This shouldn't really happen, so we don't + # need much handling for this case and can just exit. + print(f'The query no. {query_index} is taking too long to run ({c.last_query.elapsed} s)', file=sys.stderr) + exit(2) + client_seconds = time.perf_counter() - start_seconds print(f'client-time\t{query_index}\t{client_seconds}\t{server_seconds}') diff --git a/docker/test/performance-comparison/report.py b/docker/test/performance-comparison/report.py index ecf9c7a45e5..1003a6d0e1a 100755 --- a/docker/test/performance-comparison/report.py +++ b/docker/test/performance-comparison/report.py @@ -168,12 +168,6 @@ def nextRowAnchor(): global table_anchor return f'{table_anchor}.{row_anchor + 1}' -def setRowAnchor(anchor_row_part): - global row_anchor - global table_anchor - row_anchor = anchor_row_part - return currentRowAnchor() - def advanceRowAnchor(): global row_anchor global table_anchor @@ -376,7 +370,7 @@ if args.report == 'main': columns = [ 'Old, s', # 0 'New, s', # 1 - 'Times speedup / slowdown', # 2 + 'Ratio of speedup (-) or slowdown (+)', # 2 'Relative difference (new − old) / old', # 3 'p < 0.001 threshold', # 4 # Failed # 5 @@ -453,7 +447,7 @@ if args.report == 'main': addSimpleTable('Skipped tests', ['Test', 'Reason'], skipped_tests_rows) addSimpleTable('Test performance changes', - ['Test', 'Queries', 'Unstable', 'Changed perf', 'Total not OK', 'Avg relative time diff'], + ['Test', 'Ratio of speedup (-) or slowdown (+)', 'Queries', 'Total not OK', 'Changed perf', 'Unstable'], tsvRows('report/test-perf-changes.tsv')) def add_test_times(): @@ -480,11 +474,12 @@ if args.report == 'main': total_runs = (nominal_runs + 1) * 2 # one prewarm run, two servers attrs = ['' for c in columns] for r in rows: + anchor = f'{currentTableAnchor()}.{r[0]}' if float(r[6]) > 1.5 * total_runs: # FIXME should be 15s max -- investigate parallel_insert slow_average_tests += 1 attrs[6] = f'style="background: {color_bad}"' - errors_explained.append([f'The test \'{r[0]}\' is too slow to run as a whole. Investigate whether the create and fill queries can be sped up']) + errors_explained.append([f'The test \'{r[0]}\' is too slow to run as a whole. 
Investigate whether the create and fill queries can be sped up']) else: attrs[6] = '' @@ -495,7 +490,7 @@ if args.report == 'main': else: attrs[5] = '' - text += tableRow(r, attrs) + text += tableRow(r, attrs, anchor) text += tableEnd() tables.append(text) @@ -652,7 +647,7 @@ elif args.report == 'all-queries': # Unstable #1 'Old, s', #2 'New, s', #3 - 'Times speedup / slowdown', #4 + 'Ratio of speedup (-) or slowdown (+)', #4 'Relative difference (new − old) / old', #5 'p < 0.001 threshold', #6 'Test', #7 diff --git a/docker/test/pvs/Dockerfile b/docker/test/pvs/Dockerfile index ebd9c105705..0aedb67e572 100644 --- a/docker/test/pvs/Dockerfile +++ b/docker/test/pvs/Dockerfile @@ -12,8 +12,8 @@ RUN apt-get update --yes \ strace \ --yes --no-install-recommends -#RUN wget -q -O - http://files.viva64.com/etc/pubkey.txt | sudo apt-key add - -#RUN sudo wget -O /etc/apt/sources.list.d/viva64.list http://files.viva64.com/etc/viva64.list +#RUN wget -nv -O - http://files.viva64.com/etc/pubkey.txt | sudo apt-key add - +#RUN sudo wget -nv -O /etc/apt/sources.list.d/viva64.list http://files.viva64.com/etc/viva64.list # #RUN apt-get --allow-unauthenticated update -y \ # && env DEBIAN_FRONTEND=noninteractive \ @@ -24,10 +24,10 @@ ENV PKG_VERSION="pvs-studio-latest" RUN set -x \ && export PUBKEY_HASHSUM="486a0694c7f92e96190bbfac01c3b5ac2cb7823981db510a28f744c99eabbbf17a7bcee53ca42dc6d84d4323c2742761" \ - && wget https://files.viva64.com/etc/pubkey.txt -O /tmp/pubkey.txt \ + && wget -nv https://files.viva64.com/etc/pubkey.txt -O /tmp/pubkey.txt \ && echo "${PUBKEY_HASHSUM} /tmp/pubkey.txt" | sha384sum -c \ && apt-key add /tmp/pubkey.txt \ - && wget "https://files.viva64.com/${PKG_VERSION}.deb" \ + && wget -nv "https://files.viva64.com/${PKG_VERSION}.deb" \ && { debsig-verify ${PKG_VERSION}.deb \ || echo "WARNING: Some file was just downloaded from the internet without any validation and we are installing it into the system"; } \ && dpkg -i "${PKG_VERSION}.deb" diff --git a/docker/test/stateful/run.sh b/docker/test/stateful/run.sh index 5be14970914..c3576acc0e4 100755 --- a/docker/test/stateful/run.sh +++ b/docker/test/stateful/run.sh @@ -29,17 +29,26 @@ if [[ -n "$USE_DATABASE_ATOMIC" ]] && [[ "$USE_DATABASE_ATOMIC" -eq 1 ]]; then ln -s /usr/share/clickhouse-test/config/database_atomic_usersd.xml /etc/clickhouse-server/users.d/ fi -echo "TSAN_OPTIONS='verbosity=1000 halt_on_error=1 history_size=7'" >> /etc/environment -echo "TSAN_SYMBOLIZER_PATH=/usr/lib/llvm-10/bin/llvm-symbolizer" >> /etc/environment -echo "UBSAN_OPTIONS='print_stacktrace=1'" >> /etc/environment -echo "ASAN_SYMBOLIZER_PATH=/usr/lib/llvm-10/bin/llvm-symbolizer" >> /etc/environment -echo "UBSAN_SYMBOLIZER_PATH=/usr/lib/llvm-10/bin/llvm-symbolizer" >> /etc/environment -echo "LLVM_SYMBOLIZER_PATH=/usr/lib/llvm-10/bin/llvm-symbolizer" >> /etc/environment +function start() +{ + counter=0 + until clickhouse-client --query "SELECT 1" + do + if [ "$counter" -gt 120 ] + then + echo "Cannot start clickhouse-server" + cat /var/log/clickhouse-server/stdout.log + tail -n1000 /var/log/clickhouse-server/stderr.log + tail -n1000 /var/log/clickhouse-server/clickhouse-server.log + break + fi + timeout 120 service clickhouse-server start + sleep 0.5 + counter=$(($counter + 1)) + done +} -service zookeeper start -sleep 5 -service clickhouse-server start -sleep 5 +start /s3downloader --dataset-names $DATASETS chmod 777 -R /var/lib/clickhouse clickhouse-client --query "SHOW DATABASES" diff --git a/docker/test/stateful/s3downloader 
b/docker/test/stateful/s3downloader index f8e2bf3cbe4..fb49931f022 100755 --- a/docker/test/stateful/s3downloader +++ b/docker/test/stateful/s3downloader @@ -2,6 +2,7 @@ # -*- coding: utf-8 -*- import os import sys +import time import tarfile import logging import argparse @@ -16,6 +17,8 @@ AVAILABLE_DATASETS = { 'visits': 'visits_v1.tar', } +RETRIES_COUNT = 5 + def _get_temp_file_name(): return os.path.join(tempfile._get_default_tempdir(), next(tempfile._get_candidate_names())) @@ -24,25 +27,37 @@ def build_url(base_url, dataset): def dowload_with_progress(url, path): logging.info("Downloading from %s to temp path %s", url, path) - with open(path, 'w') as f: - response = requests.get(url, stream=True) - response.raise_for_status() - total_length = response.headers.get('content-length') - if total_length is None or int(total_length) == 0: - logging.info("No content-length, will download file without progress") - f.write(response.content) - else: - dl = 0 - total_length = int(total_length) - logging.info("Content length is %ld bytes", total_length) - for data in response.iter_content(chunk_size=4096): - dl += len(data) - f.write(data) - if sys.stdout.isatty(): - done = int(50 * dl / total_length) - percent = int(100 * float(dl) / total_length) - sys.stdout.write("\r[{}{}] {}%".format('=' * done, ' ' * (50-done), percent)) - sys.stdout.flush() + for i in range(RETRIES_COUNT): + try: + with open(path, 'w') as f: + response = requests.get(url, stream=True) + response.raise_for_status() + total_length = response.headers.get('content-length') + if total_length is None or int(total_length) == 0: + logging.info("No content-length, will download file without progress") + f.write(response.content) + else: + dl = 0 + total_length = int(total_length) + logging.info("Content length is %ld bytes", total_length) + for data in response.iter_content(chunk_size=4096): + dl += len(data) + f.write(data) + if sys.stdout.isatty(): + done = int(50 * dl / total_length) + percent = int(100 * float(dl) / total_length) + sys.stdout.write("\r[{}{}] {}%".format('=' * done, ' ' * (50-done), percent)) + sys.stdout.flush() + break + except Exception as ex: + sys.stdout.write("\n") + time.sleep(3) + logging.info("Exception while downloading %s, retry %s", ex, i + 1) + if os.path.exists(path): + os.remove(path) + else: + raise Exception("Cannot download dataset from {}, all retries exceeded".format(url)) + sys.stdout.write("\n") logging.info("Downloading finished") diff --git a/docker/test/stateful_with_coverage/run.sh b/docker/test/stateful_with_coverage/run.sh index 8928fc28f80..c2434b319b9 100755 --- a/docker/test/stateful_with_coverage/run.sh +++ b/docker/test/stateful_with_coverage/run.sh @@ -71,14 +71,26 @@ ln -s /usr/share/clickhouse-test/config/macros.xml /etc/clickhouse-server/config ln -s --backup=simple --suffix=_original.xml \ /usr/share/clickhouse-test/config/query_masking_rules.xml /etc/clickhouse-server/config.d/ +function start() +{ + counter=0 + until clickhouse-client --query "SELECT 1" + do + if [ "$counter" -gt 120 ] + then + echo "Cannot start clickhouse-server" + cat /var/log/clickhouse-server/stdout.log + tail -n1000 /var/log/clickhouse-server/stderr.log + tail -n1000 /var/log/clickhouse-server/clickhouse-server.log + break + fi + timeout 120 service clickhouse-server start + sleep 0.5 + counter=$(($counter + 1)) + done +} -service zookeeper start - -sleep 5 - -start_clickhouse - -sleep 5 +start if ! 
/s3downloader --dataset-names $DATASETS; then
    echo "Cannot download datasets"
diff --git a/docker/test/stateful_with_coverage/s3downloader b/docker/test/stateful_with_coverage/s3downloader
index f8e2bf3cbe4..fb49931f022 100755
--- a/docker/test/stateful_with_coverage/s3downloader
+++ b/docker/test/stateful_with_coverage/s3downloader
@@ -2,6 +2,7 @@
 # -*- coding: utf-8 -*-
 import os
 import sys
+import time
 import tarfile
 import logging
 import argparse
@@ -16,6 +17,8 @@ AVAILABLE_DATASETS = {
     'visits': 'visits_v1.tar',
 }
 
+RETRIES_COUNT = 5
+
 def _get_temp_file_name():
     return os.path.join(tempfile._get_default_tempdir(), next(tempfile._get_candidate_names()))
 
@@ -24,25 +27,37 @@ def build_url(base_url, dataset):
 
 def dowload_with_progress(url, path):
     logging.info("Downloading from %s to temp path %s", url, path)
-    with open(path, 'w') as f:
-        response = requests.get(url, stream=True)
-        response.raise_for_status()
-        total_length = response.headers.get('content-length')
-        if total_length is None or int(total_length) == 0:
-            logging.info("No content-length, will download file without progress")
-            f.write(response.content)
-        else:
-            dl = 0
-            total_length = int(total_length)
-            logging.info("Content length is %ld bytes", total_length)
-            for data in response.iter_content(chunk_size=4096):
-                dl += len(data)
-                f.write(data)
-                if sys.stdout.isatty():
-                    done = int(50 * dl / total_length)
-                    percent = int(100 * float(dl) / total_length)
-                    sys.stdout.write("\r[{}{}] {}%".format('=' * done, ' ' * (50-done), percent))
-                    sys.stdout.flush()
+    for i in range(RETRIES_COUNT):
+        try:
+            with open(path, 'w') as f:
+                response = requests.get(url, stream=True)
+                response.raise_for_status()
+                total_length = response.headers.get('content-length')
+                if total_length is None or int(total_length) == 0:
+                    logging.info("No content-length, will download file without progress")
+                    f.write(response.content)
+                else:
+                    dl = 0
+                    total_length = int(total_length)
+                    logging.info("Content length is %ld bytes", total_length)
+                    for data in response.iter_content(chunk_size=4096):
+                        dl += len(data)
+                        f.write(data)
+                        if sys.stdout.isatty():
+                            done = int(50 * dl / total_length)
+                            percent = int(100 * float(dl) / total_length)
+                            sys.stdout.write("\r[{}{}] {}%".format('=' * done, ' ' * (50-done), percent))
+                            sys.stdout.flush()
+            break
+        except Exception as ex:
+            sys.stdout.write("\n")
+            time.sleep(3)
+            logging.info("Exception while downloading %s, retry %s", ex, i + 1)
+            if os.path.exists(path):
+                os.remove(path)
+    else:
+        raise Exception("Cannot download dataset from {}, all retries exceeded".format(url))
+
+    sys.stdout.write("\n")
     logging.info("Downloading finished")
diff --git a/docker/test/stateless/Dockerfile b/docker/test/stateless/Dockerfile
index d3bc03a8f92..409a1b07bef 100644
--- a/docker/test/stateless/Dockerfile
+++ b/docker/test/stateless/Dockerfile
@@ -26,7 +26,7 @@ RUN apt-get update -y \
     zookeeperd
 
 RUN mkdir -p /tmp/clickhouse-odbc-tmp \
-    && wget --quiet -O - ${odbc_driver_url} | tar --strip-components=1 -xz -C /tmp/clickhouse-odbc-tmp \
+    && wget -nv -O - ${odbc_driver_url} | tar --strip-components=1 -xz -C /tmp/clickhouse-odbc-tmp \
    && cp /tmp/clickhouse-odbc-tmp/lib64/*.so /usr/local/lib/ \
    && odbcinst -i -d -f /tmp/clickhouse-odbc-tmp/share/doc/clickhouse-odbc/config/odbcinst.ini.sample \
    && odbcinst -i -s -l -f /tmp/clickhouse-odbc-tmp/share/doc/clickhouse-odbc/config/odbc.ini.sample \
diff --git a/docker/test/stateless_unbundled/Dockerfile b/docker/test/stateless_unbundled/Dockerfile
index 7de29fede72..b05e46406da 100644
--- a/docker/test/stateless_unbundled/Dockerfile
+++ b/docker/test/stateless_unbundled/Dockerfile
@@ -71,7 +71,7 @@ RUN apt-get --allow-unauthenticated update -y \
     zookeeperd
 
 RUN mkdir -p /tmp/clickhouse-odbc-tmp \
-    && wget --quiet -O - ${odbc_driver_url} | tar --strip-components=1 -xz -C /tmp/clickhouse-odbc-tmp \
+    && wget -nv -O - ${odbc_driver_url} | tar --strip-components=1 -xz -C /tmp/clickhouse-odbc-tmp \
    && cp /tmp/clickhouse-odbc-tmp/lib64/*.so /usr/local/lib/ \
    && odbcinst -i -d -f /tmp/clickhouse-odbc-tmp/share/doc/clickhouse-odbc/config/odbcinst.ini.sample \
    && odbcinst -i -s -l -f /tmp/clickhouse-odbc-tmp/share/doc/clickhouse-odbc/config/odbc.ini.sample \
diff --git a/docker/test/stateless_with_coverage/Dockerfile b/docker/test/stateless_with_coverage/Dockerfile
index f3539804852..77357d5142f 100644
--- a/docker/test/stateless_with_coverage/Dockerfile
+++ b/docker/test/stateless_with_coverage/Dockerfile
@@ -33,7 +33,7 @@ RUN apt-get update -y \
     qemu-user-static
 
 RUN mkdir -p /tmp/clickhouse-odbc-tmp \
-    && wget --quiet -O - ${odbc_driver_url} | tar --strip-components=1 -xz -C /tmp/clickhouse-odbc-tmp \
+    && wget -nv -O - ${odbc_driver_url} | tar --strip-components=1 -xz -C /tmp/clickhouse-odbc-tmp \
    && cp /tmp/clickhouse-odbc-tmp/lib64/*.so /usr/local/lib/ \
    && odbcinst -i -d -f /tmp/clickhouse-odbc-tmp/share/doc/clickhouse-odbc/config/odbcinst.ini.sample \
    && odbcinst -i -s -l -f /tmp/clickhouse-odbc-tmp/share/doc/clickhouse-odbc/config/odbc.ini.sample \
diff --git a/docker/test/testflows/runner/Dockerfile b/docker/test/testflows/runner/Dockerfile
index 6b4ec12b80c..898552ade56 100644
--- a/docker/test/testflows/runner/Dockerfile
+++ b/docker/test/testflows/runner/Dockerfile
@@ -44,7 +44,7 @@ RUN set -eux; \
 	\
 # this "case" statement is generated via "update.sh"
 	\
-	if ! wget -O docker.tgz "https://download.docker.com/linux/static/${DOCKER_CHANNEL}/x86_64/docker-${DOCKER_VERSION}.tgz"; then \
+	if ! wget -nv -O docker.tgz "https://download.docker.com/linux/static/${DOCKER_CHANNEL}/x86_64/docker-${DOCKER_VERSION}.tgz"; then \
 		echo >&2 "error: failed to download 'docker-${DOCKER_VERSION}' from '${DOCKER_CHANNEL}' for '${x86_64}'"; \
 		exit 1; \
 	fi; \
diff --git a/docs/en/engines/table-engines/integrations/kafka.md b/docs/en/engines/table-engines/integrations/kafka.md
index 3324386e1c5..fe9aa2ca25e 100644
--- a/docs/en/engines/table-engines/integrations/kafka.md
+++ b/docs/en/engines/table-engines/integrations/kafka.md
@@ -32,7 +32,8 @@ SETTINGS
     [kafka_num_consumers = N,]
     [kafka_max_block_size = 0,]
     [kafka_skip_broken_messages = N,]
-    [kafka_commit_every_batch = 0]
+    [kafka_commit_every_batch = 0,]
+    [kafka_thread_per_consumer = 0]
 ```
 
 Required parameters:
@@ -50,6 +51,7 @@ Optional parameters:
 - `kafka_max_block_size` - The maximum batch size (in messages) for poll (default: `max_block_size`).
 - `kafka_skip_broken_messages` – Kafka message parser tolerance to schema-incompatible messages per block. Default: `0`. If `kafka_skip_broken_messages = N` then the engine skips *N* Kafka messages that cannot be parsed (a message equals a row of data).
 - `kafka_commit_every_batch` - Commit every consumed and handled batch instead of a single commit after writing a whole block (default: `0`).
+- `kafka_thread_per_consumer` - Provide an independent thread for each consumer (default: `0`). When enabled, every consumer flushes the data independently and in parallel (otherwise, rows from several consumers are squashed to form one block).
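To make the new setting concrete, here is a minimal sketch of a Kafka engine table that reads with one thread per consumer; the broker address, topic, group, and column set are illustrative assumptions, not part of the change above:

``` sql
-- Hypothetical example: four consumers, each polling and flushing in its own thread.
CREATE TABLE queue (
    timestamp UInt64,
    level String,
    message String
) ENGINE = Kafka
SETTINGS kafka_broker_list = 'localhost:9092',
         kafka_topic_list = 'topic1',
         kafka_group_name = 'group1',
         kafka_format = 'JSONEachRow',
         kafka_num_consumers = 4,
         kafka_thread_per_consumer = 1;
```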
Examples:
diff --git a/docs/en/engines/table-engines/integrations/rabbitmq.md b/docs/en/engines/table-engines/integrations/rabbitmq.md
index 7d09c6f72a5..284d64f459f 100644
--- a/docs/en/engines/table-engines/integrations/rabbitmq.md
+++ b/docs/en/engines/table-engines/integrations/rabbitmq.md
@@ -7,7 +7,7 @@ toc_title: RabbitMQ
 
 This engine allows integrating ClickHouse with [RabbitMQ](https://www.rabbitmq.com).
 
-RabbitMQ lets you:
+`RabbitMQ` lets you:
 
 - Publish or subscribe to data flows.
 - Process streams as they become available.
@@ -27,9 +27,15 @@ CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]
     [rabbitmq_exchange_type = 'exchange_type',]
     [rabbitmq_routing_key_list = 'key1,key2,...',]
     [rabbitmq_row_delimiter = 'delimiter_symbol',]
+    [rabbitmq_schema = '',]
     [rabbitmq_num_consumers = N,]
     [rabbitmq_num_queues = N,]
-    [rabbitmq_transactional_channel = 0]
+    [rabbitmq_queue_base = 'queue',]
+    [rabbitmq_deadletter_exchange = 'dl-exchange',]
+    [rabbitmq_persistent = 0,]
+    [rabbitmq_skip_broken_messages = N,]
+    [rabbitmq_max_block_size = N,]
+    [rabbitmq_flush_interval_ms = N]
 ```
 
 Required parameters:
@@ -40,12 +46,18 @@ Required parameters:
 
 Optional parameters:
 
-- `rabbitmq_exchange_type` – The type of RabbitMQ exchange: `direct`, `fanout`, `topic`, `headers`, `consistent-hash`. Default: `fanout`.
+- `rabbitmq_exchange_type` – The type of RabbitMQ exchange: `direct`, `fanout`, `topic`, `headers`, `consistent_hash`. Default: `fanout`.
 - `rabbitmq_routing_key_list` – A comma-separated list of routing keys.
 - `rabbitmq_row_delimiter` – Delimiter character, which ends the message.
+- `rabbitmq_schema` – Parameter that must be used if the format requires a schema definition. For example, [Cap’n Proto](https://capnproto.org/) requires the path to the schema file and the name of the root `schema.capnp:Message` object.
 - `rabbitmq_num_consumers` – The number of consumers per table. Default: `1`. Specify more consumers if the throughput of one consumer is insufficient.
-- `rabbitmq_num_queues` – The number of queues per consumer. Default: `1`. Specify more queues if the capacity of one queue per consumer is insufficient. Single queue can contain up to 50K messages at the same time.
-- `rabbitmq_transactional_channel` – Wrap insert queries in transactions. Default: `0`.
+- `rabbitmq_num_queues` – The number of queues per consumer. Default: `1`. Specify more queues if the capacity of one queue per consumer is insufficient.
+- `rabbitmq_queue_base` - Specify a base name for the queues that will be declared. By default, queues are declared unique to tables, based on db and table names.
+- `rabbitmq_deadletter_exchange` - Specify the name of a [dead letter exchange](https://www.rabbitmq.com/dlx.html). You can create another table with this exchange name and collect messages in cases when they are republished to the dead letter exchange. By default, no dead letter exchange is specified.
+- `rabbitmq_persistent` - If set to 1 (true), the delivery mode of insert queries will be set to 2 (which marks messages as 'persistent'). Default: `0`.
+- `rabbitmq_skip_broken_messages` – RabbitMQ message parser tolerance to schema-incompatible messages per block. Default: `0`. If `rabbitmq_skip_broken_messages = N` then the engine skips *N* RabbitMQ messages that cannot be parsed (a message equals a row of data).
+- `rabbitmq_max_block_size`
+- `rabbitmq_flush_interval_ms`
 
 Required configuration:
@@ -72,7 +84,7 @@ Example:
 
 ## Description {#description}
 
-`SELECT` is not particularly useful for reading messages (except for debugging), because each message can be read only once. It is more practical to create real-time threads using materialized views. To do this:
+`SELECT` is not particularly useful for reading messages (except for debugging), because each message can be read only once. It is more practical to create real-time threads using [materialized views](../../../sql-reference/statements/create/view.md). To do this:
 
 1.  Use the engine to create a RabbitMQ consumer and consider it a data stream.
 2.  Create a table with the desired structure.
@@ -86,19 +98,28 @@ There can be no more than one exchange per table. One exchange can be shared bet
 
 Exchange type options:
 
-- `direct` - Routing is based on exact matching of keys. Example table key list: `key1,key2,key3,key4,key5`, message key can eqaul any of them.
+- `direct` - Routing is based on the exact matching of keys. Example table key list: `key1,key2,key3,key4,key5`, message key can equal any of them.
 - `fanout` - Routing to all tables (where exchange name is the same) regardless of the keys.
 - `topic` - Routing is based on patterns with dot-separated keys. Examples: `*.logs`, `records.*.*.2020`, `*.2018,*.2019,*.2020`.
 - `headers` - Routing is based on `key=value` matches with a setting `x-match=all` or `x-match=any`. Example table key list: `x-match=all,format=logs,type=report,year=2020`.
-- `consistent-hash` - Data is evenly distributed between all bound tables (where exchange name is the same). Note that this exchange type must be enabled with RabbitMQ plugin: `rabbitmq-plugins enable rabbitmq_consistent_hash_exchange`.
+- `consistent-hash` - Data is evenly distributed between all bound tables (where the exchange name is the same). Note that this exchange type must be enabled with RabbitMQ plugin: `rabbitmq-plugins enable rabbitmq_consistent_hash_exchange`.
 
-If exchange type is not specified, then default is `fanout` and routing keys for data publishing must be randomized in range `[1, num_consumers]` for every message/batch (or in range `[1, num_consumers * num_queues]` if `rabbitmq_num_queues` is set). This table configuration works quicker then any other, especially when `rabbitmq_num_consumers` and/or `rabbitmq_num_queues` parameters are set.
+Setting `rabbitmq_queue_base` may be used for the following cases:
+- to let different tables share queues, so that multiple consumers can be registered for the same queues, which gives better performance. If the `rabbitmq_num_consumers` and/or `rabbitmq_num_queues` settings are used, the exact match of queues is achieved when these parameters are the same.
+- to be able to restore reading from certain durable queues when not all messages were successfully consumed. To resume consumption from one specific queue, set its name in the `rabbitmq_queue_base` setting and do not specify `rabbitmq_num_consumers` and `rabbitmq_num_queues` (they default to 1); see the sketch after this list. To resume consumption from all queues which were declared for a specific table, just specify the same settings: `rabbitmq_queue_base`, `rabbitmq_num_consumers`, `rabbitmq_num_queues`. By default, queue names will be unique to tables. Note: this makes sense only if messages are sent with delivery mode 2 (marked 'persistent', durable).
+- to reuse queues, as they are declared durable and not auto-deleted.
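As a rough sketch of the resume-consumption case described above (the host, exchange, and queue base name are assumptions for illustration):

``` sql
-- Hypothetical example: fix the queue name so that a re-created table can
-- resume reading from the same durable queue.
CREATE TABLE queue (key UInt64, value UInt64)
ENGINE = RabbitMQ
SETTINGS rabbitmq_host_port = 'localhost:5672',
         rabbitmq_exchange_name = 'exchange1',
         rabbitmq_format = 'JSONEachRow',
         rabbitmq_queue_base = 'durable_base';
-- Dropping and re-creating the table with the same rabbitmq_queue_base
-- (and default rabbitmq_num_consumers/rabbitmq_num_queues) resumes
-- consumption from that queue, provided messages were published as persistent.
```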
-If `rabbitmq_num_consumers` and/or `rabbitmq_num_queues` parameters are specified along with `rabbitmq_exchange_type`, then:
+To improve performance, received messages are grouped into blocks the size of [max\_insert\_block\_size](../../../operations/server-configuration-parameters/settings.md#settings-max_insert_block_size). If the block wasn’t formed within [stream\_flush\_interval\_ms](../../../operations/server-configuration-parameters/settings.md) milliseconds, the data will be flushed to the table regardless of the completeness of the block.
+
+If `rabbitmq_num_consumers` and/or `rabbitmq_num_queues` settings are specified along with `rabbitmq_exchange_type`, then:
 
 - `rabbitmq-consistent-hash-exchange` plugin must be enabled.
 - `message_id` property of the published messages must be specified (unique for each message/batch).
 
+For each published message, metadata is added on insert: a `messageID` and a `republished` flag (true if the message was published more than once); both can be accessed via the message headers.
+
+Do not use the same table for inserts and materialized views.
+
 Example:
 
 ``` sql
@@ -113,10 +134,18 @@ Example:
     rabbitmq_num_consumers = 5;
 
   CREATE TABLE daily (key UInt64, value UInt64)
-    ENGINE = MergeTree();
+    ENGINE = MergeTree() ORDER BY key;
 
   CREATE MATERIALIZED VIEW consumer TO daily
     AS SELECT key, value FROM queue;
 
   SELECT key, value FROM daily ORDER BY key;
 ```
+
+## Virtual Columns {#virtual-columns}
+
+- `_exchange_name` - RabbitMQ exchange name.
+- `_channel_id` - ChannelID of the channel on which the consumer that received the message was declared.
+- `_delivery_tag` - DeliveryTag of the received message. Scoped per channel.
+- `_redelivered` - `redelivered` flag of the message.
+- `_message_id` - MessageID of the received message; non-empty if it was set when the message was published.
diff --git a/docs/en/engines/table-engines/mergetree-family/replacingmergetree.md b/docs/en/engines/table-engines/mergetree-family/replacingmergetree.md
index 109ae6c4601..684e7e28112 100644
--- a/docs/en/engines/table-engines/mergetree-family/replacingmergetree.md
+++ b/docs/en/engines/table-engines/mergetree-family/replacingmergetree.md
@@ -31,7 +31,7 @@ For a description of request parameters, see [statement description](../../../sq
 
 **ReplacingMergeTree Parameters**
 
-- `ver` — column with version. Type `UInt*`, `Date`, `DateTime` or `DateTime64`. Optional parameter.
+- `ver` — column with version. Type `UInt*`, `Date` or `DateTime`. Optional parameter.
 
 When merging, `ReplacingMergeTree` from all the rows with the same sorting key leaves only one:
diff --git a/docs/en/engines/table-engines/mergetree-family/versionedcollapsingmergetree.md b/docs/en/engines/table-engines/mergetree-family/versionedcollapsingmergetree.md
index a010a395c64..b23139b402b 100644
--- a/docs/en/engines/table-engines/mergetree-family/versionedcollapsingmergetree.md
+++ b/docs/en/engines/table-engines/mergetree-family/versionedcollapsingmergetree.md
@@ -121,7 +121,7 @@ To find out why we need two rows for each change, see [Algorithm](#table_engines
 
 **Notes on Usage**
 
-1.  The program that writes the data should remember the state of an object in order to cancel it. The “cancel” string should be a copy of the “state” string with the opposite `Sign`. This increases the initial size of storage but allows to write the data quickly.
+1.  The program that writes the data should remember the state of an object to be able to cancel it.
The “cancel” string should contain copies of the primary key fields, the version of the “state” string, and the opposite `Sign`. It increases the initial size of storage but allows writing the data quickly.
 2.  Long growing arrays in columns reduce the efficiency of the engine due to the load for writing. The more straightforward the data, the better the efficiency.
 3.  `SELECT` results depend strongly on the consistency of the history of object changes. Be accurate when preparing data for inserting. You can get unpredictable results with inconsistent data, such as negative values for non-negative metrics like session depth.
diff --git a/docs/en/interfaces/http.md b/docs/en/interfaces/http.md
index a5e7ef22558..35c79b5ee02 100644
--- a/docs/en/interfaces/http.md
+++ b/docs/en/interfaces/http.md
@@ -36,7 +36,7 @@ Examples:
 $ curl 'http://localhost:8123/?query=SELECT%201'
 1
 
-$ wget -O- -q 'http://localhost:8123/?query=SELECT 1'
+$ wget -nv -O- 'http://localhost:8123/?query=SELECT 1'
 1
 
 $ echo -ne 'GET /?query=SELECT%201 HTTP/1.0\r\n\r\n' | nc localhost 8123
diff --git a/docs/en/introduction/adopters.md b/docs/en/introduction/adopters.md
index 61ed944a811..596fe20be90 100644
--- a/docs/en/introduction/adopters.md
+++ b/docs/en/introduction/adopters.md
@@ -72,6 +72,7 @@ toc_title: Adopters
 | QINGCLOUD | Cloud services | Main product | — | — | [Slides in Chinese, October 2018](https://github.com/ClickHouse/clickhouse-presentations/blob/master/meetup19/4.%20Cloud%20%2B%20TSDB%20for%20ClickHouse%20张健%20QingCloud.pdf) |
 | Qrator | DDoS protection | Main product | — | — | [Blog Post, March 2019](https://blog.qrator.net/en/clickhouse-ddos-mitigation_37/) |
 | Rambler | Internet services | Analytics | — | — | [Talk in Russian, April 2018](https://medium.com/@ramblertop/разработка-api-clickhouse-для-рамблер-топ-100-f4c7e56f3141) |
+| Retell | Speech synthesis | Analytics | — | — | [Blog Article, August 2020](https://vc.ru/services/153732-kak-sozdat-audiostati-na-vashem-sayte-i-zachem-eto-nuzhno) |
 | Rspamd | Antispam | Analytics | — | — | [Official Website](https://rspamd.com/doc/modules/clickhouse.html) |
 | S7 Airlines | Airlines | Metrics, Logging | — | — | [Talk in Russian, March 2019](https://www.youtube.com/watch?v=nwG68klRpPg&t=15s) |
 | scireum GmbH | e-Commerce | Main product | — | — | [Talk in German, February 2020](https://www.youtube.com/watch?v=7QWAn5RbyR4) |
diff --git a/docs/en/operations/access-rights.md b/docs/en/operations/access-rights.md
index 9833d2a06c2..0ab5b9aa6ff 100644
--- a/docs/en/operations/access-rights.md
+++ b/docs/en/operations/access-rights.md
@@ -62,6 +62,7 @@ Management queries:
 - [ALTER USER](../sql-reference/statements/alter/user.md#alter-user-statement)
 - [DROP USER](../sql-reference/statements/drop.md)
 - [SHOW CREATE USER](../sql-reference/statements/show.md#show-create-user-statement)
+- [SHOW USERS](../sql-reference/statements/show.md#show-users-statement)
 
 ### Settings Applying {#access-control-settings-applying}
@@ -90,6 +91,7 @@ Management queries:
 - [SET ROLE](../sql-reference/statements/set-role.md)
 - [SET DEFAULT ROLE](../sql-reference/statements/set-role.md#set-default-role-statement)
 - [SHOW CREATE ROLE](../sql-reference/statements/show.md#show-create-role-statement)
+- [SHOW ROLES](../sql-reference/statements/show.md#show-roles-statement)
 
 Privileges can be granted to a role by the [GRANT](../sql-reference/statements/grant.md) query. To revoke privileges from a role ClickHouse provides the [REVOKE](../sql-reference/statements/revoke.md) query.
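A minimal sketch of this grant/revoke flow (the role, user, and database names are made up for illustration):

``` sql
CREATE ROLE accountant;
GRANT SELECT ON accounting.* TO accountant;    -- grant a privilege to the role
GRANT accountant TO mira;                      -- assign the role to a user
REVOKE SELECT ON accounting.* FROM accountant; -- take the privilege back
```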
@@ -103,6 +105,7 @@ Management queries:
 
 - [ALTER ROW POLICY](../sql-reference/statements/alter/row-policy.md#alter-row-policy-statement)
 - [DROP ROW POLICY](../sql-reference/statements/drop.md#drop-row-policy-statement)
 - [SHOW CREATE ROW POLICY](../sql-reference/statements/show.md#show-create-row-policy-statement)
+- [SHOW POLICIES](../sql-reference/statements/show.md#show-policies-statement)
 
 ## Settings Profile {#settings-profiles-management}
@@ -114,6 +117,7 @@ Management queries:
 - [ALTER SETTINGS PROFILE](../sql-reference/statements/alter/settings-profile.md#alter-settings-profile-statement)
 - [DROP SETTINGS PROFILE](../sql-reference/statements/drop.md#drop-settings-profile-statement)
 - [SHOW CREATE SETTINGS PROFILE](../sql-reference/statements/show.md#show-create-settings-profile-statement)
+- [SHOW PROFILES](../sql-reference/statements/show.md#show-profiles-statement)
 
 ## Quota {#quotas-management}
@@ -127,6 +131,8 @@ Management queries:
 - [ALTER QUOTA](../sql-reference/statements/alter/quota.md#alter-quota-statement)
 - [DROP QUOTA](../sql-reference/statements/drop.md#drop-quota-statement)
 - [SHOW CREATE QUOTA](../sql-reference/statements/show.md#show-create-quota-statement)
+- [SHOW QUOTA](../sql-reference/statements/show.md#show-quota-statement)
+- [SHOW QUOTAS](../sql-reference/statements/show.md#show-quotas-statement)
 
 ## Enabling SQL-driven Access Control and Account Management {#enabling-access-control}
diff --git a/docs/en/operations/settings/settings.md b/docs/en/operations/settings/settings.md
index d4edc22a89b..592904f93b2 100644
--- a/docs/en/operations/settings/settings.md
+++ b/docs/en/operations/settings/settings.md
@@ -1290,6 +1290,47 @@ Possible values:
 
 Default value: 0.
 
+## distributed\_group\_by\_no\_merge {#distributed-group-by-no-merge}
+
+Do not merge aggregation states from different servers for distributed query processing. You can use this when it is certain that there are different keys on different shards.
+
+Possible values:
+
+- 0 — Disabled (final query processing is done on the initiator node).
+- 1 — Do not merge aggregation states from different servers for distributed query processing (the query is completely processed on the shard, the initiator only proxies the data).
+- 2 — Same as 1, but applies `ORDER BY` and `LIMIT` on the initiator (can be used for queries with `ORDER BY` and/or `LIMIT`).
+
+**Example**
+
+```sql
+SELECT *
+FROM remote('127.0.0.{2,3}', system.one)
+GROUP BY dummy
+LIMIT 1
+SETTINGS distributed_group_by_no_merge = 1
+FORMAT PrettyCompactMonoBlock
+
+┌─dummy─┐
+│     0 │
+│     0 │
+└───────┘
+```
+
+```sql
+SELECT *
+FROM remote('127.0.0.{2,3}', system.one)
+GROUP BY dummy
+LIMIT 1
+SETTINGS distributed_group_by_no_merge = 2
+FORMAT PrettyCompactMonoBlock
+
+┌─dummy─┐
+│     0 │
+└───────┘
+```
+
+Default value: 0
+
 ## optimize\_skip\_unused\_shards {#optimize-skip-unused-shards}
 
 Enables or disables skipping of unused shards for [SELECT](../../sql-reference/statements/select/index.md) queries that have sharding key condition in `WHERE/PREWHERE` (assuming that the data is distributed by sharding key, otherwise does nothing).
@@ -1337,6 +1378,40 @@ Possible values:
 
 Default value: 0
 
+## optimize\_distributed\_group\_by\_sharding\_key {#optimize-distributed-group-by-sharding-key}
+
+Optimizes `GROUP BY sharding_key` queries by avoiding costly aggregation on the initiator server (which reduces memory usage for the query on the initiator server).
+
+The following types of queries are supported (and all combinations of them):
+
+- `SELECT DISTINCT [..., ]sharding_key[, ...] FROM dist`
+- `SELECT ... FROM dist GROUP BY sharding_key[, ...]`
+- `SELECT ... FROM dist GROUP BY sharding_key[, ...] ORDER BY x`
+- `SELECT ... FROM dist GROUP BY sharding_key[, ...] LIMIT 1`
+- `SELECT ... FROM dist GROUP BY sharding_key[, ...] LIMIT 1 BY x`
+
+The following types of queries are not supported (support for some of them may be added later):
+
+- `SELECT ... GROUP BY sharding_key[, ...] WITH TOTALS`
+- `SELECT ... GROUP BY sharding_key[, ...] WITH ROLLUP`
+- `SELECT ... GROUP BY sharding_key[, ...] WITH CUBE`
+- `SELECT ... GROUP BY sharding_key[, ...] SETTINGS extremes=1`
+
+Possible values:
+
+- 0 — Disabled.
+- 1 — Enabled.
+
+Default value: 0
+
+See also:
+
+- [distributed\_group\_by\_no\_merge](#distributed-group-by-no-merge)
+- [optimize\_skip\_unused\_shards](#optimize-skip-unused-shards)
+
+!!! note "Note"
+    Right now it requires `optimize_skip_unused_shards` (the reason behind this is that one day it may be enabled by default, and it will work correctly only if data was inserted via Distributed table, i.e. data is distributed according to sharding_key).
+
 ## optimize\_throw\_if\_noop {#setting-optimize_throw_if_noop}
 
 Enables or disables throwing an exception if an [OPTIMIZE](../../sql-reference/statements/misc.md#misc_operations-optimize) query didn’t perform a merge.
@@ -1894,10 +1969,10 @@ Locking timeout is used to protect from deadlocks while executing read/write ope
 
 Possible values:
 
-- Positive integer.
+- Positive integer (in seconds).
 - 0 — No locking timeout.
 
-Default value: `120`.
+Default value: `120` seconds.
 
 ## output_format_pretty_max_value_width {#output_format_pretty_max_value_width}
diff --git a/docs/en/operations/system-tables/grants.md b/docs/en/operations/system-tables/grants.md
new file mode 100644
index 00000000000..fb2a91ab30a
--- /dev/null
+++ b/docs/en/operations/system-tables/grants.md
@@ -0,0 +1,24 @@
+# system.grants {#system_tables-grants}
+
+Privileges granted to ClickHouse user accounts.
+
+Columns:
+- `user_name` ([Nullable](../../sql-reference/data-types/nullable.md)([String](../../sql-reference/data-types/string.md))) — User name.
+
+- `role_name` ([Nullable](../../sql-reference/data-types/nullable.md)([String](../../sql-reference/data-types/string.md))) — Role assigned to user account.
+
+- `access_type` ([Enum8](../../sql-reference/data-types/enum.md)) — Access parameters for ClickHouse user account.
+
+- `database` ([Nullable](../../sql-reference/data-types/nullable.md)([String](../../sql-reference/data-types/string.md))) — Name of a database.
+
+- `table` ([Nullable](../../sql-reference/data-types/nullable.md)([String](../../sql-reference/data-types/string.md))) — Name of a table.
+
+- `column` ([Nullable](../../sql-reference/data-types/nullable.md)([String](../../sql-reference/data-types/string.md))) — Name of a column to which access is granted.
+
+- `is_partial_revoke` ([UInt8](../../sql-reference/data-types/int-uint.md#uint-ranges)) — Logical value. It shows whether some privileges have been revoked. Possible values:
+- `0` — The row describes a grant.
+- `1` — The row describes a partial revoke.
+
+- `grant_option` ([UInt8](../../sql-reference/data-types/int-uint.md#uint-ranges)) — Permission is granted `WITH GRANT OPTION`, see [GRANT](../../sql-reference/statements/grant.md#grant-privigele-syntax).
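As a quick, hedged sketch of how the new table can be inspected (the `default` user is just an example):

``` sql
SELECT user_name, access_type, database, table, grant_option
FROM system.grants
WHERE user_name = 'default';
```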
+
+[Original article](https://clickhouse.tech/docs/en/operations/system_tables/grants)
diff --git a/docs/en/operations/system-tables/query_log.md b/docs/en/operations/system-tables/query_log.md
index 26f4c53bf0a..72927b5a7e9 100644
--- a/docs/en/operations/system-tables/query_log.md
+++ b/docs/en/operations/system-tables/query_log.md
@@ -34,6 +34,7 @@ Columns:
 - `event_date` ([Date](../../sql-reference/data-types/date.md)) — Query starting date.
 - `event_time` ([DateTime](../../sql-reference/data-types/datetime.md)) — Query starting time.
 - `query_start_time` ([DateTime](../../sql-reference/data-types/datetime.md)) — Start time of query execution.
+- `query_start_time_microseconds` ([DateTime64](../../sql-reference/data-types/datetime64.md)) — Start time of query execution with microsecond precision.
 - `query_duration_ms` ([UInt64](../../sql-reference/data-types/int-uint.md#uint-ranges)) — Duration of query execution in milliseconds.
 - `read_rows` ([UInt64](../../sql-reference/data-types/int-uint.md#uint-ranges)) — Total number of rows read from all tables and table functions that participated in the query. It includes usual subqueries, subqueries for `IN` and `JOIN`. For distributed queries `read_rows` includes the total number of rows read at all replicas. Each replica sends its `read_rows` value, and the server-initiator of the query summarizes all received and local values. Cache volumes don’t affect this value.
 - `read_bytes` ([UInt64](../../sql-reference/data-types/int-uint.md#uint-ranges)) — Total number of bytes read from all tables and table functions that participated in the query. It includes usual subqueries, subqueries for `IN` and `JOIN`. For distributed queries `read_bytes` includes the total number of bytes read at all replicas. Each replica sends its `read_bytes` value, and the server-initiator of the query summarizes all received and local values. Cache volumes don’t affect this value.
diff --git a/docs/en/operations/system-tables/query_thread_log.md b/docs/en/operations/system-tables/query_thread_log.md
index e42f5532e67..3dcd05c4cc3 100644
--- a/docs/en/operations/system-tables/query_thread_log.md
+++ b/docs/en/operations/system-tables/query_thread_log.md
@@ -16,6 +16,7 @@ Columns:
 - `event_date` ([Date](../../sql-reference/data-types/date.md)) — The date when the thread has finished execution of the query.
 - `event_time` ([DateTime](../../sql-reference/data-types/datetime.md)) — The date and time when the thread has finished execution of the query.
 - `query_start_time` ([DateTime](../../sql-reference/data-types/datetime.md)) — Start time of query execution.
+- `query_start_time_microseconds` ([DateTime64](../../sql-reference/data-types/datetime64.md)) — Start time of query execution with microsecond precision.
 - `query_duration_ms` ([UInt64](../../sql-reference/data-types/int-uint.md#uint-ranges)) — Duration of query execution.
 - `read_rows` ([UInt64](../../sql-reference/data-types/int-uint.md#uint-ranges)) — Number of read rows.
 - `read_bytes` ([UInt64](../../sql-reference/data-types/int-uint.md#uint-ranges)) — Number of read bytes.
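A small sketch of how the new microsecond-precision column can be used in either table (the filter on `type` is an assumption, keeping only finished queries):

``` sql
SELECT query_start_time_microseconds, query_duration_ms, query
FROM system.query_log
WHERE event_date = today() AND type = 'QueryFinish'
ORDER BY query_start_time_microseconds DESC
LIMIT 5;
```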
diff --git a/docs/en/operations/system-tables/quota_usage.md b/docs/en/operations/system-tables/quota_usage.md index b865939090d..0eb59fd6453 100644 --- a/docs/en/operations/system-tables/quota_usage.md +++ b/docs/en/operations/system-tables/quota_usage.md @@ -23,4 +23,8 @@ Columns: - `execution_time` ([Nullable](../../sql-reference/data-types/nullable.md)([Float64](../../sql-reference/data-types/float.md))) — The total query execution time, in seconds (wall time). - `max_execution_time` ([Nullable](../../sql-reference/data-types/nullable.md)([Float64](../../sql-reference/data-types/float.md))) — Maximum of query execution time. +## See Also {#see-also} + +- [SHOW QUOTA](../../sql-reference/statements/show.md#show-quota-statement) + [Original article](https://clickhouse.tech/docs/en/operations/system_tables/quota_usage) diff --git a/docs/en/operations/system-tables/quotas.md b/docs/en/operations/system-tables/quotas.md index dbbaa0655e9..f4f52a4a131 100644 --- a/docs/en/operations/system-tables/quotas.md +++ b/docs/en/operations/system-tables/quotas.md @@ -20,5 +20,9 @@ Columns: - `apply_to_list` ([Array](../../sql-reference/data-types/array.md)([String](../../sql-reference/data-types/string.md))) — List of user names/[roles](../../operations/access-rights.md#role-management) that the quota should be applied to. - `apply_to_except` ([Array](../../sql-reference/data-types/array.md)([String](../../sql-reference/data-types/string.md))) — List of user names/roles that the quota should not apply to. +## See Also {#see-also} + +- [SHOW QUOTAS](../../sql-reference/statements/show.md#show-quotas-statement) + [Original article](https://clickhouse.tech/docs/en/operations/system_tables/quotas) diff --git a/docs/en/operations/system-tables/quotas_usage.md b/docs/en/operations/system-tables/quotas_usage.md index f88479ce74a..ed6be820b26 100644 --- a/docs/en/operations/system-tables/quotas_usage.md +++ b/docs/en/operations/system-tables/quotas_usage.md @@ -24,4 +24,8 @@ Columns: - `execution_time` ([Nullable](../../sql-reference/data-types/nullable.md)([Float64](../../sql-reference/data-types/float.md))) — The total query execution time, in seconds (wall time). - `max_execution_time` ([Nullable](../../sql-reference/data-types/nullable.md)([Float64](../../sql-reference/data-types/float.md))) — Maximum of query execution time. +## See Also {#see-also} + +- [SHOW QUOTA](../../sql-reference/statements/show.md#show-quota-statement) + [Original article](https://clickhouse.tech/docs/en/operations/system_tables/quotas_usage) diff --git a/docs/en/operations/system-tables/role-grants.md b/docs/en/operations/system-tables/role-grants.md index cdeceebdaeb..5eb18b0dca7 100644 --- a/docs/en/operations/system-tables/role-grants.md +++ b/docs/en/operations/system-tables/role-grants.md @@ -5,11 +5,15 @@ Contains the role grants for users and roles. To add entries to this table, use Columns: - `user_name` ([Nullable](../../sql-reference/data-types/nullable.md)([String](../../sql-reference/data-types/string.md))) — User name. + - `role_name` ([Nullable](../../sql-reference/data-types/nullable.md)([String](../../sql-reference/data-types/string.md))) — Role name. + - `granted_role_name` ([String](../../sql-reference/data-types/string.md)) — Name of role granted to the `role_name` role. To grant one role to another one use `GRANT role1 TO role2`. + - `granted_role_is_default` ([UInt8](../../sql-reference/data-types/int-uint.md#uint-ranges)) — Flag that shows whether `granted_role` is a default role. 
Possible values:
 - 1 — `granted_role` is a default role.
 - 0 — `granted_role` is not a default role.
+
 - `with_admin_option` ([UInt8](../../sql-reference/data-types/int-uint.md#uint-ranges)) — Flag that shows whether `granted_role` is a role with [ADMIN OPTION](../../sql-reference/statements/grant.md#admin-option-privilege) privilege. Possible values:
 - 1 — The role has `ADMIN OPTION` privilege.
 - 0 — The role without `ADMIN OPTION` privilege.
diff --git a/docs/en/operations/system-tables/roles.md b/docs/en/operations/system-tables/roles.md
index 7dc5cdfe3de..4ab5102dfc8 100644
--- a/docs/en/operations/system-tables/roles.md
+++ b/docs/en/operations/system-tables/roles.md
@@ -8,4 +8,8 @@ Columns:
 - `id` ([UUID](../../sql-reference/data-types/uuid.md)) — Role ID.
 - `storage` ([String](../../sql-reference/data-types/string.md)) — Path to the storage of roles. Configured in the `access_control_path` parameter.
 
+## See Also {#see-also}
+
+- [SHOW ROLES](../../sql-reference/statements/show.md#show-roles-statement)
+
 [Original article](https://clickhouse.tech/docs/en/operations/system_tables/roles)
diff --git a/docs/en/operations/system-tables/row_policies.md b/docs/en/operations/system-tables/row_policies.md
new file mode 100644
index 00000000000..97474d1b3ee
--- /dev/null
+++ b/docs/en/operations/system-tables/row_policies.md
@@ -0,0 +1,34 @@
+# system.row_policies {#system_tables-row_policies}
+
+Contains filters for one particular table, as well as a list of roles and/or users which should use this row policy.
+
+Columns:
+- `name` ([String](../../sql-reference/data-types/string.md)) — Name of a row policy.
+
+- `short_name` ([String](../../sql-reference/data-types/string.md)) — Short name of a row policy. Names of row policies are compound, for example: myfilter ON mydb.mytable. Here "myfilter ON mydb.mytable" is the name of the row policy, and "myfilter" is its short name.
+
+- `database` ([String](../../sql-reference/data-types/string.md)) — Database name.
+
+- `table` ([String](../../sql-reference/data-types/string.md)) — Table name.
+
+- `id` ([UUID](../../sql-reference/data-types/uuid.md)) — Row policy ID.
+
+- `storage` ([String](../../sql-reference/data-types/string.md)) — Name of the directory where the row policy is stored.
+
+- `select_filter` ([Nullable](../../sql-reference/data-types/nullable.md)([String](../../sql-reference/data-types/string.md))) — Condition which is used to filter rows.
+
+- `is_restrictive` ([UInt8](../../sql-reference/data-types/int-uint.md#uint-ranges)) — Shows whether the row policy restricts access to rows, see [CREATE ROW POLICY](../../sql-reference/statements/create/row-policy.md#create-row-policy-as). Possible values:
+- `0` — The row policy is defined with the `AS PERMISSIVE` clause.
+- `1` — The row policy is defined with the `AS RESTRICTIVE` clause.
+
+- `apply_to_all` ([UInt8](../../sql-reference/data-types/int-uint.md#uint-ranges)) — Shows whether the row policy is set for all roles and/or users.
+
+- `apply_to_list` ([Array](../../sql-reference/data-types/array.md)([String](../../sql-reference/data-types/string.md))) — List of the roles and/or users to which the row policy is applied.
+
+- `apply_to_except` ([Array](../../sql-reference/data-types/array.md)([String](../../sql-reference/data-types/string.md))) — The row policy is applied to all roles and/or users except the listed ones.
+
+## See Also {#see-also}
+
+- [SHOW POLICIES](../../sql-reference/statements/show.md#show-policies-statement)
+
+[Original article](https://clickhouse.tech/docs/en/operations/system_tables/row_policies)
diff --git a/docs/en/operations/system-tables/settings_profile_elements.md b/docs/en/operations/system-tables/settings_profile_elements.md
new file mode 100644
index 00000000000..d0f2c3c4527
--- /dev/null
+++ b/docs/en/operations/system-tables/settings_profile_elements.md
@@ -0,0 +1,30 @@
+# system.settings_profile_elements {#system_tables-settings_profile_elements}
+
+Describes the content of the settings profile:
+
+- Constraints.
+- Roles and users that the setting applies to.
+- Parent settings profiles.
+
+Columns:
+- `profile_name` ([Nullable](../../sql-reference/data-types/nullable.md)([String](../../sql-reference/data-types/string.md))) — Settings profile name.
+
+- `user_name` ([Nullable](../../sql-reference/data-types/nullable.md)([String](../../sql-reference/data-types/string.md))) — User name.
+
+- `role_name` ([Nullable](../../sql-reference/data-types/nullable.md)([String](../../sql-reference/data-types/string.md))) — Role name.
+
+- `index` ([UInt64](../../sql-reference/data-types/int-uint.md)) — Sequential number of the settings profile element.
+
+- `setting_name` ([Nullable](../../sql-reference/data-types/nullable.md)([String](../../sql-reference/data-types/string.md))) — Setting name.
+
+- `value` ([Nullable](../../sql-reference/data-types/nullable.md)([String](../../sql-reference/data-types/string.md))) — Setting value.
+
+- `min` ([Nullable](../../sql-reference/data-types/nullable.md)([String](../../sql-reference/data-types/string.md))) — The minimum value of the setting. `NULL` if not set.
+
+- `max` ([Nullable](../../sql-reference/data-types/nullable.md)([String](../../sql-reference/data-types/string.md))) — The maximum value of the setting. `NULL` if not set.
+
+- `readonly` ([Nullable](../../sql-reference/data-types/nullable.md)([UInt8](../../sql-reference/data-types/int-uint.md#uint-ranges))) — Shows whether the profile allows only read queries.
+
+- `inherit_profile` ([Nullable](../../sql-reference/data-types/nullable.md)([String](../../sql-reference/data-types/string.md))) — A parent profile for this settings profile. `NULL` if not set. The settings profile will inherit all the settings' values and constraints (`min`, `max`, `readonly`) from its parent profiles.
+
+[Original article](https://clickhouse.tech/docs/en/operations/system_tables/settings_profile_elements)
diff --git a/docs/en/operations/system-tables/settings_profiles.md b/docs/en/operations/system-tables/settings_profiles.md
new file mode 100644
index 00000000000..a06b26b9cb6
--- /dev/null
+++ b/docs/en/operations/system-tables/settings_profiles.md
@@ -0,0 +1,24 @@
+# system.settings_profiles {#system_tables-settings_profiles}
+
+Contains properties of configured settings profiles.
+
+Columns:
+- `name` ([String](../../sql-reference/data-types/string.md)) — Settings profile name.
+
+- `id` ([UUID](../../sql-reference/data-types/uuid.md)) — Settings profile ID.
+
+- `storage` ([String](../../sql-reference/data-types/string.md)) — Path to the storage of settings profiles. Configured in the `access_control_path` parameter.
+
+- `num_elements` ([UInt64](../../sql-reference/data-types/int-uint.md)) — Number of elements for this profile in the `system.settings_profile_elements` table.
+
+- `apply_to_all` ([UInt8](../../sql-reference/data-types/int-uint.md#uint-ranges)) — Shows whether the settings profile is set for all roles and/or users.
+
+- `apply_to_list` ([Array](../../sql-reference/data-types/array.md)([String](../../sql-reference/data-types/string.md))) — List of the roles and/or users to which the settings profile is applied.
+
+- `apply_to_except` ([Array](../../sql-reference/data-types/array.md)([String](../../sql-reference/data-types/string.md))) — The settings profile is applied to all roles and/or users except the listed ones.
+
+## See Also {#see-also}
+
+- [SHOW PROFILES](../../sql-reference/statements/show.md#show-profiles-statement)
+
+[Original article](https://clickhouse.tech/docs/en/operations/system_tables/settings_profiles)
diff --git a/docs/en/operations/system-tables/stack_trace.md b/docs/en/operations/system-tables/stack_trace.md
index b1714a93a20..44b13047cc3 100644
--- a/docs/en/operations/system-tables/stack_trace.md
+++ b/docs/en/operations/system-tables/stack_trace.md
@@ -82,8 +82,8 @@ res: /lib/x86_64-linux-gnu/libc-2.27.so
 
 - [Introspection Functions](../../sql-reference/functions/introspection.md) — Which introspection functions are available and how to use them.
 - [system.trace_log](../system-tables/trace_log.md) — Contains stack traces collected by the sampling query profiler.
-- [arrayMap](../../sql-reference/functions/higher-order-functions.md#higher_order_functions-array-map) — Description and usage example of the `arrayMap` function.
-- [arrayFilter](../../sql-reference/functions/higher-order-functions.md#higher_order_functions-array-filter) — Description and usage example of the `arrayFilter` function.
+- [arrayMap](../../sql-reference/functions/array-functions.md#array-map) — Description and usage example of the `arrayMap` function.
+- [arrayFilter](../../sql-reference/functions/array-functions.md#array-filter) — Description and usage example of the `arrayFilter` function.
 
 [Original article](https://clickhouse.tech/docs/en/operations/system-tables/stack_trace)
diff --git a/docs/en/operations/system-tables/users.md b/docs/en/operations/system-tables/users.md
new file mode 100644
index 00000000000..2227816aff3
--- /dev/null
+++ b/docs/en/operations/system-tables/users.md
@@ -0,0 +1,34 @@
+# system.users {#system_tables-users}
+
+Contains a list of [user accounts](../../operations/access-rights.md#user-account-management) configured on the server.
+
+Columns:
+- `name` ([String](../../sql-reference/data-types/string.md)) — User name.
+
+- `id` ([UUID](../../sql-reference/data-types/uuid.md)) — User ID.
+
+- `storage` ([String](../../sql-reference/data-types/string.md)) — Path to the storage of users. Configured in the `access_control_path` parameter.
+
+- `auth_type` ([Enum8](../../sql-reference/data-types/enum.md)('no_password' = 0, 'plaintext_password' = 1, 'sha256_password' = 2, 'double_sha1_password' = 3)) — Shows the authentication type. There are multiple ways of user identification: with no password, with a plain text password, with a [SHA256](https://ru.wikipedia.org/wiki/SHA-2)-encoded password or with a [double SHA-1](https://ru.wikipedia.org/wiki/SHA-1)-encoded password.
+
+- `auth_params` ([String](../../sql-reference/data-types/string.md)) — Authentication parameters in JSON format, depending on the `auth_type`.
+
+- `host_ip` ([Array](../../sql-reference/data-types/array.md)([String](../../sql-reference/data-types/string.md))) — IP addresses of hosts that are allowed to connect to the ClickHouse server.
+
+- `host_names` ([Array](../../sql-reference/data-types/array.md)([String](../../sql-reference/data-types/string.md))) — Names of hosts that are allowed to connect to the ClickHouse server.
+
+- `host_names_regexp` ([Array](../../sql-reference/data-types/array.md)([String](../../sql-reference/data-types/string.md))) — Regular expression for host names that are allowed to connect to the ClickHouse server.
+
+- `host_names_like` ([Array](../../sql-reference/data-types/array.md)([String](../../sql-reference/data-types/string.md))) — Names of hosts that are allowed to connect to the ClickHouse server, set using the `LIKE` predicate.
+
+- `default_roles_all` ([UInt8](../../sql-reference/data-types/int-uint.md#uint-ranges)) — Shows whether all granted roles are set for the user by default.
+
+- `default_roles_list` ([Array](../../sql-reference/data-types/array.md)([String](../../sql-reference/data-types/string.md))) — List of granted roles provided by default.
+
+- `default_roles_except` ([Array](../../sql-reference/data-types/array.md)([String](../../sql-reference/data-types/string.md))) — All the granted roles set as default except the listed ones.
+
+## See Also {#see-also}
+
+- [SHOW USERS](../../sql-reference/statements/show.md#show-users-statement)
+
+[Original article](https://clickhouse.tech/docs/en/operations/system_tables/users)
diff --git a/docs/en/operations/tips.md b/docs/en/operations/tips.md
index a4378388ef5..56510ee09cc 100644
--- a/docs/en/operations/tips.md
+++ b/docs/en/operations/tips.md
@@ -35,7 +35,7 @@ $ echo 0 | sudo tee /proc/sys/vm/overcommit_memory
 Always disable transparent huge pages. It interferes with memory allocators, which leads to significant performance degradation.
 
 ``` bash
-$ echo 'never' | sudo tee /sys/kernel/mm/transparent_hugepage/enabled
+$ echo 'madvise' | sudo tee /sys/kernel/mm/transparent_hugepage/enabled
 ```
 
 Use `perf top` to watch the time spent in the kernel for memory management.
diff --git a/docs/en/sql-reference/data-types/tuple.md b/docs/en/sql-reference/data-types/tuple.md
index 60adb942925..e396006d957 100644
--- a/docs/en/sql-reference/data-types/tuple.md
+++ b/docs/en/sql-reference/data-types/tuple.md
@@ -7,7 +7,7 @@ toc_title: Tuple(T1, T2, ...)
 
 A tuple of elements, each having an individual [type](../../sql-reference/data-types/index.md#data_types).
 
-Tuples are used for temporary column grouping. Columns can be grouped when an IN expression is used in a query, and for specifying certain formal parameters of lambda functions. For more information, see the sections [IN operators](../../sql-reference/operators/in.md) and [Higher order functions](../../sql-reference/functions/higher-order-functions.md).
+Tuples are used for temporary column grouping. Columns can be grouped when an IN expression is used in a query, and for specifying certain formal parameters of lambda functions. For more information, see the sections [IN operators](../../sql-reference/operators/in.md) and [Higher order functions](../../sql-reference/functions/index.md#higher-order-functions).
 
 Tuples can be the result of a query. In this case, for text formats other than JSON, values are comma-separated in brackets. In JSON formats, tuples are output as arrays (in square brackets).
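For example, the text-format output convention looks like this:

``` sql
SELECT tuple(1, 'a') AS x, toTypeName(x);

┌─x───────┬─toTypeName(tuple(1, 'a'))─┐
│ (1,'a') │ Tuple(UInt8, String)      │
└─────────┴───────────────────────────┘
```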
diff --git a/docs/en/sql-reference/functions/arithmetic-functions.md b/docs/en/sql-reference/functions/arithmetic-functions.md
index 5d89d6d335b..c4b151f59ce 100644
--- a/docs/en/sql-reference/functions/arithmetic-functions.md
+++ b/docs/en/sql-reference/functions/arithmetic-functions.md
@@ -1,5 +1,5 @@
 ---
-toc_priority: 35
+toc_priority: 34
 toc_title: Arithmetic
 ---
diff --git a/docs/en/sql-reference/functions/array-functions.md b/docs/en/sql-reference/functions/array-functions.md
index 91ecc963b1f..82700a109b5 100644
--- a/docs/en/sql-reference/functions/array-functions.md
+++ b/docs/en/sql-reference/functions/array-functions.md
@@ -1,9 +1,9 @@
 ---
-toc_priority: 46
+toc_priority: 35
 toc_title: Arrays
 ---
 
-# Functions for Working with Arrays {#functions-for-working-with-arrays}
+# Array Functions {#functions-for-working-with-arrays}
 
 ## empty {#function-empty}
@@ -241,6 +241,12 @@ SELECT indexOf([1, 3, NULL, NULL], NULL)
 
 Elements set to `NULL` are handled as normal values.
 
+## arrayCount(\[func,\] arr1, …) {#array-count}
+
+Returns the number of elements in the `arr` array for which `func` returns something other than 0. If `func` is not specified, it returns the number of non-zero elements in the array.
+
+Note that the `arrayCount` is a [higher-order function](../../sql-reference/functions/index.md#higher-order-functions). You can pass a lambda function to it as the first argument.
+
 ## countEqual(arr, x) {#countequalarr-x}
 
 Returns the number of elements in the array equal to x. Equivalent to arrayCount (elem -\> elem = x, arr).
@@ -568,7 +574,7 @@ SELECT arraySort([1, nan, 2, NULL, 3, nan, -4, NULL, inf, -inf]);
 - `NaN` values are right before `NULL`.
 - `Inf` values are right before `NaN`.
 
-Note that `arraySort` is a [higher-order function](../../sql-reference/functions/higher-order-functions.md). You can pass a lambda function to it as the first argument. In this case, sorting order is determined by the result of the lambda function applied to the elements of the array.
+Note that `arraySort` is a [higher-order function](../../sql-reference/functions/index.md#higher-order-functions). You can pass a lambda function to it as the first argument. In this case, sorting order is determined by the result of the lambda function applied to the elements of the array.
 
 Let’s consider the following example:
@@ -668,7 +674,7 @@ SELECT arrayReverseSort([1, nan, 2, NULL, 3, nan, -4, NULL, inf, -inf]) as res;
 - `NaN` values are right before `NULL`.
 - `-Inf` values are right before `NaN`.
 
-Note that the `arrayReverseSort` is a [higher-order function](../../sql-reference/functions/higher-order-functions.md). You can pass a lambda function to it as the first argument. Example is shown below.
+Note that the `arrayReverseSort` is a [higher-order function](../../sql-reference/functions/index.md#higher-order-functions). You can pass a lambda function to it as the first argument. An example is shown below.
 
 ``` sql
 SELECT arrayReverseSort((x) -> -x, [1, 2, 3]) as res;
@@ -1120,7 +1126,205 @@ Result:
 
 ``` text
 ┌─arrayAUC([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])─┐
 │                                          0.75 │
-└────────────────────────────────────────---──┘
+└───────────────────────────────────────────────┘
 ```
 
+## arrayMap(func, arr1, …) {#array-map}
+
+Returns an array obtained by applying the `func` function to each element of the `arr` array.
+
+Examples:
+
+``` sql
+SELECT arrayMap(x -> (x + 2), [1, 2, 3]) as res;
+```
+
+``` text
+┌─res─────┐
+│ [3,4,5] │
+└─────────┘
+```
+
+The following example shows how to create a tuple of elements from different arrays:
+
+``` sql
+SELECT arrayMap((x, y) -> (x, y), [1, 2, 3], [4, 5, 6]) AS res
+```
+
+``` text
+┌─res─────────────────┐
+│ [(1,4),(2,5),(3,6)] │
+└─────────────────────┘
+```
+
+Note that `arrayMap` is a [higher-order function](../../sql-reference/functions/index.md#higher-order-functions). You must pass a lambda function to it as the first argument, and it can’t be omitted.
+
+## arrayFilter(func, arr1, …) {#array-filter}
+
+Returns an array containing only the elements in `arr1` for which `func` returns something other than 0.
+
+Examples:
+
+``` sql
+SELECT arrayFilter(x -> x LIKE '%World%', ['Hello', 'abc World']) AS res
+```
+
+``` text
+┌─res───────────┐
+│ ['abc World'] │
+└───────────────┘
+```
+
+``` sql
+SELECT
+    arrayFilter(
+        (i, x) -> x LIKE '%World%',
+        arrayEnumerate(arr),
+        ['Hello', 'abc World'] AS arr)
+    AS res
+```
+
+``` text
+┌─res─┐
+│ [2] │
+└─────┘
+```
+
+Note that `arrayFilter` is a [higher-order function](../../sql-reference/functions/index.md#higher-order-functions). You must pass a lambda function to it as the first argument, and it can’t be omitted.
+
+## arrayFill(func, arr1, …) {#array-fill}
+
+Scans through `arr1` from the first element to the last element and replaces `arr1[i]` by `arr1[i - 1]` if `func` returns 0. The first element of `arr1` will not be replaced.
+
+Examples:
+
+``` sql
+SELECT arrayFill(x -> not isNull(x), [1, null, 3, 11, 12, null, null, 5, 6, 14, null, null]) AS res
+```
+
+``` text
+┌─res──────────────────────────────┐
+│ [1,1,3,11,12,12,12,5,6,14,14,14] │
+└──────────────────────────────────┘
+```
+
+Note that `arrayFill` is a [higher-order function](../../sql-reference/functions/index.md#higher-order-functions). You must pass a lambda function to it as the first argument, and it can’t be omitted.
+
+## arrayReverseFill(func, arr1, …) {#array-reverse-fill}
+
+Scans through `arr1` from the last element to the first element and replaces `arr1[i]` by `arr1[i + 1]` if `func` returns 0. The last element of `arr1` will not be replaced.
+
+Examples:
+
+``` sql
+SELECT arrayReverseFill(x -> not isNull(x), [1, null, 3, 11, 12, null, null, 5, 6, 14, null, null]) AS res
+```
+
+``` text
+┌─res────────────────────────────────┐
+│ [1,3,3,11,12,5,5,5,6,14,NULL,NULL] │
+└────────────────────────────────────┘
+```
+
+Note that `arrayReverseFill` is a [higher-order function](../../sql-reference/functions/index.md#higher-order-functions). You must pass a lambda function to it as the first argument, and it can’t be omitted.
+
+## arraySplit(func, arr1, …) {#array-split}
+
+Splits `arr1` into multiple arrays. When `func` returns something other than 0, the array will be split on the left-hand side of the element. The array will not be split before the first element.
+
+Examples:
+
+``` sql
+SELECT arraySplit((x, y) -> y, [1, 2, 3, 4, 5], [1, 0, 0, 1, 0]) AS res
+```
+
+``` text
+┌─res─────────────┐
+│ [[1,2,3],[4,5]] │
+└─────────────────┘
+```
+
+Note that `arraySplit` is a [higher-order function](../../sql-reference/functions/index.md#higher-order-functions). You must pass a lambda function to it as the first argument, and it can’t be omitted.
+
+## arrayReverseSplit(func, arr1, …) {#array-reverse-split}
+
+Splits `arr1` into multiple arrays.
When `func` returns something other than 0, the array will be split on the right-hand side of the element. The array will not be split after the last element.
+
+Examples:
+
+``` sql
+SELECT arrayReverseSplit((x, y) -> y, [1, 2, 3, 4, 5], [1, 0, 0, 1, 0]) AS res
+```
+
+``` text
+┌─res───────────────┐
+│ [[1],[2,3,4],[5]] │
+└───────────────────┘
+```
+
+Note that `arrayReverseSplit` is a [higher-order function](../../sql-reference/functions/index.md#higher-order-functions). You must pass a lambda function to it as the first argument, and it can’t be omitted.
+
+## arrayExists(\[func,\] arr1, …) {#arrayexistsfunc-arr1}
+
+Returns 1 if there is at least one element in `arr1` for which `func` returns something other than 0. Otherwise, it returns 0.
+
+Note that `arrayExists` is a [higher-order function](../../sql-reference/functions/index.md#higher-order-functions). You can pass a lambda function to it as the first argument.
+
+## arrayAll(\[func,\] arr1, …) {#arrayallfunc-arr1}
+
+Returns 1 if `func` returns something other than 0 for all the elements in `arr1`. Otherwise, it returns 0.
+
+Note that `arrayAll` is a [higher-order function](../../sql-reference/functions/index.md#higher-order-functions). You can pass a lambda function to it as the first argument.
+
+## arrayFirst(func, arr1, …) {#array-first}
+
+Returns the first element in the `arr1` array for which `func` returns something other than 0.
+
+Note that `arrayFirst` is a [higher-order function](../../sql-reference/functions/index.md#higher-order-functions). You must pass a lambda function to it as the first argument, and it can’t be omitted.
+
+## arrayFirstIndex(func, arr1, …) {#array-first-index}
+
+Returns the index of the first element in the `arr1` array for which `func` returns something other than 0.
+
+Note that `arrayFirstIndex` is a [higher-order function](../../sql-reference/functions/index.md#higher-order-functions). You must pass a lambda function to it as the first argument, and it can’t be omitted.
+
+## arraySum(\[func,\] arr1, …) {#array-sum}
+
+Returns the sum of the `func` values. If the function is omitted, it just returns the sum of the array elements.
+
+Note that `arraySum` is a [higher-order function](../../sql-reference/functions/index.md#higher-order-functions). You can pass a lambda function to it as the first argument.
+
+## arrayCumSum(\[func,\] arr1, …) {#arraycumsumfunc-arr1}
+
+Returns an array of partial sums of elements in the source array (a running sum). If the `func` function is specified, then the values of the array elements are converted by this function before summing.
+
+Example:
+
+``` sql
+SELECT arrayCumSum([1, 1, 1, 1]) AS res
+```
+
+``` text
+┌─res──────────┐
+│ [1, 2, 3, 4] │
+└──────────────┘
+```
+
+Note that `arrayCumSum` is a [higher-order function](../../sql-reference/functions/index.md#higher-order-functions). You can pass a lambda function to it as the first argument.
+
+## arrayCumSumNonNegative(arr) {#arraycumsumnonnegativearr}
+
+Same as `arrayCumSum`: returns an array of partial sums of the elements in the source array (a running sum). Unlike `arrayCumSum`, if the running sum drops below zero, it is replaced with zero and the subsequent calculation continues from zero.
For example:
+
+``` sql
+SELECT arrayCumSumNonNegative([1, 1, -4, 1]) AS res
+```
+
+``` text
+┌─res───────┐
+│ [1,2,0,1] │
+└───────────┘
+```
+Note that `arrayCumSumNonNegative` is a [higher-order function](../../sql-reference/functions/index.md#higher-order-functions). You can pass a lambda function to it as the first argument.
+
[Original article](https://clickhouse.tech/docs/en/query_language/functions/array_functions/)
diff --git a/docs/en/sql-reference/functions/higher-order-functions.md b/docs/en/sql-reference/functions/higher-order-functions.md
deleted file mode 100644
index 484bdaa12e6..00000000000
--- a/docs/en/sql-reference/functions/higher-order-functions.md
+++ /dev/null
@@ -1,262 +0,0 @@
----
-toc_priority: 57
-toc_title: Higher-Order
----
-
-# Higher-order Functions {#higher-order-functions}
-
-## `->` operator, lambda(params, expr) function {#operator-lambdaparams-expr-function}
-
-Allows describing a lambda function for passing to a higher-order function. The left side of the arrow has a formal parameter, which is any ID, or multiple formal parameters – any IDs in a tuple. The right side of the arrow has an expression that can use these formal parameters, as well as any table columns.
-
-Examples: `x -> 2 * x, str -> str != Referer.`
-
-Higher-order functions can only accept lambda functions as their functional argument.
-
-A lambda function that accepts multiple arguments can be passed to a higher-order function. In this case, the higher-order function is passed several arrays of identical length that these arguments will correspond to.
-
-For some functions, such as [arrayCount](#higher_order_functions-array-count) or [arraySum](#higher_order_functions-array-count), the first argument (the lambda function) can be omitted. In this case, identical mapping is assumed.
-
-A lambda function can’t be omitted for the following functions:
-
-- [arrayMap](#higher_order_functions-array-map)
-- [arrayFilter](#higher_order_functions-array-filter)
-- [arrayFill](#higher_order_functions-array-fill)
-- [arrayReverseFill](#higher_order_functions-array-reverse-fill)
-- [arraySplit](#higher_order_functions-array-split)
-- [arrayReverseSplit](#higher_order_functions-array-reverse-split)
-- [arrayFirst](#higher_order_functions-array-first)
-- [arrayFirstIndex](#higher_order_functions-array-first-index)
-
-### arrayMap(func, arr1, …) {#higher_order_functions-array-map}
-
-Returns an array obtained from the original application of the `func` function to each element in the `arr` array.
-
-Examples:
-
-``` sql
-SELECT arrayMap(x -> (x + 2), [1, 2, 3]) as res;
-```
-
-``` text
-┌─res─────┐
-│ [3,4,5] │
-└─────────┘
-```
-
-The following example shows how to create a tuple of elements from different arrays:
-
-``` sql
-SELECT arrayMap((x, y) -> (x, y), [1, 2, 3], [4, 5, 6]) AS res
-```
-
-``` text
-┌─res─────────────────┐
-│ [(1,4),(2,5),(3,6)] │
-└─────────────────────┘
-```
-
-Note that the first argument (lambda function) can’t be omitted in the `arrayMap` function.
-
-### arrayFilter(func, arr1, …) {#higher_order_functions-array-filter}
-
-Returns an array containing only the elements in `arr1` for which `func` returns something other than 0.
- -Examples: - -``` sql -SELECT arrayFilter(x -> x LIKE '%World%', ['Hello', 'abc World']) AS res -``` - -``` text -┌─res───────────┐ -│ ['abc World'] │ -└───────────────┘ -``` - -``` sql -SELECT - arrayFilter( - (i, x) -> x LIKE '%World%', - arrayEnumerate(arr), - ['Hello', 'abc World'] AS arr) - AS res -``` - -``` text -┌─res─┐ -│ [2] │ -└─────┘ -``` - -Note that the first argument (lambda function) can’t be omitted in the `arrayFilter` function. - -### arrayFill(func, arr1, …) {#higher_order_functions-array-fill} - -Scan through `arr1` from the first element to the last element and replace `arr1[i]` by `arr1[i - 1]` if `func` returns 0. The first element of `arr1` will not be replaced. - -Examples: - -``` sql -SELECT arrayFill(x -> not isNull(x), [1, null, 3, 11, 12, null, null, 5, 6, 14, null, null]) AS res -``` - -``` text -┌─res──────────────────────────────┐ -│ [1,1,3,11,12,12,12,5,6,14,14,14] │ -└──────────────────────────────────┘ -``` - -Note that the first argument (lambda function) can’t be omitted in the `arrayFill` function. - -### arrayReverseFill(func, arr1, …) {#higher_order_functions-array-reverse-fill} - -Scan through `arr1` from the last element to the first element and replace `arr1[i]` by `arr1[i + 1]` if `func` returns 0. The last element of `arr1` will not be replaced. - -Examples: - -``` sql -SELECT arrayReverseFill(x -> not isNull(x), [1, null, 3, 11, 12, null, null, 5, 6, 14, null, null]) AS res -``` - -``` text -┌─res────────────────────────────────┐ -│ [1,3,3,11,12,5,5,5,6,14,NULL,NULL] │ -└────────────────────────────────────┘ -``` - -Note that the first argument (lambda function) can’t be omitted in the `arrayReverseFill` function. - -### arraySplit(func, arr1, …) {#higher_order_functions-array-split} - -Split `arr1` into multiple arrays. When `func` returns something other than 0, the array will be split on the left hand side of the element. The array will not be split before the first element. - -Examples: - -``` sql -SELECT arraySplit((x, y) -> y, [1, 2, 3, 4, 5], [1, 0, 0, 1, 0]) AS res -``` - -``` text -┌─res─────────────┐ -│ [[1,2,3],[4,5]] │ -└─────────────────┘ -``` - -Note that the first argument (lambda function) can’t be omitted in the `arraySplit` function. - -### arrayReverseSplit(func, arr1, …) {#higher_order_functions-array-reverse-split} - -Split `arr1` into multiple arrays. When `func` returns something other than 0, the array will be split on the right hand side of the element. The array will not be split after the last element. - -Examples: - -``` sql -SELECT arrayReverseSplit((x, y) -> y, [1, 2, 3, 4, 5], [1, 0, 0, 1, 0]) AS res -``` - -``` text -┌─res───────────────┐ -│ [[1],[2,3,4],[5]] │ -└───────────────────┘ -``` - -Note that the first argument (lambda function) can’t be omitted in the `arraySplit` function. - -### arrayCount(\[func,\] arr1, …) {#higher_order_functions-array-count} - -Returns the number of elements in the arr array for which func returns something other than 0. If ‘func’ is not specified, it returns the number of non-zero elements in the array. - -### arrayExists(\[func,\] arr1, …) {#arrayexistsfunc-arr1} - -Returns 1 if there is at least one element in ‘arr’ for which ‘func’ returns something other than 0. Otherwise, it returns 0. - -### arrayAll(\[func,\] arr1, …) {#arrayallfunc-arr1} - -Returns 1 if ‘func’ returns something other than 0 for all the elements in ‘arr’. Otherwise, it returns 0. - -### arraySum(\[func,\] arr1, …) {#higher-order-functions-array-sum} - -Returns the sum of the ‘func’ values. 
If the function is omitted, it just returns the sum of the array elements. - -### arrayFirst(func, arr1, …) {#higher_order_functions-array-first} - -Returns the first element in the ‘arr1’ array for which ‘func’ returns something other than 0. - -Note that the first argument (lambda function) can’t be omitted in the `arrayFirst` function. - -### arrayFirstIndex(func, arr1, …) {#higher_order_functions-array-first-index} - -Returns the index of the first element in the ‘arr1’ array for which ‘func’ returns something other than 0. - -Note that the first argument (lambda function) can’t be omitted in the `arrayFirstIndex` function. - -### arrayCumSum(\[func,\] arr1, …) {#arraycumsumfunc-arr1} - -Returns an array of partial sums of elements in the source array (a running sum). If the `func` function is specified, then the values of the array elements are converted by this function before summing. - -Example: - -``` sql -SELECT arrayCumSum([1, 1, 1, 1]) AS res -``` - -``` text -┌─res──────────┐ -│ [1, 2, 3, 4] │ -└──────────────┘ -``` - -### arrayCumSumNonNegative(arr) {#arraycumsumnonnegativearr} - -Same as `arrayCumSum`, returns an array of partial sums of elements in the source array (a running sum). Different `arrayCumSum`, when then returned value contains a value less than zero, the value is replace with zero and the subsequent calculation is performed with zero parameters. For example: - -``` sql -SELECT arrayCumSumNonNegative([1, 1, -4, 1]) AS res -``` - -``` text -┌─res───────┐ -│ [1,2,0,1] │ -└───────────┘ -``` - -### arraySort(\[func,\] arr1, …) {#arraysortfunc-arr1} - -Returns an array as result of sorting the elements of `arr1` in ascending order. If the `func` function is specified, sorting order is determined by the result of the function `func` applied to the elements of array (arrays) - -The [Schwartzian transform](https://en.wikipedia.org/wiki/Schwartzian_transform) is used to improve sorting efficiency. - -Example: - -``` sql -SELECT arraySort((x, y) -> y, ['hello', 'world'], [2, 1]); -``` - -``` text -┌─res────────────────┐ -│ ['world', 'hello'] │ -└────────────────────┘ -``` - -For more information about the `arraySort` method, see the [Functions for Working With Arrays](../../sql-reference/functions/array-functions.md#array_functions-sort) section. - -### arrayReverseSort(\[func,\] arr1, …) {#arrayreversesortfunc-arr1} - -Returns an array as result of sorting the elements of `arr1` in descending order. If the `func` function is specified, sorting order is determined by the result of the function `func` applied to the elements of array (arrays). - -Example: - -``` sql -SELECT arrayReverseSort((x, y) -> y, ['hello', 'world'], [2, 1]) as res; -``` - -``` text -┌─res───────────────┐ -│ ['hello','world'] │ -└───────────────────┘ -``` - -For more information about the `arrayReverseSort` method, see the [Functions for Working With Arrays](../../sql-reference/functions/array-functions.md#array_functions-reverse-sort) section. - -[Original article](https://clickhouse.tech/docs/en/query_language/functions/higher_order_functions/) diff --git a/docs/en/sql-reference/functions/index.md b/docs/en/sql-reference/functions/index.md index 65514eff673..1a0b9d83b5f 100644 --- a/docs/en/sql-reference/functions/index.md +++ b/docs/en/sql-reference/functions/index.md @@ -44,6 +44,21 @@ Functions have the following behaviors: Functions can’t change the values of their arguments – any changes are returned as the result. 
Thus, the result of calculating separate functions does not depend on the order in which the functions are written in the query.

+## Higher-order functions, `->` operator and lambda(params, expr) function {#higher-order-functions}
+
+Higher-order functions can only accept lambda functions as their functional argument. To pass a lambda function to a higher-order function, use the `->` operator. The left side of the arrow has a formal parameter, which is any ID, or multiple formal parameters – any IDs in a tuple. The right side of the arrow has an expression that can use these formal parameters, as well as any table columns.
+
+Examples:
+
+```
+x -> 2 * x
+str -> str != Referer
+```
+
+A lambda function that accepts multiple arguments can also be passed to a higher-order function. In this case, the higher-order function is passed several arrays of identical length that these arguments will correspond to.
+
+For some functions, the first argument (the lambda function) can be omitted. In this case, identical mapping is assumed.
+
## Error Handling {#error-handling}

Some functions might throw an exception if the data is invalid. In this case, the query is canceled and an error text is returned to the client. For distributed processing, when an exception occurs on one of the servers, the other servers also attempt to abort the query.
diff --git a/docs/en/sql-reference/functions/introspection.md b/docs/en/sql-reference/functions/introspection.md
index 6848f74da1f..1fd39c704c5 100644
--- a/docs/en/sql-reference/functions/introspection.md
+++ b/docs/en/sql-reference/functions/introspection.md
@@ -98,7 +98,7 @@ LIMIT 1
\G
```

-The [arrayMap](../../sql-reference/functions/higher-order-functions.md#higher_order_functions-array-map) function allows to process each individual element of the `trace` array by the `addressToLine` function. The result of this processing you see in the `trace_source_code_lines` column of output.
+The [arrayMap](../../sql-reference/functions/array-functions.md#array-map) function allows you to process each individual element of the `trace` array by the `addressToLine` function. You can see the result of this processing in the `trace_source_code_lines` column of output.

``` text
Row 1:
@@ -184,7 +184,7 @@ LIMIT 1
\G
```

-The [arrayMap](../../sql-reference/functions/higher-order-functions.md#higher_order_functions-array-map) function allows to process each individual element of the `trace` array by the `addressToSymbols` function. The result of this processing you see in the `trace_symbols` column of output.
+The [arrayMap](../../sql-reference/functions/array-functions.md#array-map) function allows you to process each individual element of the `trace` array by the `addressToSymbols` function. You can see the result of this processing in the `trace_symbols` column of output.

``` text
Row 1:
@@ -281,7 +281,7 @@ LIMIT 1
\G
```

-The [arrayMap](../../sql-reference/functions/higher-order-functions.md#higher_order_functions-array-map) function allows to process each individual element of the `trace` array by the `demangle` function. The result of this processing you see in the `trace_functions` column of output.
+The [arrayMap](../../sql-reference/functions/array-functions.md#array-map) function allows you to process each individual element of the `trace` array by the `demangle` function. You can see the result of this processing in the `trace_functions` column of output.
``` text
Row 1:
diff --git a/docs/en/sql-reference/statements/alter/settings-profile.md b/docs/en/sql-reference/statements/alter/settings-profile.md
index 64e15788e80..4b7941a9e86 100644
--- a/docs/en/sql-reference/statements/alter/settings-profile.md
+++ b/docs/en/sql-reference/statements/alter/settings-profile.md
@@ -10,7 +10,7 @@ Changes settings profiles.
Syntax:

``` sql
-ALTER SETTINGS PROFILE [IF EXISTS] name [ON CLUSTER cluster_name]
+ALTER SETTINGS PROFILE [IF EXISTS] TO name [ON CLUSTER cluster_name]
    [RENAME TO new_name]
    [SETTINGS variable [= value] [MIN [=] min_value] [MAX [=] max_value] [READONLY|WRITABLE] | INHERIT 'profile_name'] [,...]
```
diff --git a/docs/en/sql-reference/statements/create/settings-profile.md b/docs/en/sql-reference/statements/create/settings-profile.md
index 6489daebc98..6fcd1d4e840 100644
--- a/docs/en/sql-reference/statements/create/settings-profile.md
+++ b/docs/en/sql-reference/statements/create/settings-profile.md
@@ -10,7 +10,7 @@ Creates a [settings profile](../../../operations/access-rights.md#settings-profi
Syntax:

``` sql
-CREATE SETTINGS PROFILE [IF NOT EXISTS | OR REPLACE] name [ON CLUSTER cluster_name]
+CREATE SETTINGS PROFILE [IF NOT EXISTS | OR REPLACE] TO name [ON CLUSTER cluster_name]
    [SETTINGS variable [= value] [MIN [=] min_value] [MAX [=] max_value] [READONLY|WRITABLE] | INHERIT 'profile_name'] [,...]
```
diff --git a/docs/en/sql-reference/statements/create/table.md b/docs/en/sql-reference/statements/create/table.md
index e3e767482db..dbe1f282b5d 100644
--- a/docs/en/sql-reference/statements/create/table.md
+++ b/docs/en/sql-reference/statements/create/table.md
@@ -136,7 +136,7 @@
ENGINE = ...
```

-If a codec is specified, the default codec doesn’t apply. Codecs can be combined in a pipeline, for example, `CODEC(Delta, ZSTD)`. To select the best codec combination for you project, pass benchmarks similar to described in the Altinity [New Encodings to Improve ClickHouse Efficiency](https://www.altinity.com/blog/2019/7/new-encodings-to-improve-clickhouse) article.
+If a codec is specified, the default codec doesn’t apply. Codecs can be combined in a pipeline, for example, `CODEC(Delta, ZSTD)`. To select the best codec combination for your project, pass benchmarks similar to those described in the Altinity [New Encodings to Improve ClickHouse Efficiency](https://www.altinity.com/blog/2019/7/new-encodings-to-improve-clickhouse) article. Note that codecs can’t be applied to `ALIAS` column types.

!!! warning "Warning"
    You can’t decompress ClickHouse database files with external utilities like `lz4`. Instead, use the special [clickhouse-compressor](https://github.com/ClickHouse/ClickHouse/tree/master/programs/compressor) utility.
diff --git a/docs/en/sql-reference/statements/show.md b/docs/en/sql-reference/statements/show.md
index 3cf6d22dfc8..a18e99d7b11 100644
--- a/docs/en/sql-reference/statements/show.md
+++ b/docs/en/sql-reference/statements/show.md
@@ -148,7 +148,7 @@ SHOW CREATE [ROW] POLICY name ON [database.]table

Shows parameters that were used at a [quota creation](../../sql-reference/statements/create/quota.md).

-### Syntax {#show-create-row-policy-syntax}
+### Syntax {#show-create-quota-syntax}

``` sql
SHOW CREATE QUOTA [name | CURRENT]
@@ -158,10 +158,70 @@

Shows parameters that were used at a [settings profile creation](../../sql-reference/statements/create/settings-profile.md).
-### Syntax {#show-create-row-policy-syntax}
+### Syntax {#show-create-settings-profile-syntax}

``` sql
SHOW CREATE [SETTINGS] PROFILE name
```

+## SHOW USERS {#show-users-statement}
+
+Returns a list of [user account](../../operations/access-rights.md#user-account-management) names. To view user account parameters, see the system table [system.users](../../operations/system-tables/users.md#system_tables-users).
+
+### Syntax {#show-users-syntax}
+
+``` sql
+SHOW USERS
+```
+
+## SHOW ROLES {#show-roles-statement}
+
+Returns a list of [roles](../../operations/access-rights.md#role-management). To view other parameters, see the system tables [system.roles](../../operations/system-tables/roles.md#system_tables-roles) and [system.role_grants](../../operations/system-tables/role-grants.md#system_tables-role_grants).
+
+### Syntax {#show-roles-syntax}
+
+``` sql
+SHOW [CURRENT|ENABLED] ROLES
+```
+
+## SHOW PROFILES {#show-profiles-statement}
+
+Returns a list of [settings profiles](../../operations/access-rights.md#settings-profiles-management). To view settings profile parameters, see the system table [settings_profiles](../../operations/system-tables/settings_profiles.md#system_tables-settings_profiles).
+
+### Syntax {#show-profiles-syntax}
+
+``` sql
+SHOW [SETTINGS] PROFILES
+```
+
+## SHOW POLICIES {#show-policies-statement}
+
+Returns a list of [row policies](../../operations/access-rights.md#row-policy-management) for the specified table. To view row policy parameters, see the system table [system.row_policies](../../operations/system-tables/row_policies.md#system_tables-row_policies).
+
+### Syntax {#show-policies-syntax}
+
+``` sql
+SHOW [ROW] POLICIES [ON [db.]table]
+```
+
+## SHOW QUOTAS {#show-quotas-statement}
+
+Returns a list of [quotas](../../operations/access-rights.md#quotas-management). To view quota parameters, see the system table [system.quotas](../../operations/system-tables/quotas.md#system_tables-quotas).
+
+### Syntax {#show-quotas-syntax}
+
+``` sql
+SHOW QUOTAS
+```
+
+## SHOW QUOTA {#show-quota-statement}
+
+Returns [quota](../../operations/quotas.md) consumption for all users or for the current user. To view other parameters, see the system tables [system.quotas_usage](../../operations/system-tables/quotas_usage.md#system_tables-quotas_usage) and [system.quota_usage](../../operations/system-tables/quota_usage.md#system_tables-quota_usage).
+
+### Syntax {#show-quota-syntax}
+
+``` sql
+SHOW [CURRENT] QUOTA
+```
+
[Original article](https://clickhouse.tech/docs/en/query_language/show/)
diff --git a/docs/es/engines/table-engines/mergetree-family/replacingmergetree.md b/docs/es/engines/table-engines/mergetree-family/replacingmergetree.md
index cb3c6aea34b..a1e95c5b5f4 100644
--- a/docs/es/engines/table-engines/mergetree-family/replacingmergetree.md
+++ b/docs/es/engines/table-engines/mergetree-family/replacingmergetree.md
@@ -33,7 +33,7 @@ Para obtener una descripción de los parámetros de solicitud, consulte [descrip

**ReplacingMergeTree Parámetros**

-- `ver` — column with version. Type `UInt*`, `Date`, `DateTime` o `DateTime64`. Parámetro opcional.
+- `ver` — column with version. Type `UInt*`, `Date` o `DateTime`. Parámetro opcional.
Al fusionar, `ReplacingMergeTree` de todas las filas con la misma clave primaria deja solo una: diff --git a/docs/es/interfaces/http.md b/docs/es/interfaces/http.md index abc5cf63188..ebce0ec7a51 100644 --- a/docs/es/interfaces/http.md +++ b/docs/es/interfaces/http.md @@ -38,7 +38,7 @@ Ejemplos: $ curl 'http://localhost:8123/?query=SELECT%201' 1 -$ wget -O- -q 'http://localhost:8123/?query=SELECT 1' +$ wget -nv -O- 'http://localhost:8123/?query=SELECT 1' 1 $ echo -ne 'GET /?query=SELECT%201 HTTP/1.0\r\n\r\n' | nc localhost 8123 diff --git a/docs/fa/engines/table-engines/mergetree-family/replacingmergetree.md b/docs/fa/engines/table-engines/mergetree-family/replacingmergetree.md index 4ece20461cb..0ace0e05afc 100644 --- a/docs/fa/engines/table-engines/mergetree-family/replacingmergetree.md +++ b/docs/fa/engines/table-engines/mergetree-family/replacingmergetree.md @@ -33,7 +33,7 @@ CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster] **پارامترهای جایگزین** -- `ver` — column with version. Type `UInt*`, `Date`, `DateTime` یا `DateTime64`. پارامتر اختیاری. +- `ver` — column with version. Type `UInt*`, `Date` یا `DateTime`. پارامتر اختیاری. هنگام ادغام, `ReplacingMergeTree` از تمام ردیف ها با همان کلید اصلی تنها یک برگ دارد: diff --git a/docs/fa/interfaces/http.md b/docs/fa/interfaces/http.md index 774980cf8fb..9ce40c17e6f 100644 --- a/docs/fa/interfaces/http.md +++ b/docs/fa/interfaces/http.md @@ -38,7 +38,7 @@ Ok. $ curl 'http://localhost:8123/?query=SELECT%201' 1 -$ wget -O- -q 'http://localhost:8123/?query=SELECT 1' +$ wget -nv -O- 'http://localhost:8123/?query=SELECT 1' 1 $ echo -ne 'GET /?query=SELECT%201 HTTP/1.0\r\n\r\n' | nc localhost 8123 diff --git a/docs/fr/engines/table-engines/mergetree-family/replacingmergetree.md b/docs/fr/engines/table-engines/mergetree-family/replacingmergetree.md index 755249c1a38..ac3c0f3b021 100644 --- a/docs/fr/engines/table-engines/mergetree-family/replacingmergetree.md +++ b/docs/fr/engines/table-engines/mergetree-family/replacingmergetree.md @@ -33,7 +33,7 @@ Pour une description des paramètres de requête, voir [demande de description]( **ReplacingMergeTree Paramètres** -- `ver` — column with version. Type `UInt*`, `Date`, `DateTime` ou `DateTime64`. Paramètre facultatif. +- `ver` — column with version. Type `UInt*`, `Date` ou `DateTime`. Paramètre facultatif. Lors de la fusion, `ReplacingMergeTree` de toutes les lignes avec la même clé primaire ne laisse qu'un: diff --git a/docs/fr/interfaces/http.md b/docs/fr/interfaces/http.md index 2de32747d4a..a414bba2c2f 100644 --- a/docs/fr/interfaces/http.md +++ b/docs/fr/interfaces/http.md @@ -38,7 +38,7 @@ Exemple: $ curl 'http://localhost:8123/?query=SELECT%201' 1 -$ wget -O- -q 'http://localhost:8123/?query=SELECT 1' +$ wget -nv -O- 'http://localhost:8123/?query=SELECT 1' 1 $ echo -ne 'GET /?query=SELECT%201 HTTP/1.0\r\n\r\n' | nc localhost 8123 diff --git a/docs/ja/engines/table-engines/mergetree-family/replacingmergetree.md b/docs/ja/engines/table-engines/mergetree-family/replacingmergetree.md index e2cce893e3a..c3df9559415 100644 --- a/docs/ja/engines/table-engines/mergetree-family/replacingmergetree.md +++ b/docs/ja/engines/table-engines/mergetree-family/replacingmergetree.md @@ -33,7 +33,7 @@ CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster] **ReplacingMergeTreeパラメータ** -- `ver` — column with version. Type `UInt*`, `Date`, `DateTime` または `DateTime64`. 任意パラメータ。 +- `ver` — column with version. Type `UInt*`, `Date` または `DateTime`. 
任意パラメータ。 マージ時, `ReplacingMergeTree` 同じ主キーを持つすべての行から、一つだけを残します: diff --git a/docs/ja/interfaces/http.md b/docs/ja/interfaces/http.md index c76b1ba0827..31f2b54af6d 100644 --- a/docs/ja/interfaces/http.md +++ b/docs/ja/interfaces/http.md @@ -38,7 +38,7 @@ GETメソッドを使用する場合, ‘readonly’ 設定されています。 $ curl 'http://localhost:8123/?query=SELECT%201' 1 -$ wget -O- -q 'http://localhost:8123/?query=SELECT 1' +$ wget -nv -O- 'http://localhost:8123/?query=SELECT 1' 1 $ echo -ne 'GET /?query=SELECT%201 HTTP/1.0\r\n\r\n' | nc localhost 8123 diff --git a/docs/ja/sql-reference/data-types/date.md b/docs/ja/sql-reference/data-types/date.md index ff6e028e885..bcdc8f7224d 100644 --- a/docs/ja/sql-reference/data-types/date.md +++ b/docs/ja/sql-reference/data-types/date.md @@ -1,14 +1,11 @@ --- -machine_translated: true -machine_translated_rev: 72537a2d527c63c07aa5d2361a8829f3895cf2bd toc_priority: 47 toc_title: "\u65E5\u4ED8" --- # 日付 {#date} -デートだ 1970-01-01(符号なし)以降の日数として二バイト単位で格納されます。 Unixエポックの開始直後から、コンパイル段階で定数によって定義される上限しきい値までの値を格納できます(現在は2106年までですが、完全にサポート -最小値は1970-01-01として出力されます。 +日付型です。 1970-01-01 からの日数が2バイトの符号なし整数として格納されます。 UNIX時間の開始直後から、変換段階で定数として定義される上限しきい値までの値を格納できます(現在は2106年までですが、一年分を完全にサポートしているのは2105年までです)。 日付値は、タイムゾーンなしで格納されます。 diff --git a/docs/ru/engines/table-engines/integrations/rabbitmq.md b/docs/ru/engines/table-engines/integrations/rabbitmq.md new file mode 100644 index 00000000000..b6b239f0eee --- /dev/null +++ b/docs/ru/engines/table-engines/integrations/rabbitmq.md @@ -0,0 +1,122 @@ +--- +toc_priority: 6 +toc_title: RabbitMQ +--- + +# RabbitMQ {#rabbitmq-engine} + +Движок работает с [RabbitMQ](https://www.rabbitmq.com). + +`RabbitMQ` позволяет: + +- Публиковать/подписываться на потоки данных. +- Обрабатывать потоки по мере их появления. + +## Создание таблицы {#table_engine-rabbitmq-creating-a-table} + +``` sql +CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster] +( + name1 [type1] [DEFAULT|MATERIALIZED|ALIAS expr1], + name2 [type2] [DEFAULT|MATERIALIZED|ALIAS expr2], + ... +) ENGINE = RabbitMQ SETTINGS + rabbitmq_host_port = 'host:port', + rabbitmq_exchange_name = 'exchange_name', + rabbitmq_format = 'data_format'[,] + [rabbitmq_exchange_type = 'exchange_type',] + [rabbitmq_routing_key_list = 'key1,key2,...',] + [rabbitmq_row_delimiter = 'delimiter_symbol',] + [rabbitmq_num_consumers = N,] + [rabbitmq_num_queues = N,] + [rabbitmq_transactional_channel = 0] +``` + +Обязательные параметры: + +- `rabbitmq_host_port` – адрес сервера (`хост:порт`). Например: `localhost:5672`. +- `rabbitmq_exchange_name` – имя точки обмена в RabbitMQ. +- `rabbitmq_format` – формат сообщения. Используется такое же обозначение, как и в функции `FORMAT` в SQL, например, `JSONEachRow`. Подробнее см. в разделе [Форматы входных и выходных данных](../../../interfaces/formats.md). + +Дополнительные параметры: + +- `rabbitmq_exchange_type` – тип точки обмена в RabbitMQ: `direct`, `fanout`, `topic`, `headers`, `consistent-hash`. По умолчанию: `fanout`. +- `rabbitmq_routing_key_list` – список ключей маршрутизации, через запятую. +- `rabbitmq_row_delimiter` – символ-разделитель, который завершает сообщение. +- `rabbitmq_num_consumers` – количество потребителей на таблицу. По умолчанию: `1`. Укажите больше потребителей, если пропускная способность одного потребителя недостаточна. +- `rabbitmq_num_queues` – количество очередей на потребителя. По умолчанию: `1`. Укажите больше потребителей, если пропускная способность одной очереди на потребителя недостаточна. 
Одна очередь поддерживает до 50 тысяч сообщений одновременно.
+- `rabbitmq_transactional_channel` – оборачивать запросы `INSERT` в транзакции. По умолчанию: `0`.
+
+Требуемая конфигурация:
+
+Конфигурация сервера RabbitMQ добавляется с помощью конфигурационного файла ClickHouse.
+
+``` xml
+ <rabbitmq>
+    <username>root</username>
+    <password>clickhouse</password>
+ </rabbitmq>
+```
+
+Пример:
+
+``` sql
+  CREATE TABLE queue (
+    key UInt64,
+    value UInt64
+  ) ENGINE = RabbitMQ SETTINGS rabbitmq_host_port = 'localhost:5672',
+                            rabbitmq_exchange_name = 'exchange1',
+                            rabbitmq_format = 'JSONEachRow',
+                            rabbitmq_num_consumers = 5;
+```
+
+## Описание {#description}
+
+Запрос `SELECT` не очень полезен для чтения сообщений (за исключением отладки), поскольку каждое сообщение может быть прочитано только один раз. Практичнее создавать потоки реального времени с помощью [материализованных представлений](../../../sql-reference/statements/create/view.md). Для этого:
+
+1. Создайте потребителя RabbitMQ с помощью движка и рассматривайте его как поток данных.
+2. Создайте таблицу с необходимой структурой.
+3. Создайте материализованное представление, которое преобразует данные от движка и помещает их в ранее созданную таблицу.
+
+Когда к движку присоединяется материализованное представление, оно начинает в фоновом режиме собирать данные. Это позволяет непрерывно получать сообщения от RabbitMQ и преобразовывать их в необходимый формат с помощью `SELECT`.
+У одной таблицы RabbitMQ может быть неограниченное количество материализованных представлений.
+
+Данные передаются с помощью параметров `rabbitmq_exchange_type` и `rabbitmq_routing_key_list`.
+Может быть не более одной точки обмена на таблицу. Одна точка обмена может использоваться несколькими таблицами: это позволяет выполнять маршрутизацию по нескольким таблицам одновременно.
+
+Параметры точек обмена:
+
+- `direct` - маршрутизация основана на точном совпадении ключей. Пример списка ключей: `key1,key2,key3,key4,key5`. Ключ сообщения может совпадать с одним из них.
+- `fanout` - маршрутизация по всем таблицам, где имя точки обмена совпадает, независимо от ключей.
+- `topic` - маршрутизация основана на правилах с ключами, разделенными точками. Например: `*.logs`, `records.*.*.2020`, `*.2018,*.2019,*.2020`.
+- `headers` - маршрутизация основана на совпадении `key=value` с настройкой `x-match=all` или `x-match=any`. Пример списка ключей таблицы: `x-match=all,format=logs,type=report,year=2020`.
+- `consistent-hash` - данные равномерно распределяются между всеми связанными таблицами, где имя точки обмена совпадает. Обратите внимание, что этот тип обмена должен быть включен с помощью плагина RabbitMQ: `rabbitmq-plugins enable rabbitmq_consistent_hash_exchange`.
+
+Если тип точки обмена не задан, по умолчанию используется `fanout`. В таком случае ключи маршрутизации для публикации данных должны быть рандомизированы в диапазоне `[1, num_consumers]` за каждое сообщение/пакет (или в диапазоне `[1, num_consumers * num_queues]`, если `rabbitmq_num_queues` задано). Эта конфигурация таблицы работает быстрее, чем любая другая, особенно когда заданы параметры `rabbitmq_num_consumers` и/или `rabbitmq_num_queues`.
+
+Если параметры `rabbitmq_num_consumers` и/или `rabbitmq_num_queues` заданы вместе с параметром `rabbitmq_exchange_type`:
+
+- плагин `rabbitmq-consistent-hash-exchange` должен быть включен.
+- свойство `message_id` должно быть определено (уникальное для каждого сообщения/пакета).
+ +Пример: + +``` sql + CREATE TABLE queue ( + key UInt64, + value UInt64 + ) ENGINE = RabbitMQ SETTINGS rabbitmq_host_port = 'localhost:5672', + rabbitmq_exchange_name = 'exchange1', + rabbitmq_exchange_type = 'headers', + rabbitmq_routing_key_list = 'format=logs,type=report,year=2020', + rabbitmq_format = 'JSONEachRow', + rabbitmq_num_consumers = 5; + + CREATE TABLE daily (key UInt64, value UInt64) + ENGINE = MergeTree(); + + CREATE MATERIALIZED VIEW consumer TO daily + AS SELECT key, value FROM queue; + + SELECT key, value FROM daily ORDER BY key; +``` diff --git a/docs/ru/engines/table-engines/mergetree-family/mergetree.md b/docs/ru/engines/table-engines/mergetree-family/mergetree.md index f04fbae18ba..3c80fe663f1 100644 --- a/docs/ru/engines/table-engines/mergetree-family/mergetree.md +++ b/docs/ru/engines/table-engines/mergetree-family/mergetree.md @@ -1,3 +1,8 @@ +--- +toc_priority: 30 +toc_title: MergeTree +--- + # MergeTree {#table_engines-mergetree} Движок `MergeTree`, а также другие движки этого семейства (`*MergeTree`) — это наиболее функциональные движки таблиц ClickHouse. @@ -28,8 +33,8 @@ CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster] INDEX index_name1 expr1 TYPE type1(...) GRANULARITY value1, INDEX index_name2 expr2 TYPE type2(...) GRANULARITY value2 ) ENGINE = MergeTree() +ORDER BY expr [PARTITION BY expr] -[ORDER BY expr] [PRIMARY KEY expr] [SAMPLE BY expr] [TTL expr [DELETE|TO DISK 'xxx'|TO VOLUME 'xxx'], ...] @@ -38,27 +43,42 @@ CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster] Описание параметров смотрите в [описании запроса CREATE](../../../engines/table-engines/mergetree-family/mergetree.md). -!!! note "Note" +!!! note "Примечание" `INDEX` — экспериментальная возможность, смотрите [Индексы пропуска данных](#table_engine-mergetree-data_skipping-indexes). ### Секции запроса {#mergetree-query-clauses} - `ENGINE` — имя и параметры движка. `ENGINE = MergeTree()`. `MergeTree` не имеет параметров. -- `PARTITION BY` — [ключ партиционирования](custom-partitioning-key.md). Для партиционирования по месяцам используйте выражение `toYYYYMM(date_column)`, где `date_column` — столбец с датой типа [Date](../../../engines/table-engines/mergetree-family/mergetree.md). В этом случае имена партиций имеют формат `"YYYYMM"`. +- `ORDER BY` — ключ сортировки. + + Кортеж столбцов или произвольных выражений. Пример: `ORDER BY (CounterID, EventDate)`. -- `ORDER BY` — ключ сортировки. Кортеж столбцов или произвольных выражений. Пример: `ORDER BY (CounterID, EventDate)`. + ClickHouse использует ключ сортировки в качестве первичного ключа, если первичный ключ не задан в секции `PRIMARY KEY`. -- `PRIMARY KEY` — первичный ключ, если он [отличается от ключа сортировки](#pervichnyi-kliuch-otlichnyi-ot-kliucha-sortirovki). По умолчанию первичный ключ совпадает с ключом сортировки (который задаётся секцией `ORDER BY`.) Поэтому в большинстве случаев секцию `PRIMARY KEY` отдельно указывать не нужно. + Чтобы отключить сортировку, используйте синтаксис `ORDER BY tuple()`. Смотрите [выбор первичного ключа](#vybor-pervichnogo-kliucha). -- `SAMPLE BY` — выражение для сэмплирования. Если используется выражение для сэмплирования, то первичный ключ должен содержать его. Пример: `SAMPLE BY intHash32(UserID) ORDER BY (CounterID, EventDate, intHash32(UserID))`. +- `PARTITION BY` — [ключ партиционирования](custom-partitioning-key.md). Необязательный параметр. 
-- `TTL` — список правил, определяющих длительности хранения строк, а также задающих правила перемещения частей на определённые тома или диски. Выражение должно возвращать столбец `Date` или `DateTime`. Пример: `TTL date + INTERVAL 1 DAY`. - - Тип правила `DELETE|TO DISK 'xxx'|TO VOLUME 'xxx'` указывает действие, которое будет выполнено с частью, удаление строк (прореживание), перемещение (при выполнении условия для всех строк части) на определённый диск (`TO DISK 'xxx'`) или том (`TO VOLUME 'xxx'`). - - Поведение по умолчанию соответствует удалению строк (`DELETE`). В списке правил может быть указано только одно выражение с поведением `DELETE`. - - Дополнительные сведения смотрите в разделе [TTL для столбцов и таблиц](#table_engine-mergetree-ttl) + Для партиционирования по месяцам используйте выражение `toYYYYMM(date_column)`, где `date_column` — столбец с датой типа [Date](../../../engines/table-engines/mergetree-family/mergetree.md). В этом случае имена партиций имеют формат `"YYYYMM"`. -- `SETTINGS` — дополнительные параметры, регулирующие поведение `MergeTree`: +- `PRIMARY KEY` — первичный ключ, если он [отличается от ключа сортировки](#pervichnyi-kliuch-otlichnyi-ot-kliucha-sortirovki). Необязательный параметр. + + По умолчанию первичный ключ совпадает с ключом сортировки (который задаётся секцией `ORDER BY`.) Поэтому в большинстве случаев секцию `PRIMARY KEY` отдельно указывать не нужно. + +- `SAMPLE BY` — выражение для сэмплирования. Необязательный параметр. + + Если используется выражение для сэмплирования, то первичный ключ должен содержать его. Пример: `SAMPLE BY intHash32(UserID) ORDER BY (CounterID, EventDate, intHash32(UserID))`. + +- `TTL` — список правил, определяющих длительности хранения строк, а также задающих правила перемещения частей на определённые тома или диски. Необязательный параметр. + + Выражение должно возвращать столбец `Date` или `DateTime`. Пример: `TTL date + INTERVAL 1 DAY`. + + Тип правила `DELETE|TO DISK 'xxx'|TO VOLUME 'xxx'` указывает действие, которое будет выполнено с частью, удаление строк (прореживание), перемещение (при выполнении условия для всех строк части) на определённый диск (`TO DISK 'xxx'`) или том (`TO VOLUME 'xxx'`). Поведение по умолчанию соответствует удалению строк (`DELETE`). В списке правил может быть указано только одно выражение с поведением `DELETE`. + + Дополнительные сведения смотрите в разделе [TTL для столбцов и таблиц](#table_engine-mergetree-ttl) + +- `SETTINGS` — дополнительные параметры, регулирующие поведение `MergeTree` (необязательные): - `index_granularity` — максимальное количество строк данных между засечками индекса. По умолчанию — 8192. Смотрите [Хранение данных](#mergetree-data-storage). - `index_granularity_bytes` — максимальный размер гранул данных в байтах. По умолчанию — 10Mb. Чтобы ограничить размер гранул только количеством строк, установите значение 0 (не рекомендовано). Смотрите [Хранение данных](#mergetree-data-storage). @@ -180,6 +200,14 @@ ClickHouse не требует уникального первичного кл Длинный первичный ключ будет негативно влиять на производительность вставки и потребление памяти, однако на производительность ClickHouse при запросах `SELECT` лишние столбцы в первичном ключе не влияют. +Вы можете создать таблицу без первичного ключа, используя синтаксис `ORDER BY tuple()`. В этом случае ClickHouse хранит данные в порядке вставки. Если вы хотите сохранить порядок данных при вставке данных с помощью запросов `INSERT ... 
SELECT`, установите [max\_insert\_threads = 1](../../../operations/settings/settings.md#settings-max-insert-threads).
+
+Чтобы выбрать данные в первоначальном порядке, используйте
+[однопоточные](../../../operations/settings/settings.md#settings-max_threads) запросы `SELECT`.
+
### Первичный ключ, отличный от ключа сортировки {#pervichnyi-kliuch-otlichnyi-ot-kliucha-sortirovki}

Существует возможность задать первичный ключ (выражение, значения которого будут записаны в индексный файл для
diff --git a/docs/ru/engines/table-engines/mergetree-family/replacingmergetree.md b/docs/ru/engines/table-engines/mergetree-family/replacingmergetree.md
index fefc3c65b38..4aa1eb556f3 100644
--- a/docs/ru/engines/table-engines/mergetree-family/replacingmergetree.md
+++ b/docs/ru/engines/table-engines/mergetree-family/replacingmergetree.md
@@ -25,7 +25,7 @@ CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]

**Параметры ReplacingMergeTree**

-- `ver` — столбец с версией, тип `UInt*`, `Date`, `DateTime` или `DateTime64`. Необязательный параметр.
+- `ver` — столбец с версией, тип `UInt*`, `Date` или `DateTime`. Необязательный параметр.

При слиянии, из всех строк с одинаковым значением ключа сортировки `ReplacingMergeTree` оставляет только одну:
diff --git a/docs/ru/engines/table-engines/mergetree-family/versionedcollapsingmergetree.md b/docs/ru/engines/table-engines/mergetree-family/versionedcollapsingmergetree.md
index 5dc9589bef5..bf280eb52bc 100644
--- a/docs/ru/engines/table-engines/mergetree-family/versionedcollapsingmergetree.md
+++ b/docs/ru/engines/table-engines/mergetree-family/versionedcollapsingmergetree.md
@@ -116,7 +116,7 @@ CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]

**Примечания по использованию**

-1. Программа, которая записывает данные, должна помнить состояние объекта, чтобы иметь возможность отменить его. Строка отмены состояния должна быть копией предыдущей строки состояния с противоположным значением `Sign`. Это увеличивает начальный размер хранилища, но позволяет быстро записывать данные.
+1. Программа, которая записывает данные, должна помнить состояние объекта, чтобы иметь возможность отменить его. Строка отмены состояния должна содержать копии полей первичного ключа и версии строки состояния, а также противоположное значение `Sign`. Это увеличивает начальный размер хранилища, но позволяет быстро записывать данные.
2. Длинные растущие массивы в столбцах снижают эффективность работы движка за счёт нагрузки на запись. Чем проще данные, тем выше эффективность.
3. `SELECT` результаты сильно зависят от согласованности истории изменений объекта. Будьте точны при подготовке данных для вставки. Вы можете получить непредсказуемые результаты с несогласованными данными, такими как отрицательные значения для неотрицательных метрик, таких как глубина сеанса.
diff --git a/docs/ru/interfaces/http.md b/docs/ru/interfaces/http.md
index afd4d083365..b1cc4c79b25 100644
--- a/docs/ru/interfaces/http.md
+++ b/docs/ru/interfaces/http.md
@@ -31,7 +31,7 @@ Ok.
$ curl 'http://localhost:8123/?query=SELECT%201'
1

-$ wget -O- -q 'http://localhost:8123/?query=SELECT 1'
+$ wget -nv -O- 'http://localhost:8123/?query=SELECT 1'
1

$ echo -ne 'GET /?query=SELECT%201 HTTP/1.0\r\n\r\n' | nc localhost 8123
diff --git a/docs/ru/interfaces/third-party/gui.md b/docs/ru/interfaces/third-party/gui.md
index 8c8550690ef..f7eaa5cc77f 100644
--- a/docs/ru/interfaces/third-party/gui.md
+++ b/docs/ru/interfaces/third-party/gui.md
@@ -89,6 +89,14 @@

[clickhouse-flamegraph](https://github.com/Slach/clickhouse-flamegraph) — специализированный инструмент для визуализации `system.trace_log` в виде [flamegraph](http://www.brendangregg.com/flamegraphs.html).

+### clickhouse-plantuml {#clickhouse-plantuml}
+
+[clickhouse-plantuml](https://pypi.org/project/clickhouse-plantuml/) — скрипт, генерирующий [PlantUML](https://plantuml.com/) диаграммы схем таблиц.
+
+### xeus-clickhouse {#xeus-clickhouse}
+
+[xeus-clickhouse](https://github.com/wangfenjin/xeus-clickhouse) — это ядро Jupyter для ClickHouse, которое поддерживает запрос ClickHouse-данных с использованием SQL в Jupyter.
+
## Коммерческие {#kommercheskie}

### DataGrip {#datagrip}
diff --git a/docs/ru/operations/configuration-files.md b/docs/ru/operations/configuration-files.md
index f9c8f3f57a6..b8ab21c5f85 100644
--- a/docs/ru/operations/configuration-files.md
+++ b/docs/ru/operations/configuration-files.md
@@ -16,9 +16,12 @@

Подстановки могут также выполняться из ZooKeeper. Для этого укажите у элемента атрибут `from_zk = "/path/to/node"`. Значение элемента заменится на содержимое узла `/path/to/node` в ZooKeeper. В ZooKeeper-узел также можно положить целое XML-поддерево, оно будет целиком вставлено в исходный элемент.

-В `config.xml` может быть указан отдельный конфиг с настройками пользователей, профилей и квот. Относительный путь к нему указывается в элементе users\_config. По умолчанию - `users.xml`. Если `users_config` не указан, то настройки пользователей, профилей и квот, указываются непосредственно в `config.xml`.
+В элементе `users_config` файла `config.xml` можно указать относительный путь к конфигурационному файлу с настройками пользователей, профилей и квот. Значение `users_config` по умолчанию — `users.xml`. Если `users_config` не указан, то настройки пользователей, профилей и квот можно задать непосредственно в `config.xml`.

-Для `users_config` могут также существовать переопределения в файлах из директории `users_config.d` (например, `users.d`) и подстановки. Например, можно иметь по отдельному конфигурационному файлу для каждого пользователя:
+Настройки пользователей могут быть разделены на несколько отдельных файлов, аналогично `config.xml` и `config.d/`.
+Имя директории задаётся так же, как имя файла в `users_config`, с подстановкой `.d` вместо `.xml`: по умолчанию используется директория `users.d`, так же как `users.xml` используется для `users_config`.
+Например, можно иметь по отдельному конфигурационному файлу для каждого пользователя:

``` bash
$ cat /etc/clickhouse-server/users.d/alice.xml
diff --git a/docs/ru/operations/settings/settings.md b/docs/ru/operations/settings/settings.md
index b04f8f411c3..55f8f9362e5 100644
--- a/docs/ru/operations/settings/settings.md
+++ b/docs/ru/operations/settings/settings.md
@@ -401,13 +401,91 @@ INSERT INTO test VALUES (lower('Hello')), (lower('world')), (lower('INSERT')), (

Устанавливает тип поведения [JOIN](../../sql-reference/statements/select/join.md). При объединении таблиц могут появиться пустые ячейки.
ClickHouse заполняет их по-разному в зависимости от настроек.

-Возможные значения
+Возможные значения:

- 0 — пустые ячейки заполняются значением по умолчанию соответствующего типа поля.
- 1 — `JOIN` ведёт себя как в стандартном SQL. Тип соответствующего поля преобразуется в [Nullable](../../sql-reference/data-types/nullable.md#data_type-nullable), а пустые ячейки заполняются значениями [NULL](../../sql-reference/syntax.md).

+## partial_merge_join_optimizations {#partial_merge_join_optimizations}
+
+Отключает все оптимизации для запросов [JOIN](../../sql-reference/statements/select/join.md) с алгоритмом частичного слияния (partial MergeJoin).
+
+По умолчанию оптимизации включены, что может привести к неправильным результатам. Если вы видите подозрительные результаты в своих запросах, отключите оптимизацию с помощью этого параметра. В различных версиях сервера ClickHouse оптимизация может отличаться.
+
+Возможные значения:
+
+- 0 — Оптимизация отключена.
+- 1 — Оптимизация включена.
+
+Значение по умолчанию: 1.
+
+## partial_merge_join_rows_in_right_blocks {#partial_merge_join_rows_in_right_blocks}
+
+Устанавливает предельные размеры блоков данных «правого» соединения для запросов [JOIN](../../sql-reference/statements/select/join.md) с алгоритмом частичного слияния (partial MergeJoin).
+
+Сервер ClickHouse:
+
+1. Разделяет данные правого соединения на блоки с заданным числом строк.
+2. Индексирует для каждого блока минимальное и максимальное значение.
+3. Выгружает подготовленные блоки на диск, если это возможно.
+
+Возможные значения:
+
+- Положительное целое число. Рекомендуемый диапазон значений [1000, 100000].
+
+Значение по умолчанию: 65536.
+
+## join_on_disk_max_files_to_merge {#join_on_disk_max_files_to_merge}
+
+Устанавливает количество файлов, разрешенных для параллельной сортировки, при выполнении операций MergeJoin на диске.
+
+Чем больше значение параметра, тем больше оперативной памяти используется и тем меньше используется диск (I/O).
+
+Возможные значения:
+
+- Положительное целое число, больше 2.
+
+Значение по умолчанию: 64.
+
+## temporary_files_codec {#temporary_files_codec}
+
+Устанавливает метод сжатия для временных файлов на диске, используемых при сортировке и объединении.
+
+Возможные значения:
+
+- LZ4 — применять сжатие, используя алгоритм [LZ4](https://ru.wikipedia.org/wiki/LZ4).
+- NONE — не применять сжатие.
+
+Значение по умолчанию: LZ4.
+
+## any_join_distinct_right_table_keys {#any_join_distinct_right_table_keys}
+
+Включает устаревшее поведение сервера ClickHouse при выполнении операций `ANY INNER|LEFT JOIN`.
+
+!!! note "Внимание"
+    Используйте этот параметр только в целях обратной совместимости, если ваши варианты использования требуют устаревшего поведения `JOIN`.
+
+Когда включено устаревшее поведение:
+
+- Результаты операций `t1 ANY LEFT JOIN t2` и `t2 ANY RIGHT JOIN t1` не равны, поскольку ClickHouse использует логику с сопоставлением ключей таблицы «многие к одному слева направо».
+- Результаты операций `ANY INNER JOIN` содержат все строки из левой таблицы, аналогично операции `SEMI LEFT JOIN`.
+
+Когда устаревшее поведение отключено:
+
+- Результаты операций `t1 ANY LEFT JOIN t2` и `t2 ANY RIGHT JOIN t1` равны, потому что ClickHouse использует логику сопоставления ключей «один ко многим» в операциях `ANY RIGHT JOIN`.
+- Результаты операций `ANY INNER JOIN` содержат по одной строке на ключ из левой и правой таблиц.
+
+Возможные значения:
+
+- 0 — Устаревшее поведение отключено.
+- 1 — Устаревшее поведение включено.
+
Значение по умолчанию: 0.

+См.
также:
+
+- [JOIN strictness](../../sql-reference/statements/select/join.md#select-join-strictness)
+
## max\_block\_size {#setting-max_block_size}

Данные в ClickHouse обрабатываются по блокам (наборам кусочков столбцов). Внутренние циклы обработки для одного блока достаточно эффективны, но есть заметные издержки на каждый блок. Настройка `max_block_size` — это рекомендация, какой размер блока (в количестве строк) загружать из таблиц. Размер блока не должен быть слишком маленьким, чтобы затраты на каждый блок были заметны, но не слишком велики, чтобы запрос с LIMIT, который завершается после первого блока, обрабатывался быстро. Цель состоит в том, чтобы не использовалось слишком много оперативки при вынимании большого количества столбцов в несколько потоков; чтобы оставалась хоть какая-нибудь кэш-локальность.
@@ -520,31 +598,6 @@ ClickHouse использует этот параметр при чтении д

Значение по умолчанию: 0.

-## network_compression_method {#network_compression_method}
-
-Задает метод сжатия данных, используемый при обмене данными между серверами и при обмене между сервером и [clickhouse-client](../../interfaces/cli.md).
-
-Возможные значения:
-
-- `LZ4` — устанавливает метод сжатия LZ4.
-- `ZSTD` — устанавливает метод сжатия ZSTD.
-
-Значение по умолчанию: `LZ4`.
-
-См. также:
-
-- [network_zstd_compression_level](#network_zstd_compression_level)
-
-## network_zstd_compression_level {#network_zstd_compression_level}
-
-Регулирует уровень сжатия ZSTD. Используется только тогда, когда [network_compression_method](#network_compression_method) имеет значение `ZSTD`.
-
-Возможные значения:
-
-- Положительное целое число от 1 до 15.
-
-Значение по умолчанию: `1`.
-
## log\_queries {#settings-log-queries}

Установка логирования запроса.

Запросы, переданные в ClickHouse с этой установкой, логируются согласно правилам конфигурационного параметра сервера [query\_log](../server-configuration-parameters/settings.md#server_configuration_parameters-query-log).

Пример:

``` text
log_queries=1
```

+## log\_queries\_min\_type {#settings-log-queries-min-type}
+
+Задаёт минимальный уровень логирования в `query_log`.
+
+Возможные значения:
+- `QUERY_START` (`=1`)
+- `QUERY_FINISH` (`=2`)
+- `EXCEPTION_BEFORE_START` (`=3`)
+- `EXCEPTION_WHILE_PROCESSING` (`=4`)
+
+Значение по умолчанию: `QUERY_START`.
 ## log\_query\_threads {#settings-log-query-threads} Установка логирования информации о потоках выполнения запроса. @@ -571,7 +678,7 @@ log_query_threads=1 ## max\_insert\_block\_size {#settings-max_insert_block_size} -Формировать блоки указанного размера (в количестве строк), при вставке в таблицу. +Формировать блоки указанного размера при вставке в таблицу. Эта настройка действует только в тех случаях, когда сервер сам формирует такие блоки. Например, при INSERT-е через HTTP интерфейс, сервер парсит формат данных, и формирует блоки указанного размера. А при использовании clickhouse-client, клиент сам парсит данные, и настройка max\_insert\_block\_size на сервере не влияет на размер вставляемых блоков. @@ -946,7 +1053,6 @@ SELECT area/period FROM account_orders FORMAT JSON; "type": "Float64" } ], - "data": [ { @@ -959,9 +1065,7 @@ SELECT area/period FROM account_orders FORMAT JSON; "divide(area, period)": null } ], - "rows": 3, - "statistics": { "elapsed": 0.003648093, @@ -982,7 +1086,6 @@ SELECT area/period FROM account_orders FORMAT JSON; "type": "Float64" } ], - "data": [ { @@ -995,9 +1098,7 @@ SELECT area/period FROM account_orders FORMAT JSON; "divide(area, period)": "-inf" } ], - "rows": 3, - "statistics": { "elapsed": 0.000070241, @@ -1007,6 +1108,7 @@ SELECT area/period FROM account_orders FORMAT JSON; } ``` + ## format\_csv\_delimiter {#settings-format_csv_delimiter} Символ, интерпретируемый как разделитель в данных формата CSV. По умолчанию — `,`. @@ -1220,7 +1322,7 @@ ClickHouse генерирует исключение Значение по умолчанию: 0 -## force\_optimize\_skip\_unused\_shards {#force-optimize-skip-unused-shards} +## force\_optimize\_skip\_unused\_shards {#settings-force_optimize_skip_unused_shards} Разрешает или запрещает выполнение запроса, если настройка [optimize_skip_unused_shards](#optimize-skip-unused-shards) включена, а пропуск неиспользуемых шардов невозможен. Если данная настройка включена и пропуск невозможен, ClickHouse генерирует исключение. @@ -1234,19 +1336,30 @@ ## force\_optimize\_skip\_unused\_shards\_nesting {#settings-force_optimize_skip_unused_shards_nesting} -Контролирует настройку [`force_optimize_skip_unused_shards`](#force-optimize-skip-unused-shards) (поэтому все еще требует `optimize_skip_unused_shards`) в зависимости от вложенности распределенного запроса (когда у вас есть `Distributed` таблица которая смотрит на другую `Distributed` таблицу). +Контролирует настройку [`force_optimize_skip_unused_shards`](#settings-force_optimize_skip_unused_shards) (поэтому все еще требует `optimize_skip_unused_shards`) в зависимости от вложенности распределенного запроса (когда у вас есть `Distributed` таблица, которая смотрит на другую `Distributed` таблицу). Возможные значения: -- 0 - Disabled, `force_optimize_skip_unused_shards` works on all levels. -- 1 — Enables `force_optimize_skip_unused_shards` only for the first level. -- 2 — Enables `force_optimize_skip_unused_shards` up to the second level. +- 0 — Выключена. `force_optimize_skip_unused_shards` работает на всех уровнях вложенности. +- 1 — Включает `force_optimize_skip_unused_shards` только для первого уровня вложенности. +- 2 — Включает `force_optimize_skip_unused_shards` для первого и второго уровней вложенности. + +Значение по умолчанию: 0.
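+ +Например, чтобы запросы к `Distributed` таблице завершались исключением, когда пропуск неиспользуемых шардов невозможен, можно включить обе настройки (значения приведены только для иллюстрации): + +``` sql +-- включаем пропуск неиспользуемых шардов и делаем его обязательным +SET optimize_skip_unused_shards = 1; +SET force_optimize_skip_unused_shards = 1; +```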
+ +## force\_optimize\_skip\_unused\_shards\_no\_nested {#settings-force_optimize_skip_unused_shards_no_nested} + +Сбрасывает [`optimize_skip_unused_shards`](#optimize-skip-unused-shards) для вложенных `Distributed` таблиц. + +Возможные значения: + +- 1 — Включена. +- 0 — Выключена. Значение по умолчанию: 0 ## optimize\_throw\_if\_noop {#setting-optimize_throw_if_noop} -Включает или отключает генерирование исключения в в случаях, когда запрос [OPTIMIZE](../../sql-reference/statements/misc.md#misc_operations-optimize) не выполняет мёрж. +Включает или отключает генерирование исключения в случаях, когда запрос [OPTIMIZE](../../sql-reference/statements/misc.md#misc_operations-optimize) не выполняет мёрж. По умолчанию, `OPTIMIZE` завершается успешно и в тех случаях, когда он ничего не сделал. Настройка позволяет отделить подобные случаи и включает генерирование исключения с поясняющим сообщением. @@ -1367,7 +1480,7 @@ Default value: 0. - [Sampling Query Profiler](../optimizing-performance/sampling-query-profiler.md) - System table [trace\_log](../../operations/system-tables/trace_log.md#system_tables-trace_log) -## background_pool_size {#background_pool_size} +## background\_pool\_size {#background_pool_size} Задает количество потоков для выполнения фоновых операций в движках таблиц (например, слияния в таблицах c движком [MergeTree](../../engines/table-engines/mergetree-family/index.md)). Настройка применяется при запуске сервера ClickHouse и не может быть изменена во пользовательском сеансе. Настройка позволяет управлять загрузкой процессора и диска. Чем меньше пулл, тем ниже нагрузка на CPU и диск, при этом фоновые процессы замедляются, что может повлиять на скорость выполнения запроса. @@ -1381,7 +1494,7 @@ Default value: 0. Включает параллельную обработку распределённых запросов `INSERT ... SELECT`. -Если при выполнении запроса `INSERT INTO distributed_table_a SELECT ... FROM distributed_table_b` оказывается, что обе таблицы находятся в одном кластере, то независимо от того [реплицируемые](../../engines/table-engines/mergetree-family/replication.md) они или нет, запрос выполняется локально на каждом шарде. +Если при выполнении запроса `INSERT INTO distributed_table_a SELECT ... FROM distributed_table_b` оказывается, что обе таблицы находятся в одном кластере, то независимо от того [реплицируемые](../../engines/table-engines/mergetree-family/replication.md) они или нет, запрос выполняется локально на каждом шарде. Допустимые значения: @@ -1431,7 +1544,7 @@ Default value: 0. Значение по умолчанию: 0. -**См. также:** +**См. также:** - [Репликация данных](../../engines/table-engines/mergetree-family/replication.md) @@ -1448,7 +1561,7 @@ Possible values: Значение по умолчанию: 0. -**Пример** +**Пример** Рассмотрим таблицу `null_in`: @@ -1499,7 +1612,7 @@ SELECT idx, i FROM null_in WHERE i IN (1, NULL) SETTINGS transform_null_in = 1; └──────┴───────┘ ``` -**См. также** +**См. также** - [Обработка значения NULL в операторе IN](../../sql-reference/operators/in.md#in-null-processing) @@ -1610,8 +1723,8 @@ SELECT idx, i FROM null_in WHERE i IN (1, NULL) SETTINGS transform_null_in = 1; Возможные значения: -- 0 - мутации выполняются асинхронно. -- 1 - запрос ждет завершения всех мутаций на текущем сервере. +- 0 - мутации выполняются асинхронно. +- 1 - запрос ждет завершения всех мутаций на текущем сервере. - 2 - запрос ждет завершения всех мутаций на всех репликах (если они есть). Значение по умолчанию: `0`.
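+ +Например, дождаться применения мутации на текущем сервере можно так (таблица `t` и условие здесь условные): + +``` sql +-- запрос вернёт управление после завершения мутации на текущем сервере +SET mutations_sync = 1; +ALTER TABLE t UPDATE x = 0 WHERE x < 0; +```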
@@ -1697,4 +1810,17 @@ SELECT range(number) FROM system.numbers LIMIT 5 FORMAT PrettyCompactNoEscapes; └───────────────┘ ``` +## lock_acquire_timeout {#lock_acquire_timeout} + +Устанавливает, сколько секунд сервер ожидает возможности выполнить блокировку таблицы. + +Таймаут устанавливается для защиты от взаимоблокировки при выполнении операций чтения или записи. Если время ожидания истекло, а блокировку выполнить не удалось, сервер возвращает исключение с кодом `DEADLOCK_AVOIDED` и сообщением "Locking attempt timed out! Possible deadlock avoided. Client should retry." ("Время ожидания блокировки истекло! Возможная взаимоблокировка предотвращена. Повторите запрос."). + +Возможные значения: + +- Положительное целое число (в секундах). +- 0 — таймаут не устанавливается. + +Значение по умолчанию: `120` секунд. + [Оригинальная статья](https://clickhouse.tech/docs/ru/operations/settings/settings/) diff --git a/docs/ru/operations/system-tables/grants.md b/docs/ru/operations/system-tables/grants.md new file mode 100644 index 00000000000..58d8a9e1e06 --- /dev/null +++ b/docs/ru/operations/system-tables/grants.md @@ -0,0 +1,24 @@ +# system.grants {#system_tables-grants} + +Привилегии пользовательских аккаунтов ClickHouse. + +Столбцы: +- `user_name` ([Nullable](../../sql-reference/data-types/nullable.md)([String](../../sql-reference/data-types/string.md))) — Имя учётной записи. + +- `role_name` ([Nullable](../../sql-reference/data-types/nullable.md)([String](../../sql-reference/data-types/string.md))) — Роль, назначенная учетной записи пользователя. + +- `access_type` ([Enum8](../../sql-reference/data-types/enum.md)) — Параметры доступа для учетной записи пользователя ClickHouse. + +- `database` ([Nullable](../../sql-reference/data-types/nullable.md)([String](../../sql-reference/data-types/string.md))) — Имя базы данных. + +- `table` ([Nullable](../../sql-reference/data-types/nullable.md)([String](../../sql-reference/data-types/string.md))) — Имя таблицы. + +- `column` ([Nullable](../../sql-reference/data-types/nullable.md)([String](../../sql-reference/data-types/string.md))) — Имя столбца, к которому предоставляется доступ. + +- `is_partial_revoke` ([UInt8](../../sql-reference/data-types/int-uint.md#uint-ranges)) — Логическое значение. Показывает, были ли отменены некоторые привилегии. Возможные значения: +- `0` — Строка описывает грант. +- `1` — Строка описывает частичный отзыв. + +- `grant_option` ([UInt8](../../sql-reference/data-types/int-uint.md#uint-ranges)) — Разрешение предоставлено с опцией `WITH GRANT OPTION`, подробнее см. [GRANT](../../sql-reference/statements/grant.md#grant-privigele-syntax).
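+ +**Пример** + +Запрос ниже выводит привилегии отдельного пользователя; имя `default` здесь условное: + +``` sql +-- привилегии пользователя default +SELECT access_type, database, table FROM system.grants WHERE user_name = 'default'; +```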
+ +[Оригинальная статья](https://clickhouse.tech/docs/ru/operations/system_tables/grants) diff --git a/docs/ru/operations/system-tables/index.md b/docs/ru/operations/system-tables/index.md index 95715cd84c4..6fa989d3d0d 100644 --- a/docs/ru/operations/system-tables/index.md +++ b/docs/ru/operations/system-tables/index.md @@ -7,10 +7,38 @@ toc_title: Системные таблицы ## Введение {#system-tables-introduction} -Системные таблицы используются для реализации части функциональности системы, а также предоставляют доступ к информации о работе системы. -Вы не можете удалить системную таблицу (хотя можете сделать DETACH). -Для системных таблиц нет файлов с данными на диске и файлов с метаданными. Сервер создаёт все системные таблицы при старте. -В системные таблицы нельзя записывать данные - можно только читать. -Системные таблицы расположены в базе данных system. +Системные таблицы содержат информацию о: + +- Состоянии сервера, процессов и окружения. +- Внутренних процессах сервера. + +Системные таблицы: + +- Находятся в базе данных `system`. +- Доступны только для чтения данных. +- Не могут быть удалены или изменены, но их можно отсоединить. + +Системные таблицы `metric_log`, `query_log`, `query_thread_log` и `trace_log` хранят данные в файловой системе. Остальные системные таблицы хранят свои данные в оперативной памяти. Сервер ClickHouse создает такие системные таблицы при запуске. + +### Источники системных показателей + +Для сбора системных показателей сервер ClickHouse использует: + +- Возможности `CAP_NET_ADMIN`. +- [procfs](https://ru.wikipedia.org/wiki/Procfs) (только Linux). + +**procfs** + +Если для сервера ClickHouse не включено `CAP_NET_ADMIN`, он пытается обратиться к `ProcfsMetricsProvider`. `ProcfsMetricsProvider` позволяет собирать системные показатели для каждого запроса (для CPU и I/O). + +Если procfs поддерживается и включена в системе, то сервер ClickHouse собирает следующие системные показатели: + +- `OSCPUVirtualTimeMicroseconds` +- `OSCPUWaitMicroseconds` +- `OSIOWaitMicroseconds` +- `OSReadChars` +- `OSWriteChars` +- `OSReadBytes` +- `OSWriteBytes` [Оригинальная статья](https://clickhouse.tech/docs/ru/operations/system-tables/) diff --git a/docs/ru/operations/system-tables/quota_usage.md b/docs/ru/operations/system-tables/quota_usage.md index a6f748ec97f..cea3c4b2daa 100644 --- a/docs/ru/operations/system-tables/quota_usage.md +++ b/docs/ru/operations/system-tables/quota_usage.md @@ -24,4 +24,8 @@ - `execution_time` ([Nullable](../../sql-reference/data-types/nullable.md)([Float64](../../sql-reference/data-types/float.md))) — Общее время выполнения запроса, в секундах. - `max_execution_time` ([Nullable](../../sql-reference/data-types/nullable.md)([Float64](../../sql-reference/data-types/float.md))) — Максимальное время выполнения запроса. +## Смотрите также {#see-also} + +- [SHOW QUOTA](../../sql-reference/statements/show.md#show-quota-statement) + [Оригинальная статья](https://clickhouse.tech/docs/ru/operations/system_tables/quota_usage) diff --git a/docs/ru/operations/system-tables/quotas.md b/docs/ru/operations/system-tables/quotas.md index 7a1c1fd6a80..15bb41a85bf 100644 --- a/docs/ru/operations/system-tables/quotas.md +++ b/docs/ru/operations/system-tables/quotas.md @@ -21,5 +21,9 @@ - `apply_to_list` ([Array](../../sql-reference/data-types/array.md)([String](../../sql-reference/data-types/string.md))) — Список имен пользователей/[ролей](../../operations/access-rights.md#role-management) к которым применяется квота. - `apply_to_except` ([Array](../../sql-reference/data-types/array.md)([String](../../sql-reference/data-types/string.md))) — Список имен пользователей/ролей к которым квота применяться не должна. +## Смотрите также {#see-also} + +- [SHOW QUOTAS](../../sql-reference/statements/show.md#show-quotas-statement) + [Оригинальная статья](https://clickhouse.tech/docs/ru/operations/system_tables/quotas) diff --git a/docs/ru/operations/system-tables/quotas_usage.md b/docs/ru/operations/system-tables/quotas_usage.md index 4a40ae44f8f..9d6d339c434 100644 --- a/docs/ru/operations/system-tables/quotas_usage.md +++ b/docs/ru/operations/system-tables/quotas_usage.md @@ -25,4 +25,8 @@ - `execution_time` ([Nullable](../../sql-reference/data-types/nullable.md)([Float64](../../sql-reference/data-types/float.md))) — Общее время выполнения запроса, в секундах.
- `max_execution_time` ([Nullable](../../sql-reference/data-types/nullable.md)([Float64](../../sql-reference/data-types/float.md))) — Максимальное время выполнения запроса. +## Смотрите также {#see-also} + +- [SHOW QUOTA](../../sql-reference/statements/show.md#show-quota-statement) + [Оригинальная статья](https://clickhouse.tech/docs/ru/operations/system_tables/quotas_usage) diff --git a/docs/ru/operations/system-tables/roles.md b/docs/ru/operations/system-tables/roles.md index 11845a32651..1b548e85be2 100644 --- a/docs/ru/operations/system-tables/roles.md +++ b/docs/ru/operations/system-tables/roles.md @@ -5,7 +5,13 @@ Столбцы: - `name` ([String](../../sql-reference/data-types/string.md)) — Имя роли. + - `id` ([UUID](../../sql-reference/data-types/uuid.md)) — ID роли. + - `storage` ([String](../../sql-reference/data-types/string.md)) — Путь к хранилищу ролей. Настраивается в параметре `access_control_path`. +## Смотрите также {#see-also} + +- [SHOW ROLES](../../sql-reference/statements/show.md#show-roles-statement) + [Оригинальная статья](https://clickhouse.tech/docs/ru/operations/system_tables/roles) diff --git a/docs/ru/operations/system-tables/row_policies.md b/docs/ru/operations/system-tables/row_policies.md new file mode 100644 index 00000000000..7d0a490f01c --- /dev/null +++ b/docs/ru/operations/system-tables/row_policies.md @@ -0,0 +1,34 @@ +# system.row_policies {#system_tables-row_policies} + +Содержит фильтры безопасности уровня строк (политики строк) для каждой таблицы, а также список ролей и/или пользователей, к которым применяются эти политики. + +Столбцы: +- `name` ([String](../../sql-reference/data-types/string.md)) — Имя политики строк. + +- `short_name` ([String](../../sql-reference/data-types/string.md)) — Короткое имя политики строк. Имена политик строк являются составными, например: `myfilter ON mydb.mytable`. Здесь `myfilter ON mydb.mytable` — это имя политики строк, `myfilter` — короткое имя. + +- `database` ([String](../../sql-reference/data-types/string.md)) — Имя базы данных. + +- `table` ([String](../../sql-reference/data-types/string.md)) — Имя таблицы. + +- `id` ([UUID](../../sql-reference/data-types/uuid.md)) — ID политики строк. + +- `storage` ([String](../../sql-reference/data-types/string.md)) — Имя каталога, в котором хранится политика строк. + +- `select_filter` ([Nullable](../../sql-reference/data-types/nullable.md)([String](../../sql-reference/data-types/string.md))) — Условие, которое используется для фильтрации строк. + +- `is_restrictive` ([UInt8](../../sql-reference/data-types/int-uint.md#uint-ranges)) — Показывает, ограничивает ли политика строк доступ к строкам, подробнее см. [CREATE ROW POLICY](../../sql-reference/statements/create/row-policy.md#create-row-policy-as). Значения: +- `0` — Политика строк определяется с помощью условия 'AS PERMISSIVE'. +- `1` — Политика строк определяется с помощью условия 'AS RESTRICTIVE'. + +- `apply_to_all` ([UInt8](../../sql-reference/data-types/int-uint.md#uint-ranges)) — Показывает, что политики строк заданы для всех ролей и/или пользователей. + +- `apply_to_list` ([Array](../../sql-reference/data-types/array.md)([String](../../sql-reference/data-types/string.md))) — Список ролей и/или пользователей, к которым применяется политика строк. + +- `apply_to_except` ([Array](../../sql-reference/data-types/array.md)([String](../../sql-reference/data-types/string.md))) — Политики строк применяются ко всем ролям и/или пользователям, за исключением перечисленных. 
+ +## Смотрите также {#see-also} + +- [SHOW POLICIES](../../sql-reference/statements/show.md#show-policies-statement) + +[Оригинальная статья](https://clickhouse.tech/docs/ru/operations/system_tables/row_policies) diff --git a/docs/ru/operations/system-tables/settings_profile_elements.md b/docs/ru/operations/system-tables/settings_profile_elements.md new file mode 100644 index 00000000000..cd801468e21 --- /dev/null +++ b/docs/ru/operations/system-tables/settings_profile_elements.md @@ -0,0 +1,30 @@ +# system.settings_profile_elements {#system_tables-settings_profile_elements} + +Описывает содержимое профиля настроек: + +- Ограничения. +- Роли и пользователи, к которым применяется настройка. +- Родительские профили настроек. + +Столбцы: +- `profile_name` ([Nullable](../../sql-reference/data-types/nullable.md)([String](../../sql-reference/data-types/string.md))) — Имя профиля настроек. + +- `user_name` ([Nullable](../../sql-reference/data-types/nullable.md)([String](../../sql-reference/data-types/string.md))) — Имя пользователя. + +- `role_name` ([Nullable](../../sql-reference/data-types/nullable.md)([String](../../sql-reference/data-types/string.md))) — Имя роли. + +- `index` ([UInt64](../../sql-reference/data-types/int-uint.md)) — Порядковый номер элемента профиля настроек. + +- `setting_name` ([Nullable](../../sql-reference/data-types/nullable.md)([String](../../sql-reference/data-types/string.md))) — Имя настройки. + +- `value` ([Nullable](../../sql-reference/data-types/nullable.md)([String](../../sql-reference/data-types/string.md))) — Значение настройки. + +- `min` ([Nullable](../../sql-reference/data-types/nullable.md)([String](../../sql-reference/data-types/string.md))) — Минимальное значение настройки. `NULL`, если не задано. + +- `max` ([Nullable](../../sql-reference/data-types/nullable.md)([String](../../sql-reference/data-types/string.md))) — Максимальное значение настройки. `NULL`, если не задано. + +- `readonly` ([Nullable](../../sql-reference/data-types/nullable.md)([UInt8](../../sql-reference/data-types/int-uint.md#uint-ranges))) — Профиль разрешает только запросы на чтение. + +- `inherit_profile` ([Nullable](../../sql-reference/data-types/nullable.md)([String](../../sql-reference/data-types/string.md))) — Родительский профиль для данного профиля настроек. `NULL`, если не задан. Профиль настроек может наследовать все значения и ограничения настроек (`min`, `max`, `readonly`) от своего родительского профиля.
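+ +**Пример** + +Запрос ниже выводит элементы одного профиля настроек; имя профиля `default` здесь условное: + +``` sql +-- элементы профиля настроек default +SELECT index, setting_name, value FROM system.settings_profile_elements WHERE profile_name = 'default'; +```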
+ +[Оригинальная статья](https://clickhouse.tech/docs/ru/operations/system_tables/settings_profile_elements) diff --git a/docs/ru/operations/system-tables/settings_profiles.md b/docs/ru/operations/system-tables/settings_profiles.md new file mode 100644 index 00000000000..e1401553a4a --- /dev/null +++ b/docs/ru/operations/system-tables/settings_profiles.md @@ -0,0 +1,24 @@ +# system.settings_profiles {#system_tables-settings_profiles} + +Содержит свойства сконфигурированных профилей настроек. + +Столбцы: +- `name` ([String](../../sql-reference/data-types/string.md)) — Имя профиля настроек. + +- `id` ([UUID](../../sql-reference/data-types/uuid.md)) — ID профиля настроек. + +- `storage` ([String](../../sql-reference/data-types/string.md)) — Путь к хранилищу профилей настроек. Настраивается в параметре `access_control_path`. + +- `num_elements` ([UInt64](../../sql-reference/data-types/int-uint.md)) — Число элементов для этого профиля в таблице `system.settings_profile_elements`. + +- `apply_to_all` ([UInt8](../../sql-reference/data-types/int-uint.md#uint-ranges)) — Признак, который показывает, что параметры профиля заданы для всех ролей и/или пользователей. + +- `apply_to_list` ([Array](../../sql-reference/data-types/array.md)([String](../../sql-reference/data-types/string.md))) — Список ролей и/или пользователей, к которым применяется профиль настроек. + +- `apply_to_except` ([Array](../../sql-reference/data-types/array.md)([String](../../sql-reference/data-types/string.md))) — Профиль настроек применяется ко всем ролям и/или пользователям, за исключением перечисленных. + +## Смотрите также {#see-also} + +- [SHOW PROFILES](../../sql-reference/statements/show.md#show-profiles-statement) + +[Оригинальная статья](https://clickhouse.tech/docs/ru/operations/system_tables/settings_profiles) diff --git a/docs/ru/operations/system-tables/stack_trace.md b/docs/ru/operations/system-tables/stack_trace.md index 966a07633d8..0689e15c35c 100644 --- a/docs/ru/operations/system-tables/stack_trace.md +++ b/docs/ru/operations/system-tables/stack_trace.md @@ -82,7 +82,7 @@ res: /lib/x86_64-linux-gnu/libc-2.27.so - [Функции интроспекции](../../sql-reference/functions/introspection.md) — Что такое функции интроспекции и как их использовать. - [system.trace_log](../../operations/system-tables/trace_log.md#system_tables-trace_log) — Содержит трассировки стека, собранные профилировщиком выборочных запросов. -- [arrayMap](../../sql-reference/functions/higher-order-functions.md#higher_order_functions-array-map) — Описание и пример использования функции `arrayMap`. -- [arrayFilter](../../sql-reference/functions/higher-order-functions.md#higher_order_functions-array-filter) — Описание и пример использования функции `arrayFilter`. +- [arrayMap](../../sql-reference/functions/array-functions.md#array-map) — Описание и пример использования функции `arrayMap`. +- [arrayFilter](../../sql-reference/functions/array-functions.md#array-filter) — Описание и пример использования функции `arrayFilter`. [Оригинальная статья](https://clickhouse.tech/docs/ru/operations/system_tables/stack_trace) diff --git a/docs/ru/operations/system-tables/users.md b/docs/ru/operations/system-tables/users.md new file mode 100644 index 00000000000..c12b91f445f --- /dev/null +++ b/docs/ru/operations/system-tables/users.md @@ -0,0 +1,34 @@ +# system.users {#system_tables-users} + +Содержит список [аккаунтов пользователей](../../operations/access-rights.md#user-account-management), настроенных на сервере. + +Столбцы: +- `name` ([String](../../sql-reference/data-types/string.md)) — Имя пользователя. + +- `id` ([UUID](../../sql-reference/data-types/uuid.md)) — ID пользователя. + +- `storage` ([String](../../sql-reference/data-types/string.md)) — Путь к хранилищу пользователей. Настраивается в параметре `access_control_path`. + +- `auth_type` ([Enum8](../../sql-reference/data-types/enum.md)('no_password' = 0, 'plaintext_password' = 1, 'sha256_password' = 2, 'double_sha1_password' = 3)) — Показывает тип аутентификации. Существует несколько способов аутентификации пользователя: без пароля, с помощью обычного текстового пароля, с помощью шифрования [SHA256](https://ru.wikipedia.org/wiki/SHA-2) или с помощью шифрования [double SHA-1](https://ru.wikipedia.org/wiki/SHA-1). + +- `auth_params` ([String](../../sql-reference/data-types/string.md)) — Параметры аутентификации в формате JSON, зависят от `auth_type`.
+ +- `host_ip` ([Array](../../sql-reference/data-types/array.md)([String](../../sql-reference/data-types/string.md))) — IP-адреса хостов, которым разрешено подключаться к серверу ClickHouse. + +- `host_names` ([Array](../../sql-reference/data-types/array.md)([String](../../sql-reference/data-types/string.md))) — Имена хостов, которым разрешено подключаться к серверу ClickHouse. + +- `host_names_regexp` ([Array](../../sql-reference/data-types/array.md)([String](../../sql-reference/data-types/string.md))) — Регулярное выражение для имен хостов, которым разрешено подключаться к серверу ClickHouse. + +- `host_names_like` ([Array](../../sql-reference/data-types/array.md)([String](../../sql-reference/data-types/string.md))) — Имена хостов, которым разрешено подключаться к серверу ClickHouse, заданные с помощью предиката LIKE. + +- `default_roles_all` ([UInt8](../../sql-reference/data-types/int-uint.md#uint-ranges)) — Показывает, что все предоставленные роли установлены для пользователя по умолчанию. + +- `default_roles_list` ([Array](../../sql-reference/data-types/array.md)([String](../../sql-reference/data-types/string.md))) — Список предоставленных ролей по умолчанию. + +- `default_roles_except` ([Array](../../sql-reference/data-types/array.md)([String](../../sql-reference/data-types/string.md))) — Все предоставленные роли задаются по умолчанию, за исключением перечисленных. + +## Смотрите также {#see-also} + +- [SHOW USERS](../../sql-reference/statements/show.md#show-users-statement) + +[Оригинальная статья](https://clickhouse.tech/docs/ru/operations/system_tables/users) diff --git a/docs/ru/operations/tips.md b/docs/ru/operations/tips.md index 271a6a35e25..e537f6ef5c1 100644 --- a/docs/ru/operations/tips.md +++ b/docs/ru/operations/tips.md @@ -30,7 +30,7 @@ $ echo 0 | sudo tee /proc/sys/vm/overcommit_memory Механизм прозрачных huge pages нужно отключить. Он мешает работе аллокаторов памяти, что приводит к значительной деградации производительности. ``` bash -$ echo 'never' | sudo tee /sys/kernel/mm/transparent_hugepage/enabled +$ echo 'madvise' | sudo tee /sys/kernel/mm/transparent_hugepage/enabled ``` С помощью `perf top` можно наблюдать за временем, проведенном в ядре операционной системы для управления памятью. diff --git a/docs/ru/operations/utilities/clickhouse-copier.md b/docs/ru/operations/utilities/clickhouse-copier.md index b05db93b28b..b43f5ccaf7a 100644 --- a/docs/ru/operations/utilities/clickhouse-copier.md +++ b/docs/ru/operations/utilities/clickhouse-copier.md @@ -24,7 +24,7 @@ Утилиту следует запускать вручную следующим образом: ``` bash -$ clickhouse-copier copier --daemon --config zookeeper.xml --task-path /task/path --base-dir /path/to/dir +$ clickhouse-copier --daemon --config zookeeper.xml --task-path /task/path --base-dir /path/to/dir ``` Параметры запуска: diff --git a/docs/ru/sql-reference/aggregate-functions/reference/summap.md b/docs/ru/sql-reference/aggregate-functions/reference/summap.md index d127d7df491..460fc078893 100644 --- a/docs/ru/sql-reference/aggregate-functions/reference/summap.md +++ b/docs/ru/sql-reference/aggregate-functions/reference/summap.md @@ -2,15 +2,14 @@ toc_priority: 141 --- -# sumMap {#agg_functions-summap} - -Синтаксис: `sumMap(key, value)` или `sumMap(Tuple(key, value))` +## sumMap(key, value), sumMap(Tuple(key, value)) {#agg_functions-summap} Производит суммирование массива ‘value’ по соответствующим ключам заданным в массиве ‘key’. +Передача кортежа из массивов ключей и значений эквивалентна передаче двух отдельных массивов: ключей и значений.
Количество элементов в ‘key’ и ‘value’ должно быть одинаковым для каждой строки, для которой происходит суммирование. Возвращает кортеж из двух массивов - ключи в отсортированном порядке и значения, просуммированные по соответствующим ключам. -Пример: +**Пример:** ``` sql CREATE TABLE sum_map( date Date, timeslot DateTime, statusMap Nested( status UInt16, requests UInt64 - ) + ), + statusMapTuple Tuple(Array(Int32), Array(Int32)) ) ENGINE = Log; INSERT INTO sum_map VALUES - ('2000-01-01', '2000-01-01 00:00:00', [1, 2, 3], [10, 10, 10]), - ('2000-01-01', '2000-01-01 00:00:00', [3, 4, 5], [10, 10, 10]), - ('2000-01-01', '2000-01-01 00:01:00', [4, 5, 6], [10, 10, 10]), - ('2000-01-01', '2000-01-01 00:01:00', [6, 7, 8], [10, 10, 10]); + ('2000-01-01', '2000-01-01 00:00:00', [1, 2, 3], [10, 10, 10], ([1, 2, 3], [10, 10, 10])), + ('2000-01-01', '2000-01-01 00:00:00', [3, 4, 5], [10, 10, 10], ([3, 4, 5], [10, 10, 10])), + ('2000-01-01', '2000-01-01 00:01:00', [4, 5, 6], [10, 10, 10], ([4, 5, 6], [10, 10, 10])), + ('2000-01-01', '2000-01-01 00:01:00', [6, 7, 8], [10, 10, 10], ([6, 7, 8], [10, 10, 10])); + SELECT timeslot, - sumMap(statusMap.status, statusMap.requests) + sumMap(statusMap.status, statusMap.requests), + sumMap(statusMapTuple) FROM sum_map GROUP BY timeslot ``` ``` text -┌────────────timeslot─┬─sumMap(statusMap.status, statusMap.requests)─┐ -│ 2000-01-01 00:00:00 │ ([1,2,3,4,5],[10,10,20,10,10]) │ -│ 2000-01-01 00:01:00 │ ([4,5,6,7,8],[10,10,20,10,10]) │ -└─────────────────────┴──────────────────────────────────────────────┘ +┌────────────timeslot─┬─sumMap(statusMap.status, statusMap.requests)─┬─sumMap(statusMapTuple)─────────┐ +│ 2000-01-01 00:00:00 │ ([1,2,3,4,5],[10,10,20,10,10]) │ ([1,2,3,4,5],[10,10,20,10,10]) │ +│ 2000-01-01 00:01:00 │ ([4,5,6,7,8],[10,10,20,10,10]) │ ([4,5,6,7,8],[10,10,20,10,10]) │ +└─────────────────────┴──────────────────────────────────────────────┴────────────────────────────────┘ ``` [Оригинальная статья](https://clickhouse.tech/docs/en/sql-reference/aggregate-functions/reference/summap/) diff --git a/docs/ru/sql-reference/data-types/aggregatefunction.md b/docs/ru/sql-reference/data-types/aggregatefunction.md index 9fbf3dbeded..07983885bde 100644 --- a/docs/ru/sql-reference/data-types/aggregatefunction.md +++ b/docs/ru/sql-reference/data-types/aggregatefunction.md @@ -1,8 +1,8 @@ # AggregateFunction {#data-type-aggregatefunction} -Промежуточное состояние агрегатной функции. Чтобы его получить, используются агрегатные функции с суффиксом `-State`. Чтобы в дальнейшем получить агрегированные данные необходимо использовать те же агрегатные функции с суффиксом `-Merge`. +Агрегатные функции могут обладать определяемым реализацией промежуточным состоянием, которое может быть сериализовано в тип данных, соответствующий AggregateFunction(…), и быть записано в таблицу, обычно посредством [материализованного представления](../../sql-reference/statements/create.md#create-view). Чтобы получить промежуточное состояние, обычно используются агрегатные функции с суффиксом `-State`. Чтобы в дальнейшем получить агрегированные данные, необходимо использовать те же агрегатные функции с суффиксом `-Merge`. -`AggregateFunction(name, types_of_arguments…)` — параметрический тип данных. +`AggregateFunction(name, types\_of\_arguments…)` — параметрический тип данных. **Параметры** @@ -23,7 +23,7 @@ CREATE TABLE t ) ENGINE = ...
``` -[uniq](../../sql-reference/aggregate-functions/reference/uniq.md#agg_function-uniq), anyIf ([any](../../sql-reference/aggregate-functions/reference/any.md#agg_function-any)+[If](../../sql-reference/aggregate-functions/combinators.md#agg-functions-combinator-if)) и [quantiles](../../sql-reference/aggregate-functions/reference/quantiles.md) — агрегатные функции, поддержанные в ClickHouse. +[uniq](../../sql-reference/data-types/aggregatefunction.md#agg_function-uniq), anyIf ([any](../../sql-reference/data-types/aggregatefunction.md#agg_function-any)+[If](../../sql-reference/data-types/aggregatefunction.md#agg-functions-combinator-if)) и [quantiles](../../sql-reference/data-types/aggregatefunction.md) — агрегатные функции, поддержанные в ClickHouse. ## Особенности использования {#osobennosti-ispolzovaniia} @@ -58,6 +58,6 @@ SELECT uniqMerge(state) FROM (SELECT uniqState(UserID) AS state FROM table GROUP ## Пример использования {#primer-ispolzovaniia} -Смотрите в описании движка [AggregatingMergeTree](../../engines/table-engines/mergetree-family/aggregatingmergetree.md). +Смотрите в описании движка [AggregatingMergeTree](../../sql-reference/data-types/aggregatefunction.md). [Оригинальная статья](https://clickhouse.tech/docs/ru/data_types/nested_data_structures/aggregatefunction/) diff --git a/docs/ru/sql-reference/data-types/simpleaggregatefunction.md b/docs/ru/sql-reference/data-types/simpleaggregatefunction.md index d36dc87e8ba..52f0412a177 100644 --- a/docs/ru/sql-reference/data-types/simpleaggregatefunction.md +++ b/docs/ru/sql-reference/data-types/simpleaggregatefunction.md @@ -9,6 +9,7 @@ The following aggregate functions are supported: - [`min`](../../sql-reference/aggregate-functions/reference/min.md#agg_function-min) - [`max`](../../sql-reference/aggregate-functions/reference/max.md#agg_function-max) - [`sum`](../../sql-reference/aggregate-functions/reference/sum.md#agg_function-sum) +- [`sumWithOverflow`](../../sql-reference/aggregate-functions/reference/sumwithoverflow.md#sumwithoverflowx) - [`groupBitAnd`](../../sql-reference/aggregate-functions/reference/groupbitand.md#groupbitand) - [`groupBitOr`](../../sql-reference/aggregate-functions/reference/groupbitor.md#groupbitor) - [`groupBitXor`](../../sql-reference/aggregate-functions/reference/groupbitxor.md#groupbitxor) diff --git a/docs/ru/sql-reference/data-types/tuple.md b/docs/ru/sql-reference/data-types/tuple.md index 566a582eb95..0a1089d1aef 100644 --- a/docs/ru/sql-reference/data-types/tuple.md +++ b/docs/ru/sql-reference/data-types/tuple.md @@ -2,7 +2,7 @@ Кортеж из элементов любого [типа](index.md#data_types). Элементы кортежа могут быть одного или разных типов. -Кортежи используются для временной группировки столбцов. Столбцы могут группироваться при использовании выражения IN в запросе, а также для указания нескольких формальных параметров лямбда-функций. Подробнее смотрите разделы [Операторы IN](../../sql-reference/data-types/tuple.md), [Функции высшего порядка](../../sql-reference/functions/higher-order-functions.md#higher-order-functions). +Кортежи используются для временной группировки столбцов. Столбцы могут группироваться при использовании выражения IN в запросе, а также для указания нескольких формальных параметров лямбда-функций. Подробнее смотрите разделы [Операторы IN](../../sql-reference/data-types/tuple.md), [Функции высшего порядка](../../sql-reference/functions/index.md#higher-order-functions). Кортежи могут быть результатом запроса. 
В этом случае, в текстовых форматах кроме JSON, значения выводятся в круглых скобках через запятую. В форматах JSON, кортежи выводятся в виде массивов (в квадратных скобках). diff --git a/docs/ru/sql-reference/functions/array-functions.md b/docs/ru/sql-reference/functions/array-functions.md index 00d039ca3eb..91c0443c85d 100644 --- a/docs/ru/sql-reference/functions/array-functions.md +++ b/docs/ru/sql-reference/functions/array-functions.md @@ -1,4 +1,4 @@ -# Функции по работе с массивами {#funktsii-po-rabote-s-massivami} +# Массивы {#functions-for-working-with-arrays} ## empty {#function-empty} @@ -186,6 +186,13 @@ SELECT indexOf([1, 3, NULL, NULL], NULL) Элементы, равные `NULL`, обрабатываются как обычные значения. +## arrayCount(\[func,\] arr1, …) {#array-count} + +Возвращает количество элементов массива `arr`, для которых функция `func` возвращает не 0. Если `func` не указана - возвращает количество ненулевых элементов массива. + +Функция `arrayCount` является [функцией высшего порядка](../../sql-reference/functions/index.md#higher-order-functions) — в качестве первого аргумента ей можно передать лямбда-функцию. + + ## countEqual(arr, x) {#countequalarr-x} Возвращает количество элементов массива, равных x. Эквивалентно arrayCount(elem -\> elem = x, arr). @@ -513,7 +520,7 @@ SELECT arraySort([1, nan, 2, NULL, 3, nan, -4, NULL, inf, -inf]); - Значения `NaN` идут перед `NULL`. - Значения `Inf` идут перед `NaN`. -Функция `arraySort` является [функцией высшего порядка](higher-order-functions.md) — в качестве первого аргумента ей можно передать лямбда-функцию. В этом случае порядок сортировки определяется результатом применения лямбда-функции на элементы массива. +Функция `arraySort` является [функцией высшего порядка](../../sql-reference/functions/index.md#higher-order-functions) — в качестве первого аргумента ей можно передать лямбда-функцию. В этом случае порядок сортировки определяется результатом применения лямбда-функции на элементы массива. Рассмотрим пример: @@ -613,7 +620,7 @@ SELECT arrayReverseSort([1, nan, 2, NULL, 3, nan, -4, NULL, inf, -inf]) as res; - Значения `NaN` идут перед `NULL`. - Значения `-Inf` идут перед `NaN`. -Функция `arrayReverseSort` является [функцией высшего порядка](higher-order-functions.md). Вы можете передать ей в качестве первого аргумента лямбда-функцию. Например: +Функция `arrayReverseSort` является [функцией высшего порядка](../../sql-reference/functions/index.md#higher-order-functions) — в качестве первого аргумента ей можно передать лямбда-функцию. Например: ``` sql SELECT arrayReverseSort((x) -> -x, [1, 2, 3]) as res; @@ -851,7 +858,7 @@ SELECT arrayReduce('maxIf', [3, 5], [1, 0]) Пример с параметрической агрегатной функцией: -Запрос: +Запрос: ```sql SELECT arrayReduce('uniqUpTo(3)', [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]) @@ -1036,4 +1043,150 @@ SELECT arrayZip(['a', 'b', 'c'], [5, 2, 1]) └──────────────────────────────────────┘ ``` +## arrayMap(func, arr1, …) {#array-map} + +Возвращает массив, полученный на основе результатов применения функции `func` к каждому элементу массива `arr`. 
+ +Примеры: + +``` sql +SELECT arrayMap(x -> (x + 2), [1, 2, 3]) as res; +``` + +``` text +┌─res─────┐ +│ [3,4,5] │ +└─────────┘ +``` + +Следующий пример показывает, как создать кортежи из элементов разных массивов: + +``` sql +SELECT arrayMap((x, y) -> (x, y), [1, 2, 3], [4, 5, 6]) AS res +``` + +``` text +┌─res─────────────────┐ +│ [(1,4),(2,5),(3,6)] │ +└─────────────────────┘ +``` + +Функция `arrayMap` является [функцией высшего порядка](../../sql-reference/functions/index.md#higher-order-functions) — в качестве первого аргумента ей нужно передать лямбда-функцию, и этот аргумент не может быть опущен. + +## arrayFilter(func, arr1, …) {#array-filter} + +Возвращает массив, содержащий только те элементы массива `arr1`, для которых функция `func` возвращает не 0. + +Примеры: + +``` sql +SELECT arrayFilter(x -> x LIKE '%World%', ['Hello', 'abc World']) AS res +``` + +``` text +┌─res───────────┐ +│ ['abc World'] │ +└───────────────┘ +``` + +``` sql +SELECT + arrayFilter( + (i, x) -> x LIKE '%World%', + arrayEnumerate(arr), + ['Hello', 'abc World'] AS arr) + AS res +``` + +``` text +┌─res─┐ +│ [2] │ +└─────┘ +``` + +Функция `arrayFilter` является [функцией высшего порядка](../../sql-reference/functions/index.md#higher-order-functions) — в качестве первого аргумента ей нужно передать лямбда-функцию, и этот аргумент не может быть опущен. + +## arrayExists(\[func,\] arr1, …) {#arrayexistsfunc-arr1} + +Возвращает 1, если существует хотя бы один элемент массива `arr`, для которого функция func возвращает не 0. Иначе возвращает 0. + +Функция `arrayExists` является [функцией высшего порядка](../../sql-reference/functions/index.md#higher-order-functions) - в качестве первого аргумента ей можно передать лямбда-функцию. + +## arrayAll(\[func,\] arr1, …) {#arrayallfunc-arr1} + +Возвращает 1, если для всех элементов массива `arr`, функция `func` возвращает не 0. Иначе возвращает 0. + +Функция `arrayAll` является [функцией высшего порядка](../../sql-reference/functions/index.md#higher-order-functions) - в качестве первого аргумента ей можно передать лямбда-функцию. + +## arrayFirst(func, arr1, …) {#array-first} + +Возвращает первый элемент массива `arr1`, для которого функция func возвращает не 0. + +Функция `arrayFirst` является [функцией высшего порядка](../../sql-reference/functions/index.md#higher-order-functions) — в качестве первого аргумента ей нужно передать лямбда-функцию, и этот аргумент не может быть опущен. + +## arrayFirstIndex(func, arr1, …) {#array-first-index} + +Возвращает индекс первого элемента массива `arr1`, для которого функция func возвращает не 0. + +Функция `arrayFirstIndex` является [функцией высшего порядка](../../sql-reference/functions/index.md#higher-order-functions) — в качестве первого аргумента ей нужно передать лямбда-функцию, и этот аргумент не может быть опущен. + +## arraySum(\[func,\] arr1, …) {#array-sum} + +Возвращает сумму значений функции `func`. Если функция не указана - просто возвращает сумму элементов массива. + +Функция `arraySum` является [функцией высшего порядка](../../sql-reference/functions/index.md#higher-order-functions) - в качестве первого аргумента ей можно передать лямбда-функцию. + +## arrayCumSum(\[func,\] arr1, …) {#arraycumsumfunc-arr1} + +Возвращает массив из частичных сумм элементов исходного массива (сумма с накоплением). Если указана функция `func`, то значения элементов массива преобразуются этой функцией перед суммированием. 
+ +Функция `arrayCumSum` является [функцией высшего порядка](../../sql-reference/functions/index.md#higher-order-functions) — в качестве первого аргумента ей можно передать лямбда-функцию. + +Пример: + +``` sql +SELECT arrayCumSum([1, 1, 1, 1]) AS res +``` + +``` text +┌─res──────────┐ +│ [1, 2, 3, 4] │ +└──────────────┘ +``` + +## arrayAUC {#arrayauc} + +Вычисляет площадь под кривой (AUC, Area Under the Curve). + +**Синтаксис** + +``` sql +arrayAUC(arr_scores, arr_labels) +``` + +**Параметры** + +- `arr_scores` — оценка, которую дает модель предсказания. +- `arr_labels` — метки выборок: обычно 1 для положительных примеров и 0 для отрицательных. + +**Возвращаемое значение** + +Значение площади под кривой. + +Тип данных: `Float64`. + +**Пример** + +Запрос: + +``` sql +SELECT arrayAUC([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]) +``` + +Результат: + +``` text +┌─arrayAUC([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])─┐ +│ 0.75 │ +└───────────────────────────────────────────────┘ +``` + [Оригинальная статья](https://clickhouse.tech/docs/ru/query_language/functions/array_functions/) diff --git a/docs/ru/sql-reference/functions/higher-order-functions.md b/docs/ru/sql-reference/functions/higher-order-functions.md deleted file mode 100644 index cd3dee5b1a7..00000000000 --- a/docs/ru/sql-reference/functions/higher-order-functions.md +++ /dev/null @@ -1,167 +0,0 @@ -# Функции высшего порядка {#higher-order-functions} - -## Оператор `->`, функция lambda(params, expr) {#operator-funktsiia-lambdaparams-expr} - -Позволяет описать лямбда-функцию для передачи в функцию высшего порядка. Слева от стрелочки стоит формальный параметр - произвольный идентификатор, или несколько формальных параметров - произвольные идентификаторы в кортеже. Справа от стрелочки стоит выражение, в котором могут использоваться эти формальные параметры, а также любые столбцы таблицы. - -Примеры: `x -> 2 * x, str -> str != Referer.` - -Функции высшего порядка, в качестве своего функционального аргумента могут принимать только лямбда-функции. - -В функции высшего порядка может быть передана лямбда-функция, принимающая несколько аргументов. В этом случае, в функцию высшего порядка передаётся несколько массивов одинаковых длин, которым эти аргументы будут соответствовать. - -Для некоторых функций, например [arrayCount](#higher_order_functions-array-count) или [arraySum](#higher_order_functions-array-sum), первый аргумент (лямбда-функция) может отсутствовать. В этом случае, подразумевается тождественное отображение. - -Для функций, перечисленных ниже, лямбда-функцию должна быть указана всегда: - -- [arrayMap](#higher_order_functions-array-map) -- [arrayFilter](#higher_order_functions-array-filter) -- [arrayFirst](#higher_order_functions-array-first) -- [arrayFirstIndex](#higher_order_functions-array-first-index) - -### arrayMap(func, arr1, …) {#higher_order_functions-array-map} - -Вернуть массив, полученный на основе результатов применения функции `func` к каждому элементу массива `arr`. - -Примеры: - -``` sql -SELECT arrayMap(x -> (x + 2), [1, 2, 3]) as res; -``` - -``` text -┌─res─────┐ -│ [3,4,5] │ -└─────────┘ -``` - -Следующий пример показывает, как создать кортежи из элементов разных массивов: - -``` sql -SELECT arrayMap((x, y) -> (x, y), [1, 2, 3], [4, 5, 6]) AS res -``` - -``` text -┌─res─────────────────┐ -│ [(1,4),(2,5),(3,6)] │ -└─────────────────────┘ -``` - -Обратите внимание, что у функции `arrayMap` первый аргумент (лямбда-функция) не может быть опущен.
- -### arrayFilter(func, arr1, …) {#higher_order_functions-array-filter} - -Вернуть массив, содержащий только те элементы массива `arr1`, для которых функция `func` возвращает не 0. - -Примеры: - -``` sql -SELECT arrayFilter(x -> x LIKE '%World%', ['Hello', 'abc World']) AS res -``` - -``` text -┌─res───────────┐ -│ ['abc World'] │ -└───────────────┘ -``` - -``` sql -SELECT - arrayFilter( - (i, x) -> x LIKE '%World%', - arrayEnumerate(arr), - ['Hello', 'abc World'] AS arr) - AS res -``` - -``` text -┌─res─┐ -│ [2] │ -└─────┘ -``` - -Обратите внимание, что у функции `arrayFilter` первый аргумент (лямбда-функция) не может быть опущен. - -### arrayCount(\[func,\] arr1, …) {#higher_order_functions-array-count} - -Вернуть количество элементов массива `arr`, для которых функция func возвращает не 0. Если func не указана - вернуть количество ненулевых элементов массива. - -### arrayExists(\[func,\] arr1, …) {#arrayexistsfunc-arr1} - -Вернуть 1, если существует хотя бы один элемент массива `arr`, для которого функция func возвращает не 0. Иначе вернуть 0. - -### arrayAll(\[func,\] arr1, …) {#arrayallfunc-arr1} - -Вернуть 1, если для всех элементов массива `arr`, функция `func` возвращает не 0. Иначе вернуть 0. - -### arraySum(\[func,\] arr1, …) {#higher_order_functions-array-sum} - -Вернуть сумму значений функции `func`. Если функция не указана - просто вернуть сумму элементов массива. - -### arrayFirst(func, arr1, …) {#higher_order_functions-array-first} - -Вернуть первый элемент массива `arr1`, для которого функция func возвращает не 0. - -Обратите внимание, что у функции `arrayFirst` первый аргумент (лямбда-функция) не может быть опущен. - -### arrayFirstIndex(func, arr1, …) {#higher_order_functions-array-first-index} - -Вернуть индекс первого элемента массива `arr1`, для которого функция func возвращает не 0. - -Обратите внимание, что у функции `arrayFirstFilter` первый аргумент (лямбда-функция) не может быть опущен. - -### arrayCumSum(\[func,\] arr1, …) {#arraycumsumfunc-arr1} - -Возвращает массив из частичных сумм элементов исходного массива (сумма с накоплением). Если указана функция `func`, то значения элементов массива преобразуются этой функцией перед суммированием. - -Пример: - -``` sql -SELECT arrayCumSum([1, 1, 1, 1]) AS res -``` - -``` text -┌─res──────────┐ -│ [1, 2, 3, 4] │ -└──────────────┘ -``` - -### arraySort(\[func,\] arr1, …) {#arraysortfunc-arr1} - -Возвращает отсортированный в восходящем порядке массив `arr1`. Если задана функция `func`, то порядок сортировки определяется результатом применения функции `func` на элементы массива (массивов). - -Для улучшения эффективности сортировки применяется [Преобразование Шварца](https://ru.wikipedia.org/wiki/%D0%9F%D1%80%D0%B5%D0%BE%D0%B1%D1%80%D0%B0%D0%B7%D0%BE%D0%B2%D0%B0%D0%BD%D0%B8%D0%B5_%D0%A8%D0%B2%D0%B0%D1%80%D1%86%D0%B0). - -Пример: - -``` sql -SELECT arraySort((x, y) -> y, ['hello', 'world'], [2, 1]); -``` - -``` text -┌─res────────────────┐ -│ ['world', 'hello'] │ -└────────────────────┘ -``` - -Подробная информация о методе `arraySort` приведена в разделе [Функции по работе с массивами](array-functions.md#array_functions-sort). - -### arrayReverseSort(\[func,\] arr1, …) {#arrayreversesortfunc-arr1} - -Возвращает отсортированный в нисходящем порядке массив `arr1`. Если задана функция `func`, то порядок сортировки определяется результатом применения функции `func` на элементы массива (массивов). 
- -Пример: - -``` sql -SELECT arrayReverseSort((x, y) -> y, ['hello', 'world'], [2, 1]) as res; -``` - -``` text -┌─res───────────────┐ -│ ['hello','world'] │ -└───────────────────┘ -``` - -Подробная информация о методе `arrayReverseSort` приведена в разделе [Функции по работе с массивами](array-functions.md#array_functions-reverse-sort). - -[Оригинальная статья](https://clickhouse.tech/docs/ru/query_language/functions/higher_order_functions/) diff --git a/docs/ru/sql-reference/functions/index.md b/docs/ru/sql-reference/functions/index.md index 06d3d892cf9..9c1c0c5ca9d 100644 --- a/docs/ru/sql-reference/functions/index.md +++ b/docs/ru/sql-reference/functions/index.md @@ -38,6 +38,20 @@ Функции не могут поменять значения своих аргументов - любые изменения возвращаются в качестве результата. Соответственно, от порядка записи функций в запросе, результат вычислений отдельных функций не зависит. +## Функции высшего порядка, оператор `->` и функция lambda(params, expr) {#higher-order-functions} + +Функции высшего порядка, в качестве своего функционального аргумента могут принимать только лямбда-функции. Чтобы передать лямбда-функцию в функцию высшего порядка, используйте оператор `->`. Слева от стрелочки стоит формальный параметр — произвольный идентификатор, или несколько формальных параметров — произвольные идентификаторы в кортеже. Справа от стрелочки стоит выражение, в котором могут использоваться эти формальные параметры, а также любые столбцы таблицы. + +Примеры: +``` +x -> 2 * x +str -> str != Referer +``` + +В функции высшего порядка может быть передана лямбда-функция, принимающая несколько аргументов. В этом случае в функцию высшего порядка передаётся несколько массивов одинаковой длины, которым эти аргументы будут соответствовать. + +Для некоторых функций первый аргумент (лямбда-функция) может отсутствовать. В этом случае подразумевается тождественное отображение. + ## Обработка ошибок {#obrabotka-oshibok} Некоторые функции могут кидать исключения в случае ошибочных данных. В этом случае, выполнение запроса прерывается, и текст ошибки выводится клиенту. При распределённой обработке запроса, при возникновении исключения на одном из серверов, на другие серверы пытается отправиться просьба тоже прервать выполнение запроса. diff --git a/docs/ru/sql-reference/functions/introspection.md b/docs/ru/sql-reference/functions/introspection.md index 9c6a0711ec9..655c4be8318 100644 --- a/docs/ru/sql-reference/functions/introspection.md +++ b/docs/ru/sql-reference/functions/introspection.md @@ -93,7 +93,7 @@ LIMIT 1 \G ``` -Функция [arrayMap](higher-order-functions.md#higher_order_functions-array-map) позволяет обрабатывать каждый отдельный элемент массива `trace` с помощью функции `addressToLine`. Результат этой обработки вы видите в виде `trace_source_code_lines` колонки выходных данных. +Функция [arrayMap](../../sql-reference/functions/array-functions.md#array-map) позволяет обрабатывать каждый отдельный элемент массива `trace` с помощью функции `addressToLine`. Результат этой обработки вы видите в виде `trace_source_code_lines` колонки выходных данных. ``` text Row 1: @@ -179,7 +179,7 @@ LIMIT 1 \G ``` -То [arrayMap](higher-order-functions.md#higher_order_functions-array-map) функция позволяет обрабатывать каждый отдельный элемент системы. `trace` массив по типу `addressToSymbols` функция. Результат этой обработки вы видите в виде `trace_symbols` колонка выходных данных. 
+Функция [arrayMap](../../sql-reference/functions/array-functions.md#array-map) позволяет обрабатывать каждый отдельный элемент массива `trace` с помощью функции `addressToSymbols`. Результат этой обработки вы видите в виде колонки `trace_symbols` выходных данных. ``` text Row 1: @@ -276,7 +276,7 @@ LIMIT 1 \G ``` -Функция [arrayMap](higher-order-functions.md#higher_order_functions-array-map) позволяет обрабатывать каждый отдельный элемент массива `trace` с помощью функции `demangle`. +Функция [arrayMap](../../sql-reference/functions/array-functions.md#array-map) позволяет обрабатывать каждый отдельный элемент массива `trace` с помощью функции `demangle`. ``` text Row 1: diff --git a/docs/ru/sql-reference/functions/ip-address-functions.md b/docs/ru/sql-reference/functions/ip-address-functions.md index 6dd5a68adc5..a9a0a7f919a 100644 --- a/docs/ru/sql-reference/functions/ip-address-functions.md +++ b/docs/ru/sql-reference/functions/ip-address-functions.md @@ -127,9 +127,9 @@ SELECT IPv6NumToString(IPv4ToIPv6(IPv4StringToNum('192.168.0.1'))) AS addr └────────────────────┘ ``` -## cutIPv6(x, bitsToCutForIPv6, bitsToCutForIPv4) {#cutipv6x-bitstocutforipv6-bitstocutforipv4} +## cutIPv6(x, bytesToCutForIPv6, bytesToCutForIPv4) {#cutipv6x-bytestocutforipv6-bytestocutforipv4} -Принимает значение типа FixedString(16), содержащее IPv6-адрес в бинарном виде. Возвращает строку, содержащую адрес из указанного количества битов, удаленных в текстовом формате. Например: +Принимает значение типа FixedString(16), содержащее IPv6-адрес в бинарном виде. Возвращает строку с адресом, из которого удалено указанное количество байтов, в текстовом формате. Например: ``` sql WITH diff --git a/docs/ru/sql-reference/functions/splitting-merging-functions.md b/docs/ru/sql-reference/functions/splitting-merging-functions.md index 81a8011a5bf..bf4e76c3bb1 100644 --- a/docs/ru/sql-reference/functions/splitting-merging-functions.md +++ b/docs/ru/sql-reference/functions/splitting-merging-functions.md @@ -2,13 +2,90 @@ ## splitByChar(separator, s) {#splitbycharseparator-s} -Разбивает строку на подстроки, используя в качестве разделителя separator. +Разбивает строку на подстроки, используя в качестве разделителя `separator`. separator должен быть константной строкой из ровно одного символа. Возвращается массив выделенных подстрок. Могут выделяться пустые подстроки, если разделитель идёт в начале или в конце строки, или если идёт более одного разделителя подряд. +**Синтаксис** + +``` sql +splitByChar(separator, s) +``` + +**Параметры** + +- `separator` — Разделитель, состоящий из одного символа. [String](../../sql-reference/data-types/string.md). +- `s` — Разбиваемая строка. [String](../../sql-reference/data-types/string.md). + +**Возвращаемые значения** + +Возвращает массив подстрок. Пустая подстрока может быть возвращена, когда: + +- Разделитель находится в начале или конце строки; +- Задано несколько последовательных разделителей; +- Исходная строка `s` пуста. + +Тип: [Array](../../sql-reference/data-types/array.md) из [String](../../sql-reference/data-types/string.md). + +**Пример** + +``` sql +SELECT splitByChar(',', '1,2,3,abcde') +``` + +``` text +┌─splitByChar(',', '1,2,3,abcde')─┐ +│ ['1','2','3','abcde'] │ +└─────────────────────────────────┘ +``` + ## splitByString(separator, s) {#splitbystringseparator-s} -То же самое, но использует строку из нескольких символов в качестве разделителя. Строка должна быть непустой. +Разбивает строку на подстроки, используя в качестве разделителя константную строку `separator`, которая может состоять из нескольких символов.
Если строка `separator` пуста, то функция разделит строку `s` на массив из символов. + +**Синтаксис** + +``` sql +splitByString(separator, s) +``` + +**Параметры** + +- `separator` — Разделитель. [String](../../sql-reference/data-types/string.md). +- `s` — Разбиваемая строка. [String](../../sql-reference/data-types/string.md). + +**Возвращаемые значения** + +Возвращает массив подстрок. Пустая подстрока может быть возвращена, когда: + +- Разделитель находится в начале или конце строки; +- Задано несколько последовательных разделителей; +- Исходная строка `s` пуста. + +Тип: [Array](../../sql-reference/data-types/array.md) из [String](../../sql-reference/data-types/string.md). + +**Примеры** + +``` sql +SELECT splitByString(', ', '1, 2 3, 4,5, abcde') +``` + +``` text +┌─splitByString(', ', '1, 2 3, 4,5, abcde')─┐ +│ ['1','2 3','4,5','abcde'] │ +└───────────────────────────────────────────┘ +``` + +``` sql +SELECT splitByString('', 'abcde') +``` + +``` text +┌─splitByString('', 'abcde')─┐ +│ ['a','b','c','d','e'] │ +└────────────────────────────┘ +``` + ## arrayStringConcat(arr\[, separator\]) {#arraystringconcatarr-separator} @@ -33,42 +110,4 @@ SELECT alphaTokens('abca1abc') └─────────────────────────┘ ``` -## extractAllGroups(text, regexp) {#extractallgroups} - -Выделяет все группы из неперекрывающихся подстрок, которые соответствуют регулярному выражению. - -**Синтаксис** - -``` sql -extractAllGroups(text, regexp) -``` - -**Параметры** - -- `text` — [String](../data-types/string.md) или [FixedString](../data-types/fixedstring.md). -- `regexp` — Регулярное выражение. Константа. [String](../data-types/string.md) или [FixedString](../data-types/fixedstring.md). - -**Возвращаемые значения** - -- Если найдена хотя бы одна подходящая группа, функция возвращает столбец вида `Array(Array(String))`, сгруппированный по идентификатору группы (от 1 до N, где N — количество групп с захватом содержимого в `regexp`). - -- Если подходящих групп не найдено, возвращает пустой массив. - -Тип: [Array](../data-types/array.md). - -**Пример использования** - -Запрос: - -``` sql -SELECT extractAllGroups('abc=123, 8="hkl"', '("[^"]+"|\\w+)=("[^"]+"|\\w+)'); -``` - -Результат: - -``` text -┌─extractAllGroups('abc=123, 8="hkl"', '("[^"]+"|\\w+)=("[^"]+"|\\w+)')─┐ -│ [['abc','123'],['8','"hkl"']] │ -└───────────────────────────────────────────────────────────────────────┘ -``` [Оригинальная статья](https://clickhouse.tech/docs/ru/query_language/functions/splitting_merging_functions/) diff --git a/docs/ru/sql-reference/statements/select/join.md b/docs/ru/sql-reference/statements/select/join.md index 2a5bcff0cbb..800f07a7c66 100644 --- a/docs/ru/sql-reference/statements/select/join.md +++ b/docs/ru/sql-reference/statements/select/join.md @@ -36,7 +36,9 @@ FROM !!! note "Примечание" Значение строгости по умолчанию может быть переопределено с помощью настройки [join\_default\_strictness](../../../operations/settings/settings.md#settings-join_default_strictness). - + +Поведение сервера ClickHouse для операций `ANY JOIN` зависит от параметра [any_join_distinct_right_table_keys](../../../operations/settings/settings.md#any_join_distinct_right_table_keys). + ### Использование ASOF JOIN {#asof-join-usage} `ASOF JOIN` применим в том случае, когда необходимо объединять записи, которые не имеют точного совпадения.
diff --git a/docs/ru/sql-reference/statements/show.md b/docs/ru/sql-reference/statements/show.md index b376da352ba..575742568cb 100644 --- a/docs/ru/sql-reference/statements/show.md +++ b/docs/ru/sql-reference/statements/show.md @@ -169,4 +169,65 @@ SHOW CREATE QUOTA [name | CURRENT] SHOW CREATE [SETTINGS] PROFILE name ``` + +## SHOW USERS {#show-users-statement} + +Выводит список [пользовательских аккаунтов](../../operations/access-rights.md#user-account-management). Для просмотра параметров пользовательских аккаунтов, см. системную таблицу [system.users](../../operations/system-tables/users.md#system_tables-users). + +### Синтаксис {#show-users-syntax} + +``` sql +SHOW USERS +``` + +## SHOW ROLES {#show-roles-statement} + +Выводит список [ролей](../../operations/access-rights.md#role-management). Для просмотра параметров ролей, см. системные таблицы [system.roles](../../operations/system-tables/roles.md#system_tables-roles) и [system.role-grants](../../operations/system-tables/role-grants.md#system_tables-role_grants). + +### Синтаксис {#show-roles-syntax} + +``` sql +SHOW [CURRENT|ENABLED] ROLES +``` + +## SHOW PROFILES {#show-profiles-statement} + +Выводит список [профилей настроек](../../operations/access-rights.md#settings-profiles-management). Для просмотра других параметров профилей настроек, см. системную таблицу [settings_profiles](../../operations/system-tables/settings_profiles.md#system_tables-settings_profiles). + +### Синтаксис {#show-profiles-syntax} + +``` sql +SHOW [SETTINGS] PROFILES +``` + +## SHOW POLICIES {#show-policies-statement} + +Выводит список [политик доступа к строкам](../../operations/access-rights.md#row-policy-management) для указанной таблицы. Для просмотра других параметров, см. системную таблицу [system.row_policies](../../operations/system-tables/row_policies.md#system_tables-row_policies). + +### Синтаксис {#show-policies-syntax} + +``` sql +SHOW [ROW] POLICIES [ON [db.]table] +``` + +## SHOW QUOTAS {#show-quotas-statement} + +Выводит список [квот](../../operations/access-rights.md#quotas-management). Для просмотра параметров квот, см. системную таблицу [system.quotas](../../operations/system-tables/quotas.md#system_tables-quotas). + +### Синтаксис {#show-quotas-syntax} + +``` sql +SHOW QUOTAS +``` + +## SHOW QUOTA {#show-quota-statement} + +Выводит потребление [квоты](../../operations/quotas.md) для всех пользователей или только для текущего пользователя. Для просмотра других параметров, см. системные таблицы [system.quotas_usage](../../operations/system-tables/quotas_usage.md#system_tables-quotas_usage) и [system.quota_usage](../../operations/system-tables/quota_usage.md#system_tables-quota_usage). 
+ +### Синтаксис {#show-quota-syntax} + +``` sql +SHOW [CURRENT] QUOTA +``` + [Оригинальная статья](https://clickhouse.tech/docs/ru/query_language/show/) diff --git a/docs/tools/build.py b/docs/tools/build.py index ac675897fca..120af33c8fb 100755 --- a/docs/tools/build.py +++ b/docs/tools/build.py @@ -180,12 +180,13 @@ def build(args): if not args.skip_website: website.build_website(args) - test.test_templates(args.website_dir) + if not args.skip_test_templates: + test.test_templates(args.website_dir) - build_docs(args) - - from github import build_releases - build_releases(args, build_docs) + if not args.skip_docs: + build_docs(args) + from github import build_releases + build_releases(args, build_docs) if not args.skip_blog: blog.build_blog(args) @@ -220,6 +221,8 @@ if __name__ == '__main__': arg_parser.add_argument('--skip-website', action='store_true') arg_parser.add_argument('--skip-blog', action='store_true') arg_parser.add_argument('--skip-git-log', action='store_true') + arg_parser.add_argument('--skip-docs', action='store_true') + arg_parser.add_argument('--skip-test-templates', action='store_true') arg_parser.add_argument('--test-only', action='store_true') arg_parser.add_argument('--minify', action='store_true') arg_parser.add_argument('--htmlproofer', action='store_true') diff --git a/docs/tr/engines/table-engines/mergetree-family/replacingmergetree.md b/docs/tr/engines/table-engines/mergetree-family/replacingmergetree.md index f586b97cb2f..a24c84e9a16 100644 --- a/docs/tr/engines/table-engines/mergetree-family/replacingmergetree.md +++ b/docs/tr/engines/table-engines/mergetree-family/replacingmergetree.md @@ -33,7 +33,7 @@ CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster] **ReplacingMergeTree Parametreleri** -- `ver` — column with version. Type `UInt*`, `Date`, `DateTime` veya `DateTime64`. İsteğe bağlı parametre. +- `ver` — column with version. Type `UInt*`, `Date` veya `DateTime`. İsteğe bağlı parametre. Birleş whenirken, `ReplacingMergeTree` aynı birincil anahtara sahip tüm satırlardan sadece bir tane bırakır: diff --git a/docs/tr/interfaces/http.md b/docs/tr/interfaces/http.md index 2b92dd0ed9b..49d20ef6655 100644 --- a/docs/tr/interfaces/http.md +++ b/docs/tr/interfaces/http.md @@ -38,7 +38,7 @@ GET yöntemini kullanırken, ‘readonly’ ayar .lanmıştır. 
Başka bir deyi $ curl 'http://localhost:8123/?query=SELECT%201' 1 -$ wget -O- -q 'http://localhost:8123/?query=SELECT 1' +$ wget -nv -O- 'http://localhost:8123/?query=SELECT 1' 1 $ echo -ne 'GET /?query=SELECT%201 HTTP/1.0\r\n\r\n' | nc localhost 8123 diff --git a/docs/zh/engines/table-engines/mergetree-family/mergetree.md b/docs/zh/engines/table-engines/mergetree-family/mergetree.md index e92621c12df..0b886547229 100644 --- a/docs/zh/engines/table-engines/mergetree-family/mergetree.md +++ b/docs/zh/engines/table-engines/mergetree-family/mergetree.md @@ -2,44 +2,47 @@ Clickhouse 中最强大的表引擎当属 `MergeTree` (合并树)引擎及该系列(`*MergeTree`)中的其他引擎。 -`MergeTree` 引擎系列的基本理念如下。当你有巨量数据要插入到表中,你要高效地一批批写入数据片段,并希望这些数据片段在后台按照一定规则合并。相比在插入时不断修改(重写)数据进存储,这种策略会高效很多。 +`MergeTree` 系列的引擎被设计用于插入极大量的数据到一张表当中。数据可以以数据片段的形式一个接着一个的快速写入,数据片段在后台按照一定的规则进行合并。相比在插入时不断修改(重写)已存储的数据,这种策略会高效很多。 主要特点: - 存储的数据按主键排序。 - 这让你可以创建一个用于快速检索数据的小稀疏索引。 + 这使得你能够创建一个小型的稀疏索引来加快数据检索。 -- 允许使用分区,如果指定了 [分区键](custom-partitioning-key.md) 的话。 +- 支持数据分区,如果指定了 [分区键](custom-partitioning-key.md) 的话。 在相同数据集和相同结果集的情况下 ClickHouse 中某些带分区的操作会比普通操作更快。查询中指定了分区键时 ClickHouse 会自动截取分区数据。这也有效增加了查询性能。 - 支持数据副本。 - `ReplicatedMergeTree` 系列的表便是用于此。更多信息,请参阅 [数据副本](replication.md) 一节。 + `ReplicatedMergeTree` 系列的表提供了数据副本功能。更多信息,请参阅 [数据副本](replication.md) 一节。 - 支持数据采样。 需要的话,你可以给表设置一个采样方法。 -!!! 注意 "注意" +!!! note "注意" [合并](../special/merge.md#merge) 引擎并不属于 `*MergeTree` 系列。 ## 建表 {#table_engine-mergetree-creating-a-table} - CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster] - ( - name1 [type1] [DEFAULT|MATERIALIZED|ALIAS expr1], - name2 [type2] [DEFAULT|MATERIALIZED|ALIAS expr2], - ... - INDEX index_name1 expr1 TYPE type1(...) GRANULARITY value1, - INDEX index_name2 expr2 TYPE type2(...) GRANULARITY value2 - ) ENGINE = MergeTree() - [PARTITION BY expr] - [ORDER BY expr] - [PRIMARY KEY expr] - [SAMPLE BY expr] - [SETTINGS name=value, ...] +``` sql +CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster] +( + name1 [type1] [DEFAULT|MATERIALIZED|ALIAS expr1] [TTL expr1], + name2 [type2] [DEFAULT|MATERIALIZED|ALIAS expr2] [TTL expr2], + ... + INDEX index_name1 expr1 TYPE type1(...) GRANULARITY value1, + INDEX index_name2 expr2 TYPE type2(...) GRANULARITY value2 +) ENGINE = MergeTree() +ORDER BY expr +[PARTITION BY expr] +[PRIMARY KEY expr] +[SAMPLE BY expr] +[TTL expr [DELETE|TO DISK 'xxx'|TO VOLUME 'xxx'], ...] +[SETTINGS name=value, ...] 
+```

对于以上参数的描述,可参考 [CREATE 语句 的描述](../../../engines/table-engines/mergetree-family/mergetree.md) 。

@@ -62,7 +65,7 @@ Clickhouse 中最强大的表引擎当属 `MergeTree` (合并树)引擎及

    要按月分区,可以使用表达式 `toYYYYMM(date_column)` ,这里的 `date_column` 是一个 [Date](../../../engines/table-engines/mergetree-family/mergetree.md) 类型的列。分区名的格式会是 `"YYYYMM"` 。

-- `PRIMARY KEY` - 主键,如果要设成 [跟排序键不相同](#xuan-ze-gen-pai-xu-jian-bu-yi-yang-zhu-jian),可选。
+- `PRIMARY KEY` - 主键,如果要 [选择与排序键不同的主键](#choosing-a-primary-key-that-differs-from-the-sorting-key),可选。

    默认情况下主键跟排序键(由 `ORDER BY` 子句指定)相同。
    因此,大部分情况下不需要再专门指定一个 `PRIMARY KEY` 子句。

@@ -72,17 +75,19 @@ Clickhouse 中最强大的表引擎当属 `MergeTree` (合并树)引擎及

    如果要用抽样表达式,主键中必须包含这个表达式。例如:
    `SAMPLE BY intHash32(UserID) ORDER BY (CounterID, EventDate, intHash32(UserID))` 。

-- TTL 指定行存储的持续时间并定义 PART 在硬盘和卷上的移动逻辑的规则列表,可选。
+- TTL 指定行存储的持续时间并定义数据片段在硬盘和卷上的移动逻辑的规则列表,可选。

    表达式中必须存在至少一个 `Date` 或 `DateTime` 类型的列,比如:
    `TTL date + INTERVAL 1 DAY`

-    规则的类型 `DELETE|TO DISK 'xxx'|TO VOLUME 'xxx'`指定了当满足条件(到达当前时间)时所要执行的动作:移除过期的行,还是将 PART (如果PART中的所有行都满足表达式的话)移动到指定的磁盘(`TO DISK 'xxx'`) 或 卷(`TO VOLUME 'xxx'`)。默认的规则是移除(`DELETE`)。可以在列表中指定多个规则,但最多只能有一个`DELETE`的规则。
+    规则的类型 `DELETE|TO DISK 'xxx'|TO VOLUME 'xxx'`指定了当满足条件(到达指定时间)时所要执行的动作:移除过期的行,还是将数据片段(如果数据片段中的所有行都满足表达式的话)移动到指定的磁盘(`TO DISK 'xxx'`) 或 卷(`TO VOLUME 'xxx'`)。默认的规则是移除(`DELETE`)。可以在列表中指定多个规则,但最多只能有一个`DELETE`的规则。
+
+    更多细节,请查看 [表和列的 TTL](#table_engine-mergetree-ttl)

-- `SETTINGS` — 影响 `MergeTree` 性能的额外参数:
+- `SETTINGS` — 控制 `MergeTree` 行为的额外参数:

-    - `index_granularity` — 索引粒度。索引中相邻的『标记』间的数据行数。默认值,8192 。参考[Data Storage](#mergetree-data-storage)。
+    - `index_granularity` — 索引粒度。索引中相邻的『标记』间的数据行数。默认值,8192 。参考[数据存储](#mergetree-data-storage)。
    - `index_granularity_bytes` — 索引粒度,以字节为单位,默认值: 10Mb。如果想要仅按数据行数限制索引粒度, 请设置为0(不建议)。
    - `enable_mixed_granularity_parts` — 是否启用通过 `index_granularity_bytes` 控制索引粒度的大小。在19.11版本之前, 只有 `index_granularity` 配置能够用于限制索引粒度的大小。当从具有很大的行(几十上百兆字节)的表中查询数据时候,`index_granularity_bytes` 配置能够提升ClickHouse的性能。如果你的表里有很大的行,可以开启这项配置来提升`SELECT` 查询的性能。
    - `use_minimalistic_part_header_in_zookeeper` — 是否在 ZooKeeper 中启用最小的数据片段头。如果设置了 `use_minimalistic_part_header_in_zookeeper=1` ,ZooKeeper 会存储更少的数据。更多信息参考『服务配置参数』这章中的 [设置描述](../../../operations/server-configuration-parameters/settings.md#server-settings-use_minimalistic_part_header_in_zookeeper) 。
@@ -90,18 +95,21 @@ Clickhouse 中最强大的表引擎当属 `MergeTree` (合并树)引擎及
    - `merge_with_ttl_timeout` — TTL合并频率的最小间隔时间,单位:秒。默认值: 86400 (1 天)。
    - `write_final_mark` — 是否启用在数据片段尾部写入最终索引标记。默认值: 1(不建议更改)。
-    - `storage_policy` — 存储策略。 参见 [使用多个区块装置进行数据存储](#table_engine-mergetree-multiple-volumes).
-    - `min_bytes_for_wide_part`,`min_rows_for_wide_part` 在数据分段中可以使用`Wide`格式进行存储的最小字节数/行数。你可以不设置、只设置一个,或全都设置。参考:[Data Storage](#mergetree-data-storage)
+    - `merge_max_block_size` — 在块中进行合并操作时的最大行数限制。默认值:8192
+    - `storage_policy` — 存储策略。 参见 [使用具有多个块的设备进行数据存储](#table_engine-mergetree-multiple-volumes).
+    - `min_bytes_for_wide_part`,`min_rows_for_wide_part` 在数据片段中可以使用`Wide`格式进行存储的最小字节数/行数。你可以不设置、只设置一个,或全都设置。参考:[数据存储](#mergetree-data-storage)

**示例配置**

-    ENGINE MergeTree() PARTITION BY toYYYYMM(EventDate) ORDER BY (CounterID, EventDate, intHash32(UserID)) SAMPLE BY intHash32(UserID) SETTINGS index_granularity=8192
+``` sql
+ENGINE MergeTree() PARTITION BY toYYYYMM(EventDate) ORDER BY (CounterID, EventDate, intHash32(UserID)) SAMPLE BY intHash32(UserID) SETTINGS index_granularity=8192
+```

-示例中,我们设为按月分区。
+在这个例子中,我们设置了按月进行分区。

-同时我们设置了一个按用户ID哈希的抽样表达式。这让你可以有该表中每个 `CounterID` 和 `EventDate` 下面的数据的伪随机分布。如果你在查询时指定了 [SAMPLE](../../../engines/table-engines/mergetree-family/mergetree.md#select-sample-clause) 子句。 ClickHouse会返回对于用户子集的一个均匀的伪随机数据采样。
+同时我们设置了一个按用户 ID 哈希的抽样表达式。这使得该表中每个 `CounterID` 和 `EventDate` 组合下的数据呈伪随机分布。如果你在查询时指定了 [SAMPLE](../../../engines/table-engines/mergetree-family/mergetree.md#select-sample-clause) 子句,ClickHouse 会返回对于用户子集的一个均匀的伪随机数据采样。

-`index_granularity` 可省略,默认值为 8192 。
+`index_granularity` 可以省略,因为 8192 是默认值。
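+下面再给出一个完整建表语句的示意例子,说明上述 `ENGINE` 子句的用法(表名 `visits` 与列定义为演示所设,并非文档原文):
+
+``` sql
+CREATE TABLE visits
+(
+    CounterID UInt32,
+    EventDate Date,
+    UserID UInt64
+) ENGINE = MergeTree()
+PARTITION BY toYYYYMM(EventDate)
+ORDER BY (CounterID, EventDate, intHash32(UserID))
+SAMPLE BY intHash32(UserID)
+SETTINGS index_granularity = 8192
+```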
@@ -133,15 +141,20 @@ Clickhouse 中最强大的表引擎当属 `MergeTree` (合并树)引擎及

## 数据存储 {#mergetree-data-storage}

-表由按主键排序的数据 *片段* 组成。
+表由按主键排序的数据片段(DATA PART)组成。

-当数据被插入到表中时,会分成数据片段并按主键的字典序排序。例如,主键是 `(CounterID, Date)` 时,片段中数据按 `CounterID` 排序,具有相同 `CounterID` 的部分按 `Date` 排序。
+当数据被插入到表中时,会创建多个数据片段并按主键的字典序排序。例如,主键是 `(CounterID, Date)` 时,片段中数据首先按 `CounterID` 排序,具有相同 `CounterID` 的部分按 `Date` 排序。

-不同分区的数据会被分成不同的片段,ClickHouse 在后台合并数据片段以便更高效存储。不会合并来自不同分区的数据片段。这个合并机制并不保证相同主键的所有行都会合并到同一个数据片段中。
+不同分区的数据会被分成不同的片段,ClickHouse 在后台合并数据片段以便更高效存储。不同分区的数据片段不会进行合并。合并机制并不保证具有相同主键的行全都合并到同一个数据片段中。

-ClickHouse 会为每个数据片段创建一个索引文件,索引文件包含每个索引行(『标记』)的主键值。索引行号定义为 `n * index_granularity` 。最大的 `n` 等于总行数除以 `index_granularity` 的值的整数部分。对于每列,跟主键相同的索引行处也会写入『标记』。这些『标记』让你可以直接找到数据所在的列。
+数据片段可以以 `Wide` 或 `Compact` 格式存储。在 `Wide` 格式下,每一列都会在文件系统中存储为单独的文件,在 `Compact` 格式下所有列都存储在一个文件中。`Compact` 格式可以提高插入量少但插入频率频繁时的性能。

-你可以只用一单一大表并不断地一块块往里面加入数据 – `MergeTree` 引擎的就是为了这样的场景。
+数据存储格式由 `min_bytes_for_wide_part` 和 `min_rows_for_wide_part` 表引擎参数控制。如果数据片段中的字节数或行数少于相应的设置值,数据片段会以 `Compact` 格式存储,否则会以 `Wide` 格式存储。
+
+每个数据片段被逻辑地分割成颗粒(granules)。颗粒是 ClickHouse 中进行数据查询时的最小不可分割数据集。ClickHouse 不会对行或值进行拆分,所以每个颗粒总是包含整数个行。每个颗粒的第一行通过该行的主键值进行标记,
+ClickHouse 会为每个数据片段创建一个索引文件来存储这些标记。对于每列,无论它是否包含在主键当中,ClickHouse 都会存储类似标记。这些标记让你可以在列文件中直接找到数据。
+
+颗粒的大小通过表引擎参数 `index_granularity` 和 `index_granularity_bytes` 控制。取决于行的大小,颗粒的行数在 `[1, index_granularity]` 范围中。如果单行的大小超过了 `index_granularity_bytes` 设置的值,那么一个颗粒的大小会超过 `index_granularity_bytes`。在这种情况下,颗粒的大小等于该行的大小。

## 主键和索引在查询中的表现 {#primary-keys-and-indexes-in-queries}

@@ -162,56 +175,53 @@ ClickHouse 会为每个数据片段创建一个索引文件,索引文件包含

上面例子可以看出使用索引通常会比全表扫描要高效。

-稀疏索引会引起额外的数据读取。当读取主键单个区间范围的数据时,每个数据块中最多会多读 `index_granularity * 2` 行额外的数据。大部分情况下,当 `index_granularity = 8192` 时,ClickHouse的性能并不会降级。
+稀疏索引会引起额外的数据读取。当读取主键单个区间范围的数据时,每个数据块中最多会多读 `index_granularity * 2` 行额外的数据。

-稀疏索引让你能操作有巨量行的表。因为这些索引是常驻内存(RAM)的。
+稀疏索引使得你可以处理极大量的行,因为大多数情况下,这些索引常驻于内存(RAM)中。

-ClickHouse 不要求主键惟一。所以,你可以插入多条具有相同主键的行。
+ClickHouse 不要求主键惟一,所以你可以插入多条具有相同主键的行。

### 主键的选择 {#zhu-jian-de-xuan-ze}

-主键中列的数量并没有明确的限制。依据数据结构,你应该让主键包含多些或少些列。这样可以:
+主键中列的数量并没有明确的限制。依据数据结构,你可以让主键包含较多或较少的列。这样可以:

- 改善索引的性能。

-    如果当前主键是 `(a, b)` ,然后加入另一个 `c` 列,满足下面条件时,则可以改善性能:
-    - 有带有 `c` 列条件的查询。
-    - 很长的数据范围( `index_granularity` 的数倍)里 `(a, b)` 都是相同的值,并且这种的情况很普遍。换言之,就是加入另一列后,可以让你的查询略过很长的数据范围。
+    如果当前主键是 `(a, b)` ,在下列情况下添加另一个 `c` 列会提升性能:
+
+    - 查询会使用 `c` 列作为条件
+    - 很长的数据范围( `index_granularity` 的数倍)里 `(a, b)` 都是相同的值,并且这样的情况很普遍。换言之,就是加入另一列后,可以让你的查询略过很长的数据范围。

- 改善数据压缩。

-    ClickHouse 以主键排序片段数据,所以,数据的一致性越高,压缩越好。
+    ClickHouse 以主键排序片段数据,所以,数据的一致性越高,压缩越好。

-- [折叠树](collapsingmergetree.md#table_engine-collapsingmergetree) 和 [SummingMergeTree](summingmergetree.md) 引擎里,数据合并时,会有额外的处理逻辑。
+- 在[CollapsingMergeTree](collapsingmergetree.md#table_engine-collapsingmergetree) 和 [SummingMergeTree](summingmergetree.md) 引擎里进行数据合并时会提供额外的处理逻辑。

-    在这种情况下,指定一个跟主键不同的 *排序键* 也是有意义的。
+    在这种情况下,指定与主键不同的 *排序键* 也是有意义的。

长的主键会对插入性能和内存消耗有负面影响,但主键中额外的列并不影响 `SELECT` 查询的性能。

-### 选择跟排序键不一样主键 {#xuan-ze-gen-pai-xu-jian-bu-yi-yang-zhu-jian}
+可以使用 `ORDER BY tuple()` 语法创建没有主键的表。在这种情况下 ClickHouse 根据数据插入的顺序存储数据。如果在使用 `INSERT ...
SELECT` 时希望保持数据的排序,请设置 [max\_insert\_threads = 1](../../../operations/settings/settings.md#settings-max-insert-threads)。 -指定一个跟排序键(用于排序数据片段中行的表达式) -不一样的主键(用于计算写到索引文件的每个标记值的表达式)是可以的。 -这种情况下,主键表达式元组必须是排序键表达式元组的一个前缀。 +想要根据初始顺序进行数据查询,使用 [单线程查询](../../../operations/settings/settings.md#settings-max_threads) -当使用 [SummingMergeTree](summingmergetree.md) 和 -[AggregatingMergeTree](aggregatingmergetree.md) 引擎时,这个特性非常有用。 -通常,使用这类引擎时,表里列分两种:*维度* 和 *度量* 。 -典型的查询是在 `GROUP BY` 并过虑维度的情况下统计度量列的值。 -像 SummingMergeTree 和 AggregatingMergeTree ,用相同的排序键值统计行时, -通常会加上所有的维度。结果就是,这键的表达式会是一长串的列组成, -并且这组列还会因为新加维度必须频繁更新。 +### 选择与排序键不同主键 {#choosing-a-primary-key-that-differs-from-the-sorting-key} -这种情况下,主键中仅预留少量列保证高效范围扫描, -剩下的维度列放到排序键元组里。这样是合理的。 +指定一个跟排序键不一样的主键是可以的,此时排序键用于在数据片段中进行排序,主键用于在索引文件中进行标记的写入。这种情况下,主键表达式元组必须是排序键表达式元组的前缀。 -[排序键的修改](../../../engines/table-engines/mergetree-family/mergetree.md) 是轻量级的操作,因为一个新列同时被加入到表里和排序键后时,已存在的数据片段并不需要修改。由于旧的排序键是新排序键的前缀,并且刚刚添加的列中没有数据,因此在表修改时的数据对于新旧的排序键来说都是有序的。 +当使用 [SummingMergeTree](summingmergetree.md) 和 [AggregatingMergeTree](aggregatingmergetree.md) 引擎时,这个特性非常有用。通常在使用这类引擎时,表里的列分两种:*维度* 和 *度量* 。典型的查询会通过任意的 `GROUP BY` 对度量列进行聚合并通过维度列进行过滤。由于 SummingMergeTree 和 AggregatingMergeTree 会对排序键相同的行进行聚合,所以把所有的维度放进排序键是很自然的做法。但这将导致排序键中包含大量的列,并且排序键会伴随着新添加的维度不断的更新。 -### 索引和分区在查询中的应用 {#suo-yin-he-fen-qu-zai-cha-xun-zhong-de-ying-yong} +在这种情况下合理的做法是,只保留少量的列在主键当中用于提升扫描效率,将维度列添加到排序键中。 -对于 `SELECT` 查询,ClickHouse 分析是否可以使用索引。如果 `WHERE/PREWHERE` 子句具有下面这些表达式(作为谓词链接一子项或整个)则可以使用索引:基于主键或分区键的列或表达式的部分的等式或比较运算表达式;基于主键或分区键的列或表达式的固定前缀的 `IN` 或 `LIKE` 表达式;基于主键或分区键的列的某些函数;基于主键或分区键的表达式的逻辑表达式。 +对排序键进行 [ALTER](../../../sql-reference/statements/alter.md) 是轻量级的操作,因为当一个新列同时被加入到表里和排序键里时,已存在的数据片段并不需要修改。由于旧的排序键是新排序键的前缀,并且新添加的列中没有数据,因此在表修改时的数据对于新旧的排序键来说都是有序的。 -因此,在索引键的一个或多个区间上快速地跑查询都是可能的。下面例子中,指定标签;指定标签和日期范围;指定标签和日期;指定多个标签和日期范围等运行查询,都会非常快。 +### 索引和分区在查询中的应用 {#use-of-indexes-and-partitions-in-queries} + +对于 `SELECT` 查询,ClickHouse 分析是否可以使用索引。如果 `WHERE/PREWHERE` 子句具有下面这些表达式(作为谓词链接一子项或整个)则可以使用索引:包含一个表示与主键/分区键中的部分字段或全部字段相等/不等的比较表达式;基于主键/分区键的字段上的 `IN` 或 固定前缀的`LIKE` 表达式;基于主键/分区键的字段上的某些函数;基于主键/分区键的表达式的逻辑表达式。 + + +因此,在索引键的一个或多个区间上快速地执行查询都是可能的。下面例子中,指定标签;指定标签和日期范围;指定标签和日期;指定多个标签和日期范围等执行查询,都会非常快。 当引擎配置如下时: @@ -237,11 +247,18 @@ SELECT count() FROM table WHERE CounterID = 34 OR URL LIKE '%upyachka%' 要检查 ClickHouse 执行一个查询时能否使用索引,可设置 [force\_index\_by\_date](../../../operations/settings/settings.md#settings-force_index_by_date) 和 [force\_primary\_key](../../../operations/settings/settings.md) 。 -按月分区的分区键是只能读取包含适当范围日期的数据块。这种情况下,数据块会包含很多天(最多整月)的数据。在块中,数据按主键排序,主键第一列可能不包含日期。因此,仅使用日期而没有带主键前缀条件的查询将会导致读取超过这个日期范围。 +按月分区的分区键是只能读取包含适当范围日期的数据块。这种情况下,数据块会包含很多天(最多整月)的数据。在块中,数据按主键排序,主键第一列可能不包含日期。因此,仅使用日期而没有带主键前几个字段作为条件的查询将会导致需要读取超过这个指定日期以外的数据。 -### 跳数索引(分段汇总索引,实验性的) {#tiao-shu-suo-yin-fen-duan-hui-zong-suo-yin-shi-yan-xing-de} +### 部分单调主键的使用 -需要设置 `allow_experimental_data_skipping_indices` 为 1 才能使用此索引。(执行 `SET allow_experimental_data_skipping_indices = 1`)。 +考虑这样的场景,比如一个月中的几天。它们在一个月的范围内形成一个[单调序列](https://zh.wikipedia.org/wiki/单调函数) ,但如果扩展到更大的时间范围它们就不再单调了。这就是一个部分单调序列。如果用户使用部分单调的主键创建表,ClickHouse同样会创建一个稀疏索引。当用户从这类表中查询数据时,ClickHouse 会对查询条件进行分析。如果用户希望获取两个索引标记之间的数据并且这两个标记在一个月以内,ClickHouse 可以在这种特殊情况下使用到索引,因为它可以计算出查询参数与索引标记之间的距离。 + +如果查询参数范围内的主键不是单调序列,那么 ClickHouse 无法使用索引。在这种情况下,ClickHouse 会进行全表扫描。 + +ClickHouse 在任何主键代表一个部分单调序列的情况下都会使用这个逻辑。 + + +### 跳数索引 {#tiao-shu-suo-yin-fen-duan-hui-zong-suo-yin-shi-yan-xing-de} 此索引在 `CREATE` 语句的列部分里定义。 @@ -249,12 +266,14 @@ SELECT count() FROM table WHERE CounterID = 34 OR URL LIKE '%upyachka%' INDEX index_name expr TYPE 
type(...) GRANULARITY granularity_value
```

-`*MergeTree` 系列的表都能指定跳数索引。
+`*MergeTree` 系列的表可以指定跳数索引。
这些索引是由数据块按粒度分割后的每部分在指定表达式上汇总信息 `granularity_value` 组成(粒度大小用表引擎里 `index_granularity` 的指定)。
这些汇总信息有助于用 `where` 语句跳过大片不满足的数据,从而减少 `SELECT` 查询从磁盘读取的数据量,

-示例
+这些索引会在数据块上聚合指定表达式的信息,这些信息以 `granularity_value` 指定的粒度组成(粒度的大小通过表引擎中的 `index_granularity` 参数定义)。这些汇总信息有助于跳过大片不满足 `where` 条件的数据,从而减少 `SELECT` 查询从磁盘读取的数据量。
+
+**示例**

``` sql
CREATE TABLE table_name
@@ -282,19 +301,27 @@ SELECT count() FROM table WHERE u64 * i32 == 10 AND u64 * length(s) >= 1234

    存储指定表达式的极值(如果表达式是 `tuple` ,则存储 `tuple` 中每个元素的极值),这些信息用于跳过数据块,类似主键。

- `set(max_rows)`
-    存储指定表达式的惟一值(不超过 `max_rows` 个,`max_rows=0` 则表示『无限制』)。这些信息可用于检查 `WHERE` 表达式是否满足某个数据块。
+    存储指定表达式的不重复值(不超过 `max_rows` 个,`max_rows=0` 则表示『无限制』)。这些信息可用于检查数据块是否满足 `WHERE` 条件。

- `ngrambf_v1(n, size_of_bloom_filter_in_bytes, number_of_hash_functions, random_seed)`
-    存储包含数据块中所有 n 元短语的 [布隆过滤器](https://en.wikipedia.org/wiki/Bloom_filter) 。只可用在字符串上。
+    存储一个包含数据块中所有 n元短语(ngram) 的 [布隆过滤器](https://en.wikipedia.org/wiki/Bloom_filter) 。只可用在字符串上。
    可用于优化 `equals` , `like` 和 `in` 表达式的性能。
    `n` – 短语长度。
-    `size_of_bloom_filter_in_bytes` – 布隆过滤器大小,单位字节。(因为压缩得好,可以指定比较大的值,如256或512)。
-    `number_of_hash_functions` – 布隆过滤器中使用的 hash 函数的个数。
-    `random_seed` – hash 函数的随机种子。
+    `size_of_bloom_filter_in_bytes` – 布隆过滤器大小,单位字节。(因为压缩得好,可以指定比较大的值,如 256 或 512)。
+    `number_of_hash_functions` – 布隆过滤器中使用的哈希函数的个数。
+    `random_seed` – 哈希函数的随机种子。

- `tokenbf_v1(size_of_bloom_filter_in_bytes, number_of_hash_functions, random_seed)`
-    跟 `ngrambf_v1` 类似,不同于 ngrams 存储字符串指定长度的所有片段。它只存储被非字母数据字符分割的片段。
+    跟 `ngrambf_v1` 类似,不同于 ngrams 存储字符串指定长度的所有片段。它只存储被非字母数字字符分割的片段。

+- `bloom_filter([false_positive])` – 为指定的列存储布隆过滤器
+
+    可选的参数 false_positive 用来指定从布隆过滤器收到错误响应的几率。取值范围是 (0,1),默认值:0.025
+
+    支持的数据类型:`Int*`, `UInt*`, `Float*`, `Enum`, `Date`, `DateTime`, `String`, `FixedString`, `Array`, `LowCardinality`, `Nullable`。
+
+    以下函数会用到这个索引: [equals](../../../sql-reference/functions/comparison-functions.md), [notEquals](../../../sql-reference/functions/comparison-functions.md), [in](../../../sql-reference/functions/in-functions.md), [notIn](../../../sql-reference/functions/in-functions.md), [has](../../../sql-reference/functions/array-functions.md)
+

``` sql
@@ -303,17 +330,62 @@ INDEX sample_index2 (u64 * length(str), i32 + f64 * 100, date, str) TYPE set(100
INDEX sample_index3 (lower(str), str) TYPE ngrambf_v1(3, 256, 2, 0) GRANULARITY 4
```

-## 并发数据访问 {#bing-fa-shu-ju-fang-wen}
+#### 函数支持 {#functions-support}
+
+如果 WHERE 子句中的条件包含对列的函数调用,并且该列是索引的一部分,ClickHouse 会在执行函数时尝试使用索引。不同的函数对索引的支持是不同的。
+
+`set` 索引会对所有函数生效,其他索引对函数的生效情况见下表:
+
+| 函数 (操作符) / 索引 | primary key | minmax | ngrambf\_v1 | tokenbf\_v1 | bloom\_filter |
+|-----------------------------------------------------------------------------------------------------------|-------------|--------|-------------|-------------|---------------|
+| [equals (=, ==)](../../../sql-reference/functions/comparison-functions.md#function-equals) | ✔ | ✔ | ✔ | ✔ | ✔ |
+| [notEquals(!=, \<\>)](../../../sql-reference/functions/comparison-functions.md#function-notequals) | ✔ | ✔ | ✔ | ✔ | ✔ |
+| [like](../../../sql-reference/functions/string-search-functions.md#function-like) | ✔ | ✔ | ✔ | ✔ | ✔ |
+| [notLike](../../../sql-reference/functions/string-search-functions.md#function-notlike) | ✔ | ✔ | ✗ | ✗ | ✗ |
+| [startsWith](../../../sql-reference/functions/string-functions.md#startswith) | ✔ | ✔ | ✔ | ✔ | ✗ |
+| [endsWith](../../../sql-reference/functions/string-functions.md#endswith) | ✗ | ✗ | ✔ | ✔ | ✗ |
+| [multiSearchAny](../../../sql-reference/functions/string-search-functions.md#function-multisearchany) | ✗ | ✗ | ✔ | ✗ | ✗ |
+| [in](../../../sql-reference/functions/in-functions.md#in-functions) | ✔ | ✔ | ✔ | ✔ | ✔ |
+| [notIn](../../../sql-reference/functions/in-functions.md#in-functions) | ✔ | ✔ | ✔ | ✔ | ✔ |
+| [less (\<)](../../../sql-reference/functions/comparison-functions.md#function-less) | ✔ | ✔ | ✗ | ✗ | ✗ |
+| [greater (\>)](../../../sql-reference/functions/comparison-functions.md#function-greater) | ✔ | ✔ | ✗ | ✗ | ✗ |
+| [lessOrEquals (\<=)](../../../sql-reference/functions/comparison-functions.md#function-lessorequals) | ✔ | ✔ | ✗ | ✗ | ✗ |
+| [greaterOrEquals (\>=)](../../../sql-reference/functions/comparison-functions.md#function-greaterorequals) | ✔ | ✔ | ✗ | ✗ | ✗ |
+| [empty](../../../sql-reference/functions/array-functions.md#function-empty) | ✔ | ✔ | ✗ | ✗ | ✗ |
+| [notEmpty](../../../sql-reference/functions/array-functions.md#function-notempty) | ✔ | ✔ | ✗ | ✗ | ✗ |
+| hasToken | ✗ | ✗ | ✗ | ✔ | ✗ |
+
+常量参数小于 ngram 大小的函数不能使用 `ngrambf_v1` 进行查询优化。
+
+!!! note "注意"
+    布隆过滤器可能会包含不符合条件的匹配,所以 `ngrambf_v1`, `tokenbf_v1` 和 `bloom_filter` 索引不能用于负向的函数,例如:
+
+- 可以用来优化的场景
+    - `s LIKE '%test%'`
+    - `NOT s NOT LIKE '%test%'`
+    - `s = 1`
+    - `NOT s != 1`
+    - `startsWith(s, 'test')`
+- 不能用来优化的场景
+    - `NOT s LIKE '%test%'`
+    - `s NOT LIKE '%test%'`
+    - `NOT s = 1`
+    - `s != 1`
+    - `NOT startsWith(s, 'test')`
+
+## 并发数据访问 {#concurrent-data-access}

应对表的并发访问,我们使用多版本机制。换言之,当同时读和更新表时,数据从当前查询到的一组片段中读取。没有冗长的锁。插入不会阻碍读取。

对表的读操作是自动并行的。

-## 列和表的TTL {#table_engine-mergetree-ttl}
+## 列和表的 TTL {#table_engine-mergetree-ttl}

-TTL可以设置值的生命周期,它既可以为整张表设置,也可以为每个列字段单独设置。如果`TTL`同时作用于表和字段,ClickHouse会使用先到期的那个。
+TTL 可以设置值的生命周期,它既可以为整张表设置,也可以为每个列字段单独设置。表级别的 TTL 还会指定数据在磁盘和卷上自动转移的逻辑。

-被设置TTL的表,必须拥有[日期](../../../engines/table-engines/mergetree-family/mergetree.md) 或 [日期时间](../../../engines/table-engines/mergetree-family/mergetree.md) 类型的字段。要定义数据的生命周期,需要在这个日期字段上使用操作符,例如:
+TTL 表达式的计算结果必须是 [日期](../../../engines/table-engines/mergetree-family/mergetree.md) 或 [日期时间](../../../engines/table-engines/mergetree-family/mergetree.md) 类型的值。
+
+示例:

``` sql
TTL time_column
@@ -327,15 +399,15 @@ TTL date_time + INTERVAL 1 MONTH
TTL date_time + INTERVAL 15 HOUR
```

-### 列字段 TTL {#mergetree-column-ttl}
+### 列 TTL {#mergetree-column-ttl}

-当列字段中的值过期时, ClickHouse会将它们替换成数据类型的默认值。如果分区内,某一列的所有值均已过期,则ClickHouse会从文件系统中删除这个分区目录下的列文件。
+当列中的值过期时,ClickHouse 会将它们替换成该列数据类型的默认值。如果数据片段中某一列的所有值均已过期,则 ClickHouse 会从文件系统中的数据片段里删除这一列的文件。

`TTL`子句不能被用于主键字段。

-示例说明:
+示例:

-创建一张包含 `TTL` 的表
+创建表时指定 `TTL`

``` sql
CREATE TABLE example_table
@@ -368,11 +440,21 @@ ALTER TABLE example_table

### 表 TTL {#mergetree-table-ttl}

-当表内的数据过期时, ClickHouse会删除所有对应的行。
+表可以设置一个用于移除过期行的表达式,以及多个用于在磁盘或卷上自动转移数据片段的表达式。当表中的行过期时,ClickHouse 会删除所有对应的行。对于数据片段的转移特性,数据片段中所有的行都必须满足转移条件。

-举例说明:
+``` sql
+TTL expr [DELETE|TO DISK 'aaa'|TO VOLUME 'bbb'], ...
+```

-创建一张包含 `TTL` 的表
+TTL 规则的类型紧跟在每个 TTL 表达式后面,它会影响满足表达式时(到达指定时间时)应当执行的操作:
+
+- `DELETE` - 删除过期的行(默认操作);
+- `TO DISK 'aaa'` - 将数据片段移动到磁盘 `aaa`;
+- `TO VOLUME 'bbb'` - 将数据片段移动到卷 `bbb`.
+
+示例:
+
+创建时指定 TTL

``` sql
CREATE TABLE example_table
@@ -383,7 +465,9 @@ CREATE TABLE example_table
ENGINE = MergeTree
PARTITION BY toYYYYMM(d)
ORDER BY d
-TTL d + INTERVAL 1 MONTH;
+TTL d + INTERVAL 1 MONTH [DELETE],
+    d + INTERVAL 1 WEEK TO VOLUME 'aaa',
+    d + INTERVAL 2 WEEK TO DISK 'bbb';
```

修改表的 `TTL`

@@ -395,14 +479,179 @@ ALTER TABLE example_table

**删除数据**

-当ClickHouse合并数据分区时, 会删除TTL过期的数据。
+ClickHouse 在数据片段合并时会删除掉过期的数据。

-当ClickHouse发现数据过期时, 它将会执行一个计划外的合并。要控制这类合并的频率, 你可以设置 `merge_with_ttl_timeout`。如果该值被设置的太低, 它将导致执行许多的计划外合并,这可能会消耗大量资源。
+当ClickHouse发现数据过期时, 它将会执行一个计划外的合并。要控制这类合并的频率, 你可以设置 `merge_with_ttl_timeout`。如果该值被设置的太低, 它将引发大量计划外的合并,这可能会消耗大量资源。

-如果在合并的时候执行`SELECT` 查询, 则可能会得到过期的数据。为了避免这种情况,可以在`SELECT`之前使用 [OPTIMIZE](../../../engines/table-engines/mergetree-family/mergetree.md#misc_operations-optimize) 查询。
-## 使用多个块设备进行数据存储 {#table_engine-mergetree-multiple-volumes}
+如果在合并的过程中执行 `SELECT` 查询, 则可能会得到过期的数据。为了避免这种情况,可以在 `SELECT` 之前使用 [OPTIMIZE](../../../engines/table-engines/mergetree-family/mergetree.md#misc_operations-optimize) 查询。
+
+## 使用具有多个块的设备进行数据存储 {#table_engine-mergetree-multiple-volumes}
+
+### 介绍 {#introduction}
+
+MergeTree 系列表引擎可以将数据存储在多块设备上。这对某些可以潜在被划分为“冷”“热”的表来说是很有用的。近期数据被定期地查询,但只需要很小的空间。相反,详尽的历史数据很少被用到。如果有多块磁盘可用,那么“热”的数据可以放置在快速的磁盘上(比如 NVMe 固态硬盘或内存),“冷”的数据可以放在相对较慢的磁盘上(比如机械硬盘)。
+
+数据片段是 `MergeTree` 引擎表的最小可移动单元。属于同一个数据片段的数据被存储在同一块磁盘上。数据片段会在后台自动地在磁盘间移动,也可以通过 [ALTER](../../../sql-reference/statements/alter.md#alter_move-partition) 查询来移动。
+
+### 术语 {#terms}
+
+- 磁盘 — 挂载到文件系统的块设备
+- 默认磁盘 — 在服务器设置中通过 [path](../../../operations/server-configuration-parameters/settings.md#server_configuration_parameters-path) 参数指定的数据存储
+- 卷 — 相同磁盘的有序集合 (类似于 [JBOD](https://en.wikipedia.org/wiki/Non-RAID_drive_architectures))
+- 存储策略 — 卷的集合及它们之间的数据移动规则

### 配置 {#table_engine-mergetree-multiple-volumes_configure}

-[来源文章](https://clickhouse.tech/docs/en/operations/table_engines/mergetree/)
+磁盘、卷和存储策略应当在主文件 `config.xml` 或 `config.d` 目录中的独立文件中的 `<storage_configuration>` 标签内定义。
+
+配置结构:
+
+``` xml
+<storage_configuration>
+    <disks>
+        <disk_name_1> <!-- disk name -->
+            <path>/mnt/fast_ssd/clickhouse/</path>
+        </disk_name_1>
+        <disk_name_2>
+            <path>/mnt/hdd1/clickhouse/</path>
+            <keep_free_space_bytes>10485760</keep_free_space_bytes>
+        </disk_name_2>
+        <disk_name_3>
+            <path>/mnt/hdd2/clickhouse/</path>
+            <keep_free_space_bytes>10485760</keep_free_space_bytes>
+        </disk_name_3>
+
+        ...
+    </disks>
+
+    ...
+</storage_configuration>
+```
+
+标签:
+
+- `<disk_name_N>` — 磁盘名,名称必须与其他磁盘不同.
+- `path` — 服务器将用来存储数据 (`data` 和 `shadow` 目录) 的路径, 应当以 ‘/’ 结尾.
+- `keep_free_space_bytes` — 需要保留的剩余磁盘空间.
+
+磁盘定义的顺序无关紧要。
+
+存储策略配置:
+
+``` xml
+<storage_configuration>
+    ...
+    <policies>
+        <policy_name_1>
+            <volumes>
+                <volume_name_1>
+                    <disk>disk_name_from_disks_configuration</disk>
+                    <max_data_part_size_bytes>1073741824</max_data_part_size_bytes>
+                </volume_name_1>
+                <volume_name_2>
+                    <!-- configuration -->
+                </volume_name_2>
+            </volumes>
+            <move_factor>0.2</move_factor>
+        </policy_name_1>
+        <policy_name_2>
+            <!-- configuration -->
+        </policy_name_2>
+
+        ...
+    </policies>
+    ...
+</storage_configuration>
+```
+
+标签:
+
+- `policy_name_N` — 策略名称,不能重复。
+- `volume_name_N` — 卷名称,不能重复。
+- `disk` — 卷中的磁盘。
+- `max_data_part_size_bytes` — 任意卷上的磁盘可以存储的数据片段的最大大小。
+- `move_factor` — 当可用空间少于这个因子时,数据将自动地向下一个卷(如果有的话)移动 (默认值为 0.1)。
+
+配置示例:
+
+``` xml
+<storage_configuration>
+    ...
+    <policies>
+        <hdd_in_order> <!-- policy name -->
+            <volumes>
+                <single> <!-- volume name -->
+                    <disk>disk1</disk>
+                    <disk>disk2</disk>
+                </single>
+            </volumes>
+        </hdd_in_order>
+
+        <moving_from_ssd_to_hdd>
+            <volumes>
+                <hot>
+                    <disk>fast_ssd</disk>
+                    <max_data_part_size_bytes>1073741824</max_data_part_size_bytes>
+                </hot>
+                <cold>
+                    <disk>disk1</disk>
+                </cold>
+            </volumes>
+            <move_factor>0.2</move_factor>
+        </moving_from_ssd_to_hdd>
+
+        ...
+    </policies>
+    ...
+</storage_configuration>
+```
+
+在给出的例子中, `hdd_in_order` 策略实现了 [循环制](https://zh.wikipedia.org/wiki/循环制) 方法。因此这个策略只定义了一个卷(`single`),数据片段会以循环的顺序全部存储到它的磁盘上。当有多个类似的磁盘挂载到系统上,但没有配置 RAID 时,这种策略非常有用。请注意,每个独立的磁盘驱动器都并不可靠,你可能需要用 3 或更大的复制因子来补偿它。
+
+如果在系统中有不同类型的磁盘可用,可以使用 `moving_from_ssd_to_hdd`。`hot` 卷由 SSD 磁盘(`fast_ssd`)组成,这个卷上可以存储的数据片段的最大大小为 1GB。所有大于 1GB 的数据片段都会被直接存储到 `cold` 卷上,`cold` 卷包含一个名为 `disk1` 的 HDD 磁盘。
+同样,一旦 `fast_ssd` 被填充超过 80%,数据会通过后台进程向 `disk1` 进行转移。
+
+存储策略中卷的枚举顺序是很重要的。因为当一个卷被充满时,数据会向下一个卷转移。磁盘的枚举顺序同样重要,因为数据是依次存储在磁盘上的。
+
+在创建表时,可以将一个配置好的策略应用到表:
+
+``` sql
+CREATE TABLE table_with_non_default_policy (
+    EventDate Date,
+    OrderID UInt64,
+    BannerID UInt64,
+    SearchPhrase String
+) ENGINE = MergeTree
+ORDER BY (OrderID, BannerID)
+PARTITION BY toYYYYMM(EventDate)
+SETTINGS storage_policy = 'moving_from_ssd_to_hdd'
+```
+
+`default` 存储策略意味着只使用一个卷,这个卷只包含一个在 `<path>` 中定义的磁盘。表创建后,它的存储策略就不能改变了。
+
+可以通过 [background\_move\_pool\_size](../../../operations/settings/settings.md#background_move_pool_size) 设置调整执行后台任务的线程数。
+
+### 详细说明 {#details}
+
+对于 `MergeTree` 表,数据通过以下不同的方式写入到磁盘当中:
+
+- 作为插入(`INSERT`查询)的结果
+- 在后台合并和[数据变异](../../../sql-reference/statements/alter.md#alter-mutations)期间
+- 当从另一个副本下载时
+- 作为 [ALTER TABLE … FREEZE PARTITION](../../../sql-reference/statements/alter.md#alter_freeze-partition) 冻结分区的结果
+
+除了数据变异和冻结分区以外的情况下,数据按照以下逻辑存储到卷或磁盘上:
+
+1. 选择首个(按定义顺序)拥有足够的磁盘空间存储数据片段(`unreserved_space > current_part_size`)并且允许存储给定大小的数据片段(`max_data_part_size_bytes > current_part_size`)的卷
+2. 在这个卷内,选择紧挨着先前存储数据的那块磁盘之后、拥有比数据片段大的剩余空间的磁盘(`unreserved_space - keep_free_space_bytes > current_part_size`)
+
+更进一步,数据变异和分区冻结使用的是 [硬链接](https://en.wikipedia.org/wiki/Hard_link)。不同磁盘之间的硬链接是不支持的,所以在这种情况下数据片段都会被存储到初始化的那一块磁盘上。
+
+在后台,数据片段基于剩余空间(`move_factor`参数)根据卷在配置文件中定义的顺序进行转移。数据永远不会从最后一个移出也不会从第一个移入。可以通过系统表 [system.part\_log](../../../operations/system-tables/part_log.md#system_tables-part-log) (字段 `type = MOVE_PART`) 和 [system.parts](../../../operations/system-tables/parts.md#system_tables-parts) (字段 `path` 和 `disk`) 来监控后台的移动情况。同时,具体细节可以通过服务器日志查看。
+
+用户可以通过 [ALTER TABLE … MOVE PART\|PARTITION … TO VOLUME\|DISK …](../../../sql-reference/statements/alter.md#alter_move-partition) 强制移动一个数据片段或分区到另外一个卷,所有后台移动的限制都会被考虑在内。这个查询会自行启动,无需等待后台操作完成。如果没有足够的可用空间或任何必要条件没有被满足,用户会收到报错信息。
+
+数据移动不会妨碍到数据复制。也就是说,同一张表的不同副本可以指定不同的存储策略。
+
+在后台合并和数据变异之后,旧的数据片段会在一定时间后被移除 (`old_parts_lifetime`)。在这期间,它们不能被移动到其他的卷或磁盘。也就是说,直到数据片段被完全移除,它们仍然会被磁盘占用空间计算在内。
+
[原始文章](https://clickhouse.tech/docs/en/operations/table_engines/mergetree/)
diff --git a/docs/zh/engines/table-engines/mergetree-family/replacingmergetree.md b/docs/zh/engines/table-engines/mergetree-family/replacingmergetree.md
index 03b47172400..626597eeaf0 100644
--- a/docs/zh/engines/table-engines/mergetree-family/replacingmergetree.md
+++ b/docs/zh/engines/table-engines/mergetree-family/replacingmergetree.md
@@ -25,7 +25,7 @@ CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]

**参数**

-- `ver` — 版本列。类型为 `UInt*`, `Date`, `DateTime` 或 `DateTime64`。可选参数。
+- `ver` — 版本列。类型为 `UInt*`, `Date` 或 `DateTime`。可选参数。

    合并的时候,`ReplacingMergeTree` 从所有具有相同主键的行中选择一行留下:
    - 如果 `ver` 列未指定,选择最后一条。
diff --git a/docs/zh/guides/apply-catboost-model.md b/docs/zh/guides/apply-catboost-model.md
index be21c372307..3657a947ad2 100644
--- a/docs/zh/guides/apply-catboost-model.md
+++ b/docs/zh/guides/apply-catboost-model.md
@@ -15,7 +15,7 @@ toc_title: "\u5E94\u7528CatBoost\u6A21\u578B"

1. [创建表](#create-table).
2. [将数据插入到表中](#insert-data-to-table).
-3. [碌莽禄into拢Integrate010-68520682\](#integrate-catboost-into-clickhouse) (可选步骤)。
+3.
[将CatBoost集成到ClickHouse中](#integrate-catboost-into-clickhouse) (可选步骤)。 4. [从SQL运行模型推理](#run-model-inference). 有关训练CatBoost模型的详细信息,请参阅 [培训和应用模型](https://catboost.ai/docs/features/training.html#training). @@ -119,12 +119,12 @@ FROM amazon_train +-------+ ``` -## 3. 碌莽禄into拢Integrate010-68520682\ {#integrate-catboost-into-clickhouse} +## 3. 将CatBoost集成到ClickHouse中 {#integrate-catboost-into-clickhouse} !!! note "注" **可选步骤。** Docker映像包含运行CatBoost和ClickHouse所需的所有内容。 -碌莽禄to拢integrate010-68520682\: +CatBoost集成到ClickHouse步骤: **1.** 构建评估库。 diff --git a/docs/zh/interfaces/http.md b/docs/zh/interfaces/http.md index 0fecb1873db..9feb8c5d69d 100644 --- a/docs/zh/interfaces/http.md +++ b/docs/zh/interfaces/http.md @@ -23,7 +23,7 @@ Ok. $ curl 'http://localhost:8123/?query=SELECT%201' 1 -$ wget -O- -q 'http://localhost:8123/?query=SELECT 1' +$ wget -nv -O- 'http://localhost:8123/?query=SELECT 1' 1 $ GET 'http://localhost:8123/?query=SELECT 1' diff --git a/programs/benchmark/Benchmark.cpp b/programs/benchmark/Benchmark.cpp index 3ae3980c273..c8fdde3d3a6 100644 --- a/programs/benchmark/Benchmark.cpp +++ b/programs/benchmark/Benchmark.cpp @@ -104,6 +104,8 @@ public: query_processing_stage = QueryProcessingStage::FetchColumns; else if (stage == "with_mergeable_state") query_processing_stage = QueryProcessingStage::WithMergeableState; + else if (stage == "with_mergeable_state_after_aggregation") + query_processing_stage = QueryProcessingStage::WithMergeableStateAfterAggregation; else throw Exception("Unknown query processing stage: " + stage, ErrorCodes::BAD_ARGUMENTS); @@ -564,8 +566,8 @@ int mainEntryClickHouseBenchmark(int argc, char ** argv) desc.add_options() ("help", "produce help message") ("concurrency,c", value()->default_value(1), "number of parallel queries") - ("delay,d", value()->default_value(1), "delay between intermediate reports in seconds (set 0 to disable reports)") - ("stage", value()->default_value("complete"), "request query processing up to specified stage: complete,fetch_columns,with_mergeable_state") + ("delay,d", value()->default_value(1), "delay between intermediate reports in seconds (set 0 to disable reports)") + ("stage", value()->default_value("complete"), "request query processing up to specified stage: complete,fetch_columns,with_mergeable_state,with_mergeable_state_after_aggregation") ("iterations,i", value()->default_value(0), "amount of queries to be executed") ("timelimit,t", value()->default_value(0.), "stop launch of queries after specified time limit") ("randomize,r", value()->default_value(false), "randomize order of execution") diff --git a/programs/client/Client.cpp b/programs/client/Client.cpp index db5a677e0e1..c9701950dc5 100644 --- a/programs/client/Client.cpp +++ b/programs/client/Client.cpp @@ -847,32 +847,26 @@ private: } // Parse and execute what we've read. - fprintf(stderr, "will now parse '%s'\n", text.c_str()); - const auto * new_end = processWithFuzzing(text); if (new_end > &text[0]) { const auto rest_size = text.size() - (new_end - &text[0]); - fprintf(stderr, "total %zd, rest %zd\n", text.size(), rest_size); - memcpy(&text[0], new_end, rest_size); text.resize(rest_size); } else { - fprintf(stderr, "total %zd, can't parse\n", text.size()); + // We didn't read enough text to parse a query. Will read more. } - if (!connection->isConnected()) - { - // Uh-oh... - std::cerr << "Lost connection to the server." << std::endl; - last_exception_received_from_server - = std::make_unique(210, "~"); - return; - } + // Ensure that we're still connected to the server. 
If the server died, + // the reconnect is going to fail with an exception, and the fuzzer + // will exit. The ping() would be the best match here, but it's + // private, probably for a good reason that the protocol doesn't allow + // pings at any possible moment. + connection->forceConnected(connection_parameters.timeouts); if (text.size() > 4 * 1024) { @@ -880,9 +874,6 @@ private: // and we still cannot parse a single query in it. Abort. std::cerr << "Read too much text and still can't parse a query." " Aborting." << std::endl; - last_exception_received_from_server - = std::make_unique(1, "~"); - // return; exit(1); } } diff --git a/programs/obfuscator/Obfuscator.cpp b/programs/obfuscator/Obfuscator.cpp index acdab861ea3..756aab0a574 100644 --- a/programs/obfuscator/Obfuscator.cpp +++ b/programs/obfuscator/Obfuscator.cpp @@ -13,6 +13,7 @@ #include #include #include +#include #include #include #include @@ -363,6 +364,17 @@ static void transformFixedString(const UInt8 * src, UInt8 * dst, size_t size, UI } } +static void transformUUID(const UInt128 & src, UInt128 & dst, UInt64 seed) +{ + SipHash hash; + hash.update(seed); + hash.update(reinterpret_cast(&src), sizeof(UInt128)); + + /// Saving version and variant from an old UUID + hash.get128(reinterpret_cast(&dst)); + dst.high = (dst.high & 0x1fffffffffffffffull) | (src.high & 0xe000000000000000ull); + dst.low = (dst.low & 0xffffffffffff0fffull) | (src.low & 0x000000000000f000ull); +} class FixedStringModel : public IModel { @@ -400,6 +412,38 @@ public: } }; +class UUIDModel : public IModel +{ +private: + UInt64 seed; + +public: + explicit UUIDModel(UInt64 seed_) : seed(seed_) {} + + void train(const IColumn &) override {} + void finalize() override {} + + ColumnPtr generate(const IColumn & column) override + { + const ColumnUInt128 & src_column = assert_cast(column); + const auto & src_data = src_column.getData(); + + auto res_column = ColumnUInt128::create(); + auto & res_data = res_column->getData(); + + res_data.resize(src_data.size()); + for (size_t i = 0; i < src_column.size(); ++i) + transformUUID(src_data[i], res_data[i], seed); + + return res_column; + } + + void updateSeed() override + { + seed = hash(seed); + } +}; + /// Leave date part as is and apply pseudorandom permutation to time difference with previous value within the same log2 class. 
class DateTimeModel : public IModel @@ -935,6 +979,9 @@ public: if (typeid_cast(&data_type)) return std::make_unique(seed); + if (typeid_cast(&data_type)) + return std::make_unique(seed); + if (const auto * type = typeid_cast(&data_type)) return std::make_unique(get(*type->getNestedType(), seed, markov_model_params)); diff --git a/programs/odbc-bridge/SchemaAllowedHandler.cpp b/programs/odbc-bridge/SchemaAllowedHandler.cpp index 5aaba57399e..fa08a27da59 100644 --- a/programs/odbc-bridge/SchemaAllowedHandler.cpp +++ b/programs/odbc-bridge/SchemaAllowedHandler.cpp @@ -21,15 +21,14 @@ namespace { bool isSchemaAllowed(SQLHDBC hdbc) { - std::string identifier; - - SQLSMALLINT t; - SQLRETURN r = POCO_SQL_ODBC_CLASS::SQLGetInfo(hdbc, SQL_SCHEMA_USAGE, nullptr, 0, &t); + SQLUINTEGER value; + SQLSMALLINT value_length = sizeof(value); + SQLRETURN r = POCO_SQL_ODBC_CLASS::SQLGetInfo(hdbc, SQL_SCHEMA_USAGE, &value, sizeof(value), &value_length); if (POCO_SQL_ODBC_CLASS::Utility::isError(r)) throw POCO_SQL_ODBC_CLASS::ConnectionException(hdbc); - return t != 0; + return value != 0; } } diff --git a/programs/server/Server.cpp b/programs/server/Server.cpp index 3a975325851..f24ba444203 100644 --- a/programs/server/Server.cpp +++ b/programs/server/Server.cpp @@ -716,6 +716,7 @@ int Server::main(const std::vector & /*args*/) { /// Disable DNS caching at all DNSResolver::instance().setDisableCacheFlag(); + LOG_DEBUG(log, "DNS caching disabled"); } else { diff --git a/programs/server/ya.make b/programs/server/ya.make index 2e13267f715..b4deaafedc5 100644 --- a/programs/server/ya.make +++ b/programs/server/ya.make @@ -8,6 +8,8 @@ PEERDIR( contrib/libs/poco/NetSSL_OpenSSL ) +CFLAGS(-g0) + SRCS( clickhouse-server.cpp diff --git a/programs/ya.make b/programs/ya.make index 1b80b264959..e77814ddf69 100644 --- a/programs/ya.make +++ b/programs/ya.make @@ -12,6 +12,8 @@ PEERDIR( clickhouse/src ) +CFLAGS(-g0) + SRCS( main.cpp diff --git a/src/Access/AccessControlManager.cpp b/src/Access/AccessControlManager.cpp index 6158be1b603..1fa26c85354 100644 --- a/src/Access/AccessControlManager.cpp +++ b/src/Access/AccessControlManager.cpp @@ -281,41 +281,33 @@ void AccessControlManager::addStoragesFromMainConfig( String config_dir = std::filesystem::path{config_path}.remove_filename().string(); String dbms_dir = config.getString("path", DBMS_DEFAULT_PATH); String include_from_path = config.getString("include_from", "/etc/metrika.xml"); + bool has_user_directories = config.has("user_directories"); - if (config.has("user_directories")) + /// If path to users' config isn't absolute, try guess its root (current) dir. + /// At first, try to find it in dir of main config, after will use current dir. + String users_config_path = config.getString("users_config", ""); + if (users_config_path.empty()) { - if (config.has("users_config")) - LOG_WARNING(getLogger(), " is specified, the path from won't be used: " + config.getString("users_config")); - if (config.has("access_control_path")) - LOG_WARNING(getLogger(), " is specified, the path from won't be used: " + config.getString("access_control_path")); - - addStoragesFromUserDirectoriesConfig( - config, - "user_directories", - config_dir, - dbms_dir, - include_from_path, - get_zookeeper_function); - } - else - { - /// If path to users' config isn't absolute, try guess its root (current) dir. - /// At first, try to find it in dir of main config, after will use current dir. 
- String users_config_path = config.getString("users_config", ""); - if (users_config_path.empty()) + if (!has_user_directories) users_config_path = config_path; - else if (std::filesystem::path{users_config_path}.is_relative() && std::filesystem::exists(config_dir + users_config_path)) - users_config_path = config_dir + users_config_path; + } + else if (std::filesystem::path{users_config_path}.is_relative() && std::filesystem::exists(config_dir + users_config_path)) + users_config_path = config_dir + users_config_path; + if (!users_config_path.empty()) + { if (users_config_path != config_path) checkForUsersNotInMainConfig(config, config_path, users_config_path, getLogger()); addUsersConfigStorage(users_config_path, include_from_path, dbms_dir, get_zookeeper_function); - - String disk_storage_dir = config.getString("access_control_path", ""); - if (!disk_storage_dir.empty()) - addDiskStorage(disk_storage_dir); } + + String disk_storage_dir = config.getString("access_control_path", ""); + if (!disk_storage_dir.empty()) + addDiskStorage(disk_storage_dir); + + if (has_user_directories) + addStoragesFromUserDirectoriesConfig(config, "user_directories", config_dir, dbms_dir, include_from_path, get_zookeeper_function); } diff --git a/src/Access/AccessControlManager.h b/src/Access/AccessControlManager.h index ad9fb48d263..d7cf59cfb28 100644 --- a/src/Access/AccessControlManager.h +++ b/src/Access/AccessControlManager.h @@ -47,7 +47,7 @@ class AccessControlManager : public MultipleAccessStorage { public: AccessControlManager(); - ~AccessControlManager(); + ~AccessControlManager() override; /// Parses access entities from a configuration loaded from users.xml. /// This function add UsersConfigAccessStorage if it wasn't added before. diff --git a/src/Access/AccessFlags.h b/src/Access/AccessFlags.h index 9b801fd88a3..3cb92b6b855 100644 --- a/src/Access/AccessFlags.h +++ b/src/Access/AccessFlags.h @@ -96,6 +96,22 @@ public: /// Returns all the flags related to a dictionary. static AccessFlags allDictionaryFlags(); + /// Returns all the flags which could be granted on the global level. + /// The same as allFlags(). + static AccessFlags allFlagsGrantableOnGlobalLevel(); + + /// Returns all the flags which could be granted on the database level. + /// Returns allDatabaseFlags() | allTableFlags() | allDictionaryFlags() | allColumnFlags(). + static AccessFlags allFlagsGrantableOnDatabaseLevel(); + + /// Returns all the flags which could be granted on the table level. + /// Returns allTableFlags() | allDictionaryFlags() | allColumnFlags(). + static AccessFlags allFlagsGrantableOnTableLevel(); + + /// Returns all the flags which could be granted on the global level. + /// The same as allColumnFlags(). 
+ static AccessFlags allFlagsGrantableOnColumnLevel(); + private: static constexpr size_t NUM_FLAGS = 128; using Flags = std::bitset; @@ -193,6 +209,10 @@ public: const Flags & getTableFlags() const { return all_flags_for_target[TABLE]; } const Flags & getColumnFlags() const { return all_flags_for_target[COLUMN]; } const Flags & getDictionaryFlags() const { return all_flags_for_target[DICTIONARY]; } + const Flags & getAllFlagsGrantableOnGlobalLevel() const { return getAllFlags(); } + const Flags & getAllFlagsGrantableOnDatabaseLevel() const { return all_flags_grantable_on_database_level; } + const Flags & getAllFlagsGrantableOnTableLevel() const { return all_flags_grantable_on_table_level; } + const Flags & getAllFlagsGrantableOnColumnLevel() const { return getColumnFlags(); } private: enum NodeType @@ -381,6 +401,9 @@ private: } for (const auto & child : start_node->children) collectAllFlags(child.get()); + + all_flags_grantable_on_table_level = all_flags_for_target[TABLE] | all_flags_for_target[DICTIONARY] | all_flags_for_target[COLUMN]; + all_flags_grantable_on_database_level = all_flags_for_target[DATABASE] | all_flags_grantable_on_table_level; } Impl() @@ -431,6 +454,8 @@ private: std::vector access_type_to_flags_mapping; Flags all_flags; Flags all_flags_for_target[static_cast(DICTIONARY) + 1]; + Flags all_flags_grantable_on_database_level; + Flags all_flags_grantable_on_table_level; }; @@ -447,6 +472,10 @@ inline AccessFlags AccessFlags::allDatabaseFlags() { return Impl<>::instance().g inline AccessFlags AccessFlags::allTableFlags() { return Impl<>::instance().getTableFlags(); } inline AccessFlags AccessFlags::allColumnFlags() { return Impl<>::instance().getColumnFlags(); } inline AccessFlags AccessFlags::allDictionaryFlags() { return Impl<>::instance().getDictionaryFlags(); } +inline AccessFlags AccessFlags::allFlagsGrantableOnGlobalLevel() { return Impl<>::instance().getAllFlagsGrantableOnGlobalLevel(); } +inline AccessFlags AccessFlags::allFlagsGrantableOnDatabaseLevel() { return Impl<>::instance().getAllFlagsGrantableOnDatabaseLevel(); } +inline AccessFlags AccessFlags::allFlagsGrantableOnTableLevel() { return Impl<>::instance().getAllFlagsGrantableOnTableLevel(); } +inline AccessFlags AccessFlags::allFlagsGrantableOnColumnLevel() { return Impl<>::instance().getAllFlagsGrantableOnColumnLevel(); } inline AccessFlags operator |(AccessType left, AccessType right) { return AccessFlags(left) | right; } inline AccessFlags operator &(AccessType left, AccessType right) { return AccessFlags(left) & right; } diff --git a/src/Access/AccessRights.cpp b/src/Access/AccessRights.cpp index 65c78f39e86..8ce71dd8da8 100644 --- a/src/Access/AccessRights.cpp +++ b/src/Access/AccessRights.cpp @@ -1,5 +1,4 @@ #include -#include #include #include #include @@ -8,12 +7,6 @@ namespace DB { -namespace ErrorCodes -{ - extern const int INVALID_GRANT; -} - - namespace { using Kind = AccessRightsElementWithOptions::Kind; @@ -214,30 +207,14 @@ namespace COLUMN_LEVEL, }; - AccessFlags getAcceptableFlags(Level level) + AccessFlags getAllGrantableFlags(Level level) { switch (level) { - case GLOBAL_LEVEL: - { - static const AccessFlags res = AccessFlags::allFlags(); - return res; - } - case DATABASE_LEVEL: - { - static const AccessFlags res = AccessFlags::allDatabaseFlags() | AccessFlags::allTableFlags() | AccessFlags::allDictionaryFlags() | AccessFlags::allColumnFlags(); - return res; - } - case TABLE_LEVEL: - { - static const AccessFlags res = AccessFlags::allTableFlags() | AccessFlags::allDictionaryFlags() | 
AccessFlags::allColumnFlags(); - return res; - } - case COLUMN_LEVEL: - { - static const AccessFlags res = AccessFlags::allColumnFlags(); - return res; - } + case GLOBAL_LEVEL: return AccessFlags::allFlagsGrantableOnGlobalLevel(); + case DATABASE_LEVEL: return AccessFlags::allFlagsGrantableOnDatabaseLevel(); + case TABLE_LEVEL: return AccessFlags::allFlagsGrantableOnTableLevel(); + case COLUMN_LEVEL: return AccessFlags::allFlagsGrantableOnColumnLevel(); } __builtin_unreachable(); } @@ -276,21 +253,7 @@ public: void grant(const AccessFlags & flags_) { - if (!flags_) - return; - - AccessFlags flags_to_add = flags_ & getAcceptableFlags(); - - if (!flags_to_add) - { - if (level == DATABASE_LEVEL) - throw Exception(flags_.toString() + " cannot be granted on the database level", ErrorCodes::INVALID_GRANT); - else if (level == TABLE_LEVEL) - throw Exception(flags_.toString() + " cannot be granted on the table level", ErrorCodes::INVALID_GRANT); - else if (level == COLUMN_LEVEL) - throw Exception(flags_.toString() + " cannot be granted on the column level", ErrorCodes::INVALID_GRANT); - } - + AccessFlags flags_to_add = flags_ & getAllGrantableFlags(); addGrantsRec(flags_to_add); optimizeTree(); } @@ -456,8 +419,8 @@ public: } private: - AccessFlags getAcceptableFlags() const { return ::DB::getAcceptableFlags(level); } - AccessFlags getChildAcceptableFlags() const { return ::DB::getAcceptableFlags(static_cast(level + 1)); } + AccessFlags getAllGrantableFlags() const { return ::DB::getAllGrantableFlags(level); } + AccessFlags getChildAllGrantableFlags() const { return ::DB::getAllGrantableFlags(static_cast(level + 1)); } Node * tryGetChild(const std::string_view & name) const { @@ -480,7 +443,7 @@ private: Node & new_child = (*children)[*new_child_name]; new_child.node_name = std::move(new_child_name); new_child.level = static_cast(level + 1); - new_child.flags = flags & new_child.getAcceptableFlags(); + new_child.flags = flags & new_child.getAllGrantableFlags(); return new_child; } @@ -496,12 +459,12 @@ private: bool canEraseChild(const Node & child) const { - return ((flags & child.getAcceptableFlags()) == child.flags) && !child.children; + return ((flags & child.getAllGrantableFlags()) == child.flags) && !child.children; } void addGrantsRec(const AccessFlags & flags_) { - if (auto flags_to_add = flags_ & getAcceptableFlags()) + if (auto flags_to_add = flags_ & getAllGrantableFlags()) { flags |= flags_to_add; if (children) @@ -547,7 +510,7 @@ private: const AccessFlags & parent_flags) { auto flags = node.flags; - auto parent_fl = parent_flags & node.getAcceptableFlags(); + auto parent_fl = parent_flags & node.getAllGrantableFlags(); auto revokes = parent_fl - flags; auto grants = flags - parent_fl; @@ -576,9 +539,9 @@ private: const Node * node_go, const AccessFlags & parent_flags_go) { - auto acceptable_flags = ::DB::getAcceptableFlags(static_cast(full_name.size())); - auto parent_fl = parent_flags & acceptable_flags; - auto parent_fl_go = parent_flags_go & acceptable_flags; + auto grantable_flags = ::DB::getAllGrantableFlags(static_cast(full_name.size())); + auto parent_fl = parent_flags & grantable_flags; + auto parent_fl_go = parent_flags_go & grantable_flags; auto flags = node ? node->flags : parent_fl; auto flags_go = node_go ? 
node_go->flags : parent_fl_go; auto revokes = parent_fl - flags; @@ -672,8 +635,8 @@ private: } max_flags_with_children |= max_among_children; - AccessFlags add_acceptable_flags = getAcceptableFlags() - getChildAcceptableFlags(); - min_flags_with_children &= min_among_children | add_acceptable_flags; + AccessFlags add_flags = getAllGrantableFlags() - getChildAllGrantableFlags(); + min_flags_with_children &= min_among_children | add_flags; } void makeUnionRec(const Node & rhs) @@ -689,7 +652,7 @@ private: for (auto & [lhs_childname, lhs_child] : *children) { if (!rhs.tryGetChild(lhs_childname)) - lhs_child.flags |= rhs.flags & lhs_child.getAcceptableFlags(); + lhs_child.flags |= rhs.flags & lhs_child.getAllGrantableFlags(); } } } @@ -738,7 +701,7 @@ private: if (new_flags != flags) { - new_flags &= getAcceptableFlags(); + new_flags &= getAllGrantableFlags(); flags_added |= static_cast(new_flags - flags); flags_removed |= static_cast(flags - new_flags); flags = new_flags; diff --git a/src/Access/AccessRightsElement.h b/src/Access/AccessRightsElement.h index f9f7c433308..36cb64e6eba 100644 --- a/src/Access/AccessRightsElement.h +++ b/src/Access/AccessRightsElement.h @@ -71,6 +71,8 @@ struct AccessRightsElement { } + bool empty() const { return !access_flags || (!any_column && columns.empty()); } + auto toTuple() const { return std::tie(access_flags, any_database, database, any_table, table, any_column, columns); } friend bool operator==(const AccessRightsElement & left, const AccessRightsElement & right) { return left.toTuple() == right.toTuple(); } friend bool operator!=(const AccessRightsElement & left, const AccessRightsElement & right) { return !(left == right); } @@ -86,6 +88,9 @@ struct AccessRightsElement /// If the database is empty, replaces it with `new_database`. Otherwise does nothing. void replaceEmptyDatabase(const String & new_database); + /// Resets flags which cannot be granted. + void removeNonGrantableFlags(); + /// Returns a human-readable representation like "SELECT, UPDATE(x, y) ON db.table". String toString() const; }; @@ -111,6 +116,9 @@ struct AccessRightsElementWithOptions : public AccessRightsElement friend bool operator==(const AccessRightsElementWithOptions & left, const AccessRightsElementWithOptions & right) { return left.toTuple() == right.toTuple(); } friend bool operator!=(const AccessRightsElementWithOptions & left, const AccessRightsElementWithOptions & right) { return !(left == right); } + /// Resets flags which cannot be granted. + void removeNonGrantableFlags(); + /// Returns a human-readable representation like "GRANT SELECT, UPDATE(x, y) ON db.table". String toString() const; }; @@ -120,9 +128,14 @@ struct AccessRightsElementWithOptions : public AccessRightsElement class AccessRightsElements : public std::vector { public: + bool empty() const { return std::all_of(begin(), end(), [](const AccessRightsElement & e) { return e.empty(); }); } + /// Replaces the empty database with `new_database`. void replaceEmptyDatabase(const String & new_database); + /// Resets flags which cannot be granted. + void removeNonGrantableFlags(); + /// Returns a human-readable representation like "GRANT SELECT, UPDATE(x, y) ON db.table". String toString() const; }; @@ -134,6 +147,9 @@ public: /// Replaces the empty database with `new_database`. void replaceEmptyDatabase(const String & new_database); + /// Resets flags which cannot be granted. + void removeNonGrantableFlags(); + /// Returns a human-readable representation like "GRANT SELECT, UPDATE(x, y) ON db.table". 
String toString() const; }; @@ -157,4 +173,34 @@ inline void AccessRightsElementsWithOptions::replaceEmptyDatabase(const String & element.replaceEmptyDatabase(new_database); } +inline void AccessRightsElement::removeNonGrantableFlags() +{ + if (!any_column) + access_flags &= AccessFlags::allFlagsGrantableOnColumnLevel(); + else if (!any_table) + access_flags &= AccessFlags::allFlagsGrantableOnTableLevel(); + else if (!any_database) + access_flags &= AccessFlags::allFlagsGrantableOnDatabaseLevel(); + else + access_flags &= AccessFlags::allFlagsGrantableOnGlobalLevel(); +} + +inline void AccessRightsElementWithOptions::removeNonGrantableFlags() +{ + if (kind == Kind::GRANT) + AccessRightsElement::removeNonGrantableFlags(); +} + +inline void AccessRightsElements::removeNonGrantableFlags() +{ + for (auto & element : *this) + element.removeNonGrantableFlags(); +} + +inline void AccessRightsElementsWithOptions::removeNonGrantableFlags() +{ + for (auto & element : *this) + element.removeNonGrantableFlags(); +} + } diff --git a/src/Access/ya.make b/src/Access/ya.make index aaa052355f6..e5fa73f107c 100644 --- a/src/Access/ya.make +++ b/src/Access/ya.make @@ -5,6 +5,8 @@ PEERDIR( clickhouse/src/Common ) +CFLAGS(-g0) + SRCS( AccessControlManager.cpp AccessRights.cpp diff --git a/src/Access/ya.make.in b/src/Access/ya.make.in index 4ae9f9ddb0a..e48d0d1bda7 100644 --- a/src/Access/ya.make.in +++ b/src/Access/ya.make.in @@ -4,6 +4,8 @@ PEERDIR( clickhouse/src/Common ) +CFLAGS(-g0) + SRCS( ) diff --git a/src/AggregateFunctions/AggregateFunctionQuantile.cpp b/src/AggregateFunctions/AggregateFunctionQuantile.cpp index 52b82fbf733..21451fe33be 100644 --- a/src/AggregateFunctions/AggregateFunctionQuantile.cpp +++ b/src/AggregateFunctions/AggregateFunctionQuantile.cpp @@ -106,8 +106,8 @@ AggregateFunctionPtr createAggregateFunctionQuantile(const std::string & name, c if constexpr (supportBigInt()) { if (which.idx == TypeIndex::Int128) return std::make_shared>(argument_types, params); - if (which.idx == TypeIndex::bInt256) return std::make_shared>(argument_types, params); - if (which.idx == TypeIndex::bUInt256) return std::make_shared>(argument_types, params); + if (which.idx == TypeIndex::Int256) return std::make_shared>(argument_types, params); + if (which.idx == TypeIndex::UInt256) return std::make_shared>(argument_types, params); } throw Exception("Illegal type " + argument_type->getName() + " of argument for aggregate function " + name, diff --git a/src/AggregateFunctions/AggregateFunctionRankCorrelation.cpp b/src/AggregateFunctions/AggregateFunctionRankCorrelation.cpp new file mode 100644 index 00000000000..20472279dba --- /dev/null +++ b/src/AggregateFunctions/AggregateFunctionRankCorrelation.cpp @@ -0,0 +1,51 @@ +#include +#include +#include +#include "registerAggregateFunctions.h" +#include + + +namespace ErrorCodes +{ +extern const int NOT_IMPLEMENTED; +} + +namespace DB +{ + +namespace +{ + +AggregateFunctionPtr createAggregateFunctionRankCorrelation(const std::string & name, const DataTypes & argument_types, const Array & parameters) +{ + assertBinary(name, argument_types); + assertNoParameters(name, parameters); + + AggregateFunctionPtr res; + + if (isDecimal(argument_types[0]) || isDecimal(argument_types[1])) + { + throw Exception("Aggregate function " + name + " only supports numerical types", ErrorCodes::NOT_IMPLEMENTED); + } + else + { + res.reset(createWithTwoNumericTypes(*argument_types[0], *argument_types[1], argument_types)); + } + + if (!res) + { + throw Exception("Aggregate function " + 
name + " only supports numerical types", ErrorCodes::NOT_IMPLEMENTED); + } + + return res; +} + +} + + +void registerAggregateFunctionRankCorrelation(AggregateFunctionFactory & factory) +{ + factory.registerFunction("rankCorr", createAggregateFunctionRankCorrelation, AggregateFunctionFactory::CaseInsensitive); +} + +} diff --git a/src/AggregateFunctions/AggregateFunctionRankCorrelation.h b/src/AggregateFunctions/AggregateFunctionRankCorrelation.h new file mode 100644 index 00000000000..379a8332f09 --- /dev/null +++ b/src/AggregateFunctions/AggregateFunctionRankCorrelation.h @@ -0,0 +1,234 @@ +#pragma once + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include + +#include + +#include + +namespace ErrorCodes +{ +extern const int BAD_ARGUMENTS; +} + +namespace DB +{ + +template