Mirror of https://github.com/ClickHouse/ClickHouse.git
Merge remote-tracking branch 'origin/master' into FixTestFileNameTypo
commit 11ac570ff3

.github/workflows/pull_request.yml (vendored) | 5
@@ -532,6 +532,11 @@ jobs:
run_command: |
cd "$REPO_COPY/tests/ci"
mkdir -p "${REPORTS_PATH}/integration"
mkdir -p "${REPORTS_PATH}/stateless"
cp -r ${REPORTS_PATH}/changed_images* ${REPORTS_PATH}/integration
cp -r ${REPORTS_PATH}/changed_images* ${REPORTS_PATH}/stateless
TEMP_PATH="${TEMP_PATH}/integration" \
REPORTS_PATH="${REPORTS_PATH}/integration" \
python3 integration_test_check.py "Integration $CHECK_NAME" \
.gitmodules (vendored) | 3
@@ -354,6 +354,3 @@
[submodule "contrib/aklomp-base64"]
path = contrib/aklomp-base64
url = https://github.com/aklomp/base64.git
[submodule "contrib/pocketfft"]
path = contrib/pocketfft
url = https://github.com/mreineck/pocketfft.git
CHANGELOG.md | 242
@@ -1,4 +1,5 @@
### Table of Contents
**[ClickHouse release v23.11, 2023-12-05](#2311)**<br/>
**[ClickHouse release v23.10, 2023-11-02](#2310)**<br/>
**[ClickHouse release v23.9, 2023-09-28](#239)**<br/>
**[ClickHouse release v23.8 LTS, 2023-08-31](#238)**<br/>
@@ -13,7 +14,222 @@
# 2023 Changelog
### ClickHouse release 23.10, 2023-11-02
### <a id="2311"></a> ClickHouse release 23.11, 2023-12-05
#### Backward Incompatible Change
* The default ClickHouse server configuration file has enabled `access_management` (user manipulation by SQL queries) and `named_collection_control` (manipulation of named collection by SQL queries) for the `default` user by default. This closes [#56482](https://github.com/ClickHouse/ClickHouse/issues/56482). [#56619](https://github.com/ClickHouse/ClickHouse/pull/56619) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Multiple improvements for `RESPECT NULLS`/`IGNORE NULLS` for window functions. If you use them as aggregate functions and store the states of aggregate functions with these modifiers, they might become incompatible. [#57189](https://github.com/ClickHouse/ClickHouse/pull/57189) ([Raúl Marín](https://github.com/Algunenano)).
* Remove optimization `optimize_move_functions_out_of_any`. [#57190](https://github.com/ClickHouse/ClickHouse/pull/57190) ([Raúl Marín](https://github.com/Algunenano)).
* Formatters `%l`/`%k`/`%c` in function `parseDateTime` are now able to parse hours/months without leading zeros, e.g. `select parseDateTime('2023-11-26 8:14', '%F %k:%i')` now works. Set `parsedatetime_parse_without_leading_zeros = 0` to restore the previous behavior, which required two digits. Function `formatDateTime` is now also able to print hours/months without leading zeros. This is controlled by the setting `formatdatetime_format_without_leading_zeros`, which is off by default to not break existing use cases. See the example after this list. [#55872](https://github.com/ClickHouse/ClickHouse/pull/55872) ([Azat Khuzhin](https://github.com/azat)).
* You can no longer use the aggregate function `avgWeighted` with arguments of type `Decimal`. Workaround: convert arguments to `Float64`. This closes [#43928](https://github.com/ClickHouse/ClickHouse/issues/43928). This closes [#31768](https://github.com/ClickHouse/ClickHouse/issues/31768). This closes [#56435](https://github.com/ClickHouse/ClickHouse/issues/56435). If you have used this function inside materialized views or projections with `Decimal` arguments, contact support@clickhouse.com. Fixed error in aggregate function `sumMap` and made it slower around 1.5..2 times. It does not matter because the function is garbage anyway. This closes [#54955](https://github.com/ClickHouse/ClickHouse/issues/54955). This closes [#53134](https://github.com/ClickHouse/ClickHouse/issues/53134). This closes [#55148](https://github.com/ClickHouse/ClickHouse/issues/55148). Fix a bug in function `groupArraySample` - it used the same random seed in case more than one aggregate state is generated in a query. [#56350](https://github.com/ClickHouse/ClickHouse/pull/56350) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
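
A minimal illustration of the new leading-zero handling described in the `parseDateTime` entry above (a sketch, not part of the upstream changelog; it assumes the 23.11 defaults):

```sql
-- %k now accepts a single-digit hour (this call previously threw an exception).
SELECT parseDateTime('2023-11-26 8:14', '%F %k:%i') AS parsed;

-- Restore the previous strict behavior that required two digits.
SET parsedatetime_parse_without_leading_zeros = 0;

-- Formatting without leading zeros is off by default; enable it per query.
SELECT formatDateTime(now(), '%k:%i')
SETTINGS formatdatetime_format_without_leading_zeros = 1;
```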
#### New Feature
* Added server setting `async_load_databases` for asynchronous loading of databases and tables. Speeds up the server start time. Applies to databases with `Ordinary`, `Atomic` and `Replicated` engines. Their tables load metadata asynchronously. Query to a table increases the priority of the load job and waits for it to be done. Added a new table `system.asynchronous_loader` for introspection. [#49351](https://github.com/ClickHouse/ClickHouse/pull/49351) ([Sergei Trifonov](https://github.com/serxa)).
* Add system table `blob_storage_log`. It allows auditing all the data written to S3 and other object storages. [#52918](https://github.com/ClickHouse/ClickHouse/pull/52918) ([vdimir](https://github.com/vdimir)).
* Use statistics to order prewhere conditions better. [#53240](https://github.com/ClickHouse/ClickHouse/pull/53240) ([Han Fei](https://github.com/hanfei1991)).
* Added support for compression in the Keeper's protocol. It can be enabled on the ClickHouse side by using this flag `use_compression` inside `zookeeper` section. Keep in mind that only ClickHouse Keeper supports compression, while Apache ZooKeeper does not. Resolves [#49507](https://github.com/ClickHouse/ClickHouse/issues/49507). [#54957](https://github.com/ClickHouse/ClickHouse/pull/54957) ([SmitaRKulkarni](https://github.com/SmitaRKulkarni)).
* Introduce the feature `storage_metadata_write_full_object_key`. If it is set as `true` then metadata files are written with the new format. With that format ClickHouse stores full remote object key in the metadata file which allows better flexibility and optimization. [#55566](https://github.com/ClickHouse/ClickHouse/pull/55566) ([Sema Checherinda](https://github.com/CheSema)).
* Add new settings and syntax to protect named collections' fields from being overridden. This is meant to prevent a malicious user from obtaining unauthorized access to secrets. [#55782](https://github.com/ClickHouse/ClickHouse/pull/55782) ([Salvatore Mesoraca](https://github.com/aiven-sal)).
* Add `hostname` column to all system log tables - it is useful if you make the system tables replicated, shared, or distributed. [#55894](https://github.com/ClickHouse/ClickHouse/pull/55894) ([Bharat Nallan](https://github.com/bharatnc)).
* Add `CHECK ALL TABLES` query. [#56022](https://github.com/ClickHouse/ClickHouse/pull/56022) ([vdimir](https://github.com/vdimir)).
* Added function `fromDaysSinceYearZero` which is similar to MySQL's `FROM_DAYS`. E.g. `SELECT fromDaysSinceYearZero(739136)` returns `2023-09-08`. [#56088](https://github.com/ClickHouse/ClickHouse/pull/56088) ([Joanna Hulboj](https://github.com/jh0x)).
* Implemented a function that detects the period of a time series using FFT. [#56171](https://github.com/ClickHouse/ClickHouse/pull/56171) ([Bhavna Jindal](https://github.com/bhavnajindal)).
* Add an external Python tool to view backups and to extract information from them without using ClickHouse. [#56268](https://github.com/ClickHouse/ClickHouse/pull/56268) ([Vitaly Baranov](https://github.com/vitlibar)).
* Implement a new setting called `preferred_projection_name`. If it is set to a non-empty string, the specified projection would be used if possible instead of choosing from all the candidates. [#56309](https://github.com/ClickHouse/ClickHouse/pull/56309) ([Yarik Briukhovetskyi](https://github.com/yariks5s)).
* Add 4-letter command for yielding/resigning leadership (https://github.com/ClickHouse/ClickHouse/issues/56352). [#56354](https://github.com/ClickHouse/ClickHouse/pull/56354) ([Pradeep Chhetri](https://github.com/chhetripradeep)). [#56620](https://github.com/ClickHouse/ClickHouse/pull/56620) ([Pradeep Chhetri](https://github.com/chhetripradeep)).
* Added a new SQL function, `arrayRandomSample(arr, k)`, which returns a sample of k elements from the input array. Similar functionality could previously be achieved only with less convenient syntax, e.g. `SELECT arrayReduce('groupArraySample(3)', range(10))`. See the example after this list. [#56416](https://github.com/ClickHouse/ClickHouse/pull/56416) ([Robert Schulze](https://github.com/rschu1ze)).
* Added support for `Float16` type data to use in `.npy` files. Closes [#56344](https://github.com/ClickHouse/ClickHouse/issues/56344). [#56424](https://github.com/ClickHouse/ClickHouse/pull/56424) ([Yarik Briukhovetskyi](https://github.com/yariks5s)).
* Added a system view `information_schema.statistics` for better compatibility with Tableau Online. [#56425](https://github.com/ClickHouse/ClickHouse/pull/56425) ([Serge Klochkov](https://github.com/slvrtrn)).
* Add `system.symbols` table useful for introspection of the binary. [#56548](https://github.com/ClickHouse/ClickHouse/pull/56548) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Configurable dashboards. Queries for charts are now loaded using a query, which by default uses a new `system.dashboards` table. [#56771](https://github.com/ClickHouse/ClickHouse/pull/56771) ([Sergei Trifonov](https://github.com/serxa)).
* Introduce `fileCluster` table function - it is useful if you mount a shared filesystem (NFS and similar) into the `user_files` directory. [#56868](https://github.com/ClickHouse/ClickHouse/pull/56868) ([Andrey Zvonov](https://github.com/zvonand)).
* Add `_size` virtual column with file size in bytes to `s3/file/hdfs/url/azureBlobStorage` engines. See the sketch after this list. [#57126](https://github.com/ClickHouse/ClickHouse/pull/57126) ([Kruglov Pavel](https://github.com/Avogar)).
* Expose the number of errors for each error code occurred on a server since last restart from the Prometheus endpoint. [#57209](https://github.com/ClickHouse/ClickHouse/pull/57209) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* ClickHouse keeper reports its running availability zone at `/keeper/availability-zone` path. This can be configured via `<availability_zone><value>us-west-1a</value></availability_zone>`. [#56715](https://github.com/ClickHouse/ClickHouse/pull/56715) ([Jianfei Hu](https://github.com/incfly)).
* Make `ALTER TABLE ... MODIFY QUERY` for materialized views non-experimental and deprecate the `allow_experimental_alter_materialized_view_structure` setting. Fixes [#15206](https://github.com/ClickHouse/ClickHouse/issues/15206). [#57311](https://github.com/ClickHouse/ClickHouse/pull/57311) ([alesapin](https://github.com/alesapin)).
* Setting `join_algorithm` respects specified order [#51745](https://github.com/ClickHouse/ClickHouse/pull/51745) ([vdimir](https://github.com/vdimir)).
* Add support for the [well-known Protobuf types](https://protobuf.dev/reference/protobuf/google.protobuf/) in the Protobuf format. [#56741](https://github.com/ClickHouse/ClickHouse/pull/56741) ([János Benjamin Antal](https://github.com/antaljanosbenjamin)).
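
As an example of the new `arrayRandomSample` function mentioned above (a sketch; the output varies because the sample is random):

```sql
-- New, convenient syntax: a random sample of 3 elements from the array [0..9].
SELECT arrayRandomSample(range(10), 3) AS sample;

-- The previous, less convenient workaround with the same effect.
SELECT arrayReduce('groupArraySample(3)', range(10)) AS sample;
```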
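
A hypothetical use of the new `_size` virtual column (assumes local CSV files matching `data*.csv` exist; `_file` is a pre-existing virtual column shown for context):

```sql
-- One row per matched file with its size in bytes.
SELECT DISTINCT _file, _size
FROM file('data*.csv', CSV);
```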
#### Performance Improvement
* It is now possible to refer to ALIAS column in index (non-primary-key) definitions (issue [#55650](https://github.com/ClickHouse/ClickHouse/issues/55650)). Example: `CREATE TABLE tab(col UInt32, col_alias ALIAS col + 1, INDEX idx (col_alias) TYPE minmax) ENGINE = MergeTree ORDER BY col;`. [#57220](https://github.com/ClickHouse/ClickHouse/pull/57220) ([flynn](https://github.com/ucasfl)).
* Adaptive timeouts for interacting with S3. The first attempt is made with low send and receive timeouts. [#56314](https://github.com/ClickHouse/ClickHouse/pull/56314) ([Sema Checherinda](https://github.com/CheSema)).
* Increase the default value of `max_concurrent_queries` from 100 to 1000. This makes sense when there is a large number of connecting clients, which are slowly sending or receiving data, so the server is not limited by CPU, or when the number of CPU cores is larger than 100. Also, enable the concurrency control by default, and set the desired number of query processing threads in total as twice the number of CPU cores. It improves performance in scenarios with a very large number of concurrent queries. [#46927](https://github.com/ClickHouse/ClickHouse/pull/46927) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Support parallel evaluation of window functions. Fixes [#34688](https://github.com/ClickHouse/ClickHouse/issues/34688). [#39631](https://github.com/ClickHouse/ClickHouse/pull/39631) ([Dmitry Novik](https://github.com/novikd)).
* `Numbers` table engine (of the `system.numbers` table) now analyzes the condition to generate the needed subset of data, like table's index. [#50909](https://github.com/ClickHouse/ClickHouse/pull/50909) ([JackyWoo](https://github.com/JackyWoo)).
* Improved the performance of filtering by `IN (...)` condition for `Merge` table engine. [#54905](https://github.com/ClickHouse/ClickHouse/pull/54905) ([Nikita Taranov](https://github.com/nickitat)).
* Improved performance when the filesystem cache is full and there are big reads. [#55158](https://github.com/ClickHouse/ClickHouse/pull/55158) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Add ability to disable checksums for S3 to avoid excessive pass over the file (this is controlled by the setting `s3_disable_checksum`). [#55559](https://github.com/ClickHouse/ClickHouse/pull/55559) ([Azat Khuzhin](https://github.com/azat)).
* Now we read synchronously from remote tables when data is in page cache (like we do for local tables). It is faster, it doesn't require synchronisation inside the thread pool, and doesn't hesitate to do `seek`-s on local FS, and reduces CPU wait. [#55841](https://github.com/ClickHouse/ClickHouse/pull/55841) ([Nikita Taranov](https://github.com/nickitat)).
* Optimization for getting a value from `map` and `arrayElement`: it brings about a 30% speedup by reducing the reserved memory and the number of `resize` calls. [#55957](https://github.com/ClickHouse/ClickHouse/pull/55957) ([lgbo](https://github.com/lgbo-ustc)).
* Optimization of multi-stage filtering with AVX-512. The performance experiments of the OnTime dataset on the ICX device (Intel Xeon Platinum 8380 CPU, 80 cores, 160 threads) show that this change could bring the improvements of 7.4%, 5.9%, 4.7%, 3.0%, and 4.6% to the QPS of the query Q2, Q3, Q4, Q5 and Q6 respectively while having no impact on others. [#56079](https://github.com/ClickHouse/ClickHouse/pull/56079) ([Zhiguo Zhou](https://github.com/ZhiguoZh)).
* Limit the number of threads busy inside the query profiler. If there are more - they will skip profiling. [#56105](https://github.com/ClickHouse/ClickHouse/pull/56105) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Decrease the amount of virtual function calls in window functions. [#56120](https://github.com/ClickHouse/ClickHouse/pull/56120) ([Maksim Kita](https://github.com/kitaisreal)).
* Allow recursive Tuple field pruning in ORC data format to speed up scanning. [#56122](https://github.com/ClickHouse/ClickHouse/pull/56122) ([李扬](https://github.com/taiyang-li)).
* Trivial count optimization for `Npy` data format: queries like `select count() from 'data.npy'` will work much faster because the results are cached. [#56304](https://github.com/ClickHouse/ClickHouse/pull/56304) ([Yarik Briukhovetskyi](https://github.com/yariks5s)).
* Queries with aggregation and a large number of streams will use less memory during the plan's construction. [#57074](https://github.com/ClickHouse/ClickHouse/pull/57074) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Improve performance of executing queries for use cases with many users and highly concurrent queries (>2000 QPS) by optimizing the access to ProcessList. [#57106](https://github.com/ClickHouse/ClickHouse/pull/57106) ([Andrej Hoos](https://github.com/adikus)).
* Trivial improvement on array join, reuse some intermediate results. [#57183](https://github.com/ClickHouse/ClickHouse/pull/57183) ([李扬](https://github.com/taiyang-li)).
* There are cases when stack unwinding was slow. Not anymore. [#57221](https://github.com/ClickHouse/ClickHouse/pull/57221) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Now we use default read pool for reading from external storage when `max_streams = 1`. It is beneficial when read prefetches are enabled. [#57334](https://github.com/ClickHouse/ClickHouse/pull/57334) ([Nikita Taranov](https://github.com/nickitat)).
* Keeper improvement: improve memory-usage during startup by delaying log preprocessing. [#55660](https://github.com/ClickHouse/ClickHouse/pull/55660) ([Antonio Andelic](https://github.com/antonio2368)).
* Improved performance of glob matching for `File` and `HDFS` storages. [#56141](https://github.com/ClickHouse/ClickHouse/pull/56141) ([Andrey Zvonov](https://github.com/zvonand)).
* Posting lists in experimental full text indexes are now compressed which reduces their size by 10-30%. [#56226](https://github.com/ClickHouse/ClickHouse/pull/56226) ([Harry Lee](https://github.com/HarryLeeIBM)).
* Parallelise `BackupEntriesCollector` in backups. [#56312](https://github.com/ClickHouse/ClickHouse/pull/56312) ([Kseniia Sumarokova](https://github.com/kssenii)).
#### Improvement
* Add a new `MergeTree` setting `add_implicit_sign_column_constraint_for_collapsing_engine` (disabled by default). When enabled, it adds an implicit CHECK constraint for `CollapsingMergeTree` tables that restricts the value of the `Sign` column to be only -1 or 1. A sketch is given at the end of this section. [#56701](https://github.com/ClickHouse/ClickHouse/issues/56701). [#56986](https://github.com/ClickHouse/ClickHouse/pull/56986) ([Kevin Mingtarja](https://github.com/kevinmingtarja)).
* Enable adding new disk to storage configuration without restart. [#56367](https://github.com/ClickHouse/ClickHouse/pull/56367) ([Duc Canh Le](https://github.com/canhld94)).
* Support creating and materializing index in the same alter query, also support "modify TTL" and "materialize TTL" in the same query. Closes [#55651](https://github.com/ClickHouse/ClickHouse/issues/55651). [#56331](https://github.com/ClickHouse/ClickHouse/pull/56331) ([flynn](https://github.com/ucasfl)).
* Add a new table function named `fuzzJSON` with rows containing perturbed versions of the source JSON string with random variations. [#56490](https://github.com/ClickHouse/ClickHouse/pull/56490) ([Julia Kartseva](https://github.com/jkartseva)).
* Engine `Merge` filters the records according to the row policies of the underlying tables, so you don't have to create another row policy on a `Merge` table. [#50209](https://github.com/ClickHouse/ClickHouse/pull/50209) ([Ilya Golshtein](https://github.com/ilejn)).
* Add a setting `max_execution_time_leaf` to limit the execution time on shard for distributed query, and `timeout_overflow_mode_leaf` to control the behaviour if timeout happens. [#51823](https://github.com/ClickHouse/ClickHouse/pull/51823) ([Duc Canh Le](https://github.com/canhld94)).
* Add ClickHouse setting to disable tunneling for HTTPS requests over HTTP proxy. [#55033](https://github.com/ClickHouse/ClickHouse/pull/55033) ([Arthur Passos](https://github.com/arthurpassos)).
* Set `background_fetches_pool_size` to 16 and `background_schedule_pool_size` to 512, which is better for production usage with frequent small insertions. [#54327](https://github.com/ClickHouse/ClickHouse/pull/54327) ([Denny Crane](https://github.com/den-crane)).
* When reading data from a CSV file where a line ends with `\r` not followed by `\n`, ClickHouse used to throw the exception `Cannot parse CSV format: found \r (CR) not followed by \n (LF). Line must end by \n (LF) or \r\n (CR LF) or \n\r.` In ClickHouse, a CSV line must end with `\n`, `\r\n` or `\n\r`, so `\r` had to be followed by `\n`; however, some malformed CSV input ends a line with a bare `\r`, and this case is now handled. [#54340](https://github.com/ClickHouse/ClickHouse/pull/54340) ([KevinyhZou](https://github.com/KevinyhZou)).
* Update Arrow library to release-13.0.0 that supports new encodings. Closes [#44505](https://github.com/ClickHouse/ClickHouse/issues/44505). [#54800](https://github.com/ClickHouse/ClickHouse/pull/54800) ([Kruglov Pavel](https://github.com/Avogar)).
* Improve performance of ON CLUSTER queries by removing heavy system calls to get all network interfaces when looking for local ip address in the DDL entry hosts list. [#54909](https://github.com/ClickHouse/ClickHouse/pull/54909) ([Duc Canh Le](https://github.com/canhld94)).
* Fixed accounting of memory allocated before attaching a thread to a query or a user. [#56089](https://github.com/ClickHouse/ClickHouse/pull/56089) ([Nikita Taranov](https://github.com/nickitat)).
* Add support for `LARGE_LIST` in Apache Arrow formats. [#56118](https://github.com/ClickHouse/ClickHouse/pull/56118) ([edef](https://github.com/edef1c)).
* Allow manual compaction of `EmbeddedRocksDB` via `OPTIMIZE` query. [#56225](https://github.com/ClickHouse/ClickHouse/pull/56225) ([Azat Khuzhin](https://github.com/azat)).
* Add ability to specify BlockBasedTableOptions for `EmbeddedRocksDB` tables. [#56264](https://github.com/ClickHouse/ClickHouse/pull/56264) ([Azat Khuzhin](https://github.com/azat)).
* `SHOW COLUMNS` now displays MySQL's equivalent data type name when the connection was made through the MySQL protocol. Previously, this was the case when setting `use_mysql_types_in_show_columns = 1`. The setting is retained but made obsolete. [#56277](https://github.com/ClickHouse/ClickHouse/pull/56277) ([Robert Schulze](https://github.com/rschu1ze)).
* Fixed possible `The local set of parts of table doesn't look like the set of parts in ZooKeeper` error if server was restarted just after `TRUNCATE` or `DROP PARTITION`. [#56282](https://github.com/ClickHouse/ClickHouse/pull/56282) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Fixed handling of non-const query strings in functions `formatQuery` / `formatQuerySingleLine`. Also added `OrNull` variants of both functions that return NULL when a query cannot be parsed instead of throwing an exception. See the example at the end of this section. [#56327](https://github.com/ClickHouse/ClickHouse/pull/56327) ([Robert Schulze](https://github.com/rschu1ze)).
* Allow backup of materialized view with dropped inner table instead of failing the backup. [#56387](https://github.com/ClickHouse/ClickHouse/pull/56387) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Queries to `system.replicas` initiate requests to ZooKeeper when certain columns are queried. When there are thousands of tables these requests might produce a considerable load on ZooKeeper. If there are multiple simultaneous queries to `system.replicas`, they make the same requests multiple times. The change is to "deduplicate" requests from concurrent queries. [#56420](https://github.com/ClickHouse/ClickHouse/pull/56420) ([Alexander Gololobov](https://github.com/davenger)).
* Fix translation to MySQL compatible query for querying external databases. [#56456](https://github.com/ClickHouse/ClickHouse/pull/56456) ([flynn](https://github.com/ucasfl)).
* Add support for backing up and restoring tables using `KeeperMap` engine. [#56460](https://github.com/ClickHouse/ClickHouse/pull/56460) ([Antonio Andelic](https://github.com/antonio2368)).
* A 404 response for CompleteMultipartUpload has to be rechecked: the operation could have succeeded on the server even if the client got a timeout or another network error, in which case the next retry of CompleteMultipartUpload receives a 404 response. If the object key exists, the operation is considered successful. [#56475](https://github.com/ClickHouse/ClickHouse/pull/56475) ([Sema Checherinda](https://github.com/CheSema)).
* Enable the HTTP OPTIONS method by default - it simplifies requesting ClickHouse from a web browser. [#56483](https://github.com/ClickHouse/ClickHouse/pull/56483) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* The value for `dns_max_consecutive_failures` was changed by mistake in [#46550](https://github.com/ClickHouse/ClickHouse/issues/46550) - this is reverted and adjusted to a better value. Also, increased the HTTP keep-alive timeout to a reasonable value from production. [#56485](https://github.com/ClickHouse/ClickHouse/pull/56485) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Load base backups lazily (a base backup won't be loaded until it's needed). Also add some log messages and profile events for backups. [#56516](https://github.com/ClickHouse/ClickHouse/pull/56516) ([Vitaly Baranov](https://github.com/vitlibar)).
* Setting `query_cache_store_results_of_queries_with_nondeterministic_functions` (with values `false` or `true`) was marked obsolete. It was replaced by setting `query_cache_nondeterministic_function_handling`, a three-valued enum that controls how the query cache handles queries with non-deterministic functions: a) throw an exception (default behavior), b) save the non-deterministic query result regardless, or c) ignore, i.e. don't throw an exception and don't cache the result. [#56519](https://github.com/ClickHouse/ClickHouse/pull/56519) ([Robert Schulze](https://github.com/rschu1ze)).
* Rewrite equality with `is null` check in JOIN ON section. Experimental *Analyzer only*. [#56538](https://github.com/ClickHouse/ClickHouse/pull/56538) ([vdimir](https://github.com/vdimir)).
* Function `concat` now supports arbitrary argument types (instead of only String and FixedString arguments). This makes its behavior more similar to MySQL's `concat` implementation. For example, `SELECT concat('ab', 42)` now returns `ab42`. See the example at the end of this section. [#56540](https://github.com/ClickHouse/ClickHouse/pull/56540) ([Serge Klochkov](https://github.com/slvrtrn)).
* Allow getting cache configuration from 'named_collection' section in config or from SQL created named collections. [#56541](https://github.com/ClickHouse/ClickHouse/pull/56541) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Update `query_masking_rules` when reloading the config ([#56449](https://github.com/ClickHouse/ClickHouse/issues/56449)). [#56573](https://github.com/ClickHouse/ClickHouse/pull/56573) ([Mikhail Koviazin](https://github.com/mkmkme)).
* PostgreSQL database engine: Make the removal of outdated tables less aggressive with unsuccessful postgres connection. [#56609](https://github.com/ClickHouse/ClickHouse/pull/56609) ([jsc0218](https://github.com/jsc0218)).
* Connecting to PostgreSQL took too much time when the URL was not right, so the relevant query got stuck until it was cancelled; this is now fixed. [#56648](https://github.com/ClickHouse/ClickHouse/pull/56648) ([jsc0218](https://github.com/jsc0218)).
* Do not allow tables on different replicas to have different aggregate functions in `SimpleAggregateFunction` columns. [#56724](https://github.com/ClickHouse/ClickHouse/pull/56724) ([Duc Canh Le](https://github.com/canhld94)).
* Keeper improvement: disable compressed logs by default in Keeper. [#56763](https://github.com/ClickHouse/ClickHouse/pull/56763) ([Antonio Andelic](https://github.com/antonio2368)).
* Add config setting `wait_dictionaries_load_at_startup`. [#56782](https://github.com/ClickHouse/ClickHouse/pull/56782) ([Vitaly Baranov](https://github.com/vitlibar)).
* There was a potential vulnerability in previous ClickHouse versions: if a user has connected and unsuccessfully tried to authenticate with the "interserver secret" method, the server didn't terminate the connection immediately but continued to receive and ignore the leftover packets from the client. While these packets are ignored, they are still parsed, and if they use a compression method with another known vulnerability, it will lead to exploitation of it without authentication. This issue was found with [ClickHouse Bug Bounty Program](https://github.com/ClickHouse/ClickHouse/issues/38986) by https://twitter.com/malacupa. [#56794](https://github.com/ClickHouse/ClickHouse/pull/56794) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Fetching a part now waits until that part is fully committed on the remote replica. It is better not to send a part in the PreActive state; in the case of zero-copy replication this is a mandatory restriction. [#56808](https://github.com/ClickHouse/ClickHouse/pull/56808) ([Sema Checherinda](https://github.com/CheSema)).
* Fix possible postgresql logical replication conversion error when using experimental `MaterializedPostgreSQL`. [#53721](https://github.com/ClickHouse/ClickHouse/pull/53721) ([takakawa](https://github.com/takakawa)).
* Implement user-level setting `alter_move_to_space_execute_async` which allows executing queries `ALTER TABLE ... MOVE PARTITION|PART TO DISK|VOLUME` asynchronously. The size of the pool for background executions is controlled by `background_move_pool_size`. The default behavior is synchronous execution. Fixes [#47643](https://github.com/ClickHouse/ClickHouse/issues/47643). [#56809](https://github.com/ClickHouse/ClickHouse/pull/56809) ([alesapin](https://github.com/alesapin)).
* Allow filtering by engine when scanning `system.tables` to avoid unnecessary (potentially time-consuming) connections. [#56813](https://github.com/ClickHouse/ClickHouse/pull/56813) ([jsc0218](https://github.com/jsc0218)).
* Show `total_bytes` and `total_rows` in system tables for RocksDB storage. [#56816](https://github.com/ClickHouse/ClickHouse/pull/56816) ([Aleksandr Musorin](https://github.com/AVMusorin)).
* Allow basic commands in ALTER for TEMPORARY tables. [#56892](https://github.com/ClickHouse/ClickHouse/pull/56892) ([Sergey](https://github.com/icuken)).
* LZ4 compression: buffer the compressed block in the rare case when the output buffer's capacity is not enough to write the compressed block into it directly. [#56938](https://github.com/ClickHouse/ClickHouse/pull/56938) ([Sema Checherinda](https://github.com/CheSema)).
* Add metrics for the number of queued jobs, which is useful for the IO thread pool. [#56958](https://github.com/ClickHouse/ClickHouse/pull/56958) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Add a setting for the PostgreSQL table engine in the config file, added a check for the setting, and added documentation around it. [#56959](https://github.com/ClickHouse/ClickHouse/pull/56959) ([Peignon Melvyn](https://github.com/melvynator)).
* Function `concat` can now be called with a single argument, e.g., `SELECT concat('abc')`. This makes its behavior more consistent with MySQL's concat implementation. [#57000](https://github.com/ClickHouse/ClickHouse/pull/57000) ([Serge Klochkov](https://github.com/slvrtrn)).
* Signs all `x-amz-*` headers as required by AWS S3 docs. [#57001](https://github.com/ClickHouse/ClickHouse/pull/57001) ([Arthur Passos](https://github.com/arthurpassos)).
* Function `fromDaysSinceYearZero` (alias: `FROM_DAYS`) can now be used with unsigned and signed integer types (previously, it had to be an unsigned integer). This improves compatibility with 3rd-party tools such as Tableau Online. [#57002](https://github.com/ClickHouse/ClickHouse/pull/57002) ([Serge Klochkov](https://github.com/slvrtrn)).
* Add `system.s3queue_log` to default config. [#57036](https://github.com/ClickHouse/ClickHouse/pull/57036) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Change the default for `wait_dictionaries_load_at_startup` to true, and use this setting only if `dictionaries_lazy_load` is false. [#57133](https://github.com/ClickHouse/ClickHouse/pull/57133) ([Vitaly Baranov](https://github.com/vitlibar)).
* Check dictionary source type on creation even if `dictionaries_lazy_load` is enabled. [#57134](https://github.com/ClickHouse/ClickHouse/pull/57134) ([Vitaly Baranov](https://github.com/vitlibar)).
* Plan-level optimizations can now be enabled/disabled individually. Previously, it was only possible to disable them all. The setting which previously did that (`query_plan_enable_optimizations`) is retained and can still be used to disable all optimizations. [#57152](https://github.com/ClickHouse/ClickHouse/pull/57152) ([Robert Schulze](https://github.com/rschu1ze)).
* The server's exit code will correspond to the exception code. For example, if the server cannot start due to memory limit, it will exit with the code 241 = MEMORY_LIMIT_EXCEEDED. In previous versions, the exit code for exceptions was always 70 = Poco::Util::ExitCode::EXIT_SOFTWARE. [#57153](https://github.com/ClickHouse/ClickHouse/pull/57153) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Do not demangle and symbolize stack frames from `functional` C++ header. [#57201](https://github.com/ClickHouse/ClickHouse/pull/57201) ([Mike Kot](https://github.com/myrrc)).
* HTTP server page `/dashboard` now supports charts with multiple lines. [#57236](https://github.com/ClickHouse/ClickHouse/pull/57236) ([Sergei Trifonov](https://github.com/serxa)).
* The `max_memory_usage_in_client` command line option supports a string value with a suffix (K, M, G, etc). Closes [#56879](https://github.com/ClickHouse/ClickHouse/issues/56879). [#57273](https://github.com/ClickHouse/ClickHouse/pull/57273) ([Yarik Briukhovetskyi](https://github.com/yariks5s)).
* Bumped Intel QPL (used by codec `DEFLATE_QPL`) from v1.2.0 to v1.3.1. Also fixed a bug in the case of BOF (Block On Fault) = 0: page faults are now handled by falling back to the software path. [#57291](https://github.com/ClickHouse/ClickHouse/pull/57291) ([jasperzhu](https://github.com/jinjunzh)).
* Increase default `replicated_deduplication_window` of MergeTree settings from 100 to 1k. [#57335](https://github.com/ClickHouse/ClickHouse/pull/57335) ([sichenzhao](https://github.com/sichenzhao)).
* Use the `INCONSISTENT_METADATA_FOR_BACKUP` error less often: if possible, prefer to continue scanning instead of stopping and restarting the scan for a backup from the beginning. [#57385](https://github.com/ClickHouse/ClickHouse/pull/57385) ([Vitaly Baranov](https://github.com/vitlibar)).
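
To make the collapsing-engine item above concrete, here is a small hedged sketch; the table and column names are illustrative, not taken from the changelog:

```sql
-- With the new MergeTree setting enabled, inserting Sign values other than -1 or 1 is rejected.
CREATE TABLE collapsing_example
(
    key  UInt64,
    val  UInt64,
    Sign Int8
)
ENGINE = CollapsingMergeTree(Sign)
ORDER BY key
SETTINGS add_implicit_sign_column_constraint_for_collapsing_engine = 1;
```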
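
A sketch of the `OrNull` variants of the query formatting functions described above (the second call returns `NULL` instead of throwing):

```sql
-- Formats a valid query (throws if the string cannot be parsed).
SELECT formatQuery('select 1 as x');

-- Returns NULL for a string that is not a valid query.
SELECT formatQueryOrNull('this is not SQL') AS formatted;
```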
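
The relaxed `concat` behavior, with both calls taken from the entries above:

```sql
SELECT concat('ab', 42); -- returns 'ab42': non-String arguments are now accepted
SELECT concat('abc');    -- single-argument calls are now allowed
```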
#### Build/Testing/Packaging Improvement
* Add SQLLogic test. [#56078](https://github.com/ClickHouse/ClickHouse/pull/56078) ([Han Fei](https://github.com/hanfei1991)).
* Make `clickhouse-local` and `clickhouse-client` available under short names (`ch`, `chl`, `chc`) for usability. [#56634](https://github.com/ClickHouse/ClickHouse/pull/56634) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Optimized build size further by removing unused code from external libraries. [#56786](https://github.com/ClickHouse/ClickHouse/pull/56786) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Add automatic check that there are no large translation units. [#56559](https://github.com/ClickHouse/ClickHouse/pull/56559) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Lower the size of the single-binary distribution. This closes [#55181](https://github.com/ClickHouse/ClickHouse/issues/55181). [#56617](https://github.com/ClickHouse/ClickHouse/pull/56617) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Information about the sizes of every translation unit and binary file after each build will be sent to the CI database in ClickHouse Cloud. This closes [#56107](https://github.com/ClickHouse/ClickHouse/issues/56107). [#56636](https://github.com/ClickHouse/ClickHouse/pull/56636) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Certain files of "Apache Arrow" library (which we use only for non-essential things like parsing the arrow format) were rebuilt all the time regardless of the build cache. This is fixed. [#56657](https://github.com/ClickHouse/ClickHouse/pull/56657) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Avoid recompiling translation units depending on the autogenerated source file about version. [#56660](https://github.com/ClickHouse/ClickHouse/pull/56660) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Tracing data of the linker invocations will be sent to the CI database in ClickHouse Cloud. [#56725](https://github.com/ClickHouse/ClickHouse/pull/56725) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Use DWARF 5 debug symbols for the clickhouse binary (was DWARF 4 previously). [#56770](https://github.com/ClickHouse/ClickHouse/pull/56770) ([Michael Kolupaev](https://github.com/al13n321)).
* Add a new build option `SANITIZE_COVERAGE`. If it is enabled, the code is instrumented to track the coverage. The collected information is available inside ClickHouse with: (1) a new function `coverage` that returns an array of unique addresses in the code found after the previous coverage reset; (2) `SYSTEM RESET COVERAGE` query that resets the accumulated data. This allows us to compare the coverage of different tests, including differential code coverage. A sketch is given at the end of this section. Continuation of [#20539](https://github.com/ClickHouse/ClickHouse/issues/20539). [#56102](https://github.com/ClickHouse/ClickHouse/pull/56102) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Some of the stack frames might not be resolved when collecting stacks. In such cases the raw address might be helpful. [#56267](https://github.com/ClickHouse/ClickHouse/pull/56267) ([Alexander Gololobov](https://github.com/davenger)).
* Add an option to disable `libssh`. [#56333](https://github.com/ClickHouse/ClickHouse/pull/56333) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Enable `temporary_data_in_cache` in S3 tests in CI. [#48425](https://github.com/ClickHouse/ClickHouse/pull/48425) ([vdimir](https://github.com/vdimir)).
* Set the max memory usage for clickhouse-client (`1G`) in the CI. [#56873](https://github.com/ClickHouse/ClickHouse/pull/56873) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
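
A short sketch of the coverage introspection described above (only meaningful in a binary built with `SANITIZE_COVERAGE` enabled):

```sql
-- Drop the coverage data accumulated so far.
SYSTEM RESET COVERAGE;

-- ... run the queries whose coverage you want to measure ...

-- coverage() returns an array of unique code addresses hit since the last reset.
SELECT length(coverage()) AS covered_addresses;
```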
#### Bug Fix (user-visible misbehavior in an official stable release)
* Fix experimental Analyzer - insertion from select with subquery referencing the insertion table should process only the insertion block. [#50857](https://github.com/ClickHouse/ClickHouse/pull/50857) ([Yakov Olkhovskiy](https://github.com/yakov-olkhovskiy)).
* Fix a bug in `str_to_map` function. [#56423](https://github.com/ClickHouse/ClickHouse/pull/56423) ([Arthur Passos](https://github.com/arthurpassos)).
* Keeper `reconfig`: add timeout before yielding/taking leadership [#53481](https://github.com/ClickHouse/ClickHouse/pull/53481) ([Mike Kot](https://github.com/myrrc)).
* Fix incorrect header in grace hash join and filter pushdown [#53922](https://github.com/ClickHouse/ClickHouse/pull/53922) ([vdimir](https://github.com/vdimir)).
* Fix selecting from system tables when the table is based on a table function. [#55540](https://github.com/ClickHouse/ClickHouse/pull/55540) ([MikhailBurdukov](https://github.com/MikhailBurdukov)).
* RFC: Fix "Cannot find column X in source stream" for Distributed queries with LIMIT BY [#55836](https://github.com/ClickHouse/ClickHouse/pull/55836) ([Azat Khuzhin](https://github.com/azat)).
* Fix 'Cannot read from file:' while running client in a background [#55976](https://github.com/ClickHouse/ClickHouse/pull/55976) ([Kruglov Pavel](https://github.com/Avogar)).
* Fix clickhouse-local exit on bad send_logs_level setting [#55994](https://github.com/ClickHouse/ClickHouse/pull/55994) ([Kruglov Pavel](https://github.com/Avogar)).
* Bug fix explain ast with parameterized view [#56004](https://github.com/ClickHouse/ClickHouse/pull/56004) ([SmitaRKulkarni](https://github.com/SmitaRKulkarni)).
* Fix a crash during table loading on startup [#56232](https://github.com/ClickHouse/ClickHouse/pull/56232) ([Nikolay Degterinsky](https://github.com/evillique)).
* Fix ClickHouse-sourced dictionaries with an explicit query [#56236](https://github.com/ClickHouse/ClickHouse/pull/56236) ([Nikolay Degterinsky](https://github.com/evillique)).
* Fix segfault in signal handler for Keeper [#56266](https://github.com/ClickHouse/ClickHouse/pull/56266) ([Antonio Andelic](https://github.com/antonio2368)).
* Fix incomplete query result for UNION in view() function. [#56274](https://github.com/ClickHouse/ClickHouse/pull/56274) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix inconsistency of "cast('0' as DateTime64(3))" and "cast('0' as Nullable(DateTime64(3)))" [#56286](https://github.com/ClickHouse/ClickHouse/pull/56286) ([李扬](https://github.com/taiyang-li)).
* Fix rare race condition related to Memory allocation failure [#56303](https://github.com/ClickHouse/ClickHouse/pull/56303) ([alesapin](https://github.com/alesapin)).
* Fix restore from backup with `flatten_nested` and `data_type_default_nullable` [#56306](https://github.com/ClickHouse/ClickHouse/pull/56306) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Fix crash in case of adding a column with type Object(JSON) [#56307](https://github.com/ClickHouse/ClickHouse/pull/56307) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* Fix crash in filterPushDown [#56380](https://github.com/ClickHouse/ClickHouse/pull/56380) ([vdimir](https://github.com/vdimir)).
* Fix restore from backup with mat view and dropped source table [#56383](https://github.com/ClickHouse/ClickHouse/pull/56383) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Fix segfault during Kerberos initialization [#56401](https://github.com/ClickHouse/ClickHouse/pull/56401) ([Nikolay Degterinsky](https://github.com/evillique)).
* Fix buffer overflow in T64 [#56434](https://github.com/ClickHouse/ClickHouse/pull/56434) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Fix nullable primary key in final (2) [#56452](https://github.com/ClickHouse/ClickHouse/pull/56452) ([Amos Bird](https://github.com/amosbird)).
* Fix ON CLUSTER queries without database on initial node [#56484](https://github.com/ClickHouse/ClickHouse/pull/56484) ([Nikolay Degterinsky](https://github.com/evillique)).
* Fix startup failure due to TTL dependency [#56489](https://github.com/ClickHouse/ClickHouse/pull/56489) ([Nikolay Degterinsky](https://github.com/evillique)).
* Fix ALTER COMMENT queries ON CLUSTER [#56491](https://github.com/ClickHouse/ClickHouse/pull/56491) ([Nikolay Degterinsky](https://github.com/evillique)).
* Fix ALTER COLUMN with ALIAS [#56493](https://github.com/ClickHouse/ClickHouse/pull/56493) ([Nikolay Degterinsky](https://github.com/evillique)).
* Fix empty NAMED COLLECTIONs [#56494](https://github.com/ClickHouse/ClickHouse/pull/56494) ([Nikolay Degterinsky](https://github.com/evillique)).
* Fix two cases of projection analysis. [#56502](https://github.com/ClickHouse/ClickHouse/pull/56502) ([Amos Bird](https://github.com/amosbird)).
* Fix handling of aliases in query cache [#56545](https://github.com/ClickHouse/ClickHouse/pull/56545) ([Robert Schulze](https://github.com/rschu1ze)).
* Fix conversion from `Nullable(Enum)` to `Nullable(String)` [#56644](https://github.com/ClickHouse/ClickHouse/pull/56644) ([Nikolay Degterinsky](https://github.com/evillique)).
* More reliable log handling in Keeper [#56670](https://github.com/ClickHouse/ClickHouse/pull/56670) ([Antonio Andelic](https://github.com/antonio2368)).
* Fix configuration merge for nodes with substitution attributes [#56694](https://github.com/ClickHouse/ClickHouse/pull/56694) ([Konstantin Bogdanov](https://github.com/thevar1able)).
* Fix duplicate usage of table function input(). [#56695](https://github.com/ClickHouse/ClickHouse/pull/56695) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix: RabbitMQ OpenSSL dynamic loading issue [#56703](https://github.com/ClickHouse/ClickHouse/pull/56703) ([Igor Nikonov](https://github.com/devcrafter)).
* Fix crash in GCD codec in case when zeros present in data [#56704](https://github.com/ClickHouse/ClickHouse/pull/56704) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* Fix 'mutex lock failed: Invalid argument' in clickhouse-local during insert into function [#56710](https://github.com/ClickHouse/ClickHouse/pull/56710) ([Kruglov Pavel](https://github.com/Avogar)).
* Fix Date text parsing in optimistic path [#56765](https://github.com/ClickHouse/ClickHouse/pull/56765) ([Kruglov Pavel](https://github.com/Avogar)).
* Fix crash in FPC codec [#56795](https://github.com/ClickHouse/ClickHouse/pull/56795) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* DatabaseReplicated: fix DDL query timeout after recovering a replica [#56796](https://github.com/ClickHouse/ClickHouse/pull/56796) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Fix incorrect nullable columns reporting in MySQL binary protocol [#56799](https://github.com/ClickHouse/ClickHouse/pull/56799) ([Serge Klochkov](https://github.com/slvrtrn)).
* Support Iceberg metadata files for metastore tables [#56810](https://github.com/ClickHouse/ClickHouse/pull/56810) ([Kruglov Pavel](https://github.com/Avogar)).
* Fix TSAN report under transform [#56817](https://github.com/ClickHouse/ClickHouse/pull/56817) ([Raúl Marín](https://github.com/Algunenano)).
* Fix SET query and SETTINGS formatting [#56825](https://github.com/ClickHouse/ClickHouse/pull/56825) ([Nikolay Degterinsky](https://github.com/evillique)).
* Fix failure to start due to table dependency in joinGet [#56828](https://github.com/ClickHouse/ClickHouse/pull/56828) ([Nikolay Degterinsky](https://github.com/evillique)).
* Fix flattening existing Nested columns during ADD COLUMN [#56830](https://github.com/ClickHouse/ClickHouse/pull/56830) ([Nikolay Degterinsky](https://github.com/evillique)).
* Fix allowing CR end of line for CSV [#56901](https://github.com/ClickHouse/ClickHouse/pull/56901) ([KevinyhZou](https://github.com/KevinyhZou)).
* Fix `tryBase64Decode` with invalid input [#56913](https://github.com/ClickHouse/ClickHouse/pull/56913) ([Robert Schulze](https://github.com/rschu1ze)).
* Fix generating deep nested columns in CapnProto/Protobuf schemas [#56941](https://github.com/ClickHouse/ClickHouse/pull/56941) ([Kruglov Pavel](https://github.com/Avogar)).
* Prevent incompatible ALTER of projection columns [#56948](https://github.com/ClickHouse/ClickHouse/pull/56948) ([Amos Bird](https://github.com/amosbird)).
* Fix sqlite file path validation [#56984](https://github.com/ClickHouse/ClickHouse/pull/56984) ([San](https://github.com/santrancisco)).
* S3Queue: fix metadata reference increment [#56990](https://github.com/ClickHouse/ClickHouse/pull/56990) ([Kseniia Sumarokova](https://github.com/kssenii)).
* S3Queue minor fix [#56999](https://github.com/ClickHouse/ClickHouse/pull/56999) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Fix file path validation for DatabaseFileSystem [#57029](https://github.com/ClickHouse/ClickHouse/pull/57029) ([San](https://github.com/santrancisco)).
* Fix `fuzzBits` with `ARRAY JOIN` [#57033](https://github.com/ClickHouse/ClickHouse/pull/57033) ([Antonio Andelic](https://github.com/antonio2368)).
* Fix Nullptr dereference in partial merge join with joined_subquery_re… [#57048](https://github.com/ClickHouse/ClickHouse/pull/57048) ([vdimir](https://github.com/vdimir)).
* Fix race condition in RemoteSource [#57052](https://github.com/ClickHouse/ClickHouse/pull/57052) ([Raúl Marín](https://github.com/Algunenano)).
* Implement `bitHammingDistance` for big integers [#57073](https://github.com/ClickHouse/ClickHouse/pull/57073) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* S3-style links bug fix [#57075](https://github.com/ClickHouse/ClickHouse/pull/57075) ([Yarik Briukhovetskyi](https://github.com/yariks5s)).
* Fix JSON_QUERY function with multiple numeric paths [#57096](https://github.com/ClickHouse/ClickHouse/pull/57096) ([KevinyhZou](https://github.com/KevinyhZou)).
* Fix buffer overflow in Gorilla codec [#57107](https://github.com/ClickHouse/ClickHouse/pull/57107) ([Nikolay Degterinsky](https://github.com/evillique)).
* Close interserver connection on any exception before authentication [#57142](https://github.com/ClickHouse/ClickHouse/pull/57142) ([Antonio Andelic](https://github.com/antonio2368)).
* Fix segfault after ALTER UPDATE with Nullable MATERIALIZED column [#57147](https://github.com/ClickHouse/ClickHouse/pull/57147) ([Nikolay Degterinsky](https://github.com/evillique)).
* Fix incorrect JOIN plan optimization with partially materialized normal projection [#57196](https://github.com/ClickHouse/ClickHouse/pull/57196) ([Amos Bird](https://github.com/amosbird)).
* Ignore comments when comparing column descriptions [#57259](https://github.com/ClickHouse/ClickHouse/pull/57259) ([Antonio Andelic](https://github.com/antonio2368)).
* Fix `ReadonlyReplica` metric for all cases [#57267](https://github.com/ClickHouse/ClickHouse/pull/57267) ([Antonio Andelic](https://github.com/antonio2368)).
* Background merges correctly use temporary data storage in the cache [#57275](https://github.com/ClickHouse/ClickHouse/pull/57275) ([vdimir](https://github.com/vdimir)).
* Keeper fix for changelog and snapshots [#57299](https://github.com/ClickHouse/ClickHouse/pull/57299) ([Antonio Andelic](https://github.com/antonio2368)).
* Ignore finished ON CLUSTER tasks if hostname changed [#57339](https://github.com/ClickHouse/ClickHouse/pull/57339) ([Alexander Tokmakov](https://github.com/tavplubix)).
* MergeTree mutations reuse source part index granularity [#57352](https://github.com/ClickHouse/ClickHouse/pull/57352) ([Maksim Kita](https://github.com/kitaisreal)).
* FS cache: add a limit for background download [#57424](https://github.com/ClickHouse/ClickHouse/pull/57424) ([Kseniia Sumarokova](https://github.com/kssenii)).
### <a id="2310"></a> ClickHouse release 23.10, 2023-11-02
#### Backward Incompatible Change
* There is no longer an option to automatically remove broken data parts. This closes [#55174](https://github.com/ClickHouse/ClickHouse/issues/55174). [#55184](https://github.com/ClickHouse/ClickHouse/pull/55184) ([Alexey Milovidov](https://github.com/alexey-milovidov)). [#55557](https://github.com/ClickHouse/ClickHouse/pull/55557) ([Jihyuk Bok](https://github.com/tomahawk28)).
@@ -39,7 +255,7 @@
* Allow to drop cache for Protobuf format with `SYSTEM DROP SCHEMA FORMAT CACHE [FOR Protobuf]`. [#55064](https://github.com/ClickHouse/ClickHouse/pull/55064) ([Aleksandr Musorin](https://github.com/AVMusorin)).
* Add external HTTP Basic authenticator. [#55199](https://github.com/ClickHouse/ClickHouse/pull/55199) ([Aleksei Filatov](https://github.com/aalexfvk)).
* Added function `byteSwap` which reverses the bytes of unsigned integers. This is particularly useful for reversing values of types which are represented as unsigned integers internally such as IPv4. [#55211](https://github.com/ClickHouse/ClickHouse/pull/55211) ([Priyansh Agrawal](https://github.com/Priyansh121096)).
* Added function `formatQuery()` which returns a formatted version (possibly spanning multiple lines) of a SQL query string. Also added function `formatQuerySingleLine()` which does the same but the returned string will not contain linebreaks. [#55239](https://github.com/ClickHouse/ClickHouse/pull/55239) ([Salvatore Mesoraca](https://github.com/aiven-sal)).
* Added function `formatQuery` which returns a formatted version (possibly spanning multiple lines) of a SQL query string. Also added function `formatQuerySingleLine` which does the same but the returned string will not contain linebreaks. [#55239](https://github.com/ClickHouse/ClickHouse/pull/55239) ([Salvatore Mesoraca](https://github.com/aiven-sal)).
* Added `DWARF` input format that reads debug symbols from an ELF executable/library/object file. [#55450](https://github.com/ClickHouse/ClickHouse/pull/55450) ([Michael Kolupaev](https://github.com/al13n321)).
* Allow to save unparsed records and errors in RabbitMQ, NATS and FileLog engines. Add virtual columns `_error` and `_raw_message` (for NATS and RabbitMQ), `_raw_record` (for FileLog) that are filled when ClickHouse fails to parse a new record. The behaviour is controlled by the storage settings `nats_handle_error_mode` for NATS, `rabbitmq_handle_error_mode` for RabbitMQ, and `handle_error_mode` for FileLog, similar to `kafka_handle_error_mode`. If it's set to `default`, an exception will be thrown when ClickHouse fails to parse a record; if it's set to `stream`, the error and raw record will be saved into the virtual columns. Closes [#36035](https://github.com/ClickHouse/ClickHouse/issues/36035). [#55477](https://github.com/ClickHouse/ClickHouse/pull/55477) ([Kruglov Pavel](https://github.com/Avogar)).
* Keeper client improvement: add the `get_all_children_number` command that returns the number of all children nodes under a specific path. [#55485](https://github.com/ClickHouse/ClickHouse/pull/55485) ([guoxiaolong](https://github.com/guoxiaolongzte)).
@@ -74,11 +290,11 @@
* Reduced memory consumption during loading of hierarchical dictionaries. [#55838](https://github.com/ClickHouse/ClickHouse/pull/55838) ([Nikita Taranov](https://github.com/nickitat)).
* All dictionaries support setting `dictionary_use_async_executor`. [#55839](https://github.com/ClickHouse/ClickHouse/pull/55839) ([vdimir](https://github.com/vdimir)).
* Prevent excessive memory usage when deserializing AggregateFunctionTopKGenericData. [#55947](https://github.com/ClickHouse/ClickHouse/pull/55947) ([Raúl Marín](https://github.com/Algunenano)).
* On a Keeper with lots of watches, AsyncMetrics threads can consume 100% of CPU for a noticeable time in `DB::KeeperStorage::getSessionsWithWatchesCount()`. The fix is to avoid traversing heavy `watches` and `list_watches` sets. [#56054](https://github.com/ClickHouse/ClickHouse/pull/56054) ([Alexander Gololobov](https://github.com/davenger)).
* Add setting `optimize_trivial_approximate_count_query` to use `count()` approximation for storage EmbeddedRocksDB. Enable trivial count for StorageJoin. [#55806](https://github.com/ClickHouse/ClickHouse/pull/55806) ([Duc Canh Le](https://github.com/canhld94)).
* On a Keeper with lots of watches, AsyncMetrics threads can consume 100% of CPU for a noticeable time in `DB::KeeperStorage::getSessionsWithWatchesCount`. The fix is to avoid traversing heavy `watches` and `list_watches` sets. [#56054](https://github.com/ClickHouse/ClickHouse/pull/56054) ([Alexander Gololobov](https://github.com/davenger)).
* Add setting `optimize_trivial_approximate_count_query` to use `count` approximation for storage EmbeddedRocksDB. Enable trivial count for StorageJoin. [#55806](https://github.com/ClickHouse/ClickHouse/pull/55806) ([Duc Canh Le](https://github.com/canhld94)).
#### Improvement
* Functions `toDayOfWeek()` (MySQL alias: `DAYOFWEEK()`), `toYearWeek()` (`YEARWEEK()`) and `toWeek()` (`WEEK()`) now support `String` arguments. This makes their behavior consistent with MySQL's behavior. [#55589](https://github.com/ClickHouse/ClickHouse/pull/55589) ([Robert Schulze](https://github.com/rschu1ze)).
* Functions `toDayOfWeek` (MySQL alias: `DAYOFWEEK`), `toYearWeek` (`YEARWEEK`) and `toWeek` (`WEEK`) now support `String` arguments. This makes their behavior consistent with MySQL's behavior. [#55589](https://github.com/ClickHouse/ClickHouse/pull/55589) ([Robert Schulze](https://github.com/rschu1ze)).
* Introduced setting `date_time_overflow_behavior` with possible values `ignore`, `throw`, `saturate` that controls the overflow behavior when converting from Date, Date32, DateTime64, Integer or Float to Date, Date32, DateTime or DateTime64. [#55696](https://github.com/ClickHouse/ClickHouse/pull/55696) ([Andrey Zvonov](https://github.com/zvonand)).
* Implement query parameters support for `ALTER TABLE ... ACTION PARTITION [ID] {parameter_name:ParameterType}`. Merges [#49516](https://github.com/ClickHouse/ClickHouse/issues/49516). Closes [#49449](https://github.com/ClickHouse/ClickHouse/issues/49449). [#55604](https://github.com/ClickHouse/ClickHouse/pull/55604) ([alesapin](https://github.com/alesapin)).
|
||||
* Print processor ids in a prettier manner in EXPLAIN. [#48852](https://github.com/ClickHouse/ClickHouse/pull/48852) ([Vlad Seliverstov](https://github.com/behebot)).
|
||||
@ -112,7 +328,7 @@
|
||||
* Functions `(add|subtract)(Year|Quarter|Month|Week|Day|Hour|Minute|Second|Millisecond|Microsecond|Nanosecond)` now support string-encoded date arguments, e.g. `SELECT addDays('2023-10-22', 1)`. This increases compatibility with MySQL and is needed by Tableau Online. [#55869](https://github.com/ClickHouse/ClickHouse/pull/55869) ([Robert Schulze](https://github.com/rschu1ze)).
|
||||
* The setting `apply_deleted_mask`, when disabled, allows reading rows that were marked as deleted by lightweight DELETE queries. This is useful for debugging. [#55952](https://github.com/ClickHouse/ClickHouse/pull/55952) ([Alexander Gololobov](https://github.com/davenger)).
|
||||
* Allow skipping `null` values when serializing Tuple to JSON objects, which makes it possible to keep compatibility with Spark's `to_json` function; this is also useful for Gluten. [#55956](https://github.com/ClickHouse/ClickHouse/pull/55956) ([李扬](https://github.com/taiyang-li)).
|
||||
* Functions `(add|sub)Date()` now support string-encoded date arguments, e.g. `SELECT addDate('2023-10-22 11:12:13', INTERVAL 5 MINUTE)`. The same support for string-encoded date arguments is added to the plus and minus operators, e.g. `SELECT '2023-10-23' + INTERVAL 1 DAY`. This increases compatibility with MySQL and is needed by Tableau Online. [#55960](https://github.com/ClickHouse/ClickHouse/pull/55960) ([Robert Schulze](https://github.com/rschu1ze)).
|
||||
* Functions `(add|sub)Date` now support string-encoded date arguments, e.g. `SELECT addDate('2023-10-22 11:12:13', INTERVAL 5 MINUTE)`. The same support for string-encoded date arguments is added to the plus and minus operators, e.g. `SELECT '2023-10-23' + INTERVAL 1 DAY`. This increases compatibility with MySQL and is needed by Tableau Online. [#55960](https://github.com/ClickHouse/ClickHouse/pull/55960) ([Robert Schulze](https://github.com/rschu1ze)).
|
||||
* Allow unquoted strings with CR (`\r`) in CSV format. Closes [#39930](https://github.com/ClickHouse/ClickHouse/issues/39930). [#56046](https://github.com/ClickHouse/ClickHouse/pull/56046) ([Kruglov Pavel](https://github.com/Avogar)).
|
||||
* Allow to run `clickhouse-keeper` using embedded config. [#56086](https://github.com/ClickHouse/ClickHouse/pull/56086) ([Maksim Kita](https://github.com/kitaisreal)).
|
||||
* Set a limit on the maximum configuration value for `queued.min.messages` to avoid a problem with starting to fetch data from Kafka. [#56121](https://github.com/ClickHouse/ClickHouse/pull/56121) ([Stas Morozov](https://github.com/r3b-fish)).
|
||||
@ -133,7 +349,7 @@
|
||||
* Fixed a bug where the `match` function (regex) with a pattern containing alternation produced an incorrect key condition. Closes #53222. [#54696](https://github.com/ClickHouse/ClickHouse/pull/54696) ([Yakov Olkhovskiy](https://github.com/yakov-olkhovskiy)).
|
||||
* Fix 'Cannot find column' in read-in-order optimization with ARRAY JOIN [#51746](https://github.com/ClickHouse/ClickHouse/pull/51746) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
|
||||
* Support missed experimental `Object(Nullable(json))` subcolumns in query. [#54052](https://github.com/ClickHouse/ClickHouse/pull/54052) ([zps](https://github.com/VanDarkholme7)).
|
||||
* Re-add fix for `accurateCastOrNull()` [#54629](https://github.com/ClickHouse/ClickHouse/pull/54629) ([Salvatore Mesoraca](https://github.com/aiven-sal)).
|
||||
* Re-add fix for `accurateCastOrNull` [#54629](https://github.com/ClickHouse/ClickHouse/pull/54629) ([Salvatore Mesoraca](https://github.com/aiven-sal)).
|
||||
* Fix detecting `DEFAULT` for columns of a Distributed table created without AS [#55060](https://github.com/ClickHouse/ClickHouse/pull/55060) ([Vitaly Baranov](https://github.com/vitlibar)).
|
||||
* Proper cleanup in case of exception in ctor of ShellCommandSource [#55103](https://github.com/ClickHouse/ClickHouse/pull/55103) ([Alexander Gololobov](https://github.com/davenger)).
|
||||
* Fix deadlock in LDAP assigned role update [#55119](https://github.com/ClickHouse/ClickHouse/pull/55119) ([Julian Maicher](https://github.com/jmaicher)).
|
||||
@ -191,7 +407,7 @@
|
||||
* Add error handler to odbc-bridge [#56185](https://github.com/ClickHouse/ClickHouse/pull/56185) ([Yakov Olkhovskiy](https://github.com/yakov-olkhovskiy)).
|
||||
|
||||
|
||||
### ClickHouse release 23.9, 2023-09-28
|
||||
### <a id="239"></a> ClickHouse release 23.9, 2023-09-28
|
||||
|
||||
#### Backward Incompatible Change
|
||||
* Remove the `status_info` configuration option and dictionaries status from the default Prometheus handler. [#54090](https://github.com/ClickHouse/ClickHouse/pull/54090) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
|
||||
@ -213,7 +429,7 @@
|
||||
* Add function `decodeHTMLComponent`. [#54097](https://github.com/ClickHouse/ClickHouse/pull/54097) ([Bharat Nallan](https://github.com/bharatnc)).
|
||||
* Added `peak_threads_usage` to query_log table. [#54335](https://github.com/ClickHouse/ClickHouse/pull/54335) ([Alexey Gerasimchuck](https://github.com/Demilivor)).
|
||||
* Add `SHOW FUNCTIONS` support to clickhouse-client. [#54337](https://github.com/ClickHouse/ClickHouse/pull/54337) ([Julia Kartseva](https://github.com/wat-ze-hex)).
|
||||
* Added function `toDaysSinceYearZero` with alias `TO_DAYS` (for compatibility with MySQL) which returns the number of days passed since `0001-01-01` (in Proleptic Gregorian Calendar). [#54479](https://github.com/ClickHouse/ClickHouse/pull/54479) ([Robert Schulze](https://github.com/rschu1ze)). Function `toDaysSinceYearZero()` now supports arguments of type `DateTime` and `DateTime64`. [#54856](https://github.com/ClickHouse/ClickHouse/pull/54856) ([Serge Klochkov](https://github.com/slvrtrn)).
|
||||
* Added function `toDaysSinceYearZero` with alias `TO_DAYS` (for compatibility with MySQL) which returns the number of days passed since `0001-01-01` (in Proleptic Gregorian Calendar). [#54479](https://github.com/ClickHouse/ClickHouse/pull/54479) ([Robert Schulze](https://github.com/rschu1ze)). Function `toDaysSinceYearZero` now supports arguments of type `DateTime` and `DateTime64`. [#54856](https://github.com/ClickHouse/ClickHouse/pull/54856) ([Serge Klochkov](https://github.com/slvrtrn)).
|
||||
* Added functions `YYYYMMDDtoDate`, `YYYYMMDDtoDate32`, `YYYYMMDDhhmmssToDateTime` and `YYYYMMDDhhmmssToDateTime64`. They convert a date or date with time encoded as integer (e.g. 20230911) into a native date or date with time. As such, they provide the opposite functionality of existing functions `YYYYMMDDToDate`, `YYYYMMDDToDateTime`, `YYYYMMDDhhmmddToDateTime`, `YYYYMMDDhhmmddToDateTime64`. [#54509](https://github.com/ClickHouse/ClickHouse/pull/54509) ([Quanfa Fu](https://github.com/dentiscalprum)) ([Robert Schulze](https://github.com/rschu1ze)).
|
||||
* Add several string distance functions, including `byteHammingDistance`, `editDistance`. [#54935](https://github.com/ClickHouse/ClickHouse/pull/54935) ([flynn](https://github.com/ucasfl)).
|
||||
* Allow specifying the expiration date and, optionally, the time for user credentials with `VALID UNTIL datetime` clause. [#51261](https://github.com/ClickHouse/ClickHouse/pull/51261) ([Nikolay Degterinsky](https://github.com/evillique)).
|
||||
@ -229,7 +445,7 @@
|
||||
* An optimization to rewrite `COUNT(DISTINCT ...)` and various `uniq` variants to `count` if it is selected from a subquery with GROUP BY. [#52082](https://github.com/ClickHouse/ClickHouse/pull/52082) [#52645](https://github.com/ClickHouse/ClickHouse/pull/52645) ([JackyWoo](https://github.com/JackyWoo)).
|
||||
* Remove manual calls to `mmap/mremap/munmap` and delegate all this work to `jemalloc` - and it slightly improves performance. [#52792](https://github.com/ClickHouse/ClickHouse/pull/52792) ([Nikita Taranov](https://github.com/nickitat)).
|
||||
* Fixed high CPU consumption when working with NATS. [#54399](https://github.com/ClickHouse/ClickHouse/pull/54399) ([Vasilev Pyotr](https://github.com/vahpetr)).
|
||||
* Since we use separate instructions for executing `toString()` with a datetime argument, it is possible to improve performance a bit for non-datetime arguments and make some parts of the code cleaner. Follows up [#53680](https://github.com/ClickHouse/ClickHouse/issues/53680). [#54443](https://github.com/ClickHouse/ClickHouse/pull/54443) ([Yarik Briukhovetskyi](https://github.com/yariks5s)).
|
||||
* Since we use separate instructions for executing `toString` with a datetime argument, it is possible to improve performance a bit for non-datetime arguments and make some parts of the code cleaner. Follows up [#53680](https://github.com/ClickHouse/ClickHouse/issues/53680). [#54443](https://github.com/ClickHouse/ClickHouse/pull/54443) ([Yarik Briukhovetskyi](https://github.com/yariks5s)).
|
||||
* Instead of serializing JSON elements into a `std::stringstream`, this change puts the serialization result into `ColumnString` directly. [#54613](https://github.com/ClickHouse/ClickHouse/pull/54613) ([lgbo](https://github.com/lgbo-ustc)).
|
||||
* Enable ORDER BY optimization for reading data in corresponding order from a MergeTree table in case that the table is behind a view. [#54628](https://github.com/ClickHouse/ClickHouse/pull/54628) ([Vitaly Baranov](https://github.com/vitlibar)).
|
||||
* Improve JSON SQL functions by reusing `GeneratorJSONPath` and removing several shared pointers. [#54735](https://github.com/ClickHouse/ClickHouse/pull/54735) ([lgbo](https://github.com/lgbo-ustc)).
|
||||
@ -479,7 +695,7 @@
|
||||
* The `domainRFC` function now supports IPv6 in square brackets. [#53506](https://github.com/ClickHouse/ClickHouse/pull/53506) ([Chen768959](https://github.com/Chen768959)).
|
||||
* Use longer timeout for S3 CopyObject requests, which are used in backups. [#53533](https://github.com/ClickHouse/ClickHouse/pull/53533) ([Michael Kolupaev](https://github.com/al13n321)).
|
||||
* Added server setting `aggregate_function_group_array_max_element_size`. This setting is used to limit array size for `groupArray` function at serialization. The default value is `16777215`. [#53550](https://github.com/ClickHouse/ClickHouse/pull/53550) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
|
||||
* `SCHEMA()` was added as alias for `DATABASE()` to improve MySQL compatibility. [#53587](https://github.com/ClickHouse/ClickHouse/pull/53587) ([Daniël van Eeden](https://github.com/dveeden)).
|
||||
* `SCHEMA` was added as alias for `DATABASE` to improve MySQL compatibility. [#53587](https://github.com/ClickHouse/ClickHouse/pull/53587) ([Daniël van Eeden](https://github.com/dveeden)).
|
||||
* Add asynchronous metrics about tables in the system database. For example, `TotalBytesOfMergeTreeTablesSystem`. This closes [#53603](https://github.com/ClickHouse/ClickHouse/issues/53603). [#53604](https://github.com/ClickHouse/ClickHouse/pull/53604) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
|
||||
* SQL editor in the Play UI and Dashboard will not use Grammarly. [#53614](https://github.com/ClickHouse/ClickHouse/pull/53614) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
|
||||
* As expert-level settings, it is now possible to (1) configure the size_ratio (i.e. the relative size of the protected queue) of the [index] mark/uncompressed caches, (2) configure the cache policy of the index mark and index uncompressed caches. [#53657](https://github.com/ClickHouse/ClickHouse/pull/53657) ([Robert Schulze](https://github.com/rschu1ze)).
|
||||
@ -741,7 +957,7 @@
|
||||
* Disable expression templates for time intervals [#52335](https://github.com/ClickHouse/ClickHouse/pull/52335) ([Alexander Tokmakov](https://github.com/tavplubix)).
|
||||
* Fix `apply_snapshot` in Keeper [#52358](https://github.com/ClickHouse/ClickHouse/pull/52358) ([Antonio Andelic](https://github.com/antonio2368)).
|
||||
* Update build-osx.md [#52377](https://github.com/ClickHouse/ClickHouse/pull/52377) ([AlexBykovski](https://github.com/AlexBykovski)).
|
||||
* Fix `countSubstrings()` hang with empty needle and a column haystack [#52409](https://github.com/ClickHouse/ClickHouse/pull/52409) ([Sergei Trifonov](https://github.com/serxa)).
|
||||
* Fix `countSubstrings` hang with empty needle and a column haystack [#52409](https://github.com/ClickHouse/ClickHouse/pull/52409) ([Sergei Trifonov](https://github.com/serxa)).
|
||||
* Fix normal projection with merge table [#52432](https://github.com/ClickHouse/ClickHouse/pull/52432) ([Amos Bird](https://github.com/amosbird)).
|
||||
* Fix possible double-free in Aggregator [#52439](https://github.com/ClickHouse/ClickHouse/pull/52439) ([Nikita Taranov](https://github.com/nickitat)).
|
||||
* Fixed inserting into Buffer engine [#52440](https://github.com/ClickHouse/ClickHouse/pull/52440) ([Vasily Nemkov](https://github.com/Enmk)).
|
||||
@ -1585,7 +1801,7 @@
|
||||
* A couple of segfaults have been reported around `c-ares`. They were introduced in my previous pull requests. I have fixed them with the help of Alexander Tokmakov. [#45629](https://github.com/ClickHouse/ClickHouse/pull/45629) ([Arthur Passos](https://github.com/arthurpassos)).
|
||||
* Fix key description when encountering duplicate primary keys. This can happen in projections. See [#45590](https://github.com/ClickHouse/ClickHouse/issues/45590) for details. [#45686](https://github.com/ClickHouse/ClickHouse/pull/45686) ([Amos Bird](https://github.com/amosbird)).
|
||||
* Set compression method and level for backup Closes [#45690](https://github.com/ClickHouse/ClickHouse/issues/45690). [#45737](https://github.com/ClickHouse/ClickHouse/pull/45737) ([Pradeep Chhetri](https://github.com/chhetripradeep)).
|
||||
* Should use `select_query_typed.limitByOffset()` instead of `select_query_typed.limitOffset()`. [#45817](https://github.com/ClickHouse/ClickHouse/pull/45817) ([刘陶峰](https://github.com/taofengliu)).
|
||||
* Should use `select_query_typed.limitByOffset` instead of `select_query_typed.limitOffset`. [#45817](https://github.com/ClickHouse/ClickHouse/pull/45817) ([刘陶峰](https://github.com/taofengliu)).
|
||||
* When using the experimental analyzer, queries like `SELECT number FROM numbers(100) LIMIT 10 OFFSET 10;` got wrong results (an empty result for this SQL). That was caused by an unnecessary offset step added by the planner. [#45822](https://github.com/ClickHouse/ClickHouse/pull/45822) ([刘陶峰](https://github.com/taofengliu)).
|
||||
* Backward compatibility - allow implicit narrowing conversion from UInt64 to IPv4 - required for "INSERT ... VALUES ..." expression. [#45865](https://github.com/ClickHouse/ClickHouse/pull/45865) ([Yakov Olkhovskiy](https://github.com/yakov-olkhovskiy)).
|
||||
* Fix the IPv6 parser for a mixed IPv4 address with a missing first octet (like `::.1.2.3`). [#45871](https://github.com/ClickHouse/ClickHouse/pull/45871) ([Yakov Olkhovskiy](https://github.com/yakov-olkhovskiy)).
|
||||
|
@ -33,10 +33,9 @@ curl https://clickhouse.com/ | sh
|
||||
|
||||
## Upcoming Events
|
||||
|
||||
* [**ClickHouse Meetup in San Francisco**](https://www.meetup.com/clickhouse-silicon-valley-meetup-group/events/296334923/) - Nov 14
|
||||
* [**ClickHouse Meetup in Singapore**](https://www.meetup.com/clickhouse-singapore-meetup-group/events/296334976/) - Nov 15
|
||||
* [**ClickHouse Meetup in Berlin**](https://www.meetup.com/clickhouse-berlin-user-group/events/296488501/) - Nov 30
|
||||
* [**ClickHouse Meetup in NYC**](https://www.meetup.com/clickhouse-new-york-user-group/events/296488779/) - Dec 11
|
||||
* [**ClickHouse Meetup in Sydney**](https://www.meetup.com/clickhouse-sydney-user-group/events/297638812/) - Dec 12
|
||||
* [**ClickHouse Meetup in Boston**](https://www.meetup.com/clickhouse-boston-user-group/events/296488840/) - Dec 12
|
||||
|
||||
Also, keep an eye out for upcoming meetups around the world. Somewhere else you want us to be? Please feel free to reach out to tyler <at> clickhouse <dot> com.
|
||||
|
1
contrib/CMakeLists.txt
vendored
1
contrib/CMakeLists.txt
vendored
@ -44,7 +44,6 @@ else ()
|
||||
endif ()
|
||||
add_contrib (miniselect-cmake miniselect)
|
||||
add_contrib (pdqsort-cmake pdqsort)
|
||||
add_contrib (pocketfft-cmake pocketfft)
|
||||
add_contrib (crc32-vpmsum-cmake crc32-vpmsum)
|
||||
add_contrib (sparsehash-c11-cmake sparsehash-c11)
|
||||
add_contrib (abseil-cpp-cmake abseil-cpp)
|
||||
|
@ -385,9 +385,25 @@ endif ()
|
||||
|
||||
include("${ClickHouse_SOURCE_DIR}/contrib/google-protobuf-cmake/protobuf_generate.cmake")
|
||||
|
||||
# These files needs to be installed to make it possible that users can use well-known protobuf types
|
||||
set(google_proto_files
|
||||
${protobuf_source_dir}/src/google/protobuf/any.proto
|
||||
${protobuf_source_dir}/src/google/protobuf/api.proto
|
||||
${protobuf_source_dir}/src/google/protobuf/descriptor.proto
|
||||
${protobuf_source_dir}/src/google/protobuf/duration.proto
|
||||
${protobuf_source_dir}/src/google/protobuf/empty.proto
|
||||
${protobuf_source_dir}/src/google/protobuf/field_mask.proto
|
||||
${protobuf_source_dir}/src/google/protobuf/source_context.proto
|
||||
${protobuf_source_dir}/src/google/protobuf/struct.proto
|
||||
${protobuf_source_dir}/src/google/protobuf/timestamp.proto
|
||||
${protobuf_source_dir}/src/google/protobuf/type.proto
|
||||
${protobuf_source_dir}/src/google/protobuf/wrappers.proto
|
||||
)
|
||||
|
||||
add_library(_protobuf INTERFACE)
|
||||
target_link_libraries(_protobuf INTERFACE _libprotobuf)
|
||||
target_include_directories(_protobuf INTERFACE "${Protobuf_INCLUDE_DIR}")
|
||||
set_target_properties(_protobuf PROPERTIES google_proto_files "${google_proto_files}")
|
||||
add_library(ch_contrib::protobuf ALIAS _protobuf)
|
||||
|
||||
add_library(_protoc INTERFACE)
|
||||
|
@ -33,7 +33,7 @@ target_include_directories(cxxabi SYSTEM BEFORE
|
||||
PRIVATE $<BUILD_INTERFACE:${LIBCXXABI_SOURCE_DIR}/../libcxx/include>
|
||||
PRIVATE $<BUILD_INTERFACE:${LIBCXXABI_SOURCE_DIR}/../libcxx/src>
|
||||
)
|
||||
target_compile_definitions(cxxabi PRIVATE -D_LIBCPP_BUILDING_LIBRARY)
|
||||
target_compile_definitions(cxxabi PRIVATE -D_LIBCPP_BUILDING_LIBRARY -DHAS_THREAD_LOCAL)
|
||||
target_compile_options(cxxabi PRIVATE -nostdinc++ -fno-sanitize=undefined -Wno-macro-redefined) # If we don't disable UBSan, infinite recursion happens in dynamic_cast.
|
||||
target_link_libraries(cxxabi PUBLIC unwind)
|
||||
|
||||
|
2
contrib/libpqxx
vendored
2
contrib/libpqxx
vendored
@ -1 +1 @@
|
||||
Subproject commit 791d68fd89902835133c50435e380ec7a73271b7
|
||||
Subproject commit c995193a3a14d71f4711f1f421f65a1a1db64640
|
1
contrib/pocketfft
vendored
1
contrib/pocketfft
vendored
@ -1 +0,0 @@
|
||||
Subproject commit 9efd4da52cf8d28d14531d14e43ad9d913807546
|
@ -1,10 +0,0 @@
|
||||
option (ENABLE_POCKETFFT "Enable pocketfft" ${ENABLE_LIBRARIES})
|
||||
|
||||
if (NOT ENABLE_POCKETFFT)
|
||||
message(STATUS "Not using pocketfft")
|
||||
return()
|
||||
endif()
|
||||
|
||||
add_library(_pocketfft INTERFACE)
|
||||
target_include_directories(_pocketfft INTERFACE ${ClickHouse_SOURCE_DIR}/contrib/pocketfft)
|
||||
add_library(ch_contrib::pocketfft ALIAS _pocketfft)
|
2
contrib/qpl
vendored
2
contrib/qpl
vendored
@ -1 +1 @@
|
||||
Subproject commit faaf19350459c076e66bb5df11743c3fade59b73
|
||||
Subproject commit a61bdd845fd7ca363b2bcc55454aa520dfcd8298
|
@ -20,7 +20,8 @@ RUN apt-get update --yes \
|
||||
RUN pip3 install \
|
||||
numpy \
|
||||
pyodbc \
|
||||
deepdiff
|
||||
deepdiff \
|
||||
sqlglot
|
||||
|
||||
ARG odbc_repo="https://github.com/ClickHouse/clickhouse-odbc.git"
|
||||
|
||||
@ -35,7 +36,7 @@ RUN git clone --recursive ${odbc_repo} \
|
||||
&& odbcinst -i -s -l -f /clickhouse-odbc/packaging/odbc.ini.sample
|
||||
|
||||
ENV TZ=Europe/Amsterdam
|
||||
ENV MAX_RUN_TIME=900
|
||||
ENV MAX_RUN_TIME=9000
|
||||
RUN ln -snf /usr/share/zoneinfo/$TZ /etc/localtime && echo $TZ > /etc/timezone
|
||||
|
||||
ARG sqllogic_test_repo="https://github.com/gregrahn/sqllogictest.git"
|
||||
|
@ -75,6 +75,20 @@ function run_tests()
|
||||
cat /test_output/statements-test/check_status.tsv >> /test_output/check_status.tsv
|
||||
cat /test_output/statements-test/test_results.tsv >> /test_output/test_results.tsv
|
||||
tar -zcvf statements-check.tar.gz statements-test 1>/dev/null
|
||||
|
||||
mkdir -p /test_output/complete-test
|
||||
/clickhouse-tests/sqllogic/runner.py \
|
||||
--log-file /test_output/runner-complete-test.log \
|
||||
--log-level info \
|
||||
complete-test \
|
||||
--input-dir /sqllogictest \
|
||||
--out-dir /test_output/complete-test \
|
||||
2>&1 \
|
||||
| ts '%Y-%m-%d %H:%M:%S'
|
||||
|
||||
cat /test_output/complete-test/check_status.tsv >> /test_output/check_status.tsv
|
||||
cat /test_output/complete-test/test_results.tsv >> /test_output/test_results.tsv
|
||||
tar -zcvf complete-check.tar.gz complete-test 1>/dev/null
|
||||
fi
|
||||
}
|
||||
|
||||
|
@ -19,10 +19,14 @@ dpkg -i package_folder/clickhouse-common-static-dbg_*.deb
|
||||
dpkg -i package_folder/clickhouse-server_*.deb
|
||||
dpkg -i package_folder/clickhouse-client_*.deb
|
||||
|
||||
echo "$BUGFIX_VALIDATE_CHECK"
|
||||
|
||||
# Check that the tools are available under short names
|
||||
ch --query "SELECT 1" || exit 1
|
||||
chl --query "SELECT 1" || exit 1
|
||||
chc --version || exit 1
|
||||
if [[ -z "$BUGFIX_VALIDATE_CHECK" ]]; then
|
||||
ch --query "SELECT 1" || exit 1
|
||||
chl --query "SELECT 1" || exit 1
|
||||
chc --version || exit 1
|
||||
fi
|
||||
|
||||
ln -s /usr/share/clickhouse-test/clickhouse-test /usr/bin/clickhouse-test
|
||||
|
||||
@ -46,6 +50,16 @@ fi
|
||||
|
||||
config_logs_export_cluster /etc/clickhouse-server/config.d/system_logs_export.yaml
|
||||
|
||||
if [[ -n "$BUGFIX_VALIDATE_CHECK" ]] && [[ "$BUGFIX_VALIDATE_CHECK" -eq 1 ]]; then
|
||||
sudo cat /etc/clickhouse-server/config.d/zookeeper.xml \
|
||||
| sed "/<use_compression>1<\/use_compression>/d" \
|
||||
> /etc/clickhouse-server/config.d/zookeeper.xml.tmp
|
||||
sudo mv /etc/clickhouse-server/config.d/zookeeper.xml.tmp /etc/clickhouse-server/config.d/zookeeper.xml
|
||||
|
||||
# it contains some new settings, but we can safely remove it
|
||||
rm /etc/clickhouse-server/users.d/s3_cache_new.xml
|
||||
fi
|
||||
|
||||
# For flaky check we also enable thread fuzzer
|
||||
if [ "$NUM_TRIES" -gt "1" ]; then
|
||||
export THREAD_FUZZER_CPU_TIME_PERIOD_US=1000
|
||||
|
@ -191,6 +191,12 @@ sudo cat /etc/clickhouse-server/config.d/logger_trace.xml \
|
||||
> /etc/clickhouse-server/config.d/logger_trace.xml.tmp
|
||||
mv /etc/clickhouse-server/config.d/logger_trace.xml.tmp /etc/clickhouse-server/config.d/logger_trace.xml
|
||||
|
||||
# Randomize async_load_databases
|
||||
if [ $(( $(date +%-d) % 2 )) -eq 1 ]; then
|
||||
sudo echo "<clickhouse><async_load_databases>true</async_load_databases></clickhouse>" \
|
||||
> /etc/clickhouse-server/config.d/enable_async_load_databases.xml
|
||||
fi
|
||||
|
||||
start
|
||||
|
||||
stress --hung-check --drop-databases --output-folder test_output --skip-func-tests "$SKIP_TESTS_OPTION" --global-time-limit 1200 \
|
||||
|
@ -79,6 +79,7 @@ rm /etc/clickhouse-server/config.d/merge_tree.xml
|
||||
rm /etc/clickhouse-server/config.d/enable_wait_for_shutdown_replicated_tables.xml
|
||||
rm /etc/clickhouse-server/users.d/nonconst_timezone.xml
|
||||
rm /etc/clickhouse-server/users.d/s3_cache_new.xml
|
||||
rm /etc/clickhouse-server/users.d/replicated_ddl_entry.xml
|
||||
|
||||
start
|
||||
stop
|
||||
@ -116,6 +117,7 @@ rm /etc/clickhouse-server/config.d/merge_tree.xml
|
||||
rm /etc/clickhouse-server/config.d/enable_wait_for_shutdown_replicated_tables.xml
|
||||
rm /etc/clickhouse-server/users.d/nonconst_timezone.xml
|
||||
rm /etc/clickhouse-server/users.d/s3_cache_new.xml
|
||||
rm /etc/clickhouse-server/users.d/replicated_ddl_entry.xml
|
||||
|
||||
start
|
||||
|
||||
|
@ -39,8 +39,8 @@ If you need to update rows frequently, we recommend using the [`ReplacingMergeTr
|
||||
``` sql
|
||||
CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]
|
||||
(
|
||||
name1 [type1] [[NOT] NULL] [DEFAULT|MATERIALIZED|ALIAS|EPHEMERAL expr1] [COMMENT ...] [CODEC(codec1)] [TTL expr1] [PRIMARY KEY],
|
||||
name2 [type2] [[NOT] NULL] [DEFAULT|MATERIALIZED|ALIAS|EPHEMERAL expr2] [COMMENT ...] [CODEC(codec2)] [TTL expr2] [PRIMARY KEY],
|
||||
name1 [type1] [[NOT] NULL] [DEFAULT|MATERIALIZED|ALIAS|EPHEMERAL expr1] [COMMENT ...] [CODEC(codec1)] [STATISTIC(stat1)] [TTL expr1] [PRIMARY KEY],
|
||||
name2 [type2] [[NOT] NULL] [DEFAULT|MATERIALIZED|ALIAS|EPHEMERAL expr2] [COMMENT ...] [CODEC(codec2)] [STATISTIC(stat2)] [TTL expr2] [PRIMARY KEY],
|
||||
...
|
||||
INDEX index_name1 expr1 TYPE type1(...) [GRANULARITY value1],
|
||||
INDEX index_name2 expr2 TYPE type2(...) [GRANULARITY value2],
|
||||
@ -1358,3 +1358,33 @@ In this sample configuration:
|
||||
- `_partition_value` — Values (a tuple) of a `partition by` expression.
|
||||
- `_sample_factor` — Sample factor (from the query).
|
||||
- `_block_number` — Block number of the row, it is persisted on merges when `allow_experimental_block_number_column` is set to true.
|
||||
|
||||
## Column Statistics (Experimental) {#column-statistics}
|
||||
|
||||
Statistics are declared in the columns section of the `CREATE` query for tables from the `*MergeTree*` family when the setting `allow_experimental_statistic = 1` is enabled.
|
||||
|
||||
``` sql
|
||||
CREATE TABLE example_table
|
||||
(
|
||||
a Int64 STATISTIC(tdigest),
|
||||
b Float64
|
||||
)
|
||||
ENGINE = MergeTree
|
||||
ORDER BY a
|
||||
```
|
||||
|
||||
We can also manipulate statistics with `ALTER` statements.
|
||||
|
||||
```sql
|
||||
ALTER TABLE example_table ADD STATISTIC b TYPE tdigest;
|
||||
ALTER TABLE example_table DROP STATISTIC a TYPE tdigest;
|
||||
```
|
||||
|
||||
These lightweight statistics aggregate information about distribution of values in columns.
|
||||
They can be used for query optimization when the setting `allow_statistic_optimize = 1` is enabled.
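For illustration, a minimal sketch (not taken from the official examples) of exercising these statistics on the `example_table` defined above; whether and how the optimizer reorders conditions is up to the query planner:

```sql
-- Enable the statistics-based optimization for the session and run a filtering
-- query; with a statistic on a filtered column, the optimizer may evaluate the
-- more selective PREWHERE condition first.
SET allow_statistic_optimize = 1;

SELECT count()
FROM example_table
PREWHERE b < 0.5 AND a > 1000;
```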
|
||||
|
||||
#### Available Types of Column Statistics {#available-types-of-column-statistics}
|
||||
|
||||
- `tdigest`
|
||||
|
||||
Stores distribution of values from numeric columns in [TDigest](https://github.com/tdunning/t-digest) sketch.
|
||||
|
@ -56,7 +56,7 @@ On Linux, macOS and FreeBSD:
|
||||
./clickhouse client
|
||||
ClickHouse client version 23.2.1.1501 (official build).
|
||||
Connecting to localhost:9000 as user default.
|
||||
Connected to ClickHouse server version 23.2.1 revision 54461.
|
||||
Connected to ClickHouse server version 23.2.1.
|
||||
|
||||
local-host :)
|
||||
```
|
||||
|
@ -16,7 +16,7 @@ ClickHouse provides a native command-line client: `clickhouse-client`. The clien
|
||||
$ clickhouse-client
|
||||
ClickHouse client version 20.13.1.5273 (official build).
|
||||
Connecting to localhost:9000 as user default.
|
||||
Connected to ClickHouse server version 20.13.1 revision 54442.
|
||||
Connected to ClickHouse server version 20.13.1.
|
||||
|
||||
:)
|
||||
```
|
||||
|
@ -15,6 +15,27 @@ You can monitor:
|
||||
- Utilization of hardware resources.
|
||||
- ClickHouse server metrics.
|
||||
|
||||
## Built-in observability dashboard
|
||||
|
||||
<img width="400" alt="Screenshot 2023-11-12 at 6 08 58 PM" src="https://github.com/ClickHouse/ClickHouse/assets/3936029/2bd10011-4a47-4b94-b836-d44557c7fdc1" />
|
||||
|
||||
ClickHouse comes with a built-in observability dashboard that can be accessed at `$HOST:$PORT/dashboard` (it requires a user and password) and shows the following metrics; a query sketch for approximating one of them follows the list:
|
||||
- Queries/second
|
||||
- CPU usage (cores)
|
||||
- Queries running
|
||||
- Merges running
|
||||
- Selected bytes/second
|
||||
- IO wait
|
||||
- CPU wait
|
||||
- OS CPU Usage (userspace)
|
||||
- OS CPU Usage (kernel)
|
||||
- Read from disk
|
||||
- Read from filesystem
|
||||
- Memory (tracked)
|
||||
- Inserted rows/second
|
||||
- Total MergeTree parts
|
||||
- Max parts for partition
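As a rough, unofficial sketch (these are not the exact queries the dashboard runs), the first chart can be approximated from `system.metric_log`; the `ProfileEvent_Query` column and the one-second collection interval are assumptions about the default setup:

```sql
-- Approximate "Queries/second" over the last hour from system.metric_log.
SELECT
    toStartOfMinute(event_time) AS minute,
    avg(ProfileEvent_Query) AS queries_per_second
FROM system.metric_log
WHERE event_time > now() - INTERVAL 1 HOUR
GROUP BY minute
ORDER BY minute;
```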
|
||||
|
||||
## Resource Utilization {#resource-utilization}
|
||||
|
||||
ClickHouse also monitors the state of hardware resources by itself such as:
|
||||
|
@ -16,9 +16,9 @@ More information about PGO in ClickHouse you can read in the corresponding GitHu
|
||||
|
||||
There are two major kinds of PGO: [Instrumentation](https://clang.llvm.org/docs/UsersManual.html#using-sampling-profilers) and [Sampling](https://clang.llvm.org/docs/UsersManual.html#using-sampling-profilers) (also known as AutoFDO). This guide describes Instrumentation PGO with ClickHouse.
|
||||
|
||||
1. Build ClickHouse in Instrumented mode. In Clang it can be done by passing the `-fprofile-instr-generate` option to `CXXFLAGS`.
|
||||
1. Build ClickHouse in Instrumented mode. In Clang it can be done by passing the `-fprofile-generate` option to `CXXFLAGS`.
|
||||
2. Run instrumented ClickHouse on a sample workload. Here you need to use your usual workload. One of the approaches could be using [ClickBench](https://github.com/ClickHouse/ClickBench) as a sample workload. ClickHouse in the instrumentation mode could work slowly so be ready for that and do not run instrumented ClickHouse in performance-critical environments.
|
||||
3. Recompile ClickHouse once again with `-fprofile-instr-use` compiler flags and profiles that are collected from the previous step.
|
||||
3. Recompile ClickHouse once again with `-fprofile-use` compiler flags and profiles that are collected from the previous step.
|
||||
|
||||
A more detailed guide on how to apply PGO is in the Clang [documentation](https://clang.llvm.org/docs/UsersManual.html#profile-guided-optimization).
|
||||
|
||||
|
@ -1646,6 +1646,45 @@ Default value: `0.5`.
|
||||
|
||||
|
||||
|
||||
## async_load_databases {#async_load_databases}
|
||||
|
||||
Asynchronous loading of databases and tables.
|
||||
|
||||
If `true`, all non-system databases with the `Ordinary`, `Atomic` and `Replicated` engines will be loaded asynchronously after the ClickHouse server starts up. See the `system.async_loader` table and the `tables_loader_background_pool_size` and `tables_loader_foreground_pool_size` server settings. Any query that tries to access a table that is not yet loaded will wait for exactly this table to be started up. If the load job fails, the query will rethrow the error (instead of shutting down the whole server, as happens when `async_load_databases = false`). A table that is waited for by at least one query will be loaded with higher priority. DDL queries on a database will wait for exactly that database to be started up.
|
||||
|
||||
If `false`, all databases are loaded when the server starts.
|
||||
|
||||
The default is `false`.
|
||||
|
||||
**Example**
|
||||
|
||||
``` xml
|
||||
<async_load_databases>true</async_load_databases>
|
||||
```
|
||||
|
||||
## tables_loader_foreground_pool_size {#tables_loader_foreground_pool_size}
|
||||
|
||||
Sets the number of threads performing load jobs in the foreground pool. The foreground pool is used for loading tables synchronously before the server starts listening on a port and for loading tables that are waited for. The foreground pool has higher priority than the background pool, which means that no job starts in the background pool while there are jobs running in the foreground pool.
|
||||
|
||||
Possible values:
|
||||
|
||||
- Any positive integer.
|
||||
- Zero. Use all available CPUs.
|
||||
|
||||
Default value: 0.
|
||||
|
||||
|
||||
## tables_loader_background_pool_size {#tables_loader_background_pool_size}
|
||||
|
||||
Sets the number of threads performing asynchronous load jobs in the background pool. The background pool is used for loading tables asynchronously after server start in case there are no queries waiting for the table. It could be beneficial to keep a low number of threads in the background pool if there are a lot of tables, as it will reserve CPU resources for concurrent query execution.
|
||||
|
||||
Possible values:
|
||||
|
||||
- Any positive integer.
|
||||
- Zero. Use all available CPUs.
|
||||
|
||||
Default value: 0.
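A small sketch for checking the effective values of these loader-related settings on a running server; it assumes the `system.server_settings` table is available in your version:

```sql
SELECT name, value, changed
FROM system.server_settings
WHERE name IN ('async_load_databases',
               'tables_loader_foreground_pool_size',
               'tables_loader_background_pool_size');
```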
|
||||
|
||||
|
||||
## merge_tree {#merge_tree}
|
||||
|
||||
@ -1835,9 +1874,10 @@ Settings:
|
||||
|
||||
- `endpoint` – HTTP endpoint for scraping metrics by prometheus server. Start from ‘/’.
|
||||
- `port` – Port for `endpoint`.
|
||||
- `metrics` – Flag that sets to expose metrics from the [system.metrics](../../operations/system-tables/metrics.md#system_tables-metrics) table.
|
||||
- `events` – Flag that sets to expose metrics from the [system.events](../../operations/system-tables/events.md#system_tables-events) table.
|
||||
- `asynchronous_metrics` – Flag that sets to expose current metrics values from the [system.asynchronous_metrics](../../operations/system-tables/asynchronous_metrics.md#system_tables-asynchronous_metrics) table.
|
||||
- `metrics` – Expose metrics from the [system.metrics](../../operations/system-tables/metrics.md#system_tables-metrics) table.
|
||||
- `events` – Expose metrics from the [system.events](../../operations/system-tables/events.md#system_tables-events) table.
|
||||
- `asynchronous_metrics` – Expose current metrics values from the [system.asynchronous_metrics](../../operations/system-tables/asynchronous_metrics.md#system_tables-asynchronous_metrics) table.
|
||||
- `errors` - Expose the number of errors by error codes occurred since the last server restart. This information could be obtained from the [system.errors](../../operations/system-tables/asynchronous_metrics.md#system_tables-errors) as well.
|
||||
|
||||
**Example**
|
||||
|
||||
@ -1853,6 +1893,7 @@ Settings:
|
||||
<metrics>true</metrics>
|
||||
<events>true</events>
|
||||
<asynchronous_metrics>true</asynchronous_metrics>
|
||||
<errors>true</errors>
|
||||
</prometheus>
|
||||
<!-- highlight-end -->
|
||||
</clickhouse>
|
||||
@ -2350,7 +2391,7 @@ Path on the local filesystem to store temporary data for processing large querie
|
||||
|
||||
## user_files_path {#user_files_path}
|
||||
|
||||
The directory with user files. Used in the table function [file()](../../sql-reference/table-functions/file.md).
|
||||
The directory with user files. Used in the table function [file()](../../sql-reference/table-functions/file.md), [fileCluster()](../../sql-reference/table-functions/fileCluster.md).
|
||||
|
||||
**Example**
|
||||
|
||||
|
@ -149,7 +149,7 @@ Possible values:
|
||||
- Any positive integer.
|
||||
- 0 (disable deduplication)
|
||||
|
||||
Default value: 100.
|
||||
Default value: 1000.
|
||||
|
||||
The `Insert` command creates one or more blocks (parts). For [insert deduplication](../../engines/table-engines/mergetree-family/replication.md), when writing into replicated tables, ClickHouse writes the hash sums of the created parts into ClickHouse Keeper. Hash sums are stored only for the most recent `replicated_deduplication_window` blocks. The oldest hash sums are removed from ClickHouse Keeper.
|
||||
A large number of `replicated_deduplication_window` slows down `Inserts` because it needs to compare more entries.
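A hypothetical sketch of overriding the window for a single table; the table name and Keeper path below are made up for illustration:

```sql
CREATE TABLE dedup_demo
(
    id UInt64,
    payload String
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/dedup_demo', '{replica}')
ORDER BY id
SETTINGS replicated_deduplication_window = 100;  -- per-table override of the setting
```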
|
||||
|
@ -4801,6 +4801,14 @@ a Tuple(
|
||||
)
|
||||
```
|
||||
|
||||
## allow_experimental_statistic {#allow_experimental_statistic}
|
||||
|
||||
Allows defining columns with [statistics](../../engines/table-engines/mergetree-family/mergetree.md#table_engine-mergetree-creating-a-table) and [manipulating statistics](../../engines/table-engines/mergetree-family/mergetree.md#column-statistics).
|
||||
|
||||
## allow_statistic_optimize {#allow_statistic_optimize}
|
||||
|
||||
Allows using statistics to optimize the order of [prewhere conditions](../../sql-reference/statements/select/prewhere.md).
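A minimal sketch of enabling both settings for a session before creating tables with statistics and querying them:

```sql
SET allow_experimental_statistic = 1;  -- allow STATISTIC(...) in column definitions
SET allow_statistic_optimize = 1;      -- let the optimizer use the collected statistics
```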
|
||||
|
||||
## analyze_index_with_space_filling_curves
|
||||
|
||||
If a table has a space-filling curve in its index, e.g. `ORDER BY mortonEncode(x, y)`, and the query has conditions on its arguments, e.g. `x >= 10 AND x <= 20 AND y >= 20 AND y <= 30`, use the space-filling curve for index analysis.
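For illustration, a sketch with a hypothetical table whose primary key is a Morton curve, matching the example conditions above:

```sql
CREATE TABLE points
(
    x UInt32,
    y UInt32
)
ENGINE = MergeTree
ORDER BY mortonEncode(x, y);

SELECT count()
FROM points
WHERE x >= 10 AND x <= 20 AND y >= 20 AND y <= 30
SETTINGS analyze_index_with_space_filling_curves = 1;  -- use the curve for index analysis
```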
|
||||
|
54
docs/en/operations/system-tables/async_loader.md
Normal file
54
docs/en/operations/system-tables/async_loader.md
Normal file
@ -0,0 +1,54 @@
|
||||
---
|
||||
slug: /en/operations/system-tables/async_loader
|
||||
---
|
||||
# async_loader
|
||||
|
||||
Contains information and status for recent asynchronous jobs (e.g. for table loading). The table contains a row for every job. There is a tool `utils/async_loader_graph` for visualizing information from this table.
|
||||
|
||||
Example:
|
||||
|
||||
``` sql
|
||||
SELECT *
|
||||
FROM system.async_loader
|
||||
FORMAT Vertical
|
||||
LIMIT 1
|
||||
```
|
||||
|
||||
``` text
|
||||
```
|
||||
|
||||
Columns:
|
||||
|
||||
- `job` (`String`) - Job name (may be not unique).
|
||||
- `job_id` (`UInt64`) - Unique ID of the job.
|
||||
- `dependencies` (`Array(UInt64)`) - List of IDs of jobs that should be done before this job.
|
||||
- `dependencies_left` (`UInt64`) - Current number of dependencies left to be done.
|
||||
- `status` (`Enum`) - Current load status of a job:
|
||||
`PENDING`: Load job is not started yet.
|
||||
`OK`: Load job executed and was successful.
|
||||
`FAILED`: Load job executed and failed.
|
||||
`CANCELED`: Load job is not going to be executed due to removal or dependency failure.
|
||||
|
||||
A pending job might be in one of the following states:
|
||||
- `is_executing` (`UInt8`) - The job is currently being executed by a worker.
|
||||
- `is_blocked` (`UInt8`) - The job waits for its dependencies to be done.
|
||||
- `is_ready` (`UInt8`) - The job is ready to be executed and waits for a worker.
|
||||
- `elapsed` (`Float64`) - Seconds elapsed since start of execution. Zero if job is not started. Total execution time if job finished.
|
||||
|
||||
Every job has a pool associated with it and is started in this pool. Each pool has a constant priority and a mutable maximum number of workers. Higher priority (lower `priority` value) jobs are run first. No job with lower priority is started while there is at least one higher priority job ready or executing. Job priority can be elevated (but cannot be lowered) by prioritizing it. For example, jobs for table loading and startup will be prioritized if an incoming query requires this table. It is possible to prioritize a job during its execution, but the job is not moved from its `execution_pool` to the newly assigned `pool`. The job uses `pool` for creating new jobs to avoid priority inversion. Already started jobs are not preempted by higher priority jobs and always run to completion after start.
|
||||
- `pool_id` (`UInt64`) - ID of a pool currently assigned to the job.
|
||||
- `pool` (`String`) - Name of `pool_id` pool.
|
||||
- `priority` (`Int64`) - Priority of `pool_id` pool.
|
||||
- `execution_pool_id` (`UInt64`) - ID of a pool the job is executed in. Equals initially assigned pool before execution starts.
|
||||
- `execution_pool` (`String`) - Name of `execution_pool_id` pool.
|
||||
- `execution_priority` (`Int64`) - Priority of `execution_pool_id` pool.
|
||||
|
||||
- `ready_seqno` (`Nullable(UInt64)`) - Not null for ready jobs. A worker pulls the next job to be executed from the ready queue of its pool. If there are multiple ready jobs, then the job with the lowest value of `ready_seqno` is picked.
|
||||
- `waiters` (`UInt64`) - The number of threads waiting on this job.
|
||||
- `exception` (`Nullable(String)`) - Not null for failed and canceled jobs. Holds the error message raised during query execution, or the error leading to the cancellation of this job, along with the dependency failure chain of job names.
|
||||
|
||||
Time instants during job lifetime:
|
||||
- `schedule_time` (`DateTime64`) - Time when job was created and scheduled to be executed (usually with all its dependencies).
|
||||
- `enqueue_time` (`Nullable(DateTime64)`) - Time when the job became ready and was enqueued into the ready queue of its pool. Null if the job is not ready yet.
|
||||
- `start_time` (`Nullable(DateTime64)`) - Time when a worker dequeued the job from the ready queue and started its execution. Null if the job is not started yet.
|
||||
- `finish_time` (`Nullable(DateTime64)`) - Time when job execution is finished. Null if the job is not finished yet.
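A small query sketch over the columns described above, e.g. to spot jobs that have not completed successfully (assuming the loader has something to report):

```sql
SELECT job, pool, status, dependencies_left, elapsed, exception
FROM system.async_loader
WHERE status != 'OK'
ORDER BY schedule_time
LIMIT 20;
```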
|
@ -13,6 +13,7 @@ ClickHouse does not delete data from the table automatically. See [Introduction]
|
||||
|
||||
Columns:
|
||||
|
||||
- `hostname` ([LowCardinality(String)](../../sql-reference/data-types/string.md)) — Hostname of the server executing the query.
|
||||
- `event_date` ([Date](../../sql-reference/data-types/date.md)) — The date when the async insert happened.
|
||||
- `event_time` ([DateTime](../../sql-reference/data-types/datetime.md)) — The date and time when the async insert finished execution.
|
||||
- `event_time_microseconds` ([DateTime64](../../sql-reference/data-types/datetime64.md)) — The date and time when the async insert finished execution with microseconds precision.
|
||||
@ -42,6 +43,7 @@ SELECT * FROM system.asynchronous_insert_log LIMIT 1 \G;
|
||||
Result:
|
||||
|
||||
``` text
|
||||
hostname: clickhouse.eu-central1.internal
|
||||
event_date: 2023-06-08
|
||||
event_time: 2023-06-08 10:08:53
|
||||
event_time_microseconds: 2023-06-08 10:08:53.199516
|
||||
|
@ -7,6 +7,7 @@ Contains the historical values for `system.asynchronous_metrics`, which are save
|
||||
|
||||
Columns:
|
||||
|
||||
- `hostname` ([LowCardinality(String)](../../sql-reference/data-types/string.md)) — Hostname of the server executing the query.
|
||||
- `event_date` ([Date](../../sql-reference/data-types/date.md)) — Event date.
|
||||
- `event_time` ([DateTime](../../sql-reference/data-types/datetime.md)) — Event time.
|
||||
- `name` ([String](../../sql-reference/data-types/string.md)) — Metric name.
|
||||
@ -15,22 +16,33 @@ Columns:
|
||||
**Example**
|
||||
|
||||
``` sql
|
||||
SELECT * FROM system.asynchronous_metric_log LIMIT 10
|
||||
SELECT * FROM system.asynchronous_metric_log LIMIT 3 \G
|
||||
```
|
||||
|
||||
``` text
|
||||
┌─event_date─┬──────────event_time─┬─name─────────────────────────────────────┬─────value─┐
|
||||
│ 2020-09-05 │ 2020-09-05 15:56:30 │ CPUFrequencyMHz_0 │ 2120.9 │
|
||||
│ 2020-09-05 │ 2020-09-05 15:56:30 │ jemalloc.arenas.all.pmuzzy │ 743 │
|
||||
│ 2020-09-05 │ 2020-09-05 15:56:30 │ jemalloc.arenas.all.pdirty │ 26288 │
|
||||
│ 2020-09-05 │ 2020-09-05 15:56:30 │ jemalloc.background_thread.run_intervals │ 0 │
|
||||
│ 2020-09-05 │ 2020-09-05 15:56:30 │ jemalloc.background_thread.num_runs │ 0 │
|
||||
│ 2020-09-05 │ 2020-09-05 15:56:30 │ jemalloc.retained │ 60694528 │
|
||||
│ 2020-09-05 │ 2020-09-05 15:56:30 │ jemalloc.mapped │ 303161344 │
|
||||
│ 2020-09-05 │ 2020-09-05 15:56:30 │ jemalloc.resident │ 260931584 │
|
||||
│ 2020-09-05 │ 2020-09-05 15:56:30 │ jemalloc.metadata │ 12079488 │
|
||||
│ 2020-09-05 │ 2020-09-05 15:56:30 │ jemalloc.allocated │ 133756128 │
|
||||
└────────────┴─────────────────────┴──────────────────────────────────────────┴───────────┘
|
||||
Row 1:
|
||||
──────
|
||||
hostname: clickhouse.eu-central1.internal
|
||||
event_date: 2023-11-14
|
||||
event_time: 2023-11-14 14:39:07
|
||||
metric: AsynchronousHeavyMetricsCalculationTimeSpent
|
||||
value: 0.001
|
||||
|
||||
Row 2:
|
||||
──────
|
||||
hostname: clickhouse.eu-central1.internal
|
||||
event_date: 2023-11-14
|
||||
event_time: 2023-11-14 14:39:08
|
||||
metric: AsynchronousHeavyMetricsCalculationTimeSpent
|
||||
value: 0
|
||||
|
||||
Row 3:
|
||||
──────
|
||||
hostname: clickhouse.eu-central1.internal
|
||||
event_date: 2023-11-14
|
||||
event_time: 2023-11-14 14:39:09
|
||||
metric: AsynchronousHeavyMetricsCalculationTimeSpent
|
||||
value: 0
|
||||
```
|
||||
|
||||
**See Also**
|
||||
|
@ -7,6 +7,7 @@ Contains logging entries with the information about `BACKUP` and `RESTORE` opera
|
||||
|
||||
Columns:
|
||||
|
||||
- `hostname` ([LowCardinality(String)](../../sql-reference/data-types/string.md)) — Hostname of the server executing the query.
|
||||
- `event_date` ([Date](../../sql-reference/data-types/date.md)) — Date of the entry.
|
||||
- `event_time_microseconds` ([DateTime64](../../sql-reference/data-types/datetime64.md)) — Time of the entry with microseconds precision.
|
||||
- `id` ([String](../../sql-reference/data-types/string.md)) — Identifier of the backup or restore operation.
|
||||
@ -45,6 +46,7 @@ SELECT * FROM system.backup_log WHERE id = 'e5b74ecb-f6f1-426a-80be-872f90043885
|
||||
```response
|
||||
Row 1:
|
||||
──────
|
||||
hostname: clickhouse.eu-central1.internal
|
||||
event_date: 2023-08-19
|
||||
event_time_microseconds: 2023-08-19 11:05:21.998566
|
||||
id: e5b74ecb-f6f1-426a-80be-872f90043885
|
||||
@ -63,6 +65,7 @@ bytes_read: 0
|
||||
|
||||
Row 2:
|
||||
──────
|
||||
hostname: clickhouse.eu-central1.internal
|
||||
event_date: 2023-08-19
|
||||
event_time_microseconds: 2023-08-19 11:08:56.916192
|
||||
id: e5b74ecb-f6f1-426a-80be-872f90043885
|
||||
@ -93,6 +96,7 @@ SELECT * FROM system.backup_log WHERE id = 'cdf1f731-52ef-42da-bc65-2e1bfcd4ce90
|
||||
```response
|
||||
Row 1:
|
||||
──────
|
||||
hostname: clickhouse.eu-central1.internal
|
||||
event_date: 2023-08-19
|
||||
event_time_microseconds: 2023-08-19 11:09:19.718077
|
||||
id: cdf1f731-52ef-42da-bc65-2e1bfcd4ce90
|
||||
@ -111,6 +115,7 @@ bytes_read: 0
|
||||
|
||||
Row 2:
|
||||
──────
|
||||
hostname: clickhouse.eu-central1.internal
|
||||
event_date: 2023-08-19
|
||||
event_time_microseconds: 2023-08-19 11:09:29.334234
|
||||
id: cdf1f731-52ef-42da-bc65-2e1bfcd4ce90
|
||||
|
@ -7,6 +7,7 @@ Contains information about stack traces for fatal errors. The table does not exi
|
||||
|
||||
Columns:
|
||||
|
||||
- `hostname` ([LowCardinality(String)](../../sql-reference/data-types/string.md)) — Hostname of the server executing the query.
|
||||
- `event_date` ([DateTime](../../sql-reference/data-types/datetime.md)) — Date of the event.
|
||||
- `event_time` ([DateTime](../../sql-reference/data-types/datetime.md)) — Time of the event.
|
||||
- `timestamp_ns` ([UInt64](../../sql-reference/data-types/int-uint.md)) — Timestamp of the event with nanoseconds.
|
||||
@ -32,6 +33,7 @@ Result (not full):
|
||||
``` text
|
||||
Row 1:
|
||||
──────
|
||||
hostname: clickhouse.eu-central1.internal
|
||||
event_date: 2020-10-14
|
||||
event_time: 2020-10-14 15:47:40
|
||||
timestamp_ns: 1602679660271312710
|
||||
|
@ -6,6 +6,7 @@ slug: /en/operations/system-tables/metric_log
|
||||
Contains history of metrics values from tables `system.metrics` and `system.events`, periodically flushed to disk.
|
||||
|
||||
Columns:
|
||||
- `hostname` ([LowCardinality(String)](../../sql-reference/data-types/string.md)) — Hostname of the server executing the query.
|
||||
- `event_date` ([Date](../../sql-reference/data-types/date.md)) — Event date.
|
||||
- `event_time` ([DateTime](../../sql-reference/data-types/datetime.md)) — Event time.
|
||||
- `event_time_microseconds` ([DateTime64](../../sql-reference/data-types/datetime64.md)) — Event time with microseconds resolution.
|
||||
@ -19,6 +20,7 @@ SELECT * FROM system.metric_log LIMIT 1 FORMAT Vertical;
|
||||
``` text
|
||||
Row 1:
|
||||
──────
|
||||
hostname: clickhouse.eu-central1.internal
|
||||
event_date: 2020-09-05
|
||||
event_time: 2020-09-05 16:22:33
|
||||
event_time_microseconds: 2020-09-05 16:22:33.196807
|
||||
|
@ -45,6 +45,22 @@ Number of threads in the Aggregator thread pool.
|
||||
|
||||
Number of threads in the Aggregator thread pool running a task.
|
||||
|
||||
### TablesLoaderForegroundThreads
|
||||
|
||||
Number of threads in the async loader foreground thread pool.
|
||||
|
||||
### TablesLoaderForegroundThreadsActive
|
||||
|
||||
Number of threads in the async loader foreground thread pool running a task.
|
||||
|
||||
### TablesLoaderBackgroundThreads
|
||||
|
||||
Number of threads in the async loader background thread pool.
|
||||
|
||||
### TablesLoaderBackgroundThreadsActive
|
||||
|
||||
Number of threads in the async loader background thread pool running a task.
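A sketch for reading the current values of these loader thread pool metrics from `system.metrics`:

```sql
SELECT metric, value
FROM system.metrics
WHERE metric LIKE 'TablesLoader%Threads%';
```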
|
||||
|
||||
### AsyncInsertCacheSize
|
||||
|
||||
Number of async insert hash id in cache
|
||||
@ -197,14 +213,6 @@ Number of threads in the DatabaseOnDisk thread pool.
|
||||
|
||||
Number of threads in the DatabaseOnDisk thread pool running a task.
|
||||
|
||||
### DatabaseOrdinaryThreads
|
||||
|
||||
Number of threads in the Ordinary database thread pool.
|
||||
|
||||
### DatabaseOrdinaryThreadsActive
|
||||
|
||||
Number of threads in the Ordinary database thread pool running a task.
|
||||
|
||||
### DelayedInserts
|
||||
|
||||
Number of INSERT queries that are throttled due to high number of active data parts for partition in a MergeTree table.
|
||||
@ -625,14 +633,6 @@ Number of connections that are sending data for external tables to remote server
|
||||
|
||||
Number of connections that are sending data for scalars to remote servers.
|
||||
|
||||
### StartupSystemTablesThreads
|
||||
|
||||
Number of threads in the StartupSystemTables thread pool.
|
||||
|
||||
### StartupSystemTablesThreadsActive
|
||||
|
||||
Number of threads in the StartupSystemTables thread pool running a task.
|
||||
|
||||
### StorageBufferBytes
|
||||
|
||||
Number of bytes in buffers of Buffer tables
|
||||
@ -677,14 +677,6 @@ Number of threads in the system.replicas thread pool running a task.
|
||||
|
||||
Number of connections to TCP server (clients with native interface), also included server-server distributed query connections
|
||||
|
||||
### TablesLoaderThreads
|
||||
|
||||
Number of threads in the tables loader thread pool.
|
||||
|
||||
### TablesLoaderThreadsActive
|
||||
|
||||
Number of threads in the tables loader thread pool running a task.
|
||||
|
||||
### TablesToDropQueueSize
|
||||
|
||||
Number of dropped tables, that are waiting for background data removal.
|
||||
|
@ -31,3 +31,26 @@ SELECT * FROM system.numbers LIMIT 10;
|
||||
|
||||
10 rows in set. Elapsed: 0.001 sec.
|
||||
```
|
||||
|
||||
You can also limit the output by predicates.
|
||||
|
||||
```sql
|
||||
SELECT * FROM system.numbers < 10;
|
||||
```
|
||||
|
||||
```response
|
||||
┌─number─┐
|
||||
│ 0 │
|
||||
│ 1 │
|
||||
│ 2 │
|
||||
│ 3 │
|
||||
│ 4 │
|
||||
│ 5 │
|
||||
│ 6 │
|
||||
│ 7 │
|
||||
│ 8 │
|
||||
│ 9 │
|
||||
└────────┘
|
||||
|
||||
10 rows in set. Elapsed: 0.001 sec.
|
||||
```
|
||||
|
@ -8,28 +8,19 @@ Contains information about [trace spans](https://opentracing.io/docs/overview/sp
|
||||
Columns:
|
||||
|
||||
- `trace_id` ([UUID](../../sql-reference/data-types/uuid.md)) — ID of the trace for executed query.
|
||||
|
||||
- `span_id` ([UInt64](../../sql-reference/data-types/int-uint.md)) — ID of the `trace span`.
|
||||
|
||||
- `parent_span_id` ([UInt64](../../sql-reference/data-types/int-uint.md)) — ID of the parent `trace span`.
|
||||
|
||||
- `operation_name` ([String](../../sql-reference/data-types/string.md)) — The name of the operation.
|
||||
|
||||
- `kind` ([Enum8](../../sql-reference/data-types/enum.md)) — The [SpanKind](https://opentelemetry.io/docs/reference/specification/trace/api/#spankind) of the span.
|
||||
- `INTERNAL` — Indicates that the span represents an internal operation within an application.
|
||||
- `SERVER` — Indicates that the span covers server-side handling of a synchronous RPC or other remote request.
|
||||
- `CLIENT` — Indicates that the span describes a request to some remote service.
|
||||
- `PRODUCER` — Indicates that the span describes the initiators of an asynchronous request. This parent span will often end before the corresponding child CONSUMER span, possibly even before the child span starts.
|
||||
- `CONSUMER` - Indicates that the span describes a child of an asynchronous PRODUCER request.
|
||||
|
||||
- `start_time_us` ([UInt64](../../sql-reference/data-types/int-uint.md)) — The start time of the `trace span` (in microseconds).
|
||||
|
||||
- `finish_time_us` ([UInt64](../../sql-reference/data-types/int-uint.md)) — The finish time of the `trace span` (in microseconds).
|
||||
|
||||
- `finish_date` ([Date](../../sql-reference/data-types/date.md)) — The finish date of the `trace span`.
|
||||
|
||||
- `attribute.names` ([Array](../../sql-reference/data-types/array.md)([String](../../sql-reference/data-types/string.md))) — [Attribute](https://opentelemetry.io/docs/go/instrumentation/#attributes) names depending on the `trace span`. They are filled in according to the recommendations in the [OpenTelemetry](https://opentelemetry.io/) standard.
|
||||
|
||||
- `attribute.values` ([Array](../../sql-reference/data-types/array.md)([String](../../sql-reference/data-types/string.md))) — Attribute values depending on the `trace span`. They are filled in according to the recommendations in the `OpenTelemetry` standard.
|
||||
|
||||
**Example**
|
||||
|
@ -9,6 +9,7 @@ This table contains information about events that occurred with [data parts](../
|
||||
|
||||
The `system.part_log` table contains the following columns:
|
||||
|
||||
- `hostname` ([LowCardinality(String)](../../sql-reference/data-types/string.md)) — Hostname of the server executing the query.
|
||||
- `query_id` ([String](../../sql-reference/data-types/string.md)) — Identifier of the `INSERT` query that created this data part.
|
||||
- `event_type` ([Enum8](../../sql-reference/data-types/enum.md)) — Type of the event that occurred with the data part. Can have one of the following values:
|
||||
- `NewPart` — Inserting of a new data part.
|
||||
@ -56,13 +57,14 @@ SELECT * FROM system.part_log LIMIT 1 FORMAT Vertical;
|
||||
``` text
|
||||
Row 1:
|
||||
──────
|
||||
hostname: clickhouse.eu-central1.internal
|
||||
query_id: 983ad9c7-28d5-4ae1-844e-603116b7de31
|
||||
event_type: NewPart
|
||||
merge_reason: NotAMerge
|
||||
merge_algorithm: Undecided
|
||||
event_date: 2021-02-02
|
||||
event_time: 2021-02-02 11:14:28
|
||||
event_time_microseconds: 2021-02-02 11:14:28.861919
|
||||
event_time_microseconds: 2021-02-02 11:14:28.861919
|
||||
duration_ms: 35
|
||||
database: default
|
||||
table: log_mt_2
|
||||
|
@ -4,6 +4,7 @@ This table contains profiling on processors level (that you can find in [`EXPLAI
|
||||
|
||||
Columns:
|
||||
|
||||
- `hostname` ([LowCardinality(String)](../../sql-reference/data-types/string.md)) — Hostname of the server executing the query.
|
||||
- `event_date` ([Date](../../sql-reference/data-types/date.md)) — The date when the event happened.
|
||||
- `event_time` ([DateTime](../../sql-reference/data-types/datetime.md)) — The date and time when the event happened.
|
||||
- `event_time_microseconds` ([DateTime64](../../sql-reference/data-types/datetime64.md)) — The date and time with microseconds precision when the event happened.
|
||||
|
@ -34,6 +34,7 @@ You can use the [log_formatted_queries](../../operations/settings/settings.md#se
|
||||
|
||||
Columns:
|
||||
|
||||
- `hostname` ([LowCardinality(String)](../../sql-reference/data-types/string.md)) — Hostname of the server executing the query.
|
||||
- `type` ([Enum8](../../sql-reference/data-types/enum.md)) — Type of an event that occurred when executing the query. Values:
|
||||
- `'QueryStart' = 1` — Successful start of query execution.
|
||||
- `'QueryFinish' = 2` — Successful end of query execution.
|
||||
@ -127,6 +128,7 @@ SELECT * FROM system.query_log WHERE type = 'QueryFinish' ORDER BY query_start_t
|
||||
``` text
|
||||
Row 1:
|
||||
──────
|
||||
hostname: clickhouse.eu-central1.internal
|
||||
type: QueryFinish
|
||||
event_date: 2021-11-03
|
||||
event_time: 2021-11-03 16:13:54
|
||||
@ -167,7 +169,7 @@ initial_query_start_time: 2021-11-03 16:13:54
|
||||
initial_query_start_time_microseconds: 2021-11-03 16:13:54.952325
|
||||
interface: 1
|
||||
os_user: sevirov
|
||||
client_hostname: clickhouse.ru-central1.internal
|
||||
client_hostname: clickhouse.eu-central1.internal
|
||||
client_name: ClickHouse
|
||||
client_revision: 54449
|
||||
client_version_major: 21
|
||||
|
@ -18,6 +18,7 @@ You can use the [log_queries_probability](../../operations/settings/settings.md#
|
||||
|
||||
Columns:
|
||||
|
||||
- `hostname` ([LowCardinality(String)](../../sql-reference/data-types/string.md)) — Hostname of the server executing the query.
|
||||
- `event_date` ([Date](../../sql-reference/data-types/date.md)) — The date when the thread has finished execution of the query.
|
||||
- `event_time` ([DateTime](../../sql-reference/data-types/datetime.md)) — The date and time when the thread has finished execution of the query.
|
||||
- `event_time_microsecinds` ([DateTime](../../sql-reference/data-types/datetime.md)) — The date and time when the thread has finished execution of the query with microseconds precision.
|
||||
@ -74,6 +75,7 @@ Columns:
|
||||
``` text
|
||||
Row 1:
|
||||
──────
|
||||
hostname: clickhouse.eu-central1.internal
|
||||
event_date: 2020-09-11
|
||||
event_time: 2020-09-11 10:08:17
|
||||
event_time_microseconds: 2020-09-11 10:08:17.134042
|
||||
|
@ -18,6 +18,7 @@ You can use the [log_queries_probability](../../operations/settings/settings.md#
|
||||
|
||||
Columns:
|
||||
|
||||
- `hostname` ([LowCardinality(String)](../../sql-reference/data-types/string.md)) — Hostname of the server executing the query.
|
||||
- `event_date` ([Date](../../sql-reference/data-types/date.md)) — The date when the last event of the view happened.
|
||||
- `event_time` ([DateTime](../../sql-reference/data-types/datetime.md)) — The date and time when the view finished execution.
|
||||
- `event_time_microseconds` ([DateTime](../../sql-reference/data-types/datetime.md)) — The date and time when the view finished execution with microseconds precision.
|
||||
@ -59,6 +60,7 @@ Result:
|
||||
``` text
|
||||
Row 1:
|
||||
──────
|
||||
hostname: clickhouse.eu-central1.internal
|
||||
event_date: 2021-06-22
|
||||
event_time: 2021-06-22 13:23:07
|
||||
event_time_microseconds: 2021-06-22 13:23:07.738221
|
||||
|
@ -7,6 +7,7 @@ Contains information about all successful and failed login and logout events.
|
||||
|
||||
Columns:
|
||||
|
||||
- `hostname` ([LowCardinality(String)](../../sql-reference/data-types/string.md)) — Hostname of the server executing the query.
|
||||
- `type` ([Enum8](../../sql-reference/data-types/enum.md)) — Login/logout result. Possible values:
|
||||
- `LoginFailure` — Login error.
|
||||
- `LoginSuccess` — Successful login.
|
||||
@ -57,6 +58,7 @@ Result:
|
||||
``` text
|
||||
Row 1:
|
||||
──────
|
||||
hostname: clickhouse.eu-central1.internal
|
||||
type: LoginSuccess
|
||||
auth_id: 45e6bd83-b4aa-4a23-85e6-bd83b4aa1a23
|
||||
session_id:
|
||||
|
@ -7,6 +7,7 @@ Contains logging entries. The logging level which goes to this table can be limi
|
||||
|
||||
Columns:
|
||||
|
||||
- `hostname` ([LowCardinality(String)](../../sql-reference/data-types/string.md)) — Hostname of the server executing the query.
|
||||
- `event_date` (Date) — Date of the entry.
|
||||
- `event_time` (DateTime) — Time of the entry.
|
||||
- `event_time_microseconds` (DateTime) — Time of the entry with microseconds precision.
|
||||
@ -39,6 +40,7 @@ SELECT * FROM system.text_log LIMIT 1 \G
|
||||
``` text
|
||||
Row 1:
|
||||
──────
|
||||
hostname: clickhouse.eu-central1.internal
|
||||
event_date: 2020-09-10
|
||||
event_time: 2020-09-10 11:23:07
|
||||
event_time_microseconds: 2020-09-10 11:23:07.871397
|
||||
|
@ -12,37 +12,27 @@ To analyze logs, use the `addressToLine`, `addressToLineWithInlines`, `addressTo
|
||||
|
||||
Columns:
|
||||
|
||||
- `hostname` ([LowCardinality(String)](../../sql-reference/data-types/string.md)) — Hostname of the server executing the query.
|
||||
- `event_date` ([Date](../../sql-reference/data-types/date.md)) — Date of sampling moment.
|
||||
|
||||
- `event_time` ([DateTime](../../sql-reference/data-types/datetime.md)) — Timestamp of the sampling moment.
|
||||
|
||||
- `event_time_microseconds` ([DateTime64](../../sql-reference/data-types/datetime64.md)) — Timestamp of the sampling moment with microseconds precision.
|
||||
|
||||
- `timestamp_ns` ([UInt64](../../sql-reference/data-types/int-uint.md)) — Timestamp of the sampling moment in nanoseconds.
|
||||
|
||||
- `revision` ([UInt32](../../sql-reference/data-types/int-uint.md)) — ClickHouse server build revision.
|
||||
|
||||
When connecting to the server by `clickhouse-client`, you see the string similar to `Connected to ClickHouse server version 19.18.1 revision 54429.`. This field contains the `revision`, but not the `version` of a server.
|
||||
When connecting to the server by `clickhouse-client`, you see the string similar to `Connected to ClickHouse server version 19.18.1.`. This field contains the `revision`, but not the `version` of a server.
|
||||
|
||||
- `trace_type` ([Enum8](../../sql-reference/data-types/enum.md)) — Trace type:
|
||||
|
||||
- `Real` represents collecting stack traces by wall-clock time.
|
||||
- `CPU` represents collecting stack traces by CPU time.
|
||||
- `Memory` represents collecting allocations and deallocations when memory allocation exceeds the subsequent watermark.
|
||||
- `MemorySample` represents collecting random allocations and deallocations.
|
||||
- `MemoryPeak` represents collecting updates of peak memory usage.
|
||||
- `ProfileEvent` represents collecting of increments of profile events.
|
||||
|
||||
- `thread_id` ([UInt64](../../sql-reference/data-types/int-uint.md)) — Thread identifier.
|
||||
|
||||
- `query_id` ([String](../../sql-reference/data-types/string.md)) — Query identifier that can be used to get details about a query that was running from the [query_log](#system_tables-query_log) system table.
|
||||
|
||||
- `trace` ([Array(UInt64)](../../sql-reference/data-types/array.md)) — Stack trace at the moment of sampling. Each element is a virtual memory address inside ClickHouse server process.
|
||||
|
||||
- `size` ([Int64](../../sql-reference/data-types/int-uint.md)) - For trace types `Memory`, `MemorySample` or `MemoryPeak`, this is the amount of memory allocated; for other trace types it is 0.
|
||||
|
||||
- `event` ([LowCardinality(String)](../../sql-reference/data-types/lowcardinality.md)) - For trace type `ProfileEvent`, this is the name of the updated profile event; for other trace types it is an empty string.
|
||||
|
||||
- `increment` ([UInt64](../../sql-reference/data-types/int-uint.md)) - For trace type `ProfileEvent`, this is the amount by which the profile event was incremented; for other trace types it is 0.
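To make the addresses stored in `trace` readable, the introspection functions mentioned at the top of this section can be applied. A sketch (it requires `allow_introspection_functions = 1`):

```sql
SELECT arrayStringConcat(arrayMap(x -> demangle(addressToSymbol(x)), trace), '\n') AS stack
FROM system.trace_log
WHERE trace_type = 'Real'
LIMIT 1
SETTINGS allow_introspection_functions = 1;
```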
|
||||
|
||||
**Example**
|
||||
@ -54,6 +44,7 @@ SELECT * FROM system.trace_log LIMIT 1 \G
|
||||
``` text
|
||||
Row 1:
|
||||
──────
|
||||
hostname: clickhouse.eu-central1.internal
|
||||
event_date: 2020-09-10
|
||||
event_time: 2020-09-10 11:23:09
|
||||
event_time_microseconds: 2020-09-10 11:23:09.872924
|
||||
|
@ -9,6 +9,7 @@ For requests, only columns with request parameters are filled in, and the remain
|
||||
|
||||
Columns with request parameters:
|
||||
|
||||
- `hostname` ([LowCardinality(String)](../../sql-reference/data-types/string.md)) — Hostname of the server executing the query.
|
||||
- `type` ([Enum](../../sql-reference/data-types/enum.md)) — Event type in the ZooKeeper client. Can have one of the following values:
|
||||
- `Request` — The request has been sent.
|
||||
- `Response` — The response was received.
|
||||
@ -63,6 +64,7 @@ Result:
|
||||
``` text
|
||||
Row 1:
|
||||
──────
|
||||
hostname: clickhouse.eu-central1.internal
|
||||
type: Request
|
||||
event_date: 2021-08-09
|
||||
event_time: 2021-08-09 21:38:30.291792
|
||||
|
@ -93,7 +93,7 @@ While ClickHouse can work over NFS, it is not the best idea.
|
||||
|
||||
## Linux Kernel {#linux-kernel}
|
||||
|
||||
Don’t use an outdated Linux kernel.
|
||||
Don't use an outdated Linux kernel.
|
||||
|
||||
## Network {#network}
|
||||
|
||||
|
@ -487,24 +487,23 @@ Where:
|
||||
|
||||
## uniqUpTo(N)(x)
|
||||
|
||||
Calculates the number of different argument values if it is less than or equal to N. If the number of different argument values is greater than N, it returns N + 1.
|
||||
Calculates the number of different values of the argument up to a specified limit, `N`. If the number of different argument values is greater than `N`, this function returns `N` + 1, otherwise it calculates the exact value.
|
||||
|
||||
Recommended for use with small Ns, up to 10. The maximum value of N is 100.
|
||||
Recommended for use with small `N`s, up to 10. The maximum value of `N` is 100.
|
||||
|
||||
For the state of an aggregate function, it uses the amount of memory equal to 1 + N \* the size of one value of bytes.
|
||||
For strings, it stores a non-cryptographic hash of 8 bytes. That is, the calculation is approximated for strings.
|
||||
For the state of an aggregate function, this function uses the amount of memory equal to 1 + `N` \* the size of one value of bytes.
|
||||
When dealing with strings, this function stores a non-cryptographic hash of 8 bytes; the calculation is approximated for strings.
|
||||
|
||||
The function also works for several arguments.
|
||||
For example, suppose you have a table that logs every search query made by users on your website. Each row in the table represents a single search query, with columns for the user ID, the search query, and the timestamp of the query. You can use `uniqUpTo` to generate a report that shows only the keywords that produced at least 5 unique users.
|
||||
|
||||
It works as fast as possible, except for cases when a large N value is used and the number of unique values is slightly less than N.
|
||||
|
||||
Usage example:
|
||||
|
||||
``` text
|
||||
Problem: Generate a report that shows only keywords that produced at least 5 unique users.
|
||||
Solution: Write in the GROUP BY query SearchPhrase HAVING uniqUpTo(4)(UserID) >= 5
|
||||
```sql
|
||||
SELECT SearchPhrase
|
||||
FROM SearchLog
|
||||
GROUP BY SearchPhrase
|
||||
HAVING uniqUpTo(4)(UserID) >= 5
|
||||
```
|
||||
|
||||
`uniqUpTo(4)(UserID)` calculates the number of unique `UserID` values for each `SearchPhrase`, but it only counts up to 4 unique values. If there are more than 4 unique `UserID` values for a `SearchPhrase`, the function returns 5 (4 + 1). The `HAVING` clause then filters out the `SearchPhrase` values for which the number of unique `UserID` values is less than 5. This will give you a list of search keywords that were used by at least 5 unique users.
|
||||
|
||||
## sumMapFiltered(keys_to_keep)(keys, values)
|
||||
|
||||
|
@ -5,7 +5,12 @@ sidebar_position: 6
|
||||
|
||||
# any
|
||||
|
||||
Selects the first encountered (non-NULL) value, unless all rows have NULL values in that column.
|
||||
Selects the first encountered value of a column.
|
||||
|
||||
By default, it ignores NULL values and returns the first NOT NULL value found in the column. Like [`first_value`](../../../sql-reference/aggregate-functions/reference/first_value.md), it supports `RESPECT NULLS`, in which case it selects the first value passed, regardless of whether it is NULL or not.
|
||||
|
||||
The return type of the function is the same as the input, except for LowCardinality, which is discarded. This means that given no rows as input it will return the default value of that type (0 for integers, or Null for a Nullable() column). You might use the `-OrNull` [combinator](../../../sql-reference/aggregate-functions/combinators.md) to modify this behaviour.
|
||||
|
||||
The query can be executed in any order and even in a different order each time, so the result of this function is indeterminate.
|
||||
To get a determinate result, you can use the ‘min’ or ‘max’ function instead of ‘any’.
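A minimal sketch of the difference between the default behaviour and `RESPECT NULLS`, assuming a small ad-hoc table:

```sql
CREATE TABLE t (x Nullable(Int64)) ENGINE = Memory;
INSERT INTO t VALUES (NULL), (2), (3);

SELECT any(x), any(x) RESPECT NULLS FROM t;
-- typically returns 2 and NULL here, though, as noted above, the result is not guaranteed to be deterministic
```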
|
||||
|
||||
@ -13,4 +18,4 @@ In some cases, you can rely on the order of execution. This applies to cases whe
|
||||
|
||||
When a `SELECT` query has the `GROUP BY` clause or at least one aggregate function, ClickHouse (in contrast to MySQL) requires that all expressions in the `SELECT`, `HAVING`, and `ORDER BY` clauses be calculated from keys or from aggregate functions. In other words, each column selected from the table must be used either in keys or inside aggregate functions. To get behavior like in MySQL, you can put the other columns in the `any` aggregate function.
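For instance (hypothetical table and column names):

```sql
-- MySQL would allow selecting user_name directly; in ClickHouse wrap it in any()
SELECT user_id, any(user_name) AS user_name
FROM events
GROUP BY user_id;
```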
|
||||
|
||||
- Alias: `any_value`
|
||||
- Alias: `any_value`, `first_value`.
|
||||
|
@ -5,9 +5,12 @@ sidebar_position: 7
|
||||
|
||||
# first_value
|
||||
|
||||
Selects the first encountered value, similar to `any`, but could accept NULL.
|
||||
Mostly it should be used with [Window Functions](../../window-functions/index.md).
|
||||
Without Window Functions the result will be random if the source stream is not ordered.
|
||||
It is an alias for [`any`](../../../sql-reference/aggregate-functions/reference/any.md) but it was introduced for compatibility with [Window Functions](../../window-functions/index.md), where sometimes it's necessary to process `NULL` values (by default all ClickHouse aggregate functions ignore NULL values).
|
||||
|
||||
It supports declaring a modifier to respect nulls (`RESPECT NULLS`), both under [Window Functions](../../window-functions/index.md) and in normal aggregations.
|
||||
|
||||
As with `any`, without Window Functions the result will be random if the source stream is not ordered. The return type matches the input type (Null is only returned if the input is Nullable or the -OrNull combinator is added).
|
||||
|
||||
## examples
|
||||
|
||||
@ -23,15 +26,15 @@ INSERT INTO test_data (a, b) Values (1,null), (2,3), (4, 5), (6,null);
|
||||
```
|
||||
|
||||
### example1
|
||||
The NULL value is ignored at default.
|
||||
By default, the NULL value is ignored.
|
||||
```sql
|
||||
select first_value(b) from test_data;
|
||||
```
|
||||
|
||||
```text
|
||||
┌─first_value_ignore_nulls(b)─┐
|
||||
│ 3 │
|
||||
└─────────────────────────────┘
|
||||
┌─any(b)─┐
|
||||
│ 3 │
|
||||
└────────┘
|
||||
```
|
||||
|
||||
### example2
|
||||
@ -41,9 +44,9 @@ select first_value(b) ignore nulls from test_data
|
||||
```
|
||||
|
||||
```text
|
||||
┌─first_value_ignore_nulls(b)─┐
|
||||
│ 3 │
|
||||
└─────────────────────────────┘
|
||||
┌─any(b) IGNORE NULLS ─┐
|
||||
│ 3 │
|
||||
└──────────────────────┘
|
||||
```
|
||||
|
||||
### example3
|
||||
@ -53,9 +56,9 @@ select first_value(b) respect nulls from test_data
|
||||
```
|
||||
|
||||
```text
|
||||
┌─first_value_respect_nulls(b)─┐
|
||||
│ ᴺᵁᴸᴸ │
|
||||
└──────────────────────────────┘
|
||||
┌─any(b) RESPECT NULLS ─┐
|
||||
│ ᴺᵁᴸᴸ │
|
||||
└───────────────────────┘
|
||||
```
|
||||
|
||||
### example4
|
||||
@ -73,8 +76,8 @@ FROM
|
||||
```
|
||||
|
||||
```text
|
||||
┌─first_value_respect_nulls(b)─┬─first_value(b)─┐
|
||||
│ ᴺᵁᴸᴸ │ 3 │
|
||||
└──────────────────────────────┴────────────────┘
|
||||
┌─any_respect_nulls(b)─┬─any(b)─┐
|
||||
│ ᴺᵁᴸᴸ │ 3 │
|
||||
└──────────────────────┴────────┘
|
||||
```
|
||||
|
||||
|
@ -56,7 +56,7 @@ Functions:
|
||||
|
||||
## Related content
|
||||
|
||||
- [Reducing ClickHouse Storage Cost with the Low Cardinality Type – Lessons from an Instana Engineer](https://www.instana.com/blog/reducing-clickhouse-storage-cost-with-the-low-cardinality-type-lessons-from-an-instana-engineer/)
|
||||
- [Reducing ClickHouse Storage Cost with the Low Cardinality Type – Lessons from an Instana Engineer](https://altinity.com/blog/2020-5-20-reducing-clickhouse-storage-cost-with-the-low-cardinality-type-lessons-from-an-instana-engineer)
|
||||
- [String Optimization (video presentation in Russian)](https://youtu.be/rqf-ILRgBdY?list=PL0Z2YDlm0b3iwXCpEFiOOYmwXzVmjJfEt). [Slides in English](https://github.com/ClickHouse/clickhouse-presentations/raw/master/meetup19/string_optimization.pdf)
|
||||
- Blog: [Optimizing ClickHouse with Schemas and Codecs](https://clickhouse.com/blog/optimize-clickhouse-codecs-compression-schema)
|
||||
- Blog: [Working with time series data in ClickHouse](https://clickhouse.com/blog/working-with-time-series-data-and-functions-ClickHouse)
|
||||
|
@ -1083,7 +1083,7 @@ Result:
|
||||
|
||||
**See also**
|
||||
|
||||
- [arrayFold](#arrayFold)
|
||||
- [arrayFold](#arrayfold)
|
||||
|
||||
## arrayReduceInRanges
|
||||
|
||||
@ -1175,7 +1175,7 @@ FROM numbers(1,10);
|
||||
|
||||
**See also**
|
||||
|
||||
- [arrayReduce](#arrayReduce)
|
||||
- [arrayReduce](#arrayreduce)
|
||||
|
||||
## arrayReverse(arr)
|
||||
|
||||
|
@ -2533,13 +2533,14 @@ formatDateTime(Time, Format[, Timezone])
|
||||
Returns time and date values according to the specified format.
|
||||
|
||||
**Replacement fields**
|
||||
|
||||
Using replacement fields, you can define a pattern for the resulting string. The “Example” column shows the formatting result for `2018-01-02 22:33:44`.
|
||||
|
||||
| Placeholder | Description | Example |
|
||||
| Placeholder | Description | Example |
|
||||
|----------|---------------------------------------------------------|------------|
|
||||
| %a | abbreviated weekday name (Mon-Sun) | Mon |
|
||||
| %b | abbreviated month name (Jan-Dec) | Jan |
|
||||
| %c | month as an integer number (01-12) | 01 |
|
||||
| %c | month as an integer number (01-12), see 'Note 3' below | 01 |
|
||||
| %C | year divided by 100 and truncated to integer (00-99) | 20 |
|
||||
| %d | day of the month, zero-padded (01-31) | 02 |
|
||||
| %D | Short MM/DD/YY date, equivalent to %m/%d/%y | 01/02/18 |
|
||||
@ -2553,8 +2554,8 @@ Using replacement fields, you can define a pattern for the resulting string. “
|
||||
| %i | minute (00-59) | 33 |
|
||||
| %I | hour in 12h format (01-12) | 10 |
|
||||
| %j | day of the year (001-366) | 002 |
|
||||
| %k | hour in 24h format (00-23) | 22 |
|
||||
| %l | hour in 12h format (01-12) | 09 |
|
||||
| %k | hour in 24h format (00-23), see 'Note 3' below | 14 |
|
||||
| %l | hour in 12h format (01-12), see 'Note 3' below | 09 |
|
||||
| %m | month as an integer number (01-12) | 01 |
|
||||
| %M | full month name (January-December), see 'Note 2' below | January |
|
||||
| %n | new-line character (‘’) | |
|
||||
@ -2579,6 +2580,8 @@ Note 1: In ClickHouse versions earlier than v23.4, `%f` prints a single zero (0)
|
||||
|
||||
Note 2: In ClickHouse versions earlier than v23.4, `%M` prints the minute (00-59) instead of the full month name (January-December). The previous behavior can be restored using setting `formatdatetime_parsedatetime_m_is_month_name = 0`.
|
||||
|
||||
Note 3: In ClickHouse versions earlier than v23.11, function `parseDateTime()` required leading zeros for formatters `%c` (month) and `%l`/`%k` (hour), e.g. `07`. In later versions, the leading zero may be omitted, e.g. `7`. The previous behavior can be restored using setting `parsedatetime_parse_without_leading_zeros = 0`. Note that function `formatDateTime()` by default still prints leading zeros for `%c` and `%l`/`%k` to not break existing use cases. This behavior can be changed by setting `formatdatetime_format_without_leading_zeros = 1`.
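A short sketch of the behaviour described in Note 3 (the values are illustrative):

```sql
-- 23.11+: month '1' and hour '7' are accepted without leading zeros
SELECT parseDateTime('2018-1-02 7:33:44', '%Y-%c-%d %k:%i:%s') AS parsed;

-- formatDateTime still pads by default; drop the zeros with the setting below
SELECT formatDateTime(toDateTime('2018-01-02 07:33:44'), '%c/%d %k:%i') AS formatted
SETTINGS formatdatetime_format_without_leading_zeros = 1;
```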
|
||||
|
||||
**Example**
|
||||
|
||||
``` sql
|
||||
|
@ -164,7 +164,7 @@ Consider a list of contacts that may specify multiple ways to contact a customer
|
||||
└──────────┴──────┴───────────┴───────────┘
|
||||
```
|
||||
|
||||
The `mail` and `phone` fields are of type String, but the `icq` field is `UInt32`, so it needs to be converted to `String`.
|
||||
The `mail` and `phone` fields are of type String, but the `telegram` field is `UInt32`, so it needs to be converted to `String`.
|
||||
|
||||
Get the first available contact method for the customer from the contact list:
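A sketch of such a query (the table name `aBook` and the column names follow the contact-list example above):

```sql
SELECT name, coalesce(mail, phone, CAST(telegram, 'Nullable(String)')) AS contact
FROM aBook;
```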
|
||||
|
||||
|
@ -1,47 +0,0 @@
|
||||
---
|
||||
slug: /en/sql-reference/functions/time-series-functions
|
||||
sidebar_position: 172
|
||||
sidebar_label: Time Series
|
||||
---
|
||||
|
||||
# Time Series Functions
|
||||
|
||||
Below functions are used for time series analysis.
|
||||
|
||||
## seriesPeriodDetectFFT
|
||||
|
||||
Finds the period of the given time series data using FFT
|
||||
Detect Period in time series data using FFT.
|
||||
FFT - Fast Fourier transform (https://en.wikipedia.org/wiki/Fast_Fourier_transform)
|
||||
|
||||
**Syntax**
|
||||
|
||||
``` sql
|
||||
seriesPeriodDetectFFT(series);
|
||||
```
|
||||
|
||||
**Arguments**
|
||||
|
||||
- `series` - An array of numeric values
|
||||
|
||||
**Returned value**
|
||||
|
||||
- A real value equal to the period of time series
|
||||
|
||||
Type: [Float64](../../sql-reference/data-types/float.md).
|
||||
|
||||
**Examples**
|
||||
|
||||
Query:
|
||||
|
||||
``` sql
|
||||
SELECT seriesPeriodDetectFFT([1, 4, 6, 1, 4, 6, 1, 4, 6, 1, 4, 6, 1, 4, 6, 1, 4, 6, 1, 4, 6]) AS print_0;
|
||||
```
|
||||
|
||||
Result:
|
||||
|
||||
``` text
|
||||
┌───────────print_0──────┐
|
||||
│ 3 │
|
||||
└────────────────────────┘
|
||||
```
|
@ -5,7 +5,7 @@ slug: /en/sql-reference/operators/exists
|
||||
|
||||
The `EXISTS` operator checks whether the result of a subquery contains any records. If the result is empty, the operator returns `0`. Otherwise, it returns `1`.
|
||||
|
||||
`EXISTS` can be used in a [WHERE](../../sql-reference/statements/select/where.md) clause.
|
||||
`EXISTS` can also be used in a [WHERE](../../sql-reference/statements/select/where.md) clause.
|
||||
|
||||
:::tip
|
||||
References to main query tables and columns are not supported in a subquery.
|
||||
@ -13,12 +13,26 @@ References to main query tables and columns are not supported in a subquery.
|
||||
|
||||
**Syntax**
|
||||
|
||||
```sql
|
||||
WHERE EXISTS(subquery)
|
||||
``` sql
|
||||
EXISTS(subquery)
|
||||
```
|
||||
|
||||
**Example**
|
||||
|
||||
Query checking existence of values in a subquery:
|
||||
|
||||
``` sql
|
||||
SELECT EXISTS(SELECT * FROM numbers(10) WHERE number > 8), EXISTS(SELECT * FROM numbers(10) WHERE number > 11)
|
||||
```
|
||||
|
||||
Result:
|
||||
|
||||
``` text
|
||||
┌─in(1, _subquery1)─┬─in(1, _subquery2)─┐
|
||||
│ 1 │ 0 │
|
||||
└───────────────────┴───────────────────┘
|
||||
```
|
||||
|
||||
Query with a subquery returning several rows:
|
||||
|
||||
``` sql
|
||||
|
@ -10,7 +10,7 @@ A set of queries that allow changing the table structure.
|
||||
Syntax:
|
||||
|
||||
``` sql
|
||||
ALTER TABLE [db].name [ON CLUSTER cluster] ADD|DROP|RENAME|CLEAR|COMMENT|{MODIFY|ALTER}|MATERIALIZE COLUMN ...
|
||||
ALTER [TEMPORARY] TABLE [db].name [ON CLUSTER cluster] ADD|DROP|RENAME|CLEAR|COMMENT|{MODIFY|ALTER}|MATERIALIZE COLUMN ...
|
||||
```
|
||||
|
||||
In the query, specify a list of one or more comma-separated actions.
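For example, several actions can be combined in one statement (hypothetical table and column names):

```sql
ALTER TABLE visits
    ADD COLUMN browser String AFTER user_id,
    COMMENT COLUMN browser 'Browser used by the visitor';
```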
|
||||
|
@ -16,6 +16,7 @@ Most `ALTER TABLE` queries modify table settings or data:
|
||||
- [INDEX](/docs/en/sql-reference/statements/alter/skipping-index.md)
|
||||
- [CONSTRAINT](/docs/en/sql-reference/statements/alter/constraint.md)
|
||||
- [TTL](/docs/en/sql-reference/statements/alter/ttl.md)
|
||||
- [STATISTIC](/docs/en/sql-reference/statements/alter/statistic.md)
|
||||
|
||||
:::note
|
||||
Most `ALTER TABLE` queries are supported only for [\*MergeTree](/docs/en/engines/table-engines/mergetree-family/index.md) tables, as well as [Merge](/docs/en/engines/table-engines/special/merge.md) and [Distributed](/docs/en/engines/table-engines/special/distributed.md).
|
||||
|
25
docs/en/sql-reference/statements/alter/statistic.md
Normal file
@ -0,0 +1,25 @@
|
||||
---
|
||||
slug: /en/sql-reference/statements/alter/statistic
|
||||
sidebar_position: 45
|
||||
sidebar_label: STATISTIC
|
||||
---
|
||||
|
||||
# Manipulating Column Statistics
|
||||
|
||||
The following operations are available:
|
||||
|
||||
- `ALTER TABLE [db].table ADD STATISTIC (columns list) TYPE type` - Adds statistic description to tables metadata.
|
||||
|
||||
- `ALTER TABLE [db].table DROP STATISTIC (columns list) TYPE type` - Removes statistic description from tables metadata and deletes statistic files from disk.
|
||||
|
||||
- `ALTER TABLE [db].table CLEAR STATISTIC (columns list) TYPE type` - Deletes statistic files from disk.
|
||||
|
||||
- `ALTER TABLE [db.]table MATERIALIZE STATISTIC (columns list) TYPE type` - Rebuilds the statistic for columns. Implemented as a [mutation](../../../sql-reference/statements/alter/index.md#mutations).
|
||||
|
||||
The first two commands are lightweight in the sense that they only change metadata or remove files.
|
||||
|
||||
Also, they are replicated, syncing statistics metadata via ZooKeeper.
|
||||
|
||||
:::note
|
||||
Statistic manipulation is supported only for tables with [`*MergeTree`](../../../engines/table-engines/mergetree-family/mergetree.md) engine (including [replicated](../../../engines/table-engines/mergetree-family/replication.md) variants).
|
||||
:::
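A minimal sketch of the commands above, assuming a hypothetical table `tab` with columns `a` and `b` and the `tdigest` statistic type (column statistics are experimental at this point, so an experimental-feature setting may also need to be enabled):

```sql
ALTER TABLE tab ADD STATISTIC (a, b) TYPE tdigest;
ALTER TABLE tab MATERIALIZE STATISTIC (a, b) TYPE tdigest; -- rebuild for existing parts
ALTER TABLE tab DROP STATISTIC (a, b) TYPE tdigest;
```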
|
@ -415,7 +415,7 @@ ExpressionTransform
|
||||
ExpressionTransform × 2
|
||||
(SettingQuotaAndLimits)
|
||||
(ReadFromStorage)
|
||||
NumbersMt × 2 0 → 1
|
||||
NumbersRange × 2 0 → 1
|
||||
```
|
||||
### EXPLAIN ESTIMATE
|
||||
|
||||
|
@ -150,7 +150,7 @@ SYSTEM RELOAD CONFIG [ON CLUSTER cluster_name]
|
||||
|
||||
## RELOAD USERS
|
||||
|
||||
Reloads all access storages, including: users.xml, local disk access storage, replicated (in ZooKeeper) access storage.
|
||||
Reloads all access storages, including: users.xml, local disk access storage, replicated (in ZooKeeper) access storage.
|
||||
|
||||
```sql
|
||||
SYSTEM RELOAD USERS [ON CLUSTER cluster_name]
|
||||
@ -354,7 +354,7 @@ After running this statement the `[db.]replicated_merge_tree_family_table_name`
|
||||
|
||||
### SYNC DATABASE REPLICA
|
||||
|
||||
Waits until the specified [replicated database](https://clickhouse.com/docs/en/engines/database-engines/replicated) applies all schema changes from the DDL queue of that database.
|
||||
Waits until the specified [replicated database](https://clickhouse.com/docs/en/engines/database-engines/replicated) applies all schema changes from the DDL queue of that database.
|
||||
|
||||
**Syntax**
|
||||
```sql
|
||||
@ -451,12 +451,12 @@ SYSTEM SYNC FILE CACHE [ON CLUSTER cluster_name]
|
||||
|
||||
### SYSTEM STOP LISTEN
|
||||
|
||||
Closes the socket and gracefully terminates the existing connections to the server on the specified port with the specified protocol.
|
||||
Closes the socket and gracefully terminates the existing connections to the server on the specified port with the specified protocol.
|
||||
|
||||
However, if the corresponding protocol settings were not specified in the clickhouse-server configuration, this command will have no effect.
|
||||
|
||||
```sql
|
||||
SYSTEM STOP LISTEN [ON CLUSTER cluster_name] [QUERIES ALL | QUERIES DEFAULT | QUERIES CUSTOM | TCP | TCP_WITH_PROXY | TCP_SECURE | HTTP | HTTPS | MYSQL | GRPC | POSTGRESQL | PROMETHEUS | CUSTOM 'protocol']
|
||||
SYSTEM STOP LISTEN [ON CLUSTER cluster_name] [QUERIES ALL | QUERIES DEFAULT | QUERIES CUSTOM | TCP | TCP WITH PROXY | TCP SECURE | HTTP | HTTPS | MYSQL | GRPC | POSTGRESQL | PROMETHEUS | CUSTOM 'protocol']
|
||||
```
|
||||
|
||||
- If `CUSTOM 'protocol'` modifier is specified, the custom protocol with the specified name defined in the protocols section of the server configuration will be stopped.
|
||||
@ -471,5 +471,5 @@ Allows new connections to be established on the specified protocols.
|
||||
However, if the server on the specified port and protocol was not stopped using the SYSTEM STOP LISTEN command, this command will have no effect.
|
||||
|
||||
```sql
|
||||
SYSTEM START LISTEN [ON CLUSTER cluster_name] [QUERIES ALL | QUERIES DEFAULT | QUERIES CUSTOM | TCP | TCP_WITH_PROXY | TCP_SECURE | HTTP | HTTPS | MYSQL | GRPC | POSTGRESQL | PROMETHEUS | CUSTOM 'protocol']
|
||||
SYSTEM START LISTEN [ON CLUSTER cluster_name] [QUERIES ALL | QUERIES DEFAULT | QUERIES CUSTOM | TCP | TCP WITH PROXY | TCP SECURE | HTTP | HTTPS | MYSQL | GRPC | POSTGRESQL | PROMETHEUS | CUSTOM 'protocol']
|
||||
```
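For instance, based on the syntax above, to temporarily close only the HTTP interface and then reopen it:

```sql
SYSTEM STOP LISTEN HTTP;
SYSTEM START LISTEN HTTP;
```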
|
||||
|
@ -1,4 +1,4 @@
|
||||
--
|
||||
---
|
||||
slug: /en/sql-reference/table-functions/file
|
||||
sidebar_position: 60
|
||||
sidebar_label: file
|
||||
|
85
docs/en/sql-reference/table-functions/fileCluster.md
Normal file
@ -0,0 +1,85 @@
|
||||
---
|
||||
slug: /en/sql-reference/table-functions/fileCluster
|
||||
sidebar_position: 61
|
||||
sidebar_label: fileCluster
|
||||
---
|
||||
|
||||
# fileCluster Table Function
|
||||
|
||||
Enables simultaneous processing of files matching a specified path across multiple nodes within a cluster. The initiator establishes connections to worker nodes, expands globs in the file path, and delegates file-reading tasks to worker nodes. Each worker node queries the initiator for the next file to process, repeating until all tasks are completed (all files are read).
|
||||
|
||||
:::note
|
||||
This function will operate _correctly_ only if the set of files matching the initially specified path is identical across all nodes, and their content is consistent among different nodes.
|
||||
If these files differ between nodes, the return value cannot be predetermined and depends on the order in which worker nodes request tasks from the initiator.
|
||||
:::
|
||||
|
||||
**Syntax**
|
||||
|
||||
``` sql
|
||||
fileCluster(cluster_name, path[, format, structure, compression_method])
|
||||
```
|
||||
|
||||
**Arguments**
|
||||
|
||||
- `cluster_name` — Name of a cluster that is used to build a set of addresses and connection parameters to remote and local servers.
|
||||
- `path` — The relative path to the file from [user_files_path](/docs/en/operations/server-configuration-parameters/settings.md#server_configuration_parameters-user_files_path). Path to file also supports [globs](#globs_in_path).
|
||||
- `format` — [Format](../../interfaces/formats.md#formats) of the files. Type: [String](../../sql-reference/data-types/string.md).
|
||||
- `structure` — Table structure in `'UserID UInt64, Name String'` format. Determines column names and types. Type: [String](../../sql-reference/data-types/string.md).
|
||||
- `compression_method` — Compression method. Supported compression types are `gz`, `br`, `xz`, `zst`, `lz4`, and `bz2`.
|
||||
|
||||
**Returned value**
|
||||
|
||||
A table with the specified format and structure and with data from files matching the specified path.
|
||||
|
||||
**Example**
|
||||
|
||||
Given a cluster named `my_cluster` and given the following value of setting `user_files_path`:
|
||||
|
||||
``` bash
|
||||
$ grep user_files_path /etc/clickhouse-server/config.xml
|
||||
<user_files_path>/var/lib/clickhouse/user_files/</user_files_path>
|
||||
```
|
||||
Also, given there are files `test1.csv` and `test2.csv` inside `user_files_path` of each cluster node, and their content is identical across different nodes:
|
||||
```bash
|
||||
$ cat /var/lib/clickhouse/user_files/test1.csv
|
||||
1,"file1"
|
||||
11,"file11"
|
||||
|
||||
$ cat /var/lib/clickhouse/user_files/test2.csv
|
||||
2,"file2"
|
||||
22,"file22"
|
||||
```
|
||||
|
||||
For example, one can create these files by executing these two queries on every cluster node:
|
||||
```sql
|
||||
INSERT INTO TABLE FUNCTION file('file1.csv', 'CSV', 'i UInt32, s String') VALUES (1,'file1'), (11,'file11');
|
||||
INSERT INTO TABLE FUNCTION file('file2.csv', 'CSV', 'i UInt32, s String') VALUES (2,'file2'), (22,'file22');
|
||||
```
|
||||
|
||||
Now, read the contents of `test1.csv` and `test2.csv` via the `fileCluster` table function:
|
||||
|
||||
```sql
|
||||
SELECT * FROM fileCluster(
    'my_cluster', 'file{1,2}.csv', 'CSV', 'i UInt32, s String'
) ORDER BY (i, s)
|
||||
```
|
||||
|
||||
```
|
||||
┌──i─┬─s──────┐
|
||||
│ 1 │ file1 │
|
||||
│ 11 │ file11 │
|
||||
└────┴────────┘
|
||||
┌──i─┬─s──────┐
|
||||
│ 2 │ file2 │
|
||||
│ 22 │ file22 │
|
||||
└────┴────────┘
|
||||
```
|
||||
|
||||
|
||||
## Globs in Path {#globs_in_path}
|
||||
|
||||
All patterns supported by [File](../../sql-reference/table-functions/file.md#globs-in-path) table function are supported by FileCluster.
|
||||
|
||||
**See Also**
|
||||
|
||||
- [File table function](../../sql-reference/table-functions/file.md)
|
86
docs/en/sql-reference/table-functions/fuzzJSON.md
Normal file
@ -0,0 +1,86 @@
|
||||
---
|
||||
slug: /en/sql-reference/table-functions/fuzzJSON
|
||||
sidebar_position: 75
|
||||
sidebar_label: fuzzJSON
|
||||
---
|
||||
|
||||
# fuzzJSON
|
||||
|
||||
Perturbs a JSON string with random variations.
|
||||
|
||||
``` sql
|
||||
fuzzJSON({ named_collection [option=value [,..]] | json_str[, random_seed] })
|
||||
```
|
||||
|
||||
**Arguments**
|
||||
|
||||
- `named_collection` - A [NAMED COLLECTION](/docs/en/sql-reference/statements/create/named-collection.md).
|
||||
- `option=value` - Named collection optional parameters and their values.
|
||||
- `json_str` (String) - The source string representing structured data in JSON format.
|
||||
- `random_seed` (UInt64) - Manual random seed for producing stable results.
|
||||
- `reuse_output` (boolean) - Reuse the output from a fuzzing process as input for the next fuzzer.
|
||||
- `max_output_length` (UInt64) - Maximum allowable length of the generated or perturbed JSON string.
|
||||
- `probability` (Float64) - The probability to fuzz a JSON field (a key-value pair). Must be within [0, 1] range.
|
||||
- `max_nesting_level` (UInt64) - The maximum allowed depth of nested structures within the JSON data.
|
||||
- `max_array_size` (UInt64) - The maximum allowed size of a JSON array.
|
||||
- `max_object_size` (UInt64) - The maximum allowed number of fields on a single level of a JSON object.
|
||||
- `max_string_value_length` (UInt64) - The maximum length of a String value.
|
||||
- `min_key_length` (UInt64) - The minimum key length. Should be at least 1.
|
||||
- `max_key_length` (UInt64) - The maximum key length. Should be greater than or equal to `min_key_length`, if specified.
|
||||
|
||||
**Returned Value**
|
||||
|
||||
A table object with a single column containing perturbed JSON strings.
|
||||
|
||||
## Usage Example
|
||||
|
||||
``` sql
|
||||
CREATE NAMED COLLECTION json_fuzzer AS json_str='{}';
|
||||
SELECT * FROM fuzzJSON(json_fuzzer) LIMIT 3;
|
||||
```
|
||||
|
||||
``` text
|
||||
{"52Xz2Zd4vKNcuP2":true}
|
||||
{"UPbOhOQAdPKIg91":3405264103600403024}
|
||||
{"X0QUWu8yT":[]}
|
||||
```
|
||||
|
||||
``` sql
|
||||
SELECT * FROM fuzzJSON(json_fuzzer, json_str='{"name" : "value"}', random_seed=1234) LIMIT 3;
|
||||
```
|
||||
|
||||
``` text
|
||||
{"key":"value", "mxPG0h1R5":"L-YQLv@9hcZbOIGrAn10%GA"}
|
||||
{"BRE3":true}
|
||||
{"key":"value", "SWzJdEJZ04nrpSfy":[{"3Q23y":[]}]}
|
||||
```
|
||||
|
||||
``` sql
|
||||
SELECT * FROM fuzzJSON(json_fuzzer, json_str='{"students" : ["Alice", "Bob"]}', reuse_output=true) LIMIT 3;
|
||||
```
|
||||
|
||||
``` text
|
||||
{"students":["Alice", "Bob"], "nwALnRMc4pyKD9Krv":[]}
|
||||
{"students":["1rNY5ZNs0wU&82t_P", "Bob"], "wLNRGzwDiMKdw":[{}]}
|
||||
{"xeEk":["1rNY5ZNs0wU&82t_P", "Bob"], "wLNRGzwDiMKdw":[{}, {}]}
|
||||
```
|
||||
|
||||
``` sql
|
||||
SELECT * FROM fuzzJSON(json_fuzzer, json_str='{"students" : ["Alice", "Bob"]}', max_output_length=512) LIMIT 3;
|
||||
```
|
||||
|
||||
``` text
|
||||
{"students":["Alice", "Bob"], "BREhhXj5":true}
|
||||
{"NyEsSWzJdeJZ04s":["Alice", 5737924650575683711, 5346334167565345826], "BjVO2X9L":true}
|
||||
{"NyEsSWzJdeJZ04s":["Alice", 5737924650575683711, 5346334167565345826], "BjVO2X9L":true, "k1SXzbSIz":[{}]}
|
||||
```
|
||||
|
||||
``` sql
|
||||
SELECT * FROM fuzzJSON('{"id":1}', 1234) LIMIT 3;
|
||||
```
|
||||
|
||||
``` text
|
||||
{"id":1, "mxPG0h1R5":"L-YQLv@9hcZbOIGrAn10%GA"}
|
||||
{"BRjE":16137826149911306846}
|
||||
{"XjKE":15076727133550123563}
|
||||
```
|
@ -17,6 +17,8 @@ The following queries are equivalent:
|
||||
SELECT * FROM numbers(10);
|
||||
SELECT * FROM numbers(0, 10);
|
||||
SELECT * FROM system.numbers LIMIT 10;
|
||||
SELECT * FROM system.numbers WHERE number BETWEEN 0 AND 9;
|
||||
SELECT * FROM system.numbers WHERE number IN (0, 1, 2, 3, 4, 5, 6, 7, 8, 9);
|
||||
```
|
||||
|
||||
Examples:
|
||||
|
@ -14,7 +14,7 @@ ClickHouse предоставляет собственный клиент ком
|
||||
$ clickhouse-client
|
||||
ClickHouse client version 20.13.1.5273 (official build).
|
||||
Connecting to localhost:9000 as user default.
|
||||
Connected to ClickHouse server version 20.13.1 revision 54442.
|
||||
Connected to ClickHouse server version 20.13.1.
|
||||
|
||||
:)
|
||||
```
|
||||
|
@ -1215,6 +1215,7 @@ ClickHouse использует потоки из глобального пул
|
||||
- `metrics` – флаг для экспорта текущих значений метрик из таблицы [system.metrics](../system-tables/metrics.md#system_tables-metrics).
|
||||
- `events` – флаг для экспорта текущих значений метрик из таблицы [system.events](../system-tables/events.md#system_tables-events).
|
||||
- `asynchronous_metrics` – флаг для экспорта текущих значений значения метрик из таблицы [system.asynchronous_metrics](../system-tables/asynchronous_metrics.md#system_tables-asynchronous_metrics).
|
||||
- `errors` - флаг для экспорта количества ошибок (по кодам) случившихся с момента последнего рестарта сервера. Эта информация может быть получена из таблицы [system.errors](../system-tables/asynchronous_metrics.md#system_tables-errors)
|
||||
|
||||
**Пример**
|
||||
|
||||
@ -1225,6 +1226,7 @@ ClickHouse использует потоки из глобального пул
|
||||
<metrics>true</metrics>
|
||||
<events>true</events>
|
||||
<asynchronous_metrics>true</asynchronous_metrics>
|
||||
<errors>true</errors>
|
||||
</prometheus>
|
||||
```
|
||||
|
||||
@ -1676,7 +1678,7 @@ TCP порт для защищённого обмена данными с кли
|
||||
|
||||
## user_files_path {#server_configuration_parameters-user_files_path}
|
||||
|
||||
Каталог с пользовательскими файлами. Используется в табличной функции [file()](../../operations/server-configuration-parameters/settings.md).
|
||||
Каталог с пользовательскими файлами. Используется в табличных функциях [file()](../../sql-reference/table-functions/fileCluster.md) и [fileCluster()](../../sql-reference/table-functions/fileCluster.md).
|
||||
|
||||
**Пример**
|
||||
|
||||
|
@ -119,7 +119,7 @@ Eсли суммарное число активных кусков во все
|
||||
- Положительное целое число.
|
||||
- 0 (без ограничений).
|
||||
|
||||
Значение по умолчанию: 100.
|
||||
Значение по умолчанию: 1000.
|
||||
|
||||
Команда `Insert` создает один или несколько блоков (кусков). При вставке в Replicated таблицы ClickHouse для [дедупликации вставок](../../engines/table-engines/mergetree-family/replication.md) записывает в Zookeeper хеш-суммы созданных кусков. Но хранятся только последние `replicated_deduplication_window` хеш-сумм. Самые старые хеш-суммы удаляются из Zookeeper.
|
||||
Большое значение `replicated_deduplication_window` замедляет `Insert`, так как приходится сравнивать большее количество хеш-сумм.
|
||||
|
@ -19,7 +19,7 @@ ClickHouse создает эту таблицу когда установлен
|
||||
|
||||
- `revision`([UInt32](../../sql-reference/data-types/int-uint.md)) — ревизия сборки сервера ClickHouse.
|
||||
|
||||
Во время соединения с сервером через `clickhouse-client`, вы видите строку похожую на `Connected to ClickHouse server version 19.18.1 revision 54429.`. Это поле содержит номер после `revision`, но не содержит строку после `version`.
|
||||
Во время соединения с сервером через `clickhouse-client`, вы видите строку похожую на `Connected to ClickHouse server version 19.18.1.`. Это поле содержит номер после `revision`, но не содержит строку после `version`.
|
||||
|
||||
- `trace_type`([Enum8](../../sql-reference/data-types/enum.md)) — тип трассировки:
|
||||
|
||||
|
@ -11,7 +11,7 @@ sidebar_label: "Манипуляции со столбцами"
|
||||
Синтаксис:
|
||||
|
||||
``` sql
|
||||
ALTER TABLE [db].name [ON CLUSTER cluster] ADD|DROP|RENAME|CLEAR|COMMENT|{MODIFY|ALTER}|MATERIALIZE COLUMN ...
|
||||
ALTER [TEMPORARY] TABLE [db].name [ON CLUSTER cluster] ADD|DROP|RENAME|CLEAR|COMMENT|{MODIFY|ALTER}|MATERIALIZE COLUMN ...
|
||||
```
|
||||
|
||||
В запросе можно указать сразу несколько действий над одной таблицей через запятую.
|
||||
|
@ -371,7 +371,7 @@ ExpressionTransform
|
||||
ExpressionTransform × 2
|
||||
(SettingQuotaAndLimits)
|
||||
(ReadFromStorage)
|
||||
NumbersMt × 2 0 → 1
|
||||
NumbersRange × 2 0 → 1
|
||||
```
|
||||
|
||||
### EXPLAIN ESTIMATE {#explain-estimate}
|
||||
|
@ -13,7 +13,7 @@ sidebar_label: file
|
||||
**Синтаксис**
|
||||
|
||||
``` sql
|
||||
file(path [,format] [,structure])
|
||||
file(path [,format] [,structure] [,compression])
|
||||
```
|
||||
|
||||
**Параметры**
|
||||
@ -21,6 +21,7 @@ file(path [,format] [,structure])
|
||||
- `path` — относительный путь до файла от [user_files_path](../../sql-reference/table-functions/file.md#server_configuration_parameters-user_files_path). Путь к файлу поддерживает следующие шаблоны в режиме доступа только для чтения `*`, `?`, `{abc,def}` и `{N..M}`, где `N`, `M` — числа, `'abc', 'def'` — строки.
|
||||
- `format` — [формат](../../interfaces/formats.md#formats) файла.
|
||||
- `structure` — структура таблицы. Формат: `'column1_name column1_type, column2_name column2_type, ...'`.
|
||||
- `compression` — Используемый тип сжатия для запроса SELECT или желаемый тип сжатия для запроса INSERT. Поддерживаемые типы сжатия: `gz`, `br`, `xz`, `zst`, `lz4` и `bz2`.
|
||||
|
||||
**Возвращаемое значение**
|
||||
|
||||
|
84
docs/ru/sql-reference/table-functions/fileCluster.md
Normal file
@ -0,0 +1,84 @@
|
||||
---
|
||||
slug: /ru/sql-reference/table-functions/fileCluster
|
||||
sidebar_position: 38
|
||||
sidebar_label: fileCluster
|
||||
---
|
||||
|
||||
# fileCluster
|
||||
|
||||
Позволяет одновременно обрабатывать файлы, находящиеся по указанному пути, на нескольких узлах внутри кластера. Узел-инициатор устанавливает соединения с рабочими узлами (worker nodes), раскрывает шаблоны в пути к файлам и отдаёт задачи по чтению файлов рабочим узлам. Рабочий узел запрашивает у инициатора путь к следующему файлу для обработки, повторяя до тех пор, пока не завершатся все задачи (то есть пока не будут обработаны все файлы).
|
||||
|
||||
:::note
|
||||
Эта табличная функция будет работать _корректно_ только в случае, если набор файлов, соответствующих изначально указанному пути, одинаков на всех узлах и содержание этих файлов идентично на различных узлах. В случае, если эти файлы различаются между узлами, результат не предопределён и зависит от очерёдности, с которой рабочие узлы будут запрашивать задачи у инициатора.
|
||||
:::
|
||||
|
||||
**Синтаксис**
|
||||
|
||||
``` sql
|
||||
fileCluster(cluster_name, path[, format, structure, compression_method])
|
||||
```
|
||||
|
||||
**Аргументы**
|
||||
|
||||
- `cluster_name` — имя кластера, используемое для создания набора адресов и параметров подключения к удаленным и локальным серверам.
|
||||
- `path` — относительный путь до файла от [user_files_path](../../sql-reference/table-functions/file.md#server_configuration_parameters-user_files_path). Путь к файлу поддерживает [шаблоны поиска (globs)](#globs_in_path).
|
||||
- `format` — [формат](../../interfaces/formats.md#formats) файла.
|
||||
- `structure` — структура таблицы. Формат: `'column1_name column1_type, column2_name column2_type, ...'`.
|
||||
- `compression_method` — Используемый тип сжатия. Поддерживаемые типы: `gz`, `br`, `xz`, `zst`, `lz4` и `bz2`.
|
||||
|
||||
**Возвращаемое значение**
|
||||
|
||||
Таблица с указанным форматом и структурой, содержащая данные из файлов, соответствующих указанному пути.
|
||||
|
||||
**Пример**
|
||||
Пусть есть кластер с именем `my_cluster`, а также установлено нижеследующее значение параметра `user_files_path`:
|
||||
|
||||
``` bash
|
||||
$ grep user_files_path /etc/clickhouse-server/config.xml
|
||||
<user_files_path>/var/lib/clickhouse/user_files/</user_files_path>
|
||||
```
|
||||
|
||||
Пусть также на каждом узле кластера в директории `user_files_path` находятся файлы `test1.csv` и `test2.csv`, и их содержимое идентично на разных узлах:
|
||||
```bash
|
||||
$ cat /var/lib/clickhouse/user_files/test1.csv
|
||||
1,"file1"
|
||||
11,"file11"
|
||||
|
||||
$ cat /var/lib/clickhouse/user_files/test2.csv
|
||||
2,"file2"
|
||||
22,"file22"
|
||||
```
|
||||
|
||||
Например, эти файлы можно создать, выполнив на каждом узле два запроса:
|
||||
```sql
|
||||
INSERT INTO TABLE FUNCTION file('file1.csv', 'CSV', 'i UInt32, s String') VALUES (1,'file1'), (11,'file11');
|
||||
INSERT INTO TABLE FUNCTION file('file2.csv', 'CSV', 'i UInt32, s String') VALUES (2,'file2'), (22,'file22');
|
||||
```
|
||||
|
||||
Прочитаем содержимое файлов `test1.csv` и `test2.csv` с помощью табличной функции `fileCluster`:
|
||||
|
||||
```sql
|
||||
SELECT * FROM fileCluster(
    'my_cluster', 'file{1,2}.csv', 'CSV', 'i UInt32, s String'
) ORDER BY (i, s)
|
||||
```
|
||||
|
||||
```
|
||||
┌──i─┬─s──────┐
|
||||
│ 1 │ file1 │
|
||||
│ 11 │ file11 │
|
||||
└────┴────────┘
|
||||
┌──i─┬─s──────┐
|
||||
│ 2 │ file2 │
|
||||
│ 22 │ file22 │
|
||||
└────┴────────┘
|
||||
```
|
||||
|
||||
|
||||
## Шаблоны поиска в компонентах пути {#globs_in_path}
|
||||
|
||||
Поддерживаются все шаблоны поиска, что поддерживаются табличной функцией [File](../../sql-reference/table-functions/file.md#globs-in-path).
|
||||
|
||||
**Смотрите также**
|
||||
|
||||
- [File (табличная функция)](../../sql-reference/table-functions/file.md)
|
@ -14,7 +14,7 @@ ClickHouse提供了一个原生命令行客户端`clickhouse-client`客户端支
|
||||
$ clickhouse-client
|
||||
ClickHouse client version 19.17.1.1579 (official build).
|
||||
Connecting to localhost:9000 as user default.
|
||||
Connected to ClickHouse server version 19.17.1 revision 54428.
|
||||
Connected to ClickHouse server version 19.17.1.
|
||||
|
||||
:)
|
||||
```
|
||||
|
@ -22,7 +22,7 @@ ClickHouse创建此表时 [trace_log](../../operations/server-configuration-para
|
||||
|
||||
- `revision` ([UInt32](../../sql-reference/data-types/int-uint.md)) — ClickHouse server build revision.
|
||||
|
||||
通过以下方式连接到服务器 `clickhouse-client`,你看到的字符串类似于 `Connected to ClickHouse server version 19.18.1 revision 54429.`. 该字段包含 `revision`,但不是 `version` 的服务器。
|
||||
通过以下方式连接到服务器 `clickhouse-client`,你看到的字符串类似于 `Connected to ClickHouse server version 19.18.1.`. 该字段包含 `revision`,但不是 `version` 的服务器。
|
||||
|
||||
- `timer_type` ([枚举8](../../sql-reference/data-types/enum.md)) — Timer type:
|
||||
|
||||
|
@ -44,6 +44,8 @@ contents:
|
||||
dst: /usr/bin/clickhouse-odbc-bridge
|
||||
- src: root/usr/share/bash-completion/completions
|
||||
dst: /usr/share/bash-completion/completions
|
||||
- src: root/usr/share/clickhouse
|
||||
dst: /usr/share/clickhouse
|
||||
# docs
|
||||
- src: ../AUTHORS
|
||||
dst: /usr/share/doc/clickhouse-common-static/AUTHORS
|
||||
|
@ -457,3 +457,10 @@ endif()
|
||||
if (ENABLE_FUZZING)
|
||||
add_compile_definitions(FUZZING_MODE=1)
|
||||
endif ()
|
||||
|
||||
if (TARGET ch_contrib::protobuf)
|
||||
get_property(google_proto_files TARGET ch_contrib::protobuf PROPERTY google_proto_files)
|
||||
foreach (proto_file IN LISTS google_proto_files)
|
||||
install(FILES ${proto_file} DESTINATION ${CMAKE_INSTALL_DATAROOTDIR}/clickhouse/protos/google/protobuf)
|
||||
endforeach()
|
||||
endif ()
|
||||
|
@ -306,6 +306,10 @@ void Client::initialize(Poco::Util::Application & self)
|
||||
/// Set path for format schema files
|
||||
if (config().has("format_schema_path"))
|
||||
global_context->setFormatSchemaPath(fs::weakly_canonical(config().getString("format_schema_path")));
|
||||
|
||||
/// Set the path for google proto files
|
||||
if (config().has("google_protos_path"))
|
||||
global_context->setGoogleProtosPath(fs::weakly_canonical(config().getString("google_protos_path")));
|
||||
}
|
||||
|
||||
|
||||
@ -489,8 +493,7 @@ void Client::connect()
|
||||
|
||||
if (is_interactive)
|
||||
{
|
||||
std::cout << "Connected to " << server_name << " server version " << server_version << " revision " << server_revision << "."
|
||||
<< std::endl << std::endl;
|
||||
std::cout << "Connected to " << server_name << " server version " << server_version << "." << std::endl << std::endl;
|
||||
|
||||
auto client_version_tuple = std::make_tuple(VERSION_MAJOR, VERSION_MINOR, VERSION_PATCH);
|
||||
auto server_version_tuple = std::make_tuple(server_version_major, server_version_minor, server_version_patch);
|
||||
|
@ -37,7 +37,7 @@
|
||||
<production>{display_name} \e[1;31m:)\e[0m </production> <!-- if it matched to the substring "production" in the server display name -->
|
||||
</prompt_by_server_display_name>
|
||||
|
||||
<!--
|
||||
<!--
|
||||
Settings adjustable via command-line parameters
|
||||
can take their defaults from that config file, see examples:
|
||||
|
||||
@ -58,6 +58,9 @@
|
||||
The same can be done on user-level configuration, just create & adjust: ~/.clickhouse-client/config.xml
|
||||
-->
|
||||
|
||||
<!-- Directory containing the proto files for the well-known Protobuf types.
|
||||
-->
|
||||
<google_protos_path>/usr/share/clickhouse/protos/</google_protos_path>
|
||||
|
||||
<!-- Analog of .netrc -->
|
||||
<![CDATA[
|
||||
|
@ -36,7 +36,7 @@ public:
|
||||
|
||||
void execute(
|
||||
const std::vector<String> & command_arguments,
|
||||
DB::ContextMutablePtr & global_context,
|
||||
std::shared_ptr<DiskSelector> & disk_selector,
|
||||
Poco::Util::LayeredConfiguration & config) override
|
||||
{
|
||||
if (command_arguments.size() != 2)
|
||||
@ -51,8 +51,8 @@ public:
|
||||
const String & path_from = command_arguments[0];
|
||||
const String & path_to = command_arguments[1];
|
||||
|
||||
DiskPtr disk_from = global_context->getDisk(disk_name_from);
|
||||
DiskPtr disk_to = global_context->getDisk(disk_name_to);
|
||||
DiskPtr disk_from = disk_selector->get(disk_name_from);
|
||||
DiskPtr disk_to = disk_selector->get(disk_name_to);
|
||||
|
||||
String relative_path_from = validatePathAndGetAsRelative(path_from);
|
||||
String relative_path_to = validatePathAndGetAsRelative(path_to);
|
||||
|
@ -27,7 +27,7 @@ public:
|
||||
|
||||
void execute(
|
||||
const std::vector<String> & command_arguments,
|
||||
DB::ContextMutablePtr & global_context,
|
||||
std::shared_ptr<DiskSelector> & disk_selector,
|
||||
Poco::Util::LayeredConfiguration & config) override
|
||||
{
|
||||
if (command_arguments.size() != 2)
|
||||
@ -41,7 +41,7 @@ public:
|
||||
const String & path_from = command_arguments[0];
|
||||
const String & path_to = command_arguments[1];
|
||||
|
||||
DiskPtr disk = global_context->getDisk(disk_name);
|
||||
DiskPtr disk = disk_selector->get(disk_name);
|
||||
|
||||
String relative_path_from = validatePathAndGetAsRelative(path_from);
|
||||
String relative_path_to = validatePathAndGetAsRelative(path_to);
|
||||
|
@ -33,7 +33,7 @@ public:
|
||||
|
||||
void execute(
|
||||
const std::vector<String> & command_arguments,
|
||||
DB::ContextMutablePtr & global_context,
|
||||
std::shared_ptr<DiskSelector> & disk_selector,
|
||||
Poco::Util::LayeredConfiguration & config) override
|
||||
{
|
||||
if (command_arguments.size() != 1)
|
||||
@ -46,7 +46,7 @@ public:
|
||||
|
||||
const String & path = command_arguments[0];
|
||||
|
||||
DiskPtr disk = global_context->getDisk(disk_name);
|
||||
DiskPtr disk = disk_selector->get(disk_name);
|
||||
|
||||
String relative_path = validatePathAndGetAsRelative(path);
|
||||
|
||||
|
@ -26,8 +26,8 @@ public:
|
||||
|
||||
void execute(
|
||||
const std::vector<String> & command_arguments,
|
||||
DB::ContextMutablePtr & global_context,
|
||||
Poco::Util::LayeredConfiguration &) override
|
||||
std::shared_ptr<DiskSelector> &,
|
||||
Poco::Util::LayeredConfiguration & config) override
|
||||
{
|
||||
if (!command_arguments.empty())
|
||||
{
|
||||
@ -35,8 +35,29 @@ public:
|
||||
throw DB::Exception(DB::ErrorCodes::BAD_ARGUMENTS, "Bad Arguments");
|
||||
}
|
||||
|
||||
for (const auto & [disk_name, _] : global_context->getDisksMap())
|
||||
std::cout << disk_name << '\n';
|
||||
constexpr auto config_prefix = "storage_configuration.disks";
|
||||
constexpr auto default_disk_name = "default";
|
||||
|
||||
Poco::Util::AbstractConfiguration::Keys keys;
|
||||
config.keys(config_prefix, keys);
|
||||
|
||||
bool has_default_disk = false;
|
||||
|
||||
/// For the output to be ordered
|
||||
std::set<String> disks;
|
||||
|
||||
for (const auto & disk_name : keys)
|
||||
{
|
||||
if (disk_name == default_disk_name)
|
||||
has_default_disk = true;
|
||||
disks.insert(disk_name);
|
||||
}
|
||||
|
||||
if (!has_default_disk)
|
||||
disks.insert(default_disk_name);
|
||||
|
||||
for (const auto & disk : disks)
|
||||
std::cout << disk << '\n';
|
||||
}
|
||||
};
|
||||
}
|
||||
|
@ -34,7 +34,7 @@ public:
|
||||
|
||||
void execute(
|
||||
const std::vector<String> & command_arguments,
|
||||
DB::ContextMutablePtr & global_context,
|
||||
std::shared_ptr<DiskSelector> & disk_selector,
|
||||
Poco::Util::LayeredConfiguration & config) override
|
||||
{
|
||||
if (command_arguments.size() != 1)
|
||||
@ -47,7 +47,7 @@ public:
|
||||
|
||||
const String & path = command_arguments[0];
|
||||
|
||||
DiskPtr disk = global_context->getDisk(disk_name);
|
||||
DiskPtr disk = disk_selector->get(disk_name);
|
||||
|
||||
String relative_path = validatePathAndGetAsRelative(path);
|
||||
bool recursive = config.getBool("recursive", false);
|
||||
|
@ -26,7 +26,7 @@ public:
|
||||
|
||||
void execute(
|
||||
const std::vector<String> & command_arguments,
|
||||
DB::ContextMutablePtr & global_context,
|
||||
std::shared_ptr<DiskSelector> & disk_selector,
|
||||
Poco::Util::LayeredConfiguration & config) override
|
||||
{
|
||||
if (command_arguments.size() != 2)
|
||||
@ -40,7 +40,7 @@ public:
|
||||
const String & path_from = command_arguments[0];
|
||||
const String & path_to = command_arguments[1];
|
||||
|
||||
DiskPtr disk = global_context->getDisk(disk_name);
|
||||
DiskPtr disk = disk_selector->get(disk_name);
|
||||
|
||||
String relative_path_from = validatePathAndGetAsRelative(path_from);
|
||||
String relative_path_to = validatePathAndGetAsRelative(path_to);
|
||||
|
@ -36,7 +36,7 @@ public:
|
||||
|
||||
void execute(
|
||||
const std::vector<String> & command_arguments,
|
||||
DB::ContextMutablePtr & global_context,
|
||||
std::shared_ptr<DiskSelector> & disk_selector,
|
||||
Poco::Util::LayeredConfiguration & config) override
|
||||
{
|
||||
if (command_arguments.size() != 1)
|
||||
@ -47,7 +47,7 @@ public:
|
||||
|
||||
String disk_name = config.getString("disk", "default");
|
||||
|
||||
DiskPtr disk = global_context->getDisk(disk_name);
|
||||
DiskPtr disk = disk_selector->get(disk_name);
|
||||
|
||||
String relative_path = validatePathAndGetAsRelative(command_arguments[0]);
|
||||
|
||||
|
@ -26,7 +26,7 @@ public:
|
||||
|
||||
void execute(
|
||||
const std::vector<String> & command_arguments,
|
||||
DB::ContextMutablePtr & global_context,
|
||||
std::shared_ptr<DiskSelector> & disk_selector,
|
||||
Poco::Util::LayeredConfiguration & config) override
|
||||
{
|
||||
if (command_arguments.size() != 1)
|
||||
@ -39,7 +39,7 @@ public:
|
||||
|
||||
const String & path = command_arguments[0];
|
||||
|
||||
DiskPtr disk = global_context->getDisk(disk_name);
|
||||
DiskPtr disk = disk_selector->get(disk_name);
|
||||
|
||||
String relative_path = validatePathAndGetAsRelative(path);
|
||||
|
||||
|
@ -37,7 +37,7 @@ public:
|
||||
|
||||
void execute(
|
||||
const std::vector<String> & command_arguments,
|
||||
DB::ContextMutablePtr & global_context,
|
||||
std::shared_ptr<DiskSelector> & disk_selector,
|
||||
Poco::Util::LayeredConfiguration & config) override
|
||||
{
|
||||
if (command_arguments.size() != 1)
|
||||
@ -50,7 +50,7 @@ public:
|
||||
|
||||
const String & path = command_arguments[0];
|
||||
|
||||
DiskPtr disk = global_context->getDisk(disk_name);
|
||||
DiskPtr disk = disk_selector->get(disk_name);
|
||||
|
||||
String relative_path = validatePathAndGetAsRelative(path);
|
||||
|
||||
|
@ -209,7 +209,35 @@ int DisksApp::main(const std::vector<String> & /*args*/)
|
||||
po::parsed_options parsed = parser.run();
|
||||
args = po::collect_unrecognized(parsed.options, po::collect_unrecognized_mode::include_positional);
|
||||
}
|
||||
command->execute(args, global_context, config());
|
||||
|
||||
std::unordered_set<std::string> disks
|
||||
{
|
||||
config().getString("disk", "default"),
|
||||
config().getString("disk-from", config().getString("disk", "default")),
|
||||
config().getString("disk-to", config().getString("disk", "default")),
|
||||
};
|
||||
|
||||
auto validator = [&disks](
|
||||
const Poco::Util::AbstractConfiguration & config,
|
||||
const std::string & disk_config_prefix,
|
||||
const std::string & disk_name)
|
||||
{
|
||||
if (!disks.contains(disk_name))
|
||||
return false;
|
||||
|
||||
const auto disk_type = config.getString(disk_config_prefix + ".type", "local");
|
||||
|
||||
if (disk_type == "cache")
|
||||
throw Exception(ErrorCodes::BAD_ARGUMENTS, "Disk type 'cache' of disk {} is not supported by clickhouse-disks", disk_name);
|
||||
|
||||
return true;
|
||||
};
|
||||
|
||||
constexpr auto config_prefix = "storage_configuration.disks";
|
||||
auto disk_selector = std::make_shared<DiskSelector>();
|
||||
disk_selector->initialize(config(), config_prefix, global_context, validator);
|
||||
|
||||
command->execute(args, disk_selector, config());
|
||||
|
||||
return Application::EXIT_OK;
|
||||
}
|
||||
|
@@ -1,6 +1,7 @@
#pragma once

#include <Disks/IDisk.h>
#include <Disks/DiskSelector.h>

#include <boost/program_options.hpp>

@@ -25,7 +26,7 @@ public:

virtual void execute(
const std::vector<String> & command_arguments,
DB::ContextMutablePtr & global_context,
std::shared_ptr<DiskSelector> & disk_selector,
Poco::Util::LayeredConfiguration & config) = 0;

const std::optional<ProgramOptionsDescription> & getCommandOptions() const { return command_option_description; }
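
The hunks above change every clickhouse-disks command in the same way: a command no longer resolves disks through the global context but through a shared DiskSelector that DisksApp builds once (with the validator shown earlier) and passes into execute(). A minimal sketch of a command under the new signature — the class name and the error message are illustrative, not part of the patch:

    class CommandExample final : public ICommand
    {
    public:
        void execute(
            const std::vector<String> & command_arguments,
            DB::ContextMutablePtr & /*global_context*/,
            std::shared_ptr<DiskSelector> & disk_selector,
            Poco::Util::LayeredConfiguration & config) override
        {
            if (command_arguments.size() != 1)
                throw DB::Exception(DB::ErrorCodes::BAD_ARGUMENTS, "Expected exactly one path argument");

            String disk_name = config.getString("disk", "default");
            DiskPtr disk = disk_selector->get(disk_name);   /// was: global_context->getDisk(disk_name)
            String relative_path = validatePathAndGetAsRelative(command_arguments[0]);
            /// ... operate on `disk` at `relative_path` ...
        }
    };
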
@@ -41,6 +41,7 @@
<min_session_timeout_ms>10000</min_session_timeout_ms>
<session_timeout_ms>100000</session_timeout_ms>
<raft_logs_level>information</raft_logs_level>
<compress_logs>false</compress_logs>
<!-- All settings listed in https://github.com/ClickHouse/ClickHouse/blob/master/src/Coordination/CoordinationSettings.h -->
</coordination_settings>
@@ -23,6 +23,7 @@
#include <Common/scope_guard_safe.h>
#include <Interpreters/Session.h>
#include <Access/AccessControl.h>
#include <Common/PoolId.h>
#include <Common/Exception.h>
#include <Common/Macros.h>
#include <Common/Config/ConfigProcessor.h>
@@ -742,16 +743,16 @@ void LocalServer::processConfig()
status.emplace(fs::path(path) / "status", StatusFile::write_full_info);

LOG_DEBUG(log, "Loading metadata from {}", path);
loadMetadataSystem(global_context);
auto startup_system_tasks = loadMetadataSystem(global_context);
attachSystemTablesLocal(global_context, *createMemoryDatabaseIfNotExists(global_context, DatabaseCatalog::SYSTEM_DATABASE));
attachInformationSchema(global_context, *createMemoryDatabaseIfNotExists(global_context, DatabaseCatalog::INFORMATION_SCHEMA));
attachInformationSchema(global_context, *createMemoryDatabaseIfNotExists(global_context, DatabaseCatalog::INFORMATION_SCHEMA_UPPERCASE));
startupSystemTables();
waitLoad(TablesLoaderForegroundPoolId, startup_system_tasks);

if (!config().has("only-system-tables"))
{
DatabaseCatalog::instance().createBackgroundTasks();
loadMetadata(global_context);
waitLoad(loadMetadata(global_context));
DatabaseCatalog::instance().startupBackgroundTasks();
}
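
Condensed, the new clickhouse-local startup path looks like the sketch below: the metadata loaders now return load tasks that run on the AsyncLoader, and the caller decides where to wait for them. Names are taken from the hunk above; surrounding code is omitted.

    auto startup_system_tasks = loadMetadataSystem(global_context);   /// schedules loading, no longer blocks
    attachSystemTablesLocal(global_context, *createMemoryDatabaseIfNotExists(global_context, DatabaseCatalog::SYSTEM_DATABASE));
    waitLoad(TablesLoaderForegroundPoolId, startup_system_tasks);     /// block until system tables are started

    if (!config().has("only-system-tables"))
    {
        DatabaseCatalog::instance().createBackgroundTasks();
        waitLoad(loadMetadata(global_context));                       /// load user databases and wait
        DatabaseCatalog::instance().startupBackgroundTasks();
    }
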
@ -20,6 +20,7 @@
|
||||
#include <base/coverage.h>
|
||||
#include <base/getFQDNOrHostName.h>
|
||||
#include <base/safeExit.h>
|
||||
#include <Common/PoolId.h>
|
||||
#include <Common/MemoryTracker.h>
|
||||
#include <Common/ClickHouseRevision.h>
|
||||
#include <Common/DNSResolver.h>
|
||||
@ -1334,6 +1335,10 @@ try
|
||||
global_context->getMessageBrokerSchedulePool().increaseThreadsCount(server_settings_.background_message_broker_schedule_pool_size);
|
||||
global_context->getDistributedSchedulePool().increaseThreadsCount(server_settings_.background_distributed_schedule_pool_size);
|
||||
|
||||
global_context->getAsyncLoader().setMaxThreads(TablesLoaderForegroundPoolId, server_settings_.tables_loader_foreground_pool_size);
|
||||
global_context->getAsyncLoader().setMaxThreads(TablesLoaderBackgroundLoadPoolId, server_settings_.tables_loader_background_pool_size);
|
||||
global_context->getAsyncLoader().setMaxThreads(TablesLoaderBackgroundStartupPoolId, server_settings_.tables_loader_background_pool_size);
|
||||
|
||||
getIOThreadPool().reloadConfiguration(
|
||||
server_settings.max_io_thread_pool_size,
|
||||
server_settings.max_io_thread_pool_free_size,
|
||||
@ -1575,6 +1580,10 @@ try
|
||||
global_context->setFormatSchemaPath(format_schema_path);
|
||||
fs::create_directories(format_schema_path);
|
||||
|
||||
/// Set the path for google proto files
|
||||
if (config().has("google_protos_path"))
|
||||
global_context->setGoogleProtosPath(fs::weakly_canonical(config().getString("google_protos_path")));
|
||||
|
||||
/// Set path for filesystem caches
|
||||
fs::path filesystem_caches_path(config().getString("filesystem_caches_path", ""));
|
||||
if (!filesystem_caches_path.empty())
|
||||
@ -1670,17 +1679,18 @@ try
|
||||
|
||||
LOG_INFO(log, "Loading metadata from {}", path_str);
|
||||
|
||||
LoadTaskPtrs load_metadata_tasks;
|
||||
try
|
||||
{
|
||||
auto & database_catalog = DatabaseCatalog::instance();
|
||||
/// We load temporary database first, because projections need it.
|
||||
database_catalog.initializeAndLoadTemporaryDatabase();
|
||||
loadMetadataSystem(global_context);
|
||||
maybeConvertSystemDatabase(global_context);
|
||||
auto system_startup_tasks = loadMetadataSystem(global_context);
|
||||
maybeConvertSystemDatabase(global_context, system_startup_tasks);
|
||||
/// This has to be done before the initialization of system logs,
|
||||
/// otherwise there is a race condition between the system database initialization
|
||||
/// and creation of new tables in the database.
|
||||
startupSystemTables();
|
||||
waitLoad(TablesLoaderForegroundPoolId, system_startup_tasks);
|
||||
/// After attaching system databases we can initialize system log.
|
||||
global_context->initializeSystemLogs();
|
||||
global_context->setSystemZooKeeperLogAfterInitializationIfNeeded();
|
||||
@ -1696,9 +1706,10 @@ try
|
||||
/// and so loadMarkedAsDroppedTables() will find it and try to add, and UUID will overlap.
|
||||
database_catalog.loadMarkedAsDroppedTables();
|
||||
database_catalog.createBackgroundTasks();
|
||||
/// Then, load remaining databases
|
||||
loadMetadata(global_context, default_database);
|
||||
convertDatabasesEnginesIfNeed(global_context);
|
||||
/// Then, load remaining databases (some of them maybe be loaded asynchronously)
|
||||
load_metadata_tasks = loadMetadata(global_context, default_database, server_settings.async_load_databases);
|
||||
/// If we need to convert database engines, disable async tables loading
|
||||
convertDatabasesEnginesIfNeed(load_metadata_tasks, global_context);
|
||||
database_catalog.startupBackgroundTasks();
|
||||
/// After loading validate that default database exists
|
||||
database_catalog.assertDatabaseExists(default_database);
|
||||
@ -1710,6 +1721,7 @@ try
|
||||
tryLogCurrentException(log, "Caught exception while loading metadata");
|
||||
throw;
|
||||
}
|
||||
|
||||
LOG_DEBUG(log, "Loaded metadata.");
|
||||
|
||||
/// Init trace collector only after trace_log system table was created
|
||||
@ -1865,9 +1877,14 @@ try
|
||||
throw Exception(ErrorCodes::ARGUMENT_OUT_OF_BOUND, "distributed_ddl.pool_size should be greater then 0");
|
||||
global_context->setDDLWorker(std::make_unique<DDLWorker>(pool_size, ddl_zookeeper_path, global_context, &config(),
|
||||
"distributed_ddl", "DDLWorker",
|
||||
&CurrentMetrics::MaxDDLEntryID, &CurrentMetrics::MaxPushedDDLEntryID));
|
||||
&CurrentMetrics::MaxDDLEntryID, &CurrentMetrics::MaxPushedDDLEntryID),
|
||||
load_metadata_tasks);
|
||||
}
|
||||
|
||||
/// Do not keep tasks in server, they should be kept inside databases. Used here to make dependent tasks only.
|
||||
load_metadata_tasks.clear();
|
||||
load_metadata_tasks.shrink_to_fit();
|
||||
|
||||
{
|
||||
std::lock_guard lock(servers_lock);
|
||||
for (auto & server : servers)
|
||||
|
@@ -3,6 +3,7 @@
<tmp_path replace="replace">./tmp/</tmp_path>
<user_files_path replace="replace">./user_files/</user_files_path>
<format_schema_path replace="replace">./format_schemas/</format_schema_path>
<google_protos_path replace="replace">../../contrib/google-protobuf/src/</google_protos_path>
<access_control_path replace="replace">./access/</access_control_path>
<top_level_domains_path replace="replace">./top_level_domains/</top_level_domains_path>
</clickhouse>
@@ -364,8 +364,15 @@
<background_schedule_pool_size>128</background_schedule_pool_size>
<background_message_broker_schedule_pool_size>16</background_message_broker_schedule_pool_size>
<background_distributed_schedule_pool_size>16</background_distributed_schedule_pool_size>
<tables_loader_foreground_pool_size>0</tables_loader_foreground_pool_size>
<tables_loader_background_pool_size>0</tables_loader_background_pool_size>
-->

<!-- Enables asynchronous loading of databases and tables to speedup server startup.
Queries to not yet loaded entity will be blocked until load is finished.
-->
<!-- <async_load_databases>true</async_load_databases> -->

<!-- On memory constrained environments you may have to set this to value larger than 1.
-->
<max_server_memory_usage_to_ram_ratio>0.9</max_server_memory_usage_to_ram_ratio>
@@ -1428,6 +1435,10 @@
-->
<format_schema_path>/var/lib/clickhouse/format_schemas/</format_schema_path>

<!-- Directory containing the proto files for the well-known Protobuf types.
-->
<google_protos_path>/usr/share/clickhouse/protos/</google_protos_path>

<!-- Default query masking rules, matching lines would be replaced with something else in the logs
(both text logs and system.query_log).
name - name for the rule (optional)
@ -108,7 +108,7 @@
|
||||
filter: blur(1px);
|
||||
}
|
||||
|
||||
.chart div { position: absolute; }
|
||||
.chart > div { position: absolute; }
|
||||
|
||||
.inputs {
|
||||
height: auto;
|
||||
@ -215,8 +215,6 @@
|
||||
color: var(--text-color);
|
||||
}
|
||||
|
||||
.u-legend th { display: none; }
|
||||
|
||||
.themes {
|
||||
float: right;
|
||||
font-size: 20pt;
|
||||
@ -433,6 +431,16 @@
|
||||
display: none;
|
||||
}
|
||||
|
||||
.u-series {
|
||||
line-height: 0.8;
|
||||
}
|
||||
|
||||
.u-series.footer {
|
||||
font-size: 8px;
|
||||
padding-top: 0;
|
||||
margin-top: 0;
|
||||
}
|
||||
|
||||
/* Source: https://cdn.jsdelivr.net/npm/uplot@1.6.21/dist/uPlot.min.css
|
||||
* It is copy-pasted to lower the number of requests.
|
||||
*/
|
||||
@ -478,7 +486,6 @@
|
||||
* - compress the state for URL's #hash;
|
||||
* - footer with "about" or a link to source code;
|
||||
* - allow to configure a table on a server to save the dashboards;
|
||||
* - multiple lines on chart;
|
||||
* - if a query returned one value, display this value instead of a diagram;
|
||||
* - if a query returned something unusual, display the table;
|
||||
*/
|
||||
@ -520,10 +527,54 @@ let queries = [];
|
||||
/// Query parameters with predefined default values.
|
||||
/// All other parameters will be automatically found in the queries.
|
||||
let params = {
|
||||
"rounding": "60",
|
||||
"seconds": "86400"
|
||||
'rounding': '60',
|
||||
'seconds': '86400'
|
||||
};
|
||||
|
||||
/// Palette generation for charts
|
||||
function generatePalette(baseColor, numColors) {
|
||||
const baseHSL = hexToHsl(baseColor);
|
||||
const hueStep = 360 / numColors;
|
||||
const palette = [];
|
||||
for (let i = 0; i < numColors; i++) {
|
||||
const hue = Math.round((baseHSL.h + i * hueStep) % 360);
|
||||
const color = `hsl(${hue}, ${baseHSL.s}%, ${baseHSL.l}%)`;
|
||||
palette.push(color);
|
||||
}
|
||||
return palette;
|
||||
}
|
||||
|
||||
/// Helper function to convert hex color to HSL
|
||||
function hexToHsl(hex) {
|
||||
hex = hex.replace(/^#/, '');
|
||||
const bigint = parseInt(hex, 16);
|
||||
const r = (bigint >> 16) & 255;
|
||||
const g = (bigint >> 8) & 255;
|
||||
const b = bigint & 255;
|
||||
const r_norm = r / 255;
|
||||
const g_norm = g / 255;
|
||||
const b_norm = b / 255;
|
||||
const max = Math.max(r_norm, g_norm, b_norm);
|
||||
const min = Math.min(r_norm, g_norm, b_norm);
|
||||
const l = (max + min) / 2;
|
||||
let s = 0;
|
||||
if (max !== min) {
|
||||
s = l > 0.5 ? (max - min) / (2 - max - min) : (max - min) / (max + min);
|
||||
}
|
||||
let h = 0;
|
||||
if (max !== min) {
|
||||
if (max === r_norm) {
|
||||
h = (g_norm - b_norm) / (max - min) + (g_norm < b_norm ? 6 : 0);
|
||||
} else if (max === g_norm) {
|
||||
h = (b_norm - r_norm) / (max - min) + 2;
|
||||
} else {
|
||||
h = (r_norm - g_norm) / (max - min) + 4;
|
||||
}
|
||||
}
|
||||
h = Math.round(h * 60);
|
||||
return { h, s: Math.round(s * 100), l: Math.round(l * 100) };
|
||||
}
|
||||
|
||||
let theme = 'light';
|
||||
|
||||
function setTheme(new_theme) {
|
||||
@ -913,6 +964,8 @@ document.getElementById('mass-editor-textarea').addEventListener('input', e => {
|
||||
|
||||
function legendAsTooltipPlugin({ className, style = { background: "var(--legend-background)" } } = {}) {
|
||||
let legendEl;
|
||||
let showTop = false;
|
||||
const showLimit = 5;
|
||||
|
||||
function init(u, opts) {
|
||||
legendEl = u.root.querySelector(".u-legend");
|
||||
@ -932,13 +985,28 @@ function legendAsTooltipPlugin({ className, style = { background: "var(--legend-
|
||||
...style
|
||||
});
|
||||
|
||||
// hide series color markers
|
||||
const idents = legendEl.querySelectorAll(".u-marker");
|
||||
if (opts.series.length == 2) {
|
||||
const nodes = legendEl.querySelectorAll("th");
|
||||
for (let i = 0; i < nodes.length; i++)
|
||||
nodes[i].style.display = "none";
|
||||
} else {
|
||||
legendEl.querySelector("th").remove();
|
||||
legendEl.querySelector("td").setAttribute('colspan', '2');
|
||||
legendEl.querySelector("td").style.textAlign = 'center';
|
||||
}
|
||||
|
||||
for (let i = 0; i < idents.length; i++)
|
||||
idents[i].style.display = "none";
|
||||
if (opts.series.length - 1 > showLimit) {
|
||||
showTop = true;
|
||||
let footer = legendEl.insertRow().insertCell();
|
||||
footer.setAttribute('colspan', '2');
|
||||
footer.style.textAlign = 'center';
|
||||
footer.classList.add('u-value');
|
||||
footer.parentNode.classList.add('u-series','footer');
|
||||
footer.textContent = ". . .";
|
||||
}
|
||||
|
||||
const overEl = u.over;
|
||||
overEl.style.overflow = "visible";
|
||||
|
||||
overEl.appendChild(legendEl);
|
||||
|
||||
@ -946,11 +1014,28 @@ function legendAsTooltipPlugin({ className, style = { background: "var(--legend-
|
||||
overEl.addEventListener("mouseleave", () => {legendEl.style.display = "none";});
|
||||
}
|
||||
|
||||
function nodeListToArray(nodeList) {
|
||||
return Array.prototype.slice.call(nodeList);
|
||||
}
|
||||
|
||||
function update(u) {
|
||||
let { left, top } = u.cursor;
|
||||
left -= legendEl.clientWidth / 2;
|
||||
top -= legendEl.clientHeight / 2;
|
||||
legendEl.style.transform = "translate(" + left + "px, " + top + "px)";
|
||||
if (showTop) {
|
||||
let nodes = nodeListToArray(legendEl.querySelectorAll("tr"));
|
||||
let header = nodes.shift();
|
||||
let footer = nodes.pop();
|
||||
nodes.forEach(function (node) { node._sort_key = +node.querySelector("td").textContent; });
|
||||
nodes.sort((a, b) => +b._sort_key - +a._sort_key);
|
||||
nodes.forEach(function (node) { node.parentNode.appendChild(node); });
|
||||
for (let i = 0; i < nodes.length; i++) {
|
||||
nodes[i].style.display = i < showLimit ? null : "none";
|
||||
delete nodes[i]._sort_key;
|
||||
}
|
||||
footer.parentNode.appendChild(footer);
|
||||
}
|
||||
}
|
||||
|
||||
return {
|
||||
@ -961,12 +1046,13 @@ function legendAsTooltipPlugin({ className, style = { background: "var(--legend-
|
||||
};
|
||||
}
|
||||
|
||||
|
||||
async function doFetch(query, url_params = '') {
|
||||
host = document.getElementById('url').value || host;
|
||||
user = document.getElementById('user').value;
|
||||
password = document.getElementById('password').value;
|
||||
|
||||
let url = `${host}?default_format=JSONCompactColumns&enable_http_compression=1`
|
||||
let url = `${host}?default_format=JSONColumnsWithMetadata&enable_http_compression=1`
|
||||
|
||||
if (add_http_cors_header) {
|
||||
// For debug purposes, you may set add_http_cors_header from a browser console
|
||||
@ -980,14 +1066,17 @@ async function doFetch(query, url_params = '') {
|
||||
url += `&password=${encodeURIComponent(password)}`;
|
||||
}
|
||||
|
||||
let response, data, error;
|
||||
let response, reply, error;
|
||||
try {
|
||||
response = await fetch(url + url_params, { method: "POST", body: query });
|
||||
data = await response.text();
|
||||
reply = await response.text();
|
||||
if (response.ok) {
|
||||
data = JSON.parse(data);
|
||||
reply = JSON.parse(reply);
|
||||
if (reply.exception) {
|
||||
error = reply.exception;
|
||||
}
|
||||
} else {
|
||||
error = data;
|
||||
error = reply;
|
||||
}
|
||||
} catch (e) {
|
||||
console.log(e);
|
||||
@ -1006,7 +1095,7 @@ async function doFetch(query, url_params = '') {
|
||||
}
|
||||
}
|
||||
|
||||
return {data, error};
|
||||
return {reply, error};
|
||||
}
|
||||
|
||||
async function draw(idx, chart, url_params, query) {
|
||||
@ -1015,17 +1104,76 @@ async function draw(idx, chart, url_params, query) {
|
||||
plots[idx] = null;
|
||||
}
|
||||
|
||||
let {data, error} = await doFetch(query, url_params);
|
||||
let {reply, error} = await doFetch(query, url_params);
|
||||
if (!error) {
|
||||
if (reply.rows.length == 0) {
|
||||
error = "Query returned empty result.";
|
||||
} else if (reply.meta.length < 2) {
|
||||
error = "Query should return at least two columns: unix timestamp and value.";
|
||||
} else {
|
||||
for (let i = 0; i < reply.meta.length; i++) {
|
||||
let label = reply.meta[i].name;
|
||||
let column = reply.data[label];
|
||||
if (!Array.isArray(column) || column.length != reply.data[reply.meta[0].name].length) {
|
||||
error = "Wrong data format of the query.";
|
||||
break;
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// Transform string-labeled data to multi-column data
|
||||
function transformToColumns() {
|
||||
const x = reply.meta[0].name; // time; must be ordered
|
||||
const l = reply.meta[1].name; // string label column to distinguish series; must be ordered
|
||||
const y = reply.meta[2].name; // values; must have single value for (x, l) pair
|
||||
const labels = [...new Set(reply.data[l])].sort((a, b) => a - b);
|
||||
if (labels.includes('__time__')) {
|
||||
error = "The second column is not allowed to contain '__time__' values.";
|
||||
return;
|
||||
}
|
||||
const times = [...new Set(reply.data[x])].sort((a, b) => a - b);
|
||||
let new_meta = [{ name: '__time__', type: reply.meta[0].type }];
|
||||
let new_data = { __time__: [] };
|
||||
for (let label of labels) {
|
||||
new_meta.push({ name: label, type: reply.meta[2].type });
|
||||
new_data[label] = [];
|
||||
}
|
||||
let new_rows = 0;
|
||||
function row_done(row_time) {
|
||||
new_rows++;
|
||||
new_data.__time__.push(row_time);
|
||||
for (let label of labels) {
|
||||
if (new_data[label].length < new_rows) {
|
||||
new_data[label].push(null);
|
||||
}
|
||||
}
|
||||
}
|
||||
let prev_time = reply.data[x][0];
|
||||
const old_rows = reply.data[x].length;
|
||||
for (let i = 0; i < old_rows; i++) {
|
||||
const time = reply.data[x][i];
|
||||
const label = reply.data[l][i];
|
||||
const value = reply.data[y][i];
|
||||
if (prev_time != time) {
|
||||
row_done(prev_time);
|
||||
prev_time = time;
|
||||
}
|
||||
new_data[label].push(value);
|
||||
}
|
||||
row_done(prev_time);
|
||||
reply.meta = new_meta;
|
||||
reply.data = new_data;
|
||||
reply.rows = new_rows;
|
||||
}
|
||||
|
||||
function isStringColumn(type) {
|
||||
return type === 'String' || type === 'LowCardinality(String)';
|
||||
}
|
||||
|
||||
if (!error) {
|
||||
if (!Array.isArray(data)) {
|
||||
error = "Query should return an array.";
|
||||
} else if (data.length == 0) {
|
||||
error = "Query returned empty result.";
|
||||
} else if (data.length != 2) {
|
||||
error = "Query should return exactly two columns: unix timestamp and value.";
|
||||
} else if (!Array.isArray(data[0]) || !Array.isArray(data[1]) || data[0].length != data[1].length) {
|
||||
error = "Wrong data format of the query.";
|
||||
if (reply.meta.length == 3 && isStringColumn(reply.meta[1].type)) {
|
||||
transformToColumns();
|
||||
}
|
||||
}
|
||||
|
||||
@ -1043,24 +1191,38 @@ async function draw(idx, chart, url_params, query) {
|
||||
}
|
||||
|
||||
const [line_color, fill_color, grid_color, axes_color] = theme != 'dark'
|
||||
? ["#F88", "#FEE", "#EED", "#2c3235"]
|
||||
: ["#864", "#045", "#2c3235", "#c7d0d9"];
|
||||
? ["#ff8888", "#ffeeee", "#eeeedd", "#2c3235"]
|
||||
: ["#886644", "#004455", "#2c3235", "#c7d0d9"];
|
||||
|
||||
let sync = uPlot.sync("sync");
|
||||
|
||||
const max_value = Math.max(...data[1]);
|
||||
let axis = {
|
||||
stroke: axes_color,
|
||||
grid: { width: 1 / devicePixelRatio, stroke: grid_color },
|
||||
ticks: { width: 1 / devicePixelRatio, stroke: grid_color }
|
||||
};
|
||||
|
||||
let axes = [axis, axis];
|
||||
let series = [{ label: "x" }];
|
||||
let data = [reply.data[reply.meta[0].name]];
|
||||
|
||||
// Treat every column as series
|
||||
const series_count = reply.meta.length;
|
||||
const fill = series_count == 2 ? fill_color : undefined;
|
||||
const palette = generatePalette(line_color, series_count);
|
||||
let max_value = Number.NEGATIVE_INFINITY;
|
||||
for (let i = 1; i < series_count; i++) {
|
||||
let label = reply.meta[i].name;
|
||||
series.push({ label, stroke: palette[i - 1], fill });
|
||||
data.push(reply.data[label]);
|
||||
max_value = Math.max(max_value, ...reply.data[label]);
|
||||
}
|
||||
|
||||
const opts = {
|
||||
width: chart.clientWidth,
|
||||
height: chart.clientHeight,
|
||||
axes: [ { stroke: axes_color,
|
||||
grid: { width: 1 / devicePixelRatio, stroke: grid_color },
|
||||
ticks: { width: 1 / devicePixelRatio, stroke: grid_color } },
|
||||
{ stroke: axes_color,
|
||||
grid: { width: 1 / devicePixelRatio, stroke: grid_color },
|
||||
ticks: { width: 1 / devicePixelRatio, stroke: grid_color } } ],
|
||||
series: [ { label: "x" },
|
||||
{ label: "y", stroke: line_color, fill: fill_color } ],
|
||||
axes,
|
||||
series,
|
||||
padding: [ null, null, null, (Math.round(max_value * 100) / 100).toString().length * 6 - 10 ],
|
||||
plugins: [ legendAsTooltipPlugin() ],
|
||||
cursor: {
|
||||
@ -1216,22 +1378,21 @@ function saveState() {
|
||||
}
|
||||
|
||||
async function searchQueries() {
|
||||
let {data, error} = await doFetch(search_query);
|
||||
let {reply, error} = await doFetch(search_query);
|
||||
if (error) {
|
||||
throw new Error(error);
|
||||
}
|
||||
if (!Array.isArray(data)) {
|
||||
throw new Error("Search query should return an array.");
|
||||
} else if (data.length == 0) {
|
||||
let data = reply.data;
|
||||
if (reply.rows == 0) {
|
||||
throw new Error("Search query returned empty result.");
|
||||
} else if (data.length != 2) {
|
||||
} else if (reply.meta.length != 2 || reply.meta[0].name != "title" || reply.meta[1].name != "query") {
|
||||
throw new Error("Search query should return exactly two columns: title and query.");
|
||||
} else if (!Array.isArray(data[0]) || !Array.isArray(data[1]) || data[0].length != data[1].length) {
|
||||
} else if (!Array.isArray(data.title) || !Array.isArray(data.query) || data.title.length != data.query.length) {
|
||||
throw new Error("Wrong data format of the search query.");
|
||||
}
|
||||
|
||||
for (let i = 0; i < data[0].length; i++) {
|
||||
queries.push({title: data[0][i], query: data[1][i]});
|
||||
for (let i = 0; i < data.title.length; i++) {
|
||||
queries.push({title: data.title[i], query: data.query[i]});
|
||||
}
|
||||
|
||||
regenerate();
|
||||
|
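
The palette logic added to dashboard.html above boils down to rotating the hue of the base line colour in equal steps of 360/numColors degrees while keeping its saturation and lightness. A standalone sketch of the same arithmetic (in C++ rather than the page's JavaScript, purely for illustration; the base HSL values are assumed, not taken from the patch):

    #include <cmath>
    #include <cstdio>

    int main()
    {
        const double base_hue = 0.0;       /// hue of the base colour after hex -> HSL conversion
        const double s = 93.0, l = 77.0;   /// saturation / lightness kept from the base colour
        const int num_colors = 5;
        const double hue_step = 360.0 / num_colors;

        for (int i = 0; i < num_colors; ++i)
        {
            const double hue = std::fmod(base_hue + i * hue_step, 360.0);
            std::printf("hsl(%.0f, %.0f%%, %.0f%%)\n", hue, s, l);
        }
        return 0;
    }
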
@@ -51,6 +51,11 @@ enum class AccessType
M(ALTER_CLEAR_INDEX, "CLEAR INDEX", TABLE, ALTER_INDEX) \
M(ALTER_INDEX, "INDEX", GROUP, ALTER_TABLE) /* allows to execute ALTER ORDER BY or ALTER {ADD|DROP...} INDEX */\
\
M(ALTER_ADD_STATISTIC, "ALTER ADD STATISTIC", TABLE, ALTER_STATISTIC) \
M(ALTER_DROP_STATISTIC, "ALTER DROP STATISTIC", TABLE, ALTER_STATISTIC) \
M(ALTER_MATERIALIZE_STATISTIC, "ALTER MATERIALIZE STATISTIC", TABLE, ALTER_STATISTIC) \
M(ALTER_STATISTIC, "STATISTIC", GROUP, ALTER_TABLE) /* allows to execute ALTER STATISTIC */\
\
M(ALTER_ADD_PROJECTION, "ADD PROJECTION", TABLE, ALTER_PROJECTION) \
M(ALTER_DROP_PROJECTION, "DROP PROJECTION", TABLE, ALTER_PROJECTION) \
M(ALTER_MATERIALIZE_PROJECTION, "MATERIALIZE PROJECTION", TABLE, ALTER_PROJECTION) \
@ -1,26 +1,213 @@
|
||||
#include <AggregateFunctions/AggregateFunctionFactory.h>
|
||||
#include <AggregateFunctions/HelpersMinMaxAny.h>
|
||||
#include <IO/ReadHelpers.h>
|
||||
#include <IO/WriteHelpers.h>
|
||||
#include <base/defines.h>
|
||||
|
||||
|
||||
namespace DB
|
||||
{
|
||||
struct Settings;
|
||||
|
||||
namespace ErrorCodes
|
||||
{
|
||||
extern const int INCORRECT_DATA;
|
||||
extern const int LOGICAL_ERROR;
|
||||
}
|
||||
|
||||
namespace
|
||||
{
|
||||
struct AggregateFunctionAnyRespectNullsData
|
||||
{
|
||||
enum Status : UInt8
|
||||
{
|
||||
NotSet = 1,
|
||||
SetNull = 2,
|
||||
SetOther = 3
|
||||
};
|
||||
|
||||
Status status = Status::NotSet;
|
||||
Field value;
|
||||
|
||||
bool isSet() const { return status != Status::NotSet; }
|
||||
void setNull() { status = Status::SetNull; }
|
||||
void setOther() { status = Status::SetOther; }
|
||||
};
|
||||
|
||||
template <bool First>
|
||||
class AggregateFunctionAnyRespectNulls final
|
||||
: public IAggregateFunctionDataHelper<AggregateFunctionAnyRespectNullsData, AggregateFunctionAnyRespectNulls<First>>
|
||||
{
|
||||
public:
|
||||
using Data = AggregateFunctionAnyRespectNullsData;
|
||||
|
||||
SerializationPtr serialization;
|
||||
const bool returns_nullable_type = false;
|
||||
|
||||
explicit AggregateFunctionAnyRespectNulls(const DataTypePtr & type)
|
||||
: IAggregateFunctionDataHelper<Data, AggregateFunctionAnyRespectNulls<First>>({type}, {}, type)
|
||||
, serialization(type->getDefaultSerialization())
|
||||
, returns_nullable_type(type->isNullable())
|
||||
{
|
||||
}
|
||||
|
||||
String getName() const override
|
||||
{
|
||||
if constexpr (First)
|
||||
return "any_respect_nulls";
|
||||
else
|
||||
return "anyLast_respect_nulls";
|
||||
}
|
||||
|
||||
bool allocatesMemoryInArena() const override { return false; }
|
||||
|
||||
void addNull(AggregateDataPtr __restrict place) const
|
||||
{
|
||||
chassert(returns_nullable_type);
|
||||
auto & d = this->data(place);
|
||||
if (First && d.isSet())
|
||||
return;
|
||||
d.setNull();
|
||||
}
|
||||
|
||||
void add(AggregateDataPtr __restrict place, const IColumn ** columns, size_t row_num, Arena *) const override
|
||||
{
|
||||
if (columns[0]->isNullable())
|
||||
{
|
||||
if (columns[0]->isNullAt(row_num))
|
||||
return addNull(place);
|
||||
}
|
||||
auto & d = this->data(place);
|
||||
if (First && d.isSet())
|
||||
return;
|
||||
d.setOther();
|
||||
columns[0]->get(row_num, d.value);
|
||||
}
|
||||
|
||||
void addManyDefaults(AggregateDataPtr __restrict place, const IColumn ** columns, size_t, Arena * arena) const override
|
||||
{
|
||||
if (columns[0]->isNullable())
|
||||
addNull(place);
|
||||
else
|
||||
add(place, columns, 0, arena);
|
||||
}
|
||||
|
||||
void addBatchSinglePlace(
|
||||
size_t row_begin, size_t row_end, AggregateDataPtr place, const IColumn ** columns, Arena * arena, ssize_t if_argument_pos)
|
||||
const override
|
||||
{
|
||||
if (if_argument_pos >= 0)
|
||||
{
|
||||
const auto & flags = assert_cast<const ColumnUInt8 &>(*columns[if_argument_pos]).getData();
|
||||
size_t size = row_end - row_begin;
|
||||
for (size_t i = 0; i < size; ++i)
|
||||
{
|
||||
size_t pos = First ? row_begin + i : row_end - 1 - i;
|
||||
if (flags[pos])
|
||||
{
|
||||
add(place, columns, pos, arena);
|
||||
break;
|
||||
}
|
||||
}
|
||||
}
|
||||
else
|
||||
{
|
||||
size_t pos = First ? row_begin : row_end - 1;
|
||||
add(place, columns, pos, arena);
|
||||
}
|
||||
}
|
||||
|
||||
void addBatchSinglePlaceNotNull(
|
||||
size_t, size_t, AggregateDataPtr __restrict, const IColumn **, const UInt8 *, Arena *, ssize_t) const override
|
||||
{
|
||||
/// This should not happen since it means somebody else has preprocessed the data (NULLs or IFs) and might
|
||||
/// have discarded values that we need (NULLs)
|
||||
throw DB::Exception(ErrorCodes::LOGICAL_ERROR, "AggregateFunctionAnyRespectNulls::addBatchSinglePlaceNotNull called");
|
||||
}
|
||||
|
||||
void merge(AggregateDataPtr __restrict place, ConstAggregateDataPtr rhs, Arena *) const override
|
||||
{
|
||||
auto & d = this->data(place);
|
||||
if (First && d.isSet())
|
||||
return;
|
||||
|
||||
auto & other = this->data(rhs);
|
||||
if (other.isSet())
|
||||
{
|
||||
d.status = other.status;
|
||||
d.value = other.value;
|
||||
}
|
||||
}
|
||||
|
||||
void serialize(ConstAggregateDataPtr __restrict place, WriteBuffer & buf, std::optional<size_t> /* version */) const override
|
||||
{
|
||||
auto & d = this->data(place);
|
||||
UInt8 k = d.status;
|
||||
|
||||
writeBinaryLittleEndian<UInt8>(k, buf);
|
||||
if (k == Data::Status::SetOther)
|
||||
serialization->serializeBinary(d.value, buf, {});
|
||||
}
|
||||
|
||||
void deserialize(AggregateDataPtr place, ReadBuffer & buf, std::optional<size_t> /* version */, Arena *) const override
|
||||
{
|
||||
auto & d = this->data(place);
|
||||
UInt8 k = Data::Status::NotSet;
|
||||
readBinaryLittleEndian<UInt8>(k, buf);
|
||||
d.status = static_cast<Data::Status>(k);
|
||||
if (d.status == Data::Status::NotSet)
|
||||
return;
|
||||
else if (d.status == Data::Status::SetNull)
|
||||
{
|
||||
if (!returns_nullable_type)
|
||||
throw Exception(ErrorCodes::INCORRECT_DATA, "Incorrect type (NULL) in non-nullable {}State", getName());
|
||||
return;
|
||||
}
|
||||
else if (d.status == Data::Status::SetOther)
|
||||
serialization->deserializeBinary(d.value, buf, {});
|
||||
else
|
||||
throw Exception(ErrorCodes::INCORRECT_DATA, "Incorrect type ({}) in {}State", static_cast<Int8>(k), getName());
|
||||
}
|
||||
|
||||
void insertResultInto(AggregateDataPtr __restrict place, IColumn & to, Arena *) const override
|
||||
{
|
||||
auto & d = this->data(place);
|
||||
if (d.status == Data::Status::SetOther)
|
||||
to.insert(d.value);
|
||||
else
|
||||
to.insertDefault();
|
||||
}
|
||||
|
||||
AggregateFunctionPtr getOwnNullAdapter(
|
||||
const AggregateFunctionPtr & original_function,
|
||||
const DataTypes & /*arguments*/,
|
||||
const Array & /*params*/,
|
||||
const AggregateFunctionProperties & /*properties*/) const override
|
||||
{
|
||||
return original_function;
|
||||
}
|
||||
};
|
||||
|
||||
|
||||
template <bool First>
|
||||
IAggregateFunction * createAggregateFunctionSingleValueRespectNulls(
|
||||
const String & name, const DataTypes & argument_types, const Array & parameters, const Settings *)
|
||||
{
|
||||
assertNoParameters(name, parameters);
|
||||
assertUnary(name, argument_types);
|
||||
|
||||
return new AggregateFunctionAnyRespectNulls<First>(argument_types[0]);
|
||||
}
|
||||
|
||||
AggregateFunctionPtr createAggregateFunctionAny(const std::string & name, const DataTypes & argument_types, const Array & parameters, const Settings * settings)
|
||||
{
|
||||
return AggregateFunctionPtr(createAggregateFunctionSingleValue<AggregateFunctionsSingleValue, AggregateFunctionAnyData>(name, argument_types, parameters, settings));
|
||||
}
|
||||
|
||||
template <bool RespectNulls = false>
|
||||
AggregateFunctionPtr createAggregateFunctionNullableAny(
|
||||
AggregateFunctionPtr createAggregateFunctionAnyRespectNulls(
|
||||
const std::string & name, const DataTypes & argument_types, const Array & parameters, const Settings * settings)
|
||||
{
|
||||
return AggregateFunctionPtr(
|
||||
createAggregateFunctionSingleNullableValue<AggregateFunctionsSingleValue, AggregateFunctionAnyData, RespectNulls>(
|
||||
name, argument_types, parameters, settings));
|
||||
return AggregateFunctionPtr(createAggregateFunctionSingleValueRespectNulls<true>(name, argument_types, parameters, settings));
|
||||
}
|
||||
|
||||
AggregateFunctionPtr createAggregateFunctionAnyLast(const std::string & name, const DataTypes & argument_types, const Array & parameters, const Settings * settings)
|
||||
@ -28,13 +215,10 @@ AggregateFunctionPtr createAggregateFunctionAnyLast(const std::string & name, co
|
||||
return AggregateFunctionPtr(createAggregateFunctionSingleValue<AggregateFunctionsSingleValue, AggregateFunctionAnyLastData>(name, argument_types, parameters, settings));
|
||||
}
|
||||
|
||||
template <bool RespectNulls = false>
|
||||
AggregateFunctionPtr createAggregateFunctionNullableAnyLast(const std::string & name, const DataTypes & argument_types, const Array & parameters, const Settings * settings)
|
||||
AggregateFunctionPtr createAggregateFunctionAnyLastRespectNulls(
|
||||
const std::string & name, const DataTypes & argument_types, const Array & parameters, const Settings * settings)
|
||||
{
|
||||
return AggregateFunctionPtr(createAggregateFunctionSingleNullableValue<
|
||||
AggregateFunctionsSingleValue,
|
||||
AggregateFunctionAnyLastData,
|
||||
RespectNulls>(name, argument_types, parameters, settings));
|
||||
return AggregateFunctionPtr(createAggregateFunctionSingleValueRespectNulls<false>(name, argument_types, parameters, settings));
|
||||
}
|
||||
|
||||
AggregateFunctionPtr createAggregateFunctionAnyHeavy(const std::string & name, const DataTypes & argument_types, const Array & parameters, const Settings * settings)
|
||||
@ -46,26 +230,28 @@ AggregateFunctionPtr createAggregateFunctionAnyHeavy(const std::string & name, c
|
||||
|
||||
void registerAggregateFunctionsAny(AggregateFunctionFactory & factory)
|
||||
{
|
||||
AggregateFunctionProperties properties = { .returns_default_when_only_null = false, .is_order_dependent = true };
|
||||
AggregateFunctionProperties default_properties = {.returns_default_when_only_null = false, .is_order_dependent = true};
|
||||
AggregateFunctionProperties default_properties_for_respect_nulls
|
||||
= {.returns_default_when_only_null = false, .is_order_dependent = true, .is_window_function = true};
|
||||
|
||||
factory.registerFunction("any", { createAggregateFunctionAny, properties });
|
||||
factory.registerFunction("any", {createAggregateFunctionAny, default_properties});
|
||||
factory.registerAlias("any_value", "any", AggregateFunctionFactory::CaseInsensitive);
|
||||
factory.registerFunction("anyLast", { createAggregateFunctionAnyLast, properties });
|
||||
factory.registerFunction("anyHeavy", { createAggregateFunctionAnyHeavy, properties });
|
||||
factory.registerAlias("first_value", "any", AggregateFunctionFactory::CaseInsensitive);
|
||||
|
||||
// Synonyms for use as window functions.
|
||||
factory.registerFunction("first_value",
|
||||
{ createAggregateFunctionAny, properties },
|
||||
AggregateFunctionFactory::CaseInsensitive);
|
||||
factory.registerFunction("first_value_respect_nulls",
|
||||
{ createAggregateFunctionNullableAny<true>, properties },
|
||||
AggregateFunctionFactory::CaseInsensitive);
|
||||
factory.registerFunction("last_value",
|
||||
{ createAggregateFunctionAnyLast, properties },
|
||||
AggregateFunctionFactory::CaseInsensitive);
|
||||
factory.registerFunction("last_value_respect_nulls",
|
||||
{ createAggregateFunctionNullableAnyLast<true>, properties },
|
||||
AggregateFunctionFactory::CaseInsensitive);
|
||||
factory.registerFunction("any_respect_nulls", {createAggregateFunctionAnyRespectNulls, default_properties_for_respect_nulls});
|
||||
factory.registerAlias("any_value_respect_nulls", "any_respect_nulls", AggregateFunctionFactory::CaseInsensitive);
|
||||
factory.registerAlias("first_value_respect_nulls", "any_respect_nulls", AggregateFunctionFactory::CaseInsensitive);
|
||||
|
||||
factory.registerFunction("anyLast", {createAggregateFunctionAnyLast, default_properties});
|
||||
factory.registerAlias("last_value", "anyLast", AggregateFunctionFactory::CaseInsensitive);
|
||||
|
||||
factory.registerFunction("anyLast_respect_nulls", {createAggregateFunctionAnyLastRespectNulls, default_properties_for_respect_nulls});
|
||||
factory.registerAlias("last_value_respect_nulls", "anyLast_respect_nulls", AggregateFunctionFactory::CaseInsensitive);
|
||||
|
||||
factory.registerFunction("anyHeavy", {createAggregateFunctionAnyHeavy, default_properties});
|
||||
|
||||
factory.registerNullsActionTransformation("any", "any_respect_nulls");
|
||||
factory.registerNullsActionTransformation("anyLast", "anyLast_respect_nulls");
|
||||
}
|
||||
|
||||
}
|
||||
|
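
The net effect of the rewrite above is that RESPECT NULLS is no longer handled by a template flag on the generic single-value state but by the dedicated any_respect_nulls / anyLast_respect_nulls functions plus a registered name transformation. A rough sketch of how a lookup is expected to resolve through the factory (see the AggregateFunctionFactory changes further down in this diff; this is an illustration, not code from the patch):

    /// SELECT any(x) RESPECT NULLS FROM t
    AggregateFunctionProperties properties;
    auto func = AggregateFunctionFactory::instance().get(
        "any", NullsAction::RESPECT_NULLS, {argument_type}, /*parameters=*/{}, properties);
    /// `func` resolves to AggregateFunctionAnyRespectNulls<true>, registered as "any_respect_nulls",
    /// because registerNullsActionTransformation("any", "any_respect_nulls") links the two names.
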
@@ -77,7 +77,7 @@ public:
if (if_argument_pos >= 0)
{
const auto & flags = assert_cast<const ColumnUInt8 &>(*columns[if_argument_pos]).getData();
data(place).count += countBytesInFilter(flags);
data(place).count += countBytesInFilter(flags.data(), row_begin, row_end);
}
else
{
@@ -116,7 +116,7 @@ public:
/// Return normalized state type: count()
AggregateFunctionProperties properties;
return std::make_shared<DataTypeAggregateFunction>(
AggregateFunctionFactory::instance().get(getName(), {}, {}, properties), DataTypes{}, Array{});
AggregateFunctionFactory::instance().get(getName(), NullsAction::EMPTY, {}, {}, properties), DataTypes{}, Array{});
}

void merge(AggregateDataPtr __restrict place, ConstAggregateDataPtr rhs, Arena *) const override
@@ -267,7 +267,7 @@ public:
/// Return normalized state type: count()
AggregateFunctionProperties properties;
return std::make_shared<DataTypeAggregateFunction>(
AggregateFunctionFactory::instance().get(getName(), {}, {}, properties), DataTypes{}, Array{});
AggregateFunctionFactory::instance().get(getName(), NullsAction::EMPTY, {}, {}, properties), DataTypes{}, Array{});
}

void merge(AggregateDataPtr __restrict place, ConstAggregateDataPtr rhs, Arena *) const override
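
The countBytesInFilter change above switches from counting the whole filter column to counting only the [row_begin, row_end) slice. A self-contained illustration of what the ranged count computes (not ClickHouse code):

    #include <cstddef>
    #include <cstdint>

    /// Number of non-zero filter bytes inside [row_begin, row_end).
    static size_t countBytesInRange(const uint8_t * flags, size_t row_begin, size_t row_end)
    {
        size_t count = 0;
        for (size_t i = row_begin; i < row_end; ++i)
            count += flags[i] ? 1 : 0;
        return count;
    }
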
@ -1,23 +1,11 @@
|
||||
#include <AggregateFunctions/AggregateFunctionFactory.h>
|
||||
#include <AggregateFunctions/Combinators/AggregateFunctionCombinatorFactory.h>
|
||||
|
||||
#include <DataTypes/DataTypeAggregateFunction.h>
|
||||
#include <DataTypes/DataTypeNullable.h>
|
||||
#include <DataTypes/DataTypesNumber.h>
|
||||
#include <DataTypes/DataTypeLowCardinality.h>
|
||||
|
||||
#include <IO/WriteHelpers.h>
|
||||
|
||||
#include <Interpreters/Context.h>
|
||||
|
||||
#include <Common/StringUtils/StringUtils.h>
|
||||
#include <Common/typeid_cast.h>
|
||||
#include <Common/CurrentThread.h>
|
||||
|
||||
#include <Poco/String.h>
|
||||
|
||||
#include <DataTypes/DataTypesNumber.h>
|
||||
#include <Functions/FunctionFactory.h>
|
||||
|
||||
#include <IO/WriteHelpers.h>
|
||||
#include <Interpreters/Context.h>
|
||||
|
||||
static constexpr size_t MAX_AGGREGATE_FUNCTION_NAME_LENGTH = 1000;
|
||||
|
||||
@ -28,10 +16,11 @@ struct Settings;
|
||||
|
||||
namespace ErrorCodes
|
||||
{
|
||||
extern const int UNKNOWN_AGGREGATE_FUNCTION;
|
||||
extern const int LOGICAL_ERROR;
|
||||
extern const int ILLEGAL_AGGREGATION;
|
||||
extern const int LOGICAL_ERROR;
|
||||
extern const int NOT_IMPLEMENTED;
|
||||
extern const int TOO_LARGE_STRING_SIZE;
|
||||
extern const int UNKNOWN_AGGREGATE_FUNCTION;
|
||||
}
|
||||
|
||||
const String & getAggregateFunctionCanonicalNameIfAny(const String & name)
|
||||
@ -59,6 +48,23 @@ void AggregateFunctionFactory::registerFunction(const String & name, Value creat
|
||||
}
|
||||
}
|
||||
|
||||
void AggregateFunctionFactory::registerNullsActionTransformation(const String & source_ignores_nulls, const String & target_respect_nulls)
|
||||
{
|
||||
if (!aggregate_functions.contains(source_ignores_nulls))
|
||||
throw Exception(ErrorCodes::LOGICAL_ERROR, "registerNullsActionTransformation: Source aggregation '{}' not found");
|
||||
|
||||
if (!aggregate_functions.contains(target_respect_nulls))
|
||||
throw Exception(ErrorCodes::LOGICAL_ERROR, "registerNullsActionTransformation: Target aggregation '{}' not found");
|
||||
|
||||
if (!respect_nulls.emplace(source_ignores_nulls, target_respect_nulls).second)
|
||||
throw Exception(
|
||||
ErrorCodes::LOGICAL_ERROR, "registerNullsActionTransformation: Assignment from '{}' is not unique", source_ignores_nulls);
|
||||
|
||||
if (!ignore_nulls.emplace(target_respect_nulls, source_ignores_nulls).second)
|
||||
throw Exception(
|
||||
ErrorCodes::LOGICAL_ERROR, "registerNullsActionTransformation: Assignment from '{}' is not unique", target_respect_nulls);
|
||||
}
|
||||
|
||||
static DataTypes convertLowCardinalityTypesToNested(const DataTypes & types)
|
||||
{
|
||||
DataTypes res_types;
|
||||
@ -70,7 +76,11 @@ static DataTypes convertLowCardinalityTypesToNested(const DataTypes & types)
|
||||
}
|
||||
|
||||
AggregateFunctionPtr AggregateFunctionFactory::get(
|
||||
const String & name, const DataTypes & argument_types, const Array & parameters, AggregateFunctionProperties & out_properties) const
|
||||
const String & name,
|
||||
NullsAction action,
|
||||
const DataTypes & argument_types,
|
||||
const Array & parameters,
|
||||
AggregateFunctionProperties & out_properties) const
|
||||
{
|
||||
/// This to prevent costly string manipulation in parsing the aggregate function combinators.
|
||||
/// Example: avgArrayArrayArrayArray...(1000 times)...Array
|
||||
@ -81,8 +91,9 @@ AggregateFunctionPtr AggregateFunctionFactory::get(
|
||||
|
||||
/// If one of the types is Nullable, we apply aggregate function combinator "Null" if it's not window function.
|
||||
/// Window functions are not real aggregate functions. Applying combinators doesn't make sense for them,
|
||||
/// they must handle the nullability themselves
|
||||
auto properties = tryGetProperties(name);
|
||||
/// they must handle the nullability themselves.
|
||||
/// Aggregate functions such as any_value_respect_nulls are considered window functions in that sense
|
||||
auto properties = tryGetProperties(name, action);
|
||||
bool is_window_function = properties.has_value() && properties->is_window_function;
|
||||
if (!is_window_function && std::any_of(types_without_low_cardinality.begin(), types_without_low_cardinality.end(),
|
||||
[](const auto & type) { return type->isNullable(); }))
|
||||
@ -98,8 +109,7 @@ AggregateFunctionPtr AggregateFunctionFactory::get(
|
||||
bool has_null_arguments = std::any_of(types_without_low_cardinality.begin(), types_without_low_cardinality.end(),
|
||||
[](const auto & type) { return type->onlyNull(); });
|
||||
|
||||
AggregateFunctionPtr nested_function = getImpl(
|
||||
name, nested_types, nested_parameters, out_properties, has_null_arguments);
|
||||
AggregateFunctionPtr nested_function = getImpl(name, action, nested_types, nested_parameters, out_properties, has_null_arguments);
|
||||
|
||||
// Pure window functions are not real aggregate functions. Applying
|
||||
// combinators doesn't make sense for them, they must handle the
|
||||
@ -110,22 +120,54 @@ AggregateFunctionPtr AggregateFunctionFactory::get(
|
||||
return combinator->transformAggregateFunction(nested_function, out_properties, types_without_low_cardinality, parameters);
|
||||
}
|
||||
|
||||
auto with_original_arguments = getImpl(name, types_without_low_cardinality, parameters, out_properties, false);
|
||||
auto with_original_arguments = getImpl(name, action, types_without_low_cardinality, parameters, out_properties, false);
|
||||
|
||||
if (!with_original_arguments)
|
||||
throw Exception(ErrorCodes::LOGICAL_ERROR, "Logical error: AggregateFunctionFactory returned nullptr");
|
||||
return with_original_arguments;
|
||||
}
|
||||
|
||||
std::optional<AggregateFunctionWithProperties>
|
||||
AggregateFunctionFactory::getAssociatedFunctionByNullsAction(const String & name, NullsAction action) const
|
||||
{
|
||||
if (action == NullsAction::RESPECT_NULLS)
|
||||
{
|
||||
if (auto it = respect_nulls.find(name); it == respect_nulls.end())
|
||||
throw Exception(ErrorCodes::NOT_IMPLEMENTED, "Function {} does not support RESPECT NULLS", name);
|
||||
else if (auto associated_it = aggregate_functions.find(it->second); associated_it != aggregate_functions.end())
|
||||
return {associated_it->second};
|
||||
else
|
||||
throw Exception(
|
||||
ErrorCodes::LOGICAL_ERROR, "Unable to find the function {} (equivalent to '{} RESPECT NULLS')", it->second, name);
|
||||
}
|
||||
|
||||
if (action == NullsAction::IGNORE_NULLS)
|
||||
{
|
||||
if (auto it = ignore_nulls.find(name); it != ignore_nulls.end())
|
||||
{
|
||||
if (auto associated_it = aggregate_functions.find(it->second); associated_it != aggregate_functions.end())
|
||||
return {associated_it->second};
|
||||
else
|
||||
throw Exception(
|
||||
ErrorCodes::LOGICAL_ERROR, "Unable to find the function {} (equivalent to '{} IGNORE NULLS')", it->second, name);
|
||||
}
|
||||
/// We don't throw for IGNORE NULLS of other functions because that's the default in CH
|
||||
}
|
||||
|
||||
return {};
|
||||
}
|
||||
|
||||
|
||||
AggregateFunctionPtr AggregateFunctionFactory::getImpl(
|
||||
const String & name_param,
|
||||
NullsAction action,
|
||||
const DataTypes & argument_types,
|
||||
const Array & parameters,
|
||||
AggregateFunctionProperties & out_properties,
|
||||
bool has_null_arguments) const
|
||||
{
|
||||
String name = getAliasToOrName(name_param);
|
||||
String case_insensitive_name;
|
||||
bool is_case_insensitive = false;
|
||||
Value found;
|
||||
|
||||
@ -135,10 +177,14 @@ AggregateFunctionPtr AggregateFunctionFactory::getImpl(
|
||||
found = it->second;
|
||||
}
|
||||
|
||||
if (auto jt = case_insensitive_aggregate_functions.find(Poco::toLower(name)); jt != case_insensitive_aggregate_functions.end())
|
||||
if (!found.creator)
|
||||
{
|
||||
found = jt->second;
|
||||
is_case_insensitive = true;
|
||||
case_insensitive_name = Poco::toLower(name);
|
||||
if (auto jt = case_insensitive_aggregate_functions.find(case_insensitive_name); jt != case_insensitive_aggregate_functions.end())
|
||||
{
|
||||
found = jt->second;
|
||||
is_case_insensitive = true;
|
||||
}
|
||||
}
|
||||
|
||||
ContextPtr query_context;
|
||||
@ -147,11 +193,14 @@ AggregateFunctionPtr AggregateFunctionFactory::getImpl(
|
||||
|
||||
if (found.creator)
|
||||
{
|
||||
out_properties = found.properties;
|
||||
auto opt = getAssociatedFunctionByNullsAction(is_case_insensitive ? case_insensitive_name : name, action);
|
||||
if (opt)
|
||||
found = *opt;
|
||||
|
||||
out_properties = found.properties;
|
||||
if (query_context && query_context->getSettingsRef().log_queries)
|
||||
query_context->addQueryFactoriesInfo(
|
||||
Context::QueryLogFactories::AggregateFunction, is_case_insensitive ? Poco::toLower(name) : name);
|
||||
Context::QueryLogFactories::AggregateFunction, is_case_insensitive ? case_insensitive_name : name);
|
||||
|
||||
/// The case when aggregate function should return NULL on NULL arguments. This case is handled in "get" method.
|
||||
if (!out_properties.returns_default_when_only_null && has_null_arguments)
|
||||
@ -196,7 +245,7 @@ AggregateFunctionPtr AggregateFunctionFactory::getImpl(
|
||||
DataTypes nested_types = combinator->transformArguments(argument_types);
|
||||
Array nested_parameters = combinator->transformParameters(parameters);
|
||||
|
||||
AggregateFunctionPtr nested_function = get(nested_name, nested_types, nested_parameters, out_properties);
|
||||
AggregateFunctionPtr nested_function = get(nested_name, action, nested_types, nested_parameters, out_properties);
|
||||
return combinator->transformAggregateFunction(nested_function, out_properties, argument_types, parameters);
|
||||
}
|
||||
|
||||
@ -213,16 +262,7 @@ AggregateFunctionPtr AggregateFunctionFactory::getImpl(
|
||||
throw Exception(ErrorCodes::UNKNOWN_AGGREGATE_FUNCTION, "Unknown aggregate function {}{}", name, extra_info);
|
||||
}
|
||||
|
||||
|
||||
AggregateFunctionPtr AggregateFunctionFactory::tryGet(
|
||||
const String & name, const DataTypes & argument_types, const Array & parameters, AggregateFunctionProperties & out_properties) const
|
||||
{
|
||||
return isAggregateFunctionName(name)
|
||||
? get(name, argument_types, parameters, out_properties)
|
||||
: nullptr;
|
||||
}
|
||||
|
||||
std::optional<AggregateFunctionProperties> AggregateFunctionFactory::tryGetProperties(String name) const
|
||||
std::optional<AggregateFunctionProperties> AggregateFunctionFactory::tryGetProperties(String name, NullsAction action) const
|
||||
{
|
||||
if (name.size() > MAX_AGGREGATE_FUNCTION_NAME_LENGTH)
|
||||
throw Exception(ErrorCodes::TOO_LARGE_STRING_SIZE, "Too long name of aggregate function, maximum: {}", MAX_AGGREGATE_FUNCTION_NAME_LENGTH);
|
||||
@ -231,6 +271,8 @@ std::optional<AggregateFunctionProperties> AggregateFunctionFactory::tryGetPrope
|
||||
{
|
||||
name = getAliasToOrName(name);
|
||||
Value found;
|
||||
String lower_case_name;
|
||||
bool is_case_insensitive = false;
|
||||
|
||||
/// Find by exact match.
|
||||
if (auto it = aggregate_functions.find(name); it != aggregate_functions.end())
|
||||
@ -238,11 +280,23 @@ std::optional<AggregateFunctionProperties> AggregateFunctionFactory::tryGetPrope
|
||||
found = it->second;
|
||||
}
|
||||
|
||||
if (auto jt = case_insensitive_aggregate_functions.find(Poco::toLower(name)); jt != case_insensitive_aggregate_functions.end())
|
||||
found = jt->second;
|
||||
if (!found.creator)
|
||||
{
|
||||
lower_case_name = Poco::toLower(name);
|
||||
if (auto jt = case_insensitive_aggregate_functions.find(lower_case_name); jt != case_insensitive_aggregate_functions.end())
|
||||
{
|
||||
is_case_insensitive = true;
|
||||
found = jt->second;
|
||||
}
|
||||
}
|
||||
|
||||
if (found.creator)
|
||||
{
|
||||
auto opt = getAssociatedFunctionByNullsAction(is_case_insensitive ? lower_case_name : name, action);
|
||||
if (opt)
|
||||
return opt->properties;
|
||||
return found.properties;
|
||||
}
|
||||
|
||||
/// Combinators of aggregate functions.
|
||||
/// For every aggregate function 'agg' and combiner '-Comb' there is a combined aggregate function with the name 'aggComb',
|
||||
@ -262,27 +316,29 @@ std::optional<AggregateFunctionProperties> AggregateFunctionFactory::tryGetPrope
|
||||
}
|
||||
|
||||
|
||||
bool AggregateFunctionFactory::isAggregateFunctionName(String name) const
|
||||
bool AggregateFunctionFactory::isAggregateFunctionName(const String & name_) const
|
||||
{
|
||||
if (name.size() > MAX_AGGREGATE_FUNCTION_NAME_LENGTH)
|
||||
if (name_.size() > MAX_AGGREGATE_FUNCTION_NAME_LENGTH)
|
||||
throw Exception(ErrorCodes::TOO_LARGE_STRING_SIZE, "Too long name of aggregate function, maximum: {}", MAX_AGGREGATE_FUNCTION_NAME_LENGTH);
|
||||
|
||||
while (true)
|
||||
if (aggregate_functions.contains(name_) || isAlias(name_))
|
||||
return true;
|
||||
|
||||
String name_lowercase = Poco::toLower(name_);
|
||||
if (case_insensitive_aggregate_functions.contains(name_lowercase) || isAlias(name_lowercase))
|
||||
return true;
|
||||
|
||||
String name = name_;
|
||||
while (AggregateFunctionCombinatorPtr combinator = AggregateFunctionCombinatorFactory::instance().tryFindSuffix(name))
|
||||
{
|
||||
if (aggregate_functions.contains(name) || isAlias(name))
|
||||
return true;
|
||||
name = name.substr(0, name.size() - combinator->getName().size());
|
||||
name_lowercase = name_lowercase.substr(0, name_lowercase.size() - combinator->getName().size());
|
||||
|
||||
String name_lowercase = Poco::toLower(name);
|
||||
if (case_insensitive_aggregate_functions.contains(name_lowercase) || isAlias(name_lowercase))
|
||||
if (aggregate_functions.contains(name) || isAlias(name) || case_insensitive_aggregate_functions.contains(name_lowercase)
|
||||
|| isAlias(name_lowercase))
|
||||
return true;
|
||||
|
||||
if (AggregateFunctionCombinatorPtr combinator = AggregateFunctionCombinatorFactory::instance().tryFindSuffix(name))
|
||||
{
|
||||
name = name.substr(0, name.size() - combinator->getName().size());
|
||||
}
|
||||
else
|
||||
return false;
|
||||
}
|
||||
return false;
|
||||
}
|
||||
|
||||
AggregateFunctionFactory & AggregateFunctionFactory::instance()
|
||||
|
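
The rewritten isAggregateFunctionName now checks the registered names (and their lowercase variants) before each combinator suffix is stripped, instead of lowercasing inside the loop on every iteration. An illustrative trace, assuming "If" and "Array" are registered combinator suffixes:

    /// isAggregateFunctionName("sumArrayIf")
    ///   "sumArrayIf" -> not registered, suffix "If" found    -> strip -> "sumArray"
    ///   "sumArray"   -> not registered, suffix "Array" found -> strip -> "sum"
    ///   "sum"        -> registered                           -> returns true
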
@ -1,9 +1,9 @@
|
||||
#pragma once
|
||||
|
||||
#include <AggregateFunctions/IAggregateFunction.h>
|
||||
#include <Common/IFactoryWithAliases.h>
|
||||
#include <Parsers/ASTFunction.h>
|
||||
|
||||
#include <Parsers/NullsAction.h>
|
||||
#include <Common/IFactoryWithAliases.h>
|
||||
|
||||
#include <functional>
|
||||
#include <memory>
|
||||
@ -62,36 +62,44 @@ public:
|
||||
Value creator,
|
||||
CaseSensitiveness case_sensitiveness = CaseSensitive);
|
||||
|
||||
/// Register how to transform from one aggregate function to other based on NullsAction
|
||||
/// Registers them both ways:
|
||||
/// SOURCE + RESPECT NULLS will be transformed to TARGET
|
||||
/// TARGET + IGNORE NULLS will be transformed to SOURCE
|
||||
void registerNullsActionTransformation(const String & source_ignores_nulls, const String & target_respect_nulls);
|
||||
|
||||
/// Throws an exception if not found.
|
||||
AggregateFunctionPtr
|
||||
get(const String & name,
|
||||
const DataTypes & argument_types,
|
||||
const Array & parameters,
|
||||
AggregateFunctionProperties & out_properties) const;
|
||||
|
||||
/// Returns nullptr if not found.
|
||||
AggregateFunctionPtr tryGet(
|
||||
const String & name,
|
||||
NullsAction action,
|
||||
const DataTypes & argument_types,
|
||||
const Array & parameters,
|
||||
AggregateFunctionProperties & out_properties) const;
|
||||
|
||||
/// Get properties if the aggregate function exists.
|
||||
std::optional<AggregateFunctionProperties> tryGetProperties(String name) const;
|
||||
std::optional<AggregateFunctionProperties> tryGetProperties(String name, NullsAction action) const;
|
||||
|
||||
bool isAggregateFunctionName(String name) const;
|
||||
bool isAggregateFunctionName(const String & name) const;
|
||||
|
||||
private:
|
||||
AggregateFunctionPtr getImpl(
|
||||
const String & name,
|
||||
NullsAction action,
|
||||
const DataTypes & argument_types,
|
||||
const Array & parameters,
|
||||
AggregateFunctionProperties & out_properties,
|
||||
bool has_null_arguments) const;
|
||||
|
||||
using AggregateFunctions = std::unordered_map<String, Value>;
|
||||
using ActionMap = std::unordered_map<String, String>;
|
||||
|
||||
AggregateFunctions aggregate_functions;
|
||||
/// Mapping from functions with `RESPECT NULLS` modifier to actual aggregate function names
|
||||
/// Example: `any(x) RESPECT NULLS` should be executed as function `any_respect_nulls`
|
||||
ActionMap respect_nulls;
|
||||
/// Same as above for `IGNORE NULLS` modifier
|
||||
ActionMap ignore_nulls;
|
||||
std::optional<AggregateFunctionWithProperties> getAssociatedFunctionByNullsAction(const String & name, NullsAction action) const;
|
||||
|
||||
/// Case insensitive aggregate functions will be additionally added here with lowercased name.
|
||||
AggregateFunctions case_insensitive_aggregate_functions;
|
||||
|
@ -771,26 +771,18 @@ static_assert(
|
||||
|
||||
|
||||
/// For any other value types.
|
||||
template <bool RESULT_IS_NULLABLE = false>
|
||||
struct SingleValueDataGeneric
|
||||
{
|
||||
private:
|
||||
using Self = SingleValueDataGeneric;
|
||||
|
||||
Field value;
|
||||
bool has_value = false;
|
||||
|
||||
public:
|
||||
static constexpr bool result_is_nullable = RESULT_IS_NULLABLE;
|
||||
static constexpr bool should_skip_null_arguments = !RESULT_IS_NULLABLE;
|
||||
static constexpr bool result_is_nullable = false;
|
||||
static constexpr bool should_skip_null_arguments = true;
|
||||
static constexpr bool is_any = false;
|
||||
|
||||
bool has() const
|
||||
{
|
||||
if constexpr (result_is_nullable)
|
||||
return has_value;
|
||||
return !value.isNull();
|
||||
}
|
||||
bool has() const { return !value.isNull(); }
|
||||
|
||||
void insertResultInto(IColumn & to) const
|
||||
{
|
||||
@ -820,19 +812,9 @@ public:
|
||||
serialization.deserializeBinary(value, buf, {});
|
||||
}
|
||||
|
||||
void change(const IColumn & column, size_t row_num, Arena *)
|
||||
{
|
||||
column.get(row_num, value);
|
||||
if constexpr (result_is_nullable)
|
||||
has_value = true;
|
||||
}
|
||||
void change(const IColumn & column, size_t row_num, Arena *) { column.get(row_num, value); }
|
||||
|
||||
void change(const Self & to, Arena *)
|
||||
{
|
||||
value = to.value;
|
||||
if constexpr (result_is_nullable)
|
||||
has_value = true;
|
||||
}
|
||||
void change(const Self & to, Arena *) { value = to.value; }
|
||||
|
||||
bool changeFirstTime(const IColumn & column, size_t row_num, Arena * arena)
|
||||
{
|
||||
@ -847,7 +829,7 @@ public:
|
||||
|
||||
bool changeFirstTime(const Self & to, Arena * arena)
|
||||
{
|
||||
if (!has() && (result_is_nullable || to.has()))
|
||||
if (!has() && to.has())
|
||||
{
|
||||
change(to, arena);
|
||||
return true;
|
||||
@ -882,30 +864,15 @@ public:
}
else
{
if constexpr (result_is_nullable)
Field new_value;
column.get(row_num, new_value);
if (new_value < value)
{
Field new_value;
column.get(row_num, new_value);
if (!value.isNull() && (new_value.isNull() || new_value < value))
{
value = new_value;
return true;
}
else
return false;
value = new_value;
return true;
}
else
{
Field new_value;
column.get(row_num, new_value);
if (new_value < value)
{
value = new_value;
return true;
}
else
return false;
}
return false;
}
}

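The null-aware comparison in the hunk above, `if (!value.isNull() && (new_value.isNull() || new_value < value))`, can be read as: a stored NULL is never displaced, and a NULL candidate displaces any stored value. Below is a self-contained toy that only illustrates that reading, with `std::optional<int>` standing in for `Field`; it is an assumption for illustration, not ClickHouse code.

#include <cassert>
#include <optional>

/// Toy model of the predicate: a stored null sticks, a null candidate displaces
/// any stored value, and otherwise the smaller value wins.
static bool changeIfLessToy(std::optional<int> & value, std::optional<int> new_value)
{
    if (value.has_value() && (!new_value.has_value() || *new_value < *value))
    {
        value = new_value;
        return true;
    }
    return false;
}

int main()
{
    std::optional<int> v = 5;
    assert(changeIfLessToy(v, 3) && v == 3);         /// smaller value replaces the stored one
    assert(changeIfLessToy(v, std::nullopt) && !v);  /// a null candidate replaces any value
    assert(!changeIfLessToy(v, 1));                  /// once null is stored, nothing changes it
    return 0;
}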
@ -913,30 +880,13 @@ public:
{
if (!to.has())
return false;
if constexpr (result_is_nullable)
if (!has() || to.value < value)
{
if (!has())
{
change(to, arena);
return true;
}
if (to.value.isNull() || (!value.isNull() && to.value < value))
{
value = to.value;
return true;
}
return false;
change(to, arena);
return true;
}
else
{
if (!has() || to.value < value)
{
change(to, arena);
return true;
}
else
return false;
}
return false;
}

bool changeIfGreater(const IColumn & column, size_t row_num, Arena * arena)
@ -948,29 +898,15 @@ public:
}
else
{
if constexpr (result_is_nullable)
Field new_value;
column.get(row_num, new_value);
if (new_value > value)
{
Field new_value;
column.get(row_num, new_value);
if (!value.isNull() && (new_value.isNull() || value < new_value))
{
value = new_value;
return true;
}
return false;
value = new_value;
return true;
}
else
{
Field new_value;
column.get(row_num, new_value);
if (new_value > value)
{
value = new_value;
return true;
}
else
return false;
}
return false;
}
}

@ -978,36 +914,18 @@ public:
{
if (!to.has())
return false;
if constexpr (result_is_nullable)
if (!has() || to.value > value)
{
if (!value.isNull() && (to.value.isNull() || value < to.value))
{
value = to.value;
return true;
}
return false;
change(to, arena);
return true;
}
else
{
if (!has() || to.value > value)
{
change(to, arena);
return true;
}
else
return false;
}
return false;
}

bool isEqualTo(const IColumn & column, size_t row_num) const
{
return has() && value == column[row_num];
}
bool isEqualTo(const IColumn & column, size_t row_num) const { return has() && value == column[row_num]; }

bool isEqualTo(const Self & to) const
{
return has() && to.value == value;
}
bool isEqualTo(const Self & to) const { return has() && to.value == value; }

static bool allocatesMemoryInArena()
{
@ -150,7 +150,7 @@ public:
AggregateFunctionProperties properties;
return std::make_shared<DataTypeAggregateFunction>(
AggregateFunctionFactory::instance().get(
GatherFunctionQuantileData::toFusedNameOrSelf(getName()), this->argument_types, params, properties),
GatherFunctionQuantileData::toFusedNameOrSelf(getName()), NullsAction::EMPTY, this->argument_types, params, properties),
this->argument_types,
params);
}
@ -142,6 +142,7 @@ struct AggregateFunctionSumData
), addManyConditionalInternalImpl, MULTITARGET_FUNCTION_BODY((const Value * __restrict ptr, const UInt8 * __restrict condition_map, size_t start, size_t end) /// NOLINT
{
ptr += start;
condition_map += start;
size_t count = end - start;
const auto * end_ptr = ptr + count;
Some files were not shown because too many files have changed in this diff.