Merge remote-tracking branch 'blessed/master' into speedup_numbers

This commit is contained in:
Raúl Marín 2023-12-28 11:24:24 +01:00
commit 2fdb56bb4b
219 changed files with 5577 additions and 1546 deletions

.gitmodules vendored
View File

@ -360,6 +360,3 @@
[submodule "contrib/sqids-cpp"]
path = contrib/sqids-cpp
url = https://github.com/sqids/sqids-cpp.git
[submodule "contrib/idna"]
path = contrib/idna
url = https://github.com/ada-url/idna.git

View File

@ -1,4 +1,5 @@
### Table of Contents
**[ClickHouse release v23.12, 2023-12-28](#2312)**<br/>
**[ClickHouse release v23.11, 2023-12-06](#2311)**<br/>
**[ClickHouse release v23.10, 2023-11-02](#2310)**<br/>
**[ClickHouse release v23.9, 2023-09-28](#239)**<br/>
@ -14,6 +15,146 @@
# 2023 Changelog
### <a id="2312"></a> ClickHouse release 23.12, 2023-12-28
#### Backward Incompatible Change
* Fix check for non-deterministic functions in TTL expressions. Previously, you could create a TTL expression with non-deterministic functions in some cases, which could lead to undefined behavior later. This fixes [#37250](https://github.com/ClickHouse/ClickHouse/issues/37250). Disallow TTL expressions that don't depend on any columns of a table by default. It can be allowed back by `SET allow_suspicious_ttl_expressions = 1` or `SET compatibility = '23.11'`. Closes [#37286](https://github.com/ClickHouse/ClickHouse/issues/37286). [#51858](https://github.com/ClickHouse/ClickHouse/pull/51858) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* The MergeTree setting `clean_deleted_rows` is deprecated and has no effect anymore. The `CLEANUP` keyword for `OPTIMIZE` is not allowed by default (it can be unlocked with the `allow_experimental_replacing_merge_with_cleanup` setting). [#58267](https://github.com/ClickHouse/ClickHouse/pull/58267) ([Alexander Tokmakov](https://github.com/tavplubix)). This fixes [#57930](https://github.com/ClickHouse/ClickHouse/issues/57930). This closes [#54988](https://github.com/ClickHouse/ClickHouse/issues/54988). This closes [#54570](https://github.com/ClickHouse/ClickHouse/issues/54570). This closes [#50346](https://github.com/ClickHouse/ClickHouse/issues/50346). This closes [#47579](https://github.com/ClickHouse/ClickHouse/issues/47579). The feature has to be removed because it is not good. We have to remove it as quickly as possible, because there is no other option. [#57932](https://github.com/ClickHouse/ClickHouse/pull/57932) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
#### New Feature
* Implement Refreshable Materialized Views, requested in [#33919](https://github.com/ClickHouse/ClickHouse/issues/33919). [#56946](https://github.com/ClickHouse/ClickHouse/pull/56946) ([Michael Kolupaev](https://github.com/al13n321), [Michael Guzov](https://github.com/koloshmet)).
* Introduce `PASTE JOIN`, which allows users to join tables without an `ON` clause, simply by row numbers. Example: `SELECT * FROM (SELECT number AS a FROM numbers(2)) AS t1 PASTE JOIN (SELECT number AS a FROM numbers(2) ORDER BY a DESC) AS t2`. [#57995](https://github.com/ClickHouse/ClickHouse/pull/57995) ([Yarik Briukhovetskyi](https://github.com/yariks5s)).
* The `ORDER BY` clause now supports specifying `ALL`, meaning that ClickHouse sorts by all columns in the `SELECT` clause. Example: `SELECT col1, col2 FROM tab WHERE [...] ORDER BY ALL`. [#57875](https://github.com/ClickHouse/ClickHouse/pull/57875) ([zhongyuankai](https://github.com/zhongyuankai)).
* Added a new mutation command `ALTER TABLE <table> APPLY DELETED MASK`, which forces applying the mask written by lightweight delete and removes the rows marked as deleted from disk (see the example sketches after this list). [#57433](https://github.com/ClickHouse/ClickHouse/pull/57433) ([Anton Popov](https://github.com/CurtizJ)).
* A handler `/binary` opens a visual viewer of symbols inside the ClickHouse binary. [#58211](https://github.com/ClickHouse/ClickHouse/pull/58211) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Added a new SQL function `sqid` to generate Sqids (https://sqids.org/), example: `SELECT sqid(125, 126)`. [#57512](https://github.com/ClickHouse/ClickHouse/pull/57512) ([Robert Schulze](https://github.com/rschu1ze)).
* Add a new function `seriesPeriodDetectFFT` to detect series period using FFT. [#57574](https://github.com/ClickHouse/ClickHouse/pull/57574) ([Bhavna Jindal](https://github.com/bhavnajindal)).
* Add an HTTP endpoint for checking if Keeper is ready to accept traffic. [#55876](https://github.com/ClickHouse/ClickHouse/pull/55876) ([Konstantin Bogdanov](https://github.com/thevar1able)).
* Add 'union' mode for schema inference. In this mode, the resulting table schema is the union of the schemas of all files (the schema is inferred from each file separately and the results are merged). The mode of schema inference is controlled by the setting `schema_inference_mode` with two possible values, `default` and `union`; see the example sketches after this list. Closes [#55428](https://github.com/ClickHouse/ClickHouse/issues/55428). [#55892](https://github.com/ClickHouse/ClickHouse/pull/55892) ([Kruglov Pavel](https://github.com/Avogar)).
* Add a new setting `input_format_csv_try_infer_numbers_from_strings` that allows inferring numbers from strings in CSV format; see the example sketches after this list. Closes [#56455](https://github.com/ClickHouse/ClickHouse/issues/56455). [#56859](https://github.com/ClickHouse/ClickHouse/pull/56859) ([Kruglov Pavel](https://github.com/Avogar)).
* When the number of databases or tables exceeds a configurable threshold, show a warning to the user. [#57375](https://github.com/ClickHouse/ClickHouse/pull/57375) ([凌涛](https://github.com/lingtaolf)).
* Dictionary with `HASHED_ARRAY` (and `COMPLEX_KEY_HASHED_ARRAY`) layout supports `SHARDS` similarly to `HASHED`. [#57544](https://github.com/ClickHouse/ClickHouse/pull/57544) ([vdimir](https://github.com/vdimir)).
* Add asynchronous metrics for total primary key bytes and total allocated primary key bytes in memory. [#57551](https://github.com/ClickHouse/ClickHouse/pull/57551) ([Bharat Nallan](https://github.com/bharatnc)).
* Add `SHA512_256` function. [#57645](https://github.com/ClickHouse/ClickHouse/pull/57645) ([Bharat Nallan](https://github.com/bharatnc)).
* Add `FORMAT_BYTES` as an alias for `formatReadableSize`. [#57592](https://github.com/ClickHouse/ClickHouse/pull/57592) ([Bharat Nallan](https://github.com/bharatnc)).
* Allow passing optional session token to the `s3` table function. [#57850](https://github.com/ClickHouse/ClickHouse/pull/57850) ([Shani Elharrar](https://github.com/shanielh)).
* Introduce a new setting `http_make_head_request`. If it is turned off, the URL table engine will not do a HEAD request to determine the file size. This is needed to support HTTP servers that are inefficient, misconfigured, or do not support HEAD requests. [#54602](https://github.com/ClickHouse/ClickHouse/pull/54602) ([Fionera](https://github.com/fionera)).
* It is now possible to refer to an ALIAS column in index (non-primary-key) definitions (issue [#55650](https://github.com/ClickHouse/ClickHouse/issues/55650)). Example: `CREATE TABLE tab(col UInt32, col_alias ALIAS col + 1, INDEX idx (col_alias) TYPE minmax) ENGINE = MergeTree ORDER BY col;`. [#57546](https://github.com/ClickHouse/ClickHouse/pull/57546) ([Robert Schulze](https://github.com/rschu1ze)).
* Added a new setting `readonly` which can be used to specify that an S3 disk is read-only. It can be useful to create a table on a disk of `s3_plain` type while having read-only access to the underlying S3 bucket. [#57977](https://github.com/ClickHouse/ClickHouse/pull/57977) ([Pengyuan Bian](https://github.com/bianpengyuan)).
* The primary key analysis in MergeTree tables will now be applied to predicates that include the virtual column `_part_offset` (optionally with `_part`). This feature can serve as a special kind of secondary index; see the example sketches after this list. [#58224](https://github.com/ClickHouse/ClickHouse/pull/58224) ([Amos Bird](https://github.com/amosbird)).
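A few sketches of the new features above. These are illustrative only: the table, column, file, and part names (`tab`, `id`, `data_*.jsonl`, `all_1_1_0`) are hypothetical and not taken from the release notes.

`ALTER TABLE <table> APPLY DELETED MASK`:

```sql
-- A lightweight delete only marks rows as deleted;
-- APPLY DELETED MASK physically removes the marked rows from disk.
DELETE FROM tab WHERE id % 2 = 1;
ALTER TABLE tab APPLY DELETED MASK;
```

`union` schema inference mode:

```sql
-- Each matched file may contain a different subset of columns;
-- the resulting schema is the union of the per-file schemas.
SET schema_inference_mode = 'union';
DESCRIBE file('data_*.jsonl');
```

`input_format_csv_try_infer_numbers_from_strings`:

```sql
-- The first column is inferred as a number rather than a string.
DESC format(CSV, '"42","hello"')
SETTINGS input_format_csv_try_infer_numbers_from_strings = 1;
```

Primary key analysis on `_part_offset`:

```sql
-- Reads only the first 10 rows of one part, using the primary index.
SELECT * FROM tab WHERE _part = 'all_1_1_0' AND _part_offset < 10;
```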
#### Performance Improvement
* Extract non-intersecting part ranges from a MergeTree table during FINAL processing. That way we can avoid additional FINAL logic for these non-intersecting part ranges. When the number of duplicate values with the same primary key is low, performance will be almost the same as without FINAL. Improve reading performance for MergeTree FINAL when the `do_not_merge_across_partitions_select_final` setting is set. [#58120](https://github.com/ClickHouse/ClickHouse/pull/58120) ([Maksim Kita](https://github.com/kitaisreal)).
* Copying between S3 disks now uses server-side copy instead of copying through a buffer. This improves `BACKUP`/`RESTORE` operations and the `clickhouse-disks copy` command. [#56744](https://github.com/ClickHouse/ClickHouse/pull/56744) ([MikhailBurdukov](https://github.com/MikhailBurdukov)).
* Hash JOIN respects the setting `max_joined_block_size_rows` and does not produce large blocks for `ALL JOIN`. [#56996](https://github.com/ClickHouse/ClickHouse/pull/56996) ([vdimir](https://github.com/vdimir)).
* Release memory for aggregation earlier. This may avoid unnecessary external aggregation. [#57691](https://github.com/ClickHouse/ClickHouse/pull/57691) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Improve performance of string serialization. [#57717](https://github.com/ClickHouse/ClickHouse/pull/57717) ([Maksim Kita](https://github.com/kitaisreal)).
* Support trivial count optimization for `Merge`-engine tables. [#57867](https://github.com/ClickHouse/ClickHouse/pull/57867) ([skyoct](https://github.com/skyoct)).
* Optimized aggregation in some cases. [#57872](https://github.com/ClickHouse/ClickHouse/pull/57872) ([Anton Popov](https://github.com/CurtizJ)).
* The `hasAny` function can now take advantage of the full-text skipping indices. [#57878](https://github.com/ClickHouse/ClickHouse/pull/57878) ([Jpnock](https://github.com/Jpnock)).
* Function `if(cond, then, else)` (and its alias `cond ? then : else`) was optimized to use branch-free evaluation. [#57885](https://github.com/ClickHouse/ClickHouse/pull/57885) ([zhanglistar](https://github.com/zhanglistar)).
* MergeTree now automatically derives the `do_not_merge_across_partitions_select_final` setting if the partition key expression contains only columns from the primary key expression. [#58218](https://github.com/ClickHouse/ClickHouse/pull/58218) ([Maksim Kita](https://github.com/kitaisreal)).
* Speedup `MIN` and `MAX` for native types. [#58231](https://github.com/ClickHouse/ClickHouse/pull/58231) ([Raúl Marín](https://github.com/Algunenano)).
* Implement `SLRU` cache policy for filesystem cache. [#57076](https://github.com/ClickHouse/ClickHouse/pull/57076) ([Kseniia Sumarokova](https://github.com/kssenii)).
* The limit on the number of connections per endpoint for background fetches was raised from `15` to the value of the `background_fetches_pool_size` setting. The MergeTree-level setting `replicated_max_parallel_fetches_for_host` became obsolete. The MergeTree-level settings `replicated_fetches_http_connection_timeout`, `replicated_fetches_http_send_timeout` and `replicated_fetches_http_receive_timeout` were moved to the server level. The setting `keep_alive_timeout` was added to the list of server-level settings. [#57523](https://github.com/ClickHouse/ClickHouse/pull/57523) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* Make querying `system.filesystem_cache` not memory intensive. [#57687](https://github.com/ClickHouse/ClickHouse/pull/57687) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Reduce memory usage during string deserialization. [#57787](https://github.com/ClickHouse/ClickHouse/pull/57787) ([Maksim Kita](https://github.com/kitaisreal)).
* More efficient constructor for Enum - it makes sense when Enum has a boatload of values. [#57887](https://github.com/ClickHouse/ClickHouse/pull/57887) ([Duc Canh Le](https://github.com/canhld94)).
* An improvement for reading from the filesystem cache: always use `pread` method. [#57970](https://github.com/ClickHouse/ClickHouse/pull/57970) ([Nikita Taranov](https://github.com/nickitat)).
* Add optimization for AND notEquals chain in logical expression optimizer [#58214](https://github.com/ClickHouse/ClickHouse/pull/58214) ([Kevin Mingtarja](https://github.com/kevinmingtarja)).
#### Improvement
* Support for soft memory limit in Keeper. It will refuse requests if the memory usage is close to the maximum. [#57271](https://github.com/ClickHouse/ClickHouse/pull/57271) ([Han Fei](https://github.com/hanfei1991)). [#57699](https://github.com/ClickHouse/ClickHouse/pull/57699) ([Han Fei](https://github.com/hanfei1991)).
* Make inserts into distributed tables handle updated cluster configuration properly. When the list of cluster nodes is dynamically updated, the Directory Monitor of the Distributed table will pick up the change. [#42826](https://github.com/ClickHouse/ClickHouse/pull/42826) ([zhongyuankai](https://github.com/zhongyuankai)).
* Do not allow creating a replicated table with inconsistent merge parameters. [#56833](https://github.com/ClickHouse/ClickHouse/pull/56833) ([Duc Canh Le](https://github.com/canhld94)).
* Show uncompressed size in `system.tables`. [#56618](https://github.com/ClickHouse/ClickHouse/issues/56618). [#57186](https://github.com/ClickHouse/ClickHouse/pull/57186) ([Chen Lixiang](https://github.com/chenlx0)).
* Add `skip_unavailable_shards` as a setting for `Distributed` tables that is similar to the corresponding query-level setting. Closes [#43666](https://github.com/ClickHouse/ClickHouse/issues/43666). [#57218](https://github.com/ClickHouse/ClickHouse/pull/57218) ([Gagan Goel](https://github.com/tntnatbry)).
* The function `substring` (aliases: `substr`, `mid`) can now be used with `Enum` types. Previously, the first function argument had to be a value of type `String` or `FixedString`. This improves compatibility with 3rd party tools such as Tableau via MySQL interface. [#57277](https://github.com/ClickHouse/ClickHouse/pull/57277) ([Serge Klochkov](https://github.com/slvrtrn)).
* Function `format` now supports arbitrary argument types (instead of only `String` and `FixedString` arguments). This makes it possible to calculate, e.g., `SELECT format('The {0} to all questions is {1}', 'answer', 42)`. [#57549](https://github.com/ClickHouse/ClickHouse/pull/57549) ([Robert Schulze](https://github.com/rschu1ze)).
* Allow using the `date_trunc` function with a case-insensitive first argument. Both cases are now supported: `SELECT date_trunc('day', now())` and `SELECT date_trunc('DAY', now())`. [#57624](https://github.com/ClickHouse/ClickHouse/pull/57624) ([Yarik Briukhovetskyi](https://github.com/yariks5s)).
* Better hints when a table doesn't exist. [#57342](https://github.com/ClickHouse/ClickHouse/pull/57342) ([Bharat Nallan](https://github.com/bharatnc)).
* Allow overwriting the `max_partition_size_to_drop` and `max_table_size_to_drop` server settings at query time; see the sketch after this list. [#57452](https://github.com/ClickHouse/ClickHouse/pull/57452) ([Jordi Villar](https://github.com/jrdi)).
* Slightly better inference of unnamed tuples in JSON formats. [#57751](https://github.com/ClickHouse/ClickHouse/pull/57751) ([Kruglov Pavel](https://github.com/Avogar)).
* Add support for read-only flag when connecting to Keeper (fixes [#53749](https://github.com/ClickHouse/ClickHouse/issues/53749)). [#57479](https://github.com/ClickHouse/ClickHouse/pull/57479) ([Mikhail Koviazin](https://github.com/mkmkme)).
* Fix distributed sends possibly getting stuck due to "No such file or directory" (when recovering a batch from disk). Fix possible issues with `error_count` in `system.distribution_queue` (when `distributed_directory_monitor_max_sleep_time_ms` is greater than 5 minutes). Introduce a profile event to track async INSERT failures: `DistributedAsyncInsertionFailures`. [#57480](https://github.com/ClickHouse/ClickHouse/pull/57480) ([Azat Khuzhin](https://github.com/azat)).
* Support PostgreSQL generated columns and default column values in `MaterializedPostgreSQL` (experimental feature). Closes [#40449](https://github.com/ClickHouse/ClickHouse/issues/40449). [#57568](https://github.com/ClickHouse/ClickHouse/pull/57568) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Allow applying some filesystem cache config settings changes without a server restart. [#57578](https://github.com/ClickHouse/ClickHouse/pull/57578) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Properly handle PostgreSQL table structures with empty arrays. [#57618](https://github.com/ClickHouse/ClickHouse/pull/57618) ([Mike Kot](https://github.com/myrrc)).
* Expose the total number of errors that have occurred since the last server restart as a `ClickHouseErrorMetric_ALL` metric. [#57627](https://github.com/ClickHouse/ClickHouse/pull/57627) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* Allow nodes in the configuration file with `from_env`/`from_zk` references and a non-empty element with `replace=1`. [#57628](https://github.com/ClickHouse/ClickHouse/pull/57628) ([Azat Khuzhin](https://github.com/azat)).
* Add a table function `fuzzJSON`, which allows generating a lot of malformed JSON for fuzzing. [#57646](https://github.com/ClickHouse/ClickHouse/pull/57646) ([Julia Kartseva](https://github.com/jkartseva)).
* Allow IPv6 to UInt128 conversion and binary arithmetic. [#57707](https://github.com/ClickHouse/ClickHouse/pull/57707) ([Yakov Olkhovskiy](https://github.com/yakov-olkhovskiy)).
* Add a setting for the async inserts deduplication cache: how long to wait for a cache update. Deprecate the setting `async_block_ids_cache_min_update_interval_ms`. The cache is now updated only in case of conflicts. [#57743](https://github.com/ClickHouse/ClickHouse/pull/57743) ([alesapin](https://github.com/alesapin)).
* The `sleep()` function can now be cancelled with `KILL QUERY`. [#57746](https://github.com/ClickHouse/ClickHouse/pull/57746) ([Vitaly Baranov](https://github.com/vitlibar)).
* Forbid `CREATE TABLE ... AS SELECT` queries for `Replicated` table engines in the experimental `Replicated` database because they are not supported. Reference [#35408](https://github.com/ClickHouse/ClickHouse/issues/35408). [#57796](https://github.com/ClickHouse/ClickHouse/pull/57796) ([Nikolay Degterinsky](https://github.com/evillique)).
* Fix and improve transforming queries for external databases, to recursively obtain all compatible predicates. [#57888](https://github.com/ClickHouse/ClickHouse/pull/57888) ([flynn](https://github.com/ucasfl)).
* Support dynamic reloading of the filesystem cache size. Closes [#57866](https://github.com/ClickHouse/ClickHouse/issues/57866). [#57897](https://github.com/ClickHouse/ClickHouse/pull/57897) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Correctly support `system.stack_trace` for threads with blocked SIGRTMIN (these threads can exist in low-quality external libraries such as Apache rdkafka). [#57907](https://github.com/ClickHouse/ClickHouse/pull/57907) ([Azat Khuzhin](https://github.com/azat)). Also, send the signal to threads only if it is not blocked, to avoid waiting `storage_system_stack_trace_pipe_read_timeout_ms` when it does not make any sense. [#58136](https://github.com/ClickHouse/ClickHouse/pull/58136) ([Azat Khuzhin](https://github.com/azat)).
* Tolerate keeper failures in the quorum inserts' check. [#57986](https://github.com/ClickHouse/ClickHouse/pull/57986) ([Raúl Marín](https://github.com/Algunenano)).
* Add max/peak RSS (`MemoryResidentMax`) into system.asynchronous_metrics. [#58095](https://github.com/ClickHouse/ClickHouse/pull/58095) ([Azat Khuzhin](https://github.com/azat)).
* Allow using s3-style links (`https://` and `s3://`) without specifying the region if it is not the default one. Also, find the correct region if the user specified the wrong one. [#58148](https://github.com/ClickHouse/ClickHouse/pull/58148) ([Yarik Briukhovetskyi](https://github.com/yariks5s)).
* `clickhouse-format --obfuscate` will know about Settings, MergeTreeSettings, and time zones and keep their names unchanged. [#58179](https://github.com/ClickHouse/ClickHouse/pull/58179) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Added explicit `finalize()` function in `ZipArchiveWriter`. Simplify too complicated code in `ZipArchiveWriter`. This fixes [#58074](https://github.com/ClickHouse/ClickHouse/issues/58074). [#58202](https://github.com/ClickHouse/ClickHouse/pull/58202) ([Vitaly Baranov](https://github.com/vitlibar)).
* Make caches with the same path use the same cache objects. This behaviour existed before but was broken in 23.4. If such caches with the same path have different sets of cache settings, an exception will be thrown indicating that this is not allowed. [#58264](https://github.com/ClickHouse/ClickHouse/pull/58264) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Parallel replicas (experimental feature): friendly settings [#57542](https://github.com/ClickHouse/ClickHouse/pull/57542) ([Igor Nikonov](https://github.com/devcrafter)).
* Parallel replicas (experimental feature): announcement response handling improvement [#57749](https://github.com/ClickHouse/ClickHouse/pull/57749) ([Igor Nikonov](https://github.com/devcrafter)).
* Parallel replicas (experimental feature): give more respect to `min_number_of_marks` in `ParallelReplicasReadingCoordinator` [#57763](https://github.com/ClickHouse/ClickHouse/pull/57763) ([Nikita Taranov](https://github.com/nickitat)).
* Parallel replicas (experimental feature): disable parallel replicas with IN (subquery) [#58133](https://github.com/ClickHouse/ClickHouse/pull/58133) ([Igor Nikonov](https://github.com/devcrafter)).
* Parallel replicas (experimental feature): add profile event 'ParallelReplicasUsedCount' [#58173](https://github.com/ClickHouse/ClickHouse/pull/58173) ([Igor Nikonov](https://github.com/devcrafter)).
* Non-POST requests such as HEAD will be readonly, similar to GET. [#58060](https://github.com/ClickHouse/ClickHouse/pull/58060) ([San](https://github.com/santrancisco)).
* Add `bytes_uncompressed` column to `system.part_log` [#58167](https://github.com/ClickHouse/ClickHouse/pull/58167) ([Jordi Villar](https://github.com/jrdi)).
* Add base backup name to `system.backups` and `system.backup_log` tables [#58178](https://github.com/ClickHouse/ClickHouse/pull/58178) ([Pradeep Chhetri](https://github.com/chhetripradeep)).
* Add support for specifying query parameters in the command line in clickhouse-local [#58210](https://github.com/ClickHouse/ClickHouse/pull/58210) ([Pradeep Chhetri](https://github.com/chhetripradeep)).
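A sketch of the query-time override mentioned above (the table name `huge_table` is hypothetical):

```sql
-- Setting the limit to 0 disables the size check for this query only.
DROP TABLE huge_table SETTINGS max_table_size_to_drop = 0;
```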
#### Build/Testing/Packaging Improvement
* Randomize more settings [#39663](https://github.com/ClickHouse/ClickHouse/pull/39663) ([Anton Popov](https://github.com/CurtizJ)).
* Randomize disabled optimizations in CI [#57315](https://github.com/ClickHouse/ClickHouse/pull/57315) ([Raúl Marín](https://github.com/Algunenano)).
* Allow usage of Azure-related table engines/functions on macOS. [#51866](https://github.com/ClickHouse/ClickHouse/pull/51866) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* ClickHouse Fast Test now uses Musl instead of GLibc. [#57711](https://github.com/ClickHouse/ClickHouse/pull/57711) ([Alexey Milovidov](https://github.com/alexey-milovidov)). The fully-static Musl build is available to download from the CI.
* Run ClickBench for every commit. This closes [#57708](https://github.com/ClickHouse/ClickHouse/issues/57708). [#57712](https://github.com/ClickHouse/ClickHouse/pull/57712) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Remove the usage of a harmful C/POSIX `select` function from external libraries. [#57467](https://github.com/ClickHouse/ClickHouse/pull/57467) ([Igor Nikonov](https://github.com/devcrafter)).
* Settings only available in ClickHouse Cloud will be also present in the open-source ClickHouse build for convenience. [#57638](https://github.com/ClickHouse/ClickHouse/pull/57638) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
#### Bug Fix (user-visible misbehavior in an official stable release)
* Fixed a possibility of sorting order breakage in TTL GROUP BY [#49103](https://github.com/ClickHouse/ClickHouse/pull/49103) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* Fix: split `lttb` bucket strategy, first bucket and last bucket should only contain single point [#57003](https://github.com/ClickHouse/ClickHouse/pull/57003) ([FFish](https://github.com/wxybear)).
* Fix possible deadlock in the `Template` format during sync after error [#57004](https://github.com/ClickHouse/ClickHouse/pull/57004) ([Kruglov Pavel](https://github.com/Avogar)).
* Fix early stop while parsing a file with skipping lots of errors [#57006](https://github.com/ClickHouse/ClickHouse/pull/57006) ([Kruglov Pavel](https://github.com/Avogar)).
* Prevent dictionary's ACL bypass via the `dictionary` table function [#57362](https://github.com/ClickHouse/ClickHouse/pull/57362) ([Salvatore Mesoraca](https://github.com/aiven-sal)).
* Fix another case of a "non-ready set" error found by Fuzzer. [#57423](https://github.com/ClickHouse/ClickHouse/pull/57423) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix several issues regarding PostgreSQL `array_ndims` usage. [#57436](https://github.com/ClickHouse/ClickHouse/pull/57436) ([Ryan Jacobs](https://github.com/ryanmjacobs)).
* Fix RWLock inconsistency after write lock timeout [#57454](https://github.com/ClickHouse/ClickHouse/pull/57454) ([Vitaly Baranov](https://github.com/vitlibar)). Fix RWLock inconsistency after write lock timeout (again) [#57733](https://github.com/ClickHouse/ClickHouse/pull/57733) ([Vitaly Baranov](https://github.com/vitlibar)).
* Fix: don't exclude ephemeral column when building pushing to view chain [#57461](https://github.com/ClickHouse/ClickHouse/pull/57461) ([Yakov Olkhovskiy](https://github.com/yakov-olkhovskiy)).
* MaterializedPostgreSQL (experimental feature): fix issue [#41922](https://github.com/ClickHouse/ClickHouse/issues/41922), add a test for [#41923](https://github.com/ClickHouse/ClickHouse/issues/41923). [#57515](https://github.com/ClickHouse/ClickHouse/pull/57515) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Ignore ON CLUSTER clause in grant/revoke queries for management of replicated access entities. [#57538](https://github.com/ClickHouse/ClickHouse/pull/57538) ([MikhailBurdukov](https://github.com/MikhailBurdukov)).
* Fix crash in clickhouse-local [#57553](https://github.com/ClickHouse/ClickHouse/pull/57553) ([Nikolay Degterinsky](https://github.com/evillique)).
* A fix for Hash JOIN. [#57564](https://github.com/ClickHouse/ClickHouse/pull/57564) ([vdimir](https://github.com/vdimir)).
* Fix possible error in PostgreSQL source [#57567](https://github.com/ClickHouse/ClickHouse/pull/57567) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Fix type correction in Hash JOIN for nested LowCardinality. [#57614](https://github.com/ClickHouse/ClickHouse/pull/57614) ([vdimir](https://github.com/vdimir)).
* Avoid hangs of `system.stack_trace` by correctly prohibiting parallel reading from it. [#57641](https://github.com/ClickHouse/ClickHouse/pull/57641) ([Azat Khuzhin](https://github.com/azat)).
* Fix an error for aggregation of sparse columns with `any(...) RESPECT NULL` [#57710](https://github.com/ClickHouse/ClickHouse/pull/57710) ([Azat Khuzhin](https://github.com/azat)).
* Fix unary operators parsing [#57713](https://github.com/ClickHouse/ClickHouse/pull/57713) ([Nikolay Degterinsky](https://github.com/evillique)).
* Fix dependency loading for the experimental table engine `MaterializedPostgreSQL`. [#57754](https://github.com/ClickHouse/ClickHouse/pull/57754) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Fix retries for disconnected nodes for BACKUP/RESTORE ON CLUSTER [#57764](https://github.com/ClickHouse/ClickHouse/pull/57764) ([Vitaly Baranov](https://github.com/vitlibar)).
* Fix result of external aggregation in case of partially materialized projection [#57790](https://github.com/ClickHouse/ClickHouse/pull/57790) ([Anton Popov](https://github.com/CurtizJ)).
* Fix merge in aggregation functions with `*Map` combinator [#57795](https://github.com/ClickHouse/ClickHouse/pull/57795) ([Anton Popov](https://github.com/CurtizJ)).
* Disable `system.kafka_consumers` because it has a bug. [#57822](https://github.com/ClickHouse/ClickHouse/pull/57822) ([Azat Khuzhin](https://github.com/azat)).
* Fix LowCardinality keys support in Merge JOIN. [#57827](https://github.com/ClickHouse/ClickHouse/pull/57827) ([vdimir](https://github.com/vdimir)).
* A fix for `InterpreterCreateQuery` related to the sample block. [#57855](https://github.com/ClickHouse/ClickHouse/pull/57855) ([Maksim Kita](https://github.com/kitaisreal)).
* `addresses_expr` was ignored for named collections from PostgreSQL. [#57874](https://github.com/ClickHouse/ClickHouse/pull/57874) ([joelynch](https://github.com/joelynch)).
* Fix invalid memory access in BLAKE3 (Rust) [#57876](https://github.com/ClickHouse/ClickHouse/pull/57876) ([Raúl Marín](https://github.com/Algunenano)). Then it was rewritten from Rust to C++ for better [memory-safety](https://www.memorysafety.org/). [#57994](https://github.com/ClickHouse/ClickHouse/pull/57994) ([Raúl Marín](https://github.com/Algunenano)).
* Normalize function names in `CREATE INDEX` [#57906](https://github.com/ClickHouse/ClickHouse/pull/57906) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Fix handling of unavailable replicas before first request happened [#57933](https://github.com/ClickHouse/ClickHouse/pull/57933) ([Nikita Taranov](https://github.com/nickitat)).
* Fix literal alias misclassification [#57988](https://github.com/ClickHouse/ClickHouse/pull/57988) ([Chen768959](https://github.com/Chen768959)).
* Fix invalid preprocessing on Keeper [#58069](https://github.com/ClickHouse/ClickHouse/pull/58069) ([Antonio Andelic](https://github.com/antonio2368)).
* Fix integer overflow in the `Poco` library, related to `UTF32Encoding` [#58073](https://github.com/ClickHouse/ClickHouse/pull/58073) ([Andrey Fedotov](https://github.com/anfedotoff)).
* Fix parallel replicas (experimental feature) in presence of a scalar subquery with a big integer value [#58118](https://github.com/ClickHouse/ClickHouse/pull/58118) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Fix `accurateCastOrNull` for out-of-range `DateTime` [#58139](https://github.com/ClickHouse/ClickHouse/pull/58139) ([Andrey Zvonov](https://github.com/zvonand)).
* Fix possible `PARAMETER_OUT_OF_BOUND` error during subcolumns reading from a wide part in MergeTree [#58175](https://github.com/ClickHouse/ClickHouse/pull/58175) ([Kruglov Pavel](https://github.com/Avogar)).
* Fix a slow-down of CREATE VIEW with an enormous number of subqueries [#58220](https://github.com/ClickHouse/ClickHouse/pull/58220) ([Tao Wang](https://github.com/wangtZJU)).
* Fix parallel parsing for JSONCompactEachRow [#58181](https://github.com/ClickHouse/ClickHouse/pull/58181) ([Alexey Milovidov](https://github.com/alexey-milovidov)). [#58250](https://github.com/ClickHouse/ClickHouse/pull/58250) ([Kruglov Pavel](https://github.com/Avogar)).
### <a id="2311"></a> ClickHouse release 23.11, 2023-12-06
#### Backward Incompatible Change

View File

@ -12,6 +12,8 @@ elseif (CMAKE_SYSTEM_NAME MATCHES "FreeBSD")
elseif (CMAKE_SYSTEM_NAME MATCHES "Darwin")
set (OS_DARWIN 1)
add_definitions(-D OS_DARWIN)
# For MAP_ANON/MAP_ANONYMOUS
add_definitions(-D _DARWIN_C_SOURCE)
elseif (CMAKE_SYSTEM_NAME MATCHES "SunOS")
set (OS_SUNOS 1)
add_definitions(-D OS_SUNOS)

View File

@ -154,7 +154,6 @@ add_contrib (libpqxx-cmake libpqxx)
add_contrib (libpq-cmake libpq)
add_contrib (nuraft-cmake NuRaft)
add_contrib (fast_float-cmake fast_float)
add_contrib (idna-cmake idna)
add_contrib (datasketches-cpp-cmake datasketches-cpp)
add_contrib (incbin-cmake incbin)
add_contrib (sqids-cpp-cmake sqids-cpp)

contrib/idna vendored

@ -1 +0,0 @@
Subproject commit 3c8be01d42b75649f1ac9b697d0ef757eebfe667

View File

@ -1,24 +0,0 @@
option(ENABLE_IDNA "Enable idna support" ${ENABLE_LIBRARIES})
if ((NOT ENABLE_IDNA))
message (STATUS "Not using idna")
return()
endif()
set (LIBRARY_DIR "${ClickHouse_SOURCE_DIR}/contrib/idna")
set (SRCS
"${LIBRARY_DIR}/src/idna.cpp"
"${LIBRARY_DIR}/src/mapping.cpp"
"${LIBRARY_DIR}/src/mapping_tables.cpp"
"${LIBRARY_DIR}/src/normalization.cpp"
"${LIBRARY_DIR}/src/normalization_tables.cpp"
"${LIBRARY_DIR}/src/punycode.cpp"
"${LIBRARY_DIR}/src/to_ascii.cpp"
"${LIBRARY_DIR}/src/to_unicode.cpp"
"${LIBRARY_DIR}/src/unicode_transcoding.cpp"
"${LIBRARY_DIR}/src/validity.cpp"
)
add_library (_idna ${SRCS})
target_include_directories(_idna PUBLIC "${LIBRARY_DIR}/include")
add_library (ch_contrib::idna ALIAS _idna)

View File

@ -551,6 +551,14 @@ Total amount of bytes (compressed, including data and indices) stored in all tab
Total amount of data parts in all tables of the MergeTree family. Numbers larger than 10 000 will negatively affect the server startup time, and this may indicate an unreasonable choice of the partition key.
### TotalPrimaryKeyBytesInMemory
The total amount of memory (in bytes) used by primary key values (only takes active parts into account).
### TotalPrimaryKeyBytesInMemoryAllocated
The total amount of memory (in bytes) reserved for primary key values (only takes active parts into account).
### TotalRowsOfMergeTreeTables
Total amount of rows (records) stored in all tables of MergeTree family.

View File

@ -0,0 +1,43 @@
---
slug: /en/operations/system-tables/view_refreshes
---
# view_refreshes
Information about [Refreshable Materialized Views](../../sql-reference/statements/create/view.md#refreshable-materialized-view). Contains all refreshable materialized views, regardless of whether there's a refresh in progress or not.
Columns:
- `database` ([String](../../sql-reference/data-types/string.md)) — The name of the database the table is in.
- `view` ([String](../../sql-reference/data-types/string.md)) — Table name.
- `status` ([String](../../sql-reference/data-types/string.md)) — Current state of the refresh.
- `last_refresh_result` ([String](../../sql-reference/data-types/string.md)) — Outcome of the latest refresh attempt.
- `last_refresh_time` ([DateTime](../../sql-reference/data-types/datetime.md)) — Time of the last refresh attempt. `NULL` if no refresh attempts happened since server startup or table creation.
- `last_success_time` ([DateTime](../../sql-reference/data-types/datetime.md)) — Time of the last successful refresh. `NULL` if no successful refreshes happened since server startup or table creation.
- `duration_ms` ([UInt64](../../sql-reference/data-types/int-uint.md)) — How long the last refresh attempt took.
- `next_refresh_time` ([DateTime](../../sql-reference/data-types/datetime.md)) — Time at which the next refresh is scheduled to start.
- `remaining_dependencies` ([Array(String)](../../sql-reference/data-types/array.md)) — If the view has [refresh dependencies](../../sql-reference/statements/create/view.md#refresh-dependencies), this array contains the subset of those dependencies that are not satisfied for the current refresh yet. If `status = 'WaitingForDependencies'`, a refresh is ready to start as soon as these dependencies are fulfilled.
- `exception` ([String](../../sql-reference/data-types/string.md)) — If `last_refresh_result = 'Exception'`, i.e. the last refresh attempt failed, this column contains the corresponding error message and stack trace.
- `refresh_count` ([UInt64](../../sql-reference/data-types/int-uint.md)) — Number of successful refreshes since last server restart or table creation.
- `progress` ([Float64](../../sql-reference/data-types/float.md)) — Progress of the current refresh, between 0 and 1.
- `read_rows` ([UInt64](../../sql-reference/data-types/int-uint.md)) — Number of rows read by the current refresh so far.
- `total_rows` ([UInt64](../../sql-reference/data-types/int-uint.md)) — Estimated total number of rows that need to be read by the current refresh.
(There are additional columns related to current refresh progress, but they are currently unreliable.)
**Example**
```sql
SELECT
database,
view,
status,
last_refresh_result,
last_refresh_time,
next_refresh_time
FROM system.view_refreshes
┌─database─┬─view───────────────────────┬─status────┬─last_refresh_result─┬───last_refresh_time─┬───next_refresh_time─┐
│ default │ hello_documentation_reader │ Scheduled │ Finished │ 2023-12-01 01:24:00 │ 2023-12-01 01:25:00 │
└──────────┴────────────────────────────┴───────────┴─────────────────────┴─────────────────────┴─────────────────────┘
```

View File

@ -1383,71 +1383,6 @@ Result:
└──────────────────┘
```
## punycodeEncode
Returns the [Punycode](https://en.wikipedia.org/wiki/Punycode) of a string.
The string must be UTF8-encoded, otherwise results are undefined.
**Syntax**
``` sql
punycodeEncode(val)
```
**Arguments**
- `val` - Input value. [String](../data-types/string.md)
**Returned value**
- A Punycode representation of the input value. [String](../data-types/string.md)
**Example**
``` sql
SELECT punycodeEncode('München');
```
Result:
```result
┌─punycodeEncode('München')─┐
│ Mnchen-3ya │
└───────────────────────────┘
```
## punycodeDecode
Returns the UTF8-encoded plaintext of a [Punycode](https://en.wikipedia.org/wiki/Punycode)-encoded string.
**Syntax**
``` sql
punycodeDecode(val)
```
**Arguments**
- `val` - Punycode-encoded string. [String](../data-types/string.md)
**Returned value**
- The plaintext of the input value. [String](../data-types/string.md)
**Example**
``` sql
SELECT punycodeDecode('Mnchen-3ya');
```
Result:
```result
┌─punycodeDecode('Mnchen-3ya')─┐
│ München │
└──────────────────────────────┘
```
## byteHammingDistance
Calculates the [hamming distance](https://en.wikipedia.org/wiki/Hamming_distance) between two byte strings.

View File

@ -6,28 +6,28 @@ sidebar_label: VIEW
# ALTER TABLE … MODIFY QUERY Statement
You can modify the `SELECT` query that was specified when a [materialized view](../create/view.md#materialized) was created with the `ALTER TABLE … MODIFY QUERY` statement, without interrupting the ingestion process.
The `allow_experimental_alter_materialized_view_structure` setting must be enabled.
This command was created to change a materialized view created with the `TO [db.]name` clause. It does not change the structure of the underlying storage table, and it does not change the definition of the materialized view's columns; because of this, the applicability of this command is very limited for materialized views created without a `TO [db.]name` clause.
**Example with TO table**
```sql
CREATE TABLE events (ts DateTime, event_type String)
ENGINE = MergeTree ORDER BY (event_type, ts);
CREATE TABLE events_by_day (ts DateTime, event_type String, events_cnt UInt64)
ENGINE = SummingMergeTree ORDER BY (event_type, ts);
CREATE MATERIALIZED VIEW mv TO events_by_day AS
SELECT toStartOfDay(ts) ts, event_type, count() events_cnt
FROM events
GROUP BY ts, event_type;
INSERT INTO events
SELECT Date '2020-01-01' + interval number * 900 second,
['imp', 'click'][number%2+1]
FROM numbers(100);
@ -43,23 +43,23 @@ ORDER BY ts, event_type;
│ 2020-01-02 00:00:00 │ imp │ 2 │
└─────────────────────┴────────────┴─────────────────┘
-- Let's add the new measurement `cost`
-- and the new dimension `browser`.
ALTER TABLE events
ADD COLUMN browser String,
ADD COLUMN cost Float64;
-- Columns do not have to match in a materialized view and TO
-- (destination table), so the next alter does not break insertion.
ALTER TABLE events_by_day
ADD COLUMN cost Float64,
ADD COLUMN browser String after event_type,
MODIFY ORDER BY (event_type, ts, browser);
INSERT INTO events
SELECT Date '2020-01-02' + interval number * 900 second,
['imp', 'click'][number%2+1],
['firefox', 'safari', 'chrome'][number%3+1],
10/(number+1)%33
@ -82,16 +82,16 @@ ORDER BY ts, event_type;
└─────────────────────┴────────────┴─────────┴────────────┴──────┘
SET allow_experimental_alter_materialized_view_structure=1;
ALTER TABLE mv MODIFY QUERY
SELECT toStartOfDay(ts) ts, event_type, browser,
count() events_cnt,
sum(cost) cost
FROM events
GROUP BY ts, event_type, browser;
INSERT INTO events
SELECT Date '2020-01-03' + interval number * 900 second,
['imp', 'click'][number%2+1],
['firefox', 'safari', 'chrome'][number%3+1],
10/(number+1)%33
@ -138,7 +138,7 @@ PRIMARY KEY (event_type, ts)
ORDER BY (event_type, ts, browser)
SETTINGS index_granularity = 8192
-- !!! The columns' definition is unchanged but it does not matter, we are not querying
-- MATERIALIZED VIEW, we are querying TO (storage) table.
-- SELECT section is updated.
@ -169,7 +169,7 @@ The application is very limited because you can only change the `SELECT` section
```sql
CREATE TABLE src_table (`a` UInt32) ENGINE = MergeTree ORDER BY a;
CREATE MATERIALIZED VIEW mv (`a` UInt32) ENGINE = MergeTree ORDER BY a AS SELECT a FROM src_table;
INSERT INTO src_table (a) VALUES (1), (2);
SELECT * FROM mv;
```
@ -199,3 +199,7 @@ SELECT * FROM mv;
## ALTER LIVE VIEW Statement
`ALTER LIVE VIEW ... REFRESH` statement refreshes a [Live view](../create/view.md#live-view). See [Force Live View Refresh](../create/view.md#live-view-alter-refresh).
## ALTER TABLE … MODIFY REFRESH Statement
`ALTER TABLE ... MODIFY REFRESH` statement changes refresh parameters of a [Refreshable Materialized View](../create/view.md#refreshable-materialized-view). See [Changing Refresh Parameters](../create/view.md#changing-refresh-parameters).

View File

@ -37,6 +37,7 @@ SELECT a, b, c FROM (SELECT ...)
```
## Parameterized View
Parameterized views are similar to normal views but can be created with parameters that are not resolved immediately. These views can be used with table functions, which specify the name of the view as the function name and the parameter values as its arguments.
``` sql
@ -66,7 +67,7 @@ When creating a materialized view with `TO [db].[table]`, you can't also use `PO
A materialized view is implemented as follows: when inserting data to the table specified in `SELECT`, part of the inserted data is converted by this `SELECT` query, and the result is inserted in the view.
:::note
Materialized views in ClickHouse use **column names** instead of column order during insertion into destination table. If some column names are not present in the `SELECT` query result, ClickHouse uses a default value, even if the column is not [Nullable](../../data-types/nullable.md). A safe practice would be to add aliases for every column when using Materialized views.
Materialized views in ClickHouse are implemented more like insert triggers. If there's some aggregation in the view query, it's applied only to the batch of freshly inserted data. Any changes to existing data of the source table (like update, delete, drop partition, etc.) do not change the materialized view.
@ -96,9 +97,116 @@ This feature is deprecated and will be removed in the future.
For your convenience, the old documentation is located [here](https://pastila.nl/?00f32652/fdf07272a7b54bda7e13b919264e449f.md)
## Refreshable Materialized View {#refreshable-materialized-view}
```sql
CREATE MATERIALIZED VIEW [IF NOT EXISTS] [db.]table_name
REFRESH EVERY|AFTER interval [OFFSET interval]
RANDOMIZE FOR interval
DEPENDS ON [db.]name [, [db.]name [, ...]]
[TO[db.]name] [(columns)] [ENGINE = engine] [EMPTY]
AS SELECT ...
```
where `interval` is a sequence of simple intervals:
```sql
number SECOND|MINUTE|HOUR|DAY|WEEK|MONTH|YEAR
```
Periodically runs the corresponding query and stores its result in a table, atomically replacing the table's previous contents.
Differences from regular non-refreshable materialized views:
* No insert trigger. I.e. when new data is inserted into the table specified in SELECT, it's *not* automatically pushed to the refreshable materialized view. The periodic refresh runs the entire query and replaces the entire table.
* No restrictions on the SELECT query. Table functions (e.g. `url()`), views, UNION, and JOIN are all allowed.
:::note
Refreshable materialized views are a work in progress. Setting `allow_experimental_refreshable_materialized_view = 1` is required for creating one. Current limitations:
* not compatible with Replicated database or table engines,
* require [Atomic database engine](../../../engines/database-engines/atomic.md),
* no retries for failed refresh - we just skip to the next scheduled refresh time,
* no limit on number of concurrent refreshes.
:::
### Refresh Schedule
Example refresh schedules:
```sql
REFRESH EVERY 1 DAY -- every day, at midnight (UTC)
REFRESH EVERY 1 MONTH -- on 1st day of every month, at midnight
REFRESH EVERY 1 MONTH OFFSET 5 DAY 2 HOUR -- on 6th day of every month, at 2:00 am
REFRESH EVERY 2 WEEK OFFSET 5 DAY 15 HOUR 10 MINUTE -- every other Saturday, at 3:10 pm
REFRESH EVERY 30 MINUTE -- at 00:00, 00:30, 01:00, 01:30, etc
REFRESH AFTER 30 MINUTE -- 30 minutes after the previous refresh completes, no alignment with time of day
-- REFRESH AFTER 1 HOUR OFFSET 1 MINUTE -- syntax error, OFFSET is not allowed with AFTER
```
`RANDOMIZE FOR` randomly adjusts the time of each refresh, e.g.:
```sql
REFRESH EVERY 1 DAY OFFSET 2 HOUR RANDOMIZE FOR 1 HOUR -- every day at random time between 01:30 and 02:30
```
At most one refresh may be running at a time, for a given view. E.g. if a view with `REFRESH EVERY 1 MINUTE` takes 2 minutes to refresh, it'll just be refreshing every 2 minutes. If it then becomes faster and starts refreshing in 10 seconds, it'll go back to refreshing every minute. (In particular, it won't refresh every 10 seconds to catch up with a backlog of missed refreshes - there's no such backlog.)
Additionally, a refresh is started immediately after the materialized view is created, unless `EMPTY` is specified in the `CREATE` query. If `EMPTY` is specified, the first refresh happens according to schedule.
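Putting the schedule clauses together, here is a minimal sketch of a refreshable materialized view. The source table `hits_source` and the view name `daily_top` are hypothetical, and the experimental setting must be enabled as noted above:

```sql
SET allow_experimental_refreshable_materialized_view = 1;

-- Recomputed once a day around 03:00 (randomized within a 30-minute window),
-- atomically replacing the previous contents of the view's table.
CREATE MATERIALIZED VIEW daily_top
REFRESH EVERY 1 DAY OFFSET 3 HOUR RANDOMIZE FOR 30 MINUTE
ENGINE = MergeTree ORDER BY page
AS SELECT page, count() AS hits FROM hits_source GROUP BY page;
```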
### Dependencies {#refresh-dependencies}
`DEPENDS ON` synchronizes refreshes of different tables. By way of example, suppose there's a chain of two refreshable materialized views:
```sql
CREATE MATERIALIZED VIEW source REFRESH EVERY 1 DAY AS SELECT * FROM url(...)
CREATE MATERIALIZED VIEW destination REFRESH EVERY 1 DAY AS SELECT ... FROM source
```
Without `DEPENDS ON`, both views will start a refresh at midnight, and `destination` will typically see yesterday's data in `source`. If we add a dependency:
```sql
CREATE MATERIALIZED VIEW destination REFRESH EVERY 1 DAY DEPENDS ON source AS SELECT ... FROM source
```
then `destination`'s refresh will start only after `source`'s refresh finished for that day, so `destination` will be based on fresh data.
Alternatively, the same result can be achieved with:
```sql
CREATE MATERIALIZED VIEW destination REFRESH AFTER 1 HOUR DEPENDS ON source AS SELECT ... FROM source
```
where `1 HOUR` can be any duration less than `source`'s refresh period. The dependent table won't be refreshed more frequently than any of its dependencies. This is a valid way to set up a chain of refreshable views without specifying the real refresh period more than once.
A few more examples:
* `REFRESH EVERY 1 DAY OFFSET 10 MINUTE` (`destination`) depends on `REFRESH EVERY 1 DAY` (`source`)<br/>
If `source` refresh takes more than 10 minutes, `destination` will wait for it.
* `REFRESH EVERY 1 DAY OFFSET 1 HOUR` depends on `REFRESH EVERY 1 DAY OFFSET 23 HOUR`<br/>
Similar to the above, even though the corresponding refreshes happen on different calendar days.
`destination`'s refresh on day X+1 will wait for `source`'s refresh on day X (if it takes more than 2 hours).
* `REFRESH EVERY 2 HOUR` depends on `REFRESH EVERY 1 HOUR`<br/>
The 2 HOUR refresh happens after the 1 HOUR refresh for every other hour, e.g. after the midnight
refresh, then after the 2am refresh, etc.
* `REFRESH EVERY 1 MINUTE` depends on `REFRESH EVERY 2 HOUR`<br/>
`REFRESH AFTER 1 MINUTE` depends on `REFRESH EVERY 2 HOUR`<br/>
`REFRESH AFTER 1 MINUTE` depends on `REFRESH AFTER 2 HOUR`<br/>
`destination` is refreshed once after every `source` refresh, i.e. every 2 hours. The `1 MINUTE` is effectively ignored.
* `REFRESH AFTER 1 HOUR` depends on `REFRESH AFTER 1 HOUR`<br/>
Currently this is not recommended.
:::note
`DEPENDS ON` only works between refreshable materialized views. Listing a regular table in the `DEPENDS ON` list will prevent the view from ever refreshing (dependencies can be removed with `ALTER`, see below).
:::
### Changing Refresh Parameters {#changing-refresh-parameters}
To change refresh parameters:
```sql
ALTER TABLE [db.]name MODIFY REFRESH EVERY|AFTER ... [RANDOMIZE FOR ...] [DEPENDS ON ...]
```
:::note
This replaces refresh schedule *and* dependencies. If the table had a `DEPENDS ON`, doing a `MODIFY REFRESH` without `DEPENDS ON` will remove the dependencies.
:::
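For example, continuing the hypothetical `daily_top` view sketched earlier:

```sql
-- Replaces the whole schedule; any previous DEPENDS ON list is dropped.
ALTER TABLE daily_top MODIFY REFRESH EVERY 12 HOUR RANDOMIZE FOR 10 MINUTE;
```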
### Other operations
The status of all refreshable materialized views is available in the table [`system.view_refreshes`](../../../operations/system-tables/view_refreshes.md). In particular, it contains the refresh progress (if running), the last and next refresh times, and the exception message if a refresh failed.
To manually stop, start, trigger, or cancel refreshes use [`SYSTEM STOP|START|REFRESH|CANCEL VIEW`](../system.md#refreshable-materialized-views).
## Window View [Experimental]
:::info
This is an experimental feature that may change in backwards-incompatible ways in future releases. Enable usage of window views and the `WATCH` query with the [allow_experimental_window_view](../../../operations/settings/settings.md#allow-experimental-window-view) setting: `SET allow_experimental_window_view = 1`.
:::

View File

@ -449,7 +449,7 @@ SYSTEM SYNC FILE CACHE [ON CLUSTER cluster_name]
```
### SYSTEM STOP LISTEN
## SYSTEM STOP LISTEN
Closes the socket and gracefully terminates the existing connections to the server on the specified port with the specified protocol.
@ -464,7 +464,7 @@ SYSTEM STOP LISTEN [ON CLUSTER cluster_name] [QUERIES ALL | QUERIES DEFAULT | QU
- If `QUERIES DEFAULT [EXCEPT .. [,..]]` modifier is specified, all default protocols are stopped, unless specified with `EXCEPT` clause.
- If `QUERIES CUSTOM [EXCEPT .. [,..]]` modifier is specified, all custom protocols are stopped, unless specified with `EXCEPT` clause.
### SYSTEM START LISTEN
## SYSTEM START LISTEN
Allows new connections to be established on the specified protocols.
@ -473,3 +473,47 @@ However, if the server on the specified port and protocol was not stopped using
```sql
SYSTEM START LISTEN [ON CLUSTER cluster_name] [QUERIES ALL | QUERIES DEFAULT | QUERIES CUSTOM | TCP | TCP WITH PROXY | TCP SECURE | HTTP | HTTPS | MYSQL | GRPC | POSTGRESQL | PROMETHEUS | CUSTOM 'protocol']
```
## Managing Refreshable Materialized Views {#refreshable-materialized-views}
Commands to control background tasks performed by [Refreshable Materialized Views](../../sql-reference/statements/create/view.md#refreshable-materialized-view).
Keep an eye on [`system.view_refreshes`](../../operations/system-tables/view_refreshes.md) while using them.
### SYSTEM REFRESH VIEW
Trigger an immediate out-of-schedule refresh of a given view.
```sql
SYSTEM REFRESH VIEW [db.]name
```
### SYSTEM STOP VIEW, SYSTEM STOP VIEWS
Disable periodic refreshing of the given view or all refreshable views. If a refresh is in progress, cancel it too.
```sql
SYSTEM STOP VIEW [db.]name
```
```sql
SYSTEM STOP VIEWS
```
### SYSTEM START VIEW, SYSTEM START VIEWS
Enable periodic refreshing for the given view or all refreshable views. No immediate refresh is triggered.
```sql
SYSTEM START VIEW [db.]name
```
```sql
SYSTEM START VIEWS
```
### SYSTEM CANCEL VIEW
If there's a refresh in progress for the given view, interrupt and cancel it. Otherwise do nothing.
```sql
SYSTEM CANCEL VIEW [db.]name
```

View File

@ -82,7 +82,8 @@ enum class AccessType
\
M(ALTER_VIEW_REFRESH, "ALTER LIVE VIEW REFRESH, REFRESH VIEW", VIEW, ALTER_VIEW) \
M(ALTER_VIEW_MODIFY_QUERY, "ALTER TABLE MODIFY QUERY", VIEW, ALTER_VIEW) \
M(ALTER_VIEW, "", GROUP, ALTER) /* allows to execute ALTER VIEW REFRESH, ALTER VIEW MODIFY QUERY;
M(ALTER_VIEW_MODIFY_REFRESH, "ALTER TABLE MODIFY QUERY", VIEW, ALTER_VIEW) \
M(ALTER_VIEW, "", GROUP, ALTER) /* allows to execute ALTER VIEW REFRESH, ALTER VIEW MODIFY QUERY, ALTER VIEW MODIFY REFRESH;
implicitly enabled by the grant ALTER_TABLE */\
\
M(ALTER, "", GROUP, ALL) /* allows to execute ALTER {TABLE|LIVE VIEW} */\
@ -177,6 +178,7 @@ enum class AccessType
M(SYSTEM_MOVES, "SYSTEM STOP MOVES, SYSTEM START MOVES, STOP MOVES, START MOVES", TABLE, SYSTEM) \
M(SYSTEM_PULLING_REPLICATION_LOG, "SYSTEM STOP PULLING REPLICATION LOG, SYSTEM START PULLING REPLICATION LOG", TABLE, SYSTEM) \
M(SYSTEM_CLEANUP, "SYSTEM STOP CLEANUP, SYSTEM START CLEANUP", TABLE, SYSTEM) \
M(SYSTEM_VIEWS, "SYSTEM REFRESH VIEW, SYSTEM START VIEWS, SYSTEM STOP VIEWS, SYSTEM START VIEW, SYSTEM STOP VIEW, SYSTEM CANCEL VIEW, REFRESH VIEW, START VIEWS, STOP VIEWS, START VIEW, STOP VIEW, CANCEL VIEW", VIEW, SYSTEM) \
M(SYSTEM_DISTRIBUTED_SENDS, "SYSTEM STOP DISTRIBUTED SENDS, SYSTEM START DISTRIBUTED SENDS, STOP DISTRIBUTED SENDS, START DISTRIBUTED SENDS", TABLE, SYSTEM_SENDS) \
M(SYSTEM_REPLICATED_SENDS, "SYSTEM STOP REPLICATED SENDS, SYSTEM START REPLICATED SENDS, STOP REPLICATED SENDS, START REPLICATED SENDS", TABLE, SYSTEM_SENDS) \
M(SYSTEM_SENDS, "SYSTEM STOP SENDS, SYSTEM START SENDS, STOP SENDS, START SENDS", GROUP, SYSTEM) \

View File

@ -51,7 +51,7 @@ TEST(AccessRights, Union)
"CREATE DICTIONARY, DROP DATABASE, DROP TABLE, DROP VIEW, DROP DICTIONARY, UNDROP TABLE, "
"TRUNCATE, OPTIMIZE, BACKUP, CREATE ROW POLICY, ALTER ROW POLICY, DROP ROW POLICY, "
"SHOW ROW POLICIES, SYSTEM MERGES, SYSTEM TTL MERGES, SYSTEM FETCHES, "
"SYSTEM MOVES, SYSTEM PULLING REPLICATION LOG, SYSTEM CLEANUP, SYSTEM SENDS, SYSTEM REPLICATION QUEUES, "
"SYSTEM MOVES, SYSTEM PULLING REPLICATION LOG, SYSTEM CLEANUP, SYSTEM VIEWS, SYSTEM SENDS, SYSTEM REPLICATION QUEUES, "
"SYSTEM DROP REPLICA, SYSTEM SYNC REPLICA, SYSTEM RESTART REPLICA, "
"SYSTEM RESTORE REPLICA, SYSTEM WAIT LOADING PARTS, SYSTEM SYNC DATABASE REPLICA, SYSTEM FLUSH DISTRIBUTED, dictGet ON db1.*, GRANT NAMED COLLECTION ADMIN ON db1");
}

View File

@ -20,6 +20,7 @@
#include <Common/ArenaAllocator.h>
#include <Common/assert_cast.h>
#include <Common/thread_local_rng.h>
#include <AggregateFunctions/IAggregateFunction.h>

View File

@ -1,7 +1,7 @@
#include <AggregateFunctions/AggregateFunctionFactory.h>
#include <AggregateFunctions/HelpersMinMaxAny.h>
#include <AggregateFunctions/FactoryHelpers.h>
#include <AggregateFunctions/HelpersMinMaxAny.h>
#include <AggregateFunctions/findNumeric.h>
namespace DB
{
@ -10,10 +10,122 @@ struct Settings;
namespace
{
template <typename Data>
class AggregateFunctionsSingleValueMax final : public AggregateFunctionsSingleValue<Data>
{
using Parent = AggregateFunctionsSingleValue<Data>;
public:
explicit AggregateFunctionsSingleValueMax(const DataTypePtr & type) : Parent(type) { }
/// Specializations for native numeric types
ALWAYS_INLINE inline void addBatchSinglePlace(
size_t row_begin,
size_t row_end,
AggregateDataPtr __restrict place,
const IColumn ** __restrict columns,
Arena * arena,
ssize_t if_argument_pos) const override;
ALWAYS_INLINE inline void addBatchSinglePlaceNotNull(
size_t row_begin,
size_t row_end,
AggregateDataPtr __restrict place,
const IColumn ** __restrict columns,
const UInt8 * __restrict null_map,
Arena * arena,
ssize_t if_argument_pos) const override;
};
// NOLINTBEGIN(bugprone-macro-parentheses)
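/// For each basic numeric type, generate an addBatchSinglePlace override that replaces the
/// generic row-by-row comparison with a single findNumericMax / findNumericMaxIf scan over the column.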
#define SPECIALIZE(TYPE) \
template <> \
void AggregateFunctionsSingleValueMax<typename DB::AggregateFunctionMaxData<SingleValueDataFixed<TYPE>>>::addBatchSinglePlace( \
size_t row_begin, \
size_t row_end, \
AggregateDataPtr __restrict place, \
const IColumn ** __restrict columns, \
Arena *, \
ssize_t if_argument_pos) const \
{ \
const auto & column = assert_cast<const DB::AggregateFunctionMaxData<SingleValueDataFixed<TYPE>>::ColVecType &>(*columns[0]); \
std::optional<TYPE> opt; \
if (if_argument_pos >= 0) \
{ \
const auto & flags = assert_cast<const ColumnUInt8 &>(*columns[if_argument_pos]).getData(); \
opt = findNumericMaxIf(column.getData().data(), flags.data(), row_begin, row_end); \
} \
else \
opt = findNumericMax(column.getData().data(), row_begin, row_end); \
if (opt.has_value()) \
this->data(place).changeIfGreater(opt.value()); \
}
// NOLINTEND(bugprone-macro-parentheses)
FOR_BASIC_NUMERIC_TYPES(SPECIALIZE)
#undef SPECIALIZE
template <typename Data>
void AggregateFunctionsSingleValueMax<Data>::addBatchSinglePlace(
size_t row_begin,
size_t row_end,
AggregateDataPtr __restrict place,
const IColumn ** __restrict columns,
Arena * arena,
ssize_t if_argument_pos) const
{
return Parent::addBatchSinglePlace(row_begin, row_end, place, columns, arena, if_argument_pos);
}
// NOLINTBEGIN(bugprone-macro-parentheses)
#define SPECIALIZE(TYPE) \
template <> \
void AggregateFunctionsSingleValueMax<typename DB::AggregateFunctionMaxData<SingleValueDataFixed<TYPE>>>::addBatchSinglePlaceNotNull( \
size_t row_begin, \
size_t row_end, \
AggregateDataPtr __restrict place, \
const IColumn ** __restrict columns, \
const UInt8 * __restrict null_map, \
Arena *, \
ssize_t if_argument_pos) const \
{ \
const auto & column = assert_cast<const DB::AggregateFunctionMaxData<SingleValueDataFixed<TYPE>>::ColVecType &>(*columns[0]); \
std::optional<TYPE> opt; \
if (if_argument_pos >= 0) \
{ \
const auto * if_flags = assert_cast<const ColumnUInt8 &>(*columns[if_argument_pos]).getData().data(); \
auto final_flags = std::make_unique<UInt8[]>(row_end); \
for (size_t i = row_begin; i < row_end; ++i) \
final_flags[i] = (!null_map[i]) & !!if_flags[i]; \
opt = findNumericMaxIf(column.getData().data(), final_flags.get(), row_begin, row_end); \
} \
else \
opt = findNumericMaxNotNull(column.getData().data(), null_map, row_begin, row_end); \
if (opt.has_value()) \
this->data(place).changeIfGreater(opt.value()); \
}
// NOLINTEND(bugprone-macro-parentheses)
FOR_BASIC_NUMERIC_TYPES(SPECIALIZE)
#undef SPECIALIZE
template <typename Data>
void AggregateFunctionsSingleValueMax<Data>::addBatchSinglePlaceNotNull(
size_t row_begin,
size_t row_end,
AggregateDataPtr __restrict place,
const IColumn ** __restrict columns,
const UInt8 * __restrict null_map,
Arena * arena,
ssize_t if_argument_pos) const
{
return Parent::addBatchSinglePlaceNotNull(row_begin, row_end, place, columns, null_map, arena, if_argument_pos);
}
AggregateFunctionPtr createAggregateFunctionMax(
const std::string & name, const DataTypes & argument_types, const Array & parameters, const Settings * settings)
{
return AggregateFunctionPtr(createAggregateFunctionSingleValue<AggregateFunctionsSingleValue, AggregateFunctionMaxData>(name, argument_types, parameters, settings));
return AggregateFunctionPtr(createAggregateFunctionSingleValue<AggregateFunctionsSingleValueMax, AggregateFunctionMaxData>(name, argument_types, parameters, settings));
}
AggregateFunctionPtr createAggregateFunctionArgMax(

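The hunk above uses a classic C++ idiom: a generic member function plus explicit specializations stamped out by an X-macro (SPECIALIZE expanded over FOR_BASIC_NUMERIC_TYPES), so native numeric types take the vectorized findNumericMax path while every other type falls back to the parent implementation. A minimal self-contained sketch of the idiom (sumBatch and FOR_TYPES are hypothetical names, not ClickHouse code):

#include <cstddef>
#include <iostream>

/// Generic fallback: works for any arithmetic type.
template <typename T>
T sumBatch(const T * data, size_t n)
{
    T acc{};
    for (size_t i = 0; i < n; ++i)
        acc += data[i];
    return acc;
}

/// Hypothetical X-macro listing the types that get a tuned specialization.
#define FOR_TYPES(M) M(int) M(double)

#define SPECIALIZE(TYPE) \
    template <> \
    TYPE sumBatch<TYPE>(const TYPE * data, size_t n) \
    { \
        TYPE acc{}; \
        for (size_t i = 0; i < n; ++i) \
            acc += data[i]; /* stand-in for a vectorized kernel */ \
        return acc; \
    }
FOR_TYPES(SPECIALIZE)
#undef SPECIALIZE

int main()
{
    int v[] = {1, 2, 3};
    std::cout << sumBatch(v, 3) << '\n'; /// resolves to the int specialization
}

The same pattern repeats below for addBatchSinglePlaceNotNull and, in the next file, for min.
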
View File

@ -1,6 +1,7 @@
#include <AggregateFunctions/AggregateFunctionFactory.h>
#include <AggregateFunctions/HelpersMinMaxAny.h>
#include <AggregateFunctions/FactoryHelpers.h>
#include <AggregateFunctions/HelpersMinMaxAny.h>
#include <AggregateFunctions/findNumeric.h>
namespace DB
@ -10,10 +11,123 @@ struct Settings;
namespace
{
template <typename Data>
class AggregateFunctionsSingleValueMin final : public AggregateFunctionsSingleValue<Data>
{
using Parent = AggregateFunctionsSingleValue<Data>;
public:
explicit AggregateFunctionsSingleValueMin(const DataTypePtr & type) : Parent(type) { }
/// Specializations for native numeric types
ALWAYS_INLINE inline void addBatchSinglePlace(
size_t row_begin,
size_t row_end,
AggregateDataPtr __restrict place,
const IColumn ** __restrict columns,
Arena * arena,
ssize_t if_argument_pos) const override;
ALWAYS_INLINE inline void addBatchSinglePlaceNotNull(
size_t row_begin,
size_t row_end,
AggregateDataPtr __restrict place,
const IColumn ** __restrict columns,
const UInt8 * __restrict null_map,
Arena * arena,
ssize_t if_argument_pos) const override;
};
// NOLINTBEGIN(bugprone-macro-parentheses)
#define SPECIALIZE(TYPE) \
template <> \
void AggregateFunctionsSingleValueMin<typename DB::AggregateFunctionMinData<SingleValueDataFixed<TYPE>>>::addBatchSinglePlace( \
size_t row_begin, \
size_t row_end, \
AggregateDataPtr __restrict place, \
const IColumn ** __restrict columns, \
Arena *, \
ssize_t if_argument_pos) const \
{ \
const auto & column = assert_cast<const DB::AggregateFunctionMinData<SingleValueDataFixed<TYPE>>::ColVecType &>(*columns[0]); \
std::optional<TYPE> opt; \
if (if_argument_pos >= 0) \
{ \
const auto & flags = assert_cast<const ColumnUInt8 &>(*columns[if_argument_pos]).getData(); \
opt = findNumericMinIf(column.getData().data(), flags.data(), row_begin, row_end); \
} \
else \
opt = findNumericMin(column.getData().data(), row_begin, row_end); \
if (opt.has_value()) \
this->data(place).changeIfLess(opt.value()); \
}
// NOLINTEND(bugprone-macro-parentheses)
FOR_BASIC_NUMERIC_TYPES(SPECIALIZE)
#undef SPECIALIZE
template <typename Data>
void AggregateFunctionsSingleValueMin<Data>::addBatchSinglePlace(
size_t row_begin,
size_t row_end,
AggregateDataPtr __restrict place,
const IColumn ** __restrict columns,
Arena * arena,
ssize_t if_argument_pos) const
{
return Parent::addBatchSinglePlace(row_begin, row_end, place, columns, arena, if_argument_pos);
}
// NOLINTBEGIN(bugprone-macro-parentheses)
#define SPECIALIZE(TYPE) \
template <> \
void AggregateFunctionsSingleValueMin<typename DB::AggregateFunctionMinData<SingleValueDataFixed<TYPE>>>::addBatchSinglePlaceNotNull( \
size_t row_begin, \
size_t row_end, \
AggregateDataPtr __restrict place, \
const IColumn ** __restrict columns, \
const UInt8 * __restrict null_map, \
Arena *, \
ssize_t if_argument_pos) const \
{ \
const auto & column = assert_cast<const DB::AggregateFunctionMinData<SingleValueDataFixed<TYPE>>::ColVecType &>(*columns[0]); \
std::optional<TYPE> opt; \
if (if_argument_pos >= 0) \
{ \
const auto * if_flags = assert_cast<const ColumnUInt8 &>(*columns[if_argument_pos]).getData().data(); \
auto final_flags = std::make_unique<UInt8[]>(row_end); \
for (size_t i = row_begin; i < row_end; ++i) \
final_flags[i] = (!null_map[i]) & !!if_flags[i]; \
opt = findNumericMinIf(column.getData().data(), final_flags.get(), row_begin, row_end); \
} \
else \
opt = findNumericMinNotNull(column.getData().data(), null_map, row_begin, row_end); \
if (opt.has_value()) \
this->data(place).changeIfLess(opt.value()); \
}
// NOLINTEND(bugprone-macro-parentheses)
FOR_BASIC_NUMERIC_TYPES(SPECIALIZE)
#undef SPECIALIZE
template <typename Data>
void AggregateFunctionsSingleValueMin<Data>::addBatchSinglePlaceNotNull(
size_t row_begin,
size_t row_end,
AggregateDataPtr __restrict place,
const IColumn ** __restrict columns,
const UInt8 * __restrict null_map,
Arena * arena,
ssize_t if_argument_pos) const
{
return Parent::addBatchSinglePlaceNotNull(row_begin, row_end, place, columns, null_map, arena, if_argument_pos);
}
AggregateFunctionPtr createAggregateFunctionMin(
const std::string & name, const DataTypes & argument_types, const Array & parameters, const Settings * settings)
{
return AggregateFunctionPtr(createAggregateFunctionSingleValue<AggregateFunctionsSingleValue, AggregateFunctionMinData>(name, argument_types, parameters, settings));
return AggregateFunctionPtr(createAggregateFunctionSingleValue<AggregateFunctionsSingleValueMin, AggregateFunctionMinData>(
name, argument_types, parameters, settings));
}
AggregateFunctionPtr createAggregateFunctionArgMin(

View File

@ -43,14 +43,12 @@ namespace ErrorCodes
template <typename T>
struct SingleValueDataFixed
{
private:
using Self = SingleValueDataFixed;
using ColVecType = ColumnVectorOrDecimal<T>;
bool has_value = false; /// We need to remember if at least one value has been passed. This is necessary for AggregateFunctionIf.
T value = T{};
public:
static constexpr bool result_is_nullable = false;
static constexpr bool should_skip_null_arguments = true;
static constexpr bool is_any = false;
@ -157,6 +155,15 @@ public:
return false;
}
void changeIfLess(T from)
{
if (!has() || from < value)
{
has_value = true;
value = from;
}
}
bool changeIfGreater(const IColumn & column, size_t row_num, Arena * arena)
{
if (!has() || assert_cast<const ColVecType &>(column).getData()[row_num] > value)
@ -179,6 +186,15 @@ public:
return false;
}
void changeIfGreater(T & from)
{
if (!has() || from > value)
{
has_value = true;
value = from;
}
}
bool isEqualTo(const Self & to) const
{
return has() && to.value == value;
@ -448,7 +464,6 @@ public:
}
#endif
};
struct Compatibility
@ -1214,7 +1229,7 @@ struct AggregateFunctionAnyHeavyData : Data
template <typename Data>
class AggregateFunctionsSingleValue final : public IAggregateFunctionDataHelper<Data, AggregateFunctionsSingleValue<Data>>
class AggregateFunctionsSingleValue : public IAggregateFunctionDataHelper<Data, AggregateFunctionsSingleValue<Data>>
{
static constexpr bool is_any = Data::is_any;
@ -1230,8 +1245,11 @@ public:
|| StringRef(Data::name()) == StringRef("max"))
{
if (!type->isComparable())
throw Exception(ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT, "Illegal type {} of argument of aggregate function {} "
"because the values of that data type are not comparable", type->getName(), getName());
throw Exception(
ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT,
"Illegal type {} of argument of aggregate function {} because the values of that data type are not comparable",
type->getName(),
Data::name());
}
}

View File

@ -504,7 +504,7 @@ public:
const auto * if_flags = assert_cast<const ColumnUInt8 &>(*columns[if_argument_pos]).getData().data();
auto final_flags = std::make_unique<UInt8[]>(row_end);
for (size_t i = row_begin; i < row_end; ++i)
final_flags[i] = (!null_map[i]) & if_flags[i];
final_flags[i] = (!null_map[i]) & !!if_flags[i];
this->data(place).addManyConditional(column.getData().data(), final_flags.get(), row_begin, row_end);
}

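The change above is a correctness fix, not a style tweak: null_map[i] and if_flags[i] are UInt8 bytes, and a truthy flag byte is any non-zero value, not necessarily 1. (!null_map[i]) collapses to exactly 0 or 1, so a plain bitwise AND with a raw flag byte such as 2 yields 0 and silently drops the row; !!if_flags[i] normalizes the flag to 0/1 first. A tiny self-contained demonstration:

#include <cassert>
#include <cstdint>

int main()
{
    uint8_t null_map_entry = 0; /// row is not NULL
    uint8_t if_flag = 2;        /// truthy, but the low bit is 0

    uint8_t wrong = (!null_map_entry) & if_flag;   /// 1 & 2 == 0: row dropped
    uint8_t right = (!null_map_entry) & !!if_flag; /// 1 & 1 == 1: row kept
    assert(wrong == 0 && right == 1);
}
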
View File

@ -1,6 +1,7 @@
#include <AggregateFunctions/QuantileTDigest.h>
#include <IO/WriteBufferFromString.h>
#include <IO/ReadBufferFromString.h>
#include <iostream>
int main(int, char **)
{

View File

@ -0,0 +1,15 @@
#include <AggregateFunctions/findNumeric.h>
namespace DB
{
#define INSTANTIATION(T) \
template std::optional<T> findNumericMin(const T * __restrict ptr, size_t start, size_t end); \
template std::optional<T> findNumericMinNotNull(const T * __restrict ptr, const UInt8 * __restrict condition_map, size_t start, size_t end); \
template std::optional<T> findNumericMinIf(const T * __restrict ptr, const UInt8 * __restrict condition_map, size_t start, size_t end); \
template std::optional<T> findNumericMax(const T * __restrict ptr, size_t start, size_t end); \
template std::optional<T> findNumericMaxNotNull(const T * __restrict ptr, const UInt8 * __restrict condition_map, size_t start, size_t end); \
template std::optional<T> findNumericMaxIf(const T * __restrict ptr, const UInt8 * __restrict condition_map, size_t start, size_t end);
FOR_BASIC_NUMERIC_TYPES(INSTANTIATION)
#undef INSTANTIATION
}

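This new translation unit pairs with the extern template declarations in findNumeric.h below: the header suppresses implicit instantiation in every includer, and this file provides the single explicit instantiation per type, keeping the heavy multi-target code out of each translation unit. The idiom in miniature (hypothetical names):

/// find.h (sketch): definition visible, but implicit instantiation suppressed.
template <typename T>
T firstOf(const T * p) { return p[0]; }
extern template int firstOf<int>(const int *);

/// find.cpp (sketch): the one explicit instantiation the linker will use.
template int firstOf<int>(const int *);
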
View File

@ -0,0 +1,154 @@
#pragma once
#include <DataTypes/IDataType.h>
#include <base/defines.h>
#include <base/types.h>
#include <Common/Concepts.h>
#include <Common/TargetSpecific.h>
#include <algorithm>
#include <optional>
namespace DB
{
template <typename T>
concept is_any_native_number = (is_any_of<T, Int8, Int16, Int32, Int64, UInt8, UInt16, UInt32, UInt64, Float32, Float64>);
template <is_any_native_number T>
struct MinComparator
{
static ALWAYS_INLINE inline const T & cmp(const T & a, const T & b) { return std::min(a, b); }
};
template <is_any_native_number T>
struct MaxComparator
{
static ALWAYS_INLINE inline const T & cmp(const T & a, const T & b) { return std::max(a, b); }
};
MULTITARGET_FUNCTION_AVX2_SSE42(
MULTITARGET_FUNCTION_HEADER(template <is_any_native_number T, typename ComparatorClass, bool add_all_elements, bool add_if_cond_zero> static std::optional<T> NO_INLINE),
findNumericExtremeImpl,
MULTITARGET_FUNCTION_BODY((const T * __restrict ptr, const UInt8 * __restrict condition_map [[maybe_unused]], size_t row_begin, size_t row_end)
{
size_t count = row_end - row_begin;
ptr += row_begin;
if constexpr (!add_all_elements)
condition_map += row_begin;
T ret{};
size_t i = 0;
for (; i < count; i++)
{
if (add_all_elements || !condition_map[i] == add_if_cond_zero)
{
ret = ptr[i];
break;
}
}
if (i >= count)
return std::nullopt;
/// Unroll the loop manually for floating point, since the compiler doesn't do it without fastmath
/// as it might change the return value
if constexpr (std::is_floating_point_v<T>)
{
constexpr size_t unroll_block = 512 / sizeof(T); /// Chosen via benchmarks with AVX2 so YMMV
size_t unrolled_end = i + (((count - i) / unroll_block) * unroll_block);
if (i < unrolled_end)
{
T partial_min[unroll_block];
for (size_t unroll_it = 0; unroll_it < unroll_block; unroll_it++)
partial_min[unroll_it] = ret;
while (i < unrolled_end)
{
for (size_t unroll_it = 0; unroll_it < unroll_block; unroll_it++)
{
if (add_all_elements || !condition_map[i + unroll_it] == add_if_cond_zero)
partial_min[unroll_it] = ComparatorClass::cmp(partial_min[unroll_it], ptr[i + unroll_it]);
}
i += unroll_block;
}
for (size_t unroll_it = 0; unroll_it < unroll_block; unroll_it++)
ret = ComparatorClass::cmp(ret, partial_min[unroll_it]);
}
}
for (; i < count; i++)
{
if (add_all_elements || !condition_map[i] == add_if_cond_zero)
ret = ComparatorClass::cmp(ret, ptr[i]);
}
return ret;
}
))
/// Given a vector of T, finds the extreme (MIN or MAX) value
template <is_any_native_number T, class ComparatorClass, bool add_all_elements, bool add_if_cond_zero>
static std::optional<T>
findNumericExtreme(const T * __restrict ptr, const UInt8 * __restrict condition_map [[maybe_unused]], size_t start, size_t end)
{
#if USE_MULTITARGET_CODE
/// We see no benefit from using AVX512BW or AVX512F (over AVX2), so we only declare SSE and AVX2
if (isArchSupported(TargetArch::AVX2))
return findNumericExtremeImplAVX2<T, ComparatorClass, add_all_elements, add_if_cond_zero>(ptr, condition_map, start, end);
if (isArchSupported(TargetArch::SSE42))
return findNumericExtremeImplSSE42<T, ComparatorClass, add_all_elements, add_if_cond_zero>(ptr, condition_map, start, end);
#endif
return findNumericExtremeImpl<T, ComparatorClass, add_all_elements, add_if_cond_zero>(ptr, condition_map, start, end);
}
template <is_any_native_number T>
std::optional<T> findNumericMin(const T * __restrict ptr, size_t start, size_t end)
{
return findNumericExtreme<T, MinComparator<T>, true, false>(ptr, nullptr, start, end);
}
template <is_any_native_number T>
std::optional<T> findNumericMinNotNull(const T * __restrict ptr, const UInt8 * __restrict condition_map, size_t start, size_t end)
{
return findNumericExtreme<T, MinComparator<T>, false, true>(ptr, condition_map, start, end);
}
template <is_any_native_number T>
std::optional<T> findNumericMinIf(const T * __restrict ptr, const UInt8 * __restrict condition_map, size_t start, size_t end)
{
return findNumericExtreme<T, MinComparator<T>, false, false>(ptr, condition_map, start, end);
}
template <is_any_native_number T>
std::optional<T> findNumericMax(const T * __restrict ptr, size_t start, size_t end)
{
return findNumericExtreme<T, MaxComparator<T>, true, false>(ptr, nullptr, start, end);
}
template <is_any_native_number T>
std::optional<T> findNumericMaxNotNull(const T * __restrict ptr, const UInt8 * __restrict condition_map, size_t start, size_t end)
{
return findNumericExtreme<T, MaxComparator<T>, false, true>(ptr, condition_map, start, end);
}
template <is_any_native_number T>
std::optional<T> findNumericMaxIf(const T * __restrict ptr, const UInt8 * __restrict condition_map, size_t start, size_t end)
{
return findNumericExtreme<T, MaxComparator<T>, false, false>(ptr, condition_map, start, end);
}
#define EXTERN_INSTANTIATION(T) \
extern template std::optional<T> findNumericMin(const T * __restrict ptr, size_t start, size_t end); \
extern template std::optional<T> findNumericMinNotNull(const T * __restrict ptr, const UInt8 * __restrict condition_map, size_t start, size_t end); \
extern template std::optional<T> findNumericMinIf(const T * __restrict ptr, const UInt8 * __restrict condition_map, size_t start, size_t end); \
extern template std::optional<T> findNumericMax(const T * __restrict ptr, size_t start, size_t end); \
extern template std::optional<T> findNumericMaxNotNull(const T * __restrict ptr, const UInt8 * __restrict condition_map, size_t start, size_t end); \
extern template std::optional<T> findNumericMaxIf(const T * __restrict ptr, const UInt8 * __restrict condition_map, size_t start, size_t end);
FOR_BASIC_NUMERIC_TYPES(EXTERN_INSTANTIATION)
#undef EXTERN_INSTANTIATION
}

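The comment inside findNumericExtremeImpl is the heart of the speedup: for floating point, the compiler may not unroll a min/max reduction without fast-math, because reassociating the chain could change the result (NaNs, signed zeros), so the code keeps a block of independent partial accumulators and merges them at the end. The technique in isolation, with an illustrative (untuned) block size:

#include <algorithm>
#include <cstddef>

float unrolledMin(const float * p, size_t n, float init)
{
    constexpr size_t B = 8; /// illustrative; the code above uses 512 / sizeof(T)
    float partial[B];
    for (size_t k = 0; k < B; ++k)
        partial[k] = init;

    size_t i = 0;
    for (; i + B <= n; i += B)     /// B independent dependency chains
        for (size_t k = 0; k < B; ++k)
            partial[k] = std::min(partial[k], p[i + k]);

    float ret = init;
    for (size_t k = 0; k < B; ++k) /// merge the partial minima
        ret = std::min(ret, partial[k]);
    for (; i < n; ++i)             /// scalar tail
        ret = std::min(ret, p[i]);
    return ret;
}
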
View File

@ -68,7 +68,6 @@ namespace
client_configuration.connectTimeoutMs = 10 * 1000;
/// Requests in backups can be extremely long, set to one hour
client_configuration.requestTimeoutMs = 60 * 60 * 1000;
client_configuration.retryStrategy = std::make_shared<Aws::Client::DefaultRetryStrategy>(request_settings.retry_attempts);
return S3::ClientFactory::instance().create(
client_configuration,

View File

@ -226,6 +226,7 @@ add_object_library(clickhouse_storages_statistics Storages/Statistics)
add_object_library(clickhouse_storages_liveview Storages/LiveView)
add_object_library(clickhouse_storages_windowview Storages/WindowView)
add_object_library(clickhouse_storages_s3queue Storages/S3Queue)
add_object_library(clickhouse_storages_materializedview Storages/MaterializedView)
add_object_library(clickhouse_client Client)
add_object_library(clickhouse_bridge BridgeHelper)
add_object_library(clickhouse_server Server)

View File

@ -1,4 +1,5 @@
#include <Columns/ColumnCompressed.h>
#include <Common/formatReadable.h>
#pragma clang diagnostic ignored "-Wold-style-cast"

View File

@ -1,4 +1,5 @@
#include <Common/Arena.h>
#include <Core/Field.h>
#include <Columns/IColumnDummy.h>
#include <Columns/ColumnsCommon.h>

View File

@ -4,6 +4,7 @@
#include <vector>
#include <Columns/ColumnsNumber.h>
#include <Common/randomSeed.h>
#include <Common/thread_local_rng.h>
#include <gtest/gtest.h>
using namespace DB;

View File

@ -1,9 +1,190 @@
#include "Allocator.h"
#include <Common/Allocator.h>
#include <Common/Exception.h>
#include <Common/logger_useful.h>
#include <Common/formatReadable.h>
#include <Common/CurrentMemoryTracker.h>
#include <base/errnoToString.h>
#include <base/getPageSize.h>
#include <Poco/Logger.h>
#include <sys/mman.h> /// MADV_POPULATE_WRITE
namespace DB
{
namespace ErrorCodes
{
extern const int CANNOT_ALLOCATE_MEMORY;
extern const int LOGICAL_ERROR;
}
}
namespace
{
using namespace DB;
#if defined(MADV_POPULATE_WRITE)
/// Address passed to madvise is required to be aligned to the page boundary.
auto adjustToPageSize(void * buf, size_t len, size_t page_size)
{
const uintptr_t address_numeric = reinterpret_cast<uintptr_t>(buf);
const size_t next_page_start = ((address_numeric + page_size - 1) / page_size) * page_size;
return std::make_pair(reinterpret_cast<void *>(next_page_start), len - (next_page_start - address_numeric));
}
#endif
void prefaultPages([[maybe_unused]] void * buf_, [[maybe_unused]] size_t len_)
{
#if defined(MADV_POPULATE_WRITE)
if (len_ < POPULATE_THRESHOLD)
return;
static const size_t page_size = ::getPageSize();
if (len_ < page_size) /// Rounded address should still be within [buf, buf + len).
return;
auto [buf, len] = adjustToPageSize(buf_, len_, page_size);
if (auto res = ::madvise(buf, len, MADV_POPULATE_WRITE); res < 0)
LOG_TRACE(
LogFrequencyLimiter(&Poco::Logger::get("Allocator"), 1),
"Attempt to populate pages failed: {} (EINVAL is expected for kernels < 5.14)",
errnoToString(res));
#endif
}
template <bool clear_memory, bool populate>
void * allocNoTrack(size_t size, size_t alignment)
{
void * buf;
if (alignment <= MALLOC_MIN_ALIGNMENT)
{
if constexpr (clear_memory)
buf = ::calloc(size, 1);
else
buf = ::malloc(size);
if (nullptr == buf)
throw DB::ErrnoException(DB::ErrorCodes::CANNOT_ALLOCATE_MEMORY, "Allocator: Cannot malloc {}.", ReadableSize(size));
}
else
{
buf = nullptr;
int res = posix_memalign(&buf, alignment, size);
if (0 != res)
throw DB::ErrnoException(
DB::ErrorCodes::CANNOT_ALLOCATE_MEMORY, "Cannot allocate memory (posix_memalign) {}.", ReadableSize(size));
if constexpr (clear_memory)
memset(buf, 0, size);
}
if constexpr (populate)
prefaultPages(buf, size);
return buf;
}
void freeNoTrack(void * buf)
{
::free(buf);
}
void checkSize(size_t size)
{
/// More obvious exception in case of possible overflow (instead of just "Cannot mmap").
if (size >= 0x8000000000000000ULL)
throw DB::Exception(DB::ErrorCodes::LOGICAL_ERROR, "Too large size ({}) passed to allocator. It indicates an error.", size);
}
}
/// The constant was chosen almost arbitrarily: 128KB is too small, 1MB is almost indistinguishable from 64MB, and 1GB is too large.
extern const size_t POPULATE_THRESHOLD = 16 * 1024 * 1024;
template <bool clear_memory_, bool populate>
void * Allocator<clear_memory_, populate>::alloc(size_t size, size_t alignment)
{
checkSize(size);
auto trace = CurrentMemoryTracker::alloc(size);
void * ptr = allocNoTrack<clear_memory_, populate>(size, alignment);
trace.onAlloc(ptr, size);
return ptr;
}
template <bool clear_memory_, bool populate>
void Allocator<clear_memory_, populate>::free(void * buf, size_t size)
{
try
{
checkSize(size);
freeNoTrack(buf);
auto trace = CurrentMemoryTracker::free(size);
trace.onFree(buf, size);
}
catch (...)
{
DB::tryLogCurrentException("Allocator::free");
throw;
}
}
template <bool clear_memory_, bool populate>
void * Allocator<clear_memory_, populate>::realloc(void * buf, size_t old_size, size_t new_size, size_t alignment)
{
checkSize(new_size);
if (old_size == new_size)
{
/// nothing to do.
/// BTW, it's not possible to change alignment while doing realloc.
}
else if (alignment <= MALLOC_MIN_ALIGNMENT)
{
/// Resize malloc'd memory region with no special alignment requirement.
auto trace_free = CurrentMemoryTracker::free(old_size);
auto trace_alloc = CurrentMemoryTracker::alloc(new_size);
trace_free.onFree(buf, old_size);
void * new_buf = ::realloc(buf, new_size);
if (nullptr == new_buf)
{
throw DB::ErrnoException(
DB::ErrorCodes::CANNOT_ALLOCATE_MEMORY,
"Allocator: Cannot realloc from {} to {}",
ReadableSize(old_size),
ReadableSize(new_size));
}
buf = new_buf;
trace_alloc.onAlloc(buf, new_size);
if constexpr (clear_memory)
if (new_size > old_size)
memset(reinterpret_cast<char *>(buf) + old_size, 0, new_size - old_size);
}
else
{
/// Big allocations that require a copy. MemoryTracker is called inside the 'alloc' and 'free' methods.
void * new_buf = alloc(new_size, alignment);
memcpy(new_buf, buf, std::min(old_size, new_size));
free(buf, old_size);
buf = new_buf;
}
if constexpr (populate)
prefaultPages(buf, new_size);
return buf;
}
template class Allocator<false, false>;
template class Allocator<true, false>;
template class Allocator<false, true>;

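prefaultPages above is the `populate` half of the allocator: for allocations of at least POPULATE_THRESHOLD (16 MiB) it asks the kernel to back the range with writable pages up front, trading a little allocation-time work for fewer page faults later. madvise requires a page-aligned address, hence adjustToPageSize. A hedged standalone version of the same calls (error handling omitted):

#include <sys/mman.h>
#include <unistd.h>
#include <cstddef>
#include <cstdint>

void prefault(void * buf, size_t len)
{
#if defined(MADV_POPULATE_WRITE) /// Linux 5.14+; older kernels return EINVAL
    const size_t page = static_cast<size_t>(::sysconf(_SC_PAGESIZE));
    if (len < page)
        return; /// rounded address must stay inside [buf, buf + len)
    const uintptr_t addr = reinterpret_cast<uintptr_t>(buf);
    const uintptr_t aligned = (addr + page - 1) / page * page;
    ::madvise(reinterpret_cast<void *>(aligned), len - (aligned - addr), MADV_POPULATE_WRITE);
#else
    (void)buf; (void)len;
#endif
}
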
View File

@ -8,47 +8,19 @@
#define ALLOCATOR_ASLR 1
#endif
#include <pcg_random.hpp>
#include <Common/thread_local_rng.h>
#if !defined(OS_DARWIN) && !defined(OS_FREEBSD)
#include <malloc.h>
#endif
#include <cstdlib>
#include <algorithm>
#include <sys/mman.h>
#include <Core/Defines.h>
#include <base/getPageSize.h>
#include <Common/CurrentMemoryTracker.h>
#include <Common/CurrentMetrics.h>
#include <Common/Exception.h>
#include <Common/formatReadable.h>
#include <Common/Allocator_fwd.h>
#include <base/errnoToString.h>
#include <Poco/Logger.h>
#include <Common/logger_useful.h>
#include <cstdlib>
extern const size_t POPULATE_THRESHOLD;
static constexpr size_t MALLOC_MIN_ALIGNMENT = 8;
namespace DB
{
namespace ErrorCodes
{
extern const int CANNOT_ALLOCATE_MEMORY;
extern const int LOGICAL_ERROR;
}
}
/** Previously there was code which tried to use manual mmap and mremap (clickhouse_mremap.h) for large allocations/reallocations (64MB+).
* Most modern allocators (including jemalloc) don't use mremap, so the idea was to take advantage of the mremap system call for large reallocs.
* Actually jemalloc had support for mremap, but it was intentionally removed from codebase https://github.com/jemalloc/jemalloc/commit/e2deab7a751c8080c2b2cdcfd7b11887332be1bb.
@ -69,83 +41,16 @@ class Allocator
{
public:
/// Allocate memory range.
void * alloc(size_t size, size_t alignment = 0)
{
checkSize(size);
auto trace = CurrentMemoryTracker::alloc(size);
void * ptr = allocNoTrack(size, alignment);
trace.onAlloc(ptr, size);
return ptr;
}
void * alloc(size_t size, size_t alignment = 0);
/// Free memory range.
void free(void * buf, size_t size)
{
try
{
checkSize(size);
freeNoTrack(buf);
auto trace = CurrentMemoryTracker::free(size);
trace.onFree(buf, size);
}
catch (...)
{
DB::tryLogCurrentException("Allocator::free");
throw;
}
}
void free(void * buf, size_t size);
/** Enlarge memory range.
* Data from old range is moved to the beginning of new range.
* Address of memory range could change.
*/
void * realloc(void * buf, size_t old_size, size_t new_size, size_t alignment = 0)
{
checkSize(new_size);
if (old_size == new_size)
{
/// nothing to do.
/// BTW, it's not possible to change alignment while doing realloc.
}
else if (alignment <= MALLOC_MIN_ALIGNMENT)
{
/// Resize malloc'd memory region with no special alignment requirement.
auto trace_free = CurrentMemoryTracker::free(old_size);
auto trace_alloc = CurrentMemoryTracker::alloc(new_size);
trace_free.onFree(buf, old_size);
void * new_buf = ::realloc(buf, new_size);
if (nullptr == new_buf)
{
throw DB::ErrnoException(
DB::ErrorCodes::CANNOT_ALLOCATE_MEMORY,
"Allocator: Cannot realloc from {} to {}",
ReadableSize(old_size),
ReadableSize(new_size));
}
buf = new_buf;
trace_alloc.onAlloc(buf, new_size);
if constexpr (clear_memory)
if (new_size > old_size)
memset(reinterpret_cast<char *>(buf) + old_size, 0, new_size - old_size);
}
else
{
/// Big allocations that require a copy. MemoryTracker is called inside the 'alloc' and 'free' methods.
void * new_buf = alloc(new_size, alignment);
memcpy(new_buf, buf, std::min(old_size, new_size));
free(buf, old_size);
buf = new_buf;
}
if constexpr (populate)
prefaultPages(buf, new_size);
return buf;
}
void * realloc(void * buf, size_t old_size, size_t new_size, size_t alignment = 0);
protected:
static constexpr size_t getStackThreshold()
@ -156,76 +61,6 @@ protected:
static constexpr bool clear_memory = clear_memory_;
private:
void * allocNoTrack(size_t size, size_t alignment)
{
void * buf;
if (alignment <= MALLOC_MIN_ALIGNMENT)
{
if constexpr (clear_memory)
buf = ::calloc(size, 1);
else
buf = ::malloc(size);
if (nullptr == buf)
throw DB::ErrnoException(DB::ErrorCodes::CANNOT_ALLOCATE_MEMORY, "Allocator: Cannot malloc {}.", ReadableSize(size));
}
else
{
buf = nullptr;
int res = posix_memalign(&buf, alignment, size);
if (0 != res)
throw DB::ErrnoException(
DB::ErrorCodes::CANNOT_ALLOCATE_MEMORY, "Cannot allocate memory (posix_memalign) {}.", ReadableSize(size));
if constexpr (clear_memory)
memset(buf, 0, size);
}
if constexpr (populate)
prefaultPages(buf, size);
return buf;
}
void freeNoTrack(void * buf)
{
::free(buf);
}
void checkSize(size_t size)
{
/// More obvious exception in case of possible overflow (instead of just "Cannot mmap").
if (size >= 0x8000000000000000ULL)
throw DB::Exception(DB::ErrorCodes::LOGICAL_ERROR, "Too large size ({}) passed to allocator. It indicates an error.", size);
}
/// Address passed to madvise is required to be aligned to the page boundary.
auto adjustToPageSize(void * buf, size_t len, size_t page_size)
{
const uintptr_t address_numeric = reinterpret_cast<uintptr_t>(buf);
const size_t next_page_start = ((address_numeric + page_size - 1) / page_size) * page_size;
return std::make_pair(reinterpret_cast<void *>(next_page_start), len - (next_page_start - address_numeric));
}
void prefaultPages([[maybe_unused]] void * buf_, [[maybe_unused]] size_t len_)
{
#if defined(MADV_POPULATE_WRITE)
if (len_ < POPULATE_THRESHOLD)
return;
static const size_t page_size = ::getPageSize();
if (len_ < page_size) /// Rounded address should still be within [buf, buf + len).
return;
auto [buf, len] = adjustToPageSize(buf_, len_, page_size);
if (auto res = ::madvise(buf, len, MADV_POPULATE_WRITE); res < 0)
LOG_TRACE(
LogFrequencyLimiter(&Poco::Logger::get("Allocator"), 1),
"Attempt to populate pages failed: {} (EINVAL is expected for kernels < 5.14)",
errnoToString(res));
#endif
}
};

View File

@ -8,6 +8,7 @@
#include <Common/Allocator.h>
#include <Common/ProfileEvents.h>
#include <Common/memcpySmall.h>
#include <base/getPageSize.h>
#if __has_include(<sanitizer/asan_interface.h>) && defined(ADDRESS_SANITIZER)
# include <sanitizer/asan_interface.h>

View File

@ -1,5 +1,6 @@
#pragma once
#include <mutex>
#include <Core/Defines.h>
#if __has_include(<sanitizer/asan_interface.h>) && defined(ADDRESS_SANITIZER)
# include <sanitizer/asan_interface.h>

View File

@ -19,11 +19,6 @@
#include <Common/randomSeed.h>
#include <Common/formatReadable.h>
/// Required for older Darwin builds that lack the definition of MAP_ANONYMOUS
#ifndef MAP_ANONYMOUS
#define MAP_ANONYMOUS MAP_ANON
#endif
namespace DB
{

View File

@ -1,3 +1,4 @@
#include <Common/formatReadable.h>
#include <Common/AsynchronousMetrics.h>
#include <Common/Exception.h>
#include <Common/setThreadName.h>
@ -8,6 +9,7 @@
#include <IO/MMappedFileCache.h>
#include <IO/ReadHelpers.h>
#include <base/errnoToString.h>
#include <base/getPageSize.h>
#include <sys/resource.h>
#include <chrono>

View File

@ -1,5 +1,6 @@
#pragma once
#include <algorithm>
#include <cassert>
#include <concepts>
#include <cstddef>

View File

@ -0,0 +1,144 @@
#include <Common/CalendarTimeInterval.h>
#include <Common/Exception.h>
namespace DB
{
namespace ErrorCodes
{
extern const int BAD_ARGUMENTS;
}
CalendarTimeInterval::CalendarTimeInterval(const CalendarTimeInterval::Intervals & intervals)
{
for (auto [kind, val] : intervals)
{
switch (kind.kind)
{
case IntervalKind::Nanosecond:
case IntervalKind::Microsecond:
case IntervalKind::Millisecond:
throw Exception(ErrorCodes::BAD_ARGUMENTS, "Sub-second intervals are not supported here");
case IntervalKind::Second:
case IntervalKind::Minute:
case IntervalKind::Hour:
case IntervalKind::Day:
case IntervalKind::Week:
seconds += val * kind.toAvgSeconds();
break;
case IntervalKind::Month:
months += val;
break;
case IntervalKind::Quarter:
months += val * 3;
break;
case IntervalKind::Year:
months += val * 12;
break;
}
}
}
CalendarTimeInterval::Intervals CalendarTimeInterval::toIntervals() const
{
Intervals res;
auto greedy = [&](UInt64 x, std::initializer_list<std::pair<IntervalKind, UInt64>> kinds)
{
for (auto [kind, count] : kinds)
{
UInt64 k = x / count;
if (k == 0)
continue;
x -= k * count;
res.emplace_back(kind, k);
}
chassert(x == 0);
};
greedy(months, {{IntervalKind::Year, 12}, {IntervalKind::Month, 1}});
greedy(seconds, {{IntervalKind::Week, 3600*24*7}, {IntervalKind::Day, 3600*24}, {IntervalKind::Hour, 3600}, {IntervalKind::Minute, 60}, {IntervalKind::Second, 1}});
return res;
}
UInt64 CalendarTimeInterval::minSeconds() const
{
return 3600*24 * (months/12 * 365 + months%12 * 28) + seconds;
}
UInt64 CalendarTimeInterval::maxSeconds() const
{
return 3600*24 * (months/12 * 366 + months%12 * 31) + seconds;
}
void CalendarTimeInterval::assertSingleUnit() const
{
if (seconds && months)
throw Exception(ErrorCodes::BAD_ARGUMENTS, "Interval shouldn't contain both calendar units and clock units (e.g. months and days)");
}
void CalendarTimeInterval::assertPositive() const
{
if (!seconds && !months)
throw Exception(ErrorCodes::BAD_ARGUMENTS, "Interval must be positive");
}
/// Number of whole months between 1970-01-01 and `t`.
static Int64 toAbsoluteMonth(std::chrono::system_clock::time_point t)
{
std::chrono::year_month_day ymd(std::chrono::floor<std::chrono::days>(t));
return (Int64(int(ymd.year())) - 1970) * 12 + Int64(unsigned(ymd.month()) - 1);
}
static std::chrono::sys_seconds startOfAbsoluteMonth(Int64 absolute_month)
{
Int64 year = absolute_month >= 0 ? absolute_month/12 : -((-absolute_month+11)/12);
Int64 month = absolute_month - year*12;
chassert(month >= 0 && month < 12);
std::chrono::year_month_day ymd(
std::chrono::year(int(year + 1970)),
std::chrono::month(unsigned(month + 1)),
std::chrono::day(1));
return std::chrono::sys_days(ymd);
}
std::chrono::sys_seconds CalendarTimeInterval::advance(std::chrono::system_clock::time_point tp) const
{
auto t = std::chrono::sys_seconds(std::chrono::floor<std::chrono::seconds>(tp));
if (months)
{
auto m = toAbsoluteMonth(t);
auto s = t - startOfAbsoluteMonth(m);
t = startOfAbsoluteMonth(m + Int64(months)) + s;
}
return t + std::chrono::seconds(Int64(seconds));
}
std::chrono::sys_seconds CalendarTimeInterval::floor(std::chrono::system_clock::time_point tp) const
{
assertSingleUnit();
assertPositive();
if (months)
return startOfAbsoluteMonth(toAbsoluteMonth(tp) / months * months);
else
{
constexpr std::chrono::seconds epoch(-3600*24*3);
auto t = std::chrono::sys_seconds(std::chrono::floor<std::chrono::seconds>(tp));
/// We want to align with weeks, but 1970-01-01 is a Thursday, so align with 1969-12-29 instead.
return std::chrono::sys_seconds((t.time_since_epoch() - epoch) / seconds * seconds + epoch);
}
}
bool CalendarTimeInterval::operator==(const CalendarTimeInterval & rhs) const
{
return std::tie(months, seconds) == std::tie(rhs.months, rhs.seconds);
}
bool CalendarTimeInterval::operator!=(const CalendarTimeInterval & rhs) const
{
return !(*this == rhs);
}
}

View File

@ -0,0 +1,63 @@
#pragma once
#include <Common/IntervalKind.h>
#include <chrono>
namespace DB
{
/// Represents a duration of calendar time, e.g.:
/// * 1 week + 5 minutes + 21 seconds (aka 605121 seconds),
/// * 1 (calendar) month - not equivalent to any number of seconds!
/// * 3 years + 1 week (aka 36 months + 604800 seconds).
///
/// Be careful with calendar arithmetic: it's missing many familiar properties of numbers.
/// E.g. x + y - y is not always equal to x (October 31 + 1 month - 1 month = November 1).
struct CalendarTimeInterval
{
UInt64 seconds = 0;
UInt64 months = 0;
using Intervals = std::vector<std::pair<IntervalKind, UInt64>>;
CalendarTimeInterval() = default;
/// Year, Quarter, Month are converted to months.
/// Week, Day, Hour, Minute, Second are converted to seconds.
/// Millisecond, Microsecond, Nanosecond throw exception.
explicit CalendarTimeInterval(const Intervals & intervals);
/// E.g. for {36 months, 604801 seconds} returns {3 years, 2 weeks, 1 second}.
Intervals toIntervals() const;
/// Approximate shortest and longest duration in seconds. E.g. a month is [28, 31] days.
UInt64 minSeconds() const;
UInt64 maxSeconds() const;
/// Checks that the interval has only months or only seconds, throws otherwise.
void assertSingleUnit() const;
void assertPositive() const;
/// Add this interval to the timestamp. First months, then seconds.
/// Gets weird near month boundaries: October 31 + 1 month = December 1.
std::chrono::sys_seconds advance(std::chrono::system_clock::time_point t) const;
/// Rounds the timestamp down to the nearest timestamp "aligned" with this interval.
/// The interval must satisfy assertSingleUnit() and assertPositive().
/// * For months, rounds to the start of a month whose absolute index is divisible by `months`.
/// The month index is 0-based starting from January 1970.
/// E.g. if the interval is 1 month, rounds down to the start of the month.
/// * For seconds, rounds to a timestamp x such that (x - December 29 1969 (Monday)) is divisible
/// by this interval.
/// E.g. if the interval is 1 week, rounds down to the start of the week (Monday).
///
/// Guarantees:
/// * advance(floor(x)) > x
/// * floor(advance(floor(x))) = advance(floor(x))
std::chrono::sys_seconds floor(std::chrono::system_clock::time_point t) const;
bool operator==(const CalendarTimeInterval & rhs) const;
bool operator!=(const CalendarTimeInterval & rhs) const;
};
}

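A short usage sketch of the API above, e.g. for aligning a refresh schedule to month boundaries (compiles only inside the ClickHouse tree; `example` is a hypothetical caller):

#include <Common/CalendarTimeInterval.h>
#include <chrono>

void example()
{
    using namespace DB;
    CalendarTimeInterval every_month({{IntervalKind::Month, 1}});

    auto now = std::chrono::system_clock::now();
    auto aligned = every_month.floor(now);    /// start of the current month
    auto next = every_month.advance(aligned); /// start of the next month
    /// Per the guarantees above: next > now, and every_month.floor(next) == next.
}
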
View File

@ -253,6 +253,8 @@
M(MergeTreeAllRangesAnnouncementsSent, "The current number of announcement being sent in flight from the remote server to the initiator server about the set of data parts (for MergeTree tables). Measured on the remote server side.") \
M(CreatedTimersInQueryProfiler, "Number of Created thread local timers in QueryProfiler") \
M(ActiveTimersInQueryProfiler, "Number of Active thread local timers in QueryProfiler") \
M(RefreshableViews, "Number of materialized views with periodic refreshing (REFRESH)") \
M(RefreshingViews, "Number of materialized views currently executing a refresh") \
#ifdef APPLY_FOR_EXTERNAL_METRICS
#define APPLY_FOR_METRICS(M) APPLY_FOR_BUILTIN_METRICS(M) APPLY_FOR_EXTERNAL_METRICS(M)

View File

@ -13,6 +13,11 @@
#include <valgrind/valgrind.h>
#endif
/// Required for older Darwin builds that lack the definition of MAP_ANONYMOUS
#ifndef MAP_ANONYMOUS
#define MAP_ANONYMOUS MAP_ANON
#endif
namespace DB::ErrorCodes
{
extern const int CANNOT_ALLOCATE_MEMORY;

View File

@ -71,6 +71,8 @@ struct IntervalKind
/// Returns false if the conversion did not succeed.
/// For example, `IntervalKind::tryParseString('second', result)` returns `result` equals `IntervalKind::Kind::Second`.
static bool tryParseString(const std::string & kind, IntervalKind::Kind & result);
auto operator<=>(const IntervalKind & other) const { return kind <=> other.kind; }
};
/// NOLINTNEXTLINE

View File

@ -1,7 +1,6 @@
#pragma once
#include <base/types.h>
#include <Common/PODArray.h>
#include <Common/levenshteinDistance.h>
#include <algorithm>

View File

@ -2,6 +2,7 @@
#include <random>
#include <base/getThreadId.h>
#include <Common/thread_local_rng.h>
#include <Common/Exception.h>
#include <base/hex.h>
#include <Core/Settings.h>

View File

@ -1,8 +1,46 @@
#include <Common/Exception.h>
#include <Common/PODArray.h>
namespace DB
{
namespace ErrorCodes
{
extern const int CANNOT_MPROTECT;
extern const int CANNOT_ALLOCATE_MEMORY;
}
namespace PODArrayDetails
{
#ifndef NDEBUG
void protectMemoryRegion(void * addr, size_t len, int prot)
{
if (0 != mprotect(addr, len, prot))
throw ErrnoException(ErrorCodes::CANNOT_MPROTECT, "Cannot mprotect memory region");
}
#endif
size_t byte_size(size_t num_elements, size_t element_size)
{
size_t amount;
if (__builtin_mul_overflow(num_elements, element_size, &amount))
throw Exception(ErrorCodes::CANNOT_ALLOCATE_MEMORY, "Amount of memory requested to allocate is more than allowed");
return amount;
}
size_t minimum_memory_for_elements(size_t num_elements, size_t element_size, size_t pad_left, size_t pad_right)
{
size_t amount;
if (__builtin_add_overflow(byte_size(num_elements, element_size), pad_left + pad_right, &amount))
throw Exception(ErrorCodes::CANNOT_ALLOCATE_MEMORY, "Amount of memory requested to allocate is more than allowed");
return amount;
}
}
/// Used for left padding of PODArray when empty
const char empty_pod_array[empty_pod_array_size]{};
@ -25,4 +63,5 @@ template class PODArray<Int8, 4096, Allocator<false>, 0, 0>;
template class PODArray<Int16, 4096, Allocator<false>, 0, 0>;
template class PODArray<Int32, 4096, Allocator<false>, 0, 0>;
template class PODArray<Int64, 4096, Allocator<false>, 0, 0>;
}

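byte_size and minimum_memory_for_elements (now out-of-line and shared by all PODArray instantiations) rely on the GCC/Clang overflow-checking builtins: a silently wrapped size would allocate a too-small buffer and corrupt memory. The check in isolation:

#include <cstddef>
#include <stdexcept>

size_t checkedBytes(size_t num_elements, size_t element_size, size_t padding)
{
    size_t bytes;
    if (__builtin_mul_overflow(num_elements, element_size, &bytes))
        throw std::runtime_error("requested allocation size overflows size_t");
    if (__builtin_add_overflow(bytes, padding, &bytes))
        throw std::runtime_error("requested allocation size overflows size_t");
    return bytes;
}
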
View File

@ -1,27 +1,21 @@
#pragma once
#include <Common/Allocator.h>
#include <Common/BitHelpers.h>
#include <Common/memcpySmall.h>
#include <Common/PODArray_fwd.h>
#include <base/getPageSize.h>
#include <boost/noncopyable.hpp>
#include <cstring>
#include <cstddef>
#include <cassert>
#include <algorithm>
#include <memory>
#include <boost/noncopyable.hpp>
#include <base/strong_typedef.h>
#include <base/getPageSize.h>
#include <Common/Allocator.h>
#include <Common/Exception.h>
#include <Common/BitHelpers.h>
#include <Common/memcpySmall.h>
#ifndef NDEBUG
#include <sys/mman.h>
#include <sys/mman.h>
#endif
#include <Common/PODArray_fwd.h>
/** Whether we can use memcpy instead of a loop with assignment to T from U.
* It is Ok if types are the same. And if types are integral and of the same size,
* example: char, signed char, unsigned char.
@ -35,12 +29,6 @@ constexpr bool memcpy_can_be_used_for_assignment = std::is_same_v<T, U>
namespace DB
{
namespace ErrorCodes
{
extern const int CANNOT_MPROTECT;
extern const int CANNOT_ALLOCATE_MEMORY;
}
/** A dynamic array for POD types.
* Designed for a small number of large arrays (rather than a lot of small ones).
* To be more precise - for use in ColumnVector.
@ -77,6 +65,19 @@ namespace ErrorCodes
static constexpr size_t empty_pod_array_size = 1024;
extern const char empty_pod_array[empty_pod_array_size];
namespace PODArrayDetails
{
void protectMemoryRegion(void * addr, size_t len, int prot);
/// The amount of memory occupied by num_elements elements.
size_t byte_size(size_t num_elements, size_t element_size); /// NOLINT
/// Minimum amount of memory to allocate for num_elements, including padding.
size_t minimum_memory_for_elements(size_t num_elements, size_t element_size, size_t pad_left, size_t pad_right); /// NOLINT
};
/** Base class that depend only on size of element, not on element itself.
* You can static_cast to this class if you want to insert some data regardless to the actual type T.
*/
@ -102,27 +103,9 @@ protected:
char * c_end = null;
char * c_end_of_storage = null; /// Does not include pad_right.
/// The amount of memory occupied by num_elements elements.
static size_t byte_size(size_t num_elements) /// NOLINT
{
size_t amount;
if (__builtin_mul_overflow(num_elements, ELEMENT_SIZE, &amount))
throw Exception(ErrorCodes::CANNOT_ALLOCATE_MEMORY, "Amount of memory requested to allocate is more than allowed");
return amount;
}
/// Minimum amount of memory to allocate for num_elements, including padding.
static size_t minimum_memory_for_elements(size_t num_elements)
{
size_t amount;
if (__builtin_add_overflow(byte_size(num_elements), pad_left + pad_right, &amount))
throw Exception(ErrorCodes::CANNOT_ALLOCATE_MEMORY, "Amount of memory requested to allocate is more than allowed");
return amount;
}
void alloc_for_num_elements(size_t num_elements) /// NOLINT
{
alloc(minimum_memory_for_elements(num_elements));
alloc(PODArrayDetails::minimum_memory_for_elements(num_elements, ELEMENT_SIZE, pad_left, pad_right));
}
template <typename ... TAllocatorParams>
@ -188,7 +171,7 @@ protected:
// The allocated memory should be a multiple of ELEMENT_SIZE to hold the element; otherwise,
// memory issues such as corruption could appear in edge cases.
realloc(std::max(integerRoundUp(initial_bytes, ELEMENT_SIZE),
minimum_memory_for_elements(1)),
PODArrayDetails::minimum_memory_for_elements(1, ELEMENT_SIZE, pad_left, pad_right)),
std::forward<TAllocatorParams>(allocator_params)...);
}
else
@ -208,8 +191,7 @@ protected:
if (right_rounded_down > left_rounded_up)
{
size_t length = right_rounded_down - left_rounded_up;
if (0 != mprotect(left_rounded_up, length, prot))
throw ErrnoException(ErrorCodes::CANNOT_MPROTECT, "Cannot mprotect memory region");
PODArrayDetails::protectMemoryRegion(left_rounded_up, length, prot);
}
}
@ -232,14 +214,14 @@ public:
void reserve(size_t n, TAllocatorParams &&... allocator_params)
{
if (n > capacity())
realloc(roundUpToPowerOfTwoOrZero(minimum_memory_for_elements(n)), std::forward<TAllocatorParams>(allocator_params)...);
realloc(roundUpToPowerOfTwoOrZero(PODArrayDetails::minimum_memory_for_elements(n, ELEMENT_SIZE, pad_left, pad_right)), std::forward<TAllocatorParams>(allocator_params)...);
}
template <typename ... TAllocatorParams>
void reserve_exact(size_t n, TAllocatorParams &&... allocator_params) /// NOLINT
{
if (n > capacity())
realloc(minimum_memory_for_elements(n), std::forward<TAllocatorParams>(allocator_params)...);
realloc(PODArrayDetails::minimum_memory_for_elements(n, ELEMENT_SIZE, pad_left, pad_right), std::forward<TAllocatorParams>(allocator_params)...);
}
template <typename ... TAllocatorParams>
@ -258,7 +240,7 @@ public:
void resize_assume_reserved(const size_t n) /// NOLINT
{
c_end = c_start + byte_size(n);
c_end = c_start + PODArrayDetails::byte_size(n, ELEMENT_SIZE);
}
const char * raw_data() const /// NOLINT
@ -339,7 +321,7 @@ public:
explicit PODArray(size_t n)
{
this->alloc_for_num_elements(n);
this->c_end += this->byte_size(n);
this->c_end += PODArrayDetails::byte_size(n, sizeof(T));
}
PODArray(size_t n, const T & x)
@ -411,9 +393,9 @@ public:
if (n > old_size)
{
this->reserve(n);
memset(this->c_end, 0, this->byte_size(n - old_size));
memset(this->c_end, 0, PODArrayDetails::byte_size(n - old_size, sizeof(T)));
}
this->c_end = this->c_start + this->byte_size(n);
this->c_end = this->c_start + PODArrayDetails::byte_size(n, sizeof(T));
}
void resize_fill(size_t n, const T & value) /// NOLINT
@ -424,7 +406,7 @@ public:
this->reserve(n);
std::fill(t_end(), t_end() + n - old_size, value);
}
this->c_end = this->c_start + this->byte_size(n);
this->c_end = this->c_start + PODArrayDetails::byte_size(n, sizeof(T));
}
template <typename U, typename ... TAllocatorParams>
@ -487,7 +469,7 @@ public:
if (required_capacity > this->capacity())
this->reserve(roundUpToPowerOfTwoOrZero(required_capacity), std::forward<TAllocatorParams>(allocator_params)...);
size_t bytes_to_copy = this->byte_size(from_end - from_begin);
size_t bytes_to_copy = PODArrayDetails::byte_size(from_end - from_begin, sizeof(T));
if (bytes_to_copy)
{
memcpy(this->c_end, reinterpret_cast<const void *>(rhs.begin() + from_begin), bytes_to_copy);
@ -502,7 +484,7 @@ public:
static_assert(pad_right_ >= PADDING_FOR_SIMD - 1);
static_assert(sizeof(T) == sizeof(*from_begin));
insertPrepare(from_begin, from_end, std::forward<TAllocatorParams>(allocator_params)...);
size_t bytes_to_copy = this->byte_size(from_end - from_begin);
size_t bytes_to_copy = PODArrayDetails::byte_size(from_end - from_begin, sizeof(T));
memcpySmallAllowReadWriteOverflow15(this->c_end, reinterpret_cast<const void *>(&*from_begin), bytes_to_copy);
this->c_end += bytes_to_copy;
}
@ -513,11 +495,11 @@ public:
{
static_assert(memcpy_can_be_used_for_assignment<std::decay_t<T>, std::decay_t<decltype(*from_begin)>>);
size_t bytes_to_copy = this->byte_size(from_end - from_begin);
size_t bytes_to_copy = PODArrayDetails::byte_size(from_end - from_begin, sizeof(T));
if (!bytes_to_copy)
return;
size_t bytes_to_move = this->byte_size(end() - it);
size_t bytes_to_move = PODArrayDetails::byte_size(end() - it, sizeof(T));
insertPrepare(from_begin, from_end);
@ -545,10 +527,10 @@ public:
if (required_capacity > this->capacity())
this->reserve(roundUpToPowerOfTwoOrZero(required_capacity), std::forward<TAllocatorParams>(allocator_params)...);
size_t bytes_to_copy = this->byte_size(copy_size);
size_t bytes_to_copy = PODArrayDetails::byte_size(copy_size, sizeof(T));
if (bytes_to_copy)
{
auto begin = this->c_start + this->byte_size(start_index);
auto begin = this->c_start + PODArrayDetails::byte_size(start_index, sizeof(T));
memcpy(this->c_end, reinterpret_cast<const void *>(&*begin), bytes_to_copy);
this->c_end += bytes_to_copy;
}
@ -560,7 +542,7 @@ public:
static_assert(memcpy_can_be_used_for_assignment<std::decay_t<T>, std::decay_t<decltype(*from_begin)>>);
this->assertNotIntersects(from_begin, from_end);
size_t bytes_to_copy = this->byte_size(from_end - from_begin);
size_t bytes_to_copy = PODArrayDetails::byte_size(from_end - from_begin, sizeof(T));
if (bytes_to_copy)
{
memcpy(this->c_end, reinterpret_cast<const void *>(&*from_begin), bytes_to_copy);
@ -593,13 +575,13 @@ public:
/// arr1 takes ownership of the heap memory of arr2.
arr1.c_start = arr2.c_start;
arr1.c_end_of_storage = arr1.c_start + heap_allocated - arr2.pad_right - arr2.pad_left;
arr1.c_end = arr1.c_start + this->byte_size(heap_size);
arr1.c_end = arr1.c_start + PODArrayDetails::byte_size(heap_size, sizeof(T));
/// Allocate stack space for arr2.
arr2.alloc(stack_allocated, std::forward<TAllocatorParams>(allocator_params)...);
/// Copy the stack content.
memcpy(arr2.c_start, stack_c_start, this->byte_size(stack_size));
arr2.c_end = arr2.c_start + this->byte_size(stack_size);
memcpy(arr2.c_start, stack_c_start, PODArrayDetails::byte_size(stack_size, sizeof(T)));
arr2.c_end = arr2.c_start + PODArrayDetails::byte_size(stack_size, sizeof(T));
};
auto do_move = [&](PODArray & src, PODArray & dest)
@ -608,8 +590,8 @@ public:
{
dest.dealloc();
dest.alloc(src.allocated_bytes(), std::forward<TAllocatorParams>(allocator_params)...);
memcpy(dest.c_start, src.c_start, this->byte_size(src.size()));
dest.c_end = dest.c_start + this->byte_size(src.size());
memcpy(dest.c_start, src.c_start, PODArrayDetails::byte_size(src.size(), sizeof(T)));
dest.c_end = dest.c_start + PODArrayDetails::byte_size(src.size(), sizeof(T));
src.c_start = Base::null;
src.c_end = Base::null;
@ -666,8 +648,8 @@ public:
this->c_end_of_storage = this->c_start + rhs_allocated - Base::pad_right - Base::pad_left;
rhs.c_end_of_storage = rhs.c_start + lhs_allocated - Base::pad_right - Base::pad_left;
this->c_end = this->c_start + this->byte_size(rhs_size);
rhs.c_end = rhs.c_start + this->byte_size(lhs_size);
this->c_end = this->c_start + PODArrayDetails::byte_size(rhs_size, sizeof(T));
rhs.c_end = rhs.c_start + PODArrayDetails::byte_size(lhs_size, sizeof(T));
}
else if (this->isAllocatedFromStack() && !rhs.isAllocatedFromStack())
{
@ -702,7 +684,7 @@ public:
if (required_capacity > this->capacity())
this->reserve_exact(required_capacity, std::forward<TAllocatorParams>(allocator_params)...);
size_t bytes_to_copy = this->byte_size(required_capacity);
size_t bytes_to_copy = PODArrayDetails::byte_size(required_capacity, sizeof(T));
if (bytes_to_copy)
memcpy(this->c_start, reinterpret_cast<const void *>(&*from_begin), bytes_to_copy);

View File

@ -348,6 +348,25 @@ DECLARE_AVX512VBMI2_SPECIFIC_CODE(
#if ENABLE_MULTITARGET_CODE && defined(__GNUC__) && defined(__x86_64__)
/// NOLINTNEXTLINE
#define MULTITARGET_FUNCTION_AVX2_SSE42(FUNCTION_HEADER, name, FUNCTION_BODY) \
FUNCTION_HEADER \
\
AVX2_FUNCTION_SPECIFIC_ATTRIBUTE \
name##AVX2 \
FUNCTION_BODY \
\
FUNCTION_HEADER \
\
SSE42_FUNCTION_SPECIFIC_ATTRIBUTE \
name##SSE42 \
FUNCTION_BODY \
\
FUNCTION_HEADER \
\
name \
FUNCTION_BODY \
/// NOLINTNEXTLINE
#define MULTITARGET_FUNCTION_AVX512BW_AVX512F_AVX2_SSE42(FUNCTION_HEADER, name, FUNCTION_BODY) \
FUNCTION_HEADER \
@ -381,6 +400,14 @@ DECLARE_AVX512VBMI2_SPECIFIC_CODE(
#else
/// NOLINTNEXTLINE
#define MULTITARGET_FUNCTION_AVX2_SSE42(FUNCTION_HEADER, name, FUNCTION_BODY) \
FUNCTION_HEADER \
\
name \
FUNCTION_BODY \
/// NOLINTNEXTLINE
#define MULTITARGET_FUNCTION_AVX512BW_AVX512F_AVX2_SSE42(FUNCTION_HEADER, name, FUNCTION_BODY) \
FUNCTION_HEADER \

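The new MULTITARGET_FUNCTION_AVX2_SSE42 macro emits three copies of a function body — name##AVX2, name##SSE42 and plain name — with the appropriate target attributes, and collapses to just the plain copy when multi-target code is disabled. Usage mirrors the call site in AggregateFunctions/findNumeric.h above; a hedged in-tree sketch (sumImpl/sum are hypothetical names):

#include <Common/TargetSpecific.h>
#include <cstddef>

MULTITARGET_FUNCTION_AVX2_SSE42(
    MULTITARGET_FUNCTION_HEADER(template <typename T> static T NO_INLINE),
    sumImpl,
    MULTITARGET_FUNCTION_BODY((const T * p, size_t n)
    {
        T acc{};
        for (size_t i = 0; i < n; ++i)
            acc += p[i];
        return acc;
    }
))

template <typename T>
static T sum(const T * p, size_t n)
{
#if USE_MULTITARGET_CODE
    if (isArchSupported(TargetArch::AVX2)) /// runtime CPU dispatch
        return sumImplAVX2(p, n);
    if (isArchSupported(TargetArch::SSE42))
        return sumImplSSE42(p, n);
#endif
    return sumImpl(p, n);                  /// baseline fallback
}
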
View File

@ -28,6 +28,40 @@ namespace CurrentMetrics
extern const Metric GlobalThreadScheduled;
}
class JobWithPriority
{
public:
using Job = std::function<void()>;
Job job;
Priority priority;
CurrentMetrics::Increment metric_increment;
DB::OpenTelemetry::TracingContextOnThread thread_trace_context;
/// Call stacks of all jobs' schedulings leading to this one
std::vector<StackTrace::FramePointers> frame_pointers;
bool enable_job_stack_trace = false;
JobWithPriority(
Job job_, Priority priority_, CurrentMetrics::Metric metric,
const DB::OpenTelemetry::TracingContextOnThread & thread_trace_context_,
bool capture_frame_pointers)
: job(job_), priority(priority_), metric_increment(metric),
thread_trace_context(thread_trace_context_), enable_job_stack_trace(capture_frame_pointers)
{
if (!capture_frame_pointers)
return;
/// Save the call stacks of all previous job schedulings and append the current one
frame_pointers = DB::Exception::thread_frame_pointers;
frame_pointers.push_back(StackTrace().getFramePointers());
}
bool operator<(const JobWithPriority & rhs) const
{
return priority > rhs.priority; // Reversed for `priority_queue` max-heap to yield minimum value (i.e. highest priority) first
}
};
static constexpr auto DEFAULT_THREAD_NAME = "ThreadPool";
template <typename Thread>

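JobWithPriority (moved here from the header) inverts operator< on purpose: the pool keeps jobs in a max-heap (boost::heap::priority_queue, shown later in ThreadPool.h), while DB::Priority treats lower values as more urgent, so a greater priority value must compare as "less". The effect in isolation, using std::priority_queue for a self-contained demo:

#include <cassert>
#include <queue>

struct Job
{
    int priority; /// lower value = more urgent, as with DB::Priority
    bool operator<(const Job & rhs) const { return priority > rhs.priority; } /// reversed
};

int main()
{
    std::priority_queue<Job> q; /// max-heap by operator<
    q.push({5});
    q.push({1});
    q.push({3});
    assert(q.top().priority == 1); /// the most urgent job surfaces first
}
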
View File

@ -20,9 +20,10 @@
#include <Common/ThreadPool_fwd.h>
#include <Common/Priority.h>
#include <Common/StackTrace.h>
#include <Common/Exception.h>
#include <base/scope_guard.h>
class JobWithPriority;
/** Very simple thread pool similar to boost::threadpool.
* Advantages:
* - catches exceptions and rethrows on wait.
@ -128,37 +129,6 @@ private:
bool threads_remove_themselves = true;
const bool shutdown_on_exception = true;
struct JobWithPriority
{
Job job;
Priority priority;
CurrentMetrics::Increment metric_increment;
DB::OpenTelemetry::TracingContextOnThread thread_trace_context;
/// Call stacks of all jobs' schedulings leading to this one
std::vector<StackTrace::FramePointers> frame_pointers;
bool enable_job_stack_trace = false;
JobWithPriority(
Job job_, Priority priority_, CurrentMetrics::Metric metric,
const DB::OpenTelemetry::TracingContextOnThread & thread_trace_context_,
bool capture_frame_pointers)
: job(job_), priority(priority_), metric_increment(metric),
thread_trace_context(thread_trace_context_), enable_job_stack_trace(capture_frame_pointers)
{
if (!capture_frame_pointers)
return;
/// Save the call stacks of all previous job schedulings and append the current one
frame_pointers = DB::Exception::thread_frame_pointers;
frame_pointers.push_back(StackTrace().getFramePointers());
}
bool operator<(const JobWithPriority & rhs) const
{
return priority > rhs.priority; // Reversed for `priority_queue` max-heap to yield minimum value (i.e. highest priority) first
}
};
boost::heap::priority_queue<JobWithPriority> jobs;
std::list<Thread> threads;
std::exception_ptr first_exception;

View File

@ -4,6 +4,7 @@
#include <Common/ThreadStatus.h>
#include <Common/CurrentThread.h>
#include <Common/logger_useful.h>
#include <base/getPageSize.h>
#include <base/errnoToString.h>
#include <Interpreters/Context.h>

View File

@ -1,4 +1,5 @@
#include "Common/ZooKeeper/ZooKeeperConstants.h"
#include <Common/ZooKeeper/ZooKeeperConstants.h>
#include <Common/thread_local_rng.h>
#include <Common/ZooKeeper/ZooKeeperImpl.h>
#include <IO/Operators.h>

View File

@ -28,7 +28,6 @@
#cmakedefine01 USE_S2_GEOMETRY
#cmakedefine01 USE_FASTOPS
#cmakedefine01 USE_SQIDS
#cmakedefine01 USE_IDNA
#cmakedefine01 USE_NLP
#cmakedefine01 USE_VECTORSCAN
#cmakedefine01 USE_LIBURING

View File

@ -6,6 +6,7 @@
#include <IO/ReadBufferFromFileDescriptor.h>
#include <IO/WriteBufferFromFileDescriptor.h>
#include <IO/copyData.h>
#include <iostream>
/** This example shows how we can proxy stdin to ShellCommand and obtain stdout in streaming fashion. */

View File

@ -1,6 +1,7 @@
#pragma once
#include <cstring>
#include <sys/types.h> /// ssize_t
#ifdef __SSE2__
# include <emmintrin.h>

View File

@ -28,31 +28,25 @@ namespace ErrorCodes
static thread_local char thread_name[THREAD_NAME_SIZE]{};
void setThreadName(const char * name, bool truncate)
void setThreadName(const char * name)
{
size_t name_len = strlen(name);
if (!truncate && name_len > THREAD_NAME_SIZE - 1)
if (strlen(name) > THREAD_NAME_SIZE - 1)
throw DB::Exception(DB::ErrorCodes::PTHREAD_ERROR, "Thread name cannot be longer than 15 bytes");
size_t name_capped_len = std::min<size_t>(1 + name_len, THREAD_NAME_SIZE - 1);
char name_capped[THREAD_NAME_SIZE];
memcpy(name_capped, name, name_capped_len);
name_capped[name_capped_len] = '\0';
#if defined(OS_FREEBSD)
pthread_set_name_np(pthread_self(), name_capped);
pthread_set_name_np(pthread_self(), name);
if ((false))
#elif defined(OS_DARWIN)
if (0 != pthread_setname_np(name_capped))
if (0 != pthread_setname_np(name))
#elif defined(OS_SUNOS)
if (0 != pthread_setname_np(pthread_self(), name_capped))
if (0 != pthread_setname_np(pthread_self(), name))
#else
if (0 != prctl(PR_SET_NAME, name_capped, 0, 0, 0))
if (0 != prctl(PR_SET_NAME, name, 0, 0, 0))
#endif
if (errno != ENOSYS && errno != EPERM) /// It's ok if the syscall is unsupported or not allowed in some environments.
throw DB::ErrnoException(DB::ErrorCodes::PTHREAD_ERROR, "Cannot set thread name with prctl(PR_SET_NAME, ...)");
memcpy(thread_name, name_capped, name_capped_len);
memcpy(thread_name, name, std::min<size_t>(1 + strlen(name), THREAD_NAME_SIZE - 1));
}
const char * getThreadName()

View File

@ -4,9 +4,7 @@
/** Sets the thread name (maximum length is 15 bytes),
* which will be visible in ps, gdb, /proc,
* for convenience of observation and debugging.
*
* @param truncate - if true, will truncate to 15 automatically, otherwise throw
*/
void setThreadName(const char * name, bool truncate = false);
void setThreadName(const char * name);
const char * getThreadName();

View File

@ -11,6 +11,7 @@
#include "libaccel_config.h"
#include <Common/MemorySanitizer.h>
#include <base/scope_guard.h>
#include <base/getPageSize.h>
#include <immintrin.h>

View File

@ -31,7 +31,7 @@ bool BackgroundSchedulePoolTaskInfo::schedule()
return true;
}
bool BackgroundSchedulePoolTaskInfo::scheduleAfter(size_t milliseconds, bool overwrite)
bool BackgroundSchedulePoolTaskInfo::scheduleAfter(size_t milliseconds, bool overwrite, bool only_if_scheduled)
{
std::lock_guard lock(schedule_mutex);
@ -39,6 +39,8 @@ bool BackgroundSchedulePoolTaskInfo::scheduleAfter(size_t milliseconds, bool ove
return false;
if (delayed && !overwrite)
return false;
if (!delayed && only_if_scheduled)
return false;
pool.scheduleDelayedTask(shared_from_this(), milliseconds, lock);
return true;

View File

@ -106,8 +106,10 @@ public:
bool schedule();
/// Schedule for execution after specified delay.
/// If overwrite is set then the task will be re-scheduled (if it was already scheduled, i.e. delayed == true).
bool scheduleAfter(size_t milliseconds, bool overwrite = true);
/// If overwrite is set, and the task is already scheduled with a delay (delayed == true),
/// the task will be re-scheduled with the new delay.
/// If only_if_scheduled is set, don't do anything unless the task is already scheduled with a delay.
bool scheduleAfter(size_t milliseconds, bool overwrite = true, bool only_if_scheduled = false);
/// Further attempts to schedule become no-op. Will wait till the end of the current execution of the task.
void deactivate();
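A hedged usage sketch of the three flag combinations (the task handle is illustrative):

task->scheduleAfter(1000);                                                     /// schedule in 1s; re-schedules an existing delayed schedule
task->scheduleAfter(1000, /* overwrite */ false);                              /// no-op if a delayed schedule already exists
task->scheduleAfter(1000, /* overwrite */ true, /* only_if_scheduled */ true); /// only move a schedule that is already delayed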

View File

@ -584,6 +584,8 @@ class IColumn;
M(Bool, enable_early_constant_folding, true, "Enable query optimization where we analyze function and subqueries results and rewrite query if there're constants there", 0) \
M(Bool, deduplicate_blocks_in_dependent_materialized_views, false, "Should deduplicate blocks for materialized views if the block is not a duplicate for the table. Use true to always deduplicate in dependent tables.", 0) \
M(Bool, materialized_views_ignore_errors, false, "Allows to ignore errors for MATERIALIZED VIEW, and deliver original block to the table regardless of MVs", 0) \
M(Bool, allow_experimental_refreshable_materialized_view, false, "Allow refreshable materialized views (CREATE MATERIALIZED VIEW <name> REFRESH ...).", 0) \
M(Bool, stop_refreshable_materialized_views_on_startup, false, "On server startup, prevent scheduling of refreshable materialized views, as if with SYSTEM STOP VIEWS. You can manually start them with SYSTEM START VIEWS or SYSTEM START VIEW <name> afterwards. Also applies to newly created views. Has no effect on non-refreshable materialized views.", 0) \
M(Bool, use_compact_format_in_distributed_parts_names, true, "Changes format of directories names for distributed table insert parts.", 0) \
M(Bool, validate_polygons, true, "Throw exception if polygon is invalid in function pointInPolygon (e.g. self-tangent, self-intersecting). If the setting is false, the function will accept invalid polygons but may silently return wrong result.", 0) \
M(UInt64, max_parser_depth, DBMS_DEFAULT_MAX_PARSER_DEPTH, "Maximum parser depth (recursion depth of recursive descend parser).", 0) \

View File

@ -65,6 +65,11 @@ void applyMetadataChangesToCreateQuery(const ASTPtr & query, const StorageInMemo
query->replace(ast_create_query.select, metadata.select.select_query);
}
if (metadata.refresh)
{
query->replace(ast_create_query.refresh_strategy, metadata.refresh);
}
/// MaterializedView, Dictionary are types of CREATE query without storage.
if (ast_create_query.storage)
{

View File

@ -207,7 +207,6 @@ PostgreSQLTableStructure::ColumnsInfoPtr readNamesAndTypesList(
columns.push_back(NameAndTypePair(column_name, data_type));
auto attgenerated = std::get<6>(row);
LOG_TEST(&Poco::Logger::get("kssenii"), "KSSENII: attgenerated: {}", attgenerated);
attributes.emplace(
column_name,

View File

@ -60,7 +60,7 @@ public:
/// Removes all dependencies of "table_id", returns those dependencies.
std::vector<StorageID> removeDependencies(const StorageID & table_id, bool remove_isolated_tables = false);
/// Removes a table from the graph and removes all references to in from the graph (both from its dependencies and dependents).
/// Removes a table from the graph and removes all references to it from the graph (both from its dependencies and dependents).
bool removeTable(const StorageID & table_id);
/// Removes tables from the graph by a specified filter.

View File

@ -8,6 +8,7 @@
#include <Common/filesystemHelpers.h>
#include <Common/quoteString.h>
#include <Common/atomicRename.h>
#include <Common/formatReadable.h>
#include <Disks/IO/createReadBufferFromFileBase.h>
#include <Disks/loadLocalDiskConfig.h>
#include <Disks/TemporaryFileOnDisk.h>

View File

@ -3,6 +3,7 @@
#include <IO/ReadBufferFromString.h>
#include <IO/ReadBufferFromEmptyFile.h>
#include <IO/WriteBufferFromFile.h>
#include <Common/formatReadable.h>
#include <Common/CurrentThread.h>
#include <Common/quoteString.h>
#include <Common/logger_useful.h>

View File

@ -22,11 +22,13 @@ struct S3ObjectStorageSettings
const S3Settings::RequestSettings & request_settings_,
uint64_t min_bytes_for_seek_,
int32_t list_object_keys_size_,
int32_t objects_chunk_size_to_delete_)
int32_t objects_chunk_size_to_delete_,
bool read_only_)
: request_settings(request_settings_)
, min_bytes_for_seek(min_bytes_for_seek_)
, list_object_keys_size(list_object_keys_size_)
, objects_chunk_size_to_delete(objects_chunk_size_to_delete_)
, read_only(read_only_)
{}
S3Settings::RequestSettings request_settings;
@ -34,6 +36,7 @@ struct S3ObjectStorageSettings
uint64_t min_bytes_for_seek;
int32_t list_object_keys_size;
int32_t objects_chunk_size_to_delete;
bool read_only;
};
@ -166,6 +169,8 @@ public:
ObjectStorageKey generateObjectKeyForPath(const std::string & path) const override;
bool isReadOnly() const override { return s3_settings.get()->read_only; }
private:
void setNewSettings(std::unique_ptr<S3ObjectStorageSettings> && s3_settings_);

View File

@ -34,7 +34,8 @@ std::unique_ptr<S3ObjectStorageSettings> getSettings(const Poco::Util::AbstractC
request_settings,
config.getUInt64(config_prefix + ".min_bytes_for_seek", 1024 * 1024),
config.getInt(config_prefix + ".list_object_keys_size", 1000),
config.getInt(config_prefix + ".objects_chunk_size_to_delete", 1000));
config.getInt(config_prefix + ".objects_chunk_size_to_delete", 1000),
config.getBool(config_prefix + ".readonly", false));
}
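A minimal sketch of where the new `.readonly` key would sit in a disk configuration (the disk name and endpoint are hypothetical; only the readonly key itself comes from the code above):

<storage_configuration>
    <disks>
        <s3_readonly_disk> <!-- hypothetical disk name -->
            <type>s3</type>
            <endpoint>https://example-bucket.s3.amazonaws.com/data/</endpoint> <!-- hypothetical endpoint -->
            <readonly>true</readonly>
        </s3_readonly_disk>
    </disks>
</storage_configuration>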
std::unique_ptr<S3::Client> getClient(
@ -92,10 +93,6 @@ std::unique_ptr<S3::Client> getClient(
HTTPHeaderEntries headers = S3::getHTTPHeaders(config_prefix, config);
S3::ServerSideEncryptionKMSConfig sse_kms_config = S3::getSSEKMSConfig(config_prefix, config);
client_configuration.retryStrategy
= std::make_shared<Aws::Client::DefaultRetryStrategy>(
config.getUInt64(config_prefix + ".retry_attempts", settings.request_settings.retry_attempts));
return S3::ClientFactory::instance().create(
client_configuration,
uri.is_virtual_hosted_style,

View File

@ -14,6 +14,7 @@ void registerFileSegmentationEngineJSONEachRow(FormatFactory & factory);
void registerFileSegmentationEngineRegexp(FormatFactory & factory);
void registerFileSegmentationEngineJSONAsString(FormatFactory & factory);
void registerFileSegmentationEngineJSONAsObject(FormatFactory & factory);
void registerFileSegmentationEngineJSONCompactEachRow(FormatFactory & factory);
#if USE_HIVE
void registerFileSegmentationEngineHiveText(FormatFactory & factory);
#endif
@ -160,6 +161,7 @@ void registerFormats()
registerFileSegmentationEngineJSONEachRow(factory);
registerFileSegmentationEngineJSONAsString(factory);
registerFileSegmentationEngineJSONAsObject(factory);
registerFileSegmentationEngineJSONCompactEachRow(factory);
#if USE_HIVE
registerFileSegmentationEngineHiveText(factory);
#endif

View File

@ -83,10 +83,6 @@ if (TARGET ch_contrib::sqids)
list (APPEND PRIVATE_LIBS ch_contrib::sqids)
endif()
if (TARGET ch_contrib::idna)
list (APPEND PRIVATE_LIBS ch_contrib::idna)
endif()
if (TARGET ch_contrib::h3)
list (APPEND PRIVATE_LIBS ch_contrib::h3)
endif()

View File

@ -1,6 +1,6 @@
#include "config.h"
#if USE_SQIDS
#ifdef ENABLE_SQIDS
#include <Columns/ColumnString.h>
#include <Columns/ColumnsNumber.h>

View File

@ -1,5 +1,6 @@
#include <Functions/FunctionFactory.h>
#include <Functions/formatReadable.h>
#include <Common/formatReadable.h>
namespace DB

View File

@ -1,5 +1,6 @@
#include <Functions/FunctionFactory.h>
#include <Functions/formatReadable.h>
#include <Common/formatReadable.h>
namespace DB

View File

@ -1,5 +1,6 @@
#include <Functions/FunctionFactory.h>
#include <Functions/formatReadable.h>
#include <Common/formatReadable.h>
namespace DB

View File

@ -1,165 +0,0 @@
#include "config.h"
#if USE_IDNA
#include <Functions/FunctionFactory.h>
#include <Functions/FunctionStringToString.h>
#ifdef __clang__
# pragma clang diagnostic push
# pragma clang diagnostic ignored "-Wnewline-eof"
#endif
# include <ada/idna/punycode.h>
# include <ada/idna/unicode_transcoding.h>
#ifdef __clang__
# pragma clang diagnostic pop
#endif
namespace DB
{
namespace ErrorCodes
{
extern const int BAD_ARGUMENTS;
extern const int ILLEGAL_COLUMN;
}
struct PunycodeEncodeImpl
{
static void vector(
const ColumnString::Chars & data,
const ColumnString::Offsets & offsets,
ColumnString::Chars & res_data,
ColumnString::Offsets & res_offsets)
{
const size_t rows = offsets.size();
res_data.reserve(data.size()); /// just a guess, assuming the input is all-ASCII
res_offsets.reserve(rows);
size_t prev_offset = 0;
std::u32string value_utf32;
std::string value_puny;
for (size_t row = 0; row < rows; ++row)
{
const char * value = reinterpret_cast<const char *>(&data[prev_offset]);
const size_t value_length = offsets[row] - prev_offset - 1;
const size_t value_utf32_length = ada::idna::utf32_length_from_utf8(value, value_length);
value_utf32.resize(value_utf32_length);
ada::idna::utf8_to_utf32(value, value_length, value_utf32.data());
const bool ok = ada::idna::utf32_to_punycode(value_utf32, value_puny);
if (!ok)
throw Exception(ErrorCodes::BAD_ARGUMENTS, "Internal error during Punycode encoding");
res_data.insert(value_puny.c_str(), value_puny.c_str() + value_puny.size() + 1);
res_offsets.push_back(res_data.size());
prev_offset = offsets[row];
value_utf32.clear();
value_puny.clear(); /// utf32_to_punycode() appends to its output string
}
}
[[noreturn]] static void vectorFixed(const ColumnString::Chars &, size_t, ColumnString::Chars &)
{
throw Exception(ErrorCodes::ILLEGAL_COLUMN, "Column of type FixedString is not supported by punycodeEncode function");
}
};
struct PunycodeDecodeImpl
{
static void vector(
const ColumnString::Chars & data,
const ColumnString::Offsets & offsets,
ColumnString::Chars & res_data,
ColumnString::Offsets & res_offsets)
{
const size_t rows = offsets.size();
res_data.reserve(data.size()); /// just a guess, assuming the input is all-ASCII
res_offsets.reserve(rows);
size_t prev_offset = 0;
std::u32string value_utf32;
std::string value_utf8;
for (size_t row = 0; row < rows; ++row)
{
const char * value = reinterpret_cast<const char *>(&data[prev_offset]);
const size_t value_length = offsets[row] - prev_offset - 1;
const std::string_view value_punycode(value, value_length);
const bool ok = ada::idna::punycode_to_utf32(value_punycode, value_utf32);
if (!ok)
throw Exception(ErrorCodes::BAD_ARGUMENTS, "Internal error during Punycode decoding");
const size_t utf8_length = ada::idna::utf8_length_from_utf32(value_utf32.data(), value_utf32.size());
value_utf8.resize(utf8_length);
ada::idna::utf32_to_utf8(value_utf32.data(), value_utf32.size(), value_utf8.data());
res_data.insert(value_utf8.c_str(), value_utf8.c_str() + value_utf8.size() + 1);
res_offsets.push_back(res_data.size());
prev_offset = offsets[row];
value_utf32.clear(); /// punycode_to_utf32() appends to its output string
value_utf8.clear();
}
}
[[noreturn]] static void vectorFixed(const ColumnString::Chars &, size_t, ColumnString::Chars &)
{
throw Exception(ErrorCodes::ILLEGAL_COLUMN, "Column of type FixedString is not supported by punycodeDecode function");
}
};
struct NamePunycodeEncode
{
static constexpr auto name = "punycodeEncode";
};
struct NamePunycodeDecode
{
static constexpr auto name = "punycodeDecode";
};
REGISTER_FUNCTION(Punycode)
{
factory.registerFunction<FunctionStringToString<PunycodeEncodeImpl, NamePunycodeEncode>>(FunctionDocumentation{
.description=R"(
Computes a Punycode representation of a string.)",
.syntax="punycodeEncode(str)",
.arguments={{"str", "Input string"}},
.returned_value="The punycode representation [String](/docs/en/sql-reference/data-types/string.md).",
.examples={
{"simple",
"SELECT punycodeEncode('München') AS puny;",
R"(
puny
Mnchen-3ya
)"
}}
});
factory.registerFunction<FunctionStringToString<PunycodeDecodeImpl, NamePunycodeDecode>>(FunctionDocumentation{
.description=R"(
Computes a Punycode representation of a string.)",
.syntax="punycodeDecode(str)",
.arguments={{"str", "A Punycode-encoded string"}},
.returned_value="The plaintext representation [String](/docs/en/sql-reference/data-types/string.md).",
.examples={
{"simple",
"SELECT punycodeDecode('Mnchen-3ya') AS plain;",
R"(
plain
München
)"
}}
});
}
}
#endif

View File

@ -1,7 +1,8 @@
#include <Functions/IFunction.h>
#include <Functions/FunctionHelpers.h>
#include <Functions/FunctionFactory.h>
#include "Common/Exception.h"
#include <Common/Exception.h>
#include <Common/thread_local_rng.h>
#include <Common/NaNUtils.h>
#include <Columns/ColumnConst.h>
#include <Columns/ColumnsNumber.h>

View File

@ -6,6 +6,8 @@
#include <base/errnoToString.h>
#include <zip.h>
#include <boost/algorithm/string/predicate.hpp>
#include <Common/logger_useful.h>
#include <Poco/Logger.h>
namespace DB

View File

@ -1,5 +1,6 @@
#pragma once
#include <algorithm>
#include <bit>
#include <base/types.h>
#include <Common/BitHelpers.h>

View File

@ -1,4 +1,5 @@
#include <IO/MMapReadBufferFromFileWithCache.h>
#include <base/getPageSize.h>
namespace DB

View File

@ -41,6 +41,7 @@
#include <IO/PeekableReadBuffer.h>
#include <IO/VarInt.h>
#include <pcg_random.hpp>
#include <double-conversion/double-conversion.h>
static constexpr auto DEFAULT_MAX_STRING_SIZE = 1_GiB;

View File

@ -18,6 +18,7 @@ namespace ActionLocks
extern const StorageActionBlockType PartsMove = 7;
extern const StorageActionBlockType PullReplicationLog = 8;
extern const StorageActionBlockType Cleanup = 9;
extern const StorageActionBlockType ViewRefresh = 10;
}

View File

@ -5,6 +5,7 @@
#include <Parsers/ASTQueryWithTableAndOutput.h>
#include <Parsers/ASTRenameQuery.h>
#include <Parsers/ASTIdentifier.h>
#include <Parsers/ASTRefreshStrategy.h>
#include <Parsers/ASTSelectQuery.h>
#include <Parsers/ASTSubquery.h>
#include <Parsers/ASTSelectWithUnionQuery.h>
@ -87,6 +88,12 @@ public:
visit(child);
}
void visit(ASTRefreshStrategy & refresh) const
{
ASTPtr unused;
visit(refresh, unused);
}
private:
ContextPtr context;
@ -229,6 +236,13 @@ private:
}
}
void visit(ASTRefreshStrategy & refresh, ASTPtr &) const
{
if (refresh.dependencies)
for (auto & table : refresh.dependencies->children)
tryVisit<ASTTableIdentifier>(table);
}
void visitChildren(IAST & ast) const
{
for (auto & child : ast.children)

View File

@ -50,12 +50,35 @@ FileCachePtr FileCacheFactory::getOrCreate(
{
std::lock_guard lock(mutex);
auto it = caches_by_name.find(cache_name);
auto it = std::find_if(caches_by_name.begin(), caches_by_name.end(), [&](const auto & cache_by_name)
{
return cache_by_name.second->getSettings().base_path == file_cache_settings.base_path;
});
if (it == caches_by_name.end())
{
auto cache = std::make_shared<FileCache>(cache_name, file_cache_settings);
it = caches_by_name.emplace(
cache_name, std::make_unique<FileCacheData>(cache, file_cache_settings, config_path)).first;
bool inserted;
std::tie(it, inserted) = caches_by_name.emplace(
cache_name, std::make_unique<FileCacheData>(cache, file_cache_settings, config_path));
if (!inserted)
{
throw Exception(ErrorCodes::BAD_ARGUMENTS,
"Cache with name {} exists, but it has a different path", cache_name);
}
}
else if (it->second->getSettings() != file_cache_settings)
{
throw Exception(ErrorCodes::BAD_ARGUMENTS,
"Found more than one cache configuration with the same path, "
"but with different cache settings ({} and {})",
it->first, cache_name);
}
else if (it->first != cache_name)
{
caches_by_name.emplace(cache_name, it->second);
}
return it->second->cache;
@ -69,12 +92,33 @@ FileCachePtr FileCacheFactory::create(
std::lock_guard lock(mutex);
auto it = caches_by_name.find(cache_name);
if (it != caches_by_name.end())
throw Exception(ErrorCodes::BAD_ARGUMENTS, "Cache with name {} already exists", cache_name);
auto cache = std::make_shared<FileCache>(cache_name, file_cache_settings);
it = caches_by_name.emplace(
cache_name, std::make_unique<FileCacheData>(cache, file_cache_settings, config_path)).first;
it = std::find_if(caches_by_name.begin(), caches_by_name.end(), [&](const auto & cache_by_name)
{
return cache_by_name.second->getSettings().base_path == file_cache_settings.base_path;
});
if (it == caches_by_name.end())
{
auto cache = std::make_shared<FileCache>(cache_name, file_cache_settings);
it = caches_by_name.emplace(
cache_name, std::make_unique<FileCacheData>(cache, file_cache_settings, config_path)).first;
}
else if (it->second->getSettings() != file_cache_settings)
{
throw Exception(ErrorCodes::BAD_ARGUMENTS,
"Found more than one cache configuration with the same path, "
"but with different cache settings ({} and {})",
it->first, cache_name);
}
else
{
[[maybe_unused]] bool inserted = caches_by_name.emplace(cache_name, it->second).second;
chassert(inserted);
}
return it->second->cache;
}
@ -98,11 +142,14 @@ void FileCacheFactory::updateSettingsFromConfig(const Poco::Util::AbstractConfig
caches_by_name_copy = caches_by_name;
}
std::unordered_set<std::string> checked_paths;
for (const auto & [_, cache_info] : caches_by_name_copy)
{
if (cache_info->config_path.empty())
if (cache_info->config_path.empty() || checked_paths.contains(cache_info->config_path))
continue;
checked_paths.emplace(cache_info->config_path);
FileCacheSettings new_settings;
new_settings.loadFromConfig(config, cache_info->config_path);

View File

@ -13,7 +13,7 @@ namespace DB
IFileCachePriority::IFileCachePriority(size_t max_size_, size_t max_elements_)
: max_size(max_size_), max_elements(max_elements_)
{
CurrentMetrics::set(CurrentMetrics::FilesystemCacheSizeLimit, max_size_);
CurrentMetrics::add(CurrentMetrics::FilesystemCacheSizeLimit, max_size_);
}
IFileCachePriority::Entry::Entry(

View File

@ -7,6 +7,7 @@
#include <base/scope_guard.h>
#include <Common/logger_useful.h>
#include <Common/formatReadable.h>
namespace DB
{

View File

@ -95,6 +95,7 @@
#include <Interpreters/JIT/CompiledExpressionCache.h>
#include <Storages/MergeTree/BackgroundJobsAssignee.h>
#include <Storages/MergeTree/MergeTreeDataPartUUID.h>
#include <Storages/MaterializedView/RefreshSet.h>
#include <Interpreters/SynonymsExtensions.h>
#include <Interpreters/Lemmatizers.h>
#include <Interpreters/ClusterDiscovery.h>
@ -289,6 +290,7 @@ struct ContextSharedPart : boost::noncopyable
MergeList merge_list; /// The list of executable merge (for (Replicated)?MergeTree)
MovesList moves_list; /// The list of executing moves (for (Replicated)?MergeTree)
ReplicatedFetchList replicated_fetch_list;
RefreshSet refresh_set; /// The list of active refreshes (for MaterializedView)
ConfigurationPtr users_config TSA_GUARDED_BY(mutex); /// Config with the users, profiles and quotas sections.
InterserverIOHandler interserver_io_handler; /// Handler for interserver communication.
@ -825,6 +827,8 @@ MovesList & Context::getMovesList() { return shared->moves_list; }
const MovesList & Context::getMovesList() const { return shared->moves_list; }
ReplicatedFetchList & Context::getReplicatedFetchList() { return shared->replicated_fetch_list; }
const ReplicatedFetchList & Context::getReplicatedFetchList() const { return shared->replicated_fetch_list; }
RefreshSet & Context::getRefreshSet() { return shared->refresh_set; }
const RefreshSet & Context::getRefreshSet() const { return shared->refresh_set; }
String Context::resolveDatabase(const String & database_name) const
{

View File

@ -74,6 +74,7 @@ class BackgroundSchedulePool;
class MergeList;
class MovesList;
class ReplicatedFetchList;
class RefreshSet;
class Cluster;
class Compiler;
class MarkCache;
@ -922,6 +923,9 @@ public:
ReplicatedFetchList & getReplicatedFetchList();
const ReplicatedFetchList & getReplicatedFetchList() const;
RefreshSet & getRefreshSet();
const RefreshSet & getRefreshSet() const;
/// If the current session is expired at the time of the call, synchronously creates and returns a new session with the startNewSession() call.
/// If no ZooKeeper configured, throws an exception.
std::shared_ptr<zkutil::ZooKeeper> getZooKeeper() const;

View File

@ -6,6 +6,7 @@
#include <Interpreters/TemporaryDataOnDisk.h>
#include <Compression/CompressedWriteBuffer.h>
#include <Common/formatReadable.h>
#include <Common/logger_useful.h>
#include <Common/thread_local_rng.h>

View File

@ -30,6 +30,7 @@
#include <Common/Exception.h>
#include <Common/typeid_cast.h>
#include <Common/assert_cast.h>
#include <Common/formatReadable.h>
#include <Functions/FunctionHelpers.h>
#include <Interpreters/castColumn.h>

View File

@ -460,6 +460,11 @@ AccessRightsElements InterpreterAlterQuery::getRequiredAccessForCommand(const AS
required_access.emplace_back(AccessType::ALTER_VIEW_MODIFY_QUERY, database, table);
break;
}
case ASTAlterCommand::MODIFY_REFRESH:
{
required_access.emplace_back(AccessType::ALTER_VIEW_MODIFY_REFRESH, database, table);
break;
}
case ASTAlterCommand::LIVE_VIEW_REFRESH:
{
required_access.emplace_back(AccessType::ALTER_VIEW_REFRESH, database, table);

View File

@ -1089,6 +1089,13 @@ void InterpreterCreateQuery::assertOrSetUUID(ASTCreateQuery & create, const Data
"{} UUID specified, but engine of database {} is not Atomic", kind, create.getDatabase());
}
if (create.refresh_strategy && database->getEngineName() != "Atomic")
throw Exception(ErrorCodes::INCORRECT_QUERY,
"Refreshable materialized view requires Atomic database engine, but database {} has engine {}", create.getDatabase(), database->getEngineName());
/// TODO: Support Replicated databases, only with Shared/ReplicatedMergeTree.
/// Figure out how to make the refreshed data appear all at once on other
/// replicas; maybe a replicated SYSTEM SYNC REPLICA query before the rename?
/// The database doesn't support UUID so we'll ignore it. The UUID could be set here because of either
/// a) the initiator of `ON CLUSTER` query generated it to ensure the same UUIDs are used on different hosts; or
/// b) `RESTORE from backup` query generated it to ensure the same UUIDs are used on different hosts.
@ -1210,6 +1217,16 @@ BlockIO InterpreterCreateQuery::createTable(ASTCreateQuery & create)
visitor.visit(*create.select);
}
if (create.refresh_strategy)
{
if (!getContext()->getSettingsRef().allow_experimental_refreshable_materialized_view)
throw Exception(ErrorCodes::SUPPORT_IS_DISABLED,
"Refreshable materialized views are experimental. Enable allow_experimental_refreshable_materialized_view to use.");
AddDefaultDatabaseVisitor visitor(getContext(), current_database);
visitor.visit(*create.refresh_strategy);
}
if (create.columns_list)
{
AddDefaultDatabaseVisitor visitor(getContext(), current_database);

View File

@ -54,6 +54,7 @@
#include <Storages/StorageS3.h>
#include <Storages/StorageURL.h>
#include <Storages/StorageAzureBlob.h>
#include <Storages/MaterializedView/RefreshTask.h>
#include <Storages/HDFS/StorageHDFS.h>
#include <Storages/System/StorageSystemFilesystemCache.h>
#include <Parsers/ASTSystemQuery.h>
@ -108,6 +109,7 @@ namespace ActionLocks
extern const StorageActionBlockType PartsMove;
extern const StorageActionBlockType PullReplicationLog;
extern const StorageActionBlockType Cleanup;
extern const StorageActionBlockType ViewRefresh;
}
@ -165,6 +167,8 @@ AccessType getRequiredAccessType(StorageActionBlockType action_type)
return AccessType::SYSTEM_PULLING_REPLICATION_LOG;
else if (action_type == ActionLocks::Cleanup)
return AccessType::SYSTEM_CLEANUP;
else if (action_type == ActionLocks::ViewRefresh)
return AccessType::SYSTEM_VIEWS;
else
throw Exception(ErrorCodes::LOGICAL_ERROR, "Unknown action type: {}", std::to_string(action_type));
}
@ -605,6 +609,23 @@ BlockIO InterpreterSystemQuery::execute()
case Type::START_CLEANUP:
startStopAction(ActionLocks::Cleanup, true);
break;
case Type::START_VIEW:
case Type::START_VIEWS:
startStopAction(ActionLocks::ViewRefresh, true);
break;
case Type::STOP_VIEW:
case Type::STOP_VIEWS:
startStopAction(ActionLocks::ViewRefresh, false);
break;
case Type::REFRESH_VIEW:
getRefreshTask()->run();
break;
case Type::CANCEL_VIEW:
getRefreshTask()->cancel();
break;
case Type::TEST_VIEW:
getRefreshTask()->setFakeTime(query.fake_time_for_view);
break;
case Type::DROP_REPLICA:
dropReplica(query);
break;
@ -1092,6 +1113,17 @@ void InterpreterSystemQuery::flushDistributed(ASTSystemQuery &)
throw Exception(ErrorCodes::NOT_IMPLEMENTED, "SYSTEM RESTART DISK is not supported");
}
RefreshTaskHolder InterpreterSystemQuery::getRefreshTask()
{
auto ctx = getContext();
ctx->checkAccess(AccessType::SYSTEM_VIEWS);
auto task = ctx->getRefreshSet().getTask(table_id);
if (!task)
throw Exception(
ErrorCodes::BAD_ARGUMENTS, "Refreshable view {} doesn't exist", table_id.getNameForLogs());
return task;
}
AccessRightsElements InterpreterSystemQuery::getRequiredAccessForDDLOnCluster() const
{
@ -1241,6 +1273,20 @@ AccessRightsElements InterpreterSystemQuery::getRequiredAccessForDDLOnCluster()
required_access.emplace_back(AccessType::SYSTEM_REPLICATION_QUEUES, query.getDatabase(), query.getTable());
break;
}
case Type::REFRESH_VIEW:
case Type::START_VIEW:
case Type::START_VIEWS:
case Type::STOP_VIEW:
case Type::STOP_VIEWS:
case Type::CANCEL_VIEW:
case Type::TEST_VIEW:
{
if (!query.table)
required_access.emplace_back(AccessType::SYSTEM_VIEWS);
else
required_access.emplace_back(AccessType::SYSTEM_VIEWS, query.getDatabase(), query.getTable());
break;
}
case Type::DROP_REPLICA:
case Type::DROP_DATABASE_REPLICA:
{

View File

@ -3,6 +3,7 @@
#include <Interpreters/IInterpreter.h>
#include <Parsers/IAST_fwd.h>
#include <Storages/IStorage_fwd.h>
#include <Storages/MaterializedView/RefreshTask_fwd.h>
#include <Interpreters/StorageID.h>
#include <Common/ActionLock.h>
#include <Disks/IVolume.h>
@ -72,6 +73,8 @@ private:
void flushDistributed(ASTSystemQuery & query);
[[noreturn]] void restartDisk(String & name);
RefreshTaskHolder getRefreshTask();
AccessRightsElements getRequiredAccessForDDLOnCluster() const;
void startStopAction(StorageActionBlockType action_type, bool start);
};

View File

@ -255,6 +255,9 @@ void ServerAsynchronousMetrics::updateImpl(AsynchronousMetricValues & new_values
size_t total_number_of_rows_system = 0;
size_t total_number_of_parts_system = 0;
size_t total_primary_key_bytes_memory = 0;
size_t total_primary_key_bytes_memory_allocated = 0;
for (const auto & db : databases)
{
/// Check if database can contain MergeTree tables
@ -293,6 +296,15 @@ void ServerAsynchronousMetrics::updateImpl(AsynchronousMetricValues & new_values
total_number_of_rows_system += rows;
total_number_of_parts_system += parts;
}
// only fetch the parts which are in active state
auto all_parts = table_merge_tree->getDataPartsVectorForInternalUsage();
for (const auto & part : all_parts)
{
total_primary_key_bytes_memory += part->getIndexSizeInBytes();
total_primary_key_bytes_memory_allocated += part->getIndexSizeInAllocatedBytes();
}
}
if (StorageReplicatedMergeTree * table_replicated_merge_tree = typeid_cast<StorageReplicatedMergeTree *>(table.get()))
@ -347,11 +359,14 @@ void ServerAsynchronousMetrics::updateImpl(AsynchronousMetricValues & new_values
new_values["TotalPartsOfMergeTreeTables"] = { total_number_of_parts, "Total amount of data parts in all tables of MergeTree family."
" Numbers larger than 10 000 will negatively affect the server startup time and it may indicate unreasonable choice of the partition key." };
new_values["NumberOfTablesSystem"] = { total_number_of_tables_system, "Total number of tables in the system database on the server stored in tables of MergeTree family."};
new_values["NumberOfTablesSystem"] = { total_number_of_tables_system, "Total number of tables in the system database on the server stored in tables of MergeTree family." };
new_values["TotalBytesOfMergeTreeTablesSystem"] = { total_number_of_bytes_system, "Total amount of bytes (compressed, including data and indices) stored in tables of MergeTree family in the system database." };
new_values["TotalRowsOfMergeTreeTablesSystem"] = { total_number_of_rows_system, "Total amount of rows (records) stored in tables of MergeTree family in the system database." };
new_values["TotalPartsOfMergeTreeTablesSystem"] = { total_number_of_parts_system, "Total amount of data parts in tables of MergeTree family in the system database." };
new_values["TotalPrimaryKeyBytesInMemory"] = { total_primary_key_bytes_memory, "The total amount of memory (in bytes) used by primary key values (only takes active parts into account)." };
new_values["TotalPrimaryKeyBytesInMemoryAllocated"] = { total_primary_key_bytes_memory_allocated, "The total amount of memory (in bytes) reserved for primary key values (only takes active parts into account)." };
}
#if USE_NURAFT

View File

@ -1,3 +1,4 @@
#include <Common/thread_local_rng.h>
#include <Common/ThreadPool.h>
#include <Common/PoolId.h>

View File

@ -453,6 +453,12 @@ void ASTAlterCommand::formatImpl(const FormatSettings & settings, FormatState &
<< (settings.hilite ? hilite_none : "");
select->formatImpl(settings, state, frame);
}
else if (type == ASTAlterCommand::MODIFY_REFRESH)
{
settings.ostr << (settings.hilite ? hilite_keyword : "") << "MODIFY REFRESH " << settings.nl_or_ws
<< (settings.hilite ? hilite_none : "");
refresh->formatImpl(settings, state, frame);
}
else if (type == ASTAlterCommand::LIVE_VIEW_REFRESH)
{
settings.ostr << (settings.hilite ? hilite_keyword : "") << "REFRESH " << (settings.hilite ? hilite_none : "");

View File

@ -40,6 +40,7 @@ public:
MODIFY_SETTING,
RESET_SETTING,
MODIFY_QUERY,
MODIFY_REFRESH,
REMOVE_TTL,
REMOVE_SAMPLE_BY,
@ -166,6 +167,9 @@ public:
*/
ASTPtr values;
/// For MODIFY REFRESH
ASTPtr refresh;
bool detach = false; /// true for DETACH PARTITION
bool part = false; /// true for ATTACH PART, DROP DETACHED PART and MOVE

View File

@ -2,7 +2,6 @@
#include <Parsers/ASTExpressionList.h>
#include <Parsers/ASTFunction.h>
#include <Parsers/ASTSelectWithUnionQuery.h>
#include <Parsers/ASTSetQuery.h>
#include <Common/quoteString.h>
#include <Interpreters/StorageID.h>
#include <IO/Operators.h>
@ -340,6 +339,12 @@ void ASTCreateQuery::formatQueryImpl(const FormatSettings & settings, FormatStat
formatOnCluster(settings);
}
if (refresh_strategy)
{
settings.ostr << settings.nl_or_ws;
refresh_strategy->formatImpl(settings, state, frame);
}
if (to_table_id)
{
assert((is_materialized_view || is_window_view) && to_inner_uuid == UUIDHelpers::Nil);

View File

@ -5,6 +5,7 @@
#include <Parsers/ASTDictionary.h>
#include <Parsers/ASTDictionaryAttributeDeclaration.h>
#include <Parsers/ASTTableOverrides.h>
#include <Parsers/ASTRefreshStrategy.h>
#include <Interpreters/StorageID.h>
namespace DB
@ -116,6 +117,7 @@ public:
ASTExpressionList * dictionary_attributes_list = nullptr; /// attributes of
ASTDictionary * dictionary = nullptr; /// dictionary definition (layout, primary key, etc.)
ASTRefreshStrategy * refresh_strategy = nullptr; // For CREATE MATERIALIZED VIEW ... REFRESH ...
std::optional<UInt64> live_view_periodic_refresh; /// For CREATE LIVE VIEW ... WITH [PERIODIC] REFRESH ...
bool is_watermark_strictly_ascending{false}; /// STRICTLY ASCENDING WATERMARK STRATEGY FOR WINDOW VIEW

View File

@ -0,0 +1,71 @@
#include <Parsers/ASTRefreshStrategy.h>
#include <IO/Operators.h>
namespace DB
{
ASTPtr ASTRefreshStrategy::clone() const
{
auto res = std::make_shared<ASTRefreshStrategy>(*this);
res->children.clear();
if (period)
res->set(res->period, period->clone());
if (offset)
res->set(res->offset, offset->clone());
if (spread)
res->set(res->spread, spread->clone());
if (settings)
res->set(res->settings, settings->clone());
if (dependencies)
res->set(res->dependencies, dependencies->clone());
res->schedule_kind = schedule_kind;
return res;
}
void ASTRefreshStrategy::formatImpl(
const IAST::FormatSettings & f_settings, IAST::FormatState & state, IAST::FormatStateStacked frame) const
{
frame.need_parens = false;
f_settings.ostr << (f_settings.hilite ? hilite_keyword : "") << "REFRESH " << (f_settings.hilite ? hilite_none : "");
using enum RefreshScheduleKind;
switch (schedule_kind)
{
case AFTER:
f_settings.ostr << "AFTER " << (f_settings.hilite ? hilite_none : "");
period->formatImpl(f_settings, state, frame);
break;
case EVERY:
f_settings.ostr << "EVERY " << (f_settings.hilite ? hilite_none : "");
period->formatImpl(f_settings, state, frame);
if (offset)
{
f_settings.ostr << (f_settings.hilite ? hilite_keyword : "") << " OFFSET " << (f_settings.hilite ? hilite_none : "");
offset->formatImpl(f_settings, state, frame);
}
break;
default:
f_settings.ostr << (f_settings.hilite ? hilite_none : "");
break;
}
if (spread)
{
f_settings.ostr << (f_settings.hilite ? hilite_keyword : "") << " RANDOMIZE FOR " << (f_settings.hilite ? hilite_none : "");
spread->formatImpl(f_settings, state, frame);
}
if (dependencies)
{
f_settings.ostr << (f_settings.hilite ? hilite_keyword : "") << " DEPENDS ON " << (f_settings.hilite ? hilite_none : "");
dependencies->formatImpl(f_settings, state, frame);
}
if (settings)
{
f_settings.ostr << (f_settings.hilite ? hilite_keyword : "") << " SETTINGS " << (f_settings.hilite ? hilite_none : "");
settings->formatImpl(f_settings, state, frame);
}
}
}
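Taken together with ASTTimeInterval below, this formatter emits clauses of the following shape (the intervals and table name are illustrative):

REFRESH AFTER 30 MINUTE
REFRESH EVERY 1 DAY OFFSET 2 HOUR RANDOMIZE FOR 10 MINUTE DEPENDS ON db.source_table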

View File

@ -0,0 +1,35 @@
#pragma once
#include <Parsers/ASTSetQuery.h>
#include <Parsers/ASTExpressionList.h>
#include <Parsers/ASTTimeInterval.h>
namespace DB
{
enum class RefreshScheduleKind : UInt8
{
UNKNOWN = 0,
AFTER,
EVERY
};
/// Strategy for MATERIALIZED VIEW ... REFRESH ...
class ASTRefreshStrategy : public IAST
{
public:
ASTSetQuery * settings = nullptr;
ASTExpressionList * dependencies = nullptr;
ASTTimeInterval * period = nullptr;
ASTTimeInterval * offset = nullptr;
ASTTimeInterval * spread = nullptr;
RefreshScheduleKind schedule_kind{RefreshScheduleKind::UNKNOWN};
String getID(char) const override { return "Refresh strategy definition"; }
ASTPtr clone() const override;
void formatImpl(const FormatSettings & s, FormatState & state, FormatStateStacked frame) const override;
};
}

View File

@ -90,6 +90,13 @@ public:
STOP_CLEANUP,
START_CLEANUP,
RESET_COVERAGE,
REFRESH_VIEW,
START_VIEW,
START_VIEWS,
STOP_VIEW,
STOP_VIEWS,
CANCEL_VIEW,
TEST_VIEW,
END
};
@ -133,6 +140,10 @@ public:
ServerType server_type;
/// For SYSTEM TEST VIEW <name> (SET FAKE TIME <time> | UNSET FAKE TIME).
/// Unix time.
std::optional<Int64> fake_time_for_view;
String getID(char) const override { return "SYSTEM query"; }
ASTPtr clone() const override

View File

@ -0,0 +1,28 @@
#include <Parsers/ASTTimeInterval.h>
#include <IO/Operators.h>
#include <ranges>
namespace DB
{
ASTPtr ASTTimeInterval::clone() const
{
return std::make_shared<ASTTimeInterval>(*this);
}
void ASTTimeInterval::formatImpl(const FormatSettings & f_settings, FormatState &, FormatStateStacked frame) const
{
frame.need_parens = false;
for (bool is_first = true; auto [kind, value] : interval.toIntervals())
{
if (!std::exchange(is_first, false))
f_settings.ostr << ' ';
f_settings.ostr << value << ' ';
f_settings.ostr << (f_settings.hilite ? hilite_keyword : "") << kind.toKeyword() << (f_settings.hilite ? hilite_none : "");
}
}
}

View File

@ -0,0 +1,24 @@
#pragma once
#include <Parsers/IAST.h>
#include <Common/CalendarTimeInterval.h>
#include <map>
namespace DB
{
/// Compound time interval like 1 YEAR 3 DAY 15 MINUTE
class ASTTimeInterval : public IAST
{
public:
CalendarTimeInterval interval;
String getID(char) const override { return "TimeInterval"; }
ASTPtr clone() const override;
void formatImpl(const FormatSettings & settings, FormatState & state, FormatStateStacked frame) const override;
};
}

Some files were not shown because too many files have changed in this diff.