Merge remote-tracking branch 'origin/master' into analyzer-refactor-constant-name

This commit is contained in:
Dmitry Novik 2024-02-27 16:35:28 +01:00
commit 33ae650e6a
223 changed files with 3069 additions and 2547 deletions

contrib/liburing vendored

@ -1 +1 @@
Subproject commit f5a48392c4ea33f222cbebeb2e2fc31620162949
Subproject commit f4e42a515cd78c8c9cac2be14222834be5f8df2b

View File

@ -86,7 +86,7 @@ function download
chmod +x clickhouse
# clickhouse may be compressed - run once to decompress
./clickhouse ||:
./clickhouse --query "SELECT 1" ||:
ln -s ./clickhouse ./clickhouse-server
ln -s ./clickhouse ./clickhouse-client
ln -s ./clickhouse ./clickhouse-local

View File

@ -870,6 +870,11 @@ Tags:
- `load_balancing` - Policy for disk balancing, `round_robin` or `least_used`.
- `least_used_ttl_ms` - Configure timeout (in milliseconds) for updating the available space on all disks (`0` - update always, `-1` - never update, default is `60000`). Note that if the disk can be used by ClickHouse only and is not subject to an online filesystem resize/shrink you can use `-1`; in all other cases it is not recommended, since it will eventually lead to incorrect space distribution.
- `prefer_not_to_merge` — You should not use this setting. Disables merging of data parts on this volume (this is harmful and leads to performance degradation). When this setting is enabled (don't do it), merging data on this volume is not allowed (which is bad). This allows (but you don't need it) controlling (if you want to control something, you're making a mistake) how ClickHouse works with slow disks (but ClickHouse knows better, so please don't use this setting).
- `volume_priority` — Defines the priority (order) in which volumes are filled. A lower value means a higher priority. The parameter values must be natural numbers that collectively cover the range from 1 to N (where N is the largest value given, i.e. the lowest priority) without skipping any numbers.
* If _all_ volumes are tagged, they are prioritized in the given order.
* If only _some_ volumes are tagged, those without the tag have the lowest priority, and they are prioritized in the order they are defined in the config.
* If _no_ volumes are tagged, their priority is set according to the order in which they are declared in the configuration.
* Two volumes cannot have the same priority value.
Configuration examples:
@ -919,7 +924,8 @@ In given example, the `hdd_in_order` policy implements the [round-robin](https:/
If there are different kinds of disks available in the system, the `moving_from_ssd_to_hdd` policy can be used instead. The volume `hot` consists of an SSD disk (`fast_ssd`), and the maximum size of a part that can be stored on this volume is 1GB. All parts larger than 1GB will be stored directly on the `cold` volume, which contains an HDD disk `disk1`.
Also, once the disk `fast_ssd` gets filled by more than 80%, data will be transferred to `disk1` by a background process.
The order of volume enumeration within a storage policy is important. Once a volume is overfilled, data are moved to the next one. The order of disk enumeration is important as well because data are stored on them in turns.
The order of volume enumeration within a storage policy is important in case at least one of the volumes listed has no explicit `volume_priority` parameter.
Once a volume is overfilled, data are moved to the next one. The order of disk enumeration is important as well because data are stored on them in turns.
When creating a table, one can apply one of the configured storage policies to it:
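For instance, a minimal sketch (the table and column names are hypothetical; `moving_from_ssd_to_hdd` is the policy described above):
```sql
-- Hypothetical table using the storage policy configured above.
CREATE TABLE table_with_policy
(
    EventDate Date,
    Value UInt64
)
ENGINE = MergeTree
ORDER BY EventDate
SETTINGS storage_policy = 'moving_from_ssd_to_hdd';
```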

View File

@ -74,6 +74,10 @@ Specifying the `sharding_key` is necessary for the following:
`fsync_directories` - do the `fsync` for directories. Guarantees that the OS refreshed directory metadata after operations related to background inserts on Distributed table (after insert, after sending the data to shard, etc.).
#### skip_unavailable_shards
`skip_unavailable_shards` - If true, ClickHouse silently skips unavailable shards. A shard is marked as unavailable when: 1) the shard cannot be reached due to a connection failure; 2) the shard cannot be resolved through DNS; 3) the table does not exist on the shard. Default: false.
#### bytes_to_throw_insert
`bytes_to_throw_insert` - If more than this number of compressed bytes is pending for background INSERT, an exception is thrown. 0 - do not throw. Default: 0.
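As a hedged sketch (the cluster, database, and table names are placeholders), these settings can be set per table in the `SETTINGS` clause of the Distributed engine:
```sql
-- Placeholder names: my_cluster, default, local_table.
CREATE TABLE dist_table AS local_table
ENGINE = Distributed(my_cluster, default, local_table, rand())
SETTINGS
    skip_unavailable_shards = 1,
    bytes_to_throw_insert = 100000000;
```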
@ -102,6 +106,10 @@ Specifying the `sharding_key` is necessary for the following:
`background_insert_max_sleep_time_ms` - same as [distributed_background_insert_max_sleep_time_ms](../../../operations/settings/settings.md#distributed_background_insert_max_sleep_time_ms)
#### flush_on_detach
`flush_on_detach` - Flush data to remote nodes on DETACH/DROP/server shutdown. Default true.
:::note
**Durability settings** (`fsync_...`):

View File

@ -79,10 +79,7 @@ It is recommended to use official pre-compiled `deb` packages for Debian or Ubun
#### Setup the Debian repository
``` bash
sudo apt-get install -y apt-transport-https ca-certificates dirmngr
GNUPGHOME=$(mktemp -d)
sudo GNUPGHOME="$GNUPGHOME" gpg --no-default-keyring --keyring /usr/share/keyrings/clickhouse-keyring.gpg --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys 8919F6BD2B48D754
sudo rm -rf "$GNUPGHOME"
sudo chmod +r /usr/share/keyrings/clickhouse-keyring.gpg
sudo gpg --no-default-keyring --keyring /usr/share/keyrings/clickhouse-keyring.gpg --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys 8919F6BD2B48D754
echo "deb [signed-by=/usr/share/keyrings/clickhouse-keyring.gpg] https://packages.clickhouse.com/deb stable main" | sudo tee \
/etc/apt/sources.list.d/clickhouse.list

View File

@ -199,6 +199,20 @@ Type: Bool
Default: 0
## dns_cache_max_size
Internal DNS cache max size in bytes.
:::note
ClickHouse also has a reverse cache, so the actual memory usage could be twice as much.
:::
Type: UInt64
Default: 1024
## dns_cache_update_period
Internal DNS cache update period in seconds.
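To inspect the effective values of these settings, a hedged sketch (assumes the `system.server_settings` table is available in your ClickHouse version):
```sql
-- Show the current DNS-cache-related server settings.
SELECT name, value, description
FROM system.server_settings
WHERE name LIKE 'dns_cache%';
```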

View File

@ -1776,7 +1776,7 @@ Default value: 0 (no restriction).
## insert_quorum {#insert_quorum}
:::note
`insert_quorum` does not apply when using the [`SharedMergeTree` table engine](/en/cloud/reference/shared-merge-tree) in ClickHouse Cloud as all inserts are quorum inserted.
This setting is not applicable to SharedMergeTree, see [SharedMergeTree consistency](/docs/en/cloud/reference/shared-merge-tree/#consistency) for more information.
:::
Enables the quorum writes.
@ -1819,7 +1819,7 @@ See also:
## insert_quorum_parallel {#insert_quorum_parallel}
:::note
`insert_quorum_parallel` does not apply when using the [`SharedMergeTree` table engine](/en/cloud/reference/shared-merge-tree) in ClickHouse Cloud as all inserts are quorum inserted.
This setting is not applicable to SharedMergeTree, see [SharedMergeTree consistency](/docs/en/cloud/reference/shared-merge-tree/#consistency) for more information.
:::
Enables or disables parallelism for quorum `INSERT` queries. If enabled, additional `INSERT` queries can be sent while previous queries have not yet finished. If disabled, additional writes to the same table will be rejected.
@ -1839,6 +1839,10 @@ See also:
## select_sequential_consistency {#select_sequential_consistency}
:::note
This setting differs in behavior between SharedMergeTree and ReplicatedMergeTree, see [SharedMergeTree consistency](/docs/en/cloud/reference/shared-merge-tree/#consistency) for more information about the behavior of `select_sequential_consistency` in SharedMergeTree.
:::
Enables or disables sequential consistency for `SELECT` queries. Requires `insert_quorum_parallel` to be disabled (enabled by default).
Possible values:
@ -2037,7 +2041,7 @@ Possible values:
- 0 — Disabled.
- 1 — Enabled.
Default value: 1.
Default value: 0.
By default, async inserts made into replicated tables by an `INSERT` statement with [async_insert](#async-insert) enabled are deduplicated (see [Data Replication](../../engines/table-engines/mergetree-family/replication.md)).
For the replicated tables, by default, only 10000 of the most recent inserts for each partition are deduplicated (see [replicated_deduplication_window_for_async_inserts](merge-tree-settings.md/#replicated-deduplication-window-async-inserts), [replicated_deduplication_window_seconds_for_async_inserts](merge-tree-settings.md/#replicated-deduplication-window-seconds-async-inserts)).
@ -3445,7 +3449,7 @@ Has an effect only when the connection is made through the MySQL wire protocol.
- 0 - Use `BLOB`.
- 1 - Use `TEXT`.
Default value: `0`.
Default value: `1`.
## mysql_map_fixed_string_to_text_in_show_columns {#mysql_map_fixed_string_to_text_in_show_columns}
@ -3456,7 +3460,7 @@ Has an effect only when the connection is made through the MySQL wire protocol.
- 0 - Use `BLOB`.
- 1 - Use `TEXT`.
Default value: `0`.
Default value: `1`.
## execute_merges_on_single_replica_time_threshold {#execute-merges-on-single-replica-time-threshold}
@ -3706,7 +3710,7 @@ Default value: `0`.
## allow_experimental_live_view {#allow-experimental-live-view}
Allows creation of experimental [live views](../../sql-reference/statements/create/view.md/#live-view).
Allows creation of a deprecated LIVE VIEW.
Possible values:
@ -3717,21 +3721,15 @@ Default value: `0`.
## live_view_heartbeat_interval {#live-view-heartbeat-interval}
Sets the heartbeat interval in seconds to indicate that a [live view](../../sql-reference/statements/create/view.md/#live-view) is alive.
Default value: `15`.
Deprecated.
## max_live_view_insert_blocks_before_refresh {#max-live-view-insert-blocks-before-refresh}
Sets the maximum number of inserted blocks after which mergeable blocks are dropped and query for [live view](../../sql-reference/statements/create/view.md/#live-view) is re-executed.
Default value: `64`.
Deprecated.
## periodic_live_view_refresh {#periodic-live-view-refresh}
Sets the interval in seconds after which periodically refreshed [live view](../../sql-reference/statements/create/view.md/#live-view) is forced to refresh.
Default value: `60`.
Deprecated.
## http_connection_timeout {#http_connection_timeout}

View File

@ -0,0 +1,38 @@
---
slug: /en/operations/system-tables/dns_cache
---
# dns_cache
Contains information about cached DNS records.
Columns:
- `hostname` ([String](../../sql-reference/data-types/string.md)) — cached hostname
- `ip_address` ([String](../../sql-reference/data-types/string.md)) — IP address for the hostname
- `ip_family` ([Enum](../../sql-reference/data-types/enum.md)) — family of the IP address, possible values:
  - 'IPv4'
  - 'IPv6'
  - 'UNIX_LOCAL'
- `cached_at` ([DateTime](../../sql-reference/data-types/datetime.md)) — when the record was cached
**Example**
Query:
```sql
SELECT * FROM system.dns_cache;
```
Result:
| hostname | ip\_address | ip\_family | cached\_at |
| :--- | :--- | :--- | :--- |
| localhost | ::1 | IPv6 | 2024-02-11 17:04:40 |
| localhost | 127.0.0.1 | IPv4 | 2024-02-11 17:04:40 |
**See also**
- [disable_internal_dns_cache setting](../../operations/server-configuration-parameters/settings.md#disable_internal_dns_cache)
- [dns_cache_max_size setting](../../operations/server-configuration-parameters/settings.md#dns_cache_max_size)
- [dns_cache_update_period setting](../../operations/server-configuration-parameters/settings.md#dns_cache_update_period)
- [dns_max_consecutive_failures setting](../../operations/server-configuration-parameters/settings.md#dns_max_consecutive_failures)

View File

@ -111,6 +111,14 @@ On newer Linux kernels transparent huge pages are alright.
$ echo 'madvise' | sudo tee /sys/kernel/mm/transparent_hugepage/enabled
```
If you want to change the transparent huge pages setting permanently, edit `/etc/default/grub` and add `transparent_hugepage=madvise` to the `GRUB_CMDLINE_LINUX_DEFAULT` option:
```bash
$ GRUB_CMDLINE_LINUX_DEFAULT="transparent_hugepage=madvise ..."
```
After that, run `sudo update-grub` and reboot for the change to take effect.
## Hypervisor configuration
If you are using OpenStack, set

View File

@ -0,0 +1,50 @@
---
slug: /en/sql-reference/aggregate-functions/reference/grouparrayintersect
sidebar_position: 115
---
# groupArrayIntersect
Returns an intersection of the given arrays (i.e. all items that are present in every one of the given arrays).
**Syntax**
``` sql
groupArrayIntersect(x)
```
**Arguments**
- `x` — Argument (column name or expression).
**Returned values**
- An array containing the elements that are present in all of the source arrays.
Type: [Array](../../data-types/array.md).
**Examples**
Consider table `numbers`:
``` text
┌─a──────────────┐
│ [1,2,4] │
│ [1,5,2,8,-1,0] │
│ [1,5,7,5,8,2] │
└────────────────┘
```
Query with column name as argument:
``` sql
SELECT groupArrayIntersect(a) as intersection FROM numbers;
```
Result:
```text
┌─intersection──────┐
│ [1, 2] │
└───────────────────┘
```
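A minimal self-contained sketch with inline arrays (the element order of the result is not guaranteed):
```sql
-- Intersect two literal arrays; the common elements are 2 and 3.
SELECT groupArrayIntersect(arr) AS intersection
FROM
(
    SELECT [1, 2, 3, 4] AS arr
    UNION ALL
    SELECT [2, 3, 5]
);
```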

View File

@ -55,6 +55,7 @@ ClickHouse-specific aggregate functions:
- [groupArrayMovingSum](/docs/en/sql-reference/aggregate-functions/reference/grouparraymovingsum.md)
- [groupArraySample](./grouparraysample.md)
- [groupArraySorted](/docs/en/sql-reference/aggregate-functions/reference/grouparraysorted.md)
- [groupArrayIntersect](./grouparrayintersect.md)
- [groupBitAnd](/docs/en/sql-reference/aggregate-functions/reference/groupbitand.md)
- [groupBitOr](/docs/en/sql-reference/aggregate-functions/reference/groupbitor.md)
- [groupBitXor](/docs/en/sql-reference/aggregate-functions/reference/groupbitxor.md)

View File

@ -5,25 +5,25 @@ sidebar_position: 221
# stochasticLinearRegression
This function implements stochastic linear regression. It supports custom parameters for learning rate, L2 regularization coefficient, mini-batch size and has few methods for updating weights ([Adam](https://en.wikipedia.org/wiki/Stochastic_gradient_descent#Adam) (used by default), [simple SGD](https://en.wikipedia.org/wiki/Stochastic_gradient_descent), [Momentum](https://en.wikipedia.org/wiki/Stochastic_gradient_descent#Momentum), [Nesterov](https://mipt.ru/upload/medialibrary/d7e/41-91.pdf)).
This function implements stochastic linear regression. It supports custom parameters for learning rate, L2 regularization coefficient, mini-batch size, and has a few methods for updating weights ([Adam](https://en.wikipedia.org/wiki/Stochastic_gradient_descent#Adam) (used by default), [simple SGD](https://en.wikipedia.org/wiki/Stochastic_gradient_descent), [Momentum](https://en.wikipedia.org/wiki/Stochastic_gradient_descent#Momentum), and [Nesterov](https://mipt.ru/upload/medialibrary/d7e/41-91.pdf)).
### Parameters
There are 4 customizable parameters. They are passed to the function sequentially, but there is no need to pass all four - default values will be used. However, a good model requires some parameter tuning.
``` text
stochasticLinearRegression(1.0, 1.0, 10, 'SGD')
stochasticLinearRegression(0.00001, 0.1, 15, 'Adam')
```
1. `learning rate` is the coefficient on step length, when gradient descent step is performed. Too big learning rate may cause infinite weights of the model. Default is `0.00001`.
1. `learning rate` is the coefficient on step length, when the gradient descent step is performed. A learning rate that is too big may cause infinite weights of the model. Default is `0.00001`.
2. `l2 regularization coefficient` which may help to prevent overfitting. Default is `0.1`.
3. `mini-batch size` sets the number of elements, which gradients will be computed and summed to perform one step of gradient descent. Pure stochastic descent uses one element, however having small batches(about 10 elements) make gradient steps more stable. Default is `15`.
4. `method for updating weights`, they are: `Adam` (by default), `SGD`, `Momentum`, `Nesterov`. `Momentum` and `Nesterov` require little bit more computations and memory, however they happen to be useful in terms of speed of convergence and stability of stochastic gradient methods.
3. `mini-batch size` sets the number of elements whose gradients will be computed and summed to perform one step of gradient descent. Pure stochastic descent uses one element; however, small batches (about 10 elements) make gradient steps more stable. Default is `15`.
4. `method for updating weights`: `Adam` (by default), `SGD`, `Momentum`, or `Nesterov`. `Momentum` and `Nesterov` require a little more computation and memory; however, they tend to improve the speed of convergence and the stability of stochastic gradient methods.
### Usage
`stochasticLinearRegression` is used in two steps: fitting the model and predicting on new data. In order to fit the model and save its state for later usage we use `-State` combinator, which basically saves the state (model weights, etc).
To predict we use function [evalMLMethod](../../../sql-reference/functions/machine-learning-functions.md#machine_learning_methods-evalmlmethod), which takes a state as an argument as well as features to predict on.
`stochasticLinearRegression` is used in two steps: fitting the model and predicting on new data. In order to fit the model and save its state for later usage, we use the `-State` combinator, which saves the state (e.g. model weights).
To predict, we use the function [evalMLMethod](../../../sql-reference/functions/machine-learning-functions.md#machine_learning_methods-evalmlmethod), which takes a state as an argument as well as features to predict on.
<a name="stochasticlinearregression-usage-fitting"></a>
@ -44,12 +44,12 @@ stochasticLinearRegressionState(0.1, 0.0, 5, 'SGD')(target, param1, param2)
AS state FROM train_data;
```
Here we also need to insert data into `train_data` table. The number of parameters is not fixed, it depends only on number of arguments, passed into `linearRegressionState`. They all must be numeric values.
Note that the column with target value(which we would like to learn to predict) is inserted as the first argument.
Here, we also need to insert data into the `train_data` table. The number of parameters is not fixed, it depends only on the number of arguments passed into `linearRegressionState`. They all must be numeric values.
Note that the column with target value (which we would like to learn to predict) is inserted as the first argument.
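For illustration, a hedged sketch of the complete fitting step (the `train_data` table and its columns `target`, `param1`, `param2` are assumed, mirroring the snippet above; the state is stored in `your_model`):
```sql
-- Fit the model and persist its state; the target column goes first.
CREATE TABLE your_model ENGINE = Memory AS
SELECT stochasticLinearRegressionState(0.1, 0.0, 5, 'SGD')(target, param1, param2) AS state
FROM train_data;
```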
**2.** Predicting
After saving a state into the table, we may use it multiple times for prediction, or even merge with other states and create new even better models.
After saving a state into the table, we may use it multiple times for prediction or even merge with other states and create new, even better models.
``` sql
WITH (SELECT state FROM your_model) AS model SELECT

View File

@ -780,8 +780,52 @@ If executed in the context of a distributed table, this function generates a nor
## version()
Returns the server version as a string.
If executed in the context of a distributed table, this function generates a normal column with values relevant to each shard. Otherwise it produces a constant value.
Returns the current version of ClickHouse as a string in the form of:
- Major version
- Minor version
- Patch version
- Number of commits since the previous stable release.
```plaintext
major_version.minor_version.patch_version.number_of_commits_since_the_previous_stable_release
```
If executed in the context of a distributed table, this function generates a normal column with values relevant to each shard. Otherwise, it produces a constant value.
**Syntax**
```sql
version()
```
**Arguments**
None.
**Returned value**
The current version of ClickHouse. Type: [String](../data-types/string)
**Implementation details**
None.
**Example**
Query:
```sql
SELECT version()
```
**Result**:
```response
┌─version()─┐
│ 24.2.1.1 │
└───────────┘
```
## buildId()

View File

@ -176,7 +176,7 @@ INSERT INTO infile_globs FROM INFILE 'input_?.csv' FORMAT CSV;
```
:::
## Inserting into Table Function
## Inserting using a Table Function
Data can be inserted into tables referenced by [table functions](../../sql-reference/table-functions/index.md).
@ -204,7 +204,7 @@ Result:
└─────┴───────────────────────┘
```
## Inserts into ClickHouse Cloud
## Inserting into ClickHouse Cloud
By default, services on ClickHouse Cloud provide multiple replicas for high availability. When you connect to a service, a connection is established to one of these replicas.
@ -218,6 +218,12 @@ SELECT .... SETTINGS select_sequential_consistency = 1;
Note that using `select_sequential_consistency` will increase the load on ClickHouse Keeper (used by ClickHouse Cloud internally) and may result in slower performance depending on the load on the service. We recommend against enabling this setting unless necessary. The recommended approach is to execute read/writes in the same session or to use a client driver that uses the native protocol (and thus supports sticky connections).
## Inserting into a replicated setup
In a replicated setup, data will be visible on other replicas after it has been replicated. Data begins being replicated (downloaded on other replicas) immediately after an `INSERT`. This differs from ClickHouse Cloud, where data is immediately written to shared storage and replicas subscribe to metadata changes.
Note that for replicated setups, `INSERTs` can sometimes take a considerable amount of time (on the order of one second) as they require committing to ClickHouse Keeper for distributed consensus. Using S3 for storage also adds additional latency.
## Performance Considerations
`INSERT` sorts the input data by primary key and splits it into partitions by the partition key. If you insert data into several partitions at once, it can significantly reduce the performance of the `INSERT` query. To avoid this:
@ -230,7 +236,15 @@ Performance will not decrease if:
- Data is added in real time.
- You upload data that is usually sorted by time.
It's also possible to asynchronously insert data in small but frequent inserts. The data from such insertions is combined into batches and then safely inserted into a table. To enable the asynchronous mode, switch on the [async_insert](../../operations/settings/settings.md#async-insert) setting. Note that asynchronous insertions are supported only over HTTP protocol, and deduplication is not supported for them.
### Asynchronous inserts
It is possible to asynchronously insert data in small but frequent inserts. The data from such insertions is combined into batches and then safely inserted into a table. To use asynchronous inserts, enable the [`async_insert`](../../operations/settings/settings.md#async-insert) setting.
Using `async_insert` or the [`Buffer` table engine](/en/engines/table-engines/special/buffer) results in additional buffering.
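A hedged sketch (the table `my_table` and its two columns are hypothetical); `wait_for_async_insert` controls whether the client waits until the buffered data is flushed:
```sql
-- Enable asynchronous inserts for the current session, then insert as usual.
SET async_insert = 1, wait_for_async_insert = 1;
INSERT INTO my_table VALUES (1, 'a'), (2, 'b');
```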
### Large or long-running inserts
When you are inserting large amounts of data, ClickHouse will optimize write performance through a process called "squashing". Small blocks of inserted data in memory are merged and squashed into larger blocks before being written to disk. Squashing reduces the overhead associated with each write operation. In this process, inserted data will be available to query after ClickHouse completes writing each [`max_insert_block_size`](/en/operations/settings/settings#max_insert_block_size) rows.
**See Also**

View File

@ -68,7 +68,7 @@ RELOAD FUNCTION [ON CLUSTER cluster_name] function_name
Clears ClickHouse's internal DNS cache. Sometimes (for old ClickHouse versions) it is necessary to use this command when changing the infrastructure (changing the IP address of another ClickHouse server or the server used by dictionaries).
For more convenient (automatic) cache management, see disable_internal_dns_cache, dns_cache_update_period parameters.
For more convenient (automatic) cache management, see disable_internal_dns_cache, dns_cache_max_size, dns_cache_update_period parameters.
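For reference, clearing the cache manually is a single statement:
```sql
SYSTEM DROP DNS CACHE;
```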
## DROP MARK CACHE

View File

@ -679,11 +679,20 @@ TTL d + INTERVAL 1 MONTH GROUP BY k1, k2 SET x = max(x), y = min(y);
Tags:
- `policy_name_N` — Policy name. Policy names must be unique.
- `volume_name_N` — Volume name. Volume names must be unique.
- `disk` — A disk within the volume.
- `max_data_part_size_bytes` — The maximum size of a data part that can be stored on any of the disks of this volume. If a merge is expected to produce a part larger than max_data_part_size_bytes, that part is written to the next volume. Essentially, this feature makes it possible to keep new/small parts on a hot (SSD) volume and move them to a cold (HDD) volume once they reach a large size. Do not use this parameter if the policy has only one volume.
- `move_factor` — The share of available free space on the volume; when less space remains, data starts moving to the next volume, if there is one (default 0.1). For moving, parts are sorted by size from largest to smallest (descending), and parts whose total size is sufficient to satisfy the `move_factor` condition are selected; if the total size of all parts is insufficient, all parts are moved.
- `perform_ttl_move_on_insert` — Disables moving data with an expired TTL on insert. By default (if enabled), if we insert a data part that has already expired according to a TTL move rule, it is immediately moved to the volume/disk specified in the move rule. This can significantly slow down inserts if the target volume/disk is slow (e.g. S3). If disabled, the expired data part is written to the default volume and then immediately moved to the volume specified in the rule for the expired TTL.
- `load_balancing` - Disk balancing policy, `round_robin` or `least_used`.
- `least_used_ttl_ms` - Sets the timeout (in milliseconds) for updating the available space on all disks (`0` - always update, `-1` - never update, default `60000`). Note that if the disk is used only by ClickHouse and will not be subject to online filesystem resizing, you can use `-1`; in all other cases this is not recommended, since it will eventually lead to incorrect space distribution.
- `prefer_not_to_merge` — You should not use this setting. It disables merging of data parts on this volume (which is potentially harmful and may lead to slowdowns). When this setting is enabled (don't do it), merging data on this volume is not allowed (which is bad). This allows (but you don't need it) controlling (if you want to control something, you're making a mistake) how ClickHouse interacts with slow disks (but ClickHouse knows better, so please don't use this setting).
- `volume_priority` — Defines the priority (order) in which volumes are filled. The lower the value, the higher the priority. The parameter values must be natural numbers covering the range from 1 to N (N being the largest value specified) without gaps.
* If _all_ volumes have this parameter, they are prioritized in the specified order.
* If only _some_ volumes have it, the volumes without this parameter have the lowest priority. Those that have it are prioritized according to the tag value; the priority of the rest is determined by the order in which they are described in the configuration file relative to each other.
* If _no_ volume is given this parameter, their order is determined by the order in which they are described in the configuration file.
* Two volumes cannot have the same priority.
Configuration examples:
@ -733,7 +742,7 @@ TTL d + INTERVAL 1 MONTH GROUP BY k1, k2 SET x = max(x), y = min(y);
If the system contains disks of different types, the `moving_from_ssd_to_hdd` policy may come in handy. The `hot` volume contains one SSD disk (`fast_ssd`), and a limit is set on the maximum size of a part that can be stored on this volume (1GB). All parts of such a table larger than 1GB will be written directly to the `cold` volume, which contains one HDD disk, `disk1`. Also, once the `fast_ssd` disk is more than 80% full, data will be moved to `disk1` by a background process.
The order of volumes in storage policies is important: once a volume's overflow conditions are reached, data is moved to the next one. The order of disks within volumes is also important; data is written to them in turn.
The order of volumes in storage policies is important if the volume priorities (`volume_priority`) are not specified explicitly: once a volume's overflow conditions are reached, data is moved to the next one. The order of disks within volumes is also important; data is written to them in turn.
After configuring storage policies, they can be used as a setting when creating tables:

View File

@ -3258,7 +3258,7 @@ SELECT * FROM test2;
## allow_experimental_live_view {#allow-experimental-live-view}
Enables the experimental use of [live views](../../sql-reference/statements/create/view.md#live-view).
Enables the deprecated use of [live views](../../sql-reference/statements/create/view.md#live-view).
Possible values:
- 0 — Live views are not supported.
@ -3268,21 +3268,15 @@ SELECT * FROM test2;
## live_view_heartbeat_interval {#live-view-heartbeat-interval}
Sets the interval in seconds for periodically checking that a [LIVE VIEW](../../sql-reference/statements/create/view.md#live-view) exists.
Default value: `15`.
Deprecated.
## max_live_view_insert_blocks_before_refresh {#max-live-view-insert-blocks-before-refresh}
Sets the maximum number of inserts after which the query that builds a [LIVE VIEW](../../sql-reference/statements/create/view.md#live-view) is executed again.
Default value: `64`.
Deprecated.
## periodic_live_view_refresh {#periodic-live-view-refresh}
Sets the time in seconds after which a [LIVE VIEW](../../sql-reference/statements/create/view.md#live-view) with auto-refresh enabled is refreshed.
Default value: `60`.
Deprecated.
## check_query_single_value_result {#check_query_single_value_result}

View File

@ -280,9 +280,6 @@ GRANT INSERT(x,y) ON db.table TO john
- `ALTER MOVE PARTITION`. Level: `TABLE`. Aliases: `ALTER MOVE PART`, `MOVE PARTITION`, `MOVE PART`
- `ALTER FETCH PARTITION`. Level: `TABLE`. Aliases: `FETCH PARTITION`
- `ALTER FREEZE PARTITION`. Level: `TABLE`. Aliases: `FREEZE PARTITION`
- `ALTER VIEW`. Level: `GROUP`
- `ALTER VIEW REFRESH`. Level: `VIEW`. Aliases: `ALTER LIVE VIEW REFRESH`, `REFRESH VIEW`
- `ALTER VIEW MODIFY QUERY`. Level: `VIEW`. Aliases: `ALTER TABLE MODIFY QUERY`
An example of how this hierarchy is treated:
- The `ALTER` privilege includes all other `ALTER *` privileges
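For example (a hedged sketch; `db.table` and `john` are placeholders from the snippet above), granting `ALTER` implicitly grants all narrower `ALTER *` privileges on the same object:
```sql
-- Grants ALTER and, with it, every ALTER * sub-privilege on db.table.
GRANT ALTER ON db.table TO john;
```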

View File

@ -1774,6 +1774,8 @@ try
}
else
{
DNSResolver::instance().setCacheMaxSize(server_settings.dns_cache_max_size);
/// Initialize a watcher periodically updating DNS cache
dns_cache_updater = std::make_unique<DNSCacheUpdater>(
global_context, server_settings.dns_cache_update_period, server_settings.dns_max_consecutive_failures);

View File

@ -1392,13 +1392,27 @@
<!-- <host_name>replica</host_name> -->
</distributed_ddl>
<!-- Settings to fine tune MergeTree tables. See documentation in source code, in MergeTreeSettings.h -->
<!-- Settings to fine-tune MergeTree tables. See documentation in source code, in MergeTreeSettings.h -->
<!--
<merge_tree>
<max_suspicious_broken_parts>5</max_suspicious_broken_parts>
</merge_tree>
-->
<!-- Settings to fine-tune ReplicatedMergeTree tables. See documentation in source code, in MergeTreeSettings.h -->
<!--
<replicated_merge_tree>
<max_replicated_fetches_network_bandwidth>1000000000</max_replicated_fetches_network_bandwidth>
</replicated_merge_tree>
-->
<!-- Settings to fine-tune Distributed tables. See documentation in source code, in DistributedSettings.h -->
<!--
<distributed>
<flush_on_detach>false</flush_on_detach>
</distributed>
-->
<!-- Protection from accidental DROP.
If the size of a MergeTree table is greater than max_table_size_to_drop (in bytes), then the table cannot be dropped with any DROP query.
If you want to delete one table and don't want to change the clickhouse-server config, you can create the special file <clickhouse-path>/flags/force_drop_table and perform the DROP once.

View File

@ -85,11 +85,10 @@
gap: 1rem;
}
.chart {
flex: 1 40%;
min-width: 20rem;
flex: 1 1 40rem;
min-height: 16rem;
background: var(--chart-background);
box-shadow: 0 0 1rem var(--shadow-color);
box-shadow: 1px 1px 0 var(--shadow-color);
overflow: hidden;
position: relative;
}
@ -195,7 +194,7 @@
}
.inputs input {
box-shadow: 0 0 1rem var(--shadow-color);
box-shadow: 1px 1px 0 var(--shadow-color);
padding: 0.25rem;
}
@ -255,8 +254,6 @@
font-weight: bold;
user-select: none;
cursor: pointer;
padding-left: 0.5rem;
padding-right: 0.5rem;
background: var(--new-chart-background-color);
color: var(--new-chart-text-color);
float: right;
@ -275,7 +272,6 @@
width: 36%;
}
#global-error {
align-self: center;
width: 60%;
@ -298,7 +294,7 @@
background: var(--param-background-color);
color: var(--param-text-color);
display: inline-block;
box-shadow: 0 0 1rem var(--shadow-color);
box-shadow: 1px 1px 0 var(--shadow-color);
margin-bottom: 1rem;
}
@ -491,17 +487,10 @@
* - if a query returned something unusual, display the table;
*/
let host = 'https://play.clickhouse.com/';
let user = 'explorer';
let host = location.protocol != 'file:' ? location.origin : 'http://localhost:8123/';
let user = 'default';
let password = '';
let add_http_cors_header = true;
/// If it is hosted on server, assume that it is the address of ClickHouse.
if (location.protocol != 'file:') {
host = location.origin;
user = 'default';
add_http_cors_header = false;
}
let add_http_cors_header = (location.protocol != 'file:');
const errorCodeMessageMap = {
516: 'Error authenticating with database. Please check your connection params and try again.'
@ -1273,8 +1262,11 @@ function hideError() {
}
let firstLoad = true;
let is_drawing = false; // Prevent race condition leading to duplicate/dangling charts.
async function drawAll() {
if (is_drawing) return;
is_drawing = true;
let params = getParamsForURL();
const chartsArray = document.getElementsByClassName('chart');
@ -1301,12 +1293,12 @@ async function drawAll() {
document.getElementById('edit').style.display = 'inline-block';
document.getElementById('search-span').style.display = '';
hideError();
}
else {
const charts = document.getElementById('charts')
charts.style.height = '0px';
} else {
document.getElementById('charts').style.height = '0px';
}
});
is_drawing = false;
}
function resize() {

View File

@ -164,7 +164,7 @@ public:
int getBcryptWorkfactor() const;
/// Enables logic that users without permissive row policies can still read rows using a SELECT query.
/// For example, if there two users A, B and a row policy is defined only for A, then
/// For example, if there are two users A, B and a row policy is defined only for A, then
/// if this setting is true the user B will see all rows, and if this setting is false the user B will see no rows.
void setEnabledUsersWithoutRowPoliciesCanReadRows(bool enable) { users_without_row_policies_can_read_rows = enable; }
bool isEnabledUsersWithoutRowPoliciesCanReadRows() const { return users_without_row_policies_can_read_rows; }

View File

@ -80,13 +80,12 @@ enum class AccessType
M(ALTER_TABLE, "", GROUP, ALTER) \
M(ALTER_DATABASE, "", GROUP, ALTER) \
\
M(ALTER_VIEW_REFRESH, "ALTER LIVE VIEW REFRESH, REFRESH VIEW", VIEW, ALTER_VIEW) \
M(ALTER_VIEW_MODIFY_QUERY, "ALTER TABLE MODIFY QUERY", VIEW, ALTER_VIEW) \
M(ALTER_VIEW_MODIFY_REFRESH, "ALTER TABLE MODIFY QUERY", VIEW, ALTER_VIEW) \
M(ALTER_VIEW, "", GROUP, ALTER) /* allows to execute ALTER VIEW REFRESH, ALTER VIEW MODIFY QUERY, ALTER VIEW MODIFY REFRESH;
implicitly enabled by the grant ALTER_TABLE */\
\
M(ALTER, "", GROUP, ALL) /* allows to execute ALTER {TABLE|LIVE VIEW} */\
M(ALTER, "", GROUP, ALL) /* allows to execute ALTER TABLE */\
\
M(CREATE_DATABASE, "", DATABASE, CREATE) /* allows to execute {CREATE|ATTACH} DATABASE */\
M(CREATE_TABLE, "", TABLE, CREATE) /* allows to execute {CREATE|ATTACH} {TABLE|VIEW} */\

View File

@ -0,0 +1,439 @@
#include <cassert>
#include <memory>
#include <IO/WriteHelpers.h>
#include <IO/ReadHelpers.h>
#include <IO/ReadHelpersArena.h>
#include <DataTypes/DataTypeArray.h>
#include <DataTypes/DataTypesNumber.h>
#include <DataTypes/DataTypeString.h>
#include <Columns/ColumnArray.h>
#include <Common/HashTable/HashSet.h>
#include <Common/HashTable/HashTableKeyHolder.h>
#include <Common/assert_cast.h>
#include <AggregateFunctions/IAggregateFunction.h>
#include <AggregateFunctions/KeyHolderHelpers.h>
#include <Core/Field.h>
#include <AggregateFunctions/AggregateFunctionFactory.h>
#include <AggregateFunctions/Helpers.h>
#include <AggregateFunctions/FactoryHelpers.h>
#include <DataTypes/DataTypeDate.h>
#include <DataTypes/DataTypeDate32.h>
#include <DataTypes/DataTypeDateTime.h>
#include <DataTypes/DataTypeDateTime64.h>
namespace DB
{
struct Settings;
namespace ErrorCodes
{
extern const int ILLEGAL_TYPE_OF_ARGUMENT;
extern const int NUMBER_OF_ARGUMENTS_DOESNT_MATCH;
}
struct Settings;
template <typename T>
struct AggregateFunctionGroupArrayIntersectData
{
using Set = HashSet<T>;
Set value;
UInt64 version = 0;
};
/// Puts the values into a hash set and keeps only those present in every processed array, i.e. computes their intersection. Implemented for numeric types.
template <typename T>
class AggregateFunctionGroupArrayIntersect
: public IAggregateFunctionDataHelper<AggregateFunctionGroupArrayIntersectData<T>, AggregateFunctionGroupArrayIntersect<T>>
{
private:
using State = AggregateFunctionGroupArrayIntersectData<T>;
public:
AggregateFunctionGroupArrayIntersect(const DataTypePtr & argument_type, const Array & parameters_)
: IAggregateFunctionDataHelper<AggregateFunctionGroupArrayIntersectData<T>,
AggregateFunctionGroupArrayIntersect<T>>({argument_type}, parameters_, argument_type) {}
AggregateFunctionGroupArrayIntersect(const DataTypePtr & argument_type, const Array & parameters_, const DataTypePtr & result_type_)
: IAggregateFunctionDataHelper<AggregateFunctionGroupArrayIntersectData<T>,
AggregateFunctionGroupArrayIntersect<T>>({argument_type}, parameters_, result_type_) {}
String getName() const override { return "GroupArrayIntersect"; }
bool allocatesMemoryInArena() const override { return false; }
void add(AggregateDataPtr __restrict place, const IColumn ** columns, size_t row_num, Arena *) const override
{
auto & version = this->data(place).version;
auto & set = this->data(place).value;
const auto data_column = assert_cast<const ColumnArray &>(*columns[0]).getDataPtr();
const auto & offsets = assert_cast<const ColumnArray &>(*columns[0]).getOffsets();
const size_t offset = offsets[static_cast<ssize_t>(row_num) - 1];
const auto arr_size = offsets[row_num] - offset;
++version;
if (version == 1)
{
for (size_t i = 0; i < arr_size; ++i)
set.insert(static_cast<T>((*data_column)[offset + i].get<T>()));
}
else if (!set.empty())
{
typename State::Set new_set;
for (size_t i = 0; i < arr_size; ++i)
{
typename State::Set::LookupResult set_value = set.find(static_cast<T>((*data_column)[offset + i].get<T>()));
if (set_value != nullptr)
new_set.insert(static_cast<T>((*data_column)[offset + i].get<T>()));
}
set = std::move(new_set);
}
}
void merge(AggregateDataPtr __restrict place, ConstAggregateDataPtr rhs, Arena *) const override
{
auto & set = this->data(place).value;
const auto & rhs_set = this->data(rhs).value;
if (this->data(rhs).version == 0)
return;
UInt64 version = this->data(place).version++;
if (version == 0)
{
for (auto & rhs_elem : rhs_set)
set.insert(rhs_elem.getValue());
return;
}
if (!set.empty())
{
auto create_new_set = [](auto & lhs_val, auto & rhs_val)
{
typename State::Set new_set;
for (auto & lhs_elem : lhs_val)
{
auto res = rhs_val.find(lhs_elem.getValue());
if (res != nullptr)
new_set.insert(lhs_elem.getValue());
}
return new_set;
};
auto new_set = rhs_set.size() < set.size() ? create_new_set(rhs_set, set) : create_new_set(set, rhs_set);
set = std::move(new_set);
}
}
void serialize(ConstAggregateDataPtr __restrict place, WriteBuffer & buf, std::optional<size_t> /* version */) const override
{
auto & set = this->data(place).value;
auto version = this->data(place).version;
writeVarUInt(version, buf);
writeVarUInt(set.size(), buf);
for (const auto & elem : set)
writeIntBinary(elem.getValue(), buf);
}
void deserialize(AggregateDataPtr __restrict place, ReadBuffer & buf, std::optional<size_t> /* version */, Arena *) const override
{
readVarUInt(this->data(place).version, buf);
this->data(place).value.read(buf);
}
void insertResultInto(AggregateDataPtr __restrict place, IColumn & to, Arena *) const override
{
ColumnArray & arr_to = assert_cast<ColumnArray &>(to);
ColumnArray::Offsets & offsets_to = arr_to.getOffsets();
const auto & set = this->data(place).value;
offsets_to.push_back(offsets_to.back() + set.size());
typename ColumnVector<T>::Container & data_to = assert_cast<ColumnVector<T> &>(arr_to.getData()).getData();
size_t old_size = data_to.size();
data_to.resize(old_size + set.size());
size_t i = 0;
for (auto it = set.begin(); it != set.end(); ++it, ++i)
data_to[old_size + i] = it->getValue();
}
};
/// Generic implementation, it uses serialized representation as object descriptor.
struct AggregateFunctionGroupArrayIntersectGenericData
{
using Set = HashSet<StringRef>;
Set value;
UInt64 version = 0;
};
/** Template parameter with true value should be used for columns that store their elements in memory continuously.
* For such columns GroupArrayIntersect() can be implemented more efficiently (especially for small numeric arrays).
*/
template <bool is_plain_column = false>
class AggregateFunctionGroupArrayIntersectGeneric
: public IAggregateFunctionDataHelper<AggregateFunctionGroupArrayIntersectGenericData,
AggregateFunctionGroupArrayIntersectGeneric<is_plain_column>>
{
const DataTypePtr & input_data_type;
using State = AggregateFunctionGroupArrayIntersectGenericData;
public:
AggregateFunctionGroupArrayIntersectGeneric(const DataTypePtr & input_data_type_, const Array & parameters_)
: IAggregateFunctionDataHelper<AggregateFunctionGroupArrayIntersectGenericData, AggregateFunctionGroupArrayIntersectGeneric<is_plain_column>>({input_data_type_}, parameters_, input_data_type_)
, input_data_type(this->argument_types[0]) {}
AggregateFunctionGroupArrayIntersectGeneric(const DataTypePtr & input_data_type_, const Array & parameters_, const DataTypePtr & result_type_)
: IAggregateFunctionDataHelper<AggregateFunctionGroupArrayIntersectGenericData, AggregateFunctionGroupArrayIntersectGeneric<is_plain_column>>({input_data_type_}, parameters_, result_type_)
, input_data_type(result_type_) {}
String getName() const override { return "GroupArrayIntersect"; }
bool allocatesMemoryInArena() const override { return true; }
void add(AggregateDataPtr __restrict place, const IColumn ** columns, size_t row_num, Arena * arena) const override
{
auto & set = this->data(place).value;
auto & version = this->data(place).version;
bool inserted;
State::Set::LookupResult it;
const auto data_column = assert_cast<const ColumnArray &>(*columns[0]).getDataPtr();
const auto & offsets = assert_cast<const ColumnArray &>(*columns[0]).getOffsets();
const size_t offset = offsets[static_cast<ssize_t>(row_num) - 1];
const auto arr_size = offsets[row_num] - offset;
++version;
if (version == 1)
{
for (size_t i = 0; i < arr_size; ++i)
{
if constexpr (is_plain_column)
set.emplace(ArenaKeyHolder{data_column->getDataAt(offset + i), *arena}, it, inserted);
else
{
const char * begin = nullptr;
StringRef serialized = data_column->serializeValueIntoArena(offset + i, *arena, begin);
assert(serialized.data != nullptr);
set.emplace(SerializedKeyHolder{serialized, *arena}, it, inserted);
}
}
}
else if (!set.empty())
{
typename State::Set new_set;
for (size_t i = 0; i < arr_size; ++i)
{
if constexpr (is_plain_column)
{
it = set.find(data_column->getDataAt(offset + i));
if (it != nullptr)
new_set.emplace(ArenaKeyHolder{data_column->getDataAt(offset + i), *arena}, it, inserted);
}
else
{
const char * begin = nullptr;
StringRef serialized = data_column->serializeValueIntoArena(offset + i, *arena, begin);
assert(serialized.data != nullptr);
it = set.find(serialized);
if (it != nullptr)
new_set.emplace(SerializedKeyHolder{serialized, *arena}, it, inserted);
}
}
set = std::move(new_set);
}
}
void merge(AggregateDataPtr __restrict place, ConstAggregateDataPtr rhs, Arena * arena) const override
{
auto & set = this->data(place).value;
const auto & rhs_value = this->data(rhs).value;
if (this->data(rhs).version == 0)
return;
UInt64 version = this->data(place).version++;
if (version == 0)
{
bool inserted;
State::Set::LookupResult it;
for (auto & rhs_elem : rhs_value)
{
set.emplace(ArenaKeyHolder{rhs_elem.getValue(), *arena}, it, inserted);
}
}
else if (!set.empty())
{
auto create_new_map = [](auto & lhs_val, auto & rhs_val)
{
typename State::Set new_map;
for (auto & lhs_elem : lhs_val)
{
auto val = rhs_val.find(lhs_elem.getValue());
if (val != nullptr)
new_map.insert(lhs_elem.getValue());
}
return new_map;
};
auto new_map = rhs_value.size() < set.size() ? create_new_map(rhs_value, set) : create_new_map(set, rhs_value);
set = std::move(new_map);
}
}
void serialize(ConstAggregateDataPtr __restrict place, WriteBuffer & buf, std::optional<size_t> /* version */) const override
{
auto & set = this->data(place).value;
auto & version = this->data(place).version;
writeVarUInt(version, buf);
writeVarUInt(set.size(), buf);
for (const auto & elem : set)
writeStringBinary(elem.getValue(), buf);
}
void deserialize(AggregateDataPtr __restrict place, ReadBuffer & buf, std::optional<size_t> /* version */, Arena * arena) const override
{
auto & set = this->data(place).value;
auto & version = this->data(place).version;
size_t size;
readVarUInt(version, buf);
readVarUInt(size, buf);
set.reserve(size);
for (size_t i = 0; i < size; ++i)
{
auto key = readStringBinaryInto(*arena, buf);
set.insert(key);
}
}
void insertResultInto(AggregateDataPtr __restrict place, IColumn & to, Arena *) const override
{
ColumnArray & arr_to = assert_cast<ColumnArray &>(to);
ColumnArray::Offsets & offsets_to = arr_to.getOffsets();
IColumn & data_to = arr_to.getData();
auto & set = this->data(place).value;
offsets_to.push_back(offsets_to.back() + set.size());
for (auto & elem : set)
{
if constexpr (is_plain_column)
data_to.insertData(elem.getValue().data, elem.getValue().size);
else
std::ignore = data_to.deserializeAndInsertFromArena(elem.getValue().data);
}
}
};
namespace
{
/// Substitute return type for Date and DateTime
class AggregateFunctionGroupArrayIntersectDate : public AggregateFunctionGroupArrayIntersect<DataTypeDate::FieldType>
{
public:
explicit AggregateFunctionGroupArrayIntersectDate(const DataTypePtr & argument_type, const Array & parameters_)
: AggregateFunctionGroupArrayIntersect<DataTypeDate::FieldType>(argument_type, parameters_, createResultType()) {}
static DataTypePtr createResultType() { return std::make_shared<DataTypeArray>(std::make_shared<DataTypeDate>()); }
};
class AggregateFunctionGroupArrayIntersectDateTime : public AggregateFunctionGroupArrayIntersect<DataTypeDateTime::FieldType>
{
public:
explicit AggregateFunctionGroupArrayIntersectDateTime(const DataTypePtr & argument_type, const Array & parameters_)
: AggregateFunctionGroupArrayIntersect<DataTypeDateTime::FieldType>(argument_type, parameters_, createResultType()) {}
static DataTypePtr createResultType() { return std::make_shared<DataTypeArray>(std::make_shared<DataTypeDateTime>()); }
};
class AggregateFunctionGroupArrayIntersectDate32 : public AggregateFunctionGroupArrayIntersect<DataTypeDate32::FieldType>
{
public:
explicit AggregateFunctionGroupArrayIntersectDate32(const DataTypePtr & argument_type, const Array & parameters_)
: AggregateFunctionGroupArrayIntersect<DataTypeDate32::FieldType>(argument_type, parameters_, createResultType()) {}
static DataTypePtr createResultType() { return std::make_shared<DataTypeArray>(std::make_shared<DataTypeDate32>()); }
};
IAggregateFunction * createWithExtraTypes(const DataTypePtr & argument_type, const Array & parameters)
{
WhichDataType which(argument_type);
if (which.idx == TypeIndex::Date) return new AggregateFunctionGroupArrayIntersectDate(argument_type, parameters);
else if (which.idx == TypeIndex::DateTime) return new AggregateFunctionGroupArrayIntersectDateTime(argument_type, parameters);
else if (which.idx == TypeIndex::Date32) return new AggregateFunctionGroupArrayIntersectDate32(argument_type, parameters);
else if (which.idx == TypeIndex::DateTime64)
{
const auto * datetime64_type = dynamic_cast<const DataTypeDateTime64 *>(argument_type.get());
const auto return_type = std::make_shared<DataTypeArray>(std::make_shared<DataTypeDateTime64>(datetime64_type->getScale()));
return new AggregateFunctionGroupArrayIntersectGeneric<true>(argument_type, parameters, return_type);
}
else
{
/// Check that we can use plain version of AggregateFunctionGroupArrayIntersectGeneric
if (argument_type->isValueUnambiguouslyRepresentedInContiguousMemoryRegion())
return new AggregateFunctionGroupArrayIntersectGeneric<true>(argument_type, parameters);
else
return new AggregateFunctionGroupArrayIntersectGeneric<false>(argument_type, parameters);
}
}
inline AggregateFunctionPtr createAggregateFunctionGroupArrayIntersectImpl(const std::string & name, const DataTypePtr & argument_type, const Array & parameters)
{
const auto & nested_type = dynamic_cast<const DataTypeArray &>(*argument_type).getNestedType();
AggregateFunctionPtr res(createWithNumericType<AggregateFunctionGroupArrayIntersect, const DataTypePtr &>(*nested_type, argument_type, parameters));
if (!res)
{
res = AggregateFunctionPtr(createWithExtraTypes(argument_type, parameters));
}
if (!res)
throw Exception(ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT, "Illegal type {} of argument for aggregate function {}",
argument_type->getName(), name);
return res;
}
AggregateFunctionPtr createAggregateFunctionGroupArrayIntersect(
const std::string & name, const DataTypes & argument_types, const Array & parameters, const Settings *)
{
assertUnary(name, argument_types);
if (!WhichDataType(argument_types.at(0)).isArray())
throw Exception(ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT, "Aggregate function groupArrayIntersect accepts only array type argument.");
if (!parameters.empty())
throw Exception(ErrorCodes::NUMBER_OF_ARGUMENTS_DOESNT_MATCH,
"Incorrect number of parameters for aggregate function {}, should be 0", name);
return createAggregateFunctionGroupArrayIntersectImpl(name, argument_types[0], parameters);
}
}
void registerAggregateFunctionGroupArrayIntersect(AggregateFunctionFactory & factory)
{
AggregateFunctionProperties properties = { .returns_default_when_only_null = false, .is_order_dependent = true };
factory.registerFunction("groupArrayIntersect", { createAggregateFunctionGroupArrayIntersect, properties });
}
}

View File

@ -18,6 +18,7 @@ void registerAggregateFunctionGroupArray(AggregateFunctionFactory &);
void registerAggregateFunctionGroupArraySorted(AggregateFunctionFactory & factory);
void registerAggregateFunctionGroupUniqArray(AggregateFunctionFactory &);
void registerAggregateFunctionGroupArrayInsertAt(AggregateFunctionFactory &);
void registerAggregateFunctionGroupArrayIntersect(AggregateFunctionFactory &);
void registerAggregateFunctionsQuantile(AggregateFunctionFactory &);
void registerAggregateFunctionsQuantileDeterministic(AggregateFunctionFactory &);
void registerAggregateFunctionsQuantileExact(AggregateFunctionFactory &);
@ -116,6 +117,7 @@ void registerAggregateFunctions()
registerAggregateFunctionGroupArraySorted(factory);
registerAggregateFunctionGroupUniqArray(factory);
registerAggregateFunctionGroupArrayInsertAt(factory);
registerAggregateFunctionGroupArrayIntersect(factory);
registerAggregateFunctionsQuantile(factory);
registerAggregateFunctionsQuantileDeterministic(factory);
registerAggregateFunctionsQuantileExact(factory);

View File

@ -388,7 +388,7 @@ QueryTreeNodes extractAllTableReferences(const QueryTreeNodePtr & tree)
return result;
}
QueryTreeNodes extractTableExpressions(const QueryTreeNodePtr & join_tree_node, bool add_array_join)
QueryTreeNodes extractTableExpressions(const QueryTreeNodePtr & join_tree_node, bool add_array_join, bool recursive)
{
QueryTreeNodes result;
@ -406,15 +406,28 @@ QueryTreeNodes extractTableExpressions(const QueryTreeNodePtr & join_tree_node,
{
case QueryTreeNodeType::TABLE:
[[fallthrough]];
case QueryTreeNodeType::QUERY:
[[fallthrough]];
case QueryTreeNodeType::UNION:
[[fallthrough]];
case QueryTreeNodeType::TABLE_FUNCTION:
{
result.push_back(std::move(node_to_process));
break;
}
case QueryTreeNodeType::QUERY:
{
if (recursive)
nodes_to_process.push_back(node_to_process->as<QueryNode>()->getJoinTree());
result.push_back(std::move(node_to_process));
break;
}
case QueryTreeNodeType::UNION:
{
if (recursive)
{
for (const auto & union_node : node_to_process->as<UnionNode>()->getQueries().getNodes())
nodes_to_process.push_back(union_node);
}
result.push_back(std::move(node_to_process));
break;
}
case QueryTreeNodeType::ARRAY_JOIN:
{
auto & array_join_node = node_to_process->as<ArrayJoinNode &>();

View File

@ -54,7 +54,7 @@ void addTableExpressionOrJoinIntoTablesInSelectQuery(ASTPtr & tables_in_select_q
QueryTreeNodes extractAllTableReferences(const QueryTreeNodePtr & tree);
/// Extract table, table function, query, union from join tree.
QueryTreeNodes extractTableExpressions(const QueryTreeNodePtr & join_tree_node, bool add_array_join = false);
QueryTreeNodes extractTableExpressions(const QueryTreeNodePtr & join_tree_node, bool add_array_join = false, bool recursive = false);
/// Extract left table expression from join tree.
QueryTreeNodePtr extractLeftTableExpression(const QueryTreeNodePtr & join_tree_node);

View File

@ -32,8 +32,6 @@ namespace ErrorCodes
M(UInt64, shard_num) \
M(UInt64, replica_num) \
M(Bool, check_parts) \
M(Bool, check_projection_parts) \
M(Bool, allow_backup_broken_projections) \
M(Bool, internal) \
M(String, host_id) \
M(OptionalUUID, backup_uuid)

View File

@ -62,12 +62,6 @@ struct BackupSettings
/// Check checksums of the data parts before writing them to a backup.
bool check_parts = true;
/// Check checksums of the projection data parts before writing them to a backup.
bool check_projection_parts = true;
/// Allow to create backup with broken projections.
bool allow_backup_broken_projections = false;
/// Internal, should not be specified by user.
/// Whether this backup is a part of a distributed backup created by BACKUP ON CLUSTER.
bool internal = false;

View File

@ -4,13 +4,10 @@
#include <Common/ProfileEvents.h>
#include <Common/thread_local_rng.h>
#include <Common/logger_useful.h>
#include <Core/Names.h>
#include <base/types.h>
#include <Poco/Net/IPAddress.h>
#include <Poco/Net/DNS.h>
#include <Poco/Net/NetException.h>
#include <Poco/NumberParser.h>
#include <arpa/inet.h>
#include <atomic>
#include <optional>
#include <string_view>
@ -141,10 +138,10 @@ DNSResolver::IPAddresses resolveIPAddressImpl(const std::string & host)
return addresses;
}
DNSResolver::IPAddresses resolveIPAddressWithCache(CacheBase<std::string, DNSResolver::IPAddresses> & cache, const std::string & host)
DNSResolver::IPAddresses resolveIPAddressWithCache(CacheBase<std::string, DNSResolver::CacheEntry> & cache, const std::string & host)
{
auto [result, _ ] = cache.getOrSet(host, [&host]() { return std::make_shared<DNSResolver::IPAddresses>(resolveIPAddressImpl(host)); });
return *result;
auto [result, _ ] = cache.getOrSet(host, [&host]() {return std::make_shared<DNSResolver::CacheEntry>(resolveIPAddressImpl(host), std::chrono::system_clock::now());});
return result->addresses;
}
std::unordered_set<String> reverseResolveImpl(const Poco::Net::IPAddress & address)
@ -179,8 +176,8 @@ struct DNSResolver::Impl
using HostWithConsecutiveFailures = std::unordered_map<String, UInt32>;
using AddressWithConsecutiveFailures = std::unordered_map<Poco::Net::IPAddress, UInt32>;
CacheBase<std::string, DNSResolver::IPAddresses> cache_host{100};
CacheBase<Poco::Net::IPAddress, std::unordered_set<std::string>> cache_address{100};
CacheBase<std::string, DNSResolver::CacheEntry> cache_host{1024};
CacheBase<Poco::Net::IPAddress, std::unordered_set<std::string>> cache_address{1024};
std::mutex drop_mutex;
std::mutex update_mutex;
@ -292,6 +289,12 @@ void DNSResolver::setDisableCacheFlag(bool is_disabled)
impl->disable_cache = is_disabled;
}
void DNSResolver::setCacheMaxSize(const UInt64 cache_max_size)
{
impl->cache_address.setMaxSizeInBytes(cache_max_size);
impl->cache_host.setMaxSizeInBytes(cache_max_size);
}
String DNSResolver::getHostName()
{
if (impl->disable_cache)
@ -411,7 +414,7 @@ bool DNSResolver::updateHost(const String & host)
const auto old_value = resolveIPAddressWithCache(impl->cache_host, host);
auto new_value = resolveIPAddressImpl(host);
const bool result = old_value != new_value;
impl->cache_host.set(host, std::make_shared<DNSResolver::IPAddresses>(std::move(new_value)));
impl->cache_host.set(host, std::make_shared<DNSResolver::CacheEntry>(std::move(new_value), std::chrono::system_clock::now()));
return result;
}
@ -438,6 +441,19 @@ void DNSResolver::addToNewAddresses(const Poco::Net::IPAddress & address)
impl->new_addresses.insert({address, consecutive_failures});
}
std::vector<std::pair<std::string, DNSResolver::CacheEntry>> DNSResolver::cacheEntries() const
{
std::lock_guard lock(impl->drop_mutex);
std::vector<std::pair<std::string, DNSResolver::CacheEntry>> entries;
for (auto & [key, entry] : impl->cache_host.dump())
{
entries.emplace_back(std::move(key), *entry);
}
return entries;
}
DNSResolver::~DNSResolver() = default;
DNSResolver & DNSResolver::instance()

View File

@ -20,7 +20,11 @@ class DNSResolver : private boost::noncopyable
{
public:
using IPAddresses = std::vector<Poco::Net::IPAddress>;
using IPAddressesPtr = std::shared_ptr<IPAddresses>;
using CacheEntry = struct
{
IPAddresses addresses;
std::chrono::system_clock::time_point cached_at;
};
static DNSResolver & instance();
@ -48,6 +52,9 @@ public:
/// Disables caching
void setDisableCacheFlag(bool is_disabled = true);
/// Set a limit of cache size in bytes
void setCacheMaxSize(const UInt64 cache_max_size);
/// Drops all caches
void dropCache();
@ -58,6 +65,9 @@ public:
/// Returns true if IP of any host has been changed or an element was dropped (too many failures)
bool updateCache(UInt32 max_consecutive_failures);
/// Returns a copy of cache entries
std::vector<std::pair<std::string, CacheEntry>> cacheEntries() const;
~DNSResolver();
private:

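The DNS cache now stores a CacheEntry (resolved addresses plus the time the entry was cached) instead of a bare address list, and exposes it through cacheEntries() together with the new setCacheMaxSize(). A minimal sketch of how a caller could consume this API, assuming only the members introduced above (the dumpDnsCache helper itself is hypothetical, not part of the patch):

#include <Common/DNSResolver.h>
#include <chrono>
#include <iostream>

void dumpDnsCache()
{
    auto & resolver = DB::DNSResolver::instance();

    /// Matches the default of the new dns_cache_max_size server setting.
    resolver.setCacheMaxSize(1024);

    const auto now = std::chrono::system_clock::now();
    for (const auto & [host, entry] : resolver.cacheEntries())
    {
        const auto age_seconds = std::chrono::duration_cast<std::chrono::seconds>(now - entry.cached_at).count();
        std::cout << host << " (cached " << age_seconds << "s ago):";
        for (const auto & address : entry.addresses)
            std::cout << ' ' << address.toString();
        std::cout << '\n';
    }
}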
View File

@ -592,7 +592,6 @@
M(710, FAULT_INJECTED) \
M(711, FILECACHE_ACCESS_DENIED) \
M(712, TOO_MANY_MATERIALIZED_VIEWS) \
M(713, BROKEN_PROJECTION) \
M(714, UNEXPECTED_CLUSTER) \
M(715, CANNOT_DETECT_FORMAT) \
M(716, CANNOT_FORGET_PARTITION) \

View File

@ -36,7 +36,6 @@ static constexpr auto DEFAULT_BLOCK_SIZE
static constexpr auto DEFAULT_INSERT_BLOCK_SIZE
= 1048449; /// 1048576 - PADDING_FOR_SIMD - (PADDING_FOR_SIMD - 1) bytes padding that we usually have in arrays
static constexpr auto DEFAULT_PERIODIC_LIVE_VIEW_REFRESH_SEC = 60;
static constexpr auto SHOW_CHARS_ON_SYNTAX_ERROR = ptrdiff_t(160);
/// each period reduces the error counter by 2 times
/// too short a period can cause errors to disappear immediately after creation.

View File

@ -79,8 +79,9 @@ namespace DB
M(Double, index_mark_cache_size_ratio, DEFAULT_INDEX_MARK_CACHE_SIZE_RATIO, "The size of the protected queue in the secondary index mark cache relative to the cache's total size.", 0) \
M(UInt64, mmap_cache_size, DEFAULT_MMAP_CACHE_MAX_SIZE, "A cache for mmapped files.", 0) \
\
M(Bool, disable_internal_dns_cache, false, "Disable internal DNS caching at all.", 0) \
M(Int32, dns_cache_update_period, 15, "Internal DNS cache update period in seconds.", 0) \
M(Bool, disable_internal_dns_cache, false, "Disable internal DNS caching at all.", 0) \
M(UInt64, dns_cache_max_size, 1024, "Internal DNS cache max size in bytes.", 0) \
M(Int32, dns_cache_update_period, 15, "Internal DNS cache update period in seconds.", 0) \
M(UInt32, dns_max_consecutive_failures, 10, "Max DNS resolve failures of a hostname before dropping the hostname from ClickHouse DNS cache.", 0) \
\
M(UInt64, max_table_size_to_drop, 50000000000lu, "If size of a table is greater than this value (in bytes) than table could not be dropped with any DROP query.", 0) \

View File

@ -224,8 +224,8 @@ class IColumn;
M(Bool, allow_experimental_inverted_index, false, "If it is set to true, allow to use experimental inverted index.", 0) \
\
M(UInt64, mysql_max_rows_to_insert, 65536, "The maximum number of rows in MySQL batch insertion of the MySQL storage engine", 0) \
M(Bool, mysql_map_string_to_text_in_show_columns, false, "If enabled, String type will be mapped to TEXT in SHOW [FULL] COLUMNS, BLOB otherwise.", 0) \
M(Bool, mysql_map_fixed_string_to_text_in_show_columns, false, "If enabled, FixedString type will be mapped to TEXT in SHOW [FULL] COLUMNS, BLOB otherwise.", 0) \
M(Bool, mysql_map_string_to_text_in_show_columns, true, "If enabled, String type will be mapped to TEXT in SHOW [FULL] COLUMNS, BLOB otherwise. Has an effect only when the connection is made through the MySQL wire protocol.", 0) \
M(Bool, mysql_map_fixed_string_to_text_in_show_columns, true, "If enabled, FixedString type will be mapped to TEXT in SHOW [FULL] COLUMNS, BLOB otherwise. Has an effect only when the connection is made through the MySQL wire protocol.", 0) \
\
M(UInt64, optimize_min_equality_disjunction_chain_length, 3, "The minimum length of the expression `expr = x1 OR ... expr = xN` for optimization ", 0) \
M(UInt64, optimize_min_inequality_conjunction_chain_length, 3, "The minimum length of the expression `expr <> x1 AND ... expr <> xN` for optimization ", 0) \
@ -604,7 +604,7 @@ class IColumn;
M(Bool, validate_polygons, true, "Throw exception if polygon is invalid in function pointInPolygon (e.g. self-tangent, self-intersecting). If the setting is false, the function will accept invalid polygons but may silently return wrong result.", 0) \
M(UInt64, max_parser_depth, DBMS_DEFAULT_MAX_PARSER_DEPTH, "Maximum parser depth (recursion depth of recursive descend parser).", 0) \
M(Bool, allow_settings_after_format_in_insert, false, "Allow SETTINGS after FORMAT, but note, that this is not always safe (note: this is a compatibility setting).", 0) \
M(Seconds, periodic_live_view_refresh, DEFAULT_PERIODIC_LIVE_VIEW_REFRESH_SEC, "Interval after which periodically refreshed live view is forced to refresh.", 0) \
M(Seconds, periodic_live_view_refresh, 60, "Interval after which periodically refreshed live view is forced to refresh.", 0) \
M(Bool, transform_null_in, false, "If enabled, NULL values will be matched with 'IN' operator as if they are considered equal.", 0) \
M(Bool, allow_nondeterministic_mutations, false, "Allow non-deterministic functions in ALTER UPDATE/ALTER DELETE statements", 0) \
M(Seconds, lock_acquire_timeout, DBMS_DEFAULT_LOCK_ACQUIRE_TIMEOUT_SEC, "How long locking request should wait before failing", 0) \
@ -719,6 +719,7 @@ class IColumn;
M(Bool, query_plan_split_filter, true, "Allow to split filters in the query plan", 0) \
M(Bool, query_plan_merge_expressions, true, "Allow to merge expressions in the query plan", 0) \
M(Bool, query_plan_filter_push_down, true, "Allow to push down filter by predicate query plan step", 0) \
M(Bool, query_plan_optimize_prewhere, true, "Allow to push down filter to PREWHERE expression for supported storages", 0) \
M(Bool, query_plan_execute_functions_after_sorting, true, "Allow to re-order functions after sorting", 0) \
M(Bool, query_plan_reuse_storage_ordering_for_window_functions, true, "Allow to use the storage sorting for window functions", 0) \
M(Bool, query_plan_lift_up_union, true, "Allow to move UNIONs up so that more parts of the query plan can be optimized", 0) \
@ -841,6 +842,7 @@ class IColumn;
M(Bool, use_with_fill_by_sorting_prefix, true, "Columns preceding WITH FILL columns in ORDER BY clause form sorting prefix. Rows with different values in sorting prefix are filled independently", 0) \
M(Bool, optimize_uniq_to_count, true, "Rewrite uniq and its variants(except uniqUpTo) to count if subquery has distinct or group by clause.", 0) \
M(Bool, use_variant_as_common_type, false, "Use Variant as a result type for if/multiIf in case when there is no common type for arguments", 0) \
M(Bool, enable_order_by_all, true, "Enable sorting expression ORDER BY ALL.", 0) \
\
/** Experimental functions */ \
M(Bool, allow_experimental_materialized_postgresql_table, false, "Allows to use the MaterializedPostgreSQL table engine. Disabled by default, because this feature is experimental", 0) \
@ -872,7 +874,6 @@ class IColumn;
M(UInt64, cache_warmer_threads, 4, "Only available in ClickHouse Cloud", 0) \
M(Int64, ignore_cold_parts_seconds, 0, "Only available in ClickHouse Cloud", 0) \
M(Int64, prefer_warmed_unmerged_parts_seconds, 0, "Only available in ClickHouse Cloud", 0) \
M(Bool, enable_order_by_all, true, "Enable sorting expression ORDER BY ALL.", 0) \
M(Bool, iceberg_engine_ignore_schema_evolution, false, "Ignore schema evolution in Iceberg table engine and read all data using latest schema saved on table creation. Note that it can lead to incorrect result", 0) \
// End of COMMON_SETTINGS

View File

@ -78,7 +78,8 @@ namespace SettingsChangesHistory
/// History of settings changes that controls some backward incompatible changes
/// across all ClickHouse versions. It maps ClickHouse version to settings changes that were done
/// in this version. This history contains both changes to existing settings and newly added settings.
/// Settings changes is a vector of structs {setting_name, previous_value, new_value}.
/// Settings changes is a vector of structs
/// {setting_name, previous_value, new_value, reason}.
/// For newly added setting choose the most appropriate previous_value (for example, if new setting
/// controls new feature and it's 'true' by default, use 'false' as previous_value).
/// It's used to implement `compatibility` setting (see https://github.com/ClickHouse/ClickHouse/issues/35972)
@ -87,6 +88,7 @@ static std::map<ClickHouseVersion, SettingsChangesHistory::SettingsChanges> sett
{"24.2", {
{"output_format_values_escape_quote_with_quote", false, false, "If true escape ' with '', otherwise quoted with \\'"},
{"input_format_try_infer_exponent_floats", true, false, "Don't infer floats in exponential notation by default"},
{"query_plan_optimize_prewhere", true, true, "Allow to push down filter to PREWHERE expression for supported storages"},
{"async_insert_max_data_size", 1000000, 10485760, "The previous value appeared to be too small."},
{"async_insert_poll_timeout_ms", 10, 10, "Timeout in milliseconds for polling data from asynchronous insert queue"},
{"async_insert_use_adaptive_busy_timeout", true, true, "Use adaptive asynchronous insert timeout"},
@ -103,6 +105,8 @@ static std::map<ClickHouseVersion, SettingsChangesHistory::SettingsChanges> sett
{"min_external_table_block_size_bytes", DEFAULT_INSERT_BLOCK_SIZE * 256, DEFAULT_INSERT_BLOCK_SIZE * 256, "Squash blocks passed to external table to specified size in bytes, if blocks are not big enough."},
{"parallel_replicas_prefer_local_join", true, true, "If true, and JOIN can be executed with parallel replicas algorithm, and all storages of right JOIN part are *MergeTree, local JOIN will be used instead of GLOBAL JOIN."},
{"extract_key_value_pairs_max_pairs_per_row", 0, 0, "Max number of pairs that can be produced by the `extractKeyValuePairs` function. Used as a safeguard against consuming too much memory."},
{"mysql_map_string_to_text_in_show_columns", false, true, "Reduce the configuration effort to connect ClickHouse with BI tools."},
{"mysql_map_fixed_string_to_text_in_show_columns", false, true, "Reduce the configuration effort to connect ClickHouse with BI tools."},
}},
{"24.1", {{"print_pretty_type_names", false, true, "Better user experience."},
{"input_format_json_read_bools_as_strings", false, true, "Allow to read bools as strings in JSON formats by default"},

View File

@ -92,6 +92,8 @@ protected:
const String name;
public:
/// Volume priority. Maximum UInt64 value by default (lowest possible priority)
UInt64 volume_priority;
/// Max size of reservation, zero means unlimited size
UInt64 max_data_part_size = 0;
/// Should a new data part be synchronously moved to a volume according to ttl on insert

View File

@ -28,6 +28,7 @@ namespace ErrorCodes
extern const int BAD_ARGUMENTS;
extern const int EXCESSIVE_ELEMENT_IN_CONFIG;
extern const int NO_ELEMENTS_IN_CONFIG;
extern const int INVALID_CONFIG_PARAMETER;
extern const int UNKNOWN_POLICY;
extern const int UNKNOWN_VOLUME;
extern const int LOGICAL_ERROR;
@ -56,6 +57,8 @@ StoragePolicy::StoragePolicy(
config.keys(volumes_prefix, keys);
}
std::set<UInt64> volume_priorities;
for (const auto & attr_name : keys)
{
if (!std::all_of(attr_name.begin(), attr_name.end(), isWordCharASCII))
@ -63,6 +66,27 @@ StoragePolicy::StoragePolicy(
"Volume name can contain only alphanumeric and '_' in storage policy {} ({})",
backQuote(name), attr_name);
volumes.emplace_back(createVolumeFromConfig(attr_name, config, volumes_prefix + "." + attr_name, disks));
UInt64 last_priority = volumes.back()->volume_priority;
if (last_priority != std::numeric_limits<UInt64>::max() && !volume_priorities.insert(last_priority).second)
{
throw Exception(
ErrorCodes::INVALID_CONFIG_PARAMETER,
"volume_priority values must be unique across the policy");
}
}
if (!volume_priorities.empty())
{
/// Check that priority values cover the range from 1 to N (lowest explicit priority)
if (*volume_priorities.begin() != 1 || *volume_priorities.rbegin() != volume_priorities.size())
throw Exception(
ErrorCodes::INVALID_CONFIG_PARAMETER,
"volume_priority values must cover the range from 1 to N (lowest priority specified) without gaps");
std::stable_sort(
volumes.begin(), volumes.end(),
[](const VolumePtr a, const VolumePtr b) { return a->volume_priority < b->volume_priority; });
}
if (volumes.empty() && name == DEFAULT_STORAGE_POLICY_NAME)

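The validation above enforces that explicit volume_priority values are unique and cover a contiguous range from 1 to N. A self-contained sketch of that rule (illustrative only, not ClickHouse code), handy for checking a storage policy by hand:

#include <cstdint>
#include <iostream>
#include <set>

/// Mirrors the check above: explicit priorities must start at 1 and end at N.
/// Uniqueness is implied by std::set here; the real code rejects duplicates on insertion.
static bool prioritiesAreContiguous(const std::set<uint64_t> & priorities)
{
    if (priorities.empty())
        return true; /// no explicit priorities: volumes keep their configuration order
    return *priorities.begin() == 1 && *priorities.rbegin() == priorities.size();
}

int main()
{
    std::cout << prioritiesAreContiguous({1, 2, 3}) << '\n'; /// 1 - accepted
    std::cout << prioritiesAreContiguous({1, 3}) << '\n';    /// 0 - rejected, gap at 2
    std::cout << prioritiesAreContiguous({2, 3}) << '\n';    /// 0 - rejected, does not start at 1
}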
View File

@ -23,6 +23,8 @@ VolumeJBOD::VolumeJBOD(
{
LoggerPtr logger = getLogger("StorageConfiguration");
volume_priority = config.getUInt64(config_prefix + ".volume_priority", std::numeric_limits<UInt64>::max());
auto has_max_bytes = config.has(config_prefix + ".max_data_part_size_bytes");
auto has_max_ratio = config.has(config_prefix + ".max_data_part_size_ratio");
if (has_max_bytes && has_max_ratio)

View File

@ -45,7 +45,7 @@ public:
bool isVariadic() const override { return true; }
size_t getNumberOfArguments() const override { return 0; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return false; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return true; }
DataTypePtr getReturnTypeImpl(const ColumnsWithTypeAndName & arguments) const override
{

View File

@ -29,18 +29,6 @@ struct AsynchronousMetricLogElement
static ColumnsDescription getColumnsDescription();
static NamesAndAliases getNamesAndAliases() { return {}; }
void appendToBlock(MutableColumns & columns) const;
/// Returns the list of columns as in CREATE TABLE statement or nullptr.
/// If it's not nullptr, this list of columns will be used to create the table.
/// Otherwise the list will be constructed from LogElement::getNamesAndTypes and LogElement::getNamesAndAliases.
static const char * getCustomColumnList()
{
return "hostname LowCardinality(String) CODEC(ZSTD(1)), "
"event_date Date CODEC(Delta(2), ZSTD(1)), "
"event_time DateTime CODEC(Delta(4), ZSTD(1)), "
"metric LowCardinality(String) CODEC(ZSTD(1)), "
"value Float64 CODEC(ZSTD(3))";
}
};
class AsynchronousMetricLog : public SystemLog<AsynchronousMetricLogElement>

View File

@ -29,6 +29,7 @@
#include <Storages/MergeTree/ReplicatedFetchList.h>
#include <Storages/MergeTree/MergeTreeData.h>
#include <Storages/MergeTree/MergeTreeSettings.h>
#include <Storages/Distributed/DistributedSettings.h>
#include <Storages/CompressionCodecSelector.h>
#include <Storages/StorageS3Settings.h>
#include <Disks/DiskLocal.h>
@ -112,6 +113,7 @@
#include <Parsers/FunctionParameterValuesVisitor.h>
#include <Parsers/ASTSelectWithUnionQuery.h>
#include <Interpreters/InterpreterSelectWithUnionQuery.h>
#include <base/defines.h>
namespace fs = std::filesystem;
@ -353,6 +355,7 @@ struct ContextSharedPart : boost::noncopyable
std::optional<MergeTreeSettings> merge_tree_settings TSA_GUARDED_BY(mutex); /// Settings of MergeTree* engines.
std::optional<MergeTreeSettings> replicated_merge_tree_settings TSA_GUARDED_BY(mutex); /// Settings of ReplicatedMergeTree* engines.
std::optional<DistributedSettings> distributed_settings TSA_GUARDED_BY(mutex);
std::atomic_size_t max_table_size_to_drop = 50000000000lu; /// Protects MergeTree tables from accidental DROP (50GB by default)
std::atomic_size_t max_partition_size_to_drop = 50000000000lu; /// Protects MergeTree partitions from accidental DROP (50GB by default)
/// No lock required for format_schema_path modified only during initialization
@ -4118,6 +4121,21 @@ const MergeTreeSettings & Context::getReplicatedMergeTreeSettings() const
return *shared->replicated_merge_tree_settings;
}
const DistributedSettings & Context::getDistributedSettings() const
{
std::lock_guard lock(shared->mutex);
if (!shared->distributed_settings)
{
const auto & config = shared->getConfigRefWithLock(lock);
DistributedSettings distributed_settings;
distributed_settings.loadFromConfig("distributed", config);
shared->distributed_settings.emplace(distributed_settings);
}
return *shared->distributed_settings;
}
const StorageS3Settings & Context::getStorageS3Settings() const
{
std::lock_guard lock(shared->mutex);

View File

@ -113,6 +113,7 @@ class BlobStorageLog;
class IAsynchronousReader;
class IOUringReader;
struct MergeTreeSettings;
struct DistributedSettings;
struct InitialAllRangesAnnouncement;
struct ParallelReadRequest;
struct ParallelReadResponse;
@ -1075,6 +1076,7 @@ public:
const MergeTreeSettings & getMergeTreeSettings() const;
const MergeTreeSettings & getReplicatedMergeTreeSettings() const;
const DistributedSettings & getDistributedSettings() const;
const StorageS3Settings & getStorageS3Settings() const;
/// Prevents DROP TABLE if its size is greater than max_size (50GB by default, max_size=0 turn off this check)

View File

@ -38,7 +38,6 @@ struct FilesystemReadPrefetchesLogElement
static NamesAndAliases getNamesAndAliases() { return {}; }
void appendToBlock(MutableColumns & columns) const;
static const char * getCustomColumnList() { return nullptr; }
};
class FilesystemReadPrefetchesLog : public SystemLog<FilesystemReadPrefetchesLogElement>

View File

@ -60,8 +60,7 @@ BlockIO InterpreterAlterQuery::execute()
{
return executeToDatabase(alter);
}
else if (alter.alter_object == ASTAlterQuery::AlterObjectType::TABLE
|| alter.alter_object == ASTAlterQuery::AlterObjectType::LIVE_VIEW)
else if (alter.alter_object == ASTAlterQuery::AlterObjectType::TABLE)
{
return executeToTable(alter);
}
@ -467,11 +466,6 @@ AccessRightsElements InterpreterAlterQuery::getRequiredAccessForCommand(const AS
required_access.emplace_back(AccessType::ALTER_VIEW_MODIFY_REFRESH, database, table);
break;
}
case ASTAlterCommand::LIVE_VIEW_REFRESH:
{
required_access.emplace_back(AccessType::ALTER_VIEW_REFRESH, database, table);
break;
}
case ASTAlterCommand::RENAME_COLUMN:
{
required_access.emplace_back(AccessType::ALTER_RENAME_COLUMN, database, table, column_name());

View File

@ -76,6 +76,8 @@
#include <Storages/IStorage.h>
#include <Storages/MergeTree/MergeTreeWhereOptimizer.h>
#include <Storages/StorageDistributed.h>
#include <Storages/StorageDummy.h>
#include <Storages/StorageMerge.h>
#include <Storages/StorageValues.h>
#include <Storages/StorageView.h>
@ -224,8 +226,10 @@ InterpreterSelectQuery::InterpreterSelectQuery(
const StoragePtr & storage_,
const StorageMetadataPtr & metadata_snapshot_,
const SelectQueryOptions & options_)
: InterpreterSelectQuery(query_ptr_, context_, std::nullopt, storage_, options_.copy().noSubquery(), {}, metadata_snapshot_)
{}
: InterpreterSelectQuery(
query_ptr_, context_, std::nullopt, storage_, options_.copy().noSubquery(), {}, metadata_snapshot_)
{
}
InterpreterSelectQuery::InterpreterSelectQuery(
const ASTPtr & query_ptr_,
@ -618,7 +622,6 @@ InterpreterSelectQuery::InterpreterSelectQuery(
required_result_column_names,
table_join);
query_info.syntax_analyzer_result = syntax_analyzer_result;
context->setDistributed(syntax_analyzer_result->is_remote_storage);
@ -777,7 +780,8 @@ InterpreterSelectQuery::InterpreterSelectQuery(
result_header = getSampleBlockImpl();
};
analyze(shouldMoveToPrewhere());
/// Conditionally support AST-based PREWHERE optimization.
analyze(shouldMoveToPrewhere() && (!settings.query_plan_optimize_prewhere || !settings.query_plan_enable_optimizations));
bool need_analyze_again = false;
bool can_analyze_again = false;
@ -901,7 +905,24 @@ bool InterpreterSelectQuery::adjustParallelReplicasAfterAnalysis()
}
ActionDAGNodes added_filter_nodes = MergeTreeData::getFiltersForPrimaryKeyAnalysis(*this);
UInt64 rows_to_read = storage_merge_tree->estimateNumberOfRowsToRead(context, storage_snapshot, query_info_copy, added_filter_nodes);
if (query_info_copy.prewhere_info)
{
{
const auto & node
= query_info_copy.prewhere_info->prewhere_actions->findInOutputs(query_info_copy.prewhere_info->prewhere_column_name);
added_filter_nodes.nodes.push_back(&node);
}
if (query_info_copy.prewhere_info->row_level_filter)
{
const auto & node
= query_info_copy.prewhere_info->row_level_filter->findInOutputs(query_info_copy.prewhere_info->row_level_column_name);
added_filter_nodes.nodes.push_back(&node);
}
}
query_info_copy.filter_actions_dag = ActionsDAG::buildFilterActionsDAG(added_filter_nodes.nodes);
UInt64 rows_to_read = storage_merge_tree->estimateNumberOfRowsToRead(context, storage_snapshot, query_info_copy);
/// Note that we treat an estimation of 0 rows as a real estimation
size_t number_of_replicas_to_use = rows_to_read / settings.parallel_replicas_min_number_of_rows_per_replica;
LOG_TRACE(log, "Estimated {} rows to read. It is enough work for {} parallel replicas", rows_to_read, number_of_replicas_to_use);
@ -2336,6 +2357,49 @@ UInt64 InterpreterSelectQuery::maxBlockSizeByLimit() const
return 0;
}
/** Storages can rely on the filters relevant to them being available for analysis before
* the plan is fully constructed and optimized.
*
* StorageMerge common header calculation and prewhere push-down rely on this.
*
* This is similar to Planner::collectFiltersForAnalysis
*/
void collectFiltersForAnalysis(
const ASTPtr & query_ptr,
const ContextPtr & query_context,
const StorageSnapshotPtr & storage_snapshot,
const SelectQueryOptions & options,
SelectQueryInfo & query_info)
{
auto get_column_options = GetColumnsOptions(GetColumnsOptions::All).withExtendedObjects().withVirtuals();
auto dummy = std::make_shared<StorageDummy>(
storage_snapshot->storage.getStorageID(), ColumnsDescription(storage_snapshot->getColumns(get_column_options)), storage_snapshot);
QueryPlan query_plan;
InterpreterSelectQuery(query_ptr, query_context, dummy, dummy->getInMemoryMetadataPtr(), options).buildQueryPlan(query_plan);
auto optimization_settings = QueryPlanOptimizationSettings::fromContext(query_context);
query_plan.optimize(optimization_settings);
std::vector<QueryPlan::Node *> nodes_to_process;
nodes_to_process.push_back(query_plan.getRootNode());
while (!nodes_to_process.empty())
{
const auto * node_to_process = nodes_to_process.back();
nodes_to_process.pop_back();
nodes_to_process.insert(nodes_to_process.end(), node_to_process->children.begin(), node_to_process->children.end());
auto * read_from_dummy = typeid_cast<ReadFromDummy *>(node_to_process->step.get());
if (!read_from_dummy)
continue;
query_info.filter_actions_dag = read_from_dummy->getFilterActionsDAG();
query_info.optimized_prewhere_info = read_from_dummy->getPrewhereInfo();
}
}
void InterpreterSelectQuery::executeFetchColumns(QueryProcessingStage::Enum processing_stage, QueryPlan & query_plan)
{
auto & query = getSelectQuery();
@ -2462,6 +2526,10 @@ void InterpreterSelectQuery::executeFetchColumns(QueryProcessingStage::Enum proc
}
else if (storage)
{
if (shouldMoveToPrewhere() && settings.query_plan_optimize_prewhere && settings.query_plan_enable_optimizations
&& typeid_cast<const StorageMerge *>(storage.get()))
collectFiltersForAnalysis(query_ptr, context, storage_snapshot, options, query_info);
/// Table.
if (max_streams == 0)
max_streams = 1;

View File

@ -31,7 +31,6 @@ struct MetricLogElement
static ColumnsDescription getColumnsDescription();
static NamesAndAliases getNamesAndAliases() { return {}; }
void appendToBlock(MutableColumns & columns) const;
static const char * getCustomColumnList() { return nullptr; }
};

View File

@ -342,11 +342,6 @@ bool MutationsInterpreter::Source::hasProjection(const String & name) const
return part && part->hasProjection(name);
}
bool MutationsInterpreter::Source::hasBrokenProjection(const String & name) const
{
return part && part->hasBrokenProjection(name);
}
bool MutationsInterpreter::Source::isCompactPart() const
{
return part && part->getType() == MergeTreeDataPartType::Compact;
@ -812,7 +807,7 @@ void MutationsInterpreter::prepare(bool dry_run)
{
mutation_kind.set(MutationKind::MUTATE_INDEX_STATISTIC_PROJECTION);
const auto & projection = projections_desc.get(command.projection_name);
if (!source.hasProjection(projection.name) || source.hasBrokenProjection(projection.name))
if (!source.hasProjection(projection.name))
{
for (const auto & column : projection.required_columns)
dependencies.emplace(column, ColumnDependency::PROJECTION);
@ -999,13 +994,6 @@ void MutationsInterpreter::prepare(bool dry_run)
if (!source.hasProjection(projection.name))
continue;
/// Always rebuild broken projections.
if (source.hasBrokenProjection(projection.name))
{
materialized_projections.insert(projection.name);
continue;
}
if (need_rebuild_projections)
{
materialized_projections.insert(projection.name);

View File

@ -126,7 +126,6 @@ public:
bool materializeTTLRecalculateOnly() const;
bool hasSecondaryIndex(const String & name) const;
bool hasProjection(const String & name) const;
bool hasBrokenProjection(const String & name) const;
bool isCompactPart() const;
void read(

View File

@ -20,7 +20,6 @@ struct OpenTelemetrySpanLogElement : public OpenTelemetry::Span
static ColumnsDescription getColumnsDescription();
static NamesAndAliases getNamesAndAliases();
void appendToBlock(MutableColumns & columns) const;
static const char * getCustomColumnList() { return nullptr; }
};
// OpenTelemetry standardizes some Log data as well, so it's not just

View File

@ -96,7 +96,6 @@ struct PartLogElement
static ColumnsDescription getColumnsDescription();
static NamesAndAliases getNamesAndAliases();
void appendToBlock(MutableColumns & columns) const;
static const char * getCustomColumnList() { return nullptr; }
};
class IMergeTreeDataPart;

View File

@ -40,7 +40,6 @@ struct ProcessorProfileLogElement
static ColumnsDescription getColumnsDescription();
static NamesAndAliases getNamesAndAliases() { return {}; }
void appendToBlock(MutableColumns & columns) const;
static const char * getCustomColumnList() { return nullptr; }
};
class ProcessorsProfileLog : public SystemLog<ProcessorProfileLogElement>

View File

@ -106,7 +106,6 @@ struct QueryLogElement
static ColumnsDescription getColumnsDescription();
static NamesAndAliases getNamesAndAliases();
void appendToBlock(MutableColumns & columns) const;
static const char * getCustomColumnList() { return nullptr; }
static void appendClientInfo(const ClientInfo & client_info, MutableColumns & columns, size_t & i);
};

View File

@ -49,7 +49,6 @@ struct QueryThreadLogElement
static ColumnsDescription getColumnsDescription();
static NamesAndAliases getNamesAndAliases();
void appendToBlock(MutableColumns & columns) const;
static const char * getCustomColumnList() { return nullptr; }
};

View File

@ -81,7 +81,6 @@ struct QueryViewsLogElement
static ColumnsDescription getColumnsDescription();
static NamesAndAliases getNamesAndAliases();
void appendToBlock(MutableColumns & columns) const;
static const char * getCustomColumnList() { return nullptr; }
};

View File

@ -37,7 +37,6 @@ struct S3QueueLogElement
static NamesAndAliases getNamesAndAliases() { return {}; }
void appendToBlock(MutableColumns & columns) const;
static const char * getCustomColumnList() { return nullptr; }
};
class S3QueueLog : public SystemLog<S3QueueLogElement>

View File

@ -64,7 +64,6 @@ struct SessionLogElement
static NamesAndAliases getNamesAndAliases() { return {}; }
void appendToBlock(MutableColumns & columns) const;
static const char * getCustomColumnList() { return nullptr; }
};

View File

@ -15,7 +15,7 @@ struct StorageInMemoryMetadata;
using StorageMetadataPtr = std::shared_ptr<const StorageInMemoryMetadata>;
/// Optimizer that tries to replace columns to equal columns (according to constraints)
/// with lower size (accorsing to compressed and uncomressed size).
/// with lower size (according to compressed and uncomressed size).
class SubstituteColumnOptimizer
{
public:

View File

@ -34,7 +34,6 @@ struct TextLogElement
static ColumnsDescription getColumnsDescription();
static NamesAndAliases getNamesAndAliases() { return {}; }
void appendToBlock(MutableColumns & columns) const;
static const char * getCustomColumnList() { return nullptr; }
};
class TextLog : public SystemLog<TextLogElement>

View File

@ -41,7 +41,6 @@ struct TraceLogElement
static ColumnsDescription getColumnsDescription();
static NamesAndAliases getNamesAndAliases() { return {}; }
void appendToBlock(MutableColumns & columns) const;
static const char * getCustomColumnList() { return nullptr; }
};
class TraceLog : public SystemLog<TraceLogElement>

View File

@ -43,7 +43,6 @@ struct TransactionsInfoLogElement
static ColumnsDescription getColumnsDescription();
static NamesAndAliases getNamesAndAliases() { return {}; }
void appendToBlock(MutableColumns & columns) const;
static const char * getCustomColumnList() { return nullptr; }
void fillCommonFields(const TransactionInfoContext * context = nullptr);
};

View File

@ -72,7 +72,6 @@ struct ZooKeeperLogElement
static ColumnsDescription getColumnsDescription();
static NamesAndAliases getNamesAndAliases() { return {}; }
void appendToBlock(MutableColumns & columns) const;
static const char * getCustomColumnList() { return nullptr; }
};
class ZooKeeperLog : public SystemLog<ZooKeeperLogElement>

View File

@ -603,6 +603,9 @@ void logExceptionBeforeStart(
if (auto txn = context->getCurrentTransaction())
elem.tid = txn->tid;
if (settings.log_query_settings)
elem.query_settings = std::make_shared<Settings>(context->getSettingsRef());
if (settings.calculate_text_stack_trace)
setExceptionStackTrace(elem);
logException(context, elem);

View File

@ -98,23 +98,7 @@ Block getHeaderForProcessingStage(
case QueryProcessingStage::FetchColumns:
{
Block header = storage_snapshot->getSampleBlockForColumns(column_names);
if (query_info.prewhere_info)
{
auto & prewhere_info = *query_info.prewhere_info;
if (prewhere_info.row_level_filter)
{
header = prewhere_info.row_level_filter->updateHeader(std::move(header));
header.erase(prewhere_info.row_level_column_name);
}
if (prewhere_info.prewhere_actions)
header = prewhere_info.prewhere_actions->updateHeader(std::move(header));
if (prewhere_info.remove_prewhere_column)
header.erase(prewhere_info.prewhere_column_name);
}
header = SourceStepWithFilter::applyPrewhereActions(header, query_info.prewhere_info);
return header;
}
case QueryProcessingStage::WithMergeableState:
@ -153,7 +137,9 @@ Block getHeaderForProcessingStage(
if (context->getSettingsRef().allow_experimental_analyzer)
{
auto storage = std::make_shared<StorageDummy>(storage_snapshot->storage.getStorageID(), storage_snapshot->getAllColumnsDescription());
auto storage = std::make_shared<StorageDummy>(storage_snapshot->storage.getStorageID(),
storage_snapshot->getAllColumnsDescription(),
storage_snapshot);
InterpreterSelectQueryAnalyzer interpreter(query, context, storage, SelectQueryOptions(processed_stage).analyze());
result = interpreter.getSampleBlock();
}

View File

@ -466,10 +466,6 @@ void ASTAlterCommand::formatImpl(const FormatSettings & settings, FormatState &
<< (settings.hilite ? hilite_none : "");
refresh->formatImpl(settings, state, frame);
}
else if (type == ASTAlterCommand::LIVE_VIEW_REFRESH)
{
settings.ostr << (settings.hilite ? hilite_keyword : "") << "REFRESH " << (settings.hilite ? hilite_none : "");
}
else if (type == ASTAlterCommand::RENAME_COLUMN)
{
settings.ostr << (settings.hilite ? hilite_keyword : "") << "RENAME COLUMN " << (if_exists ? "IF EXISTS " : "")
@ -621,9 +617,6 @@ void ASTAlterQuery::formatQueryImpl(const FormatSettings & settings, FormatState
case AlterObjectType::DATABASE:
settings.ostr << "ALTER DATABASE ";
break;
case AlterObjectType::LIVE_VIEW:
settings.ostr << "ALTER LIVE VIEW ";
break;
default:
break;
}

View File

@ -17,8 +17,6 @@ namespace DB
* MODIFY COLUMN col_name type,
* DROP PARTITION partition,
* COMMENT_COLUMN col_name 'comment',
* ALTER LIVE VIEW [db.]name_type
* REFRESH
*/
class ASTAlterCommand : public IAST
@ -79,8 +77,6 @@ public:
NO_TYPE,
LIVE_VIEW_REFRESH,
MODIFY_DATABASE_SETTING,
MODIFY_COMMENT,
@ -242,7 +238,6 @@ public:
{
TABLE,
DATABASE,
LIVE_VIEW,
UNKNOWN,
};

View File

@ -63,9 +63,6 @@ bool ParserAlterCommand::parseImpl(Pos & pos, ASTPtr & node, Expected & expected
ParserKeyword s_add("ADD");
ParserKeyword s_drop("DROP");
ParserKeyword s_suspend("SUSPEND");
ParserKeyword s_resume("RESUME");
ParserKeyword s_refresh("REFRESH");
ParserKeyword s_modify("MODIFY");
ParserKeyword s_attach_partition("ATTACH PARTITION");
@ -175,16 +172,6 @@ bool ParserAlterCommand::parseImpl(Pos & pos, ASTPtr & node, Expected & expected
switch (alter_object)
{
case ASTAlterQuery::AlterObjectType::LIVE_VIEW:
{
if (s_refresh.ignore(pos, expected))
{
command->type = ASTAlterCommand::LIVE_VIEW_REFRESH;
}
else
return false;
break;
}
case ASTAlterQuery::AlterObjectType::DATABASE:
{
if (s_modify_setting.ignore(pos, expected))
@ -986,7 +973,6 @@ bool ParserAlterQuery::parseImpl(Pos & pos, ASTPtr & node, Expected & expected)
ParserKeyword s_alter_table("ALTER TABLE");
ParserKeyword s_alter_temporary_table("ALTER TEMPORARY TABLE");
ParserKeyword s_alter_live_view("ALTER LIVE VIEW");
ParserKeyword s_alter_database("ALTER DATABASE");
ASTAlterQuery::AlterObjectType alter_object_type;
@ -995,10 +981,6 @@ bool ParserAlterQuery::parseImpl(Pos & pos, ASTPtr & node, Expected & expected)
{
alter_object_type = ASTAlterQuery::AlterObjectType::TABLE;
}
else if (s_alter_live_view.ignore(pos, expected))
{
alter_object_type = ASTAlterQuery::AlterObjectType::LIVE_VIEW;
}
else if (s_alter_database.ignore(pos, expected))
{
alter_object_type = ASTAlterQuery::AlterObjectType::DATABASE;

View File

@ -28,8 +28,6 @@ namespace DB
* [DROP INDEX [IF EXISTS] index_name]
* [CLEAR INDEX [IF EXISTS] index_name IN PARTITION partition]
* [MATERIALIZE INDEX [IF EXISTS] index_name [IN PARTITION partition]]
* ALTER LIVE VIEW [db.name]
* [REFRESH]
*/
class ParserAlterQuery : public IParserBase

View File

@ -890,7 +890,7 @@ bool ParserCreateLiveViewQuery::parseImpl(Pos & pos, ASTPtr & node, Expected & e
if (ParserKeyword{"REFRESH"}.ignore(pos, expected) || ParserKeyword{"PERIODIC REFRESH"}.ignore(pos, expected))
{
if (!ParserNumber{}.parse(pos, live_view_periodic_refresh, expected))
live_view_periodic_refresh = std::make_shared<ASTLiteral>(static_cast<UInt64>(DEFAULT_PERIODIC_LIVE_VIEW_REFRESH_SEC));
live_view_periodic_refresh = std::make_shared<ASTLiteral>(static_cast<UInt64>(60));
with_periodic_refresh = true;
}

View File

@ -45,6 +45,7 @@
#include <Storages/SelectQueryInfo.h>
#include <Storages/StorageDistributed.h>
#include <Storages/StorageDummy.h>
#include <Storages/StorageMerge.h>
#include <Analyzer/Utils.h>
#include <Analyzer/ColumnNode.h>
@ -135,6 +136,7 @@ void checkStoragesSupportTransactions(const PlannerContextPtr & planner_context)
*
* StorageDistributed skip unused shards optimization relies on this.
* Parallel replicas estimation relies on this too.
* StorageMerge common header calculation relies on this too.
*
* Collecting the filters that will be applied to a specific table when the query has JOINs requires
* running the query plan optimization pipeline.
@ -145,16 +147,16 @@ void checkStoragesSupportTransactions(const PlannerContextPtr & planner_context)
* 3. Optimize query plan.
* 4. Extract filters from ReadFromDummy query plan steps from query plan leaf nodes.
*/
void collectFiltersForAnalysis(const QueryTreeNodePtr & query_tree, const PlannerContextPtr & planner_context)
FiltersForTableExpressionMap collectFiltersForAnalysis(const QueryTreeNodePtr & query_tree, const QueryTreeNodes & table_nodes, const ContextPtr & query_context)
{
bool collect_filters = false;
const auto & query_context = planner_context->getQueryContext();
const auto & settings = query_context->getSettingsRef();
bool parallel_replicas_estimation_enabled
= query_context->canUseParallelReplicasOnInitiator() && settings.parallel_replicas_min_number_of_rows_per_replica > 0;
for (auto & [table_expression, table_expression_data] : planner_context->getTableExpressionNodeToData())
for (const auto & table_expression : table_nodes)
{
auto * table_node = table_expression->as<TableNode>();
auto * table_function_node = table_expression->as<TableFunctionNode>();
@ -162,7 +164,7 @@ void collectFiltersForAnalysis(const QueryTreeNodePtr & query_tree, const Planne
continue;
const auto & storage = table_node ? table_node->getStorage() : table_function_node->getStorage();
if (typeid_cast<const StorageDistributed *>(storage.get())
if (typeid_cast<const StorageDistributed *>(storage.get()) || typeid_cast<const StorageMerge *>(storage.get())
|| (parallel_replicas_estimation_enabled && std::dynamic_pointer_cast<MergeTreeData>(storage)))
{
collect_filters = true;
@ -171,18 +173,18 @@ void collectFiltersForAnalysis(const QueryTreeNodePtr & query_tree, const Planne
}
if (!collect_filters)
return;
return {};
ResultReplacementMap replacement_map;
auto updated_query_tree = replaceTableExpressionsWithDummyTables(query_tree, planner_context->getQueryContext(), &replacement_map);
std::unordered_map<const IStorage *, TableExpressionData *> dummy_storage_to_table_expression_data;
auto updated_query_tree = replaceTableExpressionsWithDummyTables(query_tree, table_nodes, query_context, &replacement_map);
std::unordered_map<const IStorage *, QueryTreeNodePtr> dummy_storage_to_table;
for (auto & [from_table_expression, dummy_table_expression] : replacement_map)
{
auto * dummy_storage = dummy_table_expression->as<TableNode &>().getStorage().get();
auto * table_expression_data = &planner_context->getTableExpressionDataOrThrow(from_table_expression);
dummy_storage_to_table_expression_data.emplace(dummy_storage, table_expression_data);
dummy_storage_to_table.emplace(dummy_storage, from_table_expression);
}
SelectQueryOptions select_query_options;
@ -194,6 +196,8 @@ void collectFiltersForAnalysis(const QueryTreeNodePtr & query_tree, const Planne
auto optimization_settings = QueryPlanOptimizationSettings::fromContext(query_context);
result_query_plan.optimize(optimization_settings);
FiltersForTableExpressionMap res;
std::vector<QueryPlan::Node *> nodes_to_process;
nodes_to_process.push_back(result_query_plan.getRootNode());
@ -207,10 +211,33 @@ void collectFiltersForAnalysis(const QueryTreeNodePtr & query_tree, const Planne
if (!read_from_dummy)
continue;
auto filter_actions = ActionsDAG::buildFilterActionsDAG(read_from_dummy->getFilterNodes().nodes);
auto & table_expression_data = dummy_storage_to_table_expression_data.at(&read_from_dummy->getStorage());
table_expression_data->setFilterActions(std::move(filter_actions));
auto filter_actions = read_from_dummy->getFilterActionsDAG();
const auto & table_node = dummy_storage_to_table.at(&read_from_dummy->getStorage());
res[table_node] = FiltersForTableExpression{std::move(filter_actions), read_from_dummy->getPrewhereInfo()};
}
return res;
}
FiltersForTableExpressionMap collectFiltersForAnalysis(const QueryTreeNodePtr & query_tree_node, SelectQueryOptions & select_query_options)
{
if (select_query_options.only_analyze)
return {};
auto * query_node = query_tree_node->as<QueryNode>();
auto * union_node = query_tree_node->as<UnionNode>();
if (!query_node && !union_node)
throw Exception(ErrorCodes::UNSUPPORTED_METHOD,
"Expected QUERY or UNION node. Actual {}",
query_tree_node->formatASTForErrorMessage());
auto context = query_node ? query_node->getContext() : union_node->getContext();
auto table_expressions_nodes
= extractTableExpressions(query_tree_node, false /* add_array_join */, true /* recursive */);
return collectFiltersForAnalysis(query_tree_node, table_expressions_nodes, context);
}
/// Extend lifetime of query context, storages, and table locks
@ -1058,7 +1085,7 @@ void addBuildSubqueriesForSetsStepIfNeeded(
Planner subquery_planner(
query_tree,
subquery_options,
std::make_shared<GlobalPlannerContext>(nullptr, nullptr));
std::make_shared<GlobalPlannerContext>(nullptr, nullptr, FiltersForTableExpressionMap{}));
subquery_planner.buildQueryPlanIfNeeded();
subquery->setQueryPlan(std::make_unique<QueryPlan>(std::move(subquery_planner).extractQueryPlan()));
@ -1164,7 +1191,8 @@ Planner::Planner(const QueryTreeNodePtr & query_tree_,
, planner_context(buildPlannerContext(query_tree, select_query_options,
std::make_shared<GlobalPlannerContext>(
findQueryForParallelReplicas(query_tree, select_query_options),
findTableForParallelReplicas(query_tree, select_query_options))))
findTableForParallelReplicas(query_tree, select_query_options),
collectFiltersForAnalysis(query_tree, select_query_options))))
{
}
@ -1359,8 +1387,20 @@ void Planner::buildPlanForQueryNode()
collectTableExpressionData(query_tree, planner_context);
checkStoragesSupportTransactions(planner_context);
if (!select_query_options.only_analyze)
collectFiltersForAnalysis(query_tree, planner_context);
const auto & table_filters = planner_context->getGlobalPlannerContext()->filters_for_table_expressions;
if (!select_query_options.only_analyze && !table_filters.empty()) // && top_level)
{
for (auto & [table_node, table_expression_data] : planner_context->getTableExpressionNodeToData())
{
auto it = table_filters.find(table_node);
if (it != table_filters.end())
{
const auto & filters = it->second;
table_expression_data.setFilterActions(filters.filter_actions);
table_expression_data.setPrewhereInfo(filters.prewhere_info);
}
}
}
if (query_context->canUseTaskBasedParallelReplicas())
{

View File

@ -23,12 +23,25 @@ namespace DB
class QueryNode;
class TableNode;
struct FiltersForTableExpression
{
ActionsDAGPtr filter_actions;
PrewhereInfoPtr prewhere_info;
};
using FiltersForTableExpressionMap = std::map<QueryTreeNodePtr, FiltersForTableExpression>;
class GlobalPlannerContext
{
public:
explicit GlobalPlannerContext(const QueryNode * parallel_replicas_node_, const TableNode * parallel_replicas_table_)
GlobalPlannerContext(
const QueryNode * parallel_replicas_node_,
const TableNode * parallel_replicas_table_,
FiltersForTableExpressionMap filters_for_table_expressions_)
: parallel_replicas_node(parallel_replicas_node_)
, parallel_replicas_table(parallel_replicas_table_)
, filters_for_table_expressions(std::move(filters_for_table_expressions_))
{
}
@ -54,6 +67,8 @@ public:
/// It is the left-most table of the query (in JOINs, UNIONs and subqueries).
const TableNode * const parallel_replicas_table = nullptr;
const FiltersForTableExpressionMap filters_for_table_expressions;
private:
std::unordered_set<ColumnIdentifier> column_identifiers;
};

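A condensed sketch of how the new filters_for_table_expressions member travels through this patch (all type and method names below appear in the diff; the free functions themselves are only an illustration): nested planners pass an empty map, while the top-level Planner fills it from collectFiltersForAnalysis and attaches the per-table filters to TableExpressionData before the join tree plan is built.

#include <Planner/PlannerContext.h>
#include <Planner/TableExpressionData.h>

using namespace DB;

/// As in addBuildSubqueriesForSetsStepIfNeeded and findQueryForParallelReplicas:
/// sub-plans do not collect filters, so they get an empty map.
std::shared_ptr<GlobalPlannerContext> makeSubqueryGlobalContext()
{
    return std::make_shared<GlobalPlannerContext>(nullptr, nullptr, FiltersForTableExpressionMap{});
}

/// Illustrative helper modelled on Planner::buildPlanForQueryNode: filter and prewhere info
/// collected up front is copied into the corresponding TableExpressionData.
void applyCollectedFilters(PlannerContext & planner_context)
{
    const auto & table_filters = planner_context.getGlobalPlannerContext()->filters_for_table_expressions;
    for (auto & [table_node, table_expression_data] : planner_context.getTableExpressionNodeToData())
    {
        auto it = table_filters.find(table_node);
        if (it == table_filters.end())
            continue;
        table_expression_data.setFilterActions(it->second.filter_actions);
        table_expression_data.setPrewhereInfo(it->second.prewhere_info);
    }
}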
View File

@ -624,6 +624,7 @@ JoinTreeQueryPlan buildQueryPlanForTableExpression(QueryTreeNodePtr table_expres
auto table_expression_query_info = select_query_info;
table_expression_query_info.table_expression = table_expression;
table_expression_query_info.filter_actions_dag = table_expression_data.getFilterActions();
table_expression_query_info.optimized_prewhere_info = table_expression_data.getPrewhereInfo();
table_expression_query_info.analyzer_can_use_parallel_replicas_on_follower = table_node == planner_context->getGlobalPlannerContext()->parallel_replicas_table;
size_t max_streams = settings.max_threads;
@ -717,12 +718,16 @@ JoinTreeQueryPlan buildQueryPlanForTableExpression(QueryTreeNodePtr table_expres
}
/// Apply trivial_count optimization if possible
bool is_trivial_count_applied = !select_query_options.only_analyze &&
is_single_table_expression &&
(table_node || table_function_node) &&
select_query_info.has_aggregates &&
settings.additional_table_filters.value.empty() &&
applyTrivialCountIfPossible(query_plan, table_expression_query_info, table_node, table_function_node, select_query_info.query_tree, planner_context->getMutableQueryContext(), table_expression_data.getColumnNames());
bool is_trivial_count_applied = !select_query_options.only_analyze && is_single_table_expression
&& (table_node || table_function_node) && select_query_info.has_aggregates && settings.additional_table_filters.value.empty()
&& applyTrivialCountIfPossible(
query_plan,
table_expression_query_info,
table_node,
table_function_node,
select_query_info.query_tree,
planner_context->getMutableQueryContext(),
table_expression_data.getColumnNames());
if (is_trivial_count_applied)
{
@ -736,11 +741,8 @@ JoinTreeQueryPlan buildQueryPlanForTableExpression(QueryTreeNodePtr table_expres
if (storage_merge_tree && query_context->canUseParallelReplicasOnInitiator()
&& settings.parallel_replicas_min_number_of_rows_per_replica > 0)
{
ActionDAGNodes filter_nodes;
if (table_expression_query_info.filter_actions_dag)
filter_nodes.nodes = table_expression_query_info.filter_actions_dag->getOutputs();
UInt64 rows_to_read = storage_merge_tree->estimateNumberOfRowsToRead(
query_context, storage_snapshot, table_expression_query_info, filter_nodes);
UInt64 rows_to_read
= storage_merge_tree->estimateNumberOfRowsToRead(query_context, storage_snapshot, table_expression_query_info);
if (max_block_size_limited && (max_block_size_limited < rows_to_read))
rows_to_read = max_block_size_limited;
@ -766,15 +768,16 @@ JoinTreeQueryPlan buildQueryPlanForTableExpression(QueryTreeNodePtr table_expres
}
}
auto & prewhere_info = table_expression_query_info.prewhere_info;
const auto & prewhere_actions = table_expression_data.getPrewhereFilterActions();
if (prewhere_actions)
{
table_expression_query_info.prewhere_info = std::make_shared<PrewhereInfo>();
table_expression_query_info.prewhere_info->prewhere_actions = prewhere_actions;
table_expression_query_info.prewhere_info->prewhere_column_name = prewhere_actions->getOutputs().at(0)->result_name;
table_expression_query_info.prewhere_info->remove_prewhere_column = true;
table_expression_query_info.prewhere_info->need_filter = true;
prewhere_info = std::make_shared<PrewhereInfo>();
prewhere_info->prewhere_actions = prewhere_actions;
prewhere_info->prewhere_column_name = prewhere_actions->getOutputs().at(0)->result_name;
prewhere_info->remove_prewhere_column = true;
prewhere_info->need_filter = true;
}
updatePrewhereOutputsIfNeeded(table_expression_query_info, table_expression_data.getColumnNames(), storage_snapshot);
@ -787,32 +790,34 @@ JoinTreeQueryPlan buildQueryPlanForTableExpression(QueryTreeNodePtr table_expres
if (!filter_info.actions)
return;
bool is_final = table_expression_query_info.table_expression_modifiers &&
table_expression_query_info.table_expression_modifiers->hasFinal();
bool optimize_move_to_prewhere = settings.optimize_move_to_prewhere && (!is_final || settings.optimize_move_to_prewhere_if_final);
bool is_final = table_expression_query_info.table_expression_modifiers
&& table_expression_query_info.table_expression_modifiers->hasFinal();
bool optimize_move_to_prewhere
= settings.optimize_move_to_prewhere && (!is_final || settings.optimize_move_to_prewhere_if_final);
if (storage->supportsPrewhere() && optimize_move_to_prewhere)
{
if (!table_expression_query_info.prewhere_info)
table_expression_query_info.prewhere_info = std::make_shared<PrewhereInfo>();
if (!prewhere_info)
prewhere_info = std::make_shared<PrewhereInfo>();
if (!table_expression_query_info.prewhere_info->prewhere_actions)
if (!prewhere_info->prewhere_actions)
{
table_expression_query_info.prewhere_info->prewhere_actions = filter_info.actions;
table_expression_query_info.prewhere_info->prewhere_column_name = filter_info.column_name;
table_expression_query_info.prewhere_info->remove_prewhere_column = filter_info.do_remove_column;
table_expression_query_info.prewhere_info->need_filter = true;
prewhere_info->prewhere_actions = filter_info.actions;
prewhere_info->prewhere_column_name = filter_info.column_name;
prewhere_info->remove_prewhere_column = filter_info.do_remove_column;
prewhere_info->need_filter = true;
}
else if (!table_expression_query_info.prewhere_info->row_level_filter)
else if (!prewhere_info->row_level_filter)
{
table_expression_query_info.prewhere_info->row_level_filter = filter_info.actions;
table_expression_query_info.prewhere_info->row_level_column_name = filter_info.column_name;
table_expression_query_info.prewhere_info->need_filter = true;
prewhere_info->row_level_filter = filter_info.actions;
prewhere_info->row_level_column_name = filter_info.column_name;
prewhere_info->need_filter = true;
}
else
{
where_filters.emplace_back(filter_info, std::move(description));
}
}
else
{
@ -820,7 +825,8 @@ JoinTreeQueryPlan buildQueryPlanForTableExpression(QueryTreeNodePtr table_expres
}
};
auto row_policy_filter_info = buildRowPolicyFilterIfNeeded(storage, table_expression_query_info, planner_context, used_row_policies);
auto row_policy_filter_info
= buildRowPolicyFilterIfNeeded(storage, table_expression_query_info, planner_context, used_row_policies);
add_filter(row_policy_filter_info, "Row-level security filter");
if (row_policy_filter_info.actions)
table_expression_data.setRowLevelFilterActions(row_policy_filter_info.actions);
@ -829,25 +835,34 @@ JoinTreeQueryPlan buildQueryPlanForTableExpression(QueryTreeNodePtr table_expres
{
if (settings.parallel_replicas_count > 1)
{
auto parallel_replicas_custom_key_filter_info = buildCustomKeyFilterIfNeeded(storage, table_expression_query_info, planner_context);
auto parallel_replicas_custom_key_filter_info
= buildCustomKeyFilterIfNeeded(storage, table_expression_query_info, planner_context);
add_filter(parallel_replicas_custom_key_filter_info, "Parallel replicas custom key filter");
}
else
else if (auto * distributed = typeid_cast<StorageDistributed *>(storage.get());
distributed && query_context->canUseParallelReplicasCustomKey(*distributed->getCluster()))
{
if (auto * distributed = typeid_cast<StorageDistributed *>(storage.get());
distributed && query_context->canUseParallelReplicasCustomKey(*distributed->getCluster()))
{
planner_context->getMutableQueryContext()->setSetting("distributed_group_by_no_merge", 2);
}
planner_context->getMutableQueryContext()->setSetting("distributed_group_by_no_merge", 2);
}
}
const auto & table_expression_alias = table_expression->getOriginalAlias();
auto additional_filters_info = buildAdditionalFiltersIfNeeded(storage, table_expression_alias, table_expression_query_info, planner_context);
auto additional_filters_info
= buildAdditionalFiltersIfNeeded(storage, table_expression_alias, table_expression_query_info, planner_context);
add_filter(additional_filters_info, "additional filter");
from_stage = storage->getQueryProcessingStage(query_context, select_query_options.to_stage, storage_snapshot, table_expression_query_info);
storage->read(query_plan, columns_names, storage_snapshot, table_expression_query_info, query_context, from_stage, max_block_size, max_streams);
from_stage = storage->getQueryProcessingStage(
query_context, select_query_options.to_stage, storage_snapshot, table_expression_query_info);
storage->read(
query_plan,
columns_names,
storage_snapshot,
table_expression_query_info,
query_context,
from_stage,
max_block_size,
max_streams);
for (const auto & filter_info_and_description : where_filters)
{

View File

@ -17,6 +17,9 @@ using ColumnIdentifier = std::string;
using ColumnIdentifiers = std::vector<ColumnIdentifier>;
using ColumnIdentifierSet = std::unordered_set<ColumnIdentifier>;
struct PrewhereInfo;
using PrewhereInfoPtr = std::shared_ptr<PrewhereInfo>;
/** Table expression data is created for each table expression that take part in query.
* Table expression data has information about columns that participate in query, their name to identifier mapping,
* and additional table expression properties.
@ -282,6 +285,16 @@ public:
filter_actions = std::move(filter_actions_value);
}
const PrewhereInfoPtr & getPrewhereInfo() const
{
return prewhere_info;
}
void setPrewhereInfo(PrewhereInfoPtr prewhere_info_value)
{
prewhere_info = std::move(prewhere_info_value);
}
private:
void addColumnImpl(const NameAndTypePair & column, const ColumnIdentifier & column_identifier)
{
@ -309,6 +322,9 @@ private:
/// Valid for table, table function
ActionsDAGPtr filter_actions;
/// Valid for table, table function
PrewhereInfoPtr prewhere_info;
/// Valid for table, table function
ActionsDAGPtr prewhere_filter_actions;

View File

@ -386,66 +386,37 @@ QueryTreeNodePtr mergeConditionNodes(const QueryTreeNodes & condition_nodes, con
return function_node;
}
QueryTreeNodePtr replaceTableExpressionsWithDummyTables(const QueryTreeNodePtr & query_node,
QueryTreeNodePtr replaceTableExpressionsWithDummyTables(
const QueryTreeNodePtr & query_node,
const QueryTreeNodes & table_nodes,
const ContextPtr & context,
//PlannerContext & planner_context,
ResultReplacementMap * result_replacement_map)
{
auto & query_node_typed = query_node->as<QueryNode &>();
auto table_expressions = extractTableExpressions(query_node_typed.getJoinTree());
std::unordered_map<const IQueryTreeNode *, QueryTreeNodePtr> replacement_map;
size_t subquery_index = 0;
for (auto & table_expression : table_expressions)
for (const auto & table_expression : table_nodes)
{
auto * table_node = table_expression->as<TableNode>();
auto * table_function_node = table_expression->as<TableFunctionNode>();
auto * subquery_node = table_expression->as<QueryNode>();
auto * union_node = table_expression->as<UnionNode>();
StoragePtr storage_dummy;
if (table_node || table_function_node)
{
const auto & storage_snapshot = table_node ? table_node->getStorageSnapshot() : table_function_node->getStorageSnapshot();
auto get_column_options = GetColumnsOptions(GetColumnsOptions::All).withExtendedObjects().withVirtuals();
storage_dummy
= std::make_shared<StorageDummy>(storage_snapshot->storage.getStorageID(), ColumnsDescription(storage_snapshot->getColumns(get_column_options)));
StoragePtr storage_dummy = std::make_shared<StorageDummy>(
storage_snapshot->storage.getStorageID(),
ColumnsDescription(storage_snapshot->getColumns(get_column_options)),
storage_snapshot);
auto dummy_table_node = std::make_shared<TableNode>(std::move(storage_dummy), context);
if (result_replacement_map)
result_replacement_map->emplace(table_expression, dummy_table_node);
dummy_table_node->setAlias(table_expression->getAlias());
replacement_map.emplace(table_expression.get(), std::move(dummy_table_node));
}
else if (subquery_node || union_node)
{
const auto & subquery_projection_columns
= subquery_node ? subquery_node->getProjectionColumns() : union_node->computeProjectionColumns();
NameSet unique_column_names;
NamesAndTypes storage_dummy_columns;
storage_dummy_columns.reserve(subquery_projection_columns.size());
for (const auto & projection_column : subquery_projection_columns)
{
auto [_, inserted] = unique_column_names.insert(projection_column.name);
if (inserted)
storage_dummy_columns.emplace_back(projection_column);
}
storage_dummy = std::make_shared<StorageDummy>(StorageID{"dummy", "subquery_" + std::to_string(subquery_index)}, ColumnsDescription::fromNamesAndTypes(storage_dummy_columns));
++subquery_index;
}
auto dummy_table_node = std::make_shared<TableNode>(std::move(storage_dummy), context);
if (result_replacement_map)
result_replacement_map->emplace(table_expression, dummy_table_node);
dummy_table_node->setAlias(table_expression->getAlias());
// auto & src_table_expression_data = planner_context.getOrCreateTableExpressionData(table_expression);
// auto & dst_table_expression_data = planner_context.getOrCreateTableExpressionData(dummy_table_node);
// dst_table_expression_data = src_table_expression_data;
replacement_map.emplace(table_expression.get(), std::move(dummy_table_node));
}
return query_node->cloneAndReplace(replacement_map);

View File

@ -70,7 +70,9 @@ QueryTreeNodePtr mergeConditionNodes(const QueryTreeNodes & condition_nodes, con
/// Replace table expressions from query JOIN TREE with dummy tables
using ResultReplacementMap = std::unordered_map<QueryTreeNodePtr, QueryTreeNodePtr>;
QueryTreeNodePtr replaceTableExpressionsWithDummyTables(const QueryTreeNodePtr & query_node,
QueryTreeNodePtr replaceTableExpressionsWithDummyTables(
const QueryTreeNodePtr & query_node,
const QueryTreeNodes & table_nodes,
const ContextPtr & context,
ResultReplacementMap * result_replacement_map = nullptr);

View File

@ -126,8 +126,10 @@ public:
const auto & storage_snapshot = table_node ? table_node->getStorageSnapshot() : table_function_node->getStorageSnapshot();
auto get_column_options = GetColumnsOptions(GetColumnsOptions::All).withExtendedObjects().withVirtuals();
auto storage_dummy
= std::make_shared<StorageDummy>(storage_snapshot->storage.getStorageID(), ColumnsDescription(storage_snapshot->getColumns(get_column_options)));
auto storage_dummy = std::make_shared<StorageDummy>(
storage_snapshot->storage.getStorageID(),
ColumnsDescription(storage_snapshot->getColumns(get_column_options)),
storage_snapshot);
auto dummy_table_node = std::make_shared<TableNode>(std::move(storage_dummy), context);
@ -263,7 +265,7 @@ const QueryNode * findQueryForParallelReplicas(const QueryTreeNodePtr & query_tr
auto updated_query_tree = replaceTablesWithDummyTables(query_tree_node, mutable_context);
SelectQueryOptions options;
Planner planner(updated_query_tree, options, std::make_shared<GlobalPlannerContext>(nullptr, nullptr));
Planner planner(updated_query_tree, options, std::make_shared<GlobalPlannerContext>(nullptr, nullptr, FiltersForTableExpressionMap{}));
planner.buildQueryPlanIfNeeded();
/// This part is a bit clumsy.

View File

@ -64,6 +64,9 @@ public:
using DataStreams = std::vector<DataStream>;
class QueryPlan;
using QueryPlanRawPtrs = std::list<QueryPlan *>;
/// Single step of query plan.
class IQueryPlanStep
{
@ -109,6 +112,9 @@ public:
/// Get description of processors added in current step. Should be called after updatePipeline().
virtual void describePipeline(FormatSettings & /*settings*/) const {}
/// Get child plans contained inside some steps (e.g. ReadFromMerge) so that they are visible when doing EXPLAIN.
virtual QueryPlanRawPtrs getChildPlans() { return {}; }
/// Append extra processors for this step.
void appendExtraProcessors(const Processors & extra_processors);

View File

@ -22,6 +22,8 @@ QueryPlanOptimizationSettings QueryPlanOptimizationSettings::fromSettings(const
settings.filter_push_down = from.query_plan_enable_optimizations && from.query_plan_filter_push_down;
settings.optimize_prewhere = from.query_plan_enable_optimizations && from.query_plan_optimize_prewhere;
settings.execute_functions_after_sorting = from.query_plan_enable_optimizations && from.query_plan_execute_functions_after_sorting;
settings.reuse_storage_ordering_for_window_functions = from.query_plan_enable_optimizations && from.query_plan_reuse_storage_ordering_for_window_functions;
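
As elsewhere in this file, each optimization flag is the conjunction of the master query_plan_enable_optimizations switch and its own setting, so disabling the master switch turns every pass off at once. A tiny illustrative sketch of the pattern with made-up setting names, not the real ClickHouse settings:

#include <iostream>

struct UserSettings
{
    bool enable_optimizations = true;   /// master switch
    bool filter_push_down = true;
    bool optimize_prewhere = false;
};

struct OptimizationSettings
{
    bool filter_push_down = false;
    bool optimize_prewhere = false;

    static OptimizationSettings fromSettings(const UserSettings & from)
    {
        OptimizationSettings settings;
        /// Every per-pass flag is gated by the master switch.
        settings.filter_push_down = from.enable_optimizations && from.filter_push_down;
        settings.optimize_prewhere = from.enable_optimizations && from.optimize_prewhere;
        return settings;
    }
};

int main()
{
    auto s = OptimizationSettings::fromSettings(UserSettings{});
    std::cout << s.filter_push_down << ' ' << s.optimize_prewhere << '\n';  /// prints: 1 0
}
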

View File

@ -61,6 +61,8 @@ struct QueryPlanOptimizationSettings
/// If remove-redundant-distinct-steps optimization is enabled.
bool remove_redundant_distinct = true;
bool optimize_prewhere = true;
/// If reading from projection can be applied
bool optimize_projection = false;
bool force_use_projection = false;

View File

@ -1,8 +1,9 @@
#include <Processors/QueryPlan/Optimizations/Optimizations.h>
#include <Processors/QueryPlan/ExpressionStep.h>
#include <Processors/QueryPlan/FilterStep.h>
#include <Processors/QueryPlan/ReadFromMergeTree.h>
#include <Processors/QueryPlan/SourceStepWithFilter.h>
#include <Storages/MergeTree/MergeTreeWhereOptimizer.h>
#include <Storages/StorageDummy.h>
#include <Interpreters/ActionsDAG.h>
#include <Functions/FunctionsLogical.h>
#include <Functions/IFunctionAdaptors.h>
@ -38,46 +39,35 @@ void optimizePrewhere(Stack & stack, QueryPlan::Nodes &)
*
* 1. SomeNode
* 2. FilterNode
* 3. ReadFromMergeTreeNode
* 3. SourceStepWithFilterNode
*/
auto * read_from_merge_tree = typeid_cast<ReadFromMergeTree *>(frame.node->step.get());
if (!read_from_merge_tree)
auto * source_step_with_filter = dynamic_cast<SourceStepWithFilter *>(frame.node->step.get());
if (!source_step_with_filter)
return;
const auto & storage_prewhere_info = read_from_merge_tree->getPrewhereInfo();
const auto & storage_snapshot = source_step_with_filter->getStorageSnapshot();
const auto & storage = storage_snapshot->storage;
if (!storage.canMoveConditionsToPrewhere())
return;
const auto & storage_prewhere_info = source_step_with_filter->getPrewhereInfo();
if (storage_prewhere_info && storage_prewhere_info->prewhere_actions)
return;
/// TODO: We can also check for UnionStep, such as StorageBuffer and local distributed plans.
QueryPlan::Node * filter_node = (stack.rbegin() + 1)->node;
const auto * filter_step = typeid_cast<FilterStep *>(filter_node->step.get());
if (!filter_step)
return;
const auto & context = read_from_merge_tree->getContext();
const auto & context = source_step_with_filter->getContext();
const auto & settings = context->getSettingsRef();
if (!settings.allow_experimental_analyzer)
return;
bool is_final = read_from_merge_tree->isQueryWithFinal();
bool is_final = source_step_with_filter->isQueryWithFinal();
bool optimize_move_to_prewhere = settings.optimize_move_to_prewhere && (!is_final || settings.optimize_move_to_prewhere_if_final);
if (!optimize_move_to_prewhere)
return;
const auto & storage_snapshot = read_from_merge_tree->getStorageSnapshot();
ColumnsWithTypeAndName required_columns_after_filter;
if (read_from_merge_tree->isQueryWithSampling())
{
const auto & sampling_key = storage_snapshot->getMetadataForQuery()->getSamplingKey();
const auto & sampling_source_columns = sampling_key.expression->getRequiredColumnsWithTypes();
for (const auto & column : sampling_source_columns)
required_columns_after_filter.push_back(ColumnWithTypeAndName(column.type, column.name));
const auto & sampling_result_columns = sampling_key.sample_block.getColumnsWithTypeAndName();
required_columns_after_filter.insert(required_columns_after_filter.end(), sampling_result_columns.begin(), sampling_result_columns.end());
}
const auto & storage = storage_snapshot->storage;
const auto & storage_metadata = storage_snapshot->metadata;
auto column_sizes = storage.getColumnSizes();
if (column_sizes.empty())
@ -88,19 +78,19 @@ void optimizePrewhere(Stack & stack, QueryPlan::Nodes &)
for (const auto & [name, sizes] : column_sizes)
column_compressed_sizes[name] = sizes.data_compressed;
Names queried_columns = read_from_merge_tree->getRealColumnNames();
Names queried_columns = source_step_with_filter->requiredSourceColumns();
MergeTreeWhereOptimizer where_optimizer{
std::move(column_compressed_sizes),
storage_metadata,
storage.getConditionEstimatorByPredicate(read_from_merge_tree->getQueryInfo(), storage_snapshot, context),
storage.getConditionEstimatorByPredicate(source_step_with_filter->getQueryInfo(), storage_snapshot, context),
queried_columns,
storage.supportedPrewhereColumns(),
getLogger("QueryPlanOptimizePrewhere")};
auto optimize_result = where_optimizer.optimize(filter_step->getExpression(),
filter_step->getFilterColumnName(),
read_from_merge_tree->getContext(),
source_step_with_filter->getContext(),
is_final);
if (optimize_result.prewhere_nodes.empty())
@ -113,11 +103,12 @@ void optimizePrewhere(Stack & stack, QueryPlan::Nodes &)
prewhere_info = std::make_shared<PrewhereInfo>();
prewhere_info->need_filter = true;
prewhere_info->remove_prewhere_column = optimize_result.fully_moved_to_prewhere && filter_step->removesFilterColumn();
auto filter_expression = filter_step->getExpression();
const auto & filter_column_name = filter_step->getFilterColumnName();
if (optimize_result.fully_moved_to_prewhere && filter_step->removesFilterColumn())
if (prewhere_info->remove_prewhere_column)
{
removeFromOutput(*filter_expression, filter_column_name);
auto & outputs = filter_expression->getOutputs();
@ -142,7 +133,8 @@ void optimizePrewhere(Stack & stack, QueryPlan::Nodes &)
///
/// So, here we restore removed inputs for PREWHERE actions
{
std::unordered_set<const ActionsDAG::Node *> first_outputs(split_result.first->getOutputs().begin(), split_result.first->getOutputs().end());
std::unordered_set<const ActionsDAG::Node *> first_outputs(
split_result.first->getOutputs().begin(), split_result.first->getOutputs().end());
for (const auto * input : split_result.first->getInputs())
{
if (!first_outputs.contains(input))
@ -157,7 +149,7 @@ void optimizePrewhere(Stack & stack, QueryPlan::Nodes &)
ActionsDAG::NodeRawConstPtrs conditions;
conditions.reserve(split_result.split_nodes_mapping.size());
for (const auto * condition : optimize_result.prewhere_nodes)
for (const auto * condition : optimize_result.prewhere_nodes_list)
conditions.push_back(split_result.split_nodes_mapping.at(condition));
prewhere_info->prewhere_actions = std::move(split_result.first);
@ -166,7 +158,8 @@ void optimizePrewhere(Stack & stack, QueryPlan::Nodes &)
if (conditions.size() == 1)
{
prewhere_info->prewhere_column_name = conditions.front()->result_name;
prewhere_info->prewhere_actions->getOutputs().push_back(conditions.front());
if (prewhere_info->remove_prewhere_column)
prewhere_info->prewhere_actions->getOutputs().push_back(conditions.front());
}
else
{
@ -178,20 +171,21 @@ void optimizePrewhere(Stack & stack, QueryPlan::Nodes &)
prewhere_info->prewhere_actions->getOutputs().push_back(node);
}
read_from_merge_tree->updatePrewhereInfo(prewhere_info);
source_step_with_filter->updatePrewhereInfo(prewhere_info);
if (!optimize_result.fully_moved_to_prewhere)
{
filter_node->step = std::make_unique<FilterStep>(
read_from_merge_tree->getOutputStream(),
source_step_with_filter->getOutputStream(),
std::move(split_result.second),
filter_step->getFilterColumnName(),
filter_step->removesFilterColumn());
}
else
{
/// Have to keep this expression to change column names to column identifiers
filter_node->step = std::make_unique<ExpressionStep>(
read_from_merge_tree->getOutputStream(),
source_step_with_filter->getOutputStream(),
std::move(split_result.second));
}
}
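
The pass above delegates the actual decision to MergeTreeWhereOptimizer, which picks filter conditions cheap enough to evaluate first as PREWHERE. A rough standalone sketch of the underlying heuristic: rank conditions by the compressed size of the columns they touch and move the cheapest one ahead. This is purely illustrative; the real optimizer also accounts for selectivity, FINAL, and which columns support PREWHERE.

#include <algorithm>
#include <iostream>
#include <string>
#include <unordered_map>
#include <vector>

struct Condition
{
    std::string text;
    std::vector<std::string> columns;  /// columns the condition reads
};

/// Sum of compressed sizes of the columns a condition needs.
size_t conditionCost(const Condition & c, const std::unordered_map<std::string, size_t> & compressed_sizes)
{
    size_t cost = 0;
    for (const auto & column : c.columns)
        cost += compressed_sizes.at(column);
    return cost;
}

int main()
{
    std::unordered_map<std::string, size_t> compressed_sizes{{"id", 1000}, {"flag", 50}, {"payload", 900000}};

    std::vector<Condition> where{
        {"payload LIKE '%x%'", {"payload"}},
        {"flag = 1", {"flag"}},
        {"id > 42", {"id"}},
    };

    /// Cheapest conditions first; the cheapest one becomes PREWHERE,
    /// the rest stay in WHERE and are evaluated only on surviving rows.
    std::sort(where.begin(), where.end(), [&](const Condition & a, const Condition & b)
    {
        return conditionCost(a, compressed_sizes) < conditionCost(b, compressed_sizes);
    });

    std::cout << "PREWHERE: " << where.front().text << '\n';
    for (size_t i = 1; i < where.size(); ++i)
        std::cout << "WHERE:    " << where[i].text << '\n';
}
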

View File

@ -1,8 +1,8 @@
#include <Processors/QueryPlan/Optimizations/Optimizations.h>
#include <Processors/QueryPlan/ExpressionStep.h>
#include <Processors/QueryPlan/FilterStep.h>
#include <Processors/QueryPlan/ReadFromMergeTree.h>
#include <Processors/QueryPlan/SourceStepWithFilter.h>
#include <deque>
namespace DB::QueryPlanOptimizations
{
@ -15,6 +15,14 @@ void optimizePrimaryKeyCondition(const Stack & stack)
if (!source_step_with_filter)
return;
const auto & storage_prewhere_info = source_step_with_filter->getPrewhereInfo();
if (storage_prewhere_info)
{
source_step_with_filter->addFilter(storage_prewhere_info->prewhere_actions, storage_prewhere_info->prewhere_column_name);
if (storage_prewhere_info->row_level_filter)
source_step_with_filter->addFilter(storage_prewhere_info->row_level_filter, storage_prewhere_info->row_level_column_name);
}
for (auto iter = stack.rbegin() + 1; iter != stack.rend(); ++iter)
{
if (auto * filter_step = typeid_cast<FilterStep *>(iter->node->step.get()))
@ -28,6 +36,8 @@ void optimizePrimaryKeyCondition(const Stack & stack)
else
break;
}
source_step_with_filter->applyFilters();
}
}
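
Conceptually, this pass walks up the plan stack from a reading step, collects the predicates of every FilterStep directly above it, and then hands them to the source in a single applyFilters() call. A simplified standalone sketch of that accumulation, using toy std-only types rather than the real QueryPlan classes:

#include <iostream>
#include <string>
#include <vector>

struct SourceStep
{
    std::vector<std::string> collected_filters;

    void addFilterFromParentStep(const std::string & predicate) { collected_filters.push_back(predicate); }

    /// Called once, after all parent filters were collected,
    /// so the storage can use them for index analysis.
    void applyFilters()
    {
        std::cout << "applying " << collected_filters.size() << " filters to the source\n";
        for (const auto & f : collected_filters)
            std::cout << "  " << f << '\n';
    }
};

int main()
{
    /// Plan fragment, bottom to top: Source <- Filter(a > 1) <- Filter(b = 'x') <- Limit
    std::vector<std::string> steps_above_source{"Filter: a > 1", "Filter: b = 'x'", "Limit"};

    SourceStep source;
    for (const auto & step : steps_above_source)
    {
        if (step.rfind("Filter: ", 0) == 0)
            source.addFilterFromParentStep(step.substr(8));
        else
            break;  /// stop at the first non-filter step, as the real pass does
    }
    source.applyFilters();
}
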

View File

@ -114,10 +114,13 @@ void optimizeTreeSecondPass(const QueryPlanOptimizationSettings & optimization_s
while (!stack.empty())
{
/// NOTE: optimizePrewhere can modify the stack.
optimizePrewhere(stack, nodes);
optimizePrimaryKeyCondition(stack);
/// NOTE: optimizePrewhere can modify the stack.
/// Prewhere optimization relies on PK optimization (getConditionEstimatorByPredicate)
if (optimization_settings.optimize_prewhere)
optimizePrewhere(stack, nodes);
auto & frame = stack.back();
if (frame.next_child == 0)
@ -223,11 +226,6 @@ void optimizeTreeThirdPass(QueryPlan & plan, QueryPlan::Node & root, QueryPlan::
continue;
}
if (auto * source_step_with_filter = dynamic_cast<SourceStepWithFilter *>(frame.node->step.get()))
{
source_step_with_filter->applyFilters();
}
addPlansForSets(plan, *frame.node, nodes);
stack.pop_back();

View File

@ -605,9 +605,6 @@ bool optimizeUseAggregateProjections(QueryPlan::Node & node, QueryPlan::Nodes &
for (auto & candidate : candidates.real)
{
auto required_column_names = candidate.dag->getRequiredColumnsNames();
ActionDAGNodes added_filter_nodes;
if (candidates.has_filter)
added_filter_nodes.nodes.push_back(candidate.dag->getOutputs().front());
bool analyzed = analyzeProjectionCandidate(
candidate,
@ -618,7 +615,7 @@ bool optimizeUseAggregateProjections(QueryPlan::Node & node, QueryPlan::Nodes &
query_info,
context,
max_added_blocks,
added_filter_nodes);
candidate.dag);
if (!analyzed)
continue;
@ -669,15 +666,16 @@ bool optimizeUseAggregateProjections(QueryPlan::Node & node, QueryPlan::Nodes &
auto proj_snapshot = std::make_shared<StorageSnapshot>(storage_snapshot->storage, storage_snapshot->metadata);
proj_snapshot->addProjection(best_candidate->projection);
auto query_info_copy = query_info;
query_info_copy.prewhere_info = nullptr;
auto projection_query_info = query_info;
projection_query_info.prewhere_info = nullptr;
projection_query_info.filter_actions_dag = nullptr;
projection_reading = reader.readFromParts(
/* parts = */ {},
/* alter_conversions = */ {},
best_candidate->dag->getRequiredColumnsNames(),
proj_snapshot,
query_info_copy,
projection_query_info,
context,
reading->getMaxBlockSize(),
reading->getNumStreams(),

View File

@ -163,10 +163,6 @@ bool optimizeUseNormalProjections(Stack & stack, QueryPlan::Nodes & nodes)
auto & candidate = candidates.emplace_back();
candidate.projection = projection;
ActionDAGNodes added_filter_nodes;
if (query.filter_node)
added_filter_nodes.nodes.push_back(query.filter_node);
bool analyzed = analyzeProjectionCandidate(
candidate,
*reading,
@ -176,7 +172,7 @@ bool optimizeUseNormalProjections(Stack & stack, QueryPlan::Nodes & nodes)
query_info,
context,
max_added_blocks,
added_filter_nodes);
query.filter_node ? query.dag : nullptr);
if (!analyzed)
continue;

View File

@ -214,7 +214,7 @@ bool analyzeProjectionCandidate(
const SelectQueryInfo & query_info,
const ContextPtr & context,
const std::shared_ptr<PartitionIdToMaxBlock> & max_added_blocks,
const ActionDAGNodes & added_filter_nodes)
const ActionsDAGPtr & dag)
{
MergeTreeData::DataPartsVector projection_parts;
MergeTreeData::DataPartsVector normal_parts;
@ -223,7 +223,7 @@ bool analyzeProjectionCandidate(
{
const auto & created_projections = part_with_ranges.data_part->getProjectionParts();
auto it = created_projections.find(candidate.projection->name);
if (it != created_projections.end() && !it->second->is_broken)
if (it != created_projections.end())
{
projection_parts.push_back(it->second);
}
@ -237,13 +237,15 @@ bool analyzeProjectionCandidate(
if (projection_parts.empty())
return false;
auto projection_query_info = query_info;
projection_query_info.prewhere_info = nullptr;
projection_query_info.filter_actions_dag = dag;
auto projection_result_ptr = reader.estimateNumMarksToRead(
std::move(projection_parts),
nullptr,
required_column_names,
candidate.projection->metadata,
query_info, /// How is it actually used? I hope that for the index we need only added_filter_nodes
added_filter_nodes,
projection_query_info,
context,
context->getSettingsRef().max_threads,
max_added_blocks);
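
analyzeProjectionCandidate now receives the candidate's filter DAG through a copy of the query info and estimates how many marks the projection would have to read; the caller then keeps the candidate with the smallest estimate. A very rough standalone sketch of that selection loop with fabricated numbers and names, std-only:

#include <iostream>
#include <string>
#include <vector>

struct ProjectionCandidate
{
    std::string name;
    size_t estimated_marks_to_read = 0;  /// pretend this came from index analysis with the filter applied
};

int main()
{
    std::vector<ProjectionCandidate> candidates{
        {"proj_by_user", 120},
        {"proj_by_date", 15},
        {"proj_raw", 4000},
    };

    /// Keep the candidate that needs to read the fewest marks; if none beats
    /// reading the table itself, projections are not used at all.
    const size_t marks_for_ordinary_read = 800;
    const ProjectionCandidate * best = nullptr;
    size_t best_marks = marks_for_ordinary_read;

    for (const auto & candidate : candidates)
    {
        if (candidate.estimated_marks_to_read < best_marks)
        {
            best = &candidate;
            best_marks = candidate.estimated_marks_to_read;
        }
    }

    if (best)
        std::cout << "reading from projection " << best->name << " (" << best_marks << " marks)\n";
    else
        std::cout << "reading from the table directly\n";
}
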

View File

@ -60,6 +60,6 @@ bool analyzeProjectionCandidate(
const SelectQueryInfo & query_info,
const ContextPtr & context,
const std::shared_ptr<PartitionIdToMaxBlock> & max_added_blocks,
const ActionDAGNodes & added_filter_nodes);
const ActionsDAGPtr & dag);
}

View File

@ -275,6 +275,14 @@ JSONBuilder::ItemPtr QueryPlan::explainPlan(const ExplainPlanOptions & options)
}
else
{
auto child_plans = frame.node->step->getChildPlans();
if (!frame.children_array && !child_plans.empty())
frame.children_array = std::make_unique<JSONBuilder::JSONArray>();
for (const auto & child_plan : child_plans)
frame.children_array->add(child_plan->explainPlan(options));
if (frame.children_array)
frame.node_map->add("Plans", std::move(frame.children_array));
@ -360,7 +368,7 @@ std::string debugExplainStep(const IQueryPlanStep & step)
return out.str();
}
void QueryPlan::explainPlan(WriteBuffer & buffer, const ExplainPlanOptions & options)
void QueryPlan::explainPlan(WriteBuffer & buffer, const ExplainPlanOptions & options, size_t indent)
{
checkInitialized();
@ -382,7 +390,7 @@ void QueryPlan::explainPlan(WriteBuffer & buffer, const ExplainPlanOptions & opt
if (!frame.is_description_printed)
{
settings.offset = (stack.size() - 1) * settings.indent;
settings.offset = (indent + stack.size() - 1) * settings.indent;
explainStep(*frame.node->step, settings, options);
frame.is_description_printed = true;
}
@ -393,7 +401,14 @@ void QueryPlan::explainPlan(WriteBuffer & buffer, const ExplainPlanOptions & opt
++frame.next_child;
}
else
{
auto child_plans = frame.node->step->getChildPlans();
for (const auto & child_plan : child_plans)
child_plan->explainPlan(buffer, options, indent + stack.size());
stack.pop();
}
}
}
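
The explainPlan change threads an indent parameter through the recursion and, for steps that own nested plans via getChildPlans(), prints those plans with a deeper offset. A compact standalone sketch of the same indent-aware traversal over toy structures, not the real QueryPlan:

#include <iostream>
#include <memory>
#include <string>
#include <vector>

struct Plan;

struct Step
{
    std::string name;
    /// Nested plans owned by the step itself (e.g. a merge over several tables).
    std::vector<std::shared_ptr<Plan>> child_plans;
};

struct Plan
{
    Step step;
    std::vector<std::shared_ptr<Plan>> children;

    void explain(size_t indent = 0) const
    {
        std::cout << std::string(indent * 2, ' ') << step.name << '\n';
        for (const auto & child : children)
            child->explain(indent + 1);
        /// Child plans hidden inside the step are printed one level deeper,
        /// so they show up in EXPLAIN output as well.
        for (const auto & child_plan : step.child_plans)
            child_plan->explain(indent + 1);
    }
};

int main()
{
    auto inner = std::make_shared<Plan>(Plan{{"ReadFromTableA", {}}, {}});
    auto merge = std::make_shared<Plan>(Plan{{"ReadFromMerge", {inner}}, {}});
    Plan root{{"Expression", {}}, {merge}};
    root.explain();
}
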

View File

@ -82,7 +82,7 @@ public:
};
JSONBuilder::ItemPtr explainPlan(const ExplainPlanOptions & options);
void explainPlan(WriteBuffer & buffer, const ExplainPlanOptions & options);
void explainPlan(WriteBuffer & buffer, const ExplainPlanOptions & options, size_t indent = 0);
void explainPipeline(WriteBuffer & buffer, const ExplainPipelineOptions & options);
void explainEstimate(MutableColumns & columns);

View File

@ -95,17 +95,24 @@ private:
InitializerFunc initializer_func;
};
ReadFromMemoryStorageStep::ReadFromMemoryStorageStep(const Names & columns_to_read_,
StoragePtr storage_,
const StorageSnapshotPtr & storage_snapshot_,
const size_t num_streams_,
const bool delay_read_for_global_sub_queries_) :
SourceStepWithFilter(DataStream{.header=storage_snapshot_->getSampleBlockForColumns(columns_to_read_)}),
columns_to_read(columns_to_read_),
storage(std::move(storage_)),
storage_snapshot(storage_snapshot_),
num_streams(num_streams_),
delay_read_for_global_sub_queries(delay_read_for_global_sub_queries_)
ReadFromMemoryStorageStep::ReadFromMemoryStorageStep(
const Names & columns_to_read_,
const SelectQueryInfo & query_info_,
const StorageSnapshotPtr & storage_snapshot_,
const ContextPtr & context_,
StoragePtr storage_,
const size_t num_streams_,
const bool delay_read_for_global_sub_queries_)
: SourceStepWithFilter(
DataStream{.header = storage_snapshot_->getSampleBlockForColumns(columns_to_read_)},
columns_to_read_,
query_info_,
storage_snapshot_,
context_)
, columns_to_read(columns_to_read_)
, storage(std::move(storage_))
, num_streams(num_streams_)
, delay_read_for_global_sub_queries(delay_read_for_global_sub_queries_)
{
}

View File

@ -15,11 +15,14 @@ class QueryPipelineBuilder;
class ReadFromMemoryStorageStep final : public SourceStepWithFilter
{
public:
ReadFromMemoryStorageStep(const Names & columns_to_read_,
StoragePtr storage_,
const StorageSnapshotPtr & storage_snapshot_,
size_t num_streams_,
bool delay_read_for_global_sub_queries_);
ReadFromMemoryStorageStep(
const Names & columns_to_read_,
const SelectQueryInfo & query_info_,
const StorageSnapshotPtr & storage_snapshot_,
const ContextPtr & context_,
StoragePtr storage_,
size_t num_streams_,
bool delay_read_for_global_sub_queries_);
ReadFromMemoryStorageStep() = delete;
ReadFromMemoryStorageStep(const ReadFromMemoryStorageStep &) = delete;
@ -37,7 +40,6 @@ private:
Names columns_to_read;
StoragePtr storage;
StorageSnapshotPtr storage_snapshot;
size_t num_streams;
bool delay_read_for_global_sub_queries;

View File

@ -40,18 +40,13 @@
#include <Common/JSONBuilder.h>
#include <Common/isLocalAddress.h>
#include <Common/logger_useful.h>
#include "Processors/QueryPlan/IQueryPlanStep.h"
#include <Processors/QueryPlan/IQueryPlanStep.h>
#include <Parsers/parseIdentifierOrStringLiteral.h>
#include <Parsers/ExpressionListParsers.h>
#include <algorithm>
#include <functional>
#include <iterator>
#include <limits>
#include <memory>
#include <numeric>
#include <queue>
#include <stdexcept>
#include <unordered_map>
using namespace DB;
@ -265,12 +260,13 @@ void ReadFromMergeTree::AnalysisResult::checkLimits(const Settings & settings, c
ReadFromMergeTree::ReadFromMergeTree(
MergeTreeData::DataPartsVector parts_,
std::vector<AlterConversionsPtr> alter_conversions_,
const Names & column_names_,
Names real_column_names_,
Names virt_column_names_,
const MergeTreeData & data_,
const SelectQueryInfo & query_info_,
StorageSnapshotPtr storage_snapshot_,
ContextPtr context_,
const StorageSnapshotPtr & storage_snapshot_,
const ContextPtr & context_,
size_t max_block_size_,
size_t num_streams_,
bool sample_factor_column_queried_,
@ -282,19 +278,15 @@ ReadFromMergeTree::ReadFromMergeTree(
storage_snapshot_->getSampleBlockForColumns(real_column_names_),
query_info_.prewhere_info,
data_.getPartitionValueType(),
virt_column_names_)})
virt_column_names_)}, column_names_, query_info_, storage_snapshot_, context_)
, reader_settings(getMergeTreeReaderSettings(context_, query_info_))
, prepared_parts(std::move(parts_))
, alter_conversions_for_parts(std::move(alter_conversions_))
, real_column_names(std::move(real_column_names_))
, virt_column_names(std::move(virt_column_names_))
, data(data_)
, query_info(query_info_)
, prewhere_info(query_info_.prewhere_info)
, actions_settings(ExpressionActionsSettings::fromContext(context_))
, storage_snapshot(std::move(storage_snapshot_))
, metadata_for_reading(storage_snapshot->getMetadataForQuery())
, context(std::move(context_))
, block_size{
.max_block_size_rows = max_block_size_,
.preferred_block_size_bytes = context->getSettingsRef().preferred_block_size_bytes,
@ -1303,8 +1295,6 @@ ReadFromMergeTree::AnalysisResultPtr ReadFromMergeTree::selectRangesToRead(
return selectRangesToRead(
std::move(parts),
std::move(alter_conversions),
prewhere_info,
filter_nodes,
metadata_for_reading,
query_info,
context,
@ -1317,47 +1307,6 @@ ReadFromMergeTree::AnalysisResultPtr ReadFromMergeTree::selectRangesToRead(
indexes);
}
static ActionsDAGPtr buildFilterDAG(
const ContextPtr & context,
const PrewhereInfoPtr & prewhere_info,
const ActionDAGNodes & added_filter_nodes,
const SelectQueryInfo & query_info)
{
const auto & settings = context->getSettingsRef();
ActionsDAG::NodeRawConstPtrs nodes;
if (prewhere_info)
{
{
const auto & node = prewhere_info->prewhere_actions->findInOutputs(prewhere_info->prewhere_column_name);
nodes.push_back(&node);
}
if (prewhere_info->row_level_filter)
{
const auto & node = prewhere_info->row_level_filter->findInOutputs(prewhere_info->row_level_column_name);
nodes.push_back(&node);
}
}
for (const auto & node : added_filter_nodes.nodes)
nodes.push_back(node);
std::unordered_map<std::string, ColumnWithTypeAndName> node_name_to_input_node_column;
if (settings.allow_experimental_analyzer && query_info.planner_context)
{
const auto & table_expression_data = query_info.planner_context->getTableExpressionDataOrThrow(query_info.table_expression);
for (const auto & [column_identifier, column_name] : table_expression_data.getColumnIdentifierToColumnName())
{
const auto & column = table_expression_data.getColumnOrThrow(column_name);
node_name_to_input_node_column.emplace(column_identifier, ColumnWithTypeAndName(column.type, column_name));
}
}
return ActionsDAG::buildFilterActionsDAG(nodes, node_name_to_input_node_column);
}
static void buildIndexes(
std::optional<ReadFromMergeTree::Indexes> & indexes,
ActionsDAGPtr filter_actions_dag,
@ -1391,7 +1340,6 @@ static void buildIndexes(
indexes->partition_pruner.emplace(metadata_snapshot, filter_actions_dag, context, false /* strict */);
}
/// TODO Support row_policy_filter and additional_filters
indexes->part_values = MergeTreeDataSelectExecutor::filterPartsByVirtualColumns(data, parts, filter_actions_dag, context);
MergeTreeDataSelectExecutor::buildKeyConditionFromPartOffset(indexes->part_offset_condition, filter_actions_dag, context);
@ -1404,19 +1352,6 @@ static void buildIndexes(
if (!indexes->use_skip_indexes)
return;
std::optional<SelectQueryInfo> info_copy;
auto get_query_info = [&]() -> const SelectQueryInfo &
{
if (settings.allow_experimental_analyzer)
{
info_copy.emplace(query_info);
info_copy->filter_actions_dag = filter_actions_dag;
return *info_copy;
}
return query_info;
};
std::unordered_set<std::string> ignored_index_names;
if (settings.ignore_data_skipping_indices.changed)
@ -1456,7 +1391,7 @@ static void buildIndexes(
if (inserted)
{
skip_indexes.merged_indices.emplace_back();
skip_indexes.merged_indices.back().condition = index_helper->createIndexMergedCondition(get_query_info(), metadata_snapshot);
skip_indexes.merged_indices.back().condition = index_helper->createIndexMergedCondition(query_info, metadata_snapshot);
}
skip_indexes.merged_indices[it->second].addIndex(index_helper);
@ -1468,11 +1403,11 @@ static void buildIndexes(
{
#ifdef ENABLE_ANNOY
if (const auto * annoy = typeid_cast<const MergeTreeIndexAnnoy *>(index_helper.get()))
condition = annoy->createIndexCondition(get_query_info(), context);
condition = annoy->createIndexCondition(query_info, context);
#endif
#ifdef ENABLE_USEARCH
if (const auto * usearch = typeid_cast<const MergeTreeIndexUSearch *>(index_helper.get()))
condition = usearch->createIndexCondition(get_query_info(), context);
condition = usearch->createIndexCondition(query_info, context);
#endif
if (!condition)
throw Exception(ErrorCodes::LOGICAL_ERROR, "Unknown vector search index {}", index_helper->index.name);
@ -1489,20 +1424,48 @@ static void buildIndexes(
indexes->skip_indexes = std::move(skip_indexes);
}
void ReadFromMergeTree::applyFilters()
void ReadFromMergeTree::applyFilters(ActionDAGNodes added_filter_nodes)
{
auto filter_actions_dag = buildFilterDAG(context, prewhere_info, filter_nodes, query_info);
buildIndexes(indexes, filter_actions_dag, data, prepared_parts, context, query_info, metadata_for_reading);
if (!indexes)
{
/// Analyzer generates unique ColumnIdentifiers like __table1.__partition_id in filter nodes,
/// while key analysis still requires unqualified column names.
std::unordered_map<std::string, ColumnWithTypeAndName> node_name_to_input_node_column;
if (query_info.planner_context)
{
const auto & table_expression_data = query_info.planner_context->getTableExpressionDataOrThrow(query_info.table_expression);
for (const auto & [column_identifier, column_name] : table_expression_data.getColumnIdentifierToColumnName())
{
const auto & column = table_expression_data.getColumnOrThrow(column_name);
node_name_to_input_node_column.emplace(column_identifier, ColumnWithTypeAndName(column.type, column_name));
}
}
filter_actions_dag = ActionsDAG::buildFilterActionsDAG(added_filter_nodes.nodes, node_name_to_input_node_column);
/// NOTE: Currently we store two DAGs for analysis:
/// (1) SourceStepWithFilter::filter_nodes, (2) query_info.filter_actions_dag. Make sure they are consistent.
/// TODO: Get rid of filter_actions_dag in query_info after we move analysis of
/// parallel replicas and unused shards into optimization, similar to projection analysis.
query_info.filter_actions_dag = filter_actions_dag;
buildIndexes(
indexes,
filter_actions_dag,
data,
prepared_parts,
context,
query_info,
metadata_for_reading);
}
}
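
The comment inside applyFilters() is the key point: with the analyzer enabled, filter nodes refer to qualified identifiers such as __table1.__partition_id, while key and partition analysis wants plain column names, so a rename map is built before the filter DAG is assembled. A small standalone sketch of that renaming step, with a toy predicate representation and illustrative names only:

#include <iostream>
#include <string>
#include <unordered_map>
#include <vector>

int main()
{
    /// Map produced from the table expression data:
    /// unique analyzer identifier -> unqualified column name.
    std::unordered_map<std::string, std::string> identifier_to_column{
        {"__table1.__partition_id", "_partition_id"},
        {"__table1.event_date", "event_date"},
    };

    /// Column references as they appear in the filter nodes.
    std::vector<std::string> filter_columns{"__table1.event_date", "__table1.__partition_id"};

    /// Rewrite them to the unqualified names expected by key analysis.
    for (auto & column : filter_columns)
        if (auto it = identifier_to_column.find(column); it != identifier_to_column.end())
            column = it->second;

    for (const auto & column : filter_columns)
        std::cout << column << '\n';  /// event_date, _partition_id
}
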
ReadFromMergeTree::AnalysisResultPtr ReadFromMergeTree::selectRangesToRead(
MergeTreeData::DataPartsVector parts,
std::vector<AlterConversionsPtr> alter_conversions,
const PrewhereInfoPtr & prewhere_info,
const ActionDAGNodes & added_filter_nodes,
const StorageMetadataPtr & metadata_snapshot,
const SelectQueryInfo & query_info,
ContextPtr context,
const SelectQueryInfo & query_info_,
ContextPtr context_,
size_t num_streams,
std::shared_ptr<PartitionIdToMaxBlock> max_block_numbers_to_read,
const MergeTreeData & data,
@ -1511,15 +1474,12 @@ ReadFromMergeTree::AnalysisResultPtr ReadFromMergeTree::selectRangesToRead(
LoggerPtr log,
std::optional<Indexes> & indexes)
{
auto updated_query_info_with_filter_dag = query_info;
updated_query_info_with_filter_dag.filter_actions_dag = buildFilterDAG(context, prewhere_info, added_filter_nodes, query_info);
return selectRangesToReadImpl(
std::move(parts),
std::move(alter_conversions),
metadata_snapshot,
updated_query_info_with_filter_dag,
context,
query_info_,
context_,
num_streams,
max_block_numbers_to_read,
data,
@ -1533,8 +1493,8 @@ ReadFromMergeTree::AnalysisResultPtr ReadFromMergeTree::selectRangesToReadImpl(
MergeTreeData::DataPartsVector parts,
std::vector<AlterConversionsPtr> alter_conversions,
const StorageMetadataPtr & metadata_snapshot,
const SelectQueryInfo & query_info,
ContextPtr context,
const SelectQueryInfo & query_info_,
ContextPtr context_,
size_t num_streams,
std::shared_ptr<PartitionIdToMaxBlock> max_block_numbers_to_read,
const MergeTreeData & data,
@ -1544,7 +1504,7 @@ ReadFromMergeTree::AnalysisResultPtr ReadFromMergeTree::selectRangesToReadImpl(
std::optional<Indexes> & indexes)
{
AnalysisResult result;
const auto & settings = context->getSettingsRef();
const auto & settings = context_->getSettingsRef();
size_t total_parts = parts.size();
@ -1562,7 +1522,7 @@ ReadFromMergeTree::AnalysisResultPtr ReadFromMergeTree::selectRangesToReadImpl(
const Names & primary_key_column_names = primary_key.column_names;
if (!indexes)
buildIndexes(indexes, query_info.filter_actions_dag, data, parts, context, query_info, metadata_snapshot);
buildIndexes(indexes, query_info_.filter_actions_dag, data, parts, context_, query_info_, metadata_snapshot);
if (indexes->part_values && indexes->part_values->empty())
return std::make_shared<AnalysisResult>(std::move(result));
@ -1594,19 +1554,19 @@ ReadFromMergeTree::AnalysisResultPtr ReadFromMergeTree::selectRangesToReadImpl(
indexes->part_values,
metadata_snapshot,
data,
context,
context_,
max_block_numbers_to_read.get(),
log,
result.index_stats);
result.sampling = MergeTreeDataSelectExecutor::getSampling(
query_info,
query_info_,
metadata_snapshot->getColumns().getAllPhysical(),
parts,
indexes->key_condition,
data,
metadata_snapshot,
context,
context_,
sample_factor_column_queried,
log);
@ -1617,12 +1577,12 @@ ReadFromMergeTree::AnalysisResultPtr ReadFromMergeTree::selectRangesToReadImpl(
total_marks_pk += part->index_granularity.getMarksCountWithoutFinal();
parts_before_pk = parts.size();
auto reader_settings = getMergeTreeReaderSettings(context, query_info);
auto reader_settings = getMergeTreeReaderSettings(context_, query_info_);
result.parts_with_ranges = MergeTreeDataSelectExecutor::filterPartsByPrimaryKeyAndSkipIndexes(
std::move(parts),
std::move(alter_conversions),
metadata_snapshot,
context,
context_,
indexes->key_condition,
indexes->part_offset_condition,
indexes->skip_indexes,
@ -1658,8 +1618,8 @@ ReadFromMergeTree::AnalysisResultPtr ReadFromMergeTree::selectRangesToReadImpl(
result.total_marks_pk = total_marks_pk;
result.selected_rows = sum_rows;
if (query_info.input_order_info)
result.read_type = (query_info.input_order_info->direction > 0)
if (query_info_.input_order_info)
result.read_type = (query_info_.input_order_info->direction > 0)
? ReadType::InOrder
: ReadType::InReverseOrder;
@ -1808,11 +1768,6 @@ ReadFromMergeTree::AnalysisResult ReadFromMergeTree::getAnalysisResult() const
return *result_ptr;
}
bool ReadFromMergeTree::isQueryWithFinal() const
{
return query_info.isFinal();
}
bool ReadFromMergeTree::isQueryWithSampling() const
{
if (context->getSettingsRef().parallel_replicas_count > 1 && data.supportsSampling())
@ -1920,6 +1875,11 @@ Pipe ReadFromMergeTree::groupStreamsByPartition(AnalysisResult & result, Actions
void ReadFromMergeTree::initializePipeline(QueryPipelineBuilder & pipeline, const BuildQueryPipelineSettings &)
{
auto result = getAnalysisResult();
/// Do not keep data parts in snapshot.
/// They are stored separately, and some could be released after PK analysis.
storage_snapshot->data = std::make_unique<MergeTreeData::SnapshotData>();
result.checkLimits(context->getSettingsRef(), query_info);
LOG_DEBUG(

View File
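
The two comment lines added to initializePipeline() explain why the snapshot's data parts are dropped before building the pipeline: the selected ranges already hold their own references, so releasing the snapshot's copy lets parts that were not selected be freed earlier. A minimal sketch of the same shared-ownership idea with std::shared_ptr and illustrative types only:

#include <iostream>
#include <memory>
#include <vector>

struct DataPart { int id = 0; };
using DataPartPtr = std::shared_ptr<DataPart>;

struct SnapshotData { std::vector<DataPartPtr> parts; };

int main()
{
    auto part = std::make_shared<DataPart>(DataPart{1});

    auto snapshot_data = std::make_unique<SnapshotData>();
    snapshot_data->parts.push_back(part);

    /// Analysis selected only some parts; those keep their own references.
    std::vector<DataPartPtr> selected_parts{part};

    std::cout << part.use_count() << '\n';  /// 3: local, snapshot, selected

    /// "Do not keep data parts in snapshot": replace the snapshot data,
    /// so parts that were not selected can be released right away.
    snapshot_data = std::make_unique<SnapshotData>();

    std::cout << part.use_count() << '\n';  /// 2: local, selected
}
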

@ -110,12 +110,13 @@ public:
ReadFromMergeTree(
MergeTreeData::DataPartsVector parts_,
std::vector<AlterConversionsPtr> alter_conversions_,
const Names & column_names_,
Names real_column_names_,
Names virt_column_names_,
const MergeTreeData & data_,
const SelectQueryInfo & query_info_,
StorageSnapshotPtr storage_snapshot,
ContextPtr context_,
const StorageSnapshotPtr & storage_snapshot,
const ContextPtr & context_,
size_t max_block_size_,
size_t num_streams_,
bool sample_factor_column_queried_,
@ -139,7 +140,6 @@ public:
const Names & getVirtualColumnNames() const { return virt_column_names; }
StorageID getStorageID() const { return data.getStorageID(); }
const StorageSnapshotPtr & getStorageSnapshot() const { return storage_snapshot; }
UInt64 getSelectedParts() const { return selected_parts; }
UInt64 getSelectedRows() const { return selected_rows; }
UInt64 getSelectedMarks() const { return selected_marks; }
@ -158,8 +158,6 @@ public:
static AnalysisResultPtr selectRangesToRead(
MergeTreeData::DataPartsVector parts,
std::vector<AlterConversionsPtr> alter_conversions,
const PrewhereInfoPtr & prewhere_info,
const ActionDAGNodes & added_filter_nodes,
const StorageMetadataPtr & metadata_snapshot,
const SelectQueryInfo & query_info,
ContextPtr context,
@ -175,17 +173,13 @@ public:
MergeTreeData::DataPartsVector parts,
std::vector<AlterConversionsPtr> alter_conversions) const;
ContextPtr getContext() const { return context; }
const SelectQueryInfo & getQueryInfo() const { return query_info; }
StorageMetadataPtr getStorageMetadata() const { return metadata_for_reading; }
const PrewhereInfoPtr & getPrewhereInfo() const { return prewhere_info; }
/// Returns `false` if requested reading cannot be performed.
bool requestReadingInOrder(size_t prefix_size, int direction, size_t limit);
bool readsInOrder() const;
void updatePrewhereInfo(const PrewhereInfoPtr & prewhere_info_value);
bool isQueryWithFinal() const;
void updatePrewhereInfo(const PrewhereInfoPtr & prewhere_info_value) override;
bool isQueryWithSampling() const;
/// Returns true if the optimization is applicable (and applies it then).
@ -203,7 +197,7 @@ public:
size_t getNumStreams() const { return requested_num_streams; }
bool isParallelReadingEnabled() const { return read_task_callback != std::nullopt; }
void applyFilters() override;
void applyFilters(ActionDAGNodes added_filter_nodes) override;
private:
static AnalysisResultPtr selectRangesToReadImpl(
@ -237,14 +231,10 @@ private:
Names virt_column_names;
const MergeTreeData & data;
SelectQueryInfo query_info;
PrewhereInfoPtr prewhere_info;
ExpressionActionsSettings actions_settings;
StorageSnapshotPtr storage_snapshot;
StorageMetadataPtr metadata_for_reading;
ContextPtr context;
const MergeTreeReadTask::BlockSizeParams block_size;
size_t requested_num_streams;

View File

@ -321,21 +321,24 @@ void shrinkRanges(Ranges & ranges, size_t size)
ReadFromSystemNumbersStep::ReadFromSystemNumbersStep(
const Names & column_names_,
StoragePtr storage_,
const SelectQueryInfo & query_info_,
const StorageSnapshotPtr & storage_snapshot_,
SelectQueryInfo & query_info,
ContextPtr context_,
const ContextPtr & context_,
StoragePtr storage_,
size_t max_block_size_,
size_t num_streams_)
: SourceStepWithFilter{DataStream{.header = storage_snapshot_->getSampleBlockForColumns(column_names_)}}
: SourceStepWithFilter(
DataStream{.header = storage_snapshot_->getSampleBlockForColumns(column_names_)},
column_names_,
query_info_,
storage_snapshot_,
context_)
, column_names{column_names_}
, storage{std::move(storage_)}
, storage_snapshot{storage_snapshot_}
, context{std::move(context_)}
, key_expression{KeyDescription::parse(column_names[0], storage_snapshot->metadata->columns, context).expression}
, max_block_size{max_block_size_}
, num_streams{num_streams_}
, limit_length_and_offset(InterpreterSelectQuery::getLimitLengthAndOffset(query_info.query->as<ASTSelectQuery&>(), context))
, limit_length_and_offset(InterpreterSelectQuery::getLimitLengthAndOffset(query_info.query->as<ASTSelectQuery &>(), context))
, should_pushdown_limit(shouldPushdownLimit(query_info, limit_length_and_offset.first))
, limit(query_info.limit)
, storage_limits(query_info.storage_limits)
@ -375,7 +378,7 @@ Pipe ReadFromSystemNumbersStep::makePipe()
num_streams = 1;
/// Build rpn of query filters
KeyCondition condition(buildFilterDAG(), context, column_names, key_expression);
KeyCondition condition(filter_actions_dag, context, column_names, key_expression);
Pipe pipe;
Ranges ranges;
@ -504,12 +507,6 @@ Pipe ReadFromSystemNumbersStep::makePipe()
return pipe;
}
ActionsDAGPtr ReadFromSystemNumbersStep::buildFilterDAG()
{
std::unordered_map<std::string, ColumnWithTypeAndName> node_name_to_input_node_column;
return ActionsDAG::buildFilterActionsDAG(filter_nodes.nodes, node_name_to_input_node_column);
}
void ReadFromSystemNumbersStep::checkLimits(size_t rows)
{
const auto & settings = context->getSettingsRef();

View File

@ -16,10 +16,10 @@ class ReadFromSystemNumbersStep final : public SourceStepWithFilter
public:
ReadFromSystemNumbersStep(
const Names & column_names_,
StoragePtr storage_,
const SelectQueryInfo & query_info_,
const StorageSnapshotPtr & storage_snapshot_,
SelectQueryInfo & query_info,
ContextPtr context_,
const ContextPtr & context_,
StoragePtr storage_,
size_t max_block_size_,
size_t num_streams_);
@ -32,12 +32,9 @@ private:
void checkLimits(size_t rows);
Pipe makePipe();
ActionsDAGPtr buildFilterDAG();
const Names column_names;
StoragePtr storage;
StorageSnapshotPtr storage_snapshot;
ContextPtr context;
ExpressionActionsPtr key_expression;
size_t max_block_size;
size_t num_streams;

View File

@ -0,0 +1,160 @@
#include <Processors/QueryPlan/SourceStepWithFilter.h>
#include <DataTypes/DataTypeLowCardinality.h>
#include <DataTypes/DataTypeNullable.h>
#include <IO/Operators.h>
#include <Interpreters/Context.h>
#include <Parsers/ASTSelectQuery.h>
#include <Common/JSONBuilder.h>
namespace DB
{
namespace ErrorCodes
{
extern const int ILLEGAL_TYPE_OF_COLUMN_FOR_FILTER;
}
Block SourceStepWithFilter::applyPrewhereActions(Block block, const PrewhereInfoPtr & prewhere_info)
{
if (prewhere_info)
{
if (prewhere_info->row_level_filter)
{
block = prewhere_info->row_level_filter->updateHeader(std::move(block));
auto & row_level_column = block.getByName(prewhere_info->row_level_column_name);
if (!row_level_column.type->canBeUsedInBooleanContext())
{
throw Exception(
ErrorCodes::ILLEGAL_TYPE_OF_COLUMN_FOR_FILTER,
"Invalid type for filter in PREWHERE: {}",
row_level_column.type->getName());
}
block.erase(prewhere_info->row_level_column_name);
}
if (prewhere_info->prewhere_actions)
{
block = prewhere_info->prewhere_actions->updateHeader(std::move(block));
auto & prewhere_column = block.getByName(prewhere_info->prewhere_column_name);
if (!prewhere_column.type->canBeUsedInBooleanContext())
{
throw Exception(
ErrorCodes::ILLEGAL_TYPE_OF_COLUMN_FOR_FILTER,
"Invalid type for filter in PREWHERE: {}",
prewhere_column.type->getName());
}
if (prewhere_info->remove_prewhere_column)
{
block.erase(prewhere_info->prewhere_column_name);
}
else if (prewhere_info->need_filter)
{
if (const auto * type = typeid_cast<const DataTypeNullable *>(prewhere_column.type.get()); type && type->onlyNull())
{
prewhere_column.column = prewhere_column.type->createColumnConst(block.rows(), Null());
}
else
{
WhichDataType which(removeNullable(recursiveRemoveLowCardinality(prewhere_column.type)));
if (which.isNativeInt() || which.isNativeUInt())
prewhere_column.column = prewhere_column.type->createColumnConst(block.rows(), 1u)->convertToFullColumnIfConst();
else if (which.isFloat())
prewhere_column.column = prewhere_column.type->createColumnConst(block.rows(), 1.0f)->convertToFullColumnIfConst();
else
throw Exception(
ErrorCodes::ILLEGAL_TYPE_OF_COLUMN_FOR_FILTER,
"Illegal type {} of column for filter",
prewhere_column.type->getName());
}
}
}
}
return block;
}
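
applyPrewhereActions adjusts the header that leaves the step: the prewhere column is either erased (remove_prewhere_column) or, when it must stay, replaced by a constant 1, since every surviving row passed the filter. A toy standalone sketch of that header fix-up, with a simplified "block" of name/value pairs and nothing ClickHouse-specific:

#include <algorithm>
#include <iostream>
#include <string>
#include <vector>

struct HeaderColumn
{
    std::string name;
    std::string value;  /// stand-in for type/constant information
};

using Header = std::vector<HeaderColumn>;

Header applyPrewhere(Header header, const std::string & prewhere_column, bool remove_prewhere_column)
{
    auto it = std::find_if(header.begin(), header.end(),
        [&](const HeaderColumn & c) { return c.name == prewhere_column; });
    if (it == header.end())
        return header;

    if (remove_prewhere_column)
        header.erase(it);               /// column is no longer needed downstream
    else
        it->value = "const 1";          /// rows that survive PREWHERE all match it

    return header;
}

int main()
{
    Header header{{"id", "UInt64"}, {"greater(x, 1)", "UInt8"}};
    for (const auto & c : applyPrewhere(header, "greater(x, 1)", /*remove_prewhere_column=*/true))
        std::cout << c.name << '\n';    /// only "id" remains
}
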
void SourceStepWithFilter::applyFilters(ActionDAGNodes added_filter_nodes)
{
filter_actions_dag = ActionsDAG::buildFilterActionsDAG(added_filter_nodes.nodes);
}
void SourceStepWithFilter::updatePrewhereInfo(const PrewhereInfoPtr & prewhere_info_value)
{
query_info.prewhere_info = prewhere_info_value;
prewhere_info = prewhere_info_value;
output_stream = DataStream{.header = applyPrewhereActions(output_stream->header, prewhere_info)};
}
void SourceStepWithFilter::describeActions(FormatSettings & format_settings) const
{
std::string prefix(format_settings.offset, format_settings.indent_char);
if (prewhere_info)
{
format_settings.out << prefix << "Prewhere info" << '\n';
format_settings.out << prefix << "Need filter: " << prewhere_info->need_filter << '\n';
prefix.push_back(format_settings.indent_char);
prefix.push_back(format_settings.indent_char);
if (prewhere_info->prewhere_actions)
{
format_settings.out << prefix << "Prewhere filter" << '\n';
format_settings.out << prefix << "Prewhere filter column: " << prewhere_info->prewhere_column_name;
if (prewhere_info->remove_prewhere_column)
format_settings.out << " (removed)";
format_settings.out << '\n';
auto expression = std::make_shared<ExpressionActions>(prewhere_info->prewhere_actions);
expression->describeActions(format_settings.out, prefix);
}
if (prewhere_info->row_level_filter)
{
format_settings.out << prefix << "Row level filter" << '\n';
format_settings.out << prefix << "Row level filter column: " << prewhere_info->row_level_column_name << '\n';
auto expression = std::make_shared<ExpressionActions>(prewhere_info->row_level_filter);
expression->describeActions(format_settings.out, prefix);
}
}
}
void SourceStepWithFilter::describeActions(JSONBuilder::JSONMap & map) const
{
if (prewhere_info)
{
std::unique_ptr<JSONBuilder::JSONMap> prewhere_info_map = std::make_unique<JSONBuilder::JSONMap>();
prewhere_info_map->add("Need filter", prewhere_info->need_filter);
if (prewhere_info->prewhere_actions)
{
std::unique_ptr<JSONBuilder::JSONMap> prewhere_filter_map = std::make_unique<JSONBuilder::JSONMap>();
prewhere_filter_map->add("Prewhere filter column", prewhere_info->prewhere_column_name);
prewhere_filter_map->add("Prewhere filter remove filter column", prewhere_info->remove_prewhere_column);
auto expression = std::make_shared<ExpressionActions>(prewhere_info->prewhere_actions);
prewhere_filter_map->add("Prewhere filter expression", expression->toTree());
prewhere_info_map->add("Prewhere filter", std::move(prewhere_filter_map));
}
if (prewhere_info->row_level_filter)
{
std::unique_ptr<JSONBuilder::JSONMap> row_level_filter_map = std::make_unique<JSONBuilder::JSONMap>();
row_level_filter_map->add("Row level filter column", prewhere_info->row_level_column_name);
auto expression = std::make_shared<ExpressionActions>(prewhere_info->row_level_filter);
row_level_filter_map->add("Row level filter expression", expression->toTree());
prewhere_info_map->add("Row level filter", std::move(row_level_filter_map));
}
map.add("Prewhere info", std::move(prewhere_info_map));
}
}
}

View File

@ -1,7 +1,9 @@
#pragma once
#include <Processors/QueryPlan/ISourceStep.h>
#include <Interpreters/ActionsDAG.h>
#include <Processors/QueryPlan/ISourceStep.h>
#include <Storages/SelectQueryInfo.h>
#include <Storages/StorageSnapshot.h>
namespace DB
{
@ -15,15 +17,31 @@ public:
using Base = ISourceStep;
using Base::Base;
const std::vector<ActionsDAGPtr> & getFilters() const
SourceStepWithFilter(
DataStream output_stream_,
const Names & column_names_,
const SelectQueryInfo & query_info_,
const StorageSnapshotPtr & storage_snapshot_,
const ContextPtr & context_)
: ISourceStep(std::move(output_stream_))
, required_source_columns(column_names_)
, query_info(query_info_)
, prewhere_info(query_info.prewhere_info)
, storage_snapshot(storage_snapshot_)
, context(context_)
{
return filter_dags;
}
const ActionDAGNodes & getFilterNodes() const
{
return filter_nodes;
}
const ActionsDAGPtr & getFilterActionsDAG() const { return filter_actions_dag; }
const SelectQueryInfo & getQueryInfo() const { return query_info; }
const PrewhereInfoPtr & getPrewhereInfo() const { return prewhere_info; }
ContextPtr getContext() const { return context; }
const StorageSnapshotPtr & getStorageSnapshot() const { return storage_snapshot; }
bool isQueryWithFinal() const { return query_info.isFinal(); }
const Names & requiredSourceColumns() const { return required_source_columns; }
void addFilter(ActionsDAGPtr filter_dag, std::string column_name)
{
@ -31,18 +49,41 @@ public:
filter_dags.push_back(std::move(filter_dag));
}
void addFilter(ActionsDAGPtr filter_dag, const ActionsDAG::Node * filter_node)
void addFilterFromParentStep(const ActionsDAG::Node * filter_node)
{
filter_nodes.nodes.push_back(filter_node);
filter_dags.push_back(std::move(filter_dag));
}
/// Apply filters that can optimize reading from storage.
virtual void applyFilters() {}
void applyFilters()
{
applyFilters(std::move(filter_nodes));
filter_dags = {};
}
virtual void applyFilters(ActionDAGNodes added_filter_nodes);
virtual void updatePrewhereInfo(const PrewhereInfoPtr & prewhere_info_value);
void describeActions(FormatSettings & format_settings) const override;
void describeActions(JSONBuilder::JSONMap & map) const override;
static Block applyPrewhereActions(Block block, const PrewhereInfoPtr & prewhere_info);
protected:
std::vector<ActionsDAGPtr> filter_dags;
Names required_source_columns;
SelectQueryInfo query_info;
PrewhereInfoPtr prewhere_info;
StorageSnapshotPtr storage_snapshot;
ContextPtr context;
ActionsDAGPtr filter_actions_dag;
private:
/// Will be cleared after applyFilters() is called.
ActionDAGNodes filter_nodes;
std::vector<ActionsDAGPtr> filter_dags;
};
}
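
The non-virtual applyFilters() declared above is a consume-once wrapper: it moves the accumulated filter nodes into the virtual overload and clears the per-step storage so they cannot be applied twice. A tiny sketch of that pattern in isolation, std-only, with illustrative names:

#include <iostream>
#include <string>
#include <utility>
#include <vector>

class SourceStepWithFilterSketch
{
public:
    void addFilterFromParentStep(std::string node) { filter_nodes.push_back(std::move(node)); }

    /// Non-virtual entry point: hand the collected nodes to the virtual hook
    /// exactly once, then drop the accumulated state.
    void applyFilters()
    {
        applyFilters(std::move(filter_nodes));
        filter_nodes.clear();
    }

    virtual ~SourceStepWithFilterSketch() = default;

protected:
    /// Default behaviour: build one combined filter out of the nodes.
    virtual void applyFilters(std::vector<std::string> added_filter_nodes)
    {
        std::cout << "building filter DAG from " << added_filter_nodes.size() << " nodes\n";
    }

private:
    std::vector<std::string> filter_nodes;
};

int main()
{
    SourceStepWithFilterSketch step;
    step.addFilterFromParentStep("a > 1");
    step.addFilterFromParentStep("b = 'x'");
    step.applyFilters();   /// prints: building filter DAG from 2 nodes
    step.applyFilters();   /// prints: building filter DAG from 0 nodes (already consumed)
}
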

View File

@ -16,13 +16,8 @@ protected:
/// Represents pushed down filters in source
std::shared_ptr<const KeyCondition> key_condition;
void setKeyConditionImpl(const ActionsDAG::NodeRawConstPtrs & nodes, ContextPtr context, const Block & keys)
void setKeyConditionImpl(const ActionsDAGPtr & filter_actions_dag, ContextPtr context, const Block & keys)
{
std::unordered_map<std::string, DB::ColumnWithTypeAndName> node_name_to_input_column;
for (const auto & column : keys.getColumnsWithTypeAndName())
node_name_to_input_column.insert({column.name, column});
auto filter_actions_dag = ActionsDAG::buildFilterActionsDAG(nodes, node_name_to_input_column);
key_condition = std::make_shared<const KeyCondition>(
filter_actions_dag,
context,
@ -37,7 +32,7 @@ public:
/// Set key_condition directly. It is used for filter push down in source.
virtual void setKeyCondition(const std::shared_ptr<const KeyCondition> & key_condition_) { key_condition = key_condition_; }
/// Set key_condition created by nodes and context.
virtual void setKeyCondition(const ActionsDAG::NodeRawConstPtrs & /*nodes*/, ContextPtr /*context*/) { }
/// Set key_condition created by filter_actions_dag and context.
virtual void setKeyCondition(const ActionsDAGPtr & /*filter_actions_dag*/, ContextPtr /*context*/) { }
};
}

View File

@ -548,7 +548,6 @@ Chain buildPushingToViewsChain(
result_chain.addSource(std::move(sink));
}
/// TODO: add pushing to live view
if (result_chain.empty())
result_chain.addSink(std::make_shared<NullSinkToStorage>(storage_header));

View File

@ -1298,9 +1298,7 @@ HTTPRequestHandlerFactoryPtr createDynamicHandlerFactory(IServer & server,
};
auto factory = std::make_shared<HandlingRuleHTTPHandlerFactory<DynamicQueryHandler>>(std::move(creator));
factory->addFiltersFromConfig(config, config_prefix);
return factory;
}

View File

@ -33,6 +33,16 @@ static void addDefaultHandlersFactory(
const Poco::Util::AbstractConfiguration & config,
AsynchronousMetrics & async_metrics);
static auto createPingHandlerFactory(IServer & server)
{
auto creator = [&server]() -> std::unique_ptr<StaticRequestHandler>
{
constexpr auto ping_response_expression = "Ok.\n";
return std::make_unique<StaticRequestHandler>(server, ping_response_expression);
};
return std::make_shared<HandlingRuleHTTPHandlerFactory<StaticRequestHandler>>(std::move(creator));
}
static inline auto createHandlersFactoryFromConfig(
IServer & server,
const Poco::Util::AbstractConfiguration & config,
@ -60,15 +70,49 @@ static inline auto createHandlersFactoryFromConfig(
"{}.{}.handler.type", prefix, key);
if (handler_type == "static")
{
main_handler_factory->addHandler(createStaticHandlerFactory(server, config, prefix + "." + key));
}
else if (handler_type == "dynamic_query_handler")
{
main_handler_factory->addHandler(createDynamicHandlerFactory(server, config, prefix + "." + key));
}
else if (handler_type == "predefined_query_handler")
{
main_handler_factory->addHandler(createPredefinedHandlerFactory(server, config, prefix + "." + key));
}
else if (handler_type == "prometheus")
{
main_handler_factory->addHandler(createPrometheusHandlerFactory(server, config, async_metrics, prefix + "." + key));
}
else if (handler_type == "replicas_status")
{
main_handler_factory->addHandler(createReplicasStatusHandlerFactory(server, config, prefix + "." + key));
}
else if (handler_type == "ping")
{
auto handler = createPingHandlerFactory(server);
handler->addFiltersFromConfig(config, prefix + "." + key);
main_handler_factory->addHandler(std::move(handler));
}
else if (handler_type == "play")
{
auto handler = std::make_shared<HandlingRuleHTTPHandlerFactory<PlayWebUIRequestHandler>>(server);
handler->addFiltersFromConfig(config, prefix + "." + key);
main_handler_factory->addHandler(std::move(handler));
}
else if (handler_type == "dashboard")
{
auto handler = std::make_shared<HandlingRuleHTTPHandlerFactory<DashboardWebUIRequestHandler>>(server);
handler->addFiltersFromConfig(config, prefix + "." + key);
main_handler_factory->addHandler(std::move(handler));
}
else if (handler_type == "binary")
{
auto handler = std::make_shared<HandlingRuleHTTPHandlerFactory<BinaryWebUIRequestHandler>>(server);
handler->addFiltersFromConfig(config, prefix + "." + key);
main_handler_factory->addHandler(std::move(handler));
}
else
throw Exception(ErrorCodes::INVALID_CONFIG_PARAMETER, "Unknown handler type '{}' in config here: {}.{}.handler.type",
handler_type, prefix, key);
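
The if/else chain above dispatches on the handler.type value from the configuration and wires per-handler URL/method filters before registration. A compact sketch of the same dispatch expressed as a map from type name to factory; the names below are hypothetical, and the real ClickHouse factories take the server and config, which are omitted here:

#include <functional>
#include <iostream>
#include <memory>
#include <string>
#include <unordered_map>

struct Handler
{
    std::string description;
};

using HandlerFactory = std::function<std::shared_ptr<Handler>()>;

int main()
{
    /// handler.type value from the config -> factory that builds the handler.
    std::unordered_map<std::string, HandlerFactory> factories{
        {"ping", [] { return std::make_shared<Handler>(Handler{"static response Ok.\n"}); }},
        {"play", [] { return std::make_shared<Handler>(Handler{"Play UI"}); }},
        {"dashboard", [] { return std::make_shared<Handler>(Handler{"Dashboard UI"}); }},
    };

    const std::string handler_type = "ping";  /// read from <handler><type> in the config

    auto it = factories.find(handler_type);
    if (it == factories.end())
    {
        std::cerr << "Unknown handler type '" << handler_type << "'\n";
        return 1;
    }

    auto handler = it->second();
    /// Here the real code would also apply the configured URL/method filters
    /// before adding the handler to the main factory.
    std::cout << handler->description;
}
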
@ -108,6 +152,7 @@ static inline HTTPRequestHandlerFactoryPtr createInterserverHTTPHandlerFactory(I
return factory;
}
HTTPRequestHandlerFactoryPtr createHandlerFactory(IServer & server, const Poco::Util::AbstractConfiguration & config, AsynchronousMetrics & async_metrics, const std::string & name)
{
if (name == "HTTPHandler-factory" || name == "HTTPSHandler-factory")
@ -136,12 +181,7 @@ void addCommonDefaultHandlersFactory(HTTPRequestHandlerFactoryMain & factory, IS
root_handler->allowGetAndHeadRequest();
factory.addHandler(root_handler);
auto ping_creator = [&server]() -> std::unique_ptr<StaticRequestHandler>
{
constexpr auto ping_response_expression = "Ok.\n";
return std::make_unique<StaticRequestHandler>(server, ping_response_expression);
};
auto ping_handler = std::make_shared<HandlingRuleHTTPHandlerFactory<StaticRequestHandler>>(std::move(ping_creator));
auto ping_handler = createPingHandlerFactory(server);
ping_handler->attachStrictPath("/ping");
ping_handler->allowGetAndHeadRequest();
factory.addPathToHints("/ping");
@ -153,25 +193,25 @@ void addCommonDefaultHandlersFactory(HTTPRequestHandlerFactoryMain & factory, IS
factory.addPathToHints("/replicas_status");
factory.addHandler(replicas_status_handler);
auto play_handler = std::make_shared<HandlingRuleHTTPHandlerFactory<WebUIRequestHandler>>(server);
auto play_handler = std::make_shared<HandlingRuleHTTPHandlerFactory<PlayWebUIRequestHandler>>(server);
play_handler->attachNonStrictPath("/play");
play_handler->allowGetAndHeadRequest();
factory.addPathToHints("/play");
factory.addHandler(play_handler);
auto dashboard_handler = std::make_shared<HandlingRuleHTTPHandlerFactory<WebUIRequestHandler>>(server);
auto dashboard_handler = std::make_shared<HandlingRuleHTTPHandlerFactory<DashboardWebUIRequestHandler>>(server);
dashboard_handler->attachNonStrictPath("/dashboard");
dashboard_handler->allowGetAndHeadRequest();
factory.addPathToHints("/dashboard");
factory.addHandler(dashboard_handler);
auto binary_handler = std::make_shared<HandlingRuleHTTPHandlerFactory<WebUIRequestHandler>>(server);
auto binary_handler = std::make_shared<HandlingRuleHTTPHandlerFactory<BinaryWebUIRequestHandler>>(server);
binary_handler->attachNonStrictPath("/binary");
binary_handler->allowGetAndHeadRequest();
factory.addPathToHints("/binary");
factory.addHandler(binary_handler);
auto js_handler = std::make_shared<HandlingRuleHTTPHandlerFactory<WebUIRequestHandler>>(server);
auto js_handler = std::make_shared<HandlingRuleHTTPHandlerFactory<JavaScriptWebUIRequestHandler>>(server);
js_handler->attachNonStrictPath("/js/");
js_handler->allowGetAndHeadRequest();
factory.addHandler(js_handler);

Some files were not shown because too many files have changed in this diff.