Merge branch 'master' into randomize-mt-settings
commit d8fe2bcbaa

contrib/poco (vendored, 2 lines changed)
@@ -1 +1 @@
-Subproject commit 4b1c8dd9913d2a16db62df0e509fa598da5c8219
+Subproject commit 7fefdf30244a9bf8eb58562a9b2a51cc59a8877a
@@ -235,6 +235,7 @@ function run_tests
            --check-zookeeper-session
            --order random
            --print-time
+           --report-logs-stats
            --jobs "${NPROC}"
        )
        time clickhouse-test "${test_opts[@]}" -- "$FASTTEST_FOCUS" 2>&1 \
@@ -1,15 +1,22 @@
-# Inverted indexes [experimental] {#table_engines-ANNIndex}
+---
+slug: /en/engines/table-engines/mergetree-family/invertedindexes
+sidebar_label: Inverted Indexes
+description: Quickly find search terms in text.
+keywords: [full-text search, text search]
+---
+
+# Inverted indexes [experimental]

-Inverted indexes are an experimental type of [secondary indexes](mergetree.md#available-types-of-indices) which provide fast text search
-capabilities for [String](../../../sql-reference/data-types/string.md) or [FixedString](../../../sql-reference/data-types/fixedstring.md)
-columns. The main idea of an inverted indexes is to store a mapping from "terms" to the rows which contains these terms. "Terms" are
-tokenized cells of the string column. For example, string cell "I will be a little late" is by default tokenized into six terms "I", "will",
-"be", "a", "little" and "late". Another kind of tokenizer are n-grams. For example, the result of 3-gram tokenization will be 21 terms "I w",
+Inverted indexes are an experimental type of [secondary indexes](/docs/en/engines/table-engines/mergetree-family/mergetree.md/#available-types-of-indices) which provide fast text search
+capabilities for [String](/docs/en/sql-reference/data-types/string.md) or [FixedString](/docs/en/sql-reference/data-types/fixedstring.md)
+columns. The main idea of an inverted index is to store a mapping from "terms" to the rows which contain these terms. "Terms" are
+tokenized cells of the string column. For example, the string cell "I will be a little late" is by default tokenized into six terms "I", "will",
+"be", "a", "little" and "late". Another kind of tokenizer is n-grams. For example, the result of 3-gram tokenization will be 21 terms "I w",
 " wi", "wil", "ill", "ll ", "l b", " be" etc. The more fine-granular the input strings are tokenized, the bigger but also the more
 useful the resulting inverted index will be.

 :::warning
-Inverted indexes are experimental and should not be used in production environment yet. They may change in future in backwards-incompatible
+Inverted indexes are experimental and should not be used in production environments yet. They may change in the future in backward-incompatible
 ways, for example with respect to their DDL/DQL syntax or performance/compression characteristics.
 :::
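An aside to make the tokenization above concrete: recent ClickHouse versions also expose standalone `tokens()` and `ngrams()` functions (availability depends on your version), so the two modes can be previewed directly; a minimal sketch:

```sql
SELECT tokens('I will be a little late');
-- ['I','will','be','a','little','late']
SELECT ngrams('I will be a little late', 3);
-- ['I w',' wi','wil','ill', ...] (21 terms in total)
```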
@@ -24,7 +31,14 @@ SET allow_experimental_inverted_index = true;
 An inverted index can be defined on a string column using the following syntax

 ``` sql
-CREATE TABLE tab (key UInt64, str String, INDEX inv_idx(s) TYPE inverted(N) GRANULARITY 1) Engine=MergeTree ORDER BY (k);
+CREATE TABLE tab
+(
+    `key` UInt64,
+    `str` String,
+    INDEX inv_idx(str) TYPE inverted(0) GRANULARITY 1
+)
+ENGINE = MergeTree
+ORDER BY key
 ```

 where `N` specifies the tokenizer:
@@ -32,7 +46,7 @@ where `N` specifies the tokenizer:
 - `inverted(0)` (or shorter: `inverted()`) set the tokenizer to "tokens", i.e. split strings along spaces,
 - `inverted(N)` with `N` between 2 and 8 sets the tokenizer to "ngrams(N)"

-Being a type of skipping indexes, inverted indexes can be dropped or added to a column after table creation:
+Being a type of skipping index, inverted indexes can be dropped or added to a column after table creation:

 ``` sql
 ALTER TABLE tbl DROP INDEX inv_idx;
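Worth noting next to the DROP example above: an index added to an existing table only covers newly inserted parts until it is materialized. A hedged sketch of the add-and-backfill flow (same `tbl` and `inv_idx` names as in the hunk above):

```sql
ALTER TABLE tbl ADD INDEX inv_idx(str) TYPE inverted(0) GRANULARITY 1;
ALTER TABLE tbl MATERIALIZE INDEX inv_idx; -- build the index for parts that already exist
```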
@@ -54,13 +68,13 @@ SELECT * from tab WHERE multiSearchAll(s, ['Hello', 'World']);
 The inverted index also works on columns of type `Array(String)`, `Array(FixedString)`, `Map(String)` and `Map(String)`.

 Like for other secondary indices, each column part has its own inverted index. Furthermore, each inverted index is internally divided into
-"segments". The existence and size of the segments is generally transparent to users but the segment size determines the memory consumption
+"segments". The existence and size of the segments are generally transparent to users but the segment size determines the memory consumption
 during index construction (e.g. when two parts are merged). Configuration parameter "max_digestion_size_per_segment" (default: 256 MB)
 controls the amount of data read consumed from the underlying column before a new segment is created. Incrementing the parameter raises the
-intermediate memory consumption for index constuction but also improves lookup performance since fewer segments need to be checked on
+intermediate memory consumption for index construction but also improves lookup performance since fewer segments need to be checked on
 average to evaluate a query.

 Unlike other secondary indices, inverted indexes (for now) map to row numbers (row ids) instead of granule ids. The reason for this design
 is performance. In practice, users often search for multiple terms at once. For example, filter predicate `WHERE s LIKE '%little%' OR s LIKE
-'%big%'` can be evaluated directly using an inverted index by forming the union of the rowid lists for terms "little" and "big". This also
-means that parameter `GRANULARITY` supplied to index creation has no meaning (it may be removed from the syntax in future).
+'%big%'` can be evaluated directly using an inverted index by forming the union of the row id lists for terms "little" and "big". This also
+means that the parameter `GRANULARITY` supplied to index creation has no meaning (it may be removed from the syntax in the future).
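The row-id union described above is exercised by ordinary filter predicates; a minimal sketch against the `tab` table from the earlier hunk (`hasToken` is one of the text functions such an index can accelerate):

```sql
SELECT count() FROM tab WHERE str LIKE '%little%' OR str LIKE '%big%';
SELECT count() FROM tab WHERE hasToken(str, 'little') OR hasToken(str, 'big');
```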
docs/en/getting-started/example-datasets/laion.md (new file, 259 lines)
@@ -0,0 +1,259 @@
# Laion-400M dataset

The dataset contains 400 million images with English text. For more information follow this [link](https://laion.ai/blog/laion-400-open-dataset/). Laion provides even larger datasets (e.g. [5 billion](https://laion.ai/blog/laion-5b/)); working with them will be similar.

The dataset comes with prepared embeddings for texts and images. These will be used to demonstrate [Approximate nearest neighbor search indexes](../../engines/table-engines/mergetree-family/annindexes.md).

## Prepare data

Embeddings are stored in `.npy` files, so we have to read them with Python and merge them with the other data.

Download the data and process it with a simple `download.sh` script:

```bash
wget --tries=100 https://deploy.laion.ai/8f83b608504d46bb81708ec86e912220/embeddings/img_emb/img_emb_${1}.npy
wget --tries=100 https://deploy.laion.ai/8f83b608504d46bb81708ec86e912220/embeddings/metadata/metadata_${1}.parquet
wget --tries=100 https://deploy.laion.ai/8f83b608504d46bb81708ec86e912220/embeddings/text_emb/text_emb_${1}.npy
python3 process.py ${1}
```
Where `process.py`:

```python
import pandas as pd
import numpy as np
import os
import sys

str_i = str(sys.argv[1])
npy_file = "img_emb_" + str_i + '.npy'
metadata_file = "metadata_" + str_i + '.parquet'
text_npy = "text_emb_" + str_i + '.npy'

# load all files
im_emb = np.load(npy_file)
text_emb = np.load(text_npy)
data = pd.read_parquet(metadata_file)

# combine them
data = pd.concat([data, pd.DataFrame({"image_embedding" : [*im_emb]}), pd.DataFrame({"text_embedding" : [*text_emb]})], axis=1, copy=False)

# you can save more columns
data = data[['url', 'caption', 'similarity', "image_embedding", "text_embedding"]]

# transform np.arrays to lists
data['image_embedding'] = data['image_embedding'].apply(lambda x: list(x))
data['text_embedding'] = data['text_embedding'].apply(lambda x: list(x))

# this small hack is needed because captions sometimes contain all kinds of quotes
data['caption'] = data['caption'].apply(lambda x: x.replace("'", " ").replace('"', " "))

# save data to file
data.to_csv(str_i + '.csv', header=False)

# previous files can be removed
os.system(f"rm {npy_file} {metadata_file} {text_npy}")
```
You can download the data with

```bash
seq 0 409 | xargs -P100 -I{} bash -c './download.sh {}'
```

The dataset is divided into 410 files (numbered 0 to 409). If you want to work with only a certain part of the dataset, just change the limits.
## Create table for laion

Without indexes, the table can be created by

```sql
CREATE TABLE laion_dataset
(
    `id` Int64,
    `url` String,
    `caption` String,
    `similarity` Float32,
    `image_embedding` Array(Float32),
    `text_embedding` Array(Float32)
)
ENGINE = MergeTree
ORDER BY id
SETTINGS index_granularity = 8192
```

Fill the table with data:

```sql
INSERT INTO laion_dataset FROM INFILE '{path_to_csv_files}/*.csv'
```
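After the load finishes, a quick sanity check of row count and data shape is worthwhile; a minimal sketch:

```sql
SELECT count(), any(url), any(length(image_embedding)) FROM laion_dataset;
-- expect the embedding length to be 512 for the ViT-B/32 model used here
```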
## Check data in table without indexes

Let's check how the following query performs on a part of the dataset (8 million records):

```sql
select url, caption from test_laion where similarity > 0.2 order by L2Distance(image_embedding, {target:Array(Float32)}) limit 30
```

Since the embeddings for images and texts may not match, we also require a certain threshold of matching accuracy so that the returned images are more likely to satisfy the query. The query uses the client parameter `target`, which is an array of 512 elements; see later in this article for a convenient way of obtaining such vectors. Here, a random picture of a cat from the Internet was used as the target vector.
**The result**

```
┌─url───────────────────────────────────────────────────────────────────────────────────────────────────────────┬─caption────────────────────────────────────────────────────────────────┐
│ https://s3.amazonaws.com/filestore.rescuegroups.org/6685/pictures/animals/13884/13884995/63318230_463x463.jpg  │ Adoptable Female Domestic Short Hair                                    │
│ https://s3.amazonaws.com/pet-uploads.adoptapet.com/8/b/6/239905226.jpg                                         │ Adopt A Pet :: Marzipan - New York, NY                                  │
│ http://d1n3ar4lqtlydb.cloudfront.net/9/2/4/248407625.jpg                                                       │ Adopt A Pet :: Butterscotch - New Castle, DE                            │
│ https://s3.amazonaws.com/pet-uploads.adoptapet.com/e/e/c/245615237.jpg                                         │ Adopt A Pet :: Tiggy - Chicago, IL                                      │
│ http://pawsofcoronado.org/wp-content/uploads/2012/12/rsz_pumpkin.jpg                                           │ Pumpkin an orange tabby kitten for adoption                             │
│ https://s3.amazonaws.com/pet-uploads.adoptapet.com/7/8/3/188700997.jpg                                         │ Adopt A Pet :: Brian the Brad Pitt of cats - Frankfort, IL              │
│ https://s3.amazonaws.com/pet-uploads.adoptapet.com/8/b/d/191533561.jpg                                         │ Domestic Shorthair Cat for adoption in Mesa, Arizona - Charlie          │
│ https://s3.amazonaws.com/pet-uploads.adoptapet.com/0/1/2/221698235.jpg                                         │ Domestic Shorthair Cat for adoption in Marietta, Ohio - Daisy (Spayed)  │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────┴────────────────────────────────────────────────────────────────────────┘

8 rows in set. Elapsed: 6.432 sec. Processed 19.65 million rows, 43.96 GB (3.06 million rows/s., 6.84 GB/s.)
```
## Add indexes

Create a new table or follow the instructions from the [alter documentation](../../sql-reference/statements/alter/skipping-index.md).

```sql
CREATE TABLE laion_dataset
(
    `id` Int64,
    `url` String,
    `caption` String,
    `similarity` Float32,
    `image_embedding` Array(Float32),
    `text_embedding` Array(Float32),
    INDEX annoy_image image_embedding TYPE annoy(1000) GRANULARITY 1000,
    INDEX annoy_text text_embedding TYPE annoy(1000) GRANULARITY 1000
)
ENGINE = MergeTree
ORDER BY id
SETTINGS index_granularity = 8192
```

When created, the index will be built using the L2Distance metric. You can read more about the parameters in the [annoy documentation](../../engines/table-engines/mergetree-family/annindexes.md#annoy-annoy). It makes sense to build the index over a large number of granules. If you need good search speed, then `GRANULARITY` should be several times larger than the expected number of results of the search.

Now let's check again with the same query:
```sql
select url, caption from test_indexes_laion where similarity > 0.2 order by L2Distance(image_embedding, {target:Array(Float32)}) limit 8
```

**Result**

```
┌─url──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┬─caption──────────────────────────────────────────────────────────────┐
│ http://tse1.mm.bing.net/th?id=OIP.R1CUoYp_4hbeFSHBaaB5-gHaFj                                                                                                                           │ bed bugs and pets can cats carry bed bugs pets adviser                │
│ http://pet-uploads.adoptapet.com/1/9/c/1963194.jpg?336w                                                                                                                                │ Domestic Longhair Cat for adoption in Quincy, Massachusetts - Ashley  │
│ https://thumbs.dreamstime.com/t/cat-bed-12591021.jpg                                                                                                                                   │ Cat on bed Stock Image                                                │
│ https://us.123rf.com/450wm/penta/penta1105/penta110500004/9658511-portrait-of-british-short-hair-kitten-lieing-at-sofa-on-sun.jpg                                                      │ Portrait of british short hair kitten lieing at sofa on sun.          │
│ https://www.easypetmd.com/sites/default/files/Wirehaired%20Vizsla%20(2).jpg                                                                                                            │ Vizsla (Wirehaired) image 3                                           │
│ https://images.ctfassets.net/yixw23k2v6vo/0000000200009b8800000000/7950f4e1c1db335ef91bb2bc34428de9/dog-cat-flickr-Impatience_1.jpg?w=600&h=400&fm=jpg&fit=thumb&q=65&fl=progressive   │ dog and cat image                                                     │
│ https://i1.wallbox.ru/wallpapers/small/201523/eaa582ee76a31fd.jpg                                                                                                                      │ cats, kittens, faces, tonkinese                                       │
│ https://www.baxterboo.com/images/breeds/medium/cairn-terrier.jpg                                                                                                                       │ Cairn Terrier Photo                                                   │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┴──────────────────────────────────────────────────────────────────────┘

8 rows in set. Elapsed: 0.641 sec. Processed 22.06 thousand rows, 49.36 MB (91.53 thousand rows/s., 204.81 MB/s.)
```
The speed has increased significantly, but the results now sometimes differ from what you are looking for. This is due to the approximate nature of the search and the quality of the constructed embeddings. Note that the example used image embeddings, but the dataset also contains text embeddings, which can be used for searching in the same way.

## Scripts for embeddings

Usually we do not want to compute embeddings for existing data; we want to obtain them for new data in order to look for similar items in the old data. We can use a [UDF](../../sql-reference/functions/index.md#sql-user-defined-functions) for this purpose. It allows you to set the `target` vector without leaving the client. All of the following scripts are written for the `ViT-B/32` model, as it was used for this dataset. You can use any model, but it is necessary to build the embeddings in the dataset and for new objects using the same model.

### Text embeddings

`encode_text.py`:
```python
#!/usr/bin/python3
import clip
import torch
import numpy as np
import sys

if __name__ == '__main__':
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model, preprocess = clip.load("ViT-B/32", device=device)
    for text in sys.stdin:
        inputs = clip.tokenize(text)
        with torch.no_grad():
            text_features = model.encode_text(inputs)[0].tolist()
        print(text_features)  # write the embedding to stdout so ClickHouse receives it (mirrors encode_picture.py below)
        sys.stdout.flush()
```
`encode_text_function.xml`:

```xml
<functions>
    <function>
        <type>executable</type>
        <name>encode_text</name>
        <return_type>Array(Float32)</return_type>
        <argument>
            <type>String</type>
            <name>text</name>
        </argument>
        <format>TabSeparated</format>
        <command>encode_text.py</command>
        <command_read_timeout>1000000</command_read_timeout>
    </function>
</functions>
```
Now we can simply use:

```sql
SELECT encode_text('cat');
```

The first use will be slow because the model needs to be loaded, but repeated queries will be fast. We can then copy the result into `SET param_target = ...` and easily write queries.
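Putting the pieces together, a hedged sketch of the full text-to-search workflow just described (table and column names follow the CREATE statement earlier in this article; the pasted vector is truncated here):

```sql
SELECT encode_text('cat');  -- 1. compute the embedding, copy the printed array

SET param_target = [0.0123, -0.0456, 0.0789];  -- 2. in reality, paste all 512 values

SELECT url, caption                            -- 3. search with the vector
FROM laion_dataset
ORDER BY L2Distance(text_embedding, {target:Array(Float32)})
LIMIT 8;
```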
### Image embeddings

For pictures the process is similar: you send the UDF the path of a picture instead of the picture itself (downloading and processing remote pictures could also be implemented if necessary, but it would take longer).

`encode_picture.py`
```python
#!/usr/bin/python3
import clip
import torch
import numpy as np
from PIL import Image
import sys

if __name__ == '__main__':
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model, preprocess = clip.load("ViT-B/32", device=device)
    for text in sys.stdin:
        image = preprocess(Image.open(text.strip())).unsqueeze(0).to(device)
        with torch.no_grad():
            image_features = model.encode_image(image)[0].tolist()
        print(image_features)
        sys.stdout.flush()
```
`encode_picture_function.xml`

```xml
<functions>
    <function>
        <type>executable_pool</type>
        <name>encode_picture</name>
        <return_type>Array(Float32)</return_type>
        <argument>
            <type>String</type>
            <name>path</name>
        </argument>
        <format>TabSeparated</format>
        <command>encode_picture.py</command>
        <command_read_timeout>1000000</command_read_timeout>
    </function>
</functions>
```
The query:

```sql
SELECT encode_picture('some/path/to/your/picture');
```
@@ -22,6 +22,7 @@ functions in ClickHouse. The sample datasets include:
 - The [Cell Towers dataset](../getting-started/example-datasets/cell-towers.md) imports a CSV into ClickHouse
 - The [NYPD Complaint Data](../getting-started/example-datasets/nypd_complaint_data.md) demonstrates how to use data inference to simplify creating tables
 - The ["What's on the Menu?" dataset](../getting-started/example-datasets/menus.md) has an example of denormalizing data
+- The [Laion dataset](../getting-started/example-datasets/laion.md) has an example of [Approximate nearest neighbor search indexes](../engines/table-engines/mergetree-family/annindexes.md) usage
 - [Getting Data Into ClickHouse - Part 1](https://clickhouse.com/blog/getting-data-into-clickhouse-part-1) provides examples of defining a schema and loading a small Hacker News dataset
 - [Getting Data Into ClickHouse - Part 3 - Using S3](https://clickhouse.com/blog/getting-data-into-clickhouse-part-3-s3) has examples of loading data from s3
 - [Generating random data in ClickHouse](https://clickhouse.com/blog/generating-random-test-distribution-data-for-clickhouse) shows how to generate random data if none of the above fit your needs.
@@ -83,6 +83,7 @@ The supported formats are:
 | [RawBLOB](#rawblob)     | ✔ | ✔ |
 | [MsgPack](#msgpack)     | ✔ | ✔ |
 | [MySQLDump](#mysqldump) | ✔ | ✗ |
+| [Markdown](#markdown)   | ✗ | ✔ |

 You can control some format processing parameters with the ClickHouse settings. For more information read the [Settings](/docs/en/operations/settings/settings-formats.md) section.
@@ -2347,3 +2348,26 @@ FROM file(dump.sql, MySQLDump)
 │ 3 │
 └───┘
 ```
+
+## Markdown {#markdown}
+
+You can export results in [Markdown](https://en.wikipedia.org/wiki/Markdown) format to generate output ready to be pasted into your `.md` files:
+
+```sql
+SELECT
+    number,
+    number * 2
+FROM numbers(5)
+FORMAT Markdown
+```
+```results
+| number | multiply(number, 2) |
+|-:|-:|
+| 0 | 0 |
+| 1 | 2 |
+| 2 | 4 |
+| 3 | 6 |
+| 4 | 8 |
+```
+
+The Markdown table is generated automatically and can be used on Markdown-enabled platforms, like GitHub. This format is used only for output.
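Since the format is output-only, a natural companion is writing the result straight to a file; a hedged sketch using the standard `INTO OUTFILE` clause:

```sql
SELECT
    number,
    number * 2
FROM numbers(5)
INTO OUTFILE 'numbers.md'
FORMAT Markdown
```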
@@ -1915,6 +1915,21 @@ Possible values:

 Default value: 0.

+## optimize_skip_merged_partitions {#optimize-skip-merged-partitions}
+
+Enables or disables an optimization for the [OPTIMIZE TABLE ... FINAL](../../sql-reference/statements/optimize.md) query when there is only one part with level > 0 and it does not have an expired TTL.
+
+- `OPTIMIZE TABLE ... FINAL SETTINGS optimize_skip_merged_partitions=1`
+
+By default, the `OPTIMIZE TABLE ... FINAL` query rewrites the part even if there is only a single part.
+
+Possible values:
+
+- 1 - Enable optimization.
+- 0 - Disable optimization.
+
+Default value: 0.
+
 ## optimize_functions_to_subcolumns {#optimize-functions-to-subcolumns}

 Enables or disables optimization by transforming some functions to reading subcolumns. This reduces the amount of data to read.
@@ -290,15 +290,11 @@ This storage method works the same way as hashed and allows using date/time (arb
 Example: The table contains discounts for each advertiser in the format:

 ``` text
-+---------------+---------------------+-------------------+--------+
-| advertiser id | discount start date | discount end date | amount |
-+===============+=====================+===================+========+
-| 123           | 2015-01-01          | 2015-01-15        | 0.15   |
-+---------------+---------------------+-------------------+--------+
-| 123           | 2015-01-16          | 2015-01-31        | 0.25   |
-+---------------+---------------------+-------------------+--------+
-| 456           | 2015-01-01          | 2015-01-15        | 0.05   |
-+---------------+---------------------+-------------------+--------+
+┌─advertiser_id─┬─discount_start_date─┬─discount_end_date─┬─amount─┐
+│           123 │          2015-01-16 │        2015-01-31 │   0.25 │
+│           123 │          2015-01-01 │        2015-01-15 │   0.15 │
+│           456 │          2015-01-01 │        2015-01-15 │   0.05 │
+└───────────────┴─────────────────────┴───────────────────┴────────┘
 ```

 To use a sample for date ranges, define the `range_min` and `range_max` elements in the [structure](../../../sql-reference/dictionaries/external-dictionaries/external-dicts-dict-structure.md). These elements must contain elements `name` and `type` (if `type` is not specified, the default type will be used - Date). `type` can be any numeric type (Date / DateTime / UInt64 / Int32 / others).
@@ -307,19 +303,25 @@ To use a sample for date ranges, define the `range_min` and `range_max` elements
 Values of `range_min` and `range_max` should fit in `Int64` type.
 :::

 Example:

 ``` xml
 <layout>
     <range_hashed>
+        <!-- Strategy for overlapping ranges (min/max). Default: min (return a matching range with the min(range_min -> range_max) value) -->
+        <range_lookup_strategy>min</range_lookup_strategy>
     </range_hashed>
 </layout>
 <structure>
     <id>
-        <name>Id</name>
+        <name>advertiser_id</name>
     </id>
     <range_min>
-        <name>first</name>
+        <name>discount_start_date</name>
         <type>Date</type>
     </range_min>
     <range_max>
-        <name>last</name>
+        <name>discount_end_date</name>
         <type>Date</type>
     </range_max>
     ...
@@ -328,17 +330,17 @@ Example:
 or

 ``` sql
-CREATE DICTIONARY somedict (
-    id UInt64,
-    first Date,
-    last Date,
-    advertiser_id UInt64
+CREATE DICTIONARY discounts_dict (
+    advertiser_id UInt64,
+    discount_start_date Date,
+    discount_end_date Date,
+    amount Float64
 )
 PRIMARY KEY id
-SOURCE(CLICKHOUSE(TABLE 'date_table'))
+SOURCE(CLICKHOUSE(TABLE 'discounts'))
 LIFETIME(MIN 1 MAX 1000)
-LAYOUT(RANGE_HASHED())
-RANGE(MIN first MAX last)
+LAYOUT(RANGE_HASHED(range_lookup_strategy 'max'))
+RANGE(MIN discount_start_date MAX discount_end_date)
 ```

 To work with these dictionaries, you need to pass an additional argument to the `dictGet` function, for which a range is selected:
@@ -349,16 +351,17 @@ dictGet('dict_name', 'attr_name', id, date)
 Query example:

 ``` sql
-SELECT dictGet('somedict', 'advertiser_id', 1, '2022-10-20 23:20:10.000'::DateTime64::UInt64);
+SELECT dictGet('discounts_dict', 'amount', 1, '2022-10-20'::Date);
 ```

 This function returns the value for the specified `id`s and the date range that includes the passed date.

 Details of the algorithm:

-- If the `id` is not found or a range is not found for the `id`, it returns the default value for the dictionary.
-- If there are overlapping ranges, it returns value for any (random) range.
-- If the range delimiter is `NULL` or an invalid date (such as 1900-01-01), the range is open. The range can be open on both sides.
+- If the `id` is not found or a range is not found for the `id`, it returns the default value of the attribute's type.
+- If there are overlapping ranges and `range_lookup_strategy=min`, it returns a matching range with the minimal `range_min`; if several ranges are found, it returns the range with the minimal `range_max`; if again several ranges are found (several ranges had the same `range_min` and `range_max`), it returns a random one of them.
+- If there are overlapping ranges and `range_lookup_strategy=max`, it returns a matching range with the maximal `range_min`; if several ranges are found, it returns the range with the maximal `range_max`; if again several ranges are found (several ranges had the same `range_min` and `range_max`), it returns a random one of them.
+- If the `range_max` is `NULL`, the range is open. `NULL` is treated as the maximal possible value. For the `range_min`, `1970-01-01` or `0` (-MAX_INT) can be used as the open value.

 Configuration example:
@@ -407,6 +410,108 @@ PRIMARY KEY Abcdef
 RANGE(MIN StartTimeStamp MAX EndTimeStamp)
 ```

+Configuration example with overlapping ranges and open ranges:
+
+```sql
+CREATE TABLE discounts
+(
+    advertiser_id UInt64,
+    discount_start_date Date,
+    discount_end_date Nullable(Date),
+    amount Float64
+)
+ENGINE = Memory;
+
+INSERT INTO discounts VALUES (1, '2015-01-01', Null, 0.1);
+INSERT INTO discounts VALUES (1, '2015-01-15', Null, 0.2);
+INSERT INTO discounts VALUES (2, '2015-01-01', '2015-01-15', 0.3);
+INSERT INTO discounts VALUES (2, '2015-01-04', '2015-01-10', 0.4);
+INSERT INTO discounts VALUES (3, '1970-01-01', '2015-01-15', 0.5);
+INSERT INTO discounts VALUES (3, '1970-01-01', '2015-01-10', 0.6);
+
+SELECT * FROM discounts ORDER BY advertiser_id, discount_start_date;
+┌─advertiser_id─┬─discount_start_date─┬─discount_end_date─┬─amount─┐
+│             1 │          2015-01-01 │              ᴺᵁᴸᴸ │    0.1 │
+│             1 │          2015-01-15 │              ᴺᵁᴸᴸ │    0.2 │
+│             2 │          2015-01-01 │        2015-01-15 │    0.3 │
+│             2 │          2015-01-04 │        2015-01-10 │    0.4 │
+│             3 │          1970-01-01 │        2015-01-15 │    0.5 │
+│             3 │          1970-01-01 │        2015-01-10 │    0.6 │
+└───────────────┴─────────────────────┴───────────────────┴────────┘
+
+-- RANGE_LOOKUP_STRATEGY 'max'
+
+CREATE DICTIONARY discounts_dict
+(
+    advertiser_id UInt64,
+    discount_start_date Date,
+    discount_end_date Nullable(Date),
+    amount Float64
+)
+PRIMARY KEY advertiser_id
+SOURCE(CLICKHOUSE(TABLE discounts))
+LIFETIME(MIN 600 MAX 900)
+LAYOUT(RANGE_HASHED(RANGE_LOOKUP_STRATEGY 'max'))
+RANGE(MIN discount_start_date MAX discount_end_date);
+
+select dictGet('discounts_dict', 'amount', 1, toDate('2015-01-14')) res;
+┌─res─┐
+│ 0.1 │ -- the only one range is matching: 2015-01-01 - Null
+└─────┘
+
+select dictGet('discounts_dict', 'amount', 1, toDate('2015-01-16')) res;
+┌─res─┐
+│ 0.2 │ -- two ranges are matching, range_min 2015-01-15 (0.2) is bigger than 2015-01-01 (0.1)
+└─────┘
+
+select dictGet('discounts_dict', 'amount', 2, toDate('2015-01-06')) res;
+┌─res─┐
+│ 0.4 │ -- two ranges are matching, range_min 2015-01-04 (0.4) is bigger than 2015-01-01 (0.3)
+└─────┘
+
+select dictGet('discounts_dict', 'amount', 3, toDate('2015-01-01')) res;
+┌─res─┐
+│ 0.5 │ -- two ranges are matching, range_min are equal, 2015-01-15 (0.5) is bigger than 2015-01-10 (0.6)
+└─────┘
+
+DROP DICTIONARY discounts_dict;
+
+-- RANGE_LOOKUP_STRATEGY 'min'
+
+CREATE DICTIONARY discounts_dict
+(
+    advertiser_id UInt64,
+    discount_start_date Date,
+    discount_end_date Nullable(Date),
+    amount Float64
+)
+PRIMARY KEY advertiser_id
+SOURCE(CLICKHOUSE(TABLE discounts))
+LIFETIME(MIN 600 MAX 900)
+LAYOUT(RANGE_HASHED(RANGE_LOOKUP_STRATEGY 'min'))
+RANGE(MIN discount_start_date MAX discount_end_date);
+
+select dictGet('discounts_dict', 'amount', 1, toDate('2015-01-14')) res;
+┌─res─┐
+│ 0.1 │ -- the only one range is matching: 2015-01-01 - Null
+└─────┘
+
+select dictGet('discounts_dict', 'amount', 1, toDate('2015-01-16')) res;
+┌─res─┐
+│ 0.1 │ -- two ranges are matching, range_min 2015-01-01 (0.1) is less than 2015-01-15 (0.2)
+└─────┘
+
+select dictGet('discounts_dict', 'amount', 2, toDate('2015-01-06')) res;
+┌─res─┐
+│ 0.3 │ -- two ranges are matching, range_min 2015-01-01 (0.3) is less than 2015-01-04 (0.4)
+└─────┘
+
+select dictGet('discounts_dict', 'amount', 3, toDate('2015-01-01')) res;
+┌─res─┐
+│ 0.6 │ -- two ranges are matching, range_min are equal, 2015-01-10 (0.6) is less than 2015-01-15 (0.5)
+└─────┘
+```

 ### complex_key_range_hashed

 The dictionary is stored in memory in the form of a hash table with an ordered array of ranges and their corresponding values (see [range_hashed](#range-hashed)). This type of storage is for use with composite [keys](../../../sql-reference/dictionaries/external-dictionaries/external-dicts-dict-structure.md).
@@ -209,10 +209,25 @@ Aliases: `DAYOFMONTH`, `DAY`.

 ## toDayOfWeek

-Converts a date or date with time to a UInt8 number containing the number of the day of the week (Monday is 1, and Sunday is 7).
+Converts a date or date with time to a UInt8 number containing the number of the day of the week.
+
+The two-argument form of `toDayOfWeek()` enables you to specify whether the week starts on Monday or Sunday, and whether the return value should be in the range from 0 to 6 or 1 to 7. If the mode argument is omitted, the default mode is 0. The time zone of the date can be specified as the third argument.
+
+| Mode | First day of week | Range                                          |
+|------|-------------------|------------------------------------------------|
+| 0    | Monday            | 1-7: Monday = 1, Tuesday = 2, ..., Sunday = 7  |
+| 1    | Monday            | 0-6: Monday = 0, Tuesday = 1, ..., Sunday = 6  |
+| 2    | Sunday            | 0-6: Sunday = 0, Monday = 1, ..., Saturday = 6 |
+| 3    | Sunday            | 1-7: Sunday = 1, Monday = 2, ..., Saturday = 7 |

 Alias: `DAYOFWEEK`.

+**Syntax**
+
+``` sql
+toDayOfWeek(t[, mode[, timezone]])
+```
+
 ## toHour

 Converts a date with time to a UInt8 number containing the number of the hour in 24-hour time (0-23).
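A worked example for the `toDayOfWeek` modes in the hunk above (2016-12-27 is a Tuesday):

```sql
SELECT
    toDayOfWeek(toDate('2016-12-27')),     -- 2 (mode 0: Monday = 1)
    toDayOfWeek(toDate('2016-12-27'), 1);  -- 1 (mode 1: Monday = 0)
```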
@@ -316,11 +331,17 @@ If `toLastDayOfMonth` is called with an argument of type `Date` greater then 214
 Rounds down a date, or date with time, to the nearest Monday.
 Returns the date.

-## toStartOfWeek(t\[,mode\])
+## toStartOfWeek

-Rounds down a date, or date with time, to the nearest Sunday or Monday by mode.
+Rounds a date or date with time down to the nearest Sunday or Monday.
 Returns the date.
-The mode argument works exactly like the mode argument to toWeek(). For the single-argument syntax, a mode value of 0 is used.
+The mode argument works exactly like the mode argument in function `toWeek()`. If no mode is specified, mode 0 is assumed.
+
+**Syntax**
+
+``` sql
+toStartOfWeek(t[, mode[, timezone]])
+```

 ## toStartOfDay
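A worked example for the `toStartOfWeek` change above (2016-12-27 is a Tuesday; per the `toWeek()` mode table, mode 0 weeks start on Sunday and mode 1 weeks on Monday):

```sql
SELECT
    toStartOfWeek(toDate('2016-12-27')),     -- 2016-12-25 (the preceding Sunday)
    toStartOfWeek(toDate('2016-12-27'), 1);  -- 2016-12-26 (the preceding Monday)
```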
@@ -455,10 +476,12 @@ Converts a date, or date with time, to a UInt16 number containing the ISO Year n

 Converts a date, or date with time, to a UInt8 number containing the ISO Week number.

-## toWeek(date\[,mode\])
+## toWeek

-This function returns the week number for date or datetime. The two-argument form of toWeek() enables you to specify whether the week starts on Sunday or Monday and whether the return value should be in the range from 0 to 53 or from 1 to 53. If the mode argument is omitted, the default mode is 0.
-`toISOWeek()`is a compatibility function that is equivalent to `toWeek(date,3)`.
+This function returns the week number for date or datetime. The two-argument form of `toWeek()` enables you to specify whether the week starts on Sunday or Monday and whether the return value should be in the range from 0 to 53 or from 1 to 53. If the mode argument is omitted, the default mode is 0.
+
+`toISOWeek()` is a compatibility function that is equivalent to `toWeek(date,3)`.
+
 The following table describes how the mode argument works.

 | Mode | First day of week | Range | Week 1 is the first week … |
@@ -482,13 +505,15 @@ For mode values with a meaning of “with 4 or more days this year,” weeks are

 For mode values with a meaning of “contains January 1”, the week contains January 1 is week 1. It does not matter how many days in the new year the week contained, even if it contained only one day.

+**Syntax**
+
 ``` sql
-toWeek(date, [, mode][, Timezone])
+toWeek(t[, mode[, time_zone]])
 ```

 **Arguments**

-- `date` – Date or DateTime.
+- `t` – Date or DateTime.
 - `mode` – Optional parameter, Range of values is \[0,9\], default is 0.
 - `Timezone` – Optional parameter, it behaves like any other conversion function.
@@ -504,13 +529,19 @@ SELECT toDate('2016-12-27') AS date, toWeek(date) AS week0, toWeek(date,1) AS we
 └────────────┴───────┴───────┴───────┘
 ```

-## toYearWeek(date\[,mode\])
+## toYearWeek

 Returns year and week for a date. The year in the result may be different from the year in the date argument for the first and the last week of the year.

-The mode argument works exactly like the mode argument to toWeek(). For the single-argument syntax, a mode value of 0 is used.
+The mode argument works exactly like the mode argument to `toWeek()`. For the single-argument syntax, a mode value of 0 is used.

-`toISOYear()`is a compatibility function that is equivalent to `intDiv(toYearWeek(date,3),100)`.
+`toISOYear()` is a compatibility function that is equivalent to `intDiv(toYearWeek(date,3),100)`.
+
+**Syntax**
+
+``` sql
+toYearWeek(t[, mode[, timezone]])
+```

 **Example**
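A worked sketch for the new `toYearWeek` syntax above, reusing the date from this file's `toWeek` example (expected values follow the mode semantics described earlier):

```sql
SELECT
    toDate('2016-12-27') AS date,
    toYearWeek(date) AS yearWeek0,     -- 201652
    toYearWeek(date, 1) AS yearWeek1,  -- 201652
    toYearWeek(date, 9) AS yearWeek9;  -- 201701 (the week already counts toward 2017)
```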
@@ -23,7 +23,7 @@ When `OPTIMIZE` is used with the [ReplicatedMergeTree](../../engines/table-engin

 - If `OPTIMIZE` does not perform a merge for any reason, it does not notify the client. To enable notifications, use the [optimize_throw_if_noop](../../operations/settings/settings.md#setting-optimize_throw_if_noop) setting.
 - If you specify a `PARTITION`, only the specified partition is optimized. [How to set partition expression](alter/partition.md#how-to-set-partition-expression).
-- If you specify `FINAL`, optimization is performed even when all the data is already in one part. Also merge is forced even if concurrent merges are performed.
+- If you specify `FINAL`, optimization is performed even when all the data is already in one part. You can control this behaviour with [optimize_skip_merged_partitions](../../operations/settings/settings.md#optimize-skip-merged-partitions). Also, the merge is forced even if concurrent merges are performed.
 - If you specify `DEDUPLICATE`, then completely identical rows (unless by-clause is specified) will be deduplicated (all columns are compared), it makes sense only for the MergeTree engine.

 You can specify how long (in seconds) to wait for inactive replicas to execute `OPTIMIZE` queries by the [replication_wait_for_inactive_replica_timeout](../../operations/settings/settings.md#replication-wait-for-inactive-replica-timeout) setting.
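The cross-reference added in this hunk pairs naturally with a concrete call; a hedged sketch (table name hypothetical):

```sql
OPTIMIZE TABLE my_table FINAL SETTINGS optimize_skip_merged_partitions = 1;
```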
@@ -1997,6 +1997,21 @@ SELECT * FROM test_table

 Default value: 0.

+## optimize_skip_merged_partitions {#optimize-skip-merged-partitions}
+
+Enables or disables the optimization for the [OPTIMIZE TABLE ... FINAL](../../sql-reference/statements/optimize.md) query when there is only one part with level > 0 and a TTL that has not expired.
+
+- `OPTIMIZE TABLE ... FINAL SETTINGS optimize_skip_merged_partitions=1`
+
+By default, `OPTIMIZE TABLE ... FINAL` rewrites the part even if it is the only one.
+
+Possible values:
+
+- 1 - Enabled.
+- 0 - Disabled.
+
+Default value: 0.
+
 ## optimize_functions_to_subcolumns {#optimize-functions-to-subcolumns}

 Enables or disables optimization by transforming some functions to reading subcolumns, thus reducing the amount of data to read.
@@ -24,7 +24,7 @@ OPTIMIZE TABLE [db.]name [ON CLUSTER cluster] [PARTITION partition | PARTITION I

 - By default, if the `OPTIMIZE` query fails to perform a merge, then
 ClickHouse does not notify the client. To enable notifications, use the [optimize_throw_if_noop](../../operations/settings/settings.md#setting-optimize_throw_if_noop) setting.
 - If you specify `PARTITION`, only the specified partition is optimized. [How to specify the partition expression](alter/index.md#alter-how-to-specify-part-expr).
-- If you specify `FINAL`, optimization is performed even when all the data is already in one data part. Moreover, the merge is forced even if concurrent merges are in progress.
+- If you specify `FINAL`, optimization is performed even when all the data is already in one data part. This can be controlled with the [optimize_skip_merged_partitions](../../operations/settings/settings.md#optimize-skip-merged-partitions) setting. Moreover, the merge is forced even if concurrent merges are in progress.
 - If you specify `DEDUPLICATE`, completely identical rows are collapsed (values in all columns are compared); this makes sense only for the MergeTree engine.

 You can specify how long (in seconds) to wait for inactive replicas to execute `OPTIMIZE` queries using the [replication_wait_for_inactive_replica_timeout](../../operations/settings/settings.md#replication-wait-for-inactive-replica-timeout) setting.
@@ -196,4 +196,5 @@ SELECT * FROM example;
 ┌─primary_key─┬─secondary_key─┬─value─┬─partition_key─┐
 │           1 │             1 │     2 │             3 │
 └─────────────┴───────────────┴───────┴───────────────┘
 ```
+```
@@ -277,7 +277,7 @@ private:
     }

     if (queries.empty())
-        throw Exception("Empty list of queries.", ErrorCodes::EMPTY_DATA_PASSED);
+        throw Exception(ErrorCodes::EMPTY_DATA_PASSED, "Empty list of queries.");
 }
 else
 {
@@ -724,7 +724,7 @@ bool Client::processWithFuzzing(const String & full_query)
             // uniformity.
             // Surprisingly, this is a client exception, because we get the
             // server exception w/o throwing (see onReceiveException()).
-            client_exception = std::make_unique<Exception>(getCurrentExceptionMessage(print_stack_trace), getCurrentExceptionCode());
+            client_exception = std::make_unique<Exception>(getCurrentExceptionMessageAndPattern(print_stack_trace), getCurrentExceptionCode());
             have_error = true;
         }
@@ -859,7 +859,7 @@ bool Client::processWithFuzzing(const String & full_query)
         }
         catch (...)
         {
-            client_exception = std::make_unique<Exception>(getCurrentExceptionMessage(print_stack_trace), getCurrentExceptionCode());
+            client_exception = std::make_unique<Exception>(getCurrentExceptionMessageAndPattern(print_stack_trace), getCurrentExceptionCode());
             have_error = true;
         }
@@ -165,9 +165,8 @@ int mainEntryClickHouseFormat(int argc, char ** argv)
             /// should throw exception early and make exception message more readable.
             if (const auto * insert_query = res->as<ASTInsertQuery>(); insert_query && insert_query->data)
             {
-                throw Exception(
-                    "Can't format ASTInsertQuery with data, since data will be lost",
-                    DB::ErrorCodes::INVALID_FORMAT_INSERT_QUERY_WITH_DATA);
+                throw Exception(DB::ErrorCodes::INVALID_FORMAT_INSERT_QUERY_WITH_DATA,
+                    "Can't format ASTInsertQuery with data, since data will be lost");
             }
             if (!quiet)
             {
@@ -196,7 +196,7 @@ void Keeper::createServer(const std::string & listen_host, const char * port_nam
         }
         else
         {
-            throw Exception{message, ErrorCodes::NETWORK_ERROR};
+            throw Exception::createDeprecated(message, ErrorCodes::NETWORK_ERROR);
         }
     }
 }
@@ -375,7 +375,7 @@ try
         if (effective_user_id == 0)
         {
             message += " Run under 'sudo -u " + data_owner + "'.";
-            throw Exception(message, ErrorCodes::MISMATCHING_USERS_FOR_PROCESS_AND_DATA);
+            throw Exception::createDeprecated(message, ErrorCodes::MISMATCHING_USERS_FOR_PROCESS_AND_DATA);
         }
         else
         {
@@ -243,7 +243,6 @@ ColumnFloat64::MutablePtr CatBoostLibraryHandler::evalImpl(
     const ColumnRawPtrs & columns,
     bool cat_features_are_strings) const
 {
-    std::string error_msg = "Error occurred while applying CatBoost model: ";
     size_t column_size = columns.front()->size();

     auto result = ColumnFloat64::create(column_size * tree_count);
@@ -265,7 +264,8 @@ ColumnFloat64::MutablePtr CatBoostLibraryHandler::evalImpl(
             result_buf, column_size * tree_count))
     {
-        throw Exception(error_msg + api.GetErrorString(), ErrorCodes::CANNOT_APPLY_CATBOOST_MODEL);
+        throw Exception(ErrorCodes::CANNOT_APPLY_CATBOOST_MODEL,
+            "Error occurred while applying CatBoost model: {}", api.GetErrorString());
     }
     return result;
 }
@@ -288,7 +288,8 @@ ColumnFloat64::MutablePtr CatBoostLibraryHandler::evalImpl(
             cat_features_buf, cat_features_count,
             result_buf, column_size * tree_count))
     {
-        throw Exception(error_msg + api.GetErrorString(), ErrorCodes::CANNOT_APPLY_CATBOOST_MODEL);
+        throw Exception(ErrorCodes::CANNOT_APPLY_CATBOOST_MODEL,
+            "Error occurred while applying CatBoost model: {}", api.GetErrorString());
     }
 }
 else
@@ -304,7 +305,8 @@ ColumnFloat64::MutablePtr CatBoostLibraryHandler::evalImpl(
             cat_features_buf, cat_features_count,
             result_buf, column_size * tree_count))
     {
-        throw Exception(error_msg + api.GetErrorString(), ErrorCodes::CANNOT_APPLY_CATBOOST_MODEL);
+        throw Exception(ErrorCodes::CANNOT_APPLY_CATBOOST_MODEL,
+            "Error occurred while applying CatBoost model: {}", api.GetErrorString());
     }
 }
@@ -416,7 +416,7 @@ void Server::createServer(
         }
         else
         {
-            throw Exception{message, ErrorCodes::NETWORK_ERROR};
+            throw Exception::createDeprecated(message, ErrorCodes::NETWORK_ERROR);
         }
     }
 }
@@ -946,7 +946,7 @@ try
         if (effective_user_id == 0)
         {
             message += " Run under 'sudo -u " + data_owner + "'.";
-            throw Exception(message, ErrorCodes::MISMATCHING_USERS_FOR_PROCESS_AND_DATA);
+            throw Exception::createDeprecated(message, ErrorCodes::MISMATCHING_USERS_FOR_PROCESS_AND_DATA);
         }
         else
         {
@@ -334,6 +334,19 @@

     <max_thread_pool_size>10000</max_thread_pool_size>

+    <!-- Configure other thread pools: -->
+    <!--
+    <background_buffer_flush_schedule_pool_size>16</background_buffer_flush_schedule_pool_size>
+    <background_pool_size>16</background_pool_size>
+    <background_merges_mutations_concurrency_ratio>2</background_merges_mutations_concurrency_ratio>
+    <background_move_pool_size>8</background_move_pool_size>
+    <background_fetches_pool_size>8</background_fetches_pool_size>
+    <background_common_pool_size>8</background_common_pool_size>
+    <background_schedule_pool_size>128</background_schedule_pool_size>
+    <background_message_broker_schedule_pool_size>16</background_message_broker_schedule_pool_size>
+    <background_distributed_schedule_pool_size>16</background_distributed_schedule_pool_size>
+    -->
+
     <!-- Number of workers to recycle connections in background (see also drain_timeout).
          If the pool is full, connection will be drained synchronously. -->
     <!-- <max_threads_for_connection_collector>10</max_threads_for_connection_collector> -->
@@ -79,9 +79,7 @@ AuthenticationData::Digest AuthenticationData::Util::encodeSHA256(std::string_vi
     ::DB::encodeSHA256(text, hash.data());
     return hash;
 #else
-    throw DB::Exception(
-        "SHA256 passwords support is disabled, because ClickHouse was built without SSL library",
-        DB::ErrorCodes::SUPPORT_IS_DISABLED);
+    throw DB::Exception(DB::ErrorCodes::SUPPORT_IS_DISABLED, "SHA256 passwords support is disabled, because ClickHouse was built without SSL library");
 #endif
 }
@@ -484,13 +484,15 @@ bool ContextAccess::checkAccessImplHelper(AccessFlags flags, const Args &... arg
         return true;
     };

-    auto access_denied = [&](const String & error_msg, int error_code [[maybe_unused]])
+    auto access_denied = [&]<typename... FmtArgs>(int error_code [[maybe_unused]],
+        FormatStringHelper<String, FmtArgs...> fmt_string [[maybe_unused]],
+        FmtArgs && ...fmt_args [[maybe_unused]])
     {
         if (trace_log)
             LOG_TRACE(trace_log, "Access denied: {}{}", (AccessRightsElement{flags, args...}.toStringWithoutOptions()),
                 (grant_option ? " WITH GRANT OPTION" : ""));
         if constexpr (throw_if_denied)
-            throw Exception(getUserName() + ": " + error_msg, error_code);
+            throw Exception(error_code, std::move(fmt_string), getUserName(), std::forward<FmtArgs>(fmt_args)...);
         return false;
     };
||||
@ -519,18 +521,16 @@ bool ContextAccess::checkAccessImplHelper(AccessFlags flags, const Args &... arg
|
||||
{
|
||||
if (grant_option && acs->isGranted(flags, args...))
|
||||
{
|
||||
return access_denied(
|
||||
"Not enough privileges. "
|
||||
return access_denied(ErrorCodes::ACCESS_DENIED,
|
||||
"{}: Not enough privileges. "
|
||||
"The required privileges have been granted, but without grant option. "
|
||||
"To execute this query it's necessary to have grant "
|
||||
+ AccessRightsElement{flags, args...}.toStringWithoutOptions() + " WITH GRANT OPTION",
|
||||
ErrorCodes::ACCESS_DENIED);
|
||||
"To execute this query it's necessary to have grant {} WITH GRANT OPTION",
|
||||
AccessRightsElement{flags, args...}.toStringWithoutOptions());
|
||||
}
|
||||
|
||||
return access_denied(
|
||||
"Not enough privileges. To execute this query it's necessary to have grant "
|
||||
+ AccessRightsElement{flags, args...}.toStringWithoutOptions() + (grant_option ? " WITH GRANT OPTION" : ""),
|
||||
ErrorCodes::ACCESS_DENIED);
|
||||
return access_denied(ErrorCodes::ACCESS_DENIED,
|
||||
"{}: Not enough privileges. To execute this query it's necessary to have grant {}",
|
||||
AccessRightsElement{flags, args...}.toStringWithoutOptions() + (grant_option ? " WITH GRANT OPTION" : ""));
|
||||
}
|
||||
|
||||
struct PrecalculatedFlags
|
||||
@@ -557,32 +557,34 @@ bool ContextAccess::checkAccessImplHelper(AccessFlags flags, const Args &... arg
     if (params.readonly)
     {
         if constexpr (grant_option)
-            return access_denied("Cannot change grants in readonly mode.", ErrorCodes::READONLY);
+            return access_denied(ErrorCodes::READONLY, "{}: Cannot change grants in readonly mode.");
         if ((flags & precalc.not_readonly_flags) ||
             ((params.readonly == 1) && (flags & precalc.not_readonly_1_flags)))
         {
             if (params.interface == ClientInfo::Interface::HTTP && params.http_method == ClientInfo::HTTPMethod::GET)
             {
-                return access_denied(
-                    "Cannot execute query in readonly mode. "
-                    "For queries over HTTP, method GET implies readonly. You should use method POST for modifying queries",
-                    ErrorCodes::READONLY);
+                return access_denied(ErrorCodes::READONLY,
+                    "{}: Cannot execute query in readonly mode. "
+                    "For queries over HTTP, method GET implies readonly. "
+                    "You should use method POST for modifying queries");
             }
             else
-                return access_denied("Cannot execute query in readonly mode", ErrorCodes::READONLY);
+                return access_denied(ErrorCodes::READONLY, "{}: Cannot execute query in readonly mode");
         }
     }

     if (!params.allow_ddl && !grant_option)
     {
         if (flags & precalc.ddl_flags)
-            return access_denied("Cannot execute query. DDL queries are prohibited for the user", ErrorCodes::QUERY_IS_PROHIBITED);
+            return access_denied(ErrorCodes::QUERY_IS_PROHIBITED,
+                "Cannot execute query. DDL queries are prohibited for the user {}");
     }

     if (!params.allow_introspection && !grant_option)
     {
         if (flags & precalc.introspection_flags)
-            return access_denied("Introspection functions are disabled, because setting 'allow_introspection_functions' is set to 0", ErrorCodes::FUNCTION_NOT_ALLOWED);
+            return access_denied(ErrorCodes::FUNCTION_NOT_ALLOWED, "{}: Introspection functions are disabled, "
+                "because setting 'allow_introspection_functions' is set to 0");
     }

     return access_granted();
@@ -679,11 +681,13 @@ void ContextAccess::checkGrantOption(const AccessRightsElements & elements) cons
 template <bool throw_if_denied, typename Container, typename GetNameFunction>
 bool ContextAccess::checkAdminOptionImplHelper(const Container & role_ids, const GetNameFunction & get_name_function) const
 {
-    auto show_error = [this](const String & msg, int error_code [[maybe_unused]])
+    auto show_error = []<typename... FmtArgs>(int error_code [[maybe_unused]],
+        FormatStringHelper<FmtArgs...> fmt_string [[maybe_unused]],
+        FmtArgs && ...fmt_args [[maybe_unused]])
     {
-        UNUSED(this);
         if constexpr (throw_if_denied)
-            throw Exception(getUserName() + ": " + msg, error_code);
+            throw Exception(error_code, std::move(fmt_string), std::forward<FmtArgs>(fmt_args)...);
         return false;
     };

     if (is_full_access)
@@ -691,7 +695,7 @@ bool ContextAccess::checkAdminOptionImplHelper(const Container & role_ids, const

     if (user_was_dropped)
     {
-        show_error("User has been dropped", ErrorCodes::UNKNOWN_USER);
+        show_error(ErrorCodes::UNKNOWN_USER, "User has been dropped");
         return false;
     }

@@ -716,14 +720,15 @@ bool ContextAccess::checkAdminOptionImplHelper(const Container & role_ids, const
             role_name = "ID {" + toString(role_id) + "}";

         if (info->enabled_roles.count(role_id))
-            show_error("Not enough privileges. "
-                       "Role " + backQuote(*role_name) + " is granted, but without ADMIN option. "
-                       "To execute this query it's necessary to have the role " + backQuoteIfNeed(*role_name) + " granted with ADMIN option.",
-                       ErrorCodes::ACCESS_DENIED);
+            show_error(ErrorCodes::ACCESS_DENIED,
+                       "Not enough privileges. "
+                       "Role {} is granted, but without ADMIN option. "
+                       "To execute this query it's necessary to have the role {} granted with ADMIN option.",
+                       backQuote(*role_name), backQuoteIfNeed(*role_name));
         else
-            show_error("Not enough privileges. "
-                       "To execute this query it's necessary to have the role " + backQuoteIfNeed(*role_name) + " granted with ADMIN option.",
-                       ErrorCodes::ACCESS_DENIED);
+            show_error(ErrorCodes::ACCESS_DENIED, "Not enough privileges. "
+                       "To execute this query it's necessary to have the role {} granted with ADMIN option.",
+                       backQuoteIfNeed(*role_name));
     }

     return false;
@@ -81,7 +81,7 @@ void KerberosInit::init(const String & keytab_file, const String & principal, co
{
ret = krb5_cc_resolve(k5.ctx, cache_name.c_str(), &k5.out_cc);
if (ret)
throw Exception(ErrorCodes::KERBEROS_ERROR, "Error in resolving cache{}", fmtError(ret));
throw Exception(ErrorCodes::KERBEROS_ERROR, "Error in resolving cache: {}", fmtError(ret));
LOG_TRACE(log,"Resolved cache");
}
else
@@ -89,7 +89,7 @@ void KerberosInit::init(const String & keytab_file, const String & principal, co
// Resolve the default cache and get its type and default principal (if it is initialized).
ret = krb5_cc_default(k5.ctx, &defcache);
if (ret)
throw Exception(ErrorCodes::KERBEROS_ERROR, "Error while getting default cache{}", fmtError(ret));
throw Exception(ErrorCodes::KERBEROS_ERROR, "Error while getting default cache: {}", fmtError(ret));
LOG_TRACE(log,"Resolved default cache");
deftype = krb5_cc_get_type(k5.ctx, defcache);
if (krb5_cc_get_principal(k5.ctx, defcache, &defcache_princ) != 0)
@@ -99,7 +99,7 @@ void KerberosInit::init(const String & keytab_file, const String & principal, co
// Use the specified principal name.
ret = krb5_parse_name_flags(k5.ctx, principal.c_str(), 0, &k5.me);
if (ret)
throw Exception(ErrorCodes::KERBEROS_ERROR, "Error when parsing principal name {}", principal + fmtError(ret));
throw Exception(ErrorCodes::KERBEROS_ERROR, "Error when parsing principal name ({}): {}", principal, fmtError(ret));

// Cache related commands
if (k5.out_cc == nullptr && krb5_cc_support_switch(k5.ctx, deftype))
@@ -107,7 +107,7 @@ void KerberosInit::init(const String & keytab_file, const String & principal, co
// Use an existing cache for the client principal if we can.
ret = krb5_cc_cache_match(k5.ctx, k5.me, &k5.out_cc);
if (ret && ret != KRB5_CC_NOTFOUND)
throw Exception(ErrorCodes::KERBEROS_ERROR, "Error while searching for cache for {}", principal + fmtError(ret));
throw Exception(ErrorCodes::KERBEROS_ERROR, "Error while searching for cache for ({}): {}", principal, fmtError(ret));
if (0 == ret)
{
LOG_TRACE(log,"Using default cache: {}", krb5_cc_get_name(k5.ctx, k5.out_cc));
@@ -118,7 +118,7 @@ void KerberosInit::init(const String & keytab_file, const String & principal, co
// Create a new cache to avoid overwriting the initialized default cache.
ret = krb5_cc_new_unique(k5.ctx, deftype, nullptr, &k5.out_cc);
if (ret)
throw Exception(ErrorCodes::KERBEROS_ERROR, "Error while generating new cache{}", fmtError(ret));
throw Exception(ErrorCodes::KERBEROS_ERROR, "Error while generating new cache: {}", fmtError(ret));
LOG_TRACE(log,"Using default cache: {}", krb5_cc_get_name(k5.ctx, k5.out_cc));
k5.switch_to_cache = 1;
}
@@ -134,24 +134,24 @@ void KerberosInit::init(const String & keytab_file, const String & principal, co

ret = krb5_unparse_name(k5.ctx, k5.me, &k5.name);
if (ret)
throw Exception(ErrorCodes::KERBEROS_ERROR, "Error when unparsing name{}", fmtError(ret));
throw Exception(ErrorCodes::KERBEROS_ERROR, "Error when unparsing name: {}", fmtError(ret));
LOG_TRACE(log,"Using principal: {}", k5.name);

// Allocate a new initial credential options structure.
ret = krb5_get_init_creds_opt_alloc(k5.ctx, &options);
if (ret)
throw Exception(ErrorCodes::KERBEROS_ERROR, "Error in options allocation{}", fmtError(ret));
throw Exception(ErrorCodes::KERBEROS_ERROR, "Error in options allocation: {}", fmtError(ret));

// Resolve keytab
ret = krb5_kt_resolve(k5.ctx, keytab_file.c_str(), &keytab);
if (ret)
throw Exception(ErrorCodes::KERBEROS_ERROR, "Error in resolving keytab {}{}", keytab_file, fmtError(ret));
throw Exception(ErrorCodes::KERBEROS_ERROR, "Error in resolving keytab ({}): {}", keytab_file, fmtError(ret));
LOG_TRACE(log,"Using keytab: {}", keytab_file);

// Set an output credential cache in initial credential options.
ret = krb5_get_init_creds_opt_set_out_ccache(k5.ctx, options, k5.out_cc);
if (ret)
throw Exception(ErrorCodes::KERBEROS_ERROR, "Error in setting output credential cache{}", fmtError(ret));
throw Exception(ErrorCodes::KERBEROS_ERROR, "Error in setting output credential cache: {}", fmtError(ret));

// Action: init or renew
LOG_TRACE(log,"Trying to renew credentials");
@@ -165,7 +165,7 @@ void KerberosInit::init(const String & keytab_file, const String & principal, co
// Request KDC for an initial credentials using keytab.
ret = krb5_get_init_creds_keytab(k5.ctx, &my_creds, k5.me, keytab, 0, nullptr, options);
if (ret)
throw Exception(ErrorCodes::KERBEROS_ERROR, "Error in getting initial credentials{}", fmtError(ret));
throw Exception(ErrorCodes::KERBEROS_ERROR, "Error in getting initial credentials: {}", fmtError(ret));
else
LOG_TRACE(log,"Got initial credentials");
}
@@ -175,7 +175,7 @@ void KerberosInit::init(const String & keytab_file, const String & principal, co
// Initialize a credential cache. Destroy any existing contents of cache and initialize it for the default principal.
ret = krb5_cc_initialize(k5.ctx, k5.out_cc, k5.me);
if (ret)
throw Exception(ErrorCodes::KERBEROS_ERROR, "Error when initializing cache{}", fmtError(ret));
throw Exception(ErrorCodes::KERBEROS_ERROR, "Error when initializing cache: {}", fmtError(ret));
LOG_TRACE(log,"Initialized cache");
// Store credentials in a credential cache.
ret = krb5_cc_store_cred(k5.ctx, k5.out_cc, &my_creds);
@@ -189,7 +189,7 @@ void KerberosInit::init(const String & keytab_file, const String & principal, co
// Make a credential cache the primary cache for its collection.
ret = krb5_cc_switch(k5.ctx, k5.out_cc);
if (ret)
throw Exception(ErrorCodes::KERBEROS_ERROR, "Error while switching to new cache{}", fmtError(ret));
throw Exception(ErrorCodes::KERBEROS_ERROR, "Error while switching to new cache: {}", fmtError(ret));
}

LOG_TRACE(log,"Authenticated to Kerberos v5");
@@ -205,7 +205,7 @@ void LDAPClient::handleError(int result_code, String text)
}
}

throw Exception(text, ErrorCodes::LDAP_ERROR);
throw Exception::createDeprecated(text, ErrorCodes::LDAP_ERROR);
}
}

@@ -569,7 +569,7 @@ LDAPClient::SearchResults LDAPClient::search(const SearchParams & search_params)
message += matched_msg;
}

throw Exception(message, ErrorCodes::LDAP_ERROR);
throw Exception::createDeprecated(message, ErrorCodes::LDAP_ERROR);
}

break;
@@ -266,7 +266,7 @@ bool SettingsConstraints::Checker::check(SettingChange & change, const Field & n
if (!explain.empty())
{
if (reaction == THROW_ON_VIOLATION)
throw Exception(explain, code);
throw Exception::createDeprecated(explain, code);
else
return false;
}
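Note: the hunks above and below all apply one of two mechanical rewrites. Messages known at compile time move to the variadic constructor that takes the error code first and a `{}`-placeholder format string; messages that only exist as runtime strings (LDAP diagnostics, constraint explanations, S3 error text) go through the explicit `Exception::createDeprecated(message, code)` factory. A minimal self-contained sketch of the two call shapes, using a toy exception type (`ToyException` and the error code value are illustrative, not ClickHouse's API):

```cpp
#include <format>
#include <iostream>
#include <stdexcept>
#include <string>
#include <utility>

/// Toy stand-in illustrating the two call shapes used throughout this commit.
struct ToyException : std::runtime_error
{
    int code;

    /// New preferred form: code first, compile-time-checked format string, then arguments.
    template <typename... Args>
    ToyException(int code_, std::format_string<Args...> fmt, Args &&... args)
        : std::runtime_error(std::format(fmt, std::forward<Args>(args)...)), code(code_)
    {
    }

    /// Escape hatch for messages that only exist as runtime strings,
    /// mirroring the role of Exception::createDeprecated(message, code).
    static ToyException createDeprecated(const std::string & message, int code_)
    {
        return ToyException(code_, "{}", message);
    }
};

int main()
{
    try
    {
        throw ToyException(497, "Role {} is granted, but without ADMIN option.", "accountant");
    }
    catch (const ToyException & e)
    {
        std::cout << e.code << ": " << e.what() << '\n';
    }
}
```

The naming suggests the intent of the split: the first form lets the compiler validate the format string against its arguments, while the second keeps the unchecked runtime-string path visible and greppable.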
@@ -106,7 +106,7 @@ public:
default:
throw Exception(
ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT,
"Map key type " + key_type->getName() + " is not is not supported by combinator " + getName());
"Map key type {} is not is not supported by combinator {}", key_type->getName(), getName());
}
}
else
@@ -66,13 +66,13 @@ public:
, kind(kind_)
{
if (!isNativeNumber(arguments[0]))
throw Exception{getName() + ": first argument must be represented by integer", ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT};
throw Exception(ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT, "{}: first argument must be represented by integer", getName());

if (!isNativeNumber(arguments[1]))
throw Exception{getName() + ": second argument must be represented by integer", ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT};
throw Exception(ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT, "{}: second argument must be represented by integer", getName());

if (!arguments[0]->equals(*arguments[1]))
throw Exception{getName() + ": arguments must have the same type", ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT};
throw Exception(ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT, "{}: arguments must have the same type", getName());
}

String getName() const override
@@ -88,9 +88,9 @@ createAggregateFunctionSequenceNode(const std::string & name, const DataTypes &
name, toString(min_required_args + 1));

if (argument_types.size() > max_events_size + min_required_args)
throw Exception(fmt::format(
throw Exception(ErrorCodes::NUMBER_OF_ARGUMENTS_DOESNT_MATCH,
"Aggregate function '{}' requires at most {} (timestamp, value_column, ...{} events) arguments.",
name, max_events_size + min_required_args, max_events_size), ErrorCodes::NUMBER_OF_ARGUMENTS_DOESNT_MATCH);
name, max_events_size + min_required_args, max_events_size);

if (const auto * cond_arg = argument_types[2].get(); cond_arg && !isUInt8(cond_arg))
throw Exception(ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT, "Illegal type {} of third argument of aggregate function {}, "
@@ -100,9 +100,8 @@ createAggregateFunctionSequenceNode(const std::string & name, const DataTypes &
{
const auto * cond_arg = argument_types[i].get();
if (!isUInt8(cond_arg))
throw Exception(fmt::format(
"Illegal type '{}' of {} argument of aggregate function '{}', must be UInt8", cond_arg->getName(), i + 1, name),
ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT);
throw Exception(ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT,
"Illegal type '{}' of {} argument of aggregate function '{}', must be UInt8", cond_arg->getName(), i + 1, name);
}

if (WhichDataType(argument_types[1].get()).idx != TypeIndex::String)
@@ -235,7 +235,7 @@ private:
if (skip_degree_ == skip_degree)
return;
if (skip_degree_ > detail::MAX_SKIP_DEGREE)
throw DB::Exception{"skip_degree exceeds maximum value", DB::ErrorCodes::MEMORY_LIMIT_EXCEEDED};
throw DB::Exception(DB::ErrorCodes::MEMORY_LIMIT_EXCEEDED, "skip_degree exceeds maximum value");
skip_degree = skip_degree_;
if (skip_degree == detail::MAX_SKIP_DEGREE)
skip_mask = static_cast<UInt32>(-1);
@@ -1,14 +1,15 @@
#pragma once

#include <optional>
#include <utility>
#include <Common/SettingsChanges.h>
#include <base/scope_guard.h>

#include <Common/Exception.h>
#include <Core/Settings.h>

#include <Analyzer/IQueryTreeNode.h>
#include <Analyzer/QueryNode.h>
#include <Analyzer/UnionNode.h>

#include <Interpreters/Context.h>

namespace DB
{
@@ -89,4 +90,134 @@ private:
template <typename Derived>
using ConstInDepthQueryTreeVisitor = InDepthQueryTreeVisitor<Derived, true /*const_visitor*/>;

/** Same as InDepthQueryTreeVisitor, but additionally keeps track of the current scope context.
 * This can be useful if your visitor has special logic that depends on the current scope context.
 */
template <typename Derived, bool const_visitor = false>
class InDepthQueryTreeVisitorWithContext
{
public:
using VisitQueryTreeNodeType = std::conditional_t<const_visitor, const QueryTreeNodePtr, QueryTreeNodePtr>;

explicit InDepthQueryTreeVisitorWithContext(ContextPtr context)
: current_context(std::move(context))
{}

/// Return true if visitor should traverse tree top to bottom, false otherwise
bool shouldTraverseTopToBottom() const
{
return true;
}

/// Return true if visitor should visit child, false otherwise
bool needChildVisit(VisitQueryTreeNodeType & parent [[maybe_unused]], VisitQueryTreeNodeType & child [[maybe_unused]])
{
return true;
}

const ContextPtr & getContext() const
{
return current_context;
}

const Settings & getSettings() const
{
return current_context->getSettingsRef();
}

void visit(VisitQueryTreeNodeType & query_tree_node)
{
auto current_scope_context_ptr = current_context;
SCOPE_EXIT(
current_context = std::move(current_scope_context_ptr);
);

if (auto * query_node = query_tree_node->template as<QueryNode>())
current_context = query_node->getContext();
else if (auto * union_node = query_tree_node->template as<UnionNode>())
current_context = union_node->getContext();

bool traverse_top_to_bottom = getDerived().shouldTraverseTopToBottom();
if (!traverse_top_to_bottom)
visitChildren(query_tree_node);

getDerived().visitImpl(query_tree_node);

if (traverse_top_to_bottom)
visitChildren(query_tree_node);
}
private:
Derived & getDerived()
{
return *static_cast<Derived *>(this);
}

const Derived & getDerived() const
{
return *static_cast<const Derived *>(this);
}

void visitChildren(VisitQueryTreeNodeType & expression)
{
for (auto & child : expression->getChildren())
{
if (!child)
continue;

bool need_visit_child = getDerived().needChildVisit(expression, child);

if (need_visit_child)
visit(child);
}
}

ContextPtr current_context;
};

template <typename Derived>
using ConstInDepthQueryTreeVisitorWithContext = InDepthQueryTreeVisitorWithContext<Derived, true /*const_visitor*/>;

/** Visitor that uses another visitor to visit a node only if the condition for visiting the node is true.
 * For example, your visitor needs to visit only query tree nodes or union nodes.
 *
 * Condition interface:
 * struct Condition
 * {
 *     bool operator()(VisitQueryTreeNodeType & node)
 *     {
 *         return shouldNestedVisitorVisitNode(node);
 *     }
 * }
 */
template <typename Visitor, typename Condition, bool const_visitor = false>
class InDepthQueryTreeConditionalVisitor : public InDepthQueryTreeVisitor<InDepthQueryTreeConditionalVisitor<Visitor, Condition, const_visitor>, const_visitor>
{
public:
using Base = InDepthQueryTreeVisitor<InDepthQueryTreeConditionalVisitor<Visitor, Condition, const_visitor>, const_visitor>;
using VisitQueryTreeNodeType = typename Base::VisitQueryTreeNodeType;

explicit InDepthQueryTreeConditionalVisitor(Visitor & visitor_, Condition & condition_)
: visitor(visitor_)
, condition(condition_)
{
}

bool shouldTraverseTopToBottom() const
{
return visitor.shouldTraverseTopToBottom();
}

void visitImpl(VisitQueryTreeNodeType & query_tree_node)
{
if (condition(query_tree_node))
visitor.visit(query_tree_node);
}

Visitor & visitor;
Condition & condition;
};

template <typename Visitor, typename Condition>
using ConstInDepthQueryTreeConditionalVisitor = InDepthQueryTreeConditionalVisitor<Visitor, Condition, true /*const_visitor*/>;

}
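The class added above is a CRTP base: `visit()` swaps `current_context` whenever it enters a query or union node, so a derived pass sees the `getContext()`/`getSettings()` of the scope it is currently inside, and traversal order is delegated to the derived class. A self-contained toy mirror of that dispatch, for readers unfamiliar with the pattern (`Node` and `PrintVisitor` are illustrative stand-ins, not ClickHouse types):

```cpp
#include <iostream>
#include <memory>
#include <vector>

// Toy tree node standing in for QueryTreeNodePtr.
struct Node { int value = 0; std::vector<std::shared_ptr<Node>> children; };

// CRTP base: walks the tree and calls visitImpl() on the derived class,
// asking the derived class whether to go top-to-bottom and which children to enter.
template <typename Derived>
class InDepthVisitor
{
public:
    void visit(std::shared_ptr<Node> & node)
    {
        bool top_to_bottom = derived().shouldTraverseTopToBottom();
        if (!top_to_bottom)
            visitChildren(node);
        derived().visitImpl(node);
        if (top_to_bottom)
            visitChildren(node);
    }

private:
    Derived & derived() { return *static_cast<Derived *>(this); }

    void visitChildren(std::shared_ptr<Node> & node)
    {
        for (auto & child : node->children)
            if (child && derived().needChildVisit(node, child))
                visit(child);
    }
};

struct PrintVisitor : InDepthVisitor<PrintVisitor>
{
    bool shouldTraverseTopToBottom() const { return true; }
    bool needChildVisit(std::shared_ptr<Node> &, std::shared_ptr<Node> &) { return true; }
    void visitImpl(std::shared_ptr<Node> & node) { std::cout << node->value << '\n'; }
};

int main()
{
    auto root = std::make_shared<Node>();
    root->value = 1;
    root->children = {std::make_shared<Node>(), std::make_shared<Node>()};
    root->children[0]->value = 2;
    root->children[1]->value = 3;

    PrintVisitor visitor;
    visitor.visit(root); // prints 1, 2, 3 (top-to-bottom order)
}
```

The passes rewritten in the hunks below follow exactly this shape: `using Base::Base;` inherits the `ContextPtr` constructor, and `visitImpl` begins with a check of the relevant setting.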
@@ -45,12 +45,11 @@ Field zeroField(const Field & value)
 * TODO: Support `groupBitAnd`, `groupBitOr`, `groupBitXor` functions.
 * TODO: Support rewrite `f((2 * n) * n)` into '2 * f(n * n)'.
 */
class AggregateFunctionsArithmericOperationsVisitor : public InDepthQueryTreeVisitor<AggregateFunctionsArithmericOperationsVisitor>
class AggregateFunctionsArithmericOperationsVisitor : public InDepthQueryTreeVisitorWithContext<AggregateFunctionsArithmericOperationsVisitor>
{
public:
explicit AggregateFunctionsArithmericOperationsVisitor(ContextPtr context_)
: context(std::move(context_))
{}
using Base = InDepthQueryTreeVisitorWithContext<AggregateFunctionsArithmericOperationsVisitor>;
using Base::Base;

/// Traverse tree bottom to top
static bool shouldTraverseTopToBottom()
@@ -60,6 +59,9 @@ public:

void visitImpl(QueryTreeNodePtr & node)
{
if (!getSettings().optimize_arithmetic_operations_in_aggregate_functions)
return;

auto * aggregate_function_node = node->as<FunctionNode>();
if (!aggregate_function_node || !aggregate_function_node->isAggregateFunction())
return;
@@ -175,7 +177,7 @@ private:

inline void resolveOrdinaryFunctionNode(FunctionNode & function_node, const String & function_name) const
{
auto function = FunctionFactory::instance().get(function_name, context);
auto function = FunctionFactory::instance().get(function_name, getContext());
function_node.resolveAsFunction(function->build(function_node.getArgumentColumns()));
}

@@ -191,8 +193,6 @@ private:

function_node.resolveAsAggregateFunction(std::move(aggregate_function));
}

ContextPtr context;
};

}
|
||||
#include <Analyzer/Passes/ConvertOrLikeChainPass.h>
|
||||
|
||||
#include <memory>
|
||||
#include <unordered_map>
|
||||
#include <vector>
|
||||
#include <Analyzer/Passes/ConvertOrLikeChainPass.h>
|
||||
|
||||
#include <Core/Field.h>
|
||||
|
||||
#include <DataTypes/DataTypesNumber.h>
|
||||
|
||||
#include <Functions/FunctionFactory.h>
|
||||
#include <Functions/likePatternToRegexp.h>
|
||||
|
||||
#include <Interpreters/Context.h>
|
||||
|
||||
#include <Analyzer/ConstantNode.h>
|
||||
#include <Analyzer/UnionNode.h>
|
||||
#include <Analyzer/FunctionNode.h>
|
||||
#include <Analyzer/HashUtils.h>
|
||||
#include <Analyzer/InDepthQueryTreeVisitor.h>
|
||||
#include <Core/Field.h>
|
||||
#include <DataTypes/DataTypesNumber.h>
|
||||
#include <Functions/FunctionFactory.h>
|
||||
#include <Functions/likePatternToRegexp.h>
|
||||
#include <Interpreters/Context.h>
|
||||
|
||||
namespace DB
|
||||
{
|
||||
@ -19,36 +25,28 @@ namespace DB
|
||||
namespace
|
||||
{
|
||||
|
||||
class ConvertOrLikeChainVisitor : public InDepthQueryTreeVisitor<ConvertOrLikeChainVisitor>
|
||||
class ConvertOrLikeChainVisitor : public InDepthQueryTreeVisitorWithContext<ConvertOrLikeChainVisitor>
|
||||
{
|
||||
using FunctionNodes = std::vector<std::shared_ptr<FunctionNode>>;
|
||||
|
||||
const FunctionOverloadResolverPtr match_function_ref;
|
||||
const FunctionOverloadResolverPtr or_function_resolver;
|
||||
public:
|
||||
using Base = InDepthQueryTreeVisitorWithContext<ConvertOrLikeChainVisitor>;
|
||||
using Base::Base;
|
||||
|
||||
explicit ConvertOrLikeChainVisitor(ContextPtr context)
|
||||
: InDepthQueryTreeVisitor<ConvertOrLikeChainVisitor>()
|
||||
, match_function_ref(FunctionFactory::instance().get("multiMatchAny", context))
|
||||
, or_function_resolver(FunctionFactory::instance().get("or", context))
|
||||
explicit ConvertOrLikeChainVisitor(FunctionOverloadResolverPtr or_function_resolver_,
|
||||
FunctionOverloadResolverPtr match_function_resolver_,
|
||||
ContextPtr context)
|
||||
: Base(std::move(context))
|
||||
, or_function_resolver(std::move(or_function_resolver_))
|
||||
, match_function_resolver(std::move(match_function_resolver_))
|
||||
{}
|
||||
|
||||
static bool needChildVisit(VisitQueryTreeNodeType & parent, VisitQueryTreeNodeType &)
|
||||
bool needChildVisit(VisitQueryTreeNodeType &, VisitQueryTreeNodeType &)
|
||||
{
|
||||
ContextPtr context;
|
||||
if (auto * query = parent->as<QueryNode>())
|
||||
context = query->getContext();
|
||||
else if (auto * union_node = parent->as<UnionNode>())
|
||||
context = union_node->getContext();
|
||||
if (context)
|
||||
{
|
||||
const auto & settings = context->getSettingsRef();
|
||||
return settings.optimize_or_like_chain
|
||||
&& settings.allow_hyperscan
|
||||
&& settings.max_hyperscan_regexp_length == 0
|
||||
&& settings.max_hyperscan_regexp_total_length == 0;
|
||||
}
|
||||
return true;
|
||||
const auto & settings = getSettings();
|
||||
|
||||
return settings.optimize_or_like_chain
|
||||
&& settings.allow_hyperscan
|
||||
&& settings.max_hyperscan_regexp_length == 0
|
||||
&& settings.max_hyperscan_regexp_total_length == 0;
|
||||
}
|
||||
|
||||
void visitImpl(QueryTreeNodePtr & node)
|
||||
@ -61,27 +59,28 @@ public:
|
||||
|
||||
QueryTreeNodePtrWithHashMap<Array> node_to_patterns;
|
||||
FunctionNodes match_functions;
|
||||
for (auto & arg : function_node->getArguments())
|
||||
{
|
||||
unique_elems.push_back(arg);
|
||||
|
||||
auto * arg_func = arg->as<FunctionNode>();
|
||||
if (!arg_func)
|
||||
for (auto & argument : function_node->getArguments())
|
||||
{
|
||||
unique_elems.push_back(argument);
|
||||
|
||||
auto * argument_function = argument->as<FunctionNode>();
|
||||
if (!argument_function)
|
||||
continue;
|
||||
|
||||
const bool is_like = arg_func->getFunctionName() == "like";
|
||||
const bool is_ilike = arg_func->getFunctionName() == "ilike";
|
||||
const bool is_like = argument_function->getFunctionName() == "like";
|
||||
const bool is_ilike = argument_function->getFunctionName() == "ilike";
|
||||
|
||||
/// Not {i}like -> bail out.
|
||||
if (!is_like && !is_ilike)
|
||||
continue;
|
||||
|
||||
const auto & like_arguments = arg_func->getArguments().getNodes();
|
||||
const auto & like_arguments = argument_function->getArguments().getNodes();
|
||||
if (like_arguments.size() != 2)
|
||||
continue;
|
||||
|
||||
auto identifier = like_arguments[0];
|
||||
auto * pattern = like_arguments[1]->as<ConstantNode>();
|
||||
const auto & like_first_argument = like_arguments[0];
|
||||
const auto * pattern = like_arguments[1]->as<ConstantNode>();
|
||||
if (!pattern || !isString(pattern->getResultType()))
|
||||
continue;
|
||||
|
||||
@ -91,17 +90,20 @@ public:
|
||||
regexp = "(?i)" + regexp;
|
||||
|
||||
unique_elems.pop_back();
|
||||
auto it = node_to_patterns.find(identifier);
|
||||
|
||||
auto it = node_to_patterns.find(like_first_argument);
|
||||
if (it == node_to_patterns.end())
|
||||
{
|
||||
it = node_to_patterns.insert({identifier, Array{}}).first;
|
||||
it = node_to_patterns.insert({like_first_argument, Array{}}).first;
|
||||
|
||||
/// The second argument will be added when all patterns are known.
|
||||
auto match_function = std::make_shared<FunctionNode>("multiMatchAny");
|
||||
match_function->getArguments().getNodes().push_back(identifier);
|
||||
|
||||
match_function->getArguments().getNodes().push_back(like_first_argument);
|
||||
match_functions.push_back(match_function);
|
||||
|
||||
unique_elems.push_back(std::move(match_function));
|
||||
}
|
||||
|
||||
it->second.push_back(regexp);
|
||||
}
|
||||
|
||||
@ -111,23 +113,29 @@ public:
|
||||
auto & arguments = match_function->getArguments().getNodes();
|
||||
auto & patterns = node_to_patterns.at(arguments[0]);
|
||||
arguments.push_back(std::make_shared<ConstantNode>(Field{std::move(patterns)}));
|
||||
match_function->resolveAsFunction(match_function_ref);
|
||||
match_function->resolveAsFunction(match_function_resolver);
|
||||
}
|
||||
|
||||
/// OR must have at least two arguments.
|
||||
if (unique_elems.size() == 1)
|
||||
unique_elems.push_back(std::make_shared<ConstantNode>(false));
|
||||
unique_elems.push_back(std::make_shared<ConstantNode>(static_cast<UInt8>(0)));
|
||||
|
||||
function_node->getArguments().getNodes() = std::move(unique_elems);
|
||||
function_node->resolveAsFunction(or_function_resolver);
|
||||
}
|
||||
private:
|
||||
using FunctionNodes = std::vector<std::shared_ptr<FunctionNode>>;
|
||||
const FunctionOverloadResolverPtr or_function_resolver;
|
||||
const FunctionOverloadResolverPtr match_function_resolver;
|
||||
};
|
||||
|
||||
}
|
||||
|
||||
void ConvertOrLikeChainPass::run(QueryTreeNodePtr query_tree_node, ContextPtr context)
|
||||
void ConvertOrLikeChainPass::run(QueryTreeNodePtr query_tree_node, ContextPtr context)
|
||||
{
|
||||
ConvertOrLikeChainVisitor visitor(context);
|
||||
auto or_function_resolver = FunctionFactory::instance().get("or", context);
|
||||
auto match_function_resolver = FunctionFactory::instance().get("multiMatchAny", context);
|
||||
ConvertOrLikeChainVisitor visitor(std::move(or_function_resolver), std::move(match_function_resolver), std::move(context));
|
||||
visitor.visit(query_tree_node);
|
||||
}
|
||||
|
||||
|
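Two things happen in this pass's rewrite: all `LIKE`/`ILIKE` patterns applied to the same left-hand expression are collected into one array and folded into a single `multiMatchAny` call, and the visitor now receives its `or`/`multiMatchAny` resolvers from the pass instead of looking them up in its own constructor, which makes the visitor cheaper to construct and easier to test. A simplified, self-contained sketch of the grouping step (`likePatternToRegexp` here is a toy stand-in that ignores the regexp escaping and anchoring the real helper handles):

```cpp
#include <iostream>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Toy conversion of a LIKE pattern to a regexp: '%' -> ".*", '_' -> ".".
std::string likePatternToRegexp(const std::string & pattern)
{
    std::string regexp;
    for (char c : pattern)
        regexp += (c == '%') ? std::string(".*") : (c == '_') ? std::string(".") : std::string(1, c);
    return regexp;
}

int main()
{
    // Arguments of an OR chain, as (column expression, LIKE pattern) pairs.
    std::vector<std::pair<std::string, std::string>> like_args = {
        {"name", "%clickhouse%"}, {"name", "%chouse%"}, {"comment", "%fast%"}};

    // Mirror of node_to_patterns: one pattern list per distinct column expression.
    std::map<std::string, std::vector<std::string>> node_to_patterns;
    for (const auto & [column, pattern] : like_args)
        node_to_patterns[column].push_back(likePatternToRegexp(pattern));

    // Each group becomes a single multiMatchAny(column, [patterns...]) call.
    for (const auto & [column, patterns] : node_to_patterns)
    {
        std::cout << "multiMatchAny(" << column << ", [";
        for (size_t i = 0; i < patterns.size(); ++i)
            std::cout << (i ? ", " : "") << '\'' << patterns[i] << '\'';
        std::cout << "])\n";
    }
}
```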
@@ -16,11 +16,17 @@ namespace DB
namespace
{

class CountDistinctVisitor : public InDepthQueryTreeVisitor<CountDistinctVisitor>
class CountDistinctVisitor : public InDepthQueryTreeVisitorWithContext<CountDistinctVisitor>
{
public:
static void visitImpl(QueryTreeNodePtr & node)
using Base = InDepthQueryTreeVisitorWithContext<CountDistinctVisitor>;
using Base::Base;

void visitImpl(QueryTreeNodePtr & node)
{
if (!getSettings().count_distinct_optimization)
return;

auto * query_node = node->as<QueryNode>();

/// Check that query has only SELECT clause
@@ -78,9 +84,9 @@ public:

}

void CountDistinctPass::run(QueryTreeNodePtr query_tree_node, ContextPtr)
void CountDistinctPass::run(QueryTreeNodePtr query_tree_node, ContextPtr context)
{
CountDistinctVisitor visitor;
CountDistinctVisitor visitor(std::move(context));
visitor.visit(query_tree_node);
}
@@ -16,12 +16,11 @@ namespace DB
namespace
{

class CustomizeFunctionsVisitor : public InDepthQueryTreeVisitor<CustomizeFunctionsVisitor>
class CustomizeFunctionsVisitor : public InDepthQueryTreeVisitorWithContext<CustomizeFunctionsVisitor>
{
public:
explicit CustomizeFunctionsVisitor(ContextPtr & context_)
: context(context_)
{}
using Base = InDepthQueryTreeVisitorWithContext<CustomizeFunctionsVisitor>;
using Base::Base;

void visitImpl(QueryTreeNodePtr & node) const
{
@@ -29,7 +28,7 @@ public:
if (!function_node)
return;

const auto & settings = context->getSettingsRef();
const auto & settings = getSettings();

/// After successful function replacement function name and function name lowercase must be recalculated
auto function_name = function_node->getFunctionName();
@@ -154,19 +153,16 @@ public:

inline void resolveOrdinaryFunctionNode(FunctionNode & function_node, const String & function_name) const
{
auto function = FunctionFactory::instance().get(function_name, context);
auto function = FunctionFactory::instance().get(function_name, getContext());
function_node.resolveAsFunction(function->build(function_node.getArgumentColumns()));
}

private:
ContextPtr & context;
};

}

void CustomizeFunctionsPass::run(QueryTreeNodePtr query_tree_node, ContextPtr context)
{
CustomizeFunctionsVisitor visitor(context);
CustomizeFunctionsVisitor visitor(std::move(context));
visitor.visit(query_tree_node);
}
@@ -22,15 +22,17 @@ namespace DB
namespace
{

class FunctionToSubcolumnsVisitor : public InDepthQueryTreeVisitor<FunctionToSubcolumnsVisitor>
class FunctionToSubcolumnsVisitor : public InDepthQueryTreeVisitorWithContext<FunctionToSubcolumnsVisitor>
{
public:
explicit FunctionToSubcolumnsVisitor(ContextPtr & context_)
: context(context_)
{}
using Base = InDepthQueryTreeVisitorWithContext<FunctionToSubcolumnsVisitor>;
using Base::Base;

void visitImpl(QueryTreeNodePtr & node) const
{
if (!getSettings().optimize_functions_to_subcolumns)
return;

auto * function_node = node->as<FunctionNode>();
if (!function_node)
return;
@@ -192,11 +194,9 @@ public:
private:
inline void resolveOrdinaryFunctionNode(FunctionNode & function_node, const String & function_name) const
{
auto function = FunctionFactory::instance().get(function_name, context);
auto function = FunctionFactory::instance().get(function_name, getContext());
function_node.resolveAsFunction(function->build(function_node.getArgumentColumns()));
}

ContextPtr & context;
};

}
@@ -5,7 +5,7 @@
namespace DB
{

/** Transform functions to subcolumns.
/** Transform functions to subcolumns. Enabled using setting optimize_functions_to_subcolumns.
 * It can help to reduce amount of read data.
 *
 * Example: SELECT tupleElement(column, subcolumn) FROM test_table;
@@ -26,16 +26,22 @@ namespace ErrorCodes
namespace
{

class FuseFunctionsVisitor : public InDepthQueryTreeVisitor<FuseFunctionsVisitor>
class FuseFunctionsVisitor : public InDepthQueryTreeVisitorWithContext<FuseFunctionsVisitor>
{
public:
using Base = InDepthQueryTreeVisitorWithContext<FuseFunctionsVisitor>;
using Base::Base;

explicit FuseFunctionsVisitor(const std::unordered_set<String> names_to_collect_)
: names_to_collect(names_to_collect_)
explicit FuseFunctionsVisitor(const std::unordered_set<String> names_to_collect_, ContextPtr context)
: Base(std::move(context))
, names_to_collect(names_to_collect_)
{}

void visitImpl(QueryTreeNodePtr & node)
{
if (!getSettings().optimize_syntax_fuse_functions)
return;

auto * function_node = node->as<FunctionNode>();
if (!function_node || !function_node->isAggregateFunction() || !names_to_collect.contains(function_node->getFunctionName()))
return;
@@ -201,7 +207,7 @@ FunctionNodePtr createFusedQuantilesNode(std::vector<QueryTreeNodePtr *> & nodes

void tryFuseSumCountAvg(QueryTreeNodePtr query_tree_node, ContextPtr context)
{
FuseFunctionsVisitor visitor({"sum", "count", "avg"});
FuseFunctionsVisitor visitor({"sum", "count", "avg"}, context);
visitor.visit(query_tree_node);

for (auto & [argument, nodes] : visitor.argument_to_functions_mapping)
@@ -220,7 +226,7 @@ void tryFuseSumCountAvg(QueryTreeNodePtr query_tree_node, ContextPtr context)

void tryFuseQuantiles(QueryTreeNodePtr query_tree_node, ContextPtr context)
{
FuseFunctionsVisitor visitor_quantile({"quantile"});
FuseFunctionsVisitor visitor_quantile({"quantile"}, context);
visitor_quantile.visit(query_tree_node);

for (auto & [argument, nodes_set] : visitor_quantile.argument_to_functions_mapping)
@@ -12,15 +12,22 @@ namespace DB
namespace
{

class IfChainToMultiIfPassVisitor : public InDepthQueryTreeVisitor<IfChainToMultiIfPassVisitor>
class IfChainToMultiIfPassVisitor : public InDepthQueryTreeVisitorWithContext<IfChainToMultiIfPassVisitor>
{
public:
explicit IfChainToMultiIfPassVisitor(FunctionOverloadResolverPtr multi_if_function_ptr_)
: multi_if_function_ptr(std::move(multi_if_function_ptr_))
using Base = InDepthQueryTreeVisitorWithContext<IfChainToMultiIfPassVisitor>;
using Base::Base;

explicit IfChainToMultiIfPassVisitor(FunctionOverloadResolverPtr multi_if_function_ptr_, ContextPtr context)
: Base(std::move(context))
, multi_if_function_ptr(std::move(multi_if_function_ptr_))
{}

void visitImpl(QueryTreeNodePtr & node)
{
if (!getSettings().optimize_if_chain_to_multiif)
return;

auto * function_node = node->as<FunctionNode>();
if (!function_node || function_node->getFunctionName() != "if" || function_node->getArguments().getNodes().size() != 3)
return;
@@ -68,7 +75,8 @@ private:

void IfChainToMultiIfPass::run(QueryTreeNodePtr query_tree_node, ContextPtr context)
{
IfChainToMultiIfPassVisitor visitor(FunctionFactory::instance().get("multiIf", context));
auto multi_if_function_ptr = FunctionFactory::instance().get("multiIf", context);
IfChainToMultiIfPassVisitor visitor(std::move(multi_if_function_ptr), std::move(context));
visitor.visit(query_tree_node);
}
@@ -107,21 +107,24 @@ void wrapIntoToString(FunctionNode & function_node, QueryTreeNodePtr arg, Contex
assert(isString(function_node.getResultType()));
}

class ConvertStringsToEnumVisitor : public InDepthQueryTreeVisitor<ConvertStringsToEnumVisitor>
class ConvertStringsToEnumVisitor : public InDepthQueryTreeVisitorWithContext<ConvertStringsToEnumVisitor>
{
public:
explicit ConvertStringsToEnumVisitor(ContextPtr context_)
: context(std::move(context_))
{
}
using Base = InDepthQueryTreeVisitorWithContext<ConvertStringsToEnumVisitor>;
using Base::Base;

void visitImpl(QueryTreeNodePtr & node)
{
if (!getSettings().optimize_if_transform_strings_to_enum)
return;

auto * function_node = node->as<FunctionNode>();

if (!function_node)
return;

const auto & context = getContext();

/// to preserve return type (String) of the current function_node, we wrap the newly
/// generated function nodes into toString

@@ -198,16 +201,13 @@ public:
return;
}
}

private:
ContextPtr context;
};

}

void IfTransformStringsToEnumPass::run(QueryTreeNodePtr query, ContextPtr context)
{
ConvertStringsToEnumVisitor visitor(context);
ConvertStringsToEnumVisitor visitor(std::move(context));
visitor.visit(query);
}
@@ -10,15 +10,22 @@ namespace DB
namespace
{

class MultiIfToIfVisitor : public InDepthQueryTreeVisitor<MultiIfToIfVisitor>
class MultiIfToIfVisitor : public InDepthQueryTreeVisitorWithContext<MultiIfToIfVisitor>
{
public:
explicit MultiIfToIfVisitor(FunctionOverloadResolverPtr if_function_ptr_)
: if_function_ptr(if_function_ptr_)
using Base = InDepthQueryTreeVisitorWithContext<MultiIfToIfVisitor>;
using Base::Base;

explicit MultiIfToIfVisitor(FunctionOverloadResolverPtr if_function_ptr_, ContextPtr context)
: Base(std::move(context))
, if_function_ptr(std::move(if_function_ptr_))
{}

void visitImpl(QueryTreeNodePtr & node)
{
if (!getSettings().optimize_multiif_to_if)
return;

auto * function_node = node->as<FunctionNode>();
if (!function_node || function_node->getFunctionName() != "multiIf")
return;
@@ -38,7 +45,8 @@ private:

void MultiIfToIfPass::run(QueryTreeNodePtr query_tree_node, ContextPtr context)
{
MultiIfToIfVisitor visitor(FunctionFactory::instance().get("if", context));
auto if_function_ptr = FunctionFactory::instance().get("if", context);
MultiIfToIfVisitor visitor(std::move(if_function_ptr), std::move(context));
visitor.visit(query_tree_node);
}
@@ -14,12 +14,17 @@ namespace DB
namespace
{

class NormalizeCountVariantsVisitor : public InDepthQueryTreeVisitor<NormalizeCountVariantsVisitor>
class NormalizeCountVariantsVisitor : public InDepthQueryTreeVisitorWithContext<NormalizeCountVariantsVisitor>
{
public:
explicit NormalizeCountVariantsVisitor(ContextPtr context_) : context(std::move(context_)) {}
using Base = InDepthQueryTreeVisitorWithContext<NormalizeCountVariantsVisitor>;
using Base::Base;

void visitImpl(QueryTreeNodePtr & node)
{
if (!getSettings().optimize_normalize_count_variants)
return;

auto * function_node = node->as<FunctionNode>();
if (!function_node || !function_node->isAggregateFunction() || (function_node->getFunctionName() != "count" && function_node->getFunctionName() != "sum"))
return;
@@ -42,15 +47,13 @@ public:
else if (function_node->getFunctionName() == "sum" &&
first_argument_constant_literal.getType() == Field::Types::UInt64 &&
first_argument_constant_literal.get<UInt64>() == 1 &&
!context->getSettingsRef().aggregate_functions_null_for_empty)
!getSettings().aggregate_functions_null_for_empty)
{
resolveAsCountAggregateFunction(*function_node);
function_node->getArguments().getNodes().clear();
}
}
private:
ContextPtr context;

static inline void resolveAsCountAggregateFunction(FunctionNode & function_node)
{
AggregateFunctionProperties properties;
@@ -1,26 +1,33 @@
#include <Analyzer/Passes/OptimizeGroupByFunctionKeysPass.h>

#include <algorithm>
#include <queue>

#include <Analyzer/FunctionNode.h>
#include <Analyzer/HashUtils.h>
#include <Analyzer/IQueryTreeNode.h>
#include <Analyzer/InDepthQueryTreeVisitor.h>
#include <Analyzer/QueryNode.h>

#include <algorithm>
#include <queue>

namespace DB
{

class OptimizeGroupByFunctionKeysVisitor : public InDepthQueryTreeVisitor<OptimizeGroupByFunctionKeysVisitor>
class OptimizeGroupByFunctionKeysVisitor : public InDepthQueryTreeVisitorWithContext<OptimizeGroupByFunctionKeysVisitor>
{
public:
using Base = InDepthQueryTreeVisitorWithContext<OptimizeGroupByFunctionKeysVisitor>;
using Base::Base;

static bool needChildVisit(QueryTreeNodePtr & /*parent*/, QueryTreeNodePtr & child)
{
return !child->as<FunctionNode>();
}

static void visitImpl(QueryTreeNodePtr & node)
void visitImpl(QueryTreeNodePtr & node)
{
if (!getSettings().optimize_group_by_function_keys)
return;

auto * query = node->as<QueryNode>();
if (!query)
return;
@@ -41,11 +48,10 @@ public:
optimizeGroupingSet(group_by);
}
private:

struct NodeWithInfo
{
QueryTreeNodePtr node;
bool parents_are_only_deterministic;
bool parents_are_only_deterministic = false;
};

static bool canBeEliminated(QueryTreeNodePtr & node, const QueryTreeNodePtrWithHashSet & group_by_keys)
@@ -64,7 +70,7 @@ private:
// TODO: Also process CONSTANT here. We can simplify GROUP BY x, x + 1 to GROUP BY x.
while (!candidates.empty())
{
auto [candidate, deterministic_context] = candidates.back();
auto [candidate, parents_are_only_deterministic] = candidates.back();
candidates.pop_back();

bool found = group_by_keys.contains(candidate);
@@ -80,7 +86,7 @@ private:

if (!found)
{
bool is_deterministic_function = deterministic_context && function->getFunction()->isDeterministicInScopeOfQuery();
bool is_deterministic_function = parents_are_only_deterministic && function->getFunction()->isDeterministicInScopeOfQuery();
for (auto it = arguments.rbegin(); it != arguments.rend(); ++it)
candidates.push_back({ *it, is_deterministic_function });
}
@@ -91,7 +97,7 @@ private:
return false;
break;
case QueryTreeNodeType::CONSTANT:
if (!deterministic_context)
if (!parents_are_only_deterministic)
return false;
break;
default:
@@ -117,9 +123,10 @@ private:
}
};

void OptimizeGroupByFunctionKeysPass::run(QueryTreeNodePtr query_tree_node, ContextPtr /*context*/)
void OptimizeGroupByFunctionKeysPass::run(QueryTreeNodePtr query_tree_node, ContextPtr context)
{
OptimizeGroupByFunctionKeysVisitor().visit(query_tree_node);
OptimizeGroupByFunctionKeysVisitor visitor(std::move(context));
visitor.visit(query_tree_node);
}

}
@@ -1,11 +1,13 @@
#include <Analyzer/Passes/OptimizeRedundantFunctionsInOrderByPass.h>

#include <Functions/IFunction.h>

#include <Analyzer/ColumnNode.h>
#include <Analyzer/FunctionNode.h>
#include <Analyzer/HashUtils.h>
#include <Analyzer/InDepthQueryTreeVisitor.h>
#include <Analyzer/QueryNode.h>
#include <Analyzer/SortNode.h>
#include <Functions/IFunction.h>

namespace DB
{
@@ -13,9 +15,12 @@ namespace DB
namespace
{

class OptimizeRedundantFunctionsInOrderByVisitor : public InDepthQueryTreeVisitor<OptimizeRedundantFunctionsInOrderByVisitor>
class OptimizeRedundantFunctionsInOrderByVisitor : public InDepthQueryTreeVisitorWithContext<OptimizeRedundantFunctionsInOrderByVisitor>
{
public:
using Base = InDepthQueryTreeVisitorWithContext<OptimizeRedundantFunctionsInOrderByVisitor>;
using Base::Base;

static bool needChildVisit(QueryTreeNodePtr & node, QueryTreeNodePtr & /*parent*/)
{
if (node->as<FunctionNode>())
@@ -25,6 +30,9 @@ public:

void visitImpl(QueryTreeNodePtr & node)
{
if (!getSettings().optimize_redundant_functions_in_order_by)
return;

auto * query = node->as<QueryNode>();
if (!query)
return;
@@ -116,9 +124,10 @@ private:

}

void OptimizeRedundantFunctionsInOrderByPass::run(QueryTreeNodePtr query_tree_node, ContextPtr /*context*/)
void OptimizeRedundantFunctionsInOrderByPass::run(QueryTreeNodePtr query_tree_node, ContextPtr context)
{
OptimizeRedundantFunctionsInOrderByVisitor().visit(query_tree_node);
OptimizeRedundantFunctionsInOrderByVisitor visitor(std::move(context));
visitor.visit(query_tree_node);
}

}
@@ -1943,7 +1943,7 @@ void QueryAnalyzer::validateTableExpressionModifiers(const QueryTreeNodePtr & ta

if (!table_node && !table_function_node && !query_node && !union_node)
throw Exception(ErrorCodes::LOGICAL_ERROR,
"Unexpected table expression. Expected table, table function, query or union node. Actual {}",
"Unexpected table expression. Expected table, table function, query or union node. Table node: {}, scope node: {}",
table_expression_node->formatASTForErrorMessage(),
scope.scope_node->formatASTForErrorMessage());

@@ -4366,12 +4366,9 @@ ProjectionNames QueryAnalyzer::resolveFunction(QueryTreeNodePtr & node, Identifi
{
if (!AggregateFunctionFactory::instance().isAggregateFunctionName(function_name))
{
std::string error_message = fmt::format("Aggregate function with name '{}' does not exists. In scope {}",
function_name,
scope.scope_node->formatASTForErrorMessage());

AggregateFunctionFactory::instance().appendHintsMessage(error_message, function_name);
throw Exception(ErrorCodes::UNKNOWN_AGGREGATE_FUNCTION, error_message);
throw Exception(ErrorCodes::UNKNOWN_AGGREGATE_FUNCTION, "Aggregate function with name '{}' does not exists. In scope {}{}",
function_name, scope.scope_node->formatASTForErrorMessage(),
getHintsErrorMessageSuffix(AggregateFunctionFactory::instance().getHints(function_name)));
}

if (!function_lambda_arguments_indexes.empty())
@@ -5726,7 +5723,7 @@ void QueryAnalyzer::resolveQueryJoinTreeNode(QueryTreeNodePtr & join_tree_node,
case QueryTreeNodeType::IDENTIFIER:
{
throw Exception(ErrorCodes::LOGICAL_ERROR,
"Identifiers in FROM section must be already resolved. In scope {}",
"Identifiers in FROM section must be already resolved. Node {}, scope {}",
join_tree_node->formatASTForErrorMessage(),
scope.scope_node->formatASTForErrorMessage());
}
@@ -20,15 +20,17 @@ namespace DB
namespace
{

class SumIfToCountIfVisitor : public InDepthQueryTreeVisitor<SumIfToCountIfVisitor>
class SumIfToCountIfVisitor : public InDepthQueryTreeVisitorWithContext<SumIfToCountIfVisitor>
{
public:
explicit SumIfToCountIfVisitor(ContextPtr & context_)
: context(context_)
{}
using Base = InDepthQueryTreeVisitorWithContext<SumIfToCountIfVisitor>;
using Base::Base;

void visitImpl(QueryTreeNodePtr & node)
{
if (!getSettings().optimize_rewrite_sum_if_to_count_if)
return;

auto * function_node = node->as<FunctionNode>();
if (!function_node || !function_node->isAggregateFunction())
return;
@@ -56,7 +58,7 @@ public:
if (!isInt64OrUInt64FieldType(constant_value_literal.getType()))
return;

if (constant_value_literal.get<UInt64>() != 1 || context->getSettingsRef().aggregate_functions_null_for_empty)
if (constant_value_literal.get<UInt64>() != 1 || getSettings().aggregate_functions_null_for_empty)
return;

function_node_arguments_nodes[0] = std::move(function_node_arguments_nodes[1]);
@@ -122,7 +124,7 @@ public:
auto & not_function_arguments = not_function->getArguments().getNodes();
not_function_arguments.push_back(nested_if_function_arguments_nodes[0]);

not_function->resolveAsFunction(FunctionFactory::instance().get("not", context)->build(not_function->getArgumentColumns()));
not_function->resolveAsFunction(FunctionFactory::instance().get("not", getContext())->build(not_function->getArgumentColumns()));

function_node_arguments_nodes[0] = std::move(not_function);
function_node_arguments_nodes.resize(1);
@@ -143,8 +145,6 @@ private:

function_node.resolveAsAggregateFunction(std::move(aggregate_function));
}

ContextPtr & context;
};

}
@@ -25,11 +25,17 @@ bool isUniqFunction(const String & function_name)
function_name == "uniqTheta";
}

class UniqInjectiveFunctionsEliminationVisitor : public InDepthQueryTreeVisitor<UniqInjectiveFunctionsEliminationVisitor>
class UniqInjectiveFunctionsEliminationVisitor : public InDepthQueryTreeVisitorWithContext<UniqInjectiveFunctionsEliminationVisitor>
{
public:
static void visitImpl(QueryTreeNodePtr & node)
using Base = InDepthQueryTreeVisitorWithContext<UniqInjectiveFunctionsEliminationVisitor>;
using Base::Base;

void visitImpl(QueryTreeNodePtr & node)
{
if (!getSettings().optimize_injective_functions_inside_uniq)
return;

auto * function_node = node->as<FunctionNode>();
if (!function_node || !function_node->isAggregateFunction() || !isUniqFunction(function_node->getFunctionName()))
return;
@@ -81,9 +87,9 @@ public:

}

void UniqInjectiveFunctionsEliminationPass::run(QueryTreeNodePtr query_tree_node, ContextPtr)
void UniqInjectiveFunctionsEliminationPass::run(QueryTreeNodePtr query_tree_node, ContextPtr context)
{
UniqInjectiveFunctionsEliminationVisitor visitor;
UniqInjectiveFunctionsEliminationVisitor visitor(std::move(context));
visitor.visit(query_tree_node);
}
@@ -1,5 +1,7 @@
#include <Analyzer/QueryNode.h>

#include <fmt/core.h>

#include <Common/SipHash.h>
#include <Common/FieldVisitorToString.h>

@@ -17,7 +19,6 @@
#include <Parsers/ASTSetQuery.h>

#include <Analyzer/Utils.h>
#include <fmt/core.h>

namespace DB
{
@@ -36,7 +37,7 @@ QueryNode::QueryNode(ContextMutablePtr context_, SettingsChanges settings_change
}

QueryNode::QueryNode(ContextMutablePtr context_)
: QueryNode(context_, {} /*settings_changes*/)
: QueryNode(std::move(context_), {} /*settings_changes*/)
{}

void QueryNode::dumpTreeImpl(WriteBuffer & buffer, FormatState & format_state, size_t indent) const
@@ -185,10 +186,7 @@ void QueryNode::dumpTreeImpl(WriteBuffer & buffer, FormatState & format_state, s
{
buffer << '\n' << std::string(indent + 2, ' ') << "SETTINGS";
for (const auto & change : settings_changes)
{
buffer << fmt::format(" {}={}", change.name, toString(change.value));
}
buffer << '\n';
}
}
@@ -1,6 +1,7 @@
#include <memory>
#include <Analyzer/QueryTreePassManager.h>

#include <memory>

#include <Common/Exception.h>

#include <IO/WriteHelpers.h>
@@ -133,7 +134,6 @@ private:
 * TODO: Support setting optimize_aggregators_of_group_by_keys.
 * TODO: Support setting optimize_duplicate_order_by_and_distinct.
 * TODO: Support setting optimize_monotonous_functions_in_order_by.
 * TODO: Support settings.optimize_or_like_chain.
 * TODO: Add optimizations based on function semantics. Example: SELECT * FROM test_table WHERE id != id. (id is not nullable column).
 */

@@ -210,53 +210,31 @@ void QueryTreePassManager::dump(WriteBuffer & buffer, size_t up_to_pass_index)

void addQueryTreePasses(QueryTreePassManager & manager)
{
auto context = manager.getContext();
const auto & settings = context->getSettingsRef();

manager.addPass(std::make_unique<QueryAnalysisPass>());
manager.addPass(std::make_unique<FunctionToSubcolumnsPass>());

if (settings.optimize_functions_to_subcolumns)
manager.addPass(std::make_unique<FunctionToSubcolumnsPass>());

if (settings.count_distinct_optimization)
manager.addPass(std::make_unique<CountDistinctPass>());

if (settings.optimize_rewrite_sum_if_to_count_if)
manager.addPass(std::make_unique<SumIfToCountIfPass>());

if (settings.optimize_normalize_count_variants)
manager.addPass(std::make_unique<NormalizeCountVariantsPass>());
manager.addPass(std::make_unique<CountDistinctPass>());
manager.addPass(std::make_unique<SumIfToCountIfPass>());
manager.addPass(std::make_unique<NormalizeCountVariantsPass>());

manager.addPass(std::make_unique<CustomizeFunctionsPass>());

if (settings.optimize_arithmetic_operations_in_aggregate_functions)
manager.addPass(std::make_unique<AggregateFunctionsArithmericOperationsPass>());

if (settings.optimize_injective_functions_inside_uniq)
manager.addPass(std::make_unique<UniqInjectiveFunctionsEliminationPass>());

if (settings.optimize_group_by_function_keys)
manager.addPass(std::make_unique<OptimizeGroupByFunctionKeysPass>());

if (settings.optimize_multiif_to_if)
manager.addPass(std::make_unique<MultiIfToIfPass>());
manager.addPass(std::make_unique<AggregateFunctionsArithmericOperationsPass>());
manager.addPass(std::make_unique<UniqInjectiveFunctionsEliminationPass>());
manager.addPass(std::make_unique<OptimizeGroupByFunctionKeysPass>());

manager.addPass(std::make_unique<MultiIfToIfPass>());
manager.addPass(std::make_unique<IfConstantConditionPass>());
manager.addPass(std::make_unique<IfChainToMultiIfPass>());

if (settings.optimize_if_chain_to_multiif)
manager.addPass(std::make_unique<IfChainToMultiIfPass>());

if (settings.optimize_redundant_functions_in_order_by)
manager.addPass(std::make_unique<OptimizeRedundantFunctionsInOrderByPass>());
manager.addPass(std::make_unique<OptimizeRedundantFunctionsInOrderByPass>());

manager.addPass(std::make_unique<OrderByTupleEliminationPass>());
manager.addPass(std::make_unique<OrderByLimitByDuplicateEliminationPass>());

if (settings.optimize_syntax_fuse_functions)
manager.addPass(std::make_unique<FuseFunctionsPass>());
manager.addPass(std::make_unique<FuseFunctionsPass>());

if (settings.optimize_if_transform_strings_to_enum)
manager.addPass(std::make_unique<IfTransformStringsToEnumPass>());
manager.addPass(std::make_unique<IfTransformStringsToEnumPass>());

manager.addPass(std::make_unique<ConvertOrLikeChainPass>());
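The registration change above inverts where settings are consulted: previously a disabled global setting meant the pass object was never created, now every pass is registered unconditionally and checks its setting inside `visitImpl`, against the context of the scope actually being visited. A self-contained sketch of the before/after control flow (all names here are toy stand-ins):

```cpp
#include <iostream>
#include <memory>
#include <vector>

// Toy settings snapshot; in the real code this comes from the visited scope's Context.
struct Settings { bool optimize_multiif_to_if = false; };

struct IPass
{
    virtual ~IPass() = default;
    virtual void run(const Settings & settings) = 0;
};

struct MultiIfToIfPass : IPass
{
    void run(const Settings & settings) override
    {
        // The pass itself decides, per scope, whether to do anything.
        if (!settings.optimize_multiif_to_if)
            return;
        std::cout << "rewriting multiIf() -> if()\n";
    }
};

int main()
{
    std::vector<std::unique_ptr<IPass>> passes;
    passes.push_back(std::make_unique<MultiIfToIfPass>()); // registered unconditionally

    Settings scope_settings{.optimize_multiif_to_if = true};
    for (auto & pass : passes)
        pass->run(scope_settings);
}
```

The practical gain is that subqueries carrying their own `SETTINGS` clause are honored, since the decision is no longer frozen at registration time.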
@@ -79,7 +79,7 @@ namespace
request.SetMaxKeys(1);
auto outcome = client.ListObjects(request);
if (!outcome.IsSuccess())
throw Exception(outcome.GetError().GetMessage(), ErrorCodes::S3_ERROR);
throw Exception::createDeprecated(outcome.GetError().GetMessage(), ErrorCodes::S3_ERROR);
return outcome.GetResult().GetContents();
}

@@ -233,7 +233,7 @@ void BackupWriterS3::removeFile(const String & file_name)
request.SetKey(fs::path(s3_uri.key) / file_name);
auto outcome = client->DeleteObject(request);
if (!outcome.IsSuccess() && !isNotFoundError(outcome.GetError().GetErrorType()))
throw Exception(outcome.GetError().GetMessage(), ErrorCodes::S3_ERROR);
throw Exception::createDeprecated(outcome.GetError().GetMessage(), ErrorCodes::S3_ERROR);
}

void BackupWriterS3::removeFiles(const Strings & file_names)
@@ -291,7 +291,7 @@ void BackupWriterS3::removeFilesBatch(const Strings & file_names)

auto outcome = client->DeleteObjects(request);
if (!outcome.IsSuccess() && !isNotFoundError(outcome.GetError().GetErrorType()))
throw Exception(outcome.GetError().GetMessage(), ErrorCodes::S3_ERROR);
throw Exception::createDeprecated(outcome.GetError().GetMessage(), ErrorCodes::S3_ERROR);
}
}
@@ -93,7 +93,7 @@ namespace
catch (...)
{
if (coordination)
coordination->setError(current_host, Exception{getCurrentExceptionCode(), getCurrentExceptionMessage(true, true)});
coordination->setError(current_host, Exception(getCurrentExceptionMessageAndPattern(true, true), getCurrentExceptionCode()));
}
}
@@ -452,7 +452,7 @@ void ClientBase::onData(Block & block, ASTPtr parsed_query)
catch (const Exception &)
{
/// Catch client errors like NO_ROW_DELIMITER
throw LocalFormatError(getCurrentExceptionMessage(print_stack_trace), getCurrentExceptionCode());
throw LocalFormatError(getCurrentExceptionMessageAndPattern(print_stack_trace), getCurrentExceptionCode());
}

/// Received data block is immediately displayed to the user.
@@ -630,7 +630,7 @@ try
}
catch (...)
{
throw LocalFormatError(getCurrentExceptionMessage(print_stack_trace), getCurrentExceptionCode());
throw LocalFormatError(getCurrentExceptionMessageAndPattern(print_stack_trace), getCurrentExceptionCode());
}

@@ -1912,7 +1912,7 @@ bool ClientBase::executeMultiQuery(const String & all_queries_text)
{
// Surprisingly, this is a client error. A server error would
// have been reported without throwing (see onReceiveSeverException()).
client_exception = std::make_unique<Exception>(getCurrentExceptionMessage(print_stack_trace), getCurrentExceptionCode());
client_exception = std::make_unique<Exception>(getCurrentExceptionMessageAndPattern(print_stack_trace), getCurrentExceptionCode());
have_error = true;
}
@@ -187,7 +187,7 @@ void LocalConnection::sendQuery(
catch (...)
{
state->io.onException();
state->exception = std::make_unique<Exception>("Unknown exception", ErrorCodes::UNKNOWN_EXCEPTION);
state->exception = std::make_unique<Exception>(ErrorCodes::UNKNOWN_EXCEPTION, "Unknown exception");
}
}

@@ -291,7 +291,7 @@ bool LocalConnection::poll(size_t)
catch (...)
{
state->io.onException();
state->exception = std::make_unique<Exception>("Unknown exception", ErrorCodes::UNKNOWN_EXCEPTION);
state->exception = std::make_unique<Exception>(ErrorCodes::UNKNOWN_EXCEPTION, "Unknown exception");
}
}
@@ -83,7 +83,7 @@ template <is_decimal T>
UInt64 ColumnDecimal<T>::get64([[maybe_unused]] size_t n) const
{
if constexpr (sizeof(T) > sizeof(UInt64))
throw Exception(String("Method get64 is not supported for ") + getFamilyName(), ErrorCodes::NOT_IMPLEMENTED);
throw Exception(ErrorCodes::NOT_IMPLEMENTED, "Method get64 is not supported for {}", getFamilyName());
else
return static_cast<NativeT>(data[n]);
}
@@ -35,7 +35,7 @@ ColumnNullable::ColumnNullable(MutableColumnPtr && nested_column_, MutableColumn
nested_column = getNestedColumn().convertToFullColumnIfConst();

if (!getNestedColumn().canBeInsideNullable())
throw Exception{getNestedColumn().getName() + " cannot be inside Nullable column", ErrorCodes::ILLEGAL_COLUMN};
throw Exception(ErrorCodes::ILLEGAL_COLUMN, "{} cannot be inside Nullable column", getNestedColumn().getName());

if (isColumnConst(*null_map))
throw Exception(ErrorCodes::ILLEGAL_COLUMN, "ColumnNullable cannot have constant null map");
@@ -207,8 +207,9 @@ private:
if (size >= MMAP_THRESHOLD)
{
if (alignment > mmap_min_alignment)
throw DB::Exception(fmt::format("Too large alignment {}: more than page size when allocating {}.",
ReadableSize(alignment), ReadableSize(size)), DB::ErrorCodes::BAD_ARGUMENTS);
throw DB::Exception(DB::ErrorCodes::BAD_ARGUMENTS,
"Too large alignment {}: more than page size when allocating {}.",
ReadableSize(alignment), ReadableSize(size));

buf = mmap(getMmapHint(), size, PROT_READ | PROT_WRITE,
mmap_flags, -1, 0);
@@ -140,7 +140,7 @@ void CancelToken::raise()
{
std::unique_lock lock(signal_mutex);
if (exception_code != 0)
throw DB::Exception(
throw DB::Exception::createRuntime(
std::exchange(exception_code, 0),
std::exchange(exception_message, {}));
else
@ -88,7 +88,7 @@ public:
|
||||
{
|
||||
/// A more understandable error message.
|
||||
if (e.code() == DB::ErrorCodes::CANNOT_READ_ALL_DATA || e.code() == DB::ErrorCodes::ATTEMPT_TO_READ_AFTER_EOF)
|
||||
throw DB::ParsingException("File " + path + " is empty. You must fill it manually with appropriate value.", e.code());
|
||||
throw DB::ParsingException(e.code(), "File {} is empty. You must fill it manually with appropriate value.", path);
|
||||
else
|
||||
throw;
|
||||
}
|
||||
|
@@ -39,6 +39,15 @@ enum class WeekModeFlag : UInt8
 };
 using YearWeek = std::pair<UInt16, UInt8>;

+/// Modes for toDayOfWeek() function.
+enum class WeekDayMode
+{
+    WeekStartsMonday1 = 0,
+    WeekStartsMonday0 = 1,
+    WeekStartsSunday0 = 2,
+    WeekStartsSunday1 = 3
+};
+
 /** Lookup table to conversion of time to date, and to month / year / day of week / day of month and so on.
  * First time was implemented for OLAPServer, that needed to do billions of such transformations.
  */

@@ -619,9 +628,28 @@ public:
 template <typename DateOrTime>
 inline Int16 toYear(DateOrTime v) const { return lut[toLUTIndex(v)].year; }

+/// 1-based, starts on Monday
 template <typename DateOrTime>
 inline UInt8 toDayOfWeek(DateOrTime v) const { return lut[toLUTIndex(v)].day_of_week; }

+template <typename DateOrTime>
+inline UInt8 toDayOfWeek(DateOrTime v, UInt8 week_day_mode) const
+{
+    WeekDayMode mode = check_week_day_mode(week_day_mode);
+
+    UInt8 res = toDayOfWeek(v);
+    using enum WeekDayMode;
+    bool start_from_sunday = (mode == WeekStartsSunday0 || mode == WeekStartsSunday1);
+    bool zero_based = (mode == WeekStartsMonday0 || mode == WeekStartsSunday0);
+
+    if (start_from_sunday)
+        res = res % 7 + 1;
+    if (zero_based)
+        --res;
+
+    return res;
+}
+
 template <typename DateOrTime>
 inline UInt8 toDayOfMonth(DateOrTime v) const { return lut[toLUTIndex(v)].day_of_month; }

@@ -844,6 +872,12 @@ public:
 return week_format;
 }

+/// Check and change mode to effective.
+inline WeekDayMode check_week_day_mode(UInt8 mode) const /// NOLINT
+{
+    return static_cast<WeekDayMode>(mode & 3);
+}
+
 /** Calculate weekday from d.
  * Returns 0 for monday, 1 for tuesday...
  */
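For readers unfamiliar with MySQL-style week modes, a minimal standalone sketch of what the new toDayOfWeek overload computes; dayOfWeekWithMode is a hypothetical replica written for illustration, and `base` stands for the 1-based Monday-first value the existing toDayOfWeek returns.

#include <cstdint>
#include <cstdio>

// Hypothetical standalone replica of the new DateLUTImpl logic: `base` is the
// 1-based Monday-first weekday (1 = Monday ... 7 = Sunday), `mode` is masked to 0..3.
uint8_t dayOfWeekWithMode(uint8_t base, uint8_t mode)
{
    mode &= 3;                                         // check_week_day_mode()
    uint8_t res = base;
    bool start_from_sunday = (mode == 2 || mode == 3); // WeekStartsSunday{0,1}
    bool zero_based = (mode == 1 || mode == 2);        // WeekStarts{Monday,Sunday}0
    if (start_from_sunday)
        res = res % 7 + 1;  // Sunday (7) wraps to 1, Monday (1) becomes 2, ...
    if (zero_based)
        --res;
    return res;
}

int main()
{
    // For a Sunday (base = 7): mode 0 -> 7, mode 1 -> 6, mode 2 -> 0, mode 3 -> 1.
    for (uint8_t mode = 0; mode < 4; ++mode)
        printf("mode %u: %u\n", mode, dayOfWeekWithMode(7, mode));
}

Note that the `mode & 3` masking in check_week_day_mode means any UInt8 argument degrades gracefully to one of the four defined modes instead of throwing.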
@@ -252,7 +252,7 @@ uint64_t readOffset(std::string_view & sp, bool is64_bit)
 // Read "len" bytes
 std::string_view readBytes(std::string_view & sp, uint64_t len)
 {
-SAFE_CHECK(len <= sp.size(), "invalid string length: " + std::to_string(len) + " vs. " + std::to_string(sp.size()));
+SAFE_CHECK(len <= sp.size(), "invalid string length: {} vs. {}", len, sp.size());
 std::string_view ret(sp.data(), len);
 sp.remove_prefix(len);
 return ret;

@@ -953,7 +953,7 @@ bool Dwarf::findDebugInfoOffset(uintptr_t address, std::string_view aranges, uin
 Dwarf::Die Dwarf::getDieAtOffset(const CompilationUnit & cu, uint64_t offset) const
 {
-SAFE_CHECK(offset < info_.size(), fmt::format("unexpected offset {}, info size {}", offset, info_.size()));
+SAFE_CHECK(offset < info_.size(), "unexpected offset {}, info size {}", offset, info_.size());
 Die die;
 std::string_view sp{info_.data() + offset, cu.offset + cu.size - offset};
 die.offset = offset;
@@ -188,7 +188,7 @@ static void tryLogCurrentExceptionImpl(Poco::Logger * logger, const std::string
 {
 PreformattedMessage message = getCurrentExceptionMessageAndPattern(true);
 if (!start_of_message.empty())
-message.message = fmt::format("{}: {}", start_of_message, message.message);
+message.text = fmt::format("{}: {}", start_of_message, message.text);

 LOG_ERROR(logger, message);
 }

@@ -339,7 +339,7 @@ std::string getExtraExceptionInfo(const std::exception & e)
 std::string getCurrentExceptionMessage(bool with_stacktrace, bool check_embedded_stacktrace /*= false*/, bool with_extra_info /*= true*/)
 {
-return getCurrentExceptionMessageAndPattern(with_stacktrace, check_embedded_stacktrace, with_extra_info).message;
+return getCurrentExceptionMessageAndPattern(with_stacktrace, check_embedded_stacktrace, with_extra_info).text;
 }

 PreformattedMessage getCurrentExceptionMessageAndPattern(bool with_stacktrace, bool check_embedded_stacktrace /*= false*/, bool with_extra_info /*= true*/)

@@ -481,7 +481,7 @@ void tryLogException(std::exception_ptr e, Poco::Logger * logger, const std::str
 std::string getExceptionMessage(const Exception & e, bool with_stacktrace, bool check_embedded_stacktrace)
 {
-return getExceptionMessageAndPattern(e, with_stacktrace, check_embedded_stacktrace).message;
+return getExceptionMessageAndPattern(e, with_stacktrace, check_embedded_stacktrace).text;
 }

 PreformattedMessage getExceptionMessageAndPattern(const Exception & e, bool with_stacktrace, bool check_embedded_stacktrace)

@@ -577,10 +577,6 @@ ParsingException::ParsingException(const std::string & msg, int code)
 : Exception(msg, code)
 {
 }
-ParsingException::ParsingException(int code, const std::string & message)
-: Exception(message, code)
-{
-}

 /// We use additional field formatted_message_ to make this method const.
 std::string ParsingException::displayText() const
@@ -16,26 +16,6 @@
 namespace Poco { class Logger; }

-/// Extract format string from a string literal and constructs consteval fmt::format_string
-template <typename... Args>
-struct FormatStringHelperImpl
-{
-    std::string_view message_format_string;
-    fmt::format_string<Args...> fmt_str;
-    template<typename T>
-    consteval FormatStringHelperImpl(T && str) : message_format_string(tryGetStaticFormatString(str)), fmt_str(std::forward<T>(str)) {}
-    template<typename T>
-    FormatStringHelperImpl(fmt::basic_runtime<T> && str) : message_format_string(), fmt_str(std::forward<fmt::basic_runtime<T>>(str)) {}
-
-    PreformattedMessage format(Args && ...args) const
-    {
-        return PreformattedMessage{fmt::format(fmt_str, std::forward<Args...>(args)...), message_format_string};
-    }
-};
-
-template <typename... Args>
-using FormatStringHelper = FormatStringHelperImpl<std::type_identity_t<Args>...>;
-
 namespace DB
 {

@@ -48,6 +28,17 @@ public:
 Exception() = default;

+Exception(const PreformattedMessage & msg, int code): Exception(msg.text, code)
+{
+    message_format_string = msg.format_string;
+}
+
+Exception(PreformattedMessage && msg, int code): Exception(std::move(msg.text), code)
+{
+    message_format_string = msg.format_string;
+}
+
 protected:
 // used to remove the sensitive information from exceptions if query_masking_rules is configured
 struct MessageMasked
 {

@@ -62,11 +53,16 @@ public:
 // delegating constructor to mask sensitive information from the message
 Exception(const std::string & msg, int code, bool remote_ = false): Exception(MessageMasked(msg), code, remote_) {}
 Exception(std::string && msg, int code, bool remote_ = false): Exception(MessageMasked(std::move(msg)), code, remote_) {}
-Exception(PreformattedMessage && msg, int code): Exception(std::move(msg.message), code)
-{
-    message_format_string = msg.format_string;
-}

+public:
+/// This creator is for exceptions that should format a message using fmt::format from the variadic ctor Exception(code, fmt, ...),
+/// but were not rewritten yet. It will be removed.
+static Exception createDeprecated(const std::string & msg, int code, bool remote_ = false)
+{
+    return Exception(msg, code, remote_);
+}
+
 /// Message must be a compile-time constant
 template<typename T, typename = std::enable_if_t<std::is_convertible_v<T, String>>>
 Exception(int code, T && message)
 : Exception(message, code)

@@ -74,9 +70,11 @@ public:
 message_format_string = tryGetStaticFormatString(message);
 }

-template<> Exception(int code, const String & message) : Exception(message, code) {}
-template<> Exception(int code, String & message) : Exception(message, code) {}
-template<> Exception(int code, String && message) : Exception(std::move(message), code) {}
+/// These creators are for messages that were received by network or generated by a third-party library in runtime.
+/// Please use a constructor for all other cases.
+static Exception createRuntime(int code, const String & message) { return Exception(message, code); }
+static Exception createRuntime(int code, String & message) { return Exception(message, code); }
+static Exception createRuntime(int code, String && message) { return Exception(std::move(message), code); }

 // Format message with fmt::format, like the logging functions.
 template <typename... Args>

@@ -167,11 +165,9 @@ private:
 /// more convenient calculation of problem line number.
 class ParsingException : public Exception
 {
+    ParsingException(const std::string & msg, int code);
 public:
 ParsingException();
-ParsingException(const std::string & msg, int code);
-ParsingException(int code, const std::string & message);
-ParsingException(int code, std::string && message) : Exception(message, code) {}

 // Format message with fmt::format, like the logging functions.
 template <typename... Args>
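Taken together, the new Exception API splits call sites into three cases. A sketch of the intended usage follows; it is a fragment assuming ClickHouse's Common/Exception.h, and the table name and value below are illustrative only:

// Compile-time format string: placeholder count is checked at compile time
// and the pattern is preserved alongside the formatted message.
throw DB::Exception(DB::ErrorCodes::BAD_ARGUMENTS, "Table {} does not exist", table_name);

// Message produced at runtime (network, third-party library): nothing to check.
throw DB::Exception::createRuntime(DB::ErrorCodes::OPENSSL_ERROR, lastErrorString());

// Legacy call site with a pre-concatenated message, kept compiling during the
// migration and marked for a later rewrite to the variadic constructor.
throw DB::Exception::createDeprecated("Unexpected value: " + toString(value), code);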
src/Common/LoggingFormatStringHelpers.cpp (new file, +7)
@@ -0,0 +1,7 @@
+#include <Common/LoggingFormatStringHelpers.h>
+
+[[noreturn]] void functionThatFailsCompilationOfConstevalFunctions(const char * error)
+{
+    throw std::runtime_error(error);
+}
+
@@ -2,24 +2,70 @@
 #include <base/defines.h>
 #include <fmt/format.h>

+struct PreformattedMessage;
+consteval void formatStringCheckArgsNumImpl(std::string_view str, size_t nargs);
+template <typename T> constexpr std::string_view tryGetStaticFormatString(T && x);
+
+/// Extract format string from a string literal and constructs consteval fmt::format_string
+template <typename... Args>
+struct FormatStringHelperImpl
+{
+    std::string_view message_format_string;
+    fmt::format_string<Args...> fmt_str;
+    template<typename T>
+    consteval FormatStringHelperImpl(T && str) : message_format_string(tryGetStaticFormatString(str)), fmt_str(std::forward<T>(str))
+    {
+        formatStringCheckArgsNumImpl(message_format_string, sizeof...(Args));
+    }
+    template<typename T>
+    FormatStringHelperImpl(fmt::basic_runtime<T> && str) : message_format_string(), fmt_str(std::forward<fmt::basic_runtime<T>>(str)) {}
+
+    PreformattedMessage format(Args && ...args) const;
+};
+
+template <typename... Args>
+using FormatStringHelper = FormatStringHelperImpl<std::type_identity_t<Args>...>;
+
 /// Saves a format string for already formatted message
 struct PreformattedMessage
 {
-    String message;
+    std::string text;
     std::string_view format_string;

-    operator const String & () const { return message; }
-    operator String () && { return std::move(message); }
+    template <typename... Args>
+    static PreformattedMessage create(FormatStringHelper<Args...> fmt, Args &&... args);
+
+    operator const std::string & () const { return text; }
+    operator std::string () && { return std::move(text); }
+    operator fmt::format_string<> () const { UNREACHABLE(); }
 };

+template <typename... Args>
+PreformattedMessage FormatStringHelperImpl<Args...>::format(Args && ...args) const
+{
+    return PreformattedMessage{fmt::format(fmt_str, std::forward<Args>(args)...), message_format_string};
+}
+
+template <typename... Args>
+PreformattedMessage PreformattedMessage::create(FormatStringHelper<Args...> fmt, Args && ...args)
+{
+    return fmt.format(std::forward<Args>(args)...);
+}
+
 template<typename T> struct is_fmt_runtime : std::false_type {};
 template<typename T> struct is_fmt_runtime<fmt::basic_runtime<T>> : std::true_type {};

 template <typename T> constexpr std::string_view tryGetStaticFormatString(T && x)
 {
-    /// Failure of this asserting indicates that something went wrong during type deduction.
-    /// For example, a string literal was implicitly converted to std::string. It should not happen.
+    /// Format string for an exception or log message must be a string literal (compile-time constant).
+    /// Failure of this assertion may indicate one of the following issues:
+    /// - A message was already formatted into std::string before passing to Exception(...) or LOG_XXXXX(...).
+    ///   Please use variadic constructor of Exception.
+    ///   Consider using PreformattedMessage or LogToStr if you want to avoid double formatting and/or copy-paste.
+    /// - A string literal was converted to std::string (or const char *).
+    /// - Use Exception::createRuntime or fmt::runtime if there's no format string
+    ///   and a message is generated in runtime by a third-party library
+    ///   or deserialized from somewhere.
     static_assert(!std::is_same_v<std::string, std::decay_t<T>>);

 if constexpr (is_fmt_runtime<std::decay_t<T>>::value)

@@ -53,3 +99,60 @@ template <typename T, typename... Ts> constexpr auto firstArg(T && x, Ts &&...)
 /// For implicit conversion of fmt::basic_runtime<> to char* for std::string ctor
 template <typename T, typename... Ts> constexpr auto firstArg(fmt::basic_runtime<T> && data, Ts &&...) { return data.str.data(); }

+consteval ssize_t formatStringCountArgsNum(const char * const str, size_t len)
+{
+    /// It does not count named args, but we don't use them
+    size_t cnt = 0;
+    size_t i = 0;
+    while (i + 1 < len)
+    {
+        if (str[i] == '{' && str[i + 1] == '}')
+        {
+            i += 2;
+            cnt += 1;
+        }
+        else if (str[i] == '{')
+        {
+            /// Ignore checks for complex formatting like "{:.3f}"
+            return -1;
+        }
+        else
+        {
+            i += 1;
+        }
+    }
+    return cnt;
+}
+
+[[noreturn]] void functionThatFailsCompilationOfConstevalFunctions(const char * error);
+
+/// fmt::format checks that there are enough arguments, but ignores extra arguments (e.g. fmt::format("{}", 1, 2) compiles)
+/// This function will fail to compile if the number of "{}" substitutions does not exactly match
+consteval void formatStringCheckArgsNumImpl(std::string_view str, size_t nargs)
+{
+    if (str.empty())
+        return;
+    ssize_t cnt = formatStringCountArgsNum(str.data(), str.size());
+    if (0 <= cnt && cnt != nargs)
+        functionThatFailsCompilationOfConstevalFunctions("unexpected number of arguments in a format string");
+}
+
+template <typename... Args>
+struct CheckArgsNumHelperImpl
+{
+    template<typename T>
+    consteval CheckArgsNumHelperImpl(T && str)
+    {
+        formatStringCheckArgsNumImpl(tryGetStaticFormatString(str), sizeof...(Args));
+    }
+
+    /// No checks for fmt::runtime and PreformattedMessage
+    template<typename T> CheckArgsNumHelperImpl(fmt::basic_runtime<T> &&) {}
+    template<> CheckArgsNumHelperImpl(PreformattedMessage &) {}
+    template<> CheckArgsNumHelperImpl(const PreformattedMessage &) {}
+    template<> CheckArgsNumHelperImpl(PreformattedMessage &&) {}
+};
+
+template <typename... Args> using CheckArgsNumHelper = CheckArgsNumHelperImpl<std::type_identity_t<Args>...>;
+template <typename... Args> void formatStringCheckArgsNum(CheckArgsNumHelper<Args...>, Args &&...) {}
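A self-contained sketch of the counting logic above (a simplified replica written for illustration, not the header itself), showing how consteval evaluation turns a placeholder/argument mismatch into a compile-time failure:

#include <string_view>

// Standalone replica of the "{}" counter for illustration.
consteval long countPlaceholders(std::string_view str)
{
    long cnt = 0;
    for (size_t i = 0; i + 1 < str.size();)
    {
        if (str[i] == '{' && str[i + 1] == '}') { i += 2; ++cnt; }
        else if (str[i] == '{') return -1;  // complex spec like "{:.3f}": skip the check
        else ++i;
    }
    return cnt;
}

static_assert(countPlaceholders("Cannot open file {}") == 1);
static_assert(countPlaceholders("Elapsed: {:.3f} sec") == -1); // check disabled
int main() {}

In the real header the failure path calls functionThatFailsCompilationOfConstevalFunctions, whose definition lives out-of-line in the new .cpp file and is not constexpr, so any call reached during constant evaluation aborts compilation rather than throwing at runtime.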
@@ -187,7 +187,7 @@ public:
 {
 throw Exception(
 ErrorCodes::BAD_ARGUMENTS,
-"Value with key `{}` is used twice in the SET query",
+"Value with key `{}` is used twice in the SET query (collection name: {})",
 name, query.collection_name);
 }
 }

@@ -211,9 +211,8 @@ PoolWithFailoverBase<TNestedPool>::get(size_t max_ignored_errors, bool fallback_
 max_ignored_errors, fallback_to_stale_replicas,
 try_get_entry, get_priority);
 if (results.empty() || results[0].entry.isNull())
-throw DB::Exception(
-"PoolWithFailoverBase::getMany() returned less than min_entries entries.",
-DB::ErrorCodes::LOGICAL_ERROR);
+throw DB::Exception(DB::ErrorCodes::LOGICAL_ERROR,
+"PoolWithFailoverBase::getMany() returned less than min_entries entries.");
 return results[0].entry;
 }

@@ -320,10 +319,8 @@ PoolWithFailoverBase<TNestedPool>::getMany(
 try_results.resize(up_to_date_count);
 }
 else
-throw DB::Exception(
-"Could not find enough connections to up-to-date replicas. Got: " + std::to_string(up_to_date_count)
-+ ", needed: " + std::to_string(min_entries),
-DB::ErrorCodes::ALL_REPLICAS_ARE_STALE);
+throw DB::Exception(DB::ErrorCodes::ALL_REPLICAS_ARE_STALE,
+"Could not find enough connections to up-to-date replicas. Got: {}, needed: {}", up_to_date_count, max_entries);

 return try_results;
 }

@@ -62,10 +62,10 @@ public:
 , replacement(replacement_string)
 {
 if (!regexp.ok())
-throw DB::Exception(
-"SensitiveDataMasker: cannot compile re2: " + regexp_string_ + ", error: " + regexp.error()
-+ ". Look at https://github.com/google/re2/wiki/Syntax for reference.",
-DB::ErrorCodes::CANNOT_COMPILE_REGEXP);
+throw DB::Exception(DB::ErrorCodes::CANNOT_COMPILE_REGEXP,
+"SensitiveDataMasker: cannot compile re2: {}, error: {}. "
+"Look at https://github.com/google/re2/wiki/Syntax for reference.",
+regexp_string_, regexp.error());
 }

 uint64_t apply(std::string & data) const

@@ -100,7 +100,7 @@ size_t TLDListsHolder::parseAndAddTldList(const std::string & name, const std::s
 tld_list_tmp.emplace(line, TLDType::TLD_REGULAR);
 }
 if (!in.eof())
-throw Exception(ErrorCodes::LOGICAL_ERROR, "Not all list had been read", name);
+throw Exception(ErrorCodes::LOGICAL_ERROR, "Not all list had been read: {}", name);

 TLDList tld_list(tld_list_tmp.size());
 for (const auto & [host, type] : tld_list_tmp)

@@ -58,7 +58,7 @@ UInt64 Throttler::add(size_t amount)
 }

 if (limit && count_value > limit)
-throw Exception(limit_exceeded_exception_message + std::string(" Maximum: ") + toString(limit), ErrorCodes::LIMIT_EXCEEDED);
+throw Exception::createDeprecated(limit_exceeded_exception_message + std::string(" Maximum: ") + toString(limit), ErrorCodes::LIMIT_EXCEEDED);

 /// Wait unless there is positive amount of tokens - throttling
 Int64 sleep_time = 0;

@@ -41,7 +41,7 @@ To assert_cast(From && from)
 }
 catch (const std::exception & e)
 {
-throw DB::Exception(e.what(), DB::ErrorCodes::LOGICAL_ERROR);
+throw DB::Exception::createDeprecated(e.what(), DB::ErrorCodes::LOGICAL_ERROR);
 }

 throw DB::Exception(DB::ErrorCodes::LOGICAL_ERROR, "Bad cast from type {} to {}",
@@ -57,6 +57,7 @@ namespace
 if (_is_clients_log || _logger->is((PRIORITY))) \
 { \
 std::string formatted_message = numArgs(__VA_ARGS__) > 1 ? fmt::format(__VA_ARGS__) : firstArg(__VA_ARGS__); \
+formatStringCheckArgsNum(__VA_ARGS__); \
 if (auto _channel = _logger->getChannel()) \
 { \
 std::string file_function; \

@@ -171,9 +171,8 @@ TEST(Common, RWLockDeadlock)
 auto holder2 = lock2->getLock(RWLockImpl::Read, "q1", std::chrono::milliseconds(100));
 if (!holder2)
 {
-throw Exception(
-"Locking attempt timed out! Possible deadlock avoided. Client should retry.",
-ErrorCodes::DEADLOCK_AVOIDED);
+throw Exception(ErrorCodes::DEADLOCK_AVOIDED,
+"Locking attempt timed out! Possible deadlock avoided. Client should retry.");
 }
 }
 catch (const Exception & e)

@@ -202,9 +201,8 @@ TEST(Common, RWLockDeadlock)
 auto holder1 = lock1->getLock(RWLockImpl::Read, "q3", std::chrono::milliseconds(100));
 if (!holder1)
 {
-throw Exception(
-"Locking attempt timed out! Possible deadlock avoided. Client should retry.",
-ErrorCodes::DEADLOCK_AVOIDED);
+throw Exception(ErrorCodes::DEADLOCK_AVOIDED,
+"Locking attempt timed out! Possible deadlock avoided. Client should retry.");
 }
 }
 catch (const Exception & e)
@@ -37,7 +37,7 @@ To typeid_cast(From & from)
 }
 catch (const std::exception & e)
 {
-throw DB::Exception(e.what(), DB::ErrorCodes::LOGICAL_ERROR);
+throw DB::Exception::createDeprecated(e.what(), DB::ErrorCodes::LOGICAL_ERROR);
 }

 throw DB::Exception(DB::ErrorCodes::LOGICAL_ERROR, "Bad cast from type {} to {}",

@@ -58,7 +58,7 @@ To typeid_cast(From * from)
 }
 catch (const std::exception & e)
 {
-throw DB::Exception(e.what(), DB::ErrorCodes::LOGICAL_ERROR);
+throw DB::Exception::createDeprecated(e.what(), DB::ErrorCodes::LOGICAL_ERROR);
 }
 }

@@ -93,6 +93,6 @@ To typeid_cast(const std::shared_ptr<From> & from)
 }
 catch (const std::exception & e)
 {
-throw DB::Exception(e.what(), DB::ErrorCodes::LOGICAL_ERROR);
+throw DB::Exception::createDeprecated(e.what(), DB::ErrorCodes::LOGICAL_ERROR);
 }
 }
@@ -86,7 +86,7 @@ static void validateChecksum(char * data, size_t size, const Checksum expected_c
 {
 message << ". The mismatch is caused by single bit flip in data block at byte " << (bit_pos / 8) << ", bit " << (bit_pos % 8) << ". "
 << message_hardware_failure;
-throw Exception(message.str(), ErrorCodes::CHECKSUM_DOESNT_MATCH);
+throw Exception::createDeprecated(message.str(), ErrorCodes::CHECKSUM_DOESNT_MATCH);
 }

 flip_bit(tmp_data, bit_pos); /// Restore

@@ -101,10 +101,10 @@ static void validateChecksum(char * data, size_t size, const Checksum expected_c
 {
 message << ". The mismatch is caused by single bit flip in checksum. "
 << message_hardware_failure;
-throw Exception(message.str(), ErrorCodes::CHECKSUM_DOESNT_MATCH);
+throw Exception::createDeprecated(message.str(), ErrorCodes::CHECKSUM_DOESNT_MATCH);
 }

-throw Exception(message.str(), ErrorCodes::CHECKSUM_DOESNT_MATCH);
+throw Exception::createDeprecated(message.str(), ErrorCodes::CHECKSUM_DOESNT_MATCH);
 }

 static void readHeaderAndGetCodecAndSize(
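The bit-flip probing that produces these messages is only partially visible in the hunks; a toy standalone sketch of the technique, with a simple FNV-1a hash standing in for the real CityHash-based checksum:

#include <cstdint>
#include <cstddef>
#include <cstdio>

// Toy stand-in for the real checksum.
static uint32_t toyChecksum(const char * data, size_t size)
{
    uint32_t h = 2166136261u;                 // FNV-1a
    for (size_t i = 0; i < size; ++i)
        h = (h ^ static_cast<unsigned char>(data[i])) * 16777619u;
    return h;
}

// If flipping exactly one bit makes the checksum match, the corruption is
// almost certainly a hardware bit flip (bad RAM/disk), not a logic error.
static bool findSingleBitFlip(char * data, size_t size, uint32_t expected, size_t & bit_out)
{
    for (size_t bit = 0; bit < size * 8; ++bit)
    {
        data[bit / 8] ^= (1 << (bit % 8));    // flip
        bool match = toyChecksum(data, size) == expected;
        data[bit / 8] ^= (1 << (bit % 8));    // restore
        if (match) { bit_out = bit; return true; }
    }
    return false;
}

int main()
{
    char buf[16] = "hello checksum";
    uint32_t expected = toyChecksum(buf, sizeof(buf));
    buf[3] ^= 0x10;                            // simulate a bit flip at byte 3, bit 4
    size_t bit = 0;
    if (findSingleBitFlip(buf, sizeof(buf), expected, bit))
        printf("single bit flip at byte %zu, bit %zu\n", bit / 8, bit % 8);
}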
@@ -30,7 +30,7 @@ protected:
 bool isGenericCompression() const override { return false; }

 private:
-UInt8 delta_bytes_size;
+const UInt8 delta_bytes_size;
 };

@@ -68,8 +68,8 @@ void compressDataForType(const char * source, UInt32 source_size, char * dest)
 if (source_size % sizeof(T) != 0)
 throw Exception(ErrorCodes::CANNOT_COMPRESS, "Cannot delta compress, data size {} is not aligned to {}", source_size, sizeof(T));

-T prev_src{};
-const char * source_end = source + source_size;
+T prev_src = 0;
+const char * const source_end = source + source_size;
 while (source < source_end)
 {
 T curr_src = unalignedLoad<T>(source);

@@ -84,17 +84,17 @@ void compressDataForType(const char * source, UInt32 source_size, char * dest)
 template <typename T>
 void decompressDataForType(const char * source, UInt32 source_size, char * dest, UInt32 output_size)
 {
-const char * output_end = dest + output_size;
+const char * const output_end = dest + output_size;

 if (source_size % sizeof(T) != 0)
 throw Exception(ErrorCodes::CANNOT_DECOMPRESS, "Cannot delta decompress, data size {} is not aligned to {}", source_size, sizeof(T));

 T accumulator{};
-const char * source_end = source + source_size;
+const char * const source_end = source + source_size;
 while (source < source_end)
 {
 accumulator += unalignedLoad<T>(source);
-if (dest + sizeof(accumulator) > output_end)
+if (dest + sizeof(accumulator) > output_end) [[unlikely]]
 throw Exception(ErrorCodes::CANNOT_DECOMPRESS, "Cannot decompress the data");
 unalignedStore<T>(dest, accumulator);

@@ -140,7 +140,7 @@ void CompressionCodecDelta::doDecompressData(const char * source, UInt32 source_
 UInt8 bytes_size = source[0];

-if (bytes_size == 0)
+if (!(bytes_size == 1 || bytes_size == 2 || bytes_size == 4 || bytes_size == 8))
 throw Exception(ErrorCodes::CANNOT_DECOMPRESS, "Cannot decompress. File has wrong header");

 UInt8 bytes_to_skip = uncompressed_size % bytes_size;

@@ -190,7 +190,7 @@ UInt8 getDeltaBytesSize(const IDataType * column_type)
 void registerCodecDelta(CompressionCodecFactory & factory)
 {
 UInt8 method_code = static_cast<UInt8>(CompressionMethodByte::Delta);
-factory.registerCompressionCodecWithType("Delta", method_code, [&](const ASTPtr & arguments, const IDataType * column_type) -> CompressionCodecPtr
+auto codec_builder = [&](const ASTPtr & arguments, const IDataType * column_type) -> CompressionCodecPtr
 {
 UInt8 delta_bytes_size = 0;

@@ -215,7 +215,8 @@ void registerCodecDelta(CompressionCodecFactory & factory)
 }

 return std::make_shared<CompressionCodecDelta>(delta_bytes_size);
-});
+};
+factory.registerCompressionCodecWithType("Delta", method_code, codec_builder);
 }

 CompressionCodecPtr getCompressionCodecDelta(UInt8 delta_bytes_size)
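As background for these hunks, a toy sketch of the Delta codec's core transform; the real code works on raw unaligned bytes with a configurable width, while this replica uses Int64 vectors for clarity:

#include <cstdint>
#include <vector>
#include <cstdio>

// Compression stores value[i] - value[i-1]; decompression is a running sum,
// mirroring the `accumulator` loop in decompressDataForType().
static std::vector<int64_t> deltaCompress(const std::vector<int64_t> & src)
{
    std::vector<int64_t> out;
    int64_t prev = 0;
    for (int64_t v : src) { out.push_back(v - prev); prev = v; }
    return out;
}

static std::vector<int64_t> deltaDecompress(const std::vector<int64_t> & deltas)
{
    std::vector<int64_t> out;
    int64_t accumulator = 0;
    for (int64_t d : deltas) { accumulator += d; out.push_back(accumulator); }
    return out;
}

int main()
{
    // Monotonic data (timestamps, counters) turns into small, highly
    // compressible deltas: 1000,1001,1003,1006 -> 1000,1,2,3.
    std::vector<int64_t> src{1000, 1001, 1003, 1006};
    auto restored = deltaDecompress(deltaCompress(src));
    printf("roundtrip ok: %d\n", restored == src);
}

The deltas themselves are typically chained with a generic codec such as LZ4 or ZSTD, which is where most of the actual size reduction happens.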
@@ -141,7 +141,7 @@ size_t encrypt(std::string_view plaintext, char * ciphertext_and_tag, Encryption
 reinterpret_cast<const uint8_t*>(key.data()), key.size(),
 tag_size, nullptr);
 if (!ok_init)
-throw Exception(lastErrorString(), ErrorCodes::OPENSSL_ERROR);
+throw Exception::createDeprecated(lastErrorString(), ErrorCodes::OPENSSL_ERROR);

 /// encrypt data using context and given nonce.
 size_t out_len;

@@ -152,7 +152,7 @@ size_t encrypt(std::string_view plaintext, char * ciphertext_and_tag, Encryption
 reinterpret_cast<const uint8_t *>(plaintext.data()), plaintext.size(),
 nullptr, 0);
 if (!ok_open)
-throw Exception(lastErrorString(), ErrorCodes::OPENSSL_ERROR);
+throw Exception::createDeprecated(lastErrorString(), ErrorCodes::OPENSSL_ERROR);

 return out_len;
 }

@@ -171,7 +171,7 @@ size_t decrypt(std::string_view ciphertext, char * plaintext, EncryptionMethod m
 reinterpret_cast<const uint8_t*>(key.data()), key.size(),
 tag_size, nullptr);
 if (!ok_init)
-throw Exception(lastErrorString(), ErrorCodes::OPENSSL_ERROR);
+throw Exception::createDeprecated(lastErrorString(), ErrorCodes::OPENSSL_ERROR);

 /// decrypt data using given nonce
 size_t out_len;

@@ -182,7 +182,7 @@ size_t decrypt(std::string_view ciphertext, char * plaintext, EncryptionMethod m
 reinterpret_cast<const uint8_t *>(ciphertext.data()), ciphertext.size(),
 nullptr, 0);
 if (!ok_open)
-throw Exception(lastErrorString(), ErrorCodes::OPENSSL_ERROR);
+throw Exception::createDeprecated(lastErrorString(), ErrorCodes::OPENSSL_ERROR);

 return out_len;
 }
@@ -11,19 +11,18 @@
 #include <IO/ReadBufferFromMemory.h>
 #include <IO/BitHelpers.h>

-#include <bitset>
 #include <cstring>
 #include <algorithm>
 #include <type_traits>

+#include <bitset>
+
 namespace DB
 {

 /** Gorilla column codec implementation.
  *
- * Based on Gorilla paper: http://www.vldb.org/pvldb/vol8/p1816-teller.pdf
+ * Based on Gorilla paper: https://dl.acm.org/doi/10.14778/2824032.2824078
  *
  * This codec is best used against monotonic floating sequences, like CPU usage percentage
  * or any other gauge.

@@ -125,7 +124,7 @@ protected:
 bool isGenericCompression() const override { return false; }

 private:
-UInt8 data_bytes_size;
+const UInt8 data_bytes_size;
 };

@@ -139,7 +138,7 @@ namespace ErrorCodes
 namespace
 {

-constexpr inline UInt8 getBitLengthOfLength(UInt8 data_bytes_size)
+constexpr UInt8 getBitLengthOfLength(UInt8 data_bytes_size)
 {
 // 1-byte value is 8 bits, and we need 4 bits to represent 8 : 1000,
 // 2-byte 16 bits => 5

@@ -147,21 +146,20 @@ constexpr inline UInt8 getBitLengthOfLength(UInt8 data_bytes_size)
 // 8-byte 64 bits => 7
 const UInt8 bit_lengths[] = {0, 4, 5, 0, 6, 0, 0, 0, 7};
 assert(data_bytes_size >= 1 && data_bytes_size < sizeof(bit_lengths) && bit_lengths[data_bytes_size] != 0);

 return bit_lengths[data_bytes_size];
 }

 UInt32 getCompressedHeaderSize(UInt8 data_bytes_size)
 {
-const UInt8 items_count_size = 4;
-
+constexpr UInt8 items_count_size = 4;
 return items_count_size + data_bytes_size;
 }

 UInt32 getCompressedDataSize(UInt8 data_bytes_size, UInt32 uncompressed_size)
 {
 const UInt32 items_count = uncompressed_size / data_bytes_size;

 static const auto DATA_BIT_LENGTH = getBitLengthOfLength(data_bytes_size);
 // -1 since there must be at least 1 non-zero bit.
 static const auto LEADING_ZEROES_BIT_LENGTH = DATA_BIT_LENGTH - 1;

@@ -182,7 +180,7 @@ struct BinaryValueInfo
 };

 template <typename T>
-BinaryValueInfo getLeadingAndTrailingBits(const T & value)
+BinaryValueInfo getBinaryValueInfo(const T & value)
 {
 constexpr UInt8 bit_size = sizeof(T) * 8;

@@ -190,28 +188,25 @@ BinaryValueInfo getLeadingAndTrailingBits(const T & value)
 const UInt8 tz = getTrailingZeroBits(value);
 const UInt8 data_size = value == 0 ? 0 : static_cast<UInt8>(bit_size - lz - tz);

-return BinaryValueInfo{lz, data_size, tz};
+return {lz, data_size, tz};
 }

 template <typename T>
 UInt32 compressDataForType(const char * source, UInt32 source_size, char * dest, UInt32 dest_size)
 {
-static const auto DATA_BIT_LENGTH = getBitLengthOfLength(sizeof(T));
-// -1 since there must be at least 1 non-zero bit.
-static const auto LEADING_ZEROES_BIT_LENGTH = DATA_BIT_LENGTH - 1;
-
 if (source_size % sizeof(T) != 0)
 throw Exception(ErrorCodes::CANNOT_COMPRESS, "Cannot compress, data size {} is not aligned to {}", source_size, sizeof(T));
-const char * source_end = source + source_size;
-const char * dest_start = dest;
-const char * dest_end = dest + dest_size;
+
+const char * const source_end = source + source_size;
+const char * const dest_start = dest;
+const char * const dest_end = dest + dest_size;

 const UInt32 items_count = source_size / sizeof(T);

 unalignedStoreLE<UInt32>(dest, items_count);
 dest += sizeof(items_count);

-T prev_value{};
+T prev_value = 0;
 // That would cause first XORed value to be written in-full.
 BinaryValueInfo prev_xored_info{0, 0, 0};

@@ -226,13 +221,17 @@ UInt32 compressDataForType(const char * source, UInt32 source_size, char * dest,
 BitWriter writer(dest, dest_end - dest);

+static const auto DATA_BIT_LENGTH = getBitLengthOfLength(sizeof(T));
+// -1 since there must be at least 1 non-zero bit.
+static const auto LEADING_ZEROES_BIT_LENGTH = DATA_BIT_LENGTH - 1;
+
 while (source < source_end)
 {
 const T curr_value = unalignedLoadLE<T>(source);
 source += sizeof(curr_value);

 const auto xored_data = curr_value ^ prev_value;
-const BinaryValueInfo curr_xored_info = getLeadingAndTrailingBits(xored_data);
+const BinaryValueInfo curr_xored_info = getBinaryValueInfo(xored_data);

 if (xored_data == 0)
 {

@@ -265,11 +264,7 @@ UInt32 compressDataForType(const char * source, UInt32 source_size, char * dest,
 template <typename T>
 void decompressDataForType(const char * source, UInt32 source_size, char * dest)
 {
-static const auto DATA_BIT_LENGTH = getBitLengthOfLength(sizeof(T));
-// -1 since there must be at least 1 non-zero bit.
-static const auto LEADING_ZEROES_BIT_LENGTH = DATA_BIT_LENGTH - 1;
-
-const char * source_end = source + source_size;
+const char * const source_end = source + source_size;

 if (source + sizeof(UInt32) > source_end)
 return;

@@ -277,7 +272,7 @@ void decompressDataForType(const char * source, UInt32 source_size, char * dest)
 const UInt32 items_count = unalignedLoadLE<UInt32>(source);
 source += sizeof(items_count);

-T prev_value{};
+T prev_value = 0;

 // decoding first item
 if (source + sizeof(T) > source_end || items_count < 1)

@@ -293,13 +288,17 @@ void decompressDataForType(const char * source, UInt32 source_size, char * dest)
 BinaryValueInfo prev_xored_info{0, 0, 0};

+static const auto DATA_BIT_LENGTH = getBitLengthOfLength(sizeof(T));
+// -1 since there must be at least 1 non-zero bit.
+static const auto LEADING_ZEROES_BIT_LENGTH = DATA_BIT_LENGTH - 1;
+
 // since data is tightly packed, up to 1 bit per value, and last byte is padded with zeroes,
 // we have to keep track of items to avoid reading more that there is.
 for (UInt32 items_read = 1; items_read < items_count && !reader.eof(); ++items_read)
 {
 T curr_value = prev_value;
 BinaryValueInfo curr_xored_info = prev_xored_info;
-T xored_data{};
+T xored_data = 0;

 if (reader.readBit() == 1)
 {

@@ -314,7 +313,7 @@ void decompressDataForType(const char * source, UInt32 source_size, char * dest)
 if (curr_xored_info.leading_zero_bits == 0
 && curr_xored_info.data_bits == 0
-&& curr_xored_info.trailing_zero_bits == 0)
+&& curr_xored_info.trailing_zero_bits == 0) [[unlikely]]
 {
 throw Exception(ErrorCodes::CANNOT_DECOMPRESS, "Cannot decompress gorilla-encoded data: corrupted input data.");
 }

@@ -403,7 +402,7 @@ UInt32 CompressionCodecGorilla::doCompressData(const char * source, UInt32 sourc
 break;
 }

-return 1 + 1 + result_size;
+return 2 + bytes_to_skip + result_size;
 }

 void CompressionCodecGorilla::doDecompressData(const char * source, UInt32 source_size, char * dest, UInt32 uncompressed_size) const
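A standalone sketch of the XOR analysis these hunks rename to getBinaryValueInfo; it needs C++20 for std::bit_cast and std::countl_zero, and the sample values are illustrative:

#include <cstdint>
#include <bit>
#include <cstdio>

struct Info { uint8_t leading_zeros, data_bits, trailing_zeros; };

// Mirrors getBinaryValueInfo(): how many bits actually carry information
// after XOR-ing a value with its predecessor.
static Info analyze(uint64_t xored)
{
    if (xored == 0)
        return {0, 0, 0};  // identical to previous value: a single bit in the stream
    uint8_t lz = std::countl_zero(xored);
    uint8_t tz = std::countr_zero(xored);
    return {lz, static_cast<uint8_t>(64 - lz - tz), tz};
}

int main()
{
    // Slowly changing doubles XOR into values with long zero runs, so only
    // the short "meaningful" middle needs to be written out.
    double a = 15.5, b = 15.625;
    uint64_t xored = std::bit_cast<uint64_t>(a) ^ std::bit_cast<uint64_t>(b);
    Info info = analyze(xored);
    printf("leading %u, meaningful %u, trailing %u\n",
           info.leading_zeros, info.data_bits, info.trailing_zeros);
}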
@@ -11,13 +11,6 @@
 namespace DB
 {

-class ICompressionCodec;
-
-using CompressionCodecPtr = std::shared_ptr<ICompressionCodec>;
-using Codecs = std::vector<CompressionCodecPtr>;
-
 class IDataType;

 extern "C" int LLVMFuzzerTestOneInput(const uint8_t * data, size_t size);

 /**

@@ -120,7 +113,7 @@ protected:
 /// Return size of compressed data without header
 virtual UInt32 getMaxCompressedDataSize(UInt32 uncompressed_size) const { return uncompressed_size; }

-/// Actually compress data, without header
+/// Actually compress data without header
 virtual UInt32 doCompressData(const char * source, UInt32 source_size, char * dest) const = 0;

 /// Actually decompress data without header

@@ -134,4 +127,7 @@ private:
 CodecMode decompressMode{CodecMode::Synchronous};
 };

+using CompressionCodecPtr = std::shared_ptr<ICompressionCodec>;
+using Codecs = std::vector<CompressionCodecPtr>;
+
 }
@@ -29,11 +29,13 @@ namespace ErrorCodes
 extern const int AMBIGUOUS_COLUMN_NAME;
 }

-template <typename ReturnType>
-static ReturnType onError(const std::string & message [[maybe_unused]], int code [[maybe_unused]])
+template <typename ReturnType, typename... FmtArgs>
+static ReturnType onError(int code [[maybe_unused]],
+    FormatStringHelper<FmtArgs...> fmt_string [[maybe_unused]],
+    FmtArgs && ...fmt_args [[maybe_unused]])
 {
 if constexpr (std::is_same_v<ReturnType, void>)
-throw Exception(message, code);
+throw Exception(code, std::move(fmt_string), std::forward<FmtArgs>(fmt_args)...);
 else
 return false;
 }

@@ -44,13 +46,13 @@ static ReturnType checkColumnStructure(const ColumnWithTypeAndName & actual, con
 std::string_view context_description, bool allow_materialize, int code)
 {
 if (actual.name != expected.name)
-return onError<ReturnType>("Block structure mismatch in " + std::string(context_description) + " stream: different names of columns:\n"
-+ actual.dumpStructure() + "\n" + expected.dumpStructure(), code);
+return onError<ReturnType>(code, "Block structure mismatch in {} stream: different names of columns:\n{}\n{}",
+context_description, actual.dumpStructure(), expected.dumpStructure());

 if ((actual.type && !expected.type) || (!actual.type && expected.type)
 || (actual.type && expected.type && !actual.type->equals(*expected.type)))
-return onError<ReturnType>("Block structure mismatch in " + std::string(context_description) + " stream: different types:\n"
-+ actual.dumpStructure() + "\n" + expected.dumpStructure(), code);
+return onError<ReturnType>(code, "Block structure mismatch in {} stream: different types:\n{}\n{}",
+context_description, actual.dumpStructure(), expected.dumpStructure());

 if (!actual.column || !expected.column)
 return ReturnType(true);

@@ -74,22 +76,18 @@ static ReturnType checkColumnStructure(const ColumnWithTypeAndName & actual, con
 if (actual_column_maybe_agg && expected_column_maybe_agg)
 {
 if (!actual_column_maybe_agg->getAggregateFunction()->haveSameStateRepresentation(*expected_column_maybe_agg->getAggregateFunction()))
-return onError<ReturnType>(
-fmt::format(
+return onError<ReturnType>(code,
 "Block structure mismatch in {} stream: different columns:\n{}\n{}",
 context_description,
 actual.dumpStructure(),
-expected.dumpStructure()),
-code);
+expected.dumpStructure());
 }
 else if (actual_column->getName() != expected.column->getName())
-return onError<ReturnType>(
-fmt::format(
+return onError<ReturnType>(code,
 "Block structure mismatch in {} stream: different columns:\n{}\n{}",
 context_description,
 actual.dumpStructure(),
-expected.dumpStructure()),
-code);
+expected.dumpStructure());

 if (isColumnConst(*actual.column) && isColumnConst(*expected.column)
 && !actual.column->empty() && !expected.column->empty()) /// don't check values in empty columns

@@ -98,14 +96,12 @@ static ReturnType checkColumnStructure(const ColumnWithTypeAndName & actual, con
 Field expected_value = assert_cast<const ColumnConst &>(*expected.column).getField();

 if (actual_value != expected_value)
-return onError<ReturnType>(
-fmt::format(
+return onError<ReturnType>(code,
 "Block structure mismatch in {} stream: different values of constants in column '{}': actual: {}, expected: {}",
 context_description,
 actual.name,
 applyVisitor(FieldVisitorToString(), actual_value),
-applyVisitor(FieldVisitorToString(), expected_value)),
-code);
+applyVisitor(FieldVisitorToString(), expected_value));
 }

 return ReturnType(true);

@@ -117,8 +113,8 @@ static ReturnType checkBlockStructure(const Block & lhs, const Block & rhs, std:
 {
 size_t columns = rhs.columns();
 if (lhs.columns() != columns)
-return onError<ReturnType>("Block structure mismatch in " + std::string(context_description) + " stream: different number of columns:\n"
-+ lhs.dumpStructure() + "\n" + rhs.dumpStructure(), ErrorCodes::LOGICAL_ERROR);
+return onError<ReturnType>(ErrorCodes::LOGICAL_ERROR, "Block structure mismatch in {} stream: different number of columns:\n{}\n{}",
+context_description, lhs.dumpStructure(), rhs.dumpStructure());

 for (size_t i = 0; i < columns; ++i)
 {
@@ -95,7 +95,7 @@ void MySQLClient::handshake()
 packet_endpoint->resetSequenceId();

 if (packet_response.getType() == PACKET_ERR)
-throw Exception(packet_response.err.error_message, ErrorCodes::UNKNOWN_PACKET_FROM_SERVER);
+throw Exception::createDeprecated(packet_response.err.error_message, ErrorCodes::UNKNOWN_PACKET_FROM_SERVER);
 else if (packet_response.getType() == PACKET_AUTH_SWITCH)
 throw Exception(ErrorCodes::UNKNOWN_PACKET_FROM_SERVER, "Access denied for user {}", user);
 }

@@ -110,7 +110,7 @@ void MySQLClient::writeCommand(char command, String query)
 switch (packet_response.getType())
 {
 case PACKET_ERR:
-throw Exception(packet_response.err.error_message, ErrorCodes::UNKNOWN_PACKET_FROM_SERVER);
+throw Exception::createDeprecated(packet_response.err.error_message, ErrorCodes::UNKNOWN_PACKET_FROM_SERVER);
 case PACKET_OK:
 break;
 default:

@@ -128,7 +128,7 @@ void MySQLClient::registerSlaveOnMaster(UInt32 slave_id)
 packet_endpoint->receivePacket(packet_response);
 packet_endpoint->resetSequenceId();
 if (packet_response.getType() == PACKET_ERR)
-throw Exception(packet_response.err.error_message, ErrorCodes::UNKNOWN_PACKET_FROM_SERVER);
+throw Exception::createDeprecated(packet_response.err.error_message, ErrorCodes::UNKNOWN_PACKET_FROM_SERVER);
 }

 void MySQLClient::ping()

@@ -111,7 +111,7 @@ namespace MySQLReplication
 else if (query.starts_with("XA"))
 {
 if (query.starts_with("XA ROLLBACK"))
-throw ReplicationError("ParseQueryEvent: Unsupported query event:" + query, ErrorCodes::LOGICAL_ERROR);
+throw ReplicationError(ErrorCodes::LOGICAL_ERROR, "ParseQueryEvent: Unsupported query event: {}", query);
 typ = QUERY_EVENT_XA;
 if (!query.starts_with("XA COMMIT"))
 transaction_complete = false;

@@ -247,7 +247,7 @@ namespace MySQLReplication
 break;
 }
 default:
-throw ReplicationError("ParseMetaData: Unhandled data type:" + std::to_string(typ), ErrorCodes::UNKNOWN_EXCEPTION);
+throw ReplicationError(ErrorCodes::UNKNOWN_EXCEPTION, "ParseMetaData: Unhandled data type: {}", std::to_string(typ));
 }
 }
 }

@@ -770,8 +770,8 @@ namespace MySQLReplication
 break;
 }
 default:
-throw ReplicationError(
-"ParseRow: Unhandled MySQL field type:" + std::to_string(field_type), ErrorCodes::UNKNOWN_EXCEPTION);
+throw ReplicationError(ErrorCodes::UNKNOWN_EXCEPTION,
+"ParseRow: Unhandled MySQL field type: {}", std::to_string(field_type));
 }
 }
 null_index++;

@@ -873,7 +873,7 @@ namespace MySQLReplication
 break;
 }
 default:
-throw ReplicationError("Position update with unsupported event", ErrorCodes::LOGICAL_ERROR);
+throw ReplicationError(ErrorCodes::LOGICAL_ERROR, "Position update with unsupported event");
 }
 }

@@ -901,11 +901,11 @@ namespace MySQLReplication
 switch (header)
 {
 case PACKET_EOF:
-throw ReplicationError("Master maybe lost", ErrorCodes::CANNOT_READ_ALL_DATA);
+throw ReplicationError(ErrorCodes::CANNOT_READ_ALL_DATA, "Master maybe lost");
 case PACKET_ERR:
 ERRPacket err;
 err.readPayloadWithUnpacked(payload);
-throw ReplicationError(err.error_message, ErrorCodes::UNKNOWN_EXCEPTION);
+throw ReplicationError::createDeprecated(err.error_message, ErrorCodes::UNKNOWN_EXCEPTION);
 }
 // skip the generic response packets header flag.
 payload.ignore(1);
@@ -74,7 +74,7 @@ ConnectionHolderPtr PoolWithFailover::get()
 if (replicas_with_priority.empty())
 throw DB::Exception(DB::ErrorCodes::LOGICAL_ERROR, "No address specified");

-DB::WriteBufferFromOwnString error_message;
+PreformattedMessage error_message;
 for (size_t try_idx = 0; try_idx < max_tries; ++try_idx)
 {
 for (auto & priority : replicas_with_priority)

@@ -107,7 +107,7 @@ ConnectionHolderPtr PoolWithFailover::get()
 catch (const pqxx::broken_connection & pqxx_error)
 {
 LOG_ERROR(log, "Connection error: {}", pqxx_error.what());
-error_message << fmt::format(
+error_message = PreformattedMessage::create(
 "Try {}. Connection to {} failed with error: {}\n",
 try_idx + 1, DB::backQuote(replica.connection_info.host_port), pqxx_error.what());

@@ -131,7 +131,7 @@ ConnectionHolderPtr PoolWithFailover::get()
 }
 }

-throw DB::Exception(DB::ErrorCodes::POSTGRESQL_CONNECTION_FAILURE, error_message.str());
+throw DB::Exception(error_message, DB::ErrorCodes::POSTGRESQL_CONNECTION_FAILURE);
 }
 }
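A minimal standalone replica of the PreformattedMessage pattern used above, assuming the {fmt} library version 9 or later; ClickHouse's real struct additionally carries the conversion operators and exception integration shown earlier:

#include <string>
#include <string_view>
#include <fmt/format.h>

// Keep both the rendered text and the static pattern, so error aggregation
// (e.g. grouping by format string) still works after formatting.
struct Preformatted
{
    std::string text;
    std::string_view format_string;
};

template <typename... Args>
Preformatted makePreformatted(fmt::format_string<Args...> fmt_str, Args &&... args)
{
    auto pattern = fmt_str.get();  // the compile-time format string
    return {fmt::format(fmt_str, std::forward<Args>(args)...),
            std::string_view(pattern.data(), pattern.size())};
}

int main()
{
    auto msg = makePreformatted("Try {}. Connection to {} failed: {}", 1, "localhost:5432", "timeout");
    fmt::print("text: {}\npattern: {}\n", msg.text, msg.format_string);
}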
@@ -402,7 +402,7 @@ void SettingFieldEnum<EnumT, Traits>::readBinary(ReadBuffer & in)
 auto it = map.find(value); \
 if (it != map.end()) \
 return it->second; \
-throw Exception( \
+throw Exception::createDeprecated( \
 "Unexpected value of " #NEW_NAME ":" + std::to_string(std::underlying_type<EnumType>::type(value)), \
 ERROR_CODE_FOR_UNEXPECTED_NAME); \
 } \

@@ -428,7 +428,7 @@ void SettingFieldEnum<EnumT, Traits>::readBinary(ReadBuffer & in)
 msg += "'" + String{name} + "'"; \
 } \
 msg += "]"; \
-throw Exception(msg, ERROR_CODE_FOR_UNEXPECTED_NAME); \
+throw Exception::createDeprecated(msg, ERROR_CODE_FOR_UNEXPECTED_NAME); \
 }

 // Mostly like SettingFieldEnum, but can have multiple enum values (or none) set at once.
@@ -933,7 +933,7 @@ void BaseDaemon::handleSignal(int signal_id)
 onInterruptSignals(signal_id);
 }
 else
-throw DB::Exception(std::string("Unsupported signal: ") + strsignal(signal_id), 0); // NOLINT(concurrency-mt-unsafe) // it is not thread-safe but ok in this context
+throw DB::Exception::createDeprecated(std::string("Unsupported signal: ") + strsignal(signal_id), 0); // NOLINT(concurrency-mt-unsafe) // it is not thread-safe but ok in this context
 }

 void BaseDaemon::onInterruptSignals(int signal_id)
@@ -189,7 +189,8 @@ template <template <typename> typename DecimalType>
 inline DataTypePtr createDecimal(UInt64 precision_value, UInt64 scale_value)
 {
 if (precision_value < DecimalUtils::min_precision || precision_value > DecimalUtils::max_precision<Decimal256>)
-throw Exception(ErrorCodes::ARGUMENT_OUT_OF_BOUND, "Wrong precision");
+throw Exception(ErrorCodes::ARGUMENT_OUT_OF_BOUND, "Wrong precision: it must be between {} and {}, got {}",
+DecimalUtils::min_precision, DecimalUtils::max_precision<Decimal256>, precision_value);

 if (static_cast<UInt64>(scale_value) > precision_value)
 throw Exception(ErrorCodes::ARGUMENT_OUT_OF_BOUND, "Negative scales and scales larger than precision are not supported");
@@ -57,7 +57,7 @@ static DataTypePtr create(const ASTPtr & arguments)
 {
 if (func->name != "Nullable" || func->arguments->children.size() != 1)
 throw Exception(ErrorCodes::UNEXPECTED_AST_STRUCTURE,
-"Expected 'Nullable(<schema_name>)' as parameter for type Object", func->name);
+"Expected 'Nullable(<schema_name>)' as parameter for type Object (function: {})", func->name);

 schema_argument = func->arguments->children[0];
 is_nullable = true;

@@ -53,10 +53,10 @@ static std::optional<Exception> checkTupleNames(const Strings & names)
 for (const auto & name : names)
 {
 if (name.empty())
-return Exception("Names of tuple elements cannot be empty", ErrorCodes::BAD_ARGUMENTS);
+return Exception(ErrorCodes::BAD_ARGUMENTS, "Names of tuple elements cannot be empty");

 if (!names_set.insert(name).second)
-return Exception("Names of tuple elements must be unique", ErrorCodes::DUPLICATE_COLUMN);
+return Exception(ErrorCodes::DUPLICATE_COLUMN, "Names of tuple elements must be unique");
 }

 return {};
@@ -373,8 +373,8 @@ void SerializationArray::deserializeBinaryBulkWithMultipleStreams(
 /// Check consistency between offsets and elements subcolumns.
 /// But if elements column is empty - it's ok for columns of Nested types that was added by ALTER.
 if (!nested_column->empty() && nested_column->size() != last_offset)
-throw ParsingException("Cannot read all array values: read just " + toString(nested_column->size()) + " of " + toString(last_offset),
-ErrorCodes::CANNOT_READ_ALL_DATA);
+throw ParsingException(ErrorCodes::CANNOT_READ_ALL_DATA, "Cannot read all array values: read just {} of {}",
+toString(nested_column->size()), toString(last_offset));

 column = std::move(mutable_column);
 }

@@ -360,19 +360,20 @@ ReturnType SerializationNullable::deserializeTextEscapedAndRawImpl(IColumn & col
 /// or if someone uses tab or LF in TSV null_representation.
 /// In the first case we cannot continue reading anyway. The second case seems to be unlikely.
 if (null_representation.find('\t') != std::string::npos || null_representation.find('\n') != std::string::npos)
-throw DB::ParsingException("TSV custom null representation containing '\\t' or '\\n' may not work correctly "
-"for large input.", ErrorCodes::CANNOT_READ_ALL_DATA);
+throw DB::ParsingException(ErrorCodes::CANNOT_READ_ALL_DATA, "TSV custom null representation "
+"containing '\\t' or '\\n' may not work correctly for large input.");

 WriteBufferFromOwnString parsed_value;
 if constexpr (escaped)
 nested_serialization->serializeTextEscaped(nested_column, nested_column.size() - 1, parsed_value, settings);
 else
 nested_serialization->serializeTextRaw(nested_column, nested_column.size() - 1, parsed_value, settings);
-throw DB::ParsingException("Error while parsing \"" + std::string(pos, buf.buffer().end()) + std::string(istr.position(), std::min(size_t(10), istr.available())) + "\" as Nullable"
-+ " at position " + std::to_string(istr.count()) + ": got \"" + std::string(pos, buf.position() - pos)
-+ "\", which was deserialized as \""
-+ parsed_value.str() + "\". It seems that input data is ill-formatted.",
-ErrorCodes::CANNOT_READ_ALL_DATA);
+throw DB::ParsingException(ErrorCodes::CANNOT_READ_ALL_DATA, "Error while parsing \"{}{}\" as Nullable"
+" at position {}: got \"{}\", which was deserialized as \"{}\". "
+"It seems that input data is ill-formatted.",
+std::string(pos, buf.buffer().end()),
+std::string(istr.position(), std::min(size_t(10), istr.available())),
+istr.count(), std::string(pos, buf.position() - pos), parsed_value.str());
 };

 return safeDeserialize<ReturnType>(column, *nested_serialization, check_for_null, deserialize_nested);

@@ -584,16 +585,17 @@ ReturnType SerializationNullable::deserializeTextCSVImpl(IColumn & column, ReadB
 /// In the first case we cannot continue reading anyway. The second case seems to be unlikely.
 if (null_representation.find(settings.csv.delimiter) != std::string::npos || null_representation.find('\r') != std::string::npos
 || null_representation.find('\n') != std::string::npos)
-throw DB::ParsingException("CSV custom null representation containing format_csv_delimiter, '\\r' or '\\n' may not work correctly "
-"for large input.", ErrorCodes::CANNOT_READ_ALL_DATA);
+throw DB::ParsingException(ErrorCodes::CANNOT_READ_ALL_DATA, "CSV custom null representation containing "
+"format_csv_delimiter, '\\r' or '\\n' may not work correctly for large input.");

 WriteBufferFromOwnString parsed_value;
 nested_serialization->serializeTextCSV(nested_column, nested_column.size() - 1, parsed_value, settings);
-throw DB::ParsingException("Error while parsing \"" + std::string(pos, buf.buffer().end()) + std::string(istr.position(), std::min(size_t(10), istr.available())) + "\" as Nullable"
-+ " at position " + std::to_string(istr.count()) + ": got \"" + std::string(pos, buf.position() - pos)
-+ "\", which was deserialized as \""
-+ parsed_value.str() + "\". It seems that input data is ill-formatted.",
-ErrorCodes::CANNOT_READ_ALL_DATA);
+throw DB::ParsingException(ErrorCodes::CANNOT_READ_ALL_DATA, "Error while parsing \"{}{}\" as Nullable"
+" at position {}: got \"{}\", which was deserialized as \"{}\". "
+"It seems that input data is ill-formatted.",
+std::string(pos, buf.buffer().end()),
+std::string(istr.position(), std::min(size_t(10), istr.available())),
+istr.count(), std::string(pos, buf.position() - pos), parsed_value.str());
 };

 return safeDeserialize<ReturnType>(column, *nested_serialization, check_for_null, deserialize_nested);
@@ -231,7 +231,7 @@ void SerializationObject<Parser>::deserializeBinaryBulkStatePrefix(
     auto kind = magic_enum::enum_cast<BinarySerializationKind>(kind_raw);
     if (!kind)
         throw Exception(ErrorCodes::INCORRECT_DATA,
-            "Unknown binary serialization kind of Object: " + std::to_string(kind_raw));
+            "Unknown binary serialization kind of Object: {}", std::to_string(kind_raw));

     auto state_object = std::make_shared<DeserializeStateObject>();
     state_object->kind = *kind;
@@ -255,7 +255,7 @@ void SerializationObject<Parser>::deserializeBinaryBulkStatePrefix(
     else
     {
         throw Exception(ErrorCodes::INCORRECT_DATA,
-            "Unknown binary serialization kind of Object: " + std::to_string(kind_raw));
+            "Unknown binary serialization kind of Object: {}", std::to_string(kind_raw));
     }

     settings.path.push_back(Substream::ObjectData);
@@ -40,7 +40,6 @@ template <typename DataTypes>
 String getExceptionMessagePrefix(const DataTypes & types)
 {
     WriteBufferFromOwnString res;
-    res << "There is no supertype for types ";

     bool first = true;
     for (const auto & type : types)
@@ -65,9 +64,9 @@ DataTypePtr throwOrReturn(const DataTypes & types, std::string_view message_suff
         return nullptr;

     if (message_suffix.empty())
-        throw Exception(error_code, getExceptionMessagePrefix(types));
+        throw Exception(error_code, "There is no supertype for types {}", getExceptionMessagePrefix(types));

-    throw Exception(error_code, "{} {}", getExceptionMessagePrefix(types), message_suffix);
+    throw Exception(error_code, "There is no supertype for types {} {}", getExceptionMessagePrefix(types), message_suffix);
 }

 template <LeastSupertypeOnError on_error>
@@ -49,7 +49,7 @@ DataTypePtr getMostSubtype(const DataTypes & types, bool throw_if_result_is_noth
     auto get_nothing_or_throw = [throw_if_result_is_nothing, & types](const std::string & reason)
     {
         if (throw_if_result_is_nothing)
-            throw Exception(getExceptionMessagePrefix(types) + reason, ErrorCodes::NO_COMMON_TYPE);
+            throw Exception::createDeprecated(getExceptionMessagePrefix(types) + reason, ErrorCodes::NO_COMMON_TYPE);
         return std::make_shared<DataTypeNothing>();
     };
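`Exception::createDeprecated` is used above because the message (`getExceptionMessagePrefix(types) + reason`) is composed at runtime and therefore cannot double as a format string: any brace it contains would be parsed as a replacement field. A sketch of the distinction, with stand-in names:

```cpp
#include <fmt/format.h>

#include <stdexcept>
#include <string>

// A message assembled at runtime must be stored verbatim; this mirrors what
// Exception::createDeprecated(message, code) does: no formatting pass is
// ever run over the string, so stray '{' or '}' characters are harmless.
std::runtime_error fromRuntimeMessage(const std::string & message)
{
    return std::runtime_error(message);
}

// A static format string is safe to format: the placeholders are fixed at
// the call site and the dynamic parts arrive as arguments.
std::runtime_error fromStaticFormat(const std::string & type_list)
{
    return std::runtime_error(fmt::format("There is no supertype for types {}", type_list));
}
```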
@@ -47,10 +47,10 @@ getArgument(const ASTPtr & arguments, size_t argument_index, const char * argume
     else
     {
         if (argument && argument->value.getType() != field_type)
-            throw Exception(getExceptionMessage(fmt::format(" has wrong type: {}", argument->value.getTypeName()),
+            throw Exception::createDeprecated(getExceptionMessage(fmt::format(" has wrong type: {}", argument->value.getTypeName()),
                 argument_index, argument_name, context_data_type_name, field_type), ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT);
         else
-            throw Exception(getExceptionMessage(" is missing", argument_index, argument_name, context_data_type_name, field_type),
+            throw Exception::createDeprecated(getExceptionMessage(" is missing", argument_index, argument_name, context_data_type_name, field_type),
                 ErrorCodes::NUMBER_OF_ARGUMENTS_DOESNT_MATCH);
     }
 }
@@ -67,7 +67,7 @@ static DataTypePtr create(const ASTPtr & arguments)
     const auto timezone = getArgument<String, ArgumentKind::Optional>(arguments, !!scale, "timezone", "DateTime");

     if (!scale && !timezone)
-        throw Exception(getExceptionMessage(" has wrong type: ", 0, "scale", "DateTime", Field::Types::Which::UInt64),
+        throw Exception::createDeprecated(getExceptionMessage(" has wrong type: ", 0, "scale", "DateTime", Field::Types::Which::UInt64),
             ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT);

     /// If scale is defined, the data type is DateTime when scale = 0 otherwise the data type is DateTime64
@@ -41,10 +41,9 @@ namespace
         }
         catch (Exception & e)
         {
-            throw Exception(
-                fmt::format("Error while loading dictionary '{}.{}': {}",
-                    database_name, load_result.name, e.displayText()),
-                e.code());
+            throw Exception(e.code(),
+                "Error while loading dictionary '{}.{}': {}",
+                database_name, load_result.name, e.displayText());
         }
     }
 }
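The dictionary-loading hunk keeps the original error code while prepending context; only the argument order and the formatting move change. Roughly, with a hypothetical `CodedError` standing in for the real exception type:

```cpp
#include <fmt/format.h>

#include <stdexcept>
#include <string>
#include <utility>

// Minimal stand-in for an exception that carries a numeric error code.
struct CodedError : std::runtime_error
{
    int code;
    CodedError(int code_, std::string msg) : std::runtime_error(std::move(msg)), code(code_) {}
};

void loadDictionaryWithContext(const std::string & database, const std::string & name)
{
    try
    {
        throw CodedError(1001, "file not found"); // stand-in for the real loading call
    }
    catch (const CodedError & e)
    {
        // Re-throw with the original code preserved and the message extended
        // with context, matching the new call shape in the hunk above.
        throw CodedError(e.code,
            fmt::format("Error while loading dictionary '{}.{}': {}", database, name, e.what()));
    }
}
```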
@@ -118,7 +117,7 @@ ASTPtr DatabaseDictionary::getCreateTableQueryImpl(const String & table_name, Co
         /* hilite = */ false, "", /* allow_multi_statements = */ false, 0, settings.max_parser_depth);

     if (!ast && throw_on_error)
-        throw Exception(error_message, ErrorCodes::SYNTAX_ERROR);
+        throw Exception::createDeprecated(error_message, ErrorCodes::SYNTAX_ERROR);

     return ast;
 }
@@ -252,13 +252,11 @@ DatabasePtr DatabaseFactory::getImpl(const ASTCreateQuery & create, const String
     {
         auto print_create_ast = create.clone();
         print_create_ast->as<ASTCreateQuery>()->attach = false;
-        throw Exception(
-            fmt::format(
+        throw Exception(ErrorCodes::NOT_IMPLEMENTED,
             "The MaterializedMySQL database engine no longer supports Ordinary databases. To re-create the database, delete "
             "the old one by executing \"rm -rf {}{{,.sql}}\", then re-create the database with the following query: {}",
             metadata_path,
-            queryToString(print_create_ast)),
-            ErrorCodes::NOT_IMPLEMENTED);
+            queryToString(print_create_ast));
     }

     return std::make_shared<DatabaseMaterializedMySQL>(
@@ -676,7 +676,7 @@ ASTPtr DatabaseOnDisk::parseQueryFromMetadata(
         "in file " + metadata_file_path, /* allow_multi_statements = */ false, 0, settings.max_parser_depth);

     if (!ast && throw_on_error)
-        throw Exception(error_message, ErrorCodes::SYNTAX_ERROR);
+        throw Exception::createDeprecated(error_message, ErrorCodes::SYNTAX_ERROR);
     else if (!ast)
         return nullptr;

@@ -136,7 +136,7 @@ ASTPtr DatabaseMySQL::getCreateTableQueryImpl(const String & table_name, Context
     if (local_tables_cache.find(table_name) == local_tables_cache.end())
     {
         if (throw_on_error)
-            throw Exception(ErrorCodes::UNKNOWN_TABLE, "MySQL table {} doesn't exist.", database_name_in_mysql, table_name);
+            throw Exception(ErrorCodes::UNKNOWN_TABLE, "MySQL table {}.{} doesn't exist.", database_name_in_mysql, table_name);
         return nullptr;
     }

@@ -180,7 +180,7 @@ time_t DatabaseMySQL::getObjectMetadataModificationTime(const String & table_nam
     fetchTablesIntoLocalCache(getContext());

     if (local_tables_cache.find(table_name) == local_tables_cache.end())
-        throw Exception(ErrorCodes::UNKNOWN_TABLE, "MySQL table {} doesn't exist.", database_name_in_mysql, table_name);
+        throw Exception(ErrorCodes::UNKNOWN_TABLE, "MySQL table {}.{} doesn't exist.", database_name_in_mysql, table_name);

     return time_t(local_tables_cache[table_name].first);
 }
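The two DatabaseMySQL hunks fix a real defect, not just style: the old format string had one `{}` but received two arguments, and {fmt} ignores surplus arguments silently, so the table name was dropped from the message. Adding `{}.{}` restores it. A quick illustration:

```cpp
#include <fmt/format.h>

#include <iostream>

int main()
{
    // One placeholder, two arguments: {fmt} silently ignores the surplus
    // argument, so the table name vanished from the old message.
    std::cout << fmt::format("MySQL table {} doesn't exist.", "db", "t") << '\n';
    // prints: MySQL table db doesn't exist.

    // Matching placeholders, as in the fixed code, keep both parts.
    std::cout << fmt::format("MySQL table {}.{} doesn't exist.", "db", "t") << '\n';
    // prints: MySQL table db.t doesn't exist.
}
```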
@@ -147,7 +147,7 @@ static void checkMySQLVariables(const mysqlxx::Pool::Entry & connection, const S
             first = false;
         }

-        throw Exception(error_message.str(), ErrorCodes::ILLEGAL_MYSQL_VARIABLE);
+        throw Exception::createDeprecated(error_message.str(), ErrorCodes::ILLEGAL_MYSQL_VARIABLE);
     }
 }

@@ -214,12 +214,12 @@ void DatabasePostgreSQL::attachTable(ContextPtr /* context_ */, const String & t

     if (!checkPostgresTable(table_name))
         throw Exception(ErrorCodes::UNKNOWN_TABLE,
-            "Cannot attach PostgreSQL table {} because it does not exist in PostgreSQL",
+            "Cannot attach PostgreSQL table {} because it does not exist in PostgreSQL (database: {})",
             getTableNameForLogs(table_name), database_name);

     if (!detached_or_dropped.contains(table_name))
         throw Exception(ErrorCodes::TABLE_ALREADY_EXISTS,
-            "Cannot attach PostgreSQL table {} because it already exists",
+            "Cannot attach PostgreSQL table {} because it already exists (database: {})",
             getTableNameForLogs(table_name), database_name);

     if (cache_tables)
@@ -19,7 +19,7 @@ static std::mutex init_sqlite_db_mutex;
 void processSQLiteError(const String & message, bool throw_on_error)
 {
     if (throw_on_error)
-        throw Exception(ErrorCodes::PATH_ACCESS_DENIED, message);
+        throw Exception::createDeprecated(message, ErrorCodes::PATH_ACCESS_DENIED);
     else
         LOG_ERROR(&Poco::Logger::get("SQLiteEngine"), fmt::runtime(message));
 }
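`processSQLiteError` shows the opposite hazard: the old call already put the error code first but passed the runtime `message` where a format string belongs, so a message containing braces would break formatting. `createDeprecated` stores the string verbatim instead, and `fmt::runtime` on the `LOG_ERROR` line solves the same problem for logging. A sketch of the failure mode, with a hypothetical message:

```cpp
#include <fmt/format.h>

#include <iostream>
#include <string>

int main()
{
    // Hypothetical runtime message that happens to contain braces.
    const std::string message = "cannot open file '/data/{tmp}.db'";

    try
    {
        // Treating the runtime string as a format string: '{tmp}' is parsed
        // as a replacement field with no matching argument, so formatting throws.
        std::cout << fmt::format(fmt::runtime(message)) << '\n';
    }
    catch (const fmt::format_error & e)
    {
        std::cout << "format_error: " << e.what() << '\n';
    }

    // Safe: emit the message verbatim, which is what createDeprecated stores.
    std::cout << message << '\n';
}
```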
@@ -54,9 +54,9 @@ void RegionsHierarchy::reload()
         if (region_entry.id > max_region_id)
         {
             if (region_entry.id > max_size)
-                throw DB::Exception(
-                    "Region id is too large: " + DB::toString(region_entry.id) + ", should be not more than " + DB::toString(max_size),
-                    DB::ErrorCodes::INCORRECT_DATA);
+                throw DB::Exception(DB::ErrorCodes::INCORRECT_DATA,
+                    "Region id is too large: {}, should be not more than {}",
+                    DB::toString(region_entry.id), DB::toString(max_size));

             max_region_id = region_entry.id;

@@ -84,9 +84,9 @@ void RegionsNames::reload()
             max_region_id = name_entry.id;

             if (name_entry.id > max_size)
-                throw DB::Exception(
-                    "Region id is too large: " + DB::toString(name_entry.id) + ", should be not more than " + DB::toString(max_size),
-                    DB::ErrorCodes::INCORRECT_DATA);
+                throw DB::Exception(DB::ErrorCodes::INCORRECT_DATA,
+                    "Region id is too large: {}, should be not more than {}",
+                    DB::toString(name_entry.id), DB::toString(max_size));
         }

         while (name_entry.id >= new_names_refs.size())
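Taken together, the refactor moves toward static format strings, which recent {fmt} versions can validate at compile time under C++20: a placeholder with no matching argument fails the build instead of throwing at runtime. Surplus arguments, as in the DatabaseMySQL bug above, remain legal, so that class of mistake still needs review. A sketch, assuming compile-time format checks are enabled:

```cpp
#include <fmt/format.h>

#include <cstddef>
#include <string>

std::string ok(size_t id, size_t max_size)
{
    // Two placeholders, two arguments: compiles and formats as expected.
    return fmt::format("Region id is too large: {}, should be not more than {}", id, max_size);
}

// std::string broken(size_t id)
// {
//     // With C++20 and {fmt}'s compile-time format checks this line fails
//     // to build, because the second placeholder has no argument:
//     return fmt::format("Region id is too large: {}, should be not more than {}", id);
// }
```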