Merge branch 'master' into randomize-mt-settings

Anton Popov 2023-01-27 16:03:00 +01:00 committed by GitHub
commit d8fe2bcbaa
285 changed files with 3972 additions and 2602 deletions

contrib/poco vendored

@ -1 +1 @@
Subproject commit 4b1c8dd9913d2a16db62df0e509fa598da5c8219
Subproject commit 7fefdf30244a9bf8eb58562a9b2a51cc59a8877a

View File

@ -235,6 +235,7 @@ function run_tests
--check-zookeeper-session
--order random
--print-time
--report-logs-stats
--jobs "${NPROC}"
)
time clickhouse-test "${test_opts[@]}" -- "$FASTTEST_FOCUS" 2>&1 \

View File

@ -1,15 +1,22 @@
---
slug: /en/engines/table-engines/mergetree-family/invertedindexes
sidebar_label: Inverted Indexes
description: Quickly find search terms in text.
keywords: [full-text search, text search]
---
# Inverted indexes [experimental]
Inverted indexes are an experimental type of [secondary indexes](/docs/en/engines/table-engines/mergetree-family/mergetree.md/#available-types-of-indices) which provide fast text search
capabilities for [String](/docs/en/sql-reference/data-types/string.md) or [FixedString](/docs/en/sql-reference/data-types/fixedstring.md)
columns. The main idea of an inverted index is to store a mapping from "terms" to the rows which contain these terms. "Terms" are
tokenized cells of the string column. For example, the string cell "I will be a little late" is by default tokenized into six terms "I", "will",
"be", "a", "little" and "late". Another kind of tokenizer is n-grams. For example, the result of 3-gram tokenization will be 21 terms "I w",
" wi", "wil", "ill", "ll ", "l b", " be" etc. The more fine-granular the input strings are tokenized, the bigger but also the more
useful the resulting inverted index will be.
:::warning
Inverted indexes are experimental and should not be used in production environments yet. They may change in the future in backward-incompatible
ways, for example with respect to their DDL/DQL syntax or performance/compression characteristics.
:::
@ -24,7 +31,14 @@ SET allow_experimental_inverted_index = true;
An inverted index can be defined on a string column using the following syntax:
``` sql
CREATE TABLE tab
(
`key` UInt64,
`str` String,
INDEX inv_idx(str) TYPE inverted(0) GRANULARITY 1
)
ENGINE = MergeTree
ORDER BY key
```
where `N` specifies the tokenizer:
@ -32,7 +46,7 @@ where `N` specifies the tokenizer:
- `inverted(0)` (or shorter: `inverted()`) sets the tokenizer to "tokens", i.e. strings are split along spaces,
- `inverted(N)` with `N` between 2 and 8 sets the tokenizer to "ngrams(N)".
Being a type of skipping index, inverted indexes can be dropped or added to a column after table creation:
``` sql
ALTER TABLE tbl DROP INDEX inv_idx;
@ -54,13 +68,13 @@ SELECT * from tab WHERE multiSearchAll(s, [Hello, World]);
The inverted index also works on columns of type `Array(String)`, `Array(FixedString)` and `Map(String)`.
Like for other secondary indices, each column part has its own inverted index. Furthermore, each inverted index is internally divided into
"segments". The existence and size of the segments is generally transparent to users but the segment size determines the memory consumption
"segments". The existence and size of the segments are generally transparent to users but the segment size determines the memory consumption
during index construction (e.g. when two parts are merged). Configuration parameter "max_digestion_size_per_segment" (default: 256 MB)
controls the amount of data consumed from the underlying column before a new segment is created. Increasing the parameter raises the
intermediate memory consumption for index construction but also improves lookup performance since fewer segments need to be checked on
average to evaluate a query.
Unlike other secondary indices, inverted indexes (for now) map to row numbers (row ids) instead of granule ids. The reason for this design
is performance. In practice, users often search for multiple terms at once. For example, filter predicate `WHERE s LIKE '%little%' OR s LIKE
'%big%'` can be evaluated directly using an inverted index by forming the union of the row id lists for terms "little" and "big". This also
means that the parameter `GRANULARITY` supplied to index creation has no meaning (it may be removed from the syntax in the future).
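As a quick illustration, here is a minimal sketch of queries such an index can serve, reusing the `tab` table from the `CREATE TABLE` example above (the inserted rows are invented for illustration; the full list of supported functions is in the complete documentation):
``` sql
INSERT INTO tab (key, str) VALUES (1, 'Hello World'), (2, 'Hello ClickHouse');

-- equality, LIKE, and token-search predicates can all be reduced to term lookups:
SELECT * FROM tab WHERE str = 'Hello World';
SELECT * FROM tab WHERE str LIKE '%Hello%';
SELECT * FROM tab WHERE multiSearchAny(str, ['Hello', 'ClickHouse']);
SELECT * FROM tab WHERE hasToken(str, 'Hello');
```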

View File

@ -0,0 +1,259 @@
# Laion-400M dataset
The dataset contains 400 million images with English text. For more information follow this [link](https://laion.ai/blog/laion-400-open-dataset/). Laion provides even larger datasets (e.g. [5 billion](https://laion.ai/blog/laion-5b/)). Working with them will be similar.
The dataset includes precomputed embeddings for texts and images. These will be used to demonstrate [Approximate nearest neighbor search indexes](../../engines/table-engines/mergetree-family/annindexes.md).
## Prepare data
Embeddings are stored in `.npy` files, so we have to read them with Python and merge them with the other data.
Download the data and process it with a simple `download.sh` script:
```bash
wget --tries=100 https://deploy.laion.ai/8f83b608504d46bb81708ec86e912220/embeddings/img_emb/img_emb_${1}.npy
wget --tries=100 https://deploy.laion.ai/8f83b608504d46bb81708ec86e912220/embeddings/metadata/metadata_${1}.parquet
wget --tries=100 https://deploy.laion.ai/8f83b608504d46bb81708ec86e912220/embeddings/text_emb/text_emb_${1}.npy
python3 process.py ${1}
```
Where `process.py`:
```python
import pandas as pd
import numpy as np
import os
import sys
str_i = str(sys.argv[1])
npy_file = "img_emb_" + str_i + '.npy'
metadata_file = "metadata_" + str_i + '.parquet'
text_npy = "text_emb_" + str_i + '.npy'
# load all files
im_emb = np.load(npy_file)
text_emb = np.load(text_npy)
data = pd.read_parquet(metadata_file)
# combine them
data = pd.concat([data, pd.DataFrame({"image_embedding" : [*im_emb]}), pd.DataFrame({"text_embedding" : [*text_emb]})], axis=1, copy=False)
# you can save more columns
data = data[['url', 'caption', 'similarity', "image_embedding", "text_embedding"]]
# transform np.arrays to lists
data['image_embedding'] = data['image_embedding'].apply(lambda x: list(x))
data['text_embedding'] = data['text_embedding'].apply(lambda x: list(x))
# this small hack is needed because captions sometimes contain all kinds of quotes
data['caption'] = data['caption'].apply(lambda x: x.replace("'", " ").replace('"', " "))
# save data to file
data.to_csv(str_i + '.csv', header=False)
# previous files can be removed
os.system(f"rm {npy_file} {metadata_file} {text_npy}")
```
You can download the data with:
```bash
seq 0 409 | xargs -P100 -I{} bash -c './download.sh {}'
```
The dataset is divided into 410 files (numbered 0 through 409). If you want to work with only a certain part of the dataset, just change the limits.
## Create table for Laion
Without indexes, the table can be created with:
```sql
CREATE TABLE laion_dataset
(
`id` Int64,
`url` String,
`caption` String,
`similarity` Float32,
`image_embedding` Array(Float32),
`text_embedding` Array(Float32)
)
ENGINE = MergeTree
ORDER BY id
SETTINGS index_granularity = 8192
```
Fill the table with data:
```sql
INSERT INTO laion_dataset FROM INFILE '{path_to_csv_files}/*.csv'
```
## Check data in table without indexes
Let's check how the following query performs on part of the dataset (8 million records):
```sql
select url, caption from test_laion where similarity > 0.2 order by L2Distance(image_embedding, {target:Array(Float32)}) limit 30
```
Since the embeddings for images and texts may not match exactly, let's also require a certain matching-accuracy threshold (`similarity > 0.2`) to get images that are more likely to satisfy our queries. The query uses the client parameter `target`, which is an array of 512 elements; see later in this article for a convenient way of obtaining such vectors. I used a random picture of a cat from the Internet as the target vector.
**The result**
```
┌─url───────────────────────────────────────────────────────────────────────────────────────────────────────────┬─caption────────────────────────────────────────────────────────────────┐
│ https://s3.amazonaws.com/filestore.rescuegroups.org/6685/pictures/animals/13884/13884995/63318230_463x463.jpg │ Adoptable Female Domestic Short Hair │
│ https://s3.amazonaws.com/pet-uploads.adoptapet.com/8/b/6/239905226.jpg │ Adopt A Pet :: Marzipan - New York, NY │
│ http://d1n3ar4lqtlydb.cloudfront.net/9/2/4/248407625.jpg │ Adopt A Pet :: Butterscotch - New Castle, DE │
│ https://s3.amazonaws.com/pet-uploads.adoptapet.com/e/e/c/245615237.jpg │ Adopt A Pet :: Tiggy - Chicago, IL │
│ http://pawsofcoronado.org/wp-content/uploads/2012/12/rsz_pumpkin.jpg │ Pumpkin an orange tabby kitten for adoption │
│ https://s3.amazonaws.com/pet-uploads.adoptapet.com/7/8/3/188700997.jpg │ Adopt A Pet :: Brian the Brad Pitt of cats - Frankfort, IL │
│ https://s3.amazonaws.com/pet-uploads.adoptapet.com/8/b/d/191533561.jpg │ Domestic Shorthair Cat for adoption in Mesa, Arizona - Charlie │
│ https://s3.amazonaws.com/pet-uploads.adoptapet.com/0/1/2/221698235.jpg │ Domestic Shorthair Cat for adoption in Marietta, Ohio - Daisy (Spayed) │
└───────────────────────────────────────────────────────────────────────────────────────────────────────────────┴────────────────────────────────────────────────────────────────────────┘
8 rows in set. Elapsed: 6.432 sec. Processed 19.65 million rows, 43.96 GB (3.06 million rows/s., 6.84 GB/s.)
```
## Add indexes
Create a new table, or follow the instructions in the [alter documentation](../../sql-reference/statements/alter/skipping-index.md).
```sql
CREATE TABLE laion_dataset
(
`id` Int64,
`url` String,
`caption` String,
`similarity` Float32,
`image_embedding` Array(Float32),
`text_embedding` Array(Float32),
INDEX annoy_image image_embedding TYPE annoy(1000) GRANULARITY 1000,
INDEX annoy_text text_embedding TYPE annoy(1000) GRANULARITY 1000
)
ENGINE = MergeTree
ORDER BY id
SETTINGS index_granularity = 8192
```
When created, the index will be built using the L2Distance metric. You can read more about the parameters in the [annoy documentation](../../engines/table-engines/mergetree-family/annindexes.md#annoy-annoy). It makes sense to build indexes over a large number of granules. If you need good speed, then `GRANULARITY` should be several times larger than the expected number of results in the search.
Now let's check again with the same query:
```sql
select url, caption from test_indexes_laion where similarity > 0.2 order by L2Distance(image_embedding, {target:Array(Float32)}) limit 8
```
**Result**
```
┌─url──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┬─caption──────────────────────────────────────────────────────────────┐
│ http://tse1.mm.bing.net/th?id=OIP.R1CUoYp_4hbeFSHBaaB5-gHaFj │ bed bugs and pets can cats carry bed bugs pets adviser │
│ http://pet-uploads.adoptapet.com/1/9/c/1963194.jpg?336w │ Domestic Longhair Cat for adoption in Quincy, Massachusetts - Ashley │
│ https://thumbs.dreamstime.com/t/cat-bed-12591021.jpg │ Cat on bed Stock Image │
│ https://us.123rf.com/450wm/penta/penta1105/penta110500004/9658511-portrait-of-british-short-hair-kitten-lieing-at-sofa-on-sun.jpg │ Portrait of british short hair kitten lieing at sofa on sun. │
│ https://www.easypetmd.com/sites/default/files/Wirehaired%20Vizsla%20(2).jpg │ Vizsla (Wirehaired) image 3 │
│ https://images.ctfassets.net/yixw23k2v6vo/0000000200009b8800000000/7950f4e1c1db335ef91bb2bc34428de9/dog-cat-flickr-Impatience_1.jpg?w=600&h=400&fm=jpg&fit=thumb&q=65&fl=progressive │ dog and cat image │
│ https://i1.wallbox.ru/wallpapers/small/201523/eaa582ee76a31fd.jpg │ cats, kittens, faces, tonkinese │
│ https://www.baxterboo.com/images/breeds/medium/cairn-terrier.jpg │ Cairn Terrier Photo │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┴──────────────────────────────────────────────────────────────────────┘
8 rows in set. Elapsed: 0.641 sec. Processed 22.06 thousand rows, 49.36 MB (91.53 thousand rows/s., 204.81 MB/s.)
```
The speed has increased significantly, but the results now sometimes differ from what you are looking for. This is due to the approximate nature of the search and the quality of the constructed embeddings. Note that this example used image embeddings, but the dataset also contains text embeddings, which can be used for searching in the same way.
## Scripts for embeddings
Usually, we do not want to compute embeddings for existing data, but rather to obtain them for new data and search for similar entries in the old data. We can use [UDFs](../../sql-reference/functions/index.md#sql-user-defined-functions) for this purpose. They allow you to set the `target` vector without leaving the client. All of the following scripts are written for the `ViT-B/32` model, as it was used for this dataset. You can use any model, but it is necessary to build the embeddings in the dataset and for new objects with the same model.
### Text embeddings
`encode_text.py`:
```python
#!/usr/bin/python3
import clip
import torch
import numpy as np
import sys
if __name__ == '__main__':
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model, preprocess = clip.load("ViT-B/32", device=device)
    for text in sys.stdin:
        inputs = clip.tokenize(text)
        with torch.no_grad():
            text_features = model.encode_text(inputs)[0].tolist()
            # print the embedding so the UDF receives it on stdout
            print(text_features)
        sys.stdout.flush()
```
`encode_text_function.xml`:
```xml
<functions>
<function>
<type>executable</type>
<name>encode_text</name>
<return_type>Array(Float32)</return_type>
<argument>
<type>String</type>
<name>text</name>
</argument>
<format>TabSeparated</format>
<command>encode_text.py</command>
<command_read_timeout>1000000</command_read_timeout>
</function>
</functions>
```
Now we can simply use:
```sql
SELECT encode_text('cat');
```
The first use will be slow because the model needs to be loaded. But repeated queries will be fast. Then we copy the result into `SET param_target=...` and can easily write queries.
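For instance, a sketch of the full round trip (the vector below is truncated for readability; a real `ViT-B/32` embedding has 512 elements):
```sql
SELECT encode_text('cat');
-- copy the printed array into the parameter:
SET param_target=[0.017, -0.023, 0.041];  -- truncated, paste all 512 values here
SELECT url, caption
FROM test_indexes_laion
WHERE similarity > 0.2
ORDER BY L2Distance(text_embedding, {target:Array(Float32)})
LIMIT 8
```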
### Image embeddings
For pictures, the process is similar, but you send the path instead of the picture itself (if necessary, you could implement downloading and processing the picture, but it would take longer).
`encode_picture.py`:
```python
#!/usr/bin/python3
import clip
import torch
import numpy as np
from PIL import Image
import sys
if __name__ == '__main__':
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model, preprocess = clip.load("ViT-B/32", device=device)
    for text in sys.stdin:
        image = preprocess(Image.open(text.strip())).unsqueeze(0).to(device)
        with torch.no_grad():
            image_features = model.encode_image(image)[0].tolist()
            print(image_features)
        sys.stdout.flush()
```
`encode_picture_function.xml`:
```xml
<functions>
<function>
<type>executable_pool</type>
<name>encode_picture</name>
<return_type>Array(Float32)</return_type>
<argument>
<type>String</type>
<name>path</name>
</argument>
<format>TabSeparated</format>
<command>encode_picture.py</command>
<command_read_timeout>1000000</command_read_timeout>
</function>
</functions>
```
The query:
```sql
SELECT encode_picture('some/path/to/your/picture');
```

View File

@ -22,6 +22,7 @@ functions in ClickHouse. The sample datasets include:
- The [Cell Towers dataset](../getting-started/example-datasets/cell-towers.md) imports a CSV into ClickHouse
- The [NYPD Complaint Data](../getting-started/example-datasets/nypd_complaint_data.md) demonstrates how to use data inference to simplify creating tables
- The ["What's on the Menu?" dataset](../getting-started/example-datasets/menus.md) has an example of denormalizing data
- The [Laion dataset](../getting-started/example-datasets/laion.md) has an example of using [Approximate nearest neighbor search indexes](../engines/table-engines/mergetree-family/annindexes.md)
- [Getting Data Into ClickHouse - Part 1](https://clickhouse.com/blog/getting-data-into-clickhouse-part-1) provides examples of defining a schema and loading a small Hacker News dataset
- [Getting Data Into ClickHouse - Part 3 - Using S3](https://clickhouse.com/blog/getting-data-into-clickhouse-part-3-s3) has examples of loading data from s3
- [Generating random data in ClickHouse](https://clickhouse.com/blog/generating-random-test-distribution-data-for-clickhouse) shows how to generate random data if none of the above fit your needs.

View File

@ -83,6 +83,7 @@ The supported formats are:
| [RawBLOB](#rawblob) | ✔ | ✔ |
| [MsgPack](#msgpack) | ✔ | ✔ |
| [MySQLDump](#mysqldump) | ✔ | ✗ |
| [Markdown](#markdown) | ✗ | ✔ |
You can control some format processing parameters with the ClickHouse settings. For more information read the [Settings](/docs/en/operations/settings/settings-formats.md) section.
@ -2347,3 +2348,26 @@ FROM file(dump.sql, MySQLDump)
│ 3 │
└───┘
```
## Markdown {#markdown}
You can export results using [Markdown](https://en.wikipedia.org/wiki/Markdown) format to generate output ready to be pasted into your `.md` files:
```sql
SELECT
number,
number * 2
FROM numbers(5)
FORMAT Markdown
```
```results
| number | multiply(number, 2) |
|-:|-:|
| 0 | 0 |
| 1 | 2 |
| 2 | 4 |
| 3 | 6 |
| 4 | 8 |
```
The Markdown table will be generated automatically and can be used on Markdown-enabled platforms, like GitHub. This format is used only for output.

View File

@ -1915,6 +1915,21 @@ Possible values:
Default value: 0.
## optimize_skip_merged_partitions {#optimize-skip-merged-partitions}
Enables or disables optimization for the [OPTIMIZE TABLE ... FINAL](../../sql-reference/statements/optimize.md) query if there is only one part with level > 0 and it doesn't have an expired TTL.
- `OPTIMIZE TABLE ... FINAL SETTINGS optimize_skip_merged_partitions=1`
By default, `OPTIMIZE TABLE ... FINAL` rewrites the part even if there is only a single one.
Possible values:
- 1 - Enable optimization.
- 0 - Disable optimization.
Default value: 0.
## optimize_functions_to_subcolumns {#optimize-functions-to-subcolumns}
Enables or disables optimization by transforming some functions to reading subcolumns. This reduces the amount of data to read.
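For example (a sketch; the table name is hypothetical):
```sql
SET optimize_functions_to_subcolumns = 1;
-- with the setting enabled, this can be rewritten to read only the size0 subcolumn
-- of the Array column instead of the whole array:
SELECT length(arr) FROM t_arr;
```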

View File

@ -290,15 +290,11 @@ This storage method works the same way as hashed and allows using date/time (arb
Example: The table contains discounts for each advertiser in the format:
``` text
┌─advertiser_id─┬─discount_start_date─┬─discount_end_date─┬─amount─┐
│ 123 │ 2015-01-16 │ 2015-01-31 │ 0.25 │
│ 123 │ 2015-01-01 │ 2015-01-15 │ 0.15 │
│ 456 │ 2015-01-01 │ 2015-01-15 │ 0.05 │
└───────────────┴─────────────────────┴───────────────────┴────────┘
```
To use a sample for date ranges, define the `range_min` and `range_max` elements in the [structure](../../../sql-reference/dictionaries/external-dictionaries/external-dicts-dict-structure.md). These elements must contain elements `name` and `type` (if `type` is not specified, the default type will be used - Date). `type` can be any numeric type (Date / DateTime / UInt64 / Int32 / others).
@ -307,19 +303,25 @@ To use a sample for date ranges, define the `range_min` and `range_max` elements
Values of `range_min` and `range_max` should fit in `Int64` type.
:::
Example:
``` xml
<layout>
<range_hashed>
<!-- Strategy for overlapping ranges (min/max). Default: min (return a matching range with the min(range_min -> range_max) value) -->
<range_lookup_strategy>min</range_lookup_strategy>
</range_hashed>
</layout>
<structure>
<id>
<name>advertiser_id</name>
</id>
<range_min>
<name>discount_start_date</name>
<type>Date</type>
</range_min>
<range_max>
<name>discount_end_date</name>
<type>Date</type>
</range_max>
...
@ -328,17 +330,17 @@ Example:
or
``` sql
CREATE DICTIONARY discounts_dict (
advertiser_id UInt64,
discount_start_date Date,
discount_end_date Date,
amount Float64
)
PRIMARY KEY advertiser_id
SOURCE(CLICKHOUSE(TABLE 'discounts'))
LIFETIME(MIN 1 MAX 1000)
LAYOUT(RANGE_HASHED(range_lookup_strategy 'max'))
RANGE(MIN discount_start_date MAX discount_end_date)
```
To work with these dictionaries, you need to pass an additional argument to the `dictGet` function, for which a range is selected:
@ -349,16 +351,17 @@ dictGet('dict_name', 'attr_name', id, date)
Query example:
``` sql
SELECT dictGet('discounts_dict', 'amount', 1, '2022-10-20'::Date);
```
This function returns the value for the specified `id`s and the date range that includes the passed date.
Details of the algorithm:
- If the `id` is not found or a range is not found for the `id`, it returns the default value of the attribute's type.
- If there are overlapping ranges and `range_lookup_strategy=min`, it returns a matching range with the minimal `range_min`; if several ranges are found, it returns the range with the minimal `range_max`; if again several ranges are found (several ranges have the same `range_min` and `range_max`), it returns a random one of them.
- If there are overlapping ranges and `range_lookup_strategy=max`, it returns a matching range with the maximal `range_min`; if several ranges are found, it returns the range with the maximal `range_max`; if again several ranges are found (several ranges have the same `range_min` and `range_max`), it returns a random one of them.
- If `range_max` is `NULL`, the range is open. `NULL` is treated as the maximal possible value. For `range_min`, `1970-01-01` or `0` (`-MAX_INT`) can be used as the open value.
Configuration example:
@ -407,6 +410,108 @@ PRIMARY KEY Abcdef
RANGE(MIN StartTimeStamp MAX EndTimeStamp)
```
Configuration example with overlapping ranges and open ranges:
```sql
CREATE TABLE discounts
(
advertiser_id UInt64,
discount_start_date Date,
discount_end_date Nullable(Date),
amount Float64
)
ENGINE = Memory;
INSERT INTO discounts VALUES (1, '2015-01-01', Null, 0.1);
INSERT INTO discounts VALUES (1, '2015-01-15', Null, 0.2);
INSERT INTO discounts VALUES (2, '2015-01-01', '2015-01-15', 0.3);
INSERT INTO discounts VALUES (2, '2015-01-04', '2015-01-10', 0.4);
INSERT INTO discounts VALUES (3, '1970-01-01', '2015-01-15', 0.5);
INSERT INTO discounts VALUES (3, '1970-01-01', '2015-01-10', 0.6);
SELECT * FROM discounts ORDER BY advertiser_id, discount_start_date;
┌─advertiser_id─┬─discount_start_date─┬─discount_end_date─┬─amount─┐
│ 1 │ 2015-01-01 │ ᴺᵁᴸᴸ │ 0.1 │
│ 1 │ 2015-01-15 │ ᴺᵁᴸᴸ │ 0.2 │
│ 2 │ 2015-01-01 │ 2015-01-15 │ 0.3 │
│ 2 │ 2015-01-04 │ 2015-01-10 │ 0.4 │
│ 3 │ 1970-01-01 │ 2015-01-15 │ 0.5 │
│ 3 │ 1970-01-01 │ 2015-01-10 │ 0.6 │
└───────────────┴─────────────────────┴───────────────────┴────────┘
-- RANGE_LOOKUP_STRATEGY 'max'
CREATE DICTIONARY discounts_dict
(
advertiser_id UInt64,
discount_start_date Date,
discount_end_date Nullable(Date),
amount Float64
)
PRIMARY KEY advertiser_id
SOURCE(CLICKHOUSE(TABLE discounts))
LIFETIME(MIN 600 MAX 900)
LAYOUT(RANGE_HASHED(RANGE_LOOKUP_STRATEGY 'max'))
RANGE(MIN discount_start_date MAX discount_end_date);
select dictGet('discounts_dict', 'amount', 1, toDate('2015-01-14')) res;
┌─res─┐
│ 0.1 │ -- only one range matches: 2015-01-01 - Null
└─────┘
select dictGet('discounts_dict', 'amount', 1, toDate('2015-01-16')) res;
┌─res─┐
│ 0.2 │ -- two ranges match; range_min 2015-01-15 (0.2) is greater than 2015-01-01 (0.1)
└─────┘
select dictGet('discounts_dict', 'amount', 2, toDate('2015-01-06')) res;
┌─res─┐
│ 0.4 │ -- two ranges match; range_min 2015-01-04 (0.4) is greater than 2015-01-01 (0.3)
└─────┘
select dictGet('discounts_dict', 'amount', 3, toDate('2015-01-01')) res;
┌─res─┐
│ 0.5 │ -- two ranges match; the range_min values are equal, and range_max 2015-01-15 (0.5) is greater than 2015-01-10 (0.6)
└─────┘
DROP DICTIONARY discounts_dict;
-- RANGE_LOOKUP_STRATEGY 'min'
CREATE DICTIONARY discounts_dict
(
advertiser_id UInt64,
discount_start_date Date,
discount_end_date Nullable(Date),
amount Float64
)
PRIMARY KEY advertiser_id
SOURCE(CLICKHOUSE(TABLE discounts))
LIFETIME(MIN 600 MAX 900)
LAYOUT(RANGE_HASHED(RANGE_LOOKUP_STRATEGY 'min'))
RANGE(MIN discount_start_date MAX discount_end_date);
select dictGet('discounts_dict', 'amount', 1, toDate('2015-01-14')) res;
┌─res─┐
│ 0.1 │ -- only one range matches: 2015-01-01 - Null
└─────┘
select dictGet('discounts_dict', 'amount', 1, toDate('2015-01-16')) res;
┌─res─┐
│ 0.1 │ -- two ranges match; range_min 2015-01-01 (0.1) is less than 2015-01-15 (0.2)
└─────┘
select dictGet('discounts_dict', 'amount', 2, toDate('2015-01-06')) res;
┌─res─┐
│ 0.3 │ -- two ranges match; range_min 2015-01-01 (0.3) is less than 2015-01-04 (0.4)
└─────┘
select dictGet('discounts_dict', 'amount', 3, toDate('2015-01-01')) res;
┌─res─┐
│ 0.6 │ -- two ranges match; the range_min values are equal, and range_max 2015-01-10 (0.6) is less than 2015-01-15 (0.5)
└─────┘
```
### complex_key_range_hashed
The dictionary is stored in memory in the form of a hash table with an ordered array of ranges and their corresponding values (see [range_hashed](#range-hashed)). This type of storage is for use with composite [keys](../../../sql-reference/dictionaries/external-dictionaries/external-dicts-dict-structure.md).
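For illustration, a hypothetical composite-key variant of the discounts dictionary above (the `region` key attribute and the source table name are invented for this sketch):
```sql
CREATE DICTIONARY discounts_dict_ck
(
    advertiser_id UInt64,
    region String,
    discount_start_date Date,
    discount_end_date Nullable(Date),
    amount Float64
)
PRIMARY KEY advertiser_id, region
SOURCE(CLICKHOUSE(TABLE discounts_by_region))
LIFETIME(MIN 600 MAX 900)
LAYOUT(COMPLEX_KEY_RANGE_HASHED())
RANGE(MIN discount_start_date MAX discount_end_date);

-- the composite key is passed as a tuple
SELECT dictGet('discounts_dict_ck', 'amount', (toUInt64(1), 'US'), toDate('2015-01-14'));
```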

View File

@ -209,10 +209,25 @@ Aliases: `DAYOFMONTH`, `DAY`.
## toDayOfWeek
Converts a date or date with time to a UInt8 number containing the number of the day of the week.
The two-argument form of `toDayOfWeek()` enables you to specify whether the week starts on Monday or Sunday, and whether the return value should be in the range from 0 to 6 or from 1 to 7. If the mode argument is omitted, the default mode is 0. The time zone of the date can be specified as the third argument.
| Mode | First day of week | Range |
|------|-------------------|------------------------------------------------|
| 0 | Monday | 1-7: Monday = 1, Tuesday = 2, ..., Sunday = 7 |
| 1 | Monday | 0-6: Monday = 0, Tuesday = 1, ..., Sunday = 6 |
| 2 | Sunday | 0-6: Sunday = 0, Monday = 1, ..., Saturday = 6 |
| 3 | Sunday | 1-7: Sunday = 1, Monday = 2, ..., Saturday = 7 |
Alias: `DAYOFWEEK`.
**Syntax**
``` sql
toDayOfWeek(t[, mode[, timezone]])
```
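For example, 2023-01-01 was a Sunday:
``` sql
SELECT
    toDayOfWeek(toDate('2023-01-01')),    -- mode 0: Monday = 1, ..., Sunday = 7, so this returns 7
    toDayOfWeek(toDate('2023-01-01'), 1)  -- mode 1: Monday = 0, ..., Sunday = 6, so this returns 6
```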
## toHour
Converts a date with time to a UInt8 number containing the number of the hour in 24-hour time (0-23).
@ -316,11 +331,17 @@ If `toLastDayOfMonth` is called with an argument of type `Date` greater then 214
Rounds down a date, or date with time, to the nearest Monday.
Returns the date.
## toStartOfWeek
Rounds a date or date with time down to the nearest Sunday or Monday.
Returns the date.
The mode argument works exactly like the mode argument in function `toWeek()`. If no mode is specified, 0 is assumed.
**Syntax**
``` sql
toStartOfWeek(t[, mode[, timezone]])
```
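For example, taking 2023-01-03 (a Tuesday), and assuming `toWeek()`-style mode semantics where mode 0 weeks begin on Sunday and mode 1 weeks begin on Monday:
``` sql
SELECT
    toStartOfWeek(toDate('2023-01-03')),     -- rounds down to Sunday, 2023-01-01
    toStartOfWeek(toDate('2023-01-03'), 1)   -- rounds down to Monday, 2023-01-02
```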
## toStartOfDay
@ -455,10 +476,12 @@ Converts a date, or date with time, to a UInt16 number containing the ISO Year n
Converts a date, or date with time, to a UInt8 number containing the ISO Week number.
## toWeek
This function returns the week number for date or datetime. The two-argument form of `toWeek()` enables you to specify whether the week starts on Sunday or Monday and whether the return value should be in the range from 0 to 53 or from 1 to 53. If the mode argument is omitted, the default mode is 0.
`toISOWeek()` is a compatibility function that is equivalent to `toWeek(date,3)`.
The following table describes how the mode argument works.
| Mode | First day of week | Range | Week 1 is the first week … |
@ -482,13 +505,15 @@ For mode values with a meaning of “with 4 or more days this year,” weeks are
For mode values with a meaning of “contains January 1”, the week that contains January 1 is week 1. It does not matter how many days of the new year the week contains, even if it contains only one day.
**Syntax**
``` sql
toWeek(t[, mode[, time_zone]])
```
**Arguments**
- `t` Date or DateTime.
- `mode` – Optional parameter, range of values is \[0,9\], default is 0.
- `time_zone` – Optional parameter, it behaves like any other conversion function.
@ -504,13 +529,19 @@ SELECT toDate('2016-12-27') AS date, toWeek(date) AS week0, toWeek(date,1) AS we
└────────────┴───────┴───────┴───────┘
```
## toYearWeek
Returns year and week for a date. The year in the result may be different from the year in the date argument for the first and the last week of the year.
The mode argument works exactly like the mode argument to `toWeek()`. For the single-argument syntax, a mode value of 0 is used.
`toISOYear()` is a compatibility function that is equivalent to `intDiv(toYearWeek(date,3),100)`.
**Syntax**
``` sql
toYearWeek(t[, mode[, timezone]])
```
**Example**
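A query in the style of the `toWeek()` example above (a sketch; output omitted here):
``` sql
SELECT toDate('2016-12-27') AS date, toYearWeek(date) AS yearWeek0, toYearWeek(date, 1) AS yearWeek1
```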

View File

@ -23,7 +23,7 @@ When `OPTIMIZE` is used with the [ReplicatedMergeTree](../../engines/table-engin
- If `OPTIMIZE` does not perform a merge for any reason, it does not notify the client. To enable notifications, use the [optimize_throw_if_noop](../../operations/settings/settings.md#setting-optimize_throw_if_noop) setting.
- If you specify a `PARTITION`, only the specified partition is optimized. [How to set partition expression](alter/partition.md#how-to-set-partition-expression).
- If you specify `FINAL`, optimization is performed even when all the data is already in one part. You can control this behaviour with [optimize_skip_merged_partitions](../../operations/settings/settings.md#optimize-skip-merged-partitions). Also, the merge is forced even if concurrent merges are performed.
- If you specify `DEDUPLICATE`, then completely identical rows (unless a BY clause is specified) will be deduplicated (all columns are compared); this makes sense only for the MergeTree engine.
You can specify how long (in seconds) to wait for inactive replicas to execute `OPTIMIZE` queries by the [replication_wait_for_inactive_replica_timeout](../../operations/settings/settings.md#replication-wait-for-inactive-replica-timeout) setting.

View File

@ -1997,6 +1997,21 @@ SELECT * FROM test_table
Default value: 0.
## optimize_skip_merged_partitions {#optimize-skip-merged-partitions}
Enables or disables the optimization for the [OPTIMIZE TABLE ... FINAL](../../sql-reference/statements/optimize.md) query when there is only one part with level > 0 and it does not have an expired TTL.
- `OPTIMIZE TABLE ... FINAL SETTINGS optimize_skip_merged_partitions=1`
By default, `OPTIMIZE TABLE ... FINAL` rewrites the part even if there is only a single one.
Possible values:
- 1 - Enabled.
- 0 - Disabled.
Default value: 0.
## optimize_functions_to_subcolumns {#optimize-functions-to-subcolumns}
Enables or disables an optimization that transforms some functions into reading subcolumns, thus reducing the amount of data to read.

View File

@ -24,7 +24,7 @@ OPTIMIZE TABLE [db.]name [ON CLUSTER cluster] [PARTITION partition | PARTITION I
- By default, if the `OPTIMIZE` query fails to perform a merge, ClickHouse does not notify the client. To enable notifications, use the [optimize_throw_if_noop](../../operations/settings/settings.md#setting-optimize_throw_if_noop) setting.
- If you specify `PARTITION`, only the specified partition is optimized. [How to set the partition expression](alter/index.md#alter-how-to-specify-part-expr).
- If you specify `FINAL`, optimization is performed even when all the data already lies in one part. This can be controlled with the [optimize_skip_merged_partitions](../../operations/settings/settings.md#optimize-skip-merged-partitions) setting. Also, the merge is forced even if concurrent merges are in progress.
- If you specify `DEDUPLICATE`, then completely identical rows (values in all columns are compared) will be deduplicated; this makes sense only for the MergeTree engine.
You can specify how long (in seconds) to wait for inactive replicas to execute `OPTIMIZE` queries with the [replication_wait_for_inactive_replica_timeout](../../operations/settings/settings.md#replication-wait-for-inactive-replica-timeout) setting.
@ -196,4 +196,5 @@ SELECT * FROM example;
┌─primary_key─┬─secondary_key─┬─value─┬─partition_key─┐
│ 1 │ 1 │ 2 │ 3 │
└─────────────┴───────────────┴───────┴───────────────┘
```

View File

@ -277,7 +277,7 @@ private:
}
if (queries.empty())
throw Exception("Empty list of queries.", ErrorCodes::EMPTY_DATA_PASSED);
throw Exception(ErrorCodes::EMPTY_DATA_PASSED, "Empty list of queries.");
}
else
{

View File

@ -724,7 +724,7 @@ bool Client::processWithFuzzing(const String & full_query)
// uniformity.
// Surprisingly, this is a client exception, because we get the
// server exception w/o throwing (see onReceiveException()).
client_exception = std::make_unique<Exception>(getCurrentExceptionMessageAndPattern(print_stack_trace), getCurrentExceptionCode());
have_error = true;
}
@ -859,7 +859,7 @@ bool Client::processWithFuzzing(const String & full_query)
}
catch (...)
{
client_exception = std::make_unique<Exception>(getCurrentExceptionMessageAndPattern(print_stack_trace), getCurrentExceptionCode());
have_error = true;
}

View File

@ -165,9 +165,8 @@ int mainEntryClickHouseFormat(int argc, char ** argv)
/// should throw exception early and make exception message more readable.
if (const auto * insert_query = res->as<ASTInsertQuery>(); insert_query && insert_query->data)
{
throw Exception(DB::ErrorCodes::INVALID_FORMAT_INSERT_QUERY_WITH_DATA,
"Can't format ASTInsertQuery with data, since data will be lost");
}
if (!quiet)
{

View File

@ -196,7 +196,7 @@ void Keeper::createServer(const std::string & listen_host, const char * port_nam
}
else
{
throw Exception::createDeprecated(message, ErrorCodes::NETWORK_ERROR);
}
}
}
@ -375,7 +375,7 @@ try
if (effective_user_id == 0)
{
message += " Run under 'sudo -u " + data_owner + "'.";
throw Exception::createDeprecated(message, ErrorCodes::MISMATCHING_USERS_FOR_PROCESS_AND_DATA);
}
else
{

View File

@ -243,7 +243,6 @@ ColumnFloat64::MutablePtr CatBoostLibraryHandler::evalImpl(
const ColumnRawPtrs & columns,
bool cat_features_are_strings) const
{
size_t column_size = columns.front()->size();
auto result = ColumnFloat64::create(column_size * tree_count);
@ -265,7 +264,8 @@ ColumnFloat64::MutablePtr CatBoostLibraryHandler::evalImpl(
result_buf, column_size * tree_count))
{
throw Exception(ErrorCodes::CANNOT_APPLY_CATBOOST_MODEL,
"Error occurred while applying CatBoost model: {}", api.GetErrorString());
}
return result;
}
@ -288,7 +288,8 @@ ColumnFloat64::MutablePtr CatBoostLibraryHandler::evalImpl(
cat_features_buf, cat_features_count,
result_buf, column_size * tree_count))
{
throw Exception(ErrorCodes::CANNOT_APPLY_CATBOOST_MODEL,
"Error occurred while applying CatBoost model: {}", api.GetErrorString());
}
}
else
@ -304,7 +305,8 @@ ColumnFloat64::MutablePtr CatBoostLibraryHandler::evalImpl(
cat_features_buf, cat_features_count,
result_buf, column_size * tree_count))
{
throw Exception(ErrorCodes::CANNOT_APPLY_CATBOOST_MODEL,
"Error occurred while applying CatBoost model: {}", api.GetErrorString());
}
}

View File

@ -416,7 +416,7 @@ void Server::createServer(
}
else
{
throw Exception::createDeprecated(message, ErrorCodes::NETWORK_ERROR);
}
}
}
@ -946,7 +946,7 @@ try
if (effective_user_id == 0)
{
message += " Run under 'sudo -u " + data_owner + "'.";
throw Exception::createDeprecated(message, ErrorCodes::MISMATCHING_USERS_FOR_PROCESS_AND_DATA);
}
else
{

View File

@ -334,6 +334,19 @@
<max_thread_pool_size>10000</max_thread_pool_size>
<!-- Configure other thread pools: -->
<!--
<background_buffer_flush_schedule_pool_size>16</background_buffer_flush_schedule_pool_size>
<background_pool_size>16</background_pool_size>
<background_merges_mutations_concurrency_ratio>2</background_merges_mutations_concurrency_ratio>
<background_move_pool_size>8</background_move_pool_size>
<background_fetches_pool_size>8</background_fetches_pool_size>
<background_common_pool_size>8</background_common_pool_size>
<background_schedule_pool_size>128</background_schedule_pool_size>
<background_message_broker_schedule_pool_size>16</background_message_broker_schedule_pool_size>
<background_distributed_schedule_pool_size>16</background_distributed_schedule_pool_size>
-->
<!-- Number of workers to recycle connections in background (see also drain_timeout).
If the pool is full, connection will be drained synchronously. -->
<!-- <max_threads_for_connection_collector>10</max_threads_for_connection_collector> -->

View File

@ -79,9 +79,7 @@ AuthenticationData::Digest AuthenticationData::Util::encodeSHA256(std::string_vi
::DB::encodeSHA256(text, hash.data());
return hash;
#else
throw DB::Exception(DB::ErrorCodes::SUPPORT_IS_DISABLED, "SHA256 passwords support is disabled, because ClickHouse was built without SSL library");
#endif
}

View File

@ -484,13 +484,15 @@ bool ContextAccess::checkAccessImplHelper(AccessFlags flags, const Args &... arg
return true;
};
auto access_denied = [&]<typename... FmtArgs>(int error_code [[maybe_unused]],
FormatStringHelper<String, FmtArgs...> fmt_string [[maybe_unused]],
FmtArgs && ...fmt_args [[maybe_unused]])
{
if (trace_log)
LOG_TRACE(trace_log, "Access denied: {}{}", (AccessRightsElement{flags, args...}.toStringWithoutOptions()),
(grant_option ? " WITH GRANT OPTION" : ""));
if constexpr (throw_if_denied)
throw Exception(getUserName() + ": " + error_msg, error_code);
throw Exception(error_code, std::move(fmt_string), getUserName(), std::forward<FmtArgs>(fmt_args)...);
return false;
};
@ -519,18 +521,16 @@ bool ContextAccess::checkAccessImplHelper(AccessFlags flags, const Args &... arg
{
if (grant_option && acs->isGranted(flags, args...))
{
return access_denied(ErrorCodes::ACCESS_DENIED,
"{}: Not enough privileges. "
"The required privileges have been granted, but without grant option. "
"To execute this query it's necessary to have grant "
+ AccessRightsElement{flags, args...}.toStringWithoutOptions() + " WITH GRANT OPTION",
ErrorCodes::ACCESS_DENIED);
"To execute this query it's necessary to have grant {} WITH GRANT OPTION",
AccessRightsElement{flags, args...}.toStringWithoutOptions());
}
return access_denied(ErrorCodes::ACCESS_DENIED,
"{}: Not enough privileges. To execute this query it's necessary to have grant {}",
AccessRightsElement{flags, args...}.toStringWithoutOptions() + (grant_option ? " WITH GRANT OPTION" : ""));
}
struct PrecalculatedFlags
@ -557,32 +557,34 @@ bool ContextAccess::checkAccessImplHelper(AccessFlags flags, const Args &... arg
if (params.readonly)
{
if constexpr (grant_option)
return access_denied("Cannot change grants in readonly mode.", ErrorCodes::READONLY);
return access_denied(ErrorCodes::READONLY, "{}: Cannot change grants in readonly mode.");
if ((flags & precalc.not_readonly_flags) ||
((params.readonly == 1) && (flags & precalc.not_readonly_1_flags)))
{
if (params.interface == ClientInfo::Interface::HTTP && params.http_method == ClientInfo::HTTPMethod::GET)
{
return access_denied(ErrorCodes::READONLY,
"{}: Cannot execute query in readonly mode. "
"For queries over HTTP, method GET implies readonly. "
"You should use method POST for modifying queries");
}
else
return access_denied("Cannot execute query in readonly mode", ErrorCodes::READONLY);
return access_denied(ErrorCodes::READONLY, "{}: Cannot execute query in readonly mode");
}
}
if (!params.allow_ddl && !grant_option)
{
if (flags & precalc.ddl_flags)
return access_denied("Cannot execute query. DDL queries are prohibited for the user", ErrorCodes::QUERY_IS_PROHIBITED);
return access_denied(ErrorCodes::QUERY_IS_PROHIBITED,
"Cannot execute query. DDL queries are prohibited for the user {}");
}
if (!params.allow_introspection && !grant_option)
{
if (flags & precalc.introspection_flags)
return access_denied("Introspection functions are disabled, because setting 'allow_introspection_functions' is set to 0", ErrorCodes::FUNCTION_NOT_ALLOWED);
return access_denied(ErrorCodes::FUNCTION_NOT_ALLOWED, "{}: Introspection functions are disabled, "
"because setting 'allow_introspection_functions' is set to 0");
}
return access_granted();
@ -679,11 +681,13 @@ void ContextAccess::checkGrantOption(const AccessRightsElements & elements) cons
template <bool throw_if_denied, typename Container, typename GetNameFunction>
bool ContextAccess::checkAdminOptionImplHelper(const Container & role_ids, const GetNameFunction & get_name_function) const
{
auto show_error = []<typename... FmtArgs>(int error_code [[maybe_unused]],
FormatStringHelper<FmtArgs...> fmt_string [[maybe_unused]],
FmtArgs && ...fmt_args [[maybe_unused]])
{
if constexpr (throw_if_denied)
throw Exception(getUserName() + ": " + msg, error_code);
throw Exception(error_code, std::move(fmt_string), std::forward<FmtArgs>(fmt_args)...);
return false;
};
if (is_full_access)
@ -691,7 +695,7 @@ bool ContextAccess::checkAdminOptionImplHelper(const Container & role_ids, const
if (user_was_dropped)
{
show_error("User has been dropped", ErrorCodes::UNKNOWN_USER);
show_error(ErrorCodes::UNKNOWN_USER, "User has been dropped");
return false;
}
@ -716,14 +720,15 @@ bool ContextAccess::checkAdminOptionImplHelper(const Container & role_ids, const
role_name = "ID {" + toString(role_id) + "}";
if (info->enabled_roles.count(role_id))
show_error("Not enough privileges. "
"Role " + backQuote(*role_name) + " is granted, but without ADMIN option. "
"To execute this query it's necessary to have the role " + backQuoteIfNeed(*role_name) + " granted with ADMIN option.",
ErrorCodes::ACCESS_DENIED);
show_error(ErrorCodes::ACCESS_DENIED,
"Not enough privileges. "
"Role {} is granted, but without ADMIN option. "
"To execute this query it's necessary to have the role {} granted with ADMIN option.",
backQuote(*role_name), backQuoteIfNeed(*role_name));
else
show_error("Not enough privileges. "
"To execute this query it's necessary to have the role " + backQuoteIfNeed(*role_name) + " granted with ADMIN option.",
ErrorCodes::ACCESS_DENIED);
show_error(ErrorCodes::ACCESS_DENIED, "Not enough privileges. "
"To execute this query it's necessary to have the role {} granted with ADMIN option.",
backQuoteIfNeed(*role_name));
}
return false;

View File

@ -81,7 +81,7 @@ void KerberosInit::init(const String & keytab_file, const String & principal, co
{
ret = krb5_cc_resolve(k5.ctx, cache_name.c_str(), &k5.out_cc);
if (ret)
throw Exception(ErrorCodes::KERBEROS_ERROR, "Error in resolving cache{}", fmtError(ret));
throw Exception(ErrorCodes::KERBEROS_ERROR, "Error in resolving cache: {}", fmtError(ret));
LOG_TRACE(log,"Resolved cache");
}
else
@ -89,7 +89,7 @@ void KerberosInit::init(const String & keytab_file, const String & principal, co
// Resolve the default cache and get its type and default principal (if it is initialized).
ret = krb5_cc_default(k5.ctx, &defcache);
if (ret)
throw Exception(ErrorCodes::KERBEROS_ERROR, "Error while getting default cache{}", fmtError(ret));
throw Exception(ErrorCodes::KERBEROS_ERROR, "Error while getting default cache: {}", fmtError(ret));
LOG_TRACE(log,"Resolved default cache");
deftype = krb5_cc_get_type(k5.ctx, defcache);
if (krb5_cc_get_principal(k5.ctx, defcache, &defcache_princ) != 0)
@ -99,7 +99,7 @@ void KerberosInit::init(const String & keytab_file, const String & principal, co
// Use the specified principal name.
ret = krb5_parse_name_flags(k5.ctx, principal.c_str(), 0, &k5.me);
if (ret)
throw Exception(ErrorCodes::KERBEROS_ERROR, "Error when parsing principal name {}", principal + fmtError(ret));
throw Exception(ErrorCodes::KERBEROS_ERROR, "Error when parsing principal name ({}): {}", principal, fmtError(ret));
// Cache related commands
if (k5.out_cc == nullptr && krb5_cc_support_switch(k5.ctx, deftype))
@ -107,7 +107,7 @@ void KerberosInit::init(const String & keytab_file, const String & principal, co
// Use an existing cache for the client principal if we can.
ret = krb5_cc_cache_match(k5.ctx, k5.me, &k5.out_cc);
if (ret && ret != KRB5_CC_NOTFOUND)
throw Exception(ErrorCodes::KERBEROS_ERROR, "Error while searching for cache for {}", principal + fmtError(ret));
throw Exception(ErrorCodes::KERBEROS_ERROR, "Error while searching for cache for ({}): {}", principal, fmtError(ret));
if (0 == ret)
{
LOG_TRACE(log,"Using default cache: {}", krb5_cc_get_name(k5.ctx, k5.out_cc));
@ -118,7 +118,7 @@ void KerberosInit::init(const String & keytab_file, const String & principal, co
// Create a new cache to avoid overwriting the initialized default cache.
ret = krb5_cc_new_unique(k5.ctx, deftype, nullptr, &k5.out_cc);
if (ret)
throw Exception(ErrorCodes::KERBEROS_ERROR, "Error while generating new cache{}", fmtError(ret));
throw Exception(ErrorCodes::KERBEROS_ERROR, "Error while generating new cache: {}", fmtError(ret));
LOG_TRACE(log,"Using default cache: {}", krb5_cc_get_name(k5.ctx, k5.out_cc));
k5.switch_to_cache = 1;
}
@ -134,24 +134,24 @@ void KerberosInit::init(const String & keytab_file, const String & principal, co
ret = krb5_unparse_name(k5.ctx, k5.me, &k5.name);
if (ret)
throw Exception(ErrorCodes::KERBEROS_ERROR, "Error when unparsing name{}", fmtError(ret));
throw Exception(ErrorCodes::KERBEROS_ERROR, "Error when unparsing name: {}", fmtError(ret));
LOG_TRACE(log,"Using principal: {}", k5.name);
// Allocate a new initial credential options structure.
ret = krb5_get_init_creds_opt_alloc(k5.ctx, &options);
if (ret)
throw Exception(ErrorCodes::KERBEROS_ERROR, "Error in options allocation{}", fmtError(ret));
throw Exception(ErrorCodes::KERBEROS_ERROR, "Error in options allocation: {}", fmtError(ret));
// Resolve keytab
ret = krb5_kt_resolve(k5.ctx, keytab_file.c_str(), &keytab);
if (ret)
throw Exception(ErrorCodes::KERBEROS_ERROR, "Error in resolving keytab {}{}", keytab_file, fmtError(ret));
throw Exception(ErrorCodes::KERBEROS_ERROR, "Error in resolving keytab ({}): {}", keytab_file, fmtError(ret));
LOG_TRACE(log,"Using keytab: {}", keytab_file);
// Set an output credential cache in initial credential options.
ret = krb5_get_init_creds_opt_set_out_ccache(k5.ctx, options, k5.out_cc);
if (ret)
throw Exception(ErrorCodes::KERBEROS_ERROR, "Error in setting output credential cache{}", fmtError(ret));
throw Exception(ErrorCodes::KERBEROS_ERROR, "Error in setting output credential cache: {}", fmtError(ret));
// Action: init or renew
LOG_TRACE(log,"Trying to renew credentials");
@ -165,7 +165,7 @@ void KerberosInit::init(const String & keytab_file, const String & principal, co
// Request KDC for an initial credentials using keytab.
ret = krb5_get_init_creds_keytab(k5.ctx, &my_creds, k5.me, keytab, 0, nullptr, options);
if (ret)
throw Exception(ErrorCodes::KERBEROS_ERROR, "Error in getting initial credentials{}", fmtError(ret));
throw Exception(ErrorCodes::KERBEROS_ERROR, "Error in getting initial credentials: {}", fmtError(ret));
else
LOG_TRACE(log,"Got initial credentials");
}
@ -175,7 +175,7 @@ void KerberosInit::init(const String & keytab_file, const String & principal, co
// Initialize a credential cache. Destroy any existing contents of cache and initialize it for the default principal.
ret = krb5_cc_initialize(k5.ctx, k5.out_cc, k5.me);
if (ret)
throw Exception(ErrorCodes::KERBEROS_ERROR, "Error when initializing cache{}", fmtError(ret));
throw Exception(ErrorCodes::KERBEROS_ERROR, "Error when initializing cache: {}", fmtError(ret));
LOG_TRACE(log,"Initialized cache");
// Store credentials in a credential cache.
ret = krb5_cc_store_cred(k5.ctx, k5.out_cc, &my_creds);
@ -189,7 +189,7 @@ void KerberosInit::init(const String & keytab_file, const String & principal, co
// Make a credential cache the primary cache for its collection.
ret = krb5_cc_switch(k5.ctx, k5.out_cc);
if (ret)
throw Exception(ErrorCodes::KERBEROS_ERROR, "Error while switching to new cache{}", fmtError(ret));
throw Exception(ErrorCodes::KERBEROS_ERROR, "Error while switching to new cache: {}", fmtError(ret));
}
LOG_TRACE(log,"Authenticated to Kerberos v5");

View File

@ -205,7 +205,7 @@ void LDAPClient::handleError(int result_code, String text)
}
}
throw Exception::createDeprecated(text, ErrorCodes::LDAP_ERROR);
}
}
@ -569,7 +569,7 @@ LDAPClient::SearchResults LDAPClient::search(const SearchParams & search_params)
message += matched_msg;
}
throw Exception::createDeprecated(message, ErrorCodes::LDAP_ERROR);
}
break;

View File

@ -266,7 +266,7 @@ bool SettingsConstraints::Checker::check(SettingChange & change, const Field & n
if (!explain.empty())
{
if (reaction == THROW_ON_VIOLATION)
throw Exception::createDeprecated(explain, code);
else
return false;
}

View File

@ -106,7 +106,7 @@ public:
default:
throw Exception(
ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT,
"Map key type " + key_type->getName() + " is not is not supported by combinator " + getName());
"Map key type {} is not is not supported by combinator {}", key_type->getName(), getName());
}
}
else

View File

@ -66,13 +66,13 @@ public:
, kind(kind_)
{
if (!isNativeNumber(arguments[0]))
throw Exception{getName() + ": first argument must be represented by integer", ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT};
throw Exception(ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT, "{}: first argument must be represented by integer", getName());
if (!isNativeNumber(arguments[1]))
throw Exception{getName() + ": second argument must be represented by integer", ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT};
throw Exception(ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT, "{}: second argument must be represented by integer", getName());
if (!arguments[0]->equals(*arguments[1]))
throw Exception{getName() + ": arguments must have the same type", ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT};
throw Exception(ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT, "{}: arguments must have the same type", getName());
}
String getName() const override

View File

@ -88,9 +88,9 @@ createAggregateFunctionSequenceNode(const std::string & name, const DataTypes &
name, toString(min_required_args + 1));
if (argument_types.size() > max_events_size + min_required_args)
throw Exception(ErrorCodes::NUMBER_OF_ARGUMENTS_DOESNT_MATCH,
"Aggregate function '{}' requires at most {} (timestamp, value_column, ...{} events) arguments.",
name, max_events_size + min_required_args, max_events_size);
if (const auto * cond_arg = argument_types[2].get(); cond_arg && !isUInt8(cond_arg))
throw Exception(ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT, "Illegal type {} of third argument of aggregate function {}, "
@ -100,9 +100,8 @@ createAggregateFunctionSequenceNode(const std::string & name, const DataTypes &
{
const auto * cond_arg = argument_types[i].get();
if (!isUInt8(cond_arg))
throw Exception(ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT,
"Illegal type '{}' of {} argument of aggregate function '{}', must be UInt8", cond_arg->getName(), i + 1, name);
}
if (WhichDataType(argument_types[1].get()).idx != TypeIndex::String)

View File

@ -235,7 +235,7 @@ private:
if (skip_degree_ == skip_degree)
return;
if (skip_degree_ > detail::MAX_SKIP_DEGREE)
throw DB::Exception{"skip_degree exceeds maximum value", DB::ErrorCodes::MEMORY_LIMIT_EXCEEDED};
throw DB::Exception(DB::ErrorCodes::MEMORY_LIMIT_EXCEEDED, "skip_degree exceeds maximum value");
skip_degree = skip_degree_;
if (skip_degree == detail::MAX_SKIP_DEGREE)
skip_mask = static_cast<UInt32>(-1);

View File

@ -1,14 +1,15 @@
#pragma once
#include <optional>
#include <utility>
#include <Common/SettingsChanges.h>
#include <base/scope_guard.h>
#include <Common/Exception.h>
#include <Core/Settings.h>
#include <Analyzer/IQueryTreeNode.h>
#include <Analyzer/QueryNode.h>
#include <Analyzer/UnionNode.h>
#include <Interpreters/Context.h>
namespace DB
{
@ -89,4 +90,134 @@ private:
template <typename Derived>
using ConstInDepthQueryTreeVisitor = InDepthQueryTreeVisitor<Derived, true /*const_visitor*/>;
/** Same as InDepthQueryTreeVisitor and additionally keeps track of current scope context.
* This can be useful if your visitor has special logic that depends on current scope context.
*/
template <typename Derived, bool const_visitor = false>
class InDepthQueryTreeVisitorWithContext
{
public:
using VisitQueryTreeNodeType = std::conditional_t<const_visitor, const QueryTreeNodePtr, QueryTreeNodePtr>;
explicit InDepthQueryTreeVisitorWithContext(ContextPtr context)
: current_context(std::move(context))
{}
/// Return true if visitor should traverse tree top to bottom, false otherwise
bool shouldTraverseTopToBottom() const
{
return true;
}
/// Return true if visitor should visit child, false otherwise
bool needChildVisit(VisitQueryTreeNodeType & parent [[maybe_unused]], VisitQueryTreeNodeType & child [[maybe_unused]])
{
return true;
}
const ContextPtr & getContext() const
{
return current_context;
}
const Settings & getSettings() const
{
return current_context->getSettingsRef();
}
void visit(VisitQueryTreeNodeType & query_tree_node)
{
auto current_scope_context_ptr = current_context;
SCOPE_EXIT(
current_context = std::move(current_scope_context_ptr);
);
if (auto * query_node = query_tree_node->template as<QueryNode>())
current_context = query_node->getContext();
else if (auto * union_node = query_tree_node->template as<UnionNode>())
current_context = union_node->getContext();
bool traverse_top_to_bottom = getDerived().shouldTraverseTopToBottom();
if (!traverse_top_to_bottom)
visitChildren(query_tree_node);
getDerived().visitImpl(query_tree_node);
if (traverse_top_to_bottom)
visitChildren(query_tree_node);
}
private:
Derived & getDerived()
{
return *static_cast<Derived *>(this);
}
const Derived & getDerived() const
{
return *static_cast<Derived *>(this);
}
void visitChildren(VisitQueryTreeNodeType & expression)
{
for (auto & child : expression->getChildren())
{
if (!child)
continue;
bool need_visit_child = getDerived().needChildVisit(expression, child);
if (need_visit_child)
visit(child);
}
}
ContextPtr current_context;
};
template <typename Derived>
using ConstInDepthQueryTreeVisitorWithContext = InDepthQueryTreeVisitorWithContext<Derived, true /*const_visitor*/>;
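A minimal usage sketch of this base class (the pass name, the setting name, and the function names are hypothetical; the Analyzer types are the ones used by the passes rewritten in this commit):

``` cpp
/// Hypothetical pass visitor built on InDepthQueryTreeVisitorWithContext.
class HypotheticalRenameFunctionVisitor : public InDepthQueryTreeVisitorWithContext<HypotheticalRenameFunctionVisitor>
{
public:
    using Base = InDepthQueryTreeVisitorWithContext<HypotheticalRenameFunctionVisitor>;
    using Base::Base; /// inherits the ContextPtr constructor

    void visitImpl(QueryTreeNodePtr & node)
    {
        /// Settings of the nearest enclosing query/union scope are available directly.
        if (!getSettings().optimize_hypothetical_rename) /// hypothetical setting
            return;

        auto * function_node = node->as<FunctionNode>();
        if (!function_node || function_node->getFunctionName() != "oldName")
            return;

        /// current_context is kept pointing at the enclosing scope while descending.
        auto function = FunctionFactory::instance().get("newName", getContext());
        function_node->resolveAsFunction(function->build(function_node->getArgumentColumns()));
    }
};
```

Compared with the plain InDepthQueryTreeVisitor, a pass no longer needs to store its own ContextPtr member or inspect QueryNode/UnionNode manually, which is exactly the boilerplate the rewrites below delete.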
/** Visitor that uses another visitor to visit a node only if the condition for visiting the node is true.
* For example, your visitor needs to visit only query tree nodes or union nodes.
*
* Condition interface:
* struct Condition
* {
* bool operator()(VisitQueryTreeNodeType & node)
* {
* return shouldNestedVisitorVisitNode(node);
* }
* }
*/
template <typename Visitor, typename Condition, bool const_visitor = false>
class InDepthQueryTreeConditionalVisitor : public InDepthQueryTreeVisitor<InDepthQueryTreeConditionalVisitor<Visitor, Condition, const_visitor>, const_visitor>
{
public:
using Base = InDepthQueryTreeVisitor<InDepthQueryTreeConditionalVisitor<Visitor, Condition, const_visitor>, const_visitor>;
using VisitQueryTreeNodeType = typename Base::VisitQueryTreeNodeType;
explicit InDepthQueryTreeConditionalVisitor(Visitor & visitor_, Condition & condition_)
: visitor(visitor_)
, condition(condition_)
{
}
bool shouldTraverseTopToBottom() const
{
return visitor.shouldTraverseTopToBottom();
}
void visitImpl(VisitQueryTreeNodeType & query_tree_node)
{
if (condition(query_tree_node))
visitor.visit(query_tree_node);
}
Visitor & visitor;
Condition & condition;
};
template <typename Visitor, typename Condition>
using ConstInDepthQueryTreeConditionalVisitor = InDepthQueryTreeConditionalVisitor<Visitor, Condition, true /*const_visitor*/>;
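A sketch of the Condition interface described above (the condition and visitor names are hypothetical):

``` cpp
/// Hypothetical condition: let the nested visitor see only query and union nodes.
struct QueryOrUnionCondition
{
    bool operator()(QueryTreeNodePtr & node)
    {
        auto node_type = node->getNodeType();
        return node_type == QueryTreeNodeType::QUERY || node_type == QueryTreeNodeType::UNION;
    }
};

/// Usage: wrap an existing visitor, then traverse as usual.
///   SomeVisitor visitor(...);
///   QueryOrUnionCondition condition;
///   InDepthQueryTreeConditionalVisitor<SomeVisitor, QueryOrUnionCondition> conditional(visitor, condition);
///   conditional.visit(query_tree_node);
```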
}

View File

@ -45,12 +45,11 @@ Field zeroField(const Field & value)
* TODO: Support `groupBitAnd`, `groupBitOr`, `groupBitXor` functions.
* TODO: Support rewrite `f((2 * n) * n)` into '2 * f(n * n)'.
*/
class AggregateFunctionsArithmericOperationsVisitor : public InDepthQueryTreeVisitor<AggregateFunctionsArithmericOperationsVisitor>
class AggregateFunctionsArithmericOperationsVisitor : public InDepthQueryTreeVisitorWithContext<AggregateFunctionsArithmericOperationsVisitor>
{
public:
explicit AggregateFunctionsArithmericOperationsVisitor(ContextPtr context_)
: context(std::move(context_))
{}
using Base = InDepthQueryTreeVisitorWithContext<AggregateFunctionsArithmericOperationsVisitor>;
using Base::Base;
/// Traverse tree bottom to top
static bool shouldTraverseTopToBottom()
@ -60,6 +59,9 @@ public:
void visitImpl(QueryTreeNodePtr & node)
{
if (!getSettings().optimize_arithmetic_operations_in_aggregate_functions)
return;
auto * aggregate_function_node = node->as<FunctionNode>();
if (!aggregate_function_node || !aggregate_function_node->isAggregateFunction())
return;
@ -175,7 +177,7 @@ private:
inline void resolveOrdinaryFunctionNode(FunctionNode & function_node, const String & function_name) const
{
auto function = FunctionFactory::instance().get(function_name, context);
auto function = FunctionFactory::instance().get(function_name, getContext());
function_node.resolveAsFunction(function->build(function_node.getArgumentColumns()));
}
@ -191,8 +193,6 @@ private:
function_node.resolveAsAggregateFunction(std::move(aggregate_function));
}
ContextPtr context;
};
}

View File

@ -1,17 +1,23 @@
#include <Analyzer/Passes/ConvertOrLikeChainPass.h>
#include <memory>
#include <unordered_map>
#include <vector>
#include <Analyzer/Passes/ConvertOrLikeChainPass.h>
#include <Core/Field.h>
#include <DataTypes/DataTypesNumber.h>
#include <Functions/FunctionFactory.h>
#include <Functions/likePatternToRegexp.h>
#include <Interpreters/Context.h>
#include <Analyzer/ConstantNode.h>
#include <Analyzer/UnionNode.h>
#include <Analyzer/FunctionNode.h>
#include <Analyzer/HashUtils.h>
#include <Analyzer/InDepthQueryTreeVisitor.h>
#include <Core/Field.h>
#include <DataTypes/DataTypesNumber.h>
#include <Functions/FunctionFactory.h>
#include <Functions/likePatternToRegexp.h>
#include <Interpreters/Context.h>
namespace DB
{
@ -19,36 +25,28 @@ namespace DB
namespace
{
class ConvertOrLikeChainVisitor : public InDepthQueryTreeVisitor<ConvertOrLikeChainVisitor>
class ConvertOrLikeChainVisitor : public InDepthQueryTreeVisitorWithContext<ConvertOrLikeChainVisitor>
{
using FunctionNodes = std::vector<std::shared_ptr<FunctionNode>>;
const FunctionOverloadResolverPtr match_function_ref;
const FunctionOverloadResolverPtr or_function_resolver;
public:
using Base = InDepthQueryTreeVisitorWithContext<ConvertOrLikeChainVisitor>;
using Base::Base;
explicit ConvertOrLikeChainVisitor(ContextPtr context)
: InDepthQueryTreeVisitor<ConvertOrLikeChainVisitor>()
, match_function_ref(FunctionFactory::instance().get("multiMatchAny", context))
, or_function_resolver(FunctionFactory::instance().get("or", context))
explicit ConvertOrLikeChainVisitor(FunctionOverloadResolverPtr or_function_resolver_,
FunctionOverloadResolverPtr match_function_resolver_,
ContextPtr context)
: Base(std::move(context))
, or_function_resolver(std::move(or_function_resolver_))
, match_function_resolver(std::move(match_function_resolver_))
{}
static bool needChildVisit(VisitQueryTreeNodeType & parent, VisitQueryTreeNodeType &)
bool needChildVisit(VisitQueryTreeNodeType &, VisitQueryTreeNodeType &)
{
ContextPtr context;
if (auto * query = parent->as<QueryNode>())
context = query->getContext();
else if (auto * union_node = parent->as<UnionNode>())
context = union_node->getContext();
if (context)
{
const auto & settings = context->getSettingsRef();
return settings.optimize_or_like_chain
&& settings.allow_hyperscan
&& settings.max_hyperscan_regexp_length == 0
&& settings.max_hyperscan_regexp_total_length == 0;
}
return true;
const auto & settings = getSettings();
return settings.optimize_or_like_chain
&& settings.allow_hyperscan
&& settings.max_hyperscan_regexp_length == 0
&& settings.max_hyperscan_regexp_total_length == 0;
}
void visitImpl(QueryTreeNodePtr & node)
@ -61,27 +59,28 @@ public:
QueryTreeNodePtrWithHashMap<Array> node_to_patterns;
FunctionNodes match_functions;
for (auto & arg : function_node->getArguments())
{
unique_elems.push_back(arg);
auto * arg_func = arg->as<FunctionNode>();
if (!arg_func)
for (auto & argument : function_node->getArguments())
{
unique_elems.push_back(argument);
auto * argument_function = argument->as<FunctionNode>();
if (!argument_function)
continue;
const bool is_like = arg_func->getFunctionName() == "like";
const bool is_ilike = arg_func->getFunctionName() == "ilike";
const bool is_like = argument_function->getFunctionName() == "like";
const bool is_ilike = argument_function->getFunctionName() == "ilike";
/// Not {i}like -> bail out.
if (!is_like && !is_ilike)
continue;
const auto & like_arguments = arg_func->getArguments().getNodes();
const auto & like_arguments = argument_function->getArguments().getNodes();
if (like_arguments.size() != 2)
continue;
auto identifier = like_arguments[0];
auto * pattern = like_arguments[1]->as<ConstantNode>();
const auto & like_first_argument = like_arguments[0];
const auto * pattern = like_arguments[1]->as<ConstantNode>();
if (!pattern || !isString(pattern->getResultType()))
continue;
@ -91,17 +90,20 @@ public:
regexp = "(?i)" + regexp;
unique_elems.pop_back();
auto it = node_to_patterns.find(identifier);
auto it = node_to_patterns.find(like_first_argument);
if (it == node_to_patterns.end())
{
it = node_to_patterns.insert({identifier, Array{}}).first;
it = node_to_patterns.insert({like_first_argument, Array{}}).first;
/// The second argument will be added when all patterns are known.
auto match_function = std::make_shared<FunctionNode>("multiMatchAny");
match_function->getArguments().getNodes().push_back(identifier);
match_function->getArguments().getNodes().push_back(like_first_argument);
match_functions.push_back(match_function);
unique_elems.push_back(std::move(match_function));
}
it->second.push_back(regexp);
}
@ -111,23 +113,29 @@ public:
auto & arguments = match_function->getArguments().getNodes();
auto & patterns = node_to_patterns.at(arguments[0]);
arguments.push_back(std::make_shared<ConstantNode>(Field{std::move(patterns)}));
match_function->resolveAsFunction(match_function_ref);
match_function->resolveAsFunction(match_function_resolver);
}
/// OR must have at least two arguments.
if (unique_elems.size() == 1)
unique_elems.push_back(std::make_shared<ConstantNode>(false));
unique_elems.push_back(std::make_shared<ConstantNode>(static_cast<UInt8>(0)));
function_node->getArguments().getNodes() = std::move(unique_elems);
function_node->resolveAsFunction(or_function_resolver);
}
private:
using FunctionNodes = std::vector<std::shared_ptr<FunctionNode>>;
const FunctionOverloadResolverPtr or_function_resolver;
const FunctionOverloadResolverPtr match_function_resolver;
};
}
void ConvertOrLikeChainPass::run(QueryTreeNodePtr query_tree_node, ContextPtr context)
{
ConvertOrLikeChainVisitor visitor(context);
auto or_function_resolver = FunctionFactory::instance().get("or", context);
auto match_function_resolver = FunctionFactory::instance().get("multiMatchAny", context);
ConvertOrLikeChainVisitor visitor(std::move(or_function_resolver), std::move(match_function_resolver), std::move(context));
visitor.visit(query_tree_node);
}
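For intuition about what this pass produces, here is a simplified sketch of the LIKE-pattern-to-regexp translation it relies on. The real implementation is likePatternToRegexp, which also escapes regexp metacharacters; this version is illustrative only:

``` cpp
#include <string>

/// Simplified: '%' matches any sequence, '_' matches one character; anchors are
/// added because LIKE matches the whole string.
std::string simplifiedLikeToRegexp(const std::string & pattern, bool case_insensitive)
{
    std::string regexp = "^";
    for (char c : pattern)
    {
        if (c == '%')
            regexp += ".*";
        else if (c == '_')
            regexp += '.';
        else
            regexp += c; /// the real code also escapes regexp metacharacters here
    }
    regexp += '$';
    if (case_insensitive)
        regexp = "(?i)" + regexp; /// same prefix the visitor prepends for ilike
    return regexp;
}

/// So a chain like `x LIKE 'abc%' OR x ILIKE '%def'` collapses into a single
/// multiMatchAny(x, ['^abc.*$', '(?i)^.*def$']) call built by the pass above.
```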

View File

@ -16,11 +16,17 @@ namespace DB
namespace
{
class CountDistinctVisitor : public InDepthQueryTreeVisitor<CountDistinctVisitor>
class CountDistinctVisitor : public InDepthQueryTreeVisitorWithContext<CountDistinctVisitor>
{
public:
static void visitImpl(QueryTreeNodePtr & node)
using Base = InDepthQueryTreeVisitorWithContext<CountDistinctVisitor>;
using Base::Base;
void visitImpl(QueryTreeNodePtr & node)
{
if (!getSettings().count_distinct_optimization)
return;
auto * query_node = node->as<QueryNode>();
/// Check that query has only SELECT clause
@ -78,9 +84,9 @@ public:
}
void CountDistinctPass::run(QueryTreeNodePtr query_tree_node, ContextPtr)
void CountDistinctPass::run(QueryTreeNodePtr query_tree_node, ContextPtr context)
{
CountDistinctVisitor visitor;
CountDistinctVisitor visitor(std::move(context));
visitor.visit(query_tree_node);
}

View File

@ -16,12 +16,11 @@ namespace DB
namespace
{
class CustomizeFunctionsVisitor : public InDepthQueryTreeVisitor<CustomizeFunctionsVisitor>
class CustomizeFunctionsVisitor : public InDepthQueryTreeVisitorWithContext<CustomizeFunctionsVisitor>
{
public:
explicit CustomizeFunctionsVisitor(ContextPtr & context_)
: context(context_)
{}
using Base = InDepthQueryTreeVisitorWithContext<CustomizeFunctionsVisitor>;
using Base::Base;
void visitImpl(QueryTreeNodePtr & node) const
{
@ -29,7 +28,7 @@ public:
if (!function_node)
return;
const auto & settings = context->getSettingsRef();
const auto & settings = getSettings();
/// After successful function replacement function name and function name lowercase must be recalculated
auto function_name = function_node->getFunctionName();
@ -154,19 +153,16 @@ public:
inline void resolveOrdinaryFunctionNode(FunctionNode & function_node, const String & function_name) const
{
auto function = FunctionFactory::instance().get(function_name, context);
auto function = FunctionFactory::instance().get(function_name, getContext());
function_node.resolveAsFunction(function->build(function_node.getArgumentColumns()));
}
private:
ContextPtr & context;
};
}
void CustomizeFunctionsPass::run(QueryTreeNodePtr query_tree_node, ContextPtr context)
{
CustomizeFunctionsVisitor visitor(context);
CustomizeFunctionsVisitor visitor(std::move(context));
visitor.visit(query_tree_node);
}

View File

@ -22,15 +22,17 @@ namespace DB
namespace
{
class FunctionToSubcolumnsVisitor : public InDepthQueryTreeVisitor<FunctionToSubcolumnsVisitor>
class FunctionToSubcolumnsVisitor : public InDepthQueryTreeVisitorWithContext<FunctionToSubcolumnsVisitor>
{
public:
explicit FunctionToSubcolumnsVisitor(ContextPtr & context_)
: context(context_)
{}
using Base = InDepthQueryTreeVisitorWithContext<FunctionToSubcolumnsVisitor>;
using Base::Base;
void visitImpl(QueryTreeNodePtr & node) const
{
if (!getSettings().optimize_functions_to_subcolumns)
return;
auto * function_node = node->as<FunctionNode>();
if (!function_node)
return;
@ -192,11 +194,9 @@ public:
private:
inline void resolveOrdinaryFunctionNode(FunctionNode & function_node, const String & function_name) const
{
auto function = FunctionFactory::instance().get(function_name, context);
auto function = FunctionFactory::instance().get(function_name, getContext());
function_node.resolveAsFunction(function->build(function_node.getArgumentColumns()));
}
ContextPtr & context;
};
}

View File

@ -5,7 +5,7 @@
namespace DB
{
/** Transform functions to subcolumns.
/** Transform functions to subcolumns. Enabled using setting optimize_functions_to_subcolumns.
* It can help to reduce amount of read data.
*
* Example: SELECT tupleElement(column, subcolumn) FROM test_table;

View File

@ -26,16 +26,22 @@ namespace ErrorCodes
namespace
{
class FuseFunctionsVisitor : public InDepthQueryTreeVisitor<FuseFunctionsVisitor>
class FuseFunctionsVisitor : public InDepthQueryTreeVisitorWithContext<FuseFunctionsVisitor>
{
public:
using Base = InDepthQueryTreeVisitorWithContext<FuseFunctionsVisitor>;
using Base::Base;
explicit FuseFunctionsVisitor(const std::unordered_set<String> names_to_collect_)
: names_to_collect(names_to_collect_)
explicit FuseFunctionsVisitor(const std::unordered_set<String> names_to_collect_, ContextPtr context)
: Base(std::move(context))
, names_to_collect(names_to_collect_)
{}
void visitImpl(QueryTreeNodePtr & node)
{
if (!getSettings().optimize_syntax_fuse_functions)
return;
auto * function_node = node->as<FunctionNode>();
if (!function_node || !function_node->isAggregateFunction() || !names_to_collect.contains(function_node->getFunctionName()))
return;
@ -201,7 +207,7 @@ FunctionNodePtr createFusedQuantilesNode(std::vector<QueryTreeNodePtr *> & nodes
void tryFuseSumCountAvg(QueryTreeNodePtr query_tree_node, ContextPtr context)
{
FuseFunctionsVisitor visitor({"sum", "count", "avg"});
FuseFunctionsVisitor visitor({"sum", "count", "avg"}, context);
visitor.visit(query_tree_node);
for (auto & [argument, nodes] : visitor.argument_to_functions_mapping)
@ -220,7 +226,7 @@ void tryFuseSumCountAvg(QueryTreeNodePtr query_tree_node, ContextPtr context)
void tryFuseQuantiles(QueryTreeNodePtr query_tree_node, ContextPtr context)
{
FuseFunctionsVisitor visitor_quantile({"quantile"});
FuseFunctionsVisitor visitor_quantile({"quantile"}, context);
visitor_quantile.visit(query_tree_node);
for (auto & [argument, nodes_set] : visitor_quantile.argument_to_functions_mapping)

View File

@ -12,15 +12,22 @@ namespace DB
namespace
{
class IfChainToMultiIfPassVisitor : public InDepthQueryTreeVisitor<IfChainToMultiIfPassVisitor>
class IfChainToMultiIfPassVisitor : public InDepthQueryTreeVisitorWithContext<IfChainToMultiIfPassVisitor>
{
public:
explicit IfChainToMultiIfPassVisitor(FunctionOverloadResolverPtr multi_if_function_ptr_)
: multi_if_function_ptr(std::move(multi_if_function_ptr_))
using Base = InDepthQueryTreeVisitorWithContext<IfChainToMultiIfPassVisitor>;
using Base::Base;
explicit IfChainToMultiIfPassVisitor(FunctionOverloadResolverPtr multi_if_function_ptr_, ContextPtr context)
: Base(std::move(context))
, multi_if_function_ptr(std::move(multi_if_function_ptr_))
{}
void visitImpl(QueryTreeNodePtr & node)
{
if (!getSettings().optimize_if_chain_to_multiif)
return;
auto * function_node = node->as<FunctionNode>();
if (!function_node || function_node->getFunctionName() != "if" || function_node->getArguments().getNodes().size() != 3)
return;
@ -68,7 +75,8 @@ private:
void IfChainToMultiIfPass::run(QueryTreeNodePtr query_tree_node, ContextPtr context)
{
IfChainToMultiIfPassVisitor visitor(FunctionFactory::instance().get("multiIf", context));
auto multi_if_function_ptr = FunctionFactory::instance().get("multiIf", context);
IfChainToMultiIfPassVisitor visitor(std::move(multi_if_function_ptr), std::move(context));
visitor.visit(query_tree_node);
}

View File

@ -107,21 +107,24 @@ void wrapIntoToString(FunctionNode & function_node, QueryTreeNodePtr arg, Contex
assert(isString(function_node.getResultType()));
}
class ConvertStringsToEnumVisitor : public InDepthQueryTreeVisitor<ConvertStringsToEnumVisitor>
class ConvertStringsToEnumVisitor : public InDepthQueryTreeVisitorWithContext<ConvertStringsToEnumVisitor>
{
public:
explicit ConvertStringsToEnumVisitor(ContextPtr context_)
: context(std::move(context_))
{
}
using Base = InDepthQueryTreeVisitorWithContext<ConvertStringsToEnumVisitor>;
using Base::Base;
void visitImpl(QueryTreeNodePtr & node)
{
if (!getSettings().optimize_if_transform_strings_to_enum)
return;
auto * function_node = node->as<FunctionNode>();
if (!function_node)
return;
const auto & context = getContext();
/// to preserve return type (String) of the current function_node, we wrap the newly
/// generated function nodes into toString
@ -198,16 +201,13 @@ public:
return;
}
}
private:
ContextPtr context;
};
}
void IfTransformStringsToEnumPass::run(QueryTreeNodePtr query, ContextPtr context)
{
ConvertStringsToEnumVisitor visitor(context);
ConvertStringsToEnumVisitor visitor(std::move(context));
visitor.visit(query);
}

View File

@ -10,15 +10,22 @@ namespace DB
namespace
{
class MultiIfToIfVisitor : public InDepthQueryTreeVisitor<MultiIfToIfVisitor>
class MultiIfToIfVisitor : public InDepthQueryTreeVisitorWithContext<MultiIfToIfVisitor>
{
public:
explicit MultiIfToIfVisitor(FunctionOverloadResolverPtr if_function_ptr_)
: if_function_ptr(if_function_ptr_)
using Base = InDepthQueryTreeVisitorWithContext<MultiIfToIfVisitor>;
using Base::Base;
explicit MultiIfToIfVisitor(FunctionOverloadResolverPtr if_function_ptr_, ContextPtr context)
: Base(std::move(context))
, if_function_ptr(std::move(if_function_ptr_))
{}
void visitImpl(QueryTreeNodePtr & node)
{
if (!getSettings().optimize_multiif_to_if)
return;
auto * function_node = node->as<FunctionNode>();
if (!function_node || function_node->getFunctionName() != "multiIf")
return;
@ -38,7 +45,8 @@ private:
void MultiIfToIfPass::run(QueryTreeNodePtr query_tree_node, ContextPtr context)
{
MultiIfToIfVisitor visitor(FunctionFactory::instance().get("if", context));
auto if_function_ptr = FunctionFactory::instance().get("if", context);
MultiIfToIfVisitor visitor(std::move(if_function_ptr), std::move(context));
visitor.visit(query_tree_node);
}

View File

@ -14,12 +14,17 @@ namespace DB
namespace
{
class NormalizeCountVariantsVisitor : public InDepthQueryTreeVisitor<NormalizeCountVariantsVisitor>
class NormalizeCountVariantsVisitor : public InDepthQueryTreeVisitorWithContext<NormalizeCountVariantsVisitor>
{
public:
explicit NormalizeCountVariantsVisitor(ContextPtr context_) : context(std::move(context_)) {}
using Base = InDepthQueryTreeVisitorWithContext<NormalizeCountVariantsVisitor>;
using Base::Base;
void visitImpl(QueryTreeNodePtr & node)
{
if (!getSettings().optimize_normalize_count_variants)
return;
auto * function_node = node->as<FunctionNode>();
if (!function_node || !function_node->isAggregateFunction() || (function_node->getFunctionName() != "count" && function_node->getFunctionName() != "sum"))
return;
@ -42,15 +47,13 @@ public:
else if (function_node->getFunctionName() == "sum" &&
first_argument_constant_literal.getType() == Field::Types::UInt64 &&
first_argument_constant_literal.get<UInt64>() == 1 &&
!context->getSettingsRef().aggregate_functions_null_for_empty)
!getSettings().aggregate_functions_null_for_empty)
{
resolveAsCountAggregateFunction(*function_node);
function_node->getArguments().getNodes().clear();
}
}
private:
ContextPtr context;
static inline void resolveAsCountAggregateFunction(FunctionNode & function_node)
{
AggregateFunctionProperties properties;

View File

@ -1,26 +1,33 @@
#include <Analyzer/Passes/OptimizeGroupByFunctionKeysPass.h>
#include <algorithm>
#include <queue>
#include <Analyzer/FunctionNode.h>
#include <Analyzer/HashUtils.h>
#include <Analyzer/IQueryTreeNode.h>
#include <Analyzer/InDepthQueryTreeVisitor.h>
#include <Analyzer/QueryNode.h>
#include <algorithm>
#include <queue>
namespace DB
{
class OptimizeGroupByFunctionKeysVisitor : public InDepthQueryTreeVisitor<OptimizeGroupByFunctionKeysVisitor>
class OptimizeGroupByFunctionKeysVisitor : public InDepthQueryTreeVisitorWithContext<OptimizeGroupByFunctionKeysVisitor>
{
public:
using Base = InDepthQueryTreeVisitorWithContext<OptimizeGroupByFunctionKeysVisitor>;
using Base::Base;
static bool needChildVisit(QueryTreeNodePtr & /*parent*/, QueryTreeNodePtr & child)
{
return !child->as<FunctionNode>();
}
static void visitImpl(QueryTreeNodePtr & node)
void visitImpl(QueryTreeNodePtr & node)
{
if (!getSettings().optimize_group_by_function_keys)
return;
auto * query = node->as<QueryNode>();
if (!query)
return;
@ -41,11 +48,10 @@ public:
optimizeGroupingSet(group_by);
}
private:
struct NodeWithInfo
{
QueryTreeNodePtr node;
bool parents_are_only_deterministic;
bool parents_are_only_deterministic = false;
};
static bool canBeEliminated(QueryTreeNodePtr & node, const QueryTreeNodePtrWithHashSet & group_by_keys)
@ -64,7 +70,7 @@ private:
// TODO: Also process CONSTANT here. We can simplify GROUP BY x, x + 1 to GROUP BY x.
while (!candidates.empty())
{
auto [candidate, deterministic_context] = candidates.back();
auto [candidate, parents_are_only_deterministic] = candidates.back();
candidates.pop_back();
bool found = group_by_keys.contains(candidate);
@ -80,7 +86,7 @@ private:
if (!found)
{
bool is_deterministic_function = deterministic_context && function->getFunction()->isDeterministicInScopeOfQuery();
bool is_deterministic_function = parents_are_only_deterministic && function->getFunction()->isDeterministicInScopeOfQuery();
for (auto it = arguments.rbegin(); it != arguments.rend(); ++it)
candidates.push_back({ *it, is_deterministic_function });
}
@ -91,7 +97,7 @@ private:
return false;
break;
case QueryTreeNodeType::CONSTANT:
if (!deterministic_context)
if (!parents_are_only_deterministic)
return false;
break;
default:
@ -117,9 +123,10 @@ private:
}
};
void OptimizeGroupByFunctionKeysPass::run(QueryTreeNodePtr query_tree_node, ContextPtr /*context*/)
void OptimizeGroupByFunctionKeysPass::run(QueryTreeNodePtr query_tree_node, ContextPtr context)
{
OptimizeGroupByFunctionKeysVisitor().visit(query_tree_node);
OptimizeGroupByFunctionKeysVisitor visitor(std::move(context));
visitor.visit(query_tree_node);
}
}

View File

@ -1,11 +1,13 @@
#include <Analyzer/Passes/OptimizeRedundantFunctionsInOrderByPass.h>
#include <Functions/IFunction.h>
#include <Analyzer/ColumnNode.h>
#include <Analyzer/FunctionNode.h>
#include <Analyzer/HashUtils.h>
#include <Analyzer/InDepthQueryTreeVisitor.h>
#include <Analyzer/QueryNode.h>
#include <Analyzer/SortNode.h>
#include <Functions/IFunction.h>
namespace DB
{
@ -13,9 +15,12 @@ namespace DB
namespace
{
class OptimizeRedundantFunctionsInOrderByVisitor : public InDepthQueryTreeVisitor<OptimizeRedundantFunctionsInOrderByVisitor>
class OptimizeRedundantFunctionsInOrderByVisitor : public InDepthQueryTreeVisitorWithContext<OptimizeRedundantFunctionsInOrderByVisitor>
{
public:
using Base = InDepthQueryTreeVisitorWithContext<OptimizeRedundantFunctionsInOrderByVisitor>;
using Base::Base;
static bool needChildVisit(QueryTreeNodePtr & node, QueryTreeNodePtr & /*parent*/)
{
if (node->as<FunctionNode>())
@ -25,6 +30,9 @@ public:
void visitImpl(QueryTreeNodePtr & node)
{
if (!getSettings().optimize_redundant_functions_in_order_by)
return;
auto * query = node->as<QueryNode>();
if (!query)
return;
@ -116,9 +124,10 @@ private:
}
void OptimizeRedundantFunctionsInOrderByPass::run(QueryTreeNodePtr query_tree_node, ContextPtr /*context*/)
void OptimizeRedundantFunctionsInOrderByPass::run(QueryTreeNodePtr query_tree_node, ContextPtr context)
{
OptimizeRedundantFunctionsInOrderByVisitor().visit(query_tree_node);
OptimizeRedundantFunctionsInOrderByVisitor visitor(std::move(context));
visitor.visit(query_tree_node);
}
}

View File

@ -1943,7 +1943,7 @@ void QueryAnalyzer::validateTableExpressionModifiers(const QueryTreeNodePtr & ta
if (!table_node && !table_function_node && !query_node && !union_node)
throw Exception(ErrorCodes::LOGICAL_ERROR,
"Unexpected table expression. Expected table, table function, query or union node. Actual {}",
"Unexpected table expression. Expected table, table function, query or union node. Table node: {}, scope node: {}",
table_expression_node->formatASTForErrorMessage(),
scope.scope_node->formatASTForErrorMessage());
@ -4366,12 +4366,9 @@ ProjectionNames QueryAnalyzer::resolveFunction(QueryTreeNodePtr & node, Identifi
{
if (!AggregateFunctionFactory::instance().isAggregateFunctionName(function_name))
{
std::string error_message = fmt::format("Aggregate function with name '{}' does not exist. In scope {}",
function_name,
scope.scope_node->formatASTForErrorMessage());
AggregateFunctionFactory::instance().appendHintsMessage(error_message, function_name);
throw Exception(ErrorCodes::UNKNOWN_AGGREGATE_FUNCTION, error_message);
throw Exception(ErrorCodes::UNKNOWN_AGGREGATE_FUNCTION, "Aggregate function with name '{}' does not exist. In scope {}{}",
function_name, scope.scope_node->formatASTForErrorMessage(),
getHintsErrorMessageSuffix(AggregateFunctionFactory::instance().getHints(function_name)));
}
if (!function_lambda_arguments_indexes.empty())
@ -5726,7 +5723,7 @@ void QueryAnalyzer::resolveQueryJoinTreeNode(QueryTreeNodePtr & join_tree_node,
case QueryTreeNodeType::IDENTIFIER:
{
throw Exception(ErrorCodes::LOGICAL_ERROR,
"Identifiers in FROM section must be already resolved. In scope {}",
"Identifiers in FROM section must be already resolved. Node {}, scope {}",
join_tree_node->formatASTForErrorMessage(),
scope.scope_node->formatASTForErrorMessage());
}

View File

@ -20,15 +20,17 @@ namespace DB
namespace
{
class SumIfToCountIfVisitor : public InDepthQueryTreeVisitor<SumIfToCountIfVisitor>
class SumIfToCountIfVisitor : public InDepthQueryTreeVisitorWithContext<SumIfToCountIfVisitor>
{
public:
explicit SumIfToCountIfVisitor(ContextPtr & context_)
: context(context_)
{}
using Base = InDepthQueryTreeVisitorWithContext<SumIfToCountIfVisitor>;
using Base::Base;
void visitImpl(QueryTreeNodePtr & node)
{
if (!getSettings().optimize_rewrite_sum_if_to_count_if)
return;
auto * function_node = node->as<FunctionNode>();
if (!function_node || !function_node->isAggregateFunction())
return;
@ -56,7 +58,7 @@ public:
if (!isInt64OrUInt64FieldType(constant_value_literal.getType()))
return;
if (constant_value_literal.get<UInt64>() != 1 || context->getSettingsRef().aggregate_functions_null_for_empty)
if (constant_value_literal.get<UInt64>() != 1 || getSettings().aggregate_functions_null_for_empty)
return;
function_node_arguments_nodes[0] = std::move(function_node_arguments_nodes[1]);
@ -122,7 +124,7 @@ public:
auto & not_function_arguments = not_function->getArguments().getNodes();
not_function_arguments.push_back(nested_if_function_arguments_nodes[0]);
not_function->resolveAsFunction(FunctionFactory::instance().get("not", context)->build(not_function->getArgumentColumns()));
not_function->resolveAsFunction(FunctionFactory::instance().get("not", getContext())->build(not_function->getArgumentColumns()));
function_node_arguments_nodes[0] = std::move(not_function);
function_node_arguments_nodes.resize(1);
@ -143,8 +145,6 @@ private:
function_node.resolveAsAggregateFunction(std::move(aggregate_function));
}
ContextPtr & context;
};
}

View File

@ -25,11 +25,17 @@ bool isUniqFunction(const String & function_name)
function_name == "uniqTheta";
}
class UniqInjectiveFunctionsEliminationVisitor : public InDepthQueryTreeVisitor<UniqInjectiveFunctionsEliminationVisitor>
class UniqInjectiveFunctionsEliminationVisitor : public InDepthQueryTreeVisitorWithContext<UniqInjectiveFunctionsEliminationVisitor>
{
public:
static void visitImpl(QueryTreeNodePtr & node)
using Base = InDepthQueryTreeVisitorWithContext<UniqInjectiveFunctionsEliminationVisitor>;
using Base::Base;
void visitImpl(QueryTreeNodePtr & node)
{
if (!getSettings().optimize_injective_functions_inside_uniq)
return;
auto * function_node = node->as<FunctionNode>();
if (!function_node || !function_node->isAggregateFunction() || !isUniqFunction(function_node->getFunctionName()))
return;
@ -81,9 +87,9 @@ public:
}
void UniqInjectiveFunctionsEliminationPass::run(QueryTreeNodePtr query_tree_node, ContextPtr)
void UniqInjectiveFunctionsEliminationPass::run(QueryTreeNodePtr query_tree_node, ContextPtr context)
{
UniqInjectiveFunctionsEliminationVisitor visitor;
UniqInjectiveFunctionsEliminationVisitor visitor(std::move(context));
visitor.visit(query_tree_node);
}

View File

@ -1,5 +1,7 @@
#include <Analyzer/QueryNode.h>
#include <fmt/core.h>
#include <Common/SipHash.h>
#include <Common/FieldVisitorToString.h>
@ -17,7 +19,6 @@
#include <Parsers/ASTSetQuery.h>
#include <Analyzer/Utils.h>
#include <fmt/core.h>
namespace DB
{
@ -36,7 +37,7 @@ QueryNode::QueryNode(ContextMutablePtr context_, SettingsChanges settings_change
}
QueryNode::QueryNode(ContextMutablePtr context_)
: QueryNode(context_, {} /*settings_changes*/)
: QueryNode(std::move(context_), {} /*settings_changes*/)
{}
void QueryNode::dumpTreeImpl(WriteBuffer & buffer, FormatState & format_state, size_t indent) const
@ -185,10 +186,7 @@ void QueryNode::dumpTreeImpl(WriteBuffer & buffer, FormatState & format_state, s
{
buffer << '\n' << std::string(indent + 2, ' ') << "SETTINGS";
for (const auto & change : settings_changes)
{
buffer << fmt::format(" {}={}", change.name, toString(change.value));
}
buffer << '\n';
}
}

View File

@ -1,6 +1,7 @@
#include <memory>
#include <Analyzer/QueryTreePassManager.h>
#include <memory>
#include <Common/Exception.h>
#include <IO/WriteHelpers.h>
@ -133,7 +134,6 @@ private:
* TODO: Support setting optimize_aggregators_of_group_by_keys.
* TODO: Support setting optimize_duplicate_order_by_and_distinct.
* TODO: Support setting optimize_monotonous_functions_in_order_by.
* TODO: Support settings.optimize_or_like_chain.
* TODO: Add optimizations based on function semantics. Example: SELECT * FROM test_table WHERE id != id. (id is not nullable column).
*/
@ -210,53 +210,31 @@ void QueryTreePassManager::dump(WriteBuffer & buffer, size_t up_to_pass_index)
void addQueryTreePasses(QueryTreePassManager & manager)
{
auto context = manager.getContext();
const auto & settings = context->getSettingsRef();
manager.addPass(std::make_unique<QueryAnalysisPass>());
manager.addPass(std::make_unique<FunctionToSubcolumnsPass>());
if (settings.optimize_functions_to_subcolumns)
manager.addPass(std::make_unique<FunctionToSubcolumnsPass>());
if (settings.count_distinct_optimization)
manager.addPass(std::make_unique<CountDistinctPass>());
if (settings.optimize_rewrite_sum_if_to_count_if)
manager.addPass(std::make_unique<SumIfToCountIfPass>());
if (settings.optimize_normalize_count_variants)
manager.addPass(std::make_unique<NormalizeCountVariantsPass>());
manager.addPass(std::make_unique<CountDistinctPass>());
manager.addPass(std::make_unique<SumIfToCountIfPass>());
manager.addPass(std::make_unique<NormalizeCountVariantsPass>());
manager.addPass(std::make_unique<CustomizeFunctionsPass>());
if (settings.optimize_arithmetic_operations_in_aggregate_functions)
manager.addPass(std::make_unique<AggregateFunctionsArithmericOperationsPass>());
if (settings.optimize_injective_functions_inside_uniq)
manager.addPass(std::make_unique<UniqInjectiveFunctionsEliminationPass>());
if (settings.optimize_group_by_function_keys)
manager.addPass(std::make_unique<OptimizeGroupByFunctionKeysPass>());
if (settings.optimize_multiif_to_if)
manager.addPass(std::make_unique<MultiIfToIfPass>());
manager.addPass(std::make_unique<AggregateFunctionsArithmericOperationsPass>());
manager.addPass(std::make_unique<UniqInjectiveFunctionsEliminationPass>());
manager.addPass(std::make_unique<OptimizeGroupByFunctionKeysPass>());
manager.addPass(std::make_unique<MultiIfToIfPass>());
manager.addPass(std::make_unique<IfConstantConditionPass>());
manager.addPass(std::make_unique<IfChainToMultiIfPass>());
if (settings.optimize_if_chain_to_multiif)
manager.addPass(std::make_unique<IfChainToMultiIfPass>());
if (settings.optimize_redundant_functions_in_order_by)
manager.addPass(std::make_unique<OptimizeRedundantFunctionsInOrderByPass>());
manager.addPass(std::make_unique<OptimizeRedundantFunctionsInOrderByPass>());
manager.addPass(std::make_unique<OrderByTupleEliminationPass>());
manager.addPass(std::make_unique<OrderByLimitByDuplicateEliminationPass>());
if (settings.optimize_syntax_fuse_functions)
manager.addPass(std::make_unique<FuseFunctionsPass>());
manager.addPass(std::make_unique<FuseFunctionsPass>());
if (settings.optimize_if_transform_strings_to_enum)
manager.addPass(std::make_unique<IfTransformStringsToEnumPass>());
manager.addPass(std::make_unique<IfTransformStringsToEnumPass>());
manager.addPass(std::make_unique<ConvertOrLikeChainPass>());

View File

@ -79,7 +79,7 @@ namespace
request.SetMaxKeys(1);
auto outcome = client.ListObjects(request);
if (!outcome.IsSuccess())
throw Exception(outcome.GetError().GetMessage(), ErrorCodes::S3_ERROR);
throw Exception::createDeprecated(outcome.GetError().GetMessage(), ErrorCodes::S3_ERROR);
return outcome.GetResult().GetContents();
}
@ -233,7 +233,7 @@ void BackupWriterS3::removeFile(const String & file_name)
request.SetKey(fs::path(s3_uri.key) / file_name);
auto outcome = client->DeleteObject(request);
if (!outcome.IsSuccess() && !isNotFoundError(outcome.GetError().GetErrorType()))
throw Exception(outcome.GetError().GetMessage(), ErrorCodes::S3_ERROR);
throw Exception::createDeprecated(outcome.GetError().GetMessage(), ErrorCodes::S3_ERROR);
}
void BackupWriterS3::removeFiles(const Strings & file_names)
@ -291,7 +291,7 @@ void BackupWriterS3::removeFilesBatch(const Strings & file_names)
auto outcome = client->DeleteObjects(request);
if (!outcome.IsSuccess() && !isNotFoundError(outcome.GetError().GetErrorType()))
throw Exception(outcome.GetError().GetMessage(), ErrorCodes::S3_ERROR);
throw Exception::createDeprecated(outcome.GetError().GetMessage(), ErrorCodes::S3_ERROR);
}
}

View File

@ -93,7 +93,7 @@ namespace
catch (...)
{
if (coordination)
coordination->setError(current_host, Exception{getCurrentExceptionCode(), getCurrentExceptionMessage(true, true)});
coordination->setError(current_host, Exception(getCurrentExceptionMessageAndPattern(true, true), getCurrentExceptionCode()));
}
}

View File

@ -452,7 +452,7 @@ void ClientBase::onData(Block & block, ASTPtr parsed_query)
catch (const Exception &)
{
/// Catch client errors like NO_ROW_DELIMITER
throw LocalFormatError(getCurrentExceptionMessage(print_stack_trace), getCurrentExceptionCode());
throw LocalFormatError(getCurrentExceptionMessageAndPattern(print_stack_trace), getCurrentExceptionCode());
}
/// Received data block is immediately displayed to the user.
@ -630,7 +630,7 @@ try
}
catch (...)
{
throw LocalFormatError(getCurrentExceptionMessage(print_stack_trace), getCurrentExceptionCode());
throw LocalFormatError(getCurrentExceptionMessageAndPattern(print_stack_trace), getCurrentExceptionCode());
}
@ -1912,7 +1912,7 @@ bool ClientBase::executeMultiQuery(const String & all_queries_text)
{
// Surprisingly, this is a client error. A server error would
// have been reported without throwing (see onReceiveSeverException()).
client_exception = std::make_unique<Exception>(getCurrentExceptionMessage(print_stack_trace), getCurrentExceptionCode());
client_exception = std::make_unique<Exception>(getCurrentExceptionMessageAndPattern(print_stack_trace), getCurrentExceptionCode());
have_error = true;
}

View File

@ -187,7 +187,7 @@ void LocalConnection::sendQuery(
catch (...)
{
state->io.onException();
state->exception = std::make_unique<Exception>("Unknown exception", ErrorCodes::UNKNOWN_EXCEPTION);
state->exception = std::make_unique<Exception>(ErrorCodes::UNKNOWN_EXCEPTION, "Unknown exception");
}
}
@ -291,7 +291,7 @@ bool LocalConnection::poll(size_t)
catch (...)
{
state->io.onException();
state->exception = std::make_unique<Exception>("Unknown exception", ErrorCodes::UNKNOWN_EXCEPTION);
state->exception = std::make_unique<Exception>(ErrorCodes::UNKNOWN_EXCEPTION, "Unknown exception");
}
}

View File

@ -83,7 +83,7 @@ template <is_decimal T>
UInt64 ColumnDecimal<T>::get64([[maybe_unused]] size_t n) const
{
if constexpr (sizeof(T) > sizeof(UInt64))
throw Exception(String("Method get64 is not supported for ") + getFamilyName(), ErrorCodes::NOT_IMPLEMENTED);
throw Exception(ErrorCodes::NOT_IMPLEMENTED, "Method get64 is not supported for {}", getFamilyName());
else
return static_cast<NativeT>(data[n]);
}

View File

@ -35,7 +35,7 @@ ColumnNullable::ColumnNullable(MutableColumnPtr && nested_column_, MutableColumn
nested_column = getNestedColumn().convertToFullColumnIfConst();
if (!getNestedColumn().canBeInsideNullable())
throw Exception{getNestedColumn().getName() + " cannot be inside Nullable column", ErrorCodes::ILLEGAL_COLUMN};
throw Exception(ErrorCodes::ILLEGAL_COLUMN, "{} cannot be inside Nullable column", getNestedColumn().getName());
if (isColumnConst(*null_map))
throw Exception(ErrorCodes::ILLEGAL_COLUMN, "ColumnNullable cannot have constant null map");

View File

@ -207,8 +207,9 @@ private:
if (size >= MMAP_THRESHOLD)
{
if (alignment > mmap_min_alignment)
throw DB::Exception(fmt::format("Too large alignment {}: more than page size when allocating {}.",
ReadableSize(alignment), ReadableSize(size)), DB::ErrorCodes::BAD_ARGUMENTS);
throw DB::Exception(DB::ErrorCodes::BAD_ARGUMENTS,
"Too large alignment {}: more than page size when allocating {}.",
ReadableSize(alignment), ReadableSize(size));
buf = mmap(getMmapHint(), size, PROT_READ | PROT_WRITE,
mmap_flags, -1, 0);

View File

@ -140,7 +140,7 @@ void CancelToken::raise()
{
std::unique_lock lock(signal_mutex);
if (exception_code != 0)
throw DB::Exception(
throw DB::Exception::createRuntime(
std::exchange(exception_code, 0),
std::exchange(exception_message, {}));
else

View File

@ -88,7 +88,7 @@ public:
{
/// A more understandable error message.
if (e.code() == DB::ErrorCodes::CANNOT_READ_ALL_DATA || e.code() == DB::ErrorCodes::ATTEMPT_TO_READ_AFTER_EOF)
throw DB::ParsingException("File " + path + " is empty. You must fill it manually with appropriate value.", e.code());
throw DB::ParsingException(e.code(), "File {} is empty. You must fill it manually with appropriate value.", path);
else
throw;
}

View File

@ -39,6 +39,15 @@ enum class WeekModeFlag : UInt8
};
using YearWeek = std::pair<UInt16, UInt8>;
/// Modes for toDayOfWeek() function.
enum class WeekDayMode
{
WeekStartsMonday1 = 0,
WeekStartsMonday0 = 1,
WeekStartsSunday0 = 2,
WeekStartsSunday1 = 3
};
/** Lookup table to conversion of time to date, and to month / year / day of week / day of month and so on.
* First time was implemented for OLAPServer, that needed to do billions of such transformations.
*/
@ -619,9 +628,28 @@ public:
template <typename DateOrTime>
inline Int16 toYear(DateOrTime v) const { return lut[toLUTIndex(v)].year; }
/// 1-based, starts on Monday
template <typename DateOrTime>
inline UInt8 toDayOfWeek(DateOrTime v) const { return lut[toLUTIndex(v)].day_of_week; }
template <typename DateOrTime>
inline UInt8 toDayOfWeek(DateOrTime v, UInt8 week_day_mode) const
{
WeekDayMode mode = check_week_day_mode(week_day_mode);
UInt8 res = toDayOfWeek(v);
using enum WeekDayMode;
bool start_from_sunday = (mode == WeekStartsSunday0 || mode == WeekStartsSunday1);
bool zero_based = (mode == WeekStartsMonday0 || mode == WeekStartsSunday0);
if (start_from_sunday)
res = res % 7 + 1;
if (zero_based)
--res;
return res;
}
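To make the mode arithmetic concrete, here is a self-contained sketch that reproduces the mapping above, together with the values it yields for a Monday and a Sunday (the unparameterized toDayOfWeek is 1-based and starts on Monday):

``` cpp
#include <cstdint>

/// Standalone copy of the mode arithmetic, for illustration only.
/// base is the 1-based day of week starting on Monday (Monday = 1 ... Sunday = 7).
constexpr uint8_t dayOfWeekWithMode(uint8_t base, uint8_t mode)
{
    mode &= 3; /// same normalization as check_week_day_mode
    const bool start_from_sunday = (mode == 2 || mode == 3); /// WeekStartsSunday{0,1}
    const bool zero_based = (mode == 1 || mode == 2);        /// WeekStartsMonday0, WeekStartsSunday0
    uint8_t res = base;
    if (start_from_sunday)
        res = res % 7 + 1; /// Monday 1 -> 2, ..., Sunday 7 -> 1
    if (zero_based)
        --res;
    return res;
}

static_assert(dayOfWeekWithMode(1, 0) == 1); /// Monday, week starts Monday, 1-based
static_assert(dayOfWeekWithMode(1, 1) == 0); /// Monday, week starts Monday, 0-based
static_assert(dayOfWeekWithMode(7, 2) == 0); /// Sunday, week starts Sunday, 0-based
static_assert(dayOfWeekWithMode(7, 3) == 1); /// Sunday, week starts Sunday, 1-based
```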
template <typename DateOrTime>
inline UInt8 toDayOfMonth(DateOrTime v) const { return lut[toLUTIndex(v)].day_of_month; }
@ -844,6 +872,12 @@ public:
return week_format;
}
/// Check and change mode to effective.
inline WeekDayMode check_week_day_mode(UInt8 mode) const /// NOLINT
{
return static_cast<WeekDayMode>(mode & 3);
}
/** Calculate weekday from d.
* Returns 0 for monday, 1 for tuesday...
*/

View File

@ -252,7 +252,7 @@ uint64_t readOffset(std::string_view & sp, bool is64_bit)
// Read "len" bytes
std::string_view readBytes(std::string_view & sp, uint64_t len)
{
SAFE_CHECK(len <= sp.size(), "invalid string length: " + std::to_string(len) + " vs. " + std::to_string(sp.size()));
SAFE_CHECK(len <= sp.size(), "invalid string length: {} vs. {}", len, sp.size());
std::string_view ret(sp.data(), len);
sp.remove_prefix(len);
return ret;
@ -953,7 +953,7 @@ bool Dwarf::findDebugInfoOffset(uintptr_t address, std::string_view aranges, uin
Dwarf::Die Dwarf::getDieAtOffset(const CompilationUnit & cu, uint64_t offset) const
{
SAFE_CHECK(offset < info_.size(), fmt::format("unexpected offset {}, info size {}", offset, info_.size()));
SAFE_CHECK(offset < info_.size(), "unexpected offset {}, info size {}", offset, info_.size());
Die die;
std::string_view sp{info_.data() + offset, cu.offset + cu.size - offset};
die.offset = offset;

View File

@ -188,7 +188,7 @@ static void tryLogCurrentExceptionImpl(Poco::Logger * logger, const std::string
{
PreformattedMessage message = getCurrentExceptionMessageAndPattern(true);
if (!start_of_message.empty())
message.message = fmt::format("{}: {}", start_of_message, message.message);
message.text = fmt::format("{}: {}", start_of_message, message.text);
LOG_ERROR(logger, message);
}
@ -339,7 +339,7 @@ std::string getExtraExceptionInfo(const std::exception & e)
std::string getCurrentExceptionMessage(bool with_stacktrace, bool check_embedded_stacktrace /*= false*/, bool with_extra_info /*= true*/)
{
return getCurrentExceptionMessageAndPattern(with_stacktrace, check_embedded_stacktrace, with_extra_info).message;
return getCurrentExceptionMessageAndPattern(with_stacktrace, check_embedded_stacktrace, with_extra_info).text;
}
PreformattedMessage getCurrentExceptionMessageAndPattern(bool with_stacktrace, bool check_embedded_stacktrace /*= false*/, bool with_extra_info /*= true*/)
@ -481,7 +481,7 @@ void tryLogException(std::exception_ptr e, Poco::Logger * logger, const std::str
std::string getExceptionMessage(const Exception & e, bool with_stacktrace, bool check_embedded_stacktrace)
{
return getExceptionMessageAndPattern(e, with_stacktrace, check_embedded_stacktrace).message;
return getExceptionMessageAndPattern(e, with_stacktrace, check_embedded_stacktrace).text;
}
PreformattedMessage getExceptionMessageAndPattern(const Exception & e, bool with_stacktrace, bool check_embedded_stacktrace)
@ -577,10 +577,6 @@ ParsingException::ParsingException(const std::string & msg, int code)
: Exception(msg, code)
{
}
ParsingException::ParsingException(int code, const std::string & message)
: Exception(message, code)
{
}
/// We use additional field formatted_message_ to make this method const.
std::string ParsingException::displayText() const

View File

@ -16,26 +16,6 @@
namespace Poco { class Logger; }
/// Extract format string from a string literal and constructs consteval fmt::format_string
template <typename... Args>
struct FormatStringHelperImpl
{
std::string_view message_format_string;
fmt::format_string<Args...> fmt_str;
template<typename T>
consteval FormatStringHelperImpl(T && str) : message_format_string(tryGetStaticFormatString(str)), fmt_str(std::forward<T>(str)) {}
template<typename T>
FormatStringHelperImpl(fmt::basic_runtime<T> && str) : message_format_string(), fmt_str(std::forward<fmt::basic_runtime<T>>(str)) {}
PreformattedMessage format(Args && ...args) const
{
return PreformattedMessage{fmt::format(fmt_str, std::forward<Args...>(args)...), message_format_string};
}
};
template <typename... Args>
using FormatStringHelper = FormatStringHelperImpl<std::type_identity_t<Args>...>;
namespace DB
{
@ -48,6 +28,17 @@ public:
Exception() = default;
Exception(const PreformattedMessage & msg, int code): Exception(msg.text, code)
{
message_format_string = msg.format_string;
}
Exception(PreformattedMessage && msg, int code): Exception(std::move(msg.text), code)
{
message_format_string = msg.format_string;
}
protected:
// used to remove the sensitive information from exceptions if query_masking_rules is configured
struct MessageMasked
{
@ -62,11 +53,16 @@ public:
// delegating constructor to mask sensitive information from the message
Exception(const std::string & msg, int code, bool remote_ = false): Exception(MessageMasked(msg), code, remote_) {}
Exception(std::string && msg, int code, bool remote_ = false): Exception(MessageMasked(std::move(msg)), code, remote_) {}
Exception(PreformattedMessage && msg, int code): Exception(std::move(msg.message), code)
public:
/// This creator is for exceptions that should format a message using fmt::format from the variadic ctor Exception(code, fmt, ...),
/// but were not rewritten yet. It will be removed.
static Exception createDeprecated(const std::string & msg, int code, bool remote_ = false)
{
message_format_string = msg.format_string;
return Exception(msg, code, remote_);
}
/// Message must be a compile-time constant
template<typename T, typename = std::enable_if_t<std::is_convertible_v<T, String>>>
Exception(int code, T && message)
: Exception(message, code)
@ -74,9 +70,11 @@ public:
message_format_string = tryGetStaticFormatString(message);
}
template<> Exception(int code, const String & message) : Exception(message, code) {}
template<> Exception(int code, String & message) : Exception(message, code) {}
template<> Exception(int code, String && message) : Exception(std::move(message), code) {}
/// These creators are for messages that were received over the network or generated by a third-party library at runtime.
/// Please use a constructor for all other cases.
static Exception createRuntime(int code, const String & message) { return Exception(message, code); }
static Exception createRuntime(int code, String & message) { return Exception(message, code); }
static Exception createRuntime(int code, String && message) { return Exception(std::move(message), code); }
// Format message with fmt::format, like the logging functions.
template <typename... Args>
@ -167,11 +165,9 @@ private:
/// more convenient calculation of problem line number.
class ParsingException : public Exception
{
ParsingException(const std::string & msg, int code);
public:
ParsingException();
ParsingException(const std::string & msg, int code);
ParsingException(int code, const std::string & message);
ParsingException(int code, std::string && message) : Exception(message, code) {}
// Format message with fmt::format, like the logging functions.
template <typename... Args>

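Taken together, these declarations define three construction paths. A hedged summary sketch follows; the error codes are ones used elsewhere in this commit, and the message variables are placeholders:

``` cpp
/// (1) Preferred: a compile-time constant format string. The pattern is stored
///     alongside the rendered text, and the number of {} placeholders is
///     checked at compile time.
throw Exception(ErrorCodes::BAD_ARGUMENTS, "Illegal value {} for setting {}", value, name);

/// (2) A message produced at runtime (received over the network, returned by a
///     third-party library, deserialized): no pattern can be captured.
throw Exception::createRuntime(ErrorCodes::LDAP_ERROR, text_from_library);

/// (3) Legacy call sites that still concatenate message strings keep compiling
///     through createDeprecated until they are rewritten; new code should not use it.
throw Exception::createDeprecated(prebuilt_message, ErrorCodes::LDAP_ERROR);
```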
View File

@ -0,0 +1,7 @@
#include <Common/LoggingFormatStringHelpers.h>
[[noreturn]] void functionThatFailsCompilationOfConstevalFunctions(const char * error)
{
throw std::runtime_error(error);
}

View File

@ -2,24 +2,70 @@
#include <base/defines.h>
#include <fmt/format.h>
struct PreformattedMessage;
consteval void formatStringCheckArgsNumImpl(std::string_view str, size_t nargs);
template <typename T> constexpr std::string_view tryGetStaticFormatString(T && x);
/// Extracts the format string from a string literal and constructs a consteval fmt::format_string
template <typename... Args>
struct FormatStringHelperImpl
{
std::string_view message_format_string;
fmt::format_string<Args...> fmt_str;
template<typename T>
consteval FormatStringHelperImpl(T && str) : message_format_string(tryGetStaticFormatString(str)), fmt_str(std::forward<T>(str))
{
formatStringCheckArgsNumImpl(message_format_string, sizeof...(Args));
}
template<typename T>
FormatStringHelperImpl(fmt::basic_runtime<T> && str) : message_format_string(), fmt_str(std::forward<fmt::basic_runtime<T>>(str)) {}
PreformattedMessage format(Args && ...args) const;
};
template <typename... Args>
using FormatStringHelper = FormatStringHelperImpl<std::type_identity_t<Args>...>;
/// Saves a format string for already formatted message
struct PreformattedMessage
{
String message;
std::string text;
std::string_view format_string;
operator const String & () const { return message; }
operator String () && { return std::move(message); }
template <typename... Args>
static PreformattedMessage create(FormatStringHelper<Args...> fmt, Args &&... args);
operator const std::string & () const { return text; }
operator std::string () && { return std::move(text); }
operator fmt::format_string<> () const { UNREACHABLE(); }
};
template <typename... Args>
PreformattedMessage FormatStringHelperImpl<Args...>::format(Args && ...args) const
{
return PreformattedMessage{fmt::format(fmt_str, std::forward<Args>(args)...), message_format_string};
}
template <typename... Args>
PreformattedMessage PreformattedMessage::create(FormatStringHelper<Args...> fmt, Args && ...args)
{
return fmt.format(std::forward<Args>(args)...);
}
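A small usage sketch of the pair defined above (the message text and variables are placeholders):

``` cpp
/// Both the rendered text and the original pattern travel together.
auto msg = PreformattedMessage::create("Loaded {} tables in {} ms", num_tables, elapsed_ms);
/// msg.text          -> e.g. "Loaded 5 tables in 120 ms"
/// msg.format_string -> "Loaded {} tables in {} ms"
/// Exception(msg, code) and the LOG_* macros accept it with format_string intact.
```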
template<typename T> struct is_fmt_runtime : std::false_type {};
template<typename T> struct is_fmt_runtime<fmt::basic_runtime<T>> : std::true_type {};
template <typename T> constexpr std::string_view tryGetStaticFormatString(T && x)
{
/// Failure of this asserting indicates that something went wrong during type deduction.
/// For example, a string literal was implicitly converted to std::string. It should not happen.
/// Format string for an exception or log message must be a string literal (compile-time constant).
/// Failure of this assertion may indicate one of the following issues:
/// - A message was already formatted into std::string before passing to Exception(...) or LOG_XXXXX(...).
/// Please use variadic constructor of Exception.
/// Consider using PreformattedMessage or LogToStr if you want to avoid double formatting and/or copy-paste.
/// - A string literal was converted to std::string (or const char *).
/// - Use Exception::createRuntime or fmt::runtime if there's no format string
/// and a message is generated in runtime by a third-party library
/// or deserialized from somewhere.
static_assert(!std::is_same_v<std::string, std::decay_t<T>>);
if constexpr (is_fmt_runtime<std::decay_t<T>>::value)
@ -53,3 +99,60 @@ template <typename T, typename... Ts> constexpr auto firstArg(T && x, Ts &&...)
/// For implicit conversion of fmt::basic_runtime<> to char* for std::string ctor
template <typename T, typename... Ts> constexpr auto firstArg(fmt::basic_runtime<T> && data, Ts &&...) { return data.str.data(); }
consteval ssize_t formatStringCountArgsNum(const char * const str, size_t len)
{
/// It does not count named args, but we don't use them
size_t cnt = 0;
size_t i = 0;
while (i + 1 < len)
{
if (str[i] == '{' && str[i + 1] == '}')
{
i += 2;
cnt += 1;
}
else if (str[i] == '{')
{
/// Ignore checks for complex formatting like "{:.3f}"
return -1;
}
else
{
i += 1;
}
}
return cnt;
}
[[noreturn]] void functionThatFailsCompilationOfConstevalFunctions(const char * error);
/// fmt::format checks that there are enough arguments, but ignores extra arguments (e.g. fmt::format("{}", 1, 2) compiles)
/// This function will fail to compile if the number of "{}" substitutions does not exactly match the number of arguments
consteval void formatStringCheckArgsNumImpl(std::string_view str, size_t nargs)
{
if (str.empty())
return;
ssize_t cnt = formatStringCountArgsNum(str.data(), str.size());
if (0 <= cnt && cnt != nargs)
functionThatFailsCompilationOfConstevalFunctions("unexpected number of arguments in a format string");
}
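Since the function is consteval, the counting rule can be sanity-checked directly; these assertions follow from the implementation above:

``` cpp
/// Plain {} pairs are counted; any other brace form disables the check (-1).
static_assert(formatStringCountArgsNum("no placeholders", sizeof("no placeholders") - 1) == 0);
static_assert(formatStringCountArgsNum("{} and {}", sizeof("{} and {}") - 1) == 2);
static_assert(formatStringCountArgsNum("{:.3f}", sizeof("{:.3f}") - 1) == -1);

/// With the check wired into FormatStringHelperImpl, FormatStringHelper<int>("{} {}")
/// fails to compile: one argument cannot fill two placeholders.
```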
template <typename... Args>
struct CheckArgsNumHelperImpl
{
template<typename T>
consteval CheckArgsNumHelperImpl(T && str)
{
formatStringCheckArgsNumImpl(tryGetStaticFormatString(str), sizeof...(Args));
}
/// No checks for fmt::runtime and PreformattedMessage
template<typename T> CheckArgsNumHelperImpl(fmt::basic_runtime<T> &&) {}
template<> CheckArgsNumHelperImpl(PreformattedMessage &) {}
template<> CheckArgsNumHelperImpl(const PreformattedMessage &) {}
template<> CheckArgsNumHelperImpl(PreformattedMessage &&) {}
};
template <typename... Args> using CheckArgsNumHelper = CheckArgsNumHelperImpl<std::type_identity_t<Args>...>;
template <typename... Args> void formatStringCheckArgsNum(CheckArgsNumHelper<Args...>, Args &&...) {}

View File

@ -187,7 +187,7 @@ public:
{
throw Exception(
ErrorCodes::BAD_ARGUMENTS,
"Value with key `{}` is used twice in the SET query",
"Value with key `{}` is used twice in the SET query (collection name: {})",
name, query.collection_name);
}
}

View File

@ -211,9 +211,8 @@ PoolWithFailoverBase<TNestedPool>::get(size_t max_ignored_errors, bool fallback_
max_ignored_errors, fallback_to_stale_replicas,
try_get_entry, get_priority);
if (results.empty() || results[0].entry.isNull())
throw DB::Exception(
"PoolWithFailoverBase::getMany() returned less than min_entries entries.",
DB::ErrorCodes::LOGICAL_ERROR);
throw DB::Exception(DB::ErrorCodes::LOGICAL_ERROR,
"PoolWithFailoverBase::getMany() returned less than min_entries entries.");
return results[0].entry;
}
@ -320,10 +319,8 @@ PoolWithFailoverBase<TNestedPool>::getMany(
try_results.resize(up_to_date_count);
}
else
throw DB::Exception(
"Could not find enough connections to up-to-date replicas. Got: " + std::to_string(up_to_date_count)
+ ", needed: " + std::to_string(min_entries),
DB::ErrorCodes::ALL_REPLICAS_ARE_STALE);
throw DB::Exception(DB::ErrorCodes::ALL_REPLICAS_ARE_STALE,
"Could not find enough connections to up-to-date replicas. Got: {}, needed: {}", up_to_date_count, max_entries);
return try_results;
}

View File

@ -62,10 +62,10 @@ public:
, replacement(replacement_string)
{
if (!regexp.ok())
throw DB::Exception(
"SensitiveDataMasker: cannot compile re2: " + regexp_string_ + ", error: " + regexp.error()
+ ". Look at https://github.com/google/re2/wiki/Syntax for reference.",
DB::ErrorCodes::CANNOT_COMPILE_REGEXP);
throw DB::Exception(DB::ErrorCodes::CANNOT_COMPILE_REGEXP,
"SensitiveDataMasker: cannot compile re2: {}, error: {}. "
"Look at https://github.com/google/re2/wiki/Syntax for reference.",
regexp_string_, regexp.error());
}
uint64_t apply(std::string & data) const

View File

@ -100,7 +100,7 @@ size_t TLDListsHolder::parseAndAddTldList(const std::string & name, const std::s
tld_list_tmp.emplace(line, TLDType::TLD_REGULAR);
}
if (!in.eof())
throw Exception(ErrorCodes::LOGICAL_ERROR, "Not all list had been read", name);
throw Exception(ErrorCodes::LOGICAL_ERROR, "Not all list had been read: {}", name);
TLDList tld_list(tld_list_tmp.size());
for (const auto & [host, type] : tld_list_tmp)

View File

@ -58,7 +58,7 @@ UInt64 Throttler::add(size_t amount)
}
if (limit && count_value > limit)
throw Exception(limit_exceeded_exception_message + std::string(" Maximum: ") + toString(limit), ErrorCodes::LIMIT_EXCEEDED);
throw Exception::createDeprecated(limit_exceeded_exception_message + std::string(" Maximum: ") + toString(limit), ErrorCodes::LIMIT_EXCEEDED);
/// Wait unless there is positive amount of tokens - throttling
Int64 sleep_time = 0;
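The comment above describes token-bucket throttling. As a rough, generic sketch of that idea (this is not the actual `Throttler` implementation, only the concept): tokens accumulate at a fixed rate, each request spends some, and a caller that drives the balance negative sleeps until the deficit is refilled.

``` cpp
#include <chrono>
#include <thread>

class TokenBucket
{
public:
    explicit TokenBucket(double rate_) : rate(rate_), tokens(rate_), last(Clock::now()) {}

    void acquire(double amount)
    {
        refill();
        tokens -= amount;
        if (tokens < 0)
        {
            /// Sleep just long enough for the deficit to be refilled.
            std::this_thread::sleep_for(std::chrono::duration<double>(-tokens / rate));
            refill();
        }
    }

private:
    using Clock = std::chrono::steady_clock;

    void refill()
    {
        const auto now = Clock::now();
        tokens += std::chrono::duration<double>(now - last).count() * rate;
        last = now;
        if (tokens > rate)
            tokens = rate; /// cap the burst at one second's worth of tokens
    }

    double rate;   /// tokens added per second
    double tokens; /// may go negative; callers then sleep off the deficit
    Clock::time_point last;
};

int main()
{
    TokenBucket bucket(100.0);  /// allow roughly 100 units per second
    for (int i = 0; i < 500; ++i)
        bucket.acquire(1.0);    /// the loop takes about 4 seconds overall
}
```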


@ -41,7 +41,7 @@ To assert_cast(From && from)
}
catch (const std::exception & e)
{
throw DB::Exception(e.what(), DB::ErrorCodes::LOGICAL_ERROR);
throw DB::Exception::createDeprecated(e.what(), DB::ErrorCodes::LOGICAL_ERROR);
}
throw DB::Exception(DB::ErrorCodes::LOGICAL_ERROR, "Bad cast from type {} to {}",


@ -57,6 +57,7 @@ namespace
if (_is_clients_log || _logger->is((PRIORITY))) \
{ \
std::string formatted_message = numArgs(__VA_ARGS__) > 1 ? fmt::format(__VA_ARGS__) : firstArg(__VA_ARGS__); \
formatStringCheckArgsNum(__VA_ARGS__); \
if (auto _channel = _logger->getChannel()) \
{ \
std::string file_function; \


@ -171,9 +171,8 @@ TEST(Common, RWLockDeadlock)
auto holder2 = lock2->getLock(RWLockImpl::Read, "q1", std::chrono::milliseconds(100));
if (!holder2)
{
throw Exception(
"Locking attempt timed out! Possible deadlock avoided. Client should retry.",
ErrorCodes::DEADLOCK_AVOIDED);
throw Exception(ErrorCodes::DEADLOCK_AVOIDED,
"Locking attempt timed out! Possible deadlock avoided. Client should retry.");
}
}
catch (const Exception & e)
@ -202,9 +201,8 @@ TEST(Common, RWLockDeadlock)
auto holder1 = lock1->getLock(RWLockImpl::Read, "q3", std::chrono::milliseconds(100));
if (!holder1)
{
throw Exception(
"Locking attempt timed out! Possible deadlock avoided. Client should retry.",
ErrorCodes::DEADLOCK_AVOIDED);
throw Exception(ErrorCodes::DEADLOCK_AVOIDED,
"Locking attempt timed out! Possible deadlock avoided. Client should retry.");
}
}
catch (const Exception & e)


@ -37,7 +37,7 @@ To typeid_cast(From & from)
}
catch (const std::exception & e)
{
throw DB::Exception(e.what(), DB::ErrorCodes::LOGICAL_ERROR);
throw DB::Exception::createDeprecated(e.what(), DB::ErrorCodes::LOGICAL_ERROR);
}
throw DB::Exception(DB::ErrorCodes::LOGICAL_ERROR, "Bad cast from type {} to {}",
@ -58,7 +58,7 @@ To typeid_cast(From * from)
}
catch (const std::exception & e)
{
throw DB::Exception(e.what(), DB::ErrorCodes::LOGICAL_ERROR);
throw DB::Exception::createDeprecated(e.what(), DB::ErrorCodes::LOGICAL_ERROR);
}
}
@ -93,6 +93,6 @@ To typeid_cast(const std::shared_ptr<From> & from)
}
catch (const std::exception & e)
{
throw DB::Exception(e.what(), DB::ErrorCodes::LOGICAL_ERROR);
throw DB::Exception::createDeprecated(e.what(), DB::ErrorCodes::LOGICAL_ERROR);
}
}
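Reduced to its core, the pattern these hunks touch looks roughly like the sketch below (simplified: the real `typeid_cast` also covers pointers and `shared_ptr`, and reports the source and target type names):

``` cpp
#include <stdexcept>
#include <typeinfo>

template <typename To, typename From>
To & checked_cast(From & from)
{
    /// typeid on a polymorphic reference yields the dynamic type.
    if (typeid(from) != typeid(To))
        throw std::logic_error("Bad cast");
    return static_cast<To &>(from);
}

struct Base { virtual ~Base() = default; };
struct Derived : Base {};
struct Other : Base {};

int main()
{
    Derived d;
    Base & b = d;
    Derived & back = checked_cast<Derived>(b); /// OK: dynamic type matches
    (void)back;
    /// checked_cast<Other>(b);                /// would throw: dynamic type is Derived
}
```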


@ -86,7 +86,7 @@ static void validateChecksum(char * data, size_t size, const Checksum expected_c
{
message << ". The mismatch is caused by single bit flip in data block at byte " << (bit_pos / 8) << ", bit " << (bit_pos % 8) << ". "
<< message_hardware_failure;
throw Exception(message.str(), ErrorCodes::CHECKSUM_DOESNT_MATCH);
throw Exception::createDeprecated(message.str(), ErrorCodes::CHECKSUM_DOESNT_MATCH);
}
flip_bit(tmp_data, bit_pos); /// Restore
@ -101,10 +101,10 @@ static void validateChecksum(char * data, size_t size, const Checksum expected_c
{
message << ". The mismatch is caused by single bit flip in checksum. "
<< message_hardware_failure;
throw Exception(message.str(), ErrorCodes::CHECKSUM_DOESNT_MATCH);
throw Exception::createDeprecated(message.str(), ErrorCodes::CHECKSUM_DOESNT_MATCH);
}
throw Exception(message.str(), ErrorCodes::CHECKSUM_DOESNT_MATCH);
throw Exception::createDeprecated(message.str(), ErrorCodes::CHECKSUM_DOESNT_MATCH);
}
static void readHeaderAndGetCodecAndSize(
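What this diagnostic does, as a self-contained sketch with a toy checksum standing in for the real one (all names here are hypothetical): flip each bit of the block in turn, recompute the checksum, and report the bit position if a single flip explains the mismatch, which is strong evidence of hardware-level corruption.

``` cpp
#include <cstddef>
#include <cstdint>
#include <optional>

using Checksum = uint32_t;

/// Toy stand-in for the real checksum function.
Checksum computeChecksum(const char * data, size_t size)
{
    Checksum c = 0;
    for (size_t i = 0; i < size; ++i)
        c = c * 31 + static_cast<unsigned char>(data[i]);
    return c;
}

/// Returns the bit position whose flip makes the checksum match, if any.
std::optional<size_t> findSingleBitFlip(char * data, size_t size, Checksum expected)
{
    for (size_t bit = 0; bit < size * 8; ++bit)
    {
        data[bit / 8] ^= static_cast<char>(1u << (bit % 8)); /// flip
        const bool matches = computeChecksum(data, size) == expected;
        data[bit / 8] ^= static_cast<char>(1u << (bit % 8)); /// restore
        if (matches)
            return bit;
    }
    return std::nullopt;
}

int main()
{
    char data[] = {0x01, 0x02, 0x03, 0x04};
    const Checksum expected = computeChecksum(data, sizeof(data));
    data[1] ^= 0x10;                                      /// simulate a bit flip
    const auto bit = findSingleBitFlip(data, sizeof(data), expected);
    /// bit == 12, i.e. byte 1, bit 4.
    (void)bit;
}
```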


@ -30,7 +30,7 @@ protected:
bool isGenericCompression() const override { return false; }
private:
UInt8 delta_bytes_size;
const UInt8 delta_bytes_size;
};
@ -68,8 +68,8 @@ void compressDataForType(const char * source, UInt32 source_size, char * dest)
if (source_size % sizeof(T) != 0)
throw Exception(ErrorCodes::CANNOT_COMPRESS, "Cannot delta compress, data size {} is not aligned to {}", source_size, sizeof(T));
T prev_src{};
const char * source_end = source + source_size;
T prev_src = 0;
const char * const source_end = source + source_size;
while (source < source_end)
{
T curr_src = unalignedLoad<T>(source);
@ -84,17 +84,17 @@ void compressDataForType(const char * source, UInt32 source_size, char * dest)
template <typename T>
void decompressDataForType(const char * source, UInt32 source_size, char * dest, UInt32 output_size)
{
const char * output_end = dest + output_size;
const char * const output_end = dest + output_size;
if (source_size % sizeof(T) != 0)
throw Exception(ErrorCodes::CANNOT_DECOMPRESS, "Cannot delta decompress, data size {} is not aligned to {}", source_size, sizeof(T));
T accumulator{};
const char * source_end = source + source_size;
const char * const source_end = source + source_size;
while (source < source_end)
{
accumulator += unalignedLoad<T>(source);
if (dest + sizeof(accumulator) > output_end)
if (dest + sizeof(accumulator) > output_end) [[unlikely]]
throw Exception(ErrorCodes::CANNOT_DECOMPRESS, "Cannot decompress the data");
unalignedStore<T>(dest, accumulator);
@ -140,7 +140,7 @@ void CompressionCodecDelta::doDecompressData(const char * source, UInt32 source_
UInt8 bytes_size = source[0];
if (bytes_size == 0)
if (!(bytes_size == 1 || bytes_size == 2 || bytes_size == 4 || bytes_size == 8))
throw Exception(ErrorCodes::CANNOT_DECOMPRESS, "Cannot decompress. File has wrong header");
UInt8 bytes_to_skip = uncompressed_size % bytes_size;
@ -190,7 +190,7 @@ UInt8 getDeltaBytesSize(const IDataType * column_type)
void registerCodecDelta(CompressionCodecFactory & factory)
{
UInt8 method_code = static_cast<UInt8>(CompressionMethodByte::Delta);
factory.registerCompressionCodecWithType("Delta", method_code, [&](const ASTPtr & arguments, const IDataType * column_type) -> CompressionCodecPtr
auto codec_builder = [&](const ASTPtr & arguments, const IDataType * column_type) -> CompressionCodecPtr
{
UInt8 delta_bytes_size = 0;
@ -215,7 +215,8 @@ void registerCodecDelta(CompressionCodecFactory & factory)
}
return std::make_shared<CompressionCodecDelta>(delta_bytes_size);
});
};
factory.registerCompressionCodecWithType("Delta", method_code, codec_builder);
}
CompressionCodecPtr getCompressionCodecDelta(UInt8 delta_bytes_size)
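In isolation, the Delta scheme this file implements looks like the following sketch (one width only, `int64_t`; the codec templates the same logic over 1/2/4/8-byte types): encode each value as the difference from its predecessor, decode with a running sum.

``` cpp
#include <cstdint>
#include <vector>

std::vector<int64_t> deltaEncode(const std::vector<int64_t> & values)
{
    std::vector<int64_t> deltas;
    deltas.reserve(values.size());
    int64_t prev = 0;
    for (const int64_t v : values)
    {
        deltas.push_back(v - prev);
        prev = v;
    }
    return deltas;
}

std::vector<int64_t> deltaDecode(const std::vector<int64_t> & deltas)
{
    std::vector<int64_t> values;
    values.reserve(deltas.size());
    int64_t accumulator = 0;
    for (const int64_t d : deltas)
    {
        accumulator += d;
        values.push_back(accumulator);
    }
    return values;
}

int main()
{
    const auto deltas = deltaEncode({10, 12, 15, 15, 20}); /// {10, 2, 3, 0, 5}
    const auto values = deltaDecode(deltas);               /// {10, 12, 15, 15, 20}
    (void)values;
}
```

Monotonic sequences turn into small, highly compressible deltas, which is why this codec is typically chained with a generic one.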


@ -141,7 +141,7 @@ size_t encrypt(std::string_view plaintext, char * ciphertext_and_tag, Encryption
reinterpret_cast<const uint8_t*>(key.data()), key.size(),
tag_size, nullptr);
if (!ok_init)
throw Exception(lastErrorString(), ErrorCodes::OPENSSL_ERROR);
throw Exception::createDeprecated(lastErrorString(), ErrorCodes::OPENSSL_ERROR);
/// encrypt data using context and given nonce.
size_t out_len;
@ -152,7 +152,7 @@ size_t encrypt(std::string_view plaintext, char * ciphertext_and_tag, Encryption
reinterpret_cast<const uint8_t *>(plaintext.data()), plaintext.size(),
nullptr, 0);
if (!ok_open)
throw Exception(lastErrorString(), ErrorCodes::OPENSSL_ERROR);
throw Exception::createDeprecated(lastErrorString(), ErrorCodes::OPENSSL_ERROR);
return out_len;
}
@ -171,7 +171,7 @@ size_t decrypt(std::string_view ciphertext, char * plaintext, EncryptionMethod m
reinterpret_cast<const uint8_t*>(key.data()), key.size(),
tag_size, nullptr);
if (!ok_init)
throw Exception(lastErrorString(), ErrorCodes::OPENSSL_ERROR);
throw Exception::createDeprecated(lastErrorString(), ErrorCodes::OPENSSL_ERROR);
/// decrypt data using given nonce
size_t out_len;
@ -182,7 +182,7 @@ size_t decrypt(std::string_view ciphertext, char * plaintext, EncryptionMethod m
reinterpret_cast<const uint8_t *>(ciphertext.data()), ciphertext.size(),
nullptr, 0);
if (!ok_open)
throw Exception(lastErrorString(), ErrorCodes::OPENSSL_ERROR);
throw Exception::createDeprecated(lastErrorString(), ErrorCodes::OPENSSL_ERROR);
return out_len;
}


@ -11,19 +11,18 @@
#include <IO/ReadBufferFromMemory.h>
#include <IO/BitHelpers.h>
#include <bitset>
#include <cstring>
#include <algorithm>
#include <type_traits>
#include <bitset>
namespace DB
{
/** Gorilla column codec implementation.
*
* Based on Gorilla paper: http://www.vldb.org/pvldb/vol8/p1816-teller.pdf
* Based on Gorilla paper: https://dl.acm.org/doi/10.14778/2824032.2824078
*
* This codec is best used against monotonic floating sequences, like CPU usage percentage
* or any other gauge.
@ -125,7 +124,7 @@ protected:
bool isGenericCompression() const override { return false; }
private:
UInt8 data_bytes_size;
const UInt8 data_bytes_size;
};
@ -139,7 +138,7 @@ namespace ErrorCodes
namespace
{
constexpr inline UInt8 getBitLengthOfLength(UInt8 data_bytes_size)
constexpr UInt8 getBitLengthOfLength(UInt8 data_bytes_size)
{
// 1-byte value is 8 bits, and we need 4 bits to represent 8 : 1000,
// 2-byte 16 bits => 5
@ -147,21 +146,20 @@ constexpr inline UInt8 getBitLengthOfLength(UInt8 data_bytes_size)
// 8-byte 64 bits => 7
const UInt8 bit_lengths[] = {0, 4, 5, 0, 6, 0, 0, 0, 7};
assert(data_bytes_size >= 1 && data_bytes_size < sizeof(bit_lengths) && bit_lengths[data_bytes_size] != 0);
return bit_lengths[data_bytes_size];
}
UInt32 getCompressedHeaderSize(UInt8 data_bytes_size)
{
const UInt8 items_count_size = 4;
constexpr UInt8 items_count_size = 4;
return items_count_size + data_bytes_size;
}
UInt32 getCompressedDataSize(UInt8 data_bytes_size, UInt32 uncompressed_size)
{
const UInt32 items_count = uncompressed_size / data_bytes_size;
static const auto DATA_BIT_LENGTH = getBitLengthOfLength(data_bytes_size);
// -1 since there must be at least 1 non-zero bit.
static const auto LEADING_ZEROES_BIT_LENGTH = DATA_BIT_LENGTH - 1;
@ -182,7 +180,7 @@ struct BinaryValueInfo
};
template <typename T>
BinaryValueInfo getLeadingAndTrailingBits(const T & value)
BinaryValueInfo getBinaryValueInfo(const T & value)
{
constexpr UInt8 bit_size = sizeof(T) * 8;
@ -190,28 +188,25 @@ BinaryValueInfo getLeadingAndTrailingBits(const T & value)
const UInt8 tz = getTrailingZeroBits(value);
const UInt8 data_size = value == 0 ? 0 : static_cast<UInt8>(bit_size - lz - tz);
return BinaryValueInfo{lz, data_size, tz};
return {lz, data_size, tz};
}
template <typename T>
UInt32 compressDataForType(const char * source, UInt32 source_size, char * dest, UInt32 dest_size)
{
static const auto DATA_BIT_LENGTH = getBitLengthOfLength(sizeof(T));
// -1 since there must be at least 1 non-zero bit.
static const auto LEADING_ZEROES_BIT_LENGTH = DATA_BIT_LENGTH - 1;
if (source_size % sizeof(T) != 0)
throw Exception(ErrorCodes::CANNOT_COMPRESS, "Cannot compress, data size {} is not aligned to {}", source_size, sizeof(T));
const char * source_end = source + source_size;
const char * dest_start = dest;
const char * dest_end = dest + dest_size;
const char * const source_end = source + source_size;
const char * const dest_start = dest;
const char * const dest_end = dest + dest_size;
const UInt32 items_count = source_size / sizeof(T);
unalignedStoreLE<UInt32>(dest, items_count);
dest += sizeof(items_count);
T prev_value{};
T prev_value = 0;
// That would cause first XORed value to be written in-full.
BinaryValueInfo prev_xored_info{0, 0, 0};
@ -226,13 +221,17 @@ UInt32 compressDataForType(const char * source, UInt32 source_size, char * dest,
BitWriter writer(dest, dest_end - dest);
static const auto DATA_BIT_LENGTH = getBitLengthOfLength(sizeof(T));
// -1 since there must be at least 1 non-zero bit.
static const auto LEADING_ZEROES_BIT_LENGTH = DATA_BIT_LENGTH - 1;
while (source < source_end)
{
const T curr_value = unalignedLoadLE<T>(source);
source += sizeof(curr_value);
const auto xored_data = curr_value ^ prev_value;
const BinaryValueInfo curr_xored_info = getLeadingAndTrailingBits(xored_data);
const BinaryValueInfo curr_xored_info = getBinaryValueInfo(xored_data);
if (xored_data == 0)
{
@ -265,11 +264,7 @@ UInt32 compressDataForType(const char * source, UInt32 source_size, char * dest,
template <typename T>
void decompressDataForType(const char * source, UInt32 source_size, char * dest)
{
static const auto DATA_BIT_LENGTH = getBitLengthOfLength(sizeof(T));
// -1 since there must be at least 1 non-zero bit.
static const auto LEADING_ZEROES_BIT_LENGTH = DATA_BIT_LENGTH - 1;
const char * source_end = source + source_size;
const char * const source_end = source + source_size;
if (source + sizeof(UInt32) > source_end)
return;
@ -277,7 +272,7 @@ void decompressDataForType(const char * source, UInt32 source_size, char * dest)
const UInt32 items_count = unalignedLoadLE<UInt32>(source);
source += sizeof(items_count);
T prev_value{};
T prev_value = 0;
// decoding first item
if (source + sizeof(T) > source_end || items_count < 1)
@ -293,13 +288,17 @@ void decompressDataForType(const char * source, UInt32 source_size, char * dest)
BinaryValueInfo prev_xored_info{0, 0, 0};
static const auto DATA_BIT_LENGTH = getBitLengthOfLength(sizeof(T));
// -1 since there must be at least 1 non-zero bit.
static const auto LEADING_ZEROES_BIT_LENGTH = DATA_BIT_LENGTH - 1;
// since data is tightly packed, up to 1 bit per value, and last byte is padded with zeroes,
// we have to keep track of items to avoid reading more that there is.
for (UInt32 items_read = 1; items_read < items_count && !reader.eof(); ++items_read)
{
T curr_value = prev_value;
BinaryValueInfo curr_xored_info = prev_xored_info;
T xored_data{};
T xored_data = 0;
if (reader.readBit() == 1)
{
@ -314,7 +313,7 @@ void decompressDataForType(const char * source, UInt32 source_size, char * dest)
if (curr_xored_info.leading_zero_bits == 0
&& curr_xored_info.data_bits == 0
&& curr_xored_info.trailing_zero_bits == 0)
&& curr_xored_info.trailing_zero_bits == 0) [[unlikely]]
{
throw Exception(ErrorCodes::CANNOT_DECOMPRESS, "Cannot decompress gorilla-encoded data: corrupted input data.");
}
@ -403,7 +402,7 @@ UInt32 CompressionCodecGorilla::doCompressData(const char * source, UInt32 sourc
break;
}
return 1 + 1 + result_size;
return 2 + bytes_to_skip + result_size;
}
void CompressionCodecGorilla::doDecompressData(const char * source, UInt32 source_size, char * dest, UInt32 uncompressed_size) const
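The XOR step at the heart of Gorilla, as a standalone sketch (`uint64_t` only; the codec templates this over the column's byte width): consecutive values are XORed, and only the non-zero "middle" of the XOR is stored, described by its leading-zero, data, and trailing-zero bit counts, i.e. the `BinaryValueInfo` triple from the diff above.

``` cpp
#include <bit>
#include <cstdint>

struct BinaryValueInfo
{
    uint8_t leading_zero_bits;
    uint8_t data_bits;
    uint8_t trailing_zero_bits;
};

BinaryValueInfo getBinaryValueInfo(uint64_t value)
{
    constexpr uint8_t bit_size = sizeof(uint64_t) * 8;
    const auto lz = static_cast<uint8_t>(std::countl_zero(value));
    const auto tz = static_cast<uint8_t>(std::countr_zero(value));
    const uint8_t data = value == 0 ? 0 : static_cast<uint8_t>(bit_size - lz - tz);
    return {lz, data, tz};
}

int main()
{
    const uint64_t prev = 0x4045800000000000; /// bit pattern of the double 43.0
    const uint64_t curr = 0x4046000000000000; /// bit pattern of the double 44.0
    const auto info = getBinaryValueInfo(prev ^ curr);
    /// info == {14, 3, 47}: only 3 data bits have to be written for this value,
    /// which is why the codec works so well on slowly-changing gauges.
    (void)info;
}
```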


@ -11,13 +11,6 @@
namespace DB
{
class ICompressionCodec;
using CompressionCodecPtr = std::shared_ptr<ICompressionCodec>;
using Codecs = std::vector<CompressionCodecPtr>;
class IDataType;
extern "C" int LLVMFuzzerTestOneInput(const uint8_t * data, size_t size);
/**
@ -120,7 +113,7 @@ protected:
/// Return size of compressed data without header
virtual UInt32 getMaxCompressedDataSize(UInt32 uncompressed_size) const { return uncompressed_size; }
/// Actually compress data, without header
/// Actually compress data without header
virtual UInt32 doCompressData(const char * source, UInt32 source_size, char * dest) const = 0;
/// Actually decompress data without header
@ -134,4 +127,7 @@ private:
CodecMode decompressMode{CodecMode::Synchronous};
};
using CompressionCodecPtr = std::shared_ptr<ICompressionCodec>;
using Codecs = std::vector<CompressionCodecPtr>;
}


@ -29,11 +29,13 @@ namespace ErrorCodes
extern const int AMBIGUOUS_COLUMN_NAME;
}
template <typename ReturnType>
static ReturnType onError(const std::string & message [[maybe_unused]], int code [[maybe_unused]])
template <typename ReturnType, typename... FmtArgs>
static ReturnType onError(int code [[maybe_unused]],
FormatStringHelper<FmtArgs...> fmt_string [[maybe_unused]],
FmtArgs && ...fmt_args [[maybe_unused]])
{
if constexpr (std::is_same_v<ReturnType, void>)
throw Exception(message, code);
throw Exception(code, std::move(fmt_string), std::forward<FmtArgs>(fmt_args)...);
else
return false;
}
@ -44,13 +46,13 @@ static ReturnType checkColumnStructure(const ColumnWithTypeAndName & actual, con
std::string_view context_description, bool allow_materialize, int code)
{
if (actual.name != expected.name)
return onError<ReturnType>("Block structure mismatch in " + std::string(context_description) + " stream: different names of columns:\n"
+ actual.dumpStructure() + "\n" + expected.dumpStructure(), code);
return onError<ReturnType>(code, "Block structure mismatch in {} stream: different names of columns:\n{}\n{}",
context_description, actual.dumpStructure(), expected.dumpStructure());
if ((actual.type && !expected.type) || (!actual.type && expected.type)
|| (actual.type && expected.type && !actual.type->equals(*expected.type)))
return onError<ReturnType>("Block structure mismatch in " + std::string(context_description) + " stream: different types:\n"
+ actual.dumpStructure() + "\n" + expected.dumpStructure(), code);
return onError<ReturnType>(code, "Block structure mismatch in {} stream: different types:\n{}\n{}",
context_description, actual.dumpStructure(), expected.dumpStructure());
if (!actual.column || !expected.column)
return ReturnType(true);
@ -74,22 +76,18 @@ static ReturnType checkColumnStructure(const ColumnWithTypeAndName & actual, con
if (actual_column_maybe_agg && expected_column_maybe_agg)
{
if (!actual_column_maybe_agg->getAggregateFunction()->haveSameStateRepresentation(*expected_column_maybe_agg->getAggregateFunction()))
return onError<ReturnType>(
fmt::format(
return onError<ReturnType>(code,
"Block structure mismatch in {} stream: different columns:\n{}\n{}",
context_description,
actual.dumpStructure(),
expected.dumpStructure()),
code);
expected.dumpStructure());
}
else if (actual_column->getName() != expected.column->getName())
return onError<ReturnType>(
fmt::format(
return onError<ReturnType>(code,
"Block structure mismatch in {} stream: different columns:\n{}\n{}",
context_description,
actual.dumpStructure(),
expected.dumpStructure()),
code);
expected.dumpStructure());
if (isColumnConst(*actual.column) && isColumnConst(*expected.column)
&& !actual.column->empty() && !expected.column->empty()) /// don't check values in empty columns
@ -98,14 +96,12 @@ static ReturnType checkColumnStructure(const ColumnWithTypeAndName & actual, con
Field expected_value = assert_cast<const ColumnConst &>(*expected.column).getField();
if (actual_value != expected_value)
return onError<ReturnType>(
fmt::format(
return onError<ReturnType>(code,
"Block structure mismatch in {} stream: different values of constants in column '{}': actual: {}, expected: {}",
context_description,
actual.name,
applyVisitor(FieldVisitorToString(), actual_value),
applyVisitor(FieldVisitorToString(), expected_value)),
code);
applyVisitor(FieldVisitorToString(), expected_value));
}
return ReturnType(true);
@ -117,8 +113,8 @@ static ReturnType checkBlockStructure(const Block & lhs, const Block & rhs, std:
{
size_t columns = rhs.columns();
if (lhs.columns() != columns)
return onError<ReturnType>("Block structure mismatch in " + std::string(context_description) + " stream: different number of columns:\n"
+ lhs.dumpStructure() + "\n" + rhs.dumpStructure(), ErrorCodes::LOGICAL_ERROR);
return onError<ReturnType>(ErrorCodes::LOGICAL_ERROR, "Block structure mismatch in {} stream: different number of columns:\n{}\n{}",
context_description, lhs.dumpStructure(), rhs.dumpStructure());
for (size_t i = 0; i < columns; ++i)
{

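The `onError` helper's throw-or-return dispatch, stripped of the new format-string plumbing, reduces to an `if constexpr` switch; a minimal sketch:

``` cpp
#include <stdexcept>
#include <string>
#include <type_traits>

/// ReturnType = void: throw on mismatch. ReturnType = bool: report it.
template <typename ReturnType>
ReturnType onError(const std::string & message)
{
    if constexpr (std::is_same_v<ReturnType, void>)
        throw std::logic_error(message);
    else
        return false;
}

int main()
{
    const bool ok = onError<bool>("Block structure mismatch"); /// returns false
    (void)ok;
    /// onError<void>("Block structure mismatch");             /// would throw
}
```

One set of checking code thus serves both throwing and boolean-returning callers.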

@ -95,7 +95,7 @@ void MySQLClient::handshake()
packet_endpoint->resetSequenceId();
if (packet_response.getType() == PACKET_ERR)
throw Exception(packet_response.err.error_message, ErrorCodes::UNKNOWN_PACKET_FROM_SERVER);
throw Exception::createDeprecated(packet_response.err.error_message, ErrorCodes::UNKNOWN_PACKET_FROM_SERVER);
else if (packet_response.getType() == PACKET_AUTH_SWITCH)
throw Exception(ErrorCodes::UNKNOWN_PACKET_FROM_SERVER, "Access denied for user {}", user);
}
@ -110,7 +110,7 @@ void MySQLClient::writeCommand(char command, String query)
switch (packet_response.getType())
{
case PACKET_ERR:
throw Exception(packet_response.err.error_message, ErrorCodes::UNKNOWN_PACKET_FROM_SERVER);
throw Exception::createDeprecated(packet_response.err.error_message, ErrorCodes::UNKNOWN_PACKET_FROM_SERVER);
case PACKET_OK:
break;
default:
@ -128,7 +128,7 @@ void MySQLClient::registerSlaveOnMaster(UInt32 slave_id)
packet_endpoint->receivePacket(packet_response);
packet_endpoint->resetSequenceId();
if (packet_response.getType() == PACKET_ERR)
throw Exception(packet_response.err.error_message, ErrorCodes::UNKNOWN_PACKET_FROM_SERVER);
throw Exception::createDeprecated(packet_response.err.error_message, ErrorCodes::UNKNOWN_PACKET_FROM_SERVER);
}
void MySQLClient::ping()


@ -111,7 +111,7 @@ namespace MySQLReplication
else if (query.starts_with("XA"))
{
if (query.starts_with("XA ROLLBACK"))
throw ReplicationError("ParseQueryEvent: Unsupported query event:" + query, ErrorCodes::LOGICAL_ERROR);
throw ReplicationError(ErrorCodes::LOGICAL_ERROR, "ParseQueryEvent: Unsupported query event: {}", query);
typ = QUERY_EVENT_XA;
if (!query.starts_with("XA COMMIT"))
transaction_complete = false;
@ -247,7 +247,7 @@ namespace MySQLReplication
break;
}
default:
throw ReplicationError("ParseMetaData: Unhandled data type:" + std::to_string(typ), ErrorCodes::UNKNOWN_EXCEPTION);
throw ReplicationError(ErrorCodes::UNKNOWN_EXCEPTION, "ParseMetaData: Unhandled data type: {}", std::to_string(typ));
}
}
}
@ -770,8 +770,8 @@ namespace MySQLReplication
break;
}
default:
throw ReplicationError(
"ParseRow: Unhandled MySQL field type:" + std::to_string(field_type), ErrorCodes::UNKNOWN_EXCEPTION);
throw ReplicationError(ErrorCodes::UNKNOWN_EXCEPTION,
"ParseRow: Unhandled MySQL field type: {}", std::to_string(field_type));
}
}
null_index++;
@ -873,7 +873,7 @@ namespace MySQLReplication
break;
}
default:
throw ReplicationError("Position update with unsupported event", ErrorCodes::LOGICAL_ERROR);
throw ReplicationError(ErrorCodes::LOGICAL_ERROR, "Position update with unsupported event");
}
}
@ -901,11 +901,11 @@ namespace MySQLReplication
switch (header)
{
case PACKET_EOF:
throw ReplicationError("Master maybe lost", ErrorCodes::CANNOT_READ_ALL_DATA);
throw ReplicationError(ErrorCodes::CANNOT_READ_ALL_DATA, "Master maybe lost");
case PACKET_ERR:
ERRPacket err;
err.readPayloadWithUnpacked(payload);
throw ReplicationError(err.error_message, ErrorCodes::UNKNOWN_EXCEPTION);
throw ReplicationError::createDeprecated(err.error_message, ErrorCodes::UNKNOWN_EXCEPTION);
}
// skip the generic response packets header flag.
payload.ignore(1);


@ -74,7 +74,7 @@ ConnectionHolderPtr PoolWithFailover::get()
if (replicas_with_priority.empty())
throw DB::Exception(DB::ErrorCodes::LOGICAL_ERROR, "No address specified");
DB::WriteBufferFromOwnString error_message;
PreformattedMessage error_message;
for (size_t try_idx = 0; try_idx < max_tries; ++try_idx)
{
for (auto & priority : replicas_with_priority)
@ -107,7 +107,7 @@ ConnectionHolderPtr PoolWithFailover::get()
catch (const pqxx::broken_connection & pqxx_error)
{
LOG_ERROR(log, "Connection error: {}", pqxx_error.what());
error_message << fmt::format(
error_message = PreformattedMessage::create(
"Try {}. Connection to {} failed with error: {}\n",
try_idx + 1, DB::backQuote(replica.connection_info.host_port), pqxx_error.what());
@ -131,7 +131,7 @@ ConnectionHolderPtr PoolWithFailover::get()
}
}
throw DB::Exception(DB::ErrorCodes::POSTGRESQL_CONNECTION_FAILURE, error_message.str());
throw DB::Exception(error_message, DB::ErrorCodes::POSTGRESQL_CONNECTION_FAILURE);
}
}


@ -402,7 +402,7 @@ void SettingFieldEnum<EnumT, Traits>::readBinary(ReadBuffer & in)
auto it = map.find(value); \
if (it != map.end()) \
return it->second; \
throw Exception( \
throw Exception::createDeprecated( \
"Unexpected value of " #NEW_NAME ":" + std::to_string(std::underlying_type<EnumType>::type(value)), \
ERROR_CODE_FOR_UNEXPECTED_NAME); \
} \
@ -428,7 +428,7 @@ void SettingFieldEnum<EnumT, Traits>::readBinary(ReadBuffer & in)
msg += "'" + String{name} + "'"; \
} \
msg += "]"; \
throw Exception(msg, ERROR_CODE_FOR_UNEXPECTED_NAME); \
throw Exception::createDeprecated(msg, ERROR_CODE_FOR_UNEXPECTED_NAME); \
}
// Mostly like SettingFieldEnum, but can have multiple enum values (or none) set at once.


@ -933,7 +933,7 @@ void BaseDaemon::handleSignal(int signal_id)
onInterruptSignals(signal_id);
}
else
throw DB::Exception(std::string("Unsupported signal: ") + strsignal(signal_id), 0); // NOLINT(concurrency-mt-unsafe) // it is not thread-safe but ok in this context
throw DB::Exception::createDeprecated(std::string("Unsupported signal: ") + strsignal(signal_id), 0); // NOLINT(concurrency-mt-unsafe) // it is not thread-safe but ok in this context
}
void BaseDaemon::onInterruptSignals(int signal_id)


@ -189,7 +189,8 @@ template <template <typename> typename DecimalType>
inline DataTypePtr createDecimal(UInt64 precision_value, UInt64 scale_value)
{
if (precision_value < DecimalUtils::min_precision || precision_value > DecimalUtils::max_precision<Decimal256>)
throw Exception(ErrorCodes::ARGUMENT_OUT_OF_BOUND, "Wrong precision");
throw Exception(ErrorCodes::ARGUMENT_OUT_OF_BOUND, "Wrong precision: it must be between {} and {}, got {}",
DecimalUtils::min_precision, DecimalUtils::max_precision<Decimal256>, precision_value);
if (static_cast<UInt64>(scale_value) > precision_value)
throw Exception(ErrorCodes::ARGUMENT_OUT_OF_BOUND, "Negative scales and scales larger than precision are not supported");


@ -57,7 +57,7 @@ static DataTypePtr create(const ASTPtr & arguments)
{
if (func->name != "Nullable" || func->arguments->children.size() != 1)
throw Exception(ErrorCodes::UNEXPECTED_AST_STRUCTURE,
"Expected 'Nullable(<schema_name>)' as parameter for type Object", func->name);
"Expected 'Nullable(<schema_name>)' as parameter for type Object (function: {})", func->name);
schema_argument = func->arguments->children[0];
is_nullable = true;


@ -53,10 +53,10 @@ static std::optional<Exception> checkTupleNames(const Strings & names)
for (const auto & name : names)
{
if (name.empty())
return Exception("Names of tuple elements cannot be empty", ErrorCodes::BAD_ARGUMENTS);
return Exception(ErrorCodes::BAD_ARGUMENTS, "Names of tuple elements cannot be empty");
if (!names_set.insert(name).second)
return Exception("Names of tuple elements must be unique", ErrorCodes::DUPLICATE_COLUMN);
return Exception(ErrorCodes::DUPLICATE_COLUMN, "Names of tuple elements must be unique");
}
return {};
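`checkTupleNames` illustrates a validate-without-throwing style: return the would-be exception and let the caller decide whether to throw it. A minimal sketch of the same pattern using standard types:

``` cpp
#include <optional>
#include <stdexcept>
#include <string>
#include <unordered_set>
#include <vector>

std::optional<std::logic_error> checkNames(const std::vector<std::string> & names)
{
    std::unordered_set<std::string> seen;
    for (const auto & name : names)
    {
        if (name.empty())
            return std::logic_error("Names of tuple elements cannot be empty");
        if (!seen.insert(name).second)
            return std::logic_error("Names of tuple elements must be unique");
    }
    return {};
}

int main()
{
    if (const auto exception = checkNames({"a", "b", "a"}))
        throw *exception; /// duplicate "a": throws here
}
```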


@ -373,8 +373,8 @@ void SerializationArray::deserializeBinaryBulkWithMultipleStreams(
/// Check consistency between offsets and elements subcolumns.
/// But if elements column is empty - it's ok for columns of Nested types that was added by ALTER.
if (!nested_column->empty() && nested_column->size() != last_offset)
throw ParsingException("Cannot read all array values: read just " + toString(nested_column->size()) + " of " + toString(last_offset),
ErrorCodes::CANNOT_READ_ALL_DATA);
throw ParsingException(ErrorCodes::CANNOT_READ_ALL_DATA, "Cannot read all array values: read just {} of {}",
toString(nested_column->size()), toString(last_offset));
column = std::move(mutable_column);
}


@ -360,19 +360,20 @@ ReturnType SerializationNullable::deserializeTextEscapedAndRawImpl(IColumn & col
/// or if someone uses tab or LF in TSV null_representation.
/// In the first case we cannot continue reading anyway. The second case seems to be unlikely.
if (null_representation.find('\t') != std::string::npos || null_representation.find('\n') != std::string::npos)
throw DB::ParsingException("TSV custom null representation containing '\\t' or '\\n' may not work correctly "
"for large input.", ErrorCodes::CANNOT_READ_ALL_DATA);
throw DB::ParsingException(ErrorCodes::CANNOT_READ_ALL_DATA, "TSV custom null representation "
"containing '\\t' or '\\n' may not work correctly for large input.");
WriteBufferFromOwnString parsed_value;
if constexpr (escaped)
nested_serialization->serializeTextEscaped(nested_column, nested_column.size() - 1, parsed_value, settings);
else
nested_serialization->serializeTextRaw(nested_column, nested_column.size() - 1, parsed_value, settings);
throw DB::ParsingException("Error while parsing \"" + std::string(pos, buf.buffer().end()) + std::string(istr.position(), std::min(size_t(10), istr.available())) + "\" as Nullable"
+ " at position " + std::to_string(istr.count()) + ": got \"" + std::string(pos, buf.position() - pos)
+ "\", which was deserialized as \""
+ parsed_value.str() + "\". It seems that input data is ill-formatted.",
ErrorCodes::CANNOT_READ_ALL_DATA);
throw DB::ParsingException(ErrorCodes::CANNOT_READ_ALL_DATA, "Error while parsing \"{}{}\" as Nullable"
" at position {}: got \"{}\", which was deserialized as \"{}\". "
"It seems that input data is ill-formatted.",
std::string(pos, buf.buffer().end()),
std::string(istr.position(), std::min(size_t(10), istr.available())),
istr.count(), std::string(pos, buf.position() - pos), parsed_value.str());
};
return safeDeserialize<ReturnType>(column, *nested_serialization, check_for_null, deserialize_nested);
@ -584,16 +585,17 @@ ReturnType SerializationNullable::deserializeTextCSVImpl(IColumn & column, ReadB
/// In the first case we cannot continue reading anyway. The second case seems to be unlikely.
if (null_representation.find(settings.csv.delimiter) != std::string::npos || null_representation.find('\r') != std::string::npos
|| null_representation.find('\n') != std::string::npos)
throw DB::ParsingException("CSV custom null representation containing format_csv_delimiter, '\\r' or '\\n' may not work correctly "
"for large input.", ErrorCodes::CANNOT_READ_ALL_DATA);
throw DB::ParsingException(ErrorCodes::CANNOT_READ_ALL_DATA, "CSV custom null representation containing "
"format_csv_delimiter, '\\r' or '\\n' may not work correctly for large input.");
WriteBufferFromOwnString parsed_value;
nested_serialization->serializeTextCSV(nested_column, nested_column.size() - 1, parsed_value, settings);
throw DB::ParsingException("Error while parsing \"" + std::string(pos, buf.buffer().end()) + std::string(istr.position(), std::min(size_t(10), istr.available())) + "\" as Nullable"
+ " at position " + std::to_string(istr.count()) + ": got \"" + std::string(pos, buf.position() - pos)
+ "\", which was deserialized as \""
+ parsed_value.str() + "\". It seems that input data is ill-formatted.",
ErrorCodes::CANNOT_READ_ALL_DATA);
throw DB::ParsingException(ErrorCodes::CANNOT_READ_ALL_DATA, "Error while parsing \"{}{}\" as Nullable"
" at position {}: got \"{}\", which was deserialized as \"{}\". "
"It seems that input data is ill-formatted.",
std::string(pos, buf.buffer().end()),
std::string(istr.position(), std::min(size_t(10), istr.available())),
istr.count(), std::string(pos, buf.position() - pos), parsed_value.str());
};
return safeDeserialize<ReturnType>(column, *nested_serialization, check_for_null, deserialize_nested);


@ -231,7 +231,7 @@ void SerializationObject<Parser>::deserializeBinaryBulkStatePrefix(
auto kind = magic_enum::enum_cast<BinarySerializationKind>(kind_raw);
if (!kind)
throw Exception(ErrorCodes::INCORRECT_DATA,
"Unknown binary serialization kind of Object: " + std::to_string(kind_raw));
"Unknown binary serialization kind of Object: {}", std::to_string(kind_raw));
auto state_object = std::make_shared<DeserializeStateObject>();
state_object->kind = *kind;
@ -255,7 +255,7 @@ void SerializationObject<Parser>::deserializeBinaryBulkStatePrefix(
else
{
throw Exception(ErrorCodes::INCORRECT_DATA,
"Unknown binary serialization kind of Object: " + std::to_string(kind_raw));
"Unknown binary serialization kind of Object: {}", std::to_string(kind_raw));
}
settings.path.push_back(Substream::ObjectData);


@ -40,7 +40,6 @@ template <typename DataTypes>
String getExceptionMessagePrefix(const DataTypes & types)
{
WriteBufferFromOwnString res;
res << "There is no supertype for types ";
bool first = true;
for (const auto & type : types)
@ -65,9 +64,9 @@ DataTypePtr throwOrReturn(const DataTypes & types, std::string_view message_suff
return nullptr;
if (message_suffix.empty())
throw Exception(error_code, getExceptionMessagePrefix(types));
throw Exception(error_code, "There is no supertype for types {}", getExceptionMessagePrefix(types));
throw Exception(error_code, "{} {}", getExceptionMessagePrefix(types), message_suffix);
throw Exception(error_code, "There is no supertype for types {} {}", getExceptionMessagePrefix(types), message_suffix);
}
template <LeastSupertypeOnError on_error>


@ -49,7 +49,7 @@ DataTypePtr getMostSubtype(const DataTypes & types, bool throw_if_result_is_noth
auto get_nothing_or_throw = [throw_if_result_is_nothing, & types](const std::string & reason)
{
if (throw_if_result_is_nothing)
throw Exception(getExceptionMessagePrefix(types) + reason, ErrorCodes::NO_COMMON_TYPE);
throw Exception::createDeprecated(getExceptionMessagePrefix(types) + reason, ErrorCodes::NO_COMMON_TYPE);
return std::make_shared<DataTypeNothing>();
};


@ -47,10 +47,10 @@ getArgument(const ASTPtr & arguments, size_t argument_index, const char * argume
else
{
if (argument && argument->value.getType() != field_type)
throw Exception(getExceptionMessage(fmt::format(" has wrong type: {}", argument->value.getTypeName()),
throw Exception::createDeprecated(getExceptionMessage(fmt::format(" has wrong type: {}", argument->value.getTypeName()),
argument_index, argument_name, context_data_type_name, field_type), ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT);
else
throw Exception(getExceptionMessage(" is missing", argument_index, argument_name, context_data_type_name, field_type),
throw Exception::createDeprecated(getExceptionMessage(" is missing", argument_index, argument_name, context_data_type_name, field_type),
ErrorCodes::NUMBER_OF_ARGUMENTS_DOESNT_MATCH);
}
}
@ -67,7 +67,7 @@ static DataTypePtr create(const ASTPtr & arguments)
const auto timezone = getArgument<String, ArgumentKind::Optional>(arguments, !!scale, "timezone", "DateTime");
if (!scale && !timezone)
throw Exception(getExceptionMessage(" has wrong type: ", 0, "scale", "DateTime", Field::Types::Which::UInt64),
throw Exception::createDeprecated(getExceptionMessage(" has wrong type: ", 0, "scale", "DateTime", Field::Types::Which::UInt64),
ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT);
/// If scale is defined, the data type is DateTime when scale = 0 otherwise the data type is DateTime64


@ -41,10 +41,9 @@ namespace
}
catch (Exception & e)
{
throw Exception(
fmt::format("Error while loading dictionary '{}.{}': {}",
database_name, load_result.name, e.displayText()),
e.code());
throw Exception(e.code(),
"Error while loading dictionary '{}.{}': {}",
database_name, load_result.name, e.displayText());
}
}
}
@ -118,7 +117,7 @@ ASTPtr DatabaseDictionary::getCreateTableQueryImpl(const String & table_name, Co
/* hilite = */ false, "", /* allow_multi_statements = */ false, 0, settings.max_parser_depth);
if (!ast && throw_on_error)
throw Exception(error_message, ErrorCodes::SYNTAX_ERROR);
throw Exception::createDeprecated(error_message, ErrorCodes::SYNTAX_ERROR);
return ast;
}


@ -252,13 +252,11 @@ DatabasePtr DatabaseFactory::getImpl(const ASTCreateQuery & create, const String
{
auto print_create_ast = create.clone();
print_create_ast->as<ASTCreateQuery>()->attach = false;
throw Exception(
fmt::format(
throw Exception(ErrorCodes::NOT_IMPLEMENTED,
"The MaterializedMySQL database engine no longer supports Ordinary databases. To re-create the database, delete "
"the old one by executing \"rm -rf {}{{,.sql}}\", then re-create the database with the following query: {}",
metadata_path,
queryToString(print_create_ast)),
ErrorCodes::NOT_IMPLEMENTED);
queryToString(print_create_ast));
}
return std::make_shared<DatabaseMaterializedMySQL>(

View File

@ -676,7 +676,7 @@ ASTPtr DatabaseOnDisk::parseQueryFromMetadata(
"in file " + metadata_file_path, /* allow_multi_statements = */ false, 0, settings.max_parser_depth);
if (!ast && throw_on_error)
throw Exception(error_message, ErrorCodes::SYNTAX_ERROR);
throw Exception::createDeprecated(error_message, ErrorCodes::SYNTAX_ERROR);
else if (!ast)
return nullptr;


@ -136,7 +136,7 @@ ASTPtr DatabaseMySQL::getCreateTableQueryImpl(const String & table_name, Context
if (local_tables_cache.find(table_name) == local_tables_cache.end())
{
if (throw_on_error)
throw Exception(ErrorCodes::UNKNOWN_TABLE, "MySQL table {} doesn't exist.", database_name_in_mysql, table_name);
throw Exception(ErrorCodes::UNKNOWN_TABLE, "MySQL table {}.{} doesn't exist.", database_name_in_mysql, table_name);
return nullptr;
}
@ -180,7 +180,7 @@ time_t DatabaseMySQL::getObjectMetadataModificationTime(const String & table_nam
fetchTablesIntoLocalCache(getContext());
if (local_tables_cache.find(table_name) == local_tables_cache.end())
throw Exception(ErrorCodes::UNKNOWN_TABLE, "MySQL table {} doesn't exist.", database_name_in_mysql, table_name);
throw Exception(ErrorCodes::UNKNOWN_TABLE, "MySQL table {}.{} doesn't exist.", database_name_in_mysql, table_name);
return time_t(local_tables_cache[table_name].first);
}


@ -147,7 +147,7 @@ static void checkMySQLVariables(const mysqlxx::Pool::Entry & connection, const S
first = false;
}
throw Exception(error_message.str(), ErrorCodes::ILLEGAL_MYSQL_VARIABLE);
throw Exception::createDeprecated(error_message.str(), ErrorCodes::ILLEGAL_MYSQL_VARIABLE);
}
}


@ -214,12 +214,12 @@ void DatabasePostgreSQL::attachTable(ContextPtr /* context_ */, const String & t
if (!checkPostgresTable(table_name))
throw Exception(ErrorCodes::UNKNOWN_TABLE,
"Cannot attach PostgreSQL table {} because it does not exist in PostgreSQL",
"Cannot attach PostgreSQL table {} because it does not exist in PostgreSQL (database: {})",
getTableNameForLogs(table_name), database_name);
if (!detached_or_dropped.contains(table_name))
throw Exception(ErrorCodes::TABLE_ALREADY_EXISTS,
"Cannot attach PostgreSQL table {} because it already exists",
"Cannot attach PostgreSQL table {} because it already exists (database: {})",
getTableNameForLogs(table_name), database_name);
if (cache_tables)


@ -19,7 +19,7 @@ static std::mutex init_sqlite_db_mutex;
void processSQLiteError(const String & message, bool throw_on_error)
{
if (throw_on_error)
throw Exception(ErrorCodes::PATH_ACCESS_DENIED, message);
throw Exception::createDeprecated(message, ErrorCodes::PATH_ACCESS_DENIED);
else
LOG_ERROR(&Poco::Logger::get("SQLiteEngine"), fmt::runtime(message));
}


@ -54,9 +54,9 @@ void RegionsHierarchy::reload()
if (region_entry.id > max_region_id)
{
if (region_entry.id > max_size)
throw DB::Exception(
"Region id is too large: " + DB::toString(region_entry.id) + ", should be not more than " + DB::toString(max_size),
DB::ErrorCodes::INCORRECT_DATA);
throw DB::Exception(DB::ErrorCodes::INCORRECT_DATA,
"Region id is too large: {}, should be not more than {}",
DB::toString(region_entry.id), DB::toString(max_size));
max_region_id = region_entry.id;


@ -84,9 +84,9 @@ void RegionsNames::reload()
max_region_id = name_entry.id;
if (name_entry.id > max_size)
throw DB::Exception(
"Region id is too large: " + DB::toString(name_entry.id) + ", should be not more than " + DB::toString(max_size),
DB::ErrorCodes::INCORRECT_DATA);
throw DB::Exception(DB::ErrorCodes::INCORRECT_DATA,
"Region id is too large: {}, should be not more than {}",
DB::toString(name_entry.id), DB::toString(max_size));
}
while (name_entry.id >= new_names_refs.size())

Some files were not shown because too many files have changed in this diff.