Merge branch 'master' of https://github.com/ClickHouse/ClickHouse into fix_and_enable_testflows_tests

This commit is contained in:
Vitaliy Zakaznikov 2021-07-06 14:42:46 -04:00
commit 89f89e33d1
99 changed files with 2031 additions and 207 deletions

View File

@ -13,6 +13,3 @@ ClickHouse® is an open-source column-oriented database management system that a
* [Code Browser](https://clickhouse.tech/codebrowser/html_report/ClickHouse/index.html) with syntax highlight and navigation.
* [Contacts](https://clickhouse.tech/#contacts) can help to get your questions answered if there are any.
* You can also [fill this form](https://clickhouse.tech/#meet) to meet Yandex ClickHouse team in person.
## Upcoming Events
* [China ClickHouse Community Meetup (online)](http://hdxu.cn/rhbfZ) on 26 June 2021.

View File

@ -23,6 +23,7 @@
<!-- disable jit for perf tests -->
<compile_expressions>0</compile_expressions>
<compile_aggregate_expressions>0</compile_aggregate_expressions>
</default>
</profiles>
<users>

View File

@ -7,13 +7,13 @@ toc_title: Third-Party Libraries Used
The list of third-party libraries can be obtained by the following query:
```
``` sql
SELECT library_name, license_type, license_path FROM system.licenses ORDER BY library_name COLLATE 'en'
```
[Example](https://gh-api.clickhouse.tech/play?user=play#U0VMRUNUIGxpYnJhcnlfbmFtZSwgbGljZW5zZV90eXBlLCBsaWNlbnNlX3BhdGggRlJPTSBzeXN0ZW0ubGljZW5zZXMgT1JERVIgQlkgbGlicmFyeV9uYW1lIENPTExBVEUgJ2VuJw==)
| library_name | license_type | license_path |
| library_name | license_type | license_path |
|:-|:-|:-|
| abseil-cpp | Apache | /contrib/abseil-cpp/LICENSE |
| AMQP-CPP | Apache | /contrib/AMQP-CPP/LICENSE |
@ -89,3 +89,15 @@ SELECT library_name, license_type, license_path FROM system.licenses ORDER BY li
| xz | Public Domain | /contrib/xz/COPYING |
| zlib-ng | zLib | /contrib/zlib-ng/LICENSE.md |
| zstd | BSD | /contrib/zstd/LICENSE |
## Guidelines for adding new third-party libraries and maintaining custom changes in them {#adding-third-party-libraries}
1. All external third-party code should reside in dedicated directories under the `contrib` directory of the ClickHouse repo. Prefer Git submodules, when available.
2. Fork/mirror the official repo in [Clickhouse-extras](https://github.com/ClickHouse-Extras). Prefer official GitHub repos, when available.
3. Branch from the branch you want to integrate, e.g., `master` -> `clickhouse/master`, or `release/vX.Y.Z` -> `clickhouse/release/vX.Y.Z`.
4. All forks in [Clickhouse-extras](https://github.com/ClickHouse-Extras) can be automatically synchronized with upstreams. The `clickhouse/...` branches will remain unaffected, since virtually nobody is going to use that naming pattern in their upstream repos.
5. Add submodules under the `contrib` directory of the ClickHouse repo that refer to the above forks/mirrors. Set the submodules to track the corresponding `clickhouse/...` branches.
6. Every time custom changes have to be made in the library code, a dedicated branch should be created, like `clickhouse/my-fix`. This branch should then be merged into the branch that is tracked by the submodule, e.g., `clickhouse/master` or `clickhouse/release/vX.Y.Z`.
7. No code should be pushed to any branch of the forks in [Clickhouse-extras](https://github.com/ClickHouse-Extras) whose name does not follow the `clickhouse/...` pattern.
8. Always write the custom changes with the official repo in mind. Once the PR is merged from (a feature/fix branch in) your personal fork into the fork in [Clickhouse-extras](https://github.com/ClickHouse-Extras), and the submodule is bumped in the ClickHouse repo, consider opening another PR from (a feature/fix branch in) the fork in [Clickhouse-extras](https://github.com/ClickHouse-Extras) to the official repo of the library. This makes sure that 1) the contribution has more than a single use case and importance, 2) others will also benefit from it, and 3) the change will not remain a maintenance burden solely on ClickHouse developers.
9. When a submodule needs to start using newer code from the original branch (e.g., `master`), the custom changes may already be merged into the branch it is tracking (e.g., `clickhouse/master`), so it can diverge from its original counterpart (i.e., `master`). Therefore, a careful merge should be carried out first, i.e., `master` -> `clickhouse/master`, and only then can the submodule be bumped in ClickHouse.

View File

@ -237,6 +237,8 @@ The description of ClickHouse architecture can be found here: https://clickhouse
The Code Style Guide: https://clickhouse.tech/docs/en/development/style/
Adding third-party libraries: https://clickhouse.tech/docs/en/development/contrib/#adding-third-party-libraries
Writing tests: https://clickhouse.tech/docs/en/development/tests/
List of tasks: https://github.com/ClickHouse/ClickHouse/issues?q=is%3Aopen+is%3Aissue+label%3A%22easy+task%22

View File

@ -628,7 +628,7 @@ If the class is not intended for polymorphic use, you do not need to make functi
**18.** Encodings.
Use UTF-8 everywhere. Use `std::string`and`char *`. Do not use `std::wstring`and`wchar_t`.
Use UTF-8 everywhere. Use `std::string` and `char *`. Do not use `std::wstring` and `wchar_t`.
**19.** Logging.
@ -749,17 +749,9 @@ If your code in the `master` branch is not buildable yet, exclude it from the bu
**1.** The C++20 standard library is used (experimental extensions are allowed), as well as `boost` and `Poco` frameworks.
**2.** If necessary, you can use any well-known libraries available in the OS package.
**2.** It is not allowed to use libraries from OS packages. It is also not allowed to use pre-installed libraries. All libraries should be placed in the form of source code in the `contrib` directory and built together with ClickHouse.
If there is a good solution already available, then use it, even if it means you have to install another library.
(But be prepared to remove bad libraries from code.)
**3.** You can install a library that isn't in the packages if the packages do not have what you need, or have an outdated version or the wrong type of compilation.
**4.** If the library is small and does not have its own complex build system, put the source files in the `contrib` folder.
**5.** Preference is always given to libraries that are already in use.
**3.** Preference is always given to libraries that are already in use.
## General Recommendations {#general-recommendations-1}

View File

@ -1,3 +1,7 @@
---
toc_priority: 212
---
# median {#median}
The `median*` functions are the aliases for the corresponding `quantile*` functions. They calculate median of a numeric data sample.
@ -12,6 +16,7 @@ Functions:
- `medianTimingWeighted` — Alias for [quantileTimingWeighted](../../../sql-reference/aggregate-functions/reference/quantiletimingweighted.md#quantiletimingweighted).
- `medianTDigest` — Alias for [quantileTDigest](../../../sql-reference/aggregate-functions/reference/quantiletdigest.md#quantiletdigest).
- `medianTDigestWeighted` — Alias for [quantileTDigestWeighted](../../../sql-reference/aggregate-functions/reference/quantiletdigestweighted.md#quantiletdigestweighted).
- `medianBFloat16` — Alias for [quantileBFloat16](../../../sql-reference/aggregate-functions/reference/quantilebfloat16.md#quantilebfloat16).
**Example**
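Since every `median*` function is just an alias, it can be swapped for the corresponding `quantile*` call at level 0.5. A minimal sketch (the numeric sample here is arbitrary, chosen only for illustration):

``` sql
-- medianBFloat16(x) is equivalent to quantileBFloat16(0.5)(x).
SELECT
    medianBFloat16(number) AS median_value,
    quantileBFloat16(0.5)(number) AS quantile_value
FROM numbers(10);
```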

View File

@ -0,0 +1,64 @@
---
toc_priority: 209
---
# quantileBFloat16 {#quantilebfloat16}
Computes an approximate [quantile](https://en.wikipedia.org/wiki/Quantile) of a sample consisting of [bfloat16](https://en.wikipedia.org/wiki/Bfloat16_floating-point_format) numbers. `bfloat16` is a floating-point data type with 1 sign bit, 8 exponent bits and 7 fraction bits.
The function converts input values to 32-bit floats and takes the most significant 16 bits. Then it calculates the `bfloat16` quantile value and converts the result to a 64-bit float by appending zero bits.
The function is a fast quantile estimator with a relative error of no more than 0.390625%.
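One way to see the effect of the `bfloat16` truncation is to compare this estimator with the exact quantile on the same data. The query below is only a sketch (any numeric sample works); the observed difference depends on the data:

``` sql
-- The approximate median is computed on values truncated to bfloat16,
-- so it may differ slightly from the exact one.
SELECT
    quantileBFloat16(0.5)(number / 1000) AS approx_median,
    quantileExact(0.5)(number / 1000) AS exact_median
FROM numbers(1000);
```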
**Syntax**
``` sql
quantileBFloat16[(level)](expr)
```
Alias: `medianBFloat16`
**Arguments**
- `expr` — Column with numeric data. [Integer](../../../sql-reference/data-types/int-uint.md), [Float](../../../sql-reference/data-types/float.md).
**Parameters**
- `level` — Level of quantile. Optional. Possible values are in the range from 0 to 1. Default value: 0.5. [Float](../../../sql-reference/data-types/float.md).
**Returned value**
- Approximate quantile of the specified level.
Type: [Float64](../../../sql-reference/data-types/float.md#float32-float64).
**Example**
The input table has an integer column and a float column:
``` text
┌─a─┬─────b─┐
│ 1 │ 1.001 │
│ 2 │ 1.002 │
│ 3 │ 1.003 │
│ 4 │ 1.004 │
└───┴───────┘
```
Query to calculate 0.75-quantile (third quartile):
``` sql
SELECT quantileBFloat16(0.75)(a), quantileBFloat16(0.75)(b) FROM example_table;
```
Result:
``` text
┌─quantileBFloat16(0.75)(a)─┬─quantileBFloat16(0.75)(b)─┐
│ 3 │ 1 │
└───────────────────────────┴───────────────────────────┘
```
Note that all floating-point values in the example are truncated to 1.0 when converted to `bfloat16`.
**See Also**
- [median](../../../sql-reference/aggregate-functions/reference/median.md#median)
- [quantiles](../../../sql-reference/aggregate-functions/reference/quantiles.md#quantiles)

View File

@ -74,7 +74,7 @@ When using multiple `quantile*` functions with different levels in a query, the
**Syntax**
``` sql
quantileExact(level)(expr)
quantileExactLow(level)(expr)
```
Alias: `medianExactLow`.

View File

@ -8,7 +8,7 @@ toc_priority: 201
Syntax: `quantiles(level1, level2, …)(x)`
All the quantile functions also have corresponding quantiles functions: `quantiles`, `quantilesDeterministic`, `quantilesTiming`, `quantilesTimingWeighted`, `quantilesExact`, `quantilesExactWeighted`, `quantilesTDigest`. These functions calculate all the quantiles of the listed levels in one pass, and return an array of the resulting values.
All the quantile functions also have corresponding quantiles functions: `quantiles`, `quantilesDeterministic`, `quantilesTiming`, `quantilesTimingWeighted`, `quantilesExact`, `quantilesExactWeighted`, `quantilesTDigest`, `quantilesBFloat16`. These functions calculate all the quantiles of the listed levels in one pass, and return an array of the resulting values.
## quantilesExactExclusive {#quantilesexactexclusive}
@ -18,7 +18,7 @@ To get exact value, all the passed values are combined into an array, whic
This function is equivalent to the [PERCENTILE.EXC](https://support.microsoft.com/en-us/office/percentile-exc-function-bbaa7204-e9e1-4010-85bf-c31dc5dce4ba) Excel function ([type R6](https://en.wikipedia.org/wiki/Quantile#Estimating_quantiles_from_a_sample)).
Works more efficiently with sets of levels than [quantilesExactExclusive](../../../sql-reference/aggregate-functions/reference/quantileexact.md#quantileexactexclusive).
Works more efficiently with sets of levels than [quantileExactExclusive](../../../sql-reference/aggregate-functions/reference/quantileexact.md#quantileexactexclusive).
**Syntax**
@ -70,7 +70,7 @@ To get exact value, all the passed values are combined into an array, whic
This function is equivalent to the [PERCENTILE.INC](https://support.microsoft.com/en-us/office/percentile-inc-function-680f9539-45eb-410b-9a5e-c1355e5fe2ed) Excel function ([type R7](https://en.wikipedia.org/wiki/Quantile#Estimating_quantiles_from_a_sample)).
Works more efficiently with sets of levels than [quantilesExactInclusive](../../../sql-reference/aggregate-functions/reference/quantileexact.md#quantilesexactinclusive).
Works more efficiently with sets of levels than [quantileExactInclusive](../../../sql-reference/aggregate-functions/reference/quantileexact.md#quantileexactinclusive).
**Syntax**

View File

@ -80,6 +80,7 @@ SELECT toDateTime('2019-01-01 00:00:00', 'UTC') AS time_utc,
toInt32(time_samoa) AS int32samoa
FORMAT Vertical;
```
Result:
```text
@ -1014,7 +1015,7 @@ Result:
## dateName {#dataname}
Returns part of date with specified date part.
Returns the specified part of a date.
**Syntax**
@ -1024,13 +1025,13 @@ dateName(date_part, date)
**Arguments**
- `date_part` - Date part. Possible values .
- `date` — Date [Date](../../sql-reference/data-types/date.md) or DateTime [DateTime](../../sql-reference/data-types/datetime.md), [DateTime64](../../sql-reference/data-types/datetime64.md).
- `date_part` — Date part. Possible values: 'year', 'quarter', 'month', 'week', 'dayofyear', 'day', 'weekday', 'hour', 'minute', 'second'. [String](../../sql-reference/data-types/string.md).
- `date` — Date. [Date](../../sql-reference/data-types/date.md), [DateTime](../../sql-reference/data-types/datetime.md) or [DateTime64](../../sql-reference/data-types/datetime64.md).
- `timezone` — Timezone. Optional. [String](../../sql-reference/data-types/string.md).
**Returned value**
- Specified date part of date.
- The specified part of the date.
Type: [String](../../sql-reference/data-types/string.md#string)
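A usage sketch, mirroring the example in the Russian version of this page (the `date_value` alias is purely illustrative):

``` sql
WITH toDateTime('2021-04-14 11:22:33') AS date_value
SELECT
    dateName('year', date_value) AS year,
    dateName('month', date_value) AS month,
    dateName('day', date_value) AS day;
-- Expected: 2021, April, 14
```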

View File

@ -224,7 +224,7 @@ Accepts an integer. Returns an array of UInt64 numbers containing the list of po
## bitPositionsToArray(num) {#bitpositionstoarraynum}
Accepts an integer, argument will be converted to unsigned integer type. Returns an array of UInt64 numbers containing the list of positions of bits that equals 1. Numbers in the array are in ascending order.
Accepts an integer and converts it to an unsigned integer. Returns an array of `UInt64` numbers containing the list of positions of bits of `arg` that equal `1`, in ascending order.
**Syntax**
@ -234,11 +234,13 @@ bitPositionsToArray(arg)
**Arguments**
- `arg` — Integer value.Types: [Int/UInt](../../sql-reference/data-types/int-uint.md)
- `arg` — Integer value. [Int/UInt](../../sql-reference/data-types/int-uint.md).
**Returned value**
An array of UInt64 numbers containing the list of positions of bits that equals 1. Numbers in the array are in ascending order.
- An array containing a list of positions of bits that equal `1`, in ascending order.
Type: [Array](../../sql-reference/data-types/array.md)([UInt64](../../sql-reference/data-types/int-uint.md)).
**Example**
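A minimal sketch of the expected behaviour, mirroring the examples in the Russian version of this page:

``` sql
SELECT bitPositionsToArray(toInt8(1)) AS bit_positions;
-- [0]

SELECT bitPositionsToArray(toInt8(-1)) AS bit_positions;
-- [0,1,2,3,4,5,6,7]  (the negative value is reinterpreted as unsigned first)
```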

View File

@ -749,19 +749,11 @@ CPU命令セットは、サーバー間でサポートされる最小のセッ
## Libraries {#libraries}
**1.** The C++20 standard library is used (experimental extensions are allowed), as well as `boost` and `Poco` frameworks.
**1.** The C++20 standard library is used (experimental extensions are allowed), as well as `boost` and `Poco` frameworks.
**2.** If necessary, you can use any well-known libraries available in OS packages.
**2.** It is not allowed to use libraries from OS packages. It is also not allowed to use pre-installed libraries. All libraries should be placed in the form of source code in the `contrib` directory and built together with ClickHouse.
If a good solution is already available, use it, even if it means you have to install another library.
(But be prepared to remove bad libraries from the code.)
**3.** You can install a library that is not in the packages if the packages do not have what you need, or have an outdated version or the wrong type of compilation.
**4.** If the library is small and does not have its own complex build system, put the source files in the `contrib` folder.
**5.** Preference is always given to libraries that are already in use.
**3.** Preference is always given to libraries that are already in use.
## General Recommendations {#general-recommendations-1}

View File

@ -824,17 +824,9 @@ The dictionary is configured incorrectly.
**1.** The C++20 standard library is used (experimental extensions are allowed), as well as the `boost` and `Poco` frameworks.
**2.** If necessary, you can use any well-known libraries available in the OS from packages.
**2.** Libraries must be placed as source code in the `contrib` directory and built together with ClickHouse. It is not allowed to use libraries available from OS packages, or any other way of installing libraries into the system.
If there is a good ready-made solution, it is used, even if this means installing yet another library.
(But be prepared to occasionally throw bad libraries out of the code.)
**3.** If the packages do not have the library you need, or its version is too old, or it is built in the wrong way, you can use a library that is installed not from packages.
**4.** If the library is small enough and does not have its own build system, include its files in the project, in the `contrib` directory.
**5.** Preference is always given to libraries that are already in use.
**3.** Preference is given to libraries that are already in use.
## General {#obshchee-1}

View File

@ -1,17 +1,19 @@
# median {#median}
The `median*` functions are aliases for the corresponding `quantile*` functions. They calculate the median of a numeric sequence.
The `median*` functions are synonyms for the corresponding `quantile*` functions. They calculate the median of a numeric sequence.
Functions:
Functions:
- `median` — alias for [quantile](#quantile).
- `medianDeterministic` — alias for [quantileDeterministic](#quantiledeterministic).
- `medianExact` — alias for [quantileExact](#quantileexact).
- `medianExactWeighted` — alias for [quantileExactWeighted](#quantileexactweighted).
- `medianTiming` — alias for [quantileTiming](#quantiletiming).
- `medianTimingWeighted` — alias for [quantileTimingWeighted](#quantiletimingweighted).
- `medianTDigest` — alias for [quantileTDigest](#quantiletdigest).
- `medianTDigestWeighted` — alias for [quantileTDigestWeighted](#quantiletdigestweighted).
- `median` — synonym for [quantile](../../../sql-reference/aggregate-functions/reference/quantile.md#quantile).
- `medianDeterministic` — synonym for [quantileDeterministic](../../../sql-reference/aggregate-functions/reference/quantiledeterministic.md#quantiledeterministic).
- `medianExact` — synonym for [quantileExact](../../../sql-reference/aggregate-functions/reference/quantileexact.md#quantileexact).
- `medianExactWeighted` — synonym for [quantileExactWeighted](../../../sql-reference/aggregate-functions/reference/quantileexactweighted.md#quantileexactweighted).
- `medianTiming` — synonym for [quantileTiming](../../../sql-reference/aggregate-functions/reference/quantiletiming.md#quantiletiming).
- `medianTimingWeighted` — synonym for [quantileTimingWeighted](../../../sql-reference/aggregate-functions/reference/quantiletimingweighted.md#quantiletimingweighted).
- `medianTDigest` — synonym for [quantileTDigest](../../../sql-reference/aggregate-functions/reference/quantiletdigest.md#quantiletdigest).
- `medianTDigestWeighted` — synonym for [quantileTDigestWeighted](../../../sql-reference/aggregate-functions/reference/quantiletdigestweighted.md#quantiletdigestweighted).
- `medianBFloat16` — synonym for [quantileBFloat16](../../../sql-reference/aggregate-functions/reference/quantilebfloat16.md#quantilebfloat16).
**Example**

View File

@ -0,0 +1,64 @@
---
toc_priority: 209
---
# quantileBFloat16 {#quantilebfloat16}
Computes an approximate [quantile](https://ru.wikipedia.org/wiki/Квантиль) of a sample of numbers in the [bfloat16](https://en.wikipedia.org/wiki/Bfloat16_floating-point_format) format. `bfloat16` is a floating-point format that uses 1 sign bit, 8 exponent bits, and 7 fraction bits to represent a number.
The function converts the input value to a 32-bit float and takes its most significant 16 bits. It calculates the quantile in the `bfloat16` format and converts the result to a 64-bit float by appending zero bits.
This function is a fast approximate estimator with a relative error of no more than 0.390625%.
**Syntax**
``` sql
quantileBFloat16[(level)](expr)
```
Alias: `medianBFloat16`.
**Arguments**
- `expr` — Column with numeric data. [Integer](../../../sql-reference/data-types/int-uint.md), [Float](../../../sql-reference/data-types/float.md).
**Parameters**
- `level` — Level of quantile. Optional. Possible values are in the range from 0 to 1. Default value: 0.5. [Float](../../../sql-reference/data-types/float.md).
**Returned value**
- Approximate quantile of the specified level.
Type: [Float64](../../../sql-reference/data-types/float.md#float32-float64).
**Example**
The input table has an integer column and a float column:
``` text
┌─a─┬─────b─┐
│ 1 │ 1.001 │
│ 2 │ 1.002 │
│ 3 │ 1.003 │
│ 4 │ 1.004 │
└───┴───────┘
```
Query to calculate the 0.75-quantile (third quartile):
``` sql
SELECT quantileBFloat16(0.75)(a), quantileBFloat16(0.75)(b) FROM example_table;
```
Result:
``` text
┌─quantileBFloat16(0.75)(a)─┬─quantileBFloat16(0.75)(b)─┐
│ 3 │ 1 │
└───────────────────────────┴───────────────────────────┘
```
Note that all floating-point values in the example are truncated to 1.0 when converted to `bfloat16`.
**See Also**
- [median](../../../sql-reference/aggregate-functions/reference/median.md#median)
- [quantiles](../../../sql-reference/aggregate-functions/reference/quantiles.md#quantiles)

View File

@ -8,7 +8,7 @@ toc_priority: 201
Syntax: `quantiles(level1, level2, …)(x)`
All the quantile functions also have corresponding quantiles functions: `quantiles`, `quantilesDeterministic`, `quantilesTiming`, `quantilesTimingWeighted`, `quantilesExact`, `quantilesExactWeighted`, `quantilesTDigest`. These functions calculate all the quantiles of the listed levels in one pass and return an array of the resulting values.
All the quantile functions also have corresponding quantiles functions: `quantiles`, `quantilesDeterministic`, `quantilesTiming`, `quantilesTimingWeighted`, `quantilesExact`, `quantilesExactWeighted`, `quantilesTDigest`, `quantilesBFloat16`. These functions calculate all the quantiles of the listed levels in one pass and return an array of the resulting values.
## quantilesExactExclusive {#quantilesexactexclusive}
@ -18,7 +18,7 @@ toc_priority: 201
This function is equivalent to the Excel function [PERCENTILE.EXC](https://support.microsoft.com/en-us/office/percentile-exc-function-bbaa7204-e9e1-4010-85bf-c31dc5dce4ba) ([type R6](https://en.wikipedia.org/wiki/Quantile#Estimating_quantiles_from_a_sample)).
Works more efficiently with sets of levels than [quantilesExactExclusive](../../../sql-reference/aggregate-functions/reference/quantileexact.md#quantileexactexclusive).
Works more efficiently with sets of levels than [quantileExactExclusive](../../../sql-reference/aggregate-functions/reference/quantileexact.md#quantileexactexclusive).
**Syntax**
@ -70,7 +70,7 @@ SELECT quantilesExactExclusive(0.25, 0.5, 0.75, 0.9, 0.95, 0.99, 0.999)(x) FROM
This function is equivalent to the Excel function [PERCENTILE.INC](https://support.microsoft.com/en-us/office/percentile-inc-function-680f9539-45eb-410b-9a5e-c1355e5fe2ed) ([type R7](https://en.wikipedia.org/wiki/Quantile#Estimating_quantiles_from_a_sample)).
Works more efficiently with sets of levels than [quantilesExactInclusive](../../../sql-reference/aggregate-functions/reference/quantileexact.md#quantilesexactinclusive).
Works more efficiently with sets of levels than [quantileExactInclusive](../../../sql-reference/aggregate-functions/reference/quantileexact.md#quantileexactinclusive).
**Syntax**

View File

@ -27,40 +27,40 @@ SELECT
Returns the server's time zone.
**Syntax**
**Syntax**
``` sql
timeZone()
```
Alias: `timezone`.
Alias: `timezone`.
**Returned value**
- Time zone.
- Time zone.
Type: [String](../../sql-reference/data-types/string.md).
## toTimeZone {#totimezone}
Converts a date or date with time to the specified time zone. The time zone is an attribute of the `Date` and `DateTime` types. The internal value (number of seconds) of the table field or result column does not change; only the field's type changes and, accordingly, its text representation.
Converts a date or date with time to the specified time zone. The time zone is an attribute of the `Date` and `DateTime` types. The internal value (number of seconds) of the table field or result column does not change; only the field's type changes and, accordingly, its text representation.
**Syntax**
**Syntax**
``` sql
toTimezone(value, timezone)
```
Alias: `toTimezone`.
Alias: `toTimezone`.
**Arguments**
**Arguments**
- `value` — Time or date with time. [DateTime64](../../sql-reference/data-types/datetime64.md).
- `timezone` — Time zone for the returned value. [String](../../sql-reference/data-types/string.md).
**Returned value**
- Date with time.
- Date with time.
Type: [DateTime](../../sql-reference/data-types/datetime.md).
@ -80,6 +80,7 @@ SELECT toDateTime('2019-01-01 00:00:00', 'UTC') AS time_utc,
toInt32(time_samoa) AS int32samoa
FORMAT Vertical;
```
Result:
```text
@ -102,21 +103,21 @@ int32samoa: 1546300800
Returns the name of the time zone for values of the [DateTime](../../sql-reference/data-types/datetime.md) and [DateTime64](../../sql-reference/data-types/datetime64.md) types.
**Syntax**
**Syntax**
``` sql
timeZoneOf(value)
```
Alias: `timezoneOf`.
Alias: `timezoneOf`.
**Arguments**
- `value` — Date with time. [DateTime](../../sql-reference/data-types/datetime.md) or [DateTime64](../../sql-reference/data-types/datetime64.md).
- `value` — Date with time. [DateTime](../../sql-reference/data-types/datetime.md) or [DateTime64](../../sql-reference/data-types/datetime64.md).
**Returned value**
- Name of the time zone.
- Name of the time zone.
Type: [String](../../sql-reference/data-types/string.md).
@ -145,15 +146,15 @@ SELECT timezoneOf(now());
timeZoneOffset(value)
```
Alias: `timezoneOffset`.
Alias: `timezoneOffset`.
**Arguments**
- `value` — Date with time. [DateTime](../../sql-reference/data-types/datetime.md) or [DateTime64](../../sql-reference/data-types/datetime64.md).
- `value` — Date with time. [DateTime](../../sql-reference/data-types/datetime.md) or [DateTime64](../../sql-reference/data-types/datetime64.md).
**Returned value**
- Offset from UTC, in seconds.
- Offset from UTC, in seconds.
Type: [Int32](../../sql-reference/data-types/int-uint.md).
@ -626,7 +627,7 @@ SELECT now(), date_trunc('hour', now(), 'Europe/Moscow');
Adds a time or date interval to the specified date or date with time.
**Syntax**
**Syntax**
``` sql
date_add(unit, value, date)
@ -1025,6 +1026,45 @@ SELECT formatDateTime(toDate('2010-01-04'), '%g');
└────────────────────────────────────────────┘
```
## dateName {#dataname}
Returns the specified part of a date.
**Syntax**
``` sql
dateName(date_part, date)
```
**Arguments**
- `date_part` — Date part. Possible values: 'year', 'quarter', 'month', 'week', 'dayofyear', 'day', 'weekday', 'hour', 'minute', 'second'. [String](../../sql-reference/data-types/string.md).
- `date` — Date. [Date](../../sql-reference/data-types/date.md), [DateTime](../../sql-reference/data-types/datetime.md) or [DateTime64](../../sql-reference/data-types/datetime64.md).
- `timezone` — Time zone. Optional argument. [String](../../sql-reference/data-types/string.md).
**Returned value**
- The specified part of the date.
Type: [String](../../sql-reference/data-types/string.md#string).
**Example**
Query:
```sql
WITH toDateTime('2021-04-14 11:22:33') AS date_value
SELECT dateName('year', date_value), dateName('month', date_value), dateName('day', date_value);
```
Result:
```text
┌─dateName('year', date_value)─┬─dateName('month', date_value)─┬─dateName('day', date_value)─┐
│ 2021 │ April │ 14 │
└──────────────────────────────┴───────────────────────────────┴─────────────────────────────┘
```
## FROM\_UNIXTIME {#fromunixtime}
The function converts a Unix timestamp to a calendar date and time.

View File

@ -223,3 +223,53 @@ SELECT reinterpretAsUInt64(reverse(unhex('FFF'))) AS num;
## bitmaskToArray(num) {#bitmasktoarraynum}
Accepts an integer. Returns an array of UInt64 numbers containing the powers of two that sum to the source number; the numbers in the array are in ascending order.
## bitPositionsToArray(num) {#bitpositionstoarraynum}
Accepts an integer and converts it to an unsigned integer. Returns an array of `UInt64` numbers containing the list of positions of bits of `arg` that equal `1`, in ascending order.
**Syntax**
```sql
bitPositionsToArray(arg)
```
**Arguments**
- `arg` — Integer value. [Int/UInt](../../sql-reference/data-types/int-uint.md).
**Returned value**
- An array containing a list of positions of bits that equal `1`, in ascending order.
Type: [Array](../../sql-reference/data-types/array.md)([UInt64](../../sql-reference/data-types/int-uint.md)).
**Examples**
Query:
``` sql
SELECT bitPositionsToArray(toInt8(1)) AS bit_positions;
```
Result:
``` text
┌─bit_positions─┐
│ [0] │
└───────────────┘
```
Query:
``` sql
SELECT bitPositionsToArray(toInt8(-1)) AS bit_positions;
```
Result:
``` text
┌─bit_positions─────┐
│ [0,1,2,3,4,5,6,7] │
└───────────────────┘
```

View File

@ -742,19 +742,11 @@ CPU指令集是我们服务器中支持的最小集合。 目前它是SSE 4.2
## Libraries {#ku}
**1.** The C++20 standard library is used (experimental extensions are allowed), as well as `boost` and `Poco` frameworks.
**1.** The C++20 standard library is used (experimental extensions are allowed), as well as `boost` and `Poco` frameworks.
**2.** If necessary, you can use any well-known libraries available in OS packages.
**2.** It is not allowed to use libraries from OS packages. It is also not allowed to use pre-installed libraries. All libraries should be placed in the form of source code in the `contrib` directory and built together with ClickHouse.
If a good solution is already available, use it, even if it means you have to install another library.
(But be prepared to remove bad libraries from the code.)
**3.** You can install a library that is not in the packages if the packages do not have what you need, or have an outdated version or the wrong type of compilation.
**4.** If the library is small and does not have its own complex build system, put the source files in the `contrib` folder.
**5.** Preference is always given to libraries that are already in use.
**3.** Preference is always given to libraries that are already in use.
## General Recommendations {#yi-ban-jian-yi-1}

View File

@ -185,8 +185,8 @@ public:
auto * denominator_type = toNativeType<Denominator>(b);
static constexpr size_t denominator_offset = offsetof(Fraction, denominator);
auto * denominator_dst_ptr = b.CreatePointerCast(b.CreateConstGEP1_32(nullptr, aggregate_data_dst_ptr, denominator_offset), denominator_type->getPointerTo());
auto * denominator_src_ptr = b.CreatePointerCast(b.CreateConstGEP1_32(nullptr, aggregate_data_src_ptr, denominator_offset), denominator_type->getPointerTo());
auto * denominator_dst_ptr = b.CreatePointerCast(b.CreateConstInBoundsGEP1_64(nullptr, aggregate_data_dst_ptr, denominator_offset), denominator_type->getPointerTo());
auto * denominator_src_ptr = b.CreatePointerCast(b.CreateConstInBoundsGEP1_64(nullptr, aggregate_data_src_ptr, denominator_offset), denominator_type->getPointerTo());
auto * denominator_dst_value = b.CreateLoad(denominator_type, denominator_dst_ptr);
auto * denominator_src_value = b.CreateLoad(denominator_type, denominator_src_ptr);

View File

@ -74,7 +74,7 @@ public:
auto * denominator_type = toNativeType<Denominator>(b);
static constexpr size_t denominator_offset = offsetof(Fraction, denominator);
auto * denominator_offset_ptr = b.CreateConstGEP1_32(nullptr, aggregate_data_ptr, denominator_offset);
auto * denominator_offset_ptr = b.CreateConstInBoundsGEP1_64(nullptr, aggregate_data_ptr, denominator_offset);
auto * denominator_ptr = b.CreatePointerCast(denominator_offset_ptr, denominator_type->getPointerTo());
auto * weight_cast_to_denominator = nativeCast(b, arguments_types[1], argument_values[1], denominator_type);

View File

@ -139,7 +139,7 @@ public:
if constexpr (result_is_nullable)
b.CreateStore(llvm::ConstantInt::get(b.getInt8Ty(), 1), aggregate_data_ptr);
auto * aggregate_data_ptr_with_prefix_size_offset = b.CreateConstGEP1_32(nullptr, aggregate_data_ptr, this->prefix_size);
auto * aggregate_data_ptr_with_prefix_size_offset = b.CreateConstInBoundsGEP1_64(nullptr, aggregate_data_ptr, this->prefix_size);
this->nested_function->compileAdd(b, aggregate_data_ptr_with_prefix_size_offset, { removeNullable(nullable_type) }, { wrapped_value });
b.CreateBr(join_block);
@ -290,7 +290,7 @@ public:
if constexpr (result_is_nullable)
b.CreateStore(llvm::ConstantInt::get(b.getInt8Ty(), 1), aggregate_data_ptr);
auto * aggregate_data_ptr_with_prefix_size_offset = b.CreateConstGEP1_32(nullptr, aggregate_data_ptr, this->prefix_size);
auto * aggregate_data_ptr_with_prefix_size_offset = b.CreateConstInBoundsGEP1_64(nullptr, aggregate_data_ptr, this->prefix_size);
this->nested_function->compileAdd(b, aggregate_data_ptr_with_prefix_size_offset, non_nullable_types, wrapped_values);
b.CreateBr(join_block);

View File

@ -199,7 +199,7 @@ public:
static constexpr size_t value_offset_from_structure = offsetof(SingleValueDataFixed<T>, value);
auto * type = toNativeType<T>(builder);
auto * value_ptr_with_offset = b.CreateConstGEP1_32(nullptr, aggregate_data_ptr, value_offset_from_structure);
auto * value_ptr_with_offset = b.CreateConstInBoundsGEP1_64(nullptr, aggregate_data_ptr, value_offset_from_structure);
auto * value_ptr = b.CreatePointerCast(value_ptr_with_offset, type->getPointerTo());
return value_ptr;

View File

@ -207,7 +207,7 @@ public:
if constexpr (result_is_nullable)
b.CreateMemSet(aggregate_data_ptr, llvm::ConstantInt::get(b.getInt8Ty(), 0), this->prefix_size, llvm::assumeAligned(this->alignOfData()));
auto * aggregate_data_ptr_with_prefix_size_offset = b.CreateConstGEP1_32(nullptr, aggregate_data_ptr, this->prefix_size);
auto * aggregate_data_ptr_with_prefix_size_offset = b.CreateConstInBoundsGEP1_64(nullptr, aggregate_data_ptr, this->prefix_size);
this->nested_function->compileCreate(b, aggregate_data_ptr_with_prefix_size_offset);
}
@ -225,8 +225,8 @@ public:
b.CreateStore(is_null_result_value, aggregate_data_dst_ptr);
}
auto * aggregate_data_dst_ptr_with_prefix_size_offset = b.CreateConstGEP1_32(nullptr, aggregate_data_dst_ptr, this->prefix_size);
auto * aggregate_data_src_ptr_with_prefix_size_offset = b.CreateConstGEP1_32(nullptr, aggregate_data_src_ptr, this->prefix_size);
auto * aggregate_data_dst_ptr_with_prefix_size_offset = b.CreateConstInBoundsGEP1_64(nullptr, aggregate_data_dst_ptr, this->prefix_size);
auto * aggregate_data_src_ptr_with_prefix_size_offset = b.CreateConstInBoundsGEP1_64(nullptr, aggregate_data_src_ptr, this->prefix_size);
this->nested_function->compileMerge(b, aggregate_data_dst_ptr_with_prefix_size_offset, aggregate_data_src_ptr_with_prefix_size_offset);
}
@ -260,7 +260,7 @@ public:
b.CreateBr(join_block);
b.SetInsertPoint(if_not_null);
auto * aggregate_data_ptr_with_prefix_size_offset = b.CreateConstGEP1_32(nullptr, aggregate_data_ptr, this->prefix_size);
auto * aggregate_data_ptr_with_prefix_size_offset = b.CreateConstInBoundsGEP1_64(nullptr, aggregate_data_ptr, this->prefix_size);
auto * nested_result = this->nested_function->compileGetResult(builder, aggregate_data_ptr_with_prefix_size_offset);
b.CreateStore(b.CreateInsertValue(nullable_value, nested_result, {0}), nullable_value_ptr);
b.CreateBr(join_block);
@ -351,7 +351,7 @@ public:
if constexpr (result_is_nullable)
b.CreateStore(llvm::ConstantInt::get(b.getInt8Ty(), 1), aggregate_data_ptr);
auto * aggregate_data_ptr_with_prefix_size_offset = b.CreateConstGEP1_32(nullptr, aggregate_data_ptr, this->prefix_size);
auto * aggregate_data_ptr_with_prefix_size_offset = b.CreateConstInBoundsGEP1_64(nullptr, aggregate_data_ptr, this->prefix_size);
this->nested_function->compileAdd(b, aggregate_data_ptr_with_prefix_size_offset, { removeNullable(nullable_type) }, { wrapped_value });
b.CreateBr(join_block);
@ -479,7 +479,7 @@ public:
if constexpr (result_is_nullable)
b.CreateStore(llvm::ConstantInt::get(b.getInt8Ty(), 1), aggregate_data_ptr);
auto * aggregate_data_ptr_with_prefix_size_offset = b.CreateConstGEP1_32(nullptr, aggregate_data_ptr, this->prefix_size);
auto * aggregate_data_ptr_with_prefix_size_offset = b.CreateConstInBoundsGEP1_64(nullptr, aggregate_data_ptr, this->prefix_size);
this->nested_function->compileAdd(b, aggregate_data_ptr_with_prefix_size_offset, arguments_types, wrapped_values);
b.CreateBr(join_block);
@ -488,7 +488,7 @@ public:
else
{
b.CreateStore(llvm::ConstantInt::get(b.getInt8Ty(), 1), aggregate_data_ptr);
auto * aggregate_data_ptr_with_prefix_size_offset = b.CreateConstGEP1_32(nullptr, aggregate_data_ptr, this->prefix_size);
auto * aggregate_data_ptr_with_prefix_size_offset = b.CreateConstInBoundsGEP1_64(nullptr, aggregate_data_ptr, this->prefix_size);
this->nested_function->compileAdd(b, aggregate_data_ptr_with_prefix_size_offset, non_nullable_types, wrapped_values);
}
}

View File

@ -108,7 +108,7 @@ class IColumn;
M(Bool, compile_expressions, true, "Compile some scalar functions and operators to native code.", 0) \
M(UInt64, min_count_to_compile_expression, 3, "The number of identical expressions before they are JIT-compiled", 0) \
M(Bool, compile_aggregate_expressions, true, "Compile aggregate functions to native code.", 0) \
M(UInt64, min_count_to_compile_aggregate_expression, 0, "The number of identical aggreagte expressions before they are JIT-compiled", 0) \
M(UInt64, min_count_to_compile_aggregate_expression, 3, "The number of identical aggregate expressions before they are JIT-compiled", 0) \
M(UInt64, group_by_two_level_threshold, 100000, "From what number of keys, a two-level aggregation starts. 0 - the threshold is not set.", 0) \
M(UInt64, group_by_two_level_threshold_bytes, 50000000, "From what size of the aggregation state in bytes, a two-level aggregation begins to be used. 0 - the threshold is not set. Two-level aggregation is used when at least one of the thresholds is triggered.", 0) \
M(Bool, distributed_aggregation_memory_efficient, true, "Is the memory-saving mode of distributed aggregation enabled.", 0) \

View File

@ -81,7 +81,7 @@ void SerializationMap::deserializeBinary(IColumn & column, ReadBuffer & istr) co
template <typename Writer>
void SerializationMap::serializeTextImpl(const IColumn & column, size_t row_num, WriteBuffer & ostr, Writer && writer) const
void SerializationMap::serializeTextImpl(const IColumn & column, size_t row_num, bool quote_key, WriteBuffer & ostr, Writer && writer) const
{
const auto & column_map = assert_cast<const ColumnMap &>(column);
@ -97,7 +97,16 @@ void SerializationMap::serializeTextImpl(const IColumn & column, size_t row_num,
{
if (i != offset)
writeChar(',', ostr);
writer(key, nested_tuple.getColumn(0), i);
if (quote_key)
{
writeChar('"', ostr);
writer(key, nested_tuple.getColumn(0), i);
writeChar('"', ostr);
}
else
writer(key, nested_tuple.getColumn(0), i);
writeChar(':', ostr);
writer(value, nested_tuple.getColumn(1), i);
}
@ -161,7 +170,7 @@ void SerializationMap::deserializeTextImpl(IColumn & column, ReadBuffer & istr,
void SerializationMap::serializeText(const IColumn & column, size_t row_num, WriteBuffer & ostr, const FormatSettings & settings) const
{
serializeTextImpl(column, row_num, ostr,
serializeTextImpl(column, row_num, /*quote_key=*/ false, ostr,
[&](const SerializationPtr & subcolumn_serialization, const IColumn & subcolumn, size_t pos)
{
subcolumn_serialization->serializeTextQuoted(subcolumn, pos, ostr, settings);
@ -170,7 +179,6 @@ void SerializationMap::serializeText(const IColumn & column, size_t row_num, Wri
void SerializationMap::deserializeText(IColumn & column, ReadBuffer & istr, const FormatSettings & settings) const
{
deserializeTextImpl(column, istr,
[&](const SerializationPtr & subcolumn_serialization, IColumn & subcolumn)
{
@ -178,10 +186,13 @@ void SerializationMap::deserializeText(IColumn & column, ReadBuffer & istr, cons
});
}
void SerializationMap::serializeTextJSON(const IColumn & column, size_t row_num, WriteBuffer & ostr, const FormatSettings & settings) const
{
serializeTextImpl(column, row_num, ostr,
/// We need to double-quote integer keys to produce valid JSON.
const auto & column_key = assert_cast<const ColumnMap &>(column).getNestedData().getColumn(0);
bool quote_key = !WhichDataType(column_key.getDataType()).isStringOrFixedString();
serializeTextImpl(column, row_num, quote_key, ostr,
[&](const SerializationPtr & subcolumn_serialization, const IColumn & subcolumn, size_t pos)
{
subcolumn_serialization->serializeTextJSON(subcolumn, pos, ostr, settings);
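The effect of `quote_key` is easiest to see on a `Map` with non-string keys in a JSON output format. A sketch of the expected behaviour (on releases where the `Map` type is still experimental, `SET allow_experimental_map_type = 1` may be needed first; exact output depends on the server version):

``` sql
SELECT map(1, 'ready', 2, 'steady') AS m
FORMAT JSONEachRow;
-- Integer keys are double-quoted so the row is valid JSON, e.g.:
-- {"m":{"1":"ready","2":"steady"}}
```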

View File

@ -61,7 +61,7 @@ public:
private:
template <typename Writer>
void serializeTextImpl(const IColumn & column, size_t row_num, WriteBuffer & ostr, Writer && writer) const;
void serializeTextImpl(const IColumn & column, size_t row_num, bool quote_key, WriteBuffer & ostr, Writer && writer) const;
template <typename Reader>
void deserializeTextImpl(IColumn & column, ReadBuffer & istr, Reader && reader) const;

View File

@ -116,6 +116,8 @@ target_link_libraries(clickhouse_functions PRIVATE clickhouse_functions_url)
add_subdirectory(array)
target_link_libraries(clickhouse_functions PRIVATE clickhouse_functions_array)
add_subdirectory(JSONPath)
if (USE_STATS)
target_link_libraries(clickhouse_functions PRIVATE stats)
endif()

View File

@ -39,6 +39,8 @@ struct DummyJSONParser
std::string_view getString() const { return {}; }
Array getArray() const { return {}; }
Object getObject() const { return {}; }
Element getElement() { return {}; }
};
/// References an array in a JSON document.
@ -97,4 +99,9 @@ struct DummyJSONParser
#endif
};
inline ALWAYS_INLINE std::ostream& operator<<(std::ostream& out, DummyJSONParser::Element)
{
return out;
}
}

View File

@ -0,0 +1,15 @@
#include <Functions/FunctionSQLJSON.h>
#include <Functions/FunctionFactory.h>
namespace DB
{
void registerFunctionsSQLJSON(FunctionFactory & factory)
{
factory.registerFunction<FunctionSQLJSON<NameJSONExists, JSONExistsImpl>>();
factory.registerFunction<FunctionSQLJSON<NameJSONQuery, JSONQueryImpl>>();
factory.registerFunction<FunctionSQLJSON<NameJSONValue, JSONValueImpl>>();
}
}
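The three functions registered here take a constant JSONPath string as the first argument and the JSON document as the second, per the checks in `FunctionSQLJSON.h` below. A hedged usage sketch (argument order and exact output may differ in later releases):

``` sql
-- Does the path exist? Returns 1 or 0.
SELECT JSON_EXISTS('$.hello', '{"hello": 1}');
-- Extract the value at the path.
SELECT JSON_VALUE('$.hello', '{"hello": "world"}');
-- Return all matches wrapped in a JSON array string.
SELECT JSON_QUERY('$.array[*]', '{"array": [0, 1, 2]}');
```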

View File

@ -0,0 +1,334 @@
#pragma once
#include <sstream>
#include <type_traits>
#include <Columns/ColumnConst.h>
#include <Columns/ColumnString.h>
#include <Columns/ColumnsNumber.h>
#include <Core/Settings.h>
#include <DataTypes/DataTypeString.h>
#include <DataTypes/DataTypesNumber.h>
#include <Functions/DummyJSONParser.h>
#include <Functions/IFunction.h>
#include <Functions/JSONPath/ASTs/ASTJSONPath.h>
#include <Functions/JSONPath/Generator/GeneratorJSONPath.h>
#include <Functions/JSONPath/Parsers/ParserJSONPath.h>
#include <Functions/RapidJSONParser.h>
#include <Functions/SimdJSONParser.h>
#include <Interpreters/Context.h>
#include <Parsers/IParser.h>
#include <Parsers/Lexer.h>
#include <common/range.h>
#if !defined(ARCADIA_BUILD)
# include "config_functions.h"
#endif
namespace DB
{
namespace ErrorCodes
{
extern const int ILLEGAL_TYPE_OF_ARGUMENT;
extern const int TOO_FEW_ARGUMENTS_FOR_FUNCTION;
extern const int BAD_ARGUMENTS;
}
class FunctionSQLJSONHelpers
{
public:
template <typename Name, template <typename> typename Impl, class JSONParser>
class Executor
{
public:
static ColumnPtr run(const ColumnsWithTypeAndName & arguments, const DataTypePtr & result_type, size_t input_rows_count, uint32_t parse_depth)
{
MutableColumnPtr to{result_type->createColumn()};
to->reserve(input_rows_count);
if (arguments.size() < 2)
{
throw Exception{"JSONPath functions require at least 2 arguments", ErrorCodes::TOO_FEW_ARGUMENTS_FOR_FUNCTION};
}
const auto & first_column = arguments[0];
/// Check the first argument: must be of type String (JSONPath)
if (!isString(first_column.type))
{
throw Exception(
"JSONPath functions require 1 argument to be JSONPath of type string, illegal type: " + first_column.type->getName(),
ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT);
}
/// Check the first argument: must be const (JSONPath)
if (!isColumnConst(*first_column.column))
{
throw Exception("1 argument (JSONPath) must be const", ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT);
}
const auto & second_column = arguments[1];
/// Check the second argument: must be of type String (JSON)
if (!isString(second_column.type))
{
throw Exception(
"JSONPath functions require 2 argument to be JSON of string, illegal type: " + second_column.type->getName(),
ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT);
}
const ColumnPtr & arg_jsonpath = first_column.column;
const auto * arg_jsonpath_const = typeid_cast<const ColumnConst *>(arg_jsonpath.get());
const auto * arg_jsonpath_string = typeid_cast<const ColumnString *>(arg_jsonpath_const->getDataColumnPtr().get());
const ColumnPtr & arg_json = second_column.column;
const auto * col_json_const = typeid_cast<const ColumnConst *>(arg_json.get());
const auto * col_json_string
= typeid_cast<const ColumnString *>(col_json_const ? col_json_const->getDataColumnPtr().get() : arg_json.get());
/// Get data and offsets for the first argument (JSONPath)
const ColumnString::Chars & chars_path = arg_jsonpath_string->getChars();
const ColumnString::Offsets & offsets_path = arg_jsonpath_string->getOffsets();
/// Prepare to parse the first argument (JSONPath)
const char * query_begin = reinterpret_cast<const char *>(&chars_path[0]);
const char * query_end = query_begin + offsets_path[0] - 1;
/// Tokenize query
Tokens tokens(query_begin, query_end);
/// Max depth 0 indicates that depth is not limited
IParser::Pos token_iterator(tokens, parse_depth);
/// Parse query and create AST tree
Expected expected;
ASTPtr res;
ParserJSONPath parser;
const bool parse_res = parser.parse(token_iterator, res, expected);
if (!parse_res)
{
throw Exception{"Unable to parse JSONPath", ErrorCodes::BAD_ARGUMENTS};
}
/// Get data and offsets for the second argument (JSON)
const ColumnString::Chars & chars_json = col_json_string->getChars();
const ColumnString::Offsets & offsets_json = col_json_string->getOffsets();
JSONParser json_parser;
using Element = typename JSONParser::Element;
Element document;
bool document_ok = false;
/// Parse JSON for every row
Impl<JSONParser> impl;
for (const auto i : collections::range(0, input_rows_count))
{
std::string_view json{
reinterpret_cast<const char *>(&chars_json[offsets_json[i - 1]]), offsets_json[i] - offsets_json[i - 1] - 1};
document_ok = json_parser.parse(json, document);
bool added_to_column = false;
if (document_ok)
{
added_to_column = impl.insertResultToColumn(*to, document, res);
}
if (!added_to_column)
{
to->insertDefault();
}
}
return to;
}
};
};
template <typename Name, template <typename> typename Impl>
class FunctionSQLJSON : public IFunction, WithConstContext
{
public:
static FunctionPtr create(ContextPtr context_) { return std::make_shared<FunctionSQLJSON>(context_); }
explicit FunctionSQLJSON(ContextPtr context_) : WithConstContext(context_) { }
static constexpr auto name = Name::name;
String getName() const override { return Name::name; }
bool isVariadic() const override { return true; }
size_t getNumberOfArguments() const override { return 0; }
bool useDefaultImplementationForConstants() const override { return true; }
ColumnNumbers getArgumentsThatAreAlwaysConstant() const override { return {0}; }
DataTypePtr getReturnTypeImpl(const ColumnsWithTypeAndName & arguments) const override
{
return Impl<DummyJSONParser>::getReturnType(Name::name, arguments);
}
ColumnPtr executeImpl(const ColumnsWithTypeAndName & arguments, const DataTypePtr & result_type, size_t input_rows_count) const override
{
/// Choose JSONParser.
/// 1. Lexer(path) -> Tokens
/// 2. Create ASTPtr
/// 3. Parser(Tokens, ASTPtr) -> complete AST
/// 4. Execute functions: call getNextItem on generator and handle each item
uint32_t parse_depth = getContext()->getSettingsRef().max_parser_depth;
#if USE_SIMDJSON
if (getContext()->getSettingsRef().allow_simdjson)
return FunctionSQLJSONHelpers::Executor<Name, Impl, SimdJSONParser>::run(arguments, result_type, input_rows_count, parse_depth);
#endif
return FunctionSQLJSONHelpers::Executor<Name, Impl, DummyJSONParser>::run(arguments, result_type, input_rows_count, parse_depth);
}
};
struct NameJSONExists
{
static constexpr auto name{"JSON_EXISTS"};
};
struct NameJSONValue
{
static constexpr auto name{"JSON_VALUE"};
};
struct NameJSONQuery
{
static constexpr auto name{"JSON_QUERY"};
};
template <typename JSONParser>
class JSONExistsImpl
{
public:
using Element = typename JSONParser::Element;
static DataTypePtr getReturnType(const char *, const ColumnsWithTypeAndName &) { return std::make_shared<DataTypeUInt8>(); }
static size_t getNumberOfIndexArguments(const ColumnsWithTypeAndName & arguments) { return arguments.size() - 1; }
static bool insertResultToColumn(IColumn & dest, const Element & root, ASTPtr & query_ptr)
{
GeneratorJSONPath<JSONParser> generator_json_path(query_ptr);
Element current_element = root;
VisitorStatus status;
while ((status = generator_json_path.getNextItem(current_element)) != VisitorStatus::Exhausted)
{
if (status == VisitorStatus::Ok)
{
break;
}
current_element = root;
}
/// insert result, status can be either Ok (if we found the item)
/// or Exhausted (if we never found the item)
ColumnUInt8 & col_bool = assert_cast<ColumnUInt8 &>(dest);
if (status == VisitorStatus::Ok)
{
col_bool.insert(1);
}
else
{
col_bool.insert(0);
}
return true;
}
};
template <typename JSONParser>
class JSONValueImpl
{
public:
using Element = typename JSONParser::Element;
static DataTypePtr getReturnType(const char *, const ColumnsWithTypeAndName &) { return std::make_shared<DataTypeString>(); }
static size_t getNumberOfIndexArguments(const ColumnsWithTypeAndName & arguments) { return arguments.size() - 1; }
static bool insertResultToColumn(IColumn & dest, const Element & root, ASTPtr & query_ptr)
{
GeneratorJSONPath<JSONParser> generator_json_path(query_ptr);
Element current_element = root;
VisitorStatus status;
Element res;
while ((status = generator_json_path.getNextItem(current_element)) != VisitorStatus::Exhausted)
{
if (status == VisitorStatus::Ok)
{
if (!(current_element.isArray() || current_element.isObject()))
{
break;
}
}
else if (status == VisitorStatus::Error)
{
/// ON ERROR
/// Here it is possible to handle errors with ON ERROR (as described in ISO/IEC TR 19075-6),
/// however this functionality is not implemented yet
}
current_element = root;
}
if (status == VisitorStatus::Exhausted)
{
return false;
}
std::stringstream out; // STYLE_CHECK_ALLOW_STD_STRING_STREAM
out << current_element.getElement();
auto output_str = out.str();
ColumnString & col_str = assert_cast<ColumnString &>(dest);
col_str.insertData(output_str.data(), output_str.size());
return true;
}
};
/**
* Implementation of JSON_QUERY: returns all matched elements wrapped in a JSON array string.
* @tparam JSONParser parser
*/
template <typename JSONParser>
class JSONQueryImpl
{
public:
using Element = typename JSONParser::Element;
static DataTypePtr getReturnType(const char *, const ColumnsWithTypeAndName &) { return std::make_shared<DataTypeString>(); }
static size_t getNumberOfIndexArguments(const ColumnsWithTypeAndName & arguments) { return arguments.size() - 1; }
static bool insertResultToColumn(IColumn & dest, const Element & root, ASTPtr & query_ptr)
{
GeneratorJSONPath<JSONParser> generator_json_path(query_ptr);
Element current_element = root;
VisitorStatus status;
std::stringstream out; // STYLE_CHECK_ALLOW_STD_STRING_STREAM
/// Create json array of results: [res1, res2, ...]
out << "[";
bool success = false;
while ((status = generator_json_path.getNextItem(current_element)) != VisitorStatus::Exhausted)
{
if (status == VisitorStatus::Ok)
{
if (success)
{
out << ", ";
}
success = true;
out << current_element.getElement();
}
else if (status == VisitorStatus::Error)
{
/// ON ERROR
/// Here it is possible to handle errors with ON ERROR (as described in ISO/IEC TR 19075-6),
/// however this functionality is not implemented yet
}
current_element = root;
}
out << "]";
if (!success)
{
return false;
}
ColumnString & col_str = assert_cast<ColumnString &>(dest);
auto output_str = out.str();
col_str.insertData(output_str.data(), output_str.size());
return true;
}
};
}

View File

@ -575,12 +575,12 @@ ColumnPtr FunctionAnyArityLogical<Impl, Name>::getConstantResultForNonConstArgum
if constexpr (std::is_same_v<Impl, AndImpl>)
{
if (has_false_constant)
result_type->createColumnConst(0, static_cast<UInt8>(false));
result_column = result_type->createColumnConst(0, static_cast<UInt8>(false));
}
else if constexpr (std::is_same_v<Impl, OrImpl>)
{
if (has_true_constant)
result_type->createColumnConst(0, static_cast<UInt8>(true));
result_column = result_type->createColumnConst(0, static_cast<UInt8>(true));
}
return result_column;
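The fix above assigns the folded constant to `result_column` instead of discarding it. A sketch of a query that can now be folded (the exact plan depends on the version):

``` sql
-- An AND with a constant false (or an OR with a constant true) can be
-- short-circuited to a constant result column.
SELECT number > 5 AND 0 AS folded FROM numbers(3);
```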

View File

@ -0,0 +1,18 @@
#pragma once
#include <Functions/JSONPath/ASTs/ASTJSONPathQuery.h>
#include <Parsers/IAST.h>
namespace DB
{
class ASTJSONPath : public IAST
{
public:
String getID(char) const override { return "ASTJSONPath"; }
ASTPtr clone() const override { return std::make_shared<ASTJSONPath>(*this); }
ASTJSONPathQuery * jsonpath_query;
};
}

View File

@ -0,0 +1,19 @@
#pragma once
#include <Parsers/IAST.h>
namespace DB
{
class ASTJSONPathMemberAccess : public IAST
{
public:
String getID(char) const override { return "ASTJSONPathMemberAccess"; }
ASTPtr clone() const override { return std::make_shared<ASTJSONPathMemberAccess>(*this); }
public:
/// Member name to lookup in json document (in path: $.some_key.another_key. ...)
String member_name;
};
}

View File

@ -0,0 +1,15 @@
#pragma once
#include <Parsers/IAST.h>
namespace DB
{
class ASTJSONPathQuery : public IAST
{
public:
String getID(char) const override { return "ASTJSONPathQuery"; }
ASTPtr clone() const override { return std::make_shared<ASTJSONPathQuery>(*this); }
};
}

View File

@ -0,0 +1,23 @@
#pragma once
#include <vector>
#include <Parsers/IAST.h>
namespace DB
{
class ASTJSONPathRange : public IAST
{
public:
String getID(char) const override { return "ASTJSONPathRange"; }
ASTPtr clone() const override { return std::make_shared<ASTJSONPathRange>(*this); }
public:
/// Ranges to lookup in json array ($[0, 1, 2, 4 to 9])
/// Range is represented as <start, end (non-inclusive)>
/// Single index is represented as <start, start + 1>
std::vector<std::pair<UInt32, UInt32>> ranges;
bool is_star = false;
};
}

View File

@ -0,0 +1,15 @@
#pragma once
#include <Parsers/IAST.h>
namespace DB
{
class ASTJSONPathRoot : public IAST
{
public:
String getID(char) const override { return "ASTJSONPathRoot"; }
ASTPtr clone() const override { return std::make_shared<ASTJSONPathRoot>(*this); }
};
}

View File

@ -0,0 +1,15 @@
#pragma once
#include <Parsers/IAST.h>
namespace DB
{
class ASTJSONPathStar : public IAST
{
public:
String getID(char) const override { return "ASTJSONPathStar"; }
ASTPtr clone() const override { return std::make_shared<ASTJSONPathStar>(*this); }
};
}

View File

@ -0,0 +1,13 @@
include("${ClickHouse_SOURCE_DIR}/cmake/dbms_glob_sources.cmake")
add_headers_and_sources(clickhouse_functions_jsonpath Parsers)
add_headers_and_sources(clickhouse_functions_jsonpath ASTs)
add_headers_and_sources(clickhouse_functions_jsonpath Generator)
add_library(clickhouse_functions_jsonpath ${clickhouse_functions_jsonpath_sources} ${clickhouse_functions_jsonpath_headers})
target_link_libraries(clickhouse_functions_jsonpath PRIVATE dbms)
target_link_libraries(clickhouse_functions_jsonpath PRIVATE clickhouse_parsers)
target_link_libraries(clickhouse_functions PRIVATE clickhouse_functions_jsonpath)
if (STRIP_DEBUG_SYMBOLS_FUNCTIONS)
target_compile_options(clickhouse_functions_jsonpath PRIVATE "-g0")
endif()

View File

@ -0,0 +1,128 @@
#pragma once
#include <Functions/JSONPath/Generator/IGenerator.h>
#include <Functions/JSONPath/Generator/VisitorJSONPathMemberAccess.h>
#include <Functions/JSONPath/Generator/VisitorJSONPathRange.h>
#include <Functions/JSONPath/Generator/VisitorJSONPathRoot.h>
#include <Functions/JSONPath/Generator/VisitorJSONPathStar.h>
#include <Functions/JSONPath/Generator/VisitorStatus.h>
#include <Functions/JSONPath/ASTs/ASTJSONPath.h>
namespace DB
{
namespace ErrorCodes
{
extern const int LOGICAL_ERROR;
}
template <typename JSONParser>
class GeneratorJSONPath : public IGenerator<JSONParser>
{
public:
/**
* Traverses children ASTs of ASTJSONPathQuery and creates a vector of corresponding visitors
* @param query_ptr_ pointer to ASTJSONPathQuery
*/
GeneratorJSONPath(ASTPtr query_ptr_)
{
query_ptr = query_ptr_;
const auto * path = query_ptr->as<ASTJSONPath>();
if (!path)
{
throw Exception("Invalid path", ErrorCodes::LOGICAL_ERROR);
}
const auto * query = path->jsonpath_query;
for (auto child_ast : query->children)
{
if (typeid_cast<ASTJSONPathRoot *>(child_ast.get()))
{
visitors.push_back(std::make_shared<VisitorJSONPathRoot<JSONParser>>(child_ast));
}
else if (typeid_cast<ASTJSONPathMemberAccess *>(child_ast.get()))
{
visitors.push_back(std::make_shared<VisitorJSONPathMemberAccess<JSONParser>>(child_ast));
}
else if (typeid_cast<ASTJSONPathRange *>(child_ast.get()))
{
visitors.push_back(std::make_shared<VisitorJSONPathRange<JSONParser>>(child_ast));
}
else if (typeid_cast<ASTJSONPathStar *>(child_ast.get()))
{
visitors.push_back(std::make_shared<VisitorJSONPathStar<JSONParser>>(child_ast));
}
}
}
const char * getName() const override { return "GeneratorJSONPath"; }
/**
* This method exposes API of traversing all paths, described by JSONPath,
* to SQLJSON Functions.
* Expected usage is to iteratively call this method from inside the function
* and to execute custom logic with received element or handle an error.
* On each such call getNextItem will yield next item into element argument
* and modify its internal state to prepare for next call.
*
* @param element root of JSON document
* @return visitor status; VisitorStatus::Exhausted when there are no more items
*/
VisitorStatus getNextItem(typename JSONParser::Element & element) override
{
while (true)
{
/// element passed to us actually is root, so here we assign current to root
auto current = element;
if (current_visitor < 0)
{
return VisitorStatus::Exhausted;
}
for (int i = 0; i < current_visitor; ++i)
{
visitors[i]->apply(current);
}
VisitorStatus status = VisitorStatus::Error;
for (size_t i = current_visitor; i < visitors.size(); ++i)
{
status = visitors[i]->visit(current);
current_visitor = i;
if (status == VisitorStatus::Error || status == VisitorStatus::Ignore)
{
break;
}
}
updateVisitorsForNextRun();
if (status != VisitorStatus::Ignore)
{
element = current;
return status;
}
}
}
private:
bool updateVisitorsForNextRun()
{
while (current_visitor >= 0 && visitors[current_visitor]->isExhausted())
{
visitors[current_visitor]->reinitialize();
current_visitor--;
}
if (current_visitor >= 0)
{
visitors[current_visitor]->updateState();
}
return current_visitor >= 0;
}
int current_visitor = 0;
ASTPtr query_ptr;
VisitorList<JSONParser> visitors;
};
}
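
For orientation, each successful getNextItem() call yields one matched element; a SQL/JSON function keeps calling it until the generator reports Exhausted and aggregates the results. A minimal illustration, assuming the JSON_QUERY function added elsewhere in this commit collects every yielded element into a bracketed list:

``` sql
-- JSON_QUERY is expected to drive getNextItem() repeatedly and concatenate the results:
SELECT JSON_QUERY('$.hello[*]', '{"hello":["world","world2"]}');
-- presumably returns ["world","world2"], one element per generator step
```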

View File

@ -0,0 +1,29 @@
#pragma once
#include <Functions/JSONPath/Generator/IGenerator_fwd.h>
#include <Functions/JSONPath/Generator/VisitorStatus.h>
#include <Parsers/IAST.h>
namespace DB
{
template <typename JSONParser>
class IGenerator
{
public:
IGenerator() = default;
virtual const char * getName() const = 0;
/**
* Used to yield the next non-ignored element described by the JSONPath query.
*
* @param element the element to extract the next item into
* @return status of this traversal step: Ok, Error, or Exhausted
*/
virtual VisitorStatus getNextItem(typename JSONParser::Element & element) = 0;
virtual ~IGenerator() = default;
};
}

View File

@ -0,0 +1,16 @@
#pragma once
#include <Functions/JSONPath/Generator/IVisitor.h>
namespace DB
{
template <typename JSONParser>
class IGenerator;
template <typename JSONParser>
using IVisitorPtr = std::shared_ptr<IVisitor<JSONParser>>;
template <typename JSONParser>
using VisitorList = std::vector<IVisitorPtr<JSONParser>>;
}

View File

@ -0,0 +1,46 @@
#pragma once
#include <Functions/JSONPath/Generator/VisitorStatus.h>
namespace DB
{
template <typename JSONParser>
class IVisitor
{
public:
virtual const char * getName() const = 0;
/**
* Applies this visitor to the document and mutates the visitor's state
* @param element JSON element (JSONParser::Element)
*/
virtual VisitorStatus visit(typename JSONParser::Element & element) = 0;
/**
* Applies this visitor to the document, but does not mutate the visitor's state
* @param element JSON element (JSONParser::Element)
*/
virtual VisitorStatus apply(typename JSONParser::Element & element) const = 0;
/**
* Restores visitor's initial state for later use
*/
virtual void reinitialize() = 0;
virtual void updateState() = 0;
bool isExhausted() { return is_exhausted; }
void setExhausted(bool exhausted) { is_exhausted = exhausted; }
virtual ~IVisitor() = default;
private:
/**
* This variable is for detecting whether a visitor's next visit will be able
* to yield a new item.
*/
bool is_exhausted = false;
};
}

View File

@ -0,0 +1,50 @@
#pragma once
#include <Functions/JSONPath/ASTs/ASTJSONPathMemberAccess.h>
#include <Functions/JSONPath/Generator/IVisitor.h>
#include <Functions/JSONPath/Generator/VisitorStatus.h>
namespace DB
{
template <typename JSONParser>
class VisitorJSONPathMemberAccess : public IVisitor<JSONParser>
{
public:
VisitorJSONPathMemberAccess(ASTPtr member_access_ptr_)
: member_access_ptr(member_access_ptr_->as<ASTJSONPathMemberAccess>()) { }
const char * getName() const override { return "VisitorJSONPathMemberAccess"; }
VisitorStatus apply(typename JSONParser::Element & element) const override
{
typename JSONParser::Element result;
element.getObject().find(std::string_view(member_access_ptr->member_name), result);
element = result;
return VisitorStatus::Ok;
}
VisitorStatus visit(typename JSONParser::Element & element) override
{
this->setExhausted(true);
if (!element.isObject())
{
return VisitorStatus::Error;
}
typename JSONParser::Element result;
if (!element.getObject().find(std::string_view(member_access_ptr->member_name), result))
{
return VisitorStatus::Error;
}
apply(element);
return VisitorStatus::Ok;
}
void reinitialize() override { this->setExhausted(false); }
void updateState() override { }
private:
ASTJSONPathMemberAccess * member_access_ptr;
};
}

View File

@ -0,0 +1,80 @@
#pragma once
#include <Functions/JSONPath/ASTs/ASTJSONPathRange.h>
#include <Functions/JSONPath/Generator/IVisitor.h>
#include <Functions/JSONPath/Generator/VisitorStatus.h>
namespace DB
{
template <typename JSONParser>
class VisitorJSONPathRange : public IVisitor<JSONParser>
{
public:
VisitorJSONPathRange(ASTPtr range_ptr_) : range_ptr(range_ptr_->as<ASTJSONPathRange>())
{
current_range = 0;
current_index = range_ptr->ranges[current_range].first;
}
const char * getName() const override { return "VisitorJSONPathRange"; }
VisitorStatus apply(typename JSONParser::Element & element) const override
{
typename JSONParser::Element result;
typename JSONParser::Array array = element.getArray();
element = array[current_index];
return VisitorStatus::Ok;
}
VisitorStatus visit(typename JSONParser::Element & element) override
{
if (!element.isArray())
{
this->setExhausted(true);
return VisitorStatus::Error;
}
VisitorStatus status;
if (current_index < element.getArray().size())
{
apply(element);
status = VisitorStatus::Ok;
}
else
{
status = VisitorStatus::Ignore;
}
if (current_index + 1 == range_ptr->ranges[current_range].second
&& current_range + 1 == range_ptr->ranges.size())
{
this->setExhausted(true);
}
return status;
}
void reinitialize() override
{
current_range = 0;
current_index = range_ptr->ranges[current_range].first;
this->setExhausted(false);
}
void updateState() override
{
current_index++;
if (current_index == range_ptr->ranges[current_range].second)
{
current_range++;
current_index = range_ptr->ranges[current_range].first;
}
}
private:
ASTJSONPathRange * range_ptr;
size_t current_range;
UInt32 current_index;
};
}

View File

@ -0,0 +1,35 @@
#pragma once
#include <Functions/JSONPath/ASTs/ASTJSONPathRoot.h>
#include <Functions/JSONPath/Generator/IVisitor.h>
#include <Functions/JSONPath/Generator/VisitorStatus.h>
namespace DB
{
template <typename JSONParser>
class VisitorJSONPathRoot : public IVisitor<JSONParser>
{
public:
VisitorJSONPathRoot(ASTPtr) { }
const char * getName() const override { return "VisitorJSONPathRoot"; }
VisitorStatus apply(typename JSONParser::Element & /*element*/) const override
{
/// No-op, since the element passed in is already the document's root
return VisitorStatus::Ok;
}
VisitorStatus visit(typename JSONParser::Element & element) override
{
apply(element);
this->setExhausted(true);
return VisitorStatus::Ok;
}
void reinitialize() override { this->setExhausted(false); }
void updateState() override { }
};
}

View File

@ -0,0 +1,66 @@
#pragma once
#include <Functions/JSONPath/ASTs/ASTJSONPathStar.h>
#include <Functions/JSONPath/Generator/IVisitor.h>
#include <Functions/JSONPath/Generator/VisitorStatus.h>
namespace DB
{
template <typename JSONParser>
class VisitorJSONPathStar : public IVisitor<JSONParser>
{
public:
VisitorJSONPathStar(ASTPtr)
{
current_index = 0;
}
const char * getName() const override { return "VisitorJSONPathStar"; }
VisitorStatus apply(typename JSONParser::Element & element) const override
{
typename JSONParser::Element result;
typename JSONParser::Array array = element.getArray();
element = array[current_index];
return VisitorStatus::Ok;
}
VisitorStatus visit(typename JSONParser::Element & element) override
{
if (!element.isArray())
{
this->setExhausted(true);
return VisitorStatus::Error;
}
VisitorStatus status;
if (current_index < element.getArray().size())
{
apply(element);
status = VisitorStatus::Ok;
}
else
{
status = VisitorStatus::Ignore;
this->setExhausted(true);
}
return status;
}
void reinitialize() override
{
current_index = 0;
this->setExhausted(false);
}
void updateState() override
{
current_index++;
}
private:
UInt32 current_index;
};
}

View File

@ -0,0 +1,13 @@
#pragma once
namespace DB
{
enum VisitorStatus
{
Ok,
Exhausted,
Error,
Ignore
};
}

View File

@ -0,0 +1,31 @@
#include <Functions/JSONPath/ASTs/ASTJSONPath.h>
#include <Functions/JSONPath/ASTs/ASTJSONPathMemberAccess.h>
#include <Functions/JSONPath/Parsers/ParserJSONPath.h>
#include <Functions/JSONPath/Parsers/ParserJSONPathQuery.h>
namespace DB
{
/**
* Entry parser for JSONPath
*/
bool ParserJSONPath::parseImpl(Pos & pos, ASTPtr & node, Expected & expected)
{
auto ast_jsonpath = std::make_shared<ASTJSONPath>();
ParserJSONPathQuery parser_jsonpath_query;
/// Push back dot AST and brackets AST to query->children
ASTPtr query;
bool res = parser_jsonpath_query.parse(pos, query, expected);
if (res)
{
/// Set ASTJSONPathQuery of ASTJSONPath
ast_jsonpath->set(ast_jsonpath->jsonpath_query, query);
}
node = ast_jsonpath;
return res;
}
}

View File

@ -0,0 +1,21 @@
#pragma once
#include <Parsers/IParserBase.h>
namespace DB
{
/**
* Entry parser for JSONPath
*/
class ParserJSONPath : public IParserBase
{
private:
const char * getName() const override { return "ParserJSONPath"; }
bool parseImpl(Pos & pos, ASTPtr & node, Expected & expected) override;
public:
explicit ParserJSONPath() = default;
};
}

View File

@ -0,0 +1,42 @@
#include <Functions/JSONPath/ASTs/ASTJSONPathMemberAccess.h>
#include <Functions/JSONPath/Parsers/ParserJSONPathMemberAccess.h>
#include <Parsers/ASTIdentifier.h>
#include <Parsers/ExpressionElementParsers.h>
#include <Parsers/Lexer.h>
namespace DB
{
/**
* Parses a member access of the form `.member_name`.
* @param pos token iterator
* @param node node of ASTJSONPathMemberAccess
* @param expected what was expected at this position, used for error reporting
* @return whether parsing was successful
*/
bool ParserJSONPathMemberAccess::parseImpl(Pos & pos, ASTPtr & node, Expected & expected)
{
if (pos->type != TokenType::Dot)
{
return false;
}
++pos;
if (pos->type != TokenType::BareWord)
{
return false;
}
ParserIdentifier name_p;
ASTPtr member_name;
if (!name_p.parse(pos, member_name, expected))
{
return false;
}
auto member_access = std::make_shared<ASTJSONPathMemberAccess>();
node = member_access;
return tryGetIdentifierNameInto(member_name, member_access->member_name);
}
}
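
A member access is a dot followed by a bare word, turned into an ASTJSONPathMemberAccess carrying the key name. Both queries below are taken from the functional test added in this commit:

``` sql
SELECT JSON_VALUE('$.hello', '{"hello":"world"}'); -- returns "world"
SELECT JSON_EXISTS('$.hello.world', '{"hello":{"world":1}}'); -- returns 1
```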

View File

@ -0,0 +1,14 @@
#pragma once
#include <Parsers/IParserBase.h>
namespace DB
{
class ParserJSONPathMemberAccess : public IParserBase
{
const char * getName() const override { return "ParserJSONPathMemberAccess"; }
bool parseImpl(Pos & pos, ASTPtr & node, Expected & expected) override;
};
}

View File

@ -0,0 +1,48 @@
#include <Functions/JSONPath/ASTs/ASTJSONPathQuery.h>
#include <Functions/JSONPath/Parsers/ParserJSONPathQuery.h>
#include <Functions/JSONPath/Parsers/ParserJSONPathRoot.h>
#include <Functions/JSONPath/Parsers/ParserJSONPathMemberAccess.h>
#include <Functions/JSONPath/Parsers/ParserJSONPathRange.h>
#include <Functions/JSONPath/Parsers/ParserJSONPathStar.h>
namespace DB
{
/**
* Parses the whole JSONPath query: the root followed by any sequence of member accesses, ranges and stars.
* @param pos token iterator
* @param query node of ASTJSONPathQuery
* @param expected what was expected at this position, used for error reporting
* @return whether parsing was successful
*/
bool ParserJSONPathQuery::parseImpl(Pos & pos, ASTPtr & query, Expected & expected)
{
query = std::make_shared<ASTJSONPathQuery>();
ParserJSONPathMemberAccess parser_jsonpath_member_access;
ParserJSONPathRange parser_jsonpath_range;
ParserJSONPathStar parser_jsonpath_star;
ParserJSONPathRoot parser_jsonpath_root;
ASTPtr path_root;
if (!parser_jsonpath_root.parse(pos, path_root, expected))
{
return false;
}
query->children.push_back(path_root);
ASTPtr accessor;
while (parser_jsonpath_member_access.parse(pos, accessor, expected)
|| parser_jsonpath_range.parse(pos, accessor, expected)
|| parser_jsonpath_star.parse(pos, accessor, expected))
{
if (accessor)
{
query->children.push_back(accessor);
accessor = nullptr;
}
}
/// parsing was successful if we reached the end of query by this point
return pos->type == TokenType::EndOfStream;
}
}

View File

@ -0,0 +1,14 @@
#pragma once
#include <Parsers/IParserBase.h>
namespace DB
{
class ParserJSONPathQuery : public IParserBase
{
protected:
const char * getName() const override { return "ParserJSONPathQuery"; }
bool parseImpl(Pos & pos, ASTPtr & query, Expected & expected) override;
};
}

View File

@ -0,0 +1,94 @@
#include <Functions/JSONPath/ASTs/ASTJSONPathRange.h>
#include <Functions/JSONPath/Parsers/ParserJSONPathQuery.h>
#include <Functions/JSONPath/Parsers/ParserJSONPathRange.h>
#include <Parsers/ASTIdentifier.h>
#include <Parsers/ASTLiteral.h>
#include <Parsers/ExpressionElementParsers.h>
#include <Parsers/CommonParsers.h>
namespace DB
{
namespace ErrorCodes
{
extern const int BAD_ARGUMENTS;
}
/**
* Parses a bracketed range accessor, e.g. `[1]`, `[1 TO 3]` or `[0 TO 2, 4]`.
* @param pos token iterator
* @param node node of ASTJSONPathRange
* @param expected what was expected at this position, used for error reporting
* @return whether parsing was successful
*/
bool ParserJSONPathRange::parseImpl(Pos & pos, ASTPtr & node, Expected & expected)
{
if (pos->type != TokenType::OpeningSquareBracket)
{
return false;
}
++pos;
auto range = std::make_shared<ASTJSONPathRange>();
node = range;
ParserNumber number_p;
ASTPtr number_ptr;
while (pos->type != TokenType::ClosingSquareBracket)
{
if (pos->type != TokenType::Number)
{
return false;
}
std::pair<UInt32, UInt32> range_indices;
if (!number_p.parse(pos, number_ptr, expected))
{
return false;
}
range_indices.first = number_ptr->as<ASTLiteral>()->value.get<UInt32>();
if (pos->type == TokenType::Comma || pos->type == TokenType::ClosingSquareBracket)
{
/// Single index case
range_indices.second = range_indices.first + 1;
}
else if (pos->type == TokenType::BareWord)
{
if (!ParserKeyword("TO").ignore(pos, expected))
{
return false;
}
if (!number_p.parse(pos, number_ptr, expected))
{
return false;
}
range_indices.second = number_ptr->as<ASTLiteral>()->value.get<UInt32>();
}
else
{
return false;
}
if (range_indices.first >= range_indices.second)
{
throw Exception(
ErrorCodes::BAD_ARGUMENTS,
"Start of range must be greater than end of range, however {} >= {}",
range_indices.first,
range_indices.second);
}
range->ranges.push_back(std::move(range_indices));
if (pos->type != TokenType::ClosingSquareBracket)
{
++pos;
}
}
++pos;
/// We can't have both ranges and star present, so parse was successful <=> exactly 1 of these conditions is true
return !range->ranges.empty() ^ range->is_star;
}
}
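
In this grammar a bracketed accessor is a comma-separated list of single indexes and `first TO last` spans; a lone index N is stored as the half-open pair (N, N+1), and first must be strictly less than last. Judging by the functional test added in this commit, the upper bound of a `TO` span behaves as exclusive; the document below is made up for illustration:

``` sql
SELECT JSON_QUERY('$.array[0 to 2, 4]', '{"array":[10, 11, 12, 13, 14, 15]}');
-- presumably returns [10, 11, 14]: indexes 0 and 1 from the span, plus index 4
```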

View File

@ -0,0 +1,18 @@
#pragma once
#include <Parsers/IParserBase.h>
namespace DB
{
class ParserJSONPathRange : public IParserBase
{
private:
const char * getName() const override { return "ParserJSONPathRange"; }
bool parseImpl(Pos & pos, ASTPtr & node, Expected & expected) override;
public:
explicit ParserJSONPathRange() = default;
};
}

View File

@ -0,0 +1,27 @@
#include <Functions/JSONPath/ASTs/ASTJSONPathRoot.h>
#include <Functions/JSONPath/Parsers/ParserJSONPathRoot.h>
#include <Parsers/Lexer.h>
namespace DB
{
/**
* Parses the JSONPath root, i.e. the leading `$`.
* @param pos token iterator
* @param node node of ASTJSONPathRoot
* @param expected what was expected at this position, used for error reporting
* @return whether parsing was successful
*/
bool ParserJSONPathRoot::parseImpl(Pos & pos, ASTPtr & node, Expected & expected)
{
if (pos->type != TokenType::DollarSign)
{
expected.add(pos, "dollar sign (start of jsonpath)");
return false;
}
node = std::make_shared<ASTJSONPathRoot>();
++pos;
return true;
}
}

View File

@ -0,0 +1,18 @@
#pragma once
#include <Parsers/IParserBase.h>
namespace DB
{
class ParserJSONPathRoot : public IParserBase
{
private:
const char * getName() const override { return "ParserJSONPathRoot"; }
bool parseImpl(Pos & pos, ASTPtr & node, Expected & expected) override;
public:
explicit ParserJSONPathRoot() = default;
};
}

View File

@ -0,0 +1,31 @@
#include <Functions/JSONPath/Parsers/ParserJSONPathStar.h>
#include <Functions/JSONPath/ASTs/ASTJSONPathStar.h>
namespace DB
{
bool ParserJSONPathStar::parseImpl(Pos & pos, ASTPtr & node, Expected & expected)
{
if (pos->type != TokenType::OpeningSquareBracket)
{
return false;
}
++pos;
if (pos->type != TokenType::Asterisk)
{
return false;
}
++pos;
if (pos->type != TokenType::ClosingSquareBracket)
{
expected.add(pos, "Closing square bracket");
return false;
}
++pos;
node = std::make_shared<ASTJSONPathStar>();
return true;
}
}
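
The star accessor `[*]` selects every element of an array, one per generator step. The query below and its result come verbatim from the test added in this commit:

``` sql
SELECT JSON_QUERY('$.array[*][0 to 2, 4]', '{"array":[[0, 1, 2, 3, 4, 5], [0, -1, -2, -3, -4, -5]]}');
-- returns [0, 1, 4, 0, -1, -4]
```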

View File

@ -0,0 +1,18 @@
#pragma once
#include <Parsers/IParserBase.h>
namespace DB
{
class ParserJSONPathStar : public IParserBase
{
private:
const char * getName() const override { return "ParserJSONPathStar"; }
bool parseImpl(Pos & pos, ASTPtr & node, Expected & expected) override;
public:
explicit ParserJSONPathStar() = default;
};
}

View File

@ -50,6 +50,8 @@ struct SimdJSONParser
ALWAYS_INLINE Array getArray() const;
ALWAYS_INLINE Object getObject() const;
ALWAYS_INLINE simdjson::dom::element getElement() const { return element; }
private:
simdjson::dom::element element;
};

View File

@ -29,6 +29,11 @@ public:
return name;
}
bool isInjective(const ColumnsWithTypeAndName & /*sample_columns*/) const override
{
return true;
}
bool useDefaultImplementationForLowCardinalityColumns() const override { return false; }
size_t getNumberOfArguments() const override

View File

@ -40,6 +40,7 @@ void registerFunctionsGeo(FunctionFactory &);
void registerFunctionsIntrospection(FunctionFactory &);
void registerFunctionsNull(FunctionFactory &);
void registerFunctionsJSON(FunctionFactory &);
void registerFunctionsSQLJSON(FunctionFactory &);
void registerFunctionToJSONString(FunctionFactory &);
void registerFunctionsConsistentHashing(FunctionFactory & factory);
void registerFunctionsUnixTimestamp64(FunctionFactory & factory);
@ -99,6 +100,7 @@ void registerFunctions()
registerFunctionsGeo(factory);
registerFunctionsNull(factory);
registerFunctionsJSON(factory);
registerFunctionsSQLJSON(factory);
registerFunctionToJSONString(factory);
registerFunctionsIntrospection(factory);
registerFunctionsConsistentHashing(factory);

View File

@ -2,6 +2,7 @@
#include <DataTypes/DataTypeString.h>
#include <Functions/FunctionFactory.h>
#include <Functions/IFunction.h>
#include <Formats/FormatFactory.h>
#include <IO/WriteBufferFromVector.h>
#include <IO/WriteHelpers.h>
@ -13,7 +14,9 @@ namespace
{
public:
static constexpr auto name = "toJSONString";
static FunctionPtr create(ContextPtr) { return std::make_shared<FunctionToJSONString>(); }
static FunctionPtr create(ContextPtr context) { return std::make_shared<FunctionToJSONString>(context); }
explicit FunctionToJSONString(ContextPtr context) : format_settings(getFormatSettings(context)) {}
String getName() const override { return name; }
@ -35,7 +38,7 @@ namespace
WriteBufferFromVector<ColumnString::Chars> json(data_to);
for (size_t i = 0; i < input_rows_count; ++i)
{
serializer->serializeTextJSON(*arguments[0].column, i, json, FormatSettings());
serializer->serializeTextJSON(*arguments[0].column, i, json, format_settings);
writeChar(0, json);
offsets_to[i] = json.count();
}
@ -43,6 +46,10 @@ namespace
json.finalize();
return res;
}
private:
/// Only the subset of format settings related to JSON is relevant here.
const FormatSettings format_settings;
};
}

View File

@ -44,6 +44,7 @@ SRCS(
FunctionFile.cpp
FunctionHelpers.cpp
FunctionJoinGet.cpp
FunctionSQLJSON.cpp
FunctionsAES.cpp
FunctionsCoding.cpp
FunctionsConversion.cpp
@ -76,6 +77,12 @@ SRCS(
GatherUtils/sliceFromRightConstantOffsetUnbounded.cpp
GeoHash.cpp
IFunction.cpp
JSONPath/Parsers/ParserJSONPath.cpp
JSONPath/Parsers/ParserJSONPathMemberAccess.cpp
JSONPath/Parsers/ParserJSONPathQuery.cpp
JSONPath/Parsers/ParserJSONPathRange.cpp
JSONPath/Parsers/ParserJSONPathRoot.cpp
JSONPath/Parsers/ParserJSONPathStar.cpp
TargetSpecific.cpp
URL/URLHierarchy.cpp
URL/URLPathHierarchy.cpp

View File

@ -21,12 +21,9 @@ using Scalars = std::map<String, Block>;
class Context;
/// Most used types have shorter names
/// TODO: in the first part of refactoring all the context pointers are non-const.
using ContextPtr = std::shared_ptr<const Context>;
using ContextConstPtr = ContextPtr; /// For compatibility. Use ContextPtr.
using ContextMutablePtr = std::shared_ptr<Context>;
using ContextWeakPtr = std::weak_ptr<const Context>;
using ContextWeakConstPtr = ContextWeakPtr; /// For compatibility. Use ContextWeakPtr.
using ContextWeakMutablePtr = std::weak_ptr<Context>;
template <class Shared = ContextPtr>

View File

@ -231,7 +231,6 @@ void ExpressionAnalyzer::analyzeAggregation()
if (has_aggregation)
{
/// Find out aggregation keys.
if (select_query)
{
@ -252,6 +251,8 @@ void ExpressionAnalyzer::analyzeAggregation()
/// Constant expressions have non-null column pointer at this stage.
if (node->column && isColumnConst(*node->column))
{
select_query->group_by_with_constant_keys = true;
/// But don't remove last key column if no aggregate functions, otherwise aggregation will not work.
if (!aggregate_descriptions.empty() || size > 1)
{
@ -288,6 +289,11 @@ void ExpressionAnalyzer::analyzeAggregation()
else
aggregated_columns = temp_actions->getNamesAndTypesList();
/// Constant expressions are already removed during the first 'analyze' run,
/// so for the second 'analyze' run this information is taken from select_query.
if (select_query)
has_const_aggregation_keys = select_query->group_by_with_constant_keys;
for (const auto & desc : aggregate_descriptions)
aggregated_columns.emplace_back(desc.column_name, desc.function->getReturnType());
}

View File

@ -65,6 +65,7 @@ struct ExpressionAnalyzerData
bool has_aggregation = false;
NamesAndTypesList aggregation_keys;
bool has_const_aggregation_keys = false;
AggregateDescriptions aggregate_descriptions;
WindowDescriptions window_descriptions;
@ -309,6 +310,7 @@ public:
bool hasTableJoin() const { return syntax->ast_join; }
const NamesAndTypesList & aggregationKeys() const { return aggregation_keys; }
bool hasConstAggregationKeys() const { return has_const_aggregation_keys; }
const AggregateDescriptions & aggregates() const { return aggregate_descriptions; }
const PreparedSets & getPreparedSets() const { return prepared_sets; }

View File

@ -2041,7 +2041,7 @@ void InterpreterSelectQuery::executeAggregation(QueryPlan & query_plan, const Ac
settings.group_by_two_level_threshold,
settings.group_by_two_level_threshold_bytes,
settings.max_bytes_before_external_group_by,
settings.empty_result_for_aggregation_by_empty_set,
settings.empty_result_for_aggregation_by_empty_set || (keys.empty() && query_analyzer->hasConstAggregationKeys()),
context->getTemporaryVolume(),
settings.max_threads,
settings.min_free_disk_space_for_temporary_data,
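
With this change, a GROUP BY that consists only of constant keys over an empty input now yields an empty result, the same behaviour as empty_result_for_aggregation_by_empty_set. The functional test added further down in this diff exercises exactly this case:

``` sql
SELECT 1 AS a, count() FROM numbers(10) WHERE 0 GROUP BY a;
-- presumably returns no rows now, instead of the single row produced once the constant key is optimized away
```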

View File

@ -168,7 +168,7 @@ static void compileFunction(llvm::Module & module, const IFunctionBase & functio
for (size_t i = 0; i <= arg_types.size(); ++i)
{
const auto & type = i == arg_types.size() ? function.getResultType() : arg_types[i];
auto * data = b.CreateLoad(data_type, b.CreateConstInBoundsGEP1_32(data_type, columns_arg, i));
auto * data = b.CreateLoad(data_type, b.CreateConstInBoundsGEP1_64(data_type, columns_arg, i));
columns[i].data_init = b.CreatePointerCast(b.CreateExtractValue(data, {0}), toNativeType(b, removeNullable(type))->getPointerTo());
columns[i].null_init = type->isNullable() ? b.CreateExtractValue(data, {1}) : nullptr;
}
@ -236,9 +236,9 @@ static void compileFunction(llvm::Module & module, const IFunctionBase & functio
auto * cur_block = b.GetInsertBlock();
for (auto & col : columns)
{
col.data->addIncoming(b.CreateConstInBoundsGEP1_32(nullptr, col.data, 1), cur_block);
col.data->addIncoming(b.CreateConstInBoundsGEP1_64(nullptr, col.data, 1), cur_block);
if (col.null)
col.null->addIncoming(b.CreateConstInBoundsGEP1_32(nullptr, col.null, 1), cur_block);
col.null->addIncoming(b.CreateConstInBoundsGEP1_64(nullptr, col.null, 1), cur_block);
}
auto * value = b.CreateAdd(counter_phi, llvm::ConstantInt::get(size_type, 1));
@ -295,7 +295,7 @@ static void compileCreateAggregateStatesFunctions(llvm::Module & module, const s
{
size_t aggregate_function_offset = function_to_compile.aggregate_data_offset;
const auto * aggregate_function = function_to_compile.function;
auto * aggregation_place_with_offset = b.CreateConstInBoundsGEP1_32(nullptr, aggregate_data_place_arg, aggregate_function_offset);
auto * aggregation_place_with_offset = b.CreateConstInBoundsGEP1_64(nullptr, aggregate_data_place_arg, aggregate_function_offset);
aggregate_function->compileCreate(b, aggregation_place_with_offset);
}
@ -338,7 +338,7 @@ static void compileAddIntoAggregateStatesFunctions(llvm::Module & module, const
for (size_t column_argument_index = 0; column_argument_index < function_arguments_size; ++column_argument_index)
{
const auto & argument_type = argument_types[column_argument_index];
auto * data = b.CreateLoad(column_data_type, b.CreateConstInBoundsGEP1_32(column_data_type, columns_arg, previous_columns_size + column_argument_index));
auto * data = b.CreateLoad(column_data_type, b.CreateConstInBoundsGEP1_64(column_data_type, columns_arg, previous_columns_size + column_argument_index));
data_placeholder.data_init = b.CreatePointerCast(b.CreateExtractValue(data, {0}), toNativeType(b, removeNullable(argument_type))->getPointerTo());
data_placeholder.null_init = argument_type->isNullable() ? b.CreateExtractValue(data, {1}) : nullptr;
columns.emplace_back(data_placeholder);
@ -408,7 +408,7 @@ static void compileAddIntoAggregateStatesFunctions(llvm::Module & module, const
arguments_values[column_argument_index] = nullable_value;
}
auto * aggregation_place_with_offset = b.CreateConstInBoundsGEP1_32(nullptr, aggregation_place, aggregate_function_offset);
auto * aggregation_place_with_offset = b.CreateConstInBoundsGEP1_64(nullptr, aggregation_place, aggregate_function_offset);
aggregate_function_ptr->compileAdd(b, aggregation_place_with_offset, arguments_types, arguments_values);
previous_columns_size += function_arguments_size;
@ -419,13 +419,13 @@ static void compileAddIntoAggregateStatesFunctions(llvm::Module & module, const
auto * cur_block = b.GetInsertBlock();
for (auto & col : columns)
{
col.data->addIncoming(b.CreateConstInBoundsGEP1_32(nullptr, col.data, 1), cur_block);
col.data->addIncoming(b.CreateConstInBoundsGEP1_64(nullptr, col.data, 1), cur_block);
if (col.null)
col.null->addIncoming(b.CreateConstInBoundsGEP1_32(nullptr, col.null, 1), cur_block);
col.null->addIncoming(b.CreateConstInBoundsGEP1_64(nullptr, col.null, 1), cur_block);
}
places_phi->addIncoming(b.CreateConstInBoundsGEP1_32(nullptr, places_phi, 1), cur_block);
places_phi->addIncoming(b.CreateConstInBoundsGEP1_64(nullptr, places_phi, 1), cur_block);
auto * value = b.CreateAdd(counter_phi, llvm::ConstantInt::get(size_type, 1));
counter_phi->addIncoming(value, cur_block);
@ -457,8 +457,8 @@ static void compileMergeAggregatesStates(llvm::Module & module, const std::vecto
size_t aggregate_function_offset = function_to_compile.aggregate_data_offset;
const auto * aggregate_function_ptr = function_to_compile.function;
auto * aggregate_data_place_merge_dst_with_offset = b.CreateConstInBoundsGEP1_32(nullptr, aggregate_data_place_dst_arg, aggregate_function_offset);
auto * aggregate_data_place_merge_src_with_offset = b.CreateConstInBoundsGEP1_32(nullptr, aggregate_data_place_src_arg, aggregate_function_offset);
auto * aggregate_data_place_merge_dst_with_offset = b.CreateConstInBoundsGEP1_64(nullptr, aggregate_data_place_dst_arg, aggregate_function_offset);
auto * aggregate_data_place_merge_src_with_offset = b.CreateConstInBoundsGEP1_64(nullptr, aggregate_data_place_src_arg, aggregate_function_offset);
aggregate_function_ptr->compileMerge(b, aggregate_data_place_merge_dst_with_offset, aggregate_data_place_merge_src_with_offset);
}
@ -490,7 +490,7 @@ static void compileInsertAggregatesIntoResultColumns(llvm::Module & module, cons
for (size_t i = 0; i < functions.size(); ++i)
{
auto return_type = functions[i].function->getReturnType();
auto * data = b.CreateLoad(column_data_type, b.CreateConstInBoundsGEP1_32(column_data_type, columns_arg, i));
auto * data = b.CreateLoad(column_data_type, b.CreateConstInBoundsGEP1_64(column_data_type, columns_arg, i));
columns[i].data_init = b.CreatePointerCast(b.CreateExtractValue(data, {0}), toNativeType(b, removeNullable(return_type))->getPointerTo());
columns[i].null_init = return_type->isNullable() ? b.CreateExtractValue(data, {1}) : nullptr;
}
@ -526,7 +526,7 @@ static void compileInsertAggregatesIntoResultColumns(llvm::Module & module, cons
const auto * aggregate_function_ptr = functions[i].function;
auto * aggregate_data_place = b.CreateLoad(b.getInt8Ty()->getPointerTo(), aggregate_data_place_phi);
auto * aggregation_place_with_offset = b.CreateConstInBoundsGEP1_32(nullptr, aggregate_data_place, aggregate_function_offset);
auto * aggregation_place_with_offset = b.CreateConstInBoundsGEP1_64(nullptr, aggregate_data_place, aggregate_function_offset);
auto * final_value = aggregate_function_ptr->compileGetResult(b, aggregation_place_with_offset);
@ -546,16 +546,16 @@ static void compileInsertAggregatesIntoResultColumns(llvm::Module & module, cons
auto * cur_block = b.GetInsertBlock();
for (auto & col : columns)
{
col.data->addIncoming(b.CreateConstInBoundsGEP1_32(nullptr, col.data, 1), cur_block);
col.data->addIncoming(b.CreateConstInBoundsGEP1_64(nullptr, col.data, 1), cur_block);
if (col.null)
col.null->addIncoming(b.CreateConstInBoundsGEP1_32(nullptr, col.null, 1), cur_block);
col.null->addIncoming(b.CreateConstInBoundsGEP1_64(nullptr, col.null, 1), cur_block);
}
auto * value = b.CreateAdd(counter_phi, llvm::ConstantInt::get(size_type, 1), "", true, true);
counter_phi->addIncoming(value, cur_block);
aggregate_data_place_phi->addIncoming(b.CreateConstInBoundsGEP1_32(nullptr, aggregate_data_place_phi, 1), cur_block);
aggregate_data_place_phi->addIncoming(b.CreateConstInBoundsGEP1_64(nullptr, aggregate_data_place_phi, 1), cur_block);
b.CreateCondBr(b.CreateICmpEQ(value, rows_count_arg), end, loop);

View File

@ -44,6 +44,7 @@ public:
bool group_by_with_totals = false;
bool group_by_with_rollup = false;
bool group_by_with_cube = false;
bool group_by_with_constant_keys = false;
bool limit_with_ties = false;
ASTPtr & refSelect() { return getExpression(Expression::SELECT); }

View File

@ -338,6 +338,11 @@ Token Lexer::nextTokenImpl()
}
default:
if (*pos == '$' && ((pos + 1 < end && !isWordCharASCII(pos[1])) || pos + 1 == end))
{
/// Capture standalone dollar sign
return Token(TokenType::DollarSign, token_begin, ++pos);
}
if (isWordCharASCII(*pos) || *pos == '$')
{
++pos;
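
A standalone `$` is now lexed as its own DollarSign token (see the new token type below), which is what lets ParserJSONPathRoot anchor a JSONPath at the document root. From the tests added in this commit:

``` sql
SELECT JSON_EXISTS('$', '{"hello":1}'); -- returns 1
SELECT JSON_VALUE('$.hello', '{"hello":1}'); -- returns 1
```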

View File

@ -33,6 +33,7 @@ namespace DB
\
M(Asterisk) /** Could be used as multiplication operator or on its own: "SELECT *" */ \
\
M(DollarSign) \
M(Plus) \
M(Minus) \
M(Slash) \

View File

@ -226,7 +226,7 @@ void StorageMaterializedPostgreSQL::shutdown()
{
if (replication_handler)
replication_handler->shutdown();
auto nested = getNested();
auto nested = tryGetNested();
if (nested)
nested->shutdown();
}

View File

@ -3,6 +3,7 @@
import logging
import subprocess
import os
import glob
import time
import shutil
from collections import defaultdict
@ -17,7 +18,6 @@ SLEEP_BETWEEN_RETRIES = 5
PARALLEL_GROUP_SIZE = 100
CLICKHOUSE_BINARY_PATH = "/usr/bin/clickhouse"
CLICKHOUSE_ODBC_BRIDGE_BINARY_PATH = "/usr/bin/clickhouse-odbc-bridge"
DOCKERD_LOGS_PATH = "/ClickHouse/tests/integration/dockerd.log"
CLICKHOUSE_LIBRARY_BRIDGE_BINARY_PATH = "/usr/bin/clickhouse-library-bridge"
TRIES_COUNT = 10
@ -256,8 +256,8 @@ class ClickhouseIntegrationTestsRunner:
shutil.copy(CLICKHOUSE_LIBRARY_BRIDGE_BINARY_PATH, result_path_library_bridge)
return None, None
def _compress_logs(self, path, result_path):
subprocess.check_call("tar czf {} -C {} .".format(result_path, path), shell=True) # STYLE_CHECK_ALLOW_SUBPROCESS_CHECK_CALL
def _compress_logs(self, dir, relpaths, result_path):
subprocess.check_call("tar czf {} -C {} {}".format(result_path, dir, ' '.join(relpaths)), shell=True) # STYLE_CHECK_ALLOW_SUBPROCESS_CHECK_CALL
def _get_all_tests(self, repo_path):
image_cmd = self._get_runner_image_cmd(repo_path)
@ -336,6 +336,27 @@ class ClickhouseIntegrationTestsRunner:
logging.info("Cannot run with custom docker image version :(")
return image_cmd
def _find_test_data_dirs(self, repo_path, test_names):
relpaths = {}
for test_name in test_names:
if '/' in test_name:
test_dir = test_name[:test_name.find('/')]
else:
test_dir = test_name
if os.path.isdir(os.path.join(repo_path, "tests/integration", test_dir)):
for name in os.listdir(os.path.join(repo_path, "tests/integration", test_dir)):
relpath = os.path.join(test_dir, name)
mtime = os.path.getmtime(os.path.join(repo_path, "tests/integration", relpath))
relpaths[relpath] = mtime
return relpaths
def _get_test_data_dirs_difference(self, new_snapshot, old_snapshot):
res = set()
for path in new_snapshot:
if (path not in old_snapshot) or (old_snapshot[path] != new_snapshot[path]):
res.add(path)
return res
def run_test_group(self, repo_path, test_group, tests_in_group, num_tries, num_workers):
counters = {
"ERROR": [],
@ -355,18 +376,14 @@ class ClickhouseIntegrationTestsRunner:
image_cmd = self._get_runner_image_cmd(repo_path)
test_group_str = test_group.replace('/', '_').replace('.', '_')
log_paths = []
test_data_dirs = {}
for i in range(num_tries):
logging.info("Running test group %s for the %s retry", test_group, i)
clear_ip_tables_and_restart_daemons()
output_path = os.path.join(str(self.path()), "test_output_" + test_group_str + "_" + str(i) + ".log")
log_name = "integration_run_" + test_group_str + "_" + str(i) + ".txt"
log_path = os.path.join(str(self.path()), log_name)
log_paths.append(log_path)
logging.info("Will wait output inside %s", output_path)
test_names = set([])
for test_name in tests_in_group:
if test_name not in counters["PASSED"]:
@ -375,11 +392,19 @@ class ClickhouseIntegrationTestsRunner:
else:
test_names.add(test_name)
if i == 0:
test_data_dirs = self._find_test_data_dirs(repo_path, test_names)
info_basename = test_group_str + "_" + str(i) + ".nfo"
info_path = os.path.join(repo_path, "tests/integration", info_basename)
test_cmd = ' '.join([test for test in sorted(test_names)])
parallel_cmd = " --parallel {} ".format(num_workers) if num_workers > 0 else ""
cmd = "cd {}/tests/integration && ./runner --tmpfs {} -t {} {} '-ss -rfEp --run-id={} --color=no --durations=0 {}' | tee {}".format(
repo_path, image_cmd, test_cmd, parallel_cmd, i, _get_deselect_option(self.should_skip_tests()), output_path)
cmd = "cd {}/tests/integration && ./runner --tmpfs {} -t {} {} '-rfEp --run-id={} --color=no --durations=0 {}' | tee {}".format(
repo_path, image_cmd, test_cmd, parallel_cmd, i, _get_deselect_option(self.should_skip_tests()), info_path)
log_basename = test_group_str + "_" + str(i) + ".log"
log_path = os.path.join(repo_path, "tests/integration", log_basename)
with open(log_path, 'w') as log:
logging.info("Executing cmd: %s", cmd)
retcode = subprocess.Popen(cmd, shell=True, stderr=log, stdout=log).wait()
@ -388,15 +413,41 @@ class ClickhouseIntegrationTestsRunner:
else:
logging.info("Some tests failed")
if os.path.exists(output_path):
lines = parse_test_results_output(output_path)
extra_logs_names = [log_basename]
log_result_path = os.path.join(str(self.path()), 'integration_run_' + log_basename)
shutil.copy(log_path, log_result_path)
log_paths.append(log_result_path)
for pytest_log_path in glob.glob(os.path.join(repo_path, "tests/integration/pytest*.log")):
new_name = test_group_str + "_" + str(i) + "_" + os.path.basename(pytest_log_path)
os.rename(pytest_log_path, os.path.join(repo_path, "tests/integration", new_name))
extra_logs_names.append(new_name)
dockerd_log_path = os.path.join(repo_path, "tests/integration/dockerd.log")
if os.path.exists(dockerd_log_path):
new_name = test_group_str + "_" + str(i) + "_" + os.path.basename(dockerd_log_path)
os.rename(dockerd_log_path, os.path.join(repo_path, "tests/integration", new_name))
extra_logs_names.append(new_name)
if os.path.exists(info_path):
extra_logs_names.append(info_basename)
lines = parse_test_results_output(info_path)
new_counters = get_counters(lines)
times_lines = parse_test_times(output_path)
times_lines = parse_test_times(info_path)
new_tests_times = get_test_times(times_lines)
self._update_counters(counters, new_counters)
for test_name, test_time in new_tests_times.items():
tests_times[test_name] = test_time
os.remove(output_path)
test_data_dirs_new = self._find_test_data_dirs(repo_path, test_names)
test_data_dirs_diff = self._get_test_data_dirs_difference(test_data_dirs_new, test_data_dirs)
test_data_dirs = test_data_dirs_new
if extra_logs_names or test_data_dirs_diff:
extras_result_path = os.path.join(str(self.path()), "integration_run_" + test_group_str + "_" + str(i) + ".tar.gz")
self._compress_logs(os.path.join(repo_path, "tests/integration"), extra_logs_names + list(test_data_dirs_diff), extras_result_path)
log_paths.append(extras_result_path)
if len(counters["PASSED"]) + len(counters["FLAKY"]) == len(tests_in_group):
logging.info("All tests from group %s passed", test_group)
break
@ -459,15 +510,6 @@ class ClickhouseIntegrationTestsRunner:
break
time.sleep(5)
logging.info("Finally all tests done, going to compress test dir")
test_logs = os.path.join(str(self.path()), "./test_dir.tar.gz")
self._compress_logs("{}/tests/integration".format(repo_path), test_logs)
logging.info("Compression finished")
result_path_dockerd_logs = os.path.join(str(self.path()), "dockerd.log")
if os.path.exists(result_path_dockerd_logs):
shutil.copy(DOCKERD_LOGS_PATH, result_path_dockerd_logs)
test_result = []
for state in ("ERROR", "FAILED", "PASSED", "SKIPPED", "FLAKY"):
if state == "PASSED":
@ -479,7 +521,7 @@ class ClickhouseIntegrationTestsRunner:
test_result += [(c + ' (✕' + str(final_retry) + ')', text_state, "{:.2f}".format(tests_times[c])) for c in counters[state]]
status_text = description_prefix + ', '.join([str(n).lower().replace('failed', 'fail') + ': ' + str(len(c)) for n, c in counters.items()])
return result_state, status_text, test_result, [test_logs] + logs
return result_state, status_text, test_result, logs
def run_impl(self, repo_path, build_path):
if self.flaky_check:
@ -539,15 +581,6 @@ class ClickhouseIntegrationTestsRunner:
logging.info("Collected more than 20 failed/error tests, stopping")
break
logging.info("Finally all tests done, going to compress test dir")
test_logs = os.path.join(str(self.path()), "./test_dir.tar.gz")
self._compress_logs("{}/tests/integration".format(repo_path), test_logs)
logging.info("Compression finished")
result_path_dockerd_logs = os.path.join(str(self.path()), "dockerd.log")
if os.path.exists(result_path_dockerd_logs):
shutil.copy(DOCKERD_LOGS_PATH, result_path_dockerd_logs)
if counters["FAILED"] or counters["ERROR"]:
logging.info("Overall status failure, because we have tests in FAILED or ERROR state")
result_state = "failure"
@ -580,7 +613,7 @@ class ClickhouseIntegrationTestsRunner:
if '(memory)' in self.params['context_name']:
result_state = "success"
return result_state, status_text, test_result, [test_logs]
return result_state, status_text, test_result, []
def write_results(results_file, status_file, results, status):
with open(results_file, 'w') as f:

View File

@ -30,6 +30,7 @@ from kazoo.client import KazooClient
from kazoo.exceptions import KazooException
from minio import Minio
from helpers.test_tools import assert_eq_with_retry
from helpers import pytest_xdist_logging_to_separate_files
import docker
@ -56,22 +57,22 @@ def run_and_check(args, env=None, shell=False, stdout=subprocess.PIPE, stderr=su
subprocess.Popen(args, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL, env=env, shell=shell)
return
logging.debug(f"Command:{args}")
res = subprocess.run(args, stdout=stdout, stderr=stderr, env=env, shell=shell, timeout=timeout)
out = res.stdout.decode('utf-8')
err = res.stderr.decode('utf-8')
if res.returncode != 0:
# check_call(...) from subprocess does not print stderr, so we do it manually
logging.debug(f"Command:{args}")
logging.debug(f"Stderr:{err}")
# check_call(...) from subprocess does not print stderr, so we do it manually
if out:
logging.debug(f"Stdout:{out}")
logging.debug(f"Env: {env}")
if err:
logging.debug(f"Stderr:{err}")
if res.returncode != 0:
logging.debug(f"Exitcode:{res.returncode}")
if env:
logging.debug(f"Env:{env}")
if not nothrow:
raise Exception(f"Command {args} return non-zero code {res.returncode}: {res.stderr.decode('utf-8')}")
else:
logging.debug(f"Command:{args}")
logging.debug(f"Stderr: {err}")
logging.debug(f"Stdout: {out}")
return out
return out
# Based on https://stackoverflow.com/questions/2838244/get-open-tcp-port-in-python/2838309#2838309
def get_free_port():
@ -192,6 +193,7 @@ class ClickHouseCluster:
zookeeper_keyfile=None, zookeeper_certfile=None):
for param in list(os.environ.keys()):
logging.debug("ENV %40s %s" % (param, os.environ[param]))
self.base_path = base_path
self.base_dir = p.dirname(base_path)
self.name = name if name is not None else ''
@ -392,11 +394,13 @@ class ClickHouseCluster:
def cleanup(self):
# Just in case kill unstopped containers from previous launch
try:
result = run_and_check(f'docker container list --all --filter name={self.project_name} | wc -l', shell=True)
# We need to have "^/" and "$" in the "--filter name" option below to filter by exact name of the container, see
# https://stackoverflow.com/questions/48767760/how-to-make-docker-container-ls-f-name-filter-by-exact-name
result = run_and_check(f'docker container list --all --filter name=^/{self.project_name}$ | wc -l', shell=True)
if int(result) > 1:
logging.debug(f"Trying to kill unstopped containers for project{self.project_name}...")
run_and_check(f'docker kill $(docker container list --all --quiet --filter name={self.project_name})', shell=True)
run_and_check(f'docker rm $(docker container list --all --quiet --filter name={self.project_name})', shell=True)
logging.debug(f"Trying to kill unstopped containers for project {self.project_name}...")
run_and_check(f'docker kill $(docker container list --all --quiet --filter name=^/{self.project_name}$)', shell=True)
run_and_check(f'docker rm $(docker container list --all --quiet --filter name=^/{self.project_name}$)', shell=True)
logging.debug("Unstopped containers killed")
run_and_check(['docker-compose', 'ps', '--services', '--all'])
else:
@ -1293,6 +1297,9 @@ class ClickHouseCluster:
raise Exception("Can't wait Cassandra to start")
def start(self, destroy_dirs=True):
pytest_xdist_logging_to_separate_files.setup()
logging.info("Running tests in {}".format(self.base_path))
logging.debug("Cluster start called. is_up={}, destroy_dirs={}".format(self.is_up, destroy_dirs))
if self.is_up:
return
@ -1774,12 +1781,14 @@ class ClickHouseInstance:
# Connects to the instance via clickhouse-client, sends a query (1st argument) and returns the answer
def query(self, sql, stdin=None, timeout=None, settings=None, user=None, password=None, database=None,
ignore_error=False):
logging.debug(f"Executing query {sql} on {self.name}")
return self.client.query(sql, stdin=stdin, timeout=timeout, settings=settings, user=user, password=password,
database=database, ignore_error=ignore_error)
def query_with_retry(self, sql, stdin=None, timeout=None, settings=None, user=None, password=None, database=None,
ignore_error=False,
retry_count=20, sleep_time=0.5, check_callback=lambda x: True):
logging.debug(f"Executing query {sql} on {self.name}")
result = None
for i in range(retry_count):
try:
@ -1797,23 +1806,27 @@ class ClickHouseInstance:
raise Exception("Can't execute query {}".format(sql))
# As query() but doesn't wait response and returns response handler
def get_query_request(self, *args, **kwargs):
return self.client.get_query_request(*args, **kwargs)
def get_query_request(self, sql, *args, **kwargs):
logging.debug(f"Executing query {sql} on {self.name}")
return self.client.get_query_request(sql, *args, **kwargs)
# Connects to the instance via clickhouse-client, sends a query (1st argument), expects an error and return its code
def query_and_get_error(self, sql, stdin=None, timeout=None, settings=None, user=None, password=None,
database=None):
logging.debug(f"Executing query {sql} on {self.name}")
return self.client.query_and_get_error(sql, stdin=stdin, timeout=timeout, settings=settings, user=user,
password=password, database=database)
# The same as query_and_get_error but ignores successful query.
def query_and_get_answer_with_error(self, sql, stdin=None, timeout=None, settings=None, user=None, password=None,
database=None):
logging.debug(f"Executing query {sql} on {self.name}")
return self.client.query_and_get_answer_with_error(sql, stdin=stdin, timeout=timeout, settings=settings,
user=user, password=password, database=database)
# Connects to the instance via HTTP interface, sends a query and returns the answer
def http_query(self, sql, data=None, params=None, user=None, password=None, expect_fail_and_get_error=False):
logging.debug(f"Executing query {sql} on {self.name} via HTTP interface")
if params is None:
params = {}
else:
@ -1848,11 +1861,13 @@ class ClickHouseInstance:
# Connects to the instance via HTTP interface, sends a query and returns the answer
def http_request(self, url, method='GET', params=None, data=None, headers=None):
logging.debug(f"Sending HTTP request {url} to {self.name}")
url = "http://" + self.ip_address + ":8123/" + url
return requests.request(method=method, url=url, params=params, data=data, headers=headers)
# Connects to the instance via HTTP interface, sends a query, expects an error and return the error message
def http_query_and_get_error(self, sql, data=None, params=None, user=None, password=None):
logging.debug(f"Executing query {sql} on {self.name} via HTTP interface")
return self.http_query(sql=sql, data=data, params=params, user=user, password=password,
expect_fail_and_get_error=True)

View File

@ -0,0 +1,28 @@
import logging
import os.path
# Makes the parallel workers of pytest-xdist log to separate files.
# Without this function all workers would log to the same log file
# and mix everything together, making troubleshooting much more difficult.
def setup():
worker_name = os.environ.get('PYTEST_XDIST_WORKER', 'master')
if worker_name == 'master':
return
logger = logging.getLogger('')
new_handlers = []
handlers_to_remove = []
for handler in logger.handlers:
if isinstance(handler, logging.FileHandler):
filename, ext = os.path.splitext(handler.baseFilename)
if not filename.endswith('-' + worker_name):
new_filename = filename + '-' + worker_name
new_handler = logging.FileHandler(new_filename + ext)
new_handler.setFormatter(handler.formatter)
new_handler.setLevel(handler.level)
new_handlers.append(new_handler)
handlers_to_remove.append(handler)
for new_handler in new_handlers:
logger.addHandler(new_handler)
for handler in handlers_to_remove:
handler.flush()
logger.removeHandler(handler)

View File

@ -4,10 +4,14 @@ norecursedirs = _instances*
timeout = 1800
junit_duration_report = call
junit_suite_name = integration
log_cli = 1
log_level = DEBUG
log_format = %(asctime)s [ %(process)d ] %(levelname)s : %(message)s (%(filename)s:%(lineno)s, %(funcName)s)
log_date_format=%Y-%m-%d %H:%M:%S
log_cli = true
log_cli_level = CRITICAL
log_cli_format = %%(asctime)s [%(levelname)8s] %(funcName)s %(message)s (%(filename)s:%(lineno)s)
log_cli_format = %(asctime)s [ %(process)d ] %(levelname)s : %(message)s (%(filename)s:%(lineno)s, %(funcName)s)
log_cli_date_format=%Y-%m-%d %H:%M:%S
log_file = pytest.log
log_file_level = DEBUG
log_file_format = %(asctime)s [%(levelname)8s] %(funcName)s %(message)s (%(filename)s:%(lineno)s)
log_file_date_format=%Y-%m-%d %H:%M:%S
log_file_format = %(asctime)s [ %(process)d ] %(levelname)s : %(message)s (%(filename)s:%(lineno)s, %(funcName)s)
log_file_date_format = %Y-%m-%d %H:%M:%S

View File

@ -3,6 +3,7 @@
import subprocess
import os
import getpass
import glob
import argparse
import logging
import signal
@ -99,7 +100,7 @@ signal.signal(signal.SIGINT, docker_kill_handler_handler)
# 2) path of runner script is used to determine paths for trivial case, when we run it from repository
if __name__ == "__main__":
logging.basicConfig(level=logging.INFO, format='%(asctime)s %(message)s')
logging.basicConfig(level=logging.INFO, format='%(asctime)s [ %(process)d ] %(levelname)s : %(message)s (%(filename)s:%(lineno)s, %(funcName)s)')
parser = argparse.ArgumentParser(description="ClickHouse integration tests runner")
parser.add_argument(
@ -257,6 +258,9 @@ if __name__ == "__main__":
if sys.stdout.isatty() and sys.stdin.isatty():
tty = "-it"
# Remove old logs.
for old_log_path in glob.glob(args.cases_dir + "/pytest*.log"):
os.remove(old_log_path)
cmd = "docker run {net} {tty} --rm --name {name} --privileged \
--volume={odbc_bridge_bin}:/clickhouse-odbc-bridge --volume={bin}:/clickhouse \

View File

@ -48,7 +48,7 @@ def create_postgres_db(cursor, name='postgres_database'):
cursor.execute("CREATE DATABASE {}".format(name))
def drop_postgres_db(cursor, name='postgres_database'):
cursor.execute("DROP DATABASE {}".format(name))
cursor.execute("DROP DATABASE IF EXISTS {}".format(name))
def create_clickhouse_postgres_db(ip, port, name='postgres_database'):
instance.query('''
@ -168,7 +168,10 @@ def test_load_and_sync_all_database_tables(started_cluster):
result = instance.query('''SELECT count() FROM system.tables WHERE database = 'test_database';''')
assert(int(result) == NUM_TABLES)
drop_materialized_db()
for i in range(NUM_TABLES):
cursor.execute('drop table if exists postgresql_replica_{};'.format(i))
def test_replicating_dml(started_cluster):
@ -209,7 +212,7 @@ def test_replicating_dml(started_cluster):
check_tables_are_synchronized('postgresql_replica_{}'.format(i));
for i in range(NUM_TABLES):
cursor.execute('drop table postgresql_replica_{};'.format(i))
cursor.execute('drop table if exists postgresql_replica_{};'.format(i))
drop_materialized_db()
@ -262,7 +265,6 @@ def test_different_data_types(started_cluster):
cursor.execute('''UPDATE test_data_types SET i = '2020-12-12';'''.format(col, i))
check_tables_are_synchronized('test_data_types', 'id');
cursor.execute('drop table test_data_types;')
instance.query("INSERT INTO postgres_database.test_array_data_type "
"VALUES ("
@ -296,7 +298,10 @@ def test_different_data_types(started_cluster):
check_tables_are_synchronized('test_array_data_type');
result = instance.query('SELECT * FROM test_database.test_array_data_type ORDER BY key;')
assert(result == expected)
drop_materialized_db()
cursor.execute('drop table if exists test_data_types;')
cursor.execute('drop table if exists test_array_data_type;')
def test_load_and_sync_subset_of_database_tables(started_cluster):
@ -345,8 +350,10 @@ def test_load_and_sync_subset_of_database_tables(started_cluster):
table_name = 'postgresql_replica_{}'.format(i)
if i < int(NUM_TABLES/2):
check_tables_are_synchronized(table_name);
cursor.execute('drop table {};'.format(table_name))
drop_materialized_db()
for i in range(NUM_TABLES):
cursor.execute('drop table if exists postgresql_replica_{};'.format(i))
def test_changing_replica_identity_value(started_cluster):
@ -365,7 +372,9 @@ def test_changing_replica_identity_value(started_cluster):
check_tables_are_synchronized('postgresql_replica');
cursor.execute("UPDATE postgresql_replica SET key=key-25 WHERE key<100 ")
check_tables_are_synchronized('postgresql_replica');
drop_materialized_db()
cursor.execute('drop table if exists postgresql_replica;')
def test_clickhouse_restart(started_cluster):
@ -393,7 +402,10 @@ def test_clickhouse_restart(started_cluster):
for i in range(NUM_TABLES):
check_tables_are_synchronized('postgresql_replica_{}'.format(i));
drop_materialized_db()
for i in range(NUM_TABLES):
cursor.execute('drop table if exists postgresql_replica_{};'.format(i))
def test_replica_identity_index(started_cluster):
@ -421,7 +433,9 @@ def test_replica_identity_index(started_cluster):
cursor.execute('DELETE FROM postgresql_replica WHERE key2<75;')
check_tables_are_synchronized('postgresql_replica', order_by='key1');
drop_materialized_db()
cursor.execute('drop table if exists postgresql_replica;')
def test_table_schema_changes(started_cluster):
@ -477,6 +491,8 @@ def test_table_schema_changes(started_cluster):
cursor.execute('drop table postgresql_replica_{};'.format(i))
instance.query("DROP DATABASE test_database")
for i in range(NUM_TABLES):
cursor.execute('drop table if exists postgresql_replica_{};'.format(i))
def test_many_concurrent_queries(started_cluster):
@ -555,7 +571,10 @@ def test_many_concurrent_queries(started_cluster):
count2 = instance.query('SELECT count() FROM (SELECT * FROM test_database.postgresql_replica_{})'.format(i))
assert(int(count1) == int(count2))
print(count1, count2)
drop_materialized_db()
for i in range(NUM_TABLES):
cursor.execute('drop table if exists postgresql_replica_{};'.format(i))
def test_single_transaction(started_cluster):
@ -583,7 +602,9 @@ def test_single_transaction(started_cluster):
conn.commit()
check_tables_are_synchronized('postgresql_replica_0');
drop_materialized_db()
cursor.execute('drop table if exists postgresql_replica_0;')
def test_virtual_columns(started_cluster):
@ -617,7 +638,9 @@ def test_virtual_columns(started_cluster):
result = instance.query('SELECT key, value, value2, _sign, _version FROM test_database.postgresql_replica_0;')
print(result)
drop_materialized_db()
cursor.execute('drop table if exists postgresql_replica_0;')
def test_multiple_databases(started_cluster):
@ -671,8 +694,14 @@ def test_multiple_databases(started_cluster):
check_tables_are_synchronized(
table_name, 'key', 'postgres_database_{}'.format(cursor_id + 1), 'test_database_{}'.format(cursor_id + 1));
for i in range(NUM_TABLES):
cursor1.execute('drop table if exists postgresql_replica_{};'.format(i))
for i in range(NUM_TABLES):
cursor2.execute('drop table if exists postgresql_replica_{};'.format(i))
drop_clickhouse_postgres_db('postgres_database_1')
drop_clickhouse_postgres_db('postgres_database_2')
drop_materialized_db('test_database_1')
drop_materialized_db('test_database_2')
@ -718,7 +747,10 @@ def test_concurrent_transactions(started_cluster):
count2 = instance.query('SELECT count() FROM (SELECT * FROM test_database.postgresql_replica_{})'.format(i))
print(int(count1), int(count2), sep=' ')
assert(int(count1) == int(count2))
drop_materialized_db()
for i in range(NUM_TABLES):
cursor.execute('drop table if exists postgresql_replica_{};'.format(i))
def test_abrupt_connection_loss_while_heavy_replication(started_cluster):
@ -780,6 +812,8 @@ def test_abrupt_connection_loss_while_heavy_replication(started_cluster):
print(result) # Just debug
drop_materialized_db()
for i in range(NUM_TABLES):
cursor.execute('drop table if exists postgresql_replica_{};'.format(i))
def test_drop_database_while_replication_startup_not_finished(started_cluster):
@ -800,6 +834,9 @@ def test_drop_database_while_replication_startup_not_finished(started_cluster):
time.sleep(0.5 * i)
drop_materialized_db()
for i in range(NUM_TABLES):
cursor.execute('drop table if exists postgresql_replica_{};'.format(i))
def test_restart_server_while_replication_startup_not_finished(started_cluster):
drop_materialized_db()
@ -819,7 +856,10 @@ def test_restart_server_while_replication_startup_not_finished(started_cluster):
instance.restart_clickhouse()
for i in range(NUM_TABLES):
check_tables_are_synchronized('postgresql_replica_{}'.format(i));
drop_materialized_db()
for i in range(NUM_TABLES):
cursor.execute('drop table postgresql_replica_{};'.format(i))
def test_abrupt_server_restart_while_heavy_replication(started_cluster):
@ -878,6 +918,8 @@ def test_abrupt_server_restart_while_heavy_replication(started_cluster):
print(result) # Just debug
drop_materialized_db()
for i in range(NUM_TABLES):
cursor.execute('drop table if exists postgresql_replica_{};'.format(i))
if __name__ == '__main__':

View File

@ -3,6 +3,11 @@
<table_exists>hits_100m_single</table_exists>
</preconditions>
<settings>
<compile_aggregate_expressions>1</compile_aggregate_expressions>
<min_count_to_compile_aggregate_expression>0</min_count_to_compile_aggregate_expression>
</settings>
<create_query>
CREATE TABLE jit_test_memory (
key UInt64,

View File

@ -1 +1 @@
SELECT ('a', 'b').2
SELECT ('a', 'b').2

View File

@ -1 +1 @@
SELECT arrayWithConstant(-231.37104, -138); -- { serverError 128 }
SELECT arrayWithConstant(-231.37104, -138); -- { serverError 128 }

View File

@ -1 +1 @@
SHOW PRIVILEGES;
SHOW PRIVILEGES;

View File

@ -1 +1 @@
SELECT MONTH(toDateTime('2016-06-15 23:00:00'));
SELECT MONTH(toDateTime('2016-06-15 23:00:00'));

View File

@ -1 +1 @@
SELECT YEAR(toDateTime('2016-06-15 23:00:00'));
SELECT YEAR(toDateTime('2016-06-15 23:00:00'));

View File

@ -1 +1 @@
SELECT REPEAT('Test', 3);
SELECT REPEAT('Test', 3);

View File

@ -1 +1 @@
SELECT QUARTER(toDateTime('2016-06-15 23:00:00'));
SELECT QUARTER(toDateTime('2016-06-15 23:00:00'));

View File

@ -1 +1 @@
SELECT SECOND(toDateTime('2016-06-15 23:00:00'));
SELECT SECOND(toDateTime('2016-06-15 23:00:00'));

View File

@ -1 +1 @@
SELECT MINUTE(toDateTime('2016-06-15 23:00:00'));
SELECT MINUTE(toDateTime('2016-06-15 23:00:00'));

View File

@ -1 +1 @@
SELECT map('a', 1, 'b', 2) FROM remote('127.0.0.{1,2}', system, one);
SELECT map('a', 1, 'b', 2) FROM remote('127.0.0.{1,2}', system, one);

View File

@ -0,0 +1,43 @@
--JSON_VALUE--
1
1.2
true
"world"
null
--JSON_QUERY--
[{"hello":1}]
[1]
[1.2]
[true]
["world"]
[null]
[["world","world2"]]
[{"world":"!"}]
[0, 1, 4, 0, -1, -4]
--JSON_EXISTS--
1
0
1
1
1
0
1
0
0
1
1
0
1
0
1
--MANY ROWS--
0 ["Vasily", "Kostya"]
1 ["Tihon", "Ernest"]
2 ["Katya", "Anatoliy"]

View File

@ -0,0 +1,50 @@
SELECT '--JSON_VALUE--';
SELECT JSON_VALUE('$', '{"hello":1}'); -- root is a complex object => default value (empty string)
SELECT JSON_VALUE('$.hello', '{"hello":1}');
SELECT JSON_VALUE('$.hello', '{"hello":1.2}');
SELECT JSON_VALUE('$.hello', '{"hello":true}');
SELECT JSON_VALUE('$.hello', '{"hello":"world"}');
SELECT JSON_VALUE('$.hello', '{"hello":null}');
SELECT JSON_VALUE('$.hello', '{"hello":["world","world2"]}');
SELECT JSON_VALUE('$.hello', '{"hello":{"world":"!"}}');
SELECT JSON_VALUE('$.hello', '{hello:world}'); -- invalid json => default value (empty string)
SELECT JSON_VALUE('$.hello', '');
SELECT '--JSON_QUERY--';
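-- JSON_QUERY wraps every match in a JSON array; invalid JSON or an empty document again yields an empty string.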
SELECT JSON_QUERY('$', '{"hello":1}');
SELECT JSON_QUERY('$.hello', '{"hello":1}');
SELECT JSON_QUERY('$.hello', '{"hello":1.2}');
SELECT JSON_QUERY('$.hello', '{"hello":true}');
SELECT JSON_QUERY('$.hello', '{"hello":"world"}');
SELECT JSON_QUERY('$.hello', '{"hello":null}');
SELECT JSON_QUERY('$.hello', '{"hello":["world","world2"]}');
SELECT JSON_QUERY('$.hello', '{"hello":{"world":"!"}}');
SELECT JSON_QUERY('$.hello', '{hello:{"world":"!"}}}'); -- invalid json => default value (empty string)
SELECT JSON_QUERY('$.hello', '');
SELECT JSON_QUERY('$.array[*][0 to 2, 4]', '{"array":[[0, 1, 2, 3, 4, 5], [0, -1, -2, -3, -4, -5]]}');
SELECT '--JSON_EXISTS--';
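-- JSON_EXISTS returns 1 when the path matches at least one element and 0 otherwise.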
SELECT JSON_EXISTS('$', '{"hello":1}');
SELECT JSON_EXISTS('$', '');
SELECT JSON_EXISTS('$', '{}');
SELECT JSON_EXISTS('$.hello', '{"hello":1}');
SELECT JSON_EXISTS('$.world', '{"hello":1,"world":2}');
SELECT JSON_EXISTS('$.world', '{"hello":{"world":1}}');
SELECT JSON_EXISTS('$.hello.world', '{"hello":{"world":1}}');
SELECT JSON_EXISTS('$.hello', '{hello:world}'); -- invalid json => default value (zero integer)
SELECT JSON_EXISTS('$.hello', '');
SELECT JSON_EXISTS('$.hello[*]', '{"hello":["world"]}');
SELECT JSON_EXISTS('$.hello[0]', '{"hello":["world"]}');
SELECT JSON_EXISTS('$.hello[1]', '{"hello":["world"]}');
SELECT JSON_EXISTS('$.a[*].b', '{"a":[{"b":1},{"c":2}]}');
SELECT JSON_EXISTS('$.a[*].f', '{"a":[{"b":1},{"c":2}]}');
SELECT JSON_EXISTS('$.a[*][0].h', '{"a":[[{"b":1}, {"g":1}],[{"h":1},{"y":1}]]}');
SELECT '--MANY ROWS--';
DROP TABLE IF EXISTS 01889_sql_json;
CREATE TABLE 01889_sql_json (id UInt8, json String) ENGINE = MergeTree ORDER BY id;
INSERT INTO 01889_sql_json(id, json) VALUES(0, '{"name":"Ivan","surname":"Ivanov","friends":["Vasily","Kostya","Artyom"]}');
INSERT INTO 01889_sql_json(id, json) VALUES(1, '{"name":"Katya","surname":"Baltica","friends":["Tihon","Ernest","Innokentiy"]}');
INSERT INTO 01889_sql_json(id, json) VALUES(2, '{"name":"Vitali","surname":"Brown","friends":["Katya","Anatoliy","Ivan","Oleg"]}');
SELECT id, JSON_QUERY('$.friends[0 to 2]', json) FROM 01889_sql_json ORDER BY id;
DROP TABLE 01889_sql_json;

View File

@ -0,0 +1,2 @@
SELECT 1 as a, count() FROM numbers(10) WHERE 0 GROUP BY a;
SELECT count() FROM numbers(10) WHERE 0

View File

@ -0,0 +1,8 @@
{"m":{"1":2,"3":4}}
{1:2,3:4} {"1":2,"3":4} 1
{"m":{"key1":"1","key2":"2"}}
{'key1':1,'key2':2} {"key1":"1","key2":"2"} 1
{"m":{"key1":1,"key2":2}}
{'key1':1,'key2':2} {"key1":1,"key2":2} 1
{"m1":{"k1":"1","k2":"2"},"m2":{"1":2,"2":3}}
{"m1":{"k1":1,"k2":2},"m2":{"1":2,"2":3}}

View File

@ -0,0 +1,19 @@
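-- Maps are serialized as JSON objects: the map keys become JSON object keys and the values follow the usual JSON formatting rules.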
SELECT map(1, 2, 3, 4) AS m FORMAT JSONEachRow;
SELECT map(1, 2, 3, 4) AS m, toJSONString(m) AS s, isValidJSON(s);
SELECT map('key1', number, 'key2', number * 2) AS m FROM numbers(1, 1) FORMAT JSONEachRow;
SELECT map('key1', number, 'key2', number * 2) AS m, toJSONString(m) AS s, isValidJSON(s) FROM numbers(1, 1);
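-- With output_format_json_quote_64bit_integers = 0 the UInt64 values are rendered as JSON numbers instead of quoted strings.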
SELECT map('key1', number, 'key2', number * 2) AS m FROM numbers(1, 1)
FORMAT JSONEachRow
SETTINGS output_format_json_quote_64bit_integers = 0;
SELECT map('key1', number, 'key2', number * 2) AS m, toJSONString(m) AS s, isValidJSON(s) FROM numbers(1, 1)
SETTINGS output_format_json_quote_64bit_integers = 0;
CREATE TEMPORARY TABLE map_json (m1 Map(String, UInt64), m2 Map(UInt32, UInt32));
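-- Map columns are read from nested JSON objects; the numeric keys of m2 arrive as JSON strings and are cast to UInt32.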
INSERT INTO map_json FORMAT JSONEachRow {"m1" : {"k1" : 1, "k2" : 2}, "m2" : {"1" : 2, "2" : 3}};
SELECT m1, m2 FROM map_json FORMAT JSONEachRow;
SELECT m1, m2 FROM map_json FORMAT JSONEachRow SETTINGS output_format_json_quote_64bit_integers = 0;

View File

@ -252,3 +252,5 @@
01914_exchange_dictionaries
01923_different_expression_name_alias
01932_null_valid_identifier
00918_json_functions
01889_sql_json_functions