mirror of https://github.com/ClickHouse/ClickHouse.git
synced 2024-11-22 07:31:57 +00:00
Merge branch 'master' into arbitrary-const-expressions-in-limit
commit 75c087bcf5
30
.github/ISSUE_TEMPLATE/bug_report.md
vendored
Normal file
@@ -0,0 +1,30 @@
+---
+name: Bug report
+about: Create a report to help us improve ClickHouse
+title: ''
+labels: bug, issue
+assignees: ''
+
+---
+
+(you don't have to strictly follow this form)
+
+**Describe the bug**
+A clear and concise description of what the bug is.
+
+**How to reproduce**
+* Which ClickHouse server version to use
+* Which interface to use, if matters
+* Non-default settings, if any
+* `CREATE TABLE` statements for all tables involved
+* Sample data for all these tables, use [clickhouse-obfuscator](https://github.com/yandex/ClickHouse/blob/master/dbms/programs/obfuscator/Obfuscator.cpp#L42-L80) if necessary
+* Queries to run that lead to unexpected result
+
+**Expected behavior**
+A clear and concise description of what you expected to happen.
+
+**Error message and/or stacktrace**
+If applicable, add screenshots to help explain your problem.
+
+**Additional context**
+Add any other context about the problem here.
21
.github/ISSUE_TEMPLATE/build-issue.md
vendored
Normal file
@@ -0,0 +1,21 @@
+---
+name: Build issue
+about: Report failed ClickHouse build from master
+title: ''
+labels: build
+assignees: ''
+
+---
+
+Make sure that `git diff` result is empty and you've just pulled fresh master. Try cleaning up cmake cache. Just in case, official build instructions are published here: https://clickhouse.yandex/docs/en/development/build/
+
+**Operating system**
+OS kind or distribution, specific version/release, non-standard kernel if any. If you are trying to build inside virtual machine, please mention it too.
+
+**Cmake version**
+
+**Ninja version**
+
+**Compiler name and version**
+
+**Full cmake and/or ninja output**
22
.github/ISSUE_TEMPLATE/feature_request.md
vendored
Normal file
@@ -0,0 +1,22 @@
+---
+name: Feature request
+about: Suggest an idea for ClickHouse
+title: ''
+labels: feature
+assignees: ''
+
+---
+
+(you don't have to strictly follow this form)
+
+**Use case**
+A clear and concise description of what is the intended usage scenario is.
+
+**Describe the solution you'd like**
+A clear and concise description of what you want to happen.
+
+**Describe alternatives you've considered**
+A clear and concise description of any alternative solutions or features you've considered.
+
+**Additional context**
+Add any other context or screenshots about the feature request here.
12
.github/ISSUE_TEMPLATE/question.md
vendored
Normal file
@@ -0,0 +1,12 @@
+---
+name: Question
+about: Ask question about ClickHouse
+title: ''
+labels: question
+assignees: ''
+
+---
+
+Make sure to check documentation https://clickhouse.yandex/docs/en/ first. If the question is concise and probably has a short answer, asking it in Telegram chat https://telegram.me/clickhouse_en is probably the fastest way to find the answer. For more complicated questions, consider asking them on StackOverflow with "clickhouse" tag https://stackoverflow.com/questions/tagged/clickhouse
+
+If you still prefer GitHub issues, remove all this text and ask your question here.
3
.gitmodules
vendored
@@ -64,3 +64,6 @@
 [submodule "contrib/cppkafka"]
 	path = contrib/cppkafka
 	url = https://github.com/mfontanini/cppkafka.git
+[submodule "contrib/pdqsort"]
+	path = contrib/pdqsort
+	url = https://github.com/orlp/pdqsort
@@ -8,7 +8,7 @@
 * Added functions `left`, `right`, `trim`, `ltrim`, `rtrim`, `timestampadd`, `timestampsub` for SQL standard compatibility. [#3826](https://github.com/yandex/ClickHouse/pull/3826) ([Ivan Blinkov](https://github.com/blinkov))
 * Support for write in `HDFS` tables and `hdfs` table function. [#4084](https://github.com/yandex/ClickHouse/pull/4084) ([alesapin](https://github.com/alesapin))
 * Added functions to search for multiple constant strings from big haystack: `multiPosition`, `multiSearch` ,`firstMatch` also with `-UTF8`, `-CaseInsensitive`, and `-CaseInsensitiveUTF8` variants. [#4053](https://github.com/yandex/ClickHouse/pull/4053) ([Danila Kutenin](https://github.com/danlark1))
-* Pruning of unused shards if `SELECT` query filters by sharding key (setting `distributed_optimize_skip_select_on_unused_shards`). [#3851](https://github.com/yandex/ClickHouse/pull/3851) ([Ivan](https://github.com/abyss7))
+* Pruning of unused shards if `SELECT` query filters by sharding key (setting `optimize_skip_unused_shards`). [#3851](https://github.com/yandex/ClickHouse/pull/3851) ([Gleb Kanterov](https://github.com/kanterov), [Ivan](https://github.com/abyss7))
 * Allow `Kafka` engine to ignore some number of parsing errors per block. [#4094](https://github.com/yandex/ClickHouse/pull/4094) ([Ivan](https://github.com/abyss7))
 * Added support for `CatBoost` multiclass models evaluation. Function `modelEvaluate` returns tuple with per-class raw predictions for multiclass models. `libcatboostmodel.so` should be built with [#607](https://github.com/catboost/catboost/pull/607). [#3959](https://github.com/yandex/ClickHouse/pull/3959) ([KochetovNicolai](https://github.com/KochetovNicolai))
 * Added functions `filesystemAvailable`, `filesystemFree`, `filesystemCapacity`. [#4097](https://github.com/yandex/ClickHouse/pull/4097) ([Boris Granveaud](https://github.com/bgranvea))
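The shard-pruning entry in the hunk above can be illustrated with a minimal sketch. It assumes rows are routed to shards by `key % shard_count` with equal shard weights; that routing rule, and the helper name `shardsToQuery`, are assumptions made for illustration and are not taken from the ClickHouse sources.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>
#include <vector>

// Hypothetical helper, not ClickHouse code: returns the indexes of the shards
// that can hold rows whose sharding key equals one of `key_values`.
// Assumption: a row with key k lives on shard (k % shard_count).
std::set<std::size_t> shardsToQuery(const std::vector<std::uint64_t> & key_values, std::size_t shard_count)
{
    assert(shard_count > 0);

    std::set<std::size_t> shards;
    for (std::uint64_t key : key_values)
        shards.insert(static_cast<std::size_t>(key % shard_count));

    // Every shard missing from the result can be skipped for this query.
    return shards;
}
```

Under these assumptions, a filter such as `key IN (10, 13, 16)` on a three-shard cluster only needs to visit one shard, because all three values map to the same remainder.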
109
CHANGELOG_RU.md
@@ -1,3 +1,112 @@
+## ClickHouse release 19.1.6, 2019-01-24
+
+### New features:
+
+* Compression codecs can be specified for individual columns. [#3899](https://github.com/yandex/ClickHouse/pull/3899) [#4111](https://github.com/yandex/ClickHouse/pull/4111) ([alesapin](https://github.com/alesapin), [Winter Zhang](https://github.com/zhang2014), [Anatoly](https://github.com/Sindbag))
+* `Delta` compression codec. [#4052](https://github.com/yandex/ClickHouse/pull/4052) ([alesapin](https://github.com/alesapin))
+* Compression codecs can be changed with an `ALTER` query. [#4054](https://github.com/yandex/ClickHouse/pull/4054) ([alesapin](https://github.com/alesapin))
+* Added functions `left`, `right`, `trim`, `ltrim`, `rtrim`, `timestampadd`, `timestampsub` for SQL standard compatibility. [#3826](https://github.com/yandex/ClickHouse/pull/3826) ([Ivan Blinkov](https://github.com/blinkov))
+* Support for writing to the `HDFS` engine and the `hdfs` table function. [#4084](https://github.com/yandex/ClickHouse/pull/4084) ([alesapin](https://github.com/alesapin))
+* Added functions to search for a set of constant strings in a text: `multiPosition`, `multiSearch`, `firstMatch`, also with the `-UTF8`, `-CaseInsensitive`, and `-CaseInsensitiveUTF8` suffixes. [#4053](https://github.com/yandex/ClickHouse/pull/4053) ([Danila Kutenin](https://github.com/danlark1))
+* Unused shards are skipped if the `SELECT` query filters by the sharding key (setting `optimize_skip_unused_shards`). [#3851](https://github.com/yandex/ClickHouse/pull/3851) ([Gleb Kanterov](https://github.com/kanterov), [Ivan](https://github.com/abyss7))
+* Rows are skipped on parsing errors in the `Kafka` engine (setting `kafka_skip_broken_messages`). [#4094](https://github.com/yandex/ClickHouse/pull/4094) ([Ivan](https://github.com/abyss7))
+* Support for applying multiclass `CatBoost` models. Function `modelEvaluate` returns a tuple when a multiclass model is used. `libcatboostmodel.so` should be built with [#607](https://github.com/catboost/catboost/pull/607). [#3959](https://github.com/yandex/ClickHouse/pull/3959) ([KochetovNicolai](https://github.com/KochetovNicolai))
+* Added functions `filesystemAvailable`, `filesystemFree`, `filesystemCapacity`. [#4097](https://github.com/yandex/ClickHouse/pull/4097) ([Boris Granveaud](https://github.com/bgranvea))
+* Added hashing functions `xxHash64` and `xxHash32`. [#3905](https://github.com/yandex/ClickHouse/pull/3905) ([filimonov](https://github.com/filimonov))
+* Added the `gccMurmurHash` hashing function (GCC flavoured Murmur hash), which uses the same hash seed as [gcc](https://github.com/gcc-mirror/gcc/blob/41d6b10e96a1de98e90a7c0378437c3255814b16/libstdc%2B%2B-v3/include/bits/functional_hash.h#L191) [#4000](https://github.com/yandex/ClickHouse/pull/4000) ([sundyli](https://github.com/sundy-li))
+* Added hashing functions `javaHash`, `hiveHash`. [#3811](https://github.com/yandex/ClickHouse/pull/3811) ([shangshujie365](https://github.com/shangshujie365))
+* Added the `remoteSecure` function. It works like `remote` but uses a secure connection. [#4088](https://github.com/yandex/ClickHouse/pull/4088) ([proller](https://github.com/proller))
+
+
+### Experimental features:
+
+* Emulation of queries with multiple `JOIN` clauses (setting `allow_experimental_multiple_joins_emulation`). [#3946](https://github.com/yandex/ClickHouse/pull/3946) ([Artem Zuikov](https://github.com/4ertus2))
+
+### Bug fixes:
+
+* Limited the size of the compiled expressions cache when the `compiled_expression_cache_size` setting is not specified, to save memory. [#4041](https://github.com/yandex/ClickHouse/pull/4041) ([alesapin](https://github.com/alesapin))
+* Fixed hangs of threads executing `ALTER` queries for tables of the `Replicated` family, and of threads updating configuration from ZooKeeper. [#2947](https://github.com/yandex/ClickHouse/issues/2947) [#3891](https://github.com/yandex/ClickHouse/issues/3891) [#3934](https://github.com/yandex/ClickHouse/pull/3934) ([Alex Zatelepin](https://github.com/ztlpn))
+* Fixed a race condition when executing a distributed `ALTER` task. The race condition led to more than one replica trying to execute the task, and all of those replicas except one failed with a ZooKeeper error. [#3904](https://github.com/yandex/ClickHouse/pull/3904) ([Alex Zatelepin](https://github.com/ztlpn))
+* Fixed updating of the `from_zk` setting. A setting specified in the configuration file was not refreshed if the request to ZooKeeper timed out. [#2947](https://github.com/yandex/ClickHouse/issues/2947) [#3947](https://github.com/yandex/ClickHouse/pull/3947) ([Alex Zatelepin](https://github.com/ztlpn))
+* Fixed a bug in computing the network prefix when an IPv4 subnet mask is specified. [#3945](https://github.com/yandex/ClickHouse/pull/3945) ([alesapin](https://github.com/alesapin))
+* Fixed a crash (`std::terminate`) in a rare scenario when a new thread could not be created due to a lack of resources. [#3956](https://github.com/yandex/ClickHouse/pull/3956) ([alexey-milovidov](https://github.com/alexey-milovidov))
+* Fixed a crash of the `remote` table function when the table structure could not be obtained because of user restrictions. [#4009](https://github.com/yandex/ClickHouse/pull/4009) ([alesapin](https://github.com/alesapin))
+* Fixed a leak of network sockets. Sockets were created in a pool and never closed. When a thread was created, new sockets were created if all available ones were in use. [#4017](https://github.com/yandex/ClickHouse/pull/4017) ([Alex Zatelepin](https://github.com/ztlpn))
+* Fixed closing of `/proc/self/fd` before all file descriptors had been read from `/proc` after creating the `odbc-bridge` process. [#4120](https://github.com/yandex/ClickHouse/pull/4120) ([alesapin](https://github.com/alesapin))
+* Fixed a bug in the monotonic conversion of String to UInt when String is used in the primary key. [#3870](https://github.com/yandex/ClickHouse/pull/3870) ([Winter Zhang](https://github.com/zhang2014))
+* Fixed a bug in computing the monotonicity of the integer type conversion function. [#3921](https://github.com/yandex/ClickHouse/pull/3921) ([alexey-milovidov](https://github.com/alexey-milovidov))
+* Fixed a crash in functions `arrayEnumerateUniq`, `arrayEnumerateDense` when invalid arguments are passed. [#3909](https://github.com/yandex/ClickHouse/pull/3909) ([alexey-milovidov](https://github.com/alexey-milovidov))
+* Fixed undefined behavior in StorageMerge. [#3910](https://github.com/yandex/ClickHouse/pull/3910) ([Amos Bird](https://github.com/amosbird))
+* Fixed a crash in functions `addDays`, `subtractDays`. [#3913](https://github.com/yandex/ClickHouse/pull/3913) ([alexey-milovidov](https://github.com/alexey-milovidov))
+* Fixed a problem where functions `round`, `floor`, `trunc`, `ceil` could return an incorrect result for negative integer arguments with a large value. [#3914](https://github.com/yandex/ClickHouse/pull/3914) ([alexey-milovidov](https://github.com/alexey-milovidov))
+* Fixed a problem where 'kill query sync' led to a server crash. [#3916](https://github.com/yandex/ClickHouse/pull/3916) ([muVulDeePecker](https://github.com/fancyqlx))
+* Fixed a bug causing a long delay when the replication queue is empty. [#3928](https://github.com/yandex/ClickHouse/pull/3928) [#3932](https://github.com/yandex/ClickHouse/pull/3932) ([alesapin](https://github.com/alesapin))
+* Fixed excessive memory usage when inserting into a table with `LowCardinality` in the primary key. [#3955](https://github.com/yandex/ClickHouse/pull/3955) ([KochetovNicolai](https://github.com/KochetovNicolai))
+* Fixed serialization of empty arrays of the `LowCardinality` type for the `Native` format. [#3907](https://github.com/yandex/ClickHouse/issues/3907) [#4011](https://github.com/yandex/ClickHouse/pull/4011) ([KochetovNicolai](https://github.com/KochetovNicolai))
+* Fixed an incorrect result when using distinct on a numeric `LowCardinality` column. [#3895](https://github.com/yandex/ClickHouse/issues/3895) [#4012](https://github.com/yandex/ClickHouse/pull/4012) ([KochetovNicolai](https://github.com/KochetovNicolai))
+* Fixed compilation of aggregate function evaluation for a `LowCardinality` key (when the `compile` setting is enabled). [#3886](https://github.com/yandex/ClickHouse/pull/3886) ([KochetovNicolai](https://github.com/KochetovNicolai))
+* Fixed passing of the user and password for queries from replicas. [#3957](https://github.com/yandex/ClickHouse/pull/3957) ([alesapin](https://github.com/alesapin)) ([小路](https://github.com/nicelulu))
+* Fixed a very rare race condition that occurs when listing tables of a `Dictionary` database while dictionaries are being reloaded. [#3970](https://github.com/yandex/ClickHouse/pull/3970) ([alexey-milovidov](https://github.com/alexey-milovidov))
+* Fixed an incorrect result when HAVING is used with ROLLUP or CUBE. [#3756](https://github.com/yandex/ClickHouse/issues/3756) [#3837](https://github.com/yandex/ClickHouse/pull/3837) ([Sam Chou](https://github.com/reflection))
+* Fixed a problem with column aliases for queries with `JOIN ON` over distributed tables. [#3980](https://github.com/yandex/ClickHouse/pull/3980) ([Winter Zhang](https://github.com/zhang2014))
+* Fixed an error in the implementation of the `quantileTDigest` function (found by Artem Vakhrushev). This error never occurs in ClickHouse and is relevant only for those who use the ClickHouse codebase directly as a library. [#3935](https://github.com/yandex/ClickHouse/pull/3935) ([alexey-milovidov](https://github.com/alexey-milovidov))
+
+### Improvements:
+
+* Added support for `IF NOT EXISTS` in `ALTER TABLE ADD COLUMN` and for `IF EXISTS` in `DROP/MODIFY/CLEAR/COMMENT COLUMN`. [#3900](https://github.com/yandex/ClickHouse/pull/3900) ([Boris Granveaud](https://github.com/bgranvea))
+* Function `parseDateTimeBestEffort` now supports the formats `DD.MM.YYYY`, `DD.MM.YY`, `DD-MM-YYYY`, `DD-Mon-YYYY`, `DD/Month/YYYY` and similar. [#3922](https://github.com/yandex/ClickHouse/pull/3922) ([alexey-milovidov](https://github.com/alexey-milovidov))
+* `CapnProtoInputStream` now supports jagged structures. [#4063](https://github.com/yandex/ClickHouse/pull/4063) ([Odin Hultgren Van Der Horst](https://github.com/Miniwoffer))
+* Usability improvement: added a check that the server is started by the user who owns the data directory. Starting as root is forbidden if root does not own the data directory. [#3785](https://github.com/yandex/ClickHouse/pull/3785) ([sergey-v-galtsev](https://github.com/sergey-v-galtsev))
+* Improved the logic of checking the columns required for JOIN at the query analysis stage. [#3930](https://github.com/yandex/ClickHouse/pull/3930) ([Artem Zuikov](https://github.com/4ertus2))
+* Decreased the number of maintained connections when there are many distributed tables. [#3726](https://github.com/yandex/ClickHouse/pull/3726) ([Winter Zhang](https://github.com/zhang2014))
+* Added support for the totals row of `WITH TOTALS` queries via the ODBC driver. [#3836](https://github.com/yandex/ClickHouse/pull/3836) ([Maksim Koritckiy](https://github.com/nightweb))
+* `Enum` values can now be used as numbers in the `if` function. [#3875](https://github.com/yandex/ClickHouse/pull/3875) ([Ivan](https://github.com/abyss7))
+* Added the `low_cardinality_allow_in_native_format` setting. If it is disabled, the `LowCardinality` type is not used in the `Native` format. [#3879](https://github.com/yandex/ClickHouse/pull/3879) ([KochetovNicolai](https://github.com/KochetovNicolai))
+* Removed some redundant objects from the compiled expressions cache to lower memory usage. [#4042](https://github.com/yandex/ClickHouse/pull/4042) ([alesapin](https://github.com/alesapin))
+* Added a check that a valid value is passed to the `SET send_logs_level = 'value'` query. [#3873](https://github.com/yandex/ClickHouse/pull/3873) ([Sabyanin Maxim](https://github.com/s-mx))
+* Added type checking for type conversion functions. [#3896](https://github.com/yandex/ClickHouse/pull/3896) ([Winter Zhang](https://github.com/zhang2014))
+
+### Performance improvements:
+
+* Added the `use_minimalistic_part_header_in_zookeeper` setting for the MergeTree engine. If it is enabled, Replicated tables store part metadata in a compact form (in the znode of that part). This can significantly reduce the ZooKeeper snapshot size (especially for tables with many columns). After enabling this setting, it is impossible to downgrade to a version that does not support it. [#3960](https://github.com/yandex/ClickHouse/pull/3960) ([Alex Zatelepin](https://github.com/ztlpn))
+* Added a state-machine-based implementation of functions `sequenceMatch` and `sequenceCount` for the case when the event sequence contains no time condition. [#4004](https://github.com/yandex/ClickHouse/pull/4004) ([Léo Ercolanelli](https://github.com/ercolanelli-leo))
+* Improved the performance of integer serialization. [#3968](https://github.com/yandex/ClickHouse/pull/3968) ([Amos Bird](https://github.com/amosbird))
+* Added zero left padding for PODArray. The element with index -1 is now a valid zero value. This is used to remove a conditional expression when computing array offsets. [#3920](https://github.com/yandex/ClickHouse/pull/3920) ([Amos Bird](https://github.com/amosbird))
+* Reverted the `jemalloc` version that led to performance degradation. [#4018](https://github.com/yandex/ClickHouse/pull/4018) ([alexey-milovidov](https://github.com/alexey-milovidov))
+
+### Backward incompatible changes:
+
+* Removed the undocumented feature `ALTER MODIFY PRIMARY KEY`, which was superseded by `ALTER MODIFY ORDER BY`. [#3887](https://github.com/yandex/ClickHouse/pull/3887) ([Alex Zatelepin](https://github.com/ztlpn))
+* Removed function `shardByHash`. [#3833](https://github.com/yandex/ClickHouse/pull/3833) ([alexey-milovidov](https://github.com/alexey-milovidov))
+* Scalar subqueries with a result of type `AggregateFunction` are now forbidden. [#3865](https://github.com/yandex/ClickHouse/pull/3865) ([Ivan](https://github.com/abyss7))
+
+### Build/testing/packaging improvements:
+
+* Added support for building for PowerPC (`ppc64le`). [#4132](https://github.com/yandex/ClickHouse/pull/4132) ([Danila Kutenin](https://github.com/danlark1))
+* Stateful functional tests are run on publicly available data. [#3969](https://github.com/yandex/ClickHouse/pull/3969) ([alexey-milovidov](https://github.com/alexey-milovidov))
+* Fixed an error where the server could not start, reporting `bash: /usr/bin/clickhouse-extract-from-config: Operation not permitted`, when using Docker or systemd-nspawn. [#4136](https://github.com/yandex/ClickHouse/pull/4136) ([alexey-milovidov](https://github.com/alexey-milovidov))
+* Updated the `rdkafka` library to v1.0.0-RC5. cppkafka is used instead of the raw C interface. [#4025](https://github.com/yandex/ClickHouse/pull/4025) ([Ivan](https://github.com/abyss7))
+* Updated the `mariadb-client` library. Fixed an issue found with UBSan. [#3924](https://github.com/yandex/ClickHouse/pull/3924) ([alexey-milovidov](https://github.com/alexey-milovidov))
+* Fixes for UBSan builds. [#3926](https://github.com/yandex/ClickHouse/pull/3926) [#3021](https://github.com/yandex/ClickHouse/pull/3021) [#3948](https://github.com/yandex/ClickHouse/pull/3948) ([alexey-milovidov](https://github.com/alexey-milovidov))
+* Added per-commit test runs with the UBSan build.
+* Added per-commit runs of the PVS-Studio static analyzer.
+* Fixed issues found with PVS-Studio. [#4013](https://github.com/yandex/ClickHouse/pull/4013) ([alexey-milovidov](https://github.com/alexey-milovidov))
+* Fixed glibc compatibility issues. [#4100](https://github.com/yandex/ClickHouse/pull/4100) ([alexey-milovidov](https://github.com/alexey-milovidov))
+* Docker images were moved to Ubuntu 18.10, and compatibility with glibc >= 2.28 was added [#3965](https://github.com/yandex/ClickHouse/pull/3965) ([alesapin](https://github.com/alesapin))
+* Added the `CLICKHOUSE_DO_NOT_CHOWN` environment variable, which allows skipping chown of directories in the server Docker image. [#3967](https://github.com/yandex/ClickHouse/pull/3967) ([alesapin](https://github.com/alesapin))
+* Enabled most of the warnings from `-Weverything` for clang. Enabled `-Wpedantic`. [#3986](https://github.com/yandex/ClickHouse/pull/3986) ([alexey-milovidov](https://github.com/alexey-milovidov))
+* Added some warnings that are specific to clang 8 only. [#3993](https://github.com/yandex/ClickHouse/pull/3993) ([alexey-milovidov](https://github.com/alexey-milovidov))
+* With dynamic linking, `libLLVM` is used instead of the `LLVM` library. [#3989](https://github.com/yandex/ClickHouse/pull/3989) ([Orivej Desh](https://github.com/orivej))
+* Added environment variables for `TSan`, `UBSan`, `ASan` options in the test Docker image. [#4072](https://github.com/yandex/ClickHouse/pull/4072) ([alesapin](https://github.com/alesapin))
+* The `clickhouse-server` Debian package will recommend the `libcap2-bin` package in order to use the `setcap` utility for settings. This package is optional. [#4093](https://github.com/yandex/ClickHouse/pull/4093) ([alexey-milovidov](https://github.com/alexey-milovidov))
+* Reduced build time and removed unneeded header includes. [#3898](https://github.com/yandex/ClickHouse/pull/3898) ([proller](https://github.com/proller))
+* Added performance tests for hashing functions. [#3918](https://github.com/yandex/ClickHouse/pull/3918) ([filimonov](https://github.com/filimonov))
+* Fixed cyclic library dependencies. [#3958](https://github.com/yandex/ClickHouse/pull/3958) ([proller](https://github.com/proller))
+* Improved compilation with little available memory. [#4030](https://github.com/yandex/ClickHouse/pull/4030) ([proller](https://github.com/proller))
+* Added a test script to reproduce the performance degradation in `jemalloc`. [#4036](https://github.com/yandex/ClickHouse/pull/4036) ([alexey-milovidov](https://github.com/alexey-milovidov))
+* Fixed typos in comments and string literals. [#4122](https://github.com/yandex/ClickHouse/pull/4122) ([maiha](https://github.com/maiha))
+* Fixed typos in comments. [#4089](https://github.com/yandex/ClickHouse/pull/4089) ([Evgenii Pravda](https://github.com/kvinty))
+
 ## ClickHouse release 18.16.1, 2018-12-21
 
 ### Bug fixes:
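The PODArray entry in this changelog says that a zero element readable at index -1 lets array-offset arithmetic drop a conditional. A minimal sketch of that idea follows; `PaddedOffsets` and `arrayLength` are hypothetical names invented here and do not reproduce the real `PODArray` implementation, which this diff does not show.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a left-padded array (not the real PODArray):
// the storage keeps one extra zero element in front, so index -1 is readable.
class PaddedOffsets
{
public:
    explicit PaddedOffsets(const std::vector<std::size_t> & offsets)
        : storage(offsets.size() + 1, 0)
    {
        for (std::size_t i = 0; i < offsets.size(); ++i)
            storage[i + 1] = offsets[i];
    }

    // Accepts i in [-1, size()); i == -1 yields the zero padding element.
    std::size_t operator[](std::ptrdiff_t i) const
    {
        assert(i >= -1 && i < static_cast<std::ptrdiff_t>(size()));
        return storage[static_cast<std::size_t>(i + 1)];
    }

    std::size_t size() const { return storage.size() - 1; }

private:
    std::vector<std::size_t> storage;
};

// Length of the i-th nested array. offsets[i] is the end position of array i,
// so its start is offsets[i - 1]; the padding makes this valid for i == 0
// without a special case such as `i == 0 ? 0 : offsets[i - 1]`.
std::size_t arrayLength(const PaddedOffsets & offsets, std::size_t i)
{
    const auto idx = static_cast<std::ptrdiff_t>(i);
    return offsets[idx] - offsets[idx - 1];
}
```

With offsets `{2, 5, 5, 9}` the four nested arrays have lengths 2, 3, 0 and 4, and the first length is computed by the same expression as the others.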
@@ -221,7 +221,7 @@ if (UNBUNDLED OR NOT (OS_LINUX OR APPLE) OR ARCH_32)
     option (NO_WERROR "Disable -Werror compiler option" ON)
 endif ()
 
-message (STATUS "Building for: ${CMAKE_SYSTEM} ${CMAKE_SYSTEM_PROCESSOR} ${CMAKE_LIBRARY_ARCHITECTURE} ; USE_STATIC_LIBRARIES=${USE_STATIC_LIBRARIES} MAKE_STATIC_LIBRARIES=${MAKE_STATIC_LIBRARIES} UNBUNDLED=${UNBUNDLED} CCACHE=${CCACHE_FOUND} ${CCACHE_VERSION}")
+message (STATUS "Building for: ${CMAKE_SYSTEM} ${CMAKE_SYSTEM_PROCESSOR} ${CMAKE_LIBRARY_ARCHITECTURE} ; USE_STATIC_LIBRARIES=${USE_STATIC_LIBRARIES} MAKE_STATIC_LIBRARIES=${MAKE_STATIC_LIBRARIES} SPLIT_SHARED=${SPLIT_SHARED_LIBRARIES} UNBUNDLED=${UNBUNDLED} CCACHE=${CCACHE_FOUND} ${CCACHE_VERSION}")
 
 include(GNUInstallDirs)
 
@@ -253,6 +253,7 @@ endif()
 include (cmake/find_libgsasl.cmake)
 include (cmake/find_libxml2.cmake)
 include (cmake/find_protobuf.cmake)
+include (cmake/find_pdqsort.cmake)
 include (cmake/find_hdfs3.cmake)
 include (cmake/find_consistent-hashing.cmake)
 include (cmake/find_base64.cmake)
@@ -13,4 +13,5 @@ ClickHouse is an open-source column-oriented database management system that all
 
 ## Upcoming Events
 
-* [C++ ClickHouse and CatBoost Sprints](https://events.yandex.ru/events/ClickHouse/2-feb-2019/) in Moscow on February 2.
+* [ClickHouse Community Meetup](https://www.eventbrite.com/e/meetup-clickhouse-in-the-wild-deployment-success-stories-registration-55305051899) in San Francisco on February 19.
+* [ClickHouse Community Meetup](https://www.eventbrite.com/e/clickhouse-meetup-in-madrid-registration-55376746339) in Madrid on April 2.
@@ -1,3 +1,4 @@
+# ARM: Cannot cpuid_get_raw_data: CPUID instruction is not supported
 if (NOT ARCH_ARM)
     option (USE_INTERNAL_CPUID_LIBRARY "Set to FALSE to use system cpuid library instead of bundled" ${NOT_UNBUNDLED})
 endif ()
@@ -21,7 +22,7 @@ if (CPUID_LIBRARY AND CPUID_INCLUDE_DIR)
         # TODO: make virtual target cpuid:cpuid with COMPILE_DEFINITIONS property
     endif ()
     set (USE_CPUID 1)
-elseif (NOT MISSING_INTERNAL_CPUID_LIBRARY)
+elseif (NOT ARCH_ARM AND NOT MISSING_INTERNAL_CPUID_LIBRARY)
     set (CPUID_INCLUDE_DIR ${ClickHouse_SOURCE_DIR}/contrib/libcpuid/include)
     set (USE_INTERNAL_CPUID_LIBRARY 1)
     set (CPUID_LIBRARY cpuid)
@@ -1,5 +1,12 @@
 option(USE_INTERNAL_CPUINFO_LIBRARY "Set to FALSE to use system cpuinfo library instead of bundled" ${NOT_UNBUNDLED})
 
+# Now we have no contrib/libcpuinfo, use from system.
+if (USE_INTERNAL_CPUINFO_LIBRARY AND NOT EXISTS "${ClickHouse_SOURCE_DIR}/contrib/libcpuinfo/include")
+    #message (WARNING "submodule contrib/libcpuid is missing. to fix try run: \n git submodule update --init --recursive")
+    set (USE_INTERNAL_CPUINFO_LIBRARY 0)
+    set (MISSING_INTERNAL_CPUINFO_LIBRARY 1)
+endif ()
+
 if(NOT USE_INTERNAL_CPUINFO_LIBRARY)
     find_library(CPUINFO_LIBRARY cpuinfo)
     find_path(CPUINFO_INCLUDE_DIR NAMES cpuinfo.h PATHS ${CPUINFO_INCLUDE_PATHS})
2
cmake/find_pdqsort.cmake
Normal file
@@ -0,0 +1,2 @@
+set(PDQSORT_INCLUDE_DIR ${ClickHouse_SOURCE_DIR}/contrib/pdqsort)
+message(STATUS "Using pdqsort: ${PDQSORT_INCLUDE_DIR}")
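For orientation, here is a hedged sketch of how C++ code might consume the header that `PDQSORT_INCLUDE_DIR` points at. It assumes the submodule exposes `<pdqsort.h>` with a `pdqsort(begin, end, comp)` function template mirroring `std::sort`, as the upstream orlp/pdqsort repository does. The wrapper `sortValues` and the macro `SKETCH_HAVE_PDQSORT` are hypothetical, and the code falls back to `std::sort` when the header is not on the include path.

```cpp
#include <algorithm>
#include <functional>
#include <vector>

// Assumption: the directory in PDQSORT_INCLUDE_DIR provides <pdqsort.h>.
#if defined(__has_include)
#  if __has_include(<pdqsort.h>)
#    include <pdqsort.h>
#    define SKETCH_HAVE_PDQSORT 1
#  endif
#endif
#ifndef SKETCH_HAVE_PDQSORT
#  define SKETCH_HAVE_PDQSORT 0
#endif

// Hypothetical wrapper: sorts in place with pdqsort when available and with
// std::sort otherwise. Both take the same (begin, end, comparator) arguments.
template <typename T, typename Compare = std::less<T>>
void sortValues(std::vector<T> & values, Compare comp = Compare())
{
#if SKETCH_HAVE_PDQSORT
    pdqsort(values.begin(), values.end(), comp);
#else
    std::sort(values.begin(), values.end(), comp);
#endif
}
```

Because the two functions share a call shape, switching a call site between them is a one-line change, which is presumably why only an include directory has to be exported here.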
@@ -1,5 +1,11 @@
 option(USE_INTERNAL_PROTOBUF_LIBRARY "Set to FALSE to use system protobuf instead of bundled" ${NOT_UNBUNDLED})
 
+if(OS_FREEBSD AND SANITIZE STREQUAL "address")
+    # ../contrib/protobuf/src/google/protobuf/arena_impl.h:45:10: fatal error: 'sanitizer/asan_interface.h' file not found
+    set(MISSING_INTERNAL_PROTOBUF_LIBRARY 1)
+    set(USE_INTERNAL_PROTOBUF_LIBRARY 0)
+endif()
+
 if(NOT EXISTS "${ClickHouse_SOURCE_DIR}/contrib/protobuf/cmake/CMakeLists.txt")
     if(USE_INTERNAL_PROTOBUF_LIBRARY)
         message(WARNING "submodule contrib/protobuf is missing. to fix try run: \n git submodule update --init --recursive")
2
contrib/CMakeLists.txt
vendored
@@ -8,6 +8,8 @@ elseif (CMAKE_CXX_COMPILER_ID STREQUAL "Clang")
     set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wno-old-style-cast -Wno-unused-function -Wno-unused-variable -Wno-unused-result -Wno-deprecated-declarations -Wno-non-virtual-dtor -Wno-format -Wno-inconsistent-missing-override -std=c++1z")
 endif ()
 
+set_property(DIRECTORY PROPERTY EXCLUDE_FROM_ALL 1)
+
 if (USE_INTERNAL_BOOST_LIBRARY)
     add_subdirectory (boost-cmake)
 endif ()
@@ -39,5 +39,20 @@ add_library(base64 ${LINK_MODE}
     ${LIBRARY_DIR}/lib/codecs.h
     ${CMAKE_CURRENT_BINARY_DIR}/config.h)
 
-target_compile_options(base64 PRIVATE ${base64_SSSE3_opt} ${base64_SSE41_opt} ${base64_SSE42_opt} ${base64_AVX_opt} ${base64_AVX2_opt})
+if(HAVE_AVX)
+    set_source_files_properties(${LIBRARY_DIR}/lib/arch/avx/codec.c PROPERTIES COMPILE_FLAGS -mavx)
+endif()
+if(HAVE_AVX2)
+    set_source_files_properties(${LIBRARY_DIR}/lib/arch/avx2/codec.c PROPERTIES COMPILE_FLAGS -mavx2)
+endif()
+if(HAVE_SSE41)
+    set_source_files_properties(${LIBRARY_DIR}/lib/arch/sse41/codec.c PROPERTIES COMPILE_FLAGS -msse4.1)
+endif()
+if(HAVE_SSE42)
+    set_source_files_properties(${LIBRARY_DIR}/lib/arch/sse42/codec.c PROPERTIES COMPILE_FLAGS -msse4.2)
+endif()
+if(HAVE_SSSE3)
+    set_source_files_properties(${LIBRARY_DIR}/lib/arch/ssse3/codec.c PROPERTIES COMPILE_FLAGS -mssse3)
+endif()
 
 target_include_directories(base64 PRIVATE ${LIBRARY_DIR}/include ${CMAKE_CURRENT_BINARY_DIR})
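One plausible reading of the hunk above is that each `codec.c` now gets only the instruction-set flag it needs, so the rest of the library is never compiled with, say, `-mavx2` and stays runnable on older CPUs, while the fastest codec is chosen at run time. The sketch below illustrates such a run-time choice. It is not code from the base64 library: `bestCodecName` is a hypothetical function, and it relies on the GCC/Clang builtins `__builtin_cpu_init` and `__builtin_cpu_supports`, which are only used here when targeting x86.

```cpp
#include <string>

// Hypothetical selector (not part of the base64 library): names the most
// capable codec directory from the hunk above that the current CPU can run.
std::string bestCodecName()
{
#if (defined(__GNUC__) || defined(__clang__)) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();  // initialise the CPU feature data used below
    if (__builtin_cpu_supports("avx2"))
        return "avx2";
    if (__builtin_cpu_supports("avx"))
        return "avx";
    if (__builtin_cpu_supports("sse4.2"))
        return "sse42";
    if (__builtin_cpu_supports("sse4.1"))
        return "sse41";
    if (__builtin_cpu_supports("ssse3"))
        return "ssse3";
#endif
    // Other compilers or architectures: assume only a portable codec is usable.
    return "plain";
}
```

The returned names match the `lib/arch/<name>/codec.c` directories in the hunk; `"plain"` is a placeholder chosen for this sketch.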
1
contrib/pdqsort
vendored
Submodule
@ -0,0 +1 @@
|
||||
Subproject commit 08879029ab8dcb80a70142acb709e3df02de5d37
|
@ -206,6 +206,8 @@ target_link_libraries (clickhouse_common_io
|
||||
${CMAKE_DL_LIBS}
|
||||
)
|
||||
|
||||
target_include_directories(clickhouse_common_io SYSTEM BEFORE PUBLIC ${PDQSORT_INCLUDE_DIR})
|
||||
|
||||
target_include_directories(clickhouse_common_io SYSTEM BEFORE PUBLIC ${RE2_INCLUDE_DIR})
|
||||
|
||||
if(CPUID_LIBRARY)
|
||||
@ -282,6 +284,7 @@ target_link_libraries (dbms PRIVATE ${Poco_Foundation_LIBRARY})
|
||||
|
||||
if (USE_ICU)
|
||||
target_link_libraries (dbms PRIVATE ${ICU_LIBRARIES})
|
||||
target_include_directories (dbms SYSTEM PRIVATE ${ICU_INCLUDE_DIRS})
|
||||
endif ()
|
||||
|
||||
if (USE_CAPNP)
|
||||
|
@ -1,11 +1,11 @@
|
||||
# This strings autochanged from release_lib.sh:
|
||||
set(VERSION_REVISION 54413)
|
||||
set(VERSION_REVISION 54414)
|
||||
set(VERSION_MAJOR 19)
|
||||
set(VERSION_MINOR 1)
|
||||
set(VERSION_PATCH 6)
|
||||
set(VERSION_GITHASH f73b337a93d534671b2187660398b8573fc1d464)
|
||||
set(VERSION_DESCRIBE v19.1.6-testing)
|
||||
set(VERSION_STRING 19.1.6)
|
||||
set(VERSION_MINOR 2)
|
||||
set(VERSION_PATCH 0)
|
||||
set(VERSION_GITHASH dcfca1355468a2d083b33c867effa8f79642ed6e)
|
||||
set(VERSION_DESCRIBE v19.2.0-testing)
|
||||
set(VERSION_STRING 19.2.0)
|
||||
# end of autochange
|
||||
|
||||
set(VERSION_EXTRA "" CACHE STRING "")
|
||||
|
@ -12,6 +12,7 @@
|
||||
#include <unordered_set>
|
||||
#include <algorithm>
|
||||
#include <optional>
|
||||
#include <ext/scope_guard.h>
|
||||
#include <boost/program_options.hpp>
|
||||
#include <boost/algorithm/string/replace.hpp>
|
||||
#include <Poco/String.h>
|
||||
@ -400,6 +401,7 @@ private:
|
||||
throw Exception("time option could be specified only in non-interactive mode", ErrorCodes::BAD_ARGUMENTS);
|
||||
|
||||
#if USE_READLINE
|
||||
SCOPE_EXIT({ Suggest::instance().finalize(); });
|
||||
if (server_revision >= Suggest::MIN_SERVER_REVISION
|
||||
&& !config().getBool("disable_suggestion", false))
|
||||
{
|
||||
@ -1542,12 +1544,19 @@ public:
|
||||
po::options_description main_description("Main options", line_length, min_description_length);
|
||||
main_description.add_options()
|
||||
("help", "produce help message")
|
||||
("config-file,c", po::value<std::string>(), "config-file path")
|
||||
("config-file,C", po::value<std::string>(), "config-file path")
|
||||
("config,c", po::value<std::string>(), "config-file path (another shorthand)")
|
||||
("host,h", po::value<std::string>()->default_value("localhost"), "server host")
|
||||
("port", po::value<int>()->default_value(9000), "server port")
|
||||
("secure,s", "Use TLS connection")
|
||||
("user,u", po::value<std::string>()->default_value("default"), "user")
|
||||
("password", po::value<std::string>(), "password")
|
||||
/** If "--password [value]" is used but the value is omitted, the bad argument exception will be thrown.
|
||||
* implicit_value is used to avoid this exception (to allow user to type just "--password")
|
||||
* Since currently boost provides no way to check if a value has been set implicitly for an option,
|
||||
* the "\n" is used to distinguish this case because there is hardly a chance an user would use "\n"
|
||||
* as the password.
|
||||
*/
|
||||
("password", po::value<std::string>()->implicit_value("\n"), "password")
|
||||
("ask-password", "ask-password")
|
||||
("query_id", po::value<std::string>(), "query_id")
|
||||
("query,q", po::value<std::string>(), "query")
|
||||
@ -1585,13 +1594,11 @@ public:
|
||||
("structure", po::value<std::string>(), "structure")
|
||||
("types", po::value<std::string>(), "types")
|
||||
;
|
||||
|
||||
/// Parse main commandline options.
|
||||
po::parsed_options parsed = po::command_line_parser(
|
||||
common_arguments.size(), common_arguments.data()).options(main_description).run();
|
||||
po::variables_map options;
|
||||
po::store(parsed, options);
|
||||
|
||||
if (options.count("version") || options.count("V"))
|
||||
{
|
||||
showClientVersion();
|
||||
@ -1649,9 +1656,14 @@ public:
|
||||
APPLY_FOR_SETTINGS(EXTRACT_SETTING)
|
||||
#undef EXTRACT_SETTING
|
||||
|
||||
if (options.count("config-file") && options.count("config"))
|
||||
throw Exception("Two or more configuration files referenced in arguments", ErrorCodes::BAD_ARGUMENTS);
|
||||
|
||||
/// Save received data into the internal config.
|
||||
if (options.count("config-file"))
|
||||
config().setString("config-file", options["config-file"].as<std::string>());
|
||||
if (options.count("config"))
|
||||
config().setString("config-file", options["config"].as<std::string>());
|
||||
if (options.count("host") && !options["host"].defaulted())
|
||||
config().setString("host", options["host"].as<std::string>());
|
||||
if (options.count("query_id"))
|
||||
@ -1710,11 +1722,11 @@ public:
|
||||
|
||||
int mainEntryClickHouseClient(int argc, char ** argv)
|
||||
{
|
||||
DB::Client client;
|
||||
|
||||
try
|
||||
{
|
||||
DB::Client client;
|
||||
client.init(argc, argv);
|
||||
return client.run();
|
||||
}
|
||||
catch (const boost::program_options::error & e)
|
||||
{
|
||||
@ -1726,6 +1738,4 @@ int mainEntryClickHouseClient(int argc, char ** argv)
|
||||
std::cerr << DB::getCurrentExceptionMessage(true) << std::endl;
|
||||
return 1;
|
||||
}
|
||||
|
||||
return client.run();
|
||||
}
|
||||
|
@ -8,7 +8,7 @@
|
||||
#include <Common/Exception.h>
|
||||
#include <IO/ConnectionTimeouts.h>
|
||||
|
||||
#include <common/SetTerminalEcho.h>
|
||||
#include <common/setTerminalEcho.h>
|
||||
#include <ext/scope_guard.h>
|
||||
|
||||
#include <Poco/Util/AbstractConfiguration.h>
|
||||
@ -48,27 +48,33 @@ struct ConnectionParameters
|
||||
is_secure ? DBMS_DEFAULT_SECURE_PORT : DBMS_DEFAULT_PORT));
|
||||
|
||||
default_database = config.getString("database", "");
|
||||
user = config.getString("user", "");
|
||||
|
||||
/// changed the default value to "default" to fix the issue when the user in the prompt is blank
|
||||
user = config.getString("user", "default");
|
||||
bool password_prompt = false;
|
||||
if (config.getBool("ask-password", false))
|
||||
{
|
||||
if (config.has("password"))
|
||||
throw Exception("Specified both --password and --ask-password. Remove one of them", ErrorCodes::BAD_ARGUMENTS);
|
||||
|
||||
std::cout << "Password for user " << user << ": ";
|
||||
SetTerminalEcho(false);
|
||||
|
||||
SCOPE_EXIT({
|
||||
SetTerminalEcho(true);
|
||||
});
|
||||
std::getline(std::cin, password);
|
||||
std::cout << std::endl;
|
||||
password_prompt = true;
|
||||
}
|
||||
else
|
||||
{
|
||||
password = config.getString("password", "");
|
||||
/// if the value of --password is omitted, the password will be set implicitly to "\n"
|
||||
if (password == "\n")
|
||||
password_prompt = true;
|
||||
}
|
||||
if (password_prompt)
|
||||
{
|
||||
std::cout << "Password for user (" << user << "): ";
|
||||
setTerminalEcho(false);
|
||||
|
||||
SCOPE_EXIT({
|
||||
setTerminalEcho(true);
|
||||
});
|
||||
std::getline(std::cin, password);
|
||||
std::cout << std::endl;
|
||||
}
|
||||
compression = config.getBool("compression", true)
|
||||
? Protocol::Compression::Enable
|
||||
: Protocol::Compression::Disable;
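The hunk above prompts for a password in two cases: when `--ask-password` is given, or when `--password` is typed without a value, which `implicit_value("\n")` turns into the sentinel `"\n"`. A minimal sketch of that decision alone, without Boost or Poco, might look like the following. `PasswordDecision` and `decidePassword` are hypothetical names used only for illustration, and `std::invalid_argument` stands in for `DB::Exception`.

```cpp
#include <optional>
#include <stdexcept>
#include <string>

// Hypothetical result type, not part of ClickHouse.
struct PasswordDecision
{
    std::string password;
    bool prompt = false;
};

// `configured` models config.has("password") together with config.getString("password", "").
// "\n" is the sentinel stored when the user types a bare --password.
PasswordDecision decidePassword(bool ask_password, const std::optional<std::string> & configured)
{
    PasswordDecision res;
    if (ask_password)
    {
        if (configured)
            throw std::invalid_argument("Specified both --password and --ask-password. Remove one of them");
        res.prompt = true;
    }
    else
    {
        res.password = configured.value_or("");
        // A caller is expected to overwrite the sentinel with what it reads from the terminal.
        if (res.password == "\n")
            res.prompt = true;
    }
    return res;
}
```

The sentinel approach relies on a real password never being exactly `"\n"`, which is the same assumption the comment in the diff states.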
|
||||
|
@ -194,6 +194,12 @@ public:
|
||||
});
|
||||
}
|
||||
|
||||
void finalize()
|
||||
{
|
||||
if (loading_thread.joinable())
|
||||
loading_thread.join();
|
||||
}
|
||||
|
||||
/// A function for readline.
|
||||
static char * generator(const char * text, int state)
|
||||
{
|
||||
@ -211,8 +217,7 @@ public:
|
||||
|
||||
~Suggest()
|
||||
{
|
||||
if (loading_thread.joinable())
|
||||
loading_thread.join();
|
||||
finalize();
|
||||
}
|
||||
};
|
||||
|
||||
|
@ -817,7 +817,7 @@ public:
|
||||
|
||||
try
|
||||
{
|
||||
type->deserializeTextQuoted(*column_dummy, rb, FormatSettings());
|
||||
type->deserializeAsTextQuoted(*column_dummy, rb, FormatSettings());
|
||||
}
|
||||
catch (Exception & e)
|
||||
{
|
||||
@ -1179,7 +1179,7 @@ protected:
|
||||
/// Removes MATERIALIZED and ALIAS columns from create table query
|
||||
static ASTPtr removeAliasColumnsFromCreateQuery(const ASTPtr & query_ast)
|
||||
{
|
||||
const ASTs & column_asts = typeid_cast<ASTCreateQuery &>(*query_ast).columns->children;
|
||||
const ASTs & column_asts = typeid_cast<ASTCreateQuery &>(*query_ast).columns_list->columns->children;
|
||||
auto new_columns = std::make_shared<ASTExpressionList>();
|
||||
|
||||
for (const ASTPtr & column_ast : column_asts)
|
||||
@ -1198,8 +1198,13 @@ protected:
|
||||
|
||||
ASTPtr new_query_ast = query_ast->clone();
|
||||
ASTCreateQuery & new_query = typeid_cast<ASTCreateQuery &>(*new_query_ast);
|
||||
new_query.columns = new_columns.get();
|
||||
new_query.children.at(0) = std::move(new_columns);
|
||||
|
||||
auto new_columns_list = std::make_shared<ASTColumns>();
|
||||
new_columns_list->set(new_columns_list->columns, new_columns);
|
||||
new_columns_list->set(
|
||||
new_columns_list->indices, typeid_cast<ASTCreateQuery &>(*query_ast).columns_list->indices->clone());
|
||||
|
||||
new_query.replace(new_query.columns_list, new_columns_list);
|
||||
|
||||
return new_query_ast;
|
||||
}
|
||||
@ -1217,7 +1222,7 @@ protected:
|
||||
res->table = new_table.second;
|
||||
|
||||
res->children.clear();
|
||||
res->set(res->columns, create.columns->clone());
|
||||
res->set(res->columns_list, create.columns_list->clone());
|
||||
res->set(res->storage, new_storage_ast->clone());
|
||||
|
||||
return res;
|
||||
@ -1877,7 +1882,7 @@ protected:
|
||||
for (size_t i = 0; i < column.column->size(); ++i)
|
||||
{
|
||||
WriteBufferFromOwnString wb;
|
||||
column.type->serializeTextQuoted(*column.column, i, wb, FormatSettings());
|
||||
column.type->serializeAsTextQuoted(*column.column, i, wb, FormatSettings());
|
||||
res.emplace(wb.str());
|
||||
}
|
||||
}
|
||||
|
@ -297,7 +297,7 @@ void LocalServer::processQueries()
|
||||
|
||||
try
|
||||
{
|
||||
executeQuery(read_buf, write_buf, /* allow_into_outfile = */ true, *context, {});
|
||||
executeQuery(read_buf, write_buf, /* allow_into_outfile = */ true, *context, {}, {});
|
||||
}
|
||||
catch (...)
|
||||
{
|
||||
|
@ -25,12 +25,14 @@ PerformanceTest::PerformanceTest(
|
||||
Connection & connection_,
|
||||
InterruptListener & interrupt_listener_,
|
||||
const PerformanceTestInfo & test_info_,
|
||||
Context & context_)
|
||||
Context & context_,
|
||||
const std::vector<size_t> & queries_to_run_)
|
||||
: config(config_)
|
||||
, connection(connection_)
|
||||
, interrupt_listener(interrupt_listener_)
|
||||
, test_info(test_info_)
|
||||
, context(context_)
|
||||
, queries_to_run(queries_to_run_)
|
||||
, log(&Poco::Logger::get("PerformanceTest"))
|
||||
{
|
||||
}
|
||||
@ -128,12 +130,43 @@ UInt64 PerformanceTest::calculateMaxExecTime() const
|
||||
return result;
|
||||
}
|
||||
|
||||
|
||||
void PerformanceTest::prepare() const
|
||||
{
|
||||
for (const auto & query : test_info.create_queries)
|
||||
{
|
||||
LOG_INFO(log, "Executing create query '" << query << "'");
|
||||
connection.sendQuery(query);
|
||||
}
|
||||
|
||||
for (const auto & query : test_info.fill_queries)
|
||||
{
|
||||
LOG_INFO(log, "Executing fill query '" << query << "'");
|
||||
connection.sendQuery(query);
|
||||
}
|
||||
|
||||
}
|
||||
|
||||
void PerformanceTest::finish() const
|
||||
{
|
||||
for (const auto & query : test_info.drop_queries)
|
||||
{
|
||||
LOG_INFO(log, "Executing drop query '" << query << "'");
|
||||
connection.sendQuery(query);
|
||||
}
|
||||
}
|
||||
|
||||
std::vector<TestStats> PerformanceTest::execute()
|
||||
{
|
||||
std::vector<TestStats> statistics_by_run;
|
||||
size_t query_count;
|
||||
if (queries_to_run.empty())
|
||||
query_count = test_info.queries.size();
|
||||
else
|
||||
query_count = queries_to_run.size();
|
||||
size_t total_runs = test_info.times_to_run * test_info.queries.size();
|
||||
statistics_by_run.resize(total_runs);
|
||||
LOG_INFO(log, "Totally will run cases " << total_runs << " times");
|
||||
LOG_INFO(log, "Totally will run cases " << test_info.times_to_run * query_count << " times");
|
||||
UInt64 max_exec_time = calculateMaxExecTime();
|
||||
if (max_exec_time != 0)
|
||||
LOG_INFO(log, "Test will be executed for a maximum of " << max_exec_time / 1000. << " seconds");
|
||||
@ -146,9 +179,13 @@ std::vector<TestStats> PerformanceTest::execute()
|
||||
|
||||
for (size_t query_index = 0; query_index < test_info.queries.size(); ++query_index)
|
||||
{
|
||||
size_t statistic_index = number_of_launch * test_info.queries.size() + query_index;
|
||||
|
||||
queries_with_indexes.push_back({test_info.queries[query_index], statistic_index});
|
||||
if (queries_to_run.empty() || std::find(queries_to_run.begin(), queries_to_run.end(), query_index) != queries_to_run.end())
|
||||
{
|
||||
size_t statistic_index = number_of_launch * test_info.queries.size() + query_index;
|
||||
queries_with_indexes.push_back({test_info.queries[query_index], statistic_index});
|
||||
}
|
||||
else
|
||||
LOG_INFO(log, "Will skip query " << test_info.queries[query_index] << " by index");
|
||||
}
|
||||
|
||||
if (got_SIGINT)
|
||||
|
@ -22,15 +22,19 @@ public:
|
||||
Connection & connection_,
|
||||
InterruptListener & interrupt_listener_,
|
||||
const PerformanceTestInfo & test_info_,
|
||||
Context & context_);
|
||||
Context & context_,
|
||||
const std::vector<size_t> & queries_to_run_);
|
||||
|
||||
bool checkPreconditions() const;
|
||||
void prepare() const;
|
||||
std::vector<TestStats> execute();
|
||||
void finish() const;
|
||||
|
||||
const PerformanceTestInfo & getTestInfo() const
|
||||
{
|
||||
return test_info;
|
||||
}
|
||||
|
||||
bool checkSIGINT() const
|
||||
{
|
||||
return got_SIGINT;
|
||||
@ -51,6 +55,7 @@ private:
|
||||
PerformanceTestInfo test_info;
|
||||
Context & context;
|
||||
|
||||
std::vector<size_t> queries_to_run;
|
||||
Poco::Logger * log;
|
||||
|
||||
bool got_SIGINT = false;
|
||||
|
@ -36,42 +36,6 @@ void extractSettings(
|
||||
}
|
||||
}
|
||||
|
||||
void checkMetricsInput(const Strings & metrics, ExecutionType exec_type)
|
||||
{
|
||||
Strings loop_metrics = {
|
||||
"min_time", "quantiles", "total_time",
|
||||
"queries_per_second", "rows_per_second",
|
||||
"bytes_per_second"};
|
||||
|
||||
Strings non_loop_metrics = {
|
||||
"max_rows_per_second", "max_bytes_per_second",
|
||||
"avg_rows_per_second", "avg_bytes_per_second"};
|
||||
|
||||
if (exec_type == ExecutionType::Loop)
|
||||
{
|
||||
for (const std::string & metric : metrics)
|
||||
{
|
||||
auto non_loop_pos =
|
||||
std::find(non_loop_metrics.begin(), non_loop_metrics.end(), metric);
|
||||
|
||||
if (non_loop_pos != non_loop_metrics.end())
|
||||
throw Exception("Wrong type of metric for loop execution type (" + metric + ")",
|
||||
ErrorCodes::BAD_ARGUMENTS);
|
||||
}
|
||||
}
|
||||
else
|
||||
{
|
||||
for (const std::string & metric : metrics)
|
||||
{
|
||||
auto loop_pos = std::find(loop_metrics.begin(), loop_metrics.end(), metric);
|
||||
if (loop_pos != loop_metrics.end())
|
||||
throw Exception(
|
||||
"Wrong type of metric for non-loop execution type (" + metric + ")",
|
||||
ErrorCodes::BAD_ARGUMENTS);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
}
|
||||
|
||||
|
||||
@ -84,12 +48,20 @@ PerformanceTestInfo::PerformanceTestInfo(
|
||||
{
|
||||
test_name = config->getString("name");
|
||||
path = config->getString("path");
|
||||
if (config->has("main_metric"))
|
||||
{
|
||||
Strings main_metrics;
|
||||
config->keys("main_metric", main_metrics);
|
||||
if (main_metrics.size())
|
||||
main_metric = main_metrics[0];
|
||||
}
|
||||
|
||||
applySettings(config);
|
||||
extractQueries(config);
|
||||
processSubstitutions(config);
|
||||
getExecutionType(config);
|
||||
getStopConditions(config);
|
||||
getMetrics(config);
|
||||
extractAuxiliaryQueries(config);
|
||||
}
|
||||
|
||||
void PerformanceTestInfo::applySettings(XMLConfigurationPtr config)
|
||||
@ -238,35 +210,16 @@ void PerformanceTestInfo::getStopConditions(XMLConfigurationPtr config)
|
||||
|
||||
}
|
||||
|
||||
|
||||
void PerformanceTestInfo::getMetrics(XMLConfigurationPtr config)
|
||||
void PerformanceTestInfo::extractAuxiliaryQueries(XMLConfigurationPtr config)
|
||||
{
|
||||
ConfigurationPtr metrics_view(config->createView("metrics"));
|
||||
metrics_view->keys(metrics);
|
||||
if (config->has("create_query"))
|
||||
create_queries = getMultipleValuesFromConfig(*config, "", "create_query");
|
||||
|
||||
if (config->has("main_metric"))
|
||||
{
|
||||
Strings main_metrics;
|
||||
config->keys("main_metric", main_metrics);
|
||||
if (main_metrics.size())
|
||||
main_metric = main_metrics[0];
|
||||
}
|
||||
if (config->has("fill_query"))
|
||||
fill_queries = getMultipleValuesFromConfig(*config, "", "fill_query");
|
||||
|
||||
if (!main_metric.empty())
|
||||
{
|
||||
if (std::find(metrics.begin(), metrics.end(), main_metric) == metrics.end())
|
||||
metrics.push_back(main_metric);
|
||||
}
|
||||
else
|
||||
{
|
||||
if (metrics.empty())
|
||||
throw Exception("You shoud specify at least one metric",
|
||||
ErrorCodes::BAD_ARGUMENTS);
|
||||
main_metric = metrics[0];
|
||||
}
|
||||
|
||||
if (metrics.size() > 0)
|
||||
checkMetricsInput(metrics, exec_type);
|
||||
if (config->has("drop_query"))
|
||||
drop_queries = getMultipleValuesFromConfig(*config, "", "drop_query");
|
||||
}
|
||||
|
||||
}
|
||||
|
@ -33,7 +33,6 @@ public:
|
||||
std::string main_metric;
|
||||
|
||||
Strings queries;
|
||||
Strings metrics;
|
||||
|
||||
Settings settings;
|
||||
ExecutionType exec_type;
|
||||
@ -43,6 +42,10 @@ public:
|
||||
std::string profiles_file;
|
||||
std::vector<TestStopConditions> stop_conditions_by_run;
|
||||
|
||||
Strings create_queries;
|
||||
Strings fill_queries;
|
||||
Strings drop_queries;
|
||||
|
||||
private:
|
||||
void applySettings(XMLConfigurationPtr config);
|
||||
void extractQueries(XMLConfigurationPtr config);
|
||||
@ -50,6 +53,7 @@ private:
|
||||
void getExecutionType(XMLConfigurationPtr config);
|
||||
void getStopConditions(XMLConfigurationPtr config);
|
||||
void getMetrics(XMLConfigurationPtr config);
|
||||
void extractAuxiliaryQueries(XMLConfigurationPtr config);
|
||||
};
|
||||
|
||||
}
|
||||
|
@ -11,12 +11,13 @@
|
||||
#include <boost/filesystem.hpp>
|
||||
#include <boost/program_options.hpp>
|
||||
|
||||
#include <Poco/Util/XMLConfiguration.h>
|
||||
#include <Poco/Logger.h>
|
||||
#include <Poco/AutoPtr.h>
|
||||
#include <Poco/ConsoleChannel.h>
|
||||
#include <Poco/FormattingChannel.h>
|
||||
#include <Poco/Logger.h>
|
||||
#include <Poco/Path.h>
|
||||
#include <Poco/PatternFormatter.h>
|
||||
|
||||
#include <Poco/Util/XMLConfiguration.h>
|
||||
|
||||
#include <common/logger_useful.h>
|
||||
#include <Client/Connection.h>
|
||||
@ -25,7 +26,6 @@
|
||||
#include <IO/ConnectionTimeouts.h>
|
||||
#include <IO/UseSSL.h>
|
||||
#include <Interpreters/Settings.h>
|
||||
#include <Poco/AutoPtr.h>
|
||||
#include <Common/Exception.h>
|
||||
#include <Common/InterruptListener.h>
|
||||
|
||||
@ -70,6 +70,7 @@ public:
|
||||
Strings && skip_names_,
|
||||
Strings && tests_names_regexp_,
|
||||
Strings && skip_names_regexp_,
|
||||
const std::unordered_map<std::string, std::vector<size_t>> query_indexes_,
|
||||
const ConnectionTimeouts & timeouts)
|
||||
: connection(host_, port_, default_database_, user_,
|
||||
password_, timeouts, "performance-test", Protocol::Compression::Enable,
|
||||
@ -80,6 +81,7 @@ public:
|
||||
, skip_tags(std::move(skip_tags_))
|
||||
, skip_names(std::move(skip_names_))
|
||||
, skip_names_regexp(std::move(skip_names_regexp_))
|
||||
, query_indexes(query_indexes_)
|
||||
, lite_output(lite_output_)
|
||||
, profiles_file(profiles_file_)
|
||||
, input_files(input_files_)
|
||||
@ -128,6 +130,7 @@ private:
|
||||
const Strings & skip_tags;
|
||||
const Strings & skip_names;
|
||||
const Strings & skip_names_regexp;
|
||||
std::unordered_map<std::string, std::vector<size_t>> query_indexes;
|
||||
|
||||
Context global_context = Context::createGlobal();
|
||||
std::shared_ptr<ReportBuilder> report_builder;
|
||||
@ -198,19 +201,26 @@ private:
|
||||
{
|
||||
PerformanceTestInfo info(test_config, profiles_file);
|
||||
LOG_INFO(log, "Config for test '" << info.test_name << "' parsed");
|
||||
PerformanceTest current(test_config, connection, interrupt_listener, info, global_context);
|
||||
PerformanceTest current(test_config, connection, interrupt_listener, info, global_context, query_indexes[info.path]);
|
||||
|
||||
current.checkPreconditions();
|
||||
LOG_INFO(log, "Preconditions for test '" << info.test_name << "' are fullfilled");
|
||||
|
||||
LOG_INFO(log, "Preparing for run, have " << info.create_queries.size()
|
||||
<< " create queries and " << info.fill_queries.size() << " fill queries");
|
||||
current.prepare();
|
||||
LOG_INFO(log, "Prepared");
|
||||
LOG_INFO(log, "Running test '" << info.test_name << "'");
|
||||
auto result = current.execute();
|
||||
LOG_INFO(log, "Test '" << info.test_name << "' finished");
|
||||
|
||||
LOG_INFO(log, "Running post run queries");
|
||||
current.finish();
|
||||
LOG_INFO(log, "Postqueries finished");
|
||||
|
||||
if (lite_output)
|
||||
return {report_builder->buildCompactReport(info, result), current.checkSIGINT()};
|
||||
return {report_builder->buildCompactReport(info, result, query_indexes[info.path]), current.checkSIGINT()};
|
||||
else
|
||||
return {report_builder->buildFullReport(info, result), current.checkSIGINT()};
|
||||
return {report_builder->buildFullReport(info, result, query_indexes[info.path]), current.checkSIGINT()};
|
||||
}
|
||||
|
||||
};
|
||||
@ -282,6 +292,29 @@ static std::vector<std::string> getInputFiles(const po::variables_map & options,
|
||||
return input_files;
|
||||
}
|
||||
|
||||
std::unordered_map<std::string, std::vector<std::size_t>> getTestQueryIndexes(const po::basic_parsed_options<char> & parsed_opts)
|
||||
{
|
||||
std::unordered_map<std::string, std::vector<std::size_t>> result;
|
||||
const auto & options = parsed_opts.options;
|
||||
for (size_t i = 0; i < options.size() - 1; ++i)
|
||||
{
|
||||
const auto & opt = options[i];
|
||||
if (opt.string_key == "input-files")
|
||||
{
|
||||
if (options[i + 1].string_key == "query-indexes")
|
||||
{
|
||||
const std::string & test_path = Poco::Path(opt.value[0]).absolute().toString();
|
||||
for (const auto & query_num_str : options[i + 1].value)
|
||||
{
|
||||
size_t query_num = std::stoul(query_num_str);
|
||||
result[test_path].push_back(query_num);
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
return result;
|
||||
}
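`getTestQueryIndexes` above pairs each `input-files` option with a `query-indexes` option that immediately follows it. The sketch below shows the same pairing on a plain vector, assuming a hypothetical `ParsedOption` struct in place of Boost's parsed option type; `pairQueryIndexes` is likewise an illustrative name. It differs from the diff in two labelled ways: the path is kept verbatim instead of being made absolute, and the loop bound is written as `i + 1 < size()` because `size() - 1` on an unsigned type wraps around when the list is empty.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for one parsed option: its key plus its tokens.
struct ParsedOption
{
    std::string string_key;
    std::vector<std::string> value;
};

std::unordered_map<std::string, std::vector<std::size_t>> pairQueryIndexes(const std::vector<ParsedOption> & options)
{
    std::unordered_map<std::string, std::vector<std::size_t>> result;
    // i + 1 < size() stays correct for an empty list, unlike i < size() - 1.
    for (std::size_t i = 0; i + 1 < options.size(); ++i)
    {
        if (options[i].string_key != "input-files" || options[i].value.empty())
            continue;
        if (options[i + 1].string_key != "query-indexes")
            continue;

        // The diff normalises this path with Poco::Path(...).absolute(); the sketch keeps it as given.
        const std::string & test_path = options[i].value[0];
        for (const auto & query_num_str : options[i + 1].value)
            result[test_path].push_back(std::stoul(query_num_str));
    }
    return result;
}
```

Only the first file of an `input-files` group receives the indexes, which matches the `opt.value[0]` lookup in the diff.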
|
||||
|
||||
int mainEntryClickHousePerformanceTest(int argc, char ** argv)
|
||||
try
|
||||
{
|
||||
@ -307,24 +340,18 @@ try
|
||||
("skip-names", value<Strings>()->multitoken(), "Do not run tests with name")
|
||||
("names-regexp", value<Strings>()->multitoken(), "Run tests with names matching regexp")
|
||||
("skip-names-regexp", value<Strings>()->multitoken(), "Do not run tests with names matching regexp")
|
||||
("input-files", value<Strings>()->multitoken(), "Input .xml files")
|
||||
("query-indexes", value<std::vector<size_t>>()->multitoken(), "Input query indexes")
|
||||
("recursive,r", "Recurse in directories to find all xml's");
|
||||
|
||||
/// These options will not be displayed in --help
|
||||
po::options_description hidden("Hidden options");
|
||||
hidden.add_options()
|
||||
("input-files", value<std::vector<std::string>>(), "");
|
||||
|
||||
/// But they will be legit, though. And they must be given without name
|
||||
po::positional_options_description positional;
|
||||
positional.add("input-files", -1);
|
||||
|
||||
po::options_description cmdline_options;
|
||||
cmdline_options.add(desc).add(hidden);
|
||||
cmdline_options.add(desc);
|
||||
|
||||
po::variables_map options;
|
||||
po::store(
|
||||
po::command_line_parser(argc, argv).
|
||||
options(cmdline_options).positional(positional).run(), options);
|
||||
po::basic_parsed_options<char> parsed = po::command_line_parser(argc, argv).options(cmdline_options).run();
|
||||
auto queries_with_indexes = getTestQueryIndexes(parsed);
|
||||
po::store(parsed, options);
|
||||
|
||||
po::notify(options);
|
||||
|
||||
Poco::AutoPtr<Poco::PatternFormatter> formatter(new Poco::PatternFormatter("%Y.%m.%d %H:%M:%S.%F <%p> %s: %t"));
|
||||
@ -371,6 +398,7 @@ try
|
||||
std::move(skip_names),
|
||||
std::move(tests_names_regexp),
|
||||
std::move(skip_names_regexp),
|
||||
queries_with_indexes,
|
||||
timeouts);
|
||||
return performance_test_suite.run();
|
||||
}
|
||||
|
@ -17,6 +17,18 @@ namespace DB
|
||||
namespace
|
||||
{
|
||||
const std::regex QUOTE_REGEX{"\""};
|
||||
std::string getMainMetric(const PerformanceTestInfo & test_info)
|
||||
{
|
||||
std::string main_metric;
|
||||
if (test_info.main_metric.empty())
|
||||
if (test_info.exec_type == ExecutionType::Loop)
|
||||
main_metric = "min_time";
|
||||
else
|
||||
main_metric = "rows_per_second";
|
||||
else
|
||||
main_metric = test_info.main_metric;
|
||||
return main_metric;
|
||||
}
|
||||
}
|
||||
|
||||
ReportBuilder::ReportBuilder(const std::string & server_version_)
|
||||
@ -35,7 +47,8 @@ std::string ReportBuilder::getCurrentTime() const
|
||||
|
||||
std::string ReportBuilder::buildFullReport(
|
||||
const PerformanceTestInfo & test_info,
|
||||
std::vector<TestStats> & stats) const
|
||||
std::vector<TestStats> & stats,
|
||||
const std::vector<std::size_t> & queries_to_run) const
|
||||
{
|
||||
JSONString json_output;
|
||||
|
||||
@ -47,13 +60,7 @@ std::string ReportBuilder::buildFullReport(
|
||||
json_output.set("time", getCurrentTime());
|
||||
json_output.set("test_name", test_info.test_name);
|
||||
json_output.set("path", test_info.path);
|
||||
json_output.set("main_metric", test_info.main_metric);
|
||||
|
||||
auto has_metric = [&test_info] (const std::string & metric_name)
|
||||
{
|
||||
return std::find(test_info.metrics.begin(),
|
||||
test_info.metrics.end(), metric_name) != test_info.metrics.end();
|
||||
};
|
||||
json_output.set("main_metric", getMainMetric(test_info));
|
||||
|
||||
if (test_info.substitutions.size())
|
||||
{
|
||||
@ -85,6 +92,9 @@ std::string ReportBuilder::buildFullReport(
|
||||
std::vector<JSONString> run_infos;
|
||||
for (size_t query_index = 0; query_index < test_info.queries.size(); ++query_index)
|
||||
{
|
||||
if (!queries_to_run.empty() && std::find(queries_to_run.begin(), queries_to_run.end(), query_index) == queries_to_run.end())
|
||||
continue;
|
||||
|
||||
for (size_t number_of_launch = 0; number_of_launch < test_info.times_to_run; ++number_of_launch)
|
||||
{
|
||||
size_t stat_index = number_of_launch * test_info.queries.size() + query_index;
|
||||
@ -97,16 +107,16 @@ std::string ReportBuilder::buildFullReport(
|
||||
|
||||
auto query = std::regex_replace(test_info.queries[query_index], QUOTE_REGEX, "\\\"");
|
||||
runJSON.set("query", query);
|
||||
runJSON.set("query_index", query_index);
|
||||
if (!statistics.exception.empty())
|
||||
runJSON.set("exception", statistics.exception);
|
||||
|
||||
if (test_info.exec_type == ExecutionType::Loop)
|
||||
{
|
||||
/// in seconds
|
||||
if (has_metric("min_time"))
|
||||
runJSON.set("min_time", statistics.min_time / double(1000));
|
||||
runJSON.set("min_time", statistics.min_time / double(1000));
|
||||
|
||||
if (has_metric("quantiles"))
|
||||
if (statistics.sampler.size() != 0)
|
||||
{
|
||||
JSONString quantiles(4); /// here, 4 is the size of \t padding
|
||||
for (double percent = 10; percent <= 90; percent += 10)
|
||||
@ -130,34 +140,21 @@ std::string ReportBuilder::buildFullReport(
|
||||
runJSON.set("quantiles", quantiles.asString());
|
||||
}
|
||||
|
||||
if (has_metric("total_time"))
|
||||
runJSON.set("total_time", statistics.total_time);
|
||||
runJSON.set("total_time", statistics.total_time);
|
||||
|
||||
if (has_metric("queries_per_second"))
|
||||
runJSON.set("queries_per_second",
|
||||
double(statistics.queries) / statistics.total_time);
|
||||
|
||||
if (has_metric("rows_per_second"))
|
||||
runJSON.set("rows_per_second",
|
||||
double(statistics.total_rows_read) / statistics.total_time);
|
||||
|
||||
if (has_metric("bytes_per_second"))
|
||||
runJSON.set("bytes_per_second",
|
||||
double(statistics.total_bytes_read) / statistics.total_time);
|
||||
if (statistics.total_time != 0)
|
||||
{
|
||||
runJSON.set("queries_per_second", static_cast<double>(statistics.queries) / statistics.total_time);
|
||||
runJSON.set("rows_per_second", static_cast<double>(statistics.total_rows_read) / statistics.total_time);
|
||||
runJSON.set("bytes_per_second", static_cast<double>(statistics.total_bytes_read) / statistics.total_time);
|
||||
}
|
||||
}
|
||||
else
|
||||
{
|
||||
if (has_metric("max_rows_per_second"))
|
||||
runJSON.set("max_rows_per_second", statistics.max_rows_speed);
|
||||
|
||||
if (has_metric("max_bytes_per_second"))
|
||||
runJSON.set("max_bytes_per_second", statistics.max_bytes_speed);
|
||||
|
||||
if (has_metric("avg_rows_per_second"))
|
||||
runJSON.set("avg_rows_per_second", statistics.avg_rows_speed_value);
|
||||
|
||||
if (has_metric("avg_bytes_per_second"))
|
||||
runJSON.set("avg_bytes_per_second", statistics.avg_bytes_speed_value);
|
||||
runJSON.set("max_rows_per_second", statistics.max_rows_speed);
|
||||
runJSON.set("max_bytes_per_second", statistics.max_bytes_speed);
|
||||
runJSON.set("avg_rows_per_second", statistics.avg_rows_speed_value);
|
||||
runJSON.set("avg_bytes_per_second", statistics.avg_bytes_speed_value);
|
||||
}
|
||||
|
||||
run_infos.push_back(runJSON);
|
||||
@ -171,26 +168,32 @@ std::string ReportBuilder::buildFullReport(
|
||||
|
||||
std::string ReportBuilder::buildCompactReport(
|
||||
const PerformanceTestInfo & test_info,
|
||||
std::vector<TestStats> & stats) const
|
||||
std::vector<TestStats> & stats,
|
||||
const std::vector<std::size_t> & queries_to_run) const
|
||||
{
|
||||
|
||||
std::ostringstream output;
|
||||
|
||||
for (size_t query_index = 0; query_index < test_info.queries.size(); ++query_index)
|
||||
{
|
||||
if (!queries_to_run.empty() && std::find(queries_to_run.begin(), queries_to_run.end(), query_index) == queries_to_run.end())
|
||||
continue;
|
||||
|
||||
for (size_t number_of_launch = 0; number_of_launch < test_info.times_to_run; ++number_of_launch)
|
||||
{
|
||||
if (test_info.queries.size() > 1)
|
||||
output << "query \"" << test_info.queries[query_index] << "\", ";
|
||||
|
||||
output << "run " << std::to_string(number_of_launch + 1) << ": ";
|
||||
output << test_info.main_metric << " = ";
|
||||
|
||||
std::string main_metric = getMainMetric(test_info);
|
||||
|
||||
output << main_metric << " = ";
|
||||
size_t index = number_of_launch * test_info.queries.size() + query_index;
|
||||
output << stats[index].getStatisticByName(test_info.main_metric);
|
||||
output << stats[index].getStatisticByName(main_metric);
|
||||
output << "\n";
|
||||
}
|
||||
}
|
||||
return output.str();
|
||||
}
|
||||
|
||||
}
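Both report builders above skip queries that are not listed in `queries_to_run` and address statistics as `number_of_launch * queries.size() + query_index`. A small sketch of that traversal, assuming a hypothetical helper `visitedStatIndexes` that is not part of the commit, makes the visiting order and the effect of the filter explicit:

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: lists the statistics slots the report loops would visit.
std::vector<std::size_t> visitedStatIndexes(
    std::size_t query_count, std::size_t times_to_run, const std::vector<std::size_t> & queries_to_run)
{
    std::vector<std::size_t> res;
    for (std::size_t query_index = 0; query_index < query_count; ++query_index)
    {
        // An empty filter means "run everything", as in the diff.
        if (!queries_to_run.empty()
            && std::find(queries_to_run.begin(), queries_to_run.end(), query_index) == queries_to_run.end())
            continue;

        // Slots are laid out launch-major, so one query's runs are query_count apart.
        for (std::size_t launch = 0; launch < times_to_run; ++launch)
            res.push_back(launch * query_count + query_index);
    }
    return res;
}
```

Because the index still multiplies by the full query count, the statistics vector keeps one slot per query and launch even when some queries are skipped; the skipped slots are simply never read.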
|
||||
|
@ -9,14 +9,18 @@ namespace DB
|
||||
class ReportBuilder
|
||||
{
|
||||
public:
|
||||
explicit ReportBuilder(const std::string & server_version_);
|
||||
ReportBuilder(const std::string & server_version_);
|
||||
std::string buildFullReport(
|
||||
const PerformanceTestInfo & test_info,
|
||||
std::vector<TestStats> & stats) const;
|
||||
std::vector<TestStats> & stats,
|
||||
const std::vector<std::size_t> & queries_to_run) const;
|
||||
|
||||
|
||||
std::string buildCompactReport(
|
||||
const PerformanceTestInfo & test_info,
|
||||
std::vector<TestStats> & stats) const;
|
||||
std::vector<TestStats> & stats,
|
||||
const std::vector<std::size_t> & queries_to_run) const;
|
||||
|
||||
private:
|
||||
std::string server_version;
|
||||
std::string hostname;
|
||||
|
@ -4,6 +4,7 @@
|
||||
#include <Poco/File.h>
|
||||
#include <Poco/Net/HTTPBasicCredentials.h>
|
||||
#include <Poco/Net/HTTPServerRequest.h>
|
||||
#include <Poco/Net/HTTPServerRequestImpl.h>
|
||||
#include <Poco/Net/HTTPServerResponse.h>
|
||||
#include <Poco/Net/NetException.h>
|
||||
|
||||
@ -558,12 +559,51 @@ void HTTPHandler::processQuery(
|
||||
client_info.http_method = http_method;
|
||||
client_info.http_user_agent = request.get("User-Agent", "");
|
||||
|
||||
auto appendCallback = [&context] (ProgressCallback callback)
|
||||
{
|
||||
auto prev = context.getProgressCallback();
|
||||
|
||||
context.setProgressCallback([prev, callback] (const Progress & progress)
|
||||
{
|
||||
if (prev)
|
||||
prev(progress);
|
||||
|
||||
callback(progress);
|
||||
});
|
||||
};
|
||||
|
||||
/// While still no data has been sent, we will report about query execution progress by sending HTTP headers.
|
||||
if (settings.send_progress_in_http_headers)
|
||||
context.setProgressCallback([&used_output] (const Progress & progress) { used_output.out->onProgress(progress); });
|
||||
appendCallback([&used_output] (const Progress & progress) { used_output.out->onProgress(progress); });
|
||||
|
||||
if (settings.readonly > 0 && settings.cancel_http_readonly_queries_on_client_close)
|
||||
{
|
||||
Poco::Net::StreamSocket & socket = dynamic_cast<Poco::Net::HTTPServerRequestImpl &>(request).socket();
|
||||
|
||||
appendCallback([&context, &socket](const Progress &)
|
||||
{
|
||||
/// Assume that at the point this method is called no one is reading data from the socket any more.
|
||||
/// True for read-only queries.
|
||||
try
|
||||
{
|
||||
char b;
|
||||
int status = socket.receiveBytes(&b, 1, MSG_DONTWAIT | MSG_PEEK);
|
||||
if (status == 0)
|
||||
context.killCurrentQuery();
|
||||
}
|
||||
catch (Poco::TimeoutException &)
|
||||
{
|
||||
}
|
||||
catch (...)
|
||||
{
|
||||
context.killCurrentQuery();
|
||||
}
|
||||
});
|
||||
}
|
||||
|
||||
executeQuery(*in, *used_output.out_maybe_delayed_and_compressed, /* allow_into_outfile = */ false, context,
|
||||
[&response] (const String & content_type) { response.setContentType(content_type); });
|
||||
[&response] (const String & content_type) { response.setContentType(content_type); },
|
||||
[&response] (const String & current_query_id) { response.add("Query-Id", current_query_id); });
|
||||
|
||||
if (used_output.hasDelayed())
|
||||
{
|
||||
|
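The `appendCallback` lambda in the hunk above chains a new progress callback onto whatever callback is already installed, instead of overwriting it. A minimal standalone sketch of that chaining pattern follows. `Progress`, `ProgressCallback` and `CallbackHolder` here are simplified stand-ins invented for illustration, not the real ClickHouse `Context` API.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the types used in the hunk above.
struct Progress
{
    std::size_t rows = 0;
};

using ProgressCallback = std::function<void(const Progress &)>;

// Hypothetical holder that plays the role of the query context.
class CallbackHolder
{
public:
    ProgressCallback get() const { return callback; }
    void set(ProgressCallback cb) { callback = std::move(cb); }

    // Install `next` so that it runs after the previously installed callback.
    void append(ProgressCallback next)
    {
        assert(static_cast<bool>(next));
        auto prev = get();
        set([prev, next](const Progress & progress)
        {
            if (prev)
                prev(progress);
            next(progress);
        });
    }

    // Invoke the whole chain, if any callback is installed.
    void report(const Progress & progress) const
    {
        if (callback)
            callback(progress);
    }

private:
    ProgressCallback callback;
};
```

Each call to `append` captures the previous chain by value, so callbacks run in the order they were appended and an earlier callback is never lost when a later one is added.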
@ -11,6 +11,7 @@
|
||||
#include <Poco/DirectoryIterator.h>
|
||||
#include <Poco/Net/HTTPServer.h>
|
||||
#include <Poco/Net/NetException.h>
|
||||
#include <Poco/Util/HelpFormatter.h>
|
||||
#include <ext/scope_guard.h>
|
||||
#include <common/logger_useful.h>
|
||||
#include <common/ErrorHandlers.h>
|
||||
@ -47,6 +48,7 @@
|
||||
#include "MetricsTransmitter.h"
|
||||
#include <Common/StatusFile.h>
|
||||
#include "TCPHandlerFactory.h"
|
||||
#include "Common/config_version.h"
|
||||
|
||||
#if defined(__linux__)
|
||||
#include <Common/hasLinuxCapability.h>
|
||||
@ -116,6 +118,26 @@ void Server::uninitialize()
|
||||
BaseDaemon::uninitialize();
|
||||
}
|
||||
|
||||
int Server::run()
|
||||
{
|
||||
if (config().hasOption("help"))
|
||||
{
|
||||
Poco::Util::HelpFormatter helpFormatter(Server::options());
|
||||
std::stringstream header;
|
||||
header << commandName() << " [OPTION] [-- [ARG]...]\n";
|
||||
header << "positional arguments can be used to rewrite config.xml properties, for example, --http_port=8010";
|
||||
helpFormatter.setHeader(header.str());
|
||||
helpFormatter.format(std::cout);
|
||||
return 0;
|
||||
}
|
||||
if (config().hasOption("version"))
|
||||
{
|
||||
std::cout << DBMS_NAME << " server version " << VERSION_STRING << "." << std::endl;
|
||||
return 0;
|
||||
}
|
||||
return Application::run();
|
||||
}
|
||||
|
||||
void Server::initialize(Poco::Util::Application & self)
|
||||
{
|
||||
BaseDaemon::initialize(self);
|
||||
@ -127,6 +149,21 @@ std::string Server::getDefaultCorePath() const
|
||||
return getCanonicalPath(config().getString("path", DBMS_DEFAULT_PATH)) + "cores";
|
||||
}
|
||||
|
||||
void Server::defineOptions(Poco::Util::OptionSet & _options)
|
||||
{
|
||||
_options.addOption(
|
||||
Poco::Util::Option("help", "h", "show help and exit")
|
||||
.required(false)
|
||||
.repeatable(false)
|
||||
.binding("help"));
|
||||
_options.addOption(
|
||||
Poco::Util::Option("version", "V", "show version and exit")
|
||||
.required(false)
|
||||
.repeatable(false)
|
||||
.binding("version"));
|
||||
BaseDaemon::defineOptions(_options);
|
||||
}
|
||||
|
||||
int Server::main(const std::vector<std::string> & /*args*/)
|
||||
{
|
||||
Logger * log = &logger();
|
||||
@ -398,19 +435,37 @@ int Server::main(const std::vector<std::string> & /*args*/)
|
||||
if (config().has("max_partition_size_to_drop"))
|
||||
global_context->setMaxPartitionSizeToDrop(config().getUInt64("max_partition_size_to_drop"));
|
||||
|
||||
/// Set up caches.
|
||||
|
||||
/// Lower cache size on low-memory systems.
|
||||
double cache_size_to_ram_max_ratio = config().getDouble("cache_size_to_ram_max_ratio", 0.5);
|
||||
size_t max_cache_size = memory_amount * cache_size_to_ram_max_ratio;
|
||||
|
||||
/// Size of cache for uncompressed blocks. Zero means disabled.
|
||||
size_t uncompressed_cache_size = config().getUInt64("uncompressed_cache_size", 0);
|
||||
if (uncompressed_cache_size)
|
||||
global_context->setUncompressedCache(uncompressed_cache_size);
|
||||
if (uncompressed_cache_size > max_cache_size)
|
||||
{
|
||||
uncompressed_cache_size = max_cache_size;
|
||||
LOG_INFO(log, "Uncompressed cache size was lowered to " << formatReadableSizeWithBinarySuffix(uncompressed_cache_size)
|
||||
<< " because the system has low amount of memory");
|
||||
}
|
||||
global_context->setUncompressedCache(uncompressed_cache_size);
|
||||
|
||||
/// Load global settings from default_profile and system_profile.
|
||||
global_context->setDefaultProfiles(config());
|
||||
Settings & settings = global_context->getSettingsRef();
|
||||
|
||||
/// Size of cache for marks (index of MergeTree family of tables). It is necessary.
|
||||
/// Size of cache for marks (index of MergeTree family of tables). It is mandatory.
|
||||
size_t mark_cache_size = config().getUInt64("mark_cache_size");
|
||||
if (mark_cache_size)
|
||||
global_context->setMarkCache(mark_cache_size);
|
||||
if (!mark_cache_size)
|
||||
LOG_ERROR(log, "Too low mark cache size will lead to severe performance degradation.");
|
||||
if (mark_cache_size > max_cache_size)
|
||||
{
|
||||
mark_cache_size = max_cache_size;
|
||||
LOG_INFO(log, "Mark cache size was lowered to " << formatReadableSizeWithBinarySuffix(mark_cache_size)
|
||||
<< " because the system has low amount of memory");
|
||||
}
|
||||
global_context->setMarkCache(mark_cache_size);
|
||||
|
||||
#if USE_EMBEDDED_COMPILER
|
||||
size_t compiled_expression_cache_size = config().getUInt64("compiled_expression_cache_size", 500);
|
||||
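The hunk above caps both the uncompressed cache and the mark cache at a fraction of available RAM (`cache_size_to_ram_max_ratio`, default 0.5). The clamping arithmetic can be sketched in isolation as below. The helper names `maxCacheSize` and `clampCacheSize` are hypothetical and do not appear in the commit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: upper limit for any single cache, as a fraction of RAM.
std::size_t maxCacheSize(std::uint64_t memory_amount, double ratio)
{
    assert(ratio >= 0.0);
    return static_cast<std::size_t>(memory_amount * ratio);
}

// Hypothetical helper: lower the requested size to the limit if it exceeds it.
std::size_t clampCacheSize(std::size_t requested, std::size_t limit)
{
    return requested > limit ? limit : requested;
}
```

For example, with 1000 bytes of RAM and a ratio of 0.5 the limit is 500 bytes, so a requested cache of 800 bytes is lowered to 500 while a request of 100 bytes is left unchanged.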
@ -697,10 +752,10 @@ int Server::main(const std::vector<std::string> & /*args*/)
|
||||
|
||||
{
|
||||
std::stringstream message;
|
||||
message << "Available RAM = " << formatReadableSizeWithBinarySuffix(memory_amount) << ";"
|
||||
<< " physical cores = " << getNumberOfPhysicalCPUCores() << ";"
|
||||
message << "Available RAM: " << formatReadableSizeWithBinarySuffix(memory_amount) << ";"
|
||||
<< " physical cores: " << getNumberOfPhysicalCPUCores() << ";"
|
||||
// On ARM processors this may report only the cores that are currently enabled.
|
||||
<< " threads = " << std::thread::hardware_concurrency() << ".";
|
||||
<< " logical cores: " << std::thread::hardware_concurrency() << ".";
|
||||
LOG_INFO(log, message.str());
|
||||
}
|
||||
|
||||
|
@ -21,6 +21,8 @@ namespace DB
|
||||
class Server : public BaseDaemon, public IServer
|
||||
{
|
||||
public:
|
||||
using ServerApplication::run;
|
||||
|
||||
Poco::Util::LayeredConfiguration & config() const override
|
||||
{
|
||||
return BaseDaemon::config();
|
||||
@ -41,7 +43,10 @@ public:
|
||||
return BaseDaemon::isCancelled();
|
||||
}
|
||||
|
||||
void defineOptions(Poco::Util::OptionSet & _options) override;
|
||||
protected:
|
||||
int run() override;
|
||||
|
||||
void initialize(Application & self) override;
|
||||
|
||||
void uninitialize() override;
|
||||
|
44
dbms/src/AggregateFunctions/AggregateFunctionEntropy.cpp
Normal file
@ -0,0 +1,44 @@
|
||||
#include <AggregateFunctions/AggregateFunctionFactory.h>
|
||||
#include <AggregateFunctions/AggregateFunctionEntropy.h>
|
||||
#include <AggregateFunctions/FactoryHelpers.h>
|
||||
#include <AggregateFunctions/Helpers.h>
|
||||
|
||||
|
||||
namespace DB
|
||||
{
|
||||
|
||||
namespace ErrorCodes
|
||||
{
|
||||
extern const int NUMBER_OF_ARGUMENTS_DOESNT_MATCH;
|
||||
}
|
||||
|
||||
namespace
|
||||
{
|
||||
|
||||
AggregateFunctionPtr createAggregateFunctionEntropy(const std::string & name, const DataTypes & argument_types, const Array & parameters)
|
||||
{
|
||||
assertNoParameters(name, parameters);
|
||||
if (argument_types.empty())
|
||||
throw Exception("Incorrect number of arguments for aggregate function " + name,
|
||||
ErrorCodes::NUMBER_OF_ARGUMENTS_DOESNT_MATCH);
|
||||
|
||||
size_t num_args = argument_types.size();
|
||||
if (num_args == 1)
|
||||
{
|
||||
/// Specialized implementation for single argument of numeric type.
|
||||
if (auto res = createWithNumericBasedType<AggregateFunctionEntropy>(*argument_types[0], num_args))
|
||||
return AggregateFunctionPtr(res);
|
||||
}
|
||||
|
||||
/// Generic implementation for other types or for multiple arguments.
|
||||
return std::make_shared<AggregateFunctionEntropy<UInt128>>(num_args);
|
||||
}
|
||||
|
||||
}
|
||||
|
||||
void registerAggregateFunctionEntropy(AggregateFunctionFactory & factory)
|
||||
{
|
||||
factory.registerFunction("entropy", createAggregateFunctionEntropy);
|
||||
}
|
||||
|
||||
}
|
149
dbms/src/AggregateFunctions/AggregateFunctionEntropy.h
Normal file
@ -0,0 +1,149 @@
|
||||
#pragma once
|
||||
|
||||
#include <Common/HashTable/HashMap.h>
|
||||
#include <Common/NaNUtils.h>
|
||||
|
||||
#include <AggregateFunctions/IAggregateFunction.h>
|
||||
#include <AggregateFunctions/UniqVariadicHash.h>
|
||||
#include <DataTypes/DataTypesNumber.h>
|
||||
#include <Columns/ColumnVector.h>
|
||||
|
||||
#include <cmath>
|
||||
|
||||
|
||||
namespace DB
|
||||
{
|
||||
|
||||
/** Calculates Shannon Entropy, using HashMap and computing empirical distribution function.
|
||||
* Entropy is measured in bits (base-2 logarithm is used).
|
||||
*/
|
||||
template <typename Value>
|
||||
struct EntropyData
|
||||
{
|
||||
using Weight = UInt64;
|
||||
|
||||
using HashingMap = HashMap<
|
||||
Value, Weight,
|
||||
HashCRC32<Value>,
|
||||
HashTableGrower<4>,
|
||||
HashTableAllocatorWithStackMemory<sizeof(std::pair<Value, Weight>) * (1 << 3)>>;
|
||||
|
||||
/// For the case of pre-hashed values.
|
||||
using TrivialMap = HashMap<
|
||||
Value, Weight,
|
||||
UInt128TrivialHash,
|
||||
HashTableGrower<4>,
|
||||
HashTableAllocatorWithStackMemory<sizeof(std::pair<Value, Weight>) * (1 << 3)>>;
|
||||
|
||||
using Map = std::conditional_t<std::is_same_v<UInt128, Value>, TrivialMap, HashingMap>;
|
||||
|
||||
Map map;
|
||||
|
||||
void add(const Value & x)
|
||||
{
|
||||
if (!isNaN(x))
|
||||
++map[x];
|
||||
}
|
||||
|
||||
void add(const Value & x, const Weight & weight)
|
||||
{
|
||||
if (!isNaN(x))
|
||||
map[x] += weight;
|
||||
}
|
||||
|
||||
void merge(const EntropyData & rhs)
|
||||
{
|
||||
for (const auto & pair : rhs.map)
|
||||
map[pair.first] += pair.second;
|
||||
}
|
||||
|
||||
void serialize(WriteBuffer & buf) const
|
||||
{
|
||||
map.write(buf);
|
||||
}
|
||||
|
||||
void deserialize(ReadBuffer & buf)
|
||||
{
|
||||
typename Map::Reader reader(buf);
|
||||
while (reader.next())
|
||||
{
|
||||
const auto & pair = reader.get();
|
||||
map[pair.first] = pair.second;
|
||||
}
|
||||
}
|
||||
|
||||
Float64 get() const
|
||||
{
|
||||
UInt64 total_value = 0;
|
||||
for (const auto & pair : map)
|
||||
total_value += pair.second;
|
||||
|
||||
Float64 shannon_entropy = 0;
|
||||
for (const auto & pair : map)
|
||||
{
|
||||
Float64 frequency = Float64(pair.second) / total_value;
|
||||
shannon_entropy -= frequency * log2(frequency);
|
||||
}
|
||||
|
||||
return shannon_entropy;
|
||||
}
|
||||
};
|
||||
|
||||
|
||||
template <typename Value>
|
||||
class AggregateFunctionEntropy final : public IAggregateFunctionDataHelper<EntropyData<Value>, AggregateFunctionEntropy<Value>>
|
||||
{
|
||||
private:
|
||||
size_t num_args;
|
||||
|
||||
public:
|
||||
AggregateFunctionEntropy(size_t num_args) : num_args(num_args)
|
||||
{
|
||||
}
|
||||
|
||||
String getName() const override { return "entropy"; }
|
||||
|
||||
DataTypePtr getReturnType() const override
|
||||
{
|
||||
return std::make_shared<DataTypeNumber<Float64>>();
|
||||
}
|
||||
|
||||
void add(AggregateDataPtr place, const IColumn ** columns, size_t row_num, Arena *) const override
|
||||
{
|
||||
if constexpr (!std::is_same_v<UInt128, Value>)
|
||||
{
|
||||
/// Only numeric types are handled here.
|
||||
const auto & column = static_cast<const ColumnVector <Value> &>(*columns[0]);
|
||||
this->data(place).add(column.getData()[row_num]);
|
||||
}
|
||||
else
|
||||
{
|
||||
this->data(place).add(UniqVariadicHash<true, false>::apply(num_args, columns, row_num));
|
||||
}
|
||||
}
|
||||
|
||||
void merge(AggregateDataPtr place, ConstAggregateDataPtr rhs, Arena *) const override
|
||||
{
|
||||
this->data(place).merge(this->data(rhs));
|
||||
}
|
||||
|
||||
void serialize(ConstAggregateDataPtr place, WriteBuffer & buf) const override
|
||||
{
|
||||
this->data(const_cast<AggregateDataPtr>(place)).serialize(buf);
|
||||
}
|
||||
|
||||
void deserialize(AggregateDataPtr place, ReadBuffer & buf, Arena *) const override
|
||||
{
|
||||
this->data(place).deserialize(buf);
|
||||
}
|
||||
|
||||
void insertResultInto(ConstAggregateDataPtr place, IColumn & to) const override
|
||||
{
|
||||
auto & column = static_cast<ColumnVector<Float64> &>(to);
|
||||
column.getData().push_back(this->data(place).get());
|
||||
}
|
||||
|
||||
const char * getHeaderFilePath() const override { return __FILE__; }
|
||||
};
|
||||
|
||||
}
|
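`EntropyData::get` in the header above computes Shannon entropy in bits from a map of value counts: it sums the counts, then accumulates `-p * log2(p)` over the empirical frequencies. A self-contained sketch of the same calculation follows, assuming plain `int` values and `std::unordered_map` in place of ClickHouse's `HashMap`. The function name `shannonEntropyBits` is hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical helper, not part of the commit: entropy of a sample, in bits.
double shannonEntropyBits(const std::vector<int> & values)
{
    // Count how many times each distinct value occurs.
    std::unordered_map<int, std::uint64_t> counts;
    for (int value : values)
        ++counts[value];

    // Total weight of the empirical distribution.
    std::uint64_t total = 0;
    for (const auto & entry : counts)
        total += entry.second;

    // H = -sum(p * log2(p)) over all distinct values.
    double entropy = 0.0;
    for (const auto & entry : counts)
    {
        assert(entry.second > 0);
        double frequency = static_cast<double>(entry.second) / static_cast<double>(total);
        entropy -= frequency * std::log2(frequency);
    }
    return entropy;
}
```

Two equally frequent values give 1 bit, four equally frequent values give 2 bits, and a sample with a single distinct value has zero entropy. Unlike the code in the commit, this sketch does not need a NaN check because it only accepts integers.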
@ -128,7 +128,11 @@ AggregateFunctionPtr AggregateFunctionFactory::getImpl(
|
||||
return combinator->transformAggregateFunction(nested_function, argument_types, parameters);
|
||||
}
|
||||
|
||||
throw Exception("Unknown aggregate function " + name, ErrorCodes::UNKNOWN_AGGREGATE_FUNCTION);
|
||||
auto hints = this->getHints(name);
|
||||
if (!hints.empty())
|
||||
throw Exception("Unknown aggregate function " + name + ". Maybe you meant: " + toString(hints), ErrorCodes::UNKNOWN_AGGREGATE_FUNCTION);
|
||||
else
|
||||
throw Exception("Unknown aggregate function " + name, ErrorCodes::UNKNOWN_AGGREGATE_FUNCTION);
|
||||
}
|
||||
|
||||
|
||||
|
@ -13,6 +13,8 @@
|
||||
|
||||
#include <IO/WriteBuffer.h>
|
||||
#include <IO/ReadBuffer.h>
|
||||
#include <IO/WriteHelpers.h>
|
||||
#include <IO/ReadHelpers.h>
|
||||
#include <IO/VarInt.h>
|
||||
|
||||
#include <AggregateFunctions/IAggregateFunction.h>
|
||||
@ -268,15 +270,13 @@ public:
|
||||
lower_bound = std::min(lower_bound, other.lower_bound);
|
||||
upper_bound = std::max(upper_bound, other.upper_bound);
|
||||
for (size_t i = 0; i < other.size; i++)
|
||||
{
|
||||
add(other.points[i].mean, other.points[i].weight, max_bins);
|
||||
}
|
||||
}
|
||||
|
||||
void write(WriteBuffer & buf) const
|
||||
{
|
||||
buf.write(reinterpret_cast<const char *>(&lower_bound), sizeof(lower_bound));
|
||||
buf.write(reinterpret_cast<const char *>(&upper_bound), sizeof(upper_bound));
|
||||
writeBinary(lower_bound, buf);
|
||||
writeBinary(upper_bound, buf);
|
||||
|
||||
writeVarUInt(size, buf);
|
||||
buf.write(reinterpret_cast<const char *>(points), size * sizeof(WeightedValue));
|
||||
@ -284,11 +284,10 @@ public:
|
||||
|
||||
void read(ReadBuffer & buf, UInt32 max_bins)
|
||||
{
|
||||
buf.read(reinterpret_cast<char *>(&lower_bound), sizeof(lower_bound));
|
||||
buf.read(reinterpret_cast<char *>(&upper_bound), sizeof(upper_bound));
|
||||
readBinary(lower_bound, buf);
|
||||
readBinary(upper_bound, buf);
|
||||
|
||||
readVarUInt(size, buf);
|
||||
|
||||
if (size > max_bins * 2)
|
||||
throw Exception("Too many bins", ErrorCodes::TOO_LARGE_ARRAY_SIZE);
|
||||
|
||||
|
@ -41,7 +41,7 @@ template <typename T> using FuncQuantilesTDigestWeighted = AggregateFunctionQuan
|
||||
|
||||
|
||||
template <template <typename> class Function>
|
||||
static constexpr bool SupportDecimal()
|
||||
static constexpr bool supportDecimal()
|
||||
{
|
||||
return std::is_same_v<Function<Float32>, FuncQuantileExact<Float32>> ||
|
||||
std::is_same_v<Function<Float32>, FuncQuantilesExact<Float32>>;
|
||||
@ -61,11 +61,10 @@ AggregateFunctionPtr createAggregateFunctionQuantile(const std::string & name, c
|
||||
if (which.idx == TypeIndex::TYPE) return std::make_shared<Function<TYPE>>(argument_type, params);
|
||||
FOR_NUMERIC_TYPES(DISPATCH)
|
||||
#undef DISPATCH
|
||||
#undef FOR_NUMERIC_TYPES
|
||||
if (which.idx == TypeIndex::Date) return std::make_shared<Function<DataTypeDate::FieldType>>(argument_type, params);
|
||||
if (which.idx == TypeIndex::DateTime) return std::make_shared<Function<DataTypeDateTime::FieldType>>(argument_type, params);
|
||||
|
||||
if constexpr (SupportDecimal<Function>())
|
||||
if constexpr (supportDecimal<Function>())
|
||||
{
|
||||
if (which.idx == TypeIndex::Decimal32) return std::make_shared<Function<Decimal32>>(argument_type, params);
|
||||
if (which.idx == TypeIndex::Decimal64) return std::make_shared<Function<Decimal64>>(argument_type, params);
|
||||
|
@ -20,7 +20,7 @@ namespace DB
|
||||
|
||||
/** Create an aggregate function with a numeric type in the template parameter, depending on the type of the argument.
|
||||
*/
|
||||
template <template <typename> class AggregateFunctionTemplate, typename ... TArgs>
|
||||
template <template <typename> class AggregateFunctionTemplate, typename... TArgs>
|
||||
static IAggregateFunction * createWithNumericType(const IDataType & argument_type, TArgs && ... args)
|
||||
{
|
||||
WhichDataType which(argument_type);
|
||||
@ -33,7 +33,7 @@ static IAggregateFunction * createWithNumericType(const IDataType & argument_typ
|
||||
return nullptr;
|
||||
}
|
||||
|
||||
template <template <typename, typename> class AggregateFunctionTemplate, typename Data, typename ... TArgs>
|
||||
template <template <typename, typename> class AggregateFunctionTemplate, typename Data, typename... TArgs>
|
||||
static IAggregateFunction * createWithNumericType(const IDataType & argument_type, TArgs && ... args)
|
||||
{
|
||||
WhichDataType which(argument_type);
|
||||
@ -46,7 +46,7 @@ static IAggregateFunction * createWithNumericType(const IDataType & argument_typ
|
||||
return nullptr;
|
||||
}
|
||||
|
||||
template <template <typename, typename> class AggregateFunctionTemplate, template <typename> class Data, typename ... TArgs>
|
||||
template <template <typename, typename> class AggregateFunctionTemplate, template <typename> class Data, typename... TArgs>
|
||||
static IAggregateFunction * createWithNumericType(const IDataType & argument_type, TArgs && ... args)
|
||||
{
|
||||
WhichDataType which(argument_type);
|
||||
@ -59,7 +59,7 @@ static IAggregateFunction * createWithNumericType(const IDataType & argument_typ
|
||||
return nullptr;
|
||||
}
|
||||
|
||||
template <template <typename, typename> class AggregateFunctionTemplate, template <typename> class Data, typename ... TArgs>
|
||||
template <template <typename, typename> class AggregateFunctionTemplate, template <typename> class Data, typename... TArgs>
|
||||
static IAggregateFunction * createWithUnsignedIntegerType(const IDataType & argument_type, TArgs && ... args)
|
||||
{
|
||||
WhichDataType which(argument_type);
|
||||
@ -70,7 +70,7 @@ static IAggregateFunction * createWithUnsignedIntegerType(const IDataType & argu
|
||||
return nullptr;
|
||||
}
|
||||
|
||||
template <template <typename> class AggregateFunctionTemplate, typename ... TArgs>
|
||||
template <template <typename> class AggregateFunctionTemplate, typename... TArgs>
|
||||
static IAggregateFunction * createWithNumericBasedType(const IDataType & argument_type, TArgs && ... args)
|
||||
{
|
||||
IAggregateFunction * f = createWithNumericType<AggregateFunctionTemplate>(argument_type, std::forward<TArgs>(args)...);
|
||||
@ -85,7 +85,7 @@ static IAggregateFunction * createWithNumericBasedType(const IDataType & argumen
|
||||
return nullptr;
|
||||
}
|
||||
|
||||
template <template <typename> class AggregateFunctionTemplate, typename ... TArgs>
|
||||
template <template <typename> class AggregateFunctionTemplate, typename... TArgs>
|
||||
static IAggregateFunction * createWithDecimalType(const IDataType & argument_type, TArgs && ... args)
|
||||
{
|
||||
WhichDataType which(argument_type);
|
||||
@ -98,7 +98,7 @@ static IAggregateFunction * createWithDecimalType(const IDataType & argument_typ
|
||||
|
||||
/** For template with two arguments.
|
||||
*/
|
||||
template <typename FirstType, template <typename, typename> class AggregateFunctionTemplate, typename ... TArgs>
|
||||
template <typename FirstType, template <typename, typename> class AggregateFunctionTemplate, typename... TArgs>
|
||||
static IAggregateFunction * createWithTwoNumericTypesSecond(const IDataType & second_type, TArgs && ... args)
|
||||
{
|
||||
WhichDataType which(second_type);
|
||||
@ -111,7 +111,7 @@ static IAggregateFunction * createWithTwoNumericTypesSecond(const IDataType & se
|
||||
return nullptr;
|
||||
}
|
||||
|
||||
template <template <typename, typename> class AggregateFunctionTemplate, typename ... TArgs>
|
||||
template <template <typename, typename> class AggregateFunctionTemplate, typename... TArgs>
|
||||
static IAggregateFunction * createWithTwoNumericTypes(const IDataType & first_type, const IDataType & second_type, TArgs && ... args)
|
||||
{
|
||||
WhichDataType which(first_type);
|
||||
|
@ -19,7 +19,7 @@ namespace ErrorCodes
|
||||
/** Calculates quantile by collecting all values into array
|
||||
* and applying n-th element (introselect) algorithm for the resulting array.
|
||||
*
|
||||
* It use O(N) memory and it is very inefficient in case of high amount of identical values.
|
||||
* It uses O(N) memory and it is very inefficient in case of high amount of identical values.
|
||||
* But it is very CPU efficient for not large datasets.
|
||||
*/
|
||||
template <typename Value>
|
||||
|
@ -14,7 +14,7 @@ namespace ErrorCodes
|
||||
|
||||
/** Calculates quantile by counting number of occurrences for each value in a hash map.
|
||||
*
|
||||
* It use O(distinct(N)) memory. Can be naturally applied for values with weight.
|
||||
* It uses O(distinct(N)) memory. Can be naturally applied for values with weight.
|
||||
* In case of many identical values, it can be more efficient than QuantileExact even when weight is not used.
|
||||
*/
|
||||
template <typename Value>
|
||||
|
@ -27,6 +27,7 @@ void registerAggregateFunctionUniqUpTo(AggregateFunctionFactory &);
|
||||
void registerAggregateFunctionTopK(AggregateFunctionFactory &);
|
||||
void registerAggregateFunctionsBitwise(AggregateFunctionFactory &);
|
||||
void registerAggregateFunctionsMaxIntersections(AggregateFunctionFactory &);
|
||||
void registerAggregateFunctionEntropy(AggregateFunctionFactory &);
|
||||
|
||||
void registerAggregateFunctionCombinatorIf(AggregateFunctionCombinatorFactory &);
|
||||
void registerAggregateFunctionCombinatorArray(AggregateFunctionCombinatorFactory &);
|
||||
@ -65,6 +66,7 @@ void registerAggregateFunctions()
|
||||
registerAggregateFunctionsMaxIntersections(factory);
|
||||
registerAggregateFunctionHistogram(factory);
|
||||
registerAggregateFunctionRetention(factory);
|
||||
registerAggregateFunctionEntropy(factory);
|
||||
}
|
||||
|
||||
{
|
||||
|
@ -138,7 +138,7 @@ public:
|
||||
StringRef getRawData() const override { return StringRef(chars.data(), chars.size()); }
|
||||
|
||||
/// Specialized part of interface, not from IColumn.
|
||||
|
||||
void insertString(const String & string) { insertData(string.c_str(), string.size()); }
|
||||
Chars & getChars() { return chars; }
|
||||
const Chars & getChars() const { return chars; }
|
||||
|
||||
|
@ -12,6 +12,7 @@
|
||||
#include <Columns/ColumnsCommon.h>
|
||||
#include <DataStreams/ColumnGathererStream.h>
|
||||
#include <ext/bit_cast.h>
|
||||
#include <pdqsort.h>
|
||||
|
||||
#ifdef __SSE2__
|
||||
#include <emmintrin.h>
|
||||
@ -90,9 +91,9 @@ void ColumnVector<T>::getPermutation(bool reverse, size_t limit, int nan_directi
|
||||
else
|
||||
{
|
||||
if (reverse)
|
||||
std::sort(res.begin(), res.end(), greater(*this, nan_direction_hint));
|
||||
pdqsort(res.begin(), res.end(), greater(*this, nan_direction_hint));
|
||||
else
|
||||
std::sort(res.begin(), res.end(), less(*this, nan_direction_hint));
|
||||
pdqsort(res.begin(), res.end(), less(*this, nan_direction_hint));
|
||||
}
|
||||
}
|
||||
|
||||
|
557
dbms/src/Common/ColumnsHashing.h
Normal file
@ -0,0 +1,557 @@
|
||||
#pragma once
|
||||
|
||||
|
||||
#include <Common/ColumnsHashingImpl.h>
|
||||
#include <Common/Arena.h>
|
||||
#include <Common/LRUCache.h>
|
||||
#include <common/unaligned.h>
|
||||
|
||||
#include <Columns/ColumnString.h>
|
||||
#include <Columns/ColumnFixedString.h>
|
||||
#include <Columns/ColumnLowCardinality.h>
|
||||
|
||||
#include <Core/Defines.h>
|
||||
#include <memory>
|
||||
|
||||
namespace DB
|
||||
{
|
||||
|
||||
namespace ColumnsHashing
|
||||
{
|
||||
|
||||
/// For the case when there is one numeric key.
|
||||
/// UInt8/16/32/64 for any type with corresponding bit width.
|
||||
template <typename Value, typename Mapped, typename FieldType, bool use_cache = true>
|
||||
struct HashMethodOneNumber
|
||||
: public columns_hashing_impl::HashMethodBase<HashMethodOneNumber<Value, Mapped, FieldType, use_cache>, Value, Mapped, use_cache>
|
||||
{
|
||||
using Self = HashMethodOneNumber<Value, Mapped, FieldType, use_cache>;
|
||||
using Base = columns_hashing_impl::HashMethodBase<Self, Value, Mapped, use_cache>;
|
||||
|
||||
const char * vec;
|
||||
|
||||
/// If the keys are of fixed length, key_sizes contains their lengths; otherwise it is empty.
|
||||
HashMethodOneNumber(const ColumnRawPtrs & key_columns, const Sizes & /*key_sizes*/, const HashMethodContextPtr &)
|
||||
{
|
||||
vec = key_columns[0]->getRawData().data;
|
||||
}
|
||||
|
||||
/// Creates context. Method is called once and result context is used in all threads.
|
||||
using Base::createContext; /// (const HashMethodContext::Settings &) -> HashMethodContextPtr
|
||||
|
||||
/// Emplace key into HashTable or HashMap. If Data is HashMap, returns ptr to value, otherwise nullptr.
|
||||
/// Data is a HashTable where to insert key from column's row.
|
||||
/// For Serialized method, key may be placed in pool.
|
||||
using Base::emplaceKey; /// (Data & data, size_t row, Arena & pool) -> EmplaceResult
|
||||
|
||||
/// Find key in HashTable or HashMap. If Data is HashMap and key was found, returns ptr to value, otherwise nullptr.
|
||||
using Base::findKey; /// (Data & data, size_t row, Arena & pool) -> FindResult
|
||||
|
||||
/// Get hash value of row.
|
||||
using Base::getHash; /// (const Data & data, size_t row, Arena & pool) -> size_t
|
||||
|
||||
/// Is used for default implementation in HashMethodBase.
|
||||
FieldType getKey(size_t row, Arena &) const { return unalignedLoad<FieldType>(vec + row * sizeof(FieldType)); }
|
||||
|
||||
/// Get StringRef from value which can be inserted into column.
|
||||
static StringRef getValueRef(const Value & value)
|
||||
{
|
||||
return StringRef(reinterpret_cast<const char *>(&value.first), sizeof(value.first));
|
||||
}
|
||||
};
|
||||
|
||||
|
||||
/// For the case when there is one string key.
|
||||
template <typename Value, typename Mapped, bool place_string_to_arena = true, bool use_cache = true>
|
||||
struct HashMethodString
|
||||
: public columns_hashing_impl::HashMethodBase<HashMethodString<Value, Mapped, place_string_to_arena, use_cache>, Value, Mapped, use_cache>
|
||||
{
|
||||
using Self = HashMethodString<Value, Mapped, place_string_to_arena, use_cache>;
|
||||
using Base = columns_hashing_impl::HashMethodBase<Self, Value, Mapped, use_cache>;
|
||||
|
||||
const IColumn::Offset * offsets;
|
||||
const UInt8 * chars;
|
||||
|
||||
HashMethodString(const ColumnRawPtrs & key_columns, const Sizes & /*key_sizes*/, const HashMethodContextPtr &)
|
||||
{
|
||||
const IColumn & column = *key_columns[0];
|
||||
const ColumnString & column_string = static_cast<const ColumnString &>(column);
|
||||
offsets = column_string.getOffsets().data();
|
||||
chars = column_string.getChars().data();
|
||||
}
|
||||
|
||||
auto getKey(ssize_t row, Arena &) const
|
||||
{
|
||||
return StringRef(chars + offsets[row - 1], offsets[row] - offsets[row - 1] - 1);
|
||||
}
|
||||
|
||||
static StringRef getValueRef(const Value & value) { return StringRef(value.first.data, value.first.size); }
|
||||
|
||||
protected:
|
||||
friend class columns_hashing_impl::HashMethodBase<Self, Value, Mapped, use_cache>;
|
||||
|
||||
static ALWAYS_INLINE void onNewKey([[maybe_unused]] StringRef & key, [[maybe_unused]] Arena & pool)
|
||||
{
|
||||
if constexpr (place_string_to_arena)
|
||||
{
|
||||
if (key.size)
|
||||
key.data = pool.insert(key.data, key.size);
|
||||
}
|
||||
}
|
||||
};
|
||||
|
||||
|
||||
/// For the case when there is one fixed-length string key.
|
||||
template <typename Value, typename Mapped, bool place_string_to_arena = true, bool use_cache = true>
|
||||
struct HashMethodFixedString
|
||||
: public columns_hashing_impl::HashMethodBase<HashMethodFixedString<Value, Mapped, place_string_to_arena, use_cache>, Value, Mapped, use_cache>
|
||||
{
|
||||
using Self = HashMethodFixedString<Value, Mapped, place_string_to_arena, use_cache>;
|
||||
using Base = columns_hashing_impl::HashMethodBase<Self, Value, Mapped, use_cache>;
|
||||
|
||||
size_t n;
|
||||
const ColumnFixedString::Chars * chars;
|
||||
|
||||
HashMethodFixedString(const ColumnRawPtrs & key_columns, const Sizes & /*key_sizes*/, const HashMethodContextPtr &)
|
||||
{
|
||||
const IColumn & column = *key_columns[0];
|
||||
const ColumnFixedString & column_string = static_cast<const ColumnFixedString &>(column);
|
||||
n = column_string.getN();
|
||||
chars = &column_string.getChars();
|
||||
}
|
||||
|
||||
StringRef getKey(size_t row, Arena &) const { return StringRef(&(*chars)[row * n], n); }
|
||||
|
||||
static StringRef getValueRef(const Value & value) { return StringRef(value.first.data, value.first.size); }
|
||||
|
||||
protected:
|
||||
friend class columns_hashing_impl::HashMethodBase<Self, Value, Mapped, use_cache>;
|
||||
static ALWAYS_INLINE void onNewKey([[maybe_unused]] StringRef & key, [[maybe_unused]] Arena & pool)
|
||||
{
|
||||
if constexpr (place_string_to_arena)
|
||||
key.data = pool.insert(key.data, key.size);
|
||||
}
|
||||
};
|
||||
|
||||
|
||||
/// Cache stores dictionaries and saved_hash per dictionary key.
|
||||
class LowCardinalityDictionaryCache : public HashMethodContext
|
||||
{
|
||||
public:
|
||||
/// Will assume that dictionaries with the same hash have the same keys.
|
||||
/// Just in case, check that they have also the same size.
|
||||
struct DictionaryKey
|
||||
{
|
||||
UInt128 hash;
|
||||
UInt64 size;
|
||||
|
||||
bool operator== (const DictionaryKey & other) const { return hash == other.hash && size == other.size; }
|
||||
};
|
||||
|
||||
struct DictionaryKeyHash
|
||||
{
|
||||
size_t operator()(const DictionaryKey & key) const
|
||||
{
|
||||
SipHash hash;
|
||||
hash.update(key.hash.low);
|
||||
hash.update(key.hash.high);
|
||||
hash.update(key.size);
|
||||
return hash.get64();
|
||||
}
|
||||
};
|
||||
|
||||
struct CachedValues
|
||||
{
|
||||
/// Store ptr to dictionary to be sure it won't be deleted.
|
||||
ColumnPtr dictionary_holder;
|
||||
/// Hashes for dictionary keys.
|
||||
const UInt64 * saved_hash = nullptr;
|
||||
};
|
||||
|
||||
using CachedValuesPtr = std::shared_ptr<CachedValues>;
|
||||
|
||||
explicit LowCardinalityDictionaryCache(const HashMethodContext::Settings & settings) : cache(settings.max_threads) {}
|
||||
|
||||
CachedValuesPtr get(const DictionaryKey & key) { return cache.get(key); }
|
||||
void set(const DictionaryKey & key, const CachedValuesPtr & mapped) { cache.set(key, mapped); }
|
||||
|
||||
private:
|
||||
using Cache = LRUCache<DictionaryKey, CachedValues, DictionaryKeyHash>;
|
||||
Cache cache;
|
||||
};
|
||||
|
||||
|
||||
/// Single low cardinality column.
|
||||
template <typename SingleColumnMethod, typename Mapped, bool use_cache>
|
||||
struct HashMethodSingleLowCardinalityColumn : public SingleColumnMethod
|
||||
{
|
||||
using Base = SingleColumnMethod;
|
||||
|
||||
enum class VisitValue
|
||||
{
|
||||
Empty = 0,
|
||||
Found = 1,
|
||||
NotFound = 2,
|
||||
};
|
||||
|
||||
static constexpr bool has_mapped = !std::is_same<Mapped, void>::value;
|
||||
using EmplaceResult = columns_hashing_impl::EmplaceResultImpl<Mapped>;
|
||||
using FindResult = columns_hashing_impl::FindResultImpl<Mapped>;
|
||||
|
||||
static HashMethodContextPtr createContext(const HashMethodContext::Settings & settings)
|
||||
{
|
||||
return std::make_shared<LowCardinalityDictionaryCache>(settings);
|
||||
}
|
||||
|
||||
ColumnRawPtrs key_columns;
|
||||
const IColumn * positions = nullptr;
|
||||
size_t size_of_index_type = 0;
|
||||
|
||||
/// Saved hash comes either from the current column or from the cache.
|
||||
const UInt64 * saved_hash = nullptr;
|
||||
/// Hold dictionary in case saved_hash is from cache to be sure it won't be deleted.
|
||||
ColumnPtr dictionary_holder;
|
||||
|
||||
/// Cache AggregateDataPtr for current column in order to decrease the number of hash table usages.
|
||||
columns_hashing_impl::MappedCache<Mapped> mapped_cache;
|
||||
PaddedPODArray<VisitValue> visit_cache;
|
||||
|
||||
/// Whether the initialized column is nullable.
|
||||
bool is_nullable = false;
|
||||
|
||||
static const ColumnLowCardinality & getLowCardinalityColumn(const IColumn * low_cardinality_column)
|
||||
{
|
||||
auto column = typeid_cast<const ColumnLowCardinality *>(low_cardinality_column);
|
||||
if (!column)
|
||||
throw Exception("Invalid aggregation key type for HashMethodSingleLowCardinalityColumn method. "
|
||||
"Expected LowCardinality, got " + low_cardinality_column->getName(), ErrorCodes::LOGICAL_ERROR);
|
||||
return *column;
|
||||
}
|
||||
|
||||
HashMethodSingleLowCardinalityColumn(
|
||||
const ColumnRawPtrs & key_columns_low_cardinality, const Sizes & key_sizes, const HashMethodContextPtr & context)
|
||||
: Base({getLowCardinalityColumn(key_columns_low_cardinality[0]).getDictionary().getNestedNotNullableColumn().get()}, key_sizes, context)
|
||||
{
|
||||
auto column = &getLowCardinalityColumn(key_columns_low_cardinality[0]);
|
||||
|
||||
if (!context)
|
||||
throw Exception("Cache wasn't created for HashMethodSingleLowCardinalityColumn",
|
||||
ErrorCodes::LOGICAL_ERROR);
|
||||
|
||||
LowCardinalityDictionaryCache * cache;
|
||||
if constexpr (use_cache)
|
||||
{
|
||||
cache = typeid_cast<LowCardinalityDictionaryCache *>(context.get());
|
||||
if (!cache)
|
||||
{
|
||||
const auto & cached_val = *context;
|
||||
throw Exception("Invalid type for HashMethodSingleLowCardinalityColumn cache: "
|
||||
+ demangle(typeid(cached_val).name()), ErrorCodes::LOGICAL_ERROR);
|
||||
}
|
||||
}
|
||||
|
||||
auto * dict = column->getDictionary().getNestedNotNullableColumn().get();
|
||||
is_nullable = column->getDictionary().nestedColumnIsNullable();
|
||||
key_columns = {dict};
|
||||
bool is_shared_dict = column->isSharedDictionary();
|
||||
|
||||
typename LowCardinalityDictionaryCache::DictionaryKey dictionary_key;
|
||||
typename LowCardinalityDictionaryCache::CachedValuesPtr cached_values;
|
||||
|
||||
if (is_shared_dict)
|
||||
{
|
||||
dictionary_key = {column->getDictionary().getHash(), dict->size()};
|
||||
if constexpr (use_cache)
|
||||
cached_values = cache->get(dictionary_key);
|
||||
}
|
||||
|
||||
if (cached_values)
|
||||
{
|
||||
saved_hash = cached_values->saved_hash;
|
||||
dictionary_holder = cached_values->dictionary_holder;
|
||||
}
|
||||
else
|
||||
{
|
||||
saved_hash = column->getDictionary().tryGetSavedHash();
|
||||
dictionary_holder = column->getDictionaryPtr();
|
||||
|
||||
if constexpr (use_cache)
|
||||
{
|
||||
if (is_shared_dict)
|
||||
{
|
||||
cached_values = std::make_shared<typename LowCardinalityDictionaryCache::CachedValues>();
|
||||
cached_values->saved_hash = saved_hash;
|
||||
cached_values->dictionary_holder = dictionary_holder;
|
||||
|
||||
cache->set(dictionary_key, cached_values);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
if constexpr (has_mapped)
|
||||
mapped_cache.resize(key_columns[0]->size());
|
||||
|
||||
VisitValue empty(VisitValue::Empty);
|
||||
visit_cache.assign(key_columns[0]->size(), empty);
|
||||
|
||||
size_of_index_type = column->getSizeOfIndexType();
|
||||
positions = column->getIndexesPtr().get();
|
||||
}
|
||||
|
||||
ALWAYS_INLINE size_t getIndexAt(size_t row) const
|
||||
{
|
||||
switch (size_of_index_type)
|
||||
{
|
||||
case sizeof(UInt8): return static_cast<const ColumnUInt8 *>(positions)->getElement(row);
|
||||
case sizeof(UInt16): return static_cast<const ColumnUInt16 *>(positions)->getElement(row);
|
||||
case sizeof(UInt32): return static_cast<const ColumnUInt32 *>(positions)->getElement(row);
|
||||
case sizeof(UInt64): return static_cast<const ColumnUInt64 *>(positions)->getElement(row);
|
||||
default: throw Exception("Unexpected size of index type for low cardinality column.", ErrorCodes::LOGICAL_ERROR);
|
||||
}
|
||||
}
|
||||
|
||||
/// Get the key from the key columns for insertion into the hash table.
|
||||
ALWAYS_INLINE auto getKey(size_t row, Arena & pool) const
|
||||
{
|
||||
return Base::getKey(getIndexAt(row), pool);
|
||||
}
|
||||
|
||||
template <typename Data>
|
||||
ALWAYS_INLINE EmplaceResult emplaceKey(Data & data, size_t row_, Arena & pool)
|
||||
{
|
||||
size_t row = getIndexAt(row_);
|
||||
|
||||
if (is_nullable && row == 0)
|
||||
{
|
||||
visit_cache[row] = VisitValue::Found;
|
||||
bool has_null_key = data.hasNullKeyData();
|
||||
data.hasNullKeyData() = true;
|
||||
|
||||
if constexpr (has_mapped)
|
||||
return EmplaceResult(data.getNullKeyData(), mapped_cache[0], !has_null_key);
|
||||
else
|
||||
return EmplaceResult(!has_null_key);
|
||||
}
|
||||
|
||||
if (visit_cache[row] == VisitValue::Found)
|
||||
{
|
||||
if constexpr (has_mapped)
|
||||
return EmplaceResult(mapped_cache[row], mapped_cache[row], false);
|
||||
else
|
||||
return EmplaceResult(false);
|
||||
}
|
||||
|
||||
auto key = getKey(row_, pool);
|
||||
|
||||
bool inserted = false;
|
||||
typename Data::iterator it;
|
||||
if (saved_hash)
|
||||
data.emplace(key, it, inserted, saved_hash[row]);
|
||||
else
|
||||
data.emplace(key, it, inserted);
|
||||
|
||||
visit_cache[row] = VisitValue::Found;
|
||||
|
||||
if (inserted)
|
||||
{
|
||||
if constexpr (has_mapped)
|
||||
{
|
||||
new(&it->second) Mapped();
|
||||
Base::onNewKey(it->first, pool);
|
||||
}
|
||||
else
|
||||
Base::onNewKey(*it, pool);
|
||||
}
|
||||
|
||||
if constexpr (has_mapped)
|
||||
return EmplaceResult(it->second, mapped_cache[row], inserted);
|
||||
else
|
||||
return EmplaceResult(inserted);
|
||||
}
|
||||
|
||||
ALWAYS_INLINE bool isNullAt(size_t i)
|
||||
{
|
||||
if (!is_nullable)
|
||||
return false;
|
||||
|
||||
return getIndexAt(i) == 0;
|
||||
}
|
||||
|
||||
template <typename Data>
|
||||
ALWAYS_INLINE FindResult findFromRow(Data & data, size_t row_, Arena & pool)
|
||||
{
|
||||
size_t row = getIndexAt(row_);
|
||||
|
||||
if (is_nullable && row == 0)
|
||||
{
|
||||
if constexpr (has_mapped)
|
||||
return FindResult(data.hasNullKeyData() ? &data.getNullKeyData() : nullptr, data.hasNullKeyData());
|
||||
else
|
||||
return FindResult(data.hasNullKeyData());
|
||||
}
|
||||
|
||||
if (visit_cache[row] != VisitValue::Empty)
|
||||
{
|
||||
if constexpr (has_mapped)
|
||||
return FindResult(&mapped_cache[row], visit_cache[row] == VisitValue::Found);
|
||||
else
|
||||
return FindResult(visit_cache[row] == VisitValue::Found);
|
||||
}
|
||||
|
||||
auto key = getKey(row_, pool);
|
||||
|
||||
typename Data::iterator it;
|
||||
if (saved_hash)
|
||||
it = data.find(key, saved_hash[row]);
|
||||
else
|
||||
it = data.find(key);
|
||||
|
||||
bool found = it != data.end();
|
||||
visit_cache[row] = found ? VisitValue::Found : VisitValue::NotFound;
|
||||
|
||||
if constexpr (has_mapped)
|
||||
{
|
||||
if (found)
|
||||
mapped_cache[row] = it->second;
|
||||
}
|
||||
|
||||
if constexpr (has_mapped)
|
||||
return FindResult(&mapped_cache[row], found);
|
||||
else
|
||||
return FindResult(found);
|
||||
}
|
||||
|
||||
template <typename Data>
|
||||
ALWAYS_INLINE size_t getHash(const Data & data, size_t row, Arena & pool)
|
||||
{
|
||||
row = getIndexAt(row);
|
||||
if (saved_hash)
|
||||
return saved_hash[row];
|
||||
|
||||
return Base::getHash(data, row, pool);
|
||||
}
|
||||
};
|
||||
|
||||
|
||||
// Optional mask for low cardinality columns.
|
||||
template <bool has_low_cardinality>
|
||||
struct LowCardinalityKeys
|
||||
{
|
||||
ColumnRawPtrs nested_columns;
|
||||
ColumnRawPtrs positions;
|
||||
Sizes position_sizes;
|
||||
};
|
||||
|
||||
template <>
|
||||
struct LowCardinalityKeys<false> {};
|
||||
|
||||
/// For the case when all keys are of fixed length, and they fit in N (for example, 128) bits.
|
||||
template <typename Value, typename Key, typename Mapped, bool has_nullable_keys_ = false, bool has_low_cardinality_ = false, bool use_cache = true>
|
||||
struct HashMethodKeysFixed
|
||||
: private columns_hashing_impl::BaseStateKeysFixed<Key, has_nullable_keys_>
|
||||
, public columns_hashing_impl::HashMethodBase<HashMethodKeysFixed<Value, Key, Mapped, has_nullable_keys_, has_low_cardinality_, use_cache>, Value, Mapped, use_cache>
|
||||
{
|
||||
using Self = HashMethodKeysFixed<Value, Key, Mapped, has_nullable_keys_, has_low_cardinality_, use_cache>;
|
||||
using BaseHashed = columns_hashing_impl::HashMethodBase<Self, Value, Mapped, use_cache>;
|
||||
using Base = columns_hashing_impl::BaseStateKeysFixed<Key, has_nullable_keys_>;
|
||||
|
||||
static constexpr bool has_nullable_keys = has_nullable_keys_;
|
||||
static constexpr bool has_low_cardinality = has_low_cardinality_;
|
||||
|
||||
LowCardinalityKeys<has_low_cardinality> low_cardinality_keys;
|
||||
Sizes key_sizes;
|
||||
size_t keys_size;
|
||||
|
||||
HashMethodKeysFixed(const ColumnRawPtrs & key_columns, const Sizes & key_sizes, const HashMethodContextPtr &)
|
||||
: Base(key_columns), key_sizes(std::move(key_sizes)), keys_size(key_columns.size())
|
||||
{
|
||||
if constexpr (has_low_cardinality)
|
||||
{
|
||||
low_cardinality_keys.nested_columns.resize(key_columns.size());
|
||||
low_cardinality_keys.positions.assign(key_columns.size(), nullptr);
|
||||
low_cardinality_keys.position_sizes.resize(key_columns.size());
|
||||
for (size_t i = 0; i < key_columns.size(); ++i)
|
||||
{
|
||||
if (auto * low_cardinality_col = typeid_cast<const ColumnLowCardinality *>(key_columns[i]))
|
||||
{
|
||||
low_cardinality_keys.nested_columns[i] = low_cardinality_col->getDictionary().getNestedColumn().get();
|
||||
low_cardinality_keys.positions[i] = &low_cardinality_col->getIndexes();
|
||||
low_cardinality_keys.position_sizes[i] = low_cardinality_col->getSizeOfIndexType();
|
||||
}
|
||||
else
|
||||
low_cardinality_keys.nested_columns[i] = key_columns[i];
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
ALWAYS_INLINE Key getKey(size_t row, Arena &) const
|
||||
{
|
||||
if constexpr (has_nullable_keys)
|
||||
{
|
||||
auto bitmap = Base::createBitmap(row);
|
||||
return packFixed<Key>(row, keys_size, Base::getActualColumns(), key_sizes, bitmap);
|
||||
}
|
||||
else
|
||||
{
|
||||
if constexpr (has_low_cardinality)
|
||||
return packFixed<Key, true>(row, keys_size, low_cardinality_keys.nested_columns, key_sizes,
|
||||
&low_cardinality_keys.positions, &low_cardinality_keys.position_sizes);
|
||||
|
||||
return packFixed<Key>(row, keys_size, Base::getActualColumns(), key_sizes);
|
||||
}
|
||||
}
|
||||
};
|
||||
|
||||
/** Hash by concatenating serialized key values.
|
||||
* A serialized value is self-delimiting: it can be deserialized unambiguously given only the position at which it starts.
|
||||
* That is, for example, for strings, it contains first the serialized length of the string, and then the bytes.
|
||||
* Therefore, when aggregating by several strings, there is no ambiguity.
|
||||
*/
|
||||
template <typename Value, typename Mapped>
|
||||
struct HashMethodSerialized
|
||||
: public columns_hashing_impl::HashMethodBase<HashMethodSerialized<Value, Mapped>, Value, Mapped, false>
|
||||
{
|
||||
using Self = HashMethodSerialized<Value, Mapped>;
|
||||
using Base = columns_hashing_impl::HashMethodBase<Self, Value, Mapped, false>;
|
||||
|
||||
ColumnRawPtrs key_columns;
|
||||
size_t keys_size;
|
||||
|
||||
HashMethodSerialized(const ColumnRawPtrs & key_columns, const Sizes & /*key_sizes*/, const HashMethodContextPtr &)
|
||||
: key_columns(key_columns), keys_size(key_columns.size()) {}
|
||||
|
||||
protected:
|
||||
friend class columns_hashing_impl::HashMethodBase<Self, Value, Mapped, false>;
|
||||
|
||||
ALWAYS_INLINE StringRef getKey(size_t row, Arena & pool) const
|
||||
{
|
||||
return serializeKeysToPoolContiguous(row, keys_size, key_columns, pool);
|
||||
}
|
||||
|
||||
static ALWAYS_INLINE void onExistingKey(StringRef & key, Arena & pool) { pool.rollback(key.size); }
|
||||
};
|
||||
|
||||
/// For the general case: all key columns are hashed into a single 128-bit value, which is used as the key.
|
||||
template <typename Value, typename Mapped, bool use_cache = true>
|
||||
struct HashMethodHashed
|
||||
: public columns_hashing_impl::HashMethodBase<HashMethodHashed<Value, Mapped, use_cache>, Value, Mapped, use_cache>
|
||||
{
|
||||
using Key = UInt128;
|
||||
using Self = HashMethodHashed<Value, Mapped, use_cache>;
|
||||
using Base = columns_hashing_impl::HashMethodBase<Self, Value, Mapped, use_cache>;
|
||||
|
||||
ColumnRawPtrs key_columns;
|
||||
|
||||
HashMethodHashed(ColumnRawPtrs key_columns, const Sizes &, const HashMethodContextPtr &)
|
||||
: key_columns(std::move(key_columns)) {}
|
||||
|
||||
ALWAYS_INLINE Key getKey(size_t row, Arena &) const { return hash128(row, key_columns.size(), key_columns); }
|
||||
|
||||
static ALWAYS_INLINE StringRef getValueRef(const Value & value)
|
||||
{
|
||||
return StringRef(reinterpret_cast<const char *>(&value.first), sizeof(value.first));
|
||||
}
|
||||
};
|
||||
|
||||
}
|
||||
}
|
356
dbms/src/Common/ColumnsHashingImpl.h
Normal file
@ -0,0 +1,356 @@
|
||||
#pragma once
|
||||
|
||||
#include <Columns/IColumn.h>
|
||||
#include <Interpreters/AggregationCommon.h>
|
||||
|
||||
namespace DB
|
||||
{
|
||||
|
||||
namespace ColumnsHashing
|
||||
{
|
||||
|
||||
/// Generic context for HashMethod. Context is shared between multiple threads, all methods must be thread-safe.
|
||||
/// Is used for caching.
|
||||
class HashMethodContext
|
||||
{
|
||||
public:
|
||||
virtual ~HashMethodContext() = default;
|
||||
|
||||
struct Settings
|
||||
{
|
||||
size_t max_threads;
|
||||
};
|
||||
};
|
||||
|
||||
using HashMethodContextPtr = std::shared_ptr<HashMethodContext>;
|
||||
|
||||
|
||||
namespace columns_hashing_impl
|
||||
{
|
||||
|
||||
template <typename Value, bool consecutive_keys_optimization_>
|
||||
struct LastElementCache
|
||||
{
|
||||
static constexpr bool consecutive_keys_optimization = consecutive_keys_optimization_;
|
||||
Value value;
|
||||
bool empty = true;
|
||||
bool found = false;
|
||||
|
||||
bool check(const Value & value_) { return !empty && value == value_; }
|
||||
|
||||
template <typename Key>
|
||||
bool check(const Key & key) { return !empty && value.first == key; }
|
||||
};
|
||||
|
||||
template <typename Data>
|
||||
struct LastElementCache<Data, false>
|
||||
{
|
||||
static constexpr bool consecutive_keys_optimization = false;
|
||||
};
|
||||
|
||||
template <typename Mapped>
|
||||
class EmplaceResultImpl
|
||||
{
|
||||
Mapped & value;
|
||||
Mapped & cached_value;
|
||||
bool inserted;
|
||||
|
||||
public:
|
||||
EmplaceResultImpl(Mapped & value, Mapped & cached_value, bool inserted)
|
||||
: value(value), cached_value(cached_value), inserted(inserted) {}
|
||||
|
||||
bool isInserted() const { return inserted; }
|
||||
auto & getMapped() const { return value; }
|
||||
|
||||
void setMapped(const Mapped & mapped)
|
||||
{
|
||||
cached_value = mapped;
|
||||
value = mapped;
|
||||
}
|
||||
};
|
||||
|
||||
template <>
|
||||
class EmplaceResultImpl<void>
|
||||
{
|
||||
bool inserted;
|
||||
|
||||
public:
|
||||
explicit EmplaceResultImpl(bool inserted) : inserted(inserted) {}
|
||||
bool isInserted() const { return inserted; }
|
||||
};
|
||||
|
||||
template <typename Mapped>
|
||||
class FindResultImpl
|
||||
{
|
||||
Mapped * value;
|
||||
bool found;
|
||||
|
||||
public:
|
||||
FindResultImpl(Mapped * value, bool found) : value(value), found(found) {}
|
||||
bool isFound() const { return found; }
|
||||
Mapped & getMapped() const { return *value; }
|
||||
};
|
||||
|
||||
template <>
|
||||
class FindResultImpl<void>
|
||||
{
|
||||
bool found;
|
||||
|
||||
public:
|
||||
explicit FindResultImpl(bool found) : found(found) {}
|
||||
bool isFound() const { return found; }
|
||||
};
|
||||
|
||||
template <typename Derived, typename Value, typename Mapped, bool consecutive_keys_optimization>
|
||||
class HashMethodBase
|
||||
{
|
||||
public:
|
||||
using EmplaceResult = EmplaceResultImpl<Mapped>;
|
||||
using FindResult = FindResultImpl<Mapped>;
|
||||
static constexpr bool has_mapped = !std::is_same<Mapped, void>::value;
|
||||
using Cache = LastElementCache<Value, consecutive_keys_optimization>;
|
||||
|
||||
static HashMethodContextPtr createContext(const HashMethodContext::Settings &) { return nullptr; }
|
||||
|
||||
template <typename Data>
|
||||
ALWAYS_INLINE EmplaceResult emplaceKey(Data & data, size_t row, Arena & pool)
|
||||
{
|
||||
auto key = static_cast<Derived &>(*this).getKey(row, pool);
|
||||
return emplaceKeyImpl(key, data, pool);
|
||||
}
|
||||
|
||||
template <typename Data>
|
||||
ALWAYS_INLINE FindResult findKey(Data & data, size_t row, Arena & pool)
|
||||
{
|
||||
auto key = static_cast<Derived &>(*this).getKey(row, pool);
|
||||
auto res = findKeyImpl(key, data);
|
||||
static_cast<Derived &>(*this).onExistingKey(key, pool);
|
||||
return res;
|
||||
}
|
||||
|
||||
template <typename Data>
|
||||
ALWAYS_INLINE size_t getHash(const Data & data, size_t row, Arena & pool)
|
||||
{
|
||||
auto key = static_cast<Derived &>(*this).getKey(row, pool);
|
||||
auto res = data.hash(key);
|
||||
static_cast<Derived &>(*this).onExistingKey(key, pool);
|
||||
return res;
|
||||
}
|
||||
|
||||
protected:
|
||||
Cache cache;
|
||||
|
||||
HashMethodBase()
|
||||
{
|
||||
if constexpr (consecutive_keys_optimization)
|
||||
{
|
||||
if constexpr (has_mapped)
|
||||
{
|
||||
/// Init PairNoInit elements.
|
||||
cache.value.second = Mapped();
|
||||
using Key = decltype(cache.value.first);
|
||||
cache.value.first = Key();
|
||||
}
|
||||
else
|
||||
cache.value = Value();
|
||||
}
|
||||
}
|
||||
|
||||
template <typename Key>
|
||||
static ALWAYS_INLINE void onNewKey(Key & /*key*/, Arena & /*pool*/) {}
|
||||
template <typename Key>
|
||||
static ALWAYS_INLINE void onExistingKey(Key & /*key*/, Arena & /*pool*/) {}
|
||||
|
||||
template <typename Data, typename Key>
|
||||
ALWAYS_INLINE EmplaceResult emplaceKeyImpl(Key key, Data & data, Arena & pool)
|
||||
{
|
||||
if constexpr (Cache::consecutive_keys_optimization)
|
||||
{
|
||||
if (cache.found && cache.check(key))
|
||||
{
|
||||
static_cast<Derived &>(*this).onExistingKey(key, pool);
|
||||
|
||||
if constexpr (has_mapped)
|
||||
return EmplaceResult(cache.value.second, cache.value.second, false);
|
||||
else
|
||||
return EmplaceResult(false);
|
||||
}
|
||||
}
|
||||
|
||||
typename Data::iterator it;
|
||||
bool inserted = false;
|
||||
data.emplace(key, it, inserted);
|
||||
|
||||
[[maybe_unused]] Mapped * cached = nullptr;
|
||||
if constexpr (has_mapped)
|
||||
cached = &it->second;
|
||||
|
||||
if (inserted)
|
||||
{
|
||||
if constexpr (has_mapped)
|
||||
{
|
||||
new(&it->second) Mapped();
|
||||
static_cast<Derived &>(*this).onNewKey(it->first, pool);
|
||||
}
|
||||
else
|
||||
static_cast<Derived &>(*this).onNewKey(*it, pool);
|
||||
}
|
||||
else
|
||||
static_cast<Derived &>(*this).onExistingKey(key, pool);
|
||||
|
||||
if constexpr (consecutive_keys_optimization)
|
||||
{
|
||||
cache.value = *it;
|
||||
cache.found = true;
|
||||
cache.empty = false;
|
||||
|
||||
if constexpr (has_mapped)
|
||||
cached = &cache.value.second;
|
||||
}
|
||||
|
||||
if constexpr (has_mapped)
|
||||
return EmplaceResult(it->second, *cached, inserted);
|
||||
else
|
||||
return EmplaceResult(inserted);
|
||||
}
|
||||
|
||||
template <typename Data, typename Key>
|
||||
ALWAYS_INLINE FindResult findKeyImpl(Key key, Data & data)
|
||||
{
|
||||
if constexpr (Cache::consecutive_keys_optimization)
|
||||
{
|
||||
if (cache.check(key))
|
||||
{
|
||||
if constexpr (has_mapped)
|
||||
return FindResult(&cache.value.second, cache.found);
|
||||
else
|
||||
return FindResult(cache.found);
|
||||
}
|
||||
}
|
||||
|
||||
auto it = data.find(key);
|
||||
bool found = it != data.end();
|
||||
|
||||
if constexpr (consecutive_keys_optimization)
|
||||
{
|
||||
cache.found = found;
|
||||
cache.empty = false;
|
||||
|
||||
if (found)
|
||||
cache.value = *it;
|
||||
else
|
||||
{
|
||||
if constexpr (has_mapped)
|
||||
cache.value.first = key;
|
||||
else
|
||||
cache.value = key;
|
||||
}
|
||||
}
|
||||
|
||||
if constexpr (has_mapped)
|
||||
return FindResult(found ? &it->second : nullptr, found);
|
||||
else
|
||||
return FindResult(found);
|
||||
}
|
||||
};
|
||||
|
||||
|
||||
template <typename T>
|
||||
struct MappedCache : public PaddedPODArray<T> {};
|
||||
|
||||
template <>
|
||||
struct MappedCache<void> {};
|
||||
|
||||
|
||||
/// This class is designed to provide the functionality that is required for
|
||||
/// supporting nullable keys in HashMethodKeysFixed. If there are
|
||||
/// no nullable keys, this class is merely implemented as an empty shell.
|
||||
template <typename Key, bool has_nullable_keys>
|
||||
class BaseStateKeysFixed;
|
||||
|
||||
/// Case where nullable keys are supported.
|
||||
template <typename Key>
|
||||
class BaseStateKeysFixed<Key, true>
|
||||
{
|
||||
protected:
|
||||
BaseStateKeysFixed(const ColumnRawPtrs & key_columns)
|
||||
{
|
||||
null_maps.reserve(key_columns.size());
|
||||
actual_columns.reserve(key_columns.size());
|
||||
|
||||
for (const auto & col : key_columns)
|
||||
{
|
||||
if (col->isColumnNullable())
|
||||
{
|
||||
const auto & nullable_col = static_cast<const ColumnNullable &>(*col);
|
||||
actual_columns.push_back(&nullable_col.getNestedColumn());
|
||||
null_maps.push_back(&nullable_col.getNullMapColumn());
|
||||
}
|
||||
else
|
||||
{
|
||||
actual_columns.push_back(col);
|
||||
null_maps.push_back(nullptr);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
/// Return the columns which actually contain the values of the keys.
|
||||
/// For a given key column, if it is nullable, we return its nested
|
||||
/// column. Otherwise we return the key column itself.
|
||||
inline const ColumnRawPtrs & getActualColumns() const
|
||||
{
|
||||
return actual_columns;
|
||||
}
|
||||
|
||||
/// Create a bitmap that indicates whether, for a particular row,
|
||||
/// a key column bears a null value or not.
|
||||
KeysNullMap<Key> createBitmap(size_t row) const
|
||||
{
|
||||
KeysNullMap<Key> bitmap{};
|
||||
|
||||
for (size_t k = 0; k < null_maps.size(); ++k)
|
||||
{
|
||||
if (null_maps[k] != nullptr)
|
||||
{
|
||||
const auto & null_map = static_cast<const ColumnUInt8 &>(*null_maps[k]).getData();
|
||||
if (null_map[row] == 1)
|
||||
{
|
||||
size_t bucket = k / 8;
|
||||
size_t offset = k % 8;
|
||||
bitmap[bucket] |= UInt8(1) << offset;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
return bitmap;
|
||||
}
|
||||
|
||||
private:
|
||||
ColumnRawPtrs actual_columns;
|
||||
ColumnRawPtrs null_maps;
|
||||
};
|
||||
|
||||
/// Case where nullable keys are not supported.
|
||||
template <typename Key>
|
||||
class BaseStateKeysFixed<Key, false>
|
||||
{
|
||||
protected:
|
||||
BaseStateKeysFixed(const ColumnRawPtrs & columns) : actual_columns(columns) {}
|
||||
|
||||
const ColumnRawPtrs & getActualColumns() const { return actual_columns; }
|
||||
|
||||
KeysNullMap<Key> createBitmap(size_t) const
|
||||
{
|
||||
throw Exception{"Internal error: calling createBitmap() for non-nullable keys"
|
||||
" is forbidden", ErrorCodes::LOGICAL_ERROR};
|
||||
}
|
||||
|
||||
private:
|
||||
ColumnRawPtrs actual_columns;
|
||||
};
|
||||
|
||||
}
|
||||
|
||||
}
|
||||
|
||||
}
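The null-bitmap packing done by `createBitmap` above can be sketched in isolation. This is a minimal standalone illustration, not ClickHouse code: the name `makeNullBitmap`, the `Bytes` template parameter, and the use of plain `std::vector<uint8_t>` null maps are all assumptions made for the example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: packs "key k is NULL at this row" flags into a bitmap,
// one bit per key column. A nullptr entry stands for a non-nullable column.
template <size_t Bytes>
std::array<uint8_t, Bytes> makeNullBitmap(
    const std::vector<const std::vector<uint8_t> *> & null_maps, size_t row)
{
    std::array<uint8_t, Bytes> bitmap{};  // value-initialized: all bits are zero

    for (size_t k = 0; k < null_maps.size(); ++k)
    {
        if (null_maps[k] != nullptr && (*null_maps[k])[row] == 1)
        {
            size_t bucket = k / 8;  // which byte holds the bit for key k
            size_t offset = k % 8;  // position of the bit inside that byte
            bitmap[bucket] |= static_cast<uint8_t>(1u << offset);
        }
    }

    return bitmap;
}
```

With three key columns where the second one is not nullable, a row whose first and third keys are NULL yields the byte `0b101`.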
|
@ -69,7 +69,7 @@ public:
|
||||
static void finalizePerformanceCounters();
|
||||
|
||||
/// Returns a non-empty string if the thread is attached to a query
|
||||
static std::string getCurrentQueryID();
|
||||
static const std::string & getQueryId();
|
||||
|
||||
/// Non-master threads call this method in destructor automatically
|
||||
static void detachQuery();
|
||||
|
@ -415,6 +415,7 @@ namespace ErrorCodes
|
||||
extern const int DATA_TYPE_CANNOT_BE_PROMOTED = 438;
|
||||
extern const int CANNOT_SCHEDULE_TASK = 439;
|
||||
extern const int INVALID_LIMIT_EXPRESSION = 440;
|
||||
extern const int CANNOT_PARSE_DOMAIN_VALUE_FROM_STRING = 441;
|
||||
|
||||
extern const int KEEPER_EXCEPTION = 999;
|
||||
extern const int POCO_EXCEPTION = 1000;
|
||||
|
@ -38,10 +38,10 @@ std::string errnoToString(int code, int e)
|
||||
#endif
|
||||
{
|
||||
std::string tmp = std::to_string(code);
|
||||
const char * code = tmp.c_str();
|
||||
const char * code_str = tmp.c_str();
|
||||
const char * unknown_message = "Unknown error ";
|
||||
strcpy(buf, unknown_message);
|
||||
strcpy(buf + strlen(unknown_message), code);
|
||||
strcpy(buf + strlen(unknown_message), code_str);
|
||||
}
|
||||
return "errno: " + toString(e) + ", strerror: " + std::string(buf);
|
||||
#else
|
||||
@ -88,7 +88,7 @@ std::string getCurrentExceptionMessage(bool with_stacktrace, bool check_embedded
|
||||
try
|
||||
{
|
||||
stream << "Poco::Exception. Code: " << ErrorCodes::POCO_EXCEPTION << ", e.code() = " << e.code()
|
||||
<< ", e.displayText() = " << e.displayText() << ", e.what() = " << e.what();
|
||||
<< ", e.displayText() = " << e.displayText();
|
||||
}
|
||||
catch (...) {}
|
||||
}
|
||||
@ -202,7 +202,7 @@ std::string getExceptionMessage(const Exception & e, bool with_stacktrace, bool
|
||||
}
|
||||
}
|
||||
|
||||
stream << "Code: " << e.code() << ", e.displayText() = " << text << ", e.what() = " << e.what();
|
||||
stream << "Code: " << e.code() << ", e.displayText() = " << text;
|
||||
|
||||
if (with_stacktrace && !has_embedded_stack_trace)
|
||||
stream << ", Stack trace:\n\n" << e.getStackTrace().toString();
|
||||
|
@ -33,6 +33,7 @@ public:
|
||||
Exception * clone() const override { return new Exception(*this); }
|
||||
void rethrow() const override { throw *this; }
|
||||
const char * name() const throw() override { return "DB::Exception"; }
|
||||
const char * what() const throw() override { return message().data(); }
|
||||
|
||||
/// Add something to the existing message.
|
||||
void addMessage(const std::string & arg) { extendedMessage(arg); }
|
||||
|
@ -1,6 +1,7 @@
|
||||
#pragma once
|
||||
|
||||
#include <Common/Exception.h>
|
||||
#include <Common/NamePrompter.h>
|
||||
#include <Core/Types.h>
|
||||
#include <Poco/String.h>
|
||||
|
||||
@ -105,6 +106,12 @@ public:
|
||||
return aliases.count(name) || case_insensitive_aliases.count(name);
|
||||
}
|
||||
|
||||
std::vector<String> getHints(const String & name) const
|
||||
{
|
||||
static const auto registered_names = getAllRegisteredNames();
|
||||
return prompter.getHints(name, registered_names);
|
||||
}
|
||||
|
||||
virtual ~IFactoryWithAliases() {}
|
||||
|
||||
private:
|
||||
@ -120,6 +127,12 @@ private:
|
||||
|
||||
/// Case insensitive aliases
|
||||
AliasMap case_insensitive_aliases;
|
||||
|
||||
/**
|
||||
* Prompter for names: if a user makes a typo in a function or type name, it
|
||||
* helps to find the best possible matches (those within an edit distance of one or two characters).
|
||||
*/
|
||||
NamePrompter</*MistakeFactor=*/2, /*MaxNumHints=*/2> prompter;
|
||||
};
|
||||
|
||||
}
|
||||
|
83
dbms/src/Common/NamePrompter.h
Normal file
@ -0,0 +1,83 @@
|
||||
#pragma once
|
||||
|
||||
#include <Core/Types.h>
|
||||
|
||||
#include <algorithm>
|
||||
#include <cctype>
|
||||
#include <queue>
|
||||
#include <utility>
|
||||
|
||||
namespace DB
|
||||
{
|
||||
template <size_t MistakeFactor, size_t MaxNumHints>
|
||||
class NamePrompter
|
||||
{
|
||||
public:
|
||||
using DistanceIndex = std::pair<size_t, size_t>;
|
||||
using DistanceIndexQueue = std::priority_queue<DistanceIndex>;
|
||||
|
||||
static std::vector<String> getHints(const String & name, const std::vector<String> & prompting_strings)
|
||||
{
|
||||
DistanceIndexQueue queue;
|
||||
for (size_t i = 0; i < prompting_strings.size(); ++i)
|
||||
appendToQueue(i, name, queue, prompting_strings);
|
||||
return release(queue, prompting_strings);
|
||||
}
|
||||
|
||||
private:
|
||||
static size_t levenshteinDistance(const String & lhs, const String & rhs)
|
||||
{
|
||||
size_t n = lhs.size();
|
||||
size_t m = rhs.size();
|
||||
std::vector<std::vector<size_t>> dp(n + 1, std::vector<size_t>(m + 1));
|
||||
|
||||
for (size_t i = 1; i <= n; ++i)
|
||||
dp[i][0] = i;
|
||||
|
||||
for (size_t i = 1; i <= m; ++i)
|
||||
dp[0][i] = i;
|
||||
|
||||
for (size_t j = 1; j <= m; ++j)
|
||||
{
|
||||
for (size_t i = 1; i <= n; ++i)
|
||||
{
|
||||
if (std::tolower(lhs[i - 1]) == std::tolower(rhs[j - 1]))
|
||||
dp[i][j] = dp[i - 1][j - 1];
|
||||
else
|
||||
dp[i][j] = std::min(dp[i - 1][j] + 1, std::min(dp[i][j - 1] + 1, dp[i - 1][j - 1] + 1));
|
||||
}
|
||||
}
|
||||
|
||||
return dp[n][m];
|
||||
}
|
||||
|
||||
static void appendToQueue(size_t ind, const String & name, DistanceIndexQueue & queue, const std::vector<String> & prompting_strings)
|
||||
{
|
||||
if (prompting_strings[ind].size() <= name.size() + MistakeFactor && prompting_strings[ind].size() + MistakeFactor >= name.size())
|
||||
{
|
||||
size_t distance = levenshteinDistance(prompting_strings[ind], name);
|
||||
if (distance <= MistakeFactor)
|
||||
{
|
||||
queue.emplace(distance, ind);
|
||||
if (queue.size() > MaxNumHints)
|
||||
queue.pop();
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
static std::vector<String> release(DistanceIndexQueue & queue, const std::vector<String> & prompting_strings)
|
||||
{
|
||||
std::vector<String> ans;
|
||||
ans.reserve(queue.size());
|
||||
while (!queue.empty())
|
||||
{
|
||||
auto top = queue.top();
|
||||
queue.pop();
|
||||
ans.push_back(prompting_strings[top.second]);
|
||||
}
|
||||
std::reverse(ans.begin(), ans.end());
|
||||
return ans;
|
||||
}
|
||||
};
|
||||
|
||||
}
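The hint selection in `NamePrompter` combines a case-insensitive Levenshtein distance with a bounded max-heap that evicts the worst candidate once more than `MaxNumHints` are held. The standalone sketch below mirrors that idea under some assumptions: the names `editDistance` and `suggest` are hypothetical, the thresholds are ordinary function parameters instead of template parameters, the length pre-filter is omitted, and the distance uses two rolling rows instead of the full table.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: case-insensitive Levenshtein distance, two rolling rows.
size_t editDistance(const std::string & lhs, const std::string & rhs)
{
    std::vector<size_t> prev(rhs.size() + 1), cur(rhs.size() + 1);
    for (size_t j = 0; j <= rhs.size(); ++j)
        prev[j] = j;  // distance from the empty prefix of lhs

    for (size_t i = 1; i <= lhs.size(); ++i)
    {
        cur[0] = i;
        for (size_t j = 1; j <= rhs.size(); ++j)
        {
            bool same = std::tolower(static_cast<unsigned char>(lhs[i - 1]))
                == std::tolower(static_cast<unsigned char>(rhs[j - 1]));
            cur[j] = same ? prev[j - 1] : 1 + std::min({prev[j], cur[j - 1], prev[j - 1]});
        }
        std::swap(prev, cur);
    }
    return prev[rhs.size()];
}

// Hypothetical helper: returns at most max_hints candidates within max_distance, closest first.
std::vector<std::string> suggest(
    const std::string & name, const std::vector<std::string> & candidates,
    size_t max_distance = 2, size_t max_hints = 2)
{
    // Max-heap on (distance, index): the worst kept candidate is always on top.
    std::priority_queue<std::pair<size_t, size_t>> worst_first;

    for (size_t i = 0; i < candidates.size(); ++i)
    {
        size_t distance = editDistance(candidates[i], name);
        if (distance > max_distance)
            continue;
        worst_first.emplace(distance, i);
        if (worst_first.size() > max_hints)
            worst_first.pop();  // evict the current worst candidate
    }

    std::vector<std::string> hints;
    while (!worst_first.empty())
    {
        hints.push_back(candidates[worst_first.top().second]);
        worst_first.pop();
    }
    std::reverse(hints.begin(), hints.end());  // heap pops worst first, so reverse
    return hints;
}
```

Keeping the worst element on top of the heap means each candidate costs one push and at most one pop, so the queue never grows beyond `max_hints + 1` entries regardless of how many names are registered.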
|
@ -17,6 +17,7 @@
|
||||
#include <common/unaligned.h>
|
||||
#include <string>
|
||||
#include <type_traits>
|
||||
#include <Core/Defines.h>
|
||||
|
||||
#define ROTL(x, b) static_cast<UInt64>(((x) << (b)) | ((x) >> (64 - (b))))
|
||||
|
||||
@ -49,7 +50,7 @@ private:
|
||||
UInt8 current_bytes[8];
|
||||
};
|
||||
|
||||
void finalize()
|
||||
ALWAYS_INLINE void finalize()
|
||||
{
|
||||
/// In the last free byte, we write the remainder of the division by 256.
|
||||
current_bytes[7] = cnt;
|
||||
@ -156,7 +157,7 @@ public:
|
||||
|
||||
/// template for avoiding 'unsigned long long' vs 'unsigned long' problem on old poco in macos
|
||||
template <typename T>
|
||||
void get128(T & lo, T & hi)
|
||||
ALWAYS_INLINE void get128(T & lo, T & hi)
|
||||
{
|
||||
static_assert(sizeof(T) == 8);
|
||||
finalize();
|
||||
|
@ -21,7 +21,7 @@ namespace ErrorCodes
|
||||
}
|
||||
|
||||
|
||||
thread_local ThreadStatusPtr current_thread = nullptr;
|
||||
thread_local ThreadStatus * current_thread = nullptr;
|
||||
|
||||
|
||||
TasksStatsCounters TasksStatsCounters::current()
|
||||
@ -124,7 +124,7 @@ void ThreadStatus::attachInternalTextLogsQueue(const InternalTextLogsQueuePtr &
|
||||
if (!thread_group)
|
||||
return;
|
||||
|
||||
std::unique_lock lock(thread_group->mutex);
|
||||
std::lock_guard lock(thread_group->mutex);
|
||||
thread_group->logs_queue_ptr = logs_queue;
|
||||
}
|
||||
|
||||
|
@ -25,7 +25,6 @@ namespace DB
|
||||
class Context;
|
||||
class QueryStatus;
|
||||
class ThreadStatus;
|
||||
using ThreadStatusPtr = ThreadStatus*;
|
||||
class QueryThreadLog;
|
||||
struct TasksStatsCounters;
|
||||
struct RUsageCounters;
|
||||
@ -46,7 +45,7 @@ using InternalTextLogsQueueWeakPtr = std::weak_ptr<InternalTextLogsQueue>;
|
||||
class ThreadGroupStatus
|
||||
{
|
||||
public:
|
||||
mutable std::shared_mutex mutex;
|
||||
mutable std::mutex mutex;
|
||||
|
||||
ProfileEvents::Counters performance_counters{VariableContext::Process};
|
||||
MemoryTracker memory_tracker{VariableContext::Process};
|
||||
@ -56,12 +55,11 @@ public:
|
||||
|
||||
InternalTextLogsQueueWeakPtr logs_queue_ptr;
|
||||
|
||||
/// Key is Poco's thread_id
|
||||
using QueryThreadStatuses = std::map<UInt32, ThreadStatusPtr>;
|
||||
QueryThreadStatuses thread_statuses;
|
||||
std::vector<UInt32> thread_numbers;
|
||||
|
||||
/// The first thread created this thread group
|
||||
ThreadStatusPtr master_thread;
|
||||
UInt32 master_thread_number = 0;
|
||||
Int32 master_thread_os_id = -1;
|
||||
|
||||
String query;
|
||||
};
|
||||
@ -69,7 +67,7 @@ public:
|
||||
using ThreadGroupStatusPtr = std::shared_ptr<ThreadGroupStatus>;
|
||||
|
||||
|
||||
extern thread_local ThreadStatusPtr current_thread;
|
||||
extern thread_local ThreadStatus * current_thread;
|
||||
|
||||
/** Encapsulates all per-thread info (ProfileEvents, MemoryTracker, query_id, query context, etc.).
|
||||
* The object must be created in thread function and destroyed in the same thread before the exit.
|
||||
@ -116,7 +114,7 @@ public:
|
||||
return thread_state.load(std::memory_order_relaxed);
|
||||
}
|
||||
|
||||
String getQueryID();
|
||||
const std::string & getQueryId() const;
|
||||
|
||||
/// Starts a new query and creates a new thread group for it; the current thread becomes the master thread of the query
|
||||
void initializeQuery();
|
||||
@ -160,6 +158,8 @@ protected:
|
||||
/// Use it only from current thread
|
||||
Context * query_context = nullptr;
|
||||
|
||||
String query_id;
|
||||
|
||||
/// A logs queue used by TCPHandler to pass logs to a client
|
||||
InternalTextLogsQueueWeakPtr logs_queue_ptr;
|
||||
|
||||
|
@ -1,12 +1,44 @@
|
||||
#include <Common/formatIPv6.h>
|
||||
#include <Common/hex.h>
|
||||
#include <Common/StringUtils/StringUtils.h>
|
||||
|
||||
#include <ext/range.h>
|
||||
#include <array>
|
||||
|
||||
#include <algorithm>
|
||||
|
||||
namespace DB
|
||||
{
|
||||
|
||||
// To be used in formatIPv4: maps a byte to its string form, prefixed with the length (to save a strlen call).
|
||||
extern const char one_byte_to_string_lookup_table[256][4] = {
|
||||
{1, '0'}, {1, '1'}, {1, '2'}, {1, '3'}, {1, '4'}, {1, '5'}, {1, '6'}, {1, '7'}, {1, '8'}, {1, '9'},
|
||||
{2, '1', '0'}, {2, '1', '1'}, {2, '1', '2'}, {2, '1', '3'}, {2, '1', '4'}, {2, '1', '5'}, {2, '1', '6'}, {2, '1', '7'}, {2, '1', '8'}, {2, '1', '9'},
|
||||
{2, '2', '0'}, {2, '2', '1'}, {2, '2', '2'}, {2, '2', '3'}, {2, '2', '4'}, {2, '2', '5'}, {2, '2', '6'}, {2, '2', '7'}, {2, '2', '8'}, {2, '2', '9'},
|
||||
{2, '3', '0'}, {2, '3', '1'}, {2, '3', '2'}, {2, '3', '3'}, {2, '3', '4'}, {2, '3', '5'}, {2, '3', '6'}, {2, '3', '7'}, {2, '3', '8'}, {2, '3', '9'},
|
||||
{2, '4', '0'}, {2, '4', '1'}, {2, '4', '2'}, {2, '4', '3'}, {2, '4', '4'}, {2, '4', '5'}, {2, '4', '6'}, {2, '4', '7'}, {2, '4', '8'}, {2, '4', '9'},
|
||||
{2, '5', '0'}, {2, '5', '1'}, {2, '5', '2'}, {2, '5', '3'}, {2, '5', '4'}, {2, '5', '5'}, {2, '5', '6'}, {2, '5', '7'}, {2, '5', '8'}, {2, '5', '9'},
|
||||
{2, '6', '0'}, {2, '6', '1'}, {2, '6', '2'}, {2, '6', '3'}, {2, '6', '4'}, {2, '6', '5'}, {2, '6', '6'}, {2, '6', '7'}, {2, '6', '8'}, {2, '6', '9'},
|
||||
{2, '7', '0'}, {2, '7', '1'}, {2, '7', '2'}, {2, '7', '3'}, {2, '7', '4'}, {2, '7', '5'}, {2, '7', '6'}, {2, '7', '7'}, {2, '7', '8'}, {2, '7', '9'},
|
||||
{2, '8', '0'}, {2, '8', '1'}, {2, '8', '2'}, {2, '8', '3'}, {2, '8', '4'}, {2, '8', '5'}, {2, '8', '6'}, {2, '8', '7'}, {2, '8', '8'}, {2, '8', '9'},
|
||||
{2, '9', '0'}, {2, '9', '1'}, {2, '9', '2'}, {2, '9', '3'}, {2, '9', '4'}, {2, '9', '5'}, {2, '9', '6'}, {2, '9', '7'}, {2, '9', '8'}, {2, '9', '9'},
|
||||
{3, '1', '0', '0'}, {3, '1', '0', '1'}, {3, '1', '0', '2'}, {3, '1', '0', '3'}, {3, '1', '0', '4'}, {3, '1', '0', '5'}, {3, '1', '0', '6'}, {3, '1', '0', '7'}, {3, '1', '0', '8'}, {3, '1', '0', '9'},
|
||||
{3, '1', '1', '0'}, {3, '1', '1', '1'}, {3, '1', '1', '2'}, {3, '1', '1', '3'}, {3, '1', '1', '4'}, {3, '1', '1', '5'}, {3, '1', '1', '6'}, {3, '1', '1', '7'}, {3, '1', '1', '8'}, {3, '1', '1', '9'},
|
||||
{3, '1', '2', '0'}, {3, '1', '2', '1'}, {3, '1', '2', '2'}, {3, '1', '2', '3'}, {3, '1', '2', '4'}, {3, '1', '2', '5'}, {3, '1', '2', '6'}, {3, '1', '2', '7'}, {3, '1', '2', '8'}, {3, '1', '2', '9'},
|
||||
{3, '1', '3', '0'}, {3, '1', '3', '1'}, {3, '1', '3', '2'}, {3, '1', '3', '3'}, {3, '1', '3', '4'}, {3, '1', '3', '5'}, {3, '1', '3', '6'}, {3, '1', '3', '7'}, {3, '1', '3', '8'}, {3, '1', '3', '9'},
|
||||
{3, '1', '4', '0'}, {3, '1', '4', '1'}, {3, '1', '4', '2'}, {3, '1', '4', '3'}, {3, '1', '4', '4'}, {3, '1', '4', '5'}, {3, '1', '4', '6'}, {3, '1', '4', '7'}, {3, '1', '4', '8'}, {3, '1', '4', '9'},
|
||||
{3, '1', '5', '0'}, {3, '1', '5', '1'}, {3, '1', '5', '2'}, {3, '1', '5', '3'}, {3, '1', '5', '4'}, {3, '1', '5', '5'}, {3, '1', '5', '6'}, {3, '1', '5', '7'}, {3, '1', '5', '8'}, {3, '1', '5', '9'},
|
||||
{3, '1', '6', '0'}, {3, '1', '6', '1'}, {3, '1', '6', '2'}, {3, '1', '6', '3'}, {3, '1', '6', '4'}, {3, '1', '6', '5'}, {3, '1', '6', '6'}, {3, '1', '6', '7'}, {3, '1', '6', '8'}, {3, '1', '6', '9'},
|
||||
{3, '1', '7', '0'}, {3, '1', '7', '1'}, {3, '1', '7', '2'}, {3, '1', '7', '3'}, {3, '1', '7', '4'}, {3, '1', '7', '5'}, {3, '1', '7', '6'}, {3, '1', '7', '7'}, {3, '1', '7', '8'}, {3, '1', '7', '9'},
|
||||
{3, '1', '8', '0'}, {3, '1', '8', '1'}, {3, '1', '8', '2'}, {3, '1', '8', '3'}, {3, '1', '8', '4'}, {3, '1', '8', '5'}, {3, '1', '8', '6'}, {3, '1', '8', '7'}, {3, '1', '8', '8'}, {3, '1', '8', '9'},
|
||||
{3, '1', '9', '0'}, {3, '1', '9', '1'}, {3, '1', '9', '2'}, {3, '1', '9', '3'}, {3, '1', '9', '4'}, {3, '1', '9', '5'}, {3, '1', '9', '6'}, {3, '1', '9', '7'}, {3, '1', '9', '8'}, {3, '1', '9', '9'},
|
||||
{3, '2', '0', '0'}, {3, '2', '0', '1'}, {3, '2', '0', '2'}, {3, '2', '0', '3'}, {3, '2', '0', '4'}, {3, '2', '0', '5'}, {3, '2', '0', '6'}, {3, '2', '0', '7'}, {3, '2', '0', '8'}, {3, '2', '0', '9'},
|
||||
{3, '2', '1', '0'}, {3, '2', '1', '1'}, {3, '2', '1', '2'}, {3, '2', '1', '3'}, {3, '2', '1', '4'}, {3, '2', '1', '5'}, {3, '2', '1', '6'}, {3, '2', '1', '7'}, {3, '2', '1', '8'}, {3, '2', '1', '9'},
|
||||
{3, '2', '2', '0'}, {3, '2', '2', '1'}, {3, '2', '2', '2'}, {3, '2', '2', '3'}, {3, '2', '2', '4'}, {3, '2', '2', '5'}, {3, '2', '2', '6'}, {3, '2', '2', '7'}, {3, '2', '2', '8'}, {3, '2', '2', '9'},
|
||||
{3, '2', '3', '0'}, {3, '2', '3', '1'}, {3, '2', '3', '2'}, {3, '2', '3', '3'}, {3, '2', '3', '4'}, {3, '2', '3', '5'}, {3, '2', '3', '6'}, {3, '2', '3', '7'}, {3, '2', '3', '8'}, {3, '2', '3', '9'},
|
||||
{3, '2', '4', '0'}, {3, '2', '4', '1'}, {3, '2', '4', '2'}, {3, '2', '4', '3'}, {3, '2', '4', '4'}, {3, '2', '4', '5'}, {3, '2', '4', '6'}, {3, '2', '4', '7'}, {3, '2', '4', '8'}, {3, '2', '4', '9'},
|
||||
{3, '2', '5', '0'}, {3, '2', '5', '1'}, {3, '2', '5', '2'}, {3, '2', '5', '3'}, {3, '2', '5', '4'}, {3, '2', '5', '5'},
|
||||
};
|
||||
|
||||
/// Integer logarithm: returns ceil(log(value, base)), the smallest integer greater than or equal to log(value, base).
|
||||
static constexpr UInt32 intLog(const UInt32 value, const UInt32 base, const bool carry)
|
||||
{
|
||||
@ -45,22 +77,6 @@ static void printInteger(char *& out, T value)
|
||||
}
|
||||
}
|
||||
|
||||
/// print IPv4 address as %u.%u.%u.%u
|
||||
static void formatIPv4(const unsigned char * src, char *& dst, UInt8 zeroed_tail_bytes_count)
|
||||
{
|
||||
const auto limit = IPV4_BINARY_LENGTH - zeroed_tail_bytes_count;
|
||||
|
||||
for (const auto i : ext::range(0, IPV4_BINARY_LENGTH))
|
||||
{
|
||||
UInt8 byte = (i < limit) ? src[i] : 0;
|
||||
printInteger<10, UInt8>(dst, byte);
|
||||
|
||||
if (i != IPV4_BINARY_LENGTH - 1)
|
||||
*dst++ = '.';
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
void formatIPv6(const unsigned char * src, char *& dst, UInt8 zeroed_tail_bytes_count)
|
||||
{
|
||||
struct { int base, len; } best{-1, 0}, cur{-1, 0};
|
||||
@ -122,8 +138,14 @@ void formatIPv6(const unsigned char * src, char *& dst, UInt8 zeroed_tail_bytes_
|
||||
/// Is this address an encapsulated IPv4?
|
||||
if (i == 6 && best.base == 0 && (best.len == 6 || (best.len == 5 && words[5] == 0xffffu)))
|
||||
{
|
||||
formatIPv4(src + 12, dst, std::min(zeroed_tail_bytes_count, static_cast<UInt8>(IPV4_BINARY_LENGTH)));
|
||||
break;
|
||||
UInt8 ipv4_buffer[IPV4_BINARY_LENGTH] = {0};
|
||||
memcpy(ipv4_buffer, src + 12, IPV4_BINARY_LENGTH);
|
||||
// For historical reasons, formatIPv4() takes IPv4 in BE format, but inside IPv6 we store it in LE format.
|
||||
std::reverse(std::begin(ipv4_buffer), std::end(ipv4_buffer));
|
||||
|
||||
formatIPv4(ipv4_buffer, dst, std::min(zeroed_tail_bytes_count, static_cast<UInt8>(IPV4_BINARY_LENGTH)), "0");
|
||||
// formatIPv4 has already added a null-terminator for us.
|
||||
return;
|
||||
}
|
||||
|
||||
printInteger<16>(dst, words[i]);
|
||||
|
@ -1,12 +1,17 @@
|
||||
#pragma once
|
||||
|
||||
#include <common/Types.h>
|
||||
#include <string.h>
|
||||
#include <algorithm>
|
||||
#include <utility>
|
||||
#include <ext/range.h>
|
||||
#include <Common/hex.h>
|
||||
#include <Common/StringUtils/StringUtils.h>
|
||||
|
||||
#define IPV4_BINARY_LENGTH 4
|
||||
#define IPV6_BINARY_LENGTH 16
|
||||
#define IPV4_MAX_TEXT_LENGTH 15 /// Does not count tail zero byte.
|
||||
#define IPV6_MAX_TEXT_LENGTH 39
|
||||
|
||||
constexpr size_t IPV4_BINARY_LENGTH = 4;
|
||||
constexpr size_t IPV6_BINARY_LENGTH = 16;
|
||||
constexpr size_t IPV4_MAX_TEXT_LENGTH = 15; /// Does not count tail zero byte.
|
||||
constexpr size_t IPV6_MAX_TEXT_LENGTH = 39;
|
||||
|
||||
namespace DB
|
||||
{
|
||||
@ -18,4 +23,205 @@ namespace DB
|
||||
*/
|
||||
void formatIPv6(const unsigned char * src, char *& dst, UInt8 zeroed_tail_bytes_count = 0);
|
||||
|
||||
/** Unsafe (no bounds-checking for src nor dst), optimized version of parsing IPv4 string.
|
||||
*
|
||||
* Parses the input string `src` and stores binary BE value into buffer pointed by `dst`,
|
||||
* which should be long enough.
|
||||
* That is "127.0.0.1" becomes 0x7f000001.
|
||||
*
|
||||
* In case of failure returns false and doesn't modify buffer pointed by `dst`.
|
||||
*
|
||||
* @param src - input string, expected to be non-null and null-terminated right after the IPv4 string value.
|
||||
* @param dst - where to put output bytes, expected to be non-null and at least IPV4_BINARY_LENGTH-long.
|
||||
* @return false if parsing failed, true otherwise.
|
||||
*/
|
||||
inline bool parseIPv4(const char * src, unsigned char * dst)
|
||||
{
|
||||
UInt32 result = 0;
|
||||
for (int offset = 24; offset >= 0; offset -= 8)
|
||||
{
|
||||
UInt32 value = 0;
|
||||
size_t len = 0;
|
||||
while (isNumericASCII(*src) && len <= 3)
|
||||
{
|
||||
value = value * 10 + (*src - '0');
|
||||
++len;
|
||||
++src;
|
||||
}
|
||||
if (len == 0 || value > 255 || (offset > 0 && *src != '.'))
|
||||
return false;
|
||||
result |= value << offset;
|
||||
++src;
|
||||
}
|
||||
if (*(src - 1) != '\0')
|
||||
return false;
|
||||
|
||||
memcpy(dst, &result, sizeof(result));
|
||||
return true;
|
||||
}
|
||||
|
||||
/** Unsafe (no bounds-checking for src nor dst), optimized version of parsing IPv6 string.
|
||||
*
|
||||
* Slightly altered implementation from http://svn.apache.org/repos/asf/apr/apr/trunk/network_io/unix/inet_pton.c
|
||||
* Parses the input string `src` and stores binary LE value into buffer pointed by `dst`,
|
||||
* which should be long enough. In case of failure zeroes
|
||||
* IPV6_BINARY_LENGTH bytes of buffer pointed by `dst`.
|
||||
*
|
||||
* @param src - input string, expected to be non-null and null-terminated right after the IPv6 string value.
|
||||
* @param dst - where to put output bytes, expected to be non-null and at least IPV6_BINARY_LENGTH-long.
|
||||
* @return false if parsing failed, true otherwise.
|
||||
*/
|
||||
inline bool parseIPv6(const char * src, unsigned char * dst)
|
||||
{
|
||||
const auto clear_dst = [dst]()
|
||||
{
|
||||
memset(dst, '\0', IPV6_BINARY_LENGTH);
|
||||
return false;
|
||||
};
|
||||
|
||||
/// Leading :: requires some special handling.
|
||||
if (*src == ':')
|
||||
if (*++src != ':')
|
||||
return clear_dst();
|
||||
|
||||
unsigned char tmp[IPV6_BINARY_LENGTH]{};
|
||||
auto tp = tmp;
|
||||
auto endp = tp + IPV6_BINARY_LENGTH;
|
||||
auto curtok = src;
|
||||
auto saw_xdigit = false;
|
||||
UInt32 val{};
|
||||
unsigned char * colonp = nullptr;
|
||||
|
||||
/// Assuming zero-terminated string.
|
||||
while (const auto ch = *src++)
|
||||
{
|
||||
const auto num = unhex(ch);
|
||||
|
||||
if (num != '\xff')
|
||||
{
|
||||
val <<= 4;
|
||||
val |= num;
|
||||
if (val > 0xffffu)
|
||||
return clear_dst();
|
||||
|
||||
saw_xdigit = true;
|
||||
continue;
|
||||
}
|
||||
|
||||
if (ch == ':')
|
||||
{
|
||||
curtok = src;
|
||||
if (!saw_xdigit)
|
||||
{
|
||||
if (colonp)
|
||||
return clear_dst();
|
||||
|
||||
colonp = tp;
|
||||
continue;
|
||||
}
|
||||
|
||||
if (tp + sizeof(UInt16) > endp)
|
||||
return clear_dst();
|
||||
|
||||
*tp++ = static_cast<unsigned char>((val >> 8) & 0xffu);
|
||||
*tp++ = static_cast<unsigned char>(val & 0xffu);
|
||||
saw_xdigit = false;
|
||||
val = 0;
|
||||
continue;
|
||||
}
|
||||
|
||||
if (ch == '.' && (tp + IPV4_BINARY_LENGTH) <= endp)
|
||||
{
|
||||
if (!parseIPv4(curtok, tp))
|
||||
return clear_dst();
|
||||
std::reverse(tp, tp + IPV4_BINARY_LENGTH);
|
||||
|
||||
tp += IPV4_BINARY_LENGTH;
|
||||
saw_xdigit = false;
|
||||
break; /* '\0' was seen by parseIPv4(). */
|
||||
}
|
||||
|
||||
return clear_dst();
|
||||
}
|
||||
|
||||
if (saw_xdigit)
|
||||
{
|
||||
if (tp + sizeof(UInt16) > endp)
|
||||
return clear_dst();
|
||||
|
||||
*tp++ = static_cast<unsigned char>((val >> 8) & 0xffu);
|
||||
*tp++ = static_cast<unsigned char>(val & 0xffu);
|
||||
}
|
||||
|
||||
if (colonp)
|
||||
{
|
||||
/*
|
||||
* Since some memmove()'s erroneously fail to handle
|
||||
* overlapping regions, we'll do the shift by hand.
|
||||
*/
|
||||
const auto n = tp - colonp;
|
||||
|
||||
for (int i = 1; i <= n; ++i)
|
||||
{
|
||||
endp[- i] = colonp[n - i];
|
||||
colonp[n - i] = 0;
|
||||
}
|
||||
tp = endp;
|
||||
}
|
||||
|
||||
if (tp != endp)
|
||||
return clear_dst();
|
||||
|
||||
memcpy(dst, tmp, sizeof(tmp));
|
||||
return true;
|
||||
}
|
||||
|
||||
/** Format 4-byte binary sequence as IPv4 text: 'aaa.bbb.ccc.ddd',
|
||||
* expects input to be in BE-format, that is 0x7f000001 => "127.0.0.1".
|
||||
*
|
||||
* Any number of the tail bytes can be masked with given mask string.
|
||||
*
|
||||
* Assumptions:
|
||||
* src is IPV4_BINARY_LENGTH long,
|
||||
* dst is IPV4_MAX_TEXT_LENGTH long,
|
||||
* mask_tail_octets <= IPV4_BINARY_LENGTH
|
||||
* mask_string is NON-NULL, if mask_tail_octets > 0.
|
||||
*
|
||||
* Examples:
|
||||
* formatIPv4(&0x7f000001, dst, mask_tail_octets = 0, nullptr);
|
||||
* > dst == "127.0.0.1"
|
||||
* formatIPv4(&0x7f000001, dst, mask_tail_octets = 1, "xxx");
|
||||
* > dst == "127.0.0.xxx"
|
||||
* formatIPv4(&0x7f000001, dst, mask_tail_octets = 1, "0");
|
||||
* > dst == "127.0.0.0"
|
||||
*/
|
||||
inline void formatIPv4(const unsigned char * src, char *& dst, UInt8 mask_tail_octets = 0, const char * mask_string = "xxx")
|
||||
{
|
||||
extern const char one_byte_to_string_lookup_table[256][4];
|
||||
|
||||
const size_t mask_length = mask_string ? strlen(mask_string) : 0;
|
||||
const size_t limit = std::min(IPV4_BINARY_LENGTH, IPV4_BINARY_LENGTH - mask_tail_octets);
|
||||
for (size_t octet = 0; octet < limit; ++octet)
|
||||
{
|
||||
const UInt8 value = static_cast<UInt8>(src[IPV4_BINARY_LENGTH - octet - 1]);
|
||||
auto rep = one_byte_to_string_lookup_table[value];
|
||||
const UInt8 len = rep[0];
|
||||
const char* str = rep + 1;
|
||||
|
||||
memcpy(dst, str, len);
|
||||
dst += len;
|
||||
*dst++ = '.';
|
||||
}
|
||||
|
||||
for (size_t mask = 0; mask < mask_tail_octets; ++mask)
|
||||
{
|
||||
memcpy(dst, mask_string, mask_length);
|
||||
dst += mask_length;
|
||||
|
||||
*dst++ = '.';
|
||||
}
|
||||
|
||||
dst[-1] = '\0';
|
||||
}
|
||||
|
||||
}
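The contract documented for `parseIPv4` and `formatIPv4` ("127.0.0.1" corresponds to 0x7f000001, with optional masking of tail octets) can be checked with a simplified round trip. This is a sketch under stated assumptions, not the optimized code from the header: it works on a `uint32_t` value instead of raw byte pointers, uses `std::to_string` instead of the lookup table, and caps each octet at three digits. The names `parseIPv4Sketch` and `formatIPv4Sketch` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Parses dotted-quad text into an integer whose numeric value is big-endian,
// e.g. "127.0.0.1" -> 0x7f000001. Leaves `out` untouched on failure.
inline bool parseIPv4Sketch(const char * src, std::uint32_t & out)
{
    std::uint32_t result = 0;
    for (int offset = 24; offset >= 0; offset -= 8)
    {
        std::uint32_t value = 0;
        std::size_t len = 0;
        while (*src >= '0' && *src <= '9' && len < 3)
        {
            value = value * 10 + static_cast<std::uint32_t>(*src - '0');
            ++len;
            ++src;
        }
        // Every octet needs at least one digit, must fit in a byte,
        // and all but the last must be followed by a dot.
        if (len == 0 || value > 255 || (offset > 0 && *src != '.'))
            return false;
        result |= value << offset;
        if (offset > 0)
            ++src; // skip the '.'
    }
    if (*src != '\0')
        return false; // trailing garbage after the last octet
    out = result;
    return true;
}

// Formats the value back to text; the last mask_tail_octets octets (assumed <= 4)
// are replaced by the mask string.
inline std::string formatIPv4Sketch(std::uint32_t value, unsigned mask_tail_octets = 0, const std::string & mask = "xxx")
{
    std::string out;
    for (unsigned octet = 0; octet < 4; ++octet)
    {
        if (octet != 0)
            out += '.';
        if (octet >= 4 - mask_tail_octets)
            out += mask;
        else
            out += std::to_string((value >> (24 - 8 * octet)) & 0xffu);
    }
    return out;
}
```

The masked variants mirror the examples in the doc comment above: one masked octet with "xxx" gives "127.0.0.xxx", and with "0" gives "127.0.0.0".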
|
||||
|
@ -153,6 +153,4 @@ private:
|
||||
void attachToThreadGroup();
|
||||
};
|
||||
|
||||
using BackgroundSchedulePoolPtr = std::shared_ptr<BackgroundSchedulePool>;
|
||||
|
||||
}
|
||||
|
@ -24,6 +24,9 @@ public:
|
||||
|
||||
Block getHeader() const override { return children.at(0)->getHeader(); }
|
||||
|
||||
/// We call readSuffix prematurely ourselves. Suppress default behaviour.
|
||||
void readSuffix() override {}
|
||||
|
||||
protected:
|
||||
Block readImpl() override
|
||||
{
|
||||
|
@ -120,17 +120,7 @@ void CreatingSetsBlockInputStream::createOne(SubqueryForSet & subquery)
|
||||
|
||||
if (!done_with_join)
|
||||
{
|
||||
for (const auto & name_with_alias : subquery.joined_block_aliases)
|
||||
{
|
||||
if (block.has(name_with_alias.first))
|
||||
{
|
||||
auto pos = block.getPositionByName(name_with_alias.first);
|
||||
auto column = block.getByPosition(pos);
|
||||
block.erase(pos);
|
||||
column.name = name_with_alias.second;
|
||||
block.insert(std::move(column));
|
||||
}
|
||||
}
|
||||
subquery.renameColumns(block);
|
||||
|
||||
if (subquery.joined_block_actions)
|
||||
subquery.joined_block_actions->execute(block);
|
||||
|
@ -85,24 +85,15 @@ void DistinctBlockInputStream::buildFilter(
|
||||
size_t rows,
|
||||
SetVariants & variants) const
|
||||
{
|
||||
typename Method::State state;
|
||||
state.init(columns);
|
||||
typename Method::State state(columns, key_sizes, nullptr);
|
||||
|
||||
for (size_t i = 0; i < rows; ++i)
|
||||
{
|
||||
/// Make a key.
|
||||
typename Method::Key key = state.getKey(columns, columns.size(), i, key_sizes);
|
||||
|
||||
typename Method::Data::iterator it;
|
||||
bool inserted;
|
||||
method.data.emplace(key, it, inserted);
|
||||
|
||||
if (inserted)
|
||||
method.onNewKey(*it, columns.size(), variants.string_pool);
|
||||
auto emplace_result = state.emplaceKey(method.data, i, variants.string_pool);
|
||||
|
||||
/// Emit the record if there is no such key in the current set yet.
|
||||
/// Skip it otherwise.
|
||||
filter[i] = inserted;
|
||||
filter[i] = emplace_result.isInserted();
|
||||
}
|
||||
}
|
||||
|
||||
|
@ -85,8 +85,7 @@ bool DistinctSortedBlockInputStream::buildFilter(
|
||||
size_t rows,
|
||||
ClearableSetVariants & variants) const
|
||||
{
|
||||
typename Method::State state;
|
||||
state.init(columns);
|
||||
typename Method::State state(columns, key_sizes, nullptr);
|
||||
|
||||
/// Compare last row of previous block and first row of current block,
|
||||
/// If rows not equal, we can clear HashSet,
|
||||
@ -106,21 +105,14 @@ bool DistinctSortedBlockInputStream::buildFilter(
|
||||
if (i > 0 && !clearing_hint_columns.empty() && !rowsEqual(clearing_hint_columns, i, clearing_hint_columns, i - 1))
|
||||
method.data.clear();
|
||||
|
||||
/// Make a key.
|
||||
typename Method::Key key = state.getKey(columns, columns.size(), i, key_sizes);
|
||||
typename Method::Data::iterator it = method.data.find(key);
|
||||
bool inserted;
|
||||
method.data.emplace(key, it, inserted);
|
||||
auto emplace_result = state.emplaceKey(method.data, i, variants.string_pool);
|
||||
|
||||
if (inserted)
|
||||
{
|
||||
method.onNewKey(*it, columns.size(), variants.string_pool);
|
||||
if (emplace_result.isInserted())
|
||||
has_new_data = true;
|
||||
}
|
||||
|
||||
/// Emit the record if there is no such key in the current set yet.
|
||||
/// Skip it otherwise.
|
||||
filter[i] = inserted;
|
||||
filter[i] = emplace_result.isInserted();
|
||||
}
|
||||
return has_new_data;
|
||||
}
|
||||
|
@ -96,6 +96,13 @@ Block IBlockInputStream::read()
|
||||
|
||||
void IBlockInputStream::readPrefix()
|
||||
{
|
||||
#ifndef NDEBUG
|
||||
if (!read_prefix_is_called)
|
||||
read_prefix_is_called = true;
|
||||
else
|
||||
throw Exception("readPrefix is called twice for " + getName() + " stream", ErrorCodes::LOGICAL_ERROR);
|
||||
#endif
|
||||
|
||||
readPrefixImpl();
|
||||
|
||||
forEachChild([&] (IBlockInputStream & child)
|
||||
@ -108,6 +115,13 @@ void IBlockInputStream::readPrefix()
|
||||
|
||||
void IBlockInputStream::readSuffix()
|
||||
{
|
||||
#ifndef NDEBUG
|
||||
if (!read_suffix_is_called)
|
||||
read_suffix_is_called = true;
|
||||
else
|
||||
throw Exception("readSuffix is called twice for " + getName() + " stream", ErrorCodes::LOGICAL_ERROR);
|
||||
#endif
|
||||
|
||||
forEachChild([&] (IBlockInputStream & child)
|
||||
{
|
||||
child.readSuffix();
|
||||
|
@ -314,6 +314,11 @@ private:
|
||||
if (f(*child))
|
||||
return;
|
||||
}
|
||||
|
||||
#ifndef NDEBUG
|
||||
bool read_prefix_is_called = false;
|
||||
bool read_suffix_is_called = false;
|
||||
#endif
|
||||
};
|
||||
|
||||
}
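The double-call guard added to `readPrefix` and `readSuffix` follows a simple pattern: a flag is set on the first call and a second call raises a logic error. The sketch below illustrates it in isolation. It is an assumption-laden stand-in, not the real class: `GuardedStreamSketch` is a hypothetical name, the guard is always on rather than compiled only when `NDEBUG` is undefined, and it throws `std::logic_error` instead of `Exception` with `ErrorCodes::LOGICAL_ERROR`.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a stream whose readPrefix() must run at most once.
class GuardedStreamSketch
{
public:
    void readPrefix()
    {
        if (read_prefix_is_called)
            throw std::logic_error("readPrefix is called twice for " + name + " stream");
        read_prefix_is_called = true;

        ++prefix_calls; // stands in for the real prefix work
    }

    int prefixCalls() const { return prefix_calls; }

private:
    std::string name = "Sketch";
    bool read_prefix_is_called = false;
    int prefix_calls = 0;
};
```

Streams that invoke the prefix or suffix themselves (as several hunks in this commit do by overriding the method with an empty body) avoid tripping such a guard when an outer caller repeats the call.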
|
||||
|
@ -32,6 +32,9 @@ public:
|
||||
return header;
|
||||
}
|
||||
|
||||
/// We call readPrefix lazily. Suppress default behaviour.
|
||||
void readPrefix() override {}
|
||||
|
||||
protected:
|
||||
Block readImpl() override
|
||||
{
|
||||
|
@ -157,7 +157,7 @@ protected:
|
||||
using QueueWithCollation = std::priority_queue<SortCursorWithCollation>;
|
||||
QueueWithCollation queue_with_collation;
|
||||
|
||||
/// Used in Vertical merge algorithm to gather non-PK columns (on next step)
|
||||
/// Used in Vertical merge algorithm to gather non-PK/non-index columns (on next step)
|
||||
/// If it is not nullptr then it should be populated during execution
|
||||
WriteBuffer * out_row_sources_buf;
|
||||
|
||||
|
@ -26,6 +26,10 @@ public:
|
||||
children.push_back(input_);
|
||||
}
|
||||
|
||||
/// Suppress readPrefix and readSuffix, because they are called by copyData.
|
||||
void readPrefix() override {}
|
||||
void readSuffix() override {}
|
||||
|
||||
String getName() const override { return "NullAndDoCopy"; }
|
||||
|
||||
Block getHeader() const override { return {}; }
|
||||
|
@ -183,7 +183,8 @@ private:
|
||||
try
|
||||
{
|
||||
setThreadName("ParalInputsProc");
|
||||
CurrentThread::attachTo(thread_group);
|
||||
if (thread_group)
|
||||
CurrentThread::attachTo(thread_group);
|
||||
|
||||
while (!finish)
|
||||
{
|
||||
|
@ -32,7 +32,7 @@ namespace ErrorCodes
|
||||
}
|
||||
|
||||
|
||||
std::string DataTypeAggregateFunction::getName() const
|
||||
std::string DataTypeAggregateFunction::doGetName() const
|
||||
{
|
||||
std::stringstream stream;
|
||||
stream << "AggregateFunction(" << function->getName();
|
||||
|
@ -29,7 +29,7 @@ public:
|
||||
std::string getFunctionName() const { return function->getName(); }
|
||||
AggregateFunctionPtr getFunction() const { return function; }
|
||||
|
||||
std::string getName() const override;
|
||||
std::string doGetName() const override;
|
||||
const char * getFamilyName() const override { return "AggregateFunction"; }
|
||||
TypeIndex getTypeId() const override { return TypeIndex::AggregateFunction; }
|
||||
|
||||
|
@ -350,7 +350,7 @@ void DataTypeArray::serializeText(const IColumn & column, size_t row_num, WriteB
|
||||
serializeTextImpl(column, row_num, ostr,
|
||||
[&](const IColumn & nested_column, size_t i)
|
||||
{
|
||||
nested->serializeTextQuoted(nested_column, i, ostr, settings);
|
||||
nested->serializeAsTextQuoted(nested_column, i, ostr, settings);
|
||||
});
|
||||
}
|
||||
|
||||
@ -360,7 +360,7 @@ void DataTypeArray::deserializeText(IColumn & column, ReadBuffer & istr, const F
|
||||
deserializeTextImpl(column, istr,
|
||||
[&](IColumn & nested_column)
|
||||
{
|
||||
nested->deserializeTextQuoted(nested_column, istr, settings);
|
||||
nested->deserializeAsTextQuoted(nested_column, istr, settings);
|
||||
});
|
||||
}
|
||||
|
||||
@ -379,7 +379,7 @@ void DataTypeArray::serializeTextJSON(const IColumn & column, size_t row_num, Wr
|
||||
{
|
||||
if (i != offset)
|
||||
writeChar(',', ostr);
|
||||
nested->serializeTextJSON(nested_column, i, ostr, settings);
|
||||
nested->serializeAsTextJSON(nested_column, i, ostr, settings);
|
||||
}
|
||||
writeChar(']', ostr);
|
||||
}
|
||||
@ -387,7 +387,7 @@ void DataTypeArray::serializeTextJSON(const IColumn & column, size_t row_num, Wr
|
||||
|
||||
void DataTypeArray::deserializeTextJSON(IColumn & column, ReadBuffer & istr, const FormatSettings & settings) const
|
||||
{
|
||||
deserializeTextImpl(column, istr, [&](IColumn & nested_column) { nested->deserializeTextJSON(nested_column, istr, settings); });
|
||||
deserializeTextImpl(column, istr, [&](IColumn & nested_column) { nested->deserializeAsTextJSON(nested_column, istr, settings); });
|
||||
}
|
||||
|
||||
|
||||
@ -405,7 +405,7 @@ void DataTypeArray::serializeTextXML(const IColumn & column, size_t row_num, Wri
|
||||
for (size_t i = offset; i < next_offset; ++i)
|
||||
{
|
||||
writeCString("<elem>", ostr);
|
||||
nested->serializeTextXML(nested_column, i, ostr, settings);
|
||||
nested->serializeAsTextXML(nested_column, i, ostr, settings);
|
||||
writeCString("</elem>", ostr);
|
||||
}
|
||||
writeCString("</array>", ostr);
|
||||
|
@ -20,7 +20,7 @@ public:
|
||||
|
||||
TypeIndex getTypeId() const override { return TypeIndex::Array; }
|
||||
|
||||
std::string getName() const override
|
||||
std::string doGetName() const override
|
||||
{
|
||||
return "Array(" + nested->getName() + ")";
|
||||
}
|
||||
|
@ -26,7 +26,7 @@ DataTypeDateTime::DataTypeDateTime(const std::string & time_zone_name)
|
||||
{
|
||||
}
|
||||
|
||||
std::string DataTypeDateTime::getName() const
|
||||
std::string DataTypeDateTime::doGetName() const
|
||||
{
|
||||
if (!has_explicit_time_zone)
|
||||
return "DateTime";
|
||||
|
@ -34,7 +34,7 @@ public:
|
||||
DataTypeDateTime(const std::string & time_zone_name = "");
|
||||
|
||||
const char * getFamilyName() const override { return "DateTime"; }
|
||||
std::string getName() const override;
|
||||
std::string doGetName() const override;
|
||||
TypeIndex getTypeId() const override { return TypeIndex::DateTime; }
|
||||
|
||||
void serializeText(const IColumn & column, size_t row_num, WriteBuffer & ostr, const FormatSettings &) const override;
|
||||
|
118
dbms/src/DataTypes/DataTypeDomainIPv4AndIPv6.cpp
Normal file
@ -0,0 +1,118 @@
|
||||
#include <Columns/ColumnsNumber.h>
|
||||
#include <Common/Exception.h>
|
||||
#include <Common/formatIPv6.h>
|
||||
#include <DataTypes/DataTypeDomainWithSimpleSerialization.h>
|
||||
#include <DataTypes/DataTypeFactory.h>
|
||||
#include <DataTypes/IDataTypeDomain.h>
|
||||
#include <Functions/FunctionHelpers.h>
|
||||
#include <Functions/FunctionsCoding.h>
|
||||
|
||||
namespace DB
|
||||
{
|
||||
|
||||
namespace ErrorCodes
|
||||
{
|
||||
extern const int ILLEGAL_COLUMN;
|
||||
extern const int UNSUPPORTED_METHOD;
|
||||
extern const int CANNOT_PARSE_DOMAIN_VALUE_FROM_STRING;
|
||||
}
|
||||
|
||||
namespace
|
||||
{
|
||||
|
||||
class DataTypeDomanIPv4 : public DataTypeDomainWithSimpleSerialization
|
||||
{
|
||||
public:
|
||||
const char * getName() const override
|
||||
{
|
||||
return "IPv4";
|
||||
}
|
||||
|
||||
void serializeText(const IColumn & column, size_t row_num, WriteBuffer & ostr, const FormatSettings &) const override
|
||||
{
|
||||
const auto col = checkAndGetColumn<ColumnUInt32>(&column);
|
||||
if (!col)
|
||||
{
|
||||
throw Exception(String(getName()) + " domain can only serialize columns of type UInt32." + column.getName(), ErrorCodes::ILLEGAL_COLUMN);
|
||||
}
|
||||
|
||||
char buffer[IPV4_MAX_TEXT_LENGTH + 1] = {'\0'};
|
||||
char * ptr = buffer;
|
||||
formatIPv4(reinterpret_cast<const unsigned char *>(&col->getData()[row_num]), ptr);
|
||||
|
||||
ostr.write(buffer, strlen(buffer));
|
||||
}
|
||||
|
||||
void deserializeText(IColumn & column, ReadBuffer & istr, const FormatSettings &) const override
|
||||
{
|
||||
ColumnUInt32 * col = typeid_cast<ColumnUInt32 *>(&column);
|
||||
if (!col)
|
||||
{
|
||||
throw Exception(String(getName()) + " domain can only deserialize columns of type UInt32." + column.getName(), ErrorCodes::ILLEGAL_COLUMN);
|
||||
}
|
||||
|
||||
char buffer[IPV4_MAX_TEXT_LENGTH + 1] = {'\0'};
|
||||
istr.read(buffer, sizeof(buffer) - 1);
|
||||
UInt32 ipv4_value = 0;
|
||||
if (!parseIPv4(buffer, reinterpret_cast<unsigned char *>(&ipv4_value)))
|
||||
{
|
||||
throw Exception("Invalid IPv4 value.", ErrorCodes::CANNOT_PARSE_DOMAIN_VALUE_FROM_STRING);
|
||||
}
|
||||
|
||||
col->insert(ipv4_value);
|
||||
}
|
||||
};
|
||||
|
||||
class DataTypeDomanIPv6 : public DataTypeDomainWithSimpleSerialization
|
||||
{
|
||||
public:
|
||||
const char * getName() const override
|
||||
{
|
||||
return "IPv6";
|
||||
}
|
||||
|
||||
void serializeText(const IColumn & column, size_t row_num, WriteBuffer & ostr, const FormatSettings &) const override
|
||||
{
|
||||
const auto col = checkAndGetColumn<ColumnFixedString>(&column);
|
||||
if (!col)
|
||||
{
|
||||
throw Exception(String(getName()) + " domain can only serialize columns of type FixedString(16)." + column.getName(), ErrorCodes::ILLEGAL_COLUMN);
|
||||
}
|
||||
|
||||
char buffer[IPV6_MAX_TEXT_LENGTH + 1] = {'\0'};
|
||||
char * ptr = buffer;
|
||||
formatIPv6(reinterpret_cast<const unsigned char *>(col->getDataAt(row_num).data), ptr);
|
||||
|
||||
ostr.write(buffer, strlen(buffer));
|
||||
}
|
||||
|
||||
void deserializeText(IColumn & column, ReadBuffer & istr, const FormatSettings &) const override
|
||||
{
|
||||
ColumnFixedString * col = typeid_cast<ColumnFixedString *>(&column);
|
||||
if (!col)
|
||||
{
|
||||
throw Exception(String(getName()) + " domain can only deserialize columns of type FixedString(16)." + column.getName(), ErrorCodes::ILLEGAL_COLUMN);
|
||||
}
|
||||
|
||||
char buffer[IPV6_MAX_TEXT_LENGTH + 1] = {'\0'};
|
||||
istr.read(buffer, sizeof(buffer) - 1);
|
||||
|
||||
std::string ipv6_value(IPV6_BINARY_LENGTH, '\0');
|
||||
if (!parseIPv6(buffer, reinterpret_cast<unsigned char *>(ipv6_value.data())))
|
||||
{
|
||||
throw Exception(String("Invalid ") + getName() + " value.", ErrorCodes::CANNOT_PARSE_DOMAIN_VALUE_FROM_STRING);
|
||||
}
|
||||
|
||||
col->insertString(ipv6_value);
|
||||
}
|
||||
};
|
||||
|
||||
} // namespace
|
||||
|
||||
void registerDataTypeDomainIPv4AndIPv6(DataTypeFactory & factory)
|
||||
{
|
||||
factory.registerDataTypeDomain("UInt32", std::make_unique<DataTypeDomanIPv4>());
|
||||
factory.registerDataTypeDomain("FixedString(16)", std::make_unique<DataTypeDomanIPv6>());
|
||||
}
|
||||
|
||||
} // namespace DB
|
88
dbms/src/DataTypes/DataTypeDomainWithSimpleSerialization.cpp
Normal file
@ -0,0 +1,88 @@
|
||||
#include <DataTypes/DataTypeDomainWithSimpleSerialization.h>

#include <IO/ReadBufferFromString.h>
#include <IO/ReadHelpers.h>
#include <IO/WriteBufferFromString.h>
#include <IO/WriteHelpers.h>

namespace
{
using namespace DB;

static String serializeToString(const DataTypeDomainWithSimpleSerialization & domain, const IColumn & column, size_t row_num, const FormatSettings & settings)
{
    WriteBufferFromOwnString buffer;
    domain.serializeText(column, row_num, buffer, settings);

    return buffer.str();
}

static void deserializeFromString(const DataTypeDomainWithSimpleSerialization & domain, IColumn & column, const String & s, const FormatSettings & settings)
{
    ReadBufferFromString istr(s);
    domain.deserializeText(column, istr, settings);
}

} // namespace

namespace DB
{

DataTypeDomainWithSimpleSerialization::~DataTypeDomainWithSimpleSerialization()
{
}

void DataTypeDomainWithSimpleSerialization::serializeTextEscaped(const IColumn & column, size_t row_num, WriteBuffer & ostr, const FormatSettings & settings) const
{
    writeEscapedString(serializeToString(*this, column, row_num, settings), ostr);
}

void DataTypeDomainWithSimpleSerialization::deserializeTextEscaped(IColumn & column, ReadBuffer & istr, const FormatSettings & settings) const
{
    String str;
    readEscapedString(str, istr);
    deserializeFromString(*this, column, str, settings);
}

void DataTypeDomainWithSimpleSerialization::serializeTextQuoted(const IColumn & column, size_t row_num, WriteBuffer & ostr, const FormatSettings & settings) const
{
    writeQuotedString(serializeToString(*this, column, row_num, settings), ostr);
}

void DataTypeDomainWithSimpleSerialization::deserializeTextQuoted(IColumn & column, ReadBuffer & istr, const FormatSettings & settings) const
{
    String str;
    readQuotedString(str, istr);
    deserializeFromString(*this, column, str, settings);
}

void DataTypeDomainWithSimpleSerialization::serializeTextCSV(const IColumn & column, size_t row_num, WriteBuffer & ostr, const FormatSettings & settings) const
{
    writeCSVString(serializeToString(*this, column, row_num, settings), ostr);
}

void DataTypeDomainWithSimpleSerialization::deserializeTextCSV(IColumn & column, ReadBuffer & istr, const FormatSettings & settings) const
{
    String str;
    readCSVString(str, istr, settings.csv);
    deserializeFromString(*this, column, str, settings);
}

void DataTypeDomainWithSimpleSerialization::serializeTextJSON(const IColumn & column, size_t row_num, WriteBuffer & ostr, const FormatSettings & settings) const
{
    writeJSONString(serializeToString(*this, column, row_num, settings), ostr, settings);
}

void DataTypeDomainWithSimpleSerialization::deserializeTextJSON(IColumn & column, ReadBuffer & istr, const FormatSettings & settings) const
{
    String str;
    readJSONString(str, istr);
    deserializeFromString(*this, column, str, settings);
}

void DataTypeDomainWithSimpleSerialization::serializeTextXML(const IColumn & column, size_t row_num, WriteBuffer & ostr, const FormatSettings & settings) const
{
    writeXMLString(serializeToString(*this, column, row_num, settings), ostr);
}

} // namespace DB
|
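Review note: the file above derives every format-specific method from the single plain-text pair `serializeText`/`deserializeText` by rendering to a string first and then applying the format's quoting. A minimal sketch of that pattern follows, assuming standard streams instead of ClickHouse's `WriteBuffer` and a deliberately simplified escaping that does not claim to match `writeQuotedString` or `writeCSVString`. The class names `SimpleTextDomain` and `DottedQuadDomain` are hypothetical.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for a domain: subclasses supply only the plain-text form.
class SimpleTextDomain
{
public:
    virtual ~SimpleTextDomain() = default;

    virtual void serializeText(unsigned value, std::ostream & out) const = 0;

    // Quoted form: render the plain text, then wrap it in single quotes
    // and backslash-escape quotes and backslashes (simplified rules).
    void serializeTextQuoted(unsigned value, std::ostream & out) const
    {
        out << '\'';
        for (char c : toString(value))
        {
            if (c == '\'' || c == '\\')
                out << '\\';
            out << c;
        }
        out << '\'';
    }

    // CSV form: wrap in double quotes and double any embedded double quote.
    void serializeTextCSV(unsigned value, std::ostream & out) const
    {
        out << '"';
        for (char c : toString(value))
        {
            if (c == '"')
                out << '"';
            out << c;
        }
        out << '"';
    }

private:
    // Same role as serializeToString() in the file above.
    std::string toString(unsigned value) const
    {
        std::ostringstream buffer;
        serializeText(value, buffer);
        return buffer.str();
    }
};

// Example subclass: dotted-quad rendering of a 32-bit value, most significant byte first.
class DottedQuadDomain : public SimpleTextDomain
{
public:
    void serializeText(unsigned value, std::ostream & out) const override
    {
        out << ((value >> 24) & 0xFF) << '.' << ((value >> 16) & 0xFF) << '.'
            << ((value >> 8) & 0xFF) << '.' << (value & 0xFF);
    }
};
```

The cost of this design is one temporary string per value; in exchange a new domain only has to implement the plain-text pair.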
53
dbms/src/DataTypes/DataTypeDomainWithSimpleSerialization.h
Normal file
@ -0,0 +1,53 @@
|
||||
#pragma once

#include <DataTypes/IDataTypeDomain.h>

namespace DB
{

class ReadBuffer;
class WriteBuffer;
struct FormatSettings;
class IColumn;

/** Simple DataTypeDomain that uses serializeText/deserializeText
  * for all serialization and deserialization. */
class DataTypeDomainWithSimpleSerialization : public IDataTypeDomain
{
public:
    virtual ~DataTypeDomainWithSimpleSerialization() override;

    // Methods that subclasses must override in order to get full serialization/deserialization support.
    virtual void serializeText(const IColumn & column, size_t row_num, WriteBuffer & ostr, const FormatSettings &) const override = 0;
    virtual void deserializeText(IColumn & column, ReadBuffer & istr, const FormatSettings &) const = 0;

    /** Text serialization with escaping but without quoting.
      */
    void serializeTextEscaped(const IColumn & column, size_t row_num, WriteBuffer & ostr, const FormatSettings &) const override;
    void deserializeTextEscaped(IColumn & column, ReadBuffer & istr, const FormatSettings &) const override;

    /** Text serialization as a literal that may be inserted into a query.
      */
    void serializeTextQuoted(const IColumn & column, size_t row_num, WriteBuffer & ostr, const FormatSettings &) const override;
    void deserializeTextQuoted(IColumn & column, ReadBuffer & istr, const FormatSettings &) const override;

    /** Text serialization for the CSV format.
      */
    void serializeTextCSV(const IColumn & column, size_t row_num, WriteBuffer & ostr, const FormatSettings &) const override;
    /** The delimiter (taken from the CSV format settings) is the one expected when reading
      * a string value that is not double-quoted; the delimiter itself is not consumed.
      */
    void deserializeTextCSV(IColumn & column, ReadBuffer & istr, const FormatSettings &) const override;

    /** Text serialization intended for use in the JSON format.
      * The force_quoting_64bit_integers parameter forces UInt64 and Int64 values to be enclosed in quotes.
      */
    void serializeTextJSON(const IColumn & column, size_t row_num, WriteBuffer & ostr, const FormatSettings &) const override;
    void deserializeTextJSON(IColumn & column, ReadBuffer & istr, const FormatSettings &) const override;

    /** Text serialization for putting into the XML format.
      */
    void serializeTextXML(const IColumn & column, size_t row_num, WriteBuffer & ostr, const FormatSettings & settings) const override;
};

} // namespace DB
|
@ -61,7 +61,7 @@ public:
|
||||
explicit DataTypeEnum(const Values & values_);
|
||||
|
||||
const Values & getValues() const { return values; }
|
||||
std::string getName() const override { return type_name; }
|
||||
std::string doGetName() const override { return type_name; }
|
||||
const char * getFamilyName() const override;
|
||||
|
||||
TypeIndex getTypeId() const override { return sizeof(FieldType) == 1 ? TypeIndex::Enum8 : TypeIndex::Enum16; }
|
||||
|
@ -1,4 +1,5 @@
|
||||
#include <DataTypes/DataTypeFactory.h>
|
||||
#include <DataTypes/IDataTypeDomain.h>
|
||||
#include <Parsers/parseQuery.h>
|
||||
#include <Parsers/ParserCreateQuery.h>
|
||||
#include <Parsers/ASTFunction.h>
|
||||
@ -7,7 +8,7 @@
|
||||
#include <Common/typeid_cast.h>
|
||||
#include <Poco/String.h>
|
||||
#include <Common/StringUtils/StringUtils.h>
|
||||
|
||||
#include <IO/WriteHelpers.h>
|
||||
|
||||
namespace DB
|
||||
{
|
||||
@ -73,21 +74,7 @@ DataTypePtr DataTypeFactory::get(const String & family_name_param, const ASTPtr
|
||||
return get("LowCardinality", low_cardinality_params);
|
||||
}
|
||||
|
||||
{
|
||||
DataTypesDictionary::const_iterator it = data_types.find(family_name);
|
||||
if (data_types.end() != it)
|
||||
return it->second(parameters);
|
||||
}
|
||||
|
||||
String family_name_lowercase = Poco::toLower(family_name);
|
||||
|
||||
{
|
||||
DataTypesDictionary::const_iterator it = case_insensitive_data_types.find(family_name_lowercase);
|
||||
if (case_insensitive_data_types.end() != it)
|
||||
return it->second(parameters);
|
||||
}
|
||||
|
||||
throw Exception("Unknown data type family: " + family_name, ErrorCodes::UNKNOWN_TYPE);
|
||||
return findCreatorByName(family_name)(parameters);
|
||||
}
|
||||
|
||||
|
||||
@ -128,6 +115,49 @@ void DataTypeFactory::registerSimpleDataType(const String & name, SimpleCreator
|
||||
}, case_sensitiveness);
|
||||
}
|
||||
|
||||
void DataTypeFactory::registerDataTypeDomain(const String & type_name, DataTypeDomainPtr domain, CaseSensitiveness case_sensitiveness)
{
    all_domains.reserve(all_domains.size() + 1);

    auto data_type = get(type_name);
    setDataTypeDomain(*data_type, *domain);

    registerDataType(domain->getName(), [data_type](const ASTPtr & /*ast*/)
    {
        return data_type;
    }, case_sensitiveness);

    all_domains.emplace_back(std::move(domain));
}

const DataTypeFactory::Creator& DataTypeFactory::findCreatorByName(const String & family_name) const
{
    {
        DataTypesDictionary::const_iterator it = data_types.find(family_name);
        if (data_types.end() != it)
            return it->second;
    }

    String family_name_lowercase = Poco::toLower(family_name);

    {
        DataTypesDictionary::const_iterator it = case_insensitive_data_types.find(family_name_lowercase);
        if (case_insensitive_data_types.end() != it)
            return it->second;
    }

    auto hints = this->getHints(family_name);
    if (!hints.empty())
        throw Exception("Unknown data type family: " + family_name + ". Maybe you meant: " + toString(hints), ErrorCodes::UNKNOWN_TYPE);
    else
        throw Exception("Unknown data type family: " + family_name, ErrorCodes::UNKNOWN_TYPE);
}

void DataTypeFactory::setDataTypeDomain(const IDataType & data_type, const IDataTypeDomain & domain)
{
    data_type.setDomain(&domain);
}
|
||||
|
||||
void registerDataTypeNumbers(DataTypeFactory & factory);
|
||||
void registerDataTypeDecimal(DataTypeFactory & factory);
|
||||
void registerDataTypeDate(DataTypeFactory & factory);
|
||||
@ -144,6 +174,7 @@ void registerDataTypeAggregateFunction(DataTypeFactory & factory);
|
||||
void registerDataTypeNested(DataTypeFactory & factory);
|
||||
void registerDataTypeInterval(DataTypeFactory & factory);
|
||||
void registerDataTypeLowCardinality(DataTypeFactory & factory);
|
||||
void registerDataTypeDomainIPv4AndIPv6(DataTypeFactory & factory);
|
||||
|
||||
|
||||
DataTypeFactory::DataTypeFactory()
|
||||
@ -164,6 +195,10 @@ DataTypeFactory::DataTypeFactory()
|
||||
registerDataTypeNested(*this);
|
||||
registerDataTypeInterval(*this);
|
||||
registerDataTypeLowCardinality(*this);
|
||||
registerDataTypeDomainIPv4AndIPv6(*this);
|
||||
}
|
||||
|
||||
DataTypeFactory::~DataTypeFactory()
|
||||
{}
|
||||
|
||||
}
|
||||
|
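Review note: `findCreatorByName` in the hunk above resolves a type name in two stages, first by exact match and then by lowercased name among the types registered as case-insensitive, and throws when neither succeeds. The sketch below reproduces only that lookup order with standard containers. `TinyFactory`, `registerType` and the string-returning `Creator` are hypothetical simplifications, and the "Maybe you meant" hints from the commit are left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <functional>
#include <stdexcept>
#include <string>
#include <unordered_map>

// Hypothetical creator: returns a description instead of building a data type.
using Creator = std::function<std::string()>;

class TinyFactory
{
public:
    void registerType(const std::string & name, Creator creator, bool case_insensitive = false)
    {
        exact.emplace(name, creator);
        // Case-insensitive types are additionally stored under their lowercased name.
        if (case_insensitive)
            lowercase.emplace(toLower(name), std::move(creator));
    }

    const Creator & findCreatorByName(const std::string & name) const
    {
        // Stage 1: exact, case-sensitive match.
        auto it = exact.find(name);
        if (it != exact.end())
            return it->second;

        // Stage 2: lowercased match among case-insensitive registrations.
        auto lower_it = lowercase.find(toLower(name));
        if (lower_it != lowercase.end())
            return lower_it->second;

        throw std::runtime_error("Unknown data type family: " + name);
    }

private:
    static std::string toLower(std::string s)
    {
        std::transform(s.begin(), s.end(), s.begin(),
                       [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
        return s;
    }

    std::unordered_map<std::string, Creator> exact;
    std::unordered_map<std::string, Creator> lowercase;
};
```

Returning a reference to the stored creator, as the commit does, lets `get` invoke it with parameters without copying the callable.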
@ -14,6 +14,9 @@ namespace DB
|
||||
class IDataType;
|
||||
using DataTypePtr = std::shared_ptr<const IDataType>;
|
||||
|
||||
class IDataTypeDomain;
|
||||
using DataTypeDomainPtr = std::unique_ptr<const IDataTypeDomain>;
|
||||
|
||||
class IAST;
|
||||
using ASTPtr = std::shared_ptr<IAST>;
|
||||
|
||||
@ -37,13 +40,24 @@ public:
|
||||
/// Register a simple data type that has no parameters.
|
||||
void registerSimpleDataType(const String & name, SimpleCreator creator, CaseSensitiveness case_sensitiveness = CaseSensitive);
|
||||
|
||||
// Register a domain: a refinement of an existing type.
|
||||
void registerDataTypeDomain(const String & type_name, DataTypeDomainPtr domain, CaseSensitiveness case_sensitiveness = CaseSensitive);
|
||||
|
||||
private:
|
||||
static void setDataTypeDomain(const IDataType & data_type, const IDataTypeDomain & domain);
|
||||
const Creator& findCreatorByName(const String & family_name) const;
|
||||
|
||||
private:
|
||||
DataTypesDictionary data_types;
|
||||
|
||||
/// Case insensitive data types will be additionally added here with lowercased name.
|
||||
DataTypesDictionary case_insensitive_data_types;
|
||||
|
||||
// All domains are owned by the factory and shared amongst DataType instances.
|
||||
std::vector<DataTypeDomainPtr> all_domains;
|
||||
|
||||
DataTypeFactory();
|
||||
~DataTypeFactory() override;
|
||||
|
||||
const DataTypesDictionary & getCreatorMap() const override { return data_types; }
|
||||
|
||||
|
@ -32,7 +32,7 @@ namespace ErrorCodes
|
||||
}
|
||||
|
||||
|
||||
std::string DataTypeFixedString::getName() const
|
||||
std::string DataTypeFixedString::doGetName() const
|
||||
{
|
||||
return "FixedString(" + toString(n) + ")";
|
||||
}
|
||||
|
@ -30,7 +30,7 @@ public:
|
||||
throw Exception("FixedString size is too large", ErrorCodes::ARGUMENT_OUT_OF_BOUND);
|
||||
}
|
||||
|
||||
std::string getName() const override;
|
||||
std::string doGetName() const override;
|
||||
TypeIndex getTypeId() const override { return TypeIndex::FixedString; }
|
||||
|
||||
const char * getFamilyName() const override { return "FixedString"; }
|
||||
|
@ -6,7 +6,7 @@
|
||||
namespace DB
|
||||
{
|
||||
|
||||
std::string DataTypeFunction::getName() const
|
||||
std::string DataTypeFunction::doGetName() const
|
||||
{
|
||||
WriteBufferFromOwnString res;
|
||||
|
||||
|
@ -22,7 +22,7 @@ public:
|
||||
DataTypeFunction(const DataTypes & argument_types_ = DataTypes(), const DataTypePtr & return_type_ = nullptr)
|
||||
: argument_types(argument_types_), return_type(return_type_) {}
|
||||
|
||||
std::string getName() const override;
|
||||
std::string doGetName() const override;
|
||||
const char * getFamilyName() const override { return "Function"; }
|
||||
TypeIndex getTypeId() const override { return TypeIndex::Function; }
|
||||
|
||||
|
@ -56,7 +56,7 @@ public:
|
||||
|
||||
DataTypeInterval(Kind kind) : kind(kind) {}
|
||||
|
||||
std::string getName() const override { return std::string("Interval") + kindToString(); }
|
||||
std::string doGetName() const override { return std::string("Interval") + kindToString(); }
|
||||
const char * getFamilyName() const override { return "Interval"; }
|
||||
TypeIndex getTypeId() const override { return TypeIndex::Interval; }
|
||||
|
||||
|
@ -15,7 +15,7 @@ public:
|
||||
|
||||
const DataTypePtr & getDictionaryType() const { return dictionary_type; }
|
||||
|
||||
String getName() const override
|
||||
String doGetName() const override
|
||||
{
|
||||
return "LowCardinality(" + dictionary_type->getName() + ")";
|
||||
}
|
||||
@ -63,51 +63,51 @@ public:
|
||||
|
||||
void serializeTextEscaped(const IColumn & column, size_t row_num, WriteBuffer & ostr, const FormatSettings & settings) const override
|
||||
{
|
||||
serializeImpl(column, row_num, &IDataType::serializeTextEscaped, ostr, settings);
|
||||
serializeImpl(column, row_num, &IDataType::serializeAsTextEscaped, ostr, settings);
|
||||
}
|
||||
|
||||
void deserializeTextEscaped(IColumn & column, ReadBuffer & istr, const FormatSettings & settings) const override
|
||||
{
|
||||
deserializeImpl(column, &IDataType::deserializeTextEscaped, istr, settings);
|
||||
deserializeImpl(column, &IDataType::deserializeAsTextEscaped, istr, settings);
|
||||
}
|
||||
|
||||
void serializeTextQuoted(const IColumn & column, size_t row_num, WriteBuffer & ostr, const FormatSettings & settings) const override
|
||||
{
|
||||
serializeImpl(column, row_num, &IDataType::serializeTextQuoted, ostr, settings);
|
||||
serializeImpl(column, row_num, &IDataType::serializeAsTextQuoted, ostr, settings);
|
||||
}
|
||||
|
||||
void deserializeTextQuoted(IColumn & column, ReadBuffer & istr, const FormatSettings & settings) const override
|
||||
{
|
||||
deserializeImpl(column, &IDataType::deserializeTextQuoted, istr, settings);
|
||||
deserializeImpl(column, &IDataType::deserializeAsTextQuoted, istr, settings);
|
||||
}
|
||||
|
||||
void serializeTextCSV(const IColumn & column, size_t row_num, WriteBuffer & ostr, const FormatSettings & settings) const override
|
||||
{
|
||||
serializeImpl(column, row_num, &IDataType::serializeTextCSV, ostr, settings);
|
||||
serializeImpl(column, row_num, &IDataType::serializeAsTextCSV, ostr, settings);
|
||||
}
|
||||
|
||||
void deserializeTextCSV(IColumn & column, ReadBuffer & istr, const FormatSettings & settings) const override
|
||||
{
|
||||
deserializeImpl(column, &IDataType::deserializeTextCSV, istr, settings);
|
||||
deserializeImpl(column, &IDataType::deserializeAsTextCSV, istr, settings);
|
||||
}
|
||||
|
||||
void serializeText(const IColumn & column, size_t row_num, WriteBuffer & ostr, const FormatSettings & settings) const override
|
||||
{
|
||||
serializeImpl(column, row_num, &IDataType::serializeText, ostr, settings);
|
||||
serializeImpl(column, row_num, &IDataType::serializeAsText, ostr, settings);
|
||||
}
|
||||
|
||||
void serializeTextJSON(const IColumn & column, size_t row_num, WriteBuffer & ostr, const FormatSettings & settings) const override
|
||||
{
|
||||
serializeImpl(column, row_num, &IDataType::serializeTextJSON, ostr, settings);
|
||||
serializeImpl(column, row_num, &IDataType::serializeAsTextJSON, ostr, settings);
|
||||
}
|
||||
void deserializeTextJSON(IColumn & column, ReadBuffer & istr, const FormatSettings & settings) const override
|
||||
{
|
||||
deserializeImpl(column, &IDataType::deserializeTextJSON, istr, settings);
|
||||
deserializeImpl(column, &IDataType::deserializeAsTextJSON, istr, settings);
|
||||
}
|
||||
|
||||
void serializeTextXML(const IColumn & column, size_t row_num, WriteBuffer & ostr, const FormatSettings & settings) const override
|
||||
{
|
||||
serializeImpl(column, row_num, &IDataType::serializeTextXML, ostr, settings);
|
||||
serializeImpl(column, row_num, &IDataType::serializeAsTextXML, ostr, settings);
|
||||
}
|
||||
|
||||
void serializeProtobuf(const IColumn & column, size_t row_num, ProtobufWriter & protobuf) const override
|
||||
|
@ -172,7 +172,7 @@ void DataTypeNullable::serializeTextEscaped(const IColumn & column, size_t row_n
|
||||
if (col.isNullAt(row_num))
|
||||
writeCString("\\N", ostr);
|
||||
else
|
||||
nested_data_type->serializeTextEscaped(col.getNestedColumn(), row_num, ostr, settings);
|
||||
nested_data_type->serializeAsTextEscaped(col.getNestedColumn(), row_num, ostr, settings);
|
||||
}
|
||||
|
||||
|
||||
@ -188,7 +188,7 @@ void DataTypeNullable::deserializeTextEscaped(IColumn & column, ReadBuffer & ist
|
||||
{
|
||||
safeDeserialize(column,
|
||||
[] { return false; },
|
||||
[this, &istr, &settings] (IColumn & nested) { nested_data_type->deserializeTextEscaped(nested, istr, settings); });
|
||||
[this, &istr, &settings] (IColumn & nested) { nested_data_type->deserializeAsTextEscaped(nested, istr, settings); });
|
||||
}
|
||||
else
|
||||
{
|
||||
@ -214,7 +214,7 @@ void DataTypeNullable::deserializeTextEscaped(IColumn & column, ReadBuffer & ist
|
||||
{
|
||||
/// We could step back to consume backslash again.
|
||||
--istr.position();
|
||||
nested_data_type->deserializeTextEscaped(nested, istr, settings);
|
||||
nested_data_type->deserializeAsTextEscaped(nested, istr, settings);
|
||||
}
|
||||
else
|
||||
{
|
||||
@ -222,7 +222,7 @@ void DataTypeNullable::deserializeTextEscaped(IColumn & column, ReadBuffer & ist
|
||||
ReadBufferFromMemory prefix("\\", 1);
|
||||
ConcatReadBuffer prepended_istr(prefix, istr);
|
||||
|
||||
nested_data_type->deserializeTextEscaped(nested, prepended_istr, settings);
|
||||
nested_data_type->deserializeAsTextEscaped(nested, prepended_istr, settings);
|
||||
|
||||
/// Synchronise cursor position in original buffer.
|
||||
|
||||
@ -240,7 +240,7 @@ void DataTypeNullable::serializeTextQuoted(const IColumn & column, size_t row_nu
|
||||
if (col.isNullAt(row_num))
|
||||
writeCString("NULL", ostr);
|
||||
else
|
||||
nested_data_type->serializeTextQuoted(col.getNestedColumn(), row_num, ostr, settings);
|
||||
nested_data_type->serializeAsTextQuoted(col.getNestedColumn(), row_num, ostr, settings);
|
||||
}
|
||||
|
||||
|
||||
@ -248,7 +248,7 @@ void DataTypeNullable::deserializeTextQuoted(IColumn & column, ReadBuffer & istr
|
||||
{
|
||||
safeDeserialize(column,
|
||||
[&istr] { return checkStringByFirstCharacterAndAssertTheRestCaseInsensitive("NULL", istr); },
|
||||
[this, &istr, &settings] (IColumn & nested) { nested_data_type->deserializeTextQuoted(nested, istr, settings); });
|
||||
[this, &istr, &settings] (IColumn & nested) { nested_data_type->deserializeAsTextQuoted(nested, istr, settings); });
|
||||
}
|
||||
|
||||
void DataTypeNullable::serializeTextCSV(const IColumn & column, size_t row_num, WriteBuffer & ostr, const FormatSettings & settings) const
|
||||
@ -258,14 +258,14 @@ void DataTypeNullable::serializeTextCSV(const IColumn & column, size_t row_num,
|
||||
if (col.isNullAt(row_num))
|
||||
writeCString("\\N", ostr);
|
||||
else
|
||||
nested_data_type->serializeTextCSV(col.getNestedColumn(), row_num, ostr, settings);
|
||||
nested_data_type->serializeAsTextCSV(col.getNestedColumn(), row_num, ostr, settings);
|
||||
}
|
||||
|
||||
void DataTypeNullable::deserializeTextCSV(IColumn & column, ReadBuffer & istr, const FormatSettings & settings) const
|
||||
{
|
||||
safeDeserialize(column,
|
||||
[&istr] { return checkStringByFirstCharacterAndAssertTheRest("\\N", istr); },
|
||||
[this, &settings, &istr] (IColumn & nested) { nested_data_type->deserializeTextCSV(nested, istr, settings); });
|
||||
[this, &settings, &istr] (IColumn & nested) { nested_data_type->deserializeAsTextCSV(nested, istr, settings); });
|
||||
}
|
||||
|
||||
void DataTypeNullable::serializeText(const IColumn & column, size_t row_num, WriteBuffer & ostr, const FormatSettings & settings) const
|
||||
@ -281,7 +281,7 @@ void DataTypeNullable::serializeText(const IColumn & column, size_t row_num, Wri
|
||||
if (col.isNullAt(row_num))
|
||||
writeCString("ᴺᵁᴸᴸ", ostr);
|
||||
else
|
||||
nested_data_type->serializeText(col.getNestedColumn(), row_num, ostr, settings);
|
||||
nested_data_type->serializeAsText(col.getNestedColumn(), row_num, ostr, settings);
|
||||
}
|
||||
|
||||
void DataTypeNullable::serializeTextJSON(const IColumn & column, size_t row_num, WriteBuffer & ostr, const FormatSettings & settings) const
|
||||
@ -291,14 +291,14 @@ void DataTypeNullable::serializeTextJSON(const IColumn & column, size_t row_num,
|
||||
if (col.isNullAt(row_num))
|
||||
writeCString("null", ostr);
|
||||
else
|
||||
nested_data_type->serializeTextJSON(col.getNestedColumn(), row_num, ostr, settings);
|
||||
nested_data_type->serializeAsTextJSON(col.getNestedColumn(), row_num, ostr, settings);
|
||||
}
|
||||
|
||||
void DataTypeNullable::deserializeTextJSON(IColumn & column, ReadBuffer & istr, const FormatSettings & settings) const
|
||||
{
|
||||
safeDeserialize(column,
|
||||
[&istr] { return checkStringByFirstCharacterAndAssertTheRest("null", istr); },
|
||||
[this, &istr, &settings] (IColumn & nested) { nested_data_type->deserializeTextJSON(nested, istr, settings); });
|
||||
[this, &istr, &settings] (IColumn & nested) { nested_data_type->deserializeAsTextJSON(nested, istr, settings); });
|
||||
}
|
||||
|
||||
void DataTypeNullable::serializeTextXML(const IColumn & column, size_t row_num, WriteBuffer & ostr, const FormatSettings & settings) const
|
||||
@ -308,7 +308,7 @@ void DataTypeNullable::serializeTextXML(const IColumn & column, size_t row_num,
|
||||
if (col.isNullAt(row_num))
|
||||
writeCString("\\N", ostr);
|
||||
else
|
||||
nested_data_type->serializeTextXML(col.getNestedColumn(), row_num, ostr, settings);
|
||||
nested_data_type->serializeAsTextXML(col.getNestedColumn(), row_num, ostr, settings);
|
||||
}
|
||||
|
||||
void DataTypeNullable::serializeProtobuf(const IColumn & column, size_t row_num, ProtobufWriter & protobuf) const
|
||||
|
@ -14,7 +14,7 @@ public:
|
||||
static constexpr bool is_parametric = true;
|
||||
|
||||
explicit DataTypeNullable(const DataTypePtr & nested_data_type_);
|
||||
std::string getName() const override { return "Nullable(" + nested_data_type->getName() + ")"; }
|
||||
std::string doGetName() const override { return "Nullable(" + nested_data_type->getName() + ")"; }
|
||||
const char * getFamilyName() const override { return "Nullable"; }
|
||||
TypeIndex getTypeId() const override { return TypeIndex::Nullable; }
|
||||
|
||||
|
@ -64,7 +64,7 @@ DataTypeTuple::DataTypeTuple(const DataTypes & elems_, const Strings & names_)
|
||||
|
||||
|
||||
|
||||
std::string DataTypeTuple::getName() const
|
||||
std::string DataTypeTuple::doGetName() const
|
||||
{
|
||||
size_t size = elems.size();
|
||||
WriteBufferFromOwnString s;
|
||||
@ -160,7 +160,7 @@ void DataTypeTuple::serializeText(const IColumn & column, size_t row_num, WriteB
|
||||
{
|
||||
if (i != 0)
|
||||
writeChar(',', ostr);
|
||||
elems[i]->serializeTextQuoted(extractElementColumn(column, i), row_num, ostr, settings);
|
||||
elems[i]->serializeAsTextQuoted(extractElementColumn(column, i), row_num, ostr, settings);
|
||||
}
|
||||
writeChar(')', ostr);
|
||||
}
|
||||
@ -180,7 +180,7 @@ void DataTypeTuple::deserializeText(IColumn & column, ReadBuffer & istr, const F
|
||||
assertChar(',', istr);
|
||||
skipWhitespaceIfAny(istr);
|
||||
}
|
||||
elems[i]->deserializeTextQuoted(extractElementColumn(column, i), istr, settings);
|
||||
elems[i]->deserializeAsTextQuoted(extractElementColumn(column, i), istr, settings);
|
||||
}
|
||||
});
|
||||
|
||||
@ -195,7 +195,7 @@ void DataTypeTuple::serializeTextJSON(const IColumn & column, size_t row_num, Wr
|
||||
{
|
||||
if (i != 0)
|
||||
writeChar(',', ostr);
|
||||
elems[i]->serializeTextJSON(extractElementColumn(column, i), row_num, ostr, settings);
|
||||
elems[i]->serializeAsTextJSON(extractElementColumn(column, i), row_num, ostr, settings);
|
||||
}
|
||||
writeChar(']', ostr);
|
||||
}
|
||||
@ -215,7 +215,7 @@ void DataTypeTuple::deserializeTextJSON(IColumn & column, ReadBuffer & istr, con
|
||||
assertChar(',', istr);
|
||||
skipWhitespaceIfAny(istr);
|
||||
}
|
||||
elems[i]->deserializeTextJSON(extractElementColumn(column, i), istr, settings);
|
||||
elems[i]->deserializeAsTextJSON(extractElementColumn(column, i), istr, settings);
|
||||
}
|
||||
});
|
||||
|
||||
@ -229,7 +229,7 @@ void DataTypeTuple::serializeTextXML(const IColumn & column, size_t row_num, Wri
|
||||
for (const auto i : ext::range(0, ext::size(elems)))
|
||||
{
|
||||
writeCString("<elem>", ostr);
|
||||
elems[i]->serializeTextXML(extractElementColumn(column, i), row_num, ostr, settings);
|
||||
elems[i]->serializeAsTextXML(extractElementColumn(column, i), row_num, ostr, settings);
|
||||
writeCString("</elem>", ostr);
|
||||
}
|
||||
writeCString("</tuple>", ostr);
|
||||
@ -241,7 +241,7 @@ void DataTypeTuple::serializeTextCSV(const IColumn & column, size_t row_num, Wri
|
||||
{
|
||||
if (i != 0)
|
||||
writeChar(',', ostr);
|
||||
elems[i]->serializeTextCSV(extractElementColumn(column, i), row_num, ostr, settings);
|
||||
elems[i]->serializeAsTextCSV(extractElementColumn(column, i), row_num, ostr, settings);
|
||||
}
|
||||
}
|
||||
|
||||
@ -258,7 +258,7 @@ void DataTypeTuple::deserializeTextCSV(IColumn & column, ReadBuffer & istr, cons
|
||||
assertChar(settings.csv.delimiter, istr);
|
||||
skipWhitespaceIfAny(istr);
|
||||
}
|
||||
elems[i]->deserializeTextCSV(extractElementColumn(column, i), istr, settings);
|
||||
elems[i]->deserializeAsTextCSV(extractElementColumn(column, i), istr, settings);
|
||||
}
|
||||
});
|
||||
}
|
||||
|
@ -29,7 +29,7 @@ public:
|
||||
DataTypeTuple(const DataTypes & elems, const Strings & names);
|
||||
|
||||
TypeIndex getTypeId() const override { return TypeIndex::Tuple; }
|
||||
std::string getName() const override;
|
||||
std::string doGetName() const override;
|
||||
const char * getFamilyName() const override { return "Tuple"; }
|
||||
|
||||
bool canBeInsideNullable() const override { return false; }
|
||||
|
@ -28,7 +28,7 @@ bool decimalCheckArithmeticOverflow(const Context & context) { return context.ge
|
||||
//
|
||||
|
||||
template <typename T>
|
||||
std::string DataTypeDecimal<T>::getName() const
|
||||
std::string DataTypeDecimal<T>::doGetName() const
|
||||
{
|
||||
std::stringstream ss;
|
||||
ss << "Decimal(" << precision << ", " << scale << ")";
|
||||
|
@ -86,7 +86,7 @@ public:
|
||||
}
|
||||
|
||||
const char * getFamilyName() const override { return "Decimal"; }
|
||||
std::string getName() const override;
|
||||
std::string doGetName() const override;
|
||||
TypeIndex getTypeId() const override { return TypeId<T>::value; }
|
||||
|
||||
void serializeText(const IColumn & column, size_t row_num, WriteBuffer & ostr, const FormatSettings &) const override;
|
||||
|
@ -9,6 +9,7 @@
|
||||
#include <IO/WriteHelpers.h>
|
||||
|
||||
#include <DataTypes/IDataType.h>
|
||||
#include <DataTypes/IDataTypeDomain.h>
|
||||
#include <DataTypes/NestedUtils.h>
|
||||
|
||||
|
||||
@ -22,6 +23,31 @@ namespace ErrorCodes
|
||||
extern const int DATA_TYPE_CANNOT_BE_PROMOTED;
|
||||
}
|
||||
|
||||
IDataType::IDataType()
    : domain(nullptr)
{
}

IDataType::~IDataType()
{
}

String IDataType::getName() const
{
    if (domain)
    {
        return domain->getName();
    }
    else
    {
        return doGetName();
    }
}

String IDataType::doGetName() const
{
    return getFamilyName();
}
|
||||
|
||||
void IDataType::updateAvgValueSizeHint(const IColumn & column, double & avg_value_size_hint)
|
||||
{
|
||||
@ -114,4 +140,133 @@ void IDataType::insertDefaultInto(IColumn & column) const
|
||||
column.insertDefault();
|
||||
}
|
||||
|
||||
void IDataType::serializeAsTextEscaped(const IColumn & column, size_t row_num, WriteBuffer & ostr, const FormatSettings & settings) const
{
    if (domain)
    {
        domain->serializeTextEscaped(column, row_num, ostr, settings);
    }
    else
    {
        serializeTextEscaped(column, row_num, ostr, settings);
    }
}

void IDataType::deserializeAsTextEscaped(IColumn & column, ReadBuffer & istr, const FormatSettings & settings) const
{
    if (domain)
    {
        domain->deserializeTextEscaped(column, istr, settings);
    }
    else
    {
        deserializeTextEscaped(column, istr, settings);
    }
}

void IDataType::serializeAsTextQuoted(const IColumn & column, size_t row_num, WriteBuffer & ostr, const FormatSettings & settings) const
{
    if (domain)
    {
        domain->serializeTextQuoted(column, row_num, ostr, settings);
    }
    else
    {
        serializeTextQuoted(column, row_num, ostr, settings);
    }
}

void IDataType::deserializeAsTextQuoted(IColumn & column, ReadBuffer & istr, const FormatSettings & settings) const
{
    if (domain)
    {
        domain->deserializeTextQuoted(column, istr, settings);
    }
    else
    {
        deserializeTextQuoted(column, istr, settings);
    }
}

void IDataType::serializeAsTextCSV(const IColumn & column, size_t row_num, WriteBuffer & ostr, const FormatSettings & settings) const
{
    if (domain)
    {
        domain->serializeTextCSV(column, row_num, ostr, settings);
    }
    else
    {
        serializeTextCSV(column, row_num, ostr, settings);
    }
}

void IDataType::deserializeAsTextCSV(IColumn & column, ReadBuffer & istr, const FormatSettings & settings) const
{
    if (domain)
    {
        domain->deserializeTextCSV(column, istr, settings);
    }
    else
    {
        deserializeTextCSV(column, istr, settings);
    }
}

void IDataType::serializeAsText(const IColumn & column, size_t row_num, WriteBuffer & ostr, const FormatSettings & settings) const
{
    if (domain)
    {
        domain->serializeText(column, row_num, ostr, settings);
    }
    else
    {
        serializeText(column, row_num, ostr, settings);
    }
}

void IDataType::serializeAsTextJSON(const IColumn & column, size_t row_num, WriteBuffer & ostr, const FormatSettings & settings) const
{
    if (domain)
    {
        domain->serializeTextJSON(column, row_num, ostr, settings);
    }
    else
    {
        serializeTextJSON(column, row_num, ostr, settings);
    }
}

void IDataType::deserializeAsTextJSON(IColumn & column, ReadBuffer & istr, const FormatSettings & settings) const
{
    if (domain)
    {
        domain->deserializeTextJSON(column, istr, settings);
    }
    else
    {
        deserializeTextJSON(column, istr, settings);
    }
}

void IDataType::serializeAsTextXML(const IColumn & column, size_t row_num, WriteBuffer & ostr, const FormatSettings & settings) const
{
    if (domain)
    {
        domain->serializeTextXML(column, row_num, ostr, settings);
    }
    else
    {
        serializeTextXML(column, row_num, ostr, settings);
    }
}

void IDataType::setDomain(const IDataTypeDomain* const new_domain) const
{
    if (domain != nullptr)
    {
        throw Exception("Type " + getName() + " already has a domain.", ErrorCodes::LOGICAL_ERROR);
    }
    domain = new_domain;
}
|
||||
|
||||
}
|
||||
|
@ -12,6 +12,7 @@ namespace DB
|
||||
class ReadBuffer;
|
||||
class WriteBuffer;
|
||||
|
||||
class IDataTypeDomain;
|
||||
class IDataType;
|
||||
struct FormatSettings;
|
||||
|
||||
@ -35,6 +36,9 @@ class ProtobufWriter;
|
||||
class IDataType : private boost::noncopyable
|
||||
{
|
||||
public:
|
||||
IDataType();
|
||||
virtual ~IDataType();
|
||||
|
||||
/// Compile time flag. If false, then if C++ types are the same, then SQL types are also the same.
|
||||
/// Example: DataTypeString is not parametric: thus all instances of DataTypeString are the same SQL type.
|
||||
/// Example: DataTypeFixedString is parametric: different instances of DataTypeFixedString may be different SQL types.
|
||||
@ -42,7 +46,7 @@ public:
|
||||
/// static constexpr bool is_parametric = false;
|
||||
|
||||
/// Name of data type (examples: UInt64, Array(String)).
|
||||
virtual String getName() const { return getFamilyName(); }
|
||||
String getName() const;
|
||||
|
||||
/// Name of data type family (example: FixedString, Array).
|
||||
virtual const char * getFamilyName() const = 0;
|
||||
@ -217,6 +221,43 @@ public:
|
||||
/// If method will throw an exception, then column will be in same state as before call to method.
|
||||
virtual void deserializeBinary(IColumn & column, ReadBuffer & istr) const = 0;
|
||||
|
||||
/** Text serialization with escaping but without quoting.
|
||||
*/
|
||||
virtual void serializeAsTextEscaped(const IColumn & column, size_t row_num, WriteBuffer & ostr, const FormatSettings &) const;
|
||||
|
||||
virtual void deserializeAsTextEscaped(IColumn & column, ReadBuffer & istr, const FormatSettings &) const;
|
||||
|
||||
/** Text serialization as a literal that may be inserted into a query.
|
||||
*/
|
||||
virtual void serializeAsTextQuoted(const IColumn & column, size_t row_num, WriteBuffer & ostr, const FormatSettings &) const;
|
||||
|
||||
virtual void deserializeAsTextQuoted(IColumn & column, ReadBuffer & istr, const FormatSettings &) const;
|
||||
|
||||
/** Text serialization for the CSV format.
|
||||
*/
|
||||
virtual void serializeAsTextCSV(const IColumn & column, size_t row_num, WriteBuffer & ostr, const FormatSettings &) const;
|
||||
virtual void deserializeAsTextCSV(IColumn & column, ReadBuffer & istr, const FormatSettings &) const;
|
||||
|
||||
/** Text serialization for displaying on a terminal or saving into a text file, and the like.
|
||||
* Without escaping or quoting.
|
||||
*/
|
||||
virtual void serializeAsText(const IColumn & column, size_t row_num, WriteBuffer & ostr, const FormatSettings &) const;
|
||||
|
||||
/** Text serialization intended for using in JSON format.
|
||||
*/
|
||||
virtual void serializeAsTextJSON(const IColumn & column, size_t row_num, WriteBuffer & ostr, const FormatSettings &) const;
|
||||
virtual void deserializeAsTextJSON(IColumn & column, ReadBuffer & istr, const FormatSettings &) const;
|
||||
|
||||
/** Text serialization for putting into the XML format.
|
||||
*/
|
||||
virtual void serializeAsTextXML(const IColumn & column, size_t row_num, WriteBuffer & ostr, const FormatSettings & settings) const;
|
||||
|
||||
/** Serialize to a protobuf. */
|
||||
virtual void serializeProtobuf(const IColumn & column, size_t row_num, ProtobufWriter & protobuf) const = 0;
|
||||
|
||||
protected:
|
||||
virtual String doGetName() const;
|
||||
|
||||
/** Text serialization with escaping but without quoting.
|
||||
*/
|
||||
virtual void serializeTextEscaped(const IColumn & column, size_t row_num, WriteBuffer & ostr, const FormatSettings &) const = 0;
|
||||
@ -232,10 +273,6 @@ public:
|
||||
/** Text serialization for the CSV format.
|
||||
*/
|
||||
virtual void serializeTextCSV(const IColumn & column, size_t row_num, WriteBuffer & ostr, const FormatSettings &) const = 0;
|
||||
|
||||
/** delimiter - the delimiter we expect when reading a string value that is not double-quoted
|
||||
* (the delimiter is not consumed).
|
||||
*/
|
||||
virtual void deserializeTextCSV(IColumn & column, ReadBuffer & istr, const FormatSettings &) const = 0;
|
||||
|
||||
/** Text serialization for displaying on a terminal or saving into a text file, and the like.
|
||||
@ -256,9 +293,7 @@ public:
|
||||
serializeText(column, row_num, ostr, settings);
|
||||
}
|
||||
|
||||
/** Serialize to a protobuf. */
|
||||
virtual void serializeProtobuf(const IColumn & column, size_t row_num, ProtobufWriter & protobuf) const = 0;
|
||||
|
||||
public:
|
||||
/** Create empty column for corresponding type.
|
||||
*/
|
||||
virtual MutableColumnPtr createColumn() const = 0;
|
||||
@ -290,8 +325,6 @@ public:
|
||||
/// Checks that two instances belong to the same type
|
||||
virtual bool equals(const IDataType & rhs) const = 0;
|
||||
|
||||
virtual ~IDataType() {}
|
||||
|
||||
|
||||
/// Various properties on behaviour of data type.
|
||||
|
||||
@ -419,6 +452,21 @@ public:
|
||||
static void updateAvgValueSizeHint(const IColumn & column, double & avg_value_size_hint);
|
||||
|
||||
static String getFileNameForStream(const String & column_name, const SubstreamPath & path);
|
||||
|
||||
private:
|
||||
friend class DataTypeFactory;
|
||||
/** Sets domain on existing DataType, can be considered as second phase
|
||||
* of construction explicitly done by DataTypeFactory.
|
||||
* Will throw an exception if domain is already set.
|
||||
*/
|
||||
void setDomain(const IDataTypeDomain* newDomain) const;
|
||||
|
||||
private:
|
||||
/** This is mutable to allow setting domain on `const IDataType` post construction,
|
||||
* simplifying creation of domains for all types, without them even knowing
|
||||
* of domain existence.
|
||||
*/
|
||||
mutable IDataTypeDomain const* domain;
|
||||
};
|
||||
|
||||
|
||||
|
59
dbms/src/DataTypes/IDataTypeDomain.h
Normal file
@ -0,0 +1,59 @@
|
||||
#pragma once
|
||||
|
||||
#include <cstddef>
|
||||
|
||||
namespace DB
|
||||
{
|
||||
|
||||
class ReadBuffer;
|
||||
class WriteBuffer;
|
||||
struct FormatSettings;
|
||||
class IColumn;
|
||||
|
||||
/** Further refinement of the properties of a data type.
|
||||
*
|
||||
* Contains methods for serialization/deserialization.
|
||||
* Implementations of this interface represent a data type domain (example: IPv4)
|
||||
* which is a refinement of the existing type with a name and specific text
|
||||
* representation.
|
||||
*
|
||||
* IDataTypeDomain is a totally immutable object. You can always share instances.
|
||||
*/
|
||||
class IDataTypeDomain
|
||||
{
|
||||
public:
|
||||
virtual ~IDataTypeDomain() {}
|
||||
|
||||
virtual const char* getName() const = 0;
|
||||
|
||||
/** Text serialization for displaying on a terminal or saving into a text file, and the like.
|
||||
* Without escaping or quoting.
|
||||
*/
|
||||
virtual void serializeText(const IColumn & column, size_t row_num, WriteBuffer & ostr, const FormatSettings &) const = 0;
|
||||
|
||||
/** Text serialization with escaping but without quoting.
|
||||
*/
|
||||
virtual void serializeTextEscaped(const IColumn & column, size_t row_num, WriteBuffer & ostr, const FormatSettings &) const = 0;
|
||||
virtual void deserializeTextEscaped(IColumn & column, ReadBuffer & istr, const FormatSettings &) const = 0;
|
||||
|
||||
/** Text serialization as a literal that may be inserted into a query.
|
||||
*/
|
||||
virtual void serializeTextQuoted(const IColumn & column, size_t row_num, WriteBuffer & ostr, const FormatSettings &) const = 0;
|
||||
virtual void deserializeTextQuoted(IColumn & column, ReadBuffer & istr, const FormatSettings &) const = 0;
|
||||
|
||||
/** Text serialization for the CSV format.
|
||||
*/
|
||||
virtual void serializeTextCSV(const IColumn & column, size_t row_num, WriteBuffer & ostr, const FormatSettings &) const = 0;
|
||||
virtual void deserializeTextCSV(IColumn & column, ReadBuffer & istr, const FormatSettings &) const = 0;
|
||||
|
||||
/** Text serialization intended for using in JSON format.
|
||||
*/
|
||||
virtual void serializeTextJSON(const IColumn & column, size_t row_num, WriteBuffer & ostr, const FormatSettings &) const = 0;
|
||||
virtual void deserializeTextJSON(IColumn & column, ReadBuffer & istr, const FormatSettings &) const = 0;
|
||||
|
||||
/** Text serialization for putting into the XML format.
|
||||
*/
|
||||
virtual void serializeTextXML(const IColumn & column, size_t row_num, WriteBuffer & ostr, const FormatSettings & settings) const = 0;
|
||||
};
|
||||
|
||||
} // namespace DB
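The dispatch pattern introduced by this change (a data type forwards its `serializeAs*` calls to an optional domain and otherwise falls back to its own implementation) can be sketched in isolation as follows. This is a minimal, self-contained illustration and not the ClickHouse code: `Domain`, `Type`, `IPv4Domain` and `toText` are hypothetical stand-ins, `std::ostream` replaces `WriteBuffer`, a plain integer replaces `IColumn`, and the way `getName()` consults the domain is an assumption, since its definition is not part of the lines shown here.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for IDataTypeDomain: supplies a name and a text form.
struct Domain
{
    virtual ~Domain() = default;
    virtual const char * getName() const = 0;
    virtual void serializeText(std::uint32_t value, std::ostream & out) const = 0;
};

// Hypothetical stand-in for a concrete IDataType (think UInt32).
class Type
{
public:
    virtual ~Type() = default;

    // Assumption: the public name comes from the domain when one is set.
    std::string getName() const { return domain ? domain->getName() : doGetName(); }

    // Public entry point: use the domain if present, else the type's own method.
    void serializeAsText(std::uint32_t value, std::ostream & out) const
    {
        if (domain)
            domain->serializeText(value, out);
        else
            serializeText(value, out);
    }

    // Second phase of construction; a domain may be set only once.
    void setDomain(const Domain * new_domain) const
    {
        if (domain != nullptr)
            throw std::logic_error("Type " + getName() + " already has a domain.");
        domain = new_domain;
    }

protected:
    virtual std::string doGetName() const { return "UInt32"; }
    virtual void serializeText(std::uint32_t value, std::ostream & out) const { out << value; }

private:
    // Mutable so that a domain can be attached to a const Type after construction.
    mutable const Domain * domain = nullptr;
};

// Example domain: prints a 32-bit value in dotted-quad notation.
struct IPv4Domain : Domain
{
    const char * getName() const override { return "IPv4"; }
    void serializeText(std::uint32_t value, std::ostream & out) const override
    {
        out << (value >> 24) << '.' << ((value >> 16) & 0xFF) << '.'
            << ((value >> 8) & 0xFF) << '.' << (value & 0xFF);
    }
};

// Helper that renders one value through the public entry point.
std::string toText(const Type & type, std::uint32_t value)
{
    std::ostringstream out;
    type.serializeAsText(value, out);
    return out.str();
}
```

In this sketch the underlying type needs no knowledge of the domain: only the non-virtual public wrapper checks the pointer, which mirrors the reason the diff gives for keeping `domain` mutable and setting it from the factory.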
|
@ -20,9 +20,8 @@ namespace ErrorCodes
|
||||
extern const int SYNTAX_ERROR;
|
||||
}
|
||||
|
||||
DatabaseDictionary::DatabaseDictionary(const String & name_, const Context & context)
|
||||
DatabaseDictionary::DatabaseDictionary(const String & name_)
|
||||
: name(name_),
|
||||
external_dictionaries(context.getExternalDictionaries()),
|
||||
log(&Logger::get("DatabaseDictionary(" + name + ")"))
|
||||
{
|
||||
}
|
||||
@ -31,23 +30,21 @@ void DatabaseDictionary::loadTables(Context &, ThreadPool *, bool)
|
||||
{
|
||||
}
|
||||
|
||||
Tables DatabaseDictionary::loadTables()
|
||||
Tables DatabaseDictionary::listTables(const Context & context)
|
||||
{
|
||||
auto objects_map = external_dictionaries.getObjectsMap();
|
||||
auto objects_map = context.getExternalDictionaries().getObjectsMap();
|
||||
const auto & dictionaries = objects_map.get();
|
||||
|
||||
Tables tables;
|
||||
for (const auto & pair : dictionaries)
|
||||
{
|
||||
const std::string & dict_name = pair.first;
|
||||
if (deleted_tables.count(dict_name))
|
||||
continue;
|
||||
auto dict_ptr = std::static_pointer_cast<IDictionaryBase>(pair.second.loadable);
|
||||
if (dict_ptr)
|
||||
{
|
||||
const DictionaryStructure & dictionary_structure = dict_ptr->getStructure();
|
||||
auto columns = StorageDictionary::getNamesAndTypes(dictionary_structure);
|
||||
tables[dict_name] = StorageDictionary::create(dict_name, ColumnsDescription{columns}, dictionary_structure, dict_name);
|
||||
const std::string & dict_name = pair.first;
|
||||
tables[dict_name] = StorageDictionary::create(dict_name, ColumnsDescription{columns}, context, true, dict_name);
|
||||
}
|
||||
}
|
||||
|
||||
@ -55,23 +52,21 @@ Tables DatabaseDictionary::loadTables()
|
||||
}
|
||||
|
||||
bool DatabaseDictionary::isTableExist(
|
||||
const Context & /*context*/,
|
||||
const Context & context,
|
||||
const String & table_name) const
|
||||
{
|
||||
auto objects_map = external_dictionaries.getObjectsMap();
|
||||
auto objects_map = context.getExternalDictionaries().getObjectsMap();
|
||||
const auto & dictionaries = objects_map.get();
|
||||
return dictionaries.count(table_name) && !deleted_tables.count(table_name);
|
||||
return dictionaries.count(table_name);
|
||||
}
|
||||
|
||||
StoragePtr DatabaseDictionary::tryGetTable(
|
||||
const Context & /*context*/,
|
||||
const Context & context,
|
||||
const String & table_name) const
|
||||
{
|
||||
auto objects_map = external_dictionaries.getObjectsMap();
|
||||
auto objects_map = context.getExternalDictionaries().getObjectsMap();
|
||||
const auto & dictionaries = objects_map.get();
|
||||
|
||||
if (deleted_tables.count(table_name))
|
||||
return {};
|
||||
{
|
||||
auto it = dictionaries.find(table_name);
|
||||
if (it != dictionaries.end())
|
||||
@ -81,7 +76,7 @@ StoragePtr DatabaseDictionary::tryGetTable(
|
||||
{
|
||||
const DictionaryStructure & dictionary_structure = dict_ptr->getStructure();
|
||||
auto columns = StorageDictionary::getNamesAndTypes(dictionary_structure);
|
||||
return StorageDictionary::create(table_name, ColumnsDescription{columns}, dictionary_structure, table_name);
|
||||
return StorageDictionary::create(table_name, ColumnsDescription{columns}, context, true, table_name);
|
||||
}
|
||||
}
|
||||
}
|
||||
@ -89,17 +84,17 @@ StoragePtr DatabaseDictionary::tryGetTable(
|
||||
return {};
|
||||
}
|
||||
|
||||
DatabaseIteratorPtr DatabaseDictionary::getIterator(const Context & /*context*/)
|
||||
DatabaseIteratorPtr DatabaseDictionary::getIterator(const Context & context)
|
||||
{
|
||||
return std::make_unique<DatabaseSnapshotIterator>(loadTables());
|
||||
return std::make_unique<DatabaseSnapshotIterator>(listTables(context));
|
||||
}
|
||||
|
||||
bool DatabaseDictionary::empty(const Context & /*context*/) const
|
||||
bool DatabaseDictionary::empty(const Context & context) const
|
||||
{
|
||||
auto objects_map = external_dictionaries.getObjectsMap();
|
||||
auto objects_map = context.getExternalDictionaries().getObjectsMap();
|
||||
const auto & dictionaries = objects_map.get();
|
||||
for (const auto & pair : dictionaries)
|
||||
if (pair.second.loadable && !deleted_tables.count(pair.first))
|
||||
if (pair.second.loadable)
|
||||
return false;
|
||||
return true;
|
||||
}
|
||||
@ -115,23 +110,19 @@ void DatabaseDictionary::attachTable(const String & /*table_name*/, const Storag
|
||||
}
|
||||
|
||||
void DatabaseDictionary::createTable(
|
||||
const Context & /*context*/,
|
||||
const String & /*table_name*/,
|
||||
const StoragePtr & /*table*/,
|
||||
const ASTPtr & /*query*/)
|
||||
const Context &,
|
||||
const String &,
|
||||
const StoragePtr &,
|
||||
const ASTPtr &)
|
||||
{
|
||||
throw Exception("DatabaseDictionary: createTable() is not supported", ErrorCodes::NOT_IMPLEMENTED);
|
||||
}
|
||||
|
||||
void DatabaseDictionary::removeTable(
|
||||
const Context & context,
|
||||
const String & table_name)
|
||||
const Context &,
|
||||
const String &)
|
||||
{
|
||||
if (!isTableExist(context, table_name))
|
||||
throw Exception("Table " + name + "." + table_name + " doesn't exist.", ErrorCodes::UNKNOWN_TABLE);
|
||||
|
||||
auto objects_map = external_dictionaries.getObjectsMap();
|
||||
deleted_tables.insert(table_name);
|
||||
throw Exception("DatabaseDictionary: removeTable() is not supported", ErrorCodes::NOT_IMPLEMENTED);
|
||||
}
|
||||
|
||||
void DatabaseDictionary::renameTable(
|
||||
@ -147,6 +138,7 @@ void DatabaseDictionary::alterTable(
|
||||
const Context &,
|
||||
const String &,
|
||||
const ColumnsDescription &,
|
||||
const IndicesDescription &,
|
||||
const ASTModifier &)
|
||||
{
|
||||
throw Exception("DatabaseDictionary: alterTable() is not supported", ErrorCodes::NOT_IMPLEMENTED);
|
||||
|
@ -15,7 +15,6 @@ namespace Poco
|
||||
|
||||
namespace DB
|
||||
{
|
||||
class ExternalDictionaries;
|
||||
|
||||
/* Database to store StorageDictionary tables
|
||||
* automatically creates tables for all dictionaries
|
||||
@ -23,7 +22,7 @@ class ExternalDictionaries;
|
||||
class DatabaseDictionary : public IDatabase
|
||||
{
|
||||
public:
|
||||
DatabaseDictionary(const String & name_, const Context & context);
|
||||
DatabaseDictionary(const String & name_);
|
||||
|
||||
String getDatabaseName() const override;
|
||||
|
||||
@ -72,6 +71,7 @@ public:
|
||||
const Context & context,
|
||||
const String & name,
|
||||
const ColumnsDescription & columns,
|
||||
const IndicesDescription & indices,
|
||||
const ASTModifier & engine_modifier) override;
|
||||
|
||||
time_t getTableMetadataModificationTime(
|
||||
@ -93,13 +93,10 @@ public:
|
||||
private:
|
||||
const String name;
|
||||
mutable std::mutex mutex;
|
||||
const ExternalDictionaries & external_dictionaries;
|
||||
std::unordered_set<String> deleted_tables;
|
||||
|
||||
Poco::Logger * log;
|
||||
|
||||
Tables loadTables();
|
||||
|
||||
Tables listTables(const Context & context);
|
||||
ASTPtr getCreateTableQueryImpl(const Context & context, const String & table_name, bool throw_on_error) const;
|
||||
};
|
||||
|
||||
|
@ -23,7 +23,7 @@ DatabasePtr DatabaseFactory::get(
|
||||
else if (engine_name == "Memory")
|
||||
return std::make_shared<DatabaseMemory>(database_name);
|
||||
else if (engine_name == "Dictionary")
|
||||
return std::make_shared<DatabaseDictionary>(database_name, context);
|
||||
return std::make_shared<DatabaseDictionary>(database_name);
|
||||
|
||||
throw Exception("Unknown database engine: " + engine_name, ErrorCodes::UNKNOWN_DATABASE_ENGINE);
|
||||
}
|
||||
|
@ -53,6 +53,7 @@ void DatabaseMemory::alterTable(
|
||||
const Context &,
|
||||
const String &,
|
||||
const ColumnsDescription &,
|
||||
const IndicesDescription &,
|
||||
const ASTModifier &)
|
||||
{
|
||||
throw Exception("DatabaseMemory: alterTable() is not supported", ErrorCodes::NOT_IMPLEMENTED);
|
||||
|
@ -48,6 +48,7 @@ public:
|
||||
const Context & context,
|
||||
const String & name,
|
||||
const ColumnsDescription & columns,
|
||||
const IndicesDescription & indices,
|
||||
const ASTModifier & engine_modifier) override;
|
||||
|
||||
time_t getTableMetadataModificationTime(
|
||||
|
Some files were not shown because too many files have changed in this diff