Merge branch 'master' of github.com:yandex/ClickHouse into CLICKHOUSE-4032

This commit is contained in:
Sabyanin Maxim 2018-10-27 22:33:52 +03:00
commit 65bd40e290
943 changed files with 17096 additions and 4692 deletions

.gitignore vendored
View File

@ -9,7 +9,10 @@
# auto generated files
*.logrt
dbms/src/Storages/System/StorageSystemContributors.generated.cpp
/build
/build_*
/docs/build
/docs/edit
/docs/tools/venv/

View File

@ -0,0 +1 @@
* The `enable_optimize_predicate_expression` setting is disabled by default.

View File

@ -1,3 +1,122 @@
## ClickHouse release 18.14.10, 2018-10-23
* The `compile_expressions` setting (JIT compilation of expressions) is disabled by default. [#3410](https://github.com/yandex/ClickHouse/pull/3410)
* The `enable_optimize_predicate_expression` setting is disabled by default.
## ClickHouse release 18.14.9, 2018-10-16
### New features:
* The `WITH CUBE` modifier for `GROUP BY` (the alternative syntax `GROUP BY CUBE(...)` is also available). [#3172](https://github.com/yandex/ClickHouse/pull/3172)
* Added the `formatDateTime` function. [Alexandr Krasheninnikov](https://github.com/yandex/ClickHouse/pull/2770)
* Added the `JDBC` table engine and the `jdbc` table function (requires installing clickhouse-jdbc-bridge). [Alexandr Krasheninnikov](https://github.com/yandex/ClickHouse/pull/3210)
* Added functions for working with ISO week numbers: `toISOWeek`, `toISOYear`, `toStartOfISOYear`, as well as `toDayOfYear`. [#3146](https://github.com/yandex/ClickHouse/pull/3146)
* `Nullable` columns can now be used in `MySQL` and `ODBC` tables. [#3362](https://github.com/yandex/ClickHouse/pull/3362)
* Nested data structures can be read as nested objects in the `JSONEachRow` format. Added the `input_format_import_nested_json` setting. [Veloman Yunkan](https://github.com/yandex/ClickHouse/pull/3144)
* Multiple `MATERIALIZED VIEW`s can be processed in parallel when inserting data. See the `parallel_view_processing` setting. [Marek Vavruša](https://github.com/yandex/ClickHouse/pull/3208)
* Added the `SYSTEM FLUSH LOGS` query (forced flushing of logs to system tables such as `query_log`). [#3321](https://github.com/yandex/ClickHouse/pull/3321)
* The predefined `database` and `table` macros can now be used when declaring `Replicated` tables. [#3251](https://github.com/yandex/ClickHouse/pull/3251)
* Added the ability to read `Decimal` values in engineering notation (with a decimal exponent). [#3153](https://github.com/yandex/ClickHouse/pull/3153)
### Experimental features:
* GROUP BY optimization for `LowCardinality` data types. [#3138](https://github.com/yandex/ClickHouse/pull/3138)
* Optimized expression calculation for `LowCardinality` data types. [#3200](https://github.com/yandex/ClickHouse/pull/3200)
### Improvements:
* Significantly reduced memory consumption for queries with `ORDER BY` and `LIMIT`. See the `max_bytes_before_remerge_sort` setting. [#3205](https://github.com/yandex/ClickHouse/pull/3205)
* If the `JOIN` type is not specified (`LEFT`, `INNER`, ...), `INNER JOIN` is assumed. [#3147](https://github.com/yandex/ClickHouse/pull/3147)
* Qualified asterisks work correctly in queries with `JOIN`. [Winter Zhang](https://github.com/yandex/ClickHouse/pull/3202)
* The `ODBC` table engine correctly chooses the method for quoting identifiers in the SQL dialect of the remote database. [Alexandr Krasheninnikov](https://github.com/yandex/ClickHouse/pull/3210)
* The `compile_expressions` setting (JIT compilation of expressions) is enabled by default.
* Fixed behavior for simultaneous DROP DATABASE/TABLE IF EXISTS and CREATE DATABASE/TABLE IF NOT EXISTS. Previously, a `CREATE DATABASE ... IF NOT EXISTS` query could return the error message "File ... already exists", and the `CREATE TABLE ... IF NOT EXISTS` and `DROP TABLE IF EXISTS` queries could return `Table ... is creating or attaching right now`. [#3101](https://github.com/yandex/ClickHouse/pull/3101)
* LIKE and IN expressions with a constant right-hand side are passed to the remote server when querying MySQL and ODBC tables. [#3182](https://github.com/yandex/ClickHouse/pull/3182)
* Comparisons with constant expressions in a WHERE clause are passed to the remote server when querying MySQL and ODBC tables. Previously, only comparisons with constants were passed. [#3182](https://github.com/yandex/ClickHouse/pull/3182)
* Correct calculation of row width in the terminal for `Pretty` formats, including strings with hieroglyphs. [Amos Bird](https://github.com/yandex/ClickHouse/pull/3257).
* `ON CLUSTER` can be specified for `ALTER UPDATE` queries.
* Improved performance of reading data in the `JSONEachRow` format. [#3332](https://github.com/yandex/ClickHouse/pull/3332)
* Added the `LENGTH` and `CHARACTER_LENGTH` function synonyms for compatibility. The `CONCAT` function is no longer case-sensitive. [#3306](https://github.com/yandex/ClickHouse/pull/3306)
* Added the `TIMESTAMP` synonym for the `DateTime` type. [#3390](https://github.com/yandex/ClickHouse/pull/3390)
* There is always space reserved for query_id in server logs, even if the log line is not related to a query. This makes it easier to parse server text logs with third-party tools.
* Memory consumption by a query is logged when it exceeds the next whole-gigabyte mark. [#3205](https://github.com/yandex/ClickHouse/pull/3205)
* Added a compatibility mode for the case when a client library that uses the Native protocol mistakenly sends fewer columns than the server expects for an INSERT query. This scenario was possible when using the clickhouse-cpp library. Previously, it caused the server to crash. [#3171](https://github.com/yandex/ClickHouse/pull/3171)
* In a user-defined WHERE expression in `clickhouse-copier`, you can now use the `partition_key` alias (for additional filtering by partitions of the source table). This is useful if the partitioning scheme changes during copying, but only changes slightly. [#3166](https://github.com/yandex/ClickHouse/pull/3166)
* The workflow of the `Kafka` engine has been moved to a background thread pool in order to automatically reduce the speed of data reading under heavy load. [Marek Vavruša](https://github.com/yandex/ClickHouse/pull/3215).
* Support for reading `Tuple` values and `Nested` structures as `struct` in the `Cap'n'Proto` format. [Marek Vavruša](https://github.com/yandex/ClickHouse/pull/3216).
* The `biz` domain has been added to the list of top-level domains for the `firstSignificantSubdomain` function. [decaseal](https://github.com/yandex/ClickHouse/pull/3219).
* In the configuration of external dictionaries, an empty `null_value` is interpreted as the default value of the data type. [#3330](https://github.com/yandex/ClickHouse/pull/3330)
* Support for the `intDiv` and `intDivOrZero` functions for `Decimal`. [b48402e8](https://github.com/yandex/ClickHouse/commit/b48402e8712e2b9b151e0eef8193811d433a1264)
* Support for the `Date`, `DateTime`, `UUID`, and `Decimal` types as keys for the `sumMap` aggregate function. [#3281](https://github.com/yandex/ClickHouse/pull/3281)
* Support for the `Decimal` data type in external dictionaries. [#3324](https://github.com/yandex/ClickHouse/pull/3324)
* Support for the `Decimal` data type in `SummingMergeTree` tables. [#3348](https://github.com/yandex/ClickHouse/pull/3348)
* Added a specialization for `UUID` in the `if` function. [#3366](https://github.com/yandex/ClickHouse/pull/3366)
* Reduced the number of `open` and `close` system calls when reading from `MergeTree` family tables. [#3283](https://github.com/yandex/ClickHouse/pull/3283).
* A `TRUNCATE TABLE` query can be executed on any replica (the query is forwarded to the leader replica). [Kirill Shvakov](https://github.com/yandex/ClickHouse/pull/3375)
### Bug fixes:
* Fixed an issue with `Dictionary` tables for `range_hashed` dictionaries. This error appeared in version 18.12.17. [#1702](https://github.com/yandex/ClickHouse/pull/1702)
* Fixed an error when loading `range_hashed` dictionaries (the message `Unsupported type Nullable(...)`). This error appeared in version 18.12.17. [#3362](https://github.com/yandex/ClickHouse/pull/3362)
* Fixed incorrect results of the `pointInPolygon` function caused by the accumulation of rounding errors for polygons with a large number of closely spaced vertices. [#3331](https://github.com/yandex/ClickHouse/pull/3331) [#3341](https://github.com/yandex/ClickHouse/pull/3341)
* If, after merging data parts, the checksum of the resulting part differs from the result of the same merge on another replica, the merge result is deleted and the part is downloaded from the other replica instead (this is the correct behavior). However, after downloading the part, it could not be added to the working set because of an error saying the part already exists (the merged part was deleted with a delay rather than immediately). This led to cyclic attempts to download the same data. [#3194](https://github.com/yandex/ClickHouse/pull/3194)
* Fixed incorrect accounting of total memory consumption by queries (which caused incorrect behavior of the `max_memory_usage_for_all_queries` setting and an incorrect value of the `MemoryTracking` metric). This error appeared in version 18.12.13. [Marek Vavruša](https://github.com/yandex/ClickHouse/pull/3344)
* Fixed `CREATE TABLE ... ON CLUSTER ... AS SELECT ...` queries. This error appeared in version 18.12.13. [#3247](https://github.com/yandex/ClickHouse/pull/3247)
* Fixed unnecessary preparation of data structures for a `JOIN` on the query initiator server when the `JOIN` is performed only on remote servers. [#3340](https://github.com/yandex/ClickHouse/pull/3340)
* Fixed bugs in the `Kafka` engine: it stopped working after an exception when starting to read data; it could block on shutdown. [Marek Vavruša](https://github.com/yandex/ClickHouse/pull/3215).
* The optional `schema` parameter (the schema of the `Cap'n'Proto` format) was not passed for `Kafka` tables. [Vojtech Splichal](https://github.com/yandex/ClickHouse/pull/3150)
* If the ZooKeeper ensemble contains servers that accept a connection but then immediately drop it instead of responding to the handshake, ClickHouse chooses another server to connect to. Previously, this produced the error `Cannot read all data. Bytes read: 0. Bytes expected: 4.` and the server could not start. [8218cf3a](https://github.com/yandex/ClickHouse/commit/8218cf3a5f39a43401953769d6d12a0bb8d29da9)
* If the ZooKeeper ensemble contains servers for which a DNS query returns an error, those servers are skipped. [17b8e209](https://github.com/yandex/ClickHouse/commit/17b8e209221061325ad7ba0539f03c6e65f87f29)
* Fixed type conversion between `Date` and `DateTime` when inserting data in the `VALUES` format (when `input_format_values_interpret_expressions = 1`). Previously, the conversion was performed between the numeric value of the number of days since the Unix epoch and the Unix timestamp, which led to unexpected results. [#3229](https://github.com/yandex/ClickHouse/pull/3229)
* Fixed type conversion between `Decimal` and integer numbers. [#3211](https://github.com/yandex/ClickHouse/pull/3211)
* Fixed errors in the behavior of the `enable_optimize_predicate_expression` setting. [Winter Zhang](https://github.com/yandex/ClickHouse/pull/3231)
* Fixed a parsing error in the CSV format with floating-point numbers when a non-default CSV separator such as `;` is used. [#3155](https://github.com/yandex/ClickHouse/pull/3155).
* Fixed the `arrayCumSumNonNegative` function (it does not accumulate negative values if the accumulator drops below zero). [Aleksey Studnev](https://github.com/yandex/ClickHouse/pull/3163)
* Fixed how `Merge` tables work on top of `Distributed` tables when `PREWHERE` is used. [#3165](https://github.com/yandex/ClickHouse/pull/3165)
* Bug fixes in the `ALTER UPDATE` query.
* Fixed bugs in the `odbc` table function that appeared in version 18.12. [#3197](https://github.com/yandex/ClickHouse/pull/3197)
* Fixed the operation of aggregate functions with `StateArray` combinators. [#3188](https://github.com/yandex/ClickHouse/pull/3188)
* Fixed a crash when dividing a `Decimal` value by zero. [69dd6609](https://github.com/yandex/ClickHouse/commit/69dd6609193beb4e7acd3e6ad216eca0ccfb8179)
* Fixed type inference for operations that use `Decimal` and integer arguments. [#3224](https://github.com/yandex/ClickHouse/pull/3224)
* Fixed a segfault during `GROUP BY` on `Decimal128`. [3359ba06](https://github.com/yandex/ClickHouse/commit/3359ba06c39fcd05bfdb87d6c64154819621e13a)
* The `log_query_threads` setting (logging information about each query execution thread) now takes effect only if the `log_queries` setting (logging information about queries) is set to 1. Since `log_query_threads` is enabled by default, information about threads was previously logged even if query logging was disabled. [#3241](https://github.com/yandex/ClickHouse/pull/3241)
* Fixed an error in the distributed operation of the quantiles aggregate function (an error message like `Not found column quantile...`). [292a8855](https://github.com/yandex/ClickHouse/commit/292a885533b8e3b41ce8993867069d14cbd5a664)
* Fixed a compatibility issue when a cluster runs version 18.12.17 servers together with older servers: for distributed queries with GROUP BY on keys of both fixed and non-fixed length, if the amount of data being aggregated was large, incompletely aggregated data could be returned (the same aggregation keys in two different rows). [#3254](https://github.com/yandex/ClickHouse/pull/3254)
* Fixed the handling of substitutions in `clickhouse-performance-test` when a query contains only some of the substitutions declared in the test. [#3263](https://github.com/yandex/ClickHouse/pull/3263)
* Fixed an error when using `FINAL` together with `PREWHERE`. [#3298](https://github.com/yandex/ClickHouse/pull/3298)
* Fixed an error when using `PREWHERE` over columns that were added by `ALTER`. [#3298](https://github.com/yandex/ClickHouse/pull/3298)
* Added a check that `arrayJoin` is not used in `DEFAULT` and `MATERIALIZED` expressions. Previously, `arrayJoin` there led to an error when inserting data. [#3337](https://github.com/yandex/ClickHouse/pull/3337)
* Added a check that `arrayJoin` is not used in the `PREWHERE` clause. Previously, this led to messages like `Size ... doesn't match` or `Unknown compression method` when executing queries. [#3357](https://github.com/yandex/ClickHouse/pull/3357)
* Fixed a segfault that could occur in rare cases after the optimization that replaces AND chains of equality comparisons of an expression with constants with the corresponding IN expression. [liuyimin-bytedance](https://github.com/yandex/ClickHouse/pull/3339).
* Minor fixes to `clickhouse-benchmark`: previously, client information was not sent to the server; the number of executed queries is now counted more accurately on shutdown and for limiting the number of iterations. [#3351](https://github.com/yandex/ClickHouse/pull/3351) [#3352](https://github.com/yandex/ClickHouse/pull/3352)
### Backward incompatible changes:
* Removed the `allow_experimental_decimal_type` setting. The `Decimal` data type is available by default. [#3329](https://github.com/yandex/ClickHouse/pull/3329)
## ClickHouse release 18.12.17, 2018-09-16
### New features:
* `invalidate_query` (the ability to specify a query to check whether an external dictionary needs to be updated) has been implemented for the `clickhouse` source. [#3126](https://github.com/yandex/ClickHouse/pull/3126)
* Added the ability to use the `UInt*`, `Int*`, and `DateTime` data types (along with `Date`) as the key of a `range_hashed` external dictionary that defines range boundaries. `NULL` can be used to designate an open range. [Vasily Nemkov](https://github.com/yandex/ClickHouse/pull/3123)
* Added support for the `var*` and `stddev*` aggregate functions for the `Decimal` type. [#3129](https://github.com/yandex/ClickHouse/pull/3129)
* Added support for mathematical functions (`exp`, `sin`, etc.) for the `Decimal` type. [#3129](https://github.com/yandex/ClickHouse/pull/3129)
* The `partition_id` column has been added to the `system.part_log` table. [#3089](https://github.com/yandex/ClickHouse/pull/3089)
### Bug fixes:
* Fixed how `Merge` tables work on top of `Distributed` tables. [Winter Zhang](https://github.com/yandex/ClickHouse/pull/3159)
* Fixed an incompatibility (an unnecessary dependency on the `glibc` version) that made it impossible to run ClickHouse on `Ubuntu Precise` and older. The incompatibility appeared in version 18.12.13. [#3130](https://github.com/yandex/ClickHouse/pull/3130)
* Fixed errors in the behavior of the `enable_optimize_predicate_expression` setting. [Winter Zhang](https://github.com/yandex/ClickHouse/pull/3107)
* Fixed a minor backward compatibility issue that appeared when a cluster contains replicas with versions earlier than 18.12.13 and a new replica of a table is created on a server with a newer version (shown by the message `Can not clone replica, because the ... updated to new ClickHouse version`, which is logical but should not have happened). [#3122](https://github.com/yandex/ClickHouse/pull/3122)
### Backward incompatible changes:
* The `enable_optimize_predicate_expression` setting is enabled by default, which is admittedly rather optimistic. If query analysis errors related to resolving column names occur, set `enable_optimize_predicate_expression` to 0. [Winter Zhang](https://github.com/yandex/ClickHouse/pull/3107)
## ClickHouse release 18.12.14, 2018-09-13
### New features:

View File

@ -147,7 +147,7 @@ set (CMAKE_C_FLAGS "${CMAKE_C_FLAGS} ${COMPILER_FLAGS} -fn
set (CMAKE_C_FLAGS_RELWITHDEBINFO "${CMAKE_C_FLAGS_RELWITHDEBINFO} -O3 ${CMAKE_C_FLAGS_ADD}")
set (CMAKE_C_FLAGS_DEBUG "${CMAKE_C_FLAGS_DEBUG} -O0 -g3 -ggdb3 -fno-inline ${CMAKE_C_FLAGS_ADD}")
if (MAKE_STATIC_LIBRARIES AND NOT APPLE AND NOT (CMAKE_CXX_COMPILER_ID STREQUAL "Clang" AND OS_FREEBSD))
if (MAKE_STATIC_LIBRARIES AND NOT APPLE AND NOT (COMPILER_CLANG AND OS_FREEBSD))
set (CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} -static-libgcc -static-libstdc++")
# Along with executables, we also build example of shared library for "library dictionary source"; and it also should be self-contained.
@ -158,7 +158,7 @@ set(THREADS_PREFER_PTHREAD_FLAG ON)
include (cmake/test_compiler.cmake)
if (OS_LINUX AND CMAKE_CXX_COMPILER_ID STREQUAL "Clang")
if (OS_LINUX AND COMPILER_CLANG)
set (CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS}")
option (USE_LIBCXX "Use libc++ and libc++abi instead of libstdc++ (only make sense on Linux with Clang)" ${HAVE_LIBCXX})

View File

@ -11,5 +11,5 @@ ClickHouse is an open-source column-oriented database management system that all
## Upcoming Meetups
* [Paris on October 2](http://bit.ly/clickhouse-paris-2-october-2018)
* [Beijing on October 28](http://www.clickhouse.com.cn/topic/5ba0e3f99d28dfde2ddc62a1)
* [Amsterdam on November 15](https://events.yandex.com/events/meetings/15-11-2018/)

View File

@ -1,5 +1,8 @@
option (ENABLE_EMBEDDED_COMPILER "Set to TRUE to enable support for 'compile' option for query execution" 1)
option (USE_INTERNAL_LLVM_LIBRARY "Use bundled or system LLVM library. Default: system library for quicker developer builds." ${APPLE})
# Broken in macos. TODO: update clang, re-test, enable
if (NOT APPLE)
option (ENABLE_EMBEDDED_COMPILER "Set to TRUE to enable support for 'compile' option for query execution" 1)
option (USE_INTERNAL_LLVM_LIBRARY "Use bundled or system LLVM library. Default: system library for quicker developer builds." ${APPLE})
endif ()
if (ENABLE_EMBEDDED_COMPILER)
if (USE_INTERNAL_LLVM_LIBRARY AND NOT EXISTS "${ClickHouse_SOURCE_DIR}/contrib/llvm/llvm/CMakeLists.txt")

View File

@ -86,4 +86,4 @@ if (ENABLE_ODBC)
endif ()
endif ()
message (STATUS "Using odbc: ${ODBC_INCLUDE_DIRECTORIES} : ${ODBC_LIBRARIES}")
message (STATUS "Using odbc=${ODBC_FOUND}: ${ODBC_INCLUDE_DIRECTORIES} : ${ODBC_LIBRARIES}")

View File

@ -116,10 +116,10 @@ endif ()
if (Poco_MongoDB_LIBRARY)
set (USE_POCO_MONGODB 1)
endif ()
if (Poco_DataODBC_LIBRARY)
if (Poco_DataODBC_LIBRARY AND ODBC_FOUND)
set (USE_POCO_DATAODBC 1)
endif ()
if (Poco_SQLODBC_LIBRARY)
if (Poco_SQLODBC_LIBRARY AND ODBC_FOUND)
set (USE_POCO_SQLODBC 1)
endif ()

View File

@ -3,7 +3,7 @@ Motivation
For reproducible build, we need to control, what exactly version of boost we build,
because different versions of boost obviously have slightly different behaviour.
You may already have installed arbitary version of boost on your system, to build another projects.
You may already have installed arbitrary version of boost on your system, to build another projects.
We need to have all libraries with C++ interface to be located in tree and to be build together.
This is needed to allow quickly changing build options, that could introduce changes in ABI of that libraries.

View File

@ -1,7 +1,7 @@
This directory contains several hash-map implementations, similar in
API to SGI's hash_map class, but with different performance
characteristics. sparse_hash_map uses very little space overhead, 1-2
bits per entry. dense_hash_map is very fast, particulary on lookup.
bits per entry. dense_hash_map is very fast, particularly on lookup.
(sparse_hash_set and dense_hash_set are the set versions of these
routines.) On the other hand, these classes have requirements that
may not make them appropriate for all applications.

View File

@ -6,18 +6,12 @@ include(${CMAKE_CURRENT_SOURCE_DIR}/cmake/find_vectorclass.cmake)
set (CONFIG_VERSION ${CMAKE_CURRENT_BINARY_DIR}/src/Common/config_version.h)
set (CONFIG_COMMON ${CMAKE_CURRENT_BINARY_DIR}/src/Common/config.h)
set (CONFIG_BUILD ${CMAKE_CURRENT_BINARY_DIR}/src/Common/config_build.cpp)
include (cmake/version.cmake)
message (STATUS "Will build ${VERSION_FULL}")
configure_file (${CMAKE_CURRENT_SOURCE_DIR}/src/Common/config.h.in ${CONFIG_COMMON})
configure_file (${CMAKE_CURRENT_SOURCE_DIR}/src/Common/config_version.h.in ${CONFIG_VERSION})
get_property (BUILD_COMPILE_DEFINITIONS DIRECTORY ${ClickHouse_SOURCE_DIR} PROPERTY COMPILE_DEFINITIONS)
get_property (BUILD_INCLUDE_DIRECTORIES DIRECTORY ${ClickHouse_SOURCE_DIR} PROPERTY INCLUDE_DIRECTORIES)
string (TIMESTAMP BUILD_DATE "%Y-%m-%d" UTC)
configure_file (${CMAKE_CURRENT_SOURCE_DIR}/src/Common/config_build.cpp.in ${CONFIG_BUILD})
if (NOT MSVC)
set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wextra")
endif ()
@ -162,6 +156,7 @@ endif()
target_link_libraries (clickhouse_common_io
common
string_utils
widechar_width
${LINK_LIBRARIES_ONLY_ON_X86_64}
${LZ4_LIBRARY}
${ZSTD_LIBRARY}

View File

@ -2,10 +2,10 @@
set(VERSION_REVISION 54409 CACHE STRING "")
set(VERSION_MAJOR 18 CACHE STRING "")
set(VERSION_MINOR 14 CACHE STRING "")
set(VERSION_PATCH 1 CACHE STRING "")
set(VERSION_GITHASH 4452a1f49bf9ae93838da62694b7f1b1523470e3 CACHE STRING "")
set(VERSION_DESCRIBE v18.14.1-testing CACHE STRING "")
set(VERSION_STRING 18.14.1 CACHE STRING "")
set(VERSION_PATCH 9 CACHE STRING "")
set(VERSION_GITHASH 457f8fd495b2812940e69c15ab5b499cd863aae4 CACHE STRING "")
set(VERSION_DESCRIBE v18.14.9-testing CACHE STRING "")
set(VERSION_STRING 18.14.9 CACHE STRING "")
# end of autochange
set(VERSION_EXTRA "" CACHE STRING "")

View File

@ -110,6 +110,8 @@ private:
/// Don't execute new queries after timelimit or SIGINT or exception
std::atomic<bool> shutdown{false};
std::atomic<size_t> queries_executed{0};
struct Stats
{
Stopwatch watch;
@ -238,10 +240,12 @@ private:
size_t query_index = randomize ? distribution(generator) : i % queries.size();
if (!tryPushQueryInteractively(queries[query_index], interrupt_listener))
{
shutdown = true;
break;
}
}
shutdown = true;
pool.wait();
info_total.watch.stop();
@ -274,11 +278,12 @@ private:
{
extracted = queue.tryPop(query, 100);
if (shutdown)
if (shutdown || (max_iterations && queries_executed == max_iterations))
return;
}
execute(connection, query);
++queries_executed;
}
}
catch (...)
@ -419,20 +424,20 @@ int mainEntryClickHouseBenchmark(int argc, char ** argv)
boost::program_options::options_description desc("Allowed options");
desc.add_options()
("help", "produce help message")
("concurrency,c", value<unsigned>()->default_value(1), "number of parallel queries")
("delay,d", value<double>()->default_value(1), "delay between intermediate reports in seconds (set 0 to disable reports)")
("stage", value<std::string>()->default_value("complete"), "request query processing up to specified stage")
("iterations,i", value<size_t>()->default_value(0), "amount of queries to be executed")
("timelimit,t", value<double>()->default_value(0.), "stop launch of queries after specified time limit")
("randomize,r", value<bool>()->default_value(false), "randomize order of execution")
("json", value<std::string>()->default_value(""), "write final report to specified file in JSON format")
("host,h", value<std::string>()->default_value("localhost"), "")
("port", value<UInt16>()->default_value(9000), "")
("user", value<std::string>()->default_value("default"), "")
("password", value<std::string>()->default_value(""), "")
("database", value<std::string>()->default_value("default"), "")
("stacktrace", "print stack traces of exceptions")
("help", "produce help message")
("concurrency,c", value<unsigned>()->default_value(1), "number of parallel queries")
("delay,d", value<double>()->default_value(1), "delay between intermediate reports in seconds (set 0 to disable reports)")
("stage", value<std::string>()->default_value("complete"), "request query processing up to specified stage")
("iterations,i", value<size_t>()->default_value(0), "amount of queries to be executed")
("timelimit,t", value<double>()->default_value(0.), "stop launch of queries after specified time limit")
("randomize,r", value<bool>()->default_value(false), "randomize order of execution")
("json", value<std::string>()->default_value(""), "write final report to specified file in JSON format")
("host,h", value<std::string>()->default_value("localhost"), "")
("port", value<UInt16>()->default_value(9000), "")
("user", value<std::string>()->default_value("default"), "")
("password", value<std::string>()->default_value(""), "")
("database", value<std::string>()->default_value("default"), "")
("stacktrace", "print stack traces of exceptions")
#define DECLARE_SETTING(TYPE, NAME, DEFAULT, DESCRIPTION) (#NAME, boost::program_options::value<std::string> (), DESCRIPTION)
APPLY_FOR_SETTINGS(DECLARE_SETTING)
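To make the intent of the `shutdown` / `max_iterations` handling above easier to follow, here is a minimal standalone sketch of the same pattern (the names and the in-memory queue are illustrative; this is not the clickhouse-benchmark source): worker threads pop queries from a shared queue and stop once the shutdown flag is set or the shared counter of executed queries reaches the iteration limit.

```cpp
#include <atomic>
#include <cstddef>
#include <functional>
#include <iostream>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <vector>

std::atomic<bool> shutdown_flag{false};
std::atomic<size_t> queries_executed{0};

/// Pops queries until the queue is drained, a shutdown is requested,
/// or max_iterations queries have been executed across all workers.
void worker(std::queue<std::string> & queue, std::mutex & mutex, size_t max_iterations)
{
    while (true)
    {
        std::string query;
        {
            std::lock_guard<std::mutex> lock(mutex);
            if (shutdown_flag || (max_iterations && queries_executed >= max_iterations))
                return;
            if (queue.empty())
                return;
            query = std::move(queue.front());
            queue.pop();
        }
        /// A real benchmark would send `query` to the server here.
        ++queries_executed;
    }
}

int main()
{
    std::mutex mutex;
    std::queue<std::string> queue;
    for (int i = 0; i < 100; ++i)
        queue.push("SELECT " + std::to_string(i));

    std::vector<std::thread> pool;
    for (int i = 0; i < 4; ++i)
        pool.emplace_back(worker, std::ref(queue), std::ref(mutex), /*max_iterations=*/ size_t{10});
    for (auto & t : pool)
        t.join();

    std::cout << "executed " << queries_executed << " queries\n";
}
```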

View File

@ -904,7 +904,7 @@ private:
ASTPtr parseQuery(const char * & pos, const char * end, bool allow_multi_statements)
{
ParserQuery parser(end);
ParserQuery parser(end, true);
ASTPtr res;
const auto ignore_error = config().getBool("ignore-error", false);

View File

@ -76,7 +76,8 @@ struct ConnectionParameters
timeouts = ConnectionTimeouts(
Poco::Timespan(config.getInt("connect_timeout", DBMS_DEFAULT_CONNECT_TIMEOUT_SEC), 0),
Poco::Timespan(config.getInt("receive_timeout", DBMS_DEFAULT_RECEIVE_TIMEOUT_SEC), 0),
Poco::Timespan(config.getInt("send_timeout", DBMS_DEFAULT_SEND_TIMEOUT_SEC), 0));
Poco::Timespan(config.getInt("send_timeout", DBMS_DEFAULT_SEND_TIMEOUT_SEC), 0),
Poco::Timespan(config.getInt("tcp_keep_alive_timeout", 0), 0));
}
};

View File

@ -500,7 +500,7 @@ private:
return CRC32Hash()(StringRef(reinterpret_cast<const char *>(begin), (end - begin) * sizeof(CodePoint)));
}
/// By the way, we don't have to use actual Unicode numbers. We use just arbitary bijective mapping.
/// By the way, we don't have to use actual Unicode numbers. We use just arbitrary bijective mapping.
CodePoint readCodePoint(const char *& pos, const char * end)
{
size_t length = UTF8::seqLength(*pos);
@ -954,7 +954,7 @@ try
("structure,S", po::value<std::string>(), "structure of the initial table (list of column and type names)")
("input-format", po::value<std::string>(), "input format of the initial table data")
("output-format", po::value<std::string>(), "default output format")
("seed", po::value<std::string>(), "seed (arbitary string), must be random string with at least 10 bytes length")
("seed", po::value<std::string>(), "seed (arbitrary string), must be random string with at least 10 bytes length")
("limit", po::value<UInt64>(), "if specified - stop after generating that number of rows")
("silent", po::value<bool>()->default_value(false), "don't print information messages to stderr")
("order", po::value<UInt64>()->default_value(5), "order of markov model to generate strings")

View File

@ -2,8 +2,10 @@ add_library (clickhouse-odbc-bridge-lib ${LINK_MODE}
PingHandler.cpp
MainHandler.cpp
ColumnInfoHandler.cpp
IdentifierQuoteHandler.cpp
HandlerFactory.cpp
ODBCBridge.cpp
getIdentifierQuote.cpp
validateODBCConnectionString.cpp
)

View File

@ -1,4 +1,5 @@
#include "ColumnInfoHandler.h"
#include "getIdentifierQuote.h"
#if USE_POCO_SQLODBC || USE_POCO_DATAODBC
#if USE_POCO_SQLODBC
@ -58,6 +59,11 @@ namespace
}
}
namespace ErrorCodes
{
extern const int ILLEGAL_TYPE_OF_ARGUMENT;
}
void ODBCColumnsInfoHandler::handleRequest(Poco::Net::HTTPServerRequest & request, Poco::Net::HTTPServerResponse & response)
{
Poco::Net::HTMLForm params(request, request.stream());
@ -88,7 +94,8 @@ void ODBCColumnsInfoHandler::handleRequest(Poco::Net::HTTPServerRequest & reques
{
schema_name = params.get("schema");
LOG_TRACE(log, "Will fetch info for table '" << schema_name + "." + table_name << "'");
} else
}
else
LOG_TRACE(log, "Will fetch info for table '" << table_name << "'");
LOG_TRACE(log, "Got connection str '" << connection_string << "'");
@ -113,10 +120,22 @@ void ODBCColumnsInfoHandler::handleRequest(Poco::Net::HTTPServerRequest & reques
IAST::FormatSettings settings(ss, true);
settings.always_quote_identifiers = true;
settings.identifier_quoting_style = IdentifierQuotingStyle::DoubleQuotes;
auto identifier_quote = getIdentifierQuote(hdbc);
if (identifier_quote.length() == 0)
settings.identifier_quoting_style = IdentifierQuotingStyle::None;
else if(identifier_quote[0] == '`')
settings.identifier_quoting_style = IdentifierQuotingStyle::Backticks;
else if(identifier_quote[0] == '"')
settings.identifier_quoting_style = IdentifierQuotingStyle::DoubleQuotes;
else
throw Exception("Can not map quote identifier '" + identifier_quote + "' to IdentifierQuotingStyle value", ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT);
select->format(settings);
std::string query = ss.str();
LOG_TRACE(log, "Inferring structure with query '" << query << "'");
if (POCO_SQL_ODBC_CLASS::Utility::isError(POCO_SQL_ODBC_CLASS::SQLPrepare(hstmt, reinterpret_cast<SQLCHAR *>(query.data()), query.size())))
throw POCO_SQL_ODBC_CLASS::DescriptorException(session.dbc());

View File

@ -24,6 +24,12 @@ Poco::Net::HTTPRequestHandler * HandlerFactory::createRequestHandler(const Poco:
return new ODBCColumnsInfoHandler(keep_alive_timeout, context);
#else
return nullptr;
#endif
else if(uri.getPath() == "/identifier_quote")
#if USE_POCO_SQLODBC || USE_POCO_DATAODBC
return new IdentifierQuoteHandler(keep_alive_timeout, context);
#else
return nullptr;
#endif
else
return new ODBCHandler(pool_map, keep_alive_timeout, context);
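The factory above dispatches on the URI path (`/ping`, `/columns_info`, `/identifier_quote`, with everything else going to the main ODBC handler). A dependency-free sketch of the same routing idea, with purely illustrative handler bodies and no Poco types:

```cpp
#include <functional>
#include <iostream>
#include <map>
#include <string>

using Handler = std::function<std::string(const std::string & path)>;

/// Looks up a handler by exact path and falls back to a default handler otherwise.
std::string route(const std::map<std::string, Handler> & routes, const Handler & fallback, const std::string & path)
{
    auto it = routes.find(path);
    return it != routes.end() ? it->second(path) : fallback(path);
}

int main()
{
    std::map<std::string, Handler> routes{
        {"/ping",             [](const std::string &) { return std::string("Ok.\n"); }},
        {"/columns_info",     [](const std::string &) { return std::string("columns info handler"); }},
        {"/identifier_quote", [](const std::string &) { return std::string("identifier quote handler"); }},
    };
    Handler fallback = [](const std::string & p) { return "main ODBC handler for " + p; };

    std::cout << route(routes, fallback, "/identifier_quote") << '\n';
    std::cout << route(routes, fallback, "/") << '\n';
}
```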

View File

@ -5,6 +5,7 @@
#include <Poco/Net/HTTPRequestHandlerFactory.h>
#include "MainHandler.h"
#include "ColumnInfoHandler.h"
#include "IdentifierQuoteHandler.h"
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wunused-parameter"
@ -14,7 +15,7 @@
namespace DB
{
/** Factory for '/ping', '/' and '/columns_info' handlers.
/** Factory for '/ping', '/', '/columns_info', '/identifier_quote' handlers.
* Also stores Session pools for ODBC connections
*/
class HandlerFactory : public Poco::Net::HTTPRequestHandlerFactory

View File

@ -0,0 +1,69 @@
#include "IdentifierQuoteHandler.h"
#if USE_POCO_SQLODBC || USE_POCO_DATAODBC
#if USE_POCO_SQLODBC
#include <Poco/SQL/ODBC/ODBCException.h> // Y_IGNORE
#include <Poco/SQL/ODBC/SessionImpl.h> // Y_IGNORE
#include <Poco/SQL/ODBC/Utility.h> // Y_IGNORE
#define POCO_SQL_ODBC_CLASS Poco::SQL::ODBC
#endif
#if USE_POCO_DATAODBC
#include <Poco/Data/ODBC/ODBCException.h>
#include <Poco/Data/ODBC/SessionImpl.h>
#include <Poco/Data/ODBC/Utility.h>
#define POCO_SQL_ODBC_CLASS Poco::Data::ODBC
#endif
#include <DataTypes/DataTypeFactory.h>
#include <IO/WriteBufferFromHTTPServerResponse.h>
#include <IO/WriteHelpers.h>
#include <Parsers/ParserQueryWithOutput.h>
#include <Parsers/parseQuery.h>
#include <Poco/Net/HTMLForm.h>
#include <Poco/Net/HTTPServerRequest.h>
#include <Poco/Net/HTTPServerResponse.h>
#include <common/logger_useful.h>
#include <ext/scope_guard.h>
#include "getIdentifierQuote.h"
#include "validateODBCConnectionString.h"
namespace DB
{
void IdentifierQuoteHandler::handleRequest(Poco::Net::HTTPServerRequest & request, Poco::Net::HTTPServerResponse & response)
{
Poco::Net::HTMLForm params(request, request.stream());
LOG_TRACE(log, "Request URI: " + request.getURI());
auto process_error = [&response, this](const std::string & message)
{
response.setStatusAndReason(Poco::Net::HTTPResponse::HTTP_INTERNAL_SERVER_ERROR);
if (!response.sent())
response.send() << message << std::endl;
LOG_WARNING(log, message);
};
if (!params.has("connection_string"))
{
process_error("No 'connection_string' in request URL");
return;
}
try
{
std::string connection_string = params.get("connection_string");
POCO_SQL_ODBC_CLASS::SessionImpl session(validateODBCConnectionString(connection_string), DBMS_DEFAULT_CONNECT_TIMEOUT_SEC);
SQLHDBC hdbc = session.dbc().handle();
auto identifier = getIdentifierQuote(hdbc);
WriteBufferFromHTTPServerResponse out(request, response, keep_alive_timeout);
writeStringBinary(identifier, out);
}
catch (...)
{
process_error("Error getting identifier quote style from ODBC '" + getCurrentExceptionMessage(false) + "'");
tryLogCurrentException(log);
}
}
}
#endif

View File

@ -0,0 +1,28 @@
#pragma once
#include <Interpreters/Context.h>
#include <Poco/Logger.h>
#include <Poco/Net/HTTPRequestHandler.h>
#include <Common/config.h>
#if USE_POCO_SQLODBC || USE_POCO_DATAODBC
/** This handler establishes a connection to the database and retrieves the identifier quoting style.
*/
namespace DB
{
class IdentifierQuoteHandler : public Poco::Net::HTTPRequestHandler
{
public:
IdentifierQuoteHandler(size_t keep_alive_timeout_, std::shared_ptr<Context> context_)
: log(&Poco::Logger::get("IdentifierQuoteHandler")), keep_alive_timeout(keep_alive_timeout_), context(context_)
{
}
void handleRequest(Poco::Net::HTTPServerRequest & request, Poco::Net::HTTPServerResponse & response) override;
private:
Poco::Logger * log;
size_t keep_alive_timeout;
std::shared_ptr<Context> context;
};
}
#endif

View File

@ -5,8 +5,8 @@ was possible segfaults or another faults in ODBC implementations, which can
crash whole clickhouse-server process.
This tool works via HTTP, not via pipes, shared memory, or TCP because:
- It's simplier to implement
- It's simplier to debug
- It's simpler to implement
- It's simpler to debug
- jdbc-bridge can be implemented in the same way
## Usage

View File

@ -0,0 +1,44 @@
#include "getIdentifierQuote.h"
#if USE_POCO_SQLODBC || USE_POCO_DATAODBC
#if USE_POCO_SQLODBC
#include <Poco/SQL/ODBC/ODBCException.h> // Y_IGNORE
#include <Poco/SQL/ODBC/SessionImpl.h> // Y_IGNORE
#include <Poco/SQL/ODBC/Utility.h> // Y_IGNORE
#define POCO_SQL_ODBC_CLASS Poco::SQL::ODBC
#endif
#if USE_POCO_DATAODBC
#include <Poco/Data/ODBC/ODBCException.h>
#include <Poco/Data/ODBC/SessionImpl.h>
#include <Poco/Data/ODBC/Utility.h>
#define POCO_SQL_ODBC_CLASS Poco::Data::ODBC
#endif
namespace DB
{
std::string getIdentifierQuote(SQLHDBC hdbc)
{
std::string identifier;
SQLSMALLINT t;
SQLRETURN r = POCO_SQL_ODBC_CLASS::SQLGetInfo(hdbc, SQL_IDENTIFIER_QUOTE_CHAR, nullptr, 0, &t);
if (POCO_SQL_ODBC_CLASS::Utility::isError(r))
throw POCO_SQL_ODBC_CLASS::ConnectionException(hdbc);
if (t > 0)
{
// It is not clear why '2' must be added here; the value is taken from contrib/poco/Data/ODBC/src/ODBCStatementImpl.cpp:60 (SQL_DRIVER_NAME)
identifier.resize(static_cast<std::size_t>(t) + 2);
if (POCO_SQL_ODBC_CLASS::Utility::isError(POCO_SQL_ODBC_CLASS::SQLGetInfo(
hdbc, SQL_IDENTIFIER_QUOTE_CHAR, &identifier[0], SQLSMALLINT((identifier.length() - 1) * sizeof(identifier[0])), &t)))
throw POCO_SQL_ODBC_CLASS::ConnectionException(hdbc);
identifier.resize(static_cast<std::size_t>(t));
}
return identifier;
}
}
#endif

View File

@ -0,0 +1,22 @@
#pragma once
#include <Common/config.h>
#include <Interpreters/Context.h>
#include <Poco/Logger.h>
#include <Poco/Net/HTTPRequestHandler.h>
#if USE_POCO_SQLODBC || USE_POCO_DATAODBC
#if USE_POCO_SQLODBC
#include <Poco/SQL/ODBC/Utility.h> // Y_IGNORE
#endif
#if USE_POCO_DATAODBC
#include <Poco/Data/ODBC/Utility.h>
#endif
namespace DB
{
std::string getIdentifierQuote(SQLHDBC hdbc);
}
#endif

View File

@ -6,10 +6,10 @@
namespace DB
{
/** Passing arbitary connection string to ODBC Driver Manager is insecure, for the following reasons:
/** Passing arbitrary connection string to ODBC Driver Manager is insecure, for the following reasons:
* 1. Driver Manager like unixODBC has multiple bugs like buffer overflow.
* 2. Driver Manager can interpret some parameters as a path to library for dlopen or a file to read,
* thus allows arbitary remote code execution.
* thus allows arbitrary remote code execution.
*
* This function will throw exception if connection string has insecure parameters.
* It may also modify connection string to harden it.

View File

@ -199,8 +199,10 @@ void TCPHandler::runImpl()
else
processOrdinaryQuery();
sendLogs();
/// Do it before sending end of stream, to have a chance to show log message in client.
query_scope->logPeakMemoryUsage();
sendLogs();
sendEndOfStream();
query_scope.reset();

View File

@ -27,7 +27,7 @@ AggregateFunctionPtr createAggregateFunctionAvg(const std::string & name, const
AggregateFunctionPtr res;
DataTypePtr data_type = argument_types[0];
if (isDecimal(data_type))
res.reset(createWithDecimalType<AggregateFuncAvg>(*data_type));
res.reset(createWithDecimalType<AggregateFuncAvg>(*data_type, *data_type));
else
res.reset(createWithNumericType<AggregateFuncAvg>(*data_type));

View File

@ -107,7 +107,7 @@ public:
void add(AggregateDataPtr place, const IColumn ** columns, size_t row_num, Arena *) const override
{
/// TODO Do positions need to be 1-based for this function?
size_t position = columns[1]->get64(row_num);
size_t position = columns[1]->getUInt(row_num);
/// If position is larger than size to which array will be cutted - simply ignore value.
if (length_to_resize && position >= length_to_resize)

View File

@ -607,7 +607,7 @@ struct AggregateFunctionAnyLastData : Data
/** Implement 'heavy hitters' algorithm.
* Selects most frequent value if its frequency is more than 50% in each thread of execution.
* Otherwise, selects some arbitary value.
* Otherwise, selects some arbitrary value.
* http://www.cs.umd.edu/~samir/498/karp.pdf
*/
template <typename Data>

View File

@ -10,7 +10,7 @@ namespace DB
{
/** Aggregate function that takes arbitary number of arbitary arguments and does nothing.
/** Aggregate function that takes arbitrary number of arbitrary arguments and does nothing.
*/
class AggregateFunctionNothing final : public IAggregateFunctionHelper<AggregateFunctionNothing>
{

View File

@ -50,7 +50,7 @@ AggregateFunctionPtr createAggregateFunctionSum(const std::string & name, const
AggregateFunctionPtr res;
DataTypePtr data_type = argument_types[0];
if (isDecimal(data_type))
res.reset(createWithDecimalType<Function>(*data_type));
res.reset(createWithDecimalType<Function>(*data_type, *data_type));
else
res.reset(createWithNumericType<Function>(*data_type));

View File

@ -37,7 +37,9 @@ AggregateFunctionPtr createAggregateFunctionSumMap(const std::string & name, con
values_types.push_back(array_type->getNestedType());
}
AggregateFunctionPtr res(createWithNumericType<AggregateFunctionSumMap>(*keys_type, keys_type, std::move(values_types)));
AggregateFunctionPtr res(createWithNumericBasedType<AggregateFunctionSumMap>(*keys_type, keys_type, values_types));
if (!res)
res.reset(createWithDecimalType<AggregateFunctionSumMap>(*keys_type, keys_type, values_types));
if (!res)
throw Exception("Illegal type of argument for aggregate function " + name, ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT);

View File

@ -8,6 +8,8 @@
#include <Columns/ColumnArray.h>
#include <Columns/ColumnTuple.h>
#include <Columns/ColumnVector.h>
#include <Columns/ColumnDecimal.h>
#include <Common/FieldVisitors.h>
#include <AggregateFunctions/IAggregateFunction.h>
@ -53,6 +55,8 @@ class AggregateFunctionSumMap final : public IAggregateFunctionDataHelper<
AggregateFunctionSumMapData<typename NearestFieldType<T>::Type>, AggregateFunctionSumMap<T>>
{
private:
using ColVecType = std::conditional_t<IsDecimalNumber<T>, ColumnDecimal<T>, ColumnVector<T>>;
DataTypePtr keys_type;
DataTypes values_types;
@ -78,7 +82,7 @@ public:
// Column 0 contains array of keys of known type
const ColumnArray & array_column = static_cast<const ColumnArray &>(*columns[0]);
const IColumn::Offsets & offsets = array_column.getOffsets();
const auto & keys_vec = static_cast<const ColumnVector<T> &>(array_column.getData());
const auto & keys_vec = static_cast<const ColVecType &>(array_column.getData());
const size_t keys_vec_offset = row_num == 0 ? 0 : offsets[row_num - 1];
const size_t keys_vec_size = (offsets[row_num] - keys_vec_offset);
@ -99,9 +103,20 @@ public:
// Insert column values for all keys
for (size_t i = 0; i < keys_vec_size; ++i)
{
using MapType = std::decay_t<decltype(merged_maps)>;
using IteratorType = typename MapType::iterator;
array_column.getData().get(values_vec_offset + i, value);
const auto & key = keys_vec.getData()[keys_vec_offset + i];
const auto & it = merged_maps.find(key);
IteratorType it;
if constexpr (IsDecimalNumber<T>)
{
UInt32 scale = keys_vec.getData().getScale();
it = merged_maps.find(DecimalField<T>(key, scale));
}
else
it = merged_maps.find(key);
if (it != merged_maps.end())
applyVisitor(FieldVisitorSum(value), it->second[col]);
@ -113,7 +128,13 @@ public:
for (size_t k = 0; k < new_values.size(); ++k)
new_values[k] = (k == col) ? value : values_types[k]->getDefault();
merged_maps[key] = std::move(new_values);
if constexpr (IsDecimalNumber<T>)
{
UInt32 scale = keys_vec.getData().getScale();
merged_maps.emplace(DecimalField<T>(key, scale), std::move(new_values));
}
else
merged_maps.emplace(key, std::move(new_values));
}
}
}
@ -167,7 +188,10 @@ public:
for (size_t col = 0; col < values_types.size(); ++col)
values_types[col]->deserializeBinary(values[col], buf);
merged_maps[key.get<T>()] = values;
if constexpr (IsDecimalNumber<T>)
merged_maps[key.get<DecimalField<T>>()] = values;
else
merged_maps[key.get<T>()] = values;
}
}
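The `if constexpr (IsDecimalNumber<T>)` branches above exist because a decimal map key must also carry its scale. A self-contained sketch of that compile-time branching, using hypothetical `is_decimal` / `DecimalKey` stand-ins for ClickHouse's `IsDecimalNumber` / `DecimalField`:

```cpp
#include <cstdint>
#include <iostream>
#include <map>
#include <tuple>
#include <type_traits>

struct Decimal64 { int64_t value; };

/// Hypothetical trait playing the role of IsDecimalNumber<T>.
template <typename T>
inline constexpr bool is_decimal = std::is_same_v<T, Decimal64>;

/// Hypothetical stand-in for DecimalField: the key keeps its scale alongside the raw value.
struct DecimalKey
{
    int64_t value;
    uint32_t scale;
    bool operator<(const DecimalKey & rhs) const { return std::tie(value, scale) < std::tie(rhs.value, rhs.scale); }
};

template <typename T>
void addKey(std::map<std::conditional_t<is_decimal<T>, DecimalKey, T>, int> & merged, T key, uint32_t scale)
{
    if constexpr (is_decimal<T>)
        ++merged[DecimalKey{key.value, scale}];   // decimal keys carry their scale into the map
    else
        ++merged[key];                            // plain numeric keys are used as-is
}

int main()
{
    std::map<DecimalKey, int> decimal_map;
    addKey(decimal_map, Decimal64{12345}, /*scale=*/2);   // represents 123.45

    std::map<uint32_t, int> numeric_map;
    addKey(numeric_map, uint32_t{42}, /*scale=*/0);

    std::cout << decimal_map.size() + numeric_map.size() << " keys\n";
}
```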

View File

@ -24,9 +24,7 @@ AggregateFunctionPtr createAggregateFunctionStatisticsUnary(const std::string &
AggregateFunctionPtr res;
DataTypePtr data_type = argument_types[0];
if (isDecimal(data_type))
{
res.reset(createWithDecimalType<FunctionTemplate>(*data_type));
}
res.reset(createWithDecimalType<FunctionTemplate>(*data_type, *data_type));
else
res.reset(createWithNumericType<FunctionTemplate>(*data_type));

View File

@ -19,6 +19,6 @@ list(REMOVE_ITEM clickhouse_aggregate_functions_headers
FactoryHelpers.h
)
add_library(clickhouse_aggregate_functions ${clickhouse_aggregate_functions_sources})
add_library(clickhouse_aggregate_functions ${LINK_MODE} ${clickhouse_aggregate_functions_sources})
target_link_libraries(clickhouse_aggregate_functions dbms)
target_include_directories (clickhouse_aggregate_functions BEFORE PRIVATE ${COMMON_INCLUDE_DIR})

View File

@ -70,13 +70,28 @@ static IAggregateFunction * createWithUnsignedIntegerType(const IDataType & argu
return nullptr;
}
template <template <typename> class AggregateFunctionTemplate, typename ... TArgs>
static IAggregateFunction * createWithNumericBasedType(const IDataType & argument_type, TArgs && ... args)
{
IAggregateFunction * f = createWithNumericType<AggregateFunctionTemplate>(argument_type, std::forward<TArgs>(args)...);
if (f)
return f;
/// Expects that DataTypeDate is based on UInt16, DataTypeDateTime on UInt32, and UUID on UInt128.
WhichDataType which(argument_type);
if (which.idx == TypeIndex::Date) return new AggregateFunctionTemplate<UInt16>(std::forward<TArgs>(args)...);
if (which.idx == TypeIndex::DateTime) return new AggregateFunctionTemplate<UInt32>(std::forward<TArgs>(args)...);
if (which.idx == TypeIndex::UUID) return new AggregateFunctionTemplate<UInt128>(std::forward<TArgs>(args)...);
return nullptr;
}
template <template <typename> class AggregateFunctionTemplate, typename ... TArgs>
static IAggregateFunction * createWithDecimalType(const IDataType & argument_type, TArgs && ... args)
{
WhichDataType which(argument_type);
if (which.idx == TypeIndex::Decimal32) return new AggregateFunctionTemplate<Decimal32>(argument_type, std::forward<TArgs>(args)...);
if (which.idx == TypeIndex::Decimal64) return new AggregateFunctionTemplate<Decimal64>(argument_type, std::forward<TArgs>(args)...);
if (which.idx == TypeIndex::Decimal128) return new AggregateFunctionTemplate<Decimal128>(argument_type, std::forward<TArgs>(args)...);
if (which.idx == TypeIndex::Decimal32) return new AggregateFunctionTemplate<Decimal32>(std::forward<TArgs>(args)...);
if (which.idx == TypeIndex::Decimal64) return new AggregateFunctionTemplate<Decimal64>(std::forward<TArgs>(args)...);
if (which.idx == TypeIndex::Decimal128) return new AggregateFunctionTemplate<Decimal128>(std::forward<TArgs>(args)...);
return nullptr;
}
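A minimal sketch of the dispatch pattern behind `createWithNumericBasedType` / `createWithDecimalType`: a template template parameter names the aggregate-function family, and a runtime type tag selects which instantiation to construct (all names here are illustrative, not the actual ClickHouse factory):

```cpp
#include <cstdint>
#include <iostream>
#include <memory>
#include <utility>

enum class TypeIndex { UInt16, UInt32, UInt64, Unknown };

struct IAggregateFunction
{
    virtual ~IAggregateFunction() = default;
    virtual const char * name() const = 0;
};

template <typename T>
struct AggregateFunctionSumLike : IAggregateFunction
{
    const char * name() const override { return "sum-like"; }
    T state{};
};

/// The caller passes the *family* of aggregate functions as a template template parameter;
/// the helper picks the concrete instantiation from the runtime TypeIndex.
template <template <typename> class AggregateFunctionTemplate, typename... TArgs>
IAggregateFunction * createWithNumericBasedType(TypeIndex idx, TArgs &&... args)
{
    switch (idx)
    {
        case TypeIndex::UInt16: return new AggregateFunctionTemplate<uint16_t>(std::forward<TArgs>(args)...);
        case TypeIndex::UInt32: return new AggregateFunctionTemplate<uint32_t>(std::forward<TArgs>(args)...);
        case TypeIndex::UInt64: return new AggregateFunctionTemplate<uint64_t>(std::forward<TArgs>(args)...);
        default:                return nullptr;   // caller falls back to another factory, as in the diff above
    }
}

int main()
{
    std::unique_ptr<IAggregateFunction> f(createWithNumericBasedType<AggregateFunctionSumLike>(TypeIndex::UInt32));
    std::cout << (f ? f->name() : "unsupported type") << '\n';
}
```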

View File

@ -77,6 +77,17 @@ void Connection::connect()
socket->setReceiveTimeout(timeouts.receive_timeout);
socket->setSendTimeout(timeouts.send_timeout);
socket->setNoDelay(true);
if (timeouts.tcp_keep_alive_timeout.totalSeconds())
{
socket->setKeepAlive(true);
socket->setOption(IPPROTO_TCP,
#if defined(TCP_KEEPALIVE)
TCP_KEEPALIVE
#else
TCP_KEEPIDLE // __APPLE__
#endif
, timeouts.tcp_keep_alive_timeout);
}
in = std::make_shared<ReadBufferFromPocoSocket>(*socket);
out = std::make_shared<WriteBufferFromPocoSocket>(*socket);
@ -231,6 +242,14 @@ void Connection::getServerVersion(String & name, UInt64 & version_major, UInt64
revision = server_revision;
}
UInt64 Connection::getServerRevision()
{
if (!connected)
connect();
return server_revision;
}
const String & Connection::getServerTimezone()
{
if (!connected)
@ -349,7 +368,7 @@ void Connection::sendQuery(
{
ClientInfo client_info_to_send;
if (!client_info)
if (!client_info || client_info->empty())
{
/// No client info passed - means this query initiated by me.
client_info_to_send.query_kind = ClientInfo::QueryKind::INITIAL_QUERY;
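For reference, a minimal POSIX-only sketch (not the Poco-based code above) of the same keepalive setup, including the macOS/Linux naming difference for the idle-time option that the `#if defined(TCP_KEEPALIVE)` guard handles:

```cpp
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cstdio>

int main()
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return 1;

    // Enable TCP keepalive probes on the socket.
    int enable = 1;
    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &enable, sizeof(enable)) != 0)
        std::perror("SO_KEEPALIVE");

    // Idle time before the first probe, analogous to the tcp_keep_alive_timeout setting.
    int idle_seconds = 60;
    int option =
#if defined(TCP_KEEPALIVE)
        TCP_KEEPALIVE;   // macOS spelling of the idle-time option
#else
        TCP_KEEPIDLE;    // Linux / BSD spelling
#endif
    if (setsockopt(fd, IPPROTO_TCP, option, &idle_seconds, sizeof(idle_seconds)) != 0)
        std::perror("TCP keepalive idle time");

    close(fd);
    return 0;
}
```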

View File

@ -106,6 +106,7 @@ public:
void setDefaultDatabase(const String & database);
void getServerVersion(String & name, UInt64 & version_major, UInt64 & version_minor, UInt64 & version_patch, UInt64 & revision);
UInt64 getServerRevision();
const String & getServerTimezone();
const String & getServerDisplayName();

View File

@ -153,13 +153,9 @@ ConnectionPoolWithFailover::tryGetEntry(
{
result.entry = pool.get(settings, /* force_connected = */ false);
String server_name;
UInt64 server_version_major;
UInt64 server_version_minor;
UInt64 server_version_patch;
UInt64 server_revision;
UInt64 server_revision = 0;
if (table_to_check)
result.entry->getServerVersion(server_name, server_version_major, server_version_minor, server_version_patch, server_revision);
server_revision = result.entry->getServerRevision();
if (!table_to_check || server_revision < DBMS_MIN_REVISION_WITH_TABLES_STATUS)
{

View File

@ -87,28 +87,36 @@ void MultiplexedConnections::sendQuery(
if (sent_query)
throw Exception("Query already sent.", ErrorCodes::LOGICAL_ERROR);
if (replica_states.size() > 1)
Settings modified_settings = settings;
for (auto & replica : replica_states)
{
Settings query_settings = settings;
query_settings.parallel_replicas_count = replica_states.size();
if (!replica.connection)
throw Exception("MultiplexedConnections: Internal error", ErrorCodes::LOGICAL_ERROR);
for (size_t i = 0; i < replica_states.size(); ++i)
if (replica.connection->getServerRevision() < DBMS_MIN_REVISION_WITH_CURRENT_AGGREGATION_VARIANT_SELECTION_METHOD)
{
Connection * connection = replica_states[i].connection;
if (connection == nullptr)
throw Exception("MultiplexedConnections: Internal error", ErrorCodes::LOGICAL_ERROR);
/// Disable two-level aggregation due to version incompatibility.
modified_settings.group_by_two_level_threshold = 0;
modified_settings.group_by_two_level_threshold_bytes = 0;
}
}
query_settings.parallel_replica_offset = i;
connection->sendQuery(query, query_id, stage, &query_settings, client_info, with_pending_data);
size_t num_replicas = replica_states.size();
if (num_replicas > 1)
{
/// Use multiple replicas for parallel query processing.
modified_settings.parallel_replicas_count = num_replicas;
for (size_t i = 0; i < num_replicas; ++i)
{
modified_settings.parallel_replica_offset = i;
replica_states[i].connection->sendQuery(query, query_id, stage, &modified_settings, client_info, with_pending_data);
}
}
else
{
Connection * connection = replica_states[0].connection;
if (connection == nullptr)
throw Exception("MultiplexedConnections: Internal error", ErrorCodes::LOGICAL_ERROR);
connection->sendQuery(query, query_id, stage, &settings, client_info, with_pending_data);
/// Use single replica.
replica_states[0].connection->sendQuery(query, query_id, stage, &modified_settings, client_info, with_pending_data);
}
sent_query = true;

View File

@ -373,7 +373,7 @@ const char * ColumnAggregateFunction::deserializeAndInsertFromArena(const char *
/** We will read from src_arena.
* There is no limit for reading - it is assumed, that we can read all that we need after src_arena pointer.
* Buf ReadBufferFromMemory requires some bound. We will use arbitary big enough number, that will not overflow pointer.
* Buf ReadBufferFromMemory requires some bound. We will use arbitrary big enough number, that will not overflow pointer.
* NOTE Technically, this is not compatible with C++ standard,
* as we cannot legally compare pointers after last element + 1 of some valid memory region.
* Probably this will not work under UBSan.

View File

@ -15,7 +15,7 @@ namespace ErrorCodes
/** ColumnConst contains another column with single element,
* but looks like a column with arbitary amount of same elements.
* but looks like a column with arbitrary amount of same elements.
*/
class ColumnConst final : public COWPtrHelper<IColumn, ColumnConst>
{

View File

@ -1,6 +1,3 @@
//#include <cstring>
#include <cmath>
#include <Common/Exception.h>
#include <Common/Arena.h>
#include <Common/SipHash.h>
@ -20,6 +17,7 @@ namespace ErrorCodes
{
extern const int PARAMETER_OUT_OF_BOUND;
extern const int SIZES_OF_COLUMNS_DOESNT_MATCH;
extern const int NOT_IMPLEMENTED;
}
template <typename T>
@ -47,6 +45,14 @@ const char * ColumnDecimal<T>::deserializeAndInsertFromArena(const char * pos)
return pos + sizeof(T);
}
template <typename T>
UInt64 ColumnDecimal<T>::get64(size_t n) const
{
if constexpr (sizeof(T) > sizeof(UInt64))
throw Exception(String("Method get64 is not supported for ") + getFamilyName(), ErrorCodes::NOT_IMPLEMENTED);
return static_cast<typename T::NativeType>(data[n]);
}
template <typename T>
void ColumnDecimal<T>::updateHashWithValue(size_t n, SipHash & hash) const
{
@ -111,6 +117,14 @@ MutableColumnPtr ColumnDecimal<T>::cloneResized(size_t size) const
return std::move(res);
}
template <typename T>
void ColumnDecimal<T>::insertData(const char * src, size_t /*length*/)
{
T tmp;
memcpy(&tmp, src, sizeof(T));
data.emplace_back(tmp);
}
template <typename T>
void ColumnDecimal<T>::insertRangeFrom(const IColumn & src, size_t start, size_t length)
{
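The new `insertData` above copies the value with `memcpy` into a local temporary instead of dereferencing a casted pointer. A small standalone sketch of why that matters (the helper name is illustrative): the source buffer may be unaligned, and `memcpy` of a trivially copyable type is the portable, UB-free way to read it.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <vector>

/// Reads a T from a possibly unaligned buffer without undefined behaviour.
template <typename T>
T read_unaligned(const char * src)
{
    static_assert(std::is_trivially_copyable_v<T>, "memcpy is only valid for trivially copyable types");
    T tmp;
    std::memcpy(&tmp, src, sizeof(T));
    return tmp;
}

int main()
{
    char buffer[1 + sizeof(std::int64_t)] = {};
    std::int64_t value = 42;
    std::memcpy(buffer + 1, &value, sizeof(value));              // deliberately misaligned source

    std::vector<std::int64_t> column;
    column.push_back(read_unaligned<std::int64_t>(buffer + 1));  // what insertData does with its src pointer
    return column.front() == 42 ? 0 : 1;
}
```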

View File

@ -89,7 +89,7 @@ public:
void reserve(size_t n) override { data.reserve(n); }
void insertFrom(const IColumn & src, size_t n) override { data.push_back(static_cast<const Self &>(src).getData()[n]); }
void insertData(const char * pos, size_t /*length*/) override { data.push_back(*reinterpret_cast<const T *>(pos)); }
void insertData(const char * pos, size_t /*length*/) override;
void insertDefault() override { data.push_back(T()); }
void insert(const Field & x) override { data.push_back(DB::get<typename NearestFieldType<T>::Type>(x)); }
void insertRangeFrom(const IColumn & src, size_t start, size_t length) override;
@ -111,6 +111,8 @@ public:
void get(size_t n, Field & res) const override { res = (*this)[n]; }
bool getBool(size_t n) const override { return bool(data[n]); }
Int64 getInt(size_t n) const override { return Int64(data[n] * scale); }
UInt64 get64(size_t n) const override;
bool isDefaultAt(size_t n) const override { return data[n] == 0; }
ColumnPtr filter(const IColumn::Filter & filt, ssize_t result_size_hint) const override;
ColumnPtr permute(const IColumn::Permutation & perm, size_t limit) const override;

View File

@ -92,7 +92,7 @@ public:
}
/** If column is numeric, return value of n-th element, casted to UInt64.
* For NULL values of Nullable column it is allowed to return arbitary value.
* For NULL values of Nullable column it is allowed to return arbitrary value.
* Otherwise throw an exception.
*/
virtual UInt64 getUInt(size_t /*n*/) const
@ -105,6 +105,7 @@ public:
throw Exception("Method getInt is not supported for " + getName(), ErrorCodes::NOT_IMPLEMENTED);
}
virtual bool isDefaultAt(size_t n) const { return get64(n) == 0; }
virtual bool isNullAt(size_t /*n*/) const { return false; }
/** If column is numeric, return value of n-th element, casted to bool.
@ -174,7 +175,7 @@ public:
virtual const char * deserializeAndInsertFromArena(const char * pos) = 0;
/// Update state of hash function with value of n-th element.
/// On subsequent calls of this method for sequence of column values of arbitary types,
/// On subsequent calls of this method for sequence of column values of arbitrary types,
/// passed bytes to hash must identify sequence of values unambiguously.
virtual void updateHashWithValue(size_t n, SipHash & hash) const = 0;

View File

@ -462,6 +462,8 @@ XMLDocumentPtr ConfigProcessor::processConfig(
std::string include_from_path;
if (node)
{
/// Process the include_from node itself, since it may use substitutions from env or ZK.
doIncludesRecursive(config, nullptr, node, zk_node_cache, contributing_zk_paths);
include_from_path = node->innerText();
}
else

View File

@ -75,11 +75,14 @@ public:
static void detachQuery();
static void detachQueryIfNotDetached();
/// Initializes query with current thread as master thread in constructor, and detaches it in desstructor
/// Initializes query with current thread as master thread in constructor, and detaches it in destructor
struct QueryScope
{
explicit QueryScope(Context & query_context);
~QueryScope();
void logPeakMemoryUsage();
bool log_peak_memory_usage_in_destructor = true;
};
/// Implicitly finalizes current thread in the destructor

View File

@ -74,7 +74,7 @@ void MemoryTracker::alloc(Int64 size)
*/
Int64 will_be = size + amount.fetch_add(size, std::memory_order_relaxed);
if (!parent.load(std::memory_order_relaxed))
if (metric != CurrentMetrics::end())
CurrentMetrics::add(metric, size);
Int64 current_limit = limit.load(std::memory_order_relaxed);
@ -154,7 +154,8 @@ void MemoryTracker::free(Int64 size)
if (auto loaded_next = parent.load(std::memory_order_relaxed))
loaded_next->free(size);
else
if (metric != CurrentMetrics::end())
CurrentMetrics::sub(metric, size);
}
@ -169,7 +170,7 @@ void MemoryTracker::resetCounters()
void MemoryTracker::reset()
{
if (!parent.load(std::memory_order_relaxed))
if (metric != CurrentMetrics::end())
CurrentMetrics::sub(metric, amount.load(std::memory_order_relaxed));
resetCounters();

View File

@ -7,12 +7,6 @@
#include <Common/VariableContext.h>
namespace CurrentMetrics
{
extern const Metric MemoryTracking;
}
/** Tracks memory consumption.
* It throws an exception if amount of consumed memory become greater than certain limit.
* The same memory tracker could be simultaneously used in different threads.
@ -31,7 +25,7 @@ class MemoryTracker
std::atomic<MemoryTracker *> parent {};
/// You could specify custom metric to track memory usage.
CurrentMetrics::Metric metric = CurrentMetrics::MemoryTracking;
CurrentMetrics::Metric metric = CurrentMetrics::end();
/// This description will be used as prefix into log messages (if isn't nullptr)
const char * description = nullptr;
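A simplified sketch (not the ClickHouse `MemoryTracker` API) of the accounting pattern shown in these two files: allocations do a relaxed `fetch_add` on an atomic counter, and the optional external metric is only updated when one has been explicitly set, which is what the new `metric != CurrentMetrics::end()` checks express:

```cpp
#include <atomic>
#include <cstdint>
#include <stdexcept>
#include <string>

class SimpleMemoryTracker
{
public:
    explicit SimpleMemoryTracker(int64_t limit_) : limit(limit_) {}

    /// Attach an optional external metric; nullptr plays the role of CurrentMetrics::end().
    void setMetric(std::atomic<int64_t> * metric_) { metric = metric_; }

    void alloc(int64_t size)
    {
        /// Account first, then check, mirroring the fetch_add-then-compare structure above.
        int64_t will_be = size + amount.fetch_add(size, std::memory_order_relaxed);
        if (metric)
            *metric += size;

        if (limit && will_be > limit)
        {
            free(size);   /// roll back the accounting before reporting the error
            throw std::runtime_error("Memory limit exceeded: would use " + std::to_string(will_be) + " bytes");
        }
    }

    void free(int64_t size)
    {
        amount.fetch_sub(size, std::memory_order_relaxed);
        if (metric)
            *metric -= size;
    }

    int64_t get() const { return amount.load(std::memory_order_relaxed); }

private:
    std::atomic<int64_t> amount{0};
    int64_t limit = 0;
    std::atomic<int64_t> * metric = nullptr;
};

int main()
{
    std::atomic<int64_t> memory_tracking_metric{0};
    SimpleMemoryTracker tracker(1 << 20);   /// 1 MiB limit
    tracker.setMetric(&memory_tracking_metric);

    tracker.alloc(4096);
    tracker.free(4096);
    return tracker.get() == 0 ? 0 : 1;
}
```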

View File

@ -1,151 +0,0 @@
#include <Common/ODBCBridgeHelper.h>
#include <sstream>
#include <IO/ReadHelpers.h>
#include <IO/ReadWriteBufferFromHTTP.h>
#include <Poco/File.h>
#include <Poco/Net/HTTPRequest.h>
#include <Poco/Path.h>
#include <Poco/Util/AbstractConfiguration.h>
#include <Common/ShellCommand.h>
#include <Common/config.h>
#include <common/logger_useful.h>
#include <ext/range.h>
namespace DB
{
namespace ErrorCodes
{
extern const int EXTERNAL_SERVER_IS_NOT_RESPONDING;
}
ODBCBridgeHelper::ODBCBridgeHelper(
const Configuration & config_, const Poco::Timespan & http_timeout_, const std::string & connection_string_)
: config(config_), http_timeout(http_timeout_), connection_string(connection_string_)
{
size_t bridge_port = config.getUInt("odbc_bridge.port", DEFAULT_PORT);
std::string bridge_host = config.getString("odbc_bridge.host", DEFAULT_HOST);
ping_url.setHost(bridge_host);
ping_url.setPort(bridge_port);
ping_url.setScheme("http");
ping_url.setPath(PING_HANDLER);
}
void ODBCBridgeHelper::startODBCBridge() const
{
Poco::Path path{config.getString("application.dir", "")};
path.setFileName(
#if CLICKHOUSE_SPLIT_BINARY
"clickhouse-odbc-bridge"
#else
"clickhouse"
#endif
);
if (!Poco::File(path).exists())
throw Exception("clickhouse binary (" + path.toString() + ") is not found", ErrorCodes::EXTERNAL_EXECUTABLE_NOT_FOUND);
std::stringstream command;
command << path.toString() <<
#if CLICKHOUSE_SPLIT_BINARY
" "
#else
" odbc-bridge "
#endif
;
command << "--http-port " << config.getUInt("odbc_bridge.port", DEFAULT_PORT) << ' ';
command << "--listen-host " << config.getString("odbc_bridge.listen_host", DEFAULT_HOST) << ' ';
command << "--http-timeout " << http_timeout.totalMicroseconds() << ' ';
if (config.has("logger.odbc_bridge_log"))
command << "--log-path " << config.getString("logger.odbc_bridge_log") << ' ';
if (config.has("logger.odbc_bridge_errlog"))
command << "--err-log-path " << config.getString("logger.odbc_bridge_errlog") << ' ';
if (config.has("logger.odbc_bridge_level"))
command << "--log-level " << config.getString("logger.odbc_bridge_level") << ' ';
command << "&"; /// we don't want to wait this process
auto command_str = command.str();
LOG_TRACE(log, "Starting clickhouse-odbc-bridge with command: " << command_str);
auto cmd = ShellCommand::execute(command_str);
cmd->wait();
}
std::vector<std::pair<std::string, std::string>> ODBCBridgeHelper::getURLParams(const std::string & cols, size_t max_block_size) const
{
std::vector<std::pair<std::string, std::string>> result;
result.emplace_back("connection_string", connection_string); /// already validated
result.emplace_back("columns", cols);
result.emplace_back("max_block_size", std::to_string(max_block_size));
return result;
}
bool ODBCBridgeHelper::checkODBCBridgeIsRunning() const
{
try
{
ReadWriteBufferFromHTTP buf(ping_url, Poco::Net::HTTPRequest::HTTP_GET, nullptr);
return checkString(ODBCBridgeHelper::PING_OK_ANSWER, buf);
}
catch (...)
{
return false;
}
}
void ODBCBridgeHelper::startODBCBridgeSync() const
{
if (!checkODBCBridgeIsRunning())
{
LOG_TRACE(log, "clickhouse-odbc-bridge is not running, will try to start it");
startODBCBridge();
bool started = false;
for (size_t counter : ext::range(1, 20))
{
LOG_TRACE(log, "Checking clickhouse-odbc-bridge is running, try " << counter);
if (checkODBCBridgeIsRunning())
{
started = true;
break;
}
std::this_thread::sleep_for(std::chrono::milliseconds(10));
}
if (!started)
throw Exception("ODBCBridgeHelper: clickhouse-odbc-bridge is not responding", ErrorCodes::EXTERNAL_SERVER_IS_NOT_RESPONDING);
}
}
Poco::URI ODBCBridgeHelper::getMainURI() const
{
size_t bridge_port = config.getUInt("odbc_bridge.port", DEFAULT_PORT);
std::string bridge_host = config.getString("odbc_bridge.host", DEFAULT_HOST);
Poco::URI main_uri;
main_uri.setHost(bridge_host);
main_uri.setPort(bridge_port);
main_uri.setScheme("http");
main_uri.setPath(MAIN_HANDLER);
return main_uri;
}
Poco::URI ODBCBridgeHelper::getColumnsInfoURI() const
{
size_t bridge_port = config.getUInt("odbc_bridge.port", DEFAULT_PORT);
std::string bridge_host = config.getString("odbc_bridge.host", DEFAULT_HOST);
Poco::URI columns_info_uri;
columns_info_uri.setHost(bridge_host);
columns_info_uri.setPort(bridge_port);
columns_info_uri.setScheme("http");
columns_info_uri.setPath(COL_INFO_HANDLER);
return columns_info_uri;
}
}

View File

@ -1,52 +0,0 @@
#pragma once
#include <Interpreters/Context.h>
#include <Poco/Logger.h>
#include <Poco/URI.h>
#include <Poco/Util/AbstractConfiguration.h>
namespace DB
{
namespace ErrorCodes
{
extern const int EXTERNAL_EXECUTABLE_NOT_FOUND;
}
/** Helper for odbc-bridge, provide utility methods, not main request
*/
class ODBCBridgeHelper
{
private:
using Configuration = Poco::Util::AbstractConfiguration;
const Configuration & config;
Poco::Timespan http_timeout;
std::string connection_string;
Poco::URI ping_url;
Poco::Logger * log = &Poco::Logger::get("ODBCBridgeHelper");
public:
static constexpr inline size_t DEFAULT_PORT = 9018;
static constexpr inline auto DEFAULT_HOST = "localhost";
static constexpr inline auto DEFAULT_FORMAT = "RowBinary";
static constexpr inline auto PING_HANDLER = "/ping";
static constexpr inline auto MAIN_HANDLER = "/";
static constexpr inline auto COL_INFO_HANDLER = "/columns_info";
static constexpr inline auto PING_OK_ANSWER = "Ok.";
ODBCBridgeHelper(const Configuration & config_, const Poco::Timespan & http_timeout_, const std::string & connection_string_);
std::vector<std::pair<std::string, std::string>> getURLParams(const std::string & cols, size_t max_block_size) const;
bool checkODBCBridgeIsRunning() const;
void startODBCBridge() const;
void startODBCBridgeSync() const;
Poco::URI getMainURI() const;
Poco::URI getColumnsInfoURI() const;
};
}

View File

@ -0,0 +1,134 @@
#include <Common/UTF8Helpers.h>
#include <widechar_width.h>
namespace DB
{
namespace UTF8
{
// based on https://bjoern.hoehrmann.de/utf-8/decoder/dfa/
// Copyright (c) 2008-2009 Bjoern Hoehrmann <bjoern@hoehrmann.de>
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions: The above copyright
// notice and this permission notice shall be included in all copies or
// substantial portions of the Software.
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
// SOFTWARE.
static const UInt8 TABLE[] =
{
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, // 00..1f
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, // 20..3f
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, // 40..5f
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, // 60..7f
1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9, // 80..9f
7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7, // a0..bf
8,8,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2, // c0..df
0xa,0x3,0x3,0x3,0x3,0x3,0x3,0x3,0x3,0x3,0x3,0x3,0x3,0x4,0x3,0x3, // e0..ef
0xb,0x6,0x6,0x6,0x5,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8, // f0..ff
0x0,0x1,0x2,0x3,0x5,0x8,0x7,0x1,0x1,0x1,0x4,0x6,0x1,0x1,0x1,0x1, // s0..s0
1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,1,1,1,1,1,0,1,0,1,1,1,1,1,1, // s1..s2
1,2,1,1,1,1,1,2,1,2,1,1,1,1,1,1,1,1,1,1,1,1,1,2,1,1,1,1,1,1,1,1, // s3..s4
1,2,1,1,1,1,1,1,1,2,1,1,1,1,1,1,1,1,1,1,1,1,1,3,1,3,1,1,1,1,1,1, // s5..s6
1,3,1,1,1,1,1,3,1,3,1,1,1,1,1,1,1,3,1,1,1,1,1,1,1,1,1,1,1,1,1,1, // s7..s8
};
struct UTF8Decoder
{
enum
{
ACCEPT = 0,
REJECT = 1
};
UInt32 decode(UInt8 byte)
{
UInt32 type = TABLE[byte];
codepoint = (state != ACCEPT) ? (byte & 0x3fu) | (codepoint << 6) : (0xff >> type) & (byte);
state = TABLE[256 + state * 16 + type];
return state;
}
void reset()
{
state = ACCEPT;
codepoint = 0xfffdU;
}
UInt8 state {ACCEPT};
UInt32 codepoint {0};
};
static int wcwidth(wchar_t wc)
{
int width = widechar_wcwidth(wc);
switch (width)
{
case widechar_nonprint:
[[fallthrough]];
case widechar_combining:
[[fallthrough]];
case widechar_unassigned:
return 0;
case widechar_ambiguous:
[[fallthrough]];
case widechar_private_use:
[[fallthrough]];
case widechar_widened_in_9:
return 1;
default:
return width;
}
}
size_t computeWidth(const UInt8 * data, size_t size, size_t prefix) noexcept
{
UTF8Decoder decoder;
size_t width = 0;
size_t rollback = 0;
for (size_t i = 0; i < size; ++i)
{
switch (decoder.decode(data[i]))
{
case UTF8Decoder::REJECT:
decoder.reset();
// invalid sequences seem to have zero width in modern terminals
// tested in libvte-based, alacritty, urxvt and xterm
i -= rollback;
rollback = 0;
break;
case UTF8Decoder::ACCEPT:
// there are special control characters that manipulate the terminal output.
// (`0x08`, `0x09`, `0x0a`, `0x0b`, `0x0c`, `0x0d`, `0x1b`)
// Since we don't touch the original column data, there is no easy way to escape them.
// TODO: escape control characters
// TODO: multiline support for '\n'
// special treatment for '\t'
if (decoder.codepoint == '\t')
width += 8 - (prefix + width) % 8;
else
width += wcwidth(decoder.codepoint);
rollback = 0;
break;
// continue if we meet other values here
default:
++rollback;
}
}
// no need to handle a trailing (incomplete) sequence as it has zero width
return width;
}
}
}

View File

@ -72,6 +72,11 @@ inline size_t countCodePoints(const UInt8 * data, size_t size)
return res;
}
/// Returns the UTF-8 wcswidth. An invalid sequence is treated as a zero-width character.
/// `prefix` is the width of the text that precedes the string; it is used to compute the `\t` width,
/// which extends the output to the nearest longer length that is a multiple of eight.
size_t computeWidth(const UInt8 * data, size_t size, size_t prefix = 0) noexcept;
}
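Illustrative note (not part of the commit): the `\t` handling mentioned above rounds the running width up to the next 8-column tab stop. A tiny standalone sketch of just that arithmetic, with hypothetical values:
#include <cstddef>
#include <cstdio>
int main()
{
    std::size_t prefix = 3;   /// hypothetical width already printed before the string
    std::size_t width = 0;    /// width accumulated so far within the string
    width += 8 - (prefix + width) % 8;   /// a '\t' jumps to the next multiple of eight
    std::printf("tab at column %zu contributes width %zu\n", prefix, width);   /// prints 5
    return 0;
}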

View File

@ -0,0 +1,300 @@
#pragma once
#include <sstream>
#include <IO/ReadHelpers.h>
#include <IO/ReadWriteBufferFromHTTP.h>
#include <Interpreters/Context.h>
#include <Poco/File.h>
#include <Poco/Logger.h>
#include <Poco/Net/HTTPRequest.h>
#include <Poco/Path.h>
#include <Poco/URI.h>
#include <Poco/Util/AbstractConfiguration.h>
#include <Common/ShellCommand.h>
#include <Common/config.h>
#include <common/logger_useful.h>
#include <ext/range.h>
namespace DB
{
namespace ErrorCodes
{
extern const int EXTERNAL_EXECUTABLE_NOT_FOUND;
extern const int EXTERNAL_SERVER_IS_NOT_RESPONDING;
extern const int ILLEGAL_TYPE_OF_ARGUMENT;
}
/**
* Helper base class for XDBC bridges; provides utility methods, not the main request handling
*/
class IXDBCBridgeHelper
{
public:
static constexpr inline auto DEFAULT_FORMAT = "RowBinary";
virtual std::vector<std::pair<std::string, std::string>> getURLParams(const std::string & cols, size_t max_block_size) const = 0;
virtual void startBridgeSync() const = 0;
virtual Poco::URI getMainURI() const = 0;
virtual Poco::URI getColumnsInfoURI() const = 0;
virtual IdentifierQuotingStyle getIdentifierQuotingStyle() = 0;
virtual String getName() const = 0;
virtual ~IXDBCBridgeHelper() = default;
};
using BridgeHelperPtr = std::shared_ptr<IXDBCBridgeHelper>;
template <typename BridgeHelperMixin>
class XDBCBridgeHelper : public IXDBCBridgeHelper
{
private:
Poco::Timespan http_timeout;
std::string connection_string;
Poco::URI ping_url;
Poco::Logger * log = &Poco::Logger::get(BridgeHelperMixin::getName() + "BridgeHelper");
std::optional<IdentifierQuotingStyle> quote_style;
protected:
auto getConnectionString() const
{
return connection_string;
}
public:
using Configuration = Poco::Util::AbstractConfiguration;
const Configuration & config;
static constexpr inline auto DEFAULT_HOST = "localhost";
static constexpr inline auto DEFAULT_PORT = BridgeHelperMixin::DEFAULT_PORT;
static constexpr inline auto PING_HANDLER = "/ping";
static constexpr inline auto MAIN_HANDLER = "/";
static constexpr inline auto COL_INFO_HANDLER = "/columns_info";
static constexpr inline auto IDENTIFIER_QUOTE_HANDLER = "/identifier_quote";
static constexpr inline auto PING_OK_ANSWER = "Ok.";
XDBCBridgeHelper(const Configuration & config_, const Poco::Timespan & http_timeout_, const std::string & connection_string_)
: http_timeout(http_timeout_), connection_string(connection_string_), config(config_)
{
size_t bridge_port = config.getUInt(BridgeHelperMixin::configPrefix() + ".port", DEFAULT_PORT);
std::string bridge_host = config.getString(BridgeHelperMixin::configPrefix() + ".host", DEFAULT_HOST);
ping_url.setHost(bridge_host);
ping_url.setPort(bridge_port);
ping_url.setScheme("http");
ping_url.setPath(PING_HANDLER);
}
String getName() const override
{
return BridgeHelperMixin::getName();
}
IdentifierQuotingStyle getIdentifierQuotingStyle() override
{
if (!quote_style.has_value())
{
startBridgeSync();
auto uri = createBaseURI();
uri.setPath(IDENTIFIER_QUOTE_HANDLER);
uri.addQueryParameter("connection_string", getConnectionString());
ReadWriteBufferFromHTTP buf(uri, Poco::Net::HTTPRequest::HTTP_POST, nullptr);
std::string character;
readStringBinary(character, buf);
if (character.length() > 1)
throw Exception("Failed to parse quoting style from '" + character + "' for service " + BridgeHelperMixin::serviceAlias(), ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT);
else if (character.length() == 0)
quote_style = IdentifierQuotingStyle::None;
else if (character[0] == '`')
quote_style = IdentifierQuotingStyle::Backticks;
else if (character[0] == '"')
quote_style = IdentifierQuotingStyle::DoubleQuotes;
else
throw Exception("Can not map quote identifier '" + character + "' to enum value", ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT);
}
return *quote_style;
}
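Illustrative note (not part of the commit): a standalone sketch of the response-to-enum mapping implemented above, assuming the bridge answers with at most one character; the names below are hypothetical, not ClickHouse API.
#include <stdexcept>
#include <string>
enum class QuotingStyle { None, Backticks, DoubleQuotes };
QuotingStyle parseQuotingStyle(const std::string & answer)
{
    if (answer.size() > 1)
        throw std::runtime_error("Failed to parse quoting style from '" + answer + "'");
    if (answer.empty())
        return QuotingStyle::None;           /// remote dialect does not quote identifiers
    if (answer[0] == '`')
        return QuotingStyle::Backticks;      /// MySQL-style quoting
    if (answer[0] == '"')
        return QuotingStyle::DoubleQuotes;   /// standard SQL quoting
    throw std::runtime_error("Cannot map quote identifier '" + answer + "'");
}
int main()
{
    return parseQuotingStyle("\"") == QuotingStyle::DoubleQuotes ? 0 : 1;
}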
/**
* @todo leaky abstraction - used by external APIs
*/
std::vector<std::pair<std::string, std::string>> getURLParams(const std::string & cols, size_t max_block_size) const override
{
std::vector<std::pair<std::string, std::string>> result;
result.emplace_back("connection_string", connection_string); /// already validated
result.emplace_back("columns", cols);
result.emplace_back("max_block_size", std::to_string(max_block_size));
return result;
}
/**
* Spawns the external bridge daemon if it is not already running
*/
void startBridgeSync() const override
{
if (!checkBridgeIsRunning())
{
LOG_TRACE(log, BridgeHelperMixin::serviceAlias() + " is not running, will try to start it");
startBridge();
bool started = false;
for (size_t counter : ext::range(1, 20))
{
LOG_TRACE(log, "Checking " + BridgeHelperMixin::serviceAlias() + " is running, try " << counter);
if (checkBridgeIsRunning())
{
started = true;
break;
}
std::this_thread::sleep_for(std::chrono::milliseconds(10));
}
if (!started)
throw Exception(BridgeHelperMixin::getName() + "BridgeHelper: " + BridgeHelperMixin::serviceAlias() + " is not responding",
ErrorCodes::EXTERNAL_SERVER_IS_NOT_RESPONDING);
}
}
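Illustrative note (not part of the commit): the start-and-poll pattern above, reduced to a generic standalone sketch - start the helper, then ping it a bounded number of times with a short sleep. The 20 attempts and 10 ms delay mirror the code above; everything else is hypothetical.
#include <chrono>
#include <cstddef>
#include <cstdio>
#include <functional>
#include <stdexcept>
#include <thread>
bool waitUntilResponding(const std::function<bool()> & ping, size_t attempts, std::chrono::milliseconds delay)
{
    for (size_t attempt = 1; attempt <= attempts; ++attempt)
    {
        std::printf("checking the bridge, try %zu\n", attempt);
        if (ping())
            return true;
        std::this_thread::sleep_for(delay);
    }
    return false;
}
int main()
{
    size_t calls = 0;
    auto fake_ping = [&calls] { return ++calls >= 3; };   /// pretends the bridge answers on the third try
    if (!waitUntilResponding(fake_ping, 20, std::chrono::milliseconds(10)))
        throw std::runtime_error("bridge is not responding");
    return 0;
}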
/**
* URI to fetch the data from external service
*/
Poco::URI getMainURI() const override
{
auto uri = createBaseURI();
uri.setPath(MAIN_HANDLER);
return uri;
}
/**
* URI to retrieve column description from external service
*/
Poco::URI getColumnsInfoURI() const override
{
auto uri = createBaseURI();
uri.setPath(COL_INFO_HANDLER);
return uri;
}
protected:
Poco::URI createBaseURI() const
{
Poco::URI uri;
uri.setHost(ping_url.getHost());
uri.setPort(ping_url.getPort());
uri.setScheme("http");
return uri;
}
private:
bool checkBridgeIsRunning() const
{
try
{
ReadWriteBufferFromHTTP buf(ping_url, Poco::Net::HTTPRequest::HTTP_GET, nullptr);
return checkString(XDBCBridgeHelper::PING_OK_ANSWER, buf);
}
catch (...)
{
return false;
}
}
/* Contains logic for instantiation of the bridge instance */
void startBridge() const
{
BridgeHelperMixin::startBridge(config, log, http_timeout);
}
};
struct JDBCBridgeMixin
{
static constexpr inline auto DEFAULT_PORT = 9019;
static const String configPrefix()
{
return "jdbc_bridge";
}
static const String serviceAlias()
{
return "clickhouse-jdbc-bridge";
}
static const String getName()
{
return "JDBC";
}
static void startBridge(const Poco::Util::AbstractConfiguration &, const Poco::Logger *, const Poco::Timespan &)
{
throw Exception("jdbc-bridge is not running. Please, start it manually", ErrorCodes::EXTERNAL_SERVER_IS_NOT_RESPONDING);
}
};
struct ODBCBridgeMixin
{
static constexpr inline auto DEFAULT_PORT = 9018;
static const String configPrefix()
{
return "odbc_bridge";
}
static const String serviceAlias()
{
return "clickhouse-odbc-bridge";
}
static const String getName()
{
return "ODBC";
}
static void startBridge(const Poco::Util::AbstractConfiguration & config, Poco::Logger * log, const Poco::Timespan & http_timeout)
{
/// Path to executable folder
Poco::Path path{config.getString("application.dir", "/usr/bin")};
path.setFileName(
#if CLICKHOUSE_SPLIT_BINARY
"clickhouse-odbc-bridge"
#else
"clickhouse"
#endif
);
std::stringstream command;
command << path.toString() <<
#if CLICKHOUSE_SPLIT_BINARY
" "
#else
" odbc-bridge "
#endif
;
command << "--http-port " << config.getUInt(configPrefix() + ".port", DEFAULT_PORT) << ' ';
command << "--listen-host " << config.getString(configPrefix() + ".listen_host", XDBCBridgeHelper<ODBCBridgeMixin>::DEFAULT_HOST)
<< ' ';
command << "--http-timeout " << http_timeout.totalMicroseconds() << ' ';
if (config.has("logger." + configPrefix() + "_log"))
command << "--log-path " << config.getString("logger." + configPrefix() + "_log") << ' ';
if (config.has("logger." + configPrefix() + "_errlog"))
command << "--err-log-path " << config.getString("logger." + configPrefix() + "_errlog") << ' ';
if (config.has("logger." + configPrefix() + "_level"))
command << "--log-level " << config.getString("logger." + configPrefix() + "_level") << ' ';
command << "&"; /// we don't want to wait this process
auto command_str = command.str();
std::cerr << command_str << std::endl;
LOG_TRACE(log, "Starting " + serviceAlias() + " with command: " << command_str);
auto cmd = ShellCommand::execute(command_str);
cmd->wait();
}
};
}

View File

@ -1,4 +0,0 @@
#pragma once
#include <vector>
extern const char * auto_config_build[];

View File

@ -19,7 +19,7 @@ namespace DB
* This is unit of data processing.
* Also contains metadata - data types of columns and their names
* (either original names from a table, or generated names during temporary calculations).
* Allows to insert, remove columns in arbitary position, to change order of columns.
* Allows to insert, remove columns in arbitrary position, to change order of columns.
*/
class Context;

View File

@ -47,6 +47,10 @@
#define DBMS_MIN_REVISION_WITH_SERVER_DISPLAY_NAME 54372
#define DBMS_MIN_REVISION_WITH_VERSION_PATCH 54401
#define DBMS_MIN_REVISION_WITH_SERVER_LOGS 54406
/// Minimum revision with exactly the same set of aggregation methods and rules to select them.
/// Two-level (bucketed) aggregation is incompatible if servers are inconsistent in these rules
/// (keys will be placed in different buckets and result will not be fully aggregated).
#define DBMS_MIN_REVISION_WITH_CURRENT_AGGREGATION_VARIANT_SELECTION_METHOD 54408
/// Version of ClickHouse TCP protocol. Set to git tag with latest protocol change.
#define DBMS_TCP_PROTOCOL_VERSION 54226

View File

@ -26,10 +26,20 @@ struct SortColumnDescription
SortColumnDescription(const std::string & column_name_, int direction_, int nulls_direction_, const std::shared_ptr<Collator> & collator_ = nullptr)
: column_name(column_name_), column_number(0), direction(direction_), nulls_direction(nulls_direction_), collator(collator_) {}
bool operator == (const SortColumnDescription & other) const
{
return column_name == other.column_name && column_number == other.column_number
&& direction == other.direction && nulls_direction == other.nulls_direction;
}
bool operator != (const SortColumnDescription & other) const
{
return !(*this == other);
}
};
/// Description of the sorting rule for several columns.
using SortDescription = std::vector<SortColumnDescription>;
}

View File

@ -16,7 +16,7 @@ struct TypePair
template <typename T, bool _int, bool _float, bool _dec, typename F>
template <typename T, bool _int, bool _float, bool _decimal, bool _datetime, typename F>
bool callOnBasicType(TypeIndex number, F && f)
{
if constexpr (_int)
@ -40,7 +40,7 @@ bool callOnBasicType(TypeIndex number, F && f)
}
}
if constexpr (_dec)
if constexpr (_decimal)
{
switch (number)
{
@ -63,40 +63,51 @@ bool callOnBasicType(TypeIndex number, F && f)
}
}
if constexpr (_datetime)
{
switch (number)
{
case TypeIndex::Date: return f(TypePair<T, UInt16>());
case TypeIndex::DateTime: return f(TypePair<T, UInt32>());
default:
break;
}
}
return false;
}
/// Unroll template using TypeIndex
template <bool _int, bool _float, bool _dec, typename F>
template <bool _int, bool _float, bool _decimal, bool _datetime, typename F>
inline bool callOnBasicTypes(TypeIndex type_num1, TypeIndex type_num2, F && f)
{
if constexpr (_int)
{
switch (type_num1)
{
case TypeIndex::UInt8: return callOnBasicType<UInt8, _int, _float, _dec>(type_num2, std::forward<F>(f));
case TypeIndex::UInt16: return callOnBasicType<UInt16, _int, _float, _dec>(type_num2, std::forward<F>(f));
case TypeIndex::UInt32: return callOnBasicType<UInt32, _int, _float, _dec>(type_num2, std::forward<F>(f));
case TypeIndex::UInt64: return callOnBasicType<UInt64, _int, _float, _dec>(type_num2, std::forward<F>(f));
//case TypeIndex::UInt128: return callOnBasicType<UInt128, _int, _float, _dec>(type_num2, std::forward<F>(f));
case TypeIndex::UInt8: return callOnBasicType<UInt8, _int, _float, _decimal, _datetime>(type_num2, std::forward<F>(f));
case TypeIndex::UInt16: return callOnBasicType<UInt16, _int, _float, _decimal, _datetime>(type_num2, std::forward<F>(f));
case TypeIndex::UInt32: return callOnBasicType<UInt32, _int, _float, _decimal, _datetime>(type_num2, std::forward<F>(f));
case TypeIndex::UInt64: return callOnBasicType<UInt64, _int, _float, _decimal, _datetime>(type_num2, std::forward<F>(f));
//case TypeIndex::UInt128: return callOnBasicType<UInt128, _int, _float, _decimal, _datetime>(type_num2, std::forward<F>(f));
case TypeIndex::Int8: return callOnBasicType<Int8, _int, _float, _dec>(type_num2, std::forward<F>(f));
case TypeIndex::Int16: return callOnBasicType<Int16, _int, _float, _dec>(type_num2, std::forward<F>(f));
case TypeIndex::Int32: return callOnBasicType<Int32, _int, _float, _dec>(type_num2, std::forward<F>(f));
case TypeIndex::Int64: return callOnBasicType<Int64, _int, _float, _dec>(type_num2, std::forward<F>(f));
case TypeIndex::Int128: return callOnBasicType<Int128, _int, _float, _dec>(type_num2, std::forward<F>(f));
case TypeIndex::Int8: return callOnBasicType<Int8, _int, _float, _decimal, _datetime>(type_num2, std::forward<F>(f));
case TypeIndex::Int16: return callOnBasicType<Int16, _int, _float, _decimal, _datetime>(type_num2, std::forward<F>(f));
case TypeIndex::Int32: return callOnBasicType<Int32, _int, _float, _decimal, _datetime>(type_num2, std::forward<F>(f));
case TypeIndex::Int64: return callOnBasicType<Int64, _int, _float, _decimal, _datetime>(type_num2, std::forward<F>(f));
case TypeIndex::Int128: return callOnBasicType<Int128, _int, _float, _decimal, _datetime>(type_num2, std::forward<F>(f));
default:
break;
}
}
if constexpr (_dec)
if constexpr (_decimal)
{
switch (type_num1)
{
case TypeIndex::Decimal32: return callOnBasicType<Decimal32, _int, _float, _dec>(type_num2, std::forward<F>(f));
case TypeIndex::Decimal64: return callOnBasicType<Decimal64, _int, _float, _dec>(type_num2, std::forward<F>(f));
case TypeIndex::Decimal128: return callOnBasicType<Decimal128, _int, _float, _dec>(type_num2, std::forward<F>(f));
case TypeIndex::Decimal32: return callOnBasicType<Decimal32, _int, _float, _decimal, _datetime>(type_num2, std::forward<F>(f));
case TypeIndex::Decimal64: return callOnBasicType<Decimal64, _int, _float, _decimal, _datetime>(type_num2, std::forward<F>(f));
case TypeIndex::Decimal128: return callOnBasicType<Decimal128, _int, _float, _decimal, _datetime>(type_num2, std::forward<F>(f));
default:
break;
}
@ -106,8 +117,19 @@ inline bool callOnBasicTypes(TypeIndex type_num1, TypeIndex type_num2, F && f)
{
switch (type_num1)
{
case TypeIndex::Float32: return callOnBasicType<Float32, _int, _float, _dec>(type_num2, std::forward<F>(f));
case TypeIndex::Float64: return callOnBasicType<Float64, _int, _float, _dec>(type_num2, std::forward<F>(f));
case TypeIndex::Float32: return callOnBasicType<Float32, _int, _float, _decimal, _datetime>(type_num2, std::forward<F>(f));
case TypeIndex::Float64: return callOnBasicType<Float64, _int, _float, _decimal, _datetime>(type_num2, std::forward<F>(f));
default:
break;
}
}
if constexpr (_datetime)
{
switch (type_num1)
{
case TypeIndex::Date: return callOnBasicType<UInt16, _int, _float, _decimal, _datetime>(type_num2, std::forward<F>(f));
case TypeIndex::DateTime: return callOnBasicType<UInt32, _int, _float, _decimal, _datetime>(type_num2, std::forward<F>(f));
default:
break;
}

View File

@ -125,9 +125,6 @@ void CreatingSetsBlockInputStream::createOne(SubqueryForSet & subquery)
if (!done_with_join)
{
if (subquery.joined_block_actions)
subquery.joined_block_actions->execute(block);
for (const auto & name_with_alias : subquery.joined_block_aliases)
{
if (block.has(name_with_alias.first))
@ -140,6 +137,9 @@ void CreatingSetsBlockInputStream::createOne(SubqueryForSet & subquery)
}
}
if (subquery.joined_block_actions)
subquery.joined_block_actions->execute(block);
if (!subquery.join->insertFromBlock(block))
done_with_join = true;
}

View File

@ -0,0 +1,162 @@
#include <DataStreams/FinishSortingBlockInputStream.h>
#include <DataStreams/MergeSortingBlockInputStream.h>
#include <DataStreams/processConstants.h>
namespace DB
{
namespace ErrorCodes
{
extern const int LOGICAL_ERROR;
}
static bool isPrefix(const SortDescription & pref_descr, const SortDescription & descr)
{
if (pref_descr.size() > descr.size())
return false;
for (size_t i = 0; i < pref_descr.size(); ++i)
if (pref_descr[i] != descr[i])
return false;
return true;
}
FinishSortingBlockInputStream::FinishSortingBlockInputStream(
const BlockInputStreamPtr & input, const SortDescription & description_sorted_,
const SortDescription & description_to_sort_,
size_t max_merged_block_size_, size_t limit_)
: description_sorted(description_sorted_), description_to_sort(description_to_sort_),
max_merged_block_size(max_merged_block_size_), limit(limit_)
{
if (!isPrefix(description_sorted, description_to_sort))
throw Exception("Can`t finish sorting. SortDescription of already sorted stream is not prefix of "
"SortDescription needed to sort", ErrorCodes::LOGICAL_ERROR);
children.push_back(input);
header = children.at(0)->getHeader();
removeConstantsFromSortDescription(header, description_to_sort);
}
struct Less
{
const ColumnsWithSortDescriptions & left_columns;
const ColumnsWithSortDescriptions & right_columns;
Less(const ColumnsWithSortDescriptions & left_columns_, const ColumnsWithSortDescriptions & right_columns_) :
left_columns(left_columns_), right_columns(right_columns_) {}
bool operator() (size_t a, size_t b) const
{
for (auto it = left_columns.begin(), jt = right_columns.begin(); it != left_columns.end(); ++it, ++jt)
{
int res = it->second.direction * it->first->compareAt(a, b, *jt->first, it->second.nulls_direction);
if (res < 0)
return true;
else if (res > 0)
return false;
}
return false;
}
};
Block FinishSortingBlockInputStream::readImpl()
{
if (limit && total_rows_processed >= limit)
return {};
Block res;
if (impl)
res = impl->read();
/// If res block is empty, we have finished sorting previous chunk of blocks.
if (!res)
{
if (end_of_stream)
return {};
blocks.clear();
if (tail_block)
blocks.push_back(std::move(tail_block));
while (true)
{
Block block = children.back()->read();
/// End of input stream, but we can't return immediately: we need to merge the blocks already read.
/// Check it later, when get end of stream from impl.
if (!block)
{
end_of_stream = true;
break;
}
// If there were only const columns in sort description, then there is no need to sort.
// Return the blocks as is.
if (description_to_sort.empty())
return block;
size_t size = block.rows();
if (size == 0)
continue;
/// We need to sort each block separately before merging.
sortBlock(block, description_to_sort);
removeConstantsFromBlock(block);
/// Find the position of last already read key in current block.
if (!blocks.empty())
{
const Block & last_block = blocks.back();
auto last_columns = getColumnsWithSortDescription(last_block, description_sorted);
auto current_columns = getColumnsWithSortDescription(block, description_sorted);
Less less(last_columns, current_columns);
IColumn::Permutation perm(size);
for (size_t i = 0; i < size; ++i)
perm[i] = i;
auto it = std::upper_bound(perm.begin(), perm.end(), last_block.rows() - 1, less);
/// We need to save the tail of the block, because the next block may start with the same key as the tail
/// and those rows should be sorted within one chunk.
if (it != perm.end())
{
size_t tail_pos = it - perm.begin();
Block head_block = block.cloneEmpty();
tail_block = block.cloneEmpty();
for (size_t i = 0; i < block.columns(); ++i)
{
head_block.getByPosition(i).column = block.getByPosition(i).column->cut(0, tail_pos);
tail_block.getByPosition(i).column = block.getByPosition(i).column->cut(tail_pos, block.rows() - tail_pos);
}
if (head_block.rows())
blocks.push_back(head_block);
break;
}
}
/// If we reach this point, the current block is either the first one in the chunk
/// or consists entirely of rows with the same key as the tail of the previous block.
blocks.push_back(block);
}
impl = std::make_unique<MergeSortingBlocksBlockInputStream>(blocks, description_to_sort, max_merged_block_size, limit);
res = impl->read();
}
if (res)
enrichBlockWithConstants(res, header);
total_rows_processed += res.rows();
return res;
}
}

View File

@ -0,0 +1,51 @@
#pragma once
#include <Core/SortDescription.h>
#include <Interpreters/sortBlock.h>
#include <DataStreams/IProfilingBlockInputStream.h>
namespace DB
{
/** Takes a stream already sorted by `x` and finishes sorting it by (`x`, `y`).
* During sorting, only the blocks with rows equal by `x` are kept in RAM.
*/
class FinishSortingBlockInputStream : public IProfilingBlockInputStream
{
public:
/// limit - if not 0, it is allowed to return only the first 'limit' rows in sorted order.
FinishSortingBlockInputStream(const BlockInputStreamPtr & input, const SortDescription & description_sorted_,
const SortDescription & description_to_sort_,
size_t max_merged_block_size_, size_t limit_);
String getName() const override { return "FinishSorting"; }
bool isSortedOutput() const override { return true; }
const SortDescription & getSortDescription() const override { return description_to_sort; }
Block getHeader() const override { return header; }
protected:
Block readImpl() override;
private:
SortDescription description_sorted;
SortDescription description_to_sort;
size_t max_merged_block_size;
size_t limit;
Block tail_block;
Blocks blocks;
std::unique_ptr<IBlockInputStream> impl;
/// Before the operation, constant columns are removed from blocks (and put back afterwards)
/// to avoid excessive virtual function calls.
/// The original block structure is saved here.
Block header;
bool end_of_stream = false;
size_t total_rows_processed = 0;
};
}
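Illustrative note (not part of the commit): the idea described in the class comment above, reduced to a standalone sketch - when the input is already sorted by the first key, it is enough to sort each run of equal first keys by the second key, and only one run has to be held in memory at a time. Plain std::vector and std::pair stand in for blocks and sort descriptions.
#include <algorithm>
#include <iostream>
#include <utility>
#include <vector>
int main()
{
    /// already sorted by the first component ("x"); the second component ("y") is not sorted yet
    std::vector<std::pair<int, int>> rows = {{1, 3}, {1, 1}, {2, 2}, {2, 0}, {3, 5}};
    auto run_begin = rows.begin();
    while (run_begin != rows.end())
    {
        /// find the end of the run of rows equal by "x"
        auto run_end = std::find_if(run_begin, rows.end(),
            [&](const std::pair<int, int> & r) { return r.first != run_begin->first; });
        /// finish sorting this run by "y"
        std::sort(run_begin, run_end,
            [](const std::pair<int, int> & a, const std::pair<int, int> & b) { return a.second < b.second; });
        run_begin = run_end;
    }
    for (const auto & row : rows)
        std::cout << row.first << ' ' << row.second << '\n';
    return 0;
}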

View File

@ -132,7 +132,7 @@ void GraphiteRollupSortedBlockInputStream::merge(MutableColumns & merged_columns
is_first = false;
time_t next_row_time = next_cursor->all_columns[time_column_num]->get64(next_cursor->pos);
time_t next_row_time = next_cursor->all_columns[time_column_num]->getUInt(next_cursor->pos);
/// Is new key before rounding.
bool is_new_key = new_path || next_row_time != current_time;

View File

@ -2,9 +2,11 @@
#include <DataStreams/MergingSortedBlockInputStream.h>
#include <DataStreams/NativeBlockOutputStream.h>
#include <DataStreams/copyData.h>
#include <DataStreams/processConstants.h>
#include <Common/formatReadable.h>
#include <IO/WriteBufferFromFile.h>
#include <IO/CompressedWriteBuffer.h>
#include <Interpreters/sortBlock.h>
namespace ProfileEvents
@ -16,54 +18,6 @@ namespace ProfileEvents
namespace DB
{
/** Remove constant columns from block.
*/
static void removeConstantsFromBlock(Block & block)
{
size_t columns = block.columns();
size_t i = 0;
while (i < columns)
{
if (block.getByPosition(i).column->isColumnConst())
{
block.erase(i);
--columns;
}
else
++i;
}
}
static void removeConstantsFromSortDescription(const Block & header, SortDescription & description)
{
description.erase(std::remove_if(description.begin(), description.end(),
[&](const SortColumnDescription & elem)
{
if (!elem.column_name.empty())
return header.getByName(elem.column_name).column->isColumnConst();
else
return header.safeGetByPosition(elem.column_number).column->isColumnConst();
}), description.end());
}
/** Add into block, whose constant columns was removed by previous function,
* constant columns from header (which must have structure as before removal of constants from block).
*/
static void enrichBlockWithConstants(Block & block, const Block & header)
{
size_t rows = block.rows();
size_t columns = header.columns();
for (size_t i = 0; i < columns; ++i)
{
const auto & col_type_name = header.getByPosition(i);
if (col_type_name.column->isColumnConst())
block.insert(i, {col_type_name.column->cloneResized(rows), col_type_name.type, col_type_name.name});
}
}
MergeSortingBlockInputStream::MergeSortingBlockInputStream(
const BlockInputStreamPtr & input, SortDescription & description_,
size_t max_merged_block_size_, size_t limit_, size_t max_bytes_before_remerge_,
@ -303,5 +257,4 @@ void MergeSortingBlockInputStream::remerge()
sum_rows_in_blocks = new_sum_rows_in_blocks;
sum_bytes_in_blocks = new_sum_bytes_in_blocks;
}
}

View File

@ -130,5 +130,4 @@ private:
/// If remerge doesn't save memory at least several times, mark it as useless and don't do it anymore.
bool remerge_is_useful = true;
};
}

View File

@ -44,6 +44,9 @@ PushingToViewsBlockOutputStream::PushingToViewsBlockOutputStream(
auto dependent_table = context.getTable(database_table.first, database_table.second);
auto & materialized_view = dynamic_cast<const StorageMaterializedView &>(*dependent_table);
if (StoragePtr inner_table = materialized_view.tryGetTargetTable())
addTableLock(inner_table->lockStructure(true, __PRETTY_FUNCTION__));
auto query = materialized_view.getInnerQuery();
BlockOutputStreamPtr out = std::make_shared<PushingToViewsBlockOutputStream>(
database_table.first, database_table.second, dependent_table, *views_context, ASTPtr());
@ -77,15 +80,15 @@ void PushingToViewsBlockOutputStream::write(const Block & block)
return;
// Insert data into materialized views only after successful insert into main table
bool allow_concurrent_view_processing = context.getSettingsRef().allow_concurrent_view_processing;
if (allow_concurrent_view_processing && views.size() > 1)
const Settings & settings = context.getSettingsRef();
if (settings.parallel_view_processing && views.size() > 1)
{
// Push to views concurrently if enabled, and more than one view is attached
ThreadPool pool(std::min(getNumberOfPhysicalCPUCores(), views.size()));
ThreadPool pool(std::min(size_t(settings.max_threads), views.size()));
for (size_t view_num = 0; view_num < views.size(); ++view_num)
{
auto thread_group = CurrentThread::getGroup();
pool.schedule([=] ()
pool.schedule([=]
{
setThreadName("PushingToViewsBlockOutputStream");
CurrentThread::attachToIfDetached(thread_group);
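Illustrative note (not part of the commit): a standalone sketch of the concurrency bound used above - at most min(max_threads, number of views) workers push the block to the attached views. std::thread stands in for the ClickHouse ThreadPool; the numbers are hypothetical.
#include <algorithm>
#include <cstddef>
#include <cstdio>
#include <thread>
#include <vector>
int main()
{
    const size_t max_threads = 4;   /// stands in for settings.max_threads
    const size_t views = 6;         /// number of attached materialized views
    const size_t pool_size = std::min(max_threads, views);
    std::vector<std::thread> workers;
    for (size_t worker = 0; worker < pool_size; ++worker)
        workers.emplace_back([worker, pool_size, views]
        {
            for (size_t view = worker; view < views; view += pool_size)
                std::printf("worker %zu pushes the block to view %zu\n", worker, view);
        });
    for (auto & w : workers)
        w.join();
    return 0;
}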

View File

@ -33,9 +33,6 @@ RemoteBlockOutputStream::RemoteBlockOutputStream(Connection & connection_, const
if (Protocol::Server::Data == packet.type)
{
header = packet.block;
if (!header)
throw Exception("Logical error: empty block received as table structure", ErrorCodes::LOGICAL_ERROR);
break;
}
else if (Protocol::Server::Exception == packet.type)
@ -58,7 +55,8 @@ RemoteBlockOutputStream::RemoteBlockOutputStream(Connection & connection_, const
void RemoteBlockOutputStream::write(const Block & block)
{
assertBlocksHaveEqualStructure(block, header, "RemoteBlockOutputStream");
if (header)
assertBlocksHaveEqualStructure(block, header, "RemoteBlockOutputStream");
try
{

View File

@ -216,7 +216,7 @@ void SummingSortedBlockInputStream::insertCurrentRowIfNeeded(MutableColumns & me
if (desc.column_numbers.size() == 1)
{
// Flag the row as non-empty if at least one column value is non-zero
current_row_is_zero = current_row_is_zero && desc.merged_column->get64(desc.merged_column->size() - 1) == 0;
current_row_is_zero = current_row_is_zero && desc.merged_column->isDefaultAt(desc.merged_column->size() - 1);
}
else
{

View File

@ -0,0 +1,48 @@
#include <DataStreams/processConstants.h>
namespace DB
{
void removeConstantsFromBlock(Block & block)
{
size_t columns = block.columns();
size_t i = 0;
while (i < columns)
{
if (block.getByPosition(i).column->isColumnConst())
{
block.erase(i);
--columns;
}
else
++i;
}
}
void removeConstantsFromSortDescription(const Block & header, SortDescription & description)
{
description.erase(std::remove_if(description.begin(), description.end(),
[&](const SortColumnDescription & elem)
{
if (!elem.column_name.empty())
return header.getByName(elem.column_name).column->isColumnConst();
else
return header.safeGetByPosition(elem.column_number).column->isColumnConst();
}), description.end());
}
void enrichBlockWithConstants(Block & block, const Block & header)
{
size_t rows = block.rows();
size_t columns = header.columns();
for (size_t i = 0; i < columns; ++i)
{
const auto & col_type_name = header.getByPosition(i);
if (col_type_name.column->isColumnConst())
block.insert(i, {col_type_name.column->cloneResized(rows), col_type_name.type, col_type_name.name});
}
}
}

View File

@ -0,0 +1,23 @@
#pragma once
#include <Core/Block.h>
#include <Core/SortDescription.h>
namespace DB
{
/** Functions for manipulating constant columns during sorting.
* See MergeSortingBlocksBlockInputStream and FinishSortingBlockInputStream for details.
*/
/** Remove constant columns from block.
*/
void removeConstantsFromBlock(Block & block);
void removeConstantsFromSortDescription(const Block & header, SortDescription & description);
/** Add back into the block (whose constant columns were removed by the previous function)
* the constant columns from the header (which must have the same structure as before the removal).
*/
void enrichBlockWithConstants(Block & block, const Block & header);
}

View File

@ -11,3 +11,6 @@ target_link_libraries (union_stream2 dbms)
add_executable (collapsing_sorted_stream collapsing_sorted_stream.cpp ${SRCS})
target_link_libraries (collapsing_sorted_stream dbms)
add_executable (finish_sorting_stream finish_sorting_stream.cpp ${SRCS})
target_link_libraries (finish_sorting_stream dbms)

View File

@ -0,0 +1,106 @@
#include <iostream>
#include <iomanip>
#include <DataTypes/DataTypesNumber.h>
#include <Columns/ColumnsNumber.h>
#include <Core/SortDescription.h>
#include <DataStreams/MergeSortingBlockInputStream.h>
#include <DataStreams/PartialSortingBlockInputStream.h>
#include <DataStreams/FinishSortingBlockInputStream.h>
#include <Interpreters/sortBlock.h>
using namespace DB;
int main(int argc, char ** argv)
{
srand(123456);
try
{
size_t m = argc >= 2 ? atoi(argv[1]) : 2;
size_t n = argc >= 3 ? atoi(argv[2]) : 10;
Blocks blocks;
for (size_t t = 0; t < m; ++t)
{
Block block;
for (size_t i = 0; i < 2; ++i)
{
ColumnWithTypeAndName column;
column.name = "col" + std::to_string(i + 1);
column.type = std::make_shared<DataTypeInt32>();
auto col = ColumnInt32::create();
auto & vec = col->getData();
vec.resize(n);
for (size_t j = 0; j < n; ++j)
vec[j] = rand() % 10;
column.column = std::move(col);
block.insert(column);
}
blocks.push_back(block);
}
SortDescription sort_descr;
sort_descr.emplace_back("col1", 1, 1);
for (auto & block : blocks)
sortBlock(block, sort_descr);
BlockInputStreamPtr stream = std::make_shared<MergeSortingBlocksBlockInputStream>(blocks, sort_descr, n);
SortDescription sort_descr_final;
sort_descr_final.emplace_back("col1", 1, 1);
sort_descr_final.emplace_back("col2", 1, 1);
stream = std::make_shared<FinishSortingBlockInputStream>(stream, sort_descr, sort_descr_final, n, 0);
{
Stopwatch stopwatch;
stopwatch.start();
Block res_block = blocks[0].cloneEmpty();
while (Block block = stream->read())
{
for (size_t i = 0; i < block.columns(); ++i)
{
MutableColumnPtr ptr = (*std::move(res_block.getByPosition(i).column)).mutate();
ptr->insertRangeFrom(*block.getByPosition(i).column.get(), 0, block.rows());
}
}
if (res_block.rows() != n * m)
throw Exception("Result block size mismatch");
const auto & columns = res_block.getColumns();
for (size_t i = 1; i < res_block.rows(); ++i)
for (const auto & col : columns)
{
int res = col->compareAt(i - 1, i, *col, 1);
if (res < 0)
break;
else if (res > 0)
throw Exception("Result stream not sorted");
}
stopwatch.stop();
std::cout << std::fixed << std::setprecision(2)
<< "Elapsed " << stopwatch.elapsedSeconds() << " sec."
<< ", " << n / stopwatch.elapsedSeconds() << " rows/sec."
<< std::endl;
}
}
catch (const Exception & e)
{
std::cerr << e.displayText() << std::endl;
return -1;
}
return 0;
}

View File

@ -172,6 +172,7 @@ static DataTypePtr create(const ASTPtr & arguments)
void registerDataTypeDateTime(DataTypeFactory & factory)
{
factory.registerDataType("DateTime", create, DataTypeFactory::CaseInsensitive);
factory.registerAlias("TIMESTAMP", "DateTime", DataTypeFactory::CaseInsensitive);
}

View File

@ -73,7 +73,7 @@ DataTypePtr FieldToDataType::operator() (const DecimalField<Decimal64> & x) cons
DataTypePtr FieldToDataType::operator() (const DecimalField<Decimal128> & x) const
{
using Type = DataTypeDecimal<Decimal64>;
using Type = DataTypeDecimal<Decimal128>;
return std::make_shared<Type>(Type::maxPrecision(), x.getScale());
}

View File

@ -304,7 +304,7 @@ public:
virtual bool shouldAlignRightInPrettyFormats() const { return false; }
/** Can a value formatted in any text format contain anything other than valid UTF8 sequences?
* Example: String (because it can contain arbitary bytes).
* Example: String (because it can contain arbitrary bytes).
* Counterexamples: numbers, Date, DateTime.
* For Enum, it depends.
*/

View File

@ -1,8 +1,10 @@
#pragma once
#include <Core/Types.h>
#include <type_traits>
#include <Core/Types.h>
#include <Common/UInt128.h>
namespace DB
{
@ -145,6 +147,8 @@ template <typename A> struct ResultOfBitNot
* Float<x>, Float<y> -> Float<max(x, y)>
* UInt<x>, Int<y> -> Int<max(x*2, y)>
* Float<x>, [U]Int<y> -> Float<max(x, y*2)>
* Decimal<x>, Decimal<y> -> Decimal<max(x,y)>
* UUID, UUID -> UUID
* UInt64 , Int<x> -> Error
* Float<x>, [U]Int64 -> Error
*/
@ -161,11 +165,16 @@ struct ResultOfIf
static constexpr size_t max_size_of_integer = max(std::is_integral_v<A> ? sizeof(A) : 0, std::is_integral_v<B> ? sizeof(B) : 0);
static constexpr size_t max_size_of_float = max(std::is_floating_point_v<A> ? sizeof(A) : 0, std::is_floating_point_v<B> ? sizeof(B) : 0);
using Type = typename Construct<has_signed, has_float,
using ConstructedType = typename Construct<has_signed, has_float,
((has_float && has_integer && max_size_of_integer >= max_size_of_float)
|| (has_signed && has_unsigned && max_size_of_unsigned_integer >= max_size_of_signed_integer))
? max(sizeof(A), sizeof(B)) * 2
: max(sizeof(A), sizeof(B))>::Type;
using ConstructedWithUUID = std::conditional_t<std::is_same_v<A, UInt128> && std::is_same_v<B, UInt128>, A, ConstructedType>;
using Type = std::conditional_t<!IsDecimalNumber<A> && !IsDecimalNumber<B>, ConstructedWithUUID,
std::conditional_t<IsDecimalNumber<A> && IsDecimalNumber<B>, std::conditional_t<(sizeof(A) > sizeof(B)), A, B>, Error>>;
};
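Illustrative note (not part of the commit): a hypothetical standalone example of the rule "UInt<x>, Int<y> -> Int<max(x*2, y)>" listed above - C++'s own arithmetic conversion keeps the result unsigned, so mixing UInt32 with Int16 has to be widened to Int64 (max(4*2, 2) = 8 bytes) for both ranges to stay representable.
#include <cstdint>
#include <cstdio>
#include <type_traits>
int main()
{
    std::uint32_t a = 4000000000u;   /// does not fit into Int32
    std::int16_t b = -1;
    /// the built-in common type of the conditional stays unsigned and would mangle -1
    static_assert(std::is_unsigned_v<decltype(true ? a : b)>, "built-in common type is unsigned");
    /// Int64 is the narrowest signed type that represents both operands exactly
    std::int64_t result = true ? static_cast<std::int64_t>(a) : static_cast<std::int64_t>(b);
    std::printf("%lld\n", static_cast<long long>(result));
    return 0;
}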
/** Before applying operator `%` and bitwise operations, operands are casted to whole numbers. */

View File

@ -1,5 +1,6 @@
#include <iomanip>
#include <Poco/Event.h>
#include <Poco/DirectoryIterator.h>
#include <common/logger_useful.h>
@ -41,7 +42,6 @@ namespace ErrorCodes
static constexpr size_t PRINT_MESSAGE_EACH_N_TABLES = 256;
static constexpr size_t PRINT_MESSAGE_EACH_N_SECONDS = 5;
static constexpr size_t METADATA_FILE_BUFFER_SIZE = 32768;
static constexpr size_t TABLES_PARALLEL_LOAD_BUNCH_SIZE = 100;
namespace detail
{
@ -149,6 +149,9 @@ void DatabaseOrdinary::loadTables(
ErrorCodes::INCORRECT_FILE_NAME);
}
if (file_names.empty())
return;
/** Tables load faster if they are loaded in sorted (by name) order.
* Otherwise (for the ext4 filesystem), `DirectoryIterator` iterates through them in some order,
* which corresponds neither to the order of table creation nor to the order of their location on disk.
@ -160,36 +163,27 @@ void DatabaseOrdinary::loadTables(
AtomicStopwatch watch;
std::atomic<size_t> tables_processed {0};
Poco::Event all_tables_processed;
auto task_function = [&](FileNames::const_iterator begin, FileNames::const_iterator end)
auto task_function = [&](const String & table)
{
for (auto it = begin; it != end; ++it)
/// Messages, so that it's not boring to wait for the server to load for a long time.
if ((tables_processed + 1) % PRINT_MESSAGE_EACH_N_TABLES == 0
|| watch.compareAndRestart(PRINT_MESSAGE_EACH_N_SECONDS))
{
const String & table = *it;
/// Messages, so that it's not boring to wait for the server to load for a long time.
if ((++tables_processed) % PRINT_MESSAGE_EACH_N_TABLES == 0
|| watch.compareAndRestart(PRINT_MESSAGE_EACH_N_SECONDS))
{
LOG_INFO(log, std::fixed << std::setprecision(2) << tables_processed * 100.0 / total_tables << "%");
watch.restart();
}
loadTable(context, metadata_path, *this, name, data_path, table, has_force_restore_data_flag);
LOG_INFO(log, std::fixed << std::setprecision(2) << tables_processed * 100.0 / total_tables << "%");
watch.restart();
}
loadTable(context, metadata_path, *this, name, data_path, table, has_force_restore_data_flag);
if (++tables_processed == total_tables)
all_tables_processed.set();
};
const size_t bunch_size = TABLES_PARALLEL_LOAD_BUNCH_SIZE;
size_t num_bunches = (total_tables + bunch_size - 1) / bunch_size;
for (size_t i = 0; i < num_bunches; ++i)
for (const auto & filename : file_names)
{
auto begin = file_names.begin() + i * bunch_size;
auto end = (i + 1 == num_bunches)
? file_names.end()
: (file_names.begin() + (i + 1) * bunch_size);
auto task = std::bind(task_function, begin, end);
auto task = std::bind(task_function, filename);
if (thread_pool)
thread_pool->schedule(task);
@ -198,7 +192,7 @@ void DatabaseOrdinary::loadTables(
}
if (thread_pool)
thread_pool->wait();
all_tables_processed.wait();
/// After all tables was basically initialized, startup them.
startupTables(thread_pool);
@ -212,47 +206,38 @@ void DatabaseOrdinary::startupTables(ThreadPool * thread_pool)
AtomicStopwatch watch;
std::atomic<size_t> tables_processed {0};
size_t total_tables = tables.size();
Poco::Event all_tables_processed;
auto task_function = [&](Tables::iterator begin, Tables::iterator end)
if (!total_tables)
return;
auto task_function = [&](const StoragePtr & table)
{
for (auto it = begin; it != end; ++it)
if ((tables_processed + 1) % PRINT_MESSAGE_EACH_N_TABLES == 0
|| watch.compareAndRestart(PRINT_MESSAGE_EACH_N_SECONDS))
{
if ((++tables_processed) % PRINT_MESSAGE_EACH_N_TABLES == 0
|| watch.compareAndRestart(PRINT_MESSAGE_EACH_N_SECONDS))
{
LOG_INFO(log, std::fixed << std::setprecision(2) << tables_processed * 100.0 / total_tables << "%");
watch.restart();
}
it->second->startup();
LOG_INFO(log, std::fixed << std::setprecision(2) << tables_processed * 100.0 / total_tables << "%");
watch.restart();
}
table->startup();
if (++tables_processed == total_tables)
all_tables_processed.set();
};
const size_t bunch_size = TABLES_PARALLEL_LOAD_BUNCH_SIZE;
size_t num_bunches = (total_tables + bunch_size - 1) / bunch_size;
auto begin = tables.begin();
for (size_t i = 0; i < num_bunches; ++i)
for (const auto & name_storage : tables)
{
auto end = begin;
if (i + 1 == num_bunches)
end = tables.end();
else
std::advance(end, bunch_size);
auto task = std::bind(task_function, begin, end);
auto task = std::bind(task_function, name_storage.second);
if (thread_pool)
thread_pool->schedule(task);
else
task();
begin = end;
}
if (thread_pool)
thread_pool->wait();
all_tables_processed.wait();
}
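Illustrative note (not part of the commit): the completion pattern introduced above - each per-table task increments an atomic counter and the last one signals the waiter - sketched with standard C++ primitives in place of Poco::Event and the ClickHouse ThreadPool; the table count and names are hypothetical.
#include <atomic>
#include <condition_variable>
#include <cstddef>
#include <cstdio>
#include <mutex>
#include <thread>
#include <vector>
int main()
{
    const size_t total_tables = 8;
    std::atomic<size_t> tables_processed{0};
    std::mutex mutex;
    std::condition_variable all_done;
    std::vector<std::thread> pool;
    for (size_t i = 0; i < total_tables; ++i)
        pool.emplace_back([&, i]
        {
            std::printf("loading table %zu\n", i);   /// stands in for loadTable(...)
            if (++tables_processed == total_tables)
            {
                std::lock_guard<std::mutex> lock(mutex);
                all_done.notify_one();   /// the last finished task wakes the waiter
            }
        });
    {
        std::unique_lock<std::mutex> lock(mutex);
        all_done.wait(lock, [&] { return tables_processed == total_tables; });
    }
    for (auto & thread : pool)
        thread.join();
    return 0;
}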

View File

@ -2,7 +2,6 @@
#include <sstream>
#include <memory>
#include <Columns/ColumnsNumber.h>
#include <Columns/ColumnVector.h>
#include <Columns/ColumnString.h>
#include <Common/BitHelpers.h>
#include <Common/randomSeed.h>
@ -209,7 +208,7 @@ void CacheDictionary::isInConstantVector(
#define DECLARE(TYPE)\
void CacheDictionary::get##TYPE(const std::string & attribute_name, const PaddedPODArray<Key> & ids, PaddedPODArray<TYPE> & out) const\
void CacheDictionary::get##TYPE(const std::string & attribute_name, const PaddedPODArray<Key> & ids, ResultArrayType<TYPE> & out) const\
{\
auto & attribute = getAttribute(attribute_name);\
if (!isAttributeTypeConvertibleTo(attribute.type, AttributeUnderlyingType::TYPE))\
@ -230,6 +229,9 @@ DECLARE(Int32)
DECLARE(Int64)
DECLARE(Float32)
DECLARE(Float64)
DECLARE(Decimal32)
DECLARE(Decimal64)
DECLARE(Decimal128)
#undef DECLARE
void CacheDictionary::getString(const std::string & attribute_name, const PaddedPODArray<Key> & ids, ColumnString * out) const
@ -246,7 +248,7 @@ void CacheDictionary::getString(const std::string & attribute_name, const Padded
#define DECLARE(TYPE)\
void CacheDictionary::get##TYPE(\
const std::string & attribute_name, const PaddedPODArray<Key> & ids, const PaddedPODArray<TYPE> & def,\
PaddedPODArray<TYPE> & out) const\
ResultArrayType<TYPE> & out) const\
{\
auto & attribute = getAttribute(attribute_name);\
if (!isAttributeTypeConvertibleTo(attribute.type, AttributeUnderlyingType::TYPE))\
@ -265,6 +267,9 @@ DECLARE(Int32)
DECLARE(Int64)
DECLARE(Float32)
DECLARE(Float64)
DECLARE(Decimal32)
DECLARE(Decimal64)
DECLARE(Decimal128)
#undef DECLARE
void CacheDictionary::getString(
@ -280,7 +285,7 @@ void CacheDictionary::getString(
#define DECLARE(TYPE)\
void CacheDictionary::get##TYPE(\
const std::string & attribute_name, const PaddedPODArray<Key> & ids, const TYPE def, PaddedPODArray<TYPE> & out) const\
const std::string & attribute_name, const PaddedPODArray<Key> & ids, const TYPE def, ResultArrayType<TYPE> & out) const\
{\
auto & attribute = getAttribute(attribute_name);\
if (!isAttributeTypeConvertibleTo(attribute.type, AttributeUnderlyingType::TYPE))\
@ -299,6 +304,9 @@ DECLARE(Int32)
DECLARE(Int64)
DECLARE(Float32)
DECLARE(Float64)
DECLARE(Decimal32)
DECLARE(Decimal64)
DECLARE(Decimal128)
#undef DECLARE
void CacheDictionary::getString(
@ -491,6 +499,21 @@ CacheDictionary::Attribute CacheDictionary::createAttributeWithType(const Attrib
std::get<ContainerPtrType<Int64>>(attr.arrays) = std::make_unique<ContainerType<Int64>>(size);
bytes_allocated += size * sizeof(Int64);
break;
case AttributeUnderlyingType::Decimal32:
std::get<Decimal32>(attr.null_values) = null_value.get<Decimal32>();
std::get<ContainerPtrType<Decimal32>>(attr.arrays) = std::make_unique<ContainerType<Decimal32>>(size);
bytes_allocated += size * sizeof(Decimal32);
break;
case AttributeUnderlyingType::Decimal64:
std::get<Decimal64>(attr.null_values) = null_value.get<Decimal64>();
std::get<ContainerPtrType<Decimal64>>(attr.arrays) = std::make_unique<ContainerType<Decimal64>>(size);
bytes_allocated += size * sizeof(Decimal64);
break;
case AttributeUnderlyingType::Decimal128:
std::get<Decimal128>(attr.null_values) = null_value.get<Decimal128>();
std::get<ContainerPtrType<Decimal128>>(attr.arrays) = std::make_unique<ContainerType<Decimal128>>(size);
bytes_allocated += size * sizeof(Decimal128);
break;
case AttributeUnderlyingType::Float32:
std::get<Float32>(attr.null_values) = null_value.get<Float64>();
std::get<ContainerPtrType<Float32>>(attr.arrays) = std::make_unique<ContainerType<Float32>>(size);
@ -518,7 +541,7 @@ template <typename OutputType, typename DefaultGetter>
void CacheDictionary::getItemsNumber(
Attribute & attribute,
const PaddedPODArray<Key> & ids,
PaddedPODArray<OutputType> & out,
ResultArrayType<OutputType> & out,
DefaultGetter && get_default) const
{
if (false) {}
@ -536,6 +559,9 @@ void CacheDictionary::getItemsNumber(
DISPATCH(Int64)
DISPATCH(Float32)
DISPATCH(Float64)
DISPATCH(Decimal32)
DISPATCH(Decimal64)
DISPATCH(Decimal128)
#undef DISPATCH
else
throw Exception("Unexpected type of attribute: " + toString(attribute.type), ErrorCodes::LOGICAL_ERROR);
@ -545,7 +571,7 @@ template <typename AttributeType, typename OutputType, typename DefaultGetter>
void CacheDictionary::getItemsNumberImpl(
Attribute & attribute,
const PaddedPODArray<Key> & ids,
PaddedPODArray<OutputType> & out,
ResultArrayType<OutputType> & out,
DefaultGetter && get_default) const
{
/// Mapping: <id> -> { all indices `i` of `ids` such that `ids[i]` = <id> }
@ -893,6 +919,17 @@ void CacheDictionary::setDefaultAttributeValue(Attribute & attribute, const Key
case AttributeUnderlyingType::Int64: std::get<ContainerPtrType<Int64>>(attribute.arrays)[idx] = std::get<Int64>(attribute.null_values); break;
case AttributeUnderlyingType::Float32: std::get<ContainerPtrType<Float32>>(attribute.arrays)[idx] = std::get<Float32>(attribute.null_values); break;
case AttributeUnderlyingType::Float64: std::get<ContainerPtrType<Float64>>(attribute.arrays)[idx] = std::get<Float64>(attribute.null_values); break;
case AttributeUnderlyingType::Decimal32:
std::get<ContainerPtrType<Decimal32>>(attribute.arrays)[idx] = std::get<Decimal32>(attribute.null_values);
break;
case AttributeUnderlyingType::Decimal64:
std::get<ContainerPtrType<Decimal64>>(attribute.arrays)[idx] = std::get<Decimal64>(attribute.null_values);
break;
case AttributeUnderlyingType::Decimal128:
std::get<ContainerPtrType<Decimal128>>(attribute.arrays)[idx] = std::get<Decimal128>(attribute.null_values);
break;
case AttributeUnderlyingType::String:
{
const auto & null_value_ref = std::get<String>(attribute.null_values);
@ -926,6 +963,11 @@ void CacheDictionary::setAttributeValue(Attribute & attribute, const Key idx, co
case AttributeUnderlyingType::Int64: std::get<ContainerPtrType<Int64>>(attribute.arrays)[idx] = value.get<Int64>(); break;
case AttributeUnderlyingType::Float32: std::get<ContainerPtrType<Float32>>(attribute.arrays)[idx] = value.get<Float64>(); break;
case AttributeUnderlyingType::Float64: std::get<ContainerPtrType<Float64>>(attribute.arrays)[idx] = value.get<Float64>(); break;
case AttributeUnderlyingType::Decimal32: std::get<ContainerPtrType<Decimal32>>(attribute.arrays)[idx] = value.get<Decimal32>(); break;
case AttributeUnderlyingType::Decimal64: std::get<ContainerPtrType<Decimal64>>(attribute.arrays)[idx] = value.get<Decimal64>(); break;
case AttributeUnderlyingType::Decimal128: std::get<ContainerPtrType<Decimal128>>(attribute.arrays)[idx] = value.get<Decimal128>(); break;
case AttributeUnderlyingType::String:
{
const auto & string = value.get<String>();

View File

@ -5,6 +5,7 @@
#include <Dictionaries/DictionaryStructure.h>
#include <Common/ArenaWithFreeLists.h>
#include <Common/CurrentMetrics.h>
#include <Columns/ColumnDecimal.h>
#include <Columns/ColumnString.h>
#include <ext/bit_cast.h>
#include <cmath>
@ -80,8 +81,11 @@ public:
void isInVectorConstant(const PaddedPODArray<Key> & child_ids, const Key ancestor_id, PaddedPODArray<UInt8> & out) const override;
void isInConstantVector(const Key child_id, const PaddedPODArray<Key> & ancestor_ids, PaddedPODArray<UInt8> & out) const override;
template <typename T>
using ResultArrayType = std::conditional_t<IsDecimalNumber<T>, DecimalPaddedPODArray<T>, PaddedPODArray<T>>;
#define DECLARE(TYPE)\
void get##TYPE(const std::string & attribute_name, const PaddedPODArray<Key> & ids, PaddedPODArray<TYPE> & out) const;
void get##TYPE(const std::string & attribute_name, const PaddedPODArray<Key> & ids, ResultArrayType<TYPE> & out) const;
DECLARE(UInt8)
DECLARE(UInt16)
DECLARE(UInt32)
@ -93,6 +97,9 @@ public:
DECLARE(Int64)
DECLARE(Float32)
DECLARE(Float64)
DECLARE(Decimal32)
DECLARE(Decimal64)
DECLARE(Decimal128)
#undef DECLARE
void getString(const std::string & attribute_name, const PaddedPODArray<Key> & ids, ColumnString * out) const;
@ -100,7 +107,7 @@ public:
#define DECLARE(TYPE)\
void get##TYPE(\
const std::string & attribute_name, const PaddedPODArray<Key> & ids, const PaddedPODArray<TYPE> & def,\
PaddedPODArray<TYPE> & out) const;
ResultArrayType<TYPE> & out) const;
DECLARE(UInt8)
DECLARE(UInt16)
DECLARE(UInt32)
@ -112,6 +119,9 @@ public:
DECLARE(Int64)
DECLARE(Float32)
DECLARE(Float64)
DECLARE(Decimal32)
DECLARE(Decimal64)
DECLARE(Decimal128)
#undef DECLARE
void getString(
@ -119,8 +129,7 @@ public:
ColumnString * const out) const;
#define DECLARE(TYPE)\
void get##TYPE(\
const std::string & attribute_name, const PaddedPODArray<Key> & ids, const TYPE def, PaddedPODArray<TYPE> & out) const;
void get##TYPE(const std::string & attribute_name, const PaddedPODArray<Key> & ids, const TYPE def, ResultArrayType<TYPE> & out) const;
DECLARE(UInt8)
DECLARE(UInt16)
DECLARE(UInt32)
@ -132,6 +141,9 @@ public:
DECLARE(Int64)
DECLARE(Float32)
DECLARE(Float64)
DECLARE(Decimal32)
DECLARE(Decimal64)
DECLARE(Decimal128)
#undef DECLARE
void getString(
@ -174,12 +186,14 @@ private:
UInt8, UInt16, UInt32, UInt64,
UInt128,
Int8, Int16, Int32, Int64,
Decimal32, Decimal64, Decimal128,
Float32, Float64,
String> null_values;
std::tuple<
ContainerPtrType<UInt8>, ContainerPtrType<UInt16>, ContainerPtrType<UInt32>, ContainerPtrType<UInt64>,
ContainerPtrType<UInt128>,
ContainerPtrType<Int8>, ContainerPtrType<Int16>, ContainerPtrType<Int32>, ContainerPtrType<Int64>,
ContainerPtrType<Decimal32>, ContainerPtrType<Decimal64>, ContainerPtrType<Decimal128>,
ContainerPtrType<Float32>, ContainerPtrType<Float64>,
ContainerPtrType<StringRef>> arrays;
};
@ -193,14 +207,14 @@ private:
void getItemsNumber(
Attribute & attribute,
const PaddedPODArray<Key> & ids,
PaddedPODArray<OutputType> & out,
ResultArrayType<OutputType> & out,
DefaultGetter && get_default) const;
template <typename AttributeType, typename OutputType, typename DefaultGetter>
void getItemsNumberImpl(
Attribute & attribute,
const PaddedPODArray<Key> & ids,
PaddedPODArray<OutputType> & out,
ResultArrayType<OutputType> & out,
DefaultGetter && get_default) const;
template <typename DefaultGetter>

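The recurring change in these dictionary headers is the ResultArrayType alias: getters for Decimal attributes now write into a decimal-aware container, while every other numeric type keeps using PaddedPODArray. A hedged sketch of the selection mechanism, with stand-in container types rather than the real ClickHouse classes:

```cpp
#include <type_traits>
#include <vector>

// Stand-ins for the real containers and the IsDecimalNumber trait.
struct Decimal32 { int value = 0; };

template <typename T> inline constexpr bool IsDecimalNumber = false;
template <>           inline constexpr bool IsDecimalNumber<Decimal32> = true;

template <typename T> struct PaddedPODArray        { std::vector<T> data; };  // plain numeric output buffer
template <typename T> struct DecimalPaddedPODArray { std::vector<T> data; };  // decimal-aware buffer in the real code

// Same shape as the new ResultArrayType alias: Decimal types select the
// decimal container, everything else keeps PaddedPODArray.
template <typename T>
using ResultArrayType = std::conditional_t<IsDecimalNumber<T>, DecimalPaddedPODArray<T>, PaddedPODArray<T>>;

static_assert(std::is_same_v<ResultArrayType<float>, PaddedPODArray<float>>);
static_assert(std::is_same_v<ResultArrayType<Decimal32>, DecimalPaddedPODArray<Decimal32>>);

int main() {}
```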
View File

@ -6,6 +6,7 @@
#include <tuple>
#include <vector>
#include <shared_mutex>
#include <Columns/ColumnDecimal.h>
#include <Columns/ColumnString.h>
#include <Common/ArenaWithFreeLists.h>
#include <Common/HashTable/HashMap.h>
@ -128,11 +129,14 @@ public:
return dict_struct.attributes[&getAttribute(attribute_name) - attributes.data()].injective;
}
template <typename T>
using ResultArrayType = std::conditional_t<IsDecimalNumber<T>, DecimalPaddedPODArray<T>, PaddedPODArray<T>>;
/// In all functions below, key_columns must be full (non-constant) columns.
/// See the requirement in IDataType.h for text-serialization functions.
#define DECLARE(TYPE) \
void get##TYPE( \
const std::string & attribute_name, const Columns & key_columns, const DataTypes & key_types, PaddedPODArray<TYPE> & out) const;
const std::string & attribute_name, const Columns & key_columns, const DataTypes & key_types, ResultArrayType<TYPE> & out) const;
DECLARE(UInt8)
DECLARE(UInt16)
DECLARE(UInt32)
@ -144,6 +148,9 @@ public:
DECLARE(Int64)
DECLARE(Float32)
DECLARE(Float64)
DECLARE(Decimal32)
DECLARE(Decimal64)
DECLARE(Decimal128)
#undef DECLARE
void getString(const std::string & attribute_name, const Columns & key_columns, const DataTypes & key_types, ColumnString * out) const;
@ -153,7 +160,7 @@ public:
const Columns & key_columns, \
const DataTypes & key_types, \
const PaddedPODArray<TYPE> & def, \
PaddedPODArray<TYPE> & out) const;
ResultArrayType<TYPE> & out) const;
DECLARE(UInt8)
DECLARE(UInt16)
DECLARE(UInt32)
@ -165,6 +172,9 @@ public:
DECLARE(Int64)
DECLARE(Float32)
DECLARE(Float64)
DECLARE(Decimal32)
DECLARE(Decimal64)
DECLARE(Decimal128)
#undef DECLARE
void getString(const std::string & attribute_name,
@ -178,7 +188,7 @@ public:
const Columns & key_columns, \
const DataTypes & key_types, \
const TYPE def, \
PaddedPODArray<TYPE> & out) const;
ResultArrayType<TYPE> & out) const;
DECLARE(UInt8)
DECLARE(UInt16)
DECLARE(UInt32)
@ -190,6 +200,9 @@ public:
DECLARE(Int64)
DECLARE(Float32)
DECLARE(Float64)
DECLARE(Decimal32)
DECLARE(Decimal64)
DECLARE(Decimal128)
#undef DECLARE
void getString(const std::string & attribute_name,
@ -247,7 +260,9 @@ private:
struct Attribute final
{
AttributeUnderlyingType type;
std::tuple<UInt8, UInt16, UInt32, UInt64, UInt128, Int8, Int16, Int32, Int64, Float32, Float64, String> null_values;
std::tuple<UInt8, UInt16, UInt32, UInt64, UInt128, Int8, Int16, Int32, Int64,
Decimal32, Decimal64, Decimal128,
Float32, Float64, String> null_values;
std::tuple<ContainerPtrType<UInt8>,
ContainerPtrType<UInt16>,
ContainerPtrType<UInt32>,
@ -257,6 +272,9 @@ private:
ContainerPtrType<Int16>,
ContainerPtrType<Int32>,
ContainerPtrType<Int64>,
ContainerPtrType<Decimal32>,
ContainerPtrType<Decimal64>,
ContainerPtrType<Decimal128>,
ContainerPtrType<Float32>,
ContainerPtrType<Float64>,
ContainerPtrType<StringRef>>
@ -288,6 +306,9 @@ private:
DISPATCH(Int64)
DISPATCH(Float32)
DISPATCH(Float64)
DISPATCH(Decimal32)
DISPATCH(Decimal64)
DISPATCH(Decimal128)
#undef DISPATCH
else throw Exception("Unexpected type of attribute: " + toString(attribute.type), ErrorCodes::LOGICAL_ERROR);
}

View File

@ -64,6 +64,21 @@ ComplexKeyCacheDictionary::Attribute ComplexKeyCacheDictionary::createAttributeW
std::get<ContainerPtrType<Float64>>(attr.arrays) = std::make_unique<ContainerType<Float64>>(size);
bytes_allocated += size * sizeof(Float64);
break;
case AttributeUnderlyingType::Decimal32:
std::get<Decimal32>(attr.null_values) = null_value.get<Decimal32>();
std::get<ContainerPtrType<Decimal32>>(attr.arrays) = std::make_unique<ContainerType<Decimal32>>(size);
bytes_allocated += size * sizeof(Decimal32);
break;
case AttributeUnderlyingType::Decimal64:
std::get<Decimal64>(attr.null_values) = null_value.get<Decimal64>();
std::get<ContainerPtrType<Decimal64>>(attr.arrays) = std::make_unique<ContainerType<Decimal64>>(size);
bytes_allocated += size * sizeof(Decimal64);
break;
case AttributeUnderlyingType::Decimal128:
std::get<Decimal128>(attr.null_values) = null_value.get<Decimal128>();
std::get<ContainerPtrType<Decimal128>>(attr.arrays) = std::make_unique<ContainerType<Decimal128>>(size);
bytes_allocated += size * sizeof(Decimal128);
break;
case AttributeUnderlyingType::String:
std::get<String>(attr.null_values) = null_value.get<String>();
std::get<ContainerPtrType<StringRef>>(attr.arrays) = std::make_unique<ContainerType<StringRef>>(size);

View File

@ -9,7 +9,7 @@ namespace ErrorCodes
#define DECLARE(TYPE) \
void ComplexKeyCacheDictionary::get##TYPE( \
const std::string & attribute_name, const Columns & key_columns, const DataTypes & key_types, PaddedPODArray<TYPE> & out) const \
const std::string & attribute_name, const Columns & key_columns, const DataTypes & key_types, ResultArrayType<TYPE> & out) const \
{ \
dict_struct.validateKeyTypes(key_types); \
\
@ -33,5 +33,8 @@ DECLARE(Int32)
DECLARE(Int64)
DECLARE(Float32)
DECLARE(Float64)
DECLARE(Decimal32)
DECLARE(Decimal64)
DECLARE(Decimal128)
#undef DECLARE
}

View File

@ -12,7 +12,7 @@ namespace ErrorCodes
const Columns & key_columns, \
const DataTypes & key_types, \
const PaddedPODArray<TYPE> & def, \
PaddedPODArray<TYPE> & out) const \
ResultArrayType<TYPE> & out) const \
{ \
dict_struct.validateKeyTypes(key_types); \
\
@ -34,5 +34,8 @@ DECLARE(Int32)
DECLARE(Int64)
DECLARE(Float32)
DECLARE(Float64)
DECLARE(Decimal32)
DECLARE(Decimal64)
DECLARE(Decimal128)
#undef DECLARE
}

View File

@ -12,7 +12,7 @@ namespace ErrorCodes
const Columns & key_columns, \
const DataTypes & key_types, \
const TYPE def, \
PaddedPODArray<TYPE> & out) const \
ResultArrayType<TYPE> & out) const \
{ \
dict_struct.validateKeyTypes(key_types); \
\
@ -34,5 +34,8 @@ DECLARE(Int32)
DECLARE(Int64)
DECLARE(Float32)
DECLARE(Float64)
DECLARE(Decimal32)
DECLARE(Decimal64)
DECLARE(Decimal128)
#undef DECLARE
}

View File

@ -18,6 +18,11 @@ void ComplexKeyCacheDictionary::setAttributeValue(Attribute & attribute, const s
case AttributeUnderlyingType::Int64: std::get<ContainerPtrType<Int64>>(attribute.arrays)[idx] = value.get<Int64>(); break;
case AttributeUnderlyingType::Float32: std::get<ContainerPtrType<Float32>>(attribute.arrays)[idx] = value.get<Float64>(); break;
case AttributeUnderlyingType::Float64: std::get<ContainerPtrType<Float64>>(attribute.arrays)[idx] = value.get<Float64>(); break;
case AttributeUnderlyingType::Decimal32: std::get<ContainerPtrType<Decimal32>>(attribute.arrays)[idx] = value.get<Decimal32>(); break;
case AttributeUnderlyingType::Decimal64: std::get<ContainerPtrType<Decimal64>>(attribute.arrays)[idx] = value.get<Decimal64>(); break;
case AttributeUnderlyingType::Decimal128: std::get<ContainerPtrType<Decimal128>>(attribute.arrays)[idx] = value.get<Decimal128>(); break;
case AttributeUnderlyingType::String:
{
const auto & string = value.get<String>();

View File

@ -18,6 +18,17 @@ void ComplexKeyCacheDictionary::setDefaultAttributeValue(Attribute & attribute,
case AttributeUnderlyingType::Int64: std::get<ContainerPtrType<Int64>>(attribute.arrays)[idx] = std::get<Int64>(attribute.null_values); break;
case AttributeUnderlyingType::Float32: std::get<ContainerPtrType<Float32>>(attribute.arrays)[idx] = std::get<Float32>(attribute.null_values); break;
case AttributeUnderlyingType::Float64: std::get<ContainerPtrType<Float64>>(attribute.arrays)[idx] = std::get<Float64>(attribute.null_values); break;
case AttributeUnderlyingType::Decimal32:
std::get<ContainerPtrType<Decimal32>>(attribute.arrays)[idx] = std::get<Decimal32>(attribute.null_values);
break;
case AttributeUnderlyingType::Decimal64:
std::get<ContainerPtrType<Decimal64>>(attribute.arrays)[idx] = std::get<Decimal64>(attribute.null_values);
break;
case AttributeUnderlyingType::Decimal128:
std::get<ContainerPtrType<Decimal128>>(attribute.arrays)[idx] = std::get<Decimal128>(attribute.null_values);
break;
case AttributeUnderlyingType::String:
{
const auto & null_value_ref = std::get<String>(attribute.null_values);

View File

@ -45,7 +45,7 @@ ComplexKeyHashedDictionary::ComplexKeyHashedDictionary(const ComplexKeyHashedDic
#define DECLARE(TYPE)\
void ComplexKeyHashedDictionary::get##TYPE(\
const std::string & attribute_name, const Columns & key_columns, const DataTypes & key_types,\
PaddedPODArray<TYPE> & out) const\
ResultArrayType<TYPE> & out) const\
{\
dict_struct.validateKeyTypes(key_types);\
\
@ -70,6 +70,9 @@ DECLARE(Int32)
DECLARE(Int64)
DECLARE(Float32)
DECLARE(Float64)
DECLARE(Decimal32)
DECLARE(Decimal64)
DECLARE(Decimal128)
#undef DECLARE
void ComplexKeyHashedDictionary::getString(
@ -92,7 +95,7 @@ void ComplexKeyHashedDictionary::getString(
#define DECLARE(TYPE)\
void ComplexKeyHashedDictionary::get##TYPE(\
const std::string & attribute_name, const Columns & key_columns, const DataTypes & key_types,\
const PaddedPODArray<TYPE> & def, PaddedPODArray<TYPE> & out) const\
const PaddedPODArray<TYPE> & def, ResultArrayType<TYPE> & out) const\
{\
dict_struct.validateKeyTypes(key_types);\
\
@ -115,6 +118,9 @@ DECLARE(Int32)
DECLARE(Int64)
DECLARE(Float32)
DECLARE(Float64)
DECLARE(Decimal32)
DECLARE(Decimal64)
DECLARE(Decimal128)
#undef DECLARE
void ComplexKeyHashedDictionary::getString(
@ -135,7 +141,7 @@ void ComplexKeyHashedDictionary::getString(
#define DECLARE(TYPE)\
void ComplexKeyHashedDictionary::get##TYPE(\
const std::string & attribute_name, const Columns & key_columns, const DataTypes & key_types,\
const TYPE def, PaddedPODArray<TYPE> & out) const\
const TYPE def, ResultArrayType<TYPE> & out) const\
{\
dict_struct.validateKeyTypes(key_types);\
\
@ -158,6 +164,9 @@ DECLARE(Int32)
DECLARE(Int64)
DECLARE(Float32)
DECLARE(Float64)
DECLARE(Decimal32)
DECLARE(Decimal64)
DECLARE(Decimal128)
#undef DECLARE
void ComplexKeyHashedDictionary::getString(
@ -195,6 +204,10 @@ void ComplexKeyHashedDictionary::has(const Columns & key_columns, const DataType
case AttributeUnderlyingType::Float32: has<Float32>(attribute, key_columns, out); break;
case AttributeUnderlyingType::Float64: has<Float64>(attribute, key_columns, out); break;
case AttributeUnderlyingType::String: has<StringRef>(attribute, key_columns, out); break;
case AttributeUnderlyingType::Decimal32: has<Decimal32>(attribute, key_columns, out); break;
case AttributeUnderlyingType::Decimal64: has<Decimal64>(attribute, key_columns, out); break;
case AttributeUnderlyingType::Decimal128: has<Decimal128>(attribute, key_columns, out); break;
}
}
@ -387,6 +400,11 @@ void ComplexKeyHashedDictionary::calculateBytesAllocated()
case AttributeUnderlyingType::Int64: addAttributeSize<Int64>(attribute); break;
case AttributeUnderlyingType::Float32: addAttributeSize<Float32>(attribute); break;
case AttributeUnderlyingType::Float64: addAttributeSize<Float64>(attribute); break;
case AttributeUnderlyingType::Decimal32: addAttributeSize<Decimal32>(attribute); break;
case AttributeUnderlyingType::Decimal64: addAttributeSize<Decimal64>(attribute); break;
case AttributeUnderlyingType::Decimal128: addAttributeSize<Decimal128>(attribute); break;
case AttributeUnderlyingType::String:
{
addAttributeSize<StringRef>(attribute);
@ -424,6 +442,11 @@ ComplexKeyHashedDictionary::Attribute ComplexKeyHashedDictionary::createAttribut
case AttributeUnderlyingType::Int64: createAttributeImpl<Int64>(attr, null_value); break;
case AttributeUnderlyingType::Float32: createAttributeImpl<Float32>(attr, null_value); break;
case AttributeUnderlyingType::Float64: createAttributeImpl<Float64>(attr, null_value); break;
case AttributeUnderlyingType::Decimal32: createAttributeImpl<Decimal32>(attr, null_value); break;
case AttributeUnderlyingType::Decimal64: createAttributeImpl<Decimal64>(attr, null_value); break;
case AttributeUnderlyingType::Decimal128: createAttributeImpl<Decimal128>(attr, null_value); break;
case AttributeUnderlyingType::String:
{
std::get<String>(attr.null_values) = null_value.get<String>();
@ -459,6 +482,9 @@ void ComplexKeyHashedDictionary::getItemsNumber(
DISPATCH(Int64)
DISPATCH(Float32)
DISPATCH(Float64)
DISPATCH(Decimal32)
DISPATCH(Decimal64)
DISPATCH(Decimal128)
#undef DISPATCH
else
throw Exception("Unexpected type of attribute: " + toString(attribute.type), ErrorCodes::LOGICAL_ERROR);
@ -517,6 +543,11 @@ bool ComplexKeyHashedDictionary::setAttributeValue(Attribute & attribute, const
case AttributeUnderlyingType::Int64: return setAttributeValueImpl<Int64>(attribute, key, value.get<Int64>());
case AttributeUnderlyingType::Float32: return setAttributeValueImpl<Float32>(attribute, key, value.get<Float64>());
case AttributeUnderlyingType::Float64: return setAttributeValueImpl<Float64>(attribute, key, value.get<Float64>());
case AttributeUnderlyingType::Decimal32: return setAttributeValueImpl<Decimal32>(attribute, key, value.get<Decimal32>());
case AttributeUnderlyingType::Decimal64: return setAttributeValueImpl<Decimal64>(attribute, key, value.get<Decimal64>());
case AttributeUnderlyingType::Decimal128: return setAttributeValueImpl<Decimal128>(attribute, key, value.get<Decimal128>());
case AttributeUnderlyingType::String:
{
auto & map = *std::get<ContainerPtrType<StringRef>>(attribute.maps);
@ -604,6 +635,10 @@ std::vector<StringRef> ComplexKeyHashedDictionary::getKeys() const
case AttributeUnderlyingType::Float32: return getKeys<Float32>(attribute);
case AttributeUnderlyingType::Float64: return getKeys<Float64>(attribute);
case AttributeUnderlyingType::String: return getKeys<StringRef>(attribute);
case AttributeUnderlyingType::Decimal32: return getKeys<Decimal32>(attribute);
case AttributeUnderlyingType::Decimal64: return getKeys<Decimal64>(attribute);
case AttributeUnderlyingType::Decimal128: return getKeys<Decimal128>(attribute);
}
return {};
}

View File

@ -5,6 +5,7 @@
#include <Dictionaries/DictionaryStructure.h>
#include <common/StringRef.h>
#include <Common/HashTable/HashMap.h>
#include <Columns/ColumnDecimal.h>
#include <Columns/ColumnString.h>
#include <Common/Arena.h>
#include <ext/range.h>
@ -65,10 +66,13 @@ public:
return dict_struct.attributes[&getAttribute(attribute_name) - attributes.data()].injective;
}
template <typename T>
using ResultArrayType = std::conditional_t<IsDecimalNumber<T>, DecimalPaddedPODArray<T>, PaddedPODArray<T>>;
#define DECLARE(TYPE)\
void get##TYPE(\
const std::string & attribute_name, const Columns & key_columns, const DataTypes & key_types,\
PaddedPODArray<TYPE> & out) const;
ResultArrayType<TYPE> & out) const;
DECLARE(UInt8)
DECLARE(UInt16)
DECLARE(UInt32)
@ -80,6 +84,9 @@ public:
DECLARE(Int64)
DECLARE(Float32)
DECLARE(Float64)
DECLARE(Decimal32)
DECLARE(Decimal64)
DECLARE(Decimal128)
#undef DECLARE
void getString(
@ -89,7 +96,7 @@ public:
#define DECLARE(TYPE)\
void get##TYPE(\
const std::string & attribute_name, const Columns & key_columns, const DataTypes & key_types,\
const PaddedPODArray<TYPE> & def, PaddedPODArray<TYPE> & out) const;
const PaddedPODArray<TYPE> & def, ResultArrayType<TYPE> & out) const;
DECLARE(UInt8)
DECLARE(UInt16)
DECLARE(UInt32)
@ -101,6 +108,9 @@ public:
DECLARE(Int64)
DECLARE(Float32)
DECLARE(Float64)
DECLARE(Decimal32)
DECLARE(Decimal64)
DECLARE(Decimal128)
#undef DECLARE
void getString(
@ -110,7 +120,7 @@ public:
#define DECLARE(TYPE)\
void get##TYPE(\
const std::string & attribute_name, const Columns & key_columns, const DataTypes & key_types,\
const TYPE def, PaddedPODArray<TYPE> & out) const;
const TYPE def, ResultArrayType<TYPE> & out) const;
DECLARE(UInt8)
DECLARE(UInt16)
DECLARE(UInt32)
@ -122,6 +132,9 @@ public:
DECLARE(Int64)
DECLARE(Float32)
DECLARE(Float64)
DECLARE(Decimal32)
DECLARE(Decimal64)
DECLARE(Decimal128)
#undef DECLARE
void getString(
@ -143,12 +156,14 @@ private:
UInt8, UInt16, UInt32, UInt64,
UInt128,
Int8, Int16, Int32, Int64,
Decimal32, Decimal64, Decimal128,
Float32, Float64,
String> null_values;
std::tuple<
ContainerPtrType<UInt8>, ContainerPtrType<UInt16>, ContainerPtrType<UInt32>, ContainerPtrType<UInt64>,
ContainerPtrType<UInt128>,
ContainerPtrType<Int8>, ContainerPtrType<Int16>, ContainerPtrType<Int32>, ContainerPtrType<Int64>,
ContainerPtrType<Decimal32>, ContainerPtrType<Decimal64>, ContainerPtrType<Decimal128>,
ContainerPtrType<Float32>, ContainerPtrType<Float64>,
ContainerPtrType<StringRef>> maps;
std::unique_ptr<Arena> string_arena;

View File

@ -1,6 +1,7 @@
#pragma once
#include <Columns/ColumnVector.h>
#include <Columns/ColumnDecimal.h>
#include <Columns/ColumnString.h>
#include <Columns/IColumn.h>
#include <DataStreams/IProfilingBlockInputStream.h>
@ -63,12 +64,20 @@ private:
template <typename Type>
using DictionaryGetter = void (DictionaryType::*)(const std::string &, const PaddedPODArray<Key> &, PaddedPODArray<Type> &) const;
template <typename Type>
using DictionaryDecimalGetter =
void (DictionaryType::*)(const std::string &, const PaddedPODArray<Key> &, DecimalPaddedPODArray<Type> &) const;
using DictionaryStringGetter = void (DictionaryType::*)(const std::string &, const PaddedPODArray<Key> &, ColumnString *) const;
// for complex key dictionaries
template <typename Type>
using GetterByKey = void (DictionaryType::*)(const std::string &, const Columns &, const DataTypes &, PaddedPODArray<Type> & out) const;
template <typename Type>
using DecimalGetterByKey =
void (DictionaryType::*)(const std::string &, const Columns &, const DataTypes &, DecimalPaddedPODArray<Type> & out) const;
using StringGetterByKey = void (DictionaryType::*)(const std::string &, const Columns &, const DataTypes &, ColumnString * out) const;
// call getXXX
@ -78,6 +87,11 @@ private:
const Columns & keys, const DataTypes & data_types,
Container & container, const DictionaryAttribute & attribute, const DictionaryType & dictionary) const;
template <typename Type, typename Container>
void callGetter(DictionaryDecimalGetter<Type> getter, const PaddedPODArray<Key> & ids_to_fill,
const Columns & keys, const DataTypes & data_types,
Container & container, const DictionaryAttribute & attribute, const DictionaryType & dictionary) const;
template <typename Container>
void callGetter(DictionaryStringGetter getter, const PaddedPODArray<Key> & ids_to_fill,
const Columns & keys, const DataTypes & data_types,
@ -89,12 +103,17 @@ private:
const Columns & keys, const DataTypes & data_types,
Container & container, const DictionaryAttribute & attribute, const DictionaryType & dictionary) const;
template <typename Type, typename Container>
void callGetter(DecimalGetterByKey<Type> getter, const PaddedPODArray<Key> & ids_to_fill,
const Columns & keys, const DataTypes & data_types,
Container & container, const DictionaryAttribute & attribute, const DictionaryType & dictionary) const;
template <typename Container>
void callGetter(StringGetterByKey getter, const PaddedPODArray<Key> & ids_to_fill,
const Columns & keys, const DataTypes & data_types,
Container & container, const DictionaryAttribute & attribute, const DictionaryType & dictionary) const;
template <template <typename> class Getter, typename StringGetter>
template <template <typename> class Getter, template <typename> class DecimalGetter, typename StringGetter>
Block fillBlock(const PaddedPODArray<Key> & ids_to_fill, const Columns & keys,
const DataTypes & types, ColumnsWithTypeAndName && view) const;
@ -147,7 +166,8 @@ DictionaryBlockInputStream<DictionaryType, Key>::DictionaryBlockInputStream(
dictionary(std::static_pointer_cast<const DictionaryType>(dictionary)),
column_names(column_names), ids(std::move(ids)),
logger(&Poco::Logger::get("DictionaryBlockInputStream")),
fill_block_function(&DictionaryBlockInputStream<DictionaryType, Key>::fillBlock<DictionaryGetter, DictionaryStringGetter>),
fill_block_function(
&DictionaryBlockInputStream<DictionaryType, Key>::fillBlock<DictionaryGetter, DictionaryDecimalGetter, DictionaryStringGetter>),
key_type(DictionaryKeyType::Id)
{
}
@ -159,7 +179,7 @@ DictionaryBlockInputStream<DictionaryType, Key>::DictionaryBlockInputStream(
: DictionaryBlockInputStreamBase(keys.size(), max_block_size),
dictionary(std::static_pointer_cast<const DictionaryType>(dictionary)), column_names(column_names),
logger(&Poco::Logger::get("DictionaryBlockInputStream")),
fill_block_function(&DictionaryBlockInputStream<DictionaryType, Key>::fillBlock<GetterByKey, StringGetterByKey>),
fill_block_function(&DictionaryBlockInputStream<DictionaryType, Key>::fillBlock<GetterByKey, DecimalGetterByKey, StringGetterByKey>),
key_type(DictionaryKeyType::ComplexKey)
{
const DictionaryStructure & dictionary_structure = dictionary->getStructure();
@ -175,7 +195,7 @@ DictionaryBlockInputStream<DictionaryType, Key>::DictionaryBlockInputStream(
: DictionaryBlockInputStreamBase(data_columns.front()->size(), max_block_size),
dictionary(std::static_pointer_cast<const DictionaryType>(dictionary)), column_names(column_names),
logger(&Poco::Logger::get("DictionaryBlockInputStream")),
fill_block_function(&DictionaryBlockInputStream<DictionaryType, Key>::fillBlock<GetterByKey, StringGetterByKey>),
fill_block_function(&DictionaryBlockInputStream<DictionaryType, Key>::fillBlock<GetterByKey, DecimalGetterByKey, StringGetterByKey>),
data_columns(data_columns),
get_key_columns_function(get_key_columns_function), get_view_columns_function(get_view_columns_function),
key_type(DictionaryKeyType::Callback)
@ -243,6 +263,16 @@ void DictionaryBlockInputStream<DictionaryType, Key>::callGetter(
(dict.*getter)(attribute.name, ids_to_fill, container);
}
template <typename DictionaryType, typename Key>
template <typename Type, typename Container>
void DictionaryBlockInputStream<DictionaryType, Key>::callGetter(
DictionaryDecimalGetter<Type> getter, const PaddedPODArray<Key> & ids_to_fill,
const Columns & /*keys*/, const DataTypes & /*data_types*/,
Container & container, const DictionaryAttribute & attribute, const DictionaryType & dict) const
{
(dict.*getter)(attribute.name, ids_to_fill, container);
}
template <typename DictionaryType, typename Key>
template <typename Container>
void DictionaryBlockInputStream<DictionaryType, Key>::callGetter(
@ -263,6 +293,16 @@ void DictionaryBlockInputStream<DictionaryType, Key>::callGetter(
(dict.*getter)(attribute.name, keys, data_types, container);
}
template <typename DictionaryType, typename Key>
template <typename Type, typename Container>
void DictionaryBlockInputStream<DictionaryType, Key>::callGetter(
DecimalGetterByKey<Type> getter, const PaddedPODArray<Key> & /*ids_to_fill*/,
const Columns & keys, const DataTypes & data_types,
Container & container, const DictionaryAttribute & attribute, const DictionaryType & dict) const
{
(dict.*getter)(attribute.name, keys, data_types, container);
}
template <typename DictionaryType, typename Key>
template <typename Container>
void DictionaryBlockInputStream<DictionaryType, Key>::callGetter(
@ -275,7 +315,7 @@ void DictionaryBlockInputStream<DictionaryType, Key>::callGetter(
template <typename DictionaryType, typename Key>
template <template <typename> class Getter, typename StringGetter>
template <template <typename> class Getter, template <typename> class DecimalGetter, typename StringGetter>
Block DictionaryBlockInputStream<DictionaryType, Key>::fillBlock(
const PaddedPODArray<Key> & ids_to_fill, const Columns & keys, const DataTypes & types, ColumnsWithTypeAndName && view) const
{
@ -343,6 +383,24 @@ Block DictionaryBlockInputStream<DictionaryType, Key>::fillBlock(
case AttributeUnderlyingType::Float64:
GET_COLUMN_FORM_ATTRIBUTE(Float64);
break;
case AttributeUnderlyingType::Decimal32:
{
column = getColumnFromAttribute<Decimal32, DecimalGetter<Decimal32>>(
&DictionaryType::getDecimal32, ids_to_fill, keys, data_types, attribute, *dictionary);
break;
}
case AttributeUnderlyingType::Decimal64:
{
column = getColumnFromAttribute<Decimal64, DecimalGetter<Decimal64>>(
&DictionaryType::getDecimal64, ids_to_fill, keys, data_types, attribute, *dictionary);
break;
}
case AttributeUnderlyingType::Decimal128:
{
column = getColumnFromAttribute<Decimal128, DecimalGetter<Decimal128>>(
&DictionaryType::getDecimal128, ids_to_fill, keys, data_types, attribute, *dictionary);
break;
}
case AttributeUnderlyingType::String:
{
column = getColumnFromStringAttribute<StringGetter>(
@ -350,7 +408,7 @@ Block DictionaryBlockInputStream<DictionaryType, Key>::fillBlock(
break;
}
}
#undef GET_COLUMN_FORM_ATTRIBUTE
block_columns.emplace_back(column, attribute.type, attribute.name);
}
}
@ -365,12 +423,24 @@ ColumnPtr DictionaryBlockInputStream<DictionaryType, Key>::getColumnFromAttribut
const Columns & keys, const DataTypes & data_types,
const DictionaryAttribute & attribute, const DictionaryType & dict) const
{
auto size = ids_to_fill.size();
if (!keys.empty())
size = keys.front()->size();
auto column_vector = ColumnVector<AttributeType>::create(size);
callGetter(getter, ids_to_fill, keys, data_types, column_vector->getData(), attribute, dict);
return column_vector;
if constexpr (IsDecimalNumber<AttributeType>)
{
auto size = ids_to_fill.size();
if (!keys.empty())
size = keys.front()->size();
auto column = ColumnDecimal<AttributeType>::create(size, 0); /// NOTE: The scale here is wrong, but it is unused.
callGetter(getter, ids_to_fill, keys, data_types, column->getData(), attribute, dict);
return column;
}
else
{
auto size = ids_to_fill.size();
if (!keys.empty())
size = keys.front()->size();
auto column_vector = ColumnVector<AttributeType>::create(size);
callGetter(getter, ids_to_fill, keys, data_types, column_vector->getData(), attribute, dict);
return column_vector;
}
}

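getColumnFromAttribute now branches at compile time: Decimal attribute types are materialized into a ColumnDecimal (with a placeholder scale, as the NOTE says), all other numeric types into a ColumnVector. A simplified, hedged sketch of that if constexpr dispatch with stand-in column classes:

```cpp
#include <cstdint>
#include <memory>
#include <vector>

// Stand-ins for ColumnVector / ColumnDecimal and the IsDecimalNumber trait.
struct Decimal64 { int64_t value = 0; };

template <typename T> inline constexpr bool IsDecimalNumber = false;
template <>           inline constexpr bool IsDecimalNumber<Decimal64> = true;

template <typename T>
struct ColumnVectorLike { std::vector<T> data; explicit ColumnVectorLike(size_t n) : data(n) {} };

template <typename T>
struct ColumnDecimalLike
{
    std::vector<T> data;
    uint32_t scale;
    ColumnDecimalLike(size_t n, uint32_t scale_) : data(n), scale(scale_) {}
};

// Decimals get a decimal column (scale is a placeholder here, as in the diff);
// every other numeric type gets a plain vector column.
template <typename AttributeType>
auto makeColumn(size_t size)
{
    if constexpr (IsDecimalNumber<AttributeType>)
        return std::make_unique<ColumnDecimalLike<AttributeType>>(size, 0);
    else
        return std::make_unique<ColumnVectorLike<AttributeType>>(size);
}

int main()
{
    auto decimal_column = makeColumn<Decimal64>(8);   // ColumnDecimalLike<Decimal64>
    auto float_column   = makeColumn<double>(8);      // ColumnVectorLike<double>
    (void)decimal_column; (void)float_column;
}
```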
View File

@ -7,10 +7,12 @@
#include <Dictionaries/ExecutableDictionarySource.h>
#include <Dictionaries/HTTPDictionarySource.h>
#include <Dictionaries/LibraryDictionarySource.h>
#include <Dictionaries/XDBCDictionarySource.h>
#include <DataTypes/DataTypesNumber.h>
#include <DataTypes/DataTypeDate.h>
#include <DataTypes/DataTypeNullable.h>
#include <Common/FieldVisitors.h>
#include <Common/XDBCBridgeHelper.h>
#include <Columns/ColumnsNumber.h>
#include <IO/HTTPCommon.h>
#include <memory>
@ -22,7 +24,6 @@
#endif
#if USE_POCO_SQLODBC || USE_POCO_DATAODBC
#include <Poco/Data/ODBC/Connector.h>
#include <Dictionaries/ODBCDictionarySource.h>
#endif
#if USE_MYSQL
#include <Dictionaries/MySQLDictionarySource.h>
@ -154,11 +155,20 @@ DictionarySourcePtr DictionarySourceFactory::create(
else if ("odbc" == source_type)
{
#if USE_POCO_SQLODBC || USE_POCO_DATAODBC
return std::make_unique<ODBCDictionarySource>(dict_struct, config, config_prefix + ".odbc", sample_block, context);
const auto & global_config = context.getConfigRef();
BridgeHelperPtr bridge = std::make_shared<XDBCBridgeHelper<ODBCBridgeMixin>>(global_config, context.getSettings().http_connection_timeout, config.getString(config_prefix + ".odbc.connection_string"));
return std::make_unique<XDBCDictionarySource>(dict_struct, config, config_prefix + ".odbc", sample_block, context, bridge);
#else
throw Exception{"Dictionary source of type `odbc` is disabled because poco library was built without ODBC support.",
ErrorCodes::SUPPORT_IS_DISABLED};
#endif
}
else if ("jdbc" == source_type)
{
throw Exception{"Dictionary source of type `jdbc` is disabled until consistent support for nullable fields.",
ErrorCodes::SUPPORT_IS_DISABLED};
// BridgeHelperPtr bridge = std::make_shared<XDBCBridgeHelper<JDBCBridgeMixin>>(config, context.getSettings().http_connection_timeout, config.getString(config_prefix + ".connection_string"));
// return std::make_unique<XDBCDictionarySource>(dict_struct, config, config_prefix + ".jdbc", sample_block, context, bridge);
}
else if ("executable" == source_type)
{

View File

@ -104,6 +104,17 @@ AttributeUnderlyingType getAttributeUnderlyingType(const std::string & type)
if (it != std::end(dictionary))
return it->second;
if (type.find("Decimal") == 0)
{
size_t start = strlen("Decimal");
if (type.find("32", start) == start)
return AttributeUnderlyingType::Decimal32;
if (type.find("64", start) == start)
return AttributeUnderlyingType::Decimal64;
if (type.find("128", start) == start)
return AttributeUnderlyingType::Decimal128;
}
throw Exception{"Unknown type " + type, ErrorCodes::UNKNOWN_TYPE};
}
@ -123,6 +134,9 @@ std::string toString(const AttributeUnderlyingType type)
case AttributeUnderlyingType::Int64: return "Int64";
case AttributeUnderlyingType::Float32: return "Float32";
case AttributeUnderlyingType::Float64: return "Float64";
case AttributeUnderlyingType::Decimal32: return "Decimal32";
case AttributeUnderlyingType::Decimal64: return "Decimal64";
case AttributeUnderlyingType::Decimal128: return "Decimal128";
case AttributeUnderlyingType::String: return "String";
}
@ -316,10 +330,15 @@ std::vector<DictionaryAttribute> DictionaryStructure::getAttributes(
const auto null_value_string = config.getString(prefix + "null_value");
try
{
ReadBufferFromString null_value_buffer{null_value_string};
auto column_with_null_value = type->createColumn();
type->deserializeTextEscaped(*column_with_null_value, null_value_buffer, format_settings);
null_value = (*column_with_null_value)[0];
if (null_value_string.empty())
null_value = type->getDefault();
else
{
ReadBufferFromString null_value_buffer{null_value_string};
auto column_with_null_value = type->createColumn();
type->deserializeTextEscaped(*column_with_null_value, null_value_buffer, format_settings);
null_value = (*column_with_null_value)[0];
}
}
catch (Exception & e)
{

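getAttributeUnderlyingType falls back to a prefix check for decimals: when the configured type name starts with "Decimal", the characters right after that prefix select Decimal32, Decimal64 or Decimal128. A small, hedged sketch of that parsing (the example type names are illustrative):

```cpp
#include <cstring>
#include <iostream>
#include <optional>
#include <string>

enum class AttributeUnderlyingType { Decimal32, Decimal64, Decimal128 };

// Mirrors the suffix check added above: only the prefix and the digits right
// after it matter; anything else (such as a "(scale)" suffix) is ignored here.
std::optional<AttributeUnderlyingType> parseDecimalType(const std::string & type)
{
    if (type.find("Decimal") != 0)
        return std::nullopt;
    const size_t start = std::strlen("Decimal");
    if (type.find("32", start) == start)
        return AttributeUnderlyingType::Decimal32;
    if (type.find("64", start) == start)
        return AttributeUnderlyingType::Decimal64;
    if (type.find("128", start) == start)
        return AttributeUnderlyingType::Decimal128;
    return std::nullopt;
}

int main()
{
    std::cout << (parseDecimalType("Decimal64(4)") == AttributeUnderlyingType::Decimal64) << '\n';  // 1
    std::cout << parseDecimalType("UInt64").has_value() << '\n';                                    // 0
}
```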
View File

@ -27,6 +27,9 @@ enum class AttributeUnderlyingType
Int64,
Float32,
Float64,
Decimal32,
Decimal64,
Decimal128,
String
};

View File

@ -14,17 +14,28 @@ namespace DB
namespace ErrorCodes
{
extern const int UNSUPPORTED_METHOD;
extern const int LOGICAL_ERROR;
}
ExternalQueryBuilder::ExternalQueryBuilder(
const DictionaryStructure & dict_struct,
const std::string & db,
const std::string & table,
const std::string & where,
IdentifierQuotingStyle quoting_style)
: dict_struct(dict_struct), db(db), table(table), where(where), quoting_style(quoting_style)
const DictionaryStructure & dict_struct_,
const std::string & db_,
const std::string & table_,
const std::string & where_,
IdentifierQuotingStyle quoting_style_)
: dict_struct(dict_struct_), db(db_), where(where_), quoting_style(quoting_style_)
{
if (auto pos = table_.find('.'); pos != std::string::npos)
{
schema = table_.substr(0, pos);
table = table_.substr(pos + 1);
}
else
{
schema = "";
table = table_;
}
}
@ -124,6 +135,11 @@ std::string ExternalQueryBuilder::composeLoadAllQuery() const
writeQuoted(db, out);
writeChar('.', out);
}
if (!schema.empty())
{
writeQuoted(schema, out);
writeChar('.', out);
}
writeQuoted(table, out);
if (!where.empty())
@ -187,6 +203,12 @@ std::string ExternalQueryBuilder::composeLoadIdsQuery(const std::vector<UInt64>
writeQuoted(db, out);
writeChar('.', out);
}
if (!schema.empty())
{
writeQuoted(schema, out);
writeChar('.', out);
}
writeQuoted(table, out);
writeString(" WHERE ", out);
@ -250,6 +272,12 @@ std::string ExternalQueryBuilder::composeLoadKeysQuery(
writeQuoted(db, out);
writeChar('.', out);
}
if (!schema.empty())
{
writeQuoted(schema, out);
writeChar('.', out);
}
writeQuoted(table, out);
writeString(" WHERE ", out);

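ExternalQueryBuilder now accepts a qualified table name: if the configured table contains a dot, the part before it is kept as a schema and quoted separately in the generated queries. A hedged sketch of the split and quoting, with quoteIdentifier standing in for the quoting-style-aware writeQuoted:

```cpp
#include <iostream>
#include <string>

// quoteIdentifier is a stand-in; the real builder quotes according to the
// remote database's IdentifierQuotingStyle.
std::string quoteIdentifier(const std::string & name) { return "\"" + name + "\""; }

struct SplitName { std::string schema; std::string table; };

// Same rule as the constructor above: split at the first dot, otherwise the
// schema stays empty.
SplitName splitTableName(const std::string & qualified)
{
    if (auto pos = qualified.find('.'); pos != std::string::npos)
        return {qualified.substr(0, pos), qualified.substr(pos + 1)};
    return {"", qualified};
}

int main()
{
    const auto name = splitTableName("analytics.users");
    const std::string from = name.schema.empty()
        ? quoteIdentifier(name.table)
        : quoteIdentifier(name.schema) + "." + quoteIdentifier(name.table);
    std::cout << "SELECT * FROM " << from << " WHERE ..." << '\n';  // SELECT * FROM "analytics"."users" WHERE ...
}
```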
View File

@ -18,19 +18,20 @@ class WriteBuffer;
struct ExternalQueryBuilder
{
const DictionaryStructure & dict_struct;
const std::string & db;
const std::string & table;
std::string db;
std::string table;
std::string schema;
const std::string & where;
IdentifierQuotingStyle quoting_style;
ExternalQueryBuilder(
const DictionaryStructure & dict_struct,
const std::string & db,
const std::string & table,
const std::string & where,
IdentifierQuotingStyle quoting_style);
const DictionaryStructure & dict_struct_,
const std::string & db_,
const std::string & table_,
const std::string & where_,
IdentifierQuotingStyle quoting_style_);
/** Generate a query to load all data. */
std::string composeLoadAllQuery() const;

View File

@ -5,6 +5,7 @@
#include <DataTypes/DataTypeDate.h>
#include <DataTypes/DataTypeDateTime.h>
#include <DataTypes/DataTypeUUID.h>
#include <DataTypes/DataTypeNullable.h>
#include <Common/typeid_cast.h>
@ -20,57 +21,48 @@ void ExternalResultDescription::init(const Block & sample_block_)
{
sample_block = sample_block_;
const auto num_columns = sample_block.columns();
types.reserve(num_columns);
names.reserve(num_columns);
sample_columns.reserve(num_columns);
types.reserve(sample_block.columns());
for (const auto idx : ext::range(0, num_columns))
for (auto & elem : sample_block)
{
const auto & column = sample_block.safeGetByPosition(idx);
const auto type = column.type.get();
/// If default value for column was not provided, use default from data type.
if (elem.column->empty())
elem.column = elem.type->createColumnConstWithDefaultValue(1)->convertToFullColumnIfConst();
bool is_nullable = elem.type->isNullable();
DataTypePtr type_not_nullable = removeNullable(elem.type);
const IDataType * type = type_not_nullable.get();
if (typeid_cast<const DataTypeUInt8 *>(type))
types.push_back(ValueType::UInt8);
types.emplace_back(ValueType::UInt8, is_nullable);
else if (typeid_cast<const DataTypeUInt16 *>(type))
types.push_back(ValueType::UInt16);
types.emplace_back(ValueType::UInt16, is_nullable);
else if (typeid_cast<const DataTypeUInt32 *>(type))
types.push_back(ValueType::UInt32);
types.emplace_back(ValueType::UInt32, is_nullable);
else if (typeid_cast<const DataTypeUInt64 *>(type))
types.push_back(ValueType::UInt64);
types.emplace_back(ValueType::UInt64, is_nullable);
else if (typeid_cast<const DataTypeInt8 *>(type))
types.push_back(ValueType::Int8);
types.emplace_back(ValueType::Int8, is_nullable);
else if (typeid_cast<const DataTypeInt16 *>(type))
types.push_back(ValueType::Int16);
types.emplace_back(ValueType::Int16, is_nullable);
else if (typeid_cast<const DataTypeInt32 *>(type))
types.push_back(ValueType::Int32);
types.emplace_back(ValueType::Int32, is_nullable);
else if (typeid_cast<const DataTypeInt64 *>(type))
types.push_back(ValueType::Int64);
types.emplace_back(ValueType::Int64, is_nullable);
else if (typeid_cast<const DataTypeFloat32 *>(type))
types.push_back(ValueType::Float32);
types.emplace_back(ValueType::Float32, is_nullable);
else if (typeid_cast<const DataTypeFloat64 *>(type))
types.push_back(ValueType::Float64);
types.emplace_back(ValueType::Float64, is_nullable);
else if (typeid_cast<const DataTypeString *>(type))
types.push_back(ValueType::String);
types.emplace_back(ValueType::String, is_nullable);
else if (typeid_cast<const DataTypeDate *>(type))
types.push_back(ValueType::Date);
types.emplace_back(ValueType::Date, is_nullable);
else if (typeid_cast<const DataTypeDateTime *>(type))
types.push_back(ValueType::DateTime);
types.emplace_back(ValueType::DateTime, is_nullable);
else if (typeid_cast<const DataTypeUUID *>(type))
types.push_back(ValueType::UUID);
types.emplace_back(ValueType::UUID, is_nullable);
else
throw Exception{"Unsupported type " + type->getName(), ErrorCodes::UNKNOWN_TYPE};
names.emplace_back(column.name);
sample_columns.emplace_back(column.column);
/// If default value for column was not provided, use default from data type.
if (sample_columns.back()->empty())
{
MutableColumnPtr mutable_column = (*std::move(sample_columns.back())).mutate();
column.type->insertDefaultInto(*mutable_column);
sample_columns.back() = std::move(mutable_column);
}
}
}

View File

@ -25,13 +25,11 @@ struct ExternalResultDescription
String,
Date,
DateTime,
UUID
UUID,
};
Block sample_block;
std::vector<ValueType> types;
std::vector<std::string> names;
Columns sample_columns;
std::vector<std::pair<ValueType, bool /* is_nullable */>> types;
void init(const Block & sample_block_);
};

View File

@ -115,7 +115,7 @@ void FlatDictionary::isInConstantVector(
#define DECLARE(TYPE)\
void FlatDictionary::get##TYPE(const std::string & attribute_name, const PaddedPODArray<Key> & ids, PaddedPODArray<TYPE> & out) const\
void FlatDictionary::get##TYPE(const std::string & attribute_name, const PaddedPODArray<Key> & ids, ResultArrayType<TYPE> & out) const\
{\
const auto & attribute = getAttribute(attribute_name);\
if (!isAttributeTypeConvertibleTo(attribute.type, AttributeUnderlyingType::TYPE))\
@ -138,6 +138,9 @@ DECLARE(Int32)
DECLARE(Int64)
DECLARE(Float32)
DECLARE(Float64)
DECLARE(Decimal32)
DECLARE(Decimal64)
DECLARE(Decimal128)
#undef DECLARE
void FlatDictionary::getString(const std::string & attribute_name, const PaddedPODArray<Key> & ids, ColumnString * out) const
@ -156,7 +159,7 @@ void FlatDictionary::getString(const std::string & attribute_name, const PaddedP
#define DECLARE(TYPE)\
void FlatDictionary::get##TYPE(\
const std::string & attribute_name, const PaddedPODArray<Key> & ids, const PaddedPODArray<TYPE> & def,\
PaddedPODArray<TYPE> & out) const\
ResultArrayType<TYPE> & out) const\
{\
const auto & attribute = getAttribute(attribute_name);\
if (!isAttributeTypeConvertibleTo(attribute.type, AttributeUnderlyingType::TYPE))\
@ -177,6 +180,9 @@ DECLARE(Int32)
DECLARE(Int64)
DECLARE(Float32)
DECLARE(Float64)
DECLARE(Decimal32)
DECLARE(Decimal64)
DECLARE(Decimal128)
#undef DECLARE
void FlatDictionary::getString(
@ -194,8 +200,7 @@ void FlatDictionary::getString(
#define DECLARE(TYPE)\
void FlatDictionary::get##TYPE(\
const std::string & attribute_name, const PaddedPODArray<Key> & ids, const TYPE def,\
PaddedPODArray<TYPE> & out) const\
const std::string & attribute_name, const PaddedPODArray<Key> & ids, const TYPE def, ResultArrayType<TYPE> & out) const\
{\
const auto & attribute = getAttribute(attribute_name);\
if (!isAttributeTypeConvertibleTo(attribute.type, AttributeUnderlyingType::TYPE))\
@ -216,6 +221,9 @@ DECLARE(Int32)
DECLARE(Int64)
DECLARE(Float32)
DECLARE(Float64)
DECLARE(Decimal32)
DECLARE(Decimal64)
DECLARE(Decimal128)
#undef DECLARE
void FlatDictionary::getString(
@ -250,6 +258,10 @@ void FlatDictionary::has(const PaddedPODArray<Key> & ids, PaddedPODArray<UInt8>
case AttributeUnderlyingType::Float32: has<Float32>(attribute, ids, out); break;
case AttributeUnderlyingType::Float64: has<Float64>(attribute, ids, out); break;
case AttributeUnderlyingType::String: has<String>(attribute, ids, out); break;
case AttributeUnderlyingType::Decimal32: has<Decimal32>(attribute, ids, out); break;
case AttributeUnderlyingType::Decimal64: has<Decimal64>(attribute, ids, out); break;
case AttributeUnderlyingType::Decimal128: has<Decimal128>(attribute, ids, out); break;
}
}
@ -408,6 +420,11 @@ void FlatDictionary::calculateBytesAllocated()
case AttributeUnderlyingType::Int64: addAttributeSize<Int64>(attribute); break;
case AttributeUnderlyingType::Float32: addAttributeSize<Float32>(attribute); break;
case AttributeUnderlyingType::Float64: addAttributeSize<Float64>(attribute); break;
case AttributeUnderlyingType::Decimal32: addAttributeSize<Decimal32>(attribute); break;
case AttributeUnderlyingType::Decimal64: addAttributeSize<Decimal64>(attribute); break;
case AttributeUnderlyingType::Decimal128: addAttributeSize<Decimal128>(attribute); break;
case AttributeUnderlyingType::String:
{
addAttributeSize<StringRef>(attribute);
@ -460,6 +477,10 @@ FlatDictionary::Attribute FlatDictionary::createAttributeWithType(const Attribut
case AttributeUnderlyingType::Float32: createAttributeImpl<Float32>(attr, null_value); break;
case AttributeUnderlyingType::Float64: createAttributeImpl<Float64>(attr, null_value); break;
case AttributeUnderlyingType::String: createAttributeImpl<String>(attr, null_value); break;
case AttributeUnderlyingType::Decimal32: createAttributeImpl<Decimal32>(attr, null_value); break;
case AttributeUnderlyingType::Decimal64: createAttributeImpl<Decimal64>(attr, null_value); break;
case AttributeUnderlyingType::Decimal128: createAttributeImpl<Decimal128>(attr, null_value); break;
}
return attr;
@ -488,6 +509,9 @@ void FlatDictionary::getItemsNumber(
DISPATCH(Int64)
DISPATCH(Float32)
DISPATCH(Float64)
DISPATCH(Decimal32)
DISPATCH(Decimal64)
DISPATCH(Decimal128)
#undef DISPATCH
else
throw Exception("Unexpected type of attribute: " + toString(attribute.type), ErrorCodes::LOGICAL_ERROR);
@ -563,6 +587,10 @@ void FlatDictionary::setAttributeValue(Attribute & attribute, const Key id, cons
case AttributeUnderlyingType::Float32: setAttributeValueImpl<Float32>(attribute, id, value.get<Float64>()); break;
case AttributeUnderlyingType::Float64: setAttributeValueImpl<Float64>(attribute, id, value.get<Float64>()); break;
case AttributeUnderlyingType::String: setAttributeValueImpl<String>(attribute, id, value.get<String>()); break;
case AttributeUnderlyingType::Decimal32: setAttributeValueImpl<Decimal32>(attribute, id, value.get<Decimal128>()); break;
case AttributeUnderlyingType::Decimal64: setAttributeValueImpl<Decimal64>(attribute, id, value.get<Decimal128>()); break;
case AttributeUnderlyingType::Decimal128: setAttributeValueImpl<Decimal128>(attribute, id, value.get<Decimal128>()); break;
}
}

View File

@ -3,6 +3,7 @@
#include <Dictionaries/IDictionary.h>
#include <Dictionaries/IDictionarySource.h>
#include <Dictionaries/DictionaryStructure.h>
#include <Columns/ColumnDecimal.h>
#include <Columns/ColumnString.h>
#include <Common/Arena.h>
#include <ext/range.h>
@ -69,8 +70,11 @@ public:
void isInVectorConstant(const PaddedPODArray<Key> & child_ids, const Key ancestor_id, PaddedPODArray<UInt8> & out) const override;
void isInConstantVector(const Key child_id, const PaddedPODArray<Key> & ancestor_ids, PaddedPODArray<UInt8> & out) const override;
template <typename T>
using ResultArrayType = std::conditional_t<IsDecimalNumber<T>, DecimalPaddedPODArray<T>, PaddedPODArray<T>>;
#define DECLARE(TYPE)\
void get##TYPE(const std::string & attribute_name, const PaddedPODArray<Key> & ids, PaddedPODArray<TYPE> & out) const;
void get##TYPE(const std::string & attribute_name, const PaddedPODArray<Key> & ids, ResultArrayType<TYPE> & out) const;
DECLARE(UInt8)
DECLARE(UInt16)
DECLARE(UInt32)
@ -82,6 +86,9 @@ public:
DECLARE(Int64)
DECLARE(Float32)
DECLARE(Float64)
DECLARE(Decimal32)
DECLARE(Decimal64)
DECLARE(Decimal128)
#undef DECLARE
void getString(const std::string & attribute_name, const PaddedPODArray<Key> & ids, ColumnString * out) const;
@ -89,7 +96,7 @@ public:
#define DECLARE(TYPE)\
void get##TYPE(\
const std::string & attribute_name, const PaddedPODArray<Key> & ids, const PaddedPODArray<TYPE> & def,\
PaddedPODArray<TYPE> & out) const;
ResultArrayType<TYPE> & out) const;
DECLARE(UInt8)
DECLARE(UInt16)
DECLARE(UInt32)
@ -101,6 +108,9 @@ public:
DECLARE(Int64)
DECLARE(Float32)
DECLARE(Float64)
DECLARE(Decimal32)
DECLARE(Decimal64)
DECLARE(Decimal128)
#undef DECLARE
void getString(
@ -110,7 +120,7 @@ public:
#define DECLARE(TYPE)\
void get##TYPE(\
const std::string & attribute_name, const PaddedPODArray<Key> & ids, const TYPE def,\
PaddedPODArray<TYPE> & out) const;
ResultArrayType<TYPE> & out) const;
DECLARE(UInt8)
DECLARE(UInt16)
DECLARE(UInt32)
@ -122,6 +132,9 @@ public:
DECLARE(Int64)
DECLARE(Float32)
DECLARE(Float64)
DECLARE(Decimal32)
DECLARE(Decimal64)
DECLARE(Decimal128)
#undef DECLARE
void getString(
@ -143,12 +156,14 @@ private:
UInt8, UInt16, UInt32, UInt64,
UInt128,
Int8, Int16, Int32, Int64,
Decimal32, Decimal64, Decimal128,
Float32, Float64,
StringRef> null_values;
std::tuple<
ContainerPtrType<UInt8>, ContainerPtrType<UInt16>, ContainerPtrType<UInt32>, ContainerPtrType<UInt64>,
ContainerPtrType<UInt128>,
ContainerPtrType<Int8>, ContainerPtrType<Int16>, ContainerPtrType<Int32>, ContainerPtrType<Int64>,
ContainerPtrType<Decimal32>, ContainerPtrType<Decimal64>, ContainerPtrType<Decimal128>,
ContainerPtrType<Float32>, ContainerPtrType<Float64>,
ContainerPtrType<StringRef>> arrays;
std::unique_ptr<Arena> string_arena;

View File

@ -110,7 +110,7 @@ void HashedDictionary::isInConstantVector(
#define DECLARE(TYPE)\
void HashedDictionary::get##TYPE(const std::string & attribute_name, const PaddedPODArray<Key> & ids, PaddedPODArray<TYPE> & out) const\
void HashedDictionary::get##TYPE(const std::string & attribute_name, const PaddedPODArray<Key> & ids, ResultArrayType<TYPE> & out) const\
{\
const auto & attribute = getAttribute(attribute_name);\
if (!isAttributeTypeConvertibleTo(attribute.type, AttributeUnderlyingType::TYPE))\
@ -133,6 +133,9 @@ DECLARE(Int32)
DECLARE(Int64)
DECLARE(Float32)
DECLARE(Float64)
DECLARE(Decimal32)
DECLARE(Decimal64)
DECLARE(Decimal128)
#undef DECLARE
void HashedDictionary::getString(const std::string & attribute_name, const PaddedPODArray<Key> & ids, ColumnString * out) const
@ -151,7 +154,7 @@ void HashedDictionary::getString(const std::string & attribute_name, const Padde
#define DECLARE(TYPE)\
void HashedDictionary::get##TYPE(\
const std::string & attribute_name, const PaddedPODArray<Key> & ids, const PaddedPODArray<TYPE> & def,\
PaddedPODArray<TYPE> & out) const\
ResultArrayType<TYPE> & out) const\
{\
const auto & attribute = getAttribute(attribute_name);\
if (!isAttributeTypeConvertibleTo(attribute.type, AttributeUnderlyingType::TYPE))\
@ -172,6 +175,9 @@ DECLARE(Int32)
DECLARE(Int64)
DECLARE(Float32)
DECLARE(Float64)
DECLARE(Decimal32)
DECLARE(Decimal64)
DECLARE(Decimal128)
#undef DECLARE
void HashedDictionary::getString(
@ -189,7 +195,7 @@ void HashedDictionary::getString(
#define DECLARE(TYPE)\
void HashedDictionary::get##TYPE(\
const std::string & attribute_name, const PaddedPODArray<Key> & ids, const TYPE & def, PaddedPODArray<TYPE> & out) const\
const std::string & attribute_name, const PaddedPODArray<Key> & ids, const TYPE & def, ResultArrayType<TYPE> & out) const\
{\
const auto & attribute = getAttribute(attribute_name);\
if (!isAttributeTypeConvertibleTo(attribute.type, AttributeUnderlyingType::TYPE))\
@ -210,6 +216,9 @@ DECLARE(Int32)
DECLARE(Int64)
DECLARE(Float32)
DECLARE(Float64)
DECLARE(Decimal32)
DECLARE(Decimal64)
DECLARE(Decimal128)
#undef DECLARE
void HashedDictionary::getString(
@ -243,6 +252,10 @@ void HashedDictionary::has(const PaddedPODArray<Key> & ids, PaddedPODArray<UInt8
case AttributeUnderlyingType::Float32: has<Float32>(attribute, ids, out); break;
case AttributeUnderlyingType::Float64: has<Float64>(attribute, ids, out); break;
case AttributeUnderlyingType::String: has<StringRef>(attribute, ids, out); break;
case AttributeUnderlyingType::Decimal32: has<Decimal32>(attribute, ids, out); break;
case AttributeUnderlyingType::Decimal64: has<Decimal64>(attribute, ids, out); break;
case AttributeUnderlyingType::Decimal128: has<Decimal128>(attribute, ids, out); break;
}
}
@ -398,6 +411,11 @@ void HashedDictionary::calculateBytesAllocated()
case AttributeUnderlyingType::Int64: addAttributeSize<Int64>(attribute); break;
case AttributeUnderlyingType::Float32: addAttributeSize<Float32>(attribute); break;
case AttributeUnderlyingType::Float64: addAttributeSize<Float64>(attribute); break;
case AttributeUnderlyingType::Decimal32: addAttributeSize<Decimal32>(attribute); break;
case AttributeUnderlyingType::Decimal64: addAttributeSize<Decimal64>(attribute); break;
case AttributeUnderlyingType::Decimal128: addAttributeSize<Decimal128>(attribute); break;
case AttributeUnderlyingType::String:
{
addAttributeSize<StringRef>(attribute);
@ -433,6 +451,11 @@ HashedDictionary::Attribute HashedDictionary::createAttributeWithType(const Attr
case AttributeUnderlyingType::Int64: createAttributeImpl<Int64>(attr, null_value); break;
case AttributeUnderlyingType::Float32: createAttributeImpl<Float32>(attr, null_value); break;
case AttributeUnderlyingType::Float64: createAttributeImpl<Float64>(attr, null_value); break;
case AttributeUnderlyingType::Decimal32: createAttributeImpl<Decimal32>(attr, null_value); break;
case AttributeUnderlyingType::Decimal64: createAttributeImpl<Decimal64>(attr, null_value); break;
case AttributeUnderlyingType::Decimal128: createAttributeImpl<Decimal128>(attr, null_value); break;
case AttributeUnderlyingType::String:
{
std::get<String>(attr.null_values) = null_value.get<String>();
@ -468,6 +491,9 @@ void HashedDictionary::getItemsNumber(
DISPATCH(Int64)
DISPATCH(Float32)
DISPATCH(Float64)
DISPATCH(Decimal32)
DISPATCH(Decimal64)
DISPATCH(Decimal128)
#undef DISPATCH
else
throw Exception("Unexpected type of attribute: " + toString(attribute.type), ErrorCodes::LOGICAL_ERROR);
@ -515,6 +541,11 @@ void HashedDictionary::setAttributeValue(Attribute & attribute, const Key id, co
case AttributeUnderlyingType::Int64: setAttributeValueImpl<Int64>(attribute, id, value.get<Int64>()); break;
case AttributeUnderlyingType::Float32: setAttributeValueImpl<Float32>(attribute, id, value.get<Float64>()); break;
case AttributeUnderlyingType::Float64: setAttributeValueImpl<Float64>(attribute, id, value.get<Float64>()); break;
case AttributeUnderlyingType::Decimal32: setAttributeValueImpl<Decimal32>(attribute, id, value.get<Decimal32>()); break;
case AttributeUnderlyingType::Decimal64: setAttributeValueImpl<Decimal64>(attribute, id, value.get<Decimal64>()); break;
case AttributeUnderlyingType::Decimal128: setAttributeValueImpl<Decimal128>(attribute, id, value.get<Decimal128>()); break;
case AttributeUnderlyingType::String:
{
auto & map = *std::get<CollectionPtrType<StringRef>>(attribute.maps);
@ -578,6 +609,10 @@ PaddedPODArray<HashedDictionary::Key> HashedDictionary::getIds() const
case AttributeUnderlyingType::Float32: return getIds<Float32>(attribute);
case AttributeUnderlyingType::Float64: return getIds<Float64>(attribute);
case AttributeUnderlyingType::String: return getIds<StringRef>(attribute);
case AttributeUnderlyingType::Decimal32: return getIds<Decimal32>(attribute);
case AttributeUnderlyingType::Decimal64: return getIds<Decimal64>(attribute);
case AttributeUnderlyingType::Decimal128: return getIds<Decimal128>(attribute);
}
return PaddedPODArray<Key>();
}

Some files were not shown because too many files have changed in this diff.