From 2c9081768b2a6e4d97a7af208a92980b1fbb9427 Mon Sep 17 00:00:00 2001 From: Denis Zhuravlev Date: Fri, 6 Sep 2019 12:16:14 -0300 Subject: [PATCH 01/30] Update mergetree.md TTL examples --- docs/ru/operations/table_engines/mergetree.md | 46 +++++++++++++++++++ 1 file changed, 46 insertions(+) diff --git a/docs/ru/operations/table_engines/mergetree.md b/docs/ru/operations/table_engines/mergetree.md index d47336c2593..45c5778d0d7 100644 --- a/docs/ru/operations/table_engines/mergetree.md +++ b/docs/ru/operations/table_engines/mergetree.md @@ -327,10 +327,56 @@ TTL date_time + INTERVAL 15 HOUR Секцию `TTL` нельзя использовать для ключевых столбцов. +Примеры: + +```sql +CREATE TABLE ttl +( + d DateTime, + a Int TTL d + interval 1 month, + b Int TTL d + interval 1 month, + c String +) +ENGINE = MergeTree +PARTITION BY toYYYYMM(d) +ORDER BY d; + +// добавление ttl на колонку существующей таблицы + +ALTER TABLE ttl + MODIFY COLUMN + c String TTL d + interval 1 day; + +// изменение ttl у колонки + +ALTER TABLE ttl + MODIFY COLUMN + c String TTL d + interval 1 month; +``` + **TTL таблицы** Когда некоторые данные в таблице устаревают, ClickHouse удаляет все соответствующие строки. +Примеры: + +```sql +CREATE TABLE ttl +( + d DateTime, + a Int +) +ENGINE = MergeTree +PARTITION BY toYYYYMM(d) +ORDER BY d +TTL d + interval 1 month; + +-- Изменение TTL + +ALTER TABLE ttl + MODIFY TTL d + interval 1 day; +``` + **Удаление данных** Данные с истекшим TTL удаляются, когда ClickHouse мёржит куски данных. 
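The column-level `TTL d + interval 1 month` expressions added in the patch above can be sketched as a toy Python model (illustrative names only — this is not ClickHouse internals, and `INTERVAL 1 MONTH` is approximated as 30 days): when a column value's TTL expires, ClickHouse replaces it with the default value for the column type (0 for `Int`, an empty string for `String`).

```python
from datetime import datetime, timedelta

# Hypothetical stand-ins for the column TTLs declared in the example table:
# columns a and b expire 30 days after d; column c has no TTL here.
COLUMN_TTLS = {"a": timedelta(days=30), "b": timedelta(days=30)}
DEFAULTS = {"a": 0, "b": 0, "c": ""}

def apply_column_ttl(row, now):
    """Return a copy of the row with expired column values reset to defaults."""
    out = dict(row)
    for col, ttl in COLUMN_TTLS.items():
        if row["d"] + ttl <= now:  # TTL expression d + INTERVAL has passed
            out[col] = DEFAULTS[col]
    return out

now = datetime(2019, 9, 6)
fresh = {"d": now - timedelta(days=1), "a": 1, "b": 2, "c": "x"}
stale = {"d": now - timedelta(days=40), "a": 1, "b": 2, "c": "x"}
print(apply_column_ttl(fresh, now))  # all values kept
print(apply_column_ttl(stale, now))  # a and b reset to 0, c kept
```

Note that `c` keeps its value in both cases: only columns with a `TTL` clause participate.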
From 4c1073bd3c40a0299852f553a153b416458f6e9a Mon Sep 17 00:00:00 2001 From: Denis Zhuravlev Date: Fri, 6 Sep 2019 12:18:24 -0300 Subject: [PATCH 02/30] Update mergetree.md TTL examples --- docs/ru/operations/table_engines/mergetree.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/ru/operations/table_engines/mergetree.md b/docs/ru/operations/table_engines/mergetree.md index 45c5778d0d7..ac7458d59d5 100644 --- a/docs/ru/operations/table_engines/mergetree.md +++ b/docs/ru/operations/table_engines/mergetree.md @@ -341,13 +341,13 @@ ENGINE = MergeTree PARTITION BY toYYYYMM(d) ORDER BY d; -// добавление ttl на колонку существующей таблицы +-- добавление ttl на колонку существующей таблицы ALTER TABLE ttl MODIFY COLUMN c String TTL d + interval 1 day; -// изменение ttl у колонки +-- изменение ttl у колонки ALTER TABLE ttl MODIFY COLUMN From 873689b7d5fe8970ff13549cbe7f675e69d842ad Mon Sep 17 00:00:00 2001 From: Denis Zhuravlev Date: Fri, 6 Sep 2019 12:26:51 -0300 Subject: [PATCH 03/30] Update mergetree.md TTL examples --- docs/ru/operations/table_engines/mergetree.md | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/docs/ru/operations/table_engines/mergetree.md b/docs/ru/operations/table_engines/mergetree.md index ac7458d59d5..ef7c1c2f3ad 100644 --- a/docs/ru/operations/table_engines/mergetree.md +++ b/docs/ru/operations/table_engines/mergetree.md @@ -330,7 +330,7 @@ TTL date_time + INTERVAL 15 HOUR Примеры: ```sql -CREATE TABLE ttl +CREATE TABLE example_table ( d DateTime, a Int TTL d + interval 1 month, @@ -341,15 +341,15 @@ ENGINE = MergeTree PARTITION BY toYYYYMM(d) ORDER BY d; --- добавление ttl на колонку существующей таблицы +-- добавление TTL на колонку существующей таблицы -ALTER TABLE ttl +ALTER TABLE example_table MODIFY COLUMN c String TTL d + interval 1 day; --- изменение ttl у колонки +-- изменение TTL у колонки -ALTER TABLE ttl +ALTER TABLE example_table MODIFY COLUMN c String TTL d + interval 1 
month; ``` @@ -361,7 +361,7 @@ ALTER TABLE ttl Примеры: ```sql -CREATE TABLE ttl +CREATE TABLE example_table ( d DateTime, a Int @@ -373,7 +373,7 @@ TTL d + interval 1 month; -- Изменение TTL -ALTER TABLE ttl +ALTER TABLE example_table MODIFY TTL d + interval 1 day; ``` From 908055bf1a227377f316321d7ff18fda5eaaa960 Mon Sep 17 00:00:00 2001 From: Denis Zhuravlev Date: Fri, 6 Sep 2019 12:26:55 -0300 Subject: [PATCH 04/30] Update mergetree.md TTL examples --- docs/en/operations/table_engines/mergetree.md | 46 +++++++++++++++++++ 1 file changed, 46 insertions(+) diff --git a/docs/en/operations/table_engines/mergetree.md b/docs/en/operations/table_engines/mergetree.md index 7c694a1612c..46ff6926894 100644 --- a/docs/en/operations/table_engines/mergetree.md +++ b/docs/en/operations/table_engines/mergetree.md @@ -388,10 +388,56 @@ When the values in the column expire, ClickHouse replaces them with the default The `TTL` clause can't be used for key columns. +Examples: + +```sql +CREATE TABLE example_table +( + d DateTime, + a Int TTL d + interval 1 month, + b Int TTL d + interval 1 month, + c String +) +ENGINE = MergeTree +PARTITION BY toYYYYMM(d) +ORDER BY d; + +// adding TTL to a column of an existing table + +ALTER TABLE example_table + MODIFY COLUMN + c String TTL d + interval 1 day; + +// altering TTL of the column + +ALTER TABLE example_table + MODIFY COLUMN + c String TTL d + interval 1 month; +``` + **Table TTL** When data in a table expires, ClickHouse deletes all corresponding rows. +Examples: + +```sql +CREATE TABLE example_table +( + d DateTime, + a Int +) +ENGINE = MergeTree +PARTITION BY toYYYYMM(d) +ORDER BY d +TTL d + interval 1 month; + +-- altering of TTL + +ALTER TABLE example_table + MODIFY TTL d + interval 1 day; +``` + **Removing Data** Data with an expired TTL is removed when ClickHouse merges data parts. 
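The table-level TTL described above does not delete rows eagerly: as the patch notes, "data with an expired TTL is removed when ClickHouse merges data parts." A toy Python model (illustrative, not ClickHouse code) of that deferred behaviour:

```python
from datetime import datetime, timedelta

TABLE_TTL = timedelta(days=30)  # rough stand-in for "TTL d + INTERVAL 1 MONTH"

def merge_parts(parts, now):
    """Toy model of a background merge: combine the parts and drop rows whose
    table-level TTL has expired. Until a merge runs, expired rows stay on disk
    and remain visible to SELECTs."""
    merged = [row for part in parts for row in part if row["d"] + TABLE_TTL > now]
    return [merged]

now = datetime(2019, 9, 6)
parts = [
    [{"d": now - timedelta(days=40), "a": 1}],  # expired row
    [{"d": now - timedelta(days=1), "a": 2}],   # still alive
]
total_before = sum(len(p) for p in parts)
parts = merge_parts(parts, now)
total_after = sum(len(p) for p in parts)
print(total_before, total_after)  # 2 1
```

This is why row counts can stay unchanged for a while after the TTL expression evaluates to the past: removal only happens as a side effect of a merge.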
From 529cffeeab666d95e5f447774977794f7a15e8dc Mon Sep 17 00:00:00 2001 From: Denis Zhuravlev Date: Fri, 6 Sep 2019 12:28:33 -0300 Subject: [PATCH 05/30] Update mergetree.md TTL examples --- docs/en/operations/table_engines/mergetree.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/en/operations/table_engines/mergetree.md b/docs/en/operations/table_engines/mergetree.md index 46ff6926894..1a274de5924 100644 --- a/docs/en/operations/table_engines/mergetree.md +++ b/docs/en/operations/table_engines/mergetree.md @@ -402,13 +402,13 @@ ENGINE = MergeTree PARTITION BY toYYYYMM(d) ORDER BY d; -// adding TTL to a column of an existing table +-- adding TTL to a column of an existing table ALTER TABLE example_table MODIFY COLUMN c String TTL d + interval 1 day; -// altering TTL of the column +-- altering TTL of the column ALTER TABLE example_table MODIFY COLUMN From 88ded215ce4e565de897ea442aeeba52c677592e Mon Sep 17 00:00:00 2001 From: Denis Zhuravlev Date: Fri, 6 Sep 2019 13:52:48 -0300 Subject: [PATCH 06/30] Update mergetree.md TTL examples / Requested change --- docs/en/operations/table_engines/mergetree.md | 30 ++++++++++++------- 1 file changed, 20 insertions(+), 10 deletions(-) diff --git a/docs/en/operations/table_engines/mergetree.md b/docs/en/operations/table_engines/mergetree.md index 1a274de5924..c3d64395a02 100644 --- a/docs/en/operations/table_engines/mergetree.md +++ b/docs/en/operations/table_engines/mergetree.md @@ -390,29 +390,35 @@ The `TTL` clause can't be used for key columns. 
Examples: +Creating a table with TTL + ```sql CREATE TABLE example_table ( d DateTime, - a Int TTL d + interval 1 month, - b Int TTL d + interval 1 month, + a Int TTL d + INTERVAL 1 MONTH, + b Int TTL d + INTERVAL 1 MONTH, c String ) ENGINE = MergeTree PARTITION BY toYYYYMM(d) ORDER BY d; +``` --- adding TTL to a column of an existing table +Adding TTL to a column of an existing table +```sql ALTER TABLE example_table MODIFY COLUMN - c String TTL d + interval 1 day; - --- altering TTL of the column + c String TTL d + INTERVAL 1 DAY; +``` +Altering TTL of the column + +```sql ALTER TABLE example_table MODIFY COLUMN - c String TTL d + interval 1 month; + c String TTL d + INTERVAL 1 MONTH; ``` **Table TTL** @@ -421,6 +427,8 @@ When data in a table expires, ClickHouse deletes all corresponding rows. Examples: +Creating a table with TTL + ```sql CREATE TABLE example_table ( @@ -430,12 +438,14 @@ CREATE TABLE example_table ENGINE = MergeTree PARTITION BY toYYYYMM(d) ORDER BY d -TTL d + interval 1 month; +TTL d + INTERVAL 1 MONTH; +``` --- altering of TTL +Altering TTL of the table +```sql ALTER TABLE example_table - MODIFY TTL d + interval 1 day; + MODIFY TTL d + INTERVAL 1 DAY; ``` **Removing Data** From a38911d4df3173893468db1463dc718df6cfcabd Mon Sep 17 00:00:00 2001 From: Denis Zhuravlev Date: Fri, 6 Sep 2019 13:57:23 -0300 Subject: [PATCH 07/30] Update mergetree.md TTL examples / Requested change --- docs/ru/operations/table_engines/mergetree.md | 28 ++++++++++++------- 1 file changed, 18 insertions(+), 10 deletions(-) diff --git a/docs/ru/operations/table_engines/mergetree.md b/docs/ru/operations/table_engines/mergetree.md index ef7c1c2f3ad..6c4aa6ce2bb 100644 --- a/docs/ru/operations/table_engines/mergetree.md +++ b/docs/ru/operations/table_engines/mergetree.md @@ -329,29 +329,35 @@ TTL date_time + INTERVAL 15 HOUR Примеры: +Создание таблицы с TTL + ```sql CREATE TABLE example_table ( d DateTime, - a Int TTL d + interval 1 month, - b Int TTL d + interval 1 month, 
+ a Int TTL d + INTERVAL 1 MONTH, + b Int TTL d + INTERVAL 1 MONTH, c String ) ENGINE = MergeTree PARTITION BY toYYYYMM(d) ORDER BY d; +``` --- добавление TTL на колонку существующей таблицы +Добавление TTL на колонку существующей таблицы +```sql ALTER TABLE example_table MODIFY COLUMN - c String TTL d + interval 1 day; - --- изменение TTL у колонки + c String TTL d + INTERVAL 1 DAY; +``` +Изменение TTL у колонки + +```sql ALTER TABLE example_table MODIFY COLUMN - c String TTL d + interval 1 month; + c String TTL d + INTERVAL 1 MONTH; ``` **TTL таблицы** @@ -369,12 +375,14 @@ CREATE TABLE example_table ENGINE = MergeTree PARTITION BY toYYYYMM(d) ORDER BY d -TTL d + interval 1 month; +TTL d + INTERVAL 1 MONTH; +``` --- Изменение TTL +Изменение TTL +```sql ALTER TABLE example_table - MODIFY TTL d + interval 1 day; + MODIFY TTL d + INTERVAL 1 DAY; ``` **Удаление данных** From 115edf343dc0e2786870af5f605aadd65bb1c181 Mon Sep 17 00:00:00 2001 From: Denis Zhuravlev Date: Fri, 6 Sep 2019 17:40:50 -0300 Subject: [PATCH 08/30] Update system.md Added some system queries (1st attempt). 
--- docs/ru/query_language/system.md | 50 ++++++++++++++++++++++++++++++++ 1 file changed, 50 insertions(+) diff --git a/docs/ru/query_language/system.md b/docs/ru/query_language/system.md index f35b4a39061..905d921fb43 100644 --- a/docs/ru/query_language/system.md +++ b/docs/ru/query_language/system.md @@ -1,9 +1,59 @@ # Запросы SYSTEM {#query_language-system} +- [RELOAD DICTIONARIES](#query_language-system-reload-dictionaries) +- [RELOAD DICTIONARY](#query_language-system-reload-dictionary) +- [DROP DNS CACHE](#query_language-system-drop-dns-cache) +- [DROP MARKS CACHE](#query_language-system-drop-marks-cache) +- [FLUSH LOGS](#query_language-system-flush_logs) +- [RELOAD CONFIG](#query_language-system-reload-config) +- [SHUTDOWN](#query_language-system-shutdown) +- [KILL](#query_language-system-kill) - [STOP DISTRIBUTED SENDS](#query_language-system-stop-distributed-sends) - [FLUSH DISTRIBUTED](#query_language-system-flush-distributed) - [START DISTRIBUTED SENDS](#query_language-system-start-distributed-sends) +## RELOAD DICTIONARIES {#query_language-system-reload-dictionaries} + +Перегружает все словари, которые были успешно загружены до этого. +По умолчанию включена ленивая загрузка [dictionaries_lazy_load](../operations/server_settings/settings.md#dictionaries-lazy-load), поэтому словари не загружаются автоматически при старте, а только при первом обращении через dictGet или SELECT к ENGINE=Dictionary. После этого такие словари (LOADED) будут перегружаться командой `system reload dictionaries`. +Всегда возвращает Ok., вне зависимости от результата обновления словарей. + +## RELOAD DICTIONARY dictionary_name {#query_language-system-reload-dictionary} + +Полностью перегружает словарь `dictionary_name`, вне зависимости от состояния словаря (LOADED/NOT_LOADED/FAILED). +Всегда возвращает Ok., вне зависимости от результата обновления словаря. +Состояние словаря можно проверить запросом к `system.dictionaries`. 
+ +```sql +select name,status from system.dictionaries +``` + +## DROP DNS CACHE {#query_language-system-drop-dns-cache} + +Сбрасывает внутренний DNS кеш ClickHouse. Иногда необходимо использовать эту команду при изменении инфраструктуры (смене IP адреса у другого ClickHouse сервера или сервера, используемого словарями). + +Для более удобного (автоматического) управления кешем см. параметры disable_internal_dns_cache, dns_cache_update_period. + +## DROP MARKS CACHE {#query_language-system-drop-marks-cache} + +Сбрасывает кеш "засечек" (`mark cache`). Используется при разработке ClickHouse и тестах производительности. + +## FLUSH LOGS {#query_language-system-flush_logs} + +Записывает буферы логов в системные таблицы (например system.query_log). Позволяет не ждать 7.5 секунд при отладке. + +## RELOAD CONFIG {#query_language-system-reload-config} + +Перечитывает конфигурацию настроек ClickHouse. Используется при хранении конфигурации в ZooKeeper. + +## SHUTDOWN {#query_language-system-shutdown} + +Штатно завершает работу ClickHouse (аналог `service clickhouse-server stop` / `kill {$pid_clickhouse-server}`) + +## KILL {#query_language-system-kill} + +Аварийно завершает работу ClickHouse (аналог `kill -9 {$pid_clickhouse-server}`) + ## Управление распределёнными таблицами {#query_language-system-distributed} ClickHouse может оперировать [распределёнными](../operations/table_engines/distributed.md) таблицами. Когда пользователь вставляет данные в эти таблицы, ClickHouse сначала формирует очередь из данных, которые должны быть отправлены на узлы кластера, а затем асинхронно отправляет подготовленные данные. Вы можете управлять очередью с помощью запросов [STOP DISTRIBUTED SENDS](#query_language-system-stop-distributed-sends), [START DISTRIBUTED SENDS](#query_language-system-start-distributed-sends) и [FLUSH DISTRIBUTED](#query_language-system-flush-distributed). Также есть возможность синхронно вставлять распределенные данные с помощью настройки `insert_distributed_sync`.
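The difference between the two dictionary-reload queries documented above can be sketched in plain Python (names and state transitions are illustrative, not a ClickHouse API): `SYSTEM RELOAD DICTIONARIES` touches only dictionaries already in the LOADED state, while `SYSTEM RELOAD DICTIONARY name` reloads regardless of state, and both always answer `Ok.`.

```python
# Toy dictionary registry mirroring the statuses visible in system.dictionaries.
dictionaries = {"geo": "LOADED", "users": "NOT_LOADED", "broken": "FAILED"}

def reload_dictionaries():
    """Model of SYSTEM RELOAD DICTIONARIES: reload only LOADED dictionaries."""
    reloaded = [name for name, status in dictionaries.items() if status == "LOADED"]
    return "Ok.", reloaded  # always Ok., whatever happened during the reload

def reload_dictionary(name):
    """Model of SYSTEM RELOAD DICTIONARY name: reload regardless of state
    (we optimistically assume the reload attempt succeeds here)."""
    dictionaries[name] = "LOADED"
    return "Ok."

result, reloaded = reload_dictionaries()
print(result, reloaded)            # Ok. ['geo'] - NOT_LOADED/FAILED are skipped
print(reload_dictionary("users"))  # Ok.
print(dictionaries["users"])       # LOADED
```

Because both queries return `Ok.` unconditionally, checking `system.dictionaries` afterwards (as the docs suggest) is the only way to learn whether a reload actually succeeded.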
From edf101def0e74beee39b810889cf92f8d875372d Mon Sep 17 00:00:00 2001 From: Denis Zhuravlev Date: Fri, 6 Sep 2019 17:54:37 -0300 Subject: [PATCH 09/30] Update system.md Added some system queries (1st attempt). --- docs/en/query_language/system.md | 49 ++++++++++++++++++++++++++++++++ 1 file changed, 49 insertions(+) diff --git a/docs/en/query_language/system.md b/docs/en/query_language/system.md index b7797df490b..39be59d6c8a 100644 --- a/docs/en/query_language/system.md +++ b/docs/en/query_language/system.md @@ -1,9 +1,58 @@ # SYSTEM Queries {#query_language-system} +- [RELOAD DICTIONARIES](#query_language-system-reload-dictionaries) +- [RELOAD DICTIONARY](#query_language-system-reload-dictionary) +- [DROP DNS CACHE](#query_language-system-drop-dns-cache) +- [DROP MARKS CACHE](#query_language-system-drop-marks-cache) +- [FLUSH LOGS](#query_language-system-flush_logs) +- [RELOAD CONFIG](#query_language-system-reload-config) +- [SHUTDOWN](#query_language-system-shutdown) +- [KILL](#query_language-system-kill) - [STOP DISTRIBUTED SENDS](#query_language-system-stop-distributed-sends) - [FLUSH DISTRIBUTED](#query_language-system-flush-distributed) - [START DISTRIBUTED SENDS](#query_language-system-start-distributed-sends) +## RELOAD DICTIONARIES {#query_language-system-reload-dictionaries} + +Reloads all dictionaries that have been successfully downloaded before. +By default, lazy loading is enabled [dictionaries_lazy_load](../operations/server_settings/settings.md#dictionaries-lazy-load), so dictionaries are not loaded automatically at startup, but only when they first access through dictGet or SELECT to ENGINE = Dictionary. `system reload dictionaries` command reloads such dictionaries (LOADED). +Always returns `Ok.` regardless of the result of the dictionary update. 
+ +## RELOAD DICTIONARY dictionary_name {#query_language-system-reload-dictionary} + +Completely reloads a dictionary `dictionary_name`, regardless of the state of the dictionary (LOADED / NOT_LOADED / FAILED). +Always returns `Ok.` regardless of the result of updating the dictionary. +The status of the dictionary can be checked by querying `system.dictionaries`. + +```sql +select name,status from system.dictionaries +``` + +## DROP DNS CACHE {#query_language-system-drop-dns-cache} + +Resets ClickHouse's internal DNS cache. Sometimes it is necessary to use this command when changing the infrastructure (changing the IP address of another ClickHouse server or the server used by dictionaries). + +For more convenient (automatic) cache management, see disable_internal_dns_cache, dns_cache_update_period parameters. + +## DROP MARKS CACHE {#query_language-system-drop-marks-cache} + +Resets the mark cache. Used in development of ClickHouse and performance tests. + +## FLUSH LOGS {#query_language-system-flush_logs} + +Flushes buffers of log messages to system tables (e.g. system.query_log). Allows you to not wait 7.5 seconds when debugging. + +## RELOAD CONFIG {#query_language-system-reload-config} + +Reloads ClickHouse configuration. Used when configuration is stored in zookeeeper. + +## SHUTDOWN {#query_language-system-shutdown} + +Gracefully shuts down ClickHouse (like `service clickhouse-server stop` / `kill {$pid_clickhouse-server}`) + +## KILL {#query_language-system-kill} + +Aborts the ClickHouse process (like `kill -9 {$pid_clickhouse-server}`) ## Managing Distributed Tables {#query_language-system-distributed} From 2287c590f1f05c1699c09ed163a1591f14cc91a1 Mon Sep 17 00:00:00 2001 From: Denis Zhuravlev Date: Fri, 6 Sep 2019 17:56:39 -0300 Subject: [PATCH 10/30] Update system.md Added some system queries (1st attempt).
--- docs/ru/query_language/system.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/ru/query_language/system.md b/docs/ru/query_language/system.md index 905d921fb43..90eb090ad4d 100644 --- a/docs/ru/query_language/system.md +++ b/docs/ru/query_language/system.md @@ -16,12 +16,12 @@ Перегружает все словари, которые были успешно загружены до этого. По умолчанию включена ленивая загрузка [dictionaries_lazy_load](../operations/server_settings/settings.md#dictionaries-lazy-load), поэтому словари не загружаются автоматически при старте, а только при первом обращении через dictGet или SELECT к ENGINE=Dictionary. После этого такие словари (LOADED) будут перегружаться командой `system reload dictionaries`. -Всегда возвращает Ok., вне зависимости от результата обновления словарей. +Всегда возвращает `Ok.`, вне зависимости от результата обновления словарей. ## RELOAD DICTIONARY dictionary_name {#query_language-system-reload-dictionary} Полностью перегружает словарь `dictionary_name`, вне зависимости от состояния словаря (LOADED/NOT_LOADED/FAILED). -Всегда возвращает Ok., вне зависимости от результата обновления словаря. +Всегда возвращает `Ok.`, вне зависимости от результата обновления словаря. Состояние словаря можно проверить запросом к `system.dictionaries`. ```sql From b8f1ebfe62c7ec5fc338186d86f9da4e09dbfc76 Mon Sep 17 00:00:00 2001 From: Denis Zhuravlev Date: Fri, 6 Sep 2019 20:08:29 -0300 Subject: [PATCH 11/30] Update docs/en/query_language/system.md Co-Authored-By: Ivan Blinkov --- docs/en/query_language/system.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/en/query_language/system.md b/docs/en/query_language/system.md index 39be59d6c8a..a054a9bc533 100644 --- a/docs/en/query_language/system.md +++ b/docs/en/query_language/system.md @@ -15,7 +15,7 @@ ## RELOAD DICTIONARIES {#query_language-system-reload-dictionaries} Reloads all dictionaries that have been successfully downloaded before. 
-By default, lazy loading is enabled [dictionaries_lazy_load](../operations/server_settings/settings.md#dictionaries-lazy-load), so dictionaries are not loaded automatically at startup, but only when they first access through dictGet or SELECT to ENGINE = Dictionary. `system reload dictionaries` command reloads such dictionaries (LOADED). +By default, dictionaries are loaded lazily (see [dictionaries_lazy_load](../operations/server_settings/settings.md#dictionaries-lazy-load)), so instead of being loaded automatically at startup, they are initialized on first access through dictGet function or SELECT to ENGINE = Dictionary. The `SYSTEM RELOAD DICTIONARIES` query reloads such dictionaries (LOADED). Always returns `Ok.` regardless of the result of the dictionary update. ## RELOAD DICTIONARY dictionary_name {#query_language-system-reload-dictionary} From 26572e19f9aec981ad5286ae5a49014c60322449 Mon Sep 17 00:00:00 2001 From: Denis Zhuravlev Date: Fri, 6 Sep 2019 20:08:42 -0300 Subject: [PATCH 12/30] Update docs/en/query_language/system.md Co-Authored-By: Ivan Blinkov --- docs/en/query_language/system.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/en/query_language/system.md b/docs/en/query_language/system.md index a054a9bc533..a99aa15d72b 100644 --- a/docs/en/query_language/system.md +++ b/docs/en/query_language/system.md @@ -25,7 +25,7 @@ Always returns `Ok.` regardless of the result of updating the dictionary. The status of the dictionary can be checked by querying `system.dictionaries`. 
```sql -select name,status from system.dictionaries +SELECT name, status FROM system.dictionaries; ``` ## DROP DNS CACHE {#query_language-system-drop-dns-cache} From 3178b5050534fb8b331589e08008257be4abef6d Mon Sep 17 00:00:00 2001 From: Denis Zhuravlev Date: Fri, 6 Sep 2019 20:09:02 -0300 Subject: [PATCH 13/30] Update docs/en/query_language/system.md Co-Authored-By: Ivan Blinkov --- docs/en/query_language/system.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/en/query_language/system.md b/docs/en/query_language/system.md index a99aa15d72b..3f9b3868e33 100644 --- a/docs/en/query_language/system.md +++ b/docs/en/query_language/system.md @@ -22,7 +22,7 @@ Always returns `Ok.` regardless of the result of the dictionary update. Completely reloads a dictionary `dictionary_name`, regardless of the state of the dictionary (LOADED / NOT_LOADED / FAILED). Always returns `Ok.` regardless of the result of updating the dictionary. -The status of the dictionary can be checked by querying `system.dictionaries`. +The status of the dictionary can be checked by querying the `system.dictionaries` table. ```sql SELECT name, status FROM system.dictionaries; From 34bc227d76ba7624b62e0d1f4e6d88480c4fc4ab Mon Sep 17 00:00:00 2001 From: Denis Zhuravlev Date: Fri, 6 Sep 2019 20:11:10 -0300 Subject: [PATCH 14/30] Update docs/en/query_language/system.md Co-Authored-By: Ivan Blinkov --- docs/en/query_language/system.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/en/query_language/system.md b/docs/en/query_language/system.md index 3f9b3868e33..2b6f3ff4607 100644 --- a/docs/en/query_language/system.md +++ b/docs/en/query_language/system.md @@ -44,7 +44,7 @@ Flushes buffers of log messages to system tables (e.g. system.query_log). Allows ## RELOAD CONFIG {#query_language-system-reload-config} -Reloads ClickHouse configuration. Used when configuration is stored in zookeeeper. +Reloads ClickHouse configuration. 
Used when configuration is stored in ZooKeeper. ## SHUTDOWN {#query_language-system-shutdown} From 6c32fc3fc11a27a09b174f1d7b4dd8550b79e918 Mon Sep 17 00:00:00 2001 From: Ivan <5627721+abyss7@users.noreply.github.com> Date: Mon, 9 Sep 2019 19:59:51 +0300 Subject: [PATCH 15/30] Store offsets manually for each message (#6872) --- dbms/src/Storages/Kafka/ReadBufferFromKafkaConsumer.cpp | 7 ++++--- dbms/src/Storages/Kafka/StorageKafka.cpp | 7 ++++--- 2 files changed, 8 insertions(+), 6 deletions(-) diff --git a/dbms/src/Storages/Kafka/ReadBufferFromKafkaConsumer.cpp b/dbms/src/Storages/Kafka/ReadBufferFromKafkaConsumer.cpp index 4614e581a3c..823eb632b7f 100644 --- a/dbms/src/Storages/Kafka/ReadBufferFromKafkaConsumer.cpp +++ b/dbms/src/Storages/Kafka/ReadBufferFromKafkaConsumer.cpp @@ -72,9 +72,7 @@ void ReadBufferFromKafkaConsumer::commit() PrintOffsets("Polled offset", consumer->get_offsets_position(consumer->get_assignment())); - /// Since we can poll more messages than we already processed - commit only processed messages. - if (!messages.empty()) - consumer->async_commit(*std::prev(current)); + consumer->async_commit(); PrintOffsets("Committed offset", consumer->get_offsets_committed(consumer->get_assignment())); @@ -186,6 +184,9 @@ bool ReadBufferFromKafkaConsumer::nextImpl() auto new_position = reinterpret_cast<char *>(const_cast<unsigned char *>(current->get_payload().get_data())); BufferBase::set(new_position, current->get_payload().get_size(), 0); + /// Since we can poll more messages than we already processed - commit only processed messages.
+ consumer->store_offset(*current); + ++current; return true; diff --git a/dbms/src/Storages/Kafka/StorageKafka.cpp b/dbms/src/Storages/Kafka/StorageKafka.cpp index 2b41fa9e772..ed067993a18 100644 --- a/dbms/src/Storages/Kafka/StorageKafka.cpp +++ b/dbms/src/Storages/Kafka/StorageKafka.cpp @@ -261,9 +261,10 @@ ConsumerBufferPtr StorageKafka::createReadBuffer() conf.set("metadata.broker.list", brokers); conf.set("group.id", group); conf.set("client.id", VERSION_FULL); - conf.set("auto.offset.reset", "smallest"); // If no offset stored for this group, read all messages from the start - conf.set("enable.auto.commit", "false"); // We manually commit offsets after a stream successfully finished - conf.set("enable.partition.eof", "false"); // Ignore EOF messages + conf.set("auto.offset.reset", "smallest"); // If no offset stored for this group, read all messages from the start + conf.set("enable.auto.commit", "false"); // We manually commit offsets after a stream successfully finished + conf.set("enable.auto.offset.store", "false"); // Do not store offsets automatically - we store them manually after each processed message, to commit them all at once.
+ conf.set("enable.partition.eof", "false"); // Ignore EOF messages updateConfiguration(conf); // Create a consumer and subscribe to topics From 10011d1483b738e070bb09f7988ca37f1966c64b Mon Sep 17 00:00:00 2001 From: Denis Zhuravlev Date: Mon, 9 Sep 2019 14:05:18 -0300 Subject: [PATCH 16/30] Update formats.md Fixed rowbinary translation and added rowbinarywithnamesandtypes --- docs/ru/interfaces/formats.md | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/docs/ru/interfaces/formats.md b/docs/ru/interfaces/formats.md index 15f7552f877..0145423ffeb 100644 --- a/docs/ru/interfaces/formats.md +++ b/docs/ru/interfaces/formats.md @@ -28,6 +28,7 @@ ClickHouse может принимать (`INSERT`) и отдавать (`SELECT | [Protobuf](#protobuf) | ✔ | ✔ | | [Parquet](#data-format-parquet) | ✔ | ✔ | | [RowBinary](#rowbinary) | ✔ | ✔ | +| [RowBinaryWithNamesAndTypes](#rowbinarywithnamesandtypes) | ✔ | ✔ | | [Native](#native) | ✔ | ✔ | | [Null](#null) | ✗ | ✔ | | [XML](#xml) | ✗ | ✔ | @@ -673,7 +674,15 @@ FixedString представлены просто как последовате Array представлены как длина в формате varint (unsigned [LEB128](https://en.wikipedia.org/wiki/LEB128)), а затем элементы массива, подряд. -Для поддержки [NULL](../query_language/syntax.md#null-literal) перед каждым значением типа [Nullable](../data_types/nullable.md +Для поддержки [NULL](../query_language/syntax.md#null-literal) перед каждым значением типа [Nullable](../data_types/nullable.md) следует байт содержащий 1 или 0. Если байт 1, то значение равно NULL, и этот байт интерпретируется как отдельное значение (т.е. после него следует значение следующего поля). Если байт 0, то после байта следует значение поля (не равно NULL). 
+ +## RowBinaryWithNamesAndTypes {#rowbinarywithnamesandtypes} + +Тоже самое что [RowBinary](#rowbinary), но добавляется заголовок: + + * Число (N) колонок закодированное [LEB128](https://en.wikipedia.org/wiki/LEB128) + * N строк (`String`) с именами колонок + * N строк (`String`) с типами колонок ## Values {#data-format-values} From 4004f17b09c949340df612c82007d10ee5ac8929 Mon Sep 17 00:00:00 2001 From: Denis Zhuravlev Date: Mon, 9 Sep 2019 14:09:42 -0300 Subject: [PATCH 17/30] Update formats.md Added RowBinaryWithNamesAndTypes toc. Fixed list formatting. --- docs/en/interfaces/formats.md | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/docs/en/interfaces/formats.md b/docs/en/interfaces/formats.md index ed36e79fbc0..4dc123d6647 100644 --- a/docs/en/interfaces/formats.md +++ b/docs/en/interfaces/formats.md @@ -29,6 +29,7 @@ The supported formats are: | [Protobuf](#protobuf) | ✔ | ✔ | | [Parquet](#data-format-parquet) | ✔ | ✔ | | [RowBinary](#rowbinary) | ✔ | ✔ | +| [RowBinaryWithNamesAndTypes](#rowbinarywithnamesandtypes) | ✔ | ✔ | | [Native](#native) | ✔ | ✔ | | [Null](#null) | ✗ | ✔ | | [XML](#xml) | ✗ | ✔ | @@ -680,9 +681,10 @@ For [NULL](../query_language/syntax.md#null-literal) support, an additional byte ## RowBinaryWithNamesAndTypes {#rowbinarywithnamesandtypes} Similar to [RowBinary](#rowbinary), but with added header: -* [LEB128](https://en.wikipedia.org/wiki/LEB128)-encoded number of columns (N) -* N `String`s specifying column names -* N `String`s specifying column types + + * [LEB128](https://en.wikipedia.org/wiki/LEB128)-encoded number of columns (N) + * N `String`s specifying column names + * N `String`s specifying column types ## Values {#data-format-values} From e8cf800c6ef8314307881b7988916a97026d5aac Mon Sep 17 00:00:00 2001 From: Denis Zhuravlev Date: Mon, 9 Sep 2019 14:12:43 -0300 Subject: [PATCH 18/30] Update formats.md Fixed rowbinary RU translation and added rowbinarywithnamesandtypes --- docs/ru/interfaces/formats.md 
| 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/ru/interfaces/formats.md b/docs/ru/interfaces/formats.md index 0145423ffeb..51fd9635905 100644 --- a/docs/ru/interfaces/formats.md +++ b/docs/ru/interfaces/formats.md @@ -678,7 +678,7 @@ Array представлены как длина в формате varint (unsig ## RowBinaryWithNamesAndTypes {#rowbinarywithnamesandtypes} -Тоже самое что [RowBinary](#rowbinary), но добавляется заголовок: +То же самое что [RowBinary](#rowbinary), но добавляется заголовок: * Число (N) колонок закодированное [LEB128](https://en.wikipedia.org/wiki/LEB128) * N строк (`String`) с именами колонок From 9f94eb74df991a784af1e9c756d61bfb9888ff16 Mon Sep 17 00:00:00 2001 From: Ivan Blinkov Date: Mon, 9 Sep 2019 20:32:15 +0300 Subject: [PATCH 19/30] Remove link to past meetup --- README.md | 1 - 1 file changed, 1 deletion(-) diff --git a/README.md b/README.md index d5b0bf63165..8a67cc530d1 100644 --- a/README.md +++ b/README.md @@ -13,7 +13,6 @@ ClickHouse is an open-source column-oriented database management system that all * You can also [fill this form](https://forms.yandex.com/surveys/meet-yandex-clickhouse-team/) to meet Yandex ClickHouse team in person. ## Upcoming Events -* [ClickHouse Meetup in Moscow](https://yandex.ru/promo/clickhouse/moscow-2019) on September 5. * [ClickHouse Meetup in Munich](https://www.meetup.com/ClickHouse-Meetup-Munich/events/264185199/) on September 17. * [ClickHouse Meetup in Paris](https://www.eventbrite.com/e/clickhouse-paris-meetup-2019-registration-68493270215) on October 3. * [ClickHouse Meetup in Hong Kong](https://www.meetup.com/Hong-Kong-Machine-Learning-Meetup/events/263580542/) on October 17. From 0c6d583a155f8a0433f8ca159c4e24f215817f76 Mon Sep 17 00:00:00 2001 From: Denis Zhuravlev Date: Mon, 9 Sep 2019 14:51:58 -0300 Subject: [PATCH 20/30] Doc change. 
less confusing description for pointInEllipses (#6790) --- docs/en/query_language/functions/geo.md | 11 ++++++----- docs/ru/query_language/functions/geo.md | 11 ++++++----- 2 files changed, 12 insertions(+), 10 deletions(-) diff --git a/docs/en/query_language/functions/geo.md b/docs/en/query_language/functions/geo.md index d05345da29e..79b8390d59f 100644 --- a/docs/en/query_language/functions/geo.md +++ b/docs/en/query_language/functions/geo.md @@ -38,6 +38,7 @@ SELECT greatCircleDistance(55.755831, 37.617673, -55.755831, -37.617673) ## pointInEllipses Checks whether the point belongs to at least one of the ellipses. +Coordinates are geometric in the Cartesian coordinate system. ``` pointInEllipses(x, y, x₀, y₀, a₀, b₀,...,xₙ, yₙ, aₙ, bₙ) ``` **Input parameters** - `x, y` — Coordinates of a point on the plane. - `xᵢ, yᵢ` — Coordinates of the center of the `i`-th ellipsis. -- `aᵢ, bᵢ` — Axes of the `i`-th ellipsis in meters. +- `aᵢ, bᵢ` — Axes of the `i`-th ellipse in units of x, y coordinates. The input parameters must be `2+4⋅n`, where `n` is the number of ellipses.
**Example** ``` sql -SELECT pointInEllipses(55.755831, 37.617673, 55.755831, 37.617673, 1.0, 2.0) +SELECT pointInEllipses(10., 10., 10., 9.1, 1., 0.9999) ``` ``` -┌─pointInEllipses(55.755831, 37.617673, 55.755831, 37.617673, 1., 2.)─┐ -│ 1 │ -└─────────────────────────────────────────────────────────────────────┘ +┌─pointInEllipses(10., 10., 10., 9.1, 1., 0.9999)─┐ +│ 1 │ +└─────────────────────────────────────────────────┘ ``` ## pointInPolygon diff --git a/docs/ru/query_language/functions/geo.md b/docs/ru/query_language/functions/geo.md index b8e37c15aca..63ceae9208e 100644 --- a/docs/ru/query_language/functions/geo.md +++ b/docs/ru/query_language/functions/geo.md @@ -38,6 +38,7 @@ SELECT greatCircleDistance(55.755831, 37.617673, -55.755831, -37.617673) ## pointInEllipses Проверяет, принадлежит ли точка хотя бы одному из эллипсов. +Координаты — геометрические в декартовой системе координат. ``` pointInEllipses(x, y, x₀, y₀, a₀, b₀,...,xₙ, yₙ, aₙ, bₙ) @@ -47,7 +48,7 @@ pointInEllipses(x, y, x₀, y₀, a₀, b₀,...,xₙ, yₙ, aₙ, bₙ) - `x, y` — координаты точки на плоскости. - `xᵢ, yᵢ` — координаты центра `i`-го эллипса. -- `aᵢ, bᵢ` — полуоси `i`-го эллипса в метрах. +- `aᵢ, bᵢ` — полуоси `i`-го эллипса (в единицах измерения координат x,y). Входных параметров должно быть `2+4⋅n`, где `n` — количество эллипсов. 
@@ -58,13 +59,13 @@ pointInEllipses(x, y, x₀, y₀, a₀, b₀,...,xₙ, yₙ, aₙ, bₙ) **Пример** ```sql -SELECT pointInEllipses(55.755831, 37.617673, 55.755831, 37.617673, 1.0, 2.0) +SELECT pointInEllipses(10., 10., 10., 9.1, 1., 0.9999) ``` ``` -┌─pointInEllipses(55.755831, 37.617673, 55.755831, 37.617673, 1., 2.)─┐ -│ 1 │ -└─────────────────────────────────────────────────────────────────────┘ +┌─pointInEllipses(10., 10., 10., 9.1, 1., 0.9999)─┐ +│ 1 │ +└─────────────────────────────────────────────────┘ ``` ## pointInPolygon From 927a31f8e2b47d593cd12b3e6bd3d118c3d58b3a Mon Sep 17 00:00:00 2001 From: Denis Zhuravlev Date: Mon, 9 Sep 2019 15:34:08 -0300 Subject: [PATCH 21/30] Update docs/en/query_language/system.md Co-Authored-By: Ivan Blinkov --- docs/en/query_language/system.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/en/query_language/system.md b/docs/en/query_language/system.md index 2b6f3ff4607..fa01c509610 100644 --- a/docs/en/query_language/system.md +++ b/docs/en/query_language/system.md @@ -15,7 +15,7 @@ ## RELOAD DICTIONARIES {#query_language-system-reload-dictionaries} Reloads all dictionaries that have been successfully downloaded before. -By default, dictionaries are loaded lazily (see [dictionaries_lazy_load](../operations/server_settings/settings.md#dictionaries-lazy-load)), so instead of being loaded automatically at startup, they are initialized on first access through dictGet function or SELECT to ENGINE = Dictionary. The `SYSTEM RELOAD DICTIONARIES` query reloads such dictionaries (LOADED). +By default, dictionaries are loaded lazily (see [dictionaries_lazy_load](../operations/server_settings/settings.md#dictionaries-lazy-load)), so instead of being loaded automatically at startup, they are initialized on first access through dictGet function or SELECT from tables with ENGINE = Dictionary. The `SYSTEM RELOAD DICTIONARIES` query reloads such dictionaries (LOADED). 
Always returns `Ok.` regardless of the result of the dictionary update. ## RELOAD DICTIONARY dictionary_name {#query_language-system-reload-dictionary} From 851daebe4cb22c68dc971396e6394695ab8b6516 Mon Sep 17 00:00:00 2001 From: Denis Zhuravlev Date: Mon, 9 Sep 2019 15:34:32 -0300 Subject: [PATCH 22/30] Update docs/ru/query_language/system.md Co-Authored-By: Ivan Blinkov --- docs/ru/query_language/system.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/ru/query_language/system.md b/docs/ru/query_language/system.md index 90eb090ad4d..31cac017a86 100644 --- a/docs/ru/query_language/system.md +++ b/docs/ru/query_language/system.md @@ -25,7 +25,7 @@ Состояние словаря можно проверить запросом к `system.dictionaries`. ```sql -select name,status from system.dictionaries +SELECT name, status FROM system.dictionaries ``` ## DROP DNS CACHE {#query_language-system-drop-dns-cache} From 8002b7d7a8d4ccbc60b5c83fe058135989d8393a Mon Sep 17 00:00:00 2001 From: Denis Zhuravlev Date: Mon, 9 Sep 2019 15:34:48 -0300 Subject: [PATCH 23/30] Update docs/en/query_language/system.md Co-Authored-By: Ivan Blinkov --- docs/en/query_language/system.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/en/query_language/system.md b/docs/en/query_language/system.md index fa01c509610..493a2ca14ff 100644 --- a/docs/en/query_language/system.md +++ b/docs/en/query_language/system.md @@ -14,7 +14,7 @@ ## RELOAD DICTIONARIES {#query_language-system-reload-dictionaries} -Reloads all dictionaries that have been successfully downloaded before. +Reloads all dictionaries that have been successfully loaded before. By default, dictionaries are loaded lazily (see [dictionaries_lazy_load](../operations/server_settings/settings.md#dictionaries-lazy-load)), so instead of being loaded automatically at startup, they are initialized on first access through dictGet function or SELECT from tables with ENGINE = Dictionary. 
The `SYSTEM RELOAD DICTIONARIES` query reloads such dictionaries (LOADED). Always returns `Ok.` regardless of the result of the dictionary update. From 1275ed4641be5dfa5b1dc8aac7e6a0347d0cf2d8 Mon Sep 17 00:00:00 2001 From: Denis Zhuravlev Date: Mon, 9 Sep 2019 15:35:41 -0300 Subject: [PATCH 24/30] Update system.md --- docs/ru/query_language/system.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/ru/query_language/system.md b/docs/ru/query_language/system.md index 31cac017a86..090376a5d12 100644 --- a/docs/ru/query_language/system.md +++ b/docs/ru/query_language/system.md @@ -25,7 +25,7 @@ Состояние словаря можно проверить запросом к `system.dictionaries`. ```sql -SELECT name, status FROM system.dictionaries +SELECT name, status FROM system.dictionaries; ``` ## DROP DNS CACHE {#query_language-system-drop-dns-cache} From 282e24975b779ac8a1c24fa03faba3777b6ac5f0 Mon Sep 17 00:00:00 2001 From: Denis Zhuravlev Date: Mon, 9 Sep 2019 15:42:12 -0300 Subject: [PATCH 25/30] Update formats.md --- docs/ru/interfaces/formats.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/docs/ru/interfaces/formats.md b/docs/ru/interfaces/formats.md index 51fd9635905..fc28d97ecb9 100644 --- a/docs/ru/interfaces/formats.md +++ b/docs/ru/interfaces/formats.md @@ -680,9 +680,9 @@ Array представлены как длина в формате varint (unsig То же самое что [RowBinary](#rowbinary), но добавляется заголовок: - * Число (N) колонок закодированное [LEB128](https://en.wikipedia.org/wiki/LEB128) - * N строк (`String`) с именами колонок - * N строк (`String`) с типами колонок + * Количество колонок - N, закодированное [LEB128](https://en.wikipedia.org/wiki/LEB128), + * N строк (`String`) с именами колонок, + * N строк (`String`) с типами колонок. ## Values {#data-format-values} From 54a5b801b708701b1ddbda95887465b9f7ae5740 Mon Sep 17 00:00:00 2001 From: proller Date: Tue, 10 Sep 2019 00:40:40 +0300 Subject: [PATCH 26/30] Build fixes (Orc, ...) 
(#6835) * Fix build * cmake: fix cpuinfo * Fix includes after processors merge Conflicts: dbms/src/Processors/Formats/Impl/CapnProtoRowInputFormat.cpp dbms/src/Processors/Formats/Impl/ParquetBlockOutputFormat.cpp dbms/src/Processors/Formats/Impl/ProtobufRowInputFormat.cpp dbms/src/Processors/Formats/Impl/ProtobufRowOutputFormat.cpp * Fix build in gcc8 * fix test link * fix test link * Fix test link * link fix * Fix includes after processors merge 2 Conflicts: dbms/src/Processors/Formats/Impl/ParquetBlockInputFormat.cpp * Fix includes after processors merge 3 * link fix * Fix likely/unlikely conflict with cython * Fix conflict with protobuf/stubs/atomicops.h * remove unlikely.h * Fix macos build (do not use timer_t) * wip * Fix build (orc, ...) * Missing files * Try fix * fix hdfs * Fix llvm 7.1 find --- CMakeLists.txt | 2 +- cmake/find_hdfs3.cmake | 35 +-- cmake/find_llvm.cmake | 22 +- cmake/find_orc.cmake | 40 ++- cmake/find_parquet.cmake | 1 + contrib/CMakeLists.txt | 17 +- contrib/arrow-cmake/CMakeLists.txt | 24 +- contrib/arrow-cmake/orc_check.cmake | 126 ++++++++++ contrib/libhdfs3-cmake/CMakeLists.txt | 10 +- contrib/orc-cmake/CMakeLists.txt | 229 ++++++++++++++++++ .../Formats/Impl/ArrowColumnToCHColumn.cpp | 2 +- .../Formats/Impl/ArrowColumnToCHColumn.h | 2 +- 12 files changed, 443 insertions(+), 67 deletions(-) create mode 100644 contrib/arrow-cmake/orc_check.cmake create mode 100644 contrib/orc-cmake/CMakeLists.txt diff --git a/CMakeLists.txt b/CMakeLists.txt index 5330c8daeb5..578e25b8e16 100644 --- a/CMakeLists.txt +++ b/CMakeLists.txt @@ -343,7 +343,7 @@ include (cmake/find_hyperscan.cmake) include (cmake/find_simdjson.cmake) include (cmake/find_rapidjson.cmake) include (cmake/find_fastops.cmake) -include (cmake/find_orc.cmake) +#include (cmake/find_orc.cmake) find_contrib_lib(cityhash) find_contrib_lib(farmhash) diff --git a/cmake/find_hdfs3.cmake b/cmake/find_hdfs3.cmake index 4c29047fc75..9c593d3266a 100644 --- a/cmake/find_hdfs3.cmake +++ 
b/cmake/find_hdfs3.cmake @@ -1,24 +1,29 @@ -if (NOT ARCH_ARM AND NOT OS_FREEBSD AND NOT APPLE AND USE_PROTOBUF) - option (ENABLE_HDFS "Enable HDFS" ${NOT_UNBUNDLED}) -endif () +if(NOT ARCH_ARM AND NOT OS_FREEBSD AND NOT APPLE AND USE_PROTOBUF) + option(ENABLE_HDFS "Enable HDFS" 1) +endif() -if (ENABLE_HDFS AND NOT EXISTS "${ClickHouse_SOURCE_DIR}/contrib/libhdfs3/include/hdfs/hdfs.h") - message (WARNING "submodule contrib/libhdfs3 is missing. to fix try run: \n git submodule update --init --recursive") - set (ENABLE_HDFS 0) -endif () +if(ENABLE_HDFS) +option(USE_INTERNAL_HDFS3_LIBRARY "Set to FALSE to use system HDFS3 instead of bundled" ${NOT_UNBUNDLED}) -if (ENABLE_HDFS) -option (USE_INTERNAL_HDFS3_LIBRARY "Set to FALSE to use system HDFS3 instead of bundled" ON) +if(NOT EXISTS "${ClickHouse_SOURCE_DIR}/contrib/libhdfs3/include/hdfs/hdfs.h") + if(USE_INTERNAL_HDFS3_LIBRARY) + message(WARNING "submodule contrib/libhdfs3 is missing. to fix try run: \n git submodule update --init --recursive") + endif() + set(MISSING_INTERNAL_HDFS3_LIBRARY 1) + set(USE_INTERNAL_HDFS3_LIBRARY 0) +endif() -if (NOT USE_INTERNAL_HDFS3_LIBRARY) - find_package(hdfs3) -endif () +if(NOT USE_INTERNAL_HDFS3_LIBRARY) + find_library(HDFS3_LIBRARY hdfs3) + find_path(HDFS3_INCLUDE_DIR NAMES hdfs/hdfs.h PATHS ${HDFS3_INCLUDE_PATHS}) +endif() -if (HDFS3_LIBRARY AND HDFS3_INCLUDE_DIR) +if(HDFS3_LIBRARY AND HDFS3_INCLUDE_DIR) set(USE_HDFS 1) -elseif (LIBGSASL_LIBRARY AND LIBXML2_LIBRARY) +elseif(NOT MISSING_INTERNAL_HDFS3_LIBRARY AND LIBGSASL_LIBRARY AND LIBXML2_LIBRARY) set(HDFS3_INCLUDE_DIR "${ClickHouse_SOURCE_DIR}/contrib/libhdfs3/include") set(HDFS3_LIBRARY hdfs3) + set(USE_INTERNAL_HDFS3_LIBRARY 1) set(USE_HDFS 1) else() set(USE_INTERNAL_HDFS3_LIBRARY 0) @@ -26,4 +31,4 @@ endif() endif() -message (STATUS "Using hdfs3=${USE_HDFS}: ${HDFS3_INCLUDE_DIR} : ${HDFS3_LIBRARY}") +message(STATUS "Using hdfs3=${USE_HDFS}: ${HDFS3_INCLUDE_DIR} : ${HDFS3_LIBRARY}") diff --git a/cmake/find_llvm.cmake 
b/cmake/find_llvm.cmake index 3692a98b979..c668416c0c0 100644 --- a/cmake/find_llvm.cmake +++ b/cmake/find_llvm.cmake @@ -18,22 +18,12 @@ if (ENABLE_EMBEDDED_COMPILER) elseif (CMAKE_CXX_COMPILER_ID STREQUAL "Clang") find_package(LLVM ${CMAKE_CXX_COMPILER_VERSION} CONFIG PATHS ${LLVM_PATHS}) else () - #TODO: - #if(NOT LLVM_FOUND) - # find_package(LLVM 9 CONFIG PATHS ${LLVM_PATHS}) - #endif() - #if(NOT LLVM_FOUND) - # find_package(LLVM 8 CONFIG PATHS ${LLVM_PATHS}) - #endif() - if (NOT LLVM_FOUND) - find_package (LLVM 7 CONFIG PATHS ${LLVM_PATHS}) - endif () - if (NOT LLVM_FOUND) - find_package (LLVM 6 CONFIG PATHS ${LLVM_PATHS}) - endif () - if (NOT LLVM_FOUND) - find_package (LLVM 5 CONFIG PATHS ${LLVM_PATHS}) - endif () + # TODO: 9 8 + foreach(llvm_v 7.1 7 6 5) + if (NOT LLVM_FOUND) + find_package (LLVM ${llvm_v} CONFIG PATHS ${LLVM_PATHS}) + endif () + endforeach () endif () if (LLVM_FOUND) diff --git a/cmake/find_orc.cmake b/cmake/find_orc.cmake index 3676bec1b6b..50e563b04b4 100644 --- a/cmake/find_orc.cmake +++ b/cmake/find_orc.cmake @@ -1,8 +1,38 @@ -##TODO replace hardcode to find procedure +option (ENABLE_ORC "Enable ORC" 1) -set(USE_ORC 0) -set(USE_INTERNAL_ORC_LIBRARY ON) +if(ENABLE_ORC) +option (USE_INTERNAL_ORC_LIBRARY "Set to FALSE to use system ORC instead of bundled" ${NOT_UNBUNDLED}) -if (ARROW_LIBRARY) +if (NOT EXISTS "${ClickHouse_SOURCE_DIR}/contrib/orc/c++/include/orc/OrcFile.hh") + if(USE_INTERNAL_ORC_LIBRARY) + message(WARNING "submodule contrib/orc is missing. 
to fix try run: \n git submodule update --init --recursive") + set(USE_INTERNAL_ORC_LIBRARY 0) + endif() + set(MISSING_INTERNAL_ORC_LIBRARY 1) +endif () + +if (NOT USE_INTERNAL_ORC_LIBRARY) + find_package(orc) +endif () + +#if (USE_INTERNAL_ORC_LIBRARY) +#find_path(CYRUS_SASL_INCLUDE_DIR sasl/sasl.h) +#find_library(CYRUS_SASL_SHARED_LIB sasl2) +#if (NOT CYRUS_SASL_INCLUDE_DIR OR NOT CYRUS_SASL_SHARED_LIB) +# set(USE_ORC 0) +#endif() +#endif() + +if (ORC_LIBRARY AND ORC_INCLUDE_DIR) set(USE_ORC 1) -endif() \ No newline at end of file +elseif(NOT MISSING_INTERNAL_ORC_LIBRARY AND ARROW_LIBRARY) # (LIBGSASL_LIBRARY AND LIBXML2_LIBRARY) + set(ORC_INCLUDE_DIR "${ClickHouse_SOURCE_DIR}/contrib/orc/c++/include") + set(ORC_LIBRARY orc) + set(USE_ORC 1) +else() + set(USE_INTERNAL_ORC_LIBRARY 0) +endif() + +endif() + +message (STATUS "Using internal=${USE_INTERNAL_ORC_LIBRARY} orc=${USE_ORC}: ${ORC_INCLUDE_DIR} : ${ORC_LIBRARY}") diff --git a/cmake/find_parquet.cmake b/cmake/find_parquet.cmake index 63f589a9ea5..5c5bc664113 100644 --- a/cmake/find_parquet.cmake +++ b/cmake/find_parquet.cmake @@ -62,6 +62,7 @@ elseif(NOT MISSING_INTERNAL_PARQUET_LIBRARY AND NOT OS_FREEBSD) endif() set(USE_PARQUET 1) + set(USE_ORC 1) endif() endif() diff --git a/contrib/CMakeLists.txt b/contrib/CMakeLists.txt index 96462de0190..54fdc4d69e0 100644 --- a/contrib/CMakeLists.txt +++ b/contrib/CMakeLists.txt @@ -10,19 +10,6 @@ endif () set_property(DIRECTORY PROPERTY EXCLUDE_FROM_ALL 1) -if (USE_INTERNAL_ORC_LIBRARY) - set(BUILD_JAVA OFF) - set (ANALYZE_JAVA OFF) - set (BUILD_CPP_TESTS OFF) - set (BUILD_TOOLS OFF) - option(BUILD_JAVA OFF) - option (ANALYZE_JAVA OFF) - option (BUILD_CPP_TESTS OFF) - option (BUILD_TOOLS OFF) - set(CMAKE_MODULE_PATH ${CMAKE_MODULE_PATH} "${CMAKE_SOURCE_DIR}/contrib/orc/cmake_modules") - add_subdirectory(orc) -endif() - if (USE_INTERNAL_BOOST_LIBRARY) add_subdirectory (boost-cmake) endif () @@ -327,3 +314,7 @@ endif() if (USE_FASTOPS) add_subdirectory (fastops-cmake) 
endif() + +#if (USE_INTERNAL_ORC_LIBRARY) +# add_subdirectory(orc-cmake) +#endif () diff --git a/contrib/arrow-cmake/CMakeLists.txt b/contrib/arrow-cmake/CMakeLists.txt index ba1ddc2414a..cfd57f2b296 100644 --- a/contrib/arrow-cmake/CMakeLists.txt +++ b/contrib/arrow-cmake/CMakeLists.txt @@ -56,11 +56,11 @@ set(ORC_SOURCE_WRAP_DIR ${ORC_SOURCE_DIR}/wrap) set(ORC_BUILD_SRC_DIR ${CMAKE_CURRENT_BINARY_DIR}/../orc/c++/src) set(ORC_BUILD_INCLUDE_DIR ${CMAKE_CURRENT_BINARY_DIR}/../orc/c++/include) -set(GOOGLE_PROTOBUF_DIR ${ClickHouse_SOURCE_DIR}/contrib/protobuf/src/) +set(GOOGLE_PROTOBUF_DIR ${Protobuf_INCLUDE_DIR}/) set(ORC_ADDITION_SOURCE_DIR ${CMAKE_CURRENT_BINARY_DIR}) set(ARROW_SRC_DIR ${ClickHouse_SOURCE_DIR}/contrib/arrow/cpp/src) -set(PROTOBUF_EXECUTABLE ${CMAKE_CURRENT_BINARY_DIR}/../protobuf/cmake/protoc) +set(PROTOBUF_EXECUTABLE ${Protobuf_PROTOC_EXECUTABLE}) set(PROTO_DIR ${ORC_SOURCE_DIR}/../proto) @@ -70,14 +70,10 @@ add_custom_command(OUTPUT orc_proto.pb.h orc_proto.pb.cc --cpp_out="${CMAKE_CURRENT_BINARY_DIR}" "${PROTO_DIR}/orc_proto.proto") -include_directories(SYSTEM ${ORC_INCLUDE_DIR}) -include_directories(SYSTEM ${ORC_SOURCE_SRC_DIR}) -include_directories(SYSTEM ${ORC_SOURCE_WRAP_DIR}) -include_directories(SYSTEM ${GOOGLE_PROTOBUF_DIR}) -include_directories(SYSTEM ${ORC_BUILD_SRC_DIR}) -include_directories(SYSTEM ${ORC_BUILD_INCLUDE_DIR}) -include_directories(SYSTEM ${ORC_ADDITION_SOURCE_DIR}) -include_directories(SYSTEM ${ARROW_SRC_DIR}) +include(${ClickHouse_SOURCE_DIR}/contrib/orc/cmake_modules/CheckSourceCompiles.cmake) +include(orc_check.cmake) +configure_file("${ORC_INCLUDE_DIR}/orc/orc-config.hh.in" "${ORC_BUILD_INCLUDE_DIR}/orc/orc-config.hh") +configure_file("${ORC_SOURCE_SRC_DIR}/Adaptor.hh.in" "${ORC_BUILD_INCLUDE_DIR}/Adaptor.hh") set(ORC_SRCS @@ -232,6 +228,14 @@ if (ARROW_WITH_ZSTD) target_link_libraries(${ARROW_LIBRARY} PRIVATE ${ZSTD_LIBRARY}) endif() +target_include_directories(${ARROW_LIBRARY} PRIVATE SYSTEM ${ORC_INCLUDE_DIR}) 
+target_include_directories(${ARROW_LIBRARY} PRIVATE SYSTEM ${ORC_SOURCE_SRC_DIR}) +target_include_directories(${ARROW_LIBRARY} PRIVATE SYSTEM ${ORC_SOURCE_WRAP_DIR}) +target_include_directories(${ARROW_LIBRARY} PRIVATE SYSTEM ${GOOGLE_PROTOBUF_DIR}) +target_include_directories(${ARROW_LIBRARY} PRIVATE SYSTEM ${ORC_BUILD_SRC_DIR}) +target_include_directories(${ARROW_LIBRARY} PRIVATE SYSTEM ${ORC_BUILD_INCLUDE_DIR}) +target_include_directories(${ARROW_LIBRARY} PRIVATE SYSTEM ${ORC_ADDITION_SOURCE_DIR}) +target_include_directories(${ARROW_LIBRARY} PRIVATE SYSTEM ${ARROW_SRC_DIR}) # === parquet diff --git a/contrib/arrow-cmake/orc_check.cmake b/contrib/arrow-cmake/orc_check.cmake new file mode 100644 index 00000000000..ec1e53cc649 --- /dev/null +++ b/contrib/arrow-cmake/orc_check.cmake @@ -0,0 +1,126 @@ +# Not changed part of contrib/orc/c++/src/CMakeLists.txt + +INCLUDE(CheckCXXSourceCompiles) + +CHECK_CXX_SOURCE_COMPILES(" + #include <fcntl.h> + #include <unistd.h> + int main(int,char*[]){ + int f = open(\"/x/y\", O_RDONLY); + char buf[100]; + return pread(f, buf, 100, 1000) == 0; + }" + HAS_PREAD +) + +CHECK_CXX_SOURCE_COMPILES(" + #include <time.h> + int main(int,char*[]){ + struct tm time2020; + return !strptime(\"2020-02-02 12:34:56\", \"%Y-%m-%d %H:%M:%S\", &time2020); + }" + HAS_STRPTIME +) + +CHECK_CXX_SOURCE_COMPILES(" + #include <string> + int main(int,char* argv[]){ + return static_cast<int>(std::stoll(argv[0])); + }" + HAS_STOLL +) + +CHECK_CXX_SOURCE_COMPILES(" + #include <stdint.h> + #include <stdio.h> + int main(int,char*[]){ + int64_t x = 1; printf(\"%lld\",x); + }" + INT64_IS_LL +) + +CHECK_CXX_SOURCE_COMPILES(" + #ifdef __clang__ + #pragma clang diagnostic push + #pragma clang diagnostic ignored \"-Wdeprecated\" + #pragma clang diagnostic pop + #elif defined(__GNUC__) + #pragma GCC diagnostic push + #pragma GCC diagnostic ignored \"-Wdeprecated\" + #pragma GCC diagnostic pop + #elif defined(_MSC_VER) + #pragma warning( push ) + #pragma warning( disable : 4996 ) + #pragma warning( pop ) + #else + unknownCompiler!
+ #endif + int main(int, char *[]) {}" + HAS_DIAGNOSTIC_PUSH +) + +CHECK_CXX_SOURCE_COMPILES(" + #include <cmath> + int main(int, char *[]) { + return std::isnan(1.0f); + }" + HAS_STD_ISNAN +) + +CHECK_CXX_SOURCE_COMPILES(" + #include <mutex> + int main(int, char *[]) { + std::mutex test_mutex; + std::lock_guard<std::mutex> lock_mutex(test_mutex); + }" + HAS_STD_MUTEX +) + +CHECK_CXX_SOURCE_COMPILES(" + #include <string> + std::string func() { + std::string var = \"test\"; + return std::move(var); + } + int main(int, char *[]) {}" + NEEDS_REDUNDANT_MOVE +) + +INCLUDE(CheckCXXSourceRuns) + +CHECK_CXX_SOURCE_RUNS(" + #include <time.h> + int main(int, char *[]) { + time_t t = -14210715; // 1969-07-20 12:34:45 + struct tm *ptm = gmtime(&t); + return !(ptm && ptm->tm_year == 69); + }" + HAS_PRE_1970 +) + +CHECK_CXX_SOURCE_RUNS(" + #include <stdlib.h> + #include <time.h> + int main(int, char *[]) { + setenv(\"TZ\", \"America/Los_Angeles\", 1); + tzset(); + struct tm time2037; + struct tm time2038; + strptime(\"2037-05-05 12:34:56\", \"%Y-%m-%d %H:%M:%S\", &time2037); + strptime(\"2038-05-05 12:34:56\", \"%Y-%m-%d %H:%M:%S\", &time2038); + return mktime(&time2038) - mktime(&time2037) != 31536000; + }" + HAS_POST_2038 +) + +set(CMAKE_REQUIRED_INCLUDES ${ZLIB_INCLUDE_DIR}) +set(CMAKE_REQUIRED_LIBRARIES zlib) +CHECK_CXX_SOURCE_COMPILES(" + #define Z_PREFIX + #include <zlib.h> + z_stream strm; + int main(int, char *[]) { + deflateReset(&strm); + }" + NEEDS_Z_PREFIX +) diff --git a/contrib/libhdfs3-cmake/CMakeLists.txt b/contrib/libhdfs3-cmake/CMakeLists.txt index 8ec14f897b9..e1ba7225b0f 100644 --- a/contrib/libhdfs3-cmake/CMakeLists.txt +++ b/contrib/libhdfs3-cmake/CMakeLists.txt @@ -199,17 +199,17 @@ if (WITH_KERBEROS) endif() target_include_directories(hdfs3 PRIVATE ${LIBXML2_INCLUDE_DIR}) -target_link_libraries(hdfs3 ${LIBGSASL_LIBRARY}) +target_link_libraries(hdfs3 PRIVATE ${LIBGSASL_LIBRARY}) if (WITH_KERBEROS) - target_link_libraries(hdfs3 ${KERBEROS_LIBRARIES}) + target_link_libraries(hdfs3 PRIVATE ${KERBEROS_LIBRARIES}) endif()
-target_link_libraries(hdfs3 ${LIBXML2_LIBRARY}) +target_link_libraries(hdfs3 PRIVATE ${LIBXML2_LIBRARY}) # inherit from parent cmake target_include_directories(hdfs3 PRIVATE ${Boost_INCLUDE_DIRS}) target_include_directories(hdfs3 PRIVATE ${Protobuf_INCLUDE_DIR}) -target_link_libraries(hdfs3 ${Protobuf_LIBRARY}) +target_link_libraries(hdfs3 PRIVATE ${Protobuf_LIBRARY}) if(OPENSSL_INCLUDE_DIR AND OPENSSL_LIBRARIES) target_include_directories(hdfs3 PRIVATE ${OPENSSL_INCLUDE_DIR}) - target_link_libraries(hdfs3 ${OPENSSL_LIBRARIES}) + target_link_libraries(hdfs3 PRIVATE ${OPENSSL_LIBRARIES}) endif() diff --git a/contrib/orc-cmake/CMakeLists.txt b/contrib/orc-cmake/CMakeLists.txt new file mode 100644 index 00000000000..066ba00aede --- /dev/null +++ b/contrib/orc-cmake/CMakeLists.txt @@ -0,0 +1,229 @@ +# modifyed copy of contrib/orc/c++/src/CMakeLists.txt +set(LIBRARY_INCLUDE ${ClickHouse_SOURCE_DIR}/contrib/orc/c++/include) +set(LIBRARY_DIR ${ClickHouse_SOURCE_DIR}/contrib/orc/c++/src) + +set(PROTOBUF_INCLUDE_DIR ${Protobuf_INCLUDE_DIR}) +set(PROTOBUF_EXECUTABLE ${Protobuf_PROTOC_EXECUTABLE}) + +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${CXX11_FLAGS} ${WARN_FLAGS}") + +INCLUDE(CheckCXXSourceCompiles) + +CHECK_CXX_SOURCE_COMPILES(" + #include <fcntl.h> + #include <unistd.h> + int main(int,char*[]){ + int f = open(\"/x/y\", O_RDONLY); + char buf[100]; + return pread(f, buf, 100, 1000) == 0; + }" + HAS_PREAD +) + +CHECK_CXX_SOURCE_COMPILES(" + #include <time.h> + int main(int,char*[]){ + struct tm time2020; + return !strptime(\"2020-02-02 12:34:56\", \"%Y-%m-%d %H:%M:%S\", &time2020); + }" + HAS_STRPTIME +) + +CHECK_CXX_SOURCE_COMPILES(" + #include <string> + int main(int,char* argv[]){ + return static_cast<int>(std::stoll(argv[0])); + }" + HAS_STOLL +) + +CHECK_CXX_SOURCE_COMPILES(" + #include <stdint.h> + #include <stdio.h> + int main(int,char*[]){ + int64_t x = 1; printf(\"%lld\",x); + }" + INT64_IS_LL +) + +CHECK_CXX_SOURCE_COMPILES(" + #ifdef __clang__ + #pragma clang diagnostic push + #pragma clang diagnostic ignored \"-Wdeprecated\" + #pragma clang diagnostic pop + #elif defined(__GNUC__) + #pragma GCC diagnostic push + #pragma GCC diagnostic ignored \"-Wdeprecated\" + #pragma GCC diagnostic pop + #elif defined(_MSC_VER) + #pragma warning( push ) + #pragma warning( disable : 4996 ) + #pragma warning( pop ) + #else + unknownCompiler!
+ #endif + int main(int, char *[]) {}" + HAS_DIAGNOSTIC_PUSH +) + +CHECK_CXX_SOURCE_COMPILES(" + #include <cmath> + int main(int, char *[]) { + return std::isnan(1.0f); + }" + HAS_STD_ISNAN +) + +CHECK_CXX_SOURCE_COMPILES(" + #include <mutex> + int main(int, char *[]) { + std::mutex test_mutex; + std::lock_guard<std::mutex> lock_mutex(test_mutex); + }" + HAS_STD_MUTEX +) + +CHECK_CXX_SOURCE_COMPILES(" + #include <string> + std::string func() { + std::string var = \"test\"; + return std::move(var); + } + int main(int, char *[]) {}" + NEEDS_REDUNDANT_MOVE +) + +INCLUDE(CheckCXXSourceRuns) + +CHECK_CXX_SOURCE_RUNS(" + #include <time.h> + int main(int, char *[]) { + time_t t = -14210715; // 1969-07-20 12:34:45 + struct tm *ptm = gmtime(&t); + return !(ptm && ptm->tm_year == 69); + }" + HAS_PRE_1970 +) + +CHECK_CXX_SOURCE_RUNS(" + #include <stdlib.h> + #include <time.h> + int main(int, char *[]) { + setenv(\"TZ\", \"America/Los_Angeles\", 1); + tzset(); + struct tm time2037; + struct tm time2038; + strptime(\"2037-05-05 12:34:56\", \"%Y-%m-%d %H:%M:%S\", &time2037); + strptime(\"2038-05-05 12:34:56\", \"%Y-%m-%d %H:%M:%S\", &time2038); + return mktime(&time2038) - mktime(&time2037) != 31536000; + }" + HAS_POST_2038 +) + +set(CMAKE_REQUIRED_INCLUDES ${ZLIB_INCLUDE_DIR}) +set(CMAKE_REQUIRED_LIBRARIES zlib) +CHECK_CXX_SOURCE_COMPILES(" + #define Z_PREFIX + #include <zlib.h> + z_stream strm; + int main(int, char *[]) { + deflateReset(&strm); + }" + NEEDS_Z_PREFIX +) + +configure_file ( + "${LIBRARY_DIR}/Adaptor.hh.in" + "${CMAKE_CURRENT_BINARY_DIR}/Adaptor.hh" + ) + + +add_custom_command(OUTPUT ${CMAKE_CURRENT_BINARY_DIR}/orc_proto.pb.h ${CMAKE_CURRENT_BINARY_DIR}/orc_proto.pb.cc + COMMAND ${PROTOBUF_EXECUTABLE} + -I${ClickHouse_SOURCE_DIR}/contrib/orc/proto + --cpp_out="${CMAKE_CURRENT_BINARY_DIR}" + "${ClickHouse_SOURCE_DIR}/contrib/orc/proto/orc_proto.proto" +) + +set(SOURCE_FILES + "${CMAKE_CURRENT_BINARY_DIR}/Adaptor.hh" + ${CMAKE_CURRENT_BINARY_DIR}/orc_proto.pb.h + ${LIBRARY_DIR}/io/InputStream.cc + ${LIBRARY_DIR}/io/OutputStream.cc +
${LIBRARY_DIR}/wrap/orc-proto-wrapper.cc + ${LIBRARY_DIR}/Adaptor.cc + ${LIBRARY_DIR}/ByteRLE.cc + ${LIBRARY_DIR}/ColumnPrinter.cc + ${LIBRARY_DIR}/ColumnReader.cc + ${LIBRARY_DIR}/ColumnWriter.cc + ${LIBRARY_DIR}/Common.cc + ${LIBRARY_DIR}/Compression.cc + ${LIBRARY_DIR}/Exceptions.cc + ${LIBRARY_DIR}/Int128.cc + ${LIBRARY_DIR}/LzoDecompressor.cc + ${LIBRARY_DIR}/MemoryPool.cc + ${LIBRARY_DIR}/OrcFile.cc + ${LIBRARY_DIR}/Reader.cc + ${LIBRARY_DIR}/RLEv1.cc + ${LIBRARY_DIR}/RLEv2.cc + ${LIBRARY_DIR}/RLE.cc + ${LIBRARY_DIR}/Statistics.cc + ${LIBRARY_DIR}/StripeStream.cc + ${LIBRARY_DIR}/Timezone.cc + ${LIBRARY_DIR}/TypeImpl.cc + ${LIBRARY_DIR}/Vector.cc + ${LIBRARY_DIR}/Writer.cc + ) + +if(ORC_CXX_HAS_THREAD_LOCAL AND BUILD_LIBHDFSPP) + set(SOURCE_FILES ${SOURCE_FILES} ${LIBRARY_DIR}/OrcHdfsFile.cc) +endif(ORC_CXX_HAS_THREAD_LOCAL AND BUILD_LIBHDFSPP) + +#list(TRANSFORM SOURCE_FILES PREPEND ${LIBRARY_DIR}/) + +configure_file ( + "${LIBRARY_INCLUDE}/orc/orc-config.hh.in" + "${CMAKE_CURRENT_BINARY_DIR}/orc/orc-config.hh" + ) + +add_library (orc ${SOURCE_FILES}) + +target_include_directories (orc + PRIVATE + ${LIBRARY_INCLUDE} + ${LIBRARY_DIR} + #PUBLIC + ${CMAKE_CURRENT_BINARY_DIR} + PRIVATE + ${PROTOBUF_INCLUDE_DIR} + ${ZLIB_INCLUDE_DIR} + ${SNAPPY_INCLUDE_DIR} + ${LZ4_INCLUDE_DIR} + ${LIBHDFSPP_INCLUDE_DIR} + ) + +target_link_libraries (orc PRIVATE + ${Protobuf_LIBRARY} + ${ZLIB_LIBRARIES} + ${SNAPPY_LIBRARY} + ${LZ4_LIBRARY} + ${LIBHDFSPP_LIBRARIES} + ) + +#install(TARGETS orc DESTINATION lib) + +if(ORC_CXX_HAS_THREAD_LOCAL AND BUILD_LIBHDFSPP) + add_definitions(-DBUILD_LIBHDFSPP) +endif(ORC_CXX_HAS_THREAD_LOCAL AND BUILD_LIBHDFSPP) diff --git a/dbms/src/Processors/Formats/Impl/ArrowColumnToCHColumn.cpp b/dbms/src/Processors/Formats/Impl/ArrowColumnToCHColumn.cpp index 0cd5ffb03e0..edb8d5c15f4 100644 --- a/dbms/src/Processors/Formats/Impl/ArrowColumnToCHColumn.cpp +++ b/dbms/src/Processors/Formats/Impl/ArrowColumnToCHColumn.cpp @@ -1,7 +1,7 @@ #include 
"config_formats.h" #include "ArrowColumnToCHColumn.h" -#if USE_ORC or USE_PARQUET +#if USE_ORC || USE_PARQUET #include #include #include diff --git a/dbms/src/Processors/Formats/Impl/ArrowColumnToCHColumn.h b/dbms/src/Processors/Formats/Impl/ArrowColumnToCHColumn.h index b5f4732d107..34b58a80091 100644 --- a/dbms/src/Processors/Formats/Impl/ArrowColumnToCHColumn.h +++ b/dbms/src/Processors/Formats/Impl/ArrowColumnToCHColumn.h @@ -1,6 +1,6 @@ #include "config_formats.h" -#if USE_ORC or USE_PARQUET +#if USE_ORC || USE_PARQUET #include #include From 87ec80089aea6d7e7ec632967f81a818c6ce2b07 Mon Sep 17 00:00:00 2001 From: alexey-milovidov Date: Tue, 10 Sep 2019 02:40:41 +0300 Subject: [PATCH 27/30] Update system.md --- docs/en/query_language/system.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/en/query_language/system.md b/docs/en/query_language/system.md index 493a2ca14ff..648aa07f5e7 100644 --- a/docs/en/query_language/system.md +++ b/docs/en/query_language/system.md @@ -30,7 +30,7 @@ SELECT name, status FROM system.dictionaries; ## DROP DNS CACHE {#query_language-system-drop-dns-cache} -Resets ClickHouse's internal DNS cache. Sometimes it is necessary to use this command when changing the infrastructure (changing the IP address of another ClickHouse server or the server used by dictionaries). +Resets ClickHouse's internal DNS cache. Sometimes (for old ClickHouse versions) it is necessary to use this command when changing the infrastructure (changing the IP address of another ClickHouse server or the server used by dictionaries). For more convenient (automatic) cache management, see disable_internal_dns_cache, dns_cache_update_period parameters. 
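The dictionary- and DNS-cache-related `SYSTEM` statements documented by the patches above can be exercised together as follows — a hedged sketch, assuming a running ClickHouse server with at least one external dictionary configured:

```sql
-- Inspect dictionary state (status values such as LOADED).
SELECT name, status FROM system.dictionaries;

-- Reload all dictionaries that have been successfully loaded before.
SYSTEM RELOAD DICTIONARIES;

-- Reset the internal DNS cache, e.g. after a remote ClickHouse server or a
-- dictionary source changes its IP address (mostly needed on old versions).
SYSTEM DROP DNS CACHE;
```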
From 76459a4f5770cd422ca120ad2ae0f5e227ca1512 Mon Sep 17 00:00:00 2001 From: alexey-milovidov Date: Tue, 10 Sep 2019 02:41:03 +0300 Subject: [PATCH 28/30] Update system.md --- docs/ru/query_language/system.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/ru/query_language/system.md b/docs/ru/query_language/system.md index 090376a5d12..2abdc5d34de 100644 --- a/docs/ru/query_language/system.md +++ b/docs/ru/query_language/system.md @@ -30,7 +30,7 @@ SELECT name, status FROM system.dictionaries; ## DROP DNS CACHE {#query_language-system-drop-dns-cache} -Сбрасывает внутренний DNS кеш ClickHouse. Иногда необходимо использовать эту команду при изменении инфраструктуры (смене IP адреса у другого ClickHouse сервера или сервера, используемого словарями). +Сбрасывает внутренний DNS кеш ClickHouse. Иногда (для старых версий ClickHouse) необходимо использовать эту команду при изменении инфраструктуры (смене IP адреса у другого ClickHouse сервера или сервера, используемого словарями). Для более удобного (автоматического) управления кешем см. параметры disable_internal_dns_cache, dns_cache_update_period. From 1bd75b1e74430be8e4f8e6a4aef2d00a305db3cf Mon Sep 17 00:00:00 2001 From: BayoNet Date: Tue, 10 Sep 2019 11:06:22 +0300 Subject: [PATCH 29/30] DOCAPI-7745: optimize_throw_if_noop docs. (#6848) * Typo fix. * DOCAPI-7745: The first version. * DOCAPI-7745: More text * DOCAPI-7745: More text. * Update docs/en/operations/settings/settings.md Co-Authored-By: Ivan Blinkov * Update docs/en/query_language/misc.md Co-Authored-By: Ivan Blinkov * DOCAPI-7745: Fixes. 
--- docs/en/interfaces/formats.md | 6 +++--- docs/en/operations/settings/settings.md | 12 ++++++++++++ docs/en/query_language/misc.md | 11 +++++++---- docs/toc_en.yml | 2 ++ 4 files changed, 24 insertions(+), 7 deletions(-) diff --git a/docs/en/interfaces/formats.md b/docs/en/interfaces/formats.md index 4dc123d6647..67fe9762ffb 100644 --- a/docs/en/interfaces/formats.md +++ b/docs/en/interfaces/formats.md @@ -926,17 +926,17 @@ Data types of a ClickHouse table columns can differ from the corresponding field You can insert Parquet data from a file into ClickHouse table by the following command: -``` +```bash cat {filename} | clickhouse-client --query="INSERT INTO {some_table} FORMAT Parquet" ``` You can select data from a ClickHouse table and save them into some file in the Parquet format by the following command: -``` +```sql clickhouse-client --query="SELECT * FROM {some_table} FORMAT Parquet" > {some_file.pq} ``` -To exchange data with the Hadoop, you can use [`HDFS` table engine](../../operations/table_engines/hdfs.md). +To exchange data with the Hadoop, you can use [HDFS table engine](../operations/table_engines/hdfs.md). ## Format Schema {#formatschema} diff --git a/docs/en/operations/settings/settings.md b/docs/en/operations/settings/settings.md index 3d67fe0e61d..cbb1e02c44b 100644 --- a/docs/en/operations/settings/settings.md +++ b/docs/en/operations/settings/settings.md @@ -857,6 +857,18 @@ Possible values: Default value: 0. +## optimize_throw_if_noop {#setting-optimize_throw_if_noop} + +Enables or disables throwing an exception if the [OPTIMIZE](../../query_language/misc.md#misc_operations-optimize) query have not performed a merge. + +By default `OPTIMIZE` returns successfully even if it haven't done anything. This setting allows to distinguish this situation and get the reason in exception message. + +Possible values: + +- 1 — Throwing an exception is enabled. + 0 — Throwing an exception is disabled. + +Default value: 0.
## distributed_replica_error_half_life {#settings-distributed_replica_error_half_life}

 - Type: seconds

diff --git a/docs/en/query_language/misc.md b/docs/en/query_language/misc.md
index 08e8f819b8c..337049d6624 100644
--- a/docs/en/query_language/misc.md
+++ b/docs/en/query_language/misc.md
@@ -177,10 +177,13 @@ Changes already made by the mutation are not rolled back.

 ```sql
 OPTIMIZE TABLE [db.]name [ON CLUSTER cluster] [PARTITION partition] [FINAL]
 ```

-Asks the table engine to do something for optimization.
-Supported only by `*MergeTree` engines, in which this query initializes a non-scheduled merge of data parts.
-If you specify a `PARTITION`, only the specified partition will be optimized.
-If you specify `FINAL`, optimization will be performed even when all the data is already in one part.
+This query tries to initialize an unscheduled merge of data parts for tables with a table engine from the [MergeTree](../operations/table_engines/mergetree.md) family. Other kinds of table engines are not supported.
+
+When `OPTIMIZE` is used with the [ReplicatedMergeTree](../operations/table_engines/replication.md) family of table engines, ClickHouse creates a task for merging and waits for execution on all nodes (if the `replication_alter_partitions_sync` setting is enabled).
+
+- If `OPTIMIZE` doesn't perform merging for any reason, it doesn't notify the client about it. To enable notifications, use the [optimize_throw_if_noop](../operations/settings/settings.md#setting-optimize_throw_if_noop) setting.
+- If you specify a `PARTITION`, only the specified partition is optimized.
+- If you specify `FINAL`, optimization is performed even when all the data is already in one part.

 !!! warning
     OPTIMIZE can't fix the "Too many parts" error.
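The rewritten `OPTIMIZE` description above could be accompanied by examples in the same spirit as the TTL examples added earlier in this series; the table and partition names below are illustrative:

```sql
-- Initiate an unscheduled merge of data parts for the whole table.
OPTIMIZE TABLE example_table;

-- Merge parts only within the specified partition.
OPTIMIZE TABLE example_table PARTITION 201909;

-- Merge even when all the data is already in one part.
OPTIMIZE TABLE example_table FINAL;
```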
diff --git a/docs/toc_en.yml b/docs/toc_en.yml
index b0ea90c44a1..dccd51f3cb1 100644
--- a/docs/toc_en.yml
+++ b/docs/toc_en.yml
@@ -87,6 +87,7 @@ nav:
         - 'MySQL': 'operations/table_engines/mysql.md'
         - 'JDBC': 'operations/table_engines/jdbc.md'
         - 'ODBC': 'operations/table_engines/odbc.md'
+        - 'HDFS': 'operations/table_engines/hdfs.md'
     - 'Special':
         - 'Distributed': 'operations/table_engines/distributed.md'
         - 'External data': 'operations/table_engines/external_data.md'
@@ -158,6 +159,7 @@ nav:
         - 'mysql': 'query_language/table_functions/mysql.md'
         - 'jdbc': 'query_language/table_functions/jdbc.md'
         - 'odbc': 'query_language/table_functions/odbc.md'
+        - 'hdfs': 'query_language/table_functions/hdfs.md'
         - 'input': 'query_language/table_functions/input.md'
     - 'Dictionaries':
         - 'Introduction': 'query_language/dicts/index.md'

From 36c0179f54a43ae4aeb02d46e746c24b788d1e17 Mon Sep 17 00:00:00 2001
From: BayoNet
Date: Tue, 10 Sep 2019 14:07:05 +0300
Subject: [PATCH 30/30] Fix of links in docs (#6884)

* Typo fix.

* Links fix.
---
 docs/en/operations/system_tables.md | 6 +++---
 docs/en/query_language/system.md    | 2 +-
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/docs/en/operations/system_tables.md b/docs/en/operations/system_tables.md
index 4e3386764fd..0b6481de3c1 100644
--- a/docs/en/operations/system_tables.md
+++ b/docs/en/operations/system_tables.md
@@ -64,9 +64,9 @@ Please note that `errors_count` is updated once per query to the cluster, but `e

 ** See also **

-- [Table engine Distributed](../../operations/table_engines/distributed.md)
-- [distributed_replica_error_cap setting](../settings/settings.md#settings-distributed_replica_error_cap)
-- [distributed_replica_error_half_life setting](../settings/settings.md#settings-distributed_replica_error_half_life)
+- [Table engine Distributed](table_engines/distributed.md)
+- [distributed_replica_error_cap setting](settings/settings.md#settings-distributed_replica_error_cap)
+- [distributed_replica_error_half_life setting](settings/settings.md#settings-distributed_replica_error_half_life)

 ## system.columns

diff --git a/docs/en/query_language/system.md b/docs/en/query_language/system.md
index 648aa07f5e7..3ef504e46b3 100644
--- a/docs/en/query_language/system.md
+++ b/docs/en/query_language/system.md
@@ -15,7 +15,7 @@
 ## RELOAD DICTIONARIES {#query_language-system-reload-dictionaries}

 Reloads all dictionaries that have been successfully loaded before.
-By default, dictionaries are loaded lazily (see [dictionaries_lazy_load](../operations/server_settings/settings.md#dictionaries-lazy-load)), so instead of being loaded automatically at startup, they are initialized on first access through dictGet function or SELECT from tables with ENGINE = Dictionary. The `SYSTEM RELOAD DICTIONARIES` query reloads such dictionaries (LOADED).
+By default, dictionaries are loaded lazily (see [dictionaries_lazy_load](../operations/server_settings/settings.md#server_settings-dictionaries_lazy_load)), so instead of being loaded automatically at startup, they are initialized on first access through the dictGet function or a SELECT from tables with ENGINE = Dictionary. The `SYSTEM RELOAD DICTIONARIES` query reloads such dictionaries (LOADED).

 Always returns `Ok.` regardless of the result of the dictionary update.

 ## RELOAD DICTIONARY dictionary_name {#query_language-system-reload-dictionary}
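The two reload statements described above could be illustrated with a short example; the dictionary name `my_dictionary` is hypothetical:

```sql
-- Reload all dictionaries that have been loaded at least once.
SYSTEM RELOAD DICTIONARIES;

-- Reload a single dictionary by its name, regardless of its state.
SYSTEM RELOAD DICTIONARY my_dictionary;
```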