diff --git a/.github/ISSUE_TEMPLATE/documentation-issue.md b/.github/ISSUE_TEMPLATE/documentation-issue.md index a8f31eadc56..557e5ea43c9 100644 --- a/.github/ISSUE_TEMPLATE/documentation-issue.md +++ b/.github/ISSUE_TEMPLATE/documentation-issue.md @@ -2,8 +2,7 @@ name: Documentation issue about: Report something incorrect or missing in documentation title: '' -labels: documentation -assignees: BayoNet +labels: comp-documentation --- diff --git a/.gitmodules b/.gitmodules index 081724c54c8..bd06d9d9acc 100644 --- a/.gitmodules +++ b/.gitmodules @@ -44,6 +44,7 @@ [submodule "contrib/protobuf"] path = contrib/protobuf url = https://github.com/ClickHouse-Extras/protobuf.git + branch = v3.13.0.1 [submodule "contrib/boost"] path = contrib/boost url = https://github.com/ClickHouse-Extras/boost.git @@ -107,6 +108,7 @@ [submodule "contrib/grpc"] path = contrib/grpc url = https://github.com/ClickHouse-Extras/grpc.git + branch = v1.33.2 [submodule "contrib/aws"] path = contrib/aws url = https://github.com/ClickHouse-Extras/aws-sdk-cpp.git @@ -155,7 +157,7 @@ url = https://github.com/ClickHouse-Extras/libcpuid.git [submodule "contrib/openldap"] path = contrib/openldap - url = https://github.com/openldap/openldap.git + url = https://github.com/ClickHouse-Extras/openldap.git [submodule "contrib/AMQP-CPP"] path = contrib/AMQP-CPP url = https://github.com/ClickHouse-Extras/AMQP-CPP.git @@ -198,5 +200,9 @@ url = https://github.com/facebook/rocksdb branch = v6.14.5 [submodule "contrib/xz"] - path = contrib/xz - url = https://github.com/xz-mirror/xz + path = contrib/xz + url = https://github.com/xz-mirror/xz +[submodule "contrib/abseil-cpp"] + path = contrib/abseil-cpp + url = https://github.com/ClickHouse-Extras/abseil-cpp.git + branch = lts_2020_02_25 diff --git a/CHANGELOG.md b/CHANGELOG.md index 355c664664d..c722e4a1ca0 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -16,6 +16,7 @@ * Remove `ANALYZE` and `AST` queries, and make the setting `enable_debug_queries` obsolete since now it is the part of full featured `EXPLAIN` query. [#16536](https://github.com/ClickHouse/ClickHouse/pull/16536) ([Ivan](https://github.com/abyss7)). * Aggregate functions `boundingRatio`, `rankCorr`, `retention`, `timeSeriesGroupSum`, `timeSeriesGroupRateSum`, `windowFunnel` were erroneously made case-insensitive. Now their names are made case sensitive as designed. Only functions that are specified in SQL standard or made for compatibility with other DBMS or functions similar to those should be case-insensitive. [#16407](https://github.com/ClickHouse/ClickHouse/pull/16407) ([alexey-milovidov](https://github.com/alexey-milovidov)). * Make `rankCorr` function return nan on insufficient data https://github.com/ClickHouse/ClickHouse/issues/16124. [#16135](https://github.com/ClickHouse/ClickHouse/pull/16135) ([hexiaoting](https://github.com/hexiaoting)). +* When upgrading from versions older than 20.5, if rolling update is performed and cluster contains both versions 20.5 or greater and less than 20.5, if ClickHouse nodes with old versions are restarted and old version has been started up in presence of newer versions, it may lead to `Part ... intersects previous part` errors. To prevent this error, first install newer clickhouse-server packages on all cluster nodes and then do restarts (so, when clickhouse-server is restarted, it will start up with the new version). 
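As a practical illustration of the rolling-upgrade note above, here is a hedged sketch of the recommended order of operations, assuming Debian/Ubuntu hosts running the official `clickhouse-server` packages with the stock service scripts; the hostnames and exact package list are placeholders, not part of the changelog itself.

```bash
# Step 1: install the new packages on EVERY node first, without restarting the servers.
sudo apt-get update
sudo apt-get install --only-upgrade clickhouse-common-static clickhouse-server clickhouse-client

# Step 2: only after all nodes carry the new binaries, restart them one at a time,
# so any restarted node always comes back up with the new version.
for host in ch-node-1 ch-node-2 ch-node-3; do   # placeholder hostnames
    ssh "$host" 'sudo service clickhouse-server restart'
    ssh "$host" 'clickhouse-client --query "SELECT version()"'   # confirm before moving on
done
```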
#### New Feature @@ -154,6 +155,7 @@ * Change default value of `format_regexp_escaping_rule` setting (it's related to `Regexp` format) to `Raw` (it means - read whole subpattern as a value) to make the behaviour more like to what users expect. [#15426](https://github.com/ClickHouse/ClickHouse/pull/15426) ([alexey-milovidov](https://github.com/alexey-milovidov)). * Add support for nested multiline comments `/* comment /* comment */ */` in SQL. This conforms to the SQL standard. [#14655](https://github.com/ClickHouse/ClickHouse/pull/14655) ([alexey-milovidov](https://github.com/alexey-milovidov)). * Added MergeTree settings (`max_replicated_merges_with_ttl_in_queue` and `max_number_of_merges_with_ttl_in_pool`) to control the number of merges with TTL in the background pool and replicated queue. This change breaks compatibility with older versions only if you use delete TTL. Otherwise, replication will stay compatible. You can avoid incompatibility issues if you update all shard replicas at once or execute `SYSTEM STOP TTL MERGES` until you finish the update of all replicas. If you'll get an incompatible entry in the replication queue, first of all, execute `SYSTEM STOP TTL MERGES` and after `ALTER TABLE ... DETACH PARTITION ...` the partition where incompatible TTL merge was assigned. Attach it back on a single replica. [#14490](https://github.com/ClickHouse/ClickHouse/pull/14490) ([alesapin](https://github.com/alesapin)). +* When upgrading from versions older than 20.5, if rolling update is performed and cluster contains both versions 20.5 or greater and less than 20.5, if ClickHouse nodes with old versions are restarted and old version has been started up in presence of newer versions, it may lead to `Part ... intersects previous part` errors. To prevent this error, first install newer clickhouse-server packages on all cluster nodes and then do restarts (so, when clickhouse-server is restarted, it will start up with the new version). #### New Feature @@ -438,6 +440,10 @@ ### ClickHouse release v20.9.2.20, 2020-09-22 +#### Backward Incompatible Change + +* When upgrading from versions older than 20.5, if rolling update is performed and cluster contains both versions 20.5 or greater and less than 20.5, if ClickHouse nodes with old versions are restarted and old version has been started up in presence of newer versions, it may lead to `Part ... intersects previous part` errors. To prevent this error, first install newer clickhouse-server packages on all cluster nodes and then do restarts (so, when clickhouse-server is restarted, it will start up with the new version). + #### New Feature * Added column transformers `EXCEPT`, `REPLACE`, `APPLY`, which can be applied to the list of selected columns (after `*` or `COLUMNS(...)`). For example, you can write `SELECT * EXCEPT(URL) REPLACE(number + 1 AS number)`. Another example: `select * apply(length) apply(max) from wide_string_table` to find out the maxium length of all string columns. [#14233](https://github.com/ClickHouse/ClickHouse/pull/14233) ([Amos Bird](https://github.com/amosbird)). @@ -621,6 +627,7 @@ * Now `OPTIMIZE FINAL` query doesn't recalculate TTL for parts that were added before TTL was created. Use `ALTER TABLE ... MATERIALIZE TTL` once to calculate them, after that `OPTIMIZE FINAL` will evaluate TTL's properly. This behavior never worked for replicated tables. [#14220](https://github.com/ClickHouse/ClickHouse/pull/14220) ([alesapin](https://github.com/alesapin)). 
* Extend `parallel_distributed_insert_select` setting, adding an option to run `INSERT` into local table. The setting changes type from `Bool` to `UInt64`, so the values `false` and `true` are no longer supported. If you have these values in server configuration, the server will not start. Please replace them with `0` and `1`, respectively. [#14060](https://github.com/ClickHouse/ClickHouse/pull/14060) ([Azat Khuzhin](https://github.com/azat)). * Remove support for the `ODBCDriver` input/output format. This was a deprecated format once used for communication with the ClickHouse ODBC driver, now long superseded by the `ODBCDriver2` format. Resolves [#13629](https://github.com/ClickHouse/ClickHouse/issues/13629). [#13847](https://github.com/ClickHouse/ClickHouse/pull/13847) ([hexiaoting](https://github.com/hexiaoting)). +* When upgrading from versions older than 20.5, if rolling update is performed and cluster contains both versions 20.5 or greater and less than 20.5, if ClickHouse nodes with old versions are restarted and old version has been started up in presence of newer versions, it may lead to `Part ... intersects previous part` errors. To prevent this error, first install newer clickhouse-server packages on all cluster nodes and then do restarts (so, when clickhouse-server is restarted, it will start up with the new version). #### New Feature @@ -765,6 +772,7 @@ * The function `groupArrayMoving*` was not working for distributed queries. It's result was calculated within incorrect data type (without promotion to the largest type). The function `groupArrayMovingAvg` was returning integer number that was inconsistent with the `avg` function. This fixes [#12568](https://github.com/ClickHouse/ClickHouse/issues/12568). [#12622](https://github.com/ClickHouse/ClickHouse/pull/12622) ([alexey-milovidov](https://github.com/alexey-milovidov)). * Add sanity check for MergeTree settings. If the settings are incorrect, the server will refuse to start or to create a table, printing detailed explanation to the user. [#13153](https://github.com/ClickHouse/ClickHouse/pull/13153) ([alexey-milovidov](https://github.com/alexey-milovidov)). * Protect from the cases when user may set `background_pool_size` to value lower than `number_of_free_entries_in_pool_to_execute_mutation` or `number_of_free_entries_in_pool_to_lower_max_size_of_merge`. In these cases ALTERs won't work or the maximum size of merge will be too limited. It will throw exception explaining what to do. This closes [#10897](https://github.com/ClickHouse/ClickHouse/issues/10897). [#12728](https://github.com/ClickHouse/ClickHouse/pull/12728) ([alexey-milovidov](https://github.com/alexey-milovidov)). +* When upgrading from versions older than 20.5, if rolling update is performed and cluster contains both versions 20.5 or greater and less than 20.5, if ClickHouse nodes with old versions are restarted and old version has been started up in presence of newer versions, it may lead to `Part ... intersects previous part` errors. To prevent this error, first install newer clickhouse-server packages on all cluster nodes and then do restarts (so, when clickhouse-server is restarted, it will start up with the new version). 
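To make the `parallel_distributed_insert_select` change noted above concrete: the setting is now a `UInt64`, so boolean values have to be rewritten as numbers both in server/profile configuration and in queries. A minimal sketch, assuming the setting is passed per query through `clickhouse-client`; the Distributed table names are placeholders.

```bash
# Boolean values are rejected after the upgrade:
#   parallel_distributed_insert_select = false  ->  0
#   parallel_distributed_insert_select = true   ->  1
# The same replacement applies to <profiles> entries in users.xml.
clickhouse-client --parallel_distributed_insert_select=1 --query "
    INSERT INTO dist_target SELECT * FROM dist_source"   # placeholder Distributed tables
```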
#### New Feature @@ -951,6 +959,10 @@ ### ClickHouse release v20.6.3.28-stable +#### Backward Incompatible Change + +* When upgrading from versions older than 20.5, if rolling update is performed and cluster contains both versions 20.5 or greater and less than 20.5, if ClickHouse nodes with old versions are restarted and old version has been started up in presence of newer versions, it may lead to `Part ... intersects previous part` errors. To prevent this error, first install newer clickhouse-server packages on all cluster nodes and then do restarts (so, when clickhouse-server is restarted, it will start up with the new version). + #### New Feature * Added an initial implementation of `EXPLAIN` query. Syntax: `EXPLAIN SELECT ...`. This fixes [#1118](https://github.com/ClickHouse/ClickHouse/issues/1118). [#11873](https://github.com/ClickHouse/ClickHouse/pull/11873) ([Nikolai Kochetov](https://github.com/KochetovNicolai)). @@ -1139,6 +1151,7 @@ * Update `zstd` to 1.4.4. It has some minor improvements in performance and compression ratio. If you run replicas with different versions of ClickHouse you may see reasonable error messages `Data after merge is not byte-identical to data on another replicas.` with explanation. These messages are Ok and you should not worry. This change is backward compatible but we list it here in changelog in case you will wonder about these messages. [#10663](https://github.com/ClickHouse/ClickHouse/pull/10663) ([alexey-milovidov](https://github.com/alexey-milovidov)). * Added a check for meaningless codecs and a setting `allow_suspicious_codecs` to control this check. This closes [#4966](https://github.com/ClickHouse/ClickHouse/issues/4966). [#10645](https://github.com/ClickHouse/ClickHouse/pull/10645) ([alexey-milovidov](https://github.com/alexey-milovidov)). * Several Kafka setting changes their defaults. See [#11388](https://github.com/ClickHouse/ClickHouse/pull/11388). +* When upgrading from versions older than 20.5, if rolling update is performed and cluster contains both versions 20.5 or greater and less than 20.5, if ClickHouse nodes with old versions are restarted and old version has been started up in presence of newer versions, it may lead to `Part ... intersects previous part` errors. To prevent this error, first install newer clickhouse-server packages on all cluster nodes and then do restarts (so, when clickhouse-server is restarted, it will start up with the new version). 
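As a quick illustration of the `EXPLAIN` query introduced in the v20.6 notes above, run here through `clickhouse-client`; the exact plan output depends on the server version.

```bash
# EXPLAIN prefixes an ordinary SELECT and prints the query plan instead of executing it.
clickhouse-client --query "EXPLAIN SELECT count() FROM system.numbers WHERE number % 2 = 0"
```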
#### New Feature diff --git a/CMakeLists.txt b/CMakeLists.txt index cababc083fa..1daff973bb1 100644 --- a/CMakeLists.txt +++ b/CMakeLists.txt @@ -154,17 +154,19 @@ endif () # Make sure the final executable has symbols exported set (CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} -rdynamic") -find_program (OBJCOPY_PATH NAMES "llvm-objcopy" "llvm-objcopy-11" "llvm-objcopy-10" "llvm-objcopy-9" "llvm-objcopy-8" "objcopy") -if (OBJCOPY_PATH) - message(STATUS "Using objcopy: ${OBJCOPY_PATH}.") +if (OS_LINUX) + find_program (OBJCOPY_PATH NAMES "llvm-objcopy" "llvm-objcopy-11" "llvm-objcopy-10" "llvm-objcopy-9" "llvm-objcopy-8" "objcopy") + if (OBJCOPY_PATH) + message(STATUS "Using objcopy: ${OBJCOPY_PATH}.") - if (ARCH_AMD64) - set(OBJCOPY_ARCH_OPTIONS -O elf64-x86-64 -B i386) - elseif (ARCH_AARCH64) - set(OBJCOPY_ARCH_OPTIONS -O elf64-aarch64 -B aarch64) + if (ARCH_AMD64) + set(OBJCOPY_ARCH_OPTIONS -O elf64-x86-64 -B i386) + elseif (ARCH_AARCH64) + set(OBJCOPY_ARCH_OPTIONS -O elf64-aarch64 -B aarch64) + endif () + else () + message(FATAL_ERROR "Cannot find objcopy.") endif () -else () - message(FATAL_ERROR "Cannot find objcopy.") endif () if (OS_DARWIN) @@ -475,9 +477,6 @@ find_contrib_lib(cityhash) find_contrib_lib(farmhash) -set (USE_INTERNAL_BTRIE_LIBRARY ON CACHE INTERNAL "") -find_contrib_lib(btrie) - if (ENABLE_TESTS) include (cmake/find/gtest.cmake) endif () diff --git a/README.md b/README.md index 03b5c988586..763b12439c5 100644 --- a/README.md +++ b/README.md @@ -1,6 +1,6 @@ [![ClickHouse — open source distributed column-oriented DBMS](https://github.com/ClickHouse/ClickHouse/raw/master/website/images/logo-400x240.png)](https://clickhouse.tech) -ClickHouse is an open-source column-oriented database management system that allows generating analytical data reports in real time. +ClickHouse® is an open-source column-oriented database management system that allows generating analytical data reports in real time. ## Useful Links @@ -16,7 +16,5 @@ ClickHouse is an open-source column-oriented database management system that all * You can also [fill this form](https://clickhouse.tech/#meet) to meet Yandex ClickHouse team in person. ## Upcoming Events - -* [The Second ClickHouse Meetup East (online)](https://www.eventbrite.com/e/the-second-clickhouse-meetup-east-tickets-126787955187) on October 31, 2020. -* [ClickHouse for Enterprise Meetup (online in Russian)](https://arenadata-events.timepad.ru/event/1465249/) on November 10, 2020. - +* [SF Bay Area ClickHouse Meetup (online)](https://www.meetup.com/San-Francisco-Bay-Area-ClickHouse-Meetup/events/274498897/) on 2 December 2020. +* [SF Bay Area ClickHouse Virtual Office Hours (online)](https://www.meetup.com/San-Francisco-Bay-Area-ClickHouse-Meetup/events/274273549/) on 20 January 2020. diff --git a/base/common/LineReader.cpp b/base/common/LineReader.cpp index b2bc929a1df..a32906dd5a5 100644 --- a/base/common/LineReader.cpp +++ b/base/common/LineReader.cpp @@ -127,7 +127,7 @@ String LineReader::readLine(const String & first_prompt, const String & second_p } #endif - line += (line.empty() ? "" : " ") + input; + line += (line.empty() ? 
"" : "\n") + input; if (!need_next_line) break; diff --git a/base/common/StringRef.h b/base/common/StringRef.h index b51b95456cb..ac9d7c47b72 100644 --- a/base/common/StringRef.h +++ b/base/common/StringRef.h @@ -1,6 +1,7 @@ #pragma once #include +#include // for std::logic_error #include #include #include diff --git a/base/common/getMemoryAmount.cpp b/base/common/getMemoryAmount.cpp index 5e600a37351..e7d284354f9 100644 --- a/base/common/getMemoryAmount.cpp +++ b/base/common/getMemoryAmount.cpp @@ -1,100 +1,28 @@ #include #include "common/getMemoryAmount.h" -// http://nadeausoftware.com/articles/2012/09/c_c_tip_how_get_physical_memory_size_system - -/* - * Author: David Robert Nadeau - * Site: http://NadeauSoftware.com/ - * License: Creative Commons Attribution 3.0 Unported License - * http://creativecommons.org/licenses/by/3.0/deed.en_US - */ - -#if defined(WIN32) || defined(_WIN32) -#include -#else #include #include #include #if defined(BSD) #include #endif -#endif -/** - * Returns the size of physical memory (RAM) in bytes. - * Returns 0 on unsupported platform - */ +/** Returns the size of physical memory (RAM) in bytes. + * Returns 0 on unsupported platform + */ uint64_t getMemoryAmountOrZero() { -#if defined(_WIN32) && (defined(__CYGWIN__) || defined(__CYGWIN32__)) - /* Cygwin under Windows. ------------------------------------ */ - /* New 64-bit MEMORYSTATUSEX isn't available. Use old 32.bit */ - MEMORYSTATUS status; - status.dwLength = sizeof(status); - GlobalMemoryStatus(&status); - return status.dwTotalPhys; + int64_t num_pages = sysconf(_SC_PHYS_PAGES); + if (num_pages <= 0) + return 0; -#elif defined(WIN32) || defined(_WIN32) - /* Windows. ------------------------------------------------- */ - /* Use new 64-bit MEMORYSTATUSEX, not old 32-bit MEMORYSTATUS */ - MEMORYSTATUSEX status; - status.dwLength = sizeof(status); - GlobalMemoryStatusEx(&status); - return status.ullTotalPhys; + int64_t page_size = sysconf(_SC_PAGESIZE); + if (page_size <= 0) + return 0; -#else - /* UNIX variants. ------------------------------------------- */ - /* Prefer sysctl() over sysconf() except sysctl() HW_REALMEM and HW_PHYSMEM */ - -#if defined(CTL_HW) && (defined(HW_MEMSIZE) || defined(HW_PHYSMEM64)) - int mib[2]; - mib[0] = CTL_HW; -#if defined(HW_MEMSIZE) - mib[1] = HW_MEMSIZE; /* OSX. --------------------- */ -#elif defined(HW_PHYSMEM64) - mib[1] = HW_PHYSMEM64; /* NetBSD, OpenBSD. --------- */ -#endif - uint64_t size = 0; /* 64-bit */ - size_t len = sizeof(size); - if (sysctl(mib, 2, &size, &len, nullptr, 0) == 0) - return size; - - return 0; /* Failed? */ - -#elif defined(_SC_AIX_REALMEM) - /* AIX. ----------------------------------------------------- */ - return sysconf(_SC_AIX_REALMEM) * 1024; - -#elif defined(_SC_PHYS_PAGES) && defined(_SC_PAGESIZE) - /* FreeBSD, Linux, OpenBSD, and Solaris. -------------------- */ - return uint64_t(sysconf(_SC_PHYS_PAGES)) - *uint64_t(sysconf(_SC_PAGESIZE)); - -#elif defined(_SC_PHYS_PAGES) && defined(_SC_PAGE_SIZE) - /* Legacy. -------------------------------------------------- */ - return uint64_t(sysconf(_SC_PHYS_PAGES)) - * uint64_t(sysconf(_SC_PAGE_SIZE)); - -#elif defined(CTL_HW) && (defined(HW_PHYSMEM) || defined(HW_REALMEM)) - /* DragonFly BSD, FreeBSD, NetBSD, OpenBSD, and OSX. -------- */ - int mib[2]; - mib[0] = CTL_HW; -#if defined(HW_REALMEM) - mib[1] = HW_REALMEM; /* FreeBSD. ----------------- */ -#elif defined(HW_PYSMEM) - mib[1] = HW_PHYSMEM; /* Others. 
------------------ */ -#endif - unsigned int size = 0; /* 32-bit */ - size_t len = sizeof(size); - if (sysctl(mib, 2, &size, &len, nullptr, 0) == 0) - return size; - - return 0; /* Failed? */ -#endif /* sysctl and sysconf variants */ - -#endif + return num_pages * page_size; } diff --git a/base/common/logger_useful.h b/base/common/logger_useful.h index f760d59de45..d3b4d38d546 100644 --- a/base/common/logger_useful.h +++ b/base/common/logger_useful.h @@ -3,7 +3,6 @@ /// Macros for convenient usage of Poco logger. #include -#include #include #include #include diff --git a/base/common/types.h b/base/common/types.h index f3572da2972..bd5c28fe73b 100644 --- a/base/common/types.h +++ b/base/common/types.h @@ -8,7 +8,7 @@ using Int16 = int16_t; using Int32 = int32_t; using Int64 = int64_t; -#if __cplusplus <= 201703L +#ifndef __cpp_char8_t using char8_t = unsigned char; #endif diff --git a/base/daemon/SentryWriter.cpp b/base/daemon/SentryWriter.cpp index 33f2b237dd5..b8f2e5073ab 100644 --- a/base/daemon/SentryWriter.cpp +++ b/base/daemon/SentryWriter.cpp @@ -6,10 +6,12 @@ #include #include +#include #include #include #include +#include #if !defined(ARCADIA_BUILD) # include "Common/config_version.h" @@ -28,14 +30,13 @@ namespace bool initialized = false; bool anonymize = false; +std::string server_data_path; void setExtras() { - if (!anonymize) - { sentry_set_extra("server_name", sentry_value_new_string(getFQDNOrHostName().c_str())); - } + sentry_set_tag("version", VERSION_STRING); sentry_set_extra("version_githash", sentry_value_new_string(VERSION_GITHASH)); sentry_set_extra("version_describe", sentry_value_new_string(VERSION_DESCRIBE)); @@ -44,6 +45,15 @@ void setExtras() sentry_set_extra("version_major", sentry_value_new_int32(VERSION_MAJOR)); sentry_set_extra("version_minor", sentry_value_new_int32(VERSION_MINOR)); sentry_set_extra("version_patch", sentry_value_new_int32(VERSION_PATCH)); + sentry_set_extra("version_official", sentry_value_new_string(VERSION_OFFICIAL)); + + /// Sentry does not support 64-bit integers. 
+ sentry_set_extra("total_ram", sentry_value_new_string(formatReadableSizeWithBinarySuffix(getMemoryAmountOrZero()).c_str())); + sentry_set_extra("physical_cpu_cores", sentry_value_new_int32(getNumberOfPhysicalCPUCores())); + + if (!server_data_path.empty()) + sentry_set_extra("disk_free_space", sentry_value_new_string(formatReadableSizeWithBinarySuffix( + Poco::File(server_data_path).freeSpace()).c_str())); } void sentry_logger(sentry_level_e level, const char * message, va_list args, void *) @@ -98,6 +108,7 @@ void SentryWriter::initialize(Poco::Util::LayeredConfiguration & config) } if (enabled) { + server_data_path = config.getString("path", ""); const std::filesystem::path & default_tmp_path = std::filesystem::path(config.getString("tmp_path", Poco::Path::temp())) / "sentry"; const std::string & endpoint = config.getString("send_crash_reports.endpoint"); diff --git a/base/glibc-compatibility/musl/accept4.c b/base/glibc-compatibility/musl/accept4.c new file mode 100644 index 00000000000..59ab1726bdc --- /dev/null +++ b/base/glibc-compatibility/musl/accept4.c @@ -0,0 +1,19 @@ +#define _GNU_SOURCE +#include +#include +#include +#include "syscall.h" + +int accept4(int fd, struct sockaddr *restrict addr, socklen_t *restrict len, int flg) +{ + if (!flg) return accept(fd, addr, len); + int ret = socketcall_cp(accept4, fd, addr, len, flg, 0, 0); + if (ret>=0 || (errno != ENOSYS && errno != EINVAL)) return ret; + ret = accept(fd, addr, len); + if (ret<0) return ret; + if (flg & SOCK_CLOEXEC) + __syscall(SYS_fcntl, ret, F_SETFD, FD_CLOEXEC); + if (flg & SOCK_NONBLOCK) + __syscall(SYS_fcntl, ret, F_SETFL, O_NONBLOCK); + return ret; +} diff --git a/base/glibc-compatibility/musl/epoll.c b/base/glibc-compatibility/musl/epoll.c new file mode 100644 index 00000000000..deff5b101aa --- /dev/null +++ b/base/glibc-compatibility/musl/epoll.c @@ -0,0 +1,37 @@ +#include +#include +#include +#include "syscall.h" + +int epoll_create(int size) +{ + return epoll_create1(0); +} + +int epoll_create1(int flags) +{ + int r = __syscall(SYS_epoll_create1, flags); +#ifdef SYS_epoll_create + if (r==-ENOSYS && !flags) r = __syscall(SYS_epoll_create, 1); +#endif + return __syscall_ret(r); +} + +int epoll_ctl(int fd, int op, int fd2, struct epoll_event *ev) +{ + return syscall(SYS_epoll_ctl, fd, op, fd2, ev); +} + +int epoll_pwait(int fd, struct epoll_event *ev, int cnt, int to, const sigset_t *sigs) +{ + int r = __syscall(SYS_epoll_pwait, fd, ev, cnt, to, sigs, _NSIG/8); +#ifdef SYS_epoll_wait + if (r==-ENOSYS && !sigs) r = __syscall(SYS_epoll_wait, fd, ev, cnt, to); +#endif + return __syscall_ret(r); +} + +int epoll_wait(int fd, struct epoll_event *ev, int cnt, int to) +{ + return epoll_pwait(fd, ev, cnt, to, 0); +} diff --git a/base/glibc-compatibility/musl/eventfd.c b/base/glibc-compatibility/musl/eventfd.c new file mode 100644 index 00000000000..68e489c8364 --- /dev/null +++ b/base/glibc-compatibility/musl/eventfd.c @@ -0,0 +1,23 @@ +#include +#include +#include +#include "syscall.h" + +int eventfd(unsigned int count, int flags) +{ + int r = __syscall(SYS_eventfd2, count, flags); +#ifdef SYS_eventfd + if (r==-ENOSYS && !flags) r = __syscall(SYS_eventfd, count); +#endif + return __syscall_ret(r); +} + +int eventfd_read(int fd, eventfd_t *value) +{ + return (sizeof(*value) == read(fd, value, sizeof(*value))) ? 0 : -1; +} + +int eventfd_write(int fd, eventfd_t value) +{ + return (sizeof(value) == write(fd, &value, sizeof(value))) ? 
0 : -1; +} diff --git a/base/glibc-compatibility/musl/getauxval.c b/base/glibc-compatibility/musl/getauxval.c new file mode 100644 index 00000000000..a429273fa1a --- /dev/null +++ b/base/glibc-compatibility/musl/getauxval.c @@ -0,0 +1,45 @@ +#include +#include // __environ +#include + +// We don't have libc struct available here. Compute aux vector manually. +static unsigned long * __auxv = NULL; +static unsigned long __auxv_secure = 0; + +static size_t __find_auxv(unsigned long type) +{ + size_t i; + for (i = 0; __auxv[i]; i += 2) + { + if (__auxv[i] == type) + return i + 1; + } + return (size_t) -1; +} + +__attribute__((constructor)) static void __auxv_init() +{ + size_t i; + for (i = 0; __environ[i]; i++); + __auxv = (unsigned long *) (__environ + i + 1); + + size_t secure_idx = __find_auxv(AT_SECURE); + if (secure_idx != ((size_t) -1)) + __auxv_secure = __auxv[secure_idx]; +} + +unsigned long getauxval(unsigned long type) +{ + if (type == AT_SECURE) + return __auxv_secure; + + if (__auxv) + { + size_t index = __find_auxv(type); + if (index != ((size_t) -1)) + return __auxv[index]; + } + + errno = ENOENT; + return 0; +} diff --git a/base/glibc-compatibility/musl/secure_getenv.c b/base/glibc-compatibility/musl/secure_getenv.c new file mode 100644 index 00000000000..fbd9ef3bdcc --- /dev/null +++ b/base/glibc-compatibility/musl/secure_getenv.c @@ -0,0 +1,8 @@ +#define _GNU_SOURCE +#include +#include + +char * secure_getenv(const char * name) +{ + return getauxval(AT_SECURE) ? NULL : getenv(name); +} diff --git a/base/glibc-compatibility/musl/syscall.h b/base/glibc-compatibility/musl/syscall.h index 70b4688f642..3160357f252 100644 --- a/base/glibc-compatibility/musl/syscall.h +++ b/base/glibc-compatibility/musl/syscall.h @@ -13,3 +13,11 @@ long __syscall(syscall_arg_t, ...); __attribute__((visibility("hidden"))) void *__vdsosym(const char *, const char *); + +#define syscall(...) __syscall_ret(__syscall(__VA_ARGS__)) + +#define socketcall(...) __syscall_ret(__socketcall(__VA_ARGS__)) + +#define __socketcall(nm,a,b,c,d,e,f) __syscall(SYS_##nm, a, b, c, d, e, f) + +#define socketcall_cp socketcall diff --git a/base/glibc-compatibility/musl/vdso.c b/base/glibc-compatibility/musl/vdso.c index c0dd0f33e4e..b108c4ef752 100644 --- a/base/glibc-compatibility/musl/vdso.c +++ b/base/glibc-compatibility/musl/vdso.c @@ -40,24 +40,10 @@ static int checkver(Verdef *def, int vsym, const char *vername, char *strings) #define OK_TYPES (1<e_phoff); size_t *dynv=0, base=-1; diff --git a/cmake/Modules/Findbtrie.cmake b/cmake/Modules/Findbtrie.cmake deleted file mode 100644 index 4f3c27f5225..00000000000 --- a/cmake/Modules/Findbtrie.cmake +++ /dev/null @@ -1,44 +0,0 @@ -# - Try to find btrie headers and libraries. -# -# Usage of this module as follows: -# -# find_package(btrie) -# -# Variables used by this module, they can change the default behaviour and need -# to be set before calling find_package: -# -# BTRIE_ROOT_DIR Set this variable to the root installation of -# btrie if the module has problems finding -# the proper installation path. 
-# -# Variables defined by this module: -# -# BTRIE_FOUND System has btrie libs/headers -# BTRIE_LIBRARIES The btrie library/libraries -# BTRIE_INCLUDE_DIR The location of btrie headers - -find_path(BTRIE_ROOT_DIR - NAMES include/btrie.h -) - -find_library(BTRIE_LIBRARIES - NAMES btrie - PATHS ${BTRIE_ROOT_DIR}/lib ${BTRIE_LIBRARIES_PATHS} -) - -find_path(BTRIE_INCLUDE_DIR - NAMES btrie.h - PATHS ${BTRIE_ROOT_DIR}/include ${BTRIE_INCLUDE_PATHS} -) - -include(FindPackageHandleStandardArgs) -find_package_handle_standard_args(btrie DEFAULT_MSG - BTRIE_LIBRARIES - BTRIE_INCLUDE_DIR -) - -mark_as_advanced( - BTRIE_ROOT_DIR - BTRIE_LIBRARIES - BTRIE_INCLUDE_DIR -) diff --git a/cmake/Modules/FindgRPC.cmake b/cmake/Modules/FindgRPC.cmake index 671d207085b..945d307952b 100644 --- a/cmake/Modules/FindgRPC.cmake +++ b/cmake/Modules/FindgRPC.cmake @@ -6,11 +6,9 @@ Defines the following variables: The include directories of the gRPC framework, including the include directories of the C++ wrapper. ``gRPC_LIBRARIES`` The libraries of the gRPC framework. -``gRPC_UNSECURE_LIBRARIES`` - The libraries of the gRPC framework without SSL. -``_gRPC_CPP_PLUGIN`` +``gRPC_CPP_PLUGIN`` The plugin for generating gRPC client and server C++ stubs from `.proto` files -``_gRPC_PYTHON_PLUGIN`` +``gRPC_PYTHON_PLUGIN`` The plugin for generating gRPC client and server Python stubs from `.proto` files The following :prop_tgt:`IMPORTED` targets are also defined: @@ -19,6 +17,13 @@ The following :prop_tgt:`IMPORTED` targets are also defined: ``grpc_cpp_plugin`` ``grpc_python_plugin`` +Set the following variables to adjust the behaviour of this script: +``gRPC_USE_UNSECURE_LIBRARIES`` + if set gRPC_LIBRARIES will be filled with the unsecure version of the libraries (i.e. without SSL) + instead of the secure ones. +``gRPC_DEBUG` + if set the debug message will be printed. + Add custom commands to process ``.proto`` files to C++:: protobuf_generate_grpc_cpp( [DESCRIPTORS ] [EXPORT_MACRO ] [...]) @@ -242,6 +247,7 @@ find_library(gRPC_LIBRARY NAMES grpc) find_library(gRPC_CPP_LIBRARY NAMES grpc++) find_library(gRPC_UNSECURE_LIBRARY NAMES grpc_unsecure) find_library(gRPC_CPP_UNSECURE_LIBRARY NAMES grpc++_unsecure) +find_library(gRPC_CARES_LIBRARY NAMES cares) set(gRPC_LIBRARIES) if(gRPC_USE_UNSECURE_LIBRARIES) @@ -259,6 +265,7 @@ else() set(gRPC_LIBRARIES ${gRPC_LIBRARIES} ${gRPC_CPP_LIBRARY}) endif() endif() +set(gRPC_LIBRARIES ${gRPC_LIBRARIES} ${gRPC_CARES_LIBRARY}) # Restore the original find library ordering. if(gRPC_USE_STATIC_LIBS) @@ -278,11 +285,11 @@ else() endif() # Get full path to plugin. 
-find_program(_gRPC_CPP_PLUGIN +find_program(gRPC_CPP_PLUGIN NAMES grpc_cpp_plugin DOC "The plugin for generating gRPC client and server C++ stubs from `.proto` files") -find_program(_gRPC_PYTHON_PLUGIN +find_program(gRPC_PYTHON_PLUGIN NAMES grpc_python_plugin DOC "The plugin for generating gRPC client and server Python stubs from `.proto` files") @@ -317,14 +324,14 @@ endif() #include(FindPackageHandleStandardArgs.cmake) FIND_PACKAGE_HANDLE_STANDARD_ARGS(gRPC - REQUIRED_VARS gRPC_LIBRARY gRPC_CPP_LIBRARY gRPC_UNSECURE_LIBRARY gRPC_CPP_UNSECURE_LIBRARY - gRPC_INCLUDE_DIR gRPC_CPP_INCLUDE_DIR _gRPC_CPP_PLUGIN _gRPC_PYTHON_PLUGIN) + REQUIRED_VARS gRPC_LIBRARY gRPC_CPP_LIBRARY gRPC_UNSECURE_LIBRARY gRPC_CPP_UNSECURE_LIBRARY gRPC_CARES_LIBRARY + gRPC_INCLUDE_DIR gRPC_CPP_INCLUDE_DIR gRPC_CPP_PLUGIN gRPC_PYTHON_PLUGIN) if(gRPC_FOUND) if(gRPC_DEBUG) message(STATUS "gRPC: INCLUDE_DIRS=${gRPC_INCLUDE_DIRS}") message(STATUS "gRPC: LIBRARIES=${gRPC_LIBRARIES}") - message(STATUS "gRPC: CPP_PLUGIN=${_gRPC_CPP_PLUGIN}") - message(STATUS "gRPC: PYTHON_PLUGIN=${_gRPC_PYTHON_PLUGIN}") + message(STATUS "gRPC: CPP_PLUGIN=${gRPC_CPP_PLUGIN}") + message(STATUS "gRPC: PYTHON_PLUGIN=${gRPC_PYTHON_PLUGIN}") endif() endif() diff --git a/cmake/autogenerated_versions.txt b/cmake/autogenerated_versions.txt index 0e65568f185..87a30c9effc 100644 --- a/cmake/autogenerated_versions.txt +++ b/cmake/autogenerated_versions.txt @@ -1,9 +1,9 @@ # This strings autochanged from release_lib.sh: -SET(VERSION_REVISION 54443) +SET(VERSION_REVISION 54444) SET(VERSION_MAJOR 20) -SET(VERSION_MINOR 12) +SET(VERSION_MINOR 13) SET(VERSION_PATCH 1) -SET(VERSION_GITHASH c53725fb1f846fda074347607ab582fbb9c6f7a1) -SET(VERSION_DESCRIBE v20.12.1.1-prestable) -SET(VERSION_STRING 20.12.1.1) +SET(VERSION_GITHASH e581f9ccfc5c64867b0f488cce72412fd2966471) +SET(VERSION_DESCRIBE v20.13.1.1-prestable) +SET(VERSION_STRING 20.13.1.1) # end of autochange diff --git a/cmake/darwin/default_libs.cmake b/cmake/darwin/default_libs.cmake index 7b57e63f4ee..4ee1bcdcfbf 100644 --- a/cmake/darwin/default_libs.cmake +++ b/cmake/darwin/default_libs.cmake @@ -12,13 +12,7 @@ set(CMAKE_CXX_STANDARD_LIBRARIES ${DEFAULT_LIBS}) set(CMAKE_C_STANDARD_LIBRARIES ${DEFAULT_LIBS}) # Minimal supported SDK version - -set (CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -mmacosx-version-min=10.15") -set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -mmacosx-version-min=10.15") -set (CMAKE_ASM_FLAGS "${CMAKE_ASM_FLAGS} -mmacosx-version-min=10.15") - -set (CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} -mmacosx-version-min=10.15") -set (CMAKE_SHARED_LINKER_FLAGS "${CMAKE_SHARED_LINKER_FLAGS} -mmacosx-version-min=10.15") +set(CMAKE_OSX_DEPLOYMENT_TARGET 10.15) # Global libraries diff --git a/cmake/find/avro.cmake b/cmake/find/avro.cmake index e0f73d99111..74ccda3489f 100644 --- a/cmake/find/avro.cmake +++ b/cmake/find/avro.cmake @@ -1,3 +1,4 @@ +# Needed when using Apache Avro serialization format option (ENABLE_AVRO "Enable Avro" ${ENABLE_LIBRARIES}) if (NOT ENABLE_AVRO) diff --git a/cmake/find/grpc.cmake b/cmake/find/grpc.cmake index fa283d98225..017a7b094b0 100644 --- a/cmake/find/grpc.cmake +++ b/cmake/find/grpc.cmake @@ -37,8 +37,8 @@ if(NOT USE_INTERNAL_GRPC_LIBRARY) if(NOT gRPC_INCLUDE_DIRS OR NOT gRPC_LIBRARIES) message(${RECONFIGURE_MESSAGE_LEVEL} "Can't find system gRPC library") set(EXTERNAL_GRPC_LIBRARY_FOUND 0) - elseif(NOT _gRPC_CPP_PLUGIN) - message(${RECONFIGURE_MESSAGE_LEVEL} "Can't find system grcp_cpp_plugin") + elseif(NOT gRPC_CPP_PLUGIN) + message(${RECONFIGURE_MESSAGE_LEVEL} "Can't 
find system grpc_cpp_plugin") set(EXTERNAL_GRPC_LIBRARY_FOUND 0) else() set(EXTERNAL_GRPC_LIBRARY_FOUND 1) @@ -53,8 +53,8 @@ if(NOT EXTERNAL_GRPC_LIBRARY_FOUND AND NOT MISSING_INTERNAL_GRPC_LIBRARY) else() set(gRPC_LIBRARIES grpc grpc++) endif() - set(_gRPC_CPP_PLUGIN $) - set(_gRPC_PROTOC_EXECUTABLE $) + set(gRPC_CPP_PLUGIN $) + set(gRPC_PYTHON_PLUGIN $) include("${ClickHouse_SOURCE_DIR}/contrib/grpc-cmake/protobuf_generate_grpc.cmake") @@ -62,4 +62,4 @@ if(NOT EXTERNAL_GRPC_LIBRARY_FOUND AND NOT MISSING_INTERNAL_GRPC_LIBRARY) set(USE_GRPC 1) endif() -message(STATUS "Using gRPC=${USE_GRPC}: ${gRPC_INCLUDE_DIRS} : ${gRPC_LIBRARIES} : ${_gRPC_CPP_PLUGIN}") +message(STATUS "Using gRPC=${USE_GRPC}: ${gRPC_INCLUDE_DIRS} : ${gRPC_LIBRARIES} : ${gRPC_CPP_PLUGIN}") diff --git a/cmake/find/ssl.cmake b/cmake/find/ssl.cmake index 9058857c173..f7ac9174202 100644 --- a/cmake/find/ssl.cmake +++ b/cmake/find/ssl.cmake @@ -1,3 +1,5 @@ +# Needed when securely connecting to an external server, e.g. +# clickhouse-client --host ... --secure option(ENABLE_SSL "Enable ssl" ${ENABLE_LIBRARIES}) if(NOT ENABLE_SSL) diff --git a/cmake/warnings.cmake b/cmake/warnings.cmake index c5f3ce47775..8fa4a1129ed 100644 --- a/cmake/warnings.cmake +++ b/cmake/warnings.cmake @@ -23,7 +23,7 @@ option (WEVERYTHING "Enable -Weverything option with some exceptions." ON) # Control maximum size of stack frames. It can be important if the code is run in fibers with small stack size. # Only in release build because debug has too large stack frames. -if ((NOT CMAKE_BUILD_TYPE_UC STREQUAL "DEBUG") AND (NOT SANITIZE)) +if ((NOT CMAKE_BUILD_TYPE_UC STREQUAL "DEBUG") AND (NOT SANITIZE) AND (NOT CMAKE_CXX_COMPILER_ID MATCHES "AppleClang")) add_warning(frame-larger-than=32768) endif () diff --git a/contrib/AMQP-CPP b/contrib/AMQP-CPP index d63e1f01658..03781aaff0f 160000 --- a/contrib/AMQP-CPP +++ b/contrib/AMQP-CPP @@ -1 +1 @@ -Subproject commit d63e1f016582e9faaaf279aa24513087a07bc6e7 +Subproject commit 03781aaff0f10ef41f902b8cf865fe0067180c10 diff --git a/contrib/CMakeLists.txt b/contrib/CMakeLists.txt index 92e19efe7c3..37c18b1e1d6 100644 --- a/contrib/CMakeLists.txt +++ b/contrib/CMakeLists.txt @@ -66,10 +66,6 @@ if (USE_INTERNAL_FARMHASH_LIBRARY) add_subdirectory (libfarmhash) endif () -if (USE_INTERNAL_BTRIE_LIBRARY) - add_subdirectory (libbtrie) -endif () - if (USE_INTERNAL_ZLIB_LIBRARY) set (ZLIB_ENABLE_TESTS 0 CACHE INTERNAL "") set (SKIP_INSTALL_ALL 1 CACHE INTERNAL "") diff --git a/contrib/abseil-cpp b/contrib/abseil-cpp new file mode 160000 index 00000000000..4f3b686f86c --- /dev/null +++ b/contrib/abseil-cpp @@ -0,0 +1 @@ +Subproject commit 4f3b686f86c3ebaba7e4e926e62a79cb1c659a54 diff --git a/contrib/cassandra b/contrib/cassandra index a49b4e0e269..d10187efb25 160000 --- a/contrib/cassandra +++ b/contrib/cassandra @@ -1 +1 @@ -Subproject commit a49b4e0e2696a4b8ef286a5b9538d1cbe8490509 +Subproject commit d10187efb25b26da391def077edf3c6f2f3a23dd diff --git a/contrib/cctz b/contrib/cctz index 7a2db4ece6e..260ba195ef6 160000 --- a/contrib/cctz +++ b/contrib/cctz @@ -1 +1 @@ -Subproject commit 7a2db4ece6e0f1b246173cbdb62711ae258ee841 +Subproject commit 260ba195ef6c489968bae8c88c62a67cdac5ff9d diff --git a/contrib/grpc b/contrib/grpc index a6570b863cf..7436366ceb3 160000 --- a/contrib/grpc +++ b/contrib/grpc @@ -1 +1 @@ -Subproject commit a6570b863cf76c9699580ba51c7827d5bffaac43 +Subproject commit 7436366ceb341ba5c00ea29f1645e02a2b70bf93 diff --git a/contrib/grpc-cmake/CMakeLists.txt b/contrib/grpc-cmake/CMakeLists.txt index 
5ab70d83429..efb0f1c4f43 100644 --- a/contrib/grpc-cmake/CMakeLists.txt +++ b/contrib/grpc-cmake/CMakeLists.txt @@ -1,6 +1,7 @@ set(_gRPC_SOURCE_DIR "${ClickHouse_SOURCE_DIR}/contrib/grpc") set(_gRPC_BINARY_DIR "${ClickHouse_BINARY_DIR}/contrib/grpc") +# Use re2 from ClickHouse contrib, not from gRPC third_party. if(NOT RE2_INCLUDE_DIR) message(FATAL_ERROR " grpc: The location of the \"re2\" library is unknown") endif() @@ -8,6 +9,7 @@ set(gRPC_RE2_PROVIDER "clickhouse" CACHE STRING "" FORCE) set(_gRPC_RE2_INCLUDE_DIR "${RE2_INCLUDE_DIR}") set(_gRPC_RE2_LIBRARIES "${RE2_LIBRARY}") +# Use zlib from ClickHouse contrib, not from gRPC third_party. if(NOT ZLIB_INCLUDE_DIRS) message(FATAL_ERROR " grpc: The location of the \"zlib\" library is unknown") endif() @@ -15,6 +17,7 @@ set(gRPC_ZLIB_PROVIDER "clickhouse" CACHE STRING "" FORCE) set(_gRPC_ZLIB_INCLUDE_DIR "${ZLIB_INCLUDE_DIRS}") set(_gRPC_ZLIB_LIBRARIES "${ZLIB_LIBRARIES}") +# Use protobuf from ClickHouse contrib, not from gRPC third_party. if(NOT Protobuf_INCLUDE_DIR OR NOT Protobuf_LIBRARY) message(FATAL_ERROR " grpc: The location of the \"protobuf\" library is unknown") elseif (NOT Protobuf_PROTOC_EXECUTABLE) @@ -29,21 +32,33 @@ set(_gRPC_PROTOBUF_PROTOC "protoc") set(_gRPC_PROTOBUF_PROTOC_EXECUTABLE "${Protobuf_PROTOC_EXECUTABLE}") set(_gRPC_PROTOBUF_PROTOC_LIBRARIES "${Protobuf_PROTOC_LIBRARY}") +# Use OpenSSL from ClickHouse contrib, not from gRPC third_party. set(gRPC_SSL_PROVIDER "clickhouse" CACHE STRING "" FORCE) set(_gRPC_SSL_INCLUDE_DIR ${OPENSSL_INCLUDE_DIR}) set(_gRPC_SSL_LIBRARIES ${OPENSSL_LIBRARIES}) +# Use abseil-cpp from ClickHouse contrib, not from gRPC third_party. +set(gRPC_ABSL_PROVIDER "clickhouse" CACHE STRING "" FORCE) +set(ABSL_ROOT_DIR "${ClickHouse_SOURCE_DIR}/contrib/abseil-cpp") +if(NOT EXISTS "${ABSL_ROOT_DIR}/CMakeLists.txt") + message(FATAL_ERROR " grpc: submodule third_party/abseil-cpp is missing. To fix try run: \n git submodule update --init --recursive") +endif() +add_subdirectory("${ABSL_ROOT_DIR}" "${ClickHouse_BINARY_DIR}/contrib/abseil-cpp") + +# Choose to build static or shared library for c-ares. +if (MAKE_STATIC_LIBRARIES) + set(CARES_STATIC ON CACHE BOOL "" FORCE) + set(CARES_SHARED OFF CACHE BOOL "" FORCE) +else () + set(CARES_STATIC OFF CACHE BOOL "" FORCE) + set(CARES_SHARED ON CACHE BOOL "" FORCE) +endif () + # We don't want to build C# extensions. set(gRPC_BUILD_CSHARP_EXT OFF) -# We don't want to build abseil tests, so we temporarily switch BUILD_TESTING off. -set(_gRPC_ORIG_BUILD_TESTING ${BUILD_TESTING}) -set(BUILD_TESTING OFF) - add_subdirectory("${_gRPC_SOURCE_DIR}" "${_gRPC_BINARY_DIR}") -set(BUILD_TESTING ${_gRPC_ORIG_BUILD_TESTING}) - # The contrib/grpc/CMakeLists.txt redefined the PROTOBUF_GENERATE_GRPC_CPP() function for its own purposes, # so we need to redefine it back. include("${ClickHouse_SOURCE_DIR}/contrib/grpc-cmake/protobuf_generate_grpc.cmake") diff --git a/contrib/libbtrie/CMakeLists.txt b/contrib/libbtrie/CMakeLists.txt deleted file mode 100644 index 2b0c8e3fd75..00000000000 --- a/contrib/libbtrie/CMakeLists.txt +++ /dev/null @@ -1,6 +0,0 @@ -add_library(btrie - src/btrie.c - include/btrie.h -) - -target_include_directories (btrie SYSTEM PUBLIC include) diff --git a/contrib/libbtrie/LICENSE b/contrib/libbtrie/LICENSE deleted file mode 100644 index d386c6f7b79..00000000000 --- a/contrib/libbtrie/LICENSE +++ /dev/null @@ -1,23 +0,0 @@ -Copyright (c) 2013, CobbLiu -All rights reserved. 
- -Redistribution and use in source and binary forms, with or without modification, -are permitted provided that the following conditions are met: - - Redistributions of source code must retain the above copyright notice, this - list of conditions and the following disclaimer. - - Redistributions in binary form must reproduce the above copyright notice, this - list of conditions and the following disclaimer in the documentation and/or - other materials provided with the distribution. - -THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND -ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED -WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE -DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR -ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES -(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; -LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON -ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT -(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS -SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. diff --git a/contrib/libbtrie/include/btrie.h b/contrib/libbtrie/include/btrie.h deleted file mode 100644 index 6d805108e7a..00000000000 --- a/contrib/libbtrie/include/btrie.h +++ /dev/null @@ -1,160 +0,0 @@ -#pragma once - -#if defined (__cplusplus) -extern "C" { -#endif - -#include -#include - -/** - * In btrie, each leaf means one bit in ip tree. - * Left means 0, and right means 1. - */ - -#define BTRIE_NULL (uintptr_t) -1 - -#if !defined(BTRIE_MAX_PAGES) -/// 54 ip per page. 8 bytes memory per page when empty -#define BTRIE_MAX_PAGES 1024 * 2048 /// 128m ips , ~16mb ram when empty -// #define BTRIE_MAX_PAGES 1024 * 65535 /// 4g ips (whole ipv4), ~512mb ram when empty -#endif - -typedef struct btrie_node_s btrie_node_t; - -struct btrie_node_s { - btrie_node_t *right; - btrie_node_t *left; - btrie_node_t *parent; - uintptr_t value; -}; - - -typedef struct btrie_s { - btrie_node_t *root; - - btrie_node_t *free; /* free list of btrie */ - char *start; - size_t size; - - /* - * memory pool. - * memory management(esp free) will be so easy by using this facility. - */ - char *pools[BTRIE_MAX_PAGES]; - size_t len; -} btrie_t; - - -/** - * Create an empty btrie - * - * @Return: - * An ip radix_tree created. - * NULL if creation failed. - */ - -btrie_t *btrie_create(); - -/** - * Destroy the ip radix_tree - * - * @Return: - * OK if deletion succeed. - * ERROR if error occurs while deleting. - */ -int btrie_destroy(btrie_t *tree); - -/** - * Count the nodes in the radix tree. - */ -size_t btrie_count(btrie_t *tree); - -/** - * Return the allocated number of bytes. - */ -size_t btrie_allocated(btrie_t *tree); - - -/** - * Add an ipv4 into btrie - * - * @Args: - * key: ip address - * mask: key's mask - * value: value of this IP, may be NULL. - * - * @Return: - * OK for success. - * ERROR for failure. - */ -int btrie_insert(btrie_t *tree, uint32_t key, uint32_t mask, - uintptr_t value); - - -/** - * Delete an ipv4 from btrie - * - * @Args: - * - * @Return: - * OK for success. - * ERROR for failure. - */ -int btrie_delete(btrie_t *tree, uint32_t key, uint32_t mask); - - -/** - * Find an ipv4 from btrie - * - - * @Args: - * - * @Return: - * Value if succeed. - * NULL if failed. 
- */ -uintptr_t btrie_find(btrie_t *tree, uint32_t key); - - -/** - * Add an ipv6 into btrie - * - * @Args: - * key: ip address - * mask: key's mask - * value: value of this IP, may be NULL. - * - * @Return: - * OK for success. - * ERROR for failure. - */ -int btrie_insert_a6(btrie_t *tree, const uint8_t *key, const uint8_t *mask, - uintptr_t value); - -/** - * Delete an ipv6 from btrie - * - * @Args: - * - * @Return: - * OK for success. - * ERROR for failure. - */ -int btrie_delete_a6(btrie_t *tree, const uint8_t *key, const uint8_t *mask); - -/** - * Find an ipv6 from btrie - * - - * @Args: - * - * @Return: - * Value if succeed. - * NULL if failed. - */ -uintptr_t btrie_find_a6(btrie_t *tree, const uint8_t *key); - -#if defined (__cplusplus) -} -#endif \ No newline at end of file diff --git a/contrib/libbtrie/src/btrie.c b/contrib/libbtrie/src/btrie.c deleted file mode 100644 index f9353019ac1..00000000000 --- a/contrib/libbtrie/src/btrie.c +++ /dev/null @@ -1,460 +0,0 @@ -#include -#include -#include - -#define PAGE_SIZE 4096 - - -static btrie_node_t * -btrie_alloc(btrie_t *tree) -{ - btrie_node_t *p; - - if (tree->free) { - p = tree->free; - tree->free = tree->free->right; - return p; - } - - if (tree->size < sizeof(btrie_node_t)) { - tree->start = (char *) calloc(sizeof(char), PAGE_SIZE); - if (tree->start == NULL) { - return NULL; - } - - tree->pools[tree->len++] = tree->start; - tree->size = PAGE_SIZE; - } - - p = (btrie_node_t *) tree->start; - - tree->start += sizeof(btrie_node_t); - tree->size -= sizeof(btrie_node_t); - - return p; -} - - -btrie_t * -btrie_create() -{ - btrie_t *tree = (btrie_t *) malloc(sizeof(btrie_t)); - if (tree == NULL) { - return NULL; - } - - tree->free = NULL; - tree->start = NULL; - tree->size = 0; - memset(tree->pools, 0, sizeof(btrie_t *) * BTRIE_MAX_PAGES); - tree->len = 0; - - tree->root = btrie_alloc(tree); - if (tree->root == NULL) { - return NULL; - } - - tree->root->right = NULL; - tree->root->left = NULL; - tree->root->parent = NULL; - tree->root->value = BTRIE_NULL; - - return tree; -} - -static size_t -subtree_weight(btrie_node_t *node) -{ - size_t weight = 1; - if (node->left) { - weight += subtree_weight(node->left); - } - if (node->right) { - weight += subtree_weight(node->right); - } - return weight; -} - -size_t -btrie_count(btrie_t *tree) -{ - if (tree->root == NULL) { - return 0; - } - - return subtree_weight(tree->root); -} - -size_t -btrie_allocated(btrie_t *tree) -{ - return tree->len * PAGE_SIZE; -} - - -int -btrie_insert(btrie_t *tree, uint32_t key, uint32_t mask, - uintptr_t value) -{ - uint32_t bit; - btrie_node_t *node, *next; - - bit = 0x80000000; - - node = tree->root; - next = tree->root; - - while (bit & mask) { - if (key & bit) { - next = node->right; - - } else { - next = node->left; - } - - if (next == NULL) { - break; - } - - bit >>= 1; - node = next; - } - - if (next) { - if (node->value != BTRIE_NULL) { - return -1; - } - - node->value = value; - return 0; - } - - while (bit & mask) { - next = btrie_alloc(tree); - if (next == NULL) { - return -1; - } - - next->right = NULL; - next->left = NULL; - next->parent = node; - next->value = BTRIE_NULL; - - if (key & bit) { - node->right = next; - - } else { - node->left = next; - } - - bit >>= 1; - node = next; - } - - node->value = value; - - return 0; -} - - -int -btrie_delete(btrie_t *tree, uint32_t key, uint32_t mask) -{ - uint32_t bit; - btrie_node_t *node; - - bit = 0x80000000; - node = tree->root; - - while (node && (bit & mask)) { - if (key & bit) { - node = 
node->right; - - } else { - node = node->left; - } - - bit >>= 1; - } - - if (node == NULL) { - return -1; - } - - if (node->right || node->left) { - if (node->value != BTRIE_NULL) { - node->value = BTRIE_NULL; - return 0; - } - - return -1; - } - - for ( ;; ) { - if (node->parent->right == node) { - node->parent->right = NULL; - - } else { - node->parent->left = NULL; - } - - node->right = tree->free; - tree->free = node; - - node = node->parent; - - if (node->right || node->left) { - break; - } - - if (node->value != BTRIE_NULL) { - break; - } - - if (node->parent == NULL) { - break; - } - } - - return 0; -} - - -uintptr_t -btrie_find(btrie_t *tree, uint32_t key) -{ - uint32_t bit; - uintptr_t value; - btrie_node_t *node; - - bit = 0x80000000; - value = BTRIE_NULL; - node = tree->root; - - while (node) { - if (node->value != BTRIE_NULL) { - value = node->value; - } - - if (key & bit) { - node = node->right; - - } else { - node = node->left; - } - - bit >>= 1; - } - - return value; -} - - -int -btrie_insert_a6(btrie_t *tree, const uint8_t *key, const uint8_t *mask, - uintptr_t value) -{ - uint8_t bit; - unsigned int i; - btrie_node_t *node, *next; - - i = 0; - bit = 0x80; - - node = tree->root; - next = tree->root; - - while (bit & mask[i]) { - if (key[i] & bit) { - next = node->right; - - } else { - next = node->left; - } - - if (next == NULL) { - break; - } - - bit >>= 1; - node = next; - - if (bit == 0) { - if (++i == 16) { - break; - } - - bit = 0x80; - } - } - - if (next) { - if (node->value != BTRIE_NULL) { - return -1; - } - - node->value = value; - return 0; - } - - while (bit & mask[i]) { - next = btrie_alloc(tree); - if (next == NULL) { - return -1; - } - - next->right = NULL; - next->left = NULL; - next->parent = node; - next->value = BTRIE_NULL; - - if (key[i] & bit) { - node->right = next; - - } else { - node->left = next; - } - - bit >>= 1; - node = next; - - if (bit == 0) { - if (++i == 16) { - break; - } - - bit = 0x80; - } - } - - node->value = value; - - return 0; -} - - -int -btrie_delete_a6(btrie_t *tree, const uint8_t *key, const uint8_t *mask) -{ - uint8_t bit; - unsigned int i; - btrie_node_t *node; - - i = 0; - bit = 0x80; - node = tree->root; - - while (node && (bit & mask[i])) { - if (key[i] & bit) { - node = node->right; - - } else { - node = node->left; - } - - bit >>= 1; - - if (bit == 0) { - if (++i == 16) { - break; - } - - bit = 0x80; - } - } - - if (node == NULL) { - return -1; - } - - if (node->right || node->left) { - if (node->value != BTRIE_NULL) { - node->value = BTRIE_NULL; - return 0; - } - - return -1; - } - - for ( ;; ) { - if (node->parent->right == node) { - node->parent->right = NULL; - - } else { - node->parent->left = NULL; - } - - node->right = tree->free; - tree->free = node; - - node = node->parent; - - if (node->right || node->left) { - break; - } - - if (node->value != BTRIE_NULL) { - break; - } - - if (node->parent == NULL) { - break; - } - } - - return 0; -} - - -uintptr_t -btrie_find_a6(btrie_t *tree, const uint8_t *key) -{ - uint8_t bit; - uintptr_t value; - unsigned int i; - btrie_node_t *node; - - i = 0; - bit = 0x80; - value = BTRIE_NULL; - node = tree->root; - - while (node) { - if (node->value != BTRIE_NULL) { - value = node->value; - } - - if (key[i] & bit) { - node = node->right; - - } else { - node = node->left; - } - - bit >>= 1; - - if (bit == 0) { - i++; - bit = 0x80; - } - } - - return value; -} - - -int -btrie_destroy(btrie_t *tree) -{ - size_t i; - - - /* free memory pools */ - for (i = 0; i < tree->len; i++) { - 
free(tree->pools[i]); - } - - free(tree); - - return 0; -} diff --git a/contrib/libbtrie/test/test_btrie.c b/contrib/libbtrie/test/test_btrie.c deleted file mode 100644 index 2bbf2b2db7e..00000000000 --- a/contrib/libbtrie/test/test_btrie.c +++ /dev/null @@ -1,103 +0,0 @@ -#include -#include - -int main() -{ - btrie_t *it; - int ret; - - uint8_t prefix_v6[16] = {0xde, 0xad, 0xbe, 0xef}; - uint8_t mask_v6[16] = {0xff, 0xff, 0xff}; - uint8_t ip_v6[16] = {0xde, 0xad, 0xbe, 0xef, 0xde}; - - it = btrie_create(); - if (it == NULL) { - printf("create error!\n"); - return 0; - } - - //add 101.45.69.50/16 - ret = btrie_insert(it, 1697465650, 0xffff0000, 1); - if (ret != 0) { - printf("insert 1 error.\n"); - goto error; - } - - //add 10.45.69.50/16 - ret = btrie_insert(it, 170738994, 0xffff0000, 1); - if (ret != 0) { - printf("insert 2 error.\n"); - goto error; - } - - //add 10.45.79.50/16 - ret = btrie_insert(it, 170741554, 0xffff0000, 1); - if (ret == 0) { - printf("insert 3 error.\n"); - goto error; - } - - //add 102.45.79.50/24 - ret = btrie_insert(it, 1714245426, 0xffffff00, 1); - if (ret != 0) { - printf("insert 4 error.\n"); - goto error; - } - - ret = btrie_find(it, 170741554); - if (ret == 1) { - printf("test case 1 passed\n"); - } else { - printf("test case 1 error\n"); - } - - ret = btrie_find(it, 170786817); - if (ret != 1) { - printf("test case 2 passed\n"); - } else { - printf("test case 2 error\n"); - } - - ret = btrie_delete(it, 1714245426, 0xffffff00); - if (ret != 0) { - printf("delete 1 error\n"); - goto error; - } - - ret = btrie_find(it, 1714245426); - if (ret != 1) { - printf("test case 3 passed\n"); - } else { - printf("test case 3 error\n"); - } - - //add dead:beef::/32 - ret = btrie_insert_a6(it, prefix_v6, mask_v6, 1); - if (ret != 0) { - printf("insert 5 error\n"); - goto error; - } - - ret = btrie_find_a6(it, ip_v6); - if (ret == 1) { - printf("test case 4 passed\n"); - } else { - printf("test case 4 error\n"); - } - - // insert 4m ips - for (size_t ip = 1; ip < 1024 * 1024 * 4; ++ip) { - ret = btrie_insert(it, ip, 0xffffffff, 1); - if (ret != 0) { - printf("insert 5 error (%d) (%zu) .\n", ret, ip); - goto error; - } - } - - return 0; - - error: - btrie_destroy(it); - printf("test failed\n"); - return 1; -} diff --git a/contrib/librdkafka b/contrib/librdkafka index 2090cbf56b7..9902bc4fb18 160000 --- a/contrib/librdkafka +++ b/contrib/librdkafka @@ -1 +1 @@ -Subproject commit 2090cbf56b715247ec2be7f768707a7ab1bf7ede +Subproject commit 9902bc4fb18bb441fa55ca154b341cdda191e5d3 diff --git a/contrib/libunwind-cmake/CMakeLists.txt b/contrib/libunwind-cmake/CMakeLists.txt index 82b3b9c0de5..3afff30eee7 100644 --- a/contrib/libunwind-cmake/CMakeLists.txt +++ b/contrib/libunwind-cmake/CMakeLists.txt @@ -22,7 +22,16 @@ set_source_files_properties(${LIBUNWIND_C_SOURCES} PROPERTIES COMPILE_FLAGS "-st set(LIBUNWIND_ASM_SOURCES ${LIBUNWIND_SOURCE_DIR}/src/UnwindRegistersRestore.S ${LIBUNWIND_SOURCE_DIR}/src/UnwindRegistersSave.S) -set_source_files_properties(${LIBUNWIND_ASM_SOURCES} PROPERTIES LANGUAGE C) + +# CMake doesn't pass the correct architecture for Apple prior to CMake 3.19 [1] +# Workaround these two issues by compiling as C. 
+# +# [1]: https://gitlab.kitware.com/cmake/cmake/-/issues/20771 +if (APPLE AND CMAKE_VERSION VERSION_LESS 3.19) + set_source_files_properties(${LIBUNWIND_ASM_SOURCES} PROPERTIES LANGUAGE C) +else() + enable_language(ASM) +endif() set(LIBUNWIND_SOURCES ${LIBUNWIND_CXX_SOURCES} diff --git a/contrib/mariadb-connector-c b/contrib/mariadb-connector-c index 1485b0de3ea..e05523ca7c1 160000 --- a/contrib/mariadb-connector-c +++ b/contrib/mariadb-connector-c @@ -1 +1 @@ -Subproject commit 1485b0de3eaa1508dfe49a5ba1e4aa2a71fd8335 +Subproject commit e05523ca7c1fb8d095b612a1b1cfe96e199ffb17 diff --git a/contrib/openldap b/contrib/openldap index 34b9ba94b30..0208811b604 160000 --- a/contrib/openldap +++ b/contrib/openldap @@ -1 +1 @@ -Subproject commit 34b9ba94b30319ed6389a4e001d057f7983fe363 +Subproject commit 0208811b6043ca06fda8631a5e473df1ec515ccb diff --git a/contrib/poco b/contrib/poco index f49c6ab8d3a..f3d791f6568 160000 --- a/contrib/poco +++ b/contrib/poco @@ -1 +1 @@ -Subproject commit f49c6ab8d3aa71828bd1b411485c21722e8c9d82 +Subproject commit f3d791f6568b99366d089b4479f76a515beb66d5 diff --git a/contrib/protobuf b/contrib/protobuf index 445d1ae73a4..73b12814204 160000 --- a/contrib/protobuf +++ b/contrib/protobuf @@ -1 +1 @@ -Subproject commit 445d1ae73a450b1e94622e7040989aa2048402e3 +Subproject commit 73b12814204ad9068ba352914d0dc244648b48ee diff --git a/debian/changelog b/debian/changelog index 3da82efd47e..5ea6b472e46 100644 --- a/debian/changelog +++ b/debian/changelog @@ -1,5 +1,5 @@ -clickhouse (20.12.1.1) unstable; urgency=low +clickhouse (20.13.1.1) unstable; urgency=low * Modified source code - -- clickhouse-release Thu, 05 Nov 2020 21:52:47 +0300 + -- clickhouse-release Mon, 23 Nov 2020 10:29:24 +0300 diff --git a/debian/clickhouse-server.init b/debian/clickhouse-server.init index 8f10153a682..3e4e888eacd 100755 --- a/debian/clickhouse-server.init +++ b/debian/clickhouse-server.init @@ -67,26 +67,6 @@ if uname -mpi | grep -q 'x86_64'; then fi -is_running() -{ - pgrep --pidfile "$CLICKHOUSE_PIDFILE" $(echo "${PROGRAM}" | cut -c1-15) 1> /dev/null 2> /dev/null -} - - -wait_for_done() -{ - timeout=$1 - attempts=0 - while is_running; do - attempts=$(($attempts + 1)) - if [ -n "$timeout" ] && [ $attempts -gt $timeout ]; then - return 1 - fi - sleep 1 - done -} - - die() { echo $1 >&2 @@ -105,49 +85,7 @@ check_config() initdb() { - if [ -x "$CLICKHOUSE_BINDIR/$EXTRACT_FROM_CONFIG" ]; then - CLICKHOUSE_DATADIR_FROM_CONFIG=$(su -s $SHELL ${CLICKHOUSE_USER} -c "$CLICKHOUSE_BINDIR/$EXTRACT_FROM_CONFIG --config-file=\"$CLICKHOUSE_CONFIG\" --key=path") - if [ "(" "$?" -ne "0" ")" -o "(" -z "${CLICKHOUSE_DATADIR_FROM_CONFIG}" ")" ]; then - die "Cannot obtain value of path from config file: ${CLICKHOUSE_CONFIG}"; - fi - echo "Path to data directory in ${CLICKHOUSE_CONFIG}: ${CLICKHOUSE_DATADIR_FROM_CONFIG}" - else - CLICKHOUSE_DATADIR_FROM_CONFIG=$CLICKHOUSE_DATADIR - fi - - if ! getent passwd ${CLICKHOUSE_USER} >/dev/null; then - echo "Can't chown to non-existing user ${CLICKHOUSE_USER}" - return - fi - if ! getent group ${CLICKHOUSE_GROUP} >/dev/null; then - echo "Can't chown to non-existing group ${CLICKHOUSE_GROUP}" - return - fi - - if ! $(su -s $SHELL ${CLICKHOUSE_USER} -c "test -r ${CLICKHOUSE_CONFIG}"); then - echo "Warning! clickhouse config [${CLICKHOUSE_CONFIG}] not readable by user [${CLICKHOUSE_USER}]" - fi - - if ! 
$(su -s $SHELL ${CLICKHOUSE_USER} -c "test -O \"${CLICKHOUSE_DATADIR_FROM_CONFIG}\" && test -G \"${CLICKHOUSE_DATADIR_FROM_CONFIG}\""); then - if [ $(dirname "${CLICKHOUSE_DATADIR_FROM_CONFIG}") = "/" ]; then - echo "Directory ${CLICKHOUSE_DATADIR_FROM_CONFIG} seems too dangerous to chown." - else - if [ ! -e "${CLICKHOUSE_DATADIR_FROM_CONFIG}" ]; then - echo "Creating directory ${CLICKHOUSE_DATADIR_FROM_CONFIG}" - mkdir -p "${CLICKHOUSE_DATADIR_FROM_CONFIG}" - fi - - echo "Changing owner of [${CLICKHOUSE_DATADIR_FROM_CONFIG}] to [${CLICKHOUSE_USER}:${CLICKHOUSE_GROUP}]" - chown -R ${CLICKHOUSE_USER}:${CLICKHOUSE_GROUP} "${CLICKHOUSE_DATADIR_FROM_CONFIG}" - fi - fi - - if ! $(su -s $SHELL ${CLICKHOUSE_USER} -c "test -w ${CLICKHOUSE_LOGDIR}"); then - echo "Changing owner of [${CLICKHOUSE_LOGDIR}/*] to [${CLICKHOUSE_USER}:${CLICKHOUSE_GROUP}]" - chown -R ${CLICKHOUSE_USER}:${CLICKHOUSE_GROUP} ${CLICKHOUSE_LOGDIR}/* - echo "Changing owner of [${CLICKHOUSE_LOGDIR}] to [${CLICKHOUSE_LOGDIR_USER}:${CLICKHOUSE_GROUP}]" - chown ${CLICKHOUSE_LOGDIR_USER}:${CLICKHOUSE_GROUP} ${CLICKHOUSE_LOGDIR} - fi + ${CLICKHOUSE_GENERIC_PROGRAM} install --user "${CLICKHOUSE_USER}" --pid-path "${CLICKHOUSE_PIDDIR}" --config-path "${CLICKHOUSE_CONFDIR}" --binary-path "${CLICKHOUSE_BINDIR}" } @@ -171,17 +109,7 @@ restart() forcestop() { - local EXIT_STATUS - EXIT_STATUS=0 - - echo -n "Stop forcefully $PROGRAM service: " - - kill -KILL $(cat "$CLICKHOUSE_PIDFILE") - - wait_for_done - - echo "DONE" - return $EXIT_STATUS + ${CLICKHOUSE_GENERIC_PROGRAM} stop --force --pid-path "${CLICKHOUSE_PIDDIR}" } @@ -261,16 +189,16 @@ main() service_or_func restart ;; condstart) - is_running || service_or_func start + service_or_func start ;; condstop) - is_running && service_or_func stop + service_or_func stop ;; condrestart) - is_running && service_or_func restart + service_or_func restart ;; condreload) - is_running && service_or_func restart + service_or_func restart ;; initdb) initdb @@ -293,17 +221,7 @@ main() status() { - if is_running; then - echo "$PROGRAM service is running" - exit 0 - else - if is_cron_disabled; then - echo "$PROGRAM service is stopped"; - else - echo "$PROGRAM: process unexpectedly terminated" - fi - exit 3 - fi + ${CLICKHOUSE_GENERIC_PROGRAM} status --pid-path "${CLICKHOUSE_PIDDIR}" } diff --git a/docker/client/Dockerfile b/docker/client/Dockerfile index 2223b942429..3ef6b8c8b32 100644 --- a/docker/client/Dockerfile +++ b/docker/client/Dockerfile @@ -1,7 +1,7 @@ FROM ubuntu:18.04 ARG repository="deb https://repo.clickhouse.tech/deb/stable/ main/" -ARG version=20.12.1.* +ARG version=20.13.1.* RUN apt-get update \ && apt-get install --yes --no-install-recommends \ diff --git a/docker/packager/unbundled/Dockerfile b/docker/packager/unbundled/Dockerfile index 261edf1a86c..2f501f76e68 100644 --- a/docker/packager/unbundled/Dockerfile +++ b/docker/packager/unbundled/Dockerfile @@ -56,6 +56,7 @@ RUN apt-get update \ libprotoc-dev \ libgrpc++-dev \ protobuf-compiler-grpc \ + libc-ares-dev \ rapidjson-dev \ libsnappy-dev \ libparquet-dev \ diff --git a/docker/server/Dockerfile b/docker/server/Dockerfile index 1ce6e427409..f7e107a2fc9 100644 --- a/docker/server/Dockerfile +++ b/docker/server/Dockerfile @@ -1,7 +1,7 @@ FROM ubuntu:20.04 ARG repository="deb https://repo.clickhouse.tech/deb/stable/ main/" -ARG version=20.12.1.* +ARG version=20.13.1.* ARG gosu_ver=1.10 RUN apt-get update \ diff --git a/docker/test/Dockerfile b/docker/test/Dockerfile index cd2bead5616..8e3b5193874 100644 --- a/docker/test/Dockerfile 
+++ b/docker/test/Dockerfile @@ -1,7 +1,7 @@ FROM ubuntu:18.04 ARG repository="deb https://repo.clickhouse.tech/deb/stable/ main/" -ARG version=20.12.1.* +ARG version=20.13.1.* RUN apt-get update && \ apt-get install -y apt-transport-https dirmngr && \ diff --git a/docker/test/coverage/Dockerfile b/docker/test/coverage/Dockerfile index 32020951539..cea1a63cf6f 100644 --- a/docker/test/coverage/Dockerfile +++ b/docker/test/coverage/Dockerfile @@ -7,8 +7,10 @@ ENV SOURCE_DIR=/build ENV OUTPUT_DIR=/output ENV IGNORE='.*contrib.*' -CMD mkdir -p /build/obj-x86_64-linux-gnu && cd /build/obj-x86_64-linux-gnu && CC=clang-10 CXX=clang++-10 cmake .. && cd /; \ +RUN apt-get update && apt-get install cmake --yes --no-install-recommends + +CMD mkdir -p /build/obj-x86_64-linux-gnu && cd /build/obj-x86_64-linux-gnu && CC=clang-11 CXX=clang++-11 cmake .. && cd /; \ dpkg -i /package_folder/clickhouse-common-static_*.deb; \ - llvm-profdata-10 merge -sparse ${COVERAGE_DIR}/* -o clickhouse.profdata && \ - llvm-cov-10 export /usr/bin/clickhouse -instr-profile=clickhouse.profdata -j=16 -format=lcov -skip-functions -ignore-filename-regex $IGNORE > output.lcov && \ + llvm-profdata-11 merge -sparse ${COVERAGE_DIR}/* -o clickhouse.profdata && \ + llvm-cov-11 export /usr/bin/clickhouse -instr-profile=clickhouse.profdata -j=16 -format=lcov -skip-functions -ignore-filename-regex $IGNORE > output.lcov && \ genhtml output.lcov --ignore-errors source --output-directory ${OUTPUT_DIR} diff --git a/docker/test/fasttest/run.sh b/docker/test/fasttest/run.sh index aef967b6b41..c3f102786ae 100755 --- a/docker/test/fasttest/run.sh +++ b/docker/test/fasttest/run.sh @@ -15,6 +15,9 @@ stage=${stage:-} # empty parameter. read -ra FASTTEST_CMAKE_FLAGS <<< "${FASTTEST_CMAKE_FLAGS:-}" +# Run only matching tests. 
+FASTTEST_FOCUS=${FASTTEST_FOCUS:-""} + FASTTEST_WORKSPACE=$(readlink -f "${FASTTEST_WORKSPACE:-.}") FASTTEST_SOURCE=$(readlink -f "${FASTTEST_SOURCE:-$FASTTEST_WORKSPACE/ch}") FASTTEST_BUILD=$(readlink -f "${FASTTEST_BUILD:-${BUILD:-$FASTTEST_WORKSPACE/build}}") @@ -101,223 +104,248 @@ function start_server function clone_root { -git clone https://github.com/ClickHouse/ClickHouse.git -- "$FASTTEST_SOURCE" | ts '%Y-%m-%d %H:%M:%S' | tee "$FASTTEST_OUTPUT/clone_log.txt" + git clone https://github.com/ClickHouse/ClickHouse.git -- "$FASTTEST_SOURCE" | ts '%Y-%m-%d %H:%M:%S' | tee "$FASTTEST_OUTPUT/clone_log.txt" -( -cd "$FASTTEST_SOURCE" -if [ "$PULL_REQUEST_NUMBER" != "0" ]; then - if git fetch origin "+refs/pull/$PULL_REQUEST_NUMBER/merge"; then - git checkout FETCH_HEAD - echo 'Clonned merge head' - else - git fetch - git checkout "$COMMIT_SHA" - echo 'Checked out to commit' - fi -else - if [ -v COMMIT_SHA ]; then - git checkout "$COMMIT_SHA" - fi -fi -) + ( + cd "$FASTTEST_SOURCE" + if [ "$PULL_REQUEST_NUMBER" != "0" ]; then + if git fetch origin "+refs/pull/$PULL_REQUEST_NUMBER/merge"; then + git checkout FETCH_HEAD + echo 'Clonned merge head' + else + git fetch + git checkout "$COMMIT_SHA" + echo 'Checked out to commit' + fi + else + if [ -v COMMIT_SHA ]; then + git checkout "$COMMIT_SHA" + fi + fi + ) } function clone_submodules { -( -cd "$FASTTEST_SOURCE" + ( + cd "$FASTTEST_SOURCE" -SUBMODULES_TO_UPDATE=(contrib/boost contrib/zlib-ng contrib/libxml2 contrib/poco contrib/libunwind contrib/ryu contrib/fmtlib contrib/base64 contrib/cctz contrib/libcpuid contrib/double-conversion contrib/libcxx contrib/libcxxabi contrib/libc-headers contrib/lz4 contrib/zstd contrib/fastops contrib/rapidjson contrib/re2 contrib/sparsehash-c11 contrib/croaring contrib/miniselect contrib/xz) + SUBMODULES_TO_UPDATE=( + contrib/boost + contrib/zlib-ng + contrib/libxml2 + contrib/poco + contrib/libunwind + contrib/ryu + contrib/fmtlib + contrib/base64 + contrib/cctz + contrib/libcpuid + contrib/double-conversion + contrib/libcxx + contrib/libcxxabi + contrib/libc-headers + contrib/lz4 + contrib/zstd + contrib/fastops + contrib/rapidjson + contrib/re2 + contrib/sparsehash-c11 + contrib/croaring + contrib/miniselect + contrib/xz + ) -git submodule sync -git submodule update --init --recursive "${SUBMODULES_TO_UPDATE[@]}" -git submodule foreach git reset --hard -git submodule foreach git checkout @ -f -git submodule foreach git clean -xfd -) + git submodule sync + git submodule update --init --recursive "${SUBMODULES_TO_UPDATE[@]}" + git submodule foreach git reset --hard + git submodule foreach git checkout @ -f + git submodule foreach git clean -xfd + ) } function run_cmake { -CMAKE_LIBS_CONFIG=( - "-DENABLE_LIBRARIES=0" - "-DENABLE_TESTS=0" - "-DENABLE_UTILS=0" - "-DENABLE_EMBEDDED_COMPILER=0" - "-DENABLE_THINLTO=0" - "-DUSE_UNWIND=1" -) + CMAKE_LIBS_CONFIG=( + "-DENABLE_LIBRARIES=0" + "-DENABLE_TESTS=0" + "-DENABLE_UTILS=0" + "-DENABLE_EMBEDDED_COMPILER=0" + "-DENABLE_THINLTO=0" + "-DUSE_UNWIND=1" + ) -# TODO remove this? we don't use ccache anyway. An option would be to download it -# from S3 simultaneously with cloning. -export CCACHE_DIR="$FASTTEST_WORKSPACE/ccache" -export CCACHE_BASEDIR="$FASTTEST_SOURCE" -export CCACHE_NOHASHDIR=true -export CCACHE_COMPILERCHECK=content -export CCACHE_MAXSIZE=15G + # TODO remove this? we don't use ccache anyway. An option would be to download it + # from S3 simultaneously with cloning. 
+ export CCACHE_DIR="$FASTTEST_WORKSPACE/ccache" + export CCACHE_BASEDIR="$FASTTEST_SOURCE" + export CCACHE_NOHASHDIR=true + export CCACHE_COMPILERCHECK=content + export CCACHE_MAXSIZE=15G -ccache --show-stats ||: -ccache --zero-stats ||: + ccache --show-stats ||: + ccache --zero-stats ||: -mkdir "$FASTTEST_BUILD" ||: + mkdir "$FASTTEST_BUILD" ||: -( -cd "$FASTTEST_BUILD" -cmake "$FASTTEST_SOURCE" -DCMAKE_CXX_COMPILER=clang++-10 -DCMAKE_C_COMPILER=clang-10 "${CMAKE_LIBS_CONFIG[@]}" "${FASTTEST_CMAKE_FLAGS[@]}" | ts '%Y-%m-%d %H:%M:%S' | tee "$FASTTEST_OUTPUT/cmake_log.txt" -) + ( + cd "$FASTTEST_BUILD" + cmake "$FASTTEST_SOURCE" -DCMAKE_CXX_COMPILER=clang++-10 -DCMAKE_C_COMPILER=clang-10 "${CMAKE_LIBS_CONFIG[@]}" "${FASTTEST_CMAKE_FLAGS[@]}" | ts '%Y-%m-%d %H:%M:%S' | tee "$FASTTEST_OUTPUT/cmake_log.txt" + ) } function build { -( -cd "$FASTTEST_BUILD" -time ninja clickhouse-bundle | ts '%Y-%m-%d %H:%M:%S' | tee "$FASTTEST_OUTPUT/build_log.txt" -if [ "$COPY_CLICKHOUSE_BINARY_TO_OUTPUT" -eq "1" ]; then - cp programs/clickhouse "$FASTTEST_OUTPUT/clickhouse" -fi -ccache --show-stats ||: -) + ( + cd "$FASTTEST_BUILD" + time ninja clickhouse-bundle | ts '%Y-%m-%d %H:%M:%S' | tee "$FASTTEST_OUTPUT/build_log.txt" + if [ "$COPY_CLICKHOUSE_BINARY_TO_OUTPUT" -eq "1" ]; then + cp programs/clickhouse "$FASTTEST_OUTPUT/clickhouse" + fi + ccache --show-stats ||: + ) } function configure { -clickhouse-client --version -clickhouse-test --help + clickhouse-client --version + clickhouse-test --help -mkdir -p "$FASTTEST_DATA"{,/client-config} -cp -a "$FASTTEST_SOURCE/programs/server/"{config,users}.xml "$FASTTEST_DATA" -"$FASTTEST_SOURCE/tests/config/install.sh" "$FASTTEST_DATA" "$FASTTEST_DATA/client-config" -cp -a "$FASTTEST_SOURCE/programs/server/config.d/log_to_console.xml" "$FASTTEST_DATA/config.d" -# doesn't support SSL -rm -f "$FASTTEST_DATA/config.d/secure_ports.xml" + mkdir -p "$FASTTEST_DATA"{,/client-config} + cp -a "$FASTTEST_SOURCE/programs/server/"{config,users}.xml "$FASTTEST_DATA" + "$FASTTEST_SOURCE/tests/config/install.sh" "$FASTTEST_DATA" "$FASTTEST_DATA/client-config" + cp -a "$FASTTEST_SOURCE/programs/server/config.d/log_to_console.xml" "$FASTTEST_DATA/config.d" + # doesn't support SSL + rm -f "$FASTTEST_DATA/config.d/secure_ports.xml" } function run_tests { -clickhouse-server --version -clickhouse-test --help + clickhouse-server --version + clickhouse-test --help -# Kill the server in case we are running locally and not in docker -stop_server ||: - -start_server - -TESTS_TO_SKIP=( - 00105_shard_collations - 00109_shard_totals_after_having - 00110_external_sort - 00302_http_compression - 00417_kill_query - 00436_convert_charset - 00490_special_line_separators_and_characters_outside_of_bmp - 00652_replicated_mutations_zookeeper - 00682_empty_parts_merge - 00701_rollup - 00834_cancel_http_readonly_queries_on_client_close - 00911_tautological_compare - 00926_multimatch - 00929_multi_match_edit_distance - 01031_mutations_interpreter_and_context - 01053_ssd_dictionary # this test mistakenly requires acces to /var/lib/clickhouse -- can't run this locally, disabled - 01083_expressions_in_engine_arguments - 01092_memory_profiler - 01098_msgpack_format - 01098_temporary_and_external_tables - 01103_check_cpu_instructions_at_startup # avoid dependency on qemu -- invonvenient when running locally - 01193_metadata_loading - 01238_http_memory_tracking # max_memory_usage_for_user can interfere another queries running concurrently - 01251_dict_is_in_infinite_loop - 01259_dictionary_custom_settings_ddl - 
01268_dictionary_direct_layout - 01280_ssd_complex_key_dictionary - 01281_group_by_limit_memory_tracking # max_memory_usage_for_user can interfere another queries running concurrently - 01318_encrypt # Depends on OpenSSL - 01318_decrypt # Depends on OpenSSL - 01281_unsucceeded_insert_select_queries_counter - 01292_create_user - 01294_lazy_database_concurrent - 01305_replica_create_drop_zookeeper - 01354_order_by_tuple_collate_const - 01355_ilike - 01411_bayesian_ab_testing - 01532_collate_in_low_cardinality - 01533_collate_in_nullable - 01542_collate_in_array - 01543_collate_in_tuple - _orc_ - arrow - avro - base64 - brotli - capnproto - client - ddl_dictionaries - h3 - hashing - hdfs - java_hash - json - limit_memory - live_view - memory_leak - memory_limit - mysql - odbc - parallel_alter - parquet - protobuf - secure - sha256 - xz - - # Not sure why these two fail even in sequential mode. Disabled for now - # to make some progress. - 00646_url_engine - 00974_query_profiler - - # In fasttest, ENABLE_LIBRARIES=0, so rocksdb engine is not enabled by default - 01504_rocksdb - - # Look at DistributedFilesToInsert, so cannot run in parallel. - 01460_DistributedFilesToInsert - - 01541_max_memory_usage_for_user - - # Require python libraries like scipy, pandas and numpy - 01322_ttest_scipy - - 01545_system_errors - # Checks system.errors - 01563_distributed_query_finish -) - -time clickhouse-test -j 8 --order=random --no-long --testname --shard --zookeeper --skip "${TESTS_TO_SKIP[@]}" 2>&1 | ts '%Y-%m-%d %H:%M:%S' | tee "$FASTTEST_OUTPUT/test_log.txt" - -# substr is to remove semicolon after test name -readarray -t FAILED_TESTS < <(awk '/FAIL|TIMEOUT|ERROR/ { print substr($3, 1, length($3)-1) }' "$FASTTEST_OUTPUT/test_log.txt" | tee "$FASTTEST_OUTPUT/failed-parallel-tests.txt") - -# We will rerun sequentially any tests that have failed during parallel run. -# They might have failed because there was some interference from other tests -# running concurrently. If they fail even in seqential mode, we will report them. -# FIXME All tests that require exclusive access to the server must be -# explicitly marked as `sequential`, and `clickhouse-test` must detect them and -# run them in a separate group after all other tests. This is faster and also -# explicit instead of guessing. -if [[ -n "${FAILED_TESTS[*]}" ]] -then + # Kill the server in case we are running locally and not in docker stop_server ||: - # Clean the data so that there is no interference from the previous test run. 
- rm -rf "$FASTTEST_DATA"/{{meta,}data,user_files} ||: - start_server - echo "Going to run again: ${FAILED_TESTS[*]}" + TESTS_TO_SKIP=( + 00105_shard_collations + 00109_shard_totals_after_having + 00110_external_sort + 00302_http_compression + 00417_kill_query + 00436_convert_charset + 00490_special_line_separators_and_characters_outside_of_bmp + 00652_replicated_mutations_zookeeper + 00682_empty_parts_merge + 00701_rollup + 00834_cancel_http_readonly_queries_on_client_close + 00911_tautological_compare + 00926_multimatch + 00929_multi_match_edit_distance + 01031_mutations_interpreter_and_context + 01053_ssd_dictionary # this test mistakenly requires acces to /var/lib/clickhouse -- can't run this locally, disabled + 01083_expressions_in_engine_arguments + 01092_memory_profiler + 01098_msgpack_format + 01098_temporary_and_external_tables + 01103_check_cpu_instructions_at_startup # avoid dependency on qemu -- invonvenient when running locally + 01193_metadata_loading + 01238_http_memory_tracking # max_memory_usage_for_user can interfere another queries running concurrently + 01251_dict_is_in_infinite_loop + 01259_dictionary_custom_settings_ddl + 01268_dictionary_direct_layout + 01280_ssd_complex_key_dictionary + 01281_group_by_limit_memory_tracking # max_memory_usage_for_user can interfere another queries running concurrently + 01318_encrypt # Depends on OpenSSL + 01318_decrypt # Depends on OpenSSL + 01281_unsucceeded_insert_select_queries_counter + 01292_create_user + 01294_lazy_database_concurrent + 01305_replica_create_drop_zookeeper + 01354_order_by_tuple_collate_const + 01355_ilike + 01411_bayesian_ab_testing + 01532_collate_in_low_cardinality + 01533_collate_in_nullable + 01542_collate_in_array + 01543_collate_in_tuple + _orc_ + arrow + avro + base64 + brotli + capnproto + client + ddl_dictionaries + h3 + hashing + hdfs + java_hash + json + limit_memory + live_view + memory_leak + memory_limit + mysql + odbc + parallel_alter + parquet + protobuf + secure + sha256 + xz - clickhouse-test --order=random --no-long --testname --shard --zookeeper "${FAILED_TESTS[@]}" 2>&1 | ts '%Y-%m-%d %H:%M:%S' | tee -a "$FASTTEST_OUTPUT/test_log.txt" -else - echo "No failed tests" -fi + # Not sure why these two fail even in sequential mode. Disabled for now + # to make some progress. + 00646_url_engine + 00974_query_profiler + + # In fasttest, ENABLE_LIBRARIES=0, so rocksdb engine is not enabled by default + 01504_rocksdb + + # Look at DistributedFilesToInsert, so cannot run in parallel. + 01460_DistributedFilesToInsert + + 01541_max_memory_usage_for_user + + # Require python libraries like scipy, pandas and numpy + 01322_ttest_scipy + 01561_mann_whitney_scipy + + 01545_system_errors + # Checks system.errors + 01563_distributed_query_finish + ) + + time clickhouse-test -j 8 --order=random --no-long --testname --shard --zookeeper --skip "${TESTS_TO_SKIP[@]}" -- "$FASTTEST_FOCUS" 2>&1 | ts '%Y-%m-%d %H:%M:%S' | tee "$FASTTEST_OUTPUT/test_log.txt" + + # substr is to remove semicolon after test name + readarray -t FAILED_TESTS < <(awk '/FAIL|TIMEOUT|ERROR/ { print substr($3, 1, length($3)-1) }' "$FASTTEST_OUTPUT/test_log.txt" | tee "$FASTTEST_OUTPUT/failed-parallel-tests.txt") + + # We will rerun sequentially any tests that have failed during parallel run. + # They might have failed because there was some interference from other tests + # running concurrently. If they fail even in seqential mode, we will report them. 
+ # FIXME All tests that require exclusive access to the server must be + # explicitly marked as `sequential`, and `clickhouse-test` must detect them and + # run them in a separate group after all other tests. This is faster and also + # explicit instead of guessing. + if [[ -n "${FAILED_TESTS[*]}" ]] + then + stop_server ||: + + # Clean the data so that there is no interference from the previous test run. + rm -rf "$FASTTEST_DATA"/{{meta,}data,user_files} ||: + + start_server + + echo "Going to run again: ${FAILED_TESTS[*]}" + + clickhouse-test --order=random --no-long --testname --shard --zookeeper "${FAILED_TESTS[@]}" 2>&1 | ts '%Y-%m-%d %H:%M:%S' | tee -a "$FASTTEST_OUTPUT/test_log.txt" + else + echo "No failed tests" + fi } case "$stage" in diff --git a/docker/test/integration/runner/compose/docker_compose_kerberized_kafka.yml b/docker/test/integration/runner/compose/docker_compose_kerberized_kafka.yml index 3ce0000b148..6e1e11344bb 100644 --- a/docker/test/integration/runner/compose/docker_compose_kerberized_kafka.yml +++ b/docker/test/integration/runner/compose/docker_compose_kerberized_kafka.yml @@ -50,7 +50,7 @@ services: - label:disable kafka_kerberos: - image: yandex/clickhouse-kerberos-kdc:${DOCKER_KERBEROS_KDC_TAG} + image: yandex/clickhouse-kerberos-kdc:${DOCKER_KERBEROS_KDC_TAG:-latest} hostname: kafka_kerberos volumes: - ${KERBERIZED_KAFKA_DIR}/secrets:/tmp/keytab diff --git a/docker/test/integration/runner/compose/docker_compose_mysql.yml b/docker/test/integration/runner/compose/docker_compose_mysql.yml index 2f09c2c01e3..90daf8a4238 100644 --- a/docker/test/integration/runner/compose/docker_compose_mysql.yml +++ b/docker/test/integration/runner/compose/docker_compose_mysql.yml @@ -7,4 +7,4 @@ services: MYSQL_ROOT_PASSWORD: clickhouse ports: - 3308:3306 - command: --server_id=100 --log-bin='mysql-bin-1.log' --default-time-zone='+3:00' --gtid-mode="ON" --enforce-gtid-consistency \ No newline at end of file + command: --server_id=100 --log-bin='mysql-bin-1.log' --default-time-zone='+3:00' --gtid-mode="ON" --enforce-gtid-consistency diff --git a/docker/test/integration/runner/compose/docker_compose_mysql_5_7_for_materialize_mysql.yml b/docker/test/integration/runner/compose/docker_compose_mysql_5_7_for_materialize_mysql.yml new file mode 100644 index 00000000000..e7d762203ee --- /dev/null +++ b/docker/test/integration/runner/compose/docker_compose_mysql_5_7_for_materialize_mysql.yml @@ -0,0 +1,10 @@ +version: '2.3' +services: + mysql1: + image: mysql:5.7 + restart: 'no' + environment: + MYSQL_ROOT_PASSWORD: clickhouse + ports: + - 3308:3306 + command: --server_id=100 --log-bin='mysql-bin-1.log' --default-time-zone='+3:00' --gtid-mode="ON" --enforce-gtid-consistency diff --git a/docker/test/integration/runner/compose/docker_compose_mysql_8_0.yml b/docker/test/integration/runner/compose/docker_compose_mysql_8_0_for_materialize_mysql.yml similarity index 93% rename from docker/test/integration/runner/compose/docker_compose_mysql_8_0.yml rename to docker/test/integration/runner/compose/docker_compose_mysql_8_0_for_materialize_mysql.yml index 1aa97f59a83..918a2b5f80f 100644 --- a/docker/test/integration/runner/compose/docker_compose_mysql_8_0.yml +++ b/docker/test/integration/runner/compose/docker_compose_mysql_8_0_for_materialize_mysql.yml @@ -2,7 +2,7 @@ version: '2.3' services: mysql8_0: image: mysql:8.0 - restart: always + restart: 'no' environment: MYSQL_ROOT_PASSWORD: clickhouse ports: diff --git a/docker/test/integration/runner/compose/docker_compose_mysql_golang_client.yml 
b/docker/test/integration/runner/compose/docker_compose_mysql_golang_client.yml index b172cbcb2c6..a6a338eb6a8 100644 --- a/docker/test/integration/runner/compose/docker_compose_mysql_golang_client.yml +++ b/docker/test/integration/runner/compose/docker_compose_mysql_golang_client.yml @@ -1,6 +1,6 @@ version: '2.3' services: golang1: - image: yandex/clickhouse-mysql-golang-client:${DOCKER_MYSQL_GOLANG_CLIENT_TAG} + image: yandex/clickhouse-mysql-golang-client:${DOCKER_MYSQL_GOLANG_CLIENT_TAG:-latest} # to keep container running command: sleep infinity diff --git a/docker/test/integration/runner/compose/docker_compose_mysql_java_client.yml b/docker/test/integration/runner/compose/docker_compose_mysql_java_client.yml index be1b3ad3f72..21d927df82c 100644 --- a/docker/test/integration/runner/compose/docker_compose_mysql_java_client.yml +++ b/docker/test/integration/runner/compose/docker_compose_mysql_java_client.yml @@ -1,6 +1,6 @@ version: '2.3' services: java1: - image: yandex/clickhouse-mysql-java-client:${DOCKER_MYSQL_JAVA_CLIENT_TAG} + image: yandex/clickhouse-mysql-java-client:${DOCKER_MYSQL_JAVA_CLIENT_TAG:-latest} # to keep container running command: sleep infinity diff --git a/docker/test/integration/runner/compose/docker_compose_mysql_js_client.yml b/docker/test/integration/runner/compose/docker_compose_mysql_js_client.yml index 83954229111..dbd85cf2382 100644 --- a/docker/test/integration/runner/compose/docker_compose_mysql_js_client.yml +++ b/docker/test/integration/runner/compose/docker_compose_mysql_js_client.yml @@ -1,6 +1,6 @@ version: '2.3' services: mysqljs1: - image: yandex/clickhouse-mysql-js-client:${DOCKER_MYSQL_JS_CLIENT_TAG} + image: yandex/clickhouse-mysql-js-client:${DOCKER_MYSQL_JS_CLIENT_TAG:-latest} # to keep container running command: sleep infinity diff --git a/docker/test/integration/runner/compose/docker_compose_mysql_php_client.yml b/docker/test/integration/runner/compose/docker_compose_mysql_php_client.yml index e61cb193b0e..f24f5337a7e 100644 --- a/docker/test/integration/runner/compose/docker_compose_mysql_php_client.yml +++ b/docker/test/integration/runner/compose/docker_compose_mysql_php_client.yml @@ -1,6 +1,6 @@ version: '2.3' services: php1: - image: yandex/clickhouse-mysql-php-client:${DOCKER_MYSQL_PHP_CLIENT_TAG} + image: yandex/clickhouse-mysql-php-client:${DOCKER_MYSQL_PHP_CLIENT_TAG:-latest} # to keep container running command: sleep infinity diff --git a/docker/test/integration/runner/compose/docker_compose_postgesql_java_client.yml b/docker/test/integration/runner/compose/docker_compose_postgesql_java_client.yml index ef18d1edd7b..38191f1bdd6 100644 --- a/docker/test/integration/runner/compose/docker_compose_postgesql_java_client.yml +++ b/docker/test/integration/runner/compose/docker_compose_postgesql_java_client.yml @@ -1,6 +1,6 @@ version: '2.2' services: java: - image: yandex/clickhouse-postgresql-java-client:${DOCKER_POSTGRESQL_JAVA_CLIENT_TAG} + image: yandex/clickhouse-postgresql-java-client:${DOCKER_POSTGRESQL_JAVA_CLIENT_TAG:-latest} # to keep container running command: sleep infinity diff --git a/docker/test/performance-comparison/Dockerfile b/docker/test/performance-comparison/Dockerfile index 76cadc3ce11..8734e47e80f 100644 --- a/docker/test/performance-comparison/Dockerfile +++ b/docker/test/performance-comparison/Dockerfile @@ -25,12 +25,13 @@ RUN apt-get update \ python3 \ python3-dev \ python3-pip \ + python3-setuptools \ rsync \ tree \ tzdata \ vim \ wget \ - && pip3 --no-cache-dir install clickhouse_driver scipy \ + && pip3 
--no-cache-dir install 'git+https://github.com/mymarilyn/clickhouse-driver.git' scipy \ && apt-get purge --yes python3-dev g++ \ && apt-get autoremove --yes \ && apt-get clean \ diff --git a/docker/test/performance-comparison/perf.py b/docker/test/performance-comparison/perf.py index 337f13690b6..7175d0e4143 100755 --- a/docker/test/performance-comparison/perf.py +++ b/docker/test/performance-comparison/perf.py @@ -14,10 +14,12 @@ import string import sys import time import traceback +import logging import xml.etree.ElementTree as et from threading import Thread from scipy import stats +logging.basicConfig(format='%(asctime)s: %(levelname)s: %(module)s: %(message)s', level='WARNING') total_start_seconds = time.perf_counter() stage_start_seconds = total_start_seconds @@ -46,6 +48,8 @@ parser.add_argument('--profile-seconds', type=int, default=0, help='For how many parser.add_argument('--long', action='store_true', help='Do not skip the tests tagged as long.') parser.add_argument('--print-queries', action='store_true', help='Print test queries and exit.') parser.add_argument('--print-settings', action='store_true', help='Print test settings and exit.') +parser.add_argument('--keep-created-tables', action='store_true', help="Don't drop the created tables after the test.") +parser.add_argument('--use-existing-tables', action='store_true', help="Don't create or drop the tables, use the existing ones instead.") args = parser.parse_args() reportStageEnd('start') @@ -139,44 +143,37 @@ reportStageEnd('before-connect') # Open connections servers = [{'host': host or args.host[0], 'port': port or args.port[0]} for (host, port) in itertools.zip_longest(args.host, args.port)] -all_connections = [clickhouse_driver.Client(**server) for server in servers] +# Force settings_is_important to fail queries on unknown settings. +all_connections = [clickhouse_driver.Client(**server, settings_is_important=True) for server in servers] for i, s in enumerate(servers): print(f'server\t{i}\t{s["host"]}\t{s["port"]}') reportStageEnd('connect') -# Run drop queries, ignoring errors. Do this before all other activity, because -# clickhouse_driver disconnects on error (this is not configurable), and the new -# connection loses the changes in settings. -drop_query_templates = [q.text for q in root.findall('drop_query')] -drop_queries = substitute_parameters(drop_query_templates) -for conn_index, c in enumerate(all_connections): - for q in drop_queries: - try: - c.execute(q) - print(f'drop\t{conn_index}\t{c.last_query.elapsed}\t{tsv_escape(q)}') - except: - pass +if not args.use_existing_tables: + # Run drop queries, ignoring errors. Do this before all other activity, + # because clickhouse_driver disconnects on error (this is not configurable), + # and the new connection loses the changes in settings. + drop_query_templates = [q.text for q in root.findall('drop_query')] + drop_queries = substitute_parameters(drop_query_templates) + for conn_index, c in enumerate(all_connections): + for q in drop_queries: + try: + c.execute(q) + print(f'drop\t{conn_index}\t{c.last_query.elapsed}\t{tsv_escape(q)}') + except: + pass -reportStageEnd('drop-1') + reportStageEnd('drop-1') # Apply settings. -# If there are errors, report them and continue -- maybe a new test uses a setting -# that is not in master, but the queries can still run. If we have multiple -# settings and one of them throws an exception, all previous settings for this -# connection will be reset, because the driver reconnects on error (not -# configurable). 
So the end result is uncertain, but hopefully we'll be able to -# run at least some queries. settings = root.findall('settings/*') for conn_index, c in enumerate(all_connections): for s in settings: - try: - q = f"set {s.tag} = '{s.text}'" - c.execute(q) - print(f'set\t{conn_index}\t{c.last_query.elapsed}\t{tsv_escape(q)}') - except: - print(traceback.format_exc(), file=sys.stderr) + # requires clickhouse-driver >= 1.1.5 to accept arbitrary new settings + # (https://github.com/mymarilyn/clickhouse-driver/pull/142) + c.settings[s.tag] = s.text reportStageEnd('settings') @@ -194,37 +191,40 @@ for t in tables: reportStageEnd('preconditions') -# Run create and fill queries. We will run them simultaneously for both servers, -# to save time. -# The weird search is to keep the relative order of elements, which matters, and -# etree doesn't support the appropriate xpath query. -create_query_templates = [q.text for q in root.findall('./*') if q.tag in ('create_query', 'fill_query')] -create_queries = substitute_parameters(create_query_templates) +if not args.use_existing_tables: + # Run create and fill queries. We will run them simultaneously for both + # servers, to save time. The weird XML search + filter is because we want to + # keep the relative order of elements, and etree doesn't support the + # appropriate xpath query. + create_query_templates = [q.text for q in root.findall('./*') + if q.tag in ('create_query', 'fill_query')] + create_queries = substitute_parameters(create_query_templates) -# Disallow temporary tables, because the clickhouse_driver reconnects on errors, -# and temporary tables are destroyed. We want to be able to continue after some -# errors. -for q in create_queries: - if re.search('create temporary table', q, flags=re.IGNORECASE): - print(f"Temporary tables are not allowed in performance tests: '{q}'", - file = sys.stderr) - sys.exit(1) + # Disallow temporary tables, because the clickhouse_driver reconnects on + # errors, and temporary tables are destroyed. We want to be able to continue + # after some errors. + for q in create_queries: + if re.search('create temporary table', q, flags=re.IGNORECASE): + print(f"Temporary tables are not allowed in performance tests: '{q}'", + file = sys.stderr) + sys.exit(1) -def do_create(connection, index, queries): - for q in queries: - connection.execute(q) - print(f'create\t{index}\t{connection.last_query.elapsed}\t{tsv_escape(q)}') + def do_create(connection, index, queries): + for q in queries: + connection.execute(q) + print(f'create\t{index}\t{connection.last_query.elapsed}\t{tsv_escape(q)}') -threads = [Thread(target = do_create, args = (connection, index, create_queries)) - for index, connection in enumerate(all_connections)] + threads = [ + Thread(target = do_create, args = (connection, index, create_queries)) + for index, connection in enumerate(all_connections)] -for t in threads: - t.start() + for t in threads: + t.start() -for t in threads: - t.join() + for t in threads: + t.join() -reportStageEnd('create') + reportStageEnd('create') # By default, test all queries. 
queries_to_run = range(0, len(test_queries)) @@ -403,10 +403,11 @@ print(f'profile-total\t{profile_total_seconds}') reportStageEnd('run') # Run drop queries -drop_queries = substitute_parameters(drop_query_templates) -for conn_index, c in enumerate(all_connections): - for q in drop_queries: - c.execute(q) - print(f'drop\t{conn_index}\t{c.last_query.elapsed}\t{tsv_escape(q)}') +if not args.keep_created_tables and not args.use_existing_tables: + drop_queries = substitute_parameters(drop_query_templates) + for conn_index, c in enumerate(all_connections): + for q in drop_queries: + c.execute(q) + print(f'drop\t{conn_index}\t{c.last_query.elapsed}\t{tsv_escape(q)}') -reportStageEnd('drop-2') + reportStageEnd('drop-2') diff --git a/docker/test/pvs/Dockerfile b/docker/test/pvs/Dockerfile index 0aedb67e572..382b486dda3 100644 --- a/docker/test/pvs/Dockerfile +++ b/docker/test/pvs/Dockerfile @@ -10,6 +10,11 @@ RUN apt-get update --yes \ gpg-agent \ debsig-verify \ strace \ + protobuf-compiler \ + protobuf-compiler-grpc \ + libprotoc-dev \ + libgrpc++-dev \ + libc-ares-dev \ --yes --no-install-recommends #RUN wget -nv -O - http://files.viva64.com/etc/pubkey.txt | sudo apt-key add - @@ -33,7 +38,8 @@ RUN set -x \ && dpkg -i "${PKG_VERSION}.deb" CMD echo "Running PVS version $PKG_VERSION" && cd /repo_folder && pvs-studio-analyzer credentials $LICENCE_NAME $LICENCE_KEY -o ./licence.lic \ - && cmake . -D"ENABLE_EMBEDDED_COMPILER"=OFF && ninja re2_st \ + && cmake . -D"ENABLE_EMBEDDED_COMPILER"=OFF -D"USE_INTERNAL_PROTOBUF_LIBRARY"=OFF -D"USE_INTERNAL_GRPC_LIBRARY"=OFF \ + && ninja re2_st clickhouse_grpc_protos \ && pvs-studio-analyzer analyze -o pvs-studio.log -e contrib -j 4 -l ./licence.lic; \ plog-converter -a GA:1,2 -t fullhtml -o /test_output/pvs-studio-html-report pvs-studio.log; \ plog-converter -a GA:1,2 -t tasklist -o /test_output/pvs-studio-task-report.txt pvs-studio.log diff --git a/docker/test/stateful_with_coverage/Dockerfile b/docker/test/stateful_with_coverage/Dockerfile index f5d66ed5013..ac6645b9463 100644 --- a/docker/test/stateful_with_coverage/Dockerfile +++ b/docker/test/stateful_with_coverage/Dockerfile @@ -1,12 +1,12 @@ # docker build -t yandex/clickhouse-stateful-test-with-coverage . -FROM yandex/clickhouse-stateless-test +FROM yandex/clickhouse-stateless-test-with-coverage RUN echo "deb [trusted=yes] http://apt.llvm.org/bionic/ llvm-toolchain-bionic-9 main" >> /etc/apt/sources.list RUN apt-get update -y \ && env DEBIAN_FRONTEND=noninteractive \ apt-get install --yes --no-install-recommends \ - python3-requests + python3-requests procps psmisc COPY s3downloader /s3downloader COPY run.sh /run.sh diff --git a/docker/test/stateful_with_coverage/run.sh b/docker/test/stateful_with_coverage/run.sh index 7a21c397ce5..5fc6350fad8 100755 --- a/docker/test/stateful_with_coverage/run.sh +++ b/docker/test/stateful_with_coverage/run.sh @@ -1,40 +1,44 @@ #!/bin/bash kill_clickhouse () { - kill "$(pgrep -u clickhouse)" 2>/dev/null + echo "clickhouse pids $(pgrep -u clickhouse)" | ts '%Y-%m-%d %H:%M:%S' + pkill -f "clickhouse-server" 2>/dev/null - for _ in {1..10} + + for _ in {1..120} do - if ! kill -0 "$(pgrep -u clickhouse)"; then - echo "No clickhouse process" - break - else - echo "Process $(pgrep -u clickhouse) still alive" - sleep 10 - fi + if ! 
pkill -0 -f "clickhouse-server" ; then break ; fi + echo "ClickHouse still alive" | ts '%Y-%m-%d %H:%M:%S' + sleep 1 done + + if pkill -0 -f "clickhouse-server" + then + pstree -apgT + jobs + echo "Failed to kill the ClickHouse server" | ts '%Y-%m-%d %H:%M:%S' + return 1 + fi } start_clickhouse () { LLVM_PROFILE_FILE='server_%h_%p_%m.profraw' sudo -Eu clickhouse /usr/bin/clickhouse-server --config /etc/clickhouse-server/config.xml & -} - -wait_llvm_profdata () { - while kill -0 "$(pgrep llvm-profdata-10)" + counter=0 + until clickhouse-client --query "SELECT 1" do - echo "Waiting for profdata $(pgrep llvm-profdata-10) still alive" - sleep 3 + if [ "$counter" -gt 120 ] + then + echo "Cannot start clickhouse-server" + cat /var/log/clickhouse-server/stdout.log + tail -n1000 /var/log/clickhouse-server/stderr.log + tail -n1000 /var/log/clickhouse-server/clickhouse-server.log + break + fi + sleep 0.5 + counter=$((counter + 1)) done } -merge_client_files_in_background () { - client_files=$(ls /client_*profraw 2>/dev/null) - if [ -n "$client_files" ] - then - llvm-profdata-10 merge -sparse "$client_files" -o "merged_client_$(date +%s).profraw" - rm "$client_files" - fi -} chmod 777 / @@ -51,26 +55,7 @@ chmod 777 -R /var/log/clickhouse-server/ # install test configs /usr/share/clickhouse-test/config/install.sh -function start() -{ - counter=0 - until clickhouse-client --query "SELECT 1" - do - if [ "$counter" -gt 120 ] - then - echo "Cannot start clickhouse-server" - cat /var/log/clickhouse-server/stdout.log - tail -n1000 /var/log/clickhouse-server/stderr.log - tail -n1000 /var/log/clickhouse-server/clickhouse-server.log - break - fi - timeout 120 service clickhouse-server start - sleep 0.5 - counter=$((counter + 1)) - done -} - -start +start_clickhouse # shellcheck disable=SC2086 # No quotes because I want to split it into words. if ! 
/s3downloader --dataset-names $DATASETS; then @@ -81,10 +66,6 @@ fi chmod 777 -R /var/lib/clickhouse -while /bin/true; do - merge_client_files_in_background - sleep 2 -done & LLVM_PROFILE_FILE='client_coverage.profraw' clickhouse-client --query "SHOW DATABASES" LLVM_PROFILE_FILE='client_coverage.profraw' clickhouse-client --query "ATTACH DATABASE datasets ENGINE = Ordinary" @@ -93,14 +74,13 @@ LLVM_PROFILE_FILE='client_coverage.profraw' clickhouse-client --query "CREATE DA kill_clickhouse start_clickhouse -sleep 10 - LLVM_PROFILE_FILE='client_coverage.profraw' clickhouse-client --query "SHOW TABLES FROM datasets" LLVM_PROFILE_FILE='client_coverage.profraw' clickhouse-client --query "SHOW TABLES FROM test" LLVM_PROFILE_FILE='client_coverage.profraw' clickhouse-client --query "RENAME TABLE datasets.hits_v1 TO test.hits" LLVM_PROFILE_FILE='client_coverage.profraw' clickhouse-client --query "RENAME TABLE datasets.visits_v1 TO test.visits" LLVM_PROFILE_FILE='client_coverage.profraw' clickhouse-client --query "SHOW TABLES FROM test" + if grep -q -- "--use-skip-list" /usr/bin/clickhouse-test; then SKIP_LIST_OPT="--use-skip-list" fi @@ -113,11 +93,6 @@ LLVM_PROFILE_FILE='client_coverage.profraw' clickhouse-test --testname --shard - kill_clickhouse -wait_llvm_profdata - sleep 3 -wait_llvm_profdata # 100% merged all parts - - cp /*.profraw /profraw ||: diff --git a/docker/test/stateless_with_coverage/Dockerfile b/docker/test/stateless_with_coverage/Dockerfile index 1d6a85adf9e..f7379ba5568 100644 --- a/docker/test/stateless_with_coverage/Dockerfile +++ b/docker/test/stateless_with_coverage/Dockerfile @@ -1,4 +1,4 @@ -# docker build -t yandex/clickhouse-stateless-with-coverage-test . +# docker build -t yandex/clickhouse-stateless-test-with-coverage . # TODO: that can be based on yandex/clickhouse-stateless-test (llvm version and CMD differs) FROM yandex/clickhouse-test-base @@ -28,7 +28,9 @@ RUN apt-get update -y \ lsof \ unixodbc \ wget \ - qemu-user-static + qemu-user-static \ + procps \ + psmisc RUN mkdir -p /tmp/clickhouse-odbc-tmp \ && wget -nv -O - ${odbc_driver_url} | tar --strip-components=1 -xz -C /tmp/clickhouse-odbc-tmp \ diff --git a/docker/test/stateless_with_coverage/run.sh b/docker/test/stateless_with_coverage/run.sh index 758591df618..4e4d9430a11 100755 --- a/docker/test/stateless_with_coverage/run.sh +++ b/docker/test/stateless_with_coverage/run.sh @@ -2,27 +2,41 @@ kill_clickhouse () { echo "clickhouse pids $(pgrep -u clickhouse)" | ts '%Y-%m-%d %H:%M:%S' - kill "$(pgrep -u clickhouse)" 2>/dev/null + pkill -f "clickhouse-server" 2>/dev/null - for _ in {1..10} + + for _ in {1..120} do - if ! kill -0 "$(pgrep -u clickhouse)"; then - echo "No clickhouse process" | ts '%Y-%m-%d %H:%M:%S' - break - else - echo "Process $(pgrep -u clickhouse) still alive" | ts '%Y-%m-%d %H:%M:%S' - sleep 10 - fi + if ! 
pkill -0 -f "clickhouse-server" ; then break ; fi + echo "ClickHouse still alive" | ts '%Y-%m-%d %H:%M:%S' + sleep 1 done - echo "Will try to send second kill signal for sure" - kill "$(pgrep -u clickhouse)" 2>/dev/null - sleep 5 - echo "clickhouse pids $(pgrep -u clickhouse)" | ts '%Y-%m-%d %H:%M:%S' + if pkill -0 -f "clickhouse-server" + then + pstree -apgT + jobs + echo "Failed to kill the ClickHouse server" | ts '%Y-%m-%d %H:%M:%S' + return 1 + fi } start_clickhouse () { LLVM_PROFILE_FILE='server_%h_%p_%m.profraw' sudo -Eu clickhouse /usr/bin/clickhouse-server --config /etc/clickhouse-server/config.xml & + counter=0 + until clickhouse-client --query "SELECT 1" + do + if [ "$counter" -gt 120 ] + then + echo "Cannot start clickhouse-server" + cat /var/log/clickhouse-server/stdout.log + tail -n1000 /var/log/clickhouse-server/stderr.log + tail -n1000 /var/log/clickhouse-server/clickhouse-server.log + break + fi + sleep 0.5 + counter=$((counter + 1)) + done } chmod 777 / @@ -44,9 +58,6 @@ chmod 777 -R /var/log/clickhouse-server/ start_clickhouse -sleep 10 - - if grep -q -- "--use-skip-list" /usr/bin/clickhouse-test; then SKIP_LIST_OPT="--use-skip-list" fi diff --git a/docs/_includes/cmake_in_clickhouse_header.md b/docs/_includes/cmake_in_clickhouse_header.md index 10776e04c01..7dfda35e34a 100644 --- a/docs/_includes/cmake_in_clickhouse_header.md +++ b/docs/_includes/cmake_in_clickhouse_header.md @@ -13,9 +13,9 @@ cmake .. \ -DENABLE_CLICKHOUSE_SERVER=ON \ -DENABLE_CLICKHOUSE_CLIENT=ON \ -DUSE_STATIC_LIBRARIES=OFF \ - -DCLICKHOUSE_SPLIT_BINARY=ON \ -DSPLIT_SHARED_LIBRARIES=ON \ -DENABLE_LIBRARIES=OFF \ + -DUSE_UNWIND=ON \ -DENABLE_UTILS=OFF \ -DENABLE_TESTS=OFF ``` diff --git a/docs/en/development/contrib.md b/docs/en/development/contrib.md index 639b78185e4..76a2f647231 100644 --- a/docs/en/development/contrib.md +++ b/docs/en/development/contrib.md @@ -17,7 +17,6 @@ toc_title: Third-Party Libraries Used | googletest | [BSD 3-Clause License](https://github.com/google/googletest/blob/master/LICENSE) | | h3 | [Apache License 2.0](https://github.com/uber/h3/blob/master/LICENSE) | | hyperscan | [BSD 3-Clause License](https://github.com/intel/hyperscan/blob/master/LICENSE) | -| libbtrie | [BSD 2-Clause License](https://github.com/ClickHouse/ClickHouse/blob/master/contrib/libbtrie/LICENSE) | | libcxxabi | [BSD + MIT](https://github.com/ClickHouse/ClickHouse/blob/master/libs/libglibc-compatibility/libcxxabi/LICENSE.TXT) | | libdivide | [Zlib License](https://github.com/ClickHouse/ClickHouse/blob/master/contrib/libdivide/LICENSE.txt) | | libgsasl | [LGPL v2.1](https://github.com/ClickHouse-Extras/libgsasl/blob/3b8948a4042e34fb00b4fb987535dc9e02e39040/LICENSE) | diff --git a/docs/en/engines/table-engines/mergetree-family/collapsingmergetree.md b/docs/en/engines/table-engines/mergetree-family/collapsingmergetree.md index 4bfb9dc200e..ea0b265d652 100644 --- a/docs/en/engines/table-engines/mergetree-family/collapsingmergetree.md +++ b/docs/en/engines/table-engines/mergetree-family/collapsingmergetree.md @@ -273,13 +273,15 @@ SELECT sum(Duration) AS Duration FROM UAct GROUP BY UserID -```text +``` + +``` text ┌──────────────UserID─┬─PageViews─┬─Duration─┐ │ 4324182021466249494 │ 6 │ 185 │ └─────────────────────┴───────────┴──────────┘ ``` -``` sqk +``` sql select count() FROM UAct ``` diff --git a/docs/en/engines/table-engines/mergetree-family/replacingmergetree.md b/docs/en/engines/table-engines/mergetree-family/replacingmergetree.md index 684e7e28112..b82bc65afc2 100644 --- 
a/docs/en/engines/table-engines/mergetree-family/replacingmergetree.md +++ b/docs/en/engines/table-engines/mergetree-family/replacingmergetree.md @@ -5,7 +5,7 @@ toc_title: ReplacingMergeTree # ReplacingMergeTree {#replacingmergetree} -The engine differs from [MergeTree](../../../engines/table-engines/mergetree-family/mergetree.md#table_engines-mergetree) in that it removes duplicate entries with the same [sorting key](../../../engines/table-engines/mergetree-family/mergetree.md) value. +The engine differs from [MergeTree](../../../engines/table-engines/mergetree-family/mergetree.md#table_engines-mergetree) in that it removes duplicate entries with the same [sorting key](../../../engines/table-engines/mergetree-family/mergetree.md) value (`ORDER BY` table section, not `PRIMARY KEY`). Data deduplication occurs only during a merge. Merging occurs in the background at an unknown time, so you can’t plan for it. Some of the data may remain unprocessed. Although you can run an unscheduled merge using the `OPTIMIZE` query, don’t count on using it, because the `OPTIMIZE` query will read and write a large amount of data. @@ -29,13 +29,16 @@ CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster] For a description of request parameters, see [statement description](../../../sql-reference/statements/create/table.md). +!!! note "Attention" + Uniqueness of rows is determined by the `ORDER BY` table section, not `PRIMARY KEY`. + **ReplacingMergeTree Parameters** - `ver` — column with version. Type `UInt*`, `Date` or `DateTime`. Optional parameter. When merging, `ReplacingMergeTree` from all the rows with the same sorting key leaves only one: - - Last in the selection, if `ver` not set. + - The last in the selection, if `ver` not set. A selection is a set of rows in a set of parts participating in the merge. The most recently created part (the last insert) will be the last one in the selection. Thus, after deduplication, the very last row from the most recent insert will remain for each unique sorting key. - With the maximum version, if `ver` specified. **Query clauses** diff --git a/docs/en/engines/table-engines/mergetree-family/replication.md b/docs/en/engines/table-engines/mergetree-family/replication.md index 932facc9ddc..625869a3cb8 100644 --- a/docs/en/engines/table-engines/mergetree-family/replication.md +++ b/docs/en/engines/table-engines/mergetree-family/replication.md @@ -53,6 +53,42 @@ Example of setting the addresses of the ZooKeeper cluster: ``` +ClickHouse also supports storing replica meta information in an auxiliary ZooKeeper cluster by providing the ZooKeeper cluster name and path as engine arguments. +In other words, the metadata of different tables can be stored in different ZooKeeper clusters. + +Example of setting the addresses of the auxiliary ZooKeeper cluster: + +``` xml +<auxiliary_zookeepers> + <zookeeper2> + <node> + <host>example_2_1</host> + <port>2181</port> + </node> + <node> + <host>example_2_2</host> + <port>2181</port> + </node> + <node> + <host>example_2_3</host> + <port>2181</port> + </node> + </zookeeper2> + <zookeeper3> + <node> + <host>example_3_1</host> + <port>2181</port> + </node> + </zookeeper3> +</auxiliary_zookeepers> +``` + +To store table metadata in an auxiliary ZooKeeper cluster instead of the default ZooKeeper cluster, use SQL like the following to create a table with the +ReplicatedMergeTree engine: + +``` +CREATE TABLE table_name ( ... ) ENGINE = ReplicatedMergeTree('zookeeper_name_configured_in_auxiliary_zookeepers:path', 'replica_name') ... +``` You can specify any existing ZooKeeper cluster and the system will use a directory on it for its own data (the directory is specified when creating a replicatable table). 
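As a concrete illustration of the template above, the following sketch assumes the auxiliary cluster is named `zookeeper2`, as in the configuration example; the database, table, ZooKeeper path and replica macro are illustrative placeholders.

``` sql
-- Sketch only: replication metadata for this table is kept in the auxiliary
-- cluster "zookeeper2" declared above; path and replica name are placeholders.
CREATE TABLE test.events
(
    event_date Date,
    event_id UInt64
)
ENGINE = ReplicatedMergeTree('zookeeper2:/clickhouse/tables/01/events', '{replica}')
PARTITION BY toYYYYMM(event_date)
ORDER BY (event_date, event_id)
```

If the named auxiliary cluster is missing from the server configuration, such a statement can be expected to fail, so the `zookeeper2` section has to be present on every replica.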
If ZooKeeper isn’t set in the config file, you can’t create replicated tables, and any existing replicated tables will be read-only. @@ -152,7 +188,7 @@ You can specify default arguments for `Replicated` table engine in the server co ```xml /clickhouse/tables/{shard}/{database}/{table} -{replica} +{replica} ``` In this case, you can omit arguments when creating tables: diff --git a/docs/en/getting-started/tutorial.md b/docs/en/getting-started/tutorial.md index 8d41279fef9..3e051456a75 100644 --- a/docs/en/getting-started/tutorial.md +++ b/docs/en/getting-started/tutorial.md @@ -11,7 +11,7 @@ By going through this tutorial, you’ll learn how to set up a simple ClickHouse ## Single Node Setup {#single-node-setup} -To postpone the complexities of a distributed environment, we’ll start with deploying ClickHouse on a single server or virtual machine. ClickHouse is usually installed from [deb](../getting-started/install.md#install-from-deb-packages) or [rpm](../getting-started/install.md#from-rpm-packages) packages, but there are [alternatives](../getting-started/install.md#from-docker-image) for the operating systems that do no support them. +To postpone the complexities of a distributed environment, we’ll start with deploying ClickHouse on a single server or virtual machine. ClickHouse is usually installed from [deb](../getting-started/install.md#install-from-deb-packages) or [rpm](../getting-started/install.md#from-rpm-packages) packages, but there are [alternatives](../getting-started/install.md#from-docker-image) for the operating systems that do not support them. For example, you have chosen `deb` packages and executed: diff --git a/docs/en/index.md b/docs/en/index.md index 8280d5c9f97..676fd444995 100644 --- a/docs/en/index.md +++ b/docs/en/index.md @@ -5,7 +5,7 @@ toc_title: Overview # What Is ClickHouse? {#what-is-clickhouse} -ClickHouse is a column-oriented database management system (DBMS) for online analytical processing of queries (OLAP). +ClickHouse® is a column-oriented database management system (DBMS) for online analytical processing of queries (OLAP). In a “normal” row-oriented DBMS, data is stored in this order: diff --git a/docs/en/interfaces/formats.md b/docs/en/interfaces/formats.md index d310705d1c1..618ae374e8a 100644 --- a/docs/en/interfaces/formats.md +++ b/docs/en/interfaces/formats.md @@ -25,6 +25,7 @@ The supported formats are: | [Vertical](#vertical) | ✗ | ✔ | | [VerticalRaw](#verticalraw) | ✗ | ✔ | | [JSON](#json) | ✗ | ✔ | +| [JSONAsString](#jsonasstring) | ✔ | ✗ | | [JSONString](#jsonstring) | ✗ | ✔ | | [JSONCompact](#jsoncompact) | ✗ | ✔ | | [JSONCompactString](#jsoncompactstring) | ✗ | ✔ | @@ -507,6 +508,34 @@ Example: } ``` +## JSONAsString {#jsonasstring} + +In this format, a single JSON object is interpreted as a single value. If the input has several JSON objects (comma separated), they are interpreted as separate rows. + +This format can only be parsed for a table with a single field of type [String](../sql-reference/data-types/string.md). The remaining columns must be set to [DEFAULT](../sql-reference/statements/create/table.md#default) or [MATERIALIZED](../sql-reference/statements/create/table.md#materialized), or omitted. Once you collect the whole JSON object into a string, you can use [JSON functions](../sql-reference/functions/json-functions.md) to process it. 
+ +**Example** + +Query: + +``` sql +DROP TABLE IF EXISTS json_as_string; +CREATE TABLE json_as_string (json String) ENGINE = Memory; +INSERT INTO json_as_string FORMAT JSONAsString {"foo":{"bar":{"x":"y"},"baz":1}},{},{"any json stucture":1} +SELECT * FROM json_as_string; +``` + +Result: + +``` text +┌─json──────────────────────────────┐ +│ {"foo":{"bar":{"x":"y"},"baz":1}} │ +│ {} │ +│ {"any json stucture":1} │ +└───────────────────────────────────┘ +``` + + ## JSONCompact {#jsoncompact} ## JSONCompactString {#jsoncompactstring} diff --git a/docs/en/introduction/adopters.md b/docs/en/introduction/adopters.md index 1cffead788a..b365bd880ac 100644 --- a/docs/en/introduction/adopters.md +++ b/docs/en/introduction/adopters.md @@ -23,6 +23,7 @@ toc_title: Adopters | BIGO | Video | Computing Platform | — | — | [Blog Article, August 2020](https://www.programmersought.com/article/44544895251/) | | Bloomberg | Finance, Media | Monitoring | 102 servers | — | [Slides, May 2018](https://www.slideshare.net/Altinity/http-analytics-for-6m-requests-per-second-using-clickhouse-by-alexander-bocharov) | | Bloxy | Blockchain | Analytics | — | — | [Slides in Russian, August 2018](https://github.com/ClickHouse/clickhouse-presentations/blob/master/meetup17/4_bloxy.pptx) | +| Bytedance | Social platforms | — | — | — | [The ClickHouse Meetup East, October 2020](https://www.youtube.com/watch?v=ckChUkC3Pns) | | CardsMobile | Finance | Analytics | — | — | [VC.ru](https://vc.ru/s/cardsmobile/143449-rukovoditel-gruppy-analiza-dannyh) | | CARTO | Business Intelligence | Geo analytics | — | — | [Geospatial processing with ClickHouse](https://carto.com/blog/geospatial-processing-with-clickhouse/) | | CERN | Research | Experiment | — | — | [Press release, April 2012](https://www.yandex.com/company/press_center/press_releases/2012/2012-04-10/) | @@ -96,6 +97,7 @@ toc_title: Adopters | Splunk | Business Analytics | Main product | — | — | [Slides in English, January 2018](https://github.com/ClickHouse/clickhouse-presentations/blob/master/meetup12/splunk.pdf) | | Spotify | Music | Experimentation | — | — | [Slides, July 2018](https://www.slideshare.net/glebus/using-clickhouse-for-experimentation-104247173) | | Staffcop | Information Security | Main Product | — | — | [Official website, Documentation](https://www.staffcop.ru/sce43) | +| Suning | E-Commerce | User behaviour analytics | — | — | [Blog article](https://www.sohu.com/a/434152235_411876) | | Teralytics | Mobility | Analytics | — | — | [Tech blog](https://www.teralytics.net/knowledge-hub/visualizing-mobility-data-the-scalability-challenge) | | Tencent | Big Data | Data processing | — | — | [Slides in Chinese, October 2018](https://github.com/ClickHouse/clickhouse-presentations/blob/master/meetup19/5.%20ClickHouse大数据集群应用_李俊飞腾讯网媒事业部.pdf) | | Tencent | Messaging | Logging | — | — | [Talk in Chinese, November 2019](https://youtu.be/T-iVQRuw-QY?t=5050) | @@ -111,7 +113,7 @@ toc_title: Adopters | Yandex Cloud | Public Cloud | Main product | — | — | [Talk in Russian, December 2019](https://www.youtube.com/watch?v=pgnak9e_E0o) | | Yandex DataLens | Business Intelligence | Main product | — | — | [Slides in Russian, December 2019](https://presentations.clickhouse.tech/meetup38/datalens.pdf) | | Yandex Market | e-Commerce | Metrics, Logging | — | — | [Talk in Russian, January 2019](https://youtu.be/_l1qP0DyBcA?t=478) | -| Yandex Metrica | Web analytics | Main product | 360 servers in one cluster, 1862 servers in one department | 66.41 PiB / 5.68 PiB | [Slides, February 
2020](https://presentations.clickhouse.tech/meetup40/introduction/#13) | +| Yandex Metrica | Web analytics | Main product | 630 servers in one cluster, 360 servers in another cluster, 1862 servers in one department | 133 PiB / 8.31 PiB / 120 trillion records | [Slides, February 2020](https://presentations.clickhouse.tech/meetup40/introduction/#13) | | ЦВТ | Software Development | Metrics, Logging | — | — | [Blog Post, March 2019, in Russian](https://vc.ru/dev/62715-kak-my-stroili-monitoring-na-prometheus-clickhouse-i-elk) | | МКБ | Bank | Web-system monitoring | — | — | [Slides in Russian, September 2019](https://github.com/ClickHouse/clickhouse-presentations/blob/master/meetup28/mkb.pdf) | | ЦФТ | Banking, Financial products, Payments | — | — | — | [Meetup in Russian, April 2020](https://team.cft.ru/events/162) | diff --git a/docs/en/operations/opentelemetry.md b/docs/en/operations/opentelemetry.md index 45533d3733f..2afeabc7956 100644 --- a/docs/en/operations/opentelemetry.md +++ b/docs/en/operations/opentelemetry.md @@ -44,11 +44,10 @@ stages, such as query planning or distributed queries. To be useful, the tracing information has to be exported to a monitoring system that supports OpenTelemetry, such as Jaeger or Prometheus. ClickHouse avoids -a dependency on a particular monitoring system, instead only -providing the tracing data conforming to the standard. A natural way to do so -in an SQL RDBMS is a system table. OpenTelemetry trace span information +a dependency on a particular monitoring system, instead only providing the +tracing data through a system table. OpenTelemetry trace span information [required by the standard](https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/overview.md#span) -is stored in the system table called `system.opentelemetry_span_log`. +is stored in the `system.opentelemetry_span_log` table. The table must be enabled in the server configuration, see the `opentelemetry_span_log` element in the default config file `config.xml`. It is enabled by default. @@ -67,3 +66,31 @@ The table has the following columns: The tags or attributes are saved as two parallel arrays, containing the keys and values. Use `ARRAY JOIN` to work with them. + +## Integration with monitoring systems + +At the moment, there is no ready tool that can export the tracing data from +ClickHouse to a monitoring system. + +For testing, it is possible to setup the export using a materialized view with the URL engine over the `system.opentelemetry_span_log` table, which would push the arriving log data to an HTTP endpoint of a trace collector. For example, to push the minimal span data to a Zipkin instance running at `http://localhost:9411`, in Zipkin v2 JSON format: + +```sql +CREATE MATERIALIZED VIEW default.zipkin_spans +ENGINE = URL('http://127.0.0.1:9411/api/v2/spans', 'JSONEachRow') +SETTINGS output_format_json_named_tuples_as_objects = 1, + output_format_json_array_of_rows = 1 AS +SELECT + lower(hex(reinterpretAsFixedString(trace_id))) AS traceId, + lower(hex(parent_span_id)) AS parentId, + lower(hex(span_id)) AS id, + operation_name AS name, + start_time_us AS timestamp, + finish_time_us - start_time_us AS duration, + cast(tuple('clickhouse'), 'Tuple(serviceName text)') AS localEndpoint, + cast(tuple( + attribute.values[indexOf(attribute.names, 'db.statement')]), + 'Tuple("db.statement" text)') AS tags +FROM system.opentelemetry_span_log +``` + +In case of any errors, the part of the log data for which the error has occurred will be silently lost. 
Check the server log for error messages if the data does not arrive.
diff --git a/docs/en/operations/server-configuration-parameters/settings.md b/docs/en/operations/server-configuration-parameters/settings.md
index e111cf3ab75..533fcea5500 100644
--- a/docs/en/operations/server-configuration-parameters/settings.md
+++ b/docs/en/operations/server-configuration-parameters/settings.md
@@ -139,7 +139,7 @@ Lazy loading of dictionaries.
If `true`, then each dictionary is created on first use. If dictionary creation failed, the function that was using the dictionary throws an exception.
-If `false`, all dictionaries are created when the server starts, and if there is an error, the server shuts down.
+If `false`, all dictionaries are created when the server starts. If a dictionary takes too long to create or is created with an error, the server boots without that dictionary and keeps trying to create it.
The default is `true`.
diff --git a/docs/en/operations/settings/settings.md b/docs/en/operations/settings/settings.md
index ba899754b18..8346d5ceac9 100644
--- a/docs/en/operations/settings/settings.md
+++ b/docs/en/operations/settings/settings.md
@@ -2293,6 +2293,47 @@ Result:
└─────────────────────────┴─────────┘
```
+## system_events_show_zero_values {#system_events_show_zero_values}
+
+Allows selecting zero-valued events from [`system.events`](../../operations/system-tables/events.md).
+
+Some monitoring systems require passing all the metrics values to them for each checkpoint, even if the metric value is zero.
+
+Possible values:
+
+- 0 — Disabled.
+- 1 — Enabled.
+
+Default value: `0`.
+
+**Examples**
+
+Query
+
+```sql
+SELECT * FROM system.events WHERE event='QueryMemoryLimitExceeded';
+```
+
+Result
+
+```text
+Ok.
+```
+
+Query
+```sql
+SET system_events_show_zero_values = 1;
+SELECT * FROM system.events WHERE event='QueryMemoryLimitExceeded';
+```
+
+Result
+
+```text
+┌─event────────────────────┬─value─┬─description───────────────────────────────────────────┐
+│ QueryMemoryLimitExceeded │ 0 │ Number of times when memory limit exceeded for query. │
+└──────────────────────────┴───────┴───────────────────────────────────────────────────────┘
+```
+
## allow_experimental_bigint_types {#allow_experimental_bigint_types}
Enables or disables integer values exceeding the range that is supported by the int data type.
diff --git a/docs/en/operations/system-tables/databases.md b/docs/en/operations/system-tables/databases.md
index 84b696a3bf8..8ef5551d9b0 100644
--- a/docs/en/operations/system-tables/databases.md
+++ b/docs/en/operations/system-tables/databases.md
@@ -1,9 +1,38 @@
# system.databases {#system-databases}
-This table contains a single String column called ‘name’ – the name of a database.
+Contains information about the databases that are available to the current user.
-Each database that the server knows about has a corresponding entry in the table.
+Columns:
-This system table is used for implementing the `SHOW DATABASES` query.
+- `name` ([String](../../sql-reference/data-types/string.md)) — Database name.
+- `engine` ([String](../../sql-reference/data-types/string.md)) — [Database engine](../../engines/database-engines/index.md).
+- `data_path` ([String](../../sql-reference/data-types/string.md)) — Data path.
+- `metadata_path` ([String](../../sql-reference/data-types/string.md)) — Metadata path.
+- `uuid` ([UUID](../../sql-reference/data-types/uuid.md)) — Database UUID.
-[Original article](https://clickhouse.tech/docs/en/operations/system_tables/databases) \ No newline at end of file +The `name` column from this system table is used for implementing the `SHOW DATABASES` query. + +**Example** + +Create a database. + +``` sql +CREATE DATABASE test +``` + +Check all of the available databases to the user. + +``` sql +SELECT * FROM system.databases +``` + +``` text +┌─name───────────────────────────┬─engine─┬─data_path──────────────────┬─metadata_path───────────────────────────────────────────────────────┬─────────────────────────────────uuid─┐ +│ _temporary_and_external_tables │ Memory │ /var/lib/clickhouse/ │ │ 00000000-0000-0000-0000-000000000000 │ +│ default │ Atomic │ /var/lib/clickhouse/store/ │ /var/lib/clickhouse/store/d31/d317b4bd-3595-4386-81ee-c2334694128a/ │ d317b4bd-3595-4386-81ee-c2334694128a │ +│ test │ Atomic │ /var/lib/clickhouse/store/ │ /var/lib/clickhouse/store/39b/39bf0cc5-4c06-4717-87fe-c75ff3bd8ebb/ │ 39bf0cc5-4c06-4717-87fe-c75ff3bd8ebb │ +│ system │ Atomic │ /var/lib/clickhouse/store/ │ /var/lib/clickhouse/store/1d1/1d1c869d-e465-4b1b-a51f-be033436ebf9/ │ 1d1c869d-e465-4b1b-a51f-be033436ebf9 │ +└────────────────────────────────┴────────┴────────────────────────────┴─────────────────────────────────────────────────────────────────────┴──────────────────────────────────────┘ +``` + +[Original article](https://clickhouse.tech/docs/en/operations/system_tables/databases) diff --git a/docs/en/operations/utilities/clickhouse-copier.md b/docs/en/operations/utilities/clickhouse-copier.md index ec5a619b86b..4137bd6f334 100644 --- a/docs/en/operations/utilities/clickhouse-copier.md +++ b/docs/en/operations/utilities/clickhouse-copier.md @@ -70,11 +70,21 @@ Parameters: + false 127.0.0.1 9000 + ... diff --git a/docs/en/operations/utilities/clickhouse-obfuscator.md b/docs/en/operations/utilities/clickhouse-obfuscator.md index 8a2ea1eecf6..7fd608fcac0 100644 --- a/docs/en/operations/utilities/clickhouse-obfuscator.md +++ b/docs/en/operations/utilities/clickhouse-obfuscator.md @@ -1,42 +1,42 @@ -# ClickHouse obfuscator - -Simple tool for table data obfuscation. - -It reads input table and produces output table, that retain some properties of input, but contains different data. -It allows to publish almost real production data for usage in benchmarks. - -It is designed to retain the following properties of data: -- cardinalities of values (number of distinct values) for every column and for every tuple of columns; -- conditional cardinalities: number of distinct values of one column under condition on value of another column; -- probability distributions of absolute value of integers; sign of signed integers; exponent and sign for floats; -- probability distributions of length of strings; -- probability of zero values of numbers; empty strings and arrays, NULLs; -- data compression ratio when compressed with LZ77 and entropy family of codecs; -- continuity (magnitude of difference) of time values across table; continuity of floating point values. -- date component of DateTime values; -- UTF-8 validity of string values; -- string values continue to look somewhat natural. - -Most of the properties above are viable for performance testing: - -reading data, filtering, aggregation and sorting will work at almost the same speed -as on original data due to saved cardinalities, magnitudes, compression ratios, etc. - -It works in deterministic fashion: you define a seed value and transform is totally determined by input data and by seed. 
-Some transforms are one to one and could be reversed, so you need to have large enough seed and keep it in secret.
-
-It use some cryptographic primitives to transform data, but from the cryptographic point of view,
-It doesn't do anything properly and you should never consider the result as secure, unless you have other reasons for it.
-
-It may retain some data you don't want to publish.
-
-It always leave numbers 0, 1, -1 as is. Also it leaves dates, lengths of arrays and null flags exactly as in source data.
-For example, you have a column IsMobile in your table with values 0 and 1. In transformed data, it will have the same value.
-So, the user will be able to count exact ratio of mobile traffic.
-
-Another example, suppose you have some private data in your table, like user email and you don't want to publish any single email address.
-If your table is large enough and contain multiple different emails and there is no email that have very high frequency than all others,
-It will perfectly anonymize all data. But if you have small amount of different values in a column, it can possibly reproduce some of them.
-And you should take care and look at exact algorithm, how this tool works, and probably fine tune some of it command line parameters.
-
-This tool works fine only with reasonable amount of data (at least 1000s of rows).
+# ClickHouse obfuscator
+
+A simple tool for table data obfuscation.
+
+It reads an input table and produces an output table that retains some properties of the input but contains different data.
+It allows publishing almost real production data for usage in benchmarks.
+
+It is designed to retain the following properties of data:
+- cardinalities of values (number of distinct values) for every column and every tuple of columns;
+- conditional cardinalities: number of distinct values of one column under the condition on the value of another column;
+- probability distributions of the absolute value of integers; the sign of signed integers; exponent and sign for floats;
+- probability distributions of the length of strings;
+- probability of zero values of numbers; empty strings and arrays, `NULL`s;
+
+- data compression ratio when compressed with LZ77 and entropy family of codecs;
+- continuity (magnitude of difference) of time values across the table; continuity of floating-point values;
+- date component of `DateTime` values;
+
+- UTF-8 validity of string values;
+- string values look natural.
+
+Most of the properties above are viable for performance testing:
+
+reading data, filtering, aggregation, and sorting will work at almost the same speed
+as on original data due to saved cardinalities, magnitudes, compression ratios, etc.
+
+It works in a deterministic fashion: you define a seed value and the transformation is determined by input data and by seed.
+Some transformations are one to one and could be reversed, so you need to have a large seed and keep it secret.
+
+It uses some cryptographic primitives to transform data, but it is not cryptographically sound, so you should not consider the result secure unless you have another reason to. The result may retain some data you don't want to publish.
+
+
+It always leaves the numbers 0, 1, and -1, as well as dates, lengths of arrays, and null flags, exactly as in the source data.
+For example, if you have a column `IsMobile` in your table with values 0 and 1, it will have the same values in the transformed data.
+
+So, the user will be able to count the exact ratio of mobile traffic.
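+
+For example, you can check that such properties are preserved by comparing simple aggregates on the original and on the obfuscated table. In the sketch below the table names `hits_original` and `hits_obfuscated` are only placeholders:
+
+``` sql
+-- Cardinality and zero-value ratio of the IsMobile column should match
+-- between the original table and its obfuscated copy.
+SELECT uniqExact(IsMobile) AS distinct_values, avg(IsMobile = 0) AS zero_ratio FROM hits_original;
+SELECT uniqExact(IsMobile) AS distinct_values, avg(IsMobile = 0) AS zero_ratio FROM hits_obfuscated;
+```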
+
+Let's give another example. Suppose you have some private data in your table, like user emails, and you don't want to publish any single email address.
+If your table is large enough and contains many different emails, and no email has a much higher frequency than the others, it will anonymize all data. But if you have a small number of different values in a column, it can reproduce some of them.
+In that case you should study how this tool works and fine-tune its command-line parameters.
+
+This tool works fine only with a reasonable amount of data (at least thousands of rows).
diff --git a/docs/en/sql-reference/aggregate-functions/index.md b/docs/en/sql-reference/aggregate-functions/index.md
index 270b7d8db39..543a5d3fed8 100644
--- a/docs/en/sql-reference/aggregate-functions/index.md
+++ b/docs/en/sql-reference/aggregate-functions/index.md
@@ -44,8 +44,6 @@ SELECT sum(y) FROM t_null_big
└────────┘
```
-The `sum` function interprets `NULL` as `0`. In particular, this means that if the function receives input of a selection where all the values are `NULL`, then the result will be `0`, not `NULL`.
-
Now you can use the `groupArray` function to create an array from the `y` column:
``` sql
diff --git a/docs/en/sql-reference/aggregate-functions/reference/avg.md b/docs/en/sql-reference/aggregate-functions/reference/avg.md
index 4ebae95b79d..e2e6aace734 100644
--- a/docs/en/sql-reference/aggregate-functions/reference/avg.md
+++ b/docs/en/sql-reference/aggregate-functions/reference/avg.md
@@ -4,4 +4,59 @@ toc_priority: 5
# avg {#agg_function-avg}
-Calculates the average. Only works for numbers. The result is always Float64.
+Calculates the arithmetic mean.
+
+**Syntax**
+
+``` sql
+avg(x)
+```
+
+**Parameter**
+
+- `x` — Values.
+
+`x` must be
+[Integer](../../../sql-reference/data-types/int-uint.md),
+[floating-point](../../../sql-reference/data-types/float.md), or
+[Decimal](../../../sql-reference/data-types/decimal.md).
+
+**Returned value**
+
+- `NaN` if the supplied parameter is empty.
+- Mean otherwise.
+
+**Return type** is always [Float64](../../../sql-reference/data-types/float.md).
+
+**Example**
+
+Query:
+
+``` sql
+SELECT avg(x) FROM values('x Int8', 0, 1, 2, 3, 4, 5)
+```
+
+Result:
+
+``` text
+┌─avg(x)─┐
+│ 2.5 │
+└────────┘
+```
+
+**Example**
+
+Query:
+
+``` sql
+CREATE TABLE test (t UInt8) ENGINE = Memory;
+SELECT avg(t) FROM test
+```
+
+Result:
+
+``` text
+┌─avg(t)─┐
+│ nan │
+└────────┘
+```
diff --git a/docs/en/sql-reference/aggregate-functions/reference/avgweighted.md b/docs/en/sql-reference/aggregate-functions/reference/avgweighted.md
index 20b7187a744..7b9c0de2755 100644
--- a/docs/en/sql-reference/aggregate-functions/reference/avgweighted.md
+++ b/docs/en/sql-reference/aggregate-functions/reference/avgweighted.md
@@ -14,17 +14,21 @@ avgWeighted(x, weight)
**Parameters**
-- `x` — Values. [Integer](../../../sql-reference/data-types/int-uint.md) or [floating-point](../../../sql-reference/data-types/float.md).
-- `weight` — Weights of the values. [Integer](../../../sql-reference/data-types/int-uint.md) or [floating-point](../../../sql-reference/data-types/float.md).
+- `x` — Values.
+- `weight` — Weights of the values.
-Type of `x` and `weight` must be the same.
+`x` and `weight` must both be
+[Integer](../../../sql-reference/data-types/int-uint.md),
+[floating-point](../../../sql-reference/data-types/float.md), or
+[Decimal](../../../sql-reference/data-types/decimal.md),
+but may have different types.
**Returned value**
-- Weighted mean.
-- `NaN`. If all the weights are equal to 0.
+- `NaN` if all the weights are equal to 0 or the supplied weights parameter is empty.
+- Weighted mean otherwise.
-Type: [Float64](../../../sql-reference/data-types/float.md).
+**Return type** is always [Float64](../../../sql-reference/data-types/float.md).
**Example**
@@ -42,3 +46,54 @@ Result:
│ 8 │
└────────────────────────┘
```
+
+**Example**
+
+Query:
+
+``` sql
+SELECT avgWeighted(x, w)
+FROM values('x Int8, w Float64', (4, 1), (1, 0), (10, 2))
+```
+
+Result:
+
+``` text
+┌─avgWeighted(x, weight)─┐
+│ 8 │
+└────────────────────────┘
+```
+
+**Example**
+
+Query:
+
+``` sql
+SELECT avgWeighted(x, w)
+FROM values('x Int8, w Int8', (0, 0), (1, 0), (10, 0))
+```
+
+Result:
+
+``` text
+┌─avgWeighted(x, weight)─┐
+│ nan │
+└────────────────────────┘
+```
+
+**Example**
+
+Query:
+
+``` sql
+CREATE TABLE test (t UInt8) ENGINE = Memory;
+SELECT avgWeighted(t, t) FROM test
+```
+
+Result:
+
+``` text
+┌─avgWeighted(x, weight)─┐
+│ nan │
+└────────────────────────┘
+```
diff --git a/docs/en/sql-reference/aggregate-functions/reference/initializeAggregation.md b/docs/en/sql-reference/aggregate-functions/reference/initializeAggregation.md
new file mode 100644
index 00000000000..ea44d5f1ddd
--- /dev/null
+++ b/docs/en/sql-reference/aggregate-functions/reference/initializeAggregation.md
@@ -0,0 +1,37 @@
+---
+toc_priority: 150
+---
+
+## initializeAggregation {#initializeaggregation}
+
+Initializes aggregation for your input rows. It is intended for the functions with the suffix `State`.
+Use it for tests or to process columns of type `AggregateFunction`, for example in `AggregatingMergeTree` tables.
+
+**Syntax**
+
+``` sql
+initializeAggregation (aggregate_function, column_1, column_2);
+```
+
+**Parameters**
+
+- `aggregate_function` — Name of the aggregate function whose state should be created. [String](../../../sql-reference/data-types/string.md#string).
+- `column_n` — The column to pass to the function as an argument. [String](../../../sql-reference/data-types/string.md#string).
+
+**Returned value(s)**
+
+Returns the result of the aggregation for your input rows. The return type is the same as the return type of the function that `initializeAggregation` takes as its first argument.
+For example, for functions with the suffix `State` the return type is `AggregateFunction`.
+
+**Example**
+
+Query:
+
+```sql
+SELECT uniqMerge(state) FROM (SELECT initializeAggregation('uniqState', number % 3) AS state FROM system.numbers LIMIT 10000);
+```
+Result:
+
+``` text
+┌─uniqMerge(state)─┐
+│ 3 │
+└──────────────────┘
+```
diff --git a/docs/en/sql-reference/aggregate-functions/reference/rankCorr.md b/docs/en/sql-reference/aggregate-functions/reference/rankCorr.md
new file mode 100644
index 00000000000..dc23029f239
--- /dev/null
+++ b/docs/en/sql-reference/aggregate-functions/reference/rankCorr.md
@@ -0,0 +1,53 @@
+## rankCorr {#agg_function-rankcorr}
+
+Computes a rank correlation coefficient.
+
+**Syntax**
+
+``` sql
+rankCorr(x, y)
+```
+
+**Parameters**
+
+- `x` — Arbitrary value. [Float32](../../../sql-reference/data-types/float.md#float32-float64) or [Float64](../../../sql-reference/data-types/float.md#float32-float64).
+- `y` — Arbitrary value. [Float32](../../../sql-reference/data-types/float.md#float32-float64) or [Float64](../../../sql-reference/data-types/float.md#float32-float64).
+
+**Returned value(s)**
+
+- Returns a rank correlation coefficient of the ranks of `x` and `y`. The value of the correlation coefficient ranges from -1 to +1. If fewer than two arguments are passed, the function throws an exception. A value close to +1 denotes a high linear relationship: as one random variable increases, the second random variable also increases. A value close to -1 denotes a high linear relationship: as one random variable increases, the second random variable decreases. A value close or equal to 0 denotes no relationship between the two random variables.
+
+Type: [Float64](../../../sql-reference/data-types/float.md#float32-float64).
+
+**Example**
+
+Query:
+
+``` sql
+SELECT rankCorr(number, number) FROM numbers(100);
+```
+
+Result:
+
+``` text
+┌─rankCorr(number, number)─┐
+│ 1 │
+└──────────────────────────┘
+```
+
+Query:
+
+``` sql
+SELECT roundBankers(rankCorr(exp(number), sin(number)), 3) FROM numbers(100);
+```
+
+Result:
+
+``` text
+┌─roundBankers(rankCorr(exp(number), sin(number)), 3)─┐
+│ -0.037 │
+└─────────────────────────────────────────────────────┘
+```
+**See Also**
+
+- [Spearman's rank correlation coefficient](https://en.wikipedia.org/wiki/Spearman%27s_rank_correlation_coefficient)
\ No newline at end of file
diff --git a/docs/en/sql-reference/functions/date-time-functions.md b/docs/en/sql-reference/functions/date-time-functions.md
index 5f4d31225b8..75db3fafe36 100644
--- a/docs/en/sql-reference/functions/date-time-functions.md
+++ b/docs/en/sql-reference/functions/date-time-functions.md
@@ -25,7 +25,37 @@ SELECT
## toTimeZone {#totimezone}
-Convert time or date and time to the specified time zone.
+Convert time or date and time to the specified time zone. The time zone is an attribute of the Date/DateTime types. The internal value (number of seconds) of the table field or of the resultset's column does not change; the column's type changes and its string representation changes accordingly.
+
+```sql
+SELECT
+    toDateTime('2019-01-01 00:00:00', 'UTC') AS time_utc,
+    toTypeName(time_utc) AS type_utc,
+    toInt32(time_utc) AS int32utc,
+    toTimeZone(time_utc, 'Asia/Yekaterinburg') AS time_yekat,
+    toTypeName(time_yekat) AS type_yekat,
+    toInt32(time_yekat) AS int32yekat,
+    toTimeZone(time_utc, 'US/Samoa') AS time_samoa,
+    toTypeName(time_samoa) AS type_samoa,
+    toInt32(time_samoa) AS int32samoa
+FORMAT Vertical;
+```
+
+```text
+Row 1:
+──────
+time_utc: 2019-01-01 00:00:00
+type_utc: DateTime('UTC')
+int32utc: 1546300800
+time_yekat: 2019-01-01 05:00:00
+type_yekat: DateTime('Asia/Yekaterinburg')
+int32yekat: 1546300800
+time_samoa: 2018-12-31 13:00:00
+type_samoa: DateTime('US/Samoa')
+int32samoa: 1546300800
+```
+
+`toTimeZone(time_utc, 'Asia/Yekaterinburg')` changes the `DateTime('UTC')` type to `DateTime('Asia/Yekaterinburg')`. The value (the Unix timestamp) 1546300800 stays the same, but the string representation (the result of the toString() function) changes from `time_utc: 2019-01-01 00:00:00` to `time_yekat: 2019-01-01 05:00:00`.
## toYear {#toyear}
@@ -67,9 +97,8 @@ Leap seconds are not accounted for.
## toUnixTimestamp {#to-unix-timestamp}
-For DateTime argument: converts value to its internal numeric representation (Unix Timestamp).
-For String argument: parse datetime from string according to the timezone (optional second argument, server timezone is used by default) and returns the corresponding unix timestamp.
-For Date argument: the behaviour is unspecified.
+For DateTime argument: converts the value to a number of type UInt32 -- the Unix timestamp (https://en.wikipedia.org/wiki/Unix_time).
+For String argument: converts the input string to the datetime according to the timezone (optional second argument, server timezone is used by default) and returns the corresponding unix timestamp. **Syntax** @@ -535,18 +564,7 @@ dateDiff('unit', startdate, enddate, [timezone]) - `unit` — Time unit, in which the returned value is expressed. [String](../../sql-reference/syntax.md#syntax-string-literal). - Supported values: - - | unit | - | ---- | - |second | - |minute | - |hour | - |day | - |week | - |month | - |quarter | - |year | + Supported values: second, minute, hour, day, week, month, quarter, year. - `startdate` — The first time value to compare. [Date](../../sql-reference/data-types/date.md) or [DateTime](../../sql-reference/data-types/datetime.md). diff --git a/docs/en/sql-reference/functions/in-functions.md b/docs/en/sql-reference/functions/in-functions.md index 065805a36ae..dd3c1900fdc 100644 --- a/docs/en/sql-reference/functions/in-functions.md +++ b/docs/en/sql-reference/functions/in-functions.md @@ -9,16 +9,4 @@ toc_title: IN Operator See the section [IN operators](../../sql-reference/operators/in.md#select-in-operators). -## tuple(x, y, …), operator (x, y, …) {#tuplex-y-operator-x-y} - -A function that allows grouping multiple columns. -For columns with the types T1, T2, …, it returns a Tuple(T1, T2, …) type tuple containing these columns. There is no cost to execute the function. -Tuples are normally used as intermediate values for an argument of IN operators, or for creating a list of formal parameters of lambda functions. Tuples can’t be written to a table. - -## tupleElement(tuple, n), operator x.N {#tupleelementtuple-n-operator-x-n} - -A function that allows getting a column from a tuple. -‘N’ is the column index, starting from 1. N must be a constant. ‘N’ must be a constant. ‘N’ must be a strict postive integer no greater than the size of the tuple. -There is no cost to execute the function. - [Original article](https://clickhouse.tech/docs/en/query_language/functions/in_functions/) diff --git a/docs/en/sql-reference/functions/other-functions.md b/docs/en/sql-reference/functions/other-functions.md index 31ed47c3195..51a1f6b4cd7 100644 --- a/docs/en/sql-reference/functions/other-functions.md +++ b/docs/en/sql-reference/functions/other-functions.md @@ -1,5 +1,5 @@ --- -toc_priority: 66 +toc_priority: 67 toc_title: Other --- diff --git a/docs/en/sql-reference/functions/string-search-functions.md b/docs/en/sql-reference/functions/string-search-functions.md index 881139f103c..dba8a6e275c 100644 --- a/docs/en/sql-reference/functions/string-search-functions.md +++ b/docs/en/sql-reference/functions/string-search-functions.md @@ -536,4 +536,58 @@ For case-insensitive search or/and in UTF-8 format use functions `ngramSearchCas !!! note "Note" For UTF-8 case we use 3-gram distance. All these are not perfectly fair n-gram distances. We use 2-byte hashes to hash n-grams and then calculate the (non-)symmetric difference between these hash tables – collisions may occur. With UTF-8 case-insensitive format we do not use fair `tolower` function – we zero the 5-th bit (starting from zero) of each codepoint byte and first bit of zeroth byte if bytes more than one – this works for Latin and mostly for all Cyrillic letters. +## countSubstrings(haystack, needle) {#countSubstrings} + +Count the number of substring occurrences + +For a case-insensitive search, use the function `countSubstringsCaseInsensitive` (or `countSubstringsCaseInsensitiveUTF8`). 
+
+**Syntax**
+
+``` sql
+countSubstrings(haystack, needle[, start_pos])
+```
+
+**Parameters**
+
+- `haystack` — The string to search in. [String](../../sql-reference/syntax.md#syntax-string-literal).
+- `needle` — The substring to search for. [String](../../sql-reference/syntax.md#syntax-string-literal).
+- `start_pos` — Position of the first character in the string from which to start the search. Optional. [UInt](../../sql-reference/data-types/int-uint.md).
+
+**Returned values**
+
+- Number of occurrences.
+
+Type: `Integer`.
+
+**Examples**
+
+Query:
+
+``` sql
+SELECT countSubstrings('foobar.com', '.')
+```
+
+Result:
+
+``` text
+┌─countSubstrings('foobar.com', '.')─┐
+│ 1 │
+└────────────────────────────────────┘
+```
+
+Query:
+
+``` sql
+SELECT countSubstrings('aaaa', 'aa')
+```
+
+Result:
+
+``` text
+┌─countSubstrings('aaaa', 'aa')─┐
+│ 2 │
+└───────────────────────────────┘
+```
+
[Original article](https://clickhouse.tech/docs/en/query_language/functions/string_search_functions/)
diff --git a/docs/en/sql-reference/functions/tuple-functions.md b/docs/en/sql-reference/functions/tuple-functions.md
new file mode 100644
index 00000000000..dcbcd3e374b
--- /dev/null
+++ b/docs/en/sql-reference/functions/tuple-functions.md
@@ -0,0 +1,114 @@
+---
+toc_priority: 66
+toc_title: Tuples
+---
+
+# Functions for Working with Tuples {#tuple-functions}
+
+## tuple {#tuple}
+
+A function that allows grouping multiple columns.
+For columns with the types T1, T2, …, it returns a Tuple(T1, T2, …) type tuple containing these columns. There is no cost to execute the function.
+Tuples are normally used as intermediate values for an argument of IN operators, or for creating a list of formal parameters of lambda functions. Tuples can’t be written to a table.
+
+The function implements the operator `(x, y, …)`.
+
+**Syntax**
+
+``` sql
+tuple(x, y, …)
+```
+
+## tupleElement {#tupleelement}
+
+A function that allows getting a column from a tuple.
+‘N’ is the column index, starting from 1. ‘N’ must be a constant and a strictly positive integer no greater than the size of the tuple.
+There is no cost to execute the function.
+
+The function implements the operator `x.N`.
+
+**Syntax**
+
+``` sql
+tupleElement(tuple, n)
+```
+
+## untuple {#untuple}
+
+Performs syntactic substitution of [tuple](../../sql-reference/data-types/tuple.md#tuplet1-t2) elements in the call location.
+
+**Syntax**
+
+``` sql
+untuple(x)
+```
+
+You can use the `EXCEPT` expression to skip columns as a result of the query.
+
+**Parameters**
+
+- `x` — A `tuple` function, column, or tuple of elements. [Tuple](../../sql-reference/data-types/tuple.md).
+
+**Returned value**
+
+- None.
+ +**Examples** + +Input table: + +``` text +┌─key─┬─v1─┬─v2─┬─v3─┬─v4─┬─v5─┬─v6────────┐ +│ 1 │ 10 │ 20 │ 40 │ 30 │ 15 │ (33,'ab') │ +│ 2 │ 25 │ 65 │ 70 │ 40 │ 6 │ (44,'cd') │ +│ 3 │ 57 │ 30 │ 20 │ 10 │ 5 │ (55,'ef') │ +│ 4 │ 55 │ 12 │ 7 │ 80 │ 90 │ (66,'gh') │ +│ 5 │ 30 │ 50 │ 70 │ 25 │ 55 │ (77,'kl') │ +└─────┴────┴────┴────┴────┴────┴───────────┘ +``` + +Example of using a `Tuple`-type column as the `untuple` function parameter: + +Query: + +``` sql +SELECT untuple(v6) FROM kv; +``` + +Result: + +``` text +┌─_ut_1─┬─_ut_2─┐ +│ 33 │ ab │ +│ 44 │ cd │ +│ 55 │ ef │ +│ 66 │ gh │ +│ 77 │ kl │ +└───────┴───────┘ +``` + +Example of using an `EXCEPT` expression: + +Query: + +``` sql +SELECT untuple((* EXCEPT (v2, v3),)) FROM kv; +``` + +Result: + +``` text +┌─key─┬─v1─┬─v4─┬─v5─┬─v6────────┐ +│ 1 │ 10 │ 30 │ 15 │ (33,'ab') │ +│ 2 │ 25 │ 40 │ 6 │ (44,'cd') │ +│ 3 │ 57 │ 10 │ 5 │ (55,'ef') │ +│ 4 │ 55 │ 80 │ 90 │ (66,'gh') │ +│ 5 │ 30 │ 25 │ 55 │ (77,'kl') │ +└─────┴────┴────┴────┴───────────┘ +``` + +**See Also** + +- [Tuple](../../sql-reference/data-types/tuple.md) + +[Original article](https://clickhouse.tech/docs/en/sql-reference/functions/tuple-functions/) diff --git a/docs/en/sql-reference/statements/alter/index/index.md b/docs/en/sql-reference/statements/alter/index/index.md index 4660478551f..5e93d521f38 100644 --- a/docs/en/sql-reference/statements/alter/index/index.md +++ b/docs/en/sql-reference/statements/alter/index/index.md @@ -14,7 +14,7 @@ The following operations are available: - `ALTER TABLE [db.]table MATERIALIZE INDEX name IN PARTITION partition_name` - The query rebuilds the secondary index `name` in the partition `partition_name`. Implemented as a [mutation](../../../../sql-reference/statements/alter/index.md#mutations). -The first two commands areare lightweight in a sense that they only change metadata or remove files. +The first two commands are lightweight in a sense that they only change metadata or remove files. Also, they are replicated, syncing indices metadata via ZooKeeper. diff --git a/docs/en/sql-reference/statements/alter/partition.md b/docs/en/sql-reference/statements/alter/partition.md index d2dd1c638cc..1d34449e918 100644 --- a/docs/en/sql-reference/statements/alter/partition.md +++ b/docs/en/sql-reference/statements/alter/partition.md @@ -21,10 +21,10 @@ The following operations with [partitions](../../../engines/table-engines/merget -## DETACH PARTITION {#alter_detach-partition} +## DETACH PARTITION\|PART {#alter_detach-partition} ``` sql -ALTER TABLE table_name DETACH PARTITION partition_expr +ALTER TABLE table_name DETACH PARTITION|PART partition_expr ``` Moves all data for the specified partition to the `detached` directory. The server forgets about the detached data partition as if it does not exist. The server will not know about this data until you make the [ATTACH](#alter_attach-partition) query. @@ -32,7 +32,8 @@ Moves all data for the specified partition to the `detached` directory. The serv Example: ``` sql -ALTER TABLE visits DETACH PARTITION 201901 +ALTER TABLE mt DETACH PARTITION '2020-11-21'; +ALTER TABLE mt DETACH PART 'all_2_2_0'; ``` Read about setting the partition expression in a section [How to specify the partition expression](#alter-how-to-specify-part-expr). @@ -41,10 +42,10 @@ After the query is executed, you can do whatever you want with the data in the ` This query is replicated – it moves the data to the `detached` directory on all replicas. Note that you can execute this query only on a leader replica. 
To find out if a replica is a leader, perform the `SELECT` query to the [system.replicas](../../../operations/system-tables/replicas.md#system_tables-replicas) table. Alternatively, it is easier to make a `DETACH` query on all replicas - all the replicas throw an exception, except the leader replica.
-## DROP PARTITION {#alter_drop-partition}
+## DROP PARTITION\|PART {#alter_drop-partition}
``` sql
-ALTER TABLE table_name DROP PARTITION partition_expr
+ALTER TABLE table_name DROP PARTITION|PART partition_expr
```
Deletes the specified partition from the table. This query tags the partition as inactive and deletes data completely, approximately in 10 minutes.
@@ -53,6 +54,13 @@ Read about setting the partition expression in a section [How to specify the par
The query is replicated – it deletes data on all replicas.
+Example:
+
+``` sql
+ALTER TABLE mt DROP PARTITION '2020-11-21';
+ALTER TABLE mt DROP PART 'all_4_4_0';
+```
+
## DROP DETACHED PARTITION\|PART {#alter_drop-detached}
``` sql
diff --git a/docs/en/sql-reference/statements/create/table.md b/docs/en/sql-reference/statements/create/table.md
index 82326bf51cf..e9952fc76fd 100644
--- a/docs/en/sql-reference/statements/create/table.md
+++ b/docs/en/sql-reference/statements/create/table.md
@@ -29,6 +29,8 @@ A column description is `name type` in the simplest case. Example: `RegionID UIn
Expressions can also be defined for default values (see below).
+If necessary, a primary key can be specified, with one or more key expressions.
+
### With a Schema Similar to Other Table {#with-a-schema-similar-to-other-table}
``` sql
@@ -97,6 +99,34 @@ If you add a new column to a table but later change its default expression, the
It is not possible to set default values for elements in nested data structures.
+## Primary Key {#primary-key}
+
+You can define a [primary key](../../../engines/table-engines/mergetree-family/mergetree.md#primary-keys-and-indexes-in-queries) when creating a table. The primary key can be specified in two ways:
+
+- inside the column list
+
+``` sql
+CREATE TABLE db.table_name
+(
+    name1 type1, name2 type2, ...,
+    PRIMARY KEY(expr1[, expr2,...])
+)
+ENGINE = engine;
+```
+
+- outside the column list
+
+``` sql
+CREATE TABLE db.table_name
+(
+    name1 type1, name2 type2, ...
+)
+ENGINE = engine
+PRIMARY KEY(expr1[, expr2,...]);
+```
+
+You can't combine both ways in one query.
+
## Constraints {#constraints}
Along with columns descriptions constraints could be defined:
diff --git a/docs/en/sql-reference/statements/select/group-by.md b/docs/en/sql-reference/statements/select/group-by.md
index 6cb99f285f2..500a09dcbef 100644
--- a/docs/en/sql-reference/statements/select/group-by.md
+++ b/docs/en/sql-reference/statements/select/group-by.md
@@ -6,7 +6,7 @@ toc_title: GROUP BY
`GROUP BY` clause switches the `SELECT` query into an aggregation mode, which works as follows:
-- `GROUP BY` clause contains a list of expressions (or a single expression, which is considered to be the list of length one). This list acts as a “grouping key”, while each individual expression will be referred to as a “key expressions”.
+- `GROUP BY` clause contains a list of expressions (or a single expression, which is considered to be the list of length one). This list acts as a “grouping key”, while each individual expression will be referred to as a “key expression”.
- All the expressions in the [SELECT](../../../sql-reference/statements/select/index.md), [HAVING](../../../sql-reference/statements/select/having.md), and [ORDER BY](../../../sql-reference/statements/select/order-by.md) clauses **must** be calculated based on key expressions **or** on [aggregate functions](../../../sql-reference/aggregate-functions/index.md) over non-key expressions (including plain columns). In other words, each column selected from the table must be used either in a key expression or inside an aggregate function, but not both. - Result of aggregating `SELECT` query will contain as many rows as there were unique values of “grouping key” in source table. Usually this signficantly reduces the row count, often by orders of magnitude, but not necessarily: row count stays the same if all “grouping key” values were distinct. @@ -45,6 +45,154 @@ You can see that `GROUP BY` for `y = NULL` summed up `x`, as if `NULL` is this v If you pass several keys to `GROUP BY`, the result will give you all the combinations of the selection, as if `NULL` were a specific value. +## WITH ROLLUP Modifier {#with-rollup-modifier} + +`WITH ROLLUP` modifier is used to calculate subtotals for the key expressions, based on their order in the `GROUP BY` list. The subtotals rows are added after the result table. + +The subtotals are calculated in the reverse order: at first subtotals are calculated for the last key expression in the list, then for the previous one, and so on up to the first key expression. + +In the subtotals rows the values of already "grouped" key expressions are set to `0` or empty line. + +!!! note "Note" + Mind that [HAVING](../../../sql-reference/statements/select/having.md) clause can affect the subtotals results. + +**Example** + +Consider the table t: + +```text +┌─year─┬─month─┬─day─┐ +│ 2019 │ 1 │ 5 │ +│ 2019 │ 1 │ 15 │ +│ 2020 │ 1 │ 5 │ +│ 2020 │ 1 │ 15 │ +│ 2020 │ 10 │ 5 │ +│ 2020 │ 10 │ 15 │ +└──────┴───────┴─────┘ +``` + +Query: + +```sql +SELECT year, month, day, count(*) FROM t GROUP BY year, month, day WITH ROLLUP; +``` +As `GROUP BY` section has three key expressions, the result contains four tables with subtotals "rolled up" from right to left: + +- `GROUP BY year, month, day`; +- `GROUP BY year, month` (and `day` column is filled with zeros); +- `GROUP BY year` (now `month, day` columns are both filled with zeros); +- and totals (and all three key expression columns are zeros). + +```text +┌─year─┬─month─┬─day─┬─count()─┐ +│ 2020 │ 10 │ 15 │ 1 │ +│ 2020 │ 1 │ 5 │ 1 │ +│ 2019 │ 1 │ 5 │ 1 │ +│ 2020 │ 1 │ 15 │ 1 │ +│ 2019 │ 1 │ 15 │ 1 │ +│ 2020 │ 10 │ 5 │ 1 │ +└──────┴───────┴─────┴─────────┘ +┌─year─┬─month─┬─day─┬─count()─┐ +│ 2019 │ 1 │ 0 │ 2 │ +│ 2020 │ 1 │ 0 │ 2 │ +│ 2020 │ 10 │ 0 │ 2 │ +└──────┴───────┴─────┴─────────┘ +┌─year─┬─month─┬─day─┬─count()─┐ +│ 2019 │ 0 │ 0 │ 2 │ +│ 2020 │ 0 │ 0 │ 4 │ +└──────┴───────┴─────┴─────────┘ +┌─year─┬─month─┬─day─┬─count()─┐ +│ 0 │ 0 │ 0 │ 6 │ +└──────┴───────┴─────┴─────────┘ +``` + +## WITH CUBE Modifier {#with-cube-modifier} + +`WITH CUBE` modifier is used to calculate subtotals for every combination of the key expressions in the `GROUP BY` list. The subtotals rows are added after the result table. + +In the subtotals rows the values of all "grouped" key expressions are set to `0` or empty line. + +!!! note "Note" + Mind that [HAVING](../../../sql-reference/statements/select/having.md) clause can affect the subtotals results. 
+ +**Example** + +Consider the table t: + +```text +┌─year─┬─month─┬─day─┐ +│ 2019 │ 1 │ 5 │ +│ 2019 │ 1 │ 15 │ +│ 2020 │ 1 │ 5 │ +│ 2020 │ 1 │ 15 │ +│ 2020 │ 10 │ 5 │ +│ 2020 │ 10 │ 15 │ +└──────┴───────┴─────┘ +``` + +Query: + +```sql +SELECT year, month, day, count(*) FROM t GROUP BY year, month, day WITH CUBE; +``` + +As `GROUP BY` section has three key expressions, the result contains eight tables with subtotals for all key expression combinations: + +- `GROUP BY year, month, day` +- `GROUP BY year, month` +- `GROUP BY year, day` +- `GROUP BY year` +- `GROUP BY month, day` +- `GROUP BY month` +- `GROUP BY day` +- and totals. + +Columns, excluded from `GROUP BY`, are filled with zeros. + +```text +┌─year─┬─month─┬─day─┬─count()─┐ +│ 2020 │ 10 │ 15 │ 1 │ +│ 2020 │ 1 │ 5 │ 1 │ +│ 2019 │ 1 │ 5 │ 1 │ +│ 2020 │ 1 │ 15 │ 1 │ +│ 2019 │ 1 │ 15 │ 1 │ +│ 2020 │ 10 │ 5 │ 1 │ +└──────┴───────┴─────┴─────────┘ +┌─year─┬─month─┬─day─┬─count()─┐ +│ 2019 │ 1 │ 0 │ 2 │ +│ 2020 │ 1 │ 0 │ 2 │ +│ 2020 │ 10 │ 0 │ 2 │ +└──────┴───────┴─────┴─────────┘ +┌─year─┬─month─┬─day─┬─count()─┐ +│ 2020 │ 0 │ 5 │ 2 │ +│ 2019 │ 0 │ 5 │ 1 │ +│ 2020 │ 0 │ 15 │ 2 │ +│ 2019 │ 0 │ 15 │ 1 │ +└──────┴───────┴─────┴─────────┘ +┌─year─┬─month─┬─day─┬─count()─┐ +│ 2019 │ 0 │ 0 │ 2 │ +│ 2020 │ 0 │ 0 │ 4 │ +└──────┴───────┴─────┴─────────┘ +┌─year─┬─month─┬─day─┬─count()─┐ +│ 0 │ 1 │ 5 │ 2 │ +│ 0 │ 10 │ 15 │ 1 │ +│ 0 │ 10 │ 5 │ 1 │ +│ 0 │ 1 │ 15 │ 2 │ +└──────┴───────┴─────┴─────────┘ +┌─year─┬─month─┬─day─┬─count()─┐ +│ 0 │ 1 │ 0 │ 4 │ +│ 0 │ 10 │ 0 │ 2 │ +└──────┴───────┴─────┴─────────┘ +┌─year─┬─month─┬─day─┬─count()─┐ +│ 0 │ 0 │ 5 │ 3 │ +│ 0 │ 0 │ 15 │ 3 │ +└──────┴───────┴─────┴─────────┘ +┌─year─┬─month─┬─day─┬─count()─┐ +│ 0 │ 0 │ 0 │ 6 │ +└──────┴───────┴─────┴─────────┘ +``` + + ## WITH TOTALS Modifier {#with-totals-modifier} If the `WITH TOTALS` modifier is specified, another row will be calculated. This row will have key columns containing default values (zeros or empty lines), and columns of aggregate functions with the values calculated across all the rows (the “total” values). @@ -88,8 +236,6 @@ SELECT FROM hits ``` -However, in contrast to standard SQL, if the table doesn’t have any rows (either there aren’t any at all, or there aren’t any after using WHERE to filter), an empty result is returned, and not the result from one of the rows containing the initial values of aggregate functions. - As opposed to MySQL (and conforming to standard SQL), you can’t get some value of some column that is not in a key or aggregate function (except constant expressions). To work around this, you can use the ‘any’ aggregate function (get the first encountered value) or ‘min/max’. Example: @@ -105,10 +251,6 @@ GROUP BY domain For every different key value encountered, `GROUP BY` calculates a set of aggregate function values. -`GROUP BY` is not supported for array columns. - -A constant can’t be specified as arguments for aggregate functions. Example: `sum(1)`. Instead of this, you can get rid of the constant. Example: `count()`. - ## Implementation Details {#implementation-details} Aggregation is one of the most important features of a column-oriented DBMS, and thus it’s implementation is one of the most heavily optimized parts of ClickHouse. By default, aggregation is done in memory using a hash-table. It has 40+ specializations that are chosen automatically depending on “grouping key” data types. 
diff --git a/docs/en/sql-reference/statements/select/index.md b/docs/en/sql-reference/statements/select/index.md
index 3107f791eb9..60c769c4660 100644
--- a/docs/en/sql-reference/statements/select/index.md
+++ b/docs/en/sql-reference/statements/select/index.md
@@ -20,12 +20,12 @@ SELECT [DISTINCT] expr_list
[GLOBAL] [ANY|ALL|ASOF] [INNER|LEFT|RIGHT|FULL|CROSS] [OUTER|SEMI|ANTI] JOIN (subquery)|table (ON <expr_list>)|(USING <column_list>)
[PREWHERE expr]
[WHERE expr]
-[GROUP BY expr_list] [WITH TOTALS]
+[GROUP BY expr_list] [WITH ROLLUP|WITH CUBE] [WITH TOTALS]
[HAVING expr]
[ORDER BY expr_list] [WITH FILL] [FROM expr] [TO expr] [STEP expr]
[LIMIT [offset_value, ]n BY columns]
[LIMIT [n, ]m] [WITH TIES]
-[UNION ALL ...]
+[UNION ...]
[INTO OUTFILE filename]
[FORMAT format]
```
@@ -46,7 +46,7 @@ Specifics of each optional clause are covered in separate sections, which are li
- [SELECT clause](#select-clause)
- [DISTINCT clause](../../../sql-reference/statements/select/distinct.md)
- [LIMIT clause](../../../sql-reference/statements/select/limit.md)
-- [UNION ALL clause](../../../sql-reference/statements/select/union-all.md)
+- [UNION clause](../../../sql-reference/statements/select/union-all.md)
- [INTO OUTFILE clause](../../../sql-reference/statements/select/into-outfile.md)
- [FORMAT clause](../../../sql-reference/statements/select/format.md)
@@ -159,4 +159,111 @@ If the query omits the `DISTINCT`, `GROUP BY` and `ORDER BY` clauses and the `IN
For more information, see the section “Settings”. It is possible to use external sorting (saving temporary tables to a disk) and external aggregation.
-{## [Original article](https://clickhouse.tech/docs/en/sql-reference/statements/select/) ##}
+## SELECT modifiers {#select-modifiers}
+
+You can use the following modifiers in `SELECT` queries.
+
+### APPLY {#apply-modifier}
+
+Allows you to invoke some function for each row returned by an outer table expression of a query.
+
+**Syntax:**
+
+``` sql
+SELECT <expr> APPLY( <func> ) FROM [db.]table_name
+```
+
+**Example:**
+
+``` sql
+CREATE TABLE columns_transformers (i Int64, j Int16, k Int64) ENGINE = MergeTree ORDER BY (i);
+INSERT INTO columns_transformers VALUES (100, 10, 324), (120, 8, 23);
+SELECT * APPLY(sum) FROM columns_transformers;
+```
+
+```
+┌─sum(i)─┬─sum(j)─┬─sum(k)─┐
+│ 220 │ 18 │ 347 │
+└────────┴────────┴────────┘
+```
+
+### EXCEPT {#except-modifier}
+
+Specifies the names of one or more columns to exclude from the result. All matching column names are omitted from the output.
+
+**Syntax:**
+
+``` sql
+SELECT <expr> EXCEPT ( col_name1 [, col_name2, col_name3, ...] ) FROM [db.]table_name
+```
+
+**Example:**
+
+``` sql
+SELECT * EXCEPT (i) from columns_transformers;
+```
+
+```
+┌──j─┬───k─┐
+│ 10 │ 324 │
+│ 8 │ 23 │
+└────┴─────┘
+```
+
+### REPLACE {#replace-modifier}
+
+Specifies one or more [expression aliases](../../../sql-reference/syntax.md#syntax-expression_aliases). Each alias must match a column name from the `SELECT *` statement. In the output column list, the column that matches the alias is replaced by the expression in that `REPLACE`.
+
+This modifier does not change the names or order of columns. However, it can change the value and the value type.
+
+**Syntax:**
+
+``` sql
+SELECT <expr> REPLACE( <expr> AS col_name) from [db.]table_name
+```
+
+**Example:**
+
+``` sql
+SELECT * REPLACE(i + 1 AS i) from columns_transformers;
+```
+
+```
+┌───i─┬──j─┬───k─┐
+│ 101 │ 10 │ 324 │
+│ 121 │ 8 │ 23 │
+└─────┴────┴─────┘
+```
+
+### Modifier Combinations {#modifier-combinations}
+
+You can use each modifier separately or combine them.
+
+**Examples:**
+
+Using the same modifier multiple times.
+
+``` sql
+SELECT COLUMNS('[jk]') APPLY(toString) APPLY(length) APPLY(max) from columns_transformers;
+```
+
+```
+┌─max(length(toString(j)))─┬─max(length(toString(k)))─┐
+│ 2 │ 3 │
+└──────────────────────────┴──────────────────────────┘
+```
+
+Using multiple modifiers in a single query.
+
+``` sql
+SELECT * REPLACE(i + 1 AS i) EXCEPT (j) APPLY(sum) from columns_transformers;
+```
+
+```
+┌─sum(plus(i, 1))─┬─sum(k)─┐
+│ 222 │ 347 │
+└─────────────────┴────────┘
+```
+
+[Original article](https://clickhouse.tech/docs/en/sql-reference/statements/select/)
+
diff --git a/docs/en/sql-reference/statements/select/union-all.md b/docs/en/sql-reference/statements/select/union-all.md
index 5230363609e..f150efbdc80 100644
--- a/docs/en/sql-reference/statements/select/union-all.md
+++ b/docs/en/sql-reference/statements/select/union-all.md
@@ -1,5 +1,5 @@
---
-toc_title: UNION ALL
+toc_title: UNION
---
# UNION ALL Clause {#union-all-clause}
@@ -25,10 +25,13 @@ Type casting is performed for unions. For example, if two queries being combined
Queries that are parts of `UNION ALL` can’t be enclosed in round brackets. [ORDER BY](../../../sql-reference/statements/select/order-by.md) and [LIMIT](../../../sql-reference/statements/select/limit.md) are applied to separate queries, not to the final result. If you need to apply a conversion to the final result, you can put all the queries with `UNION ALL` in a subquery in the [FROM](../../../sql-reference/statements/select/from.md) clause.
-## Limitations {#limitations}
+# UNION DISTINCT Clause {#union-distinct-clause}
+The difference between `UNION ALL` and `UNION DISTINCT` is that `UNION DISTINCT` performs a distinct transform on the union result; it is equivalent to `SELECT DISTINCT` from a subquery containing `UNION ALL`.
+
+# UNION Clause {#union-clause}
+By default, `UNION` has the same behavior as `UNION DISTINCT`, but you can specify the union mode with the `union_default_mode` setting; possible values are 'ALL', 'DISTINCT', or an empty string. However, if you use `UNION` with `union_default_mode` set to an empty string, an exception is thrown.
-Only `UNION ALL` is supported. The regular `UNION` (`UNION DISTINCT`) is not supported. If you need `UNION DISTINCT`, you can write `SELECT DISTINCT` from a subquery containing `UNION ALL`.
## Implementation Details {#implementation-details}
-Queries that are parts of `UNION ALL` can be run simultaneously, and their results can be mixed together.
+Queries that are parts of `UNION/UNION ALL/UNION DISTINCT` can be run simultaneously, and their results can be mixed together.
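+
+For example, with `union_default_mode` set to `'DISTINCT'` (as described above), a plain `UNION` removes the duplicate row. This is only a minimal illustration; the row order in the output is not guaranteed:
+
+``` sql
+SET union_default_mode = 'DISTINCT';
+
+SELECT 1 AS x
+UNION
+SELECT 1
+UNION
+SELECT 2;
+```
+
+``` text
+┌─x─┐
+│ 1 │
+│ 2 │
+└───┘
+```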
diff --git a/docs/es/development/contrib.md b/docs/es/development/contrib.md index 9018c19cc92..3f3013570e5 100644 --- a/docs/es/development/contrib.md +++ b/docs/es/development/contrib.md @@ -19,7 +19,6 @@ toc_title: Bibliotecas de terceros utilizadas | Más información | [Licencia de 3 cláusulas BSD](https://github.com/google/googletest/blob/master/LICENSE) | | H3 | [Licencia Apache 2.0](https://github.com/uber/h3/blob/master/LICENSE) | | hyperscan | [Licencia de 3 cláusulas BSD](https://github.com/intel/hyperscan/blob/master/LICENSE) | -| libbtrie | [Licencia BSD de 2 cláusulas](https://github.com/ClickHouse/ClickHouse/blob/master/contrib/libbtrie/LICENSE) | | libcxxabi | [BSD + MIT](https://github.com/ClickHouse/ClickHouse/blob/master/libs/libglibc-compatibility/libcxxabi/LICENSE.TXT) | | libdivide | [Licencia Zlib](https://github.com/ClickHouse/ClickHouse/blob/master/contrib/libdivide/LICENSE.txt) | | libgsasl | [Información adicional](https://github.com/ClickHouse-Extras/libgsasl/blob/3b8948a4042e34fb00b4fb987535dc9e02e39040/LICENSE) | diff --git a/docs/fa/development/contrib.md b/docs/fa/development/contrib.md index 25573c28125..2ee5fc73369 100644 --- a/docs/fa/development/contrib.md +++ b/docs/fa/development/contrib.md @@ -21,7 +21,6 @@ toc_title: "\u06A9\u062A\u0627\u0628\u062E\u0627\u0646\u0647 \u0647\u0627\u06CC | googletest | [لیسانس 3 بند](https://github.com/google/googletest/blob/master/LICENSE) | | اچ 3 | [نمایی مجوز 2.0](https://github.com/uber/h3/blob/master/LICENSE) | | hyperscan | [لیسانس 3 بند](https://github.com/intel/hyperscan/blob/master/LICENSE) | -| لیبتری | [لیسانس 2 بند](https://github.com/ClickHouse/ClickHouse/blob/master/contrib/libbtrie/LICENSE) | | شکنجه نوجوان | [BSD + MIT](https://github.com/ClickHouse/ClickHouse/blob/master/libs/libglibc-compatibility/libcxxabi/LICENSE.TXT) | | لیبیدوید | [مجوز زلب](https://github.com/ClickHouse/ClickHouse/blob/master/contrib/libdivide/LICENSE.txt) | | نوشیدن شراب | [الجی پی ال2.1](https://github.com/ClickHouse-Extras/libgsasl/blob/3b8948a4042e34fb00b4fb987535dc9e02e39040/LICENSE) | diff --git a/docs/fr/development/contrib.md b/docs/fr/development/contrib.md index f4006d0a787..6909ef905bd 100644 --- a/docs/fr/development/contrib.md +++ b/docs/fr/development/contrib.md @@ -19,7 +19,6 @@ toc_title: "Biblioth\xE8ques Tierces Utilis\xE9es" | googletest | [Licence BSD 3-Clause](https://github.com/google/googletest/blob/master/LICENSE) | | h3 | [Licence Apache 2.0](https://github.com/uber/h3/blob/master/LICENSE) | | hyperscan | [Licence BSD 3-Clause](https://github.com/intel/hyperscan/blob/master/LICENSE) | -| libbtrie | [Licence BSD 2-Clause](https://github.com/ClickHouse/ClickHouse/blob/master/contrib/libbtrie/LICENSE) | | libcxxabi | [BSD + MIT](https://github.com/ClickHouse/ClickHouse/blob/master/libs/libglibc-compatibility/libcxxabi/LICENSE.TXT) | | libdivide | [Licence Zlib](https://github.com/ClickHouse/ClickHouse/blob/master/contrib/libdivide/LICENSE.txt) | | libgsasl | [LGPL v2.1](https://github.com/ClickHouse-Extras/libgsasl/blob/3b8948a4042e34fb00b4fb987535dc9e02e39040/LICENSE) | diff --git a/docs/ja/development/contrib.md b/docs/ja/development/contrib.md index 2e16b2bc72a..892d2c66a13 100644 --- a/docs/ja/development/contrib.md +++ b/docs/ja/development/contrib.md @@ -20,7 +20,6 @@ toc_title: "\u30B5\u30FC\u30C9\u30D1\u30FC\u30C6\u30A3\u88FD\u30E9\u30A4\u30D6\u | googletest | [BSD3条項ライセンス](https://github.com/google/googletest/blob/master/LICENSE) | | h3 | 
[Apacheライセンス2.0](https://github.com/uber/h3/blob/master/LICENSE) | | hyperscan | [BSD3条項ライセンス](https://github.com/intel/hyperscan/blob/master/LICENSE) | -| libbtrie | [BSD2条項ライセンス](https://github.com/ClickHouse/ClickHouse/blob/master/contrib/libbtrie/LICENSE) | | libcxxabi | [BSD + MIT](https://github.com/ClickHouse/ClickHouse/blob/master/libs/libglibc-compatibility/libcxxabi/LICENSE.TXT) | | libdivide | [Zlibライセンス](https://github.com/ClickHouse/ClickHouse/blob/master/contrib/libdivide/LICENSE.txt) | | libgsasl | [LGPL v2.1](https://github.com/ClickHouse-Extras/libgsasl/blob/3b8948a4042e34fb00b4fb987535dc9e02e39040/LICENSE) | diff --git a/docs/ru/development/contrib.md b/docs/ru/development/contrib.md index e65ab4819e8..05367267e41 100644 --- a/docs/ru/development/contrib.md +++ b/docs/ru/development/contrib.md @@ -18,7 +18,6 @@ toc_title: "\u0418\u0441\u043f\u043e\u043b\u044c\u0437\u0443\u0435\u043c\u044b\u | googletest | [BSD 3-Clause License](https://github.com/google/googletest/blob/master/LICENSE) | | h3 | [Apache License 2.0](https://github.com/uber/h3/blob/master/LICENSE) | | hyperscan | [BSD 3-Clause License](https://github.com/intel/hyperscan/blob/master/LICENSE) | -| libbtrie | [BSD 2-Clause License](https://github.com/ClickHouse/ClickHouse/blob/master/contrib/libbtrie/LICENSE) | | libcxxabi | [BSD + MIT](https://github.com/ClickHouse/ClickHouse/blob/master/libs/libglibc-compatibility/libcxxabi/LICENSE.TXT) | | libdivide | [Zlib License](https://github.com/ClickHouse/ClickHouse/blob/master/contrib/libdivide/LICENSE.txt) | | libgsasl | [LGPL v2.1](https://github.com/ClickHouse-Extras/libgsasl/blob/3b8948a4042e34fb00b4fb987535dc9e02e39040/LICENSE) | diff --git a/docs/ru/engines/table-engines/mergetree-family/mergetree.md b/docs/ru/engines/table-engines/mergetree-family/mergetree.md index e4b6e0b1e59..7428c8b0911 100644 --- a/docs/ru/engines/table-engines/mergetree-family/mergetree.md +++ b/docs/ru/engines/table-engines/mergetree-family/mergetree.md @@ -183,18 +183,18 @@ ClickHouse не требует уникального первичного кл - Увеличить эффективность индекса. - Пусть первичный ключ — `(a, b)`, тогда добавление ещё одного столбца `c` повысит эффективность, если выполнены условия: + Пусть первичный ключ — `(a, b)`, тогда добавление ещё одного столбца `c` повысит эффективность, если выполнены условия: - - Есть запросы с условием на столбец `c`. - - Часто встречаются достаточно длинные (в несколько раз больше `index_granularity`) диапазоны данных с одинаковыми значениями `(a, b)`. Иначе говоря, когда добавление ещё одного столбца позволит пропускать достаточно длинные диапазоны данных. + - Есть запросы с условием на столбец `c`. + - Часто встречаются достаточно длинные (в несколько раз больше `index_granularity`) диапазоны данных с одинаковыми значениями `(a, b)`. Иначе говоря, когда добавление ещё одного столбца позволит пропускать достаточно длинные диапазоны данных. - Улучшить сжатие данных. - ClickHouse сортирует данные по первичному ключу, поэтому чем выше однородность, тем лучше сжатие. + ClickHouse сортирует данные по первичному ключу, поэтому чем выше однородность, тем лучше сжатие. - Обеспечить дополнительную логику при слиянии кусков данных в движках [CollapsingMergeTree](collapsingmergetree.md#table_engine-collapsingmergetree) и [SummingMergeTree](summingmergetree.md). - В этом случае имеет смысл указать отдельный *ключ сортировки*, отличающийся от первичного ключа. + В этом случае имеет смысл указать отдельный *ключ сортировки*, отличающийся от первичного ключа. 
Длинный первичный ключ будет негативно влиять на производительность вставки и потребление памяти, однако на производительность ClickHouse при запросах `SELECT` лишние столбцы в первичном ключе не влияют. @@ -309,11 +309,11 @@ SELECT count() FROM table WHERE u64 * i32 == 10 AND u64 * length(s) >= 1234 - `bloom_filter([false_positive])` — [фильтр Блума](https://en.wikipedia.org/wiki/Bloom_filter) для указанных стоблцов. - Необязательный параметр `false_positive` — это вероятность получения ложноположительного срабатывания. Возможные значения: (0, 1). Значение по умолчанию: 0.025. + Необязательный параметр `false_positive` — это вероятность получения ложноположительного срабатывания. Возможные значения: (0, 1). Значение по умолчанию: 0.025. - Поддержанные типы данных: `Int*`, `UInt*`, `Float*`, `Enum`, `Date`, `DateTime`, `String`, `FixedString`. - - Фильтром могут пользоваться функции: [equals](../../../engines/table_engines/mergetree_family/mergetree.md), [notEquals](../../../engines/table_engines/mergetree_family/mergetree.md), [in](../../../engines/table_engines/mergetree_family/mergetree.md), [notIn](../../../engines/table_engines/mergetree_family/mergetree.md). + Поддержанные типы данных: `Int*`, `UInt*`, `Float*`, `Enum`, `Date`, `DateTime`, `String`, `FixedString`. + + Фильтром могут пользоваться функции: [equals](../../../engines/table-engines/mergetree-family/mergetree.md), [notEquals](../../../engines/table-engines/mergetree-family/mergetree.md), [in](../../../engines/table-engines/mergetree-family/mergetree.md), [notIn](../../../engines/table-engines/mergetree-family/mergetree.md). **Примеры** @@ -645,4 +645,4 @@ SETTINGS storage_policy = 'moving_from_ssd_to_hdd' После выполнения фоновых слияний или мутаций старые куски не удаляются сразу, а через некоторое время (табличная настройка `old_parts_lifetime`). Также они не перемещаются на другие тома или диски, поэтому до момента удаления они продолжают учитываться при подсчёте занятого дискового пространства. -[Оригинальная статья](https://clickhouse.tech/docs/ru/operations/table_engines/mergetree/) +[Оригинальная статья](https://clickhouse.tech/docs/en/engines/table-engines/mergetree-family/mergetree/) diff --git a/docs/ru/engines/table-engines/mergetree-family/replacingmergetree.md b/docs/ru/engines/table-engines/mergetree-family/replacingmergetree.md index 1228371e8ea..a4e47b161ad 100644 --- a/docs/ru/engines/table-engines/mergetree-family/replacingmergetree.md +++ b/docs/ru/engines/table-engines/mergetree-family/replacingmergetree.md @@ -5,7 +5,7 @@ toc_title: ReplacingMergeTree # ReplacingMergeTree {#replacingmergetree} -Движок отличается от [MergeTree](mergetree.md#table_engines-mergetree) тем, что выполняет удаление дублирующихся записей с одинаковым значением [ключа сортировки](mergetree.md)). +Движок отличается от [MergeTree](mergetree.md#table_engines-mergetree) тем, что выполняет удаление дублирующихся записей с одинаковым значением [ключа сортировки](mergetree.md) (секция `ORDER BY`, не `PRIMARY KEY`). Дедупликация данных производится лишь во время слияний. Слияние происходят в фоне в неизвестный момент времени, на который вы не можете ориентироваться. Некоторая часть данных может остаться необработанной. Хотя вы можете вызвать внеочередное слияние с помощью запроса `OPTIMIZE`, на это не стоит рассчитывать, так как запрос `OPTIMIZE` приводит к чтению и записи большого объёма данных. 
@@ -28,14 +28,17 @@ CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster] Описание параметров запроса смотрите в [описании запроса](../../../engines/table-engines/mergetree-family/replacingmergetree.md). +!!! note "Внимание" + Уникальность строк определяется `ORDER BY` секцией таблицы, а не `PRIMARY KEY`. + **Параметры ReplacingMergeTree** - `ver` — столбец с версией, тип `UInt*`, `Date` или `DateTime`. Необязательный параметр. - При слиянии, из всех строк с одинаковым значением ключа сортировки `ReplacingMergeTree` оставляет только одну: + При слиянии `ReplacingMergeTree` оставляет только строку для каждого уникального ключа сортировки: - - Последнюю в выборке, если `ver` не задан. - - С максимальной версией, если `ver` задан. + - Последнюю в выборке, если `ver` не задан. Под выборкой здесь понимается набор строк в наборе партов, участвующих в слиянии. Последний по времени создания парт (последний инсерт) будет последним в выборке. Таким образом, после дедупликации для каждого значения ключа сортировки останется самая последняя строка из самого последнего инсерта. + - С максимальной версией, если `ver` задан. **Секции запроса** diff --git a/docs/ru/operations/server-configuration-parameters/settings.md b/docs/ru/operations/server-configuration-parameters/settings.md index 9941e4f3ac5..58aae05f188 100644 --- a/docs/ru/operations/server-configuration-parameters/settings.md +++ b/docs/ru/operations/server-configuration-parameters/settings.md @@ -127,7 +127,8 @@ ClickHouse проверяет условия для `min_part_size` и `min_part Если `true`, то каждый словарь создаётся при первом использовании. Если словарь не удалось создать, то вызов функции, использующей словарь, сгенерирует исключение. -Если `false`, то все словари создаются при старте сервера, и в случае ошибки сервер завершает работу. +Если `false`, то все словари создаются при старте сервера, если словарь или словари создаются слишком долго или создаются с ошибкой, то сервер загружается без +этих словарей и продолжает попытки создать эти словари. По умолчанию - `true`. diff --git a/docs/ru/operations/settings/settings.md b/docs/ru/operations/settings/settings.md index af0fc3e6137..b04a927f944 100644 --- a/docs/ru/operations/settings/settings.md +++ b/docs/ru/operations/settings/settings.md @@ -2099,6 +2099,48 @@ SELECT TOP 3 name, value FROM system.settings; └─────────────────────────┴─────────┘ ``` +## system_events_show_zero_values {#system_events_show_zero_values} + +Позволяет выбрать события с нулевыми значениями из таблицы [`system.events`](../../operations/system-tables/events.md). + +В некоторые системы мониторинга вам нужно передать значения всех измерений (для каждой контрольной точки), даже если в результате — "0". + +Возможные значения: + +- 0 — настройка отключена — вы получите все события. +- 1 — настройка включена — вы сможете отсортировать события по нулевым и остальным значениям. + +Значение по умолчанию: `0`. + +**Примеры** + +Запрос + +```sql +SELECT * FROM system.events WHERE event='QueryMemoryLimitExceeded'; +``` + +Результат + +```text +Ok. +``` + +Запрос + +```sql +SET system_events_show_zero_values = 1; +SELECT * FROM system.events WHERE event='QueryMemoryLimitExceeded'; +``` + +Результат + +```text +┌─event────────────────────┬─value─┬─description───────────────────────────────────────────┐ +│ QueryMemoryLimitExceeded │ 0 │ Number of times when memory limit exceeded for query. 
│ +└──────────────────────────┴───────┴───────────────────────────────────────────────────────┘ +``` + ## allow_experimental_bigint_types {#allow_experimental_bigint_types} Включает или отключает поддержку целочисленных значений, превышающих максимальное значение, допустимое для типа `int`. diff --git a/docs/ru/operations/utilities/clickhouse-obfuscator.md b/docs/ru/operations/utilities/clickhouse-obfuscator.md new file mode 100644 index 00000000000..a52d538965b --- /dev/null +++ b/docs/ru/operations/utilities/clickhouse-obfuscator.md @@ -0,0 +1,43 @@ +# Обфускатор ClickHouse + +Простой инструмент для обфускации табличных данных. + +Он считывает данные входной таблицы и создает выходную таблицу, которая сохраняет некоторые свойства входных данных, но при этом содержит другие данные. + +Это позволяет публиковать практически реальные данные и использовать их в тестах на производительность. + +Обфускатор предназначен для сохранения следующих свойств данных: +- кардинальность (количество уникальных данных) для каждого столбца и каждого кортежа столбцов; +- условная кардинальность: количество уникальных данных одного столбца в соответствии со значением другого столбца; +- вероятностные распределения абсолютного значения целых чисел; знак числа типа Int; показатель степени и знак для чисел с плавающей запятой; +- вероятностное распределение длины строк; +- вероятность нулевых значений чисел; пустые строки и массивы, `NULL`; +- степень сжатия данных алгоритмом LZ77 и семейством энтропийных кодеков; + +- непрерывность (величина разницы) значений времени в таблице; непрерывность значений с плавающей запятой; +- дату из значений `DateTime`; + +- кодировка UTF-8 значений строки; +- строковые значения выглядят естественным образом. + + +Большинство перечисленных выше свойств пригодны для тестирования производительности. Чтение данных, фильтрация, агрегирование и сортировка будут работать почти с той же скоростью, что и исходные данные, благодаря сохраненной кардинальности, величине, степени сжатия и т. д. + +Он работает детерминированно. Вы задаёте значение инициализатора, а преобразование полностью определяется входными данными и инициализатором. + +Некоторые преобразования выполняются один к одному, и их можно отменить. Поэтому нужно использовать большое значение инициализатора и хранить его в секрете. + + +Обфускатор использует некоторые криптографические примитивы для преобразования данных, но, с криптографической точки зрения, результат будет небезопасным. В нем могут сохраниться данные, которые не следует публиковать. + + +Он всегда оставляет без изменений числа 0, 1, -1, даты, длины массивов и нулевые флаги. +Например, если у вас есть столбец `IsMobile` в таблице со значениями 0 и 1, то в преобразованных данных он будет иметь то же значение. + +Таким образом, пользователь сможет посчитать точное соотношение мобильного трафика. + +Давайте рассмотрим случай, когда у вас есть какие-то личные данные в таблице (например, электронная почта пользователя), и вы не хотите их публиковать. +Если ваша таблица достаточно большая и содержит несколько разных электронных почтовых адресов, и ни один из них не встречается часто, то обфускатор полностью анонимизирует все данные. Но, если у вас есть небольшое количество разных значений в столбце, он может скопировать некоторые из них. +В этом случае вам следует посмотреть на алгоритм работы инструмента и настроить параметры командной строки. + +Обфускатор полезен в работе со средним объемом данных (не менее 1000 строк). 
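Примерный вариант запуска (набросок: список столбцов в `--structure` и имена файлов условные). Данные читаются из файла в формате TSV, значение инициализатора передаётся через параметр `--seed`.

``` bash
# Набросок: структура таблицы условная, input.tsv и output.tsv — произвольные имена файлов.
clickhouse-obfuscator \
    --seed "$(head -c16 /dev/urandom | base64)" \
    --input-format TSV --output-format TSV \
    --structure 'CounterID UInt32, URLDomain String, URL String, SearchPhrase String, Title String' \
    < input.tsv > output.tsv
```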
diff --git a/docs/ru/sql-reference/aggregate-functions/index.md b/docs/ru/sql-reference/aggregate-functions/index.md index e7f6acee738..4a7768f587f 100644 --- a/docs/ru/sql-reference/aggregate-functions/index.md +++ b/docs/ru/sql-reference/aggregate-functions/index.md @@ -44,8 +44,6 @@ SELECT sum(y) FROM t_null_big └────────┘ ``` -Функция `sum` работает с `NULL` как с `0`. В частности, это означает, что если на вход в функцию подать выборку, где все значения `NULL`, то результат будет `0`, а не `NULL`. - Теперь с помощью функции `groupArray` сформируем массив из столбца `y`: ``` sql diff --git a/docs/ru/sql-reference/aggregate-functions/reference/initializeAggregation.md b/docs/ru/sql-reference/aggregate-functions/reference/initializeAggregation.md new file mode 100644 index 00000000000..a2e3764193e --- /dev/null +++ b/docs/ru/sql-reference/aggregate-functions/reference/initializeAggregation.md @@ -0,0 +1,40 @@ +--- +toc_priority: 150 +--- + +## initializeAggregation {#initializeaggregation} + +Инициализирует агрегацию для введеных строчек. Предназначена для функций с суффиксом `State`. +Поможет вам проводить тесты или работать со столбцами типов: `AggregateFunction` и `AggregationgMergeTree`. + +**Синтаксис** + +``` sql +initializeAggregation (aggregate_function, column_1, column_2); +``` + +**Параметры** + +- `aggregate_function` — название функции агрегации, состояние которой нужно создать. [String](../../../sql-reference/data-types/string.md#string). +- `column_n` — столбец, который передается в функцию агрегации как аргумент. [String](../../../sql-reference/data-types/string.md#string). + +**Возвращаемое значение** + +Возвращает результат агрегации введенной информации. Тип возвращаемого значения такой же, как и для функции, которая становится первым аргументом для `initializeAgregation`. + +Пример: + +Возвращаемый тип функций с суффиксом `State` — `AggregateFunction`. + +**Пример** + +Запрос: + +```sql +SELECT uniqMerge(state) FROM (SELECT initializeAggregation('uniqState', number % 3) AS state FROM system.numbers LIMIT 10000); +``` +Результат: + +┌─uniqMerge(state)─┐ +│ 3 │ +└──────────────────┘ diff --git a/docs/ru/sql-reference/aggregate-functions/reference/rankCorr.md b/docs/ru/sql-reference/aggregate-functions/reference/rankCorr.md new file mode 100644 index 00000000000..48a19e87c52 --- /dev/null +++ b/docs/ru/sql-reference/aggregate-functions/reference/rankCorr.md @@ -0,0 +1,53 @@ +## rankCorr {#agg_function-rankcorr} + +Вычисляет коэффициент ранговой корреляции. + +**Синтаксис** + +``` sql +rankCorr(x, y) +``` + +**Параметры** + +- `x` — Произвольное значение. [Float32](../../../sql-reference/data-types/float.md#float32-float64) или [Float64](../../../sql-reference/data-types/float.md#float32-float64). +- `y` — Произвольное значение. [Float32](../../../sql-reference/data-types/float.md#float32-float64) или [Float64](../../../sql-reference/data-types/float.md#float32-float64). + +**Возвращаемое значение** + +- Возвращает коэффициент ранговой корреляции рангов x и y. Значение коэффициента корреляции изменяется в пределах от -1 до +1. Если передается менее двух аргументов, функция возвращает исключение. Значение, близкое к +1, указывает на высокую линейную зависимость, и с увеличением одной случайной величины увеличивается и вторая случайная величина. Значение, близкое к -1, указывает на высокую линейную зависимость, и с увеличением одной случайной величины вторая случайная величина уменьшается. 
Значение, близкое или равное 0, означает отсутствие связи между двумя случайными величинами. + +Тип: [Float64](../../../sql-reference/data-types/float.md#float32-float64). + +**Пример** + +Запрос: + +``` sql +SELECT rankCorr(number, number) FROM numbers(100); +``` + +Результат: + +``` text +┌─rankCorr(number, number)─┐ +│ 1 │ +└──────────────────────────┘ +``` + +Запрос: + +``` sql +SELECT roundBankers(rankCorr(exp(number), sin(number)), 3) FROM numbers(100); +``` + +Результат: + +``` text +┌─roundBankers(rankCorr(exp(number), sin(number)), 3)─┐ +│ -0.037 │ +└─────────────────────────────────────────────────────┘ +``` +**Смотрите также** + +- [Коэффициент ранговой корреляции Спирмена](https://ru.wikipedia.org/wiki/%D0%9A%D0%BE%D1%80%D1%80%D0%B5%D0%BB%D1%8F%D1%86%D0%B8%D1%8F#%D0%9A%D0%BE%D1%8D%D1%84%D1%84%D0%B8%D1%86%D0%B8%D0%B5%D0%BD%D1%82_%D1%80%D0%B0%D0%BD%D0%B3%D0%BE%D0%B2%D0%BE%D0%B9_%D0%BA%D0%BE%D1%80%D1%80%D0%B5%D0%BB%D1%8F%D1%86%D0%B8%D0%B8_%D0%A1%D0%BF%D0%B8%D1%80%D0%BC%D0%B5%D0%BD%D0%B0) \ No newline at end of file diff --git a/docs/ru/sql-reference/functions/date-time-functions.md b/docs/ru/sql-reference/functions/date-time-functions.md index deffc935870..3c9bd99de57 100644 --- a/docs/ru/sql-reference/functions/date-time-functions.md +++ b/docs/ru/sql-reference/functions/date-time-functions.md @@ -25,6 +25,40 @@ SELECT Поддерживаются только часовые пояса, отличающиеся от UTC на целое число часов. +## toTimeZone {#totimezone} + +Переводит дату или дату-с-временем в указанный часовой пояс. Часовой пояс (таймзона) это атрибут типов Date/DateTime, внутреннее значение (количество секунд) поля таблицы или колонки результата не изменяется, изменяется тип поля и автоматически его текстовое отображение. + +```sql +SELECT + toDateTime('2019-01-01 00:00:00', 'UTC') AS time_utc, + toTypeName(time_utc) AS type_utc, + toInt32(time_utc) AS int32utc, + toTimeZone(time_utc, 'Asia/Yekaterinburg') AS time_yekat, + toTypeName(time_yekat) AS type_yekat, + toInt32(time_yekat) AS int32yekat, + toTimeZone(time_utc, 'US/Samoa') AS time_samoa, + toTypeName(time_samoa) AS type_samoa, + toInt32(time_samoa) AS int32samoa +FORMAT Vertical; +``` + +```text +Row 1: +────── +time_utc: 2019-01-01 00:00:00 +type_utc: DateTime('UTC') +int32utc: 1546300800 +time_yekat: 2019-01-01 05:00:00 +type_yekat: DateTime('Asia/Yekaterinburg') +int32yekat: 1546300800 +time_samoa: 2018-12-31 13:00:00 +type_samoa: DateTime('US/Samoa') +int32samoa: 1546300800 +``` + +`toTimeZone(time_utc, 'Asia/Yekaterinburg')` изменяет тип `DateTime('UTC')` в `DateTime('Asia/Yekaterinburg')`. Значение (unix-время) 1546300800 остается неизменным, но текстовое отображение (результат функции toString()) меняется `time_utc: 2019-01-01 00:00:00` в `time_yekat: 2019-01-01 05:00:00`. + ## toYear {#toyear} Переводит дату или дату-с-временем в число типа UInt16, содержащее номер года (AD). @@ -57,32 +91,31 @@ SELECT ## toUnixTimestamp {#to-unix-timestamp} -For DateTime argument: converts value to its internal numeric representation (Unix Timestamp). -For String argument: parse datetime from string according to the timezone (optional second argument, server timezone is used by default) and returns the corresponding unix timestamp. -For Date argument: the behaviour is unspecified. +Переводит дату-с-временем в число типа UInt32 -- Unix Timestamp (https://en.wikipedia.org/wiki/Unix_time). +Для аргумента String, строка конвертируется в дату и время в соответствии с часовым поясом (необязательный второй аргумент, часовой пояс сервера используется по умолчанию). 
-**Syntax** +**Синтаксис** ``` sql toUnixTimestamp(datetime) toUnixTimestamp(str, [timezone]) ``` -**Returned value** +**Возвращаемое значение** -- Returns the unix timestamp. +- Возвращает Unix Timestamp. -Type: `UInt32`. +Тип: `UInt32`. -**Example** +**Пример** -Query: +Запрос: ``` sql SELECT toUnixTimestamp('2017-11-05 08:07:47', 'Asia/Tokyo') AS unix_timestamp ``` -Result: +Результат: ``` text ┌─unix_timestamp─┐ @@ -490,4 +523,4 @@ SELECT formatDateTime(toDate('2010-01-04'), '%g') └────────────────────────────────────────────┘ ``` -[Оригинальная статья](https://clickhouse.tech/docs/ru/query_language/functions/date_time_functions/) \ No newline at end of file +[Оригинальная статья](https://clickhouse.tech/docs/ru/query_language/functions/date_time_functions/) diff --git a/docs/ru/sql-reference/functions/in-functions.md b/docs/ru/sql-reference/functions/in-functions.md index e137187a36b..b732f67303b 100644 --- a/docs/ru/sql-reference/functions/in-functions.md +++ b/docs/ru/sql-reference/functions/in-functions.md @@ -9,16 +9,4 @@ toc_title: "\u0424\u0443\u043d\u043a\u0446\u0438\u0438\u0020\u0434\u043b\u044f\u Смотрите раздел [Операторы IN](../operators/in.md#select-in-operators). -## tuple(x, y, …), оператор (x, y, …) {#tuplex-y-operator-x-y} - -Функция, позволяющая сгруппировать несколько столбцов. -Для столбцов, имеющих типы T1, T2, … возвращает кортеж типа Tuple(T1, T2, …), содержащий эти столбцы. Выполнение функции ничего не стоит. -Кортежи обычно используются как промежуточное значение в качестве аргумента операторов IN, или для создания списка формальных параметров лямбда-функций. Кортежи не могут быть записаны в таблицу. - -## tupleElement(tuple, n), оператор x.N {#tupleelementtuple-n-operator-x-n} - -Функция, позволяющая достать столбец из кортежа. -N - индекс столбца начиная с 1. N должно быть константой. N должно быть целым строго положительным числом не большим размера кортежа. -Выполнение функции ничего не стоит. - [Оригинальная статья](https://clickhouse.tech/docs/ru/query_language/functions/in_functions/) diff --git a/docs/ru/sql-reference/functions/tuple-functions.md b/docs/ru/sql-reference/functions/tuple-functions.md new file mode 100644 index 00000000000..f88886ec6f1 --- /dev/null +++ b/docs/ru/sql-reference/functions/tuple-functions.md @@ -0,0 +1,114 @@ +--- +toc_priority: 68 +toc_title: Функции для работы с кортежами +--- + +# Функции для работы с кортежами {#tuple-functions} + +## tuple {#tuple} + +Функция, позволяющая сгруппировать несколько столбцов. +Для столбцов, имеющих типы T1, T2, … возвращает кортеж типа Tuple(T1, T2, …), содержащий эти столбцы. Выполнение функции ничего не стоит. +Кортежи обычно используются как промежуточное значение в качестве аргумента операторов IN, или для создания списка формальных параметров лямбда-функций. Кортежи не могут быть записаны в таблицу. + +С помощью функции реализуется оператор `(x, y, …)`. + +**Синтаксис** + +``` sql +tuple(x, y, …) +``` + +## tupleElement {#tupleelement} + +Функция, позволяющая достать столбец из кортежа. +N - индекс столбца начиная с 1. N должно быть константой. N должно быть целым строго положительным числом не большим размера кортежа. +Выполнение функции ничего не стоит. + +С помощью функции реализуется оператор `x.N`. + +**Синтаксис** + +``` sql +tupleElement(tuple, n) +``` + +## untuple {#untuple} + +Выполняет синтаксическую подстановку элементов [кортежа](../../sql-reference/data-types/tuple.md#tuplet1-t2) в место вызова. 
+ +**Синтаксис** + +``` sql +untuple(x) +``` + +Чтобы пропустить некоторые столбцы в результате запроса, вы можете использовать выражение `EXCEPT`. + +**Параметры** + +- `x` - функция `tuple`, столбец или кортеж элементов. [Tuple](../../sql-reference/data-types/tuple.md). + +**Возвращаемое значение** + +- Нет. + +**Примеры** + +Входная таблица: + +``` text +┌─key─┬─v1─┬─v2─┬─v3─┬─v4─┬─v5─┬─v6────────┐ +│ 1 │ 10 │ 20 │ 40 │ 30 │ 15 │ (33,'ab') │ +│ 2 │ 25 │ 65 │ 70 │ 40 │ 6 │ (44,'cd') │ +│ 3 │ 57 │ 30 │ 20 │ 10 │ 5 │ (55,'ef') │ +│ 4 │ 55 │ 12 │ 7 │ 80 │ 90 │ (66,'gh') │ +│ 5 │ 30 │ 50 │ 70 │ 25 │ 55 │ (77,'kl') │ +└─────┴────┴────┴────┴────┴────┴───────────┘ +``` + +Пример использования столбца типа `Tuple` в качестве параметра функции `untuple`: + +Запрос: + +``` sql +SELECT untuple(v6) FROM kv; +``` + +Результат: + +``` text +┌─_ut_1─┬─_ut_2─┐ +│ 33 │ ab │ +│ 44 │ cd │ +│ 55 │ ef │ +│ 66 │ gh │ +│ 77 │ kl │ +└───────┴───────┘ +``` + +Пример использования выражения `EXCEPT`: + +Запрос: + +``` sql +SELECT untuple((* EXCEPT (v2, v3),)) FROM kv; +``` + +Результат: + +``` text +┌─key─┬─v1─┬─v4─┬─v5─┬─v6────────┐ +│ 1 │ 10 │ 30 │ 15 │ (33,'ab') │ +│ 2 │ 25 │ 40 │ 6 │ (44,'cd') │ +│ 3 │ 57 │ 10 │ 5 │ (55,'ef') │ +│ 4 │ 55 │ 80 │ 90 │ (66,'gh') │ +│ 5 │ 30 │ 25 │ 55 │ (77,'kl') │ +└─────┴────┴────┴────┴───────────┘ +``` + +**Смотрите также** + +- [Tuple](../../sql-reference/data-types/tuple.md) + +[Оригинальная статья](https://clickhouse.tech/docs/ru/sql-reference/functions/tuple-functions/) diff --git a/docs/ru/sql-reference/statements/alter/partition.md b/docs/ru/sql-reference/statements/alter/partition.md index 5c4a23428ad..df0123c63f1 100644 --- a/docs/ru/sql-reference/statements/alter/partition.md +++ b/docs/ru/sql-reference/statements/alter/partition.md @@ -19,10 +19,10 @@ toc_title: PARTITION - [FETCH PARTITION](#alter_fetch-partition) — скачать партицию с другого сервера; - [MOVE PARTITION\|PART](#alter_move-partition) — переместить партицию/кускок на другой диск или том. -## DETACH PARTITION {#alter_detach-partition} +## DETACH PARTITION\|PART {#alter_detach-partition} ``` sql -ALTER TABLE table_name DETACH PARTITION partition_expr +ALTER TABLE table_name DETACH PARTITION|PART partition_expr ``` Перемещает заданную партицию в директорию `detached`. Сервер не будет знать об этой партиции до тех пор, пока вы не выполните запрос [ATTACH](#alter_attach-partition). @@ -30,7 +30,8 @@ ALTER TABLE table_name DETACH PARTITION partition_expr Пример: ``` sql -ALTER TABLE visits DETACH PARTITION 201901 +ALTER TABLE mt DETACH PARTITION '2020-11-21'; +ALTER TABLE mt DETACH PART 'all_2_2_0'; ``` Подробнее о том, как корректно задать имя партиции, см. в разделе [Как задавать имя партиции в запросах ALTER](#alter-how-to-specify-part-expr). @@ -39,10 +40,10 @@ ALTER TABLE visits DETACH PARTITION 201901 Запрос реплицируется — данные будут перенесены в директорию `detached` и забыты на всех репликах. Обратите внимание, запрос может быть отправлен только на реплику-лидер. Чтобы узнать, является ли реплика лидером, выполните запрос `SELECT` к системной таблице [system.replicas](../../../operations/system-tables/replicas.md#system_tables-replicas). Либо можно выполнить запрос `DETACH` на всех репликах — тогда на всех репликах, кроме реплики-лидера, запрос вернет ошибку. -## DROP PARTITION {#alter_drop-partition} +## DROP PARTITION\|PART {#alter_drop-partition} ``` sql -ALTER TABLE table_name DROP PARTITION partition_expr +ALTER TABLE table_name DROP PARTITION|PART partition_expr ``` Удаляет партицию. 
Партиция помечается как неактивная и будет полностью удалена примерно через 10 минут. @@ -51,6 +52,13 @@ ALTER TABLE table_name DROP PARTITION partition_expr Запрос реплицируется — данные будут удалены на всех репликах. +Пример: + +``` sql +ALTER TABLE mt DROP PARTITION '2020-11-21'; +ALTER TABLE mt DROP PART 'all_4_4_0'; +``` + ## DROP DETACHED PARTITION\|PART {#alter_drop-detached} ``` sql diff --git a/docs/ru/sql-reference/statements/select/group-by.md b/docs/ru/sql-reference/statements/select/group-by.md index a0454ef1d91..0c8a29d0c26 100644 --- a/docs/ru/sql-reference/statements/select/group-by.md +++ b/docs/ru/sql-reference/statements/select/group-by.md @@ -43,6 +43,153 @@ toc_title: GROUP BY Если в `GROUP BY` передать несколько ключей, то в результате мы получим все комбинации выборки, как если бы `NULL` был конкретным значением. +## Модификатор WITH ROLLUP {#with-rollup-modifier} + +Модификатор `WITH ROLLUP` применяется для подсчета подытогов для ключевых выражений. При этом учитывается порядок следования ключевых выражений в списке `GROUP BY`. Подытоги подсчитываются в обратном порядке: сначала для последнего ключевого выражения в списке, потом для предпоследнего и так далее вплоть до самого первого ключевого выражения. + +Строки с подытогами добавляются в конец результирующей таблицы. В колонках, по которым строки уже сгруппированы, указывается значение `0` или пустая строка. + +!!! note "Примечание" + Если в запросе есть секция [HAVING](../../../sql-reference/statements/select/having.md), она может повлиять на результаты расчета подытогов. + +**Пример** + +Рассмотрим таблицу t: + +```text +┌─year─┬─month─┬─day─┐ +│ 2019 │ 1 │ 5 │ +│ 2019 │ 1 │ 15 │ +│ 2020 │ 1 │ 5 │ +│ 2020 │ 1 │ 15 │ +│ 2020 │ 10 │ 5 │ +│ 2020 │ 10 │ 15 │ +└──────┴───────┴─────┘ +``` + +Запрос: + +```sql +SELECT year, month, day, count(*) FROM t GROUP BY year, month, day WITH ROLLUP; +``` + +Поскольку секция `GROUP BY` содержит три ключевых выражения, результат состоит из четырех таблиц с подытогами, которые как бы "сворачиваются" справа налево: + +- `GROUP BY year, month, day`; +- `GROUP BY year, month` (а колонка `day` заполнена нулями); +- `GROUP BY year` (теперь обе колонки `month, day` заполнены нулями); +- и общий итог (все три колонки с ключевыми выражениями заполнены нулями). + +```text +┌─year─┬─month─┬─day─┬─count()─┐ +│ 2020 │ 10 │ 15 │ 1 │ +│ 2020 │ 1 │ 5 │ 1 │ +│ 2019 │ 1 │ 5 │ 1 │ +│ 2020 │ 1 │ 15 │ 1 │ +│ 2019 │ 1 │ 15 │ 1 │ +│ 2020 │ 10 │ 5 │ 1 │ +└──────┴───────┴─────┴─────────┘ +┌─year─┬─month─┬─day─┬─count()─┐ +│ 2019 │ 1 │ 0 │ 2 │ +│ 2020 │ 1 │ 0 │ 2 │ +│ 2020 │ 10 │ 0 │ 2 │ +└──────┴───────┴─────┴─────────┘ +┌─year─┬─month─┬─day─┬─count()─┐ +│ 2019 │ 0 │ 0 │ 2 │ +│ 2020 │ 0 │ 0 │ 4 │ +└──────┴───────┴─────┴─────────┘ +┌─year─┬─month─┬─day─┬─count()─┐ +│ 0 │ 0 │ 0 │ 6 │ +└──────┴───────┴─────┴─────────┘ +``` + +## Модификатор WITH CUBE {#with-cube-modifier} + +Модификатор `WITH CUBE` применятеся для расчета подытогов по всем комбинациям группировки ключевых выражений в списке `GROUP BY`. + +Строки с подытогами добавляются в конец результирующей таблицы. В колонках, по которым выполняется группировка, указывается значение `0` или пустая строка. + +!!! note "Примечание" + Если в запросе есть секция [HAVING](../../../sql-reference/statements/select/having.md), она может повлиять на результаты расчета подытогов. 
+ +**Пример** + +Рассмотрим таблицу t: + +```text +┌─year─┬─month─┬─day─┐ +│ 2019 │ 1 │ 5 │ +│ 2019 │ 1 │ 15 │ +│ 2020 │ 1 │ 5 │ +│ 2020 │ 1 │ 15 │ +│ 2020 │ 10 │ 5 │ +│ 2020 │ 10 │ 15 │ +└──────┴───────┴─────┘ +``` + +Query: + +```sql +SELECT year, month, day, count(*) FROM t GROUP BY year, month, day WITH CUBE; +``` + +Поскольку секция `GROUP BY` содержит три ключевых выражения, результат состоит из восьми таблиц с подытогами — по таблице для каждой комбинации ключевых выражений: + +- `GROUP BY year, month, day` +- `GROUP BY year, month` +- `GROUP BY year, day` +- `GROUP BY year` +- `GROUP BY month, day` +- `GROUP BY month` +- `GROUP BY day` +- и общий итог. + +Колонки, которые не участвуют в `GROUP BY`, заполнены нулями. + +```text +┌─year─┬─month─┬─day─┬─count()─┐ +│ 2020 │ 10 │ 15 │ 1 │ +│ 2020 │ 1 │ 5 │ 1 │ +│ 2019 │ 1 │ 5 │ 1 │ +│ 2020 │ 1 │ 15 │ 1 │ +│ 2019 │ 1 │ 15 │ 1 │ +│ 2020 │ 10 │ 5 │ 1 │ +└──────┴───────┴─────┴─────────┘ +┌─year─┬─month─┬─day─┬─count()─┐ +│ 2019 │ 1 │ 0 │ 2 │ +│ 2020 │ 1 │ 0 │ 2 │ +│ 2020 │ 10 │ 0 │ 2 │ +└──────┴───────┴─────┴─────────┘ +┌─year─┬─month─┬─day─┬─count()─┐ +│ 2020 │ 0 │ 5 │ 2 │ +│ 2019 │ 0 │ 5 │ 1 │ +│ 2020 │ 0 │ 15 │ 2 │ +│ 2019 │ 0 │ 15 │ 1 │ +└──────┴───────┴─────┴─────────┘ +┌─year─┬─month─┬─day─┬─count()─┐ +│ 2019 │ 0 │ 0 │ 2 │ +│ 2020 │ 0 │ 0 │ 4 │ +└──────┴───────┴─────┴─────────┘ +┌─year─┬─month─┬─day─┬─count()─┐ +│ 0 │ 1 │ 5 │ 2 │ +│ 0 │ 10 │ 15 │ 1 │ +│ 0 │ 10 │ 5 │ 1 │ +│ 0 │ 1 │ 15 │ 2 │ +└──────┴───────┴─────┴─────────┘ +┌─year─┬─month─┬─day─┬─count()─┐ +│ 0 │ 1 │ 0 │ 4 │ +│ 0 │ 10 │ 0 │ 2 │ +└──────┴───────┴─────┴─────────┘ +┌─year─┬─month─┬─day─┬─count()─┐ +│ 0 │ 0 │ 5 │ 3 │ +│ 0 │ 0 │ 15 │ 3 │ +└──────┴───────┴─────┴─────────┘ +┌─year─┬─month─┬─day─┬─count()─┐ +│ 0 │ 0 │ 0 │ 6 │ +└──────┴───────┴─────┴─────────┘ +``` + + ## Модификатор WITH TOTALS {#with-totals-modifier} Если указан модификатор `WITH TOTALS`, то будет посчитана ещё одна строчка, в которой в столбцах-ключах будут содержаться значения по умолчанию (нули, пустые строки), а в столбцах агрегатных функций - значения, посчитанные по всем строкам («тотальные» значения). @@ -86,8 +233,6 @@ SELECT FROM hits ``` -Но, в отличие от стандартного SQL, если в таблице нет строк (вообще нет или после фильтрации с помощью WHERE), в качестве результата возвращается пустой результат, а не результат из одной строки, содержащий «начальные» значения агрегатных функций. - В отличие от MySQL (и в соответствии со стандартом SQL), вы не можете получить какое-нибудь значение некоторого столбца, не входящего в ключ или агрегатную функцию (за исключением константных выражений). Для обхода этого вы можете воспользоваться агрегатной функцией any (получить первое попавшееся значение) или min/max. Пример: @@ -103,10 +248,6 @@ GROUP BY domain GROUP BY вычисляет для каждого встретившегося различного значения ключей, набор значений агрегатных функций. -Не поддерживается GROUP BY по столбцам-массивам. - -Не поддерживается указание констант в качестве аргументов агрегатных функций. Пример: `sum(1)`. Вместо этого, вы можете избавиться от констант. Пример: `count()`. - ## Детали реализации {#implementation-details} Агрегация является одной из наиболее важных возможностей столбцовых СУБД, и поэтому её реализация является одной из наиболее сильно оптимизированных частей ClickHouse. По умолчанию агрегирование выполняется в памяти с помощью хэш-таблицы. Она имеет более 40 специализаций, которые выбираются автоматически в зависимости от типов данных ключа группировки. 
diff --git a/docs/ru/sql-reference/statements/select/index.md b/docs/ru/sql-reference/statements/select/index.md index f5fe2788370..c2e05f05079 100644 --- a/docs/ru/sql-reference/statements/select/index.md +++ b/docs/ru/sql-reference/statements/select/index.md @@ -18,7 +18,7 @@ SELECT [DISTINCT] expr_list [GLOBAL] [ANY|ALL|ASOF] [INNER|LEFT|RIGHT|FULL|CROSS] [OUTER|SEMI|ANTI] JOIN (subquery)|table (ON )|(USING ) [PREWHERE expr] [WHERE expr] -[GROUP BY expr_list] [WITH TOTALS] +[GROUP BY expr_list] [WITH ROLLUP|WITH CUBE] [WITH TOTALS] [HAVING expr] [ORDER BY expr_list] [WITH FILL] [FROM expr] [TO expr] [STEP expr] [LIMIT [offset_value, ]n BY columns] diff --git a/docs/ru/whats-new/extended-roadmap.md b/docs/ru/whats-new/extended-roadmap.md index 57a29ce90ad..aff8e1cbcfb 100644 --- a/docs/ru/whats-new/extended-roadmap.md +++ b/docs/ru/whats-new/extended-roadmap.md @@ -15,8 +15,6 @@ Задача «normalized z-Order curve» в перспективе может быть полезна для БК и Метрики, так как позволяет смешивать OrderID и PageID и избежать дублирования данных. В задаче также вводится способ индексации путём обращения функции нескольких аргументов на интервале, что имеет смысл для дальнейшего развития. -[Андрей Чулков](https://github.com/achulkov2), ВШЭ. - ### 1.2. + Wait-free каталог баз данных {#wait-free-katalog-baz-dannykh} Q2. Делает [Александр Токмаков](https://github.com/tavplubix), первый рабочий вариант в декабре 2019. Нужно для DataLens и Яндекс.Метрики. @@ -292,7 +290,8 @@ Upd. Иван Блинков сделал эту задачу путём зам ### 4.1. Уменьшение числа потоков при распределённых запросах {#umenshenie-chisla-potokov-pri-raspredelionnykh-zaprosakh} -Весна 2020. Upd. Есть прототип. Upd. Он не работает. Upd. Человек отказался от задачи, теперь сроки не определены. +Upd. Есть прототип. Upd. Он не работает. Upd. Человек отказался от задачи, теперь сроки не определены. +Upd. Павел Круглов, весна 2021. ### 4.2. Спекулятивное выполнение запросов на нескольких репликах {#spekuliativnoe-vypolnenie-zaprosov-na-neskolkikh-replikakh} @@ -306,6 +305,8 @@ Upd. Иван Блинков сделал эту задачу путём зам Upd. Сейчас обсуждается, как сделать другую задачу вместо этой. +Павел Круглов, весна 2021. + ### 4.3. Ограничение числа одновременных скачиваний с реплик {#ogranichenie-chisla-odnovremennykh-skachivanii-s-replik} Изначально делал Олег Алексеенков, но пока решение не готово, хотя там не так уж много доделывать. @@ -320,9 +321,10 @@ Upd. Сейчас обсуждается, как сделать другую з ### 4.7. Ленивая загрузка множеств для IN и JOIN с помощью k/v запросов {#lenivaia-zagruzka-mnozhestv-dlia-in-i-join-s-pomoshchiu-kv-zaprosov} -### 4.8. Разделить background pool для fetch и merge {#razdelit-background-pool-dlia-fetch-i-merge} +### 4.8. + Разделить background pool для fetch и merge {#razdelit-background-pool-dlia-fetch-i-merge} -В очереди. Исправить проблему, что восстанавливающаяся реплика перестаёт мержить. Частично компенсируется 4.3. +Исправить проблему, что восстанавливающаяся реплика перестаёт мержить. Частично компенсируется 4.3. +Ура, готово! Сделал Александр Сапин. ## 5. Операции {#operatsii} @@ -381,6 +383,7 @@ Upd. Появилась вторая версия LTS - 20.3. ### 6.5. Эксперименты с LLVM X-Ray {#eksperimenty-s-llvm-x-ray} Требует 2.2. +Перенос на 2021 или отмена. ### 6.6. + Стек трейс для любых исключений {#stek-treis-dlia-liubykh-iskliuchenii} @@ -401,6 +404,8 @@ Upd. В разработке. ### 6.10. Сбор общих системных метрик {#sbor-obshchikh-sistemnykh-metrik} +Перенос на весну 2021. + ## 7. 
Сопровождение разработки {#soprovozhdenie-razrabotki} @@ -461,7 +466,7 @@ UBSan включен в функциональных тестах, но не в ### 7.12. Показывать тестовое покрытие нового кода в PR {#pokazyvat-testovoe-pokrytie-novogo-koda-v-pr} Пока есть просто показ тестового покрытия всего кода. -Отложено. +Отложено на весну 2021. ### 7.13. + Включение аналога -Weverything в gcc {#vkliuchenie-analoga-weverything-v-gcc} @@ -512,6 +517,7 @@ Upd. Минимальная подсветка добавлена, а все о Поводом использования libressl послужило желание нашего хорошего друга из известной компании несколько лет назад. Но сейчас ситуация состоит в том, что openssl продолжает развиваться, а libressl не особо, и можно спокойно менять обратно. Нужно для Яндекс.Облака для поддержки TLS 1.3. +Теперь нужно заменить OpenSSL на BoringSSL. ### 7.16. + tzdata внутри бинарника {#tzdata-vnutri-binarnika} @@ -612,7 +618,7 @@ Upd. Эльдар Заитов добавляет OSS Fuzz. Upd. Сделаны randomString, randomFixedString. Upd. Сделаны fuzzBits. -### 7.24. Fuzzing лексера и парсера запросов; кодеков и форматов {#fuzzing-leksera-i-parsera-zaprosov-kodekov-i-formatov} +### 7.24. + Fuzzing лексера и парсера запросов; кодеков и форматов {#fuzzing-leksera-i-parsera-zaprosov-kodekov-i-formatov} Продолжение 7.23. @@ -656,6 +662,7 @@ Upd. В Аркадии частично работает небольшая ча ### 7.30. Возможность переключения бинарных файлов на продакшене без выкладки пакетов {#vozmozhnost-perekliucheniia-binarnykh-failov-na-prodakshene-bez-vykladki-paketov} Низкий приоритет. +Сделали файл clickhouse.old. ### 7.31. Зеркалирование нагрузки между серверами {#zerkalirovanie-nagruzki-mezhdu-serverami} @@ -737,7 +744,7 @@ Upd. Задача взята в работу. ### 8.6. Kerberos аутентификация для HDFS и Kafka {#kerberos-autentifikatsiia-dlia-hdfs-i-kafka} Андрей Коняев, ArenaData. Он куда-то пропал. -Upd. В процессе работа для Kafka. +Для Kafka готово, для HDFS в процессе. ### 8.7. + Исправление мелочи HDFS на очень старых ядрах Linux {#ispravlenie-melochi-hdfs-na-ochen-starykh-iadrakh-linux} @@ -1024,14 +1031,14 @@ Upd. Сделано хранение прав. До готового к испо [Виталий Баранов](https://github.com/vitlibar). Финальная стадия разработки, рабочая версия в декабре 2019. Q1. Сделано управление правами полностью, но не реализовано их хранение, см. 12.1. -### 12.3. Подключение справочника пользователей и прав доступа из LDAP {#podkliuchenie-spravochnika-polzovatelei-i-prav-dostupa-iz-ldap} +### 12.3. + Подключение справочника пользователей и прав доступа из LDAP {#podkliuchenie-spravochnika-polzovatelei-i-prav-dostupa-iz-ldap} Аутентификация через LDAP - Денис Глазачев. [Виталий Баранов](https://github.com/vitlibar) и Денис Глазачев, Altinity. Требует 12.1. Q3. Upd. Pull request на финальной стадии. -### 12.4. Подключение IDM системы Яндекса как справочника пользователей и прав доступа {#podkliuchenie-idm-sistemy-iandeksa-kak-spravochnika-polzovatelei-i-prav-dostupa} +### 12.4. - Подключение IDM системы Яндекса как справочника пользователей и прав доступа {#podkliuchenie-idm-sistemy-iandeksa-kak-spravochnika-polzovatelei-i-prav-dostupa} Пока низкий приоритет. Нужно для Метрики. Требует 12.3. Отложено. @@ -1051,7 +1058,7 @@ Upd. Есть pull request. ### 13.1. Overcommit запросов по памяти и вытеснение {#overcommit-zaprosov-po-pamiati-i-vytesnenie} -Требует 2.1. Способ реализации обсуждается. Александр Казаков. +Требует 2.1. Способ реализации обсуждается. ### 13.2. Общий конвейер выполнения на сервер {#obshchii-konveier-vypolneniia-na-server} @@ -1059,8 +1066,6 @@ Upd. Есть pull request. ### 13.3. 
Пулы ресурсов {#puly-resursov} -Александр Казаков. - Требует 13.2 или сможем сделать более неудобную реализацию раньше. Обсуждается вариант неудобной реализации. Пока средний приоритет, целимся на Q1/Q2. Вариант реализации выбрал Александр Казаков. @@ -1068,6 +1073,7 @@ Upd. Не уследили, и задачу стали обсуждать мен Upd. Задачу смотрит Александр Казаков. Upd. Задача взята в работу. Upd. Задача как будто взята в работу. +Upd. Задачу не сделал. ## 14. Диалект SQL {#dialekt-sql} @@ -1082,19 +1088,18 @@ Upd. Задача как будто взята в работу. ### 14.3. Поддержка подстановок для множеств в правой части IN {#podderzhka-podstanovok-dlia-mnozhestv-v-pravoi-chasti-in} -### 14.4. Поддержка подстановок для идентификаторов (имён) в SQL запросе {#podderzhka-podstanovok-dlia-identifikatorov-imion-v-sql-zaprose} +### 14.4. + Поддержка подстановок для идентификаторов (имён) в SQL запросе {#podderzhka-podstanovok-dlia-identifikatorov-imion-v-sql-zaprose} -zhang2014 -Задача на паузе. +Amos Bird сделал. ### 14.5. + Поддержка задания множества как массива в правой части секции IN {#podderzhka-zadaniia-mnozhestva-kak-massiva-v-pravoi-chasti-sektsii-in} Василий Немков, Altinity, делал эту задачу, но забросил её в пользу других задач. В результате, сейчас доделывает Антон Попов. -### 14.6. Глобальный scope для WITH {#globalnyi-scope-dlia-with} +### 14.6. + Глобальный scope для WITH {#globalnyi-scope-dlia-with} -В обсуждении. Amos Bird. +Amos Bird сделал. ### 14.7. Nullable для WITH ROLLUP, WITH CUBE, WITH TOTALS {#nullable-dlia-with-rollup-with-cube-with-totals} @@ -1148,13 +1153,13 @@ Upd. Есть pull request. Готово. ### 14.17. + Ввести понятие stateful функций {#vvesti-poniatie-stateful-funktsii} -zhang2014. Для runningDifference, neighbour - их учёт в оптимизаторе запросов. В интерфейсе уже сделано. Надо проверить, что учитывается в нужных местах (например, что работает predicate pushdown сквозь ORDER BY, если таких функций нет). +Александр Кузьменков. -### 14.18. UNION DISTINCT и возможность включить его по-умолчанию {#union-distinct-i-vozmozhnost-vkliuchit-ego-po-umolchaniiu} +### 14.18. + UNION DISTINCT и возможность включить его по-умолчанию {#union-distinct-i-vozmozhnost-vkliuchit-ego-po-umolchaniiu} -Для BI систем. +Для BI систем. flynn ucasFL. ### 14.19. + Совместимость парсера типов данных с SQL {#sovmestimost-parsera-tipov-dannykh-s-sql} @@ -1278,7 +1283,7 @@ Upd. Есть pull request. Исправление фундаментальной проблемы - есть PR. Фундаментальная проблема решена. -### 18.2. Агрегатные функции для статистических тестов {#agregatnye-funktsii-dlia-statisticheskikh-testov} +### 18.2. + Агрегатные функции для статистических тестов {#agregatnye-funktsii-dlia-statisticheskikh-testov} Артём Цыганов, Руденский Константин Игоревич, Семёнов Денис, ВШЭ. @@ -1286,6 +1291,7 @@ Upd. Есть pull request. Сделали прототип двух тестов, есть pull request. Также есть pull request для корелляции рангов. Upd. Помержили корелляцию рангов, но ещё не помержили сравнение t-test, u-test. +Upd. Всё доделал Никита Михайлов. ### 18.3. Инфраструктура для тренировки моделей в ClickHouse {#infrastruktura-dlia-trenirovki-modelei-v-clickhouse} @@ -1295,7 +1301,7 @@ Upd. Помержили корелляцию рангов, но ещё не по ## 19. Улучшение работы кластера {#uluchshenie-raboty-klastera} -### 19.1. Параллельные кворумные вставки без линеаризуемости {#parallelnye-kvorumnye-vstavki-bez-linearizuemosti} +### 19.1. + Параллельные кворумные вставки без линеаризуемости {#parallelnye-kvorumnye-vstavki-bez-linearizuemosti} Upd. 
В работе, ожидается в начале октября. @@ -1361,6 +1367,8 @@ Upd. Задача в разработке. ### 20.2. Поддержка DELETE путём преобразования множества ключей в множество row_numbers на реплике, столбца флагов и индекса по диапазонам {#podderzhka-delete-putiom-preobrazovaniia-mnozhestva-kliuchei-v-mnozhestvo-row-numbers-na-replike-stolbtsa-flagov-i-indeksa-po-diapazonam} +Задача назначена на 2021. + ### 20.3. Поддержка ленивых DELETE путём запоминания выражений и преобразования к множеству ключей в фоне {#podderzhka-lenivykh-delete-putiom-zapominaniia-vyrazhenii-i-preobrazovaniia-k-mnozhestvu-kliuchei-v-fone} ### 20.4. Поддержка UPDATE с помощью преобразования в DELETE и вставок {#podderzhka-update-s-pomoshchiu-preobrazovaniia-v-delete-i-vstavok} @@ -1413,6 +1421,7 @@ ucasFL, в разработке. Готово. [Achimbab](https://github.com/achimbab). Есть pull request. Но это не совсем то. Upd. В обсуждении. +Upd. Назначено на 2021. ### 21.8. Взаимная интеграция аллокатора и кэша {#vzaimnaia-integratsiia-allokatora-i-kesha} @@ -1427,6 +1436,7 @@ Upd. В обсуждении. Upd. Есть нерабочий прототип, скорее всего будет отложено. Upd. Отложено до осени. Upd. Отложено до. +Upd. Отложено. ### 21.8.1. Отдельный аллокатор для кэшей с ASLR {#otdelnyi-allokator-dlia-keshei-s-aslr} @@ -1517,7 +1527,7 @@ Upd. Сделаны самые существенные из предложен Для сортировки по кортежам используется обычная сортировка с компаратором, который в цикле по элементам кортежа делает виртуальные вызовы `IColumn::compareAt`. Это неоптимально - как из-за короткого цикла по неизвестному в compile-time количеству элементов, так и из-за виртуальных вызовов. Чтобы обойтись без виртуальных вызовов, есть метод `IColumn::getPermutation`. Он используется в случае сортировки по одному столбцу. Есть вариант, что в случае сортировки по кортежу, что-то похожее тоже можно применить… например, сделать метод `updatePermutation`, принимающий аргументы offset и limit, и допереставляющий перестановку в диапазоне значений, в которых предыдущий столбец имел равные значения. -3. RadixSort для сортировки. +\+ 3. RadixSort для сортировки. Один наш знакомый начал делать задачу по попытке использования RadixSort для сортировки столбцов. Был сделан вариант indirect сортировки (для `getPermutation`), но не оптимизирован до конца - есть лишние ненужные перекладывания элементов. Для того, чтобы его оптимизировать, придётся добавить немного шаблонной магии (на последнем шаге что-то не копировать, вместо перекладывания индексов - складывать их в готовое место). Также этот человек добавил метод MSD Radix Sort для реализации radix partial sort. Но даже не проверил производительность. @@ -1527,7 +1537,9 @@ Upd. Сделаны самые существенные из предложен Виртуальный метод `compareAt` возвращает -1, 0, 1. Но алгоритмы сортировки сравнениями обычно рассчитаны на `operator<` и не могут получить преимущества от three-way comparison. А можно ли написать так, чтобы преимущество было? -5. pdq partial sort +\+ 5. pdq partial sort + +Upd. Данила Кутенин решил эту задачу ультимативно, используя Floyd–Rivest алгоритм. Хороший алгоритм сортировки сравнениями `pdqsort` не имеет варианта partial sort. Заметим, что на практике, почти все сортировки в запросах ClickHouse являются partial_sort, так как `ORDER BY` почти всегда идёт с `LIMIT`. Кстати, Данила Кутенин уже попробовал это и показал, что в тривиальном случае преимущества нет. Но не очевидно, что нельзя сделать лучше. @@ -1619,6 +1631,7 @@ Upd. Добавили таймауты. Altinity. Я не в курсе, какой статус. 
+Там предлагают очень сложное решение вместо простого. ### 22.16. + Исправление низкой производительности кодека DoubleDelta {#ispravlenie-nizkoi-proizvoditelnosti-kodeka-doubledelta} @@ -1656,15 +1669,15 @@ Upd. Готово. Нужно для Метрики. Алексей Миловидов. -### 22.25. Избавиться от библиотеки btrie {#izbavitsia-ot-biblioteki-btrie} +### 22.25. + Избавиться от библиотеки btrie {#izbavitsia-ot-biblioteki-btrie} -Алексей Миловидов. Низкий приоритет. +Владимир Черкасов сделал эту задачу. ### 22.26. Плохая производительность quantileTDigest {#plokhaia-proizvoditelnost-quantiletdigest} [#2668](https://github.com/ClickHouse/ClickHouse/issues/2668) -Алексей Миловидов или будет переназначено. +Павел Круглов и Илья Щербак (ВК). ### 22.27. Проверить несколько PR, которые были закрыты zhang2014 и sundy-li {#proverit-neskolko-pr-kotorye-byli-zakryty-zhang2014-i-sundy-li} @@ -1766,7 +1779,7 @@ Upd. Отменено. Виталий Баранов. Отложено, после бэкапов. -### 24.5. Поддержка функций шифрования для отдельных значений {#podderzhka-funktsii-shifrovaniia-dlia-otdelnykh-znachenii} +### 24.5. + Поддержка функций шифрования для отдельных значений {#podderzhka-funktsii-shifrovaniia-dlia-otdelnykh-znachenii} Смотрите также 24.5. @@ -1775,6 +1788,7 @@ Upd. Отменено. Делает Василий Немков, Altinity Есть pull request в процессе ревью, исправляем проблемы производительности. +Сейчас в состоянии, что уже добавлено в продакшен, но производительность всё ещё низкая (тех долг). ### 24.6. Userspace RAID {#userspace-raid} @@ -1825,7 +1839,7 @@ RAID позволяет одновременно увеличить надёжн Upd. Есть pull request. В стадии ревью. Готово. -### 24.10. Поддержка типов half/bfloat16/unum {#podderzhka-tipov-halfbfloat16unum} +### 24.10. - Поддержка типов half/bfloat16/unum {#podderzhka-tipov-halfbfloat16unum} [#7657](https://github.com/ClickHouse/ClickHouse/issues/7657) @@ -1833,6 +1847,7 @@ Upd. Есть pull request. В стадии ревью. Готово. Есть pull request на промежуточной стадии. Отложено. +Отменено. ### 24.11. User Defined Functions {#user-defined-functions} @@ -1882,10 +1897,12 @@ Upd. Прототип bitonic sort помержен, но целесообраз Требует 2.1. Upd. Есть два прототипа от внешних контрибьюторов. +Александр Кузьменков. ### 24.15. Поддержка полуструктурированных данных {#podderzhka-polustrukturirovannykh-dannykh} Требует 1.14 и 2.10. +Антон Попов. ### 24.16. Улучшение эвристики слияний {#uluchshenie-evristiki-sliianii} @@ -1915,6 +1932,7 @@ Upd. Есть pull request - в большинстве случаев однов ### 24.21. Реализация в ClickHouse протокола распределённого консенсуса {#realizatsiia-v-clickhouse-protokola-raspredelionnogo-konsensusa} Имеет смысл только после 19.2. +Александр Сапин. ### 24.22. Вывод типов по блоку данных. Вывод формата данных по примеру {#vyvod-tipov-po-bloku-dannykh-vyvod-formata-dannykh-po-primeru} @@ -1955,13 +1973,14 @@ ClickHouse также может использоваться для быстр Михаил Филитов, ВШЭ. Upd. Есть pull request. Нужно ещё чистить код библиотеки. -### 24.26. Поддержка open tracing или аналогов {#podderzhka-open-tracing-ili-analogov} +### 24.26. + Поддержка open tracing или аналогов {#podderzhka-open-tracing-ili-analogov} [#5182](https://github.com/ClickHouse/ClickHouse/issues/5182) Александр Кожихов, ВШЭ и Яндекс.YT. Upd. Есть pull request с прототипом. Upd. Александ Кузьменков взял задачу в работу. +Сделано. ### 24.27. 
Реализация алгоритмов min-hash, sim-hash для нечёткого поиска полудубликатов {#realizatsiia-algoritmov-min-hash-sim-hash-dlia-nechiotkogo-poiska-poludublikatov} @@ -1995,7 +2014,7 @@ Amos Bird, но его решение слишком громоздкое и п Перепиcывание в JOIN. Не раньше 21.11, 21.12, 21.9. Низкий приоритет. Отложено. -### 24.32. Поддержка GRPC {#podderzhka-grpc} +### 24.32. + Поддержка GRPC {#podderzhka-grpc} Мария Конькова, ВШЭ и Яндекс. Также смотрите 24.29. @@ -2009,6 +2028,7 @@ Amos Bird, но его решение слишком громоздкое и п Задача в работе, есть pull request. [#10136](https://github.com/ClickHouse/ClickHouse/pull/10136) Upd. Задачу взял в работу Виталий Баранов. +Сделано. ## 25. DevRel {#devrel} @@ -2067,13 +2087,14 @@ Upd. Задачу взял в работу Виталий Баранов. Алексей Миловидов и все подготовленные докладчики. Upd. Участвуем. -### 25.14. Конференции в России: все HighLoad, возможно CodeFest, DUMP или UWDC, возможно C++ Russia {#konferentsii-v-rossii-vse-highload-vozmozhno-codefest-dump-ili-uwdc-vozmozhno-c-russia} +### 25.14. + Конференции в России: все HighLoad, возможно CodeFest, DUMP или UWDC, возможно C++ Russia {#konferentsii-v-rossii-vse-highload-vozmozhno-codefest-dump-ili-uwdc-vozmozhno-c-russia} Алексей Миловидов и все подготовленные докладчики. Upd. Есть Saint HighLoad online. Upd. Есть C++ Russia. CodeFest, DUMP, UWDC отменились. Upd. Добавились Highload Fwdays, Матемаркетинг. +Upd. Добавились подкасты C++ Russia. ### 25.15. Конференции зарубежные: Percona, DataOps, попытка попасть на более крупные {#konferentsii-zarubezhnye-percona-dataops-popytka-popast-na-bolee-krupnye} @@ -2096,6 +2117,7 @@ DataOps отменилась. Есть минимальный прототип. Сделал Илья Яцишин. Этот прототип не позволяет делиться ссылками на результаты запросов. Upd. На финальной стадии инструмент для экспериментирования с разными версиями ClickHouse. +Upd. По факту, задача считается не сделанной (готово только 99%, не 100%). ### 25.17. Взаимодействие с ВУЗами: ВШЭ, УрФУ, ICT Beijing {#vzaimodeistvie-s-vuzami-vshe-urfu-ict-beijing} @@ -2103,6 +2125,7 @@ Upd. На финальной стадии инструмент для экспе Благодаря Robert Hodges добавлен CMU. Upd. Взаимодействие с ВШЭ 2019/2020 успешно выполнено. Upd. Идёт подготовка к 2020/2021. +Upd. Уже взяли несколько десятков человек на 2020/2021. ### 25.18. - Лекция в ШАД {#lektsiia-v-shad} diff --git a/docs/tools/build.py b/docs/tools/build.py index bcbf3ac27cd..45d74423fa8 100755 --- a/docs/tools/build.py +++ b/docs/tools/build.py @@ -202,7 +202,11 @@ def build(args): if __name__ == '__main__': os.chdir(os.path.join(os.path.dirname(__file__), '..')) - website_dir = os.path.join('..', 'website') + + # A root path to ClickHouse source code. + src_dir = '..' 
+ + website_dir = os.path.join(src_dir, 'website') arg_parser = argparse.ArgumentParser() arg_parser.add_argument('--lang', default='en,es,fr,ru,zh,ja,tr,fa') @@ -210,6 +214,7 @@ if __name__ == '__main__': arg_parser.add_argument('--docs-dir', default='.') arg_parser.add_argument('--theme-dir', default=website_dir) arg_parser.add_argument('--website-dir', default=website_dir) + arg_parser.add_argument('--src-dir', default=src_dir) arg_parser.add_argument('--blog-dir', default=os.path.join(website_dir, 'blog')) arg_parser.add_argument('--output-dir', default='build') arg_parser.add_argument('--enable-stable-releases', action='store_true') diff --git a/docs/tools/website.py b/docs/tools/website.py index a658b0cfc34..4cce69bd869 100644 --- a/docs/tools/website.py +++ b/docs/tools/website.py @@ -145,13 +145,19 @@ def build_website(args): 'public', 'node_modules', 'templates', - 'locale' + 'locale', + '.gitkeep' ) ) + + # This file can be requested to check for available ClickHouse releases. + shutil.copy2( + os.path.join(args.src_dir, 'utils', 'list-versions', 'version_date.tsv'), + os.path.join(args.output_dir, 'data', 'version_date.tsv')) + shutil.copy2( os.path.join(args.website_dir, 'js', 'embedd.min.js'), - os.path.join(args.output_dir, 'js', 'embedd.min.js') - ) + os.path.join(args.output_dir, 'js', 'embedd.min.js')) for root, _, filenames in os.walk(args.output_dir): for filename in filenames: diff --git a/docs/tr/development/contrib.md b/docs/tr/development/contrib.md index 63cc289ec9b..f56cf2a625b 100644 --- a/docs/tr/development/contrib.md +++ b/docs/tr/development/contrib.md @@ -19,7 +19,6 @@ toc_title: "Kullan\u0131lan \xDC\xE7\xFCnc\xFC Taraf K\xFCt\xFCphaneleri" | googletest | [BSD 3-Clause Lisansı](https://github.com/google/googletest/blob/master/LICENSE) | | h33 | [Apache Lic 2.0ense 2.0](https://github.com/uber/h3/blob/master/LICENSE) | | hyperscan | [BSD 3-Clause Lisansı](https://github.com/intel/hyperscan/blob/master/LICENSE) | -| libbtrie | [BSD 2-Clause Lisansı](https://github.com/ClickHouse/ClickHouse/blob/master/contrib/libbtrie/LICENSE) | | libcxxabi | [BSD + MIT](https://github.com/ClickHouse/ClickHouse/blob/master/libs/libglibc-compatibility/libcxxabi/LICENSE.TXT) | | libdivide | [Zlib Lisansı](https://github.com/ClickHouse/ClickHouse/blob/master/contrib/libdivide/LICENSE.txt) | | libgsasl | [LGPL v2. 
1](https://github.com/ClickHouse-Extras/libgsasl/blob/3b8948a4042e34fb00b4fb987535dc9e02e39040/LICENSE) | diff --git a/docs/zh/development/contrib.md b/docs/zh/development/contrib.md index 0129ee62ce7..8e8efc3c04e 100644 --- a/docs/zh/development/contrib.md +++ b/docs/zh/development/contrib.md @@ -11,7 +11,6 @@ | FastMemcpy | [MIT](https://github.com/ClickHouse/ClickHouse/blob/master/libs/libmemcpy/impl/LICENSE) | | googletest | [BSD3-条款许可](https://github.com/google/googletest/blob/master/LICENSE) | | 超扫描 | [BSD3-条款许可](https://github.com/intel/hyperscan/blob/master/LICENSE) | -| libbtrie | [BSD2-条款许可](https://github.com/ClickHouse/ClickHouse/blob/master/contrib/libbtrie/LICENSE) | | libcxxabi | [BSD + MIT](https://github.com/ClickHouse/ClickHouse/blob/master/libs/libglibc-compatibility/libcxxabi/LICENSE.TXT) | | libdivide | [Zlib许可证](https://github.com/ClickHouse/ClickHouse/blob/master/contrib/libdivide/LICENSE.txt) | | libgsasl | [LGPL v2.1](https://github.com/ClickHouse-Extras/libgsasl/blob/3b8948a4042e34fb00b4fb987535dc9e02e39040/LICENSE) | diff --git a/docs/zh/getting-started/playground.md b/docs/zh/getting-started/playground.md index 670889f303c..f7ab0ac0013 100644 --- a/docs/zh/getting-started/playground.md +++ b/docs/zh/getting-started/playground.md @@ -38,10 +38,15 @@ ClickHouse体验平台提供了小型集群[Managed Service for ClickHouse](http - 不允许插入查询 还强制执行以下设置: -- [max_result_bytes=10485760](../operations/settings/query_complexity/#max-result-bytes) -- [max_result_rows=2000](../operations/settings/query_complexity/#setting-max_result_rows) -- [result_overflow_mode=break](../operations/settings/query_complexity/#result-overflow-mode) -- [max_execution_time=60000](../operations/settings/query_complexity/#max-execution-time) +- [max_result_bytes=10485760](../operations/settings/query-complexity/#max-result-bytes) +- [max_result_rows=2000](../operations/settings/query-complexity/#setting-max_result_rows) +- [result_overflow_mode=break](../operations/settings/query-complexity/#result-overflow-mode) +- [max_execution_time=60000](../operations/settings/query-complexity/#max-execution-time) + +ClickHouse体验还有如下: +[ClickHouse管理服务](https://cloud.yandex.com/services/managed-clickhouse) +实例托管 [Yandex云](https://cloud.yandex.com/)。 +更多信息 [云提供商](../commercial/cloud.md)。 ## 示例 {#examples} diff --git a/docs/zh/operations/monitoring.md b/docs/zh/operations/monitoring.md index a5c30e46f4c..73896d3f8c1 100644 --- a/docs/zh/operations/monitoring.md +++ b/docs/zh/operations/monitoring.md @@ -33,10 +33,10 @@ ClickHouse 收集的指标项: - 服务用于计算的资源占用的各种指标。 - 关于查询处理的常见统计信息。 -可以在 [系统指标](system-tables/metrics.md#system_tables-metrics) ,[系统事件](system-tables/events.md#system_tables-events) 以及[系统异步指标](system-tables/asynchronous_metrics.md#system_tables-asynchronous_metrics) 等系统表查看所有的指标项。 +可以在[系统指标](system-tables/metrics.md#system_tables-metrics),[系统事件](system-tables/events.md#system_tables-events)以及[系统异步指标](system-tables/asynchronous_metrics.md#system_tables-asynchronous_metrics)等系统表查看所有的指标项。 -可以配置ClickHouse 往 [石墨](https://github.com/graphite-project)导入指标。 参考 [石墨部分](server-configuration-parameters/settings.md#server_configuration_parameters-graphite) 配置文件。在配置指标导出之前,需要参考Graphite[官方教程](https://graphite.readthedocs.io/en/latest/install.html)搭建服务。 +可以配置ClickHouse向[Graphite](https://github.com/graphite-project)推送监控信息并导入指标。参考[Graphite监控](server-configuration-parameters/settings.md#server_configuration_parameters-graphite)配置文件。在配置指标导出之前,需要参考[Graphite官方教程](https://graphite.readthedocs.io/en/latest/install.html)搭建Graphite服务。 -此外,您可以通过HTTP 
API监视服务器可用性。 将HTTP GET请求发送到 `/ping`。 如果服务器可用,它将以 `200 OK` 响应。 +此外,您可以通过HTTP API监视服务器可用性。将HTTP GET请求发送到`/ping`。如果服务器可用,它将以 `200 OK` 响应。 -要监视服务器集群的配置,应设置[max_replica_delay_for_distributed_queries](settings/settings.md#settings-max_replica_delay_for_distributed_queries)参数并使用HTTP资源`/replicas_status`。 如果副本可用,并且不延迟在其他副本之后,则对`/replicas_status`的请求将返回200 OK。 如果副本滞后,请求将返回 `503 HTTP_SERVICE_UNAVAILABLE`,包括有关待办事项大小的信息。 +要监视服务器集群的配置,应设置[max_replica_delay_for_distributed_queries](settings/settings.md#settings-max_replica_delay_for_distributed_queries)参数并使用HTTP资源`/replicas_status`。 如果副本可用,并且不延迟在其他副本之后,则对`/replicas_status`的请求将返回`200 OK`。 如果副本滞后,请求将返回`503 HTTP_SERVICE_UNAVAILABLE`,包括有关待办事项大小的信息。 diff --git a/docs/zh/operations/utilities/clickhouse-local.md b/docs/zh/operations/utilities/clickhouse-local.md index 4e89961e198..3ff38c01651 100644 --- a/docs/zh/operations/utilities/clickhouse-local.md +++ b/docs/zh/operations/utilities/clickhouse-local.md @@ -3,18 +3,18 @@ toc_priority: 60 toc_title: clickhouse-local --- -# ツ环板-ョツ嘉ッツ偲 {#clickhouse-local} +# ClickHouse Local {#clickhouse-local} -该 `clickhouse-local` 程序使您能够对本地文件执行快速处理,而无需部署和配置ClickHouse服务器。 +`clickhouse-local`模式可以使您能够对本地文件执行快速处理,而无需部署和配置ClickHouse服务器。 -接受表示表的数据并使用以下方式查询它们 [ツ环板ECTョツ嘉ッツ偲](../../operations/utilities/clickhouse-local.md). +[ClickHouse SQL语法](../../operations/utilities/clickhouse-local.md)支持对表格数据的查询. -`clickhouse-local` 使用与ClickHouse server相同的核心,因此它支持大多数功能以及相同的格式和表引擎。 +`clickhouse-local`使用与ClickHouse Server相同的核心,因此它支持大多数功能以及相同的格式和表引擎。 -默认情况下 `clickhouse-local` 不能访问同一主机上的数据,但它支持使用以下方式加载服务器配置 `--config-file` 争论。 +默认情况下`clickhouse-local`不能访问同一主机上的数据,但它支持使用`--config-file`方式加载服务器配置。 !!! warning "警告" - 不建议将生产服务器配置加载到 `clickhouse-local` 因为数据可以在人为错误的情况下被损坏。 + 不建议将生产服务器配置加载到`clickhouse-local`因为数据可以在人为错误的情况下被损坏。 ## 用途 {#usage} @@ -26,21 +26,21 @@ clickhouse-local --structure "table_structure" --input-format "format_of_incomin 参数: -- `-S`, `--structure` — table structure for input data. -- `-if`, `--input-format` — input format, `TSV` 默认情况下。 -- `-f`, `--file` — path to data, `stdin` 默认情况下。 -- `-q` `--query` — queries to execute with `;` 如delimeter。 -- `-N`, `--table` — table name where to put output data, `table` 默认情况下。 -- `-of`, `--format`, `--output-format` — output format, `TSV` 默认情况下。 -- `--stacktrace` — whether to dump debug output in case of exception. -- `--verbose` — more details on query execution. -- `-s` — disables `stderr` 记录。 -- `--config-file` — path to configuration file in same format as for ClickHouse server, by default the configuration empty. -- `--help` — arguments references for `clickhouse-local`. +- `-S`, `--structure` — 输入数据的表结构。 +- `-if`, `--input-format` — 输入格式化类型, 默认是`TSV`。 +- `-f`, `--file` — 数据路径, 默认是`stdin`。 +- `-q` `--query` — 要查询的SQL语句使用`;`做分隔符。 +- `-N`, `--table` — 数据输出的表名,默认是`table`。 +- `-of`, `--format`, `--output-format` — 输出格式化类型, 默认是`TSV`。 +- `--stacktrace` — 是否在出现异常时输出栈信息。 +- `--verbose` — debug显示查询的详细信息。 +- `-s` — 禁用`stderr`输出信息。 +- `--config-file` — 与ClickHouse服务器格式相同配置文件的路径,默认情况下配置为空。 +- `--help` — `clickhouse-local`使用帮助信息。 -还有每个ClickHouse配置变量的参数,这些变量更常用,而不是 `--config-file`. +对于每个ClickHouse配置的参数,也可以单独使用,可以不使用`--config-file`指定。 -## 例 {#examples} +## 示例 {#examples} ``` bash echo -e "1,2\n3,4" | clickhouse-local -S "a Int64, b Int64" -if "CSV" -q "SELECT * FROM table" @@ -49,7 +49,7 @@ Read 2 rows, 32.00 B in 0.000 sec., 5182 rows/sec., 80.97 KiB/sec. 
3 4 ``` -前面的例子是一样的: +另一个示例,类似上一个使用示例: ``` bash $ echo -e "1,2\n3,4" | clickhouse-local -q "CREATE TABLE table (a Int64, b Int64) ENGINE = File(CSV, stdin); SELECT a, b FROM table; DROP TABLE table" @@ -58,7 +58,22 @@ Read 2 rows, 32.00 B in 0.000 sec., 4987 rows/sec., 77.93 KiB/sec. 3 4 ``` -现在让我们为每个Unix用户输出内存用户: +你可以使用`stdin`或`--file`参数, 打开任意数量的文件来使用多个文件[`file` table function](../../sql-reference/table-functions/file.md): + +```bash +$ echo 1 | tee 1.tsv +1 + +$ echo 2 | tee 2.tsv +2 + +$ clickhouse-local --query " + select * from file('1.tsv', TSV, 'a int') t1 + cross join file('2.tsv', TSV, 'b int') t2" +1 2 +``` + +现在让我们查询每个Unix用户使用内存: ``` bash $ ps aux | tail -n +2 | awk '{ printf("%s\t%s\n", $1, $4) }' | clickhouse-local -S "user String, mem Float64" -q "SELECT user, round(sum(mem), 2) as memTotal FROM table GROUP BY user ORDER BY memTotal DESC FORMAT Pretty" diff --git a/docs/zh/sql-reference/functions/bitmap-functions.md b/docs/zh/sql-reference/functions/bitmap-functions.md index d2018f5d9c1..5a6baf2f217 100644 --- a/docs/zh/sql-reference/functions/bitmap-functions.md +++ b/docs/zh/sql-reference/functions/bitmap-functions.md @@ -6,7 +6,7 @@ 我们使用RoaringBitmap实际存储位图对象,当基数小于或等于32时,它使用Set保存。当基数大于32时,它使用RoaringBitmap保存。这也是为什么低基数集的存储更快的原因。 -有关RoaringBitmap的更多信息,请参阅:[呻吟声](https://github.com/RoaringBitmap/CRoaring)。 +有关RoaringBitmap的更多信息,请参阅:[RoaringBitmap](https://github.com/RoaringBitmap/CRoaring)。 ## bitmapBuild {#bitmapbuild} diff --git a/docs/zh/sql-reference/functions/math-functions.md b/docs/zh/sql-reference/functions/math-functions.md index 81c2fcecdbc..6634b095b0d 100644 --- a/docs/zh/sql-reference/functions/math-functions.md +++ b/docs/zh/sql-reference/functions/math-functions.md @@ -76,7 +76,7 @@ SELECT erf(3 / sqrt(2)) 返回x的三角余弦值。 -## 谭(x) {#tanx} +## tan(x) {#tanx} 返回x的三角正切值。 @@ -88,7 +88,7 @@ SELECT erf(3 / sqrt(2)) 返回x的反三角余弦值。 -## 阿坦(x) {#atanx} +## atan(x) {#atanx} 返回x的反三角正切值。 diff --git a/programs/CMakeLists.txt b/programs/CMakeLists.txt index 3817bc62bcb..8f45bf53f53 100644 --- a/programs/CMakeLists.txt +++ b/programs/CMakeLists.txt @@ -43,13 +43,81 @@ else () ${ENABLE_CLICKHOUSE_ALL}) endif () +message(STATUS "ClickHouse modes:") + +if (NOT ENABLE_CLICKHOUSE_SERVER) + message(WARNING "ClickHouse server mode is not going to be built.") +else() + message(STATUS "Server mode: ON") +endif() + +if (NOT ENABLE_CLICKHOUSE_CLIENT) + message(WARNING "ClickHouse client mode is not going to be built. 
You won't be able to connect to the server and run + tests") +else() + message(STATUS "Client mode: ON") +endif() + +if (ENABLE_CLICKHOUSE_LOCAL) + message(STATUS "Local mode: ON") +else() + message(STATUS "Local mode: OFF") +endif() + +if (ENABLE_CLICKHOUSE_BENCHMARK) + message(STATUS "Benchmark mode: ON") +else() + message(STATUS "Benchmark mode: OFF") +endif() + +if (ENABLE_CLICKHOUSE_EXTRACT_FROM_CONFIG) + message(STATUS "Extract from config mode: ON") +else() + message(STATUS "Extract from config mode: OFF") +endif() + +if (ENABLE_CLICKHOUSE_COMPRESSOR) + message(STATUS "Compressor mode: ON") +else() + message(STATUS "Compressor mode: OFF") +endif() + +if (ENABLE_CLICKHOUSE_COPIER) + message(STATUS "Copier mode: ON") +else() + message(STATUS "Copier mode: OFF") +endif() + +if (ENABLE_CLICKHOUSE_FORMAT) + message(STATUS "Format mode: ON") +else() + message(STATUS "Format mode: OFF") +endif() + +if (ENABLE_CLICKHOUSE_OBFUSCATOR) + message(STATUS "Obfuscator mode: ON") +else() + message(STATUS "Obfuscator mode: OFF") +endif() + +if (ENABLE_CLICKHOUSE_ODBC_BRIDGE) + message(STATUS "ODBC bridge mode: ON") +else() + message(STATUS "ODBC bridge mode: OFF") +endif() + +if (ENABLE_CLICKHOUSE_INSTALL) + message(STATUS "ClickHouse install: ON") +else() + message(STATUS "ClickHouse install: OFF") +endif() + if(NOT (MAKE_STATIC_LIBRARIES OR SPLIT_SHARED_LIBRARIES)) set(CLICKHOUSE_ONE_SHARED ON) endif() configure_file (config_tools.h.in ${ConfigIncludePath}/config_tools.h) - macro(clickhouse_target_link_split_lib target name) if(NOT CLICKHOUSE_ONE_SHARED) target_link_libraries(${target} PRIVATE clickhouse-${name}-lib) diff --git a/programs/client/Client.cpp b/programs/client/Client.cpp index 5348a9e36c5..e4858eeda8b 100644 --- a/programs/client/Client.cpp +++ b/programs/client/Client.cpp @@ -2515,7 +2515,7 @@ public: { std::string traceparent = options["opentelemetry-traceparent"].as(); std::string error; - if (!context.getClientInfo().parseTraceparentHeader( + if (!context.getClientInfo().client_trace_context.parseTraceparentHeader( traceparent, error)) { throw Exception(ErrorCodes::BAD_ARGUMENTS, @@ -2526,7 +2526,7 @@ public: if (options.count("opentelemetry-tracestate")) { - context.getClientInfo().opentelemetry_tracestate = + context.getClientInfo().client_trace_context.tracestate = options["opentelemetry-tracestate"].as(); } diff --git a/programs/copier/ClusterCopier.cpp b/programs/copier/ClusterCopier.cpp index a129dc7efcc..2f19fc47fd2 100644 --- a/programs/copier/ClusterCopier.cpp +++ b/programs/copier/ClusterCopier.cpp @@ -62,6 +62,9 @@ decltype(auto) ClusterCopier::retry(T && func, UInt64 max_tries) { std::exception_ptr exception; + if (max_tries == 0) + throw Exception("Cannot perform zero retries", ErrorCodes::LOGICAL_ERROR); + for (UInt64 try_number = 1; try_number <= max_tries; ++try_number) { try @@ -605,7 +608,7 @@ TaskStatus ClusterCopier::tryMoveAllPiecesToDestinationTable(const TaskTable & t settings_push.replication_alter_partitions_sync = 2; query_alter_ast_string += " ALTER TABLE " + getQuotedTable(original_table) + - " ATTACH PARTITION " + partition_name + + ((partition_name == "'all'") ? 
" ATTACH PARTITION ID " : " ATTACH PARTITION ") + partition_name + " FROM " + getQuotedTable(helping_table); LOG_DEBUG(log, "Executing ALTER query: {}", query_alter_ast_string); @@ -636,7 +639,7 @@ TaskStatus ClusterCopier::tryMoveAllPiecesToDestinationTable(const TaskTable & t if (!task_table.isReplicatedTable()) { query_deduplicate_ast_string += " OPTIMIZE TABLE " + getQuotedTable(original_table) + - " PARTITION " + partition_name + " DEDUPLICATE;"; + ((partition_name == "'all'") ? " PARTITION ID " : " PARTITION ") + partition_name + " DEDUPLICATE;"; LOG_DEBUG(log, "Executing OPTIMIZE DEDUPLICATE query: {}", query_alter_ast_string); @@ -807,7 +810,7 @@ bool ClusterCopier::tryDropPartitionPiece( DatabaseAndTableName helping_table = DatabaseAndTableName(original_table.first, original_table.second + "_piece_" + toString(current_piece_number)); String query = "ALTER TABLE " + getQuotedTable(helping_table); - query += " DROP PARTITION " + task_partition.name + ""; + query += ((task_partition.name == "'all'") ? " DROP PARTITION ID " : " DROP PARTITION ") + task_partition.name + ""; /// TODO: use this statement after servers will be updated up to 1.1.54310 // query += " DROP PARTITION ID '" + task_partition.name + "'"; @@ -1567,7 +1570,7 @@ void ClusterCopier::dropParticularPartitionPieceFromAllHelpingTables(const TaskT DatabaseAndTableName original_table = task_table.table_push; DatabaseAndTableName helping_table = DatabaseAndTableName(original_table.first, original_table.second + "_piece_" + toString(current_piece_number)); - String query = "ALTER TABLE " + getQuotedTable(helping_table) + " DROP PARTITION " + partition_name; + String query = "ALTER TABLE " + getQuotedTable(helping_table) + ((partition_name == "'all'") ? " DROP PARTITION ID " : " DROP PARTITION ") + partition_name; const ClusterPtr & cluster_push = task_table.cluster_push; Settings settings_push = task_cluster->settings_push; @@ -1670,14 +1673,24 @@ void ClusterCopier::createShardInternalTables(const ConnectionTimeouts & timeout std::set ClusterCopier::getShardPartitions(const ConnectionTimeouts & timeouts, TaskShard & task_shard) { + std::set res; + createShardInternalTables(timeouts, task_shard, false); TaskTable & task_table = task_shard.task_table; + const String & partition_name = queryToString(task_table.engine_push_partition_key_ast); + + if (partition_name == "'all'") + { + res.emplace("'all'"); + return res; + } + String query; { WriteBufferFromOwnString wb; - wb << "SELECT DISTINCT " << queryToString(task_table.engine_push_partition_key_ast) << " AS partition FROM" + wb << "SELECT DISTINCT " << partition_name << " AS partition FROM" << " " << getQuotedTable(task_shard.table_read_shard) << " ORDER BY partition DESC"; query = wb.str(); } @@ -1692,7 +1705,6 @@ std::set ClusterCopier::getShardPartitions(const ConnectionTimeouts & ti local_context.setSettings(task_cluster->settings_pull); Block block = getBlockWithAllStreamData(InterpreterFactory::get(query_ast, local_context)->execute().getInputStream()); - std::set res; if (block) { ColumnWithTypeAndName & column = block.getByPosition(0); @@ -1803,7 +1815,7 @@ UInt64 ClusterCopier::executeQueryOnCluster( if (execution_mode == ClusterExecutionMode::ON_EACH_NODE) max_successful_executions_per_shard = 0; - std::atomic origin_replicas_number; + std::atomic origin_replicas_number = 0; /// We need to execute query on one replica at least auto do_for_shard = [&] (UInt64 shard_index, Settings shard_settings) diff --git a/programs/install/Install.cpp 
b/programs/install/Install.cpp index da22452819a..9e3942e126d 100644 --- a/programs/install/Install.cpp +++ b/programs/install/Install.cpp @@ -21,6 +21,7 @@ #include #include #include +#include #include #include #include @@ -70,7 +71,7 @@ namespace po = boost::program_options; namespace fs = std::filesystem; -auto executeScript(const std::string & command, bool throw_on_error = false) +static auto executeScript(const std::string & command, bool throw_on_error = false) { auto sh = ShellCommand::execute(command); WriteBufferFromFileDescriptor wb_stdout(STDOUT_FILENO); @@ -87,7 +88,7 @@ auto executeScript(const std::string & command, bool throw_on_error = false) return sh->tryWait(); } -bool ask(std::string question) +static bool ask(std::string question) { while (true) { @@ -104,6 +105,16 @@ bool ask(std::string question) } } +static bool filesEqual(std::string path1, std::string path2) +{ + MMapReadBufferFromFile in1(path1, 0); + MMapReadBufferFromFile in2(path2, 0); + + /// memcmp is faster than hashing and comparing hashes + return in1.buffer().size() == in2.buffer().size() + && 0 == memcmp(in1.buffer().begin(), in2.buffer().begin(), in1.buffer().size()); +} + int mainEntryClickHouseInstall(int argc, char ** argv) { @@ -143,57 +154,89 @@ int mainEntryClickHouseInstall(int argc, char ** argv) throw Exception(ErrorCodes::FILE_DOESNT_EXIST, "Cannot obtain path to the binary from {}, file doesn't exist", binary_self_path.string()); + fs::path binary_self_canonical_path = fs::canonical(binary_self_path); + /// Copy binary to the destination directory. /// TODO An option to link instead of copy - useful for developers. - /// TODO Check if the binary is the same. - - size_t binary_size = fs::file_size(binary_self_path); fs::path prefix = fs::path(options["prefix"].as()); fs::path bin_dir = prefix / fs::path(options["binary-path"].as()); - size_t available_space = fs::space(bin_dir).available; - if (available_space < binary_size) - throw Exception(ErrorCodes::NOT_ENOUGH_SPACE, "Not enough space for clickhouse binary in {}, required {}, available {}.", - bin_dir.string(), ReadableSize(binary_size), ReadableSize(available_space)); - fs::path main_bin_path = bin_dir / "clickhouse"; fs::path main_bin_tmp_path = bin_dir / "clickhouse.new"; fs::path main_bin_old_path = bin_dir / "clickhouse.old"; - fmt::print("Copying ClickHouse binary to {}\n", main_bin_tmp_path.string()); + size_t binary_size = fs::file_size(binary_self_path); - try + bool old_binary_exists = fs::exists(main_bin_path); + bool already_installed = false; + + /// Check if the binary is the same file (already installed). + if (old_binary_exists && binary_self_canonical_path == fs::canonical(main_bin_path)) { - ReadBufferFromFile in(binary_self_path.string()); - WriteBufferFromFile out(main_bin_tmp_path.string()); - copyData(in, out); - out.sync(); - - if (0 != fchmod(out.getFD(), S_IRUSR | S_IRGRP | S_IROTH | S_IXUSR | S_IXGRP | S_IXOTH)) - throwFromErrno(fmt::format("Cannot chmod {}", main_bin_tmp_path.string()), ErrorCodes::SYSTEM_ERROR); - - out.finalize(); + already_installed = true; + fmt::print("ClickHouse binary is already located at {}\n", main_bin_path.string()); } - catch (const Exception & e) + /// Check if binary has the same content. 
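The `filesEqual` helper added above decides whether the candidate binary is already installed by comparing sizes first and then the raw bytes of memory-mapped buffers (the comment notes memcmp is cheaper than hashing). A rough standard-library sketch of the same size-then-bytes idea — the helper name and paths are illustrative, and it deliberately avoids ClickHouse's MMapReadBufferFromFile:

```cpp
#include <algorithm>
#include <filesystem>
#include <fstream>
#include <iostream>
#include <iterator>

// Portable stand-in for the "is this the same binary?" check: equal size, then
// byte-for-byte comparison of the contents.
static bool filesLookEqual(const std::filesystem::path & a, const std::filesystem::path & b)
{
    namespace fs = std::filesystem;
    if (fs::file_size(a) != fs::file_size(b))
        return false;

    std::ifstream fa(a, std::ios::binary), fb(b, std::ios::binary);
    return std::equal(std::istreambuf_iterator<char>(fa), std::istreambuf_iterator<char>(),
                      std::istreambuf_iterator<char>(fb));
}

int main()
{
    // Paths are illustrative only; both files must exist for file_size() not to throw.
    std::cout << filesLookEqual("/usr/bin/clickhouse", "./clickhouse") << '\n';
}
```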
+ else if (old_binary_exists && binary_size == fs::file_size(main_bin_path)) { - if (e.code() == ErrorCodes::CANNOT_OPEN_FILE && geteuid() != 0) - std::cerr << "Install must be run as root: sudo ./clickhouse install\n"; - throw; + fmt::print("Found already existing ClickHouse binary at {} having the same size. Will check its contents.\n", + main_bin_path.string()); + + if (filesEqual(binary_self_path.string(), main_bin_path.string())) + { + already_installed = true; + fmt::print("ClickHouse binary is already located at {} and it has the same content as {}\n", + main_bin_path.string(), binary_self_canonical_path.string()); + } } - if (fs::exists(main_bin_path)) + if (already_installed) { - fmt::print("{} already exists, will rename existing binary to {} and put the new binary in place\n", - main_bin_path.string(), main_bin_old_path.string()); - - /// There is file exchange operation in Linux but it's not portable. - fs::rename(main_bin_path, main_bin_old_path); + if (0 != chmod(main_bin_path.string().c_str(), S_IRUSR | S_IRGRP | S_IROTH | S_IXUSR | S_IXGRP | S_IXOTH)) + throwFromErrno(fmt::format("Cannot chmod {}", main_bin_path.string()), ErrorCodes::SYSTEM_ERROR); } + else + { + size_t available_space = fs::space(bin_dir).available; + if (available_space < binary_size) + throw Exception(ErrorCodes::NOT_ENOUGH_SPACE, "Not enough space for clickhouse binary in {}, required {}, available {}.", + bin_dir.string(), ReadableSize(binary_size), ReadableSize(available_space)); - fmt::print("Renaming {} to {}.\n", main_bin_tmp_path.string(), main_bin_path.string()); - fs::rename(main_bin_tmp_path, main_bin_path); + fmt::print("Copying ClickHouse binary to {}\n", main_bin_tmp_path.string()); + + try + { + ReadBufferFromFile in(binary_self_path.string()); + WriteBufferFromFile out(main_bin_tmp_path.string()); + copyData(in, out); + out.sync(); + + if (0 != fchmod(out.getFD(), S_IRUSR | S_IRGRP | S_IROTH | S_IXUSR | S_IXGRP | S_IXOTH)) + throwFromErrno(fmt::format("Cannot chmod {}", main_bin_tmp_path.string()), ErrorCodes::SYSTEM_ERROR); + + out.finalize(); + } + catch (const Exception & e) + { + if (e.code() == ErrorCodes::CANNOT_OPEN_FILE && geteuid() != 0) + std::cerr << "Install must be run as root: sudo ./clickhouse install\n"; + throw; + } + + if (old_binary_exists) + { + fmt::print("{} already exists, will rename existing binary to {} and put the new binary in place\n", + main_bin_path.string(), main_bin_old_path.string()); + + /// There is file exchange operation in Linux but it's not portable. + fs::rename(main_bin_path, main_bin_old_path); + } + + fmt::print("Renaming {} to {}.\n", main_bin_tmp_path.string(), main_bin_path.string()); + fs::rename(main_bin_tmp_path, main_bin_path); + } /// Create symlinks. @@ -401,8 +444,8 @@ int mainEntryClickHouseInstall(int argc, char ** argv) ConfigurationPtr configuration(new Poco::Util::XMLConfiguration(processor.processConfig())); if (!configuration->getString("users.default.password", "").empty() - || configuration->getString("users.default.password_sha256_hex", "").empty() - || configuration->getString("users.default.password_double_sha1_hex", "").empty()) + || !configuration->getString("users.default.password_sha256_hex", "").empty() + || !configuration->getString("users.default.password_double_sha1_hex", "").empty()) { has_password_for_default_user = true; } @@ -576,7 +619,7 @@ int mainEntryClickHouseInstall(int argc, char ** argv) " || echo \"Cannot set 'net_admin' or 'ipc_lock' or 'sys_nice' capability for clickhouse binary." " This is optional. 
Taskstats accounting will be disabled." " To enable taskstats accounting you may add the required capability later manually.\"", - "/tmp/test_setcap.sh", main_bin_path.string()); + "/tmp/test_setcap.sh", fs::canonical(main_bin_path).string()); fmt::print(" {}\n", command); executeScript(command); #endif @@ -597,10 +640,6 @@ int mainEntryClickHouseInstall(int argc, char ** argv) } } - std::string maybe_sudo; - if (getuid() != 0) - maybe_sudo = "sudo "; - std::string maybe_password; if (has_password_for_default_user) maybe_password = " --password"; @@ -608,10 +647,19 @@ int mainEntryClickHouseInstall(int argc, char ** argv) fmt::print( "\nClickHouse has been successfully installed.\n" "\nStart clickhouse-server with:\n" - " {}clickhouse start\n" + " sudo clickhouse start\n" "\nStart clickhouse-client with:\n" " clickhouse-client{}\n\n", - maybe_sudo, maybe_password); + maybe_password); + } + catch (const fs::filesystem_error &) + { + std::cerr << getCurrentExceptionMessage(false) << '\n'; + + if (getuid() != 0) + std::cerr << "\nRun with sudo.\n"; + + return getCurrentExceptionCode(); } catch (...) { @@ -783,17 +831,20 @@ namespace return pid; } - int stop(const fs::path & pid_file) + int stop(const fs::path & pid_file, bool force) { UInt64 pid = isRunning(pid_file); if (!pid) return 0; - if (0 == kill(pid, 15)) /// Terminate - fmt::print("Sent termination signal.\n", pid); + int signal = force ? SIGKILL : SIGTERM; + const char * signal_name = force ? "kill" : "terminate"; + + if (0 == kill(pid, signal)) + fmt::print("Sent {} signal to process with pid {}.\n", signal_name, pid); else - throwFromErrno("Cannot send termination signal", ErrorCodes::SYSTEM_ERROR); + throwFromErrno(fmt::format("Cannot send {} signal", signal_name), ErrorCodes::SYSTEM_ERROR); size_t try_num = 0; constexpr size_t num_tries = 60; @@ -869,6 +920,7 @@ int mainEntryClickHouseStop(int argc, char ** argv) desc.add_options() ("help,h", "produce help message") ("pid-path", po::value()->default_value("/var/run/clickhouse-server"), "directory for pid file") + ("force", po::value()->default_value(false), "Stop with KILL signal instead of TERM") ; po::variables_map options; @@ -887,7 +939,7 @@ int mainEntryClickHouseStop(int argc, char ** argv) { fs::path pid_file = fs::path(options["pid-path"].as()) / "clickhouse-server.pid"; - return stop(pid_file); + return stop(pid_file, options["force"].as()); } catch (...) 
{ @@ -940,6 +992,7 @@ int mainEntryClickHouseRestart(int argc, char ** argv) ("config-path", po::value()->default_value("/etc/clickhouse-server"), "directory with configs") ("pid-path", po::value()->default_value("/var/run/clickhouse-server"), "directory for pid file") ("user", po::value()->default_value("clickhouse"), "clickhouse user") + ("force", po::value()->default_value(false), "Stop with KILL signal instead of TERM") ; po::variables_map options; @@ -962,7 +1015,7 @@ int mainEntryClickHouseRestart(int argc, char ** argv) fs::path config = fs::path(options["config-path"].as()) / "config.xml"; fs::path pid_file = fs::path(options["pid-path"].as()) / "clickhouse-server.pid"; - if (int res = stop(pid_file)) + if (int res = stop(pid_file, options["force"].as())) return res; return start(user, executable, config, pid_file); } diff --git a/programs/odbc-bridge/ODBCBlockInputStream.cpp b/programs/odbc-bridge/ODBCBlockInputStream.cpp index 00ca89bd887..3e2a2d0c7d4 100644 --- a/programs/odbc-bridge/ODBCBlockInputStream.cpp +++ b/programs/odbc-bridge/ODBCBlockInputStream.cpp @@ -79,11 +79,18 @@ namespace assert_cast(column).insert(value.convert()); break; case ValueType::vtDate: - assert_cast(column).insertValue(UInt16{LocalDate{value.convert()}.getDayNum()}); + { + Poco::DateTime date = value.convert(); + assert_cast(column).insertValue(UInt16{LocalDate(date.year(), date.month(), date.day()).getDayNum()}); break; + } case ValueType::vtDateTime: - assert_cast(column).insertValue(time_t{LocalDateTime{value.convert()}}); + { + Poco::DateTime datetime = value.convert(); + assert_cast(column).insertValue(time_t{LocalDateTime( + datetime.year(), datetime.month(), datetime.day(), datetime.hour(), datetime.minute(), datetime.second())}); break; + } case ValueType::vtUUID: assert_cast(column).insert(parse(value.convert())); break; @@ -112,6 +119,7 @@ Block ODBCBlockInputStream::readImpl() for (const auto idx : ext::range(0, row.fieldCount())) { + /// TODO This is extremely slow. const Poco::Dynamic::Var & value = row[idx]; if (!value.isEmpty()) diff --git a/programs/odbc-bridge/ODBCBridge.cpp b/programs/odbc-bridge/ODBCBridge.cpp index 24aa8e32ddb..3b26e192a07 100644 --- a/programs/odbc-bridge/ODBCBridge.cpp +++ b/programs/odbc-bridge/ODBCBridge.cpp @@ -109,6 +109,14 @@ void ODBCBridge::defineOptions(Poco::Util::OptionSet & options) .argument("err-log-path") .binding("logger.errorlog")); + options.addOption(Poco::Util::Option("stdout-path", "", "stdout log path, default console") + .argument("stdout-path") + .binding("logger.stdout")); + + options.addOption(Poco::Util::Option("stderr-path", "", "stderr log path, default console") + .argument("stderr-path") + .binding("logger.stderr")); + using Me = std::decay_t; options.addOption(Poco::Util::Option("help", "", "produce this help message") .binding("help") @@ -127,6 +135,27 @@ void ODBCBridge::initialize(Application & self) config().setString("logger", "ODBCBridge"); + /// Redirect stdout, stderr to specified files. + /// Some libraries and sanitizers write to stderr in case of errors. + const auto stdout_path = config().getString("logger.stdout", ""); + if (!stdout_path.empty()) + { + if (!freopen(stdout_path.c_str(), "a+", stdout)) + throw Poco::OpenFileException("Cannot attach stdout to " + stdout_path); + + /// Disable buffering for stdout. 
+ setbuf(stdout, nullptr); + } + const auto stderr_path = config().getString("logger.stderr", ""); + if (!stderr_path.empty()) + { + if (!freopen(stderr_path.c_str(), "a+", stderr)) + throw Poco::OpenFileException("Cannot attach stderr to " + stderr_path); + + /// Disable buffering for stderr. + setbuf(stderr, nullptr); + } + buildLoggers(config(), logger(), self.commandName()); BaseDaemon::logRevision(); diff --git a/programs/server/Server.cpp b/programs/server/Server.cpp index 951ece89929..26339c5ad3f 100644 --- a/programs/server/Server.cpp +++ b/programs/server/Server.cpp @@ -64,6 +64,7 @@ #include #include #include +#include #if !defined(ARCADIA_BUILD) @@ -84,6 +85,11 @@ # include #endif +#if USE_GRPC +# include +#endif + + namespace CurrentMetrics { extern const Metric Revision; @@ -806,7 +812,7 @@ int Server::main(const std::vector & /*args*/) http_params->setTimeout(settings.http_receive_timeout); http_params->setKeepAliveTimeout(keep_alive_timeout); - std::vector> servers; + std::vector servers; std::vector listen_hosts = DB::getMultipleValuesFromConfig(config(), "", "listen_host"); @@ -1035,6 +1041,15 @@ int Server::main(const std::vector & /*args*/) LOG_INFO(log, "Listening for PostgreSQL compatibility protocol: " + address.toString()); }); +#if USE_GRPC + create_server("grpc_port", [&](UInt16 port) + { + Poco::Net::SocketAddress server_address(listen_host, port); + servers.emplace_back(std::make_unique(*this, make_socket_address(listen_host, port))); + LOG_INFO(log, "Listening for gRPC protocol: " + server_address.toString()); + }); +#endif + /// Prometheus (if defined and not setup yet with http_port) create_server("prometheus.port", [&](UInt16 port) { @@ -1056,7 +1071,7 @@ int Server::main(const std::vector & /*args*/) global_context->enableNamedSessions(); for (auto & server : servers) - server->start(); + server.start(); { String level_str = config().getString("text_log.level", ""); @@ -1088,8 +1103,8 @@ int Server::main(const std::vector & /*args*/) int current_connections = 0; for (auto & server : servers) { - server->stop(); - current_connections += server->currentConnections(); + server.stop(); + current_connections += server.currentConnections(); } if (current_connections) @@ -1109,7 +1124,7 @@ int Server::main(const std::vector & /*args*/) { current_connections = 0; for (auto & server : servers) - current_connections += server->currentConnections(); + current_connections += server.currentConnections(); if (!current_connections) break; sleep_current_ms += sleep_one_ms; diff --git a/programs/server/config.d/logging_no_rotate.xml b/programs/server/config.d/logging_no_rotate.xml new file mode 120000 index 00000000000..cd66c69b3ed --- /dev/null +++ b/programs/server/config.d/logging_no_rotate.xml @@ -0,0 +1 @@ +../../../tests/config/config.d/logging_no_rotate.xml \ No newline at end of file diff --git a/programs/server/config.xml b/programs/server/config.xml index e17b59671af..dde3702a44b 100644 --- a/programs/server/config.xml +++ b/programs/server/config.xml @@ -11,6 +11,9 @@ trace /var/log/clickhouse-server/clickhouse-server.log /var/log/clickhouse-server/clickhouse-server.err.log + 1000M 10 @@ -131,6 +134,34 @@ 4096 3 + + + + true + + + + + + + + + + + + + + 100 @@ -550,7 +581,7 @@ system query_log
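A few hunks above, ODBCBridge.cpp starts redirecting stdout and stderr to the files passed via the new `--stdout-path`/`--stderr-path` options and disables buffering so that output written directly by libraries and sanitizers is not lost. A self-contained sketch of that technique, using plain `std::runtime_error` instead of Poco exceptions and placeholder log paths:

```cpp
#include <cstdio>
#include <stdexcept>
#include <string>

// Reopen a standard stream onto a file in append mode and make it unbuffered,
// mirroring the approach in ODBCBridge::initialize. Paths are placeholders.
static void attachStream(std::FILE * stream, const std::string & path, const char * name)
{
    if (path.empty())
        return;
    if (!std::freopen(path.c_str(), "a+", stream))
        throw std::runtime_error(std::string("Cannot attach ") + name + " to " + path);
    std::setbuf(stream, nullptr); // unbuffered, so partial lines survive a crash
}

int main()
{
    attachStream(stdout, "/tmp/odbc-bridge.stdout.log", "stdout");
    attachStream(stderr, "/tmp/odbc-bridge.stderr.log", "stderr");
    std::puts("this line goes to the stdout log file");
}
```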
toYYYYMM(event_date) + *_dictionary.xml diff --git a/programs/server/play.html b/programs/server/play.html index 22eea0002ca..37869228c04 100644 --- a/programs/server/play.html +++ b/programs/server/play.html @@ -1,6 +1,7 @@ + ClickHouse Query @@ -286,6 +288,8 @@
 (Ctrl+Enter) + + 🌑🌞
@@ -299,50 +303,117 @@