diff --git a/CHANGELOG.md b/CHANGELOG.md index e1764f07acf..8e4acdc293f 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,3 +1,77 @@ +## ClickHouse release 20.9 + +### ClickHouse release v20.9.2.20-stable, 2020-09-22 + +#### New Feature + +* Added column transformers `EXCEPT`, `REPLACE`, `APPLY`, which can be applied to the list of selected columns (after `*` or `COLUMNS(...)`). For example, you can write `SELECT * EXCEPT(URL) REPLACE(number + 1 AS number)`. Another example: `select * apply(length) apply(max) from wide_string_table` to find out the maximum length of all string columns. [#14233](https://github.com/ClickHouse/ClickHouse/pull/14233) ([Amos Bird](https://github.com/amosbird)). +* Added an aggregate function `rankCorr` which computes a rank correlation coefficient. [#11769](https://github.com/ClickHouse/ClickHouse/pull/11769) ([antikvist](https://github.com/antikvist)) [#14411](https://github.com/ClickHouse/ClickHouse/pull/14411) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)). +* Added table function `view` which turns a subquery into a table object. This helps with passing queries around. For instance, it can be used in remote/cluster table functions. [#12567](https://github.com/ClickHouse/ClickHouse/pull/12567) ([Amos Bird](https://github.com/amosbird)). + +#### Bug Fix + +* Fix bug when an `ALTER UPDATE` mutation with a Nullable column in the assignment expression and a constant value (like `UPDATE x = 42`) leads to an incorrect value in the column or a segfault. Fixes [#13634](https://github.com/ClickHouse/ClickHouse/issues/13634), [#14045](https://github.com/ClickHouse/ClickHouse/issues/14045). [#14646](https://github.com/ClickHouse/ClickHouse/pull/14646) ([alesapin](https://github.com/alesapin)). +* Fix wrong Decimal multiplication result that caused a wrong decimal scale of the result column. [#14603](https://github.com/ClickHouse/ClickHouse/pull/14603) ([Artem Zuikov](https://github.com/4ertus2)). +* Fixed the incorrect sorting order of `Nullable` column. This fixes [#14344](https://github.com/ClickHouse/ClickHouse/issues/14344). [#14495](https://github.com/ClickHouse/ClickHouse/pull/14495) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)). +* Fixed inconsistent comparison with primary key of type `FixedString` on index analysis if they're compared with a string of smaller size. This fixes https://github.com/ClickHouse/ClickHouse/issues/14908. [#15033](https://github.com/ClickHouse/ClickHouse/pull/15033) ([Amos Bird](https://github.com/amosbird)). +* Fix bug which leads to wrong merge assignment if a table has partitions with a single part. [#14444](https://github.com/ClickHouse/ClickHouse/pull/14444) ([alesapin](https://github.com/alesapin)). +* If function `bar` was called with specifically crafted arguments, buffer overflow was possible. This closes [#13926](https://github.com/ClickHouse/ClickHouse/issues/13926). [#15028](https://github.com/ClickHouse/ClickHouse/pull/15028) ([alexey-milovidov](https://github.com/alexey-milovidov)). +* Publish CPU frequencies per logical core in `system.asynchronous_metrics`. This fixes https://github.com/ClickHouse/ClickHouse/issues/14923. [#14924](https://github.com/ClickHouse/ClickHouse/pull/14924) ([Alexander Kuzmenkov](https://github.com/akuzm)). +* Fixed `.metadata.tmp File exists` error when using `MaterializeMySQL` database engine. [#14898](https://github.com/ClickHouse/ClickHouse/pull/14898) ([Winter Zhang](https://github.com/zhang2014)). 
+* Fix the issue where some invocations of the `extractAllGroups` function may trigger "Memory limit exceeded" error. This fixes [#13383](https://github.com/ClickHouse/ClickHouse/issues/13383). [#14889](https://github.com/ClickHouse/ClickHouse/pull/14889) ([alexey-milovidov](https://github.com/alexey-milovidov)). +* Fix SIGSEGV for an attempt to INSERT into StorageFile(fd). [#14887](https://github.com/ClickHouse/ClickHouse/pull/14887) ([Azat Khuzhin](https://github.com/azat)). +* Fix rare error in `SELECT` queries when the queried column has a `DEFAULT` expression which depends on another column that also has a `DEFAULT` expression, is not present in the select query and does not exist on disk. Partially fixes [#14531](https://github.com/ClickHouse/ClickHouse/issues/14531). [#14845](https://github.com/ClickHouse/ClickHouse/pull/14845) ([alesapin](https://github.com/alesapin)). +* Fix wrong monotonicity detection for shrunk `Int -> Int` cast of signed types. It might lead to an incorrect query result. This bug was unveiled in [#14513](https://github.com/ClickHouse/ClickHouse/issues/14513). [#14783](https://github.com/ClickHouse/ClickHouse/pull/14783) ([Amos Bird](https://github.com/amosbird)). +* Fixed missing default database name in metadata of materialized view when executing `ALTER ... MODIFY QUERY`. [#14664](https://github.com/ClickHouse/ClickHouse/pull/14664) ([tavplubix](https://github.com/tavplubix)). +* Fix possibly incorrect result of function `has` when LowCardinality and Nullable types are involved. [#14591](https://github.com/ClickHouse/ClickHouse/pull/14591) ([Mike](https://github.com/myrrc)). +* Clean up the data directory after Zookeeper exceptions during CREATE query for tables with ReplicatedMergeTree Engine. [#14563](https://github.com/ClickHouse/ClickHouse/pull/14563) ([Bharat Nallan](https://github.com/bharatnc)). +* Fix rare segfaults in functions with combinator `-Resample`, which could appear as a result of overflow with very large parameters. [#14562](https://github.com/ClickHouse/ClickHouse/pull/14562) ([Anton Popov](https://github.com/CurtizJ)). +* Check for array size overflow in `topK` aggregate function. Without this check the user may send a query with carefully crafted parameters that will lead to a server crash. This closes [#14452](https://github.com/ClickHouse/ClickHouse/issues/14452). [#14467](https://github.com/ClickHouse/ClickHouse/pull/14467) ([alexey-milovidov](https://github.com/alexey-milovidov)). +* Proxy restart/start/stop/reload of SysVinit to systemd (if it is used). [#14460](https://github.com/ClickHouse/ClickHouse/pull/14460) ([Azat Khuzhin](https://github.com/azat)). +* Stop query execution if an exception happened in `PipelineExecutor` itself. This could prevent a rare possible query hang. [#14334](https://github.com/ClickHouse/ClickHouse/pull/14334) [#14402](https://github.com/ClickHouse/ClickHouse/pull/14402) ([Nikolai Kochetov](https://github.com/KochetovNicolai)). +* Fix crash during `ALTER` query for a table which was created `AS table_function`. Fixes [#14212](https://github.com/ClickHouse/ClickHouse/issues/14212). [#14326](https://github.com/ClickHouse/ClickHouse/pull/14326) ([alesapin](https://github.com/alesapin)). +* Fix exception during ALTER LIVE VIEW query with REFRESH command. LIVE VIEW is an experimental feature. [#14320](https://github.com/ClickHouse/ClickHouse/pull/14320) ([Bharat Nallan](https://github.com/bharatnc)). +* Fix QueryPlan lifetime (for EXPLAIN PIPELINE graph=1) for queries with nested interpreter. 
[#14315](https://github.com/ClickHouse/ClickHouse/pull/14315) ([Azat Khuzhin](https://github.com/azat)). +* Better check for tuple size in SSD cache complex key external dictionaries. This fixes [#13981](https://github.com/ClickHouse/ClickHouse/issues/13981). [#14313](https://github.com/ClickHouse/ClickHouse/pull/14313) ([alexey-milovidov](https://github.com/alexey-milovidov)). +* Disallow `CODEC` on `ALIAS` column type. Fixes [#13911](https://github.com/ClickHouse/ClickHouse/issues/13911). [#14263](https://github.com/ClickHouse/ClickHouse/pull/14263) ([Bharat Nallan](https://github.com/bharatnc)). +* Fix GRANT ALL statement when executed on a non-global level. [#13987](https://github.com/ClickHouse/ClickHouse/pull/13987) ([Vitaly Baranov](https://github.com/vitlibar)). +* Fix arrayJoin() capturing in lambda (exception with logical error message was thrown). [#13792](https://github.com/ClickHouse/ClickHouse/pull/13792) ([Azat Khuzhin](https://github.com/azat)). + +#### Experimental Feature + +* Added `db-generator` tool for random database generation by given SELECT queries. It may facilitate reproducing issues when there is only an incomplete bug report from the user. [#14442](https://github.com/ClickHouse/ClickHouse/pull/14442) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)) [#10973](https://github.com/ClickHouse/ClickHouse/issues/10973) ([ZeDRoman](https://github.com/ZeDRoman)). + +#### Improvement + +* Allow using multi-volume storage configuration in storage Distributed. [#14839](https://github.com/ClickHouse/ClickHouse/pull/14839) ([Pavel Kovalenko](https://github.com/Jokser)). +* Disallow empty time_zone argument in `toStartOf*` type of functions. [#14509](https://github.com/ClickHouse/ClickHouse/pull/14509) ([Bharat Nallan](https://github.com/bharatnc)). +* MySQL handler returns `OK` for queries like `SET @@var = value`. Such statements are ignored. It is needed because some MySQL drivers send `SET @@` query for setup after handshake https://github.com/ClickHouse/ClickHouse/issues/9336#issuecomment-686222422 . [#14469](https://github.com/ClickHouse/ClickHouse/pull/14469) ([BohuTANG](https://github.com/BohuTANG)). +* Now TTLs will be applied during merge if they were not previously materialized. [#14438](https://github.com/ClickHouse/ClickHouse/pull/14438) ([alesapin](https://github.com/alesapin)). +* Now `clickhouse-obfuscator` supports UUID type as proposed in [#13163](https://github.com/ClickHouse/ClickHouse/issues/13163). [#14409](https://github.com/ClickHouse/ClickHouse/pull/14409) ([dimarub2000](https://github.com/dimarub2000)). +* Added new setting `system_events_show_zero_values` as proposed in [#11384](https://github.com/ClickHouse/ClickHouse/issues/11384). [#14404](https://github.com/ClickHouse/ClickHouse/pull/14404) ([dimarub2000](https://github.com/dimarub2000)). +* Implicitly convert primary key to not null in `MaterializeMySQL` (same as `MySQL`). Fixes [#14114](https://github.com/ClickHouse/ClickHouse/issues/14114). [#14397](https://github.com/ClickHouse/ClickHouse/pull/14397) ([Winter Zhang](https://github.com/zhang2014)). +* Replace wide integers (256 bit) from boost multiprecision with implementation from https://github.com/cerevra/int. 256-bit integers are experimental. [#14229](https://github.com/ClickHouse/ClickHouse/pull/14229) ([Artem Zuikov](https://github.com/4ertus2)). +* Add default compression codec for parts in `system.part_log` with the name `default_compression_codec`. 
[#14116](https://github.com/ClickHouse/ClickHouse/pull/14116) ([alesapin](https://github.com/alesapin)). +* Add precision argument for `DateTime` type. It allows using the `DateTime` name instead of `DateTime64`. [#13761](https://github.com/ClickHouse/ClickHouse/pull/13761) ([Winter Zhang](https://github.com/zhang2014)). +* Added requirepass authorization for `Redis` external dictionary. [#13688](https://github.com/ClickHouse/ClickHouse/pull/13688) ([Ivan Torgashov](https://github.com/it1804)). +* Improvements in `RabbitMQ` engine: added connection and channel failure handling, proper commits, insert failure handling, better exchanges, queue durability and queue resume opportunity, new queue settings. Fixed tests. [#12761](https://github.com/ClickHouse/ClickHouse/pull/12761) ([Kseniia Sumarokova](https://github.com/kssenii)). +* Support custom codecs in compact parts. [#12183](https://github.com/ClickHouse/ClickHouse/pull/12183) ([Anton Popov](https://github.com/CurtizJ)). + +#### Performance Improvement + +* Optimize queries with LIMIT/LIMIT BY/ORDER BY for distributed queries with GROUP BY sharding_key (under optimize_skip_unused_shards and optimize_distributed_group_by_sharding_key). [#10373](https://github.com/ClickHouse/ClickHouse/pull/10373) ([Azat Khuzhin](https://github.com/azat)). +* Creating sets for multiple `JOIN` and `IN` in parallel. It may slightly improve performance for queries with several different `IN subquery` expressions. [#14412](https://github.com/ClickHouse/ClickHouse/pull/14412) ([Nikolai Kochetov](https://github.com/KochetovNicolai)). +* Improve Kafka engine performance by providing an independent thread for each consumer. A separate thread pool is used for streaming engines (like Kafka). [#13939](https://github.com/ClickHouse/ClickHouse/pull/13939) ([fastio](https://github.com/fastio)). + +#### Build/Testing/Packaging Improvement + +* Lower binary size in debug build by removing debug info from `Functions`. This is needed only for one internal project in Yandex that uses a very old linker. [#14549](https://github.com/ClickHouse/ClickHouse/pull/14549) ([alexey-milovidov](https://github.com/alexey-milovidov)). +* Prepare for build with clang 11. [#14455](https://github.com/ClickHouse/ClickHouse/pull/14455) ([alexey-milovidov](https://github.com/alexey-milovidov)). +* Fix the logic in backport script. In previous versions it was triggered for any labels of 100% red color. It was strange. [#14433](https://github.com/ClickHouse/ClickHouse/pull/14433) ([alexey-milovidov](https://github.com/alexey-milovidov)). +* Integration tests use default base config. All config changes are explicit with main_configs, user_configs and dictionaries parameters for instance. [#13647](https://github.com/ClickHouse/ClickHouse/pull/13647) ([Ilya Yatsishin](https://github.com/qoega)). + + + ## ClickHouse release 20.8 ### ClickHouse release v20.8.2.3-stable, 2020-09-08 diff --git a/CMakeLists.txt b/CMakeLists.txt index 14f1fcb4a64..be1b6ac04f5 100644 --- a/CMakeLists.txt +++ b/CMakeLists.txt @@ -28,10 +28,11 @@ endforeach() project(ClickHouse) +# If turned off: e.g. when ENABLE_FOO is ON, but the FOO tool was not found, CMake will continue. 
option(FAIL_ON_UNSUPPORTED_OPTIONS_COMBINATION - "Stop/Fail CMake configuration if some ENABLE_XXX option is defined (either ON or OFF) but is not possible to satisfy" - ON -) + "Stop/Fail CMake configuration if some ENABLE_XXX option is defined (either ON or OFF) + but is not possible to satisfy" ON) + if(FAIL_ON_UNSUPPORTED_OPTIONS_COMBINATION) set(RECONFIGURE_MESSAGE_LEVEL FATAL_ERROR) else() @@ -58,7 +59,11 @@ set(CMAKE_DEBUG_POSTFIX "d" CACHE STRING "Generate debug library name with a pos # For more info see https://cmake.org/cmake/help/latest/prop_gbl/USE_FOLDERS.html set_property(GLOBAL PROPERTY USE_FOLDERS ON) -option(ENABLE_IPO "Enable full link time optimization (it's usually impractical; see also ENABLE_THINLTO)" OFF) # need cmake 3.9+ +# cmake 3.9+ needed. +# Usually impractical. +# See also ${ENABLE_THINLTO} +option(ENABLE_IPO "Full link time optimization") + if(ENABLE_IPO) cmake_policy(SET CMP0069 NEW) include(CheckIPOSupported) @@ -93,11 +98,16 @@ message (STATUS "CMAKE_BUILD_TYPE: ${CMAKE_BUILD_TYPE}") string (TOUPPER ${CMAKE_BUILD_TYPE} CMAKE_BUILD_TYPE_UC) -option (USE_STATIC_LIBRARIES "Set to FALSE to use shared libraries" ON) -option (MAKE_STATIC_LIBRARIES "Set to FALSE to make shared libraries" ${USE_STATIC_LIBRARIES}) +option(USE_STATIC_LIBRARIES "Disable to use shared libraries" ON) +option(MAKE_STATIC_LIBRARIES "Disable to make shared libraries" ${USE_STATIC_LIBRARIES}) + if (NOT MAKE_STATIC_LIBRARIES) - option (SPLIT_SHARED_LIBRARIES "DEV ONLY. Keep all internal libs as separate .so for faster linking" OFF) - option (CLICKHOUSE_SPLIT_BINARY "Make several binaries instead one bundled (clickhouse-server, clickhouse-client, ... )" OFF) + # DEVELOPER ONLY. + # Faster linking if turned on. + option(SPLIT_SHARED_LIBRARIES "Keep all internal libraries as separate .so files") + + option(CLICKHOUSE_SPLIT_BINARY + "Make several binaries (clickhouse-server, clickhouse-client etc.) instead of one bundled") endif () if (MAKE_STATIC_LIBRARIES AND SPLIT_SHARED_LIBRARIES) @@ -112,7 +122,8 @@ if (USE_STATIC_LIBRARIES) list(REVERSE CMAKE_FIND_LIBRARY_SUFFIXES) endif () -option (ENABLE_FUZZING "Enables fuzzing instrumentation" OFF) +# Implies ${WITH_COVERAGE} +option (ENABLE_FUZZING "Fuzzy testing using libfuzzer" OFF) if (ENABLE_FUZZING) message (STATUS "Fuzzing instrumentation enabled") @@ -144,10 +155,13 @@ if (COMPILER_CLANG) endif () endif () -option (ENABLE_TESTS "Enables tests" ON) +# If turned `ON`, assumes the user has either the system GTest library or the bundled one. +option(ENABLE_TESTS "Provide unit_test_dbms target with Google.Test unit tests" ON) if (OS_LINUX AND NOT UNBUNDLED AND MAKE_STATIC_LIBRARIES AND NOT SPLIT_SHARED_LIBRARIES AND CMAKE_VERSION VERSION_GREATER "3.9.0") - option (GLIBC_COMPATIBILITY "Set to TRUE to enable compatibility with older glibc libraries. Only for x86_64, Linux. Implies ENABLE_FASTMEMCPY." ON) + # Only for Linux, x86_64. + # Implies ${ENABLE_FASTMEMCPY} + option(GLIBC_COMPATIBILITY "Enable compatibility with older glibc libraries." ON) elseif(GLIBC_COMPATIBILITY) message (${RECONFIGURE_MESSAGE_LEVEL} "Glibc compatibility cannot be enabled in current configuration") endif () @@ -180,7 +194,9 @@ else () set(NO_WHOLE_ARCHIVE --no-whole-archive) endif () -option (ADD_GDB_INDEX_FOR_GOLD "Set to add .gdb-index to resulting binaries for gold linker. NOOP if lld is used." 
0) +# Ignored if `lld` is used +option(ADD_GDB_INDEX_FOR_GOLD "Add .gdb-index to resulting binaries for gold linker.") + if (NOT CMAKE_BUILD_TYPE_UC STREQUAL "RELEASE") if (LINKER_NAME STREQUAL "lld") set (CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} -Wl,--gdb-index") @@ -201,9 +217,13 @@ if (NOT CMAKE_BUILD_TYPE_UC STREQUAL "RELEASE") endif() cmake_host_system_information(RESULT AVAILABLE_PHYSICAL_MEMORY QUERY AVAILABLE_PHYSICAL_MEMORY) # Not available under freebsd + + if(NOT AVAILABLE_PHYSICAL_MEMORY OR AVAILABLE_PHYSICAL_MEMORY GREATER 8000) - option(COMPILER_PIPE "-pipe compiler option [less /tmp usage, more ram usage]" ON) + # Less `/tmp` usage, more RAM usage. + option(COMPILER_PIPE "-pipe compiler option" ON) endif() + if(COMPILER_PIPE) set(COMPILER_FLAGS "${COMPILER_FLAGS} -pipe") else() @@ -214,7 +234,8 @@ if(NOT DISABLE_CPU_OPTIMIZE) include(cmake/cpu_features.cmake) endif() -option(ARCH_NATIVE "Enable -march=native compiler flag" 0) +option(ARCH_NATIVE "Add -march=native compiler flag") + if (ARCH_NATIVE) set (COMPILER_FLAGS "${COMPILER_FLAGS} -march=native") endif () @@ -225,6 +246,7 @@ if (UNBUNDLED AND (COMPILER_GCC OR COMPILER_CLANG)) else() set (_CXX_STANDARD "-std=c++2a") endif() + # cmake < 3.12 doesn't support 20. We'll set CMAKE_CXX_FLAGS for now # set (CMAKE_CXX_STANDARD 20) set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${_CXX_STANDARD}") @@ -237,7 +259,8 @@ if (COMPILER_GCC OR COMPILER_CLANG) set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -fsized-deallocation") endif () -option(WITH_COVERAGE "Build with coverage." 0) +# Compiler-specific coverage flags e.g. -fcoverage-mapping for gcc +option(WITH_COVERAGE "Profile the resulting binary/binaries" OFF) if (WITH_COVERAGE AND COMPILER_CLANG) set(COMPILER_FLAGS "${COMPILER_FLAGS} -fprofile-instr-generate -fcoverage-mapping") @@ -271,10 +294,13 @@ if (COMPILER_CLANG) set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -fdiagnostics-absolute-paths") if (NOT ENABLE_TESTS AND NOT SANITIZE) - option(ENABLE_THINLTO "Enable Thin LTO. Only applicable for clang. It's also suppressed when building with tests or sanitizers." ON) + # https://clang.llvm.org/docs/ThinLTO.html + # Applies to clang only. + # Disabled when building with tests or sanitizers. + option(ENABLE_THINLTO "Clang-specific link time optimization" ON) endif() - # We cannot afford to use LTO when compiling unitests, and it's not enough + # We cannot afford to use LTO when compiling unit tests, and it's not enough # to only supply -fno-lto at the final linking stage. So we disable it # completely. if (ENABLE_THINLTO AND NOT ENABLE_TESTS AND NOT SANITIZE) @@ -287,8 +313,8 @@ if (COMPILER_CLANG) endif () # Always prefer llvm tools when using clang. For instance, we cannot use GNU ar when llvm LTO is enabled - find_program (LLVM_AR_PATH NAMES "llvm-ar" "llvm-ar-10" "llvm-ar-9" "llvm-ar-8") + if (LLVM_AR_PATH) message(STATUS "Using llvm-ar: ${LLVM_AR_PATH}.") set (CMAKE_AR ${LLVM_AR_PATH}) @@ -297,30 +323,38 @@ if (COMPILER_CLANG) endif () find_program (LLVM_RANLIB_PATH NAMES "llvm-ranlib" "llvm-ranlib-10" "llvm-ranlib-9" "llvm-ranlib-8") + if (LLVM_RANLIB_PATH) message(STATUS "Using llvm-ranlib: ${LLVM_RANLIB_PATH}.") set (CMAKE_RANLIB ${LLVM_RANLIB_PATH}) else () message(WARNING "Cannot find llvm-ranlib. System ranlib will be used instead. 
It does not work with ThinLTO.") endif () + elseif (ENABLE_THINLTO) message (${RECONFIGURE_MESSAGE_LEVEL} "ThinLTO is only available with CLang") endif () -option (ENABLE_LIBRARIES "Enable all libraries (Global default switch)" ON) +# Turns on all external libs like s3, kafka, ODBC, ... +option(ENABLE_LIBRARIES "Enable all external libraries by default" ON) + +# We recommend avoiding this mode for production builds because we can't guarantee all needed libraries exist in your +# system. +# This mode exists for enthusiastic developers who are searching for trouble. +# Useful for maintainers of OS packages. +option (UNBUNDLED "Use system libraries instead of ones in contrib/" OFF) -option (UNBUNDLED "Try find all libraries in system. We recommend to avoid this mode for production builds, because we cannot guarantee exact versions and variants of libraries your system has installed. This mode exists for enthusiastic developers who search for trouble. Also it is useful for maintainers of OS packages." OFF) if (UNBUNDLED) - set(NOT_UNBUNDLED 0) + set(NOT_UNBUNDLED OFF) else () - set(NOT_UNBUNDLED 1) + set(NOT_UNBUNDLED ON) endif () if (UNBUNDLED OR NOT (OS_LINUX OR OS_DARWIN)) # Using system libs can cause a lot of warnings in includes (on macro expansion). - option (WERROR "Enable -Werror compiler option" OFF) + option(WERROR "Enable -Werror compiler option" OFF) else () - option (WERROR "Enable -Werror compiler option" ON) + option(WERROR "Enable -Werror compiler option" ON) endif () if (WERROR) @@ -362,8 +396,9 @@ else () set (CMAKE_POSITION_INDEPENDENT_CODE ON) endif () -# Using "include-what-you-use" tool. -option (USE_INCLUDE_WHAT_YOU_USE "Use 'include-what-you-use' tool" OFF) +# https://github.com/include-what-you-use/include-what-you-use +option (USE_INCLUDE_WHAT_YOU_USE "Automatically reduce unneeded includes in source code (external tool)" OFF) + if (USE_INCLUDE_WHAT_YOU_USE) find_program(IWYU_PATH NAMES include-what-you-use iwyu) if (NOT IWYU_PATH) @@ -375,8 +410,11 @@ if (USE_INCLUDE_WHAT_YOU_USE) endif () if (ENABLE_TESTS) - message (STATUS "Tests are enabled") + message (STATUS "Unit tests are enabled") +else() + message(STATUS "Unit tests are disabled") endif () + enable_testing() # Enable for tests without binary # when installing to /usr - place configs to /etc but for /usr/local place to /usr/local/etc @@ -386,7 +424,13 @@ else () set (CLICKHOUSE_ETC_DIR "${CMAKE_INSTALL_PREFIX}/etc") endif () -message (STATUS "Building for: ${CMAKE_SYSTEM} ${CMAKE_SYSTEM_PROCESSOR} ${CMAKE_LIBRARY_ARCHITECTURE} ; USE_STATIC_LIBRARIES=${USE_STATIC_LIBRARIES} MAKE_STATIC_LIBRARIES=${MAKE_STATIC_LIBRARIES} SPLIT_SHARED=${SPLIT_SHARED_LIBRARIES} UNBUNDLED=${UNBUNDLED} CCACHE=${CCACHE_FOUND} ${CCACHE_VERSION}") +message (STATUS + "Building for: ${CMAKE_SYSTEM} ${CMAKE_SYSTEM_PROCESSOR} ${CMAKE_LIBRARY_ARCHITECTURE} ; + USE_STATIC_LIBRARIES=${USE_STATIC_LIBRARIES} + MAKE_STATIC_LIBRARIES=${MAKE_STATIC_LIBRARIES} + SPLIT_SHARED=${SPLIT_SHARED_LIBRARIES} + UNBUNDLED=${UNBUNDLED} + CCACHE=${CCACHE_FOUND} ${CCACHE_VERSION}") include (GNUInstallDirs) include (cmake/contrib_finder.cmake) diff --git a/README.md b/README.md index f1c8e17086b..6b909dd710c 100644 --- a/README.md +++ b/README.md @@ -17,5 +17,5 @@ ClickHouse is an open-source column-oriented database management system that all ## Upcoming Events -* [eBay migrating from Druid](https://us02web.zoom.us/webinar/register/tZMkfu6rpjItHtaQ1DXcgPWcSOnmM73HLGKL) on September 23, 2020. 
* [ClickHouse for Edge Analytics](https://ones2020.sched.com/event/bWPs) on September 29, 2020. +* [ClickHouse online meetup (in Russian)](https://clck.ru/R2zB9) on October 1, 2020. diff --git a/base/common/arithmeticOverflow.h b/base/common/arithmeticOverflow.h index c20fd635924..8df037a14af 100644 --- a/base/common/arithmeticOverflow.h +++ b/base/common/arithmeticOverflow.h @@ -31,8 +31,8 @@ namespace common template <> inline bool addOverflow(__int128 x, __int128 y, __int128 & res) { - static constexpr __int128 min_int128 = __int128(0x8000000000000000ll) << 64; - static constexpr __int128 max_int128 = (__int128(0x7fffffffffffffffll) << 64) + 0xffffffffffffffffll; + static constexpr __int128 min_int128 = minInt128(); + static constexpr __int128 max_int128 = maxInt128(); res = x + y; return (y > 0 && x > max_int128 - y) || (y < 0 && x < min_int128 - y); } @@ -79,8 +79,8 @@ namespace common template <> inline bool subOverflow(__int128 x, __int128 y, __int128 & res) { - static constexpr __int128 min_int128 = __int128(0x8000000000000000ll) << 64; - static constexpr __int128 max_int128 = (__int128(0x7fffffffffffffffll) << 64) + 0xffffffffffffffffll; + static constexpr __int128 min_int128 = minInt128(); + static constexpr __int128 max_int128 = maxInt128(); res = x - y; return (y < 0 && x > max_int128 + y) || (y > 0 && x < min_int128 + y); } diff --git a/base/common/extended_types.h b/base/common/extended_types.h index fe5f7184954..ea475163f6a 100644 --- a/base/common/extended_types.h +++ b/base/common/extended_types.h @@ -13,6 +13,9 @@ using wUInt256 = wide::integer<256, unsigned>; static_assert(sizeof(wInt256) == 32); static_assert(sizeof(wUInt256) == 32); +static constexpr __int128 minInt128() { return static_cast<unsigned __int128>(1) << 127; } +static constexpr __int128 maxInt128() { return (static_cast<unsigned __int128>(1) << 127) - 1; } + /// The standard library type traits, such as std::is_arithmetic, with one exception /// (std::common_type), are "set in stone". Attempting to specialize them causes undefined behavior. /// So instead of using the std type_traits, we use our own version which allows extension. 
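The new `minInt128()` / `maxInt128()` helpers above replace the hand-written hex constants used by `common::addOverflow` and `common::subOverflow`. For illustration only (not part of the diff), here is a minimal self-contained sketch of the same boundary check; the `wouldAddOverflow` helper and the `main` driver are invented for this example, and it assumes a compiler with the builtin `__int128` type (gcc or clang).

```cpp
#include <cstdio>

// Standalone sketch of the Int128 boundary helpers added in base/common/extended_types.h.
// Names below are local to the example and mirror the diff rather than include ClickHouse headers.
static constexpr __int128 minInt128() { return static_cast<unsigned __int128>(1) << 127; }
static constexpr __int128 maxInt128() { return (static_cast<unsigned __int128>(1) << 127) - 1; }

// Same boundary predicate that common::addOverflow uses, but evaluated *before*
// the addition, so the sketch never relies on signed wrap-around.
static bool wouldAddOverflow(__int128 x, __int128 y)
{
    return (y > 0 && x > maxInt128() - y) || (y < 0 && x < minInt128() - y);
}

int main()
{
    std::printf("%d\n", wouldAddOverflow(maxInt128(), 1) ? 1 : 0);  // 1: would overflow
    std::printf("%d\n", wouldAddOverflow(minInt128(), -1) ? 1 : 0); // 1: would underflow
    std::printf("%d\n", wouldAddOverflow(42, 100) ? 1 : 0);         // 0: fits comfortably
    return 0;
}
```

The `unsigned __int128` cast is what keeps the helpers well defined: shifting a signed 1 into bit 127 would overflow, so the minimum is built as an unsigned value and then converted to the signed return type.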
diff --git a/base/common/itoa.h b/base/common/itoa.h index 5d660ca4378..a02e7b68c05 100644 --- a/base/common/itoa.h +++ b/base/common/itoa.h @@ -372,7 +372,7 @@ static inline char * writeLeadingMinus(char * pos) static inline char * writeSIntText(int128_t x, char * pos) { - static const int128_t min_int128 = int128_t(0x8000000000000000ll) << 64; + static constexpr int128_t min_int128 = uint128_t(1) << 127; if (unlikely(x == min_int128)) { diff --git a/base/daemon/BaseDaemon.cpp b/base/daemon/BaseDaemon.cpp index 78801e71a6f..22455d09cf2 100644 --- a/base/daemon/BaseDaemon.cpp +++ b/base/daemon/BaseDaemon.cpp @@ -781,7 +781,7 @@ void BaseDaemon::initializeTerminationAndSignalProcessing() void BaseDaemon::logRevision() const { Poco::Logger::root().information("Starting " + std::string{VERSION_FULL} - + " with revision " + std::to_string(ClickHouseRevision::get()) + + " with revision " + std::to_string(ClickHouseRevision::getVersionRevision()) + ", " + build_id_info + ", PID " + std::to_string(getpid())); } diff --git a/benchmark/hardware.sh b/benchmark/hardware.sh new file mode 100755 index 00000000000..e7773ab1d1e --- /dev/null +++ b/benchmark/hardware.sh @@ -0,0 +1,127 @@ +#!/bin/bash -e + +if [[ -n $1 ]]; then + SCALE=$1 +else + SCALE=100 +fi + +TABLE="hits_${SCALE}m_obfuscated" +DATASET="${TABLE}_v1.tar.xz" +QUERIES_FILE="queries.sql" +TRIES=3 + +AMD64_BIN_URL="https://clickhouse-builds.s3.yandex.net/0/e29c4c3cc47ab2a6c4516486c1b77d57e7d42643/clickhouse_build_check/gcc-10_relwithdebuginfo_none_bundled_unsplitted_disable_False_binary/clickhouse" +AARCH64_BIN_URL="https://clickhouse-builds.s3.yandex.net/0/e29c4c3cc47ab2a6c4516486c1b77d57e7d42643/clickhouse_special_build_check/clang-10-aarch64_relwithdebuginfo_none_bundled_unsplitted_disable_False_binary/clickhouse" + +# Note: on older Ubuntu versions, 'axel' does not support IPv6. If you are using IPv6-only servers on very old Ubuntu, just don't install 'axel'. + +FASTER_DOWNLOAD=wget +if command -v axel >/dev/null; then + FASTER_DOWNLOAD=axel +else + echo "It's recommended to install 'axel' for faster downloads." +fi + +if command -v pixz >/dev/null; then + TAR_PARAMS='-Ipixz' +else + echo "It's recommended to install 'pixz' for faster decompression of the dataset." +fi + +mkdir -p clickhouse-benchmark-$SCALE +pushd clickhouse-benchmark-$SCALE + +if [[ ! -f clickhouse ]]; then + CPU=$(uname -m) + if [[ ($CPU == x86_64) || ($CPU == amd64) ]]; then + $FASTER_DOWNLOAD "$AMD64_BIN_URL" + elif [[ $CPU == aarch64 ]]; then + $FASTER_DOWNLOAD "$AARCH64_BIN_URL" + else + echo "Unsupported CPU type: $CPU" + exit 1 + fi +fi + +chmod a+x clickhouse + +if [[ ! -f $QUERIES_FILE ]]; then + wget "https://raw.githubusercontent.com/ClickHouse/ClickHouse/master/benchmark/clickhouse/$QUERIES_FILE" +fi + +if [[ ! -d data ]]; then + if [[ ! -f $DATASET ]]; then + $FASTER_DOWNLOAD "https://clickhouse-datasets.s3.yandex.net/hits/partitions/$DATASET" + fi + + tar $TAR_PARAMS --strip-components=1 --directory=. -x -v -f $DATASET +fi + +uptime + +echo "Starting clickhouse-server" + +./clickhouse server > server.log 2>&1 & +PID=$! + +function finish { + kill $PID + wait +} +trap finish EXIT + +echo "Waiting for clickhouse-server to start" + +for i in {1..30}; do + sleep 1 + ./clickhouse client --query "SELECT 'The dataset size is: ', count() FROM $TABLE" 2>/dev/null && break || echo '.' + if [[ $i == 30 ]]; then exit 1; fi +done + +echo +echo "Will perform benchmark. 
Results:" +echo + +cat "$QUERIES_FILE" | sed "s/{table}/${TABLE}/g" | while read query; do + sync + echo 3 | sudo tee /proc/sys/vm/drop_caches >/dev/null + + echo -n "[" + for i in $(seq 1 $TRIES); do + RES=$(./clickhouse client --max_memory_usage 100000000000 --time --format=Null --query="$query" 2>&1 ||:) + [[ "$?" == "0" ]] && echo -n "${RES}" || echo -n "null" + [[ "$i" != $TRIES ]] && echo -n ", " + done + echo "]," +done + + +echo +echo "Benchmark complete. System info:" +echo + +echo '----Version, build id-----------' +./clickhouse local --query "SELECT format('Version: {}, build id: {}', version(), buildId())" +./clickhouse local --query "SELECT format('The number of threads is: {}', value) FROM system.settings WHERE name = 'max_threads'" --output-format TSVRaw +./clickhouse local --query "SELECT format('Current time: {}', toString(now(), 'UTC'))" +echo '----CPU-------------------------' +cat /proc/cpuinfo | grep -i -F 'model name' | uniq +lscpu +echo '----Block Devices---------------' +lsblk +echo '----Disk Free and Total--------' +df -h . +echo '----Memory Free and Total-------' +free -h +echo '----Physical Memory Amount------' +cat /proc/meminfo | grep MemTotal +echo '----RAID Info-------------------' +cat /proc/mdstat +#echo '----PCI-------------------------' +#lspci +#echo '----All Hardware Info-----------' +#lshw +echo '--------------------------------' + +echo diff --git a/cmake/analysis.cmake b/cmake/analysis.cmake index daaa730ac4b..0818d608f32 100644 --- a/cmake/analysis.cmake +++ b/cmake/analysis.cmake @@ -1,20 +1,28 @@ -# This file configures static analysis tools that can be integrated to the build process +# https://clang.llvm.org/extra/clang-tidy/ +option (ENABLE_CLANG_TIDY "Use clang-tidy static analyzer" OFF) -option (ENABLE_CLANG_TIDY "Use 'clang-tidy' static analyzer if present" OFF) if (ENABLE_CLANG_TIDY) if (${CMAKE_VERSION} VERSION_LESS "3.6.0") message(FATAL_ERROR "clang-tidy requires CMake version at least 3.6.") endif() find_program (CLANG_TIDY_PATH NAMES "clang-tidy" "clang-tidy-10" "clang-tidy-9" "clang-tidy-8") + if (CLANG_TIDY_PATH) - message(STATUS "Using clang-tidy: ${CLANG_TIDY_PATH}. The checks will be run during build process. See the .clang-tidy file at the root directory to configure the checks.") - set (USE_CLANG_TIDY 1) + message(STATUS + "Using clang-tidy: ${CLANG_TIDY_PATH}. + The checks will be run during build process. + See the .clang-tidy file at the root directory to configure the checks.") + + set (USE_CLANG_TIDY ON) + # The variable CMAKE_CXX_CLANG_TIDY will be set inside src and base directories with non third-party code. # set (CMAKE_CXX_CLANG_TIDY "${CLANG_TIDY_PATH}") elseif (FAIL_ON_UNSUPPORTED_OPTIONS_COMBINATION) message(FATAL_ERROR "clang-tidy is not found") else () - message(STATUS "clang-tidy is not found. This is normal - the tool is only used for static code analysis and isn't essential for the build.") + message(STATUS + "clang-tidy is not found. 
+ This is normal - the tool is only used for code static analysis and isn't essential for the build.") endif () endif () diff --git a/cmake/find/ccache.cmake b/cmake/find/ccache.cmake index 270db1b4e66..8e9fe4d84ce 100644 --- a/cmake/find/ccache.cmake +++ b/cmake/find/ccache.cmake @@ -18,7 +18,8 @@ if (NOT CCACHE_FOUND AND NOT DEFINED ENABLE_CCACHE AND NOT COMPILER_MATCHES_CCAC "Setting it up will significantly reduce compilation time for 2nd and consequent builds") endif() -option(ENABLE_CCACHE "Speedup re-compilations using ccache" ${ENABLE_CCACHE_BY_DEFAULT}) +# https://ccache.dev/ +option(ENABLE_CCACHE "Speedup re-compilations using ccache (external tool)" ${ENABLE_CCACHE_BY_DEFAULT}) if (NOT ENABLE_CCACHE) return() diff --git a/cmake/find/cxx.cmake b/cmake/find/cxx.cmake index 02f7113e6fb..b1da125e219 100644 --- a/cmake/find/cxx.cmake +++ b/cmake/find/cxx.cmake @@ -4,13 +4,16 @@ if (NOT USE_LIBCXX) if (USE_INTERNAL_LIBCXX_LIBRARY) message (${RECONFIGURE_MESSAGE_LEVEL} "Cannot use internal libcxx with USE_LIBCXX=OFF") endif() + target_link_libraries(global-libs INTERFACE -l:libstdc++.a -l:libstdc++fs.a) # Always link these libraries as static target_link_libraries(global-libs INTERFACE ${EXCEPTION_HANDLING_LIBRARY}) return() endif() set(USE_INTERNAL_LIBCXX_LIBRARY_DEFAULT ${NOT_UNBUNDLED}) -option (USE_INTERNAL_LIBCXX_LIBRARY "Set to FALSE to use system libcxx and libcxxabi libraries instead of bundled" ${USE_INTERNAL_LIBCXX_LIBRARY_DEFAULT}) + +option (USE_INTERNAL_LIBCXX_LIBRARY "Disable to use system libcxx and libcxxabi libraries instead of bundled" + ${USE_INTERNAL_LIBCXX_LIBRARY_DEFAULT}) if(NOT EXISTS "${ClickHouse_SOURCE_DIR}/contrib/libcxx/CMakeLists.txt") if (USE_INTERNAL_LIBCXX_LIBRARY) diff --git a/cmake/find/gtest.cmake b/cmake/find/gtest.cmake index 36e45a1381e..9d4ab2608cb 100644 --- a/cmake/find/gtest.cmake +++ b/cmake/find/gtest.cmake @@ -1,11 +1,4 @@ -option (ENABLE_GTEST_LIBRARY "Enable gtest library" ${ENABLE_LIBRARIES}) - -if (NOT ENABLE_GTEST_LIBRARY) - if(USE_INTERNAL_GTEST_LIBRARY) - message (${RECONFIGURE_MESSAGE_LEVEL} "Cannot use internal Google Test when ENABLE_GTEST_LIBRARY=OFF") - endif() - return() -endif() +# included only if ENABLE_TESTS=1 option (USE_INTERNAL_GTEST_LIBRARY "Set to FALSE to use system Google Test instead of bundled" ${NOT_UNBUNDLED}) @@ -15,6 +8,7 @@ if (NOT EXISTS "${ClickHouse_SOURCE_DIR}/contrib/googletest/googletest/CMakeList message (${RECONFIGURE_MESSAGE_LEVEL} "Can't find internal gtest") set (USE_INTERNAL_GTEST_LIBRARY 0) endif () + set (MISSING_INTERNAL_GTEST_LIBRARY 1) endif () diff --git a/cmake/find/sentry.cmake b/cmake/find/sentry.cmake index 2936c045f99..a986599abce 100644 --- a/cmake/find/sentry.cmake +++ b/cmake/find/sentry.cmake @@ -1,4 +1,5 @@ set (SENTRY_LIBRARY "sentry") + set (SENTRY_INCLUDE_DIR "${ClickHouse_SOURCE_DIR}/contrib/sentry-native/include") if (NOT EXISTS "${SENTRY_INCLUDE_DIR}/sentry.h") message (WARNING "submodule contrib/sentry-native is missing. 
to fix try run: \n git submodule update --init --recursive") diff --git a/cmake/find/snappy.cmake b/cmake/find/snappy.cmake index e719231c338..2e1c8473904 100644 --- a/cmake/find/snappy.cmake +++ b/cmake/find/snappy.cmake @@ -1,4 +1,4 @@ -option(USE_SNAPPY "Enable support of snappy library" ${ENABLE_LIBRARIES}) +option(USE_SNAPPY "Enable snappy library" ${ENABLE_LIBRARIES}) if(NOT USE_SNAPPY) if (USE_INTERNAL_SNAPPY_LIBRARY) diff --git a/cmake/fuzzer.cmake b/cmake/fuzzer.cmake index 7ce4559ffae..578a9757270 100644 --- a/cmake/fuzzer.cmake +++ b/cmake/fuzzer.cmake @@ -1,11 +1,12 @@ -option (FUZZER "Enable fuzzer: libfuzzer") - +# see ./CMakeLists.txt for variable declaration if (FUZZER) if (FUZZER STREQUAL "libfuzzer") # NOTE: Eldar Zaitov decided to name it "libfuzzer" instead of "fuzzer" to keep in mind another possible fuzzer backends. - # NOTE: no-link means that all the targets are built with instrumentation for fuzzer, but only some of them (tests) have entry point for fuzzer and it's not checked. + # NOTE: no-link means that all the targets are built with instrumentation for fuzzer, but only some of them + # (tests) have entry point for fuzzer and it's not checked. set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${SAN_FLAGS} -fsanitize=fuzzer-no-link") set (CMAKE_C_FLAGS "${CMAKE_C_FLAGS} ${SAN_FLAGS} -fsanitize=fuzzer-no-link") + if (CMAKE_CXX_COMPILER_ID STREQUAL "GNU") set (CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} -fsanitize=fuzzer-no-link") endif() @@ -14,7 +15,6 @@ if (FUZZER) if (NOT LIB_FUZZING_ENGINE) set (LIB_FUZZING_ENGINE "-fsanitize=fuzzer") endif () - else () message (FATAL_ERROR "Unknown fuzzer type: ${FUZZER}") endif () diff --git a/cmake/limit_jobs.cmake b/cmake/limit_jobs.cmake index 4f305bfb4c3..5b962f34c38 100644 --- a/cmake/limit_jobs.cmake +++ b/cmake/limit_jobs.cmake @@ -6,26 +6,35 @@ cmake_host_system_information(RESULT AVAILABLE_PHYSICAL_MEMORY QUERY AVAILABLE_PHYSICAL_MEMORY) # Not available under freebsd cmake_host_system_information(RESULT NUMBER_OF_LOGICAL_CORES QUERY NUMBER_OF_LOGICAL_CORES) -option(PARALLEL_COMPILE_JOBS "Define the maximum number of concurrent compilation jobs" "") +# 1 if not set +option(PARALLEL_COMPILE_JOBS "Maximum number of concurrent compilation jobs" "") + +# 1 if not set +option(PARALLEL_LINK_JOBS "Maximum number of concurrent link jobs" "") + if (NOT PARALLEL_COMPILE_JOBS AND AVAILABLE_PHYSICAL_MEMORY AND MAX_COMPILER_MEMORY) math(EXPR PARALLEL_COMPILE_JOBS ${AVAILABLE_PHYSICAL_MEMORY}/${MAX_COMPILER_MEMORY}) + if (NOT PARALLEL_COMPILE_JOBS) set (PARALLEL_COMPILE_JOBS 1) endif () endif () + if (PARALLEL_COMPILE_JOBS AND (NOT NUMBER_OF_LOGICAL_CORES OR PARALLEL_COMPILE_JOBS LESS NUMBER_OF_LOGICAL_CORES)) set(CMAKE_JOB_POOL_COMPILE compile_job_pool${CMAKE_CURRENT_SOURCE_DIR}) string (REGEX REPLACE "[^a-zA-Z0-9]+" "_" CMAKE_JOB_POOL_COMPILE ${CMAKE_JOB_POOL_COMPILE}) set_property(GLOBAL APPEND PROPERTY JOB_POOLS ${CMAKE_JOB_POOL_COMPILE}=${PARALLEL_COMPILE_JOBS}) endif () -option(PARALLEL_LINK_JOBS "Define the maximum number of concurrent link jobs" "") + if (NOT PARALLEL_LINK_JOBS AND AVAILABLE_PHYSICAL_MEMORY AND MAX_LINKER_MEMORY) math(EXPR PARALLEL_LINK_JOBS ${AVAILABLE_PHYSICAL_MEMORY}/${MAX_LINKER_MEMORY}) + if (NOT PARALLEL_LINK_JOBS) set (PARALLEL_LINK_JOBS 1) endif () endif () + if (PARALLEL_LINK_JOBS AND (NOT NUMBER_OF_LOGICAL_CORES OR PARALLEL_COMPILE_JOBS LESS NUMBER_OF_LOGICAL_CORES)) set(CMAKE_JOB_POOL_LINK link_job_pool${CMAKE_CURRENT_SOURCE_DIR}) string (REGEX REPLACE "[^a-zA-Z0-9]+" "_" CMAKE_JOB_POOL_LINK 
${CMAKE_JOB_POOL_LINK}) @@ -33,5 +42,7 @@ if (PARALLEL_LINK_JOBS AND (NOT NUMBER_OF_LOGICAL_CORES OR PARALLEL_COMPILE_JOBS endif () if (PARALLEL_COMPILE_JOBS OR PARALLEL_LINK_JOBS) - message(STATUS "${CMAKE_CURRENT_SOURCE_DIR}: Have ${AVAILABLE_PHYSICAL_MEMORY} megabytes of memory. Limiting concurrent linkers jobs to ${PARALLEL_LINK_JOBS} and compiler jobs to ${PARALLEL_COMPILE_JOBS}") + message(STATUS + "${CMAKE_CURRENT_SOURCE_DIR}: Have ${AVAILABLE_PHYSICAL_MEMORY} megabytes of memory. + Limiting concurrent linkers jobs to ${PARALLEL_LINK_JOBS} and compiler jobs to ${PARALLEL_COMPILE_JOBS}") endif () diff --git a/cmake/sanitize.cmake b/cmake/sanitize.cmake index 7c7e9c388a0..0ccd6933dec 100644 --- a/cmake/sanitize.cmake +++ b/cmake/sanitize.cmake @@ -1,4 +1,5 @@ -option (SANITIZE "Enable sanitizer: address, memory, thread, undefined" "") +# Possible values: `address` (ASan), `memory` (MSan), `thread` (TSan), `undefined` (UBSan), and "" (no sanitizing) +option (SANITIZE "Enable one of the code sanitizers" "") set (SAN_FLAGS "${SAN_FLAGS} -g -fno-omit-frame-pointer -DSANITIZER") diff --git a/cmake/tools.cmake b/cmake/tools.cmake index 723a14c6584..6f07cc2439c 100644 --- a/cmake/tools.cmake +++ b/cmake/tools.cmake @@ -40,7 +40,9 @@ endif () STRING(REGEX MATCHALL "[0-9]+" COMPILER_VERSION_LIST ${CMAKE_CXX_COMPILER_VERSION}) LIST(GET COMPILER_VERSION_LIST 0 COMPILER_VERSION_MAJOR) +# Example values: `lld-10`, `gold`. option (LINKER_NAME "Linker name or full path") + if (COMPILER_GCC AND NOT LINKER_NAME) find_program (LLD_PATH NAMES "ld.lld") find_program (GOLD_PATH NAMES "ld.gold") diff --git a/cmake/warnings.cmake b/cmake/warnings.cmake index 6b26b9b95a5..3b2215f9bb6 100644 --- a/cmake/warnings.cmake +++ b/cmake/warnings.cmake @@ -17,8 +17,9 @@ if (USE_DEBUG_HELPERS) endif () # Add some warnings that are not available even with -Wall -Wextra -Wpedantic. - -option (WEVERYTHING "Enables -Weverything option with some exceptions. This is intended for exploration of new compiler warnings that may be found to be useful. Only makes sense for clang." ON) +# Intended for exploration of new compiler warnings that may be found useful. +# Applies to clang only +option (WEVERYTHING "Enable -Weverything option with some exceptions." ON) # Control maximum size of stack frames. It can be important if the code is run in fibers with small stack size. # Only in release build because debug has too large stack frames. diff --git a/contrib/protobuf b/contrib/protobuf index d6a10dd3db5..445d1ae73a4 160000 --- a/contrib/protobuf +++ b/contrib/protobuf @@ -1 +1 @@ -Subproject commit d6a10dd3db55d8f7f9e464db9151874cde1f79ec +Subproject commit 445d1ae73a450b1e94622e7040989aa2048402e3 diff --git a/contrib/protobuf-cmake/CMakeLists.txt b/contrib/protobuf-cmake/CMakeLists.txt index 683429194fc..1f8d9b02b3e 100644 --- a/contrib/protobuf-cmake/CMakeLists.txt +++ b/contrib/protobuf-cmake/CMakeLists.txt @@ -11,3 +11,7 @@ else () endif () add_subdirectory("${protobuf_SOURCE_DIR}/cmake" "${protobuf_BINARY_DIR}") + +# We don't want to stop compilation on warnings in protobuf's headers. 
+# The following line overrides the value assigned by the command target_include_directories() in libprotobuf.cmake +set_property(TARGET libprotobuf PROPERTY INTERFACE_SYSTEM_INCLUDE_DIRECTORIES ${protobuf_SOURCE_DIR}/src) diff --git a/debian/rules b/debian/rules index 5b271a8691f..837f81dd503 100755 --- a/debian/rules +++ b/debian/rules @@ -36,8 +36,8 @@ endif CMAKE_FLAGS += -DENABLE_UTILS=0 -DEB_CC ?= $(shell which gcc-9 gcc-8 gcc | head -n1) -DEB_CXX ?= $(shell which g++-9 g++-8 g++ | head -n1) +DEB_CC ?= $(shell which gcc-10 gcc-9 gcc | head -n1) +DEB_CXX ?= $(shell which g++-10 g++-9 g++ | head -n1) ifdef DEB_CXX DEB_BUILD_GNU_TYPE := $(shell dpkg-architecture -qDEB_BUILD_GNU_TYPE) diff --git a/docker/server/entrypoint.sh b/docker/server/entrypoint.sh index 8fc9c670b06..ba352c2bbc2 100644 --- a/docker/server/entrypoint.sh +++ b/docker/server/entrypoint.sh @@ -89,7 +89,8 @@ EOT fi if [ -n "$(ls /docker-entrypoint-initdb.d/)" ] || [ -n "$CLICKHOUSE_DB" ]; then - $gosu /usr/bin/clickhouse-server --config-file=$CLICKHOUSE_CONFIG & + # Listen only on localhost until the initialization is done + $gosu /usr/bin/clickhouse-server --config-file=$CLICKHOUSE_CONFIG -- --listen_host=127.0.0.1 & pid="$!" # check if clickhouse is ready to accept connections diff --git a/docker/test/fasttest/run.sh b/docker/test/fasttest/run.sh index ccbadb84f27..a277ddf9d36 100755 --- a/docker/test/fasttest/run.sh +++ b/docker/test/fasttest/run.sh @@ -83,7 +83,7 @@ SUBMODULES_TO_UPDATE=(contrib/boost contrib/zlib-ng contrib/libxml2 contrib/poco git submodule update --init --recursive "${SUBMODULES_TO_UPDATE[@]}" | ts '%Y-%m-%d %H:%M:%S' | tee /test_output/submodule_log.txt -export CMAKE_LIBS_CONFIG="-DENABLE_LIBRARIES=0 -DENABLE_TESTS=0 -DENABLE_UTILS=0 -DENABLE_EMBEDDED_COMPILER=0 -DENABLE_THINLTO=0 -DUSE_UNWIND=1" +CMAKE_LIBS_CONFIG=(-DENABLE_LIBRARIES=0 -DENABLE_TESTS=0 -DENABLE_UTILS=0 -DENABLE_EMBEDDED_COMPILER=0 -DENABLE_THINLTO=0 -DUSE_UNWIND=1) export CCACHE_DIR=/ccache export CCACHE_BASEDIR=/ClickHouse @@ -96,8 +96,8 @@ ccache --zero-stats ||: mkdir build cd build -cmake .. -DCMAKE_INSTALL_PREFIX=/usr -DCMAKE_CXX_COMPILER=clang++-10 -DCMAKE_C_COMPILER=clang-10 "$CMAKE_LIBS_CONFIG" "${FASTTEST_CMAKE_FLAGS[@]}" | ts '%Y-%m-%d %H:%M:%S' | tee /test_output/cmake_log.txt -ninja clickhouse-bundle | ts '%Y-%m-%d %H:%M:%S' | tee /test_output/build_log.txt +cmake .. 
-DCMAKE_INSTALL_PREFIX=/usr -DCMAKE_CXX_COMPILER=clang++-10 -DCMAKE_C_COMPILER=clang-10 "${CMAKE_LIBS_CONFIG[@]}" "${FASTTEST_CMAKE_FLAGS[@]}" | ts '%Y-%m-%d %H:%M:%S' | tee /test_output/cmake_log.txt +time ninja clickhouse-bundle | ts '%Y-%m-%d %H:%M:%S' | tee /test_output/build_log.txt ninja install | ts '%Y-%m-%d %H:%M:%S' | tee /test_output/install_log.txt @@ -111,35 +111,10 @@ ln -s /test_output /var/log/clickhouse-server cp "$CLICKHOUSE_DIR/programs/server/config.xml" /etc/clickhouse-server/ cp "$CLICKHOUSE_DIR/programs/server/users.xml" /etc/clickhouse-server/ -mkdir -p /etc/clickhouse-server/dict_examples -ln -s /usr/share/clickhouse-test/config/ints_dictionary.xml /etc/clickhouse-server/dict_examples/ -ln -s /usr/share/clickhouse-test/config/strings_dictionary.xml /etc/clickhouse-server/dict_examples/ -ln -s /usr/share/clickhouse-test/config/decimals_dictionary.xml /etc/clickhouse-server/dict_examples/ -ln -s /usr/share/clickhouse-test/config/zookeeper.xml /etc/clickhouse-server/config.d/ -ln -s /usr/share/clickhouse-test/config/listen.xml /etc/clickhouse-server/config.d/ -ln -s /usr/share/clickhouse-test/config/part_log.xml /etc/clickhouse-server/config.d/ -ln -s /usr/share/clickhouse-test/config/text_log.xml /etc/clickhouse-server/config.d/ -ln -s /usr/share/clickhouse-test/config/metric_log.xml /etc/clickhouse-server/config.d/ -ln -s /usr/share/clickhouse-test/config/custom_settings_prefixes.xml /etc/clickhouse-server/config.d/ -ln -s /usr/share/clickhouse-test/config/log_queries.xml /etc/clickhouse-server/users.d/ -ln -s /usr/share/clickhouse-test/config/readonly.xml /etc/clickhouse-server/users.d/ -ln -s /usr/share/clickhouse-test/config/access_management.xml /etc/clickhouse-server/users.d/ -ln -s /usr/share/clickhouse-test/config/ints_dictionary.xml /etc/clickhouse-server/ -ln -s /usr/share/clickhouse-test/config/strings_dictionary.xml /etc/clickhouse-server/ -ln -s /usr/share/clickhouse-test/config/decimals_dictionary.xml /etc/clickhouse-server/ -ln -s /usr/share/clickhouse-test/config/executable_dictionary.xml /etc/clickhouse-server/ -ln -s /usr/share/clickhouse-test/config/macros.xml /etc/clickhouse-server/config.d/ -ln -s /usr/share/clickhouse-test/config/disks.xml /etc/clickhouse-server/config.d/ -#ln -s /usr/share/clickhouse-test/config/secure_ports.xml /etc/clickhouse-server/config.d/ -ln -s /usr/share/clickhouse-test/config/clusters.xml /etc/clickhouse-server/config.d/ -ln -s /usr/share/clickhouse-test/config/graphite.xml /etc/clickhouse-server/config.d/ -ln -s /usr/share/clickhouse-test/config/server.key /etc/clickhouse-server/ -ln -s /usr/share/clickhouse-test/config/server.crt /etc/clickhouse-server/ -ln -s /usr/share/clickhouse-test/config/dhparam.pem /etc/clickhouse-server/ -ln -sf /usr/share/clickhouse-test/config/client_config.xml /etc/clickhouse-client/config.xml - -# Keep original query_masking_rules.xml -ln -s --backup=simple --suffix=_original.xml /usr/share/clickhouse-test/config/query_masking_rules.xml /etc/clickhouse-server/config.d/ +# install tests config +$CLICKHOUSE_DIR/tests/config/install.sh +# doesn't support SSL +rm -f /etc/clickhouse-server/config.d/secure_ports.xml # Kill the server in case we are running locally and not in docker kill_clickhouse @@ -216,7 +191,7 @@ TESTS_TO_SKIP=( 01460_DistributedFilesToInsert ) -clickhouse-test -j 4 --no-long --testname --shard --zookeeper --skip "${TESTS_TO_SKIP[@]}" 2>&1 | ts '%Y-%m-%d %H:%M:%S' | tee /test_output/test_log.txt +time clickhouse-test -j 8 --no-long --testname --shard --zookeeper --skip 
"${TESTS_TO_SKIP[@]}" 2>&1 | ts '%Y-%m-%d %H:%M:%S' | tee /test_output/test_log.txt # substr is to remove semicolon after test name @@ -234,7 +209,7 @@ then kill_clickhouse # Clean the data so that there is no interference from the previous test run. - rm -rvf /var/lib/clickhouse ||: + rm -rf /var/lib/clickhouse ||: mkdir /var/lib/clickhouse clickhouse-server --config /etc/clickhouse-server/config.xml --daemon diff --git a/docker/test/fuzzer/run-fuzzer.sh b/docker/test/fuzzer/run-fuzzer.sh index bcac5a433cc..d35a13cc421 100755 --- a/docker/test/fuzzer/run-fuzzer.sh +++ b/docker/test/fuzzer/run-fuzzer.sh @@ -48,7 +48,7 @@ function configure cp -av "$repo_dir"/programs/server/config* db cp -av "$repo_dir"/programs/server/user* db # TODO figure out which ones are needed - cp -av "$repo_dir"/tests/config/listen.xml db/config.d + cp -av "$repo_dir"/tests/config/config.d/listen.xml db/config.d cp -av "$script_dir"/query-fuzzer-tweaks-users.xml db/users.d } diff --git a/docker/test/performance-comparison/compare.sh b/docker/test/performance-comparison/compare.sh index 6a9898ba797..ddcc303da0d 100755 --- a/docker/test/performance-comparison/compare.sh +++ b/docker/test/performance-comparison/compare.sh @@ -114,8 +114,6 @@ function run_tests # Just check that the script runs at all "$script_dir/perf.py" --help > /dev/null - changed_test_files="" - # Find the directory with test files. if [ -v CHPC_TEST_PATH ] then @@ -130,14 +128,6 @@ function run_tests else # For PRs, use newer test files so we can test these changes. test_prefix=right/performance - - # If only the perf tests were changed in the PR, we will run only these - # tests. The list of changed tests in changed-test.txt is prepared in - # entrypoint.sh from git diffs, because it has the cloned repo. Used - # to use rsync for that but it was really ugly and not always correct - # (e.g. when the reference SHA is really old and has some other - # differences to the tested SHA, besides the one introduced by the PR). - changed_test_files=$(sed "s/tests\/performance/${test_prefix//\//\\/}/" changed-tests.txt) fi # Determine which tests to run. @@ -146,25 +136,32 @@ function run_tests # Run only explicitly specified tests, if any. # shellcheck disable=SC2010 test_files=$(ls "$test_prefix" | grep "$CHPC_TEST_GREP" | xargs -I{} -n1 readlink -f "$test_prefix/{}") - elif [ "$changed_test_files" != "" ] + elif [ "$PR_TO_TEST" -ne 0 ] \ + && [ "$(wc -l < changed-test-definitions.txt)" -gt 0 ] \ + && [ "$(wc -l < changed-test-scripts.txt)" -eq 0 ] \ + && [ "$(wc -l < other-changed-files.txt)" -eq 0 ] then - # Use test files that changed in the PR. - test_files="$changed_test_files" + # If only the perf tests were changed in the PR, we will run only these + # tests. The lists of changed files are prepared in entrypoint.sh because + # it has the repository. + test_files=$(sed "s/tests\/performance/${test_prefix//\//\\/}/" changed-test-definitions.txt) else # The default -- run all tests found in the test dir. test_files=$(ls "$test_prefix"/*.xml) fi - # For PRs, test only a subset of queries, and run them less times. - # If the corresponding environment variables are already set, keep - # those values. - if [ "$PR_TO_TEST" == "0" ] + # For PRs w/o changes in test definitons and scripts, test only a subset of + # queries, and run them less times. If the corresponding environment variables + # are already set, keep those values. 
+ if [ "$PR_TO_TEST" -ne 0 ] \ + && [ "$(wc -l < changed-test-definitions.txt)" -eq 0 ] \ + && [ "$(wc -l < changed-test-scripts.txt)" -eq 0 ] then - CHPC_RUNS=${CHPC_RUNS:-13} - CHPC_MAX_QUERIES=${CHPC_MAX_QUERIES:-0} - else CHPC_RUNS=${CHPC_RUNS:-7} CHPC_MAX_QUERIES=${CHPC_MAX_QUERIES:-20} + else + CHPC_RUNS=${CHPC_RUNS:-13} + CHPC_MAX_QUERIES=${CHPC_MAX_QUERIES:-0} fi export CHPC_RUNS export CHPC_MAX_QUERIES @@ -629,17 +626,53 @@ create table test_time engine Memory as from total_client_time_per_query full join queries using (test, query_index) group by test; +create view query_runs as select * from file('analyze/query-runs.tsv', TSV, + 'test text, query_index int, query_id text, version UInt8, time float'); + +-- +-- Guess the number of query runs used for this test. The number is required to +-- calculate and check the average query run time in the report. +-- We have to be careful, because we will encounter: +-- 1) partial queries which run only on one server +-- 2) short queries which run for a much higher number of times +-- 3) some errors that make query run for a different number of times on a +-- particular server. +-- +create view test_runs as + select test, + -- Default to 7 runs if there are only 'short' queries in the test, and + -- we can't determine the number of runs. + if((ceil(medianOrDefaultIf(t.runs, not short), 0) as r) != 0, r, 7) runs + from ( + select + -- The query id is the same for both servers, so no need to divide here. + uniqExact(query_id) runs, + (test, query_index) in + (select * from file('analyze/marked-short-queries.tsv', TSV, + 'test text, query_index int')) + as short, + test, query_index + from query_runs + group by test, query_index + ) t + group by test + ; + create table test_times_report engine File(TSV, 'report/test-times.tsv') as select wall_clock_time_per_test.test, real, toDecimal64(total_client_time, 3), queries, toDecimal64(query_max, 3), toDecimal64(real / queries, 3) avg_real_per_query, - toDecimal64(query_min, 3) + toDecimal64(query_min, 3), + runs from test_time - -- wall clock times are also measured for skipped tests, so don't - -- do full join - left join wall_clock_time_per_test using test + -- wall clock times are also measured for skipped tests, so don't + -- do full join + left join wall_clock_time_per_test + on wall_clock_time_per_test.test = test_time.test + full join test_runs + on test_runs.test = test_time.test order by avg_real_per_query desc; -- report for all queries page, only main metric @@ -693,8 +726,8 @@ create view shortness create table inconsistent_short_marking_report engine File(TSV, 'report/unexpected-query-duration.tsv') as select - multiIf(marked_short and time > 0.1, '"short" queries must run faster than 0.02 s', - not marked_short and time < 0.02, '"normal" queries must run longer than 0.1 s', + multiIf(marked_short and time > 0.1, '\"short\" queries must run faster than 0.02 s', + not marked_short and time < 0.02, '\"normal\" queries must run longer than 0.1 s', '') problem, marked_short, time, test, query_index, query_display_name @@ -1032,7 +1065,7 @@ case "$stage" in # to collect the logs. Prefer not to restart, because addresses might change # and we won't be able to process trace_log data. Start in a subshell, so that # it doesn't interfere with the watchdog through `wait`. - ( get_profiles || restart && get_profiles ) ||: + ( get_profiles || { restart && get_profiles ; } ) ||: # Kill the whole process group, because somehow when the subshell is killed, # the sleep inside remains alive and orphaned. 
diff --git a/docker/test/performance-comparison/entrypoint.sh b/docker/test/performance-comparison/entrypoint.sh index 9e9a46a3ce6..ed2e542eadd 100755 --- a/docker/test/performance-comparison/entrypoint.sh +++ b/docker/test/performance-comparison/entrypoint.sh @@ -97,13 +97,10 @@ then # tests for use by compare.sh. Compare to merge base, because master might be # far in the future and have unrelated test changes. base=$(git -C right/ch merge-base pr origin/master) - git -C right/ch diff --name-only "$base" pr | tee changed-tests.txt - if grep -vq '^tests/performance' changed-tests.txt - then - # Have some other changes besides the tests, so truncate the test list, - # meaning, run all tests. - : > changed-tests.txt - fi + git -C right/ch diff --name-only "$base" pr -- . | tee all-changed-files.txt + git -C right/ch diff --name-only "$base" pr -- tests/performance | tee changed-test-definitions.txt + git -C right/ch diff --name-only "$base" pr -- docker/test/performance-comparison | tee changed-test-scripts.txt + git -C right/ch diff --name-only "$base" pr -- :!tests/performance :!docker/test/performance-comparison | tee other-changed-files.txt fi # Set python output encoding so that we can print queries with Russian letters. diff --git a/docker/test/performance-comparison/perf.py b/docker/test/performance-comparison/perf.py index 79cdc8ea8d2..23686091e45 100755 --- a/docker/test/performance-comparison/perf.py +++ b/docker/test/performance-comparison/perf.py @@ -15,6 +15,7 @@ import sys import time import traceback import xml.etree.ElementTree as et +from threading import Thread from scipy import stats def tsv_escape(s): @@ -23,10 +24,11 @@ def tsv_escape(s): parser = argparse.ArgumentParser(description='Run performance test.') # Explicitly decode files as UTF-8 because sometimes we have Russian characters in queries, and LANG=C is set. parser.add_argument('file', metavar='FILE', type=argparse.FileType('r', encoding='utf-8'), nargs=1, help='test description file') -parser.add_argument('--host', nargs='*', default=['localhost'], help="Server hostname(s). Corresponds to '--port' options.") -parser.add_argument('--port', nargs='*', default=[9000], help="Server port(s). Corresponds to '--host' options.") +parser.add_argument('--host', nargs='*', default=['localhost'], help="Space-separated list of server hostname(s). Corresponds to '--port' options.") +parser.add_argument('--port', nargs='*', default=[9000], help="Space-separated list of server port(s). Corresponds to '--host' options.") parser.add_argument('--runs', type=int, default=1, help='Number of query runs per server.') parser.add_argument('--max-queries', type=int, default=None, help='Test no more than this number of queries, chosen at random.') +parser.add_argument('--queries-to-run', nargs='*', type=int, default=None, help='Space-separated list of indexes of queries to test.') parser.add_argument('--long', action='store_true', help='Do not skip the tests tagged as long.') parser.add_argument('--print-queries', action='store_true', help='Print test queries and exit.') parser.add_argument('--print-settings', action='store_true', help='Print test settings and exit.') @@ -157,8 +159,11 @@ for t in tables: print(f'skipped\t{tsv_escape(skipped_message)}') sys.exit(0) -# Run create queries -create_query_templates = [q.text for q in root.findall('create_query')] +# Run create and fill queries. We will run them simultaneously for both servers, +# to save time. 
+# The weird search is to keep the relative order of elements, which matters, and +# etree doesn't support the appropriate xpath query. +create_query_templates = [q.text for q in root.findall('./*') if q.tag in ('create_query', 'fill_query')] create_queries = substitute_parameters(create_query_templates) # Disallow temporary tables, because the clickhouse_driver reconnects on errors, @@ -170,23 +175,34 @@ for q in create_queries: file = sys.stderr) sys.exit(1) -for conn_index, c in enumerate(all_connections): - for q in create_queries: - c.execute(q) - print(f'create\t{conn_index}\t{c.last_query.elapsed}\t{tsv_escape(q)}') +def do_create(connection, index, queries): + for q in queries: + connection.execute(q) + print(f'create\t{index}\t{connection.last_query.elapsed}\t{tsv_escape(q)}') -# Run fill queries -fill_query_templates = [q.text for q in root.findall('fill_query')] -fill_queries = substitute_parameters(fill_query_templates) -for conn_index, c in enumerate(all_connections): - for q in fill_queries: - c.execute(q) - print(f'fill\t{conn_index}\t{c.last_query.elapsed}\t{tsv_escape(q)}') +threads = [Thread(target = do_create, args = (connection, index, create_queries)) + for index, connection in enumerate(all_connections)] -# Run the queries in randomized order, but preserve their indexes as specified -# in the test XML. To avoid using too much time, limit the number of queries -# we run per test. -queries_to_run = random.sample(range(0, len(test_queries)), min(len(test_queries), args.max_queries or len(test_queries))) +for t in threads: + t.start() + +for t in threads: + t.join() + +queries_to_run = range(0, len(test_queries)) + +if args.max_queries: + # If specified, test a limited number of queries chosen at random. + queries_to_run = random.sample(range(0, len(test_queries)), min(len(test_queries), args.max_queries)) + +if args.queries_to_run: + # Run the specified queries, with some sanity check. + for i in args.queries_to_run: + if i < 0 or i >= len(test_queries): + print(f'There is no query no. "{i}" in this test, only [{0}-{len(test_queries) - 1}] are present') + exit(1) + + queries_to_run = args.queries_to_run # Run test queries. 
for query_index in queries_to_run: diff --git a/docker/test/performance-comparison/report.py b/docker/test/performance-comparison/report.py index 8304aa55fc2..5e4e0a161e1 100755 --- a/docker/test/performance-comparison/report.py +++ b/docker/test/performance-comparison/report.py @@ -187,8 +187,10 @@ def td(value, cell_attributes = ''): cell_attributes = cell_attributes, value = value) -def th(x): - return '' + str(x) + '' +def th(value, cell_attributes = ''): + return '{value}'.format( + cell_attributes = cell_attributes, + value = value) def tableRow(cell_values, cell_attributes = [], anchor=None): return tr( @@ -199,8 +201,13 @@ def tableRow(cell_values, cell_attributes = [], anchor=None): if a is not None and v is not None]), anchor) -def tableHeader(r): - return tr(''.join([th(f) for f in r])) +def tableHeader(cell_values, cell_attributes = []): + return tr( + ''.join([th(v, a) + for v, a in itertools.zip_longest( + cell_values, cell_attributes, + fillvalue = '') + if a is not None and v is not None])) def tableStart(title): cls = '-'.join(title.lower().split(' ')[:3]); @@ -377,16 +384,16 @@ if args.report == 'main': 'Ratio of speedup (-) or slowdown (+)', # 2 'Relative difference (new − old) / old', # 3 'p < 0.01 threshold', # 4 - # Failed # 5 + '', # Failed # 5 'Test', # 6 '#', # 7 'Query', # 8 ] - - text += tableHeader(columns) - attrs = ['' for c in columns] attrs[5] = None + + text += tableHeader(columns, attrs) + for row in rows: anchor = f'{currentTableAnchor()}.{row[6]}.{row[7]}' if int(row[5]): @@ -421,17 +428,17 @@ if args.report == 'main': 'New, s', #1 'Relative difference (new - old)/old', #2 'p < 0.01 threshold', #3 - # Failed #4 + '', # Failed #4 'Test', #5 '#', #6 'Query' #7 ] - - text = tableStart('Unstable Queries') - text += tableHeader(columns) - attrs = ['' for c in columns] attrs[4] = None + + text = tableStart('Unstable Queries') + text += tableHeader(columns, attrs) + for r in unstable_rows: anchor = f'{currentTableAnchor()}.{r[5]}.{r[6]}' if int(r[4]): @@ -461,24 +468,25 @@ if args.report == 'main': return columns = [ - 'Test', #0 + 'Test', #0 'Wall clock time, s', #1 'Total client time, s', #2 - 'Total queries', #3 + 'Total queries', #3 'Longest query
(sum for all runs), s', #4 'Avg wall clock time
(sum for all runs), s', #5 'Shortest query
(sum for all runs), s', #6 + '', # Runs #7 ] + attrs = ['' for c in columns] + attrs[7] = None text = tableStart('Test Times') - text += tableHeader(columns) + text += tableHeader(columns, attrs) - nominal_runs = 7 # FIXME pass this as an argument - total_runs = (nominal_runs + 1) * 2 # one prewarm run, two servers - allowed_average_run_time = allowed_single_run_time + 60 / total_runs; # some allowance for fill/create queries - attrs = ['' for c in columns] + allowed_average_run_time = 1.6 # 30 seconds per test at 7 runs for r in rows: anchor = f'{currentTableAnchor()}.{r[0]}' + total_runs = (int(r[7]) + 1) * 2 # one prewarm run, two servers if float(r[5]) > allowed_average_run_time * total_runs: # FIXME should be 15s max -- investigate parallel_insert slow_average_tests += 1 @@ -580,8 +588,8 @@ elif args.report == 'all-queries': return columns = [ - # Changed #0 - # Unstable #1 + '', # Changed #0 + '', # Unstable #1 'Old, s', #2 'New, s', #3 'Ratio of speedup (-) or slowdown (+)', #4 @@ -591,13 +599,13 @@ elif args.report == 'all-queries': '#', #8 'Query', #9 ] - - text = tableStart('All Query Times') - text += tableHeader(columns) - attrs = ['' for c in columns] attrs[0] = None attrs[1] = None + + text = tableStart('All Query Times') + text += tableHeader(columns, attrs) + for r in rows: anchor = f'{currentTableAnchor()}.{r[7]}.{r[8]}' if int(r[1]): diff --git a/docker/test/stateful/run.sh b/docker/test/stateful/run.sh index c3576acc0e4..87cc4054ee6 100755 --- a/docker/test/stateful/run.sh +++ b/docker/test/stateful/run.sh @@ -8,26 +8,8 @@ dpkg -i package_folder/clickhouse-server_*.deb dpkg -i package_folder/clickhouse-client_*.deb dpkg -i package_folder/clickhouse-test_*.deb -mkdir -p /etc/clickhouse-server/dict_examples -ln -s /usr/share/clickhouse-test/config/ints_dictionary.xml /etc/clickhouse-server/dict_examples/ -ln -s /usr/share/clickhouse-test/config/strings_dictionary.xml /etc/clickhouse-server/dict_examples/ -ln -s /usr/share/clickhouse-test/config/decimals_dictionary.xml /etc/clickhouse-server/dict_examples/ -ln -s /usr/share/clickhouse-test/config/zookeeper.xml /etc/clickhouse-server/config.d/ -ln -s /usr/share/clickhouse-test/config/listen.xml /etc/clickhouse-server/config.d/ -ln -s /usr/share/clickhouse-test/config/part_log.xml /etc/clickhouse-server/config.d/ -ln -s /usr/share/clickhouse-test/config/text_log.xml /etc/clickhouse-server/config.d/ -ln -s /usr/share/clickhouse-test/config/metric_log.xml /etc/clickhouse-server/config.d/ -ln -s /usr/share/clickhouse-test/config/log_queries.xml /etc/clickhouse-server/users.d/ -ln -s /usr/share/clickhouse-test/config/readonly.xml /etc/clickhouse-server/users.d/ -ln -s /usr/share/clickhouse-test/config/ints_dictionary.xml /etc/clickhouse-server/ -ln -s /usr/share/clickhouse-test/config/strings_dictionary.xml /etc/clickhouse-server/ -ln -s /usr/share/clickhouse-test/config/decimals_dictionary.xml /etc/clickhouse-server/ -ln -s /usr/share/clickhouse-test/config/macros.xml /etc/clickhouse-server/config.d/ - -if [[ -n "$USE_DATABASE_ATOMIC" ]] && [[ "$USE_DATABASE_ATOMIC" -eq 1 ]]; then - ln -s /usr/share/clickhouse-test/config/database_atomic_configd.xml /etc/clickhouse-server/config.d/ - ln -s /usr/share/clickhouse-test/config/database_atomic_usersd.xml /etc/clickhouse-server/users.d/ -fi +# install test configs +/usr/share/clickhouse-test/config/install.sh function start() { diff --git a/docker/test/stateful_with_coverage/run.sh b/docker/test/stateful_with_coverage/run.sh index c2434b319b9..7191745ec83 100755 --- 
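Two details of the report.py changes above are easy to miss: a column whose entry in `attrs` is `None` is skipped entirely when the header and rows are rendered (that is how the service 'Failed' and 'Runs' columns stay in the data but out of the HTML), and the per-test wall-clock budget is now derived from the run count reported for each test. Below is a small, self-contained Python sketch of both ideas; the markup and variable names are simplified stand-ins rather than the real report code.

```python
from itertools import zip_longest

def render_header(cell_values, cell_attributes=()):
    # Cells whose attribute is None are dropped -- the same convention the
    # patched tableHeader()/tableRow() use to hide service columns.
    cells = ['<th {a}>{v}</th>'.format(a=a, v=v)
             for v, a in zip_longest(cell_values, cell_attributes, fillvalue='')
             if a is not None and v is not None]
    return '<tr>' + ''.join(cells) + '</tr>'

columns = ['Test', 'Wall clock time, s', '']  # the unnamed last column carries the run count
attrs = ['', '', None]                        # None hides it from the rendered table
print(render_header(columns, attrs))          # only the first two columns are emitted

# Time-budget arithmetic, mirroring the patched check: one prewarm run plus
# `runs` measured runs, on two servers, at ~1.6 s allowed per run on average.
runs = 7
total_runs = (runs + 1) * 2                   # 16 runs in total
allowed_average_run_time = 1.6                # seconds per run
print(allowed_average_run_time * total_runs)  # 25.6 s, i.e. roughly 30 s per test
```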
a/docker/test/stateful_with_coverage/run.sh +++ b/docker/test/stateful_with_coverage/run.sh @@ -48,28 +48,8 @@ mkdir -p /var/lib/clickhouse mkdir -p /var/log/clickhouse-server chmod 777 -R /var/log/clickhouse-server/ -# Temorary way to keep CI green while moving dictionaries to separate directory -mkdir -p /etc/clickhouse-server/dict_examples -chmod 777 -R /etc/clickhouse-server/dict_examples -ln -s /usr/share/clickhouse-test/config/ints_dictionary.xml /etc/clickhouse-server/dict_examples/; \ - ln -s /usr/share/clickhouse-test/config/strings_dictionary.xml /etc/clickhouse-server/dict_examples/; \ - ln -s /usr/share/clickhouse-test/config/decimals_dictionary.xml /etc/clickhouse-server/dict_examples/; - -ln -s /usr/share/clickhouse-test/config/zookeeper.xml /etc/clickhouse-server/config.d/ -ln -s /usr/share/clickhouse-test/config/listen.xml /etc/clickhouse-server/config.d/ -ln -s /usr/share/clickhouse-test/config/part_log.xml /etc/clickhouse-server/config.d/ -ln -s /usr/share/clickhouse-test/config/text_log.xml /etc/clickhouse-server/config.d/ -ln -s /usr/share/clickhouse-test/config/metric_log.xml /etc/clickhouse-server/config.d/ -ln -s /usr/share/clickhouse-test/config/log_queries.xml /etc/clickhouse-server/users.d/ -ln -s /usr/share/clickhouse-test/config/readonly.xml /etc/clickhouse-server/users.d/ -ln -s /usr/share/clickhouse-test/config/ints_dictionary.xml /etc/clickhouse-server/ -ln -s /usr/share/clickhouse-test/config/strings_dictionary.xml /etc/clickhouse-server/ -ln -s /usr/share/clickhouse-test/config/decimals_dictionary.xml /etc/clickhouse-server/ -ln -s /usr/share/clickhouse-test/config/macros.xml /etc/clickhouse-server/config.d/ - -# Retain any pre-existing config and allow ClickHouse to load those if required -ln -s --backup=simple --suffix=_original.xml \ - /usr/share/clickhouse-test/config/query_masking_rules.xml /etc/clickhouse-server/config.d/ +# install test configs +/usr/share/clickhouse-test/config/install.sh function start() { diff --git a/docker/test/stateless/Dockerfile b/docker/test/stateless/Dockerfile index 409a1b07bef..516d8d5842b 100644 --- a/docker/test/stateless/Dockerfile +++ b/docker/test/stateless/Dockerfile @@ -21,9 +21,7 @@ RUN apt-get update -y \ telnet \ tree \ unixodbc \ - wget \ - zookeeper \ - zookeeperd + wget RUN mkdir -p /tmp/clickhouse-odbc-tmp \ && wget -nv -O - ${odbc_driver_url} | tar --strip-components=1 -xz -C /tmp/clickhouse-odbc-tmp \ diff --git a/docker/test/stateless/run.sh b/docker/test/stateless/run.sh index b6b48cd0943..9f2bb9bf62d 100755 --- a/docker/test/stateless/run.sh +++ b/docker/test/stateless/run.sh @@ -8,48 +8,9 @@ dpkg -i package_folder/clickhouse-server_*.deb dpkg -i package_folder/clickhouse-client_*.deb dpkg -i package_folder/clickhouse-test_*.deb -mkdir -p /etc/clickhouse-server/dict_examples -ln -s /usr/share/clickhouse-test/config/ints_dictionary.xml /etc/clickhouse-server/dict_examples/ -ln -s /usr/share/clickhouse-test/config/strings_dictionary.xml /etc/clickhouse-server/dict_examples/ -ln -s /usr/share/clickhouse-test/config/decimals_dictionary.xml /etc/clickhouse-server/dict_examples/ -ln -s /usr/share/clickhouse-test/config/zookeeper.xml /etc/clickhouse-server/config.d/ -ln -s /usr/share/clickhouse-test/config/listen.xml /etc/clickhouse-server/config.d/ -ln -s /usr/share/clickhouse-test/config/part_log.xml /etc/clickhouse-server/config.d/ -ln -s /usr/share/clickhouse-test/config/text_log.xml /etc/clickhouse-server/config.d/ -ln -s /usr/share/clickhouse-test/config/metric_log.xml /etc/clickhouse-server/config.d/ 
-ln -s /usr/share/clickhouse-test/config/custom_settings_prefixes.xml /etc/clickhouse-server/config.d/ -ln -s /usr/share/clickhouse-test/config/log_queries.xml /etc/clickhouse-server/users.d/ -ln -s /usr/share/clickhouse-test/config/readonly.xml /etc/clickhouse-server/users.d/ -ln -s /usr/share/clickhouse-test/config/access_management.xml /etc/clickhouse-server/users.d/ -ln -s /usr/share/clickhouse-test/config/ints_dictionary.xml /etc/clickhouse-server/ -ln -s /usr/share/clickhouse-test/config/strings_dictionary.xml /etc/clickhouse-server/ -ln -s /usr/share/clickhouse-test/config/decimals_dictionary.xml /etc/clickhouse-server/ -ln -s /usr/share/clickhouse-test/config/executable_dictionary.xml /etc/clickhouse-server/ -ln -s /usr/share/clickhouse-test/config/macros.xml /etc/clickhouse-server/config.d/ -ln -s /usr/share/clickhouse-test/config/disks.xml /etc/clickhouse-server/config.d/ -ln -s /usr/share/clickhouse-test/config/secure_ports.xml /etc/clickhouse-server/config.d/ -ln -s /usr/share/clickhouse-test/config/clusters.xml /etc/clickhouse-server/config.d/ -ln -s /usr/share/clickhouse-test/config/graphite.xml /etc/clickhouse-server/config.d/ -ln -s /usr/share/clickhouse-test/config/server.key /etc/clickhouse-server/ -ln -s /usr/share/clickhouse-test/config/server.crt /etc/clickhouse-server/ -ln -s /usr/share/clickhouse-test/config/dhparam.pem /etc/clickhouse-server/ +# install test configs +/usr/share/clickhouse-test/config/install.sh -# Retain any pre-existing config and allow ClickHouse to load it if required -ln -s --backup=simple --suffix=_original.xml \ - /usr/share/clickhouse-test/config/query_masking_rules.xml /etc/clickhouse-server/config.d/ - -if [[ -n "$USE_POLYMORPHIC_PARTS" ]] && [[ "$USE_POLYMORPHIC_PARTS" -eq 1 ]]; then - ln -s /usr/share/clickhouse-test/config/polymorphic_parts.xml /etc/clickhouse-server/config.d/ -fi -if [[ -n "$USE_DATABASE_ATOMIC" ]] && [[ "$USE_DATABASE_ATOMIC" -eq 1 ]]; then - ln -s /usr/share/clickhouse-test/config/database_atomic_configd.xml /etc/clickhouse-server/config.d/ - ln -s /usr/share/clickhouse-test/config/database_atomic_usersd.xml /etc/clickhouse-server/users.d/ -fi - -ln -sf /usr/share/clickhouse-test/config/client_config.xml /etc/clickhouse-client/config.xml - -service zookeeper start -sleep 5 service clickhouse-server start && sleep 5 if cat /usr/bin/clickhouse-test | grep -q -- "--use-skip-list"; then diff --git a/docker/test/stateless_unbundled/Dockerfile b/docker/test/stateless_unbundled/Dockerfile index b05e46406da..cb8cd158e5f 100644 --- a/docker/test/stateless_unbundled/Dockerfile +++ b/docker/test/stateless_unbundled/Dockerfile @@ -66,9 +66,7 @@ RUN apt-get --allow-unauthenticated update -y \ unixodbc \ unixodbc-dev \ wget \ - zlib1g-dev \ - zookeeper \ - zookeeperd + zlib1g-dev RUN mkdir -p /tmp/clickhouse-odbc-tmp \ && wget -nv -O - ${odbc_driver_url} | tar --strip-components=1 -xz -C /tmp/clickhouse-odbc-tmp \ diff --git a/docker/test/stateless_unbundled/run.sh b/docker/test/stateless_unbundled/run.sh index b6b48cd0943..9f2bb9bf62d 100755 --- a/docker/test/stateless_unbundled/run.sh +++ b/docker/test/stateless_unbundled/run.sh @@ -8,48 +8,9 @@ dpkg -i package_folder/clickhouse-server_*.deb dpkg -i package_folder/clickhouse-client_*.deb dpkg -i package_folder/clickhouse-test_*.deb -mkdir -p /etc/clickhouse-server/dict_examples -ln -s /usr/share/clickhouse-test/config/ints_dictionary.xml /etc/clickhouse-server/dict_examples/ -ln -s /usr/share/clickhouse-test/config/strings_dictionary.xml /etc/clickhouse-server/dict_examples/ -ln 
-s /usr/share/clickhouse-test/config/decimals_dictionary.xml /etc/clickhouse-server/dict_examples/ -ln -s /usr/share/clickhouse-test/config/zookeeper.xml /etc/clickhouse-server/config.d/ -ln -s /usr/share/clickhouse-test/config/listen.xml /etc/clickhouse-server/config.d/ -ln -s /usr/share/clickhouse-test/config/part_log.xml /etc/clickhouse-server/config.d/ -ln -s /usr/share/clickhouse-test/config/text_log.xml /etc/clickhouse-server/config.d/ -ln -s /usr/share/clickhouse-test/config/metric_log.xml /etc/clickhouse-server/config.d/ -ln -s /usr/share/clickhouse-test/config/custom_settings_prefixes.xml /etc/clickhouse-server/config.d/ -ln -s /usr/share/clickhouse-test/config/log_queries.xml /etc/clickhouse-server/users.d/ -ln -s /usr/share/clickhouse-test/config/readonly.xml /etc/clickhouse-server/users.d/ -ln -s /usr/share/clickhouse-test/config/access_management.xml /etc/clickhouse-server/users.d/ -ln -s /usr/share/clickhouse-test/config/ints_dictionary.xml /etc/clickhouse-server/ -ln -s /usr/share/clickhouse-test/config/strings_dictionary.xml /etc/clickhouse-server/ -ln -s /usr/share/clickhouse-test/config/decimals_dictionary.xml /etc/clickhouse-server/ -ln -s /usr/share/clickhouse-test/config/executable_dictionary.xml /etc/clickhouse-server/ -ln -s /usr/share/clickhouse-test/config/macros.xml /etc/clickhouse-server/config.d/ -ln -s /usr/share/clickhouse-test/config/disks.xml /etc/clickhouse-server/config.d/ -ln -s /usr/share/clickhouse-test/config/secure_ports.xml /etc/clickhouse-server/config.d/ -ln -s /usr/share/clickhouse-test/config/clusters.xml /etc/clickhouse-server/config.d/ -ln -s /usr/share/clickhouse-test/config/graphite.xml /etc/clickhouse-server/config.d/ -ln -s /usr/share/clickhouse-test/config/server.key /etc/clickhouse-server/ -ln -s /usr/share/clickhouse-test/config/server.crt /etc/clickhouse-server/ -ln -s /usr/share/clickhouse-test/config/dhparam.pem /etc/clickhouse-server/ +# install test configs +/usr/share/clickhouse-test/config/install.sh -# Retain any pre-existing config and allow ClickHouse to load it if required -ln -s --backup=simple --suffix=_original.xml \ - /usr/share/clickhouse-test/config/query_masking_rules.xml /etc/clickhouse-server/config.d/ - -if [[ -n "$USE_POLYMORPHIC_PARTS" ]] && [[ "$USE_POLYMORPHIC_PARTS" -eq 1 ]]; then - ln -s /usr/share/clickhouse-test/config/polymorphic_parts.xml /etc/clickhouse-server/config.d/ -fi -if [[ -n "$USE_DATABASE_ATOMIC" ]] && [[ "$USE_DATABASE_ATOMIC" -eq 1 ]]; then - ln -s /usr/share/clickhouse-test/config/database_atomic_configd.xml /etc/clickhouse-server/config.d/ - ln -s /usr/share/clickhouse-test/config/database_atomic_usersd.xml /etc/clickhouse-server/users.d/ -fi - -ln -sf /usr/share/clickhouse-test/config/client_config.xml /etc/clickhouse-client/config.xml - -service zookeeper start -sleep 5 service clickhouse-server start && sleep 5 if cat /usr/bin/clickhouse-test | grep -q -- "--use-skip-list"; then diff --git a/docker/test/stateless_with_coverage/Dockerfile b/docker/test/stateless_with_coverage/Dockerfile index 77357d5142f..b76989de1cf 100644 --- a/docker/test/stateless_with_coverage/Dockerfile +++ b/docker/test/stateless_with_coverage/Dockerfile @@ -11,8 +11,6 @@ RUN apt-get update -y \ tzdata \ fakeroot \ debhelper \ - zookeeper \ - zookeeperd \ expect \ python \ python-lxml \ diff --git a/docker/test/stateless_with_coverage/run.sh b/docker/test/stateless_with_coverage/run.sh index c3ccb18659b..2f3f05a335a 100755 --- a/docker/test/stateless_with_coverage/run.sh +++ 
b/docker/test/stateless_with_coverage/run.sh @@ -39,41 +39,8 @@ mkdir -p /var/log/clickhouse-server chmod 777 -R /var/lib/clickhouse chmod 777 -R /var/log/clickhouse-server/ -# Temorary way to keep CI green while moving dictionaries to separate directory -mkdir -p /etc/clickhouse-server/dict_examples -chmod 777 -R /etc/clickhouse-server/dict_examples -ln -s /usr/share/clickhouse-test/config/ints_dictionary.xml /etc/clickhouse-server/dict_examples/; \ - ln -s /usr/share/clickhouse-test/config/strings_dictionary.xml /etc/clickhouse-server/dict_examples/; \ - ln -s /usr/share/clickhouse-test/config/decimals_dictionary.xml /etc/clickhouse-server/dict_examples/; - -ln -s /usr/share/clickhouse-test/config/zookeeper.xml /etc/clickhouse-server/config.d/ -ln -s /usr/share/clickhouse-test/config/listen.xml /etc/clickhouse-server/config.d/ -ln -s /usr/share/clickhouse-test/config/part_log.xml /etc/clickhouse-server/config.d/ -ln -s /usr/share/clickhouse-test/config/text_log.xml /etc/clickhouse-server/config.d/ -ln -s /usr/share/clickhouse-test/config/metric_log.xml /etc/clickhouse-server/config.d/ -ln -s /usr/share/clickhouse-test/config/log_queries.xml /etc/clickhouse-server/users.d/ -ln -s /usr/share/clickhouse-test/config/readonly.xml /etc/clickhouse-server/users.d/ -ln -s /usr/share/clickhouse-test/config/access_management.xml /etc/clickhouse-server/users.d/ -ln -s /usr/share/clickhouse-test/config/ints_dictionary.xml /etc/clickhouse-server/ -ln -s /usr/share/clickhouse-test/config/strings_dictionary.xml /etc/clickhouse-server/ -ln -s /usr/share/clickhouse-test/config/decimals_dictionary.xml /etc/clickhouse-server/ -ln -s /usr/share/clickhouse-test/config/executable_dictionary.xml /etc/clickhouse-server/ -ln -s /usr/share/clickhouse-test/config/macros.xml /etc/clickhouse-server/config.d/ -ln -s /usr/share/clickhouse-test/config/disks.xml /etc/clickhouse-server/config.d/ -ln -s /usr/share/clickhouse-test/config/secure_ports.xml /etc/clickhouse-server/config.d/ -ln -s /usr/share/clickhouse-test/config/clusters.xml /etc/clickhouse-server/config.d/ -ln -s /usr/share/clickhouse-test/config/graphite.xml /etc/clickhouse-server/config.d/ -ln -s /usr/share/clickhouse-test/config/server.key /etc/clickhouse-server/ -ln -s /usr/share/clickhouse-test/config/server.crt /etc/clickhouse-server/ -ln -s /usr/share/clickhouse-test/config/dhparam.pem /etc/clickhouse-server/ -ln -sf /usr/share/clickhouse-test/config/client_config.xml /etc/clickhouse-client/config.xml - -# Retain any pre-existing config and allow ClickHouse to load it if required -ln -s --backup=simple --suffix=_original.xml \ - /usr/share/clickhouse-test/config/query_masking_rules.xml /etc/clickhouse-server/config.d/ - -service zookeeper start -sleep 5 +# install test configs +/usr/share/clickhouse-test/config/install.sh start_clickhouse diff --git a/docker/test/stress/run.sh b/docker/test/stress/run.sh index 8295e90b3ef..28c66a72d39 100755 --- a/docker/test/stress/run.sh +++ b/docker/test/stress/run.sh @@ -39,9 +39,8 @@ function start() done } -ln -s /usr/share/clickhouse-test/config/log_queries.xml /etc/clickhouse-server/users.d/ -ln -s /usr/share/clickhouse-test/config/part_log.xml /etc/clickhouse-server/config.d/ -ln -s /usr/share/clickhouse-test/config/text_log.xml /etc/clickhouse-server/config.d/ +# install test configs +/usr/share/clickhouse-test/config/install.sh echo "ASAN_OPTIONS='malloc_context_size=10 verbosity=1 allocator_release_to_os_interval_ms=10000'" >> /etc/environment diff --git a/docker/test/stress/stress 
b/docker/test/stress/stress index 60db5ec465c..a36adda3aad 100755 --- a/docker/test/stress/stress +++ b/docker/test/stress/stress @@ -29,7 +29,7 @@ def get_options(i): if 0 < i: options += " --order=random" if i % 2 == 1: - options += " --atomic-db-engine" + options += " --db-engine=Ordinary" return options diff --git a/docker/test/testflows/runner/Dockerfile b/docker/test/testflows/runner/Dockerfile index 898552ade56..ed49743319c 100644 --- a/docker/test/testflows/runner/Dockerfile +++ b/docker/test/testflows/runner/Dockerfile @@ -35,7 +35,7 @@ RUN apt-get update \ ENV TZ=Europe/Moscow RUN ln -snf /usr/share/zoneinfo/$TZ /etc/localtime && echo $TZ > /etc/timezone -RUN pip3 install urllib3 testflows==1.6.42 docker-compose docker dicttoxml kazoo tzlocal +RUN pip3 install urllib3 testflows==1.6.48 docker-compose docker dicttoxml kazoo tzlocal ENV DOCKER_CHANNEL stable ENV DOCKER_VERSION 17.09.1-ce diff --git a/docs/_includes/cmake_in_clickhouse_footer.md b/docs/_includes/cmake_in_clickhouse_footer.md new file mode 100644 index 00000000000..ab884bd4dfe --- /dev/null +++ b/docs/_includes/cmake_in_clickhouse_footer.md @@ -0,0 +1,121 @@ + +## Developer's guide for adding new CMake options + +### Don't be obvious. Be informative. + +Bad: +```cmake +option (ENABLE_TESTS "Enables testing" OFF) +``` + +This description is quite useless, as it neither gives the viewer any additional information nor explains the option's purpose. + +Better: + +```cmake +option(ENABLE_TESTS "Provide unit_test_dbms target with Google.test unit tests" OFF) +``` + +If the option's purpose can't be guessed by its name, or the purpose guess may be misleading, or the option has some +pre-conditions, leave a comment above the `option()` line and explain what it does. +The best way would be linking the docs page (if it exists). +The comment is parsed into a separate column (see below). + +Even better: + +```cmake +# implies ${TESTS_ARE_ENABLED} +# see tests/CMakeLists.txt for implementation detail. +option(ENABLE_TESTS "Provide unit_test_dbms target with Google.test unit tests" OFF) +``` + +### If the option's state could produce an unwanted (or unusual) result, explicitly warn the user. + +Suppose you have an option that may strip debug symbols from part of ClickHouse. +This can speed up the linking process, but produces a binary that cannot be debugged. +In that case, prefer explicitly raising a warning telling the developer that they may be doing something wrong. +Also, such options should be disabled if applicable. + +Bad: +```cmake +option(STRIP_DEBUG_SYMBOLS_FUNCTIONS + "Do not generate debugger info for ClickHouse functions." + ${STRIP_DSF_DEFAULT}) + +if (STRIP_DEBUG_SYMBOLS_FUNCTIONS) + target_compile_options(clickhouse_functions PRIVATE "-g0") +endif() + +``` Better: + +```cmake +# Provides faster linking and lower binary size. +# Tradeoff is the inability to debug some source files with e.g. gdb +# (empty stack frames and no local variables). +option(STRIP_DEBUG_SYMBOLS_FUNCTIONS + "Do not generate debugger info for ClickHouse functions." + ${STRIP_DSF_DEFAULT}) + +if (STRIP_DEBUG_SYMBOLS_FUNCTIONS) + message(WARNING "Not generating debugger info for ClickHouse functions") + target_compile_options(clickhouse_functions PRIVATE "-g0") +endif() +``` + +### In the option's description, explain WHAT the option does rather than WHY it does something. + +The WHY explanation should be placed in the comment. +You may find that the option's name is self-descriptive. + +Bad: + +```cmake +option(ENABLE_THINLTO "Enable Thin LTO.
Only applicable for clang. It's also suppressed when building with tests or sanitizers." ON) +``` + +Better: + +```cmake +# Only applicable for clang. +# Turned off when building with tests or sanitizers. +option(ENABLE_THINLTO "Clang-specific link time optimisation" ON) +``` + +### Don't assume other developers know as much as you do. + +In ClickHouse, there are many tools used that an ordinary developer may not know. If you are in doubt, give a link to +the tool's docs. It won't take much of your time. + +Bad: + +```cmake +option(ENABLE_THINLTO "Enable Thin LTO. Only applicable for clang. It's also suppressed when building with tests or sanitizers." ON) +``` + +Better (combined with the above hint): + +```cmake +# https://clang.llvm.org/docs/ThinLTO.html +# Only applicable for clang. +# Turned off when building with tests or sanitizers. +option(ENABLE_THINLTO "Clang-specific link time optimisation" ON) +``` + +Another example, bad: + +```cmake +option (USE_INCLUDE_WHAT_YOU_USE "Use 'include-what-you-use' tool" OFF) +``` + +Better: + +```cmake +# https://github.com/include-what-you-use/include-what-you-use +option (USE_INCLUDE_WHAT_YOU_USE "Reduce unneeded #includes (external tool)" OFF) +``` + +### Prefer consistent default values. + +CMake allows you to pass a plethora of values representing boolean `true/false`, e.g. `1, ON, YES, ...`. +Prefer the `ON/OFF` values, if possible. diff --git a/docs/_includes/cmake_in_clickhouse_header.md b/docs/_includes/cmake_in_clickhouse_header.md new file mode 100644 index 00000000000..10776e04c01 --- /dev/null +++ b/docs/_includes/cmake_in_clickhouse_header.md @@ -0,0 +1,34 @@ +# CMake in ClickHouse + +## TL;DR: How to make ClickHouse compile and link faster? + +Developer only! This command will likely fulfill most of your needs. Run before calling `ninja`. + +```cmake +cmake .. \ + -DCMAKE_C_COMPILER=/bin/clang-10 \ + -DCMAKE_CXX_COMPILER=/bin/clang++-10 \ + -DCMAKE_BUILD_TYPE=Debug \ + -DENABLE_CLICKHOUSE_ALL=OFF \ + -DENABLE_CLICKHOUSE_SERVER=ON \ + -DENABLE_CLICKHOUSE_CLIENT=ON \ + -DUSE_STATIC_LIBRARIES=OFF \ + -DCLICKHOUSE_SPLIT_BINARY=ON \ + -DSPLIT_SHARED_LIBRARIES=ON \ + -DENABLE_LIBRARIES=OFF \ + -DENABLE_UTILS=OFF \ + -DENABLE_TESTS=OFF +``` + +## CMake file types + +1. ClickHouse's source CMake files (located in the root directory and in `/src`). +2. Arch-dependent CMake files (located in `/cmake/*os_name*`). +3. Libraries finders (search for contrib libraries, located in `/cmake/find`). +4. Contrib build CMake files (used instead of libraries' own CMake files, located in `/cmake/modules`). + +## List of CMake flags + +* This list is auto-generated by [this Python script](https://github.com/clickhouse/clickhouse/blob/master/docs/tools/cmake_in_clickhouse_generator.py). +* The flag name is a link to its position in the code. +* If an option's default value is itself an option, it's also a link to its position in this list.
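The header above notes that the flag list is auto-generated by docs/tools/cmake_in_clickhouse_generator.py, with every flag name linking to its position in the code. Purely as an illustration of that idea (the regex, the set of scanned files, and the output format are simplified assumptions, not the real generator), a minimal sketch could look like this:

```python
import re
from pathlib import Path

# Simplified: the real generator also handles the comment above option(),
# defaults that are themselves options, and .cmake files outside CMakeLists.txt.
OPTION_RE = re.compile(r'option\s*\(\s*(\w+)\s+"([^"]*)"\s+(\w+)\s*\)')

def list_options(repo_root, repo_url='https://github.com/ClickHouse/ClickHouse/blob/master'):
    rows = ['| Name | Default | Description |', '|---|---|---|']
    for path in Path(repo_root).rglob('CMakeLists.txt'):
        text = path.read_text(encoding='utf-8', errors='replace')
        for match in OPTION_RE.finditer(text):
            name, description, default = match.groups()
            line_no = text[:match.start()].count('\n') + 1      # position in the code
            link = f'{repo_url}/{path.relative_to(repo_root)}#L{line_no}'
            rows.append(f'| [`{name}`]({link}) | `{default}` | {description} |')
    return '\n'.join(rows)

if __name__ == '__main__':
    print(list_options('.'))
```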
diff --git a/docs/en/introduction/adopters.md b/docs/en/introduction/adopters.md index 6d57dfde9cd..0cbfdcb7d81 100644 --- a/docs/en/introduction/adopters.md +++ b/docs/en/introduction/adopters.md @@ -38,7 +38,7 @@ toc_title: Adopters | Deutsche Bank | Finance | BI Analytics | — | — | [Slides in English, October 2019](https://bigdatadays.ru/wp-content/uploads/2019/10/D2-H3-3_Yakunin-Goihburg.pdf) | | Diva-e | Digital consulting | Main Product | — | — | [Slides in English, September 2019](https://github.com/ClickHouse/clickhouse-presentations/blob/master/meetup29/ClickHouse-MeetUp-Unusual-Applications-sd-2019-09-17.pdf) | | Ecwid | E-commerce SaaS | Metrics, Logging | — | — | [Slides in Russian, April 2019](https://nastachku.ru/var/files/1/presentation/backend/2_Backend_6.pdf) | -| eBay | E-commerce | TBA | — | — | [Webinar, Sep 2020](https://altinity.com/webinarspage/2020/09/08/migrating-from-druid-to-next-gen-olap-on-clickhouse-ebays-experience) | +| eBay | E-commerce | Logs, Metrics and Events | — | — | [Official website, Sep 2020](https://tech.ebayinc.com/engineering/ou-online-analytical-processing/) | | Exness | Trading | Metrics, Logging | — | — | [Talk in Russian, May 2019](https://youtu.be/_rpU-TvSfZ8?t=3215) | | FastNetMon | DDoS Protection | Main Product | | — | [Official website](https://fastnetmon.com/docs-fnm-advanced/fastnetmon-advanced-traffic-persistency/) | | Flipkart | e-Commerce | — | — | — | [Talk in English, July 2020](https://youtu.be/GMiXCMFDMow?t=239) | diff --git a/docs/en/introduction/info.md b/docs/en/introduction/info.md new file mode 100644 index 00000000000..a397c40950d --- /dev/null +++ b/docs/en/introduction/info.md @@ -0,0 +1,10 @@ +--- +toc_priority: 100 +--- + +# Information support {#information-support} + +- Email address: +- Phone: +7-495-780-6510 + +[Original article](https://clickhouse.tech/docs/en/introduction/info/) \ No newline at end of file diff --git a/programs/compressor/README.md b/docs/en/operations/utilities/clickhouse-compressor.md similarity index 100% rename from programs/compressor/README.md rename to docs/en/operations/utilities/clickhouse-compressor.md diff --git a/docs/en/operations/utilities/clickhouse-obfuscator.md b/docs/en/operations/utilities/clickhouse-obfuscator.md new file mode 100644 index 00000000000..8a2ea1eecf6 --- /dev/null +++ b/docs/en/operations/utilities/clickhouse-obfuscator.md @@ -0,0 +1,42 @@ +# ClickHouse obfuscator + +A simple tool for table data obfuscation. + +It reads an input table and produces an output table that retains some properties of the input but contains different data. +It allows you to publish almost-real production data for use in benchmarks. + +It is designed to retain the following properties of the data: +- cardinalities of values (number of distinct values) for every column and for every tuple of columns; +- conditional cardinalities: number of distinct values of one column under a condition on the value of another column; +- probability distributions of the absolute values of integers; signs of signed integers; exponents and signs of floats; +- probability distributions of string lengths; +- probability of zero values for numbers, empty strings and arrays, and NULLs; +- data compression ratio when compressed with LZ77 and entropy family of codecs; +- continuity (magnitude of difference) of time values across the table; continuity of floating-point values; +- date component of DateTime values; +- UTF-8 validity of string values; +- string values continue to look somewhat natural.
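To make the property-preserving idea in the list above (and the seed-driven determinism described just below) concrete, here is a toy Python illustration, emphatically not the obfuscator's actual algorithm: a seed-keyed hash maps every distinct value to a stable pseudonym, so per-column cardinality and equality of repeated values survive while the values themselves change, and the same seed always reproduces the same output. The real tool goes much further, also preserving length and frequency distributions and keeping strings looking natural.

```python
import hashlib

def pseudonymize(value: str, seed: str) -> str:
    # Keyed, deterministic mapping: the same (seed, value) pair always yields the
    # same pseudonym, and different values almost certainly yield different ones,
    # so the number of distinct values per column is preserved.
    digest = hashlib.sha256((seed + '\0' + value).encode('utf-8')).hexdigest()
    return 'v_' + digest[:12]

seed = 'my-secret-seed'
column = ['alice@example.com', 'bob@example.com', 'alice@example.com']
obfuscated = [pseudonymize(v, seed) for v in column]

assert len(set(obfuscated)) == len(set(column))  # cardinality preserved
assert obfuscated[0] == obfuscated[2]            # repeated values stay equal
print(obfuscated)
```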
+ +Most of the properties above are what make the result viable for performance testing: + +reading data, filtering, aggregation and sorting will work at almost the same speed +as on the original data due to the preserved cardinalities, magnitudes, compression ratios, etc. + +It works in a deterministic fashion: you define a seed value, and the transform is completely determined by the input data and by the seed. +Some transforms are one-to-one and could be reversed, so you need to have a large enough seed and keep it secret. + +It uses some cryptographic primitives to transform data, but from the cryptographic point of view, +it doesn't do anything properly and you should never consider the result secure, unless you have other reasons for it. + +It may retain some data you don't want to publish. + +It always leaves the numbers 0, 1 and -1 as is. It also leaves dates, lengths of arrays and null flags exactly as in the source data. +For example, if you have a column IsMobile in your table with values 0 and 1, then in the transformed data it will have the same values. +So, the user will be able to count the exact ratio of mobile traffic. + +As another example, suppose you have some private data in your table, like user emails, and you don't want to publish any single email address. +If your table is large enough, contains many different emails, and no single email has a much higher frequency than all the others, +it will anonymize all the data well. But if you have a small number of distinct values in a column, it can possibly reproduce some of them. +In that case you should take care, look at the exact algorithm this tool implements, and probably fine-tune some of its command-line parameters. + +This tool works well only with a reasonable amount of data (at least thousands of rows). diff --git a/programs/odbc-bridge/README.md b/docs/en/operations/utilities/odbc-bridge.md similarity index 100% rename from programs/odbc-bridge/README.md rename to docs/en/operations/utilities/odbc-bridge.md diff --git a/docs/en/sql-reference/functions/tuple-map-functions.md b/docs/en/sql-reference/functions/tuple-map-functions.md index f826b810d23..55f34b5831e 100644 --- a/docs/en/sql-reference/functions/tuple-map-functions.md +++ b/docs/en/sql-reference/functions/tuple-map-functions.md @@ -9,8 +9,7 @@ toc_title: Working with maps Collect all the keys and sum corresponding values. -Arguments are tuples of two arrays, where items in the first array represent keys, and the second array -contains values for the each key. +Arguments are tuples of two arrays, where items in the first array represent keys, and the second array contains values for each key. All key arrays should have same type, and all value arrays should contain items which are promotable to the one type (Int64, UInt64 or Float64). The common promoted type is used as a type for the result array. @@ -30,8 +29,7 @@ SELECT mapAdd(([toUInt8(1), 2], [1, 1]), ([toUInt8(1), 2], [1, 1])) as res, toTy Collect all the keys and subtract corresponding values. -Arguments are tuples of two arrays, where items in the first array represent keys, and the second array -contains values for the each key. +Arguments are tuples of two arrays, where items in the first array represent keys, and the second array contains values for each key. All key arrays should have same type, and all value arrays should contain items which are promotable to the one type (Int64, UInt64 or Float64). The common promoted type is used as a type for the result array.
@@ -45,25 +43,24 @@ SELECT mapSubtract(([toUInt8(1), 2], [toInt32(1), 1]), ([toUInt8(1), 2], [toInt3 ┌─res────────────┬─type──────────────────────────────┐ │ ([1,2],[-1,0]) │ Tuple(Array(UInt8), Array(Int64)) │ └────────────────┴───────────────────────────────────┘ -```` +``` ## mapPopulateSeries {#function-mappopulateseries} Syntax: `mapPopulateSeries((keys : Array(), values : Array()[, max : ])` -Generates a map, where keys are a series of numbers, from minimum to maximum keys (or `max` argument if it specified) taken from `keys` array with step size of one, -and corresponding values taken from `values` array. If the value is not specified for the key, then it uses default value in the resulting map. +Generates a map, where keys are a series of numbers, from minimum to maximum keys (or `max` argument if it specified) taken from `keys` array with step size of one, and corresponding values taken from `values` array. If the value is not specified for the key, then it uses default value in the resulting map. For repeated keys only the first value (in order of appearing) gets associated with the key. The number of elements in `keys` and `values` must be the same for each row. Returns a tuple of two arrays: keys in sorted order, and values the corresponding keys. -``` sql +```sql select mapPopulateSeries([1,2,4], [11,22,44], 5) as res, toTypeName(res) as type; ``` -``` text +```text ┌─res──────────────────────────┬─type──────────────────────────────┐ │ ([1,2,3,4,5],[11,22,0,44,0]) │ Tuple(Array(UInt8), Array(UInt8)) │ └──────────────────────────────┴───────────────────────────────────┘ diff --git a/docs/ru/introduction/info.md b/docs/ru/introduction/info.md index 14e517eebae..a9398b8c9cd 100644 --- a/docs/ru/introduction/info.md +++ b/docs/ru/introduction/info.md @@ -7,6 +7,6 @@ toc_priority: 100 Информационная поддержка ClickHouse осуществляется на всей территории Российской Федерации без ограничений посредством использования телефонной связи и средств электронной почты на русском языке в круглосуточном режиме: - Адрес электронной почты: -- Телефон: 8-800-250-96-39 (звонки бесплатны из всех регионов России) +- Телефон: +7-495-780-6510 [Оригинальная статья](https://clickhouse.tech/docs/ru/introduction/info/) diff --git a/docs/ru/whats-new/extended-roadmap.md b/docs/ru/whats-new/extended-roadmap.md index fa371df2d4a..b1d56ef005e 100644 --- a/docs/ru/whats-new/extended-roadmap.md +++ b/docs/ru/whats-new/extended-roadmap.md @@ -97,7 +97,9 @@ Upd. Есть pull request. Upd. Сделано. Частный случай такой задачи уже есть в https://clickhouse.tech/docs/ru/operations/table_engines/graphitemergetree/ Но это было сделано для конкретной задачи. А надо обобщить. -### 1.10. Пережатие старых данных в фоне {#perezhatie-starykh-dannykh-v-fone} +### 1.10. + Пережатие старых данных в фоне {#perezhatie-starykh-dannykh-v-fone} + +В master, сделал Александр Сапин, https://github.com/ClickHouse/ClickHouse/pull/14494 Будет делать Кирилл Барухов, ВШЭ, экспериментальная реализация к весне 2020. Нужно для Яндекс.Метрики. @@ -138,27 +140,32 @@ Upd: PR [#10463](https://github.com/ClickHouse/ClickHouse/pull/10463) ### 1.14. Не писать столбцы, полностью состоящие из нулей {#ne-pisat-stolbtsy-polnostiu-sostoiashchie-iz-nulei} -Антон Попов. Q3. +Антон Попов. Q4. В очереди. Простая задача, является небольшим пререквизитом для потенциальной поддержки полуструктурированных данных. +Upd. В очереди после чтения срезов столбцов. ### 1.15. 
Возможность иметь разный первичный ключ в разных кусках {#vozmozhnost-imet-raznyi-pervichnyi-kliuch-v-raznykh-kuskakh} Сложная задача, только после 1.3. Upd. В обсуждении. +Upd. Взял в работу Amos Bird. Описана концепция. Совпадает с 1.16. ### 1.16. Несколько физических представлений для одного куска данных {#neskolko-fizicheskikh-predstavlenii-dlia-odnogo-kuska-dannykh} Сложная задача, только после 1.3 и 1.6. Позволяет компенсировать 21.20. Upd. В обсуждении. +Upd. Взял в работу Amos Bird. Описана концепция, работа на начальной стадии. ### 1.17. Несколько сортировок для одной таблицы {#neskolko-sortirovok-dlia-odnoi-tablitsy} Сложная задача, только после 1.3 и 1.6. Upd. В обсуждении. +Upd. Взял в работу Amos Bird. Описана концепция. Совпадает с 1.16. -### 1.18. Отдельное хранение файлов кусков {#otdelnoe-khranenie-failov-kuskov} +### 1.18. - Отдельное хранение файлов кусков {#otdelnoe-khranenie-failov-kuskov} Требует 1.3 и 1.6. Полная замена hard links на sym links, что будет лучше для 1.12. +Отменено. ## 2. Крупные рефакторинги {#krupnye-refaktoringi} @@ -194,13 +201,14 @@ Upd. Старый код по большей части удалён. ### 2.5. Версионирование состояний агрегатных функций {#versionirovanie-sostoianii-agregatnykh-funktsii} -В очереди. +В очереди. Описана схема реализации. Алексей Миловидов. ### 2.6. Правая часть IN как тип данных. Выполнение IN в виде скалярного подзапроса {#pravaia-chast-in-kak-tip-dannykh-vypolnenie-in-v-vide-skaliarnogo-podzaprosa} Требует 2.1. +Отменено. -### 2.7. Нормализация Context {#normalizatsiia-context} +### 2.7. + Нормализация Context {#normalizatsiia-context} В очереди. Нужно для YQL. @@ -209,12 +217,14 @@ Upd. Старый код по большей части удалён. Upd. Каталог БД вынесен из Context. Upd. SharedContext вынесен из Context. Upd. Проблема нейтрализована и перестала быть актуальной. +Upd. Вообще всё стало Ок. ### 2.8. Декларативный парсер запросов {#deklarativnyi-parser-zaprosov} Средний приоритет. Нужно для YQL. Upd. В очереди. Иван Лежанкин. +Upd. Задача в финальной стадии. Пока рассматривается только как альтернативный парсер, описание которого подойдёт для сторонних приложений. ### 2.9. + Логгировние в format-стиле {#loggirovnie-v-format-stile} @@ -225,10 +235,12 @@ Upd. В очереди. Иван Лежанкин. ### 2.10. Запрашивать у таблиц не столбцы, а срезы {#zaprashivat-u-tablits-ne-stolbtsy-a-srezy} В очереди. +В работе, Антон Попов, Q4. -### 2.11. Разбирательство и нормализация функциональности для bitmap {#razbiratelstvo-i-normalizatsiia-funktsionalnosti-dlia-bitmap} +### 2.11. - Разбирательство и нормализация функциональности для bitmap {#razbiratelstvo-i-normalizatsiia-funktsionalnosti-dlia-bitmap} В очереди. +Не актуально. ### 2.12. Декларативные сигнатуры функций {#deklarativnye-signatury-funktsii} @@ -265,7 +277,7 @@ Upd. Поползновения наблюдаются. Требует 3.1. -### + 3.3. Исправить катастрофически отвратительно неприемлемый поиск по документации {#ispravit-katastroficheski-otvratitelno-nepriemlemyi-poisk-po-dokumentatsii} +### 3.3. + Исправить катастрофически отвратительно неприемлемый поиск по документации {#ispravit-katastroficheski-otvratitelno-nepriemlemyi-poisk-po-dokumentatsii} [Иван Блинков](https://github.com/blinkov/) - очень хороший человек. Сам сайт документации основан на технологиях, не удовлетворяющих требованиям задачи, и эти технологии трудно исправить. Задачу будет делать первый встретившийся нам frontend разработчик, которого мы сможем заставить это сделать. @@ -311,7 +323,6 @@ Upd. Сейчас обсуждается, как сделать другую з ### 4.8. 
Разделить background pool для fetch и merge {#razdelit-background-pool-dlia-fetch-i-merge} В очереди. Исправить проблему, что восстанавливающаяся реплика перестаёт мержить. Частично компенсируется 4.3. -Александр Казаков. ## 5. Операции {#operatsii} @@ -450,6 +461,7 @@ UBSan включен в функциональных тестах, но не в ### 7.12. Показывать тестовое покрытие нового кода в PR {#pokazyvat-testovoe-pokrytie-novogo-koda-v-pr} Пока есть просто показ тестового покрытия всего кода. +Отложено. ### 7.13. + Включение аналога -Weverything в gcc {#vkliuchenie-analoga-weverything-v-gcc} @@ -598,7 +610,7 @@ Upd. Сергей Штыков сделал функцию `randomPrintableASCII Upd. Илья Яцишин сделал табличную функцию `generateRandom`. Upd. Эльдар Заитов добавляет OSS Fuzz. Upd. Сделаны randomString, randomFixedString. -Upd. Сделаны fuzzBits, fuzzBytes. +Upd. Сделаны fuzzBits. ### 7.24. Fuzzing лексера и парсера запросов; кодеков и форматов {#fuzzing-leksera-i-parsera-zaprosov-kodekov-i-formatov} @@ -649,10 +661,11 @@ Upd. В Аркадии частично работает небольшая ча В очереди. Нужно для Яндекс.Метрики. -### 7.32. Обфускация продакшен запросов {#obfuskatsiia-prodakshen-zaprosov} +### 7.32. + Обфускация продакшен запросов {#obfuskatsiia-prodakshen-zaprosov} Роман Ильговский. Нужно для Яндекс.Метрики. -Есть pull request, почти готово: https://github.com/ClickHouse/ClickHouse/pull/10973 +Есть pull request: https://github.com/ClickHouse/ClickHouse/pull/10973 +Готово. Имея SQL запрос, требуется вывести структуру таблиц, на которых этот запрос будет выполнен, и заполнить эти таблицы случайными данными, такими, что результат этого запроса зависит от выбора подмножества данных. @@ -660,6 +673,8 @@ Upd. В Аркадии частично работает небольшая ча Обфускация запросов: имея секретные запросы и структуру таблиц, заменить имена полей и константы, чтобы запросы можно было использовать в качестве публично доступных тестов. +Upd. Последняя часть пока не сделана и будет сделана отдельно. + ### 7.33. Выкладывать патч релизы в репозиторий автоматически {#vykladyvat-patch-relizy-v-repozitorii-avtomaticheski} В очереди. Иван Лежанкин. @@ -701,10 +716,11 @@ Upd. Частично решён вопрос с visibility - есть како Altinity. Никто не делает эту задачу. -### 8.2. Поддержка Mongo Atlas URI {#podderzhka-mongo-atlas-uri} +### 8.2. - Поддержка Mongo Atlas URI {#podderzhka-mongo-atlas-uri} [Александр Кузьменков](https://github.com/akuzm). Upd. Задача взята в работу. +Все pull requests успешно закрыты. ### 8.3. + Доработки globs (правильная поддержка диапазонов, уменьшение числа одновременных stream-ов) {#dorabotki-globs-pravilnaia-podderzhka-diapazonov-umenshenie-chisla-odnovremennykh-stream-ov} @@ -721,6 +737,7 @@ Upd. Задача взята в работу. ### 8.6. Kerberos аутентификация для HDFS и Kafka {#kerberos-autentifikatsiia-dlia-hdfs-i-kafka} Андрей Коняев, ArenaData. Он куда-то пропал. +Upd. В процессе работа для Kafka. ### 8.7. + Исправление мелочи HDFS на очень старых ядрах Linux {#ispravlenie-melochi-hdfs-na-ochen-starykh-iadrakh-linux} @@ -759,6 +776,8 @@ Upd. В стадии код-ревью. ### 8.15. Запись данных в CapNProto {#zapis-dannykh-v-capnproto} +Отложено. + ### 8.16. + Поддержка формата Avro {#podderzhka-formata-avro} Andrew Onyshchuk. Есть pull request. Q1. Сделано. @@ -814,12 +833,13 @@ Upd. Готово. Низкий приоритет. Отменено. -### 8.21. Поддержка произвольного количества языков для имён регионов {#podderzhka-proizvolnogo-kolichestva-iazykov-dlia-imion-regionov} +### 8.21. 
- Поддержка произвольного количества языков для имён регионов {#podderzhka-proizvolnogo-kolichestva-iazykov-dlia-imion-regionov} Нужно для БК. Декабрь 2019. В декабре для БК сделан минимальный вариант этой задачи. Максимальный вариант, вроде, никому не нужен. Upd. Всё ещё кажется, что задача не нужна. +Отменено. ### 8.22. + Поддержка синтаксиса для переменных в стиле MySQL {#podderzhka-sintaksisa-dlia-peremennykh-v-stile-mysql} @@ -831,6 +851,7 @@ Upd. Сделано теми людьми, кому не запрещено ра ### 8.23. Подписка для импорта обновляемых и ротируемых логов в ФС {#podpiska-dlia-importa-obnovliaemykh-i-rotiruemykh-logov-v-fs} Желательно 2.15. +Отложено. ## 9. Безопасность {#bezopasnost} @@ -870,9 +891,10 @@ Upd. Одну причину устранили, но ещё что-то неи Upd. Нас заставляют переписать эту библиотеку с одного API на другое, так как старое внезапно устарело. Кажется, что переписывание случайно исправит все проблемы. Upd. Ура, нашли причину и исправили. -### 10.3. Возможность чтения данных из статических таблиц в YT словарях {#vozmozhnost-chteniia-dannykh-iz-staticheskikh-tablits-v-yt-slovariakh} +### 10.3. - Возможность чтения данных из статических таблиц в YT словарях {#vozmozhnost-chteniia-dannykh-iz-staticheskikh-tablits-v-yt-slovariakh} Нужно для БК и Метрики. +Отменено. ### 10.4. - Словарь из YDB (KikiMR) {#slovar-iz-ydb-kikimr} @@ -884,9 +906,11 @@ Upd. Ура, нашли причину и исправили. Для MySQL сделал Clément Rodriguez. -### 10.6. Словари из Cassandra и Couchbase {#slovari-iz-cassandra-i-couchbase} +### 10.6. + Словари из Cassandra и Couchbase {#slovari-iz-cassandra-i-couchbase} Готова Cassandra. +Couchbase отменён, так как не было спроса. +Aerospike под вопросом. ### 10.7. Поддержка Nullable в словарях {#podderzhka-nullable-v-slovariakh} @@ -929,10 +953,14 @@ Upd. Задача в финальной стадии готовности. ### 10.17. Локальный дамп состояния словаря для быстрого старта сервера {#lokalnyi-damp-sostoianiia-slovaria-dlia-bystrogo-starta-servera} +Отложено. + ### 10.18. Таблица Join или словарь на удалённом сервере как key-value БД для cache словаря {#tablitsa-join-ili-slovar-na-udalionnom-servere-kak-key-value-bd-dlia-cache-slovaria} ### 10.19. Возможность зарегистрировать некоторые функции, использующие словари, под пользовательскими именами {#vozmozhnost-zaregistrirovat-nekotorye-funktsii-ispolzuiushchie-slovari-pod-polzovatelskimi-imenami} +Отложено. + ## 11. Интерфейсы {#interfeisy} @@ -943,6 +971,7 @@ Upd. Задача в финальной стадии готовности. Нужно разобраться, как упаковывать Java в статический бинарник, возможно AppImage. Или предоставить максимально простую инструкцию по установке jdbc-bridge. Может быть будет заинтересован Александр Крашенинников, Badoo, так как он разработал jdbc-bridge. Upd. Александр Крашенинников перешёл в другую компанию и больше не занимается этим. +Upd. Задачу взял Zhichun Wu. ### 11.3. + Интеграционные тесты ODBC драйвера путём подключения ClickHouse к самому себе через ODBC {#integratsionnye-testy-odbc-draivera-putiom-podkliucheniia-clickhouse-k-samomu-sebe-cherez-odbc} @@ -960,6 +989,8 @@ Altinity целиком взяли на себя поддержку clickhouse-c ### 11.7. Интерактивный режим работы программы clickhouse-local {#interaktivnyi-rezhim-raboty-programmy-clickhouse-local} +Отложено. + ### 11.8. + Поддержка протокола PostgreSQL {#podderzhka-protokola-postgresql} Элбакян Мовсес Андраникович, ВШЭ. @@ -998,14 +1029,17 @@ Q1. Сделано управление правами полностью, но Аутентификация через LDAP - Денис Глазачев. 
[Виталий Баранов](https://github.com/vitlibar) и Денис Глазачев, Altinity. Требует 12.1. Q3. +Upd. Pull request на финальной стадии. ### 12.4. Подключение IDM системы Яндекса как справочника пользователей и прав доступа {#podkliuchenie-idm-sistemy-iandeksa-kak-spravochnika-polzovatelei-i-prav-dostupa} Пока низкий приоритет. Нужно для Метрики. Требует 12.3. +Отложено. ### 12.5. Pluggable аутентификация с помощью Kerberos (возможно, подключение GSASL) {#pluggable-autentifikatsiia-s-pomoshchiu-kerberos-vozmozhno-podkliuchenie-gsasl} [Виталий Баранов](https://github.com/vitlibar) и Денис Глазачев, Altinity. Требует 12.1. +Upd. Есть pull request. ### 12.6. + Информация о пользователях и квотах в системной таблице {#informatsiia-o-polzovateliakh-i-kvotakh-v-sistemnoi-tablitse} @@ -1033,6 +1067,7 @@ Q3. Upd. Не уследили, и задачу стали обсуждать менеджеры. Upd. Задачу смотрит Александр Казаков. Upd. Задача взята в работу. +Upd. Задача как будто взята в работу. ## 14. Диалект SQL {#dialekt-sql} @@ -1041,7 +1076,9 @@ Upd. Задача взята в работу. Нужно для DataLens. А также для внедрения в BI инструмент Looker. -### 14.2. Поддержка WITH для подзапросов {#podderzhka-with-dlia-podzaprosov} +### 14.2. + Поддержка WITH для подзапросов {#podderzhka-with-dlia-podzaprosov} + +Сделал Amos Bird. ### 14.3. Поддержка подстановок для множеств в правой части IN {#podderzhka-podstanovok-dlia-mnozhestv-v-pravoi-chasti-in} @@ -1057,11 +1094,13 @@ zhang2014 ### 14.6. Глобальный scope для WITH {#globalnyi-scope-dlia-with} +В обсуждении. Amos Bird. + ### 14.7. Nullable для WITH ROLLUP, WITH CUBE, WITH TOTALS {#nullable-dlia-with-rollup-with-cube-with-totals} Простая задача. -### 14.8. Модификаторы DISTINCT, ORDER BY для агрегатных функций {#modifikatory-distinct-order-by-dlia-agregatnykh-funktsii} +### 14.8. + Модификаторы DISTINCT, ORDER BY для агрегатных функций {#modifikatory-distinct-order-by-dlia-agregatnykh-funktsii} В ClickHouse поддерживается вычисление COUNT(DISTINCT x). Предлагается добавить возможность использования модификатора DISTINCT для всех агрегатных функций. Например, AVG(DISTINCT x) - вычислить среднее значение для всех различных значений x. Под вопросом вариант, в котором фильтрация уникальных значений выполняется по одному выражению, а агрегация по другому. @@ -1069,6 +1108,7 @@ zhang2014 Upd. Есть pull request-ы. Upd. DISTINCT готов. +Upd. ORDER BY отменён и будет заново сделан уже с LIMIT. ### 14.9. + Поддержка запроса EXPLAIN {#podderzhka-zaprosa-explain} @@ -1079,8 +1119,12 @@ Upd. Есть pull request. Готово. ### 14.11. Функции для grouping sets {#funktsii-dlia-grouping-sets} +Отложено. + ### 14.12. Функции обработки временных рядов {#funktsii-obrabotki-vremennykh-riadov} +Отложено. + Сложная задача, так как вводит новый класс функций и требует его обработку в оптимизаторе запросов. В time-series СУБД нужны функции, которые зависят от последовательности значений. Или даже от последовательности значений и их меток времени. Примеры: moving average, exponential smoothing, derivative, Holt-Winters forecast. Вычисление таких функций поддерживается в ClickHouse лишь частично. Так, ClickHouse поддерживает тип данных «массив» и позволяет реализовать эти функции как функции, принимающие массивы. Но гораздо удобнее для пользователя было бы иметь возможность применить такие функции к таблице (промежуточному результату запроса после сортировки). @@ -1089,6 +1133,8 @@ Upd. Есть pull request. Готово. ### 14.13. 
Применимость функций высшего порядка для кортежей и Nested {#primenimost-funktsii-vysshego-poriadka-dlia-kortezhei-i-nested} +После задачи "чтение срезов столбцов". + ### 14.14. Неявные преобразования типов констант {#neiavnye-preobrazovaniia-tipov-konstant} Сделано для операторов сравнения с константами (подавляющее большинство use cases). @@ -1180,12 +1226,14 @@ Upd. Секретного изменения в работе не будет, з ### 16.5. Функции для XML и HTML escape {#funktsii-dlia-xml-i-html-escape} -### 16.6. Функции нормализации и хэширования SQL запросов {#funktsii-normalizatsii-i-kheshirovaniia-sql-zaprosov} +### 16.6. + Функции нормализации и хэширования SQL запросов {#funktsii-normalizatsii-i-kheshirovaniia-sql-zaprosov} + +Алексей Миловидов. Сделано. ## 17. Работа с географическими данными {#rabota-s-geograficheskimi-dannymi} -### 17.1. Гео-словари для определения региона по координатам {#geo-slovari-dlia-opredeleniia-regiona-po-koordinatam} +### 17.1. + Гео-словари для определения региона по координатам {#geo-slovari-dlia-opredeleniia-regiona-po-koordinatam} [Андрей Чулков](https://github.com/achulkov2), Антон Кваша, Артур Петуховский, ВШЭ. Будет основано на коде от Арслана Урташева. @@ -1198,6 +1246,7 @@ Upd. Андрей сделал прототип интерфейса и реал Upd. Андрей сделал прототип более оптимальной структуры данных. Upd. Есть обнадёживающие результаты. Upd. В ревью. +Upd. В релизе. ### 17.2. GIS типы данных и операции {#gis-tipy-dannykh-i-operatsii} @@ -1227,6 +1276,7 @@ Upd. Есть pull request. Александр Кожихов, Максим Кузнецов. Обнаружена фундаментальная проблема в реализации, доделывает предположительно [Николай Кочетов](https://github.com/KochetovNicolai). Он может делегировать задачу кому угодно. Исправление фундаментальной проблемы - есть PR. +Фундаментальная проблема решена. ### 18.2. Агрегатные функции для статистических тестов {#agregatnye-funktsii-dlia-statisticheskikh-testov} @@ -1235,16 +1285,20 @@ Upd. Есть pull request. Предлагается реализовать в ClickHouse статистические тесты (Analysis of Variance, тесты нормальности распределения и т. п.) в виде агрегатных функций. Пример: `welchTTest(value, sample_idx)`. Сделали прототип двух тестов, есть pull request. Также есть pull request для корелляции рангов. +Upd. Помержили корелляцию рангов, но ещё не помержили сравнение t-test, u-test. ### 18.3. Инфраструктура для тренировки моделей в ClickHouse {#infrastruktura-dlia-trenirovki-modelei-v-clickhouse} В очереди. +Отложено. ## 19. Улучшение работы кластера {#uluchshenie-raboty-klastera} ### 19.1. Параллельные кворумные вставки без линеаризуемости {#parallelnye-kvorumnye-vstavki-bez-linearizuemosti} +Upd. В работе, ожидается в начале октября. + Репликация данных в ClickHouse по-умолчанию является асинхронной без выделенного мастера. Это значит, что клиент, осуществляющий вставку данных, получает успешный ответ после того, как данные попали на один сервер; репликация данных по остальным серверам осуществляется в другой момент времени. Это ненадёжно, потому что допускает потерю только что вставленных данных при потере лишь одного сервера. Для решения этой проблемы, в ClickHouse есть возможность включить «кворумную» вставку. Это значит, что клиент, осуществляющий вставку данных, получает успешный ответ после того, как данные попали на несколько (кворум) серверов. 
Обеспечивается линеаризуемость: клиент, получает успешный ответ после того, как данные попали на несколько реплик, *которые содержат все предыдущие данные, вставленные с кворумом* (такие реплики можно называть «синхронными»), и при запросе SELECT можно выставить настройку, разрешающую только чтение с синхронных реплик. @@ -1265,6 +1319,7 @@ Upd. Есть pull request. Upd. Алексей сделал какой-то вариант, но борется с тем, что ничего не работает. Upd. Есть pull request на начальной стадии. +Upd. Взято в работу, но непонятна перспектива, так как не ясно, подлежат ли исправлению некоторые нюансы. ### 19.3. - Подключение YT Cypress или YDB как альтернативы ZooKeeper {#podkliuchenie-yt-cypress-ili-ydb-kak-alternativy-zookeeper} @@ -1349,9 +1404,9 @@ Upd. Для DISTINCT есть pull request. [Vxider](https://github.com/Vxider), ICT Есть pull request. -### 21.6. Уменьшение числа потоков для SELECT в случае тривиального INSERT SELECT {#umenshenie-chisla-potokov-dlia-select-v-sluchae-trivialnogo-insert-select} +### 21.6. + Уменьшение числа потоков для SELECT в случае тривиального INSERT SELECT {#umenshenie-chisla-potokov-dlia-select-v-sluchae-trivialnogo-insert-select} -ucasFL, в разработке. +ucasFL, в разработке. Готово. ### 21.7. Кэш результатов запросов {#kesh-rezultatov-zaprosov} @@ -1371,11 +1426,14 @@ Upd. В обсуждении. Upd. Есть нерабочий прототип, скорее всего будет отложено. Upd. Отложено до осени. +Upd. Отложено до. ### 21.8.1. Отдельный аллокатор для кэшей с ASLR {#otdelnyi-allokator-dlia-keshei-s-aslr} В прошлом году задачу пытался сделать Данила Кутенин с помощью lfalloc из Аркадии и mimalloc из Microsoft, но оба решения не были квалифицированы для использования в продакшене. Успешная реализация задачи 21.8 отменит необходимость в этой задаче, поэтому холд. +Upd. Ещё попробовали новый tcmalloc, результаты неудовлетворительные. Пока отменено. + ### 21.9. Исправить push-down выражений с помощью Processors {#ispravit-push-down-vyrazhenii-s-pomoshchiu-processors} [Николай Кочетов](https://github.com/KochetovNicolai). Требует 2.1. @@ -1384,7 +1442,7 @@ Upd. Отложено до осени. Amos Bird. -### 21.11. Peephole оптимизации запросов {#peephole-optimizatsii-zaprosov} +### 21.11. + Peephole оптимизации запросов {#peephole-optimizatsii-zaprosov} Руслан Камалов, Михаил Малафеев, Виктор Гришанин, ВШЭ @@ -1399,8 +1457,9 @@ Amos Bird. Сделано ещё несколько оптимизаций. Upd. Все вышеперечисленные оптимизации доступны в pull requests. Upd. Из них почти все помержены, осталась одна. +Upd. Помержили всё. -### 21.12. Алгебраические оптимизации запросов {#algebraicheskie-optimizatsii-zaprosov} +### 21.12. + Алгебраические оптимизации запросов {#algebraicheskie-optimizatsii-zaprosov} Руслан Камалов, Михаил Малафеев, Виктор Гришанин, ВШЭ @@ -1415,6 +1474,7 @@ Upd. Из них почти все помержены, осталась одна Несколько оптимизаций есть в PR. Upd. Все оптимизации кроме "Обращение инъективных функций в сравнениях на равенство" есть в PR. Upd. Из них больше половины помержены, осталось ещё две. +Upd. Помержили всё. ### 21.13. Fusion агрегатных функций {#fusion-agregatnykh-funktsii} @@ -1427,6 +1487,7 @@ Constraints позволяют задать выражение, истиннос Если выражение содержит равенство, то встретив в запросе одну из частей равенства, её можно заменить на другую часть равенства, если это сделает проще чтение данных или вычисление выражения. Например, задан constraint: `URLDomain = domain(URL)`. Значит, выражение `domain(URL)` можно заменить на `URLDomain`. Upd. Возможно будет отложено на следующий год. 
+Отложено на следующий год. ### 21.15. Многоступенчатое чтение данных вместо PREWHERE {#mnogostupenchatoe-chtenie-dannykh-vmesto-prewhere} @@ -1442,10 +1503,11 @@ Upd. Возможно будет отложено на следующий год ### 21.18. Внутренняя параллелизация мержа больших состояний агрегатных функций {#vnutrenniaia-parallelizatsiia-merzha-bolshikh-sostoianii-agregatnykh-funktsii} -### 21.19. Оптимизация сортировки {#optimizatsiia-sortirovki} +### 21.19. + Оптимизация сортировки {#optimizatsiia-sortirovki} Василий Морозов, Арслан Гумеров, Альберт Кидрачев, ВШЭ. В прошлом году задачу начинал делать другой человек, но не добился достаточного прогресса. +Upd. Сделаны самые существенные из предложенных вариантов. \+ 1. Оптимизация top sort. @@ -1481,11 +1543,13 @@ Upd. Вместо этого будем делать задачу 1.16. ### 21.22. Userspace page cache {#userspace-page-cache} Требует 21.8. +Отложено. -### 21.23. Ускорение работы с вторичными индексами {#uskorenie-raboty-s-vtorichnymi-indeksami} +### 21.23. + Ускорение работы с вторичными индексами {#uskorenie-raboty-s-vtorichnymi-indeksami} zhang2014. Есть pull request. +Готово. ## 22. Долги и недоделанные возможности {#dolgi-i-nedodelannye-vozmozhnosti} @@ -1679,15 +1743,18 @@ Q1. [Николай Кочетов](https://github.com/KochetovNicolai). ### 24.2. Экспериментальные алгоритмы сжатия {#eksperimentalnye-algoritmy-szhatiia} +Отложено. + ClickHouse поддерживает LZ4 и ZSTD для сжатия данных. Эти алгоритмы являются парето-оптимальными по соотношению скорости и коэффициентам сжатия среди достаточно известных. Тем не менее, существуют менее известные алгоритмы сжатия, которые могут превзойти их по какому-либо критерию. Из потенциально более быстрых по сравнимом коэффициенте сжатия: Lizard, LZSSE, density. Из более сильных: bsc и csc. Необходимо изучить эти алгоритмы, добавить их поддержку в ClickHouse и исследовать их работу на тестовых датасетах. -### 24.3. Экспериментальные кодеки {#eksperimentalnye-kodeki} +### 24.3. - Экспериментальные кодеки {#eksperimentalnye-kodeki} Существуют специализированные алгоритмы кодирования числовых последовательностей: Group VarInt, MaskedVByte, PFOR. Необходимо изучить наиболее эффективные реализации этих алгоритмов. Примеры вы сможете найти на https://github.com/lemire и https://github.com/powturbo/ а также https://github.com/schizofreny/middle-out Внедрить их в ClickHouse в виде кодеков и изучить их работу на тестовых датасетах. Upd. Есть два pull requests в начальной стадии, отложено. +Upd. Отменено. ### 24.4. Шифрование в ClickHouse на уровне VFS {#shifrovanie-v-clickhouse-na-urovne-vfs} @@ -1697,6 +1764,7 @@ Upd. Есть два pull requests в начальной стадии, отло Обсуждаются детали реализации. Q3/Q4. Виталий Баранов. +Отложено, после бэкапов. ### 24.5. Поддержка функций шифрования для отдельных значений {#podderzhka-funktsii-shifrovaniia-dlia-otdelnykh-znachenii} @@ -1706,6 +1774,7 @@ Upd. Есть два pull requests в начальной стадии, отло Для этого требуется реализовать функции шифрования и расшифрования, доступные из SQL. Для шифрования реализовать возможность добавления нужного количества случайных бит для исключения одинаковых зашифрованных значений на одинаковых данных. Это позволит реализовать возможность «забывания» данных без удаления строк таблицы: можно шифровать данные разных клиентов разными ключами, и для того, чтобы забыть данные одного клиента, потребуется всего лишь удалить ключ. Делает Василий Немков, Altinity +Есть pull request в процессе ревью, исправляем проблемы производительности. ### 24.6. 
Userspace RAID {#userspace-raid} @@ -1722,6 +1791,7 @@ RAID позволяет одновременно увеличить надёжн Для преодоления этих ограничений, предлагается реализовать в ClickHouse встроенный алгоритм расположения данных на дисках. Есть pull request на начальной стадии. +Отложено. ### 24.7. Вероятностные структуры данных для фильтрации по подзапросам {#veroiatnostnye-struktury-dannykh-dlia-filtratsii-po-podzaprosam} @@ -1762,6 +1832,7 @@ Upd. Есть pull request. В стадии ревью. Готово. Рустам Гусейн-заде, ВШЭ. Есть pull request на промежуточной стадии. +Отложено. ### 24.11. User Defined Functions {#user-defined-functions} @@ -1785,7 +1856,7 @@ ClickHouse предоставляет достаточно богатый наб Upd. В работе два варианта реализации UDF. -### 24.12. GPU offloading {#gpu-offloading} +### 24.12. - GPU offloading {#gpu-offloading} Риск состоит в том, что даже известные GPU базы, такие как OmniSci, работают медленнее, чем ClickHouse. Преимущество возможно только на полной сортировке и JOIN. @@ -1794,10 +1865,11 @@ Upd. В работе два варианта реализации UDF. В компании nVidia сделали прототип offloading вычисления GROUP BY с некоторыми из агрегатных функций в ClickHouse и обещат предоставить исходники в публичный доступ для дальнейшего развития. Предлагается изучить этот прототип и расширить его применимость для более широкого сценария использования. В качестве альтернативы, предлагается изучить исходные коды системы `OmniSci` или `Alenka` или библиотеку `CUB` https://nvlabs.github.io/cub/ и применить некоторые из алгоритмов в ClickHouse. Upd. В компании nVidia выложили прототип, теперь нужна интеграция в систему сборки. -Upd. Интеграция в систему сборки - Иван Лежанкин. +Upd. Интеграция в систему сборки - Иван Лежанкин (не сделано). Upd. Есть прототип bitonic sort. Upd. Прототип bitonic sort помержен, но целесообразность под вопросом (он работает медленнее). Наверное надо будет подержать и удалить. +Удалили. ### 24.13. Stream запросы {#stream-zaprosy} @@ -1819,6 +1891,8 @@ Upd. Есть два прототипа от внешних контрибьют В прошлом году исследование по этой задаче сделал Егор Соловьёв, ВШЭ и Яндекс.Такси. Его исследование показало, что алгоритм нельзя существенно улучшить путём изменения параметров. Но исследование лажовое, так как рассмотрен только уже использующийся алгоритм. То есть, задача остаётся открытой. +Отложено. + ### 24.17. Экспериментальные способы ускорения параллельного GROUP BY {#eksperimentalnye-sposoby-uskoreniia-parallelnogo-group-by} Максим Серебряков @@ -1831,9 +1905,12 @@ Upd. Есть pull request - в большинстве случаев однов ### 24.19. Промежуточное состояние GROUP BY как структура данных для key-value доступа {#promezhutochnoe-sostoianie-group-by-kak-struktura-dannykh-dlia-key-value-dostupa} +Отложено. + ### 24.20. Short-circuit вычисления некоторых выражений {#short-circuit-vychisleniia-nekotorykh-vyrazhenii} Два года назад задачу попробовала сделать Анастасия Царькова, ВШЭ и Яндекс, но реализация получилась слишком неудобной и её удалили. +В обсуждении. ### 24.21. Реализация в ClickHouse протокола распределённого консенсуса {#realizatsiia-v-clickhouse-protokola-raspredelionnogo-konsensusa} @@ -1851,9 +1928,10 @@ ClickHouse также может использоваться для быстр Другая экспериментальная задача - реализация эвристик для обработки данных в неизвестном построчном текстовом формате. Детектирование CSV, TSV, JSON, детектирование разделителей и форматов значений. -### 24.23. 
Минимальная поддержка транзакций для множества вставок/чтений {#minimalnaia-podderzhka-tranzaktsii-dlia-mnozhestva-vstavokchtenii} +### 24.23. - Минимальная поддержка транзакций для множества вставок/чтений {#minimalnaia-podderzhka-tranzaktsii-dlia-mnozhestva-vstavokchtenii} Максим Кузнецов, ВШЭ. +Отменено. Таблицы типа MergeTree состоят из набора независимых неизменяемых «кусков» данных. При вставках данных (INSERT), формируются новые куски. При модификациях данных (слияние кусков), формируются новые куски, а старые - становятся неактивными и перестают использоваться следующими запросами. Чтение данных (SELECT) производится из снэпшота множества кусков на некоторый момент времени. Таким образом, чтения и вставки не блокируют друг друга. @@ -1863,11 +1941,12 @@ ClickHouse также может использоваться для быстр Для решения этих проблем, предлагается ввести глобальные метки времени для кусков данных (сейчас уже есть инкрементальные номера кусков, но они выделяются в рамках одной таблицы). Первым шагом сделаем эти метки времени в рамках сервера. Вторым шагом сделаем метки времени в рамках всех серверов, но неточные на основе локальных часов. Третьим шагом сделаем метки времени, выдаваемые сервисом координации. -### 24.24. Реализация алгоритмов differential privacy {#realizatsiia-algoritmov-differential-privacy} +### 24.24. - Реализация алгоритмов differential privacy {#realizatsiia-algoritmov-differential-privacy} [\#6874](https://github.com/ClickHouse/ClickHouse/issues/6874) Артём Вишняков, ВШЭ. Есть pull request. +Отменено, так как решение имеет низкую практичность. ### 24.25. Интеграция в ClickHouse функциональности обработки HTTP User Agent {#integratsiia-v-clickhouse-funktsionalnosti-obrabotki-http-user-agent} @@ -1882,6 +1961,7 @@ Upd. Есть pull request. Нужно ещё чистить код библио Александр Кожихов, ВШЭ и Яндекс.YT. Upd. Есть pull request с прототипом. +Upd. Александ Кузьменков взял задачу в работу. ### 24.27. Реализация алгоритмов min-hash, sim-hash для нечёткого поиска полудубликатов {#realizatsiia-algoritmov-min-hash-sim-hash-dlia-nechiotkogo-poiska-poludublikatov} @@ -1892,10 +1972,12 @@ ucasFL, ICT. Алгоритмы min-hash и sim-hash позволяют вычислить для текста несколько хэш-значений таких, что при небольшом изменении текста, по крайней мере один из хэшей не меняется. Вычисления можно реализовать на n-грамах и словарных шинглах. Предлагается добавить поддержку этих алгоритмов в виде функций в ClickHouse и изучить их применимость для задачи нечёткого поиска полудубликатов. Есть pull request, есть что доделывать. +Upd. Николай Кочетов взял задачу в работу. ### 24.28. Другой sketch для квантилей {#drugoi-sketch-dlia-kvantilei} Похоже на quantileTiming, но с логарифмическими корзинами. См. DDSketch. +Отложено. ### 24.29. Поддержка Arrow Flight {#podderzhka-arrow-flight} @@ -1911,6 +1993,7 @@ Amos Bird, но его решение слишком громоздкое и п ### 24.31. Кореллированные подзапросы {#korellirovannye-podzaprosy} Перепиcывание в JOIN. Не раньше 21.11, 21.12, 21.9. Низкий приоритет. +Отложено. ### 24.32. Поддержка GRPC {#podderzhka-grpc} @@ -1925,6 +2008,7 @@ Amos Bird, но его решение слишком громоздкое и п Рассматривается вариант - поддержка GRPC в ClickHouse. Здесь есть неочевидные моменты, такие как - эффективная передача массивов данных в column-oriented формате - насколько удобно будет обернуть это в GRPC. Задача в работе, есть pull request. [#10136](https://github.com/ClickHouse/ClickHouse/pull/10136) +Upd. Задачу взял в работу Виталий Баранов. ## 25. 
DevRel {#devrel} @@ -1970,17 +2054,18 @@ Amos Bird, но его решение слишком громоздкое и п Екатерина - организация. Upd. Проведено два онлайн митапа на русском и два на английском. -### 25.11. Митапы зарубежные: восток США (Нью Йорк, возможно Raleigh), возможно северо-запад (Сиэтл), Китай (Пекин снова, возможно митап для разработчиков или хакатон), Лондон {#mitapy-zarubezhnye-vostok-ssha-niu-iork-vozmozhno-raleigh-vozmozhno-severo-zapad-sietl-kitai-pekin-snova-vozmozhno-mitap-dlia-razrabotchikov-ili-khakaton-london} +### 25.11. + Митапы зарубежные: восток США (Нью Йорк, возможно Raleigh), возможно северо-запад (Сиэтл), Китай (Пекин снова, возможно митап для разработчиков или хакатон), Лондон {#mitapy-zarubezhnye-vostok-ssha-niu-iork-vozmozhno-raleigh-vozmozhno-severo-zapad-sietl-kitai-pekin-snova-vozmozhno-mitap-dlia-razrabotchikov-ili-khakaton-london} -[Иван Блинков](https://github.com/blinkov/) - организация. Две штуки в США запланированы. Upd. Два митапа в США и один в Европе проведены. +[Иван Блинков](https://github.com/blinkov/) - организация. Две штуки в США запланированы. Upd. Два митапа в США и один в Европе проведены. Upd. Все остальные перенесены в онлайн. ### 25.12. Статья «научная» - про устройство хранения данных и индексов или whitepaper по архитектуре. Есть вариант подать на VLDB {#statia-nauchnaia-pro-ustroistvo-khraneniia-dannykh-i-indeksov-ili-whitepaper-po-arkhitekture-est-variant-podat-na-vldb} Низкий приоритет. Алексей Миловидов. -### 25.13. Участие во всех мероприятиях Яндекса, которые связаны с разработкой бэкенда, C++ разработкой или с базами данных, возможно участие в DevRel мероприятиях {#uchastie-vo-vsekh-meropriiatiiakh-iandeksa-kotorye-sviazany-s-razrabotkoi-bekenda-c-razrabotkoi-ili-s-bazami-dannykh-vozmozhno-uchastie-v-devrel-meropriiatiiakh} +### 25.13. + Участие во всех мероприятиях Яндекса, которые связаны с разработкой бэкенда, C++ разработкой или с базами данных, возможно участие в DevRel мероприятиях {#uchastie-vo-vsekh-meropriiatiiakh-iandeksa-kotorye-sviazany-s-razrabotkoi-bekenda-c-razrabotkoi-ili-s-bazami-dannykh-vozmozhno-uchastie-v-devrel-meropriiatiiakh} -Алексей Миловидов и все подготовленные докладчики +Алексей Миловидов и все подготовленные докладчики. +Upd. Участвуем. ### 25.14. Конференции в России: все HighLoad, возможно CodeFest, DUMP или UWDC, возможно C++ Russia {#konferentsii-v-rossii-vse-highload-vozmozhno-codefest-dump-ili-uwdc-vozmozhno-c-russia} @@ -1988,6 +2073,7 @@ Amos Bird, но его решение слишком громоздкое и п Upd. Есть Saint HighLoad online. Upd. Есть C++ Russia. CodeFest, DUMP, UWDC отменились. +Upd. Добавились Highload Fwdays, Матемаркетинг. ### 25.15. Конференции зарубежные: Percona, DataOps, попытка попасть на более крупные {#konferentsii-zarubezhnye-percona-dataops-popytka-popast-na-bolee-krupnye} @@ -2009,16 +2095,18 @@ DataOps отменилась. Требуется проработать вопрос безопасности и изоляции инстансов (поднятие в контейнерах с ограничениями по сети), подключение тестовых датасетов с помощью copy-on-write файловой системы; органичения ресурсов. Есть минимальный прототип. Сделал Илья Яцишин. Этот прототип не позволяет делиться ссылками на результаты запросов. +Upd. На финальной стадии инструмент для экспериментирования с разными версиями ClickHouse. ### 25.17. Взаимодействие с ВУЗами: ВШЭ, УрФУ, ICT Beijing {#vzaimodeistvie-s-vuzami-vshe-urfu-ict-beijing} Алексей Миловидов и вся группа разработки. Благодаря Robert Hodges добавлен CMU. Upd. Взаимодействие с ВШЭ 2019/2020 успешно выполнено. +Upd. Идёт подготовка к 2020/2021. 
### 25.18. - Лекция в ШАД {#lektsiia-v-shad} -Алексей Миловидов +Алексей Миловидов. ### 25.19. - Участие в курсе разработки на C++ в ШАД {#uchastie-v-kurse-razrabotki-na-c-v-shad} @@ -2029,6 +2117,8 @@ Upd. Взаимодействие с ВШЭ 2019/2020 успешно выпол Существуют мало известные специализированные СУБД, способные конкурировать с ClickHouse по скорости обработки некоторых классов запросов. Пример: `TDEngine` и `DolphinDB`, `VictoriaMetrics`, а также `Apache Doris` и `LocustDB`. Предлагается изучить и классифицировать архитектурные особенности этих систем - их особенности и преимущества. Установить эти системы, загрузить тестовые данные, изучить производительность. Проанализировать, за счёт чего достигаются преимущества. Upd. Есть поползновения с TDEngine. +Upd. Добавили OmniSci, обновили MonetDB. +Также посмотрели QuestDB и VectorSQL (они не работают). ### 25.21. Повторное награждение контрибьюторов в Китае {#povtornoe-nagrazhdenie-kontribiutorov-v-kitae} @@ -2038,6 +2128,7 @@ Upd. Ждём снятия ограничений и восстановлени [Иван Блинков](https://github.com/blinkov/) - организация. Провёл мероприятие для турецкой компании. Upd. On-site заменяется на Online. +Upd. Проведены консультации для нескольких секретных компаний. ### 25.23. Новый мерч для ClickHouse {#novyi-merch-dlia-clickhouse} diff --git a/docs/tools/build.py b/docs/tools/build.py index 120af33c8fb..c91cc8d5f3c 100755 --- a/docs/tools/build.py +++ b/docs/tools/build.py @@ -28,6 +28,7 @@ import test import util import website +from cmake_in_clickhouse_generator import generate_cmake_flags_files class ClickHouseMarkdown(markdown.extensions.Extension): class ClickHousePreprocessor(markdown.util.Processor): @@ -184,6 +185,8 @@ def build(args): test.test_templates(args.website_dir) if not args.skip_docs: + generate_cmake_flags_files(os.path.join(os.path.dirname(__file__), '..', '..')) + build_docs(args) from github import build_releases build_releases(args, build_docs) @@ -200,6 +203,7 @@ def build(args): if __name__ == '__main__': os.chdir(os.path.join(os.path.dirname(__file__), '..')) website_dir = os.path.join('..', 'website') + arg_parser = argparse.ArgumentParser() arg_parser.add_argument('--lang', default='en,es,fr,ru,zh,ja,tr,fa') arg_parser.add_argument('--blog-lang', default='en,ru') diff --git a/docs/tools/cmake_in_clickhouse_generator.py b/docs/tools/cmake_in_clickhouse_generator.py new file mode 100644 index 00000000000..b15df76151e --- /dev/null +++ b/docs/tools/cmake_in_clickhouse_generator.py @@ -0,0 +1,152 @@ +import re +import os +from typing import TextIO, List, Tuple, Optional, Dict + +# name, default value, description +Entity = Tuple[str, str, str] + +# https://regex101.com/r/R6iogw/12 +cmake_option_regex: str = r"^\s*option\s*\(([A-Z_0-9${}]+)\s*(?:\"((?:.|\n)*?)\")?\s*(.*)?\).*$" + +ch_master_url: str = "https://github.com/clickhouse/clickhouse/blob/master/" + +name_str: str = "[`{name}`](" + ch_master_url + "{path}#L{line})" +default_anchor_str: str = "[`{name}`](#{anchor})" + +comment_var_regex: str = r"\${(.+)}" +comment_var_replace: str = "`\\1`" + +table_header: str = """ +| Name | Default value | Description | Comment | +|------|---------------|-------------|---------| +""" + +# Needed to detect conditional variables (those which are defined twice) +# name -> (path, values) +entities: Dict[str, Tuple[str, str]] = {} + + +def make_anchor(t: str) -> str: + return "".join(["-" if i == "_" else i.lower() for i in t if i.isalpha() or i == "_"]) + +def process_comment(comment: str) -> str: + return 
re.sub(comment_var_regex, comment_var_replace, comment, flags=re.MULTILINE) + +def build_entity(path: str, entity: Entity, line_comment: Tuple[int, str]) -> None: + (line, comment) = line_comment + (name, description, default) = entity + + if name in entities: + return + + # cannot escape the { in macro option description -> invalid AMP html + # Skipping "USE_INTERNAL_${LIB_NAME_UC}_LIBRARY" + if "LIB_NAME_UC" in name: + return + + if len(default) == 0: + formatted_default: str = "`OFF`" + elif default[0] == "$": + formatted_default: str = "`{}`".format(default[2:-1]) + else: + formatted_default: str = "`" + default + "`" + + formatted_name: str = name_str.format( + anchor=make_anchor(name), + name=name, + path=path, + line=line if line > 0 else 1) + + formatted_description: str = "".join(description.split("\n")) + + formatted_comment: str = process_comment(comment) + + formatted_entity: str = "| {} | {} | {} | {} |".format( + formatted_name, formatted_default, formatted_description, formatted_comment) + + entities[name] = path, formatted_entity + +def process_file(root_path: str, input_name: str) -> None: + with open(os.path.join(root_path, input_name), 'r') as cmake_file: + contents: str = cmake_file.read() + + def get_line_and_comment(target: str) -> Tuple[int, str]: + contents_list: List[str] = contents.split("\n") + comment: str = "" + + for n, line in enumerate(contents_list): + if line.find(target) == -1: + continue + + for maybe_comment_line in contents_list[n - 1::-1]: + if not re.match("\s*#\s*", maybe_comment_line): + break + + comment = re.sub("\s*#\s*", "", maybe_comment_line) + " " + comment + + return n, comment + + matches: Optional[List[Entity]] = re.findall(cmake_option_regex, contents, re.MULTILINE) + + if matches: + for entity in matches: + build_entity(os.path.join(root_path[6:], input_name), entity, get_line_and_comment(entity[0])) + +def process_folder(root_path:str, name: str) -> None: + for root, _, files in os.walk(os.path.join(root_path, name)): + for f in files: + if f == "CMakeLists.txt" or ".cmake" in f: + process_file(root, f) + +def generate_cmake_flags_files(root_path: str) -> None: + output_file_name: str = os.path.join(root_path, "docs/en/development/cmake-in-clickhouse.md") + header_file_name: str = os.path.join(root_path, "docs/_includes/cmake_in_clickhouse_header.md") + footer_file_name: str = os.path.join(root_path, "docs/_includes/cmake_in_clickhouse_footer.md") + + process_file(root_path, "CMakeLists.txt") + process_file(root_path, "programs/CMakeLists.txt") + + process_folder(root_path, "base") + process_folder(root_path, "cmake") + process_folder(root_path, "src") + + with open(output_file_name, "w") as f: + with open(header_file_name, "r") as header: + f.write(header.read()) + + sorted_keys: List[str] = sorted(entities.keys()) + ignored_keys: List[str] = [] + + f.write("### ClickHouse modes\n" + table_header) + + for k in sorted_keys: + if k.startswith("ENABLE_CLICKHOUSE_"): + f.write(entities[k][1] + "\n") + ignored_keys.append(k) + + f.write("\n### External libraries\nNote that ClickHouse uses forks of these libraries, see https://github.com/ClickHouse-Extras.\n" + + table_header) + + for k in sorted_keys: + if k.startswith("ENABLE_") and ".cmake" in entities[k][0]: + f.write(entities[k][1] + "\n") + ignored_keys.append(k) + + f.write("\n### External libraries system/bundled mode\n" + table_header) + + for k in sorted_keys: + if k.startswith("USE_INTERNAL_"): + f.write(entities[k][1] + "\n") + ignored_keys.append(k) + + f.write("\n### Other 
flags\n" + table_header) + + for k in sorted(set(sorted_keys).difference(set(ignored_keys))): + f.write(entities[k][1] + "\n") + + with open(footer_file_name, "r") as footer: + f.write(footer.read()) + + +if __name__ == '__main__': + generate_cmake_flags_files("../../") diff --git a/docs/tools/translate/translate.py b/docs/tools/translate/translate.py index 6486a8cbcc7..343ab09f12a 100755 --- a/docs/tools/translate/translate.py +++ b/docs/tools/translate/translate.py @@ -49,13 +49,14 @@ def translate_impl(text, target_language=None): def translate(text, target_language=None): - result = [] - for part in re.split(curly_braces_re, text): - if part.startswith('{') and part.endswith('}'): - result.append(part) - else: - result.append(translate_impl(part, target_language=target_language)) - return ''.join(result) + return "".join( + [ + part + if part.startswith("{") and part.endswith("}") + else translate_impl(part, target_language=target_language) + for part in re.split(curly_braces_re, text) + ] + ) def translate_toc(root, lang): diff --git a/docs/zh/sql-reference/operators/in.md b/docs/zh/sql-reference/operators/in.md index eaaa477fbe1..bcd3ca1fa18 100644 --- a/docs/zh/sql-reference/operators/in.md +++ b/docs/zh/sql-reference/operators/in.md @@ -3,7 +3,7 @@ machine_translated: true machine_translated_rev: 5decc73b5dc60054f19087d3690c4eb99446a6c3 --- -# 在运营商 {#select-in-operators} +# IN 操作符 {#select-in-operators} 该 `IN`, `NOT IN`, `GLOBAL IN`,和 `GLOBAL NOT IN` 运算符是单独复盖的,因为它们的功能相当丰富。 diff --git a/programs/CMakeLists.txt b/programs/CMakeLists.txt index ae4a72ef62a..3577ee3df31 100644 --- a/programs/CMakeLists.txt +++ b/programs/CMakeLists.txt @@ -2,31 +2,49 @@ if (USE_CLANG_TIDY) set (CMAKE_CXX_CLANG_TIDY "${CLANG_TIDY_PATH}") endif () -# 'clickhouse' binary is a multi purpose tool, -# that contain multiple execution modes (client, server, etc.) -# each of them is built and linked as a separate library, defined below. +# The `clickhouse` binary is a multi purpose tool that contains multiple execution modes (client, server, etc.), +# each of them may be built and linked as a separate library. +# If you do not know what modes you need, turn this option OFF and enable SERVER and CLIENT only. 
+option (ENABLE_CLICKHOUSE_ALL "Enable all ClickHouse modes by default" ON) -option (ENABLE_CLICKHOUSE_ALL "Enable all tools" ON) -option (ENABLE_CLICKHOUSE_SERVER "Enable clickhouse-server" ${ENABLE_CLICKHOUSE_ALL}) -option (ENABLE_CLICKHOUSE_CLIENT "Enable clickhouse-client" ${ENABLE_CLICKHOUSE_ALL}) -option (ENABLE_CLICKHOUSE_LOCAL "Enable clickhouse-local" ${ENABLE_CLICKHOUSE_ALL}) -option (ENABLE_CLICKHOUSE_BENCHMARK "Enable clickhouse-benchmark" ${ENABLE_CLICKHOUSE_ALL}) -option (ENABLE_CLICKHOUSE_EXTRACT_FROM_CONFIG "Enable clickhouse-extract-from-config" ${ENABLE_CLICKHOUSE_ALL}) -option (ENABLE_CLICKHOUSE_COMPRESSOR "Enable clickhouse-compressor" ${ENABLE_CLICKHOUSE_ALL}) -option (ENABLE_CLICKHOUSE_COPIER "Enable clickhouse-copier" ${ENABLE_CLICKHOUSE_ALL}) -option (ENABLE_CLICKHOUSE_FORMAT "Enable clickhouse-format" ${ENABLE_CLICKHOUSE_ALL}) -option (ENABLE_CLICKHOUSE_OBFUSCATOR "Enable clickhouse-obfuscator" ${ENABLE_CLICKHOUSE_ALL}) -option (ENABLE_CLICKHOUSE_GIT_IMPORT "Enable clickhouse-git-import" ${ENABLE_CLICKHOUSE_ALL}) -option (ENABLE_CLICKHOUSE_ODBC_BRIDGE "Enable clickhouse-odbc-bridge" ${ENABLE_CLICKHOUSE_ALL}) +option (ENABLE_CLICKHOUSE_SERVER "Server mode (main mode)" ${ENABLE_CLICKHOUSE_ALL}) +option (ENABLE_CLICKHOUSE_CLIENT "Client mode (interactive tui/shell that connects to the server)" + ${ENABLE_CLICKHOUSE_ALL}) + +# https://clickhouse.tech/docs/en/operations/utilities/clickhouse-local/ +option (ENABLE_CLICKHOUSE_LOCAL "Local files fast processing mode" ${ENABLE_CLICKHOUSE_ALL}) + +# https://clickhouse.tech/docs/en/operations/utilities/clickhouse-benchmark/ +option (ENABLE_CLICKHOUSE_BENCHMARK "Queries benchmarking mode" ${ENABLE_CLICKHOUSE_ALL}) + +option (ENABLE_CLICKHOUSE_EXTRACT_FROM_CONFIG "Configs processor (extract values etc.)" ${ENABLE_CLICKHOUSE_ALL}) + +# https://clickhouse.tech/docs/en/operations/utilities/clickhouse-compressor/ +option (ENABLE_CLICKHOUSE_COMPRESSOR "Data compressor and decompressor" ${ENABLE_CLICKHOUSE_ALL}) + +# https://clickhouse.tech/docs/en/operations/utilities/clickhouse-copier/ +option (ENABLE_CLICKHOUSE_COPIER "Inter-cluster data copying mode" ${ENABLE_CLICKHOUSE_ALL}) + +option (ENABLE_CLICKHOUSE_FORMAT "Queries pretty-printer and formatter with syntax highlighting" + ${ENABLE_CLICKHOUSE_ALL}) + +# https://clickhouse.tech/docs/en/operations/utilities/clickhouse-obfuscator/ +option (ENABLE_CLICKHOUSE_OBFUSCATOR "Table data obfuscator (convert real data to benchmark-ready one)" + ${ENABLE_CLICKHOUSE_ALL}) + +# https://clickhouse.tech/docs/en/operations/utilities/odbc-bridge/ +option (ENABLE_CLICKHOUSE_ODBC_BRIDGE "HTTP-server working like a proxy to ODBC driver" + ${ENABLE_CLICKHOUSE_ALL}) if (CLICKHOUSE_SPLIT_BINARY) - option (ENABLE_CLICKHOUSE_INSTALL "Enable clickhouse-install" OFF) + option(ENABLE_CLICKHOUSE_INSTALL "Install ClickHouse without .deb/.rpm/.tgz packages (having the binary only)" OFF) else () - option (ENABLE_CLICKHOUSE_INSTALL "Enable clickhouse-install" ${ENABLE_CLICKHOUSE_ALL}) + option(ENABLE_CLICKHOUSE_INSTALL "Install ClickHouse without .deb/.rpm/.tgz packages (having the binary only)" + ${ENABLE_CLICKHOUSE_ALL}) endif () if(NOT (MAKE_STATIC_LIBRARIES OR SPLIT_SHARED_LIBRARIES)) - set(CLICKHOUSE_ONE_SHARED 1) + set(CLICKHOUSE_ONE_SHARED ON) endif() configure_file (config_tools.h.in ${ConfigIncludePath}/config_tools.h) diff --git a/programs/client/Client.cpp b/programs/client/Client.cpp index 7c6d386ba05..d900eb17d78 100644 --- a/programs/client/Client.cpp +++ b/programs/client/Client.cpp @@ -1167,6 
+1167,9 @@ private: dump_of_cloned_ast.str().c_str()); fprintf(stderr, "dump after fuzz:\n"); fuzz_base->dumpTree(std::cerr); + + fmt::print(stderr, "IAST::clone() is broken for some AST node. This is a bug. The original AST ('dump before fuzz') and its cloned copy ('dump of cloned AST') refer to the same nodes, which must never happen. This means that their parent node doesn't implement clone() correctly."); + assert(false); } diff --git a/programs/client/Suggest.cpp b/programs/client/Suggest.cpp index ac18a131c3a..e85e7a21261 100644 --- a/programs/client/Suggest.cpp +++ b/programs/client/Suggest.cpp @@ -80,7 +80,7 @@ Suggest::Suggest() "WITH", "TOTALS", "HAVING", "ORDER", "COLLATE", "LIMIT", "UNION", "AND", "OR", "ASC", "IN", "KILL", "QUERY", "SYNC", "ASYNC", "TEST", "BETWEEN", "TRUNCATE", "USER", "ROLE", "PROFILE", "QUOTA", "POLICY", "ROW", "GRANT", "REVOKE", "OPTION", "ADMIN", "EXCEPT", "REPLACE", - "IDENTIFIED", "HOST", "NAME", "READONLY", "WRITABLE", "PERMISSIVE", "FOR", "RESTRICTIVE", "FOR", "RANDOMIZED", + "IDENTIFIED", "HOST", "NAME", "READONLY", "WRITABLE", "PERMISSIVE", "FOR", "RESTRICTIVE", "RANDOMIZED", "INTERVAL", "LIMITS", "ONLY", "TRACKING", "IP", "REGEXP", "ILIKE"}; } diff --git a/programs/copier/ClusterCopier.cpp b/programs/copier/ClusterCopier.cpp index b3d1ca7bcec..4ee14b14119 100644 --- a/programs/copier/ClusterCopier.cpp +++ b/programs/copier/ClusterCopier.cpp @@ -1477,7 +1477,9 @@ TaskStatus ClusterCopier::processPartitionPieceTaskImpl( { auto create_query_push_ast = rewriteCreateQueryStorage(task_shard.current_pull_table_create_query, task_table.table_push, task_table.engine_push_ast); - create_query_push_ast->as().if_not_exists = true; + auto & create = create_query_push_ast->as(); + create.if_not_exists = true; + InterpreterCreateQuery::prepareOnClusterQuery(create, context, task_table.cluster_push_name); String query = queryToString(create_query_push_ast); LOG_DEBUG(log, "Create destination tables. 
Query: {}", query); diff --git a/programs/copier/ClusterCopierApp.cpp b/programs/copier/ClusterCopierApp.cpp index ec64e118f45..08a7e50a9d7 100644 --- a/programs/copier/ClusterCopierApp.cpp +++ b/programs/copier/ClusterCopierApp.cpp @@ -105,7 +105,7 @@ void ClusterCopierApp::mainImpl() ThreadStatus thread_status; auto * log = &logger(); - LOG_INFO(log, "Starting clickhouse-copier (id {}, host_id {}, path {}, revision {})", process_id, host_id, process_path, ClickHouseRevision::get()); + LOG_INFO(log, "Starting clickhouse-copier (id {}, host_id {}, path {}, revision {})", process_id, host_id, process_path, ClickHouseRevision::getVersionRevision()); SharedContextHolder shared_context = Context::createShared(); auto context = std::make_unique(Context::createGlobal(shared_context.get())); diff --git a/programs/copier/Internals.cpp b/programs/copier/Internals.cpp index 12da07a772a..ca26f0d1831 100644 --- a/programs/copier/Internals.cpp +++ b/programs/copier/Internals.cpp @@ -215,31 +215,20 @@ Names extractPrimaryKeyColumnNames(const ASTPtr & storage_ast) return primary_key_columns; } -String extractReplicatedTableZookeeperPath(const ASTPtr & storage_ast) +bool isReplicatedTableEngine(const ASTPtr & storage_ast) { - String storage_str = queryToString(storage_ast); - const auto & storage = storage_ast->as(); const auto & engine = storage.engine->as(); if (!endsWith(engine.name, "MergeTree")) { + String storage_str = queryToString(storage_ast); throw Exception( "Unsupported engine was specified in " + storage_str + ", only *MergeTree engines are supported", ErrorCodes::BAD_ARGUMENTS); } - if (!startsWith(engine.name, "Replicated")) - { - return ""; - } - - auto replicated_table_arguments = engine.arguments->children; - - auto zk_table_path_ast = replicated_table_arguments[0]->as(); - auto zk_table_path_string = zk_table_path_ast.value.safeGet(); - - return zk_table_path_string; + return startsWith(engine.name, "Replicated"); } ShardPriority getReplicasPriority(const Cluster::Addresses & replicas, const std::string & local_hostname, UInt8 random) diff --git a/programs/copier/Internals.h b/programs/copier/Internals.h index b61b6d59629..7e45c0ea2ee 100644 --- a/programs/copier/Internals.h +++ b/programs/copier/Internals.h @@ -200,7 +200,7 @@ ASTPtr extractOrderBy(const ASTPtr & storage_ast); Names extractPrimaryKeyColumnNames(const ASTPtr & storage_ast); -String extractReplicatedTableZookeeperPath(const ASTPtr & storage_ast); +bool isReplicatedTableEngine(const ASTPtr & storage_ast); ShardPriority getReplicasPriority(const Cluster::Addresses & replicas, const std::string & local_hostname, UInt8 random); diff --git a/programs/copier/TaskTableAndShard.h b/programs/copier/TaskTableAndShard.h index 11ceffd12cd..27c4b89377d 100644 --- a/programs/copier/TaskTableAndShard.h +++ b/programs/copier/TaskTableAndShard.h @@ -48,7 +48,7 @@ struct TaskTable String getCertainPartitionPieceTaskStatusPath(const String & partition_name, const size_t piece_number) const; - bool isReplicatedTable() const { return engine_push_zk_path != ""; } + bool isReplicatedTable() const { return is_replicated_table; } /// Partitions will be split into number-of-splits pieces. /// Each piece will be copied independently. 
(10 by default) @@ -78,6 +78,7 @@ struct TaskTable /// First argument of Replicated...MergeTree() String engine_push_zk_path; + bool is_replicated_table; ASTPtr rewriteReplicatedCreateQueryToPlain(); @@ -269,7 +270,7 @@ inline TaskTable::TaskTable(TaskCluster & parent, const Poco::Util::AbstractConf engine_push_ast = parseQuery(parser_storage, engine_push_str, 0, DBMS_DEFAULT_MAX_PARSER_DEPTH); engine_push_partition_key_ast = extractPartitionKey(engine_push_ast); primary_key_comma_separated = Nested::createCommaSeparatedStringFrom(extractPrimaryKeyColumnNames(engine_push_ast)); - engine_push_zk_path = extractReplicatedTableZookeeperPath(engine_push_ast); + is_replicated_table = isReplicatedTableEngine(engine_push_ast); } sharding_key_str = config.getString(table_prefix + "sharding_key"); @@ -372,15 +373,18 @@ inline ASTPtr TaskTable::rewriteReplicatedCreateQueryToPlain() auto & new_storage_ast = prev_engine_push_ast->as(); auto & new_engine_ast = new_storage_ast.engine->as(); - auto & replicated_table_arguments = new_engine_ast.arguments->children; - - /// Delete first two arguments of Replicated...MergeTree() table. - replicated_table_arguments.erase(replicated_table_arguments.begin()); - replicated_table_arguments.erase(replicated_table_arguments.begin()); - - /// Remove replicated from name + /// Remove "Replicated" from name new_engine_ast.name = new_engine_ast.name.substr(10); + if (new_engine_ast.arguments) + { + auto & replicated_table_arguments = new_engine_ast.arguments->children; + + /// Delete first two arguments of Replicated...MergeTree() table. + replicated_table_arguments.erase(replicated_table_arguments.begin()); + replicated_table_arguments.erase(replicated_table_arguments.begin()); + } + return new_storage_ast.clone(); } diff --git a/programs/format/CMakeLists.txt b/programs/format/CMakeLists.txt index ab06708cd3a..49f17ef163f 100644 --- a/programs/format/CMakeLists.txt +++ b/programs/format/CMakeLists.txt @@ -5,6 +5,9 @@ set (CLICKHOUSE_FORMAT_LINK boost::program_options clickhouse_common_io clickhouse_parsers + clickhouse_functions + clickhouse_aggregate_functions + clickhouse_table_functions dbms ) diff --git a/programs/format/Format.cpp b/programs/format/Format.cpp index daf2d671568..01f952bf95e 100644 --- a/programs/format/Format.cpp +++ b/programs/format/Format.cpp @@ -1,13 +1,29 @@ #include +#include +#include #include #include #include +#include #include #include #include +#include #include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + + #pragma GCC diagnostic ignored "-Wunused-function" #pragma GCC diagnostic ignored "-Wmissing-declarations" @@ -22,6 +38,8 @@ int mainEntryClickHouseFormat(int argc, char ** argv) ("oneline", "format in single line") ("quiet,q", "just check syntax, no output on success") ("multiquery,n", "allow multiple queries in the same file") + ("obfuscate", "obfuscate instead of formatting") + ("seed", po::value(), "seed (arbitrary string) that determines the result of obfuscation") ; boost::program_options::variables_map options; @@ -40,10 +58,17 @@ int mainEntryClickHouseFormat(int argc, char ** argv) bool oneline = options.count("oneline"); bool quiet = options.count("quiet"); bool multiple = options.count("multiquery"); + bool obfuscate = options.count("obfuscate"); - if (quiet && (hilite || oneline)) + if (quiet && (hilite || oneline || obfuscate)) { - std::cerr << "Options 'hilite' or 'oneline' have no sense in 'quiet' mode." 
<< std::endl; + std::cerr << "Options 'hilite' or 'oneline' or 'obfuscate' have no sense in 'quiet' mode." << std::endl; + return 2; + } + + if (obfuscate && (hilite || oneline || quiet)) + { + std::cerr << "Options 'hilite' or 'oneline' or 'quiet' have no sense in 'obfuscate' mode." << std::endl; return 2; } @@ -51,21 +76,66 @@ int mainEntryClickHouseFormat(int argc, char ** argv) ReadBufferFromFileDescriptor in(STDIN_FILENO); readStringUntilEOF(query, in); - const char * pos = query.data(); - const char * end = pos + query.size(); - - ParserQuery parser(end); - do + if (obfuscate) { - ASTPtr res = parseQueryAndMovePosition(parser, pos, end, "query", multiple, 0, DBMS_DEFAULT_MAX_PARSER_DEPTH); - if (!quiet) + WordMap obfuscated_words_map; + WordSet used_nouns; + SipHash hash_func; + + if (options.count("seed")) { - formatAST(*res, std::cout, hilite, oneline); - if (multiple) - std::cout << "\n;\n"; - std::cout << std::endl; + std::string seed; + hash_func.update(options["seed"].as()); } - } while (multiple && pos != end); + + SharedContextHolder shared_context = Context::createShared(); + Context context = Context::createGlobal(shared_context.get()); + context.makeGlobalContext(); + + registerFunctions(); + registerAggregateFunctions(); + registerTableFunctions(); + registerStorages(); + + std::unordered_set additional_names; + + auto all_known_storage_names = StorageFactory::instance().getAllRegisteredNames(); + auto all_known_data_type_names = DataTypeFactory::instance().getAllRegisteredNames(); + + additional_names.insert(all_known_storage_names.begin(), all_known_storage_names.end()); + additional_names.insert(all_known_data_type_names.begin(), all_known_data_type_names.end()); + + KnownIdentifierFunc is_known_identifier = [&](std::string_view name) + { + std::string what(name); + + return FunctionFactory::instance().tryGet(what, context) != nullptr + || AggregateFunctionFactory::instance().isAggregateFunctionName(what) + || TableFunctionFactory::instance().isTableFunctionName(what) + || additional_names.count(what); + }; + + WriteBufferFromFileDescriptor out(STDOUT_FILENO); + obfuscateQueries(query, out, obfuscated_words_map, used_nouns, hash_func, is_known_identifier); + } + else + { + const char * pos = query.data(); + const char * end = pos + query.size(); + + ParserQuery parser(end); + do + { + ASTPtr res = parseQueryAndMovePosition(parser, pos, end, "query", multiple, 0, DBMS_DEFAULT_MAX_PARSER_DEPTH); + if (!quiet) + { + formatAST(*res, std::cout, hilite, oneline); + if (multiple) + std::cout << "\n;\n"; + std::cout << std::endl; + } + } while (multiple && pos != end); + } } catch (...) 
{ diff --git a/programs/server/CMakeLists.txt b/programs/server/CMakeLists.txt index 5500a4680b7..b3dcf1955fe 100644 --- a/programs/server/CMakeLists.txt +++ b/programs/server/CMakeLists.txt @@ -3,7 +3,7 @@ set(CLICKHOUSE_SERVER_SOURCES Server.cpp ) -if (OS_LINUX AND ARCH_AMD64) +if (OS_LINUX) set (LINK_CONFIG_LIB INTERFACE "-Wl,${WHOLE_ARCHIVE} $ -Wl,${NO_WHOLE_ARCHIVE}") endif () diff --git a/programs/server/Server.cpp b/programs/server/Server.cpp index 300698b4439..6341653ee2f 100644 --- a/programs/server/Server.cpp +++ b/programs/server/Server.cpp @@ -273,7 +273,7 @@ int Server::main(const std::vector & /*args*/) #endif #endif - CurrentMetrics::set(CurrentMetrics::Revision, ClickHouseRevision::get()); + CurrentMetrics::set(CurrentMetrics::Revision, ClickHouseRevision::getVersionRevision()); CurrentMetrics::set(CurrentMetrics::VersionInteger, ClickHouseRevision::getVersionInteger()); if (ThreadFuzzer::instance().isEffective()) @@ -297,6 +297,11 @@ int Server::main(const std::vector & /*args*/) global_context->makeGlobalContext(); global_context->setApplicationType(Context::ApplicationType::SERVER); + // Initialize global thread pool. Do it before we fetch configs from zookeeper + // nodes (`from_zk`), because ZooKeeper interface uses the pool. We will + // ignore `max_thread_pool_size` in configs we fetch from ZK, but oh well. + GlobalThreadPool::initialize(config().getUInt("max_thread_pool_size", 10000)); + bool has_zookeeper = config().has("zookeeper"); zkutil::ZooKeeperNodeCache main_config_zk_node_cache([&] { return global_context->getZooKeeper(); }); @@ -334,16 +339,23 @@ int Server::main(const std::vector & /*args*/) { if (hasLinuxCapability(CAP_IPC_LOCK)) { - /// Get the memory area with (current) code segment. - /// It's better to lock only the code segment instead of calling "mlockall", - /// because otherwise debug info will be also locked in memory, and it can be huge. - auto [addr, len] = getMappedArea(reinterpret_cast(mainEntryClickHouseServer)); + try + { + /// Get the memory area with (current) code segment. + /// It's better to lock only the code segment instead of calling "mlockall", + /// because otherwise debug info will be also locked in memory, and it can be huge. + auto [addr, len] = getMappedArea(reinterpret_cast(mainEntryClickHouseServer)); - LOG_TRACE(log, "Will do mlock to prevent executable memory from being paged out. It may take a few seconds."); - if (0 != mlock(addr, len)) - LOG_WARNING(log, "Failed mlock: {}", errnoToString(ErrorCodes::SYSTEM_ERROR)); - else - LOG_TRACE(log, "The memory map of clickhouse executable has been mlock'ed, total {}", ReadableSize(len)); + LOG_TRACE(log, "Will do mlock to prevent executable memory from being paged out. It may take a few seconds."); + if (0 != mlock(addr, len)) + LOG_WARNING(log, "Failed mlock: {}", errnoToString(ErrorCodes::SYSTEM_ERROR)); + else + LOG_TRACE(log, "The memory map of clickhouse executable has been mlock'ed, total {}", ReadableSize(len)); + } + catch (...) + { + LOG_WARNING(log, "Cannot mlock: {}", getCurrentExceptionMessage(false)); + } } else { @@ -436,9 +448,6 @@ int Server::main(const std::vector & /*args*/) DateLUT::instance(); LOG_TRACE(log, "Initialized DateLUT with time zone '{}'.", DateLUT::instance().getTimeZone()); - /// Initialize global thread pool - GlobalThreadPool::initialize(config().getUInt("max_thread_pool_size", 10000)); - /// Storage with temporary data for processing of heavy queries. 
{ std::string tmp_path = config().getString("tmp_path", path + "tmp/"); @@ -662,6 +671,10 @@ int Server::main(const std::vector & /*args*/) total_memory_tracker.setDescription("(total)"); total_memory_tracker.setMetric(CurrentMetrics::MemoryTracking); + /// Set current database name before loading tables and databases because + /// system logs may copy global context. + global_context->setCurrentDatabaseNameInGlobalContext(default_database); + LOG_INFO(log, "Loading metadata from {}", path); try @@ -669,11 +682,14 @@ int Server::main(const std::vector & /*args*/) loadMetadataSystem(*global_context); /// After attaching system databases we can initialize system log. global_context->initializeSystemLogs(); + auto & database_catalog = DatabaseCatalog::instance(); /// After the system database is created, attach virtual system tables (in addition to query_log and part_log) - attachSystemTablesServer(*DatabaseCatalog::instance().getSystemDatabase(), has_zookeeper); + attachSystemTablesServer(*database_catalog.getSystemDatabase(), has_zookeeper); /// Then, load remaining databases loadMetadata(*global_context, default_database); - DatabaseCatalog::instance().loadDatabases(); + database_catalog.loadDatabases(); + /// After loading validate that default database exists + database_catalog.assertDatabaseExists(default_database); } catch (...) { @@ -736,8 +752,6 @@ int Server::main(const std::vector & /*args*/) LOG_INFO(log, "Query Profiler and TraceCollector are disabled because they require PHDR cache to be created" " (otherwise the function 'dl_iterate_phdr' is not lock free and not async-signal safe)."); - global_context->setCurrentDatabase(default_database); - if (has_zookeeper && config().has("distributed_ddl")) { /// DDL worker should be started after all tables were loaded diff --git a/release b/release index b20683a9caa..b446ceca0d5 100755 --- a/release +++ b/release @@ -95,9 +95,9 @@ then exit 3 fi - export DEB_CC=${DEB_CC=clang-6.0} - export DEB_CXX=${DEB_CXX=clang++-6.0} - EXTRAPACKAGES="$EXTRAPACKAGES clang-6.0 lld-6.0" + export DEB_CC=${DEB_CC=clang-10} + export DEB_CXX=${DEB_CXX=clang++-10} + EXTRAPACKAGES="$EXTRAPACKAGES clang-10 lld-10" elif [[ $BUILD_TYPE == 'valgrind' ]]; then MALLOC_OPTS="-DENABLE_TCMALLOC=0 -DENABLE_JEMALLOC=0" VERSION_POSTFIX+="+valgrind" @@ -118,8 +118,8 @@ echo -e "\nCurrent version is $VERSION_STRING" if [ -z "$NO_BUILD" ] ; then gen_changelog "$VERSION_STRING" "" "$AUTHOR" "" if [ -z "$USE_PBUILDER" ] ; then - DEB_CC=${DEB_CC:=`which gcc-9 gcc-8 gcc | head -n1`} - DEB_CXX=${DEB_CXX:=`which g++-9 g++-8 g++ | head -n1`} + DEB_CC=${DEB_CC:=`which gcc-10 gcc-9 gcc | head -n1`} + DEB_CXX=${DEB_CXX:=`which gcc-10 g++-9 g++ | head -n1`} # Build (only binary packages). 
debuild --preserve-env -e PATH \ -e DEB_CC=$DEB_CC -e DEB_CXX=$DEB_CXX -e CMAKE_FLAGS="$CMAKE_FLAGS" \ diff --git a/src/Access/UsersConfigAccessStorage.cpp b/src/Access/UsersConfigAccessStorage.cpp index 60bcc3784f3..ce10ebf0bcc 100644 --- a/src/Access/UsersConfigAccessStorage.cpp +++ b/src/Access/UsersConfigAccessStorage.cpp @@ -192,7 +192,7 @@ namespace } - std::vector parseUsers(const Poco::Util::AbstractConfiguration & config, Poco::Logger * log) + std::vector parseUsers(const Poco::Util::AbstractConfiguration & config) { Poco::Util::AbstractConfiguration::Keys user_names; config.keys("users", user_names); @@ -200,16 +200,8 @@ namespace std::vector users; users.reserve(user_names.size()); for (const auto & user_name : user_names) - { - try - { - users.push_back(parseUser(config, user_name)); - } - catch (...) - { - tryLogCurrentException(log, "Could not parse user " + backQuote(user_name)); - } - } + users.push_back(parseUser(config, user_name)); + return users; } @@ -256,12 +248,11 @@ namespace } quota->to_roles.add(user_ids); - return quota; } - std::vector parseQuotas(const Poco::Util::AbstractConfiguration & config, Poco::Logger * log) + std::vector parseQuotas(const Poco::Util::AbstractConfiguration & config) { Poco::Util::AbstractConfiguration::Keys user_names; config.keys("users", user_names); @@ -278,76 +269,63 @@ namespace quotas.reserve(quota_names.size()); for (const auto & quota_name : quota_names) { - try - { - auto it = quota_to_user_ids.find(quota_name); - const std::vector & quota_users = (it != quota_to_user_ids.end()) ? std::move(it->second) : std::vector{}; - quotas.push_back(parseQuota(config, quota_name, quota_users)); - } - catch (...) - { - tryLogCurrentException(log, "Could not parse quota " + backQuote(quota_name)); - } + auto it = quota_to_user_ids.find(quota_name); + const std::vector & quota_users = (it != quota_to_user_ids.end()) ? std::move(it->second) : std::vector{}; + quotas.push_back(parseQuota(config, quota_name, quota_users)); } return quotas; } - std::vector parseRowPolicies(const Poco::Util::AbstractConfiguration & config, Poco::Logger * log) + std::vector parseRowPolicies(const Poco::Util::AbstractConfiguration & config) { std::map, std::unordered_map> all_filters_map; + Poco::Util::AbstractConfiguration::Keys user_names; + config.keys("users", user_names); - try + for (const String & user_name : user_names) { - config.keys("users", user_names); - for (const String & user_name : user_names) + const String databases_config = "users." + user_name + ".databases"; + if (config.has(databases_config)) { - const String databases_config = "users." + user_name + ".databases"; - if (config.has(databases_config)) + Poco::Util::AbstractConfiguration::Keys database_keys; + config.keys(databases_config, database_keys); + + /// Read tables within databases + for (const String & database_key : database_keys) { - Poco::Util::AbstractConfiguration::Keys database_keys; - config.keys(databases_config, database_keys); + const String database_config = databases_config + "." 
+ database_key; - /// Read tables within databases - for (const String & database_key : database_keys) + String database_name; + if (((database_key == "database") || (database_key.starts_with("database["))) && config.has(database_config + "[@name]")) + database_name = config.getString(database_config + "[@name]"); + else if (size_t bracket_pos = database_key.find('['); bracket_pos != std::string::npos) + database_name = database_key.substr(0, bracket_pos); + else + database_name = database_key; + + Poco::Util::AbstractConfiguration::Keys table_keys; + config.keys(database_config, table_keys); + + /// Read table properties + for (const String & table_key : table_keys) { - const String database_config = databases_config + "." + database_key; - - String database_name; - if (((database_key == "database") || (database_key.starts_with("database["))) && config.has(database_config + "[@name]")) - database_name = config.getString(database_config + "[@name]"); - else if (size_t bracket_pos = database_key.find('['); bracket_pos != std::string::npos) - database_name = database_key.substr(0, bracket_pos); + String table_config = database_config + "." + table_key; + String table_name; + if (((table_key == "table") || (table_key.starts_with("table["))) && config.has(table_config + "[@name]")) + table_name = config.getString(table_config + "[@name]"); + else if (size_t bracket_pos = table_key.find('['); bracket_pos != std::string::npos) + table_name = table_key.substr(0, bracket_pos); else - database_name = database_key; + table_name = table_key; - Poco::Util::AbstractConfiguration::Keys table_keys; - config.keys(database_config, table_keys); - - /// Read table properties - for (const String & table_key : table_keys) - { - String table_config = database_config + "." + table_key; - String table_name; - if (((table_key == "table") || (table_key.starts_with("table["))) && config.has(table_config + "[@name]")) - table_name = config.getString(table_config + "[@name]"); - else if (size_t bracket_pos = table_key.find('['); bracket_pos != std::string::npos) - table_name = table_key.substr(0, bracket_pos); - else - table_name = table_key; - - String filter_config = table_config + ".filter"; - all_filters_map[{database_name, table_name}][user_name] = config.getString(filter_config); - } + String filter_config = table_config + ".filter"; + all_filters_map[{database_name, table_name}][user_name] = config.getString(filter_config); } } } } - catch (...) - { - tryLogCurrentException(log, "Could not parse row policies"); - } std::vector policies; for (auto & [database_and_table_name, user_to_filters] : all_filters_map) @@ -450,23 +428,14 @@ namespace std::vector parseSettingsProfiles( const Poco::Util::AbstractConfiguration & config, - const std::function & check_setting_name_function, - Poco::Logger * log) + const std::function & check_setting_name_function) { std::vector profiles; Poco::Util::AbstractConfiguration::Keys profile_names; config.keys("profiles", profile_names); for (const auto & profile_name : profile_names) - { - try - { - profiles.push_back(parseSettingsProfile(config, profile_name, check_setting_name_function)); - } - catch (...) 
- { - tryLogCurrentException(log, "Could not parse profile " + backQuote(profile_name)); - } - } + profiles.push_back(parseSettingsProfile(config, profile_name, check_setting_name_function)); + return profiles; } } @@ -520,13 +489,13 @@ void UsersConfigAccessStorage::setConfig(const Poco::Util::AbstractConfiguration void UsersConfigAccessStorage::parseFromConfig(const Poco::Util::AbstractConfiguration & config) { std::vector> all_entities; - for (const auto & entity : parseUsers(config, getLogger())) + for (const auto & entity : parseUsers(config)) all_entities.emplace_back(generateID(*entity), entity); - for (const auto & entity : parseQuotas(config, getLogger())) + for (const auto & entity : parseQuotas(config)) all_entities.emplace_back(generateID(*entity), entity); - for (const auto & entity : parseRowPolicies(config, getLogger())) + for (const auto & entity : parseRowPolicies(config)) all_entities.emplace_back(generateID(*entity), entity); - for (const auto & entity : parseSettingsProfiles(config, check_setting_name_function, getLogger())) + for (const auto & entity : parseSettingsProfiles(config, check_setting_name_function)) all_entities.emplace_back(generateID(*entity), entity); memory_storage.setAll(all_entities); } diff --git a/src/Client/Connection.cpp b/src/Client/Connection.cpp index d8fe865136f..f388ffed4a3 100644 --- a/src/Client/Connection.cpp +++ b/src/Client/Connection.cpp @@ -165,14 +165,12 @@ void Connection::sendHello() || has_control_character(password)) throw Exception("Parameters 'default_database', 'user' and 'password' must not contain ASCII control characters", ErrorCodes::BAD_ARGUMENTS); - auto client_revision = ClickHouseRevision::get(); - writeVarUInt(Protocol::Client::Hello, *out); writeStringBinary((DBMS_NAME " ") + client_name, *out); writeVarUInt(DBMS_VERSION_MAJOR, *out); writeVarUInt(DBMS_VERSION_MINOR, *out); // NOTE For backward compatibility of the protocol, client cannot send its version_patch. - writeVarUInt(client_revision, *out); + writeVarUInt(DBMS_TCP_PROTOCOL_VERSION, *out); writeStringBinary(default_database, *out); /// If interserver-secret is used, one do not need password /// (NOTE we do not check for DBMS_MIN_REVISION_WITH_INTERSERVER_SECRET, since we cannot ignore inter-server secret if it was requested) diff --git a/src/Common/BitHelpers.h b/src/Common/BitHelpers.h index 699e379b8d3..e79daeba14e 100644 --- a/src/Common/BitHelpers.h +++ b/src/Common/BitHelpers.h @@ -1,22 +1,12 @@ #pragma once #include +#include #include #include #include -/** Returns log2 of number, rounded down. - * Compiles to single 'bsr' instruction on x86. - * For zero argument, result is unspecified. - */ -inline unsigned int bitScanReverse(unsigned int x) -{ - assert(x != 0); - return sizeof(unsigned int) * 8 - 1 - __builtin_clz(x); -} - - /** For zero argument, result is zero. * For arguments with most significand bit set, result is n. * For other arguments, returns value, rounded up to power of two. @@ -41,10 +31,9 @@ inline size_t roundUpToPowerOfTwoOrZero(size_t n) template -inline size_t getLeadingZeroBits(T x) +inline size_t getLeadingZeroBitsUnsafe(T x) { - if (!x) - return sizeof(x) * 8; + assert(x != 0); if constexpr (sizeof(T) <= sizeof(unsigned int)) { @@ -60,10 +49,32 @@ inline size_t getLeadingZeroBits(T x) } } + +template +inline size_t getLeadingZeroBits(T x) +{ + if (!x) + return sizeof(x) * 8; + + return getLeadingZeroBitsUnsafe(x); +} + +/** Returns log2 of number, rounded down. + * Compiles to single 'bsr' instruction on x86. 
+ * For zero argument, result is unspecified. + */ +template +inline uint32_t bitScanReverse(T x) +{ + return (std::max(sizeof(T), sizeof(unsigned int))) * 8 - 1 - getLeadingZeroBitsUnsafe(x); +} + // Unsafe since __builtin_ctz()-family explicitly state that result is undefined on x == 0 template inline size_t getTrailingZeroBitsUnsafe(T x) { + assert(x != 0); + if constexpr (sizeof(T) <= sizeof(unsigned int)) { return __builtin_ctz(x); @@ -88,8 +99,8 @@ inline size_t getTrailingZeroBits(T x) } /** Returns a mask that has '1' for `bits` LSB set: - * maskLowBits(3) => 00000111 - */ + * maskLowBits(3) => 00000111 + */ template inline T maskLowBits(unsigned char bits) { diff --git a/src/Common/ClickHouseRevision.cpp b/src/Common/ClickHouseRevision.cpp index 0b81026adca..2c52ebb064a 100644 --- a/src/Common/ClickHouseRevision.cpp +++ b/src/Common/ClickHouseRevision.cpp @@ -6,6 +6,6 @@ namespace ClickHouseRevision { - unsigned get() { return VERSION_REVISION; } + unsigned getVersionRevision() { return VERSION_REVISION; } unsigned getVersionInteger() { return VERSION_INTEGER; } } diff --git a/src/Common/ClickHouseRevision.h b/src/Common/ClickHouseRevision.h index 1d097a5bf89..86d1e3db334 100644 --- a/src/Common/ClickHouseRevision.h +++ b/src/Common/ClickHouseRevision.h @@ -2,6 +2,6 @@ namespace ClickHouseRevision { - unsigned get(); + unsigned getVersionRevision(); unsigned getVersionInteger(); } diff --git a/src/Common/Config/ConfigReloader.cpp b/src/Common/Config/ConfigReloader.cpp index d4a2dfbafe5..677448e03ae 100644 --- a/src/Common/Config/ConfigReloader.cpp +++ b/src/Common/Config/ConfigReloader.cpp @@ -138,6 +138,7 @@ void ConfigReloader::reloadIfNewer(bool force, bool throw_on_error, bool fallbac if (throw_on_error) throw; tryLogCurrentException(log, "Error updating configuration from '" + path + "' config."); + return; } LOG_DEBUG(log, "Loaded config '{}', performed update on configuration", path); diff --git a/src/Common/Macros.cpp b/src/Common/Macros.cpp index a4981fa5be3..e3735c44359 100644 --- a/src/Common/Macros.cpp +++ b/src/Common/Macros.cpp @@ -23,18 +23,15 @@ Macros::Macros(const Poco::Util::AbstractConfiguration & config, const String & } String Macros::expand(const String & s, - size_t level, - const String & database_name, - const String & table_name, - const UUID & uuid) const + MacroExpansionInfo & info) const { if (s.find('{') == String::npos) return s; - if (level && s.size() > 65536) + if (info.level && s.size() > 65536) throw Exception("Too long string while expanding macros", ErrorCodes::SYNTAX_ERROR); - if (level >= 10) + if (info.level >= 10) throw Exception("Too deep recursion while expanding macros: '" + s + "'", ErrorCodes::SYNTAX_ERROR); String res; @@ -64,17 +61,28 @@ String Macros::expand(const String & s, /// Prefer explicit macros over implicit. 
if (it != macros.end()) res += it->second; - else if (macro_name == "database" && !database_name.empty()) - res += database_name; - else if (macro_name == "table" && !table_name.empty()) - res += table_name; + else if (macro_name == "database" && !info.database_name.empty()) + res += info.database_name; + else if (macro_name == "table" && !info.table_name.empty()) + res += info.table_name; else if (macro_name == "uuid") { - if (uuid == UUIDHelpers::Nil) + if (info.uuid == UUIDHelpers::Nil) throw Exception("Macro 'uuid' and empty arguments of ReplicatedMergeTree " "are supported only for ON CLUSTER queries with Atomic database engine", ErrorCodes::SYNTAX_ERROR); - res += toString(uuid); + /// For ON CLUSTER queries we don't want to require all macros definitions in initiator's config. + /// However, initiator must check that for cross-replication cluster zookeeper_path does not contain {uuid} macro. + /// It becomes impossible to check if {uuid} is contained inside some unknown macro. + if (info.level) + throw Exception("Macro 'uuid' should not be inside another macro", ErrorCodes::SYNTAX_ERROR); + res += toString(info.uuid); + info.expanded_uuid = true; + } + else if (info.ignore_unknown) + { + res += macro_name; + info.has_unknown = true; } else throw Exception("No macro '" + macro_name + @@ -84,7 +92,8 @@ String Macros::expand(const String & s, pos = end + 1; } - return expand(res, level + 1, database_name, table_name); + ++info.level; + return expand(res, info); } String Macros::getValue(const String & key) const @@ -94,9 +103,20 @@ String Macros::getValue(const String & key) const throw Exception("No macro " + key + " in config", ErrorCodes::SYNTAX_ERROR); } + +String Macros::expand(const String & s) const +{ + MacroExpansionInfo info; + return expand(s, info); +} + String Macros::expand(const String & s, const StorageID & table_id, bool allow_uuid) const { - return expand(s, 0, table_id.database_name, table_id.table_name, allow_uuid ? table_id.uuid : UUIDHelpers::Nil); + MacroExpansionInfo info; + info.database_name = table_id.database_name; + info.table_name = table_id.table_name; + info.uuid = allow_uuid ? table_id.uuid : UUIDHelpers::Nil; + return expand(s, info); } Names Macros::expand(const Names & source_names, size_t level) const @@ -104,8 +124,12 @@ Names Macros::expand(const Names & source_names, size_t level) const Names result_names; result_names.reserve(source_names.size()); + MacroExpansionInfo info; for (const String & name : source_names) - result_names.push_back(expand(name, level)); + { + info.level = level; + result_names.push_back(expand(name, info)); + } return result_names; } diff --git a/src/Common/Macros.h b/src/Common/Macros.h index bcd6075782e..6e4f25d55ef 100644 --- a/src/Common/Macros.h +++ b/src/Common/Macros.h @@ -27,15 +27,28 @@ public: Macros() = default; Macros(const Poco::Util::AbstractConfiguration & config, const String & key); + struct MacroExpansionInfo + { + /// Settings + String database_name; + String table_name; + UUID uuid = UUIDHelpers::Nil; + bool ignore_unknown = false; + + /// Information about macro expansion + size_t level = 0; + bool expanded_uuid = false; + bool has_unknown = false; + }; + /** Replace the substring of the form {macro_name} with the value for macro_name, obtained from the config file. * If {database} and {table} macros aren`t defined explicitly, expand them as database_name and table_name respectively. * level - the level of recursion. 
*/ String expand(const String & s, - size_t level = 0, - const String & database_name = "", - const String & table_name = "", - const UUID & uuid = UUIDHelpers::Nil) const; + MacroExpansionInfo & info) const; + + String expand(const String & s) const; String expand(const String & s, const StorageID & table_id, bool allow_uuid) const; diff --git a/src/Common/StatusFile.cpp b/src/Common/StatusFile.cpp index 7c6bbf814a0..b21454c9ed8 100644 --- a/src/Common/StatusFile.cpp +++ b/src/Common/StatusFile.cpp @@ -37,7 +37,7 @@ StatusFile::FillFunction StatusFile::write_full_info = [](WriteBuffer & out) { out << "PID: " << getpid() << "\n" << "Started at: " << LocalDateTime(time(nullptr)) << "\n" - << "Revision: " << ClickHouseRevision::get() << "\n"; + << "Revision: " << ClickHouseRevision::getVersionRevision() << "\n"; }; diff --git a/src/Common/StringUtils/StringUtils.h b/src/Common/StringUtils/StringUtils.h index a1e8fb79435..904e3035dd8 100644 --- a/src/Common/StringUtils/StringUtils.h +++ b/src/Common/StringUtils/StringUtils.h @@ -67,10 +67,19 @@ inline bool isASCII(char c) return static_cast(c) < 0x80; } +inline bool isLowerAlphaASCII(char c) +{ + return (c >= 'a' && c <= 'z'); +} + +inline bool isUpperAlphaASCII(char c) +{ + return (c >= 'A' && c <= 'Z'); +} + inline bool isAlphaASCII(char c) { - return (c >= 'a' && c <= 'z') - || (c >= 'A' && c <= 'Z'); + return isLowerAlphaASCII(c) || isUpperAlphaASCII(c); } inline bool isNumericASCII(char c) @@ -122,6 +131,16 @@ inline bool isPrintableASCII(char c) return uc >= 32 && uc <= 126; /// 127 is ASCII DEL. } +inline bool isPunctuationASCII(char c) +{ + uint8_t uc = c; + return (uc >= 33 && uc <= 47) + || (uc >= 58 && uc <= 64) + || (uc >= 91 && uc <= 96) + || (uc >= 123 && uc <= 125); +} + + inline bool isValidIdentifier(const std::string_view & str) { return !str.empty() && isValidIdentifierBegin(str[0]) && std::all_of(str.begin() + 1, str.end(), isWordCharASCII); diff --git a/src/Common/ThreadPool.cpp b/src/Common/ThreadPool.cpp index 56198b97be5..dda16c7725d 100644 --- a/src/Common/ThreadPool.cpp +++ b/src/Common/ThreadPool.cpp @@ -13,6 +13,7 @@ namespace DB namespace ErrorCodes { extern const int CANNOT_SCHEDULE_TASK; + extern const int LOGICAL_ERROR; } } @@ -277,7 +278,11 @@ std::unique_ptr GlobalThreadPool::the_instance; void GlobalThreadPool::initialize(size_t max_threads) { - assert(!the_instance); + if (the_instance) + { + throw DB::Exception(DB::ErrorCodes::LOGICAL_ERROR, + "The global thread pool is initialized twice"); + } the_instance.reset(new GlobalThreadPool(max_threads, 1000 /*max_free_threads*/, 10000 /*max_queue_size*/, diff --git a/src/Common/config_version.h.in b/src/Common/config_version.h.in index c3c0c6df87b..880824f8ad0 100644 --- a/src/Common/config_version.h.in +++ b/src/Common/config_version.h.in @@ -2,18 +2,7 @@ // .h autogenerated by cmake! 
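For the StringUtils.h hunk above, a quick self-contained check of the new ASCII classifiers; the ranges are copied from the patch, but this mirrors the helpers instead of including the header:

#include <cassert>
#include <cstdint>

inline bool isLowerAlphaASCII(char c) { return c >= 'a' && c <= 'z'; }
inline bool isUpperAlphaASCII(char c) { return c >= 'A' && c <= 'Z'; }

inline bool isPunctuationASCII(char c)
{
    uint8_t uc = c;
    return (uc >= 33 && uc <= 47)       // ! " # $ % & ' ( ) * + , - . /
        || (uc >= 58 && uc <= 64)       // : ; < = > ? @
        || (uc >= 91 && uc <= 96)       // [ \ ] ^ _ and backtick
        || (uc >= 123 && uc <= 125);    // { | }   ('~', code 126, falls outside these ranges)
}

int main()
{
    assert(isLowerAlphaASCII('q') && !isLowerAlphaASCII('Q'));
    assert(isUpperAlphaASCII('Q') && !isUpperAlphaASCII('q'));
    assert(isPunctuationASCII(',') && isPunctuationASCII('@') && !isPunctuationASCII('7'));
    return 0;
}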
-#cmakedefine01 USE_DBMS_TCP_PROTOCOL_VERSION - -#if USE_DBMS_TCP_PROTOCOL_VERSION - #include "Core/Defines.h" - #ifndef VERSION_REVISION - #define VERSION_REVISION DBMS_TCP_PROTOCOL_VERSION - #endif -#else - #cmakedefine VERSION_REVISION @VERSION_REVISION@ -#endif - - +#cmakedefine VERSION_REVISION @VERSION_REVISION@ #cmakedefine VERSION_NAME "@VERSION_NAME@" #define DBMS_NAME VERSION_NAME #cmakedefine VERSION_MAJOR @VERSION_MAJOR@ diff --git a/src/Common/randomSeed.cpp b/src/Common/randomSeed.cpp index 8ad624febdd..ded224e56c3 100644 --- a/src/Common/randomSeed.cpp +++ b/src/Common/randomSeed.cpp @@ -4,6 +4,7 @@ #include #include #include +#include #include @@ -19,7 +20,7 @@ namespace DB DB::UInt64 randomSeed() { struct timespec times; - if (clock_gettime(CLOCK_THREAD_CPUTIME_ID, ×)) + if (clock_gettime(CLOCK_MONOTONIC, ×)) DB::throwFromErrno("Cannot clock_gettime.", DB::ErrorCodes::CANNOT_CLOCK_GETTIME); /// Not cryptographically secure as time, pid and stack address can be predictable. @@ -27,7 +28,7 @@ DB::UInt64 randomSeed() SipHash hash; hash.update(times.tv_nsec); hash.update(times.tv_sec); - hash.update(getpid()); + hash.update(getThreadId()); hash.update(×); return hash.get64(); } diff --git a/src/Core/AccurateComparison.h b/src/Core/AccurateComparison.h index bbd820bc65f..500346872db 100644 --- a/src/Core/AccurateComparison.h +++ b/src/Core/AccurateComparison.h @@ -4,6 +4,7 @@ #include #include "Defines.h" #include "Types.h" +#include #include #include @@ -382,8 +383,8 @@ inline bool equalsOp(DB::Float32 f, DB::UInt128 u) inline bool NO_SANITIZE_UNDEFINED greaterOp(DB::Int128 i, DB::Float64 f) { - static constexpr Int128 min_int128 = Int128(0x8000000000000000ll) << 64; - static constexpr Int128 max_int128 = (Int128(0x7fffffffffffffffll) << 64) + 0xffffffffffffffffll; + static constexpr Int128 min_int128 = minInt128(); + static constexpr Int128 max_int128 = maxInt128(); if (-MAX_INT64_WITH_EXACT_FLOAT64_REPR <= i && i <= MAX_INT64_WITH_EXACT_FLOAT64_REPR) return static_cast(i) > f; @@ -394,8 +395,8 @@ inline bool NO_SANITIZE_UNDEFINED greaterOp(DB::Int128 i, DB::Float64 f) inline bool NO_SANITIZE_UNDEFINED greaterOp(DB::Float64 f, DB::Int128 i) { - static constexpr Int128 min_int128 = Int128(0x8000000000000000ll) << 64; - static constexpr Int128 max_int128 = (Int128(0x7fffffffffffffffll) << 64) + 0xffffffffffffffffll; + static constexpr Int128 min_int128 = minInt128(); + static constexpr Int128 max_int128 = maxInt128(); if (-MAX_INT64_WITH_EXACT_FLOAT64_REPR <= i && i <= MAX_INT64_WITH_EXACT_FLOAT64_REPR) return f > static_cast(i); diff --git a/src/Core/Defines.h b/src/Core/Defines.h index 3a7d29e92b1..ba3d37242fa 100644 --- a/src/Core/Defines.h +++ b/src/Core/Defines.h @@ -70,7 +70,7 @@ /// Mininum revision supporting interserver secret. #define DBMS_MIN_REVISION_WITH_INTERSERVER_SECRET 54441 -/// Version of ClickHouse TCP protocol. Set to git tag with latest protocol change. +/// Version of ClickHouse TCP protocol. Increment it manually when you change the protocol. #define DBMS_TCP_PROTOCOL_VERSION 54441 /// The boundary on which the blocks for asynchronous file operations should be aligned. 
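The PacketsProtocolText.cpp hunk just below shrinks the decimals field to one byte and the fixed-payload constant from 13 to 12. For reference, the fixed-length tail of a MySQL ColumnDefinition41 packet (everything after the six length-encoded strings and the length-encoded next_length byte) adds up to exactly 12 bytes; a standalone recap of that arithmetic, not project code:

#include <cstddef>

int main()
{
    constexpr size_t character_set = 2;   // connection character set
    constexpr size_t column_length = 4;   // maximum column length
    constexpr size_t column_type   = 1;   // type byte
    constexpr size_t flags         = 2;   // column flags
    constexpr size_t decimals      = 1;   // max shown decimal digits: one byte, not two
    constexpr size_t filler        = 2;   // reserved, always 0x00 0x00
    static_assert(character_set + column_length + column_type + flags + decimals + filler == 12,
                  "matches the corrected constant in ColumnDefinition::getPayloadSize()");
    return 0;
}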
diff --git a/src/Core/MySQL/PacketsProtocolText.cpp b/src/Core/MySQL/PacketsProtocolText.cpp index 766bcf636e4..ad34cd8c28d 100644 --- a/src/Core/MySQL/PacketsProtocolText.cpp +++ b/src/Core/MySQL/PacketsProtocolText.cpp @@ -77,7 +77,7 @@ ColumnDefinition::ColumnDefinition( size_t ColumnDefinition::getPayloadSize() const { - return 13 + getLengthEncodedStringSize("def") + getLengthEncodedStringSize(schema) + getLengthEncodedStringSize(table) + getLengthEncodedStringSize(org_table) + \ + return 12 + getLengthEncodedStringSize("def") + getLengthEncodedStringSize(schema) + getLengthEncodedStringSize(table) + getLengthEncodedStringSize(org_table) + \ getLengthEncodedStringSize(name) + getLengthEncodedStringSize(org_name) + getLengthEncodedNumberSize(next_length); } @@ -96,7 +96,7 @@ void ColumnDefinition::readPayloadImpl(ReadBuffer & payload) payload.readStrict(reinterpret_cast(&column_length), 4); payload.readStrict(reinterpret_cast(&column_type), 1); payload.readStrict(reinterpret_cast(&flags), 2); - payload.readStrict(reinterpret_cast(&decimals), 2); + payload.readStrict(reinterpret_cast(&decimals), 1); payload.ignore(2); } @@ -113,7 +113,7 @@ void ColumnDefinition::writePayloadImpl(WriteBuffer & buffer) const buffer.write(reinterpret_cast(&column_length), 4); buffer.write(reinterpret_cast(&column_type), 1); buffer.write(reinterpret_cast(&flags), 2); - buffer.write(reinterpret_cast(&decimals), 2); + buffer.write(reinterpret_cast(&decimals), 1); writeChar(0x0, 2, buffer); } diff --git a/src/Core/Settings.h b/src/Core/Settings.h index b96b1b12c24..9449cd571a1 100644 --- a/src/Core/Settings.h +++ b/src/Core/Settings.h @@ -350,7 +350,7 @@ class IColumn; M(UInt64, max_live_view_insert_blocks_before_refresh, 64, "Limit maximum number of inserted blocks after which mergeable blocks are dropped and query is re-executed.", 0) \ M(UInt64, min_free_disk_space_for_temporary_data, 0, "The minimum disk space to keep while writing temporary data used in external sorting and aggregation.", 0) \ \ - M(DefaultDatabaseEngine, default_database_engine, DefaultDatabaseEngine::Ordinary, "Default database engine.", 0) \ + M(DefaultDatabaseEngine, default_database_engine, DefaultDatabaseEngine::Atomic, "Default database engine.", 0) \ M(Bool, show_table_uuid_in_table_create_query_if_not_nil, false, "For tables in databases with Engine=Atomic show UUID of the table in its CREATE query.", 0) \ M(Bool, enable_scalar_subquery_optimization, true, "If it is set to true, prevent scalar subqueries from (de)serializing large scalar values and possibly avoid running the same subquery more than once.", 0) \ M(Bool, optimize_trivial_count_query, true, "Process trivial 'SELECT count() FROM table' query from metadata.", 0) \ diff --git a/src/DataStreams/TemporaryFileStream.h b/src/DataStreams/TemporaryFileStream.h index 6871800a540..b481cef1bb2 100644 --- a/src/DataStreams/TemporaryFileStream.h +++ b/src/DataStreams/TemporaryFileStream.h @@ -1,6 +1,5 @@ #pragma once -#include #include #include #include @@ -23,7 +22,7 @@ struct TemporaryFileStream TemporaryFileStream(const std::string & path) : file_in(path) , compressed_in(file_in) - , block_in(std::make_shared(compressed_in, ClickHouseRevision::get())) + , block_in(std::make_shared(compressed_in, DBMS_TCP_PROTOCOL_VERSION)) {} TemporaryFileStream(const std::string & path, const Block & header_) diff --git a/src/DataTypes/DataTypesDecimal.h b/src/DataTypes/DataTypesDecimal.h index fd7c1f91c68..079812c2e74 100644 --- a/src/DataTypes/DataTypesDecimal.h +++ 
b/src/DataTypes/DataTypesDecimal.h @@ -153,8 +153,9 @@ convertToDecimal(const typename FromDataType::FieldType & value, UInt32 scale) auto out = value * static_cast(DecimalUtils::scaleMultiplier(scale)); if constexpr (std::is_same_v) { - static constexpr Int128 min_int128 = Int128(0x8000000000000000ll) << 64; - static constexpr Int128 max_int128 = (Int128(0x7fffffffffffffffll) << 64) + 0xffffffffffffffffll; + static constexpr Int128 min_int128 = minInt128(); + static constexpr Int128 max_int128 = maxInt128(); + if (out <= static_cast(min_int128) || out >= static_cast(max_int128)) throw Exception(std::string(ToDataType::family_name) + " convert overflow. Float is out of Decimal range", ErrorCodes::DECIMAL_OVERFLOW); diff --git a/src/Dictionaries/CacheDictionary.cpp b/src/Dictionaries/CacheDictionary.cpp index cb39dffeb6c..e3140e6fb8b 100644 --- a/src/Dictionaries/CacheDictionary.cpp +++ b/src/Dictionaries/CacheDictionary.cpp @@ -835,7 +835,8 @@ void CacheDictionary::waitForCurrentUpdateFinish(UpdateUnitPtr & update_unit_ptr catch (...) { throw DB::Exception(ErrorCodes::CACHE_DICTIONARY_UPDATE_FAIL, - "Dictionary update failed: {}", + "Update failed for dictionary '{}': {}", + getDictionaryID().getNameForLogs(), getCurrentExceptionMessage(true /*with stack trace*/, true /*check embedded stack trace*/)); } @@ -925,7 +926,7 @@ void CacheDictionary::update(UpdateUnitPtr & update_unit_ptr) const else cell.setExpiresAt(std::chrono::time_point::max()); - update_unit_ptr->getPresentIdHandler()(id, cell_idx); + update_unit_ptr->callPresentIdHandler(id, cell_idx); /// mark corresponding id as found remaining_ids[id] = 1; } @@ -987,9 +988,9 @@ void CacheDictionary::update(UpdateUnitPtr & update_unit_ptr) const if (was_default) cell.setDefault(); if (was_default) - update_unit_ptr->getAbsentIdHandler()(id, cell_idx); + update_unit_ptr->callAbsentIdHandler(id, cell_idx); else - update_unit_ptr->getPresentIdHandler()(id, cell_idx); + update_unit_ptr->callPresentIdHandler(id, cell_idx); continue; } /// We don't have expired data for that `id` so all we can do is to rethrow `last_exception`. @@ -1021,7 +1022,7 @@ void CacheDictionary::update(UpdateUnitPtr & update_unit_ptr) const setDefaultAttributeValue(attribute, cell_idx); /// inform caller that the cell has not been found - update_unit_ptr->getAbsentIdHandler()(id, cell_idx); + update_unit_ptr->callAbsentIdHandler(id, cell_idx); } ProfileEvents::increment(ProfileEvents::DictCacheKeysRequestedMiss, not_found_num); diff --git a/src/Dictionaries/CacheDictionary.h b/src/Dictionaries/CacheDictionary.h index 5e7e272ff2e..ee4229b3249 100644 --- a/src/Dictionaries/CacheDictionary.h +++ b/src/Dictionaries/CacheDictionary.h @@ -399,16 +399,18 @@ private: absent_id_handler([](Key, size_t){}){} - PresentIdHandler getPresentIdHandler() + void callPresentIdHandler(Key key, size_t cell_idx) { std::lock_guard lock(callback_mutex); - return can_use_callback ? present_id_handler : PresentIdHandler{}; + if (can_use_callback) + present_id_handler(key, cell_idx); } - AbsentIdHandler getAbsentIdHandler() + void callAbsentIdHandler(Key key, size_t cell_idx) { std::lock_guard lock(callback_mutex); - return can_use_callback ? 
absent_id_handler : AbsentIdHandler{}; + if (can_use_callback) + absent_id_handler(key, cell_idx); } std::vector requested_ids; diff --git a/src/Dictionaries/CacheDictionary.inc.h b/src/Dictionaries/CacheDictionary.inc.h index 27064d113e6..8867d6a3c4a 100644 --- a/src/Dictionaries/CacheDictionary.inc.h +++ b/src/Dictionaries/CacheDictionary.inc.h @@ -148,7 +148,9 @@ void CacheDictionary::getItemsNumberImpl( std::begin(cache_expired_ids), std::end(cache_expired_ids), std::back_inserter(required_ids), [](auto & pair) { return pair.first; }); - auto on_cell_updated = [&] (const auto id, const auto cell_idx) + auto on_cell_updated = + [&attribute_array, &cache_not_found_ids, &cache_expired_ids, &out] + (const auto id, const auto cell_idx) { const auto attribute_value = attribute_array[cell_idx]; diff --git a/src/Dictionaries/SSDCacheDictionary.cpp b/src/Dictionaries/SSDCacheDictionary.cpp index 20e62acbd82..5547e34758f 100644 --- a/src/Dictionaries/SSDCacheDictionary.cpp +++ b/src/Dictionaries/SSDCacheDictionary.cpp @@ -53,6 +53,7 @@ namespace ErrorCodes extern const int AIO_READ_ERROR; extern const int AIO_WRITE_ERROR; extern const int BAD_ARGUMENTS; + extern const int CACHE_DICTIONARY_UPDATE_FAIL; extern const int CANNOT_ALLOCATE_MEMORY; extern const int CANNOT_CREATE_DIRECTORY; extern const int CANNOT_FSYNC; @@ -1193,8 +1194,23 @@ void SSDCacheStorage::update(DictionarySourcePtr & source_ptr, const std::vector { /// TODO: use old values - /// We don't have expired data for that `id` so all we can do is to rethrow `last_exception`. - std::rethrow_exception(last_update_exception); + // We don't have expired data for that `id` so all we can do is + // to rethrow `last_exception`. We might have to throw the same + // exception for different callers of dictGet() in different + // threads, which might then modify the exception object, so we + // have to throw a copy. + try + { + std::rethrow_exception(last_update_exception); + } + catch (...) + { + throw DB::Exception(ErrorCodes::CACHE_DICTIONARY_UPDATE_FAIL, + "Update failed for dictionary '{}': {}", + getPath(), + getCurrentExceptionMessage(true /*with stack trace*/, + true /*check embedded stack trace*/)); + } } /// Set key diff --git a/src/Dictionaries/SSDComplexKeyCacheDictionary.cpp b/src/Dictionaries/SSDComplexKeyCacheDictionary.cpp index 972d10da24d..44847df48ff 100644 --- a/src/Dictionaries/SSDComplexKeyCacheDictionary.cpp +++ b/src/Dictionaries/SSDComplexKeyCacheDictionary.cpp @@ -54,6 +54,7 @@ namespace ErrorCodes extern const int AIO_READ_ERROR; extern const int AIO_WRITE_ERROR; extern const int BAD_ARGUMENTS; + extern const int CACHE_DICTIONARY_UPDATE_FAIL; extern const int CANNOT_ALLOCATE_MEMORY; extern const int CANNOT_CREATE_DIRECTORY; extern const int CANNOT_FSYNC; @@ -1266,8 +1267,23 @@ void SSDComplexKeyCacheStorage::update( { /// TODO: use old values. - /// We don't have expired data for that `id` so all we can do is to rethrow `last_exception`. - std::rethrow_exception(last_update_exception); + // We don't have expired data for that `id` so all we can do is + // to rethrow `last_exception`. We might have to throw the same + // exception for different callers of dictGet() in different + // threads, which might then modify the exception object, so we + // have to throw a copy. + try + { + std::rethrow_exception(last_update_exception); + } + catch (...) 
+ { + throw DB::Exception(ErrorCodes::CACHE_DICTIONARY_UPDATE_FAIL, + "Update failed for dictionary '{}': {}", + getPath(), + getCurrentExceptionMessage(true /*with stack trace*/, + true /*check embedded stack trace*/)); + } } std::uniform_int_distribution distribution{lifetime.min_sec, lifetime.max_sec}; diff --git a/src/Functions/CMakeLists.txt b/src/Functions/CMakeLists.txt index 0a99a034a33..bdf89c983f1 100644 --- a/src/Functions/CMakeLists.txt +++ b/src/Functions/CMakeLists.txt @@ -62,12 +62,10 @@ else() endif() -option(STRIP_DEBUG_SYMBOLS_FUNCTIONS - "Do not generate debugger info for ClickHouse functions. - Provides faster linking and lower binary size. - Tradeoff is the inability to debug some source files with e.g. gdb - (empty stack frames and no local variables)." - ${STRIP_DSF_DEFAULT}) +# Provides faster linking and lower binary size. +# Tradeoff is the inability to debug some source files with e.g. gdb +# (empty stack frames and no local variables)." +option(STRIP_DEBUG_SYMBOLS_FUNCTIONS "Do not generate debugger info for ClickHouse functions" ${STRIP_DSF_DEFAULT}) if (STRIP_DEBUG_SYMBOLS_FUNCTIONS) message(WARNING "Not generating debugger info for ClickHouse functions") @@ -115,7 +113,11 @@ if(USE_RAPIDJSON) target_include_directories(clickhouse_functions SYSTEM PRIVATE ${RAPIDJSON_INCLUDE_DIR}) endif() -option(ENABLE_MULTITARGET_CODE "" ON) +# ClickHouse developers may use platform-dependent code under some macro (e.g. `#ifdef ENABLE_MULTITARGET`). +# If turned ON, this option defines such macro. +# See `src/Functions/TargetSpecific.h` +option(ENABLE_MULTITARGET_CODE "Enable platform-dependent code" ON) + if (ENABLE_MULTITARGET_CODE) add_definitions(-DENABLE_MULTITARGET_CODE=1) else() diff --git a/src/Functions/FunctionJoinGet.h b/src/Functions/FunctionJoinGet.h index 6b3b1202f60..e1afd2715f0 100644 --- a/src/Functions/FunctionJoinGet.h +++ b/src/Functions/FunctionJoinGet.h @@ -80,7 +80,7 @@ public: DataTypePtr getReturnType(const ColumnsWithTypeAndName &) const override { return {}; } // Not used bool useDefaultImplementationForNulls() const override { return false; } - bool useDefaultImplementationForLowCardinalityColumns() const override { return true; } + bool useDefaultImplementationForLowCardinalityColumns() const override { return false; } bool isVariadic() const override { return true; } size_t getNumberOfArguments() const override { return 0; } diff --git a/src/Functions/FunctionsComparison.h b/src/Functions/FunctionsComparison.h index 0a3d544f9e5..436502aead4 100644 --- a/src/Functions/FunctionsComparison.h +++ b/src/Functions/FunctionsComparison.h @@ -1213,7 +1213,7 @@ public: const bool left_is_string = isStringOrFixedString(which_left); const bool right_is_string = isStringOrFixedString(which_right); - bool date_and_datetime = (left_type != right_type) && + bool date_and_datetime = (which_left.idx != which_right.idx) && which_left.isDateOrDateTime() && which_right.isDateOrDateTime(); if (left_is_num && right_is_num && !date_and_datetime) diff --git a/src/Functions/TargetSpecific.h b/src/Functions/TargetSpecific.h index bc433702180..8de6a3dbec4 100644 --- a/src/Functions/TargetSpecific.h +++ b/src/Functions/TargetSpecific.h @@ -11,7 +11,7 @@ * * If compiler is not gcc/clang or target isn't x86_64 or ENABLE_MULTITARGET_CODE * was set to OFF in cmake, all code inside these macros will be removed and - * USE_MUTLITARGE_CODE will be set to 0. Use #if USE_MUTLITARGE_CODE whenever you + * USE_MULTITARGET_CODE will be set to 0. 
Use #if USE_MULTITARGET_CODE whenever you * use anything from this namespaces. * * For similarities there is a macros DECLARE_DEFAULT_CODE, which wraps code diff --git a/src/Interpreters/ActionsVisitor.cpp b/src/Interpreters/ActionsVisitor.cpp index 9d6d5f783ff..1d524669fd9 100644 --- a/src/Interpreters/ActionsVisitor.cpp +++ b/src/Interpreters/ActionsVisitor.cpp @@ -235,11 +235,7 @@ static Block createBlockFromAST(const ASTPtr & node, const DataTypes & types, co return header.cloneWithColumns(std::move(columns)); } -/** Create a block for set from literal. - * 'set_element_types' - types of what are on the left hand side of IN. - * 'right_arg' - Literal - Tuple or Array. - */ -static Block createBlockForSet( +Block createBlockForSet( const DataTypePtr & left_arg_type, const ASTPtr & right_arg, const DataTypes & set_element_types, @@ -280,14 +276,7 @@ static Block createBlockForSet( return block; } -/** Create a block for set from expression. - * 'set_element_types' - types of what are on the left hand side of IN. - * 'right_arg' - list of values: 1, 2, 3 or list of tuples: (1, 2), (3, 4), (5, 6). - * - * We need special implementation for ASTFunction, because in case, when we interpret - * large tuple or array as function, `evaluateConstantExpression` works extremely slow. - */ -static Block createBlockForSet( +Block createBlockForSet( const DataTypePtr & left_arg_type, const std::shared_ptr & right_arg, const DataTypes & set_element_types, @@ -900,10 +889,11 @@ SetPtr ActionsMatcher::makeSet(const ASTFunction & node, Data & data, bool no_su * in the subquery_for_set object, this subquery is set as source and the temporary table _data1 as the table. * - this function shows the expression IN_data1. */ - if (subquery_for_set.source.empty() && data.no_storage_or_local) + if (!subquery_for_set.source && data.no_storage_or_local) { auto interpreter = interpretSubquery(right_in_operand, data.context, data.subquery_depth, {}); - subquery_for_set.source = QueryPipeline::getPipe(interpreter->execute().pipeline); + subquery_for_set.source = std::make_unique(); + interpreter->buildQueryPlan(*subquery_for_set.source); } subquery_for_set.set = set; diff --git a/src/Interpreters/ActionsVisitor.h b/src/Interpreters/ActionsVisitor.h index d8d85f1c0bf..98ea3f79fff 100644 --- a/src/Interpreters/ActionsVisitor.h +++ b/src/Interpreters/ActionsVisitor.h @@ -16,11 +16,37 @@ struct ExpressionAction; class ExpressionActions; using ExpressionActionsPtr = std::shared_ptr; - /// The case of an explicit enumeration of values. +/// The case of an explicit enumeration of values. SetPtr makeExplicitSet( const ASTFunction * node, const Block & sample_block, bool create_ordered_set, const Context & context, const SizeLimits & limits, PreparedSets & prepared_sets); +/** Create a block for set from expression. + * 'set_element_types' - types of what are on the left hand side of IN. + * 'right_arg' - list of values: 1, 2, 3 or list of tuples: (1, 2), (3, 4), (5, 6). + * + * We need special implementation for ASTFunction, because in case, when we interpret + * large tuple or array as function, `evaluateConstantExpression` works extremely slow. + * + * Note: this and following functions are used in third-party applications in Arcadia, so + * they should be declared in header file. + * + */ +Block createBlockForSet( + const DataTypePtr & left_arg_type, + const std::shared_ptr & right_arg, + const DataTypes & set_element_types, + const Context & context); + +/** Create a block for set from literal. 
+ * 'set_element_types' - types of what are on the left hand side of IN. + * 'right_arg' - Literal - Tuple or Array. + */ +Block createBlockForSet( + const DataTypePtr & left_arg_type, + const ASTPtr & right_arg, + const DataTypes & set_element_types, + const Context & context); /** For ActionsVisitor * A stack of ExpressionActions corresponding to nested lambda expressions. diff --git a/src/Interpreters/Aggregator.cpp b/src/Interpreters/Aggregator.cpp index 1df76f96663..5b9169a878b 100644 --- a/src/Interpreters/Aggregator.cpp +++ b/src/Interpreters/Aggregator.cpp @@ -844,7 +844,7 @@ void Aggregator::writeToTemporaryFile(AggregatedDataVariants & data_variants, co const std::string & path = file->path(); WriteBufferFromFile file_buf(path); CompressedWriteBuffer compressed_buf(file_buf); - NativeBlockOutputStream block_out(compressed_buf, ClickHouseRevision::get(), getHeader(false)); + NativeBlockOutputStream block_out(compressed_buf, DBMS_TCP_PROTOCOL_VERSION, getHeader(false)); LOG_DEBUG(log, "Writing part of aggregation data into temporary file {}.", path); ProfileEvents::increment(ProfileEvents::ExternalAggregationWritePart); diff --git a/src/Interpreters/ClientInfo.cpp b/src/Interpreters/ClientInfo.cpp index 378375dcc18..71567a424c5 100644 --- a/src/Interpreters/ClientInfo.cpp +++ b/src/Interpreters/ClientInfo.cpp @@ -5,7 +5,6 @@ #include #include #include -#include #include #if !defined(ARCADIA_BUILD) @@ -44,7 +43,7 @@ void ClientInfo::write(WriteBuffer & out, const UInt64 server_protocol_revision) writeBinary(client_name, out); writeVarUInt(client_version_major, out); writeVarUInt(client_version_minor, out); - writeVarUInt(client_revision, out); + writeVarUInt(client_tcp_protocol_version, out); } else if (interface == Interface::HTTP) { @@ -92,7 +91,7 @@ void ClientInfo::read(ReadBuffer & in, const UInt64 client_protocol_revision) readBinary(client_name, in); readVarUInt(client_version_major, in); readVarUInt(client_version_minor, in); - readVarUInt(client_revision, in); + readVarUInt(client_tcp_protocol_version, in); } else if (interface == Interface::HTTP) { @@ -111,7 +110,7 @@ void ClientInfo::read(ReadBuffer & in, const UInt64 client_protocol_revision) if (client_protocol_revision >= DBMS_MIN_REVISION_WITH_VERSION_PATCH) readVarUInt(client_version_patch, in); else - client_version_patch = client_revision; + client_version_patch = client_tcp_protocol_version; } } @@ -137,7 +136,7 @@ void ClientInfo::fillOSUserHostNameAndVersionInfo() client_version_major = DBMS_VERSION_MAJOR; client_version_minor = DBMS_VERSION_MINOR; client_version_patch = DBMS_VERSION_PATCH; - client_revision = ClickHouseRevision::get(); + client_tcp_protocol_version = DBMS_TCP_PROTOCOL_VERSION; } diff --git a/src/Interpreters/ClientInfo.h b/src/Interpreters/ClientInfo.h index 99426716cb2..704f1913b89 100644 --- a/src/Interpreters/ClientInfo.h +++ b/src/Interpreters/ClientInfo.h @@ -69,7 +69,7 @@ public: UInt64 client_version_major = 0; UInt64 client_version_minor = 0; UInt64 client_version_patch = 0; - unsigned client_revision = 0; + unsigned client_tcp_protocol_version = 0; /// For http HTTPMethod http_method = HTTPMethod::UNKNOWN; diff --git a/src/Interpreters/Cluster.cpp b/src/Interpreters/Cluster.cpp index b385e74adc5..8a98e8282a6 100644 --- a/src/Interpreters/Cluster.cpp +++ b/src/Interpreters/Cluster.cpp @@ -623,4 +623,21 @@ const std::string & Cluster::ShardInfo::pathForInsert(bool prefer_localhost_repl return dir_name_for_internal_replication_with_local; } +bool Cluster::maybeCrossReplication() const +{ 
+ /// Cluster can be used for cross-replication if some replicas have different default database names, + /// so one clickhouse-server instance can contain multiple replicas. + + if (addresses_with_failover.empty()) + return false; + + const String & database_name = addresses_with_failover.front().front().default_database; + for (const auto & shard : addresses_with_failover) + for (const auto & replica : shard) + if (replica.default_database != database_name) + return true; + + return false; +} + } diff --git a/src/Interpreters/Cluster.h index 4985c70e6e2..c8225c81453 100644 --- a/src/Interpreters/Cluster.h +++ b/src/Interpreters/Cluster.h @@ -193,6 +193,10 @@ public: /// Get a new Cluster that contains all servers (all shards with all replicas) from existing cluster as independent shards. std::unique_ptr<Cluster> getClusterWithReplicasAsShards(const Settings & settings) const; + /// Returns false if the cluster configuration doesn't allow using it for cross-replication. + /// NOTE: true does not mean that it's actually a cross-replication cluster. + bool maybeCrossReplication() const; + private: using SlotToShard = std::vector<UInt64>; SlotToShard slot_to_shard; diff --git a/src/Interpreters/Context.cpp index 704b21f3a4a..be35c8a9184 100644 --- a/src/Interpreters/Context.cpp +++ b/src/Interpreters/Context.cpp @@ -1088,6 +1088,18 @@ String Context::getInitialQueryId() const } +void Context::setCurrentDatabaseNameInGlobalContext(const String & name) +{ + if (global_context != this) + throw Exception("Cannot set current database for non global context, this method should be used during server initialization", ErrorCodes::LOGICAL_ERROR); + auto lock = getLock(); + + if (!current_database.empty()) + throw Exception("Default database name cannot be changed in global context without server restart", ErrorCodes::LOGICAL_ERROR); + + current_database = name; +} + void Context::setCurrentDatabase(const String & name) { DatabaseCatalog::instance().assertDatabaseExists(name); diff --git a/src/Interpreters/Context.h index 3d66ef239e7..bd5e17fe2e4 100644 --- a/src/Interpreters/Context.h +++ b/src/Interpreters/Context.h @@ -359,6 +359,9 @@ public: String getInitialQueryId() const; void setCurrentDatabase(const String & name); + /// Set current_database for global context. We don't validate that the database + /// exists because it should be set before databases are loaded.
+ void setCurrentDatabaseNameInGlobalContext(const String & name); void setCurrentQueryId(const String & query_id); void killCurrentQuery(); diff --git a/src/Interpreters/CrashLog.cpp b/src/Interpreters/CrashLog.cpp index 12fd57c33dc..9d84d5a18e9 100644 --- a/src/Interpreters/CrashLog.cpp +++ b/src/Interpreters/CrashLog.cpp @@ -49,7 +49,7 @@ void CrashLogElement::appendToBlock(MutableColumns & columns) const columns[i++]->insert(trace); columns[i++]->insert(trace_full); columns[i++]->insert(VERSION_FULL); - columns[i++]->insert(ClickHouseRevision::get()); + columns[i++]->insert(ClickHouseRevision::getVersionRevision()); String build_id_hex; #if defined(__ELF__) && !defined(__FreeBSD__) diff --git a/src/Interpreters/ExpressionAnalyzer.cpp b/src/Interpreters/ExpressionAnalyzer.cpp index 8d67672612c..2ca183ff9af 100644 --- a/src/Interpreters/ExpressionAnalyzer.cpp +++ b/src/Interpreters/ExpressionAnalyzer.cpp @@ -525,7 +525,7 @@ static bool allowDictJoin(StoragePtr joined_storage, const Context & context, St if (!dict) return false; - dict_name = dict->dictionaryName(); + dict_name = dict->resolvedDictionaryName(); auto dictionary = context.getExternalDictionariesLoader().getDictionary(dict_name); if (!dictionary) return false; @@ -582,7 +582,7 @@ JoinPtr SelectQueryExpressionAnalyzer::makeTableJoin(const ASTTablesInSelectQuer ExpressionActionsPtr joined_block_actions = createJoinedBlockActions(context, analyzedJoin()); Names original_right_columns; - if (subquery_for_join.source.empty()) + if (!subquery_for_join.source) { NamesWithAliases required_columns_with_aliases = analyzedJoin().getRequiredColumns( joined_block_actions->getSampleBlock(), joined_block_actions->getRequiredColumns()); diff --git a/src/Interpreters/GlobalSubqueriesVisitor.h b/src/Interpreters/GlobalSubqueriesVisitor.h index e155a132241..719794f0607 100644 --- a/src/Interpreters/GlobalSubqueriesVisitor.h +++ b/src/Interpreters/GlobalSubqueriesVisitor.h @@ -135,7 +135,8 @@ public: ast = database_and_table_name; external_tables[external_table_name] = external_storage_holder; - subqueries_for_sets[external_table_name].source = QueryPipeline::getPipe(interpreter->execute().pipeline); + subqueries_for_sets[external_table_name].source = std::make_unique(); + interpreter->buildQueryPlan(*subqueries_for_sets[external_table_name].source); subqueries_for_sets[external_table_name].table = external_storage; /** NOTE If it was written IN tmp_table - the existing temporary (but not external) table, diff --git a/src/Interpreters/InterpreterCreateQuery.cpp b/src/Interpreters/InterpreterCreateQuery.cpp index d7230940bb2..6f318b3658a 100644 --- a/src/Interpreters/InterpreterCreateQuery.cpp +++ b/src/Interpreters/InterpreterCreateQuery.cpp @@ -5,6 +5,7 @@ #include #include #include +#include #include #include @@ -853,17 +854,60 @@ BlockIO InterpreterCreateQuery::createDictionary(ASTCreateQuery & create) return {}; } +void InterpreterCreateQuery::prepareOnClusterQuery(ASTCreateQuery & create, const Context & context, const String & cluster_name) +{ + if (create.attach) + return; + + /// For CREATE query generate UUID on initiator, so it will be the same on all hosts. + /// It will be ignored if database does not support UUIDs. + if (create.uuid == UUIDHelpers::Nil) + create.uuid = UUIDHelpers::generateV4(); + + /// For cross-replication cluster we cannot use UUID in replica path. 
+ String cluster_name_expanded = context.getMacros()->expand(cluster_name); + ClusterPtr cluster = context.getCluster(cluster_name_expanded); + + if (cluster->maybeCrossReplication()) + { + /// Check that {uuid} macro is not used in zookeeper_path for ReplicatedMergeTree. + /// Otherwise replicas will generate different paths. + if (!create.storage) + return; + if (!create.storage->engine) + return; + if (!startsWith(create.storage->engine->name, "Replicated")) + return; + + bool has_explicit_zk_path_arg = create.storage->engine->arguments && + create.storage->engine->arguments->children.size() >= 2 && + create.storage->engine->arguments->children[0]->as() && + create.storage->engine->arguments->children[0]->as()->value.getType() == Field::Types::String; + + if (has_explicit_zk_path_arg) + { + String zk_path = create.storage->engine->arguments->children[0]->as()->value.get(); + Macros::MacroExpansionInfo info; + info.uuid = create.uuid; + info.ignore_unknown = true; + context.getMacros()->expand(zk_path, info); + if (!info.expanded_uuid) + return; + } + + throw Exception("Seems like cluster is configured for cross-replication, " + "but zookeeper_path for ReplicatedMergeTree is not specified or contains {uuid} macro. " + "It's not supported for cross replication, because tables must have different UUIDs. " + "Please specify unique zookeeper_path explicitly.", ErrorCodes::INCORRECT_QUERY); + } +} + BlockIO InterpreterCreateQuery::execute() { auto & create = query_ptr->as(); if (!create.cluster.empty()) { - /// Allows to execute ON CLUSTER queries during version upgrade - bool force_backward_compatibility = !context.getSettingsRef().show_table_uuid_in_table_create_query_if_not_nil; - /// For CREATE query generate UUID on initiator, so it will be the same on all hosts. - /// It will be ignored if database does not support UUIDs. - if (!force_backward_compatibility && !create.attach && create.uuid == UUIDHelpers::Nil) - create.uuid = UUIDHelpers::generateV4(); + prepareOnClusterQuery(create, context, create.cluster); return executeDDLQueryOnCluster(query_ptr, context, getRequiredAccess()); } diff --git a/src/Interpreters/InterpreterCreateQuery.h b/src/Interpreters/InterpreterCreateQuery.h index 4a5d57c11d1..07fca5f3910 100644 --- a/src/Interpreters/InterpreterCreateQuery.h +++ b/src/Interpreters/InterpreterCreateQuery.h @@ -55,6 +55,8 @@ public: static ColumnsDescription getColumnsDescription(const ASTExpressionList & columns, const Context & context, bool sanity_check_compression_codecs); static ConstraintsDescription getConstraintsDescription(const ASTExpressionList * constraints); + static void prepareOnClusterQuery(ASTCreateQuery & create, const Context & context, const String & cluster_name); + private: struct TableProperties { diff --git a/src/Interpreters/InterpreterExplainQuery.cpp b/src/Interpreters/InterpreterExplainQuery.cpp index c936556ce39..a0a63dfed08 100644 --- a/src/Interpreters/InterpreterExplainQuery.cpp +++ b/src/Interpreters/InterpreterExplainQuery.cpp @@ -119,13 +119,17 @@ struct QueryPlanSettings { QueryPlan::ExplainPlanOptions query_plan_options; + /// Apply query plan optimisations. 
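The prepareOnClusterQuery() hunk above leans on the new MacroExpansionInfo contract from the Macros changes earlier in this diff: ignore_unknown lets the initiator expand a zookeeper_path that may contain macros it does not know, while expanded_uuid reports whether {uuid} was hit, so cross-replication clusters can reject such paths. A stripped-down toy expander showing the same in/out pattern (an illustration with made-up helpers, not the real Macros class):

#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

struct MacroExpansionInfo
{
    bool ignore_unknown = false;   // setting: tolerate macros we cannot resolve
    bool expanded_uuid = false;    // result: {uuid} was substituted
    bool has_unknown = false;      // result: at least one macro was left unresolved
};

static std::string expand(const std::string & s, const std::map<std::string, std::string> & macros, MacroExpansionInfo & info)
{
    std::string res;
    size_t pos = 0;
    while (pos < s.size())
    {
        size_t begin = s.find('{', pos);
        if (begin == std::string::npos)
        {
            res += s.substr(pos);
            break;
        }
        res += s.substr(pos, begin - pos);

        size_t end = s.find('}', begin);
        if (end == std::string::npos)
            throw std::runtime_error("Unbalanced '{' in macro string");
        std::string name = s.substr(begin + 1, end - begin - 1);

        if (auto it = macros.find(name); it != macros.end())
            res += it->second;
        else if (name == "uuid")
        {
            res += "some-uuid";            // the real code substitutes the table UUID here
            info.expanded_uuid = true;
        }
        else if (info.ignore_unknown)
        {
            res += name;                   // keep going, but remember we saw something unresolved
            info.has_unknown = true;
        }
        else
            throw std::runtime_error("No macro '" + name + "'");
        pos = end + 1;
    }
    return res;
}

int main()
{
    MacroExpansionInfo info;
    info.ignore_unknown = true;
    expand("/clickhouse/tables/{shard}/{uuid}/t", {{"shard", "01"}}, info);
    assert(info.expanded_uuid);   // the initiator rejects such a path on cross-replication clusters
    assert(!info.has_unknown);
    return 0;
}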
+ bool optimize = true; + constexpr static char name[] = "PLAN"; std::unordered_map> boolean_settings = { {"header", query_plan_options.header}, {"description", query_plan_options.description}, - {"actions", query_plan_options.actions} + {"actions", query_plan_options.actions}, + {"optimize", optimize}, }; }; @@ -248,7 +252,8 @@ BlockInputStreamPtr InterpreterExplainQuery::executeImpl() InterpreterSelectWithUnionQuery interpreter(ast.getExplainedQuery(), context, SelectQueryOptions()); interpreter.buildQueryPlan(plan); - plan.optimize(); + if (settings.optimize) + plan.optimize(); WriteBufferFromOStream buffer(ss); plan.explainPlan(buffer, settings.query_plan_options); diff --git a/src/Interpreters/InterpreterSelectQuery.cpp b/src/Interpreters/InterpreterSelectQuery.cpp index 22106387fc4..823808759a2 100644 --- a/src/Interpreters/InterpreterSelectQuery.cpp +++ b/src/Interpreters/InterpreterSelectQuery.cpp @@ -1908,14 +1908,8 @@ void InterpreterSelectQuery::executeSubqueriesInSetsAndJoins(QueryPlan & query_p const Settings & settings = context->getSettingsRef(); - auto creating_sets = std::make_unique( - query_plan.getCurrentDataStream(), - std::move(subqueries_for_sets), - SizeLimits(settings.max_rows_to_transfer, settings.max_bytes_to_transfer, settings.transfer_overflow_mode), - *context); - - creating_sets->setStepDescription("Create sets for subqueries and joins"); - query_plan.addStep(std::move(creating_sets)); + SizeLimits limits(settings.max_rows_to_transfer, settings.max_bytes_to_transfer, settings.transfer_overflow_mode); + addCreatingSetsStep(query_plan, std::move(subqueries_for_sets), limits, *context); } diff --git a/src/Interpreters/InterpreterSelectWithUnionQuery.cpp b/src/Interpreters/InterpreterSelectWithUnionQuery.cpp index 1e631ea538b..ba0ebfaaf27 100644 --- a/src/Interpreters/InterpreterSelectWithUnionQuery.cpp +++ b/src/Interpreters/InterpreterSelectWithUnionQuery.cpp @@ -183,13 +183,14 @@ void InterpreterSelectWithUnionQuery::buildQueryPlan(QueryPlan & query_plan) return; } - std::vector plans(num_plans); + std::vector> plans(num_plans); DataStreams data_streams(num_plans); for (size_t i = 0; i < num_plans; ++i) { - nested_interpreters[i]->buildQueryPlan(plans[i]); - data_streams[i] = plans[i].getCurrentDataStream(); + plans[i] = std::make_unique(); + nested_interpreters[i]->buildQueryPlan(*plans[i]); + data_streams[i] = plans[i]->getCurrentDataStream(); } auto max_threads = context->getSettingsRef().max_threads; diff --git a/src/Interpreters/MergeJoin.cpp b/src/Interpreters/MergeJoin.cpp index 0154f8453b3..c9072ec3480 100644 --- a/src/Interpreters/MergeJoin.cpp +++ b/src/Interpreters/MergeJoin.cpp @@ -602,7 +602,7 @@ void MergeJoin::joinBlock(Block & block, ExtraBlockPtr & not_processed) { JoinCommon::checkTypesOfKeys(block, table_join->keyNamesLeft(), right_table_keys, table_join->keyNamesRight()); materializeBlockInplace(block); - JoinCommon::removeLowCardinalityInplace(block, table_join->keyNamesLeft()); + JoinCommon::removeLowCardinalityInplace(block, table_join->keyNamesLeft(), false); sortBlock(block, left_sort_description); @@ -636,6 +636,8 @@ void MergeJoin::joinBlock(Block & block, ExtraBlockPtr & not_processed) /// Back thread even with no data. We have some unfinished data in buffer. 
if (!not_processed && left_blocks_buffer) not_processed = std::make_shared(NotProcessed{{}, 0, 0, 0}); + + JoinCommon::restoreLowCardinalityInplace(block); } template diff --git a/src/Interpreters/MutationsInterpreter.cpp b/src/Interpreters/MutationsInterpreter.cpp index 089e3d1c23f..30da0d6e65f 100644 --- a/src/Interpreters/MutationsInterpreter.cpp +++ b/src/Interpreters/MutationsInterpreter.cpp @@ -11,6 +11,11 @@ #include #include #include +#include +#include +#include +#include +#include #include #include #include @@ -19,6 +24,7 @@ #include #include #include +#include namespace DB @@ -524,10 +530,11 @@ ASTPtr MutationsInterpreter::prepare(bool dry_run) SelectQueryOptions().analyze(/* dry_run = */ false).ignoreLimits()}; auto first_stage_header = interpreter.getSampleBlock(); - QueryPipeline pipeline; - pipeline.init(Pipe(std::make_shared(first_stage_header))); - addStreamsForLaterStages(stages_copy, pipeline); - updated_header = std::make_unique(pipeline.getHeader()); + QueryPlan plan; + auto source = std::make_shared(first_stage_header); + plan.addStep(std::make_unique(Pipe(std::move(source)))); + auto pipeline = addStreamsForLaterStages(stages_copy, plan); + updated_header = std::make_unique(pipeline->getHeader()); } /// Special step to recalculate affected indices and TTL expressions. @@ -656,7 +663,7 @@ ASTPtr MutationsInterpreter::prepareInterpreterSelectQuery(std::vector & return select; } -void MutationsInterpreter::addStreamsForLaterStages(const std::vector & prepared_stages, QueryPipeline & pipeline) const +QueryPipelinePtr MutationsInterpreter::addStreamsForLaterStages(const std::vector & prepared_stages, QueryPlan & plan) const { for (size_t i_stage = 1; i_stage < prepared_stages.size(); ++i_stage) { @@ -668,18 +675,12 @@ void MutationsInterpreter::addStreamsForLaterStages(const std::vector & p if (i < stage.filter_column_names.size()) { /// Execute DELETEs. - pipeline.addSimpleTransform([&](const Block & header) - { - return std::make_shared(header, step->actions(), stage.filter_column_names[i], false); - }); + plan.addStep(std::make_unique(plan.getCurrentDataStream(), step->actions(), stage.filter_column_names[i], false)); } else { /// Execute UPDATE or final projection. 
- pipeline.addSimpleTransform([&](const Block & header) - { - return std::make_shared(header, step->actions()); - }); + plan.addStep(std::make_unique(plan.getCurrentDataStream(), step->actions())); } } @@ -689,14 +690,17 @@ void MutationsInterpreter::addStreamsForLaterStages(const std::vector & p const Settings & settings = context.getSettingsRef(); SizeLimits network_transfer_limits( settings.max_rows_to_transfer, settings.max_bytes_to_transfer, settings.transfer_overflow_mode); - pipeline.addCreatingSetsTransform(std::move(subqueries_for_sets), network_transfer_limits, context); + addCreatingSetsStep(plan, std::move(subqueries_for_sets), network_transfer_limits, context); } } - pipeline.addSimpleTransform([&](const Block & header) + auto pipeline = plan.buildQueryPipeline(); + pipeline->addSimpleTransform([&](const Block & header) { return std::make_shared(header); }); + + return pipeline; } void MutationsInterpreter::validate() @@ -718,8 +722,9 @@ void MutationsInterpreter::validate() } } - auto block_io = select_interpreter->execute(); - addStreamsForLaterStages(stages, block_io.pipeline); + QueryPlan plan; + select_interpreter->buildQueryPlan(plan); + auto pipeline = addStreamsForLaterStages(stages, plan); } BlockInputStreamPtr MutationsInterpreter::execute() @@ -727,10 +732,11 @@ BlockInputStreamPtr MutationsInterpreter::execute() if (!can_execute) throw Exception("Cannot execute mutations interpreter because can_execute flag set to false", ErrorCodes::LOGICAL_ERROR); - auto block_io = select_interpreter->execute(); - addStreamsForLaterStages(stages, block_io.pipeline); + QueryPlan plan; + select_interpreter->buildQueryPlan(plan); - auto result_stream = block_io.getInputStream(); + auto pipeline = addStreamsForLaterStages(stages, plan); + BlockInputStreamPtr result_stream = std::make_shared(std::move(*pipeline)); /// Sometimes we update just part of columns (for example UPDATE mutation) /// in this case we don't read sorting key, so just we don't check anything. diff --git a/src/Interpreters/MutationsInterpreter.h b/src/Interpreters/MutationsInterpreter.h index 359ee1a3fd0..59d9e7657c3 100644 --- a/src/Interpreters/MutationsInterpreter.h +++ b/src/Interpreters/MutationsInterpreter.h @@ -13,7 +13,10 @@ namespace DB { class Context; +class QueryPlan; + class QueryPipeline; +using QueryPipelinePtr = std::unique_ptr; /// Return false if the data isn't going to be changed by mutations. 
bool isStorageTouchedByMutations( @@ -52,7 +55,7 @@ private: struct Stage; ASTPtr prepareInterpreterSelectQuery(std::vector &prepared_stages, bool dry_run); - void addStreamsForLaterStages(const std::vector & prepared_stages, QueryPipeline & pipeline) const; + QueryPipelinePtr addStreamsForLaterStages(const std::vector & prepared_stages, QueryPlan & plan) const; std::optional getStorageSortDescriptionIfPossible(const Block & header) const; diff --git a/src/Interpreters/ProcessList.cpp b/src/Interpreters/ProcessList.cpp index d86b5678f6d..018ddbcfa1d 100644 --- a/src/Interpreters/ProcessList.cpp +++ b/src/Interpreters/ProcessList.cpp @@ -401,7 +401,7 @@ void ProcessList::killAllQueries() QueryStatusInfo QueryStatus::getInfo(bool get_thread_list, bool get_profile_events, bool get_settings) const { - QueryStatusInfo res; + QueryStatusInfo res{}; res.query = query; res.client_info = client_info; diff --git a/src/Interpreters/QueryLog.cpp b/src/Interpreters/QueryLog.cpp index 62dbc114633..75e0fae615a 100644 --- a/src/Interpreters/QueryLog.cpp +++ b/src/Interpreters/QueryLog.cpp @@ -118,7 +118,7 @@ void QueryLogElement::appendToBlock(MutableColumns & columns) const appendClientInfo(client_info, columns, i); - columns[i++]->insert(ClickHouseRevision::get()); + columns[i++]->insert(ClickHouseRevision::getVersionRevision()); { Array threads_array; @@ -172,7 +172,7 @@ void QueryLogElement::appendClientInfo(const ClientInfo & client_info, MutableCo columns[i++]->insert(client_info.os_user); columns[i++]->insert(client_info.client_hostname); columns[i++]->insert(client_info.client_name); - columns[i++]->insert(client_info.client_revision); + columns[i++]->insert(client_info.client_tcp_protocol_version); columns[i++]->insert(client_info.client_version_major); columns[i++]->insert(client_info.client_version_minor); columns[i++]->insert(client_info.client_version_patch); diff --git a/src/Interpreters/QueryThreadLog.cpp b/src/Interpreters/QueryThreadLog.cpp index 22ad60d96b4..e5a8cf7c5cf 100644 --- a/src/Interpreters/QueryThreadLog.cpp +++ b/src/Interpreters/QueryThreadLog.cpp @@ -93,7 +93,7 @@ void QueryThreadLogElement::appendToBlock(MutableColumns & columns) const QueryLogElement::appendClientInfo(client_info, columns, i); - columns[i++]->insert(ClickHouseRevision::get()); + columns[i++]->insert(ClickHouseRevision::getVersionRevision()); if (profile_counters) { diff --git a/src/Interpreters/SubqueryForSet.cpp b/src/Interpreters/SubqueryForSet.cpp index 038ecbbb0b6..17ea813c545 100644 --- a/src/Interpreters/SubqueryForSet.cpp +++ b/src/Interpreters/SubqueryForSet.cpp @@ -8,13 +8,19 @@ namespace DB { +SubqueryForSet::SubqueryForSet() = default; +SubqueryForSet::~SubqueryForSet() = default; +SubqueryForSet::SubqueryForSet(SubqueryForSet &&) = default; +SubqueryForSet & SubqueryForSet::operator= (SubqueryForSet &&) = default; + void SubqueryForSet::makeSource(std::shared_ptr & interpreter, NamesWithAliases && joined_block_aliases_) { joined_block_aliases = std::move(joined_block_aliases_); - source = QueryPipeline::getPipe(interpreter->execute().pipeline); + source = std::make_unique(); + interpreter->buildQueryPlan(*source); - sample_block = source.getHeader(); + sample_block = interpreter->getSampleBlock(); renameColumns(sample_block); } diff --git a/src/Interpreters/SubqueryForSet.h b/src/Interpreters/SubqueryForSet.h index d268758c3e8..fd073500dc2 100644 --- a/src/Interpreters/SubqueryForSet.h +++ b/src/Interpreters/SubqueryForSet.h @@ -5,7 +5,6 @@ #include #include #include -#include namespace DB @@ 
-14,12 +13,18 @@ namespace DB class InterpreterSelectWithUnionQuery; class ExpressionActions; using ExpressionActionsPtr = std::shared_ptr; +class QueryPlan; /// Information on what to do when executing a subquery in the [GLOBAL] IN/JOIN section. struct SubqueryForSet { + SubqueryForSet(); + ~SubqueryForSet(); + SubqueryForSet(SubqueryForSet &&); + SubqueryForSet & operator= (SubqueryForSet &&); + /// The source is obtained using the InterpreterSelectQuery subquery. - Pipe source; + std::unique_ptr source; /// If set, build it from result. SetPtr set; diff --git a/src/Interpreters/SystemLog.h b/src/Interpreters/SystemLog.h index 03b1b735cbc..2a0ce9cef53 100644 --- a/src/Interpreters/SystemLog.h +++ b/src/Interpreters/SystemLog.h @@ -438,7 +438,7 @@ void SystemLog::flushImpl(const std::vector & to_flush, ASTPtr query_ptr(insert.release()); // we need query context to do inserts to target table with MV containing subqueries or joins - auto insert_context = Context(context); + Context insert_context(context); insert_context.makeQueryContext(); InterpreterInsertQuery interpreter(query_ptr, insert_context); diff --git a/src/Interpreters/TextLog.cpp b/src/Interpreters/TextLog.cpp index d166b24ef4f..243bf6d299a 100644 --- a/src/Interpreters/TextLog.cpp +++ b/src/Interpreters/TextLog.cpp @@ -62,7 +62,7 @@ void TextLogElement::appendToBlock(MutableColumns & columns) const columns[i++]->insert(logger_name); columns[i++]->insert(message); - columns[i++]->insert(ClickHouseRevision::get()); + columns[i++]->insert(ClickHouseRevision::getVersionRevision()); columns[i++]->insert(source_file); columns[i++]->insert(source_line); diff --git a/src/Interpreters/TraceLog.cpp b/src/Interpreters/TraceLog.cpp index c4fa7307b1a..f7e82032f49 100644 --- a/src/Interpreters/TraceLog.cpp +++ b/src/Interpreters/TraceLog.cpp @@ -43,7 +43,7 @@ void TraceLogElement::appendToBlock(MutableColumns & columns) const columns[i++]->insert(DateLUT::instance().toDayNum(event_time)); columns[i++]->insert(event_time); columns[i++]->insert(timestamp_ns); - columns[i++]->insert(ClickHouseRevision::get()); + columns[i++]->insert(ClickHouseRevision::getVersionRevision()); columns[i++]->insert(static_cast(trace_type)); columns[i++]->insert(thread_id); columns[i++]->insertData(query_id.data(), query_id.size()); diff --git a/src/Interpreters/join_common.cpp b/src/Interpreters/join_common.cpp index 866893fa359..17c289b151d 100644 --- a/src/Interpreters/join_common.cpp +++ b/src/Interpreters/join_common.cpp @@ -185,13 +185,24 @@ void removeLowCardinalityInplace(Block & block) } } -void removeLowCardinalityInplace(Block & block, const Names & names) +void removeLowCardinalityInplace(Block & block, const Names & names, bool change_type) { for (const String & column_name : names) { auto & col = block.getByName(column_name); col.column = recursiveRemoveLowCardinality(col.column); - col.type = recursiveRemoveLowCardinality(col.type); + if (change_type) + col.type = recursiveRemoveLowCardinality(col.type); + } +} + +void restoreLowCardinalityInplace(Block & block) +{ + for (size_t i = 0; i < block.columns(); ++i) + { + auto & col = block.getByPosition(i); + if (col.type->lowCardinality() && col.column && !col.column->lowCardinality()) + col.column = changeLowCardinality(col.column, col.type->createColumn()); } } diff --git a/src/Interpreters/join_common.h b/src/Interpreters/join_common.h index 11fecd4e3fb..cfd727704a0 100644 --- a/src/Interpreters/join_common.h +++ b/src/Interpreters/join_common.h @@ -23,7 +23,8 @@ Columns materializeColumns(const 
Block & block, const Names & names); ColumnRawPtrs materializeColumnsInplace(Block & block, const Names & names); ColumnRawPtrs getRawPointers(const Columns & columns); void removeLowCardinalityInplace(Block & block); -void removeLowCardinalityInplace(Block & block, const Names & names); +void removeLowCardinalityInplace(Block & block, const Names & names, bool change_type = true); +void restoreLowCardinalityInplace(Block & block); ColumnRawPtrs extractKeysForJoin(const Block & block_keys, const Names & key_names_right); diff --git a/src/Parsers/ASTColumnsTransformers.h b/src/Parsers/ASTColumnsTransformers.h index ddf0d70dc35..4b7a933647e 100644 --- a/src/Parsers/ASTColumnsTransformers.h +++ b/src/Parsers/ASTColumnsTransformers.h @@ -53,7 +53,7 @@ public: ASTPtr clone() const override { auto replacement = std::make_shared(*this); - replacement->name = name; + replacement->children.clear(); replacement->expr = expr->clone(); replacement->children.push_back(replacement->expr); return replacement; diff --git a/src/Parsers/ASTWithElement.cpp b/src/Parsers/ASTWithElement.cpp index e8dd4ff0498..9d22286c2fd 100644 --- a/src/Parsers/ASTWithElement.cpp +++ b/src/Parsers/ASTWithElement.cpp @@ -6,6 +6,7 @@ namespace DB ASTPtr ASTWithElement::clone() const { const auto res = std::make_shared(*this); + res->children.clear(); res->name = name; res->subquery = subquery->clone(); res->children.emplace_back(res->subquery); diff --git a/src/Parsers/obfuscateQueries.cpp b/src/Parsers/obfuscateQueries.cpp new file mode 100644 index 00000000000..32382b70bd7 --- /dev/null +++ b/src/Parsers/obfuscateQueries.cpp @@ -0,0 +1,937 @@ +#include + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + + +namespace DB +{ + +namespace ErrorCodes +{ + extern const int TOO_MANY_TEMPORARY_COLUMNS; +} + + +namespace +{ + +const std::unordered_set keywords +{ + "CREATE", "DATABASE", "IF", "NOT", "EXISTS", "TEMPORARY", "TABLE", "ON", "CLUSTER", "DEFAULT", + "MATERIALIZED", "ALIAS", "ENGINE", "AS", "VIEW", "POPULATE", "SETTINGS", "ATTACH", "DETACH", "DROP", + "RENAME", "TO", "ALTER", "ADD", "MODIFY", "CLEAR", "COLUMN", "AFTER", "COPY", "PROJECT", + "PRIMARY", "KEY", "CHECK", "PARTITION", "PART", "FREEZE", "FETCH", "FROM", "SHOW", "INTO", + "OUTFILE", "FORMAT", "TABLES", "DATABASES", "LIKE", "PROCESSLIST", "CASE", "WHEN", "THEN", "ELSE", + "END", "DESCRIBE", "DESC", "USE", "SET", "OPTIMIZE", "FINAL", "DEDUPLICATE", "INSERT", "VALUES", + "SELECT", "DISTINCT", "SAMPLE", "ARRAY", "JOIN", "GLOBAL", "LOCAL", "ANY", "ALL", "INNER", + "LEFT", "RIGHT", "FULL", "OUTER", "CROSS", "USING", "PREWHERE", "WHERE", "GROUP", "BY", + "WITH", "TOTALS", "HAVING", "ORDER", "COLLATE", "LIMIT", "UNION", "AND", "OR", "ASC", + "IN", "KILL", "QUERY", "SYNC", "ASYNC", "TEST", "BETWEEN", "TRUNCATE", "USER", "ROLE", + "PROFILE", "QUOTA", "POLICY", "ROW", "GRANT", "REVOKE", "OPTION", "ADMIN", "EXCEPT", "REPLACE", + "IDENTIFIED", "HOST", "NAME", "READONLY", "WRITABLE", "PERMISSIVE", "FOR", "RESTRICTIVE", "RANDOMIZED", + "INTERVAL", "LIMITS", "ONLY", "TRACKING", "IP", "REGEXP", "ILIKE", "DICTIONARY" +}; + +const std::unordered_set keep_words +{ + "id", "name", "value", "num", + "Id", "Name", "Value", "Num", + "ID", "NAME", "VALUE", "NUM", +}; + +/// The list of nouns collected from here: http://www.desiquintans.com/nounlist, Public domain. 
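The noun list that follows feeds the new query obfuscator in obfuscateQueries.cpp: keywords and the small keep_words allow-list above pass through untouched, and every other identifier is mapped deterministically onto a noun, so the same input name always obfuscates to the same output. A rough sketch of that idea only; the real obfuscateQueries() has a different signature and presumably uses a salted, stable hash rather than std::hash:

#include <cstdint>
#include <functional>
#include <iostream>
#include <string>
#include <unordered_set>
#include <vector>

static std::string obfuscateIdentifier(
    const std::string & word,
    const std::unordered_set<std::string> & keywords,
    const std::unordered_set<std::string> & keep_words,
    const std::vector<std::string> & nouns,
    uint64_t seed)
{
    if (keywords.count(word) || keep_words.count(word))
        return word;                                     // SQL keywords and whitelisted names survive
    uint64_t h = std::hash<std::string>{}(word) ^ seed;  // stand-in for a keyed, stable hash
    return nouns[h % nouns.size()];                      // deterministic pick from the noun list
}

int main()
{
    std::unordered_set<std::string> keywords{"SELECT", "FROM", "WHERE"};
    std::unordered_set<std::string> keep_words{"id", "name", "value", "num"};
    std::vector<std::string> nouns{"aardvark", "abacus", "abbey", "abbreviation"};

    std::cout << obfuscateIdentifier("SELECT", keywords, keep_words, nouns, 42) << '\n';      // SELECT
    std::cout << obfuscateIdentifier("id", keywords, keep_words, nouns, 42) << '\n';          // id
    std::cout << obfuscateIdentifier("user_email", keywords, keep_words, nouns, 42) << '\n';  // some noun, stable for a given seed
    return 0;
}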
+std::initializer_list nouns +{ +"aardvark", "abacus", "abbey", "abbreviation", "abdomen", "ability", "abnormality", "abolishment", "abortion", +"abrogation", "absence", "abundance", "abuse", "academics", "academy", "accelerant", "accelerator", "accent", "acceptance", "access", +"accessory", "accident", "accommodation", "accompanist", "accomplishment", "accord", "accordance", "accordion", "account", "accountability", +"accountant", "accounting", "accuracy", "accusation", "acetate", "achievement", "achiever", "acid", "acknowledgment", "acorn", "acoustics", +"acquaintance", "acquisition", "acre", "acrylic", "act", "action", "activation", "activist", "activity", "actor", "actress", "acupuncture", +"ad", "adaptation", "adapter", "addiction", "addition", "address", "adjective", "adjustment", "admin", "administration", "administrator", +"admire", "admission", "adobe", "adoption", "adrenalin", "adrenaline", "adult", "adulthood", "advance", "advancement", "advantage", "advent", +"adverb", "advertisement", "advertising", "advice", "adviser", "advocacy", "advocate", "affair", "affect", "affidavit", "affiliate", +"affinity", "afoul", "afterlife", "aftermath", "afternoon", "aftershave", "aftershock", "afterthought", "age", "agency", "agenda", "agent", +"aggradation", "aggression", "aglet", "agony", "agreement", "agriculture", "aid", "aide", "aim", "air", "airbag", "airbus", "aircraft", +"airfare", "airfield", "airforce", "airline", "airmail", "airman", "airplane", "airport", "airship", "airspace", "alarm", "alb", "albatross", +"album", "alcohol", "alcove", "alder", "ale", "alert", "alfalfa", "algebra", "algorithm", "alias", "alibi", "alien", "allegation", "allergist", +"alley", "alliance", "alligator", "allocation", "allowance", "alloy", "alluvium", "almanac", "almighty", "almond", "alpaca", "alpenglow", +"alpenhorn", "alpha", "alphabet", "altar", "alteration", "alternative", "altitude", "alto", "aluminium", "aluminum", "amazement", "amazon", +"ambassador", "amber", "ambience", "ambiguity", "ambition", "ambulance", "amendment", "amenity", "ammunition", "amnesty", "amount", "amusement", +"anagram", "analgesia", "analog", "analogue", "analogy", "analysis", "analyst", "analytics", "anarchist", "anarchy", "anatomy", "ancestor", +"anchovy", "android", "anesthesiologist", "anesthesiology", "angel", "anger", "angina", "angiosperm", "angle", "angora", "angstrom", +"anguish", "animal", "anime", "anise", "ankle", "anklet", "anniversary", "announcement", "annual", "anorak", "answer", "ant", "anteater", +"antecedent", "antechamber", "antelope", "antennae", "anterior", "anthropology", "antibody", "anticipation", "anticodon", "antigen", +"antique", "antiquity", "antler", "antling", "anxiety", "anybody", "anyone", "anything", "anywhere", "apartment", "ape", "aperitif", +"apology", "app", "apparatus", "apparel", "appeal", "appearance", "appellation", "appendix", "appetiser", "appetite", "appetizer", "applause", +"apple", "applewood", "appliance", "application", "appointment", "appreciation", "apprehension", "approach", "appropriation", "approval", +"apricot", "apron", "apse", "aquarium", "aquifer", "arcade", "arch", "archaeologist", "archaeology", "archeology", "archer", +"architect", "architecture", "archives", "area", "arena", "argument", "arithmetic", "ark", "arm", "armadillo", "armament", +"armchair", "armoire", "armor", "armour", "armpit", "armrest", "army", "arrangement", "array", "arrest", "arrival", "arrogance", "arrow", +"art", "artery", "arthur", "artichoke", "article", "artifact", "artificer", "artist", 
"ascend", "ascent", "ascot", "ash", "ashram", "ashtray", +"aside", "asparagus", "aspect", "asphalt", "aspic", "assassination", "assault", "assembly", "assertion", "assessment", "asset", +"assignment", "assist", "assistance", "assistant", "associate", "association", "assumption", "assurance", "asterisk", "astrakhan", "astrolabe", +"astrologer", "astrology", "astronomy", "asymmetry", "atelier", "atheist", "athlete", "athletics", "atmosphere", "atom", "atrium", "attachment", +"attack", "attacker", "attainment", "attempt", "attendance", "attendant", "attention", "attenuation", "attic", "attitude", "attorney", +"attraction", "attribute", "auction", "audience", "audit", "auditorium", "aunt", "authentication", "authenticity", "author", "authorisation", +"authority", "authorization", "auto", "autoimmunity", "automation", "automaton", "autumn", "availability", "avalanche", "avenue", "average", +"avocado", "award", "awareness", "awe", "axis", "azimuth", "babe", "baboon", "babushka", "baby", "bachelor", "back", "backbone", +"backburn", "backdrop", "background", "backpack", "backup", "backyard", "bacon", "bacterium", "badge", "badger", "bafflement", "bag", +"bagel", "baggage", "baggie", "baggy", "bagpipe", "bail", "bait", "bake", "baker", "bakery", "bakeware", "balaclava", "balalaika", "balance", +"balcony", "ball", "ballet", "balloon", "balloonist", "ballot", "ballpark", "bamboo", "ban", "banana", "band", "bandana", "bandanna", +"bandolier", "bandwidth", "bangle", "banjo", "bank", "bankbook", "banker", "banking", "bankruptcy", "banner", "banquette", "banyan", +"baobab", "bar", "barbecue", "barbeque", "barber", "barbiturate", "bargain", "barge", "baritone", "barium", "bark", "barley", "barn", +"barometer", "barracks", "barrage", "barrel", "barrier", "barstool", "bartender", "base", "baseball", "baseboard", "baseline", "basement", +"basics", "basil", "basin", "basis", "basket", "basketball", "bass", "bassinet", "bassoon", "bat", "bath", "bather", "bathhouse", "bathrobe", +"bathroom", "bathtub", "battalion", "batter", "battery", "batting", "battle", "battleship", "bay", "bayou", "beach", "bead", "beak", +"beam", "bean", "beancurd", "beanie", "beanstalk", "bear", "beard", "beast", "beastie", "beat", "beating", "beauty", "beaver", "beck", +"bed", "bedrock", "bedroom", "bee", "beech", "beef", "beer", "beet", "beetle", "beggar", "beginner", "beginning", "begonia", "behalf", +"behavior", "behaviour", "beheading", "behest", "behold", "being", "belfry", "belief", "believer", "bell", "belligerency", "bellows", +"belly", "belt", "bench", "bend", "beneficiary", "benefit", "beret", "berry", "bestseller", "bet", "beverage", "beyond", +"bias", "bibliography", "bicycle", "bid", "bidder", "bidding", "bidet", "bifocals", "bijou", "bike", "bikini", "bill", "billboard", "billing", +"billion", "bin", "binoculars", "biology", "biopsy", "biosphere", "biplane", "birch", "bird", "birdbath", "birdcage", +"birdhouse", "birth", "birthday", "biscuit", "bit", "bite", "bitten", "bitter", "black", "blackberry", "blackbird", "blackboard", "blackfish", +"blackness", "bladder", "blade", "blame", "blank", "blanket", "blast", "blazer", "blend", "blessing", "blight", "blind", "blinker", "blister", +"blizzard", "block", "blocker", "blog", "blogger", "blood", "bloodflow", "bloom", "bloomer", "blossom", "blouse", "blow", "blowgun", +"blowhole", "blue", "blueberry", "blush", "boar", "board", "boat", "boatload", "boatyard", "bob", "bobcat", "body", "bog", "bolero", +"bolt", "bomb", "bomber", "bombing", "bond", "bonding", "bondsman", "bone", 
"bonfire", "bongo", "bonnet", "bonsai", "bonus", "boogeyman", +"book", "bookcase", "bookend", "booking", "booklet", "bookmark", "boolean", "boom", "boon", "boost", "booster", "boot", "bootee", "bootie", +"booty", "border", "bore", "borrower", "borrowing", "bosom", "boss", "botany", "bother", "bottle", "bottling", "bottom", +"boudoir", "bough", "boulder", "boulevard", "boundary", "bouquet", "bourgeoisie", "bout", "boutique", "bow", "bower", "bowl", "bowler", +"bowling", "bowtie", "box", "boxer", "boxspring", "boy", "boycott", "boyfriend", "boyhood", "boysenberry", "bra", "brace", "bracelet", +"bracket", "brain", "brake", "bran", "branch", "brand", "brandy", "brass", "brassiere", "bratwurst", "bread", "breadcrumb", "breadfruit", +"break", "breakdown", "breakfast", "breakpoint", "breakthrough", "breast", "breastplate", "breath", "breeze", "brewer", "bribery", "brick", +"bricklaying", "bride", "bridge", "brief", "briefing", "briefly", "briefs", "brilliant", "brink", "brisket", "broad", "broadcast", "broccoli", +"brochure", "brocolli", "broiler", "broker", "bronchitis", "bronco", "bronze", "brooch", "brood", "brook", "broom", "brother", +"brow", "brown", "brownie", "browser", "browsing", "brunch", "brush", "brushfire", "brushing", "bubble", "buck", "bucket", "buckle", +"buckwheat", "bud", "buddy", "budget", "buffalo", "buffer", "buffet", "bug", "buggy", "bugle", "builder", "building", "bulb", "bulk", +"bull", "bulldozer", "bullet", "bump", "bumper", "bun", "bunch", "bungalow", "bunghole", "bunkhouse", "burden", "bureau", +"burglar", "burial", "burlesque", "burn", "burning", "burrito", "burro", "burrow", "burst", "bus", "bush", "business", "businessman", +"bust", "bustle", "butane", "butcher", "butler", "butter", "butterfly", "button", "buy", "buyer", "buying", "buzz", "buzzard", +"cabana", "cabbage", "cabin", "cabinet", "cable", "caboose", "cacao", "cactus", "caddy", "cadet", "cafe", "caffeine", "caftan", "cage", +"cake", "calcification", "calculation", "calculator", "calculus", "calendar", "calf", "caliber", "calibre", "calico", "call", "calm", +"calorie", "camel", "cameo", "camera", "camp", "campaign", "campaigning", "campanile", "camper", "campus", "can", "canal", "cancer", +"candelabra", "candidacy", "candidate", "candle", "candy", "cane", "cannibal", "cannon", "canoe", "canon", "canopy", "cantaloupe", "canteen", +"canvas", "cap", "capability", "capacity", "cape", "caper", "capital", "capitalism", "capitulation", "capon", "cappelletti", "cappuccino", +"captain", "caption", "captor", "car", "carabao", "caramel", "caravan", "carbohydrate", "carbon", "carboxyl", "card", "cardboard", "cardigan", +"care", "career", "cargo", "caribou", "carload", "carnation", "carnival", "carol", "carotene", "carp", "carpenter", "carpet", "carpeting", +"carport", "carriage", "carrier", "carrot", "carry", "cart", "cartel", "carter", "cartilage", "cartload", "cartoon", "cartridge", "carving", +"cascade", "case", "casement", "cash", "cashew", "cashier", "casino", "casket", "cassava", "casserole", "cassock", "cast", "castanet", +"castle", "casualty", "cat", "catacomb", "catalogue", "catalysis", "catalyst", "catamaran", "catastrophe", "catch", "catcher", "category", +"caterpillar", "cathedral", "cation", "catsup", "cattle", "cauliflower", "causal", "cause", "causeway", "caution", "cave", "caviar", +"cayenne", "ceiling", "celebration", "celebrity", "celeriac", "celery", "cell", "cellar", "cello", "celsius", "cement", "cemetery", "cenotaph", +"census", "cent", "center", "centimeter", "centre", "centurion", "century", 
"cephalopod", "ceramic", "ceramics", "cereal", "ceremony", +"certainty", "certificate", "certification", "cesspool", "chafe", "chain", "chainstay", "chair", "chairlift", "chairman", "chairperson", +"chaise", "chalet", "chalice", "chalk", "challenge", "chamber", "champagne", "champion", "championship", "chance", "chandelier", "change", +"channel", "chaos", "chap", "chapel", "chaplain", "chapter", "character", "characteristic", "characterization", "chard", "charge", "charger", +"charity", "charlatan", "charm", "charset", "chart", "charter", "chasm", "chassis", "chastity", "chasuble", "chateau", "chatter", "chauffeur", +"chauvinist", "check", "checkbook", "checking", "checkout", "checkroom", "cheddar", "cheek", "cheer", "cheese", "cheesecake", "cheetah", +"chef", "chem", "chemical", "chemistry", "chemotaxis", "cheque", "cherry", "chess", "chest", "chestnut", "chick", "chicken", "chicory", +"chief", "chiffonier", "child", "childbirth", "childhood", "chili", "chill", "chime", "chimpanzee", "chin", "chinchilla", "chino", "chip", +"chipmunk", "chivalry", "chive", "chives", "chocolate", "choice", "choir", "choker", "cholesterol", "choosing", "chop", +"chops", "chopstick", "chopsticks", "chord", "chorus", "chow", "chowder", "chrome", "chromolithograph", "chronicle", "chronograph", "chronometer", +"chrysalis", "chub", "chuck", "chug", "church", "churn", "chutney", "cicada", "cigarette", "cilantro", "cinder", "cinema", "cinnamon", +"circadian", "circle", "circuit", "circulation", "circumference", "circumstance", "cirrhosis", "cirrus", "citizen", "citizenship", "citron", +"citrus", "city", "civilian", "civilisation", "civilization", "claim", "clam", "clamp", "clan", "clank", "clapboard", "clarification", +"clarinet", "clarity", "clasp", "class", "classic", "classification", "classmate", "classroom", "clause", "clave", "clavicle", "clavier", +"claw", "clay", "cleaner", "clearance", "clearing", "cleat", "cleavage", "clef", "cleft", "clergyman", "cleric", "clerk", "click", "client", +"cliff", "climate", "climb", "clinic", "clip", "clipboard", "clipper", "cloak", "cloakroom", "clock", "clockwork", "clogs", "cloister", +"clone", "close", "closet", "closing", "closure", "cloth", "clothes", "clothing", "cloud", "cloudburst", "clove", "clover", "cloves", +"club", "clue", "cluster", "clutch", "coach", "coal", "coalition", "coast", "coaster", "coat", "cob", "cobbler", "cobweb", +"cock", "cockpit", "cockroach", "cocktail", "cocoa", "coconut", "cod", "code", "codepage", "codling", "codon", "codpiece", "coevolution", +"cofactor", "coffee", "coffin", "cohesion", "cohort", "coil", "coin", "coincidence", "coinsurance", "coke", "cold", "coleslaw", "coliseum", +"collaboration", "collagen", "collapse", "collar", "collard", "collateral", "colleague", "collection", "collectivisation", "collectivization", +"collector", "college", "collision", "colloquy", "colon", "colonial", "colonialism", "colonisation", "colonization", "colony", "color", +"colorlessness", "colt", "column", "columnist", "comb", "combat", "combination", "combine", "comeback", "comedy", "comestible", "comfort", +"comfortable", "comic", "comics", "comma", "command", "commander", "commandment", "comment", "commerce", "commercial", "commission", +"commitment", "committee", "commodity", "common", "commonsense", "commotion", "communicant", "communication", "communion", "communist", +"community", "commuter", "company", "comparison", "compass", "compassion", "compassionate", "compensation", "competence", "competition", +"competitor", "complaint", "complement", 
"completion", "complex", "complexity", "compliance", "complication", "complicity", "compliment", +"component", "comportment", "composer", "composite", "composition", "compost", "comprehension", "compress", "compromise", "comptroller", +"compulsion", "computer", "comradeship", "con", "concentrate", "concentration", "concept", "conception", "concern", "concert", "conclusion", +"concrete", "condition", "conditioner", "condominium", "condor", "conduct", "conductor", "cone", "confectionery", "conference", "confidence", +"confidentiality", "configuration", "confirmation", "conflict", "conformation", "confusion", "conga", "congo", "congregation", "congress", +"congressman", "congressperson", "conifer", "connection", "connotation", "conscience", "consciousness", "consensus", "consent", "consequence", +"conservation", "conservative", "consideration", "consignment", "consist", "consistency", "console", "consonant", "conspiracy", "conspirator", +"constant", "constellation", "constitution", "constraint", "construction", "consul", "consulate", "consulting", "consumer", "consumption", +"contact", "contact lens", "contagion", "container", "content", "contention", "contest", "context", "continent", "contingency", "continuity", +"contour", "contract", "contractor", "contrail", "contrary", "contrast", "contribution", "contributor", "control", "controller", "controversy", +"convection", "convenience", "convention", "conversation", "conversion", "convert", "convertible", "conviction", "cook", "cookbook", +"cookie", "cooking", "coonskin", "cooperation", "coordination", "coordinator", "cop", "cope", "copper", "copy", "copying", +"copyright", "copywriter", "coral", "cord", "corduroy", "core", "cork", "cormorant", "corn", "corner", "cornerstone", "cornet", "cornflakes", +"cornmeal", "corporal", "corporation", "corporatism", "corps", "corral", "correspondence", "correspondent", "corridor", "corruption", +"corsage", "cosset", "cost", "costume", "cot", "cottage", "cotton", "couch", "cougar", "cough", "council", "councilman", "councilor", +"councilperson", "counsel", "counseling", "counselling", "counsellor", "counselor", "count", "counter", "counterpart", +"counterterrorism", "countess", "country", "countryside", "county", "couple", "coupon", "courage", "course", "court", "courthouse", "courtroom", +"cousin", "covariate", "cover", "coverage", "coverall", "cow", "cowbell", "cowboy", "coyote", "crab", "crack", "cracker", "crackers", +"cradle", "craft", "craftsman", "cranberry", "crane", "cranky", "crash", "crate", "cravat", "craw", "crawdad", "crayfish", "crayon", +"crazy", "cream", "creation", "creationism", "creationist", "creative", "creativity", "creator", "creature", "creche", "credential", +"credenza", "credibility", "credit", "creditor", "creek", "creme brulee", "crepe", "crest", "crew", "crewman", "crewmate", "crewmember", +"crewmen", "cria", "crib", "cribbage", "cricket", "cricketer", "crime", "criminal", "crinoline", "crisis", "crisp", "criteria", "criterion", +"critic", "criticism", "crocodile", "crocus", "croissant", "crook", "crop", "cross", "crotch", +"croup", "crow", "crowd", "crown", "crucifixion", "crude", "cruelty", "cruise", "crumb", "crunch", "crusader", "crush", "crust", "cry", +"crystal", "crystallography", "cub", "cube", "cuckoo", "cucumber", "cue", "cuisine", "cultivar", "cultivator", "culture", +"culvert", "cummerbund", "cup", "cupboard", "cupcake", "cupola", "curd", "cure", "curio", "curiosity", "curl", "curler", "currant", "currency", +"current", "curriculum", "curry", "curse", "cursor", 
"curtailment", "curtain", "curve", "cushion", "custard", "custody", "custom", "customer", +"cut", "cuticle", "cutlet", "cutover", "cutting", "cyclamen", "cycle", "cyclone", "cyclooxygenase", "cygnet", "cylinder", "cymbal", "cynic", +"cyst", "cytokine", "cytoplasm", "dad", "daddy", "daffodil", "dagger", "dahlia", "daikon", "daily", "dairy", "daisy", "dam", "damage", +"dame", "dance", "dancer", "dancing", "dandelion", "danger", "dare", "dark", "darkness", "darn", "dart", "dash", "dashboard", +"data", "database", "date", "daughter", "dawn", "day", "daybed", "daylight", "dead", "deadline", "deal", "dealer", "dealing", "dearest", +"death", "deathwatch", "debate", "debris", "debt", "debtor", "decade", "decadence", "decency", "decimal", "decision", +"deck", "declaration", "declination", "decline", "decoder", "decongestant", "decoration", "decrease", "decryption", "dedication", "deduce", +"deduction", "deed", "deep", "deer", "default", "defeat", "defendant", "defender", "defense", "deficit", "definition", "deformation", +"degradation", "degree", "delay", "deliberation", "delight", "delivery", "demand", "democracy", "democrat", "demon", "demur", "den", +"denim", "denominator", "density", "dentist", "deodorant", "department", "departure", "dependency", "dependent", "deployment", "deposit", +"deposition", "depot", "depression", "depressive", "depth", "deputy", "derby", "derivation", "derivative", "derrick", "descendant", "descent", +"description", "desert", "design", "designation", "designer", "desire", "desk", "desktop", "dessert", "destination", "destiny", "destroyer", +"destruction", "detail", "detainee", "detainment", "detection", "detective", "detector", "detention", "determination", "detour", "devastation", +"developer", "developing", "development", "developmental", "deviance", "deviation", "device", "devil", "dew", "dhow", "diabetes", "diadem", +"diagnosis", "diagram", "dial", "dialect", "dialogue", "diam", "diamond", "diaper", "diaphragm", "diarist", "diary", "dibble", "dickey", "dictaphone", "dictator", "diction", "dictionary", "die", "diesel", "diet", "difference", "differential", "difficulty", "diffuse", +"dig", "digestion", "digestive", "digger", "digging", "digit", "dignity", "dilapidation", "dill", "dilution", "dime", "dimension", "dimple", +"diner", "dinghy", "dining", "dinner", "dinosaur", "dioxide", "dip", "diploma", "diplomacy", "dipstick", "direction", "directive", "director", +"directory", "dirndl", "dirt", "disability", "disadvantage", "disagreement", "disappointment", "disarmament", "disaster", "discharge", +"discipline", "disclaimer", "disclosure", "disco", "disconnection", "discount", "discourse", "discovery", "discrepancy", "discretion", +"discrimination", "discussion", "disdain", "disease", "disembodiment", "disengagement", "disguise", "disgust", "dish", "dishwasher", +"disk", "disparity", "dispatch", "displacement", "display", "disposal", "disposer", "disposition", "dispute", "disregard", "disruption", +"dissemination", "dissonance", "distance", "distinction", "distortion", "distribution", "distributor", "district", "divalent", "divan", +"diver", "diversity", "divide", "dividend", "divider", "divine", "diving", "division", "divorce", "doc", "dock", "doctor", "doctorate", +"doctrine", "document", "documentary", "documentation", "doe", "dog", "doggie", "dogsled", "dogwood", "doing", "doll", "dollar", "dollop", +"dolman", "dolor", "dolphin", "domain", "dome", "domination", "donation", "donkey", "donor", "donut", "door", "doorbell", "doorknob", +"doorpost", "doorway", "dory", 
"dose", "dot", "double", "doubling", "doubt", "doubter", "dough", "doughnut", "down", "downfall", "downforce", +"downgrade", "download", "downstairs", "downtown", "downturn", "dozen", "draft", "drag", "dragon", "dragonfly", "dragonfruit", "dragster", +"drain", "drainage", "drake", "drama", "dramaturge", "drapes", "draw", "drawbridge", "drawer", "drawing", "dream", "dreamer", "dredger", +"dress", "dresser", "dressing", "drill", "drink", "drinking", "drive", "driver", "driveway", "driving", "drizzle", "dromedary", "drop", +"drudgery", "drug", "drum", "drummer", "drunk", "dryer", "duck", "duckling", "dud", "dude", "due", "duel", "dueling", "duffel", "dugout", +"dulcimer", "dumbwaiter", "dump", "dump truck", "dune", "dune buggy", "dungarees", "dungeon", "duplexer", "duration", "durian", "dusk", +"dust", "dust storm", "duster", "duty", "dwarf", "dwell", "dwelling", "dynamics", "dynamite", "dynamo", "dynasty", "dysfunction", +"eagle", "eaglet", "ear", "eardrum", "earmuffs", "earnings", "earplug", "earring", "earrings", "earth", "earthquake", +"earthworm", "ease", "easel", "east", "eating", "eaves", "eavesdropper", "ecclesia", "echidna", "eclipse", "ecliptic", "ecology", "economics", +"economy", "ecosystem", "ectoderm", "ectodermal", "ecumenist", "eddy", "edge", "edger", "edible", "editing", "edition", "editor", "editorial", +"education", "eel", "effacement", "effect", "effective", "effectiveness", "effector", "efficacy", "efficiency", "effort", "egg", "egghead", +"eggnog", "eggplant", "ego", "eicosanoid", "ejector", "elbow", "elderberry", "election", "electricity", "electrocardiogram", "electronics", +"element", "elephant", "elevation", "elevator", "eleventh", "elf", "elicit", "eligibility", "elimination", "elite", "elixir", "elk", +"ellipse", "elm", "elongation", "elver", "email", "emanate", "embarrassment", "embassy", "embellishment", "embossing", "embryo", "emerald", +"emergence", "emergency", "emergent", "emery", "emission", "emitter", "emotion", "emphasis", "empire", "employ", "employee", "employer", +"employment", "empowerment", "emu", "enactment", "encirclement", "enclave", "enclosure", "encounter", "encouragement", "encyclopedia", +"end", "endive", "endoderm", "endorsement", "endothelium", "endpoint", "enemy", "energy", "enforcement", "engagement", "engine", "engineer", +"engineering", "enigma", "enjoyment", "enquiry", "enrollment", "enterprise", "entertainment", "enthusiasm", "entirety", "entity", "entrance", +"entree", "entrepreneur", "entry", "envelope", "environment", "envy", "enzyme", "epauliere", "epee", "ephemera", "ephemeris", "ephyra", +"epic", "episode", "epithelium", "epoch", "eponym", "epoxy", "equal", "equality", "equation", "equinox", "equipment", "equity", "equivalent", +"era", "eraser", "erection", "erosion", "error", "escalator", "escape", "escort", "espadrille", "espalier", "essay", "essence", "essential", +"establishment", "estate", "estimate", "estrogen", "estuary", "eternity", "ethernet", "ethics", "ethnicity", "ethyl", "euphonium", "eurocentrism", +"evaluation", "evaluator", "evaporation", "eve", "evening", "event", "everybody", "everyone", "everything", "eviction", +"evidence", "evil", "evocation", "evolution", "exaggeration", "exam", "examination", "examiner", "example", +"exasperation", "excellence", "exception", "excerpt", "excess", "exchange", "excitement", "exclamation", "excursion", "excuse", "execution", +"executive", "executor", "exercise", "exhaust", "exhaustion", "exhibit", "exhibition", "exile", "existence", "exit", "exocrine", "expansion", 
+"expansionism", "expectancy", "expectation", "expedition", "expense", "experience", "experiment", "experimentation", "expert", "expertise", +"explanation", "exploration", "explorer", "explosion", "export", "expose", "exposition", "exposure", "expression", "extension", "extent", +"exterior", "external", "extinction", "extreme", "extremist", "eye", "eyeball", "eyebrow", "eyebrows", "eyeglasses", "eyelash", "eyelashes", +"eyelid", "eyelids", "eyeliner", "eyestrain", "eyrie", "fabric", "face", "facelift", "facet", "facility", "facsimile", "fact", "factor", +"factory", "faculty", "fahrenheit", "fail", "failure", "fairness", "fairy", "faith", "faithful", "fall", "fallacy", "fame", +"familiar", "familiarity", "family", "fan", "fang", "fanlight", "fanny", "fantasy", "farm", "farmer", "farming", "farmland", +"farrow", "fascia", "fashion", "fat", "fate", "father", "fatigue", "fatigues", "faucet", "fault", "fav", "fava", "favor", +"favorite", "fawn", "fax", "fear", "feast", "feather", "feature", "fedelini", "federation", "fedora", "fee", "feed", "feedback", "feeding", +"feel", "feeling", "fellow", "felony", "female", "fen", "fence", "fencing", "fender", "feng", "fennel", "ferret", "ferry", "ferryboat", +"fertilizer", "festival", "fetus", "few", "fiber", "fiberglass", "fibre", "fibroblast", "fibrosis", "ficlet", "fiction", "fiddle", "field", +"fiery", "fiesta", "fifth", "fig", "fight", "fighter", "figure", "figurine", "file", "filing", "fill", "fillet", "filly", "film", "filter", +"filth", "final", "finance", "financing", "finding", "fine", "finer", "finger", "fingerling", "fingernail", "finish", "finisher", "fir", +"fire", "fireman", "fireplace", "firewall", "firm", "first", "fish", "fishbone", "fisherman", "fishery", "fishing", "fishmonger", "fishnet", +"fisting", "fit", "fitness", "fix", "fixture", "flag", "flair", "flame", "flan", "flanker", "flare", "flash", "flat", "flatboat", "flavor", +"flax", "fleck", "fledgling", "fleece", "flesh", "flexibility", "flick", "flicker", "flight", "flint", "flintlock", "flock", +"flood", "floodplain", "floor", "floozie", "flour", "flow", "flower", "flu", "flugelhorn", "fluke", "flume", "flung", "flute", "fly", +"flytrap", "foal", "foam", "fob", "focus", "fog", "fold", "folder", "folk", "folklore", "follower", "following", "fondue", "font", "food", +"foodstuffs", "fool", "foot", "footage", "football", "footnote", "footprint", "footrest", "footstep", "footstool", "footwear", "forage", +"forager", "foray", "force", "ford", "forearm", "forebear", "forecast", "forehead", "foreigner", "forelimb", "forest", "forestry", "forever", +"forgery", "fork", "form", "formal", "formamide", "format", "formation", "former", "formicarium", "formula", "fort", "forte", "fortnight", +"fortress", "fortune", "forum", "foundation", "founder", "founding", "fountain", "fourths", "fowl", "fox", "foxglove", "fraction", "fragrance", +"frame", "framework", "fratricide", "fraud", "fraudster", "freak", "freckle", "freedom", "freelance", "freezer", "freezing", "freight", +"freighter", "frenzy", "freon", "frequency", "fresco", "friction", "fridge", "friend", "friendship", "fries", "frigate", "fright", "fringe", +"fritter", "frock", "frog", "front", "frontier", "frost", "frosting", "frown", "fruit", "frustration", "fry", "fuel", "fugato", +"fulfillment", "full", "fun", "function", "functionality", "fund", "funding", "fundraising", "funeral", "fur", "furnace", "furniture", +"furry", "fusarium", "futon", "future", "gadget", "gaffe", "gaffer", "gain", "gaiters", "gale", "gallery", "galley", +"gallon", 
"galoshes", "gambling", "game", "gamebird", "gaming", "gander", "gang", "gap", "garage", "garb", "garbage", "garden", +"garlic", "garment", "garter", "gas", "gasket", "gasoline", "gasp", "gastronomy", "gastropod", "gate", "gateway", "gather", "gathering", +"gator", "gauge", "gauntlet", "gavel", "gazebo", "gazelle", "gear", "gearshift", "geek", "gel", "gelatin", "gelding", "gem", "gemsbok", +"gender", "gene", "general", "generation", "generator", "generosity", "genetics", "genie", "genius", "genocide", "genre", "gentleman", +"geography", "geology", "geometry", "geranium", "gerbil", "gesture", "geyser", "gherkin", "ghost", "giant", "gift", "gig", "gigantism", +"giggle", "ginger", "gingerbread", "ginseng", "giraffe", "girdle", "girl", "girlfriend", "git", "glacier", "gladiolus", "glance", "gland", +"glass", "glasses", "glee", "glen", "glider", "gliding", "glimpse", "globe", "glockenspiel", "gloom", "glory", "glove", "glow", "glucose", +"glue", "glut", "glutamate", "gnat", "gnu", "goal", "goat", "gobbler", "god", "goddess", "godfather", "godmother", "godparent", +"goggles", "going", "gold", "goldfish", "golf", "gondola", "gong", "good", "goodbye", "goodie", "goodness", "goodnight", +"goodwill", "goose", "gopher", "gorilla", "gosling", "gossip", "governance", "government", "governor", "gown", "grace", "grade", +"gradient", "graduate", "graduation", "graffiti", "graft", "grain", "gram", "grammar", "gran", "grand", "grandchild", "granddaughter", +"grandfather", "grandma", "grandmom", "grandmother", "grandpa", "grandparent", "grandson", "granny", "granola", "grant", "grape", "grapefruit", +"graph", "graphic", "grasp", "grass", "grasshopper", "grassland", "gratitude", "gravel", "gravitas", "gravity", "gravy", "gray", "grease", +"greatness", "greed", "green", "greenhouse", "greens", "grenade", "grey", "grid", "grief", +"grill", "grin", "grip", "gripper", "grit", "grocery", "ground", "group", "grouper", "grouse", "grove", "growth", "grub", "guacamole", +"guarantee", "guard", "guava", "guerrilla", "guess", "guest", "guestbook", "guidance", "guide", "guideline", "guilder", "guilt", "guilty", +"guinea", "guitar", "guitarist", "gum", "gumshoe", "gun", "gunpowder", "gutter", "guy", "gym", "gymnast", "gymnastics", "gynaecology", +"gyro", "habit", "habitat", "hacienda", "hacksaw", "hackwork", "hail", "hair", "haircut", "hake", "half", +"halibut", "hall", "halloween", "hallway", "halt", "ham", "hamburger", "hammer", "hammock", "hamster", "hand", "handball", +"handful", "handgun", "handicap", "handle", "handlebar", "handmaiden", "handover", "handrail", "handsaw", "hanger", "happening", "happiness", +"harald", "harbor", "harbour", "hardboard", "hardcover", "hardening", "hardhat", "hardship", "hardware", "hare", "harm", +"harmonica", "harmonise", "harmonize", "harmony", "harp", "harpooner", "harpsichord", "harvest", "harvester", "hash", "hashtag", "hassock", +"haste", "hat", "hatbox", "hatchet", "hatchling", "hate", "hatred", "haunt", "haven", "haversack", "havoc", "hawk", "hay", "haze", "hazel", +"hazelnut", "head", "headache", "headlight", "headline", "headphones", "headquarters", "headrest", "health", "hearing", +"hearsay", "heart", "heartache", "heartbeat", "hearth", "hearthside", "heartwood", "heat", "heater", "heating", "heaven", +"heavy", "hectare", "hedge", "hedgehog", "heel", "heifer", "height", "heir", "heirloom", "helicopter", "helium", "hell", "hellcat", "hello", +"helmet", "helo", "help", "hemisphere", "hemp", "hen", "hepatitis", "herb", "herbs", "heritage", "hermit", "hero", "heroine", "heron", 
+"herring", "hesitation", "hexagon", "heyday", "hiccups", "hide", "hierarchy", "high", "highland", "highlight", +"highway", "hike", "hiking", "hill", "hint", "hip", "hippodrome", "hippopotamus", "hire", "hiring", "historian", "history", "hit", "hive", +"hobbit", "hobby", "hockey", "hoe", "hog", "hold", "holder", "hole", "holiday", "home", "homeland", "homeownership", "hometown", "homework", +"homicide", "homogenate", "homonym", "honesty", "honey", "honeybee", "honeydew", "honor", "honoree", "hood", +"hoof", "hook", "hop", "hope", "hops", "horde", "horizon", "hormone", "horn", "hornet", "horror", "horse", "horseradish", "horst", "hose", +"hosiery", "hospice", "hospital", "hospitalisation", "hospitality", "hospitalization", "host", "hostel", "hostess", "hotdog", "hotel", +"hound", "hour", "hourglass", "house", "houseboat", "household", "housewife", "housework", "housing", "hovel", "hovercraft", "howard", +"howitzer", "hub", "hubcap", "hubris", "hug", "hugger", "hull", "human", "humanity", "humidity", "hummus", "humor", "humour", "hunchback", +"hundred", "hunger", "hunt", "hunter", "hunting", "hurdle", "hurdler", "hurricane", "hurry", "hurt", "husband", "hut", "hutch", "hyacinth", +"hybridisation", "hybridization", "hydrant", "hydraulics", "hydrocarb", "hydrocarbon", "hydrofoil", "hydrogen", "hydrolyse", "hydrolysis", +"hydrolyze", "hydroxyl", "hyena", "hygienic", "hype", "hyphenation", "hypochondria", "hypothermia", "hypothesis", "ice", +"iceberg", "icebreaker", "icecream", "icicle", "icing", "icon", "icy", "id", "idea", "ideal", "identification", "identity", "ideology", +"idiom", "idiot", "igloo", "ignorance", "ignorant", "ikebana", "illegal", "illiteracy", "illness", "illusion", "illustration", "image", +"imagination", "imbalance", "imitation", "immigrant", "immigration", "immortal", "impact", "impairment", "impala", "impediment", "implement", +"implementation", "implication", "import", "importance", "impostor", "impress", "impression", "imprisonment", "impropriety", "improvement", +"impudence", "impulse", "inability", "inauguration", "inbox", "incandescence", "incarnation", "incense", "incentive", +"inch", "incidence", "incident", "incision", "inclusion", "income", "incompetence", "inconvenience", "increase", "incubation", "independence", +"independent", "index", "indication", "indicator", "indigence", "individual", "industrialisation", "industrialization", "industry", "inequality", +"inevitable", "infancy", "infant", "infarction", "infection", "infiltration", "infinite", "infix", "inflammation", "inflation", "influence", +"influx", "info", "information", "infrastructure", "infusion", "inglenook", "ingrate", "ingredient", "inhabitant", "inheritance", "inhibition", +"inhibitor", "initial", "initialise", "initialize", "initiative", "injunction", "injury", "injustice", "ink", "inlay", "inn", "innervation", +"innocence", "innocent", "innovation", "input", "inquiry", "inscription", "insect", "insectarium", "insert", "inside", "insight", "insolence", +"insomnia", "inspection", "inspector", "inspiration", "installation", "instance", "instant", "instinct", "institute", "institution", +"instruction", "instructor", "instrument", "instrumentalist", "instrumentation", "insulation", "insurance", "insurgence", "insurrection", +"integer", "integral", "integration", "integrity", "intellect", "intelligence", "intensity", "intent", "intention", "intentionality", +"interaction", "interchange", "interconnection", "intercourse", "interest", "interface", "interferometer", "interior", "interject", "interloper", 
+"internet", "interpretation", "interpreter", "interval", "intervenor", "intervention", "interview", "interviewer", "intestine", "introduction", +"intuition", "invader", "invasion", "invention", "inventor", "inventory", "inverse", "inversion", "investigation", "investigator", "investment", +"investor", "invitation", "invite", "invoice", "involvement", "iridescence", "iris", "iron", "ironclad", "irony", "irrigation", "ischemia", +"island", "isogloss", "isolation", "issue", "item", "itinerary", "ivory", "jack", "jackal", "jacket", "jackfruit", "jade", "jaguar", +"jail", "jailhouse", "jalapeño", "jam", "jar", "jasmine", "jaw", "jazz", "jealousy", "jeans", "jeep", "jelly", "jellybeans", "jellyfish", +"jerk", "jet", "jewel", "jeweller", "jewellery", "jewelry", "jicama", "jiffy", "job", "jockey", "jodhpurs", "joey", "jogging", "joint", +"joke", "jot", "journal", "journalism", "journalist", "journey", "joy", "judge", "judgment", "judo", "jug", "juggernaut", "juice", "julienne", +"jumbo", "jump", "jumper", "jumpsuit", "jungle", "junior", "junk", "junker", "junket", "jury", "justice", "justification", "jute", "kale", +"kamikaze", "kangaroo", "karate", "kayak", "kazoo", "kebab", "keep", "keeper", "kendo", "kennel", "ketch", "ketchup", "kettle", "kettledrum", +"key", "keyboard", "keyboarding", "keystone", "kick", "kid", "kidney", "kielbasa", "kill", "killer", "killing", "kilogram", +"kilometer", "kilt", "kimono", "kinase", "kind", "kindness", "king", "kingdom", "kingfish", "kiosk", "kiss", "kit", "kitchen", "kite", +"kitsch", "kitten", "kitty", "kiwi", "knee", "kneejerk", "knickers", "knife", "knight", "knitting", "knock", "knot", +"knowledge", "knuckle", "koala", "kohlrabi", "kumquat", "lab", "label", "labor", "laboratory", "laborer", "labour", "labourer", "lace", +"lack", "lacquerware", "lad", "ladder", "ladle", "lady", "ladybug", "lag", "lake", "lamb", "lambkin", "lament", "lamp", "lanai", "land", +"landform", "landing", "landmine", "landscape", "lane", "language", "lantern", "lap", "laparoscope", "lapdog", "laptop", "larch", "lard", +"larder", "lark", "larva", "laryngitis", "lasagna", "lashes", "last", "latency", "latex", "lathe", "latitude", "latte", "latter", "laugh", +"laughter", "laundry", "lava", "law", "lawmaker", "lawn", "lawsuit", "lawyer", "lay", "layer", "layout", "lead", "leader", "leadership", +"leading", "leaf", "league", "leaker", "leap", "learning", "leash", "leather", "leave", "leaver", "lecture", "leek", "leeway", "left", +"leg", "legacy", "legal", "legend", "legging", "legislation", "legislator", "legislature", "legitimacy", "legume", "leisure", "lemon", +"lemonade", "lemur", "lender", "lending", "length", "lens", "lentil", "leopard", "leprosy", "leptocephalus", "lesson", "letter", +"lettuce", "level", "lever", "leverage", "leveret", "liability", "liar", "liberty", "libido", "library", "licence", "license", "licensing", +"licorice", "lid", "lie", "lieu", "lieutenant", "life", "lifestyle", "lifetime", "lift", "ligand", "light", "lighting", "lightning", +"lightscreen", "ligula", "likelihood", "likeness", "lilac", "lily", "limb", "lime", "limestone", "limit", "limitation", "limo", "line", +"linen", "liner", "linguist", "linguistics", "lining", "link", "linkage", "linseed", "lion", "lip", "lipid", "lipoprotein", "lipstick", +"liquid", "liquidity", "liquor", "list", "listening", "listing", "literate", "literature", "litigation", "litmus", "litter", "littleneck", +"liver", "livestock", "living", "lizard", "llama", "load", "loading", "loaf", "loafer", "loan", "lobby", "lobotomy", 
"lobster", "local", +"locality", "location", "lock", "locker", "locket", "locomotive", "locust", "lode", "loft", "log", "loggia", "logic", "login", "logistics", +"logo", "loincloth", "lollipop", "loneliness", "longboat", "longitude", "look", "lookout", "loop", "loophole", "loquat", "lord", "loss", +"lot", "lotion", "lottery", "lounge", "louse", "lout", "love", "lover", "lox", "loyalty", "luck", "luggage", "lumber", "lumberman", "lunch", +"luncheonette", "lunchmeat", "lunchroom", "lung", "lunge", "lust", "lute", "luxury", "lychee", "lycra", "lye", "lymphocyte", "lynx", +"lyocell", "lyre", "lyrics", "lysine", "mRNA", "macadamia", "macaroni", "macaroon", "macaw", "machine", "machinery", "macrame", "macro", +"macrofauna", "madam", "maelstrom", "maestro", "magazine", "maggot", "magic", "magnet", "magnitude", "maid", "maiden", "mail", "mailbox", +"mailer", "mailing", "mailman", "main", "mainland", "mainstream", "maintainer", "maintenance", "maize", "major", "majority", +"makeover", "maker", "makeup", "making", "male", "malice", "mall", "mallard", "mallet", "malnutrition", "mama", "mambo", "mammoth", "man", +"manacle", "management", "manager", "manatee", "mandarin", "mandate", "mandolin", "mangle", "mango", "mangrove", "manhunt", "maniac", +"manicure", "manifestation", "manipulation", "mankind", "manner", "manor", "mansard", "manservant", "mansion", "mantel", "mantle", "mantua", +"manufacturer", "manufacturing", "many", "map", "maple", "mapping", "maracas", "marathon", "marble", "march", "mare", "margarine", "margin", +"mariachi", "marimba", "marines", "marionberry", "mark", "marker", "market", "marketer", "marketing", "marketplace", "marksman", "markup", +"marmalade", "marriage", "marsh", "marshland", "marshmallow", "marten", "marxism", "mascara", "mask", "masonry", "mass", "massage", "mast", +"master", "masterpiece", "mastication", "mastoid", "mat", "match", "matchmaker", "mate", "material", "maternity", "math", "mathematics", +"matrix", "matter", "mattock", "mattress", "max", "maximum", "maybe", "mayonnaise", "mayor", "meadow", "meal", "mean", "meander", "meaning", +"means", "meantime", "measles", "measure", "measurement", "meat", "meatball", "meatloaf", "mecca", "mechanic", "mechanism", "med", "medal", +"media", "median", "medication", "medicine", "medium", "meet", "meeting", "melatonin", "melody", "melon", "member", "membership", "membrane", +"meme", "memo", "memorial", "memory", "men", "menopause", "menorah", "mention", "mentor", "menu", "merchandise", "merchant", "mercury", +"meridian", "meringue", "merit", "mesenchyme", "mess", "message", "messenger", "messy", "metabolite", "metal", "metallurgist", "metaphor", +"meteor", "meteorology", "meter", "methane", "method", "methodology", "metric", "metro", "metronome", "mezzanine", "microlending", "micronutrient", +"microphone", "microwave", "midden", "middle", "middleman", "midline", "midnight", "midwife", "might", "migrant", "migration", +"mile", "mileage", "milepost", "milestone", "military", "milk", "milkshake", "mill", "millennium", "millet", "millimeter", "million", +"millisecond", "millstone", "mime", "mimosa", "min", "mincemeat", "mind", "mine", "mineral", "mineshaft", "mini", "minibus", +"minimalism", "minimum", "mining", "minion", "minister", "mink", "minnow", "minor", "minority", "mint", "minute", "miracle", +"mirror", "miscarriage", "miscommunication", "misfit", "misnomer", "misogyny", "misplacement", "misreading", "misrepresentation", "miss", +"missile", "mission", "missionary", "mist", "mistake", "mister", "misunderstand", "miter", 
"mitten", "mix", "mixer", "mixture", "moai", +"moat", "mob", "mobile", "mobility", "mobster", "moccasins", "mocha", "mochi", "mode", "model", "modeling", "modem", "modernist", "modernity", +"modification", "molar", "molasses", "molding", "mole", "molecule", "mom", "moment", "monastery", "monasticism", "money", "monger", "monitor", +"monitoring", "monk", "monkey", "monocle", "monopoly", "monotheism", "monsoon", "monster", "month", "monument", "mood", "moody", "moon", +"moonlight", "moonscape", "moonshine", "moose", "mop", "morale", "morbid", "morbidity", "morning", "moron", "morphology", "morsel", "mortal", +"mortality", "mortgage", "mortise", "mosque", "mosquito", "most", "motel", "moth", "mother", "motion", "motivation", +"motive", "motor", "motorboat", "motorcar", "motorcycle", "mound", "mountain", "mouse", "mouser", "mousse", "moustache", "mouth", "mouton", +"movement", "mover", "movie", "mower", "mozzarella", "mud", "muffin", "mug", "mukluk", "mule", "multimedia", "murder", "muscat", "muscatel", +"muscle", "musculature", "museum", "mushroom", "music", "musician", "muskrat", "mussel", "mustache", "mustard", +"mutation", "mutt", "mutton", "mycoplasma", "mystery", "myth", "mythology", "nail", "name", "naming", "nanoparticle", "napkin", "narrative", +"nasal", "nation", "nationality", "native", "naturalisation", "nature", "navigation", "necessity", "neck", "necklace", "necktie", "nectar", +"nectarine", "need", "needle", "neglect", "negligee", "negotiation", "neighbor", "neighborhood", "neighbour", "neighbourhood", "neologism", +"neon", "neonate", "nephew", "nerve", "nest", "nestling", "nestmate", "net", "netball", "netbook", "netsuke", "network", "networking", +"neurobiologist", "neuron", "neuropathologist", "neuropsychiatry", "news", "newsletter", "newspaper", "newsprint", "newsstand", "nexus", +"nibble", "nicety", "niche", "nick", "nickel", "nickname", "niece", "night", "nightclub", "nightgown", "nightingale", "nightlife", "nightlight", +"nightmare", "ninja", "nit", "nitrogen", "nobody", "nod", "node", "noir", "noise", "nonbeliever", "nonconformist", "nondisclosure", "nonsense", +"noodle", "noodles", "noon", "norm", "normal", "normalisation", "normalization", "north", "nose", "notation", "note", "notebook", "notepad", +"nothing", "notice", "notion", "notoriety", "nougat", "noun", "nourishment", "novel", "nucleotidase", "nucleotide", "nudge", "nuke", +"number", "numeracy", "numeric", "numismatist", "nun", "nurse", "nursery", "nursing", "nurture", "nut", "nutmeg", "nutrient", "nutrition", +"nylon", "nymph", "oak", "oar", "oasis", "oat", "oatmeal", "oats", "obedience", "obesity", "obi", "object", "objection", "objective", +"obligation", "oboe", "observation", "observatory", "obsession", "obsidian", "obstacle", "occasion", "occupation", "occurrence", "ocean", +"ocelot", "octagon", "octave", "octavo", "octet", "octopus", "odometer", "odyssey", "oeuvre", "offence", "offense", "offer", +"offering", "office", "officer", "official", "offset", "oil", "okra", "oldie", "oleo", "olive", "omega", "omelet", "omission", "omnivore", +"oncology", "onion", "online", "onset", "opening", "opera", "operating", "operation", "operator", "ophthalmologist", "opinion", "opium", +"opossum", "opponent", "opportunist", "opportunity", "opposite", "opposition", "optimal", "optimisation", "optimist", "optimization", +"option", "orange", "orangutan", "orator", "orchard", "orchestra", "orchid", "order", "ordinary", "ordination", "ore", "oregano", "organ", +"organisation", "organising", "organization", "organizing", 
"orient", "orientation", "origin", "original", "originality", "ornament", +"osmosis", "osprey", "ostrich", "other", "otter", "ottoman", "ounce", "outback", "outcome", "outfielder", "outfit", "outhouse", "outlaw", +"outlay", "outlet", "outline", "outlook", "output", "outrage", "outrigger", "outrun", "outset", "outside", "oval", "ovary", "oven", "overcharge", +"overclocking", "overcoat", "overexertion", "overflight", "overhead", "overheard", "overload", "overnighter", "overshoot", "oversight", +"overview", "overweight", "owl", "owner", "ownership", "ox", "oxford", "oxygen", "oyster", "ozone", "pace", "pacemaker", "pack", "package", +"packaging", "packet", "pad", "paddle", "paddock", "pagan", "page", "pagoda", "pail", "pain", "paint", "painter", "painting", "paintwork", +"pair", "pajamas", "palace", "palate", "palm", "pamphlet", "pan", "pancake", "pancreas", "panda", "panel", "panic", "pannier", "panpipe", +"pansy", "panther", "panties", "pantologist", "pantology", "pantry", "pants", "pantsuit", "panty", "pantyhose", "papa", "papaya", "paper", +"paperback", "paperwork", "parable", "parachute", "parade", "paradise", "paragraph", "parallelogram", "paramecium", "paramedic", "parameter", +"paranoia", "parcel", "parchment", "pard", "pardon", "parent", "parenthesis", "parenting", "park", "parka", "parking", "parliament", +"parole", "parrot", "parser", "parsley", "parsnip", "part", "participant", "participation", "particle", "particular", "partner", "partnership", +"partridge", "party", "pass", "passage", "passbook", "passenger", "passing", "passion", "passive", "passport", "password", "past", "pasta", +"paste", "pastor", "pastoralist", "pastry", "pasture", "pat", "patch", "pate", "patent", "patentee", "path", "pathogenesis", "pathology", +"pathway", "patience", "patient", "patina", "patio", "patriarch", "patrimony", "patriot", "patrol", "patroller", "patrolling", "patron", +"pattern", "patty", "pattypan", "pause", "pavement", "pavilion", "paw", "pawnshop", "pay", "payee", "payment", "payoff", "pea", "peace", +"peach", "peacoat", "peacock", "peak", "peanut", "pear", "pearl", "peasant", "pecan", "pecker", "pedal", "peek", "peen", "peer", +"pegboard", "pelican", "pelt", "pen", "penalty", "pence", "pencil", "pendant", "pendulum", "penguin", "penicillin", "peninsula", "penis", +"pennant", "penny", "pension", "pentagon", "peony", "people", "pepper", "pepperoni", "percent", "percentage", "perception", "perch", +"perennial", "perfection", "performance", "perfume", "period", "periodical", "peripheral", "permafrost", "permission", "permit", "perp", +"perpendicular", "persimmon", "person", "personal", "personality", "personnel", "perspective", "pest", "pet", "petal", "petition", "petitioner", +"petticoat", "pew", "pharmacist", "pharmacopoeia", "phase", "pheasant", "phenomenon", "phenotype", "pheromone", "philanthropy", "philosopher", +"philosophy", "phone", "phosphate", "photo", "photodiode", "photograph", "photographer", "photography", "photoreceptor", "phrase", "phrasing", +"physical", "physics", "physiology", "pianist", "piano", "piccolo", "pick", "pickax", "pickaxe", "picket", "pickle", "pickup", "picnic", +"picture", "picturesque", "pie", "piece", "pier", "piety", "pig", "pigeon", "piglet", "pigpen", "pigsty", "pike", "pilaf", "pile", "pilgrim", +"pilgrimage", "pill", "pillar", "pillbox", "pillow", "pilot", "pimp", "pimple", "pin", "pinafore", "pine", "pineapple", +"pinecone", "ping", "pink", "pinkie", "pinot", "pinstripe", "pint", "pinto", "pinworm", "pioneer", "pipe", "pipeline", "piracy", "pirate", 
+"pistol", "pit", "pita", "pitch", "pitcher", "pitching", "pith", "pizza", "place", "placebo", "placement", "placode", "plagiarism", +"plain", "plaintiff", "plan", "plane", "planet", "planning", "plant", "plantation", "planter", "planula", "plaster", "plasterboard", +"plastic", "plate", "platelet", "platform", "platinum", "platter", "platypus", "play", "player", "playground", "playroom", "playwright", +"plea", "pleasure", "pleat", "pledge", "plenty", "plier", "pliers", "plight", "plot", "plough", "plover", "plow", "plowman", "plug", +"plugin", "plum", "plumber", "plume", "plunger", "plywood", "pneumonia", "pocket", "pocketbook", "pod", "podcast", "poem", +"poet", "poetry", "poignance", "point", "poison", "poisoning", "poker", "polarisation", "polarization", "pole", "polenta", "police", +"policeman", "policy", "polish", "politician", "politics", "poll", "polliwog", "pollutant", "pollution", "polo", "polyester", "polyp", +"pomegranate", "pomelo", "pompom", "poncho", "pond", "pony", "pool", "poor", "pop", "popcorn", "poppy", "popsicle", "popularity", "population", +"populist", "porcelain", "porch", "porcupine", "pork", "porpoise", "port", "porter", "portfolio", "porthole", "portion", "portrait", +"position", "possession", "possibility", "possible", "post", "postage", "postbox", "poster", "posterior", "postfix", "pot", "potato", +"potential", "pottery", "potty", "pouch", "poultry", "pound", "pounding", "poverty", "powder", "power", "practice", "practitioner", "prairie", +"praise", "pray", "prayer", "precedence", "precedent", "precipitation", "precision", "predecessor", "preface", "preference", "prefix", +"pregnancy", "prejudice", "prelude", "premeditation", "premier", "premise", "premium", "preoccupation", "preparation", "prescription", +"presence", "present", "presentation", "preservation", "preserves", "presidency", "president", "press", "pressroom", "pressure", "pressurisation", +"pressurization", "prestige", "presume", "pretzel", "prevalence", "prevention", "prey", "price", "pricing", "pride", "priest", "priesthood", +"primary", "primate", "prince", "princess", "principal", "principle", "print", "printer", "printing", "prior", "priority", "prison", +"prisoner", "privacy", "private", "privilege", "prize", "prizefight", "probability", "probation", "probe", "problem", "procedure", "proceedings", +"process", "processing", "processor", "proctor", "procurement", "produce", "producer", "product", "production", "productivity", "profession", +"professional", "professor", "profile", "profit", "progenitor", "program", "programme", "programming", "progress", "progression", "prohibition", +"project", "proliferation", "promenade", "promise", "promotion", "prompt", "pronoun", "pronunciation", "proof", "propaganda", +"propane", "property", "prophet", "proponent", "proportion", "proposal", "proposition", "proprietor", "prose", "prosecution", "prosecutor", +"prospect", "prosperity", "prostacyclin", "prostanoid", "prostrate", "protection", "protein", "protest", "protocol", "providence", "provider", +"province", "provision", "prow", "proximal", "proximity", "prune", "pruner", "pseudocode", "pseudoscience", "psychiatrist", "psychoanalyst", +"psychologist", "psychology", "ptarmigan", "pub", "public", "publication", "publicity", "publisher", "publishing", "pudding", "puddle", +"puffin", "pug", "puggle", "pulley", "pulse", "puma", "pump", "pumpernickel", "pumpkin", "pumpkinseed", "pun", "punch", "punctuation", +"punishment", "pup", "pupa", "pupil", "puppet", "puppy", "purchase", "puritan", "purity", "purple", 
"purpose", "purr", "purse", "pursuit", +"push", "pusher", "put", "puzzle", "pyramid", "pyridine", "quadrant", "quail", "qualification", "quality", "quantity", "quart", "quarter", +"quartet", "quartz", "queen", "query", "quest", "question", "questioner", "questionnaire", "quiche", "quicksand", "quiet", "quill", "quilt", +"quince", "quinoa", "quit", "quiver", "quota", "quotation", "quote", "rabbi", "rabbit", "raccoon", "race", "racer", "racing", "racism", +"racist", "rack", "radar", "radiator", "radio", "radiosonde", "radish", "raffle", "raft", "rag", "rage", "raid", "rail", "railing", "railroad", +"railway", "raiment", "rain", "rainbow", "raincoat", "rainmaker", "rainstorm", "rainy", "raise", "raisin", "rake", "rally", "ram", "rambler", +"ramen", "ramie", "ranch", "rancher", "randomisation", "randomization", "range", "ranger", "rank", "rap", "rape", "raspberry", "rat", +"rate", "ratepayer", "rating", "ratio", "rationale", "rations", "raven", "ravioli", "rawhide", "ray", "rayon", "razor", "reach", "reactant", +"reaction", "read", "reader", "readiness", "reading", "real", "reality", "realization", "realm", "reamer", "rear", "reason", "reasoning", +"rebel", "rebellion", "reboot", "recall", "recapitulation", "receipt", "receiver", "reception", "receptor", "recess", "recession", "recipe", +"recipient", "reciprocity", "reclamation", "recliner", "recognition", "recollection", "recommendation", "reconsideration", "record", +"recorder", "recording", "recovery", "recreation", "recruit", "rectangle", "red", "redesign", "redhead", "redirect", "rediscovery", "reduction", +"reef", "refectory", "reference", "referendum", "reflection", "reform", "refreshments", "refrigerator", "refuge", "refund", "refusal", +"refuse", "regard", "regime", "region", "regionalism", "register", "registration", "registry", "regret", "regulation", "regulator", +"rehospitalization", "reindeer", "reinscription", "reject", "relation", "relationship", "relative", "relaxation", "relay", "release", +"reliability", "relief", "religion", "relish", "reluctance", "remains", "remark", "reminder", "remnant", "remote", "removal", "renaissance", +"rent", "reorganisation", "reorganization", "repair", "reparation", "repayment", "repeat", "replacement", "replica", "replication", "reply", +"report", "reporter", "reporting", "repository", "representation", "representative", "reprocessing", "republic", "republican", "reputation", +"request", "requirement", "resale", "rescue", "research", "researcher", "resemblance", "reservation", "reserve", "reservoir", "reset", +"residence", "resident", "residue", "resist", "resistance", "resolution", "resolve", "resort", "resource", "respect", "respite", "response", +"responsibility", "rest", "restaurant", "restoration", "restriction", "restroom", "restructuring", "result", "resume", "retailer", "retention", +"rethinking", "retina", "retirement", "retouching", "retreat", "retrospect", "retrospective", "retrospectivity", "return", "reunion", +"revascularisation", "revascularization", "reveal", "revelation", "revenant", "revenge", "revenue", "reversal", "reverse", "review", +"revitalisation", "revitalization", "revival", "revolution", "revolver", "reward", "rhetoric", "rheumatism", "rhinoceros", "rhubarb", +"rhyme", "rhythm", "rib", "ribbon", "rice", "riddle", "ride", "rider", "ridge", "riding", "rifle", "right", "rim", "ring", "ringworm", +"riot", "rip", "ripple", "rise", "riser", "risk", "rite", "ritual", "river", "riverbed", "rivulet", "road", "roadway", "roar", "roast", +"robe", "robin", "robot", "robotics", 
"rock", "rocker", "rocket", "rod", "role", "roll", "roller", "romaine", "romance", +"roof", "room", "roommate", "rooster", "root", "rope", "rose", "rosemary", "roster", "rostrum", "rotation", "round", "roundabout", "route", +"router", "routine", "row", "rowboat", "rowing", "rubber", "rubric", "ruby", "ruckus", "rudiment", "ruffle", "rug", "rugby", +"ruin", "rule", "ruler", "ruling", "rum", "rumor", "run", "runaway", "runner", "running", "runway", "rush", "rust", "rutabaga", "rye", +"sabre", "sac", "sack", "saddle", "sadness", "safari", "safe", "safeguard", "safety", "saffron", "sage", "sail", "sailboat", "sailing", +"sailor", "saint", "sake", "salad", "salami", "salary", "sale", "salesman", "salmon", "salon", "saloon", "salsa", "salt", "salute", "samovar", +"sampan", "sample", "samurai", "sanction", "sanctity", "sanctuary", "sand", "sandal", "sandbar", "sandpaper", "sandwich", "sanity", "sardine", +"sari", "sarong", "sash", "satellite", "satin", "satire", "satisfaction", "sauce", "saucer", "sauerkraut", "sausage", "savage", "savannah", +"saving", "savings", "savior", "saviour", "savory", "saw", "saxophone", "scaffold", "scale", "scallion", "scallops", "scalp", "scam", +"scanner", "scarecrow", "scarf", "scarification", "scenario", "scene", "scenery", "scent", "schedule", "scheduling", "schema", "scheme", +"schizophrenic", "schnitzel", "scholar", "scholarship", "school", "schoolhouse", "schooner", "science", "scientist", "scimitar", "scissors", +"scooter", "scope", "score", "scorn", "scorpion", "scotch", "scout", "scow", "scrambled", "scrap", "scraper", "scratch", "screamer", +"screen", "screening", "screenwriting", "screw", "screwdriver", "scrim", "scrip", "script", "scripture", "scrutiny", "sculpting", +"sculptural", "sculpture", "sea", "seabass", "seafood", "seagull", "seal", "seaplane", "search", "seashore", "seaside", "season", "seat", +"seaweed", "second", "secrecy", "secret", "secretariat", "secretary", "secretion", "section", "sectional", "sector", "security", "sediment", +"seed", "seeder", "seeker", "seep", "segment", "seizure", "selection", "self", "seller", +"selling", "semantics", "semester", "semicircle", "semicolon", "semiconductor", "seminar", "senate", "senator", "sender", "senior", "sense", +"sensibility", "sensitive", "sensitivity", "sensor", "sentence", "sentencing", "sentiment", "sepal", "separation", "septicaemia", "sequel", +"sequence", "serial", "series", "sermon", "serum", "serval", "servant", "server", "service", "servitude", "sesame", "session", "set", +"setback", "setting", "settlement", "settler", "severity", "sewer", "sex", "sexuality", "shack", "shackle", "shade", "shadow", "shadowbox", +"shakedown", "shaker", "shallot", "shallows", "shame", "shampoo", "shanty", "shape", "share", "shareholder", "shark", "shaw", "shawl", +"shear", "shearling", "sheath", "shed", "sheep", "sheet", "shelf", "shell", "shelter", "sherbet", "sherry", "shield", "shift", "shin", +"shine", "shingle", "ship", "shipper", "shipping", "shipyard", "shirt", "shirtdress", "shoat", "shock", "shoe", +"shoehorn", "shoelace", "shoemaker", "shoes", "shoestring", "shofar", "shoot", "shootdown", "shop", "shopper", "shopping", "shore", "shoreline", +"short", "shortage", "shorts", "shortwave", "shot", "shoulder", "shout", "shovel", "show", "shower", "shred", "shrimp", +"shrine", "shutdown", "sibling", "sick", "sickness", "side", "sideboard", "sideburns", "sidecar", "sidestream", "sidewalk", "siding", +"siege", "sigh", "sight", "sightseeing", "sign", "signal", "signature", "signet", "significance", 
"signify", "signup", "silence", "silica", +"silicon", "silk", "silkworm", "sill", "silly", "silo", "silver", "similarity", "simple", "simplicity", "simplification", "simvastatin", +"sin", "singer", "singing", "singular", "sink", "sinuosity", "sip", "sir", "sister", "sitar", "site", "situation", "size", +"skate", "skating", "skean", "skeleton", "ski", "skiing", "skill", "skin", "skirt", "skull", "skullcap", "skullduggery", "skunk", "sky", +"skylight", "skyline", "skyscraper", "skywalk", "slang", "slapstick", "slash", "slate", "slavery", "slaw", "sled", "sledge", +"sleep", "sleepiness", "sleeping", "sleet", "sleuth", "slice", "slide", "slider", "slime", "slip", "slipper", "slippers", "slope", "slot", +"sloth", "slump", "smell", "smelting", "smile", "smith", "smock", "smog", "smoke", "smoking", "smolt", "smuggling", "snack", "snail", +"snake", "snakebite", "snap", "snarl", "sneaker", "sneakers", "sneeze", "sniffle", "snob", "snorer", "snow", "snowboarding", "snowflake", +"snowman", "snowmobiling", "snowplow", "snowstorm", "snowsuit", "snuck", "snug", "snuggle", "soap", "soccer", "socialism", "socialist", +"society", "sociology", "sock", "socks", "soda", "sofa", "softball", "softdrink", "softening", "software", "soil", "soldier", "sole", +"solicitation", "solicitor", "solidarity", "solidity", "soliloquy", "solitaire", "solution", "solvency", "sombrero", "somebody", "someone", +"someplace", "somersault", "something", "somewhere", "son", "sonar", "sonata", "song", "songbird", "sonnet", "soot", "sophomore", "soprano", +"sorbet", "sorghum", "sorrel", "sorrow", "sort", "soul", "soulmate", "sound", "soundness", "soup", "source", "sourwood", "sousaphone", +"south", "southeast", "souvenir", "sovereignty", "sow", "soy", "soybean", "space", "spacing", "spade", "spaghetti", "span", "spandex", +"spank", "sparerib", "spark", "sparrow", "spasm", "spat", "spatula", "spawn", "speaker", "speakerphone", "speaking", "spear", "spec", +"special", "specialist", "specialty", "species", "specification", "spectacle", "spectacles", "spectrograph", "spectrum", "speculation", +"speech", "speed", "speedboat", "spell", "spelling", "spelt", "spending", "sphere", "sphynx", "spice", "spider", "spiderling", "spike", +"spill", "spinach", "spine", "spiral", "spirit", "spiritual", "spirituality", "spit", "spite", "spleen", "splendor", "split", "spokesman", +"spokeswoman", "sponge", "sponsor", "sponsorship", "spool", "spoon", "spork", "sport", "sportsman", "spot", "spotlight", "spouse", "sprag", +"sprat", "spray", "spread", "spreadsheet", "spree", "spring", "sprinkles", "sprinter", "sprout", "spruce", "spud", "spume", "spur", "spy", +"spyglass", "square", "squash", "squatter", "squeegee", "squid", "squirrel", "stab", "stability", "stable", "stack", "stacking", "stadium", +"staff", "stag", "stage", "stain", "stair", "staircase", "stake", "stalk", "stall", "stallion", "stamen", "stamina", "stamp", "stance", +"stand", "standard", "standardisation", "standardization", "standing", "standoff", "standpoint", "star", "starboard", "start", "starter", +"state", "statement", "statin", "station", "statistic", "statistics", "statue", "status", "statute", "stay", "steak", +"stealth", "steam", "steamroller", "steel", "steeple", "stem", "stench", "stencil", "step", +"stepdaughter", "stepmother", +"stepson", "stereo", "stew", "steward", "stick", "sticker", "stiletto", "still", "stimulation", "stimulus", "sting", +"stinger", "stitch", "stitcher", "stock", "stockings", "stole", "stomach", "stone", "stonework", "stool", +"stop", "stopsign", 
"stopwatch", "storage", "store", "storey", "storm", "story", "storyboard", "stot", "stove", "strait", +"strand", "stranger", "strap", "strategy", "straw", "strawberry", "strawman", "stream", "street", "streetcar", "strength", "stress", +"stretch", "strife", "strike", "string", "strip", "stripe", "strobe", "stroke", "structure", "strudel", "struggle", "stucco", "stud", +"student", "studio", "study", "stuff", "stumbling", "stump", "stupidity", "sturgeon", "sty", "style", "styling", "stylus", "sub", "subcomponent", +"subconscious", "subcontractor", "subexpression", "subgroup", "subject", "submarine", "submitter", "subprime", "subroutine", "subscription", +"subsection", "subset", "subsidence", "subsidiary", "subsidy", "substance", "substitution", "subtitle", "suburb", "subway", "success", +"succotash", "suck", "sucker", "suede", "suet", "suffocation", "sugar", "suggestion", "suicide", "suit", "suitcase", "suite", "sulfur", +"sultan", "sum", "summary", "summer", "summit", "sun", "sunbeam", "sunbonnet", "sundae", "sunday", "sundial", "sunflower", "sunglasses", +"sunlamp", "sunlight", "sunrise", "sunroom", "sunset", "sunshine", "superiority", "supermarket", "supernatural", "supervision", "supervisor", +"supper", "supplement", "supplier", "supply", "support", "supporter", "suppression", "supreme", "surface", "surfboard", "surge", "surgeon", +"surgery", "surname", "surplus", "surprise", "surround", "surroundings", "surrounds", "survey", "survival", "survivor", "sushi", "suspect", +"suspenders", "suspension", "sustainment", "sustenance", "swallow", "swamp", "swan", "swanling", "swath", "sweat", "sweater", "sweatshirt", +"sweatshop", "sweatsuit", "sweets", "swell", "swim", "swimming", "swimsuit", "swine", "swing", "switch", "switchboard", "switching", +"swivel", "sword", "swordfight", "swordfish", "sycamore", "symbol", "symmetry", "sympathy", "symptom", "syndicate", "syndrome", "synergy", +"synod", "synonym", "synthesis", "syrup", "system", "tab", "tabby", "tabernacle", "table", "tablecloth", "tablet", "tabletop", +"tachometer", "tackle", "taco", "tactics", "tactile", "tadpole", "tag", "tail", "tailbud", "tailor", "tailspin", "takeover", +"tale", "talent", "talk", "talking", "tamale", "tambour", "tambourine", "tan", "tandem", "tangerine", "tank", +"tanker", "tankful", "tap", "tape", "tapioca", "target", "taro", "tarragon", "tart", "task", "tassel", "taste", "tatami", "tattler", +"tattoo", "tavern", "tax", "taxi", "taxicab", "taxpayer", "tea", "teacher", "teaching", "team", "teammate", "teapot", "tear", "tech", +"technician", "technique", "technologist", "technology", "tectonics", "teen", "teenager", "teepee", "telephone", "telescreen", "teletype", +"television", "tell", "teller", "temp", "temper", "temperature", "temple", "tempo", "temporariness", "temporary", "temptation", "temptress", +"tenant", "tendency", "tender", "tenement", "tenet", "tennis", "tenor", "tension", "tensor", "tent", "tentacle", "tenth", "tepee", "teriyaki", +"term", "terminal", "termination", "terminology", "termite", "terrace", "terracotta", "terrapin", "terrarium", "territory", "terror", +"terrorism", "terrorist", "test", "testament", "testimonial", "testimony", "testing", "text", "textbook", "textual", "texture", "thanks", +"thaw", "theater", "theft", "theism", "theme", "theology", "theory", "therapist", "therapy", "thermals", "thermometer", "thermostat", +"thesis", "thickness", "thief", "thigh", "thing", "thinking", "thirst", "thistle", "thong", "thongs", "thorn", "thought", "thousand", +"thread", "threat", "threshold", 
"thrift", "thrill", "throat", "throne", "thrush", "thrust", "thug", "thumb", "thump", "thunder", "thunderbolt", +"thunderhead", "thunderstorm", "thyme", "tiara", "tic", "tick", "ticket", "tide", "tie", "tiger", "tights", "tile", "till", "tilt", "timbale", +"timber", "time", "timeline", "timeout", "timer", "timetable", "timing", "timpani", "tin", "tinderbox", "tinkle", "tintype", "tip", "tire", +"tissue", "titanium", "title", "toad", "toast", "toaster", "tobacco", "today", "toe", "toenail", "toffee", "tofu", "tog", "toga", "toilet", +"tolerance", "tolerant", "toll", "tomatillo", "tomato", "tomb", "tomography", "tomorrow", "ton", "tonality", "tone", "tongue", +"tonic", "tonight", "tool", "toot", "tooth", "toothbrush", "toothpaste", "toothpick", "top", "topic", "topsail", "toque", +"toreador", "tornado", "torso", "torte", "tortellini", "tortilla", "tortoise", "tosser", "total", "tote", "touch", "tour", +"tourism", "tourist", "tournament", "towel", "tower", "town", "townhouse", "township", "toy", "trace", "trachoma", "track", +"tracking", "tracksuit", "tract", "tractor", "trade", "trader", "trading", "tradition", "traditionalism", "traffic", "trafficker", "tragedy", +"trail", "trailer", "trailpatrol", "train", "trainer", "training", "trait", "tram", "tramp", "trance", "transaction", "transcript", "transfer", +"transformation", "transit", "transition", "translation", "transmission", "transom", "transparency", "transplantation", "transport", +"transportation", "trap", "trapdoor", "trapezium", "trapezoid", "trash", "travel", "traveler", "tray", "treasure", "treasury", "treat", +"treatment", "treaty", "tree", "trek", "trellis", "tremor", "trench", "trend", "triad", "trial", "triangle", "tribe", "tributary", "trick", +"trigger", "trigonometry", "trillion", "trim", "trinket", "trip", "tripod", "tritone", "triumph", "trolley", "trombone", "troop", "trooper", +"trophy", "trouble", "trousers", "trout", "trove", "trowel", "truck", "trumpet", "trunk", "trust", "trustee", "truth", "try", "tsunami", +"tub", "tuba", "tube", "tuber", "tug", "tugboat", "tuition", "tulip", "tumbler", "tummy", "tuna", "tune", "tunic", "tunnel", +"turban", "turf", "turkey", "turmeric", "turn", "turning", "turnip", "turnover", "turnstile", "turret", "turtle", "tusk", "tussle", "tutu", +"tuxedo", "tweet", "tweezers", "twig", "twilight", "twine", "twins", "twist", "twister", "twitter", "type", "typeface", "typewriter", +"typhoon", "ukulele", "ultimatum", "umbrella", "unblinking", "uncertainty", "uncle", "underclothes", "underestimate", "underground", +"underneath", "underpants", "underpass", "undershirt", "understanding", "understatement", "undertaker", "underwear", "underweight", "underwire", +"underwriting", "unemployment", "unibody", "uniform", "uniformity", "union", "unique", "unit", "unity", "universe", "university", "update", +"upgrade", "uplift", "upper", "upstairs", "upward", "urge", "urgency", "urn", "usage", "use", "user", "usher", "usual", "utensil", "utilisation", +"utility", "utilization", "vacation", "vaccine", "vacuum", "vagrant", "valance", "valentine", "validate", "validity", "valley", "valuable", +"value", "vampire", "van", "vanadyl", "vane", "vanilla", "vanity", "variability", "variable", "variant", "variation", "variety", "vascular", +"vase", "vault", "vaulting", "veal", "vector", "vegetable", "vegetarian", "vegetarianism", "vegetation", "vehicle", "veil", "vein", "veldt", +"vellum", "velocity", "velodrome", "velvet", "vendor", "veneer", "vengeance", "venison", "venom", "venti", "venture", "venue", "veranda", 
+"verb", "verdict", "verification", "vermicelli", "vernacular", "verse", "version", "vertigo", "verve", "vessel", "vest", "vestment", +"vet", "veteran", "veterinarian", "veto", "viability", "vibe", "vibraphone", "vibration", "vibrissae", "vice", "vicinity", "victim", +"victory", "video", "view", "viewer", "vignette", "villa", "village", "vine", "vinegar", "vineyard", "vintage", "vintner", "vinyl", "viola", +"violation", "violence", "violet", "violin", "virginal", "virtue", "virus", "visa", "viscose", "vise", "vision", "visit", "visitor", +"visor", "vista", "visual", "vitality", "vitamin", "vitro", "vivo", "vixen", "vodka", "vogue", "voice", "void", "vol", "volatility", +"volcano", "volleyball", "volume", "volunteer", "volunteering", "vomit", "vote", "voter", "voting", "voyage", "vulture", "wad", "wafer", +"waffle", "wage", "wagon", "waist", "waistband", "wait", "waiter", "waiting", "waitress", "waiver", "wake", "walk", "walker", "walking", +"walkway", "wall", "wallaby", "wallet", "walnut", "walrus", "wampum", "wannabe", "want", "war", "warden", "wardrobe", "warfare", "warlock", +"warlord", "warming", "warmth", "warning", "warrant", "warren", "warrior", "wasabi", "wash", "washbasin", "washcloth", "washer", +"washtub", "wasp", "waste", "wastebasket", "wasting", "watch", "watcher", "watchmaker", "water", "waterbed", "watercress", "waterfall", +"waterfront", "watermelon", "waterskiing", "waterspout", "waterwheel", "wave", "waveform", "wax", "way", "weakness", "wealth", "weapon", +"wear", "weasel", "weather", "web", "webinar", "webmail", "webpage", "website", "wedding", "wedge", "weed", "weeder", "weedkiller", "week", +"weekend", "weekender", "weight", "weird", "welcome", "welfare", "well", "west", "western", "wetland", "wetsuit", +"whack", "whale", "wharf", "wheat", "wheel", "whelp", "whey", "whip", "whirlpool", "whirlwind", "whisker", "whiskey", "whisper", "whistle", +"white", "whole", "wholesale", "wholesaler", "whorl", "wick", "widget", "widow", "width", "wife", "wifi", "wild", "wildebeest", "wilderness", +"wildlife", "will", "willingness", "willow", "win", "wind", "windage", "window", "windscreen", "windshield", "wine", "winery", +"wing", "wingman", "wingtip", "wink", "winner", "winter", "wire", "wiretap", "wiring", "wisdom", "wiseguy", "wish", "wisteria", "wit", +"witch", "withdrawal", "witness", "wok", "wolf", "woman", "wombat", "wonder", "wont", "wood", "woodchuck", "woodland", +"woodshed", "woodwind", "wool", "woolens", "word", "wording", "work", "workbench", "worker", "workforce", "workhorse", "working", "workout", +"workplace", "workshop", "world", "worm", "worry", "worship", "worshiper", "worth", "wound", "wrap", "wraparound", "wrapper", "wrapping", +"wreck", "wrecker", "wren", "wrench", "wrestler", "wriggler", "wrinkle", "wrist", "writer", "writing", "wrong", "xylophone", "yacht", +"yahoo", "yak", "yam", "yang", "yard", "yarmulke", "yarn", "yawl", "year", "yeast", "yellow", "yellowjacket", "yesterday", "yew", "yin", +"yoga", "yogurt", "yoke", "yolk", "young", "youngster", "yourself", "youth", "yoyo", "yurt", "zampone", "zebra", "zebrafish", "zen", +"zephyr", "zero", "ziggurat", "zinc", "zipper", "zither", "zombie", "zone", "zoo", "zoologist", "zoology", "zucchini" +}; + + +std::string_view obfuscateWord(std::string_view src, WordMap & obfuscate_map, WordSet & used_nouns, SipHash hash_func) +{ + /// Prevent using too many nouns + if (obfuscate_map.size() * 2 > nouns.size()) + throw Exception("Too many unique identifiers in queries", ErrorCodes::TOO_MANY_TEMPORARY_COLUMNS); + + 
std::string_view & mapped = obfuscate_map[src]; + if (!mapped.empty()) + return mapped; + + hash_func.update(src.data(), src.size()); + std::string_view noun = nouns.begin()[hash_func.get64() % nouns.size()]; + + /// Prevent collisions + while (!used_nouns.insert(noun).second) + { + hash_func.update('\0'); + noun = nouns.begin()[hash_func.get64() % nouns.size()]; + } + + mapped = noun; + return mapped; +} + + +void obfuscateIdentifier(std::string_view src, WriteBuffer & result, WordMap & obfuscate_map, WordSet & used_nouns, SipHash hash_func) +{ + /// Find words in form 'snake_case', 'CamelCase' or 'ALL_CAPS'. + + const char * src_pos = src.data(); + const char * src_end = src_pos + src.size(); + + const char * word_begin = src_pos; + bool word_has_alphanumerics = false; + + auto append_word = [&] + { + std::string_view word(word_begin, src_pos - word_begin); + + if (keep_words.count(word)) + { + result.write(word.data(), word.size()); + } + else + { + std::string_view obfuscated_word = obfuscateWord(word, obfuscate_map, used_nouns, hash_func); + + /// Match the style of source word. + bool first_caps = !word.empty() && isUpperAlphaASCII(word[0]); + bool all_caps = first_caps && word.size() >= 2 && isUpperAlphaASCII(word[1]); + + for (size_t i = 0, size = obfuscated_word.size(); i < size; ++i) + { + if (all_caps || (i == 0 && first_caps)) + result.write(toUpperIfAlphaASCII(obfuscated_word[i])); + else + result.write(obfuscated_word[i]); + } + } + + word_begin = src_pos; + word_has_alphanumerics = false; + }; + + while (src_pos < src_end) + { + if (isAlphaNumericASCII(src_pos[0])) + word_has_alphanumerics = true; + + if (word_has_alphanumerics && src_pos[0] == '_') + { + append_word(); + result.write('_'); + ++word_begin; + } + else if (word_has_alphanumerics && isUpperAlphaASCII(src_pos[0]) && isLowerAlphaASCII(src_pos[-1])) /// xX + { + append_word(); + } + + ++src_pos; + } + + if (word_begin < src_pos) + append_word(); +} + + +void obfuscateLiteral(std::string_view src, WriteBuffer & result, SipHash hash_func) +{ + const char * src_pos = src.data(); + const char * src_end = src_pos + src.size(); + + while (src_pos < src_end) + { + /// Date + if (src_pos + strlen("0000-00-00") <= src_end + && isNumericASCII(src_pos[0]) + && isNumericASCII(src_pos[1]) + && isNumericASCII(src_pos[2]) + && isNumericASCII(src_pos[3]) + && src_pos[4] == '-' + && isNumericASCII(src_pos[5]) + && isNumericASCII(src_pos[6]) + && src_pos[7] == '-' + && isNumericASCII(src_pos[8]) + && isNumericASCII(src_pos[9])) + { + DayNum date; + ReadBufferFromMemory in(src_pos, strlen("0000-00-00")); + readDateText(date, in); + + SipHash hash_func_date = hash_func; + + if (date != 0) + { + date += hash_func_date.get64() % 256; + } + + writeDateText(date, result); + src_pos += strlen("0000-00-00"); + + /// DateTime + if (src_pos + strlen(" 00:00:00") <= src_end + && isNumericASCII(src_pos[1]) + && isNumericASCII(src_pos[2]) + && src_pos[3] == ':' + && isNumericASCII(src_pos[4]) + && isNumericASCII(src_pos[5]) + && src_pos[6] == ':' + && isNumericASCII(src_pos[7]) + && isNumericASCII(src_pos[8])) + { + result.write(src_pos[0]); + + hash_func_date.update(src_pos + 1, strlen("00:00:00")); + + uint64_t hash_value = hash_func_date.get64(); + uint32_t new_hour = hash_value % 24; + hash_value /= 24; + uint32_t new_minute = hash_value % 60; + hash_value /= 60; + uint32_t new_second = hash_value % 60; + + result.write('0' + (new_hour / 10)); + result.write('0' + (new_hour % 10)); + result.write(':'); + result.write('0' + (new_minute / 
10)); + result.write('0' + (new_minute % 10)); + result.write(':'); + result.write('0' + (new_second / 10)); + result.write('0' + (new_second % 10)); + + src_pos += strlen(" 00:00:00"); + } + } + else if (isNumericASCII(src_pos[0])) + { + /// Number + if (src_pos[0] == '0' || src_pos[0] == '1') + { + /// Keep zero and one as is. + result.write(src_pos[0]); + ++src_pos; + } + else + { + ReadBufferFromMemory in(src_pos, src_end - src_pos); + uint64_t num; + readIntText(num, in); + SipHash hash_func_num = hash_func; + hash_func_num.update(src_pos, in.count()); + src_pos += in.count(); + + /// Obfuscate number but keep it within same power of two range. + + uint64_t obfuscated = hash_func_num.get64(); + uint64_t log2 = bitScanReverse(num); + + obfuscated = (1ULL << log2) + obfuscated % (1ULL << log2); + writeIntText(obfuscated, result); + } + } + else if (src_pos + 1 < src_end + && (src_pos[0] == 'e' || src_pos[0] == 'E') + && (isNumericASCII(src_pos[1]) || (src_pos[1] == '-' && src_pos + 2 < src_end && isNumericASCII(src_pos[2])))) + { + /// Something like an exponent of floating point number. Keep it as is. + /// But if it looks like a large number, overflow it into 16 bit. + + result.write(src_pos[0]); + ++src_pos; + + ReadBufferFromMemory in(src_pos, src_end - src_pos); + int16_t num; + readIntText(num, in); + writeIntText(num, result); + src_pos += in.count(); + } + else if (isAlphaASCII(src_pos[0])) + { + /// Alphabetial characters + + const char * alpha_end = src_pos + 1; + while (alpha_end < src_end && isAlphaASCII(*alpha_end)) + ++alpha_end; + + hash_func.update(src_pos, alpha_end - src_pos); + pcg64 rng(hash_func.get64()); + + while (src_pos < alpha_end) + { + auto random = rng(); + if (isLowerAlphaASCII(*src_pos)) + result.write('a' + random % 26); + else + result.write('A' + random % 26); + + ++src_pos; + } + } + else if (isASCII(src_pos[0])) + { + /// Punctuation, whitespace and control characters - keep as is. + + result.write(src_pos[0]); + ++src_pos; + } + else if (src_pos[0] <= '\xBF') + { + /// Continuation of UTF-8 sequence. + hash_func.update(src_pos[0]); + uint64_t hash = hash_func.get64(); + + char c = 0x80 + hash % (0xC0 - 0x80); + result.write(c); + + ++src_pos; + } + else + { + /// Start of UTF-8 sequence. + hash_func.update(src_pos[0]); + uint64_t hash = hash_func.get64(); + + if (src_pos[0] < '\xE0') + { + char c = 0xC0 + hash % 32; + result.write(c); + } + else if (src_pos[0] < '\xF0') + { + char c = 0xE0 + hash % 16; + result.write(c); + } + else + { + char c = 0xF0 + hash % 8; + result.write(c); + } + + ++src_pos; + } + } +} + +} + + +void obfuscateQueries( + std::string_view src, + WriteBuffer & result, + WordMap & obfuscate_map, + WordSet & used_nouns, + SipHash hash_func, + KnownIdentifierFunc known_identifier_func) +{ + Lexer lexer(src.data(), src.data() + src.size()); + while (true) + { + Token token = lexer.nextToken(); + std::string_view whole_token(token.begin, token.size()); + + if (token.isEnd()) + break; + + if (token.type == TokenType::BareWord) + { + std::string whole_token_uppercase(whole_token); + Poco::toUpperInPlace(whole_token_uppercase); + + if (keywords.count(whole_token_uppercase) + || known_identifier_func(whole_token)) + { + /// Keep keywords as is. 
+ result.write(token.begin, token.size()); + } + else + { + /// Obfuscate identifiers + obfuscateIdentifier(whole_token, result, obfuscate_map, used_nouns, hash_func); + } + } + else if (token.type == TokenType::QuotedIdentifier) + { + assert(token.size() >= 2); + + /// Write quotes and the obfuscated content inside. + result.write(*token.begin); + obfuscateIdentifier({token.begin + 1, token.size() - 2}, result, obfuscate_map, used_nouns, hash_func); + result.write(token.end[-1]); + } + else if (token.type == TokenType::Number) + { + obfuscateLiteral(whole_token, result, hash_func); + } + else if (token.type == TokenType::StringLiteral) + { + assert(token.size() >= 2); + + result.write(*token.begin); + obfuscateLiteral({token.begin + 1, token.size() - 2}, result, hash_func); + result.write(token.end[-1]); + } + else if (token.type == TokenType::Comment) + { + /// Skip comments - they may contain confidential info. + } + else + { + /// Everyting else is kept as is. + result.write(token.begin, token.size()); + } + } +} + +} + diff --git a/src/Parsers/obfuscateQueries.h b/src/Parsers/obfuscateQueries.h new file mode 100644 index 00000000000..0a192649a92 --- /dev/null +++ b/src/Parsers/obfuscateQueries.h @@ -0,0 +1,50 @@ +#pragma once + +#include +#include +#include +#include +#include + +#include + + +namespace DB +{ + +class WriteBuffer; + +using WordMap = std::unordered_map; +using WordSet = std::unordered_set; +using KnownIdentifierFunc = std::function; + +/** Takes one or multiple queries and obfuscates them by replacing identifiers to pseudorandom words + * and replacing literals to random values, while preserving the structure of the queries and the general sense. + * + * Its intended use case is when the user wants to share their queries for testing and debugging + * but is afraid to disclose the details about their column names, domain area and values of constants. + * + * It can obfuscate multiple queries in consistent fashion - identical names will be transformed to identical results. + * + * The function is not guaranteed to always give correct result or to be secure. It's implemented in "best effort" fashion. + * + * @param src - a string with source queries. + * @param result - where the obfuscated queries will be written. + * @param obfuscate_map - information about substituted identifiers + * (pass empty map at the beginning or reuse it from previous invocation to get consistent result) + * @param used_nouns - information about words used for substitution + * (pass empty set at the beginning or reuse it from previous invocation to get consistent result) + * @param hash_func - hash function that will be used as a pseudorandom source, + * it's recommended to preseed the function before passing here. + * @param known_identifier_func - a function that returns true if identifier is known name + * (of function, aggregate function, etc. that should be kept as is). If it returns false, identifier will be obfuscated. 
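+ *
+ * A minimal usage sketch (illustrative only; the seed string, the sample query and the identifier check
+ * below are assumptions of this example, not part of the API):
+ *
+ *     WordMap obfuscate_map;
+ *     WordSet used_nouns;
+ *     SipHash hash_func;
+ *     std::string seed = "my-secret-seed";
+ *     hash_func.update(seed.data(), seed.size());
+ *     WriteBufferFromOwnString out;
+ *     obfuscateQueries("SELECT secret_column FROM secret_table WHERE id = 42", out,
+ *         obfuscate_map, used_nouns, hash_func,
+ *         [](std::string_view name) { return name == "sum" || name == "count"; });
+ *     /// out.str() now holds the same query with identifiers replaced by nouns and literals randomized.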
+ */ +void obfuscateQueries( + std::string_view src, + WriteBuffer & result, + WordMap & obfuscate_map, + WordSet & used_nouns, + SipHash hash_func, + KnownIdentifierFunc known_identifier_func); + +} diff --git a/src/Parsers/ya.make b/src/Parsers/ya.make index 0a0c301b722..4ec97b8b55b 100644 --- a/src/Parsers/ya.make +++ b/src/Parsers/ya.make @@ -85,6 +85,7 @@ SRCS( MySQL/ASTDeclareReference.cpp MySQL/ASTDeclareSubPartition.cpp MySQL/ASTDeclareTableOptions.cpp + obfuscateQueries.cpp parseDatabaseAndTableName.cpp parseIdentifierOrStringLiteral.cpp parseIntervalKind.cpp diff --git a/src/Processors/ForkProcessor.cpp b/src/Processors/ForkProcessor.cpp index 7fa21c4236d..9b17f8ad5ca 100644 --- a/src/Processors/ForkProcessor.cpp +++ b/src/Processors/ForkProcessor.cpp @@ -10,7 +10,6 @@ ForkProcessor::Status ForkProcessor::prepare() /// Check can output. - bool all_finished = true; bool all_can_push = true; size_t num_active_outputs = 0; @@ -18,7 +17,6 @@ ForkProcessor::Status ForkProcessor::prepare() { if (!output.isFinished()) { - all_finished = false; ++num_active_outputs; /// The order is important. @@ -27,7 +25,7 @@ ForkProcessor::Status ForkProcessor::prepare() } } - if (all_finished) + if (0 == num_active_outputs) { input.close(); return Status::Finished; diff --git a/src/Processors/QueryPipeline.cpp b/src/Processors/QueryPipeline.cpp index 0b654d0f325..4cbb4d9edb7 100644 --- a/src/Processors/QueryPipeline.cpp +++ b/src/Processors/QueryPipeline.cpp @@ -196,59 +196,6 @@ void QueryPipeline::addExtremesTransform() pipe.addTransform(std::move(transform), nullptr, port); } -void QueryPipeline::addCreatingSetsTransform(SubqueriesForSets subqueries_for_sets, const SizeLimits & network_transfer_limits, const Context & context) -{ - checkInitializedAndNotCompleted(); - - Pipes sources; - - for (auto & subquery : subqueries_for_sets) - { - if (!subquery.second.source.empty()) - { - auto & source = sources.emplace_back(std::move(subquery.second.source)); - if (source.numOutputPorts() > 1) - source.addTransform(std::make_shared(source.getHeader(), source.numOutputPorts(), 1)); - - source.dropExtremes(); - - auto creating_sets = std::make_shared( - source.getHeader(), - getHeader(), - std::move(subquery.second), - network_transfer_limits, - context); - - InputPort * totals = nullptr; - if (source.getTotalsPort()) - totals = creating_sets->addTotalsPort(); - - source.addTransform(std::move(creating_sets), totals, nullptr); - } - } - - if (sources.empty()) - return; - - auto * collected_processors = pipe.collected_processors; - - /// We unite all sources together. - /// Set collected_processors to attach all newly-added processors to current query plan step. 
- auto source = Pipe::unitePipes(std::move(sources), collected_processors); - if (source.numOutputPorts() > 1) - source.addTransform(std::make_shared(source.getHeader(), source.numOutputPorts(), 1)); - source.collected_processors = nullptr; - - resize(1); - - Pipes pipes; - pipes.emplace_back(std::move(source)); - pipes.emplace_back(std::move(pipe)); - pipe = Pipe::unitePipes(std::move(pipes), collected_processors); - - pipe.addTransform(std::make_shared(getHeader(), 2)); -} - void QueryPipeline::setOutputFormat(ProcessorPtr output) { checkInitializedAndNotCompleted(); @@ -315,6 +262,46 @@ QueryPipeline QueryPipeline::unitePipelines( return pipeline; } + +void QueryPipeline::addCreatingSetsTransform(const Block & res_header, SubqueryForSet subquery_for_set, const SizeLimits & limits, const Context & context) +{ + resize(1); + + auto transform = std::make_shared( + getHeader(), + res_header, + std::move(subquery_for_set), + limits, + context); + + InputPort * totals_port = nullptr; + + if (pipe.getTotalsPort()) + totals_port = transform->addTotalsPort(); + + pipe.addTransform(std::move(transform), totals_port, nullptr); +} + +void QueryPipeline::addPipelineBefore(QueryPipeline pipeline) +{ + checkInitializedAndNotCompleted(); + assertBlocksHaveEqualStructure(getHeader(), pipeline.getHeader(), "QueryPipeline"); + + IProcessor::PortNumbers delayed_streams(pipe.numOutputPorts()); + for (size_t i = 0; i < delayed_streams.size(); ++i) + delayed_streams[i] = i; + + auto * collected_processors = pipe.collected_processors; + + Pipes pipes; + pipes.emplace_back(std::move(pipe)); + pipes.emplace_back(QueryPipeline::getPipe(std::move(pipeline))); + pipe = Pipe::unitePipes(std::move(pipes), collected_processors); + + auto processor = std::make_shared(getHeader(), pipe.numOutputPorts(), delayed_streams); + addTransform(std::move(processor)); +} + void QueryPipeline::setProgressCallback(const ProgressCallback & callback) { for (auto & processor : pipe.processors) diff --git a/src/Processors/QueryPipeline.h b/src/Processors/QueryPipeline.h index 45b410ab323..80ae1d591a4 100644 --- a/src/Processors/QueryPipeline.h +++ b/src/Processors/QueryPipeline.h @@ -26,6 +26,8 @@ class QueryPlan; struct SubqueryForSet; using SubqueriesForSets = std::unordered_map; +struct SizeLimits; + class QueryPipeline { public: @@ -55,8 +57,6 @@ public: void addTotalsHavingTransform(ProcessorPtr transform); /// Add transform which calculates extremes. This transform adds extremes port and doesn't change inputs number. void addExtremesTransform(); - /// Adds transform which creates sets. It will be executed before reading any data from input ports. - void addCreatingSetsTransform(SubqueriesForSets subqueries_for_sets, const SizeLimits & network_transfer_limits, const Context & context); /// Resize pipeline to single output and add IOutputFormat. Pipeline will be completed after this transformation. void setOutputFormat(ProcessorPtr output); /// Get current OutputFormat. @@ -87,6 +87,12 @@ public: size_t max_threads_limit = 0, Processors * collected_processors = nullptr); + /// Add other pipeline and execute it before current one. + /// Pipeline must have same header. 
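+ /// A minimal illustrative sketch (variable names assumed): if `sets_pipeline` is the pipeline that
+ /// fills the sets the current query depends on, the caller can write
+ ///     main_pipeline.addPipelineBefore(std::move(sets_pipeline));
+ /// and `sets_pipeline` will be executed before `main_pipeline` produces any data.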
+ void addPipelineBefore(QueryPipeline pipeline); + + void addCreatingSetsTransform(const Block & res_header, SubqueryForSet subquery_for_set, const SizeLimits & limits, const Context & context); + PipelineExecutorPtr execute(); size_t getNumStreams() const { return pipe.numOutputPorts(); } diff --git a/src/Processors/QueryPlan/CreatingSetsStep.cpp b/src/Processors/QueryPlan/CreatingSetsStep.cpp index 7e840e1531b..5868a7045f7 100644 --- a/src/Processors/QueryPlan/CreatingSetsStep.cpp +++ b/src/Processors/QueryPlan/CreatingSetsStep.cpp @@ -6,6 +6,11 @@ namespace DB { +namespace ErrorCodes +{ + extern const int LOGICAL_ERROR; +} + static ITransformingStep::Traits getTraits() { return ITransformingStep::Traits @@ -22,37 +27,128 @@ static ITransformingStep::Traits getTraits() }; } -CreatingSetsStep::CreatingSetsStep( +CreatingSetStep::CreatingSetStep( const DataStream & input_stream_, - SubqueriesForSets subqueries_for_sets_, + Block header, + String description_, + SubqueryForSet subquery_for_set_, SizeLimits network_transfer_limits_, const Context & context_) - : ITransformingStep(input_stream_, input_stream_.header, getTraits()) - , subqueries_for_sets(std::move(subqueries_for_sets_)) + : ITransformingStep(input_stream_, header, getTraits()) + , description(std::move(description_)) + , subquery_for_set(std::move(subquery_for_set_)) , network_transfer_limits(std::move(network_transfer_limits_)) , context(context_) { } -void CreatingSetsStep::transformPipeline(QueryPipeline & pipeline) +void CreatingSetStep::transformPipeline(QueryPipeline & pipeline) { - pipeline.addCreatingSetsTransform(std::move(subqueries_for_sets), network_transfer_limits, context); + pipeline.addCreatingSetsTransform(getOutputStream().header, std::move(subquery_for_set), network_transfer_limits, context); } -void CreatingSetsStep::describeActions(FormatSettings & settings) const +void CreatingSetStep::describeActions(FormatSettings & settings) const { String prefix(settings.offset, ' '); - for (const auto & set : subqueries_for_sets) + settings.out << prefix; + if (subquery_for_set.set) + settings.out << "Set: "; + else if (subquery_for_set.join) + settings.out << "Join: "; + + settings.out << description << '\n'; +} + +CreatingSetsStep::CreatingSetsStep(DataStreams input_streams_) +{ + if (input_streams_.empty()) + throw Exception("CreatingSetsStep cannot be created with no inputs", ErrorCodes::LOGICAL_ERROR); + + input_streams = std::move(input_streams_); + output_stream = input_streams.front(); + + for (size_t i = 1; i < input_streams.size(); ++i) + assertBlocksHaveEqualStructure(output_stream->header, input_streams[i].header, "CreatingSets"); +} + +QueryPipelinePtr CreatingSetsStep::updatePipeline(QueryPipelines pipelines) +{ + if (pipelines.empty()) + throw Exception("CreatingSetsStep cannot be created with no inputs", ErrorCodes::LOGICAL_ERROR); + + auto main_pipeline = std::move(pipelines.front()); + if (pipelines.size() == 1) + return main_pipeline; + + std::swap(pipelines.front(), pipelines.back()); + pipelines.pop_back(); + + QueryPipeline delayed_pipeline; + if (pipelines.size() > 1) { - settings.out << prefix; - if (set.second.set) - settings.out << "Set: "; - else if (set.second.join) - settings.out << "Join: "; - - settings.out << set.first << '\n'; + QueryPipelineProcessorsCollector collector(delayed_pipeline, this); + delayed_pipeline = QueryPipeline::unitePipelines(std::move(pipelines), output_stream->header); + processors = collector.detachProcessors(); } + else + delayed_pipeline = 
std::move(*pipelines.front()); + + QueryPipelineProcessorsCollector collector(*main_pipeline, this); + main_pipeline->addPipelineBefore(std::move(delayed_pipeline)); + auto added_processors = collector.detachProcessors(); + processors.insert(processors.end(), added_processors.begin(), added_processors.end()); + + return main_pipeline; +} + +void CreatingSetsStep::describePipeline(FormatSettings & settings) const +{ + IQueryPlanStep::describePipeline(processors, settings); +} + +void addCreatingSetsStep( + QueryPlan & query_plan, SubqueriesForSets subqueries_for_sets, const SizeLimits & limits, const Context & context) +{ + DataStreams input_streams; + input_streams.emplace_back(query_plan.getCurrentDataStream()); + + std::vector> plans; + plans.emplace_back(std::make_unique(std::move(query_plan))); + query_plan = QueryPlan(); + + for (auto & [description, set] : subqueries_for_sets) + { + if (!set.source) + continue; + + auto plan = std::move(set.source); + std::string type = (set.join != nullptr) ? "JOIN" + : "subquery"; + + auto creating_set = std::make_unique( + plan->getCurrentDataStream(), + input_streams.front().header, + std::move(description), + std::move(set), + limits, + context); + creating_set->setStepDescription("Create set for " + type); + plan->addStep(std::move(creating_set)); + + input_streams.emplace_back(plan->getCurrentDataStream()); + plans.emplace_back(std::move(plan)); + } + + if (plans.size() == 1) + { + query_plan = std::move(*plans.front()); + return; + } + + auto creating_sets = std::make_unique(std::move(input_streams)); + creating_sets->setStepDescription("Create sets before main query execution"); + query_plan.unitePlans(std::move(creating_sets), std::move(plans)); } } diff --git a/src/Processors/QueryPlan/CreatingSetsStep.h b/src/Processors/QueryPlan/CreatingSetsStep.h index 4ba4863c043..ec13ab2052e 100644 --- a/src/Processors/QueryPlan/CreatingSetsStep.h +++ b/src/Processors/QueryPlan/CreatingSetsStep.h @@ -7,25 +7,49 @@ namespace DB { /// Creates sets for subqueries and JOIN. See CreatingSetsTransform. 
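+ /// A minimal caller-side sketch (assumed variable names): given a built `query_plan` and the collected
+ /// SubqueriesForSets, the interpreter is expected to finish the plan with
+ ///     addCreatingSetsStep(query_plan, std::move(subqueries_for_sets), network_transfer_limits, context);
+ /// which adds one CreatingSetStep per subquery and unites them with the main plan under CreatingSetsStep.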
-class CreatingSetsStep : public ITransformingStep +class CreatingSetStep : public ITransformingStep { public: - CreatingSetsStep( + CreatingSetStep( const DataStream & input_stream_, - SubqueriesForSets subqueries_for_sets_, + Block header, + String description_, + SubqueryForSet subquery_for_set_, SizeLimits network_transfer_limits_, const Context & context_); - String getName() const override { return "CreatingSets"; } + String getName() const override { return "CreatingSet"; } void transformPipeline(QueryPipeline & pipeline) override; void describeActions(FormatSettings & settings) const override; private: - SubqueriesForSets subqueries_for_sets; + String description; + SubqueryForSet subquery_for_set; SizeLimits network_transfer_limits; const Context & context; }; +class CreatingSetsStep : public IQueryPlanStep +{ +public: + CreatingSetsStep(DataStreams input_streams_); + + String getName() const override { return "CreatingSets"; } + + QueryPipelinePtr updatePipeline(QueryPipelines pipelines) override; + + void describePipeline(FormatSettings & settings) const override; + +private: + Processors processors; +}; + +void addCreatingSetsStep( + QueryPlan & query_plan, + SubqueriesForSets subqueries_for_sets, + const SizeLimits & limits, + const Context & context); + } diff --git a/src/Processors/QueryPlan/QueryPlan.cpp b/src/Processors/QueryPlan/QueryPlan.cpp index 31b9de2fcee..74781f4b5d9 100644 --- a/src/Processors/QueryPlan/QueryPlan.cpp +++ b/src/Processors/QueryPlan/QueryPlan.cpp @@ -26,6 +26,8 @@ namespace ErrorCodes QueryPlan::QueryPlan() = default; QueryPlan::~QueryPlan() = default; +QueryPlan::QueryPlan(QueryPlan &&) = default; +QueryPlan & QueryPlan::operator=(QueryPlan &&) = default; void QueryPlan::checkInitialized() const { @@ -51,7 +53,7 @@ const DataStream & QueryPlan::getCurrentDataStream() const return root->step->getOutputStream(); } -void QueryPlan::unitePlans(QueryPlanStepPtr step, std::vector plans) +void QueryPlan::unitePlans(QueryPlanStepPtr step, std::vector> plans) { if (isInitialized()) throw Exception("Cannot unite plans because current QueryPlan is already initialized", @@ -70,7 +72,7 @@ void QueryPlan::unitePlans(QueryPlanStepPtr step, std::vector plans) for (size_t i = 0; i < num_inputs; ++i) { const auto & step_header = inputs[i].header; - const auto & plan_header = plans[i].getCurrentDataStream().header; + const auto & plan_header = plans[i]->getCurrentDataStream().header; if (!blocksHaveEqualStructure(step_header, plan_header)) throw Exception("Cannot unite QueryPlans using " + step->getName() + " because " "it has incompatible header with plan " + root->step->getName() + " " @@ -79,19 +81,19 @@ void QueryPlan::unitePlans(QueryPlanStepPtr step, std::vector plans) } for (auto & plan : plans) - nodes.splice(nodes.end(), std::move(plan.nodes)); + nodes.splice(nodes.end(), std::move(plan->nodes)); nodes.emplace_back(Node{.step = std::move(step)}); root = &nodes.back(); for (auto & plan : plans) - root->children.emplace_back(plan.root); + root->children.emplace_back(plan->root); for (auto & plan : plans) { - max_threads = std::max(max_threads, plan.max_threads); + max_threads = std::max(max_threads, plan->max_threads); interpreter_context.insert(interpreter_context.end(), - plan.interpreter_context.begin(), plan.interpreter_context.end()); + plan->interpreter_context.begin(), plan->interpreter_context.end()); } } diff --git a/src/Processors/QueryPlan/QueryPlan.h b/src/Processors/QueryPlan/QueryPlan.h index 7ce8d9426c4..6296eac7502 100644 --- 
a/src/Processors/QueryPlan/QueryPlan.h +++ b/src/Processors/QueryPlan/QueryPlan.h @@ -25,8 +25,10 @@ class QueryPlan public: QueryPlan(); ~QueryPlan(); + QueryPlan(QueryPlan &&); + QueryPlan & operator=(QueryPlan &&); - void unitePlans(QueryPlanStepPtr step, std::vector plans); + void unitePlans(QueryPlanStepPtr step, std::vector> plans); void addStep(QueryPlanStepPtr step); bool isInitialized() const { return root != nullptr; } /// Tree is not empty diff --git a/src/Processors/QueryPlan/ReadFromPreparedSource.cpp b/src/Processors/QueryPlan/ReadFromPreparedSource.cpp index 6f0d1693ce0..979b4101046 100644 --- a/src/Processors/QueryPlan/ReadFromPreparedSource.cpp +++ b/src/Processors/QueryPlan/ReadFromPreparedSource.cpp @@ -14,7 +14,8 @@ ReadFromPreparedSource::ReadFromPreparedSource(Pipe pipe_, std::shared_ptr context_); + explicit ReadFromPreparedSource(Pipe pipe_, std::shared_ptr context_ = nullptr); String getName() const override { return "ReadNothing"; } diff --git a/src/Processors/Transforms/AggregatingTransform.cpp b/src/Processors/Transforms/AggregatingTransform.cpp index 42caf4b3446..0a97cc3d4cb 100644 --- a/src/Processors/Transforms/AggregatingTransform.cpp +++ b/src/Processors/Transforms/AggregatingTransform.cpp @@ -1,6 +1,5 @@ #include -#include #include #include #include @@ -56,7 +55,7 @@ namespace public: SourceFromNativeStream(const Block & header, const std::string & path) : ISource(header), file_in(path), compressed_in(file_in), - block_in(std::make_shared(compressed_in, ClickHouseRevision::get())) + block_in(std::make_shared(compressed_in, DBMS_TCP_PROTOCOL_VERSION)) { block_in->readPrefix(); } diff --git a/src/Processors/Transforms/JoiningTransform.h b/src/Processors/Transforms/JoiningTransform.h index c00ac5b83dd..15a203635e2 100644 --- a/src/Processors/Transforms/JoiningTransform.h +++ b/src/Processors/Transforms/JoiningTransform.h @@ -14,7 +14,7 @@ public: JoiningTransform(Block input_header, JoinPtr join_, bool on_totals_ = false, bool default_totals_ = false); - String getName() const override { return "InflatingExpressionTransform"; } + String getName() const override { return "JoiningTransform"; } static Block transformHeader(Block header, const JoinPtr & join); diff --git a/src/Server/TCPHandler.cpp b/src/Server/TCPHandler.cpp index 4d77759e517..0dcf1227c30 100644 --- a/src/Server/TCPHandler.cpp +++ b/src/Server/TCPHandler.cpp @@ -1,7 +1,6 @@ #include #include #include -#include #include #include #include @@ -185,7 +184,7 @@ void TCPHandler::runImpl() /// Should we send internal logs to client? const auto client_logs_level = query_context->getSettingsRef().send_logs_level; - if (client_revision >= DBMS_MIN_REVISION_WITH_SERVER_LOGS + if (client_tcp_protocol_version >= DBMS_MIN_REVISION_WITH_SERVER_LOGS && client_logs_level != LogsLevel::none) { state.logs_queue = std::make_shared(); @@ -220,7 +219,7 @@ void TCPHandler::runImpl() state.need_receive_data_for_input = true; /// Send ColumnsDescription for input storage. 
- if (client_revision >= DBMS_MIN_REVISION_WITH_COLUMN_DEFAULTS_METADATA + if (client_tcp_protocol_version >= DBMS_MIN_REVISION_WITH_COLUMN_DEFAULTS_METADATA && query_context->getSettingsRef().input_format_defaults_for_omitted_fields) { sendTableColumns(metadata_snapshot->getColumns()); @@ -250,7 +249,7 @@ void TCPHandler::runImpl() customizeContext(*query_context); - bool may_have_embedded_data = client_revision >= DBMS_MIN_REVISION_WITH_CLIENT_SUPPORT_EMBEDDED_DATA; + bool may_have_embedded_data = client_tcp_protocol_version >= DBMS_MIN_REVISION_WITH_CLIENT_SUPPORT_EMBEDDED_DATA; /// Processing Query state.io = executeQuery(state.query, *query_context, false, state.stage, may_have_embedded_data); @@ -492,7 +491,7 @@ void TCPHandler::processInsertQuery(const Settings & connection_settings) state.io.out->writePrefix(); /// Send ColumnsDescription for insertion table - if (client_revision >= DBMS_MIN_REVISION_WITH_COLUMN_DEFAULTS_METADATA) + if (client_tcp_protocol_version >= DBMS_MIN_REVISION_WITH_COLUMN_DEFAULTS_METADATA) { const auto & table_id = query_context->getInsertionTable(); if (query_context->getSettingsRef().input_format_defaults_for_omitted_fields) @@ -648,7 +647,7 @@ void TCPHandler::processOrdinaryQueryWithProcessors() void TCPHandler::processTablesStatusRequest() { TablesStatusRequest request; - request.read(*in, client_revision); + request.read(*in, client_tcp_protocol_version); TablesStatusResponse response; for (const QualifiedTableName & table_name: request.tables) @@ -671,13 +670,13 @@ void TCPHandler::processTablesStatusRequest() } writeVarUInt(Protocol::Server::TablesStatusResponse, *out); - response.write(*out, client_revision); + response.write(*out, client_tcp_protocol_version); } void TCPHandler::receiveUnexpectedTablesStatusRequest() { TablesStatusRequest skip_request; - skip_request.read(*in, client_revision); + skip_request.read(*in, client_tcp_protocol_version); throw NetException("Unexpected packet TablesStatusRequest received from client", ErrorCodes::UNEXPECTED_PACKET_FROM_CLIENT); } @@ -752,7 +751,7 @@ void TCPHandler::receiveHello() readVarUInt(client_version_major, *in); readVarUInt(client_version_minor, *in); // NOTE For backward compatibility of the protocol, client cannot send its version_patch. - readVarUInt(client_revision, *in); + readVarUInt(client_tcp_protocol_version, *in); readStringBinary(default_database, *in); readStringBinary(user, *in); readStringBinary(password, *in); @@ -763,7 +762,7 @@ void TCPHandler::receiveHello() LOG_DEBUG(log, "Connected {} version {}.{}.{}, revision: {}{}{}.", client_name, client_version_major, client_version_minor, client_version_patch, - client_revision, + client_tcp_protocol_version, (!default_database.empty() ? ", database: " + default_database : ""), (!user.empty() ? 
", user: " + user : "") ); @@ -802,12 +801,12 @@ void TCPHandler::sendHello() writeStringBinary(DBMS_NAME, *out); writeVarUInt(DBMS_VERSION_MAJOR, *out); writeVarUInt(DBMS_VERSION_MINOR, *out); - writeVarUInt(ClickHouseRevision::get(), *out); - if (client_revision >= DBMS_MIN_REVISION_WITH_SERVER_TIMEZONE) + writeVarUInt(DBMS_TCP_PROTOCOL_VERSION, *out); + if (client_tcp_protocol_version >= DBMS_MIN_REVISION_WITH_SERVER_TIMEZONE) writeStringBinary(DateLUT::instance().getTimeZone(), *out); - if (client_revision >= DBMS_MIN_REVISION_WITH_SERVER_DISPLAY_NAME) + if (client_tcp_protocol_version >= DBMS_MIN_REVISION_WITH_SERVER_DISPLAY_NAME) writeStringBinary(server_display_name, *out); - if (client_revision >= DBMS_MIN_REVISION_WITH_VERSION_PATCH) + if (client_tcp_protocol_version >= DBMS_MIN_REVISION_WITH_VERSION_PATCH) writeVarUInt(DBMS_VERSION_PATCH, *out); out->next(); } @@ -894,8 +893,8 @@ void TCPHandler::receiveQuery() /// Client info ClientInfo & client_info = query_context->getClientInfo(); - if (client_revision >= DBMS_MIN_REVISION_WITH_CLIENT_INFO) - client_info.read(*in, client_revision); + if (client_tcp_protocol_version >= DBMS_MIN_REVISION_WITH_CLIENT_INFO) + client_info.read(*in, client_tcp_protocol_version); /// For better support of old clients, that does not send ClientInfo. if (client_info.query_kind == ClientInfo::QueryKind::NO_QUERY) @@ -905,7 +904,7 @@ void TCPHandler::receiveQuery() client_info.client_version_major = client_version_major; client_info.client_version_minor = client_version_minor; client_info.client_version_patch = client_version_patch; - client_info.client_revision = client_revision; + client_info.client_tcp_protocol_version = client_tcp_protocol_version; } /// Set fields, that are known apriori. @@ -921,14 +920,14 @@ void TCPHandler::receiveQuery() /// Per query settings are also passed via TCP. /// We need to check them before applying due to they can violate the settings constraints. - auto settings_format = (client_revision >= DBMS_MIN_REVISION_WITH_SETTINGS_SERIALIZED_AS_STRINGS) ? SettingsWriteFormat::STRINGS_WITH_FLAGS + auto settings_format = (client_tcp_protocol_version >= DBMS_MIN_REVISION_WITH_SETTINGS_SERIALIZED_AS_STRINGS) ? SettingsWriteFormat::STRINGS_WITH_FLAGS : SettingsWriteFormat::BINARY; Settings passed_settings; passed_settings.read(*in, settings_format); /// Interserver secret. std::string received_hash; - if (client_revision >= DBMS_MIN_REVISION_WITH_INTERSERVER_SECRET) + if (client_tcp_protocol_version >= DBMS_MIN_REVISION_WITH_INTERSERVER_SECRET) { readStringBinary(received_hash, *in, 32); } @@ -1011,16 +1010,16 @@ void TCPHandler::receiveUnexpectedQuery() readStringBinary(skip_string, *in); ClientInfo skip_client_info; - if (client_revision >= DBMS_MIN_REVISION_WITH_CLIENT_INFO) - skip_client_info.read(*in, client_revision); + if (client_tcp_protocol_version >= DBMS_MIN_REVISION_WITH_CLIENT_INFO) + skip_client_info.read(*in, client_tcp_protocol_version); Settings skip_settings; - auto settings_format = (client_revision >= DBMS_MIN_REVISION_WITH_SETTINGS_SERIALIZED_AS_STRINGS) ? SettingsWriteFormat::STRINGS_WITH_FLAGS + auto settings_format = (client_tcp_protocol_version >= DBMS_MIN_REVISION_WITH_SETTINGS_SERIALIZED_AS_STRINGS) ? 
SettingsWriteFormat::STRINGS_WITH_FLAGS : SettingsWriteFormat::BINARY; skip_settings.read(*in, settings_format); std::string skip_hash; - bool interserver_secret = client_revision >= DBMS_MIN_REVISION_WITH_INTERSERVER_SECRET; + bool interserver_secret = client_tcp_protocol_version >= DBMS_MIN_REVISION_WITH_INTERSERVER_SECRET; if (interserver_secret) readStringBinary(skip_hash, *in, 32); @@ -1094,7 +1093,7 @@ void TCPHandler::receiveUnexpectedData() auto skip_block_in = std::make_shared( *maybe_compressed_in, last_block_in.header, - client_revision); + client_tcp_protocol_version); skip_block_in->read(); throw NetException("Unexpected packet Data received from client", ErrorCodes::UNEXPECTED_PACKET_FROM_CLIENT); @@ -1121,7 +1120,7 @@ void TCPHandler::initBlockInput() state.block_in = std::make_shared( *state.maybe_compressed_in, header, - client_revision); + client_tcp_protocol_version); } } @@ -1152,7 +1151,7 @@ void TCPHandler::initBlockOutput(const Block & block) state.block_out = std::make_shared( *state.maybe_compressed_out, - client_revision, + client_tcp_protocol_version, block.cloneEmpty(), !connection_context.getSettingsRef().low_cardinality_allow_in_native_format); } @@ -1165,7 +1164,7 @@ void TCPHandler::initLogsBlockOutput(const Block & block) /// Use uncompressed stream since log blocks usually contain only one row state.logs_block_out = std::make_shared( *out, - client_revision, + client_tcp_protocol_version, block.cloneEmpty(), !connection_context.getSettingsRef().low_cardinality_allow_in_native_format); } @@ -1269,7 +1268,7 @@ void TCPHandler::sendProgress() { writeVarUInt(Protocol::Server::Progress, *out); auto increment = state.progress.fetchAndResetPiecewiseAtomically(); - increment.write(*out, client_revision); + increment.write(*out, client_tcp_protocol_version); out->next(); } diff --git a/src/Server/TCPHandler.h b/src/Server/TCPHandler.h index 3771755892f..12149d9a66f 100644 --- a/src/Server/TCPHandler.h +++ b/src/Server/TCPHandler.h @@ -123,7 +123,7 @@ private: UInt64 client_version_major = 0; UInt64 client_version_minor = 0; UInt64 client_version_patch = 0; - UInt64 client_revision = 0; + UInt64 client_tcp_protocol_version = 0; Context connection_context; std::optional query_context; diff --git a/src/Storages/Distributed/DirectoryMonitor.cpp b/src/Storages/Distributed/DirectoryMonitor.cpp index dfb35f62bc4..f40ce1e06fc 100644 --- a/src/Storages/Distributed/DirectoryMonitor.cpp +++ b/src/Storages/Distributed/DirectoryMonitor.cpp @@ -3,7 +3,6 @@ #include #include #include -#include #include #include #include @@ -366,7 +365,7 @@ void StorageDistributedDirectoryMonitor::readHeader( UInt64 initiator_revision; readVarUInt(initiator_revision, header_buf); - if (ClickHouseRevision::get() < initiator_revision) + if (DBMS_TCP_PROTOCOL_VERSION < initiator_revision) { LOG_WARNING(log, "ClickHouse shard version is older than ClickHouse initiator version. 
It may lack support for new features."); } @@ -585,7 +584,7 @@ public: explicit DirectoryMonitorBlockInputStream(const String & file_name) : in(file_name) , decompressing_in(in) - , block_in(decompressing_in, ClickHouseRevision::get()) + , block_in(decompressing_in, DBMS_TCP_PROTOCOL_VERSION) , log{&Poco::Logger::get("DirectoryMonitorBlockInputStream")} { Settings insert_settings; @@ -690,7 +689,7 @@ void StorageDistributedDirectoryMonitor::processFilesWithBatching(const std::map readHeader(in, insert_settings, insert_query, client_info, log); CompressedReadBuffer decompressing_in(in); - NativeBlockInputStream block_in(decompressing_in, ClickHouseRevision::get()); + NativeBlockInputStream block_in(decompressing_in, DBMS_TCP_PROTOCOL_VERSION); block_in.readPrefix(); while (Block block = block_in.read()) diff --git a/src/Storages/Distributed/DistributedBlockOutputStream.cpp b/src/Storages/Distributed/DistributedBlockOutputStream.cpp index 172a398258f..f08cdf76cbf 100644 --- a/src/Storages/Distributed/DistributedBlockOutputStream.cpp +++ b/src/Storages/Distributed/DistributedBlockOutputStream.cpp @@ -21,7 +21,6 @@ #include #include #include -#include #include #include #include @@ -583,16 +582,16 @@ void DistributedBlockOutputStream::writeToShard(const Block & block, const std:: { WriteBufferFromFile out{first_file_tmp_path}; CompressedWriteBuffer compress{out}; - NativeBlockOutputStream stream{compress, ClickHouseRevision::get(), block.cloneEmpty()}; + NativeBlockOutputStream stream{compress, DBMS_TCP_PROTOCOL_VERSION, block.cloneEmpty()}; /// Prepare the header. /// We wrap the header into a string for compatibility with older versions: /// a shard will able to read the header partly and ignore other parts based on its version. WriteBufferFromOwnString header_buf; - writeVarUInt(ClickHouseRevision::get(), header_buf); + writeVarUInt(DBMS_TCP_PROTOCOL_VERSION, header_buf); writeStringBinary(query_string, header_buf); context.getSettingsRef().write(header_buf); - context.getClientInfo().write(header_buf, ClickHouseRevision::get()); + context.getClientInfo().write(header_buf, DBMS_TCP_PROTOCOL_VERSION); /// Add new fields here, for example: /// writeVarUInt(my_new_data, header_buf); diff --git a/src/Storages/IStorage.h b/src/Storages/IStorage.h index dbd18c9558e..40500e78de1 100644 --- a/src/Storages/IStorage.h +++ b/src/Storages/IStorage.h @@ -53,6 +53,9 @@ class QueryPlan; class StoragePolicy; using StoragePolicyPtr = std::shared_ptr; +struct StreamLocalLimits; +class EnabledQuota; + struct ColumnSize { size_t marks = 0; diff --git a/src/Storages/MergeTree/MergeSelector.h b/src/Storages/MergeTree/MergeSelector.h index e460b8ae06a..fcdfcf5b890 100644 --- a/src/Storages/MergeTree/MergeSelector.h +++ b/src/Storages/MergeTree/MergeSelector.h @@ -44,7 +44,7 @@ public: /// Information about different TTLs for part. Can be used by /// TTLSelector to assign merges with TTL. - const MergeTreeDataPartTTLInfos * ttl_infos; + const MergeTreeDataPartTTLInfos * ttl_infos = nullptr; /// Part compression codec definition. ASTPtr compression_codec_desc; diff --git a/src/Storages/MergeTree/MergeTreeData.h b/src/Storages/MergeTree/MergeTreeData.h index 0fc5ec43048..1125eb32b66 100644 --- a/src/Storages/MergeTree/MergeTreeData.h +++ b/src/Storages/MergeTree/MergeTreeData.h @@ -717,6 +717,8 @@ protected: bool require_part_metadata; + /// Relative path data, changes during rename for ordinary databases use + /// under lockForShare if rename is possible. 
String relative_data_path; diff --git a/src/Storages/MergeTree/MergeTreeDataSelectExecutor.cpp b/src/Storages/MergeTree/MergeTreeDataSelectExecutor.cpp index ffd5d616cb0..2b8b886daaf 100644 --- a/src/Storages/MergeTree/MergeTreeDataSelectExecutor.cpp +++ b/src/Storages/MergeTree/MergeTreeDataSelectExecutor.cpp @@ -679,7 +679,7 @@ Pipe MergeTreeDataSelectExecutor::readFromParts( parts_with_ranges.resize(next_part); } - LOG_DEBUG(log, "Selected {} parts by date, {} parts by key, {} marks by primary key, {} marks to read from {} ranges", parts.size(), parts_with_ranges.size(), sum_marks_pk.load(std::memory_order_relaxed), sum_marks, sum_ranges); + LOG_DEBUG(log, "Selected {} parts by partition key, {} parts by primary key, {} marks by primary key, {} marks to read from {} ranges", parts.size(), parts_with_ranges.size(), sum_marks_pk.load(std::memory_order_relaxed), sum_marks, sum_ranges); if (parts_with_ranges.empty()) return {}; diff --git a/src/Storages/MergeTree/ReplicatedMergeTreeCleanupThread.cpp b/src/Storages/MergeTree/ReplicatedMergeTreeCleanupThread.cpp index a5216e6fda3..11f23a5c110 100644 --- a/src/Storages/MergeTree/ReplicatedMergeTreeCleanupThread.cpp +++ b/src/Storages/MergeTree/ReplicatedMergeTreeCleanupThread.cpp @@ -56,10 +56,12 @@ void ReplicatedMergeTreeCleanupThread::run() void ReplicatedMergeTreeCleanupThread::iterate() { storage.clearOldPartsAndRemoveFromZK(); - storage.clearOldWriteAheadLogs(); { auto lock = storage.lockForShare(RWLockImpl::NO_QUERY, storage.getSettings()->lock_acquire_timeout_for_background_operations); + /// Both use relative_data_path which changes during rename, so we + /// do it under share lock + storage.clearOldWriteAheadLogs(); storage.clearOldTemporaryDirectories(); } diff --git a/src/Storages/StorageBuffer.cpp b/src/Storages/StorageBuffer.cpp index 5b9957f4ed4..14f188275e5 100644 --- a/src/Storages/StorageBuffer.cpp +++ b/src/Storages/StorageBuffer.cpp @@ -547,7 +547,7 @@ bool StorageBuffer::optimize( if (deduplicate) throw Exception("DEDUPLICATE cannot be specified when optimizing table of type Buffer", ErrorCodes::NOT_IMPLEMENTED); - flushAllBuffers(false); + flushAllBuffers(false, true); return true; } @@ -595,14 +595,14 @@ bool StorageBuffer::checkThresholdsImpl(size_t rows, size_t bytes, time_t time_p } -void StorageBuffer::flushAllBuffers(const bool check_thresholds) +void StorageBuffer::flushAllBuffers(bool check_thresholds, bool reset_blocks_structure) { for (auto & buf : buffers) - flushBuffer(buf, check_thresholds); + flushBuffer(buf, check_thresholds, false, reset_blocks_structure); } -void StorageBuffer::flushBuffer(Buffer & buffer, bool check_thresholds, bool locked) +void StorageBuffer::flushBuffer(Buffer & buffer, bool check_thresholds, bool locked, bool reset_block_structure) { Block block_to_write; time_t current_time = time(nullptr); @@ -655,6 +655,8 @@ void StorageBuffer::flushBuffer(Buffer & buffer, bool check_thresholds, bool loc try { writeBlockToDestination(block_to_write, DatabaseCatalog::instance().tryGetTable(destination_id, global_context)); + if (reset_block_structure) + buffer.data.clear(); } catch (...) { @@ -829,7 +831,9 @@ void StorageBuffer::alter(const AlterCommands & params, const Context & context, checkAlterIsPossible(params, context.getSettingsRef()); auto metadata_snapshot = getInMemoryMetadataPtr(); - /// So that no blocks of the old structure remain. + /// Flush all buffers to storages, so that no non-empty blocks of the old + /// structure remain. 
Structure of empty blocks will be updated during first + /// insert. optimize({} /*query*/, metadata_snapshot, {} /*partition_id*/, false /*final*/, false /*deduplicate*/, context); StorageInMemoryMetadata new_metadata = *metadata_snapshot; diff --git a/src/Storages/StorageBuffer.h b/src/Storages/StorageBuffer.h index 8f1354399ef..b18b574ec6c 100644 --- a/src/Storages/StorageBuffer.h +++ b/src/Storages/StorageBuffer.h @@ -130,9 +130,11 @@ private: Poco::Logger * log; - void flushAllBuffers(bool check_thresholds = true); - /// Reset the buffer. If check_thresholds is set - resets only if thresholds are exceeded. - void flushBuffer(Buffer & buffer, bool check_thresholds, bool locked = false); + void flushAllBuffers(bool check_thresholds = true, bool reset_blocks_structure = false); + /// Reset the buffer. If check_thresholds is set - resets only if thresholds + /// are exceeded. If reset_block_structure is set - clears inner block + /// structure inside buffer (useful in OPTIMIZE and ALTER). + void flushBuffer(Buffer & buffer, bool check_thresholds, bool locked = false, bool reset_block_structure = false); bool checkThresholds(const Buffer & buffer, time_t current_time, size_t additional_rows = 0, size_t additional_bytes = 0) const; bool checkThresholdsImpl(size_t rows, size_t bytes, time_t time_passed) const; diff --git a/src/Storages/StorageDictionary.cpp b/src/Storages/StorageDictionary.cpp index 5d92b9cec55..e859baa702e 100644 --- a/src/Storages/StorageDictionary.cpp +++ b/src/Storages/StorageDictionary.cpp @@ -92,6 +92,12 @@ String StorageDictionary::generateNamesAndTypesDescription(const NamesAndTypesLi return ss.str(); } +String StorageDictionary::resolvedDictionaryName() const +{ + if (location == Location::SameDatabaseAndNameAsDictionary) + return dictionary_name; + return DatabaseCatalog::instance().resolveDictionaryName(dictionary_name); +} StorageDictionary::StorageDictionary( const StorageID & table_id_, @@ -132,7 +138,7 @@ Pipe StorageDictionary::read( const size_t max_block_size, const unsigned /*threads*/) { - auto dictionary = context.getExternalDictionariesLoader().getDictionary(dictionary_name); + auto dictionary = context.getExternalDictionariesLoader().getDictionary(resolvedDictionaryName()); auto stream = dictionary->getBlockInputStream(column_names, max_block_size); /// TODO: update dictionary interface for processors. return Pipe(std::make_shared(stream)); @@ -152,7 +158,8 @@ void registerStorageDictionary(StorageFactory & factory) if (!args.attach) { - const auto & dictionary = args.context.getExternalDictionariesLoader().getDictionary(dictionary_name); + auto resolved = DatabaseCatalog::instance().resolveDictionaryName(dictionary_name); + const auto & dictionary = args.context.getExternalDictionariesLoader().getDictionary(resolved); const DictionaryStructure & dictionary_structure = dictionary->getStructure(); checkNamesAndTypesCompatibleWithDictionary(dictionary_name, args.columns, dictionary_structure); } diff --git a/src/Storages/StorageDictionary.h b/src/Storages/StorageDictionary.h index d822552124d..5c7beb88d88 100644 --- a/src/Storages/StorageDictionary.h +++ b/src/Storages/StorageDictionary.h @@ -29,6 +29,7 @@ public: static String generateNamesAndTypesDescription(const NamesAndTypesList & list); const String & dictionaryName() const { return dictionary_name; } + String resolvedDictionaryName() const; /// Specifies where the table is located relative to the dictionary. 
enum class Location diff --git a/src/Storages/StorageMergeTree.cpp b/src/Storages/StorageMergeTree.cpp index 347474753dc..55fb42b550e 100644 --- a/src/Storages/StorageMergeTree.cpp +++ b/src/Storages/StorageMergeTree.cpp @@ -919,11 +919,13 @@ BackgroundProcessingPoolTaskResult StorageMergeTree::mergeMutateTask() { { auto share_lock = lockForShare(RWLockImpl::NO_QUERY, getSettings()->lock_acquire_timeout_for_background_operations); + /// All use relative_data_path which changes during rename + /// so execute under share lock. clearOldPartsFromFilesystem(); clearOldTemporaryDirectories(); + clearOldWriteAheadLogs(); } clearOldMutations(); - clearOldWriteAheadLogs(); } ///TODO: read deduplicate option from table config diff --git a/src/Storages/System/StorageSystemProcesses.cpp b/src/Storages/System/StorageSystemProcesses.cpp index c65a6b78e41..d899a1708bf 100644 --- a/src/Storages/System/StorageSystemProcesses.cpp +++ b/src/Storages/System/StorageSystemProcesses.cpp @@ -91,7 +91,7 @@ void StorageSystemProcesses::fillData(MutableColumns & res_columns, const Contex res_columns[i++]->insert(process.client_info.os_user); res_columns[i++]->insert(process.client_info.client_hostname); res_columns[i++]->insert(process.client_info.client_name); - res_columns[i++]->insert(process.client_info.client_revision); + res_columns[i++]->insert(process.client_info.client_tcp_protocol_version); res_columns[i++]->insert(process.client_info.client_version_major); res_columns[i++]->insert(process.client_info.client_version_minor); res_columns[i++]->insert(process.client_info.client_version_patch); diff --git a/src/Storages/System/StorageSystemZooKeeper.cpp b/src/Storages/System/StorageSystemZooKeeper.cpp index 17ab4ed4efb..81a42f1fe63 100644 --- a/src/Storages/System/StorageSystemZooKeeper.cpp +++ b/src/Storages/System/StorageSystemZooKeeper.cpp @@ -8,6 +8,7 @@ #include #include #include +#include #include #include @@ -42,7 +43,7 @@ NamesAndTypesList StorageSystemZooKeeper::getNamesAndTypes() } -static bool extractPathImpl(const IAST & elem, String & res) +static bool extractPathImpl(const IAST & elem, String & res, const Context & context) { const auto * function = elem.as(); if (!function) @@ -51,7 +52,7 @@ static bool extractPathImpl(const IAST & elem, String & res) if (function->name == "and") { for (const auto & child : function->arguments->children) - if (extractPathImpl(*child, res)) + if (extractPathImpl(*child, res, context)) return true; return false; @@ -60,23 +61,24 @@ static bool extractPathImpl(const IAST & elem, String & res) if (function->name == "equals") { const auto & args = function->arguments->as(); - const IAST * value; + ASTPtr value; if (args.children.size() != 2) return false; const ASTIdentifier * ident; if ((ident = args.children.at(0)->as())) - value = args.children.at(1).get(); + value = args.children.at(1); else if ((ident = args.children.at(1)->as())) - value = args.children.at(0).get(); + value = args.children.at(0); else return false; if (ident->name != "path") return false; - const auto * literal = value->as(); + auto evaluated = evaluateConstantExpressionAsLiteral(value, context); + const auto * literal = evaluated->as(); if (!literal) return false; @@ -93,20 +95,20 @@ static bool extractPathImpl(const IAST & elem, String & res) /** Retrieve from the query a condition of the form `path = 'path'`, from conjunctions in the WHERE clause. 
*/ -static String extractPath(const ASTPtr & query) +static String extractPath(const ASTPtr & query, const Context & context) { const auto & select = query->as(); if (!select.where()) return ""; String res; - return extractPathImpl(*select.where(), res) ? res : ""; + return extractPathImpl(*select.where(), res, context) ? res : ""; } void StorageSystemZooKeeper::fillData(MutableColumns & res_columns, const Context & context, const SelectQueryInfo & query_info) const { - String path = extractPath(query_info.query); + String path = extractPath(query_info.query, context); if (path.empty()) throw Exception("SELECT from system.zookeeper table must contain condition like path = 'path' in WHERE clause.", ErrorCodes::BAD_ARGUMENTS); diff --git a/tests/ci/ci_config.json b/tests/ci/ci_config.json index 220d8d801ec..c12c6dad999 100644 --- a/tests/ci/ci_config.json +++ b/tests/ci/ci_config.json @@ -237,7 +237,7 @@ "with_coverage": false } }, - "Functional stateful tests (release, DatabaseAtomic)": { + "Functional stateful tests (release, DatabaseOrdinary)": { "required_build_properties": { "compiler": "gcc-10", "package_type": "deb", @@ -345,7 +345,7 @@ "with_coverage": false } }, - "Functional stateless tests (release, DatabaseAtomic)": { + "Functional stateless tests (release, DatabaseOrdinary)": { "required_build_properties": { "compiler": "gcc-10", "package_type": "deb", @@ -441,6 +441,18 @@ "with_coverage": false } }, + "Integration tests flaky check (asan)": { + "required_build_properties": { + "compiler": "clang-11", + "package_type": "deb", + "build_type": "relwithdebuginfo", + "sanitizer": "address", + "bundled": "bundled", + "splitted": "unsplitted", + "clang-tidy": "disable", + "with_coverage": false + } + }, "Compatibility check": { "required_build_properties": { "compiler": "gcc-10", diff --git a/tests/clickhouse-test b/tests/clickhouse-test index a3bed189d55..2a9c95eb830 100755 --- a/tests/clickhouse-test +++ b/tests/clickhouse-test @@ -107,9 +107,9 @@ def remove_control_characters(s): return s def get_db_engine(args): - if args.atomic_db_engine: - return " ENGINE=Atomic" - return "" + if args.db_engine: + return " ENGINE=" + args.db_engine + return "" # Will use default engine def run_single_test(args, ext, server_logs_level, client_options, case_file, stdout_file, stderr_file): @@ -303,6 +303,12 @@ def run_tests_array(all_tests_with_params): clickhouse_proc = Popen(shlex.split(args.client), stdin=PIPE, stdout=PIPE, stderr=PIPE) clickhouse_proc.communicate("SELECT 'Running test {suite}/{case} from pid={pid}';".format(pid = os.getpid(), case = case, suite = suite)) + if clickhouse_proc.returncode != 0: + failures += 1 + print("Server does not respond to health check") + SERVER_DIED = True + break + reference_file = os.path.join(suite_dir, name) + '.reference' stdout_file = os.path.join(suite_tmp_dir, name) + '.stdout' stderr_file = os.path.join(suite_tmp_dir, name) + '.stderr' @@ -456,7 +462,7 @@ class BuildFlags(object): DEBUG = 'debug-build' UNBUNDLED = 'unbundled-build' RELEASE = 'release-build' - DATABASE_ATOMIC = 'database-atomic' + DATABASE_ORDINARY = 'database-ordinary' POLYMORPHIC_PARTS = 'polymorphic-parts' @@ -501,8 +507,8 @@ def collect_build_flags(client): (stdout, stderr) = clickhouse_proc.communicate("SELECT value FROM system.settings WHERE name = 'default_database_engine'") if clickhouse_proc.returncode == 0: - if 'Atomic' in stdout: - result.append(BuildFlags.DATABASE_ATOMIC) + if 'Ordinary' in stdout: + result.append(BuildFlags.DATABASE_ORDINARY) else: raise 
Exception("Cannot get information about build from server, error code {}, stderr {}".format(clickhouse_proc.returncode, stderr)) @@ -792,7 +798,7 @@ if __name__ == '__main__': parser.add_argument('-r', '--server-check-retries', default=30, type=int, help='Num of tries to execute SELECT 1 before tests started') parser.add_argument('--skip-list-path', help="Path to skip-list file") parser.add_argument('--use-skip-list', action='store_true', default=False, help="Use skip list to skip tests if found") - parser.add_argument('--atomic-db-engine', action='store_true', help='Create databases with Atomic engine by default') + parser.add_argument('--db-engine', help='Database engine name') parser.add_argument('--no-stateless', action='store_true', help='Disable all stateless tests') parser.add_argument('--no-stateful', action='store_true', help='Disable all stateful tests') diff --git a/tests/config/README.md b/tests/config/README.md new file mode 100644 index 00000000000..8dd775a275a --- /dev/null +++ b/tests/config/README.md @@ -0,0 +1,8 @@ +# ClickHouse configs for test environment + +## How to use +CI uses these configs in all checks, installing them with the `install.sh` script. If you want to run all tests from `tests/queries/0_stateless` and `tests/queries/1_stateful` on your local machine, you have to set up the configs from this directory for your `clickhouse-server`. The simplest way is to install them using the `install.sh` script. Another option is to just copy the files into your clickhouse config directory. + +## How to add a new config + +Just place an `.xml` file with the new config into the appropriate directory and add an `ln` command to the `install.sh` script. After that, CI will use this config in all test runs. diff --git a/tests/config/clusters.xml b/tests/config/config.d/clusters.xml similarity index 100% rename from tests/config/clusters.xml rename to tests/config/config.d/clusters.xml diff --git a/tests/config/custom_settings_prefixes.xml b/tests/config/config.d/custom_settings_prefixes.xml similarity index 100% rename from tests/config/custom_settings_prefixes.xml rename to tests/config/config.d/custom_settings_prefixes.xml diff --git a/tests/config/database_atomic_configd.xml b/tests/config/config.d/database_atomic.xml similarity index 100% rename from tests/config/database_atomic_configd.xml rename to tests/config/config.d/database_atomic.xml diff --git a/tests/config/disks.xml b/tests/config/config.d/disks.xml similarity index 100% rename from tests/config/disks.xml rename to tests/config/config.d/disks.xml diff --git a/tests/config/graphite.xml b/tests/config/config.d/graphite.xml similarity index 100% rename from tests/config/graphite.xml rename to tests/config/config.d/graphite.xml diff --git a/tests/config/listen.xml b/tests/config/config.d/listen.xml similarity index 100% rename from tests/config/listen.xml rename to tests/config/config.d/listen.xml diff --git a/tests/config/macros.xml b/tests/config/config.d/macros.xml similarity index 100% rename from tests/config/macros.xml rename to tests/config/config.d/macros.xml diff --git a/tests/config/metric_log.xml b/tests/config/config.d/metric_log.xml similarity index 100% rename from tests/config/metric_log.xml rename to tests/config/config.d/metric_log.xml diff --git a/tests/config/part_log.xml b/tests/config/config.d/part_log.xml similarity index 100% rename from tests/config/part_log.xml rename to tests/config/config.d/part_log.xml diff --git a/tests/config/polymorphic_parts.xml b/tests/config/config.d/polymorphic_parts.xml similarity index 100% rename
from tests/config/polymorphic_parts.xml rename to tests/config/config.d/polymorphic_parts.xml diff --git a/tests/config/query_masking_rules.xml b/tests/config/config.d/query_masking_rules.xml similarity index 100% rename from tests/config/query_masking_rules.xml rename to tests/config/config.d/query_masking_rules.xml diff --git a/tests/config/secure_ports.xml b/tests/config/config.d/secure_ports.xml similarity index 100% rename from tests/config/secure_ports.xml rename to tests/config/config.d/secure_ports.xml diff --git a/tests/config/text_log.xml b/tests/config/config.d/text_log.xml similarity index 100% rename from tests/config/text_log.xml rename to tests/config/config.d/text_log.xml diff --git a/tests/config/zookeeper.xml b/tests/config/config.d/zookeeper.xml similarity index 100% rename from tests/config/zookeeper.xml rename to tests/config/config.d/zookeeper.xml diff --git a/tests/config/database_atomic_usersd.xml b/tests/config/database_atomic_usersd.xml deleted file mode 100644 index 201d476da24..00000000000 --- a/tests/config/database_atomic_usersd.xml +++ /dev/null @@ -1,8 +0,0 @@ - - - - Atomic - 0 - - - diff --git a/tests/config/install.sh b/tests/config/install.sh new file mode 100755 index 00000000000..0f33854ef95 --- /dev/null +++ b/tests/config/install.sh @@ -0,0 +1,54 @@ +#!/bin/bash + +# This script installs the configs for clickhouse server and client that are required +# for testing (stateless and stateful tests) + +set -x -e + +DEST_SERVER_PATH="${1:-/etc/clickhouse-server}" +DEST_CLIENT_PATH="${2:-/etc/clickhouse-client}" +SRC_PATH="$( cd "$(dirname "$0")" >/dev/null 2>&1 ; pwd -P )" + +echo "Going to install test configs from $SRC_PATH into $DEST_SERVER_PATH" + +mkdir -p $DEST_SERVER_PATH/config.d/ +mkdir -p $DEST_SERVER_PATH/users.d/ +mkdir -p $DEST_CLIENT_PATH + +ln -s $SRC_PATH/config.d/zookeeper.xml $DEST_SERVER_PATH/config.d/ +ln -s $SRC_PATH/config.d/listen.xml $DEST_SERVER_PATH/config.d/ +ln -s $SRC_PATH/config.d/part_log.xml $DEST_SERVER_PATH/config.d/ +ln -s $SRC_PATH/config.d/text_log.xml $DEST_SERVER_PATH/config.d/ +ln -s $SRC_PATH/config.d/metric_log.xml $DEST_SERVER_PATH/config.d/ +ln -s $SRC_PATH/config.d/custom_settings_prefixes.xml $DEST_SERVER_PATH/config.d/ +ln -s $SRC_PATH/config.d/macros.xml $DEST_SERVER_PATH/config.d/ +ln -s $SRC_PATH/config.d/disks.xml $DEST_SERVER_PATH/config.d/ +ln -s $SRC_PATH/config.d/secure_ports.xml $DEST_SERVER_PATH/config.d/ +ln -s $SRC_PATH/config.d/clusters.xml $DEST_SERVER_PATH/config.d/ +ln -s $SRC_PATH/config.d/graphite.xml $DEST_SERVER_PATH/config.d/ +ln -s $SRC_PATH/config.d/database_atomic.xml $DEST_SERVER_PATH/config.d/ +ln -s $SRC_PATH/users.d/log_queries.xml $DEST_SERVER_PATH/users.d/ +ln -s $SRC_PATH/users.d/readonly.xml $DEST_SERVER_PATH/users.d/ +ln -s $SRC_PATH/users.d/access_management.xml $DEST_SERVER_PATH/users.d/ + +ln -s $SRC_PATH/ints_dictionary.xml $DEST_SERVER_PATH/ +ln -s $SRC_PATH/strings_dictionary.xml $DEST_SERVER_PATH/ +ln -s $SRC_PATH/decimals_dictionary.xml $DEST_SERVER_PATH/ +ln -s $SRC_PATH/executable_dictionary.xml $DEST_SERVER_PATH/ + +ln -s $SRC_PATH/server.key $DEST_SERVER_PATH/ +ln -s $SRC_PATH/server.crt $DEST_SERVER_PATH/ +ln -s $SRC_PATH/dhparam.pem $DEST_SERVER_PATH/ + +# Retain any pre-existing config and allow ClickHouse to load it if required +ln -s --backup=simple --suffix=_original.xml \ + $SRC_PATH/config.d/query_masking_rules.xml $DEST_SERVER_PATH/config.d/ + +if [[ -n "$USE_POLYMORPHIC_PARTS" ]] && [[ "$USE_POLYMORPHIC_PARTS" -eq 1 ]]; then + ln -s
$SRC_PATH/config.d/polymorphic_parts.xml $DEST_SERVER_PATH/config.d/ +fi +if [[ -n "$USE_DATABASE_ORDINARY" ]] && [[ "$USE_DATABASE_ORDINARY" -eq 1 ]]; then + ln -s $SRC_PATH/users.d/database_ordinary.xml $DEST_SERVER_PATH/users.d/ +fi + +ln -sf $SRC_PATH/client_config.xml $DEST_CLIENT_PATH/config.xml diff --git a/tests/config/access_management.xml b/tests/config/users.d/access_management.xml similarity index 100% rename from tests/config/access_management.xml rename to tests/config/users.d/access_management.xml diff --git a/tests/config/users.d/database_ordinary.xml b/tests/config/users.d/database_ordinary.xml new file mode 100644 index 00000000000..68f3b044f75 --- /dev/null +++ b/tests/config/users.d/database_ordinary.xml @@ -0,0 +1,7 @@ + + + + Ordinary + + + diff --git a/tests/config/log_queries.xml b/tests/config/users.d/log_queries.xml similarity index 100% rename from tests/config/log_queries.xml rename to tests/config/users.d/log_queries.xml diff --git a/tests/config/readonly.xml b/tests/config/users.d/readonly.xml similarity index 100% rename from tests/config/readonly.xml rename to tests/config/users.d/readonly.xml diff --git a/tests/integration/helpers/cluster.py b/tests/integration/helpers/cluster.py index 6d0f038daed..0b7fa9264bd 100644 --- a/tests/integration/helpers/cluster.py +++ b/tests/integration/helpers/cluster.py @@ -486,8 +486,8 @@ class ClickHouseCluster: start = time.time() while time.time() - start < timeout: try: - connection.database_names() - print "Connected to Mongo dbs:", connection.database_names() + connection.list_database_names() + print "Connected to Mongo dbs:", connection.list_database_names() return except Exception as ex: print "Can't connect to Mongo " + str(ex) diff --git a/tests/integration/helpers/external_sources.py b/tests/integration/helpers/external_sources.py index 0d01a1bcbfd..a52cf7a02d8 100644 --- a/tests/integration/helpers/external_sources.py +++ b/tests/integration/helpers/external_sources.py @@ -333,16 +333,16 @@ class _SourceExecutableBase(ExternalSource): user='root') -class SourceExecutableCache(_SourceExecutableBase): +class SourceExecutableHashed(_SourceExecutableBase): def _get_cmd(self, path): return "cat {}".format(path) def compatible_with_layout(self, layout): - return 'cache' not in layout.name + return 'hashed' in layout.name -class SourceExecutableHashed(_SourceExecutableBase): +class SourceExecutableCache(_SourceExecutableBase): def _get_cmd(self, path): return "cat - >/dev/null;cat {}".format(path) diff --git a/tests/integration/helpers/test_tools.py b/tests/integration/helpers/test_tools.py index d196142c518..9fbffe41819 100644 --- a/tests/integration/helpers/test_tools.py +++ b/tests/integration/helpers/test_tools.py @@ -60,3 +60,19 @@ def assert_eq_with_retry(instance, query, expectation, retry_count=20, sleep_tim if expectation_tsv != val: raise AssertionError("'{}' != '{}'\n{}".format(expectation_tsv, val, '\n'.join( expectation_tsv.diff(val, n1="expectation", n2="query")))) + +def assert_logs_contain(instance, substring): + if not instance.contains_in_log(substring): + raise AssertionError("'{}' not found in logs".format(substring)) + +def assert_logs_contain_with_retry(instance, substring, retry_count=20, sleep_time=0.5): + for i in xrange(retry_count): + try: + if instance.contains_in_log(substring): + break + time.sleep(sleep_time) + except Exception as ex: + print "contains_in_log_with_retry retry {} exception {}".format(i + 1, ex) + time.sleep(sleep_time) + else: + raise AssertionError("'{}' not found in 
logs".format(substring)) diff --git a/tests/integration/test_atomic_drop_table/test.py b/tests/integration/test_atomic_drop_table/test.py index 7ff06c7f369..dc1ad47aa75 100644 --- a/tests/integration/test_atomic_drop_table/test.py +++ b/tests/integration/test_atomic_drop_table/test.py @@ -13,7 +13,7 @@ node1 = cluster.add_instance('node1', main_configs=["configs/config.d/zookeeper_ def start_cluster(): try: cluster.start() - node1.query("CREATE DATABASE zktest ENGINE=Ordinary;") + node1.query("CREATE DATABASE zktest ENGINE=Ordinary;") # Different behaviour with Atomic node1.query( ''' CREATE TABLE zktest.atomic_drop_table (n UInt32) diff --git a/tests/integration/test_backup_restore/test.py b/tests/integration/test_backup_restore/test.py index 111dc6d24f8..170266aaaea 100644 --- a/tests/integration/test_backup_restore/test.py +++ b/tests/integration/test_backup_restore/test.py @@ -14,7 +14,7 @@ path_to_data = '/var/lib/clickhouse/' def started_cluster(): try: cluster.start() - q('CREATE DATABASE test ENGINE = Ordinary') + q('CREATE DATABASE test ENGINE = Ordinary') # Different path in shadow/ with Atomic yield cluster diff --git a/tests/integration/test_backup_with_other_granularity/test.py b/tests/integration/test_backup_with_other_granularity/test.py index df8bd6ab56f..5ed1cb06787 100644 --- a/tests/integration/test_backup_with_other_granularity/test.py +++ b/tests/integration/test_backup_with_other_granularity/test.py @@ -17,6 +17,7 @@ node4 = cluster.add_instance('node4') def started_cluster(): try: cluster.start() + yield cluster finally: cluster.shutdown() @@ -141,22 +142,24 @@ def test_backup_from_old_version_config(started_cluster): def test_backup_and_alter(started_cluster): - node4.query("CREATE TABLE backup_table(A Int64, B String, C Date) Engine = MergeTree order by tuple()") + node4.query("CREATE DATABASE test ENGINE=Ordinary") # Different path in shadow/ with Atomic - node4.query("INSERT INTO backup_table VALUES(2, '2', toDate('2019-10-01'))") + node4.query("CREATE TABLE test.backup_table(A Int64, B String, C Date) Engine = MergeTree order by tuple()") - node4.query("ALTER TABLE backup_table FREEZE PARTITION tuple();") + node4.query("INSERT INTO test.backup_table VALUES(2, '2', toDate('2019-10-01'))") - node4.query("ALTER TABLE backup_table DROP COLUMN C") + node4.query("ALTER TABLE test.backup_table FREEZE PARTITION tuple();") - node4.query("ALTER TABLE backup_table MODIFY COLUMN B UInt64") + node4.query("ALTER TABLE test.backup_table DROP COLUMN C") - node4.query("ALTER TABLE backup_table DROP PARTITION tuple()") + node4.query("ALTER TABLE test.backup_table MODIFY COLUMN B UInt64") + + node4.query("ALTER TABLE test.backup_table DROP PARTITION tuple()") node4.exec_in_container(['bash', '-c', - 'cp -r /var/lib/clickhouse/shadow/1/data/default/backup_table/all_1_1_0/ /var/lib/clickhouse/data/default/backup_table/detached']) + 'cp -r /var/lib/clickhouse/shadow/1/data/test/backup_table/all_1_1_0/ /var/lib/clickhouse/data/test/backup_table/detached']) - node4.query("ALTER TABLE backup_table ATTACH PARTITION tuple()") + node4.query("ALTER TABLE test.backup_table ATTACH PARTITION tuple()") - assert node4.query("SELECT sum(A) FROM backup_table") == "2\n" - assert node4.query("SELECT B + 2 FROM backup_table") == "4\n" + assert node4.query("SELECT sum(A) FROM test.backup_table") == "2\n" + assert node4.query("SELECT B + 2 FROM test.backup_table") == "4\n" diff --git a/tests/integration/test_cluster_copier/task0_description.xml 
b/tests/integration/test_cluster_copier/task0_description.xml index 72eff8d464d..d56053ffd39 100644 --- a/tests/integration/test_cluster_copier/task0_description.xml +++ b/tests/integration/test_cluster_copier/task0_description.xml @@ -33,7 +33,7 @@ 3 4 5 6 1 2 0 - ENGINE=ReplicatedMergeTree('/clickhouse/tables/cluster{cluster}/{shard}/hits', '{replica}') PARTITION BY d % 3 ORDER BY (d, sipHash64(d)) SAMPLE BY sipHash64(d) SETTINGS index_granularity = 16 + ENGINE=ReplicatedMergeTree PARTITION BY d % 3 ORDER BY (d, sipHash64(d)) SAMPLE BY sipHash64(d) SETTINGS index_granularity = 16 d + 1 @@ -93,4 +93,4 @@ - \ No newline at end of file + diff --git a/tests/integration/test_cluster_copier/task_month_to_week_description.xml b/tests/integration/test_cluster_copier/task_month_to_week_description.xml index ee134603310..26dfc7d3e00 100644 --- a/tests/integration/test_cluster_copier/task_month_to_week_description.xml +++ b/tests/integration/test_cluster_copier/task_month_to_week_description.xml @@ -34,7 +34,7 @@ ENGINE= - ReplicatedMergeTree('/clickhouse/tables/cluster{cluster}/{shard}/b', '{replica}') + ReplicatedMergeTree PARTITION BY toMonday(date) ORDER BY d @@ -97,4 +97,4 @@ - \ No newline at end of file + diff --git a/tests/integration/test_cluster_copier/task_test_block_size.xml b/tests/integration/test_cluster_copier/task_test_block_size.xml index ea63d580c1c..c9c99a083ea 100644 --- a/tests/integration/test_cluster_copier/task_test_block_size.xml +++ b/tests/integration/test_cluster_copier/task_test_block_size.xml @@ -28,7 +28,7 @@ ENGINE= - ReplicatedMergeTree('/clickhouse/tables/cluster{cluster}/{shard}/test_block_size', '{replica}') + ReplicatedMergeTree ORDER BY d PARTITION BY partition @@ -99,4 +99,4 @@ - \ No newline at end of file + diff --git a/tests/integration/test_cluster_copier/test.py b/tests/integration/test_cluster_copier/test.py index 2a9e696ca46..88dac06f158 100644 --- a/tests/integration/test_cluster_copier/test.py +++ b/tests/integration/test_cluster_copier/test.py @@ -81,11 +81,11 @@ class Task1: for cluster_num in ["0", "1"]: ddl_check_query(instance, "DROP DATABASE IF EXISTS default ON CLUSTER cluster{}".format(cluster_num)) ddl_check_query(instance, - "CREATE DATABASE IF NOT EXISTS default ON CLUSTER cluster{} ENGINE=Ordinary".format( + "CREATE DATABASE IF NOT EXISTS default ON CLUSTER cluster{}".format( cluster_num)) ddl_check_query(instance, "CREATE TABLE hits ON CLUSTER cluster0 (d UInt64, d1 UInt64 MATERIALIZED d+1) " + - "ENGINE=ReplicatedMergeTree('/clickhouse/tables/cluster_{cluster}/{shard}/hits', '{replica}') " + + "ENGINE=ReplicatedMergeTree " + "PARTITION BY d % 3 ORDER BY (d, sipHash64(d)) SAMPLE BY sipHash64(d) SETTINGS index_granularity = 16") ddl_check_query(instance, "CREATE TABLE hits_all ON CLUSTER cluster0 (d UInt64) ENGINE=Distributed(cluster0, default, hits, d)") @@ -110,10 +110,11 @@ class Task1: class Task2: - def __init__(self, cluster): + def __init__(self, cluster, unique_zk_path): self.cluster = cluster self.zk_task_path = "/clickhouse-copier/task_month_to_week_partition" self.copier_task_config = open(os.path.join(CURRENT_TEST_DIR, 'task_month_to_week_description.xml'), 'r').read() + self.unique_zk_path = unique_zk_path def start(self): instance = cluster.instances['s0_0_0'] @@ -121,11 +122,13 @@ class Task2: for cluster_num in ["0", "1"]: ddl_check_query(instance, "DROP DATABASE IF EXISTS default ON CLUSTER cluster{}".format(cluster_num)) ddl_check_query(instance, - "CREATE DATABASE IF NOT EXISTS default ON CLUSTER cluster{} 
ENGINE=Ordinary".format( + "CREATE DATABASE IF NOT EXISTS default ON CLUSTER cluster{}".format( cluster_num)) ddl_check_query(instance, - "CREATE TABLE a ON CLUSTER cluster0 (date Date, d UInt64, d1 UInt64 ALIAS d+1) ENGINE=ReplicatedMergeTree('/clickhouse/tables/cluster_{cluster}/{shard}/a', '{replica}', date, intHash64(d), (date, intHash64(d)), 8192)") + "CREATE TABLE a ON CLUSTER cluster0 (date Date, d UInt64, d1 UInt64 ALIAS d+1) " + "ENGINE=ReplicatedMergeTree('/clickhouse/tables/cluster_{cluster}/{shard}/" + self.unique_zk_path + "', " + "'{replica}', date, intHash64(d), (date, intHash64(d)), 8192)") ddl_check_query(instance, "CREATE TABLE a_all ON CLUSTER cluster0 (date Date, d UInt64) ENGINE=Distributed(cluster0, default, a, d)") @@ -169,7 +172,7 @@ class Task_test_block_size: ddl_check_query(instance, """ CREATE TABLE test_block_size ON CLUSTER shard_0_0 (partition Date, d UInt64) - ENGINE=ReplicatedMergeTree('/clickhouse/tables/cluster_{cluster}/{shard}/test_block_size', '{replica}') + ENGINE=ReplicatedMergeTree ORDER BY (d, sipHash64(d)) SAMPLE BY sipHash64(d)""", 2) instance.query( @@ -332,17 +335,17 @@ def test_copy_with_recovering_after_move_faults(started_cluster, use_sample_offs @pytest.mark.timeout(600) def test_copy_month_to_week_partition(started_cluster): - execute_task(Task2(started_cluster), []) + execute_task(Task2(started_cluster, "test1"), []) @pytest.mark.timeout(600) def test_copy_month_to_week_partition_with_recovering(started_cluster): - execute_task(Task2(started_cluster), ['--copy-fault-probability', str(COPYING_FAIL_PROBABILITY)]) + execute_task(Task2(started_cluster, "test2"), ['--copy-fault-probability', str(COPYING_FAIL_PROBABILITY)]) @pytest.mark.timeout(600) def test_copy_month_to_week_partition_with_recovering_after_move_faults(started_cluster): - execute_task(Task2(started_cluster), ['--move-fault-probability', str(MOVING_FAIL_PROBABILITY)]) + execute_task(Task2(started_cluster, "test3"), ['--move-fault-probability', str(MOVING_FAIL_PROBABILITY)]) def test_block_size(started_cluster): diff --git a/tests/integration/test_cluster_copier/trivial_test.py b/tests/integration/test_cluster_copier/trivial_test.py index 3d0c5d0f5b0..035faf0bb9f 100644 --- a/tests/integration/test_cluster_copier/trivial_test.py +++ b/tests/integration/test_cluster_copier/trivial_test.py @@ -59,7 +59,7 @@ class TaskTrivial: for node in [source, destination]: node.query("DROP DATABASE IF EXISTS default") - node.query("CREATE DATABASE IF NOT EXISTS default ENGINE=Ordinary") + node.query("CREATE DATABASE IF NOT EXISTS default") source.query("CREATE TABLE trivial (d UInt64, d1 UInt64 MATERIALIZED d+1) " "ENGINE=ReplicatedMergeTree('/clickhouse/tables/source_trivial_cluster/1/trivial', '1') " diff --git a/tests/integration/test_custom_settings/configs/config.d/text_log.xml b/tests/integration/test_custom_settings/configs/config.d/text_log.xml deleted file mode 100644 index f386249f170..00000000000 --- a/tests/integration/test_custom_settings/configs/config.d/text_log.xml +++ /dev/null @@ -1,3 +0,0 @@ - - - diff --git a/tests/integration/test_custom_settings/configs/users.d/custom_settings.xml b/tests/integration/test_custom_settings/configs/custom_settings.xml similarity index 56% rename from tests/integration/test_custom_settings/configs/users.d/custom_settings.xml rename to tests/integration/test_custom_settings/configs/custom_settings.xml index f32d0f3626d..d3865b434e6 100644 --- a/tests/integration/test_custom_settings/configs/users.d/custom_settings.xml +++ 
b/tests/integration/test_custom_settings/configs/custom_settings.xml @@ -6,13 +6,5 @@ Float64_-43.25e-1 'some text' - - - 1 - - - - 1 - diff --git a/tests/integration/test_custom_settings/configs/illformed_setting.xml b/tests/integration/test_custom_settings/configs/illformed_setting.xml new file mode 100644 index 00000000000..267978a8af9 --- /dev/null +++ b/tests/integration/test_custom_settings/configs/illformed_setting.xml @@ -0,0 +1,7 @@ + + + + 1 + + + diff --git a/tests/integration/test_custom_settings/test.py b/tests/integration/test_custom_settings/test.py index 32df79ec1e9..7e147f999a9 100644 --- a/tests/integration/test_custom_settings/test.py +++ b/tests/integration/test_custom_settings/test.py @@ -1,9 +1,10 @@ import pytest +import os from helpers.cluster import ClickHouseCluster +SCRIPT_DIR = os.path.dirname(os.path.realpath(__file__)) cluster = ClickHouseCluster(__file__) -node = cluster.add_instance('node', main_configs=["configs/config.d/text_log.xml"], - user_configs=["configs/users.d/custom_settings.xml"]) +node = cluster.add_instance('node') @pytest.fixture(scope="module", autouse=True) @@ -16,28 +17,17 @@ def started_cluster(): cluster.shutdown() -def test(): +def test_custom_settings(): + node.copy_file_to_container(os.path.join(SCRIPT_DIR, "configs/custom_settings.xml"), '/etc/clickhouse-server/users.d/z.xml') + node.query("SYSTEM RELOAD CONFIG") + assert node.query("SELECT getSetting('custom_a')") == "-5\n" assert node.query("SELECT getSetting('custom_b')") == "10000000000\n" assert node.query("SELECT getSetting('custom_c')") == "-4.325\n" assert node.query("SELECT getSetting('custom_d')") == "some text\n" - assert "custom_a = -5, custom_b = 10000000000, custom_c = -4.325, custom_d = \\'some text\\'" \ - in node.query("SHOW CREATE SETTINGS PROFILE default") - assert "no settings profile" in node.query_and_get_error( - "SHOW CREATE SETTINGS PROFILE profile_with_unknown_setting") - assert "no settings profile" in node.query_and_get_error("SHOW CREATE SETTINGS PROFILE profile_illformed_setting") - - -def test_invalid_settings(): - node.query("SYSTEM RELOAD CONFIG") - node.query("SYSTEM FLUSH LOGS") - - assert node.query("SELECT COUNT() FROM system.text_log WHERE" - " message LIKE '%Could not parse profile `profile_illformed_setting`%'" - " AND message LIKE '%Couldn\\'t restore Field from dump%'") == "1\n" - - assert node.query("SELECT COUNT() FROM system.text_log WHERE" - " message LIKE '%Could not parse profile `profile_with_unknown_setting`%'" - " AND message LIKE '%Setting x is neither a builtin setting nor started with the prefix \\'custom_\\'%'") == "1\n" +def test_illformed_setting(): + node.copy_file_to_container(os.path.join(SCRIPT_DIR, "configs/illformed_setting.xml"), '/etc/clickhouse-server/users.d/z.xml') + error_message = "Couldn't restore Field from dump: 1" + assert error_message in node.query_and_get_error("SYSTEM RELOAD CONFIG") diff --git a/tests/integration/test_dictionaries_all_layouts_and_sources/test.py b/tests/integration/test_dictionaries_all_layouts_and_sources/test.py deleted file mode 100644 index 5880ead7c5a..00000000000 --- a/tests/integration/test_dictionaries_all_layouts_and_sources/test.py +++ /dev/null @@ -1,346 +0,0 @@ -import math -import os - -import pytest -from helpers.cluster import ClickHouseCluster -from helpers.dictionary import Field, Row, Dictionary, DictionaryStructure, Layout -from helpers.external_sources import SourceMongo, SourceMongoURI, SourceHTTP, SourceHTTPS, SourceCassandra -from helpers.external_sources import 
SourceMySQL, SourceClickHouse, SourceFile, SourceExecutableCache, \ - SourceExecutableHashed - -SCRIPT_DIR = os.path.dirname(os.path.realpath(__file__)) -dict_configs_path = os.path.join(SCRIPT_DIR, 'configs/dictionaries') - -FIELDS = { - "simple": [ - Field("KeyField", 'UInt64', is_key=True, default_value_for_get=9999999), - Field("UInt8_", 'UInt8', default_value_for_get=55), - Field("UInt16_", 'UInt16', default_value_for_get=66), - Field("UInt32_", 'UInt32', default_value_for_get=77), - Field("UInt64_", 'UInt64', default_value_for_get=88), - Field("Int8_", 'Int8', default_value_for_get=-55), - Field("Int16_", 'Int16', default_value_for_get=-66), - Field("Int32_", 'Int32', default_value_for_get=-77), - Field("Int64_", 'Int64', default_value_for_get=-88), - Field("UUID_", 'UUID', default_value_for_get='550e8400-0000-0000-0000-000000000000'), - Field("Date_", 'Date', default_value_for_get='2018-12-30'), - Field("DateTime_", 'DateTime', default_value_for_get='2018-12-30 00:00:00'), - Field("String_", 'String', default_value_for_get='hi'), - Field("Float32_", 'Float32', default_value_for_get=555.11), - Field("Float64_", 'Float64', default_value_for_get=777.11), - Field("ParentKeyField", "UInt64", default_value_for_get=444, hierarchical=True) - ], - "complex": [ - Field("KeyField1", 'UInt64', is_key=True, default_value_for_get=9999999), - Field("KeyField2", 'String', is_key=True, default_value_for_get='xxxxxxxxx'), - Field("UInt8_", 'UInt8', default_value_for_get=55), - Field("UInt16_", 'UInt16', default_value_for_get=66), - Field("UInt32_", 'UInt32', default_value_for_get=77), - Field("UInt64_", 'UInt64', default_value_for_get=88), - Field("Int8_", 'Int8', default_value_for_get=-55), - Field("Int16_", 'Int16', default_value_for_get=-66), - Field("Int32_", 'Int32', default_value_for_get=-77), - Field("Int64_", 'Int64', default_value_for_get=-88), - Field("UUID_", 'UUID', default_value_for_get='550e8400-0000-0000-0000-000000000000'), - Field("Date_", 'Date', default_value_for_get='2018-12-30'), - Field("DateTime_", 'DateTime', default_value_for_get='2018-12-30 00:00:00'), - Field("String_", 'String', default_value_for_get='hi'), - Field("Float32_", 'Float32', default_value_for_get=555.11), - Field("Float64_", 'Float64', default_value_for_get=777.11), - ], - "ranged": [ - Field("KeyField1", 'UInt64', is_key=True), - Field("KeyField2", 'Date', is_range_key=True), - Field("StartDate", 'Date', range_hash_type='min'), - Field("EndDate", 'Date', range_hash_type='max'), - Field("UInt8_", 'UInt8', default_value_for_get=55), - Field("UInt16_", 'UInt16', default_value_for_get=66), - Field("UInt32_", 'UInt32', default_value_for_get=77), - Field("UInt64_", 'UInt64', default_value_for_get=88), - Field("Int8_", 'Int8', default_value_for_get=-55), - Field("Int16_", 'Int16', default_value_for_get=-66), - Field("Int32_", 'Int32', default_value_for_get=-77), - Field("Int64_", 'Int64', default_value_for_get=-88), - Field("UUID_", 'UUID', default_value_for_get='550e8400-0000-0000-0000-000000000000'), - Field("Date_", 'Date', default_value_for_get='2018-12-30'), - Field("DateTime_", 'DateTime', default_value_for_get='2018-12-30 00:00:00'), - Field("String_", 'String', default_value_for_get='hi'), - Field("Float32_", 'Float32', default_value_for_get=555.11), - Field("Float64_", 'Float64', default_value_for_get=777.11), - ] -} - -VALUES = { - "simple": [ - [1, 22, 333, 4444, 55555, -6, -77, - -888, -999, '550e8400-e29b-41d4-a716-446655440003', - '1973-06-28', '1985-02-28 23:43:25', 'hello', 22.543, 3332154213.4, 0], 
- [2, 3, 4, 5, 6, -7, -8, - -9, -10, '550e8400-e29b-41d4-a716-446655440002', - '1978-06-28', '1986-02-28 23:42:25', 'hello', 21.543, 3222154213.4, 1] - ], - "complex": [ - [1, 'world', 22, 333, 4444, 55555, -6, - -77, -888, -999, '550e8400-e29b-41d4-a716-446655440003', - '1973-06-28', '1985-02-28 23:43:25', - 'hello', 22.543, 3332154213.4], - [2, 'qwerty2', 52, 2345, 6544, 9191991, -2, - -717, -81818, -92929, '550e8400-e29b-41d4-a716-446655440007', - '1975-09-28', '2000-02-28 23:33:24', - 'my', 255.543, 3332221.44] - - ], - "ranged": [ - [1, '2019-02-10', '2019-02-01', '2019-02-28', - 22, 333, 4444, 55555, -6, -77, -888, -999, - '550e8400-e29b-41d4-a716-446655440003', - '1973-06-28', '1985-02-28 23:43:25', 'hello', - 22.543, 3332154213.4], - [2, '2019-04-10', '2019-04-01', '2019-04-28', - 11, 3223, 41444, 52515, -65, -747, -8388, -9099, - '550e8400-e29b-41d4-a716-446655440004', - '1973-06-29', '2002-02-28 23:23:25', '!!!!', - 32.543, 3332543.4] - ] -} - -LAYOUTS = [ - Layout("flat"), - Layout("hashed"), - Layout("cache"), - Layout("complex_key_hashed"), - Layout("complex_key_cache"), - Layout("range_hashed"), - Layout("direct"), - Layout("complex_key_direct") -] - -SOURCES = [ - SourceCassandra("Cassandra", "localhost", "9043", "cassandra1", "9042", "", ""), - SourceMongo("MongoDB", "localhost", "27018", "mongo1", "27017", "root", "clickhouse"), - SourceMongoURI("MongoDB_URI", "localhost", "27018", "mongo1", "27017", "root", "clickhouse"), - SourceMySQL("MySQL", "localhost", "3308", "mysql1", "3306", "root", "clickhouse"), - SourceClickHouse("RemoteClickHouse", "localhost", "9000", "clickhouse1", "9000", "default", ""), - SourceClickHouse("LocalClickHouse", "localhost", "9000", "node", "9000", "default", ""), - SourceFile("File", "localhost", "9000", "node", "9000", "", ""), - SourceExecutableHashed("ExecutableHashed", "localhost", "9000", "node", "9000", "", ""), - SourceExecutableCache("ExecutableCache", "localhost", "9000", "node", "9000", "", ""), - SourceHTTP("SourceHTTP", "localhost", "9000", "clickhouse1", "9000", "", ""), - SourceHTTPS("SourceHTTPS", "localhost", "9000", "clickhouse1", "9000", "", ""), -] - -DICTIONARIES = [] - -cluster = None -node = None - - -def get_dict(source, layout, fields, suffix_name=''): - global dict_configs_path - - structure = DictionaryStructure(layout, fields) - dict_name = source.name + "_" + layout.name + '_' + suffix_name - dict_path = os.path.join(dict_configs_path, dict_name + '.xml') - dictionary = Dictionary(dict_name, structure, source, dict_path, "table_" + dict_name, fields) - dictionary.generate_config() - return dictionary - - -def setup_module(module): - global DICTIONARIES - global cluster - global node - global dict_configs_path - - for f in os.listdir(dict_configs_path): - os.remove(os.path.join(dict_configs_path, f)) - - for layout in LAYOUTS: - for source in SOURCES: - if source.compatible_with_layout(layout): - DICTIONARIES.append(get_dict(source, layout, FIELDS[layout.layout_type])) - else: - print "Source", source.name, "incompatible with layout", layout.name - - cluster = ClickHouseCluster(__file__) - - main_configs = [] - main_configs.append(os.path.join('configs', 'disable_ssl_verification.xml')) - - cluster.add_instance('clickhouse1', main_configs=main_configs) - - dictionaries = [] - for fname in os.listdir(dict_configs_path): - dictionaries.append(os.path.join(dict_configs_path, fname)) - - node = cluster.add_instance('node', main_configs=main_configs, dictionaries=dictionaries, with_mysql=True, - with_mongo=True, 
with_redis=True, with_cassandra=True) - - -@pytest.fixture(scope="module") -def started_cluster(): - try: - cluster.start() - for dictionary in DICTIONARIES: - print "Preparing", dictionary.name - dictionary.prepare_source(cluster) - print "Prepared" - - yield cluster - - finally: - cluster.shutdown() - - -def get_dictionaries(fold, total_folds, all_dicts): - chunk_len = int(math.ceil(len(all_dicts) / float(total_folds))) - if chunk_len * fold >= len(all_dicts): - return [] - return all_dicts[fold * chunk_len: (fold + 1) * chunk_len] - - -def remove_mysql_dicts(): - """ - We have false-positive race condition in our openSSL version. - MySQL dictionary use OpenSSL, so to prevent known failure we - disable tests for these dictionaries. - - Read of size 8 at 0x7b3c00005dd0 by thread T61 (mutexes: write M1010349240585225536): - #0 EVP_CIPHER_mode (clickhouse+0x13b2223b) - #1 do_ssl3_write (clickhouse+0x13a137bc) - #2 ssl3_write_bytes (clickhouse+0x13a12387) - #3 ssl3_write (clickhouse+0x139db0e6) - #4 ssl_write_internal (clickhouse+0x139eddce) - #5 SSL_write (clickhouse+0x139edf20) - #6 ma_tls_write (clickhouse+0x139c7557) - #7 ma_pvio_tls_write (clickhouse+0x139a8f59) - #8 ma_pvio_write (clickhouse+0x139a8488) - #9 ma_net_real_write (clickhouse+0x139a4e2c) - #10 ma_net_write_command (clickhouse+0x139a546d) - #11 mthd_my_send_cmd (clickhouse+0x13992546) - #12 mysql_close_slow_part (clickhouse+0x13999afd) - #13 mysql_close (clickhouse+0x13999071) - #14 mysqlxx::Connection::~Connection() (clickhouse+0x1370f814) - #15 mysqlxx::Pool::~Pool() (clickhouse+0x13715a7b) - - TODO remove this when open ssl will be fixed or thread sanitizer will be suppressed - """ - - # global DICTIONARIES - # DICTIONARIES = [d for d in DICTIONARIES if not d.name.startswith("MySQL")] - - -@pytest.mark.parametrize("fold", list(range(10))) -def test_simple_dictionaries(started_cluster, fold): - if node.is_built_with_thread_sanitizer(): - remove_mysql_dicts() - - fields = FIELDS["simple"] - values = VALUES["simple"] - data = [Row(fields, vals) for vals in values] - - all_simple_dicts = [d for d in DICTIONARIES if d.structure.layout.layout_type == "simple"] - simple_dicts = get_dictionaries(fold, 10, all_simple_dicts) - - print "Length of dicts:", len(simple_dicts) - for dct in simple_dicts: - dct.load_data(data) - - node.query("system reload dictionaries") - - queries_with_answers = [] - for dct in simple_dicts: - for row in data: - for field in fields: - if not field.is_key: - for query in dct.get_select_get_queries(field, row): - queries_with_answers.append((query, row.get_value_by_name(field.name))) - - for query in dct.get_select_has_queries(field, row): - queries_with_answers.append((query, 1)) - - for query in dct.get_select_get_or_default_queries(field, row): - queries_with_answers.append((query, field.default_value_for_get)) - for query in dct.get_hierarchical_queries(data[0]): - queries_with_answers.append((query, [1])) - - for query in dct.get_hierarchical_queries(data[1]): - queries_with_answers.append((query, [2, 1])) - - for query in dct.get_is_in_queries(data[0], data[1]): - queries_with_answers.append((query, 0)) - - for query in dct.get_is_in_queries(data[1], data[0]): - queries_with_answers.append((query, 1)) - - for query, answer in queries_with_answers: - print query - if isinstance(answer, list): - answer = str(answer).replace(' ', '') - assert node.query(query) == str(answer) + '\n' - - -@pytest.mark.parametrize("fold", list(range(10))) -def test_complex_dictionaries(started_cluster, fold): - if 
node.is_built_with_thread_sanitizer(): - remove_mysql_dicts() - - fields = FIELDS["complex"] - values = VALUES["complex"] - data = [Row(fields, vals) for vals in values] - - all_complex_dicts = [d for d in DICTIONARIES if d.structure.layout.layout_type == "complex"] - complex_dicts = get_dictionaries(fold, 10, all_complex_dicts) - - for dct in complex_dicts: - dct.load_data(data) - - node.query("system reload dictionaries") - - queries_with_answers = [] - for dct in complex_dicts: - for row in data: - for field in fields: - if not field.is_key: - for query in dct.get_select_get_queries(field, row): - queries_with_answers.append((query, row.get_value_by_name(field.name))) - - for query in dct.get_select_has_queries(field, row): - queries_with_answers.append((query, 1)) - - for query in dct.get_select_get_or_default_queries(field, row): - queries_with_answers.append((query, field.default_value_for_get)) - - for query, answer in queries_with_answers: - print query - assert node.query(query) == str(answer) + '\n' - - -@pytest.mark.parametrize("fold", list(range(10))) -def test_ranged_dictionaries(started_cluster, fold): - if node.is_built_with_thread_sanitizer(): - remove_mysql_dicts() - - fields = FIELDS["ranged"] - values = VALUES["ranged"] - data = [Row(fields, vals) for vals in values] - - all_ranged_dicts = [d for d in DICTIONARIES if d.structure.layout.layout_type == "ranged"] - ranged_dicts = get_dictionaries(fold, 10, all_ranged_dicts) - - for dct in ranged_dicts: - dct.load_data(data) - - node.query("system reload dictionaries") - - queries_with_answers = [] - for dct in ranged_dicts: - for row in data: - for field in fields: - if not field.is_key and not field.is_range: - for query in dct.get_select_get_queries(field, row): - queries_with_answers.append((query, row.get_value_by_name(field.name))) - - for query, answer in queries_with_answers: - print query - assert node.query(query) == str(answer) + '\n' diff --git a/tests/integration/test_dictionaries_all_layouts_and_sources/__init__.py b/tests/integration/test_dictionaries_all_layouts_separate_sources/__init__.py similarity index 100% rename from tests/integration/test_dictionaries_all_layouts_and_sources/__init__.py rename to tests/integration/test_dictionaries_all_layouts_separate_sources/__init__.py diff --git a/tests/integration/test_dictionaries_all_layouts_separate_sources/common.py b/tests/integration/test_dictionaries_all_layouts_separate_sources/common.py new file mode 100644 index 00000000000..0411b5d9475 --- /dev/null +++ b/tests/integration/test_dictionaries_all_layouts_separate_sources/common.py @@ -0,0 +1,239 @@ +import os + +from helpers.dictionary import Field, Row, Dictionary, DictionaryStructure, Layout + +KEY_FIELDS = { + "simple": [ + Field("KeyField", 'UInt64', is_key=True, default_value_for_get=9999999) + ], + "complex": [ + Field("KeyField1", 'UInt64', is_key=True, default_value_for_get=9999999), + Field("KeyField2", 'String', is_key=True, default_value_for_get='xxxxxxxxx') + ], + "ranged": [ + Field("KeyField1", 'UInt64', is_key=True), + Field("KeyField2", 'Date', is_range_key=True) + ] +} + +START_FIELDS = { + "simple": [], + "complex": [], + "ranged" : [ + Field("StartDate", 'Date', range_hash_type='min'), + Field("EndDate", 'Date', range_hash_type='max') + ] +} + +MIDDLE_FIELDS = [ + Field("UInt8_", 'UInt8', default_value_for_get=55), + Field("UInt16_", 'UInt16', default_value_for_get=66), + Field("UInt32_", 'UInt32', default_value_for_get=77), + Field("UInt64_", 'UInt64', default_value_for_get=88), + 
Field("Int8_", 'Int8', default_value_for_get=-55), + Field("Int16_", 'Int16', default_value_for_get=-66), + Field("Int32_", 'Int32', default_value_for_get=-77), + Field("Int64_", 'Int64', default_value_for_get=-88), + Field("UUID_", 'UUID', default_value_for_get='550e8400-0000-0000-0000-000000000000'), + Field("Date_", 'Date', default_value_for_get='2018-12-30'), + Field("DateTime_", 'DateTime', default_value_for_get='2018-12-30 00:00:00'), + Field("String_", 'String', default_value_for_get='hi'), + Field("Float32_", 'Float32', default_value_for_get=555.11), + Field("Float64_", 'Float64', default_value_for_get=777.11), +] + +END_FIELDS = { + "simple" : [ + Field("ParentKeyField", "UInt64", default_value_for_get=444, hierarchical=True) + ], + "complex" : [], + "ranged" : [] +} + +LAYOUTS_SIMPLE = ["flat", "hashed", "cache", "direct"] +LAYOUTS_COMPLEX = ["complex_key_hashed", "complex_key_cache", "complex_key_direct"] +LAYOUTS_RANGED = ["range_hashed"] + +VALUES = { + "simple": [ + [1, 22, 333, 4444, 55555, -6, -77, + -888, -999, '550e8400-e29b-41d4-a716-446655440003', + '1973-06-28', '1985-02-28 23:43:25', 'hello', 22.543, 3332154213.4, 0], + [2, 3, 4, 5, 6, -7, -8, + -9, -10, '550e8400-e29b-41d4-a716-446655440002', + '1978-06-28', '1986-02-28 23:42:25', 'hello', 21.543, 3222154213.4, 1] + ], + "complex": [ + [1, 'world', 22, 333, 4444, 55555, -6, + -77, -888, -999, '550e8400-e29b-41d4-a716-446655440003', + '1973-06-28', '1985-02-28 23:43:25', + 'hello', 22.543, 3332154213.4], + [2, 'qwerty2', 52, 2345, 6544, 9191991, -2, + -717, -81818, -92929, '550e8400-e29b-41d4-a716-446655440007', + '1975-09-28', '2000-02-28 23:33:24', + 'my', 255.543, 3332221.44] + ], + "ranged": [ + [1, '2019-02-10', '2019-02-01', '2019-02-28', + 22, 333, 4444, 55555, -6, -77, -888, -999, + '550e8400-e29b-41d4-a716-446655440003', + '1973-06-28', '1985-02-28 23:43:25', 'hello', + 22.543, 3332154213.4], + [2, '2019-04-10', '2019-04-01', '2019-04-28', + 11, 3223, 41444, 52515, -65, -747, -8388, -9099, + '550e8400-e29b-41d4-a716-446655440004', + '1973-06-29', '2002-02-28 23:23:25', '!!!!', + 32.543, 3332543.4] + ] +} + + +SCRIPT_DIR = os.path.dirname(os.path.realpath(__file__)) +DICT_CONFIG_PATH = os.path.join(SCRIPT_DIR, 'configs/dictionaries') + +def get_dict(source, layout, fields, suffix_name=''): + global DICT_CONFIG_PATH + structure = DictionaryStructure(layout, fields) + dict_name = source.name + "_" + layout.name + '_' + suffix_name + dict_path = os.path.join(DICT_CONFIG_PATH, dict_name + '.xml') + dictionary = Dictionary(dict_name, structure, source, dict_path, "table_" + dict_name, fields) + dictionary.generate_config() + return dictionary + +class SimpleLayoutTester: + def __init__(self): + self.fields = KEY_FIELDS["simple"] + START_FIELDS["simple"] + MIDDLE_FIELDS + END_FIELDS["simple"] + self.values = VALUES["simple"] + self.data = [Row(self.fields, vals) for vals in self.values] + self.layout_to_dictionary = dict() + + def create_dictionaries(self, source_): + for layout in LAYOUTS_SIMPLE: + if source_.compatible_with_layout(Layout(layout)): + self.layout_to_dictionary[layout] = get_dict(source_, Layout(layout), self.fields) + + def prepare(self, cluster_): + for _, dictionary in self.layout_to_dictionary.items(): + dictionary.prepare_source(cluster_) + dictionary.load_data(self.data) + + def execute(self, layout_name, node): + if not self.layout_to_dictionary.has_key(layout_name): + raise RuntimeError("Source doesn't support layout: {}".format(layout_name)) + + dct = self.layout_to_dictionary[layout_name] + 
+ node.query("system reload dictionaries") + queries_with_answers = [] + + for row in self.data: + for field in self.fields: + if not field.is_key: + for query in dct.get_select_get_queries(field, row): + queries_with_answers.append((query, row.get_value_by_name(field.name))) + + for query in dct.get_select_has_queries(field, row): + queries_with_answers.append((query, 1)) + + for query in dct.get_select_get_or_default_queries(field, row): + queries_with_answers.append((query, field.default_value_for_get)) + + for query in dct.get_hierarchical_queries(self.data[0]): + queries_with_answers.append((query, [1])) + + for query in dct.get_hierarchical_queries(self.data[1]): + queries_with_answers.append((query, [2, 1])) + + for query in dct.get_is_in_queries(self.data[0], self.data[1]): + queries_with_answers.append((query, 0)) + + for query in dct.get_is_in_queries(self.data[1], self.data[0]): + queries_with_answers.append((query, 1)) + + for query, answer in queries_with_answers: + # print query + if isinstance(answer, list): + answer = str(answer).replace(' ', '') + assert node.query(query) == str(answer) + '\n' + + +class ComplexLayoutTester: + def __init__(self): + self.fields = KEY_FIELDS["complex"] + START_FIELDS["complex"] + MIDDLE_FIELDS + END_FIELDS["complex"] + self.values = VALUES["complex"] + self.data = [Row(self.fields, vals) for vals in self.values] + self.layout_to_dictionary = dict() + + def create_dictionaries(self, source_): + for layout in LAYOUTS_COMPLEX: + if source_.compatible_with_layout(Layout(layout)): + self.layout_to_dictionary[layout] = get_dict(source_, Layout(layout), self.fields) + + def prepare(self, cluster_): + for _, dictionary in self.layout_to_dictionary.items(): + dictionary.prepare_source(cluster_) + dictionary.load_data(self.data) + + def execute(self, layout_name, node): + if not self.layout_to_dictionary.has_key(layout_name): + raise RuntimeError("Source doesn't support layout: {}".format(layout_name)) + + dct = self.layout_to_dictionary[layout_name] + + node.query("system reload dictionaries") + queries_with_answers = [] + + for row in self.data: + for field in self.fields: + if not field.is_key: + for query in dct.get_select_get_queries(field, row): + queries_with_answers.append((query, row.get_value_by_name(field.name))) + + for query in dct.get_select_has_queries(field, row): + queries_with_answers.append((query, 1)) + + for query in dct.get_select_get_or_default_queries(field, row): + queries_with_answers.append((query, field.default_value_for_get)) + + for query, answer in queries_with_answers: + # print query + assert node.query(query) == str(answer) + '\n' + + +class RangedLayoutTester: + def __init__(self): + self.fields = KEY_FIELDS["ranged"] + START_FIELDS["ranged"] + MIDDLE_FIELDS + END_FIELDS["ranged"] + self.values = VALUES["ranged"] + self.data = [Row(self.fields, vals) for vals in self.values] + self.layout_to_dictionary = dict() + + def create_dictionaries(self, source_): + for layout in LAYOUTS_RANGED: + if source_.compatible_with_layout(Layout(layout)): + self.layout_to_dictionary[layout] = get_dict(source_, Layout(layout), self.fields) + + def prepare(self, cluster_): + for _, dictionary in self.layout_to_dictionary.items(): + dictionary.prepare_source(cluster_) + dictionary.load_data(self.data) + + def execute(self, layout_name, node): + + if not self.layout_to_dictionary.has_key(layout_name): + raise RuntimeError("Source doesn't support layout: {}".format(layout_name)) + + dct = self.layout_to_dictionary[layout_name] + + 
node.query("system reload dictionaries") + + queries_with_answers = [] + for row in self.data: + for field in self.fields: + if not field.is_key and not field.is_range: + for query in dct.get_select_get_queries(field, row): + queries_with_answers.append((query, row.get_value_by_name(field.name))) + + for query, answer in queries_with_answers: + # print query + assert node.query(query) == str(answer) + '\n' + diff --git a/tests/integration/test_dictionaries_all_layouts_and_sources/configs/config.xml b/tests/integration/test_dictionaries_all_layouts_separate_sources/configs/config.xml similarity index 100% rename from tests/integration/test_dictionaries_all_layouts_and_sources/configs/config.xml rename to tests/integration/test_dictionaries_all_layouts_separate_sources/configs/config.xml diff --git a/tests/integration/test_dictionaries_all_layouts_and_sources/configs/dictionaries/.gitkeep b/tests/integration/test_dictionaries_all_layouts_separate_sources/configs/dictionaries/.gitkeep similarity index 100% rename from tests/integration/test_dictionaries_all_layouts_and_sources/configs/dictionaries/.gitkeep rename to tests/integration/test_dictionaries_all_layouts_separate_sources/configs/dictionaries/.gitkeep diff --git a/tests/integration/test_dictionaries_all_layouts_and_sources/configs/disable_ssl_verification.xml b/tests/integration/test_dictionaries_all_layouts_separate_sources/configs/disable_ssl_verification.xml similarity index 100% rename from tests/integration/test_dictionaries_all_layouts_and_sources/configs/disable_ssl_verification.xml rename to tests/integration/test_dictionaries_all_layouts_separate_sources/configs/disable_ssl_verification.xml diff --git a/tests/integration/test_dictionaries_all_layouts_and_sources/configs/enable_dictionaries.xml b/tests/integration/test_dictionaries_all_layouts_separate_sources/configs/enable_dictionaries.xml similarity index 100% rename from tests/integration/test_dictionaries_all_layouts_and_sources/configs/enable_dictionaries.xml rename to tests/integration/test_dictionaries_all_layouts_separate_sources/configs/enable_dictionaries.xml diff --git a/tests/integration/test_dictionaries_all_layouts_and_sources/configs/users.xml b/tests/integration/test_dictionaries_all_layouts_separate_sources/configs/users.xml similarity index 100% rename from tests/integration/test_dictionaries_all_layouts_and_sources/configs/users.xml rename to tests/integration/test_dictionaries_all_layouts_separate_sources/configs/users.xml diff --git a/tests/integration/test_dictionaries_all_layouts_separate_sources/test_cassandra.py b/tests/integration/test_dictionaries_all_layouts_separate_sources/test_cassandra.py new file mode 100644 index 00000000000..c6b2ed370f4 --- /dev/null +++ b/tests/integration/test_dictionaries_all_layouts_separate_sources/test_cassandra.py @@ -0,0 +1,82 @@ +import os +import math +import pytest + +from .common import * + +from helpers.cluster import ClickHouseCluster +from helpers.dictionary import Field, Row, Dictionary, DictionaryStructure, Layout +from helpers.external_sources import SourceCassandra + +SOURCE = SourceCassandra("Cassandra", "localhost", "9043", "cassandra1", "9042", "", "") + +cluster = None +node = None +simple_tester = None +complex_tester = None +ranged_tester = None + + +def setup_module(module): + global cluster + global node + global simple_tester + global complex_tester + global ranged_tester + + for f in os.listdir(DICT_CONFIG_PATH): + os.remove(os.path.join(DICT_CONFIG_PATH, f)) + + simple_tester = 
SimpleLayoutTester() + simple_tester.create_dictionaries(SOURCE) + + complex_tester = ComplexLayoutTester() + complex_tester.create_dictionaries(SOURCE) + + ranged_tester = RangedLayoutTester() + ranged_tester.create_dictionaries(SOURCE) + # Since that all .xml configs were created + + cluster = ClickHouseCluster(__file__) + + dictionaries = [] + main_configs = [] + main_configs.append(os.path.join('configs', 'disable_ssl_verification.xml')) + + for fname in os.listdir(DICT_CONFIG_PATH): + dictionaries.append(os.path.join(DICT_CONFIG_PATH, fname)) + + node = cluster.add_instance('node', main_configs=main_configs, dictionaries=dictionaries, with_cassandra=True) + + +def teardown_module(module): + global DICT_CONFIG_PATH + for fname in os.listdir(DICT_CONFIG_PATH): + os.remove(os.path.join(DICT_CONFIG_PATH, fname)) + + +@pytest.fixture(scope="module") +def started_cluster(): + try: + cluster.start() + + simple_tester.prepare(cluster) + complex_tester.prepare(cluster) + ranged_tester.prepare(cluster) + + yield cluster + + finally: + cluster.shutdown() + +@pytest.mark.parametrize("layout_name", LAYOUTS_SIMPLE) +def test_simple(started_cluster, layout_name): + simple_tester.execute(layout_name, node) + +@pytest.mark.parametrize("layout_name", LAYOUTS_COMPLEX) +def test_complex(started_cluster, layout_name): + complex_tester.execute(layout_name, node) + +@pytest.mark.parametrize("layout_name", LAYOUTS_RANGED) +def test_ranged(started_cluster, layout_name): + ranged_tester.execute(layout_name, node) diff --git a/tests/integration/test_dictionaries_all_layouts_separate_sources/test_clickhouse_local.py b/tests/integration/test_dictionaries_all_layouts_separate_sources/test_clickhouse_local.py new file mode 100644 index 00000000000..1adc02ba6aa --- /dev/null +++ b/tests/integration/test_dictionaries_all_layouts_separate_sources/test_clickhouse_local.py @@ -0,0 +1,82 @@ +import os +import math +import pytest + +from .common import * + +from helpers.cluster import ClickHouseCluster +from helpers.dictionary import Field, Row, Dictionary, DictionaryStructure, Layout +from helpers.external_sources import SourceClickHouse + +SOURCE = SourceClickHouse("LocalClickHouse", "localhost", "9000", "node", "9000", "default", "") + +cluster = None +node = None +simple_tester = None +complex_tester = None +ranged_tester = None + + +def setup_module(module): + global cluster + global node + global simple_tester + global complex_tester + global ranged_tester + + for f in os.listdir(DICT_CONFIG_PATH): + os.remove(os.path.join(DICT_CONFIG_PATH, f)) + + simple_tester = SimpleLayoutTester() + simple_tester.create_dictionaries(SOURCE) + + complex_tester = ComplexLayoutTester() + complex_tester.create_dictionaries(SOURCE) + + ranged_tester = RangedLayoutTester() + ranged_tester.create_dictionaries(SOURCE) + # Since that all .xml configs were created + + cluster = ClickHouseCluster(__file__) + + dictionaries = [] + main_configs = [] + main_configs.append(os.path.join('configs', 'disable_ssl_verification.xml')) + + for fname in os.listdir(DICT_CONFIG_PATH): + dictionaries.append(os.path.join(DICT_CONFIG_PATH, fname)) + + node = cluster.add_instance('node', main_configs=main_configs, dictionaries=dictionaries) + + +def teardown_module(module): + global DICT_CONFIG_PATH + for fname in os.listdir(DICT_CONFIG_PATH): + os.remove(os.path.join(DICT_CONFIG_PATH, fname)) + + +@pytest.fixture(scope="module") +def started_cluster(): + try: + cluster.start() + + simple_tester.prepare(cluster) + complex_tester.prepare(cluster) + 
ranged_tester.prepare(cluster) + + yield cluster + + finally: + cluster.shutdown() + +@pytest.mark.parametrize("layout_name", LAYOUTS_SIMPLE) +def test_simple(started_cluster, layout_name): + simple_tester.execute(layout_name, node) + +@pytest.mark.parametrize("layout_name", LAYOUTS_COMPLEX) +def test_complex(started_cluster, layout_name): + complex_tester.execute(layout_name, node) + +@pytest.mark.parametrize("layout_name", LAYOUTS_RANGED) +def test_ranged(started_cluster, layout_name): + ranged_tester.execute(layout_name, node) diff --git a/tests/integration/test_dictionaries_all_layouts_separate_sources/test_clickhouse_remote.py b/tests/integration/test_dictionaries_all_layouts_separate_sources/test_clickhouse_remote.py new file mode 100644 index 00000000000..4e7f307b959 --- /dev/null +++ b/tests/integration/test_dictionaries_all_layouts_separate_sources/test_clickhouse_remote.py @@ -0,0 +1,84 @@ +import os +import math +import pytest + +from .common import * + +from helpers.cluster import ClickHouseCluster +from helpers.dictionary import Field, Row, Dictionary, DictionaryStructure, Layout +from helpers.external_sources import SourceClickHouse + +SOURCE = SourceClickHouse("RemoteClickHouse", "localhost", "9000", "clickhouse1", "9000", "default", "") + +cluster = None +node = None +simple_tester = None +complex_tester = None +ranged_tester = None + + +def setup_module(module): + global cluster + global node + global simple_tester + global complex_tester + global ranged_tester + + for f in os.listdir(DICT_CONFIG_PATH): + os.remove(os.path.join(DICT_CONFIG_PATH, f)) + + simple_tester = SimpleLayoutTester() + simple_tester.create_dictionaries(SOURCE) + + complex_tester = ComplexLayoutTester() + complex_tester.create_dictionaries(SOURCE) + + ranged_tester = RangedLayoutTester() + ranged_tester.create_dictionaries(SOURCE) + # Since that all .xml configs were created + + cluster = ClickHouseCluster(__file__) + + dictionaries = [] + main_configs = [] + main_configs.append(os.path.join('configs', 'disable_ssl_verification.xml')) + + for fname in os.listdir(DICT_CONFIG_PATH): + dictionaries.append(os.path.join(DICT_CONFIG_PATH, fname)) + + cluster.add_instance('clickhouse1', main_configs=main_configs) + + node = cluster.add_instance('node', main_configs=main_configs, dictionaries=dictionaries) + + +def teardown_module(module): + global DICT_CONFIG_PATH + for fname in os.listdir(DICT_CONFIG_PATH): + os.remove(os.path.join(DICT_CONFIG_PATH, fname)) + + +@pytest.fixture(scope="module") +def started_cluster(): + try: + cluster.start() + + simple_tester.prepare(cluster) + complex_tester.prepare(cluster) + ranged_tester.prepare(cluster) + + yield cluster + + finally: + cluster.shutdown() + +@pytest.mark.parametrize("layout_name", list(set(LAYOUTS_SIMPLE).difference({'cache'}))) +def test_simple(started_cluster, layout_name): + simple_tester.execute(layout_name, node) + +@pytest.mark.parametrize("layout_name", list(set(LAYOUTS_COMPLEX).difference({'complex_key_cache'}))) +def test_complex(started_cluster, layout_name): + complex_tester.execute(layout_name, node) + +@pytest.mark.parametrize("layout_name", LAYOUTS_RANGED) +def test_ranged(started_cluster, layout_name): + ranged_tester.execute(layout_name, node) diff --git a/tests/integration/test_dictionaries_all_layouts_separate_sources/test_executable_cache.py b/tests/integration/test_dictionaries_all_layouts_separate_sources/test_executable_cache.py new file mode 100644 index 00000000000..1d741d5271c --- /dev/null +++
b/tests/integration/test_dictionaries_all_layouts_separate_sources/test_executable_cache.py @@ -0,0 +1,78 @@ +import os +import math +import pytest + +from .common import * + +from helpers.cluster import ClickHouseCluster +from helpers.dictionary import Field, Row, Dictionary, DictionaryStructure, Layout +from helpers.external_sources import SourceExecutableCache + +SOURCE = SourceExecutableCache("ExecutableCache", "localhost", "9000", "node", "9000", "", "") + +cluster = None +node = None +simple_tester = None +complex_tester = None +ranged_tester = None + + +def setup_module(module): + global cluster + global node + global simple_tester + global complex_tester + global ranged_tester + + for f in os.listdir(DICT_CONFIG_PATH): + os.remove(os.path.join(DICT_CONFIG_PATH, f)) + + simple_tester = SimpleLayoutTester() + simple_tester.create_dictionaries(SOURCE) + + complex_tester = ComplexLayoutTester() + complex_tester.create_dictionaries(SOURCE) + + ranged_tester = RangedLayoutTester() + ranged_tester.create_dictionaries(SOURCE) + # Since that all .xml configs were created + + cluster = ClickHouseCluster(__file__) + + dictionaries = [] + main_configs = [] + main_configs.append(os.path.join('configs', 'disable_ssl_verification.xml')) + + for fname in os.listdir(DICT_CONFIG_PATH): + dictionaries.append(os.path.join(DICT_CONFIG_PATH, fname)) + + node = cluster.add_instance('node', main_configs=main_configs, dictionaries=dictionaries) + + +def teardown_module(module): + global DICT_CONFIG_PATH + for fname in os.listdir(DICT_CONFIG_PATH): + os.remove(os.path.join(DICT_CONFIG_PATH, fname)) + + +@pytest.fixture(scope="module") +def started_cluster(): + try: + cluster.start() + + simple_tester.prepare(cluster) + complex_tester.prepare(cluster) + ranged_tester.prepare(cluster) + + yield cluster + + finally: + cluster.shutdown() + +@pytest.mark.parametrize("layout_name", ['cache']) +def test_simple(started_cluster, layout_name): + simple_tester.execute(layout_name, node) + +@pytest.mark.parametrize("layout_name", ['complex_key_cache']) +def test_complex(started_cluster, layout_name): + complex_tester.execute(layout_name, node) diff --git a/tests/integration/test_dictionaries_all_layouts_separate_sources/test_executable_hashed.py b/tests/integration/test_dictionaries_all_layouts_separate_sources/test_executable_hashed.py new file mode 100644 index 00000000000..03af42bb1d4 --- /dev/null +++ b/tests/integration/test_dictionaries_all_layouts_separate_sources/test_executable_hashed.py @@ -0,0 +1,82 @@ +import os +import math +import pytest + +from .common import * + +from helpers.cluster import ClickHouseCluster +from helpers.dictionary import Field, Row, Dictionary, DictionaryStructure, Layout +from helpers.external_sources import SourceExecutableHashed + +SOURCE = SourceExecutableHashed("ExecutableHashed", "localhost", "9000", "node", "9000", "", "") + +cluster = None +node = None +simple_tester = None +complex_tester = None +ranged_tester = None + + +def setup_module(module): + global cluster + global node + global simple_tester + global complex_tester + global ranged_tester + + for f in os.listdir(DICT_CONFIG_PATH): + os.remove(os.path.join(DICT_CONFIG_PATH, f)) + + simple_tester = SimpleLayoutTester() + simple_tester.create_dictionaries(SOURCE) + + complex_tester = ComplexLayoutTester() + complex_tester.create_dictionaries(SOURCE) + + ranged_tester = RangedLayoutTester() + ranged_tester.create_dictionaries(SOURCE) + # Since that all .xml configs were created + + cluster = ClickHouseCluster(__file__) + 
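+ # Attach every dictionary config generated above to the ClickHouse instance under test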
+ dictionaries = [] + main_configs = [] + main_configs.append(os.path.join('configs', 'disable_ssl_verification.xml')) + + for fname in os.listdir(DICT_CONFIG_PATH): + dictionaries.append(os.path.join(DICT_CONFIG_PATH, fname)) + + node = cluster.add_instance('node', main_configs=main_configs, dictionaries=dictionaries) + + +def teardown_module(module): + global DICT_CONFIG_PATH + for fname in os.listdir(DICT_CONFIG_PATH): + os.remove(os.path.join(DICT_CONFIG_PATH, fname)) + + +@pytest.fixture(scope="module") +def started_cluster(): + try: + cluster.start() + + simple_tester.prepare(cluster) + complex_tester.prepare(cluster) + ranged_tester.prepare(cluster) + + yield cluster + + finally: + cluster.shutdown() + +@pytest.mark.parametrize("layout_name", ['hashed']) +def test_simple(started_cluster, layout_name): + simple_tester.execute(layout_name, node) + +@pytest.mark.parametrize("layout_name", ['complex_key_hashed']) +def test_complex(started_cluster, layout_name): + complex_tester.execute(layout_name, node) + +@pytest.mark.parametrize("layout_name", LAYOUTS_RANGED) +def test_ranged(started_cluster, layout_name): + ranged_tester.execute(layout_name, node) diff --git a/tests/integration/test_dictionaries_all_layouts_separate_sources/test_file.py b/tests/integration/test_dictionaries_all_layouts_separate_sources/test_file.py new file mode 100644 index 00000000000..f786bda847f --- /dev/null +++ b/tests/integration/test_dictionaries_all_layouts_separate_sources/test_file.py @@ -0,0 +1,82 @@ +import os +import math +import pytest + +from .common import * + +from helpers.cluster import ClickHouseCluster +from helpers.dictionary import Field, Row, Dictionary, DictionaryStructure, Layout +from helpers.external_sources import SourceFile + +SOURCE = SourceFile("File", "localhost", "9000", "node", "9000", "", "") + +cluster = None +node = None +simple_tester = None +complex_tester = None +ranged_tester = None + + +def setup_module(module): + global cluster + global node + global simple_tester + global complex_tester + global ranged_tester + + for f in os.listdir(DICT_CONFIG_PATH): + os.remove(os.path.join(DICT_CONFIG_PATH, f)) + + simple_tester = SimpleLayoutTester() + simple_tester.create_dictionaries(SOURCE) + + complex_tester = ComplexLayoutTester() + complex_tester.create_dictionaries(SOURCE) + + ranged_tester = RangedLayoutTester() + ranged_tester.create_dictionaries(SOURCE) + # Since that all .xml configs were created + + cluster = ClickHouseCluster(__file__) + + dictionaries = [] + main_configs = [] + main_configs.append(os.path.join('configs', 'disable_ssl_verification.xml')) + + for fname in os.listdir(DICT_CONFIG_PATH): + dictionaries.append(os.path.join(DICT_CONFIG_PATH, fname)) + + node = cluster.add_instance('node', main_configs=main_configs, dictionaries=dictionaries) + + +def teardown_module(module): + global DICT_CONFIG_PATH + for fname in os.listdir(DICT_CONFIG_PATH): + os.remove(os.path.join(DICT_CONFIG_PATH, fname)) + + +@pytest.fixture(scope="module") +def started_cluster(): + try: + cluster.start() + + simple_tester.prepare(cluster) + complex_tester.prepare(cluster) + ranged_tester.prepare(cluster) + + yield cluster + + finally: + cluster.shutdown() + +@pytest.mark.parametrize("layout_name", set(LAYOUTS_SIMPLE).difference({'cache', 'direct'}) ) +def test_simple(started_cluster, layout_name): + simple_tester.execute(layout_name, node) + +@pytest.mark.parametrize("layout_name", list(set(LAYOUTS_COMPLEX).difference({'complex_key_cache', 'complex_key_direct'}))) +def 
test_complex(started_cluster, layout_name): + complex_tester.execute(layout_name, node) + +@pytest.mark.parametrize("layout_name", LAYOUTS_RANGED) +def test_ranged(started_cluster, layout_name): + ranged_tester.execute(layout_name, node) diff --git a/tests/integration/test_dictionaries_all_layouts_separate_sources/test_http.py b/tests/integration/test_dictionaries_all_layouts_separate_sources/test_http.py new file mode 100644 index 00000000000..80baee5ee45 --- /dev/null +++ b/tests/integration/test_dictionaries_all_layouts_separate_sources/test_http.py @@ -0,0 +1,84 @@ +import os +import math +import pytest + +from .common import * + +from helpers.cluster import ClickHouseCluster +from helpers.dictionary import Field, Row, Dictionary, DictionaryStructure, Layout +from helpers.external_sources import SourceHTTP + +SOURCE = SourceHTTP("SourceHTTP", "localhost", "9000", "clickhouse1", "9000", "", "") + +cluster = None +node = None +simple_tester = None +complex_tester = None +ranged_tester = None + + +def setup_module(module): + global cluster + global node + global simple_tester + global complex_tester + global ranged_tester + + for f in os.listdir(DICT_CONFIG_PATH): + os.remove(os.path.join(DICT_CONFIG_PATH, f)) + + simple_tester = SimpleLayoutTester() + simple_tester.create_dictionaries(SOURCE) + + complex_tester = ComplexLayoutTester() + complex_tester.create_dictionaries(SOURCE) + + ranged_tester = RangedLayoutTester() + ranged_tester.create_dictionaries(SOURCE) + # Since that all .xml configs were created + + cluster = ClickHouseCluster(__file__) + + dictionaries = [] + main_configs = [] + main_configs.append(os.path.join('configs', 'disable_ssl_verification.xml')) + + for fname in os.listdir(DICT_CONFIG_PATH): + dictionaries.append(os.path.join(DICT_CONFIG_PATH, fname)) + + cluster.add_instance('clickhouse1', main_configs=main_configs) + + node = cluster.add_instance('node', main_configs=main_configs, dictionaries=dictionaries) + + +def teardown_module(module): + global DICT_CONFIG_PATH + for fname in os.listdir(DICT_CONFIG_PATH): + os.remove(os.path.join(DICT_CONFIG_PATH, fname)) + + +@pytest.fixture(scope="module") +def started_cluster(): + try: + cluster.start() + + simple_tester.prepare(cluster) + complex_tester.prepare(cluster) + ranged_tester.prepare(cluster) + + yield cluster + + finally: + cluster.shutdown() + +@pytest.mark.parametrize("layout_name", LAYOUTS_SIMPLE) +def test_simple(started_cluster, layout_name): + simple_tester.execute(layout_name, node) + +@pytest.mark.parametrize("layout_name", LAYOUTS_COMPLEX) +def test_complex(started_cluster, layout_name): + complex_tester.execute(layout_name, node) + +@pytest.mark.parametrize("layout_name", LAYOUTS_RANGED) +def test_ranged(started_cluster, layout_name): + ranged_tester.execute(layout_name, node) diff --git a/tests/integration/test_dictionaries_all_layouts_separate_sources/test_https.py b/tests/integration/test_dictionaries_all_layouts_separate_sources/test_https.py new file mode 100644 index 00000000000..ccac2cfd268 --- /dev/null +++ b/tests/integration/test_dictionaries_all_layouts_separate_sources/test_https.py @@ -0,0 +1,84 @@ +import os +import math +import pytest + +from .common import * + +from helpers.cluster import ClickHouseCluster +from helpers.dictionary import Field, Row, Dictionary, DictionaryStructure, Layout +from helpers.external_sources import SourceHTTPS + +SOURCE = SourceHTTPS("SourceHTTPS", "localhost", "9000", "clickhouse1", "9000", "", "") + +cluster = None +node = None +simple_tester = None 
+complex_tester = None +ranged_tester = None + + +def setup_module(module): + global cluster + global node + global simple_tester + global complex_tester + global ranged_tester + + for f in os.listdir(DICT_CONFIG_PATH): + os.remove(os.path.join(DICT_CONFIG_PATH, f)) + + simple_tester = SimpleLayoutTester() + simple_tester.create_dictionaries(SOURCE) + + complex_tester = ComplexLayoutTester() + complex_tester.create_dictionaries(SOURCE) + + ranged_tester = RangedLayoutTester() + ranged_tester.create_dictionaries(SOURCE) + # Since that all .xml configs were created + + cluster = ClickHouseCluster(__file__) + + dictionaries = [] + main_configs = [] + main_configs.append(os.path.join('configs', 'disable_ssl_verification.xml')) + + for fname in os.listdir(DICT_CONFIG_PATH): + dictionaries.append(os.path.join(DICT_CONFIG_PATH, fname)) + + cluster.add_instance('clickhouse1', main_configs=main_configs) + + node = cluster.add_instance('node', main_configs=main_configs, dictionaries=dictionaries) + + +def teardown_module(module): + global DICT_CONFIG_PATH + for fname in os.listdir(DICT_CONFIG_PATH): + os.remove(os.path.join(DICT_CONFIG_PATH, fname)) + + +@pytest.fixture(scope="module") +def started_cluster(): + try: + cluster.start() + + simple_tester.prepare(cluster) + complex_tester.prepare(cluster) + ranged_tester.prepare(cluster) + + yield cluster + + finally: + cluster.shutdown() + +@pytest.mark.parametrize("layout_name", LAYOUTS_SIMPLE) +def test_simple(started_cluster, layout_name): + simple_tester.execute(layout_name, node) + +@pytest.mark.parametrize("layout_name", LAYOUTS_COMPLEX) +def test_complex(started_cluster, layout_name): + complex_tester.execute(layout_name, node) + +@pytest.mark.parametrize("layout_name", LAYOUTS_RANGED) +def test_ranged(started_cluster, layout_name): + ranged_tester.execute(layout_name, node) diff --git a/tests/integration/test_dictionaries_all_layouts_separate_sources/test_mongo.py b/tests/integration/test_dictionaries_all_layouts_separate_sources/test_mongo.py new file mode 100644 index 00000000000..ffa376dcdb3 --- /dev/null +++ b/tests/integration/test_dictionaries_all_layouts_separate_sources/test_mongo.py @@ -0,0 +1,82 @@ +import os +import math +import pytest + +from .common import * + +from helpers.cluster import ClickHouseCluster +from helpers.dictionary import Field, Row, Dictionary, DictionaryStructure, Layout +from helpers.external_sources import SourceMongo + +SOURCE = SourceMongo("MongoDB", "localhost", "27018", "mongo1", "27017", "root", "clickhouse") + +cluster = None +node = None +simple_tester = None +complex_tester = None +ranged_tester = None + + +def setup_module(module): + global cluster + global node + global simple_tester + global complex_tester + global ranged_tester + + for f in os.listdir(DICT_CONFIG_PATH): + os.remove(os.path.join(DICT_CONFIG_PATH, f)) + + simple_tester = SimpleLayoutTester() + simple_tester.create_dictionaries(SOURCE) + + complex_tester = ComplexLayoutTester() + complex_tester.create_dictionaries(SOURCE) + + ranged_tester = RangedLayoutTester() + ranged_tester.create_dictionaries(SOURCE) + # Since that all .xml configs were created + + cluster = ClickHouseCluster(__file__) + + dictionaries = [] + main_configs = [] + main_configs.append(os.path.join('configs', 'disable_ssl_verification.xml')) + + for fname in os.listdir(DICT_CONFIG_PATH): + dictionaries.append(os.path.join(DICT_CONFIG_PATH, fname)) + + node = cluster.add_instance('node', main_configs=main_configs, dictionaries=dictionaries, with_mongo=True) + + +def 
teardown_module(module): + global DICT_CONFIG_PATH + for fname in os.listdir(DICT_CONFIG_PATH): + os.remove(os.path.join(DICT_CONFIG_PATH, fname)) + + +@pytest.fixture(scope="module") +def started_cluster(): + try: + cluster.start() + + simple_tester.prepare(cluster) + complex_tester.prepare(cluster) + ranged_tester.prepare(cluster) + + yield cluster + + finally: + cluster.shutdown() + +@pytest.mark.parametrize("layout_name", LAYOUTS_SIMPLE) +def test_simple(started_cluster, layout_name): + simple_tester.execute(layout_name, node) + +@pytest.mark.parametrize("layout_name", LAYOUTS_COMPLEX) +def test_complex(started_cluster, layout_name): + complex_tester.execute(layout_name, node) + +@pytest.mark.parametrize("layout_name", LAYOUTS_RANGED) +def test_ranged(started_cluster, layout_name): + ranged_tester.execute(layout_name, node) diff --git a/tests/integration/test_dictionaries_all_layouts_separate_sources/test_mongo_uri.py b/tests/integration/test_dictionaries_all_layouts_separate_sources/test_mongo_uri.py new file mode 100644 index 00000000000..5c09627d0b9 --- /dev/null +++ b/tests/integration/test_dictionaries_all_layouts_separate_sources/test_mongo_uri.py @@ -0,0 +1,75 @@ +import os +import math +import pytest + +from .common import * + +from helpers.cluster import ClickHouseCluster +from helpers.dictionary import Field, Row, Dictionary, DictionaryStructure, Layout +from helpers.external_sources import SourceMongoURI + +SOURCE = SourceMongoURI("MongoDB_URI", "localhost", "27018", "mongo1", "27017", "root", "clickhouse") + +cluster = None +node = None +simple_tester = None +complex_tester = None +ranged_tester = None + + +def setup_module(module): + global cluster + global node + global simple_tester + global complex_tester + global ranged_tester + + for f in os.listdir(DICT_CONFIG_PATH): + os.remove(os.path.join(DICT_CONFIG_PATH, f)) + + simple_tester = SimpleLayoutTester() + simple_tester.create_dictionaries(SOURCE) + + complex_tester = ComplexLayoutTester() + complex_tester.create_dictionaries(SOURCE) + + ranged_tester = RangedLayoutTester() + ranged_tester.create_dictionaries(SOURCE) + # Since that all .xml configs were created + + cluster = ClickHouseCluster(__file__) + + dictionaries = [] + main_configs = [] + main_configs.append(os.path.join('configs', 'disable_ssl_verification.xml')) + + for fname in os.listdir(DICT_CONFIG_PATH): + dictionaries.append(os.path.join(DICT_CONFIG_PATH, fname)) + + node = cluster.add_instance('node', main_configs=main_configs, dictionaries=dictionaries, with_mongo=True) + + +def teardown_module(module): + global DICT_CONFIG_PATH + for fname in os.listdir(DICT_CONFIG_PATH): + os.remove(os.path.join(DICT_CONFIG_PATH, fname)) + + +@pytest.fixture(scope="module") +def started_cluster(): + try: + cluster.start() + + simple_tester.prepare(cluster) + complex_tester.prepare(cluster) + ranged_tester.prepare(cluster) + + yield cluster + + finally: + cluster.shutdown() + +# See comment in SourceMongoURI +@pytest.mark.parametrize("layout_name", ["flat"]) +def test_simple(started_cluster, layout_name): + simple_tester.execute(layout_name, node) diff --git a/tests/integration/test_dictionaries_all_layouts_separate_sources/test_mysql.py b/tests/integration/test_dictionaries_all_layouts_separate_sources/test_mysql.py new file mode 100644 index 00000000000..69e2543f226 --- /dev/null +++ b/tests/integration/test_dictionaries_all_layouts_separate_sources/test_mysql.py @@ -0,0 +1,82 @@ +import os +import math +import pytest + +from .common import * + +from helpers.cluster 
import ClickHouseCluster +from helpers.dictionary import Field, Row, Dictionary, DictionaryStructure, Layout +from helpers.external_sources import SourceMySQL + +SOURCE = SourceMySQL("MySQL", "localhost", "3308", "mysql1", "3306", "root", "clickhouse") + +cluster = None +node = None +simple_tester = None +complex_tester = None +ranged_tester = None + + +def setup_module(module): + global cluster + global node + global simple_tester + global complex_tester + global ranged_tester + + for f in os.listdir(DICT_CONFIG_PATH): + os.remove(os.path.join(DICT_CONFIG_PATH, f)) + + simple_tester = SimpleLayoutTester() + simple_tester.create_dictionaries(SOURCE) + + complex_tester = ComplexLayoutTester() + complex_tester.create_dictionaries(SOURCE) + + ranged_tester = RangedLayoutTester() + ranged_tester.create_dictionaries(SOURCE) + # Since that all .xml configs were created + + cluster = ClickHouseCluster(__file__) + + dictionaries = [] + main_configs = [] + main_configs.append(os.path.join('configs', 'disable_ssl_verification.xml')) + + for fname in os.listdir(DICT_CONFIG_PATH): + dictionaries.append(os.path.join(DICT_CONFIG_PATH, fname)) + + node = cluster.add_instance('node', main_configs=main_configs, dictionaries=dictionaries, with_mysql=True) + + +def teardown_module(module): + global DICT_CONFIG_PATH + for fname in os.listdir(DICT_CONFIG_PATH): + os.remove(os.path.join(DICT_CONFIG_PATH, fname)) + + +@pytest.fixture(scope="module") +def started_cluster(): + try: + cluster.start() + + simple_tester.prepare(cluster) + complex_tester.prepare(cluster) + ranged_tester.prepare(cluster) + + yield cluster + + finally: + cluster.shutdown() + +@pytest.mark.parametrize("layout_name", LAYOUTS_SIMPLE) +def test_simple(started_cluster, layout_name): + simple_tester.execute(layout_name, node) + +@pytest.mark.parametrize("layout_name", LAYOUTS_COMPLEX) +def test_complex(started_cluster, layout_name): + complex_tester.execute(layout_name, node) + +@pytest.mark.parametrize("layout_name", LAYOUTS_RANGED) +def test_ranged(started_cluster, layout_name): + ranged_tester.execute(layout_name, node) diff --git a/tests/integration/test_dictionaries_dependency/test.py b/tests/integration/test_dictionaries_dependency/test.py index 119bd7c6863..d615f90dc79 100644 --- a/tests/integration/test_dictionaries_dependency/test.py +++ b/tests/integration/test_dictionaries_dependency/test.py @@ -13,14 +13,17 @@ def start_cluster(): cluster.start() for node in nodes: node.query("CREATE DATABASE IF NOT EXISTS test") + # Different internal dictionary name with Atomic + node.query("CREATE DATABASE IF NOT EXISTS test_ordinary ENGINE=Ordinary") node.query("CREATE DATABASE IF NOT EXISTS atest") node.query("CREATE DATABASE IF NOT EXISTS ztest") node.query("CREATE TABLE test.source(x UInt64, y UInt64) ENGINE=Log") node.query("INSERT INTO test.source VALUES (5,6)") - node.query("CREATE DICTIONARY test.dict(x UInt64, y UInt64) PRIMARY KEY x " \ - "SOURCE(CLICKHOUSE(HOST 'localhost' PORT 9000 USER 'default' TABLE 'source' DB 'test')) " \ - "LAYOUT(FLAT()) LIFETIME(0)") + for db in ("test", "test_ordinary"): + node.query("CREATE DICTIONARY {}.dict(x UInt64, y UInt64) PRIMARY KEY x " \ + "SOURCE(CLICKHOUSE(HOST 'localhost' PORT 9000 USER 'default' TABLE 'source' DB 'test')) " \ + "LAYOUT(FLAT()) LIFETIME(0)".format(db)) yield cluster finally: @@ -91,10 +94,10 @@ def test_dependency_via_explicit_table(node): def test_dependency_via_dictionary_database(node): node.query("CREATE DATABASE dict_db ENGINE=Dictionary") - d_names = ["test.adict", 
"test.zdict", "atest.dict", "ztest.dict"] + d_names = ["test_ordinary.adict", "test_ordinary.zdict", "atest.dict", "ztest.dict"] for d_name in d_names: node.query("CREATE DICTIONARY {}(x UInt64, y UInt64) PRIMARY KEY x " \ - "SOURCE(CLICKHOUSE(HOST 'localhost' PORT 9000 USER 'default' TABLE 'test.dict' DB 'dict_db')) " \ + "SOURCE(CLICKHOUSE(HOST 'localhost' PORT 9000 USER 'default' TABLE 'test_ordinary.dict' DB 'dict_db')) " \ "LAYOUT(FLAT()) LIFETIME(0)".format(d_name)) def check(): diff --git a/tests/integration/test_distributed_ddl/test.py b/tests/integration/test_distributed_ddl/test.py index 08027fa13ca..b788dafe167 100755 --- a/tests/integration/test_distributed_ddl/test.py +++ b/tests/integration/test_distributed_ddl/test.py @@ -326,21 +326,30 @@ def test_socket_timeout(test_cluster): def test_replicated_without_arguments(test_cluster): rules = test_cluster.pm_random_drops.pop_rules() instance = test_cluster.instances['ch1'] - test_cluster.ddl_check_query(instance, "CREATE DATABASE test_atomic ON CLUSTER cluster ENGINE=Atomic", - settings={'show_table_uuid_in_table_create_query_if_not_nil': 1}) + test_cluster.ddl_check_query(instance, "CREATE DATABASE test_atomic ON CLUSTER cluster ENGINE=Atomic") + assert "are supported only for ON CLUSTER queries with Atomic database engine" in \ + instance.query_and_get_error("CREATE TABLE test_atomic.rmt (n UInt64, s String) ENGINE=ReplicatedMergeTree ORDER BY n") test_cluster.ddl_check_query(instance, - "CREATE TABLE test_atomic.rmt ON CLUSTER cluster (n UInt64, s String) ENGINE=ReplicatedMergeTree ORDER BY n", - settings={'show_table_uuid_in_table_create_query_if_not_nil': 1}) + "CREATE TABLE test_atomic.rmt ON CLUSTER cluster (n UInt64, s String) ENGINE=ReplicatedMergeTree ORDER BY n") test_cluster.ddl_check_query(instance, "DROP TABLE test_atomic.rmt ON CLUSTER cluster") test_cluster.ddl_check_query(instance, - "CREATE TABLE test_atomic.rmt ON CLUSTER cluster (n UInt64, s String) ENGINE=ReplicatedMergeTree ORDER BY n", - settings={'show_table_uuid_in_table_create_query_if_not_nil': 1}) + "CREATE TABLE test_atomic.rmt ON CLUSTER cluster (n UInt64, s String) ENGINE=ReplicatedMergeTree ORDER BY n") test_cluster.ddl_check_query(instance, "RENAME TABLE test_atomic.rmt TO test_atomic.rmt_renamed ON CLUSTER cluster") test_cluster.ddl_check_query(instance, - "CREATE TABLE test_atomic.rmt ON CLUSTER cluster (n UInt64, s String) ENGINE=ReplicatedMergeTree ORDER BY n", - settings={'show_table_uuid_in_table_create_query_if_not_nil': 1}) + "CREATE TABLE test_atomic.rmt ON CLUSTER cluster (n UInt64, s String) ENGINE=ReplicatedMergeTree('/clickhouse/tables/{uuid}/{shard}', '{replica}') ORDER BY n") test_cluster.ddl_check_query(instance, "EXCHANGE TABLES test_atomic.rmt AND test_atomic.rmt_renamed ON CLUSTER cluster") + assert instance.query("SELECT countDistinct(uuid) from clusterAllReplicas('cluster', 'system', 'databases') WHERE uuid != 0 AND name='test_atomic'") == "1\n" + assert instance.query("SELECT countDistinct(uuid) from clusterAllReplicas('cluster', 'system', 'tables') WHERE uuid != 0 AND name='rmt'") == "1\n" + test_cluster.ddl_check_query(instance, "DROP DATABASE test_atomic ON CLUSTER cluster") + + test_cluster.ddl_check_query(instance, "CREATE DATABASE test_ordinary ON CLUSTER cluster ENGINE=Ordinary") + assert "are supported only for ON CLUSTER queries with Atomic database engine" in \ + instance.query_and_get_error("CREATE TABLE test_ordinary.rmt ON CLUSTER cluster (n UInt64, s String) ENGINE=ReplicatedMergeTree ORDER BY n") + assert "are 
supported only for ON CLUSTER queries with Atomic database engine" in \ + instance.query_and_get_error("CREATE TABLE test_ordinary.rmt ON CLUSTER cluster (n UInt64, s String) ENGINE=ReplicatedMergeTree('/{shard}/{uuid}/', '{replica}') ORDER BY n") + test_cluster.ddl_check_query(instance, "CREATE TABLE test_ordinary.rmt ON CLUSTER cluster (n UInt64, s String) ENGINE=ReplicatedMergeTree('/{shard}/{table}/', '{replica}') ORDER BY n") + test_cluster.ddl_check_query(instance, "DROP DATABASE test_ordinary ON CLUSTER cluster") test_cluster.pm_random_drops.push_rules(rules) diff --git a/tests/integration/test_distributed_ddl_on_cross_replication/test.py b/tests/integration/test_distributed_ddl_on_cross_replication/test.py index 16238f0326d..85800b2e5e6 100644 --- a/tests/integration/test_distributed_ddl_on_cross_replication/test.py +++ b/tests/integration/test_distributed_ddl_on_cross_replication/test.py @@ -77,3 +77,30 @@ def test_alter_ddl(started_cluster): node2.query("SYSTEM SYNC REPLICA replica_2.replicated_local;", timeout=5) assert_eq_with_retry(node1, "SELECT count(*) FROM replica_2.replicated", '0') + +def test_atomic_database(started_cluster): + node1.query('''DROP DATABASE IF EXISTS replica_1 ON CLUSTER cross_3shards_2replicas; + DROP DATABASE IF EXISTS replica_2 ON CLUSTER cross_3shards_2replicas; + CREATE DATABASE replica_1 ON CLUSTER cross_3shards_2replicas ENGINE=Atomic; + CREATE DATABASE replica_2 ON CLUSTER cross_3shards_2replicas ENGINE=Atomic;''') + + assert "It's not supported for cross replication" in \ + node1.query_and_get_error("CREATE TABLE rmt ON CLUSTER cross_3shards_2replicas (n UInt64, s String) ENGINE=ReplicatedMergeTree ORDER BY n") + assert "It's not supported for cross replication" in \ + node1.query_and_get_error("CREATE TABLE replica_1.rmt ON CLUSTER cross_3shards_2replicas (n UInt64, s String) ENGINE=ReplicatedMergeTree ORDER BY n") + assert "It's not supported for cross replication" in \ + node1.query_and_get_error("CREATE TABLE rmt ON CLUSTER cross_3shards_2replicas (n UInt64, s String) ENGINE=ReplicatedMergeTree('/{shard}/{uuid}/', '{replica}') ORDER BY n") + assert "It's not supported for cross replication" in \ + node1.query_and_get_error("CREATE TABLE replica_2.rmt ON CLUSTER cross_3shards_2replicas (n UInt64, s String) ENGINE=ReplicatedMergeTree('/{shard}/{uuid}/', '{replica}') ORDER BY n") + assert "For a distributed DDL on circular replicated cluster its table name must be qualified by database name" in \ + node1.query_and_get_error("CREATE TABLE rmt ON CLUSTER cross_3shards_2replicas (n UInt64, s String) ENGINE=ReplicatedMergeTree('/tables/{shard}/rmt/', '{replica}') ORDER BY n") + + node1.query("CREATE TABLE replica_1.rmt ON CLUSTER cross_3shards_2replicas (n UInt64, s String) ENGINE=ReplicatedMergeTree('/tables/{shard}/rmt/', '{replica}') ORDER BY n") + node1.query("CREATE TABLE replica_2.rmt ON CLUSTER cross_3shards_2replicas (n UInt64, s String) ENGINE=ReplicatedMergeTree('/tables/{shard_bk}/rmt/', '{replica_bk}') ORDER BY n") + + assert node1.query("SELECT countDistinct(uuid) from remote('node1,node2,node3', 'system', 'databases') WHERE uuid != 0 AND name='replica_1'") == "1\n" + assert node1.query("SELECT countDistinct(uuid) from remote('node1,node2,node3', 'system', 'tables') WHERE uuid != 0 AND name='rmt'") == "2\n" + + node1.query("INSERT INTO replica_1.rmt VALUES (1, 'test')") + node2.query("SYSTEM SYNC REPLICA replica_2.rmt", timeout=5) + assert_eq_with_retry(node2, "SELECT * FROM replica_2.rmt", '1\ttest') diff --git 
a/tests/integration/test_distributed_format/test.py b/tests/integration/test_distributed_format/test.py index 7658814a720..7e9d740c171 100644 --- a/tests/integration/test_distributed_format/test.py +++ b/tests/integration/test_distributed_format/test.py @@ -15,7 +15,7 @@ cluster_param = pytest.mark.parametrize("cluster", [ def started_cluster(): try: cluster.start() - node.query("create database test engine=Ordinary") + node.query("create database test") yield cluster finally: diff --git a/tests/integration/test_distributed_storage_configuration/test.py b/tests/integration/test_distributed_storage_configuration/test.py index a932e9a55c5..d293b96399d 100644 --- a/tests/integration/test_distributed_storage_configuration/test.py +++ b/tests/integration/test_distributed_storage_configuration/test.py @@ -17,7 +17,7 @@ node = cluster.add_instance('node', def start_cluster(): try: cluster.start() - node.query('CREATE DATABASE test ENGINE=Ordinary') + node.query('CREATE DATABASE test ENGINE=Ordinary') # Different paths with Atomic yield cluster finally: cluster.shutdown() diff --git a/tests/integration/test_filesystem_layout/test.py b/tests/integration/test_filesystem_layout/test.py index e2441d0d20d..2519d0e5ac3 100644 --- a/tests/integration/test_filesystem_layout/test.py +++ b/tests/integration/test_filesystem_layout/test.py @@ -27,3 +27,19 @@ def test_file_path_escaping(started_cluster): node.exec_in_container(["bash", "-c", "test -f /var/lib/clickhouse/data/test/T%2Ea_b%2Cl%2De%21/1_1_1_0/%7EId.bin"]) node.exec_in_container( ["bash", "-c", "test -f /var/lib/clickhouse/shadow/1/data/test/T%2Ea_b%2Cl%2De%21/1_1_1_0/%7EId.bin"]) + +def test_file_path_escaping_atomic_db(started_cluster): + node.query('CREATE DATABASE IF NOT EXISTS `test 2` ENGINE = Atomic') + node.query(''' + CREATE TABLE `test 2`.`T.a_b,l-e!` UUID '12345678-1000-4000-8000-000000000001' (`~Id` UInt32) + ENGINE = MergeTree() PARTITION BY `~Id` ORDER BY `~Id` SETTINGS min_bytes_for_wide_part = 0; + ''') + node.query('''INSERT INTO `test 2`.`T.a_b,l-e!` VALUES (1);''') + node.query('''ALTER TABLE `test 2`.`T.a_b,l-e!` FREEZE;''') + + node.exec_in_container(["bash", "-c", "test -f /var/lib/clickhouse/store/123/12345678-1000-4000-8000-000000000001/1_1_1_0/%7EId.bin"]) + # Check symlink + node.exec_in_container(["bash", "-c", "test -L /var/lib/clickhouse/data/test%202/T%2Ea_b%2Cl%2De%21"]) + node.exec_in_container(["bash", "-c", "test -f /var/lib/clickhouse/data/test%202/T%2Ea_b%2Cl%2De%21/1_1_1_0/%7EId.bin"]) + node.exec_in_container( + ["bash", "-c", "test -f /var/lib/clickhouse/shadow/2/store/123/12345678-1000-4000-8000-000000000001/1_1_1_0/%7EId.bin"]) diff --git a/tests/integration/test_partition/test.py b/tests/integration/test_partition/test.py index 5b27ff94ddb..679c6fb8c5b 100644 --- a/tests/integration/test_partition/test.py +++ b/tests/integration/test_partition/test.py @@ -13,7 +13,7 @@ path_to_data = '/var/lib/clickhouse/' def started_cluster(): try: cluster.start() - q('CREATE DATABASE test ENGINE = Ordinary') + q('CREATE DATABASE test ENGINE = Ordinary') # Different path in shadow/ with Atomic yield cluster diff --git a/tests/integration/test_polymorphic_parts/configs/users.d/not_optimize_count.xml b/tests/integration/test_polymorphic_parts/configs/users.d/not_optimize_count.xml index 82689093adf..5a06453b214 100644 --- a/tests/integration/test_polymorphic_parts/configs/users.d/not_optimize_count.xml +++ b/tests/integration/test_polymorphic_parts/configs/users.d/not_optimize_count.xml @@ -2,7 +2,6 @@ 0 - Ordinary diff 
--git a/tests/integration/test_polymorphic_parts/test.py b/tests/integration/test_polymorphic_parts/test.py index dbbf5c0b4ff..50a8192fbc5 100644 --- a/tests/integration/test_polymorphic_parts/test.py +++ b/tests/integration/test_polymorphic_parts/test.py @@ -336,7 +336,7 @@ def test_polymorphic_parts_non_adaptive(start_cluster): "Wide\t2\n") assert node1.contains_in_log( - " default.non_adaptive_table: Table can't create parts with adaptive granularity") + " default.non_adaptive_table ([0-9a-f-]*): Table can't create parts with adaptive granularity") def test_in_memory(start_cluster): @@ -408,24 +408,29 @@ def test_in_memory_wal(start_cluster): pm.partition_instances(node11, node12) check(node11, 300, 6) - wal_file = os.path.join(node11.path, "database/data/default/wal_table/wal.bin") + wal_file = "/var/lib/clickhouse/data/default/wal_table/wal.bin" # Corrupt wal file - open(wal_file, 'rw+').truncate(os.path.getsize(wal_file) - 10) + # Truncate it to its size minus 10 bytes + node11.exec_in_container(['bash', '-c', 'truncate --size="$(($(stat -c "%s" {}) - 10))" {}'.format(wal_file, wal_file)], + privileged=True, user='root') node11.restart_clickhouse(kill=True) # Broken part is lost, but other restored successfully check(node11, 250, 5) # WAL with blocks from 0 to 4 - broken_wal_file = os.path.join(node11.path, "database/data/default/wal_table/wal_0_4.bin") - assert os.path.exists(broken_wal_file) + broken_wal_file = "/var/lib/clickhouse/data/default/wal_table/wal_0_4.bin" + # Check file exists + node11.exec_in_container(['bash', '-c', 'test -f {}'.format(broken_wal_file)]) # Fetch lost part from replica node11.query("SYSTEM SYNC REPLICA wal_table", timeout=20) check(node11, 300, 6) # Check that new data is written to new wal, but old is still exists for restoring - assert os.path.getsize(wal_file) > 0 - assert os.path.exists(broken_wal_file) + # Check file not empty + node11.exec_in_container(['bash', '-c', 'test -s {}'.format(wal_file)]) + # Check file exists + node11.exec_in_container(['bash', '-c', 'test -f {}'.format(broken_wal_file)]) # Data is lost without WAL node11.query("ALTER TABLE wal_table MODIFY SETTING in_memory_parts_enable_wal = 0") @@ -446,8 +451,8 @@ def test_in_memory_wal_rotate(start_cluster): insert_random_data('restore_table', node11, 50) for i in range(5): - wal_file = os.path.join(node11.path, "database/data/default/restore_table/wal_{0}_{0}.bin".format(i)) - assert os.path.exists(wal_file) + # Check file exists + node11.exec_in_container(['bash', '-c', 'test -f /var/lib/clickhouse/data/default/restore_table/wal_{0}_{0}.bin'.format(i)]) for node in [node11, node12]: node.query( @@ -459,13 +464,14 @@ def test_in_memory_wal_rotate(start_cluster): node11.restart_clickhouse(kill=True) for i in range(5): - wal_file = os.path.join(node11.path, "database/data/default/restore_table/wal_{0}_{0}.bin".format(i)) - assert not os.path.exists(wal_file) + # Check file doesn't exist + node11.exec_in_container(['bash', '-c', 'test ! -e /var/lib/clickhouse/data/default/restore_table/wal_{0}_{0}.bin'.format(i)]) # New wal file was created and ready to write part to it - wal_file = os.path.join(node11.path, "database/data/default/restore_table/wal.bin") - assert os.path.exists(wal_file) - assert os.path.getsize(wal_file) == 0 + # Check file exists + node11.exec_in_container(['bash', '-c', 'test -f /var/lib/clickhouse/data/default/restore_table/wal.bin']) + # Check file is empty + node11.exec_in_container(['bash', '-c', 'test !
-s /var/lib/clickhouse/data/default/restore_table/wal.bin']) def test_in_memory_deduplication(start_cluster): @@ -509,19 +515,20 @@ def test_in_memory_alters(start_cluster): def test_polymorphic_parts_index(start_cluster): + node1.query('CREATE DATABASE test_index ENGINE=Ordinary') # Different paths with Atomic node1.query(''' - CREATE TABLE index_compact(a UInt32, s String) + CREATE TABLE test_index.index_compact(a UInt32, s String) ENGINE = MergeTree ORDER BY a SETTINGS min_rows_for_wide_part = 1000, index_granularity = 128, merge_max_block_size = 100''') - node1.query("INSERT INTO index_compact SELECT number, toString(number) FROM numbers(100)") - node1.query("INSERT INTO index_compact SELECT number, toString(number) FROM numbers(30)") - node1.query("OPTIMIZE TABLE index_compact FINAL") + node1.query("INSERT INTO test_index.index_compact SELECT number, toString(number) FROM numbers(100)") + node1.query("INSERT INTO test_index.index_compact SELECT number, toString(number) FROM numbers(30)") + node1.query("OPTIMIZE TABLE test_index.index_compact FINAL") assert node1.query("SELECT part_type FROM system.parts WHERE table = 'index_compact' AND active") == "Compact\n" assert node1.query("SELECT marks FROM system.parts WHERE table = 'index_compact' AND active") == "2\n" - index_path = os.path.join(node1.path, "database/data/default/index_compact/all_1_2_1/primary.idx") + index_path = os.path.join(node1.path, "database/data/test_index/index_compact/all_1_2_1/primary.idx") f = open(index_path, 'rb') assert os.path.getsize(index_path) == 8 diff --git a/tests/integration/test_quorum_inserts/configs/config.d/remote_servers.xml b/tests/integration/test_quorum_inserts/configs/config.d/remote_servers.xml new file mode 100644 index 00000000000..b1cd417f8b9 --- /dev/null +++ b/tests/integration/test_quorum_inserts/configs/config.d/remote_servers.xml @@ -0,0 +1,20 @@ + + + + + + zero + 9000 + + + first + 9000 + + + second + 9000 + + + + + diff --git a/tests/integration/test_quorum_inserts/test.py b/tests/integration/test_quorum_inserts/test.py index 0adee0afc64..2211333bb26 100644 --- a/tests/integration/test_quorum_inserts/test.py +++ b/tests/integration/test_quorum_inserts/test.py @@ -7,23 +7,21 @@ from helpers.test_tools import TSV cluster = ClickHouseCluster(__file__) zero = cluster.add_instance("zero", user_configs=["configs/users.d/settings.xml"], + main_configs=["configs/config.d/remote_servers.xml"], macros={"cluster": "anime", "shard": "0", "replica": "zero"}, with_zookeeper=True) first = cluster.add_instance("first", user_configs=["configs/users.d/settings.xml"], + main_configs=["configs/config.d/remote_servers.xml"], macros={"cluster": "anime", "shard": "0", "replica": "first"}, with_zookeeper=True) second = cluster.add_instance("second", user_configs=["configs/users.d/settings.xml"], + main_configs=["configs/config.d/remote_servers.xml"], macros={"cluster": "anime", "shard": "0", "replica": "second"}, with_zookeeper=True) -def execute_on_all_cluster(query_): - for node in [zero, first, second]: - node.query(query_) - - @pytest.fixture(scope="module") def started_cluster(): global cluster @@ -36,7 +34,7 @@ def started_cluster(): def test_simple_add_replica(started_cluster): - execute_on_all_cluster("DROP TABLE IF EXISTS test_simple") + zero.query("DROP TABLE IF EXISTS test_simple ON CLUSTER cluster") create_query = "CREATE TABLE test_simple " \ "(a Int8, d Date) " \ @@ -67,11 +65,11 @@ def test_simple_add_replica(started_cluster): assert '1\t2011-01-01\n' == first.query("SELECT * from 
test_simple") assert '1\t2011-01-01\n' == second.query("SELECT * from test_simple") - execute_on_all_cluster("DROP TABLE IF EXISTS test_simple") + zero.query("DROP TABLE IF EXISTS test_simple ON CLUSTER cluster") def test_drop_replica_and_achieve_quorum(started_cluster): - execute_on_all_cluster("DROP TABLE IF EXISTS test_drop_replica_and_achieve_quorum") + zero.query("DROP TABLE IF EXISTS test_drop_replica_and_achieve_quorum ON CLUSTER cluster") create_query = "CREATE TABLE test_drop_replica_and_achieve_quorum " \ "(a Int8, d Date) " \ @@ -125,7 +123,7 @@ def test_drop_replica_and_achieve_quorum(started_cluster): assert TSV("1\t2011-01-01\n2\t2012-02-02\n") == TSV( second.query("SELECT * FROM test_drop_replica_and_achieve_quorum ORDER BY a")) - execute_on_all_cluster("DROP TABLE IF EXISTS test_drop_replica_and_achieve_quorum") + zero.query("DROP TABLE IF EXISTS test_drop_replica_and_achieve_quorum ON CLUSTER cluster") @pytest.mark.parametrize( @@ -136,17 +134,15 @@ def test_drop_replica_and_achieve_quorum(started_cluster): ] ) def test_insert_quorum_with_drop_partition(started_cluster, add_new_data): - execute_on_all_cluster("DROP TABLE IF EXISTS test_quorum_insert_with_drop_partition") + zero.query("DROP TABLE IF EXISTS test_quorum_insert_with_drop_partition ON CLUSTER cluster") - create_query = "CREATE TABLE test_quorum_insert_with_drop_partition " \ + create_query = "CREATE TABLE test_quorum_insert_with_drop_partition ON CLUSTER cluster " \ "(a Int8, d Date) " \ - "Engine = ReplicatedMergeTree('/clickhouse/tables/{shard}/{table}', '{replica}') " \ + "Engine = ReplicatedMergeTree " \ "PARTITION BY d ORDER BY a " print("Create Replicated table with three replicas") zero.query(create_query) - first.query(create_query) - second.query(create_query) print("Stop fetches for test_quorum_insert_with_drop_partition at first replica.") first.query("SYSTEM STOP FETCHES test_quorum_insert_with_drop_partition") @@ -167,9 +163,11 @@ def test_insert_quorum_with_drop_partition(started_cluster, add_new_data): print("Sync first replica with others.") first.query("SYSTEM SYNC REPLICA test_quorum_insert_with_drop_partition") - assert "20110101" not in first.query("SELECT * FROM system.zookeeper " \ - "where path='/clickhouse/tables/0/test_quorum_insert_with_drop_partition/quorum/last_part' " \ - "format Vertical") + assert "20110101" not in first.query(""" + WITH (SELECT toString(uuid) FROM system.tables WHERE name = 'test_quorum_insert_with_drop_partition') AS uuid, + '/clickhouse/tables/' || uuid || '/0/quorum/last_part' AS p + SELECT * FROM system.zookeeper WHERE path = p FORMAT Vertical + """) print("Select from updated partition.") if (add_new_data): @@ -179,7 +177,7 @@ def test_insert_quorum_with_drop_partition(started_cluster, add_new_data): assert TSV("") == TSV(zero.query("SELECT * FROM test_quorum_insert_with_drop_partition")) assert TSV("") == TSV(second.query("SELECT * FROM test_quorum_insert_with_drop_partition")) - execute_on_all_cluster("DROP TABLE IF EXISTS test_quorum_insert_with_drop_partition") + zero.query("DROP TABLE IF EXISTS test_quorum_insert_with_drop_partition ON CLUSTER cluster") @pytest.mark.parametrize( @@ -190,28 +188,24 @@ def test_insert_quorum_with_drop_partition(started_cluster, add_new_data): ] ) def test_insert_quorum_with_move_partition(started_cluster, add_new_data): - execute_on_all_cluster("DROP TABLE IF EXISTS test_insert_quorum_with_move_partition_source") - execute_on_all_cluster("DROP TABLE IF EXISTS test_insert_quorum_with_move_partition_destination") + 
zero.query("DROP TABLE IF EXISTS test_insert_quorum_with_move_partition_source ON CLUSTER cluster") + zero.query("DROP TABLE IF EXISTS test_insert_quorum_with_move_partition_destination ON CLUSTER cluster") - create_source = "CREATE TABLE test_insert_quorum_with_move_partition_source " \ + create_source = "CREATE TABLE test_insert_quorum_with_move_partition_source ON CLUSTER cluster " \ "(a Int8, d Date) " \ - "Engine = ReplicatedMergeTree('/clickhouse/tables/{shard}/{table}', '{replica}') " \ + "Engine = ReplicatedMergeTree " \ "PARTITION BY d ORDER BY a " - create_destination = "CREATE TABLE test_insert_quorum_with_move_partition_destination " \ + create_destination = "CREATE TABLE test_insert_quorum_with_move_partition_destination ON CLUSTER cluster " \ "(a Int8, d Date) " \ - "Engine = ReplicatedMergeTree('/clickhouse/tables/{shard}/{table}', '{replica}') " \ + "Engine = ReplicatedMergeTree " \ "PARTITION BY d ORDER BY a " print("Create source Replicated table with three replicas") zero.query(create_source) - first.query(create_source) - second.query(create_source) print("Create destination Replicated table with three replicas") zero.query(create_destination) - first.query(create_destination) - second.query(create_destination) print("Stop fetches for test_insert_quorum_with_move_partition_source at first replica.") first.query("SYSTEM STOP FETCHES test_insert_quorum_with_move_partition_source") @@ -233,9 +227,11 @@ def test_insert_quorum_with_move_partition(started_cluster, add_new_data): print("Sync first replica with others.") first.query("SYSTEM SYNC REPLICA test_insert_quorum_with_move_partition_source") - assert "20110101" not in first.query("SELECT * FROM system.zookeeper " \ - "where path='/clickhouse/tables/0/test_insert_quorum_with_move_partition_source/quorum/last_part' " \ - "format Vertical") + assert "20110101" not in first.query(""" + WITH (SELECT toString(uuid) FROM system.tables WHERE name = 'test_insert_quorum_with_move_partition_source') AS uuid, + '/clickhouse/tables/' || uuid || '/0/quorum/last_part' AS p + SELECT * FROM system.zookeeper WHERE path = p FORMAT Vertical + """) print("Select from updated partition.") if (add_new_data): @@ -246,12 +242,12 @@ def test_insert_quorum_with_move_partition(started_cluster, add_new_data): assert TSV("") == TSV(zero.query("SELECT * FROM test_insert_quorum_with_move_partition_source")) assert TSV("") == TSV(second.query("SELECT * FROM test_insert_quorum_with_move_partition_source")) - execute_on_all_cluster("DROP TABLE IF EXISTS test_insert_quorum_with_move_partition_source") - execute_on_all_cluster("DROP TABLE IF EXISTS test_insert_quorum_with_move_partition_destination") + zero.query("DROP TABLE IF EXISTS test_insert_quorum_with_move_partition_source ON CLUSTER cluster") + zero.query("DROP TABLE IF EXISTS test_insert_quorum_with_move_partition_destination ON CLUSTER cluster") def test_insert_quorum_with_ttl(started_cluster): - execute_on_all_cluster("DROP TABLE IF EXISTS test_insert_quorum_with_ttl") + zero.query("DROP TABLE IF EXISTS test_insert_quorum_with_ttl ON CLUSTER cluster") create_query = "CREATE TABLE test_insert_quorum_with_ttl " \ "(a Int8, d Date) " \ @@ -298,4 +294,4 @@ def test_insert_quorum_with_ttl(started_cluster): assert TSV("2\t2012-02-02\n") == TSV( first.query("SELECT * FROM test_insert_quorum_with_ttl", settings={'select_sequential_consistency': 1})) - execute_on_all_cluster("DROP TABLE IF EXISTS test_insert_quorum_with_ttl") + zero.query("DROP TABLE IF EXISTS test_insert_quorum_with_ttl ON CLUSTER 
cluster") diff --git a/tests/integration/test_reloading_settings_from_users_xml/__init__.py b/tests/integration/test_reloading_settings_from_users_xml/__init__.py new file mode 100644 index 00000000000..e69de29bb2d diff --git a/tests/integration/test_reloading_settings_from_users_xml/configs/changed_settings.xml b/tests/integration/test_reloading_settings_from_users_xml/configs/changed_settings.xml new file mode 100644 index 00000000000..382c2b2dc20 --- /dev/null +++ b/tests/integration/test_reloading_settings_from_users_xml/configs/changed_settings.xml @@ -0,0 +1,9 @@ + + + + + 20000000000 + nearest_hostname + + + diff --git a/tests/integration/test_reloading_settings_from_users_xml/configs/normal_settings.xml b/tests/integration/test_reloading_settings_from_users_xml/configs/normal_settings.xml new file mode 100644 index 00000000000..85d1c26659f --- /dev/null +++ b/tests/integration/test_reloading_settings_from_users_xml/configs/normal_settings.xml @@ -0,0 +1,9 @@ + + + + + 10000000000 + first_or_random + + + diff --git a/tests/integration/test_reloading_settings_from_users_xml/configs/unexpected_setting_enum.xml b/tests/integration/test_reloading_settings_from_users_xml/configs/unexpected_setting_enum.xml new file mode 100644 index 00000000000..ff2b40583de --- /dev/null +++ b/tests/integration/test_reloading_settings_from_users_xml/configs/unexpected_setting_enum.xml @@ -0,0 +1,9 @@ + + + + + 20000000000 + a + + + diff --git a/tests/integration/test_reloading_settings_from_users_xml/configs/unexpected_setting_int.xml b/tests/integration/test_reloading_settings_from_users_xml/configs/unexpected_setting_int.xml new file mode 100644 index 00000000000..4ef15ed3680 --- /dev/null +++ b/tests/integration/test_reloading_settings_from_users_xml/configs/unexpected_setting_int.xml @@ -0,0 +1,9 @@ + + + + + a + nearest_hostname + + + diff --git a/tests/integration/test_reloading_settings_from_users_xml/configs/unknown_setting.xml b/tests/integration/test_reloading_settings_from_users_xml/configs/unknown_setting.xml new file mode 100644 index 00000000000..9bac09aef18 --- /dev/null +++ b/tests/integration/test_reloading_settings_from_users_xml/configs/unknown_setting.xml @@ -0,0 +1,8 @@ + + + + + 8 + + + diff --git a/tests/integration/test_reloading_settings_from_users_xml/test.py b/tests/integration/test_reloading_settings_from_users_xml/test.py new file mode 100644 index 00000000000..b45568ee904 --- /dev/null +++ b/tests/integration/test_reloading_settings_from_users_xml/test.py @@ -0,0 +1,90 @@ +import pytest +import os +import time +from helpers.cluster import ClickHouseCluster +from helpers.test_tools import assert_eq_with_retry, assert_logs_contain_with_retry + +SCRIPT_DIR = os.path.dirname(os.path.realpath(__file__)) +cluster = ClickHouseCluster(__file__) +node = cluster.add_instance('node', user_configs=["configs/normal_settings.xml"]) + +@pytest.fixture(scope="module", autouse=True) +def started_cluster(): + try: + cluster.start() + yield cluster + finally: + cluster.shutdown() + + +@pytest.fixture(autouse=True) +def reset_to_normal_settings_after_test(): + try: + node.copy_file_to_container(os.path.join(SCRIPT_DIR, "configs/normal_settings.xml"), '/etc/clickhouse-server/users.d/z.xml') + node.query("SYSTEM RELOAD CONFIG") + yield + finally: + pass + + +def test_force_reload(): + assert node.query("SELECT getSetting('max_memory_usage')") == "10000000000\n" + assert node.query("SELECT getSetting('load_balancing')") == "first_or_random\n" + + 
node.copy_file_to_container(os.path.join(SCRIPT_DIR, "configs/changed_settings.xml"), '/etc/clickhouse-server/users.d/z.xml') + node.query("SYSTEM RELOAD CONFIG") + + assert node.query("SELECT getSetting('max_memory_usage')") == "20000000000\n" + assert node.query("SELECT getSetting('load_balancing')") == "nearest_hostname\n" + + +def test_reload_on_timeout(): + assert node.query("SELECT getSetting('max_memory_usage')") == "10000000000\n" + assert node.query("SELECT getSetting('load_balancing')") == "first_or_random\n" + + time.sleep(1) # The modification time of the 'z.xml' file should be different, + # because config files are reloaded by timer only when the modification time is changed. + node.copy_file_to_container(os.path.join(SCRIPT_DIR, "configs/changed_settings.xml"), '/etc/clickhouse-server/users.d/z.xml') + + assert_eq_with_retry(node, "SELECT getSetting('max_memory_usage')", "20000000000") + assert_eq_with_retry(node, "SELECT getSetting('load_balancing')", "nearest_hostname") + + +def test_unknown_setting_force_reload(): + node.copy_file_to_container(os.path.join(SCRIPT_DIR, "configs/unknown_setting.xml"), '/etc/clickhouse-server/users.d/z.xml') + + error_message = "Setting xyz is neither a builtin setting nor started with the prefix 'custom_' registered for user-defined settings" + assert error_message in node.query_and_get_error("SYSTEM RELOAD CONFIG") + + assert node.query("SELECT getSetting('max_memory_usage')") == "10000000000\n" + assert node.query("SELECT getSetting('load_balancing')") == "first_or_random\n" + + +def test_unknown_setting_reload_on_timeout(): + time.sleep(1) # The modification time of the 'z.xml' file should be different, + # because config files are reloaded by timer only when the modification time is changed. + node.copy_file_to_container(os.path.join(SCRIPT_DIR, "configs/unknown_setting.xml"), '/etc/clickhouse-server/users.d/z.xml') + + error_message = "Setting xyz is neither a builtin setting nor started with the prefix 'custom_' registered for user-defined settings" + assert_logs_contain_with_retry(node, error_message) + + assert node.query("SELECT getSetting('max_memory_usage')") == "10000000000\n" + assert node.query("SELECT getSetting('load_balancing')") == "first_or_random\n" + + +def test_unexpected_setting_int(): + node.copy_file_to_container(os.path.join(SCRIPT_DIR, "configs/unexpected_setting_int.xml"), '/etc/clickhouse-server/users.d/z.xml') + error_message = "Cannot parse" + assert error_message in node.query_and_get_error("SYSTEM RELOAD CONFIG") + + assert node.query("SELECT getSetting('max_memory_usage')") == "10000000000\n" + assert node.query("SELECT getSetting('load_balancing')") == "first_or_random\n" + + +def test_unexpected_setting_enum(): + node.copy_file_to_container(os.path.join(SCRIPT_DIR, "configs/unexpected_setting_enum.xml"), '/etc/clickhouse-server/users.d/z.xml') + error_message = "Cannot parse" + assert error_message in node.query_and_get_error("SYSTEM RELOAD CONFIG") + + assert node.query("SELECT getSetting('max_memory_usage')") == "10000000000\n" + assert node.query("SELECT getSetting('load_balancing')") == "first_or_random\n" diff --git a/tests/integration/test_rename_column/configs/config.d/storage_configuration.xml b/tests/integration/test_rename_column/configs/config.d/storage_configuration.xml index 131219abf3d..f1b92ab32a6 100644 --- a/tests/integration/test_rename_column/configs/config.d/storage_configuration.xml +++ b/tests/integration/test_rename_column/configs/config.d/storage_configuration.xml @@ -24,5 +24,9 @@ + +
+ 0 + diff --git a/tests/integration/test_replicated_merge_tree_s3/configs/config.d/storage_conf.xml b/tests/integration/test_replicated_merge_tree_s3/configs/config.d/storage_conf.xml index f3b7f959ce9..20b750ffff3 100644 --- a/tests/integration/test_replicated_merge_tree_s3/configs/config.d/storage_conf.xml +++ b/tests/integration/test_replicated_merge_tree_s3/configs/config.d/storage_conf.xml @@ -22,4 +22,28 @@ 0 + + + + + + node1 + 9000 + + + node2 + 9000 + + + node3 + 9000 + + + + + + + 0 + + diff --git a/tests/integration/test_replicated_merge_tree_s3/test.py b/tests/integration/test_replicated_merge_tree_s3/test.py index 1414905759a..4d19793d0b2 100644 --- a/tests/integration/test_replicated_merge_tree_s3/test.py +++ b/tests/integration/test_replicated_merge_tree_s3/test.py @@ -14,11 +14,11 @@ def cluster(): try: cluster = ClickHouseCluster(__file__) - cluster.add_instance("node1", main_configs=["configs/config.d/storage_conf.xml"], macros={'cluster': 'test1'}, + cluster.add_instance("node1", main_configs=["configs/config.d/storage_conf.xml"], macros={'replica': '1'}, with_minio=True, with_zookeeper=True) - cluster.add_instance("node2", main_configs=["configs/config.d/storage_conf.xml"], macros={'cluster': 'test1'}, + cluster.add_instance("node2", main_configs=["configs/config.d/storage_conf.xml"], macros={'replica': '2'}, with_zookeeper=True) - cluster.add_instance("node3", main_configs=["configs/config.d/storage_conf.xml"], macros={'cluster': 'test1'}, + cluster.add_instance("node3", main_configs=["configs/config.d/storage_conf.xml"], macros={'replica': '3'}, with_zookeeper=True) logging.info("Starting cluster...") @@ -49,12 +49,12 @@ def generate_values(date_str, count, sign=1): def create_table(cluster, additional_settings=None): create_table_statement = """ - CREATE TABLE s3_test ( + CREATE TABLE s3_test ON CLUSTER cluster( dt Date, id Int64, data String, INDEX min_max (id) TYPE minmax GRANULARITY 3 - ) ENGINE=ReplicatedMergeTree('/clickhouse/{cluster}/tables/test/s3', '{instance}') + ) ENGINE=ReplicatedMergeTree() PARTITION BY dt ORDER BY (dt, id) SETTINGS storage_policy='s3' @@ -63,8 +63,7 @@ def create_table(cluster, additional_settings=None): create_table_statement += "," create_table_statement += additional_settings - for node in cluster.instances.values(): - node.query(create_table_statement) + cluster.instances.values()[0].query(create_table_statement) @pytest.fixture(autouse=True) diff --git a/tests/integration/test_row_policy/test.py b/tests/integration/test_row_policy/test.py index a407f0b2c7a..c3c86f5a9c5 100644 --- a/tests/integration/test_row_policy/test.py +++ b/tests/integration/test_row_policy/test.py @@ -34,7 +34,7 @@ def started_cluster(): for current_node in nodes: current_node.query(''' - CREATE DATABASE mydb ENGINE=Ordinary; + CREATE DATABASE mydb; CREATE TABLE mydb.filtered_table1 (a UInt8, b UInt8) ENGINE MergeTree ORDER BY a; INSERT INTO mydb.filtered_table1 values (0, 0), (0, 1), (1, 0), (1, 1); @@ -360,7 +360,7 @@ def test_miscellaneous_engines(): # ReplicatedCollapsingMergeTree node.query("DROP TABLE mydb.filtered_table1") node.query( - "CREATE TABLE mydb.filtered_table1 (a UInt8, b Int8) ENGINE ReplicatedCollapsingMergeTree('/clickhouse/tables/00-00/filtered_table1', 'replica1', b) ORDER BY a") + "CREATE TABLE mydb.filtered_table1 (a UInt8, b Int8) ENGINE ReplicatedCollapsingMergeTree('/clickhouse/tables/00-01/filtered_table1', 'replica1', b) ORDER BY a") node.query("INSERT INTO mydb.filtered_table1 values (0, 1), (0, 1), (1, 1), (1, 1)") assert 
node.query("SELECT * FROM mydb.filtered_table1") == TSV([[1, 1], [1, 1]]) diff --git a/tests/integration/test_s3_with_proxy/test.py b/tests/integration/test_s3_with_proxy/test.py index 9df209826f9..70a50ae0e15 100644 --- a/tests/integration/test_s3_with_proxy/test.py +++ b/tests/integration/test_s3_with_proxy/test.py @@ -1,5 +1,6 @@ import logging import os +import time import pytest from helpers.cluster import ClickHouseCluster @@ -37,10 +38,15 @@ def cluster(): def check_proxy_logs(cluster, proxy_instance, http_methods={"POST", "PUT", "GET", "DELETE"}): - logs = cluster.get_container_logs(proxy_instance) - # Check that all possible interactions with Minio are present - for http_method in http_methods: - assert logs.find(http_method + " http://minio1") >= 0 + for i in range(10): + logs = cluster.get_container_logs(proxy_instance) + # Check with retry that all possible interactions with Minio are present + for http_method in http_methods: + if logs.find(http_method + " http://minio1") >= 0: + return + time.sleep(1) + else: + assert False, "http method not found in logs" @pytest.mark.parametrize( diff --git a/tests/integration/test_system_merges/test.py b/tests/integration/test_system_merges/test.py index 07e6f7331d9..1f2da606cd1 100644 --- a/tests/integration/test_system_merges/test.py +++ b/tests/integration/test_system_merges/test.py @@ -21,7 +21,7 @@ node2 = cluster.add_instance('node2', def started_cluster(): try: cluster.start() - node1.query('CREATE DATABASE test ENGINE=Ordinary') + node1.query('CREATE DATABASE test ENGINE=Ordinary') # Different paths with Atomic node2.query('CREATE DATABASE test ENGINE=Ordinary') yield cluster diff --git a/tests/performance/columns_hashing.xml b/tests/performance/columns_hashing.xml index fb340c20ccd..3ea2e013acc 100644 --- a/tests/performance/columns_hashing.xml +++ b/tests/performance/columns_hashing.xml @@ -1,15 +1,12 @@ - - columns_hashing - - - test.hits + hits_10m_single + hits_100m_single - - - - - + select sum(UserID + 1 in (select UserID from hits_10m_single)) from hits_10m_single + select sum((UserID + 1, RegionID) in (select UserID, RegionID from hits_10m_single)) from hits_10m_single + select sum(URL in (select URL from hits_10m_single where URL != '')) from hits_10m_single + select sum(MobilePhoneModel in (select MobilePhoneModel from hits_100m_single where MobilePhoneModel != '')) from hits_100m_single + select sum((MobilePhoneModel, UserID + 1) in (select MobilePhoneModel, UserID from hits_100m_single where MobilePhoneModel != '')) from hits_100m_single diff --git a/tests/performance/datetime_comparison.xml b/tests/performance/datetime_comparison.xml new file mode 100644 index 00000000000..2d47ded0b1a --- /dev/null +++ b/tests/performance/datetime_comparison.xml @@ -0,0 +1,5 @@ + + SELECT count() FROM numbers(1000000000) WHERE materialize(now()) > toString(toDateTime('2020-09-30 00:00:00')) + SELECT count() FROM numbers(1000000000) WHERE materialize(now()) > toUInt32(toDateTime('2020-09-30 00:00:00')) + SELECT count() FROM numbers(1000000000) WHERE materialize(now()) > toDateTime('2020-09-30 00:00:00') + diff --git a/tests/performance/broken/decimal_casts.xml b/tests/performance/decimal_casts.xml similarity index 87% rename from tests/performance/broken/decimal_casts.xml rename to tests/performance/decimal_casts.xml index 6c090faee77..582672fa30e 100644 --- a/tests/performance/broken/decimal_casts.xml +++ b/tests/performance/decimal_casts.xml @@ -1,11 +1,11 @@ - 10G + 15G CREATE TABLE t (x UInt64, d32 Decimal32(3), d64 Decimal64(4), 
d128 Decimal128(5)) ENGINE = Memory - INSERT INTO t SELECT number AS x, x % 1000000 AS d32, x AS d64, x d128 FROM numbers_mt(25000000) SETTINGS max_threads = 8 + INSERT INTO t SELECT number AS x, x % 1000000 AS d32, x AS d64, x d128 FROM numbers_mt(100000000) SETTINGS max_threads = 8 DROP TABLE IF EXISTS t SELECT toUInt32(x) y, toDecimal32(y, 1), toDecimal64(y, 5), toDecimal128(y, 6) FROM t FORMAT Null @@ -13,8 +13,8 @@ SELECT toInt64(x) y, toDecimal32(y, 1), toDecimal64(y, 5), toDecimal128(y, 6) FROM t FORMAT Null SELECT toUInt64(x) y, toDecimal32(y, 1), toDecimal64(y, 5), toDecimal128(y, 6) FROM t FORMAT Null SELECT toInt128(x) y, toDecimal32(y, 1), toDecimal64(y, 5), toDecimal128(y, 6) FROM t FORMAT Null - SELECT toInt256(x) y, toDecimal32(y, 1), toDecimal64(y, 5), toDecimal128(y, 6) FROM t FORMAT Null - SELECT toUInt256(x) y, toDecimal32(y, 1), toDecimal64(y, 5), toDecimal128(y, 6) FROM t FORMAT Null + SELECT toInt256(x) y, toDecimal32(y, 1), toDecimal64(y, 5), toDecimal128(y, 6) FROM t LIMIT 10000000 FORMAT Null + SELECT toUInt256(x) y, toDecimal32(y, 1), toDecimal64(y, 5), toDecimal128(y, 6) FROM t LIMIT 10000000 FORMAT Null SELECT toFloat32(x) y, toDecimal32(y, 1), toDecimal64(y, 5), toDecimal128(y, 6) FROM t FORMAT Null SELECT toFloat64(x) y, toDecimal32(y, 1), toDecimal64(y, 5), toDecimal128(y, 6) FROM t FORMAT Null diff --git a/tests/performance/insert_values_with_expressions.xml b/tests/performance/insert_values_with_expressions.xml index 3456cd0ec68..daa3488e34b 100644 --- a/tests/performance/insert_values_with_expressions.xml +++ b/tests/performance/insert_values_with_expressions.xml @@ -17,7 +17,7 @@ file('test_some_expr_matches.values', Values, 'i Int64, ari Array(Int64), ars Array(String)') - select * from file('test_all_expr_matches.values', Values, 'd DateTime, i UInt32, s String, ni Nullable(UInt64), ns Nullable(String), ars Array(String)') - select * from file('test_some_expr_matches.values', Values, 'i Int64, ari Array(Int64), ars Array(String)') + select * from file('test_all_expr_matches.values', Values, 'd DateTime, i UInt32, s String, ni Nullable(UInt64), ns Nullable(String), ars Array(String)') format Null + select * from file('test_some_expr_matches.values', Values, 'i Int64, ari Array(Int64), ars Array(String)') format Null diff --git a/tests/queries/0_stateless/00510_materizlized_view_and_deduplication_zookeeper.sql b/tests/queries/0_stateless/00510_materizlized_view_and_deduplication_zookeeper.sql index 48e1cd65c49..8df012a8588 100644 --- a/tests/queries/0_stateless/00510_materizlized_view_and_deduplication_zookeeper.sql +++ b/tests/queries/0_stateless/00510_materizlized_view_and_deduplication_zookeeper.sql @@ -8,10 +8,10 @@ CREATE TABLE with_deduplication(x UInt32) CREATE TABLE without_deduplication(x UInt32) ENGINE ReplicatedMergeTree('/clickhouse/tables/test_00510/without_deduplication', 'r1') ORDER BY x SETTINGS replicated_deduplication_window = 0; -CREATE MATERIALIZED VIEW with_deduplication_mv +CREATE MATERIALIZED VIEW with_deduplication_mv UUID '00000510-1000-4000-8000-000000000001' ENGINE = ReplicatedAggregatingMergeTree('/clickhouse/tables/test_00510/with_deduplication_mv', 'r1') ORDER BY dummy AS SELECT 0 AS dummy, countState(x) AS cnt FROM with_deduplication; -CREATE MATERIALIZED VIEW without_deduplication_mv +CREATE MATERIALIZED VIEW without_deduplication_mv UUID '00000510-1000-4000-8000-000000000002' ENGINE = ReplicatedAggregatingMergeTree('/clickhouse/tables/test_00510/without_deduplication_mv', 'r1') ORDER BY dummy AS SELECT 0 AS dummy, 
countState(x) AS cnt FROM without_deduplication; @@ -32,12 +32,12 @@ SELECT countMerge(cnt) FROM with_deduplication_mv; SELECT countMerge(cnt) FROM without_deduplication_mv; -- Explicit insert is deduplicated -ALTER TABLE `.inner.with_deduplication_mv` DROP PARTITION ID 'all'; -ALTER TABLE `.inner.without_deduplication_mv` DROP PARTITION ID 'all'; -INSERT INTO `.inner.with_deduplication_mv` SELECT 0 AS dummy, arrayReduce('countState', [toUInt32(42)]) AS cnt; -INSERT INTO `.inner.with_deduplication_mv` SELECT 0 AS dummy, arrayReduce('countState', [toUInt32(42)]) AS cnt; -INSERT INTO `.inner.without_deduplication_mv` SELECT 0 AS dummy, arrayReduce('countState', [toUInt32(42)]) AS cnt; -INSERT INTO `.inner.without_deduplication_mv` SELECT 0 AS dummy, arrayReduce('countState', [toUInt32(42)]) AS cnt; +ALTER TABLE `.inner_id.00000510-1000-4000-8000-000000000001` DROP PARTITION ID 'all'; +ALTER TABLE `.inner_id.00000510-1000-4000-8000-000000000002` DROP PARTITION ID 'all'; +INSERT INTO `.inner_id.00000510-1000-4000-8000-000000000001` SELECT 0 AS dummy, arrayReduce('countState', [toUInt32(42)]) AS cnt; +INSERT INTO `.inner_id.00000510-1000-4000-8000-000000000001` SELECT 0 AS dummy, arrayReduce('countState', [toUInt32(42)]) AS cnt; +INSERT INTO `.inner_id.00000510-1000-4000-8000-000000000002` SELECT 0 AS dummy, arrayReduce('countState', [toUInt32(42)]) AS cnt; +INSERT INTO `.inner_id.00000510-1000-4000-8000-000000000002` SELECT 0 AS dummy, arrayReduce('countState', [toUInt32(42)]) AS cnt; SELECT ''; SELECT countMerge(cnt) FROM with_deduplication_mv; diff --git a/tests/queries/0_stateless/00604_show_create_database.reference b/tests/queries/0_stateless/00604_show_create_database.reference index a9ad6abea25..c05b088280e 100644 --- a/tests/queries/0_stateless/00604_show_create_database.reference +++ b/tests/queries/0_stateless/00604_show_create_database.reference @@ -1 +1 @@ -CREATE DATABASE test_00604\nENGINE = Ordinary +CREATE DATABASE test_00604\nENGINE = Atomic diff --git a/tests/queries/0_stateless/00609_mv_index_in_in.sql b/tests/queries/0_stateless/00609_mv_index_in_in.sql index 7064d8e36cd..28085194327 100644 --- a/tests/queries/0_stateless/00609_mv_index_in_in.sql +++ b/tests/queries/0_stateless/00609_mv_index_in_in.sql @@ -4,11 +4,11 @@ DROP TABLE IF EXISTS test_mv_00609; create table test_00609 (a Int8) engine=Memory; insert into test_00609 values (1); -create materialized view test_mv_00609 Engine=MergeTree(date, (a), 8192) populate as select a, toDate('2000-01-01') date from test_00609; +create materialized view test_mv_00609 uuid '00000609-1000-4000-8000-000000000001' Engine=MergeTree(date, (a), 8192) populate as select a, toDate('2000-01-01') date from test_00609; select * from test_mv_00609; -- OK select * from test_mv_00609 where a in (select a from test_mv_00609); -- EMPTY (bug) -select * from ".inner.test_mv_00609" where a in (select a from test_mv_00609); -- OK +select * from ".inner_id.00000609-1000-4000-8000-000000000001" where a in (select a from test_mv_00609); -- OK DROP TABLE test_00609; DROP TABLE test_mv_00609; diff --git a/tests/queries/0_stateless/00738_lock_for_inner_table.sh b/tests/queries/0_stateless/00738_lock_for_inner_table.sh index 2f7035b6759..4570c853f31 100755 --- a/tests/queries/0_stateless/00738_lock_for_inner_table.sh +++ b/tests/queries/0_stateless/00738_lock_for_inner_table.sh @@ -7,13 +7,13 @@ CURDIR=$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd) echo "DROP TABLE IF EXISTS tab_00738; DROP TABLE IF EXISTS mv; CREATE TABLE tab_00738(a Int) ENGINE = Log; 
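The UUID clauses added in the hunks above exist so the tests can still reach a materialized view's backing table by name: with the Atomic database engine it is created as `.inner_id.<uuid>` rather than the old `.inner.<view name>`. A hedged sketch of the pattern, with made-up names and UUID that are not taken from the patch:

CREATE TABLE src (x UInt32) ENGINE = MergeTree ORDER BY x;
CREATE MATERIALIZED VIEW mv_example UUID '00000000-0000-4000-8000-0000000000ff'
    ENGINE = MergeTree ORDER BY x AS SELECT x FROM src;
INSERT INTO src VALUES (1);
-- Under Atomic the inner table is addressable through the pinned UUID:
SELECT count() FROM `.inner_id.00000000-0000-4000-8000-0000000000ff`;
DROP TABLE mv_example;
DROP TABLE src;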
-CREATE MATERIALIZED VIEW mv ENGINE = Log AS SELECT a FROM tab_00738;" | ${CLICKHOUSE_CLIENT} -n +CREATE MATERIALIZED VIEW mv UUID '00000738-1000-4000-8000-000000000001' ENGINE = Log AS SELECT a FROM tab_00738;" | ${CLICKHOUSE_CLIENT} -n ${CLICKHOUSE_CLIENT} --query_id test_00738 --query "INSERT INTO tab_00738 SELECT number FROM numbers(10000000)" & function drop() { - ${CLICKHOUSE_CLIENT} --query "DROP TABLE \`.inner.mv\`" -n + ${CLICKHOUSE_CLIENT} --query "DROP TABLE \`.inner_id.00000738-1000-4000-8000-000000000001\`" -n } function wait_for_query_to_start() diff --git a/tests/queries/0_stateless/00853_join_with_nulls_crash.sql b/tests/queries/0_stateless/00853_join_with_nulls_crash.sql index eb64ed29ffe..464ddbb1990 100644 --- a/tests/queries/0_stateless/00853_join_with_nulls_crash.sql +++ b/tests/queries/0_stateless/00853_join_with_nulls_crash.sql @@ -21,14 +21,14 @@ SELECT s1.other, s2.other, count_a, count_b, toTypeName(s1.other), toTypeName(s2 ALL FULL JOIN ( SELECT other, count() AS count_b FROM table_b GROUP BY other ) s2 ON s1.other = s2.other -ORDER BY s2.other DESC; +ORDER BY s2.other DESC, count_a; SELECT s1.other, s2.other, count_a, count_b, toTypeName(s1.other), toTypeName(s2.other) FROM ( SELECT other, count() AS count_a FROM table_a GROUP BY other ) s1 ALL FULL JOIN ( SELECT other, count() AS count_b FROM table_b GROUP BY other ) s2 USING other -ORDER BY s2.other DESC; +ORDER BY s2.other DESC, count_a; SELECT s1.something, s2.something, count_a, count_b, toTypeName(s1.something), toTypeName(s2.something) FROM ( SELECT something, count() AS count_a FROM table_a GROUP BY something ) s1 diff --git a/tests/queries/0_stateless/01018_ddl_dictionaries_create.reference b/tests/queries/0_stateless/01018_ddl_dictionaries_create.reference index 7c2eca9cedf..5b020911d2e 100644 --- a/tests/queries/0_stateless/01018_ddl_dictionaries_create.reference +++ b/tests/queries/0_stateless/01018_ddl_dictionaries_create.reference @@ -1,14 +1,14 @@ =DICTIONARY in Ordinary DB -CREATE DICTIONARY ordinary_db.dict1\n(\n `key_column` UInt64 DEFAULT 0,\n `second_column` UInt8 DEFAULT 1,\n `third_column` String DEFAULT \'qqq\'\n)\nPRIMARY KEY key_column\nSOURCE(CLICKHOUSE(HOST \'localhost\' PORT 9000 USER \'default\' TABLE \'table_for_dict\' PASSWORD \'\' DB \'database_for_dict\'))\nLIFETIME(MIN 1 MAX 10)\nLAYOUT(FLAT()) +CREATE DICTIONARY db_01018.dict1\n(\n `key_column` UInt64 DEFAULT 0,\n `second_column` UInt8 DEFAULT 1,\n `third_column` String DEFAULT \'qqq\'\n)\nPRIMARY KEY key_column\nSOURCE(CLICKHOUSE(HOST \'localhost\' PORT 9000 USER \'default\' TABLE \'table_for_dict\' PASSWORD \'\' DB \'database_for_dict_01018\'))\nLIFETIME(MIN 1 MAX 10)\nLAYOUT(FLAT()) dict1 1 -ordinary_db dict1 +db_01018 dict1 ==DETACH DICTIONARY 0 ==ATTACH DICTIONARY dict1 1 -ordinary_db dict1 +db_01018 dict1 ==DROP DICTIONARY 0 =DICTIONARY in Memory DB diff --git a/tests/queries/0_stateless/01018_ddl_dictionaries_create.sql b/tests/queries/0_stateless/01018_ddl_dictionaries_create.sql index d7d7c02baa8..3261b1e61d1 100644 --- a/tests/queries/0_stateless/01018_ddl_dictionaries_create.sql +++ b/tests/queries/0_stateless/01018_ddl_dictionaries_create.sql @@ -1,12 +1,12 @@ SET send_logs_level = 'fatal'; -DROP DATABASE IF EXISTS database_for_dict; +DROP DATABASE IF EXISTS database_for_dict_01018; -CREATE DATABASE database_for_dict Engine = Ordinary; +CREATE DATABASE database_for_dict_01018; -DROP TABLE IF EXISTS database_for_dict.table_for_dict; +DROP TABLE IF EXISTS database_for_dict_01018.table_for_dict; -CREATE TABLE 
database_for_dict.table_for_dict +CREATE TABLE database_for_dict_01018.table_for_dict ( key_column UInt64, second_column UInt8, @@ -15,64 +15,64 @@ CREATE TABLE database_for_dict.table_for_dict ENGINE = MergeTree() ORDER BY key_column; -INSERT INTO database_for_dict.table_for_dict VALUES (1, 100, 'Hello world'); +INSERT INTO database_for_dict_01018.table_for_dict VALUES (1, 100, 'Hello world'); -DROP DATABASE IF EXISTS ordinary_db; +DROP DATABASE IF EXISTS db_01018; -CREATE DATABASE ordinary_db ENGINE = Ordinary; +CREATE DATABASE db_01018; SELECT '=DICTIONARY in Ordinary DB'; -DROP DICTIONARY IF EXISTS ordinary_db.dict1; +DROP DICTIONARY IF EXISTS db_01018.dict1; -CREATE DICTIONARY ordinary_db.dict1 +CREATE DICTIONARY db_01018.dict1 ( key_column UInt64 DEFAULT 0, second_column UInt8 DEFAULT 1, third_column String DEFAULT 'qqq' ) PRIMARY KEY key_column -SOURCE(CLICKHOUSE(HOST 'localhost' PORT 9000 USER 'default' TABLE 'table_for_dict' PASSWORD '' DB 'database_for_dict')) +SOURCE(CLICKHOUSE(HOST 'localhost' PORT 9000 USER 'default' TABLE 'table_for_dict' PASSWORD '' DB 'database_for_dict_01018')) LIFETIME(MIN 1 MAX 10) LAYOUT(FLAT()); -SHOW CREATE DICTIONARY ordinary_db.dict1; +SHOW CREATE DICTIONARY db_01018.dict1; -SHOW DICTIONARIES FROM ordinary_db LIKE 'dict1'; +SHOW DICTIONARIES FROM db_01018 LIKE 'dict1'; -EXISTS DICTIONARY ordinary_db.dict1; +EXISTS DICTIONARY db_01018.dict1; SELECT database, name FROM system.dictionaries WHERE name LIKE 'dict1'; SELECT '==DETACH DICTIONARY'; -DETACH DICTIONARY ordinary_db.dict1; +DETACH DICTIONARY db_01018.dict1; -SHOW DICTIONARIES FROM ordinary_db LIKE 'dict1'; +SHOW DICTIONARIES FROM db_01018 LIKE 'dict1'; -EXISTS DICTIONARY ordinary_db.dict1; +EXISTS DICTIONARY db_01018.dict1; SELECT database, name FROM system.dictionaries WHERE name LIKE 'dict1'; SELECT '==ATTACH DICTIONARY'; -ATTACH DICTIONARY ordinary_db.dict1; +ATTACH DICTIONARY db_01018.dict1; -SHOW DICTIONARIES FROM ordinary_db LIKE 'dict1'; +SHOW DICTIONARIES FROM db_01018 LIKE 'dict1'; -EXISTS DICTIONARY ordinary_db.dict1; +EXISTS DICTIONARY db_01018.dict1; SELECT database, name FROM system.dictionaries WHERE name LIKE 'dict1'; SELECT '==DROP DICTIONARY'; -DROP DICTIONARY IF EXISTS ordinary_db.dict1; +DROP DICTIONARY IF EXISTS db_01018.dict1; -SHOW DICTIONARIES FROM ordinary_db LIKE 'dict1'; +SHOW DICTIONARIES FROM db_01018 LIKE 'dict1'; -EXISTS DICTIONARY ordinary_db.dict1; +EXISTS DICTIONARY db_01018.dict1; SELECT database, name FROM system.dictionaries WHERE name LIKE 'dict1'; -DROP DATABASE IF EXISTS ordinary_db; +DROP DATABASE IF EXISTS db_01018; DROP DATABASE IF EXISTS memory_db; @@ -87,7 +87,7 @@ CREATE DICTIONARY memory_db.dict2 third_column String DEFAULT 'qqq' ) PRIMARY KEY key_column -SOURCE(CLICKHOUSE(HOST 'localhost' PORT 9000 USER 'default' TABLE 'table_for_dict' PASSWORD '' DB 'database_for_dict')) +SOURCE(CLICKHOUSE(HOST 'localhost' PORT 9000 USER 'default' TABLE 'table_for_dict' PASSWORD '' DB 'database_for_dict_01018')) LIFETIME(MIN 1 MAX 10) LAYOUT(FLAT()); -- {serverError 48} @@ -112,7 +112,7 @@ CREATE DICTIONARY lazy_db.dict3 third_column String DEFAULT 'qqq' ) PRIMARY KEY key_column, second_column -SOURCE(CLICKHOUSE(HOST 'localhost' PORT 9000 USER 'default' TABLE 'table_for_dict' PASSWORD '' DB 'database_for_dict')) +SOURCE(CLICKHOUSE(HOST 'localhost' PORT 9000 USER 'default' TABLE 'table_for_dict' PASSWORD '' DB 'database_for_dict_01018')) LIFETIME(MIN 1 MAX 10) LAYOUT(COMPLEX_KEY_HASHED()); -- {serverError 48} @@ -120,45 +120,45 @@ DROP DATABASE IF EXISTS lazy_db; 
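The explicit `Engine = Ordinary` clauses are dropped throughout this hunk because a bare `CREATE DATABASE` now defaults to the Atomic engine (compare the updated 00604_show_create_database reference above). A quick check, sketched with a hypothetical database name:

CREATE DATABASE db_example;
SHOW CREATE DATABASE db_example;   -- CREATE DATABASE db_example\nENGINE = Atomic
DROP DATABASE db_example;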
SELECT '=DROP DATABASE WITH DICTIONARY'; -DROP DATABASE IF EXISTS ordinary_db; +DROP DATABASE IF EXISTS db_01018; -CREATE DATABASE ordinary_db ENGINE = Ordinary; +CREATE DATABASE db_01018; -CREATE DICTIONARY ordinary_db.dict4 +CREATE DICTIONARY db_01018.dict4 ( key_column UInt64 DEFAULT 0, second_column UInt8 DEFAULT 1, third_column String DEFAULT 'qqq' ) PRIMARY KEY key_column -SOURCE(CLICKHOUSE(HOST 'localhost' PORT 9000 USER 'default' TABLE 'table_for_dict' PASSWORD '' DB 'database_for_dict')) +SOURCE(CLICKHOUSE(HOST 'localhost' PORT 9000 USER 'default' TABLE 'table_for_dict' PASSWORD '' DB 'database_for_dict_01018')) LIFETIME(MIN 1 MAX 10) LAYOUT(FLAT()); -SHOW DICTIONARIES FROM ordinary_db; +SHOW DICTIONARIES FROM db_01018; -DROP DATABASE IF EXISTS ordinary_db; +DROP DATABASE IF EXISTS db_01018; -CREATE DATABASE ordinary_db ENGINE = Ordinary; +CREATE DATABASE db_01018; -SHOW DICTIONARIES FROM ordinary_db; +SHOW DICTIONARIES FROM db_01018; -CREATE DICTIONARY ordinary_db.dict4 +CREATE DICTIONARY db_01018.dict4 ( key_column UInt64 DEFAULT 0, second_column UInt8 DEFAULT 1, third_column String DEFAULT 'qqq' ) PRIMARY KEY key_column -SOURCE(CLICKHOUSE(HOST 'localhost' PORT 9000 USER 'default' TABLE 'table_for_dict' PASSWORD '' DB 'database_for_dict')) +SOURCE(CLICKHOUSE(HOST 'localhost' PORT 9000 USER 'default' TABLE 'table_for_dict' PASSWORD '' DB 'database_for_dict_01018')) LIFETIME(MIN 1 MAX 10) LAYOUT(FLAT()); -SHOW DICTIONARIES FROM ordinary_db; +SHOW DICTIONARIES FROM db_01018; -DROP DATABASE IF EXISTS ordinary_db; +DROP DATABASE IF EXISTS db_01018; -DROP TABLE IF EXISTS database_for_dict.table_for_dict; +DROP TABLE IF EXISTS database_for_dict_01018.table_for_dict; -DROP DATABASE IF EXISTS database_for_dict; +DROP DATABASE IF EXISTS database_for_dict_01018; DROP DATABASE IF EXISTS memory_db; diff --git a/tests/queries/0_stateless/01018_ddl_dictionaries_select.sql b/tests/queries/0_stateless/01018_ddl_dictionaries_select.sql index f4de269e774..4b548a913ea 100644 --- a/tests/queries/0_stateless/01018_ddl_dictionaries_select.sql +++ b/tests/queries/0_stateless/01018_ddl_dictionaries_select.sql @@ -2,7 +2,7 @@ SET send_logs_level = 'fatal'; DROP DATABASE IF EXISTS database_for_dict; -CREATE DATABASE database_for_dict Engine = Ordinary; +CREATE DATABASE database_for_dict; CREATE TABLE database_for_dict.table_for_dict ( diff --git a/tests/queries/0_stateless/01018_ddl_dictionaries_special.sql b/tests/queries/0_stateless/01018_ddl_dictionaries_special.sql index 6d9b499a247..ede5897bdf7 100644 --- a/tests/queries/0_stateless/01018_ddl_dictionaries_special.sql +++ b/tests/queries/0_stateless/01018_ddl_dictionaries_special.sql @@ -2,7 +2,7 @@ SET send_logs_level = 'fatal'; DROP DATABASE IF EXISTS database_for_dict; -CREATE DATABASE database_for_dict Engine = Ordinary; +CREATE DATABASE database_for_dict; SELECT '***date dict***'; diff --git a/tests/queries/0_stateless/01018_dictionaries_from_dictionaries.sql b/tests/queries/0_stateless/01018_dictionaries_from_dictionaries.sql index 4d2cd6351b5..86180643f88 100644 --- a/tests/queries/0_stateless/01018_dictionaries_from_dictionaries.sql +++ b/tests/queries/0_stateless/01018_dictionaries_from_dictionaries.sql @@ -2,7 +2,7 @@ SET send_logs_level = 'fatal'; DROP DATABASE IF EXISTS database_for_dict; -CREATE DATABASE database_for_dict Engine = Ordinary; +CREATE DATABASE database_for_dict; CREATE TABLE database_for_dict.table_for_dict ( diff --git a/tests/queries/0_stateless/01033_dictionaries_lifetime.sql 
b/tests/queries/0_stateless/01033_dictionaries_lifetime.sql index 57776e1fec1..0a8288c2df0 100644 --- a/tests/queries/0_stateless/01033_dictionaries_lifetime.sql +++ b/tests/queries/0_stateless/01033_dictionaries_lifetime.sql @@ -2,7 +2,7 @@ SET send_logs_level = 'fatal'; DROP DATABASE IF EXISTS database_for_dict; -CREATE DATABASE database_for_dict Engine = Ordinary; +CREATE DATABASE database_for_dict; DROP TABLE IF EXISTS database_for_dict.table_for_dict; @@ -19,7 +19,7 @@ INSERT INTO database_for_dict.table_for_dict VALUES (1, 100, 'Hello world'); DROP DATABASE IF EXISTS ordinary_db; -CREATE DATABASE ordinary_db ENGINE = Ordinary; +CREATE DATABASE ordinary_db; DROP DICTIONARY IF EXISTS ordinary_db.dict1; diff --git a/tests/queries/0_stateless/01037_polygon_dicts_correctness_all.sh b/tests/queries/0_stateless/01037_polygon_dicts_correctness_all.sh index 1b80fcef80b..e7df8282433 100755 --- a/tests/queries/0_stateless/01037_polygon_dicts_correctness_all.sh +++ b/tests/queries/0_stateless/01037_polygon_dicts_correctness_all.sh @@ -11,7 +11,7 @@ tar -xf "${CURDIR}"/01037_test_data_search.tar.gz -C "${CURDIR}" $CLICKHOUSE_CLIENT -n --query=" DROP DATABASE IF EXISTS test_01037; -CREATE DATABASE test_01037 Engine = Ordinary; +CREATE DATABASE test_01037; DROP TABLE IF EXISTS test_01037.points; CREATE TABLE test_01037.points (x Float64, y Float64) ENGINE = Memory; " diff --git a/tests/queries/0_stateless/01037_polygon_dicts_correctness_fast.sh b/tests/queries/0_stateless/01037_polygon_dicts_correctness_fast.sh index 4ca95b72937..22d08d425a6 100755 --- a/tests/queries/0_stateless/01037_polygon_dicts_correctness_fast.sh +++ b/tests/queries/0_stateless/01037_polygon_dicts_correctness_fast.sh @@ -11,7 +11,7 @@ tar -xf "${CURDIR}"/01037_test_data_perf.tar.gz -C "${CURDIR}" $CLICKHOUSE_CLIENT -n --query=" DROP DATABASE IF EXISTS test_01037; -CREATE DATABASE test_01037 Engine = Ordinary; +CREATE DATABASE test_01037; DROP TABLE IF EXISTS test_01037.points; CREATE TABLE test_01037.points (x Float64, y Float64) ENGINE = Memory; " diff --git a/tests/queries/0_stateless/01037_polygon_dicts_simple_functions.sh b/tests/queries/0_stateless/01037_polygon_dicts_simple_functions.sh index d32b75ca735..c3d820e1292 100755 --- a/tests/queries/0_stateless/01037_polygon_dicts_simple_functions.sh +++ b/tests/queries/0_stateless/01037_polygon_dicts_simple_functions.sh @@ -8,7 +8,7 @@ TMP_DIR="/tmp" $CLICKHOUSE_CLIENT -n --query=" DROP DATABASE IF EXISTS test_01037; -CREATE DATABASE test_01037 Engine = Ordinary; +CREATE DATABASE test_01037; DROP TABLE IF EXISTS test_01037.polygons_array; diff --git a/tests/queries/0_stateless/01038_dictionary_lifetime_min_zero_sec.sh b/tests/queries/0_stateless/01038_dictionary_lifetime_min_zero_sec.sh index c3643399ba1..48171b56dd3 100755 --- a/tests/queries/0_stateless/01038_dictionary_lifetime_min_zero_sec.sh +++ b/tests/queries/0_stateless/01038_dictionary_lifetime_min_zero_sec.sh @@ -3,13 +3,13 @@ CURDIR=$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd) . 
"$CURDIR"/../shell_config.sh -$CLICKHOUSE_CLIENT --query "DROP DATABASE IF EXISTS database_for_dict" +$CLICKHOUSE_CLIENT --query "DROP DATABASE IF EXISTS db_01038" -$CLICKHOUSE_CLIENT --query "CREATE DATABASE database_for_dict Engine = Ordinary" +$CLICKHOUSE_CLIENT --query "CREATE DATABASE db_01038" $CLICKHOUSE_CLIENT --query " -CREATE TABLE database_for_dict.table_for_dict +CREATE TABLE db_01038.table_for_dict ( key_column UInt64, value Float64 @@ -17,34 +17,34 @@ CREATE TABLE database_for_dict.table_for_dict ENGINE = MergeTree() ORDER BY key_column" -$CLICKHOUSE_CLIENT --query "INSERT INTO database_for_dict.table_for_dict VALUES (1, 1.1)" +$CLICKHOUSE_CLIENT --query "INSERT INTO db_01038.table_for_dict VALUES (1, 1.1)" $CLICKHOUSE_CLIENT --query " -CREATE DICTIONARY database_for_dict.dict_with_zero_min_lifetime +CREATE DICTIONARY db_01038.dict_with_zero_min_lifetime ( key_column UInt64, value Float64 DEFAULT 77.77 ) PRIMARY KEY key_column -SOURCE(CLICKHOUSE(HOST 'localhost' PORT 9000 USER 'default' TABLE 'table_for_dict' DB 'database_for_dict')) +SOURCE(CLICKHOUSE(HOST 'localhost' PORT 9000 USER 'default' TABLE 'table_for_dict' DB 'db_01038')) LIFETIME(1) LAYOUT(FLAT())" -$CLICKHOUSE_CLIENT --query "SELECT dictGetFloat64('database_for_dict.dict_with_zero_min_lifetime', 'value', toUInt64(1))" +$CLICKHOUSE_CLIENT --query "SELECT dictGetFloat64('db_01038.dict_with_zero_min_lifetime', 'value', toUInt64(1))" -$CLICKHOUSE_CLIENT --query "SELECT dictGetFloat64('database_for_dict.dict_with_zero_min_lifetime', 'value', toUInt64(2))" +$CLICKHOUSE_CLIENT --query "SELECT dictGetFloat64('db_01038.dict_with_zero_min_lifetime', 'value', toUInt64(2))" -$CLICKHOUSE_CLIENT --query "INSERT INTO database_for_dict.table_for_dict VALUES (2, 2.2)" +$CLICKHOUSE_CLIENT --query "INSERT INTO db_01038.table_for_dict VALUES (2, 2.2)" function check() { - query_result=$($CLICKHOUSE_CLIENT --query "SELECT dictGetFloat64('database_for_dict.dict_with_zero_min_lifetime', 'value', toUInt64(2))") + query_result=$($CLICKHOUSE_CLIENT --query "SELECT dictGetFloat64('db_01038.dict_with_zero_min_lifetime', 'value', toUInt64(2))") while [ "$query_result" != "2.2" ] do - query_result=$($CLICKHOUSE_CLIENT --query "SELECT dictGetFloat64('database_for_dict.dict_with_zero_min_lifetime', 'value', toUInt64(2))") + query_result=$($CLICKHOUSE_CLIENT --query "SELECT dictGetFloat64('db_01038.dict_with_zero_min_lifetime', 'value', toUInt64(2))") done } @@ -53,8 +53,8 @@ export -f check; timeout 10 bash -c check -$CLICKHOUSE_CLIENT --query "SELECT dictGetFloat64('database_for_dict.dict_with_zero_min_lifetime', 'value', toUInt64(1))" +$CLICKHOUSE_CLIENT --query "SELECT dictGetFloat64('db_01038.dict_with_zero_min_lifetime', 'value', toUInt64(1))" -$CLICKHOUSE_CLIENT --query "SELECT dictGetFloat64('database_for_dict.dict_with_zero_min_lifetime', 'value', toUInt64(2))" +$CLICKHOUSE_CLIENT --query "SELECT dictGetFloat64('db_01038.dict_with_zero_min_lifetime', 'value', toUInt64(2))" -$CLICKHOUSE_CLIENT --query "DROP DATABASE IF EXISTS database_for_dict" +$CLICKHOUSE_CLIENT --query "DROP DATABASE IF EXISTS db_01038" diff --git a/tests/queries/0_stateless/01040_dictionary_invalidate_query_switchover_long.sh b/tests/queries/0_stateless/01040_dictionary_invalidate_query_switchover_long.sh index 44a192cf178..6b509ac7925 100755 --- a/tests/queries/0_stateless/01040_dictionary_invalidate_query_switchover_long.sh +++ b/tests/queries/0_stateless/01040_dictionary_invalidate_query_switchover_long.sh @@ -6,7 +6,7 @@ CURDIR=$(cd "$(dirname 
"${BASH_SOURCE[0]}")" && pwd) $CLICKHOUSE_CLIENT --query "DROP DATABASE IF EXISTS dictdb" -$CLICKHOUSE_CLIENT --query "CREATE DATABASE dictdb Engine = Ordinary" +$CLICKHOUSE_CLIENT --query "CREATE DATABASE dictdb" $CLICKHOUSE_CLIENT --query " CREATE TABLE dictdb.dict_invalidate diff --git a/tests/queries/0_stateless/01041_create_dictionary_if_not_exists.sql b/tests/queries/0_stateless/01041_create_dictionary_if_not_exists.sql index 8c30abeb28f..5ec76e6ae91 100644 --- a/tests/queries/0_stateless/01041_create_dictionary_if_not_exists.sql +++ b/tests/queries/0_stateless/01041_create_dictionary_if_not_exists.sql @@ -2,7 +2,7 @@ DROP TABLE IF EXISTS dictdb.table_for_dict; DROP DICTIONARY IF EXISTS dictdb.dict_exists; DROP DATABASE IF EXISTS dictdb; -CREATE DATABASE dictdb ENGINE = Ordinary; +CREATE DATABASE dictdb; CREATE TABLE dictdb.table_for_dict ( diff --git a/tests/queries/0_stateless/01042_system_reload_dictionary_reloads_completely.sh b/tests/queries/0_stateless/01042_system_reload_dictionary_reloads_completely.sh index f03f7511a4f..46031a3d508 100755 --- a/tests/queries/0_stateless/01042_system_reload_dictionary_reloads_completely.sh +++ b/tests/queries/0_stateless/01042_system_reload_dictionary_reloads_completely.sh @@ -8,7 +8,7 @@ set -e -o pipefail # Run the client. $CLICKHOUSE_CLIENT --multiquery <<'EOF' DROP DATABASE IF EXISTS dictdb; -CREATE DATABASE dictdb Engine = Ordinary; +CREATE DATABASE dictdb; CREATE TABLE dictdb.table(x Int64, y Int64, insert_time DateTime) ENGINE = MergeTree ORDER BY tuple(); INSERT INTO dictdb.table VALUES (12, 102, now()); diff --git a/tests/queries/0_stateless/01043_dictionary_attribute_properties_values.sql b/tests/queries/0_stateless/01043_dictionary_attribute_properties_values.sql index afd1c1c5780..adeb5630529 100644 --- a/tests/queries/0_stateless/01043_dictionary_attribute_properties_values.sql +++ b/tests/queries/0_stateless/01043_dictionary_attribute_properties_values.sql @@ -1,5 +1,5 @@ DROP DATABASE IF EXISTS dictdb; -CREATE DATABASE dictdb Engine = Ordinary; +CREATE DATABASE dictdb; CREATE TABLE dictdb.dicttbl(key Int64, value_default String, value_expression String) ENGINE = MergeTree ORDER BY tuple(); INSERT INTO dictdb.dicttbl VALUES (12, 'hello', '55:66:77'); diff --git a/tests/queries/0_stateless/01048_exists_query.sql b/tests/queries/0_stateless/01048_exists_query.sql index 700b4f5983d..31b6d2af6c0 100644 --- a/tests/queries/0_stateless/01048_exists_query.sql +++ b/tests/queries/0_stateless/01048_exists_query.sql @@ -3,7 +3,7 @@ EXISTS TABLE db_01048.t_01048; EXISTS DICTIONARY db_01048.t_01048; DROP DATABASE IF EXISTS db_01048; -CREATE DATABASE db_01048 Engine = Ordinary; +CREATE DATABASE db_01048; DROP TABLE IF EXISTS db_01048.t_01048; EXISTS db_01048.t_01048; diff --git a/tests/queries/0_stateless/01053_drop_database_mat_view.sql b/tests/queries/0_stateless/01053_drop_database_mat_view.sql index 60803bced7e..9f7438d594e 100644 --- a/tests/queries/0_stateless/01053_drop_database_mat_view.sql +++ b/tests/queries/0_stateless/01053_drop_database_mat_view.sql @@ -1,5 +1,5 @@ DROP DATABASE IF EXISTS some_tests; -CREATE DATABASE some_tests ENGINE=Ordinary; +CREATE DATABASE some_tests ENGINE=Ordinary; -- Different inner table name with Atomic create table some_tests.my_table ENGINE = MergeTree(day, (day), 8192) as select today() as day, 'mystring' as str; show tables from some_tests; diff --git a/tests/queries/0_stateless/01053_ssd_dictionary.sql b/tests/queries/0_stateless/01053_ssd_dictionary.sql index 416d26bd637..fb4acdeadb4 100644 --- 
a/tests/queries/0_stateless/01053_ssd_dictionary.sql +++ b/tests/queries/0_stateless/01053_ssd_dictionary.sql @@ -23,6 +23,8 @@ INSERT INTO database_for_dict.table_for_dict SELECT number, 0, -1, 'c' FROM syst DROP DICTIONARY IF EXISTS database_for_dict.ssd_dict; +-- FIXME filesystem error: in create_directory: Permission denied [/var/lib/clickhouse] +-- Probably we need rewrite it to integration test CREATE DICTIONARY database_for_dict.ssd_dict ( id UInt64, diff --git a/tests/queries/0_stateless/01115_join_with_dictionary.sql b/tests/queries/0_stateless/01115_join_with_dictionary.sql index f1477df7df2..807b53c39c0 100644 --- a/tests/queries/0_stateless/01115_join_with_dictionary.sql +++ b/tests/queries/0_stateless/01115_join_with_dictionary.sql @@ -1,7 +1,7 @@ SET send_logs_level = 'fatal'; DROP DATABASE IF EXISTS db_01115; -CREATE DATABASE db_01115 Engine = Ordinary; +CREATE DATABASE db_01115; USE db_01115; diff --git a/tests/queries/0_stateless/01142_merge_join_lc_and_nullable_in_key.reference b/tests/queries/0_stateless/01142_merge_join_lc_and_nullable_in_key.reference new file mode 100644 index 00000000000..d1b29b46df6 --- /dev/null +++ b/tests/queries/0_stateless/01142_merge_join_lc_and_nullable_in_key.reference @@ -0,0 +1,29 @@ +1 l \N Nullable(String) +2 \N Nullable(String) +1 l \N Nullable(String) +2 \N Nullable(String) +- +1 l \N Nullable(String) +0 \N Nullable(String) +0 \N Nullable(String) +1 l \N Nullable(String) +- +1 l \N Nullable(String) +0 \N Nullable(String) +0 \N Nullable(String) +1 l \N Nullable(String) +- +1 l \N Nullable(String) +2 \N Nullable(String) +1 l \N Nullable(String) +2 \N Nullable(String) +- +1 l \N Nullable(String) +\N \N Nullable(String) +1 l \N Nullable(String) +\N \N Nullable(String) +- +1 l \N Nullable(String) +\N \N Nullable(String) +1 l \N Nullable(String) +\N \N Nullable(String) diff --git a/tests/queries/0_stateless/01142_merge_join_lc_and_nullable_in_key.sql b/tests/queries/0_stateless/01142_merge_join_lc_and_nullable_in_key.sql new file mode 100644 index 00000000000..8a1601e3faa --- /dev/null +++ b/tests/queries/0_stateless/01142_merge_join_lc_and_nullable_in_key.sql @@ -0,0 +1,48 @@ +SET join_algorithm = 'partial_merge'; + +DROP TABLE IF EXISTS t; +DROP TABLE IF EXISTS nr; + +CREATE TABLE t (`x` UInt32, `lc` LowCardinality(String)) ENGINE = Memory; +CREATE TABLE nr (`x` Nullable(UInt32), `lc` Nullable(String)) ENGINE = Memory; + +INSERT INTO t VALUES (1, 'l'); +INSERT INTO nr VALUES (2, NULL); + +SET join_use_nulls = 0; + +SELECT x, lc, r.lc, toTypeName(r.lc) FROM t AS l LEFT JOIN nr AS r USING (x) ORDER BY x; +SELECT x, lc, r.lc, toTypeName(r.lc) FROM t AS l RIGHT JOIN nr AS r USING (x) ORDER BY x; +SELECT x, lc, r.lc, toTypeName(r.lc) FROM t AS l FULL JOIN nr AS r USING (x) ORDER BY x; + +SELECT '-'; + +SELECT x, lc, r.lc, toTypeName(r.lc) FROM t AS l LEFT JOIN nr AS r USING (lc) ORDER BY x; +SELECT x, lc, r.lc, toTypeName(r.lc) FROM t AS l RIGHT JOIN nr AS r USING (lc) ORDER BY x; +SELECT x, lc, r.lc, toTypeName(r.lc) FROM t AS l FULL JOIN nr AS r USING (lc) ORDER BY x; + +SELECT '-'; + +SELECT x, lc, materialize(r.lc) y, toTypeName(y) FROM t AS l LEFT JOIN nr AS r USING (lc) ORDER BY x; +SELECT x, lc, materialize(r.lc) y, toTypeName(y) FROM t AS l RIGHT JOIN nr AS r USING (lc) ORDER BY x; +SELECT x, lc, materialize(r.lc) y, toTypeName(y) FROM t AS l FULL JOIN nr AS r USING (lc) ORDER BY x; + +SELECT '-'; + +SET join_use_nulls = 1; + +SELECT x, lc, r.lc, toTypeName(r.lc) FROM t AS l LEFT JOIN nr AS r USING (x) ORDER BY x; +SELECT x, lc, r.lc, 
toTypeName(r.lc) FROM t AS l RIGHT JOIN nr AS r USING (x) ORDER BY x; +SELECT x, lc, r.lc, toTypeName(r.lc) FROM t AS l FULL JOIN nr AS r USING (x) ORDER BY x; + +SELECT '-'; + +SELECT x, lc, r.lc, toTypeName(r.lc) FROM t AS l LEFT JOIN nr AS r USING (lc) ORDER BY x; +SELECT x, lc, r.lc, toTypeName(r.lc) FROM t AS l RIGHT JOIN nr AS r USING (lc) ORDER BY x; +SELECT x, lc, r.lc, toTypeName(r.lc) FROM t AS l FULL JOIN nr AS r USING (lc) ORDER BY x; + +SELECT '-'; + +SELECT x, lc, materialize(r.lc) y, toTypeName(y) FROM t AS l LEFT JOIN nr AS r USING (lc) ORDER BY x; +SELECT x, lc, materialize(r.lc) y, toTypeName(y) FROM t AS l RIGHT JOIN nr AS r USING (lc) ORDER BY x; +SELECT x, lc, materialize(r.lc) y, toTypeName(y) FROM t AS l FULL JOIN nr AS r USING (lc) ORDER BY x; diff --git a/tests/queries/0_stateless/01190_full_attach_syntax.reference b/tests/queries/0_stateless/01190_full_attach_syntax.reference index 619861849c8..4e6eabcd6f0 100644 --- a/tests/queries/0_stateless/01190_full_attach_syntax.reference +++ b/tests/queries/0_stateless/01190_full_attach_syntax.reference @@ -1,13 +1,13 @@ CREATE DICTIONARY test_01190.dict\n(\n `key` UInt64 DEFAULT 0,\n `col` UInt8 DEFAULT 1\n)\nPRIMARY KEY key\nSOURCE(CLICKHOUSE(HOST \'localhost\' PORT 9000 USER \'default\' TABLE \'table_for_dict\' PASSWORD \'\' DB \'test_01190\'))\nLIFETIME(MIN 1 MAX 10)\nLAYOUT(FLAT()) CREATE DICTIONARY test_01190.dict\n(\n `key` UInt64 DEFAULT 0,\n `col` UInt8 DEFAULT 1\n)\nPRIMARY KEY key\nSOURCE(CLICKHOUSE(HOST \'localhost\' PORT 9000 USER \'default\' TABLE \'table_for_dict\' PASSWORD \'\' DB \'test_01190\'))\nLIFETIME(MIN 1 MAX 10)\nLAYOUT(FLAT()) -CREATE TABLE default.log\n(\n `s` String\n)\nENGINE = Log -CREATE TABLE default.log\n(\n `s` String\n)\nENGINE = Log() +CREATE TABLE test_01190.log\n(\n `s` String\n)\nENGINE = Log +CREATE TABLE test_01190.log\n(\n `s` String\n)\nENGINE = Log() test -CREATE TABLE default.mt\n(\n `key` Array(UInt8),\n `s` String,\n `n` UInt64,\n `d` Date MATERIALIZED \'2000-01-01\'\n)\nENGINE = MergeTree(d, (key, s, n), 1) +CREATE TABLE test_01190.mt\n(\n `key` Array(UInt8),\n `s` String,\n `n` UInt64,\n `d` Date MATERIALIZED \'2000-01-01\'\n)\nENGINE = MergeTree(d, (key, s, n), 1) [1,2] Hello 2 -CREATE TABLE default.mt\n(\n `key` Array(UInt8),\n `s` String,\n `n` UInt64,\n `d` Date\n)\nENGINE = MergeTree(d, (key, s, n), 1) -CREATE MATERIALIZED VIEW default.mv\n(\n `s` String\n)\nENGINE = Null AS\nSELECT *\nFROM default.log -CREATE MATERIALIZED VIEW default.mv\n(\n `s` String\n)\nENGINE = Null AS\nSELECT *\nFROM default.log -CREATE MATERIALIZED VIEW default.mv\n(\n `key` Array(UInt8),\n `s` String,\n `n` UInt64,\n `d` Date\n)\nENGINE = Null AS\nSELECT *\nFROM default.mt -CREATE LIVE VIEW default.lv\n(\n `1` UInt8\n) AS\nSELECT 1 -CREATE LIVE VIEW default.lv\n(\n `1` UInt8\n) AS\nSELECT 1 +CREATE TABLE test_01190.mt\n(\n `key` Array(UInt8),\n `s` String,\n `n` UInt64,\n `d` Date\n)\nENGINE = MergeTree(d, (key, s, n), 1) +CREATE MATERIALIZED VIEW test_01190.mv\n(\n `s` String\n)\nENGINE = Null AS\nSELECT *\nFROM test_01190.log +CREATE MATERIALIZED VIEW test_01190.mv\n(\n `s` String\n)\nENGINE = Null AS\nSELECT *\nFROM test_01190.log +CREATE MATERIALIZED VIEW test_01190.mv\n(\n `key` Array(UInt8),\n `s` String,\n `n` UInt64,\n `d` Date\n)\nENGINE = Null AS\nSELECT *\nFROM test_01190.mt +CREATE LIVE VIEW test_01190.lv\n(\n `1` UInt8\n) AS\nSELECT 1 +CREATE LIVE VIEW test_01190.lv\n(\n `1` UInt8\n) AS\nSELECT 1 diff --git a/tests/queries/0_stateless/01190_full_attach_syntax.sql 
b/tests/queries/0_stateless/01190_full_attach_syntax.sql index 3a91eccc8cd..78f0f53d101 100644 --- a/tests/queries/0_stateless/01190_full_attach_syntax.sql +++ b/tests/queries/0_stateless/01190_full_attach_syntax.sql @@ -1,5 +1,6 @@ DROP DATABASE IF EXISTS test_01190; -CREATE DATABASE test_01190; +CREATE DATABASE test_01190 ENGINE=Ordinary; -- Full ATTACH requires UUID with Atomic +USE test_01190; CREATE TABLE test_01190.table_for_dict (key UInt64, col UInt8) ENGINE = Memory; @@ -14,14 +15,6 @@ ATTACH DICTIONARY test_01190.dict (key UInt64 DEFAULT 0, col UInt8 DEFAULT 42) P ATTACH DICTIONARY test_01190.dict; SHOW CREATE DICTIONARY test_01190.dict; -DROP DATABASE test_01190; - - -DROP TABLE IF EXISTS log; -DROP TABLE IF EXISTS mt; -DROP TABLE IF EXISTS mv; -DROP TABLE IF EXISTS lv; - CREATE TABLE log ENGINE = Log AS SELECT 'test' AS s; SHOW CREATE log; DETACH TABLE log; @@ -58,9 +51,6 @@ DETACH VIEW lv; ATTACH LIVE VIEW lv AS SELECT 1; SHOW CREATE lv; -DROP TABLE log; -DROP TABLE mt; -DROP TABLE mv; -DROP TABLE lv; +DROP DATABASE test_01190; diff --git a/tests/queries/0_stateless/01224_no_superfluous_dict_reload.sql b/tests/queries/0_stateless/01224_no_superfluous_dict_reload.sql index cf8b2a471c4..da4928a26fb 100644 --- a/tests/queries/0_stateless/01224_no_superfluous_dict_reload.sql +++ b/tests/queries/0_stateless/01224_no_superfluous_dict_reload.sql @@ -1,6 +1,6 @@ DROP DATABASE IF EXISTS dict_db_01224; DROP DATABASE IF EXISTS dict_db_01224_dictionary; -CREATE DATABASE dict_db_01224; +CREATE DATABASE dict_db_01224 ENGINE=Ordinary; -- Different internal dictionary name with Atomic CREATE DATABASE dict_db_01224_dictionary Engine=Dictionary; CREATE TABLE dict_db_01224.dict_data (key UInt64, val UInt64) Engine=Memory(); diff --git a/tests/queries/0_stateless/01225_show_create_table_from_dictionary.sql b/tests/queries/0_stateless/01225_show_create_table_from_dictionary.sql index a494511ebd8..24d10537dbb 100644 --- a/tests/queries/0_stateless/01225_show_create_table_from_dictionary.sql +++ b/tests/queries/0_stateless/01225_show_create_table_from_dictionary.sql @@ -1,6 +1,6 @@ DROP DATABASE IF EXISTS dict_db_01225; DROP DATABASE IF EXISTS dict_db_01225_dictionary; -CREATE DATABASE dict_db_01225; +CREATE DATABASE dict_db_01225 ENGINE=Ordinary; -- Different internal dictionary name with Atomic CREATE DATABASE dict_db_01225_dictionary Engine=Dictionary; CREATE TABLE dict_db_01225.dict_data (key UInt64, val UInt64) Engine=Memory(); diff --git a/tests/queries/0_stateless/01249_bad_arguments_for_bloom_filter.reference b/tests/queries/0_stateless/01249_bad_arguments_for_bloom_filter.reference index e3f4955d4cf..fb993e8d572 100644 --- a/tests/queries/0_stateless/01249_bad_arguments_for_bloom_filter.reference +++ b/tests/queries/0_stateless/01249_bad_arguments_for_bloom_filter.reference @@ -1,3 +1,3 @@ -CREATE TABLE default.bloom_filter_idx_good\n(\n `u64` UInt64,\n `i32` Int32,\n `f64` Float64,\n `d` Decimal(10, 2),\n `s` String,\n `e` Enum8(\'a\' = 1, \'b\' = 2, \'c\' = 3),\n `dt` Date,\n INDEX bloom_filter_a i32 TYPE bloom_filter(0., 1.) 
GRANULARITY 1\n)\nENGINE = MergeTree()\nORDER BY u64\nSETTINGS index_granularity = 8192 -CREATE TABLE default.bloom_filter_idx_good\n(\n `u64` UInt64,\n `i32` Int32,\n `f64` Float64,\n `d` Decimal(10, 2),\n `s` String,\n `e` Enum8(\'a\' = 1, \'b\' = 2, \'c\' = 3),\n `dt` Date,\n INDEX bloom_filter_a i32 TYPE bloom_filter(-0.1) GRANULARITY 1\n)\nENGINE = MergeTree()\nORDER BY u64\nSETTINGS index_granularity = 8192 -CREATE TABLE default.bloom_filter_idx_good\n(\n `u64` UInt64,\n `i32` Int32,\n `f64` Float64,\n `d` Decimal(10, 2),\n `s` String,\n `e` Enum8(\'a\' = 1, \'b\' = 2, \'c\' = 3),\n `dt` Date,\n INDEX bloom_filter_a i32 TYPE bloom_filter(1.01) GRANULARITY 1\n)\nENGINE = MergeTree()\nORDER BY u64\nSETTINGS index_granularity = 8192 +CREATE TABLE test_01249.bloom_filter_idx_good\n(\n `u64` UInt64,\n `i32` Int32,\n `f64` Float64,\n `d` Decimal(10, 2),\n `s` String,\n `e` Enum8(\'a\' = 1, \'b\' = 2, \'c\' = 3),\n `dt` Date,\n INDEX bloom_filter_a i32 TYPE bloom_filter(0., 1.) GRANULARITY 1\n)\nENGINE = MergeTree()\nORDER BY u64\nSETTINGS index_granularity = 8192 +CREATE TABLE test_01249.bloom_filter_idx_good\n(\n `u64` UInt64,\n `i32` Int32,\n `f64` Float64,\n `d` Decimal(10, 2),\n `s` String,\n `e` Enum8(\'a\' = 1, \'b\' = 2, \'c\' = 3),\n `dt` Date,\n INDEX bloom_filter_a i32 TYPE bloom_filter(-0.1) GRANULARITY 1\n)\nENGINE = MergeTree()\nORDER BY u64\nSETTINGS index_granularity = 8192 +CREATE TABLE test_01249.bloom_filter_idx_good\n(\n `u64` UInt64,\n `i32` Int32,\n `f64` Float64,\n `d` Decimal(10, 2),\n `s` String,\n `e` Enum8(\'a\' = 1, \'b\' = 2, \'c\' = 3),\n `dt` Date,\n INDEX bloom_filter_a i32 TYPE bloom_filter(1.01) GRANULARITY 1\n)\nENGINE = MergeTree()\nORDER BY u64\nSETTINGS index_granularity = 8192 diff --git a/tests/queries/0_stateless/01249_bad_arguments_for_bloom_filter.sql b/tests/queries/0_stateless/01249_bad_arguments_for_bloom_filter.sql index b60fbc05457..8902b164c09 100644 --- a/tests/queries/0_stateless/01249_bad_arguments_for_bloom_filter.sql +++ b/tests/queries/0_stateless/01249_bad_arguments_for_bloom_filter.sql @@ -1,3 +1,7 @@ +DROP DATABASE IF EXISTS test_01249; +CREATE DATABASE test_01249 ENGINE=Ordinary; -- Full ATTACH requires UUID with Atomic +USE test_01249; + CREATE TABLE bloom_filter_idx_good(`u64` UInt64, `i32` Int32, `f64` Float64, `d` Decimal(10, 2), `s` String, `e` Enum8('a' = 1, 'b' = 2, 'c' = 3), `dt` Date, INDEX bloom_filter_a i32 TYPE bloom_filter(0, 1) GRANULARITY 1) ENGINE = MergeTree() ORDER BY u64 SETTINGS index_granularity = 8192; -- { serverError 42 } CREATE TABLE bloom_filter_idx_good(`u64` UInt64, `i32` Int32, `f64` Float64, `d` Decimal(10, 2), `s` String, `e` Enum8('a' = 1, 'b' = 2, 'c' = 3), `dt` Date, INDEX bloom_filter_a i32 TYPE bloom_filter(-0.1) GRANULARITY 1) ENGINE = MergeTree() ORDER BY u64 SETTINGS index_granularity = 8192; -- { serverError 36 } CREATE TABLE bloom_filter_idx_good(`u64` UInt64, `i32` Int32, `f64` Float64, `d` Decimal(10, 2), `s` String, `e` Enum8('a' = 1, 'b' = 2, 'c' = 3), `dt` Date, INDEX bloom_filter_a i32 TYPE bloom_filter(1.01) GRANULARITY 1) ENGINE = MergeTree() ORDER BY u64 SETTINGS index_granularity = 8192; -- { serverError 36 } @@ -14,4 +18,4 @@ DROP TABLE IF EXISTS bloom_filter_idx_good; ATTACH TABLE bloom_filter_idx_good(`u64` UInt64, `i32` Int32, `f64` Float64, `d` Decimal(10, 2), `s` String, `e` Enum8('a' = 1, 'b' = 2, 'c' = 3), `dt` Date, INDEX bloom_filter_a i32 TYPE bloom_filter(1.01) GRANULARITY 1) ENGINE = MergeTree() ORDER BY u64 SETTINGS index_granularity = 8192; SHOW CREATE TABLE 
bloom_filter_idx_good; -DROP TABLE IF EXISTS bloom_filter_idx_good; +DROP DATABASE test_01249; diff --git a/tests/queries/0_stateless/01251_dict_is_in_infinite_loop.sql b/tests/queries/0_stateless/01251_dict_is_in_infinite_loop.sql index decf65dc8cf..8e7e76697b5 100644 --- a/tests/queries/0_stateless/01251_dict_is_in_infinite_loop.sql +++ b/tests/queries/0_stateless/01251_dict_is_in_infinite_loop.sql @@ -1,5 +1,5 @@ DROP DATABASE IF EXISTS database_for_dict; -CREATE DATABASE database_for_dict Engine = Ordinary; +CREATE DATABASE database_for_dict; DROP TABLE IF EXISTS database_for_dict.dict_source; CREATE TABLE database_for_dict.dict_source (id UInt64, parent_id UInt64, value String) ENGINE = Memory; diff --git a/tests/queries/0_stateless/01259_dictionary_custom_settings_ddl.sql b/tests/queries/0_stateless/01259_dictionary_custom_settings_ddl.sql index cbac234305d..9c2174c8469 100644 --- a/tests/queries/0_stateless/01259_dictionary_custom_settings_ddl.sql +++ b/tests/queries/0_stateless/01259_dictionary_custom_settings_ddl.sql @@ -1,6 +1,6 @@ DROP DATABASE IF EXISTS database_for_dict; -CREATE DATABASE database_for_dict Engine = Ordinary; +CREATE DATABASE database_for_dict; DROP TABLE IF EXISTS database_for_dict.table_for_dict; @@ -17,7 +17,7 @@ INSERT INTO database_for_dict.table_for_dict VALUES (100500, 10000000, 'Hello wo DROP DATABASE IF EXISTS ordinary_db; -CREATE DATABASE ordinary_db ENGINE = Ordinary; +CREATE DATABASE ordinary_db; DROP DICTIONARY IF EXISTS ordinary_db.dict1; diff --git a/tests/queries/0_stateless/01268_dictionary_direct_layout.sql b/tests/queries/0_stateless/01268_dictionary_direct_layout.sql index 9b2f2344242..48642c91102 100644 --- a/tests/queries/0_stateless/01268_dictionary_direct_layout.sql +++ b/tests/queries/0_stateless/01268_dictionary_direct_layout.sql @@ -1,12 +1,12 @@ -DROP DATABASE IF EXISTS database_for_dict; +DROP DATABASE IF EXISTS database_for_dict_01268; -CREATE DATABASE database_for_dict Engine = Ordinary; +CREATE DATABASE database_for_dict_01268; -DROP TABLE IF EXISTS database_for_dict.table_for_dict1; -DROP TABLE IF EXISTS database_for_dict.table_for_dict2; -DROP TABLE IF EXISTS database_for_dict.table_for_dict3; +DROP TABLE IF EXISTS database_for_dict_01268.table_for_dict1; +DROP TABLE IF EXISTS database_for_dict_01268.table_for_dict2; +DROP TABLE IF EXISTS database_for_dict_01268.table_for_dict3; -CREATE TABLE database_for_dict.table_for_dict1 +CREATE TABLE database_for_dict_01268.table_for_dict1 ( key_column UInt64, second_column UInt64, @@ -15,9 +15,9 @@ CREATE TABLE database_for_dict.table_for_dict1 ENGINE = MergeTree() ORDER BY key_column; -INSERT INTO database_for_dict.table_for_dict1 VALUES (100500, 10000000, 'Hello world'); +INSERT INTO database_for_dict_01268.table_for_dict1 VALUES (100500, 10000000, 'Hello world'); -CREATE TABLE database_for_dict.table_for_dict2 +CREATE TABLE database_for_dict_01268.table_for_dict2 ( region_id UInt64, parent_region UInt64, @@ -26,13 +26,13 @@ CREATE TABLE database_for_dict.table_for_dict2 ENGINE = MergeTree() ORDER BY region_id; -INSERT INTO database_for_dict.table_for_dict2 VALUES (1, 0, 'Russia'); -INSERT INTO database_for_dict.table_for_dict2 VALUES (2, 1, 'Moscow'); -INSERT INTO database_for_dict.table_for_dict2 VALUES (3, 2, 'Center'); -INSERT INTO database_for_dict.table_for_dict2 VALUES (4, 0, 'Great Britain'); -INSERT INTO database_for_dict.table_for_dict2 VALUES (5, 4, 'London'); +INSERT INTO database_for_dict_01268.table_for_dict2 VALUES (1, 0, 'Russia'); +INSERT INTO 
database_for_dict_01268.table_for_dict2 VALUES (2, 1, 'Moscow'); +INSERT INTO database_for_dict_01268.table_for_dict2 VALUES (3, 2, 'Center'); +INSERT INTO database_for_dict_01268.table_for_dict2 VALUES (4, 0, 'Great Britain'); +INSERT INTO database_for_dict_01268.table_for_dict2 VALUES (5, 4, 'London'); -CREATE TABLE database_for_dict.table_for_dict3 +CREATE TABLE database_for_dict_01268.table_for_dict3 ( region_id UInt64, parent_region Float32, @@ -41,91 +41,91 @@ CREATE TABLE database_for_dict.table_for_dict3 ENGINE = MergeTree() ORDER BY region_id; -INSERT INTO database_for_dict.table_for_dict3 VALUES (1, 0.5, 'Russia'); -INSERT INTO database_for_dict.table_for_dict3 VALUES (2, 1.6, 'Moscow'); -INSERT INTO database_for_dict.table_for_dict3 VALUES (3, 2.3, 'Center'); -INSERT INTO database_for_dict.table_for_dict3 VALUES (4, 0.2, 'Great Britain'); -INSERT INTO database_for_dict.table_for_dict3 VALUES (5, 4.9, 'London'); +INSERT INTO database_for_dict_01268.table_for_dict3 VALUES (1, 0.5, 'Russia'); +INSERT INTO database_for_dict_01268.table_for_dict3 VALUES (2, 1.6, 'Moscow'); +INSERT INTO database_for_dict_01268.table_for_dict3 VALUES (3, 2.3, 'Center'); +INSERT INTO database_for_dict_01268.table_for_dict3 VALUES (4, 0.2, 'Great Britain'); +INSERT INTO database_for_dict_01268.table_for_dict3 VALUES (5, 4.9, 'London'); -DROP DATABASE IF EXISTS ordinary_db; +DROP DATABASE IF EXISTS db_01268; -CREATE DATABASE ordinary_db ENGINE = Ordinary; +CREATE DATABASE db_01268; -DROP DICTIONARY IF EXISTS ordinary_db.dict1; -DROP DICTIONARY IF EXISTS ordinary_db.dict2; -DROP DICTIONARY IF EXISTS ordinary_db.dict3; +DROP DICTIONARY IF EXISTS db_01268.dict1; +DROP DICTIONARY IF EXISTS db_01268.dict2; +DROP DICTIONARY IF EXISTS db_01268.dict3; -CREATE DICTIONARY ordinary_db.dict1 +CREATE DICTIONARY db_01268.dict1 ( key_column UInt64 DEFAULT 0, second_column UInt64 DEFAULT 1, third_column String DEFAULT 'qqq' ) PRIMARY KEY key_column -SOURCE(CLICKHOUSE(HOST 'localhost' PORT 9000 USER 'default' TABLE 'table_for_dict1' PASSWORD '' DB 'database_for_dict')) +SOURCE(CLICKHOUSE(HOST 'localhost' PORT 9000 USER 'default' TABLE 'table_for_dict1' PASSWORD '' DB 'database_for_dict_01268')) LAYOUT(DIRECT()) SETTINGS(max_result_bytes=1); -CREATE DICTIONARY ordinary_db.dict2 +CREATE DICTIONARY db_01268.dict2 ( region_id UInt64 DEFAULT 0, parent_region UInt64 DEFAULT 0 HIERARCHICAL, region_name String DEFAULT '' ) PRIMARY KEY region_id -SOURCE(CLICKHOUSE(HOST 'localhost' PORT 9000 USER 'default' TABLE 'table_for_dict2' PASSWORD '' DB 'database_for_dict')) +SOURCE(CLICKHOUSE(HOST 'localhost' PORT 9000 USER 'default' TABLE 'table_for_dict2' PASSWORD '' DB 'database_for_dict_01268')) LAYOUT(DIRECT()); -CREATE DICTIONARY ordinary_db.dict3 +CREATE DICTIONARY db_01268.dict3 ( region_id UInt64 DEFAULT 0, parent_region Float32 DEFAULT 0, region_name String DEFAULT '' ) PRIMARY KEY region_id -SOURCE(CLICKHOUSE(HOST 'localhost' PORT 9000 USER 'default' TABLE 'table_for_dict3' PASSWORD '' DB 'database_for_dict')) +SOURCE(CLICKHOUSE(HOST 'localhost' PORT 9000 USER 'default' TABLE 'table_for_dict3' PASSWORD '' DB 'database_for_dict_01268')) LAYOUT(DIRECT()); SELECT 'INITIALIZING DICTIONARY'; -SELECT dictGetHierarchy('ordinary_db.dict2', toUInt64(3)); -SELECT dictHas('ordinary_db.dict2', toUInt64(3)); -SELECT dictHas('ordinary_db.dict2', toUInt64(45)); -SELECT dictIsIn('ordinary_db.dict2', toUInt64(3), toUInt64(1)); -SELECT dictIsIn('ordinary_db.dict2', toUInt64(1), toUInt64(3)); -SELECT dictGetUInt64('ordinary_db.dict2', 
'parent_region', toUInt64(3)); -SELECT dictGetUInt64('ordinary_db.dict2', 'parent_region', toUInt64(99)); -SELECT dictGetFloat32('ordinary_db.dict3', 'parent_region', toUInt64(3)); -SELECT dictGetFloat32('ordinary_db.dict3', 'parent_region', toUInt64(2)); -SELECT dictGetFloat32('ordinary_db.dict3', 'parent_region', toUInt64(1)); -SELECT dictGetString('ordinary_db.dict2', 'region_name', toUInt64(5)); -SELECT dictGetString('ordinary_db.dict2', 'region_name', toUInt64(4)); -SELECT dictGetStringOrDefault('ordinary_db.dict2', 'region_name', toUInt64(100), 'NONE'); +SELECT dictGetHierarchy('db_01268.dict2', toUInt64(3)); +SELECT dictHas('db_01268.dict2', toUInt64(3)); +SELECT dictHas('db_01268.dict2', toUInt64(45)); +SELECT dictIsIn('db_01268.dict2', toUInt64(3), toUInt64(1)); +SELECT dictIsIn('db_01268.dict2', toUInt64(1), toUInt64(3)); +SELECT dictGetUInt64('db_01268.dict2', 'parent_region', toUInt64(3)); +SELECT dictGetUInt64('db_01268.dict2', 'parent_region', toUInt64(99)); +SELECT dictGetFloat32('db_01268.dict3', 'parent_region', toUInt64(3)); +SELECT dictGetFloat32('db_01268.dict3', 'parent_region', toUInt64(2)); +SELECT dictGetFloat32('db_01268.dict3', 'parent_region', toUInt64(1)); +SELECT dictGetString('db_01268.dict2', 'region_name', toUInt64(5)); +SELECT dictGetString('db_01268.dict2', 'region_name', toUInt64(4)); +SELECT dictGetStringOrDefault('db_01268.dict2', 'region_name', toUInt64(100), 'NONE'); -SELECT number + 1, dictGetStringOrDefault('ordinary_db.dict2', 'region_name', toUInt64(number + 1), 'NONE') chars FROM numbers(10); -SELECT number + 1, dictGetFloat32OrDefault('ordinary_db.dict3', 'parent_region', toUInt64(number + 1), toFloat32(0)) chars FROM numbers(10); -SELECT dictGetStringOrDefault('ordinary_db.dict2', 'region_name', toUInt64(1), 'NONE'); -SELECT dictGetStringOrDefault('ordinary_db.dict2', 'region_name', toUInt64(2), 'NONE'); -SELECT dictGetStringOrDefault('ordinary_db.dict2', 'region_name', toUInt64(3), 'NONE'); -SELECT dictGetStringOrDefault('ordinary_db.dict2', 'region_name', toUInt64(4), 'NONE'); -SELECT dictGetStringOrDefault('ordinary_db.dict2', 'region_name', toUInt64(5), 'NONE'); -SELECT dictGetStringOrDefault('ordinary_db.dict2', 'region_name', toUInt64(6), 'NONE'); -SELECT dictGetStringOrDefault('ordinary_db.dict2', 'region_name', toUInt64(7), 'NONE'); -SELECT dictGetStringOrDefault('ordinary_db.dict2', 'region_name', toUInt64(8), 'NONE'); -SELECT dictGetStringOrDefault('ordinary_db.dict2', 'region_name', toUInt64(9), 'NONE'); -SELECT dictGetStringOrDefault('ordinary_db.dict2', 'region_name', toUInt64(10), 'NONE'); +SELECT number + 1, dictGetStringOrDefault('db_01268.dict2', 'region_name', toUInt64(number + 1), 'NONE') chars FROM numbers(10); +SELECT number + 1, dictGetFloat32OrDefault('db_01268.dict3', 'parent_region', toUInt64(number + 1), toFloat32(0)) chars FROM numbers(10); +SELECT dictGetStringOrDefault('db_01268.dict2', 'region_name', toUInt64(1), 'NONE'); +SELECT dictGetStringOrDefault('db_01268.dict2', 'region_name', toUInt64(2), 'NONE'); +SELECT dictGetStringOrDefault('db_01268.dict2', 'region_name', toUInt64(3), 'NONE'); +SELECT dictGetStringOrDefault('db_01268.dict2', 'region_name', toUInt64(4), 'NONE'); +SELECT dictGetStringOrDefault('db_01268.dict2', 'region_name', toUInt64(5), 'NONE'); +SELECT dictGetStringOrDefault('db_01268.dict2', 'region_name', toUInt64(6), 'NONE'); +SELECT dictGetStringOrDefault('db_01268.dict2', 'region_name', toUInt64(7), 'NONE'); +SELECT dictGetStringOrDefault('db_01268.dict2', 'region_name', toUInt64(8), 'NONE'); 
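dict2 above is declared HIERARCHICAL on parent_region, and the inserted rows form the chain Center (3) → Moscow (2) → Russia (1) → root (0), which is exactly what the hierarchy calls in this hunk walk. The expected values below are inferred from those rows for illustration only, not copied from a reference file:

SELECT dictGetHierarchy('db_01268.dict2', toUInt64(3));                                  -- [3,2,1]
SELECT dictIsIn('db_01268.dict2', toUInt64(3), toUInt64(1));                             -- 1: Russia is an ancestor of Center
SELECT dictIsIn('db_01268.dict2', toUInt64(1), toUInt64(3));                             -- 0
SELECT dictGetStringOrDefault('db_01268.dict2', 'region_name', toUInt64(100), 'NONE');   -- 'NONE' for a key absent from the source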
+SELECT dictGetStringOrDefault('db_01268.dict2', 'region_name', toUInt64(9), 'NONE'); +SELECT dictGetStringOrDefault('db_01268.dict2', 'region_name', toUInt64(10), 'NONE'); -SELECT dictGetUInt64('ordinary_db.dict1', 'second_column', toUInt64(100500)); -- { serverError 396 } +SELECT dictGetUInt64('db_01268.dict1', 'second_column', toUInt64(100500)); -- { serverError 396 } SELECT 'END'; -DROP DICTIONARY IF EXISTS ordinary_db.dict1; -DROP DICTIONARY IF EXISTS ordinary_db.dict2; -DROP DICTIONARY IF EXISTS ordinary_db.dict3; +DROP DICTIONARY IF EXISTS db_01268.dict1; +DROP DICTIONARY IF EXISTS db_01268.dict2; +DROP DICTIONARY IF EXISTS db_01268.dict3; -DROP DATABASE IF EXISTS ordinary_db; +DROP DATABASE IF EXISTS db_01268; -DROP TABLE IF EXISTS database_for_dict.table_for_dict1; -DROP TABLE IF EXISTS database_for_dict.table_for_dict2; -DROP TABLE IF EXISTS database_for_dict.table_for_dict3; +DROP TABLE IF EXISTS database_for_dict_01268.table_for_dict1; +DROP TABLE IF EXISTS database_for_dict_01268.table_for_dict2; +DROP TABLE IF EXISTS database_for_dict_01268.table_for_dict3; -DROP DATABASE IF EXISTS database_for_dict; +DROP DATABASE IF EXISTS database_for_dict_01268; diff --git a/tests/queries/0_stateless/01280_ssd_complex_key_dictionary.sql b/tests/queries/0_stateless/01280_ssd_complex_key_dictionary.sql index 952a8c2ff55..9faafb6c0c7 100644 --- a/tests/queries/0_stateless/01280_ssd_complex_key_dictionary.sql +++ b/tests/queries/0_stateless/01280_ssd_complex_key_dictionary.sql @@ -24,6 +24,8 @@ INSERT INTO database_for_dict.table_for_dict SELECT toString(number), number + 1 DROP DICTIONARY IF EXISTS database_for_dict.ssd_dict; +-- FIXME filesystem error: in create_directory: Permission denied [/var/lib/clickhouse] +-- Probably we need to rewrite it as an integration test CREATE DICTIONARY database_for_dict.ssd_dict ( k1 String, diff --git a/tests/queries/0_stateless/01320_create_sync_race_condition_zookeeper.sh b/tests/queries/0_stateless/01320_create_sync_race_condition_zookeeper.sh index 5bbec57a236..cc6a66bd6bc 100755 --- a/tests/queries/0_stateless/01320_create_sync_race_condition_zookeeper.sh +++ b/tests/queries/0_stateless/01320_create_sync_race_condition_zookeeper.sh @@ -5,16 +5,17 @@ CURDIR=$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd) set -e -$CLICKHOUSE_CLIENT --query "DROP TABLE IF EXISTS r;" +$CLICKHOUSE_CLIENT --query "DROP DATABASE IF EXISTS test_01320" +$CLICKHOUSE_CLIENT --query "CREATE DATABASE test_01320 ENGINE=Ordinary" # Different behaviour of DROP with the Atomic database engine function thread1() { - while true; do $CLICKHOUSE_CLIENT -n --query "CREATE TABLE r (x UInt64) ENGINE = ReplicatedMergeTree('/test_01320/table', 'r') ORDER BY x; DROP TABLE r;"; done + while true; do $CLICKHOUSE_CLIENT -n --query "CREATE TABLE test_01320.r (x UInt64) ENGINE = ReplicatedMergeTree('/test_01320/table', 'r') ORDER BY x; DROP TABLE test_01320.r;"; done } function thread2() { - while true; do $CLICKHOUSE_CLIENT --query "SYSTEM SYNC REPLICA r" 2>/dev/null; done + while true; do $CLICKHOUSE_CLIENT --query "SYSTEM SYNC REPLICA test_01320.r" 2>/dev/null; done } export -f thread1 @@ -25,4 +26,4 @@ timeout 10 bash -c thread2 & wait -$CLICKHOUSE_CLIENT --query "DROP TABLE IF EXISTS r;" +$CLICKHOUSE_CLIENT --query "DROP DATABASE test_01320" 2>&1 | grep -F "Code:" | grep -v "New table appeared in database being dropped or detached" || exit 0 diff --git a/tests/queries/0_stateless/01391_join_on_dict_crash.sql b/tests/queries/0_stateless/01391_join_on_dict_crash.sql index 998e0e21745..238a966727f 100644 ---
a/tests/queries/0_stateless/01391_join_on_dict_crash.sql +++ b/tests/queries/0_stateless/01391_join_on_dict_crash.sql @@ -1,5 +1,5 @@ DROP DATABASE IF EXISTS db_01391; -CREATE DATABASE db_01391 Engine = Ordinary; +CREATE DATABASE db_01391; USE db_01391; DROP TABLE IF EXISTS t; diff --git a/tests/queries/0_stateless/01400_join_get_with_multi_keys.reference b/tests/queries/0_stateless/01400_join_get_with_multi_keys.reference index 49d59571fbf..726b0a9a7a5 100644 --- a/tests/queries/0_stateless/01400_join_get_with_multi_keys.reference +++ b/tests/queries/0_stateless/01400_join_get_with_multi_keys.reference @@ -1 +1,2 @@ 0.1 +0.1 diff --git a/tests/queries/0_stateless/01400_join_get_with_multi_keys.sql b/tests/queries/0_stateless/01400_join_get_with_multi_keys.sql index 73068270762..8a19865359b 100644 --- a/tests/queries/0_stateless/01400_join_get_with_multi_keys.sql +++ b/tests/queries/0_stateless/01400_join_get_with_multi_keys.sql @@ -6,4 +6,10 @@ INSERT INTO test_joinGet VALUES ('ab', '1', 0.1), ('ab', '2', 0.2), ('cd', '3', SELECT joinGet(test_joinGet, 'c', 'ab', '1'); +CREATE TABLE test_lc(a LowCardinality(String), b LowCardinality(String), c Float64) ENGINE = Join(any, left, a, b); + +INSERT INTO test_lc VALUES ('ab', '1', 0.1), ('ab', '2', 0.2), ('cd', '3', 0.3); + +SELECT joinGet(test_lc, 'c', 'ab', '1'); + DROP TABLE test_joinGet; diff --git a/tests/queries/0_stateless/01477_lc_in_merge_join_left_key.reference b/tests/queries/0_stateless/01477_lc_in_merge_join_left_key.reference new file mode 100644 index 00000000000..0612b4ca23e --- /dev/null +++ b/tests/queries/0_stateless/01477_lc_in_merge_join_left_key.reference @@ -0,0 +1,35 @@ +1 l \N LowCardinality(String) Nullable(String) +2 \N LowCardinality(String) Nullable(String) +1 l \N LowCardinality(String) Nullable(String) +2 \N LowCardinality(String) Nullable(String) +- +0 \N Nullable(String) LowCardinality(String) +1 \N l Nullable(String) LowCardinality(String) +0 \N Nullable(String) LowCardinality(String) +1 \N l Nullable(String) LowCardinality(String) +- +1 l \N LowCardinality(String) Nullable(String) +0 \N LowCardinality(String) Nullable(String) +0 \N LowCardinality(String) Nullable(String) +1 l \N LowCardinality(String) Nullable(String) +- +0 \N Nullable(String) LowCardinality(String) +1 \N l Nullable(String) LowCardinality(String) +0 \N Nullable(String) LowCardinality(String) +1 \N l Nullable(String) LowCardinality(String) +- +1 l \N LowCardinality(String) Nullable(String) +2 \N LowCardinality(String) Nullable(String) +1 l \N LowCardinality(String) Nullable(String) +2 \N LowCardinality(String) Nullable(String) +- +\N \N Nullable(String) LowCardinality(String) +1 \N l Nullable(String) LowCardinality(String) +1 \N l Nullable(String) LowCardinality(String) +\N \N Nullable(String) LowCardinality(String) +- +1 l \N LowCardinality(String) Nullable(String) +\N \N LowCardinality(String) Nullable(String) +1 l \N LowCardinality(String) Nullable(String) +\N \N LowCardinality(String) Nullable(String) +- diff --git a/tests/queries/0_stateless/01477_lc_in_merge_join_left_key.sql b/tests/queries/0_stateless/01477_lc_in_merge_join_left_key.sql new file mode 100644 index 00000000000..2507613f051 --- /dev/null +++ b/tests/queries/0_stateless/01477_lc_in_merge_join_left_key.sql @@ -0,0 +1,65 @@ +SET join_algorithm = 'auto'; +SET max_bytes_in_join = 100; + +DROP TABLE IF EXISTS t; +DROP TABLE IF EXISTS nr; + +CREATE TABLE t (`x` UInt32, `s` LowCardinality(String)) ENGINE = Memory; +CREATE TABLE nr (`x` Nullable(UInt32), `s` Nullable(String)) ENGINE 
= Memory; + +INSERT INTO t VALUES (1, 'l'); +INSERT INTO nr VALUES (2, NULL); + +SET join_use_nulls = 0; + +SELECT t.x, l.s, r.s, toTypeName(l.s), toTypeName(r.s) FROM t AS l LEFT JOIN nr AS r USING (x) ORDER BY t.x; +SELECT t.x, l.s, r.s, toTypeName(l.s), toTypeName(r.s) FROM t AS l RIGHT JOIN nr AS r USING (x) ORDER BY t.x; +SELECT t.x, l.s, r.s, toTypeName(l.s), toTypeName(r.s) FROM t AS l FULL JOIN nr AS r USING (x) ORDER BY t.x; + +SELECT '-'; + +SELECT t.x, l.s, r.s, toTypeName(l.s), toTypeName(r.s) FROM nr AS l LEFT JOIN t AS r USING (x) ORDER BY t.x; +SELECT t.x, l.s, r.s, toTypeName(l.s), toTypeName(r.s) FROM nr AS l RIGHT JOIN t AS r USING (x) ORDER BY t.x; +SELECT t.x, l.s, r.s, toTypeName(l.s), toTypeName(r.s) FROM nr AS l FULL JOIN t AS r USING (x) ORDER BY t.x; + +SELECT '-'; + +SELECT t.x, l.s, r.s, toTypeName(l.s), toTypeName(r.s) FROM t AS l LEFT JOIN nr AS r USING (s) ORDER BY t.x; +SELECT t.x, l.s, r.s, toTypeName(l.s), toTypeName(r.s) FROM t AS l RIGHT JOIN nr AS r USING (s) ORDER BY t.x; +SELECT t.x, l.s, r.s, toTypeName(l.s), toTypeName(r.s) FROM t AS l FULL JOIN nr AS r USING (s) ORDER BY t.x; + +SELECT '-'; + +SELECT t.x, l.s, r.s, toTypeName(l.s), toTypeName(r.s) FROM nr AS l LEFT JOIN t AS r USING (s) ORDER BY t.x; +SELECT t.x, l.s, r.s, toTypeName(l.s), toTypeName(r.s) FROM nr AS l RIGHT JOIN t AS r USING (s) ORDER BY t.x; +SELECT t.x, l.s, r.s, toTypeName(l.s), toTypeName(r.s) FROM nr AS l FULL JOIN t AS r USING (s) ORDER BY t.x; + +SET join_use_nulls = 1; + +SELECT '-'; + +SELECT t.x, l.s, r.s, toTypeName(l.s), toTypeName(r.s) FROM t AS l LEFT JOIN nr AS r USING (x) ORDER BY t.x; +SELECT t.x, l.s, r.s, toTypeName(l.s), toTypeName(r.s) FROM t AS l RIGHT JOIN nr AS r USING (x) ORDER BY t.x; +SELECT t.x, l.s, r.s, toTypeName(l.s), toTypeName(r.s) FROM t AS l FULL JOIN nr AS r USING (x) ORDER BY t.x; + +SELECT '-'; + +SELECT t.x, l.s, r.s, toTypeName(l.s), toTypeName(r.s) FROM nr AS l LEFT JOIN t AS r USING (x) ORDER BY t.x; +SELECT t.x, l.s, r.s, toTypeName(l.s), toTypeName(r.s) FROM nr AS l RIGHT JOIN t AS r USING (x) ORDER BY t.x; +SELECT t.x, l.s, r.s, toTypeName(l.s), toTypeName(r.s) FROM nr AS l FULL JOIN t AS r USING (x) ORDER BY t.x; + +SELECT '-'; + +SELECT t.x, l.s, r.s, toTypeName(l.s), toTypeName(r.s) FROM t AS l LEFT JOIN nr AS r USING (s) ORDER BY t.x; +SELECT t.x, l.s, r.s, toTypeName(l.s), toTypeName(r.s) FROM t AS l RIGHT JOIN nr AS r USING (s) ORDER BY t.x; +SELECT t.x, l.s, r.s, toTypeName(l.s), toTypeName(r.s) FROM t AS l FULL JOIN nr AS r USING (s) ORDER BY t.x; + +SELECT '-'; + +-- TODO +-- SELECT t.x, l.s, r.s, toTypeName(l.s), toTypeName(r.s) FROM nr AS l LEFT JOIN t AS r USING (s) ORDER BY t.x; +-- SELECT t.x, l.s, r.s, toTypeName(l.s), toTypeName(r.s) FROM nr AS l RIGHT JOIN t AS r USING (s) ORDER BY t.x; +-- SELECT t.x, l.s, r.s, toTypeName(l.s), toTypeName(r.s) FROM nr AS l FULL JOIN t AS r USING (s) ORDER BY t.x; + +DROP TABLE t; +DROP TABLE nr; diff --git a/tests/queries/0_stateless/01479_cross_join_9855.reference b/tests/queries/0_stateless/01479_cross_join_9855.reference new file mode 100644 index 00000000000..a74732eabe3 --- /dev/null +++ b/tests/queries/0_stateless/01479_cross_join_9855.reference @@ -0,0 +1,2 @@ +6 +36 diff --git a/tests/queries/0_stateless/01479_cross_join_9855.sql b/tests/queries/0_stateless/01479_cross_join_9855.sql new file mode 100644 index 00000000000..0b549619489 --- /dev/null +++ b/tests/queries/0_stateless/01479_cross_join_9855.sql @@ -0,0 +1,7 @@ +SELECT count() +FROM numbers(4) AS n1, numbers(3) AS n2 
+WHERE n1.number > (select avg(n.number) from numbers(3) n); + +SELECT count() +FROM numbers(4) AS n1, numbers(3) AS n2, numbers(6) AS n3 +WHERE n1.number > (select avg(n.number) from numbers(3) n); diff --git a/tests/queries/0_stateless/01481_join_with_materialized.reference b/tests/queries/0_stateless/01481_join_with_materialized.reference new file mode 100644 index 00000000000..b8626c4cff2 --- /dev/null +++ b/tests/queries/0_stateless/01481_join_with_materialized.reference @@ -0,0 +1 @@ +4 diff --git a/tests/queries/0_stateless/01481_join_with_materialized.sql b/tests/queries/0_stateless/01481_join_with_materialized.sql new file mode 100644 index 00000000000..833b483dc93 --- /dev/null +++ b/tests/queries/0_stateless/01481_join_with_materialized.sql @@ -0,0 +1,21 @@ +drop table if exists t1; +drop table if exists t2; + +create table t1 +( + col UInt64, + x UInt64 MATERIALIZED col + 1 +) Engine = MergeTree order by tuple(); + +create table t2 +( + x UInt64 +) Engine = MergeTree order by tuple(); + +insert into t1 values (1),(2),(3),(4),(5); +insert into t2 values (1),(2),(3),(4),(5); + +SELECT COUNT() FROM t1 INNER JOIN t2 USING x; + +drop table t1; +drop table t2; diff --git a/tests/queries/0_stateless/01482_move_to_prewhere_and_cast.reference b/tests/queries/0_stateless/01482_move_to_prewhere_and_cast.reference new file mode 100644 index 00000000000..29597554bbc --- /dev/null +++ b/tests/queries/0_stateless/01482_move_to_prewhere_and_cast.reference @@ -0,0 +1 @@ +ApplicationA 2020-01-01 diff --git a/tests/queries/0_stateless/01482_move_to_prewhere_and_cast.sql b/tests/queries/0_stateless/01482_move_to_prewhere_and_cast.sql new file mode 100644 index 00000000000..b79a3cf05b4 --- /dev/null +++ b/tests/queries/0_stateless/01482_move_to_prewhere_and_cast.sql @@ -0,0 +1,31 @@ +DROP TABLE IF EXISTS APPLICATION; +DROP TABLE IF EXISTS DATABASE_IO; + +CREATE TABLE APPLICATION ( + `Name` LowCardinality(String), + `Base` LowCardinality(String) +) ENGINE = Memory(); + +insert into table APPLICATION values ('ApplicationA', 'BaseA'), ('ApplicationB', 'BaseB') , ('ApplicationC', 'BaseC'); + +CREATE TABLE DATABASE_IO ( + `Application` LowCardinality(String), + `Base` LowCardinality(String), + `Date` DateTime, + `Ios` UInt32 ) +ENGINE = MergeTree() +ORDER BY Date; + +insert into table DATABASE_IO values ('AppA', 'BaseA', '2020-01-01 00:00:00', 1000); + +SELECT `APPLICATION`.`Name` AS `App`, + CAST(CAST(`DATABASE_IO`.`Date` AS DATE) AS DATE) AS `date` +FROM `DATABASE_IO` +INNER +JOIN `APPLICATION` ON (`DATABASE_IO`.`Base` = `APPLICATION`.`Base`) +WHERE ( + CAST(CAST(`DATABASE_IO`.`Date` AS DATE) AS TIMESTAMP) >= toDateTime('2020-01-01 00:00:00') +); + +DROP TABLE APPLICATION; +DROP TABLE DATABASE_IO; diff --git a/tests/queries/0_stateless/01506_buffer_table_alter_block_structure.reference b/tests/queries/0_stateless/01506_buffer_table_alter_block_structure.reference new file mode 100644 index 00000000000..1f90610041b --- /dev/null +++ b/tests/queries/0_stateless/01506_buffer_table_alter_block_structure.reference @@ -0,0 +1,3 @@ +2020-01-01 00:05:00 +2020-01-01 00:05:00 +2020-01-01 00:06:00 hello diff --git a/tests/queries/0_stateless/01506_buffer_table_alter_block_structure.sql b/tests/queries/0_stateless/01506_buffer_table_alter_block_structure.sql new file mode 100644 index 00000000000..cba7d84fac6 --- /dev/null +++ b/tests/queries/0_stateless/01506_buffer_table_alter_block_structure.sql @@ -0,0 +1,22 @@ +DROP TABLE IF EXISTS buf_dest; +DROP TABLE IF EXISTS buf; + +CREATE TABLE buf_dest (timestamp 
DateTime) +ENGINE = MergeTree PARTITION BY toYYYYMMDD(timestamp) +ORDER BY (timestamp); + +CREATE TABLE buf (timestamp DateTime) Engine = Buffer(currentDatabase(), buf_dest, 16, 3, 20, 2000000, 20000000, 100000000, 300000000);; + +INSERT INTO buf (timestamp) VALUES (toDateTime('2020-01-01 00:05:00')); + +ALTER TABLE buf_dest ADD COLUMN s String; +ALTER TABLE buf ADD COLUMN s String; + +SELECT * FROM buf; + +INSERT INTO buf (timestamp, s) VALUES (toDateTime('2020-01-01 00:06:00'), 'hello'); + +SELECT * FROM buf ORDER BY timestamp; + +DROP TABLE IF EXISTS buf; +DROP TABLE IF EXISTS buf_dest; diff --git a/tests/queries/0_stateless/01507_clickhouse_server_start_with_embedded_config.reference b/tests/queries/0_stateless/01507_clickhouse_server_start_with_embedded_config.reference new file mode 100644 index 00000000000..c3829d603de --- /dev/null +++ b/tests/queries/0_stateless/01507_clickhouse_server_start_with_embedded_config.reference @@ -0,0 +1,5 @@ +Starting clickhouse-server +Waiting for clickhouse-server to start +1 +Hello +World diff --git a/tests/queries/0_stateless/01507_clickhouse_server_start_with_embedded_config.sh b/tests/queries/0_stateless/01507_clickhouse_server_start_with_embedded_config.sh new file mode 100755 index 00000000000..68198ec6e16 --- /dev/null +++ b/tests/queries/0_stateless/01507_clickhouse_server_start_with_embedded_config.sh @@ -0,0 +1,48 @@ +#!/usr/bin/env bash + +CLICKHOUSE_PORT_TCP=50111 +CLICKHOUSE_DATABASE=default + +CURDIR=$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd) +. "$CURDIR"/../shell_config.sh + +echo "Starting clickhouse-server" + +$PORT + +$CLICKHOUSE_BINARY server -- --tcp_port "$CLICKHOUSE_PORT_TCP" > server.log 2>&1 & +PID=$! + +function finish { + kill $PID + wait +} +trap finish EXIT + +echo "Waiting for clickhouse-server to start" + +for i in {1..30}; do + sleep 1 + $CLICKHOUSE_CLIENT --query "SELECT 1" 2>/dev/null && break + if [[ $i == 30 ]]; then + cat server.log + exit 1 + fi +done + +# Check access rights + +$CLICKHOUSE_CLIENT -n --query " + DROP DATABASE IF EXISTS test; + CREATE DATABASE test; + USE test; + + CREATE TABLE t (s String) ENGINE=TinyLog; + INSERT INTO t VALUES ('Hello'); + SELECT * FROM t; + DROP TABLE t; + + CREATE TEMPORARY TABLE t (s String); + INSERT INTO t VALUES ('World'); + SELECT * FROM t; +"; diff --git a/tests/queries/0_stateless/01508_partition_pruning.queries b/tests/queries/0_stateless/01508_partition_pruning.queries new file mode 100644 index 00000000000..3773e907c53 --- /dev/null +++ b/tests/queries/0_stateless/01508_partition_pruning.queries @@ -0,0 +1,124 @@ +DROP TABLE IF EXISTS tMM; +DROP TABLE IF EXISTS tDD; +DROP TABLE IF EXISTS sDD; +DROP TABLE IF EXISTS xMM; +CREATE TABLE tMM(d DateTime,a Int64) ENGINE = MergeTree PARTITION BY toYYYYMM(d) ORDER BY tuple() SETTINGS index_granularity = 8192; +SYSTEM STOP MERGES tMM; +INSERT INTO tMM SELECT toDateTime('2020-08-16 00:00:00') + number*60, number FROM numbers(5000); +INSERT INTO tMM SELECT toDateTime('2020-08-16 00:00:00') + number*60, number FROM numbers(5000); +INSERT INTO tMM SELECT toDateTime('2020-09-01 00:00:00') + number*60, number FROM numbers(5000); +INSERT INTO tMM SELECT toDateTime('2020-09-01 00:00:00') + number*60, number FROM numbers(5000); +INSERT INTO tMM SELECT toDateTime('2020-10-01 00:00:00') + number*60, number FROM numbers(5000); +INSERT INTO tMM SELECT toDateTime('2020-10-15 00:00:00') + number*60, number FROM numbers(5000); + +CREATE TABLE tDD(d DateTime,a Int) ENGINE = MergeTree PARTITION BY toYYYYMMDD(d) ORDER BY tuple() SETTINGS 
index_granularity = 8192; +SYSTEM STOP MERGES tDD; +insert into tDD select toDateTime(toDate('2020-09-23')), number from numbers(10000) UNION ALL select toDateTime(toDateTime('2020-09-23 11:00:00')), number from numbers(10000) UNION ALL select toDateTime(toDate('2020-09-24')), number from numbers(10000) UNION ALL select toDateTime(toDate('2020-09-25')), number from numbers(10000) UNION ALL select toDateTime(toDate('2020-08-15')), number from numbers(10000); + +CREATE TABLE sDD(d UInt64,a Int) ENGINE = MergeTree PARTITION BY toYYYYMM(toDate(intDiv(d,1000))) ORDER BY tuple() SETTINGS index_granularity = 8192; +SYSTEM STOP MERGES sDD; +insert into sDD select (1597536000+number*60)*1000, number from numbers(5000); +insert into sDD select (1597536000+number*60)*1000, number from numbers(5000); +insert into sDD select (1598918400+number*60)*1000, number from numbers(5000); +insert into sDD select (1598918400+number*60)*1000, number from numbers(5000); +insert into sDD select (1601510400+number*60)*1000, number from numbers(5000); +insert into sDD select (1602720000+number*60)*1000, number from numbers(5000); + +CREATE TABLE xMM(d DateTime,a Int64, f Int64) ENGINE = MergeTree PARTITION BY (toYYYYMM(d), a) ORDER BY tuple() SETTINGS index_granularity = 8192; +SYSTEM STOP MERGES xMM; +INSERT INTO xMM SELECT toDateTime('2020-08-16 00:00:00') + number*60, 1, number FROM numbers(5000); +INSERT INTO xMM SELECT toDateTime('2020-08-16 00:00:00') + number*60, 2, number FROM numbers(5000); +INSERT INTO xMM SELECT toDateTime('2020-09-01 00:00:00') + number*60, 3, number FROM numbers(5000); +INSERT INTO xMM SELECT toDateTime('2020-09-01 00:00:00') + number*60, 2, number FROM numbers(5000); +INSERT INTO xMM SELECT toDateTime('2020-10-01 00:00:00') + number*60, 1, number FROM numbers(5000); +INSERT INTO xMM SELECT toDateTime('2020-10-15 00:00:00') + number*60, 1, number FROM numbers(5000); + + +SELECT '--------- tMM ----------------------------'; +select uniqExact(_part), count() from tMM where toDate(d)=toDate('2020-09-15'); +select uniqExact(_part), count() from tMM where toDate(d)=toDate('2020-09-01'); +select uniqExact(_part), count() from tMM where toDate(d)=toDate('2020-10-15'); +select uniqExact(_part), count() from tMM where toDate(d)='2020-09-15'; +select uniqExact(_part), count() from tMM where toYYYYMM(d)=202009; +select uniqExact(_part), count() from tMM where toYYYYMMDD(d)=20200816; +select uniqExact(_part), count() from tMM where toYYYYMMDD(d)=20201015; +select uniqExact(_part), count() from tMM where toDate(d)='2020-10-15'; +select uniqExact(_part), count() from tMM where d >= '2020-09-01 00:00:00' and d<'2020-10-15 00:00:00'; +select uniqExact(_part), count() from tMM where d >= '2020-01-16 00:00:00' and d < toDateTime('2021-08-17 00:00:00'); +select uniqExact(_part), count() from tMM where d >= '2020-09-16 00:00:00' and d < toDateTime('2020-10-01 00:00:00'); +select uniqExact(_part), count() from tMM where d >= '2020-09-12 00:00:00' and d < '2020-10-16 00:00:00'; +select uniqExact(_part), count() from tMM where toStartOfDay(d) >= '2020-09-12 00:00:00'; +select uniqExact(_part), count() from tMM where toStartOfDay(d) = '2020-09-01 00:00:00'; +select uniqExact(_part), count() from tMM where toStartOfDay(d) = '2020-10-01 00:00:00'; +select uniqExact(_part), count() from tMM where toStartOfDay(d) >= '2020-09-15 00:00:00' and d < '2020-10-16 00:00:00'; +select uniqExact(_part), count() from tMM where toYYYYMM(d) between 202009 and 202010; +select uniqExact(_part), count() from tMM where toYYYYMM(d) 
between 202009 and 202009; +select uniqExact(_part), count() from tMM where toYYYYMM(d) between 202009 and 202010 and toStartOfDay(d) = '2020-10-01 00:00:00'; +select uniqExact(_part), count() from tMM where toYYYYMM(d) >= 202009 and toStartOfDay(d) < '2020-10-02 00:00:00'; +select uniqExact(_part), count() from tMM where toYYYYMM(d) > 202009 and toStartOfDay(d) < '2020-10-02 00:00:00'; +select uniqExact(_part), count() from tMM where toYYYYMM(d)+1 > 202009 and toStartOfDay(d) < '2020-10-02 00:00:00'; +select uniqExact(_part), count() from tMM where toYYYYMM(d)+1 > 202010 and toStartOfDay(d) < '2020-10-02 00:00:00'; +select uniqExact(_part), count() from tMM where toYYYYMM(d)+1 > 202010; +select uniqExact(_part), count() from tMM where toYYYYMM(d-1)+1 = 202010; +select uniqExact(_part), count() from tMM where toStartOfMonth(d) >= '2020-09-15'; +select uniqExact(_part), count() from tMM where toStartOfMonth(d) >= '2020-09-01'; +select uniqExact(_part), count() from tMM where toStartOfMonth(d) >= '2020-09-01' and toStartOfMonth(d) < '2020-10-01'; + +SYSTEM START MERGES tMM; +OPTIMIZE TABLE tMM FINAL; + +select uniqExact(_part), count() from tMM where toYYYYMM(d-1)+1 = 202010; +select uniqExact(_part), count() from tMM where toYYYYMM(d)+1 > 202010; +select uniqExact(_part), count() from tMM where toYYYYMM(d) between 202009 and 202010; + + +SELECT '--------- tDD ----------------------------'; +SYSTEM START MERGES tDD; +OPTIMIZE TABLE tDD FINAL; + +select uniqExact(_part), count() from tDD where toDate(d)=toDate('2020-09-24'); +select uniqExact(_part), count() FROM tDD WHERE toDate(d) = toDate('2020-09-24'); +select uniqExact(_part), count() FROM tDD WHERE toDate(d) = '2020-09-24'; +select uniqExact(_part), count() FROM tDD WHERE toDate(d) >= '2020-09-23' and toDate(d) <= '2020-09-26'; +select uniqExact(_part), count() FROM tDD WHERE toYYYYMMDD(d) >= 20200923 and toDate(d) <= '2020-09-26'; + + +SELECT '--------- sDD ----------------------------'; +select uniqExact(_part), count() from sDD; +select uniqExact(_part), count() from sDD where toYYYYMM(toDateTime(intDiv(d,1000),'UTC')-1)+1 = 202010; +select uniqExact(_part), count() from sDD where toYYYYMM(toDateTime(intDiv(d,1000),'UTC')-1) = 202010; +select uniqExact(_part), count() from sDD where toYYYYMM(toDateTime(intDiv(d,1000),'UTC')-1) = 202110; +select uniqExact(_part), count() from sDD where toYYYYMM(toDateTime(intDiv(d,1000),'UTC'))+1 > 202009 and toStartOfDay(toDateTime(intDiv(d,1000),'UTC')) < toDateTime('2020-10-02 00:00:00','UTC'); +select uniqExact(_part), count() from sDD where toYYYYMM(toDateTime(intDiv(d,1000),'UTC'))+1 > 202009 and toDateTime(intDiv(d,1000),'UTC') < toDateTime('2020-10-01 00:00:00','UTC'); +select uniqExact(_part), count() from sDD where d >= 1598918400000; +select uniqExact(_part), count() from sDD where d >= 1598918400000 and toYYYYMM(toDateTime(intDiv(d,1000),'UTC')-1) < 202010; + + +SELECT '--------- xMM ----------------------------'; +select uniqExact(_part), count() from xMM where toStartOfDay(d) >= '2020-10-01 00:00:00'; +select uniqExact(_part), count() from xMM where d >= '2020-09-01 00:00:00' and d <= '2020-10-01 00:00:00'; +select uniqExact(_part), count() from xMM where d >= '2020-09-01 00:00:00' and d < '2020-10-01 00:00:00'; +select uniqExact(_part), count() from xMM where d >= '2020-09-01 00:00:00' and d <= '2020-10-01 00:00:00' and a=1; +select uniqExact(_part), count() from xMM where d >= '2020-09-01 00:00:00' and d <= '2020-10-01 00:00:00' and a<>3; +select uniqExact(_part), count() from xMM 
where d >= '2020-09-01 00:00:00' and d < '2020-10-01 00:00:00' and a<>3; +select uniqExact(_part), count() from xMM where d >= '2020-09-01 00:00:00' and d < '2020-11-01 00:00:00' and a = 1; +select uniqExact(_part), count() from xMM where a = 1; +select uniqExact(_part), count() from xMM where a = 66; +select uniqExact(_part), count() from xMM where a <> 66; +select uniqExact(_part), count() from xMM where a = 2; + +SYSTEM START MERGES xMM; +optimize table xMM final; + +select uniqExact(_part), count() from xMM where a = 1; +select uniqExact(_part), count() from xMM where toStartOfDay(d) >= '2020-10-01 00:00:00'; +select uniqExact(_part), count() from xMM where a <> 66; +select uniqExact(_part), count() from xMM where d >= '2020-09-01 00:00:00' and d <= '2020-10-01 00:00:00' and a<>3; +select uniqExact(_part), count() from xMM where d >= '2020-09-01 00:00:00' and d < '2020-10-01 00:00:00' and a<>3; + +DROP TABLE tMM; +DROP TABLE tDD; +DROP TABLE sDD; +DROP TABLE xMM; + + diff --git a/tests/queries/0_stateless/01508_partition_pruning.reference b/tests/queries/0_stateless/01508_partition_pruning.reference new file mode 100644 index 00000000000..0cc40d23b41 --- /dev/null +++ b/tests/queries/0_stateless/01508_partition_pruning.reference @@ -0,0 +1,244 @@ +--------- tMM ---------------------------- +select uniqExact(_part), count() from tMM where toDate(d)=toDate('2020-09-15'); +0 0 +Selected 0 parts by partition key, 0 parts by primary key, 0 marks by primary key, 0 marks to read from 0 ranges + +select uniqExact(_part), count() from tMM where toDate(d)=toDate('2020-09-01'); +2 2880 +Selected 2 parts by partition key, 2 parts by primary key, 2 marks by primary key, 2 marks to read from 2 ranges + +select uniqExact(_part), count() from tMM where toDate(d)=toDate('2020-10-15'); +1 1440 +Selected 1 parts by partition key, 1 parts by primary key, 1 marks by primary key, 1 marks to read from 1 ranges + +select uniqExact(_part), count() from tMM where toDate(d)='2020-09-15'; +0 0 +Selected 0 parts by partition key, 0 parts by primary key, 0 marks by primary key, 0 marks to read from 0 ranges + +select uniqExact(_part), count() from tMM where toYYYYMM(d)=202009; +2 10000 +Selected 2 parts by partition key, 2 parts by primary key, 2 marks by primary key, 2 marks to read from 2 ranges + +select uniqExact(_part), count() from tMM where toYYYYMMDD(d)=20200816; +2 2880 +Selected 2 parts by partition key, 2 parts by primary key, 2 marks by primary key, 2 marks to read from 2 ranges + +select uniqExact(_part), count() from tMM where toYYYYMMDD(d)=20201015; +1 1440 +Selected 1 parts by partition key, 1 parts by primary key, 1 marks by primary key, 1 marks to read from 1 ranges + +select uniqExact(_part), count() from tMM where toDate(d)='2020-10-15'; +1 1440 +Selected 1 parts by partition key, 1 parts by primary key, 1 marks by primary key, 1 marks to read from 1 ranges + +select uniqExact(_part), count() from tMM where d >= '2020-09-01 00:00:00' and d<'2020-10-15 00:00:00'; +3 15000 +Selected 3 parts by partition key, 3 parts by primary key, 3 marks by primary key, 3 marks to read from 3 ranges + +select uniqExact(_part), count() from tMM where d >= '2020-01-16 00:00:00' and d < toDateTime('2021-08-17 00:00:00'); +6 30000 +Selected 6 parts by partition key, 6 parts by primary key, 6 marks by primary key, 6 marks to read from 6 ranges + +select uniqExact(_part), count() from tMM where d >= '2020-09-16 00:00:00' and d < toDateTime('2020-10-01 00:00:00'); +0 0 +Selected 0 parts by partition key, 0 parts by 
primary key, 0 marks by primary key, 0 marks to read from 0 ranges + +select uniqExact(_part), count() from tMM where d >= '2020-09-12 00:00:00' and d < '2020-10-16 00:00:00'; +2 6440 +Selected 2 parts by partition key, 2 parts by primary key, 2 marks by primary key, 2 marks to read from 2 ranges + +select uniqExact(_part), count() from tMM where toStartOfDay(d) >= '2020-09-12 00:00:00'; +2 10000 +Selected 2 parts by partition key, 2 parts by primary key, 2 marks by primary key, 2 marks to read from 2 ranges + +select uniqExact(_part), count() from tMM where toStartOfDay(d) = '2020-09-01 00:00:00'; +2 2880 +Selected 2 parts by partition key, 2 parts by primary key, 2 marks by primary key, 2 marks to read from 2 ranges + +select uniqExact(_part), count() from tMM where toStartOfDay(d) = '2020-10-01 00:00:00'; +1 1440 +Selected 1 parts by partition key, 1 parts by primary key, 1 marks by primary key, 1 marks to read from 1 ranges + +select uniqExact(_part), count() from tMM where toStartOfDay(d) >= '2020-09-15 00:00:00' and d < '2020-10-16 00:00:00'; +2 6440 +Selected 2 parts by partition key, 2 parts by primary key, 2 marks by primary key, 2 marks to read from 2 ranges + +select uniqExact(_part), count() from tMM where toYYYYMM(d) between 202009 and 202010; +4 20000 +Selected 4 parts by partition key, 4 parts by primary key, 4 marks by primary key, 4 marks to read from 4 ranges + +select uniqExact(_part), count() from tMM where toYYYYMM(d) between 202009 and 202009; +2 10000 +Selected 2 parts by partition key, 2 parts by primary key, 2 marks by primary key, 2 marks to read from 2 ranges + +select uniqExact(_part), count() from tMM where toYYYYMM(d) between 202009 and 202010 and toStartOfDay(d) = '2020-10-01 00:00:00'; +1 1440 +Selected 1 parts by partition key, 1 parts by primary key, 1 marks by primary key, 1 marks to read from 1 ranges + +select uniqExact(_part), count() from tMM where toYYYYMM(d) >= 202009 and toStartOfDay(d) < '2020-10-02 00:00:00'; +3 11440 +Selected 3 parts by partition key, 3 parts by primary key, 3 marks by primary key, 3 marks to read from 3 ranges + +select uniqExact(_part), count() from tMM where toYYYYMM(d) > 202009 and toStartOfDay(d) < '2020-10-02 00:00:00'; +1 1440 +Selected 1 parts by partition key, 1 parts by primary key, 1 marks by primary key, 1 marks to read from 1 ranges + +select uniqExact(_part), count() from tMM where toYYYYMM(d)+1 > 202009 and toStartOfDay(d) < '2020-10-02 00:00:00'; +3 11440 +Selected 3 parts by partition key, 3 parts by primary key, 3 marks by primary key, 3 marks to read from 3 ranges + +select uniqExact(_part), count() from tMM where toYYYYMM(d)+1 > 202010 and toStartOfDay(d) < '2020-10-02 00:00:00'; +1 1440 +Selected 1 parts by partition key, 1 parts by primary key, 1 marks by primary key, 1 marks to read from 1 ranges + +select uniqExact(_part), count() from tMM where toYYYYMM(d)+1 > 202010; +2 10000 +Selected 2 parts by partition key, 2 parts by primary key, 2 marks by primary key, 2 marks to read from 2 ranges + +select uniqExact(_part), count() from tMM where toYYYYMM(d-1)+1 = 202010; +3 9999 +Selected 3 parts by partition key, 3 parts by primary key, 3 marks by primary key, 3 marks to read from 3 ranges + +select uniqExact(_part), count() from tMM where toStartOfMonth(d) >= '2020-09-15'; +2 10000 +Selected 2 parts by partition key, 2 parts by primary key, 2 marks by primary key, 2 marks to read from 2 ranges + +select uniqExact(_part), count() from tMM where toStartOfMonth(d) >= '2020-09-01'; +4 20000 +Selected 4 parts by 
partition key, 4 parts by primary key, 4 marks by primary key, 4 marks to read from 4 ranges + +select uniqExact(_part), count() from tMM where toStartOfMonth(d) >= '2020-09-01' and toStartOfMonth(d) < '2020-10-01'; +2 10000 +Selected 2 parts by partition key, 2 parts by primary key, 2 marks by primary key, 2 marks to read from 2 ranges + +select uniqExact(_part), count() from tMM where toYYYYMM(d-1)+1 = 202010; +2 9999 +Selected 2 parts by partition key, 2 parts by primary key, 2 marks by primary key, 2 marks to read from 2 ranges + +select uniqExact(_part), count() from tMM where toYYYYMM(d)+1 > 202010; +1 10000 +Selected 1 parts by partition key, 1 parts by primary key, 1 marks by primary key, 1 marks to read from 1 ranges + +select uniqExact(_part), count() from tMM where toYYYYMM(d) between 202009 and 202010; +2 20000 +Selected 2 parts by partition key, 2 parts by primary key, 2 marks by primary key, 2 marks to read from 2 ranges + +--------- tDD ---------------------------- +select uniqExact(_part), count() from tDD where toDate(d)=toDate('2020-09-24'); +1 10000 +Selected 1 parts by partition key, 1 parts by primary key, 1 marks by primary key, 1 marks to read from 1 ranges + +select uniqExact(_part), count() FROM tDD WHERE toDate(d) = toDate('2020-09-24'); +1 10000 +Selected 1 parts by partition key, 1 parts by primary key, 1 marks by primary key, 1 marks to read from 1 ranges + +select uniqExact(_part), count() FROM tDD WHERE toDate(d) = '2020-09-24'; +1 10000 +Selected 1 parts by partition key, 1 parts by primary key, 1 marks by primary key, 1 marks to read from 1 ranges + +select uniqExact(_part), count() FROM tDD WHERE toDate(d) >= '2020-09-23' and toDate(d) <= '2020-09-26'; +3 40000 +Selected 3 parts by partition key, 3 parts by primary key, 4 marks by primary key, 4 marks to read from 3 ranges + +select uniqExact(_part), count() FROM tDD WHERE toYYYYMMDD(d) >= 20200923 and toDate(d) <= '2020-09-26'; +3 40000 +Selected 3 parts by partition key, 3 parts by primary key, 4 marks by primary key, 4 marks to read from 3 ranges + +--------- sDD ---------------------------- +select uniqExact(_part), count() from sDD; +6 30000 +Selected 6 parts by partition key, 6 parts by primary key, 6 marks by primary key, 6 marks to read from 6 ranges + +select uniqExact(_part), count() from sDD where toYYYYMM(toDateTime(intDiv(d,1000),'UTC')-1)+1 = 202010; +3 9999 +Selected 3 parts by partition key, 3 parts by primary key, 3 marks by primary key, 3 marks to read from 3 ranges + +select uniqExact(_part), count() from sDD where toYYYYMM(toDateTime(intDiv(d,1000),'UTC')-1) = 202010; +2 9999 +Selected 2 parts by partition key, 2 parts by primary key, 2 marks by primary key, 2 marks to read from 2 ranges + +select uniqExact(_part), count() from sDD where toYYYYMM(toDateTime(intDiv(d,1000),'UTC')-1) = 202110; +0 0 +Selected 0 parts by partition key, 0 parts by primary key, 0 marks by primary key, 0 marks to read from 0 ranges + +select uniqExact(_part), count() from sDD where toYYYYMM(toDateTime(intDiv(d,1000),'UTC'))+1 > 202009 and toStartOfDay(toDateTime(intDiv(d,1000),'UTC')) < toDateTime('2020-10-02 00:00:00','UTC'); +3 11440 +Selected 3 parts by partition key, 3 parts by primary key, 3 marks by primary key, 3 marks to read from 3 ranges + +select uniqExact(_part), count() from sDD where toYYYYMM(toDateTime(intDiv(d,1000),'UTC'))+1 > 202009 and toDateTime(intDiv(d,1000),'UTC') < toDateTime('2020-10-01 00:00:00','UTC'); +2 10000 +Selected 2 parts by partition key, 2 parts by primary key, 2 marks by 
primary key, 2 marks to read from 2 ranges + +select uniqExact(_part), count() from sDD where d >= 1598918400000; +4 20000 +Selected 4 parts by partition key, 4 parts by primary key, 4 marks by primary key, 4 marks to read from 4 ranges + +select uniqExact(_part), count() from sDD where d >= 1598918400000 and toYYYYMM(toDateTime(intDiv(d,1000),'UTC')-1) < 202010; +3 10001 +Selected 3 parts by partition key, 3 parts by primary key, 3 marks by primary key, 3 marks to read from 3 ranges + +--------- xMM ---------------------------- +select uniqExact(_part), count() from xMM where toStartOfDay(d) >= '2020-10-01 00:00:00'; +2 10000 +Selected 2 parts by partition key, 2 parts by primary key, 2 marks by primary key, 2 marks to read from 2 ranges + +select uniqExact(_part), count() from xMM where d >= '2020-09-01 00:00:00' and d <= '2020-10-01 00:00:00'; +3 10001 +Selected 3 parts by partition key, 3 parts by primary key, 3 marks by primary key, 3 marks to read from 3 ranges + +select uniqExact(_part), count() from xMM where d >= '2020-09-01 00:00:00' and d < '2020-10-01 00:00:00'; +2 10000 +Selected 2 parts by partition key, 2 parts by primary key, 2 marks by primary key, 2 marks to read from 2 ranges + +select uniqExact(_part), count() from xMM where d >= '2020-09-01 00:00:00' and d <= '2020-10-01 00:00:00' and a=1; +1 1 +Selected 1 parts by partition key, 1 parts by primary key, 1 marks by primary key, 1 marks to read from 1 ranges + +select uniqExact(_part), count() from xMM where d >= '2020-09-01 00:00:00' and d <= '2020-10-01 00:00:00' and a<>3; +2 5001 +Selected 2 parts by partition key, 2 parts by primary key, 2 marks by primary key, 2 marks to read from 2 ranges + +select uniqExact(_part), count() from xMM where d >= '2020-09-01 00:00:00' and d < '2020-10-01 00:00:00' and a<>3; +1 5000 +Selected 1 parts by partition key, 1 parts by primary key, 1 marks by primary key, 1 marks to read from 1 ranges + +select uniqExact(_part), count() from xMM where d >= '2020-09-01 00:00:00' and d < '2020-11-01 00:00:00' and a = 1; +2 10000 +Selected 2 parts by partition key, 2 parts by primary key, 2 marks by primary key, 2 marks to read from 2 ranges + +select uniqExact(_part), count() from xMM where a = 1; +3 15000 +Selected 3 parts by partition key, 3 parts by primary key, 3 marks by primary key, 3 marks to read from 3 ranges + +select uniqExact(_part), count() from xMM where a = 66; +0 0 +Selected 0 parts by partition key, 0 parts by primary key, 0 marks by primary key, 0 marks to read from 0 ranges + +select uniqExact(_part), count() from xMM where a <> 66; +6 30000 +Selected 6 parts by partition key, 6 parts by primary key, 6 marks by primary key, 6 marks to read from 6 ranges + +select uniqExact(_part), count() from xMM where a = 2; +2 10000 +Selected 2 parts by partition key, 2 parts by primary key, 2 marks by primary key, 2 marks to read from 2 ranges + +select uniqExact(_part), count() from xMM where a = 1; +2 15000 +Selected 2 parts by partition key, 2 parts by primary key, 2 marks by primary key, 2 marks to read from 2 ranges + +select uniqExact(_part), count() from xMM where toStartOfDay(d) >= '2020-10-01 00:00:00'; +1 10000 +Selected 1 parts by partition key, 1 parts by primary key, 1 marks by primary key, 1 marks to read from 1 ranges + +select uniqExact(_part), count() from xMM where a <> 66; +5 30000 +Selected 5 parts by partition key, 5 parts by primary key, 5 marks by primary key, 5 marks to read from 5 ranges + +select uniqExact(_part), count() from xMM where d >= '2020-09-01 00:00:00' 
and d <= '2020-10-01 00:00:00' and a<>3; +2 5001 +Selected 2 parts by partition key, 2 parts by primary key, 2 marks by primary key, 2 marks to read from 2 ranges + +select uniqExact(_part), count() from xMM where d >= '2020-09-01 00:00:00' and d < '2020-10-01 00:00:00' and a<>3; +1 5000 +Selected 1 parts by partition key, 1 parts by primary key, 1 marks by primary key, 1 marks to read from 1 ranges + diff --git a/tests/queries/0_stateless/01508_partition_pruning.sh b/tests/queries/0_stateless/01508_partition_pruning.sh new file mode 100755 index 00000000000..c886946c7d9 --- /dev/null +++ b/tests/queries/0_stateless/01508_partition_pruning.sh @@ -0,0 +1,37 @@ +#!/usr/bin/env bash + +#------------------------------------------------------------------------------------------- +# Description of test result: +# Test the correctness of the partition +# pruning +# +# Script executes queries from a file 01508_partition_pruning.queries (1 line = 1 query) +# Queries that start with 'select' (but NOT with 'SELECT') are executed with send_logs_level=debug +#------------------------------------------------------------------------------------------- + +CURDIR=$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd) +. "$CURDIR"/../shell_config.sh + +#export CLICKHOUSE_CLIENT="clickhouse-client --send_logs_level=none" +#export CLICKHOUSE_CLIENT_SERVER_LOGS_LEVEL=none +#export CURDIR=. + + +queries="${CURDIR}/01508_partition_pruning.queries" +while IFS= read -r sql +do + [ -z "$sql" ] && continue + if [[ "$sql" == select* ]] ; + then + echo "$sql" + ${CLICKHOUSE_CLIENT} --query "$sql" + CLICKHOUSE_CLIENT=$(echo ${CLICKHOUSE_CLIENT} | sed 's/'"--send_logs_level=${CLICKHOUSE_CLIENT_SERVER_LOGS_LEVEL}"'/--send_logs_level=debug/g') + ${CLICKHOUSE_CLIENT} --query "$sql" 2>&1 | grep -oh "Selected .* parts by partition key, *.
parts by primary key, .* marks by primary key, .* marks to read from .* ranges.*$" + CLICKHOUSE_CLIENT=$(echo ${CLICKHOUSE_CLIENT} | sed 's/--send_logs_level=debug/'"--send_logs_level=${CLICKHOUSE_CLIENT_SERVER_LOGS_LEVEL}"'/g') + echo "" + else + ${CLICKHOUSE_CLIENT} --query "$sql" + fi +done < "$queries" + + diff --git a/tests/queries/0_stateless/01508_query_obfuscator.reference b/tests/queries/0_stateless/01508_query_obfuscator.reference new file mode 100644 index 00000000000..0064ac73a09 --- /dev/null +++ b/tests/queries/0_stateless/01508_query_obfuscator.reference @@ -0,0 +1,16 @@ +SELECT 116, 'Qqfu://2020-02-10isqkc1203 sp 2000-05-27T18:38:01', 13e100, Residue_id_breakfastDevice, park(Innervation), avgIf(remote('128.0.0.1')) +SELECT shell_dust_tintype between crumb and shoat, case when peach >= 116 then bombing else null end + +SELECT + ChimeID, + Testimonial.ID, Testimonial.SipCauseway, + TankfulTRUMPET, + HUMIDITY.TermiteName, HUMIDITY.TermiteSculptural, HUMIDITY.TermiteGuilt, HUMIDITY.TermiteIntensity, HUMIDITY.SipCauseway, HUMIDITY.Coat +FROM merge.tinkle_efficiency +WHERE + FaithSeller >= '2020-10-13' AND FaithSeller <= '2020-10-21' + AND MandolinID = 30750384 + AND intHash32(GafferID) = 448362928 AND intHash64(GafferID) = 12572659331310383983 + AND ChimeID IN (8195672321757027078, 7079643623150622129, 5057006826979676478, 7886875230160484653, 7494974311229040743) + AND Stot = 1 + diff --git a/tests/queries/0_stateless/01508_query_obfuscator.sh b/tests/queries/0_stateless/01508_query_obfuscator.sh new file mode 100755 index 00000000000..d60e42489fa --- /dev/null +++ b/tests/queries/0_stateless/01508_query_obfuscator.sh @@ -0,0 +1,22 @@ +#!/usr/bin/env bash + +CURDIR=$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd) +. "$CURDIR"/../shell_config.sh + +$CLICKHOUSE_FORMAT --seed Hello --obfuscate <<< "SELECT 123, 'Test://2020-01-01hello1234 at 2000-01-01T01:02:03', 12e100, Gibberish_id_testCool, hello(World), avgIf(remote('127.0.0.1'))" +$CLICKHOUSE_FORMAT --seed Hello --obfuscate <<< "SELECT cost_first_screen between a and b, case when x >= 123 then y else null end" + +$CLICKHOUSE_FORMAT --seed Hello --obfuscate <<< " +SELECT + VisitID, + Goals.ID, Goals.EventTime, + WatchIDs, + EAction.ProductName, EAction.ProductPrice, EAction.ProductCurrency, EAction.ProductQuantity, EAction.EventTime, EAction.Type +FROM merge.visits_v2 +WHERE + StartDate >= '2020-09-17' AND StartDate <= '2020-09-25' + AND CounterID = 24226447 + AND intHash32(UserID) = 416638616 AND intHash64(UserID) = 13269091395366875299 + AND VisitID IN (5653048135597886819, 5556254872710352304, 5516214175671455313, 5476714937521999313, 5464051549483503043) + AND Sign = 1 +" diff --git a/tests/queries/0_stateless/01508_race_condition_rename_clear_zookeeper.reference b/tests/queries/0_stateless/01508_race_condition_rename_clear_zookeeper.reference new file mode 100644 index 00000000000..13de30f45d1 --- /dev/null +++ b/tests/queries/0_stateless/01508_race_condition_rename_clear_zookeeper.reference @@ -0,0 +1 @@ +3000 diff --git a/tests/queries/0_stateless/01508_race_condition_rename_clear_zookeeper.sh b/tests/queries/0_stateless/01508_race_condition_rename_clear_zookeeper.sh new file mode 100755 index 00000000000..2af1cb214a4 --- /dev/null +++ b/tests/queries/0_stateless/01508_race_condition_rename_clear_zookeeper.sh @@ -0,0 +1,27 @@ +#!/usr/bin/env bash + +CURDIR=$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd) +. 
"$CURDIR"/../shell_config.sh + +$CLICKHOUSE_CLIENT --query "DROP TABLE IF EXISTS table_for_renames0" +$CLICKHOUSE_CLIENT --query "DROP TABLE IF EXISTS table_for_renames50" + + +$CLICKHOUSE_CLIENT --query "CREATE TABLE table_for_renames0 (value UInt64, data String) ENGINE ReplicatedMergeTree('/clickhouse/tables/test_01508/concurrent_rename', '1') ORDER BY tuple() SETTINGS cleanup_delay_period = 1, cleanup_delay_period_random_add = 0, min_rows_for_compact_part = 100000, min_rows_for_compact_part = 10000000, write_ahead_log_max_bytes = 1" + + +$CLICKHOUSE_CLIENT --query "INSERT INTO table_for_renames0 SELECT number, toString(number) FROM numbers(1000)" + +$CLICKHOUSE_CLIENT --query "INSERT INTO table_for_renames0 SELECT number, toString(number) FROM numbers(1000, 1000)" + +$CLICKHOUSE_CLIENT --query "INSERT INTO table_for_renames0 SELECT number, toString(number) FROM numbers(2000, 1000)" + +for i in $(seq 1 50); do + prev_i=$((i - 1)) + $CLICKHOUSE_CLIENT --query "RENAME TABLE table_for_renames$prev_i TO table_for_renames$i" +done + +$CLICKHOUSE_CLIENT --query "SELECT COUNT() from table_for_renames50" + +$CLICKHOUSE_CLIENT --query "DROP TABLE IF EXISTS table_for_renames0" +$CLICKHOUSE_CLIENT --query "DROP TABLE IF EXISTS table_for_renames50" diff --git a/tests/queries/0_stateless/arcadia_skip_list.txt b/tests/queries/0_stateless/arcadia_skip_list.txt index 69391ca9fd4..e59a2634d0c 100644 --- a/tests/queries/0_stateless/arcadia_skip_list.txt +++ b/tests/queries/0_stateless/arcadia_skip_list.txt @@ -145,3 +145,7 @@ 01461_query_start_time_microseconds 01455_shard_leaf_max_rows_bytes_to_read 01505_distributed_local_type_conversion_enum +00604_show_create_database +00609_mv_index_in_in +00510_materizlized_view_and_deduplication_zookeeper +00738_lock_for_inner_table diff --git a/tests/queries/skip_list.json b/tests/queries/skip_list.json index e4713b2d960..26e5bbf78cf 100644 --- a/tests/queries/skip_list.json +++ b/tests/queries/skip_list.json @@ -88,19 +88,10 @@ ], "release-build": [ ], - "database-atomic": [ - /// Inner tables of materialized views have different names - "00738_lock_for_inner_table", + "database-ordinary": [ + "00604_show_create_database", "00609_mv_index_in_in", "00510_materizlized_view_and_deduplication_zookeeper", - /// Different database engine - "00604_show_create_database", - /// UUID must be specified in ATTACH TABLE - "01190_full_attach_syntax", - /// Assumes blocking DROP - "01320_create_sync_race_condition", - /// Internal distionary name is different - "01225_show_create_table_from_dictionary", - "01224_no_superfluous_dict_reload" + "00738_lock_for_inner_table" ] } diff --git a/tests/testflows/ldap/tests/user_config.py b/tests/testflows/ldap/tests/user_config.py index edc85a5877e..f609231b752 100644 --- a/tests/testflows/ldap/tests/user_config.py +++ b/tests/testflows/ldap/tests/user_config.py @@ -29,6 +29,7 @@ def empty_user_name(self, timeout=20): def empty_server_name(self, timeout=20): """Check that if server name is an empty string then login is not allowed. 
""" + message = "Exception: LDAP server name cannot be empty for user" servers = {"openldap1": { "host": "openldap1", "port": "389", "enable_tls": "no", "auth_dn_prefix": "cn=", "auth_dn_suffix": ",ou=users,dc=company,dc=com" @@ -37,7 +38,8 @@ def empty_server_name(self, timeout=20): "errorcode": 4, "message": "DB::Exception: user1: Authentication failed: password is incorrect or there is no user with such name" }] - login(servers, *users) + config = create_ldap_users_config_content(*users) + invalid_user_config(servers, config, message=message, tail=15, timeout=timeout) @TestScenario @Requirements( @@ -147,9 +149,6 @@ def ldap_and_password(self): with Then("I expect an error when I try to load the configuration file", description=error_message): invalid_user_config(servers, new_config, message=error_message, tail=16) - with And("I expect the authentication to fail when I try to login"): - login(servers, user, config=new_config) - @TestFeature @Name("user config") def feature(self, node="clickhouse1"): diff --git a/tests/tsan_suppressions.txt b/tests/tsan_suppressions.txt index 912e0361bff..668710a33d7 100644 --- a/tests/tsan_suppressions.txt +++ b/tests/tsan_suppressions.txt @@ -1 +1,2 @@ -# Fortunately, we have no suppressions! +# looks like a bug in clang-11 thread sanitizer, detects normal data race with random FD in this method +race:DB::LazyPipeFDs::close diff --git a/utils/list-versions/version_date.tsv b/utils/list-versions/version_date.tsv index 3ec9ee11b95..75605968a37 100644 --- a/utils/list-versions/version_date.tsv +++ b/utils/list-versions/version_date.tsv @@ -1,3 +1,4 @@ +v20.9.2.20-stable 2020-09-22 v20.8.3.18-stable 2020-09-18 v20.8.2.3-stable 2020-09-08 v20.7.3.7-stable 2020-09-18 diff --git a/website/benchmark/benchmark.js b/website/benchmark/benchmark.js index 6113864d4d1..8fb2693aa97 100644 --- a/website/benchmark/benchmark.js +++ b/website/benchmark/benchmark.js @@ -403,7 +403,7 @@ function generate_diagram() { var table_row = ""; table_row += ""; - table_row += "Vamsi Krishna B.
Results for Dell XPS laptop and Google Pixel phone is from Alexander Kuzmenkov.
Results for Android phones for "cold cache" are done without cache flushing, so they are not "cold" and cannot be compared.
-Results for Digital Ocean are from Zimin Aleksey. +Results for Digital Ocean are from Zimin Aleksey.
+Results for 2x EPYC 7642 w/ 512 GB RAM (192 Cores) + 12X 1TB SSD (RAID6) are from Yiğit Konur and Metehan Çetinkaya of seo.do.

diff --git a/website/benchmark/hardware/results/022_amd_epyc_7402p.json b/website/benchmark/hardware/results/amd_epyc_7402p.json
similarity index 100%
rename from website/benchmark/hardware/results/022_amd_epyc_7402p.json
rename to website/benchmark/hardware/results/amd_epyc_7402p.json
diff --git a/website/benchmark/hardware/results/043_amd_epyc_7502p.json b/website/benchmark/hardware/results/amd_epyc_7502p.json
similarity index 100%
rename from website/benchmark/hardware/results/043_amd_epyc_7502p.json
rename to website/benchmark/hardware/results/amd_epyc_7502p.json
diff --git a/website/benchmark/hardware/results/005_amd_epyc_7551.json b/website/benchmark/hardware/results/amd_epyc_7551.json
similarity index 100%
rename from website/benchmark/hardware/results/005_amd_epyc_7551.json
rename to website/benchmark/hardware/results/amd_epyc_7551.json
diff --git a/website/benchmark/hardware/results/amd_epyc_7642.json b/website/benchmark/hardware/results/amd_epyc_7642.json
new file mode 100644
index 00000000000..b60146d515f
--- /dev/null
+++ b/website/benchmark/hardware/results/amd_epyc_7642.json
@@ -0,0 +1,56 @@
+[
+    {
+        "system": "AMD EPYC 7642",
+        "system_full": "2x AMD EPYC 7642 / 512 GB RAM / 12x 1TB SSD (RAID 6)",
+        "time": "2020-09-21 00:00:00",
+        "kind": "server",
+        "result":
+        [
+            [0.003, 0.003, 0.002],
+            [0.039, 0.041, 0.024],
+            [0.052, 0.029, 0.029],
+            [0.087, 0.031, 0.032],
+            [0.152, 0.106, 0.105],
+            [0.204, 0.128, 0.128],
+            [0.049, 0.028, 0.027],
+            [0.031, 0.024, 0.027],
+            [0.190, 0.130, 0.125],
+            [0.210, 0.142, 0.138],
+            [0.142, 0.091, 0.087],
+            [0.143, 0.101, 0.097],
+            [0.318, 0.170, 0.163],
+            [0.303, 0.193, 0.191],
+            [0.240, 0.175, 0.166],
+            [0.200, 0.166, 0.161],
+            [0.466, 0.364, 0.345],
+            [0.298, 0.244, 0.231],
+            [1.288, 0.901, 0.859],
+            [0.087, 0.031, 0.025],
+            [0.663, 0.201, 0.191],
+            [0.661, 0.213, 0.154],
+            [1.118, 0.599, 0.593],
+            [1.708, 0.392, 0.318],
+            [0.202, 0.065, 0.066],
+            [0.135, 0.061, 0.057],
+            [0.203, 0.066, 0.067],
+            [0.630, 0.296, 0.290],
+            [0.578, 0.281, 0.262],
+            [0.662, 0.670, 0.639],
+            [0.241, 0.153, 0.150],
+            [0.424, 0.235, 0.231],
+            [1.505, 1.090, 1.090],
+            [1.038, 0.818, 0.799],
+            [1.064, 0.856, 0.809],
+            [0.332, 0.297, 0.275],
+            [0.200, 0.169, 0.168],
+            [0.083, 0.070, 0.071],
+            [0.090, 0.059, 0.063],
+            [0.416, 0.419, 0.398],
+            [0.048, 0.032, 0.032],
+            [0.036, 0.027, 0.025],
+            [0.007, 0.007, 0.007]
+        ]
+    }
+]
+
+
diff --git a/website/benchmark/hardware/results/041_amd_epyc_7702.json b/website/benchmark/hardware/results/amd_epyc_7702.json
similarity index 100%
rename from website/benchmark/hardware/results/041_amd_epyc_7702.json
rename to website/benchmark/hardware/results/amd_epyc_7702.json
diff --git a/website/benchmark/hardware/results/038_amd_ryzen_9_3950x.json b/website/benchmark/hardware/results/amd_ryzen_9_3950x.json
similarity index 100%
rename from website/benchmark/hardware/results/038_amd_ryzen_9_3950x.json
rename to website/benchmark/hardware/results/amd_ryzen_9_3950x.json
diff --git a/website/benchmark/hardware/results/035_aws_a1_4xlarge.json b/website/benchmark/hardware/results/aws_a1_4xlarge.json
similarity index 100%
rename from website/benchmark/hardware/results/035_aws_a1_4xlarge.json
rename to website/benchmark/hardware/results/aws_a1_4xlarge.json
diff --git a/website/benchmark/hardware/results/049_aws_c5metal.json b/website/benchmark/hardware/results/aws_c5metal_100.json
similarity index 97%
rename from website/benchmark/hardware/results/049_aws_c5metal.json
rename to website/benchmark/hardware/results/aws_c5metal_100.json
index 9d933500ad1..4bb0a1f1f52 100644
--- a/website/benchmark/hardware/results/049_aws_c5metal.json
+++ b/website/benchmark/hardware/results/aws_c5metal_100.json
@@ -1,6 +1,6 @@
 [
     {
-        "system": "AWS c5.metal",
+        "system": "AWS c5.metal 100GB",
         "system_full": "AWS c5.metal 96vCPU 192GiB 100GB SSD",
         "time": "2020-01-17 00:00:00",
         "kind": "cloud",
diff --git a/website/benchmark/hardware/results/aws_c5metal_300.json b/website/benchmark/hardware/results/aws_c5metal_300.json
new file mode 100644
index 00000000000..87435f6fb45
--- /dev/null
+++ b/website/benchmark/hardware/results/aws_c5metal_300.json
@@ -0,0 +1,54 @@
+[
+    {
+        "system": "AWS c5.metal 300GB",
+        "system_full": "AWS c5.metal 96vCPU 192GiB 300GB SSD",
+        "time": "2020-09-23 00:00:00",
+        "kind": "cloud",
+        "result":
+        [
+[0.012, 0.002, 0.002],
+[0.066, 0.018, 0.018],
+[0.066, 0.028, 0.027],
+[0.186, 0.033, 0.031],
+[0.362, 0.095, 0.093],
+[1.092, 0.141, 0.142],
+[0.035, 0.020, 0.021],
+[0.023, 0.018, 0.018],
+[0.303, 0.176, 0.181],
+[0.817, 0.198, 0.198],
+[0.322, 0.091, 0.092],
+[0.600, 0.098, 0.098],
+[1.059, 0.265, 0.253],
+[1.542, 0.318, 0.310],
+[0.682, 0.286, 0.283],
+[0.372, 0.320, 0.295],
+[1.610, 0.832, 0.750],
+[1.301, 0.492, 0.458],
+[3.446, 1.361, 1.330],
+[0.189, 0.050, 0.035],
+[9.246, 0.338, 0.265],
+[10.163, 0.277, 0.249],
+[19.616, 0.663, 0.639],
+[20.068, 0.418, 0.367],
+[1.812, 0.097, 0.093],
+[0.976, 0.090, 0.083],
+[2.458, 0.097, 0.095],
+[9.397, 0.344, 0.323],
+[7.320, 0.415, 0.413],
+[0.780, 0.753, 0.748],
+[1.328, 0.226, 0.223],
+[4.643, 0.339, 0.329],
+[4.136, 2.049, 2.021],
+[9.213, 1.080, 0.923],
+[9.192, 1.019, 0.959],
+[0.410, 0.360, 0.378],
+[0.244, 0.155, 0.163],
+[0.102, 0.077, 0.071],
+[0.045, 0.055, 0.049],
+[0.459, 0.318, 0.316],
+[0.069, 0.033, 0.026],
+[0.035, 0.027, 0.020],
+[0.019, 0.009, 0.010]
+        ]
+    }
+]
diff --git a/website/benchmark/hardware/results/aws_c6metal.json b/website/benchmark/hardware/results/aws_c6metal.json
new file mode 100644
index 00000000000..83e75506ad9
--- /dev/null
+++ b/website/benchmark/hardware/results/aws_c6metal.json
@@ -0,0 +1,54 @@
+[
+    {
+        "system": "AWS c6.metal (Graviton 2)",
+        "system_full": "AWS c6.metal (Graviton 2) 64 CPU 128GiB 2x1.7TB local SSD md-RAID-0",
+        "time": "2020-09-23 00:00:00",
+        "kind": "cloud",
+        "result":
+        [
+[0.004, 0.003, 0.001],
+[0.085, 0.030, 0.032],
+[0.029, 0.028, 0.026],
+[0.047, 0.068, 0.070],
+[0.090, 0.075, 0.079],
+[0.140, 0.126, 0.124],
+[0.018, 0.013, 0.012],
+[0.032, 0.021, 0.032],
+[0.154, 0.139, 0.138],
+[0.204, 0.155, 0.156],
+[0.101, 0.091, 0.090],
+[0.104, 0.104, 0.100],
+[0.223, 0.203, 0.203],
+[0.273, 0.255, 0.253],
+[0.232, 0.212, 0.213],
+[0.230, 0.223, 0.223],
+[0.506, 0.484, 0.483],
+[0.334, 0.330, 0.316],
+[1.139, 1.085, 1.088],
+[0.065, 0.077, 0.054],
+[0.484, 0.315, 0.315],
+[0.545, 0.295, 0.291],
+[0.980, 0.661, 1.476],
+[1.415, 1.101, 0.675],
+[0.150, 0.086, 0.085],
+[0.094, 0.077, 0.078],
+[0.150, 0.087, 0.086],
+[0.478, 0.348, 0.346],
+[0.424, 0.403, 0.399],
+[1.435, 1.388, 1.417],
+[0.215, 0.178, 0.178],
+[0.378, 0.294, 0.289],
+[1.669, 1.590, 1.596],
+[1.105, 1.007, 1.010],
+[1.074, 1.041, 1.014],
+[0.339, 0.323, 0.323],
+[0.210, 0.199, 0.204],
+[0.096, 0.091, 0.092],
+[0.084, 0.080, 0.079],
+[0.425, 0.405, 0.423],
+[0.034, 0.025, 0.022],
+[0.022, 0.019, 0.018],
+[0.007, 0.007, 0.007]
+        ]
+    }
+]
diff --git a/website/benchmark/hardware/results/015_aws_i3_8xlarge.json b/website/benchmark/hardware/results/aws_i3_8xlarge.json
similarity index 100%
rename from website/benchmark/hardware/results/015_aws_i3_8xlarge.json
rename to website/benchmark/hardware/results/aws_i3_8xlarge.json
diff --git a/website/benchmark/hardware/results/017_aws_i3en_24xlarge.json b/website/benchmark/hardware/results/aws_i3en_24xlarge.json
similarity index 100%
rename from website/benchmark/hardware/results/017_aws_i3en_24xlarge.json
rename to website/benchmark/hardware/results/aws_i3en_24xlarge.json
diff --git a/website/benchmark/hardware/results/046_aws_lightsail_4vcpu.json b/website/benchmark/hardware/results/aws_lightsail_4vcpu.json
similarity index 100%
rename from website/benchmark/hardware/results/046_aws_lightsail_4vcpu.json
rename to website/benchmark/hardware/results/aws_lightsail_4vcpu.json
diff --git a/website/benchmark/hardware/results/051_aws_m5a_8xlarge.json b/website/benchmark/hardware/results/aws_m5a_8xlarge.json
similarity index 100%
rename from website/benchmark/hardware/results/051_aws_m5a_8xlarge.json
rename to website/benchmark/hardware/results/aws_m5a_8xlarge.json
diff --git a/website/benchmark/hardware/results/019_aws_m5ad_24xlarge.json b/website/benchmark/hardware/results/aws_m5ad_24xlarge.json
similarity index 100%
rename from website/benchmark/hardware/results/019_aws_m5ad_24xlarge.json
rename to website/benchmark/hardware/results/aws_m5ad_24xlarge.json
diff --git a/website/benchmark/hardware/results/016_aws_m5d_24xlarge.json b/website/benchmark/hardware/results/aws_m5d_24xlarge.json
similarity index 100%
rename from website/benchmark/hardware/results/016_aws_m5d_24xlarge.json
rename to website/benchmark/hardware/results/aws_m5d_24xlarge.json
diff --git a/website/benchmark/hardware/results/045_aws_m6g_16xlarge.json b/website/benchmark/hardware/results/aws_m6g_16xlarge.json
similarity index 92%
rename from website/benchmark/hardware/results/045_aws_m6g_16xlarge.json
rename to website/benchmark/hardware/results/aws_m6g_16xlarge.json
index 323fd2cc50c..a0d15a0d384 100644
--- a/website/benchmark/hardware/results/045_aws_m6g_16xlarge.json
+++ b/website/benchmark/hardware/results/aws_m6g_16xlarge.json
@@ -1,7 +1,7 @@
 [
     {
-        "system": "AWS m6g.16xlarge",
-        "system_full": "AWS m6g.16xlarge (Graviton2) 64 vCPU, 256 GiB RAM, EBS",
+        "system": "AWS m6g.16xlarge (Graviton 2)",
+        "system_full": "AWS m6g.16xlarge (Graviton 2) 64 vCPU, 256 GiB RAM, EBS",
         "time": "2020-02-13 00:00:00",
         "kind": "cloud",
         "result":
diff --git a/website/benchmark/hardware/results/014_azure_ds3v2.json b/website/benchmark/hardware/results/azure_ds3v2.json
similarity index 100%
rename from website/benchmark/hardware/results/014_azure_ds3v2.json
rename to website/benchmark/hardware/results/azure_ds3v2.json
diff --git a/website/benchmark/hardware/results/039_azure_e32s.json b/website/benchmark/hardware/results/azure_e32s.json
similarity index 100%
rename from website/benchmark/hardware/results/039_azure_e32s.json
rename to website/benchmark/hardware/results/azure_e32s.json
diff --git a/website/benchmark/hardware/results/009_core_i5_3210M_lenovo_b580.json b/website/benchmark/hardware/results/core_i5_3210M_lenovo_b580.json
similarity index 100%
rename from website/benchmark/hardware/results/009_core_i5_3210M_lenovo_b580.json
rename to website/benchmark/hardware/results/core_i5_3210M_lenovo_b580.json
diff --git a/website/benchmark/hardware/results/042_core_i7_6770hq_intel_nuc.json b/website/benchmark/hardware/results/core_i7_6770hq_intel_nuc.json
similarity index 100%
rename from website/benchmark/hardware/results/042_core_i7_6770hq_intel_nuc.json
rename to website/benchmark/hardware/results/core_i7_6770hq_intel_nuc.json
diff --git a/website/benchmark/hardware/results/020_core_i7_8550u_lenovo_x1.json b/website/benchmark/hardware/results/core_i7_8550u_lenovo_x1.json
similarity index 100%
rename from website/benchmark/hardware/results/020_core_i7_8550u_lenovo_x1.json
rename to website/benchmark/hardware/results/core_i7_8550u_lenovo_x1.json
diff --git a/website/benchmark/hardware/results/040_core_i7_macbook_pro_2018.json b/website/benchmark/hardware/results/core_i7_macbook_pro_2018.json
similarity index 100%
rename from website/benchmark/hardware/results/040_core_i7_macbook_pro_2018.json
rename to website/benchmark/hardware/results/core_i7_macbook_pro_2018.json
diff --git a/website/benchmark/hardware/results/012_dell_r530.json b/website/benchmark/hardware/results/dell_r530.json
similarity index 100%
rename from website/benchmark/hardware/results/012_dell_r530.json
rename to website/benchmark/hardware/results/dell_r530.json
diff --git a/website/benchmark/hardware/results/047_dell_xps.json b/website/benchmark/hardware/results/dell_xps.json
similarity index 100%
rename from website/benchmark/hardware/results/047_dell_xps.json
rename to website/benchmark/hardware/results/dell_xps.json
diff --git a/website/benchmark/hardware/results/018_huawei_taishan_2280_v2.json b/website/benchmark/hardware/results/huawei_taishan_2280_v2.json
similarity index 100%
rename from website/benchmark/hardware/results/018_huawei_taishan_2280_v2.json
rename to website/benchmark/hardware/results/huawei_taishan_2280_v2.json
diff --git a/website/benchmark/hardware/results/037_pinebook_pro.json b/website/benchmark/hardware/results/pinebook_pro.json
similarity index 100%
rename from website/benchmark/hardware/results/037_pinebook_pro.json
rename to website/benchmark/hardware/results/pinebook_pro.json
diff --git a/website/benchmark/hardware/results/048_pixel_3a.json b/website/benchmark/hardware/results/pixel_3a.json
similarity index 100%
rename from website/benchmark/hardware/results/048_pixel_3a.json
rename to website/benchmark/hardware/results/pixel_3a.json
diff --git a/website/benchmark/hardware/results/024_selectel_cloud_16vcpu.json b/website/benchmark/hardware/results/selectel_cloud_16vcpu.json
similarity index 100%
rename from website/benchmark/hardware/results/024_selectel_cloud_16vcpu.json
rename to website/benchmark/hardware/results/selectel_cloud_16vcpu.json
diff --git a/website/benchmark/hardware/results/008_skylake_kvm.json b/website/benchmark/hardware/results/skylake_kvm.json
similarity index 100%
rename from website/benchmark/hardware/results/008_skylake_kvm.json
rename to website/benchmark/hardware/results/skylake_kvm.json
diff --git a/website/benchmark/hardware/results/013_xeon_2176g.json b/website/benchmark/hardware/results/xeon_2176g.json
similarity index 100%
rename from website/benchmark/hardware/results/013_xeon_2176g.json
rename to website/benchmark/hardware/results/xeon_2176g.json
diff --git a/website/benchmark/hardware/results/021_xeon_e5645.json b/website/benchmark/hardware/results/xeon_e5645.json
similarity index 100%
rename from website/benchmark/hardware/results/021_xeon_e5645.json
rename to website/benchmark/hardware/results/xeon_e5645.json
diff --git a/website/benchmark/hardware/results/023_xeon_e5_1650v3.json b/website/benchmark/hardware/results/xeon_e5_1650v3.json
similarity index 100%
rename from website/benchmark/hardware/results/023_xeon_e5_1650v3.json
rename to website/benchmark/hardware/results/xeon_e5_1650v3.json
diff --git a/website/benchmark/hardware/results/010_xeon_e5_2640v4.json b/website/benchmark/hardware/results/xeon_e5_2640v4.json
similarity index 100%
rename from website/benchmark/hardware/results/010_xeon_e5_2640v4.json
rename to website/benchmark/hardware/results/xeon_e5_2640v4.json
diff --git a/website/benchmark/hardware/results/xeon_e5_2650_4hdd.json b/website/benchmark/hardware/results/xeon_e5_2650_4hdd.json
new file mode 100644
index 00000000000..478229badcc
--- /dev/null
+++ b/website/benchmark/hardware/results/xeon_e5_2650_4hdd.json
@@ -0,0 +1,54 @@
+[
+    {
+        "system": "Xeon E5-2650",
+        "system_full": "Xeon E5-2650 v2 @ 2.60GHz, 2 sockets, 16 threads, 4xHDD RAID-10",
+        "time": "2020-09-25 00:00:00",
+        "kind": "server",
+        "result":
+        [
+[0.040, 0.002, 0.002],
+[0.698, 0.014, 0.013],
+[0.533, 0.030, 0.030],
+[0.700, 0.043, 0.046],
+[0.749, 0.108, 0.102],
+[1.350, 0.221, 0.259],
+[0.168, 0.020, 0.020],
+[0.096, 0.013, 0.013],
+[1.132, 0.406, 0.386],
+[1.279, 0.426, 0.440],
+[0.842, 0.153, 0.146],
+[1.042, 0.186, 0.182],
+[1.149, 0.536, 0.533],
+[1.734, 0.688, 0.683],
+[1.481, 0.688, 0.651],
+[1.100, 0.709, 0.700],
+[2.367, 1.668, 1.682],
+[1.687, 1.013, 0.988],
+[4.768, 3.647, 3.783],
+[0.599, 0.055, 0.040],
+[5.530, 0.646, 0.622],
+[6.658, 0.671, 0.648],
+[11.795, 1.645, 1.574],
+[19.248, 1.168, 0.906],
+[1.826, 0.224, 0.232],
+[0.964, 0.189, 0.187],
+[2.058, 0.234, 0.215],
+[5.811, 0.758, 0.704],
+[4.805, 1.014, 0.995],
+[2.272, 2.035, 1.838],
+[1.827, 0.546, 0.547],
+[3.643, 0.863, 0.834],
+[5.816, 5.069, 5.168],
+[6.585, 2.655, 2.756],
+[6.949, 2.681, 2.795],
+[1.325, 1.090, 1.072],
+[0.460, 0.183, 0.179],
+[1.000, 0.087, 0.091],
+[0.142, 0.051, 0.038],
+[0.808, 0.392, 0.391],
+[0.256, 0.021, 0.015],
+[0.132, 0.038, 0.012],
+[0.054, 0.006, 0.006]
+        ]
+    }
+]
diff --git a/website/benchmark/hardware/results/007_xeon_e5_2650.json b/website/benchmark/hardware/results/xeon_e5_2650_8hdd.json
similarity index 100%
rename from website/benchmark/hardware/results/007_xeon_e5_2650.json
rename to website/benchmark/hardware/results/xeon_e5_2650_8hdd.json
diff --git a/website/benchmark/hardware/results/050_xeon_e5_2650l_v3.json b/website/benchmark/hardware/results/xeon_e5_2650l_v3.json
similarity index 100%
rename from website/benchmark/hardware/results/050_xeon_e5_2650l_v3.json
rename to website/benchmark/hardware/results/xeon_e5_2650l_v3.json
diff --git a/website/benchmark/hardware/results/001_xeon_gold_6230.json b/website/benchmark/hardware/results/xeon_gold_6230.json
similarity index 100%
rename from website/benchmark/hardware/results/001_xeon_gold_6230.json
rename to website/benchmark/hardware/results/xeon_gold_6230.json
diff --git a/website/benchmark/hardware/results/044_xeon_silver_4114.json b/website/benchmark/hardware/results/xeon_silver_4114.json
similarity index 100%
rename from website/benchmark/hardware/results/044_xeon_silver_4114.json
rename to website/benchmark/hardware/results/xeon_silver_4114.json
diff --git a/website/benchmark/hardware/results/006_xeon_sp_gold.json b/website/benchmark/hardware/results/xeon_sp_gold.json
similarity index 100%
rename from website/benchmark/hardware/results/006_xeon_sp_gold.json
rename to website/benchmark/hardware/results/xeon_sp_gold.json
diff --git a/website/benchmark/hardware/results/036_xeon_x5675.json b/website/benchmark/hardware/results/xeon_x5675.json
similarity index 100%
rename from website/benchmark/hardware/results/036_xeon_x5675.json
rename to website/benchmark/hardware/results/xeon_x5675.json
diff --git a/website/benchmark/hardware/results/004_yandex_cloud_broadwell_4_vcpu.json b/website/benchmark/hardware/results/yandex_cloud_broadwell_4_vcpu.json
similarity index 100%
rename from website/benchmark/hardware/results/004_yandex_cloud_broadwell_4_vcpu.json
rename to website/benchmark/hardware/results/yandex_cloud_broadwell_4_vcpu.json
diff --git a/website/benchmark/hardware/results/yandex_cloud_cascade_lake_32_vcpu.json b/website/benchmark/hardware/results/yandex_cloud_cascade_lake_32_vcpu.json
new file mode 100644
index 00000000000..5d2927c224d
--- /dev/null
+++ b/website/benchmark/hardware/results/yandex_cloud_cascade_lake_32_vcpu.json
@@ -0,0 +1,55 @@
+[
+    {
+        "system": "Yandex Cloud 32vCPU",
+        "system_full": "Yandex Cloud Cascade Lake, 32 vCPU, 128 GB RAM, 300 GB SSD",
+        "cpu_vendor": "Intel",
+        "time": "2020-09-23 00:00:00",
+        "kind": "cloud",
+        "result":
+        [
+[0.021, 0.001, 0.001],
+[0.051, 0.011, 0.010],
+[0.396, 0.025, 0.025],
+[1.400, 0.035, 0.033],
+[1.413, 0.095, 0.098],
+[2.272, 0.222, 0.208],
+[0.042, 0.014, 0.014],
+[0.024, 0.011, 0.010],
+[1.948, 0.311, 0.303],
+[2.267, 0.379, 0.348],
+[1.498, 0.138, 0.135],
+[1.563, 0.164, 0.155],
+[2.435, 0.544, 0.516],
+[3.937, 0.661, 0.659],
+[2.724, 0.727, 0.642],
+[1.795, 0.683, 0.641],
+[4.668, 1.682, 1.643],
+[3.802, 1.051, 0.895],
+[8.297, 3.835, 4.592],
+[1.427, 0.100, 0.033],
+[16.816, 0.652, 0.547],
+[19.159, 0.650, 0.532],
+[35.374, 1.538, 1.311],
+[32.736, 0.854, 0.699],
+[4.767, 0.203, 0.184],
+[2.249, 0.166, 0.158],
+[4.759, 0.207, 0.189],
+[16.826, 0.584, 0.529],
+[14.308, 0.920, 0.789],
+[1.137, 1.041, 0.992],
+[3.967, 0.545, 0.555],
+[9.196, 0.872, 0.789],
+[9.554, 5.501, 5.694],
+[17.810, 2.712, 2.329],
+[17.726, 2.653, 2.793],
+[1.260, 0.955, 0.978],
+[0.260, 0.171, 0.164],
+[0.092, 0.065, 0.069],
+[0.046, 0.041, 0.037],
+[0.475, 0.391, 0.383],
+[0.066, 0.021, 0.019],
+[0.023, 0.024, 0.011],
+[0.022, 0.005, 0.005]
+        ]
+    }
+]
diff --git a/website/benchmark/hardware/results/003_yandex_cloud_cascade_lake_4_vcpu.json b/website/benchmark/hardware/results/yandex_cloud_cascade_lake_4_vcpu.json
similarity index 100%
rename from website/benchmark/hardware/results/003_yandex_cloud_cascade_lake_4_vcpu.json
rename to website/benchmark/hardware/results/yandex_cloud_cascade_lake_4_vcpu.json
diff --git a/website/benchmark/hardware/results/002_yandex_cloud_cascade_lake_64_vcpu.json b/website/benchmark/hardware/results/yandex_cloud_cascade_lake_64_vcpu.json
similarity index 100%
rename from website/benchmark/hardware/results/002_yandex_cloud_cascade_lake_64_vcpu.json
rename to website/benchmark/hardware/results/yandex_cloud_cascade_lake_64_vcpu.json
diff --git a/website/benchmark/hardware/results/yandex_cloud_cascade_lake_80_vcpu.json b/website/benchmark/hardware/results/yandex_cloud_cascade_lake_80_vcpu.json
new file mode 100644
index 00000000000..565a5bd41c2
--- /dev/null
+++ b/website/benchmark/hardware/results/yandex_cloud_cascade_lake_80_vcpu.json
@@ -0,0 +1,55 @@
+[
+    {
+        "system": "Yandex Cloud 80vCPU",
+        "system_full": "Yandex Cloud Cascade Lake, 80 vCPU, 160 GB RAM, 4TB SSD",
+        "cpu_vendor": "Intel",
+        "time": "2020-09-23 00:00:00",
+        "kind": "cloud",
+        "result":
+        [
+[0.024, 0.002, 0.002],
+[0.067, 0.012, 0.012],
+[0.104, 0.017, 0.017],
+[0.411, 0.020, 0.021],
+[0.577, 0.069, 0.068],
+[0.739, 0.123, 0.122],
+[0.038, 0.015, 0.014],
+[0.024, 0.012, 0.012],
+[0.625, 0.169, 0.168],
+[0.748, 0.216, 0.207],
+[0.471, 0.089, 0.082],
+[0.487, 0.092, 0.087],
+[0.818, 0.256, 0.245],
+[1.324, 0.352, 0.352],
+[0.927, 0.333, 0.319],
+[0.642, 0.376, 0.377],
+[1.686, 0.983, 0.959],
+[1.290, 0.588, 0.582],
+[3.105, 1.793, 1.818],
+[0.426, 0.031, 0.034],
+[5.559, 0.415, 0.344],
+[6.343, 0.435, 0.405],
+[11.779, 1.151, 1.101],
+[11.851, 0.537, 0.509],
+[1.530, 0.125, 0.126],
+[0.695, 0.103, 0.103],
+[1.531, 0.127, 0.119],
+[5.576, 0.541, 0.496],
+[4.718, 0.740, 0.719],
+[1.429, 1.467, 1.500],
+[1.309, 0.335, 0.322],
+[3.138, 0.505, 0.518],
+[5.481, 3.475, 3.512],
+[6.330, 1.877, 1.818],
+[6.238, 1.843, 1.813],
+[0.660, 0.626, 0.603],
+[0.251, 0.152, 0.151],
+[0.090, 0.058, 0.059],
+[0.041, 0.038, 0.034],
+[0.470, 0.376, 0.385],
+[0.076, 0.015, 0.018],
+[0.030, 0.018, 0.010],
+[0.024, 0.006, 0.005]
+        ]
+    }
+]
diff --git a/website/benchmark/hardware/results/011_yandex_managed_clickhouse_s3_3xlarge.json b/website/benchmark/hardware/results/yandex_managed_clickhouse_s3_3xlarge.json
similarity index 100%
rename from website/benchmark/hardware/results/011_yandex_managed_clickhouse_s3_3xlarge.json
rename to website/benchmark/hardware/results/yandex_managed_clickhouse_s3_3xlarge.json