Merge remote-tracking branch 'upstream/master' into replxx

Ivan Lezhankin 2020-01-09 19:21:04 +03:00
commit 5950f6c081
636 changed files with 10431 additions and 5824 deletions

.gitmodules (3 changes)

@@ -137,3 +137,6 @@
 [submodule "contrib/replxx"]
     path = contrib/replxx
     url = https://github.com/AmokHuginnsson/replxx.git
+[submodule "contrib/ryu"]
+    path = contrib/ryu
+    url = https://github.com/ClickHouse-Extras/ryu.git

CHANGELOG.md

@@ -1,3 +1,38 @@
+## ClickHouse release v19.17.6.36, 2019-12-27
+
+### Bug Fix
+* Fixed a potential buffer overflow in decompress. A malicious user could pass fabricated compressed data that causes a read past the buffer. This issue was found by Eldar Zaitov from the Yandex information security team. [#8404](https://github.com/ClickHouse/ClickHouse/pull/8404) ([alexey-milovidov](https://github.com/alexey-milovidov))
+* Fixed a possible server crash (`std::terminate`) when the server cannot send or write data in JSON or XML format with values of the String data type (which require UTF-8 validation), when compressing result data with the Brotli algorithm, or in some other rare cases. [#8384](https://github.com/ClickHouse/ClickHouse/pull/8384) ([alexey-milovidov](https://github.com/alexey-milovidov))
+* Fixed dictionaries with a source from a ClickHouse `VIEW`; reading such dictionaries no longer causes the error `There is no query`. [#8351](https://github.com/ClickHouse/ClickHouse/pull/8351) ([Nikolai Kochetov](https://github.com/KochetovNicolai))
+* Fixed checking whether a client host is allowed by `host_regexp` specified in `users.xml`. [#8241](https://github.com/ClickHouse/ClickHouse/pull/8241), [#8342](https://github.com/ClickHouse/ClickHouse/pull/8342) ([Vitaly Baranov](https://github.com/vitlibar))
+* `RENAME TABLE` for a distributed table now renames the folder containing inserted data before sending it to shards. This fixes an issue with successive renames `tableA->tableB`, `tableC->tableA`. [#8306](https://github.com/ClickHouse/ClickHouse/pull/8306) ([tavplubix](https://github.com/tavplubix))
+* `range_hashed` external dictionaries created by DDL queries now allow ranges of arbitrary numeric types. [#8275](https://github.com/ClickHouse/ClickHouse/pull/8275) ([alesapin](https://github.com/alesapin))
+* Fixed the `INSERT INTO table SELECT ... FROM mysql(...)` table function. [#8234](https://github.com/ClickHouse/ClickHouse/pull/8234) ([tavplubix](https://github.com/tavplubix))
+* Fixed a segfault in `INSERT INTO TABLE FUNCTION file()` while inserting into a file that doesn't exist. Now the file is created and the insert is processed. [#8177](https://github.com/ClickHouse/ClickHouse/pull/8177) ([Olga Khvostikova](https://github.com/stavrolia))
+* Fixed a `bitmapAnd` error when intersecting an aggregated bitmap and a scalar bitmap. [#8082](https://github.com/ClickHouse/ClickHouse/pull/8082) ([Yue Huang](https://github.com/moon03432))
+* Fixed a segfault when an `EXISTS` query was used without a `TABLE` or `DICTIONARY` qualifier, e.g. `EXISTS t`. [#8213](https://github.com/ClickHouse/ClickHouse/pull/8213) ([alexey-milovidov](https://github.com/alexey-milovidov))
+* Fixed the return type of the functions `rand` and `randConstant` in case of a nullable argument. The functions now always return `UInt32`, never `Nullable(UInt32)`. [#8204](https://github.com/ClickHouse/ClickHouse/pull/8204) ([Nikolai Kochetov](https://github.com/KochetovNicolai))
+* Fixed `DROP DICTIONARY IF EXISTS db.dict`; it no longer throws an exception if `db` doesn't exist. [#8185](https://github.com/ClickHouse/ClickHouse/pull/8185) ([Vitaly Baranov](https://github.com/vitlibar))
+* If a table wasn't completely dropped because of a server crash, the server will try to restore and load it. [#8176](https://github.com/ClickHouse/ClickHouse/pull/8176) ([tavplubix](https://github.com/tavplubix))
+* Fixed a trivial count query for a distributed table with more than two local shard tables. [#8164](https://github.com/ClickHouse/ClickHouse/pull/8164) ([小路](https://github.com/nicelulu))
+* Fixed a bug that led to a data race in `DB::BlockStreamProfileInfo::calculateRowsBeforeLimit()`. [#8143](https://github.com/ClickHouse/ClickHouse/pull/8143) ([Alexander Kazakov](https://github.com/Akazz))
+* Fixed `ALTER table MOVE part` executed immediately after the specified part was merged, which could cause moving the part that the specified part was merged into. Now it correctly moves the specified part. [#8104](https://github.com/ClickHouse/ClickHouse/pull/8104) ([Vladimir Chebotarev](https://github.com/excitoon))
+* Expressions for dictionaries can now be specified as strings. This is useful for calculating attributes while extracting data from non-ClickHouse sources, because it allows using non-ClickHouse syntax for those expressions. [#8098](https://github.com/ClickHouse/ClickHouse/pull/8098) ([alesapin](https://github.com/alesapin))
+* Fixed a very rare race in `clickhouse-copier` caused by an overflow in ZXid. [#8088](https://github.com/ClickHouse/ClickHouse/pull/8088) ([Ding Xiang Fei](https://github.com/dingxiangfei2009))
+* Fixed a bug where, after a query failed (due to "Too many simultaneous queries", for example), external-table info was not read, and the next request would interpret this info as the beginning of the next query, causing an error like `Unknown packet from client`. [#8084](https://github.com/ClickHouse/ClickHouse/pull/8084) ([Azat Khuzhin](https://github.com/azat))
+* Avoid a null dereference after "Unknown packet X from server". [#8071](https://github.com/ClickHouse/ClickHouse/pull/8071) ([Azat Khuzhin](https://github.com/azat))
+* Restored support for all ICU locales, added the ability to apply collations to constant expressions, and added the language name to the `system.collations` table. [#8051](https://github.com/ClickHouse/ClickHouse/pull/8051) ([alesapin](https://github.com/alesapin))
+* The number of streams for reading from `StorageFile` and `StorageHDFS` is now limited, to avoid exceeding the memory limit. [#7981](https://github.com/ClickHouse/ClickHouse/pull/7981) ([alesapin](https://github.com/alesapin))
+* Fixed the `CHECK TABLE` query for `*MergeTree` tables without a key. [#7979](https://github.com/ClickHouse/ClickHouse/pull/7979) ([alesapin](https://github.com/alesapin))
+* Removed the mutation number from a part name in case there were no mutations. This improves compatibility with older versions. [#8250](https://github.com/ClickHouse/ClickHouse/pull/8250) ([alesapin](https://github.com/alesapin))
+* Fixed a bug where mutations were skipped for some attached parts because their `data_version` was larger than the table's mutation version. [#7812](https://github.com/ClickHouse/ClickHouse/pull/7812) ([Zhichang Yu](https://github.com/yuzhichang))
+* Allow starting the server with redundant copies of parts after moving them to another device. [#7810](https://github.com/ClickHouse/ClickHouse/pull/7810) ([Vladimir Chebotarev](https://github.com/excitoon))
+* Fixed the error "Sizes of columns doesn't match" that might appear when using aggregate function columns. [#7790](https://github.com/ClickHouse/ClickHouse/pull/7790) ([Boris Granveaud](https://github.com/bgranvea))
+* An exception is now thrown when `WITH TIES` is used alongside `LIMIT BY`, and `TOP` can now be used with `LIMIT BY`. [#7637](https://github.com/ClickHouse/ClickHouse/pull/7637) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov))
+* Fixed dictionary reload when the dictionary has an `invalidate_query` that stopped updates after an exception on previous update attempts. [#8029](https://github.com/ClickHouse/ClickHouse/pull/8029) ([alesapin](https://github.com/alesapin))
+
 ## ClickHouse release v19.17.4.11, 2019-11-22
 ### Backward Incompatible Change

CMakeLists.txt

@@ -176,7 +176,9 @@ if (ARCH_NATIVE)
     set (COMPILER_FLAGS "${COMPILER_FLAGS} -march=native")
 endif ()

-set (CMAKE_CXX_STANDARD 17)
+# cmake < 3.12 doesn't support 20. We'll set CMAKE_CXX_FLAGS for now
+# set (CMAKE_CXX_STANDARD 20)
+set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -std=c++2a")
 set (CMAKE_CXX_EXTENSIONS 0) # https://cmake.org/cmake/help/latest/prop_tgt/CXX_EXTENSIONS.html#prop_tgt:CXX_EXTENSIONS
 set (CMAKE_CXX_STANDARD_REQUIRED ON)
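
Once the minimum supported CMake reaches 3.12, the commented-out form becomes usable directly. A minimal sketch (not part of the commit) of the version-conditional equivalent this workaround stands in for:

    if (CMAKE_VERSION VERSION_LESS 3.12)
        set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -std=c++2a")  # c++2a is the pre-standard spelling of C++20 in GCC/Clang of that era
    else ()
        set (CMAKE_CXX_STANDARD 20)  # CXX_STANDARD 20 is recognized starting with CMake 3.12
    endif ()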
@@ -208,7 +210,7 @@ set (CMAKE_C_FLAGS_DEBUG "${CMAKE_C_FLAGS_DEBUG} -O0 -g3 -ggdb3
 if (COMPILER_CLANG)
     # Exception unwinding doesn't work in clang release build without this option
-    # TODO investigate if contrib/libcxxabi is out of date
+    # TODO investigate that
     set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -fno-omit-frame-pointer")
     set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -fno-omit-frame-pointer")
 endif ()
@@ -248,8 +250,16 @@ endif ()
 string (TOUPPER ${CMAKE_BUILD_TYPE} CMAKE_BUILD_TYPE_UC)
 set (CMAKE_POSTFIX_VARIABLE "CMAKE_${CMAKE_BUILD_TYPE_UC}_POSTFIX")

-if (NOT MAKE_STATIC_LIBRARIES)
-    set(CMAKE_POSITION_INDEPENDENT_CODE ON)
+if (MAKE_STATIC_LIBRARIES)
+    set (CMAKE_POSITION_INDEPENDENT_CODE OFF)
+    if (OS_LINUX)
+        # Slightly more efficient code can be generated
+        set (CMAKE_CXX_FLAGS_RELWITHDEBINFO "${CMAKE_CXX_FLAGS_RELWITHDEBINFO} -fno-pie")
+        set (CMAKE_C_FLAGS_RELWITHDEBINFO "${CMAKE_C_FLAGS_RELWITHDEBINFO} -fno-pie")
+        set (CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} -Wl,-no-pie")
+    endif ()
+else ()
+    set (CMAKE_POSITION_INDEPENDENT_CODE ON)
 endif ()

 # Using "include-what-you-use" tool.
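
The rationale for the Linux branch above: a fully static, non-PIE executable can use absolute addressing and skip GOT/PLT indirection, which is where the "slightly more efficient code" comes from. A per-target sketch of the same toggle (hypothetical target name; `target_link_options` requires CMake >= 3.13):

    add_executable(example_bin main.cpp)  # hypothetical
    set_target_properties(example_bin PROPERTIES POSITION_INDEPENDENT_CODE OFF)
    target_compile_options(example_bin PRIVATE -fno-pie)
    target_link_options(example_bin PRIVATE -no-pie)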

@@ -15,6 +15,7 @@ set(CMAKE_C_STANDARD_LIBRARIES ${DEFAULT_LIBS})
 set (CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -mmacosx-version-min=10.14")
 set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -mmacosx-version-min=10.14")
+set (CMAKE_ASM_FLAGS "${CMAKE_ASM_FLAGS} -mmacosx-version-min=10.14")
 set (CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} -mmacosx-version-min=10.14")
 set (CMAKE_SHARED_LINKER_FLAGS "${CMAKE_SHARED_LINKER_FLAGS} -mmacosx-version-min=10.14")

@@ -2,6 +2,7 @@ set (CMAKE_SYSTEM_NAME "Darwin")
 set (CMAKE_SYSTEM_PROCESSOR "x86_64")
 set (CMAKE_C_COMPILER_TARGET "x86_64-apple-darwin")
 set (CMAKE_CXX_COMPILER_TARGET "x86_64-apple-darwin")
+set (CMAKE_ASM_COMPILER_TARGET "x86_64-apple-darwin")
 set (CMAKE_OSX_SYSROOT "${CMAKE_CURRENT_LIST_DIR}/../toolchain/darwin-x86_64")

 set (CMAKE_TRY_COMPILE_TARGET_TYPE STATIC_LIBRARY) # disable linkage check - it doesn't work in CMake

@@ -1,4 +1,8 @@
-option(ENABLE_ICU "Enable ICU" ${ENABLE_LIBRARIES})
+if (OS_LINUX)
+    option(ENABLE_ICU "Enable ICU" ${ENABLE_LIBRARIES})
+else ()
+    option(ENABLE_ICU "Enable ICU" 0)
+endif ()

 if (ENABLE_ICU)

@@ -13,7 +13,6 @@ if (CMAKE_CROSSCOMPILING)
     if (OS_DARWIN)
         # FIXME: broken dependencies
         set (USE_SNAPPY OFF CACHE INTERNAL "")
-        set (ENABLE_SSL OFF CACHE INTERNAL "")
         set (ENABLE_PROTOBUF OFF CACHE INTERNAL "")
         set (ENABLE_PARQUET OFF CACHE INTERNAL "")
         set (ENABLE_ICU OFF CACHE INTERNAL "")

@@ -2,10 +2,10 @@
 if (CMAKE_CXX_COMPILER_ID STREQUAL "GNU")
     set (CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -w")
-    set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -w -std=c++1z")
+    set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -w")
 elseif (CMAKE_CXX_COMPILER_ID STREQUAL "Clang")
     set (CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -w")
-    set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -w -std=c++1z")
+    set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -w")
 endif ()

 set_property(DIRECTORY PROPERTY EXCLUDE_FROM_ALL 1)

@@ -32,6 +32,8 @@ if (USE_INTERNAL_DOUBLE_CONVERSION_LIBRARY)
     add_subdirectory (double-conversion-cmake)
 endif ()

+add_subdirectory (ryu-cmake)
+
 if (USE_INTERNAL_CITYHASH_LIBRARY)
     add_subdirectory (cityhash102)
 endif ()

@@ -250,6 +252,7 @@ if (USE_EMBEDDED_COMPILER AND USE_INTERNAL_LLVM_LIBRARY)
     endif ()
     set (LLVM_ENABLE_EH 1 CACHE INTERNAL "")
     set (LLVM_ENABLE_RTTI 1 CACHE INTERNAL "")
+    set (LLVM_ENABLE_PIC 0 CACHE INTERNAL "")
     set (LLVM_TARGETS_TO_BUILD "X86;AArch64" CACHE STRING "")
     add_subdirectory (llvm/llvm)
 endif ()

@@ -1,5 +1,7 @@
 include(ExternalProject)

+set (CMAKE_CXX_STANDARD 17)
+
 # === thrift
 set(LIBRARY_DIR ${ClickHouse_SOURCE_DIR}/contrib/thrift/lib/cpp)
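
A note on the `set (CMAKE_CXX_STANDARD 17)` pins that appear here and again in the capnproto and ICU wrappers below (my reading, not stated in the commit): the top level now injects `-std=c++2a` through `CMAKE_CXX_FLAGS`, and the flag CMake derives from a directory-scoped `CMAKE_CXX_STANDARD` is emitted after `CMAKE_CXX_FLAGS` on the compile command line, so a contrib that is not yet C++20-clean can pin itself back to C++17. Roughly:

    # Sketch: the later -std flag wins on the GCC/Clang command line.
    set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -std=c++2a")  # inherited from the parent scope
    set (CMAKE_CXX_STANDARD 17)                            # directory-scoped; yields '... -std=c++2a ... -std=c++17'
    add_library (contrib_example STATIC example.cpp)       # hypothetical target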

@@ -6,88 +6,87 @@ SET(AWS_EVENT_STREAM_LIBRARY_DIR ${ClickHouse_SOURCE_DIR}/contrib/aws-c-event-st
 OPTION(USE_AWS_MEMORY_MANAGEMENT "Aws memory management" OFF)

 configure_file("${AWS_CORE_LIBRARY_DIR}/include/aws/core/SDKConfig.h.in"
                "${CMAKE_CURRENT_BINARY_DIR}/include/aws/core/SDKConfig.h" @ONLY)
 configure_file("${AWS_COMMON_LIBRARY_DIR}/include/aws/common/config.h.in"
                "${CMAKE_CURRENT_BINARY_DIR}/include/aws/common/config.h" @ONLY)

 file(GLOB AWS_CORE_SOURCES
     "${AWS_CORE_LIBRARY_DIR}/source/*.cpp"
     "${AWS_CORE_LIBRARY_DIR}/source/auth/*.cpp"
     "${AWS_CORE_LIBRARY_DIR}/source/client/*.cpp"
     "${AWS_CORE_LIBRARY_DIR}/source/http/*.cpp"
     "${AWS_CORE_LIBRARY_DIR}/source/http/standard/*.cpp"
     "${AWS_CORE_LIBRARY_DIR}/source/http/curl/*.cpp"
     "${AWS_CORE_LIBRARY_DIR}/source/config/*.cpp"
     "${AWS_CORE_LIBRARY_DIR}/source/external/cjson/*.cpp"
     "${AWS_CORE_LIBRARY_DIR}/source/external/tinyxml2/*.cpp"
     "${AWS_CORE_LIBRARY_DIR}/source/internal/*.cpp"
     "${AWS_CORE_LIBRARY_DIR}/source/monitoring/*.cpp"
     "${AWS_CORE_LIBRARY_DIR}/source/net/*.cpp"
     "${AWS_CORE_LIBRARY_DIR}/source/linux-shared/*.cpp"
     "${AWS_CORE_LIBRARY_DIR}/source/platform/linux-shared/*.cpp"
     "${AWS_CORE_LIBRARY_DIR}/source/utils/*.cpp"
     "${AWS_CORE_LIBRARY_DIR}/source/utils/base64/*.cpp"
     "${AWS_CORE_LIBRARY_DIR}/source/utils/event/*.cpp"
     "${AWS_CORE_LIBRARY_DIR}/source/utils/crypto/*.cpp"
     "${AWS_CORE_LIBRARY_DIR}/source/utils/crypto/openssl/*.cpp"
     "${AWS_CORE_LIBRARY_DIR}/source/utils/crypto/factory/*.cpp"
     "${AWS_CORE_LIBRARY_DIR}/source/utils/json/*.cpp"
     "${AWS_CORE_LIBRARY_DIR}/source/utils/logging/*.cpp"
     "${AWS_CORE_LIBRARY_DIR}/source/utils/memory/*.cpp"
     "${AWS_CORE_LIBRARY_DIR}/source/utils/memory/stl/*.cpp"
     "${AWS_CORE_LIBRARY_DIR}/source/utils/stream/*.cpp"
     "${AWS_CORE_LIBRARY_DIR}/source/utils/threading/*.cpp"
     "${AWS_CORE_LIBRARY_DIR}/source/utils/xml/*.cpp"
 )

 file(GLOB AWS_S3_SOURCES
     "${AWS_S3_LIBRARY_DIR}/source/*.cpp"
 )

 file(GLOB AWS_S3_MODEL_SOURCES
     "${AWS_S3_LIBRARY_DIR}/source/model/*.cpp"
 )

 file(GLOB AWS_EVENT_STREAM_SOURCES
     "${AWS_EVENT_STREAM_LIBRARY_DIR}/source/*.c"
 )

 file(GLOB AWS_COMMON_SOURCES
     "${AWS_COMMON_LIBRARY_DIR}/source/*.c"
     "${AWS_COMMON_LIBRARY_DIR}/source/posix/*.c"
 )

 file(GLOB AWS_CHECKSUMS_SOURCES
     "${AWS_CHECKSUMS_LIBRARY_DIR}/source/*.c"
     "${AWS_CHECKSUMS_LIBRARY_DIR}/source/intel/*.c"
     "${AWS_CHECKSUMS_LIBRARY_DIR}/source/arm/*.c"
 )

 file(GLOB S3_UNIFIED_SRC
     ${AWS_EVENT_STREAM_SOURCES}
     ${AWS_COMMON_SOURCES}
     ${AWS_S3_SOURCES}
     ${AWS_S3_MODEL_SOURCES}
     ${AWS_CORE_SOURCES}
 )

 set(S3_INCLUDES
     "${CMAKE_CURRENT_SOURCE_DIR}/include/"
     "${AWS_COMMON_LIBRARY_DIR}/include/"
     "${AWS_EVENT_STREAM_LIBRARY_DIR}/include/"
     "${AWS_S3_LIBRARY_DIR}/include/"
     "${AWS_CORE_LIBRARY_DIR}/include/"
     "${CMAKE_CURRENT_BINARY_DIR}/include/"
 )

 add_library(aws_s3_checksums ${AWS_CHECKSUMS_SOURCES})
 target_include_directories(aws_s3_checksums PUBLIC "${AWS_CHECKSUMS_LIBRARY_DIR}/include/")
 if(CMAKE_BUILD_TYPE STREQUAL "" OR CMAKE_BUILD_TYPE STREQUAL "Debug")
     target_compile_definitions(aws_s3_checksums PRIVATE "-DDEBUG_BUILD")
 endif()
+set_target_properties(aws_s3_checksums PROPERTIES COMPILE_OPTIONS -fPIC)
 set_target_properties(aws_s3_checksums PROPERTIES LINKER_LANGUAGE C)
 set_property(TARGET aws_s3_checksums PROPERTY C_STANDARD 99)

@@ -1,5 +1,7 @@
 set (CAPNPROTO_SOURCE_DIR ${ClickHouse_SOURCE_DIR}/contrib/capnproto/c++/src)

+set (CMAKE_CXX_STANDARD 17)
+
 set (KJ_SRCS
     ${CAPNPROTO_SOURCE_DIR}/kj/array.c++
     ${CAPNPROTO_SOURCE_DIR}/kj/common.c++

@@ -65,11 +65,6 @@ if(CMAKE_COMPILER_IS_GNUCC OR CMAKE_COMPILER_IS_CLANG)
   endif()
 endif()

-# For debug libs and exes, add "-d" postfix
-if(NOT DEFINED CMAKE_DEBUG_POSTFIX)
-  set(CMAKE_DEBUG_POSTFIX "-d")
-endif()
-
 # initialize CURL_LIBS
 set(CURL_LIBS "")

@@ -115,8 +110,6 @@ if(ENABLE_IPV6 AND NOT WIN32)
   endif()
 endif()

-curl_nroff_check()
-
 # We need ansi c-flags, especially on HP
 set(CMAKE_C_FLAGS "${CMAKE_ANSI_CFLAGS} ${CMAKE_C_FLAGS}")
 set(CMAKE_REQUIRED_FLAGS ${CMAKE_ANSI_CFLAGS})

@@ -132,21 +125,21 @@ include(CheckCSourceCompiles)
 if(ENABLE_THREADED_RESOLVER)
   find_package(Threads REQUIRED)
-  if(WIN32)
-    set(USE_THREADS_WIN32 ON)
-  else()
-    set(USE_THREADS_POSIX ${CMAKE_USE_PTHREADS_INIT})
-    set(HAVE_PTHREAD_H ${CMAKE_USE_PTHREADS_INIT})
-  endif()
+  set(USE_THREADS_POSIX ${CMAKE_USE_PTHREADS_INIT})
+  set(HAVE_PTHREAD_H ${CMAKE_USE_PTHREADS_INIT})
   set(CURL_LIBS ${CURL_LIBS} ${CMAKE_THREAD_LIBS_INIT})
 endif()

 # Check for all needed libraries
-check_library_exists_concat("${CMAKE_DL_LIBS}" dlopen HAVE_LIBDL)
-check_library_exists_concat("socket" connect HAVE_LIBSOCKET)
-check_library_exists("c" gethostbyname "" NOT_NEED_LIBNSL)
-check_function_exists(gethostname HAVE_GETHOSTNAME)
+# We don't want any plugin loading at runtime. It is harmful.
+#check_library_exists_concat("${CMAKE_DL_LIBS}" dlopen HAVE_LIBDL)
+# This is unneeded.
+#check_library_exists_concat("socket" connect HAVE_LIBSOCKET)
+set (NOT_NEED_LIBNSL 1)
+set (gethostname HAVE_GETHOSTNAME 1)

 # From cmake/find/ssl.cmake
 if (OPENSSL_FOUND)

@@ -167,10 +160,12 @@ if (OPENSSL_FOUND)
 endif()

 # Check for idn
-check_library_exists_concat("idn2" idn2_lookup_ul HAVE_LIBIDN2)
+# No, we don't need that.
+# check_library_exists_concat("idn2" idn2_lookup_ul HAVE_LIBIDN2)

 # Check for symbol dlopen (same as HAVE_LIBDL)
-check_library_exists("${CURL_LIBS}" dlopen "" HAVE_DLOPEN)
+# We don't want any plugin loading at runtime. It is harmful.
+# check_library_exists("${CURL_LIBS}" dlopen "" HAVE_DLOPEN)

 # From /cmake/find/zlib.cmake
 if (ZLIB_FOUND)

@@ -181,7 +176,7 @@ if (ZLIB_FOUND)
   list(APPEND CURL_LIBS ${ZLIB_LIBRARIES})
 endif()

-option(ENABLE_UNIX_SOCKETS "Define if you want Unix domain sockets support" ON)
+option(ENABLE_UNIX_SOCKETS "Define if you want Unix domain sockets support" OFF)
 if(ENABLE_UNIX_SOCKETS)
   include(CheckStructHasMember)
   check_struct_has_member("struct sockaddr_un" sun_path "sys/un.h" USE_UNIX_SOCKETS)

@@ -217,14 +212,14 @@ check_include_file_concat("sys/utime.h" HAVE_SYS_UTIME_H)
 check_include_file_concat("sys/xattr.h" HAVE_SYS_XATTR_H)
 check_include_file_concat("alloca.h" HAVE_ALLOCA_H)
 check_include_file_concat("arpa/inet.h" HAVE_ARPA_INET_H)
-check_include_file_concat("arpa/tftp.h" HAVE_ARPA_TFTP_H)
+#check_include_file_concat("arpa/tftp.h" HAVE_ARPA_TFTP_H)
 check_include_file_concat("assert.h" HAVE_ASSERT_H)
 check_include_file_concat("crypto.h" HAVE_CRYPTO_H)
 check_include_file_concat("des.h" HAVE_DES_H)
 check_include_file_concat("err.h" HAVE_ERR_H)
 check_include_file_concat("errno.h" HAVE_ERRNO_H)
 check_include_file_concat("fcntl.h" HAVE_FCNTL_H)
-check_include_file_concat("idn2.h" HAVE_IDN2_H)
+#check_include_file_concat("idn2.h" HAVE_IDN2_H)
 check_include_file_concat("ifaddrs.h" HAVE_IFADDRS_H)
 check_include_file_concat("io.h" HAVE_IO_H)
 check_include_file_concat("krb.h" HAVE_KRB_H)

@@ -259,7 +254,7 @@ check_include_file_concat("x509.h" HAVE_X509_H)
 check_include_file_concat("process.h" HAVE_PROCESS_H)
 check_include_file_concat("stddef.h" HAVE_STDDEF_H)
-check_include_file_concat("dlfcn.h" HAVE_DLFCN_H)
+#check_include_file_concat("dlfcn.h" HAVE_DLFCN_H)
 check_include_file_concat("malloc.h" HAVE_MALLOC_H)
 check_include_file_concat("memory.h" HAVE_MEMORY_H)
 check_include_file_concat("netinet/if_ether.h" HAVE_NETINET_IF_ETHER_H)

@@ -276,30 +271,11 @@ check_type_size("int" SIZEOF_INT)
 check_type_size("__int64" SIZEOF___INT64)
 check_type_size("long double" SIZEOF_LONG_DOUBLE)
 check_type_size("time_t" SIZEOF_TIME_T)
-if(NOT HAVE_SIZEOF_SSIZE_T)
-  if(SIZEOF_LONG EQUAL SIZEOF_SIZE_T)
-    set(ssize_t long)
-  endif()
-  if(NOT ssize_t AND SIZEOF___INT64 EQUAL SIZEOF_SIZE_T)
-    set(ssize_t __int64)
-  endif()
-endif()
-# off_t is sized later, after the HAVE_FILE_OFFSET_BITS test

-if(HAVE_SIZEOF_LONG_LONG)
-  set(HAVE_LONGLONG 1)
-  set(HAVE_LL 1)
-endif()
+set(HAVE_LONGLONG 1)
+set(HAVE_LL 1)

-find_file(RANDOM_FILE urandom /dev)
-mark_as_advanced(RANDOM_FILE)
+set(RANDOM_FILE /dev/urandom)

-# Check for some functions that are used
-if(HAVE_LIBWS2_32)
-  set(CMAKE_REQUIRED_LIBRARIES ws2_32)
-elseif(HAVE_LIBSOCKET)
-  set(CMAKE_REQUIRED_LIBRARIES socket)
-endif()

 check_symbol_exists(basename "${CURL_INCLUDES}" HAVE_BASENAME)
 check_symbol_exists(socket "${CURL_INCLUDES}" HAVE_SOCKET)

@@ -311,18 +287,15 @@ check_symbol_exists(strtok_r "${CURL_INCLUDES}" HAVE_STRTOK_R)
 check_symbol_exists(strftime "${CURL_INCLUDES}" HAVE_STRFTIME)
 check_symbol_exists(uname "${CURL_INCLUDES}" HAVE_UNAME)
 check_symbol_exists(strcasecmp "${CURL_INCLUDES}" HAVE_STRCASECMP)
-check_symbol_exists(stricmp "${CURL_INCLUDES}" HAVE_STRICMP)
-check_symbol_exists(strcmpi "${CURL_INCLUDES}" HAVE_STRCMPI)
-check_symbol_exists(strncmpi "${CURL_INCLUDES}" HAVE_STRNCMPI)
+#check_symbol_exists(stricmp "${CURL_INCLUDES}" HAVE_STRICMP)
+#check_symbol_exists(strcmpi "${CURL_INCLUDES}" HAVE_STRCMPI)
+#check_symbol_exists(strncmpi "${CURL_INCLUDES}" HAVE_STRNCMPI)
 check_symbol_exists(alarm "${CURL_INCLUDES}" HAVE_ALARM)
-if(NOT HAVE_STRNCMPI)
-  set(HAVE_STRCMPI)
-endif()
-check_symbol_exists(gethostbyaddr "${CURL_INCLUDES}" HAVE_GETHOSTBYADDR)
+#check_symbol_exists(gethostbyaddr "${CURL_INCLUDES}" HAVE_GETHOSTBYADDR)
 check_symbol_exists(gethostbyaddr_r "${CURL_INCLUDES}" HAVE_GETHOSTBYADDR_R)
 check_symbol_exists(gettimeofday "${CURL_INCLUDES}" HAVE_GETTIMEOFDAY)
 check_symbol_exists(inet_addr "${CURL_INCLUDES}" HAVE_INET_ADDR)
-check_symbol_exists(inet_ntoa "${CURL_INCLUDES}" HAVE_INET_NTOA)
+#check_symbol_exists(inet_ntoa "${CURL_INCLUDES}" HAVE_INET_NTOA)
 check_symbol_exists(inet_ntoa_r "${CURL_INCLUDES}" HAVE_INET_NTOA_R)
 check_symbol_exists(tcsetattr "${CURL_INCLUDES}" HAVE_TCSETATTR)
 check_symbol_exists(tcgetattr "${CURL_INCLUDES}" HAVE_TCGETATTR)

@@ -331,8 +304,8 @@ check_symbol_exists(closesocket "${CURL_INCLUDES}" HAVE_CLOSESOCKET)
 check_symbol_exists(setvbuf "${CURL_INCLUDES}" HAVE_SETVBUF)
 check_symbol_exists(sigsetjmp "${CURL_INCLUDES}" HAVE_SIGSETJMP)
 check_symbol_exists(getpass_r "${CURL_INCLUDES}" HAVE_GETPASS_R)
-check_symbol_exists(strlcat "${CURL_INCLUDES}" HAVE_STRLCAT)
-check_symbol_exists(getpwuid "${CURL_INCLUDES}" HAVE_GETPWUID)
+#check_symbol_exists(strlcat "${CURL_INCLUDES}" HAVE_STRLCAT)
+#check_symbol_exists(getpwuid "${CURL_INCLUDES}" HAVE_GETPWUID)
 check_symbol_exists(getpwuid_r "${CURL_INCLUDES}" HAVE_GETPWUID_R)
 check_symbol_exists(geteuid "${CURL_INCLUDES}" HAVE_GETEUID)
 check_symbol_exists(usleep "${CURL_INCLUDES}" HAVE_USLEEP)

@@ -340,17 +313,15 @@ check_symbol_exists(utime "${CURL_INCLUDES}" HAVE_UTIME)
 check_symbol_exists(gmtime_r "${CURL_INCLUDES}" HAVE_GMTIME_R)
 check_symbol_exists(localtime_r "${CURL_INCLUDES}" HAVE_LOCALTIME_R)
-check_symbol_exists(gethostbyname "${CURL_INCLUDES}" HAVE_GETHOSTBYNAME)
+#check_symbol_exists(gethostbyname "${CURL_INCLUDES}" HAVE_GETHOSTBYNAME)
 check_symbol_exists(gethostbyname_r "${CURL_INCLUDES}" HAVE_GETHOSTBYNAME_R)
 check_symbol_exists(signal "${CURL_INCLUDES}" HAVE_SIGNAL_FUNC)
 check_symbol_exists(SIGALRM "${CURL_INCLUDES}" HAVE_SIGNAL_MACRO)
-if(HAVE_SIGNAL_FUNC AND HAVE_SIGNAL_MACRO)
-  set(HAVE_SIGNAL 1)
-endif()
+set(HAVE_SIGNAL 1)
 check_symbol_exists(uname "${CURL_INCLUDES}" HAVE_UNAME)
 check_symbol_exists(strtoll "${CURL_INCLUDES}" HAVE_STRTOLL)
-check_symbol_exists(_strtoi64 "${CURL_INCLUDES}" HAVE__STRTOI64)
+#check_symbol_exists(_strtoi64 "${CURL_INCLUDES}" HAVE__STRTOI64)
 check_symbol_exists(strerror_r "${CURL_INCLUDES}" HAVE_STRERROR_R)
 check_symbol_exists(siginterrupt "${CURL_INCLUDES}" HAVE_SIGINTERRUPT)
 check_symbol_exists(perror "${CURL_INCLUDES}" HAVE_PERROR)

@@ -1,6 +1,8 @@
 set(ICU_SOURCE_DIR ${ClickHouse_SOURCE_DIR}/contrib/icu/icu4c/source)
 set(ICUDATA_SOURCE_DIR ${ClickHouse_SOURCE_DIR}/contrib/icudata/)

+set (CMAKE_CXX_STANDARD 17)
+
 # These lists of sources were generated from build log of the original ICU build system (configure + make).

 set(ICUUC_SOURCES

@@ -1 +1 @@ (submodule)
-Subproject commit cd82fd9d8eefe50a47a0adf7c617c3ea7d558d11
+Subproject commit 9676d2645a713e679dc981ffd84dee99fcd68b8e

contrib/libcxx (vendored submodule, 2 changes)

@@ -1 +1 @@
-Subproject commit f7c63235238a71b7e0563fab8c7c5ec1b54831f6
+Subproject commit a8c453300879d0bf255f9d5959d42e2c8aac1bfb

@@ -47,6 +47,11 @@ add_library(cxx ${SRCS})
 target_include_directories(cxx SYSTEM BEFORE PUBLIC $<BUILD_INTERFACE:${LIBCXX_SOURCE_DIR}/include>)
 target_compile_definitions(cxx PRIVATE -D_LIBCPP_BUILDING_LIBRARY -DLIBCXX_BUILDING_LIBCXXABI)

+# Enable capturing stack traces for all exceptions.
+if (USE_UNWIND)
+    target_compile_definitions(cxx PUBLIC -DSTD_EXCEPTION_HAS_STACK_TRACE=1)
+endif ()
+
 target_compile_options(cxx PUBLIC $<$<COMPILE_LANGUAGE:CXX>:-nostdinc++>)

 check_cxx_compiler_flag(-Wreserved-id-macro HAVE_WARNING_RESERVED_ID_MACRO)

@@ -32,6 +32,11 @@ target_compile_definitions(cxxabi PRIVATE -D_LIBCPP_BUILDING_LIBRARY)
 target_compile_options(cxxabi PRIVATE -nostdinc++ -fno-sanitize=undefined -Wno-macro-redefined) # If we don't disable UBSan, infinite recursion happens in dynamic_cast.
 target_link_libraries(cxxabi PUBLIC ${EXCEPTION_HANDLING_LIBRARY})

+# Enable capturing stack traces for all exceptions.
+if (USE_UNWIND)
+    target_compile_definitions(cxxabi PUBLIC -DSTD_EXCEPTION_HAS_STACK_TRACE=1)
+endif ()
+
 install(
     TARGETS cxxabi
     EXPORT global

@@ -7,10 +7,14 @@ ELSE(CMAKE_SYSTEM_NAME STREQUAL "Linux")
 ENDIF(CMAKE_SYSTEM_NAME STREQUAL "Linux")

 IF(CMAKE_COMPILER_IS_GNUCXX)
-    EXECUTE_PROCESS(COMMAND ${CMAKE_CXX_COMPILER} -dumpversion OUTPUT_VARIABLE GCC_COMPILER_VERSION)
+    EXECUTE_PROCESS(COMMAND ${CMAKE_CXX_COMPILER} -dumpfullversion OUTPUT_VARIABLE GCC_COMPILER_VERSION)

     IF (NOT GCC_COMPILER_VERSION)
-        MESSAGE(FATAL_ERROR "Cannot get gcc version")
+        EXECUTE_PROCESS(COMMAND ${CMAKE_CXX_COMPILER} -dumpversion OUTPUT_VARIABLE GCC_COMPILER_VERSION)
+
+        IF (NOT GCC_COMPILER_VERSION)
+            MESSAGE(FATAL_ERROR "Cannot get gcc version")
+        ENDIF (NOT GCC_COMPILER_VERSION)
     ENDIF (NOT GCC_COMPILER_VERSION)

     STRING(REGEX MATCHALL "[0-9]+" GCC_COMPILER_VERSION ${GCC_COMPILER_VERSION})
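
Background for the two-step probe (context from GCC behavior, not stated in the commit): since GCC 7, `-dumpversion` can print only the major version when GCC is configured with `--with-gcc-major-version-only`, as many distributions do, while `-dumpfullversion` prints the full `major.minor.patch`; older GCC releases do not understand `-dumpfullversion` at all, hence the fallback.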

@@ -1,16 +1,6 @@
 set(OPENSSL_SOURCE_DIR ${ClickHouse_SOURCE_DIR}/contrib/openssl)
 set(OPENSSL_BINARY_DIR ${ClickHouse_BINARY_DIR}/contrib/openssl)

-#file(READ ${CMAKE_CURRENT_SOURCE_DIR}/${OPENSSL_SOURCE_DIR}/ssl/VERSION SSL_VERSION)
-#string(STRIP ${SSL_VERSION} SSL_VERSION)
-#string(REPLACE ":" "." SSL_VERSION ${SSL_VERSION})
-#string(REGEX REPLACE "\\..*" "" SSL_MAJOR_VERSION ${SSL_VERSION})
-
-#file(READ ${CMAKE_CURRENT_SOURCE_DIR}/${OPENSSL_SOURCE_DIR}/crypto/VERSION CRYPTO_VERSION)
-#string(STRIP ${CRYPTO_VERSION} CRYPTO_VERSION)
-#string(REPLACE ":" "." CRYPTO_VERSION ${CRYPTO_VERSION})
-#string(REGEX REPLACE "\\..*" "" CRYPTO_MAJOR_VERSION ${CRYPTO_VERSION})
-
 set(OPENSSLDIR "/etc/ssl" CACHE PATH "Set the default openssl directory")
 set(OPENSSL_ENGINESDIR "/usr/lib/engines-3" CACHE PATH "Set the default openssl directory for engines")
 set(OPENSSL_MODULESDIR "/usr/local/lib/ossl-modules" CACHE PATH "Set the default openssl directory for modules")

@@ -27,19 +17,27 @@ elseif(ARCH_AARCH64)
 endif()

 enable_language(ASM)

 if (COMPILER_CLANG)
     add_definitions(-Wno-unused-command-line-argument)
 endif ()

 if (ARCH_AMD64)
+    if (OS_DARWIN)
+        set (OPENSSL_SYSTEM "macosx")
+    endif ()
+
     macro(perl_generate_asm FILE_IN FILE_OUT)
+        get_filename_component(DIRNAME ${FILE_OUT} DIRECTORY)
+        file(MAKE_DIRECTORY ${DIRNAME})
         add_custom_command(OUTPUT ${FILE_OUT}
-                           COMMAND /usr/bin/env perl ${FILE_IN} ${FILE_OUT}
+                           COMMAND /usr/bin/env perl ${FILE_IN} ${OPENSSL_SYSTEM} ${FILE_OUT}
                            # ASM code has broken unwind tables (CFI), strip them.
                            # Otherwise asynchronous unwind (that we use for query profiler)
                            # will lead to segfault while trying to interpret wrong "CFA expression".
                            COMMAND sed -i -e '/^\.cfi_/d' ${FILE_OUT})
     endmacro()

     perl_generate_asm(${OPENSSL_SOURCE_DIR}/crypto/aes/asm/aes-x86_64.pl ${OPENSSL_BINARY_DIR}/crypto/aes/aes-x86_64.s)
     perl_generate_asm(${OPENSSL_SOURCE_DIR}/crypto/aes/asm/aesni-mb-x86_64.pl ${OPENSSL_BINARY_DIR}/crypto/aes/aesni-mb-x86_64.s)
     perl_generate_asm(${OPENSSL_SOURCE_DIR}/crypto/aes/asm/aesni-sha1-x86_64.pl ${OPENSSL_BINARY_DIR}/crypto/aes/aesni-sha1-x86_64.s)

@@ -70,12 +68,17 @@ if (ARCH_AMD64)
     perl_generate_asm(${OPENSSL_SOURCE_DIR}/crypto/sha/asm/sha512-x86_64.pl ${OPENSSL_BINARY_DIR}/crypto/sha/sha256-x86_64.s) # This is not a mistake
     perl_generate_asm(${OPENSSL_SOURCE_DIR}/crypto/sha/asm/sha512-x86_64.pl ${OPENSSL_BINARY_DIR}/crypto/sha/sha512-x86_64.s)
     perl_generate_asm(${OPENSSL_SOURCE_DIR}/crypto/whrlpool/asm/wp-x86_64.pl ${OPENSSL_BINARY_DIR}/crypto/whrlpool/wp-x86_64.s)
+
 elseif (ARCH_AARCH64)
+
     macro(perl_generate_asm FILE_IN FILE_OUT)
+        get_filename_component(DIRNAME ${FILE_OUT} DIRECTORY)
+        file(MAKE_DIRECTORY ${DIRNAME})
         add_custom_command(OUTPUT ${FILE_OUT}
                            COMMAND /usr/bin/env perl ${FILE_IN} "linux64" ${FILE_OUT})
                            # Hope that the ASM code for AArch64 doesn't have broken CFI. Otherwise, add the same sed as for x86_64.
     endmacro()

     perl_generate_asm(${OPENSSL_SOURCE_DIR}/crypto/aes/asm/aesv8-armx.pl ${OPENSSL_BINARY_DIR}/crypto/aes/aesv8-armx.S)
     perl_generate_asm(${OPENSSL_SOURCE_DIR}/crypto/aes/asm/vpaes-armv8.pl ${OPENSSL_BINARY_DIR}/crypto/aes/vpaes-armv8.S)
     perl_generate_asm(${OPENSSL_SOURCE_DIR}/crypto/bn/asm/armv8-mont.pl ${OPENSSL_BINARY_DIR}/crypto/bn/armv8-mont.S)

@@ -88,6 +91,7 @@ elseif (ARCH_AARCH64)
     perl_generate_asm(${OPENSSL_SOURCE_DIR}/crypto/sha/asm/sha1-armv8.pl ${OPENSSL_BINARY_DIR}/crypto/sha/sha1-armv8.S)
     perl_generate_asm(${OPENSSL_SOURCE_DIR}/crypto/sha/asm/sha512-armv8.pl ${OPENSSL_BINARY_DIR}/crypto/sha/sha256-armv8.S) # This is not a mistake
     perl_generate_asm(${OPENSSL_SOURCE_DIR}/crypto/sha/asm/sha512-armv8.pl ${OPENSSL_BINARY_DIR}/crypto/sha/sha512-armv8.S)
+
 endif ()

 set(CRYPTO_SRCS

contrib/ryu (new vendored submodule, 1 change)

@@ -0,0 +1 @@
+Subproject commit 5b4a853534b47438b4d97935370f6b2397137c2b

@@ -0,0 +1,10 @@
+SET(LIBRARY_DIR ${ClickHouse_SOURCE_DIR}/contrib/ryu)
+
+add_library(ryu
+    ${LIBRARY_DIR}/ryu/d2fixed.c
+    ${LIBRARY_DIR}/ryu/d2s.c
+    ${LIBRARY_DIR}/ryu/f2s.c
+    ${LIBRARY_DIR}/ryu/generic_128.c
+)
+
+target_include_directories(ryu SYSTEM BEFORE PUBLIC "${LIBRARY_DIR}")
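
Because the include directory is declared `PUBLIC`, any target that links `ryu` inherits the header search path automatically. A minimal hypothetical consumer (illustrative target and file names, not part of the commit):

    add_executable(ryu_demo ryu_demo.c)          # hypothetical demo target
    target_link_libraries(ryu_demo PRIVATE ryu)  # header path comes along via the PUBLIC include above

The dbms change further below wires `ryu` into `clickhouse_common_io` in exactly this way.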

contrib/zlib-ng (vendored submodule, 2 changes)

@@ -1 +1 @@
-Subproject commit 5673222fbd37ea89afb2ea73096f9bf5ec68ea31
+Subproject commit bba56a73be249514acfbc7d49aa2a68994dad8ab

@@ -330,6 +330,7 @@ target_link_libraries (clickhouse_common_io
         ${LINK_LIBRARIES_ONLY_ON_X86_64}
     PUBLIC
         ${DOUBLE_CONVERSION_LIBRARIES}
+        ryu
     PUBLIC
         ${Poco_Net_LIBRARY}
         ${Poco_Util_LIBRARY}

@@ -1,11 +1,11 @@
 # This strings autochanged from release_lib.sh:
-set(VERSION_REVISION 54430)
-set(VERSION_MAJOR 19)
-set(VERSION_MINOR 19)
+set(VERSION_REVISION 54431)
+set(VERSION_MAJOR 20)
+set(VERSION_MINOR 1)
 set(VERSION_PATCH 1)
-set(VERSION_GITHASH 8bd9709d1dec3366e35d2efeab213435857f67a9)
-set(VERSION_DESCRIBE v19.19.1.1-prestable)
-set(VERSION_STRING 19.19.1.1)
+set(VERSION_GITHASH 51d4c8a53be94504e3607b2232e12e5ef7a8ec28)
+set(VERSION_DESCRIBE v20.1.1.1-prestable)
+set(VERSION_STRING 20.1.1.1)
 # end of autochange

 set(VERSION_EXTRA "" CACHE STRING "")

@@ -2,7 +2,6 @@
 #include "ConnectionParameters.h"
 #include "Suggest.h"

-#include <port/unistd.h>
 #include <stdlib.h>
 #include <fcntl.h>
 #include <signal.h>
@@ -263,7 +262,7 @@
                 && std::string::npos == embedded_stack_trace_pos)
             {
                 std::cerr << "Stack trace:" << std::endl
-                          << e.getStackTrace().toString();
+                          << e.getStackTraceString();
             }

             /// If exception code isn't zero, we should return non-zero return code anyway.
@@ -290,6 +289,78 @@
                 || (now.month() == 1 && now.day() <= 5);
     }

+    bool isChineseNewYearMode(const String & local_tz)
+    {
+        /// Days of Dec. 20 in Chinese calendar starting from year 2019 to year 2105
+        static constexpr UInt16 chineseNewYearIndicators[]
+            = {18275, 18659, 19014, 19368, 19752, 20107, 20491, 20845, 21199, 21583, 21937, 22292, 22676, 23030, 23414, 23768, 24122, 24506,
+               24860, 25215, 25599, 25954, 26308, 26692, 27046, 27430, 27784, 28138, 28522, 28877, 29232, 29616, 29970, 30354, 30708, 31062,
+               31446, 31800, 32155, 32539, 32894, 33248, 33632, 33986, 34369, 34724, 35078, 35462, 35817, 36171, 36555, 36909, 37293, 37647,
+               38002, 38386, 38740, 39095, 39479, 39833, 40187, 40571, 40925, 41309, 41664, 42018, 42402, 42757, 43111, 43495, 43849, 44233,
+               44587, 44942, 45326, 45680, 46035, 46418, 46772, 47126, 47510, 47865, 48249, 48604, 48958, 49342};
+        static constexpr size_t N = sizeof(chineseNewYearIndicators) / sizeof(chineseNewYearIndicators[0]);
+
+        /// All time zone names are acquired from https://www.iana.org/time-zones
+        static constexpr const char * chineseNewYearTimeZoneIndicators[] = {
+            /// Time zones celebrating Chinese new year.
+            "Asia/Shanghai",
+            "Asia/Chongqing",
+            "Asia/Harbin",
+            "Asia/Urumqi",
+            "Asia/Hong_Kong",
+            "Asia/Chungking",
+            "Asia/Macao",
+            "Asia/Macau",
+            "Asia/Taipei",
+            "Asia/Singapore",
+
+            /// Time zones celebrating Chinese new year but with different festival names. Let's not print the message for now.
+            // "Asia/Brunei",
+            // "Asia/Ho_Chi_Minh",
+            // "Asia/Hovd",
+            // "Asia/Jakarta",
+            // "Asia/Jayapura",
+            // "Asia/Kashgar",
+            // "Asia/Kuala_Lumpur",
+            // "Asia/Kuching",
+            // "Asia/Makassar",
+            // "Asia/Pontianak",
+            // "Asia/Pyongyang",
+            // "Asia/Saigon",
+            // "Asia/Seoul",
+            // "Asia/Ujung_Pandang",
+            // "Asia/Ulaanbaatar",
+            // "Asia/Ulan_Bator",
+        };
+        static constexpr size_t M = sizeof(chineseNewYearTimeZoneIndicators) / sizeof(chineseNewYearTimeZoneIndicators[0]);
+
+        time_t current_time = time(nullptr);
+
+        if (chineseNewYearTimeZoneIndicators + M
+            == std::find_if(chineseNewYearTimeZoneIndicators, chineseNewYearTimeZoneIndicators + M, [&local_tz](const char * tz)
+               {
+                   return tz == local_tz;
+               }))
+            return false;
+
+        /// It's bad to be intrusive.
+        if (current_time % 3 != 0)
+            return false;
+
+        auto days = DateLUT::instance().toDayNum(current_time).toUnderType();
+        for (auto i = 0ul; i < N; ++i)
+        {
+            auto d = chineseNewYearIndicators[i];
+
+            /// Let's celebrate until Lantern Festival
+            if (d <= days && d + 25u >= days)
+                return true;
+            else if (d > days)
+                return false;
+        }
+        return false;
+    }
+
     int mainImpl()
     {
         UseSSL use_ssl;
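
A concrete reading of the indicator encoding (arithmetic only, easy to verify): the values are DayNum offsets, i.e. days since 1970-01-01. The first entry, 18275, is 2020-01-14 — the 20th day of the last lunar month before the Spring Festival of 2020-01-25 — and 18275 + 25 = 18300 is 2020-02-08, that year's Lantern Festival, so the window checked by `d <= days && d + 25u >= days` spans the whole celebration.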
@@ -337,7 +408,7 @@
         connect();

         /// Initialize DateLUT here to avoid counting time spent here as query execution time.
-        DateLUT::instance();
+        const auto local_tz = DateLUT::instance().getTimeZone();
         if (!context.getSettingsRef().use_client_time_zone)
         {
             const auto & time_zone = connection->getServerTimezone(connection_parameters.timeouts);
@@ -448,8 +519,7 @@
                     << "Code: " << e.code() << ". " << e.displayText() << std::endl;

                 if (config().getBool("stacktrace", false))
-                    std::cerr << "Stack trace:" << std::endl
-                              << e.getStackTrace().toString() << std::endl;
+                    std::cerr << "Stack trace:" << std::endl << e.getStackTraceString() << std::endl;

                 std::cerr << std::endl;
@@ -463,7 +533,12 @@
             }
             while (true);

-            std::cout << (isNewYearMode() ? "Happy new year." : "Bye.") << std::endl;
+            if (isNewYearMode())
+                std::cout << "Happy new year." << std::endl;
+            else if (isChineseNewYearMode(local_tz))
+                std::cout << "Happy Chinese new year. 春节快乐!" << std::endl;
+            else
+                std::cout << "Bye." << std::endl;
             return 0;
         }
         else
@@ -553,27 +628,11 @@
         }

-        /// Check if multi-line query is inserted from the paste buffer.
-        /// Allows delaying the start of query execution until the entirety of query is inserted.
-        static bool hasDataInSTDIN()
-        {
-            timeval timeout = { 0, 0 };
-            fd_set fds;
-            FD_ZERO(&fds);
-            FD_SET(STDIN_FILENO, &fds);
-            return select(1, &fds, nullptr, nullptr, &timeout) == 1;
-        }
-
         inline const String prompt() const
         {
             return boost::replace_all_copy(prompt_by_server_display_name, "{database}", config().getString("database", "default"));
         }

-        void loop()
-        {
-        }
-
         void nonInteractive()
         {

@@ -76,7 +76,7 @@ void LocalServer::initialize(Poco::Util::Application & self)
     if (config().has("logger") || config().has("logger.level") || config().has("logger.log"))
     {
         // sensitive data rules are not used here
-        buildLoggers(config(), logger());
+        buildLoggers(config(), logger(), self.commandName());
     }
     else
     {

@@ -115,7 +115,7 @@ void ODBCHandler::handleRequest(Poco::Net::HTTPServerRequest & request, Poco::Ne
     catch (const Exception & ex)
     {
         process_error("Invalid 'columns' parameter in request body '" + ex.message() + "'");
-        LOG_WARNING(log, ex.getStackTrace().toString());
+        LOG_WARNING(log, ex.getStackTraceString());
         return;
     }

@@ -124,7 +124,7 @@ void ODBCBridge::initialize(Application & self)
     config().setString("logger", "ODBCBridge");

-    buildLoggers(config(), logger());
+    buildLoggers(config(), logger(), self.commandName());
     log = &logger();

     hostname = config().getString("listen-host", "localhost");

@@ -85,16 +85,6 @@ bool PerformanceTest::checkPreconditions() const
     for (const std::string & precondition : preconditions)
     {
-        if (precondition == "flush_disk_cache")
-        {
-            if (system(
-                "(>&2 echo 'Flushing disk cache...') && (sudo sh -c 'echo 3 > /proc/sys/vm/drop_caches') && (>&2 echo 'Flushed.')"))
-            {
-                LOG_WARNING(log, "Failed to flush disk cache");
-                return false;
-            }
-        }
-
         if (precondition == "ram_size")
         {
             size_t ram_size_needed = config->getUInt64("preconditions.ram_size");

@@ -337,7 +327,7 @@ void PerformanceTest::runQueries(
     {
         statistics.exception = "Code: " + std::to_string(e.code()) + ", e.displayText() = " + e.displayText();
         LOG_WARNING(log, "Code: " << e.code() << ", e.displayText() = " << e.displayText()
-            << ", Stack trace:\n\n" << e.getStackTrace().toString());
+            << ", Stack trace:\n\n" << e.getStackTraceString());
     }

     if (!statistics.got_SIGINT)

@@ -45,21 +45,11 @@ namespace fs = std::filesystem;
 PerformanceTestInfo::PerformanceTestInfo(
     XMLConfigurationPtr config,
-    const std::string & profiles_file_,
     const Settings & global_settings_)
-    : profiles_file(profiles_file_)
-    , settings(global_settings_)
+    : settings(global_settings_)
 {
     path = config->getString("path");
     test_name = fs::path(path).stem().string();

-    if (config->has("main_metric"))
-    {
-        Strings main_metrics;
-        config->keys("main_metric", main_metrics);
-        if (main_metrics.size())
-            main_metric = main_metrics[0];
-    }
-
     applySettings(config);
     extractQueries(config);
     extractAuxiliaryQueries(config);

@@ -75,38 +65,8 @@ void PerformanceTestInfo::applySettings(XMLConfigurationPtr config)
     SettingsChanges settings_to_apply;
     Strings config_settings;
     config->keys("settings", config_settings);

-    auto settings_contain = [&config_settings] (const std::string & setting)
-    {
-        auto position = std::find(config_settings.begin(), config_settings.end(), setting);
-        return position != config_settings.end();
-    };
-
-    /// Preprocess configuration file
-    if (settings_contain("profile"))
-    {
-        if (!profiles_file.empty())
-        {
-            std::string profile_name = config->getString("settings.profile");
-            XMLConfigurationPtr profiles_config(new XMLConfiguration(profiles_file));
-
-            Strings profile_settings;
-            profiles_config->keys("profiles." + profile_name, profile_settings);
-
-            extractSettings(profiles_config, "profiles." + profile_name, profile_settings, settings_to_apply);
-        }
-    }
-
     extractSettings(config, "settings", config_settings, settings_to_apply);
     settings.applyChanges(settings_to_apply);

-    if (settings_contain("average_rows_speed_precision"))
-        TestStats::avg_rows_speed_precision =
-            config->getDouble("settings.average_rows_speed_precision");
-
-    if (settings_contain("average_bytes_speed_precision"))
-        TestStats::avg_bytes_speed_precision =
-            config->getDouble("settings.average_bytes_speed_precision");
 }
 }

@@ -26,15 +26,13 @@ using StringToVector = std::map<std::string, Strings>;
 class PerformanceTestInfo
 {
 public:
-    PerformanceTestInfo(XMLConfigurationPtr config, const std::string & profiles_file_, const Settings & global_settings_);
+    PerformanceTestInfo(XMLConfigurationPtr config, const Settings & global_settings_);

     std::string test_name;
     std::string path;
-    std::string main_metric;

     Strings queries;

-    std::string profiles_file;
     Settings settings;
     ExecutionType exec_type;
     StringToVector substitutions;

@@ -64,7 +64,6 @@ public:
         const std::string & password_,
         const Settings & cmd_settings,
         const bool lite_output_,
-        const std::string & profiles_file_,
         Strings && input_files_,
         Strings && tests_tags_,
         Strings && skip_tags_,

@@ -86,7 +85,6 @@ public:
         , skip_names_regexp(std::move(skip_names_regexp_))
         , query_indexes(query_indexes_)
         , lite_output(lite_output_)
-        , profiles_file(profiles_file_)
         , input_files(input_files_)
         , log(&Poco::Logger::get("PerformanceTestSuite"))
     {

@@ -139,7 +137,6 @@ private:
     using XMLConfigurationPtr = Poco::AutoPtr<XMLConfiguration>;

     bool lite_output;
-    std::string profiles_file;
     Strings input_files;
     std::vector<XMLConfigurationPtr> tests_configurations;

@@ -197,7 +194,7 @@ private:
     std::pair<std::string, bool> runTest(XMLConfigurationPtr & test_config)
     {
-        PerformanceTestInfo info(test_config, profiles_file, global_context.getSettingsRef());
+        PerformanceTestInfo info(test_config, global_context.getSettingsRef());
         LOG_INFO(log, "Config for test '" << info.test_name << "' parsed");
         PerformanceTest current(test_config, connection, timeouts, interrupt_listener, info, global_context, query_indexes[info.path]);

@@ -332,7 +329,6 @@ try
     desc.add_options()
         ("help", "produce help message")
        ("lite", "use lite version of output")
-        ("profiles-file", value<std::string>()->default_value(""), "Specify a file with global profiles")
        ("host,h", value<std::string>()->default_value("localhost"), "")
        ("port", value<UInt16>()->default_value(9000), "")
        ("secure,s", "Use TLS connection")

@@ -401,7 +397,6 @@ try
         options["password"].as<std::string>(),
         cmd_settings,
         options.count("lite") > 0,
-        options["profiles-file"].as<std::string>(),
         std::move(input_files),
         std::move(tests_tags),
         std::move(skip_tags),

View File

@ -17,23 +17,25 @@ namespace DB
namespace
{
-const std::regex QUOTE_REGEX{"\""};

std::string getMainMetric(const PerformanceTestInfo & test_info)
{
-    std::string main_metric;
-    if (test_info.main_metric.empty())
-        if (test_info.exec_type == ExecutionType::Loop)
-            main_metric = "min_time";
-        else
-            main_metric = "rows_per_second";
-    else
-        main_metric = test_info.main_metric;
-    return main_metric;
+    if (test_info.exec_type == ExecutionType::Loop)
+        return "min_time";
+    else
+        return "rows_per_second";
}

bool isASCIIString(const std::string & str)
{
    return std::all_of(str.begin(), str.end(), isASCII);
}

+String jsonString(const String & str, FormatSettings & settings)
+{
+    WriteBufferFromOwnString buffer;
+    writeJSONString(str, buffer, settings);
+    return std::move(buffer.str());
+}
}

ReportBuilder::ReportBuilder(const std::string & server_version_)
@ -55,6 +57,8 @@ std::string ReportBuilder::buildFullReport(
    std::vector<TestStats> & stats,
    const std::vector<std::size_t> & queries_to_run) const
{
+    FormatSettings settings;
+
    JSONString json_output;

    json_output.set("hostname", hostname);
@ -65,22 +69,18 @@ std::string ReportBuilder::buildFullReport(
    json_output.set("time", getCurrentTime());
    json_output.set("test_name", test_info.test_name);
    json_output.set("path", test_info.path);
-    json_output.set("main_metric", getMainMetric(test_info));

-    if (test_info.substitutions.size())
+    if (!test_info.substitutions.empty())
    {
        JSONString json_parameters(2); /// here, 2 is the size of \t padding

-        for (auto it = test_info.substitutions.begin(); it != test_info.substitutions.end(); ++it)
+        for (auto & [parameter, values] : test_info.substitutions)
        {
-            std::string parameter = it->first;
-            Strings values = it->second;
            std::ostringstream array_string;
            array_string << "[";
            for (size_t i = 0; i != values.size(); ++i)
            {
-                array_string << '"' << std::regex_replace(values[i], QUOTE_REGEX, "\\\"") << '"';
+                array_string << jsonString(values[i], settings);
                if (i != values.size() - 1)
                {
                    array_string << ", ";
@ -110,13 +110,12 @@ std::string ReportBuilder::buildFullReport(

            JSONString runJSON;

-            auto query = std::regex_replace(test_info.queries[query_index], QUOTE_REGEX, "\\\"");
-            runJSON.set("query", query);
+            runJSON.set("query", jsonString(test_info.queries[query_index], settings), false);
            runJSON.set("query_index", query_index);
            if (!statistics.exception.empty())
            {
                if (isASCIIString(statistics.exception))
-                    runJSON.set("exception", std::regex_replace(statistics.exception, QUOTE_REGEX, "\\\""));
+                    runJSON.set("exception", jsonString(statistics.exception, settings), false);
                else
                    runJSON.set("exception", "Some exception occurred with non ASCII message. This may produce invalid JSON. Try reproduce locally.");
            }
@ -183,7 +182,7 @@ std::string ReportBuilder::buildCompactReport(
    std::vector<TestStats> & stats,
    const std::vector<std::size_t> & queries_to_run) const
{
+    FormatSettings settings;
    std::ostringstream output;

    for (size_t query_index = 0; query_index < test_info.queries.size(); ++query_index)
@ -194,7 +193,7 @@ std::string ReportBuilder::buildCompactReport(
        for (size_t number_of_launch = 0; number_of_launch < test_info.times_to_run; ++number_of_launch)
        {
            if (test_info.queries.size() > 1)
-                output << "query \"" << test_info.queries[query_index] << "\", ";
+                output << "query " << jsonString(test_info.queries[query_index], settings) << ", ";
            output << "run " << std::to_string(number_of_launch + 1) << ": ";

View File

@ -20,8 +20,6 @@
#include <Compression/CompressedReadBuffer.h>
#include <Compression/CompressedWriteBuffer.h>
#include <IO/ReadBufferFromIStream.h>
-#include <IO/ZlibInflatingReadBuffer.h>
-#include <IO/BrotliReadBuffer.h>
#include <IO/ReadBufferFromString.h>
#include <IO/WriteBufferFromString.h>
#include <IO/WriteBufferFromHTTPServerResponse.h>
@ -300,32 +298,24 @@ void HTTPHandler::processQuery(
    /// The client can pass a HTTP header indicating supported compression method (gzip or deflate).
    String http_response_compression_methods = request.get("Accept-Encoding", "");
-    bool client_supports_http_compression = false;
-    CompressionMethod http_response_compression_method {};
+    CompressionMethod http_response_compression_method = CompressionMethod::None;

    if (!http_response_compression_methods.empty())
    {
+        /// If client supports brotli - it's preferred.
        /// Both gzip and deflate are supported. If the client supports both, gzip is preferred.
        /// NOTE parsing of the list of methods is slightly incorrect.
-        if (std::string::npos != http_response_compression_methods.find("gzip"))
-        {
-            client_supports_http_compression = true;
-            http_response_compression_method = CompressionMethod::Gzip;
-        }
-        else if (std::string::npos != http_response_compression_methods.find("deflate"))
-        {
-            client_supports_http_compression = true;
-            http_response_compression_method = CompressionMethod::Zlib;
-        }
-#if USE_BROTLI
-        else if (http_response_compression_methods == "br")
-        {
-            client_supports_http_compression = true;
-            http_response_compression_method = CompressionMethod::Brotli;
-        }
-#endif
+        if (std::string::npos != http_response_compression_methods.find("br"))
+            http_response_compression_method = CompressionMethod::Brotli;
+        else if (std::string::npos != http_response_compression_methods.find("gzip"))
+            http_response_compression_method = CompressionMethod::Gzip;
+        else if (std::string::npos != http_response_compression_methods.find("deflate"))
+            http_response_compression_method = CompressionMethod::Zlib;
    }
+    bool client_supports_http_compression = http_response_compression_method != CompressionMethod::None;

    /// Client can pass a 'compress' flag in the query string. In this case the query result is
    /// compressed using internal algorithm. This is not reflected in HTTP headers.
    bool internal_compression = params.getParsed<bool>("compress", false);
@ -344,8 +334,8 @@ void HTTPHandler::processQuery(
    unsigned keep_alive_timeout = config.getUInt("keep_alive_timeout", 10);

    used_output.out = std::make_shared<WriteBufferFromHTTPServerResponse>(
-        request, response, keep_alive_timeout,
-        client_supports_http_compression, http_response_compression_method, buffer_size_http);
+        request, response, keep_alive_timeout, client_supports_http_compression, http_response_compression_method);
    if (internal_compression)
        used_output.out_maybe_compressed = std::make_shared<CompressedWriteBuffer>(*used_output.out);
    else
@ -400,32 +390,9 @@ void HTTPHandler::processQuery(
    std::unique_ptr<ReadBuffer> in_post_raw = std::make_unique<ReadBufferFromIStream>(istr);

    /// Request body can be compressed using algorithm specified in the Content-Encoding header.
-    std::unique_ptr<ReadBuffer> in_post;
    String http_request_compression_method_str = request.get("Content-Encoding", "");
-    if (!http_request_compression_method_str.empty())
-    {
-        if (http_request_compression_method_str == "gzip")
-        {
-            in_post = std::make_unique<ZlibInflatingReadBuffer>(std::move(in_post_raw), CompressionMethod::Gzip);
-        }
-        else if (http_request_compression_method_str == "deflate")
-        {
-            in_post = std::make_unique<ZlibInflatingReadBuffer>(std::move(in_post_raw), CompressionMethod::Zlib);
-        }
-#if USE_BROTLI
-        else if (http_request_compression_method_str == "br")
-        {
-            in_post = std::make_unique<BrotliReadBuffer>(std::move(in_post_raw));
-        }
-#endif
-        else
-        {
-            throw Exception("Unknown Content-Encoding of HTTP request: " + http_request_compression_method_str,
-                ErrorCodes::UNKNOWN_COMPRESSION_METHOD);
-        }
-    }
-    else
-        in_post = std::move(in_post_raw);
+    std::unique_ptr<ReadBuffer> in_post = wrapReadBufferWithCompressionMethod(
+        std::make_unique<ReadBufferFromIStream>(istr), chooseCompressionMethod({}, http_request_compression_method_str));

    /// The data can also be compressed using incompatible internal algorithm. This is indicated by
    /// 'decompress' query parameter.
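The hunks above collapse per-method branching into a single CompressionMethod value: brotli is preferred, then gzip, then deflate, and "client supports compression" is simply "the chosen method is not None". A compilable sketch of that negotiation (names are illustrative; the real code works on the header string inline and, as its own comment notes, the substring matching is a deliberately loose Accept-Encoding parse):

#include <iostream>
#include <string>

enum class CompressionMethod { None, Brotli, Gzip, Zlib };

// Mirrors the preference order in the patch: brotli first, then gzip, then deflate.
// Substring matching ignores q-values and token boundaries, like the original.
CompressionMethod chooseHttpResponseCompression(const std::string & accept_encoding)
{
    if (accept_encoding.find("br") != std::string::npos)
        return CompressionMethod::Brotli;
    if (accept_encoding.find("gzip") != std::string::npos)
        return CompressionMethod::Gzip;
    if (accept_encoding.find("deflate") != std::string::npos)
        return CompressionMethod::Zlib;
    return CompressionMethod::None;
}

int main()
{
    std::cout << int(chooseHttpResponseCompression("gzip, deflate, br")) << '\n'; // 1 == Brotli wins
}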

View File

@ -947,6 +947,7 @@ int Server::main(const std::vector<std::string> & /*args*/)
    });

    /// try to load dictionaries immediately, throw on error and die
+    ext::scope_guard dictionaries_xmls, models_xmls;
    try
    {
        if (!config().getBool("dictionaries_lazy_load", true))
@ -954,12 +955,10 @@ int Server::main(const std::vector<std::string> & /*args*/)
            global_context->tryCreateEmbeddedDictionaries();
            global_context->getExternalDictionariesLoader().enableAlwaysLoadEverything(true);
        }
-
-        auto dictionaries_repository = std::make_unique<ExternalLoaderXMLConfigRepository>(config(), "dictionaries_config");
-        global_context->getExternalDictionariesLoader().addConfigRepository("", std::move(dictionaries_repository));
-
-        auto models_repository = std::make_unique<ExternalLoaderXMLConfigRepository>(config(), "models_config");
-        global_context->getExternalModelsLoader().addConfigRepository("", std::move(models_repository));
+        dictionaries_xmls = global_context->getExternalDictionariesLoader().addConfigRepository(
+            std::make_unique<ExternalLoaderXMLConfigRepository>(config(), "dictionaries_config"));
+        models_xmls = global_context->getExternalModelsLoader().addConfigRepository(
+            std::make_unique<ExternalLoaderXMLConfigRepository>(config(), "models_config"));
    }
    catch (...)
    {
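addConfigRepository now returns a guard object, so the repository is deregistered automatically when the guard goes out of scope, including during stack unwinding. A self-contained sketch of that RAII idiom in the spirit of ext::scope_guard (the registration function and its output are hypothetical):

#include <functional>
#include <iostream>
#include <string>
#include <utility>

// Generic scope guard: runs a callback on destruction, exactly once.
class ScopeGuard
{
public:
    ScopeGuard() = default;
    explicit ScopeGuard(std::function<void()> on_exit_) : on_exit(std::move(on_exit_)) {}
    ScopeGuard(ScopeGuard && other) noexcept : on_exit(std::move(other.on_exit)) { other.on_exit = nullptr; }
    ScopeGuard & operator=(ScopeGuard && other) noexcept
    {
        if (this != &other) { fire(); on_exit = std::move(other.on_exit); other.on_exit = nullptr; }
        return *this;
    }
    ~ScopeGuard() { fire(); }
private:
    void fire() { if (on_exit) { on_exit(); on_exit = nullptr; } }
    std::function<void()> on_exit;
};

// Hypothetical registration that hands back its own deregistration.
ScopeGuard addRepository(const std::string & name)
{
    std::cout << "registered " << name << '\n';
    return ScopeGuard([name] { std::cout << "unregistered " << name << '\n'; });
}

int main()
{
    ScopeGuard dictionaries_xmls = addRepository("dictionaries_config");
} // deregistered automatically here, or during unwinding if something throws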

View File

@ -112,7 +112,7 @@ void TCPHandler::runImpl()
    {
        Exception e("Database " + backQuote(default_database) + " doesn't exist", ErrorCodes::UNKNOWN_DATABASE);
        LOG_ERROR(log, "Code: " << e.code() << ", e.displayText() = " << e.displayText()
-            << ", Stack trace:\n\n" << e.getStackTrace().toString());
+            << ", Stack trace:\n\n" << e.getStackTraceString());
        sendException(e, connection_context.getSettingsRef().calculate_text_stack_trace);
        return;
    }
@ -158,7 +158,7 @@ void TCPHandler::runImpl()
    /** An exception during the execution of request (it must be sent over the network to the client).
     * The client will be able to accept it, if it did not happen while sending another packet and the client has not disconnected yet.
     */
-    std::unique_ptr<Exception> exception;
+    std::optional<DB::Exception> exception;
    bool network_error = false;

    bool send_exception_with_stack_trace = connection_context.getSettingsRef().calculate_text_stack_trace;
@ -280,7 +280,7 @@ void TCPHandler::runImpl()
    catch (const Exception & e)
    {
        state.io.onException();
-        exception.reset(e.clone());
+        exception.emplace(e);

        if (e.code() == ErrorCodes::UNKNOWN_PACKET_FROM_CLIENT)
            throw;
@ -298,22 +298,22 @@ void TCPHandler::runImpl()
         * We will try to send exception to the client in any case - see below.
         */
        state.io.onException();
-        exception = std::make_unique<Exception>(e.displayText(), ErrorCodes::POCO_EXCEPTION);
+        exception.emplace(Exception::CreateFromPoco, e);
    }
    catch (const Poco::Exception & e)
    {
        state.io.onException();
-        exception = std::make_unique<Exception>(e.displayText(), ErrorCodes::POCO_EXCEPTION);
+        exception.emplace(Exception::CreateFromPoco, e);
    }
    catch (const std::exception & e)
    {
        state.io.onException();
-        exception = std::make_unique<Exception>(e.what(), ErrorCodes::STD_EXCEPTION);
+        exception.emplace(Exception::CreateFromSTD, e);
    }
    catch (...)
    {
        state.io.onException();
-        exception = std::make_unique<Exception>("Unknown exception", ErrorCodes::UNKNOWN_EXCEPTION);
+        exception.emplace("Unknown exception", ErrorCodes::UNKNOWN_EXCEPTION);
    }

    try
@ -546,7 +546,7 @@ void TCPHandler::processOrdinaryQueryWithProcessors(size_t num_threads)
    auto & pipeline = state.io.pipeline;

    if (pipeline.getMaxThreads())
-        num_threads = pipeline.getMaxThreads();
+        num_threads = std::min(num_threads, pipeline.getMaxThreads());

    /// Send header-block, to allow client to prepare output format for data to send.
    {
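Storing the pending exception in std::optional instead of a heap-allocated clone removes an allocation on the error path and lets each catch clause construct the stored value in place. A standalone sketch of the emplace pattern (the Exception stand-in here is a plain runtime_error, not DB::Exception with its tag constructors):

#include <exception>
#include <iostream>
#include <optional>
#include <stdexcept>

// Illustrative stand-in for DB::Exception: a value type that can live inside optional.
struct Exception : std::runtime_error
{
    using std::runtime_error::runtime_error;
};

int main()
{
    std::optional<Exception> exception; // no heap allocation, unlike unique_ptr + clone()

    try
    {
        throw std::logic_error("something failed");
    }
    catch (const Exception & e)
    {
        exception.emplace(e); // copy the concrete exception in place
    }
    catch (const std::exception & e)
    {
        exception.emplace(e.what()); // wrap a foreign exception, as the patch does via tags
    }

    if (exception)
        std::cout << "stored: " << exception->what() << '\n';
}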

View File

@ -217,9 +217,18 @@ const SettingsConstraints::Constraint * SettingsConstraints::tryGetConstraint(si

void SettingsConstraints::setProfile(const String & profile_name, const Poco::Util::AbstractConfiguration & config)
{
-    String parent_profile = "profiles." + profile_name + ".profile";
-    if (config.has(parent_profile))
-        setProfile(parent_profile, config); // Inheritance of one profile from another.
+    String elem = "profiles." + profile_name;
+
+    Poco::Util::AbstractConfiguration::Keys config_keys;
+    config.keys(elem, config_keys);
+
+    for (const std::string & key : config_keys)
+    {
+        if (key == "profile" || 0 == key.compare(0, strlen("profile["), "profile[")) /// Inheritance of profiles from the current one.
+            setProfile(config.getString(elem + "." + key), config);
+        else
+            continue;
+    }

    String path_to_constraints = "profiles." + profile_name + ".constraints";
    if (config.has(path_to_constraints))
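The key test above handles the way Poco-style configurations expose repeated XML tags: a second and third <profile> child appear as keys "profile[1]", "profile[2]", and so on (this naming convention is assumed here per Poco's documentation), so a profile can now inherit from several parents. A small sketch of the matching predicate and what it accepts:

#include <cstring>
#include <iostream>
#include <string>
#include <vector>

// Accepts "profile" and "profile[N]" but rejects unrelated keys like "profile_x".
bool isProfileKey(const std::string & key)
{
    return key == "profile" || key.compare(0, std::strlen("profile["), "profile[") == 0;
}

int main()
{
    for (const std::string & key : std::vector<std::string>{"profile", "profile[1]", "max_threads", "profile_x"})
        std::cout << key << " -> " << (isProfileKey(key) ? "inherit" : "plain setting") << '\n';
}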

View File

@ -0,0 +1,119 @@
#include <memory>
#include <random>
#include <DataTypes/DataTypesNumber.h>
#include <Common/thread_local_rng.h>
#include <IO/ReadBuffer.h>
#include <IO/WriteBuffer.h>
#include <AggregateFunctions/IAggregateFunction.h>
#include <AggregateFunctions/AggregateFunctionFactory.h>
namespace DB
{
namespace ErrorCodes
{
extern const int AGGREGATE_FUNCTION_THROW;
extern const int NUMBER_OF_ARGUMENTS_DOESNT_MATCH;
}
namespace
{
struct AggregateFunctionThrowData
{
bool allocated;
AggregateFunctionThrowData() : allocated(true) {}
~AggregateFunctionThrowData()
{
volatile bool * allocated_ptr = &allocated;
if (*allocated_ptr)
*allocated_ptr = false;
else
abort();
}
};
/** Throw on creation with probability specified in parameter.
* It will check correct destruction of the state.
* This is intended to check for exception safety.
*/
class AggregateFunctionThrow final : public IAggregateFunctionDataHelper<AggregateFunctionThrowData, AggregateFunctionThrow>
{
private:
Float64 throw_probability;
public:
AggregateFunctionThrow(const DataTypes & argument_types_, const Array & parameters_, Float64 throw_probability_)
: IAggregateFunctionDataHelper(argument_types_, parameters_), throw_probability(throw_probability_) {}
String getName() const override
{
return "aggThrow";
}
DataTypePtr getReturnType() const override
{
return std::make_shared<DataTypeUInt8>();
}
void create(AggregateDataPtr place) const override
{
if (std::uniform_real_distribution<>(0.0, 1.0)(thread_local_rng) <= throw_probability)
throw Exception("Aggregate function " + getName() + " has thrown exception successfully", ErrorCodes::AGGREGATE_FUNCTION_THROW);
new (place) Data;
}
void destroy(AggregateDataPtr place) const noexcept override
{
data(place).~Data();
}
void add(AggregateDataPtr, const IColumn **, size_t, Arena *) const override
{
}
void merge(AggregateDataPtr, ConstAggregateDataPtr, Arena *) const override
{
}
void serialize(ConstAggregateDataPtr, WriteBuffer & buf) const override
{
char c = 0;
buf.write(c);
}
void deserialize(AggregateDataPtr, ReadBuffer & buf, Arena *) const override
{
char c = 0;
buf.read(c);
}
void insertResultInto(ConstAggregateDataPtr, IColumn & to) const override
{
to.insertDefault();
}
};
}
void registerAggregateFunctionAggThrow(AggregateFunctionFactory & factory)
{
factory.registerFunction("aggThrow", [](const std::string & name, const DataTypes & argument_types, const Array & parameters)
{
Float64 throw_probability = 1.0;
if (parameters.size() == 1)
throw_probability = parameters[0].safeGet<Float64>();
else if (parameters.size() > 1)
throw Exception("Aggregate function " + name + " cannot have more than one parameter", ErrorCodes::NUMBER_OF_ARGUMENTS_DOESNT_MATCH);
return std::make_shared<AggregateFunctionThrow>(argument_types, parameters, throw_probability);
});
}
}
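The Data struct of this new aggThrow function deliberately detects double destruction: the destructor flips `allocated` on the first run and aborts on the second, with `volatile` preventing the compiler from caching the flag. From SQL it would be exercised with something like `SELECT aggThrow(0.1)(number) FROM numbers(10)` (parameter syntax inferred from the factory code above, not confirmed by this diff). A standalone sketch of the same detection idiom over placement-new storage:

#include <cstdlib>
#include <new>

// Flip a flag in the destructor and abort() if it ever runs twice on the same bytes.
struct DestroyOnce
{
    bool allocated = true;

    ~DestroyOnce()
    {
        volatile bool * allocated_ptr = &allocated;
        if (*allocated_ptr)
            *allocated_ptr = false;
        else
            std::abort(); // double destruction: the aggregate state machinery has a bug
    }
};

int main()
{
    alignas(DestroyOnce) unsigned char buf[sizeof(DestroyOnce)];
    auto * state = new (buf) DestroyOnce; // placement-new, like create() does
    state->~DestroyOnce();                // first destroy: fine
    // state->~DestroyOnce();             // a second destroy would abort()
}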

View File

@ -129,9 +129,9 @@ public:

    void add(AggregateDataPtr place, const IColumn ** columns, const size_t row_num, Arena *) const override
    {
-        /// TODO Inefficient.
-        const auto x = applyVisitor(FieldVisitorConvertToNumber<Float64>(), (*columns[0])[row_num]);
-        const auto y = applyVisitor(FieldVisitorConvertToNumber<Float64>(), (*columns[1])[row_num]);
+        /// NOTE Slightly inefficient.
+        const auto x = columns[0]->getFloat64(row_num);
+        const auto y = columns[1]->getFloat64(row_num);
        data(place).add(x, y);
    }

View File

@ -100,7 +100,18 @@ public:

    void create(AggregateDataPtr place) const override
    {
        for (size_t i = 0; i < total; ++i)
-            nested_function->create(place + i * size_of_data);
+        {
+            try
+            {
+                nested_function->create(place + i * size_of_data);
+            }
+            catch (...)
+            {
+                for (size_t j = 0; j < i; ++j)
+                    nested_function->destroy(place + j * size_of_data);
+                throw;
+            }
+        }
    }

    void destroy(AggregateDataPtr place) const noexcept override
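The rollback above gives create() the strong exception guarantee: if constructing sub-state i throws, the already-built prefix 0..i-1 is destroyed before rethrowing, so the caller never sees a half-initialised block. A compilable sketch of the idiom with a constructor that can fail (the State type is illustrative):

#include <cstddef>
#include <iostream>
#include <new>
#include <stdexcept>

// A state whose constructor can throw, to exercise the rollback path.
struct State
{
    static inline int live = 0;
    explicit State(bool fail) { if (fail) throw std::runtime_error("create failed"); ++live; }
    ~State() { --live; }
};

// Construct `total` states in raw storage; if construction i throws, destroy the
// already-constructed prefix 0..i-1 and rethrow (the same shape as the patch above).
void createAll(unsigned char * place, std::size_t total, std::size_t failing_index)
{
    for (std::size_t i = 0; i < total; ++i)
    {
        try
        {
            new (place + i * sizeof(State)) State(i == failing_index);
        }
        catch (...)
        {
            for (std::size_t j = 0; j < i; ++j)
                reinterpret_cast<State *>(place + j * sizeof(State))->~State();
            throw;
        }
    }
}

int main()
{
    alignas(State) unsigned char buf[4 * sizeof(State)];
    try { createAll(buf, 4, 2); } catch (...) {}
    std::cout << "live states after failed create: " << State::live << '\n'; // prints 0
}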

View File

@ -23,13 +23,13 @@ inline void assertNoParameters(const std::string & name, const Array & parameter

inline void assertUnary(const std::string & name, const DataTypes & argument_types)
{
    if (argument_types.size() != 1)
-        throw Exception("Aggregate function " + name + " require single argument", ErrorCodes::NUMBER_OF_ARGUMENTS_DOESNT_MATCH);
+        throw Exception("Aggregate function " + name + " requires single argument", ErrorCodes::NUMBER_OF_ARGUMENTS_DOESNT_MATCH);
}

inline void assertBinary(const std::string & name, const DataTypes & argument_types)
{
    if (argument_types.size() != 2)
-        throw Exception("Aggregate function " + name + " require two arguments", ErrorCodes::NUMBER_OF_ARGUMENTS_DOESNT_MATCH);
+        throw Exception("Aggregate function " + name + " requires two arguments", ErrorCodes::NUMBER_OF_ARGUMENTS_DOESNT_MATCH);
}

template<std::size_t maximal_arity>

View File

@ -55,7 +55,7 @@ static IAggregateFunction * createAggregateFunctionArgMinMaxSecond(const DataTyp
#define DISPATCH(TYPE) \
    if (which.idx == TypeIndex::TYPE) \
-        return new AggregateFunctionArgMinMax<AggregateFunctionArgMinMaxData<ResData, MinMaxData<SingleValueDataFixed<TYPE>>>>(res_type, val_type); \
+        return new AggregateFunctionArgMinMax<AggregateFunctionArgMinMaxData<ResData, MinMaxData<SingleValueDataFixed<TYPE>>>>(res_type, val_type);

    FOR_NUMERIC_TYPES(DISPATCH)
#undef DISPATCH

View File

@ -131,9 +131,7 @@ public:
    /** Contains a loop with calls to "add" function. You can collect arguments into array "places"
      * and do a single call to "addBatch" for devirtualization and inlining.
      */
-    virtual void
-    addBatch(size_t batch_size, AggregateDataPtr * places, size_t place_offset, const IColumn ** columns, Arena * arena)
-        const = 0;
+    virtual void addBatch(size_t batch_size, AggregateDataPtr * places, size_t place_offset, const IColumn ** columns, Arena * arena) const = 0;

    /** The same for single place.
      */
@ -144,9 +142,8 @@ public:
      * -Array combinator. It might also be used generally to break data dependency when array
      * "places" contains a large number of same values consecutively.
      */
-    virtual void
-    addBatchArray(size_t batch_size, AggregateDataPtr * places, size_t place_offset, const IColumn ** columns, const UInt64 * offsets, Arena * arena)
-        const = 0;
+    virtual void addBatchArray(
+        size_t batch_size, AggregateDataPtr * places, size_t place_offset, const IColumn ** columns, const UInt64 * offsets, Arena * arena) const = 0;

    const DataTypes & getArgumentTypes() const { return argument_types; }
    const Array & getParameters() const { return parameters; }
@ -213,7 +210,7 @@ protected:
public:
    IAggregateFunctionDataHelper(const DataTypes & argument_types_, const Array & parameters_)
        : IAggregateFunctionHelper<Derived>(argument_types_, parameters_) {}

    void create(AggregateDataPtr place) const override
    {
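addBatch exists so the hot loop pays one virtual call per block of rows instead of one per row; the CRTP helper layer then implements the batch method once, dispatching statically to the derived add so it can be inlined. A compilable sketch of that devirtualization pattern with simplified signatures (not the real IAggregateFunction interface):

#include <cstddef>
#include <iostream>

struct IAggregate
{
    virtual ~IAggregate() = default;
    virtual void add(char * place, double value) const = 0;
    // One virtual call per batch; the loop body below is then inlineable.
    virtual void addBatch(std::size_t n, char * place, const double * values) const = 0;
};

// CRTP helper: implements addBatch once, with a statically-dispatched call to Derived::add.
template <typename Derived>
struct IAggregateHelper : IAggregate
{
    void addBatch(std::size_t n, char * place, const double * values) const final
    {
        for (std::size_t i = 0; i < n; ++i)
            static_cast<const Derived &>(*this).Derived::add(place, values[i]); // no vtable lookup
    }
};

struct Sum final : IAggregateHelper<Sum>
{
    void add(char * place, double value) const override { *reinterpret_cast<double *>(place) += value; }
};

int main()
{
    double state = 0;
    const double values[] = {1, 2, 3};
    Sum().addBatch(3, reinterpret_cast<char *>(&state), values);
    std::cout << state << '\n'; // 6
}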

View File

@ -42,6 +42,7 @@ void registerAggregateFunctions()
    registerAggregateFunctionSimpleLinearRegression(factory);
    registerAggregateFunctionMoving(factory);
    registerAggregateFunctionCategoricalIV(factory);
+    registerAggregateFunctionAggThrow(factory);
}

{

View File

@ -34,6 +34,7 @@ void registerAggregateFunctionEntropy(AggregateFunctionFactory &);
void registerAggregateFunctionSimpleLinearRegression(AggregateFunctionFactory &);
void registerAggregateFunctionMoving(AggregateFunctionFactory &);
void registerAggregateFunctionCategoricalIV(AggregateFunctionFactory &);
+void registerAggregateFunctionAggThrow(AggregateFunctionFactory &);

class AggregateFunctionCombinatorFactory;
void registerAggregateFunctionCombinatorIf(AggregateFunctionCombinatorFactory &);

View File

@ -212,21 +212,23 @@ public:
    Float64 getFloat64(size_t n) const override;
    Float32 getFloat32(size_t n) const override;

-    UInt64 getUInt(size_t n) const override
+    /// Out of range conversion is permitted.
+    UInt64 NO_SANITIZE_UNDEFINED getUInt(size_t n) const override
    {
        return UInt64(data[n]);
    }

+    /// Out of range conversion is permitted.
+    Int64 NO_SANITIZE_UNDEFINED getInt(size_t n) const override
+    {
+        return Int64(data[n]);
+    }
+
    bool getBool(size_t n) const override
    {
        return bool(data[n]);
    }

-    Int64 getInt(size_t n) const override
-    {
-        return Int64(data[n]);
-    }
-
    void insert(const Field & x) override
    {
        data.push_back(DB::get<NearestFieldType<T>>(x));
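NO_SANITIZE_UNDEFINED is needed here because converting an out-of-range floating-point value to an integer type is undefined behaviour in C++; the attribute documents that the result is deliberately accepted as-is instead of letting UBSan abort the process. A minimal sketch of the underlying compiler attribute (the macro spelling below is an assumption for Clang and recent GCC, not ClickHouse's exact Core/Defines.h definition):

#include <cstdint>
#include <iostream>

#if defined(__clang__) || defined(__GNUC__)
#    define NO_SANITIZE_UNDEFINED __attribute__((__no_sanitize__("undefined")))
#else
#    define NO_SANITIZE_UNDEFINED
#endif

// Out-of-range Float64 -> UInt64 is UB; the attribute tells UBSan this is intentional.
NO_SANITIZE_UNDEFINED uint64_t toUInt64(double x)
{
    return static_cast<uint64_t>(x);
}

int main()
{
    std::cout << toUInt64(1e300) << '\n'; // unspecified value, but no sanitizer trap
}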

View File

@ -138,7 +138,6 @@ namespace ErrorCodes
    extern const int FUNCTION_IS_SPECIAL = 129;
    extern const int CANNOT_READ_ARRAY_FROM_TEXT = 130;
    extern const int TOO_LARGE_STRING_SIZE = 131;
-    extern const int CANNOT_CREATE_TABLE_FROM_METADATA = 132;
    extern const int AGGREGATE_FUNCTION_DOESNT_ALLOW_PARAMETERS = 133;
    extern const int PARAMETERS_TO_AGGREGATE_FUNCTIONS_MUST_BE_LITERALS = 134;
    extern const int ZERO_ARRAY_OR_TUPLE_INDEX = 135;
@ -474,9 +473,9 @@ namespace ErrorCodes
    extern const int NOT_ENOUGH_PRIVILEGES = 497;
    extern const int LIMIT_BY_WITH_TIES_IS_NOT_SUPPORTED = 498;
    extern const int S3_ERROR = 499;
-    extern const int CANNOT_CREATE_DICTIONARY_FROM_METADATA = 500;
    extern const int CANNOT_CREATE_DATABASE = 501;
    extern const int CANNOT_SIGQUEUE = 502;
+    extern const int AGGREGATE_FUNCTION_THROW = 503;

    extern const int KEEPER_EXCEPTION = 999;
    extern const int POCO_EXCEPTION = 1000;

View File

@ -25,6 +25,55 @@ namespace ErrorCodes
    extern const int NOT_IMPLEMENTED;
}

+Exception::Exception()
+{
+}
+
+Exception::Exception(const std::string & msg, int code)
+    : Poco::Exception(msg, code)
+{
+}
+
+Exception::Exception(CreateFromPocoTag, const Poco::Exception & exc)
+    : Poco::Exception(exc.displayText(), ErrorCodes::POCO_EXCEPTION)
+{
+#ifdef STD_EXCEPTION_HAS_STACK_TRACE
+    set_stack_trace(exc.get_stack_trace_frames(), exc.get_stack_trace_size());
+#endif
+}
+
+Exception::Exception(CreateFromSTDTag, const std::exception & exc)
+    : Poco::Exception(String(typeid(exc).name()) + ": " + String(exc.what()), ErrorCodes::STD_EXCEPTION)
+{
+#ifdef STD_EXCEPTION_HAS_STACK_TRACE
+    set_stack_trace(exc.get_stack_trace_frames(), exc.get_stack_trace_size());
+#endif
+}
+
+std::string getExceptionStackTraceString(const std::exception & e)
+{
+#ifdef STD_EXCEPTION_HAS_STACK_TRACE
+    return StackTrace::toString(e.get_stack_trace_frames(), 0, e.get_stack_trace_size());
+#else
+    if (const auto * db_exception = dynamic_cast<const Exception *>(&e))
+        return db_exception->getStackTraceString();
+    return {};
+#endif
+}
+
+std::string Exception::getStackTraceString() const
+{
+#ifdef STD_EXCEPTION_HAS_STACK_TRACE
+    return StackTrace::toString(get_stack_trace_frames(), 0, get_stack_trace_size());
+#else
+    return trace.toString();
+#endif
+}
+
std::string errnoToString(int code, int e)
{
    const size_t buf_size = 128;
@ -141,6 +190,7 @@ std::string getCurrentExceptionMessage(bool with_stacktrace, bool check_embedded
    {
        stream << "Poco::Exception. Code: " << ErrorCodes::POCO_EXCEPTION << ", e.code() = " << e.code()
            << ", e.displayText() = " << e.displayText()
+            << (with_stacktrace ? getExceptionStackTraceString(e) : "")
            << (with_extra_info ? getExtraExceptionInfo(e) : "")
            << " (version " << VERSION_STRING << VERSION_OFFICIAL;
    }
@ -157,8 +207,9 @@ std::string getCurrentExceptionMessage(bool with_stacktrace, bool check_embedded
            name += " (demangling status: " + toString(status) + ")";

        stream << "std::exception. Code: " << ErrorCodes::STD_EXCEPTION << ", type: " << name << ", e.what() = " << e.what()
-            << (with_extra_info ? getExtraExceptionInfo(e) : "")
-            << ", version = " << VERSION_STRING << VERSION_OFFICIAL;
+            << (with_stacktrace ? getExceptionStackTraceString(e) : "")
+            << (with_extra_info ? getExtraExceptionInfo(e) : "")
+            << ", version = " << VERSION_STRING << VERSION_OFFICIAL;
    }
    catch (...) {}
}
@ -261,7 +312,7 @@ std::string getExceptionMessage(const Exception & e, bool with_stacktrace, bool
        stream << "Code: " << e.code() << ", e.displayText() = " << text;

        if (with_stacktrace && !has_embedded_stack_trace)
-            stream << ", Stack trace (when copying this message, always include the lines below):\n\n" << e.getStackTraceString();
+            stream << ", Stack trace (when copying this message, always include the lines below):\n\n" << e.getStackTraceString();
    }
    catch (...) {}

View File

@ -22,13 +22,14 @@ namespace ErrorCodes
class Exception : public Poco::Exception
{
public:
-    Exception() {} /// For deferred initialization.
-    Exception(const std::string & msg, int code) : Poco::Exception(msg, code) {}
-    Exception(const std::string & msg, const Exception & nested_exception, int code)
-        : Poco::Exception(msg, nested_exception, code), trace(nested_exception.trace) {}
+    Exception();
+    Exception(const std::string & msg, int code);

    enum CreateFromPocoTag { CreateFromPoco };
-    Exception(CreateFromPocoTag, const Poco::Exception & exc) : Poco::Exception(exc.displayText(), ErrorCodes::POCO_EXCEPTION) {}
+    enum CreateFromSTDTag { CreateFromSTD };
+    Exception(CreateFromPocoTag, const Poco::Exception & exc);
+    Exception(CreateFromSTDTag, const std::exception & exc);

    Exception * clone() const override { return new Exception(*this); }
    void rethrow() const override { throw *this; }
@ -38,15 +39,20 @@ public:
    /// Add something to the existing message.
    void addMessage(const std::string & arg) { extendedMessage(arg); }

-    const StackTrace & getStackTrace() const { return trace; }
+    std::string getStackTraceString() const;

private:
+#ifndef STD_EXCEPTION_HAS_STACK_TRACE
    StackTrace trace;
+#endif

    const char * className() const throw() override { return "DB::Exception"; }
};

+std::string getExceptionStackTraceString(const std::exception & e);
+
/// Contains an additional member `saved_errno`. See the throwFromErrno function.
class ErrnoException : public Exception
{
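The CreateFromPoco/CreateFromSTD tags give two constructors with otherwise overlapping argument shapes unmistakably distinct signatures, so the call site states which conversion it wants. A self-contained sketch of tag-dispatched constructors (the Error class and its messages are illustrative):

#include <iostream>
#include <stdexcept>
#include <string>

// Tag types disambiguate constructors that would otherwise be easy to confuse.
class Error
{
public:
    enum FromStdTag { FromStd };
    enum FromTextTag { FromText };

    Error(FromStdTag, const std::exception & e) : message(std::string("std: ") + e.what()) {}
    Error(FromTextTag, const std::string & text) : message("text: " + text) {}

    const std::string & what() const { return message; }

private:
    std::string message;
};

int main()
{
    std::runtime_error e("boom");
    std::cout << Error(Error::FromStd, e).what() << '\n';
    std::cout << Error(Error::FromText, "boom").what() << '\n';
}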

View File

@ -23,11 +23,10 @@ struct TrivialWeightFunction
}; };
/// Thread-safe cache that evicts entries which are not used for a long time or are expired. /// Thread-safe cache that evicts entries which are not used for a long time.
/// WeightFunction is a functor that takes Mapped as a parameter and returns "weight" (approximate size) /// WeightFunction is a functor that takes Mapped as a parameter and returns "weight" (approximate size)
/// of that value. /// of that value.
/// Cache starts to evict entries when their total weight exceeds max_size and when expiration time of these /// Cache starts to evict entries when their total weight exceeds max_size.
/// entries is due.
/// Value weight should not change after insertion. /// Value weight should not change after insertion.
template <typename TKey, typename TMapped, typename HashFunction = std::hash<TKey>, typename WeightFunction = TrivialWeightFunction<TMapped>> template <typename TKey, typename TMapped, typename HashFunction = std::hash<TKey>, typename WeightFunction = TrivialWeightFunction<TMapped>>
class LRUCache class LRUCache
@ -36,15 +35,13 @@ public:
using Key = TKey; using Key = TKey;
using Mapped = TMapped; using Mapped = TMapped;
using MappedPtr = std::shared_ptr<Mapped>; using MappedPtr = std::shared_ptr<Mapped>;
using Delay = std::chrono::seconds;
private: private:
using Clock = std::chrono::steady_clock; using Clock = std::chrono::steady_clock;
using Timestamp = Clock::time_point;
public: public:
LRUCache(size_t max_size_, const Delay & expiration_delay_ = Delay::zero()) LRUCache(size_t max_size_)
: max_size(std::max(static_cast<size_t>(1), max_size_)), expiration_delay(expiration_delay_) {} : max_size(std::max(static_cast<size_t>(1), max_size_)) {}
MappedPtr get(const Key & key) MappedPtr get(const Key & key)
{ {
@ -167,16 +164,9 @@ protected:
struct Cell struct Cell
{ {
bool expired(const Timestamp & last_timestamp, const Delay & delay) const
{
return (delay == Delay::zero()) ||
((last_timestamp > timestamp) && ((last_timestamp - timestamp) > delay));
}
MappedPtr value; MappedPtr value;
size_t size; size_t size;
LRUQueueIterator queue_iterator; LRUQueueIterator queue_iterator;
Timestamp timestamp;
}; };
using Cells = std::unordered_map<Key, Cell, HashFunction>; using Cells = std::unordered_map<Key, Cell, HashFunction>;
@ -257,7 +247,6 @@ private:
/// Total weight of values. /// Total weight of values.
size_t current_size = 0; size_t current_size = 0;
const size_t max_size; const size_t max_size;
const Delay expiration_delay;
std::atomic<size_t> hits {0}; std::atomic<size_t> hits {0};
std::atomic<size_t> misses {0}; std::atomic<size_t> misses {0};
@ -273,7 +262,6 @@ private:
} }
Cell & cell = it->second; Cell & cell = it->second;
updateCellTimestamp(cell);
/// Move the key to the end of the queue. The iterator remains valid. /// Move the key to the end of the queue. The iterator remains valid.
queue.splice(queue.end(), queue, cell.queue_iterator); queue.splice(queue.end(), queue, cell.queue_iterator);
@ -303,18 +291,11 @@ private:
cell.value = mapped; cell.value = mapped;
cell.size = cell.value ? weight_function(*cell.value) : 0; cell.size = cell.value ? weight_function(*cell.value) : 0;
current_size += cell.size; current_size += cell.size;
updateCellTimestamp(cell);
removeOverflow(cell.timestamp); removeOverflow();
} }
void updateCellTimestamp(Cell & cell) void removeOverflow()
{
if (expiration_delay != Delay::zero())
cell.timestamp = Clock::now();
}
void removeOverflow(const Timestamp & last_timestamp)
{ {
size_t current_weight_lost = 0; size_t current_weight_lost = 0;
size_t queue_size = cells.size(); size_t queue_size = cells.size();
@ -330,8 +311,6 @@ private:
} }
const auto & cell = it->second; const auto & cell = it->second;
if (!cell.expired(last_timestamp, expiration_delay))
break;
current_size -= cell.size; current_size -= cell.size;
current_weight_lost += cell.size; current_weight_lost += cell.size;
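With expiration gone, eviction is driven purely by total weight and recency: get() moves a cell to the most-recent end of the queue, and set() evicts from the least-recent end until the weight fits. A minimal self-contained LRU in the same spirit (strings as both key and value, weight = value length; not the real templated DB::LRUCache):

#include <iostream>
#include <list>
#include <string>
#include <unordered_map>

class LruCache
{
public:
    explicit LruCache(size_t max_weight_) : max_weight(max_weight_) {}

    void set(const std::string & key, const std::string & value)
    {
        auto it = cells.find(key);
        if (it != cells.end())
        {
            current_weight -= it->second.value.size();
            queue.erase(it->second.pos);
            cells.erase(it);
        }
        queue.push_back(key);
        cells[key] = {value, std::prev(queue.end())};
        current_weight += value.size();
        while (current_weight > max_weight) // weight-based eviction, oldest first
        {
            auto & cell = cells[queue.front()];
            current_weight -= cell.value.size();
            cells.erase(queue.front());
            queue.pop_front();
        }
    }

    const std::string * get(const std::string & key)
    {
        auto it = cells.find(key);
        if (it == cells.end())
            return nullptr;
        queue.splice(queue.end(), queue, it->second.pos); // move to most-recent end
        return &it->second.value;
    }

private:
    struct Cell { std::string value; std::list<std::string>::iterator pos; };
    size_t max_weight;
    size_t current_weight = 0;
    std::list<std::string> queue;
    std::unordered_map<std::string, Cell> cells;
};

int main()
{
    LruCache cache(10);
    cache.set("a", "12345");
    cache.set("b", "12345");
    cache.get("a");         // "a" becomes most recently used
    cache.set("c", "1234"); // evicts "b", the least recently used entry
    std::cout << (cache.get("b") ? "b alive" : "b evicted") << '\n';
}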

View File

@ -37,6 +37,8 @@
    M(CreatedReadBufferOrdinary, "") \
    M(CreatedReadBufferAIO, "") \
    M(CreatedReadBufferAIOFailed, "") \
+    M(CreatedReadBufferMMap, "") \
+    M(CreatedReadBufferMMapFailed, "") \
    M(CreatedWriteBufferOrdinary, "") \
    M(CreatedWriteBufferAIO, "") \
    M(CreatedWriteBufferAIOFailed, "") \

View File

@ -4,6 +4,7 @@
#include <Common/Elf.h>
#include <Common/SymbolIndex.h>
#include <Common/config.h>
+#include <Common/MemorySanitizer.h>
#include <common/SimpleCache.h>
#include <common/demangle.h>
#include <Core/Defines.h>
@ -226,6 +227,7 @@ void StackTrace::tryCapture()
    size = 0;
#if USE_UNWIND
    size = unw_backtrace(frames.data(), capacity);
+    __msan_unpoison(frames.data(), size * sizeof(frames[0]));
#endif
}
@ -328,3 +330,15 @@ std::string StackTrace::toString() const
    static SimpleCache<decltype(toStringImpl), &toStringImpl> func_cached;
    return func_cached(frames, offset, size);
}
+
+std::string StackTrace::toString(void ** frames_, size_t offset, size_t size)
+{
+    __msan_unpoison(frames_, size * sizeof(*frames_));
+
+    StackTrace::Frames frames_copy{};
+    for (size_t i = 0; i < size; ++i)
+        frames_copy[i] = frames_[i];
+
+    static SimpleCache<decltype(toStringImpl), &toStringImpl> func_cached;
+    return func_cached(frames_copy, offset, size);
+}
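The __msan_unpoison calls are needed because libunwind fills the frames buffer in ways MemorySanitizer cannot track, so the bytes would otherwise be reported as uninitialised reads. A sketch of the typical compatibility-header shape such a call relies on (assumed; the actual Common/MemorySanitizer.h may differ):

#include <cstddef>

// Expand to a no-op unless MemorySanitizer is active.
#if defined(__has_feature)
#    if __has_feature(memory_sanitizer)
#        include <sanitizer/msan_interface.h>
#        define UNPOISON(ptr, size) __msan_unpoison(ptr, size)
#    endif
#endif
#ifndef UNPOISON
#    define UNPOISON(ptr, size) ((void)(ptr), (void)(size))
#endif

// Call after foreign code (e.g. unw_backtrace) has written into the buffer.
void markFilledByForeignCode(void ** frames, std::size_t size)
{
    UNPOISON(frames, size * sizeof(frames[0])); // safe: no-op without MSan
}

int main()
{
    void * frames[16] = {};
    markFilledByForeignCode(frames, 16);
}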

View File

@ -41,6 +41,8 @@ public:
const Frames & getFrames() const; const Frames & getFrames() const;
std::string toString() const; std::string toString() const;
static std::string toString(void ** frames, size_t offset, size_t size);
void toStringEveryLine(std::function<void(const std::string &)> callback) const; void toStringEveryLine(std::function<void(const std::string &)> callback) const;
protected: protected:

View File

@ -13,9 +13,6 @@ target_link_libraries (sip_hash_perf PRIVATE clickhouse_common_io)
add_executable (auto_array auto_array.cpp)
target_link_libraries (auto_array PRIVATE clickhouse_common_io)

-add_executable (lru_cache lru_cache.cpp)
-target_link_libraries (lru_cache PRIVATE clickhouse_common_io)
-
add_executable (hash_table hash_table.cpp)
target_link_libraries (hash_table PRIVATE clickhouse_common_io)

View File

@ -1,317 +0,0 @@
#include <Common/LRUCache.h>
#include <Common/Exception.h>
#include <iostream>
#include <string>
#include <thread>
#include <chrono>
#include <functional>
namespace
{
void run();
void runTest(unsigned int num, const std::function<bool()> & func);
bool test1();
bool test2();
bool test_concurrent();
#define ASSERT_CHECK(cond, res) \
do \
{ \
if (!(cond)) \
{ \
std::cout << __FILE__ << ":" << __LINE__ << ":" \
<< "Assertion " << #cond << " failed.\n"; \
if ((res)) { (res) = false; } \
} \
} \
while (0)
void run()
{
const std::vector<std::function<bool()>> tests =
{
test1,
test2,
test_concurrent
};
unsigned int num = 0;
for (const auto & test : tests)
{
++num;
runTest(num, test);
}
}
void runTest(unsigned int num, const std::function<bool()> & func)
{
bool ok;
try
{
ok = func();
}
catch (const DB::Exception & ex)
{
ok = false;
std::cout << "Caught exception " << ex.displayText() << "\n";
}
catch (const std::exception & ex)
{
ok = false;
std::cout << "Caught exception " << ex.what() << "\n";
}
catch (...)
{
ok = false;
std::cout << "Caught unhandled exception\n";
}
if (ok)
std::cout << "Test " << num << " passed\n";
else
std::cout << "Test " << num << " failed\n";
}
struct Weight
{
size_t operator()(const std::string & s) const
{
return s.size();
}
};
bool test1()
{
using Cache = DB::LRUCache<std::string, std::string, std::hash<std::string>, Weight>;
using MappedPtr = Cache::MappedPtr;
auto ptr = [](const std::string & s)
{
return MappedPtr(new std::string(s));
};
Cache cache(10);
bool res = true;
ASSERT_CHECK(!cache.get("asd"), res);
cache.set("asd", ptr("qwe"));
ASSERT_CHECK((*cache.get("asd") == "qwe"), res);
cache.set("zxcv", ptr("12345"));
cache.set("01234567891234567", ptr("--"));
ASSERT_CHECK((*cache.get("zxcv") == "12345"), res);
ASSERT_CHECK((*cache.get("asd") == "qwe"), res);
ASSERT_CHECK((*cache.get("01234567891234567") == "--"), res);
ASSERT_CHECK(!cache.get("123x"), res);
cache.set("321x", ptr("+"));
ASSERT_CHECK(!cache.get("zxcv"), res);
ASSERT_CHECK((*cache.get("asd") == "qwe"), res);
ASSERT_CHECK((*cache.get("01234567891234567") == "--"), res);
ASSERT_CHECK(!cache.get("123x"), res);
ASSERT_CHECK((*cache.get("321x") == "+"), res);
ASSERT_CHECK((cache.weight() == 6), res);
ASSERT_CHECK((cache.count() == 3), res);
return res;
}
bool test2()
{
using namespace std::literals;
using Cache = DB::LRUCache<std::string, std::string, std::hash<std::string>, Weight>;
using MappedPtr = Cache::MappedPtr;
auto ptr = [](const std::string & s)
{
return MappedPtr(new std::string(s));
};
Cache cache(10, 3s);
bool res = true;
ASSERT_CHECK(!cache.get("asd"), res);
cache.set("asd", ptr("qwe"));
ASSERT_CHECK((*cache.get("asd") == "qwe"), res);
cache.set("zxcv", ptr("12345"));
cache.set("01234567891234567", ptr("--"));
ASSERT_CHECK((*cache.get("zxcv") == "12345"), res);
ASSERT_CHECK((*cache.get("asd") == "qwe"), res);
ASSERT_CHECK((*cache.get("01234567891234567") == "--"), res);
ASSERT_CHECK(!cache.get("123x"), res);
cache.set("321x", ptr("+"));
ASSERT_CHECK((cache.get("zxcv")), res);
ASSERT_CHECK((*cache.get("asd") == "qwe"), res);
ASSERT_CHECK((*cache.get("01234567891234567") == "--"), res);
ASSERT_CHECK(!cache.get("123x"), res);
ASSERT_CHECK((*cache.get("321x") == "+"), res);
ASSERT_CHECK((cache.weight() == 11), res);
ASSERT_CHECK((cache.count() == 4), res);
std::this_thread::sleep_for(5s);
cache.set("123x", ptr("2769"));
ASSERT_CHECK(!cache.get("zxcv"), res);
ASSERT_CHECK((*cache.get("asd") == "qwe"), res);
ASSERT_CHECK((*cache.get("01234567891234567") == "--"), res);
ASSERT_CHECK((*cache.get("321x") == "+"), res);
ASSERT_CHECK((cache.weight() == 10), res);
ASSERT_CHECK((cache.count() == 4), res);
return res;
}
bool test_concurrent()
{
using namespace std::literals;
using Cache = DB::LRUCache<std::string, std::string, std::hash<std::string>, Weight>;
Cache cache(2);
bool res = true;
auto load_func = [](const std::string & result, std::chrono::seconds sleep_for, bool throw_exc)
{
std::this_thread::sleep_for(sleep_for);
if (throw_exc)
throw std::runtime_error("Exception!");
return std::make_shared<std::string>(result);
};
/// Case 1: Both threads are able to load the value.
std::pair<Cache::MappedPtr, bool> result1;
std::thread thread1([&]()
{
result1 = cache.getOrSet("key", [&]() { return load_func("val1", 1s, false); });
});
std::pair<Cache::MappedPtr, bool> result2;
std::thread thread2([&]()
{
result2 = cache.getOrSet("key", [&]() { return load_func("val2", 1s, false); });
});
thread1.join();
thread2.join();
ASSERT_CHECK((result1.first == result2.first), res);
ASSERT_CHECK((result1.second != result2.second), res);
/// Case 2: One thread throws an exception during loading.
cache.reset();
bool thrown = false;
thread1 = std::thread([&]()
{
try
{
cache.getOrSet("key", [&]() { return load_func("val1", 2s, true); });
}
catch (...)
{
thrown = true;
}
});
thread2 = std::thread([&]()
{
std::this_thread::sleep_for(1s);
result2 = cache.getOrSet("key", [&]() { return load_func("val2", 1s, false); });
});
thread1.join();
thread2.join();
ASSERT_CHECK((thrown == true), res);
ASSERT_CHECK((result2.second == true), res);
ASSERT_CHECK((result2.first.get() == cache.get("key").get()), res);
ASSERT_CHECK((*result2.first == "val2"), res);
/// Case 3: All threads throw an exception.
cache.reset();
bool thrown1 = false;
thread1 = std::thread([&]()
{
try
{
cache.getOrSet("key", [&]() { return load_func("val1", 1s, true); });
}
catch (...)
{
thrown1 = true;
}
});
bool thrown2 = false;
thread2 = std::thread([&]()
{
try
{
cache.getOrSet("key", [&]() { return load_func("val1", 1s, true); });
}
catch (...)
{
thrown2 = true;
}
});
thread1.join();
thread2.join();
ASSERT_CHECK((thrown1 == true), res);
ASSERT_CHECK((thrown2 == true), res);
ASSERT_CHECK((cache.get("key") == nullptr), res);
/// Case 4: Concurrent reset.
cache.reset();
thread1 = std::thread([&]()
{
result1 = cache.getOrSet("key", [&]() { return load_func("val1", 2s, false); });
});
std::this_thread::sleep_for(1s);
cache.reset();
thread1.join();
ASSERT_CHECK((result1.second == true), res);
ASSERT_CHECK((*result1.first == "val1"), res);
ASSERT_CHECK((cache.get("key") == nullptr), res);
return res;
}
}
int main()
{
run();
return 0;
}

View File

@ -19,7 +19,7 @@ void CachedCompressedReadBuffer::initInput()
{
    if (!file_in)
    {
-        file_in = createReadBufferFromFileBase(path, estimated_size, aio_threshold, buf_size);
+        file_in = createReadBufferFromFileBase(path, estimated_size, aio_threshold, mmap_threshold, buf_size);
        compressed_in = file_in.get();

        if (profile_callback)
@ -73,10 +73,11 @@ bool CachedCompressedReadBuffer::nextImpl()

CachedCompressedReadBuffer::CachedCompressedReadBuffer(
-    const std::string & path_, UncompressedCache * cache_, size_t estimated_size_, size_t aio_threshold_,
+    const std::string & path_, UncompressedCache * cache_,
+    size_t estimated_size_, size_t aio_threshold_, size_t mmap_threshold_,
    size_t buf_size_)
    : ReadBuffer(nullptr, 0), path(path_), cache(cache_), buf_size(buf_size_), estimated_size(estimated_size_),
-    aio_threshold(aio_threshold_), file_pos(0)
+    aio_threshold(aio_threshold_), mmap_threshold(mmap_threshold_), file_pos(0)
{
}

View File

@ -26,6 +26,7 @@ private:
    size_t buf_size;
    size_t estimated_size;
    size_t aio_threshold;
+    size_t mmap_threshold;

    std::unique_ptr<ReadBufferFromFileBase> file_in;
    size_t file_pos;
@ -42,7 +43,8 @@ private:
public:
    CachedCompressedReadBuffer(
-        const std::string & path_, UncompressedCache * cache_, size_t estimated_size_, size_t aio_threshold_,
+        const std::string & path_, UncompressedCache * cache_,
+        size_t estimated_size_, size_t aio_threshold_, size_t mmap_threshold_,
        size_t buf_size_ = DBMS_DEFAULT_BUFFER_SIZE);

View File

@ -33,9 +33,9 @@ bool CompressedReadBufferFromFile::nextImpl()

CompressedReadBufferFromFile::CompressedReadBufferFromFile(
-    const std::string & path, size_t estimated_size, size_t aio_threshold, size_t buf_size)
+    const std::string & path, size_t estimated_size, size_t aio_threshold, size_t mmap_threshold, size_t buf_size)
    : BufferWithOwnMemory<ReadBuffer>(0),
-    p_file_in(createReadBufferFromFileBase(path, estimated_size, aio_threshold, buf_size)),
+    p_file_in(createReadBufferFromFileBase(path, estimated_size, aio_threshold, mmap_threshold, buf_size)),
    file_in(*p_file_in)
{
    compressed_in = &file_in;

View File

@ -30,7 +30,7 @@ private:
public:
    CompressedReadBufferFromFile(
-        const std::string & path, size_t estimated_size, size_t aio_threshold, size_t buf_size = DBMS_DEFAULT_BUFFER_SIZE);
+        const std::string & path, size_t estimated_size, size_t aio_threshold, size_t mmap_threshold, size_t buf_size = DBMS_DEFAULT_BUFFER_SIZE);

    void seek(size_t offset_in_compressed_file, size_t offset_in_decompressed_block);
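These hunks thread a new mmap_threshold through the read-buffer constructors down to createReadBufferFromFileBase, alongside the existing aio_threshold. A hypothetical sketch of what such threshold-based method selection can look like; the exact ClickHouse rules are not shown in this diff, so the function and its ordering are assumptions:

#include <cstddef>
#include <iostream>

enum class ReadMethod { Ordinary, Mmap, Aio };

// Small reads use ordinary buffered IO, medium ones may be mmap'ed, very large
// ones may go through AIO. A threshold of 0 disables that method.
ReadMethod chooseReadMethod(size_t estimated_size, size_t aio_threshold, size_t mmap_threshold)
{
    if (aio_threshold && estimated_size >= aio_threshold)
        return ReadMethod::Aio;
    if (mmap_threshold && estimated_size >= mmap_threshold)
        return ReadMethod::Mmap;
    return ReadMethod::Ordinary;
}

int main()
{
    std::cout << int(chooseReadMethod(4 << 20, 64 << 20, 1 << 20)) << '\n'; // 1 == Mmap
}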

View File

@ -26,7 +26,7 @@ extern const int CANNOT_DECOMPRESS;
namespace
{

-Int64 getMaxValueForByteSize(UInt8 byte_size)
+inline Int64 getMaxValueForByteSize(Int8 byte_size)
{
    switch (byte_size)
    {
@ -51,11 +51,56 @@ struct WriteSpec
    const UInt8 data_bits;
};

-const std::array<UInt8, 5> DELTA_SIZES{7, 9, 12, 32, 64};
+// delta size prefix and data lengths based on few high bits peeked from binary stream
static const WriteSpec WRITE_SPEC_LUT[32] = {
// 0b0 - 1-bit prefix, no data to read
/* 00000 */ {1, 0b0, 0},
/* 00001 */ {1, 0b0, 0},
/* 00010 */ {1, 0b0, 0},
/* 00011 */ {1, 0b0, 0},
/* 00100 */ {1, 0b0, 0},
/* 00101 */ {1, 0b0, 0},
/* 00110 */ {1, 0b0, 0},
/* 00111 */ {1, 0b0, 0},
/* 01000 */ {1, 0b0, 0},
/* 01001 */ {1, 0b0, 0},
/* 01010 */ {1, 0b0, 0},
/* 01011 */ {1, 0b0, 0},
/* 01100 */ {1, 0b0, 0},
/* 01101 */ {1, 0b0, 0},
/* 01110 */ {1, 0b0, 0},
/* 01111 */ {1, 0b0, 0},
// 0b10 - 2 bit prefix, 7 bits of data
/* 10000 */ {2, 0b10, 7},
/* 10001 */ {2, 0b10, 7},
/* 10010 */ {2, 0b10, 7},
/* 10011 */ {2, 0b10, 7},
/* 10100 */ {2, 0b10, 7},
/* 10101 */ {2, 0b10, 7},
/* 10110 */ {2, 0b10, 7},
/* 10111 */ {2, 0b10, 7},
// 0b110 - 3 bit prefix, 9 bits of data
/* 11000 */ {3, 0b110, 9},
/* 11001 */ {3, 0b110, 9},
/* 11010 */ {3, 0b110, 9},
/* 11011 */ {3, 0b110, 9},
// 0b1110 - 4 bit prefix, 12 bits of data
/* 11100 */ {4, 0b1110, 12},
/* 11101 */ {4, 0b1110, 12},
// 5-bit prefixes
/* 11110 */ {5, 0b11110, 32},
/* 11111 */ {5, 0b11111, 64},
};
template <typename T>
WriteSpec getDeltaWriteSpec(const T & value)
{
-    // TODO: to speed up things a bit by counting number of leading zeroes instead of doing lots of comparisons
    if (value > -63 && value < 64)
    {
        return WriteSpec{2, 0b10, 7};
@ -107,14 +152,15 @@ UInt32 getCompressedDataSize(UInt8 data_bytes_size, UInt32 uncompressed_size)
template <typename ValueType>
UInt32 compressDataForType(const char * source, UInt32 source_size, char * dest)
{
-    // Since only unsigned int has guaranteed two's-complement overflow handling, we are doing math here on unsigned types.
-    // To simplify and bulletproof code, we enforce ValueType to be unsigned too.
+    // Since only unsigned int has guaranteed two's-complement overflow handling,
+    // we are doing math here only on unsigned types.
+    // To simplify and bulletproof code, we enforce ValueType to be unsigned too.
    static_assert(is_unsigned_v<ValueType>, "ValueType must be unsigned.");
    using UnsignedDeltaType = ValueType;

    // We use signed delta type to turn huge unsigned values into smaller signed:
    // ffffffff => -1
-    using SignedDeltaType = typename std::make_signed<UnsignedDeltaType>::type;
+    using SignedDeltaType = typename std::make_signed_t<UnsignedDeltaType>;

    if (source_size % sizeof(ValueType) != 0)
        throw Exception("Cannot compress, data size " + toString(source_size)
@ -149,8 +195,7 @@ UInt32 compressDataForType(const char * source, UInt32 source_size, char * dest)
        prev_value = curr_value;
    }

-    WriteBuffer buffer(dest, getCompressedDataSize(sizeof(ValueType), source_size - sizeof(ValueType)*2));
-    BitWriter writer(buffer);
+    BitWriter writer(dest, getCompressedDataSize(sizeof(ValueType), source_size - sizeof(ValueType)*2));

    int item = 2;
    for (; source < source_end; source += sizeof(ValueType), ++item)
@ -170,7 +215,8 @@ UInt32 compressDataForType(const char * source, UInt32 source_size, char * dest)
        else
        {
            const SignedDeltaType signed_dd = static_cast<SignedDeltaType>(double_delta);
-            const auto sign = std::signbit(signed_dd);
+            const auto sign = signed_dd < 0;
            // -1 shrinks dd down to fit into number of bits, and there can't be 0, so it is OK.
            const auto abs_value = static_cast<UnsignedDeltaType>(std::abs(signed_dd) - 1);
            const auto write_spec = getDeltaWriteSpec(signed_dd);
@ -183,7 +229,7 @@ UInt32 compressDataForType(const char * source, UInt32 source_size, char * dest)

    writer.flush();

-    return sizeof(items_count) + sizeof(prev_value) + sizeof(prev_delta) + buffer.count();
+    return sizeof(items_count) + sizeof(prev_value) + sizeof(prev_delta) + writer.count() / 8;
}

template <typename ValueType>
@ -220,35 +266,28 @@ void decompressDataForType(const char * source, UInt32 source_size, char * dest)
dest += sizeof(prev_value); dest += sizeof(prev_value);
} }
ReadBufferFromMemory buffer(source, source_size - sizeof(prev_value) - sizeof(prev_delta) - sizeof(items_count)); BitReader reader(source, source_size - sizeof(prev_value) - sizeof(prev_delta) - sizeof(items_count));
BitReader reader(buffer);
// since data is tightly packed, up to 1 bit per value, and last byte is padded with zeroes, // since data is tightly packed, up to 1 bit per value, and last byte is padded with zeroes,
// we have to keep track of items to avoid reading more that there is. // we have to keep track of items to avoid reading more that there is.
for (UInt32 items_read = 2; items_read < items_count && !reader.eof(); ++items_read) for (UInt32 items_read = 2; items_read < items_count && !reader.eof(); ++items_read)
{ {
UnsignedDeltaType double_delta = 0; UnsignedDeltaType double_delta = 0;
if (reader.readBit() == 1)
{
UInt8 i = 0;
for (; i < sizeof(DELTA_SIZES) - 1; ++i)
{
const auto next_bit = reader.readBit();
if (next_bit == 0)
{
break;
}
}
static_assert(sizeof(WRITE_SPEC_LUT)/sizeof(WRITE_SPEC_LUT[0]) == 32); // 5-bit prefix lookup table
const auto write_spec = WRITE_SPEC_LUT[reader.peekByte() >> (8 - 5)]; // only 5 high bits of peeked byte value
reader.skipBufferedBits(write_spec.prefix_bits); // discard the prefix value, since we've already used it
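// e.g. a peeked byte 0b10xxxxxx yields LUT index 0b10xxx (16..23); all eight of those
// entries map to the same 2-bit "10" prefix spec, so any prefix resolves in one lookup.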
if (write_spec.data_bits != 0)
{
const UInt8 sign = reader.readBit();
SignedDeltaType signed_dd = static_cast<SignedDeltaType>(reader.readBits(write_spec.data_bits - 1) + 1);
if (sign)
{
signed_dd *= -1;
}
double_delta = static_cast<UnsignedDeltaType>(signed_dd);
}
// else if first bit is zero, no need to read more data.
const UnsignedDeltaType delta = double_delta + prev_delta;
const ValueType curr_value = prev_value + delta;


@@ -5,6 +5,92 @@
namespace DB
{
/** DoubleDelta column codec implementation.
*
* Based on Gorilla paper: http://www.vldb.org/pvldb/vol8/p1816-teller.pdf, which was extended
* to support 64-bit types. The drawback is 1 extra bit for 32-bit wide deltas: 5-bit prefix
* instead of 4-bit prefix.
*
* This codec is best used against monotonic integer sequences with constant (or almost constant)
* stride, like event timestamps for some monitoring application.
*
* Given input sequence a: [a0, a1, ... an]:
*
* First, write number of items (sizeof(int32)*8 bits): n
* Then write first item as is (sizeof(a[0])*8 bits): a[0]
* Second item is written as delta (sizeof(a[0])*8 bits): a[1] - a[0]
* Loop over remaining items and calculate double delta:
* double_delta = a[i] - 2 * a[i - 1] + a[i - 2]
* Write it in compact binary form with `BitWriter`
* if double_delta == 0:
* write 1 bit: 0
* else if -63 < double_delta < 64:
* write 2 bit prefix: 10
* write sign bit (1 if negative): x
* write 7-1 bits of abs(double_delta) - 1: xxxxxx
* else if -255 < double_delta < 256:
* write 3 bit prefix: 110
* write sign bit (1 if negative): x
* write 9-1 bits of abs(double_delta) - 1: xxxxxxxx
* else if -2047 < double_delta < 2048:
* write 4 bit prefix: 1110
* write sign bit (1 if negative): x
* write 12-1 bits of abs(double_delta) - 1: xxxxxxxxxxx
* else if double_delta fits into 32-bit int:
* write 5 bit prefix: 11110
* write sign bit (1 if negative): x
* write 32-1 bits of abs(double_delta) - 1: xxxxxxxxxxx...
* else
* write 5 bit prefix: 11111
* write sign bit (1 if negative): x
* write 64-1 bits of abs(double_delta) - 1: xxxxxxxxxxx...
*
* @example sequence of UInt8 values [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] is encoded as (codec header is omitted):
*
* .- 4-byte little-endian sequence length (10 == 0xa)
* | .- 1 byte (sizeof(UInt8)) a[0] : 0x01
* | | .- 1 byte of delta: a[1] - a[0] = 2 - 1 = 1 : 0x01
* | | | .- 8 zero bits since double delta for remaining 8 elements was 0 : 0x00
* v_______________v___v___v___
* \x0a\x00\x00\x00\x01\x01\x00
*
* @example sequence of Int16 values [-10, 10, -20, 20, -40, 40] is encoded as:
*
* .- 4-byte little endian sequence length = 6 : 0x00000006
* | .- 2 bytes (sizeof(Int16)) a[0] as UInt16 = -10 : 0xfff6
* | | .- 2 bytes of delta: a[1] - a[0] = 10 - (-10) = 20 : 0x0014
* | | | .- 4 encoded double deltas (see below)
* v_______________ v______ v______ v______________________
* \x06\x00\x00\x00\xf6\xff\x14\x00\xb8\xe2\x2e\xb1\xe4\x58
*
* 4 binary encoded double deltas (\xb8\xe2\x2e\xb1\xe4\x58):
* double_delta (DD) = -20 - 2 * 10 + (-10) = -50
* .- 2-bit prefix : 0b10
* | .- sign-bit : 0b1
* | |.- abs(DD) - 1 = 49 : 0b110001
* | ||
* | || DD = 20 - 2 * (-20) + 10 = 70
* | || .- 3-bit prefix : 0b110
* | || | .- sign bit : 0b0
* | || | |.- abs(DD) - 1 = 69 : 0b1000101
* | || | ||
* | || | || DD = -40 - 2 * 20 + (-20) = -100
* | || | || .- 3-bit prefix : 0b110
* | || | || | .- sign-bit : 0b0
* | || | || | |.- abs(DD) - 1 = 99 : 0b1100011
* | || | || | ||
* | || | || | || DD = 40 - 2 * (-40) + 20 = 140
* | || | || | || .- 3-bit prefix : 0b110
* | || | || | || | .- sign bit : 0b0
* | || | || | || | |.- abs(DD) - 1 = 139 : 0b10001011
* | || | || | || | ||
* V_vv______V__vv________V____vv_______V__vv________,- padding bits
* 10111000 11100010 00101110 10110001 11100100 01011000
*
* Please also see unit tests for:
* * Examples on what output `BitWriter` produces on predefined input.
* * Compatibility tests solidifying encoded binary output on set of predefined sequences.
*/
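// A minimal, self-contained sketch of the bucket selection described above.
// `PrefixSpec`/`pickPrefixSpec` are illustrative names, not the codec's real helpers
// (the actual logic is getDeltaWriteSpec() in CompressionCodecDoubleDelta.cpp):
#include <cstdint>
#include <cstdio>

struct PrefixSpec { uint8_t prefix_bits; uint8_t prefix; uint8_t data_bits; };

static PrefixSpec pickPrefixSpec(int64_t dd)
{
    if (dd == 0)                            return {1, 0b0, 0};
    if (dd > -63 && dd < 64)                return {2, 0b10, 7};
    if (dd > -255 && dd < 256)              return {3, 0b110, 9};
    if (dd > -2047 && dd < 2048)            return {4, 0b1110, 12};
    if (dd >= INT32_MIN && dd <= INT32_MAX) return {5, 0b11110, 32};
    return {5, 0b11111, 64};
}

int main()
{
    // The double deltas from the Int16 example above; expect prefixes 10, 110, 110, 110.
    for (int64_t dd : {-50, 70, -100, 140})
    {
        const auto spec = pickPrefixSpec(dd);
        std::printf("dd=%lld -> prefix_bits=%d data_bits=%d\n",
                    static_cast<long long>(dd), spec.prefix_bits, spec.data_bits);
    }
}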
class CompressionCodecDoubleDelta : public ICompressionCodec
{
public:


@@ -112,8 +112,7 @@ UInt32 compressDataForType(const char * source, UInt32 source_size, char * dest,
dest += sizeof(prev_value);
}
BitWriter writer(dest, dest_end - dest);
while (source < source_end)
{
@@ -148,7 +147,7 @@ UInt32 compressDataForType(const char * source, UInt32 source_size, char * dest,
writer.flush();
return sizeof(items_count) + sizeof(prev_value) + writer.count() / 8;
}
template <typename T>
@@ -174,8 +173,7 @@ void decompressDataForType(const char * source, UInt32 source_size, char * dest)
dest += sizeof(prev_value);
}
BitReader reader(source, source_size - sizeof(items_count) - sizeof(prev_value));
binary_value_info prev_xored_info{0, 0, 0};


@@ -5,6 +5,89 @@
namespace DB
{
/** Gorilla column codec implementation.
*
* Based on Gorilla paper: http://www.vldb.org/pvldb/vol8/p1816-teller.pdf
*
* This codec is best used against monotonic floating-point sequences, like CPU usage percentage
* or any other gauge.
*
* Given input sequence a: [a0, a1, ... an]
*
* First, write number of items (sizeof(int32)*8 bits): n
* Then write first item as is (sizeof(a[0])*8 bits): a[0]
* Loop over remaining items and calculate xor_diff:
* xor_diff = a[i] ^ a[i - 1] (e.g. 00000011'10110100)
* Write it in compact binary form with `BitWriter`
* if xor_diff == 0:
* write 1 bit: 0
* else:
* calculate leading zero bits (lzb)
* and trailing zero bits (tzb) of xor_diff,
* compare to lzb and tzb of previous xor_diff
* (X = sizeof(a[i]) * 8, e.g. X = 16, lzb = 6, tzb = 2)
* if lzb >= prev_lzb && tzb >= prev_tzb:
* (e.g. prev_lzb=4, prev_tzb=1)
* write 2 bit prefix: 0b10
* write xor_diff >> prev_tzb (X - prev_lzb - prev_tzb bits):0b00111011010
* (where X = sizeof(a[i]) * 8, e.g. 16)
* else:
* write 2 bit prefix: 0b11
* write 5 bits of lzb: 0b00110
* write 6 bits of (X - lzb - tzb)=(16-6-2)=8: 0b001000
* write (X - lzb - tzb) non-zero bits of xor_diff: 0b11101101
* prev_lzb = lzb
* prev_tzb = tzb
*
* @example sequence of Float32 values [0.1, 0.1, 0.11, 0.2, 0.1] is encoded as:
*
* .- 4-byte little endian sequence length: 5 : 0x00000005
* | .- 4 bytes (sizeof(Float32)) a[0] as UInt32 : 0.1 : 0xcdcccc3d
* | | .- 4 encoded xor diffs (see below)
* v_______________ v______________ v__________________________________________________
* \x05\x00\x00\x00\xcd\xcc\xcc\x3d\x6a\x5a\xd8\xb6\x3c\xcd\x75\xb1\x6c\x77\x00\x00\x00
*
* 4 binary encoded xor diffs (\x6a\x5a\xd8\xb6\x3c\xcd\x75\xb1\x6c\x77\x00\x00\x00):
*
* ...........................................
* a[i-1] = 00111101110011001100110011001101
* a[i] = 00111101110011001100110011001101
* xor_diff = 00000000000000000000000000000000
* .- 1-bit prefix : 0b0
* |
* | ...........................................
* | a[i-1] = 00111101110011001100110011001101
* | a[i] = 00111101111000010100011110101110
* | xor_diff = 00000000001011011000101101100011
* | lzb = 10
* | tzb = 0
* |.- 2-bit prefix : 0b11
* || .- lzb (10) : 0b1010
* || | .- data length (32-10-0): 22 : 0b010110
* || | | .- data : 0b1011011000101101100011
* || | | |
* || | | | ...........................................
* || | | | a[i-1] = 00111101111000010100011110101110
* || | | | a[i] = 00111110010011001100110011001101
* || | | | xor_diff = 00000011101011011000101101100011
* || | | | .- 2-bit prefix : 0b11
* || | | | | .- lzb = 6 : 0b00110
* || | | | | | .- data length = (32 - 6) = 26 : 0b011010
* || | | | | | | .- data : 0b11101011011000101101100011
* || | | | | | | |
* || | | | | | | | ...........................................
* || | | | | | | | a[i-1] = 00111110010011001100110011001101
* || | | | | | | | a[i] = 00111101110011001100110011001101
* || | | | | | | | xor_diff = 00000011100000000000000000000000
* || | | | | | | | .- 2-bit prefix : 0b10
* || | | | | | | | | .- data : 0b11100000000000000000000000
* VV_v____ v_____v________________________V_v_____v______v____________________________V_v_____________________________
* 01101010 01011010 11011000 10110110 00111100 11001101 01110101 10110001 01101100 01110111 00000000 00000000 00000000
*
* Please also see unit tests for:
* * Examples on what output `BitWriter` produces on predefined input.
* * Compatibility tests solidifying encoded binary output on set of predefined sequences.
*/
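// A self-contained sketch of the xor_diff classification described above,
// using GCC/Clang builtins (illustrative only; not the codec's real code):
#include <cstdint>
#include <cstdio>
#include <cstring>

int main()
{
    // The second pair from the Float32 walkthrough above: 0.1f -> 0.11f.
    const float a_prev = 0.1f, a_curr = 0.11f;
    uint32_t prev, curr;
    std::memcpy(&prev, &a_prev, sizeof(prev));
    std::memcpy(&curr, &a_curr, sizeof(curr));

    const uint32_t xor_diff = prev ^ curr;   // 0x002d8b63, non-zero, so clz/ctz are safe
    const int lzb = __builtin_clz(xor_diff); // leading zero bits
    const int tzb = __builtin_ctz(xor_diff); // trailing zero bits
    std::printf("lzb=%d tzb=%d data_bits=%d\n", lzb, tzb, 32 - lzb - tzb); // lzb=10 tzb=0 data_bits=22
}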
class CompressionCodecGorilla : public ICompressionCodec
{
public:


@@ -32,7 +32,7 @@ int main(int argc, char ** argv)
{
Stopwatch watch;
CachedCompressedReadBuffer in(path, &cache, 0, 0, 0);
WriteBufferFromFile out("/dev/null");
copyData(in, out);
@@ -44,7 +44,7 @@ int main(int argc, char ** argv)
{
Stopwatch watch;
CachedCompressedReadBuffer in(path, &cache, 0, 0, 0);
WriteBufferFromFile out("/dev/null");
copyData(in, out);


@@ -1,6 +1,7 @@
#include <Compression/CompressionFactory.h>
#include <Common/PODArray.h>
#include <Common/Stopwatch.h>
#include <Core/Types.h>
#include <DataTypes/DataTypesNumber.h>
#include <DataTypes/IDataType.h>
@@ -62,6 +63,32 @@ std::vector<T> operator+(std::vector<T> && left, std::vector<T> && right)
namespace
{
template <typename T>
struct AsHexStringHelper
{
const T & container;
};
template <typename T>
std::ostream & operator << (std::ostream & ostr, const AsHexStringHelper<T> & helper)
{
ostr << std::hex;
for (const auto & e : helper.container)
{
ostr << "\\x" << std::setw(2) << std::setfill('0') << (static_cast<unsigned int>(e) & 0xFF);
}
return ostr;
}
template <typename T>
AsHexStringHelper<T> AsHexString(const T & container)
{
static_assert (sizeof(container[0]) == 1 && std::is_pod<std::decay_t<decltype(container[0])>>::value, "Only works on containers of byte-size PODs.");
return AsHexStringHelper<T>{container};
}
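// Usage sketch (assuming the stream operator above):
//
//     std::vector<char> data{'\x94', '\x21', '\x00'};
//     std::ostringstream ostr;
//     ostr << AsHexString(data); // ostr.str() is the 12-character text \x94\x21\x00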
template <typename T>
std::string bin(const T & value, size_t bits = sizeof(T)*8)
{
@@ -113,10 +140,71 @@ DataTypePtr makeDataType()
#undef MAKE_DATA_TYPE
assert(false && "unknown datatype");
return nullptr;
}
template <typename T, typename Container>
class BinaryDataAsSequenceOfValuesIterator
{
const Container & container;
const void * data;
const void * data_end;
T current_value;
public:
using Self = BinaryDataAsSequenceOfValuesIterator<T, Container>;
explicit BinaryDataAsSequenceOfValuesIterator(const Container & container_)
: container(container_),
data(&container[0]),
data_end(reinterpret_cast<const char *>(data) + container.size()),
current_value(T{})
{
static_assert(sizeof(container[0]) == 1 && std::is_pod<std::decay_t<decltype(container[0])>>::value, "Only works on containers of byte-size PODs.");
read();
}
const T & operator*() const
{
return current_value;
}
size_t ItemsLeft() const
{
return reinterpret_cast<const char *>(data_end) - reinterpret_cast<const char *>(data);
}
Self & operator++()
{
read();
return *this;
}
operator bool() const
{
return ItemsLeft() > 0;
}
private:
void read()
{
if (!*this)
{
throw std::runtime_error("No more data to read");
}
current_value = unalignedLoad<T>(data);
data = reinterpret_cast<const char *>(data) + sizeof(T);
}
};
template <typename T, typename Container>
BinaryDataAsSequenceOfValuesIterator<T, Container> AsSequenceOf(const Container & container)
{
return BinaryDataAsSequenceOfValuesIterator<T, Container>(container);
}
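// Usage sketch (little-endian payload assumed, as produced by unalignedStore on x86):
//
//     const std::vector<char> bytes{'\x01', '\x00', '\x02', '\x00'};
//     auto it = AsSequenceOf<UInt16>(bytes); // *it == 1, 2 bytes still unread
//     ++it;                                  // *it == 2, 0 bytes left => bool(it) == false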
template <typename T, typename ContainerLeft, typename ContainerRight>
::testing::AssertionResult EqualByteContainersAs(const ContainerLeft & left, const ContainerRight & right)
@@ -126,9 +214,6 @@ template <typename T, typename ContainerLeft, typename ContainerRight>
::testing::AssertionResult result = ::testing::AssertionSuccess();
const auto l_size = left.size() / sizeof(T);
const auto r_size = right.size() / sizeof(T);
const auto size = std::min(l_size, r_size);
@@ -137,16 +222,25 @@ template <typename T, typename ContainerLeft, typename ContainerRight>
{
result = ::testing::AssertionFailure() << "size mismatch" << " expected: " << l_size << " got:" << r_size;
}
if (l_size == 0 || r_size == 0)
{
return result;
}
auto l = AsSequenceOf<T>(left);
auto r = AsSequenceOf<T>(right);
const auto MAX_MISMATCHING_ITEMS = 5;
int mismatching_items = 0;
size_t i = 0;
while (l && r)
{
const auto left_value = *l;
const auto right_value = *r;
++l;
++r;
++i;
if (left_value != right_value)
{
@@ -157,25 +251,47 @@ template <typename T, typename ContainerLeft, typename ContainerRight>
if (++mismatching_items <= MAX_MISMATCHING_ITEMS)
{
result << "\nmismatching " << sizeof(T) << "-byte item #" << i
<< "\nexpected: " << bin(left_value) << " (0x" << std::hex << left_value << ")"
<< "\ngot : " << bin(right_value) << " (0x" << std::hex << right_value << ")";
if (mismatching_items == MAX_MISMATCHING_ITEMS)
{
result << "\n..." << std::endl;
}
}
}
}
if (mismatching_items > 0)
{
result << "total mismatching items:" << mismatching_items << " of " << size;
}
return result;
}
template <typename ContainerLeft, typename ContainerRight>
::testing::AssertionResult EqualByteContainers(UInt8 element_size, const ContainerLeft & left, const ContainerRight & right)
{
switch (element_size)
{
case 1:
return EqualByteContainersAs<UInt8>(left, right);
break;
case 2:
return EqualByteContainersAs<UInt16>(left, right);
break;
case 4:
return EqualByteContainersAs<UInt32>(left, right);
break;
case 8:
return EqualByteContainersAs<UInt64>(left, right);
break;
default:
assert(false && "Invalid element_size");
return ::testing::AssertionFailure() << "Invalid element_size: " << element_size;
}
}
struct Codec
{
std::string codec_statement;
@@ -214,20 +330,23 @@ struct CodecTestSequence
CodecTestSequence & operator=(const CodecTestSequence &) = default;
CodecTestSequence(CodecTestSequence &&) = default;
CodecTestSequence & operator=(CodecTestSequence &&) = default;
CodecTestSequence & append(const CodecTestSequence & other)
{
assert(data_type->equals(*other.data_type));
serialized_data.insert(serialized_data.end(), other.serialized_data.begin(), other.serialized_data.end());
if (!name.empty())
name += " + ";
name += other.name;
return *this;
}
};
CodecTestSequence operator+(CodecTestSequence && left, const CodecTestSequence & right)
{
return left.append(right);
}
template <typename T>
@@ -288,17 +407,22 @@ CodecTestSequence makeSeq(Args && ... args)
};
}
template <typename T, typename Generator, typename B = int, typename E = int>
CodecTestSequence generateSeq(Generator gen, const char* gen_name, B Begin = 0, E End = 10000)
{
const auto direction = std::signbit(End - Begin) ? -1 : 1;
std::vector<char> data(sizeof(T) * (End - Begin));
char * write_pos = data.data();
for (auto i = Begin; i < End; i += direction)
{
const T v = gen(static_cast<T>(i));
// if constexpr (debug_log_items)
// {
// std::cerr << "#" << i << " " << type_name<T>() << "(" << sizeof(T) << " bytes) : " << v << std::endl;
// }
unalignedStore<T>(write_pos, v);
write_pos += sizeof(v);
}
@@ -310,6 +434,96 @@ CodecTestSequence generateSeq(Generator gen, const char* gen_name, size_t Begin
};
}
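// Usage sketch: 6 values over the index range [-67, -61), the shape used by the
// corner-point sequences below (negative bounds work since B/E default to int):
//
//     const auto seq = generateSeq<Int32>(G(SequentialGenerator(1)), -67, -61);
//     // seq.serialized_data now holds 6 * sizeof(Int32) bytes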
struct NoOpTimer
{
void start() {}
void report(const char*) {}
};
struct StopwatchTimer
{
explicit StopwatchTimer(clockid_t clock_type, size_t estimated_marks = 32)
: stopwatch(clock_type)
{
results.reserve(estimated_marks);
}
void start()
{
stopwatch.restart();
}
void report(const char * mark)
{
results.emplace_back(mark, stopwatch.elapsed());
}
void stop()
{
stopwatch.stop();
}
const std::vector<std::tuple<const char*, UInt64>> & getResults() const
{
return results;
}
private:
Stopwatch stopwatch;
std::vector<std::tuple<const char*, UInt64>> results;
};
CompressionCodecPtr makeCodec(const std::string & codec_string, const DataTypePtr data_type)
{
const std::string codec_statement = "(" + codec_string + ")";
Tokens tokens(codec_statement.begin().base(), codec_statement.end().base());
IParser::Pos token_iterator(tokens);
Expected expected;
ASTPtr codec_ast;
ParserCodec parser;
parser.parse(token_iterator, codec_ast, expected);
return CompressionCodecFactory::instance().get(codec_ast, data_type);
}
template <typename Timer>
void testTranscoding(Timer & timer, ICompressionCodec & codec, const CodecTestSequence & test_sequence, std::optional<double> expected_compression_ratio = std::optional<double>{})
{
const auto & source_data = test_sequence.serialized_data;
const UInt32 encoded_max_size = codec.getCompressedReserveSize(source_data.size());
PODArray<char> encoded(encoded_max_size);
timer.start();
const UInt32 encoded_size = codec.compress(source_data.data(), source_data.size(), encoded.data());
timer.report("encoding");
encoded.resize(encoded_size);
PODArray<char> decoded(source_data.size());
timer.start();
const UInt32 decoded_size = codec.decompress(encoded.data(), encoded.size(), decoded.data());
timer.report("decoding");
decoded.resize(decoded_size);
ASSERT_TRUE(EqualByteContainers(test_sequence.data_type->getSizeOfValueInMemory(), source_data, decoded));
const auto header_size = codec.getHeaderSize();
const auto compression_ratio = (encoded_size - header_size) / (source_data.size() * 1.0);
if (expected_compression_ratio)
{
ASSERT_LE(compression_ratio, *expected_compression_ratio)
<< "\n\tdecoded size: " << source_data.size()
<< "\n\tencoded size: " << encoded_size
<< "(no header: " << encoded_size - header_size << ")";
}
}
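// Ratio arithmetic above, on hypothetical numbers: 8000 source bytes encoded into
// 1041 bytes with a 9-byte codec header give (1041 - 9) / 8000.0 = 0.129.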
class CodecTest : public ::testing::TestWithParam<std::tuple<Codec, CodecTestSequence>>
{
@@ -320,67 +534,18 @@ public:
CODEC_WITHOUT_DATA_TYPE,
};
CompressionCodecPtr makeCodec(MakeCodecParam with_data_type)
{
const auto & codec_string = std::get<0>(GetParam()).codec_statement;
const auto & data_type = with_data_type == CODEC_WITH_DATA_TYPE ? std::get<1>(GetParam()).data_type : nullptr;
return ::makeCodec(codec_string, data_type);
}
void testTranscoding(ICompressionCodec & codec)
{
NoOpTimer timer;
::testTranscoding(timer, codec, std::get<1>(GetParam()), std::get<0>(GetParam()).expected_compression_ratio);
}
};
@@ -396,10 +561,121 @@ TEST_P(CodecTest, TranscodingWithoutDataType)
testTranscoding(*codec);
}
// Param is tuple-of-tuple to simplify instantiating with values, since typically a group of cases tests only one codec.
class CodecTest_Compatibility : public ::testing::TestWithParam<std::tuple<Codec, std::tuple<CodecTestSequence, std::string>>>
{};
// Check that the input sequence, when encoded, matches the expected binary string.
TEST_P(CodecTest_Compatibility, Encoding)
{
const auto & codec_spec = std::get<0>(GetParam());
const auto & [data_sequence, expected] = std::get<1>(GetParam());
const auto codec = makeCodec(codec_spec.codec_statement, data_sequence.data_type);
const auto & source_data = data_sequence.serialized_data;
// Just encode the data with codec
const UInt32 encoded_max_size = codec->getCompressedReserveSize(source_data.size());
PODArray<char> encoded(encoded_max_size);
const UInt32 encoded_size = codec->compress(source_data.data(), source_data.size(), encoded.data());
encoded.resize(encoded_size);
SCOPED_TRACE(::testing::Message("encoded: ") << AsHexString(encoded));
ASSERT_TRUE(EqualByteContainersAs<UInt8>(expected, encoded));
}
// Check that the binary string is decoded exactly into the input sequence.
TEST_P(CodecTest_Compatibility, Decoding)
{
const auto & codec_spec = std::get<0>(GetParam());
const auto & [expected, encoded_data] = std::get<1>(GetParam());
const auto codec = makeCodec(codec_spec.codec_statement, expected.data_type);
PODArray<char> decoded(expected.serialized_data.size());
const UInt32 decoded_size = codec->decompress(encoded_data.c_str(), encoded_data.size(), decoded.data());
decoded.resize(decoded_size);
ASSERT_TRUE(EqualByteContainers(expected.data_type->getSizeOfValueInMemory(), expected.serialized_data, decoded));
}
class CodecTest_Performance : public ::testing::TestWithParam<std::tuple<Codec, CodecTestSequence>>
{};
TEST_P(CodecTest_Performance, TranscodingWithDataType)
{
const auto & [codec_spec, test_seq] = GetParam();
const auto codec = ::makeCodec(codec_spec.codec_statement, test_seq.data_type);
const auto runs = 10;
std::map<std::string, std::vector<UInt64>> results;
for (size_t i = 0; i < runs; ++i)
{
StopwatchTimer timer{CLOCK_THREAD_CPUTIME_ID};
::testTranscoding(timer, *codec, test_seq);
timer.stop();
for (const auto & [label, value] : timer.getResults())
{
results[label].push_back(value);
}
}
auto computeMeanAndStdDev = [](const auto & values)
{
double mean{};
if (values.size() < 2)
return std::make_tuple(mean, double{});
using ValueType = typename std::decay_t<decltype(values)>::value_type;
std::vector<ValueType> tmp_v(std::begin(values), std::end(values));
std::sort(tmp_v.begin(), tmp_v.end());
// remove min and max
tmp_v.erase(tmp_v.begin());
tmp_v.erase(tmp_v.end() - 1);
for (const auto & v : tmp_v)
{
mean += v;
}
mean = mean / tmp_v.size();
double std_dev = 0.0;
for (const auto & v : tmp_v)
{
const auto d = (v - mean);
std_dev += (d * d);
}
std_dev = std::sqrt(std_dev / tmp_v.size());
return std::make_tuple(mean, std_dev);
};
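// e.g. for run timings {10, 12, 11, 13, 50}: min (10) and max (50) are dropped,
// mean = (11 + 12 + 13) / 3 = 12 and std_dev = sqrt((1 + 0 + 1) / 3) ≈ 0.82.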
std::cerr << codec_spec.codec_statement
<< " " << test_seq.data_type->getName()
<< " (" << test_seq.serialized_data.size() << " bytes, "
<< std::hex << CityHash_v1_0_2::CityHash64(test_seq.serialized_data.data(), test_seq.serialized_data.size()) << std::dec
<< ", average of " << runs << " runs, μs)";
for (const auto & k : {"encoding", "decoding"})
{
const auto & values = results[k];
const auto & [mean, std_dev] = computeMeanAndStdDev(values);
// Ensure that Coefficient of variation is reasonably low, otherwise these numbers are meaningless
EXPECT_GT(0.05, std_dev / mean);
std::cerr << "\t" << std::fixed << std::setprecision(1) << mean / 1000.0;
}
std::cerr << std::endl;
}
///////////////////////////////////////////////////////////////////////////////////////////////////
// Here we use generators to produce test payload for codecs.
// A generator is a callable that can produce an infinite number of values;
// the output value MUST be of the same type as the input value.
///////////////////////////////////////////////////////////////////////////////////////////////////
auto SameValueGenerator = [](auto value)
@@ -543,6 +819,23 @@ std::vector<CodecTestSequence> generatePyramidOfSequences(const size_t sequences
return sequences;
};
// Just as if all sequences from generatePyramidOfSequences were appended one-by-one to the first one.
template <typename T, typename Generator>
CodecTestSequence generatePyramidSequence(const size_t sequences_count, Generator && generator, const char* generator_name)
{
CodecTestSequence sequence;
sequence.data_type = makeDataType<T>();
sequence.serialized_data.reserve(sequences_count * sequences_count * sizeof(T));
for (size_t i = 1; i < sequences_count; ++i)
{
std::string name = generator_name + std::string(" from 0 to ") + std::to_string(i);
sequence.append(generateSeq<T>(std::forward<decltype(generator)>(generator), name.c_str(), 0, i));
}
return sequence;
};
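// e.g. with sequences_count == 4 the result is one sequence that concatenates the
// generated sub-sequences over the index ranges [0, 1), [0, 2) and [0, 3).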
// helper macro to produce human-friendly sequence name from generator
#define G(generator) generator, #generator
@@ -575,7 +868,7 @@ INSTANTIATE_TEST_CASE_P(SmallSequences,
::testing::Combine(
DefaultCodecsToTest,
::testing::ValuesIn(
generatePyramidOfSequences<Int8 >(42, G(SequentialGenerator(1)))
+ generatePyramidOfSequences<Int16 >(42, G(SequentialGenerator(1)))
+ generatePyramidOfSequences<Int32 >(42, G(SequentialGenerator(1)))
+ generatePyramidOfSequences<Int64 >(42, G(SequentialGenerator(1)))
@@ -609,7 +902,7 @@ INSTANTIATE_TEST_CASE_P(SameValueInt,
::testing::Combine(
DefaultCodecsToTest,
::testing::Values(
generateSeq<Int8>(G(SameValueGenerator(1000))),
generateSeq<Int16 >(G(SameValueGenerator(1000))),
generateSeq<Int32 >(G(SameValueGenerator(1000))),
generateSeq<Int64 >(G(SameValueGenerator(1000))),
@@ -626,7 +919,7 @@ INSTANTIATE_TEST_CASE_P(SameNegativeValueInt,
::testing::Combine(
DefaultCodecsToTest,
::testing::Values(
generateSeq<Int8>(G(SameValueGenerator(-1000))),
generateSeq<Int16 >(G(SameValueGenerator(-1000))),
generateSeq<Int32 >(G(SameValueGenerator(-1000))),
generateSeq<Int64 >(G(SameValueGenerator(-1000))),
@@ -671,7 +964,7 @@ INSTANTIATE_TEST_CASE_P(SequentialInt,
::testing::Combine(
DefaultCodecsToTest,
::testing::Values(
generateSeq<Int8>(G(SequentialGenerator(1))),
generateSeq<Int16 >(G(SequentialGenerator(1))),
generateSeq<Int32 >(G(SequentialGenerator(1))),
generateSeq<Int64 >(G(SequentialGenerator(1))),
@@ -690,7 +983,7 @@ INSTANTIATE_TEST_CASE_P(SequentialReverseInt,
::testing::Combine(
DefaultCodecsToTest,
::testing::Values(
generateSeq<Int8>(G(SequentialGenerator(-1))),
generateSeq<Int16 >(G(SequentialGenerator(-1))),
generateSeq<Int32 >(G(SequentialGenerator(-1))),
generateSeq<Int64 >(G(SequentialGenerator(-1))),
@@ -735,10 +1028,10 @@ INSTANTIATE_TEST_CASE_P(MonotonicInt,
::testing::Combine(
DefaultCodecsToTest,
::testing::Values(
generateSeq<Int8>(G(MonotonicGenerator(1, 5))),
generateSeq<Int16>(G(MonotonicGenerator(1, 5))),
generateSeq<Int32>(G(MonotonicGenerator(1, 5))),
generateSeq<Int64>(G(MonotonicGenerator(1, 5))),
generateSeq<UInt8 >(G(MonotonicGenerator(1, 5))),
generateSeq<UInt16>(G(MonotonicGenerator(1, 5))),
generateSeq<UInt32>(G(MonotonicGenerator(1, 5))),
@@ -752,11 +1045,11 @@ INSTANTIATE_TEST_CASE_P(MonotonicReverseInt,
::testing::Combine(
DefaultCodecsToTest,
::testing::Values(
generateSeq<Int8>(G(MonotonicGenerator(-1, 5))),
generateSeq<Int16>(G(MonotonicGenerator(-1, 5))),
generateSeq<Int32>(G(MonotonicGenerator(-1, 5))),
generateSeq<Int64>(G(MonotonicGenerator(-1, 5))),
generateSeq<UInt8>(G(MonotonicGenerator(-1, 5))),
generateSeq<UInt16>(G(MonotonicGenerator(-1, 5))),
generateSeq<UInt32>(G(MonotonicGenerator(-1, 5))),
generateSeq<UInt64>(G(MonotonicGenerator(-1, 5)))
@@ -862,4 +1155,191 @@ INSTANTIATE_TEST_CASE_P(OverflowFloat,
),
);
template <typename ValueType>
auto DDCompatibilityTestSequence()
{
// Generates sequences with double deltas in a given range.
auto ddGenerator = [prev_delta = static_cast<Int64>(0), prev = static_cast<Int64>(0)](auto dd) mutable
{
const auto curr = dd + prev + prev_delta;
prev = curr;
prev_delta = dd + prev_delta;
return curr;
};
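// How the generator unrolls, starting from prev = 0 and prev_delta = 0:
//   gen(5)  -> curr = 5 + 0 + 0 = 5,  prev_delta becomes 5
//   gen(-2) -> curr = -2 + 5 + 5 = 8, prev_delta becomes 3
// so the double delta of the emitted sequence equals the input values.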
auto ret = generateSeq<ValueType>(G(SameValueGenerator(42)), 0, 3);
// These values are from the DoubleDelta paper (and implementation) and represent the points at which the DD encoded length changes.
// A DD value less than such a point is encoded in a shorter binary form (bigger values get a longer one).
const Int64 dd_corner_points[] = {-63, 64, -255, 256, -2047, 2048, std::numeric_limits<Int32>::min(), std::numeric_limits<Int32>::max()};
for (const auto & p : dd_corner_points)
{
if (std::abs(p) > std::numeric_limits<ValueType>::max())
{
break;
}
// - 4 is to allow the DD value to settle before transitioning through an important point,
// since DD depends on 2 previous values of data; + 2 is arbitrary.
ret.append(generateSeq<ValueType>(G(ddGenerator), p - 4, p + 2));
}
return ret;
}
#define BIN_STR(x) std::string{x, sizeof(x) - 1}
INSTANTIATE_TEST_CASE_P(DoubleDelta,
CodecTest_Compatibility,
::testing::Combine(
::testing::Values(Codec("DoubleDelta")),
::testing::ValuesIn(std::initializer_list<std::tuple<CodecTestSequence, std::string>>{
{
DDCompatibilityTestSequence<Int8>(),
BIN_STR("\x94\x21\x00\x00\x00\x0f\x00\x00\x00\x01\x00\x0f\x00\x00\x00\x2a\x00\x6b\x65\x5f\x50\x34\xff\x4f\xaf\xb1\xaa\xf4\xf6\x7d\x87\xf8\x80")
},
{
DDCompatibilityTestSequence<UInt8>(),
BIN_STR("\x94\x27\x00\x00\x00\x15\x00\x00\x00\x01\x00\x15\x00\x00\x00\x2a\x00\x6b\x65\x5f\x50\x34\xff\x4f\xaf\xb1\xaa\xf4\xf6\x7d\x87\xf8\x81\x8e\xd0\xca\x02\x01\x01")
},
{
DDCompatibilityTestSequence<Int16>(),
BIN_STR("\x94\x70\x00\x00\x00\x4e\x00\x00\x00\x02\x00\x27\x00\x00\x00\x2a\x00\x00\x00\x6b\x65\x5f\x50\x34\xff\x4f\xaf\xbc\xe3\x5d\xa3\xd3\xd9\xf6\x1f\xe2\x07\x7c\x47\x20\x67\x48\x07\x47\xff\x47\xf6\xfe\xf8\x00\x00\x70\x6b\xd0\x00\x02\x83\xd9\xfb\x9f\xdc\x1f\xfc\x20\x1e\x80\x00\x22\xc8\xf0\x00\x00\x66\x67\xa0\x00\x02\x00\x3d\x00\x00\x0f\xff\xe8\x00\x00\x7f\xee\xff\xdf\x40\x00\x0f\xf2\x78\x00\x01\x7f\x83\x9f\xf7\x9f\xfb\xc0\x00\x00\xff\xfe\x00\x00\x08\x00")
},
{
DDCompatibilityTestSequence<UInt16>(),
BIN_STR("\x94\x70\x00\x00\x00\x4e\x00\x00\x00\x02\x00\x27\x00\x00\x00\x2a\x00\x00\x00\x6b\x65\x5f\x50\x34\xff\x4f\xaf\xbc\xe3\x5d\xa3\xd3\xd9\xf6\x1f\xe2\x07\x7c\x47\x20\x67\x48\x07\x47\xff\x47\xf6\xfe\xf8\x00\x00\x70\x6b\xd0\x00\x02\x83\xd9\xfb\x9f\xdc\x1f\xfc\x20\x1e\x80\x00\x22\xc8\xf0\x00\x00\x66\x67\xa0\x00\x02\x00\x3d\x00\x00\x0f\xff\xe8\x00\x00\x7f\xee\xff\xdf\x40\x00\x0f\xf2\x78\x00\x01\x7f\x83\x9f\xf7\x9f\xfb\xc0\x00\x00\xff\xfe\x00\x00\x08\x00")
},
{
DDCompatibilityTestSequence<Int32>(),
BIN_STR("\x94\x74\x00\x00\x00\x9c\x00\x00\x00\x04\x00\x27\x00\x00\x00\x2a\x00\x00\x00\x00\x00\x00\x00\x6b\x65\x5f\x50\x34\xff\x4f\xaf\xbc\xe3\x5d\xa3\xd3\xd9\xf6\x1f\xe2\x07\x7c\x47\x20\x67\x48\x07\x47\xff\x47\xf6\xfe\xf8\x00\x00\x70\x6b\xd0\x00\x02\x83\xd9\xfb\x9f\xdc\x1f\xfc\x20\x1e\x80\x00\x22\xc8\xf0\x00\x00\x66\x67\xa0\x00\x02\x00\x3d\x00\x00\x0f\xff\xe8\x00\x00\x7f\xee\xff\xdf\x00\x00\x70\x0d\x7a\x00\x02\x80\x7b\x9f\xf7\x9f\xfb\xc0\x00\x00\xff\xfe\x00\x00\x08\x00")
},
{
DDCompatibilityTestSequence<UInt32>(),
BIN_STR("\x94\xb5\x00\x00\x00\xcc\x00\x00\x00\x04\x00\x33\x00\x00\x00\x2a\x00\x00\x00\x00\x00\x00\x00\x6b\x65\x5f\x50\x34\xff\x4f\xaf\xbc\xe3\x5d\xa3\xd3\xd9\xf6\x1f\xe2\x07\x7c\x47\x20\x67\x48\x07\x47\xff\x47\xf6\xfe\xf8\x00\x00\x70\x6b\xd0\x00\x02\x83\xd9\xfb\x9f\xdc\x1f\xfc\x20\x1e\x80\x00\x22\xc8\xf0\x00\x00\x66\x67\xa0\x00\x02\x00\x3d\x00\x00\x0f\xff\xe8\x00\x00\x7f\xee\xff\xdf\x00\x00\x70\x0d\x7a\x00\x02\x80\x7b\x9f\xf7\x9f\xfb\xc0\x00\x00\xff\xfe\x00\x00\x08\x00\xf3\xff\xf9\x41\xaf\xbf\xff\xd6\x0c\xfc\xff\xff\xff\xfb\xf0\x00\x00\x00\x07\xff\xff\xff\xef\xc0\x00\x00\x00\x3f\xff\xff\xff\xfb\xff\xff\xff\xfa\x69\x74\xf3\xff\xff\xff\xe7\x9f\xff\xff\xff\x7e\x00\x00\x00\x00\xff\xff\xff\xfd\xf8\x00\x00\x00\x07\xff\xff\xff\xf0")
},
{
DDCompatibilityTestSequence<Int64>(),
BIN_STR("\x94\xd4\x00\x00\x00\x98\x01\x00\x00\x08\x00\x33\x00\x00\x00\x2a\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x6b\x65\x5f\x50\x34\xff\x4f\xaf\xbc\xe3\x5d\xa3\xd3\xd9\xf6\x1f\xe2\x07\x7c\x47\x20\x67\x48\x07\x47\xff\x47\xf6\xfe\xf8\x00\x00\x70\x6b\xd0\x00\x02\x83\xd9\xfb\x9f\xdc\x1f\xfc\x20\x1e\x80\x00\x22\xc8\xf0\x00\x00\x66\x67\xa0\x00\x02\x00\x3d\x00\x00\x0f\xff\xe8\x00\x00\x7f\xee\xff\xdf\x00\x00\x70\x0d\x7a\x00\x02\x80\x7b\x9f\xf7\x9f\xfb\xc0\x00\x00\xff\xfe\x00\x00\x08\x00\xfc\x00\x00\x00\x04\x00\x06\xbe\x4f\xbf\xff\xd6\x0c\xff\x00\x00\x00\x01\x00\x00\x00\x03\xf8\x00\x00\x00\x08\x00\x00\x00\x0f\xc0\x00\x00\x00\x3f\xff\xff\xff\xfb\xff\xff\xff\xfb\xe0\x00\x00\x01\xc0\x00\x00\x06\x9f\x80\x00\x00\x0a\x00\x00\x00\x34\xf3\xff\xff\xff\xe7\x9f\xff\xff\xff\x7e\x00\x00\x00\x00\xff\xff\xff\xfd\xf0\x00\x00\x00\x07\xff\xff\xff\xf0")
},
{
DDCompatibilityTestSequence<UInt64>(),
BIN_STR("\x94\xd4\x00\x00\x00\x98\x01\x00\x00\x08\x00\x33\x00\x00\x00\x2a\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x6b\x65\x5f\x50\x34\xff\x4f\xaf\xbc\xe3\x5d\xa3\xd3\xd9\xf6\x1f\xe2\x07\x7c\x47\x20\x67\x48\x07\x47\xff\x47\xf6\xfe\xf8\x00\x00\x70\x6b\xd0\x00\x02\x83\xd9\xfb\x9f\xdc\x1f\xfc\x20\x1e\x80\x00\x22\xc8\xf0\x00\x00\x66\x67\xa0\x00\x02\x00\x3d\x00\x00\x0f\xff\xe8\x00\x00\x7f\xee\xff\xdf\x00\x00\x70\x0d\x7a\x00\x02\x80\x7b\x9f\xf7\x9f\xfb\xc0\x00\x00\xff\xfe\x00\x00\x08\x00\xfc\x00\x00\x00\x04\x00\x06\xbe\x4f\xbf\xff\xd6\x0c\xff\x00\x00\x00\x01\x00\x00\x00\x03\xf8\x00\x00\x00\x08\x00\x00\x00\x0f\xc0\x00\x00\x00\x3f\xff\xff\xff\xfb\xff\xff\xff\xfb\xe0\x00\x00\x01\xc0\x00\x00\x06\x9f\x80\x00\x00\x0a\x00\x00\x00\x34\xf3\xff\xff\xff\xe7\x9f\xff\xff\xff\x7e\x00\x00\x00\x00\xff\xff\xff\xfd\xf0\x00\x00\x00\x07\xff\xff\xff\xf0")
},
})
),
);
template <typename ValueType>
auto DDperformanceTestSequence()
{
const auto times = 100'000;
return DDCompatibilityTestSequence<ValueType>() * times // average case
+ generateSeq<ValueType>(G(MinMaxGenerator()), 0, times) // worst
+ generateSeq<ValueType>(G(SameValueGenerator(42)), 0, times); // best
}
// Prime numbers in ascending order with some random repetitions hit all the cases of Gorilla.
auto PrimesWithMultiplierGenerator = [](int multiplier = 1)
{
return [multiplier](auto i)
{
static const int vals[] = {
2, 3, 5, 7, 11, 11, 13, 17, 19, 23, 29, 29, 31, 37, 41, 43,
47, 47, 53, 59, 61, 61, 67, 71, 73, 79, 83, 89, 89, 97, 101, 103,
107, 107, 109, 113, 113, 127, 127, 127
};
static const size_t count = sizeof(vals)/sizeof(vals[0]);
using T = decltype(i);
return static_cast<T>(vals[i % count] * static_cast<T>(multiplier));
};
};
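// e.g. PrimesWithMultiplierGenerator(100): gen(0) == 200, gen(1) == 300,
// and gen(41) == 300 again, since the index wraps around the 40 stored values.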
template <typename ValueType>
auto GCompatibilityTestSequence()
{
// Also multiply result by some factor to test large values on types that can hold those.
return generateSeq<ValueType>(G(PrimesWithMultiplierGenerator(intExp10(sizeof(ValueType)))), 0, 42);
}
INSTANTIATE_TEST_CASE_P(Gorilla,
CodecTest_Compatibility,
::testing::Combine(
::testing::Values(Codec("Gorilla")),
::testing::ValuesIn(std::initializer_list<std::tuple<CodecTestSequence, std::string>>{
{
GCompatibilityTestSequence<Int8>(),
BIN_STR("\x95\x35\x00\x00\x00\x2a\x00\x00\x00\x01\x00\x2a\x00\x00\x00\x14\xe1\xdd\x25\xe5\x7b\x29\x86\xee\x2a\x16\x5a\xc5\x0b\x23\x75\x1b\x3c\xb1\x97\x8b\x5f\xcb\x43\xd9\xc5\x48\xab\x23\xaf\x62\x93\x71\x4a\x73\x0f\xc6\x0a")
},
{
GCompatibilityTestSequence<UInt8>(),
BIN_STR("\x95\x35\x00\x00\x00\x2a\x00\x00\x00\x01\x00\x2a\x00\x00\x00\x14\xe1\xdd\x25\xe5\x7b\x29\x86\xee\x2a\x16\x5a\xc5\x0b\x23\x75\x1b\x3c\xb1\x97\x8b\x5f\xcb\x43\xd9\xc5\x48\xab\x23\xaf\x62\x93\x71\x4a\x73\x0f\xc6\x0a")
},
{
GCompatibilityTestSequence<Int16>(),
BIN_STR("\x95\x52\x00\x00\x00\x54\x00\x00\x00\x02\x00\x2a\x00\x00\x00\xc8\x00\xdc\xfe\x66\xdb\x1f\x4e\xa7\xde\xdc\xd5\xec\x6e\xf7\x37\x3a\x23\xe7\x63\xf5\x6a\x8e\x99\x37\x34\xf9\xf8\x2e\x76\x35\x2d\x51\xbb\x3b\xc3\x6d\x13\xbf\x86\x53\x9e\x25\xe4\xaf\xaf\x63\xd5\x6a\x6e\x76\x35\x3a\x27\xd3\x0f\x91\xae\x6b\x33\x57\x6e\x64\xcc\x55\x81\xe4")
},
{
GCompatibilityTestSequence<UInt16>(),
BIN_STR("\x95\x52\x00\x00\x00\x54\x00\x00\x00\x02\x00\x2a\x00\x00\x00\xc8\x00\xdc\xfe\x66\xdb\x1f\x4e\xa7\xde\xdc\xd5\xec\x6e\xf7\x37\x3a\x23\xe7\x63\xf5\x6a\x8e\x99\x37\x34\xf9\xf8\x2e\x76\x35\x2d\x51\xbb\x3b\xc3\x6d\x13\xbf\x86\x53\x9e\x25\xe4\xaf\xaf\x63\xd5\x6a\x6e\x76\x35\x3a\x27\xd3\x0f\x91\xae\x6b\x33\x57\x6e\x64\xcc\x55\x81\xe4")
},
{
GCompatibilityTestSequence<Int32>(),
BIN_STR("\x95\x65\x00\x00\x00\xa8\x00\x00\x00\x04\x00\x2a\x00\x00\x00\x20\x4e\x00\x00\xe4\x57\x63\xc0\xbb\x67\xbc\xce\x91\x97\x99\x15\x9e\xe3\x36\x3f\x89\x5f\x8e\xf2\xec\x8e\xd3\xbf\x75\x43\x58\xc4\x7e\xcf\x93\x43\x38\xc6\x91\x36\x1f\xe7\xb6\x11\x6f\x02\x73\x46\xef\xe0\xec\x50\xfb\x79\xcb\x9c\x14\xfa\x13\xea\x8d\x66\x43\x48\xa0\xde\x3a\xcf\xff\x26\xe0\x5f\x93\xde\x5e\x7f\x6e\x36\x5e\xe6\xb4\x66\x5d\xb0\x0e\xc4")
},
{
GCompatibilityTestSequence<UInt32>(),
BIN_STR("\x95\x65\x00\x00\x00\xa8\x00\x00\x00\x04\x00\x2a\x00\x00\x00\x20\x4e\x00\x00\xe4\x57\x63\xc0\xbb\x67\xbc\xce\x91\x97\x99\x15\x9e\xe3\x36\x3f\x89\x5f\x8e\xf2\xec\x8e\xd3\xbf\x75\x43\x58\xc4\x7e\xcf\x93\x43\x38\xc6\x91\x36\x1f\xe7\xb6\x11\x6f\x02\x73\x46\xef\xe0\xec\x50\xfb\x79\xcb\x9c\x14\xfa\x13\xea\x8d\x66\x43\x48\xa0\xde\x3a\xcf\xff\x26\xe0\x5f\x93\xde\x5e\x7f\x6e\x36\x5e\xe6\xb4\x66\x5d\xb0\x0e\xc4")
},
{
GCompatibilityTestSequence<Int64>(),
BIN_STR("\x95\x91\x00\x00\x00\x50\x01\x00\x00\x08\x00\x2a\x00\x00\x00\x00\xc2\xeb\x0b\x00\x00\x00\x00\xe3\x2b\xa0\xa6\x19\x85\x98\xdc\x45\x74\x74\x43\xc2\x57\x41\x4c\x6e\x42\x79\xd9\x8f\x88\xa5\x05\xf3\xf1\x94\xa3\x62\x1e\x02\xdf\x05\x10\xf1\x15\x97\x35\x2a\x50\x71\x0f\x09\x6c\x89\xf7\x65\x1d\x11\xb7\xcc\x7d\x0b\x70\xc1\x86\x88\x48\x47\x87\xb6\x32\x26\xa7\x86\x87\x88\xd3\x93\x3d\xfc\x28\x68\x85\x05\x0b\x13\xc6\x5f\xd4\x70\xe1\x5e\x76\xf1\x9f\xf3\x33\x2a\x14\x14\x5e\x40\xc1\x5c\x28\x3f\xec\x43\x03\x05\x11\x91\xe8\xeb\x8e\x0a\x0e\x27\x21\x55\xcb\x39\xbc\x6a\xff\x11\x5d\x81\xa0\xa6\x10")
},
{
GCompatibilityTestSequence<UInt64>(),
BIN_STR("\x95\x91\x00\x00\x00\x50\x01\x00\x00\x08\x00\x2a\x00\x00\x00\x00\xc2\xeb\x0b\x00\x00\x00\x00\xe3\x2b\xa0\xa6\x19\x85\x98\xdc\x45\x74\x74\x43\xc2\x57\x41\x4c\x6e\x42\x79\xd9\x8f\x88\xa5\x05\xf3\xf1\x94\xa3\x62\x1e\x02\xdf\x05\x10\xf1\x15\x97\x35\x2a\x50\x71\x0f\x09\x6c\x89\xf7\x65\x1d\x11\xb7\xcc\x7d\x0b\x70\xc1\x86\x88\x48\x47\x87\xb6\x32\x26\xa7\x86\x87\x88\xd3\x93\x3d\xfc\x28\x68\x85\x05\x0b\x13\xc6\x5f\xd4\x70\xe1\x5e\x76\xf1\x9f\xf3\x33\x2a\x14\x14\x5e\x40\xc1\x5c\x28\x3f\xec\x43\x03\x05\x11\x91\xe8\xeb\x8e\x0a\x0e\x27\x21\x55\xcb\x39\xbc\x6a\xff\x11\x5d\x81\xa0\xa6\x10")
},
})
),
);
// These 'tests' try to measure performance of encoding and decoding and hence only make sense to be run locally;
// they also require pretty big data to run against, and generating this data slows down startup of the unit test process.
// So un-comment them only at your discretion.
//INSTANTIATE_TEST_CASE_P(DoubleDelta,
// CodecTest_Performance,
// ::testing::Combine(
// ::testing::Values(Codec("DoubleDelta")),
// ::testing::Values(
// DDperformanceTestSequence<Int8 >(),
// DDperformanceTestSequence<UInt8 >(),
// DDperformanceTestSequence<Int16 >(),
// DDperformanceTestSequence<UInt16>(),
// DDperformanceTestSequence<Int32 >(),
// DDperformanceTestSequence<UInt32>(),
// DDperformanceTestSequence<Int64 >(),
// DDperformanceTestSequence<UInt64>()
// )
// ),
//);
//INSTANTIATE_TEST_CASE_P(Gorilla,
// CodecTest_Performance,
// ::testing::Combine(
// ::testing::Values(Codec("Gorilla")),
// ::testing::Values(
// generatePyramidSequence<Int8 >(42, G(PrimesWithMultiplierGenerator())) * 6'000,
// generatePyramidSequence<UInt8 >(42, G(PrimesWithMultiplierGenerator())) * 6'000,
// generatePyramidSequence<Int16 >(42, G(PrimesWithMultiplierGenerator())) * 6'000,
// generatePyramidSequence<UInt16>(42, G(PrimesWithMultiplierGenerator())) * 6'000,
// generatePyramidSequence<Int32 >(42, G(PrimesWithMultiplierGenerator())) * 6'000,
// generatePyramidSequence<UInt32>(42, G(PrimesWithMultiplierGenerator())) * 6'000,
// generatePyramidSequence<Int64 >(42, G(PrimesWithMultiplierGenerator())) * 6'000,
// generatePyramidSequence<UInt64>(42, G(PrimesWithMultiplierGenerator())) * 6'000
// )
// ),
//);
}


@@ -37,7 +37,7 @@ void Settings::setProfile(const String & profile_name, const Poco::Util::AbstractConfiguration & config)
{
if (key == "constraints")
continue;
if (key == "profile" || 0 == key.compare(0, strlen("profile["), "profile[")) /// Inheritance of profiles from the current one.
setProfile(config.getString(elem + "." + key), config);
else
set(key, config.getString(elem + "." + key));
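/// Shape of configuration this now accepts (assuming Poco's convention of exposing
/// repeated XML tags as keys "profile", "profile[1]", "profile[2]", ...):
///
///     <my_profile>
///         <profile>base_a</profile>
///         <profile>base_b</profile> <!-- enumerated as key "profile[1]", also inherited -->
///     </my_profile>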


@@ -127,12 +127,11 @@ struct Settings : public SettingsCollection<Settings>
M(SettingUInt64, optimize_min_equality_disjunction_chain_length, 3, "The minimum length of the expression `expr = x1 OR ... expr = xN` for optimization ", 0) \
\
M(SettingUInt64, min_bytes_to_use_direct_io, 0, "The minimum number of bytes for reading the data with O_DIRECT option during SELECT queries execution. 0 - disabled.", 0) \
M(SettingUInt64, min_bytes_to_use_mmap_io, 0, "The minimum number of bytes for reading the data with mmap option during SELECT queries execution. 0 - disabled.", 0) \
\
M(SettingBool, force_index_by_date, 0, "Throw an exception if there is a partition key in a table, and it is not used.", 0) \
M(SettingBool, force_primary_key, 0, "Throw an exception if there is primary key in a table, and it is not used.", 0) \
\
M(SettingFloat, max_streams_to_max_threads_ratio, 1, "Allows you to use more sources than the number of threads - to more evenly distribute work across threads. It is assumed that this is a temporary solution, since it will be possible in the future to make the number of sources equal to the number of threads, but for each source to dynamically select available work for itself.", 0) \
M(SettingFloat, max_streams_multiplier_for_merge_tables, 5, "Ask more streams when reading from Merge table. Streams will be spread across tables that Merge table will use. This allows more even distribution of work across threads and is especially helpful when merged tables differ in size.", 0) \
\
@@ -358,11 +357,8 @@ struct Settings : public SettingsCollection<Settings>
M(SettingBool, enable_unaligned_array_join, false, "Allow ARRAY JOIN with multiple arrays that have different sizes. When this setting is enabled, arrays will be resized to the longest one.", 0) \
M(SettingBool, optimize_read_in_order, true, "Enable ORDER BY optimization for reading data in corresponding order in MergeTree tables.", 0) \
M(SettingBool, low_cardinality_allow_in_native_format, true, "Use LowCardinality type in Native format. Otherwise, convert LowCardinality columns to ordinary for select query, and convert ordinary columns to required LowCardinality for insert query.", 0) \
M(SettingBool, cancel_http_readonly_queries_on_client_close, false, "Cancel HTTP readonly queries when a client closes the connection without waiting for response.", 0) \
M(SettingBool, external_table_functions_use_nulls, true, "If it is set to true, external table functions will implicitly use Nullable type if needed. Otherwise NULLs will be substituted with default values. Currently supported only by 'mysql' and 'odbc' table functions.", 0) \
\
M(SettingBool, experimental_use_processors, false, "Use processors pipeline.", 0) \
\
@@ -386,14 +382,18 @@ struct Settings : public SettingsCollection<Settings>
M(SettingBool, enable_scalar_subquery_optimization, true, "If it is set to true, prevent scalar subqueries from (de)serializing large scalar values and possibly avoid running the same subquery more than once.", 0) \
M(SettingBool, optimize_trivial_count_query, true, "Process trivial 'SELECT count() FROM table' query from metadata.", 0) \
M(SettingUInt64, mutations_sync, 0, "Wait for synchronous execution of ALTER TABLE UPDATE/DELETE queries (mutations). 0 - execute asynchronously. 1 - wait current server. 2 - wait all replicas if they exist.", 0) \
M(SettingBool, optimize_if_chain_to_miltiif, false, "Replace if(cond1, then1, if(cond2, ...)) chains to multiIf. Currently it's not beneficial for numeric types.", 0) \
\
/** Obsolete settings that do nothing but left for compatibility reasons. Remove each one after half a year of obsolescence. */ \
\
M(SettingBool, allow_experimental_low_cardinality_type, true, "Obsolete setting, does nothing. Will be removed after 2019-08-13", 0) \
M(SettingBool, compile, false, "Whether query compilation is enabled. Will be removed after 2020-03-13", 0) \
M(SettingUInt64, min_count_to_compile, 0, "Obsolete setting, does nothing. Will be removed after 2020-03-13", 0) \
M(SettingBool, allow_experimental_multiple_joins_emulation, true, "Obsolete setting, does nothing. Will be removed after 2020-05-31", 0) \
M(SettingBool, allow_experimental_cross_to_join_conversion, true, "Obsolete setting, does nothing. Will be removed after 2020-05-31", 0) \
M(SettingBool, allow_experimental_data_skipping_indices, true, "Obsolete setting, does nothing. Will be removed after 2020-05-31", 0) \
M(SettingBool, merge_tree_uniform_read_distribution, true, "Obsolete setting, does nothing. Will be removed after 2020-05-20", 0) \
M(SettingUInt64, mark_cache_min_lifetime, 0, "Obsolete setting, does nothing. Will be removed after 2020-05-31", 0) \
DECLARE_SETTINGS_COLLECTION(LIST_OF_SETTINGS)


@@ -22,8 +22,8 @@ namespace DB
*/
struct SortCursorImpl
{
ColumnRawPtrs sort_columns;
ColumnRawPtrs all_columns;
SortDescription desc;
size_t sort_columns_size = 0;
size_t pos = 0;
@ -110,21 +110,52 @@ using SortCursorImpls = std::vector<SortCursorImpl>;
/// For easy copying. /// For easy copying.
struct SortCursor template <typename Derived>
struct SortCursorHelper
{ {
SortCursorImpl * impl; SortCursorImpl * impl;
SortCursor(SortCursorImpl * impl_) : impl(impl_) {} const Derived & derived() const { return static_cast<const Derived &>(*this); }
SortCursorHelper(SortCursorImpl * impl_) : impl(impl_) {}
SortCursorImpl * operator-> () { return impl; } SortCursorImpl * operator-> () { return impl; }
const SortCursorImpl * operator-> () const { return impl; } const SortCursorImpl * operator-> () const { return impl; }
bool ALWAYS_INLINE greater(const SortCursorHelper & rhs) const
{
return derived().greaterAt(rhs.derived(), impl->pos, rhs.impl->pos);
}
/// Inverted so that the priority queue elements are removed in ascending order.
bool ALWAYS_INLINE operator< (const SortCursorHelper & rhs) const
{
return derived().greater(rhs.derived());
}
/// Checks that all rows in the current block of this cursor are less than or equal to all the rows of the current block of another cursor.
bool ALWAYS_INLINE totallyLessOrEquals(const SortCursorHelper & rhs) const
{
if (impl->rows == 0 || rhs.impl->rows == 0)
return false;
/// The last row of this cursor is no larger than the first row of the other cursor.
return !derived().greaterAt(rhs.derived(), impl->rows - 1, 0);
}
};
struct SortCursor : SortCursorHelper<SortCursor>
{
using SortCursorHelper<SortCursor>::SortCursorHelper;
/// The specified row of this cursor is greater than the specified row of another cursor. /// The specified row of this cursor is greater than the specified row of another cursor.
bool greaterAt(const SortCursor & rhs, size_t lhs_pos, size_t rhs_pos) const bool ALWAYS_INLINE greaterAt(const SortCursor & rhs, size_t lhs_pos, size_t rhs_pos) const
{ {
for (size_t i = 0; i < impl->sort_columns_size; ++i) for (size_t i = 0; i < impl->sort_columns_size; ++i)
{ {
int direction = impl->desc[i].direction; const auto & desc = impl->desc[i];
int nulls_direction = impl->desc[i].nulls_direction; int direction = desc.direction;
int nulls_direction = desc.nulls_direction;
int res = direction * impl->sort_columns[i]->compareAt(lhs_pos, rhs_pos, *(rhs.impl->sort_columns[i]), nulls_direction); int res = direction * impl->sort_columns[i]->compareAt(lhs_pos, rhs_pos, *(rhs.impl->sort_columns[i]), nulls_direction);
if (res > 0) if (res > 0)
return true; return true;
@ -133,45 +164,37 @@ struct SortCursor
} }
return impl->order > rhs.impl->order; return impl->order > rhs.impl->order;
} }
};
/// Checks that all rows in the current block of this cursor are less than or equal to all the rows of the current block of another cursor.
bool totallyLessOrEquals(const SortCursor & rhs) const /// For the case with a single column and when there is no order between different cursors.
struct SimpleSortCursor : SortCursorHelper<SimpleSortCursor>
{
using SortCursorHelper<SimpleSortCursor>::SortCursorHelper;
bool ALWAYS_INLINE greaterAt(const SimpleSortCursor & rhs, size_t lhs_pos, size_t rhs_pos) const
{ {
if (impl->rows == 0 || rhs.impl->rows == 0) const auto & desc = impl->desc[0];
return false; int direction = desc.direction;
int nulls_direction = desc.nulls_direction;
/// The last row of this cursor is no larger than the first row of the other cursor. int res = impl->sort_columns[0]->compareAt(lhs_pos, rhs_pos, *(rhs.impl->sort_columns[0]), nulls_direction);
return !greaterAt(rhs, impl->rows - 1, 0); return res != 0 && ((res > 0) == (direction > 0));
}
bool greater(const SortCursor & rhs) const
{
return greaterAt(rhs, impl->pos, rhs.impl->pos);
}
/// Inverted so that the priority queue elements are removed in ascending order.
bool operator< (const SortCursor & rhs) const
{
return greater(rhs);
} }
}; };
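
The new `SortCursorHelper<Derived>` replaces three copy-pasted cursor structs with the curiously recurring template pattern (CRTP): `greater`, `operator<` and `totallyLessOrEquals` are written once in the base, which statically casts itself to the derived type and calls its `greaterAt`, so the heap comparisons compile down to direct (and inlinable) calls with no virtual dispatch. A reduced, self-contained sketch of the same shape, with an int-keyed cursor standing in for the real one:

    #include <cassert>

    template <typename Derived>
    struct CursorBase
    {
        const Derived & derived() const { return static_cast<const Derived &>(*this); }

        bool greater(const CursorBase & rhs) const { return derived().greaterAt(rhs.derived()); }

        /// Inverted, as above: a max-heap under this operator pops the smallest row first.
        bool operator< (const CursorBase & rhs) const { return greater(rhs); }
    };

    struct IntCursor : CursorBase<IntCursor>
    {
        int key = 0;
        int direction = 1;  /// 1 - ascending, -1 - descending

        bool greaterAt(const IntCursor & rhs) const
        {
            int res = (key > rhs.key) - (key < rhs.key);  /// three-way compare
            /// Same trick as SimpleSortCursor: nonzero and matching the sort direction.
            return res != 0 && ((res > 0) == (direction > 0));
        }
    };

    int main()
    {
        IntCursor a, b;
        a.key = 1;
        b.key = 2;
        assert(b.greater(a) && !a.greater(b));
    }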
/// Separate comparator for locale-sensitive string comparisons /// Separate comparator for locale-sensitive string comparisons
struct SortCursorWithCollation struct SortCursorWithCollation : SortCursorHelper<SortCursorWithCollation>
{ {
SortCursorImpl * impl; using SortCursorHelper<SortCursorWithCollation>::SortCursorHelper;
SortCursorWithCollation(SortCursorImpl * impl_) : impl(impl_) {} bool ALWAYS_INLINE greaterAt(const SortCursorWithCollation & rhs, size_t lhs_pos, size_t rhs_pos) const
SortCursorImpl * operator-> () { return impl; }
const SortCursorImpl * operator-> () const { return impl; }
bool greaterAt(const SortCursorWithCollation & rhs, size_t lhs_pos, size_t rhs_pos) const
{ {
for (size_t i = 0; i < impl->sort_columns_size; ++i) for (size_t i = 0; i < impl->sort_columns_size; ++i)
{ {
int direction = impl->desc[i].direction; const auto & desc = impl->desc[i];
int nulls_direction = impl->desc[i].nulls_direction; int direction = desc.direction;
int nulls_direction = desc.nulls_direction;
int res; int res;
if (impl->need_collation[i]) if (impl->need_collation[i])
{ {
@ -189,29 +212,11 @@ struct SortCursorWithCollation
} }
return impl->order > rhs.impl->order; return impl->order > rhs.impl->order;
} }
bool totallyLessOrEquals(const SortCursorWithCollation & rhs) const
{
if (impl->rows == 0 || rhs.impl->rows == 0)
return false;
/// The last row of this cursor is no larger than the first row of the other cursor.
return !greaterAt(rhs, impl->rows - 1, 0);
}
bool greater(const SortCursorWithCollation & rhs) const
{
return greaterAt(rhs, impl->pos, rhs.impl->pos);
}
bool operator< (const SortCursorWithCollation & rhs) const
{
return greater(rhs);
}
}; };
/** Allows fetching data from multiple sort cursors in sorted order (merging sorted data streams). /** Allows fetching data from multiple sort cursors in sorted order (merging sorted data streams).
* TODO: Replace with "Loser Tree", see https://en.wikipedia.org/wiki/K-way_merge_algorithm
*/ */
template <typename Cursor> template <typename Cursor>
class SortingHeap class SortingHeap
@ -225,7 +230,8 @@ public:
size_t size = cursors.size(); size_t size = cursors.size();
queue.reserve(size); queue.reserve(size);
for (size_t i = 0; i < size; ++i) for (size_t i = 0; i < size; ++i)
queue.emplace_back(&cursors[i]); if (!cursors[i].empty())
queue.emplace_back(&cursors[i]);
std::make_heap(queue.begin(), queue.end()); std::make_heap(queue.begin(), queue.end());
} }
@ -233,7 +239,11 @@ public:
Cursor & current() { return queue.front(); } Cursor & current() { return queue.front(); }
void next() size_t size() { return queue.size(); }
Cursor & nextChild() { return queue[nextChildIndex()]; }
void ALWAYS_INLINE next()
{ {
assert(isValid()); assert(isValid());
@ -246,34 +256,67 @@ public:
removeTop(); removeTop();
} }
void replaceTop(Cursor new_top)
{
current() = new_top;
updateTop();
}
void removeTop()
{
std::pop_heap(queue.begin(), queue.end());
queue.pop_back();
next_idx = 0;
}
void push(SortCursorImpl & cursor)
{
queue.emplace_back(&cursor);
std::push_heap(queue.begin(), queue.end());
next_idx = 0;
}
private: private:
using Container = std::vector<Cursor>; using Container = std::vector<Cursor>;
Container queue; Container queue;
/// Cache comparison between first and second child if the order in queue has not been changed.
size_t next_idx = 0;
size_t ALWAYS_INLINE nextChildIndex()
{
if (next_idx == 0)
{
next_idx = 1;
if (queue.size() > 2 && queue[1] < queue[2])
++next_idx;
}
return next_idx;
}
/// This is an adapted version of the function __sift_down from libc++. /// This is an adapted version of the function __sift_down from libc++.
/// Why can't we simply use std::priority_queue? /// Why can't we simply use std::priority_queue?
/// - because it doesn't support updating the top element and requires pop and push instead. /// - because it doesn't support updating the top element and requires pop and push instead.
void updateTop() /// Also look at "Boost.Heap" library.
void ALWAYS_INLINE updateTop()
{ {
size_t size = queue.size(); size_t size = queue.size();
if (size < 2) if (size < 2)
return; return;
size_t child_idx = 1;
auto begin = queue.begin(); auto begin = queue.begin();
auto child_it = begin + 1;
/// Right child exists and is greater than left child. size_t child_idx = nextChildIndex();
if (size > 2 && *child_it < *(child_it + 1)) auto child_it = begin + child_idx;
{
++child_it;
++child_idx;
}
/// Check if we are in order. /// Check if we are in order.
if (*child_it < *begin) if (*child_it < *begin)
return; return;
next_idx = 0;
auto curr_it = begin; auto curr_it = begin;
auto top(std::move(*begin)); auto top(std::move(*begin));
do do
@ -282,11 +325,12 @@ private:
*curr_it = std::move(*child_it); *curr_it = std::move(*child_it);
curr_it = child_it; curr_it = child_it;
if ((size - 2) / 2 < child_idx)
break;
// recompute the child based off of the updated parent // recompute the child based off of the updated parent
child_idx = 2 * child_idx + 1; child_idx = 2 * child_idx + 1;
if (child_idx >= size)
break;
child_it = begin + child_idx; child_it = begin + child_idx;
if ((child_idx + 1) < size && *child_it < *(child_it + 1)) if ((child_idx + 1) < size && *child_it < *(child_it + 1))
@ -300,12 +344,6 @@ private:
} while (!(*child_it < top)); } while (!(*child_it < top));
*curr_it = std::move(top); *curr_it = std::move(top);
} }
void removeTop()
{
std::pop_heap(queue.begin(), queue.end());
queue.pop_back();
}
}; };
} }
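
The heap additions above exist because a k-way merge almost always advances the top cursor and re-seats it: `replaceTop`/`updateTop` do that with a single sift-down, where `std::priority_queue` would force a pop followed by a push (two heap walks), and `nextChildIndex` additionally caches which child to compare against while the top has not changed. A self-contained sketch of the sift-down idea on plain sorted vectors (simplified: no cached child index):

    #include <algorithm>
    #include <cstdio>
    #include <vector>

    struct Cursor
    {
        const std::vector<int> * data = nullptr;
        size_t pos = 0;
        int value() const { return (*data)[pos]; }
        /// Inverted comparison, as in SortCursor: the heap top is the smallest value.
        bool operator< (const Cursor & rhs) const { return value() > rhs.value(); }
    };

    /// Restore the heap after the top element changed - one sift-down,
    /// the same job SortingHeap::updateTop does instead of pop_heap + push_heap.
    void siftDown(std::vector<Cursor> & heap)
    {
        size_t size = heap.size();
        for (size_t i = 0;;)
        {
            size_t child = 2 * i + 1;
            if (child >= size)
                break;
            if (child + 1 < size && heap[child] < heap[child + 1])
                ++child;
            if (!(heap[i] < heap[child]))
                break;
            std::swap(heap[i], heap[child]);
            i = child;
        }
    }

    int main()
    {
        std::vector<std::vector<int>> sources{{1, 4, 7}, {2, 5, 8}, {3, 6, 9}};
        std::vector<Cursor> heap;
        for (const auto & s : sources)
            if (!s.empty())              /// mirrors the constructor fix: skip empty cursors
                heap.push_back({&s, 0});
        std::make_heap(heap.begin(), heap.end());

        while (!heap.empty())
        {
            Cursor & top = heap.front();
            std::printf("%d ", top.value());
            if (top.pos + 1 < top.data->size())
            {
                ++top.pos;               /// advance the top cursor in place
                siftDown(heap);          /// one sift-down, as in updateTop
            }
            else
            {
                std::pop_heap(heap.begin(), heap.end());
                heap.pop_back();         /// cursor exhausted: removeTop
            }
        }
        std::printf("\n");               /// prints 1 2 3 4 5 6 7 8 9
    }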
View File
@ -138,14 +138,14 @@ Block AggregatingSortedBlockInputStream::readImpl()
} }
void AggregatingSortedBlockInputStream::merge(MutableColumns & merged_columns, std::priority_queue<SortCursor> & queue) void AggregatingSortedBlockInputStream::merge(MutableColumns & merged_columns, SortingHeap<SortCursor> & queue)
{ {
size_t merged_rows = 0; size_t merged_rows = 0;
/// We take the rows in the correct order and put them in `merged_block`, while the number of rows is no more than `max_block_size` /// We take the rows in the correct order and put them in `merged_block`, while the number of rows is no more than `max_block_size`
while (!queue.empty()) while (queue.isValid())
{ {
SortCursor current = queue.top(); SortCursor current = queue.current();
setPrimaryKeyRef(next_key, current); setPrimaryKeyRef(next_key, current);
@ -167,8 +167,6 @@ void AggregatingSortedBlockInputStream::merge(MutableColumns & merged_columns, s
return; return;
} }
queue.pop();
if (key_differs) if (key_differs)
{ {
current_key.swap(next_key); current_key.swap(next_key);
@ -202,8 +200,7 @@ void AggregatingSortedBlockInputStream::merge(MutableColumns & merged_columns, s
if (!current->isLast()) if (!current->isLast())
{ {
current->next(); queue.next();
queue.push(current);
} }
else else
{ {
View File
@ -55,7 +55,7 @@ private:
/** We support two different cursors - with Collation and without. /** We support two different cursors - with Collation and without.
* Templates are used instead of polymorphic SortCursor and calls to virtual functions. * Templates are used instead of polymorphic SortCursor and calls to virtual functions.
*/ */
void merge(MutableColumns & merged_columns, std::priority_queue<SortCursor> & queue); void merge(MutableColumns & merged_columns, SortingHeap<SortCursor> & queue);
/** Extract all states of aggregate functions and merge them with the current group. /** Extract all states of aggregate functions and merge them with the current group.
*/ */
View File
@ -105,15 +105,15 @@ Block CollapsingSortedBlockInputStream::readImpl()
} }
void CollapsingSortedBlockInputStream::merge(MutableColumns & merged_columns, std::priority_queue<SortCursor> & queue) void CollapsingSortedBlockInputStream::merge(MutableColumns & merged_columns, SortingHeap<SortCursor> & queue)
{ {
MergeStopCondition stop_condition(average_block_sizes, max_block_size); MergeStopCondition stop_condition(average_block_sizes, max_block_size);
size_t current_block_granularity; size_t current_block_granularity;
/// Take rows in correct order and put them into `merged_columns` until the number of rows reaches `max_block_size` /// Take rows in correct order and put them into `merged_columns` until the number of rows reaches `max_block_size`
for (; !queue.empty(); ++current_pos) for (; queue.isValid(); ++current_pos)
{ {
SortCursor current = queue.top(); SortCursor current = queue.current();
current_block_granularity = current->rows; current_block_granularity = current->rows;
if (current_key.empty()) if (current_key.empty())
@ -131,8 +131,6 @@ void CollapsingSortedBlockInputStream::merge(MutableColumns & merged_columns, st
return; return;
} }
queue.pop();
if (key_differs) if (key_differs)
{ {
/// We write data for the previous primary key. /// We write data for the previous primary key.
@ -185,8 +183,7 @@ void CollapsingSortedBlockInputStream::merge(MutableColumns & merged_columns, st
if (!current->isLast()) if (!current->isLast())
{ {
current->next(); queue.next();
queue.push(current);
} }
else else
{ {
View File
@ -73,7 +73,7 @@ private:
/** We support two different cursors - with Collation and without. /** We support two different cursors - with Collation and without.
* Templates are used instead of polymorphic SortCursors and calls to virtual functions. * Templates are used instead of polymorphic SortCursors and calls to virtual functions.
*/ */
void merge(MutableColumns & merged_columns, std::priority_queue<SortCursor> & queue); void merge(MutableColumns & merged_columns, SortingHeap<SortCursor> & queue);
/// Output to result rows for the current primary key. /// Output to result rows for the current primary key.
void insertRows(MutableColumns & merged_columns, size_t block_size, MergeStopCondition & condition); void insertRows(MutableColumns & merged_columns, size_t block_size, MergeStopCondition & condition);
View File
@ -161,7 +161,7 @@ Block GraphiteRollupSortedBlockInputStream::readImpl()
} }
void GraphiteRollupSortedBlockInputStream::merge(MutableColumns & merged_columns, std::priority_queue<SortCursor> & queue) void GraphiteRollupSortedBlockInputStream::merge(MutableColumns & merged_columns, SortingHeap<SortCursor> & queue)
{ {
const DateLUTImpl & date_lut = DateLUT::instance(); const DateLUTImpl & date_lut = DateLUT::instance();
@ -173,9 +173,9 @@ void GraphiteRollupSortedBlockInputStream::merge(MutableColumns & merged_columns
/// contribute towards current output row. /// contribute towards current output row.
/// Variables starting with next_* refer to the row at the top of the queue. /// Variables starting with next_* refer to the row at the top of the queue.
while (!queue.empty()) while (queue.isValid())
{ {
SortCursor next_cursor = queue.top(); SortCursor next_cursor = queue.current();
StringRef next_path = next_cursor->all_columns[path_column_num]->getDataAt(next_cursor->pos); StringRef next_path = next_cursor->all_columns[path_column_num]->getDataAt(next_cursor->pos);
bool new_path = is_first || next_path != current_group_path; bool new_path = is_first || next_path != current_group_path;
@ -253,12 +253,9 @@ void GraphiteRollupSortedBlockInputStream::merge(MutableColumns & merged_columns
current_group_path = next_path; current_group_path = next_path;
} }
queue.pop();
if (!next_cursor->isLast()) if (!next_cursor->isLast())
{ {
next_cursor->next(); queue.next();
queue.push(next_cursor);
} }
else else
{ {
View File
@ -225,7 +225,7 @@ private:
UInt32 selectPrecision(const Graphite::Retentions & retentions, time_t time) const; UInt32 selectPrecision(const Graphite::Retentions & retentions, time_t time) const;
void merge(MutableColumns & merged_columns, std::priority_queue<SortCursor> & queue); void merge(MutableColumns & merged_columns, SortingHeap<SortCursor> & queue);
/// Insert the values into the resulting columns, which will not be changed in the future. /// Insert the values into the resulting columns, which will not be changed in the future.
template <typename TSortCursor> template <typename TSortCursor>
View File
@ -150,10 +150,12 @@ MergeSortingBlocksBlockInputStream::MergeSortingBlocksBlockInputStream(
blocks.swap(nonempty_blocks); blocks.swap(nonempty_blocks);
if (!has_collation) if (has_collation)
queue_with_collation = SortingHeap<SortCursorWithCollation>(cursors);
else if (description.size() > 1)
queue_without_collation = SortingHeap<SortCursor>(cursors); queue_without_collation = SortingHeap<SortCursor>(cursors);
else else
queue_with_collation = SortingHeap<SortCursorWithCollation>(cursors); queue_simple = SortingHeap<SimpleSortCursor>(cursors);
} }
@ -169,9 +171,12 @@ Block MergeSortingBlocksBlockInputStream::readImpl()
return res; return res;
} }
return !has_collation if (has_collation)
? mergeImpl(queue_without_collation) return mergeImpl(queue_with_collation);
: mergeImpl(queue_with_collation); else if (description.size() > 1)
return mergeImpl(queue_without_collation);
else
return mergeImpl(queue_simple);
} }
@ -179,9 +184,18 @@ template <typename TSortingHeap>
Block MergeSortingBlocksBlockInputStream::mergeImpl(TSortingHeap & queue) Block MergeSortingBlocksBlockInputStream::mergeImpl(TSortingHeap & queue)
{ {
size_t num_columns = header.columns(); size_t num_columns = header.columns();
MutableColumns merged_columns = header.cloneEmptyColumns(); MutableColumns merged_columns = header.cloneEmptyColumns();
/// TODO: reserve (in each column)
/// Reserve
if (queue.isValid() && !blocks.empty())
{
/// The expected size of output block is the same as input block
size_t size_to_reserve = blocks[0].rows();
for (auto & column : merged_columns)
column->reserve(size_to_reserve);
}
/// TODO: Optimization when a single block left.
/// Take rows from queue in right order and push to 'merged'. /// Take rows from queue in right order and push to 'merged'.
size_t merged_rows = 0; size_t merged_rows = 0;
@ -210,6 +224,9 @@ Block MergeSortingBlocksBlockInputStream::mergeImpl(TSortingHeap & queue)
break; break;
} }
if (!queue.isValid())
blocks.clear();
if (merged_rows == 0) if (merged_rows == 0)
return {}; return {};
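
The new reservation block above sizes the output columns to the first input block's row count, on the expectation stated in its comment that one merged block is about as large as one source block; without it, every per-row insert risks a reallocation as the columns grow. The same effect in miniature, with a plain vector standing in for `IColumn`:

    #include <vector>

    void fillReserved(std::vector<int> & merged, size_t expected_rows)
    {
        merged.reserve(expected_rows);      /// one allocation up front
        for (size_t i = 0; i < expected_rows; ++i)
            merged.push_back(int(i));       /// no reallocation inside the hot loop
    }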
View File
@ -59,6 +59,7 @@ private:
bool has_collation = false; bool has_collation = false;
SortingHeap<SortCursor> queue_without_collation; SortingHeap<SortCursor> queue_without_collation;
SortingHeap<SimpleSortCursor> queue_simple;
SortingHeap<SortCursorWithCollation> queue_with_collation; SortingHeap<SortCursorWithCollation> queue_with_collation;
/** Two different cursors are supported - with and without Collation. /** Two different cursors are supported - with and without Collation.
View File
@ -59,9 +59,9 @@ void MergingSortedBlockInputStream::init(MutableColumns & merged_columns)
} }
if (has_collation) if (has_collation)
initQueue(queue_with_collation); queue_with_collation = SortingHeap<SortCursorWithCollation>(cursors);
else else
initQueue(queue_without_collation); queue_without_collation = SortingHeap<SortCursor>(cursors);
} }
/// Let's check that all source blocks have the same structure. /// Let's check that all source blocks have the same structure.
@ -82,15 +82,6 @@ void MergingSortedBlockInputStream::init(MutableColumns & merged_columns)
} }
template <typename TSortCursor>
void MergingSortedBlockInputStream::initQueue(std::priority_queue<TSortCursor> & queue)
{
for (size_t i = 0; i < cursors.size(); ++i)
if (!cursors[i].empty())
queue.push(TSortCursor(&cursors[i]));
}
Block MergingSortedBlockInputStream::readImpl() Block MergingSortedBlockInputStream::readImpl()
{ {
if (finished) if (finished)
@ -115,7 +106,7 @@ Block MergingSortedBlockInputStream::readImpl()
template <typename TSortCursor> template <typename TSortCursor>
void MergingSortedBlockInputStream::fetchNextBlock(const TSortCursor & current, std::priority_queue<TSortCursor> & queue) void MergingSortedBlockInputStream::fetchNextBlock(const TSortCursor & current, SortingHeap<TSortCursor> & queue)
{ {
size_t order = current->order; size_t order = current->order;
size_t size = cursors.size(); size_t size = cursors.size();
@ -125,15 +116,19 @@ void MergingSortedBlockInputStream::fetchNextBlock(const TSortCursor & current,
while (true) while (true)
{ {
source_blocks[order] = new detail::SharedBlock(children[order]->read()); source_blocks[order] = new detail::SharedBlock(children[order]->read()); /// intrusive ptr
if (!*source_blocks[order]) if (!*source_blocks[order])
{
queue.removeTop();
break; break;
}
if (source_blocks[order]->rows()) if (source_blocks[order]->rows())
{ {
cursors[order].reset(*source_blocks[order]); cursors[order].reset(*source_blocks[order]);
queue.push(TSortCursor(&cursors[order])); queue.replaceTop(&cursors[order]);
source_blocks[order]->all_columns = cursors[order].all_columns; source_blocks[order]->all_columns = cursors[order].all_columns;
source_blocks[order]->sort_columns = cursors[order].sort_columns; source_blocks[order]->sort_columns = cursors[order].sort_columns;
break; break;
@ -154,19 +149,14 @@ bool MergingSortedBlockInputStream::MergeStopCondition::checkStop() const
return sum_rows_count >= average; return sum_rows_count >= average;
} }
template
void MergingSortedBlockInputStream::fetchNextBlock<SortCursor>(const SortCursor & current, std::priority_queue<SortCursor> & queue);
template template <typename TSortingHeap>
void MergingSortedBlockInputStream::fetchNextBlock<SortCursorWithCollation>(const SortCursorWithCollation & current, std::priority_queue<SortCursorWithCollation> & queue); void MergingSortedBlockInputStream::merge(MutableColumns & merged_columns, TSortingHeap & queue)
template <typename TSortCursor>
void MergingSortedBlockInputStream::merge(MutableColumns & merged_columns, std::priority_queue<TSortCursor> & queue)
{ {
size_t merged_rows = 0; size_t merged_rows = 0;
MergeStopCondition stop_condition(average_block_sizes, max_block_size); MergeStopCondition stop_condition(average_block_sizes, max_block_size);
/** Increase row counters. /** Increase row counters.
* Return true if it's time to finish generating the current data block. * Return true if it's time to finish generating the current data block.
*/ */
@ -186,123 +176,100 @@ void MergingSortedBlockInputStream::merge(MutableColumns & merged_columns, std::
return stop_condition.checkStop(); return stop_condition.checkStop();
}; };
/// Take rows in required order and put them into `merged_columns`, while the rows are no more than `max_block_size` /// Take rows in required order and put them into `merged_columns`, while the number of rows is no more than `max_block_size`
while (!queue.empty()) while (queue.isValid())
{ {
TSortCursor current = queue.top(); auto current = queue.current();
size_t current_block_granularity = current->rows; size_t current_block_granularity = current->rows;
queue.pop();
while (true) /** And what if the block is totally less than or equal to the rest for the current cursor?
* Or is there only one data source left in the queue? Then you can take the entire block of the current cursor.
*/
if (current->isFirst()
&& (queue.size() == 1
|| (queue.size() >= 2 && current.totallyLessOrEquals(queue.nextChild()))))
{ {
/** And what if the block is totally less or equal than the rest for the current cursor? // std::cerr << "current block is totally less or equals\n";
* Or is there only one data source left in the queue? Then you can take the entire block on current cursor.
*/ /// If there are already data in the current block, we first return it. We'll get here again the next time we call the merge function.
if (current->isFirst() && (queue.empty() || current.totallyLessOrEquals(queue.top()))) if (merged_rows != 0)
{ {
// std::cerr << "current block is totally less or equals\n"; //std::cerr << "merged rows is non-zero\n";
/// If there are already data in the current block, we first return it. We'll get here again the next time we call the merge function.
if (merged_rows != 0)
{
//std::cerr << "merged rows is non-zero\n";
queue.push(current);
return;
}
/// Actually, current->order stores source number (i.e. cursors[current->order] == current)
size_t source_num = current->order;
if (source_num >= cursors.size())
throw Exception("Logical error in MergingSortedBlockInputStream", ErrorCodes::LOGICAL_ERROR);
for (size_t i = 0; i < num_columns; ++i)
merged_columns[i] = (*std::move(source_blocks[source_num]->getByPosition(i).column)).mutate();
// std::cerr << "copied columns\n";
merged_rows = merged_columns.at(0)->size();
/// Limit output
if (limit && total_merged_rows + merged_rows > limit)
{
merged_rows = limit - total_merged_rows;
for (size_t i = 0; i < num_columns; ++i)
{
auto & column = merged_columns[i];
column = (*column->cut(0, merged_rows)).mutate();
}
cancel(false);
finished = true;
}
/// Write order of rows for other columns
/// this data will be used in the gather stream
if (out_row_sources_buf)
{
RowSourcePart row_source(source_num);
for (size_t i = 0; i < merged_rows; ++i)
out_row_sources_buf->write(row_source.data);
}
//std::cerr << "fetching next block\n";
total_merged_rows += merged_rows;
fetchNextBlock(current, queue);
return; return;
} }
// std::cerr << "total_merged_rows: " << total_merged_rows << ", merged_rows: " << merged_rows << "\n"; /// Actually, current->order stores source number (i.e. cursors[current->order] == current)
// std::cerr << "Inserting row\n"; size_t source_num = current->order;
for (size_t i = 0; i < num_columns; ++i)
merged_columns[i]->insertFrom(*current->all_columns[i], current->pos);
if (source_num >= cursors.size())
throw Exception("Logical error in MergingSortedBlockInputStream", ErrorCodes::LOGICAL_ERROR);
for (size_t i = 0; i < num_columns; ++i)
merged_columns[i] = (*std::move(source_blocks[source_num]->getByPosition(i).column)).mutate();
// std::cerr << "copied columns\n";
merged_rows = merged_columns.at(0)->size();
/// Limit output
if (limit && total_merged_rows + merged_rows > limit)
{
merged_rows = limit - total_merged_rows;
for (size_t i = 0; i < num_columns; ++i)
{
auto & column = merged_columns[i];
column = (*column->cut(0, merged_rows)).mutate();
}
cancel(false);
finished = true;
}
/// Write order of rows for other columns
/// this data will be used in the gather stream
if (out_row_sources_buf) if (out_row_sources_buf)
{ {
/// Actually, current.impl->order stores source number (i.e. cursors[current.impl->order] == current.impl) RowSourcePart row_source(source_num);
RowSourcePart row_source(current->order); for (size_t i = 0; i < merged_rows; ++i)
out_row_sources_buf->write(row_source.data); out_row_sources_buf->write(row_source.data);
} }
if (!current->isLast()) //std::cerr << "fetching next block\n";
{
// std::cerr << "moving to next row\n";
current->next();
if (queue.empty() || !(current.greater(queue.top()))) total_merged_rows += merged_rows;
{ fetchNextBlock(current, queue);
if (count_row_and_check_limit(current_block_granularity)) return;
{ }
// std::cerr << "pushing back to queue\n";
queue.push(current);
return;
}
/// Do not put the cursor back in the queue, but continue to work with the current cursor. // std::cerr << "total_merged_rows: " << total_merged_rows << ", merged_rows: " << merged_rows << "\n";
// std::cerr << "current is still on top, using current row\n"; // std::cerr << "Inserting row\n";
continue; for (size_t i = 0; i < num_columns; ++i)
} merged_columns[i]->insertFrom(*current->all_columns[i], current->pos);
else
{
// std::cerr << "next row is not least, pushing back to queue\n";
queue.push(current);
}
}
else
{
/// We get the next block from the corresponding source, if there is one.
// std::cerr << "It was last row, fetching next block\n";
fetchNextBlock(current, queue);
}
break; if (out_row_sources_buf)
{
/// Actually, current.impl->order stores source number (i.e. cursors[current.impl->order] == current.impl)
RowSourcePart row_source(current->order);
out_row_sources_buf->write(row_source.data);
}
if (!current->isLast())
{
// std::cerr << "moving to next row\n";
queue.next();
}
else
{
/// We get the next block from the corresponding source, if there is one.
// std::cerr << "It was last row, fetching next block\n";
fetchNextBlock(current, queue);
} }
if (count_row_and_check_limit(current_block_granularity)) if (count_row_and_check_limit(current_block_granularity))
return; return;
} }
/// We have read all data. Ask children to cancel providing more data.
cancel(false); cancel(false);
finished = true; finished = true;
} }
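
The restructured merge above hinges on `totallyLessOrEquals`: when the top cursor sits at the start of its block and that whole block compares no greater than the best other cursor (`queue.nextChild()`), the block's columns are moved to the output wholesale, skipping the per-row loop entirely. The same shortcut on plain sorted vectors, as a minimal sketch:

    #include <vector>

    /// If a's last element is no greater than b's first, all of `a` can be
    /// emitted at once - the vector analogue of totallyLessOrEquals.
    bool tryTakeWholeBlock(std::vector<int> & out, const std::vector<int> & a, const std::vector<int> & b)
    {
        if (a.empty() || (!b.empty() && a.back() > b.front()))
            return false;                   /// fall back to the row-by-row merge
        out.insert(out.end(), a.begin(), a.end());
        return true;
    }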
View File
@ -1,7 +1,5 @@
#pragma once #pragma once
#include <queue>
#include <boost/smart_ptr/intrusive_ptr.hpp> #include <boost/smart_ptr/intrusive_ptr.hpp>
#include <common/logger_useful.h> #include <common/logger_useful.h>
@ -87,7 +85,7 @@ protected:
/// Gets the next block from the source corresponding to the `current`. /// Gets the next block from the source corresponding to the `current`.
template <typename TSortCursor> template <typename TSortCursor>
void fetchNextBlock(const TSortCursor & current, std::priority_queue<TSortCursor> & queue); void fetchNextBlock(const TSortCursor & current, SortingHeap<TSortCursor> & queue);
Block header; Block header;
@ -109,14 +107,10 @@ protected:
size_t num_columns = 0; size_t num_columns = 0;
std::vector<SharedBlockPtr> source_blocks; std::vector<SharedBlockPtr> source_blocks;
using CursorImpls = std::vector<SortCursorImpl>; SortCursorImpls cursors;
CursorImpls cursors;
using Queue = std::priority_queue<SortCursor>; SortingHeap<SortCursor> queue_without_collation;
Queue queue_without_collation; SortingHeap<SortCursorWithCollation> queue_with_collation;
using QueueWithCollation = std::priority_queue<SortCursorWithCollation>;
QueueWithCollation queue_with_collation;
/// Used in Vertical merge algorithm to gather non-PK/non-index columns (on next step) /// Used in Vertical merge algorithm to gather non-PK/non-index columns (on next step)
/// If it is not nullptr then it should be populated during execution /// If it is not nullptr then it should be populated during execution
@ -177,13 +171,10 @@ protected:
private: private:
/** We support two different cursors - with Collation and without. /** We support two different cursors - with Collation and without.
* Templates are used instead of polymorphic SortCursor and calls to virtual functions. * Templates are used instead of polymorphic SortCursor and calls to virtual functions.
*/ */
template <typename TSortCursor> template <typename TSortingHeap>
void initQueue(std::priority_queue<TSortCursor> & queue); void merge(MutableColumns & merged_columns, TSortingHeap & queue);
template <typename TSortCursor>
void merge(MutableColumns & merged_columns, std::priority_queue<TSortCursor> & queue);
Logger * log = &Logger::get("MergingSortedBlockInputStream"); Logger * log = &Logger::get("MergingSortedBlockInputStream");
View File
@ -129,7 +129,7 @@ void PushingToViewsBlockOutputStream::write(const Block & block)
for (size_t view_num = 0; view_num < views.size(); ++view_num) for (size_t view_num = 0; view_num < views.size(); ++view_num)
{ {
auto thread_group = CurrentThread::getGroup(); auto thread_group = CurrentThread::getGroup();
pool.scheduleOrThrowOnError([=] pool.scheduleOrThrowOnError([=, this]
{ {
setThreadName("PushingToViews"); setThreadName("PushingToViews");
if (thread_group) if (thread_group)
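
The capture change above prepares for C++20, where implicit capture of `this` through a `[=]` default is deprecated: `[=, this]` (valid syntax from C++20) keeps the by-copy default while making the pointer capture explicit. A tiny illustration on a hypothetical class, not the real stream:

    struct Worker
    {
        int state = 0;

        auto makeTask()
        {
            /// [=] alone would capture `this` implicitly - deprecated since C++20;
            /// [=, this] keeps by-copy defaults and spells out the pointer capture.
            return [=, this] { return state + 1; };
        }
    };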
View File
@ -48,13 +48,14 @@ Block ReplacingSortedBlockInputStream::readImpl()
} }
void ReplacingSortedBlockInputStream::merge(MutableColumns & merged_columns, std::priority_queue<SortCursor> & queue) void ReplacingSortedBlockInputStream::merge(MutableColumns & merged_columns, SortingHeap<SortCursor> & queue)
{ {
MergeStopCondition stop_condition(average_block_sizes, max_block_size); MergeStopCondition stop_condition(average_block_sizes, max_block_size);
/// Take the rows in needed order and put them into `merged_columns` until the number of rows reaches `max_block_size` /// Take the rows in needed order and put them into `merged_columns` until the number of rows reaches `max_block_size`
while (!queue.empty()) while (queue.isValid())
{ {
SortCursor current = queue.top(); SortCursor current = queue.current();
size_t current_block_granularity = current->rows; size_t current_block_granularity = current->rows;
if (current_key.empty()) if (current_key.empty())
@ -68,8 +69,6 @@ void ReplacingSortedBlockInputStream::merge(MutableColumns & merged_columns, std
if (key_differs && stop_condition.checkStop()) if (key_differs && stop_condition.checkStop())
return; return;
queue.pop();
if (key_differs) if (key_differs)
{ {
/// Write the data for the previous primary key. /// Write the data for the previous primary key.
@ -98,8 +97,7 @@ void ReplacingSortedBlockInputStream::merge(MutableColumns & merged_columns, std
if (!current->isLast()) if (!current->isLast())
{ {
current->next(); queue.next();
queue.push(current);
} }
else else
{ {
View File
@ -52,7 +52,7 @@ private:
/// Sources of rows with the current primary key. /// Sources of rows with the current primary key.
PODArray<RowSourcePart> current_row_sources; PODArray<RowSourcePart> current_row_sources;
void merge(MutableColumns & merged_columns, std::priority_queue<SortCursor> & queue); void merge(MutableColumns & merged_columns, SortingHeap<SortCursor> & queue);
/// Output into result the rows for current primary key. /// Output into result the rows for current primary key.
void insertRow(MutableColumns & merged_columns); void insertRow(MutableColumns & merged_columns);
View File
@ -314,14 +314,14 @@ Block SummingSortedBlockInputStream::readImpl()
} }
void SummingSortedBlockInputStream::merge(MutableColumns & merged_columns, std::priority_queue<SortCursor> & queue) void SummingSortedBlockInputStream::merge(MutableColumns & merged_columns, SortingHeap<SortCursor> & queue)
{ {
merged_rows = 0; merged_rows = 0;
/// Take the rows in needed order and put them in `merged_columns` until the number of rows reaches `max_block_size` /// Take the rows in needed order and put them in `merged_columns` until the number of rows reaches `max_block_size`
while (!queue.empty()) while (queue.isValid())
{ {
SortCursor current = queue.top(); SortCursor current = queue.current();
setPrimaryKeyRef(next_key, current); setPrimaryKeyRef(next_key, current);
@ -383,12 +383,9 @@ void SummingSortedBlockInputStream::merge(MutableColumns & merged_columns, std::
current_row_is_zero = false; current_row_is_zero = false;
} }
queue.pop();
if (!current->isLast()) if (!current->isLast())
{ {
current->next(); queue.next();
queue.push(current);
} }
else else
{ {
View File
@ -1,5 +1,7 @@
#pragma once #pragma once
#include <queue>
#include <Core/Row.h> #include <Core/Row.h>
#include <Core/ColumnNumbers.h> #include <Core/ColumnNumbers.h>
#include <Common/AlignedBuffer.h> #include <Common/AlignedBuffer.h>
@ -140,7 +142,7 @@ private:
/** We support two different cursors - with Collation and without. /** We support two different cursors - with Collation and without.
* Templates are used instead of polymorphic SortCursor and calls to virtual functions. * Templates are used instead of polymorphic SortCursor and calls to virtual functions.
*/ */
void merge(MutableColumns & merged_columns, std::priority_queue<SortCursor> & queue); void merge(MutableColumns & merged_columns, SortingHeap<SortCursor> & queue);
/// Insert the summed row for the current group into the result and updates some of per-block flags if the row is not "zero". /// Insert the summed row for the current group into the result and updates some of per-block flags if the row is not "zero".
void insertCurrentRowIfNeeded(MutableColumns & merged_columns); void insertCurrentRowIfNeeded(MutableColumns & merged_columns);
View File
@ -82,21 +82,18 @@ Block VersionedCollapsingSortedBlockInputStream::readImpl()
} }
void VersionedCollapsingSortedBlockInputStream::merge(MutableColumns & merged_columns, std::priority_queue<SortCursor> & queue) void VersionedCollapsingSortedBlockInputStream::merge(MutableColumns & merged_columns, SortingHeap<SortCursor> & queue)
{ {
MergeStopCondition stop_condition(average_block_sizes, max_block_size); MergeStopCondition stop_condition(average_block_sizes, max_block_size);
auto update_queue = [this, & queue](SortCursor & cursor) auto update_queue = [this, & queue](SortCursor & cursor)
{ {
queue.pop();
if (out_row_sources_buf) if (out_row_sources_buf)
current_row_sources.emplace(cursor->order, true); current_row_sources.emplace(cursor->order, true);
if (!cursor->isLast()) if (!cursor->isLast())
{ {
cursor->next(); queue.next();
queue.push(cursor);
} }
else else
{ {
@ -106,9 +103,9 @@ void VersionedCollapsingSortedBlockInputStream::merge(MutableColumns & merged_co
}; };
/// Take rows in correct order and put them into `merged_columns` until the number of rows reaches `max_block_size` /// Take rows in correct order and put them into `merged_columns` until the number of rows reaches `max_block_size`
while (!queue.empty()) while (queue.isValid())
{ {
SortCursor current = queue.top(); SortCursor current = queue.current();
size_t current_block_granularity = current->rows; size_t current_block_granularity = current->rows;
SharedBlockRowRef next_key; SharedBlockRowRef next_key;
View File
@ -5,7 +5,7 @@
#include <DataStreams/MergingSortedBlockInputStream.h> #include <DataStreams/MergingSortedBlockInputStream.h>
#include <DataStreams/ColumnGathererStream.h> #include <DataStreams/ColumnGathererStream.h>
#include <deque> #include <queue>
namespace DB namespace DB
@ -204,7 +204,7 @@ private:
/// Sources of rows for VERTICAL merge algorithm. Size equals to (size + number of gaps) in current_keys. /// Sources of rows for VERTICAL merge algorithm. Size equals to (size + number of gaps) in current_keys.
std::queue<RowSourcePart> current_row_sources; std::queue<RowSourcePart> current_row_sources;
void merge(MutableColumns & merged_columns, std::priority_queue<SortCursor> & queue); void merge(MutableColumns & merged_columns, SortingHeap<SortCursor> & queue);
/// Output to result row for the current primary key. /// Output to result row for the current primary key.
void insertRow(size_t skip_rows, const SharedBlockRowRef & row, MutableColumns & merged_columns); void insertRow(size_t skip_rows, const SharedBlockRowRef & row, MutableColumns & merged_columns);
View File
@ -57,6 +57,6 @@ catch (const Exception & e)
std::cerr << e.what() << ", " << e.displayText() << std::endl std::cerr << e.what() << ", " << e.displayText() << std::endl
<< std::endl << std::endl
<< "Stack trace:" << std::endl << "Stack trace:" << std::endl
<< e.getStackTrace().toString(); << e.getStackTraceString();
return 1; return 1;
} }
View File
@ -30,7 +30,7 @@ namespace ErrorCodes
extern const int LOGICAL_ERROR; extern const int LOGICAL_ERROR;
} }
static const std::vector<String> supported_functions{"any", "anyLast", "min", "max", "sum"}; static const std::vector<String> supported_functions{"any", "anyLast", "min", "max", "sum", "groupBitAnd", "groupBitOr", "groupBitXor"};
String DataTypeCustomSimpleAggregateFunction::getName() const String DataTypeCustomSimpleAggregateFunction::getName() const
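
`groupBitAnd`, `groupBitOr` and `groupBitXor` qualify for `SimpleAggregateFunction` because their partial state has the same type as the values, and merging two states is the same bitwise fold as aggregating the values themselves. An illustrative fold (not the ClickHouse implementation):

    #include <cstdint>
    #include <vector>

    uint64_t groupBitAndFold(const std::vector<uint64_t> & values)
    {
        uint64_t state = ~uint64_t(0);   /// identity element for AND (all bits set)
        for (uint64_t v : values)
            state &= v;                  /// merging states == merging values
        return state;
    }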
View File
@ -23,12 +23,8 @@ namespace ErrorCodes
} }
DatabaseDictionary::DatabaseDictionary(const String & name_) DatabaseDictionary::DatabaseDictionary(const String & name_)
: name(name_), : IDatabase(name_),
log(&Logger::get("DatabaseDictionary(" + name + ")")) log(&Logger::get("DatabaseDictionary(" + database_name + ")"))
{
}
void DatabaseDictionary::loadStoredObjects(Context &, bool)
{ {
} }
@ -69,65 +65,6 @@ bool DatabaseDictionary::isTableExist(
return context.getExternalDictionariesLoader().getCurrentStatus(table_name) != ExternalLoader::Status::NOT_EXIST; return context.getExternalDictionariesLoader().getCurrentStatus(table_name) != ExternalLoader::Status::NOT_EXIST;
} }
bool DatabaseDictionary::isDictionaryExist(
const Context & /*context*/,
const String & /*table_name*/) const
{
return false;
}
DatabaseDictionariesIteratorPtr DatabaseDictionary::getDictionariesIterator(
const Context & /*context*/,
const FilterByNameFunction & /*filter_by_dictionary_name*/)
{
return std::make_unique<DatabaseDictionariesSnapshotIterator>();
}
void DatabaseDictionary::createDictionary(
const Context & /*context*/,
const String & /*dictionary_name*/,
const ASTPtr & /*query*/)
{
throw Exception("Dictionary engine doesn't support dictionaries.", ErrorCodes::UNSUPPORTED_METHOD);
}
void DatabaseDictionary::removeDictionary(
const Context & /*context*/,
const String & /*table_name*/)
{
throw Exception("Dictionary engine doesn't support dictionaries.", ErrorCodes::UNSUPPORTED_METHOD);
}
void DatabaseDictionary::attachDictionary(
const String & /*dictionary_name*/, const Context & /*context*/)
{
throw Exception("Dictionary engine doesn't support dictionaries.", ErrorCodes::UNSUPPORTED_METHOD);
}
void DatabaseDictionary::detachDictionary(const String & /*dictionary_name*/, const Context & /*context*/)
{
throw Exception("Dictionary engine doesn't support dictionaries.", ErrorCodes::UNSUPPORTED_METHOD);
}
ASTPtr DatabaseDictionary::tryGetCreateDictionaryQuery(
const Context & /*context*/,
const String & /*table_name*/) const
{
return nullptr;
}
ASTPtr DatabaseDictionary::getCreateDictionaryQuery(
const Context & /*context*/,
const String & /*table_name*/) const
{
throw Exception("Dictionary engine doesn't support dictionaries.", ErrorCodes::UNSUPPORTED_METHOD);
}
StoragePtr DatabaseDictionary::tryGetTable( StoragePtr DatabaseDictionary::tryGetTable(
const Context & context, const Context & context,
const String & table_name) const const String & table_name) const
@ -153,39 +90,6 @@ bool DatabaseDictionary::empty(const Context & context) const
return !context.getExternalDictionariesLoader().hasCurrentlyLoadedObjects(); return !context.getExternalDictionariesLoader().hasCurrentlyLoadedObjects();
} }
StoragePtr DatabaseDictionary::detachTable(const String & /*table_name*/)
{
throw Exception("DatabaseDictionary: detachTable() is not supported", ErrorCodes::NOT_IMPLEMENTED);
}
void DatabaseDictionary::attachTable(const String & /*table_name*/, const StoragePtr & /*table*/)
{
throw Exception("DatabaseDictionary: attachTable() is not supported", ErrorCodes::NOT_IMPLEMENTED);
}
void DatabaseDictionary::createTable(
const Context &,
const String &,
const StoragePtr &,
const ASTPtr &)
{
throw Exception("DatabaseDictionary: createTable() is not supported", ErrorCodes::NOT_IMPLEMENTED);
}
void DatabaseDictionary::removeTable(
const Context &,
const String &)
{
throw Exception("DatabaseDictionary: removeTable() is not supported", ErrorCodes::NOT_IMPLEMENTED);
}
time_t DatabaseDictionary::getObjectMetadataModificationTime(
const Context &,
const String &)
{
return static_cast<time_t>(0);
}
ASTPtr DatabaseDictionary::getCreateTableQueryImpl(const Context & context, ASTPtr DatabaseDictionary::getCreateTableQueryImpl(const Context & context,
const String & table_name, bool throw_on_error) const const String & table_name, bool throw_on_error) const
{ {
@ -196,9 +100,11 @@ ASTPtr DatabaseDictionary::getCreateTableQueryImpl(const Context & context,
const auto & dictionaries = context.getExternalDictionariesLoader(); const auto & dictionaries = context.getExternalDictionariesLoader();
auto dictionary = throw_on_error ? dictionaries.getDictionary(table_name) auto dictionary = throw_on_error ? dictionaries.getDictionary(table_name)
: dictionaries.tryGetDictionary(table_name); : dictionaries.tryGetDictionary(table_name);
if (!dictionary)
return {};
auto names_and_types = StorageDictionary::getNamesAndTypes(dictionary->getStructure()); auto names_and_types = StorageDictionary::getNamesAndTypes(dictionary->getStructure());
buffer << "CREATE TABLE " << backQuoteIfNeed(name) << '.' << backQuoteIfNeed(table_name) << " ("; buffer << "CREATE TABLE " << backQuoteIfNeed(database_name) << '.' << backQuoteIfNeed(table_name) << " (";
buffer << StorageDictionary::generateNamesAndTypesDescription(names_and_types.begin(), names_and_types.end()); buffer << StorageDictionary::generateNamesAndTypesDescription(names_and_types.begin(), names_and_types.end());
buffer << ") Engine = Dictionary(" << backQuoteIfNeed(table_name) << ")"; buffer << ") Engine = Dictionary(" << backQuoteIfNeed(table_name) << ")";
} }
@ -215,22 +121,12 @@ ASTPtr DatabaseDictionary::getCreateTableQueryImpl(const Context & context,
return ast; return ast;
} }
ASTPtr DatabaseDictionary::getCreateTableQuery(const Context & context, const String & table_name) const ASTPtr DatabaseDictionary::getCreateDatabaseQuery() const
{
return getCreateTableQueryImpl(context, table_name, true);
}
ASTPtr DatabaseDictionary::tryGetCreateTableQuery(const Context & context, const String & table_name) const
{
return getCreateTableQueryImpl(context, table_name, false);
}
ASTPtr DatabaseDictionary::getCreateDatabaseQuery(const Context & /*context*/) const
{ {
String query; String query;
{ {
WriteBufferFromString buffer(query); WriteBufferFromString buffer(query);
buffer << "CREATE DATABASE " << backQuoteIfNeed(name) << " ENGINE = Dictionary"; buffer << "CREATE DATABASE " << backQuoteIfNeed(database_name) << " ENGINE = Dictionary";
} }
ParserCreateQuery parser; ParserCreateQuery parser;
return parseQuery(parser, query.data(), query.data() + query.size(), "", 0); return parseQuery(parser, query.data(), query.data() + query.size(), "", 0);
@ -240,9 +136,4 @@ void DatabaseDictionary::shutdown()
{ {
} }
String DatabaseDictionary::getDatabaseName() const
{
return name;
}
} }
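
Both `getCreateTableQueryImpl` and `getCreateDatabaseQuery` above follow the same pattern: render the DDL into a string buffer, then hand it to `ParserCreateQuery`/`parseQuery` to return a proper AST. A standalone sketch of the rendering half, with a simplified stand-in for `backQuoteIfNeed`:

    #include <cctype>
    #include <iostream>
    #include <sstream>
    #include <string>

    /// Simplified stand-in for backQuoteIfNeed: quote unless a plain identifier.
    std::string backQuoteIfNeedLike(const std::string & name)
    {
        for (char c : name)
            if (!(std::isalnum(static_cast<unsigned char>(c)) || c == '_'))
                return "`" + name + "`";
        return name;
    }

    int main()
    {
        std::ostringstream buffer;
        buffer << "CREATE DATABASE " << backQuoteIfNeedLike("my-dict-db") << " ENGINE = Dictionary";
        std::cout << buffer.str() << '\n';   /// CREATE DATABASE `my-dict-db` ENGINE = Dictionary
    }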
View File
@ -24,85 +24,36 @@ class DatabaseDictionary : public IDatabase
public: public:
DatabaseDictionary(const String & name_); DatabaseDictionary(const String & name_);
String getDatabaseName() const override;
String getEngineName() const override String getEngineName() const override
{ {
return "Dictionary"; return "Dictionary";
} }
void loadStoredObjects(
Context & context,
bool has_force_restore_data_flag) override;
bool isTableExist( bool isTableExist(
const Context & context, const Context & context,
const String & table_name) const override; const String & table_name) const override;
bool isDictionaryExist(const Context & context, const String & table_name) const override;
StoragePtr tryGetTable( StoragePtr tryGetTable(
const Context & context, const Context & context,
const String & table_name) const override; const String & table_name) const override;
DatabaseTablesIteratorPtr getTablesIterator(const Context & context, const FilterByNameFunction & filter_by_table_name = {}) override; DatabaseTablesIteratorPtr getTablesIterator(const Context & context, const FilterByNameFunction & filter_by_table_name = {}) override;
DatabaseDictionariesIteratorPtr getDictionariesIterator(const Context & context, const FilterByNameFunction & filter_by_dictionary_name = {}) override;
bool empty(const Context & context) const override; bool empty(const Context & context) const override;
void createTable( ASTPtr getCreateDatabaseQuery() const override;
const Context & context,
const String & table_name,
const StoragePtr & table,
const ASTPtr & query) override;
void createDictionary(
const Context & context, const String & dictionary_name, const ASTPtr & query) override;
void removeTable(
const Context & context,
const String & table_name) override;
void removeDictionary(const Context & context, const String & table_name) override;
void attachTable(const String & table_name, const StoragePtr & table) override;
StoragePtr detachTable(const String & table_name) override;
time_t getObjectMetadataModificationTime(
const Context & context,
const String & table_name) override;
ASTPtr getCreateTableQuery(
const Context & context,
const String & table_name) const override;
ASTPtr tryGetCreateTableQuery(
const Context & context,
const String & table_name) const override;
ASTPtr getCreateDatabaseQuery(const Context & context) const override;
ASTPtr getCreateDictionaryQuery(const Context & context, const String & table_name) const override;
ASTPtr tryGetCreateDictionaryQuery(const Context & context, const String & table_name) const override;
void attachDictionary(const String & dictionary_name, const Context & context) override;
void detachDictionary(const String & dictionary_name, const Context & context) override;
void shutdown() override; void shutdown() override;
protected:
ASTPtr getCreateTableQueryImpl(const Context & context, const String & table_name, bool throw_on_error) const override;
private: private:
const String name;
mutable std::mutex mutex; mutable std::mutex mutex;
Poco::Logger * log; Poco::Logger * log;
Tables listTables(const Context & context, const FilterByNameFunction & filter_by_name); Tables listTables(const Context & context, const FilterByNameFunction & filter_by_name);
ASTPtr getCreateTableQueryImpl(const Context & context, const String & table_name, bool throw_on_error) const;
}; };
} }
View File
@ -3,7 +3,6 @@
#include <Databases/DatabaseOnDisk.h> #include <Databases/DatabaseOnDisk.h>
#include <Databases/DatabasesCommon.h> #include <Databases/DatabasesCommon.h>
#include <Interpreters/Context.h> #include <Interpreters/Context.h>
#include <IO/ReadBufferFromFile.h>
#include <IO/ReadHelpers.h> #include <IO/ReadHelpers.h>
#include <IO/WriteBufferFromFile.h> #include <IO/WriteBufferFromFile.h>
#include <IO/WriteHelpers.h> #include <IO/WriteHelpers.h>
@ -24,18 +23,14 @@ namespace ErrorCodes
extern const int TABLE_ALREADY_EXISTS; extern const int TABLE_ALREADY_EXISTS;
extern const int UNKNOWN_TABLE; extern const int UNKNOWN_TABLE;
extern const int UNSUPPORTED_METHOD; extern const int UNSUPPORTED_METHOD;
extern const int CANNOT_CREATE_TABLE_FROM_METADATA;
extern const int LOGICAL_ERROR; extern const int LOGICAL_ERROR;
} }
DatabaseLazy::DatabaseLazy(const String & name_, const String & metadata_path_, time_t expiration_time_, const Context & context_) DatabaseLazy::DatabaseLazy(const String & name_, const String & metadata_path_, time_t expiration_time_, const Context & context_)
: name(name_) : DatabaseOnDisk(name_, metadata_path_, "DatabaseLazy (" + name_ + ")")
, metadata_path(metadata_path_)
, data_path("data/" + escapeForFileName(name) + "/")
, expiration_time(expiration_time_) , expiration_time(expiration_time_)
, log(&Logger::get("DatabaseLazy (" + name + ")"))
{ {
Poco::File(context_.getPath() + getDataPath()).createDirectories(); Poco::File(context_.getPath() + getDataPath()).createDirectories();
} }
@ -45,7 +40,7 @@ void DatabaseLazy::loadStoredObjects(
Context & context, Context & context,
bool /* has_force_restore_data_flag */) bool /* has_force_restore_data_flag */)
{ {
DatabaseOnDisk::iterateMetadataFiles(*this, log, context, [this](const String & file_name) iterateMetadataFiles(context, [this](const String & file_name)
{ {
const std::string table_name = file_name.substr(0, file_name.size() - 4); const std::string table_name = file_name.substr(0, file_name.size() - 4);
attachTable(table_name, nullptr); attachTable(table_name, nullptr);
@ -62,75 +57,21 @@ void DatabaseLazy::createTable(
SCOPE_EXIT({ clearExpiredTables(); }); SCOPE_EXIT({ clearExpiredTables(); });
if (!endsWith(table->getName(), "Log")) if (!endsWith(table->getName(), "Log"))
throw Exception("Lazy engine can be used only with *Log tables.", ErrorCodes::UNSUPPORTED_METHOD); throw Exception("Lazy engine can be used only with *Log tables.", ErrorCodes::UNSUPPORTED_METHOD);
DatabaseOnDisk::createTable(*this, context, table_name, table, query); DatabaseOnDisk::createTable(context, table_name, table, query);
/// DatabaseOnDisk::createTable renames file, so we need to get new metadata_modification_time. /// DatabaseOnDisk::createTable renames file, so we need to get new metadata_modification_time.
std::lock_guard lock(tables_mutex); std::lock_guard lock(mutex);
auto it = tables_cache.find(table_name); auto it = tables_cache.find(table_name);
if (it != tables_cache.end()) if (it != tables_cache.end())
it->second.metadata_modification_time = DatabaseOnDisk::getObjectMetadataModificationTime(*this, table_name); it->second.metadata_modification_time = DatabaseOnDisk::getObjectMetadataModificationTime(table_name);
} }
void DatabaseLazy::createDictionary(
const Context & /*context*/,
const String & /*dictionary_name*/,
const ASTPtr & /*query*/)
{
throw Exception("Lazy engine can be used only with *Log tables.", ErrorCodes::UNSUPPORTED_METHOD);
}
void DatabaseLazy::removeTable( void DatabaseLazy::removeTable(
const Context & context, const Context & context,
const String & table_name) const String & table_name)
{ {
SCOPE_EXIT({ clearExpiredTables(); }); SCOPE_EXIT({ clearExpiredTables(); });
DatabaseOnDisk::removeTable(*this, context, table_name, log); DatabaseOnDisk::removeTable(context, table_name);
}
void DatabaseLazy::removeDictionary(
const Context & /*context*/,
const String & /*table_name*/)
{
throw Exception("Lazy engine can be used only with *Log tables.", ErrorCodes::UNSUPPORTED_METHOD);
}
ASTPtr DatabaseLazy::getCreateDictionaryQuery(
const Context & /*context*/,
const String & /*table_name*/) const
{
throw Exception("Lazy engine can be used only with *Log tables.", ErrorCodes::UNSUPPORTED_METHOD);
}
ASTPtr DatabaseLazy::tryGetCreateDictionaryQuery(const Context & /*context*/, const String & /*table_name*/) const
{
return nullptr;
}
bool DatabaseLazy::isDictionaryExist(const Context & /*context*/, const String & /*table_name*/) const
{
return false;
}
DatabaseDictionariesIteratorPtr DatabaseLazy::getDictionariesIterator(
const Context & /*context*/,
const FilterByNameFunction & /*filter_by_dictionary_name*/)
{
return std::make_unique<DatabaseDictionariesSnapshotIterator>();
}
void DatabaseLazy::attachDictionary(
const String & /*dictionary_name*/,
const Context & /*context*/)
{
throw Exception("Lazy engine can be used only with *Log tables.", ErrorCodes::UNSUPPORTED_METHOD);
}
void DatabaseLazy::detachDictionary(const String & /*dictionary_name*/, const Context & /*context*/)
{
throw Exception("Lazy engine can be used only with *Log tables.", ErrorCodes::UNSUPPORTED_METHOD);
} }
void DatabaseLazy::renameTable( void DatabaseLazy::renameTable(
@ -141,61 +82,34 @@ void DatabaseLazy::renameTable(
TableStructureWriteLockHolder & lock) TableStructureWriteLockHolder & lock)
{ {
SCOPE_EXIT({ clearExpiredTables(); }); SCOPE_EXIT({ clearExpiredTables(); });
DatabaseOnDisk::renameTable<DatabaseLazy>(*this, context, table_name, to_database, to_table_name, lock); DatabaseOnDisk::renameTable(context, table_name, to_database, to_table_name, lock);
} }
time_t DatabaseLazy::getObjectMetadataModificationTime( time_t DatabaseLazy::getObjectMetadataModificationTime(const String & table_name) const
const Context & /* context */,
const String & table_name)
{ {
std::lock_guard lock(tables_mutex); std::lock_guard lock(mutex);
auto it = tables_cache.find(table_name); auto it = tables_cache.find(table_name);
if (it != tables_cache.end()) if (it != tables_cache.end())
return it->second.metadata_modification_time; return it->second.metadata_modification_time;
else throw Exception("Table " + backQuote(getDatabaseName()) + "." + backQuote(table_name) + " doesn't exist.", ErrorCodes::UNKNOWN_TABLE);
throw Exception("Table " + backQuote(getDatabaseName()) + "." + backQuote(table_name) + " doesn't exist.", ErrorCodes::UNKNOWN_TABLE);
}
ASTPtr DatabaseLazy::getCreateTableQuery(const Context & context, const String & table_name) const
{
return DatabaseOnDisk::getCreateTableQuery(*this, context, table_name);
}
ASTPtr DatabaseLazy::tryGetCreateTableQuery(const Context & context, const String & table_name) const
{
return DatabaseOnDisk::tryGetCreateTableQuery(*this, context, table_name);
}
ASTPtr DatabaseLazy::getCreateDatabaseQuery(const Context & context) const
{
return DatabaseOnDisk::getCreateDatabaseQuery(*this, context);
} }
void DatabaseLazy::alterTable( void DatabaseLazy::alterTable(
const Context & /* context */, const Context & /* context */,
const String & /* table_name */, const String & /* table_name */,
const ColumnsDescription & /* columns */, const StorageInMemoryMetadata & /* metadata */)
const IndicesDescription & /* indices */,
const ConstraintsDescription & /* constraints */,
const ASTModifier & /* storage_modifier */)
{ {
SCOPE_EXIT({ clearExpiredTables(); }); clearExpiredTables();
throw Exception("ALTER query is not supported for Lazy database.", ErrorCodes::UNSUPPORTED_METHOD); throw Exception("ALTER query is not supported for Lazy database.", ErrorCodes::UNSUPPORTED_METHOD);
} }
void DatabaseLazy::drop(const Context & context)
{
DatabaseOnDisk::drop(*this, context);
}
bool DatabaseLazy::isTableExist( bool DatabaseLazy::isTableExist(
const Context & /* context */, const Context & /* context */,
const String & table_name) const const String & table_name) const
{ {
SCOPE_EXIT({ clearExpiredTables(); }); SCOPE_EXIT({ clearExpiredTables(); });
std::lock_guard lock(tables_mutex); std::lock_guard lock(mutex);
return tables_cache.find(table_name) != tables_cache.end(); return tables_cache.find(table_name) != tables_cache.end();
} }
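
`SCOPE_EXIT({ clearExpiredTables(); });` appears on nearly every public method here: it is a scope guard, so expired tables are evicted on any exit path, including early returns and exceptions. A minimal guard in the same spirit (illustrative only, not the real SCOPE_EXIT macro):

    #include <cstdio>

    template <typename F>
    struct ScopeGuard
    {
        F func;
        ~ScopeGuard() { func(); }
    };
    template <typename F> ScopeGuard(F) -> ScopeGuard<F>;   /// deduction guide for C++17

    int lookup(int key)
    {
        ScopeGuard guard{[] { std::puts("clearExpiredTables would run here"); }};
        if (key < 0)
            return -1;     /// the guard still fires on this early return
        return key * 2;
    }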
@ -205,7 +119,7 @@ StoragePtr DatabaseLazy::tryGetTable(
{ {
SCOPE_EXIT({ clearExpiredTables(); }); SCOPE_EXIT({ clearExpiredTables(); });
{ {
std::lock_guard lock(tables_mutex); std::lock_guard lock(mutex);
auto it = tables_cache.find(table_name); auto it = tables_cache.find(table_name);
if (it == tables_cache.end()) if (it == tables_cache.end())
throw Exception("Table " + backQuote(getDatabaseName()) + "." + backQuote(table_name) + " doesn't exist.", ErrorCodes::UNKNOWN_TABLE); throw Exception("Table " + backQuote(getDatabaseName()) + "." + backQuote(table_name) + " doesn't exist.", ErrorCodes::UNKNOWN_TABLE);
@ -225,7 +139,7 @@ StoragePtr DatabaseLazy::tryGetTable(
DatabaseTablesIteratorPtr DatabaseLazy::getTablesIterator(const Context & context, const FilterByNameFunction & filter_by_table_name) DatabaseTablesIteratorPtr DatabaseLazy::getTablesIterator(const Context & context, const FilterByNameFunction & filter_by_table_name)
{ {
std::lock_guard lock(tables_mutex); std::lock_guard lock(mutex);
Strings filtered_tables; Strings filtered_tables;
for (const auto & [table_name, cached_table] : tables_cache) for (const auto & [table_name, cached_table] : tables_cache)
{ {
@ -244,12 +158,12 @@ bool DatabaseLazy::empty(const Context & /* context */) const
void DatabaseLazy::attachTable(const String & table_name, const StoragePtr & table) void DatabaseLazy::attachTable(const String & table_name, const StoragePtr & table)
{ {
LOG_DEBUG(log, "Attach table " << backQuote(table_name) << "."); LOG_DEBUG(log, "Attach table " << backQuote(table_name) << ".");
std::lock_guard lock(tables_mutex); std::lock_guard lock(mutex);
time_t current_time = std::chrono::system_clock::to_time_t(std::chrono::system_clock::now()); time_t current_time = std::chrono::system_clock::to_time_t(std::chrono::system_clock::now());
auto [it, inserted] = tables_cache.emplace(std::piecewise_construct, auto [it, inserted] = tables_cache.emplace(std::piecewise_construct,
std::forward_as_tuple(table_name), std::forward_as_tuple(table_name),
std::forward_as_tuple(table, current_time, DatabaseOnDisk::getObjectMetadataModificationTime(*this, table_name))); std::forward_as_tuple(table, current_time, DatabaseOnDisk::getObjectMetadataModificationTime(table_name)));
if (!inserted) if (!inserted)
throw Exception("Table " + backQuote(getDatabaseName()) + "." + backQuote(table_name) + " already exists.", ErrorCodes::TABLE_ALREADY_EXISTS); throw Exception("Table " + backQuote(getDatabaseName()) + "." + backQuote(table_name) + " already exists.", ErrorCodes::TABLE_ALREADY_EXISTS);
@ -261,7 +175,7 @@ StoragePtr DatabaseLazy::detachTable(const String & table_name)
StoragePtr res; StoragePtr res;
{ {
LOG_DEBUG(log, "Detach table " << backQuote(table_name) << "."); LOG_DEBUG(log, "Detach table " << backQuote(table_name) << ".");
std::lock_guard lock(tables_mutex); std::lock_guard lock(mutex);
auto it = tables_cache.find(table_name); auto it = tables_cache.find(table_name);
if (it == tables_cache.end()) if (it == tables_cache.end())
throw Exception("Table " + backQuote(getDatabaseName()) + "." + backQuote(table_name) + " doesn't exist.", ErrorCodes::UNKNOWN_TABLE); throw Exception("Table " + backQuote(getDatabaseName()) + "." + backQuote(table_name) + " doesn't exist.", ErrorCodes::UNKNOWN_TABLE);
@ -277,7 +191,7 @@ void DatabaseLazy::shutdown()
{ {
TablesCache tables_snapshot; TablesCache tables_snapshot;
{ {
std::lock_guard lock(tables_mutex); std::lock_guard lock(mutex);
tables_snapshot = tables_cache; tables_snapshot = tables_cache;
} }
@ -287,7 +201,7 @@ void DatabaseLazy::shutdown()
kv.second.table->shutdown(); kv.second.table->shutdown();
} }
std::lock_guard lock(tables_mutex); std::lock_guard lock(mutex);
tables_cache.clear(); tables_cache.clear();
} }
@ -303,26 +217,6 @@ DatabaseLazy::~DatabaseLazy()
} }
} }
-String DatabaseLazy::getDataPath() const
-{
-    return data_path;
-}
-String DatabaseLazy::getMetadataPath() const
-{
-    return metadata_path;
-}
-String DatabaseLazy::getDatabaseName() const
-{
-    return name;
-}
-String DatabaseLazy::getObjectMetadataPath(const String & table_name) const
-{
-    return DatabaseOnDisk::getObjectMetadataPath(*this, table_name);
-}
 StoragePtr DatabaseLazy::loadTable(const Context & context, const String & table_name) const
 {
     SCOPE_EXIT({ clearExpiredTables(); });
@@ -333,19 +227,21 @@ StoragePtr DatabaseLazy::loadTable(const Context & context, const String & table
     try
     {
-        String table_name_;
         StoragePtr table;
         Context context_copy(context); /// some tables can change context, but not LogTables
-        auto ast = parseCreateQueryFromMetadataFile(table_metadata_path, log);
+        auto ast = parseQueryFromMetadata(table_metadata_path, /*throw_on_error*/ true, /*remove_empty*/false);
         if (ast)
-            std::tie(table_name_, table) = createTableFromAST(
-                ast->as<const ASTCreateQuery &>(), name, getDataPath(), context_copy, false);
+        {
+            auto & ast_create = ast->as<const ASTCreateQuery &>();
+            String table_data_path_relative = getTableDataPath(ast_create);
+            table = createTableFromAST(ast_create, database_name, table_data_path_relative, context_copy, false).second;
+        }
         if (!ast || !endsWith(table->getName(), "Log"))
             throw Exception("Only *Log tables can be used with Lazy database engine.", ErrorCodes::LOGICAL_ERROR);
         {
-            std::lock_guard lock(tables_mutex);
+            std::lock_guard lock(mutex);
             auto it = tables_cache.find(table_name);
             if (it == tables_cache.end())
                 throw Exception("Table " + backQuote(getDatabaseName()) + "." + backQuote(table_name) + " doesn't exist.", ErrorCodes::UNKNOWN_TABLE);
@@ -358,16 +254,16 @@ StoragePtr DatabaseLazy::loadTable(const Context & context, const String & table
             return it->second.table = table;
         }
     }
-    catch (const Exception & e)
+    catch (Exception & e)
     {
-        throw Exception("Cannot create table from metadata file " + table_metadata_path + ". Error: " + DB::getCurrentExceptionMessage(true),
-            e, DB::ErrorCodes::CANNOT_CREATE_TABLE_FROM_METADATA);
+        e.addMessage("Cannot create table from metadata file " + table_metadata_path);
+        throw;
     }
 }
 void DatabaseLazy::clearExpiredTables() const
 {
-    std::lock_guard lock(tables_mutex);
+    std::lock_guard lock(mutex);
     auto time_now = std::chrono::system_clock::to_time_t(std::chrono::system_clock::now());
     CacheExpirationQueue expired_tables;
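The rewritten catch block in loadTable above shows the error-handling pattern this commit adopts throughout: instead of wrapping the original exception in a brand-new one, catch by non-const reference, append context with addMessage, and rethrow the same object with a bare `throw;`, so the original error code survives. A minimal self-contained sketch of the idea (the `Exception` class below is a stand-in, not ClickHouse's DB::Exception):

```cpp
#include <stdexcept>
#include <string>
#include <iostream>

/// Stand-in for DB::Exception: supports appending context to the message.
class Exception : public std::runtime_error
{
    std::string message;
public:
    explicit Exception(std::string msg) : std::runtime_error(msg), message(std::move(msg)) {}
    void addMessage(const std::string & extra) { message += ": " + extra; }
    const char * what() const noexcept override { return message.c_str(); }
};

void loadMetadata() { throw Exception("syntax error at line 3"); }  /// hypothetical failing step

int main()
{
    try
    {
        try
        {
            loadMetadata();
        }
        catch (Exception & e)   /// non-const: we mutate the exception before rethrowing
        {
            e.addMessage("Cannot create table from metadata file t.sql");
            throw;              /// rethrows the same object; the original code is preserved
        }
    }
    catch (const Exception & e)
    {
        /// Prints: syntax error at line 3: Cannot create table from metadata file t.sql
        std::cout << e.what() << "\n";
    }
}
```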

Databases/DatabaseLazy.h

@@ -1,6 +1,6 @@
 #pragma once
-#include <Databases/DatabasesCommon.h>
+#include <Databases/DatabaseOnDisk.h>
 #include <Interpreters/Context.h>
 #include <Parsers/ASTCreateQuery.h>
@@ -15,7 +15,7 @@ class DatabaseLazyIterator;
  * Works like DatabaseOrdinary, but stores in memory only cache.
  * Can be used only with *Log engines.
  */
-class DatabaseLazy : public IDatabase
+class DatabaseLazy : public DatabaseOnDisk
 {
 public:
     DatabaseLazy(const String & name_, const String & metadata_path_, time_t expiration_time_, const Context & context_);
@@ -32,19 +32,10 @@ public:
         const StoragePtr & table,
         const ASTPtr & query) override;
-    void createDictionary(
-        const Context & context,
-        const String & dictionary_name,
-        const ASTPtr & query) override;
     void removeTable(
         const Context & context,
         const String & table_name) override;
-    void removeDictionary(
-        const Context & context,
-        const String & table_name) override;
     void renameTable(
         const Context & context,
         const String & table_name,
@@ -55,48 +46,14 @@ public:
     void alterTable(
         const Context & context,
         const String & name,
-        const ColumnsDescription & columns,
-        const IndicesDescription & indices,
-        const ConstraintsDescription & constraints,
-        const ASTModifier & engine_modifier) override;
+        const StorageInMemoryMetadata & metadata) override;
-    time_t getObjectMetadataModificationTime(
-        const Context & context,
-        const String & table_name) override;
-    ASTPtr getCreateTableQuery(
-        const Context & context,
-        const String & table_name) const override;
-    ASTPtr tryGetCreateTableQuery(
-        const Context & context,
-        const String & table_name) const override;
-    ASTPtr getCreateDictionaryQuery(
-        const Context & context,
-        const String & dictionary_name) const override;
-    ASTPtr tryGetCreateDictionaryQuery(
-        const Context & context,
-        const String & dictionary_name) const override;
-    ASTPtr getCreateDatabaseQuery(const Context & context) const override;
-    String getDataPath() const override;
-    String getDatabaseName() const override;
-    String getMetadataPath() const override;
-    String getObjectMetadataPath(const String & table_name) const override;
-    void drop(const Context & context) override;
+    time_t getObjectMetadataModificationTime(const String & table_name) const override;
     bool isTableExist(
         const Context & context,
         const String & table_name) const override;
-    bool isDictionaryExist(
-        const Context & context,
-        const String & table_name) const override;
     StoragePtr tryGetTable(
         const Context & context,
         const String & table_name) const override;
@@ -105,16 +62,10 @@ public:
     DatabaseTablesIteratorPtr getTablesIterator(const Context & context, const FilterByNameFunction & filter_by_table_name = {}) override;
-    DatabaseDictionariesIteratorPtr getDictionariesIterator(const Context & context, const FilterByNameFunction & filter_by_dictionary_name = {}) override;
     void attachTable(const String & table_name, const StoragePtr & table) override;
     StoragePtr detachTable(const String & table_name) override;
-    void attachDictionary(const String & dictionary_name, const Context & context) override;
-    void detachDictionary(const String & dictionary_name, const Context & context) override;
     void shutdown() override;
     ~DatabaseLazy() override;
@@ -146,19 +97,12 @@ private:
     using TablesCache = std::unordered_map<String, CachedTable>;
-    String name;
-    const String metadata_path;
-    const String data_path;
     const time_t expiration_time;
-    mutable std::mutex tables_mutex;
+    /// TODO use DatabaseWithOwnTablesBase::tables
     mutable TablesCache tables_cache;
     mutable CacheExpirationQueue cache_expiration_queue;
-    Poco::Logger * log;
     StoragePtr loadTable(const Context & context, const String & table_name) const;
     void clearExpiredTables() const;
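Nearly every public method of DatabaseLazy in the .cpp above is wrapped in SCOPE_EXIT({ clearExpiredTables(); }) so that the eviction of expired cached tables runs no matter how the method exits. A rough destructor-based sketch of what such a guard does (the real macro lives in ext/scope_guard.h; its exact definition is not shown in this diff, so the details below are assumptions):

```cpp
#include <utility>

/// Minimal scope guard: runs a callback when the enclosing scope ends,
/// on normal return and during stack unwinding alike.
template <typename F>
class ScopeGuard
{
    F fn;
public:
    explicit ScopeGuard(F fn_) : fn(std::move(fn_)) {}
    ~ScopeGuard() { fn(); }
    ScopeGuard(const ScopeGuard &) = delete;
    ScopeGuard & operator=(const ScopeGuard &) = delete;
};

void clearExpiredTables() {}  /// hypothetical cleanup hook

bool isTableExistSketch()
{
    ScopeGuard guard([] { clearExpiredTables(); });
    /// ... the body may return early or throw; the guard still fires ...
    return true;
}

int main() { return isTableExistSketch() ? 0 : 1; }
```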

Databases/DatabaseMemory.cpp

@@ -1,30 +1,16 @@
 #include <common/logger_useful.h>
 #include <Databases/DatabaseMemory.h>
 #include <Databases/DatabasesCommon.h>
+#include <Parsers/ASTCreateQuery.h>
 namespace DB
 {
-namespace ErrorCodes
-{
-    extern const int CANNOT_GET_CREATE_TABLE_QUERY;
-    extern const int CANNOT_GET_CREATE_DICTIONARY_QUERY;
-    extern const int UNSUPPORTED_METHOD;
-}
-DatabaseMemory::DatabaseMemory(String name_)
-    : DatabaseWithOwnTablesBase(std::move(name_))
-    , log(&Logger::get("DatabaseMemory(" + name + ")"))
+DatabaseMemory::DatabaseMemory(const String & name_)
+    : DatabaseWithOwnTablesBase(name_, "DatabaseMemory(" + name_ + ")")
 {}
-void DatabaseMemory::loadStoredObjects(
-    Context & /*context*/,
-    bool /*has_force_restore_data_flag*/)
-{
-    /// Nothing to load.
-}
 void DatabaseMemory::createTable(
     const Context & /*context*/,
     const String & table_name,
@@ -34,21 +20,6 @@ void DatabaseMemory::createTable(
     attachTable(table_name, table);
 }
-void DatabaseMemory::attachDictionary(const String & /*name*/, const Context & /*context*/)
-{
-    throw Exception("There is no ATTACH DICTIONARY query for DatabaseMemory", ErrorCodes::UNSUPPORTED_METHOD);
-}
-void DatabaseMemory::createDictionary(
-    const Context & /*context*/,
-    const String & /*dictionary_name*/,
-    const ASTPtr & /*query*/)
-{
-    throw Exception("There is no CREATE DICTIONARY query for DatabaseMemory", ErrorCodes::UNSUPPORTED_METHOD);
-}
 void DatabaseMemory::removeTable(
     const Context & /*context*/,
     const String & table_name)
@@ -56,52 +27,13 @@ void DatabaseMemory::removeTable(
     detachTable(table_name);
 }
-void DatabaseMemory::detachDictionary(const String & /*name*/, const Context & /*context*/)
-{
-    throw Exception("There is no DETACH DICTIONARY query for DatabaseMemory", ErrorCodes::UNSUPPORTED_METHOD);
-}
-void DatabaseMemory::removeDictionary(
-    const Context & /*context*/,
-    const String & /*dictionary_name*/)
-{
-    throw Exception("There is no DROP DICTIONARY query for DatabaseMemory", ErrorCodes::UNSUPPORTED_METHOD);
-}
-time_t DatabaseMemory::getObjectMetadataModificationTime(
-    const Context &, const String &)
-{
-    return static_cast<time_t>(0);
-}
-ASTPtr DatabaseMemory::getCreateTableQuery(
-    const Context &,
-    const String &) const
-{
-    throw Exception("There is no CREATE TABLE query for DatabaseMemory tables", ErrorCodes::CANNOT_GET_CREATE_TABLE_QUERY);
-}
-ASTPtr DatabaseMemory::getCreateDictionaryQuery(
-    const Context &,
-    const String &) const
-{
-    throw Exception("There is no CREATE DICTIONARY query for DatabaseMemory dictionaries", ErrorCodes::CANNOT_GET_CREATE_DICTIONARY_QUERY);
-}
-ASTPtr DatabaseMemory::getCreateDatabaseQuery(
-    const Context &) const
-{
-    throw Exception("There is no CREATE DATABASE query for DatabaseMemory", ErrorCodes::CANNOT_GET_CREATE_TABLE_QUERY);
-}
-String DatabaseMemory::getDatabaseName() const
-{
-    return name;
-}
+ASTPtr DatabaseMemory::getCreateDatabaseQuery() const
+{
+    auto create_query = std::make_shared<ASTCreateQuery>();
+    create_query->database = database_name;
+    create_query->set(create_query->storage, std::make_shared<ASTStorage>());
+    create_query->storage->set(create_query->storage->engine, makeASTFunction(getEngineName()));
+    return create_query;
+}
 }
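Where the old DatabaseMemory threw CANNOT_GET_CREATE_TABLE_QUERY, the new getCreateDatabaseQuery() synthesizes an AST from in-memory state, since this engine keeps nothing on disk. A toy sketch of that build-and-serialize idea (simplified stand-in structs, not the real AST classes; the exact formatAST output may differ):

```cpp
#include <memory>
#include <string>
#include <iostream>

/// Simplified stand-ins for ASTStorage / ASTCreateQuery used above.
struct ASTStorage { std::string engine; };
struct ASTCreateQuery
{
    std::string database;
    std::shared_ptr<ASTStorage> storage;
    std::string format() const
    {
        return "CREATE DATABASE " + database + " ENGINE = " + storage->engine;
    }
};

int main()
{
    /// Mirrors DatabaseMemory::getCreateDatabaseQuery(): build the query
    /// from the object's own state instead of reading a .sql metadata file.
    auto create_query = std::make_shared<ASTCreateQuery>();
    create_query->database = "memory_db";               /// hypothetical database name
    create_query->storage = std::make_shared<ASTStorage>();
    create_query->storage->engine = "Memory";
    std::cout << create_query->format() << "\n";        /// CREATE DATABASE memory_db ENGINE = Memory
}
```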

Databases/DatabaseMemory.h

@@ -17,54 +17,21 @@ namespace DB
 class DatabaseMemory : public DatabaseWithOwnTablesBase
 {
 public:
-    DatabaseMemory(String name_);
+    DatabaseMemory(const String & name_);
-    String getDatabaseName() const override;
     String getEngineName() const override { return "Memory"; }
-    void loadStoredObjects(
-        Context & context,
-        bool has_force_restore_data_flag) override;
     void createTable(
         const Context & context,
         const String & table_name,
         const StoragePtr & table,
         const ASTPtr & query) override;
-    void createDictionary(
-        const Context & context,
-        const String & dictionary_name,
-        const ASTPtr & query) override;
-    void attachDictionary(
-        const String & name,
-        const Context & context) override;
     void removeTable(
         const Context & context,
         const String & table_name) override;
-    void removeDictionary(
-        const Context & context,
-        const String & dictionary_name) override;
-    void detachDictionary(
-        const String & name,
-        const Context & context) override;
-    time_t getObjectMetadataModificationTime(const Context & context, const String & table_name) override;
-    ASTPtr getCreateTableQuery(const Context & context, const String & table_name) const override;
-    ASTPtr getCreateDictionaryQuery(const Context & context, const String & table_name) const override;
-    ASTPtr tryGetCreateTableQuery(const Context &, const String &) const override { return nullptr; }
-    ASTPtr tryGetCreateDictionaryQuery(const Context &, const String &) const override { return nullptr; }
-    ASTPtr getCreateDatabaseQuery(const Context & context) const override;
-private:
-    Poco::Logger * log;
+    ASTPtr getCreateDatabaseQuery() const override;
 };
 }

Databases/DatabaseMySQL.cpp

@@ -9,10 +9,8 @@
 #include <Formats/MySQLBlockInputStream.h>
 #include <DataTypes/DataTypeString.h>
 #include <DataTypes/DataTypesNumber.h>
-#include <DataTypes/DataTypeDate.h>
 #include <DataTypes/DataTypeDateTime.h>
 #include <DataTypes/DataTypeNullable.h>
-#include <DataTypes/DataTypeFixedString.h>
 #include <Storages/StorageMySQL.h>
 #include <Parsers/ASTFunction.h>
 #include <Parsers/ParserCreateQuery.h>
@@ -63,8 +61,12 @@ static String toQueryStringWithQuote(const std::vector<String> & quote_list)
 DatabaseMySQL::DatabaseMySQL(
     const Context & global_context_, const String & database_name_, const String & metadata_path_,
     const ASTStorage * database_engine_define_, const String & database_name_in_mysql_, mysqlxx::Pool && pool)
-    : global_context(global_context_), database_name(database_name_), metadata_path(metadata_path_),
-      database_engine_define(database_engine_define_->clone()), database_name_in_mysql(database_name_in_mysql_), mysql_pool(std::move(pool))
+    : IDatabase(database_name_)
+    , global_context(global_context_)
+    , metadata_path(metadata_path_)
+    , database_engine_define(database_engine_define_->clone())
+    , database_name_in_mysql(database_name_in_mysql_)
+    , mysql_pool(std::move(pool))
 {
 }
@@ -150,19 +152,24 @@ static ASTPtr getCreateQueryFromStorage(const StoragePtr & storage, const ASTPtr
     return create_table_query;
 }
-ASTPtr DatabaseMySQL::tryGetCreateTableQuery(const Context &, const String & table_name) const
+ASTPtr DatabaseMySQL::getCreateTableQueryImpl(const Context &, const String & table_name, bool throw_on_error) const
 {
     std::lock_guard<std::mutex> lock(mutex);
     fetchTablesIntoLocalCache();
     if (local_tables_cache.find(table_name) == local_tables_cache.end())
-        throw Exception("MySQL table " + database_name_in_mysql + "." + table_name + " doesn't exist.", ErrorCodes::UNKNOWN_TABLE);
+    {
+        if (throw_on_error)
+            throw Exception("MySQL table " + database_name_in_mysql + "." + table_name + " doesn't exist..",
+                            ErrorCodes::UNKNOWN_TABLE);
+        return nullptr;
+    }
     return getCreateQueryFromStorage(local_tables_cache[table_name].second, database_engine_define);
 }
-time_t DatabaseMySQL::getObjectMetadataModificationTime(const Context &, const String & table_name)
+time_t DatabaseMySQL::getObjectMetadataModificationTime(const String & table_name) const
 {
     std::lock_guard<std::mutex> lock(mutex);
@@ -174,7 +181,7 @@ time_t DatabaseMySQL::getObjectMetadataModificationTime(const Context &, const S
     return time_t(local_tables_cache[table_name].first);
 }
-ASTPtr DatabaseMySQL::getCreateDatabaseQuery(const Context &) const
+ASTPtr DatabaseMySQL::getCreateDatabaseQuery() const
 {
     const auto & create_query = std::make_shared<ASTCreateQuery>();
     create_query->database = database_name;
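The rename from tryGetCreateTableQuery to getCreateTableQueryImpl above follows a pattern used across this refactoring: the public getX/tryGetX pair collapses into one protected implementation taking a throw_on_error flag, so each database engine overrides a single method. A minimal sketch of that shape (simplified types, not the actual IDatabase interface):

```cpp
#include <memory>
#include <stdexcept>
#include <string>

struct IAST {};
using ASTPtr = std::shared_ptr<IAST>;

class IDatabaseSketch
{
public:
    /// The public pair is implemented once, in terms of a single overridable Impl.
    ASTPtr getCreateTableQuery(const std::string & name) const
    {
        return getCreateTableQueryImpl(name, /*throw_on_error=*/ true);
    }
    ASTPtr tryGetCreateTableQuery(const std::string & name) const
    {
        return getCreateTableQueryImpl(name, /*throw_on_error=*/ false);
    }
    virtual ~IDatabaseSketch() = default;

protected:
    /// Each database engine overrides only this.
    virtual ASTPtr getCreateTableQueryImpl(const std::string & name, bool throw_on_error) const = 0;
};

class MySQLLikeDatabase : public IDatabaseSketch
{
protected:
    ASTPtr getCreateTableQueryImpl(const std::string & name, bool throw_on_error) const override
    {
        bool found = false;  /// stand-in for the local_tables_cache lookup
        if (!found)
        {
            if (throw_on_error)
                throw std::runtime_error("table " + name + " doesn't exist");
            return nullptr;  /// the try-variant reports absence without throwing
        }
        return std::make_shared<IAST>();
    }
};

int main()
{
    MySQLLikeDatabase db;
    return db.tryGetCreateTableQuery("users") == nullptr ? 0 : 1;
}
```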

Databases/DatabaseMySQL.h

@@ -28,35 +28,17 @@ public:
     String getEngineName() const override { return "MySQL"; }
-    String getDatabaseName() const override { return database_name; }
     bool empty(const Context & context) const override;
     DatabaseTablesIteratorPtr getTablesIterator(const Context & context, const FilterByNameFunction & filter_by_table_name = {}) override;
-    DatabaseDictionariesIteratorPtr getDictionariesIterator(const Context &, const FilterByNameFunction & = {}) override
-    {
-        return std::make_unique<DatabaseDictionariesSnapshotIterator>();
-    }
-    ASTPtr getCreateDatabaseQuery(const Context & context) const override;
+    ASTPtr getCreateDatabaseQuery() const override;
     bool isTableExist(const Context & context, const String & name) const override;
-    bool isDictionaryExist(const Context &, const String &) const override { return false; }
     StoragePtr tryGetTable(const Context & context, const String & name) const override;
-    ASTPtr tryGetCreateTableQuery(const Context & context, const String & name) const override;
-    ASTPtr getCreateDictionaryQuery(const Context &, const String &) const override
-    {
-        throw Exception("MySQL database engine does not support dictionaries.", ErrorCodes::NOT_IMPLEMENTED);
-    }
-    ASTPtr tryGetCreateDictionaryQuery(const Context &, const String &) const override { return nullptr; }
-    time_t getObjectMetadataModificationTime(const Context & context, const String & name) override;
+    time_t getObjectMetadataModificationTime(const String & name) const override;
     void shutdown() override;
@@ -74,29 +56,12 @@ public:
     void attachTable(const String & table_name, const StoragePtr & storage) override;
-    void detachDictionary(const String &, const Context &) override
-    {
-        throw Exception("MySQL database engine does not support detach dictionary.", ErrorCodes::NOT_IMPLEMENTED);
-    }
-    void removeDictionary(const Context &, const String &) override
-    {
-        throw Exception("MySQL database engine does not support remove dictionary.", ErrorCodes::NOT_IMPLEMENTED);
-    }
-    void attachDictionary(const String &, const Context &) override
-    {
-        throw Exception("MySQL database engine does not support attach dictionary.", ErrorCodes::NOT_IMPLEMENTED);
-    }
-    void createDictionary(const Context &, const String &, const ASTPtr &) override
-    {
-        throw Exception("MySQL database engine does not support create dictionary.", ErrorCodes::NOT_IMPLEMENTED);
-    }
+protected:
+    ASTPtr getCreateTableQueryImpl(const Context & context, const String & name, bool throw_on_error) const override;
 private:
     Context global_context;
-    String database_name;
     String metadata_path;
     ASTPtr database_engine_define;
     String database_name_in_mysql;

Databases/DatabaseOnDisk.cpp

@@ -6,9 +6,6 @@
 #include <IO/WriteHelpers.h>
 #include <Interpreters/Context.h>
 #include <Interpreters/InterpreterCreateQuery.h>
-#include <Interpreters/ExternalDictionariesLoader.h>
-#include <Interpreters/ExternalLoaderPresetConfigRepository.h>
-#include <Dictionaries/getDictionaryConfigurationFromAST.h>
 #include <Parsers/ASTCreateQuery.h>
 #include <Parsers/ParserCreateQuery.h>
 #include <Parsers/formatAST.h>
@@ -19,7 +16,6 @@
 #include <Common/escapeForFileName.h>
 #include <common/logger_useful.h>
-#include <ext/scope_guard.h>
 #include <Poco/DirectoryIterator.h>
@@ -31,8 +27,6 @@ static constexpr size_t METADATA_FILE_BUFFER_SIZE = 32768;
 namespace ErrorCodes
 {
-    extern const int CANNOT_GET_CREATE_TABLE_QUERY;
-    extern const int CANNOT_GET_CREATE_DICTIONARY_QUERY;
     extern const int FILE_DOESNT_EXIST;
     extern const int INCORRECT_FILE_NAME;
     extern const int SYNTAX_ERROR;
@@ -43,93 +37,10 @@ namespace ErrorCodes
 }
-namespace detail
-{
-    String getObjectMetadataPath(const String & base_path, const String & table_name)
-    {
-        return base_path + (endsWith(base_path, "/") ? "" : "/") + escapeForFileName(table_name) + ".sql";
-    }
-    String getDatabaseMetadataPath(const String & base_path)
-    {
-        return (endsWith(base_path, "/") ? base_path.substr(0, base_path.size() - 1) : base_path) + ".sql";
-    }
-    ASTPtr getQueryFromMetadata(const String & metadata_path, bool throw_on_error)
-    {
-        String query;
-        try
-        {
-            ReadBufferFromFile in(metadata_path, 4096);
-            readStringUntilEOF(query, in);
-        }
-        catch (const Exception & e)
-        {
-            if (!throw_on_error && e.code() == ErrorCodes::FILE_DOESNT_EXIST)
-                return nullptr;
-            else
-                throw;
-        }
-        ParserCreateQuery parser;
-        const char * pos = query.data();
-        std::string error_message;
-        auto ast = tryParseQuery(parser, pos, pos + query.size(), error_message, /* hilite = */ false,
-            "in file " + metadata_path, /* allow_multi_statements = */ false, 0);
-        if (!ast && throw_on_error)
-            throw Exception(error_message, ErrorCodes::SYNTAX_ERROR);
-        return ast;
-    }
-    ASTPtr getCreateQueryFromMetadata(const String & metadata_path, const String & database, bool throw_on_error)
-    {
-        ASTPtr ast = getQueryFromMetadata(metadata_path, throw_on_error);
-        if (ast)
-        {
-            auto & ast_create_query = ast->as<ASTCreateQuery &>();
-            ast_create_query.attach = false;
-            ast_create_query.database = database;
-        }
-        return ast;
-    }
-}
-ASTPtr parseCreateQueryFromMetadataFile(const String & filepath, Poco::Logger * log)
-{
-    String definition;
-    {
-        char in_buf[METADATA_FILE_BUFFER_SIZE];
-        ReadBufferFromFile in(filepath, METADATA_FILE_BUFFER_SIZE, -1, in_buf);
-        readStringUntilEOF(definition, in);
-    }
-    /** Empty files with metadata are generated after a rough restart of the server.
-      * Remove these files to slightly reduce the work of the admins on startup.
-      */
-    if (definition.empty())
-    {
-        LOG_ERROR(log, "File " << filepath << " is empty. Removing.");
-        Poco::File(filepath).remove();
-        return nullptr;
-    }
-    ParserCreateQuery parser_create;
-    ASTPtr result = parseQuery(parser_create, definition, "in file " + filepath, 0);
-    return result;
-}
 std::pair<String, StoragePtr> createTableFromAST(
     ASTCreateQuery ast_create_query,
     const String & database_name,
-    const String & database_data_path_relative,
+    const String & table_data_path_relative,
     Context & context,
     bool has_force_restore_data_flag)
 {
@@ -144,7 +55,7 @@ std::pair<String, StoragePtr> createTableFromAST(
         return {ast_create_query.table, storage};
     }
     /// We do not directly use `InterpreterCreateQuery::execute`, because
-    /// - the database has not been created yet;
+    /// - the database has not been loaded yet;
     /// - the code is simpler, since the query is already brought to a suitable form.
     if (!ast_create_query.columns_list || !ast_create_query.columns_list->columns)
         throw Exception("Missing definition of columns.", ErrorCodes::EMPTY_LIST_OF_COLUMNS_PASSED);
@@ -152,7 +63,6 @@ std::pair<String, StoragePtr> createTableFromAST(
     ColumnsDescription columns = InterpreterCreateQuery::getColumnsDescription(*ast_create_query.columns_list->columns, context);
     ConstraintsDescription constraints = InterpreterCreateQuery::getConstraintsDescription(ast_create_query.columns_list->constraints);
-    String table_data_path_relative = database_data_path_relative + escapeForFileName(ast_create_query.table) + '/';
     return
     {
         ast_create_query.table,
@@ -202,7 +112,6 @@ String getObjectDefinitionFromCreateQuery(const ASTPtr & query)
 }
 void DatabaseOnDisk::createTable(
-    IDatabase & database,
     const Context & context,
     const String & table_name,
     const StoragePtr & table,
@@ -222,14 +131,14 @@ void DatabaseOnDisk::createTable(
     /// A race condition would be possible if a table with the same name is simultaneously created using CREATE and using ATTACH.
     /// But there is protection from it - see using DDLGuard in InterpreterCreateQuery.
-    if (database.isDictionaryExist(context, table_name))
-        throw Exception("Dictionary " + backQuote(database.getDatabaseName()) + "." + backQuote(table_name) + " already exists.",
+    if (isDictionaryExist(context, table_name))
+        throw Exception("Dictionary " + backQuote(getDatabaseName()) + "." + backQuote(table_name) + " already exists.",
                         ErrorCodes::DICTIONARY_ALREADY_EXISTS);
-    if (database.isTableExist(context, table_name))
-        throw Exception("Table " + backQuote(database.getDatabaseName()) + "." + backQuote(table_name) + " already exists.", ErrorCodes::TABLE_ALREADY_EXISTS);
+    if (isTableExist(context, table_name))
+        throw Exception("Table " + backQuote(getDatabaseName()) + "." + backQuote(table_name) + " already exists.", ErrorCodes::TABLE_ALREADY_EXISTS);
-    String table_metadata_path = database.getObjectMetadataPath(table_name);
+    String table_metadata_path = getObjectMetadataPath(table_name);
     String table_metadata_tmp_path = table_metadata_path + ".tmp";
     String statement;
@@ -248,7 +157,7 @@ void DatabaseOnDisk::createTable(
     try
     {
         /// Add a table to the map of known tables.
-        database.attachTable(table_name, table);
+        attachTable(table_name, table);
         /// If it was ATTACH query and file with table metadata already exist
         /// (so, ATTACH is done after DETACH), then rename atomically replaces old file with new one.
@@ -261,107 +170,11 @@ void DatabaseOnDisk::createTable(
     }
 }
-void DatabaseOnDisk::createDictionary(
-    IDatabase & database,
-    const Context & context,
-    const String & dictionary_name,
-    const ASTPtr & query)
-{
-    const auto & settings = context.getSettingsRef();
-    /** The code is based on the assumption that all threads share the same order of operations:
-      * - create the .sql.tmp file;
-      * - add the dictionary to ExternalDictionariesLoader;
-      * - load the dictionary in case dictionaries_lazy_load == false;
-      * - attach the dictionary;
-      * - rename .sql.tmp to .sql.
-      */
-    /// A race condition would be possible if a dictionary with the same name is simultaneously created using CREATE and using ATTACH.
-    /// But there is protection from it - see using DDLGuard in InterpreterCreateQuery.
-    if (database.isDictionaryExist(context, dictionary_name))
-        throw Exception("Dictionary " + backQuote(database.getDatabaseName()) + "." + backQuote(dictionary_name) + " already exists.", ErrorCodes::DICTIONARY_ALREADY_EXISTS);
-    /// A dictionary with the same full name could be defined in *.xml config files.
-    String full_name = database.getDatabaseName() + "." + dictionary_name;
-    auto & external_loader = const_cast<ExternalDictionariesLoader &>(context.getExternalDictionariesLoader());
-    if (external_loader.getCurrentStatus(full_name) != ExternalLoader::Status::NOT_EXIST)
-        throw Exception(
-            "Dictionary " + backQuote(database.getDatabaseName()) + "." + backQuote(dictionary_name) + " already exists.",
-            ErrorCodes::DICTIONARY_ALREADY_EXISTS);
-    if (database.isTableExist(context, dictionary_name))
-        throw Exception("Table " + backQuote(database.getDatabaseName()) + "." + backQuote(dictionary_name) + " already exists.", ErrorCodes::TABLE_ALREADY_EXISTS);
-    String dictionary_metadata_path = database.getObjectMetadataPath(dictionary_name);
-    String dictionary_metadata_tmp_path = dictionary_metadata_path + ".tmp";
-    String statement = getObjectDefinitionFromCreateQuery(query);
-    {
-        /// Exclusive flags guarantees, that table is not created right now in another thread. Otherwise, exception will be thrown.
-        WriteBufferFromFile out(dictionary_metadata_tmp_path, statement.size(), O_WRONLY | O_CREAT | O_EXCL);
-        writeString(statement, out);
-        out.next();
-        if (settings.fsync_metadata)
-            out.sync();
-        out.close();
-    }
-    bool succeeded = false;
-    SCOPE_EXIT({
-        if (!succeeded)
-            Poco::File(dictionary_metadata_tmp_path).remove();
-    });
-    /// Add a temporary repository containing the dictionary.
-    /// We need this temp repository to try loading the dictionary before actually attaching it to the database.
-    static std::atomic<size_t> counter = 0;
-    String temp_repository_name = String(IExternalLoaderConfigRepository::INTERNAL_REPOSITORY_NAME_PREFIX) + " creating " + full_name + " "
-        + std::to_string(++counter);
-    external_loader.addConfigRepository(
-        temp_repository_name,
-        std::make_unique<ExternalLoaderPresetConfigRepository>(
-            std::vector{std::pair{dictionary_metadata_tmp_path,
-                getDictionaryConfigurationFromAST(query->as<const ASTCreateQuery &>(), database.getDatabaseName())}}));
-    SCOPE_EXIT({ external_loader.removeConfigRepository(temp_repository_name); });
-    bool lazy_load = context.getConfigRef().getBool("dictionaries_lazy_load", true);
-    if (!lazy_load)
-    {
-        /// load() is called here to force loading the dictionary, wait until the loading is finished,
-        /// and throw an exception if the loading is failed.
-        external_loader.load(full_name);
-    }
-    database.attachDictionary(dictionary_name, context);
-    SCOPE_EXIT({
-        if (!succeeded)
-            database.detachDictionary(dictionary_name, context);
-    });
-    /// If it was ATTACH query and file with dictionary metadata already exist
-    /// (so, ATTACH is done after DETACH), then rename atomically replaces old file with new one.
-    Poco::File(dictionary_metadata_tmp_path).renameTo(dictionary_metadata_path);
-    /// ExternalDictionariesLoader doesn't know we renamed the metadata path.
-    /// So we have to manually call reloadConfig() here.
-    external_loader.reloadConfig(database.getDatabaseName(), full_name);
-    /// Everything's ok.
-    succeeded = true;
-}
-void DatabaseOnDisk::removeTable(
-    IDatabase & database,
-    const Context & /* context */,
-    const String & table_name,
-    Poco::Logger * log)
-{
-    StoragePtr res = database.detachTable(table_name);
-    String table_metadata_path = database.getObjectMetadataPath(table_name);
+void DatabaseOnDisk::removeTable(const Context & /* context */, const String & table_name)
+{
+    StoragePtr res = detachTable(table_name);
+    String table_metadata_path = getObjectMetadataPath(table_name);
     try
     {
@@ -378,51 +191,64 @@ void DatabaseOnDisk::removeTable(
         {
             LOG_WARNING(log, getCurrentExceptionMessage(__PRETTY_FUNCTION__));
         }
-        database.attachTable(table_name, res);
+        attachTable(table_name, res);
         throw;
     }
 }
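Both createTable above and the dictionary-creation code being removed here (it moves out of this file in this commit) persist metadata the same way: write the complete statement to a .sql.tmp file, optionally fsync, then rename onto the final .sql name, so a reader never observes a half-written file. A portable sketch of that write-then-rename idiom (plain stdio/fstream, not ClickHouse's WriteBufferFromFile, and without the O_EXCL exclusivity check the real code uses):

```cpp
#include <cstdio>
#include <fstream>
#include <stdexcept>
#include <string>

/// Write `statement` to `path` atomically via a temporary file.
void writeMetadataAtomically(const std::string & path, const std::string & statement)
{
    const std::string tmp_path = path + ".tmp";
    {
        std::ofstream out(tmp_path, std::ios::binary | std::ios::trunc);
        if (!out)
            throw std::runtime_error("cannot open " + tmp_path);
        out << statement;
        out.flush();  /// the real code additionally fsyncs when the fsync_metadata setting is on
        if (!out)
            throw std::runtime_error("cannot write " + tmp_path);
    }
    /// On POSIX, rename() atomically replaces an existing file, which is
    /// what makes ATTACH-after-DETACH safe in the code above.
    if (std::rename(tmp_path.c_str(), path.c_str()) != 0)
        throw std::runtime_error("cannot rename " + tmp_path + " to " + path);
}

int main()
{
    writeMetadataAtomically("t.sql", "ATTACH TABLE t (x UInt8) ENGINE = TinyLog\n");
}
```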
-void DatabaseOnDisk::removeDictionary(
-    IDatabase & database,
-    const Context & context,
-    const String & dictionary_name,
-    Poco::Logger * /*log*/)
-{
-    database.detachDictionary(dictionary_name, context);
-    String dictionary_metadata_path = database.getObjectMetadataPath(dictionary_name);
-    if (Poco::File(dictionary_metadata_path).exists())
-    {
-        try
-        {
-            Poco::File(dictionary_metadata_path).remove();
-        }
-        catch (...)
-        {
-            /// If remove was not possible for some reason
-            database.attachDictionary(dictionary_name, context);
-            throw;
-        }
-    }
-}
+void DatabaseOnDisk::renameTable(
+    const Context & context,
+    const String & table_name,
+    IDatabase & to_database,
+    const String & to_table_name,
+    TableStructureWriteLockHolder & lock)
+{
+    if (typeid(*this) != typeid(to_database))
+        throw Exception("Moving tables between databases of different engines is not supported", ErrorCodes::NOT_IMPLEMENTED);
+    StoragePtr table = tryGetTable(context, table_name);
+    if (!table)
+        throw Exception("Table " + backQuote(getDatabaseName()) + "." + backQuote(table_name) + " doesn't exist.", ErrorCodes::UNKNOWN_TABLE);
+    ASTPtr ast = parseQueryFromMetadata(getObjectMetadataPath(table_name));
+    if (!ast)
+        throw Exception("There is no metadata file for table " + backQuote(table_name) + ".", ErrorCodes::FILE_DOESNT_EXIST);
+    auto & create = ast->as<ASTCreateQuery &>();
+    create.table = to_table_name;
+    /// Notify the table that it is renamed. If the table does not support renaming, exception is thrown.
+    try
+    {
+        table->rename(to_database.getTableDataPath(create),
+                      to_database.getDatabaseName(),
+                      to_table_name, lock);
+    }
+    catch (const Exception &)
+    {
+        throw;
+    }
+    catch (const Poco::Exception & e)
+    {
+        /// Better diagnostics.
+        throw Exception{Exception::CreateFromPoco, e};
+    }
+    /// NOTE Non-atomic.
+    to_database.createTable(context, to_table_name, table, ast);
+    removeTable(context, table_name);
+}
-ASTPtr DatabaseOnDisk::getCreateTableQueryImpl(
-    const IDatabase & database,
-    const Context & context,
-    const String & table_name,
-    bool throw_on_error)
+ASTPtr DatabaseOnDisk::getCreateTableQueryImpl(const Context & context, const String & table_name, bool throw_on_error) const
 {
     ASTPtr ast;
-    auto table_metadata_path = detail::getObjectMetadataPath(database.getMetadataPath(), table_name);
-    ast = detail::getCreateQueryFromMetadata(table_metadata_path, database.getDatabaseName(), throw_on_error);
+    auto table_metadata_path = getObjectMetadataPath(table_name);
+    ast = getCreateQueryFromMetadata(table_metadata_path, throw_on_error);
     if (!ast && throw_on_error)
     {
         /// Handle system.* tables for which there are no table.sql files.
-        bool has_table = database.tryGetTable(context, table_name) != nullptr;
+        bool has_table = tryGetTable(context, table_name) != nullptr;
         auto msg = has_table
             ? "There is no CREATE TABLE query for table "
@@ -434,61 +260,18 @@
     return ast;
 }
-ASTPtr DatabaseOnDisk::getCreateDictionaryQueryImpl(
-    const IDatabase & database,
-    const Context & context,
-    const String & dictionary_name,
-    bool throw_on_error)
-{
-    ASTPtr ast;
-    auto dictionary_metadata_path = detail::getObjectMetadataPath(database.getMetadataPath(), dictionary_name);
-    ast = detail::getCreateQueryFromMetadata(dictionary_metadata_path, database.getDatabaseName(), throw_on_error);
-    if (!ast && throw_on_error)
-    {
-        /// Handle system.* tables for which there are no table.sql files.
-        bool has_dictionary = database.isDictionaryExist(context, dictionary_name);
-        auto msg = has_dictionary ? "There is no CREATE DICTIONARY query for table " : "There is no metadata file for dictionary ";
-        throw Exception(msg + backQuote(dictionary_name), ErrorCodes::CANNOT_GET_CREATE_DICTIONARY_QUERY);
-    }
-    return ast;
-}
-ASTPtr DatabaseOnDisk::getCreateTableQuery(const IDatabase & database, const Context & context, const String & table_name)
-{
-    return getCreateTableQueryImpl(database, context, table_name, true);
-}
-ASTPtr DatabaseOnDisk::tryGetCreateTableQuery(const IDatabase & database, const Context & context, const String & table_name)
-{
-    return getCreateTableQueryImpl(database, context, table_name, false);
-}
-ASTPtr DatabaseOnDisk::getCreateDictionaryQuery(const IDatabase & database, const Context & context, const String & dictionary_name)
-{
-    return getCreateDictionaryQueryImpl(database, context, dictionary_name, true);
-}
-ASTPtr DatabaseOnDisk::tryGetCreateDictionaryQuery(const IDatabase & database, const Context & context, const String & dictionary_name)
-{
-    return getCreateDictionaryQueryImpl(database, context, dictionary_name, false);
-}
-ASTPtr DatabaseOnDisk::getCreateDatabaseQuery(const IDatabase & database, const Context & /*context*/)
-{
-    ASTPtr ast;
-    auto database_metadata_path = detail::getDatabaseMetadataPath(database.getMetadataPath());
-    ast = detail::getCreateQueryFromMetadata(database_metadata_path, database.getDatabaseName(), true);
+ASTPtr DatabaseOnDisk::getCreateDatabaseQuery() const
+{
+    ASTPtr ast;
+    auto metadata_dir_path = getMetadataPath();
+    auto database_metadata_path = metadata_dir_path.substr(0, metadata_dir_path.size() - 1) + ".sql";
+    ast = getCreateQueryFromMetadata(database_metadata_path, true);
     if (!ast)
     {
         /// Handle databases (such as default) for which there are no database.sql files.
-        String query = "CREATE DATABASE " + backQuoteIfNeed(database.getDatabaseName()) + " ENGINE = Lazy";
+        /// If database.sql doesn't exist, then engine is Ordinary
+        String query = "CREATE DATABASE " + backQuoteIfNeed(getDatabaseName()) + " ENGINE = Ordinary";
         ParserCreateQuery parser;
         ast = parseQuery(parser, query.data(), query.data() + query.size(), "", 0);
     }
@@ -496,22 +279,20 @@ ASTPtr DatabaseOnDisk::getCreateDatabaseQuery(const IDatabase & database, const
     return ast;
 }
-void DatabaseOnDisk::drop(const IDatabase & database, const Context & context)
+void DatabaseOnDisk::drop(const Context & context)
 {
-    Poco::File(context.getPath() + database.getDataPath()).remove(false);
-    Poco::File(database.getMetadataPath()).remove(false);
+    Poco::File(context.getPath() + getDataPath()).remove(false);
+    Poco::File(getMetadataPath()).remove(false);
 }
-String DatabaseOnDisk::getObjectMetadataPath(const IDatabase & database, const String & table_name)
+String DatabaseOnDisk::getObjectMetadataPath(const String & table_name) const
 {
-    return detail::getObjectMetadataPath(database.getMetadataPath(), table_name);
+    return getMetadataPath() + escapeForFileName(table_name) + ".sql";
 }
-time_t DatabaseOnDisk::getObjectMetadataModificationTime(
-    const IDatabase & database,
-    const String & table_name)
+time_t DatabaseOnDisk::getObjectMetadataModificationTime(const String & table_name) const
 {
-    String table_metadata_path = getObjectMetadataPath(database, table_name);
+    String table_metadata_path = getObjectMetadataPath(table_name);
     Poco::File meta_file(table_metadata_path);
     if (meta_file.exists())
@@ -520,10 +301,10 @@ time_t DatabaseOnDisk::getObjectMetadataModificationTime(
     return static_cast<time_t>(0);
 }
-void DatabaseOnDisk::iterateMetadataFiles(const IDatabase & database, Poco::Logger * log, const Context & context, const IteratingFunction & iterating_function)
+void DatabaseOnDisk::iterateMetadataFiles(const Context & context, const IteratingFunction & iterating_function) const
 {
     Poco::DirectoryIterator dir_end;
-    for (Poco::DirectoryIterator dir_it(database.getMetadataPath()); dir_it != dir_end; ++dir_it)
+    for (Poco::DirectoryIterator dir_it(getMetadataPath()); dir_it != dir_end; ++dir_it)
     {
         /// For '.svn', '.gitignore' directory and similar.
         if (dir_it.name().at(0) == '.')
@@ -538,10 +319,10 @@ void DatabaseOnDisk::iterateMetadataFiles(const IDatabase & database, Poco::Logg
         if (endsWith(dir_it.name(), tmp_drop_ext))
         {
             const std::string object_name = dir_it.name().substr(0, dir_it.name().size() - strlen(tmp_drop_ext));
-            if (Poco::File(context.getPath() + database.getDataPath() + '/' + object_name).exists())
+            if (Poco::File(context.getPath() + getDataPath() + '/' + object_name).exists())
             {
                 /// TODO maybe complete table drop and remove all table data (including data on other volumes and metadata in ZK)
-                Poco::File(dir_it->path()).renameTo(database.getMetadataPath() + object_name + ".sql");
+                Poco::File(dir_it->path()).renameTo(getMetadataPath() + object_name + ".sql");
                 LOG_WARNING(log, "Object " << backQuote(object_name) << " was not dropped previously and will be restored");
                 iterating_function(object_name + ".sql");
             }
@@ -567,9 +348,64 @@ void DatabaseOnDisk::iterateMetadataFiles(const IDatabase & database, Poco::Logg
             iterating_function(dir_it.name());
         }
         else
-            throw Exception("Incorrect file extension: " + dir_it.name() + " in metadata directory " + database.getMetadataPath(),
+            throw Exception("Incorrect file extension: " + dir_it.name() + " in metadata directory " + getMetadataPath(),
                             ErrorCodes::INCORRECT_FILE_NAME);
     }
 }
+ASTPtr DatabaseOnDisk::parseQueryFromMetadata(const String & metadata_file_path, bool throw_on_error /*= true*/, bool remove_empty /*= false*/) const
+{
+    String query;
+    try
+    {
+        ReadBufferFromFile in(metadata_file_path, METADATA_FILE_BUFFER_SIZE);
+        readStringUntilEOF(query, in);
+    }
+    catch (const Exception & e)
+    {
+        if (!throw_on_error && e.code() == ErrorCodes::FILE_DOESNT_EXIST)
+            return nullptr;
+        else
+            throw;
+    }
+    /** Empty files with metadata are generated after a rough restart of the server.
+      * Remove these files to slightly reduce the work of the admins on startup.
+      */
+    if (remove_empty && query.empty())
+    {
+        LOG_ERROR(log, "File " << metadata_file_path << " is empty. Removing.");
+        Poco::File(metadata_file_path).remove();
+        return nullptr;
+    }
+    ParserCreateQuery parser;
+    const char * pos = query.data();
+    std::string error_message;
+    auto ast = tryParseQuery(parser, pos, pos + query.size(), error_message, /* hilite = */ false,
+                             "in file " + getMetadataPath(), /* allow_multi_statements = */ false, 0);
+    if (!ast && throw_on_error)
+        throw Exception(error_message, ErrorCodes::SYNTAX_ERROR);
+    else if (!ast)
+        return nullptr;
+    return ast;
+}
+ASTPtr DatabaseOnDisk::getCreateQueryFromMetadata(const String & database_metadata_path, bool throw_on_error) const
+{
+    ASTPtr ast = parseQueryFromMetadata(database_metadata_path, throw_on_error);
+    if (ast)
+    {
+        auto & ast_create_query = ast->as<ASTCreateQuery &>();
+        ast_create_query.attach = false;
+        ast_create_query.database = database_name;
+    }
+    return ast;
+}
 }
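getCreateQueryFromMetadata above clears the attach flag and binds the database name because on-disk metadata stores statements in ATTACH form without a database, while callers expect a complete CREATE. A toy illustration of that rewrite (a simplified stand-in struct, not the real ASTCreateQuery or its formatter):

```cpp
#include <iostream>
#include <string>

/// Toy model of the transformation: ATTACH form from the .sql file
/// becomes a CREATE bound to this database.
struct CreateQuerySketch
{
    bool attach = true;       /// as parsed from the metadata file
    std::string database;     /// empty in the file
    std::string table;
    std::string body;

    std::string format() const
    {
        return std::string(attach ? "ATTACH" : "CREATE") + " TABLE "
            + (database.empty() ? "" : database + ".") + table + " " + body;
    }
};

int main()
{
    CreateQuerySketch ast;
    ast.table = "hits";
    ast.body = "(x UInt8) ENGINE = TinyLog";
    std::cout << ast.format() << "\n";  /// ATTACH TABLE hits (x UInt8) ENGINE = TinyLog
    ast.attach = false;                 /// ast_create_query.attach = false;
    ast.database = "db";                /// ast_create_query.database = database_name;
    std::cout << ast.format() << "\n";  /// CREATE TABLE db.hits (x UInt8) ENGINE = TinyLog
}
```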

Databases/DatabaseOnDisk.h

@@ -11,24 +11,14 @@
 namespace DB
 {
-namespace detail
-{
-    String getObjectMetadataPath(const String & base_path, const String & dictionary_name);
-    String getDatabaseMetadataPath(const String & base_path);
-    ASTPtr getQueryFromMetadata(const String & metadata_path, bool throw_on_error = true);
-    ASTPtr getCreateQueryFromMetadata(const String & metadata_path, const String & database, bool throw_on_error);
-}
-ASTPtr parseCreateQueryFromMetadataFile(const String & filepath, Poco::Logger * log);
 std::pair<String, StoragePtr> createTableFromAST(
     ASTCreateQuery ast_create_query,
     const String & database_name,
-    const String & database_data_path_relative,
+    const String & table_data_path_relative,
     Context & context,
     bool has_force_restore_data_flag);
-/** Get the row with the table definition based on the CREATE query.
+/** Get the string with the table definition based on the CREATE query.
   * It is an ATTACH query that you can execute to create a table from the correspondent database.
   * See the implementation.
   */
@@ -37,147 +27,59 @@ String getObjectDefinitionFromCreateQuery(const ASTPtr & query);
 /* Class to provide basic operations with tables when metadata is stored on disk in .sql files.
  */
-class DatabaseOnDisk
+class DatabaseOnDisk : public DatabaseWithOwnTablesBase
 {
 public:
-    static void createTable(
-        IDatabase & database,
-        const Context & context,
-        const String & table_name,
-        const StoragePtr & table,
-        const ASTPtr & query);
-    static void createDictionary(
-        IDatabase & database,
-        const Context & context,
-        const String & dictionary_name,
-        const ASTPtr & query);
-    static void removeTable(
-        IDatabase & database,
-        const Context & context,
-        const String & table_name,
-        Poco::Logger * log);
-    static void removeDictionary(
-        IDatabase & database,
-        const Context & context,
-        const String & dictionary_name,
-        Poco::Logger * log);
-    template <typename Database>
-    static void renameTable(
-        IDatabase & database,
-        const Context & context,
-        const String & table_name,
-        IDatabase & to_database,
-        const String & to_table_name,
-        TableStructureWriteLockHolder & lock);
-    static ASTPtr getCreateTableQuery(
-        const IDatabase & database,
-        const Context & context,
-        const String & table_name);
-    static ASTPtr tryGetCreateTableQuery(
-        const IDatabase & database,
-        const Context & context,
-        const String & table_name);
-    static ASTPtr getCreateDictionaryQuery(
-        const IDatabase & database,
-        const Context & context,
-        const String & dictionary_name);
-    static ASTPtr tryGetCreateDictionaryQuery(
-        const IDatabase & database,
-        const Context & context,
-        const String & dictionary_name);
-    static ASTPtr getCreateDatabaseQuery(
-        const IDatabase & database,
-        const Context & context);
-    static void drop(const IDatabase & database, const Context & context);
-    static String getObjectMetadataPath(
-        const IDatabase & database,
-        const String & object_name);
-    static time_t getObjectMetadataModificationTime(
-        const IDatabase & database,
-        const String & object_name);
+    DatabaseOnDisk(const String & name, const String & metadata_path_, const String & logger)
+        : DatabaseWithOwnTablesBase(name, logger)
+        , metadata_path(metadata_path_)
+        , data_path("data/" + escapeForFileName(database_name) + "/") {}
+    void createTable(
+        const Context & context,
+        const String & table_name,
+        const StoragePtr & table,
+        const ASTPtr & query) override;
+    void removeTable(
+        const Context & context,
+        const String & table_name) override;
+    void renameTable(
+        const Context & context,
+        const String & table_name,
+        IDatabase & to_database,
+        const String & to_table_name,
+        TableStructureWriteLockHolder & lock) override;
+    ASTPtr getCreateDatabaseQuery() const override;
+    void drop(const Context & context) override;
+    String getObjectMetadataPath(const String & object_name) const override;
+    time_t getObjectMetadataModificationTime(const String & object_name) const override;
+    String getDataPath() const override { return data_path; }
+    String getTableDataPath(const String & table_name) const override { return data_path + escapeForFileName(table_name) + "/"; }
+    String getTableDataPath(const ASTCreateQuery & query) const override { return getTableDataPath(query.table); }
+    String getMetadataPath() const override { return metadata_path; }
+protected:
     using IteratingFunction = std::function<void(const String &)>;
-    static void iterateMetadataFiles(const IDatabase & database, Poco::Logger * log, const Context & context, const IteratingFunction & iterating_function);
+    void iterateMetadataFiles(const Context & context, const IteratingFunction & iterating_function) const;
-private:
-    static ASTPtr getCreateTableQueryImpl(
-        const IDatabase & database,
-        const Context & context,
-        const String & table_name,
-        bool throw_on_error);
-    static ASTPtr getCreateDictionaryQueryImpl(
-        const IDatabase & database,
-        const Context & context,
-        const String & dictionary_name,
-        bool throw_on_error);
+    ASTPtr getCreateTableQueryImpl(
+        const Context & context,
+        const String & table_name,
+        bool throw_on_error) const override;
+    ASTPtr parseQueryFromMetadata(const String & metadata_file_path, bool throw_on_error = true, bool remove_empty = false) const;
+    ASTPtr getCreateQueryFromMetadata(const String & metadata_path, bool throw_on_error) const;
+    const String metadata_path;
+    const String data_path;
 };
-namespace ErrorCodes
-{
-    extern const int NOT_IMPLEMENTED;
-    extern const int UNKNOWN_TABLE;
-    extern const int FILE_DOESNT_EXIST;
-}
-template <typename Database>
-void DatabaseOnDisk::renameTable(
-    IDatabase & database,
-    const Context & context,
-    const String & table_name,
-    IDatabase & to_database,
-    const String & to_table_name,
-    TableStructureWriteLockHolder & lock)
-{
-    Database * to_database_concrete = typeid_cast<Database *>(&to_database);
-    if (!to_database_concrete)
-        throw Exception("Moving tables between databases of different engines is not supported", ErrorCodes::NOT_IMPLEMENTED);
-    StoragePtr table = database.tryGetTable(context, table_name);
-    if (!table)
-        throw Exception("Table " + backQuote(database.getDatabaseName()) + "." + backQuote(table_name) + " doesn't exist.", ErrorCodes::UNKNOWN_TABLE);
-    /// Notify the table that it is renamed. If the table does not support renaming, exception is thrown.
-    try
-    {
-        table->rename("/data/" + escapeForFileName(to_database_concrete->getDatabaseName()) + "/" + escapeForFileName(to_table_name) + '/',
-            to_database_concrete->getDatabaseName(),
-            to_table_name, lock);
-    }
-    catch (const Exception &)
-    {
-        throw;
-    }
-    catch (const Poco::Exception & e)
-    {
-        /// Better diagnostics.
-        throw Exception{Exception::CreateFromPoco, e};
-    }
-    ASTPtr ast = detail::getQueryFromMetadata(detail::getObjectMetadataPath(database.getMetadataPath(), table_name));
-    if (!ast)
-        throw Exception("There is no metadata file for table " + backQuote(table_name) + ".", ErrorCodes::FILE_DOESNT_EXIST);
-    ast->as<ASTCreateQuery &>().table = to_table_name;
-    /// NOTE Non-atomic.
-    to_database_concrete->createTable(context, to_table_name, table, ast);
-    database.removeTable(context, table_name);
-}
 }
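The new inline helpers pin the on-disk layout down in one place: metadata lives at <metadata_path>/<escaped_table>.sql and data at data/<escaped_db>/<escaped_table>/ relative to the server path. A sketch of that composition, assuming escapeForFileName percent-encodes characters outside [A-Za-z0-9_] (the stand-in below only approximates the real function):

```cpp
#include <cctype>
#include <cstdio>
#include <iostream>
#include <string>

/// Rough stand-in for escapeForFileName: keep [A-Za-z0-9_], percent-encode the rest.
std::string escapeForFileNameSketch(const std::string & s)
{
    std::string res;
    for (unsigned char c : s)
    {
        if (std::isalnum(c) || c == '_')
            res += static_cast<char>(c);
        else
        {
            char buf[4];
            std::snprintf(buf, sizeof(buf), "%%%02X", c);
            res += buf;
        }
    }
    return res;
}

int main()
{
    const std::string database = "web-analytics";  /// hypothetical names
    const std::string table = "hits_v1";
    /// Mirrors DatabaseOnDisk::getTableDataPath / getObjectMetadataPath:
    const std::string data_path = "data/" + escapeForFileNameSketch(database) + "/";
    std::cout << data_path + escapeForFileNameSketch(table) + "/" << "\n";
    /// data/web%2Danalytics/hits_v1/
    std::cout << "metadata/" + escapeForFileNameSketch(database) + "/" + escapeForFileNameSketch(table) + ".sql" << "\n";
    /// metadata/web%2Danalytics/hits_v1.sql
}
```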

Databases/DatabaseOrdinary.cpp

@ -1,7 +1,6 @@
#include <iomanip> #include <iomanip>
#include <Core/Settings.h> #include <Core/Settings.h>
#include <Databases/DatabaseMemory.h>
#include <Databases/DatabaseOnDisk.h> #include <Databases/DatabaseOnDisk.h>
#include <Databases/DatabaseOrdinary.h> #include <Databases/DatabaseOrdinary.h>
#include <Databases/DatabasesCommon.h> #include <Databases/DatabasesCommon.h>
@ -11,22 +10,19 @@
#include <IO/WriteHelpers.h> #include <IO/WriteHelpers.h>
#include <Interpreters/Context.h> #include <Interpreters/Context.h>
#include <Interpreters/InterpreterCreateQuery.h> #include <Interpreters/InterpreterCreateQuery.h>
#include <Interpreters/ExternalLoaderDatabaseConfigRepository.h>
#include <Interpreters/ExternalDictionariesLoader.h>
#include <Parsers/ASTCreateQuery.h> #include <Parsers/ASTCreateQuery.h>
#include <Parsers/ParserCreateQuery.h> #include <Parsers/ParserCreateQuery.h>
#include <Parsers/ParserDictionary.h>
#include <Storages/StorageFactory.h> #include <Storages/StorageFactory.h>
#include <Dictionaries/DictionaryFactory.h>
#include <Parsers/parseQuery.h> #include <Parsers/parseQuery.h>
#include <Parsers/formatAST.h> #include <Parsers/formatAST.h>
#include <Storages/IStorage.h> #include <Parsers/ASTSetQuery.h>
#include <TableFunctions/TableFunctionFactory.h> #include <TableFunctions/TableFunctionFactory.h>
#include <Parsers/queryToString.h>
#include <Poco/DirectoryIterator.h> #include <Poco/DirectoryIterator.h>
#include <Poco/Event.h> #include <Poco/Event.h>
#include <Common/Stopwatch.h> #include <Common/Stopwatch.h>
#include <Common/StringUtils/StringUtils.h>
#include <Common/quoteString.h> #include <Common/quoteString.h>
#include <Common/ThreadPool.h> #include <Common/ThreadPool.h>
#include <Common/escapeForFileName.h> #include <Common/escapeForFileName.h>
@ -40,11 +36,9 @@ namespace DB
namespace ErrorCodes namespace ErrorCodes
{ {
extern const int CANNOT_CREATE_TABLE_FROM_METADATA;
extern const int CANNOT_CREATE_DICTIONARY_FROM_METADATA; extern const int CANNOT_CREATE_DICTIONARY_FROM_METADATA;
extern const int EMPTY_LIST_OF_COLUMNS_PASSED; extern const int EMPTY_LIST_OF_COLUMNS_PASSED;
extern const int CANNOT_PARSE_TEXT; extern const int CANNOT_PARSE_TEXT;
extern const int EMPTY_LIST_OF_ATTRIBUTES_PASSED;
} }
@ -68,16 +62,13 @@ namespace
String table_name; String table_name;
StoragePtr table; StoragePtr table;
std::tie(table_name, table) std::tie(table_name, table)
= createTableFromAST(query, database_name, database.getDataPath(), context, has_force_restore_data_flag); = createTableFromAST(query, database_name, database.getTableDataPath(query), context, has_force_restore_data_flag);
database.attachTable(table_name, table); database.attachTable(table_name, table);
} }
catch (const Exception & e) catch (Exception & e)
{ {
throw Exception( e.addMessage("Cannot attach table '" + backQuote(query.table) + "' from query " + serializeAST(query));
"Cannot attach table '" + query.table + "' from query " + serializeAST(query) throw;
+ ". Error: " + DB::getCurrentExceptionMessage(true),
e,
DB::ErrorCodes::CANNOT_CREATE_TABLE_FROM_METADATA);
} }
} }
@ -92,13 +83,10 @@ namespace
{ {
database.attachDictionary(query.table, context); database.attachDictionary(query.table, context);
} }
catch (const Exception & e) catch (Exception & e)
{ {
throw Exception( e.addMessage("Cannot attach dictionary '" + backQuote(query.table) + "' from query " + serializeAST(query));
"Cannot create dictionary '" + query.table + "' from query " + serializeAST(query) throw;
+ ". Error: " + DB::getCurrentExceptionMessage(true),
e,
DB::ErrorCodes::CANNOT_CREATE_DICTIONARY_FROM_METADATA);
} }
} }
@ -114,11 +102,8 @@ namespace
} }
DatabaseOrdinary::DatabaseOrdinary(String name_, const String & metadata_path_, const Context & context_) DatabaseOrdinary::DatabaseOrdinary(const String & name_, const String & metadata_path_, const Context & context_)
: DatabaseWithOwnTablesBase(std::move(name_)) : DatabaseWithDictionaries(name_, metadata_path_, "DatabaseOrdinary (" + name_ + ")")
, metadata_path(metadata_path_)
, data_path("data/" + escapeForFileName(name) + "/")
, log(&Logger::get("DatabaseOrdinary (" + name + ")"))
{ {
Poco::File(context_.getPath() + getDataPath()).createDirectories(); Poco::File(context_.getPath() + getDataPath()).createDirectories();
} }
@ -137,12 +122,12 @@ void DatabaseOrdinary::loadStoredObjects(
FileNames file_names; FileNames file_names;
size_t total_dictionaries = 0; size_t total_dictionaries = 0;
DatabaseOnDisk::iterateMetadataFiles(*this, log, context, [&file_names, &total_dictionaries, this](const String & file_name) iterateMetadataFiles(context, [&file_names, &total_dictionaries, this](const String & file_name)
{ {
String full_path = metadata_path + "/" + file_name; String full_path = getMetadataPath() + file_name;
try try
{ {
auto ast = parseCreateQueryFromMetadataFile(full_path, log); auto ast = parseQueryFromMetadata(full_path, /*throw_on_error*/ true, /*remove_empty*/false);
if (ast) if (ast)
{ {
auto * create_query = ast->as<ASTCreateQuery>(); auto * create_query = ast->as<ASTCreateQuery>();
@ -150,10 +135,10 @@ void DatabaseOrdinary::loadStoredObjects(
total_dictionaries += create_query->is_dictionary; total_dictionaries += create_query->is_dictionary;
} }
} }
catch (const Exception & e) catch (Exception & e)
{ {
throw Exception( e.addMessage("Cannot parse definition from metadata file " + full_path);
"Cannot parse definition from metadata file " + full_path + ". Error: " + DB::getCurrentExceptionMessage(true), e, ErrorCodes::CANNOT_PARSE_TEXT); throw;
} }
}); });
@ -187,12 +172,8 @@ void DatabaseOrdinary::loadStoredObjects(
/// After all tables was basically initialized, startup them. /// After all tables was basically initialized, startup them.
startupTables(pool); startupTables(pool);
/// Add database as repository
auto dictionaries_repository = std::make_unique<ExternalLoaderDatabaseConfigRepository>(shared_from_this(), context);
auto & external_loader = context.getExternalDictionariesLoader();
external_loader.addConfigRepository(getDatabaseName(), std::move(dictionaries_repository));
/// Attach dictionaries. /// Attach dictionaries.
attachToExternalDictionariesLoader(context);
for (const auto & name_with_query : file_names) for (const auto & name_with_query : file_names)
{ {
auto create_query = name_with_query.second->as<const ASTCreateQuery &>(); auto create_query = name_with_query.second->as<const ASTCreateQuery &>();
@ -237,94 +218,14 @@ void DatabaseOrdinary::startupTables(ThreadPool & thread_pool)
thread_pool.wait(); thread_pool.wait();
} }
void DatabaseOrdinary::createTable(
const Context & context,
const String & table_name,
const StoragePtr & table,
const ASTPtr & query)
{
DatabaseOnDisk::createTable(*this, context, table_name, table, query);
}
void DatabaseOrdinary::createDictionary(
const Context & context,
const String & dictionary_name,
const ASTPtr & query)
{
DatabaseOnDisk::createDictionary(*this, context, dictionary_name, query);
}
void DatabaseOrdinary::removeTable(
const Context & context,
const String & table_name)
{
DatabaseOnDisk::removeTable(*this, context, table_name, log);
}
void DatabaseOrdinary::removeDictionary(
const Context & context,
const String & table_name)
{
DatabaseOnDisk::removeDictionary(*this, context, table_name, log);
}
void DatabaseOrdinary::renameTable(
const Context & context,
const String & table_name,
IDatabase & to_database,
const String & to_table_name,
TableStructureWriteLockHolder & lock)
{
DatabaseOnDisk::renameTable<DatabaseOrdinary>(*this, context, table_name, to_database, to_table_name, lock);
}
time_t DatabaseOrdinary::getObjectMetadataModificationTime(
const Context & /* context */,
const String & table_name)
{
return DatabaseOnDisk::getObjectMetadataModificationTime(*this, table_name);
}
ASTPtr DatabaseOrdinary::getCreateTableQuery(const Context & context, const String & table_name) const
{
return DatabaseOnDisk::getCreateTableQuery(*this, context, table_name);
}
ASTPtr DatabaseOrdinary::tryGetCreateTableQuery(const Context & context, const String & table_name) const
{
return DatabaseOnDisk::tryGetCreateTableQuery(*this, context, table_name);
}
ASTPtr DatabaseOrdinary::getCreateDictionaryQuery(const Context & context, const String & dictionary_name) const
{
return DatabaseOnDisk::getCreateDictionaryQuery(*this, context, dictionary_name);
}
ASTPtr DatabaseOrdinary::tryGetCreateDictionaryQuery(const Context & context, const String & dictionary_name) const
{
return DatabaseOnDisk::tryGetCreateTableQuery(*this, context, dictionary_name);
}
ASTPtr DatabaseOrdinary::getCreateDatabaseQuery(const Context & context) const
{
return DatabaseOnDisk::getCreateDatabaseQuery(*this, context);
}
void DatabaseOrdinary::alterTable( void DatabaseOrdinary::alterTable(
const Context & context, const Context & context,
const String & table_name, const String & table_name,
const ColumnsDescription & columns, const StorageInMemoryMetadata & metadata)
const IndicesDescription & indices,
const ConstraintsDescription & constraints,
const ASTModifier & storage_modifier)
{ {
/// Read the definition of the table and replace the necessary parts with new ones. /// Read the definition of the table and replace the necessary parts with new ones.
String table_metadata_path = getObjectMetadataPath(table_name);
String table_name_escaped = escapeForFileName(table_name); String table_metadata_tmp_path = table_metadata_path + ".tmp";
String table_metadata_tmp_path = getMetadataPath() + "/" + table_name_escaped + ".sql.tmp";
String table_metadata_path = getMetadataPath() + "/" + table_name_escaped + ".sql";
String statement; String statement;
{ {
@ -338,19 +239,30 @@ void DatabaseOrdinary::alterTable(
const auto & ast_create_query = ast->as<ASTCreateQuery &>(); const auto & ast_create_query = ast->as<ASTCreateQuery &>();
ASTPtr new_columns = InterpreterCreateQuery::formatColumns(columns); ASTPtr new_columns = InterpreterCreateQuery::formatColumns(metadata.columns);
ASTPtr new_indices = InterpreterCreateQuery::formatIndices(indices); ASTPtr new_indices = InterpreterCreateQuery::formatIndices(metadata.indices);
ASTPtr new_constraints = InterpreterCreateQuery::formatConstraints(constraints); ASTPtr new_constraints = InterpreterCreateQuery::formatConstraints(metadata.constraints);
ast_create_query.columns_list->replace(ast_create_query.columns_list->columns, new_columns); ast_create_query.columns_list->replace(ast_create_query.columns_list->columns, new_columns);
ast_create_query.columns_list->setOrReplace(ast_create_query.columns_list->indices, new_indices); ast_create_query.columns_list->setOrReplace(ast_create_query.columns_list->indices, new_indices);
ast_create_query.columns_list->setOrReplace(ast_create_query.columns_list->constraints, new_constraints); ast_create_query.columns_list->setOrReplace(ast_create_query.columns_list->constraints, new_constraints);
if (storage_modifier) ASTStorage & storage_ast = *ast_create_query.storage;
storage_modifier(*ast_create_query.storage); /// ORDER BY may change, but it cannot appear anew: it is a required clause
if (metadata.order_by_ast && storage_ast.order_by)
storage_ast.set(storage_ast.order_by, metadata.order_by_ast);
if (metadata.primary_key_ast)
storage_ast.set(storage_ast.primary_key, metadata.primary_key_ast);
if (metadata.ttl_for_table_ast)
storage_ast.set(storage_ast.ttl_table, metadata.ttl_for_table_ast);
if (metadata.settings_ast)
storage_ast.set(storage_ast.settings, metadata.settings_ast);
statement = getObjectDefinitionFromCreateQuery(ast); statement = getObjectDefinitionFromCreateQuery(ast);
{ {
WriteBufferFromFile out(table_metadata_tmp_path, statement.size(), O_WRONLY | O_CREAT | O_EXCL); WriteBufferFromFile out(table_metadata_tmp_path, statement.size(), O_WRONLY | O_CREAT | O_EXCL);
writeString(statement, out); writeString(statement, out);
@ -372,31 +284,4 @@ void DatabaseOrdinary::alterTable(
} }
} }
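The metadata write above follows a tmp-file-then-rename idiom that recurs in DatabaseWithDictionaries::createDictionary below. A minimal sketch of the idiom, with hypothetical paths (not the actual code):
/// Sketch of the tmp-file-then-rename metadata update (hypothetical paths).
/// O_EXCL makes the open fail if the tmp file already exists, so concurrent
/// writers cannot clobber each other; rename() is atomic on POSIX, so readers
/// see either the old or the new definition, never a torn file.
String table_metadata_tmp_path = "metadata/db/t.sql.tmp";  /// hypothetical
String table_metadata_path = "metadata/db/t.sql";          /// hypothetical
WriteBufferFromFile out(table_metadata_tmp_path, statement.size(), O_WRONLY | O_CREAT | O_EXCL);
writeString(statement, out);
out.next();
if (context.getSettingsRef().fsync_metadata)
    out.sync();    /// make the new file durable before the swap
out.close();
Poco::File(table_metadata_tmp_path).renameTo(table_metadata_path);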
void DatabaseOrdinary::drop(const Context & context)
{
DatabaseOnDisk::drop(*this, context);
}
String DatabaseOrdinary::getDataPath() const
{
return data_path;
}
String DatabaseOrdinary::getMetadataPath() const
{
return metadata_path;
}
String DatabaseOrdinary::getDatabaseName() const
{
return name;
}
String DatabaseOrdinary::getObjectMetadataPath(const String & table_name) const
{
return DatabaseOnDisk::getObjectMetadataPath(*this, table_name);
}
} }

View File

@ -1,6 +1,6 @@
#pragma once #pragma once
#include <Databases/DatabasesCommon.h> #include <Databases/DatabaseWithDictionaries.h>
#include <Common/ThreadPool.h> #include <Common/ThreadPool.h>
@ -11,10 +11,10 @@ namespace DB
* It stores tables list in filesystem using list of .sql files, * It stores tables list in filesystem using list of .sql files,
* that contain declaration of table represented by SQL ATTACH TABLE query. * that contain declaration of table represented by SQL ATTACH TABLE query.
*/ */
class DatabaseOrdinary : public DatabaseWithOwnTablesBase class DatabaseOrdinary : public DatabaseWithDictionaries
{ {
public: public:
DatabaseOrdinary(String name_, const String & metadata_path_, const Context & context); DatabaseOrdinary(const String & name_, const String & metadata_path_, const Context & context);
String getEngineName() const override { return "Ordinary"; } String getEngineName() const override { return "Ordinary"; }
@ -22,73 +22,12 @@ public:
Context & context, Context & context,
bool has_force_restore_data_flag) override; bool has_force_restore_data_flag) override;
void createTable(
const Context & context,
const String & table_name,
const StoragePtr & table,
const ASTPtr & query) override;
void createDictionary(
const Context & context,
const String & dictionary_name,
const ASTPtr & query) override;
void removeTable(
const Context & context,
const String & table_name) override;
void removeDictionary(
const Context & context,
const String & table_name) override;
void renameTable(
const Context & context,
const String & table_name,
IDatabase & to_database,
const String & to_table_name,
TableStructureWriteLockHolder &) override;
void alterTable( void alterTable(
const Context & context, const Context & context,
const String & name, const String & name,
const ColumnsDescription & columns, const StorageInMemoryMetadata & metadata) override;
const IndicesDescription & indices,
const ConstraintsDescription & constraints,
const ASTModifier & engine_modifier) override;
time_t getObjectMetadataModificationTime(
const Context & context,
const String & table_name) override;
ASTPtr getCreateTableQuery(
const Context & context,
const String & table_name) const override;
ASTPtr tryGetCreateTableQuery(
const Context & context,
const String & table_name) const override;
ASTPtr tryGetCreateDictionaryQuery(
const Context & context,
const String & name) const override;
ASTPtr getCreateDictionaryQuery(
const Context & context,
const String & name) const override;
ASTPtr getCreateDatabaseQuery(const Context & context) const override;
String getDataPath() const override;
String getDatabaseName() const override;
String getMetadataPath() const override;
String getObjectMetadataPath(const String & table_name) const override;
void drop(const Context & context) override;
private: private:
const String metadata_path;
const String data_path;
Poco::Logger * log;
void startupTables(ThreadPool & thread_pool); void startupTables(ThreadPool & thread_pool);
}; };

View File

@ -0,0 +1,271 @@
#include <Databases/DatabaseWithDictionaries.h>
#include <Interpreters/ExternalDictionariesLoader.h>
#include <Interpreters/ExternalLoaderTempConfigRepository.h>
#include <Interpreters/ExternalLoaderDatabaseConfigRepository.h>
#include <Dictionaries/getDictionaryConfigurationFromAST.h>
#include <Interpreters/Context.h>
#include <Storages/StorageDictionary.h>
#include <IO/WriteBufferFromFile.h>
#include <Poco/File.h>
#include <ext/scope_guard.h>
namespace DB
{
namespace ErrorCodes
{
extern const int EMPTY_LIST_OF_COLUMNS_PASSED;
extern const int TABLE_ALREADY_EXISTS;
extern const int UNKNOWN_TABLE;
extern const int LOGICAL_ERROR;
extern const int DICTIONARY_ALREADY_EXISTS;
extern const int CANNOT_GET_CREATE_DICTIONARY_QUERY;
}
void DatabaseWithDictionaries::attachDictionary(const String & dictionary_name, const Context & context)
{
String full_name = getDatabaseName() + "." + dictionary_name;
{
std::lock_guard lock(mutex);
if (!dictionaries.emplace(dictionary_name).second)
throw Exception("Dictionary " + full_name + " already exists.", ErrorCodes::DICTIONARY_ALREADY_EXISTS);
}
/// ExternalLoader::reloadConfig() will find out that the dictionary's config has been added
/// and, if `dictionaries_lazy_load == false`, it will load the dictionary.
const auto & external_loader = context.getExternalDictionariesLoader();
external_loader.reloadConfig(getDatabaseName(), full_name);
}
void DatabaseWithDictionaries::detachDictionary(const String & dictionary_name, const Context & context)
{
String full_name = getDatabaseName() + "." + dictionary_name;
{
std::lock_guard lock(mutex);
auto it = dictionaries.find(dictionary_name);
if (it == dictionaries.end())
throw Exception("Dictionary " + full_name + " doesn't exist.", ErrorCodes::UNKNOWN_TABLE);
dictionaries.erase(it);
}
/// ExternalLoader::reloadConfig() will find out that the dictionary's config has been removed
/// and therefore it will unload the dictionary.
const auto & external_loader = context.getExternalDictionariesLoader();
external_loader.reloadConfig(getDatabaseName(), full_name);
}
void DatabaseWithDictionaries::createDictionary(const Context & context, const String & dictionary_name, const ASTPtr & query)
{
const auto & settings = context.getSettingsRef();
/** The code relies on all threads performing these operations in the same order:
* - create the .sql.tmp file;
* - add the dictionary to ExternalDictionariesLoader;
* - load the dictionary in case dictionaries_lazy_load == false;
* - attach the dictionary;
* - rename .sql.tmp to .sql.
*/
/// A race condition would be possible if a dictionary with the same name is simultaneously created using CREATE and using ATTACH.
/// But there is protection against it: see the use of DDLGuard in InterpreterCreateQuery.
if (isDictionaryExist(context, dictionary_name))
throw Exception("Dictionary " + backQuote(getDatabaseName()) + "." + backQuote(dictionary_name) + " already exists.", ErrorCodes::DICTIONARY_ALREADY_EXISTS);
/// A dictionary with the same full name could be defined in *.xml config files.
String full_name = getDatabaseName() + "." + dictionary_name;
const auto & external_loader = context.getExternalDictionariesLoader();
if (external_loader.getCurrentStatus(full_name) != ExternalLoader::Status::NOT_EXIST)
throw Exception(
"Dictionary " + backQuote(getDatabaseName()) + "." + backQuote(dictionary_name) + " already exists.",
ErrorCodes::DICTIONARY_ALREADY_EXISTS);
if (isTableExist(context, dictionary_name))
throw Exception("Table " + backQuote(getDatabaseName()) + "." + backQuote(dictionary_name) + " already exists.", ErrorCodes::TABLE_ALREADY_EXISTS);
String dictionary_metadata_path = getObjectMetadataPath(dictionary_name);
String dictionary_metadata_tmp_path = dictionary_metadata_path + ".tmp";
String statement = getObjectDefinitionFromCreateQuery(query);
{
/// The exclusive flag guarantees that the metadata file is not being created concurrently by another thread. Otherwise, an exception is thrown.
WriteBufferFromFile out(dictionary_metadata_tmp_path, statement.size(), O_WRONLY | O_CREAT | O_EXCL);
writeString(statement, out);
out.next();
if (settings.fsync_metadata)
out.sync();
out.close();
}
bool succeeded = false;
SCOPE_EXIT({
if (!succeeded)
Poco::File(dictionary_metadata_tmp_path).remove();
});
/// Add a temporary repository containing the dictionary.
/// We need this temp repository to try loading the dictionary before actually attaching it to the database.
auto temp_repository
= const_cast<ExternalDictionariesLoader &>(external_loader) /// the change of ExternalDictionariesLoader is temporary
.addConfigRepository(std::make_unique<ExternalLoaderTempConfigRepository>(
getDatabaseName(), dictionary_metadata_tmp_path, getDictionaryConfigurationFromAST(query->as<const ASTCreateQuery &>())));
bool lazy_load = context.getConfigRef().getBool("dictionaries_lazy_load", true);
if (!lazy_load)
{
/// load() is called here to force loading of the dictionary, to wait until the loading finishes,
/// and to throw an exception if the loading failed.
external_loader.load(full_name);
}
attachDictionary(dictionary_name, context);
SCOPE_EXIT({
if (!succeeded)
detachDictionary(dictionary_name, context);
});
/// If it was an ATTACH query and the file with dictionary metadata already exists
/// (i.e. ATTACH is done after DETACH), the rename atomically replaces the old file with the new one.
Poco::File(dictionary_metadata_tmp_path).renameTo(dictionary_metadata_path);
/// ExternalDictionariesLoader doesn't know we renamed the metadata path.
/// So we have to manually call reloadConfig() here.
external_loader.reloadConfig(getDatabaseName(), full_name);
/// Everything's ok.
succeeded = true;
}
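The pair of SCOPE_EXIT guards above forms a rollback ladder: each completed step arms an undo action that runs only if a later step throws before `succeeded` is set. The idiom in isolation, with hypothetical step functions:
/// Rollback-ladder sketch (hypothetical steps; SCOPE_EXIT is from ext/scope_guard).
bool succeeded = false;
writeTmpMetadata();                 /// hypothetical step 1
SCOPE_EXIT({
    if (!succeeded)
        removeTmpMetadata();        /// undo step 1 if anything below throws
});
attachObject();                     /// hypothetical step 2
SCOPE_EXIT({
    if (!succeeded)
        detachObject();             /// undo step 2 if anything below throws
});
commitMetadata();                   /// hypothetical final step
succeeded = true;                   /// disarms both guards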
void DatabaseWithDictionaries::removeDictionary(const Context & context, const String & dictionary_name)
{
detachDictionary(dictionary_name, context);
String dictionary_metadata_path = getObjectMetadataPath(dictionary_name);
try
{
Poco::File(dictionary_metadata_path).remove();
}
catch (...)
{
/// If the file could not be removed for some reason, restore the dictionary.
attachDictionary(dictionary_name, context);
throw;
}
}
StoragePtr DatabaseWithDictionaries::tryGetTable(const Context & context, const String & table_name) const
{
if (auto table_ptr = DatabaseWithOwnTablesBase::tryGetTable(context, table_name))
return table_ptr;
if (isDictionaryExist(context, table_name))
/// We don't need to lock the database here, because the database doesn't store the dictionary itself,
/// just its metadata.
return getDictionaryStorage(context, table_name);
return {};
}
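A consequence of this fallback, sketched from a hypothetical caller's side:
/// Hypothetical caller: dictionaries of this database are reachable through
/// the ordinary table interface, wrapped in a StorageDictionary on the fly.
if (StoragePtr storage = database->tryGetTable(context, "some_dict"))
{
    /// The same code path now serves both real tables and dictionaries.
}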
DatabaseTablesIteratorPtr DatabaseWithDictionaries::getTablesWithDictionaryTablesIterator(const Context & context, const FilterByNameFunction & filter_by_name)
{
/// NOTE: it's not atomic
auto tables_it = getTablesIterator(context, filter_by_name);
auto dictionaries_it = getDictionariesIterator(context, filter_by_name);
Tables result;
while (tables_it && tables_it->isValid())
{
result.emplace(tables_it->name(), tables_it->table());
tables_it->next();
}
while (dictionaries_it && dictionaries_it->isValid())
{
auto table_name = dictionaries_it->name();
auto table_ptr = getDictionaryStorage(context, table_name);
if (table_ptr)
result.emplace(table_name, table_ptr);
dictionaries_it->next();
}
return std::make_unique<DatabaseTablesSnapshotIterator>(result);
}
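Since tables are inserted into the result map first and emplace never overwrites an existing key, a dictionary sharing a name with a table does not shadow it (assuming `Tables` is the usual map keyed by object name). A tiny self-contained illustration of the emplace semantics:
#include <cassert>
#include <map>
#include <string>

int main()
{
    /// std::map::emplace is a no-op for a key that is already present,
    /// which is why the object inserted first wins on a name collision.
    std::map<std::string, int> m;
    m.emplace("x", 1);
    m.emplace("x", 2);   /// ignored; "x" keeps its first value
    assert(m.at("x") == 1);
}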
DatabaseDictionariesIteratorPtr DatabaseWithDictionaries::getDictionariesIterator(const Context & /*context*/, const FilterByNameFunction & filter_by_dictionary_name)
{
std::lock_guard lock(mutex);
if (!filter_by_dictionary_name)
return std::make_unique<DatabaseDictionariesSnapshotIterator>(dictionaries);
Dictionaries filtered_dictionaries;
for (const auto & dictionary_name : dictionaries)
if (filter_by_dictionary_name(dictionary_name))
filtered_dictionaries.emplace(dictionary_name);
return std::make_unique<DatabaseDictionariesSnapshotIterator>(std::move(filtered_dictionaries));
}
bool DatabaseWithDictionaries::isDictionaryExist(const Context & /*context*/, const String & dictionary_name) const
{
std::lock_guard lock(mutex);
return dictionaries.find(dictionary_name) != dictionaries.end();
}
StoragePtr DatabaseWithDictionaries::getDictionaryStorage(const Context & context, const String & table_name) const
{
auto dict_name = database_name + "." + table_name;
const auto & external_loader = context.getExternalDictionariesLoader();
auto dict_ptr = external_loader.tryGetDictionary(dict_name);
if (dict_ptr)
{
const DictionaryStructure & dictionary_structure = dict_ptr->getStructure();
auto columns = StorageDictionary::getNamesAndTypes(dictionary_structure);
return StorageDictionary::create(database_name, table_name, ColumnsDescription{columns}, context, true, dict_name);
}
return nullptr;
}
ASTPtr DatabaseWithDictionaries::getCreateDictionaryQueryImpl(
const Context & context,
const String & dictionary_name,
bool throw_on_error) const
{
ASTPtr ast;
auto dictionary_metadata_path = getObjectMetadataPath(dictionary_name);
ast = getCreateQueryFromMetadata(dictionary_metadata_path, throw_on_error);
if (!ast && throw_on_error)
{
/// Handle the case when the dictionary exists but its metadata file is missing.
bool has_dictionary = isDictionaryExist(context, dictionary_name);
auto msg = has_dictionary ? "There is no CREATE DICTIONARY query for dictionary " : "There is no metadata file for dictionary ";
throw Exception(msg + backQuote(dictionary_name), ErrorCodes::CANNOT_GET_CREATE_DICTIONARY_QUERY);
}
return ast;
}
void DatabaseWithDictionaries::shutdown()
{
detachFromExternalDictionariesLoader();
DatabaseOnDisk::shutdown();
}
DatabaseWithDictionaries::~DatabaseWithDictionaries() = default;
void DatabaseWithDictionaries::attachToExternalDictionariesLoader(Context & context)
{
database_as_config_repo_for_external_loader = context.getExternalDictionariesLoader().addConfigRepository(
std::make_unique<ExternalLoaderDatabaseConfigRepository>(*this, context));
}
void DatabaseWithDictionaries::detachFromExternalDictionariesLoader()
{
database_as_config_repo_for_external_loader = {};
}
}

Some files were not shown because too many files have changed in this diff