Merge master

This commit is contained in:
kssenii 2022-03-28 22:51:56 +02:00
commit 1d49a85963
151 changed files with 1078 additions and 506 deletions

View File

@ -210,3 +210,6 @@ CheckOptions:
value: false value: false
- key: performance-move-const-arg.CheckTriviallyCopyableMove - key: performance-move-const-arg.CheckTriviallyCopyableMove
value: false value: false
# Workaround clang-tidy bug: https://github.com/llvm/llvm-project/issues/46097
- key: readability-identifier-naming.TypeTemplateParameterIgnoredRegexp
value: expr-type

View File

@ -11,6 +11,7 @@
* Make `arrayCompact` function behave as other higher-order functions: perform compaction not of lambda function results but on the original array. If you're using nontrivial lambda functions in arrayCompact you may restore old behaviour by wrapping `arrayCompact` arguments into `arrayMap`. Closes [#34010](https://github.com/ClickHouse/ClickHouse/issues/34010) [#18535](https://github.com/ClickHouse/ClickHouse/issues/18535) [#14778](https://github.com/ClickHouse/ClickHouse/issues/14778). [#34795](https://github.com/ClickHouse/ClickHouse/pull/34795) ([Alexandre Snarskii](https://github.com/snar)). * Make `arrayCompact` function behave as other higher-order functions: perform compaction not of lambda function results but on the original array. If you're using nontrivial lambda functions in arrayCompact you may restore old behaviour by wrapping `arrayCompact` arguments into `arrayMap`. Closes [#34010](https://github.com/ClickHouse/ClickHouse/issues/34010) [#18535](https://github.com/ClickHouse/ClickHouse/issues/18535) [#14778](https://github.com/ClickHouse/ClickHouse/issues/14778). [#34795](https://github.com/ClickHouse/ClickHouse/pull/34795) ([Alexandre Snarskii](https://github.com/snar)).
* Change implementation specific behavior on overflow of function `toDatetime`. It will be saturated to the nearest min/max supported instant of datetime instead of wraparound. This change is highlighted as "backward incompatible" because someone may unintentionally rely on the old behavior. [#32898](https://github.com/ClickHouse/ClickHouse/pull/32898) ([HaiBo Li](https://github.com/marising)). * Change implementation specific behavior on overflow of function `toDatetime`. It will be saturated to the nearest min/max supported instant of datetime instead of wraparound. This change is highlighted as "backward incompatible" because someone may unintentionally rely on the old behavior. [#32898](https://github.com/ClickHouse/ClickHouse/pull/32898) ([HaiBo Li](https://github.com/marising)).
* Make function `cast(value, 'IPv4')`, `cast(value, 'IPv6')` behave same as `toIPv4`, `toIPv6` functions. Changed behavior of incorrect IP address passed into functions `toIPv4`,` toIPv6`, now if invalid IP address passes into this functions exception will be raised, before this function return default value. Added functions `IPv4StringToNumOrDefault`, `IPv4StringToNumOrNull`, `IPv6StringToNumOrDefault`, `IPv6StringOrNull` `toIPv4OrDefault`, `toIPv4OrNull`, `toIPv6OrDefault`, `toIPv6OrNull`. Functions `IPv4StringToNumOrDefault `, `toIPv4OrDefault `, `toIPv6OrDefault ` should be used if previous logic relied on `IPv4StringToNum`, `toIPv4`, `toIPv6` returning default value for invalid address. Added setting `cast_ipv4_ipv6_default_on_conversion_error`, if this setting enabled, then IP address conversion functions will behave as before. Closes [#22825](https://github.com/ClickHouse/ClickHouse/issues/22825). Closes [#5799](https://github.com/ClickHouse/ClickHouse/issues/5799). Closes [#35156](https://github.com/ClickHouse/ClickHouse/issues/35156). [#35240](https://github.com/ClickHouse/ClickHouse/pull/35240) ([Maksim Kita](https://github.com/kitaisreal)).
#### New Feature #### New Feature

View File

@ -261,12 +261,12 @@ endif ()
# Add a section with the hash of the compiled machine code for integrity checks. # Add a section with the hash of the compiled machine code for integrity checks.
# Only for official builds, because adding a section can be time consuming (rewrite of several GB). # Only for official builds, because adding a section can be time consuming (rewrite of several GB).
# And cross compiled binaries are not supported (since you cannot execute clickhouse hash-binary) # And cross compiled binaries are not supported (since you cannot execute clickhouse hash-binary)
if (OBJCOPY_PATH AND YANDEX_OFFICIAL_BUILD AND (NOT CMAKE_TOOLCHAIN_FILE)) if (OBJCOPY_PATH AND CLICKHOUSE_OFFICIAL_BUILD AND (NOT CMAKE_TOOLCHAIN_FILE))
set (USE_BINARY_HASH 1) set (USE_BINARY_HASH 1)
endif () endif ()
# Allows to build stripped binary in a separate directory # Allows to build stripped binary in a separate directory
if (OBJCOPY_PATH AND READELF_PATH) if (OBJCOPY_PATH AND STRIP_PATH)
option(INSTALL_STRIPPED_BINARIES "Build stripped binaries with debug info in separate directory" OFF) option(INSTALL_STRIPPED_BINARIES "Build stripped binaries with debug info in separate directory" OFF)
if (INSTALL_STRIPPED_BINARIES) if (INSTALL_STRIPPED_BINARIES)
set(STRIPPED_BINARIES_OUTPUT "stripped" CACHE STRING "A separate directory for stripped information") set(STRIPPED_BINARIES_OUTPUT "stripped" CACHE STRING "A separate directory for stripped information")

View File

@ -51,6 +51,6 @@ if (GLIBC_COMPATIBILITY)
message (STATUS "Some symbols from glibc will be replaced for compatibility") message (STATUS "Some symbols from glibc will be replaced for compatibility")
elseif (YANDEX_OFFICIAL_BUILD) elseif (CLICKHOUSE_OFFICIAL_BUILD)
message (WARNING "Option GLIBC_COMPATIBILITY must be turned on for production builds.") message (WARNING "Option GLIBC_COMPATIBILITY must be turned on for production builds.")
endif () endif ()

View File

@ -1,28 +0,0 @@
#!/usr/bin/env bash
BINARY_PATH=$1
BINARY_NAME=$(basename "$BINARY_PATH")
DESTINATION_STRIPPED_DIR=$2
OBJCOPY_PATH=${3:objcopy}
READELF_PATH=${4:readelf}
BUILD_ID=$($READELF_PATH -n "$1" | sed -n '/Build ID/ { s/.*: //p; q; }')
BUILD_ID_PREFIX=${BUILD_ID:0:2}
BUILD_ID_SUFFIX=${BUILD_ID:2}
DESTINATION_DEBUG_INFO_DIR="$DESTINATION_STRIPPED_DIR/lib/debug/.build-id"
DESTINATION_STRIP_BINARY_DIR="$DESTINATION_STRIPPED_DIR/bin"
mkdir -p "$DESTINATION_DEBUG_INFO_DIR/$BUILD_ID_PREFIX"
mkdir -p "$DESTINATION_STRIP_BINARY_DIR"
cp "$BINARY_PATH" "$DESTINATION_STRIP_BINARY_DIR/$BINARY_NAME"
$OBJCOPY_PATH --only-keep-debug --compress-debug-sections "$DESTINATION_STRIP_BINARY_DIR/$BINARY_NAME" "$DESTINATION_DEBUG_INFO_DIR/$BUILD_ID_PREFIX/$BUILD_ID_SUFFIX.debug"
chmod 0644 "$DESTINATION_DEBUG_INFO_DIR/$BUILD_ID_PREFIX/$BUILD_ID_SUFFIX.debug"
chown 0:0 "$DESTINATION_DEBUG_INFO_DIR/$BUILD_ID_PREFIX/$BUILD_ID_SUFFIX.debug"
strip --remove-section=.comment --remove-section=.note "$DESTINATION_STRIP_BINARY_DIR/$BINARY_NAME"
$OBJCOPY_PATH --add-gnu-debuglink "$DESTINATION_DEBUG_INFO_DIR/$BUILD_ID_PREFIX/$BUILD_ID_SUFFIX.debug" "$DESTINATION_STRIP_BINARY_DIR/$BINARY_NAME"

View File

@ -11,16 +11,43 @@ macro(clickhouse_strip_binary)
message(FATAL_ERROR "A binary path name must be provided for stripping binary") message(FATAL_ERROR "A binary path name must be provided for stripping binary")
endif() endif()
if (NOT DEFINED STRIP_DESTINATION_DIR) if (NOT DEFINED STRIP_DESTINATION_DIR)
message(FATAL_ERROR "Destination directory for stripped binary must be provided") message(FATAL_ERROR "Destination directory for stripped binary must be provided")
endif() endif()
add_custom_command(TARGET ${STRIP_TARGET} POST_BUILD add_custom_command(TARGET ${STRIP_TARGET} POST_BUILD
COMMAND bash ${ClickHouse_SOURCE_DIR}/cmake/strip.sh ${STRIP_BINARY_PATH} ${STRIP_DESTINATION_DIR} ${OBJCOPY_PATH} ${READELF_PATH} COMMAND mkdir -p "${STRIP_DESTINATION_DIR}/lib/debug/bin"
COMMAND mkdir -p "${STRIP_DESTINATION_DIR}/bin"
COMMAND cp "${STRIP_BINARY_PATH}" "${STRIP_DESTINATION_DIR}/bin/${STRIP_TARGET}"
COMMAND "${OBJCOPY_PATH}" --only-keep-debug --compress-debug-sections "${STRIP_DESTINATION_DIR}/bin/${STRIP_TARGET}" "${STRIP_DESTINATION_DIR}/lib/debug/bin/${STRIP_TARGET}.debug"
COMMAND chmod 0644 "${STRIP_DESTINATION_DIR}/lib/debug/bin/${STRIP_TARGET}.debug"
COMMAND "${STRIP_PATH}" --remove-section=.comment --remove-section=.note "${STRIP_DESTINATION_DIR}/bin/${STRIP_TARGET}"
COMMAND "${OBJCOPY_PATH}" --add-gnu-debuglink "${STRIP_DESTINATION_DIR}/lib/debug/bin/${STRIP_TARGET}.debug" "${STRIP_DESTINATION_DIR}/bin/${STRIP_TARGET}"
COMMENT "Stripping clickhouse binary" VERBATIM COMMENT "Stripping clickhouse binary" VERBATIM
) )
install(PROGRAMS ${STRIP_DESTINATION_DIR}/bin/${STRIP_TARGET} DESTINATION ${CMAKE_INSTALL_BINDIR} COMPONENT clickhouse) install(PROGRAMS ${STRIP_DESTINATION_DIR}/bin/${STRIP_TARGET} DESTINATION ${CMAKE_INSTALL_BINDIR} COMPONENT clickhouse)
install(DIRECTORY ${STRIP_DESTINATION_DIR}/lib/debug DESTINATION ${CMAKE_INSTALL_LIBDIR} COMPONENT clickhouse) install(FILES ${STRIP_DESTINATION_DIR}/lib/debug/bin/${STRIP_TARGET}.debug DESTINATION ${CMAKE_INSTALL_LIBDIR}/debug/${CMAKE_INSTALL_FULL_BINDIR}/${STRIP_TARGET}.debug COMPONENT clickhouse)
endmacro()
macro(clickhouse_make_empty_debug_info_for_nfpm)
set(oneValueArgs TARGET DESTINATION_DIR)
cmake_parse_arguments(EMPTY_DEBUG "" "${oneValueArgs}" "" ${ARGN})
if (NOT DEFINED EMPTY_DEBUG_TARGET)
message(FATAL_ERROR "A target name must be provided for stripping binary")
endif()
if (NOT DEFINED EMPTY_DEBUG_DESTINATION_DIR)
message(FATAL_ERROR "Destination directory for empty debug must be provided")
endif()
add_custom_command(TARGET ${EMPTY_DEBUG_TARGET} POST_BUILD
COMMAND mkdir -p "${EMPTY_DEBUG_DESTINATION_DIR}/lib/debug"
COMMAND touch "${EMPTY_DEBUG_DESTINATION_DIR}/lib/debug/${EMPTY_DEBUG_TARGET}.debug"
COMMENT "Addiding empty debug info for NFPM" VERBATIM
)
install(FILES "${EMPTY_DEBUG_DESTINATION_DIR}/lib/debug/${EMPTY_DEBUG_TARGET}.debug" DESTINATION "${CMAKE_INSTALL_LIBDIR}/debug/${CMAKE_INSTALL_FULL_BINDIR}" COMPONENT clickhouse)
endmacro() endmacro()

View File

@ -170,32 +170,32 @@ else ()
message (FATAL_ERROR "Cannot find objcopy.") message (FATAL_ERROR "Cannot find objcopy.")
endif () endif ()
# Readelf (FIXME copypaste) # Strip (FIXME copypaste)
if (COMPILER_GCC) if (COMPILER_GCC)
find_program (READELF_PATH NAMES "llvm-readelf" "llvm-readelf-13" "llvm-readelf-12" "llvm-readelf-11" "readelf") find_program (STRIP_PATH NAMES "llvm-strip" "llvm-strip-13" "llvm-strip-12" "llvm-strip-11" "strip")
else () else ()
find_program (READELF_PATH NAMES "llvm-readelf-${COMPILER_VERSION_MAJOR}" "llvm-readelf" "readelf") find_program (STRIP_PATH NAMES "llvm-strip-${COMPILER_VERSION_MAJOR}" "llvm-strip" "strip")
endif () endif ()
if (NOT READELF_PATH AND OS_DARWIN) if (NOT STRIP_PATH AND OS_DARWIN)
find_program (BREW_PATH NAMES "brew") find_program (BREW_PATH NAMES "brew")
if (BREW_PATH) if (BREW_PATH)
execute_process (COMMAND ${BREW_PATH} --prefix llvm ERROR_QUIET OUTPUT_STRIP_TRAILING_WHITESPACE OUTPUT_VARIABLE LLVM_PREFIX) execute_process (COMMAND ${BREW_PATH} --prefix llvm ERROR_QUIET OUTPUT_STRIP_TRAILING_WHITESPACE OUTPUT_VARIABLE LLVM_PREFIX)
if (LLVM_PREFIX) if (LLVM_PREFIX)
find_program (READELF_PATH NAMES "llvm-readelf" PATHS "${LLVM_PREFIX}/bin" NO_DEFAULT_PATH) find_program (STRIP_PATH NAMES "llvm-strip" PATHS "${LLVM_PREFIX}/bin" NO_DEFAULT_PATH)
endif () endif ()
if (NOT READELF_PATH) if (NOT STRIP_PATH)
execute_process (COMMAND ${BREW_PATH} --prefix binutils ERROR_QUIET OUTPUT_STRIP_TRAILING_WHITESPACE OUTPUT_VARIABLE BINUTILS_PREFIX) execute_process (COMMAND ${BREW_PATH} --prefix binutils ERROR_QUIET OUTPUT_STRIP_TRAILING_WHITESPACE OUTPUT_VARIABLE BINUTILS_PREFIX)
if (BINUTILS_PREFIX) if (BINUTILS_PREFIX)
find_program (READELF_PATH NAMES "readelf" PATHS "${BINUTILS_PREFIX}/bin" NO_DEFAULT_PATH) find_program (STRIP_PATH NAMES "strip" PATHS "${BINUTILS_PREFIX}/bin" NO_DEFAULT_PATH)
endif () endif ()
endif () endif ()
endif () endif ()
endif () endif ()
if (READELF_PATH) if (STRIP_PATH)
message (STATUS "Using readelf: ${READELF_PATH}") message (STATUS "Using strip: ${STRIP_PATH}")
else () else ()
message (FATAL_ERROR "Cannot find readelf.") message (FATAL_ERROR "Cannot find strip.")
endif () endif ()

View File

@ -18,6 +18,6 @@ set (VERSION_STRING_SHORT "${VERSION_MAJOR}.${VERSION_MINOR}")
math (EXPR VERSION_INTEGER "${VERSION_PATCH} + ${VERSION_MINOR}*1000 + ${VERSION_MAJOR}*1000000") math (EXPR VERSION_INTEGER "${VERSION_PATCH} + ${VERSION_MINOR}*1000 + ${VERSION_MAJOR}*1000000")
if(YANDEX_OFFICIAL_BUILD) if(CLICKHOUSE_OFFICIAL_BUILD)
set(VERSION_OFFICIAL " (official build)") set(VERSION_OFFICIAL " (official build)")
endif() endif()

2
contrib/hyperscan vendored

@ -1 +1 @@
Subproject commit e9f08df0213fc637aac0a5bbde9beeaeba2fe9fa Subproject commit 5edc68c5ac68d2d4f876159e9ee84def6d3dc87c

2
contrib/libcxx vendored

@ -1 +1 @@
Subproject commit 61e60294b1de01483caa9f5d00f437c99b674de6 Subproject commit 172b2ae074f6755145b91c53a95c8540c1468239

View File

@ -18,12 +18,14 @@ set(SRCS
"${LIBCXX_SOURCE_DIR}/src/filesystem/directory_iterator.cpp" "${LIBCXX_SOURCE_DIR}/src/filesystem/directory_iterator.cpp"
"${LIBCXX_SOURCE_DIR}/src/filesystem/int128_builtins.cpp" "${LIBCXX_SOURCE_DIR}/src/filesystem/int128_builtins.cpp"
"${LIBCXX_SOURCE_DIR}/src/filesystem/operations.cpp" "${LIBCXX_SOURCE_DIR}/src/filesystem/operations.cpp"
"${LIBCXX_SOURCE_DIR}/src/format.cpp"
"${LIBCXX_SOURCE_DIR}/src/functional.cpp" "${LIBCXX_SOURCE_DIR}/src/functional.cpp"
"${LIBCXX_SOURCE_DIR}/src/future.cpp" "${LIBCXX_SOURCE_DIR}/src/future.cpp"
"${LIBCXX_SOURCE_DIR}/src/hash.cpp" "${LIBCXX_SOURCE_DIR}/src/hash.cpp"
"${LIBCXX_SOURCE_DIR}/src/ios.cpp" "${LIBCXX_SOURCE_DIR}/src/ios.cpp"
"${LIBCXX_SOURCE_DIR}/src/ios.instantiations.cpp" "${LIBCXX_SOURCE_DIR}/src/ios.instantiations.cpp"
"${LIBCXX_SOURCE_DIR}/src/iostream.cpp" "${LIBCXX_SOURCE_DIR}/src/iostream.cpp"
"${LIBCXX_SOURCE_DIR}/src/legacy_pointer_safety.cpp"
"${LIBCXX_SOURCE_DIR}/src/locale.cpp" "${LIBCXX_SOURCE_DIR}/src/locale.cpp"
"${LIBCXX_SOURCE_DIR}/src/memory.cpp" "${LIBCXX_SOURCE_DIR}/src/memory.cpp"
"${LIBCXX_SOURCE_DIR}/src/mutex.cpp" "${LIBCXX_SOURCE_DIR}/src/mutex.cpp"
@ -33,6 +35,9 @@ set(SRCS
"${LIBCXX_SOURCE_DIR}/src/random.cpp" "${LIBCXX_SOURCE_DIR}/src/random.cpp"
"${LIBCXX_SOURCE_DIR}/src/random_shuffle.cpp" "${LIBCXX_SOURCE_DIR}/src/random_shuffle.cpp"
"${LIBCXX_SOURCE_DIR}/src/regex.cpp" "${LIBCXX_SOURCE_DIR}/src/regex.cpp"
"${LIBCXX_SOURCE_DIR}/src/ryu/d2fixed.cpp"
"${LIBCXX_SOURCE_DIR}/src/ryu/d2s.cpp"
"${LIBCXX_SOURCE_DIR}/src/ryu/f2s.cpp"
"${LIBCXX_SOURCE_DIR}/src/shared_mutex.cpp" "${LIBCXX_SOURCE_DIR}/src/shared_mutex.cpp"
"${LIBCXX_SOURCE_DIR}/src/stdexcept.cpp" "${LIBCXX_SOURCE_DIR}/src/stdexcept.cpp"
"${LIBCXX_SOURCE_DIR}/src/string.cpp" "${LIBCXX_SOURCE_DIR}/src/string.cpp"
@ -49,7 +54,9 @@ set(SRCS
add_library(cxx ${SRCS}) add_library(cxx ${SRCS})
set_target_properties(cxx PROPERTIES FOLDER "contrib/libcxx-cmake") set_target_properties(cxx PROPERTIES FOLDER "contrib/libcxx-cmake")
target_include_directories(cxx SYSTEM BEFORE PUBLIC $<BUILD_INTERFACE:${LIBCXX_SOURCE_DIR}/include>) target_include_directories(cxx SYSTEM BEFORE PUBLIC
$<BUILD_INTERFACE:${LIBCXX_SOURCE_DIR}/include>
$<BUILD_INTERFACE:${LIBCXX_SOURCE_DIR}>/src)
target_compile_definitions(cxx PRIVATE -D_LIBCPP_BUILDING_LIBRARY -DLIBCXX_BUILDING_LIBCXXABI) target_compile_definitions(cxx PRIVATE -D_LIBCPP_BUILDING_LIBRARY -DLIBCXX_BUILDING_LIBCXXABI)
# Enable capturing stack traces for all exceptions. # Enable capturing stack traces for all exceptions.

2
contrib/libcxxabi vendored

@ -1 +1 @@
Subproject commit df8f1e727dbc9e2bedf2282096fa189dc3fe0076 Subproject commit 6eb7cc7a7bdd779e6734d1b9fb451df2274462d7

View File

@ -1,24 +1,24 @@
set(LIBCXXABI_SOURCE_DIR "${ClickHouse_SOURCE_DIR}/contrib/libcxxabi") set(LIBCXXABI_SOURCE_DIR "${ClickHouse_SOURCE_DIR}/contrib/libcxxabi")
set(SRCS set(SRCS
"${LIBCXXABI_SOURCE_DIR}/src/stdlib_stdexcept.cpp"
"${LIBCXXABI_SOURCE_DIR}/src/cxa_virtual.cpp"
"${LIBCXXABI_SOURCE_DIR}/src/cxa_thread_atexit.cpp"
"${LIBCXXABI_SOURCE_DIR}/src/fallback_malloc.cpp"
"${LIBCXXABI_SOURCE_DIR}/src/cxa_guard.cpp"
"${LIBCXXABI_SOURCE_DIR}/src/cxa_default_handlers.cpp"
"${LIBCXXABI_SOURCE_DIR}/src/cxa_personality.cpp"
"${LIBCXXABI_SOURCE_DIR}/src/stdlib_exception.cpp"
"${LIBCXXABI_SOURCE_DIR}/src/abort_message.cpp" "${LIBCXXABI_SOURCE_DIR}/src/abort_message.cpp"
"${LIBCXXABI_SOURCE_DIR}/src/cxa_aux_runtime.cpp"
"${LIBCXXABI_SOURCE_DIR}/src/cxa_default_handlers.cpp"
"${LIBCXXABI_SOURCE_DIR}/src/cxa_demangle.cpp" "${LIBCXXABI_SOURCE_DIR}/src/cxa_demangle.cpp"
"${LIBCXXABI_SOURCE_DIR}/src/cxa_exception.cpp" "${LIBCXXABI_SOURCE_DIR}/src/cxa_exception.cpp"
"${LIBCXXABI_SOURCE_DIR}/src/cxa_handlers.cpp"
"${LIBCXXABI_SOURCE_DIR}/src/cxa_exception_storage.cpp" "${LIBCXXABI_SOURCE_DIR}/src/cxa_exception_storage.cpp"
"${LIBCXXABI_SOURCE_DIR}/src/private_typeinfo.cpp" "${LIBCXXABI_SOURCE_DIR}/src/cxa_guard.cpp"
"${LIBCXXABI_SOURCE_DIR}/src/stdlib_typeinfo.cpp" "${LIBCXXABI_SOURCE_DIR}/src/cxa_handlers.cpp"
"${LIBCXXABI_SOURCE_DIR}/src/cxa_aux_runtime.cpp" "${LIBCXXABI_SOURCE_DIR}/src/cxa_personality.cpp"
"${LIBCXXABI_SOURCE_DIR}/src/cxa_thread_atexit.cpp"
"${LIBCXXABI_SOURCE_DIR}/src/cxa_vector.cpp" "${LIBCXXABI_SOURCE_DIR}/src/cxa_vector.cpp"
"${LIBCXXABI_SOURCE_DIR}/src/cxa_virtual.cpp"
"${LIBCXXABI_SOURCE_DIR}/src/fallback_malloc.cpp"
"${LIBCXXABI_SOURCE_DIR}/src/private_typeinfo.cpp"
"${LIBCXXABI_SOURCE_DIR}/src/stdlib_exception.cpp"
"${LIBCXXABI_SOURCE_DIR}/src/stdlib_new_delete.cpp" "${LIBCXXABI_SOURCE_DIR}/src/stdlib_new_delete.cpp"
"${LIBCXXABI_SOURCE_DIR}/src/stdlib_stdexcept.cpp"
"${LIBCXXABI_SOURCE_DIR}/src/stdlib_typeinfo.cpp"
) )
add_library(cxxabi ${SRCS}) add_library(cxxabi ${SRCS})
@ -30,6 +30,7 @@ target_compile_options(cxxabi PRIVATE -w)
target_include_directories(cxxabi SYSTEM BEFORE target_include_directories(cxxabi SYSTEM BEFORE
PUBLIC $<BUILD_INTERFACE:${LIBCXXABI_SOURCE_DIR}/include> PUBLIC $<BUILD_INTERFACE:${LIBCXXABI_SOURCE_DIR}/include>
PRIVATE $<BUILD_INTERFACE:${LIBCXXABI_SOURCE_DIR}/../libcxx/include> PRIVATE $<BUILD_INTERFACE:${LIBCXXABI_SOURCE_DIR}/../libcxx/include>
PRIVATE $<BUILD_INTERFACE:${LIBCXXABI_SOURCE_DIR}/../libcxx/src>
) )
target_compile_definitions(cxxabi PRIVATE -D_LIBCPP_BUILDING_LIBRARY) target_compile_definitions(cxxabi PRIVATE -D_LIBCPP_BUILDING_LIBRARY)
target_compile_options(cxxabi PRIVATE -nostdinc++ -fno-sanitize=undefined -Wno-macro-redefined) # If we don't disable UBSan, infinite recursion happens in dynamic_cast. target_compile_options(cxxabi PRIVATE -nostdinc++ -fno-sanitize=undefined -Wno-macro-redefined) # If we don't disable UBSan, infinite recursion happens in dynamic_cast.

2
contrib/replxx vendored

@ -1 +1 @@
Subproject commit 9460e5e0fc10f78f460af26a6bd928798cac864d Subproject commit 6f0b6f151ae2a044625ae93acd19ca365fcea64d

View File

@ -1,4 +1,3 @@
# rebuild in #33610
# docker build -t clickhouse/docs-check . # docker build -t clickhouse/docs-check .
ARG FROM_TAG=latest ARG FROM_TAG=latest
FROM clickhouse/docs-builder:$FROM_TAG FROM clickhouse/docs-builder:$FROM_TAG

View File

@ -163,6 +163,7 @@ def parse_env_variables(
cmake_flags.append("-DCMAKE_INSTALL_PREFIX=/usr") cmake_flags.append("-DCMAKE_INSTALL_PREFIX=/usr")
cmake_flags.append("-DCMAKE_INSTALL_SYSCONFDIR=/etc") cmake_flags.append("-DCMAKE_INSTALL_SYSCONFDIR=/etc")
cmake_flags.append("-DCMAKE_INSTALL_LOCALSTATEDIR=/var") cmake_flags.append("-DCMAKE_INSTALL_LOCALSTATEDIR=/var")
cmake_flags.append("-DBUILD_STANDALONE_KEEPER=ON")
if is_release_build(build_type, package_type, sanitizer, split_binary): if is_release_build(build_type, package_type, sanitizer, split_binary):
cmake_flags.append("-DINSTALL_STRIPPED_BINARIES=ON") cmake_flags.append("-DINSTALL_STRIPPED_BINARIES=ON")
@ -244,7 +245,7 @@ def parse_env_variables(
result.append(f"AUTHOR='{author}'") result.append(f"AUTHOR='{author}'")
if official: if official:
cmake_flags.append("-DYANDEX_OFFICIAL_BUILD=1") cmake_flags.append("-DCLICKHOUSE_OFFICIAL_BUILD=1")
result.append('CMAKE_FLAGS="' + " ".join(cmake_flags) + '"') result.append('CMAKE_FLAGS="' + " ".join(cmake_flags) + '"')

View File

@ -267,6 +267,7 @@ function run_tests
local test_opts=( local test_opts=(
--hung-check --hung-check
--fast-tests-only --fast-tests-only
--no-random-settings
--no-long --no-long
--testname --testname
--shard --shard

View File

@ -13,7 +13,7 @@ script_dir="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"
echo "$script_dir" echo "$script_dir"
repo_dir=ch repo_dir=ch
BINARY_TO_DOWNLOAD=${BINARY_TO_DOWNLOAD:="clang-13_debug_none_bundled_unsplitted_disable_False_binary"} BINARY_TO_DOWNLOAD=${BINARY_TO_DOWNLOAD:="clang-13_debug_none_bundled_unsplitted_disable_False_binary"}
BINARY_URL_TO_DOWNLOAD=${BINARY_URL_TO_DOWNLOAD:="https://clickhouse-builds.s3.yandex.net/$PR_TO_TEST/$SHA_TO_TEST/clickhouse_build_check/$BINARY_TO_DOWNLOAD/clickhouse"} BINARY_URL_TO_DOWNLOAD=${BINARY_URL_TO_DOWNLOAD:="https://clickhouse-builds.s3.amazonaws.com/$PR_TO_TEST/$SHA_TO_TEST/clickhouse_build_check/$BINARY_TO_DOWNLOAD/clickhouse"}
function clone function clone
{ {

View File

@ -2,7 +2,7 @@
set -euo pipefail set -euo pipefail
CLICKHOUSE_PACKAGE=${CLICKHOUSE_PACKAGE:="https://clickhouse-builds.s3.yandex.net/$PR_TO_TEST/$SHA_TO_TEST/clickhouse_build_check/clang-13_relwithdebuginfo_none_bundled_unsplitted_disable_False_binary/clickhouse"} CLICKHOUSE_PACKAGE=${CLICKHOUSE_PACKAGE:="https://clickhouse-builds.s3.amazonaws.com/$PR_TO_TEST/$SHA_TO_TEST/clickhouse_build_check/clang-13_relwithdebuginfo_none_bundled_unsplitted_disable_False_binary/clickhouse"}
CLICKHOUSE_REPO_PATH=${CLICKHOUSE_REPO_PATH:=""} CLICKHOUSE_REPO_PATH=${CLICKHOUSE_REPO_PATH:=""}
@ -10,7 +10,7 @@ if [ -z "$CLICKHOUSE_REPO_PATH" ]; then
CLICKHOUSE_REPO_PATH=ch CLICKHOUSE_REPO_PATH=ch
rm -rf ch ||: rm -rf ch ||:
mkdir ch ||: mkdir ch ||:
wget -nv -nd -c "https://clickhouse-test-reports.s3.yandex.net/$PR_TO_TEST/$SHA_TO_TEST/repo/clickhouse_no_subs.tar.gz" wget -nv -nd -c "https://clickhouse-test-reports.s3.amazonaws.com/$PR_TO_TEST/$SHA_TO_TEST/repo/clickhouse_no_subs.tar.gz"
tar -C ch --strip-components=1 -xf clickhouse_no_subs.tar.gz tar -C ch --strip-components=1 -xf clickhouse_no_subs.tar.gz
ls -lath ||: ls -lath ||:
fi fi

View File

@ -1294,15 +1294,15 @@ create table ci_checks engine File(TSVWithNamesAndTypes, 'ci-checks.tsv')
select '' test_name, select '' test_name,
'$(sed -n 's/.*<!--message: \(.*\)-->/\1/p' report.html)' test_status, '$(sed -n 's/.*<!--message: \(.*\)-->/\1/p' report.html)' test_status,
0 test_duration_ms, 0 test_duration_ms,
'https://clickhouse-test-reports.s3.yandex.net/$PR_TO_TEST/$SHA_TO_TEST/performance_comparison/report.html#fail1' report_url 'https://clickhouse-test-reports.s3.amazonaws.com/$PR_TO_TEST/$SHA_TO_TEST/performance_comparison/report.html#fail1' report_url
union all union all
select test || ' #' || toString(query_index), 'slower' test_status, 0 test_duration_ms, select test || ' #' || toString(query_index), 'slower' test_status, 0 test_duration_ms,
'https://clickhouse-test-reports.s3.yandex.net/$PR_TO_TEST/$SHA_TO_TEST/performance_comparison/report.html#changes-in-performance.' 'https://clickhouse-test-reports.s3.amazonaws.com/$PR_TO_TEST/$SHA_TO_TEST/performance_comparison/report.html#changes-in-performance.'
|| test || '.' || toString(query_index) report_url || test || '.' || toString(query_index) report_url
from queries where changed_fail != 0 and diff > 0 from queries where changed_fail != 0 and diff > 0
union all union all
select test || ' #' || toString(query_index), 'unstable' test_status, 0 test_duration_ms, select test || ' #' || toString(query_index), 'unstable' test_status, 0 test_duration_ms,
'https://clickhouse-test-reports.s3.yandex.net/$PR_TO_TEST/$SHA_TO_TEST/performance_comparison/report.html#unstable-queries.' 'https://clickhouse-test-reports.s3.amazonaws.com/$PR_TO_TEST/$SHA_TO_TEST/performance_comparison/report.html#unstable-queries.'
|| test || '.' || toString(query_index) report_url || test || '.' || toString(query_index) report_url
from queries where unstable_fail != 0 from queries where unstable_fail != 0
) )

View File

@ -16,26 +16,17 @@ right_sha=$4
datasets=${CHPC_DATASETS-"hits1 hits10 hits100 values"} datasets=${CHPC_DATASETS-"hits1 hits10 hits100 values"}
declare -A dataset_paths declare -A dataset_paths
if [[ $S3_URL == *"s3.amazonaws.com"* ]]; then
dataset_paths["hits10"]="https://clickhouse-private-datasets.s3.amazonaws.com/hits_10m_single/partitions/hits_10m_single.tar" dataset_paths["hits10"]="https://clickhouse-private-datasets.s3.amazonaws.com/hits_10m_single/partitions/hits_10m_single.tar"
dataset_paths["hits100"]="https://clickhouse-private-datasets.s3.amazonaws.com/hits_100m_single/partitions/hits_100m_single.tar" dataset_paths["hits100"]="https://clickhouse-private-datasets.s3.amazonaws.com/hits_100m_single/partitions/hits_100m_single.tar"
dataset_paths["hits1"]="https://clickhouse-datasets.s3.amazonaws.com/hits/partitions/hits_v1.tar" dataset_paths["hits1"]="https://clickhouse-datasets.s3.amazonaws.com/hits/partitions/hits_v1.tar"
dataset_paths["values"]="https://clickhouse-datasets.s3.amazonaws.com/values_with_expressions/partitions/test_values.tar" dataset_paths["values"]="https://clickhouse-datasets.s3.amazonaws.com/values_with_expressions/partitions/test_values.tar"
else
dataset_paths["hits10"]="https://s3.mds.yandex.net/clickhouse-private-datasets/hits_10m_single/partitions/hits_10m_single.tar"
dataset_paths["hits100"]="https://s3.mds.yandex.net/clickhouse-private-datasets/hits_100m_single/partitions/hits_100m_single.tar"
dataset_paths["hits1"]="https://clickhouse-datasets.s3.yandex.net/hits/partitions/hits_v1.tar"
dataset_paths["values"]="https://clickhouse-datasets.s3.yandex.net/values_with_expressions/partitions/test_values.tar"
fi
function download function download
{ {
# Historically there were various paths for the performance test package. # Historically there were various paths for the performance test package.
# Test all of them. # Test all of them.
declare -a urls_to_try=("https://s3.amazonaws.com/clickhouse-builds/$left_pr/$left_sha/performance/performance.tgz" declare -a urls_to_try=("https://s3.amazonaws.com/clickhouse-builds/$left_pr/$left_sha/performance/performance.tgz")
"https://clickhouse-builds.s3.yandex.net/$left_pr/$left_sha/clickhouse_build_check/performance/performance.tgz"
)
for path in "${urls_to_try[@]}" for path in "${urls_to_try[@]}"
do do

View File

@ -4,7 +4,7 @@ set -ex
CHPC_CHECK_START_TIMESTAMP="$(date +%s)" CHPC_CHECK_START_TIMESTAMP="$(date +%s)"
export CHPC_CHECK_START_TIMESTAMP export CHPC_CHECK_START_TIMESTAMP
S3_URL=${S3_URL:="https://clickhouse-builds.s3.yandex.net"} S3_URL=${S3_URL:="https://clickhouse-builds.s3.amazonaws.com"}
COMMON_BUILD_PREFIX="/clickhouse_build_check" COMMON_BUILD_PREFIX="/clickhouse_build_check"
if [[ $S3_URL == *"s3.amazonaws.com"* ]]; then if [[ $S3_URL == *"s3.amazonaws.com"* ]]; then
@ -64,9 +64,7 @@ function find_reference_sha
# Historically there were various path for the performance test package, # Historically there were various path for the performance test package,
# test all of them. # test all of them.
unset found unset found
declare -a urls_to_try=("https://s3.amazonaws.com/clickhouse-builds/0/$REF_SHA/performance/performance.tgz" declare -a urls_to_try=("https://s3.amazonaws.com/clickhouse-builds/0/$REF_SHA/performance/performance.tgz")
"https://clickhouse-builds.s3.yandex.net/0/$REF_SHA/clickhouse_build_check/performance/performance.tgz"
)
for path in "${urls_to_try[@]}" for path in "${urls_to_try[@]}"
do do
if curl_with_retry "$path" if curl_with_retry "$path"

View File

@ -11,7 +11,7 @@ RUN apt-get update -y \
COPY s3downloader /s3downloader COPY s3downloader /s3downloader
ENV S3_URL="https://clickhouse-datasets.s3.yandex.net" ENV S3_URL="https://clickhouse-datasets.s3.amazonaws.com"
ENV DATASETS="hits visits" ENV DATASETS="hits visits"
ENV EXPORT_S3_STORAGE_POLICIES=1 ENV EXPORT_S3_STORAGE_POLICIES=1

View File

@ -10,7 +10,7 @@ import requests
import tempfile import tempfile
DEFAULT_URL = 'https://clickhouse-datasets.s3.yandex.net' DEFAULT_URL = 'https://clickhouse-datasets.s3.amazonaws.com'
AVAILABLE_DATASETS = { AVAILABLE_DATASETS = {
'hits': 'hits_v1.tar', 'hits': 'hits_v1.tar',

View File

@ -29,7 +29,7 @@ COPY ./download_previous_release /download_previous_release
COPY run.sh / COPY run.sh /
ENV DATASETS="hits visits" ENV DATASETS="hits visits"
ENV S3_URL="https://clickhouse-datasets.s3.yandex.net" ENV S3_URL="https://clickhouse-datasets.s3.amazonaws.com"
ENV EXPORT_S3_STORAGE_POLICIES=1 ENV EXPORT_S3_STORAGE_POLICIES=1
CMD ["/bin/bash", "/run.sh"] CMD ["/bin/bash", "/run.sh"]

View File

@ -137,7 +137,7 @@ CREATE TABLE test.test_orc
`f_array_array_float` Array(Array(Float32)), `f_array_array_float` Array(Array(Float32)),
`day` String `day` String
) )
ENGINE = Hive('thrift://202.168.117.26:9083', 'test', 'test_orc') ENGINE = Hive('thrift://localhost:9083', 'test', 'test_orc')
PARTITION BY day PARTITION BY day
``` ```

View File

@ -1616,3 +1616,14 @@ Possible values:
Default value: `10000`. Default value: `10000`.
## global_memory_usage_overcommit_max_wait_microseconds {#global_memory_usage_overcommit_max_wait_microseconds}
Sets maximum waiting time for global overcommit tracker.
Possible values:
- Positive integer.
Default value: `0`.

View File

@ -0,0 +1,31 @@
# Memory overcommit
Memory overcommit is an experimental technique intended to allow to set more flexible memory limits for queries.
The idea of this technique is to introduce settings which can represent guaranteed amount of memory a query can use.
When memory overcommit is enabled and the memory limit is reached ClickHouse will select the most overcommitted query and try to free memory by killing this query.
When memory limit is reached any query will wait some time during atempt to allocate new memory.
If timeout is passed and memory is freed, the query continues execution. Otherwise an exception will be thrown and the query is killed.
Selection of query to stop or kill is performed by either global or user overcommit trackers depending on what memory limit is reached.
## User overcommit tracker
User overcommit tracker finds a query with the biggest overcommit ratio in the user's query list.
Overcommit ratio is computed as number of allocated bytes divided by value of `max_guaranteed_memory_usage` setting.
Waiting timeout is set by `memory_usage_overcommit_max_wait_microseconds` setting.
**Example**
```sql
SELECT number FROM numbers(1000) GROUP BY number SETTINGS max_guaranteed_memory_usage=4000, memory_usage_overcommit_max_wait_microseconds=500
```
## Global overcommit tracker
Global overcommit tracker finds a query with the biggest overcommit ratio in the list of all queries.
In this case overcommit ratio is computed as number of allocated bytes divided by value of `max_guaranteed_memory_usage_for_user` setting.
Waiting timeout is set by `global_memory_usage_overcommit_max_wait_microseconds` parameter in the configuration file.

View File

@ -4220,10 +4220,36 @@ Possible values:
- 0 — Disabled. - 0 — Disabled.
- 1 — Enabled. The wait time equal shutdown_wait_unfinished config. - 1 — Enabled. The wait time equal shutdown_wait_unfinished config.
Default value: 0. Default value: `0`.
## shutdown_wait_unfinished ## shutdown_wait_unfinished
The waiting time in seconds for currently handled connections when shutdown server. The waiting time in seconds for currently handled connections when shutdown server.
Default Value: 5. Default Value: `5`.
## max_guaranteed_memory_usage
Maximum guaranteed memory usage for processing of single query.
It represents soft limit in case when hard limit is reached on user level.
Zero means unlimited.
Read more about [memory overcommit](memory-overcommit.md).
Default value: `0`.
## memory_usage_overcommit_max_wait_microseconds
Maximum time thread will wait for memory to be freed in the case of memory overcommit on a user level.
If the timeout is reached and memory is not freed, an exception is thrown.
Read more about [memory overcommit](memory-overcommit.md).
Default value: `0`.
## max_guaranteed_memory_usage_for_user
Maximum guaranteed memory usage for processing all concurrently running queries for the user.
It represents soft limit in case when hard limit is reached on global level.
Zero means unlimited.
Read more about [memory overcommit](memory-overcommit.md).
Default value: `0`.

View File

@ -10,7 +10,7 @@ cssmin==0.2.0
future==0.18.2 future==0.18.2
htmlmin==0.1.12 htmlmin==0.1.12
idna==2.10 idna==2.10
Jinja2>=3.0.3 Jinja2==3.0.3
jinja2-highlight==0.6.1 jinja2-highlight==0.6.1
jsmin==3.0.0 jsmin==3.0.0
livereload==2.6.3 livereload==2.6.3

View File

@ -140,7 +140,7 @@ CREATE TABLE test.test_orc
`f_array_array_float` Array(Array(Float32)), `f_array_array_float` Array(Array(Float32)),
`day` String `day` String
) )
ENGINE = Hive('thrift://202.168.117.26:9083', 'test', 'test_orc') ENGINE = Hive('thrift://localhost:9083', 'test', 'test_orc')
PARTITION BY day PARTITION BY day
``` ```

View File

@ -21,8 +21,12 @@ description: |
This package contains the debugging symbols for clickhouse-common. This package contains the debugging symbols for clickhouse-common.
contents: contents:
- src: root/usr/lib/debug - src: root/usr/lib/debug/usr/bin/clickhouse.debug
dst: /usr/lib/debug dst: /usr/lib/debug/usr/bin/clickhouse.debug
- src: root/usr/lib/debug/usr/bin/clickhouse-odbc-bridge.debug
dst: /usr/lib/debug/usr/bin/clickhouse-odbc-bridge.debug
- src: root/usr/lib/debug/usr/bin/clickhouse-library-bridge.debug
dst: /usr/lib/debug/usr/bin/clickhouse-library-bridge.debug
# docs # docs
- src: ../AUTHORS - src: ../AUTHORS
dst: /usr/share/doc/clickhouse-common-static-dbg/AUTHORS dst: /usr/share/doc/clickhouse-common-static-dbg/AUTHORS

View File

@ -0,0 +1,28 @@
# package sources should be placed in ${PWD}/root
# nfpm should run from the same directory with a config
name: "clickhouse-keeper-dbg"
arch: "${DEB_ARCH}" # amd64, arm64
platform: "linux"
version: "${CLICKHOUSE_VERSION_STRING}"
vendor: "ClickHouse Inc."
homepage: "https://clickhouse.com"
license: "Apache"
section: "database"
priority: "optional"
maintainer: "ClickHouse Dev Team <packages+linux@clickhouse.com>"
description: |
debugging symbols for clickhouse-keeper
This package contains the debugging symbols for clickhouse-keeper.
contents:
- src: root/usr/lib/debug/usr/bin/clickhouse-keeper.debug
dst: /usr/lib/debug/usr/bin/clickhouse-keeper.debug
# docs
- src: ../AUTHORS
dst: /usr/share/doc/clickhouse-keeper-dbg/AUTHORS
- src: ../CHANGELOG.md
dst: /usr/share/doc/clickhouse-keeper-dbg/CHANGELOG.md
- src: ../LICENSE
dst: /usr/share/doc/clickhouse-keeper-dbg/LICENSE
- src: ../README.md
dst: /usr/share/doc/clickhouse-keeper-dbg/README.md

View File

@ -0,0 +1,40 @@
# package sources should be placed in ${PWD}/root
# nfpm should run from the same directory with a config
name: "clickhouse-keeper"
arch: "${DEB_ARCH}" # amd64, arm64
platform: "linux"
version: "${CLICKHOUSE_VERSION_STRING}"
vendor: "ClickHouse Inc."
homepage: "https://clickhouse.com"
license: "Apache"
section: "database"
priority: "optional"
conflicts:
- clickhouse-server
depends:
- adduser
suggests:
- clickhouse-keeper-dbg
maintainer: "ClickHouse Dev Team <packages+linux@clickhouse.com>"
description: |
Static clickhouse-keeper binary
A stand-alone clickhouse-keeper package
contents:
- src: root/etc/clickhouse-keeper
dst: /etc/clickhouse-keeper
type: config
- src: root/usr/bin/clickhouse-keeper
dst: /usr/bin/clickhouse-keeper
# docs
- src: ../AUTHORS
dst: /usr/share/doc/clickhouse-keeper/AUTHORS
- src: ../CHANGELOG.md
dst: /usr/share/doc/clickhouse-keeper/CHANGELOG.md
- src: ../LICENSE
dst: /usr/share/doc/clickhouse-keeper/LICENSE
- src: ../README.md
dst: /usr/share/doc/clickhouse-keeper/README.md

View File

@ -473,18 +473,11 @@ else ()
if (INSTALL_STRIPPED_BINARIES) if (INSTALL_STRIPPED_BINARIES)
clickhouse_strip_binary(TARGET clickhouse DESTINATION_DIR ${CMAKE_CURRENT_BINARY_DIR}/${STRIPPED_BINARIES_OUTPUT} BINARY_PATH clickhouse) clickhouse_strip_binary(TARGET clickhouse DESTINATION_DIR ${CMAKE_CURRENT_BINARY_DIR}/${STRIPPED_BINARIES_OUTPUT} BINARY_PATH clickhouse)
else() else()
clickhouse_make_empty_debug_info_for_nfpm(TARGET clickhouse DESTINATION_DIR ${CMAKE_CURRENT_BINARY_DIR}/${STRIPPED_BINARIES_OUTPUT})
install (TARGETS clickhouse RUNTIME DESTINATION ${CMAKE_INSTALL_BINDIR} COMPONENT clickhouse) install (TARGETS clickhouse RUNTIME DESTINATION ${CMAKE_INSTALL_BINDIR} COMPONENT clickhouse)
endif() endif()
endif() endif()
if (NOT INSTALL_STRIPPED_BINARIES)
# Install dunny debug directory
# TODO: move logic to every place where clickhouse_strip_binary is used
add_custom_command(TARGET clickhouse POST_BUILD COMMAND echo > .empty )
install(FILES "${CMAKE_CURRENT_BINARY_DIR}/.empty" DESTINATION ${CMAKE_INSTALL_LIBDIR}/debug/.empty)
endif()
if (ENABLE_TESTS) if (ENABLE_TESTS)
set (CLICKHOUSE_UNIT_TESTS_TARGETS unit_tests_dbms) set (CLICKHOUSE_UNIT_TESTS_TARGETS unit_tests_dbms)
add_custom_target (clickhouse-tests ALL DEPENDS ${CLICKHOUSE_UNIT_TESTS_TARGETS}) add_custom_target (clickhouse-tests ALL DEPENDS ${CLICKHOUSE_UNIT_TESTS_TARGETS})

View File

@ -71,17 +71,11 @@ if (BUILD_STANDALONE_KEEPER)
${CMAKE_CURRENT_SOURCE_DIR}/../../src/Compression/CompressedReadBuffer.cpp ${CMAKE_CURRENT_SOURCE_DIR}/../../src/Compression/CompressedReadBuffer.cpp
${CMAKE_CURRENT_SOURCE_DIR}/../../src/Compression/CompressedReadBufferFromFile.cpp ${CMAKE_CURRENT_SOURCE_DIR}/../../src/Compression/CompressedReadBufferFromFile.cpp
${CMAKE_CURRENT_SOURCE_DIR}/../../src/Compression/CompressedWriteBuffer.cpp ${CMAKE_CURRENT_SOURCE_DIR}/../../src/Compression/CompressedWriteBuffer.cpp
${CMAKE_CURRENT_SOURCE_DIR}/../../src/Compression/CompressionCodecDelta.cpp
${CMAKE_CURRENT_SOURCE_DIR}/../../src/Compression/CompressionCodecDoubleDelta.cpp
${CMAKE_CURRENT_SOURCE_DIR}/../../src/Compression/CompressionCodecEncrypted.cpp
${CMAKE_CURRENT_SOURCE_DIR}/../../src/Compression/CompressionCodecGorilla.cpp
${CMAKE_CURRENT_SOURCE_DIR}/../../src/Compression/CompressionCodecLZ4.cpp ${CMAKE_CURRENT_SOURCE_DIR}/../../src/Compression/CompressionCodecLZ4.cpp
${CMAKE_CURRENT_SOURCE_DIR}/../../src/Compression/CompressionCodecMultiple.cpp ${CMAKE_CURRENT_SOURCE_DIR}/../../src/Compression/CompressionCodecMultiple.cpp
${CMAKE_CURRENT_SOURCE_DIR}/../../src/Compression/CompressionCodecNone.cpp ${CMAKE_CURRENT_SOURCE_DIR}/../../src/Compression/CompressionCodecNone.cpp
${CMAKE_CURRENT_SOURCE_DIR}/../../src/Compression/CompressionCodecT64.cpp
${CMAKE_CURRENT_SOURCE_DIR}/../../src/Compression/CompressionCodecZSTD.cpp ${CMAKE_CURRENT_SOURCE_DIR}/../../src/Compression/CompressionCodecZSTD.cpp
${CMAKE_CURRENT_SOURCE_DIR}/../../src/Compression/CompressionFactory.cpp ${CMAKE_CURRENT_SOURCE_DIR}/../../src/Compression/CompressionFactory.cpp
${CMAKE_CURRENT_SOURCE_DIR}/../../src/Compression/getCompressionCodecForFile.cpp
${CMAKE_CURRENT_SOURCE_DIR}/../../src/Compression/ICompressionCodec.cpp ${CMAKE_CURRENT_SOURCE_DIR}/../../src/Compression/ICompressionCodec.cpp
${CMAKE_CURRENT_SOURCE_DIR}/../../src/Compression/LZ4_decompress_faster.cpp ${CMAKE_CURRENT_SOURCE_DIR}/../../src/Compression/LZ4_decompress_faster.cpp
@ -137,5 +131,10 @@ if (BUILD_STANDALONE_KEEPER)
add_dependencies(clickhouse-keeper clickhouse_keeper_configs) add_dependencies(clickhouse-keeper clickhouse_keeper_configs)
set_target_properties(clickhouse-keeper PROPERTIES RUNTIME_OUTPUT_DIRECTORY ../) set_target_properties(clickhouse-keeper PROPERTIES RUNTIME_OUTPUT_DIRECTORY ../)
if (INSTALL_STRIPPED_BINARIES)
clickhouse_strip_binary(TARGET clickhouse-keeper DESTINATION_DIR ${CMAKE_CURRENT_BINARY_DIR}/../${STRIPPED_BINARIES_OUTPUT} BINARY_PATH ../clickhouse-keeper)
else()
clickhouse_make_empty_debug_info_for_nfpm(TARGET clickhouse-keeper DESTINATION_DIR ${CMAKE_CURRENT_BINARY_DIR}/../${STRIPPED_BINARIES_OUTPUT})
install(TARGETS clickhouse-keeper RUNTIME DESTINATION ${CMAKE_INSTALL_BINDIR} COMPONENT clickhouse) install(TARGETS clickhouse-keeper RUNTIME DESTINATION ${CMAKE_INSTALL_BINDIR} COMPONENT clickhouse)
endif() endif()
endif()

View File

@ -27,5 +27,6 @@ set_target_properties(clickhouse-library-bridge PROPERTIES RUNTIME_OUTPUT_DIRECT
if (INSTALL_STRIPPED_BINARIES) if (INSTALL_STRIPPED_BINARIES)
clickhouse_strip_binary(TARGET clickhouse-library-bridge DESTINATION_DIR ${CMAKE_CURRENT_BINARY_DIR}/../${STRIPPED_BINARIES_OUTPUT} BINARY_PATH ../clickhouse-library-bridge) clickhouse_strip_binary(TARGET clickhouse-library-bridge DESTINATION_DIR ${CMAKE_CURRENT_BINARY_DIR}/../${STRIPPED_BINARIES_OUTPUT} BINARY_PATH ../clickhouse-library-bridge)
else() else()
clickhouse_make_empty_debug_info_for_nfpm(TARGET clickhouse-library-bridge DESTINATION_DIR ${CMAKE_CURRENT_BINARY_DIR}/../${STRIPPED_BINARIES_OUTPUT})
install(TARGETS clickhouse-library-bridge RUNTIME DESTINATION ${CMAKE_INSTALL_BINDIR} COMPONENT clickhouse) install(TARGETS clickhouse-library-bridge RUNTIME DESTINATION ${CMAKE_INSTALL_BINDIR} COMPONENT clickhouse)
endif() endif()

View File

@ -42,6 +42,7 @@ endif()
if (INSTALL_STRIPPED_BINARIES) if (INSTALL_STRIPPED_BINARIES)
clickhouse_strip_binary(TARGET clickhouse-odbc-bridge DESTINATION_DIR ${CMAKE_CURRENT_BINARY_DIR}/../${STRIPPED_BINARIES_OUTPUT} BINARY_PATH ../clickhouse-odbc-bridge) clickhouse_strip_binary(TARGET clickhouse-odbc-bridge DESTINATION_DIR ${CMAKE_CURRENT_BINARY_DIR}/../${STRIPPED_BINARIES_OUTPUT} BINARY_PATH ../clickhouse-odbc-bridge)
else() else()
clickhouse_make_empty_debug_info_for_nfpm(TARGET clickhouse-odbc-bridge DESTINATION_DIR ${CMAKE_CURRENT_BINARY_DIR}/../${STRIPPED_BINARIES_OUTPUT})
install(TARGETS clickhouse-odbc-bridge RUNTIME DESTINATION ${CMAKE_INSTALL_BINDIR} COMPONENT clickhouse) install(TARGETS clickhouse-odbc-bridge RUNTIME DESTINATION ${CMAKE_INSTALL_BINDIR} COMPONENT clickhouse)
endif() endif()

View File

@ -13,7 +13,7 @@ enum class QuotaType
{ {
QUERIES, /// Number of queries. QUERIES, /// Number of queries.
QUERY_SELECTS, /// Number of select queries. QUERY_SELECTS, /// Number of select queries.
QUERY_INSERTS, /// Number of inserts queries. QUERY_INSERTS, /// Number of insert queries.
ERRORS, /// Number of queries with exceptions. ERRORS, /// Number of queries with exceptions.
RESULT_ROWS, /// Number of rows returned as result. RESULT_ROWS, /// Number of rows returned as result.
RESULT_BYTES, /// Number of bytes returned as result. RESULT_BYTES, /// Number of bytes returned as result.

View File

@ -31,8 +31,8 @@ public:
/// probably it worth to try to increase stack size for coroutines. /// probably it worth to try to increase stack size for coroutines.
/// ///
/// Current value is just enough for all tests in our CI. It's not selected in some special /// Current value is just enough for all tests in our CI. It's not selected in some special
/// way. We will have 40 pages with 4KB page size. /// way. We will have 80 pages with 4KB page size.
static constexpr size_t default_stack_size = 192 * 1024; /// 64KB was not enough for tests static constexpr size_t default_stack_size = 320 * 1024; /// 64KB was not enough for tests
explicit FiberStack(size_t stack_size_ = default_stack_size) : stack_size(stack_size_) explicit FiberStack(size_t stack_size_ = default_stack_size) : stack_size(stack_size_)
{ {

View File

@ -23,6 +23,12 @@ void OvercommitTracker::setMaxWaitTime(UInt64 wait_time)
bool OvercommitTracker::needToStopQuery(MemoryTracker * tracker) bool OvercommitTracker::needToStopQuery(MemoryTracker * tracker)
{ {
// NOTE: Do not change the order of locks
//
// global_mutex must be acquired before overcommit_m, because
// method OvercommitTracker::unsubscribe(MemoryTracker *) is
// always called with already acquired global_mutex in
// ProcessListEntry::~ProcessListEntry().
std::unique_lock<std::mutex> global_lock(global_mutex); std::unique_lock<std::mutex> global_lock(global_mutex);
std::unique_lock<std::mutex> lk(overcommit_m); std::unique_lock<std::mutex> lk(overcommit_m);
@ -76,7 +82,7 @@ void UserOvercommitTracker::pickQueryToExcludeImpl()
MemoryTracker * query_tracker = nullptr; MemoryTracker * query_tracker = nullptr;
OvercommitRatio current_ratio{0, 0}; OvercommitRatio current_ratio{0, 0};
// At this moment query list must be read only. // At this moment query list must be read only.
// BlockQueryIfMemoryLimit is used in ProcessList to guarantee this. // This is guaranteed by locking global_mutex in OvercommitTracker::needToStopQuery.
auto & queries = user_process_list->queries; auto & queries = user_process_list->queries;
LOG_DEBUG(logger, "Trying to choose query to stop from {} queries", queries.size()); LOG_DEBUG(logger, "Trying to choose query to stop from {} queries", queries.size());
for (auto const & query : queries) for (auto const & query : queries)
@ -111,9 +117,9 @@ void GlobalOvercommitTracker::pickQueryToExcludeImpl()
MemoryTracker * query_tracker = nullptr; MemoryTracker * query_tracker = nullptr;
OvercommitRatio current_ratio{0, 0}; OvercommitRatio current_ratio{0, 0};
// At this moment query list must be read only. // At this moment query list must be read only.
// BlockQueryIfMemoryLimit is used in ProcessList to guarantee this. // This is guaranteed by locking global_mutex in OvercommitTracker::needToStopQuery.
LOG_DEBUG(logger, "Trying to choose query to stop"); LOG_DEBUG(logger, "Trying to choose query to stop from {} queries", process_list->size());
process_list->processEachQueryStatus([&](DB::QueryStatus const & query) for (auto const & query : process_list->processes)
{ {
if (query.isKilled()) if (query.isKilled())
return; return;
@ -134,7 +140,7 @@ void GlobalOvercommitTracker::pickQueryToExcludeImpl()
query_tracker = memory_tracker; query_tracker = memory_tracker;
current_ratio = ratio; current_ratio = ratio;
} }
}); }
LOG_DEBUG(logger, "Selected to stop query with overcommit ratio {}/{}", LOG_DEBUG(logger, "Selected to stop query with overcommit ratio {}/{}",
current_ratio.committed, current_ratio.soft_limit); current_ratio.committed, current_ratio.soft_limit);
picked_tracker = query_tracker; picked_tracker = query_tracker;

View File

@ -43,8 +43,6 @@ class MemoryTracker;
// is killed to free memory. // is killed to free memory.
struct OvercommitTracker : boost::noncopyable struct OvercommitTracker : boost::noncopyable
{ {
explicit OvercommitTracker(std::mutex & global_mutex_);
void setMaxWaitTime(UInt64 wait_time); void setMaxWaitTime(UInt64 wait_time);
bool needToStopQuery(MemoryTracker * tracker); bool needToStopQuery(MemoryTracker * tracker);
@ -54,8 +52,12 @@ struct OvercommitTracker : boost::noncopyable
virtual ~OvercommitTracker() = default; virtual ~OvercommitTracker() = default;
protected: protected:
explicit OvercommitTracker(std::mutex & global_mutex_);
virtual void pickQueryToExcludeImpl() = 0; virtual void pickQueryToExcludeImpl() = 0;
// This mutex is used to disallow concurrent access
// to picked_tracker and cancelation_state variables.
mutable std::mutex overcommit_m; mutable std::mutex overcommit_m;
mutable std::condition_variable cv; mutable std::condition_variable cv;
@ -87,6 +89,11 @@ private:
} }
} }
// Global mutex which is used in ProcessList to synchronize
// insertion and deletion of queries.
// OvercommitTracker::pickQueryToExcludeImpl() implementations
// require this mutex to be locked, because they read list (or sublist)
// of queries.
std::mutex & global_mutex; std::mutex & global_mutex;
}; };

View File

@ -9,6 +9,7 @@
M(SelectQuery, "Same as Query, but only for SELECT queries.") \ M(SelectQuery, "Same as Query, but only for SELECT queries.") \
M(InsertQuery, "Same as Query, but only for INSERT queries.") \ M(InsertQuery, "Same as Query, but only for INSERT queries.") \
M(AsyncInsertQuery, "Same as InsertQuery, but only for asynchronous INSERT queries.") \ M(AsyncInsertQuery, "Same as InsertQuery, but only for asynchronous INSERT queries.") \
M(AsyncInsertBytes, "Data size in bytes of asynchronous INSERT queries.") \
M(FailedQuery, "Number of failed queries.") \ M(FailedQuery, "Number of failed queries.") \
M(FailedSelectQuery, "Same as FailedQuery, but only for SELECT queries.") \ M(FailedSelectQuery, "Same as FailedQuery, but only for SELECT queries.") \
M(FailedInsertQuery, "Same as FailedQuery, but only for INSERT queries.") \ M(FailedInsertQuery, "Same as FailedQuery, but only for INSERT queries.") \

View File

@ -11,7 +11,7 @@
constexpr size_t IPV4_BINARY_LENGTH = 4; constexpr size_t IPV4_BINARY_LENGTH = 4;
constexpr size_t IPV6_BINARY_LENGTH = 16; constexpr size_t IPV6_BINARY_LENGTH = 16;
constexpr size_t IPV4_MAX_TEXT_LENGTH = 15; /// Does not count tail zero byte. constexpr size_t IPV4_MAX_TEXT_LENGTH = 15; /// Does not count tail zero byte.
constexpr size_t IPV6_MAX_TEXT_LENGTH = 39; constexpr size_t IPV6_MAX_TEXT_LENGTH = 45; /// Does not count tail zero byte.
namespace DB namespace DB
{ {

View File

@ -165,25 +165,36 @@ void registerCodecNone(CompressionCodecFactory & factory);
void registerCodecLZ4(CompressionCodecFactory & factory); void registerCodecLZ4(CompressionCodecFactory & factory);
void registerCodecLZ4HC(CompressionCodecFactory & factory); void registerCodecLZ4HC(CompressionCodecFactory & factory);
void registerCodecZSTD(CompressionCodecFactory & factory); void registerCodecZSTD(CompressionCodecFactory & factory);
void registerCodecMultiple(CompressionCodecFactory & factory);
/// Keeper use only general-purpose codecs, so we don't need these special codecs
/// in standalone build
#ifndef KEEPER_STANDALONE_BUILD
void registerCodecDelta(CompressionCodecFactory & factory); void registerCodecDelta(CompressionCodecFactory & factory);
void registerCodecT64(CompressionCodecFactory & factory); void registerCodecT64(CompressionCodecFactory & factory);
void registerCodecDoubleDelta(CompressionCodecFactory & factory); void registerCodecDoubleDelta(CompressionCodecFactory & factory);
void registerCodecGorilla(CompressionCodecFactory & factory); void registerCodecGorilla(CompressionCodecFactory & factory);
void registerCodecEncrypted(CompressionCodecFactory & factory); void registerCodecEncrypted(CompressionCodecFactory & factory);
void registerCodecMultiple(CompressionCodecFactory & factory);
#endif
CompressionCodecFactory::CompressionCodecFactory() CompressionCodecFactory::CompressionCodecFactory()
{ {
registerCodecLZ4(*this);
registerCodecNone(*this); registerCodecNone(*this);
registerCodecLZ4(*this);
registerCodecZSTD(*this); registerCodecZSTD(*this);
registerCodecLZ4HC(*this); registerCodecLZ4HC(*this);
registerCodecMultiple(*this);
#ifndef KEEPER_STANDALONE_BUILD
registerCodecDelta(*this); registerCodecDelta(*this);
registerCodecT64(*this); registerCodecT64(*this);
registerCodecDoubleDelta(*this); registerCodecDoubleDelta(*this);
registerCodecGorilla(*this); registerCodecGorilla(*this);
registerCodecEncrypted(*this); registerCodecEncrypted(*this);
registerCodecMultiple(*this); #endif
default_codec = get("LZ4", {}); default_codec = get("LZ4", {});
} }

View File

@ -13,6 +13,7 @@
#include <iterator> #include <iterator>
#include <base/sort.h> #include <base/sort.h>
#include <boost/algorithm/string.hpp>
namespace DB namespace DB
@ -269,8 +270,18 @@ const ColumnWithTypeAndName & Block::safeGetByPosition(size_t position) const
} }
const ColumnWithTypeAndName * Block::findByName(const std::string & name) const const ColumnWithTypeAndName * Block::findByName(const std::string & name, bool case_insensitive) const
{ {
if (case_insensitive)
{
auto found = std::find_if(data.begin(), data.end(), [&](const auto & column) { return boost::iequals(column.name, name); });
if (found == data.end())
{
return nullptr;
}
return &*found;
}
auto it = index_by_name.find(name); auto it = index_by_name.find(name);
if (index_by_name.end() == it) if (index_by_name.end() == it)
{ {
@ -280,19 +291,23 @@ const ColumnWithTypeAndName * Block::findByName(const std::string & name) const
} }
const ColumnWithTypeAndName & Block::getByName(const std::string & name) const const ColumnWithTypeAndName & Block::getByName(const std::string & name, bool case_insensitive) const
{ {
const auto * result = findByName(name); const auto * result = findByName(name, case_insensitive);
if (!result) if (!result)
throw Exception("Not found column " + name + " in block. There are only columns: " + dumpNames() throw Exception(
, ErrorCodes::NOT_FOUND_COLUMN_IN_BLOCK); "Not found column " + name + " in block. There are only columns: " + dumpNames(), ErrorCodes::NOT_FOUND_COLUMN_IN_BLOCK);
return *result; return *result;
} }
bool Block::has(const std::string & name) const bool Block::has(const std::string & name, bool case_insensitive) const
{ {
if (case_insensitive)
return std::find_if(data.begin(), data.end(), [&](const auto & column) { return boost::iequals(column.name, name); })
!= data.end();
return index_by_name.end() != index_by_name.find(name); return index_by_name.end() != index_by_name.find(name);
} }
@ -301,8 +316,8 @@ size_t Block::getPositionByName(const std::string & name) const
{ {
auto it = index_by_name.find(name); auto it = index_by_name.find(name);
if (index_by_name.end() == it) if (index_by_name.end() == it)
throw Exception("Not found column " + name + " in block. There are only columns: " + dumpNames() throw Exception(
, ErrorCodes::NOT_FOUND_COLUMN_IN_BLOCK); "Not found column " + name + " in block. There are only columns: " + dumpNames(), ErrorCodes::NOT_FOUND_COLUMN_IN_BLOCK);
return it->second; return it->second;
} }

View File

@ -60,21 +60,21 @@ public:
ColumnWithTypeAndName & safeGetByPosition(size_t position); ColumnWithTypeAndName & safeGetByPosition(size_t position);
const ColumnWithTypeAndName & safeGetByPosition(size_t position) const; const ColumnWithTypeAndName & safeGetByPosition(size_t position) const;
ColumnWithTypeAndName* findByName(const std::string & name) ColumnWithTypeAndName* findByName(const std::string & name, bool case_insensitive = false)
{ {
return const_cast<ColumnWithTypeAndName *>( return const_cast<ColumnWithTypeAndName *>(
const_cast<const Block *>(this)->findByName(name)); const_cast<const Block *>(this)->findByName(name, case_insensitive));
} }
const ColumnWithTypeAndName * findByName(const std::string & name) const; const ColumnWithTypeAndName * findByName(const std::string & name, bool case_insensitive = false) const;
ColumnWithTypeAndName & getByName(const std::string & name) ColumnWithTypeAndName & getByName(const std::string & name, bool case_insensitive = false)
{ {
return const_cast<ColumnWithTypeAndName &>( return const_cast<ColumnWithTypeAndName &>(
const_cast<const Block *>(this)->getByName(name)); const_cast<const Block *>(this)->getByName(name, case_insensitive));
} }
const ColumnWithTypeAndName & getByName(const std::string & name) const; const ColumnWithTypeAndName & getByName(const std::string & name, bool case_insensitive = false) const;
Container::iterator begin() { return data.begin(); } Container::iterator begin() { return data.begin(); }
Container::iterator end() { return data.end(); } Container::iterator end() { return data.end(); }
@ -83,7 +83,7 @@ public:
Container::const_iterator cbegin() const { return data.cbegin(); } Container::const_iterator cbegin() const { return data.cbegin(); }
Container::const_iterator cend() const { return data.cend(); } Container::const_iterator cend() const { return data.cend(); }
bool has(const std::string & name) const; bool has(const std::string & name, bool case_insensitive = false) const;
size_t getPositionByName(const std::string & name) const; size_t getPositionByName(const std::string & name) const;

View File

@ -616,11 +616,13 @@ class IColumn;
M(Bool, input_format_tsv_empty_as_default, false, "Treat empty fields in TSV input as default values.", 0) \ M(Bool, input_format_tsv_empty_as_default, false, "Treat empty fields in TSV input as default values.", 0) \
M(Bool, input_format_tsv_enum_as_number, false, "Treat inserted enum values in TSV formats as enum indices \\N", 0) \ M(Bool, input_format_tsv_enum_as_number, false, "Treat inserted enum values in TSV formats as enum indices \\N", 0) \
M(Bool, input_format_null_as_default, true, "For text input formats initialize null fields with default values if data type of this field is not nullable", 0) \ M(Bool, input_format_null_as_default, true, "For text input formats initialize null fields with default values if data type of this field is not nullable", 0) \
M(Bool, input_format_use_lowercase_column_name, false, "Use lowercase column name while reading input formats", 0) \
M(Bool, input_format_arrow_import_nested, false, "Allow to insert array of structs into Nested table in Arrow input format.", 0) \ M(Bool, input_format_arrow_import_nested, false, "Allow to insert array of structs into Nested table in Arrow input format.", 0) \
M(Bool, input_format_arrow_case_insensitive_column_matching, false, "Ignore case when matching Arrow columns with CH columns.", 0) \
M(Bool, input_format_orc_import_nested, false, "Allow to insert array of structs into Nested table in ORC input format.", 0) \ M(Bool, input_format_orc_import_nested, false, "Allow to insert array of structs into Nested table in ORC input format.", 0) \
M(Int64, input_format_orc_row_batch_size, 100'000, "Batch size when reading ORC stripes.", 0) \ M(Int64, input_format_orc_row_batch_size, 100'000, "Batch size when reading ORC stripes.", 0) \
M(Bool, input_format_orc_case_insensitive_column_matching, false, "Ignore case when matching ORC columns with CH columns.", 0) \
M(Bool, input_format_parquet_import_nested, false, "Allow to insert array of structs into Nested table in Parquet input format.", 0) \ M(Bool, input_format_parquet_import_nested, false, "Allow to insert array of structs into Nested table in Parquet input format.", 0) \
M(Bool, input_format_parquet_case_insensitive_column_matching, false, "Ignore case when matching Parquet columns with CH columns.", 0) \
M(Bool, input_format_allow_seeks, true, "Allow seeks while reading in ORC/Parquet/Arrow input formats", 0) \ M(Bool, input_format_allow_seeks, true, "Allow seeks while reading in ORC/Parquet/Arrow input formats", 0) \
M(Bool, input_format_orc_allow_missing_columns, false, "Allow missing columns while reading ORC input formats", 0) \ M(Bool, input_format_orc_allow_missing_columns, false, "Allow missing columns while reading ORC input formats", 0) \
M(Bool, input_format_parquet_allow_missing_columns, false, "Allow missing columns while reading Parquet input formats", 0) \ M(Bool, input_format_parquet_allow_missing_columns, false, "Allow missing columns while reading Parquet input formats", 0) \

View File

@ -15,6 +15,8 @@
#include <Parsers/IAST.h> #include <Parsers/IAST.h>
#include <boost/algorithm/string/case_conv.hpp>
namespace DB namespace DB
{ {
@ -227,14 +229,17 @@ void validateArraySizes(const Block & block)
} }
std::unordered_set<String> getAllTableNames(const Block & block) std::unordered_set<String> getAllTableNames(const Block & block, bool to_lower_case)
{ {
std::unordered_set<String> nested_table_names; std::unordered_set<String> nested_table_names;
for (auto & name : block.getNames()) for (const auto & name : block.getNames())
{ {
auto nested_table_name = Nested::extractTableName(name); auto nested_table_name = Nested::extractTableName(name);
if (to_lower_case)
boost::to_lower(nested_table_name);
if (!nested_table_name.empty()) if (!nested_table_name.empty())
nested_table_names.insert(nested_table_name); nested_table_names.insert(std::move(nested_table_name));
} }
return nested_table_names; return nested_table_names;
} }

View File

@ -32,7 +32,7 @@ namespace Nested
void validateArraySizes(const Block & block); void validateArraySizes(const Block & block);
/// Get all nested tables names from a block. /// Get all nested tables names from a block.
std::unordered_set<String> getAllTableNames(const Block & block); std::unordered_set<String> getAllTableNames(const Block & block, bool to_lower_case = false);
} }
} }

View File

@ -88,6 +88,9 @@ DatabaseReplicated::DatabaseReplicated(
/// If zookeeper chroot prefix is used, path should start with '/', because chroot concatenates without it. /// If zookeeper chroot prefix is used, path should start with '/', because chroot concatenates without it.
if (zookeeper_path.front() != '/') if (zookeeper_path.front() != '/')
zookeeper_path = "/" + zookeeper_path; zookeeper_path = "/" + zookeeper_path;
if (!db_settings.collection_name.value.empty())
fillClusterAuthInfo(db_settings.collection_name.value, context_->getConfigRef());
} }
String DatabaseReplicated::getFullReplicaName() const String DatabaseReplicated::getFullReplicaName() const
@ -191,22 +194,36 @@ ClusterPtr DatabaseReplicated::getClusterImpl() const
shards.back().emplace_back(unescapeForFileName(host_port)); shards.back().emplace_back(unescapeForFileName(host_port));
} }
String username = db_settings.cluster_username;
String password = db_settings.cluster_password;
UInt16 default_port = getContext()->getTCPPort(); UInt16 default_port = getContext()->getTCPPort();
bool secure = db_settings.cluster_secure_connection;
bool treat_local_as_remote = false; bool treat_local_as_remote = false;
bool treat_local_port_as_remote = getContext()->getApplicationType() == Context::ApplicationType::LOCAL; bool treat_local_port_as_remote = getContext()->getApplicationType() == Context::ApplicationType::LOCAL;
return std::make_shared<Cluster>( return std::make_shared<Cluster>(
getContext()->getSettingsRef(), getContext()->getSettingsRef(),
shards, shards,
username, cluster_auth_info.cluster_username,
password, cluster_auth_info.cluster_password,
default_port, default_port,
treat_local_as_remote, treat_local_as_remote,
treat_local_port_as_remote, treat_local_port_as_remote,
secure); cluster_auth_info.cluster_secure_connection,
/*priority=*/1,
database_name,
cluster_auth_info.cluster_secret);
}
void DatabaseReplicated::fillClusterAuthInfo(String collection_name, const Poco::Util::AbstractConfiguration & config_ref)
{
const auto & config_prefix = fmt::format("named_collections.{}", collection_name);
if (!config_ref.has(config_prefix))
throw Exception(ErrorCodes::BAD_ARGUMENTS, "There is no collection named `{}` in config", collection_name);
cluster_auth_info.cluster_username = config_ref.getString(config_prefix + ".cluster_username", "");
cluster_auth_info.cluster_password = config_ref.getString(config_prefix + ".cluster_password", "");
cluster_auth_info.cluster_secret = config_ref.getString(config_prefix + ".cluster_secret", "");
cluster_auth_info.cluster_secure_connection = config_ref.getBool(config_prefix + ".cluster_secure_connection", false);
} }
void DatabaseReplicated::tryConnectToZooKeeperAndInitDatabase(bool force_attach) void DatabaseReplicated::tryConnectToZooKeeperAndInitDatabase(bool force_attach)

View File

@ -75,6 +75,16 @@ private:
bool createDatabaseNodesInZooKeeper(const ZooKeeperPtr & current_zookeeper); bool createDatabaseNodesInZooKeeper(const ZooKeeperPtr & current_zookeeper);
void createReplicaNodesInZooKeeper(const ZooKeeperPtr & current_zookeeper); void createReplicaNodesInZooKeeper(const ZooKeeperPtr & current_zookeeper);
struct
{
String cluster_username{"default"};
String cluster_password;
String cluster_secret;
bool cluster_secure_connection{false};
} cluster_auth_info;
void fillClusterAuthInfo(String collection_name, const Poco::Util::AbstractConfiguration & config);
void checkQueryValid(const ASTPtr & query, ContextPtr query_context) const; void checkQueryValid(const ASTPtr & query, ContextPtr query_context) const;
void recoverLostReplica(const ZooKeeperPtr & current_zookeeper, UInt32 our_log_ptr, UInt32 max_log_ptr); void recoverLostReplica(const ZooKeeperPtr & current_zookeeper, UInt32 our_log_ptr, UInt32 max_log_ptr);

View File

@ -11,9 +11,8 @@ class ASTStorage;
M(Float, max_broken_tables_ratio, 0.5, "Do not recover replica automatically if the ratio of staled tables to all tables is greater", 0) \ M(Float, max_broken_tables_ratio, 0.5, "Do not recover replica automatically if the ratio of staled tables to all tables is greater", 0) \
M(UInt64, max_replication_lag_to_enqueue, 10, "Replica will throw exception on attempt to execute query if its replication lag greater", 0) \ M(UInt64, max_replication_lag_to_enqueue, 10, "Replica will throw exception on attempt to execute query if its replication lag greater", 0) \
M(UInt64, wait_entry_commited_timeout_sec, 3600, "Replicas will try to cancel query if timeout exceed, but initiator host has not executed it yet", 0) \ M(UInt64, wait_entry_commited_timeout_sec, 3600, "Replicas will try to cancel query if timeout exceed, but initiator host has not executed it yet", 0) \
M(String, cluster_username, "default", "Username to use when connecting to hosts of cluster", 0) \ M(String, collection_name, "", "A name of a collection defined in server's config where all info for cluster authentication is defined", 0) \
M(String, cluster_password, "", "Password to use when connecting to hosts of cluster", 0) \
M(Bool, cluster_secure_connection, false, "Enable TLS when connecting to hosts of cluster", 0) \
DECLARE_SETTINGS_TRAITS(DatabaseReplicatedSettingsTraits, LIST_OF_DATABASE_REPLICATED_SETTINGS) DECLARE_SETTINGS_TRAITS(DatabaseReplicatedSettingsTraits, LIST_OF_DATABASE_REPLICATED_SETTINGS)

View File

@ -89,10 +89,10 @@ FormatSettings getFormatSettings(ContextPtr context, const Settings & settings)
format_settings.json.quote_64bit_integers = settings.output_format_json_quote_64bit_integers; format_settings.json.quote_64bit_integers = settings.output_format_json_quote_64bit_integers;
format_settings.json.quote_denormals = settings.output_format_json_quote_denormals; format_settings.json.quote_denormals = settings.output_format_json_quote_denormals;
format_settings.null_as_default = settings.input_format_null_as_default; format_settings.null_as_default = settings.input_format_null_as_default;
format_settings.use_lowercase_column_name = settings.input_format_use_lowercase_column_name;
format_settings.decimal_trailing_zeros = settings.output_format_decimal_trailing_zeros; format_settings.decimal_trailing_zeros = settings.output_format_decimal_trailing_zeros;
format_settings.parquet.row_group_size = settings.output_format_parquet_row_group_size; format_settings.parquet.row_group_size = settings.output_format_parquet_row_group_size;
format_settings.parquet.import_nested = settings.input_format_parquet_import_nested; format_settings.parquet.import_nested = settings.input_format_parquet_import_nested;
format_settings.parquet.case_insensitive_column_matching = settings.input_format_parquet_case_insensitive_column_matching;
format_settings.parquet.allow_missing_columns = settings.input_format_parquet_allow_missing_columns; format_settings.parquet.allow_missing_columns = settings.input_format_parquet_allow_missing_columns;
format_settings.pretty.charset = settings.output_format_pretty_grid_charset.toString() == "ASCII" ? FormatSettings::Pretty::Charset::ASCII : FormatSettings::Pretty::Charset::UTF8; format_settings.pretty.charset = settings.output_format_pretty_grid_charset.toString() == "ASCII" ? FormatSettings::Pretty::Charset::ASCII : FormatSettings::Pretty::Charset::UTF8;
format_settings.pretty.color = settings.output_format_pretty_color; format_settings.pretty.color = settings.output_format_pretty_color;
@ -123,9 +123,11 @@ FormatSettings getFormatSettings(ContextPtr context, const Settings & settings)
format_settings.arrow.low_cardinality_as_dictionary = settings.output_format_arrow_low_cardinality_as_dictionary; format_settings.arrow.low_cardinality_as_dictionary = settings.output_format_arrow_low_cardinality_as_dictionary;
format_settings.arrow.import_nested = settings.input_format_arrow_import_nested; format_settings.arrow.import_nested = settings.input_format_arrow_import_nested;
format_settings.arrow.allow_missing_columns = settings.input_format_arrow_allow_missing_columns; format_settings.arrow.allow_missing_columns = settings.input_format_arrow_allow_missing_columns;
format_settings.arrow.case_insensitive_column_matching = settings.input_format_arrow_case_insensitive_column_matching;
format_settings.orc.import_nested = settings.input_format_orc_import_nested; format_settings.orc.import_nested = settings.input_format_orc_import_nested;
format_settings.orc.allow_missing_columns = settings.input_format_orc_allow_missing_columns; format_settings.orc.allow_missing_columns = settings.input_format_orc_allow_missing_columns;
format_settings.orc.row_batch_size = settings.input_format_orc_row_batch_size; format_settings.orc.row_batch_size = settings.input_format_orc_row_batch_size;
format_settings.orc.case_insensitive_column_matching = settings.input_format_orc_case_insensitive_column_matching;
format_settings.defaults_for_omitted_fields = settings.input_format_defaults_for_omitted_fields; format_settings.defaults_for_omitted_fields = settings.input_format_defaults_for_omitted_fields;
format_settings.capn_proto.enum_comparing_mode = settings.format_capn_proto_enum_comparising_mode; format_settings.capn_proto.enum_comparing_mode = settings.format_capn_proto_enum_comparising_mode;
format_settings.seekable_read = settings.input_format_allow_seeks; format_settings.seekable_read = settings.input_format_allow_seeks;

View File

@ -32,7 +32,6 @@ struct FormatSettings
bool null_as_default = true; bool null_as_default = true;
bool decimal_trailing_zeros = false; bool decimal_trailing_zeros = false;
bool defaults_for_omitted_fields = true; bool defaults_for_omitted_fields = true;
bool use_lowercase_column_name = false;
bool seekable_read = true; bool seekable_read = true;
UInt64 max_rows_to_read_for_schema_inference = 100; UInt64 max_rows_to_read_for_schema_inference = 100;
@ -75,6 +74,7 @@ struct FormatSettings
bool low_cardinality_as_dictionary = false; bool low_cardinality_as_dictionary = false;
bool import_nested = false; bool import_nested = false;
bool allow_missing_columns = false; bool allow_missing_columns = false;
bool case_insensitive_column_matching = false;
} arrow; } arrow;
struct struct
@ -137,6 +137,7 @@ struct FormatSettings
UInt64 row_group_size = 1000000; UInt64 row_group_size = 1000000;
bool import_nested = false; bool import_nested = false;
bool allow_missing_columns = false; bool allow_missing_columns = false;
bool case_insensitive_column_matching = false;
} parquet; } parquet;
struct Pretty struct Pretty
@ -217,6 +218,7 @@ struct FormatSettings
bool import_nested = false; bool import_nested = false;
bool allow_missing_columns = false; bool allow_missing_columns = false;
int64_t row_batch_size = 100'000; int64_t row_batch_size = 100'000;
bool case_insensitive_column_matching = false;
} orc; } orc;
/// For capnProto format we should determine how to /// For capnProto format we should determine how to

View File

@ -13,6 +13,7 @@ void registerFileSegmentationEngineCSV(FormatFactory & factory);
void registerFileSegmentationEngineJSONEachRow(FormatFactory & factory); void registerFileSegmentationEngineJSONEachRow(FormatFactory & factory);
void registerFileSegmentationEngineRegexp(FormatFactory & factory); void registerFileSegmentationEngineRegexp(FormatFactory & factory);
void registerFileSegmentationEngineJSONAsString(FormatFactory & factory); void registerFileSegmentationEngineJSONAsString(FormatFactory & factory);
void registerFileSegmentationEngineJSONAsObject(FormatFactory & factory);
void registerFileSegmentationEngineJSONCompactEachRow(FormatFactory & factory); void registerFileSegmentationEngineJSONCompactEachRow(FormatFactory & factory);
/// Formats for both input/output. /// Formats for both input/output.
@ -103,6 +104,7 @@ void registerProtobufSchemaReader(FormatFactory & factory);
void registerProtobufListSchemaReader(FormatFactory & factory); void registerProtobufListSchemaReader(FormatFactory & factory);
void registerLineAsStringSchemaReader(FormatFactory & factory); void registerLineAsStringSchemaReader(FormatFactory & factory);
void registerJSONAsStringSchemaReader(FormatFactory & factory); void registerJSONAsStringSchemaReader(FormatFactory & factory);
void registerJSONAsObjectSchemaReader(FormatFactory & factory);
void registerRawBLOBSchemaReader(FormatFactory & factory); void registerRawBLOBSchemaReader(FormatFactory & factory);
void registerMsgPackSchemaReader(FormatFactory & factory); void registerMsgPackSchemaReader(FormatFactory & factory);
void registerCapnProtoSchemaReader(FormatFactory & factory); void registerCapnProtoSchemaReader(FormatFactory & factory);
@ -123,6 +125,7 @@ void registerFormats()
registerFileSegmentationEngineJSONEachRow(factory); registerFileSegmentationEngineJSONEachRow(factory);
registerFileSegmentationEngineRegexp(factory); registerFileSegmentationEngineRegexp(factory);
registerFileSegmentationEngineJSONAsString(factory); registerFileSegmentationEngineJSONAsString(factory);
registerFileSegmentationEngineJSONAsObject(factory);
registerFileSegmentationEngineJSONCompactEachRow(factory); registerFileSegmentationEngineJSONCompactEachRow(factory);
registerInputFormatNative(factory); registerInputFormatNative(factory);
@ -207,6 +210,7 @@ void registerFormats()
registerProtobufListSchemaReader(factory); registerProtobufListSchemaReader(factory);
registerLineAsStringSchemaReader(factory); registerLineAsStringSchemaReader(factory);
registerJSONAsStringSchemaReader(factory); registerJSONAsStringSchemaReader(factory);
registerJSONAsObjectSchemaReader(factory);
registerRawBLOBSchemaReader(factory); registerRawBLOBSchemaReader(factory);
registerMsgPackSchemaReader(factory); registerMsgPackSchemaReader(factory);
registerCapnProtoSchemaReader(factory); registerCapnProtoSchemaReader(factory);

View File

@ -43,6 +43,9 @@ public:
for (size_t i = 2; i < args.size() - 1; i += 2) for (size_t i = 2; i < args.size() - 1; i += 2)
dst_array_types.push_back(args[i]); dst_array_types.push_back(args[i]);
// Type of the ELSE branch
dst_array_types.push_back(args.back());
return getLeastSupertype(dst_array_types); return getLeastSupertype(dst_array_types);
} }

View File

@ -32,6 +32,7 @@ namespace CurrentMetrics
namespace ProfileEvents namespace ProfileEvents
{ {
extern const Event AsyncInsertQuery; extern const Event AsyncInsertQuery;
extern const Event AsyncInsertBytes;
} }
namespace DB namespace DB
@ -222,7 +223,9 @@ void AsynchronousInsertQueue::pushImpl(InsertData::EntryPtr entry, QueueIterator
if (!data) if (!data)
data = std::make_unique<InsertData>(); data = std::make_unique<InsertData>();
data->size += entry->bytes.size(); size_t entry_data_size = entry->bytes.size();
data->size += entry_data_size;
data->last_update = std::chrono::steady_clock::now(); data->last_update = std::chrono::steady_clock::now();
data->entries.emplace_back(entry); data->entries.emplace_back(entry);
@ -239,6 +242,7 @@ void AsynchronousInsertQueue::pushImpl(InsertData::EntryPtr entry, QueueIterator
CurrentMetrics::add(CurrentMetrics::PendingAsyncInsert); CurrentMetrics::add(CurrentMetrics::PendingAsyncInsert);
ProfileEvents::increment(ProfileEvents::AsyncInsertQuery); ProfileEvents::increment(ProfileEvents::AsyncInsertQuery);
ProfileEvents::increment(ProfileEvents::AsyncInsertBytes, entry_data_size);
} }
void AsynchronousInsertQueue::waitForProcessingQuery(const String & query_id, const Milliseconds & timeout) void AsynchronousInsertQueue::waitForProcessingQuery(const String & query_id, const Milliseconds & timeout)

View File

@ -132,7 +132,9 @@ Cluster::Address::Address(
bool secure_, bool secure_,
Int64 priority_, Int64 priority_,
UInt32 shard_index_, UInt32 shard_index_,
UInt32 replica_index_) UInt32 replica_index_,
String cluster_name_,
String cluster_secret_)
: user(user_), password(password_) : user(user_), password(password_)
{ {
bool can_be_local = true; bool can_be_local = true;
@ -164,6 +166,8 @@ Cluster::Address::Address(
is_local = can_be_local && isLocal(clickhouse_port); is_local = can_be_local && isLocal(clickhouse_port);
shard_index = shard_index_; shard_index = shard_index_;
replica_index = replica_index_; replica_index = replica_index_;
cluster = cluster_name_;
cluster_secret = cluster_secret_;
} }
@ -537,10 +541,14 @@ Cluster::Cluster(
bool treat_local_as_remote, bool treat_local_as_remote,
bool treat_local_port_as_remote, bool treat_local_port_as_remote,
bool secure, bool secure,
Int64 priority) Int64 priority,
String cluster_name,
String cluster_secret)
{ {
UInt32 current_shard_num = 1; UInt32 current_shard_num = 1;
secret = cluster_secret;
for (const auto & shard : names) for (const auto & shard : names)
{ {
Addresses current; Addresses current;
@ -554,7 +562,9 @@ Cluster::Cluster(
secure, secure,
priority, priority,
current_shard_num, current_shard_num,
current.size() + 1); current.size() + 1,
cluster_name,
cluster_secret);
addresses_with_failover.emplace_back(current); addresses_with_failover.emplace_back(current);
@ -690,6 +700,9 @@ Cluster::Cluster(Cluster::ReplicasAsShardsTag, const Cluster & from, const Setti
} }
} }
secret = from.secret;
name = from.name;
initMisc(); initMisc();
} }
@ -704,6 +717,9 @@ Cluster::Cluster(Cluster::SubclusterTag, const Cluster & from, const std::vector
addresses_with_failover.emplace_back(from.addresses_with_failover.at(index)); addresses_with_failover.emplace_back(from.addresses_with_failover.at(index));
} }
secret = from.secret;
name = from.name;
initMisc(); initMisc();
} }

View File

@ -55,7 +55,9 @@ public:
bool treat_local_as_remote, bool treat_local_as_remote,
bool treat_local_port_as_remote, bool treat_local_port_as_remote,
bool secure = false, bool secure = false,
Int64 priority = 1); Int64 priority = 1,
String cluster_name = "",
String cluster_secret = "");
Cluster(const Cluster &)= delete; Cluster(const Cluster &)= delete;
Cluster & operator=(const Cluster &) = delete; Cluster & operator=(const Cluster &) = delete;
@ -127,7 +129,9 @@ public:
bool secure_ = false, bool secure_ = false,
Int64 priority_ = 1, Int64 priority_ = 1,
UInt32 shard_index_ = 0, UInt32 shard_index_ = 0,
UInt32 replica_index_ = 0); UInt32 replica_index_ = 0,
String cluster_name = "",
String cluster_secret_ = "");
/// Returns 'escaped_host_name:port' /// Returns 'escaped_host_name:port'
String toString() const; String toString() const;

View File

@ -100,20 +100,9 @@ bool checkPositionalArguments(ASTPtr & argument, const ASTSelectQuery * select_q
{ {
auto columns = select_query->select()->children; auto columns = select_query->select()->children;
const auto * group_by_expr_with_alias = dynamic_cast<const ASTWithAlias *>(argument.get()); const auto * expr_with_alias = dynamic_cast<const ASTWithAlias *>(argument.get());
if (group_by_expr_with_alias && !group_by_expr_with_alias->alias.empty()) if (expr_with_alias && !expr_with_alias->alias.empty())
{
for (const auto & column : columns)
{
const auto * col_with_alias = dynamic_cast<const ASTWithAlias *>(column.get());
if (col_with_alias)
{
const auto & alias = col_with_alias->alias;
if (!alias.empty() && alias == group_by_expr_with_alias->alias)
return false; return false;
}
}
}
const auto * ast_literal = typeid_cast<const ASTLiteral *>(argument.get()); const auto * ast_literal = typeid_cast<const ASTLiteral *>(argument.get());
if (!ast_literal) if (!ast_literal)
@ -130,7 +119,7 @@ bool checkPositionalArguments(ASTPtr & argument, const ASTSelectQuery * select_q
pos, columns.size()); pos, columns.size());
const auto & column = columns[--pos]; const auto & column = columns[--pos];
if (typeid_cast<const ASTIdentifier *>(column.get())) if (typeid_cast<const ASTIdentifier *>(column.get()) || typeid_cast<const ASTLiteral *>(column.get()))
{ {
argument = column->clone(); argument = column->clone();
} }
@ -1324,8 +1313,10 @@ ActionsDAGPtr SelectQueryExpressionAnalyzer::appendOrderBy(ExpressionActionsChai
throw Exception("Bad ORDER BY expression AST", ErrorCodes::UNKNOWN_TYPE_OF_AST_NODE); throw Exception("Bad ORDER BY expression AST", ErrorCodes::UNKNOWN_TYPE_OF_AST_NODE);
if (getContext()->getSettingsRef().enable_positional_arguments) if (getContext()->getSettingsRef().enable_positional_arguments)
{
replaceForPositionalArguments(ast->children.at(0), select_query, ASTSelectQuery::Expression::ORDER_BY); replaceForPositionalArguments(ast->children.at(0), select_query, ASTSelectQuery::Expression::ORDER_BY);
} }
}
getRootActions(select_query->orderBy(), only_types, step.actions()); getRootActions(select_query->orderBy(), only_types, step.actions());

View File

@ -962,18 +962,29 @@ public:
/// If it's joinGetOrNull, we need to wrap not-nullable columns in StorageJoin. /// If it's joinGetOrNull, we need to wrap not-nullable columns in StorageJoin.
for (size_t j = 0, size = right_indexes.size(); j < size; ++j) for (size_t j = 0, size = right_indexes.size(); j < size; ++j)
{ {
const auto & column = *block.getByPosition(right_indexes[j]).column; auto column_from_block = block.getByPosition(right_indexes[j]);
if (auto * nullable_col = typeid_cast<ColumnNullable *>(columns[j].get()); nullable_col && !column.isNullable()) if (type_name[j].type->lowCardinality() != column_from_block.type->lowCardinality())
nullable_col->insertFromNotNullable(column, row_num); {
JoinCommon::changeLowCardinalityInplace(column_from_block);
}
if (auto * nullable_col = typeid_cast<ColumnNullable *>(columns[j].get());
nullable_col && !column_from_block.column->isNullable())
nullable_col->insertFromNotNullable(*column_from_block.column, row_num);
else else
columns[j]->insertFrom(column, row_num); columns[j]->insertFrom(*column_from_block.column, row_num);
} }
} }
else else
{ {
for (size_t j = 0, size = right_indexes.size(); j < size; ++j) for (size_t j = 0, size = right_indexes.size(); j < size; ++j)
{ {
columns[j]->insertFrom(*block.getByPosition(right_indexes[j]).column, row_num); auto column_from_block = block.getByPosition(right_indexes[j]);
if (type_name[j].type->lowCardinality() != column_from_block.type->lowCardinality())
{
JoinCommon::changeLowCardinalityInplace(column_from_block);
}
columns[j]->insertFrom(*column_from_block.column, row_num);
} }
} }
} }
@ -1013,6 +1024,7 @@ private:
void addColumn(const ColumnWithTypeAndName & src_column, const std::string & qualified_name) void addColumn(const ColumnWithTypeAndName & src_column, const std::string & qualified_name)
{ {
columns.push_back(src_column.column->cloneEmpty()); columns.push_back(src_column.column->cloneEmpty());
columns.back()->reserve(src_column.column->size()); columns.back()->reserve(src_column.column->size());
type_name.emplace_back(src_column.type, src_column.name, qualified_name); type_name.emplace_back(src_column.type, src_column.name, qualified_name);

View File

@ -1180,11 +1180,10 @@ bool InterpreterCreateQuery::doCreateTable(ASTCreateQuery & create,
/// old instance of the storage. For example, AsynchronousMetrics may cause ATTACH to fail, /// old instance of the storage. For example, AsynchronousMetrics may cause ATTACH to fail,
/// so we allow waiting here. If database_atomic_wait_for_drop_and_detach_synchronously is disabled /// so we allow waiting here. If database_atomic_wait_for_drop_and_detach_synchronously is disabled
/// and old storage instance still exists it will throw exception. /// and old storage instance still exists it will throw exception.
bool throw_if_table_in_use = getContext()->getSettingsRef().database_atomic_wait_for_drop_and_detach_synchronously; if (getContext()->getSettingsRef().database_atomic_wait_for_drop_and_detach_synchronously)
if (throw_if_table_in_use)
database->checkDetachedTableNotInUse(create.uuid);
else
database->waitDetachedTableNotInUse(create.uuid); database->waitDetachedTableNotInUse(create.uuid);
else
database->checkDetachedTableNotInUse(create.uuid);
} }
StoragePtr res; StoragePtr res;

View File

@ -351,15 +351,6 @@ public:
max_size = max_size_; max_size = max_size_;
} }
// Before calling this method you should be sure
// that lock is acquired.
template <typename F>
void processEachQueryStatus(F && func) const
{
for (auto && query : processes)
func(query);
}
void setMaxInsertQueriesAmount(size_t max_insert_queries_amount_) void setMaxInsertQueriesAmount(size_t max_insert_queries_amount_)
{ {
std::lock_guard lock(mutex); std::lock_guard lock(mutex);

View File

@ -345,7 +345,10 @@ void replaceWithSumCount(String column_name, ASTFunction & func)
{ {
/// Rewrite "avg" to sumCount().1 / sumCount().2 /// Rewrite "avg" to sumCount().1 / sumCount().2
auto new_arg1 = makeASTFunction("tupleElement", func_base, std::make_shared<ASTLiteral>(UInt8(1))); auto new_arg1 = makeASTFunction("tupleElement", func_base, std::make_shared<ASTLiteral>(UInt8(1)));
auto new_arg2 = makeASTFunction("tupleElement", func_base, std::make_shared<ASTLiteral>(UInt8(2))); auto new_arg2 = makeASTFunction("CAST",
makeASTFunction("tupleElement", func_base, std::make_shared<ASTLiteral>(UInt8(2))),
std::make_shared<ASTLiteral>("Float64"));
func.name = "divide"; func.name = "divide";
exp_list->children.push_back(new_arg1); exp_list->children.push_back(new_arg1);
exp_list->children.push_back(new_arg2); exp_list->children.push_back(new_arg2);

View File

@ -326,9 +326,10 @@ ColumnRawPtrMap materializeColumnsInplaceMap(Block & block, const Names & names)
for (const auto & column_name : names) for (const auto & column_name : names)
{ {
auto & column = block.getByName(column_name).column; auto & column = block.getByName(column_name);
column = recursiveRemoveLowCardinality(column->convertToFullColumnIfConst()); column.column = recursiveRemoveLowCardinality(column.column->convertToFullColumnIfConst());
ptrs[column_name] = column.get(); column.type = recursiveRemoveLowCardinality(column.type);
ptrs[column_name] = column.column.get();
} }
return ptrs; return ptrs;

View File

@ -139,7 +139,11 @@ void ArrowBlockInputFormat::prepareReader()
} }
arrow_column_to_ch_column = std::make_unique<ArrowColumnToCHColumn>( arrow_column_to_ch_column = std::make_unique<ArrowColumnToCHColumn>(
getPort().getHeader(), "Arrow", format_settings.arrow.import_nested, format_settings.arrow.allow_missing_columns); getPort().getHeader(),
"Arrow",
format_settings.arrow.import_nested,
format_settings.arrow.allow_missing_columns,
format_settings.arrow.case_insensitive_column_matching);
missing_columns = arrow_column_to_ch_column->getMissingColumns(*schema); missing_columns = arrow_column_to_ch_column->getMissingColumns(*schema);
if (stream) if (stream)

View File

@ -31,6 +31,7 @@
#include <algorithm> #include <algorithm>
#include <arrow/builder.h> #include <arrow/builder.h>
#include <arrow/array.h> #include <arrow/array.h>
#include <boost/algorithm/string/case_conv.hpp>
/// UINT16 and UINT32 are processed separately, see comments in readColumnFromArrowColumn. /// UINT16 and UINT32 are processed separately, see comments in readColumnFromArrowColumn.
#define FOR_ARROW_NUMERIC_TYPES(M) \ #define FOR_ARROW_NUMERIC_TYPES(M) \
@ -484,15 +485,18 @@ static void checkStatus(const arrow::Status & status, const String & column_name
throw Exception{ErrorCodes::UNKNOWN_EXCEPTION, "Error with a {} column '{}': {}.", format_name, column_name, status.ToString()}; throw Exception{ErrorCodes::UNKNOWN_EXCEPTION, "Error with a {} column '{}': {}.", format_name, column_name, status.ToString()};
} }
Block ArrowColumnToCHColumn::arrowSchemaToCHHeader(const arrow::Schema & schema, const std::string & format_name, const Block * hint_header) Block ArrowColumnToCHColumn::arrowSchemaToCHHeader(
const arrow::Schema & schema, const std::string & format_name, const Block * hint_header, bool ignore_case)
{ {
ColumnsWithTypeAndName sample_columns; ColumnsWithTypeAndName sample_columns;
std::unordered_set<String> nested_table_names; std::unordered_set<String> nested_table_names;
if (hint_header) if (hint_header)
nested_table_names = Nested::getAllTableNames(*hint_header); nested_table_names = Nested::getAllTableNames(*hint_header, ignore_case);
for (const auto & field : schema.fields()) for (const auto & field : schema.fields())
{ {
if (hint_header && !hint_header->has(field->name()) && !nested_table_names.contains(field->name())) if (hint_header && !hint_header->has(field->name(), ignore_case)
&& !nested_table_names.contains(ignore_case ? boost::to_lower_copy(field->name()) : field->name()))
continue; continue;
/// Create empty arrow column by it's type and convert it to ClickHouse column. /// Create empty arrow column by it's type and convert it to ClickHouse column.
@ -516,20 +520,31 @@ Block ArrowColumnToCHColumn::arrowSchemaToCHHeader(const arrow::Schema & schema,
} }
ArrowColumnToCHColumn::ArrowColumnToCHColumn( ArrowColumnToCHColumn::ArrowColumnToCHColumn(
const Block & header_, const std::string & format_name_, bool import_nested_, bool allow_missing_columns_) const Block & header_,
: header(header_), format_name(format_name_), import_nested(import_nested_), allow_missing_columns(allow_missing_columns_) const std::string & format_name_,
bool import_nested_,
bool allow_missing_columns_,
bool case_insensitive_matching_)
: header(header_)
, format_name(format_name_)
, import_nested(import_nested_)
, allow_missing_columns(allow_missing_columns_)
, case_insensitive_matching(case_insensitive_matching_)
{ {
} }
void ArrowColumnToCHColumn::arrowTableToCHChunk(Chunk & res, std::shared_ptr<arrow::Table> & table) void ArrowColumnToCHColumn::arrowTableToCHChunk(Chunk & res, std::shared_ptr<arrow::Table> & table)
{ {
NameToColumnPtr name_to_column_ptr; NameToColumnPtr name_to_column_ptr;
for (const auto & column_name : table->ColumnNames()) for (auto column_name : table->ColumnNames())
{ {
std::shared_ptr<arrow::ChunkedArray> arrow_column = table->GetColumnByName(column_name); std::shared_ptr<arrow::ChunkedArray> arrow_column = table->GetColumnByName(column_name);
if (!arrow_column) if (!arrow_column)
throw Exception(ErrorCodes::DUPLICATE_COLUMN, "Column '{}' is duplicated", column_name); throw Exception(ErrorCodes::DUPLICATE_COLUMN, "Column '{}' is duplicated", column_name);
name_to_column_ptr[column_name] = arrow_column;
if (case_insensitive_matching)
boost::to_lower(column_name);
name_to_column_ptr[std::move(column_name)] = arrow_column;
} }
arrowColumnsToCHChunk(res, name_to_column_ptr); arrowColumnsToCHChunk(res, name_to_column_ptr);
@ -548,22 +563,31 @@ void ArrowColumnToCHColumn::arrowColumnsToCHChunk(Chunk & res, NameToColumnPtr &
{ {
const ColumnWithTypeAndName & header_column = header.getByPosition(column_i); const ColumnWithTypeAndName & header_column = header.getByPosition(column_i);
auto search_column_name = header_column.name;
if (case_insensitive_matching)
boost::to_lower(search_column_name);
bool read_from_nested = false; bool read_from_nested = false;
String nested_table_name = Nested::extractTableName(header_column.name); String nested_table_name = Nested::extractTableName(header_column.name);
if (!name_to_column_ptr.contains(header_column.name)) String search_nested_table_name = nested_table_name;
if (case_insensitive_matching)
boost::to_lower(search_nested_table_name);
if (!name_to_column_ptr.contains(search_column_name))
{ {
/// Check if it's a column from nested table. /// Check if it's a column from nested table.
if (import_nested && name_to_column_ptr.contains(nested_table_name)) if (import_nested && name_to_column_ptr.contains(search_nested_table_name))
{ {
if (!nested_tables.contains(nested_table_name)) if (!nested_tables.contains(search_nested_table_name))
{ {
std::shared_ptr<arrow::ChunkedArray> arrow_column = name_to_column_ptr[nested_table_name]; std::shared_ptr<arrow::ChunkedArray> arrow_column = name_to_column_ptr[search_nested_table_name];
ColumnsWithTypeAndName cols = {readColumnFromArrowColumn(arrow_column, nested_table_name, format_name, false, dictionary_values, true)}; ColumnsWithTypeAndName cols
= {readColumnFromArrowColumn(arrow_column, nested_table_name, format_name, false, dictionary_values, true)};
Block block(cols); Block block(cols);
nested_tables[nested_table_name] = std::make_shared<Block>(Nested::flatten(block)); nested_tables[search_nested_table_name] = std::make_shared<Block>(Nested::flatten(block));
} }
read_from_nested = nested_tables[nested_table_name]->has(header_column.name); read_from_nested = nested_tables[search_nested_table_name]->has(header_column.name, case_insensitive_matching);
} }
if (!read_from_nested) if (!read_from_nested)
@ -580,13 +604,19 @@ void ArrowColumnToCHColumn::arrowColumnsToCHChunk(Chunk & res, NameToColumnPtr &
} }
} }
std::shared_ptr<arrow::ChunkedArray> arrow_column = name_to_column_ptr[header_column.name];
ColumnWithTypeAndName column; ColumnWithTypeAndName column;
if (read_from_nested) if (read_from_nested)
column = nested_tables[nested_table_name]->getByName(header_column.name); {
column = nested_tables[search_nested_table_name]->getByName(header_column.name, case_insensitive_matching);
if (case_insensitive_matching)
column.name = header_column.name;
}
else else
{
auto arrow_column = name_to_column_ptr[search_column_name];
column = readColumnFromArrowColumn(arrow_column, header_column.name, format_name, false, dictionary_values, true); column = readColumnFromArrowColumn(arrow_column, header_column.name, format_name, false, dictionary_values, true);
}
try try
{ {
@ -594,8 +624,11 @@ void ArrowColumnToCHColumn::arrowColumnsToCHChunk(Chunk & res, NameToColumnPtr &
} }
catch (Exception & e) catch (Exception & e)
{ {
e.addMessage(fmt::format("while converting column {} from type {} to type {}", e.addMessage(fmt::format(
backQuote(header_column.name), column.type->getName(), header_column.type->getName())); "while converting column {} from type {} to type {}",
backQuote(header_column.name),
column.type->getName(),
header_column.type->getName()));
throw; throw;
} }
@ -609,22 +642,23 @@ void ArrowColumnToCHColumn::arrowColumnsToCHChunk(Chunk & res, NameToColumnPtr &
std::vector<size_t> ArrowColumnToCHColumn::getMissingColumns(const arrow::Schema & schema) const std::vector<size_t> ArrowColumnToCHColumn::getMissingColumns(const arrow::Schema & schema) const
{ {
std::vector<size_t> missing_columns; std::vector<size_t> missing_columns;
auto block_from_arrow = arrowSchemaToCHHeader(schema, format_name, &header); auto block_from_arrow = arrowSchemaToCHHeader(schema, format_name, &header, case_insensitive_matching);
auto flatten_block_from_arrow = Nested::flatten(block_from_arrow); auto flatten_block_from_arrow = Nested::flatten(block_from_arrow);
for (size_t i = 0, columns = header.columns(); i < columns; ++i) for (size_t i = 0, columns = header.columns(); i < columns; ++i)
{ {
const auto & column = header.getByPosition(i); const auto & header_column = header.getByPosition(i);
bool read_from_nested = false; bool read_from_nested = false;
String nested_table_name = Nested::extractTableName(column.name); String nested_table_name = Nested::extractTableName(header_column.name);
if (!block_from_arrow.has(column.name)) if (!block_from_arrow.has(header_column.name, case_insensitive_matching))
{ {
if (import_nested && block_from_arrow.has(nested_table_name)) if (import_nested && block_from_arrow.has(nested_table_name, case_insensitive_matching))
read_from_nested = flatten_block_from_arrow.has(column.name); read_from_nested = flatten_block_from_arrow.has(header_column.name, case_insensitive_matching);
if (!read_from_nested) if (!read_from_nested)
{ {
if (!allow_missing_columns) if (!allow_missing_columns)
throw Exception{ErrorCodes::THERE_IS_NO_COLUMN, "Column '{}' is not presented in input data.", column.name}; throw Exception{ErrorCodes::THERE_IS_NO_COLUMN, "Column '{}' is not presented in input data.", header_column.name};
missing_columns.push_back(i); missing_columns.push_back(i);
} }

View File

@ -25,7 +25,8 @@ public:
const Block & header_, const Block & header_,
const std::string & format_name_, const std::string & format_name_,
bool import_nested_, bool import_nested_,
bool allow_missing_columns_); bool allow_missing_columns_,
bool case_insensitive_matching_ = false);
void arrowTableToCHChunk(Chunk & res, std::shared_ptr<arrow::Table> & table); void arrowTableToCHChunk(Chunk & res, std::shared_ptr<arrow::Table> & table);
@ -36,7 +37,8 @@ public:
/// Transform arrow schema to ClickHouse header. If hint_header is provided, /// Transform arrow schema to ClickHouse header. If hint_header is provided,
/// we will skip columns in schema that are not in hint_header. /// we will skip columns in schema that are not in hint_header.
static Block arrowSchemaToCHHeader(const arrow::Schema & schema, const std::string & format_name, const Block * hint_header = nullptr); static Block arrowSchemaToCHHeader(
const arrow::Schema & schema, const std::string & format_name, const Block * hint_header = nullptr, bool ignore_case = false);
private: private:
const Block & header; const Block & header;
@ -44,6 +46,7 @@ private:
bool import_nested; bool import_nested;
/// If false, throw exception if some columns in header not exists in arrow table. /// If false, throw exception if some columns in header not exists in arrow table.
bool allow_missing_columns; bool allow_missing_columns;
bool case_insensitive_matching;
/// Map {column name : dictionary column}. /// Map {column name : dictionary column}.
/// To avoid converting dictionary from Arrow Dictionary /// To avoid converting dictionary from Arrow Dictionary

View File

@ -228,6 +228,14 @@ void registerNonTrivialPrefixAndSuffixCheckerJSONAsString(FormatFactory & factor
factory.registerNonTrivialPrefixAndSuffixChecker("JSONAsString", nonTrivialPrefixAndSuffixCheckerJSONEachRowImpl); factory.registerNonTrivialPrefixAndSuffixChecker("JSONAsString", nonTrivialPrefixAndSuffixCheckerJSONEachRowImpl);
} }
void registerJSONAsStringSchemaReader(FormatFactory & factory)
{
factory.registerExternalSchemaReader("JSONAsString", [](const FormatSettings &)
{
return std::make_shared<JSONAsStringExternalSchemaReader>();
});
}
void registerInputFormatJSONAsObject(FormatFactory & factory) void registerInputFormatJSONAsObject(FormatFactory & factory)
{ {
factory.registerInputFormat("JSONAsObject", []( factory.registerInputFormat("JSONAsObject", [](
@ -245,11 +253,16 @@ void registerNonTrivialPrefixAndSuffixCheckerJSONAsObject(FormatFactory & factor
factory.registerNonTrivialPrefixAndSuffixChecker("JSONAsObject", nonTrivialPrefixAndSuffixCheckerJSONEachRowImpl); factory.registerNonTrivialPrefixAndSuffixChecker("JSONAsObject", nonTrivialPrefixAndSuffixCheckerJSONEachRowImpl);
} }
void registerJSONAsStringSchemaReader(FormatFactory & factory) void registerFileSegmentationEngineJSONAsObject(FormatFactory & factory)
{ {
factory.registerExternalSchemaReader("JSONAsString", [](const FormatSettings &) factory.registerFileSegmentationEngine("JSONAsObject", &fileSegmentationEngineJSONEachRow);
}
void registerJSONAsObjectSchemaReader(FormatFactory & factory)
{ {
return std::make_shared<JSONAsStringExternalSchemaReader>(); factory.registerExternalSchemaReader("JSONAsObject", [](const FormatSettings &)
{
return std::make_shared<JSONAsObjectExternalSchemaReader>();
}); });
} }

View File

@ -5,6 +5,7 @@
#include <Formats/FormatFactory.h> #include <Formats/FormatFactory.h>
#include <IO/PeekableReadBuffer.h> #include <IO/PeekableReadBuffer.h>
#include <DataTypes/DataTypeString.h> #include <DataTypes/DataTypeString.h>
#include <DataTypes/DataTypeObject.h>
namespace DB namespace DB
{ {
@ -73,4 +74,13 @@ public:
} }
}; };
class JSONAsObjectExternalSchemaReader : public IExternalSchemaReader
{
public:
NamesAndTypesList readSchema() override
{
return {{"json", std::make_shared<DataTypeObject>("json", false)}};
}
};
} }

View File

@ -53,9 +53,6 @@ Chunk ORCBlockInputFormat::generate()
if (!table || !table->num_rows()) if (!table || !table->num_rows())
return res; return res;
if (format_settings.use_lowercase_column_name)
table = *table->RenameColumns(include_column_names);
arrow_column_to_ch_column->arrowTableToCHChunk(res, table); arrow_column_to_ch_column->arrowTableToCHChunk(res, table);
/// If defaults_for_omitted_fields is true, calculate the default values from default expression for omitted fields. /// If defaults_for_omitted_fields is true, calculate the default values from default expression for omitted fields.
/// Otherwise fill the missing columns with zero values of its type. /// Otherwise fill the missing columns with zero values of its type.
@ -73,7 +70,6 @@ void ORCBlockInputFormat::resetParser()
file_reader.reset(); file_reader.reset();
include_indices.clear(); include_indices.clear();
include_column_names.clear();
block_missing_values.clear(); block_missing_values.clear();
} }
@ -125,20 +121,6 @@ static void getFileReaderAndSchema(
if (!read_schema_result.ok()) if (!read_schema_result.ok())
throw Exception(read_schema_result.status().ToString(), ErrorCodes::BAD_ARGUMENTS); throw Exception(read_schema_result.status().ToString(), ErrorCodes::BAD_ARGUMENTS);
schema = std::move(read_schema_result).ValueOrDie(); schema = std::move(read_schema_result).ValueOrDie();
if (format_settings.use_lowercase_column_name)
{
std::vector<std::shared_ptr<::arrow::Field>> fields;
fields.reserve(schema->num_fields());
for (int i = 0; i < schema->num_fields(); ++i)
{
const auto& field = schema->field(i);
auto name = field->name();
boost::to_lower(name);
fields.push_back(field->WithName(name));
}
schema = arrow::schema(fields, schema->metadata());
}
} }
void ORCBlockInputFormat::prepareReader() void ORCBlockInputFormat::prepareReader()
@ -149,12 +131,17 @@ void ORCBlockInputFormat::prepareReader()
return; return;
arrow_column_to_ch_column = std::make_unique<ArrowColumnToCHColumn>( arrow_column_to_ch_column = std::make_unique<ArrowColumnToCHColumn>(
getPort().getHeader(), "ORC", format_settings.orc.import_nested, format_settings.orc.allow_missing_columns); getPort().getHeader(),
"ORC",
format_settings.orc.import_nested,
format_settings.orc.allow_missing_columns,
format_settings.orc.case_insensitive_column_matching);
missing_columns = arrow_column_to_ch_column->getMissingColumns(*schema); missing_columns = arrow_column_to_ch_column->getMissingColumns(*schema);
const bool ignore_case = format_settings.orc.case_insensitive_column_matching;
std::unordered_set<String> nested_table_names; std::unordered_set<String> nested_table_names;
if (format_settings.orc.import_nested) if (format_settings.orc.import_nested)
nested_table_names = Nested::getAllTableNames(getPort().getHeader()); nested_table_names = Nested::getAllTableNames(getPort().getHeader(), ignore_case);
/// In ReadStripe column indices should be started from 1, /// In ReadStripe column indices should be started from 1,
/// because 0 indicates to select all columns. /// because 0 indicates to select all columns.
@ -165,19 +152,18 @@ void ORCBlockInputFormat::prepareReader()
/// so we should recursively count the number of indices we need for this type. /// so we should recursively count the number of indices we need for this type.
int indexes_count = countIndicesForType(schema->field(i)->type()); int indexes_count = countIndicesForType(schema->field(i)->type());
const auto & name = schema->field(i)->name(); const auto & name = schema->field(i)->name();
if (getPort().getHeader().has(name) || nested_table_names.contains(name)) if (getPort().getHeader().has(name, ignore_case) || nested_table_names.contains(ignore_case ? boost::to_lower_copy(name) : name))
{ {
for (int j = 0; j != indexes_count; ++j) for (int j = 0; j != indexes_count; ++j)
{
include_indices.push_back(index + j); include_indices.push_back(index + j);
include_column_names.push_back(name);
}
} }
index += indexes_count; index += indexes_count;
} }
} }
ORCSchemaReader::ORCSchemaReader(ReadBuffer & in_, const FormatSettings & format_settings_) : ISchemaReader(in_), format_settings(format_settings_) ORCSchemaReader::ORCSchemaReader(ReadBuffer & in_, const FormatSettings & format_settings_)
: ISchemaReader(in_), format_settings(format_settings_)
{ {
} }

View File

@ -47,7 +47,6 @@ private:
// indices of columns to read from ORC file // indices of columns to read from ORC file
std::vector<int> include_indices; std::vector<int> include_indices;
std::vector<String> include_column_names;
std::vector<size_t> missing_columns; std::vector<size_t> missing_columns;
BlockMissingValues block_missing_values; BlockMissingValues block_missing_values;

View File

@ -53,11 +53,7 @@ Chunk ParquetBlockInputFormat::generate()
std::shared_ptr<arrow::Table> table; std::shared_ptr<arrow::Table> table;
arrow::Status read_status = file_reader->ReadRowGroup(row_group_current, column_indices, &table); arrow::Status read_status = file_reader->ReadRowGroup(row_group_current, column_indices, &table);
if (!read_status.ok()) if (!read_status.ok())
throw ParsingException{"Error while reading Parquet data: " + read_status.ToString(), throw ParsingException{"Error while reading Parquet data: " + read_status.ToString(), ErrorCodes::CANNOT_READ_ALL_DATA};
ErrorCodes::CANNOT_READ_ALL_DATA};
if (format_settings.use_lowercase_column_name)
table = *table->RenameColumns(column_names);
++row_group_current; ++row_group_current;
@ -78,7 +74,6 @@ void ParquetBlockInputFormat::resetParser()
file_reader.reset(); file_reader.reset();
column_indices.clear(); column_indices.clear();
column_names.clear();
row_group_current = 0; row_group_current = 0;
block_missing_values.clear(); block_missing_values.clear();
} }
@ -123,20 +118,6 @@ static void getFileReaderAndSchema(
return; return;
THROW_ARROW_NOT_OK(parquet::arrow::OpenFile(std::move(arrow_file), arrow::default_memory_pool(), &file_reader)); THROW_ARROW_NOT_OK(parquet::arrow::OpenFile(std::move(arrow_file), arrow::default_memory_pool(), &file_reader));
THROW_ARROW_NOT_OK(file_reader->GetSchema(&schema)); THROW_ARROW_NOT_OK(file_reader->GetSchema(&schema));
if (format_settings.use_lowercase_column_name)
{
std::vector<std::shared_ptr<::arrow::Field>> fields;
fields.reserve(schema->num_fields());
for (int i = 0; i < schema->num_fields(); ++i)
{
const auto& field = schema->field(i);
auto name = field->name();
boost::to_lower(name);
fields.push_back(field->WithName(name));
}
schema = arrow::schema(fields, schema->metadata());
}
} }
void ParquetBlockInputFormat::prepareReader() void ParquetBlockInputFormat::prepareReader()
@ -149,12 +130,18 @@ void ParquetBlockInputFormat::prepareReader()
row_group_total = file_reader->num_row_groups(); row_group_total = file_reader->num_row_groups();
row_group_current = 0; row_group_current = 0;
arrow_column_to_ch_column = std::make_unique<ArrowColumnToCHColumn>(getPort().getHeader(), "Parquet", format_settings.parquet.import_nested, format_settings.parquet.allow_missing_columns); arrow_column_to_ch_column = std::make_unique<ArrowColumnToCHColumn>(
getPort().getHeader(),
"Parquet",
format_settings.parquet.import_nested,
format_settings.parquet.allow_missing_columns,
format_settings.parquet.case_insensitive_column_matching);
missing_columns = arrow_column_to_ch_column->getMissingColumns(*schema); missing_columns = arrow_column_to_ch_column->getMissingColumns(*schema);
const bool ignore_case = format_settings.parquet.case_insensitive_column_matching;
std::unordered_set<String> nested_table_names; std::unordered_set<String> nested_table_names;
if (format_settings.parquet.import_nested) if (format_settings.parquet.import_nested)
nested_table_names = Nested::getAllTableNames(getPort().getHeader()); nested_table_names = Nested::getAllTableNames(getPort().getHeader(), ignore_case);
int index = 0; int index = 0;
for (int i = 0; i < schema->num_fields(); ++i) for (int i = 0; i < schema->num_fields(); ++i)
@ -164,19 +151,19 @@ void ParquetBlockInputFormat::prepareReader()
/// count the number of indices we need for this type. /// count the number of indices we need for this type.
int indexes_count = countIndicesForType(schema->field(i)->type()); int indexes_count = countIndicesForType(schema->field(i)->type());
const auto & name = schema->field(i)->name(); const auto & name = schema->field(i)->name();
if (getPort().getHeader().has(name) || nested_table_names.contains(name))
if (getPort().getHeader().has(name, ignore_case) || nested_table_names.contains(ignore_case ? boost::to_lower_copy(name) : name))
{ {
for (int j = 0; j != indexes_count; ++j) for (int j = 0; j != indexes_count; ++j)
{
column_indices.push_back(index + j); column_indices.push_back(index + j);
column_names.push_back(name);
}
} }
index += indexes_count; index += indexes_count;
} }
} }
ParquetSchemaReader::ParquetSchemaReader(ReadBuffer & in_, const FormatSettings & format_settings_) : ISchemaReader(in_), format_settings(format_settings_) ParquetSchemaReader::ParquetSchemaReader(ReadBuffer & in_, const FormatSettings & format_settings_)
: ISchemaReader(in_), format_settings(format_settings_)
{ {
} }

View File

@ -40,7 +40,6 @@ private:
int row_group_total = 0; int row_group_total = 0;
// indices of columns to read from Parquet file // indices of columns to read from Parquet file
std::vector<int> column_indices; std::vector<int> column_indices;
std::vector<String> column_names;
std::unique_ptr<ArrowColumnToCHColumn> arrow_column_to_ch_column; std::unique_ptr<ArrowColumnToCHColumn> arrow_column_to_ch_column;
int row_group_current = 0; int row_group_current = 0;
std::vector<size_t> missing_columns; std::vector<size_t> missing_columns;

View File

@ -1,16 +1,22 @@
#include <Processors/QueryPlan/QueryPlan.h> #include <stack>
#include <Processors/QueryPlan/IQueryPlanStep.h>
#include <QueryPipeline/QueryPipelineBuilder.h> #include <Common/JSONBuilder.h>
#include <IO/WriteBuffer.h>
#include <IO/Operators.h>
#include <Interpreters/ActionsDAG.h> #include <Interpreters/ActionsDAG.h>
#include <Interpreters/ArrayJoinAction.h> #include <Interpreters/ArrayJoinAction.h>
#include <stack>
#include <IO/Operators.h>
#include <IO/WriteBuffer.h>
#include <Processors/QueryPlan/BuildQueryPipelineSettings.h>
#include <Processors/QueryPlan/IQueryPlanStep.h>
#include <Processors/QueryPlan/Optimizations/Optimizations.h> #include <Processors/QueryPlan/Optimizations/Optimizations.h>
#include <Processors/QueryPlan/Optimizations/QueryPlanOptimizationSettings.h> #include <Processors/QueryPlan/Optimizations/QueryPlanOptimizationSettings.h>
#include <Processors/QueryPlan/BuildQueryPipelineSettings.h> #include <Processors/QueryPlan/QueryPlan.h>
#include <Processors/QueryPlan/ReadFromMergeTree.h> #include <Processors/QueryPlan/ReadFromMergeTree.h>
#include <Common/JSONBuilder.h>
#include <QueryPipeline/QueryPipelineBuilder.h>
namespace DB namespace DB
{ {
@ -388,6 +394,7 @@ void QueryPlan::explainPlan(WriteBuffer & buffer, const ExplainPlanOptions & opt
static void explainPipelineStep(IQueryPlanStep & step, IQueryPlanStep::FormatSettings & settings) static void explainPipelineStep(IQueryPlanStep & step, IQueryPlanStep::FormatSettings & settings)
{ {
settings.out << String(settings.offset, settings.indent_char) << "(" << step.getName() << ")\n"; settings.out << String(settings.offset, settings.indent_char) << "(" << step.getName() << ")\n";
size_t current_offset = settings.offset; size_t current_offset = settings.offset;
step.describePipeline(settings); step.describePipeline(settings);
if (current_offset == settings.offset) if (current_offset == settings.offset)

View File

@ -112,6 +112,9 @@ ReadFromMergeTree::ReadFromMergeTree(
if (enable_parallel_reading) if (enable_parallel_reading)
read_task_callback = context->getMergeTreeReadTaskCallback(); read_task_callback = context->getMergeTreeReadTaskCallback();
/// Add explicit description.
setStepDescription(data.getStorageID().getFullNameNotQuoted());
} }
Pipe ReadFromMergeTree::readFromPool( Pipe ReadFromMergeTree::readFromPool(

View File

@ -100,7 +100,8 @@ public:
bool enable_parallel_reading bool enable_parallel_reading
); );
String getName() const override { return "ReadFromMergeTree"; } static constexpr auto name = "ReadFromMergeTree";
String getName() const override { return name; }
void initializePipeline(QueryPipelineBuilder & pipeline, const BuildQueryPipelineSettings &) override; void initializePipeline(QueryPipelineBuilder & pipeline, const BuildQueryPipelineSettings &) override;

View File

@ -325,6 +325,7 @@ void URLBasedDataSourceConfiguration::set(const URLBasedDataSourceConfiguration
compression_method = conf.compression_method; compression_method = conf.compression_method;
structure = conf.structure; structure = conf.structure;
http_method = conf.http_method; http_method = conf.http_method;
headers = conf.headers;
} }
@ -364,6 +365,10 @@ std::optional<URLBasedDataSourceConfig> getURLBasedDataSourceConfiguration(const
{ {
configuration.structure = config.getString(config_prefix + ".structure", ""); configuration.structure = config.getString(config_prefix + ".structure", "");
} }
else if (key == "compression_method")
{
configuration.compression_method = config.getString(config_prefix + ".compression_method", "");
}
else if (key == "headers") else if (key == "headers")
{ {
Poco::Util::AbstractConfiguration::Keys header_keys; Poco::Util::AbstractConfiguration::Keys header_keys;

View File

@ -279,14 +279,17 @@ bool MergeFromLogEntryTask::finalize(ReplicatedMergeMutateTaskBase::PartLogWrite
ProfileEvents::increment(ProfileEvents::DataAfterMergeDiffersFromReplica); ProfileEvents::increment(ProfileEvents::DataAfterMergeDiffersFromReplica);
LOG_ERROR(log, LOG_ERROR(log,
"{}. Data after merge is not byte-identical to data on another replicas. There could be several" "{}. Data after merge is not byte-identical to data on another replicas. There could be several reasons:"
" reasons: 1. Using newer version of compression library after server update. 2. Using another" " 1. Using newer version of compression library after server update."
" compression method. 3. Non-deterministic compression algorithm (highly unlikely). 4." " 2. Using another compression method."
" Non-deterministic merge algorithm due to logical error in code. 5. Data corruption in memory due" " 3. Non-deterministic compression algorithm (highly unlikely)."
" to bug in code. 6. Data corruption in memory due to hardware issue. 7. Manual modification of" " 4. Non-deterministic merge algorithm due to logical error in code."
" source data after server startup. 8. Manual modification of checksums stored in ZooKeeper. 9." " 5. Data corruption in memory due to bug in code."
" Part format related settings like 'enable_mixed_granularity_parts' are different on different" " 6. Data corruption in memory due to hardware issue."
" replicas. We will download merged part from replica to force byte-identical result.", " 7. Manual modification of source data after server startup."
" 8. Manual modification of checksums stored in ZooKeeper."
" 9. Part format related settings like 'enable_mixed_granularity_parts' are different on different replicas."
" We will download merged part from replica to force byte-identical result.",
getCurrentExceptionMessage(false)); getCurrentExceptionMessage(false));
write_part_log(ExecutionStatus::fromCurrentException()); write_part_log(ExecutionStatus::fromCurrentException());

View File

@ -185,7 +185,8 @@ bool MutateFromLogEntryTask::finalize(ReplicatedMergeMutateTaskBase::PartLogWrit
ProfileEvents::increment(ProfileEvents::DataAfterMutationDiffersFromReplica); ProfileEvents::increment(ProfileEvents::DataAfterMutationDiffersFromReplica);
LOG_ERROR(log, "{}. Data after mutation is not byte-identical to data on another replicas. We will download merged part from replica to force byte-identical result.", getCurrentExceptionMessage(false)); LOG_ERROR(log, "{}. Data after mutation is not byte-identical to data on another replicas. "
"We will download merged part from replica to force byte-identical result.", getCurrentExceptionMessage(false));
write_part_log(ExecutionStatus::fromCurrentException()); write_part_log(ExecutionStatus::fromCurrentException());

View File

@ -98,8 +98,24 @@ MaterializedPostgreSQLConsumer::StorageData::Buffer::Buffer(
} }
void MaterializedPostgreSQLConsumer::assertCorrectInsertion(StorageData::Buffer & buffer, size_t column_idx)
{
if (column_idx >= buffer.description.sample_block.columns()
|| column_idx >= buffer.description.types.size()
|| column_idx >= buffer.columns.size())
throw Exception(
ErrorCodes::LOGICAL_ERROR,
"Attempt to insert into buffer at position: {}, but block columns size is {}, types size: {}, columns size: {}, buffer structure: {}",
column_idx,
buffer.description.sample_block.columns(), buffer.description.types.size(), buffer.columns.size(),
buffer.description.sample_block.dumpStructure());
}
void MaterializedPostgreSQLConsumer::insertValue(StorageData::Buffer & buffer, const std::string & value, size_t column_idx) void MaterializedPostgreSQLConsumer::insertValue(StorageData::Buffer & buffer, const std::string & value, size_t column_idx)
{ {
assertCorrectInsertion(buffer, column_idx);
const auto & sample = buffer.description.sample_block.getByPosition(column_idx); const auto & sample = buffer.description.sample_block.getByPosition(column_idx);
bool is_nullable = buffer.description.types[column_idx].second; bool is_nullable = buffer.description.types[column_idx].second;
@ -134,6 +150,8 @@ void MaterializedPostgreSQLConsumer::insertValue(StorageData::Buffer & buffer, c
void MaterializedPostgreSQLConsumer::insertDefaultValue(StorageData::Buffer & buffer, size_t column_idx) void MaterializedPostgreSQLConsumer::insertDefaultValue(StorageData::Buffer & buffer, size_t column_idx)
{ {
assertCorrectInsertion(buffer, column_idx);
const auto & sample = buffer.description.sample_block.getByPosition(column_idx); const auto & sample = buffer.description.sample_block.getByPosition(column_idx);
insertDefaultPostgreSQLValue(*buffer.columns[column_idx], *sample.column); insertDefaultPostgreSQLValue(*buffer.columns[column_idx], *sample.column);
} }
@ -514,14 +532,15 @@ void MaterializedPostgreSQLConsumer::processReplicationMessage(const char * repl
void MaterializedPostgreSQLConsumer::syncTables() void MaterializedPostgreSQLConsumer::syncTables()
{
try
{ {
for (const auto & table_name : tables_to_sync) for (const auto & table_name : tables_to_sync)
{ {
auto & storage_data = storages.find(table_name)->second; auto & storage_data = storages.find(table_name)->second;
Block result_rows = storage_data.buffer.description.sample_block.cloneWithColumns(std::move(storage_data.buffer.columns)); Block result_rows = storage_data.buffer.description.sample_block.cloneWithColumns(std::move(storage_data.buffer.columns));
storage_data.buffer.columns = storage_data.buffer.description.sample_block.cloneEmptyColumns();
try
{
if (result_rows.rows()) if (result_rows.rows())
{ {
auto storage = storage_data.storage; auto storage = storage_data.storage;
@ -543,13 +562,18 @@ void MaterializedPostgreSQLConsumer::syncTables()
CompletedPipelineExecutor executor(io.pipeline); CompletedPipelineExecutor executor(io.pipeline);
executor.execute(); executor.execute();
}
storage_data.buffer.columns = storage_data.buffer.description.sample_block.cloneEmptyColumns(); }
catch (...)
{
tryLogCurrentException(__PRETTY_FUNCTION__);
} }
} }
LOG_DEBUG(log, "Table sync end for {} tables, last lsn: {} = {}, (attempted lsn {})", tables_to_sync.size(), current_lsn, getLSNValue(current_lsn), getLSNValue(final_lsn)); LOG_DEBUG(log, "Table sync end for {} tables, last lsn: {} = {}, (attempted lsn {})", tables_to_sync.size(), current_lsn, getLSNValue(current_lsn), getLSNValue(final_lsn));
try
{
auto tx = std::make_shared<pqxx::nontransaction>(connection->getRef()); auto tx = std::make_shared<pqxx::nontransaction>(connection->getRef());
current_lsn = advanceLSN(tx); current_lsn = advanceLSN(tx);
tables_to_sync.clear(); tables_to_sync.clear();

View File

@ -122,6 +122,8 @@ private:
void markTableAsSkipped(Int32 relation_id, const String & relation_name); void markTableAsSkipped(Int32 relation_id, const String & relation_name);
static void assertCorrectInsertion(StorageData::Buffer & buffer, size_t column_idx);
/// lsn - log sequnce nuumber, like wal offset (64 bit). /// lsn - log sequnce nuumber, like wal offset (64 bit).
static Int64 getLSNValue(const std::string & lsn) static Int64 getLSNValue(const std::string & lsn)
{ {

View File

@ -64,8 +64,8 @@ PostgreSQLReplicationHandler::PostgreSQLReplicationHandler(
bool is_attach_, bool is_attach_,
const MaterializedPostgreSQLSettings & replication_settings, const MaterializedPostgreSQLSettings & replication_settings,
bool is_materialized_postgresql_database_) bool is_materialized_postgresql_database_)
: log(&Poco::Logger::get("PostgreSQLReplicationHandler")) : WithContext(context_->getGlobalContext())
, context(context_) , log(&Poco::Logger::get("PostgreSQLReplicationHandler"))
, is_attach(is_attach_) , is_attach(is_attach_)
, postgres_database(postgres_database_) , postgres_database(postgres_database_)
, postgres_schema(replication_settings.materialized_postgresql_schema) , postgres_schema(replication_settings.materialized_postgresql_schema)
@ -94,9 +94,9 @@ PostgreSQLReplicationHandler::PostgreSQLReplicationHandler(
} }
publication_name = fmt::format("{}_ch_publication", replication_identifier); publication_name = fmt::format("{}_ch_publication", replication_identifier);
startup_task = context->getSchedulePool().createTask("PostgreSQLReplicaStartup", [this]{ checkConnectionAndStart(); }); startup_task = getContext()->getSchedulePool().createTask("PostgreSQLReplicaStartup", [this]{ checkConnectionAndStart(); });
consumer_task = context->getSchedulePool().createTask("PostgreSQLReplicaStartup", [this]{ consumerFunc(); }); consumer_task = getContext()->getSchedulePool().createTask("PostgreSQLReplicaStartup", [this]{ consumerFunc(); });
cleanup_task = context->getSchedulePool().createTask("PostgreSQLReplicaStartup", [this]{ cleanupFunc(); }); cleanup_task = getContext()->getSchedulePool().createTask("PostgreSQLReplicaStartup", [this]{ cleanupFunc(); });
} }
@ -296,7 +296,7 @@ void PostgreSQLReplicationHandler::startSynchronization(bool throw_on_error)
/// (Apart from the case, when shutdownFinal is called). /// (Apart from the case, when shutdownFinal is called).
/// Handler uses it only for loadFromSnapshot and shutdown methods. /// Handler uses it only for loadFromSnapshot and shutdown methods.
consumer = std::make_shared<MaterializedPostgreSQLConsumer>( consumer = std::make_shared<MaterializedPostgreSQLConsumer>(
context, getContext(),
std::move(tmp_connection), std::move(tmp_connection),
replication_slot, replication_slot,
publication_name, publication_name,
@ -921,9 +921,9 @@ void PostgreSQLReplicationHandler::reloadFromSnapshot(const std::vector<std::pai
for (const auto & [relation_id, table_name] : relation_data) for (const auto & [relation_id, table_name] : relation_data)
{ {
auto storage = DatabaseCatalog::instance().getTable(StorageID(current_database_name, table_name), context); auto storage = DatabaseCatalog::instance().getTable(StorageID(current_database_name, table_name), getContext());
auto * materialized_storage = storage->as <StorageMaterializedPostgreSQL>(); auto * materialized_storage = storage->as <StorageMaterializedPostgreSQL>();
auto materialized_table_lock = materialized_storage->lockForShare(String(), context->getSettingsRef().lock_acquire_timeout); auto materialized_table_lock = materialized_storage->lockForShare(String(), getContext()->getSettingsRef().lock_acquire_timeout);
/// If for some reason this temporary table already exists - also drop it. /// If for some reason this temporary table already exists - also drop it.
auto temp_materialized_storage = materialized_storage->createTemporary(); auto temp_materialized_storage = materialized_storage->createTemporary();

View File

@ -13,7 +13,7 @@ namespace DB
class StorageMaterializedPostgreSQL; class StorageMaterializedPostgreSQL;
struct SettingChange; struct SettingChange;
class PostgreSQLReplicationHandler class PostgreSQLReplicationHandler : WithContext
{ {
friend class TemporaryReplicationSlot; friend class TemporaryReplicationSlot;
@ -98,7 +98,6 @@ private:
std::pair<String, String> getSchemaAndTableName(const String & table_name) const; std::pair<String, String> getSchemaAndTableName(const String & table_name) const;
Poco::Logger * log; Poco::Logger * log;
ContextPtr context;
/// If it is not attach, i.e. a create query, then if publication already exists - always drop it. /// If it is not attach, i.e. a create query, then if publication already exists - always drop it.
bool is_attach; bool is_attach;

View File

@ -1312,10 +1312,14 @@ void StorageReplicatedMergeTree::checkPartChecksumsAndAddCommitOps(const zkutil:
if (replica_part_header.getColumnsHash() != local_part_header.getColumnsHash()) if (replica_part_header.getColumnsHash() != local_part_header.getColumnsHash())
{ {
/// Either it's a bug or ZooKeeper contains broken data. /// Currently there are two (known) cases when it may happen:
/// TODO Fix KILL MUTATION and replace CHECKSUM_DOESNT_MATCH with LOGICAL_ERROR /// - KILL MUTATION query had removed mutation before all replicas have executed assigned MUTATE_PART entries.
/// (some replicas may skip killed mutation even if it was executed on other replicas) /// Some replicas may skip this mutation and update part version without actually applying any changes.
throw Exception(ErrorCodes::CHECKSUM_DOESNT_MATCH, "Part {} from {} has different columns hash", part_name, replica); /// It leads to mismatching checksum if changes were applied on other replicas.
/// - ALTER_METADATA and MERGE_PARTS were reordered on some replicas.
/// It may lead to different number of columns in merged parts on these replicas.
throw Exception(ErrorCodes::CHECKSUM_DOESNT_MATCH, "Part {} from {} has different columns hash "
"(it may rarely happen on race condition with KILL MUTATION or ALTER COLUMN).", part_name, replica);
} }
replica_part_header.getChecksums().checkEqual(local_part_header.getChecksums(), true); replica_part_header.getChecksums().checkEqual(local_part_header.getChecksums(), true);

View File

@ -9,11 +9,10 @@ from github import Github
from env_helper import ( from env_helper import (
GITHUB_REPOSITORY, GITHUB_REPOSITORY,
TEMP_PATH, GITHUB_RUN_URL,
REPO_COPY,
REPORTS_PATH, REPORTS_PATH,
GITHUB_SERVER_URL, REPO_COPY,
GITHUB_RUN_ID, TEMP_PATH,
) )
from s3_helper import S3Helper from s3_helper import S3Helper
from get_robot_token import get_best_robot_token from get_robot_token import get_best_robot_token
@ -126,7 +125,7 @@ if __name__ == "__main__":
logging.info("Exception uploading file %s text %s", f, ex) logging.info("Exception uploading file %s text %s", f, ex)
paths[f] = "" paths[f] = ""
report_url = f"{GITHUB_SERVER_URL}/{GITHUB_REPOSITORY}/actions/runs/{GITHUB_RUN_ID}" report_url = GITHUB_RUN_URL
if paths["runlog.log"]: if paths["runlog.log"]:
report_url = paths["runlog.log"] report_url = paths["runlog.log"]
if paths["main.log"]: if paths["main.log"]:

View File

@ -11,7 +11,7 @@ from env_helper import (
TEMP_PATH, TEMP_PATH,
GITHUB_REPOSITORY, GITHUB_REPOSITORY,
GITHUB_SERVER_URL, GITHUB_SERVER_URL,
GITHUB_RUN_ID, GITHUB_RUN_URL,
) )
from report import create_build_html_report from report import create_build_html_report
from s3_helper import S3Helper from s3_helper import S3Helper
@ -148,6 +148,17 @@ if __name__ == "__main__":
build_name, build_name,
) )
some_builds_are_missing = len(build_reports_map) < len(reports_order)
if some_builds_are_missing:
logging.info(
"Expected to get %s build results, got %s",
len(reports_order),
len(build_reports_map),
)
else:
logging.info("Got exactly %s builds", len(build_reports_map))
build_reports = [ build_reports = [
build_reports_map[build_name] build_reports_map[build_name]
for build_name in reports_order for build_name in reports_order
@ -180,9 +191,7 @@ if __name__ == "__main__":
branch_name = "PR #{}".format(pr_info.number) branch_name = "PR #{}".format(pr_info.number)
branch_url = f"{GITHUB_SERVER_URL}/{GITHUB_REPOSITORY}/pull/{pr_info.number}" branch_url = f"{GITHUB_SERVER_URL}/{GITHUB_REPOSITORY}/pull/{pr_info.number}"
commit_url = f"{GITHUB_SERVER_URL}/{GITHUB_REPOSITORY}/commit/{pr_info.sha}" commit_url = f"{GITHUB_SERVER_URL}/{GITHUB_REPOSITORY}/commit/{pr_info.sha}"
task_url = ( task_url = GITHUB_RUN_URL
f"{GITHUB_SERVER_URL}/{GITHUB_REPOSITORY}/actions/runs/{GITHUB_RUN_ID or '0'}"
)
report = create_build_html_report( report = create_build_html_report(
build_check_name, build_check_name,
build_results, build_results,
@ -221,10 +230,10 @@ if __name__ == "__main__":
if build_result.status == "success": if build_result.status == "success":
ok_builds += 1 ok_builds += 1
if ok_builds == 0: if ok_builds == 0 or some_builds_are_missing:
summary_status = "error" summary_status = "error"
description = "{}/{} builds are OK".format(ok_builds, total_builds) description = f"{ok_builds}/{total_builds} builds are OK"
print("::notice ::Report url: {}".format(url)) print("::notice ::Report url: {}".format(url))

View File

@ -8,30 +8,16 @@ from get_robot_token import get_parameter_from_ssm
class ClickHouseHelper: class ClickHouseHelper:
def __init__(self, url=None, user=None, password=None): def __init__(self, url=None):
self.url2 = None
self.auth2 = None
if url is None: if url is None:
url = get_parameter_from_ssm("clickhouse-test-stat-url") self.url = get_parameter_from_ssm("clickhouse-test-stat-url2")
self.url2 = get_parameter_from_ssm("clickhouse-test-stat-url2") self.auth = {
self.auth2 = {
"X-ClickHouse-User": get_parameter_from_ssm( "X-ClickHouse-User": get_parameter_from_ssm(
"clickhouse-test-stat-login2" "clickhouse-test-stat-login2"
), ),
"X-ClickHouse-Key": "", "X-ClickHouse-Key": "",
} }
self.url = url
self.auth = {
"X-ClickHouse-User": user
if user is not None
else get_parameter_from_ssm("clickhouse-test-stat-login"),
"X-ClickHouse-Key": password
if password is not None
else get_parameter_from_ssm("clickhouse-test-stat-password"),
}
@staticmethod @staticmethod
def _insert_json_str_info_impl(url, auth, db, table, json_str): def _insert_json_str_info_impl(url, auth, db, table, json_str):
params = { params = {
@ -78,8 +64,6 @@ class ClickHouseHelper:
def _insert_json_str_info(self, db, table, json_str): def _insert_json_str_info(self, db, table, json_str):
self._insert_json_str_info_impl(self.url, self.auth, db, table, json_str) self._insert_json_str_info_impl(self.url, self.auth, db, table, json_str)
if self.url2:
self._insert_json_str_info_impl(self.url2, self.auth2, db, table, json_str)
def insert_event_into(self, db, table, event): def insert_event_into(self, db, table, event):
event_str = json.dumps(event) event_str = json.dumps(event)

View File

@ -11,7 +11,7 @@ from typing import Dict, List, Optional, Set, Tuple, Union
from github import Github from github import Github
from env_helper import GITHUB_WORKSPACE, RUNNER_TEMP from env_helper import GITHUB_WORKSPACE, RUNNER_TEMP, GITHUB_RUN_URL
from s3_helper import S3Helper from s3_helper import S3Helper
from pr_info import PRInfo from pr_info import PRInfo
from get_robot_token import get_best_robot_token, get_parameter_from_ssm from get_robot_token import get_best_robot_token, get_parameter_from_ssm
@ -234,6 +234,7 @@ def build_and_push_one_image(
with open(build_log, "wb") as bl: with open(build_log, "wb") as bl:
cmd = ( cmd = (
"docker buildx build --builder default " "docker buildx build --builder default "
f"--label build-url={GITHUB_RUN_URL} "
f"{from_tag_arg}" f"{from_tag_arg}"
f"--build-arg BUILDKIT_INLINE_CACHE=1 " f"--build-arg BUILDKIT_INLINE_CACHE=1 "
f"--tag {image.repo}:{version_string} " f"--tag {image.repo}:{version_string} "

View File

@ -4,6 +4,7 @@ import os
import unittest import unittest
from unittest.mock import patch from unittest.mock import patch
from env_helper import GITHUB_RUN_URL
from pr_info import PRInfo from pr_info import PRInfo
import docker_images_check as di import docker_images_check as di
@ -117,7 +118,8 @@ class TestDockerImageCheck(unittest.TestCase):
mock_popen.assert_called_once() mock_popen.assert_called_once()
mock_machine.assert_not_called() mock_machine.assert_not_called()
self.assertIn( self.assertIn(
"docker buildx build --builder default --build-arg FROM_TAG=version " f"docker buildx build --builder default --label build-url={GITHUB_RUN_URL} "
"--build-arg FROM_TAG=version "
"--build-arg BUILDKIT_INLINE_CACHE=1 --tag name:version --cache-from " "--build-arg BUILDKIT_INLINE_CACHE=1 --tag name:version --cache-from "
"type=registry,ref=name:version --push --progress plain path", "type=registry,ref=name:version --push --progress plain path",
mock_popen.call_args.args, mock_popen.call_args.args,
@ -133,7 +135,8 @@ class TestDockerImageCheck(unittest.TestCase):
mock_popen.assert_called_once() mock_popen.assert_called_once()
mock_machine.assert_not_called() mock_machine.assert_not_called()
self.assertIn( self.assertIn(
"docker buildx build --builder default --build-arg FROM_TAG=version2 " f"docker buildx build --builder default --label build-url={GITHUB_RUN_URL} "
"--build-arg FROM_TAG=version2 "
"--build-arg BUILDKIT_INLINE_CACHE=1 --tag name:version2 --cache-from " "--build-arg BUILDKIT_INLINE_CACHE=1 --tag name:version2 --cache-from "
"type=registry,ref=name:version2 --progress plain path", "type=registry,ref=name:version2 --progress plain path",
mock_popen.call_args.args, mock_popen.call_args.args,
@ -149,7 +152,7 @@ class TestDockerImageCheck(unittest.TestCase):
mock_popen.assert_called_once() mock_popen.assert_called_once()
mock_machine.assert_not_called() mock_machine.assert_not_called()
self.assertIn( self.assertIn(
"docker buildx build --builder default " f"docker buildx build --builder default --label build-url={GITHUB_RUN_URL} "
"--build-arg BUILDKIT_INLINE_CACHE=1 --tag name:version2 --cache-from " "--build-arg BUILDKIT_INLINE_CACHE=1 --tag name:version2 --cache-from "
"type=registry,ref=name:version2 --progress plain path", "type=registry,ref=name:version2 --progress plain path",
mock_popen.call_args.args, mock_popen.call_args.args,

View File

@ -7,9 +7,10 @@ CACHES_PATH = os.getenv("CACHES_PATH", TEMP_PATH)
CLOUDFLARE_TOKEN = os.getenv("CLOUDFLARE_TOKEN") CLOUDFLARE_TOKEN = os.getenv("CLOUDFLARE_TOKEN")
GITHUB_EVENT_PATH = os.getenv("GITHUB_EVENT_PATH") GITHUB_EVENT_PATH = os.getenv("GITHUB_EVENT_PATH")
GITHUB_REPOSITORY = os.getenv("GITHUB_REPOSITORY", "ClickHouse/ClickHouse") GITHUB_REPOSITORY = os.getenv("GITHUB_REPOSITORY", "ClickHouse/ClickHouse")
GITHUB_RUN_ID = os.getenv("GITHUB_RUN_ID") GITHUB_RUN_ID = os.getenv("GITHUB_RUN_ID", "0")
GITHUB_SERVER_URL = os.getenv("GITHUB_SERVER_URL", "https://github.com") GITHUB_SERVER_URL = os.getenv("GITHUB_SERVER_URL", "https://github.com")
GITHUB_WORKSPACE = os.getenv("GITHUB_WORKSPACE", os.path.abspath("../../")) GITHUB_WORKSPACE = os.getenv("GITHUB_WORKSPACE", os.path.abspath("../../"))
GITHUB_RUN_URL = f"{GITHUB_SERVER_URL}/{GITHUB_REPOSITORY}/actions/runs/{GITHUB_RUN_ID}"
IMAGES_PATH = os.getenv("IMAGES_PATH") IMAGES_PATH = os.getenv("IMAGES_PATH")
REPORTS_PATH = os.getenv("REPORTS_PATH", "./reports") REPORTS_PATH = os.getenv("REPORTS_PATH", "./reports")
REPO_COPY = os.getenv("REPO_COPY", os.path.abspath("../../")) REPO_COPY = os.getenv("REPO_COPY", os.path.abspath("../../"))

View File

@ -2,7 +2,7 @@
import logging import logging
from github import Github from github import Github
from env_helper import GITHUB_SERVER_URL, GITHUB_REPOSITORY, GITHUB_RUN_ID from env_helper import GITHUB_RUN_URL
from pr_info import PRInfo from pr_info import PRInfo
from get_robot_token import get_best_robot_token from get_robot_token import get_best_robot_token
from commit_status_helper import get_commit from commit_status_helper import get_commit
@ -33,7 +33,7 @@ if __name__ == "__main__":
gh = Github(get_best_robot_token()) gh = Github(get_best_robot_token())
commit = get_commit(gh, pr_info.sha) commit = get_commit(gh, pr_info.sha)
url = f"{GITHUB_SERVER_URL}/{GITHUB_REPOSITORY}/actions/runs/{GITHUB_RUN_ID}" url = GITHUB_RUN_URL
statuses = filter_statuses(list(commit.get_statuses())) statuses = filter_statuses(list(commit.get_statuses()))
if NAME in statuses and statuses[NAME].state == "pending": if NAME in statuses and statuses[NAME].state == "pending":
commit.create_status( commit.create_status(

View File

@ -11,6 +11,7 @@ import re
from github import Github from github import Github
from env_helper import GITHUB_RUN_URL
from pr_info import PRInfo from pr_info import PRInfo
from s3_helper import S3Helper from s3_helper import S3Helper
from get_robot_token import get_best_robot_token from get_robot_token import get_best_robot_token
@ -88,9 +89,9 @@ if __name__ == "__main__":
else: else:
pr_link = f"https://github.com/ClickHouse/ClickHouse/pull/{pr_info.number}" pr_link = f"https://github.com/ClickHouse/ClickHouse/pull/{pr_info.number}"
task_url = f"https://github.com/ClickHouse/ClickHouse/actions/runs/{os.getenv('GITHUB_RUN_ID')}" docker_env += (
docker_env += ' -e CHPC_ADD_REPORT_LINKS="<a href={}>Job (actions)</a> <a href={}>Tested commit</a>"'.format( f' -e CHPC_ADD_REPORT_LINKS="<a href={GITHUB_RUN_URL}>'
task_url, pr_link f'Job (actions)</a> <a href={pr_link}>Tested commit</a>"'
) )
if "RUN_BY_HASH_TOTAL" in os.environ: if "RUN_BY_HASH_TOTAL" in os.environ:
@ -199,7 +200,7 @@ if __name__ == "__main__":
status = "failure" status = "failure"
message = "No message in report." message = "No message in report."
report_url = task_url report_url = GITHUB_RUN_URL
if paths["runlog.log"]: if paths["runlog.log"]:
report_url = paths["runlog.log"] report_url = paths["runlog.log"]

View File

@ -8,7 +8,7 @@ from build_download_helper import get_with_retries
from env_helper import ( from env_helper import (
GITHUB_REPOSITORY, GITHUB_REPOSITORY,
GITHUB_SERVER_URL, GITHUB_SERVER_URL,
GITHUB_RUN_ID, GITHUB_RUN_URL,
GITHUB_EVENT_PATH, GITHUB_EVENT_PATH,
) )
@ -111,7 +111,7 @@ class PRInfo:
self.sha = github_event["pull_request"]["head"]["sha"] self.sha = github_event["pull_request"]["head"]["sha"]
repo_prefix = f"{GITHUB_SERVER_URL}/{GITHUB_REPOSITORY}" repo_prefix = f"{GITHUB_SERVER_URL}/{GITHUB_REPOSITORY}"
self.task_url = f"{repo_prefix}/actions/runs/{GITHUB_RUN_ID or '0'}" self.task_url = GITHUB_RUN_URL
self.repo_full_name = GITHUB_REPOSITORY self.repo_full_name = GITHUB_REPOSITORY
self.commit_html_url = f"{repo_prefix}/commits/{self.sha}" self.commit_html_url = f"{repo_prefix}/commits/{self.sha}"
@ -142,7 +142,7 @@ class PRInfo:
self.sha = github_event["after"] self.sha = github_event["after"]
pull_request = get_pr_for_commit(self.sha, github_event["ref"]) pull_request = get_pr_for_commit(self.sha, github_event["ref"])
repo_prefix = f"{GITHUB_SERVER_URL}/{GITHUB_REPOSITORY}" repo_prefix = f"{GITHUB_SERVER_URL}/{GITHUB_REPOSITORY}"
self.task_url = f"{repo_prefix}/actions/runs/{GITHUB_RUN_ID or '0'}" self.task_url = GITHUB_RUN_URL
self.commit_html_url = f"{repo_prefix}/commits/{self.sha}" self.commit_html_url = f"{repo_prefix}/commits/{self.sha}"
self.repo_full_name = GITHUB_REPOSITORY self.repo_full_name = GITHUB_REPOSITORY
if pull_request is None or pull_request["state"] == "closed": if pull_request is None or pull_request["state"] == "closed":
@ -180,7 +180,7 @@ class PRInfo:
self.number = 0 self.number = 0
self.labels = {} self.labels = {}
repo_prefix = f"{GITHUB_SERVER_URL}/{GITHUB_REPOSITORY}" repo_prefix = f"{GITHUB_SERVER_URL}/{GITHUB_REPOSITORY}"
self.task_url = f"{repo_prefix}/actions/runs/{GITHUB_RUN_ID or '0'}" self.task_url = GITHUB_RUN_URL
self.commit_html_url = f"{repo_prefix}/commits/{self.sha}" self.commit_html_url = f"{repo_prefix}/commits/{self.sha}"
self.repo_full_name = GITHUB_REPOSITORY self.repo_full_name = GITHUB_REPOSITORY
self.pr_html_url = f"{repo_prefix}/commits/{ref}" self.pr_html_url = f"{repo_prefix}/commits/{ref}"

View File

@ -5,7 +5,7 @@ import re
from typing import Tuple from typing import Tuple
from github import Github from github import Github
from env_helper import GITHUB_RUN_ID, GITHUB_REPOSITORY, GITHUB_SERVER_URL from env_helper import GITHUB_RUN_URL, GITHUB_REPOSITORY, GITHUB_SERVER_URL
from pr_info import PRInfo from pr_info import PRInfo
from get_robot_token import get_best_robot_token from get_robot_token import get_best_robot_token
from commit_status_helper import get_commit from commit_status_helper import get_commit
@ -231,7 +231,7 @@ if __name__ == "__main__":
) )
sys.exit(1) sys.exit(1)
url = f"{GITHUB_SERVER_URL}/{GITHUB_REPOSITORY}/actions/runs/{GITHUB_RUN_ID}" url = GITHUB_RUN_URL
if not can_run: if not can_run:
print("::notice ::Cannot run") print("::notice ::Cannot run")
commit.create_status( commit.create_status(

View File

@ -2,7 +2,7 @@ import os
import logging import logging
import ast import ast
from env_helper import GITHUB_SERVER_URL, GITHUB_REPOSITORY, GITHUB_RUN_ID from env_helper import GITHUB_SERVER_URL, GITHUB_REPOSITORY, GITHUB_RUN_URL
from report import ReportColorTheme, create_test_html_report from report import ReportColorTheme, create_test_html_report
@ -66,7 +66,7 @@ def upload_results(
branch_url = f"{GITHUB_SERVER_URL}/{GITHUB_REPOSITORY}/pull/{pr_number}" branch_url = f"{GITHUB_SERVER_URL}/{GITHUB_REPOSITORY}/pull/{pr_number}"
commit_url = f"{GITHUB_SERVER_URL}/{GITHUB_REPOSITORY}/commit/{commit_sha}" commit_url = f"{GITHUB_SERVER_URL}/{GITHUB_REPOSITORY}/commit/{commit_sha}"
task_url = f"{GITHUB_SERVER_URL}/{GITHUB_REPOSITORY}/actions/runs/{GITHUB_RUN_ID}" task_url = GITHUB_RUN_URL
if additional_urls: if additional_urls:
raw_log_url = additional_urls[0] raw_log_url = additional_urls[0]

View File

@ -238,7 +238,7 @@ def _update_dockerfile(repo_path: str, version: ClickHouseVersion):
def update_version_local(repo_path, version, version_type="testing"): def update_version_local(repo_path, version, version_type="testing"):
update_contributors() update_contributors()
version.with_description(version_type) version.with_description(version_type)
update_cmake_version(version, version_type) update_cmake_version(version)
_update_changelog(repo_path, version) _update_changelog(repo_path, version)
_update_dockerfile(repo_path, version) _update_dockerfile(repo_path, version)

View File

@ -379,12 +379,16 @@ def check_need_to_rerun(workflow_description):
def rerun_workflow(workflow_description, token): def rerun_workflow(workflow_description, token):
print("Going to rerun workflow") print("Going to rerun workflow")
try:
_exec_post_with_retry(f"{workflow_description.rerun_url}-failed-jobs", token)
except Exception:
_exec_post_with_retry(workflow_description.rerun_url, token) _exec_post_with_retry(workflow_description.rerun_url, token)
def main(event): def main(event):
token = get_token_from_aws() token = get_token_from_aws()
event_data = json.loads(event["body"]) event_data = json.loads(event["body"])
print("The body received:", event_data)
workflow_description = get_workflow_description_from_event(event_data) workflow_description = get_workflow_description_from_event(event_data)
print("Got workflow description", workflow_description) print("Got workflow description", workflow_description)

View File

@ -374,7 +374,7 @@ class SettingsRandomizer:
"output_format_parallel_formatting": lambda: random.randint(0, 1), "output_format_parallel_formatting": lambda: random.randint(0, 1),
"input_format_parallel_parsing": lambda: random.randint(0, 1), "input_format_parallel_parsing": lambda: random.randint(0, 1),
"min_chunk_bytes_for_parallel_parsing": lambda: max(1024, int(random.gauss(10 * 1024 * 1024, 5 * 1000 * 1000))), "min_chunk_bytes_for_parallel_parsing": lambda: max(1024, int(random.gauss(10 * 1024 * 1024, 5 * 1000 * 1000))),
"max_read_buffer_size": lambda: random.randint(1, 20) if random.random() < 0.1 else random.randint(500000, 1048576), "max_read_buffer_size": lambda: random.randint(500000, 1048576),
"prefer_localhost_replica": lambda: random.randint(0, 1), "prefer_localhost_replica": lambda: random.randint(0, 1),
"max_block_size": lambda: random.randint(8000, 100000), "max_block_size": lambda: random.randint(8000, 100000),
"max_threads": lambda: random.randint(1, 64), "max_threads": lambda: random.randint(1, 64),
@ -468,9 +468,11 @@ class TestCase:
return testcase_args return testcase_args
def add_random_settings(self, client_options): def add_random_settings(self, args, client_options):
if self.tags and 'no-random-settings' in self.tags: if self.tags and 'no-random-settings' in self.tags:
return client_options return client_options
if args.no_random_settings:
return client_options
if len(self.base_url_params) == 0: if len(self.base_url_params) == 0:
os.environ['CLICKHOUSE_URL_PARAMS'] = '&'.join(self.random_settings) os.environ['CLICKHOUSE_URL_PARAMS'] = '&'.join(self.random_settings)
@ -485,9 +487,11 @@ class TestCase:
os.environ['CLICKHOUSE_URL_PARAMS'] = self.base_url_params os.environ['CLICKHOUSE_URL_PARAMS'] = self.base_url_params
os.environ['CLICKHOUSE_CLIENT_OPT'] = self.base_client_options os.environ['CLICKHOUSE_CLIENT_OPT'] = self.base_client_options
def add_info_about_settings(self, description): def add_info_about_settings(self, args, description):
if self.tags and 'no-random-settings' in self.tags: if self.tags and 'no-random-settings' in self.tags:
return description return description
if args.no_random_settings:
return description
return description + "\n" + "Settings used in the test: " + "--" + " --".join(self.random_settings) + "\n" return description + "\n" + "Settings used in the test: " + "--" + " --".join(self.random_settings) + "\n"
@ -788,13 +792,13 @@ class TestCase:
self.runs_count += 1 self.runs_count += 1
self.testcase_args = self.configure_testcase_args(args, self.case_file, suite.suite_tmp_path) self.testcase_args = self.configure_testcase_args(args, self.case_file, suite.suite_tmp_path)
client_options = self.add_random_settings(client_options) client_options = self.add_random_settings(args, client_options)
proc, stdout, stderr, total_time = self.run_single_test(server_logs_level, client_options) proc, stdout, stderr, total_time = self.run_single_test(server_logs_level, client_options)
result = self.process_result_impl(proc, stdout, stderr, total_time) result = self.process_result_impl(proc, stdout, stderr, total_time)
result.check_if_need_retry(args, stdout, stderr, self.runs_count) result.check_if_need_retry(args, stdout, stderr, self.runs_count)
if result.status == TestStatus.FAIL: if result.status == TestStatus.FAIL:
result.description = self.add_info_about_settings(result.description) result.description = self.add_info_about_settings(args, result.description)
return result return result
except KeyboardInterrupt as e: except KeyboardInterrupt as e:
raise e raise e
@ -802,12 +806,12 @@ class TestCase:
return TestResult(self.name, TestStatus.FAIL, return TestResult(self.name, TestStatus.FAIL,
FailureReason.INTERNAL_QUERY_FAIL, FailureReason.INTERNAL_QUERY_FAIL,
0., 0.,
self.add_info_about_settings(self.get_description_from_exception_info(sys.exc_info()))) self.add_info_about_settings(args, self.get_description_from_exception_info(sys.exc_info())))
except (ConnectionRefusedError, ConnectionResetError): except (ConnectionRefusedError, ConnectionResetError):
return TestResult(self.name, TestStatus.FAIL, return TestResult(self.name, TestStatus.FAIL,
FailureReason.SERVER_DIED, FailureReason.SERVER_DIED,
0., 0.,
self.add_info_about_settings(self.get_description_from_exception_info(sys.exc_info()))) self.add_info_about_settings(args, self.get_description_from_exception_info(sys.exc_info())))
except: except:
return TestResult(self.name, TestStatus.UNKNOWN, return TestResult(self.name, TestStatus.UNKNOWN,
FailureReason.INTERNAL_ERROR, FailureReason.INTERNAL_ERROR,
@ -1501,6 +1505,7 @@ if __name__ == '__main__':
parser.add_argument('--print-time', action='store_true', dest='print_time', help='Print test time') parser.add_argument('--print-time', action='store_true', dest='print_time', help='Print test time')
parser.add_argument('--check-zookeeper-session', action='store_true', help='Check ZooKeeper session uptime to determine if failed test should be retried') parser.add_argument('--check-zookeeper-session', action='store_true', help='Check ZooKeeper session uptime to determine if failed test should be retried')
parser.add_argument('--s3-storage', action='store_true', default=False, help='Run tests over s3 storage') parser.add_argument('--s3-storage', action='store_true', default=False, help='Run tests over s3 storage')
parser.add_argument('--no-random-settings', action='store_true', default=False, help='Disable settings randomization')
parser.add_argument('--run-by-hash-num', type=int, help='Run tests matching crc32(test_name) % run_by_hash_total == run_by_hash_num') parser.add_argument('--run-by-hash-num', type=int, help='Run tests matching crc32(test_name) % run_by_hash_total == run_by_hash_num')
parser.add_argument('--run-by-hash-total', type=int, help='Total test groups for crc32(test_name) % run_by_hash_total == run_by_hash_num') parser.add_argument('--run-by-hash-total', type=int, help='Total test groups for crc32(test_name) % run_by_hash_total == run_by_hash_num')

Some files were not shown because too many files have changed in this diff Show More