Merge branch 'master' of https://github.com/yandex/clickhouse into match-process-euid-against-data-owner

This commit is contained in:
Sergey V. Galtsev 2018-12-09 22:46:24 +03:00
commit 52d18ee7c6
129 changed files with 5910 additions and 518 deletions

18
.gitmodules vendored
View File

@ -31,9 +31,6 @@
[submodule "contrib/ssl"]
path = contrib/ssl
url = https://github.com/ClickHouse-Extras/ssl.git
[submodule "contrib/boost"]
path = contrib/boost
url = https://github.com/ClickHouse-Extras/boost.git
[submodule "contrib/llvm"]
path = contrib/llvm
url = https://github.com/ClickHouse-Extras/llvm
@ -46,6 +43,21 @@
[submodule "contrib/unixodbc"]
path = contrib/unixodbc
url = https://github.com/ClickHouse-Extras/UnixODBC.git
[submodule "contrib/protobuf"]
path = contrib/protobuf
url = https://github.com/ClickHouse-Extras/protobuf.git
[submodule "contrib/boost"]
path = contrib/boost
url = https://github.com/ClickHouse-Extras/boost-extra.git
[submodule "contrib/base64"]
path = contrib/base64
url = https://github.com/aklomp/base64.git
[submodule "contrib/libhdfs3"]
path = contrib/libhdfs3
url = https://github.com/ClickHouse-Extras/libhdfs3.git
[submodule "contrib/libxml2"]
path = contrib/libxml2
url = https://github.com/GNOME/libxml2.git
[submodule "contrib/libgsasl"]
path = contrib/libgsasl
url = https://github.com/ClickHouse-Extras/libgsasl.git

View File

@ -274,6 +274,9 @@ include (cmake/find_rdkafka.cmake)
include (cmake/find_capnp.cmake)
include (cmake/find_llvm.cmake)
include (cmake/find_cpuid.cmake)
include (cmake/find_libgsasl.cmake)
include (cmake/find_libxml2.cmake)
include (cmake/find_hdfs3.cmake)
include (cmake/find_consistent-hashing.cmake)
include (cmake/find_base64.cmake)
if (ENABLE_TESTS)

190
LICENSE
View File

@ -1,4 +1,192 @@
Copyright 2016-2018 YANDEX LLC
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS
APPENDIX: How to apply the Apache License to your work.
To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.
Copyright 2016-2018 Yandex LLC
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.

26
cmake/find_hdfs3.cmake Normal file
View File

@ -0,0 +1,26 @@
if (NOT ARCH_ARM AND NOT OS_FREEBSD AND NOT APPLE)
option (ENABLE_HDFS "Enable HDFS" ${NOT_UNBUNDLED})
endif ()
if (ENABLE_HDFS AND NOT EXISTS "${ClickHouse_SOURCE_DIR}/contrib/libhdfs3/include/hdfs/hdfs.h")
message (WARNING "submodule contrib/libhdfs3 is missing. to fix try run: \n git submodule update --init --recursive")
set (ENABLE_HDFS 0)
endif ()
if (ENABLE_HDFS)
option (USE_INTERNAL_HDFS3_LIBRARY "Set to FALSE to use system HDFS3 instead of bundled" ON)
if (NOT USE_INTERNAL_HDFS3_LIBRARY)
find_package(hdfs3)
endif ()
if (HDFS3_LIBRARY AND HDFS3_INCLUDE_DIR)
else ()
set(HDFS3_INCLUDE_DIR "${ClickHouse_SOURCE_DIR}/contrib/libhdfs3/include")
set(HDFS3_LIBRARY hdfs3)
endif()
set (USE_HDFS 1)
endif()
message (STATUS "Using hdfs3: ${HDFS3_INCLUDE_DIR} : ${HDFS3_LIBRARY}")

22
cmake/find_libgsasl.cmake Normal file
View File

@ -0,0 +1,22 @@
if (NOT APPLE)
option (USE_INTERNAL_LIBGSASL_LIBRARY "Set to FALSE to use system libgsasl library instead of bundled" ${NOT_UNBUNDLED})
endif ()
if (USE_INTERNAL_LIBGSASL_LIBRARY AND NOT EXISTS "${ClickHouse_SOURCE_DIR}/contrib/libgsasl/src/gsasl.h")
message (WARNING "submodule contrib/libgsasl is missing. to fix try run: \n git submodule update --init --recursive")
set (USE_INTERNAL_LIBGSASL_LIBRARY 0)
endif ()
if (NOT USE_INTERNAL_LIBGSASL_LIBRARY)
find_library (LIBGSASL_LIBRARY gsasl)
find_path (LIBGSASL_INCLUDE_DIR NAMES gsasl.h PATHS ${LIBGSASL_INCLUDE_PATHS})
endif ()
if (LIBGSASL_LIBRARY AND LIBGSASL_INCLUDE_DIR)
else ()
set (LIBGSASL_INCLUDE_DIR ${ClickHouse_SOURCE_DIR}/contrib/libgsasl/src ${ClickHouse_SOURCE_DIR}/contrib/libgsasl/linux_x86_64/include)
set (USE_INTERNAL_LIBGSASL_LIBRARY 1)
set (LIBGSASL_LIBRARY libgsasl)
endif ()
message (STATUS "Using libgsasl: ${LIBGSASL_INCLUDE_DIR} : ${LIBGSASL_LIBRARY}")

20
cmake/find_libxml2.cmake Normal file
View File

@ -0,0 +1,20 @@
option (USE_INTERNAL_LIBXML2_LIBRARY "Set to FALSE to use system libxml2 library instead of bundled" ${NOT_UNBUNDLED})
if (USE_INTERNAL_LIBXML2_LIBRARY AND NOT EXISTS "${ClickHouse_SOURCE_DIR}/contrib/libxml2/libxml.h")
message (WARNING "submodule contrib/libxml2 is missing. to fix try run: \n git submodule update --init --recursive")
set (USE_INTERNAL_LIBXML2_LIBRARY 0)
endif ()
if (NOT USE_INTERNAL_LIBXML2_LIBRARY)
find_library (LIBXML2_LIBRARY libxml2)
find_path (LIBXML2_INCLUDE_DIR NAMES libxml.h PATHS ${LIBXML2_INCLUDE_PATHS})
endif ()
if (LIBXML2_LIBRARY AND LIBXML2_INCLUDE_DIR)
else ()
set (LIBXML2_INCLUDE_DIR ${ClickHouse_SOURCE_DIR}/contrib/libxml2/include ${ClickHouse_SOURCE_DIR}/contrib/libxml2-cmake/linux_x86_64/include)
set (USE_INTERNAL_LIBXML2_LIBRARY 1)
set (LIBXML2_LIBRARY libxml2)
endif ()
message (STATUS "Using libxml2: ${LIBXML2_INCLUDE_DIR} : ${LIBXML2_LIBRARY}")

80
cmake/find_protobuf.cmake Normal file
View File

@ -0,0 +1,80 @@
option (USE_INTERNAL_PROTOBUF_LIBRARY "Set to FALSE to use system protobuf instead of bundled" ON)
if (NOT USE_INTERNAL_PROTOBUF_LIBRARY)
find_package(Protobuf)
endif ()
if (Protobuf_LIBRARY AND Protobuf_INCLUDE_DIR)
else ()
set(Protobuf_INCLUDE_DIR ${CMAKE_SOURCE_DIR}/contrib/protobuf/src)
set(Protobuf_LIBRARY libprotobuf)
set(Protobuf_PROTOC_LIBRARY libprotoc)
set(Protobuf_LITE_LIBRARY libprotobuf-lite)
set(Protobuf_PROTOC_EXECUTABLE ${CMAKE_BINARY_DIR}/contrib/protobuf/cmake/protoc)
if(NOT DEFINED PROTOBUF_GENERATE_CPP_APPEND_PATH)
set(PROTOBUF_GENERATE_CPP_APPEND_PATH TRUE)
endif()
function(PROTOBUF_GENERATE_CPP SRCS HDRS)
if(NOT ARGN)
message(SEND_ERROR "Error: PROTOBUF_GENERATE_CPP() called without any proto files")
return()
endif()
if(PROTOBUF_GENERATE_CPP_APPEND_PATH)
# Create an include path for each file specified
foreach(FIL ${ARGN})
get_filename_component(ABS_FIL ${FIL} ABSOLUTE)
get_filename_component(ABS_PATH ${ABS_FIL} PATH)
list(FIND _protobuf_include_path ${ABS_PATH} _contains_already)
if(${_contains_already} EQUAL -1)
list(APPEND _protobuf_include_path -I ${ABS_PATH})
endif()
endforeach()
else()
set(_protobuf_include_path -I ${CMAKE_CURRENT_SOURCE_DIR})
endif()
if(DEFINED PROTOBUF_IMPORT_DIRS AND NOT DEFINED Protobuf_IMPORT_DIRS)
set(Protobuf_IMPORT_DIRS "${PROTOBUF_IMPORT_DIRS}")
endif()
if(DEFINED Protobuf_IMPORT_DIRS)
foreach(DIR ${Protobuf_IMPORT_DIRS})
get_filename_component(ABS_PATH ${DIR} ABSOLUTE)
list(FIND _protobuf_include_path ${ABS_PATH} _contains_already)
if(${_contains_already} EQUAL -1)
list(APPEND _protobuf_include_path -I ${ABS_PATH})
endif()
endforeach()
endif()
set(${SRCS})
set(${HDRS})
foreach(FIL ${ARGN})
get_filename_component(ABS_FIL ${FIL} ABSOLUTE)
get_filename_component(FIL_WE ${FIL} NAME_WE)
list(APPEND ${SRCS} "${CMAKE_CURRENT_BINARY_DIR}/${FIL_WE}.pb.cc")
list(APPEND ${HDRS} "${CMAKE_CURRENT_BINARY_DIR}/${FIL_WE}.pb.h")
add_custom_command(
OUTPUT "${CMAKE_CURRENT_BINARY_DIR}/${FIL_WE}.pb.cc"
"${CMAKE_CURRENT_BINARY_DIR}/${FIL_WE}.pb.h"
COMMAND ${Protobuf_PROTOC_EXECUTABLE}
ARGS --cpp_out ${CMAKE_CURRENT_BINARY_DIR} ${_protobuf_include_path} ${ABS_FIL}
DEPENDS ${ABS_FIL} ${Protobuf_PROTOC_EXECUTABLE}
COMMENT "Running C++ protocol buffer compiler on ${FIL}"
VERBATIM )
endforeach()
set_source_files_properties(${${SRCS}} ${${HDRS}} PROPERTIES GENERATED TRUE)
set(${SRCS} ${${SRCS}} PARENT_SCOPE)
set(${HDRS} ${${HDRS}} PARENT_SCOPE)
endfunction()
endif()
message (STATUS "Using protobuf: ${Protobuf_INCLUDE_DIR} : ${Protobuf_LIBRARY}")

View File

@ -110,6 +110,7 @@ if (USE_INTERNAL_SSL_LIBRARY)
add_subdirectory (ssl)
target_include_directories(${OPENSSL_CRYPTO_LIBRARY} SYSTEM PUBLIC ${OPENSSL_INCLUDE_DIR})
target_include_directories(${OPENSSL_SSL_LIBRARY} SYSTEM PUBLIC ${OPENSSL_INCLUDE_DIR})
set (POCO_SKIP_OPENSSL_FIND 1)
endif ()
if (ENABLE_MYSQL AND USE_INTERNAL_MYSQL_LIBRARY)
@ -192,6 +193,24 @@ if (USE_INTERNAL_LLVM_LIBRARY)
add_subdirectory (llvm/llvm)
endif ()
if (USE_INTERNAL_LIBGSASL_LIBRARY)
add_subdirectory(libgsasl)
endif()
if (USE_INTERNAL_LIBXML2_LIBRARY)
add_subdirectory(libxml2-cmake)
endif ()
if (USE_INTERNAL_HDFS3_LIBRARY)
include(${ClickHouse_SOURCE_DIR}/cmake/find_protobuf.cmake)
if (USE_INTERNAL_PROTOBUF_LIBRARY)
set(protobuf_BUILD_TESTS OFF CACHE INTERNAL "" FORCE)
set(protobuf_BUILD_SHARED_LIBS OFF CACHE INTERNAL "" FORCE)
add_subdirectory(protobuf/cmake)
endif ()
add_subdirectory(libhdfs3-cmake)
endif ()
if (USE_BASE64)
add_subdirectory (base64-cmake)
endif()

2
contrib/boost vendored

@ -1 +1 @@
Subproject commit 2d5cb2c86f61126f4e1efe9ab97332efd44e7dea
Subproject commit 6883b40449f378019aec792f9983ce3afc7ff16e

View File

@ -42,12 +42,17 @@ ${LIBRARY_DIR}/libs/filesystem/src/windows_file_codecvt.cpp)
add_library(boost_system_internal ${LINK_MODE}
${LIBRARY_DIR}/libs/system/src/error_code.cpp)
add_library(boost_random_internal ${LINK_MODE}
${LIBRARY_DIR}/libs/random/src/random_device.cpp)
target_link_libraries (boost_filesystem_internal PUBLIC boost_system_internal)
target_include_directories (boost_program_options_internal SYSTEM BEFORE PUBLIC ${Boost_INCLUDE_DIRS})
target_include_directories (boost_filesystem_internal SYSTEM BEFORE PUBLIC ${Boost_INCLUDE_DIRS})
target_include_directories (boost_system_internal SYSTEM BEFORE PUBLIC ${Boost_INCLUDE_DIRS})
target_include_directories (boost_random_internal SYSTEM BEFORE PUBLIC ${Boost_INCLUDE_DIRS})
target_compile_definitions (boost_program_options_internal PUBLIC BOOST_SYSTEM_NO_DEPRECATED)
target_compile_definitions (boost_filesystem_internal PUBLIC BOOST_SYSTEM_NO_DEPRECATED)
target_compile_definitions (boost_system_internal PUBLIC BOOST_SYSTEM_NO_DEPRECATED)
target_compile_definitions (boost_random_internal PUBLIC BOOST_SYSTEM_NO_DEPRECATED)

1
contrib/libgsasl vendored Submodule

@ -0,0 +1 @@
Subproject commit 3b8948a4042e34fb00b4fb987535dc9e02e39040

1
contrib/libhdfs3 vendored Submodule

@ -0,0 +1 @@
Subproject commit bd6505cbb0c130b0db695305b9a38546fa880e5a

View File

@ -0,0 +1,10 @@
#include <exception>
#include <stdexcept>
int main() {
try {
throw 2;
} catch (int) {
std::throw_with_nested(std::runtime_error("test"));
}
}

View File

@ -0,0 +1,7 @@
#include <chrono>
using std::chrono::steady_clock;
void foo(const steady_clock &clock) {
return;
}

View File

@ -0,0 +1,10 @@
#include <string.h>
int main()
{
// We can't test "char *p = strerror_r()" because that only causes a
// compiler warning when strerror_r returns an integer.
char *buf = 0;
int i = strerror_r(0, buf, 100);
return i;
}

View File

@ -0,0 +1,48 @@
# Check prereqs
FIND_PROGRAM(GCOV_PATH gcov)
FIND_PROGRAM(LCOV_PATH lcov)
FIND_PROGRAM(GENHTML_PATH genhtml)
IF(NOT GCOV_PATH)
MESSAGE(FATAL_ERROR "gcov not found! Aborting...")
ENDIF(NOT GCOV_PATH)
IF(NOT CMAKE_BUILD_TYPE STREQUAL Debug)
MESSAGE(WARNING "Code coverage results with an optimised (non-Debug) build may be misleading")
ENDIF(NOT CMAKE_BUILD_TYPE STREQUAL Debug)
#Setup compiler options
ADD_DEFINITIONS(-fprofile-arcs -ftest-coverage)
SET(CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} -fprofile-arcs ")
SET(CMAKE_SHARED_LINKER_FLAGS "${CMAKE_SHARED_LINKER_FLAGS} -fprofile-arcs ")
IF(NOT LCOV_PATH)
MESSAGE(FATAL_ERROR "lcov not found! Aborting...")
ENDIF(NOT LCOV_PATH)
IF(NOT GENHTML_PATH)
MESSAGE(FATAL_ERROR "genhtml not found! Aborting...")
ENDIF(NOT GENHTML_PATH)
#Setup target
ADD_CUSTOM_TARGET(ShowCoverage
#Capturing lcov counters and generating report
COMMAND ${LCOV_PATH} --directory . --capture --output-file CodeCoverage.info
COMMAND ${LCOV_PATH} --remove CodeCoverage.info '${CMAKE_CURRENT_BINARY_DIR}/*' 'test/*' 'mock/*' '/usr/*' '/opt/*' '*ext/rhel5_x86_64*' '*ext/osx*' --output-file CodeCoverage.info.cleaned
COMMAND ${GENHTML_PATH} -o CodeCoverageReport CodeCoverage.info.cleaned
)
ADD_CUSTOM_TARGET(ShowAllCoverage
#Capturing lcov counters and generating report
COMMAND ${LCOV_PATH} -a CodeCoverage.info.cleaned -a CodeCoverage.info.cleaned_withoutHA -o AllCodeCoverage.info
COMMAND sed -e 's|/.*/src|${CMAKE_SOURCE_DIR}/src|' -ig AllCodeCoverage.info
COMMAND ${GENHTML_PATH} -o AllCodeCoverageReport AllCodeCoverage.info
)
ADD_CUSTOM_TARGET(ResetCoverage
#Cleanup lcov
COMMAND ${LCOV_PATH} --directory . --zerocounters
)

File diff suppressed because it is too large Load Diff

View File

@ -0,0 +1,26 @@
# - Try to find the GNU sasl library (gsasl)
#
# Once done this will define
#
# GSASL_FOUND - System has gnutls
# GSASL_INCLUDE_DIR - The gnutls include directory
# GSASL_LIBRARIES - The libraries needed to use gnutls
# GSASL_DEFINITIONS - Compiler switches required for using gnutls
IF (GSASL_INCLUDE_DIR AND GSASL_LIBRARIES)
# in cache already
SET(GSasl_FIND_QUIETLY TRUE)
ENDIF (GSASL_INCLUDE_DIR AND GSASL_LIBRARIES)
FIND_PATH(GSASL_INCLUDE_DIR gsasl.h)
FIND_LIBRARY(GSASL_LIBRARIES gsasl)
INCLUDE(FindPackageHandleStandardArgs)
# handle the QUIETLY and REQUIRED arguments and set GSASL_FOUND to TRUE if
# all listed variables are TRUE
FIND_PACKAGE_HANDLE_STANDARD_ARGS(GSASL DEFAULT_MSG GSASL_LIBRARIES GSASL_INCLUDE_DIR)
MARK_AS_ADVANCED(GSASL_INCLUDE_DIR GSASL_LIBRARIES)

View File

@ -0,0 +1,65 @@
include(CheckCXXSourceRuns)
find_path(GTest_INCLUDE_DIR gtest/gtest.h
NO_DEFAULT_PATH
PATHS
"${PROJECT_SOURCE_DIR}/../thirdparty/googletest/googletest/include"
"/usr/local/include"
"/usr/include")
find_path(GMock_INCLUDE_DIR gmock/gmock.h
NO_DEFAULT_PATH
PATHS
"${PROJECT_SOURCE_DIR}/../thirdparty/googletest/googlemock/include"
"/usr/local/include"
"/usr/include")
find_library(Gtest_LIBRARY
NAMES libgtest.a
HINTS
"${PROJECT_SOURCE_DIR}/../thirdparty/googletest/build/googlemock/gtest"
"/usr/local/lib"
"/usr/lib")
find_library(Gmock_LIBRARY
NAMES libgmock.a
HINTS
"${PROJECT_SOURCE_DIR}/../thirdparty/googletest/build/googlemock"
"/usr/local/lib"
"/usr/lib")
message(STATUS "Find GoogleTest include path: ${GTest_INCLUDE_DIR}")
message(STATUS "Find GoogleMock include path: ${GMock_INCLUDE_DIR}")
message(STATUS "Find Gtest library path: ${Gtest_LIBRARY}")
message(STATUS "Find Gmock library path: ${Gmock_LIBRARY}")
set(CMAKE_REQUIRED_INCLUDES ${GTest_INCLUDE_DIR} ${GMock_INCLUDE_DIR})
set(CMAKE_REQUIRED_LIBRARIES ${Gtest_LIBRARY} ${Gmock_LIBRARY} -lpthread)
set(CMAKE_REQUIRED_FLAGS)
check_cxx_source_runs("
#include <gtest/gtest.h>
#include <gmock/gmock.h>
int main(int argc, char *argv[])
{
double pi = 3.14;
EXPECT_EQ(pi, 3.14);
return 0;
}
" GoogleTest_CHECK_FINE)
message(STATUS "GoogleTest check: ${GoogleTest_CHECK_FINE}")
include(FindPackageHandleStandardArgs)
find_package_handle_standard_args(
GoogleTest
REQUIRED_VARS
GTest_INCLUDE_DIR
GMock_INCLUDE_DIR
Gtest_LIBRARY
Gmock_LIBRARY
GoogleTest_CHECK_FINE)
set(GoogleTest_INCLUDE_DIR ${GTest_INCLUDE_DIR} ${GMock_INCLUDE_DIR})
set(GoogleTest_LIBRARIES ${Gtest_LIBRARY} ${Gmock_LIBRARY})
mark_as_advanced(
GoogleTest_INCLUDE_DIR
GoogleTest_LIBRARIES)

View File

@ -0,0 +1,23 @@
# - Find kerberos
# Find the native KERBEROS includes and library
#
# KERBEROS_INCLUDE_DIRS - where to find krb5.h, etc.
# KERBEROS_LIBRARIES - List of libraries when using krb5.
# KERBEROS_FOUND - True if krb5 found.
IF (KERBEROS_INCLUDE_DIRS)
# Already in cache, be silent
SET(KERBEROS_FIND_QUIETLY TRUE)
ENDIF (KERBEROS_INCLUDE_DIRS)
FIND_PATH(KERBEROS_INCLUDE_DIRS krb5.h)
SET(KERBEROS_NAMES krb5 k5crypto com_err)
FIND_LIBRARY(KERBEROS_LIBRARIES NAMES ${KERBEROS_NAMES})
# handle the QUIETLY and REQUIRED arguments and set KERBEROS_FOUND to TRUE if
# all listed variables are TRUE
INCLUDE(FindPackageHandleStandardArgs)
FIND_PACKAGE_HANDLE_STANDARD_ARGS(KERBEROS DEFAULT_MSG KERBEROS_LIBRARIES KERBEROS_INCLUDE_DIRS)
MARK_AS_ADVANCED(KERBEROS_LIBRARIES KERBEROS_INCLUDE_DIRS)

View File

@ -0,0 +1,26 @@
# - Try to find the Open ssl library (ssl)
#
# Once done this will define
#
# SSL_FOUND - System has gnutls
# SSL_INCLUDE_DIR - The gnutls include directory
# SSL_LIBRARIES - The libraries needed to use gnutls
# SSL_DEFINITIONS - Compiler switches required for using gnutls
IF (SSL_INCLUDE_DIR AND SSL_LIBRARIES)
# in cache already
SET(SSL_FIND_QUIETLY TRUE)
ENDIF (SSL_INCLUDE_DIR AND SSL_LIBRARIES)
FIND_PATH(SSL_INCLUDE_DIR openssl/opensslv.h)
FIND_LIBRARY(SSL_LIBRARIES crypto)
INCLUDE(FindPackageHandleStandardArgs)
# handle the QUIETLY and REQUIRED arguments and set SSL_FOUND to TRUE if
# all listed variables are TRUE
FIND_PACKAGE_HANDLE_STANDARD_ARGS(SSL DEFAULT_MSG SSL_LIBRARIES SSL_INCLUDE_DIR)
MARK_AS_ADVANCED(SSL_INCLUDE_DIR SSL_LIBRARIES)

View File

@ -0,0 +1,46 @@
FUNCTION(AUTO_SOURCES RETURN_VALUE PATTERN SOURCE_SUBDIRS)
IF ("${SOURCE_SUBDIRS}" STREQUAL "RECURSE")
SET(PATH ".")
IF (${ARGC} EQUAL 4)
LIST(GET ARGV 3 PATH)
ENDIF ()
ENDIF()
IF ("${SOURCE_SUBDIRS}" STREQUAL "RECURSE")
UNSET(${RETURN_VALUE})
FILE(GLOB SUBDIR_FILES "${PATH}/${PATTERN}")
LIST(APPEND ${RETURN_VALUE} ${SUBDIR_FILES})
FILE(GLOB SUBDIRS RELATIVE ${PATH} ${PATH}/*)
FOREACH(DIR ${SUBDIRS})
IF (IS_DIRECTORY ${PATH}/${DIR})
IF (NOT "${DIR}" STREQUAL "CMAKEFILES")
FILE(GLOB_RECURSE SUBDIR_FILES "${PATH}/${DIR}/${PATTERN}")
LIST(APPEND ${RETURN_VALUE} ${SUBDIR_FILES})
ENDIF()
ENDIF()
ENDFOREACH()
ELSE ()
FILE(GLOB ${RETURN_VALUE} "${PATTERN}")
FOREACH (PATH ${SOURCE_SUBDIRS})
FILE(GLOB SUBDIR_FILES "${PATH}/${PATTERN}")
LIST(APPEND ${RETURN_VALUE} ${SUBDIR_FILES})
ENDFOREACH(PATH ${SOURCE_SUBDIRS})
ENDIF ()
IF (${FILTER_OUT})
LIST(REMOVE_ITEM ${RETURN_VALUE} ${FILTER_OUT})
ENDIF()
SET(${RETURN_VALUE} ${${RETURN_VALUE}} PARENT_SCOPE)
ENDFUNCTION(AUTO_SOURCES)
FUNCTION(CONTAINS_STRING FILE SEARCH RETURN_VALUE)
FILE(STRINGS ${FILE} FILE_CONTENTS REGEX ".*${SEARCH}.*")
IF (FILE_CONTENTS)
SET(${RETURN_VALUE} TRUE PARENT_SCOPE)
ENDIF()
ENDFUNCTION(CONTAINS_STRING)

View File

@ -0,0 +1,169 @@
OPTION(ENABLE_COVERAGE "enable code coverage" OFF)
OPTION(ENABLE_DEBUG "enable debug build" OFF)
OPTION(ENABLE_SSE "enable SSE4.2 buildin function" ON)
OPTION(ENABLE_FRAME_POINTER "enable frame pointer on 64bit system with flag -fno-omit-frame-pointer, on 32bit system, it is always enabled" ON)
OPTION(ENABLE_LIBCPP "using libc++ instead of libstdc++, only valid for clang compiler" OFF)
OPTION(ENABLE_BOOST "using boost instead of native compiler c++0x support" OFF)
INCLUDE (CheckFunctionExists)
CHECK_FUNCTION_EXISTS(dladdr HAVE_DLADDR)
CHECK_FUNCTION_EXISTS(nanosleep HAVE_NANOSLEEP)
IF(ENABLE_DEBUG STREQUAL ON)
SET(CMAKE_BUILD_TYPE Debug CACHE
STRING "Choose the type of build, options are: None Debug Release RelWithDebInfo MinSizeRel." FORCE)
SET(CMAKE_CXX_FLAGS_DEBUG "-g -O0" CACHE STRING "compiler flags for debug" FORCE)
SET(CMAKE_C_FLAGS_DEBUG "-g -O0" CACHE STRING "compiler flags for debug" FORCE)
ELSE(ENABLE_DEBUG STREQUAL ON)
SET(CMAKE_BUILD_TYPE RelWithDebInfo CACHE
STRING "Choose the type of build, options are: None Debug Release RelWithDebInfo MinSizeRel." FORCE)
ENDIF(ENABLE_DEBUG STREQUAL ON)
SET(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -fno-strict-aliasing")
SET(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -fno-strict-aliasing")
IF(ENABLE_COVERAGE STREQUAL ON)
INCLUDE(CodeCoverage)
ENDIF(ENABLE_COVERAGE STREQUAL ON)
IF(ENABLE_FRAME_POINTER STREQUAL ON)
SET(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -fno-omit-frame-pointer")
ENDIF(ENABLE_FRAME_POINTER STREQUAL ON)
IF(ENABLE_SSE STREQUAL ON)
SET(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -msse4.2")
ENDIF(ENABLE_SSE STREQUAL ON)
IF(NOT TEST_HDFS_PREFIX)
SET(TEST_HDFS_PREFIX "./" CACHE STRING "default directory prefix used for test." FORCE)
ENDIF(NOT TEST_HDFS_PREFIX)
ADD_DEFINITIONS(-DTEST_HDFS_PREFIX="${TEST_HDFS_PREFIX}")
ADD_DEFINITIONS(-D__STDC_FORMAT_MACROS)
ADD_DEFINITIONS(-D_GNU_SOURCE)
IF(OS_MACOSX AND CMAKE_COMPILER_IS_GNUCXX)
SET(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wl,-bind_at_load")
ENDIF(OS_MACOSX AND CMAKE_COMPILER_IS_GNUCXX)
IF(OS_LINUX)
SET(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wl,--export-dynamic")
ENDIF(OS_LINUX)
SET(BOOST_ROOT ${CMAKE_PREFIX_PATH})
IF(ENABLE_BOOST STREQUAL ON)
MESSAGE(STATUS "using boost instead of native compiler c++0x support.")
FIND_PACKAGE(Boost 1.50 REQUIRED)
SET(NEED_BOOST true CACHE INTERNAL "boost is required")
ELSE(ENABLE_BOOST STREQUAL ON)
SET(NEED_BOOST false CACHE INTERNAL "boost is required")
ENDIF(ENABLE_BOOST STREQUAL ON)
IF(CMAKE_COMPILER_IS_GNUCXX)
IF(ENABLE_LIBCPP STREQUAL ON)
MESSAGE(FATAL_ERROR "Unsupport using GCC compiler with libc++")
ENDIF(ENABLE_LIBCPP STREQUAL ON)
IF((GCC_COMPILER_VERSION_MAJOR EQUAL 4) AND (GCC_COMPILER_VERSION_MINOR EQUAL 4) AND OS_MACOSX)
SET(NEED_GCCEH true CACHE INTERNAL "Explicitly link with gcc_eh")
MESSAGE(STATUS "link with -lgcc_eh for TLS")
ENDIF((GCC_COMPILER_VERSION_MAJOR EQUAL 4) AND (GCC_COMPILER_VERSION_MINOR EQUAL 4) AND OS_MACOSX)
IF((GCC_COMPILER_VERSION_MAJOR LESS 4) OR ((GCC_COMPILER_VERSION_MAJOR EQUAL 4) AND (GCC_COMPILER_VERSION_MINOR LESS 4)))
SET(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wall")
IF(NOT ENABLE_BOOST STREQUAL ON)
MESSAGE(STATUS "gcc version is older than 4.6.0, boost is required.")
FIND_PACKAGE(Boost 1.50 REQUIRED)
SET(NEED_BOOST true CACHE INTERNAL "boost is required")
ENDIF(NOT ENABLE_BOOST STREQUAL ON)
ELSEIF((GCC_COMPILER_VERSION_MAJOR EQUAL 4) AND (GCC_COMPILER_VERSION_MINOR LESS 7))
IF(NOT ENABLE_BOOST STREQUAL ON)
MESSAGE(STATUS "gcc version is older than 4.6.0, boost is required.")
FIND_PACKAGE(Boost 1.50 REQUIRED)
SET(NEED_BOOST true CACHE INTERNAL "boost is required")
ENDIF(NOT ENABLE_BOOST STREQUAL ON)
MESSAGE(STATUS "adding c++0x support for gcc compiler")
SET(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -std=c++0x")
ELSE((GCC_COMPILER_VERSION_MAJOR LESS 4) OR ((GCC_COMPILER_VERSION_MAJOR EQUAL 4) AND (GCC_COMPILER_VERSION_MINOR LESS 4)))
MESSAGE(STATUS "adding c++0x support for gcc compiler")
SET(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -std=c++0x")
ENDIF((GCC_COMPILER_VERSION_MAJOR LESS 4) OR ((GCC_COMPILER_VERSION_MAJOR EQUAL 4) AND (GCC_COMPILER_VERSION_MINOR LESS 4)))
IF(NEED_BOOST)
IF((Boost_MAJOR_VERSION LESS 1) OR ((Boost_MAJOR_VERSION EQUAL 1) AND (Boost_MINOR_VERSION LESS 50)))
MESSAGE(FATAL_ERROR "boost 1.50+ is required")
ENDIF()
ELSE(NEED_BOOST)
IF(HAVE_NANOSLEEP)
SET(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -D_GLIBCXX_USE_NANOSLEEP")
ELSE(HAVE_NANOSLEEP)
MESSAGE(FATAL_ERROR "nanosleep() is required")
ENDIF(HAVE_NANOSLEEP)
ENDIF(NEED_BOOST)
SET(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wall")
ELSEIF(CMAKE_COMPILER_IS_CLANG)
MESSAGE(STATUS "adding c++0x support for clang compiler")
SET(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -std=c++0x")
SET(CMAKE_XCODE_ATTRIBUTE_CLANG_CXX_LANGUAGE_STANDARD "c++0x")
IF(ENABLE_LIBCPP STREQUAL ON)
MESSAGE(STATUS "using libc++ instead of libstdc++")
SET(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -stdlib=libc++")
SET(CMAKE_XCODE_ATTRIBUTE_CLANG_CXX_LIBRARY "libc++")
ENDIF(ENABLE_LIBCPP STREQUAL ON)
ENDIF(CMAKE_COMPILER_IS_GNUCXX)
TRY_COMPILE(STRERROR_R_RETURN_INT
${CMAKE_CURRENT_BINARY_DIR}
${HDFS3_ROOT_DIR}/CMake/CMakeTestCompileStrerror.cpp
CMAKE_FLAGS "-DCMAKE_CXX_LINK_EXECUTABLE='echo not linking now...'"
OUTPUT_VARIABLE OUTPUT)
MESSAGE(STATUS "Checking whether strerror_r returns an int")
IF(STRERROR_R_RETURN_INT)
MESSAGE(STATUS "Checking whether strerror_r returns an int -- yes")
ELSE(STRERROR_R_RETURN_INT)
MESSAGE(STATUS "Checking whether strerror_r returns an int -- no")
ENDIF(STRERROR_R_RETURN_INT)
TRY_COMPILE(HAVE_STEADY_CLOCK
${CMAKE_CURRENT_BINARY_DIR}
${HDFS3_ROOT_DIR}/CMake/CMakeTestCompileSteadyClock.cpp
CMAKE_FLAGS "-DCMAKE_CXX_LINK_EXECUTABLE='echo not linking now...'"
OUTPUT_VARIABLE OUTPUT)
TRY_COMPILE(HAVE_NESTED_EXCEPTION
${CMAKE_CURRENT_BINARY_DIR}
${HDFS3_ROOT_DIR}/CMake/CMakeTestCompileNestedException.cpp
CMAKE_FLAGS "-DCMAKE_CXX_LINK_EXECUTABLE='echo not linking now...'"
OUTPUT_VARIABLE OUTPUT)
FILE(WRITE ${CMAKE_CURRENT_BINARY_DIR}/test.cpp "#include <boost/chrono.hpp>")
TRY_COMPILE(HAVE_BOOST_CHRONO
${CMAKE_CURRENT_BINARY_DIR}
${CMAKE_CURRENT_BINARY_DIR}/test.cpp
CMAKE_FLAGS "-DCMAKE_CXX_LINK_EXECUTABLE='echo not linking now...'"
-DINCLUDE_DIRECTORIES=${Boost_INCLUDE_DIR}
OUTPUT_VARIABLE OUTPUT)
FILE(WRITE ${CMAKE_CURRENT_BINARY_DIR}/test.cpp "#include <chrono>")
TRY_COMPILE(HAVE_STD_CHRONO
${CMAKE_CURRENT_BINARY_DIR}
${CMAKE_CURRENT_BINARY_DIR}/test.cpp
CMAKE_FLAGS "-DCMAKE_CXX_LINK_EXECUTABLE='echo not linking now...'"
OUTPUT_VARIABLE OUTPUT)
FILE(WRITE ${CMAKE_CURRENT_BINARY_DIR}/test.cpp "#include <boost/atomic.hpp>")
TRY_COMPILE(HAVE_BOOST_ATOMIC
${CMAKE_CURRENT_BINARY_DIR}
${CMAKE_CURRENT_BINARY_DIR}/test.cpp
CMAKE_FLAGS "-DCMAKE_CXX_LINK_EXECUTABLE='echo not linking now...'"
-DINCLUDE_DIRECTORIES=${Boost_INCLUDE_DIR}
OUTPUT_VARIABLE OUTPUT)
FILE(WRITE ${CMAKE_CURRENT_BINARY_DIR}/test.cpp "#include <atomic>")
TRY_COMPILE(HAVE_STD_ATOMIC
${CMAKE_CURRENT_BINARY_DIR}
${CMAKE_CURRENT_BINARY_DIR}/test.cpp
CMAKE_FLAGS "-DCMAKE_CXX_LINK_EXECUTABLE='echo not linking now...'"
OUTPUT_VARIABLE OUTPUT)

View File

@ -0,0 +1,33 @@
IF(CMAKE_SYSTEM_NAME STREQUAL "Linux")
SET(OS_LINUX true CACHE INTERNAL "Linux operating system")
ELSEIF(CMAKE_SYSTEM_NAME STREQUAL "Darwin")
SET(OS_MACOSX true CACHE INTERNAL "Mac Darwin operating system")
ELSE(CMAKE_SYSTEM_NAME STREQUAL "Linux")
MESSAGE(FATAL_ERROR "Unsupported OS: \"${CMAKE_SYSTEM_NAME}\"")
ENDIF(CMAKE_SYSTEM_NAME STREQUAL "Linux")
IF(CMAKE_COMPILER_IS_GNUCXX)
EXECUTE_PROCESS(COMMAND ${CMAKE_CXX_COMPILER} -dumpversion OUTPUT_VARIABLE GCC_COMPILER_VERSION)
IF (NOT GCC_COMPILER_VERSION)
MESSAGE(FATAL_ERROR "Cannot get gcc version")
ENDIF (NOT GCC_COMPILER_VERSION)
STRING(REGEX MATCHALL "[0-9]+" GCC_COMPILER_VERSION ${GCC_COMPILER_VERSION})
LIST(GET GCC_COMPILER_VERSION 0 GCC_COMPILER_VERSION_MAJOR)
LIST(GET GCC_COMPILER_VERSION 0 GCC_COMPILER_VERSION_MINOR)
SET(GCC_COMPILER_VERSION_MAJOR ${GCC_COMPILER_VERSION_MAJOR} CACHE INTERNAL "gcc major version")
SET(GCC_COMPILER_VERSION_MINOR ${GCC_COMPILER_VERSION_MINOR} CACHE INTERNAL "gcc minor version")
MESSAGE(STATUS "checking compiler: GCC (${GCC_COMPILER_VERSION_MAJOR}.${GCC_COMPILER_VERSION_MINOR}.${GCC_COMPILER_VERSION_PATCH})")
ELSE(CMAKE_COMPILER_IS_GNUCXX)
EXECUTE_PROCESS(COMMAND ${CMAKE_C_COMPILER} --version OUTPUT_VARIABLE COMPILER_OUTPUT)
IF(COMPILER_OUTPUT MATCHES "clang")
SET(CMAKE_COMPILER_IS_CLANG true CACHE INTERNAL "using clang as compiler")
MESSAGE(STATUS "checking compiler: CLANG")
ELSE(COMPILER_OUTPUT MATCHES "clang")
MESSAGE(FATAL_ERROR "Unsupported compiler: \"${CMAKE_CXX_COMPILER}\"")
ENDIF(COMPILER_OUTPUT MATCHES "clang")
ENDIF(CMAKE_COMPILER_IS_GNUCXX)

View File

@ -0,0 +1,212 @@
if (NOT USE_INTERNAL_PROTOBUF_LIBRARY)
# compatiable with protobuf which was compiled old C++ ABI
set(CMAKE_CXX_FLAGS "-D_GLIBCXX_USE_CXX11_ABI=0")
set(CMAKE_C_FLAGS "")
if (NOT (CMAKE_VERSION VERSION_LESS "3.8.0"))
unset(CMAKE_CXX_STANDARD)
endif ()
endif()
SET(WITH_KERBEROS false)
# project and source dir
set(HDFS3_ROOT_DIR ${CMAKE_SOURCE_DIR}/contrib/libhdfs3)
set(HDFS3_SOURCE_DIR ${HDFS3_ROOT_DIR}/src)
set(HDFS3_COMMON_DIR ${HDFS3_SOURCE_DIR}/common)
# module
set(CMAKE_MODULE_PATH "${CMAKE_CURRENT_SOURCE_DIR}/CMake" ${CMAKE_MODULE_PATH})
include(Platform)
include(Options)
# prefer shared libraries
if (WITH_KERBEROS)
find_package(KERBEROS REQUIRED)
endif()
# source
set(PROTO_FILES
#${HDFS3_SOURCE_DIR}/proto/encryption.proto
${HDFS3_SOURCE_DIR}/proto/ClientDatanodeProtocol.proto
${HDFS3_SOURCE_DIR}/proto/hdfs.proto
${HDFS3_SOURCE_DIR}/proto/Security.proto
${HDFS3_SOURCE_DIR}/proto/ProtobufRpcEngine.proto
${HDFS3_SOURCE_DIR}/proto/ClientNamenodeProtocol.proto
${HDFS3_SOURCE_DIR}/proto/IpcConnectionContext.proto
${HDFS3_SOURCE_DIR}/proto/RpcHeader.proto
${HDFS3_SOURCE_DIR}/proto/datatransfer.proto
)
PROTOBUF_GENERATE_CPP(PROTO_SOURCES PROTO_HEADERS ${PROTO_FILES})
configure_file(${HDFS3_SOURCE_DIR}/platform.h.in ${CMAKE_CURRENT_BINARY_DIR}/platform.h)
set(SRCS
${HDFS3_SOURCE_DIR}/network/TcpSocket.cpp
${HDFS3_SOURCE_DIR}/network/DomainSocket.cpp
${HDFS3_SOURCE_DIR}/network/BufferedSocketReader.cpp
${HDFS3_SOURCE_DIR}/client/ReadShortCircuitInfo.cpp
${HDFS3_SOURCE_DIR}/client/Pipeline.cpp
${HDFS3_SOURCE_DIR}/client/Hdfs.cpp
${HDFS3_SOURCE_DIR}/client/Packet.cpp
${HDFS3_SOURCE_DIR}/client/OutputStreamImpl.cpp
${HDFS3_SOURCE_DIR}/client/KerberosName.cpp
${HDFS3_SOURCE_DIR}/client/PacketHeader.cpp
${HDFS3_SOURCE_DIR}/client/LocalBlockReader.cpp
${HDFS3_SOURCE_DIR}/client/UserInfo.cpp
${HDFS3_SOURCE_DIR}/client/RemoteBlockReader.cpp
${HDFS3_SOURCE_DIR}/client/Permission.cpp
${HDFS3_SOURCE_DIR}/client/FileSystemImpl.cpp
${HDFS3_SOURCE_DIR}/client/DirectoryIterator.cpp
${HDFS3_SOURCE_DIR}/client/FileSystemKey.cpp
${HDFS3_SOURCE_DIR}/client/DataTransferProtocolSender.cpp
${HDFS3_SOURCE_DIR}/client/LeaseRenewer.cpp
${HDFS3_SOURCE_DIR}/client/PeerCache.cpp
${HDFS3_SOURCE_DIR}/client/InputStream.cpp
${HDFS3_SOURCE_DIR}/client/FileSystem.cpp
${HDFS3_SOURCE_DIR}/client/InputStreamImpl.cpp
${HDFS3_SOURCE_DIR}/client/Token.cpp
${HDFS3_SOURCE_DIR}/client/PacketPool.cpp
${HDFS3_SOURCE_DIR}/client/OutputStream.cpp
${HDFS3_SOURCE_DIR}/rpc/RpcChannelKey.cpp
${HDFS3_SOURCE_DIR}/rpc/RpcProtocolInfo.cpp
${HDFS3_SOURCE_DIR}/rpc/RpcClient.cpp
${HDFS3_SOURCE_DIR}/rpc/RpcRemoteCall.cpp
${HDFS3_SOURCE_DIR}/rpc/RpcChannel.cpp
${HDFS3_SOURCE_DIR}/rpc/RpcAuth.cpp
${HDFS3_SOURCE_DIR}/rpc/RpcContentWrapper.cpp
${HDFS3_SOURCE_DIR}/rpc/RpcConfig.cpp
${HDFS3_SOURCE_DIR}/rpc/RpcServerInfo.cpp
${HDFS3_SOURCE_DIR}/rpc/SaslClient.cpp
${HDFS3_SOURCE_DIR}/server/Datanode.cpp
${HDFS3_SOURCE_DIR}/server/LocatedBlocks.cpp
${HDFS3_SOURCE_DIR}/server/NamenodeProxy.cpp
${HDFS3_SOURCE_DIR}/server/NamenodeImpl.cpp
${HDFS3_SOURCE_DIR}/server/NamenodeInfo.cpp
${HDFS3_SOURCE_DIR}/common/WritableUtils.cpp
${HDFS3_SOURCE_DIR}/common/ExceptionInternal.cpp
${HDFS3_SOURCE_DIR}/common/SessionConfig.cpp
${HDFS3_SOURCE_DIR}/common/StackPrinter.cpp
${HDFS3_SOURCE_DIR}/common/Exception.cpp
${HDFS3_SOURCE_DIR}/common/Logger.cpp
${HDFS3_SOURCE_DIR}/common/CFileWrapper.cpp
${HDFS3_SOURCE_DIR}/common/XmlConfig.cpp
${HDFS3_SOURCE_DIR}/common/WriteBuffer.cpp
${HDFS3_SOURCE_DIR}/common/HWCrc32c.cpp
${HDFS3_SOURCE_DIR}/common/MappedFileWrapper.cpp
${HDFS3_SOURCE_DIR}/common/Hash.cpp
${HDFS3_SOURCE_DIR}/common/SWCrc32c.cpp
${HDFS3_SOURCE_DIR}/common/Thread.cpp
${HDFS3_SOURCE_DIR}/network/TcpSocket.h
${HDFS3_SOURCE_DIR}/network/BufferedSocketReader.h
${HDFS3_SOURCE_DIR}/network/Socket.h
${HDFS3_SOURCE_DIR}/network/DomainSocket.h
${HDFS3_SOURCE_DIR}/network/Syscall.h
${HDFS3_SOURCE_DIR}/client/InputStreamImpl.h
${HDFS3_SOURCE_DIR}/client/FileSystem.h
${HDFS3_SOURCE_DIR}/client/ReadShortCircuitInfo.h
${HDFS3_SOURCE_DIR}/client/InputStreamInter.h
${HDFS3_SOURCE_DIR}/client/FileSystemImpl.h
${HDFS3_SOURCE_DIR}/client/PacketPool.h
${HDFS3_SOURCE_DIR}/client/Pipeline.h
${HDFS3_SOURCE_DIR}/client/OutputStreamInter.h
${HDFS3_SOURCE_DIR}/client/RemoteBlockReader.h
${HDFS3_SOURCE_DIR}/client/Token.h
${HDFS3_SOURCE_DIR}/client/KerberosName.h
${HDFS3_SOURCE_DIR}/client/DirectoryIterator.h
${HDFS3_SOURCE_DIR}/client/hdfs.h
${HDFS3_SOURCE_DIR}/client/FileSystemStats.h
${HDFS3_SOURCE_DIR}/client/FileSystemKey.h
${HDFS3_SOURCE_DIR}/client/DataTransferProtocolSender.h
${HDFS3_SOURCE_DIR}/client/Packet.h
${HDFS3_SOURCE_DIR}/client/PacketHeader.h
${HDFS3_SOURCE_DIR}/client/FileSystemInter.h
${HDFS3_SOURCE_DIR}/client/LocalBlockReader.h
${HDFS3_SOURCE_DIR}/client/TokenInternal.h
${HDFS3_SOURCE_DIR}/client/InputStream.h
${HDFS3_SOURCE_DIR}/client/PipelineAck.h
${HDFS3_SOURCE_DIR}/client/BlockReader.h
${HDFS3_SOURCE_DIR}/client/Permission.h
${HDFS3_SOURCE_DIR}/client/OutputStreamImpl.h
${HDFS3_SOURCE_DIR}/client/LeaseRenewer.h
${HDFS3_SOURCE_DIR}/client/UserInfo.h
${HDFS3_SOURCE_DIR}/client/PeerCache.h
${HDFS3_SOURCE_DIR}/client/OutputStream.h
${HDFS3_SOURCE_DIR}/client/FileStatus.h
${HDFS3_SOURCE_DIR}/client/DataTransferProtocol.h
${HDFS3_SOURCE_DIR}/client/BlockLocation.h
${HDFS3_SOURCE_DIR}/rpc/RpcConfig.h
${HDFS3_SOURCE_DIR}/rpc/SaslClient.h
${HDFS3_SOURCE_DIR}/rpc/RpcAuth.h
${HDFS3_SOURCE_DIR}/rpc/RpcClient.h
${HDFS3_SOURCE_DIR}/rpc/RpcCall.h
${HDFS3_SOURCE_DIR}/rpc/RpcContentWrapper.h
${HDFS3_SOURCE_DIR}/rpc/RpcProtocolInfo.h
${HDFS3_SOURCE_DIR}/rpc/RpcRemoteCall.h
${HDFS3_SOURCE_DIR}/rpc/RpcServerInfo.h
${HDFS3_SOURCE_DIR}/rpc/RpcChannel.h
${HDFS3_SOURCE_DIR}/rpc/RpcChannelKey.h
${HDFS3_SOURCE_DIR}/server/BlockLocalPathInfo.h
${HDFS3_SOURCE_DIR}/server/LocatedBlocks.h
${HDFS3_SOURCE_DIR}/server/DatanodeInfo.h
${HDFS3_SOURCE_DIR}/server/RpcHelper.h
${HDFS3_SOURCE_DIR}/server/ExtendedBlock.h
${HDFS3_SOURCE_DIR}/server/NamenodeInfo.h
${HDFS3_SOURCE_DIR}/server/NamenodeImpl.h
${HDFS3_SOURCE_DIR}/server/LocatedBlock.h
${HDFS3_SOURCE_DIR}/server/NamenodeProxy.h
${HDFS3_SOURCE_DIR}/server/Datanode.h
${HDFS3_SOURCE_DIR}/server/Namenode.h
${HDFS3_SOURCE_DIR}/common/XmlConfig.h
${HDFS3_SOURCE_DIR}/common/Logger.h
${HDFS3_SOURCE_DIR}/common/WriteBuffer.h
${HDFS3_SOURCE_DIR}/common/HWCrc32c.h
${HDFS3_SOURCE_DIR}/common/Checksum.h
${HDFS3_SOURCE_DIR}/common/SessionConfig.h
${HDFS3_SOURCE_DIR}/common/Unordered.h
${HDFS3_SOURCE_DIR}/common/BigEndian.h
${HDFS3_SOURCE_DIR}/common/Thread.h
${HDFS3_SOURCE_DIR}/common/StackPrinter.h
${HDFS3_SOURCE_DIR}/common/Exception.h
${HDFS3_SOURCE_DIR}/common/WritableUtils.h
${HDFS3_SOURCE_DIR}/common/StringUtil.h
${HDFS3_SOURCE_DIR}/common/LruMap.h
${HDFS3_SOURCE_DIR}/common/Function.h
${HDFS3_SOURCE_DIR}/common/DateTime.h
${HDFS3_SOURCE_DIR}/common/Hash.h
${HDFS3_SOURCE_DIR}/common/SWCrc32c.h
${HDFS3_SOURCE_DIR}/common/ExceptionInternal.h
${HDFS3_SOURCE_DIR}/common/Memory.h
${HDFS3_SOURCE_DIR}/common/FileWrapper.h
)
# target
add_library(hdfs3 STATIC ${SRCS} ${PROTO_SOURCES} ${PROTO_HEADERS})
if (USE_INTERNAL_PROTOBUF_LIBRARY)
add_dependencies(hdfs3 protoc)
endif()
target_include_directories(hdfs3 PRIVATE ${HDFS3_SOURCE_DIR})
target_include_directories(hdfs3 PRIVATE ${HDFS3_COMMON_DIR})
target_include_directories(hdfs3 PRIVATE ${CMAKE_CURRENT_BINARY_DIR})
target_include_directories(hdfs3 PRIVATE ${LIBGSASL_INCLUDE_DIR})
if (WITH_KERBEROS)
target_include_directories(hdfs3 PRIVATE ${KERBEROS_INCLUDE_DIRS})
endif()
target_include_directories(hdfs3 PRIVATE ${LIBXML2_INCLUDE_DIR})
target_link_libraries(hdfs3 ${LIBGSASL_LIBRARY})
if (WITH_KERBEROS)
target_link_libraries(hdfs3 ${KERBEROS_LIBRARIES})
endif()
target_link_libraries(hdfs3 ${LIBXML2_LIBRARY})
# inherit from parent cmake
target_include_directories(hdfs3 PRIVATE ${Boost_INCLUDE_DIRS})
target_include_directories(hdfs3 PRIVATE ${Protobuf_INCLUDE_DIR})
target_include_directories(hdfs3 PRIVATE ${OPENSSL_INCLUDE_DIR})
target_link_libraries(hdfs3 ${Protobuf_LIBRARY})
target_link_libraries(hdfs3 ${OPENSSL_LIBRARIES})

1
contrib/libxml2 vendored Submodule

@ -0,0 +1 @@
Subproject commit 18890f471c420411aa3c989e104d090966ec9dbf

View File

@ -0,0 +1,65 @@
set(LIBXML2_SOURCE_DIR ${CMAKE_SOURCE_DIR}/contrib/libxml2)
set(LIBXML2_BINARY_DIR ${CMAKE_BINARY_DIR}/contrib/libxml2)
set(SRCS
${LIBXML2_SOURCE_DIR}/parser.c
${LIBXML2_SOURCE_DIR}/HTMLparser.c
${LIBXML2_SOURCE_DIR}/buf.c
${LIBXML2_SOURCE_DIR}/xzlib.c
${LIBXML2_SOURCE_DIR}/xmlregexp.c
${LIBXML2_SOURCE_DIR}/entities.c
${LIBXML2_SOURCE_DIR}/rngparser.c
${LIBXML2_SOURCE_DIR}/encoding.c
${LIBXML2_SOURCE_DIR}/legacy.c
${LIBXML2_SOURCE_DIR}/error.c
${LIBXML2_SOURCE_DIR}/debugXML.c
${LIBXML2_SOURCE_DIR}/xpointer.c
${LIBXML2_SOURCE_DIR}/DOCBparser.c
${LIBXML2_SOURCE_DIR}/xmlcatalog.c
${LIBXML2_SOURCE_DIR}/c14n.c
${LIBXML2_SOURCE_DIR}/xmlreader.c
${LIBXML2_SOURCE_DIR}/xmlstring.c
${LIBXML2_SOURCE_DIR}/dict.c
${LIBXML2_SOURCE_DIR}/xpath.c
${LIBXML2_SOURCE_DIR}/tree.c
${LIBXML2_SOURCE_DIR}/trionan.c
${LIBXML2_SOURCE_DIR}/pattern.c
${LIBXML2_SOURCE_DIR}/globals.c
${LIBXML2_SOURCE_DIR}/xmllint.c
${LIBXML2_SOURCE_DIR}/chvalid.c
${LIBXML2_SOURCE_DIR}/relaxng.c
${LIBXML2_SOURCE_DIR}/list.c
${LIBXML2_SOURCE_DIR}/xinclude.c
${LIBXML2_SOURCE_DIR}/xmlIO.c
${LIBXML2_SOURCE_DIR}/triostr.c
${LIBXML2_SOURCE_DIR}/hash.c
${LIBXML2_SOURCE_DIR}/xmlsave.c
${LIBXML2_SOURCE_DIR}/HTMLtree.c
${LIBXML2_SOURCE_DIR}/SAX.c
${LIBXML2_SOURCE_DIR}/xmlschemas.c
${LIBXML2_SOURCE_DIR}/SAX2.c
${LIBXML2_SOURCE_DIR}/threads.c
${LIBXML2_SOURCE_DIR}/runsuite.c
${LIBXML2_SOURCE_DIR}/catalog.c
${LIBXML2_SOURCE_DIR}/uri.c
${LIBXML2_SOURCE_DIR}/xmlmodule.c
${LIBXML2_SOURCE_DIR}/xlink.c
${LIBXML2_SOURCE_DIR}/parserInternals.c
${LIBXML2_SOURCE_DIR}/xmlwriter.c
${LIBXML2_SOURCE_DIR}/xmlunicode.c
${LIBXML2_SOURCE_DIR}/runxmlconf.c
${LIBXML2_SOURCE_DIR}/xmlmemory.c
${LIBXML2_SOURCE_DIR}/nanoftp.c
${LIBXML2_SOURCE_DIR}/xmlschemastypes.c
${LIBXML2_SOURCE_DIR}/valid.c
${LIBXML2_SOURCE_DIR}/nanohttp.c
${LIBXML2_SOURCE_DIR}/schematron.c
)
add_library(libxml2 STATIC ${SRCS})
target_link_libraries(libxml2 ${ZLIB_LIBRARIES})
target_include_directories(libxml2 PUBLIC ${CMAKE_CURRENT_SOURCE_DIR}/linux_x86_64/include)
target_include_directories(libxml2 PUBLIC ${LIBXML2_SOURCE_DIR}/include)
target_include_directories(libxml2 PRIVATE ${ZLIB_INCLUDE_DIR}/include)

View File

@ -0,0 +1,285 @@
/* config.h. Generated from config.h.in by configure. */
/* config.h.in. Generated from configure.ac by autoheader. */
/* Type cast for the gethostbyname() argument */
#define GETHOSTBYNAME_ARG_CAST /**/
/* Define to 1 if you have the <arpa/inet.h> header file. */
#define HAVE_ARPA_INET_H 1
/* Define to 1 if you have the <arpa/nameser.h> header file. */
#define HAVE_ARPA_NAMESER_H 1
/* Whether struct sockaddr::__ss_family exists */
/* #undef HAVE_BROKEN_SS_FAMILY */
/* Define to 1 if you have the <ctype.h> header file. */
#define HAVE_CTYPE_H 1
/* Define to 1 if you have the <dirent.h> header file. */
#define HAVE_DIRENT_H 1
/* Define to 1 if you have the <dlfcn.h> header file. */
#define HAVE_DLFCN_H 1
/* Have dlopen based dso */
#define HAVE_DLOPEN /**/
/* Define to 1 if you have the <dl.h> header file. */
/* #undef HAVE_DL_H */
/* Define to 1 if you have the <errno.h> header file. */
#define HAVE_ERRNO_H 1
/* Define to 1 if you have the <fcntl.h> header file. */
#define HAVE_FCNTL_H 1
/* Define to 1 if you have the <float.h> header file. */
#define HAVE_FLOAT_H 1
/* Define to 1 if you have the `fprintf' function. */
#define HAVE_FPRINTF 1
/* Define to 1 if you have the `ftime' function. */
#define HAVE_FTIME 1
/* Define if getaddrinfo is there */
#define HAVE_GETADDRINFO /**/
/* Define to 1 if you have the `gettimeofday' function. */
#define HAVE_GETTIMEOFDAY 1
/* Define to 1 if you have the <inttypes.h> header file. */
#define HAVE_INTTYPES_H 1
/* Define to 1 if you have the `isascii' function. */
#define HAVE_ISASCII 1
/* Define if isinf is there */
#define HAVE_ISINF /**/
/* Define if isnan is there */
#define HAVE_ISNAN /**/
/* Define if history library is there (-lhistory) */
/* #undef HAVE_LIBHISTORY */
/* Define if pthread library is there (-lpthread) */
#define HAVE_LIBPTHREAD /**/
/* Define if readline library is there (-lreadline) */
/* #undef HAVE_LIBREADLINE */
/* Define to 1 if you have the <limits.h> header file. */
#define HAVE_LIMITS_H 1
/* Define to 1 if you have the `localtime' function. */
#define HAVE_LOCALTIME 1
/* Define to 1 if you have the <lzma.h> header file. */
/* #undef HAVE_LZMA_H */
/* Define to 1 if you have the <malloc.h> header file. */
#define HAVE_MALLOC_H 1
/* Define to 1 if you have the <math.h> header file. */
#define HAVE_MATH_H 1
/* Define to 1 if you have the <memory.h> header file. */
#define HAVE_MEMORY_H 1
/* Define to 1 if you have the `mmap' function. */
#define HAVE_MMAP 1
/* Define to 1 if you have the `munmap' function. */
#define HAVE_MUNMAP 1
/* mmap() is no good without munmap() */
#if defined(HAVE_MMAP) && !defined(HAVE_MUNMAP)
# undef /**/ HAVE_MMAP
#endif
/* Define to 1 if you have the <ndir.h> header file, and it defines `DIR'. */
/* #undef HAVE_NDIR_H */
/* Define to 1 if you have the <netdb.h> header file. */
#define HAVE_NETDB_H 1
/* Define to 1 if you have the <netinet/in.h> header file. */
#define HAVE_NETINET_IN_H 1
/* Define to 1 if you have the <poll.h> header file. */
#define HAVE_POLL_H 1
/* Define to 1 if you have the `printf' function. */
#define HAVE_PRINTF 1
/* Define if <pthread.h> is there */
#define HAVE_PTHREAD_H /**/
/* Define to 1 if you have the `putenv' function. */
#define HAVE_PUTENV 1
/* Define to 1 if you have the `rand' function. */
#define HAVE_RAND 1
/* Define to 1 if you have the `rand_r' function. */
#define HAVE_RAND_R 1
/* Define to 1 if you have the <resolv.h> header file. */
#define HAVE_RESOLV_H 1
/* Have shl_load based dso */
/* #undef HAVE_SHLLOAD */
/* Define to 1 if you have the `signal' function. */
#define HAVE_SIGNAL 1
/* Define to 1 if you have the <signal.h> header file. */
#define HAVE_SIGNAL_H 1
/* Define to 1 if you have the `snprintf' function. */
#define HAVE_SNPRINTF 1
/* Define to 1 if you have the `sprintf' function. */
#define HAVE_SPRINTF 1
/* Define to 1 if you have the `srand' function. */
#define HAVE_SRAND 1
/* Define to 1 if you have the `sscanf' function. */
#define HAVE_SSCANF 1
/* Define to 1 if you have the `stat' function. */
#define HAVE_STAT 1
/* Define to 1 if you have the <stdarg.h> header file. */
#define HAVE_STDARG_H 1
/* Define to 1 if you have the <stdint.h> header file. */
#define HAVE_STDINT_H 1
/* Define to 1 if you have the <stdlib.h> header file. */
#define HAVE_STDLIB_H 1
/* Define to 1 if you have the `strftime' function. */
#define HAVE_STRFTIME 1
/* Define to 1 if you have the <strings.h> header file. */
#define HAVE_STRINGS_H 1
/* Define to 1 if you have the <string.h> header file. */
#define HAVE_STRING_H 1
/* Define to 1 if you have the <sys/dir.h> header file, and it defines `DIR'.
*/
/* #undef HAVE_SYS_DIR_H */
/* Define to 1 if you have the <sys/mman.h> header file. */
#define HAVE_SYS_MMAN_H 1
/* Define to 1 if you have the <sys/ndir.h> header file, and it defines `DIR'.
*/
/* #undef HAVE_SYS_NDIR_H */
/* Define to 1 if you have the <sys/select.h> header file. */
#define HAVE_SYS_SELECT_H 1
/* Define to 1 if you have the <sys/socket.h> header file. */
#define HAVE_SYS_SOCKET_H 1
/* Define to 1 if you have the <sys/stat.h> header file. */
#define HAVE_SYS_STAT_H 1
/* Define to 1 if you have the <sys/timeb.h> header file. */
#define HAVE_SYS_TIMEB_H 1
/* Define to 1 if you have the <sys/time.h> header file. */
#define HAVE_SYS_TIME_H 1
/* Define to 1 if you have the <sys/types.h> header file. */
#define HAVE_SYS_TYPES_H 1
/* Define to 1 if you have the `time' function. */
#define HAVE_TIME 1
/* Define to 1 if you have the <time.h> header file. */
#define HAVE_TIME_H 1
/* Define to 1 if you have the <unistd.h> header file. */
#define HAVE_UNISTD_H 1
/* Whether va_copy() is available */
#define HAVE_VA_COPY 1
/* Define to 1 if you have the `vfprintf' function. */
#define HAVE_VFPRINTF 1
/* Define to 1 if you have the `vsnprintf' function. */
#define HAVE_VSNPRINTF 1
/* Define to 1 if you have the `vsprintf' function. */
#define HAVE_VSPRINTF 1
/* Define to 1 if you have the <zlib.h> header file. */
/* #undef HAVE_ZLIB_H */
/* Whether __va_copy() is available */
/* #undef HAVE___VA_COPY */
/* Define as const if the declaration of iconv() needs const. */
#define ICONV_CONST
/* Define to the sub-directory where libtool stores uninstalled libraries. */
#define LT_OBJDIR ".libs/"
/* Name of package */
#define PACKAGE "libxml2"
/* Define to the address where bug reports for this package should be sent. */
#define PACKAGE_BUGREPORT ""
/* Define to the full name of this package. */
#define PACKAGE_NAME ""
/* Define to the full name and version of this package. */
#define PACKAGE_STRING ""
/* Define to the one symbol short name of this package. */
#define PACKAGE_TARNAME ""
/* Define to the home page for this package. */
#define PACKAGE_URL ""
/* Define to the version of this package. */
#define PACKAGE_VERSION ""
/* Type cast for the send() function 2nd arg */
#define SEND_ARG2_CAST /**/
/* Define to 1 if you have the ANSI C header files. */
#define STDC_HEADERS 1
/* Support for IPv6 */
#define SUPPORT_IP6 /**/
/* Define if va_list is an array type */
#define VA_LIST_IS_ARRAY 1
/* Version number of package */
#define VERSION "2.9.8"
/* Determine what socket length (socklen_t) data type is */
#define XML_SOCKLEN_T socklen_t
/* Define for Solaris 2.5.1 so the uint32_t typedef from <sys/synch.h>,
<pthread.h>, or <semaphore.h> is not used. If the typedef were allowed, the
#define below would cause a syntax error. */
/* #undef _UINT32_T */
/* ss_family is not defined here, use __ss_family instead */
/* #undef ss_family */
/* Define to the type of an unsigned integer type of width exactly 32 bits if
such a type exists and the standard includes do not define it. */
/* #undef uint32_t */

View File

@ -0,0 +1,481 @@
/*
* Summary: compile-time version informations
* Description: compile-time version informations for the XML library
*
* Copy: See Copyright for the status of this software.
*
* Author: Daniel Veillard
*/
#ifndef __XML_VERSION_H__
#define __XML_VERSION_H__
#include <libxml/xmlexports.h>
#ifdef __cplusplus
extern "C" {
#endif
/*
* use those to be sure nothing nasty will happen if
* your library and includes mismatch
*/
#ifndef LIBXML2_COMPILING_MSCCDEF
XMLPUBFUN void XMLCALL xmlCheckVersion(int version);
#endif /* LIBXML2_COMPILING_MSCCDEF */
/**
* LIBXML_DOTTED_VERSION:
*
* the version string like "1.2.3"
*/
#define LIBXML_DOTTED_VERSION "2.9.8"
/**
* LIBXML_VERSION:
*
* the version number: 1.2.3 value is 10203
*/
#define LIBXML_VERSION 20908
/**
* LIBXML_VERSION_STRING:
*
* the version number string, 1.2.3 value is "10203"
*/
#define LIBXML_VERSION_STRING "20908"
/**
* LIBXML_VERSION_EXTRA:
*
* extra version information, used to show a CVS compilation
*/
#define LIBXML_VERSION_EXTRA "-GITv2.9.9-rc2-1-g6fc04d71"
/**
* LIBXML_TEST_VERSION:
*
* Macro to check that the libxml version in use is compatible with
* the version the software has been compiled against
*/
#define LIBXML_TEST_VERSION xmlCheckVersion(20908);
#ifndef VMS
#if 0
/**
* WITH_TRIO:
*
* defined if the trio support need to be configured in
*/
#define WITH_TRIO
#else
/**
* WITHOUT_TRIO:
*
* defined if the trio support should not be configured in
*/
#define WITHOUT_TRIO
#endif
#else /* VMS */
/**
* WITH_TRIO:
*
* defined if the trio support need to be configured in
*/
#define WITH_TRIO 1
#endif /* VMS */
/**
* LIBXML_THREAD_ENABLED:
*
* Whether the thread support is configured in
*/
#define LIBXML_THREAD_ENABLED 1
/**
* LIBXML_THREAD_ALLOC_ENABLED:
*
* Whether the allocation hooks are per-thread
*/
#if 0
#define LIBXML_THREAD_ALLOC_ENABLED
#endif
/**
* LIBXML_TREE_ENABLED:
*
* Whether the DOM like tree manipulation API support is configured in
*/
#if 1
#define LIBXML_TREE_ENABLED
#endif
/**
* LIBXML_OUTPUT_ENABLED:
*
* Whether the serialization/saving support is configured in
*/
#if 1
#define LIBXML_OUTPUT_ENABLED
#endif
/**
* LIBXML_PUSH_ENABLED:
*
* Whether the push parsing interfaces are configured in
*/
#if 1
#define LIBXML_PUSH_ENABLED
#endif
/**
* LIBXML_READER_ENABLED:
*
* Whether the xmlReader parsing interface is configured in
*/
#if 1
#define LIBXML_READER_ENABLED
#endif
/**
* LIBXML_PATTERN_ENABLED:
*
* Whether the xmlPattern node selection interface is configured in
*/
#if 1
#define LIBXML_PATTERN_ENABLED
#endif
/**
* LIBXML_WRITER_ENABLED:
*
* Whether the xmlWriter saving interface is configured in
*/
#if 1
#define LIBXML_WRITER_ENABLED
#endif
/**
* LIBXML_SAX1_ENABLED:
*
* Whether the older SAX1 interface is configured in
*/
#if 1
#define LIBXML_SAX1_ENABLED
#endif
/**
* LIBXML_FTP_ENABLED:
*
* Whether the FTP support is configured in
*/
#if 1
#define LIBXML_FTP_ENABLED
#endif
/**
* LIBXML_HTTP_ENABLED:
*
* Whether the HTTP support is configured in
*/
#if 1
#define LIBXML_HTTP_ENABLED
#endif
/**
* LIBXML_VALID_ENABLED:
*
* Whether the DTD validation support is configured in
*/
#if 1
#define LIBXML_VALID_ENABLED
#endif
/**
* LIBXML_HTML_ENABLED:
*
* Whether the HTML support is configured in
*/
#if 1
#define LIBXML_HTML_ENABLED
#endif
/**
* LIBXML_LEGACY_ENABLED:
*
* Whether the deprecated APIs are compiled in for compatibility
*/
#if 1
#define LIBXML_LEGACY_ENABLED
#endif
/**
* LIBXML_C14N_ENABLED:
*
* Whether the Canonicalization support is configured in
*/
#if 1
#define LIBXML_C14N_ENABLED
#endif
/**
* LIBXML_CATALOG_ENABLED:
*
* Whether the Catalog support is configured in
*/
#if 1
#define LIBXML_CATALOG_ENABLED
#endif
/**
* LIBXML_DOCB_ENABLED:
*
* Whether the SGML Docbook support is configured in
*/
#if 1
#define LIBXML_DOCB_ENABLED
#endif
/**
* LIBXML_XPATH_ENABLED:
*
* Whether XPath is configured in
*/
#if 1
#define LIBXML_XPATH_ENABLED
#endif
/**
* LIBXML_XPTR_ENABLED:
*
* Whether XPointer is configured in
*/
#if 1
#define LIBXML_XPTR_ENABLED
#endif
/**
* LIBXML_XINCLUDE_ENABLED:
*
* Whether XInclude is configured in
*/
#if 1
#define LIBXML_XINCLUDE_ENABLED
#endif
/**
* LIBXML_ICONV_ENABLED:
*
* Whether iconv support is available
*/
#if 1
#define LIBXML_ICONV_ENABLED
#endif
/**
* LIBXML_ICU_ENABLED:
*
* Whether icu support is available
*/
#if 0
#define LIBXML_ICU_ENABLED
#endif
/**
* LIBXML_ISO8859X_ENABLED:
*
* Whether ISO-8859-* support is made available in case iconv is not
*/
#if 1
#define LIBXML_ISO8859X_ENABLED
#endif
/**
* LIBXML_DEBUG_ENABLED:
*
* Whether Debugging module is configured in
*/
#if 1
#define LIBXML_DEBUG_ENABLED
#endif
/**
* DEBUG_MEMORY_LOCATION:
*
* Whether the memory debugging is configured in
*/
#if 0
#define DEBUG_MEMORY_LOCATION
#endif
/**
* LIBXML_DEBUG_RUNTIME:
*
* Whether the runtime debugging is configured in
*/
#if 0
#define LIBXML_DEBUG_RUNTIME
#endif
/**
* LIBXML_UNICODE_ENABLED:
*
* Whether the Unicode related interfaces are compiled in
*/
#if 1
#define LIBXML_UNICODE_ENABLED
#endif
/**
* LIBXML_REGEXP_ENABLED:
*
* Whether the regular expressions interfaces are compiled in
*/
#if 1
#define LIBXML_REGEXP_ENABLED
#endif
/**
* LIBXML_AUTOMATA_ENABLED:
*
* Whether the automata interfaces are compiled in
*/
#if 1
#define LIBXML_AUTOMATA_ENABLED
#endif
/**
* LIBXML_EXPR_ENABLED:
*
* Whether the formal expressions interfaces are compiled in
*/
#if 1
#define LIBXML_EXPR_ENABLED
#endif
/**
* LIBXML_SCHEMAS_ENABLED:
*
* Whether the Schemas validation interfaces are compiled in
*/
#if 1
#define LIBXML_SCHEMAS_ENABLED
#endif
/**
* LIBXML_SCHEMATRON_ENABLED:
*
* Whether the Schematron validation interfaces are compiled in
*/
#if 1
#define LIBXML_SCHEMATRON_ENABLED
#endif
/**
* LIBXML_MODULES_ENABLED:
*
* Whether the module interfaces are compiled in
*/
#if 1
#define LIBXML_MODULES_ENABLED
/**
* LIBXML_MODULE_EXTENSION:
*
* the string suffix used by dynamic modules (usually shared libraries)
*/
#define LIBXML_MODULE_EXTENSION ".so"
#endif
/**
* LIBXML_ZLIB_ENABLED:
*
* Whether the Zlib support is compiled in
*/
#if 1
#define LIBXML_ZLIB_ENABLED
#endif
/**
* LIBXML_LZMA_ENABLED:
*
* Whether the Lzma support is compiled in
*/
#if 0
#define LIBXML_LZMA_ENABLED
#endif
#ifdef __GNUC__
/**
* ATTRIBUTE_UNUSED:
*
* Macro used to signal to GCC unused function parameters
*/
#ifndef ATTRIBUTE_UNUSED
# if ((__GNUC__ > 2) || ((__GNUC__ == 2) && (__GNUC_MINOR__ >= 7)))
# define ATTRIBUTE_UNUSED __attribute__((unused))
# else
# define ATTRIBUTE_UNUSED
# endif
#endif
/**
* LIBXML_ATTR_ALLOC_SIZE:
*
* Macro used to indicate to GCC this is an allocator function
*/
#ifndef LIBXML_ATTR_ALLOC_SIZE
# if (!defined(__clang__) && ((__GNUC__ > 4) || ((__GNUC__ == 4) && (__GNUC_MINOR__ >= 3))))
# define LIBXML_ATTR_ALLOC_SIZE(x) __attribute__((alloc_size(x)))
# else
# define LIBXML_ATTR_ALLOC_SIZE(x)
# endif
#else
# define LIBXML_ATTR_ALLOC_SIZE(x)
#endif
/**
* LIBXML_ATTR_FORMAT:
*
* Macro used to indicate to GCC the parameter are printf like
*/
#ifndef LIBXML_ATTR_FORMAT
# if ((__GNUC__ > 3) || ((__GNUC__ == 3) && (__GNUC_MINOR__ >= 3)))
# define LIBXML_ATTR_FORMAT(fmt,args) __attribute__((__format__(__printf__,fmt,args)))
# else
# define LIBXML_ATTR_FORMAT(fmt,args)
# endif
#else
# define LIBXML_ATTR_FORMAT(fmt,args)
#endif
#else /* ! __GNUC__ */
/**
* ATTRIBUTE_UNUSED:
*
* Macro used to signal to GCC unused function parameters
*/
#define ATTRIBUTE_UNUSED
/**
* LIBXML_ATTR_ALLOC_SIZE:
*
* Macro used to indicate to GCC this is an allocator function
*/
#define LIBXML_ATTR_ALLOC_SIZE(x)
/**
* LIBXML_ATTR_FORMAT:
*
* Macro used to indicate to GCC the parameter are printf like
*/
#define LIBXML_ATTR_FORMAT(fmt,args)
#endif /* __GNUC__ */
#ifdef __cplusplus
}
#endif /* __cplusplus */
#endif

2
contrib/poco vendored

@ -1 +1 @@
Subproject commit 20c1d877773b6a672f1bbfe3290dfea42a117ed5
Subproject commit fe5505e56c27b6ecb0dcbc40c49dc2caf4e9637f

1
contrib/protobuf vendored Submodule

@ -0,0 +1 @@
Subproject commit 12735370922a35f03999afff478e1c6d7aa917a4

View File

@ -264,6 +264,11 @@ target_link_libraries(dbms PRIVATE ${OPENSSL_CRYPTO_LIBRARY} Threads::Threads)
target_include_directories (dbms SYSTEM BEFORE PRIVATE ${DIVIDE_INCLUDE_DIR})
target_include_directories (dbms SYSTEM BEFORE PRIVATE ${SPARCEHASH_INCLUDE_DIR})
if (USE_HDFS)
target_link_libraries (dbms PRIVATE ${HDFS3_LIBRARY})
target_include_directories (dbms SYSTEM BEFORE PRIVATE ${HDFS3_INCLUDE_DIR})
endif()
if (NOT USE_INTERNAL_LZ4_LIBRARY)
target_include_directories (dbms SYSTEM BEFORE PRIVATE ${LZ4_INCLUDE_DIR})
endif ()

View File

@ -376,13 +376,21 @@ int Server::main(const std::vector<std::string> & /*args*/)
format_schema_path.createDirectories();
LOG_INFO(log, "Loading metadata.");
loadMetadataSystem(*global_context);
/// After attaching system databases we can initialize system log.
global_context->initializeSystemLogs();
/// After the system database is created, attach virtual system tables (in addition to query_log and part_log)
attachSystemTablesServer(*global_context->getDatabase("system"), has_zookeeper);
/// Then, load remaining databases
loadMetadata(*global_context);
try
{
loadMetadataSystem(*global_context);
/// After attaching system databases we can initialize system log.
global_context->initializeSystemLogs();
/// After the system database is created, attach virtual system tables (in addition to query_log and part_log)
attachSystemTablesServer(*global_context->getDatabase("system"), has_zookeeper);
/// Then, load remaining databases
loadMetadata(*global_context);
}
catch (...)
{
tryLogCurrentException(log, "Caught exception while loading metadata");
throw;
}
LOG_DEBUG(log, "Loaded metadata.");
global_context->setCurrentDatabase(default_database);

View File

@ -30,6 +30,7 @@
#include <Storages/StorageMemory.h>
#include <Storages/StorageReplicatedMergeTree.h>
#include <Core/ExternalTable.h>
#include <DataTypes/DataTypeLowCardinality.h>
#include "TCPHandler.h"
@ -361,6 +362,17 @@ void TCPHandler::processInsertQuery(const Settings & global_settings)
/// Send block to the client - table structure.
Block block = state.io.out->getHeader();
/// Support insert from old clients without low cardinality type.
if (client_revision && client_revision < DBMS_MIN_REVISION_WITH_LOW_CARDINALITY_TYPE)
{
for (auto & col : block)
{
col.type = recursiveRemoveLowCardinality(col.type);
col.column = recursiveRemoveLowCardinality(col.column);
}
}
sendData(block);
readData(global_settings);
@ -743,8 +755,13 @@ void TCPHandler::initBlockInput()
else
state.maybe_compressed_in = in;
Block header;
if (state.io.out)
header = state.io.out->getHeader();
state.block_in = std::make_shared<NativeBlockInputStream>(
*state.maybe_compressed_in,
header,
client_revision);
}
}

View File

@ -18,7 +18,7 @@ namespace DB
namespace ErrorCodes
{
extern const int TOO_SLOW;
extern const int TOO_LESS_ARGUMENTS_FOR_FUNCTION;
extern const int TOO_FEW_ARGUMENTS_FOR_FUNCTION;
extern const int TOO_MANY_ARGUMENTS_FOR_FUNCTION;
extern const int SYNTAX_ERROR;
extern const int BAD_ARGUMENTS;
@ -146,7 +146,7 @@ public:
if (!sufficientArgs(arg_count))
throw Exception{"Aggregate function " + derived().getName() + " requires at least 3 arguments.",
ErrorCodes::TOO_LESS_ARGUMENTS_FOR_FUNCTION};
ErrorCodes::TOO_FEW_ARGUMENTS_FOR_FUNCTION};
if (arg_count - 1 > AggregateFunctionSequenceMatchData::max_events)
throw Exception{"Aggregate function " + derived().getName() + " supports up to " +

View File

@ -163,7 +163,7 @@ void ColumnLowCardinality::insertRangeFrom(const IColumn & src, size_t start, si
auto * low_cardinality_src = typeid_cast<const ColumnLowCardinality *>(&src);
if (!low_cardinality_src)
throw Exception("Expected ColumnLowCardinality, got" + src.getName(), ErrorCodes::ILLEGAL_COLUMN);
throw Exception("Expected ColumnLowCardinality, got " + src.getName(), ErrorCodes::ILLEGAL_COLUMN);
if (&low_cardinality_src->getDictionary() == &getDictionary())
{

View File

@ -36,6 +36,7 @@ public:
const ColumnPtr & getNestedColumn() const override;
const ColumnPtr & getNestedNotNullableColumn() const override { return column_holder; }
bool nestedColumnIsNullable() const override { return is_nullable; }
size_t uniqueInsert(const Field & x) override;
size_t uniqueInsertFrom(const IColumn & src, size_t n) override;

View File

@ -18,6 +18,8 @@ public:
/// The same as getNestedColumn, but removes null map if nested column is nullable.
virtual const ColumnPtr & getNestedNotNullableColumn() const = 0;
virtual bool nestedColumnIsNullable() const = 0;
/// Returns array with StringRefHash calculated for each row of getNestedNotNullableColumn() column.
/// Returns nullptr if nested column doesn't contain strings. Otherwise calculates hash (if it wasn't).
/// Uses thread-safe cache.

View File

@ -42,7 +42,7 @@ namespace ErrorCodes
extern const int ATTEMPT_TO_READ_AFTER_EOF = 32;
extern const int CANNOT_READ_ALL_DATA = 33;
extern const int TOO_MANY_ARGUMENTS_FOR_FUNCTION = 34;
extern const int TOO_LESS_ARGUMENTS_FOR_FUNCTION = 35;
extern const int TOO_FEW_ARGUMENTS_FOR_FUNCTION = 35;
extern const int BAD_ARGUMENTS = 36;
extern const int UNKNOWN_ELEMENT_IN_AST = 37;
extern const int CANNOT_PARSE_DATE = 38;
@ -285,7 +285,7 @@ namespace ErrorCodes
extern const int INCORRECT_INDEX = 282;
extern const int UNKNOWN_DISTRIBUTED_PRODUCT_MODE = 283;
extern const int UNKNOWN_GLOBAL_SUBQUERIES_METHOD = 284;
extern const int TOO_LESS_LIVE_REPLICAS = 285;
extern const int TOO_FEW_LIVE_REPLICAS = 285;
extern const int UNSATISFIED_QUORUM_FOR_PREVIOUS_WRITE = 286;
extern const int UNKNOWN_FORMAT_VERSION = 287;
extern const int DISTRIBUTED_IN_JOIN_SUBQUERY_DENIED = 288;

View File

@ -10,16 +10,17 @@ template
typename Cell,
typename Hash = DefaultHash<Key>,
typename Grower = TwoLevelHashTableGrower<>,
typename Allocator = HashTableAllocator
typename Allocator = HashTableAllocator,
template <typename ...> typename ImplTable = HashMapTable
>
class TwoLevelHashMapTable : public TwoLevelHashTable<Key, Cell, Hash, Grower, Allocator, HashMapTable<Key, Cell, Hash, Grower, Allocator>>
class TwoLevelHashMapTable : public TwoLevelHashTable<Key, Cell, Hash, Grower, Allocator, ImplTable<Key, Cell, Hash, Grower, Allocator>>
{
public:
using key_type = Key;
using mapped_type = typename Cell::Mapped;
using value_type = typename Cell::value_type;
using TwoLevelHashTable<Key, Cell, Hash, Grower, Allocator, HashMapTable<Key, Cell, Hash, Grower, Allocator>>::TwoLevelHashTable;
using TwoLevelHashTable<Key, Cell, Hash, Grower, Allocator, ImplTable<Key, Cell, Hash, Grower, Allocator>>::TwoLevelHashTable;
mapped_type & ALWAYS_INLINE operator[](Key x)
{
@ -41,9 +42,10 @@ template
typename Mapped,
typename Hash = DefaultHash<Key>,
typename Grower = TwoLevelHashTableGrower<>,
typename Allocator = HashTableAllocator
typename Allocator = HashTableAllocator,
template <typename ...> typename ImplTable = HashMapTable
>
using TwoLevelHashMap = TwoLevelHashMapTable<Key, HashMapCell<Key, Mapped, Hash>, Hash, Grower, Allocator>;
using TwoLevelHashMap = TwoLevelHashMapTable<Key, HashMapCell<Key, Mapped, Hash>, Hash, Grower, Allocator, ImplTable>;
template
@ -52,6 +54,7 @@ template
typename Mapped,
typename Hash = DefaultHash<Key>,
typename Grower = TwoLevelHashTableGrower<>,
typename Allocator = HashTableAllocator
typename Allocator = HashTableAllocator,
template <typename ...> typename ImplTable = HashMapTable
>
using TwoLevelHashMapWithSavedHash = TwoLevelHashMapTable<Key, HashMapCellWithSavedHash<Key, Mapped, Hash>, Hash, Grower, Allocator>;
using TwoLevelHashMapWithSavedHash = TwoLevelHashMapTable<Key, HashMapCellWithSavedHash<Key, Mapped, Hash>, Hash, Grower, Allocator, ImplTable>;

View File

@ -16,3 +16,4 @@
#cmakedefine01 USE_POCO_NETSSL
#cmakedefine01 CLICKHOUSE_SPLIT_BINARY
#cmakedefine01 USE_BASE64
#cmakedefine01 USE_HDFS

View File

@ -52,6 +52,8 @@
/// (keys will be placed in different buckets and result will not be fully aggregated).
#define DBMS_MIN_REVISION_WITH_CURRENT_AGGREGATION_VARIANT_SELECTION_METHOD 54408
#define DBMS_MIN_REVISION_WITH_LOW_CARDINALITY_TYPE 54405
/// Version of ClickHouse TCP protocol. Set to git tag with latest protocol change.
#define DBMS_TCP_PROTOCOL_VERSION 54226

View File

@ -9,6 +9,7 @@
#include <ext/range.h>
#include <DataStreams/NativeBlockInputStream.h>
#include <DataTypes/DataTypeLowCardinality.h>
namespace DB
@ -152,6 +153,12 @@ Block NativeBlockInputStream::readImpl()
column.column = std::move(read_column);
if (server_revision && server_revision < DBMS_MIN_REVISION_WITH_LOW_CARDINALITY_TYPE)
{
column.column = recursiveLowCardinalityConversion(column.column, column.type, header.getByPosition(i).type);
column.type = header.getByPosition(i).type;
}
res.insert(std::move(column));
if (use_index)

View File

@ -164,4 +164,13 @@ private:
/// Returns dictionary type if type is DataTypeLowCardinality, type otherwise.
DataTypePtr removeLowCardinality(const DataTypePtr & type);
/// Remove LowCardinality recursively from all nested types.
DataTypePtr recursiveRemoveLowCardinality(const DataTypePtr & type);
/// Remove LowCardinality recursively from all nested columns.
ColumnPtr recursiveRemoveLowCardinality(const ColumnPtr & column);
/// Convert column of type from_type to type to_type by converting nested LowCardinality columns.
ColumnPtr recursiveLowCardinalityConversion(const ColumnPtr & column, const DataTypePtr & from_type, const DataTypePtr & to_type);
}

View File

@ -0,0 +1,137 @@
#include <Columns/ColumnArray.h>
#include <Columns/ColumnConst.h>
#include <Columns/ColumnTuple.h>
#include <Columns/ColumnLowCardinality.h>
#include <DataTypes/DataTypeLowCardinality.h>
#include <DataTypes/DataTypeArray.h>
#include <DataTypes/DataTypeTuple.h>
namespace DB
{
namespace ErrorCodes
{
extern const int ILLEGAL_COLUMN;
extern const int TYPE_MISMATCH;
}
DataTypePtr recursiveRemoveLowCardinality(const DataTypePtr & type)
{
if (!type)
return type;
if (const auto * array_type = typeid_cast<const DataTypeArray *>(type.get()))
return std::make_shared<DataTypeArray>(recursiveRemoveLowCardinality(array_type->getNestedType()));
if (const auto * tuple_type = typeid_cast<const DataTypeTuple *>(type.get()))
{
DataTypes elements = tuple_type->getElements();
for (auto & element : elements)
element = recursiveRemoveLowCardinality(element);
if (tuple_type->haveExplicitNames())
return std::make_shared<DataTypeTuple>(elements, tuple_type->getElementNames());
else
return std::make_shared<DataTypeTuple>(elements);
}
if (const auto * low_cardinality_type = typeid_cast<const DataTypeLowCardinality *>(type.get()))
return low_cardinality_type->getDictionaryType();
return type;
}
ColumnPtr recursiveRemoveLowCardinality(const ColumnPtr & column)
{
if (!column)
return column;
if (const auto * column_array = typeid_cast<const ColumnArray *>(column.get()))
return ColumnArray::create(recursiveRemoveLowCardinality(column_array->getDataPtr()), column_array->getOffsetsPtr());
if (const auto * column_const = typeid_cast<const ColumnConst *>(column.get()))
return ColumnConst::create(recursiveRemoveLowCardinality(column_const->getDataColumnPtr()), column_const->size());
if (const auto * column_tuple = typeid_cast<const ColumnTuple *>(column.get()))
{
Columns columns = column_tuple->getColumns();
for (auto & element : columns)
element = recursiveRemoveLowCardinality(element);
return ColumnTuple::create(columns);
}
if (const auto * column_low_cardinality = typeid_cast<const ColumnLowCardinality *>(column.get()))
return column_low_cardinality->convertToFullColumn();
return column;
}
ColumnPtr recursiveLowCardinalityConversion(const ColumnPtr & column, const DataTypePtr & from_type, const DataTypePtr & to_type)
{
if (from_type->equals(*to_type))
return column;
if (const auto * column_const = typeid_cast<const ColumnConst *>(column.get()))
return ColumnConst::create(recursiveLowCardinalityConversion(column_const->getDataColumnPtr(), from_type, to_type),
column_const->size());
if (const auto * low_cardinality_type = typeid_cast<const DataTypeLowCardinality *>(from_type.get()))
{
if (to_type->equals(*low_cardinality_type->getDictionaryType()))
return column->convertToFullColumnIfLowCardinality();
}
if (const auto * low_cardinality_type = typeid_cast<const DataTypeLowCardinality *>(to_type.get()))
{
if (from_type->equals(*low_cardinality_type->getDictionaryType()))
{
auto col = low_cardinality_type->createColumn();
static_cast<ColumnLowCardinality &>(*col).insertRangeFromFullColumn(*column, 0, column->size());
return std::move(col);
}
}
if (const auto * from_array_type = typeid_cast<const DataTypeArray *>(from_type.get()))
{
if (const auto * to_array_type = typeid_cast<const DataTypeArray *>(to_type.get()))
{
const auto * column_array = typeid_cast<const ColumnArray *>(column.get());
if (!column_array)
throw Exception("Unexpected column " + column->getName() + " for type " + from_type->getName(),
ErrorCodes::ILLEGAL_COLUMN);
auto & nested_from = from_array_type->getNestedType();
auto & nested_to = to_array_type->getNestedType();
return ColumnArray::create(
recursiveLowCardinalityConversion(column_array->getDataPtr(), nested_from, nested_to),
column_array->getOffsetsPtr());
}
}
if (const auto * from_tuple_type = typeid_cast<const DataTypeTuple *>(from_type.get()))
{
if (const auto * to_tuple_type = typeid_cast<const DataTypeTuple *>(to_type.get()))
{
const auto * column_tuple = typeid_cast<const ColumnTuple *>(column.get());
if (!column_tuple)
throw Exception("Unexpected column " + column->getName() + " for type " + from_type->getName(),
ErrorCodes::ILLEGAL_COLUMN);
Columns columns = column_tuple->getColumns();
auto & from_elements = from_tuple_type->getElements();
auto & to_elements = to_tuple_type->getElements();
for (size_t i = 0; i < columns.size(); ++i)
{
auto & element = columns[i];
element = recursiveLowCardinalityConversion(element, from_elements.at(i), to_elements.at(i));
}
return ColumnTuple::create(columns);
}
}
throw Exception("Cannot convert: " + from_type->getName() + " to " + to_type->getName(), ErrorCodes::TYPE_MISMATCH);
}
}

View File

@ -21,6 +21,7 @@
#include <IO/ReadBufferFromFile.h>
#include <IO/WriteHelpers.h>
#include <IO/ReadHelpers.h>
#include <ext/scope_guard.h>
namespace DB
@ -164,9 +165,15 @@ void DatabaseOrdinary::loadTables(
AtomicStopwatch watch;
std::atomic<size_t> tables_processed {0};
Poco::Event all_tables_processed;
ExceptionHandler exception_handler;
auto task_function = [&](const String & table)
{
SCOPE_EXIT(
if (++tables_processed == total_tables)
all_tables_processed.set()
);
/// Messages, so that it's not boring to wait for the server to load for a long time.
if ((tables_processed + 1) % PRINT_MESSAGE_EACH_N_TABLES == 0
|| watch.compareAndRestart(PRINT_MESSAGE_EACH_N_SECONDS))
@ -176,14 +183,11 @@ void DatabaseOrdinary::loadTables(
}
loadTable(context, metadata_path, *this, name, data_path, table, has_force_restore_data_flag);
if (++tables_processed == total_tables)
all_tables_processed.set();
};
for (const auto & filename : file_names)
{
auto task = std::bind(task_function, filename);
auto task = createExceptionHandledJob(std::bind(task_function, filename), exception_handler);
if (thread_pool)
thread_pool->schedule(task);
@ -194,6 +198,8 @@ void DatabaseOrdinary::loadTables(
if (thread_pool)
all_tables_processed.wait();
exception_handler.throwIfException();
/// After all tables was basically initialized, startup them.
startupTables(thread_pool);
}
@ -207,12 +213,18 @@ void DatabaseOrdinary::startupTables(ThreadPool * thread_pool)
std::atomic<size_t> tables_processed {0};
size_t total_tables = tables.size();
Poco::Event all_tables_processed;
ExceptionHandler exception_handler;
if (!total_tables)
return;
auto task_function = [&](const StoragePtr & table)
{
SCOPE_EXIT(
if (++tables_processed == total_tables)
all_tables_processed.set()
);
if ((tables_processed + 1) % PRINT_MESSAGE_EACH_N_TABLES == 0
|| watch.compareAndRestart(PRINT_MESSAGE_EACH_N_SECONDS))
{
@ -221,14 +233,11 @@ void DatabaseOrdinary::startupTables(ThreadPool * thread_pool)
}
table->startup();
if (++tables_processed == total_tables)
all_tables_processed.set();
};
for (const auto & name_storage : tables)
{
auto task = std::bind(task_function, name_storage.second);
auto task = createExceptionHandledJob(std::bind(task_function, name_storage.second), exception_handler);
if (thread_pool)
thread_pool->schedule(task);
@ -238,6 +247,8 @@ void DatabaseOrdinary::startupTables(ThreadPool * thread_pool)
if (thread_pool)
all_tables_processed.wait();
exception_handler.throwIfException();
}

View File

@ -16,7 +16,7 @@ namespace ErrorCodes
extern const int ILLEGAL_DIVISION;
extern const int ILLEGAL_COLUMN;
extern const int ILLEGAL_TYPE_OF_ARGUMENT;
extern const int TOO_LESS_ARGUMENTS_FOR_FUNCTION;
extern const int TOO_FEW_ARGUMENTS_FOR_FUNCTION;
}
@ -36,7 +36,7 @@ public:
{
if (arguments.size() < 2)
throw Exception{"Number of arguments for function " + getName() + " doesn't match: passed "
+ toString(arguments.size()) + ", should be at least 2.", ErrorCodes::TOO_LESS_ARGUMENTS_FOR_FUNCTION};
+ toString(arguments.size()) + ", should be at least 2.", ErrorCodes::TOO_FEW_ARGUMENTS_FOR_FUNCTION};
const auto & first_arg = arguments.front();

View File

@ -30,7 +30,7 @@ namespace DB
namespace ErrorCodes
{
extern const int TOO_LESS_ARGUMENTS_FOR_FUNCTION;
extern const int TOO_FEW_ARGUMENTS_FOR_FUNCTION;
extern const int LOGICAL_ERROR;
}

View File

@ -57,7 +57,7 @@ namespace ErrorCodes
extern const int CANNOT_PARSE_TEXT;
extern const int CANNOT_PARSE_UUID;
extern const int TOO_LARGE_STRING_SIZE;
extern const int TOO_LESS_ARGUMENTS_FOR_FUNCTION;
extern const int TOO_FEW_ARGUMENTS_FOR_FUNCTION;
extern const int LOGICAL_ERROR;
extern const int TYPE_MISMATCH;
extern const int CANNOT_CONVERT_TYPE;
@ -883,7 +883,7 @@ private:
{
if (!arguments.size())
throw Exception{"Function " + getName() + " expects at least 1 arguments",
ErrorCodes::TOO_LESS_ARGUMENTS_FOR_FUNCTION};
ErrorCodes::TOO_FEW_ARGUMENTS_FOR_FUNCTION};
const IDataType * from_type = block.getByPosition(arguments[0]).type.get();
@ -897,7 +897,7 @@ private:
{
if (arguments.size() != 2)
throw Exception{"Function " + getName() + " expects 2 arguments for Decimal.",
ErrorCodes::TOO_LESS_ARGUMENTS_FOR_FUNCTION};
ErrorCodes::TOO_FEW_ARGUMENTS_FOR_FUNCTION};
const ColumnWithTypeAndName & scale_column = block.getByPosition(arguments[1]);
UInt32 scale = extractToDecimalScale(scale_column);

View File

@ -22,7 +22,7 @@ FunctionPtr FunctionModelEvaluate::create(const Context & context)
namespace ErrorCodes
{
extern const int ILLEGAL_TYPE_OF_ARGUMENT;
extern const int TOO_LESS_ARGUMENTS_FOR_FUNCTION;
extern const int TOO_FEW_ARGUMENTS_FOR_FUNCTION;
extern const int ILLEGAL_COLUMN;
}
@ -30,7 +30,7 @@ DataTypePtr FunctionModelEvaluate::getReturnTypeImpl(const DataTypes & arguments
{
if (arguments.size() < 2)
throw Exception("Function " + getName() + " expects at least 2 arguments",
ErrorCodes::TOO_LESS_ARGUMENTS_FOR_FUNCTION);
ErrorCodes::TOO_FEW_ARGUMENTS_FOR_FUNCTION);
if (!isString(arguments[0]))
throw Exception("Illegal type " + arguments[0]->getName() + " of first argument of function " + getName()

View File

@ -29,7 +29,7 @@ namespace DB
namespace ErrorCodes
{
extern const int TOO_LESS_ARGUMENTS_FOR_FUNCTION;
extern const int TOO_FEW_ARGUMENTS_FOR_FUNCTION;
extern const int BAD_ARGUMENTS;
extern const int ILLEGAL_TYPE_OF_ARGUMENT;
}
@ -111,7 +111,7 @@ public:
{
if (arguments.size() < 2)
{
throw Exception("Too few arguments", ErrorCodes::TOO_LESS_ARGUMENTS_FOR_FUNCTION);
throw Exception("Too few arguments", ErrorCodes::TOO_FEW_ARGUMENTS_FOR_FUNCTION);
}
auto getMsgPrefix = [this](size_t i) { return "Argument " + toString(i + 1) + " for function " + getName(); };

View File

@ -103,58 +103,6 @@ void PreparedFunctionImpl::createLowCardinalityResultCache(size_t cache_size)
}
static DataTypePtr recursiveRemoveLowCardinality(const DataTypePtr & type)
{
if (!type)
return type;
if (const auto * array_type = typeid_cast<const DataTypeArray *>(type.get()))
return std::make_shared<DataTypeArray>(recursiveRemoveLowCardinality(array_type->getNestedType()));
if (const auto * tuple_type = typeid_cast<const DataTypeTuple *>(type.get()))
{
DataTypes elements = tuple_type->getElements();
for (auto & element : elements)
element = recursiveRemoveLowCardinality(element);
if (tuple_type->haveExplicitNames())
return std::make_shared<DataTypeTuple>(elements, tuple_type->getElementNames());
else
return std::make_shared<DataTypeTuple>(elements);
}
if (const auto * low_cardinality_type = typeid_cast<const DataTypeLowCardinality *>(type.get()))
return low_cardinality_type->getDictionaryType();
return type;
}
static ColumnPtr recursiveRemoveLowCardinality(const ColumnPtr & column)
{
if (!column)
return column;
if (const auto * column_array = typeid_cast<const ColumnArray *>(column.get()))
return ColumnArray::create(recursiveRemoveLowCardinality(column_array->getDataPtr()), column_array->getOffsetsPtr());
if (const auto * column_const = typeid_cast<const ColumnConst *>(column.get()))
return ColumnConst::create(recursiveRemoveLowCardinality(column_const->getDataColumnPtr()), column_const->size());
if (const auto * column_tuple = typeid_cast<const ColumnTuple *>(column.get()))
{
Columns columns = column_tuple->getColumns();
for (auto & element : columns)
element = recursiveRemoveLowCardinality(element);
return ColumnTuple::create(columns);
}
if (const auto * column_low_cardinality = typeid_cast<const ColumnLowCardinality *>(column.get()))
return column_low_cardinality->convertToFullColumn();
return column;
}
ColumnPtr wrapInNullable(const ColumnPtr & src, const Block & block, const ColumnNumbers & args, size_t result, size_t input_rows_count)
{
ColumnPtr result_null_map_column;

View File

@ -9,7 +9,7 @@ namespace DB
namespace ErrorCodes
{
extern const int TOO_LESS_ARGUMENTS_FOR_FUNCTION;
extern const int TOO_FEW_ARGUMENTS_FOR_FUNCTION;
}
/// Implements the CASE construction when it is
@ -30,7 +30,7 @@ public:
{
if (!args.size())
throw Exception{"Function " + getName() + " expects at least 1 arguments",
ErrorCodes::TOO_LESS_ARGUMENTS_FOR_FUNCTION};
ErrorCodes::TOO_FEW_ARGUMENTS_FOR_FUNCTION};
/// See the comments in executeImpl() to understand why we actually have to
/// get the return type of a transform function.
@ -48,7 +48,7 @@ public:
{
if (!args.size())
throw Exception{"Function " + getName() + " expects at least 1 argument",
ErrorCodes::TOO_LESS_ARGUMENTS_FOR_FUNCTION};
ErrorCodes::TOO_FEW_ARGUMENTS_FOR_FUNCTION};
/// In the following code, we turn the construction:
/// CASE expr WHEN val[0] THEN branch[0] ... WHEN val[N-1] then branch[N-1] ELSE branchN

View File

@ -0,0 +1,96 @@
#pragma once
#include <Common/config.h>
#if USE_HDFS
#include <IO/ReadBuffer.h>
#include <Poco/URI.h>
#include <hdfs/hdfs.h>
#include <IO/BufferWithOwnMemory.h>
#ifndef O_DIRECT
#define O_DIRECT 00040000
#endif
namespace DB
{
namespace ErrorCodes
{
extern const int BAD_ARGUMENTS;
extern const int NETWORK_ERROR;
}
/** Accepts path to file and opens it, or pre-opened file descriptor.
* Closes file by himself (thus "owns" a file descriptor).
*/
class ReadBufferFromHDFS : public BufferWithOwnMemory<ReadBuffer>
{
protected:
std::string hdfs_uri;
struct hdfsBuilder *builder;
hdfsFS fs;
hdfsFile fin;
public:
ReadBufferFromHDFS(const std::string & hdfs_name_, size_t buf_size = DBMS_DEFAULT_BUFFER_SIZE)
: BufferWithOwnMemory<ReadBuffer>(buf_size), hdfs_uri(hdfs_name_) , builder(hdfsNewBuilder())
{
Poco::URI uri(hdfs_name_);
auto & host = uri.getHost();
auto port = uri.getPort();
auto & path = uri.getPath();
if (host.empty() || port == 0 || path.empty())
{
throw Exception("Illegal HDFS URI: " + hdfs_uri, ErrorCodes::BAD_ARGUMENTS);
}
// set read/connect timeout, default value in libhdfs3 is about 1 hour, and too large
/// TODO Allow to tune from query Settings.
hdfsBuilderConfSetStr(builder, "input.read.timeout", "60000"); // 1 min
hdfsBuilderConfSetStr(builder, "input.connect.timeout", "60000"); // 1 min
hdfsBuilderSetNameNode(builder, host.c_str());
hdfsBuilderSetNameNodePort(builder, port);
fs = hdfsBuilderConnect(builder);
if (fs == nullptr)
{
throw Exception("Unable to connect to HDFS: " + String(hdfsGetLastError()), ErrorCodes::NETWORK_ERROR);
}
fin = hdfsOpenFile(fs, path.c_str(), O_RDONLY, 0, 0, 0);
}
ReadBufferFromHDFS(ReadBufferFromHDFS &&) = default;
~ReadBufferFromHDFS() override
{
close();
hdfsFreeBuilder(builder);
}
/// Close HDFS connection before destruction of object.
void close()
{
hdfsCloseFile(fs, fin);
}
bool nextImpl() override
{
int bytes_read = hdfsRead(fs, fin, internal_buffer.begin(), internal_buffer.size());
if (bytes_read < 0)
{
throw Exception("Fail to read HDFS file: " + hdfs_uri + " " + String(hdfsGetLastError()), ErrorCodes::NETWORK_ERROR);
}
if (bytes_read)
working_buffer.resize(bytes_read);
else
return false;
return true;
}
const std::string & getHDFSUri() const
{
return hdfs_uri;
}
};
}
#endif

View File

@ -8,6 +8,7 @@
#include <Parsers/ASTSubquery.h>
#include <Parsers/ASTSelectWithUnionQuery.h>
#include <Parsers/ASTTablesInSelectQuery.h>
#include <Parsers/ASTFunction.h>
#include <Parsers/DumpASTNode.h>
namespace DB
@ -38,8 +39,9 @@ public:
void visit(ASTPtr & ast) const
{
if (!tryVisit<ASTSelectQuery>(ast) &&
!tryVisit<ASTSelectWithUnionQuery>(ast))
visitChildren(ast);
!tryVisit<ASTSelectWithUnionQuery>(ast) &&
!tryVisit<ASTFunction>(ast))
visitChildren(*ast);
}
void visit(ASTSelectQuery & select) const
@ -70,10 +72,7 @@ private:
if (select.tables)
tryVisit<ASTTablesInSelectQuery>(select.tables);
if (select.prewhere_expression)
visitChildren(select.prewhere_expression);
if (select.where_expression)
visitChildren(select.where_expression);
visitChildren(select);
}
void visit(ASTTablesInSelectQuery & tables, ASTPtr &) const
@ -112,9 +111,43 @@ private:
tryVisit<ASTSelectWithUnionQuery>(subquery.children[0]);
}
void visitChildren(ASTPtr & ast) const
void visit(ASTFunction & function, ASTPtr &) const
{
for (auto & child : ast->children)
bool is_operator_in = false;
for (auto name : {"in", "notIn", "globalIn", "globalNotIn"})
{
if (function.name == name)
{
is_operator_in = true;
break;
}
}
for (auto & child : function.children)
{
if (child.get() == function.arguments.get())
{
for (size_t i = 0; i < child->children.size(); ++i)
{
if (is_operator_in && i == 1)
{
/// Second argument of the "in" function (or similar) may be a table name or a subselect.
/// Rewrite the table name or descend into subselect.
if (!tryVisit<ASTIdentifier>(child->children[i]))
visit(child->children[i]);
}
else
visit(child->children[i]);
}
}
else
visit(child);
}
}
void visitChildren(IAST & ast) const
{
for (auto & child : ast.children)
visit(child);
}

View File

@ -453,6 +453,27 @@ AggregatedDataVariants::Type Aggregator::chooseAggregationMethod()
return AggregatedDataVariants::Type::nullable_keys256;
}
if (has_low_cardinality && params.keys_size == 1)
{
if (types_removed_nullable[0]->isValueRepresentedByNumber())
{
size_t size_of_field = types_removed_nullable[0]->getSizeOfValueInMemory();
if (size_of_field == 1)
return AggregatedDataVariants::Type::low_cardinality_key8;
if (size_of_field == 2)
return AggregatedDataVariants::Type::low_cardinality_key16;
if (size_of_field == 4)
return AggregatedDataVariants::Type::low_cardinality_key32;
if (size_of_field == 8)
return AggregatedDataVariants::Type::low_cardinality_key64;
}
else if (isString(types_removed_nullable[0]))
return AggregatedDataVariants::Type::low_cardinality_key_string;
else if (isFixedString(types_removed_nullable[0]))
return AggregatedDataVariants::Type::low_cardinality_key_fixed_string;
}
/// Fallback case.
return AggregatedDataVariants::Type::serialized;
}
@ -1139,12 +1160,10 @@ void Aggregator::convertToBlockImpl(
convertToBlockImplFinal(method, data, key_columns, final_aggregate_columns);
else
convertToBlockImplNotFinal(method, data, key_columns, aggregate_columns);
/// In order to release memory early.
data.clearAndShrink();
}
template <typename Method, typename Table>
void NO_INLINE Aggregator::convertToBlockImplFinal(
Method & method,
@ -1152,6 +1171,19 @@ void NO_INLINE Aggregator::convertToBlockImplFinal(
MutableColumns & key_columns,
MutableColumns & final_aggregate_columns) const
{
if constexpr (Method::low_cardinality_optimization)
{
if (data.hasNullKeyData())
{
key_columns[0]->insert(Field()); /// Null
for (size_t i = 0; i < params.aggregates_size; ++i)
aggregate_functions[i]->insertResultInto(
data.getNullKeyData() + offsets_of_aggregate_states[i],
*final_aggregate_columns[i]);
}
}
for (const auto & value : data)
{
method.insertKeyIntoColumns(value, key_columns, key_sizes);
@ -1172,6 +1204,17 @@ void NO_INLINE Aggregator::convertToBlockImplNotFinal(
MutableColumns & key_columns,
AggregateColumnsData & aggregate_columns) const
{
if constexpr (Method::low_cardinality_optimization)
{
if (data.hasNullKeyData())
{
key_columns[0]->insert(Field()); /// Null
for (size_t i = 0; i < params.aggregates_size; ++i)
aggregate_columns[i]->push_back(data.getNullKeyData() + offsets_of_aggregate_states[i]);
}
}
for (auto & value : data)
{
method.insertKeyIntoColumns(value, key_columns, key_sizes);
@ -1470,12 +1513,50 @@ BlocksList Aggregator::convertToBlocks(AggregatedDataVariants & data_variants, b
}
template <typename Method, typename Table>
void NO_INLINE Aggregator::mergeDataNullKey(
Table & table_dst,
Table & table_src,
Arena * arena) const
{
if constexpr (Method::low_cardinality_optimization)
{
if (table_src.hasNullKeyData())
{
if (!table_dst.hasNullKeyData())
{
table_dst.hasNullKeyData() = true;
table_dst.getNullKeyData() = table_src.getNullKeyData();
}
else
{
for (size_t i = 0; i < params.aggregates_size; ++i)
aggregate_functions[i]->merge(
table_dst.getNullKeyData() + offsets_of_aggregate_states[i],
table_src.getNullKeyData() + offsets_of_aggregate_states[i],
arena);
for (size_t i = 0; i < params.aggregates_size; ++i)
aggregate_functions[i]->destroy(
table_src.getNullKeyData() + offsets_of_aggregate_states[i]);
}
table_src.hasNullKeyData() = false;
table_src.getNullKeyData() = nullptr;
}
}
}
template <typename Method, typename Table>
void NO_INLINE Aggregator::mergeDataImpl(
Table & table_dst,
Table & table_src,
Arena * arena) const
{
if constexpr (Method::low_cardinality_optimization)
mergeDataNullKey<Method, Table>(table_dst, table_src, arena);
for (auto it = table_src.begin(), end = table_src.end(); it != end; ++it)
{
typename Table::iterator res_it;
@ -1513,6 +1594,10 @@ void NO_INLINE Aggregator::mergeDataNoMoreKeysImpl(
Table & table_src,
Arena * arena) const
{
/// Note : will create data for NULL key if not exist
if constexpr (Method::low_cardinality_optimization)
mergeDataNullKey<Method, Table>(table_dst, table_src, arena);
for (auto it = table_src.begin(), end = table_src.end(); it != end; ++it)
{
typename Table::iterator res_it = table_dst.find(it->first, it.getHash());
@ -1543,6 +1628,10 @@ void NO_INLINE Aggregator::mergeDataOnlyExistingKeysImpl(
Table & table_src,
Arena * arena) const
{
/// Note : will create data for NULL key if not exist
if constexpr (Method::low_cardinality_optimization)
mergeDataNullKey<Method, Table>(table_dst, table_src, arena);
for (auto it = table_src.begin(); it != table_src.end(); ++it)
{
decltype(it) res_it = table_dst.find(it->first, it.getHash());
@ -2341,6 +2430,15 @@ void NO_INLINE Aggregator::convertBlockToTwoLevelImpl(
/// For every row.
for (size_t i = 0; i < rows; ++i)
{
if constexpr (Method::low_cardinality_optimization)
{
if (state.isNullAt(i))
{
selector[i] = 0;
continue;
}
}
/// Obtain a key. Calculate bucket number from it.
typename Method::Key key = state.getKey(key_columns, params.keys_size, i, key_sizes, keys, *pool);

View File

@ -88,6 +88,56 @@ using AggregatedDataWithStringKeyHash64 = HashMapWithSavedHash<StringRef, Aggreg
using AggregatedDataWithKeys128Hash64 = HashMap<UInt128, AggregateDataPtr, UInt128Hash>;
using AggregatedDataWithKeys256Hash64 = HashMap<UInt256, AggregateDataPtr, UInt256Hash>;
template <typename Base>
struct AggregationDataWithNullKey : public Base
{
using Base::Base;
bool & hasNullKeyData() { return has_null_key_data; }
AggregateDataPtr & getNullKeyData() { return null_key_data; }
bool hasNullKeyData() const { return has_null_key_data; }
const AggregateDataPtr & getNullKeyData() const { return null_key_data; }
private:
bool has_null_key_data = false;
AggregateDataPtr null_key_data = nullptr;
};
template <typename Base>
struct AggregationDataWithNullKeyTwoLevel : public Base
{
using Base::Base;
using Base::impls;
template <typename Other>
explicit AggregationDataWithNullKeyTwoLevel(const Other & other) : Base(other)
{
impls[0].hasNullKeyData() = other.hasNullKeyData();
impls[0].getNullKeyData() = other.getNullKeyData();
}
bool & hasNullKeyData() { return impls[0].hasNullKeyData(); }
AggregateDataPtr & getNullKeyData() { return impls[0].getNullKeyData(); }
bool hasNullKeyData() const { return impls[0].hasNullKeyData(); }
const AggregateDataPtr & getNullKeyData() const { return impls[0].getNullKeyData(); }
};
template <typename ... Types>
using HashTableWithNullKey = AggregationDataWithNullKey<HashMapTable<Types ...>>;
using AggregatedDataWithNullableUInt8Key = AggregationDataWithNullKey<AggregatedDataWithUInt8Key>;
using AggregatedDataWithNullableUInt16Key = AggregationDataWithNullKey<AggregatedDataWithUInt16Key>;
using AggregatedDataWithNullableUInt64Key = AggregationDataWithNullKey<AggregatedDataWithUInt64Key>;
using AggregatedDataWithNullableStringKey = AggregationDataWithNullKey<AggregatedDataWithStringKey>;
using AggregatedDataWithNullableUInt64KeyTwoLevel = AggregationDataWithNullKeyTwoLevel<
TwoLevelHashMap<UInt64, AggregateDataPtr, HashCRC32<UInt64>,
TwoLevelHashTableGrower<>, HashTableAllocator, HashTableWithNullKey>>;
using AggregatedDataWithNullableStringKeyTwoLevel = AggregationDataWithNullKeyTwoLevel<
TwoLevelHashMapWithSavedHash<StringRef, AggregateDataPtr, DefaultHash<StringRef>,
TwoLevelHashTableGrower<>, HashTableAllocator, HashTableWithNullKey>>;
/// Cache which can be used by aggregations method's states. Object is shared in all threads.
struct AggregationStateCache
{
@ -403,8 +453,10 @@ struct AggregationMethodSingleLowCardinalityColumn : public SingleColumnMethod
ColumnPtr dictionary_holder;
/// Cache AggregateDataPtr for current column in order to decrease the number of hash table usages.
PaddedPODArray<AggregateDataPtr> aggregate_data;
PaddedPODArray<AggregateDataPtr> * aggregate_data_cache;
PaddedPODArray<AggregateDataPtr> aggregate_data_cache;
/// If initialized column is nullable.
bool is_nullable = false;
void init(ColumnRawPtrs &)
{
@ -429,7 +481,8 @@ struct AggregationMethodSingleLowCardinalityColumn : public SingleColumnMethod
+ demangle(typeid(cached_val).name()), ErrorCodes::LOGICAL_ERROR);
}
auto * dict = column->getDictionary().getNestedColumn().get();
auto * dict = column->getDictionary().getNestedNotNullableColumn().get();
is_nullable = column->getDictionary().nestedColumnIsNullable();
key = {dict};
bool is_shared_dict = column->isSharedDictionary();
@ -463,8 +516,7 @@ struct AggregationMethodSingleLowCardinalityColumn : public SingleColumnMethod
}
AggregateDataPtr default_data = nullptr;
aggregate_data.assign(key[0]->size(), default_data);
aggregate_data_cache = &aggregate_data;
aggregate_data_cache.assign(key[0]->size(), default_data);
size_of_index_type = column->getSizeOfIndexType();
positions = column->getIndexesPtr().get();
@ -507,10 +559,18 @@ struct AggregationMethodSingleLowCardinalityColumn : public SingleColumnMethod
Arena & pool)
{
size_t row = getIndexAt(i);
if ((*aggregate_data_cache)[row])
if (is_nullable && row == 0)
{
inserted = !data.hasNullKeyData();
data.hasNullKeyData() = true;
return &data.getNullKeyData();
}
if (aggregate_data_cache[row])
{
inserted = false;
return &(*aggregate_data_cache)[row];
return &aggregate_data_cache[row];
}
else
{
@ -527,23 +587,35 @@ struct AggregationMethodSingleLowCardinalityColumn : public SingleColumnMethod
if (inserted)
Base::onNewKey(*it, keys_size, keys, pool);
else
(*aggregate_data_cache)[row] = Base::getAggregateData(it->second);
aggregate_data_cache[row] = Base::getAggregateData(it->second);
return &Base::getAggregateData(it->second);
}
}
ALWAYS_INLINE bool isNullAt(size_t i)
{
if (!is_nullable)
return false;
return getIndexAt(i) == 0;
}
ALWAYS_INLINE void cacheAggregateData(size_t i, AggregateDataPtr data)
{
size_t row = getIndexAt(i);
(*aggregate_data_cache)[row] = data;
aggregate_data_cache[row] = data;
}
template <typename D>
ALWAYS_INLINE AggregateDataPtr * findFromRow(D & data, size_t i)
{
size_t row = getIndexAt(i);
if (!(*aggregate_data_cache)[row])
if (is_nullable && row == 0)
return data.hasNullKeyData() ? &data.getNullKeyData() : nullptr;
if (!aggregate_data_cache[row])
{
ColumnRawPtrs key_columns;
Sizes key_sizes;
@ -558,9 +630,9 @@ struct AggregationMethodSingleLowCardinalityColumn : public SingleColumnMethod
it = data.find(key);
if (it != data.end())
(*aggregate_data_cache)[row] = Base::getAggregateData(it->second);
aggregate_data_cache[row] = Base::getAggregateData(it->second);
}
return &(*aggregate_data_cache)[row];
return &aggregate_data_cache[row];
}
};
@ -971,17 +1043,17 @@ struct AggregatedDataVariants : private boost::noncopyable
std::unique_ptr<AggregationMethodKeysFixed<AggregatedDataWithKeys256TwoLevel, true>> nullable_keys256_two_level;
/// Support for low cardinality.
std::unique_ptr<AggregationMethodSingleLowCardinalityColumn<AggregationMethodOneNumber<UInt8, AggregatedDataWithUInt8Key>>> low_cardinality_key8;
std::unique_ptr<AggregationMethodSingleLowCardinalityColumn<AggregationMethodOneNumber<UInt16, AggregatedDataWithUInt16Key>>> low_cardinality_key16;
std::unique_ptr<AggregationMethodSingleLowCardinalityColumn<AggregationMethodOneNumber<UInt32, AggregatedDataWithUInt64Key>>> low_cardinality_key32;
std::unique_ptr<AggregationMethodSingleLowCardinalityColumn<AggregationMethodOneNumber<UInt64, AggregatedDataWithUInt64Key>>> low_cardinality_key64;
std::unique_ptr<AggregationMethodSingleLowCardinalityColumn<AggregationMethodString<AggregatedDataWithStringKey>>> low_cardinality_key_string;
std::unique_ptr<AggregationMethodSingleLowCardinalityColumn<AggregationMethodFixedString<AggregatedDataWithStringKey>>> low_cardinality_key_fixed_string;
std::unique_ptr<AggregationMethodSingleLowCardinalityColumn<AggregationMethodOneNumber<UInt8, AggregatedDataWithNullableUInt8Key>>> low_cardinality_key8;
std::unique_ptr<AggregationMethodSingleLowCardinalityColumn<AggregationMethodOneNumber<UInt16, AggregatedDataWithNullableUInt16Key>>> low_cardinality_key16;
std::unique_ptr<AggregationMethodSingleLowCardinalityColumn<AggregationMethodOneNumber<UInt32, AggregatedDataWithNullableUInt64Key>>> low_cardinality_key32;
std::unique_ptr<AggregationMethodSingleLowCardinalityColumn<AggregationMethodOneNumber<UInt64, AggregatedDataWithNullableUInt64Key>>> low_cardinality_key64;
std::unique_ptr<AggregationMethodSingleLowCardinalityColumn<AggregationMethodString<AggregatedDataWithNullableStringKey>>> low_cardinality_key_string;
std::unique_ptr<AggregationMethodSingleLowCardinalityColumn<AggregationMethodFixedString<AggregatedDataWithNullableStringKey>>> low_cardinality_key_fixed_string;
std::unique_ptr<AggregationMethodSingleLowCardinalityColumn<AggregationMethodOneNumber<UInt32, AggregatedDataWithUInt64KeyTwoLevel>>> low_cardinality_key32_two_level;
std::unique_ptr<AggregationMethodSingleLowCardinalityColumn<AggregationMethodOneNumber<UInt64, AggregatedDataWithUInt64KeyTwoLevel>>> low_cardinality_key64_two_level;
std::unique_ptr<AggregationMethodSingleLowCardinalityColumn<AggregationMethodString<AggregatedDataWithStringKeyTwoLevel>>> low_cardinality_key_string_two_level;
std::unique_ptr<AggregationMethodSingleLowCardinalityColumn<AggregationMethodFixedString<AggregatedDataWithStringKeyTwoLevel>>> low_cardinality_key_fixed_string_two_level;
std::unique_ptr<AggregationMethodSingleLowCardinalityColumn<AggregationMethodOneNumber<UInt32, AggregatedDataWithNullableUInt64KeyTwoLevel>>> low_cardinality_key32_two_level;
std::unique_ptr<AggregationMethodSingleLowCardinalityColumn<AggregationMethodOneNumber<UInt64, AggregatedDataWithNullableUInt64KeyTwoLevel>>> low_cardinality_key64_two_level;
std::unique_ptr<AggregationMethodSingleLowCardinalityColumn<AggregationMethodString<AggregatedDataWithNullableStringKeyTwoLevel>>> low_cardinality_key_string_two_level;
std::unique_ptr<AggregationMethodSingleLowCardinalityColumn<AggregationMethodFixedString<AggregatedDataWithNullableStringKeyTwoLevel>>> low_cardinality_key_fixed_string_two_level;
std::unique_ptr<AggregationMethodKeysFixed<AggregatedDataWithKeys128, false, true>> low_cardinality_keys128;
std::unique_ptr<AggregationMethodKeysFixed<AggregatedDataWithKeys256, false, true>> low_cardinality_keys256;
@ -1580,6 +1652,13 @@ public:
Arena * arena) const;
protected:
/// Merge NULL key data from hash table `src` into `dst`.
template <typename Method, typename Table>
void mergeDataNullKey(
Table & table_dst,
Table & table_src,
Arena * arena) const;
/// Merge data from hash table `src` into `dst`.
template <typename Method, typename Table>
void mergeDataImpl(

View File

@ -590,7 +590,7 @@ void ExpressionActions::checkLimits(Block & block) const
{
std::stringstream list_of_non_const_columns;
for (size_t i = 0, size = block.columns(); i < size; ++i)
if (!block.safeGetByPosition(i).column->isColumnConst())
if (block.safeGetByPosition(i).column && !block.safeGetByPosition(i).column->isColumnConst())
list_of_non_const_columns << "\n" << block.safeGetByPosition(i).name;
throw Exception("Too many temporary non-const columns:" + list_of_non_const_columns.str()

View File

@ -1,6 +1,7 @@
#include <Interpreters/InterpreterAlterQuery.h>
#include <Interpreters/DDLWorker.h>
#include <Interpreters/MutationsInterpreter.h>
#include <Interpreters/AddDefaultDatabaseVisitor.h>
#include <Parsers/ASTAlterQuery.h>
#include <Common/typeid_cast.h>
@ -33,6 +34,12 @@ BlockIO InterpreterAlterQuery::execute()
String database_name = alter.database.empty() ? context.getCurrentDatabase() : alter.database;
StoragePtr table = context.getTable(database_name, table_name);
/// Add default database to table identifiers that we can encounter in e.g. default expressions,
/// mutation expression, etc.
AddDefaultDatabaseVisitor visitor(database_name);
ASTPtr command_list_ptr = alter.command_list->ptr();
visitor.visit(command_list_ptr);
AlterCommands alter_commands;
PartitionCommands partition_commands;
MutationCommands mutation_commands;

View File

@ -743,9 +743,9 @@ void InterpreterSelectQuery::executeFetchColumns(
}
/// We will create an expression to return all the requested columns, with the calculation of the required ALIAS columns.
auto required_columns_expr_list = std::make_shared<ASTExpressionList>();
ASTPtr required_columns_expr_list = std::make_shared<ASTExpressionList>();
/// Separate expression for columns used in prewhere.
auto required_prewhere_columns_expr_list = std::make_shared<ASTExpressionList>();
ASTPtr required_prewhere_columns_expr_list = std::make_shared<ASTExpressionList>();
for (const auto & column : required_columns)
{
@ -823,8 +823,10 @@ void InterpreterSelectQuery::executeFetchColumns(
}
prewhere_info->prewhere_actions = std::move(new_actions);
auto source_columns = storage->getColumns().getAllPhysical();
auto analyzed_result = SyntaxAnalyzer(context, {}).analyze(required_prewhere_columns_expr_list, source_columns);
prewhere_info->alias_actions =
ExpressionAnalyzer(required_prewhere_columns_expr_list, syntax_analyzer_result, context)
ExpressionAnalyzer(required_prewhere_columns_expr_list, analyzed_result, context)
.getActions(true, false);
/// Add columns required by alias actions.

View File

@ -15,6 +15,7 @@
#include <Interpreters/createBlockSelector.h>
#include <DataTypes/DataTypesNumber.h>
#include <DataTypes/DataTypeLowCardinality.h>
#include <Common/setThreadName.h>
#include <Common/ClickHouseRevision.h>
#include <Common/CurrentMetrics.h>
@ -406,9 +407,13 @@ IColumn::Selector DistributedBlockOutputStream::createSelector(const Block & sou
const auto & key_column = current_block_with_sharding_key_expr.getByName(storage.getShardingKeyColumnName());
const auto & slot_to_shard = cluster->getSlotToShard();
// If key_column.type is DataTypeLowCardinality, do shard according to its dictionaryType
#define CREATE_FOR_TYPE(TYPE) \
if (typeid_cast<const DataType ## TYPE *>(key_column.type.get())) \
return createBlockSelector<TYPE>(*key_column.column, slot_to_shard);
return createBlockSelector<TYPE>(*key_column.column, slot_to_shard); \
else if (auto * type_low_cardinality = typeid_cast<const DataTypeLowCardinality *>(key_column.type.get())) \
if (typeid_cast<const DataType ## TYPE *>(type_low_cardinality->getDictionaryType().get())) \
return createBlockSelector<TYPE>(*key_column.column->convertToFullColumnIfLowCardinality(), slot_to_shard);
CREATE_FOR_TYPE(UInt8)
CREATE_FOR_TYPE(UInt16)

View File

@ -18,7 +18,7 @@ namespace DB
namespace ErrorCodes
{
extern const int TOO_LESS_LIVE_REPLICAS;
extern const int TOO_FEW_LIVE_REPLICAS;
extern const int UNSATISFIED_QUORUM_FOR_PREVIOUS_WRITE;
extern const int CHECKSUM_DOESNT_MATCH;
extern const int UNEXPECTED_ZOOKEEPER_ERROR;
@ -76,7 +76,7 @@ void ReplicatedMergeTreeBlockOutputStream::checkQuorumPrecondition(zkutil::ZooKe
if (leader_election_stat.numChildren < static_cast<int32_t>(quorum))
throw Exception("Number of alive replicas ("
+ toString(leader_election_stat.numChildren) + ") is less than requested quorum (" + toString(quorum) + ").",
ErrorCodes::TOO_LESS_LIVE_REPLICAS);
ErrorCodes::TOO_FEW_LIVE_REPLICAS);
/** Is there a quorum for the last part for which a quorum is needed?
* Write of all the parts with the included quorum is linearly ordered.

View File

@ -0,0 +1,179 @@
#include <Common/config.h>
#if USE_HDFS
#include <Storages/StorageFactory.h>
#include <Storages/StorageHDFS.h>
#include <Interpreters/Context.h>
#include <Interpreters/evaluateConstantExpression.h>
#include <Parsers/ASTIdentifier.h>
#include <Parsers/ASTLiteral.h>
#include <IO/ReadBufferFromHDFS.h>
#include <Formats/FormatFactory.h>
#include <DataStreams/IBlockOutputStream.h>
#include <DataStreams/UnionBlockInputStream.h>
#include <DataStreams/IProfilingBlockInputStream.h>
#include <DataStreams/OwningBlockInputStream.h>
#include <Poco/Path.h>
#include <TableFunctions/parseRemoteDescription.h>
#include <Common/typeid_cast.h>
namespace DB
{
namespace ErrorCodes
{
extern const int NUMBER_OF_ARGUMENTS_DOESNT_MATCH;
extern const int NOT_IMPLEMENTED;
extern const int BAD_ARGUMENTS;
}
StorageHDFS::StorageHDFS(const String & uri_,
const std::string & table_name_,
const String & format_name_,
const ColumnsDescription & columns_,
Context &)
: IStorage(columns_), uri(uri_), format_name(format_name_), table_name(table_name_)
{
}
namespace
{
class StorageHDFSBlockInputStream : public IProfilingBlockInputStream
{
public:
StorageHDFSBlockInputStream(const String & uri,
const String & format,
const String & name_,
const Block & sample_block,
const Context & context,
size_t max_block_size)
: name(name_)
{
// Assume no query and fragment in uri, todo, add sanity check
String fuzzyFileNames;
String uriPrefix = uri.substr(0, uri.find_last_of('/'));
if (uriPrefix.length() == uri.length())
{
fuzzyFileNames = uri;
uriPrefix.clear();
}
else
{
uriPrefix += "/";
fuzzyFileNames = uri.substr(uriPrefix.length());
}
std::vector<String> fuzzyNameList = parseRemoteDescription(fuzzyFileNames, 0, fuzzyFileNames.length(), ',' , 100/* hard coded max files */);
BlockInputStreams inputs;
for (auto & name: fuzzyNameList)
{
std::unique_ptr<ReadBuffer> read_buf = std::make_unique<ReadBufferFromHDFS>(uriPrefix + name);
inputs.emplace_back(
std::make_shared<OwningBlockInputStream<ReadBuffer>>(
FormatFactory::instance().getInput(format, *read_buf, sample_block, context, max_block_size),
std::move(read_buf)));
}
if (inputs.size() == 0)
throw Exception("StorageHDFS inputs interpreter error", ErrorCodes::BAD_ARGUMENTS);
if (inputs.size() == 1)
{
reader = inputs[0];
}
else
{
reader = std::make_shared<UnionBlockInputStream>(inputs, nullptr, context.getSettingsRef().max_distributed_connections);
}
}
String getName() const override
{
return name;
}
Block readImpl() override
{
return reader->read();
}
Block getHeader() const override
{
return reader->getHeader();
}
void readPrefixImpl() override
{
reader->readPrefix();
}
void readSuffixImpl() override
{
auto explicitReader = dynamic_cast<UnionBlockInputStream *>(reader.get());
if (explicitReader) explicitReader->cancel(false); // skip Union read suffix assertion
reader->readSuffix();
}
private:
String name;
BlockInputStreamPtr reader;
};
}
BlockInputStreams StorageHDFS::read(
const Names & /*column_names*/,
const SelectQueryInfo & /*query_info*/,
const Context & context,
QueryProcessingStage::Enum /*processed_stage*/,
size_t max_block_size,
unsigned /*num_streams*/)
{
return {std::make_shared<StorageHDFSBlockInputStream>(
uri,
format_name,
getName(),
getSampleBlock(),
context,
max_block_size)};
}
void StorageHDFS::rename(const String & /*new_path_to_db*/, const String & /*new_database_name*/, const String & /*new_table_name*/) {}
BlockOutputStreamPtr StorageHDFS::write(const ASTPtr & /*query*/, const Settings & /*settings*/)
{
throw Exception("StorageHDFS write is not supported yet", ErrorCodes::NOT_IMPLEMENTED);
return {};
}
void registerStorageHDFS(StorageFactory & factory)
{
factory.registerStorage("HDFS", [](const StorageFactory::Arguments & args)
{
ASTs & engine_args = args.engine_args;
if (!(engine_args.size() == 1 || engine_args.size() == 2))
throw Exception(
"Storage HDFS requires exactly 2 arguments: url and name of used format.", ErrorCodes::NUMBER_OF_ARGUMENTS_DOESNT_MATCH);
engine_args[0] = evaluateConstantExpressionOrIdentifierAsLiteral(engine_args[0], args.local_context);
String url = static_cast<const ASTLiteral &>(*engine_args[0]).value.safeGet<String>();
engine_args[1] = evaluateConstantExpressionOrIdentifierAsLiteral(engine_args[1], args.local_context);
String format_name = static_cast<const ASTLiteral &>(*engine_args[1]).value.safeGet<String>();
return StorageHDFS::create(url, args.table_name, format_name, args.columns, args.context);
});
}
}
#endif

View File

@ -0,0 +1,56 @@
#pragma once
#include <Common/config.h>
#if USE_HDFS
#include <Storages/IStorage.h>
#include <Poco/URI.h>
#include <common/logger_useful.h>
#include <ext/shared_ptr_helper.h>
namespace DB
{
/**
* This class represents table engine for external hdfs files.
* Read method is supported for now.
*/
class StorageHDFS : public ext::shared_ptr_helper<StorageHDFS>, public IStorage
{
public:
String getName() const override
{
return "HDFS";
}
String getTableName() const override
{
return table_name;
}
BlockInputStreams read(const Names & column_names,
const SelectQueryInfo & query_info,
const Context & context,
QueryProcessingStage::Enum processed_stage,
size_t max_block_size,
unsigned num_streams) override;
BlockOutputStreamPtr write(const ASTPtr & query, const Settings & settings) override;
void rename(const String & new_path_to_db, const String & new_database_name, const String & new_table_name) override;
protected:
StorageHDFS(const String & uri_,
const String & table_name_,
const String & format_name_,
const ColumnsDescription & columns_,
Context & context_);
private:
String uri;
String format_name;
String table_name;
Logger * log = &Logger::get("StorageHDFS");
};
}
#endif

View File

@ -24,6 +24,10 @@ void registerStorageJoin(StorageFactory & factory);
void registerStorageView(StorageFactory & factory);
void registerStorageMaterializedView(StorageFactory & factory);
#if USE_HDFS
void registerStorageHDFS(StorageFactory & factory);
#endif
#if USE_POCO_SQLODBC || USE_POCO_DATAODBC
void registerStorageODBC(StorageFactory & factory);
#endif
@ -60,6 +64,10 @@ void registerStorages()
registerStorageView(factory);
registerStorageMaterializedView(factory);
#if USE_HDFS
registerStorageHDFS(factory);
#endif
#if USE_POCO_SQLODBC || USE_POCO_DATAODBC
registerStorageODBC(factory);
#endif

View File

@ -3,7 +3,6 @@
#include <string>
#include <memory>
namespace DB
{

View File

@ -0,0 +1,25 @@
#include <Common/config.h>
#if USE_HDFS
#include <Storages/StorageHDFS.h>
#include <TableFunctions/TableFunctionFactory.h>
#include <TableFunctions/TableFunctionHDFS.h>
namespace DB
{
StoragePtr TableFunctionHDFS::getStorage(
const String & source, const String & format, const Block & sample_block, Context & global_context) const
{
return StorageHDFS::create(source,
getName(),
format,
ColumnsDescription{sample_block.getNamesAndTypesList()},
global_context);
}
void registerTableFunctionHDFS(TableFunctionFactory & factory)
{
factory.registerFunction<TableFunctionHDFS>();
}
}
#endif

View File

@ -0,0 +1,32 @@
#pragma once
#include <Common/config.h>
#if USE_HDFS
#include <TableFunctions/ITableFunctionFileLike.h>
#include <Interpreters/Context.h>
#include <Core/Block.h>
namespace DB
{
/* hdfs(name_node_ip:name_node_port, format, structure) - creates a temporary storage from hdfs file
*
*/
class TableFunctionHDFS : public ITableFunctionFileLike
{
public:
static constexpr auto name = "hdfs";
std::string getName() const override
{
return name;
}
private:
StoragePtr getStorage(
const String & source, const String & format, const Block & sample_block, Context & global_context) const override;
};
}
#endif

View File

@ -11,6 +11,7 @@
#include <TableFunctions/TableFunctionRemote.h>
#include <TableFunctions/TableFunctionFactory.h>
#include <TableFunctions/parseRemoteDescription.h>
namespace DB
@ -22,165 +23,6 @@ namespace ErrorCodes
extern const int BAD_ARGUMENTS;
}
/// The Cartesian product of two sets of rows, the result is written in place of the first argument
static void append(std::vector<String> & to, const std::vector<String> & what, size_t max_addresses)
{
if (what.empty())
return;
if (to.empty())
{
to = what;
return;
}
if (what.size() * to.size() > max_addresses)
throw Exception("Table function 'remote': first argument generates too many result addresses",
ErrorCodes::BAD_ARGUMENTS);
std::vector<String> res;
for (size_t i = 0; i < to.size(); ++i)
for (size_t j = 0; j < what.size(); ++j)
res.push_back(to[i] + what[j]);
to.swap(res);
}
/// Parse number from substring
static bool parseNumber(const String & description, size_t l, size_t r, size_t & res)
{
res = 0;
for (size_t pos = l; pos < r; pos ++)
{
if (!isNumericASCII(description[pos]))
return false;
res = res * 10 + description[pos] - '0';
if (res > 1e15)
return false;
}
return true;
}
/* Parse a string that generates shards and replicas. Separator - one of two characters | or ,
* depending on whether shards or replicas are generated.
* For example:
* host1,host2,... - generates set of shards from host1, host2, ...
* host1|host2|... - generates set of replicas from host1, host2, ...
* abc{8..10}def - generates set of shards abc8def, abc9def, abc10def.
* abc{08..10}def - generates set of shards abc08def, abc09def, abc10def.
* abc{x,yy,z}def - generates set of shards abcxdef, abcyydef, abczdef.
* abc{x|yy|z} def - generates set of replicas abcxdef, abcyydef, abczdef.
* abc{1..9}de{f,g,h} - is a direct product, 27 shards.
* abc{1..9}de{0|1} - is a direct product, 9 shards, in each 2 replicas.
*/
static std::vector<String> parseDescription(const String & description, size_t l, size_t r, char separator, size_t max_addresses)
{
std::vector<String> res;
std::vector<String> cur;
/// An empty substring means a set of an empty string
if (l >= r)
{
res.push_back("");
return res;
}
for (size_t i = l; i < r; ++i)
{
/// Either the numeric interval (8..10) or equivalent expression in brackets
if (description[i] == '{')
{
int cnt = 1;
int last_dot = -1; /// The rightmost pair of points, remember the index of the right of the two
size_t m;
std::vector<String> buffer;
bool have_splitter = false;
/// Look for the corresponding closing bracket
for (m = i + 1; m < r; ++m)
{
if (description[m] == '{') ++cnt;
if (description[m] == '}') --cnt;
if (description[m] == '.' && description[m-1] == '.') last_dot = m;
if (description[m] == separator) have_splitter = true;
if (cnt == 0) break;
}
if (cnt != 0)
throw Exception("Table function 'remote': incorrect brace sequence in first argument",
ErrorCodes::BAD_ARGUMENTS);
/// The presence of a dot - numeric interval
if (last_dot != -1)
{
size_t left, right;
if (description[last_dot - 1] != '.')
throw Exception("Table function 'remote': incorrect argument in braces (only one dot): " + description.substr(i, m - i + 1),
ErrorCodes::BAD_ARGUMENTS);
if (!parseNumber(description, i + 1, last_dot - 1, left))
throw Exception("Table function 'remote': incorrect argument in braces (Incorrect left number): "
+ description.substr(i, m - i + 1),
ErrorCodes::BAD_ARGUMENTS);
if (!parseNumber(description, last_dot + 1, m, right))
throw Exception("Table function 'remote': incorrect argument in braces (Incorrect right number): "
+ description.substr(i, m - i + 1),
ErrorCodes::BAD_ARGUMENTS);
if (left > right)
throw Exception("Table function 'remote': incorrect argument in braces (left number is greater then right): "
+ description.substr(i, m - i + 1),
ErrorCodes::BAD_ARGUMENTS);
if (right - left + 1 > max_addresses)
throw Exception("Table function 'remote': first argument generates too many result addresses",
ErrorCodes::BAD_ARGUMENTS);
bool add_leading_zeroes = false;
size_t len = last_dot - 1 - (i + 1);
/// If the left and right borders have equal numbers, then you must add leading zeros.
if (last_dot - 1 - (i + 1) == m - (last_dot + 1))
add_leading_zeroes = true;
for (size_t id = left; id <= right; ++id)
{
String cur = toString<UInt64>(id);
if (add_leading_zeroes)
{
while (cur.size() < len)
cur = "0" + cur;
}
buffer.push_back(cur);
}
}
else if (have_splitter) /// If there is a current delimiter inside, then generate a set of resulting rows
buffer = parseDescription(description, i + 1, m, separator, max_addresses);
else /// Otherwise just copy, spawn will occur when you call with the correct delimiter
buffer.push_back(description.substr(i, m - i + 1));
/// Add all possible received extensions to the current set of lines
append(cur, buffer, max_addresses);
i = m;
}
else if (description[i] == separator)
{
/// If the delimiter, then add found rows
res.insert(res.end(), cur.begin(), cur.end());
cur.clear();
}
else
{
/// Otherwise, simply append the character to current lines
std::vector<String> buffer;
buffer.push_back(description.substr(i, 1));
append(cur, buffer, max_addresses);
}
}
res.insert(res.end(), cur.begin(), cur.end());
if (res.size() > max_addresses)
throw Exception("Table function 'remote': first argument generates too many result addresses",
ErrorCodes::BAD_ARGUMENTS);
return res;
}
StoragePtr TableFunctionRemote::executeImpl(const ASTPtr & ast_function, const Context & context) const
{
ASTs & args_func = typeid_cast<ASTFunction &>(*ast_function).children;
@ -304,11 +146,11 @@ StoragePtr TableFunctionRemote::executeImpl(const ASTPtr & ast_function, const C
{
/// Create new cluster from the scratch
size_t max_addresses = context.getSettingsRef().table_function_remote_max_addresses;
std::vector<String> shards = parseDescription(cluster_description, 0, cluster_description.size(), ',', max_addresses);
std::vector<String> shards = parseRemoteDescription(cluster_description, 0, cluster_description.size(), ',', max_addresses);
std::vector<std::vector<String>> names;
for (size_t i = 0; i < shards.size(); ++i)
names.push_back(parseDescription(shards[i], 0, shards[i].size(), '|', max_addresses));
names.push_back(parseRemoteDescription(shards[i], 0, shards[i].size(), '|', max_addresses));
if (names.empty())
throw Exception("Shard list is empty after parsing first argument", ErrorCodes::BAD_ARGUMENTS);

View File

@ -0,0 +1,171 @@
#include <TableFunctions/parseRemoteDescription.h>
#include <Common/Exception.h>
#include <IO/WriteHelpers.h>
namespace DB
{
namespace ErrorCodes
{
extern const int NUMBER_OF_ARGUMENTS_DOESNT_MATCH;
extern const int BAD_ARGUMENTS;
}
/// The Cartesian product of two sets of rows, the result is written in place of the first argument
static void append(std::vector<String> & to, const std::vector<String> & what, size_t max_addresses)
{
if (what.empty())
return;
if (to.empty())
{
to = what;
return;
}
if (what.size() * to.size() > max_addresses)
throw Exception("Table function 'remote': first argument generates too many result addresses",
ErrorCodes::BAD_ARGUMENTS);
std::vector<String> res;
for (size_t i = 0; i < to.size(); ++i)
for (size_t j = 0; j < what.size(); ++j)
res.push_back(to[i] + what[j]);
to.swap(res);
}
/// Parse number from substring
static bool parseNumber(const String & description, size_t l, size_t r, size_t & res)
{
res = 0;
for (size_t pos = l; pos < r; pos ++)
{
if (!isNumericASCII(description[pos]))
return false;
res = res * 10 + description[pos] - '0';
if (res > 1e15)
return false;
}
return true;
}
/* Parse a string that generates shards and replicas. Separator - one of two characters | or ,
* depending on whether shards or replicas are generated.
* For example:
* host1,host2,... - generates set of shards from host1, host2, ...
* host1|host2|... - generates set of replicas from host1, host2, ...
* abc{8..10}def - generates set of shards abc8def, abc9def, abc10def.
* abc{08..10}def - generates set of shards abc08def, abc09def, abc10def.
* abc{x,yy,z}def - generates set of shards abcxdef, abcyydef, abczdef.
* abc{x|yy|z} def - generates set of replicas abcxdef, abcyydef, abczdef.
* abc{1..9}de{f,g,h} - is a direct product, 27 shards.
* abc{1..9}de{0|1} - is a direct product, 9 shards, in each 2 replicas.
*/
std::vector<String> parseRemoteDescription(const String & description, size_t l, size_t r, char separator, size_t max_addresses)
{
std::vector<String> res;
std::vector<String> cur;
/// An empty substring means a set of an empty string
if (l >= r)
{
res.push_back("");
return res;
}
for (size_t i = l; i < r; ++i)
{
/// Either the numeric interval (8..10) or equivalent expression in brackets
if (description[i] == '{')
{
int cnt = 1;
int last_dot = -1; /// The rightmost pair of points, remember the index of the right of the two
size_t m;
std::vector<String> buffer;
bool have_splitter = false;
/// Look for the corresponding closing bracket
for (m = i + 1; m < r; ++m)
{
if (description[m] == '{') ++cnt;
if (description[m] == '}') --cnt;
if (description[m] == '.' && description[m-1] == '.') last_dot = m;
if (description[m] == separator) have_splitter = true;
if (cnt == 0) break;
}
if (cnt != 0)
throw Exception("Table function 'remote': incorrect brace sequence in first argument",
ErrorCodes::BAD_ARGUMENTS);
/// The presence of a dot - numeric interval
if (last_dot != -1)
{
size_t left, right;
if (description[last_dot - 1] != '.')
throw Exception("Table function 'remote': incorrect argument in braces (only one dot): " + description.substr(i, m - i + 1),
ErrorCodes::BAD_ARGUMENTS);
if (!parseNumber(description, i + 1, last_dot - 1, left))
throw Exception("Table function 'remote': incorrect argument in braces (Incorrect left number): "
+ description.substr(i, m - i + 1),
ErrorCodes::BAD_ARGUMENTS);
if (!parseNumber(description, last_dot + 1, m, right))
throw Exception("Table function 'remote': incorrect argument in braces (Incorrect right number): "
+ description.substr(i, m - i + 1),
ErrorCodes::BAD_ARGUMENTS);
if (left > right)
throw Exception("Table function 'remote': incorrect argument in braces (left number is greater then right): "
+ description.substr(i, m - i + 1),
ErrorCodes::BAD_ARGUMENTS);
if (right - left + 1 > max_addresses)
throw Exception("Table function 'remote': first argument generates too many result addresses",
ErrorCodes::BAD_ARGUMENTS);
bool add_leading_zeroes = false;
size_t len = last_dot - 1 - (i + 1);
/// If the left and right borders have equal numbers, then you must add leading zeros.
if (last_dot - 1 - (i + 1) == m - (last_dot + 1))
add_leading_zeroes = true;
for (size_t id = left; id <= right; ++id)
{
String cur = toString<UInt64>(id);
if (add_leading_zeroes)
{
while (cur.size() < len)
cur = "0" + cur;
}
buffer.push_back(cur);
}
}
else if (have_splitter) /// If there is a current delimiter inside, then generate a set of resulting rows
buffer = parseRemoteDescription(description, i + 1, m, separator, max_addresses);
else /// Otherwise just copy, spawn will occur when you call with the correct delimiter
buffer.push_back(description.substr(i, m - i + 1));
/// Add all possible received extensions to the current set of lines
append(cur, buffer, max_addresses);
i = m;
}
else if (description[i] == separator)
{
/// If the delimiter, then add found rows
res.insert(res.end(), cur.begin(), cur.end());
cur.clear();
}
else
{
/// Otherwise, simply append the character to current lines
std::vector<String> buffer;
buffer.push_back(description.substr(i, 1));
append(cur, buffer, max_addresses);
}
}
res.insert(res.end(), cur.begin(), cur.end());
if (res.size() > max_addresses)
throw Exception("Table function 'remote': first argument generates too many result addresses",
ErrorCodes::BAD_ARGUMENTS);
return res;
}
}

View File

@ -0,0 +1,20 @@
#pragma once
#include <Core/Types.h>
#include <vector>
namespace DB
{
/* Parse a string that generates shards and replicas. Separator - one of two characters | or ,
* depending on whether shards or replicas are generated.
* For example:
* host1,host2,... - generates set of shards from host1, host2, ...
* host1|host2|... - generates set of replicas from host1, host2, ...
* abc{8..10}def - generates set of shards abc8def, abc9def, abc10def.
* abc{08..10}def - generates set of shards abc08def, abc09def, abc10def.
* abc{x,yy,z}def - generates set of shards abcxdef, abcyydef, abczdef.
* abc{x|yy|z} def - generates set of replicas abcxdef, abcyydef, abczdef.
* abc{1..9}de{f,g,h} - is a direct product, 27 shards.
* abc{1..9}de{0|1} - is a direct product, 9 shards, in each 2 replicas.
*/
std::vector<String> parseRemoteDescription(const String & description, size_t l, size_t r, char separator, size_t max_addresses);
}

View File

@ -14,6 +14,10 @@ void registerTableFunctionCatBoostPool(TableFunctionFactory & factory);
void registerTableFunctionFile(TableFunctionFactory & factory);
void registerTableFunctionURL(TableFunctionFactory & factory);
#if USE_HDFS
void registerTableFunctionHDFS(TableFunctionFactory & factory);
#endif
#if USE_POCO_SQLODBC || USE_POCO_DATAODBC
void registerTableFunctionODBC(TableFunctionFactory & factory);
#endif
@ -37,6 +41,10 @@ void registerTableFunctions()
registerTableFunctionFile(factory);
registerTableFunctionURL(factory);
#if USE_HDFS
registerTableFunctionHDFS(factory);
#endif
#if USE_POCO_SQLODBC || USE_POCO_DATAODBC
registerTableFunctionODBC(factory);
#endif

View File

@ -33,7 +33,12 @@ set the following environment variables:
### Running with runner script
The only requirement is fresh docker.
The only requirement is fresh docker configured docker.
Notes:
* If you want to run integration tests without `sudo` you have to add your user to docker group `sudo usermod -aG docker $USER`. [More information](https://docs.docker.com/install/linux/linux-postinstall/) about docker configuration.
* If you already had run these tests without `./runner` script you may have problems with pytest cache. It can be removed with `rm -r __pycache__ .pytest_cache/`.
* Some tests maybe require a lot of resources (CPU, RAM, etc.). Better not try large tests like `test_cluster_copier` or `test_distributed_ddl*` on your notebook.
You can run tests via `./runner` script and pass pytest arguments as last arg:
```

View File

@ -14,11 +14,13 @@ import xml.dom.minidom
from kazoo.client import KazooClient
from kazoo.exceptions import KazooException
import psycopg2
import requests
import docker
from docker.errors import ContainerError
from .client import Client, CommandRequest
from .hdfs_api import HDFSApi
HELPERS_DIR = p.dirname(__file__)
@ -83,6 +85,7 @@ class ClickHouseCluster:
self.with_postgres = False
self.with_kafka = False
self.with_odbc_drivers = False
self.with_hdfs = False
self.docker_client = None
self.is_up = False
@ -94,7 +97,7 @@ class ClickHouseCluster:
cmd += " client"
return cmd
def add_instance(self, name, config_dir=None, main_configs=[], user_configs=[], macros={}, with_zookeeper=False, with_mysql=False, with_kafka=False, clickhouse_path_dir=None, with_odbc_drivers=False, with_postgres=False, hostname=None, env_variables={}, image="yandex/clickhouse-integration-test", stay_alive=False):
def add_instance(self, name, config_dir=None, main_configs=[], user_configs=[], macros={}, with_zookeeper=False, with_mysql=False, with_kafka=False, clickhouse_path_dir=None, with_odbc_drivers=False, with_postgres=False, with_hdfs=False, hostname=None, env_variables={}, image="yandex/clickhouse-integration-test", stay_alive=False):
"""Add an instance to the cluster.
name - the name of the instance directory and the value of the 'instance' macro in ClickHouse.
@ -148,13 +151,19 @@ class ClickHouseCluster:
self.base_postgres_cmd = ['docker-compose', '--project-directory', self.base_dir, '--project-name',
self.project_name, '--file', p.join(HELPERS_DIR, 'docker_compose_postgres.yml')]
if with_kafka and not self.with_kafka:
self.with_kafka = True
self.base_cmd.extend(['--file', p.join(HELPERS_DIR, 'docker_compose_kafka.yml')])
self.base_kafka_cmd = ['docker-compose', '--project-directory', self.base_dir, '--project-name',
self.project_name, '--file', p.join(HELPERS_DIR, 'docker_compose_kafka.yml')]
if with_hdfs and not self.with_hdfs:
self.with_hdfs = True
self.base_cmd.extend(['--file', p.join(HELPERS_DIR, 'docker_compose_hdfs.yml')])
self.base_hdfs_cmd = ['docker-compose', '--project-directory', self.base_dir, '--project-name',
self.project_name, '--file', p.join(HELPERS_DIR, 'docker_compose_hdfs.yml')]
return instance
@ -212,6 +221,20 @@ class ClickHouseCluster:
raise Exception("Cannot wait ZooKeeper container")
def wait_hdfs_to_start(self, timeout=60):
hdfs_api = HDFSApi("root")
start = time.time()
while time.time() - start < timeout:
try:
hdfs_api.write_data("/somefilewithrandomname222", "1")
print "Connected to HDFS and SafeMode disabled! "
return
except Exception as ex:
print "Can't connect to HDFS " + str(ex)
time.sleep(1)
raise Exception("Can't wait HDFS to start")
def start(self, destroy_dirs=True):
if self.is_up:
return
@ -250,7 +273,11 @@ class ClickHouseCluster:
subprocess_check_call(self.base_kafka_cmd + ['up', '-d', '--force-recreate'])
self.kafka_docker_id = self.get_instance_docker_id('kafka1')
subprocess_check_call(self.base_cmd + ['up', '-d', '--force-recreate'])
if self.with_hdfs and self.base_hdfs_cmd:
subprocess_check_call(self.base_hdfs_cmd + ['up', '-d', '--force-recreate'])
self.wait_hdfs_to_start(120)
subprocess_check_call(self.base_cmd + ['up', '-d', '--no-recreate'])
start_deadline = time.time() + 20.0 # seconds
for instance in self.instances.itervalues():
@ -310,7 +337,6 @@ services:
{name}:
image: {image}
hostname: {hostname}
user: '{uid}'
volumes:
- {binary_path}:/usr/bin/clickhouse:ro
- {configs_dir}:/etc/clickhouse-server/
@ -588,7 +614,6 @@ class ClickHouseInstance:
image=self.image,
name=self.name,
hostname=self.hostname,
uid=os.getuid(),
binary_path=self.server_bin_path,
configs_dir=configs_dir,
config_d_dir=config_d_dir,

View File

@ -0,0 +1,9 @@
version: '2'
services:
hdfs1:
image: sequenceiq/hadoop-docker:2.7.0
restart: always
ports:
- 50075:50075
- 50070:50070
entrypoint: /etc/bootstrap.sh -d

View File

@ -0,0 +1,46 @@
#-*- coding: utf-8 -*-
import requests
import subprocess
from tempfile import NamedTemporaryFile
class HDFSApi(object):
def __init__(self, user):
self.host = "localhost"
self.http_proxy_port = "50070"
self.http_data_port = "50075"
self.user = user
def read_data(self, path):
response = requests.get("http://{host}:{port}/webhdfs/v1{path}?op=OPEN".format(host=self.host, port=self.http_proxy_port, path=path), allow_redirects=False)
if response.status_code != 307:
response.raise_for_status()
additional_params = '&'.join(response.headers['Location'].split('&')[1:2])
response_data = requests.get("http://{host}:{port}/webhdfs/v1{path}?op=OPEN&{params}".format(host=self.host, port=self.http_data_port, path=path, params=additional_params))
if response_data.status_code != 200:
response_data.raise_for_status()
return response_data.text
# Requests can't put file
def _curl_to_put(self, filename, path, params):
url = "http://{host}:{port}/webhdfs/v1{path}?op=CREATE&{params}".format(host=self.host, port=self.http_data_port, path=path, params=params)
cmd = "curl -s -i -X PUT -T {fname} '{url}'".format(fname=filename, url=url)
output = subprocess.check_output(cmd, shell=True)
return output
def write_data(self, path, content):
named_file = NamedTemporaryFile()
fpath = named_file.name
named_file.write(content)
named_file.flush()
response = requests.put(
"http://{host}:{port}/webhdfs/v1{path}?op=CREATE".format(host=self.host, port=self.http_proxy_port, path=path, user=self.user),
allow_redirects=False
)
if response.status_code != 307:
response.raise_for_status()
additional_params = '&'.join(response.headers['Location'].split('&')[1:2] + ["user.name={}".format(self.user), "overwrite=true"])
output = self._curl_to_put(fpath, path, additional_params)
if "201 Created" not in output:
raise Exception("Can't create file on hdfs:\n {}".format(output))

View File

@ -2,54 +2,94 @@
#-*- coding: utf-8 -*-
import subprocess
import os
import getpass
import argparse
import logging
import signal
import subprocess
CUR_FILE_DIR_PATH = os.path.dirname(os.path.realpath(__file__))
DEFAULT_CLICKHOUSE_ROOT = os.path.abspath(os.path.join(CUR_FILE_DIR_PATH, "../../../"))
CUR_FILE_DIR = os.path.dirname(os.path.realpath(__file__))
DEFAULT_CLICKHOUSE_ROOT = os.path.abspath(os.path.join(CUR_FILE_DIR, "../../../"))
CURRENT_WORK_DIR = os.getcwd()
CONTAINER_NAME = "clickhouse_integration_tests"
DIND_INTEGRATION_TESTS_IMAGE_NAME = "yandex/clickhouse-integration-tests-runner"
def check_args_and_update_paths(args):
if not os.path.isabs(args.binary):
args.binary = os.path.abspath(os.path.join(CURRENT_WORK_DIR, args.binary))
if not os.path.isabs(args.configs_dir):
args.configs_dir = os.path.abspath(os.path.join(CURRENT_WORK_DIR, args.configs_dir))
if not os.path.isabs(args.clickhouse_root):
args.clickhouse_root = os.path.abspath(os.path.join(CURRENT_WORK_DIR, args.clickhouse_root))
for path in [args.binary, args.configs_dir, args.clickhouse_root]:
if not os.path.exists(path):
raise Exception("Path {} doesn't exists".format(path))
def try_rm_image():
try:
subprocess.check_call('docker rm {name}'.format(name=CONTAINER_NAME), shell=True)
except:
pass
def docker_kill_handler_handler(signum, frame):
subprocess.check_call('docker kill $(docker ps -a -q --filter name={name} --format="{{{{.ID}}}}")'.format(name=CONTAINER_NAME), shell=True)
try_rm_image()
raise KeyboardInterrupt("Killed by Ctrl+C")
signal.signal(signal.SIGINT, docker_kill_handler_handler)
if __name__ == "__main__":
logging.basicConfig(level=logging.INFO, format='%(asctime)s %(message)s')
parser = argparse.ArgumentParser(description="ClickHouse integration tests runner")
parser.add_argument(
"--binary",
default=os.environ.get("CLICKHOUSE_TESTS_SERVER_BIN_PATH", os.environ.get("CLICKHOUSE_TESTS_CLIENT_BIN_PATH", "/usr/bin/clickhouse")),
help="Path to clickhouse binary")
parser.add_argument(
"--configs-dir",
default=os.environ.get("CLICKHOUSE_TESTS_BASE_CONFIG_DIR", "/etc/clickhouse-server"),
help="Path to clickhouse configs directory"
)
default=os.environ.get("CLICKHOUSE_TESTS_BASE_CONFIG_DIR", os.path.join(DEFAULT_CLICKHOUSE_ROOT, "dbms/programs/server")),
help="Path to clickhouse configs directory")
parser.add_argument(
"--clickhouse-root",
default=DEFAULT_CLICKHOUSE_ROOT,
help="Path to repository root folder"
)
help="Path to repository root folder")
parser.add_argument(
"--disable-net-host",
action='store_true',
default=False,
help="Don't use net host in parent docker container"
)
help="Don't use net host in parent docker container")
parser.add_argument('pytest_args', nargs='*', help="args for pytest command")
args = parser.parse_args()
check_args_and_update_paths(args)
net = ""
if not args.disable_net_host:
net = "--net=host"
cmd = "docker run {net} --privileged --volume={bin}:/clickhouse \
--volume={cfg}:/clickhouse-config --volume={pth}:/ClickHouse -e PYTEST_OPTS='{opts}' {img}".format(
cmd = "docker run {net} --name {name} --user={user} --privileged --volume={bin}:/clickhouse \
--volume={cfg}:/clickhouse-config --volume={pth}:/ClickHouse -e PYTEST_OPTS='{opts}' {img} ".format(
net=net,
bin=args.binary,
cfg=args.configs_dir,
pth=args.clickhouse_root,
opts=' '.join(args.pytest_args),
img=DIND_INTEGRATION_TESTS_IMAGE_NAME,
user=getpass.getuser(),
name=CONTAINER_NAME,
)
subprocess.check_call(cmd, shell=True)
try:
subprocess.check_call(cmd, shell=True)
finally:
try_rm_image()

View File

@ -28,5 +28,19 @@
</replica>
</shard>
</shard_with_local_replica>
<shard_with_low_cardinality>
<shard>
<replica>
<host>shard1</host>
<port>9000</port>
</replica>
</shard>
<shard>
<replica>
<host>shard2</host>
<port>9000</port>
</replica>
</shard>
</shard_with_low_cardinality>
</remote_servers>
</yandex>

View File

@ -21,6 +21,8 @@ instance_test_inserts_local_cluster = cluster.add_instance(
node1 = cluster.add_instance('node1', main_configs=['configs/remote_servers.xml'], with_zookeeper=True)
node2 = cluster.add_instance('node2', main_configs=['configs/remote_servers.xml'], with_zookeeper=True)
shard1 = cluster.add_instance('shard1', main_configs=['configs/remote_servers.xml'], with_zookeeper=True)
shard2 = cluster.add_instance('shard2', main_configs=['configs/remote_servers.xml'], with_zookeeper=True)
@pytest.fixture(scope="module")
def started_cluster():
@ -56,6 +58,19 @@ CREATE TABLE distributed (date Date, id UInt32) ENGINE = Distributed('shard_with
node2.query('''
CREATE TABLE distributed (date Date, id UInt32) ENGINE = Distributed('shard_with_local_replica', 'default', 'replicated')
''')
shard1.query('''
SET allow_experimental_low_cardinality_type = 1;
CREATE TABLE low_cardinality (d Date, x UInt32, s LowCardinality(String)) ENGINE = MergeTree(d, x, 8192)''')
shard2.query('''
SET allow_experimental_low_cardinality_type = 1;
CREATE TABLE low_cardinality (d Date, x UInt32, s LowCardinality(String)) ENGINE = MergeTree(d, x, 8192)''')
shard1.query('''
SET allow_experimental_low_cardinality_type = 1;
CREATE TABLE low_cardinality_all (d Date, x UInt32, s LowCardinality(String)) ENGINE = Distributed('shard_with_low_cardinality', 'default', 'low_cardinality', sipHash64(s))''')
yield cluster
finally:
@ -170,3 +185,10 @@ def test_prefer_localhost_replica(started_cluster):
'''
# Now query is sent to node1, as it higher in order
assert TSV(node2.query("SET load_balancing='in_order'; SET prefer_localhost_replica=0;" + test_query)) == TSV(expected_from_node1)
def test_inserts_low_cardinality(started_cluster):
instance = shard1
instance.query("INSERT INTO low_cardinality_all (d,x,s) VALUES ('2018-11-12',1,'123')")
time.sleep(0.5)
assert instance.query("SELECT count(*) FROM low_cardinality_all").strip() == '1'

View File

@ -0,0 +1,11 @@
<yandex>
<logger>
<level>trace</level>
<log>/var/log/clickhouse-server/log.log</log>
<errorlog>/var/log/clickhouse-server/log.err.log</errorlog>
<size>1000M</size>
<count>10</count>
<stderr>/var/log/clickhouse-server/stderr.log</stderr>
<stdout>/var/log/clickhouse-server/stdout.log</stdout>
</logger>
</yandex>

View File

@ -0,0 +1,47 @@
import time
import pytest
import requests
from tempfile import NamedTemporaryFile
from helpers.hdfs_api import HDFSApi
import os
from helpers.cluster import ClickHouseCluster
import subprocess
SCRIPT_DIR = os.path.dirname(os.path.realpath(__file__))
cluster = ClickHouseCluster(__file__)
node1 = cluster.add_instance('node1', with_hdfs=True, image='withlibsimage', config_dir="configs", main_configs=['configs/log_conf.xml'])
@pytest.fixture(scope="module")
def started_cluster():
try:
cluster.start()
yield cluster
except Exception as ex:
print(ex)
raise ex
finally:
cluster.shutdown()
def test_read_write_storage(started_cluster):
hdfs_api = HDFSApi("root")
hdfs_api.write_data("/simple_storage", "1\tMark\t72.53\n")
assert hdfs_api.read_data("/simple_storage") == "1\tMark\t72.53\n"
node1.query("create table SimpleHDFSStorage (id UInt32, name String, weight Float64) ENGINE = HDFS('hdfs://hdfs1:9000/simple_storage', 'TSV')")
assert node1.query("select * from SimpleHDFSStorage") == "1\tMark\t72.53\n"
def test_read_write_table(started_cluster):
hdfs_api = HDFSApi("root")
data = "1\tSerialize\t555.222\n2\tData\t777.333\n"
hdfs_api.write_data("/simple_table_function", data)
assert hdfs_api.read_data("/simple_table_function") == data
assert node1.query("select * from hdfs('hdfs://hdfs1:9000/simple_table_function', 'TSV', 'id UInt64, text String, number Float64')") == data

View File

@ -0,0 +1,3 @@
123 1
234 4
345 5

View File

@ -0,0 +1,29 @@
#!/usr/bin/env bash
CURDIR=$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)
. $CURDIR/../shell_config.sh
. $CURDIR/mergetree_mutations.lib
${CLICKHOUSE_CLIENT} --multiquery << EOF
DROP TABLE IF EXISTS test.mutations;
DROP TABLE IF EXISTS test.for_subquery;
USE test;
CREATE TABLE mutations(x UInt32, y UInt32) ENGINE MergeTree ORDER BY x;
INSERT INTO mutations VALUES (123, 1), (234, 2), (345, 3);
CREATE TABLE for_subquery(x UInt32) ENGINE TinyLog;
INSERT INTO for_subquery VALUES (234), (345);
ALTER TABLE mutations UPDATE y = y + 1 WHERE x IN for_subquery;
ALTER TABLE mutations UPDATE y = y + 1 WHERE x IN (SELECT x FROM for_subquery);
EOF
wait_for_mutation "mutations" "mutation_3.txt"
${CLICKHOUSE_CLIENT} --query="SELECT * FROM test.mutations"
${CLICKHOUSE_CLIENT} --query="DROP TABLE test.mutations"
${CLICKHOUSE_CLIENT} --query="DROP TABLE test.for_subquery"

View File

@ -0,0 +1,29 @@
#!/usr/bin/env bash
CURDIR=$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)
. $CURDIR/../shell_config.sh
. $CURDIR/mergetree_mutations.lib
${CLICKHOUSE_CLIENT} --multiquery << EOF
DROP TABLE IF EXISTS test.mutations_r1;
DROP TABLE IF EXISTS test.for_subquery;
USE test;
CREATE TABLE mutations_r1(x UInt32, y UInt32) ENGINE ReplicatedMergeTree('/clickhouse/tables/test/mutations', 'r1') ORDER BY x;
INSERT INTO mutations_r1 VALUES (123, 1), (234, 2), (345, 3);
CREATE TABLE for_subquery(x UInt32) ENGINE TinyLog;
INSERT INTO for_subquery VALUES (234), (345);
ALTER TABLE mutations_r1 UPDATE y = y + 1 WHERE x IN for_subquery;
ALTER TABLE mutations_r1 UPDATE y = y + 1 WHERE x IN (SELECT x FROM for_subquery);
EOF
wait_for_mutation "mutations_r1" "0000000001"
${CLICKHOUSE_CLIENT} --query="SELECT * FROM test.mutations_r1"
${CLICKHOUSE_CLIENT} --query="DROP TABLE test.mutations_r1"
${CLICKHOUSE_CLIENT} --query="DROP TABLE test.for_subquery"

View File

@ -0,0 +1,14 @@
drop table if exists test.table;
CREATE TABLE test.table (a UInt32, date Date, b UInt64, c UInt64, str String, d Int8, arr Array(UInt64), arr_alias Array(UInt64) ALIAS arr) ENGINE = MergeTree(date, intHash32(c), (a, date, intHash32(c), b), 8192);
SELECT alias2 AS alias3
FROM test.table
ARRAY JOIN
arr_alias AS alias2,
arrayEnumerateUniq(arr_alias) AS _uniq_Event
WHERE (date = toDate('2010-10-10')) AND (a IN (2, 3)) AND (str NOT IN ('z', 'x')) AND (d != -1)
LIMIT 1;
drop table if exists test.table;

View File

@ -1,4 +1,4 @@
CREATE MATERIALIZED VIEW test.t_mv ( date Date, platform Enum8('a' = 0, 'b' = 1), app Enum8('a' = 0, 'b' = 1)) ENGINE = MergeTree ORDER BY date SETTINGS index_granularity = 8192 AS SELECT date, platform, app FROM test.t WHERE (app = (SELECT min(app) FROM test.u )) AND (platform = (SELECT min(platform) FROM test.v ))
CREATE MATERIALIZED VIEW test.t_mv ( date Date, platform Enum8('a' = 0, 'b' = 1), app Enum8('a' = 0, 'b' = 1)) ENGINE = MergeTree ORDER BY date SETTINGS index_granularity = 8192 AS SELECT date, platform, app FROM test.t WHERE (app = (SELECT min(app) FROM test.u )) AND (platform = (SELECT (SELECT min(platform) FROM test.v )))
2000-01-01 a a
2000-01-02 b b
2000-01-03 a a

View File

@ -20,19 +20,25 @@ INSERT INTO v VALUES ('b');
CREATE MATERIALIZED VIEW t_mv ENGINE = MergeTree ORDER BY date
AS SELECT date, platform, app FROM t
WHERE app = (SELECT min(app) from u) AND platform = (SELECT min(platform) from v);
WHERE app = (SELECT min(app) from u) AND platform = (SELECT (SELECT min(platform) from v));
SHOW CREATE TABLE test.t_mv FORMAT TabSeparatedRaw;
INSERT INTO t VALUES ('2000-01-01', 'a', 'a') ('2000-01-02', 'b', 'b');
USE default;
DETACH TABLE test.t_mv;
ATTACH TABLE test.t_mv;
INSERT INTO u VALUES ('a');
INSERT INTO v VALUES ('a');
INSERT INTO test.t VALUES ('2000-01-01', 'a', 'a') ('2000-01-02', 'b', 'b');
INSERT INTO t VALUES ('2000-01-03', 'a', 'a') ('2000-01-04', 'b', 'b');
INSERT INTO test.u VALUES ('a');
INSERT INTO test.v VALUES ('a');
SELECT * FROM t ORDER BY date;
SELECT * FROM t_mv ORDER BY date;
INSERT INTO test.t VALUES ('2000-01-03', 'a', 'a') ('2000-01-04', 'b', 'b');
DROP TABLE IF EXISTS t;
DROP TABLE IF EXISTS t_mv;
SELECT * FROM test.t ORDER BY date;
SELECT * FROM test.t_mv ORDER BY date;
DROP TABLE test.t;
DROP TABLE test.t_mv;
DROP TABLE test.u;
DROP TABLE test.v;

View File

@ -17,7 +17,7 @@ function thread1()
function thread2()
{
seq 1 1000 | sed -r -e 's/.+/SELECT count() FROM test.buffer;/' | ${CLICKHOUSE_CLIENT} --multiquery --server_logs_file='/dev/null' --ignore-error 2>&1 | grep -vP '^0$|^10$|^Received exception|^Code: 60'
seq 1 1000 | sed -r -e 's/.+/SELECT count() FROM test.buffer;/' | ${CLICKHOUSE_CLIENT} --multiquery --server_logs_file='/dev/null' --ignore-error 2>&1 | grep -vP '^0$|^10$|^Received exception|^Code: 60|^Code: 218'
}
thread1 &

View File

@ -0,0 +1,12 @@
SET allow_experimental_low_cardinality_type = 1;
DROP TABLE IF EXISTS test.low_cardinality;
DROP TABLE IF EXISTS test.low_cardinality_all;
CREATE TABLE test.low_cardinality (d Date, x UInt32, s LowCardinality(String)) ENGINE = MergeTree(d, x, 8192);
CREATE TABLE test.low_cardinality_all (d Date, x UInt32, s LowCardinality(String)) ENGINE = Distributed(test_shard_localhost, test, low_cardinality, sipHash64(s));
INSERT INTO test.low_cardinality_all (d,x,s) VALUES ('2018-11-12',1,'123');
SELECT s FROM test.low_cardinality_all;
DROP TABLE IF EXISTS test.low_cardinality;
DROP TABLE IF EXISTS test.low_cardinality_all;

View File

@ -24,6 +24,8 @@ RUN apt-get update \
/tmp/* \
&& apt-get clean
RUN mkdir /docker-entrypoint-initdb.d
COPY docker_related_config.xml /etc/clickhouse-server/config.d/
COPY entrypoint.sh /entrypoint.sh
ADD https://github.com/tianon/gosu/releases/download/1.10/gosu-amd64 /bin/gosu

View File

@ -33,6 +33,22 @@ ClickHouse configuration represented with a file "config.xml" ([documentation](h
$ docker run -d --name some-clickhouse-server --ulimit nofile=262144:262144 -v /path/to/your/config.xml:/etc/clickhouse-server/config.xml yandex/clickhouse-server
```
## How to extend this image
If you would like to do additional initialization in an image derived from this one, add one or more `*.sql`, `*.sql.gz`, or `*.sh` scripts under `/docker-entrypoint-initdb.d`. After the entrypoint calls `initdb` it will run any `*.sql` files, run any executable `*.sh` scripts, and source any non-executable `*.sh` scripts found in that directory to do further initialization before starting the service.
For example, to add an additional user and database, add the following to `/docker-entrypoint-initdb.d/init-db.sh`:
```bash
#!/bin/bash
set -e
clickhouse client -n <<-EOSQL
CREATE DATABASE docker;
CREATE TABLE docker.docker (x Int32) ENGINE = Log;
EOSQL
```
## License
View [license information](https://github.com/yandex/ClickHouse/blob/master/LICENSE) for the software contained in this image.

View File

@ -30,6 +30,36 @@ chown -R $USER:$GROUP \
"$TMP_DIR" \
"$USER_PATH"
if [ -n "$(ls /docker-entrypoint-initdb.d/)" ]; then
gosu clickhouse /usr/bin/clickhouse-server --config-file=$CLICKHOUSE_CONFIG &
pid="$!"
sleep 1
clickhouseclient=( clickhouse client --multiquery )
echo
for f in /docker-entrypoint-initdb.d/*; do
case "$f" in
*.sh)
if [ -x "$f" ]; then
echo "$0: running $f"
"$f"
else
echo "$0: sourcing $f"
. "$f"
fi
;;
*.sql) echo "$0: running $f"; cat "$f" | "${clickhouseclient[@]}" ; echo ;;
*.sql.gz) echo "$0: running $f"; gunzip -c "$f" | "${clickhouseclient[@]}"; echo ;;
*) echo "$0: ignoring $f" ;;
esac
echo
done
if ! kill -s TERM "$pid" || ! wait "$pid"; then
echo >&2 'ClickHouse init process failed.'
exit 1
fi
fi
# if no args passed to `docker run` or first argument start with `--`, then the user is passing clickhouse-server arguments
if [[ $# -lt 1 ]] || [[ "$1" == "--"* ]]; then

View File

@ -39,6 +39,7 @@ You can also download and install packages manually from here: <https://repo.yan
Yandex does not run ClickHouse on `rpm` based Linux distributions and `rpm` packages are not as thoroughly tested. So use them at your own risk, but there are many other companies that do successfully run them in production without any major issues.
For CentOS, RHEL or Fedora there are the following options:
* Packages from <https://repo.yandex.ru/clickhouse/rpm/stable/x86_64/> are generated from official `deb` packages by Yandex and have byte-identical binaries.
* Packages from <https://github.com/Altinity/clickhouse-rpm-install> are built by independent company Altinity, but are used widely without any complaints.
* Or you can use Docker (see below).

View File

@ -1,10 +1,12 @@
# Visual Interfaces from Third-party Developers
## Tabix
## Open-Source
### Tabix
Web interface for ClickHouse in the [Tabix](https://github.com/tabixio/tabix) project.
Main features:
Features:
- Works with ClickHouse directly from the browser, without the need to install additional software.
- Query editor with syntax highlighting.
@ -14,11 +16,11 @@ Main features:
[Tabix documentation](https://tabix.io/doc/).
## HouseOps
### HouseOps
[HouseOps](https://github.com/HouseOps/HouseOps) is a UI/IDE for OSX, Linux and Windows.
Main features:
Features:
- Query builder with syntax highlighting. View the response in a table or JSON view.
- Export query results as CSV or JSON.
@ -36,25 +38,27 @@ The following features are planned for development:
- Cluster management.
- Monitoring replicated and Kafka tables.
## DBeaver
## Commercial
### DBeaver
[DBeaver](https://dbeaver.io/) - universal desktop database client with ClickHouse support.
Key features:
Features:
- Query development with syntax highlight.
- Table preview.
- Autocompletion.
## DataGrip
### DataGrip
[DataGrip](https://www.jetbrains.com/datagrip/) - Database IDE from JetBrains with dedicated support for ClickHouse. The same is embedded into other IntelliJ-based tools: PyCharm, IntelliJIDEA, GoLand, PhpStorm etc.
[DataGrip](https://www.jetbrains.com/datagrip/) is a database IDE from JetBrains with dedicated support for ClickHouse. It is also embedded into other IntelliJ-based tools: PyCharm, IntelliJ IDEA, GoLand, PhpStorm and others.
Features:
- Very fast code completion.
- Clickhouse synthax highlighting.
- Specific Clickhouse features support in SQL, i.e. nested columns, table engines.
- ClickHouse syntax highlighting.
- Support for features specific to ClickHouse, for example nested columns, table engines.
- Data Editor.
- Refactorings.
- Search and Navigation.

View File

@ -3,6 +3,11 @@
!!! warning "Disclaimer"
Yandex does **not** maintain the libraries listed below and haven't done any extensive testing to ensure their quality.
- Relational database management systems
- [MySQL](https://www.mysql.com)
- [ProxySQL](https://github.com/sysown/proxysql/wiki/ClickHouse-Support)
- [PostgreSQL](https://www.postgresql.org)
- [infi.clickhouse_fdw](https://github.com/Infinidat/infi.clickhouse_fdw) (uses [infi.clickhouse_orm](https://github.com/Infinidat/infi.clickhouse_orm))
- Python
- [SQLAlchemy](https://www.sqlalchemy.org)
- [sqlalchemy-clickhouse](https://github.com/cloudflare/sqlalchemy-clickhouse) (uses [infi.clickhouse_orm](https://github.com/Infinidat/infi.clickhouse_orm))
@ -22,4 +27,4 @@
- [clickhouse_ecto](https://github.com/appodeal/clickhouse_ecto)
[Original article](https://clickhouse.yandex/docs/en/interfaces/third-party/integrations/) <!--hide-->
[Original article](https://clickhouse.yandex/docs/en/interfaces/third-party/integrations/) <!--hide-->

Some files were not shown because too many files have changed in this diff Show More