Mirror of https://github.com/ClickHouse/ClickHouse.git (synced 2024-11-30 11:32:03 +00:00)

Commit 767193caf5: Merge branch 'master' into keeper-log-improvements
.clang-tidy (10 lines changed)

@@ -23,7 +23,7 @@ Checks: '*,
     -bugprone-implicit-widening-of-multiplication-result,
     -bugprone-narrowing-conversions,
     -bugprone-not-null-terminated-result,
-    -bugprone-reserved-identifier,
+    -bugprone-reserved-identifier, # useful but too slow, TODO retry when https://reviews.llvm.org/rG1c282052624f9d0bd273bde0b47b30c96699c6c7 is merged
     -bugprone-unchecked-optional-access,

     -cert-dcl16-c,
@@ -111,6 +111,7 @@ Checks: '*,
     -misc-no-recursion,
     -misc-non-private-member-variables-in-classes,
     -misc-confusable-identifiers, # useful but slooow
+    -misc-use-anonymous-namespace,

     -modernize-avoid-c-arrays,
     -modernize-concat-nested-namespaces,
@@ -136,7 +137,7 @@ Checks: '*,
     -readability-function-cognitive-complexity,
     -readability-function-size,
     -readability-identifier-length,
-    -readability-identifier-naming,
+    -readability-identifier-naming, # useful but too slow
     -readability-implicit-bool-conversion,
     -readability-isolate-declaration,
     -readability-magic-numbers,
@@ -148,7 +149,7 @@ Checks: '*,
     -readability-uppercase-literal-suffix,
     -readability-use-anyofallof,

-    -zirkon-*,
+    -zircon-*,
 '

 WarningsAsErrors: '*'
@@ -168,11 +169,10 @@ CheckOptions:
   readability-identifier-naming.ParameterPackCase: lower_case
   readability-identifier-naming.StructCase: CamelCase
   readability-identifier-naming.TemplateTemplateParameterCase: CamelCase
-  readability-identifier-naming.TemplateUsingCase: lower_case
+  readability-identifier-naming.TemplateParameterCase: lower_case
   readability-identifier-naming.TypeTemplateParameterCase: CamelCase
   readability-identifier-naming.TypedefCase: CamelCase
   readability-identifier-naming.UnionCase: CamelCase
-  readability-identifier-naming.UsingCase: CamelCase
   modernize-loop-convert.UseCxx20ReverseRanges: false
   performance-move-const-arg.CheckTriviallyCopyableMove: false
   # Workaround clang-tidy bug: https://github.com/llvm/llvm-project/issues/46097
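For readers unfamiliar with the `readability-identifier-naming` options listed above, a small illustrative C++ fragment shows a few of the casings the configuration asks for: CamelCase for structs, unions, typedefs and type template parameters, lower_case for parameter packs. All names below are hypothetical and are not taken from the ClickHouse sources.

```cpp
#include <cstddef>

// StructCase / TypedefCase / UnionCase: CamelCase
struct ColumnChunk { std::size_t rows = 0; };
typedef ColumnChunk ChunkAlias;
union StorageCell { int i; float f; };

// TypeTemplateParameterCase / TemplateTemplateParameterCase: CamelCase,
// ParameterPackCase: lower_case
template <typename Value, template <typename, typename> class PairType, typename... extra_args>
struct Wrapper
{
    PairType<Value, int> payload;
};
```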
.clangd (new file, 16 lines)

@@ -0,0 +1,16 @@
+Diagnostics:
+  # clangd does parse .clang-tidy, but some checks are too slow to run in
+  # clang-tidy build, so let's enable them explicitly for clangd at least.
+  ClangTidy:
+    # The following checks had been disabled due to slowliness with C++23,
+    # for more details see [1].
+    #
+    # [1]: https://github.com/llvm/llvm-project/issues/61418
+    #
+    # But the code base had been written in a style that had been checked
+    # by this check, so at least, let's enable it for clangd.
+    Add: [
+      # configured in .clang-tidy
+      readability-identifier-naming,
+      bugprone-reserved-identifier,
+    ]
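As a quick illustration of why `bugprone-reserved-identifier` (re-enabled above for clangd) is worth running: the C++ standard reserves certain spellings for the implementation, and the check flags declarations of such names. The declarations below are hypothetical examples, not real ClickHouse code.

```cpp
// Names reserved for the implementation by the C++ standard; the check flags each declaration.
int _Count = 0;      // '_' followed by an uppercase letter: reserved everywhere
int my__value = 0;   // contains a double underscore: reserved everywhere
int _global = 0;     // leading '_' at global scope: reserved in the global namespace

int main()
{
    int _local = 0;  // fine: a single leading underscore is allowed at block scope
    return _Count + my__value + _global + _local;
}
```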
.github/workflows/master.yml (34 lines changed)

@@ -1341,6 +1341,40 @@ jobs:
           docker ps --quiet | xargs --no-run-if-empty docker kill ||:
           docker ps --all --quiet | xargs --no-run-if-empty docker rm -f ||:
           sudo rm -fr "$TEMP_PATH"
+  FunctionalStatelessTestReleaseAnalyzer:
+    needs: [BuilderDebRelease]
+    runs-on: [self-hosted, func-tester]
+    steps:
+      - name: Set envs
+        run: |
+          cat >> "$GITHUB_ENV" << 'EOF'
+          TEMP_PATH=${{runner.temp}}/stateless_analyzer
+          REPORTS_PATH=${{runner.temp}}/reports_dir
+          CHECK_NAME=Stateless tests (release, analyzer)
+          REPO_COPY=${{runner.temp}}/stateless_analyzer/ClickHouse
+          KILL_TIMEOUT=10800
+          EOF
+      - name: Download json reports
+        uses: actions/download-artifact@v3
+        with:
+          path: ${{ env.REPORTS_PATH }}
+      - name: Check out repository code
+        uses: ClickHouse/checkout@v1
+        with:
+          clear-repository: true
+      - name: Functional test
+        run: |
+          sudo rm -fr "$TEMP_PATH"
+          mkdir -p "$TEMP_PATH"
+          cp -r "$GITHUB_WORKSPACE" "$TEMP_PATH"
+          cd "$REPO_COPY/tests/ci"
+          python3 functional_test_check.py "$CHECK_NAME" "$KILL_TIMEOUT"
+      - name: Cleanup
+        if: always()
+        run: |
+          docker ps --quiet | xargs --no-run-if-empty docker kill ||:
+          docker ps --all --quiet | xargs --no-run-if-empty docker rm -f ||:
+          sudo rm -fr "$TEMP_PATH"
   FunctionalStatelessTestAarch64:
     needs: [BuilderDebAarch64]
     runs-on: [self-hosted, func-tester-aarch64]
.github/workflows/nightly.yml (7 lines changed)

@@ -72,6 +72,9 @@ jobs:
         with:
           name: changed_images
           path: ${{ runner.temp }}/changed_images.json
+  Codebrowser:
+    needs: [DockerHubPush]
+    uses: ./.github/workflows/woboq.yml
   BuilderCoverity:
     needs: DockerHubPush
     runs-on: [self-hosted, builder]
@@ -125,8 +128,8 @@ jobs:
       SONAR_SCANNER_VERSION: 4.8.0.2856
       SONAR_SERVER_URL: "https://sonarcloud.io"
       BUILD_WRAPPER_OUT_DIR: build_wrapper_output_directory # Directory where build-wrapper output will be placed
-      CC: clang-15
-      CXX: clang++-15
+      CC: clang-16
+      CXX: clang++-16
     steps:
       - name: Check out repository code
         uses: ClickHouse/checkout@v1
.github/workflows/woboq.yml (7 lines changed)

@@ -6,9 +6,8 @@ env:
 concurrency:
   group: woboq
 on: # yamllint disable-line rule:truthy
-  schedule:
-    - cron: '0 */18 * * *'
   workflow_dispatch:
+  workflow_call:
 jobs:
   # don't use dockerhub push because this image updates so rarely
   WoboqCodebrowser:
@@ -26,6 +25,10 @@ jobs:
         with:
           clear-repository: true
           submodules: 'true'
+      - name: Download json reports
+        uses: actions/download-artifact@v3
+        with:
+          path: ${{ env.IMAGES_PATH }}
      - name: Codebrowser
        run: |
          sudo rm -fr "$TEMP_PATH"
.gitignore (1 line changed)

@@ -129,7 +129,6 @@ website/package-lock.json
 /.ccls-cache

 # clangd cache
-/.clangd
 /.cache

 /compile_commands.json
.gitmodules (5 lines changed)

@@ -267,7 +267,7 @@
 url = https://github.com/ClickHouse/nats.c
 [submodule "contrib/vectorscan"]
 path = contrib/vectorscan
-url = https://github.com/VectorCamp/vectorscan
+url = https://github.com/VectorCamp/vectorscan.git
 [submodule "contrib/c-ares"]
 path = contrib/c-ares
 url = https://github.com/ClickHouse/c-ares
@@ -338,6 +338,9 @@
 [submodule "contrib/liburing"]
 path = contrib/liburing
 url = https://github.com/axboe/liburing
+[submodule "contrib/libfiu"]
+path = contrib/libfiu
+url = https://github.com/ClickHouse/libfiu.git
 [submodule "contrib/isa-l"]
 path = contrib/isa-l
 url = https://github.com/ClickHouse/isa-l.git
@@ -102,6 +102,17 @@ if (ENABLE_FUZZING)
     set (ENABLE_PROTOBUF 1)
 endif()

+option (ENABLE_WOBOQ_CODEBROWSER "Build for woboq codebrowser" OFF)
+
+if (ENABLE_WOBOQ_CODEBROWSER)
+    set (ENABLE_EMBEDDED_COMPILER 0)
+    set (CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -Wno-poison-system-directories")
+    # woboq codebrowser uses clang tooling, and they could add default system
+    # clang includes, and later clang will warn for those added by itself
+    # includes.
+    set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wno-poison-system-directories")
+endif()
+
 # Global libraries
 # See:
 # - default_libs.cmake
@@ -259,8 +270,8 @@ endif ()
 option (ENABLE_BUILD_PATH_MAPPING "Enable remapping of file source paths in debug info, predefined preprocessor macros, and __builtin_FILE(). It's used to generate reproducible builds. See https://reproducible-builds.org/docs/build-path" ${ENABLE_BUILD_PATH_MAPPING_DEFAULT})

 if (ENABLE_BUILD_PATH_MAPPING)
-    set (COMPILER_FLAGS "${COMPILER_FLAGS} -ffile-prefix-map=${CMAKE_SOURCE_DIR}=.")
-    set (CMAKE_ASM_FLAGS "${CMAKE_ASM_FLAGS} -ffile-prefix-map=${CMAKE_SOURCE_DIR}=.")
+    set (COMPILER_FLAGS "${COMPILER_FLAGS} -ffile-prefix-map=${PROJECT_SOURCE_DIR}=.")
+    set (CMAKE_ASM_FLAGS "${CMAKE_ASM_FLAGS} -ffile-prefix-map=${PROJECT_SOURCE_DIR}=.")
 endif ()

 option (ENABLE_BUILD_PROFILING "Enable profiling of build time" OFF)
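The `-ffile-prefix-map=${PROJECT_SOURCE_DIR}=.` switch above is what makes the paths baked into the binary independent of the build machine, as the option description itself says (debug info, predefined macros, `__builtin_FILE()`). A minimal C++ sketch of the effect; the file name and path in the comment are hypothetical.

```cpp
#include <cstdio>

int main()
{
    // __FILE__ (and __builtin_FILE(), used by std::source_location) embeds the
    // path exactly as the compiler saw it, e.g. /home/user/ClickHouse/src/foo.cpp.
    // Building with -ffile-prefix-map=/home/user/ClickHouse=. turns that into
    // ./src/foo.cpp, so the produced binaries do not depend on where the source
    // tree happened to be checked out (reproducible builds).
    std::printf("compiled from: %s\n", __FILE__);
}
```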
@@ -342,13 +353,6 @@ if (COMPILER_CLANG)

     set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -fstrict-vtable-pointers")

-    if (CMAKE_CXX_COMPILER_VERSION VERSION_LESS 16)
-        # Set new experimental pass manager, it's a performance, build time and binary size win.
-        # Can be removed after https://reviews.llvm.org/D66490 merged and released to at least two versions of clang.
-        set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -fexperimental-new-pass-manager")
-        set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -fexperimental-new-pass-manager")
-    endif ()
-
     # We cannot afford to use LTO when compiling unit tests, and it's not enough
     # to only supply -fno-lto at the final linking stage. So we disable it
     # completely.
@@ -395,6 +399,8 @@ if ((NOT OS_LINUX AND NOT OS_ANDROID) OR (CMAKE_BUILD_TYPE_UC STREQUAL "DEBUG"))
     set(ENABLE_GWP_ASAN OFF)
 endif ()

+option (ENABLE_FIU "Enable Fiu" ON)
+
 option(WERROR "Enable -Werror compiler option" ON)

 if (WERROR)
@@ -562,7 +568,7 @@ if (NATIVE_BUILD_TARGETS
 )
     message (STATUS "Building native targets...")

-    set (NATIVE_BUILD_DIR "${CMAKE_BINARY_DIR}/native")
+    set (NATIVE_BUILD_DIR "${PROJECT_BINARY_DIR}/native")

     execute_process(
         COMMAND ${CMAKE_COMMAND} -E make_directory "${NATIVE_BUILD_DIR}"
@@ -576,7 +582,7 @@ if (NATIVE_BUILD_TARGETS
         # Avoid overriding .cargo/config.toml with native toolchain.
         "-DENABLE_RUST=OFF"
         "-DENABLE_CLICKHOUSE_SELF_EXTRACTING=${ENABLE_CLICKHOUSE_SELF_EXTRACTING}"
-        ${CMAKE_SOURCE_DIR}
+        ${PROJECT_SOURCE_DIR}
         WORKING_DIRECTORY "${NATIVE_BUILD_DIR}"
         COMMAND_ECHO STDOUT)
README.md (22 lines changed)

@@ -21,11 +21,25 @@ curl https://clickhouse.com/ | sh
 * [Contacts](https://clickhouse.com/company/contact) can help to get your questions answered if there are any.

 ## Upcoming Events
-* [**ClickHouse Spring Meetup in Manhattan**](https://www.meetup.com/clickhouse-new-york-user-group/events/292517734) - April 26 - It's spring, and it's time to meet again in the city! Talks include: "Building a domain specific query language on top of Clickhouse", "A Galaxy of Information", "Our Journey to ClickHouse Cloud from Redshift", and a ClickHouse update!
-* [**v23.4 Release Webinar**](https://clickhouse.com/company/events/v23-4-release-webinar?utm_source=github&utm_medium=social&utm_campaign=release-webinar-2023-04) - April 26 - 23.4 is rapidly approaching. Original creator, co-founder, and CTO of ClickHouse Alexey Milovidov will walk us through the highlights of the release.
-* [**ClickHouse Meetup in Berlin**](https://www.meetup.com/clickhouse-berlin-user-group/events/292892466) - May 16 - Save the date! ClickHouse is coming back to Berlin. We’re excited to announce an upcoming ClickHouse Meetup that you won’t want to miss. Join us as we gather together to discuss the latest in the world of ClickHouse and share user stories.
+* [**v23.5 Release Webinar**](https://clickhouse.com/company/events/v23-5-release-webinar?utm_source=github&utm_medium=social&utm_campaign=release-webinar-2023-05) - May 31 - 23.5 is rapidly approaching. Original creator, co-founder, and CTO of ClickHouse Alexey Milovidov will walk us through the highlights of the release.
+* [**ClickHouse Meetup in Barcelona**](https://www.meetup.com/clickhouse-barcelona-user-group/events/292892669) - May 25
+* [**ClickHouse Meetup in London**](https://www.meetup.com/clickhouse-london-user-group/events/292892824) - May 25
+* [**ClickHouse Meetup in San Francisco**](https://www.meetup.com/clickhouse-silicon-valley-meetup-group/events/293426725/) - Jun 7
+* [**ClickHouse Meetup in Stockholm**](https://www.meetup.com/clickhouse-berlin-user-group/events/292892466) - Jun 13
+
+Also, keep an eye out for upcoming meetups in Amsterdam, Boston, NYC, Beijing, and Toronto. Somewhere else you want us to be? Please feel free to reach out to tyler <at> clickhouse <dot> com.

 ## Recent Recordings
 * **Recent Meetup Videos**: [Meetup Playlist](https://www.youtube.com/playlist?list=PL0Z2YDlm0b3iNDUzpY1S3L_iV4nARda_U) Whenever possible recordings of the ClickHouse Community Meetups are edited and presented as individual talks. Current featuring "Modern SQL in 2023", "Fast, Concurrent, and Consistent Asynchronous INSERTS in ClickHouse", and "Full-Text Indices: Design and Experiments"
-* **Recording available**: [**v23.3 Release Webinar**](https://www.youtube.com/watch?v=ISaGUjvBNao) UNDROP TABLE, server settings introspection, nested dynamic disks, MySQL compatibility, parseDate Time, Lightweight Deletes, Parallel Replicas, integrations updates, and so much more! Watch it now!
+* **Recording available**: [**v23.4 Release Webinar**](https://www.youtube.com/watch?v=4rrf6bk_mOg) Faster Parquet Reading, Asynchonous Connections to Reoplicas, Trailing Comma before FROM, extractKeyValuePairs, integrations updates, and so much more! Watch it now!
 * **All release webinar recordings**: [YouTube playlist](https://www.youtube.com/playlist?list=PL0Z2YDlm0b3jAlSy1JxyP8zluvXaN3nxU)
+
+## Interested in joining ClickHouse and making it your full time job?
+
+We are a globally diverse and distributed team, united behind a common goal of creating industry-leading, real-time analytics. Here, you will have an opportunity to solve some of the most cutting edge technical challenges and have direct ownership of your work and vision. If you are a contributor by nature, a thinker as well as a doer - we’ll definitely click!
+
+Check out our **current openings** here: https://clickhouse.com/company/careers
+
+Cant find what you are looking for, but want to let us know you are interested in joining ClickHouse? Email careers@clickhouse.com!
@@ -3,6 +3,7 @@
 #include <cassert>
 #include <stdexcept> // for std::logic_error
 #include <string>
+#include <type_traits>
 #include <vector>
 #include <functional>
 #include <iosfwd>
@@ -326,5 +327,16 @@ namespace ZeroTraits
     inline void set(StringRef & x) { x.size = 0; }
 }

+namespace PackedZeroTraits
+{
+    template <typename Second, template <typename, typename> class PackedPairNoInit>
+    inline bool check(const PackedPairNoInit<StringRef, Second> p)
+    { return 0 == p.key.size; }
+
+    template <typename Second, template <typename, typename> class PackedPairNoInit>
+    inline void set(PackedPairNoInit<StringRef, Second> & p)
+    { p.key.size = 0; }
+}
+
 std::ostream & operator<<(std::ostream & os, const StringRef & str);
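The PackedZeroTraits helpers added above follow the same pattern as ZeroTraits: a table can treat a zero-sized key as the "empty cell" marker instead of storing a separate flag. A self-contained sketch of that idea follows; the pair type, the simplified StringRef, and the loop below are stand-ins for illustration only, not the real ClickHouse containers.

```cpp
#include <cstddef>
#include <string_view>
#include <vector>

// Simplified stand-ins, just to show the pattern.
struct StringRef { const char * data = nullptr; std::size_t size = 0; };

template <typename First, typename Second>
struct PackedPairNoInit { First key; Second value; };

namespace PackedZeroTraits
{
    template <typename Second, template <typename, typename> class Pair>
    bool check(const Pair<StringRef, Second> p) { return 0 == p.key.size; }

    template <typename Second, template <typename, typename> class Pair>
    void set(Pair<StringRef, Second> & p) { p.key.size = 0; }
}

int main()
{
    // Mark every cell empty by zeroing the key size...
    std::vector<PackedPairNoInit<StringRef, int>> cells(8);
    for (auto & cell : cells)
        PackedZeroTraits::set(cell);

    // ...then recognize occupied cells without any extra "occupied" flag.
    std::string_view key = "hello";
    cells[1].key = {key.data(), key.size()};
    cells[1].value = 42;

    std::size_t occupied = 0;
    for (const auto & cell : cells)
        if (!PackedZeroTraits::check(cell))
            ++occupied;
    return occupied == 1 ? 0 : 1;
}
```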
@@ -314,7 +314,14 @@ struct integer<Bits, Signed>::_impl

     const T alpha = t / static_cast<T>(max_int);

-    if (alpha <= static_cast<T>(max_int))
+    /** Here we have to use strict comparison.
+      * The max_int is 2^64 - 1.
+      * When casted to floating point type, it will be rounded to the closest representable number,
+      * which is 2^64.
+      * But 2^64 is not representable in uint64_t,
+      * so the maximum representable number will be strictly less.
+      */
+    if (alpha < static_cast<T>(max_int))
         self = static_cast<uint64_t>(alpha);
     else // max(double) / 2^64 will surely contain less than 52 precision bits, so speed up computations.
         set_multiplier<double>(self, static_cast<double>(alpha));
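The comment added above can be checked with a few lines of standalone C++: 2^64 - 1 is not representable in a double (52-bit mantissa), so the cast rounds up to exactly 2^64, and only a strict `<` guarantees that the later cast back to `uint64_t` stays in range. This is a demonstration, not part of the patch.

```cpp
#include <cstdint>
#include <cstdio>
#include <cmath>
#include <limits>

int main()
{
    constexpr std::uint64_t max_int = std::numeric_limits<std::uint64_t>::max(); // 2^64 - 1
    const double as_double = static_cast<double>(max_int);

    // A double has a 52-bit mantissa, so 2^64 - 1 rounds up to exactly 2^64.
    std::printf("2^64        = %.1f\n", std::ldexp(1.0, 64));
    std::printf("(double)max = %.1f\n", as_double);

    // With '<=', a value equal to (double)max_int would take the direct-cast branch
    // even though it is 2^64, and converting that back to uint64_t is out of range
    // (undefined behavior); the strict '<' avoids it.
    const double alpha = as_double;
    if (alpha < static_cast<double>(max_int))
        std::printf("strict <: safe to cast directly\n");
    else
        std::printf("strict <: fall back to the multiplier path\n");
}
```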
@@ -53,7 +53,7 @@ float logf(float x)
     tmp = ix - OFF;
     i = (tmp >> (23 - LOGF_TABLE_BITS)) % N;
     k = (int32_t)tmp >> 23; /* arithmetic shift */
-    iz = ix - (tmp & 0x1ff << 23);
+    iz = ix - (tmp & 0xff800000);
     invc = T[i].invc;
     logc = T[i].logc;
     z = (double_t)asfloat(iz);
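The mask change above looks behavioral but the numeric value is the same: `<<` binds tighter than `&`, so `tmp & 0x1ff << 23` already meant `tmp & (0x1ff << 23)`, and `0x1ff << 23` is exactly `0xff800000`. Writing the constant out removes the precedence puzzle and also sidesteps shifting the signed int constant `0x1ff` past what fits in a 32-bit int. A quick check, independent of the patch:

```cpp
#include <cstdint>
#include <cstdio>

int main()
{
    // Shift done in unsigned arithmetic to keep it well defined.
    const std::uint32_t shifted = UINT32_C(0x1ff) << 23;
    std::printf("0x1ff << 23 = 0x%08x\n", shifted);               // 0xff800000
    std::printf("literal     = 0x%08x\n", UINT32_C(0xff800000));  // same mask
    return shifted == UINT32_C(0xff800000) ? 0 : 1;
}
```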
@@ -5,11 +5,11 @@ if (NOT TARGET check)
     if (CMAKE_CONFIGURATION_TYPES)
         add_custom_target (check COMMAND ${CMAKE_CTEST_COMMAND}
             --force-new-ctest-process --output-on-failure --build-config "$<CONFIGURATION>"
-            WORKING_DIRECTORY ${CMAKE_BINARY_DIR})
+            WORKING_DIRECTORY ${PROJECT_BINARY_DIR})
     else ()
         add_custom_target (check COMMAND ${CMAKE_CTEST_COMMAND}
             --force-new-ctest-process --output-on-failure
-            WORKING_DIRECTORY ${CMAKE_BINARY_DIR})
+            WORKING_DIRECTORY ${PROJECT_BINARY_DIR})
     endif ()
 endif ()
@@ -5,14 +5,14 @@ if (Git_FOUND)
     # Commit hash + whether the building workspace was dirty or not
     execute_process(COMMAND
             "${GIT_EXECUTABLE}" rev-parse HEAD
-            WORKING_DIRECTORY ${CMAKE_SOURCE_DIR}
+            WORKING_DIRECTORY ${PROJECT_SOURCE_DIR}
             OUTPUT_VARIABLE GIT_HASH
             ERROR_QUIET OUTPUT_STRIP_TRAILING_WHITESPACE)

     # Branch name
     execute_process(COMMAND
             "${GIT_EXECUTABLE}" rev-parse --abbrev-ref HEAD
-            WORKING_DIRECTORY ${CMAKE_SOURCE_DIR}
+            WORKING_DIRECTORY ${PROJECT_SOURCE_DIR}
             OUTPUT_VARIABLE GIT_BRANCH
             ERROR_QUIET OUTPUT_STRIP_TRAILING_WHITESPACE)

@@ -20,14 +20,14 @@ if (Git_FOUND)
     SET(ENV{TZ} "UTC")
     execute_process(COMMAND
             "${GIT_EXECUTABLE}" log -1 --format=%ad --date=iso-local
-            WORKING_DIRECTORY ${CMAKE_SOURCE_DIR}
+            WORKING_DIRECTORY ${PROJECT_SOURCE_DIR}
             OUTPUT_VARIABLE GIT_DATE
             ERROR_QUIET OUTPUT_STRIP_TRAILING_WHITESPACE)

     # Subject of the commit
     execute_process(COMMAND
             "${GIT_EXECUTABLE}" log -1 --format=%s
-            WORKING_DIRECTORY ${CMAKE_SOURCE_DIR}
+            WORKING_DIRECTORY ${PROJECT_SOURCE_DIR}
             OUTPUT_VARIABLE GIT_COMMIT_SUBJECT
             ERROR_QUIET OUTPUT_STRIP_TRAILING_WHITESPACE)

@@ -35,7 +35,7 @@ if (Git_FOUND)

     execute_process(
         COMMAND ${GIT_EXECUTABLE} status
-        WORKING_DIRECTORY ${CMAKE_SOURCE_DIR} OUTPUT_STRIP_TRAILING_WHITESPACE)
+        WORKING_DIRECTORY ${PROJECT_SOURCE_DIR} OUTPUT_STRIP_TRAILING_WHITESPACE)
 else()
     message(STATUS "Git could not be found.")
 endif()
@@ -21,7 +21,7 @@ set (CMAKE_C_FLAGS "${CMAKE_C_FLAGS} --gcc-toolchain=${TOOLCHAIN_PATH}")
 set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} --gcc-toolchain=${TOOLCHAIN_PATH}")
 set (CMAKE_ASM_FLAGS "${CMAKE_ASM_FLAGS} --gcc-toolchain=${TOOLCHAIN_PATH}")

-set (CMAKE_EXE_LINKER_FLAGS_INIT "-fuse-ld=bfd")
+set (CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} -fuse-ld=bfd")

 # Currently, lld does not work with the error:
 # ld.lld: error: section size decrease is too large
@@ -7,6 +7,6 @@ message (STATUS "compiler CXX = ${CMAKE_CXX_COMPILER} ${FULL_CXX_FLAGS}")
 message (STATUS "LINKER_FLAGS = ${FULL_EXE_LINKER_FLAGS}")

 # Reproducible builds
-string (REPLACE "${CMAKE_SOURCE_DIR}" "." FULL_C_FLAGS_NORMALIZED "${FULL_C_FLAGS}")
-string (REPLACE "${CMAKE_SOURCE_DIR}" "." FULL_CXX_FLAGS_NORMALIZED "${FULL_CXX_FLAGS}")
-string (REPLACE "${CMAKE_SOURCE_DIR}" "." FULL_EXE_LINKER_FLAGS_NORMALIZED "${FULL_EXE_LINKER_FLAGS}")
+string (REPLACE "${PROJECT_SOURCE_DIR}" "." FULL_C_FLAGS_NORMALIZED "${FULL_C_FLAGS}")
+string (REPLACE "${PROJECT_SOURCE_DIR}" "." FULL_CXX_FLAGS_NORMALIZED "${FULL_CXX_FLAGS}")
+string (REPLACE "${PROJECT_SOURCE_DIR}" "." FULL_EXE_LINKER_FLAGS_NORMALIZED "${FULL_EXE_LINKER_FLAGS}")
@@ -8,11 +8,21 @@ option (SANITIZE "Enable one of the code sanitizers" "")

 set (SAN_FLAGS "${SAN_FLAGS} -g -fno-omit-frame-pointer -DSANITIZER")

+# It's possible to pass an ignore list to sanitizers (-fsanitize-ignorelist). Intentionally not doing this because
+# 1. out-of-source suppressions are awkward 2. it seems ignore lists don't work after the Clang v16 upgrade (#49829)
+
 if (SANITIZE)
     if (SANITIZE STREQUAL "address")
-        # LLVM-15 has a bug in Address Sanitizer, preventing the usage of 'sanitize-address-use-after-scope',
-        # see https://github.com/llvm/llvm-project/issues/58633
-        set (ASAN_FLAGS "-fsanitize=address -fno-sanitize-address-use-after-scope")
+        set (ASAN_FLAGS "-fsanitize=address -fsanitize-address-use-after-scope")
+        if (COMPILER_CLANG)
+            if (${CMAKE_CXX_COMPILER_VERSION} VERSION_GREATER_EQUAL 15 AND ${CMAKE_CXX_COMPILER_VERSION} VERSION_LESS 16)
+                # LLVM-15 has a bug in Address Sanitizer, preventing the usage
+                # of 'sanitize-address-use-after-scope', see [1].
+                #
+                # [1]: https://github.com/llvm/llvm-project/issues/58633
+                set (ASAN_FLAGS "${ASAN_FLAGS} -fno-sanitize-address-use-after-scope")
+            endif()
+        endif()
         set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${SAN_FLAGS} ${ASAN_FLAGS}")
         set (CMAKE_C_FLAGS "${CMAKE_C_FLAGS} ${SAN_FLAGS} ${ASAN_FLAGS}")
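For context on the `-fsanitize-address-use-after-scope` flag enabled above (and only disabled for clang 15): it makes AddressSanitizer poison stack variables when their scope ends, so reads through a dangling pointer like the one below are reported. A minimal reproducer, assuming a Clang or GCC build with `-fsanitize=address`; it is not part of the patch.

```cpp
#include <cstdio>

int main()
{
    int * dangling = nullptr;
    {
        int local = 42;     // lives only inside this block
        dangling = &local;
    }
    // 'local' is out of scope here; with use-after-scope checking enabled,
    // AddressSanitizer reports this read as stack-use-after-scope instead of
    // silently returning garbage.
    std::printf("%d\n", *dangling);
    return 0;
}
```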
@@ -22,14 +32,14 @@ if (SANITIZE)

         # Linking can fail due to relocation overflows (see #49145), caused by too big object files / libraries.
         # Work around this with position-independent builds (-fPIC and -fpie), this is slightly slower than non-PIC/PIE but that's okay.
-        set (MSAN_FLAGS "-fsanitize=memory -fsanitize-memory-use-after-dtor -fsanitize-memory-track-origins -fno-optimize-sibling-calls -fPIC -fpie -fsanitize-blacklist=${CMAKE_SOURCE_DIR}/tests/msan_suppressions.txt")
+        set (MSAN_FLAGS "-fsanitize=memory -fsanitize-memory-use-after-dtor -fsanitize-memory-track-origins -fno-optimize-sibling-calls -fPIC -fpie")
         set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${SAN_FLAGS} ${MSAN_FLAGS}")
         set (CMAKE_C_FLAGS "${CMAKE_C_FLAGS} ${SAN_FLAGS} ${MSAN_FLAGS}")

     elseif (SANITIZE STREQUAL "thread")
         set (TSAN_FLAGS "-fsanitize=thread")
         if (COMPILER_CLANG)
-            set (TSAN_FLAGS "${TSAN_FLAGS} -fsanitize-blacklist=${CMAKE_SOURCE_DIR}/tests/tsan_suppressions.txt")
+            set (TSAN_FLAGS "${TSAN_FLAGS} -fsanitize-blacklist=${PROJECT_SOURCE_DIR}/tests/tsan_suppressions.txt")
         endif()

         set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${SAN_FLAGS} ${TSAN_FLAGS}")
@@ -47,7 +57,7 @@ if (SANITIZE)
         set(UBSAN_FLAGS "${UBSAN_FLAGS} -fno-sanitize=unsigned-integer-overflow")
     endif()
     if (COMPILER_CLANG)
-        set (UBSAN_FLAGS "${UBSAN_FLAGS} -fsanitize-blacklist=${CMAKE_SOURCE_DIR}/tests/ubsan_suppressions.txt")
+        set (UBSAN_FLAGS "${UBSAN_FLAGS} -fsanitize-blacklist=${PROJECT_SOURCE_DIR}/tests/ubsan_suppressions.txt")
     endif()

     set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${SAN_FLAGS} ${UBSAN_FLAGS}")
@@ -70,13 +70,15 @@ if (LINKER_NAME)
     if (NOT LLD_PATH)
         message (FATAL_ERROR "Using linker ${LINKER_NAME} but can't find its path.")
     endif ()
-    if (COMPILER_CLANG)
-        # This a temporary quirk to emit .debug_aranges with ThinLTO, can be removed after upgrade to clang-16
+    # This a temporary quirk to emit .debug_aranges with ThinLTO, it is only the case clang/llvm <16
+    if (COMPILER_CLANG AND CMAKE_CXX_COMPILER_VERSION VERSION_LESS 16)
         set (LLD_WRAPPER "${CMAKE_CURRENT_BINARY_DIR}/ld.lld")
         configure_file ("${CMAKE_CURRENT_SOURCE_DIR}/cmake/ld.lld.in" "${LLD_WRAPPER}" @ONLY)

         set (CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} --ld-path=${LLD_WRAPPER}")
-    endif ()
+    else ()
+        set (CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} --ld-path=${LLD_PATH}")
+    endif()

 endif ()
@@ -1,4 +1,4 @@
-include(${CMAKE_SOURCE_DIR}/cmake/autogenerated_versions.txt)
+include(${PROJECT_SOURCE_DIR}/cmake/autogenerated_versions.txt)

 set(VERSION_EXTRA "" CACHE STRING "")
 set(VERSION_TWEAK "" CACHE STRING "")
contrib/CMakeLists.txt (15 lines changed)

@@ -105,6 +105,7 @@ add_contrib (libfarmhash)
 add_contrib (icu-cmake icu)
 add_contrib (h3-cmake h3)
 add_contrib (mariadb-connector-c-cmake mariadb-connector-c)
+add_contrib (libfiu-cmake libfiu)

 if (ENABLE_TESTS)
     add_contrib (googletest-cmake googletest)
@@ -177,7 +178,19 @@ endif()
 add_contrib (sqlite-cmake sqlite-amalgamation)
 add_contrib (s2geometry-cmake s2geometry)
 add_contrib (c-ares-cmake c-ares)
-add_contrib (qpl-cmake qpl)
+
+if (OS_LINUX AND ARCH_AMD64 AND ENABLE_SSE42)
+    option (ENABLE_QPL "Enable Intel® Query Processing Library" ${ENABLE_LIBRARIES})
+elseif(ENABLE_QPL)
+    message (${RECONFIGURE_MESSAGE_LEVEL} "QPL library is only supported on x86_64 arch with SSE 4.2 or higher")
+endif()
+if (ENABLE_QPL)
+    add_contrib (idxd-config-cmake idxd-config)
+    add_contrib (qpl-cmake qpl) # requires: idxd-config
+else()
+    message(STATUS "Not using QPL")
+endif ()
+
 add_contrib (morton-nd-cmake morton-nd)
 if (ARCH_S390X)
     add_contrib(crc32-s390x-cmake crc32-s390x)
@@ -6,7 +6,7 @@ if (NOT ENABLE_AVRO)
     return()
 endif()

-set(AVROCPP_ROOT_DIR "${CMAKE_SOURCE_DIR}/contrib/avro/lang/c++")
+set(AVROCPP_ROOT_DIR "${PROJECT_SOURCE_DIR}/contrib/avro/lang/c++")
 set(AVROCPP_INCLUDE_DIR "${AVROCPP_ROOT_DIR}/api")
 set(AVROCPP_SOURCE_DIR "${AVROCPP_ROOT_DIR}/impl")
contrib/aws (submodule)
@@ -1 +1 @@
-Subproject commit ecccfc026a42b30023289410a67024d561f4bf3e
+Subproject commit ca02358dcc7ce3ab733dd4cbcc32734eecfa4ee3

contrib/aws-c-auth (submodule)
@@ -1 +1 @@
-Subproject commit 30df6c407e2df43bd244e2c34c9b4a4b87372bfb
+Subproject commit 97133a2b5dbca1ccdf88cd6f44f39d0531d27d12

contrib/aws-c-common (submodule)
@@ -1 +1 @@
-Subproject commit 324fd1d973ccb25c813aa747bf1759cfde5121c5
+Subproject commit 45dcb2849c891dba2100b270b4676765c92949ff

contrib/aws-c-event-stream (submodule)
@@ -1 +1 @@
-Subproject commit 39bfa94a14b7126bf0c1330286ef8db452d87e66
+Subproject commit 2f9b60c42f90840ec11822acda3d8cdfa97a773d

contrib/aws-c-http (submodule)
@@ -1 +1 @@
-Subproject commit 2c5a2a7d5556600b9782ffa6c9d7e09964df1abc
+Subproject commit dd34461987947672444d0bc872c5a733dfdb9711

contrib/aws-c-io (submodule)
@@ -1 +1 @@
-Subproject commit 5d32c453560d0823df521a686bf7fbacde7f9be3
+Subproject commit d58ed4f272b1cb4f89ac9196526ceebe5f2b0d89

contrib/aws-c-mqtt (submodule)
@@ -1 +1 @@
-Subproject commit 882c689561a3db1466330ccfe3b63637e0a575d3
+Subproject commit 33c3455cec82b16feb940e12006cefd7b3ef4194

contrib/aws-c-s3 (submodule)
@@ -1 +1 @@
-Subproject commit a41255ece72a7c887bba7f9d998ca3e14f4c8a1b
+Subproject commit d7bfe602d6925948f1fff95784e3613cca6a3900

contrib/aws-c-sdkutils (submodule)
@@ -1 +1 @@
-Subproject commit 25bf5cf225f977c3accc6a05a0a7a181ef2a4a30
+Subproject commit 208a701fa01e99c7c8cc3dcebc8317da71362972

contrib/aws-checksums (submodule)
@@ -1 +1 @@
-Subproject commit 48e7c0e01479232f225c8044d76c84e74192889d
+Subproject commit ad53be196a25bbefa3700a01187fdce573a7d2d0
@@ -52,8 +52,8 @@ endif()

 # Directories.
 SET(AWS_SDK_DIR "${ClickHouse_SOURCE_DIR}/contrib/aws")
-SET(AWS_SDK_CORE_DIR "${AWS_SDK_DIR}/aws-cpp-sdk-core")
-SET(AWS_SDK_S3_DIR "${AWS_SDK_DIR}/aws-cpp-sdk-s3")
+SET(AWS_SDK_CORE_DIR "${AWS_SDK_DIR}/src/aws-cpp-sdk-core")
+SET(AWS_SDK_S3_DIR "${AWS_SDK_DIR}/generated/src/aws-cpp-sdk-s3")

 SET(AWS_AUTH_DIR "${ClickHouse_SOURCE_DIR}/contrib/aws-c-auth")
 SET(AWS_CAL_DIR "${ClickHouse_SOURCE_DIR}/contrib/aws-c-cal")
contrib/aws-crt-cpp (submodule)
@@ -1 +1 @@
-Subproject commit ec0bea288f451d884c0d80d534bc5c66241c39a4
+Subproject commit 8a301b7e842f1daed478090c869207300972379f

contrib/aws-s2n-tls (submodule)
@@ -1 +1 @@
-Subproject commit 0f1ba9e5c4a67cb3898de0c0b4f911d4194dc8de
+Subproject commit 71f4794b7580cf780eb4aca77d69eded5d3c7bb4
@@ -103,11 +103,19 @@ set (SRCS_CONTEXT
 )

 if (ARCH_AARCH64)
+    if (OS_DARWIN)
+        set (SRCS_CONTEXT ${SRCS_CONTEXT}
+            "${LIBRARY_DIR}/libs/context/src/asm/jump_arm64_aapcs_macho_gas.S"
+            "${LIBRARY_DIR}/libs/context/src/asm/make_arm64_aapcs_macho_gas.S"
+            "${LIBRARY_DIR}/libs/context/src/asm/ontop_arm64_aapcs_macho_gas.S"
+        )
+    else()
     set (SRCS_CONTEXT ${SRCS_CONTEXT}
         "${LIBRARY_DIR}/libs/context/src/asm/jump_arm64_aapcs_elf_gas.S"
         "${LIBRARY_DIR}/libs/context/src/asm/make_arm64_aapcs_elf_gas.S"
         "${LIBRARY_DIR}/libs/context/src/asm/ontop_arm64_aapcs_elf_gas.S"
     )
+    endif()
 elseif (ARCH_PPC64LE)
     set (SRCS_CONTEXT ${SRCS_CONTEXT}
         "${LIBRARY_DIR}/libs/context/src/asm/jump_ppc64_sysv_elf_gas.S"
@@ -111,6 +111,8 @@ elseif(${CMAKE_SYSTEM_PROCESSOR} STREQUAL "mips")
     set(ARCH "generic")
 elseif(${CMAKE_SYSTEM_PROCESSOR} STREQUAL "ppc64le")
     set(ARCH "ppc64le")
+elseif(${CMAKE_SYSTEM_PROCESSOR} STREQUAL "riscv64")
+    set(ARCH "riscv64")
 else()
     message(FATAL_ERROR "Unknown processor:" ${CMAKE_SYSTEM_PROCESSOR})
 endif()
@@ -18,7 +18,7 @@ endif()
 # Need to use C++17 since the compilation is not possible with C++20 currently.
 set (CMAKE_CXX_STANDARD 17)

-set(CASS_ROOT_DIR ${CMAKE_SOURCE_DIR}/contrib/cassandra)
+set(CASS_ROOT_DIR ${PROJECT_SOURCE_DIR}/contrib/cassandra)
 set(CASS_SRC_DIR "${CASS_ROOT_DIR}/src")
 set(CASS_INCLUDE_DIR "${CASS_ROOT_DIR}/include")
@@ -26,7 +26,7 @@ endif ()
 # StorageSystemTimeZones.generated.cpp is autogenerated each time during a build
 # data in this file will be used to populate the system.time_zones table, this is specific to OS_LINUX
 # as the library that's built using embedded tzdata is also specific to OS_LINUX
-set(SYSTEM_STORAGE_TZ_FILE "${CMAKE_BINARY_DIR}/src/Storages/System/StorageSystemTimeZones.generated.cpp")
+set(SYSTEM_STORAGE_TZ_FILE "${PROJECT_BINARY_DIR}/src/Storages/System/StorageSystemTimeZones.generated.cpp")
 # remove existing copies so that its generated fresh on each build.
 file(REMOVE ${SYSTEM_STORAGE_TZ_FILE})
@@ -1,15 +1,30 @@
-set (SRC_DIR "${ClickHouse_SOURCE_DIR}/contrib/googletest/googletest")
+set (SRC_DIR "${ClickHouse_SOURCE_DIR}/contrib/googletest")

-add_library(_gtest "${SRC_DIR}/src/gtest-all.cc")
+add_library(_gtest "${SRC_DIR}/googletest/src/gtest-all.cc")
 set_target_properties(_gtest PROPERTIES VERSION "1.0.0")
 target_compile_definitions (_gtest PUBLIC GTEST_HAS_POSIX_RE=0)
-target_include_directories(_gtest SYSTEM PUBLIC "${SRC_DIR}/include")
-target_include_directories(_gtest PRIVATE "${SRC_DIR}")
+target_include_directories(_gtest SYSTEM PUBLIC "${SRC_DIR}/googletest/include")
+target_include_directories(_gtest PRIVATE "${SRC_DIR}/googletest")

-add_library(_gtest_main "${SRC_DIR}/src/gtest_main.cc")
+add_library(_gtest_main "${SRC_DIR}/googletest/src/gtest_main.cc")
 set_target_properties(_gtest_main PROPERTIES VERSION "1.0.0")
 target_link_libraries(_gtest_main PUBLIC _gtest)

 add_library(_gtest_all INTERFACE)
 target_link_libraries(_gtest_all INTERFACE _gtest _gtest_main)
 add_library(ch_contrib::gtest_all ALIAS _gtest_all)
+
+
+add_library(_gmock "${SRC_DIR}/googlemock/src/gmock-all.cc")
+set_target_properties(_gmock PROPERTIES VERSION "1.0.0")
+target_compile_definitions (_gmock PUBLIC GTEST_HAS_POSIX_RE=0)
+target_include_directories(_gmock SYSTEM PUBLIC "${SRC_DIR}/googlemock/include" "${SRC_DIR}/googletest/include")
+target_include_directories(_gmock PRIVATE "${SRC_DIR}/googlemock")
+
+add_library(_gmock_main "${SRC_DIR}/googlemock/src/gmock_main.cc")
+set_target_properties(_gmock_main PROPERTIES VERSION "1.0.0")
+target_link_libraries(_gmock_main PUBLIC _gmock)
+
+add_library(_gmock_all INTERFACE)
+target_link_libraries(_gmock_all INTERFACE _gmock _gmock_main)
+add_library(ch_contrib::gmock_all ALIAS _gmock_all)
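Since the block above wires GoogleMock into the build for the first time (the new `ch_contrib::gmock_all` target), a tiny usage sketch may help readers who only know gtest. The interface and test below are hypothetical and only illustrate the MOCK_METHOD / EXPECT_CALL pattern these targets enable.

```cpp
#include <gmock/gmock.h>
#include <gtest/gtest.h>

// Hypothetical interface and mock, not taken from the ClickHouse sources.
struct IStorage
{
    virtual ~IStorage() = default;
    virtual int read(int key) = 0;
};

struct MockStorage : IStorage
{
    MOCK_METHOD(int, read, (int key), (override));
};

TEST(MockStorageExample, ReturnsConfiguredValue)
{
    MockStorage storage;
    EXPECT_CALL(storage, read(42)).WillOnce(testing::Return(7));
    EXPECT_EQ(storage.read(42), 7);
}

// gmock_main (linked via ch_contrib::gmock_all) provides main(), so none is needed here.
```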
contrib/idxd-config-cmake/CMakeLists.txt (new file, 23 lines)

@@ -0,0 +1,23 @@
+## accel_config is the utility library required by QPL-Deflate codec for controlling and configuring Intel® In-Memory Analytics Accelerator (Intel® IAA).
+set (LIBACCEL_SOURCE_DIR "${ClickHouse_SOURCE_DIR}/contrib/idxd-config")
+set (UUID_DIR "${ClickHouse_SOURCE_DIR}/contrib/qpl-cmake")
+set (LIBACCEL_HEADER_DIR "${ClickHouse_SOURCE_DIR}/contrib/idxd-config-cmake/include")
+set (SRCS
+    "${LIBACCEL_SOURCE_DIR}/accfg/lib/libaccfg.c"
+    "${LIBACCEL_SOURCE_DIR}/util/log.c"
+    "${LIBACCEL_SOURCE_DIR}/util/sysfs.c"
+)
+
+add_library(_accel-config ${SRCS})
+
+target_compile_options(_accel-config PRIVATE "-D_GNU_SOURCE")
+
+target_include_directories(_accel-config BEFORE
+    PRIVATE ${UUID_DIR}
+    PRIVATE ${LIBACCEL_HEADER_DIR}
+    PRIVATE ${LIBACCEL_SOURCE_DIR})
+
+target_include_directories(_accel-config SYSTEM BEFORE
+    PUBLIC ${LIBACCEL_SOURCE_DIR}/accfg)
+
+add_library(ch_contrib::accel-config ALIAS _accel-config)
contrib/libfiu (new submodule)
@@ -0,0 +1 @@
+Subproject commit b85edbde4cf974b1b40d27828a56f0505f4e2ee5
contrib/libfiu-cmake/CMakeLists.txt (new file, 20 lines)

@@ -0,0 +1,20 @@
+if (NOT ENABLE_FIU)
+    message (STATUS "Not using fiu")
+    return ()
+endif ()
+
+set(FIU_DIR "${ClickHouse_SOURCE_DIR}/contrib/libfiu/")
+
+set(FIU_SOURCES
+    ${FIU_DIR}/libfiu/fiu.c
+    ${FIU_DIR}/libfiu/fiu-rc.c
+    ${FIU_DIR}/libfiu/backtrace.c
+    ${FIU_DIR}/libfiu/wtable.c
+)
+
+set(FIU_HEADERS "${FIU_DIR}/libfiu")
+
+add_library(_fiu ${FIU_SOURCES})
+target_compile_definitions(_fiu PUBLIC DUMMY_BACKTRACE)
+target_include_directories(_fiu PUBLIC ${FIU_HEADERS})
+add_library(ch_contrib::fiu ALIAS _fiu)
contrib/libpqxx (submodule)
@@ -1 +1 @@
-Subproject commit a4e834839270a8c1f7ff1db351ba85afced3f0e2
+Subproject commit bdd6540fb95ff56c813691ceb5da5a3266cf235d
@@ -1,7 +1,7 @@
 # This file is a modified version of contrib/libuv/CMakeLists.txt

-set (SOURCE_DIR "${CMAKE_SOURCE_DIR}/contrib/libuv")
-set (BINARY_DIR "${CMAKE_BINARY_DIR}/contrib/libuv")
+set (SOURCE_DIR "${PROJECT_SOURCE_DIR}/contrib/libuv")
+set (BINARY_DIR "${PROJECT_BINARY_DIR}/contrib/libuv")

 set(uv_sources
     src/fs-poll.c
@@ -15,7 +15,7 @@ endif()

 # This is the LGPL libmariadb project.

-set(CC_SOURCE_DIR ${CMAKE_SOURCE_DIR}/contrib/mariadb-connector-c)
+set(CC_SOURCE_DIR ${PROJECT_SOURCE_DIR}/contrib/mariadb-connector-c)
 set(CC_BINARY_DIR ${CMAKE_CURRENT_BINARY_DIR})

 set(WITH_SSL ON)
contrib/qpl (submodule)
@@ -1 +1 @@
-Subproject commit 0bce2b03423f6fbeb8bce66cc8be0bf558058848
+Subproject commit 3f8f5cea27739f5261e8fd577dc233ffe88bf679
@@ -1,36 +1,5 @@
 ## The Intel® QPL provides high performance implementations of data processing functions for existing hardware accelerator, and/or software path in case if hardware accelerator is not available.
-if (OS_LINUX AND ARCH_AMD64 AND (ENABLE_AVX2 OR ENABLE_AVX512))
-    option (ENABLE_QPL "Enable Intel® Query Processing Library" ${ENABLE_LIBRARIES})
-elseif(ENABLE_QPL)
-    message (${RECONFIGURE_MESSAGE_LEVEL} "QPL library is only supported on x86_64 arch with avx2/avx512 support")
-endif()
-
-if (NOT ENABLE_QPL)
-    message(STATUS "Not using QPL")
-    return()
-endif()
-
-## QPL has build dependency on libaccel-config. Here is to build libaccel-config which is required by QPL.
-## libaccel-config is the utility library for controlling and configuring Intel® In-Memory Analytics Accelerator (Intel® IAA).
-set (LIBACCEL_SOURCE_DIR "${ClickHouse_SOURCE_DIR}/contrib/idxd-config")
 set (UUID_DIR "${ClickHouse_SOURCE_DIR}/contrib/qpl-cmake")
-set (LIBACCEL_HEADER_DIR "${ClickHouse_SOURCE_DIR}/contrib/qpl-cmake/idxd-header")
-set (SRCS
-    "${LIBACCEL_SOURCE_DIR}/accfg/lib/libaccfg.c"
-    "${LIBACCEL_SOURCE_DIR}/util/log.c"
-    "${LIBACCEL_SOURCE_DIR}/util/sysfs.c"
-)
-
-add_library(accel-config ${SRCS})
-
-target_compile_options(accel-config PRIVATE "-D_GNU_SOURCE")
-
-target_include_directories(accel-config BEFORE
-    PRIVATE ${UUID_DIR}
-    PRIVATE ${LIBACCEL_HEADER_DIR}
-    PRIVATE ${LIBACCEL_SOURCE_DIR})
-
-## QPL build start here.
 set (QPL_PROJECT_DIR "${ClickHouse_SOURCE_DIR}/contrib/qpl")
 set (QPL_SRC_DIR "${ClickHouse_SOURCE_DIR}/contrib/qpl/sources")
 set (QPL_BINARY_DIR "${ClickHouse_BINARY_DIR}/build/contrib/qpl")
@@ -53,8 +22,8 @@ GetLibraryVersion("${HEADER_CONTENT}" QPL_VERSION)
 message(STATUS "Intel QPL version: ${QPL_VERSION}")

 # There are 5 source subdirectories under $QPL_SRC_DIR: isal, c_api, core-sw, middle-layer, c_api.
-# Generate 7 library targets: middle_layer_lib, isal, isal_asm, qplcore_px, qplcore_avx512, core_iaa, middle_layer_lib.
-# Output ch_contrib::qpl by linking with 7 library targets.
+# Generate 8 library targets: middle_layer_lib, isal, isal_asm, qplcore_px, qplcore_avx512, qplcore_sw_dispatcher, core_iaa, middle_layer_lib.
+# Output ch_contrib::qpl by linking with 8 library targets.

 include("${QPL_PROJECT_DIR}/cmake/CompileOptions.cmake")

@@ -119,31 +88,36 @@ set(ISAL_ASM_SRC ${QPL_SRC_DIR}/isal/igzip/igzip_body.asm
 add_library(isal OBJECT ${ISAL_C_SRC})
 add_library(isal_asm OBJECT ${ISAL_ASM_SRC})

+set_property(GLOBAL APPEND PROPERTY QPL_LIB_DEPS
+    $<TARGET_OBJECTS:isal>)
+
+set_property(GLOBAL APPEND PROPERTY QPL_LIB_DEPS
+    $<TARGET_OBJECTS:isal_asm>)
+
 # Setting external and internal interfaces for ISA-L library
 target_include_directories(isal
     PUBLIC $<BUILD_INTERFACE:${QPL_SRC_DIR}/isal/include>
     PRIVATE ${QPL_SRC_DIR}/isal/include
     PUBLIC ${QPL_SRC_DIR}/isal/igzip)

+set_target_properties(isal PROPERTIES
+    CXX_STANDARD 11
+    C_STANDARD 99)
+
 target_compile_options(isal PRIVATE
     "$<$<C_COMPILER_ID:GNU>:${QPL_LINUX_TOOLCHAIN_REQUIRED_FLAGS}>"
     "$<$<CONFIG:Debug>:>"
     "$<$<CONFIG:Release>:>")

+# AS_FEATURE_LEVEL=10 means "Check SIMD capabilities of the target system at runtime and use up to AVX512 if available".
+# HAVE_KNOWS_AVX512 means rely on AVX512 being available on the target system.
 target_compile_options(isal_asm PRIVATE "-I${QPL_SRC_DIR}/isal/include/"
     PRIVATE "-I${QPL_SRC_DIR}/isal/igzip/"
     PRIVATE "-I${QPL_SRC_DIR}/isal/crc/"
+    PRIVATE "-DHAVE_AS_KNOWS_AVX512"
+    PRIVATE "-DAS_FEATURE_LEVEL=10"
     PRIVATE "-DQPL_LIB")

-# AS_FEATURE_LEVEL=10 means "Check SIMD capabilities of the target system at runtime and use up to AVX512 if available".
-# AS_FEATURE_LEVEL=5 means "Check SIMD capabilities of the target system at runtime and use up to AVX2 if available".
-# HAVE_KNOWS_AVX512 means rely on AVX512 being available on the target system.
-if (ENABLE_AVX512)
-    target_compile_options(isal_asm PRIVATE "-DHAVE_AS_KNOWS_AVX512" "-DAS_FEATURE_LEVEL=10")
-else()
-    target_compile_options(isal_asm PRIVATE "-DAS_FEATURE_LEVEL=5")
-endif()
-
 # Here must remove "-fno-sanitize=undefined" from COMPILE_OPTIONS.
 # Otherwise nasm compiler would fail to proceed due to unrecognition of "-fno-sanitize=undefined"
 if (SANITIZE STREQUAL "undefined")
@@ -157,78 +131,97 @@ target_compile_definitions(isal PUBLIC
 NDEBUG)

 # [SUBDIR]core-sw
-# Two libraries:qplcore_avx512/qplcore_px for SW fallback will be created which are implemented by AVX512 and non-AVX512 instructions respectively.
+# Create set of libraries corresponding to supported platforms for SW fallback which are implemented by AVX512 and non-AVX512 instructions respectively.
 # The upper level QPL API will check SIMD capabilities of the target system at runtime and decide to call AVX512 function or non-AVX512 function.
-# Hence, here we don't need put qplcore_avx512 under an ENABLE_AVX512 CMake switch.
-# Actually, if we do that, some undefined symbols errors would happen because both of AVX512 function and non-AVX512 function are referenced by QPL API.
-# PLATFORM=2 means AVX512 implementation; PLATFORM=0 means non-AVX512 implementation.
+# Hence, here we don't need put ENABLE_AVX512 CMake switch.

-# Find Core Sources
-file(GLOB SOURCES
+get_list_of_supported_optimizations(PLATFORMS_LIST)
+
+foreach(PLATFORM_ID IN LISTS PLATFORMS_LIST)
+    # Find Core Sources
+    file(GLOB SOURCES
         ${QPL_SRC_DIR}/core-sw/src/checksums/*.c
         ${QPL_SRC_DIR}/core-sw/src/filtering/*.c
         ${QPL_SRC_DIR}/core-sw/src/other/*.c
         ${QPL_SRC_DIR}/core-sw/src/compression/*.c)

     file(GLOB DATA_SOURCES
         ${QPL_SRC_DIR}/core-sw/src/data/*.c)

-# Create avx512 library
-add_library(qplcore_avx512 OBJECT ${SOURCES})
+    # Create library
+    add_library(qplcore_${PLATFORM_ID} OBJECT ${SOURCES})

-target_compile_definitions(qplcore_avx512 PRIVATE PLATFORM=2)
+    set_property(GLOBAL APPEND PROPERTY QPL_LIB_DEPS
+        $<TARGET_OBJECTS:qplcore_${PLATFORM_ID}>)

-target_include_directories(qplcore_avx512
+    target_include_directories(qplcore_${PLATFORM_ID}
+        PUBLIC $<BUILD_INTERFACE:${QPL_SRC_DIR}/core-sw>
         PUBLIC $<BUILD_INTERFACE:${QPL_SRC_DIR}/core-sw/include>
         PUBLIC $<BUILD_INTERFACE:${QPL_SRC_DIR}/core-sw/src/include>
         PUBLIC $<BUILD_INTERFACE:${QPL_SRC_DIR}/core-sw/src/compression/include>
         PRIVATE $<TARGET_PROPERTY:isal,INTERFACE_INCLUDE_DIRECTORIES>)

-set_target_properties(qplcore_avx512 PROPERTIES
+    set_target_properties(qplcore_${PLATFORM_ID} PROPERTIES
         $<$<C_COMPILER_ID:GNU>:C_STANDARD 17>)

-target_link_libraries(qplcore_avx512
-    PRIVATE isal
-    PRIVATE ${CMAKE_DL_LIBS})
-
-target_compile_options(qplcore_avx512
-    PRIVATE ${QPL_LINUX_TOOLCHAIN_REQUIRED_FLAGS}
-    PRIVATE -march=skylake-avx512
-    PRIVATE "$<$<CONFIG:Debug>:>"
-    PRIVATE "$<$<CONFIG:Release>:-O3;-D_FORTIFY_SOURCE=2>")
-
-target_compile_definitions(qplcore_avx512 PUBLIC QPL_BADARG_CHECK)
-
-#
-# Create px library
-#
-#set(CMAKE_INCLUDE_CURRENT_DIR ON)
-
-# Create library
-add_library(qplcore_px OBJECT ${SOURCES} ${DATA_SOURCES})
-
-target_compile_definitions(qplcore_px PRIVATE PLATFORM=0)
-
-target_include_directories(qplcore_px
-    PUBLIC $<BUILD_INTERFACE:${QPL_SRC_DIR}/core-sw/include>
-    PUBLIC $<BUILD_INTERFACE:${QPL_SRC_DIR}/core-sw/src/include>
-    PUBLIC $<BUILD_INTERFACE:${QPL_SRC_DIR}/core-sw/src/compression/include>
-    PRIVATE $<TARGET_PROPERTY:isal,INTERFACE_INCLUDE_DIRECTORIES>)
-
-set_target_properties(qplcore_px PROPERTIES
-    $<$<C_COMPILER_ID:GNU>:C_STANDARD 17>)
-
-target_link_libraries(qplcore_px
-    PRIVATE isal
-    PRIVATE ${CMAKE_DL_LIBS})
-
-target_compile_options(qplcore_px
+    target_compile_options(qplcore_${PLATFORM_ID}
     PRIVATE ${QPL_LINUX_TOOLCHAIN_REQUIRED_FLAGS}
PRIVATE "$<$<CONFIG:Debug>:>"
|
PRIVATE "$<$<CONFIG:Debug>:>"
|
||||||
PRIVATE "$<$<CONFIG:Release>:-O3;-D_FORTIFY_SOURCE=2>")
|
PRIVATE "$<$<CONFIG:Release>:-O3;-D_FORTIFY_SOURCE=2>")
|
||||||
|
|
||||||
target_compile_definitions(qplcore_px PUBLIC QPL_BADARG_CHECK)
|
# Set specific compiler options and/or definitions based on a platform
|
||||||
|
if (${PLATFORM_ID} MATCHES "avx512")
|
||||||
|
target_compile_definitions(qplcore_${PLATFORM_ID} PRIVATE PLATFORM=2)
|
||||||
|
target_compile_options(qplcore_${PLATFORM_ID} PRIVATE -march=skylake-avx512)
|
||||||
|
else() # Create default px library
|
||||||
|
target_compile_definitions(qplcore_${PLATFORM_ID} PRIVATE PLATFORM=0)
|
||||||
|
endif()
|
||||||
|
|
||||||
|
target_link_libraries(qplcore_${PLATFORM_ID} isal)
|
||||||
|
endforeach()
|
||||||
|
|
||||||
|
#
|
||||||
|
# Create dispatcher between platforms and auto-generated wrappers
|
||||||
|
#
|
||||||
|
file(GLOB SW_DISPATCHER_SOURCES ${QPL_SRC_DIR}/core-sw/dispatcher/*.cpp)
|
||||||
|
|
||||||
|
add_library(qplcore_sw_dispatcher OBJECT ${SW_DISPATCHER_SOURCES})
|
||||||
|
|
||||||
|
set_property(GLOBAL APPEND PROPERTY QPL_LIB_DEPS
|
||||||
|
$<TARGET_OBJECTS:qplcore_sw_dispatcher>)
|
||||||
|
|
||||||
|
target_include_directories(qplcore_sw_dispatcher
|
||||||
|
PUBLIC $<BUILD_INTERFACE:${QPL_SRC_DIR}/core-sw/dispatcher>)
|
||||||
|
|
||||||
|
# Generate kernel wrappers
|
||||||
|
generate_unpack_kernel_arrays(${QPL_BINARY_DIR} "${PLATFORMS_LIST}")
|
||||||
|
|
||||||
|
foreach(PLATFORM_ID IN LISTS PLATFORMS_LIST)
|
||||||
|
file(GLOB GENERATED_${PLATFORM_ID}_TABLES_SRC ${QPL_BINARY_DIR}/generated/${PLATFORM_ID}_*.cpp)
|
||||||
|
|
||||||
|
target_sources(qplcore_sw_dispatcher PRIVATE ${GENERATED_${PLATFORM_ID}_TABLES_SRC})
|
||||||
|
|
||||||
|
# Set specific compiler options and/or definitions based on a platform
|
||||||
|
if (${PLATFORM_ID} MATCHES "avx512")
|
||||||
|
set_source_files_properties(${GENERATED_${PLATFORM_ID}_TABLES_SRC} PROPERTIES COMPILE_DEFINITIONS PLATFORM=2)
|
||||||
|
else()
|
||||||
|
set_source_files_properties(${GENERATED_${PLATFORM_ID}_TABLES_SRC} PROPERTIES COMPILE_DEFINITIONS PLATFORM=0)
|
||||||
|
endif()
|
||||||
|
|
||||||
|
target_include_directories(qplcore_sw_dispatcher
|
||||||
|
PUBLIC $<TARGET_PROPERTY:qplcore_${PLATFORM_ID},INTERFACE_INCLUDE_DIRECTORIES>)
|
||||||
|
endforeach()
|
||||||
|
|
||||||
|
set_target_properties(qplcore_sw_dispatcher PROPERTIES CXX_STANDARD 17)
|
||||||
|
|
||||||
|
# w/a for build compatibility with ISAL codebase
|
||||||
|
target_compile_definitions(qplcore_sw_dispatcher PUBLIC -DQPL_LIB)
|
||||||
|
|
||||||
|
target_compile_options(qplcore_sw_dispatcher
|
||||||
|
PRIVATE $<$<C_COMPILER_ID:GNU>:${QPL_LINUX_TOOLCHAIN_REQUIRED_FLAGS};
|
||||||
|
${QPL_LINUX_TOOLCHAIN_DYNAMIC_LIBRARY_FLAGS};
|
||||||
|
$<$<CONFIG:Release>:-O3;-D_FORTIFY_SOURCE=2>>
|
||||||
|
PRIVATE $<$<COMPILE_LANG_AND_ID:CXX,GNU>:${QPL_LINUX_TOOLCHAIN_CPP_EMBEDDED_FLAGS}>)
|
||||||
|
|
||||||
# [SUBDIR]core-iaa
|
# [SUBDIR]core-iaa
|
||||||
file(GLOB HW_PATH_SRC ${QPL_SRC_DIR}/core-iaa/sources/aecs/*.c
|
file(GLOB HW_PATH_SRC ${QPL_SRC_DIR}/core-iaa/sources/aecs/*.c
|
||||||
@ -242,13 +235,20 @@ file(GLOB HW_PATH_SRC ${QPL_SRC_DIR}/core-iaa/sources/aecs/*.c
|
|||||||
# Create library
|
# Create library
|
||||||
add_library(core_iaa OBJECT ${HW_PATH_SRC})
|
add_library(core_iaa OBJECT ${HW_PATH_SRC})
|
||||||
|
|
||||||
|
set_property(GLOBAL APPEND PROPERTY QPL_LIB_DEPS
|
||||||
|
$<TARGET_OBJECTS:core_iaa>)
|
||||||
|
|
||||||
target_include_directories(core_iaa
|
target_include_directories(core_iaa
|
||||||
PRIVATE ${UUID_DIR}
|
PRIVATE ${UUID_DIR}
|
||||||
PUBLIC $<BUILD_INTERFACE:${QPL_SRC_DIR}/core-iaa/include>
|
PUBLIC $<BUILD_INTERFACE:${QPL_SRC_DIR}/core-iaa/include>
|
||||||
PUBLIC $<BUILD_INTERFACE:${QPL_SRC_DIR}/core-iaa/sources/include>
|
PUBLIC $<BUILD_INTERFACE:${QPL_SRC_DIR}/core-iaa/sources/include>
|
||||||
PRIVATE $<BUILD_INTERFACE:${QPL_PROJECT_DIR}/include> # status.h in own_checkers.h
|
PRIVATE $<BUILD_INTERFACE:${QPL_PROJECT_DIR}/include> # status.h in own_checkers.h
|
||||||
PRIVATE $<BUILD_INTERFACE:${QPL_PROJECT_DIR}/sources/c_api> # own_checkers.h
|
PRIVATE $<BUILD_INTERFACE:${QPL_PROJECT_DIR}/sources/c_api> # own_checkers.h
|
||||||
PRIVATE $<TARGET_PROPERTY:qplcore_avx512,INTERFACE_INCLUDE_DIRECTORIES>)
|
PRIVATE $<TARGET_PROPERTY:qplcore_sw_dispatcher,INTERFACE_INCLUDE_DIRECTORIES>)
|
||||||
|
|
||||||
|
set_target_properties(core_iaa PROPERTIES
|
||||||
|
$<$<C_COMPILER_ID:GNU>:C_STANDARD 17>
|
||||||
|
CXX_STANDARD 17)
|
||||||
|
|
||||||
target_compile_options(core_iaa
|
target_compile_options(core_iaa
|
||||||
PRIVATE $<$<C_COMPILER_ID:GNU>:${QPL_LINUX_TOOLCHAIN_REQUIRED_FLAGS};
|
PRIVATE $<$<C_COMPILER_ID:GNU>:${QPL_LINUX_TOOLCHAIN_REQUIRED_FLAGS};
|
||||||
@ -258,11 +258,10 @@ target_compile_features(core_iaa PRIVATE c_std_11)
|
|||||||
|
|
||||||
target_compile_definitions(core_iaa PRIVATE QPL_BADARG_CHECK
|
target_compile_definitions(core_iaa PRIVATE QPL_BADARG_CHECK
|
||||||
PRIVATE $<$<BOOL:${BLOCK_ON_FAULT}>: BLOCK_ON_FAULT_ENABLED>
|
PRIVATE $<$<BOOL:${BLOCK_ON_FAULT}>: BLOCK_ON_FAULT_ENABLED>
|
||||||
PRIVATE $<$<BOOL:${LOG_HW_INIT}>:LOG_HW_INIT>)
|
PRIVATE $<$<BOOL:${LOG_HW_INIT}>:LOG_HW_INIT>
|
||||||
|
PRIVATE $<$<BOOL:${DYNAMIC_LOADING_LIBACCEL_CONFIG}>:DYNAMIC_LOADING_LIBACCEL_CONFIG>)
|
||||||
|
|
||||||
# [SUBDIR]middle-layer
|
# [SUBDIR]middle-layer
|
||||||
generate_unpack_kernel_arrays(${QPL_BINARY_DIR})
|
|
||||||
|
|
||||||
file(GLOB MIDDLE_LAYER_SRC
|
file(GLOB MIDDLE_LAYER_SRC
|
||||||
${QPL_SRC_DIR}/middle-layer/analytics/*.cpp
|
${QPL_SRC_DIR}/middle-layer/analytics/*.cpp
|
||||||
${QPL_SRC_DIR}/middle-layer/c_wrapper/*.cpp
|
${QPL_SRC_DIR}/middle-layer/c_wrapper/*.cpp
|
||||||
@ -277,14 +276,12 @@ file(GLOB MIDDLE_LAYER_SRC
|
|||||||
${QPL_SRC_DIR}/middle-layer/inflate/*.cpp
|
${QPL_SRC_DIR}/middle-layer/inflate/*.cpp
|
||||||
${QPL_SRC_DIR}/core-iaa/sources/accelerator/*.cpp) # todo
|
${QPL_SRC_DIR}/core-iaa/sources/accelerator/*.cpp) # todo
|
||||||
|
|
||||||
file(GLOB GENERATED_PX_TABLES_SRC ${QPL_BINARY_DIR}/generated/px_*.cpp)
|
|
||||||
file(GLOB GENERATED_AVX512_TABLES_SRC ${QPL_BINARY_DIR}/generated/avx512_*.cpp)
|
|
||||||
|
|
||||||
add_library(middle_layer_lib OBJECT
|
add_library(middle_layer_lib OBJECT
|
||||||
${GENERATED_PX_TABLES_SRC}
|
|
||||||
${GENERATED_AVX512_TABLES_SRC}
|
|
||||||
${MIDDLE_LAYER_SRC})
|
${MIDDLE_LAYER_SRC})
|
||||||
|
|
||||||
|
set_property(GLOBAL APPEND PROPERTY QPL_LIB_DEPS
|
||||||
|
$<TARGET_OBJECTS:middle_layer_lib>)
|
||||||
|
|
||||||
target_compile_options(middle_layer_lib
|
target_compile_options(middle_layer_lib
|
||||||
PRIVATE $<$<C_COMPILER_ID:GNU>:${QPL_LINUX_TOOLCHAIN_REQUIRED_FLAGS};
|
PRIVATE $<$<C_COMPILER_ID:GNU>:${QPL_LINUX_TOOLCHAIN_REQUIRED_FLAGS};
|
||||||
${QPL_LINUX_TOOLCHAIN_DYNAMIC_LIBRARY_FLAGS};
|
${QPL_LINUX_TOOLCHAIN_DYNAMIC_LIBRARY_FLAGS};
|
||||||
@ -295,17 +292,16 @@ target_compile_definitions(middle_layer_lib
|
|||||||
PUBLIC QPL_VERSION="${QPL_VERSION}"
|
PUBLIC QPL_VERSION="${QPL_VERSION}"
|
||||||
PUBLIC $<$<BOOL:${LOG_HW_INIT}>:LOG_HW_INIT>
|
PUBLIC $<$<BOOL:${LOG_HW_INIT}>:LOG_HW_INIT>
|
||||||
PUBLIC $<$<BOOL:${EFFICIENT_WAIT}>:QPL_EFFICIENT_WAIT>
|
PUBLIC $<$<BOOL:${EFFICIENT_WAIT}>:QPL_EFFICIENT_WAIT>
|
||||||
PUBLIC QPL_BADARG_CHECK)
|
PUBLIC QPL_BADARG_CHECK
|
||||||
|
PUBLIC $<$<BOOL:${DYNAMIC_LOADING_LIBACCEL_CONFIG}>:DYNAMIC_LOADING_LIBACCEL_CONFIG>)
|
||||||
|
|
||||||
set_source_files_properties(${GENERATED_PX_TABLES_SRC} PROPERTIES COMPILE_DEFINITIONS PLATFORM=0)
|
set_target_properties(middle_layer_lib PROPERTIES CXX_STANDARD 17)
|
||||||
set_source_files_properties(${GENERATED_AVX512_TABLES_SRC} PROPERTIES COMPILE_DEFINITIONS PLATFORM=2)
|
|
||||||
|
|
||||||
target_include_directories(middle_layer_lib
|
target_include_directories(middle_layer_lib
|
||||||
PRIVATE ${UUID_DIR}
|
PRIVATE ${UUID_DIR}
|
||||||
PUBLIC $<BUILD_INTERFACE:${QPL_SRC_DIR}/middle-layer>
|
PUBLIC $<BUILD_INTERFACE:${QPL_SRC_DIR}/middle-layer>
|
||||||
PUBLIC $<TARGET_PROPERTY:_qpl,INTERFACE_INCLUDE_DIRECTORIES>
|
PUBLIC $<TARGET_PROPERTY:_qpl,INTERFACE_INCLUDE_DIRECTORIES>
|
||||||
PUBLIC $<TARGET_PROPERTY:qplcore_px,INTERFACE_INCLUDE_DIRECTORIES>
|
PUBLIC $<TARGET_PROPERTY:qplcore_sw_dispatcher,INTERFACE_INCLUDE_DIRECTORIES>
|
||||||
PUBLIC $<TARGET_PROPERTY:qplcore_avx512,INTERFACE_INCLUDE_DIRECTORIES>
|
|
||||||
PUBLIC $<TARGET_PROPERTY:isal,INTERFACE_INCLUDE_DIRECTORIES>
|
PUBLIC $<TARGET_PROPERTY:isal,INTERFACE_INCLUDE_DIRECTORIES>
|
||||||
PUBLIC $<TARGET_PROPERTY:core_iaa,INTERFACE_INCLUDE_DIRECTORIES>)
|
PUBLIC $<TARGET_PROPERTY:core_iaa,INTERFACE_INCLUDE_DIRECTORIES>)
|
||||||
|
|
||||||
@ -316,20 +312,19 @@ file(GLOB_RECURSE QPL_C_API_SRC
|
|||||||
${QPL_SRC_DIR}/c_api/*.c
|
${QPL_SRC_DIR}/c_api/*.c
|
||||||
${QPL_SRC_DIR}/c_api/*.cpp)
|
${QPL_SRC_DIR}/c_api/*.cpp)
|
||||||
|
|
||||||
add_library(_qpl STATIC ${QPL_C_API_SRC}
|
get_property(LIB_DEPS GLOBAL PROPERTY QPL_LIB_DEPS)
|
||||||
$<TARGET_OBJECTS:middle_layer_lib>
|
|
||||||
$<TARGET_OBJECTS:isal>
|
add_library(_qpl STATIC ${QPL_C_API_SRC} ${LIB_DEPS})
|
||||||
$<TARGET_OBJECTS:isal_asm>
|
|
||||||
$<TARGET_OBJECTS:qplcore_px>
|
|
||||||
$<TARGET_OBJECTS:qplcore_avx512>
|
|
||||||
$<TARGET_OBJECTS:core_iaa>
|
|
||||||
$<TARGET_OBJECTS:middle_layer_lib>)
|
|
||||||
|
|
||||||
target_include_directories(_qpl
|
target_include_directories(_qpl
|
||||||
PUBLIC $<BUILD_INTERFACE:${QPL_PROJECT_DIR}/include/>
|
PUBLIC $<BUILD_INTERFACE:${QPL_PROJECT_DIR}/include/> $<INSTALL_INTERFACE:include>
|
||||||
PRIVATE $<TARGET_PROPERTY:middle_layer_lib,INTERFACE_INCLUDE_DIRECTORIES>
|
PRIVATE $<TARGET_PROPERTY:middle_layer_lib,INTERFACE_INCLUDE_DIRECTORIES>
|
||||||
PRIVATE $<BUILD_INTERFACE:${QPL_SRC_DIR}/c_api>)
|
PRIVATE $<BUILD_INTERFACE:${QPL_SRC_DIR}/c_api>)
|
||||||
|
|
||||||
|
set_target_properties(_qpl PROPERTIES
|
||||||
|
$<$<C_COMPILER_ID:GNU>:C_STANDARD 17>
|
||||||
|
CXX_STANDARD 17)
|
||||||
|
|
||||||
target_compile_options(_qpl
|
target_compile_options(_qpl
|
||||||
PRIVATE $<$<C_COMPILER_ID:GNU>:${QPL_LINUX_TOOLCHAIN_REQUIRED_FLAGS};
|
PRIVATE $<$<C_COMPILER_ID:GNU>:${QPL_LINUX_TOOLCHAIN_REQUIRED_FLAGS};
|
||||||
${QPL_LINUX_TOOLCHAIN_DYNAMIC_LIBRARY_FLAGS};
|
${QPL_LINUX_TOOLCHAIN_DYNAMIC_LIBRARY_FLAGS};
|
||||||
@ -339,15 +334,15 @@ target_compile_options(_qpl
|
|||||||
target_compile_definitions(_qpl
|
target_compile_definitions(_qpl
|
||||||
PRIVATE -DQPL_LIB
|
PRIVATE -DQPL_LIB
|
||||||
PRIVATE -DQPL_BADARG_CHECK
|
PRIVATE -DQPL_BADARG_CHECK
|
||||||
|
PRIVATE $<$<BOOL:${DYNAMIC_LOADING_LIBACCEL_CONFIG}>:DYNAMIC_LOADING_LIBACCEL_CONFIG>
|
||||||
PUBLIC -DENABLE_QPL_COMPRESSION)
|
PUBLIC -DENABLE_QPL_COMPRESSION)
|
||||||
|
|
||||||
target_link_libraries(_qpl
|
target_link_libraries(_qpl
|
||||||
PRIVATE accel-config
|
PRIVATE ch_contrib::accel-config
|
||||||
PRIVATE ch_contrib::isal
|
PRIVATE ch_contrib::isal)
|
||||||
PRIVATE ${CMAKE_DL_LIBS})
|
|
||||||
|
|
||||||
add_library (ch_contrib::qpl ALIAS _qpl)
|
|
||||||
target_include_directories(_qpl SYSTEM BEFORE
|
target_include_directories(_qpl SYSTEM BEFORE
|
||||||
PUBLIC "${QPL_PROJECT_DIR}/include"
|
PUBLIC "${QPL_PROJECT_DIR}/include"
|
||||||
PUBLIC "${LIBACCEL_SOURCE_DIR}/accfg"
|
|
||||||
PUBLIC ${UUID_DIR})
|
PUBLIC ${UUID_DIR})
|
||||||
|
|
||||||
|
add_library (ch_contrib::qpl ALIAS _qpl)
|
||||||
|
@ -1,4 +1,4 @@
|
|||||||
set (SOURCE_DIR "${CMAKE_SOURCE_DIR}/contrib/snappy")
|
set (SOURCE_DIR "${PROJECT_SOURCE_DIR}/contrib/snappy")
|
||||||
|
|
||||||
if (ARCH_S390X)
|
if (ARCH_S390X)
|
||||||
set (SNAPPY_IS_BIG_ENDIAN 1)
|
set (SNAPPY_IS_BIG_ENDIAN 1)
|
||||||
|
@ -5,8 +5,8 @@ echo "Using sparse checkout for aws"
|
|||||||
FILES_TO_CHECKOUT=$(git rev-parse --git-dir)/info/sparse-checkout
|
FILES_TO_CHECKOUT=$(git rev-parse --git-dir)/info/sparse-checkout
|
||||||
echo '/*' > $FILES_TO_CHECKOUT
|
echo '/*' > $FILES_TO_CHECKOUT
|
||||||
echo '!/*/*' >> $FILES_TO_CHECKOUT
|
echo '!/*/*' >> $FILES_TO_CHECKOUT
|
||||||
echo '/aws-cpp-sdk-core/*' >> $FILES_TO_CHECKOUT
|
echo '/src/aws-cpp-sdk-core/*' >> $FILES_TO_CHECKOUT
|
||||||
echo '/aws-cpp-sdk-s3/*' >> $FILES_TO_CHECKOUT
|
echo '/generated/src/aws-cpp-sdk-s3/*' >> $FILES_TO_CHECKOUT
|
||||||
|
|
||||||
git config core.sparsecheckout true
|
git config core.sparsecheckout true
|
||||||
git checkout $1
|
git checkout $1
|
||||||
|
2
contrib/vectorscan
vendored
@ -1 +1 @@
|
|||||||
Subproject commit b4bba94b1a250603b0b198e0394946e32f6c3f30
|
Subproject commit 38431d111781843741a781a57a6381a527d900a4
|
@ -1,4 +1,4 @@
|
|||||||
set (SOURCE_DIR ${CMAKE_SOURCE_DIR}/contrib/zlib-ng)
|
set (SOURCE_DIR ${PROJECT_SOURCE_DIR}/contrib/zlib-ng)
|
||||||
|
|
||||||
add_definitions(-DZLIB_COMPAT)
|
add_definitions(-DZLIB_COMPAT)
|
||||||
add_definitions(-DWITH_GZFILEOP)
|
add_definitions(-DWITH_GZFILEOP)
|
||||||
|
@ -362,17 +362,16 @@ def parse_args() -> argparse.Namespace:
|
|||||||
parser.add_argument(
|
parser.add_argument(
|
||||||
"--compiler",
|
"--compiler",
|
||||||
choices=(
|
choices=(
|
||||||
"clang-15",
|
"clang-16",
|
||||||
"clang-15-darwin",
|
"clang-16-darwin",
|
||||||
"clang-15-darwin-aarch64",
|
"clang-16-darwin-aarch64",
|
||||||
"clang-15-aarch64",
|
"clang-16-aarch64",
|
||||||
"clang-15-aarch64-v80compat",
|
"clang-16-aarch64-v80compat",
|
||||||
"clang-15-ppc64le",
|
"clang-16-ppc64le",
|
||||||
"clang-15-amd64-compat",
|
"clang-16-amd64-compat",
|
||||||
"clang-15-freebsd",
|
"clang-16-freebsd",
|
||||||
"gcc-11",
|
|
||||||
),
|
),
|
||||||
default="clang-15",
|
default="clang-16",
|
||||||
help="a compiler to use",
|
help="a compiler to use",
|
||||||
)
|
)
|
||||||
parser.add_argument(
|
parser.add_argument(
|
||||||
|
@ -10,53 +10,21 @@ RUN sed -i "s|http://archive.ubuntu.com|$apt_archive|g" /etc/apt/sources.list
|
|||||||
|
|
||||||
RUN apt-get update && apt-get --yes --allow-unauthenticated install libclang-${LLVM_VERSION}-dev libmlir-${LLVM_VERSION}-dev
|
RUN apt-get update && apt-get --yes --allow-unauthenticated install libclang-${LLVM_VERSION}-dev libmlir-${LLVM_VERSION}-dev
|
||||||
|
|
||||||
# libclang-15-dev does not contain proper symlink:
|
|
||||||
#
|
|
||||||
# This is what cmake will search for:
|
|
||||||
#
|
|
||||||
# # readlink -f /usr/lib/llvm-15/lib/libclang-15.so.1
|
|
||||||
# /usr/lib/x86_64-linux-gnu/libclang-15.so.1
|
|
||||||
#
|
|
||||||
# This is what exists:
|
|
||||||
#
|
|
||||||
# # ls -l /usr/lib/x86_64-linux-gnu/libclang-15*
|
|
||||||
# lrwxrwxrwx 1 root root 16 Sep 5 13:31 /usr/lib/x86_64-linux-gnu/libclang-15.so -> libclang-15.so.1
|
|
||||||
# lrwxrwxrwx 1 root root 21 Sep 5 13:31 /usr/lib/x86_64-linux-gnu/libclang-15.so.15 -> libclang-15.so.15.0.0
|
|
||||||
# -rw-r--r-- 1 root root 31835760 Sep 5 13:31 /usr/lib/x86_64-linux-gnu/libclang-15.so.15.0.0
|
|
||||||
#
|
|
||||||
ARG TARGETARCH
|
ARG TARGETARCH
|
||||||
RUN arch=${TARGETARCH:-amd64} \
|
RUN arch=${TARGETARCH:-amd64} \
|
||||||
&& case $arch in \
|
&& case $arch in \
|
||||||
amd64) rarch=x86_64 ;; \
|
amd64) rarch=x86_64 ;; \
|
||||||
arm64) rarch=aarch64 ;; \
|
arm64) rarch=aarch64 ;; \
|
||||||
*) exit 1 ;; \
|
*) exit 1 ;; \
|
||||||
esac \
|
esac
|
||||||
&& ln -rsf /usr/lib/$rarch-linux-gnu/libclang-15.so.15 /usr/lib/$rarch-linux-gnu/libclang-15.so.1
|
|
||||||
|
|
||||||
# repo versions doesn't work correctly with C++17
|
# repo versions doesn't work correctly with C++17
|
||||||
# also we push reports to s3, so we add index.html to subfolder urls
|
# also we push reports to s3, so we add index.html to subfolder urls
|
||||||
# https://github.com/ClickHouse-Extras/woboq_codebrowser/commit/37e15eaf377b920acb0b48dbe82471be9203f76b
|
# https://github.com/ClickHouse/woboq_codebrowser/commit/37e15eaf377b920acb0b48dbe82471be9203f76b
|
||||||
RUN git clone https://github.com/ClickHouse/woboq_codebrowser \
|
RUN git clone --branch=master --depth=1 https://github.com/ClickHouse/woboq_codebrowser /woboq_codebrowser \
|
||||||
&& cd woboq_codebrowser \
|
&& cd /woboq_codebrowser \
|
||||||
&& cmake . -G Ninja -DCMAKE_BUILD_TYPE=Release -DCMAKE_CXX_COMPILER=clang\+\+-${LLVM_VERSION} -DCMAKE_C_COMPILER=clang-${LLVM_VERSION} \
|
&& cmake . -G Ninja -DCMAKE_BUILD_TYPE=Release -DCMAKE_CXX_COMPILER=clang\+\+-${LLVM_VERSION} -DCMAKE_C_COMPILER=clang-${LLVM_VERSION} -DCLANG_BUILTIN_HEADERS_DIR=/usr/lib/llvm-${LLVM_VERSION}/lib/clang/${LLVM_VERSION}/include \
|
||||||
&& ninja \
|
&& ninja
|
||||||
&& cd .. \
|
|
||||||
&& rm -rf woboq_codebrowser
|
|
||||||
|
|
||||||
ENV CODEGEN=/woboq_codebrowser/generator/codebrowser_generator
|
COPY build.sh /
|
||||||
ENV CODEINDEX=/woboq_codebrowser/indexgenerator/codebrowser_indexgenerator
|
CMD ["bash", "-c", "/build.sh 2>&1"]
|
||||||
ENV STATIC_DATA=/woboq_codebrowser/data
|
|
||||||
|
|
||||||
ENV SOURCE_DIRECTORY=/repo_folder
|
|
||||||
ENV BUILD_DIRECTORY=/build
|
|
||||||
ENV HTML_RESULT_DIRECTORY=$BUILD_DIRECTORY/html_report
|
|
||||||
ENV SHA=nosha
|
|
||||||
ENV DATA="https://s3.amazonaws.com/clickhouse-test-reports/codebrowser/data"
|
|
||||||
|
|
||||||
CMD mkdir -p $BUILD_DIRECTORY && cd $BUILD_DIRECTORY && \
|
|
||||||
cmake $SOURCE_DIRECTORY -DCMAKE_CXX_COMPILER=/usr/bin/clang\+\+-${LLVM_VERSION} -DCMAKE_C_COMPILER=/usr/bin/clang-${LLVM_VERSION} -DCMAKE_EXPORT_COMPILE_COMMANDS=ON -DENABLE_EMBEDDED_COMPILER=0 -DENABLE_S3=0 && \
|
|
||||||
mkdir -p $HTML_RESULT_DIRECTORY && \
|
|
||||||
$CODEGEN -b $BUILD_DIRECTORY -a -o $HTML_RESULT_DIRECTORY -p ClickHouse:$SOURCE_DIRECTORY:$SHA -d $DATA | ts '%Y-%m-%d %H:%M:%S' && \
|
|
||||||
cp -r $STATIC_DATA $HTML_RESULT_DIRECTORY/ &&\
|
|
||||||
$CODEINDEX $HTML_RESULT_DIRECTORY -d "$DATA" | ts '%Y-%m-%d %H:%M:%S' && \
|
|
||||||
mv $HTML_RESULT_DIRECTORY /test_output
|
|
||||||
|
29
docker/test/codebrowser/build.sh
Executable file
@ -0,0 +1,29 @@
|
|||||||
|
#!/usr/bin/env bash
|
||||||
|
|
||||||
|
set -x -e
|
||||||
|
|
||||||
|
|
||||||
|
STATIC_DATA=${STATIC_DATA:-/woboq_codebrowser/data}
|
||||||
|
SOURCE_DIRECTORY=${SOURCE_DIRECTORY:-/build}
|
||||||
|
BUILD_DIRECTORY=${BUILD_DIRECTORY:-/workdir/build}
|
||||||
|
OUTPUT_DIRECTORY=${OUTPUT_DIRECTORY:-/workdir/output}
|
||||||
|
HTML_RESULT_DIRECTORY=${HTML_RESULT_DIRECTORY:-$OUTPUT_DIRECTORY/html_report}
|
||||||
|
SHA=${SHA:-nosha}
|
||||||
|
DATA=${DATA:-https://s3.amazonaws.com/clickhouse-test-reports/codebrowser/data}
|
||||||
|
nproc=$(($(nproc) + 2)) # increase parallelism
|
||||||
|
|
||||||
|
read -ra CMAKE_FLAGS <<< "${CMAKE_FLAGS:-}"
|
||||||
|
|
||||||
|
mkdir -p "$BUILD_DIRECTORY" && cd "$BUILD_DIRECTORY"
|
||||||
|
cmake "$SOURCE_DIRECTORY" -DCMAKE_CXX_COMPILER="/usr/bin/clang++-${LLVM_VERSION}" -DCMAKE_C_COMPILER="/usr/bin/clang-${LLVM_VERSION}" -DENABLE_WOBOQ_CODEBROWSER=ON "${CMAKE_FLAGS[@]}"
|
||||||
|
mkdir -p "$HTML_RESULT_DIRECTORY"
|
||||||
|
echo 'Filter out too noisy "Error: filename" lines and keep them in full codebrowser_generator.log'
|
||||||
|
/woboq_codebrowser/generator/codebrowser_generator -b "$BUILD_DIRECTORY" -a \
|
||||||
|
-o "$HTML_RESULT_DIRECTORY" --execute-concurrency="$nproc" -p "ClickHouse:$SOURCE_DIRECTORY:$SHA" \
|
||||||
|
-d "$DATA" \
|
||||||
|
|& ts '%Y-%m-%d %H:%M:%S' \
|
||||||
|
| tee "$OUTPUT_DIRECTORY/codebrowser_generator.log" \
|
||||||
|
| grep --line-buffered -v ':[0-9]* Error: '
|
||||||
|
cp -r "$STATIC_DATA" "$HTML_RESULT_DIRECTORY/"
|
||||||
|
/woboq_codebrowser/indexgenerator/codebrowser_indexgenerator "$HTML_RESULT_DIRECTORY" \
|
||||||
|
-d "$DATA" |& ts '%Y-%m-%d %H:%M:%S'
|
@ -9,7 +9,7 @@ trap 'kill $(jobs -pr) ||:' EXIT
|
|||||||
stage=${stage:-}
|
stage=${stage:-}
|
||||||
|
|
||||||
# Compiler version, normally set by Dockerfile
|
# Compiler version, normally set by Dockerfile
|
||||||
export LLVM_VERSION=${LLVM_VERSION:-13}
|
export LLVM_VERSION=${LLVM_VERSION:-16}
|
||||||
|
|
||||||
# A variable to pass additional flags to CMake.
|
# A variable to pass additional flags to CMake.
|
||||||
# Here we explicitly default it to nothing so that bash doesn't complain about
|
# Here we explicitly default it to nothing so that bash doesn't complain about
|
||||||
@ -147,6 +147,7 @@ function clone_submodules
|
|||||||
contrib/xxHash
|
contrib/xxHash
|
||||||
contrib/simdjson
|
contrib/simdjson
|
||||||
contrib/liburing
|
contrib/liburing
|
||||||
|
contrib/libfiu
|
||||||
)
|
)
|
||||||
|
|
||||||
git submodule sync
|
git submodule sync
|
||||||
|
@ -15,7 +15,7 @@ stage=${stage:-}
|
|||||||
script_dir="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"
|
script_dir="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"
|
||||||
echo "$script_dir"
|
echo "$script_dir"
|
||||||
repo_dir=ch
|
repo_dir=ch
|
||||||
BINARY_TO_DOWNLOAD=${BINARY_TO_DOWNLOAD:="clang-15_debug_none_unsplitted_disable_False_binary"}
|
BINARY_TO_DOWNLOAD=${BINARY_TO_DOWNLOAD:="clang-16_debug_none_unsplitted_disable_False_binary"}
|
||||||
BINARY_URL_TO_DOWNLOAD=${BINARY_URL_TO_DOWNLOAD:="https://clickhouse-builds.s3.amazonaws.com/$PR_TO_TEST/$SHA_TO_TEST/clickhouse_build_check/$BINARY_TO_DOWNLOAD/clickhouse"}
|
BINARY_URL_TO_DOWNLOAD=${BINARY_URL_TO_DOWNLOAD:="https://clickhouse-builds.s3.amazonaws.com/$PR_TO_TEST/$SHA_TO_TEST/clickhouse_build_check/$BINARY_TO_DOWNLOAD/clickhouse"}
|
||||||
|
|
||||||
function git_clone_with_retry
|
function git_clone_with_retry
|
||||||
|
@ -2,7 +2,7 @@
|
|||||||
set -euo pipefail
|
set -euo pipefail
|
||||||
|
|
||||||
|
|
||||||
CLICKHOUSE_PACKAGE=${CLICKHOUSE_PACKAGE:="https://clickhouse-builds.s3.amazonaws.com/$PR_TO_TEST/$SHA_TO_TEST/clickhouse_build_check/clang-15_relwithdebuginfo_none_unsplitted_disable_False_binary/clickhouse"}
|
CLICKHOUSE_PACKAGE=${CLICKHOUSE_PACKAGE:="https://clickhouse-builds.s3.amazonaws.com/$PR_TO_TEST/$SHA_TO_TEST/clickhouse_build_check/clang-16_relwithdebuginfo_none_unsplitted_disable_False_binary/clickhouse"}
|
||||||
CLICKHOUSE_REPO_PATH=${CLICKHOUSE_REPO_PATH:=""}
|
CLICKHOUSE_REPO_PATH=${CLICKHOUSE_REPO_PATH:=""}
|
||||||
|
|
||||||
|
|
||||||
|
@ -2,7 +2,7 @@
|
|||||||
set -euo pipefail
|
set -euo pipefail
|
||||||
|
|
||||||
|
|
||||||
CLICKHOUSE_PACKAGE=${CLICKHOUSE_PACKAGE:="https://clickhouse-builds.s3.amazonaws.com/$PR_TO_TEST/$SHA_TO_TEST/clickhouse_build_check/clang-15_relwithdebuginfo_none_unsplitted_disable_False_binary/clickhouse"}
|
CLICKHOUSE_PACKAGE=${CLICKHOUSE_PACKAGE:="https://clickhouse-builds.s3.amazonaws.com/$PR_TO_TEST/$SHA_TO_TEST/clickhouse_build_check/clang-16_relwithdebuginfo_none_unsplitted_disable_False_binary/clickhouse"}
|
||||||
CLICKHOUSE_REPO_PATH=${CLICKHOUSE_REPO_PATH:=""}
|
CLICKHOUSE_REPO_PATH=${CLICKHOUSE_REPO_PATH:=""}
|
||||||
|
|
||||||
|
|
||||||
|
@ -132,6 +132,9 @@ function run_tests()
|
|||||||
|
|
||||||
ADDITIONAL_OPTIONS+=('--report-logs-stats')
|
ADDITIONAL_OPTIONS+=('--report-logs-stats')
|
||||||
|
|
||||||
|
clickhouse-test "00001_select_1" > /dev/null ||:
|
||||||
|
clickhouse-client -q "insert into system.zookeeper (name, path, value) values ('auxiliary_zookeeper2', '/test/chroot/', '')" ||:
|
||||||
|
|
||||||
set +e
|
set +e
|
||||||
clickhouse-test --testname --shard --zookeeper --check-zookeeper-session --hung-check --print-time \
|
clickhouse-test --testname --shard --zookeeper --check-zookeeper-session --hung-check --print-time \
|
||||||
--test-runs "$NUM_TRIES" "${ADDITIONAL_OPTIONS[@]}" 2>&1 \
|
--test-runs "$NUM_TRIES" "${ADDITIONAL_OPTIONS[@]}" 2>&1 \
|
||||||
|
@ -20,31 +20,27 @@ install_packages package_folder
|
|||||||
|
|
||||||
# Thread Fuzzer allows to check more permutations of possible thread scheduling
|
# Thread Fuzzer allows to check more permutations of possible thread scheduling
|
||||||
# and find more potential issues.
|
# and find more potential issues.
|
||||||
# Temporarily disable ThreadFuzzer with tsan because of https://github.com/google/sanitizers/issues/1540
|
export THREAD_FUZZER_CPU_TIME_PERIOD_US=1000
|
||||||
is_tsan_build=$(clickhouse local -q "select value like '% -fsanitize=thread %' from system.build_options where name='CXX_FLAGS'")
|
export THREAD_FUZZER_SLEEP_PROBABILITY=0.1
|
||||||
if [ "$is_tsan_build" -eq "0" ]; then
|
export THREAD_FUZZER_SLEEP_TIME_US=100000
|
||||||
export THREAD_FUZZER_CPU_TIME_PERIOD_US=1000
|
|
||||||
export THREAD_FUZZER_SLEEP_PROBABILITY=0.1
|
|
||||||
export THREAD_FUZZER_SLEEP_TIME_US=100000
|
|
||||||
|
|
||||||
export THREAD_FUZZER_pthread_mutex_lock_BEFORE_MIGRATE_PROBABILITY=1
|
export THREAD_FUZZER_pthread_mutex_lock_BEFORE_MIGRATE_PROBABILITY=1
|
||||||
export THREAD_FUZZER_pthread_mutex_lock_AFTER_MIGRATE_PROBABILITY=1
|
export THREAD_FUZZER_pthread_mutex_lock_AFTER_MIGRATE_PROBABILITY=1
|
||||||
export THREAD_FUZZER_pthread_mutex_unlock_BEFORE_MIGRATE_PROBABILITY=1
|
export THREAD_FUZZER_pthread_mutex_unlock_BEFORE_MIGRATE_PROBABILITY=1
|
||||||
export THREAD_FUZZER_pthread_mutex_unlock_AFTER_MIGRATE_PROBABILITY=1
|
export THREAD_FUZZER_pthread_mutex_unlock_AFTER_MIGRATE_PROBABILITY=1
|
||||||
|
|
||||||
export THREAD_FUZZER_pthread_mutex_lock_BEFORE_SLEEP_PROBABILITY=0.001
|
export THREAD_FUZZER_pthread_mutex_lock_BEFORE_SLEEP_PROBABILITY=0.001
|
||||||
export THREAD_FUZZER_pthread_mutex_lock_AFTER_SLEEP_PROBABILITY=0.001
|
export THREAD_FUZZER_pthread_mutex_lock_AFTER_SLEEP_PROBABILITY=0.001
|
||||||
export THREAD_FUZZER_pthread_mutex_unlock_BEFORE_SLEEP_PROBABILITY=0.001
|
export THREAD_FUZZER_pthread_mutex_unlock_BEFORE_SLEEP_PROBABILITY=0.001
|
||||||
export THREAD_FUZZER_pthread_mutex_unlock_AFTER_SLEEP_PROBABILITY=0.001
|
export THREAD_FUZZER_pthread_mutex_unlock_AFTER_SLEEP_PROBABILITY=0.001
|
||||||
export THREAD_FUZZER_pthread_mutex_lock_BEFORE_SLEEP_TIME_US=10000
|
export THREAD_FUZZER_pthread_mutex_lock_BEFORE_SLEEP_TIME_US=10000
|
||||||
|
|
||||||
export THREAD_FUZZER_pthread_mutex_lock_AFTER_SLEEP_TIME_US=10000
|
export THREAD_FUZZER_pthread_mutex_lock_AFTER_SLEEP_TIME_US=10000
|
||||||
export THREAD_FUZZER_pthread_mutex_unlock_BEFORE_SLEEP_TIME_US=10000
|
export THREAD_FUZZER_pthread_mutex_unlock_BEFORE_SLEEP_TIME_US=10000
|
||||||
export THREAD_FUZZER_pthread_mutex_unlock_AFTER_SLEEP_TIME_US=10000
|
export THREAD_FUZZER_pthread_mutex_unlock_AFTER_SLEEP_TIME_US=10000
|
||||||
|
|
||||||
export THREAD_FUZZER_EXPLICIT_SLEEP_PROBABILITY=0.01
|
export THREAD_FUZZER_EXPLICIT_SLEEP_PROBABILITY=0.01
|
||||||
export THREAD_FUZZER_EXPLICIT_MEMORY_EXCEPTION_PROBABILITY=0.01
|
export THREAD_FUZZER_EXPLICIT_MEMORY_EXCEPTION_PROBABILITY=0.01
|
||||||
fi
|
|
||||||
|
|
||||||
export ZOOKEEPER_FAULT_INJECTION=1
|
export ZOOKEEPER_FAULT_INJECTION=1
|
||||||
# Initial run without S3 to create system.*_log on local file system to make it
|
# Initial run without S3 to create system.*_log on local file system to make it
|
||||||
|
@ -65,6 +65,9 @@ sudo cat /etc/clickhouse-server/config.d/storage_conf.xml \
|
|||||||
> /etc/clickhouse-server/config.d/storage_conf.xml.tmp
|
> /etc/clickhouse-server/config.d/storage_conf.xml.tmp
|
||||||
sudo mv /etc/clickhouse-server/config.d/storage_conf.xml.tmp /etc/clickhouse-server/config.d/storage_conf.xml
|
sudo mv /etc/clickhouse-server/config.d/storage_conf.xml.tmp /etc/clickhouse-server/config.d/storage_conf.xml
|
||||||
|
|
||||||
|
# it contains some new settings, but we can safely remove it
|
||||||
|
rm /etc/clickhouse-server/config.d/merge_tree.xml
|
||||||
|
|
||||||
start
|
start
|
||||||
stop
|
stop
|
||||||
mv /var/log/clickhouse-server/clickhouse-server.log /var/log/clickhouse-server/clickhouse-server.initial.log
|
mv /var/log/clickhouse-server/clickhouse-server.log /var/log/clickhouse-server/clickhouse-server.initial.log
|
||||||
@ -94,6 +97,9 @@ sudo cat /etc/clickhouse-server/config.d/storage_conf.xml \
|
|||||||
> /etc/clickhouse-server/config.d/storage_conf.xml.tmp
|
> /etc/clickhouse-server/config.d/storage_conf.xml.tmp
|
||||||
sudo mv /etc/clickhouse-server/config.d/storage_conf.xml.tmp /etc/clickhouse-server/config.d/storage_conf.xml
|
sudo mv /etc/clickhouse-server/config.d/storage_conf.xml.tmp /etc/clickhouse-server/config.d/storage_conf.xml
|
||||||
|
|
||||||
|
# it contains some new settings, but we can safely remove it
|
||||||
|
rm /etc/clickhouse-server/config.d/merge_tree.xml
|
||||||
|
|
||||||
start
|
start
|
||||||
|
|
||||||
clickhouse-client --query="SELECT 'Server version: ', version()"
|
clickhouse-client --query="SELECT 'Server version: ', version()"
|
||||||
|
@ -6,7 +6,7 @@ ARG apt_archive="http://archive.ubuntu.com"
|
|||||||
RUN sed -i "s|http://archive.ubuntu.com|$apt_archive|g" /etc/apt/sources.list
|
RUN sed -i "s|http://archive.ubuntu.com|$apt_archive|g" /etc/apt/sources.list
|
||||||
|
|
||||||
# 15.0.2
|
# 15.0.2
|
||||||
ENV DEBIAN_FRONTEND=noninteractive LLVM_VERSION=15
|
ENV DEBIAN_FRONTEND=noninteractive LLVM_VERSION=16
|
||||||
|
|
||||||
RUN apt-get update \
|
RUN apt-get update \
|
||||||
&& apt-get install \
|
&& apt-get install \
|
||||||
@ -52,6 +52,7 @@ RUN apt-get update \
|
|||||||
lld-${LLVM_VERSION} \
|
lld-${LLVM_VERSION} \
|
||||||
llvm-${LLVM_VERSION} \
|
llvm-${LLVM_VERSION} \
|
||||||
llvm-${LLVM_VERSION}-dev \
|
llvm-${LLVM_VERSION}-dev \
|
||||||
|
libclang-${LLVM_VERSION}-dev \
|
||||||
moreutils \
|
moreutils \
|
||||||
nasm \
|
nasm \
|
||||||
ninja-build \
|
ninja-build \
|
||||||
|
@ -11,14 +11,14 @@ This is intended for continuous integration checks that run on Linux servers. If
|
|||||||
|
|
||||||
The cross-build for macOS is based on the [Build instructions](../development/build.md), follow them first.
|
The cross-build for macOS is based on the [Build instructions](../development/build.md), follow them first.
|
||||||
|
|
||||||
## Install Clang-15
|
## Install Clang-16
|
||||||
|
|
||||||
Follow the instructions from https://apt.llvm.org/ for your Ubuntu or Debian setup.
|
Follow the instructions from https://apt.llvm.org/ for your Ubuntu or Debian setup.
|
||||||
For example, the commands for Bionic are as follows:
|
||||||
|
|
||||||
``` bash
|
``` bash
|
||||||
sudo echo "deb [trusted=yes] http://apt.llvm.org/bionic/ llvm-toolchain-bionic-15 main" >> /etc/apt/sources.list
|
sudo echo "deb [trusted=yes] http://apt.llvm.org/bionic/ llvm-toolchain-bionic-16 main" >> /etc/apt/sources.list
|
||||||
sudo apt-get install clang-15
|
sudo apt-get install clang-16
|
||||||
```
|
```
|
||||||
|
|
||||||
## Install Cross-Compilation Toolset {#install-cross-compilation-toolset}
|
## Install Cross-Compilation Toolset {#install-cross-compilation-toolset}
|
||||||
@ -55,7 +55,7 @@ curl -L 'https://github.com/phracker/MacOSX-SDKs/releases/download/10.15/MacOSX1
|
|||||||
cd ClickHouse
|
cd ClickHouse
|
||||||
mkdir build-darwin
|
mkdir build-darwin
|
||||||
cd build-darwin
|
cd build-darwin
|
||||||
CC=clang-15 CXX=clang++-15 cmake -DCMAKE_AR:FILEPATH=${CCTOOLS}/bin/x86_64-apple-darwin-ar -DCMAKE_INSTALL_NAME_TOOL=${CCTOOLS}/bin/x86_64-apple-darwin-install_name_tool -DCMAKE_RANLIB:FILEPATH=${CCTOOLS}/bin/x86_64-apple-darwin-ranlib -DLINKER_NAME=${CCTOOLS}/bin/x86_64-apple-darwin-ld -DCMAKE_TOOLCHAIN_FILE=cmake/darwin/toolchain-x86_64.cmake ..
|
CC=clang-16 CXX=clang++-16 cmake -DCMAKE_AR:FILEPATH=${CCTOOLS}/bin/x86_64-apple-darwin-ar -DCMAKE_INSTALL_NAME_TOOL=${CCTOOLS}/bin/x86_64-apple-darwin-install_name_tool -DCMAKE_RANLIB:FILEPATH=${CCTOOLS}/bin/x86_64-apple-darwin-ranlib -DLINKER_NAME=${CCTOOLS}/bin/x86_64-apple-darwin-ld -DCMAKE_TOOLCHAIN_FILE=cmake/darwin/toolchain-x86_64.cmake ..
|
||||||
ninja
|
ninja
|
||||||
```
|
```
|
||||||
|
|
||||||
|
@ -11,7 +11,7 @@ This is for the case when you have Linux machine and want to use it to build `cl
|
|||||||
|
|
||||||
The cross-build for RISC-V 64 is based on the [Build instructions](../development/build.md), follow them first.
|
The cross-build for RISC-V 64 is based on the [Build instructions](../development/build.md), follow them first.
|
||||||
|
|
||||||
## Install Clang-13
|
## Install Clang-16
|
||||||
|
|
||||||
Follow the instructions from https://apt.llvm.org/ for your Ubuntu or Debian setup or do
|
Follow the instructions from https://apt.llvm.org/ for your Ubuntu or Debian setup or do
|
||||||
```
|
```
|
||||||
@ -23,7 +23,7 @@ sudo bash -c "$(wget -O - https://apt.llvm.org/llvm.sh)"
|
|||||||
``` bash
|
``` bash
|
||||||
cd ClickHouse
|
cd ClickHouse
|
||||||
mkdir build-riscv64
|
mkdir build-riscv64
|
||||||
CC=clang-14 CXX=clang++-14 cmake . -Bbuild-riscv64 -G Ninja -DCMAKE_TOOLCHAIN_FILE=cmake/linux/toolchain-riscv64.cmake -DGLIBC_COMPATIBILITY=OFF -DENABLE_LDAP=OFF -DOPENSSL_NO_ASM=ON -DENABLE_JEMALLOC=ON -DENABLE_PARQUET=OFF -DUSE_UNWIND=OFF -DENABLE_GRPC=OFF -DENABLE_HDFS=OFF -DENABLE_MYSQL=OFF
|
CC=clang-16 CXX=clang++-16 cmake . -Bbuild-riscv64 -G Ninja -DCMAKE_TOOLCHAIN_FILE=cmake/linux/toolchain-riscv64.cmake -DGLIBC_COMPATIBILITY=OFF -DENABLE_LDAP=OFF -DOPENSSL_NO_ASM=ON -DENABLE_JEMALLOC=ON -DENABLE_PARQUET=OFF -DUSE_UNWIND=OFF -DENABLE_GRPC=OFF -DENABLE_HDFS=OFF -DENABLE_MYSQL=OFF
|
||||||
ninja -C build-riscv64
|
ninja -C build-riscv64
|
||||||
```
|
```
|
||||||
|
|
||||||
|
@ -22,7 +22,7 @@ The minimum recommended Ubuntu version for development is 22.04 LTS.
|
|||||||
### Install Prerequisites {#install-prerequisites}
|
### Install Prerequisites {#install-prerequisites}
|
||||||
|
|
||||||
``` bash
|
``` bash
|
||||||
sudo apt-get install git cmake ccache python3 ninja-build nasm yasm gawk
|
sudo apt-get install git cmake ccache python3 ninja-build nasm yasm gawk lsb-release wget software-properties-common gnupg
|
||||||
```
|
```
|
||||||
|
|
||||||
### Install and Use the Clang compiler
|
### Install and Use the Clang compiler
|
||||||
@ -46,9 +46,14 @@ As of April 2023, any version of Clang >= 15 will work.
|
|||||||
GCC as a compiler is not supported
|
GCC as a compiler is not supported
|
||||||
To build with a specific Clang version:
|
To build with a specific Clang version:
|
||||||
|
|
||||||
|
:::tip
|
||||||
|
This is optional: if you are following along and have just installed Clang, check
which version you have installed before setting this environment variable.
|
||||||
|
:::
|
||||||
|
|
||||||
``` bash
|
``` bash
|
||||||
export CC=clang-15
|
export CC=clang-16
|
||||||
export CXX=clang++-15
|
export CXX=clang++-16
|
||||||
```
|
```
|
||||||
|
|
||||||
### Checkout ClickHouse Sources {#checkout-clickhouse-sources}
|
### Checkout ClickHouse Sources {#checkout-clickhouse-sources}
|
||||||
|
@ -4,20 +4,22 @@ sidebar_position: 73
|
|||||||
sidebar_label: Building and Benchmarking DEFLATE_QPL
|
sidebar_label: Building and Benchmarking DEFLATE_QPL
|
||||||
description: How to build Clickhouse and run benchmark with DEFLATE_QPL Codec
|
description: How to build Clickhouse and run benchmark with DEFLATE_QPL Codec
|
||||||
---
|
---
|
||||||
|
|
||||||
# Build Clickhouse with DEFLATE_QPL
|
# Build Clickhouse with DEFLATE_QPL
|
||||||
- Make sure your target machine meet the QPL required [Prerequisites](https://intel.github.io/qpl/documentation/get_started_docs/installation.html#prerequisites)
|
|
||||||
- Pass the following flag to CMake when building ClickHouse, depending on the capabilities of your target machine:
|
- Make sure your target machine meets the QPL required [prerequisites](https://intel.github.io/qpl/documentation/get_started_docs/installation.html#prerequisites)
|
||||||
|
- Pass the following flag to CMake when building ClickHouse:
|
||||||
|
|
||||||
``` bash
|
``` bash
|
||||||
cmake -DENABLE_AVX2=1 -DENABLE_QPL=1 ..
|
cmake -DENABLE_QPL=1 ..
|
||||||
```
|
|
||||||
or
|
|
||||||
``` bash
|
|
||||||
cmake -DENABLE_AVX512=1 -DENABLE_QPL=1 ..
|
|
||||||
```
|
```
|
||||||
|
|
||||||
- For generic requirements, please refer to Clickhouse generic [build instructions](/docs/en/development/build.md)
|
- For generic requirements, please refer to Clickhouse generic [build instructions](/docs/en/development/build.md)
|
||||||
|
|
||||||
# Run Benchmark with DEFLATE_QPL
|
# Run Benchmark with DEFLATE_QPL
|
||||||
|
|
||||||
## Files list
|
## Files list
|
||||||
|
|
||||||
The folder `benchmark_sample` under [qpl-cmake](https://github.com/ClickHouse/ClickHouse/tree/master/contrib/qpl-cmake) gives an example of running the benchmark with python scripts:
|
||||||
|
|
||||||
`client_scripts` contains python scripts for running typical benchmarks, for example:
|
||||||
@ -28,48 +30,60 @@ The folders `benchmark_sample` under [qpl-cmake](https://github.com/ClickHouse/C
|
|||||||
`database_files` is where the database files for the lz4/deflate/zstd codecs are stored.
|
||||||
|
|
||||||
## Run benchmark automatically for Star Schema:
|
## Run benchmark automatically for Star Schema:
|
||||||
|
|
||||||
``` bash
|
``` bash
|
||||||
$ cd ./benchmark_sample/client_scripts
|
$ cd ./benchmark_sample/client_scripts
|
||||||
$ sh run_ssb.sh
|
$ sh run_ssb.sh
|
||||||
```
|
```
|
||||||
|
|
||||||
After it completes, please check all the results in the folder `./output/`.
|
||||||
|
|
||||||
If it fails, please run the benchmark manually as described in the sections below.
|
||||||
|
|
||||||
## Definition
|
## Definition
|
||||||
|
|
||||||
[CLICKHOUSE_EXE] means the path to the clickhouse executable.
|
||||||
|
|
||||||
## Environment
|
## Environment
|
||||||
|
|
||||||
- CPU: Sapphire Rapid
|
- CPU: Sapphire Rapid
|
||||||
- OS Requirements refer to [System Requirements for QPL](https://intel.github.io/qpl/documentation/get_started_docs/installation.html#system-requirements)
|
- OS Requirements refer to [System Requirements for QPL](https://intel.github.io/qpl/documentation/get_started_docs/installation.html#system-requirements)
|
||||||
- IAA Setup refer to [Accelerator Configuration](https://intel.github.io/qpl/documentation/get_started_docs/installation.html#accelerator-configuration)
|
- IAA Setup refer to [Accelerator Configuration](https://intel.github.io/qpl/documentation/get_started_docs/installation.html#accelerator-configuration)
|
||||||
- Install python modules:
|
- Install python modules:
|
||||||
|
|
||||||
``` bash
|
``` bash
|
||||||
pip3 install clickhouse_driver numpy
|
pip3 install clickhouse_driver numpy
|
||||||
```
|
```
|
||||||
|
|
||||||
[Self-check for IAA]
|
[Self-check for IAA]
|
||||||
|
|
||||||
``` bash
|
``` bash
|
||||||
$ accel-config list | grep -P 'iax|state'
|
$ accel-config list | grep -P 'iax|state'
|
||||||
```
|
```
|
||||||
|
|
||||||
The expected output looks like this:
|
||||||
``` bash
|
``` bash
|
||||||
"dev":"iax1",
|
"dev":"iax1",
|
||||||
"state":"enabled",
|
"state":"enabled",
|
||||||
"state":"enabled",
|
"state":"enabled",
|
||||||
```
|
```
|
||||||
|
|
||||||
If there is no output, IAA is not ready to work. Please check the IAA setup again.
|
||||||
|
|
||||||
## Generate raw data
|
## Generate raw data
|
||||||
|
|
||||||
``` bash
|
``` bash
|
||||||
$ cd ./benchmark_sample
|
$ cd ./benchmark_sample
|
||||||
$ mkdir rawdata_dir && cd rawdata_dir
|
$ mkdir rawdata_dir && cd rawdata_dir
|
||||||
```
|
```
|
||||||
|
|
||||||
Use [`dbgen`](https://clickhouse.com/docs/en/getting-started/example-datasets/star-schema) to generate 100 million rows of data with the parameter:
-s 20
|
||||||
|
|
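A minimal sketch of that step, assuming the `ssb-dbgen` tool described on the linked Star Schema page (the repository URL and the per-table flags are taken from that page, not from this document; generate any remaining tables the same way):

``` bash
$ git clone https://github.com/vadimtk/ssb-dbgen.git && cd ssb-dbgen && make
$ ./dbgen -s 20 -T c   # customer
$ ./dbgen -s 20 -T l   # lineorder
$ ./dbgen -s 20 -T p   # part
$ ./dbgen -s 20 -T s   # supplier
```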
||||||
Files like `*.tbl` are expected to appear under `./benchmark_sample/rawdata_dir/ssb-dbgen`:
|
||||||
|
|
||||||
## Database setup
|
## Database setup
|
||||||
|
|
||||||
Set up database with LZ4 codec
|
Set up database with LZ4 codec
|
||||||
|
|
||||||
``` bash
|
``` bash
|
||||||
@ -77,6 +91,7 @@ $ cd ./database_dir/lz4
|
|||||||
$ [CLICKHOUSE_EXE] server -C config_lz4.xml >&/dev/null&
|
$ [CLICKHOUSE_EXE] server -C config_lz4.xml >&/dev/null&
|
||||||
$ [CLICKHOUSE_EXE] client
|
$ [CLICKHOUSE_EXE] client
|
||||||
```
|
```
|
||||||
|
|
||||||
Here you should see the message `Connected to ClickHouse server` in the console, which means the client has successfully set up a connection to the server.
|
||||||
|
|
||||||
Complete the three steps described in the [Star Schema Benchmark](https://clickhouse.com/docs/en/getting-started/example-datasets/star-schema)
|
||||||
@ -114,6 +129,7 @@ You are expected to see below output:
|
|||||||
└───────────┘
|
└───────────┘
|
||||||
```
|
```
|
||||||
[Self-check for IAA Deflate codec]
|
[Self-check for IAA Deflate codec]
|
||||||
|
|
||||||
The first time you execute an insertion or a query from the client, the clickhouse server console is expected to print this log:
|
||||||
```text
|
```text
|
||||||
Hardware-assisted DeflateQpl codec is ready!
|
Hardware-assisted DeflateQpl codec is ready!
|
||||||
@ -125,17 +141,21 @@ Initialization of hardware-assisted DeflateQpl codec failed
|
|||||||
That means the IAA devices are not ready; you need to check the IAA setup again.
|
||||||
|
|
||||||
## Benchmark with single instance
|
## Benchmark with single instance
|
||||||
|
|
||||||
- Before starting the benchmark, please disable C6 and set the CPU frequency governor to `performance`
|
||||||
|
|
||||||
``` bash
|
``` bash
|
||||||
$ cpupower idle-set -d 3
|
$ cpupower idle-set -d 3
|
||||||
$ cpupower frequency-set -g performance
|
$ cpupower frequency-set -g performance
|
||||||
```
|
```
|
||||||
|
|
||||||
- To eliminate the impact of cross-socket memory access, we use `numactl` to bind the server to one socket and the client to another socket.
- Single instance means a single server connected to a single client
|
||||||
|
|
||||||
Now run benchmark for LZ4/Deflate/ZSTD respectively:
|
Now run benchmark for LZ4/Deflate/ZSTD respectively:
|
||||||
|
|
||||||
LZ4:
|
LZ4:
|
||||||
|
|
||||||
``` bash
|
``` bash
|
||||||
$ cd ./database_dir/lz4
|
$ cd ./database_dir/lz4
|
||||||
$ numactl -m 0 -N 0 [CLICKHOUSE_EXE] server -C config_lz4.xml >&/dev/null&
|
$ numactl -m 0 -N 0 [CLICKHOUSE_EXE] server -C config_lz4.xml >&/dev/null&
|
||||||
@ -144,13 +164,16 @@ $ numactl -m 1 -N 1 python3 client_stressing_test.py queries_ssb.sql 1 > lz4.log
|
|||||||
```
|
```
|
||||||
|
|
||||||
IAA deflate:
|
IAA deflate:
|
||||||
|
|
||||||
``` bash
|
``` bash
|
||||||
$ cd ./database_dir/deflate
|
$ cd ./database_dir/deflate
|
||||||
$ numactl -m 0 -N 0 [CLICKHOUSE_EXE] server -C config_deflate.xml >&/dev/null&
|
$ numactl -m 0 -N 0 [CLICKHOUSE_EXE] server -C config_deflate.xml >&/dev/null&
|
||||||
$ cd ./client_scripts
|
$ cd ./client_scripts
|
||||||
$ numactl -m 1 -N 1 python3 client_stressing_test.py queries_ssb.sql 1 > deflate.log
|
$ numactl -m 1 -N 1 python3 client_stressing_test.py queries_ssb.sql 1 > deflate.log
|
||||||
```
|
```
|
||||||
|
|
||||||
ZSTD:
|
ZSTD:
|
||||||
|
|
||||||
``` bash
|
``` bash
|
||||||
$ cd ./database_dir/zstd
|
$ cd ./database_dir/zstd
|
||||||
$ numactl -m 0 -N 0 [CLICKHOUSE_EXE] server -C config_zstd.xml >&/dev/null&
|
$ numactl -m 0 -N 0 [CLICKHOUSE_EXE] server -C config_zstd.xml >&/dev/null&
|
||||||
@ -170,6 +193,7 @@ How to check performance metrics:
|
|||||||
We focus on QPS; please search for the keyword `QPS_Final` and collect the statistics, for example as sketched below.
|
||||||
|
|
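For example, a quick way to pull those numbers out of the single-instance logs produced above:

``` bash
$ grep 'QPS_Final' lz4.log deflate.log zstd.log
```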
||||||
## Benchmark with multi-instances
|
## Benchmark with multi-instances
|
||||||
|
|
||||||
- To reduce the impact of being memory bound with too many threads, we recommend running the benchmark with multiple instances.
- Multi-instance means multiple (2 or 4) servers, each connected to its own client.
|
||||||
- The cores of one socket need to be divided equally and assigned to the servers respectively.
|
- The cores of one socket need to be divided equally and assigned to the servers respectively.
|
||||||
@ -182,35 +206,46 @@ There are 2 differences:
|
|||||||
Here we assume there are 60 cores per socket and take 2 instances as an example.
Launch the server for the first instance
|
||||||
LZ4:
|
LZ4:
|
||||||
|
|
||||||
``` bash
|
``` bash
|
||||||
$ cd ./database_dir/lz4
|
$ cd ./database_dir/lz4
|
||||||
$ numactl -C 0-29,120-149 [CLICKHOUSE_EXE] server -C config_lz4.xml >&/dev/null&
|
$ numactl -C 0-29,120-149 [CLICKHOUSE_EXE] server -C config_lz4.xml >&/dev/null&
|
||||||
```
|
```
|
||||||
|
|
||||||
ZSTD:
|
ZSTD:
|
||||||
|
|
||||||
``` bash
|
``` bash
|
||||||
$ cd ./database_dir/zstd
|
$ cd ./database_dir/zstd
|
||||||
$ numactl -C 0-29,120-149 [CLICKHOUSE_EXE] server -C config_zstd.xml >&/dev/null&
|
$ numactl -C 0-29,120-149 [CLICKHOUSE_EXE] server -C config_zstd.xml >&/dev/null&
|
||||||
```
|
```
|
||||||
|
|
||||||
IAA Deflate:
|
IAA Deflate:
|
||||||
|
|
||||||
``` bash
|
``` bash
|
||||||
$ cd ./database_dir/deflate
|
$ cd ./database_dir/deflate
|
||||||
$ numactl -C 0-29,120-149 [CLICKHOUSE_EXE] server -C config_deflate.xml >&/dev/null&
|
$ numactl -C 0-29,120-149 [CLICKHOUSE_EXE] server -C config_deflate.xml >&/dev/null&
|
||||||
```
|
```
|
||||||
|
|
||||||
[Launch server for second instance]
|
[Launch server for second instance]
|
||||||
|
|
||||||
LZ4:
|
LZ4:
|
||||||
|
|
||||||
``` bash
|
``` bash
|
||||||
$ cd ./database_dir && mkdir lz4_s2 && cd lz4_s2
|
$ cd ./database_dir && mkdir lz4_s2 && cd lz4_s2
|
||||||
$ cp ../../server_config/config_lz4_s2.xml ./
|
$ cp ../../server_config/config_lz4_s2.xml ./
|
||||||
$ numactl -C 30-59,150-179 [CLICKHOUSE_EXE] server -C config_lz4_s2.xml >&/dev/null&
|
$ numactl -C 30-59,150-179 [CLICKHOUSE_EXE] server -C config_lz4_s2.xml >&/dev/null&
|
||||||
```
|
```
|
||||||
|
|
||||||
ZSTD:
|
ZSTD:
|
||||||
|
|
||||||
``` bash
|
``` bash
|
||||||
$ cd ./database_dir && mkdir zstd_s2 && cd zstd_s2
|
$ cd ./database_dir && mkdir zstd_s2 && cd zstd_s2
|
||||||
$ cp ../../server_config/config_zstd_s2.xml ./
|
$ cp ../../server_config/config_zstd_s2.xml ./
|
||||||
$ numactl -C 30-59,150-179 [CLICKHOUSE_EXE] server -C config_zstd_s2.xml >&/dev/null&
|
$ numactl -C 30-59,150-179 [CLICKHOUSE_EXE] server -C config_zstd_s2.xml >&/dev/null&
|
||||||
```
|
```
|
||||||
|
|
||||||
IAA Deflate:
|
IAA Deflate:
|
||||||
|
|
||||||
``` bash
|
``` bash
|
||||||
$ cd ./database_dir && mkdir deflate_s2 && cd deflate_s2
|
$ cd ./database_dir && mkdir deflate_s2 && cd deflate_s2
|
||||||
$ cp ../../server_config/config_deflate_s2.xml ./
|
$ cp ../../server_config/config_deflate_s2.xml ./
|
||||||
@ -220,19 +255,24 @@ $ numactl -C 30-59,150-179 [CLICKHOUSE_EXE] server -C config_deflate_s2.xml >&/d
|
|||||||
Creating tables && Inserting data for second instance
|
Creating tables && Inserting data for second instance
|
||||||
|
|
||||||
Creating tables:
|
Creating tables:
|
||||||
|
|
||||||
``` bash
|
``` bash
|
||||||
$ [CLICKHOUSE_EXE] client -m --port=9001
|
$ [CLICKHOUSE_EXE] client -m --port=9001
|
||||||
```
|
```
|
||||||
|
|
||||||
Inserting data:
|
Inserting data:
|
||||||
|
|
||||||
``` bash
|
``` bash
|
||||||
$ [CLICKHOUSE_EXE] client --query "INSERT INTO [TBL_FILE_NAME] FORMAT CSV" < [TBL_FILE_NAME].tbl --port=9001
|
$ [CLICKHOUSE_EXE] client --query "INSERT INTO [TBL_FILE_NAME] FORMAT CSV" < [TBL_FILE_NAME].tbl --port=9001
|
||||||
```
|
```
|
||||||
|
|
||||||
- [TBL_FILE_NAME] represents the name of a file matching the pattern `*.tbl` under `./benchmark_sample/rawdata_dir/ssb-dbgen` (see the sketch after this list).
- `--port=9001` stands for the assigned port of the server instance, which is also defined in config_lz4_s2.xml/config_zstd_s2.xml/config_deflate_s2.xml. For even more instances, replace it with 9002/9003, which stand for the s3/s4 instances respectively. If you don't assign it, the port defaults to 9000, which is already used by the first instance.
|
||||||
|
|
||||||
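A minimal sketch of loading every generated file into the second instance in one go, assuming each `*.tbl` file name matches the table it belongs to:

``` bash
$ for f in ./benchmark_sample/rawdata_dir/ssb-dbgen/*.tbl; do
      table=$(basename "$f" .tbl)
      [CLICKHOUSE_EXE] client --port=9001 --query "INSERT INTO $table FORMAT CSV" < "$f"
  done
```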
Benchmarking with 2 instances
|
Benchmarking with 2 instances
|
||||||
|
|
||||||
LZ4:
|
LZ4:
|
||||||
|
|
||||||
``` bash
|
``` bash
|
||||||
$ cd ./database_dir/lz4
|
$ cd ./database_dir/lz4
|
||||||
$ numactl -C 0-29,120-149 [CLICKHOUSE_EXE] server -C config_lz4.xml >&/dev/null&
|
$ numactl -C 0-29,120-149 [CLICKHOUSE_EXE] server -C config_lz4.xml >&/dev/null&
|
||||||
@ -241,7 +281,9 @@ $ numactl -C 30-59,150-179 [CLICKHOUSE_EXE] server -C config_lz4_s2.xml >&/dev/n
|
|||||||
$ cd ./client_scripts
|
$ cd ./client_scripts
|
||||||
$ numactl -m 1 -N 1 python3 client_stressing_test.py queries_ssb.sql 2 > lz4_2insts.log
|
$ numactl -m 1 -N 1 python3 client_stressing_test.py queries_ssb.sql 2 > lz4_2insts.log
|
||||||
```
|
```
|
||||||
|
|
||||||
ZSTD:
|
ZSTD:
|
||||||
|
|
||||||
``` bash
|
``` bash
|
||||||
$ cd ./database_dir/zstd
|
$ cd ./database_dir/zstd
|
||||||
$ numactl -C 0-29,120-149 [CLICKHOUSE_EXE] server -C config_zstd.xml >&/dev/null&
|
$ numactl -C 0-29,120-149 [CLICKHOUSE_EXE] server -C config_zstd.xml >&/dev/null&
|
||||||
@ -250,7 +292,9 @@ $ numactl -C 30-59,150-179 [CLICKHOUSE_EXE] server -C config_zstd_s2.xml >&/dev/
|
|||||||
$ cd ./client_scripts
|
$ cd ./client_scripts
|
||||||
$ numactl -m 1 -N 1 python3 client_stressing_test.py queries_ssb.sql 2 > zstd_2insts.log
|
$ numactl -m 1 -N 1 python3 client_stressing_test.py queries_ssb.sql 2 > zstd_2insts.log
|
||||||
```
|
```
|
||||||
|
|
||||||
IAA deflate
|
IAA deflate
|
||||||
|
|
||||||
``` bash
|
``` bash
|
||||||
$ cd ./database_dir/deflate
|
$ cd ./database_dir/deflate
|
||||||
$ numactl -C 0-29,120-149 [CLICKHOUSE_EXE] server -C config_deflate.xml >&/dev/null&
|
$ numactl -C 0-29,120-149 [CLICKHOUSE_EXE] server -C config_deflate.xml >&/dev/null&
|
||||||
@ -259,9 +303,11 @@ $ numactl -C 30-59,150-179 [CLICKHOUSE_EXE] server -C config_deflate_s2.xml >&/d
|
|||||||
$ cd ./client_scripts
|
$ cd ./client_scripts
|
||||||
$ numactl -m 1 -N 1 python3 client_stressing_test.py queries_ssb.sql 2 > deflate_2insts.log
|
$ numactl -m 1 -N 1 python3 client_stressing_test.py queries_ssb.sql 2 > deflate_2insts.log
|
||||||
```
|
```
|
||||||
|
|
||||||
Here the last argument `2` of client_stressing_test.py stands for the number of instances. For more instances, replace it with 3 or 4. This script supports up to 4 instances.
|
||||||
|
|
||||||
Now three logs should be output as expected:
|
Now three logs should be output as expected:
|
||||||
|
|
||||||
``` text
|
``` text
|
||||||
lz4_2insts.log
|
lz4_2insts.log
|
||||||
deflate_2insts.log
|
deflate_2insts.log
|
||||||
@ -275,7 +321,9 @@ Benchmark setup for 4 instances is similar with 2 instances above.
|
|||||||
We recommend using the 2-instance benchmark data as the final report for review.
|
||||||
|
|
||||||
## Tips
|
## Tips
|
||||||
|
|
||||||
Each time before launching a new clickhouse server, please make sure no background clickhouse process is running; check for and kill any old one:
|
||||||
|
|
||||||
``` bash
|
``` bash
|
||||||
$ ps -aux| grep clickhouse
|
$ ps -aux| grep clickhouse
|
||||||
$ kill -9 [PID]
|
$ kill -9 [PID]
|
||||||
|
@ -102,7 +102,7 @@ Builds ClickHouse in various configurations for use in further steps. You have t
|
|||||||
|
|
||||||
### Report Details
|
### Report Details
|
||||||
|
|
||||||
- **Compiler**: `clang-15`, optionally with the name of a target platform
|
- **Compiler**: `clang-16`, optionally with the name of a target platform
|
||||||
- **Build type**: `Debug` or `RelWithDebInfo` (cmake).
|
- **Build type**: `Debug` or `RelWithDebInfo` (cmake).
|
||||||
- **Sanitizer**: `none` (without sanitizers), `address` (ASan), `memory` (MSan), `undefined` (UBSan), or `thread` (TSan).
|
- **Sanitizer**: `none` (without sanitizers), `address` (ASan), `memory` (MSan), `undefined` (UBSan), or `thread` (TSan).
|
||||||
- **Status**: `success` or `fail`
|
- **Status**: `success` or `fail`
|
||||||
|
@ -152,7 +152,7 @@ While inside the `build` directory, configure your build by running CMake. Befor
|
|||||||
export CC=clang CXX=clang++
|
export CC=clang CXX=clang++
|
||||||
cmake ..
|
cmake ..
|
||||||
|
|
||||||
If you installed clang using the automatic installation script above, also specify the version of clang installed in the first command, e.g. `export CC=clang-15 CXX=clang++-15`. The clang version will be in the script output.
|
If you installed clang using the automatic installation script above, also specify the version of clang installed in the first command, e.g. `export CC=clang-16 CXX=clang++-16`. The clang version will be in the script output.
|
||||||
|
|
||||||
The `CC` variable specifies the compiler for C (short for C Compiler), and the `CXX` variable specifies which C++ compiler is to be used for building.
|
The `CC` variable specifies the compiler for C (short for C Compiler), and the `CXX` variable specifies which C++ compiler is to be used for building.
|
||||||
|
|
||||||
|
@ -19,8 +19,8 @@ Kafka lets you:
|
|||||||
``` sql
|
``` sql
|
||||||
CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]
|
CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]
|
||||||
(
|
(
|
||||||
name1 [type1],
|
name1 [type1] [ALIAS expr1],
|
||||||
name2 [type2],
|
name2 [type2] [ALIAS expr2],
|
||||||
...
|
...
|
||||||
) ENGINE = Kafka()
|
) ENGINE = Kafka()
|
||||||
SETTINGS
|
SETTINGS
|
||||||
|
@ -13,8 +13,8 @@ The PostgreSQL engine allows to perform `SELECT` and `INSERT` queries on data th
|
|||||||
``` sql
|
``` sql
|
||||||
CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]
|
CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]
|
||||||
(
|
(
|
||||||
name1 [type1] [DEFAULT|MATERIALIZED|ALIAS expr1] [TTL expr1],
|
name1 type1 [DEFAULT|MATERIALIZED|ALIAS expr1] [TTL expr1],
|
||||||
name2 [type2] [DEFAULT|MATERIALIZED|ALIAS expr2] [TTL expr2],
|
name2 type2 [DEFAULT|MATERIALIZED|ALIAS expr2] [TTL expr2],
|
||||||
...
|
...
|
||||||
) ENGINE = PostgreSQL('host:port', 'database', 'table', 'user', 'password'[, `schema`]);
|
) ENGINE = PostgreSQL('host:port', 'database', 'table', 'user', 'password'[, `schema`]);
|
||||||
```
|
```
|
||||||
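As a rough, minimal sketch of the syntax above (the host, database, credentials, and column names below are placeholders, not values from this page), a concrete PostgreSQL-backed table might look like:

``` sql
-- Hypothetical table mapped onto an existing PostgreSQL table 'orders'.
CREATE TABLE pg_orders
(
    id UInt64,
    status String DEFAULT 'new',
    created_at DateTime
)
ENGINE = PostgreSQL('localhost:5432', 'shop', 'orders', 'pg_user', 'pg_password', 'public');
```

`SELECT` and `INSERT` queries against `pg_orders` are then forwarded to the remote `orders` table.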
|
@ -90,15 +90,17 @@ SELECT * FROM mySecondReplacingMT FINAL;
|
|||||||
|
|
||||||
### is_deleted
|
### is_deleted
|
||||||
|
|
||||||
`is_deleted` — Name of the column with the type of row: `1` is a “deleted“ row, `0` is a “state“ row.
|
`is_deleted` — Name of a column used during a merge to determine whether the data in this row represents the state or is to be deleted; `1` is a “deleted” row, `0` is a “state” row.
|
||||||
|
|
||||||
Column data type — `Int8`.
|
Column data type — `UInt8`.
|
||||||
|
|
||||||
Can only be enabled when `ver` is used.
|
:::note
|
||||||
The row is deleted when use the `OPTIMIZE ... FINAL CLEANUP`, or `OPTIMIZE ... FINAL` if the engine settings `clean_deleted_rows` has been set to `Always`.
|
`is_deleted` can only be enabled when `ver` is used.
|
||||||
No matter the operation on the data, the version must be increased. If two inserted rows have the same version number, the last inserted one is the one kept.
|
|
||||||
|
|
||||||
|
The row is deleted when `OPTIMIZE ... FINAL CLEANUP` is used, or when `OPTIMIZE ... FINAL` is used and the engine setting `clean_deleted_rows` has been set to `Always`.
|
||||||
|
|
||||||
|
No matter the operation on the data, the version must be increased. If two inserted rows have the same version number, the last inserted row is the one kept.
|
||||||
|
:::
|
||||||
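A minimal sketch of how these pieces fit together (the table, columns, and values below are hypothetical and only for illustration):

```sql
CREATE TABLE demo_rmt
(
    key Int64,
    value String,
    ver UInt64,
    is_deleted UInt8
)
ENGINE = ReplacingMergeTree(ver, is_deleted)
ORDER BY key;

-- Insert a row, then mark the same key as deleted with a higher version.
INSERT INTO demo_rmt VALUES (1, 'first', 1, 0);
INSERT INTO demo_rmt VALUES (1, 'first', 2, 1);

-- Rows whose latest version carries is_deleted = 1 are removed by this merge.
OPTIMIZE TABLE demo_rmt FINAL CLEANUP;

SELECT * FROM demo_rmt;  -- key 1 no longer appears
```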
|
|
||||||
## Query clauses
|
## Query clauses
|
||||||
|
|
||||||
|
636
docs/en/getting-started/example-datasets/reddit-comments.md
Normal file
636
docs/en/getting-started/example-datasets/reddit-comments.md
Normal file
@ -0,0 +1,636 @@
|
|||||||
|
---
|
||||||
|
slug: /en/getting-started/example-datasets/reddit-comments
|
||||||
|
sidebar_label: Reddit comments
|
||||||
|
---
|
||||||
|
|
||||||
|
# Reddit comments dataset
|
||||||
|
|
||||||
|
This dataset contains publicly available comments on Reddit from December 2005 to March 2023, and contains over 7B rows of data. The raw data is in JSON format in compressed `.zst` files and the rows look like the following:
|
||||||
|
|
||||||
|
```json
|
||||||
|
{"controversiality":0,"body":"A look at Vietnam and Mexico exposes the myth of market liberalisation.","subreddit_id":"t5_6","link_id":"t3_17863","stickied":false,"subreddit":"reddit.com","score":2,"ups":2,"author_flair_css_class":null,"created_utc":1134365188,"author_flair_text":null,"author":"frjo","id":"c13","edited":false,"parent_id":"t3_17863","gilded":0,"distinguished":null,"retrieved_on":1473738411}
|
||||||
|
{"created_utc":1134365725,"author_flair_css_class":null,"score":1,"ups":1,"subreddit":"reddit.com","stickied":false,"link_id":"t3_17866","subreddit_id":"t5_6","controversiality":0,"body":"The site states \"What can I use it for? Meeting notes, Reports, technical specs Sign-up sheets, proposals and much more...\", just like any other new breeed of sites that want us to store everything we have on the web. And they even guarantee multiple levels of security and encryption etc. But what prevents these web site operators fom accessing and/or stealing Meeting notes, Reports, technical specs Sign-up sheets, proposals and much more, for competitive or personal gains...? I am pretty sure that most of them are honest, but what's there to prevent me from setting up a good useful site and stealing all your data? Call me paranoid - I am.","retrieved_on":1473738411,"distinguished":null,"gilded":0,"id":"c14","edited":false,"parent_id":"t3_17866","author":"zse7zse","author_flair_text":null}
|
||||||
|
{"gilded":0,"distinguished":null,"retrieved_on":1473738411,"author":"[deleted]","author_flair_text":null,"edited":false,"id":"c15","parent_id":"t3_17869","subreddit":"reddit.com","score":0,"ups":0,"created_utc":1134366848,"author_flair_css_class":null,"body":"Jython related topics by Frank Wierzbicki","controversiality":0,"subreddit_id":"t5_6","stickied":false,"link_id":"t3_17869"}
|
||||||
|
{"gilded":0,"retrieved_on":1473738411,"distinguished":null,"author_flair_text":null,"author":"[deleted]","edited":false,"parent_id":"t3_17870","id":"c16","subreddit":"reddit.com","created_utc":1134367660,"author_flair_css_class":null,"score":1,"ups":1,"body":"[deleted]","controversiality":0,"stickied":false,"link_id":"t3_17870","subreddit_id":"t5_6"}
|
||||||
|
{"gilded":0,"retrieved_on":1473738411,"distinguished":null,"author_flair_text":null,"author":"rjoseph","edited":false,"id":"c17","parent_id":"t3_17817","subreddit":"reddit.com","author_flair_css_class":null,"created_utc":1134367754,"score":1,"ups":1,"body":"Saft is by far the best extension you could tak onto your Safari","controversiality":0,"link_id":"t3_17817","stickied":false,"subreddit_id":"t5_6"}
|
||||||
|
```
|
||||||
|
|
||||||
|
A shoutout to Percona for the [motivation behind ingesting this dataset](https://www.percona.com/blog/big-data-set-reddit-comments-analyzing-clickhouse/), which we have downloaded and stored in an S3 bucket.
|
||||||
|
|
||||||
|
:::note
|
||||||
|
The following commands were executed on ClickHouse Cloud. To run this on your own cluster, replace `default` in the `s3Cluster` function call with the name of your cluster. If you do not have a cluster, then replace the `s3Cluster` function with the `s3` function.
|
||||||
|
:::
|
||||||
|
|
||||||
|
1. Let's create a table for the Reddit data:
|
||||||
|
|
||||||
|
```sql
|
||||||
|
CREATE TABLE reddit
|
||||||
|
(
|
||||||
|
subreddit LowCardinality(String),
|
||||||
|
subreddit_id LowCardinality(String),
|
||||||
|
subreddit_type Enum('public' = 1, 'restricted' = 2, 'user' = 3, 'archived' = 4, 'gold_restricted' = 5, 'private' = 6),
|
||||||
|
author LowCardinality(String),
|
||||||
|
body String CODEC(ZSTD(6)),
|
||||||
|
created_date Date DEFAULT toDate(created_utc),
|
||||||
|
created_utc DateTime,
|
||||||
|
retrieved_on DateTime,
|
||||||
|
id String,
|
||||||
|
parent_id String,
|
||||||
|
link_id String,
|
||||||
|
score Int32,
|
||||||
|
total_awards_received UInt16,
|
||||||
|
controversiality UInt8,
|
||||||
|
gilded UInt8,
|
||||||
|
collapsed_because_crowd_control UInt8,
|
||||||
|
collapsed_reason Enum('' = 0, 'comment score below threshold' = 1, 'may be sensitive content' = 2, 'potentially toxic' = 3, 'potentially toxic content' = 4),
|
||||||
|
distinguished Enum('' = 0, 'moderator' = 1, 'admin' = 2, 'special' = 3),
|
||||||
|
removal_reason Enum('' = 0, 'legal' = 1),
|
||||||
|
author_created_utc DateTime,
|
||||||
|
author_fullname LowCardinality(String),
|
||||||
|
author_patreon_flair UInt8,
|
||||||
|
author_premium UInt8,
|
||||||
|
can_gild UInt8,
|
||||||
|
can_mod_post UInt8,
|
||||||
|
collapsed UInt8,
|
||||||
|
is_submitter UInt8,
|
||||||
|
_edited String,
|
||||||
|
locked UInt8,
|
||||||
|
quarantined UInt8,
|
||||||
|
no_follow UInt8,
|
||||||
|
send_replies UInt8,
|
||||||
|
stickied UInt8,
|
||||||
|
author_flair_text LowCardinality(String)
|
||||||
|
)
|
||||||
|
ENGINE = MergeTree
|
||||||
|
ORDER BY (subreddit, created_date, author);
|
||||||
|
```
|
||||||
|
|
||||||
|
:::note
|
||||||
|
The names of the files in S3 start with `RC_YYYY-MM` where `YYYY-MM` goes from `2005-12` to `2023-02`. The compression changes a couple of times though, so the file extensions are not consistent. For example:
|
||||||
|
|
||||||
|
- the file names are initially `RC_2005-12.bz2` to `RC_2017-11.bz2`
|
||||||
|
- then they look like `RC_2017-12.xz` to `RC_2018-09.xz`
|
||||||
|
- and finally `RC_2018-10.zst` to `RC_2023-02.zst`
|
||||||
|
:::
|
||||||
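As a rough illustration of working with this naming scheme (the URL just follows the bucket layout described in the note above), a glob can target one era and extension at a time, and ClickHouse picks the decompression codec from the file extension:

```sql
-- Counts only the 2006 comments; every 2006 file uses the .bz2 extension.
SELECT count()
FROM s3(
    'https://clickhouse-public-datasets.s3.eu-central-1.amazonaws.com/reddit/original/RC_2006-*.bz2',
    'JSONEachRow');
```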
|
|
||||||
|
2. We are going to start with one month of data, but if you want to simply insert every row - skip ahead to step 8 below. The following file has 86M records from December, 2017:
|
||||||
|
|
||||||
|
```sql
|
||||||
|
INSERT INTO reddit
|
||||||
|
SELECT *
|
||||||
|
FROM s3Cluster(
|
||||||
|
'default',
|
||||||
|
'https://clickhouse-public-datasets.s3.eu-central-1.amazonaws.com/reddit/original/RC_2017-12.xz',
|
||||||
|
'JSONEachRow'
|
||||||
|
);
|
||||||
|
```
|
||||||
|
|
||||||
|
If you do not have a cluster, use `s3` instead of `s3Cluster`:
|
||||||
|
|
||||||
|
```sql
|
||||||
|
INSERT INTO reddit
|
||||||
|
SELECT *
|
||||||
|
FROM s3(
|
||||||
|
'https://clickhouse-public-datasets.s3.eu-central-1.amazonaws.com/reddit/original/RC_2017-12.xz',
|
||||||
|
'JSONEachRow'
|
||||||
|
);
|
||||||
|
```
|
||||||
|
|
||||||
|
3. It will take a while depending on your resources, but when it's done verify it worked:
|
||||||
|
|
||||||
|
```sql
|
||||||
|
SELECT formatReadableQuantity(count())
|
||||||
|
FROM reddit;
|
||||||
|
```
|
||||||
|
|
||||||
|
```response
|
||||||
|
┌─formatReadableQuantity(count())─┐
|
||||||
|
│ 85.97 million │
|
||||||
|
└─────────────────────────────────┘
|
||||||
|
```
|
||||||
|
|
||||||
|
4. Let's see how many unique subreddits were in December of 2017:
|
||||||
|
|
||||||
|
```sql
|
||||||
|
SELECT uniqExact(subreddit)
|
||||||
|
FROM reddit;
|
||||||
|
```
|
||||||
|
|
||||||
|
```response
|
||||||
|
┌─uniqExact(subreddit)─┐
|
||||||
|
│ 91613 │
|
||||||
|
└──────────────────────┘
|
||||||
|
|
||||||
|
1 row in set. Elapsed: 1.572 sec. Processed 85.97 million rows, 367.43 MB (54.71 million rows/s., 233.80 MB/s.)
|
||||||
|
```
|
||||||
|
|
||||||
|
5. This query returns the top 20 subreddits (in terms of number of comments):
|
||||||
|
|
||||||
|
```sql
|
||||||
|
SELECT
|
||||||
|
subreddit,
|
||||||
|
count() AS c
|
||||||
|
FROM reddit
|
||||||
|
GROUP BY subreddit
|
||||||
|
ORDER BY c DESC
|
||||||
|
LIMIT 20;
|
||||||
|
```
|
||||||
|
|
||||||
|
```response
|
||||||
|
┌─subreddit───────┬───────c─┐
|
||||||
|
│ AskReddit │ 5245881 │
|
||||||
|
│ politics │ 1753120 │
|
||||||
|
│ nfl │ 1220266 │
|
||||||
|
│ nba │ 960388 │
|
||||||
|
│ The_Donald │ 931857 │
|
||||||
|
│ news │ 796617 │
|
||||||
|
│ worldnews │ 765709 │
|
||||||
|
│ CFB │ 710360 │
|
||||||
|
│ gaming │ 602761 │
|
||||||
|
│ movies │ 601966 │
|
||||||
|
│ soccer │ 590628 │
|
||||||
|
│ Bitcoin │ 583783 │
|
||||||
|
│ pics │ 563408 │
|
||||||
|
│ StarWars │ 562514 │
|
||||||
|
│ funny │ 547563 │
|
||||||
|
│ leagueoflegends │ 517213 │
|
||||||
|
│ teenagers │ 492020 │
|
||||||
|
│ DestinyTheGame │ 477377 │
|
||||||
|
│ todayilearned │ 472650 │
|
||||||
|
│ videos │ 450581 │
|
||||||
|
└─────────────────┴─────────┘
|
||||||
|
|
||||||
|
20 rows in set. Elapsed: 0.368 sec. Processed 85.97 million rows, 367.43 MB (233.34 million rows/s., 997.25 MB/s.)
|
||||||
|
```
|
||||||
|
|
||||||
|
6. Here are the top 10 authors in December of 2017, in terms of number of comments posted:
|
||||||
|
|
||||||
|
```sql
|
||||||
|
SELECT
|
||||||
|
author,
|
||||||
|
count() AS c
|
||||||
|
FROM reddit
|
||||||
|
GROUP BY author
|
||||||
|
ORDER BY c DESC
|
||||||
|
LIMIT 10;
|
||||||
|
```
|
||||||
|
|
||||||
|
```response
|
||||||
|
┌─author──────────┬───────c─┐
|
||||||
|
│ [deleted] │ 5913324 │
|
||||||
|
│ AutoModerator │ 784886 │
|
||||||
|
│ ImagesOfNetwork │ 83241 │
|
||||||
|
│ BitcoinAllBot │ 54484 │
|
||||||
|
│ imguralbumbot │ 45822 │
|
||||||
|
│ RPBot │ 29337 │
|
||||||
|
│ WikiTextBot │ 25982 │
|
||||||
|
│ Concise_AMA_Bot │ 19974 │
|
||||||
|
│ MTGCardFetcher │ 19103 │
|
||||||
|
│ TotesMessenger │ 19057 │
|
||||||
|
└─────────────────┴─────────┘
|
||||||
|
|
||||||
|
10 rows in set. Elapsed: 8.143 sec. Processed 85.97 million rows, 711.05 MB (10.56 million rows/s., 87.32 MB/s.)
|
||||||
|
```
|
||||||
|
|
||||||
|
7. We already inserted some data, but we will start over:
|
||||||
|
|
||||||
|
```sql
|
||||||
|
TRUNCATE TABLE reddit;
|
||||||
|
```
|
||||||
|
|
||||||
|
8. This is a fun dataset and it looks like we can find some great information, so let's go ahead and insert the entire dataset from 2005 to 2023. When you're ready, run this command to insert all the rows. (It takes a while - up to 17 hours!)
|
||||||
|
|
||||||
|
```sql
|
||||||
|
INSERT INTO reddit
|
||||||
|
SELECT *
|
||||||
|
FROM s3Cluster(
|
||||||
|
'default',
|
||||||
|
'https://clickhouse-public-datasets.s3.amazonaws.com/reddit/original/RC*',
|
||||||
|
'JSONEachRow'
|
||||||
|
)
|
||||||
|
SETTINGS zstd_window_log_max = 31;
|
||||||
|
```
|
||||||
|
|
||||||
|
The response looks like:
|
||||||
|
|
||||||
|
```response
|
||||||
|
0 rows in set. Elapsed: 61187.839 sec. Processed 6.74 billion rows, 2.06 TB (110.17 thousand rows/s., 33.68 MB/s.)
|
||||||
|
```
|
||||||
|
|
||||||
|
9. Let's see how many rows were inserted and how much disk space the table is using:
|
||||||
|
|
||||||
|
|
||||||
|
```sql
|
||||||
|
SELECT
|
||||||
|
sum(rows) AS count,
|
||||||
|
formatReadableQuantity(count),
|
||||||
|
formatReadableSize(sum(bytes)) AS disk_size,
|
||||||
|
formatReadableSize(sum(data_uncompressed_bytes)) AS uncompressed_size
|
||||||
|
FROM system.parts
|
||||||
|
WHERE (table = 'reddit') AND active
|
||||||
|
```
|
||||||
|
|
||||||
|
Notice the compression of disk storage is about 1/3 of the uncompressed size:
|
||||||
|
|
||||||
|
```response
|
||||||
|
┌──────count─┬─formatReadableQuantity(sum(rows))─┬─disk_size──┬─uncompressed_size─┐
|
||||||
|
│ 6739503568 │ 6.74 billion │ 501.10 GiB │ 1.51 TiB │
|
||||||
|
└────────────┴───────────────────────────────────┴────────────┴───────────────────┘
|
||||||
|
|
||||||
|
1 row in set. Elapsed: 0.010 sec.
|
||||||
|
```
|
||||||
|
|
||||||
|
10. The following query shows how many comments, authors and subreddits we have for each month:
|
||||||
|
|
||||||
|
```sql
|
||||||
|
SELECT
|
||||||
|
toStartOfMonth(created_utc) AS firstOfMonth,
|
||||||
|
count() AS c,
|
||||||
|
bar(c, 0, 50000000, 25) AS bar_count,
|
||||||
|
uniq(author) AS authors,
|
||||||
|
bar(authors, 0, 5000000, 25) AS bar_authors,
|
||||||
|
uniq(subreddit) AS subreddits,
|
||||||
|
bar(subreddits, 0, 100000, 25) AS bar_subreddits
|
||||||
|
FROM reddit
|
||||||
|
GROUP BY firstOfMonth
|
||||||
|
ORDER BY firstOfMonth ASC;
|
||||||
|
```
|
||||||
|
|
||||||
|
This is a substantial query that has to process all 6.74 billion rows, but we still get an impressive response time (about 3 minutes):
|
||||||
|
|
||||||
|
```response
|
||||||
|
┌─firstOfMonth─┬─────────c─┬─bar_count─────────────────┬─authors─┬─bar_authors───────────────┬─subreddits─┬─bar_subreddits────────────┐
|
||||||
|
│ 2005-12-01 │ 1075 │ │ 394 │ │ 1 │ │
|
||||||
|
│ 2006-01-01 │ 3666 │ │ 791 │ │ 2 │ │
|
||||||
|
│ 2006-02-01 │ 9095 │ │ 1464 │ │ 18 │ │
|
||||||
|
│ 2006-03-01 │ 13859 │ │ 1958 │ │ 15 │ │
|
||||||
|
│ 2006-04-01 │ 19090 │ │ 2334 │ │ 21 │ │
|
||||||
|
│ 2006-05-01 │ 26859 │ │ 2698 │ │ 21 │ │
|
||||||
|
│ 2006-06-01 │ 29163 │ │ 3043 │ │ 19 │ │
|
||||||
|
│ 2006-07-01 │ 37031 │ │ 3532 │ │ 22 │ │
|
||||||
|
│ 2006-08-01 │ 50559 │ │ 4750 │ │ 24 │ │
|
||||||
|
│ 2006-09-01 │ 50675 │ │ 4908 │ │ 21 │ │
|
||||||
|
│ 2006-10-01 │ 54148 │ │ 5654 │ │ 31 │ │
|
||||||
|
│ 2006-11-01 │ 62021 │ │ 6490 │ │ 23 │ │
|
||||||
|
│ 2006-12-01 │ 61018 │ │ 6707 │ │ 24 │ │
|
||||||
|
│ 2007-01-01 │ 81341 │ │ 7931 │ │ 23 │ │
|
||||||
|
│ 2007-02-01 │ 95634 │ │ 9020 │ │ 21 │ │
|
||||||
|
│ 2007-03-01 │ 112444 │ │ 10842 │ │ 23 │ │
|
||||||
|
│ 2007-04-01 │ 126773 │ │ 10701 │ │ 26 │ │
|
||||||
|
│ 2007-05-01 │ 170097 │ │ 11365 │ │ 25 │ │
|
||||||
|
│ 2007-06-01 │ 178800 │ │ 11267 │ │ 22 │ │
|
||||||
|
│ 2007-07-01 │ 203319 │ │ 12482 │ │ 25 │ │
|
||||||
|
│ 2007-08-01 │ 225111 │ │ 14124 │ │ 30 │ │
|
||||||
|
│ 2007-09-01 │ 259497 │ ▏ │ 15416 │ │ 33 │ │
|
||||||
|
│ 2007-10-01 │ 274170 │ ▏ │ 15302 │ │ 36 │ │
|
||||||
|
│ 2007-11-01 │ 372983 │ ▏ │ 15134 │ │ 43 │ │
|
||||||
|
│ 2007-12-01 │ 363390 │ ▏ │ 15915 │ │ 31 │ │
|
||||||
|
│ 2008-01-01 │ 452990 │ ▏ │ 18857 │ │ 126 │ │
|
||||||
|
│ 2008-02-01 │ 441768 │ ▏ │ 18266 │ │ 173 │ │
|
||||||
|
│ 2008-03-01 │ 463728 │ ▏ │ 18947 │ │ 292 │ │
|
||||||
|
│ 2008-04-01 │ 468317 │ ▏ │ 18590 │ │ 323 │ │
|
||||||
|
│ 2008-05-01 │ 536380 │ ▎ │ 20861 │ │ 375 │ │
|
||||||
|
│ 2008-06-01 │ 577684 │ ▎ │ 22557 │ │ 575 │ ▏ │
|
||||||
|
│ 2008-07-01 │ 592610 │ ▎ │ 23123 │ │ 657 │ ▏ │
|
||||||
|
│ 2008-08-01 │ 595959 │ ▎ │ 23729 │ │ 707 │ ▏ │
|
||||||
|
│ 2008-09-01 │ 680892 │ ▎ │ 26374 │ ▏ │ 801 │ ▏ │
|
||||||
|
│ 2008-10-01 │ 789874 │ ▍ │ 28970 │ ▏ │ 893 │ ▏ │
|
||||||
|
│ 2008-11-01 │ 792310 │ ▍ │ 30272 │ ▏ │ 1024 │ ▎ │
|
||||||
|
│ 2008-12-01 │ 850359 │ ▍ │ 34073 │ ▏ │ 1103 │ ▎ │
|
||||||
|
│ 2009-01-01 │ 1051649 │ ▌ │ 38978 │ ▏ │ 1316 │ ▎ │
|
||||||
|
│ 2009-02-01 │ 944711 │ ▍ │ 43390 │ ▏ │ 1132 │ ▎ │
|
||||||
|
│ 2009-03-01 │ 1048643 │ ▌ │ 46516 │ ▏ │ 1203 │ ▎ │
|
||||||
|
│ 2009-04-01 │ 1094599 │ ▌ │ 48284 │ ▏ │ 1334 │ ▎ │
|
||||||
|
│ 2009-05-01 │ 1201257 │ ▌ │ 52512 │ ▎ │ 1395 │ ▎ │
|
||||||
|
│ 2009-06-01 │ 1258750 │ ▋ │ 57728 │ ▎ │ 1473 │ ▎ │
|
||||||
|
│ 2009-07-01 │ 1470290 │ ▋ │ 60098 │ ▎ │ 1686 │ ▍ │
|
||||||
|
│ 2009-08-01 │ 1750688 │ ▉ │ 67347 │ ▎ │ 1777 │ ▍ │
|
||||||
|
│ 2009-09-01 │ 2032276 │ █ │ 78051 │ ▍ │ 1784 │ ▍ │
|
||||||
|
│ 2009-10-01 │ 2242017 │ █ │ 93409 │ ▍ │ 2071 │ ▌ │
|
||||||
|
│ 2009-11-01 │ 2207444 │ █ │ 95940 │ ▍ │ 2141 │ ▌ │
|
||||||
|
│ 2009-12-01 │ 2560510 │ █▎ │ 104239 │ ▌ │ 2141 │ ▌ │
|
||||||
|
│ 2010-01-01 │ 2884096 │ █▍ │ 114314 │ ▌ │ 2313 │ ▌ │
|
||||||
|
│ 2010-02-01 │ 2687779 │ █▎ │ 115683 │ ▌ │ 2522 │ ▋ │
|
||||||
|
│ 2010-03-01 │ 3228254 │ █▌ │ 125775 │ ▋ │ 2890 │ ▋ │
|
||||||
|
│ 2010-04-01 │ 3209898 │ █▌ │ 128936 │ ▋ │ 3170 │ ▊ │
|
||||||
|
│ 2010-05-01 │ 3267363 │ █▋ │ 131851 │ ▋ │ 3166 │ ▊ │
|
||||||
|
│ 2010-06-01 │ 3532867 │ █▊ │ 139522 │ ▋ │ 3301 │ ▊ │
|
||||||
|
│ 2010-07-01 │ 4032737 │ ██ │ 153451 │ ▊ │ 3662 │ ▉ │
|
||||||
|
│ 2010-08-01 │ 4247982 │ ██ │ 164071 │ ▊ │ 3653 │ ▉ │
|
||||||
|
│ 2010-09-01 │ 4704069 │ ██▎ │ 186613 │ ▉ │ 4009 │ █ │
|
||||||
|
│ 2010-10-01 │ 5032368 │ ██▌ │ 203800 │ █ │ 4154 │ █ │
|
||||||
|
│ 2010-11-01 │ 5689002 │ ██▊ │ 226134 │ █▏ │ 4383 │ █ │
|
||||||
|
│ 2010-12-01 │ 5972642 │ ██▉ │ 245824 │ █▏ │ 4692 │ █▏ │
|
||||||
|
│ 2011-01-01 │ 6603329 │ ███▎ │ 270025 │ █▎ │ 5141 │ █▎ │
|
||||||
|
│ 2011-02-01 │ 6363114 │ ███▏ │ 277593 │ █▍ │ 5202 │ █▎ │
|
||||||
|
│ 2011-03-01 │ 7556165 │ ███▊ │ 314748 │ █▌ │ 5445 │ █▎ │
|
||||||
|
│ 2011-04-01 │ 7571398 │ ███▊ │ 329920 │ █▋ │ 6128 │ █▌ │
|
||||||
|
│ 2011-05-01 │ 8803949 │ ████▍ │ 365013 │ █▊ │ 6834 │ █▋ │
|
||||||
|
│ 2011-06-01 │ 9766511 │ ████▉ │ 393945 │ █▉ │ 7519 │ █▉ │
|
||||||
|
│ 2011-07-01 │ 10557466 │ █████▎ │ 424235 │ ██ │ 8293 │ ██ │
|
||||||
|
│ 2011-08-01 │ 12316144 │ ██████▏ │ 475326 │ ██▍ │ 9657 │ ██▍ │
|
||||||
|
│ 2011-09-01 │ 12150412 │ ██████ │ 503142 │ ██▌ │ 10278 │ ██▌ │
|
||||||
|
│ 2011-10-01 │ 13470278 │ ██████▋ │ 548801 │ ██▋ │ 10922 │ ██▋ │
|
||||||
|
│ 2011-11-01 │ 13621533 │ ██████▊ │ 574435 │ ██▊ │ 11572 │ ██▉ │
|
||||||
|
│ 2011-12-01 │ 14509469 │ ███████▎ │ 622849 │ ███ │ 12335 │ ███ │
|
||||||
|
│ 2012-01-01 │ 16350205 │ ████████▏ │ 696110 │ ███▍ │ 14281 │ ███▌ │
|
||||||
|
│ 2012-02-01 │ 16015695 │ ████████ │ 722892 │ ███▌ │ 14949 │ ███▋ │
|
||||||
|
│ 2012-03-01 │ 17881943 │ ████████▉ │ 789664 │ ███▉ │ 15795 │ ███▉ │
|
||||||
|
│ 2012-04-01 │ 19044534 │ █████████▌ │ 842491 │ ████▏ │ 16440 │ ████ │
|
||||||
|
│ 2012-05-01 │ 20388260 │ ██████████▏ │ 886176 │ ████▍ │ 16974 │ ████▏ │
|
||||||
|
│ 2012-06-01 │ 21897913 │ ██████████▉ │ 946798 │ ████▋ │ 17952 │ ████▍ │
|
||||||
|
│ 2012-07-01 │ 24087517 │ ████████████ │ 1018636 │ █████ │ 19069 │ ████▊ │
|
||||||
|
│ 2012-08-01 │ 25703326 │ ████████████▊ │ 1094445 │ █████▍ │ 20553 │ █████▏ │
|
||||||
|
│ 2012-09-01 │ 23419524 │ ███████████▋ │ 1088491 │ █████▍ │ 20831 │ █████▏ │
|
||||||
|
│ 2012-10-01 │ 24788236 │ ████████████▍ │ 1131885 │ █████▋ │ 21868 │ █████▍ │
|
||||||
|
│ 2012-11-01 │ 24648302 │ ████████████▎ │ 1167608 │ █████▊ │ 21791 │ █████▍ │
|
||||||
|
│ 2012-12-01 │ 26080276 │ █████████████ │ 1218402 │ ██████ │ 22622 │ █████▋ │
|
||||||
|
│ 2013-01-01 │ 30365867 │ ███████████████▏ │ 1341703 │ ██████▋ │ 24696 │ ██████▏ │
|
||||||
|
│ 2013-02-01 │ 27213960 │ █████████████▌ │ 1304756 │ ██████▌ │ 24514 │ ██████▏ │
|
||||||
|
│ 2013-03-01 │ 30771274 │ ███████████████▍ │ 1391703 │ ██████▉ │ 25730 │ ██████▍ │
|
||||||
|
│ 2013-04-01 │ 33259557 │ ████████████████▋ │ 1485971 │ ███████▍ │ 27294 │ ██████▊ │
|
||||||
|
│ 2013-05-01 │ 33126225 │ ████████████████▌ │ 1506473 │ ███████▌ │ 27299 │ ██████▊ │
|
||||||
|
│ 2013-06-01 │ 32648247 │ ████████████████▎ │ 1506650 │ ███████▌ │ 27450 │ ██████▊ │
|
||||||
|
│ 2013-07-01 │ 34922133 │ █████████████████▍ │ 1561771 │ ███████▊ │ 28294 │ ███████ │
|
||||||
|
│ 2013-08-01 │ 34766579 │ █████████████████▍ │ 1589781 │ ███████▉ │ 28943 │ ███████▏ │
|
||||||
|
│ 2013-09-01 │ 31990369 │ ███████████████▉ │ 1570342 │ ███████▊ │ 29408 │ ███████▎ │
|
||||||
|
│ 2013-10-01 │ 35940040 │ █████████████████▉ │ 1683770 │ ████████▍ │ 30273 │ ███████▌ │
|
||||||
|
│ 2013-11-01 │ 37396497 │ ██████████████████▋ │ 1757467 │ ████████▊ │ 31173 │ ███████▊ │
|
||||||
|
│ 2013-12-01 │ 39810216 │ ███████████████████▉ │ 1846204 │ █████████▏ │ 32326 │ ████████ │
|
||||||
|
│ 2014-01-01 │ 42420655 │ █████████████████████▏ │ 1927229 │ █████████▋ │ 35603 │ ████████▉ │
|
||||||
|
│ 2014-02-01 │ 38703362 │ ███████████████████▎ │ 1874067 │ █████████▎ │ 37007 │ █████████▎ │
|
||||||
|
│ 2014-03-01 │ 42459956 │ █████████████████████▏ │ 1959888 │ █████████▊ │ 37948 │ █████████▍ │
|
||||||
|
│ 2014-04-01 │ 42440735 │ █████████████████████▏ │ 1951369 │ █████████▊ │ 38362 │ █████████▌ │
|
||||||
|
│ 2014-05-01 │ 42514094 │ █████████████████████▎ │ 1970197 │ █████████▊ │ 39078 │ █████████▊ │
|
||||||
|
│ 2014-06-01 │ 41990650 │ ████████████████████▉ │ 1943850 │ █████████▋ │ 38268 │ █████████▌ │
|
||||||
|
│ 2014-07-01 │ 46868899 │ ███████████████████████▍ │ 2059346 │ ██████████▎ │ 40634 │ ██████████▏ │
|
||||||
|
│ 2014-08-01 │ 46990813 │ ███████████████████████▍ │ 2117335 │ ██████████▌ │ 41764 │ ██████████▍ │
|
||||||
|
│ 2014-09-01 │ 44992201 │ ██████████████████████▍ │ 2124708 │ ██████████▌ │ 41890 │ ██████████▍ │
|
||||||
|
│ 2014-10-01 │ 47497520 │ ███████████████████████▋ │ 2206535 │ ███████████ │ 43109 │ ██████████▊ │
|
||||||
|
│ 2014-11-01 │ 46118074 │ ███████████████████████ │ 2239747 │ ███████████▏ │ 43718 │ ██████████▉ │
|
||||||
|
│ 2014-12-01 │ 48807699 │ ████████████████████████▍ │ 2372945 │ ███████████▊ │ 43823 │ ██████████▉ │
|
||||||
|
│ 2015-01-01 │ 53851542 │ █████████████████████████ │ 2499536 │ ████████████▍ │ 47172 │ ███████████▊ │
|
||||||
|
│ 2015-02-01 │ 48342747 │ ████████████████████████▏ │ 2448496 │ ████████████▏ │ 47229 │ ███████████▊ │
|
||||||
|
│ 2015-03-01 │ 54564441 │ █████████████████████████ │ 2550534 │ ████████████▊ │ 48156 │ ████████████ │
|
||||||
|
│ 2015-04-01 │ 55005780 │ █████████████████████████ │ 2609443 │ █████████████ │ 49865 │ ████████████▍ │
|
||||||
|
│ 2015-05-01 │ 54504410 │ █████████████████████████ │ 2585535 │ ████████████▉ │ 50137 │ ████████████▌ │
|
||||||
|
│ 2015-06-01 │ 54258492 │ █████████████████████████ │ 2595129 │ ████████████▉ │ 49598 │ ████████████▍ │
|
||||||
|
│ 2015-07-01 │ 58451788 │ █████████████████████████ │ 2720026 │ █████████████▌ │ 55022 │ █████████████▊ │
|
||||||
|
│ 2015-08-01 │ 58075327 │ █████████████████████████ │ 2743994 │ █████████████▋ │ 55302 │ █████████████▊ │
|
||||||
|
│ 2015-09-01 │ 55574825 │ █████████████████████████ │ 2672793 │ █████████████▎ │ 53960 │ █████████████▍ │
|
||||||
|
│ 2015-10-01 │ 59494045 │ █████████████████████████ │ 2816426 │ ██████████████ │ 70210 │ █████████████████▌ │
|
||||||
|
│ 2015-11-01 │ 57117500 │ █████████████████████████ │ 2847146 │ ██████████████▏ │ 71363 │ █████████████████▊ │
|
||||||
|
│ 2015-12-01 │ 58523312 │ █████████████████████████ │ 2854840 │ ██████████████▎ │ 94559 │ ███████████████████████▋ │
|
||||||
|
│ 2016-01-01 │ 61991732 │ █████████████████████████ │ 2920366 │ ██████████████▌ │ 108438 │ █████████████████████████ │
|
||||||
|
│ 2016-02-01 │ 59189875 │ █████████████████████████ │ 2854683 │ ██████████████▎ │ 109916 │ █████████████████████████ │
|
||||||
|
│ 2016-03-01 │ 63918864 │ █████████████████████████ │ 2969542 │ ██████████████▊ │ 84787 │ █████████████████████▏ │
|
||||||
|
│ 2016-04-01 │ 64271256 │ █████████████████████████ │ 2999086 │ ██████████████▉ │ 61647 │ ███████████████▍ │
|
||||||
|
│ 2016-05-01 │ 65212004 │ █████████████████████████ │ 3034674 │ ███████████████▏ │ 67465 │ ████████████████▊ │
|
||||||
|
│ 2016-06-01 │ 65867743 │ █████████████████████████ │ 3057604 │ ███████████████▎ │ 75170 │ ██████████████████▊ │
|
||||||
|
│ 2016-07-01 │ 66974735 │ █████████████████████████ │ 3199374 │ ███████████████▉ │ 77732 │ ███████████████████▍ │
|
||||||
|
│ 2016-08-01 │ 69654819 │ █████████████████████████ │ 3239957 │ ████████████████▏ │ 63080 │ ███████████████▊ │
|
||||||
|
│ 2016-09-01 │ 67024973 │ █████████████████████████ │ 3190864 │ ███████████████▉ │ 62324 │ ███████████████▌ │
|
||||||
|
│ 2016-10-01 │ 71826553 │ █████████████████████████ │ 3284340 │ ████████████████▍ │ 62549 │ ███████████████▋ │
|
||||||
|
│ 2016-11-01 │ 71022319 │ █████████████████████████ │ 3300822 │ ████████████████▌ │ 69718 │ █████████████████▍ │
|
||||||
|
│ 2016-12-01 │ 72942967 │ █████████████████████████ │ 3430324 │ █████████████████▏ │ 71705 │ █████████████████▉ │
|
||||||
|
│ 2017-01-01 │ 78946585 │ █████████████████████████ │ 3572093 │ █████████████████▊ │ 78198 │ ███████████████████▌ │
|
||||||
|
│ 2017-02-01 │ 70609487 │ █████████████████████████ │ 3421115 │ █████████████████ │ 69823 │ █████████████████▍ │
|
||||||
|
│ 2017-03-01 │ 79723106 │ █████████████████████████ │ 3638122 │ ██████████████████▏ │ 73865 │ ██████████████████▍ │
|
||||||
|
│ 2017-04-01 │ 77478009 │ █████████████████████████ │ 3620591 │ ██████████████████ │ 74387 │ ██████████████████▌ │
|
||||||
|
│ 2017-05-01 │ 79810360 │ █████████████████████████ │ 3650820 │ ██████████████████▎ │ 74356 │ ██████████████████▌ │
|
||||||
|
│ 2017-06-01 │ 79901711 │ █████████████████████████ │ 3737614 │ ██████████████████▋ │ 72114 │ ██████████████████ │
|
||||||
|
│ 2017-07-01 │ 81798725 │ █████████████████████████ │ 3872330 │ ███████████████████▎ │ 76052 │ ███████████████████ │
|
||||||
|
│ 2017-08-01 │ 84658503 │ █████████████████████████ │ 3960093 │ ███████████████████▊ │ 77798 │ ███████████████████▍ │
|
||||||
|
│ 2017-09-01 │ 83165192 │ █████████████████████████ │ 3880501 │ ███████████████████▍ │ 78402 │ ███████████████████▌ │
|
||||||
|
│ 2017-10-01 │ 85828912 │ █████████████████████████ │ 3980335 │ ███████████████████▉ │ 80685 │ ████████████████████▏ │
|
||||||
|
│ 2017-11-01 │ 84965681 │ █████████████████████████ │ 4026749 │ ████████████████████▏ │ 82659 │ ████████████████████▋ │
|
||||||
|
│ 2017-12-01 │ 85973810 │ █████████████████████████ │ 4196354 │ ████████████████████▉ │ 91984 │ ██████████████████████▉ │
|
||||||
|
│ 2018-01-01 │ 91558594 │ █████████████████████████ │ 4364443 │ █████████████████████▊ │ 102577 │ █████████████████████████ │
|
||||||
|
│ 2018-02-01 │ 86467179 │ █████████████████████████ │ 4277899 │ █████████████████████▍ │ 104610 │ █████████████████████████ │
|
||||||
|
│ 2018-03-01 │ 96490262 │ █████████████████████████ │ 4422470 │ ██████████████████████ │ 112559 │ █████████████████████████ │
|
||||||
|
│ 2018-04-01 │ 98101232 │ █████████████████████████ │ 4572434 │ ██████████████████████▊ │ 105284 │ █████████████████████████ │
|
||||||
|
│ 2018-05-01 │ 100109100 │ █████████████████████████ │ 4698908 │ ███████████████████████▍ │ 103910 │ █████████████████████████ │
|
||||||
|
│ 2018-06-01 │ 100009462 │ █████████████████████████ │ 4697426 │ ███████████████████████▍ │ 101107 │ █████████████████████████ │
|
||||||
|
│ 2018-07-01 │ 108151359 │ █████████████████████████ │ 5099492 │ █████████████████████████ │ 106184 │ █████████████████████████ │
|
||||||
|
│ 2018-08-01 │ 107330940 │ █████████████████████████ │ 5084082 │ █████████████████████████ │ 109985 │ █████████████████████████ │
|
||||||
|
│ 2018-09-01 │ 104473929 │ █████████████████████████ │ 5011953 │ █████████████████████████ │ 109710 │ █████████████████████████ │
|
||||||
|
│ 2018-10-01 │ 112346556 │ █████████████████████████ │ 5320405 │ █████████████████████████ │ 112533 │ █████████████████████████ │
|
||||||
|
│ 2018-11-01 │ 112573001 │ █████████████████████████ │ 5353282 │ █████████████████████████ │ 112211 │ █████████████████████████ │
|
||||||
|
│ 2018-12-01 │ 121953600 │ █████████████████████████ │ 5611543 │ █████████████████████████ │ 118291 │ █████████████████████████ │
|
||||||
|
│ 2019-01-01 │ 129386587 │ █████████████████████████ │ 6016687 │ █████████████████████████ │ 125725 │ █████████████████████████ │
|
||||||
|
│ 2019-02-01 │ 120645639 │ █████████████████████████ │ 5974488 │ █████████████████████████ │ 125420 │ █████████████████████████ │
|
||||||
|
│ 2019-03-01 │ 137650471 │ █████████████████████████ │ 6410197 │ █████████████████████████ │ 135924 │ █████████████████████████ │
|
||||||
|
│ 2019-04-01 │ 138473643 │ █████████████████████████ │ 6416384 │ █████████████████████████ │ 139844 │ █████████████████████████ │
|
||||||
|
│ 2019-05-01 │ 142463421 │ █████████████████████████ │ 6574836 │ █████████████████████████ │ 142012 │ █████████████████████████ │
|
||||||
|
│ 2019-06-01 │ 134172939 │ █████████████████████████ │ 6601267 │ █████████████████████████ │ 140997 │ █████████████████████████ │
|
||||||
|
│ 2019-07-01 │ 145965083 │ █████████████████████████ │ 6901822 │ █████████████████████████ │ 147802 │ █████████████████████████ │
|
||||||
|
│ 2019-08-01 │ 146854393 │ █████████████████████████ │ 6993882 │ █████████████████████████ │ 151888 │ █████████████████████████ │
|
||||||
|
│ 2019-09-01 │ 137540219 │ █████████████████████████ │ 7001362 │ █████████████████████████ │ 148839 │ █████████████████████████ │
|
||||||
|
│ 2019-10-01 │ 129771456 │ █████████████████████████ │ 6825690 │ █████████████████████████ │ 144453 │ █████████████████████████ │
|
||||||
|
│ 2019-11-01 │ 107990259 │ █████████████████████████ │ 6368286 │ █████████████████████████ │ 141768 │ █████████████████████████ │
|
||||||
|
│ 2019-12-01 │ 112895934 │ █████████████████████████ │ 6640902 │ █████████████████████████ │ 148277 │ █████████████████████████ │
|
||||||
|
│ 2020-01-01 │ 54354879 │ █████████████████████████ │ 4782339 │ ███████████████████████▉ │ 111658 │ █████████████████████████ │
|
||||||
|
│ 2020-02-01 │ 22696923 │ ███████████▎ │ 3135175 │ ███████████████▋ │ 79521 │ ███████████████████▉ │
|
||||||
|
│ 2020-03-01 │ 3466677 │ █▋ │ 987960 │ ████▉ │ 40901 │ ██████████▏ │
|
||||||
|
└──────────────┴───────────┴───────────────────────────┴─────────┴───────────────────────────┴────────────┴───────────────────────────┘
|
||||||
|
|
||||||
|
172 rows in set. Elapsed: 184.809 sec. Processed 6.74 billion rows, 89.56 GB (36.47 million rows/s., 484.62 MB/s.)
|
||||||
|
```
|
||||||
|
|
||||||
|
11. Here are the top 10 subreddits of 2022:
|
||||||
|
|
||||||
|
```sql
|
||||||
|
SELECT
|
||||||
|
subreddit,
|
||||||
|
count() AS count
|
||||||
|
FROM reddit
|
||||||
|
WHERE toYear(created_utc) = 2022
|
||||||
|
GROUP BY subreddit
|
||||||
|
ORDER BY count DESC
|
||||||
|
LIMIT 10;
|
||||||
|
```
|
||||||
|
|
||||||
|
The response is:
|
||||||
|
|
||||||
|
```response
|
||||||
|
┌─subreddit────────┬───count─┐
|
||||||
|
│ AskReddit │ 3858203 │
|
||||||
|
│ politics │ 1356782 │
|
||||||
|
│ memes │ 1249120 │
|
||||||
|
│ nfl │ 883667 │
|
||||||
|
│ worldnews │ 866065 │
|
||||||
|
│ teenagers │ 777095 │
|
||||||
|
│ AmItheAsshole │ 752720 │
|
||||||
|
│ dankmemes │ 657932 │
|
||||||
|
│ nba │ 514184 │
|
||||||
|
│ unpopularopinion │ 473649 │
|
||||||
|
└──────────────────┴─────────┘
|
||||||
|
|
||||||
|
10 rows in set. Elapsed: 27.824 sec. Processed 6.74 billion rows, 53.26 GB (242.22 million rows/s., 1.91 GB/s.)
|
||||||
|
```
|
||||||
|
|
||||||
|
12. Let's see which subreddits had the biggest increase in comments from 2018 to 2019:
|
||||||
|
|
||||||
|
```sql
|
||||||
|
SELECT
|
||||||
|
subreddit,
|
||||||
|
newcount - oldcount AS diff
|
||||||
|
FROM
|
||||||
|
(
|
||||||
|
SELECT
|
||||||
|
subreddit,
|
||||||
|
count(*) AS newcount
|
||||||
|
FROM reddit
|
||||||
|
WHERE toYear(created_utc) = 2019
|
||||||
|
GROUP BY subreddit
|
||||||
|
)
|
||||||
|
ALL INNER JOIN
|
||||||
|
(
|
||||||
|
SELECT
|
||||||
|
subreddit,
|
||||||
|
count(*) AS oldcount
|
||||||
|
FROM reddit
|
||||||
|
WHERE toYear(created_utc) = 2018
|
||||||
|
GROUP BY subreddit
|
||||||
|
) USING (subreddit)
|
||||||
|
ORDER BY diff DESC
|
||||||
|
LIMIT 50
|
||||||
|
SETTINGS joined_subquery_requires_alias = 0;
|
||||||
|
```
|
||||||
|
|
||||||
|
It looks like memes and teenagers were busy on Reddit in 2019:
|
||||||
|
|
||||||
|
```response
|
||||||
|
┌─subreddit────────────┬─────diff─┐
|
||||||
|
│ memes │ 15368369 │
|
||||||
|
│ AskReddit │ 14663662 │
|
||||||
|
│ teenagers │ 12266991 │
|
||||||
|
│ AmItheAsshole │ 11561538 │
|
||||||
|
│ dankmemes │ 11305158 │
|
||||||
|
│ unpopularopinion │ 6332772 │
|
||||||
|
│ PewdiepieSubmissions │ 5930818 │
|
||||||
|
│ Market76 │ 5014668 │
|
||||||
|
│ relationship_advice │ 3776383 │
|
||||||
|
│ freefolk │ 3169236 │
|
||||||
|
│ Minecraft │ 3160241 │
|
||||||
|
│ classicwow │ 2907056 │
|
||||||
|
│ Animemes │ 2673398 │
|
||||||
|
│ gameofthrones │ 2402835 │
|
||||||
|
│ PublicFreakout │ 2267605 │
|
||||||
|
│ ShitPostCrusaders │ 2207266 │
|
||||||
|
│ RoastMe │ 2195715 │
|
||||||
|
│ gonewild │ 2148649 │
|
||||||
|
│ AnthemTheGame │ 1803818 │
|
||||||
|
│ entitledparents │ 1706270 │
|
||||||
|
│ MortalKombat │ 1679508 │
|
||||||
|
│ Cringetopia │ 1620555 │
|
||||||
|
│ pokemon │ 1615266 │
|
||||||
|
│ HistoryMemes │ 1608289 │
|
||||||
|
│ Brawlstars │ 1574977 │
|
||||||
|
│ iamatotalpieceofshit │ 1558315 │
|
||||||
|
│ trashy │ 1518549 │
|
||||||
|
│ ChapoTrapHouse │ 1505748 │
|
||||||
|
│ Pikabu │ 1501001 │
|
||||||
|
│ Showerthoughts │ 1475101 │
|
||||||
|
│ cursedcomments │ 1465607 │
|
||||||
|
│ ukpolitics │ 1386043 │
|
||||||
|
│ wallstreetbets │ 1384431 │
|
||||||
|
│ interestingasfuck │ 1378900 │
|
||||||
|
│ wholesomememes │ 1353333 │
|
||||||
|
│ AskOuija │ 1233263 │
|
||||||
|
│ borderlands3 │ 1197192 │
|
||||||
|
│ aww │ 1168257 │
|
||||||
|
│ insanepeoplefacebook │ 1155473 │
|
||||||
|
│ FortniteCompetitive │ 1122778 │
|
||||||
|
│ EpicSeven │ 1117380 │
|
||||||
|
│ FreeKarma4U │ 1116423 │
|
||||||
|
│ YangForPresidentHQ │ 1086700 │
|
||||||
|
│ SquaredCircle │ 1044089 │
|
||||||
|
│ MurderedByWords │ 1042511 │
|
||||||
|
│ AskMen │ 1024434 │
|
||||||
|
│ thedivision │ 1016634 │
|
||||||
|
│ barstoolsports │ 985032 │
|
||||||
|
│ nfl │ 978340 │
|
||||||
|
│ BattlefieldV │ 971408 │
|
||||||
|
└──────────────────────┴──────────┘
|
||||||
|
|
||||||
|
50 rows in set. Elapsed: 65.954 sec. Processed 13.48 billion rows, 79.67 GB (204.37 million rows/s., 1.21 GB/s.)
|
||||||
|
```
|
||||||
|
|
||||||
|
13. One more query: let's compare ClickHouse mentions to other technologies like Snowflake and Postgres. This query is a big one because it has to search all the comments three times for a substring, and unfortunately ClickHouse users are obviously not very active on Reddit yet:
|
||||||
|
|
||||||
|
```sql
|
||||||
|
SELECT
|
||||||
|
toStartOfQuarter(created_utc) AS quarter,
|
||||||
|
sum(if(positionCaseInsensitive(body, 'clickhouse') > 0, 1, 0)) AS clickhouse,
|
||||||
|
sum(if(positionCaseInsensitive(body, 'snowflake') > 0, 1, 0)) AS snowflake,
|
||||||
|
sum(if(positionCaseInsensitive(body, 'postgres') > 0, 1, 0)) AS postgres
|
||||||
|
FROM reddit
|
||||||
|
GROUP BY quarter
|
||||||
|
ORDER BY quarter ASC;
|
||||||
|
```
|
||||||
|
|
||||||
|
```response
|
||||||
|
┌────Quarter─┬─clickhouse─┬─snowflake─┬─postgres─┐
|
||||||
|
│ 2005-10-01 │ 0 │ 0 │ 0 │
|
||||||
|
│ 2006-01-01 │ 0 │ 2 │ 23 │
|
||||||
|
│ 2006-04-01 │ 0 │ 2 │ 24 │
|
||||||
|
│ 2006-07-01 │ 0 │ 4 │ 13 │
|
||||||
|
│ 2006-10-01 │ 0 │ 23 │ 73 │
|
||||||
|
│ 2007-01-01 │ 0 │ 14 │ 91 │
|
||||||
|
│ 2007-04-01 │ 0 │ 10 │ 59 │
|
||||||
|
│ 2007-07-01 │ 0 │ 39 │ 116 │
|
||||||
|
│ 2007-10-01 │ 0 │ 45 │ 125 │
|
||||||
|
│ 2008-01-01 │ 0 │ 53 │ 234 │
|
||||||
|
│ 2008-04-01 │ 0 │ 79 │ 303 │
|
||||||
|
│ 2008-07-01 │ 0 │ 102 │ 174 │
|
||||||
|
│ 2008-10-01 │ 0 │ 156 │ 323 │
|
||||||
|
│ 2009-01-01 │ 0 │ 206 │ 208 │
|
||||||
|
│ 2009-04-01 │ 0 │ 178 │ 417 │
|
||||||
|
│ 2009-07-01 │ 0 │ 300 │ 295 │
|
||||||
|
│ 2009-10-01 │ 0 │ 633 │ 589 │
|
||||||
|
│ 2010-01-01 │ 0 │ 555 │ 501 │
|
||||||
|
│ 2010-04-01 │ 0 │ 587 │ 469 │
|
||||||
|
│ 2010-07-01 │ 0 │ 770 │ 821 │
|
||||||
|
│ 2010-10-01 │ 0 │ 1480 │ 550 │
|
||||||
|
│ 2011-01-01 │ 0 │ 1482 │ 568 │
|
||||||
|
│ 2011-04-01 │ 0 │ 1558 │ 406 │
|
||||||
|
│ 2011-07-01 │ 0 │ 2163 │ 628 │
|
||||||
|
│ 2011-10-01 │ 0 │ 4064 │ 566 │
|
||||||
|
│ 2012-01-01 │ 0 │ 4621 │ 662 │
|
||||||
|
│ 2012-04-01 │ 0 │ 5737 │ 785 │
|
||||||
|
│ 2012-07-01 │ 0 │ 6097 │ 1127 │
|
||||||
|
│ 2012-10-01 │ 0 │ 7986 │ 600 │
|
||||||
|
│ 2013-01-01 │ 0 │ 9704 │ 839 │
|
||||||
|
│ 2013-04-01 │ 0 │ 8161 │ 853 │
|
||||||
|
│ 2013-07-01 │ 0 │ 9704 │ 1028 │
|
||||||
|
│ 2013-10-01 │ 0 │ 12879 │ 1404 │
|
||||||
|
│ 2014-01-01 │ 0 │ 12317 │ 1548 │
|
||||||
|
│ 2014-04-01 │ 0 │ 13181 │ 1577 │
|
||||||
|
│ 2014-07-01 │ 0 │ 15640 │ 1710 │
|
||||||
|
│ 2014-10-01 │ 0 │ 19479 │ 1959 │
|
||||||
|
│ 2015-01-01 │ 0 │ 20411 │ 2104 │
|
||||||
|
│ 2015-04-01 │ 1 │ 20309 │ 9112 │
|
||||||
|
│ 2015-07-01 │ 0 │ 20325 │ 4771 │
|
||||||
|
│ 2015-10-01 │ 0 │ 25087 │ 3030 │
|
||||||
|
│ 2016-01-01 │ 0 │ 23462 │ 3126 │
|
||||||
|
│ 2016-04-01 │ 3 │ 25496 │ 2757 │
|
||||||
|
│ 2016-07-01 │ 4 │ 28233 │ 2928 │
|
||||||
|
│ 2016-10-01 │ 2 │ 45445 │ 2449 │
|
||||||
|
│ 2017-01-01 │ 9 │ 76019 │ 2808 │
|
||||||
|
│ 2017-04-01 │ 9 │ 67919 │ 2803 │
|
||||||
|
│ 2017-07-01 │ 13 │ 68974 │ 2771 │
|
||||||
|
│ 2017-10-01 │ 12 │ 69730 │ 2906 │
|
||||||
|
│ 2018-01-01 │ 17 │ 67476 │ 3152 │
|
||||||
|
│ 2018-04-01 │ 3 │ 67139 │ 3986 │
|
||||||
|
│ 2018-07-01 │ 14 │ 67979 │ 3609 │
|
||||||
|
│ 2018-10-01 │ 28 │ 74147 │ 3850 │
|
||||||
|
│ 2019-01-01 │ 14 │ 80250 │ 4305 │
|
||||||
|
│ 2019-04-01 │ 30 │ 70307 │ 3872 │
|
||||||
|
│ 2019-07-01 │ 33 │ 77149 │ 4164 │
|
||||||
|
│ 2019-10-01 │ 13 │ 76746 │ 3541 │
|
||||||
|
│ 2020-01-01 │ 16 │ 54475 │ 846 │
|
||||||
|
└────────────┴────────────┴───────────┴──────────┘
|
||||||
|
|
||||||
|
58 rows in set. Elapsed: 2663.751 sec. Processed 6.74 billion rows, 1.21 TB (2.53 million rows/s., 454.37 MB/s.)
|
||||||
|
```
|
@ -143,8 +143,9 @@ You can also download and install packages manually from [here](https://packages
|
|||||||
#### Install standalone ClickHouse Keeper
|
#### Install standalone ClickHouse Keeper
|
||||||
|
|
||||||
:::tip
|
:::tip
|
||||||
If you are going to run ClickHouse Keeper on the same server as ClickHouse server you
|
In production environments we [strongly recommend](/docs/en/operations/tips.md#L143-L144) running ClickHouse Keeper on dedicated nodes.
|
||||||
do not need to install ClickHouse Keeper as it is included with ClickHouse server. This command is only needed on standalone ClickHouse Keeper servers.
|
In test environments, if you decide to run ClickHouse Server and ClickHouse Keeper on the same server, you do not need to install ClickHouse Keeper as it is included with ClickHouse server.
|
||||||
|
This command is only needed on standalone ClickHouse Keeper servers.
|
||||||
:::
|
:::
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
@ -211,8 +212,9 @@ clickhouse-client # or "clickhouse-client --password" if you set up a password.
|
|||||||
#### Install standalone ClickHouse Keeper
|
#### Install standalone ClickHouse Keeper
|
||||||
|
|
||||||
:::tip
|
:::tip
|
||||||
If you are going to run ClickHouse Keeper on the same server as ClickHouse server you
|
In production environments we [strongly recommend](/docs/en/operations/tips.md#L143-L144) running ClickHouse Keeper on dedicated nodes.
|
||||||
do not need to install ClickHouse Keeper as it is included with ClickHouse server. This command is only needed on standalone ClickHouse Keeper servers.
|
In test environments, if you decide to run ClickHouse Server and ClickHouse Keeper on the same server, you do not need to install ClickHouse Keeper as it is included with ClickHouse server.
|
||||||
|
This command is only needed on standalone ClickHouse Keeper servers.
|
||||||
:::
|
:::
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
|
@ -177,11 +177,11 @@ You can pass parameters to `clickhouse-client` (all parameters have a default va
|
|||||||
- `--user, -u` – The username. Default value: default.
|
- `--user, -u` – The username. Default value: default.
|
||||||
- `--password` – The password. Default value: empty string.
|
- `--password` – The password. Default value: empty string.
|
||||||
- `--ask-password` - Prompt the user to enter a password.
|
- `--ask-password` - Prompt the user to enter a password.
|
||||||
- `--query, -q` – The query to process when using non-interactive mode. You must specify either `query` or `queries-file` option.
|
- `--query, -q` – The query to process when using non-interactive mode. Cannot be used simultaneously with `--queries-file`.
|
||||||
- `--queries-file` – file path with queries to execute. You must specify either `query` or `queries-file` option.
|
- `--queries-file` – file path with queries to execute. Cannot be used simultaneously with `--query`.
|
||||||
- `--database, -d` – Select the current default database. Default value: the current database from the server settings (‘default’ by default).
|
- `--multiquery, -n` – If specified, multiple queries separated by semicolons can be listed after the `--query` option. For convenience, it is also possible to omit `--query` and pass the queries directly after `--multiquery`.
|
||||||
- `--multiline, -m` – If specified, allow multiline queries (do not send the query on Enter).
|
- `--multiline, -m` – If specified, allow multiline queries (do not send the query on Enter).
|
||||||
- `--multiquery, -n` – If specified, allow processing multiple queries separated by semicolons.
|
- `--database, -d` – Select the current default database. Default value: the current database from the server settings (‘default’ by default).
|
||||||
- `--format, -f` – Use the specified default format to output the result.
|
- `--format, -f` – Use the specified default format to output the result.
|
||||||
- `--vertical, -E` – If specified, use the [Vertical format](../interfaces/formats.md#vertical) by default to output the result. This is the same as `–format=Vertical`. In this format, each value is printed on a separate line, which is helpful when displaying wide tables.
|
- `--vertical, -E` – If specified, use the [Vertical format](../interfaces/formats.md#vertical) by default to output the result. This is the same as `–format=Vertical`. In this format, each value is printed on a separate line, which is helpful when displaying wide tables.
|
||||||
- `--time, -t` – If specified, print the query execution time to ‘stderr’ in non-interactive mode.
|
- `--time, -t` – If specified, print the query execution time to ‘stderr’ in non-interactive mode.
|
||||||
|
@ -30,7 +30,7 @@ description: In order to effectively mitigate possible human errors, you should
|
|||||||
```
|
```
|
||||||
|
|
||||||
:::note ALL
|
:::note ALL
|
||||||
`ALL` is only applicable to the `RESTORE` command.
|
`ALL` is only applicable to the `RESTORE` command prior to version 23.4 of ClickHouse.
|
||||||
:::
|
:::
|
||||||
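For instance, a minimal sketch (assuming a backup destination disk named `backups` is configured and already holds the named archive):

```sql
-- Restores every database and table contained in the backup archive.
RESTORE ALL FROM Disk('backups', '2023-01-01T00-00-00.zip');
```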
|
|
||||||
## Background
|
## Background
|
||||||
|
@ -2,34 +2,115 @@
|
|||||||
slug: /en/operations/named-collections
|
slug: /en/operations/named-collections
|
||||||
sidebar_position: 69
|
sidebar_position: 69
|
||||||
sidebar_label: "Named collections"
|
sidebar_label: "Named collections"
|
||||||
|
title: "Named collections"
|
||||||
---
|
---
|
||||||
|
|
||||||
# Storing details for connecting to external sources in configuration files
|
Named collections provide a way to store collections of key-value pairs to be
|
||||||
|
used to configure integrations with external sources. You can use named collections with
|
||||||
|
dictionaries, tables, table functions, and object storage.
|
||||||
|
|
||||||
Details for connecting to external sources (dictionaries, tables, table functions) can be saved
|
Named collections can be configured with DDL or in configuration files and are applied
|
||||||
in configuration files and thus simplify the creation of objects and hide credentials
|
when ClickHouse starts. They simplify the creation of objects and the hiding of credentials
|
||||||
from users with only SQL access.
|
from users without administrative access.
|
||||||
|
|
||||||
Parameters can be set in XML `<format>CSV</format>` and overridden in SQL `, format = 'TSV'`.
|
The keys in a named collection must match the parameter names of the corresponding
|
||||||
The parameters in SQL can be overridden using format `key` = `value`: `compression_method = 'gzip'`.
|
function, table engine, database, etc. In the examples below the parameter list is
|
||||||
|
linked to for each type.
|
||||||
|
|
||||||
Named collections are stored in the `config.xml` file of the ClickHouse server in the `<named_collections>` section and are applied when ClickHouse starts.
|
Parameters set in a named collection can be overridden in SQL; this is shown in the examples
|
||||||
|
below.
|
||||||
|
|
||||||
Example of configuration:
|
## Storing named collections in the system database
|
||||||
```xml
|
|
||||||
$ cat /etc/clickhouse-server/config.d/named_collections.xml
|
### DDL example
|
||||||
|
|
||||||
|
```sql
|
||||||
|
CREATE NAMED COLLECTION name AS
|
||||||
|
key_1 = 'value',
|
||||||
|
key_2 = 'value2',
|
||||||
|
url = 'https://connection.url/'
|
||||||
|
```
|
||||||
|
|
||||||
|
### Permissions to create named collections with DDL
|
||||||
|
|
||||||
|
To manage named collections with DDL, a user must have the `named_collection_control` privilege. This can be assigned by adding a file to `/etc/clickhouse-server/users.d/`. The example gives the user `default` both the `access_management` and `named_collection_control` privileges:
|
||||||
|
|
||||||
|
```xml title='/etc/clickhouse-server/users.d/user_default.xml'
|
||||||
|
<clickhouse>
|
||||||
|
<users>
|
||||||
|
<default>
|
||||||
|
<password_sha256_hex replace=true>65e84be33532fb784c48129675f9eff3a682b27168c0ea744b2cf58ee02337c5</password_sha256_hex>
|
||||||
|
<access_management>1</access_management>
|
||||||
|
<!-- highlight-start -->
|
||||||
|
<named_collection_control>1</named_collection_control>
|
||||||
|
<!-- highlight-end -->
|
||||||
|
</default>
|
||||||
|
</users>
|
||||||
|
</clickhouse>
|
||||||
|
```
|
||||||
|
|
||||||
|
:::tip
|
||||||
|
In the above example the `password_sha256_hex` value is the hexadecimal representation of the SHA256 hash of the password. This configuration for the user `default` has the attribute `replace=true` because the default configuration has a plain text `password` set, and it is not possible to have both plain text and sha256 hex passwords set for a user.
|
||||||
|
:::
|
||||||
|
|
||||||
|
## Storing named collections in configuration files
|
||||||
|
|
||||||
|
### XML example
|
||||||
|
|
||||||
|
```xml title='/etc/clickhouse-server/config.d/named_collections.xml'
|
||||||
<clickhouse>
|
<clickhouse>
|
||||||
<named_collections>
|
<named_collections>
|
||||||
...
|
<name>
|
||||||
|
<key_1>value</key_1>
|
||||||
|
<key_2>value_2</key_2>
|
||||||
|
<url>https://connection.url/</url>
|
||||||
|
</name>
|
||||||
</named_collections>
|
</named_collections>
|
||||||
</clickhouse>
|
</clickhouse>
|
||||||
```
|
```
|
||||||
|
|
||||||
## Named collections for accessing S3.
|
## Modifying named collections
|
||||||
|
|
||||||
|
Named collections that are created with DDL queries can be altered or dropped with DDL. Named collections created with XML files can be managed by editing or deleting the corresponding XML.
|
||||||
|
|
||||||
|
### Alter a DDL named collection
|
||||||
|
|
||||||
|
Change or add the keys `key1` and `key3` of the collection `collection2`:
|
||||||
|
```sql
|
||||||
|
ALTER NAMED COLLECTION collection2 SET key1=4, key3='value3'
|
||||||
|
```
|
||||||
|
|
||||||
|
Remove the key `key2` from `collection2`:
|
||||||
|
```sql
|
||||||
|
ALTER NAMED COLLECTION collection2 DELETE key2
|
||||||
|
```
|
||||||
|
|
||||||
|
Change or add the key `key1` and delete the key `key3` of the collection `collection2`:
|
||||||
|
```sql
|
||||||
|
ALTER NAMED COLLECTION collection2 SET key1=4, DELETE key3
|
||||||
|
```
|
||||||
|
|
||||||
|
### Drop the DDL named collection `collection2`:
|
||||||
|
```sql
|
||||||
|
DROP NAMED COLLECTION collection2
|
||||||
|
```
|
||||||
|
|
||||||
|
## Named collections for accessing S3
|
||||||
|
|
||||||
For the description of parameters see [s3 Table Function](../sql-reference/table-functions/s3.md).
|
For the description of parameters see [s3 Table Function](../sql-reference/table-functions/s3.md).
|
||||||
|
|
||||||
Example of configuration:
|
### DDL example
|
||||||
|
|
||||||
|
```sql
|
||||||
|
CREATE NAMED COLLECTION s3_mydata AS
|
||||||
|
access_key_id = 'AKIAIOSFODNN7EXAMPLE',
|
||||||
|
secret_access_key = 'wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY',
|
||||||
|
format = 'CSV',
|
||||||
|
url = 'https://s3.us-east-1.amazonaws.com/yourbucket/mydata/'
|
||||||
|
```
|
||||||
|
|
||||||
|
### XML example
|
||||||
|
|
||||||
```xml
|
```xml
|
||||||
<clickhouse>
|
<clickhouse>
|
||||||
<named_collections>
|
<named_collections>
|
||||||
@ -43,23 +124,23 @@ Example of configuration:
|
|||||||
</clickhouse>
|
</clickhouse>
|
||||||
```
|
```
|
||||||
|
|
||||||
### Example of using named collections with the s3 function
|
### s3() function and S3 Table named collection examples
|
||||||
|
|
||||||
|
Both of the following examples use the same named collection `s3_mydata`:
|
||||||
|
|
||||||
|
#### s3() function
|
||||||
|
|
||||||
```sql
|
```sql
|
||||||
INSERT INTO FUNCTION s3(s3_mydata, filename = 'test_file.tsv.gz',
|
INSERT INTO FUNCTION s3(s3_mydata, filename = 'test_file.tsv.gz',
|
||||||
format = 'TSV', structure = 'number UInt64', compression_method = 'gzip')
|
format = 'TSV', structure = 'number UInt64', compression_method = 'gzip')
|
||||||
SELECT * FROM numbers(10000);
|
SELECT * FROM numbers(10000);
|
||||||
|
|
||||||
SELECT count()
|
|
||||||
FROM s3(s3_mydata, filename = 'test_file.tsv.gz')
|
|
||||||
|
|
||||||
┌─count()─┐
|
|
||||||
│ 10000 │
|
|
||||||
└─────────┘
|
|
||||||
1 rows in set. Elapsed: 0.279 sec. Processed 10.00 thousand rows, 90.00 KB (35.78 thousand rows/s., 322.02 KB/s.)
|
|
||||||
```
|
```
|
||||||
|
|
||||||
### Example of using named collections with an S3 table
|
:::tip
|
||||||
|
The first argument to the `s3()` function above is the name of the collection, `s3_mydata`. Without named collections, the access key ID, secret, format, and URL would all be passed in every call to the `s3()` function.
|
||||||
|
:::
|
||||||
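For comparison, a rough sketch of the same insert written without the named collection, reusing the placeholder credentials and bucket URL from the configuration above:

```sql
-- Every parameter that the named collection hides must now be passed explicitly.
INSERT INTO FUNCTION s3(
    'https://s3.us-east-1.amazonaws.com/yourbucket/mydata/test_file.tsv.gz',
    'AKIAIOSFODNN7EXAMPLE',
    'wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY',
    'TSV',
    'number UInt64',
    'gzip')
SELECT * FROM numbers(10000);
```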
|
|
||||||
|
#### S3 table
|
||||||
|
|
||||||
```sql
|
```sql
|
||||||
CREATE TABLE s3_engine_table (number Int64)
|
CREATE TABLE s3_engine_table (number Int64)
|
||||||
@ -78,7 +159,22 @@ SELECT * FROM s3_engine_table LIMIT 3;
|
|||||||
|
|
||||||
For the description of parameters see [mysql](../sql-reference/table-functions/mysql.md).
|
For the description of parameters see [mysql](../sql-reference/table-functions/mysql.md).
|
||||||
|
|
||||||
Example of configuration:
|
### DDL example
|
||||||
|
|
||||||
|
```sql
|
||||||
|
CREATE NAMED COLLECTION mymysql AS
|
||||||
|
user = 'myuser',
|
||||||
|
password = 'mypass',
|
||||||
|
host = '127.0.0.1',
|
||||||
|
port = 3306,
|
||||||
|
database = 'test',
|
||||||
|
connection_pool_size = 8,
|
||||||
|
on_duplicate_clause = 1,
|
||||||
|
replace_query = 1
|
||||||
|
```
|
||||||
|
|
||||||
|
### XML example
|
||||||
|
|
||||||
```xml
|
```xml
|
||||||
<clickhouse>
|
<clickhouse>
|
||||||
<named_collections>
|
<named_collections>
|
||||||
@ -96,7 +192,11 @@ Example of configuration:
|
|||||||
</clickhouse>
|
</clickhouse>
|
||||||
```
|
```
|
||||||
|
|
||||||
### Example of using named collections with the mysql function
|
### mysql() function, MySQL table, MySQL database, and Dictionary named collection examples
|
||||||
|
|
||||||
|
The following four examples use the same named collection `mymysql`:
|
||||||
|
|
||||||
|
#### mysql() function
|
||||||
|
|
||||||
```sql
|
```sql
|
||||||
SELECT count() FROM mysql(mymysql, table = 'test');
|
SELECT count() FROM mysql(mymysql, table = 'test');
|
||||||
@ -105,8 +205,11 @@ SELECT count() FROM mysql(mymysql, table = 'test');
|
|||||||
│ 3 │
|
│ 3 │
|
||||||
└─────────┘
|
└─────────┘
|
||||||
```
|
```
|
||||||
|
:::note
|
||||||
|
The named collection does not specify the `table` parameter, so it is specified in the function call as `table = 'test'`.
|
||||||
|
:::
|
||||||
|
|
||||||
### Example of using named collections with an MySQL table
|
#### MySQL table
|
||||||
|
|
||||||
```sql
|
```sql
|
||||||
CREATE TABLE mytable(A Int64) ENGINE = MySQL(mymysql, table = 'test', connection_pool_size=3, replace_query=0);
|
CREATE TABLE mytable(A Int64) ENGINE = MySQL(mymysql, table = 'test', connection_pool_size=3, replace_query=0);
|
||||||
@ -117,7 +220,11 @@ SELECT count() FROM mytable;
|
|||||||
└─────────┘
|
└─────────┘
|
||||||
```
|
```
|
||||||
|
|
||||||
### Example of using named collections with database with engine MySQL
|
:::note
|
||||||
|
The DDL overrides the named collection setting for `connection_pool_size`.
|
||||||
|
:::
|
||||||
|
|
||||||
|
#### MySQL database
|
||||||
|
|
||||||
```sql
|
```sql
|
||||||
CREATE DATABASE mydatabase ENGINE = MySQL(mymysql);
|
CREATE DATABASE mydatabase ENGINE = MySQL(mymysql);
|
||||||
@ -130,7 +237,7 @@ SHOW TABLES FROM mydatabase;
|
|||||||
└────────┘
|
└────────┘
|
||||||
```
|
```
|
||||||
|
|
||||||
### Example of using named collections with a dictionary with source MySQL
|
#### MySQL Dictionary
|
||||||
|
|
||||||
```sql
|
```sql
|
||||||
CREATE DICTIONARY dict (A Int64, B String)
|
CREATE DICTIONARY dict (A Int64, B String)
|
||||||
@ -150,6 +257,17 @@ SELECT dictGet('dict', 'B', 2);
|
|||||||
|
|
||||||
For a description of the parameters, see [postgresql](../sql-reference/table-functions/postgresql.md).
|
For a description of the parameters, see [postgresql](../sql-reference/table-functions/postgresql.md).
|
||||||
|
|
||||||
|
```sql
|
||||||
|
CREATE NAMED COLLECTION mypg AS
|
||||||
|
user = 'pguser',
|
||||||
|
password = 'jw8s0F4',
|
||||||
|
host = '127.0.0.1',
|
||||||
|
port = 5432,
|
||||||
|
database = 'test',
|
||||||
|
schema = 'test_schema',
|
||||||
|
connection_pool_size = 8
|
||||||
|
```
|
||||||
|
|
||||||
Example of configuration:
|
Example of configuration:
|
||||||
```xml
|
```xml
|
||||||
<clickhouse>
|
<clickhouse>
|
||||||
@ -229,12 +347,22 @@ SELECT dictGet('dict', 'b', 2);
|
|||||||
└─────────────────────────┘
|
└─────────────────────────┘
|
||||||
```
|
```
|
||||||
|
|
||||||
## Named collections for accessing remote ClickHouse database
|
## Named collections for accessing a remote ClickHouse database
|
||||||
|
|
||||||
For a description of the parameters, see [remote](../sql-reference/table-functions/remote.md/#parameters).
|
For a description of the parameters, see [remote](../sql-reference/table-functions/remote.md/#parameters).
|
||||||
|
|
||||||
Example of configuration:
|
Example of configuration:
|
||||||
|
|
||||||
|
```sql
|
||||||
|
CREATE NAMED COLLECTION remote1 AS
|
||||||
|
host = 'remote_host',
|
||||||
|
port = 9000,
|
||||||
|
database = 'system',
|
||||||
|
user = 'foo',
|
||||||
|
password = 'secret',
|
||||||
|
secure = 1
|
||||||
|
```
|
||||||
|
|
||||||
```xml
|
```xml
|
||||||
<clickhouse>
|
<clickhouse>
|
||||||
<named_collections>
|
<named_collections>
|
||||||
@ -286,3 +414,4 @@ SELECT dictGet('dict', 'b', 1);
|
|||||||
│ a │
|
│ a │
|
||||||
└─────────────────────────┘
|
└─────────────────────────┘
|
||||||
```
|
```
|
||||||
|
|
||||||
|
@ -1063,8 +1063,8 @@ Default value: 16.
|
|||||||
|
|
||||||
## background_merges_mutations_concurrency_ratio {#background_merges_mutations_concurrency_ratio}
|
## background_merges_mutations_concurrency_ratio {#background_merges_mutations_concurrency_ratio}
|
||||||
|
|
||||||
Sets a ratio between the number of threads and the number of background merges and mutations that can be executed concurrently. For example if the ratio equals to 2 and
|
Sets a ratio between the number of threads and the number of background merges and mutations that can be executed concurrently. For example, if the ratio equals 2 and
|
||||||
`background_pool_size` is set to 16 then ClickHouse can execute 32 background merges concurrently. This is possible, because background operation could be suspended and postponed. This is needed to give small merges more execution priority. You can only increase this ratio at runtime. To lower it you have to restart the server.
|
`background_pool_size` is set to 16, then ClickHouse can execute 32 background merges concurrently. This is possible because background operations can be suspended and postponed. This is needed to give small merges more execution priority. You can only increase this ratio at runtime. To lower it you have to restart the server.
|
||||||
As with the `background_pool_size` setting, `background_merges_mutations_concurrency_ratio` can be applied from the `default` profile for backward compatibility.
|
As with the `background_pool_size` setting, `background_merges_mutations_concurrency_ratio` can be applied from the `default` profile for backward compatibility.
|
||||||
|
|
||||||
Possible values:
|
Possible values:
|
||||||
@ -1079,6 +1079,33 @@ Default value: 2.
|
|||||||
<background_merges_mutations_concurrency_ratio>3</background_merges_mutations_concurrency_ratio>
|
<background_merges_mutations_concurrency_ratio>3</background_merges_mutations_concurrency_ratio>
|
||||||
```
|
```
|
||||||
|
|
||||||
|
## merges_mutations_memory_usage_soft_limit {#merges_mutations_memory_usage_soft_limit}
|
||||||
|
|
||||||
|
Sets the limit on how much RAM is allowed to be used for performing merge and mutation operations.
|
||||||
|
Zero means unlimited.
|
||||||
|
If ClickHouse reaches this limit, it won't schedule any new background merge or mutation operations but will continue to execute already scheduled tasks.
|
||||||
|
|
||||||
|
Possible values:
|
||||||
|
|
||||||
|
- Any positive integer.
|
||||||
|
|
||||||
|
**Example**
|
||||||
|
|
||||||
|
```xml
|
||||||
|
<merges_mutations_memory_usage_soft_limit>0</merges_mutations_memory_usage_soft_limit>
|
||||||
|
```
|
||||||
|
|
||||||
|
## merges_mutations_memory_usage_to_ram_ratio {#merges_mutations_memory_usage_to_ram_ratio}
|
||||||
|
|
||||||
|
The default `merges_mutations_memory_usage_soft_limit` value is calculated as `memory_amount * merges_mutations_memory_usage_to_ram_ratio`.
|
||||||
|
|
||||||
|
Default value: `0.5`.
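For example, on a server with 64 GiB of RAM and the default ratio of `0.5`, the derived `merges_mutations_memory_usage_soft_limit` is 32 GiB.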
|
||||||
|
|
||||||
|
**See also**
|
||||||
|
|
||||||
|
- [max_memory_usage](../../operations/settings/query-complexity.md#settings_max_memory_usage)
|
||||||
|
- [merges_mutations_memory_usage_soft_limit](#merges_mutations_memory_usage_soft_limit)
|
||||||
|
|
||||||
## background_merges_mutations_scheduling_policy {#background_merges_mutations_scheduling_policy}
|
## background_merges_mutations_scheduling_policy {#background_merges_mutations_scheduling_policy}
|
||||||
|
|
||||||
Algorithm used to select the next merge or mutation to be executed by the background thread pool. The policy may be changed at runtime without a server restart.
|
Algorithm used to select the next merge or mutation to be executed by the background thread pool. The policy may be changed at runtime without a server restart.
|
||||||
|
@ -38,6 +38,10 @@ Structure of the `users` section:
|
|||||||
</table_name>
|
</table_name>
|
||||||
</database_name>
|
</database_name>
|
||||||
</databases>
|
</databases>
|
||||||
|
|
||||||
|
<grants>
|
||||||
|
<query>GRANT SELECT ON system.*</query>
|
||||||
|
</grants>
|
||||||
</user_name>
|
</user_name>
|
||||||
<!-- Other users settings -->
|
<!-- Other users settings -->
|
||||||
</users>
|
</users>
|
||||||
@ -86,6 +90,28 @@ Possible values:
|
|||||||
|
|
||||||
Default value: 0.
|
Default value: 0.
|
||||||
|
|
||||||
|
### grants {#grants-user-setting}
|
||||||
|
|
||||||
|
This setting allows granting any rights to the selected user.
|
||||||
|
Each element of the list should be a `GRANT` query without any grantees specified.
|
||||||
|
|
||||||
|
Example:
|
||||||
|
|
||||||
|
```xml
|
||||||
|
<user1>
|
||||||
|
<grants>
|
||||||
|
<query>GRANT SHOW ON *.*</query>
|
||||||
|
<query>GRANT CREATE ON *.* WITH GRANT OPTION</query>
|
||||||
|
<query>GRANT SELECT ON system.*</query>
|
||||||
|
</grants>
|
||||||
|
</user1>
|
||||||
|
```
|
||||||
|
|
||||||
|
This setting can't be specified together with the
|
||||||
|
`dictionaries`, `access_management`, `named_collection_control`, `show_named_collections_secrets`
|
||||||
|
and `allow_databases` settings.
|
||||||
|
|
||||||
|
|
||||||
### user_name/networks {#user-namenetworks}
|
### user_name/networks {#user-namenetworks}
|
||||||
|
|
||||||
List of networks from which the user can connect to the ClickHouse server.
|
List of networks from which the user can connect to the ClickHouse server.
|
||||||
|
@ -452,6 +452,8 @@ Possible values:
|
|||||||
|
|
||||||
The first phase of a grace join reads the right table and splits it into N buckets depending on the hash value of key columns (initially, N is `grace_hash_join_initial_buckets`). This is done in a way to ensure that each bucket can be processed independently. Rows from the first bucket are added to an in-memory hash table while the others are saved to disk. If the hash table grows beyond the memory limit (e.g., as set by [`max_bytes_in_join`](/docs/en/operations/settings/query-complexity.md/#settings-max_bytes_in_join)), the number of buckets is increased and the assigned bucket is recalculated for each row. Any rows which don’t belong to the current bucket are flushed and reassigned.
|
The first phase of a grace join reads the right table and splits it into N buckets depending on the hash value of key columns (initially, N is `grace_hash_join_initial_buckets`). This is done in a way to ensure that each bucket can be processed independently. Rows from the first bucket are added to an in-memory hash table while the others are saved to disk. If the hash table grows beyond the memory limit (e.g., as set by [`max_bytes_in_join`](/docs/en/operations/settings/query-complexity.md/#settings-max_bytes_in_join)), the number of buckets is increased and the assigned bucket is recalculated for each row. Any rows which don’t belong to the current bucket are flushed and reassigned.
|
||||||
|
|
||||||
|
Supports `INNER/LEFT/RIGHT/FULL ALL/ANY JOIN`.
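The sketch below shows one way to enable grace hash join for a single query; the `orders`/`customers` tables and the 16 buckets are illustrative, not part of the original example.

```sql
-- Grace hash join with 16 initial buckets; orders/customers are hypothetical tables.
SELECT o.customer_id, count()
FROM orders AS o
INNER JOIN customers AS c ON o.customer_id = c.customer_id
GROUP BY o.customer_id
SETTINGS join_algorithm = 'grace_hash', grace_hash_join_initial_buckets = 16;
```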
|
||||||
|
|
||||||
- hash
|
- hash
|
||||||
|
|
||||||
[Hash join algorithm](https://en.wikipedia.org/wiki/Hash_join) is used. The most generic implementation that supports all combinations of kind and strictness and multiple join keys that are combined with `OR` in the `JOIN ON` section.
|
[Hash join algorithm](https://en.wikipedia.org/wiki/Hash_join) is used. The most generic implementation that supports all combinations of kind and strictness and multiple join keys that are combined with `OR` in the `JOIN ON` section.
|
||||||
@ -608,6 +610,17 @@ See also:
|
|||||||
|
|
||||||
- [JOIN strictness](../../sql-reference/statements/select/join.md/#join-settings)
|
- [JOIN strictness](../../sql-reference/statements/select/join.md/#join-settings)
|
||||||
|
|
||||||
|
## max_rows_in_set_to_optimize_join
|
||||||
|
|
||||||
|
Maximum size of the set used to filter joined tables by each other's row sets before joining.
|
||||||
|
|
||||||
|
Possible values:
|
||||||
|
|
||||||
|
- 0 — Disable.
|
||||||
|
- Any positive integer.
|
||||||
|
|
||||||
|
Default value: 100000.
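For example, the optimization can be switched off for a session with a one-line sketch:

```sql
-- 0 disables the row-set filtering optimization for subsequent queries.
SET max_rows_in_set_to_optimize_join = 0;
```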
|
||||||
|
|
||||||
## temporary_files_codec {#temporary_files_codec}
|
## temporary_files_codec {#temporary_files_codec}
|
||||||
|
|
||||||
Sets compression codec for temporary files used in sorting and joining operations on disk.
|
Sets compression codec for temporary files used in sorting and joining operations on disk.
|
||||||
@ -1125,6 +1138,12 @@ If unsuccessful, several attempts are made to connect to various replicas.
|
|||||||
|
|
||||||
Default value: 1000.
|
Default value: 1000.
|
||||||
|
|
||||||
|
## connect_timeout_with_failover_secure_ms
|
||||||
|
|
||||||
|
Connection timeout for selecting the first healthy replica (for secure connections).
|
||||||
|
|
||||||
|
Default value: 1000.
|
||||||
|
|
||||||
## connection_pool_max_wait_ms {#connection-pool-max-wait-ms}
|
## connection_pool_max_wait_ms {#connection-pool-max-wait-ms}
|
||||||
|
|
||||||
The wait time in milliseconds for a connection when the connection pool is full.
|
The wait time in milliseconds for a connection when the connection pool is full.
|
||||||
@ -1360,6 +1379,12 @@ Possible values:
|
|||||||
|
|
||||||
Default value: `default`.
|
Default value: `default`.
|
||||||
|
|
||||||
|
## allow_experimental_parallel_reading_from_replicas
|
||||||
|
|
||||||
|
If true, ClickHouse will send a SELECT query to all replicas of a table (up to `max_parallel_replicas`). It will work for any kind of MergeTree table.
|
||||||
|
|
||||||
|
Default value: `false`.
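A hedged sketch of enabling it per query is shown below; the table name is a placeholder and, depending on your cluster setup, additional settings may be required.

```sql
-- Ask up to 3 replicas to cooperate on reading one MergeTree table.
SELECT count()
FROM my_replicated_table
SETTINGS allow_experimental_parallel_reading_from_replicas = 1, max_parallel_replicas = 3;
```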
|
||||||
|
|
||||||
## compile_expressions {#compile-expressions}
|
## compile_expressions {#compile-expressions}
|
||||||
|
|
||||||
Enables or disables compilation of frequently used simple functions and operators to native code with LLVM at runtime.
|
Enables or disables compilation of frequently used simple functions and operators to native code with LLVM at runtime.
|
||||||
@ -1410,8 +1435,8 @@ and [enable_writes_to_query_cache](#enable-writes-to-query-cache) control in mor
|
|||||||
|
|
||||||
Possible values:
|
Possible values:
|
||||||
|
|
||||||
- 0 - Yes
|
- 0 - Disabled
|
||||||
- 1 - No
|
- 1 - Enabled
|
||||||
|
|
||||||
Default value: `0`.
|
Default value: `0`.
|
||||||
|
|
||||||
@ -1630,7 +1655,7 @@ For not replicated tables see [non_replicated_deduplication_window](merge-tree-s
|
|||||||
|
|
||||||
### async_insert {#async-insert}
|
### async_insert {#async-insert}
|
||||||
|
|
||||||
Enables or disables asynchronous inserts. This makes sense only for insertion over HTTP protocol. Note that deduplication isn't working for such inserts.
|
Enables or disables asynchronous inserts. Note that deduplication is disabled by default, see [async_insert_deduplicate](#async-insert-deduplicate).
|
||||||
|
|
||||||
If enabled, the data is combined into batches before the insertion into tables, so it is possible to do small and frequent insertions into ClickHouse (up to 15000 queries per second) without buffer tables.
|
If enabled, the data is combined into batches before the insertion into tables, so it is possible to do small and frequent insertions into ClickHouse (up to 15000 queries per second) without buffer tables.
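For instance, a single insert can opt in per query, roughly as sketched below; the table and values are illustrative only.

```sql
-- Batch this insert on the server side; wait until the batch is flushed.
INSERT INTO events (id, action) SETTINGS async_insert = 1, wait_for_async_insert = 1
VALUES (1, 'click'), (2, 'view');
```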
|
||||||
|
|
||||||
@ -1691,7 +1716,7 @@ Default value: `100000`.
|
|||||||
|
|
||||||
### async_insert_max_query_number {#async-insert-max-query-number}
|
### async_insert_max_query_number {#async-insert-max-query-number}
|
||||||
|
|
||||||
The maximum number of insert queries per block before being inserted. This setting takes effect only if [async_insert_deduplicate](#settings-async-insert-deduplicate) is enabled.
|
The maximum number of insert queries per block before being inserted. This setting takes effect only if [async_insert_deduplicate](#async-insert-deduplicate) is enabled.
|
||||||
|
|
||||||
Possible values:
|
Possible values:
|
||||||
|
|
||||||
@ -1722,7 +1747,7 @@ Possible values:
|
|||||||
|
|
||||||
Default value: `0`.
|
Default value: `0`.
|
||||||
|
|
||||||
### async_insert_deduplicate {#settings-async-insert-deduplicate}
|
### async_insert_deduplicate {#async-insert-deduplicate}
|
||||||
|
|
||||||
Enables or disables insert deduplication of `ASYNC INSERT` (for Replicated\* tables).
|
Enables or disables insert deduplication of `ASYNC INSERT` (for Replicated\* tables).
|
||||||
|
|
||||||
@ -3196,17 +3221,6 @@ Possible values:
|
|||||||
|
|
||||||
Default value: `0`.
|
Default value: `0`.
|
||||||
|
|
||||||
## allow_experimental_geo_types {#allow-experimental-geo-types}
|
|
||||||
|
|
||||||
Allows working with experimental [geo data types](../../sql-reference/data-types/geo.md).
|
|
||||||
|
|
||||||
Possible values:
|
|
||||||
|
|
||||||
- 0 — Working with geo data types is disabled.
|
|
||||||
- 1 — Working with geo data types is enabled.
|
|
||||||
|
|
||||||
Default value: `0`.
|
|
||||||
|
|
||||||
## database_atomic_wait_for_drop_and_detach_synchronously {#database_atomic_wait_for_drop_and_detach_synchronously}
|
## database_atomic_wait_for_drop_and_detach_synchronously {#database_atomic_wait_for_drop_and_detach_synchronously}
|
||||||
|
|
||||||
Adds a modifier `SYNC` to all `DROP` and `DETACH` queries.
|
Adds a modifier `SYNC` to all `DROP` and `DETACH` queries.
|
||||||
@ -3562,7 +3576,7 @@ Default value: `1`.
|
|||||||
|
|
||||||
If the setting is set to `0`, the table function does not make Nullable columns and inserts default values instead of NULL. This is also applicable for NULL values inside arrays.
|
If the setting is set to `0`, the table function does not make Nullable columns and inserts default values instead of NULL. This is also applicable for NULL values inside arrays.
|
||||||
|
|
||||||
## allow_experimental_projection_optimization {#allow-experimental-projection-optimization}
|
## optimize_use_projections {#optimize_use_projections}
|
||||||
|
|
||||||
Enables or disables [projection](../../engines/table-engines/mergetree-family/mergetree.md/#projections) optimization when processing `SELECT` queries.
|
Enables or disables [projection](../../engines/table-engines/mergetree-family/mergetree.md/#projections) optimization when processing `SELECT` queries.
|
||||||
|
|
||||||
@ -3575,7 +3589,7 @@ Default value: `1`.
|
|||||||
|
|
||||||
## force_optimize_projection {#force-optimize-projection}
|
## force_optimize_projection {#force-optimize-projection}
|
||||||
|
|
||||||
Enables or disables the obligatory use of [projections](../../engines/table-engines/mergetree-family/mergetree.md/#projections) in `SELECT` queries, when projection optimization is enabled (see [allow_experimental_projection_optimization](#allow-experimental-projection-optimization) setting).
|
Enables or disables the obligatory use of [projections](../../engines/table-engines/mergetree-family/mergetree.md/#projections) in `SELECT` queries, when projection optimization is enabled (see [optimize_use_projections](#optimize_use_projections) setting).
|
||||||
|
|
||||||
Possible values:
|
Possible values:
|
||||||
|
|
||||||
@ -4206,3 +4220,12 @@ Possible values:
|
|||||||
- false — Disallow.
|
- false — Disallow.
|
||||||
|
|
||||||
Default value: `false`.
|
Default value: `false`.
|
||||||
|
|
||||||
|
## zstd_window_log_max
|
||||||
|
|
||||||
|
Allows you to select the maximum window log of ZSTD (it will not be used for the MergeTree family).
|
||||||
|
|
||||||
|
Type: Int64
|
||||||
|
|
||||||
|
Default: 0
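A sketch of raising the limit when reading an externally compressed file; the file name and structure are hypothetical.

```sql
-- Allow ZSTD frames with a window log of up to 27 to be decompressed.
SELECT count()
FROM file('big_window.tsv.zst', 'TSV', 'line String')
SETTINGS zstd_window_log_max = 27;
```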
|
||||||
|
|
||||||
|
@ -172,7 +172,9 @@ Example of configuration for versions earlier than 22.8:
|
|||||||
</storage_configuration>
|
</storage_configuration>
|
||||||
```
|
```
|
||||||
|
|
||||||
Cache **configuration settings**:
|
File Cache **disk configuration settings**:
|
||||||
|
|
||||||
|
These settings should be defined in the disk configuration section.
|
||||||
|
|
||||||
- `path` - path to the directory with cache. Default: None, this setting is obligatory.
|
- `path` - path to the directory with cache. Default: None, this setting is obligatory.
|
||||||
|
|
||||||
@ -182,7 +184,7 @@ Cache **configuration settings**:
|
|||||||
|
|
||||||
- `enable_filesystem_query_cache_limit` - allow to limit the size of cache which is downloaded within each query (depends on user setting `max_query_cache_size`). Default: `false`.
|
- `enable_filesystem_query_cache_limit` - allow to limit the size of cache which is downloaded within each query (depends on user setting `max_query_cache_size`). Default: `false`.
|
||||||
|
|
||||||
- `enable_cache_hits_threshold` - a number, which defines how many times some data needs to be read before it will be cached. Default: `0`, e.g. the data is cached at the first attempt to read it.
|
- `enable_cache_hits_threshold` - a number which defines how many times some data needs to be read before it will be cached. Default: `0`, i.e. the data is cached at the first attempt to read it.
|
||||||
|
|
||||||
- `do_not_evict_index_and_mark_files` - do not evict small frequently used files according to cache policy. Default: `false`. This setting was added in version 22.8. If you used filesystem cache before this version, then it will not work on versions starting from 22.8 if this setting is set to `true`. If you want to use this setting, clear old cache created before version 22.8 before upgrading.
|
- `do_not_evict_index_and_mark_files` - do not evict small frequently used files according to cache policy. Default: `false`. This setting was added in version 22.8. If you used filesystem cache before this version, then it will not work on versions starting from 22.8 if this setting is set to `true`. If you want to use this setting, clear old cache created before version 22.8 before upgrading.
|
||||||
|
|
||||||
@ -190,21 +192,23 @@ Cache **configuration settings**:
|
|||||||
|
|
||||||
- `max_elements` - a limit for a number of cache files. Default: `1048576`.
|
- `max_elements` - a limit for a number of cache files. Default: `1048576`.
|
||||||
|
|
||||||
Cache **query settings**:
|
File Cache **query/profile settings**:
|
||||||
|
|
||||||
|
Some of these settings will disable cache features per query/profile that are enabled by default or in the disk configuration settings. For example, you can enable the cache in the disk configuration and disable it per query/profile by setting `enable_filesystem_cache` to `false`. Also, setting `cache_on_write_operations` to `true` in the disk configuration means that the "write-through" cache is enabled. If you need to disable this general setting for specific queries, setting `enable_filesystem_cache_on_write_operations` to `false` disables the write-operations cache for that query/profile.
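For example, the cache can be bypassed for a single query even when the storage policy uses a `cache` disk; the table name below is only a placeholder.

```sql
-- Read directly from remote storage, skipping the filesystem cache.
SELECT count()
FROM s3_backed_table
SETTINGS enable_filesystem_cache = 0;
```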
|
||||||
|
|
||||||
- `enable_filesystem_cache` - allows to disable cache per query even if storage policy was configured with `cache` disk type. Default: `true`.
|
- `enable_filesystem_cache` - allows to disable cache per query even if storage policy was configured with `cache` disk type. Default: `true`.
|
||||||
|
|
||||||
- `read_from_filesystem_cache_if_exists_otherwise_bypass_cache` - allows to use cache in query only if it already exists, otherwise query data will not be written to local cache storage. Default: `false`.
|
- `read_from_filesystem_cache_if_exists_otherwise_bypass_cache` - allows to use cache in query only if it already exists, otherwise query data will not be written to local cache storage. Default: `false`.
|
||||||
|
|
||||||
- `enable_filesystem_cache_on_write_operations` - turn on `write-through` cache. This setting works only if setting `cache_on_write_operations` in cache configuration is turned on.
|
- `enable_filesystem_cache_on_write_operations` - turn on `write-through` cache. This setting works only if setting `cache_on_write_operations` in cache configuration is turned on. Default: `false`.
|
||||||
|
|
||||||
- `enable_filesystem_cache_log` - turn on logging to `system.filesystem_cache_log` table. Gives a detailed view of cache usage per query. Default: `false`.
|
- `enable_filesystem_cache_log` - turn on logging to `system.filesystem_cache_log` table. Gives a detailed view of cache usage per query. It can be turned on for specific queries or enabled in a profile. Default: `false`.
|
||||||
|
|
||||||
- `max_query_cache_size` - a limit for the cache size, which can be written to local cache storage. Requires enabled `enable_filesystem_query_cache_limit` in cache configuration. Default: `false`.
|
- `max_query_cache_size` - a limit for the cache size, which can be written to local cache storage. Requires enabled `enable_filesystem_query_cache_limit` in cache configuration. Default: `false`.
|
||||||
|
|
||||||
- `skip_download_if_exceeds_query_cache` - allows to change the behaviour of setting `max_query_cache_size`. Default: `true`. If this setting is turned on and cache download limit during query was reached, no more cache will be downloaded to cache storage. If this setting is turned off and cache download limit during query was reached, cache will still be written by cost of evicting previously downloaded (within current query) data, e.g. second behaviour allows to preserve `last recentltly used` behaviour while keeping query cache limit.
|
- `skip_download_if_exceeds_query_cache` - allows to change the behaviour of setting `max_query_cache_size`. Default: `true`. If this setting is turned on and cache download limit during query was reached, no more cache will be downloaded to cache storage. If this setting is turned off and cache download limit during query was reached, cache will still be written by cost of evicting previously downloaded (within current query) data, e.g. second behaviour allows to preserve `last recently used` behaviour while keeping query cache limit.
|
||||||
|
|
||||||
** Warning **
|
**Warning**
|
||||||
Cache configuration settings and cache query settings correspond to the latest ClickHouse version, for earlier versions something might not be supported.
|
Cache configuration settings and cache query settings correspond to the latest ClickHouse version, for earlier versions something might not be supported.
|
||||||
|
|
||||||
Cache **system tables**:
|
Cache **system tables**:
|
||||||
@ -215,7 +219,7 @@ Cache **system tables**:
|
|||||||
|
|
||||||
Cache **commands**:
|
Cache **commands**:
|
||||||
|
|
||||||
- `SYSTEM DROP FILESYSTEM CACHE (<path>) (ON CLUSTER)`
|
- `SYSTEM DROP FILESYSTEM CACHE (<cache_name>) (ON CLUSTER)` -- `ON CLUSTER` is only supported when no `<cache_name>` is provided
|
||||||
|
|
||||||
- `SHOW FILESYSTEM CACHES` -- show list of filesystem caches which were configured on the server. (For versions <= `22.8` the command is named `SHOW CACHES`)
|
- `SHOW FILESYSTEM CACHES` -- show list of filesystem caches which were configured on the server. (For versions <= `22.8` the command is named `SHOW CACHES`)
|
||||||
|
|
||||||
@ -231,10 +235,10 @@ Result:
|
|||||||
└───────────┘
|
└───────────┘
|
||||||
```
|
```
|
||||||
|
|
||||||
- `DESCRIBE CACHE '<cache_name>'` - show cache configuration and some general statistics for a specific cache. Cache name can be taken from `SHOW CACHES` command. (For versions <= `22.8` the command is named `DESCRIBE CACHE`)
|
- `DESCRIBE FILESYSTEM CACHE '<cache_name>'` - show cache configuration and some general statistics for a specific cache. Cache name can be taken from `SHOW FILESYSTEM CACHES` command. (For versions <= `22.8` the command is named `DESCRIBE CACHE`)
|
||||||
|
|
||||||
```sql
|
```sql
|
||||||
DESCRIBE CACHE 's3_cache'
|
DESCRIBE FILESYSTEM CACHE 's3_cache'
|
||||||
```
|
```
|
||||||
|
|
||||||
``` text
|
``` text
|
||||||
|
27
docs/en/operations/system-tables/build_options.md
Normal file
27
docs/en/operations/system-tables/build_options.md
Normal file
@ -0,0 +1,27 @@
|
|||||||
|
---
|
||||||
|
slug: /en/operations/system-tables/build_options
|
||||||
|
---
|
||||||
|
# build_options
|
||||||
|
|
||||||
|
Contains information about the ClickHouse server's build options.
|
||||||
|
|
||||||
|
Columns:
|
||||||
|
|
||||||
|
- `name` (String) — Name of the build option, e.g. `USE_ODBC`
|
||||||
|
- `value` (String) — Value of the build option, e.g. `1`
|
||||||
|
|
||||||
|
**Example**
|
||||||
|
|
||||||
|
``` sql
|
||||||
|
SELECT * FROM system.build_options LIMIT 5
|
||||||
|
```
|
||||||
|
|
||||||
|
``` text
|
||||||
|
┌─name─────────────┬─value─┐
|
||||||
|
│ USE_BROTLI │ 1 │
|
||||||
|
│ USE_BZIP2 │ 1 │
|
||||||
|
│ USE_CAPNP │ 1 │
|
||||||
|
│ USE_CASSANDRA │ 1 │
|
||||||
|
│ USE_DATASKETCHES │ 1 │
|
||||||
|
└──────────────────┴───────┘
|
||||||
|
```
|
@ -5,16 +5,18 @@ This table contains profiling on processors level (that you can find in [`EXPLAI
|
|||||||
Columns:
|
Columns:
|
||||||
|
|
||||||
- `event_date` ([Date](../../sql-reference/data-types/date.md)) — The date when the event happened.
|
- `event_date` ([Date](../../sql-reference/data-types/date.md)) — The date when the event happened.
|
||||||
- `event_time` ([DateTime64](../../sql-reference/data-types/datetime64.md)) — The date and time when the event happened.
|
- `event_time` ([DateTime](../../sql-reference/data-types/datetime.md)) — The date and time when the event happened.
|
||||||
|
- `event_time_microseconds` ([DateTime64](../../sql-reference/data-types/datetime64.md)) — The date and time with microseconds precision when the event happened.
|
||||||
- `id` ([UInt64](../../sql-reference/data-types/int-uint.md)) — ID of processor
|
- `id` ([UInt64](../../sql-reference/data-types/int-uint.md)) — ID of processor
|
||||||
- `parent_ids` ([Array(UInt64)](../../sql-reference/data-types/array.md)) — Parent processors IDs
|
- `parent_ids` ([Array(UInt64)](../../sql-reference/data-types/array.md)) — Parent processors IDs
|
||||||
|
- `plan_step` ([UInt64](../../sql-reference/data-types/int-uint.md)) — ID of the query plan step which created this processor. The value is zero if the processor was not added from any step.
|
||||||
|
- `plan_group` ([UInt64](../../sql-reference/data-types/int-uint.md)) — Group of the processor if it was created by query plan step. A group is a logical partitioning of processors added from the same query plan step. Group is used only for beautifying the result of EXPLAIN PIPELINE result.
|
||||||
|
- `initial_query_id` ([String](../../sql-reference/data-types/string.md)) — ID of the initial query (for distributed query execution).
|
||||||
- `query_id` ([String](../../sql-reference/data-types/string.md)) — ID of the query
|
- `query_id` ([String](../../sql-reference/data-types/string.md)) — ID of the query
|
||||||
- `name` ([LowCardinality(String)](../../sql-reference/data-types/lowcardinality.md)) — Name of the processor.
|
- `name` ([LowCardinality(String)](../../sql-reference/data-types/lowcardinality.md)) — Name of the processor.
|
||||||
- `elapsed_us` ([UInt64](../../sql-reference/data-types/int-uint.md)) — Number of microseconds this processor was executed.
|
- `elapsed_us` ([UInt64](../../sql-reference/data-types/int-uint.md)) — Number of microseconds this processor was executed.
|
||||||
- `input_wait_elapsed_us` ([UInt64](../../sql-reference/data-types/int-uint.md)) — Number of microseconds this processor was waiting for data (from other processor).
|
- `input_wait_elapsed_us` ([UInt64](../../sql-reference/data-types/int-uint.md)) — Number of microseconds this processor was waiting for data (from other processor).
|
||||||
- `output_wait_elapsed_us` ([UInt64](../../sql-reference/data-types/int-uint.md)) — Number of microseconds this processor was waiting because output port was full.
|
- `output_wait_elapsed_us` ([UInt64](../../sql-reference/data-types/int-uint.md)) — Number of microseconds this processor was waiting because output port was full.
|
||||||
- `plan_step` ([UInt64](../../sql-reference/data-types/int-uint.md)) — ID of the query plan step which created this processor. The value is zero if the processor was not added from any step.
|
|
||||||
- `plan_group` ([UInt64](../../sql-reference/data-types/int-uint.md)) — Group of the processor if it was created by query plan step. A group is a logical partitioning of processors added from the same query plan step. Group is used only for beautifying the result of EXPLAIN PIPELINE result.
|
|
||||||
- `input_rows` ([UInt64](../../sql-reference/data-types/int-uint.md)) — The number of rows consumed by processor.
|
- `input_rows` ([UInt64](../../sql-reference/data-types/int-uint.md)) — The number of rows consumed by processor.
|
||||||
- `input_bytes` ([UInt64](../../sql-reference/data-types/int-uint.md)) — The number of bytes consumed by processor.
|
- `input_bytes` ([UInt64](../../sql-reference/data-types/int-uint.md)) — The number of bytes consumed by processor.
|
||||||
- `output_rows` ([UInt64](../../sql-reference/data-types/int-uint.md)) — The number of rows generated by processor.
|
- `output_rows` ([UInt64](../../sql-reference/data-types/int-uint.md)) — The number of rows generated by processor.
|
||||||
|
@ -59,9 +59,10 @@ Columns:
|
|||||||
- `query_kind` ([LowCardinality(String)](../../sql-reference/data-types/lowcardinality.md)) — Type of the query.
|
- `query_kind` ([LowCardinality(String)](../../sql-reference/data-types/lowcardinality.md)) — Type of the query.
|
||||||
- `databases` ([Array](../../sql-reference/data-types/array.md)([LowCardinality(String)](../../sql-reference/data-types/lowcardinality.md))) — Names of the databases present in the query.
|
- `databases` ([Array](../../sql-reference/data-types/array.md)([LowCardinality(String)](../../sql-reference/data-types/lowcardinality.md))) — Names of the databases present in the query.
|
||||||
- `tables` ([Array](../../sql-reference/data-types/array.md)([LowCardinality(String)](../../sql-reference/data-types/lowcardinality.md))) — Names of the tables present in the query.
|
- `tables` ([Array](../../sql-reference/data-types/array.md)([LowCardinality(String)](../../sql-reference/data-types/lowcardinality.md))) — Names of the tables present in the query.
|
||||||
- `views` ([Array](../../sql-reference/data-types/array.md)([LowCardinality(String)](../../sql-reference/data-types/lowcardinality.md))) — Names of the (materialized or live) views present in the query.
|
|
||||||
- `columns` ([Array](../../sql-reference/data-types/array.md)([LowCardinality(String)](../../sql-reference/data-types/lowcardinality.md))) — Names of the columns present in the query.
|
- `columns` ([Array](../../sql-reference/data-types/array.md)([LowCardinality(String)](../../sql-reference/data-types/lowcardinality.md))) — Names of the columns present in the query.
|
||||||
|
- `partitions` ([Array](../../sql-reference/data-types/array.md)([LowCardinality(String)](../../sql-reference/data-types/lowcardinality.md))) — Names of the partitions present in the query.
|
||||||
- `projections` ([String](../../sql-reference/data-types/string.md)) — Names of the projections used during the query execution.
|
- `projections` ([String](../../sql-reference/data-types/string.md)) — Names of the projections used during the query execution.
|
||||||
|
- `views` ([Array](../../sql-reference/data-types/array.md)([LowCardinality(String)](../../sql-reference/data-types/lowcardinality.md))) — Names of the (materialized or live) views present in the query.
|
||||||
- `exception_code` ([Int32](../../sql-reference/data-types/int-uint.md)) — Code of an exception.
|
- `exception_code` ([Int32](../../sql-reference/data-types/int-uint.md)) — Code of an exception.
|
||||||
- `exception` ([String](../../sql-reference/data-types/string.md)) — Exception message.
|
- `exception` ([String](../../sql-reference/data-types/string.md)) — Exception message.
|
||||||
- `stack_trace` ([String](../../sql-reference/data-types/string.md)) — [Stack trace](https://en.wikipedia.org/wiki/Stack_trace). An empty string, if the query was completed successfully.
|
- `stack_trace` ([String](../../sql-reference/data-types/string.md)) — [Stack trace](https://en.wikipedia.org/wiki/Stack_trace). An empty string, if the query was completed successfully.
|
||||||
@ -97,8 +98,8 @@ Columns:
|
|||||||
- `forwarded_for` ([String](../../sql-reference/data-types/string.md)) — HTTP header `X-Forwarded-For` passed in the HTTP query.
|
- `forwarded_for` ([String](../../sql-reference/data-types/string.md)) — HTTP header `X-Forwarded-For` passed in the HTTP query.
|
||||||
- `quota_key` ([String](../../sql-reference/data-types/string.md)) — The `quota key` specified in the [quotas](../../operations/quotas.md) setting (see `keyed`).
|
- `quota_key` ([String](../../sql-reference/data-types/string.md)) — The `quota key` specified in the [quotas](../../operations/quotas.md) setting (see `keyed`).
|
||||||
- `revision` ([UInt32](../../sql-reference/data-types/int-uint.md)) — ClickHouse revision.
|
- `revision` ([UInt32](../../sql-reference/data-types/int-uint.md)) — ClickHouse revision.
|
||||||
- `ProfileEvents` ([Map(String, UInt64)](../../sql-reference/data-types/array.md)) — ProfileEvents that measure different metrics. The description of them could be found in the table [system.events](../../operations/system-tables/events.md#system_tables-events)
|
- `ProfileEvents` ([Map(String, UInt64)](../../sql-reference/data-types/map.md)) — ProfileEvents that measure different metrics. The description of them could be found in the table [system.events](../../operations/system-tables/events.md#system_tables-events)
|
||||||
- `Settings` ([Map(String, String)](../../sql-reference/data-types/array.md)) — Settings that were changed when the client ran the query. To enable logging changes to settings, set the `log_query_settings` parameter to 1.
|
- `Settings` ([Map(String, String)](../../sql-reference/data-types/map.md)) — Settings that were changed when the client ran the query. To enable logging changes to settings, set the `log_query_settings` parameter to 1.
|
||||||
- `log_comment` ([String](../../sql-reference/data-types/string.md)) — Log comment. It can be set to arbitrary string no longer than [max_query_size](../../operations/settings/settings.md#settings-max_query_size). An empty string if it is not defined.
|
- `log_comment` ([String](../../sql-reference/data-types/string.md)) — Log comment. It can be set to arbitrary string no longer than [max_query_size](../../operations/settings/settings.md#settings-max_query_size). An empty string if it is not defined.
|
||||||
- `thread_ids` ([Array(UInt64)](../../sql-reference/data-types/array.md)) — Thread ids that are participating in query execution.
|
- `thread_ids` ([Array(UInt64)](../../sql-reference/data-types/array.md)) — Thread ids that are participating in query execution.
|
||||||
- `used_aggregate_functions` ([Array(String)](../../sql-reference/data-types/array.md)) — Canonical names of `aggregate functions`, which were used during query execution.
|
- `used_aggregate_functions` ([Array(String)](../../sql-reference/data-types/array.md)) — Canonical names of `aggregate functions`, which were used during query execution.
|
||||||
|
@ -12,7 +12,7 @@ Columns:
|
|||||||
|
|
||||||
- `database` ([String](../../sql-reference/data-types/string.md)) — Database name.
|
- `database` ([String](../../sql-reference/data-types/string.md)) — Database name.
|
||||||
|
|
||||||
- `table` ([String](../../sql-reference/data-types/string.md)) — Table name.
|
- `table` ([String](../../sql-reference/data-types/string.md)) — Table name. Empty if the policy is for a database.
|
||||||
|
|
||||||
- `id` ([UUID](../../sql-reference/data-types/uuid.md)) — Row policy ID.
|
- `id` ([UUID](../../sql-reference/data-types/uuid.md)) — Row policy ID.
|
||||||
|
|
||||||
|
29
docs/en/operations/system-tables/zookeeper_connection.md
Normal file
29
docs/en/operations/system-tables/zookeeper_connection.md
Normal file
@ -0,0 +1,29 @@
|
|||||||
|
---
|
||||||
|
slug: /en/operations/system-tables/zookeeper_connection
|
||||||
|
---
|
||||||
|
# zookeeper_connection
|
||||||
|
|
||||||
|
This table does not exist if ZooKeeper is not configured. The 'system.zookeeper_connection' table shows current connections to ZooKeeper (including auxiliary ZooKeepers). Each row shows information about one connection.
|
||||||
|
|
||||||
|
Columns:
|
||||||
|
|
||||||
|
- `name` ([String](../../sql-reference/data-types/string.md)) — ZooKeeper cluster's name.
|
||||||
|
- `host` ([String](../../sql-reference/data-types/string.md)) — The hostname/IP of the ZooKeeper node that ClickHouse connected to.
|
||||||
|
- `port` ([String](../../sql-reference/data-types/string.md)) — The port of the ZooKeeper node that ClickHouse connected to.
|
||||||
|
- `index` ([UInt8](../../sql-reference/data-types/int-uint.md)) — The index of the ZooKeeper node that ClickHouse connected to. The index is from ZooKeeper config.
|
||||||
|
- `connected_time` ([String](../../sql-reference/data-types/string.md)) — When the connection was established.
|
||||||
|
- `is_expired` ([UInt8](../../sql-reference/data-types/int-uint.md)) — Is the current connection expired.
|
||||||
|
- `keeper_api_version` ([String](../../sql-reference/data-types/string.md)) — Keeper API version.
|
||||||
|
- `client_id` ([UInt64](../../sql-reference/data-types/int-uint.md)) — Session id of the connection.
|
||||||
|
|
||||||
|
Example:
|
||||||
|
|
||||||
|
``` sql
|
||||||
|
SELECT * FROM system.zookeeper_connection;
|
||||||
|
```
|
||||||
|
|
||||||
|
``` text
|
||||||
|
┌─name──────────────┬─host─────────┬─port─┬─index─┬──────connected_time─┬─is_expired─┬─keeper_api_version─┬──────────client_id─┐
|
||||||
|
│ default_zookeeper │ 127.0.0.1 │ 2181 │ 0 │ 2023-05-19 14:30:16 │ 0 │ 0 │ 216349144108826660 │
|
||||||
|
└───────────────────┴──────────────┴──────┴───────┴─────────────────────┴────────────┴────────────────────┴────────────────────┘
|
||||||
|
```
|
@ -183,8 +183,9 @@ Arguments:
|
|||||||
- `-S`, `--structure` — table structure for input data.
|
- `-S`, `--structure` — table structure for input data.
|
||||||
- `--input-format` — input format, `TSV` by default.
|
- `--input-format` — input format, `TSV` by default.
|
||||||
- `-f`, `--file` — path to data, `stdin` by default.
|
- `-f`, `--file` — path to data, `stdin` by default.
|
||||||
- `-q`, `--query` — queries to execute with `;` as delimiter. You must specify either `query` or `queries-file` option.
|
- `-q`, `--query` — queries to execute with `;` as delimiter. Cannot be used simultaneously with `--queries-file`.
|
||||||
- `--queries-file` - file path with queries to execute. You must specify either `query` or `queries-file` option.
|
- `--queries-file` - file path with queries to execute. Cannot be used simultaneously with `--query`.
|
||||||
|
- `--multiquery, -n` – If specified, multiple queries separated by semicolons can be listed after the `--query` option. For convenience, it is also possible to omit `--query` and pass the queries directly after `--multiquery`.
|
||||||
- `-N`, `--table` — table name where to put output data, `table` by default.
|
- `-N`, `--table` — table name where to put output data, `table` by default.
|
||||||
- `--format`, `--output-format` — output format, `TSV` by default.
|
- `--format`, `--output-format` — output format, `TSV` by default.
|
||||||
- `-d`, `--database` — default database, `_local` by default.
|
- `-d`, `--database` — default database, `_local` by default.
|
||||||
|
@ -0,0 +1,55 @@
|
|||||||
|
---
|
||||||
|
slug: /en/sql-reference/aggregate-functions/reference/first_value
|
||||||
|
sidebar_position: 7
|
||||||
|
---
|
||||||
|
|
||||||
|
# first_value
|
||||||
|
|
||||||
|
Selects the first encountered value, similar to `any`, but it can accept NULL.
|
||||||
|
|
||||||
|
## examples
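The examples below assume a small test table; a minimal, hypothetical definition could be:

```sql
-- b is Nullable so that NULL handling can be demonstrated.
CREATE TABLE test_data (a Int64, b Nullable(Int64)) ENGINE = Memory;
```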
|
||||||
|
|
||||||
|
```sql
|
||||||
|
insert into test_data (a,b) values (1,null), (2,3), (4, 5), (6,null)
|
||||||
|
```
|
||||||
|
|
||||||
|
### example1
|
||||||
|
The NULL value is ignored by default.
|
||||||
|
```sql
|
||||||
|
select first_value(b) from test_data
|
||||||
|
```
|
||||||
|
|
||||||
|
```text
|
||||||
|
┌─first_value_ignore_nulls(b)─┐
|
||||||
|
│ 3 │
|
||||||
|
└─────────────────────────────┘
|
||||||
|
|
||||||
|
```
|
||||||
|
|
||||||
|
### example2
|
||||||
|
The NULL value is ignored.
|
||||||
|
```sql
|
||||||
|
select first_value(b) ignore nulls from test_data
|
||||||
|
```
|
||||||
|
|
||||||
|
```text
|
||||||
|
┌─first_value_ignore_nulls(b)─┐
|
||||||
|
│ 3 │
|
||||||
|
└─────────────────────────────┘
|
||||||
|
|
||||||
|
```
|
||||||
|
|
||||||
|
### example3
|
||||||
|
The NULL value is accepted.
|
||||||
|
```sql
|
||||||
|
select first_value(b) respect nulls from test_data
|
||||||
|
```
|
||||||
|
|
||||||
|
```text
|
||||||
|
|
||||||
|
┌─first_value_respect_nulls(b)─┐
|
||||||
|
│ ᴺᵁᴸᴸ │
|
||||||
|
└──────────────────────────────┘
|
||||||
|
```
|
||||||
|
|
||||||
|
|
@ -26,6 +26,8 @@ ClickHouse-specific aggregate functions:
|
|||||||
|
|
||||||
- [anyHeavy](../../../sql-reference/aggregate-functions/reference/anyheavy.md)
|
- [anyHeavy](../../../sql-reference/aggregate-functions/reference/anyheavy.md)
|
||||||
- [anyLast](../../../sql-reference/aggregate-functions/reference/anylast.md)
|
- [anyLast](../../../sql-reference/aggregate-functions/reference/anylast.md)
|
||||||
|
- [first_value](../../../sql-reference/aggregate-functions/reference/first_value.md)
|
||||||
|
- [last_value](../../../sql-reference/aggregate-functions/reference/last_value.md)
|
||||||
- [argMin](../../../sql-reference/aggregate-functions/reference/argmin.md)
|
- [argMin](../../../sql-reference/aggregate-functions/reference/argmin.md)
|
||||||
- [argMax](../../../sql-reference/aggregate-functions/reference/argmax.md)
|
- [argMax](../../../sql-reference/aggregate-functions/reference/argmax.md)
|
||||||
- [avgWeighted](../../../sql-reference/aggregate-functions/reference/avgweighted.md)
|
- [avgWeighted](../../../sql-reference/aggregate-functions/reference/avgweighted.md)
|
||||||
|
@ -0,0 +1,53 @@
|
|||||||
|
---
|
||||||
|
slug: /en/sql-reference/aggregate-functions/reference/last_value
|
||||||
|
sidebar_position: 8
|
||||||
|
---
|
||||||
|
|
||||||
|
# last_value
|
||||||
|
|
||||||
|
Selects the last encountered value, similar to `anyLast`, but it can accept NULL.
|
||||||
|
|
||||||
|
|
||||||
|
## examples
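The examples below assume the same hypothetical `test_data` table as on the `first_value` page (columns `a Int64` and `b Nullable(Int64)`).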
|
||||||
|
|
||||||
|
```sql
|
||||||
|
insert into test_data (a,b) values (1,null), (2,3), (4, 5), (6,null)
|
||||||
|
```
|
||||||
|
|
||||||
|
### example1
|
||||||
|
The NULL value is ignored by default.
|
||||||
|
```sql
|
||||||
|
select last_value(b) from test_data
|
||||||
|
```
|
||||||
|
|
||||||
|
```text
|
||||||
|
┌─last_value_ignore_nulls(b)─┐
|
||||||
|
│ 5 │
|
||||||
|
└────────────────────────────┘
|
||||||
|
```
|
||||||
|
|
||||||
|
### example2
|
||||||
|
The NULL value is ignored.
|
||||||
|
```sql
|
||||||
|
select last_value(b) ignore nulls from test_data
|
||||||
|
```
|
||||||
|
|
||||||
|
```text
|
||||||
|
┌─last_value_ignore_nulls(b)─┐
|
||||||
|
│ 5 │
|
||||||
|
└────────────────────────────┘
|
||||||
|
```
|
||||||
|
|
||||||
|
### example3
|
||||||
|
The NULL value is accepted.
|
||||||
|
```sql
|
||||||
|
select last_value(b) respect nulls from test_data
|
||||||
|
```
|
||||||
|
|
||||||
|
```text
|
||||||
|
┌─last_value_respect_nulls(b)─┐
|
||||||
|
│ ᴺᵁᴸᴸ │
|
||||||
|
└─────────────────────────────┘
|
||||||
|
```
|
||||||
|
|
||||||
|
|
@ -46,8 +46,6 @@ SELECT [1, 2] AS x, toTypeName(x)
|
|||||||
|
|
||||||
## Working with Data Types
|
## Working with Data Types
|
||||||
|
|
||||||
The maximum size of an array is limited to one million elements.
|
|
||||||
|
|
||||||
When creating an array on the fly, ClickHouse automatically defines the argument type as the narrowest data type that can store all the listed arguments. If there are any [Nullable](../../sql-reference/data-types/nullable.md#data_type-nullable) or literal [NULL](../../sql-reference/syntax.md#null-literal) values, the type of an array element also becomes [Nullable](../../sql-reference/data-types/nullable.md).
|
When creating an array on the fly, ClickHouse automatically defines the argument type as the narrowest data type that can store all the listed arguments. If there are any [Nullable](../../sql-reference/data-types/nullable.md#data_type-nullable) or literal [NULL](../../sql-reference/syntax.md#null-literal) values, the type of an array element also becomes [Nullable](../../sql-reference/data-types/nullable.md).
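A quick illustration of this rule, using the `array` function from above:

```sql
-- The literal NULL forces the element type to become Nullable,
-- so the inferred type is Array(Nullable(UInt8)).
SELECT array(1, 2, NULL) AS x, toTypeName(x);
```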
|
||||||
|
|
||||||
If ClickHouse couldn’t determine the data type, it generates an exception. For instance, this happens when trying to create an array with strings and numbers simultaneously (`SELECT array(1, 'a')`).
|
If ClickHouse couldn’t determine the data type, it generates an exception. For instance, this happens when trying to create an array with strings and numbers simultaneously (`SELECT array(1, 'a')`).
|
||||||
|
@ -29,5 +29,5 @@ ClickHouse data types include:
|
|||||||
- **Tuples**: A [`Tuple` of elements](./tuple.md), each having an individual type.
|
- **Tuples**: A [`Tuple` of elements](./tuple.md), each having an individual type.
|
||||||
- **Nullable**: [`Nullable`](./nullable.md) allows you to store a value as `NULL` when a value is "missing" (instead of the column getting its default value for the data type)
|
- **Nullable**: [`Nullable`](./nullable.md) allows you to store a value as `NULL` when a value is "missing" (instead of the column getting its default value for the data type)
|
||||||
- **IP addresses**: use [`IPv4`](./domains/ipv4.md) and [`IPv6`](./domains/ipv6.md) to efficiently store IP addresses
|
- **IP addresses**: use [`IPv4`](./domains/ipv4.md) and [`IPv6`](./domains/ipv6.md) to efficiently store IP addresses
|
||||||
- **Geo types**: for[ geographical data](./geo.md), including `Point`, `Ring`, `Polygon` and `MultiPolygon`
|
- **Geo types**: for [geographical data](./geo.md), including `Point`, `Ring`, `Polygon` and `MultiPolygon`
|
||||||
- **Special data types**: including [`Expression`](./special-data-types/expression.md), [`Set`](./special-data-types/set.md), [`Nothing`](./special-data-types/nothing.md) and [`Interval`](./special-data-types/interval.md)
|
- **Special data types**: including [`Expression`](./special-data-types/expression.md), [`Set`](./special-data-types/set.md), [`Nothing`](./special-data-types/nothing.md) and [`Interval`](./special-data-types/interval.md)
|
@ -8,10 +8,6 @@ sidebar_label: Interval
|
|||||||
|
|
||||||
The family of data types representing time and date intervals. The resulting types of the [INTERVAL](../../../sql-reference/operators/index.md#operator-interval) operator.
|
The family of data types representing time and date intervals. The resulting types of the [INTERVAL](../../../sql-reference/operators/index.md#operator-interval) operator.
|
||||||
|
|
||||||
:::note
|
|
||||||
`Interval` data type values can’t be stored in tables.
|
|
||||||
:::
|
|
||||||
|
|
||||||
Structure:
|
Structure:
|
||||||
|
|
||||||
- Time interval as an unsigned integer value.
|
- Time interval as an unsigned integer value.
|
||||||
@ -19,6 +15,9 @@ Structure:
|
|||||||
|
|
||||||
Supported interval types:
|
Supported interval types:
|
||||||
|
|
||||||
|
- `NANOSECOND`
|
||||||
|
- `MICROSECOND`
|
||||||
|
- `MILLISECOND`
|
||||||
- `SECOND`
|
- `SECOND`
|
||||||
- `MINUTE`
|
- `MINUTE`
|
||||||
- `HOUR`
|
- `HOUR`
|
||||||
|
@ -267,14 +267,16 @@ or
|
|||||||
LAYOUT(HASHED())
|
LAYOUT(HASHED())
|
||||||
```
|
```
|
||||||
|
|
||||||
If `shards` greater then 1 (default is `1`) the dictionary will load data in parallel, useful if you have huge amount of elements in one dictionary.
|
|
||||||
|
|
||||||
Configuration example:
|
Configuration example:
|
||||||
|
|
||||||
``` xml
|
``` xml
|
||||||
<layout>
|
<layout>
|
||||||
<hashed>
|
<hashed>
|
||||||
|
<!-- If shards is greater than 1 (default is `1`) the dictionary will load
|
||||||
|
data in parallel, which is useful if you have a huge amount of elements in one
|
||||||
|
dictionary. -->
|
||||||
<shards>10</shards>
|
<shards>10</shards>
|
||||||
|
|
||||||
<!-- Size of the backlog for blocks in parallel queue.
|
<!-- Size of the backlog for blocks in parallel queue.
|
||||||
|
|
||||||
Since the bottleneck in parallel loading is rehash, and so to avoid
|
Since the bottleneck in parallel loading is rehash, and so to avoid
|
||||||
@ -284,6 +286,14 @@ Configuration example:
|
|||||||
10000 is a good balance between memory and speed.
|
10000 is a good balance between memory and speed.
|
||||||
Even for 10e10 elements it can handle all the load without starvation. -->
|
Even for 10e10 elements it can handle all the load without starvation. -->
|
||||||
<shard_load_queue_backlog>10000</shard_load_queue_backlog>
|
<shard_load_queue_backlog>10000</shard_load_queue_backlog>
|
||||||
|
|
||||||
|
<!-- Maximum load factor of the hash table; with greater values the memory
|
||||||
|
is utilized more efficiently (less memory is wasted) but read performance
|
||||||
|
may deteriorate.
|
||||||
|
|
||||||
|
Valid values: [0.5, 0.99]
|
||||||
|
Default: 0.5 -->
|
||||||
|
<max_load_factor>0.5</max_load_factor>
|
||||||
</hashed>
|
</hashed>
|
||||||
</layout>
|
</layout>
|
||||||
```
|
```
|
||||||
@ -291,7 +301,7 @@ Configuration example:
|
|||||||
or
|
or
|
||||||
|
|
||||||
``` sql
|
``` sql
|
||||||
LAYOUT(HASHED(SHARDS 10 [SHARD_LOAD_QUEUE_BACKLOG 10000]))
|
LAYOUT(HASHED([SHARDS 1] [SHARD_LOAD_QUEUE_BACKLOG 10000] [MAX_LOAD_FACTOR 0.5]))
|
||||||
```
|
```
|
||||||
|
|
||||||
### sparse_hashed
|
### sparse_hashed
|
||||||
@ -304,14 +314,18 @@ Configuration example:
|
|||||||
|
|
||||||
``` xml
|
``` xml
|
||||||
<layout>
|
<layout>
|
||||||
<sparse_hashed />
|
<sparse_hashed>
|
||||||
|
<!-- <shards>1</shards> -->
|
||||||
|
<!-- <shard_load_queue_backlog>10000</shard_load_queue_backlog> -->
|
||||||
|
<!-- <max_load_factor>0.5</max_load_factor> -->
|
||||||
|
</sparse_hashed>
|
||||||
</layout>
|
</layout>
|
||||||
```
|
```
|
||||||
|
|
||||||
or
|
or
|
||||||
|
|
||||||
``` sql
|
``` sql
|
||||||
LAYOUT(SPARSE_HASHED())
|
LAYOUT(SPARSE_HASHED([SHARDS 1] [SHARD_LOAD_QUEUE_BACKLOG 10000] [MAX_LOAD_FACTOR 0.5]))
|
||||||
```
|
```
|
||||||
|
|
||||||
It is also possible to use `shards` for this type of dictionary, and again it is more important for `sparse_hashed` than for `hashed`, since `sparse_hashed` is slower.
|
It is also possible to use `shards` for this type of dictionary, and again it is more important for `sparse_hashed` than for `hashed`, since `sparse_hashed` is slower.
|
||||||
@ -325,8 +339,9 @@ Configuration example:
|
|||||||
``` xml
|
``` xml
|
||||||
<layout>
|
<layout>
|
||||||
<complex_key_hashed>
|
<complex_key_hashed>
|
||||||
<shards>1</shards>
|
<!-- <shards>1</shards> -->
|
||||||
<!-- <shard_load_queue_backlog>10000</shard_load_queue_backlog> -->
|
<!-- <shard_load_queue_backlog>10000</shard_load_queue_backlog> -->
|
||||||
|
<!-- <max_load_factor>0.5</max_load_factor> -->
|
||||||
</complex_key_hashed>
|
</complex_key_hashed>
|
||||||
</layout>
|
</layout>
|
||||||
```
|
```
|
||||||
@ -334,7 +349,7 @@ Configuration example:
|
|||||||
or
|
or
|
||||||
|
|
||||||
``` sql
|
``` sql
|
||||||
LAYOUT(COMPLEX_KEY_HASHED([SHARDS 1] [SHARD_LOAD_QUEUE_BACKLOG 10000]))
|
LAYOUT(COMPLEX_KEY_HASHED([SHARDS 1] [SHARD_LOAD_QUEUE_BACKLOG 10000] [MAX_LOAD_FACTOR 0.5]))
|
||||||
```
|
```
|
||||||
|
|
||||||
### complex_key_sparse_hashed
|
### complex_key_sparse_hashed
|
||||||
@ -346,7 +361,9 @@ Configuration example:
|
|||||||
``` xml
|
``` xml
|
||||||
<layout>
|
<layout>
|
||||||
<complex_key_sparse_hashed>
|
<complex_key_sparse_hashed>
|
||||||
<shards>1</shards>
|
<!-- <shards>1</shards> -->
|
||||||
|
<!-- <shard_load_queue_backlog>10000</shard_load_queue_backlog> -->
|
||||||
|
<!-- <max_load_factor>0.5</max_load_factor> -->
|
||||||
</complex_key_sparse_hashed>
|
</complex_key_sparse_hashed>
|
||||||
</layout>
|
</layout>
|
||||||
```
|
```
|
||||||
@ -354,7 +371,7 @@ Configuration example:
|
|||||||
or
|
or
|
||||||
|
|
||||||
``` sql
|
``` sql
|
||||||
LAYOUT(COMPLEX_KEY_SPARSE_HASHED([SHARDS 1] [SHARD_LOAD_QUEUE_BACKLOG 10000]))
|
LAYOUT(COMPLEX_KEY_SPARSE_HASHED([SHARDS 1] [SHARD_LOAD_QUEUE_BACKLOG 10000] [MAX_LOAD_FACTOR 0.5]))
|
||||||
```
|
```
|
||||||
|
|
||||||
### hashed_array
|
### hashed_array
|
||||||
@ -848,16 +865,34 @@ LIFETIME(3600);
|
|||||||
|
|
||||||
The key must have only one `String` type attribute that contains an allowed IP prefix. Other types are not supported yet.
|
The key must have only one `String` type attribute that contains an allowed IP prefix. Other types are not supported yet.
|
||||||
|
|
||||||
For queries, you must use the same functions (`dictGetT` with a tuple) as for dictionaries with composite keys. The syntax is:
|
The syntax is:
|
||||||
|
|
||||||
``` sql
|
``` sql
|
||||||
dictGetT('dict_name', 'attr_name', tuple(ip))
|
dictGetT('dict_name', 'attr_name', ip)
|
||||||
```
|
```
|
||||||
|
|
||||||
The function takes either `UInt32` for IPv4, or `FixedString(16)` for IPv6. For example:
|
The function takes either `UInt32` for IPv4, or `FixedString(16)` for IPv6. For example:
|
||||||
|
|
||||||
``` sql
|
``` sql
|
||||||
select dictGet('my_ip_trie_dictionary', 'asn', tuple(IPv6StringToNum('2001:db8::1')))
|
SELECT dictGet('my_ip_trie_dictionary', 'cca2', toIPv4('202.79.32.10')) AS result;
|
||||||
|
|
||||||
|
┌─result─┐
|
||||||
|
│ NP │
|
||||||
|
└────────┘
|
||||||
|
|
||||||
|
|
||||||
|
SELECT dictGet('my_ip_trie_dictionary', 'asn', IPv6StringToNum('2001:db8::1')) AS result;
|
||||||
|
|
||||||
|
┌─result─┐
|
||||||
|
│ 65536 │
|
||||||
|
└────────┘
|
||||||
|
|
||||||
|
|
||||||
|
SELECT dictGet('my_ip_trie_dictionary', ('asn', 'cca2'), IPv6StringToNum('2001:db8::1')) AS result;
|
||||||
|
|
||||||
|
┌─result───────┐
|
||||||
|
│ (65536,'ZZ') │
|
||||||
|
└──────────────┘
|
||||||
```
|
```
|
||||||
|
|
||||||
Other types are not supported yet. The function returns the attribute for the prefix that corresponds to this IP address. If there are overlapping prefixes, the most specific one is returned.
|
Other types are not supported yet. The function returns the attribute for the prefix that corresponds to this IP address. If there are overlapping prefixes, the most specific one is returned.
|
||||||
@ -2197,16 +2232,16 @@ Result:
|
|||||||
└─────────────────────────────────┴───────┘
|
└─────────────────────────────────┴───────┘
|
||||||
```
|
```
|
||||||
|
|
||||||
## RegExp Tree Dictionary {#regexp-tree-dictionary}
|
## Regular Expression Tree Dictionary {#regexp-tree-dictionary}
|
||||||
|
|
||||||
Regexp Tree dictionary stores multiple trees of regular expressions with attributions. Users can retrieve strings in the dictionary. If a string matches the root of the regexp tree, we will collect the corresponding attributes of the matched root and continue to walk the children. If any of the children matches the string, we will collect attributes and rewrite the old ones if conflicts occur, then continue the traverse until we reach leaf nodes.
|
Regular expression tree dictionaries are a special type of dictionary which represent the mapping from key to attributes using a tree of regular expressions. There are some use cases, e.g. parsing of [user agent](https://en.wikipedia.org/wiki/User_agent) strings, which can be expressed elegantly with regexp tree dictionaries.
|
||||||
|
|
||||||
Example of the ddl query for creating Regexp Tree dictionary:
|
### Use Regular Expression Tree Dictionary in ClickHouse Open-Source
|
||||||
|
|
||||||
<CloudDetails />
|
Regular expression tree dictionaries are defined in ClickHouse open-source using the YAMLRegExpTree source, which is provided with the path to a YAML file containing the regular expression tree.
|
||||||
|
|
||||||
```sql
|
```sql
|
||||||
create dictionary regexp_dict
|
CREATE DICTIONARY regexp_dict
|
||||||
(
|
(
|
||||||
regexp String,
|
regexp String,
|
||||||
name String,
|
name String,
|
||||||
@ -2218,19 +2253,15 @@ LAYOUT(regexp_tree)
|
|||||||
...
|
...
|
||||||
```
|
```
|
||||||
|
|
||||||
We only allow `YAMLRegExpTree` to work with regexp_tree dicitionary layout. If you want to use other sources, please set variable `regexp_dict_allow_other_sources` true.
|
The dictionary source `YAMLRegExpTree` represents the structure of a regexp tree. For example:
|
||||||
|
|
||||||
**Source**
|
```yaml
|
||||||
|
|
||||||
We introduce a type of source called `YAMLRegExpTree` representing the structure of Regexp Tree dictionary. An Example of a valid yaml config is like:
|
|
||||||
|
|
||||||
```xml
|
|
||||||
- regexp: 'Linux/(\d+[\.\d]*).+tlinux'
|
- regexp: 'Linux/(\d+[\.\d]*).+tlinux'
|
||||||
name: 'TencentOS'
|
name: 'TencentOS'
|
||||||
version: '\1'
|
version: '\1'
|
||||||
|
|
||||||
- regexp: '\d+/tclwebkit(?:\d+[\.\d]*)'
|
- regexp: '\d+/tclwebkit(?:\d+[\.\d]*)'
|
||||||
name: 'Andriod'
|
name: 'Android'
|
||||||
versions:
|
versions:
|
||||||
- regexp: '33/tclwebkit'
|
- regexp: '33/tclwebkit'
|
||||||
version: '13'
|
version: '13'
|
||||||
@ -2242,17 +2273,14 @@ We introduce a type of source called `YAMLRegExpTree` representing the structure
|
|||||||
version: '10'
|
version: '10'
|
||||||
```
|
```
|
||||||
|
|
||||||
The key `regexp` represents the regular expression of a tree node. The name of key is same as the dictionary key. The `name` and `version` is user-defined attributions in the dicitionary. The `versions` (which can be any name that not appear in attributions or the key) indicates the children nodes of this tree.
|
This config consists of a list of regular expression tree nodes. Each node has the following structure:
|
||||||
|
|
||||||
**Back Reference**
|
- **regexp**: the regular expression of the node.
|
||||||
|
- **attributes**: a list of user-defined dictionary attributes. In this example, there are two attributes: `name` and `version`. The first node defines both attributes. The second node only defines attribute `name`. Attribute `version` is provided by the child nodes of the second node.
|
||||||
|
- The value of an attribute may contain **back references**, referring to capture groups of the matched regular expression. In the example, the value of attribute `version` in the first node consists of a back-reference `\1` to capture group `(\d+[\.\d]*)` in the regular expression. Back-reference numbers range from 1 to 9 and are written as `$1` or `\1` (for number 1). The back reference is replaced by the matched capture group during query execution.
|
||||||
|
- **child nodes**: a list of children of a regexp tree node, each of which has its own attributes and (potentially) child nodes. String matching proceeds in a depth-first fashion. If a string matches a regexp node, the dictionary checks if it also matches the node's child nodes. If that is the case, the attributes of the deepest matching node are assigned. Attributes of a child node overwrite equally named attributes of parent nodes. The name of child nodes in YAML files can be arbitrary, e.g. `versions` in the above example.
|
||||||
|
|
||||||
The value of an attribution could contain a back reference which refers to a capture group of the matched regular expression. Reference number ranges from 1 to 9 and writes as `$1` or `\1`.
|
Regexp tree dictionaries only allow access using functions `dictGet`, `dictGetOrDefault` and `dictGetOrNull`.
|
||||||
|
|
||||||
During the query execution, the back reference in the value will be replaced by the matched capture group.
|
|
||||||
|
|
||||||
**Query**
|
|
||||||
|
|
||||||
Due to the specialty of Regexp Tree dictionary, we only allow functions `dictGet`, `dictGetOrDefault` and `dictGetOrNull` work with it.
|
|
||||||
|
|
||||||
Example:
|
Example:
|
||||||
|
|
||||||
@ -2262,12 +2290,83 @@ SELECT dictGet('regexp_dict', ('name', 'version'), '31/tclwebkit1024');
|
|||||||
|
|
||||||
Result:
|
Result:
|
||||||
|
|
||||||
```
|
```text
|
||||||
┌─dictGet('regexp_dict', ('name', 'version'), '31/tclwebkit1024')─┐
|
┌─dictGet('regexp_dict', ('name', 'version'), '31/tclwebkit1024')─┐
|
||||||
│ ('Andriod','12') │
|
│ ('Android','12') │
|
||||||
└─────────────────────────────────────────────────────────────────┘
|
└─────────────────────────────────────────────────────────────────┘
|
||||||
```
|
```
|
||||||
|
|
||||||
|
In this case, we first match the regular expression `\d+/tclwebkit(?:\d+[\.\d]*)` in the top layer's second node. The dictionary then continues to look into the child nodes and finds that the string also matches `3[12]/tclwebkit`. As a result, the value of attribute `name` is `Android` (defined in the first layer) and the value of attribute `version` is `12` (defined in the child node).
|
||||||
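A sketch of the non-matching case with `dictGetOrDefault` (the default value `'unknown'` is illustrative):

```sql
-- The input string matches no node in the regexp tree, so the supplied default is returned.
SELECT dictGetOrDefault('regexp_dict', 'name', 'no match here', 'unknown');
```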
|
|
||||||
|
With a powerful YAML configuration file, we can use regexp tree dictionaries as a user agent string parser. We support [uap-core](https://github.com/ua-parser/uap-core) and demonstrate how to use it in the functional test [02504_regexp_dictionary_ua_parser](https://github.com/ClickHouse/ClickHouse/blob/master/tests/queries/0_stateless/02504_regexp_dictionary_ua_parser.sh).
|
||||||
|
|
||||||
|
### Use Regular Expression Tree Dictionary in ClickHouse Cloud
|
||||||
|
|
||||||
|
The `YAMLRegExpTree` source used above works in ClickHouse Open Source but not in ClickHouse Cloud. To use regexp tree dictionaries in ClickHouse Cloud, first create a regexp tree dictionary from a YAML file locally in ClickHouse Open Source, then dump this dictionary into a CSV file using the `dictionary` table function and the [INTO OUTFILE](../statements/select/into-outfile.md) clause.
|
||||||
|
|
||||||
|
```sql
|
||||||
|
SELECT * FROM dictionary(regexp_dict) INTO OUTFILE 'regexp_dict.csv'
|
||||||
|
```
|
||||||
|
|
||||||
|
The content of the CSV file is:
|
||||||
|
|
||||||
|
```text
|
||||||
|
1,0,"Linux/(\d+[\.\d]*).+tlinux","['version','name']","['\\1','TencentOS']"
|
||||||
|
2,0,"(\d+)/tclwebkit(\d+[\.\d]*)","['comment','version','name']","['test $1 and $2','$1','Android']"
|
||||||
|
3,2,"33/tclwebkit","['version']","['13']"
|
||||||
|
4,2,"3[12]/tclwebkit","['version']","['12']"
|
||||||
|
5,2,"3[12]/tclwebkit","['version']","['11']"
|
||||||
|
6,2,"3[12]/tclwebkit","['version']","['10']"
|
||||||
|
```
|
||||||
|
|
||||||
|
The schema of the dumped file is:
|
||||||
|
|
||||||
|
- `id UInt64`: the id of the RegexpTree node.
|
||||||
|
- `parent_id UInt64`: the id of the parent of a node.
|
||||||
|
- `regexp String`: the regular expression string.
|
||||||
|
- `keys Array(String)`: the names of user-defined attributes.
|
||||||
|
- `values Array(String)`: the values of user-defined attributes.
|
||||||
|
|
||||||
|
To create the dictionary in ClickHouse Cloud, first create a table `regexp_dictionary_source_table` with the following table structure:
|
||||||
|
|
||||||
|
```sql
|
||||||
|
CREATE TABLE regexp_dictionary_source_table
|
||||||
|
(
|
||||||
|
id UInt64,
|
||||||
|
parent_id UInt64,
|
||||||
|
regexp String,
|
||||||
|
keys Array(String),
|
||||||
|
values Array(String)
|
||||||
|
) ENGINE=Memory;
|
||||||
|
```
|
||||||
|
|
||||||
|
Then load the local CSV file into the source table:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
clickhouse client \
|
||||||
|
--host MY_HOST \
|
||||||
|
--secure \
|
||||||
|
--password MY_PASSWORD \
|
||||||
|
--query "
|
||||||
|
INSERT INTO regexp_dictionary_source_table
|
||||||
|
SELECT * FROM input ('id UInt64, parent_id UInt64, regexp String, keys Array(String), values Array(String)')
|
||||||
|
FORMAT CSV" < regexp_dict.csv
|
||||||
|
```
|
||||||
|
|
||||||
|
See [Insert Local Files](https://clickhouse.com/docs/en/integrations/data-ingestion/insert-local-files) for more details. After the source table is initialized, we can create the regexp tree dictionary from the table source:
|
||||||
|
|
||||||
|
``` sql
|
||||||
|
CREATE DICTIONARY regexp_dict
|
||||||
|
(
|
||||||
|
regexp String,
|
||||||
|
name String,
|
||||||
|
version String
)
|
||||||
|
PRIMARY KEY(regexp)
|
||||||
|
SOURCE(CLICKHOUSE(TABLE 'regexp_dictionary_source_table'))
|
||||||
|
LIFETIME(0)
|
||||||
|
LAYOUT(regexp_tree);
|
||||||
|
```
|
||||||
|
|
||||||
## Embedded Dictionaries {#embedded-dictionaries}
|
## Embedded Dictionaries {#embedded-dictionaries}
|
||||||
|
|
||||||
<SelfManaged />
|
<SelfManaged />
|
||||||
|
@ -20,7 +20,7 @@ Strings are compared byte-by-byte. Note that this may lead to unexpected results
|
|||||||
|
|
||||||
A string S1 which has another string S2 as prefix is considered longer than S2.
|
A string S1 which has another string S2 as prefix is considered longer than S2.
|
||||||
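For instance, an illustrative query:

```sql
SELECT 'abc' < 'abcd'; -- returns 1, because 'abc' is a prefix of 'abcd' and therefore compares as smaller
```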
|
|
||||||
## equals
|
## equals, `=`, `==` operators
|
||||||
|
|
||||||
**Syntax**
|
**Syntax**
|
||||||
|
|
||||||
@ -32,7 +32,7 @@ Alias:
|
|||||||
- `a = b` (operator)
|
- `a = b` (operator)
|
||||||
- `a == b` (operator)
|
- `a == b` (operator)
|
||||||
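A minimal illustrative query showing the function and its operator aliases:

```sql
SELECT equals(1, 1) AS fn, 1 = 1 AS op_eq, 1 == 1 AS op_double_eq; -- all three return 1
```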
|
|
||||||
## notEquals
|
## notEquals, `!=`, `<>` operators
|
||||||
|
|
||||||
**Syntax**
|
**Syntax**
|
||||||
|
|
||||||
@ -44,7 +44,7 @@ Alias:
|
|||||||
- `a != b` (operator)
|
- `a != b` (operator)
|
||||||
- `a <> b` (operator)
|
- `a <> b` (operator)
|
||||||
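A minimal illustrative query showing the function and its operator aliases:

```sql
SELECT notEquals(1, 2) AS fn, 1 != 2 AS op_ne, 1 <> 2 AS op_angle; -- all three return 1
```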
|
|
||||||
## less
|
## less, `<` operator
|
||||||
|
|
||||||
**Syntax**
|
**Syntax**
|
||||||
|
|
||||||
@ -55,7 +55,7 @@ less(a, b)
|
|||||||
Alias:
|
Alias:
|
||||||
- `a < b` (operator)
|
- `a < b` (operator)
|
||||||
|
|
||||||
## greater
|
## greater, `>` operator
|
||||||
|
|
||||||
**Syntax**
|
**Syntax**
|
||||||
|
|
||||||
@ -66,7 +66,7 @@ greater(a, b)
|
|||||||
Alias:
|
Alias:
|
||||||
- `a > b` (operator)
|
- `a > b` (operator)
|
||||||
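An illustrative query for the ordering comparisons:

```sql
SELECT less(2, 3), 2 < 3, greater(3, 2), 2 <= 2, 3 >= 4; -- returns 1, 1, 1, 1, 0
```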
|
|
||||||
## lessOrEquals
|
## lessOrEquals, `<=` operator
|
||||||
|
|
||||||
**Syntax**
|
**Syntax**
|
||||||
|
|
||||||
|
@ -152,3 +152,85 @@ FROM LEFT_RIGHT
|
|||||||
│ 4 │ ᴺᵁᴸᴸ │ Both equal │
|
│ 4 │ ᴺᵁᴸᴸ │ Both equal │
|
||||||
└──────┴───────┴──────────────────┘
|
└──────┴───────┴──────────────────┘
|
||||||
```
|
```
|
||||||
|
|
||||||
|
## greatest
|
||||||
|
|
||||||
|
Returns the greatest across a list of values. All of the list members must be of comparable types.
|
||||||
|
|
||||||
|
Examples:
|
||||||
|
|
||||||
|
```sql
|
||||||
|
SELECT greatest(1, 2, toUInt8(3), 3.) result, toTypeName(result) type;
|
||||||
|
```
|
||||||
|
```response
|
||||||
|
┌─result─┬─type────┐
|
||||||
|
│ 3 │ Float64 │
|
||||||
|
└────────┴─────────┘
|
||||||
|
```
|
||||||
|
|
||||||
|
:::note
|
||||||
|
The type returned is a Float64 as the UInt8 must be promoted to 64 bit for the comparison.
|
||||||
|
:::
|
||||||
|
|
||||||
|
```sql
|
||||||
|
SELECT greatest(['hello'], ['there'], ['world'])
|
||||||
|
```
|
||||||
|
```response
|
||||||
|
┌─greatest(['hello'], ['there'], ['world'])─┐
|
||||||
|
│ ['world'] │
|
||||||
|
└───────────────────────────────────────────┘
|
||||||
|
```
|
||||||
|
|
||||||
|
```sql
|
||||||
|
SELECT greatest(toDateTime32(now() + toIntervalDay(1)), toDateTime64(now(), 3))
|
||||||
|
```
|
||||||
|
```response
|
||||||
|
┌─greatest(toDateTime32(plus(now(), toIntervalDay(1))), toDateTime64(now(), 3))─┐
|
||||||
|
│ 2023-05-12 01:16:59.000 │
|
||||||
|
└───────────────────────────────────────────────────────────────────────────────┘
|
||||||
|
```
|
||||||
|
|
||||||
|
:::note
|
||||||
|
The type returned is a DateTime64 as the DateTime32 must be promoted to 64 bits for the comparison.
|
||||||
|
:::
|
||||||
|
|
||||||
|
## least
|
||||||
|
|
||||||
|
Returns the least across a list of values. All of the list members must be of comparable types.
|
||||||
|
|
||||||
|
Examples:
|
||||||
|
|
||||||
|
```sql
|
||||||
|
SELECT least(1, 2, toUInt8(3), 3.) result, toTypeName(result) type;
|
||||||
|
```
|
||||||
|
```response
|
||||||
|
┌─result─┬─type────┐
|
||||||
|
│ 1 │ Float64 │
|
||||||
|
└────────┴─────────┘
|
||||||
|
```
|
||||||
|
|
||||||
|
:::note
|
||||||
|
The type returned is a Float64 as the UInt8 must be promoted to 64 bit for the comparison.
|
||||||
|
:::
|
||||||
|
|
||||||
|
```sql
|
||||||
|
SELECT least(['hello'], ['there'], ['world'])
|
||||||
|
```
|
||||||
|
```response
|
||||||
|
┌─least(['hello'], ['there'], ['world'])─┐
|
||||||
|
│ ['hello'] │
|
||||||
|
└────────────────────────────────────────┘
|
||||||
|
```
|
||||||
|
|
||||||
|
```sql
|
||||||
|
SELECT least(toDateTime32(now() + toIntervalDay(1)), toDateTime64(now(), 3))
|
||||||
|
```
|
||||||
|
```response
|
||||||
|
┌─least(toDateTime32(plus(now(), toIntervalDay(1))), toDateTime64(now(), 3))─┐
|
||||||
|
│ 2023-05-12 01:16:59.000 │
|
||||||
|
└────────────────────────────────────────────────────────────────────────────┘
|
||||||
|
```
|
||||||
|
|
||||||
|
:::note
|
||||||
|
The type returned is a DateTime64 as the DateTime32 must be promoted to 64 bits for the comparison.
|
||||||
|
:::
|
||||||
|
@ -357,14 +357,14 @@ Alias: `SECOND`.
|
|||||||
|
|
||||||
## toUnixTimestamp
|
## toUnixTimestamp
|
||||||
|
|
||||||
For DateTime arguments: converts the value to the number with type UInt32 -- Unix Timestamp (https://en.wikipedia.org/wiki/Unix_time).
|
Converts a string, a date or a date with time to the [Unix Timestamp](https://en.wikipedia.org/wiki/Unix_time) in `UInt32` representation.
|
||||||
|
|
||||||
For String argument: converts the input string to the datetime according to the timezone (optional second argument, server timezone is used by default) and returns the corresponding unix timestamp.
|
If the function is called with a string, it accepts an optional timezone argument.
|
||||||
|
|
||||||
**Syntax**
|
**Syntax**
|
||||||
|
|
||||||
``` sql
|
``` sql
|
||||||
toUnixTimestamp(datetime)
|
toUnixTimestamp(date)
|
||||||
toUnixTimestamp(str, [timezone])
|
toUnixTimestamp(str, [timezone])
|
||||||
```
|
```
|
||||||
|
|
||||||
@ -377,15 +377,29 @@ Type: `UInt32`.
|
|||||||
**Example**
|
**Example**
|
||||||
|
|
||||||
``` sql
|
``` sql
|
||||||
SELECT toUnixTimestamp('2017-11-05 08:07:47', 'Asia/Tokyo') AS unix_timestamp
|
SELECT
|
||||||
|
'2017-11-05 08:07:47' AS dt_str,
|
||||||
|
toUnixTimestamp(dt_str) AS from_str,
|
||||||
|
toUnixTimestamp(dt_str, 'Asia/Tokyo') AS from_str_tokyo,
|
||||||
|
toUnixTimestamp(toDateTime(dt_str)) AS from_datetime,
|
||||||
|
toUnixTimestamp(toDateTime64(dt_str, 0)) AS from_datetime64,
|
||||||
|
toUnixTimestamp(toDate(dt_str)) AS from_date,
|
||||||
|
toUnixTimestamp(toDate32(dt_str)) AS from_date32
|
||||||
|
FORMAT Vertical;
|
||||||
```
|
```
|
||||||
|
|
||||||
Result:
|
Result:
|
||||||
|
|
||||||
``` text
|
``` text
|
||||||
┌─unix_timestamp─┐
|
Row 1:
|
||||||
│ 1509836867 │
|
──────
|
||||||
└────────────────┘
|
dt_str: 2017-11-05 08:07:47
|
||||||
|
from_str: 1509869267
|
||||||
|
from_str_tokyo: 1509836867
|
||||||
|
from_datetime: 1509869267
|
||||||
|
from_datetime64: 1509869267
|
||||||
|
from_date: 1509840000
|
||||||
|
from_date32: 1509840000
|
||||||
```
|
```
|
||||||
|
|
||||||
:::note
|
:::note
|
||||||
@ -1218,12 +1232,16 @@ Rounds the time to the half hour.
|
|||||||
|
|
||||||
Converts a date or date with time to a UInt32 number containing the year and month number (YYYY \* 100 + MM). Accepts a second optional timezone argument. If provided, the timezone must be a string constant.
|
Converts a date or date with time to a UInt32 number containing the year and month number (YYYY \* 100 + MM). Accepts a second optional timezone argument. If provided, the timezone must be a string constant.
|
||||||
|
|
||||||
### example
|
**Example**
|
||||||
```sql
|
|
||||||
|
``` sql
|
||||||
SELECT
|
SELECT
|
||||||
toYYYYMM(now(), 'US/Eastern')
|
toYYYYMM(now(), 'US/Eastern')
|
||||||
```
|
```
|
||||||
```response
|
|
||||||
|
Result:
|
||||||
|
|
||||||
|
``` text
|
||||||
┌─toYYYYMM(now(), 'US/Eastern')─┐
|
┌─toYYYYMM(now(), 'US/Eastern')─┐
|
||||||
│ 202303 │
|
│ 202303 │
|
||||||
└───────────────────────────────┘
|
└───────────────────────────────┘
|
||||||
@ -1233,11 +1251,15 @@ SELECT
|
|||||||
|
|
||||||
Converts a date or date with time to a UInt32 number containing the year and month number (YYYY \* 10000 + MM \* 100 + DD). Accepts a second optional timezone argument. If provided, the timezone must be a string constant.
|
Converts a date or date with time to a UInt32 number containing the year and month number (YYYY \* 10000 + MM \* 100 + DD). Accepts a second optional timezone argument. If provided, the timezone must be a string constant.
|
||||||
|
|
||||||
### example
|
**Example**
|
||||||
|
|
||||||
```sql
|
```sql
|
||||||
SELECT
|
SELECT
|
||||||
toYYYYMMDD(now(), 'US/Eastern')
|
toYYYYMMDD(now(), 'US/Eastern')
|
||||||
```
|
```
|
||||||
|
|
||||||
|
Result:
|
||||||
|
|
||||||
```response
|
```response
|
||||||
┌─toYYYYMMDD(now(), 'US/Eastern')─┐
|
┌─toYYYYMMDD(now(), 'US/Eastern')─┐
|
||||||
│ 20230302 │
|
│ 20230302 │
|
||||||
@ -1248,11 +1270,15 @@ SELECT
|
|||||||
|
|
||||||
Converts a date or date with time to a UInt64 number containing the year and month number (YYYY \* 10000000000 + MM \* 100000000 + DD \* 1000000 + hh \* 10000 + mm \* 100 + ss). Accepts a second optional timezone argument. If provided, the timezone must be a string constant.
|
Converts a date or date with time to a UInt64 number containing the year and month number (YYYY \* 10000000000 + MM \* 100000000 + DD \* 1000000 + hh \* 10000 + mm \* 100 + ss). Accepts a second optional timezone argument. If provided, the timezone must be a string constant.
|
||||||
|
|
||||||
### example
|
**Example**
|
||||||
|
|
||||||
```sql
|
```sql
|
||||||
SELECT
|
SELECT
|
||||||
toYYYYMMDDhhmmss(now(), 'US/Eastern')
|
toYYYYMMDDhhmmss(now(), 'US/Eastern')
|
||||||
```
|
```
|
||||||
|
|
||||||
|
Result:
|
||||||
|
|
||||||
```response
|
```response
|
||||||
┌─toYYYYMMDDhhmmss(now(), 'US/Eastern')─┐
|
┌─toYYYYMMDDhhmmss(now(), 'US/Eastern')─┐
|
||||||
│ 20230302112209 │
|
│ 20230302112209 │
|
||||||
|
396 docs/en/sql-reference/functions/geo/polygon.md (new file)
File diff suppressed because one or more lines are too long
@ -279,6 +279,8 @@ cityHash64(par1,...)
|
|||||||
|
|
||||||
This is a fast non-cryptographic hash function. It uses the CityHash algorithm for string parameters and implementation-specific fast non-cryptographic hash function for parameters with other data types. The function uses the CityHash combinator to get the final results.
|
This is a fast non-cryptographic hash function. It uses the CityHash algorithm for string parameters and implementation-specific fast non-cryptographic hash function for parameters with other data types. The function uses the CityHash combinator to get the final results.
|
||||||
|
|
||||||
|
Note that Google changed the algorithm of CityHash after it was added to ClickHouse. In other words, ClickHouse's cityHash64 and Google's upstream CityHash now produce different results. ClickHouse cityHash64 corresponds to CityHash v1.0.2.
|
||||||
|
|
||||||
**Arguments**
|
**Arguments**
|
||||||
|
|
||||||
The function takes a variable number of input parameters. Arguments can be any of the [supported data types](/docs/en/sql-reference/data-types/index.md). For some data types, the calculated value of the hash function may be the same for the same values even if the types of arguments differ (integers of different size, named and unnamed `Tuple` with the same data, `Map` and the corresponding `Array(Tuple(key, value))` type with the same data).
|
The function takes a variable number of input parameters. Arguments can be any of the [supported data types](/docs/en/sql-reference/data-types/index.md). For some data types, the calculated value of the hash function may be the same for the same values even if the types of arguments differ (integers of different size, named and unnamed `Tuple` with the same data, `Map` and the corresponding `Array(Tuple(key, value))` type with the same data).
|
||||||
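As an illustration of the variadic form (the `events` table and its columns are hypothetical):

```sql
-- Combine several columns of different types into a single 64-bit hash.
SELECT cityHash64(user_id, event_date, url) AS event_hash
FROM events
LIMIT 3;
```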
|
@ -59,244 +59,6 @@ A lambda function that accepts multiple arguments can also be passed to a higher
|
|||||||
|
|
||||||
For some functions the first argument (the lambda function) can be omitted. In this case, identical mapping is assumed.
|
For some functions the first argument (the lambda function) can be omitted. In this case, identical mapping is assumed.
|
||||||
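For example, `arraySort` is one such function; a small sketch:

```sql
SELECT
    arraySort([3, 1, 2]) AS without_lambda,       -- identity mapping is assumed: [1,2,3]
    arraySort(x -> -x, [3, 1, 2]) AS with_lambda; -- explicit lambda: [3,2,1]
```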
|
|
||||||
## SQL User Defined Functions
|
## User Defined Functions (UDFs)
|
||||||
|
|
||||||
Custom functions from lambda expressions can be created using the [CREATE FUNCTION](../statements/create/function.md) statement. To delete these functions use the [DROP FUNCTION](../statements/drop.md#drop-function) statement.
|
ClickHouse supports user-defined functions. See [UDFs](/docs/en/sql-reference/functions/udf.md).
|
||||||
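As a quick sketch of the SQL UDF syntax referenced above (the function name is arbitrary):

```sql
CREATE FUNCTION linear_equation AS (x, k, b) -> k * x + b;
SELECT linear_equation(number, 2, 1) FROM numbers(3); -- returns 1, 3, 5
DROP FUNCTION linear_equation;
```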
|
|
||||||
## Executable User Defined Functions
|
|
||||||
ClickHouse can call any external executable program or script to process data.
|
|
||||||
|
|
||||||
The configuration of executable user defined functions can be located in one or more xml-files. The path to the configuration is specified in the [user_defined_executable_functions_config](../../operations/server-configuration-parameters/settings.md#server_configuration_parameters-user_defined_executable_functions_config) parameter.
|
|
||||||
|
|
||||||
A function configuration contains the following settings:
|
|
||||||
|
|
||||||
- `name` - a function name.
|
|
||||||
- `command` - script name to execute or command if `execute_direct` is false.
|
|
||||||
- `argument` - argument description with the `type`, and optional `name` of an argument. Each argument is described in a separate setting. Specifying name is necessary if argument names are part of serialization for user defined function format like [Native](../../interfaces/formats.md#native) or [JSONEachRow](../../interfaces/formats.md#jsoneachrow). Default argument name value is `c` + argument_number.
|
|
||||||
- `format` - a [format](../../interfaces/formats.md) in which arguments are passed to the command.
|
|
||||||
- `return_type` - the type of a returned value.
|
|
||||||
- `return_name` - name of retuned value. Specifying return name is necessary if return name is part of serialization for user defined function format like [Native](../../interfaces/formats.md#native) or [JSONEachRow](../../interfaces/formats.md#jsoneachrow). Optional. Default value is `result`.
|
|
||||||
- `type` - an executable type. If `type` is set to `executable` then single command is started. If it is set to `executable_pool` then a pool of commands is created.
|
|
||||||
- `max_command_execution_time` - maximum execution time in seconds for processing block of data. This setting is valid for `executable_pool` commands only. Optional. Default value is `10`.
|
|
||||||
- `command_termination_timeout` - time in seconds during which a command should finish after its pipe is closed. After that time `SIGTERM` is sent to the process executing the command. Optional. Default value is `10`.
|
|
||||||
- `command_read_timeout` - timeout for reading data from command stdout in milliseconds. Default value 10000. Optional parameter.
|
|
||||||
- `command_write_timeout` - timeout for writing data to command stdin in milliseconds. Default value 10000. Optional parameter.
|
|
||||||
- `pool_size` - the size of a command pool. Optional. Default value is `16`.
|
|
||||||
- `send_chunk_header` - controls whether to send row count before sending a chunk of data to process. Optional. Default value is `false`.
|
|
||||||
- `execute_direct` - If `execute_direct` = `1`, then `command` will be searched inside user_scripts folder specified by [user_scripts_path](../../operations/server-configuration-parameters/settings.md#server_configuration_parameters-user_scripts_path). Additional script arguments can be specified using whitespace separator. Example: `script_name arg1 arg2`. If `execute_direct` = `0`, `command` is passed as argument for `bin/sh -c`. Default value is `1`. Optional parameter.
|
|
||||||
- `lifetime` - the reload interval of a function in seconds. If it is set to `0` then the function is not reloaded. Default value is `0`. Optional parameter.
|
|
||||||
|
|
||||||
The command must read arguments from `STDIN` and must output the result to `STDOUT`. The command must process arguments iteratively. That is after processing a chunk of arguments it must wait for the next chunk.
|
|
||||||
|
|
||||||
**Example**
|
|
||||||
|
|
||||||
Creating `test_function` using XML configuration.
|
|
||||||
File `test_function.xml` (`/etc/clickhouse-server/test_function.xml` with default path settings).
|
|
||||||
```xml
|
|
||||||
<functions>
|
|
||||||
<function>
|
|
||||||
<type>executable</type>
|
|
||||||
<name>test_function_python</name>
|
|
||||||
<return_type>String</return_type>
|
|
||||||
<argument>
|
|
||||||
<type>UInt64</type>
|
|
||||||
<name>value</name>
|
|
||||||
</argument>
|
|
||||||
<format>TabSeparated</format>
|
|
||||||
<command>test_function.py</command>
|
|
||||||
</function>
|
|
||||||
</functions>
|
|
||||||
```
|
|
||||||
|
|
||||||
Script file inside `user_scripts` folder `test_function.py` (`/var/lib/clickhouse/user_scripts/test_function.py` with default path settings).
|
|
||||||
|
|
||||||
```python
|
|
||||||
#!/usr/bin/python3
|
|
||||||
|
|
||||||
import sys
|
|
||||||
|
|
||||||
if __name__ == '__main__':
|
|
||||||
for line in sys.stdin:
|
|
||||||
print("Value " + line, end='')
|
|
||||||
sys.stdout.flush()
|
|
||||||
```
|
|
||||||
|
|
||||||
Query:
|
|
||||||
|
|
||||||
``` sql
|
|
||||||
SELECT test_function_python(toUInt64(2));
|
|
||||||
```
|
|
||||||
|
|
||||||
Result:
|
|
||||||
|
|
||||||
``` text
|
|
||||||
┌─test_function_python(2)─┐
|
|
||||||
│ Value 2 │
|
|
||||||
└─────────────────────────┘
|
|
||||||
```
|
|
||||||
|
|
||||||
Creating `test_function_sum` manually specifying `execute_direct` to `0` using XML configuration.
|
|
||||||
File `test_function.xml` (`/etc/clickhouse-server/test_function.xml` with default path settings).
|
|
||||||
```xml
|
|
||||||
<functions>
|
|
||||||
<function>
|
|
||||||
<type>executable</type>
|
|
||||||
<name>test_function_sum</name>
|
|
||||||
<return_type>UInt64</return_type>
|
|
||||||
<argument>
|
|
||||||
<type>UInt64</type>
|
|
||||||
<name>lhs</name>
|
|
||||||
</argument>
|
|
||||||
<argument>
|
|
||||||
<type>UInt64</type>
|
|
||||||
<name>rhs</name>
|
|
||||||
</argument>
|
|
||||||
<format>TabSeparated</format>
|
|
||||||
<command>cd /; clickhouse-local --input-format TabSeparated --output-format TabSeparated --structure 'x UInt64, y UInt64' --query "SELECT x + y FROM table"</command>
|
|
||||||
<execute_direct>0</execute_direct>
|
|
||||||
</function>
|
|
||||||
</functions>
|
|
||||||
```
|
|
||||||
|
|
||||||
Query:
|
|
||||||
|
|
||||||
``` sql
|
|
||||||
SELECT test_function_sum(2, 2);
|
|
||||||
```
|
|
||||||
|
|
||||||
Result:
|
|
||||||
|
|
||||||
``` text
|
|
||||||
┌─test_function_sum(2, 2)─┐
|
|
||||||
│ 4 │
|
|
||||||
└─────────────────────────┘
|
|
||||||
```
|
|
||||||
|
|
||||||
Creating `test_function_sum_json` with named arguments and format [JSONEachRow](../../interfaces/formats.md#jsoneachrow) using XML configuration.
|
|
||||||
File `test_function.xml` (`/etc/clickhouse-server/test_function.xml` with default path settings).
|
|
||||||
```xml
|
|
||||||
<functions>
|
|
||||||
<function>
|
|
||||||
<type>executable</type>
|
|
||||||
<name>test_function_sum_json</name>
|
|
||||||
<return_type>UInt64</return_type>
|
|
||||||
<return_name>result_name</return_name>
|
|
||||||
<argument>
|
|
||||||
<type>UInt64</type>
|
|
||||||
<name>argument_1</name>
|
|
||||||
</argument>
|
|
||||||
<argument>
|
|
||||||
<type>UInt64</type>
|
|
||||||
<name>argument_2</name>
|
|
||||||
</argument>
|
|
||||||
<format>JSONEachRow</format>
|
|
||||||
<command>test_function_sum_json.py</command>
|
|
||||||
</function>
|
|
||||||
</functions>
|
|
||||||
```
|
|
||||||
|
|
||||||
Script file inside `user_scripts` folder `test_function_sum_json.py` (`/var/lib/clickhouse/user_scripts/test_function_sum_json.py` with default path settings).
|
|
||||||
|
|
||||||
```python
|
|
||||||
#!/usr/bin/python3
|
|
||||||
|
|
||||||
import sys
|
|
||||||
import json
|
|
||||||
|
|
||||||
if __name__ == '__main__':
|
|
||||||
for line in sys.stdin:
|
|
||||||
value = json.loads(line)
|
|
||||||
first_arg = int(value['argument_1'])
|
|
||||||
second_arg = int(value['argument_2'])
|
|
||||||
result = {'result_name': first_arg + second_arg}
|
|
||||||
print(json.dumps(result), end='\n')
|
|
||||||
sys.stdout.flush()
|
|
||||||
```
|
|
||||||
|
|
||||||
Query:
|
|
||||||
|
|
||||||
``` sql
|
|
||||||
SELECT test_function_sum_json(2, 2);
|
|
||||||
```
|
|
||||||
|
|
||||||
Result:
|
|
||||||
|
|
||||||
``` text
|
|
||||||
┌─test_function_sum_json(2, 2)─┐
|
|
||||||
│ 4 │
|
|
||||||
└──────────────────────────────┘
|
|
||||||
```
|
|
||||||
|
|
||||||
Executable user defined functions can take constant parameters configured in `command` setting (works only for user defined functions with `executable` type).
|
|
||||||
File `test_function_parameter_python.xml` (`/etc/clickhouse-server/test_function_parameter_python.xml` with default path settings).
|
|
||||||
```xml
|
|
||||||
<functions>
|
|
||||||
<function>
|
|
||||||
<type>executable</type>
|
|
||||||
<name>test_function_parameter_python</name>
|
|
||||||
<return_type>String</return_type>
|
|
||||||
<argument>
|
|
||||||
<type>UInt64</type>
|
|
||||||
</argument>
|
|
||||||
<format>TabSeparated</format>
|
|
||||||
<command>test_function_parameter_python.py {test_parameter:UInt64}</command>
|
|
||||||
</function>
|
|
||||||
</functions>
|
|
||||||
```
|
|
||||||
|
|
||||||
Script file inside `user_scripts` folder `test_function_parameter_python.py` (`/var/lib/clickhouse/user_scripts/test_function_parameter_python.py` with default path settings).
|
|
||||||
|
|
||||||
```python
|
|
||||||
#!/usr/bin/python3
|
|
||||||
|
|
||||||
import sys
|
|
||||||
|
|
||||||
if __name__ == "__main__":
|
|
||||||
for line in sys.stdin:
|
|
||||||
print("Parameter " + str(sys.argv[1]) + " value " + str(line), end="")
|
|
||||||
sys.stdout.flush()
|
|
||||||
```
|
|
||||||
|
|
||||||
Query:
|
|
||||||
|
|
||||||
``` sql
|
|
||||||
SELECT test_function_parameter_python(1)(2);
|
|
||||||
```
|
|
||||||
|
|
||||||
Result:
|
|
||||||
|
|
||||||
``` text
|
|
||||||
┌─test_function_parameter_python(1)(2)─┐
|
|
||||||
│ Parameter 1 value 2 │
|
|
||||||
└──────────────────────────────────────┘
|
|
||||||
```
|
|
||||||
|
|
||||||
## Error Handling
|
|
||||||
|
|
||||||
Some functions might throw an exception if the data is invalid. In this case, the query is canceled and an error text is returned to the client. For distributed processing, when an exception occurs on one of the servers, the other servers also attempt to abort the query.
|
|
||||||
|
|
||||||
## Evaluation of Argument Expressions
|
|
||||||
|
|
||||||
In almost all programming languages, one of the arguments might not be evaluated for certain operators. This is usually the operators `&&`, `||`, and `?:`.
|
|
||||||
But in ClickHouse, arguments of functions (operators) are always evaluated. This is because entire parts of columns are evaluated at once, instead of calculating each row separately.
|
|
||||||
|
|
||||||
## Performing Functions for Distributed Query Processing
|
|
||||||
|
|
||||||
For distributed query processing, as many stages of query processing as possible are performed on remote servers, and the rest of the stages (merging intermediate results and everything after that) are performed on the requestor server.
|
|
||||||
|
|
||||||
This means that functions can be performed on different servers.
|
|
||||||
For example, in the query `SELECT f(sum(g(x))) FROM distributed_table GROUP BY h(y),`
|
|
||||||
|
|
||||||
- if a `distributed_table` has at least two shards, the functions ‘g’ and ‘h’ are performed on remote servers, and the function ‘f’ is performed on the requestor server.
|
|
||||||
- if a `distributed_table` has only one shard, all the ‘f’, ‘g’, and ‘h’ functions are performed on this shard’s server.
|
|
||||||
|
|
||||||
The result of a function usually does not depend on which server it is performed on. However, sometimes this is important.
|
|
||||||
For example, functions that work with dictionaries use the dictionary that exists on the server they are running on.
|
|
||||||
Another example is the `hostName` function, which returns the name of the server it is running on in order to make `GROUP BY` by servers in a `SELECT` query.
|
|
||||||
|
|
||||||
If a function in a query is performed on the requestor server, but you need to perform it on remote servers, you can wrap it in an ‘any’ aggregate function or add it to a key in `GROUP BY`.
|
|
||||||
|
|
||||||
|
|
||||||
## Related Content
|
|
||||||
|
|
||||||
- [User-defined functions in ClickHouse Cloud](https://clickhouse.com/blog/user-defined-functions-clickhouse-udfs)
|
|
||||||
|
Some files were not shown because too many files have changed in this diff.