Merge branch 'master' into refactoring-ip-types

This commit is contained in:
Yakov Olkhovskiy 2022-11-14 09:21:54 -05:00 committed by GitHub
commit 9aeebf3bdf
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
870 changed files with 23689 additions and 8886 deletions

View File

@ -13,6 +13,8 @@ assignees: ''
> A clear and concise description of what works not as it is supposed to. > A clear and concise description of what works not as it is supposed to.
> A link to reproducer in [https://fiddle.clickhouse.com/](https://fiddle.clickhouse.com/).
**Does it reproduce on recent release?** **Does it reproduce on recent release?**
[The list of releases](https://github.com/ClickHouse/ClickHouse/blob/master/utils/list-versions/version_date.tsv) [The list of releases](https://github.com/ClickHouse/ClickHouse/blob/master/utils/list-versions/version_date.tsv)

View File

@ -6,7 +6,7 @@ env:
on: # yamllint disable-line rule:truthy on: # yamllint disable-line rule:truthy
workflow_run: workflow_run:
workflows: ["PullRequestCI", "ReleaseCI", "DocsCheck", "BackportPR"] workflows: ["PullRequestCI", "ReleaseBranchCI", "DocsCheck", "BackportPR"]
types: types:
- requested - requested
jobs: jobs:

View File

@ -2,7 +2,7 @@
name: Debug name: Debug
'on': 'on':
[push, pull_request, release, workflow_dispatch] [push, pull_request, release, workflow_dispatch, workflow_call]
jobs: jobs:
DebugInfo: DebugInfo:

View File

@ -32,10 +32,41 @@ jobs:
mkdir -p "$TEMP_PATH" mkdir -p "$TEMP_PATH"
cp -r "$GITHUB_WORKSPACE" "$TEMP_PATH" cp -r "$GITHUB_WORKSPACE" "$TEMP_PATH"
cd "$REPO_COPY/tests/ci" cd "$REPO_COPY/tests/ci"
python3 keeper_jepsen_check.py python3 jepsen_check.py keeper
- name: Cleanup - name: Cleanup
if: always() if: always()
run: | run: |
docker ps --quiet | xargs --no-run-if-empty docker kill ||: docker ps --quiet | xargs --no-run-if-empty docker kill ||:
docker ps --all --quiet | xargs --no-run-if-empty docker rm -f ||: docker ps --all --quiet | xargs --no-run-if-empty docker rm -f ||:
sudo rm -fr "$TEMP_PATH" sudo rm -fr "$TEMP_PATH"
# ServerJepsenRelease:
# runs-on: [self-hosted, style-checker]
# if: ${{ always() }}
# needs: [KeeperJepsenRelease]
# steps:
# - name: Set envs
# run: |
# cat >> "$GITHUB_ENV" << 'EOF'
# TEMP_PATH=${{runner.temp}}/server_jepsen
# REPO_COPY=${{runner.temp}}/server_jepsen/ClickHouse
# EOF
# - name: Clear repository
# run: |
# sudo rm -fr "$GITHUB_WORKSPACE" && mkdir "$GITHUB_WORKSPACE"
# - name: Check out repository code
# uses: actions/checkout@v2
# with:
# fetch-depth: 0
# - name: Jepsen Test
# run: |
# sudo rm -fr "$TEMP_PATH"
# mkdir -p "$TEMP_PATH"
# cp -r "$GITHUB_WORKSPACE" "$TEMP_PATH"
# cd "$REPO_COPY/tests/ci"
# python3 jepsen_check.py server
# - name: Cleanup
# if: always()
# run: |
# docker ps --quiet | xargs --no-run-if-empty docker kill ||:
# docker ps --all --quiet | xargs --no-run-if-empty docker rm -f ||:
# sudo rm -fr "$TEMP_PATH"

View File

@ -1056,6 +1056,23 @@ jobs:
docker ps --quiet | xargs --no-run-if-empty docker kill ||: docker ps --quiet | xargs --no-run-if-empty docker kill ||:
docker ps --all --quiet | xargs --no-run-if-empty docker rm -f ||: docker ps --all --quiet | xargs --no-run-if-empty docker rm -f ||:
sudo rm -fr "$TEMP_PATH" sudo rm -fr "$TEMP_PATH"
MarkReleaseReady:
needs:
- BuilderBinDarwin
- BuilderBinDarwinAarch64
- BuilderDebRelease
- BuilderDebAarch64
runs-on: [self-hosted, style-checker]
steps:
- name: Clear repository
run: |
sudo rm -fr "$GITHUB_WORKSPACE" && mkdir "$GITHUB_WORKSPACE"
- name: Check out repository code
uses: actions/checkout@v2
- name: Mark Commit Release Ready
run: |
cd "$GITHUB_WORKSPACE/tests/ci"
python3 mark_release_ready.py
############################################################################################## ##############################################################################################
########################### FUNCTIONAl STATELESS TESTS ####################################### ########################### FUNCTIONAl STATELESS TESTS #######################################
############################################################################################## ##############################################################################################
@ -2994,10 +3011,83 @@ jobs:
docker ps --quiet | xargs --no-run-if-empty docker kill ||: docker ps --quiet | xargs --no-run-if-empty docker kill ||:
docker ps --all --quiet | xargs --no-run-if-empty docker rm -f ||: docker ps --all --quiet | xargs --no-run-if-empty docker rm -f ||:
sudo rm -fr "$TEMP_PATH" sudo rm -fr "$TEMP_PATH"
##############################################################################################
###################################### SQLANCER FUZZERS ######################################
##############################################################################################
SQLancerTestRelease:
needs: [BuilderDebRelease]
runs-on: [self-hosted, fuzzer-unit-tester]
steps:
- name: Set envs
run: |
cat >> "$GITHUB_ENV" << 'EOF'
TEMP_PATH=${{runner.temp}}/sqlancer_release
REPORTS_PATH=${{runner.temp}}/reports_dir
CHECK_NAME=SQLancer (release)
REPO_COPY=${{runner.temp}}/sqlancer_release/ClickHouse
EOF
- name: Download json reports
uses: actions/download-artifact@v2
with:
path: ${{ env.REPORTS_PATH }}
- name: Clear repository
run: |
sudo rm -fr "$GITHUB_WORKSPACE" && mkdir "$GITHUB_WORKSPACE"
- name: Check out repository code
uses: actions/checkout@v2
- name: SQLancer
run: |
sudo rm -fr "$TEMP_PATH"
mkdir -p "$TEMP_PATH"
cp -r "$GITHUB_WORKSPACE" "$TEMP_PATH"
cd "$REPO_COPY/tests/ci"
python3 sqlancer_check.py "$CHECK_NAME"
- name: Cleanup
if: always()
run: |
docker ps --quiet | xargs --no-run-if-empty docker kill ||:
docker ps --all --quiet | xargs --no-run-if-empty docker rm -f ||:
sudo rm -fr "$TEMP_PATH"
SQLancerTestDebug:
needs: [BuilderDebDebug]
runs-on: [self-hosted, fuzzer-unit-tester]
steps:
- name: Set envs
run: |
cat >> "$GITHUB_ENV" << 'EOF'
TEMP_PATH=${{runner.temp}}/sqlancer_debug
REPORTS_PATH=${{runner.temp}}/reports_dir
CHECK_NAME=SQLancer (debug)
REPO_COPY=${{runner.temp}}/sqlancer_debug/ClickHouse
EOF
- name: Download json reports
uses: actions/download-artifact@v2
with:
path: ${{ env.REPORTS_PATH }}
- name: Clear repository
run: |
sudo rm -fr "$GITHUB_WORKSPACE" && mkdir "$GITHUB_WORKSPACE"
- name: Check out repository code
uses: actions/checkout@v2
- name: SQLancer
run: |
sudo rm -fr "$TEMP_PATH"
mkdir -p "$TEMP_PATH"
cp -r "$GITHUB_WORKSPACE" "$TEMP_PATH"
cd "$REPO_COPY/tests/ci"
python3 sqlancer_check.py "$CHECK_NAME"
- name: Cleanup
if: always()
run: |
docker ps --quiet | xargs --no-run-if-empty docker kill ||:
docker ps --all --quiet | xargs --no-run-if-empty docker rm -f ||:
sudo rm -fr "$TEMP_PATH"
FinishCheck: FinishCheck:
needs: needs:
- DockerHubPush - DockerHubPush
- BuilderReport - BuilderReport
- BuilderSpecialReport
- MarkReleaseReady
- FunctionalStatelessTestDebug0 - FunctionalStatelessTestDebug0
- FunctionalStatelessTestDebug1 - FunctionalStatelessTestDebug1
- FunctionalStatelessTestDebug2 - FunctionalStatelessTestDebug2
@ -3053,6 +3143,8 @@ jobs:
- UnitTestsUBsan - UnitTestsUBsan
- UnitTestsReleaseClang - UnitTestsReleaseClang
- SharedBuildSmokeTest - SharedBuildSmokeTest
- SQLancerTestRelease
- SQLancerTestDebug
runs-on: [self-hosted, style-checker] runs-on: [self-hosted, style-checker]
steps: steps:
- name: Clear repository - name: Clear repository

View File

@ -10,6 +10,9 @@ env:
workflow_dispatch: workflow_dispatch:
jobs: jobs:
Debug:
# The task for having a preserved ENV and event.json for later investigation
uses: ./.github/workflows/debug.yml
DockerHubPushAarch64: DockerHubPushAarch64:
runs-on: [self-hosted, style-checker-aarch64] runs-on: [self-hosted, style-checker-aarch64]
steps: steps:
@ -122,3 +125,58 @@ jobs:
docker ps --quiet | xargs --no-run-if-empty docker kill ||: docker ps --quiet | xargs --no-run-if-empty docker kill ||:
docker ps --all --quiet | xargs --no-run-if-empty docker rm -f ||: docker ps --all --quiet | xargs --no-run-if-empty docker rm -f ||:
sudo rm -fr "$TEMP_PATH" "$CACHES_PATH" sudo rm -fr "$TEMP_PATH" "$CACHES_PATH"
SonarCloud:
runs-on: [self-hosted, builder]
env:
SONAR_SCANNER_VERSION: 4.7.0.2747
SONAR_SERVER_URL: "https://sonarcloud.io"
BUILD_WRAPPER_OUT_DIR: build_wrapper_output_directory # Directory where build-wrapper output will be placed
CC: clang-15
CXX: clang++-15
steps:
- uses: actions/checkout@v2
with:
fetch-depth: 0 # Shallow clones should be disabled for a better relevancy of analysis
submodules: true
- name: Set up JDK 11
uses: actions/setup-java@v1
with:
java-version: 11
- name: Download and set up sonar-scanner
env:
SONAR_SCANNER_DOWNLOAD_URL: https://binaries.sonarsource.com/Distribution/sonar-scanner-cli/sonar-scanner-cli-${{ env.SONAR_SCANNER_VERSION }}-linux.zip
run: |
mkdir -p "$HOME/.sonar"
curl -sSLo "$HOME/.sonar/sonar-scanner.zip" "${{ env.SONAR_SCANNER_DOWNLOAD_URL }}"
unzip -o "$HOME/.sonar/sonar-scanner.zip" -d "$HOME/.sonar/"
echo "$HOME/.sonar/sonar-scanner-${{ env.SONAR_SCANNER_VERSION }}-linux/bin" >> "$GITHUB_PATH"
- name: Download and set up build-wrapper
env:
BUILD_WRAPPER_DOWNLOAD_URL: ${{ env.SONAR_SERVER_URL }}/static/cpp/build-wrapper-linux-x86.zip
run: |
curl -sSLo "$HOME/.sonar/build-wrapper-linux-x86.zip" "${{ env.BUILD_WRAPPER_DOWNLOAD_URL }}"
unzip -o "$HOME/.sonar/build-wrapper-linux-x86.zip" -d "$HOME/.sonar/"
echo "$HOME/.sonar/build-wrapper-linux-x86" >> "$GITHUB_PATH"
- name: Set Up Build Tools
run: |
sudo apt-get update
sudo apt-get install -yq git cmake ccache python3 ninja-build
sudo bash -c "$(wget -O - https://apt.llvm.org/llvm.sh)"
- name: Run build-wrapper
run: |
mkdir build
cd build
cmake ..
cd ..
build-wrapper-linux-x86-64 --out-dir ${{ env.BUILD_WRAPPER_OUT_DIR }} cmake --build build/
- name: Run sonar-scanner
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
SONAR_TOKEN: ${{ secrets.SONAR_TOKEN }}
run: |
sonar-scanner \
--define sonar.host.url="${{ env.SONAR_SERVER_URL }}" \
--define sonar.cfamily.build-wrapper-output="${{ env.BUILD_WRAPPER_OUT_DIR }}" \
--define sonar.projectKey="ClickHouse_ClickHouse" \
--define sonar.organization="clickhouse-java" \
--define sonar.exclusions="**/*.java,**/*.ts,**/*.js,**/*.css,**/*.sql"

View File

@ -2023,6 +2023,7 @@ jobs:
docker ps --all --quiet | xargs --no-run-if-empty docker rm -f ||: docker ps --all --quiet | xargs --no-run-if-empty docker rm -f ||:
sudo rm -fr "$TEMP_PATH" sudo rm -fr "$TEMP_PATH"
TestsBugfixCheck: TestsBugfixCheck:
needs: [CheckLabels, StyleCheck]
runs-on: [self-hosted, stress-tester] runs-on: [self-hosted, stress-tester]
steps: steps:
- name: Set envs - name: Set envs
@ -3490,6 +3491,77 @@ jobs:
docker ps --quiet | xargs --no-run-if-empty docker kill ||: docker ps --quiet | xargs --no-run-if-empty docker kill ||:
docker ps --all --quiet | xargs --no-run-if-empty docker rm -f ||: docker ps --all --quiet | xargs --no-run-if-empty docker rm -f ||:
sudo rm -fr "$TEMP_PATH" sudo rm -fr "$TEMP_PATH"
##############################################################################################
###################################### SQLANCER FUZZERS ######################################
##############################################################################################
SQLancerTestRelease:
needs: [BuilderDebRelease]
runs-on: [self-hosted, fuzzer-unit-tester]
steps:
- name: Set envs
run: |
cat >> "$GITHUB_ENV" << 'EOF'
TEMP_PATH=${{runner.temp}}/sqlancer_release
REPORTS_PATH=${{runner.temp}}/reports_dir
CHECK_NAME=SQLancer (release)
REPO_COPY=${{runner.temp}}/sqlancer_release/ClickHouse
EOF
- name: Download json reports
uses: actions/download-artifact@v2
with:
path: ${{ env.REPORTS_PATH }}
- name: Clear repository
run: |
sudo rm -fr "$GITHUB_WORKSPACE" && mkdir "$GITHUB_WORKSPACE"
- name: Check out repository code
uses: actions/checkout@v2
- name: SQLancer
run: |
sudo rm -fr "$TEMP_PATH"
mkdir -p "$TEMP_PATH"
cp -r "$GITHUB_WORKSPACE" "$TEMP_PATH"
cd "$REPO_COPY/tests/ci"
python3 sqlancer_check.py "$CHECK_NAME"
- name: Cleanup
if: always()
run: |
docker ps --quiet | xargs --no-run-if-empty docker kill ||:
docker ps --all --quiet | xargs --no-run-if-empty docker rm -f ||:
sudo rm -fr "$TEMP_PATH"
SQLancerTestDebug:
needs: [BuilderDebDebug]
runs-on: [self-hosted, fuzzer-unit-tester]
steps:
- name: Set envs
run: |
cat >> "$GITHUB_ENV" << 'EOF'
TEMP_PATH=${{runner.temp}}/sqlancer_debug
REPORTS_PATH=${{runner.temp}}/reports_dir
CHECK_NAME=SQLancer (debug)
REPO_COPY=${{runner.temp}}/sqlancer_debug/ClickHouse
EOF
- name: Download json reports
uses: actions/download-artifact@v2
with:
path: ${{ env.REPORTS_PATH }}
- name: Clear repository
run: |
sudo rm -fr "$GITHUB_WORKSPACE" && mkdir "$GITHUB_WORKSPACE"
- name: Check out repository code
uses: actions/checkout@v2
- name: SQLancer
run: |
sudo rm -fr "$TEMP_PATH"
mkdir -p "$TEMP_PATH"
cp -r "$GITHUB_WORKSPACE" "$TEMP_PATH"
cd "$REPO_COPY/tests/ci"
python3 sqlancer_check.py "$CHECK_NAME"
- name: Cleanup
if: always()
run: |
docker ps --quiet | xargs --no-run-if-empty docker kill ||:
docker ps --all --quiet | xargs --no-run-if-empty docker rm -f ||:
sudo rm -fr "$TEMP_PATH"
############################################################################################# #############################################################################################
###################################### JEPSEN TESTS ######################################### ###################################### JEPSEN TESTS #########################################
############################################################################################# #############################################################################################
@ -3500,7 +3572,6 @@ jobs:
if: contains(github.event.pull_request.labels.*.name, 'jepsen-test') if: contains(github.event.pull_request.labels.*.name, 'jepsen-test')
needs: [BuilderBinRelease] needs: [BuilderBinRelease]
uses: ./.github/workflows/jepsen.yml uses: ./.github/workflows/jepsen.yml
FinishCheck: FinishCheck:
needs: needs:
- StyleCheck - StyleCheck
@ -3508,6 +3579,7 @@ jobs:
- DockerServerImages - DockerServerImages
- CheckLabels - CheckLabels
- BuilderReport - BuilderReport
- BuilderSpecialReport
- FastTest - FastTest
- FunctionalStatelessTestDebug0 - FunctionalStatelessTestDebug0
- FunctionalStatelessTestDebug1 - FunctionalStatelessTestDebug1
@ -3575,6 +3647,8 @@ jobs:
- SharedBuildSmokeTest - SharedBuildSmokeTest
- CompatibilityCheck - CompatibilityCheck
- IntegrationTestsFlakyCheck - IntegrationTestsFlakyCheck
- SQLancerTestRelease
- SQLancerTestDebug
runs-on: [self-hosted, style-checker] runs-on: [self-hosted, style-checker]
steps: steps:
- name: Clear repository - name: Clear repository

View File

@ -615,6 +615,23 @@ jobs:
docker ps --quiet | xargs --no-run-if-empty docker kill ||: docker ps --quiet | xargs --no-run-if-empty docker kill ||:
docker ps --all --quiet | xargs --no-run-if-empty docker rm -f ||: docker ps --all --quiet | xargs --no-run-if-empty docker rm -f ||:
sudo rm -fr "$TEMP_PATH" sudo rm -fr "$TEMP_PATH"
MarkReleaseReady:
needs:
- BuilderBinDarwin
- BuilderBinDarwinAarch64
- BuilderDebRelease
- BuilderDebAarch64
runs-on: [self-hosted, style-checker]
steps:
- name: Clear repository
run: |
sudo rm -fr "$GITHUB_WORKSPACE" && mkdir "$GITHUB_WORKSPACE"
- name: Check out repository code
uses: actions/checkout@v2
- name: Mark Commit Release Ready
run: |
cd "$GITHUB_WORKSPACE/tests/ci"
python3 mark_release_ready.py
############################################################################################## ##############################################################################################
########################### FUNCTIONAl STATELESS TESTS ####################################### ########################### FUNCTIONAl STATELESS TESTS #######################################
############################################################################################## ##############################################################################################
@ -1888,6 +1905,7 @@ jobs:
- DockerServerImages - DockerServerImages
- BuilderReport - BuilderReport
- BuilderSpecialReport - BuilderSpecialReport
- MarkReleaseReady
- FunctionalStatelessTestDebug0 - FunctionalStatelessTestDebug0
- FunctionalStatelessTestDebug1 - FunctionalStatelessTestDebug1
- FunctionalStatelessTestDebug2 - FunctionalStatelessTestDebug2

3
.gitignore vendored
View File

@ -154,3 +154,6 @@ website/package-lock.json
/programs/server/metadata /programs/server/metadata
/programs/server/store /programs/server/store
# temporary test files
tests/queries/0_stateless/test_*
tests/queries/0_stateless/*.binary

4
.snyk Normal file
View File

@ -0,0 +1,4 @@
# Snyk (https://snyk.io) policy file
exclude:
global:
- tests/**

View File

@ -202,7 +202,7 @@ option(ADD_GDB_INDEX_FOR_GOLD "Add .gdb-index to resulting binaries for gold lin
if (NOT CMAKE_BUILD_TYPE_UC STREQUAL "RELEASE") if (NOT CMAKE_BUILD_TYPE_UC STREQUAL "RELEASE")
# Can be lld or ld-lld or lld-13 or /path/to/lld. # Can be lld or ld-lld or lld-13 or /path/to/lld.
if (LINKER_NAME MATCHES "lld") if (LINKER_NAME MATCHES "lld" AND OS_LINUX)
set (CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} -Wl,--gdb-index") set (CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} -Wl,--gdb-index")
set (CMAKE_SHARED_LINKER_FLAGS "${CMAKE_SHARED_LINKER_FLAGS} -Wl,--gdb-index") set (CMAKE_SHARED_LINKER_FLAGS "${CMAKE_SHARED_LINKER_FLAGS} -Wl,--gdb-index")
message (STATUS "Adding .gdb-index via --gdb-index linker option.") message (STATUS "Adding .gdb-index via --gdb-index linker option.")
@ -248,7 +248,7 @@ endif ()
# Create BuildID when using lld. For other linkers it is created by default. # Create BuildID when using lld. For other linkers it is created by default.
# (NOTE: LINKER_NAME can be either path or name, and in different variants) # (NOTE: LINKER_NAME can be either path or name, and in different variants)
if (LINKER_NAME MATCHES "lld") if (LINKER_NAME MATCHES "lld" AND OS_LINUX)
# SHA1 is not cryptographically secure but it is the best what lld is offering. # SHA1 is not cryptographically secure but it is the best what lld is offering.
set (CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} -Wl,--build-id=sha1") set (CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} -Wl,--build-id=sha1")
endif () endif ()

View File

@ -5,7 +5,7 @@ ClickHouse® is an open-source column-oriented database management system that a
## Useful Links ## Useful Links
* [Official website](https://clickhouse.com/) has a quick high-level overview of ClickHouse on the main page. * [Official website](https://clickhouse.com/) has a quick high-level overview of ClickHouse on the main page.
* [ClickHouse Cloud](https://clickhouse.com/cloud) ClickHouse as a service, built by the creators and maintainers. * [ClickHouse Cloud](https://clickhouse.cloud) ClickHouse as a service, built by the creators and maintainers.
* [Tutorial](https://clickhouse.com/docs/en/getting_started/tutorial/) shows how to set up and query a small ClickHouse cluster. * [Tutorial](https://clickhouse.com/docs/en/getting_started/tutorial/) shows how to set up and query a small ClickHouse cluster.
* [Documentation](https://clickhouse.com/docs/en/) provides more in-depth information. * [Documentation](https://clickhouse.com/docs/en/) provides more in-depth information.
* [YouTube channel](https://www.youtube.com/c/ClickHouseDB) has a lot of content about ClickHouse in video format. * [YouTube channel](https://www.youtube.com/c/ClickHouseDB) has a lot of content about ClickHouse in video format.
@ -16,5 +16,6 @@ ClickHouse® is an open-source column-oriented database management system that a
* [Contacts](https://clickhouse.com/company/contact) can help to get your questions answered if there are any. * [Contacts](https://clickhouse.com/company/contact) can help to get your questions answered if there are any.
## Upcoming events ## Upcoming events
* [**v22.10 Release Webinar**](https://clickhouse.com/company/events/v22-10-release-webinar) Original creator, co-founder, and CTO of ClickHouse Alexey Milovidov will walk us through the highlights of the release, provide live demos, and share vision into what is coming in the roadmap. * [**v22.11 Release Webinar**](https://clickhouse.com/company/events/v22-11-release-webinar) Original creator, co-founder, and CTO of ClickHouse Alexey Milovidov will walk us through the highlights of the release, provide live demos, and share vision into what is coming in the roadmap.
* [**Introducing ClickHouse Cloud**](https://clickhouse.com/company/events/cloud-beta) Introducing ClickHouse as a service, built by creators and maintainers of the fastest OLAP database on earth. Join Tanya Bragin for a detailed walkthrough of ClickHouse Cloud capabilities, as well as a peek behind the curtain to understand the unique architecture that makes our service tick. * [**ClickHouse Meetup at the Deutsche Bank office in Berlin**](https://www.meetup.com/clickhouse-berlin-user-group/events/289311596/) Hear from Deutsche Bank on why they chose ClickHouse for big sensitive data in a regulated environment. The ClickHouse team will then present how ClickHouse is used for real time financial data analytics, including tick data, trade analytics and risk management.
* [**AWS re:Invent**](https://clickhouse.com/company/events/aws-reinvent) Core members of the ClickHouse team -- including 2 of our founders -- will be at re:Invent from November 29 to December 3. We are available on the show floor, but are also determining interest in holding an event during the time there.

View File

@ -1,8 +1,10 @@
#if defined(OS_LINUX) #if defined(OS_LINUX)
# include <sys/syscall.h> # include <sys/syscall.h>
#endif #endif
#include <cstdlib>
#include <unistd.h> #include <unistd.h>
#include <base/safeExit.h> #include <base/safeExit.h>
#include <base/defines.h> /// for THREAD_SANITIZER
[[noreturn]] void safeExit(int code) [[noreturn]] void safeExit(int code)
{ {

View File

@ -8,6 +8,14 @@
#include <link.h> // ElfW #include <link.h> // ElfW
#include <errno.h> #include <errno.h>
#include "syscall.h"
#if defined(__has_feature)
#if __has_feature(memory_sanitizer)
#include <sanitizer/msan_interface.h>
#endif
#endif
#define ARRAY_SIZE(a) sizeof((a))/sizeof((a[0])) #define ARRAY_SIZE(a) sizeof((a))/sizeof((a[0]))
/// Suppress TSan since it is possible for this code to be called from multiple threads, /// Suppress TSan since it is possible for this code to be called from multiple threads,
@ -39,7 +47,9 @@ ssize_t __retry_read(int fd, void * buf, size_t count)
{ {
for (;;) for (;;)
{ {
ssize_t ret = read(fd, buf, count); // We cannot use the read syscall as it will be intercept by sanitizers, which aren't
// initialized yet. Emit syscall directly.
ssize_t ret = __syscall_ret(__syscall(SYS_read, fd, buf, count));
if (ret == -1) if (ret == -1)
{ {
if (errno == EINTR) if (errno == EINTR)
@ -90,6 +100,11 @@ static unsigned long NO_SANITIZE_THREAD __auxv_init_procfs(unsigned long type)
_Static_assert(sizeof(aux) < 4096, "Unexpected sizeof(aux)"); _Static_assert(sizeof(aux) < 4096, "Unexpected sizeof(aux)");
while (__retry_read(fd, &aux, sizeof(aux)) == sizeof(aux)) while (__retry_read(fd, &aux, sizeof(aux)) == sizeof(aux))
{ {
#if defined(__has_feature)
#if __has_feature(memory_sanitizer)
__msan_unpoison(&aux, sizeof(aux));
#endif
#endif
if (aux.a_type == AT_NULL) if (aux.a_type == AT_NULL)
{ {
break; break;

View File

@ -3,7 +3,17 @@ option (ENABLE_CLANG_TIDY "Use clang-tidy static analyzer" OFF)
if (ENABLE_CLANG_TIDY) if (ENABLE_CLANG_TIDY)
find_program (CLANG_TIDY_CACHE_PATH NAMES "clang-tidy-cache")
if (CLANG_TIDY_CACHE_PATH)
find_program (_CLANG_TIDY_PATH NAMES "clang-tidy" "clang-tidy-15" "clang-tidy-14" "clang-tidy-13" "clang-tidy-12")
# Why do we use ';' here?
# It's a cmake black magic: https://cmake.org/cmake/help/latest/prop_tgt/LANG_CLANG_TIDY.html#prop_tgt:%3CLANG%3E_CLANG_TIDY
# The CLANG_TIDY_PATH is passed to CMAKE_CXX_CLANG_TIDY, which follows CXX_CLANG_TIDY syntax.
set (CLANG_TIDY_PATH "${CLANG_TIDY_CACHE_PATH};${_CLANG_TIDY_PATH}" CACHE STRING "A combined command to run clang-tidy with caching wrapper")
else ()
find_program (CLANG_TIDY_PATH NAMES "clang-tidy" "clang-tidy-15" "clang-tidy-14" "clang-tidy-13" "clang-tidy-12") find_program (CLANG_TIDY_PATH NAMES "clang-tidy" "clang-tidy-15" "clang-tidy-14" "clang-tidy-13" "clang-tidy-12")
endif ()
if (CLANG_TIDY_PATH) if (CLANG_TIDY_PATH)
message (STATUS message (STATUS
@ -17,7 +27,11 @@ if (ENABLE_CLANG_TIDY)
# Note that NDEBUG is set implicitly by CMake for non-debug builds # Note that NDEBUG is set implicitly by CMake for non-debug builds
set (COMPILER_FLAGS "${COMPILER_FLAGS} -UNDEBUG") set (COMPILER_FLAGS "${COMPILER_FLAGS} -UNDEBUG")
# The variable CMAKE_CXX_CLANG_TIDY will be set inside src and base directories with non third-party code. # The variable CMAKE_CXX_CLANG_TIDY will be set inside the following directories with non third-party code.
# - base
# - programs
# - src
# - utils
# set (CMAKE_CXX_CLANG_TIDY "${CLANG_TIDY_PATH}") # set (CMAKE_CXX_CLANG_TIDY "${CLANG_TIDY_PATH}")
else () else ()
message (${RECONFIGURE_MESSAGE_LEVEL} "clang-tidy is not found") message (${RECONFIGURE_MESSAGE_LEVEL} "clang-tidy is not found")

View File

@ -16,7 +16,9 @@ endmacro()
if (SANITIZE) if (SANITIZE)
if (SANITIZE STREQUAL "address") if (SANITIZE STREQUAL "address")
set (ASAN_FLAGS "-fsanitize=address -fsanitize-address-use-after-scope") # LLVM-15 has a bug in Address Sanitizer, preventing the usage of 'sanitize-address-use-after-scope',
# see https://github.com/llvm/llvm-project/issues/58633
set (ASAN_FLAGS "-fsanitize=address -fno-sanitize-address-use-after-scope")
set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${SAN_FLAGS} ${ASAN_FLAGS}") set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${SAN_FLAGS} ${ASAN_FLAGS}")
set (CMAKE_C_FLAGS "${CMAKE_C_FLAGS} ${SAN_FLAGS} ${ASAN_FLAGS}") set (CMAKE_C_FLAGS "${CMAKE_C_FLAGS} ${SAN_FLAGS} ${ASAN_FLAGS}")

View File

@ -58,13 +58,19 @@ if (NOT LINKER_NAME)
find_program (LLD_PATH NAMES "ld.lld") find_program (LLD_PATH NAMES "ld.lld")
find_program (GOLD_PATH NAMES "ld.gold") find_program (GOLD_PATH NAMES "ld.gold")
elseif (COMPILER_CLANG) elseif (COMPILER_CLANG)
find_program (LLD_PATH NAMES "ld.lld-${COMPILER_VERSION_MAJOR}" "lld-${COMPILER_VERSION_MAJOR}" "ld.lld" "lld") # llvm lld is a generic driver.
# Invoke ld.lld (Unix), ld64.lld (macOS), lld-link (Windows), wasm-ld (WebAssembly) instead
if (OS_LINUX)
find_program (LLD_PATH NAMES "ld.lld-${COMPILER_VERSION_MAJOR}" "ld.lld")
elseif (OS_DARWIN)
find_program (LLD_PATH NAMES "ld64.lld-${COMPILER_VERSION_MAJOR}" "ld64.lld")
endif ()
find_program (GOLD_PATH NAMES "ld.gold" "gold") find_program (GOLD_PATH NAMES "ld.gold" "gold")
endif () endif ()
endif() endif()
if (OS_LINUX AND NOT LINKER_NAME) if ((OS_LINUX OR OS_DARWIN) AND NOT LINKER_NAME)
# prefer lld linker over gold or ld on linux # prefer lld linker over gold or ld on linux and macos
if (LLD_PATH) if (LLD_PATH)
if (COMPILER_GCC) if (COMPILER_GCC)
# GCC driver requires one of supported linker names like "lld". # GCC driver requires one of supported linker names like "lld".

2
contrib/NuRaft vendored

@ -1 +1 @@
Subproject commit 1be805e7cb2494aa8170015493474379b0362dfc Subproject commit e4e746a24eb56861a86f3672771e3308d8c40722

2
contrib/cctz vendored

@ -1 +1 @@
Subproject commit 7a454c25c7d16053bcd327cdd16329212a08fa4a Subproject commit 5c8528fb35e89ee0b3a7157490423fba0d4dd7b5

View File

@ -21,6 +21,9 @@ set (LLVM_INCLUDE_DIRS
"${ClickHouse_BINARY_DIR}/contrib/llvm-project/llvm/include" "${ClickHouse_BINARY_DIR}/contrib/llvm-project/llvm/include"
) )
set (LLVM_LIBRARY_DIRS "${ClickHouse_BINARY_DIR}/contrib/llvm-project/llvm") set (LLVM_LIBRARY_DIRS "${ClickHouse_BINARY_DIR}/contrib/llvm-project/llvm")
# NOTE: You should not remove this line since otherwise it will use default 20,
# and llvm cannot be compiled with bundled libcxx and 20 standard.
set (CMAKE_CXX_STANDARD 14)
# This list was generated by listing all LLVM libraries, compiling the binary and removing all libraries while it still compiles. # This list was generated by listing all LLVM libraries, compiling the binary and removing all libraries while it still compiles.
set (REQUIRED_LLVM_LIBRARIES set (REQUIRED_LLVM_LIBRARIES

View File

@ -91,6 +91,9 @@ ENV PATH="$PATH:/usr/local/go/bin"
ENV GOPATH=/workdir/go ENV GOPATH=/workdir/go
ENV GOCACHE=/workdir/ ENV GOCACHE=/workdir/
RUN curl https://raw.githubusercontent.com/matus-chochlik/ctcache/7fd516e91c17779cbc6fc18bd119313d9532dd90/clang-tidy-cache -Lo /usr/bin/clang-tidy-cache \
&& chmod +x /usr/bin/clang-tidy-cache
RUN mkdir /workdir && chmod 777 /workdir RUN mkdir /workdir && chmod 777 /workdir
WORKDIR /workdir WORKDIR /workdir

View File

@ -258,6 +258,10 @@ def parse_env_variables(
if clang_tidy: if clang_tidy:
# 15G is not enough for tidy build # 15G is not enough for tidy build
cache_maxsize = "25G" cache_maxsize = "25G"
# `CTCACHE_DIR` has the same purpose as the `CCACHE_DIR` above.
# It's there to have the clang-tidy cache embedded into our standard `CCACHE_DIR`
result.append("CTCACHE_DIR=/ccache/clang-tidy-cache")
result.append(f"CCACHE_MAXSIZE={cache_maxsize}") result.append(f"CCACHE_MAXSIZE={cache_maxsize}")
if distcc_hosts: if distcc_hosts:
@ -282,9 +286,7 @@ def parse_env_variables(
cmake_flags.append("-DENABLE_TESTS=1") cmake_flags.append("-DENABLE_TESTS=1")
if shared_libraries: if shared_libraries:
cmake_flags.append( cmake_flags.append("-DUSE_STATIC_LIBRARIES=0 -DSPLIT_SHARED_LIBRARIES=1")
"-DUSE_STATIC_LIBRARIES=0 -DSPLIT_SHARED_LIBRARIES=1"
)
# We can't always build utils because it requires too much space, but # We can't always build utils because it requires too much space, but
# we have to build them at least in some way in CI. The shared library # we have to build them at least in some way in CI. The shared library
# build is probably the least heavy disk-wise. # build is probably the least heavy disk-wise.

View File

@ -33,7 +33,7 @@ RUN arch=${TARGETARCH:-amd64} \
# lts / testing / prestable / etc # lts / testing / prestable / etc
ARG REPO_CHANNEL="stable" ARG REPO_CHANNEL="stable"
ARG REPOSITORY="https://packages.clickhouse.com/tgz/${REPO_CHANNEL}" ARG REPOSITORY="https://packages.clickhouse.com/tgz/${REPO_CHANNEL}"
ARG VERSION="22.10.1.1877" ARG VERSION="22.10.2.11"
ARG PACKAGES="clickhouse-client clickhouse-server clickhouse-common-static" ARG PACKAGES="clickhouse-client clickhouse-server clickhouse-common-static"
# user/group precreated explicitly with fixed uid/gid on purpose. # user/group precreated explicitly with fixed uid/gid on purpose.

View File

@ -21,7 +21,7 @@ RUN sed -i "s|http://archive.ubuntu.com|${apt_archive}|g" /etc/apt/sources.list
ARG REPO_CHANNEL="stable" ARG REPO_CHANNEL="stable"
ARG REPOSITORY="deb https://packages.clickhouse.com/deb ${REPO_CHANNEL} main" ARG REPOSITORY="deb https://packages.clickhouse.com/deb ${REPO_CHANNEL} main"
ARG VERSION="22.10.1.1877" ARG VERSION="22.10.2.11"
ARG PACKAGES="clickhouse-client clickhouse-server clickhouse-common-static" ARG PACKAGES="clickhouse-client clickhouse-server clickhouse-common-static"
# set non-empty deb_location_url url to create a docker image # set non-empty deb_location_url url to create a docker image
@ -80,6 +80,16 @@ RUN arch=${TARGETARCH:-amd64} \
&& mkdir -p /var/lib/clickhouse /var/log/clickhouse-server /etc/clickhouse-server /etc/clickhouse-client \ && mkdir -p /var/lib/clickhouse /var/log/clickhouse-server /etc/clickhouse-server /etc/clickhouse-client \
&& chmod ugo+Xrw -R /var/lib/clickhouse /var/log/clickhouse-server /etc/clickhouse-server /etc/clickhouse-client && chmod ugo+Xrw -R /var/lib/clickhouse /var/log/clickhouse-server /etc/clickhouse-server /etc/clickhouse-client
# Remove as much of Ubuntu as possible.
# ClickHouse does not need Ubuntu. It can run on top of Linux kernel without any OS distribution.
# ClickHouse does not need Docker at all. ClickHouse is above all that.
# It does not care about Ubuntu, Docker, or other cruft and you should neither.
# The fact that this Docker image is based on Ubuntu is just a misconception.
# Some vulnerability scanners are arguing about Ubuntu, which is not relevant to ClickHouse at all.
# ClickHouse does not care when you report false vulnerabilities by running some Docker scanners.
RUN apt-get remove --purge -y libksba8 && apt-get autoremove -y
# we need to allow "others" access to clickhouse folder, because docker container # we need to allow "others" access to clickhouse folder, because docker container
# can be started with arbitrary uid (openshift usecase) # can be started with arbitrary uid (openshift usecase)

View File

@ -178,7 +178,7 @@ function fuzz
# interferes with gdb # interferes with gdb
export CLICKHOUSE_WATCHDOG_ENABLE=0 export CLICKHOUSE_WATCHDOG_ENABLE=0
# NOTE: we use process substitution here to preserve keep $! as a pid of clickhouse-server # NOTE: we use process substitution here to preserve keep $! as a pid of clickhouse-server
clickhouse-server --config-file db/config.xml --pid-file /var/run/clickhouse-server/clickhouse-server.pid -- --path db > >(tail -100000 > server.log) 2>&1 & clickhouse-server --config-file db/config.xml --pid-file /var/run/clickhouse-server/clickhouse-server.pid -- --path db 2>&1 | pigz > server.log.gz &
server_pid=$! server_pid=$!
kill -0 $server_pid kill -0 $server_pid
@ -297,7 +297,7 @@ quit
# The server has died. # The server has died.
task_exit_code=210 task_exit_code=210
echo "failure" > status.txt echo "failure" > status.txt
if ! grep --text -ao "Received signal.*\|Logical error.*\|Assertion.*failed\|Failed assertion.*\|.*runtime error: .*\|.*is located.*\|SUMMARY: AddressSanitizer:.*\|SUMMARY: MemorySanitizer:.*\|SUMMARY: ThreadSanitizer:.*\|.*_LIBCPP_ASSERT.*" server.log > description.txt if ! zgrep --text -ao "Received signal.*\|Logical error.*\|Assertion.*failed\|Failed assertion.*\|.*runtime error: .*\|.*is located.*\|SUMMARY: AddressSanitizer:.*\|SUMMARY: MemorySanitizer:.*\|SUMMARY: ThreadSanitizer:.*\|.*_LIBCPP_ASSERT.*" server.log.gz > description.txt
then then
echo "Lost connection to server. See the logs." > description.txt echo "Lost connection to server. See the logs." > description.txt
fi fi
@ -391,8 +391,9 @@ th { cursor: pointer; }
<h1>AST Fuzzer for PR #${PR_TO_TEST} @ ${SHA_TO_TEST}</h1> <h1>AST Fuzzer for PR #${PR_TO_TEST} @ ${SHA_TO_TEST}</h1>
<p class="links"> <p class="links">
<a href="runlog.log">runlog.log</a>
<a href="fuzzer.log">fuzzer.log</a> <a href="fuzzer.log">fuzzer.log</a>
<a href="server.log">server.log</a> <a href="server.log.gz">server.log.gz</a>
<a href="main.log">main.log</a> <a href="main.log">main.log</a>
${CORE_LINK} ${CORE_LINK}
</p> </p>

View File

@ -15,8 +15,8 @@ if [ -z "$CLICKHOUSE_REPO_PATH" ]; then
ls -lath ||: ls -lath ||:
fi fi
cd "$CLICKHOUSE_REPO_PATH/tests/jepsen.clickhouse-keeper" cd "$CLICKHOUSE_REPO_PATH/tests/jepsen.clickhouse"
(lein run test-all --nodes-file "$NODES_FILE_PATH" --username "$NODES_USERNAME" --logging-json --password "$NODES_PASSWORD" --time-limit "$TIME_LIMIT" --concurrency 50 -r 50 --snapshot-distance 100 --stale-log-gap 100 --reserved-log-items 10 --lightweight-run --clickhouse-source "$CLICKHOUSE_PACKAGE" -q --test-count "$TESTS_TO_RUN" || true) | tee "$TEST_OUTPUT/jepsen_run_all_tests.log" (lein run keeper test-all --nodes-file "$NODES_FILE_PATH" --username "$NODES_USERNAME" --logging-json --password "$NODES_PASSWORD" --time-limit "$TIME_LIMIT" --concurrency 50 -r 50 --snapshot-distance 100 --stale-log-gap 100 --reserved-log-items 10 --lightweight-run --clickhouse-source "$CLICKHOUSE_PACKAGE" -q --test-count "$TESTS_TO_RUN" || true) | tee "$TEST_OUTPUT/jepsen_run_all_tests.log"
mv store "$TEST_OUTPUT/" mv store "$TEST_OUTPUT/"

View File

@ -0,0 +1,43 @@
# rebuild in #33610
# docker build -t clickhouse/server-jepsen-test .
ARG FROM_TAG=latest
FROM clickhouse/test-base:$FROM_TAG
ENV DEBIAN_FRONTEND=noninteractive
ENV CLOJURE_VERSION=1.10.3.814
# arguments
ENV PR_TO_TEST=""
ENV SHA_TO_TEST=""
ENV NODES_USERNAME="root"
ENV NODES_PASSWORD=""
ENV TESTS_TO_RUN="8"
ENV TIME_LIMIT="30"
ENV KEEPER_NODE=""
# volumes
ENV NODES_FILE_PATH="/nodes.txt"
ENV TEST_OUTPUT="/test_output"
RUN mkdir "/root/.ssh"
RUN touch "/root/.ssh/known_hosts"
# install java
RUN apt-get update && apt-get install default-jre default-jdk libjna-java libjna-jni ssh gnuplot graphviz --yes --no-install-recommends
# install clojure
RUN curl -O "https://download.clojure.org/install/linux-install-${CLOJURE_VERSION}.sh" && \
chmod +x "linux-install-${CLOJURE_VERSION}.sh" && \
bash "./linux-install-${CLOJURE_VERSION}.sh"
# install leiningen
RUN curl -O "https://raw.githubusercontent.com/technomancy/leiningen/stable/bin/lein" && \
chmod +x ./lein && \
mv ./lein /usr/bin
COPY run.sh /
CMD ["/bin/bash", "/run.sh"]

View File

@ -0,0 +1,22 @@
#!/usr/bin/env bash
set -euo pipefail
CLICKHOUSE_PACKAGE=${CLICKHOUSE_PACKAGE:="https://clickhouse-builds.s3.amazonaws.com/$PR_TO_TEST/$SHA_TO_TEST/clickhouse_build_check/clang-15_relwithdebuginfo_none_unsplitted_disable_False_binary/clickhouse"}
CLICKHOUSE_REPO_PATH=${CLICKHOUSE_REPO_PATH:=""}
if [ -z "$CLICKHOUSE_REPO_PATH" ]; then
CLICKHOUSE_REPO_PATH=ch
rm -rf ch ||:
mkdir ch ||:
wget -nv -nd -c "https://clickhouse-test-reports.s3.amazonaws.com/$PR_TO_TEST/$SHA_TO_TEST/repo/clickhouse_no_subs.tar.gz"
tar -C ch --strip-components=1 -xf clickhouse_no_subs.tar.gz
ls -lath ||:
fi
cd "$CLICKHOUSE_REPO_PATH/tests/jepsen.clickhouse"
(lein run server test-all --keeper "$KEEPER_NODE" --nodes-file "$NODES_FILE_PATH" --username "$NODES_USERNAME" --logging-json --password "$NODES_PASSWORD" --time-limit "$TIME_LIMIT" --concurrency 50 -r 50 --clickhouse-source "$CLICKHOUSE_PACKAGE" --test-count "$TESTS_TO_RUN" || true) | tee "$TEST_OUTPUT/jepsen_run_all_tests.log"
mv store "$TEST_OUTPUT/"

View File

@ -1,5 +1,5 @@
# docker build -t clickhouse/sqlancer-test . # docker build -t clickhouse/sqlancer-test .
FROM ubuntu:20.04 FROM ubuntu:22.04
# ARG for quick switch to a given ubuntu mirror # ARG for quick switch to a given ubuntu mirror
ARG apt_archive="http://archive.ubuntu.com" ARG apt_archive="http://archive.ubuntu.com"

View File

@ -11,13 +11,15 @@ def process_result(result_folder):
summary = [] summary = []
paths = [] paths = []
tests = [ tests = [
"TLPWhere", "TLPAggregate",
"TLPDistinct",
"TLPGroupBy", "TLPGroupBy",
"TLPHaving", "TLPHaving",
"TLPWhere",
"TLPWhereGroupBy", "TLPWhereGroupBy",
"TLPDistinct", "NoREC",
"TLPAggregate",
] ]
failed_tests = []
for test in tests: for test in tests:
err_path = "{}/{}.err".format(result_folder, test) err_path = "{}/{}.err".format(result_folder, test)
@ -33,15 +35,11 @@ def process_result(result_folder):
with open(err_path, "r") as f: with open(err_path, "r") as f:
if "AssertionError" in f.read(): if "AssertionError" in f.read():
summary.append((test, "FAIL")) summary.append((test, "FAIL"))
failed_tests.append(test)
status = "failure" status = "failure"
else: else:
summary.append((test, "OK")) summary.append((test, "OK"))
logs_path = "{}/logs.tar.gz".format(result_folder)
if not os.path.exists(logs_path):
logging.info("No logs tar on path %s", logs_path)
else:
paths.append(logs_path)
stdout_path = "{}/stdout.log".format(result_folder) stdout_path = "{}/stdout.log".format(result_folder)
if not os.path.exists(stdout_path): if not os.path.exists(stdout_path):
logging.info("No stdout log on path %s", stdout_path) logging.info("No stdout log on path %s", stdout_path)
@ -53,18 +51,23 @@ def process_result(result_folder):
else: else:
paths.append(stderr_path) paths.append(stderr_path)
description = "SQLancer test run. See report" description = "SQLancer run successfully"
if status == "failure":
description = f"Failed oracles: {failed_tests}"
return status, description, summary, paths return status, description, summary, paths
def write_results(results_file, status_file, results, status): def write_results(
results_file, status_file, description_file, results, status, description
):
with open(results_file, "w") as f: with open(results_file, "w") as f:
out = csv.writer(f, delimiter="\t") out = csv.writer(f, delimiter="\t")
out.writerows(results) out.writerows(results)
with open(status_file, "w") as f: with open(status_file, "w") as f:
out = csv.writer(f, delimiter="\t") f.write(status + "\n")
out.writerow(status) with open(description_file, "w") as f:
f.write(description + "\n")
if __name__ == "__main__": if __name__ == "__main__":
@ -72,13 +75,20 @@ if __name__ == "__main__":
parser = argparse.ArgumentParser( parser = argparse.ArgumentParser(
description="ClickHouse script for parsing results of sqlancer test" description="ClickHouse script for parsing results of sqlancer test"
) )
parser.add_argument("--in-results-dir", default="/test_output/") parser.add_argument("--in-results-dir", default="/workspace/")
parser.add_argument("--out-results-file", default="/test_output/test_results.tsv") parser.add_argument("--out-results-file", default="/workspace/summary.tsv")
parser.add_argument("--out-status-file", default="/test_output/check_status.tsv") parser.add_argument("--out-description-file", default="/workspace/description.txt")
parser.add_argument("--out-status-file", default="/workspace/status.txt")
args = parser.parse_args() args = parser.parse_args()
state, description, test_results, logs = process_result(args.in_results_dir) status, description, summary, logs = process_result(args.in_results_dir)
logging.info("Result parsed") logging.info("Result parsed")
status = (state, description) write_results(
write_results(args.out_results_file, args.out_status_file, test_results, status) args.out_results_file,
args.out_status_file,
args.out_description_file,
summary,
status,
description,
)
logging.info("Result written") logging.info("Result written")

View File

@ -1,33 +1,62 @@
#!/bin/bash #!/bin/bash
set -exu
trap "exit" INT TERM
set -e -x function wget_with_retry
{
for _ in 1 2 3 4; do
if wget -nv -nd -c "$1";then
return 0
else
sleep 0.5
fi
done
return 1
}
dpkg -i package_folder/clickhouse-common-static_*.deb if [ -z ${BINARY_URL_TO_DOWNLOAD+x} ]
dpkg -i package_folder/clickhouse-common-static-dbg_*.deb then
dpkg -i package_folder/clickhouse-server_*.deb echo "No BINARY_URL_TO_DOWNLOAD provided."
dpkg -i package_folder/clickhouse-client_*.deb else
wget_with_retry "$BINARY_URL_TO_DOWNLOAD"
chmod +x /clickhouse
fi
service clickhouse-server start && sleep 5 if [[ -f "/clickhouse" ]]; then
echo "/clickhouse exists"
else
exit 1
fi
cd /workspace
/clickhouse server -P /workspace/clickhouse-server.pid -L /workspace/clickhouse-server.log -E /workspace/clickhouse-server.log.err --daemon
for _ in $(seq 1 60); do if [[ $(wget -q 'localhost:8123' -O-) == 'Ok.' ]]; then break ; else sleep 1; fi ; done
cd /sqlancer/sqlancer-master cd /sqlancer/sqlancer-master
export TIMEOUT=300 TIMEOUT=300
export NUM_QUERIES=1000 NUM_QUERIES=1000
NUM_THREADS=10
TESTS=( "TLPGroupBy" "TLPHaving" "TLPWhere" "TLPDistinct" "TLPAggregate" "NoREC" )
echo "${TESTS[@]}"
( java -jar target/sqlancer-*.jar --num-threads 10 --timeout-seconds $TIMEOUT --num-queries $NUM_QUERIES --username default --password "" clickhouse --oracle TLPWhere | tee /test_output/TLPWhere.out ) 3>&1 1>&2 2>&3 | tee /test_output/TLPWhere.err for TEST in "${TESTS[@]}"; do
( java -jar target/sqlancer-*.jar --num-threads 10 --timeout-seconds $TIMEOUT --num-queries $NUM_QUERIES --username default --password "" clickhouse --oracle TLPGroupBy | tee /test_output/TLPGroupBy.out ) 3>&1 1>&2 2>&3 | tee /test_output/TLPGroupBy.err echo "$TEST"
( java -jar target/sqlancer-*.jar --num-threads 10 --timeout-seconds $TIMEOUT --num-queries $NUM_QUERIES --username default --password "" clickhouse --oracle TLPHaving | tee /test_output/TLPHaving.out ) 3>&1 1>&2 2>&3 | tee /test_output/TLPHaving.err if [[ $(wget -q 'localhost:8123' -O-) == 'Ok.' ]]
( java -jar target/sqlancer-*.jar --num-threads 10 --timeout-seconds $TIMEOUT --num-queries $NUM_QUERIES --username default --password "" clickhouse --oracle TLPWhere --oracle TLPGroupBy | tee /test_output/TLPWhereGroupBy.out ) 3>&1 1>&2 2>&3 | tee /test_output/TLPWhereGroupBy.err then
( java -jar target/sqlancer-*.jar --num-threads 10 --timeout-seconds $TIMEOUT --num-queries $NUM_QUERIES --username default --password "" clickhouse --oracle TLPDistinct | tee /test_output/TLPDistinct.out ) 3>&1 1>&2 2>&3 | tee /test_output/TLPDistinct.err echo "Server is OK"
( java -jar target/sqlancer-*.jar --num-threads 10 --timeout-seconds $TIMEOUT --num-queries $NUM_QUERIES --username default --password "" clickhouse --oracle TLPAggregate | tee /test_output/TLPAggregate.out ) 3>&1 1>&2 2>&3 | tee /test_output/TLPAggregate.err ( java -jar target/sqlancer-*.jar --log-each-select true --print-failed false --num-threads "$NUM_THREADS" --timeout-seconds "$TIMEOUT" --num-queries "$NUM_QUERIES" --username default --password "" clickhouse --oracle "$TEST" | tee "/workspace/$TEST.out" ) 3>&1 1>&2 2>&3 | tee "/workspace/$TEST.err"
else
touch "/workspace/$TEST.err" "/workspace/$TEST.out"
echo "Server is not responding" | tee /workspace/server_crashed.log
fi
done
service clickhouse stop ls /workspace
pkill -F /workspace/clickhouse-server.pid || true
ls /var/log/clickhouse-server/ for _ in $(seq 1 60); do if [[ $(wget -q 'localhost:8123' -O-) == 'Ok.' ]]; then sleep 1 ; else break; fi ; done
tar czf /test_output/logs.tar.gz -C /var/log/clickhouse-server/ .
tail -n 1000 /var/log/clickhouse-server/stderr.log > /test_output/stderr.log
tail -n 1000 /var/log/clickhouse-server/stdout.log > /test_output/stdout.log
tail -n 1000 /var/log/clickhouse-server/clickhouse-server.log > /test_output/clickhouse-server.log
/process_sqlancer_result.py || echo -e "failure\tCannot parse results" > /test_output/check_status.tsv /process_sqlancer_result.py || echo -e "failure\tCannot parse results" > /workspace/check_status.tsv
ls /test_output ls /workspace

View File

@ -481,6 +481,7 @@ else
-e "The set of parts restored in place of" \ -e "The set of parts restored in place of" \
-e "(ReplicatedMergeTreeAttachThread): Initialization failed. Error" \ -e "(ReplicatedMergeTreeAttachThread): Initialization failed. Error" \
-e "Code: 269. DB::Exception: Destination table is myself" \ -e "Code: 269. DB::Exception: Destination table is myself" \
-e "Coordination::Exception: Connection loss" \
/var/log/clickhouse-server/clickhouse-server.backward.clean.log | zgrep -Fa "<Error>" > /test_output/bc_check_error_messages.txt \ /var/log/clickhouse-server/clickhouse-server.backward.clean.log | zgrep -Fa "<Error>" > /test_output/bc_check_error_messages.txt \
&& echo -e 'Backward compatibility check: Error message in clickhouse-server.log (see bc_check_error_messages.txt)\tFAIL' >> /test_output/test_results.tsv \ && echo -e 'Backward compatibility check: Error message in clickhouse-server.log (see bc_check_error_messages.txt)\tFAIL' >> /test_output/test_results.tsv \
|| echo -e 'Backward compatibility check: No Error messages in clickhouse-server.log\tOK' >> /test_output/test_results.tsv || echo -e 'Backward compatibility check: No Error messages in clickhouse-server.log\tOK' >> /test_output/test_results.tsv

View File

@ -1,7 +1,7 @@
# docker build -t clickhouse/style-test . # docker build -t clickhouse/style-test .
FROM ubuntu:20.04 FROM ubuntu:20.04
ARG ACT_VERSION=0.2.25 ARG ACT_VERSION=0.2.33
ARG ACTIONLINT_VERSION=1.6.8 ARG ACTIONLINT_VERSION=1.6.22
# ARG for quick switch to a given ubuntu mirror # ARG for quick switch to a given ubuntu mirror
ARG apt_archive="http://archive.ubuntu.com" ARG apt_archive="http://archive.ubuntu.com"

View File

@ -212,4 +212,4 @@ Templates:
## How to Build Documentation ## How to Build Documentation
You can build your documentation manually by following the instructions in [docs/tools/README.md](../docs/tools/README.md). Also, our CI runs the documentation build after the `documentation` label is added to PR. You can see the results of a build in the GitHub interface. If you have no permissions to add labels, a reviewer of your PR will add it. You can build your documentation manually by following the instructions in the docs repo [contrib-writing-guide](https://github.com/ClickHouse/clickhouse-docs/blob/main/contrib-writing-guide.md). Also, our CI runs the documentation build after the `documentation` label is added to PR. You can see the results of a build in the GitHub interface. If you have no permissions to add labels, a reviewer of your PR will add it.

View File

@ -0,0 +1,18 @@
---
sidebar_position: 1
sidebar_label: 2022
---
# 2022 Changelog
### ClickHouse release v22.10.2.11-stable (d2bfcaba002) FIXME as compared to v22.10.1.1877-stable (98ab5a3c189)
#### Bug Fix (user-visible misbehavior in official stable or prestable release)
* Backported in [#42750](https://github.com/ClickHouse/ClickHouse/issues/42750): A segmentation fault related to DNS & c-ares has been reported. The below error ocurred in multiple threads: ``` 2022-09-28 15:41:19.008,2022.09.28 15:41:19.008088 [ 356 ] {} <Fatal> BaseDaemon: ######################################## 2022-09-28 15:41:19.008,"2022.09.28 15:41:19.008147 [ 356 ] {} <Fatal> BaseDaemon: (version 22.8.5.29 (official build), build id: 92504ACA0B8E2267) (from thread 353) (no query) Received signal Segmentation fault (11)" 2022-09-28 15:41:19.008,2022.09.28 15:41:19.008196 [ 356 ] {} <Fatal> BaseDaemon: Address: 0xf Access: write. Address not mapped to object. 2022-09-28 15:41:19.008,2022.09.28 15:41:19.008216 [ 356 ] {} <Fatal> BaseDaemon: Stack trace: 0x188f8212 0x1626851b 0x1626a69e 0x16269b3f 0x16267eab 0x13cf8284 0x13d24afc 0x13c5217e 0x14ec2495 0x15ba440f 0x15b9d13b 0x15bb2699 0x1891ccb3 0x1891e00d 0x18ae0769 0x18ade022 0x7f76aa985609 0x7f76aa8aa133 2022-09-28 15:41:19.008,2022.09.28 15:41:19.008274 [ 356 ] {} <Fatal> BaseDaemon: 2. Poco::Net::IPAddress::family() const @ 0x188f8212 in /usr/bin/clickhouse 2022-09-28 15:41:19.008,2022.09.28 15:41:19.008297 [ 356 ] {} <Fatal> BaseDaemon: 3. ? @ 0x1626851b in /usr/bin/clickhouse 2022-09-28 15:41:19.008,2022.09.28 15:41:19.008309 [ 356 ] {} <Fatal> BaseDaemon: 4. ? @ 0x1626a69e in /usr/bin/clickhouse ```. [#42234](https://github.com/ClickHouse/ClickHouse/pull/42234) ([Arthur Passos](https://github.com/arthurpassos)).
* Backported in [#42793](https://github.com/ClickHouse/ClickHouse/issues/42793): Fix a bug in ParserFunction that could have led to a segmentation fault. [#42724](https://github.com/ClickHouse/ClickHouse/pull/42724) ([Nikolay Degterinsky](https://github.com/evillique)).
#### NOT FOR CHANGELOG / INSIGNIFICANT
* Always run `BuilderReport` and `BuilderSpecialReport` in all CI types [#42684](https://github.com/ClickHouse/ClickHouse/pull/42684) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).

View File

@ -0,0 +1,26 @@
---
sidebar_position: 1
sidebar_label: 2022
---
# 2022 Changelog
### ClickHouse release v22.3.14.18-lts (642946f61b2) FIXME as compared to v22.3.13.80-lts (e2708b01fba)
#### Bug Fix
* Backported in [#42432](https://github.com/ClickHouse/ClickHouse/issues/42432): - Choose correct aggregation method for LowCardinality with BigInt. [#42342](https://github.com/ClickHouse/ClickHouse/pull/42342) ([Duc Canh Le](https://github.com/canhld94)).
#### Build/Testing/Packaging Improvement
* Backported in [#42328](https://github.com/ClickHouse/ClickHouse/issues/42328): Update cctz to the latest master, update tzdb to 2020e. [#42273](https://github.com/ClickHouse/ClickHouse/pull/42273) ([Dom Del Nano](https://github.com/ddelnano)).
* Backported in [#42358](https://github.com/ClickHouse/ClickHouse/issues/42358): Update tzdata to 2022e to support the new timezone changes. Palestine transitions are now Saturdays at 02:00. Simplify three Ukraine zones into one. Jordan and Syria switch from +02/+03 with DST to year-round +03. (https://data.iana.org/time-zones/tzdb/NEWS). This closes [#42252](https://github.com/ClickHouse/ClickHouse/issues/42252). [#42327](https://github.com/ClickHouse/ClickHouse/pull/42327) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
#### Bug Fix (user-visible misbehavior in official stable or prestable release)
* Backported in [#42298](https://github.com/ClickHouse/ClickHouse/issues/42298): Fix a bug with projections and the `aggregate_functions_null_for_empty` setting. This bug is very rare and appears only if you enable the `aggregate_functions_null_for_empty` setting in the server's config. This closes [#41647](https://github.com/ClickHouse/ClickHouse/issues/41647). [#42198](https://github.com/ClickHouse/ClickHouse/pull/42198) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Backported in [#42592](https://github.com/ClickHouse/ClickHouse/issues/42592): This closes [#42453](https://github.com/ClickHouse/ClickHouse/issues/42453). [#42573](https://github.com/ClickHouse/ClickHouse/pull/42573) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
#### NOT FOR CHANGELOG / INSIGNIFICANT
* Add a warning message to release.py script, require release type [#41975](https://github.com/ClickHouse/ClickHouse/pull/41975) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Revert [#27787](https://github.com/ClickHouse/ClickHouse/issues/27787) [#42136](https://github.com/ClickHouse/ClickHouse/pull/42136) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).

View File

@ -0,0 +1,29 @@
---
sidebar_position: 1
sidebar_label: 2022
---
# 2022 Changelog
### ClickHouse release v22.3.14.23-lts (74956bfee4d) FIXME as compared to v22.3.13.80-lts (e2708b01fba)
#### Improvement
* Backported in [#42527](https://github.com/ClickHouse/ClickHouse/issues/42527): Fix issue with passing MySQL timeouts for MySQL database engine and MySQL table function. Closes [#34168](https://github.com/ClickHouse/ClickHouse/issues/34168)?notification_referrer_id=NT_kwDOAzsV57MzMDMxNjAzNTY5OjU0MjAzODc5. [#40751](https://github.com/ClickHouse/ClickHouse/pull/40751) ([Kseniia Sumarokova](https://github.com/kssenii)).
#### Bug Fix
* Backported in [#42432](https://github.com/ClickHouse/ClickHouse/issues/42432): - Choose correct aggregation method for LowCardinality with BigInt. [#42342](https://github.com/ClickHouse/ClickHouse/pull/42342) ([Duc Canh Le](https://github.com/canhld94)).
#### Build/Testing/Packaging Improvement
* Backported in [#42328](https://github.com/ClickHouse/ClickHouse/issues/42328): Update cctz to the latest master, update tzdb to 2020e. [#42273](https://github.com/ClickHouse/ClickHouse/pull/42273) ([Dom Del Nano](https://github.com/ddelnano)).
* Backported in [#42358](https://github.com/ClickHouse/ClickHouse/issues/42358): Update tzdata to 2022e to support the new timezone changes. Palestine transitions are now Saturdays at 02:00. Simplify three Ukraine zones into one. Jordan and Syria switch from +02/+03 with DST to year-round +03. (https://data.iana.org/time-zones/tzdb/NEWS). This closes [#42252](https://github.com/ClickHouse/ClickHouse/issues/42252). [#42327](https://github.com/ClickHouse/ClickHouse/pull/42327) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
#### Bug Fix (user-visible misbehavior in official stable or prestable release)
* Backported in [#42298](https://github.com/ClickHouse/ClickHouse/issues/42298): Fix a bug with projections and the `aggregate_functions_null_for_empty` setting. This bug is very rare and appears only if you enable the `aggregate_functions_null_for_empty` setting in the server's config. This closes [#41647](https://github.com/ClickHouse/ClickHouse/issues/41647). [#42198](https://github.com/ClickHouse/ClickHouse/pull/42198) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Backported in [#42592](https://github.com/ClickHouse/ClickHouse/issues/42592): This closes [#42453](https://github.com/ClickHouse/ClickHouse/issues/42453). [#42573](https://github.com/ClickHouse/ClickHouse/pull/42573) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
#### NOT FOR CHANGELOG / INSIGNIFICANT
* Add a warning message to release.py script, require release type [#41975](https://github.com/ClickHouse/ClickHouse/pull/41975) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Revert [#27787](https://github.com/ClickHouse/ClickHouse/issues/27787) [#42136](https://github.com/ClickHouse/ClickHouse/pull/42136) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).

View File

@ -105,7 +105,7 @@ ninja
Example for Fedora Rawhide: Example for Fedora Rawhide:
``` bash ``` bash
sudo yum update sudo yum update
yum --nogpg install git cmake make clang-c++ python3 sudo yum --nogpg install git cmake make clang python3 ccache
git clone --recursive https://github.com/ClickHouse/ClickHouse.git git clone --recursive https://github.com/ClickHouse/ClickHouse.git
mkdir build && cd build mkdir build && cd build
cmake ../ClickHouse cmake ../ClickHouse

View File

@ -77,15 +77,15 @@ While turning on `gtid_mode` you should also specify `enforce_gtid_consistency =
## Virtual Columns {#virtual-columns} ## Virtual Columns {#virtual-columns}
When working with the `MaterializedMySQL` database engine, [ReplacingMergeTree](../../engines/table-engines/mergetree-family/replacingmergetree.md) tables are used with virtual `_sign` and `_version` columns. When working with the `MaterializedMySQL` database engine, [ReplacingMergeTree](/docs/en/engines/table-engines/mergetree-family/replacingmergetree.md) tables are used with virtual `_sign` and `_version` columns.
### \_version ### \_version
`_version` — Transaction counter. Type [UInt64](../../sql-reference/data-types/int-uint.md). `_version` — Transaction counter. Type [UInt64](/docs/en/sql-reference/data-types/int-uint.md).
### \_sign ### \_sign
`_sign` — Deletion mark. Type [Int8](../../sql-reference/data-types/int-uint.md). Possible values: `_sign` — Deletion mark. Type [Int8](/docs/en/sql-reference/data-types/int-uint.md). Possible values:
- `1` — Row is not deleted, - `1` — Row is not deleted,
- `-1` — Row is deleted. - `-1` — Row is deleted.
@ -93,29 +93,29 @@ When working with the `MaterializedMySQL` database engine, [ReplacingMergeTree](
| MySQL | ClickHouse | | MySQL | ClickHouse |
|-------------------------|--------------------------------------------------------------| |-------------------------|--------------------------------------------------------------|
| TINY | [Int8](../../sql-reference/data-types/int-uint.md) | | TINY | [Int8](/docs/en/sql-reference/data-types/int-uint.md) |
| SHORT | [Int16](../../sql-reference/data-types/int-uint.md) | | SHORT | [Int16](/docs/en/sql-reference/data-types/int-uint.md) |
| INT24 | [Int32](../../sql-reference/data-types/int-uint.md) | | INT24 | [Int32](/docs/en/sql-reference/data-types/int-uint.md) |
| LONG | [UInt32](../../sql-reference/data-types/int-uint.md) | | LONG | [UInt32](/docs/en/sql-reference/data-types/int-uint.md) |
| LONGLONG | [UInt64](../../sql-reference/data-types/int-uint.md) | | LONGLONG | [UInt64](/docs/en/sql-reference/data-types/int-uint.md) |
| FLOAT | [Float32](../../sql-reference/data-types/float.md) | | FLOAT | [Float32](/docs/en/sql-reference/data-types/float.md) |
| DOUBLE | [Float64](../../sql-reference/data-types/float.md) | | DOUBLE | [Float64](/docs/en/sql-reference/data-types/float.md) |
| DECIMAL, NEWDECIMAL | [Decimal](../../sql-reference/data-types/decimal.md) | | DECIMAL, NEWDECIMAL | [Decimal](/docs/en/sql-reference/data-types/decimal.md) |
| DATE, NEWDATE | [Date](../../sql-reference/data-types/date.md) | | DATE, NEWDATE | [Date](/docs/en/sql-reference/data-types/date.md) |
| DATETIME, TIMESTAMP | [DateTime](../../sql-reference/data-types/datetime.md) | | DATETIME, TIMESTAMP | [DateTime](/docs/en/sql-reference/data-types/datetime.md) |
| DATETIME2, TIMESTAMP2 | [DateTime64](../../sql-reference/data-types/datetime64.md) | | DATETIME2, TIMESTAMP2 | [DateTime64](/docs/en/sql-reference/data-types/datetime64.md) |
| YEAR | [UInt16](../../sql-reference/data-types/int-uint.md) | | YEAR | [UInt16](/docs/en/sql-reference/data-types/int-uint.md) |
| TIME | [Int64](../../sql-reference/data-types/int-uint.md) | | TIME | [Int64](/docs/en/sql-reference/data-types/int-uint.md) |
| ENUM | [Enum](../../sql-reference/data-types/enum.md) | | ENUM | [Enum](/docs/en/sql-reference/data-types/enum.md) |
| STRING | [String](../../sql-reference/data-types/string.md) | | STRING | [String](/docs/en/sql-reference/data-types/string.md) |
| VARCHAR, VAR_STRING | [String](../../sql-reference/data-types/string.md) | | VARCHAR, VAR_STRING | [String](/docs/en/sql-reference/data-types/string.md) |
| BLOB | [String](../../sql-reference/data-types/string.md) | | BLOB | [String](/docs/en/sql-reference/data-types/string.md) |
| GEOMETRY | [String](../../sql-reference/data-types/string.md) | | GEOMETRY | [String](/docs/en/sql-reference/data-types/string.md) |
| BINARY | [FixedString](../../sql-reference/data-types/fixedstring.md) | | BINARY | [FixedString](/docs/en/sql-reference/data-types/fixedstring.md) |
| BIT | [UInt64](../../sql-reference/data-types/int-uint.md) | | BIT | [UInt64](/docs/en/sql-reference/data-types/int-uint.md) |
| SET | [UInt64](../../sql-reference/data-types/int-uint.md) | | SET | [UInt64](/docs/en/sql-reference/data-types/int-uint.md) |
[Nullable](../../sql-reference/data-types/nullable.md) is supported. [Nullable](/docs/en/sql-reference/data-types/nullable.md) is supported.
The data of TIME type in MySQL is converted to microseconds in ClickHouse. The data of TIME type in MySQL is converted to microseconds in ClickHouse.
@ -133,7 +133,7 @@ Apart of the data types limitations there are few restrictions comparing to `MyS
### DDL Queries {#ddl-queries} ### DDL Queries {#ddl-queries}
MySQL DDL queries are converted into the corresponding ClickHouse DDL queries ([ALTER](../../sql-reference/statements/alter/index.md), [CREATE](../../sql-reference/statements/create/index.md), [DROP](../../sql-reference/statements/drop), [RENAME](../../sql-reference/statements/rename.md)). If ClickHouse cannot parse some DDL query, the query is ignored. MySQL DDL queries are converted into the corresponding ClickHouse DDL queries ([ALTER](/docs/en/sql-reference/statements/alter/index.md), [CREATE](/docs/en/sql-reference/statements/create/index.md), [DROP](/docs/en/sql-reference/statements/drop.md), [RENAME](/docs/en/sql-reference/statements/rename.md)). If ClickHouse cannot parse some DDL query, the query is ignored.
### Data Replication {#data-replication} ### Data Replication {#data-replication}
@ -151,7 +151,7 @@ MySQL DDL queries are converted into the corresponding ClickHouse DDL queries ([
`SELECT` query from `MaterializedMySQL` tables has some specifics: `SELECT` query from `MaterializedMySQL` tables has some specifics:
- If `_version` is not specified in the `SELECT` query, the - If `_version` is not specified in the `SELECT` query, the
[FINAL](../../sql-reference/statements/select/from.md#select-from-final) modifier is used, so only rows with [FINAL](/docs/en/sql-reference/statements/select/from.md/#select-from-final) modifier is used, so only rows with
`MAX(_version)` are returned for each primary key value. `MAX(_version)` are returned for each primary key value.
- If `_sign` is not specified in the `SELECT` query, `WHERE _sign=1` is used by default. So the deleted rows are not - If `_sign` is not specified in the `SELECT` query, `WHERE _sign=1` is used by default. So the deleted rows are not
@ -164,7 +164,7 @@ MySQL DDL queries are converted into the corresponding ClickHouse DDL queries ([
MySQL `PRIMARY KEY` and `INDEX` clauses are converted into `ORDER BY` tuples in ClickHouse tables. MySQL `PRIMARY KEY` and `INDEX` clauses are converted into `ORDER BY` tuples in ClickHouse tables.
ClickHouse has only one physical order, which is determined by `ORDER BY` clause. To create a new physical order, use ClickHouse has only one physical order, which is determined by `ORDER BY` clause. To create a new physical order, use
[materialized views](../../sql-reference/statements/create/view.md#materialized). [materialized views](/docs/en/sql-reference/statements/create/view.md/#materialized).
**Notes** **Notes**
@ -173,7 +173,7 @@ ClickHouse has only one physical order, which is determined by `ORDER BY` clause
MySQL binlog. MySQL binlog.
- Replication can be easily broken. - Replication can be easily broken.
- Manual operations on database and tables are forbidden. - Manual operations on database and tables are forbidden.
- `MaterializedMySQL` is affected by the [optimize_on_insert](../../operations/settings/settings.md#optimize-on-insert) - `MaterializedMySQL` is affected by the [optimize_on_insert](/docs/en/operations/settings/settings.md/#optimize-on-insert)
setting. Data is merged in the corresponding table in the `MaterializedMySQL` database when a table in the MySQL setting. Data is merged in the corresponding table in the `MaterializedMySQL` database when a table in the MySQL
server changes. server changes.
@ -187,19 +187,19 @@ These are the schema conversion manipulations you can do with table overrides fo
* Modify column type. Must be compatible with the original type, or replication will fail. For example, * Modify column type. Must be compatible with the original type, or replication will fail. For example,
you can modify a UInt32 column to UInt64, but you can not modify a String column to Array(String). you can modify a UInt32 column to UInt64, but you can not modify a String column to Array(String).
* Modify [column TTL](../table-engines/mergetree-family/mergetree/#mergetree-column-ttl). * Modify [column TTL](/docs/en/engines/table-engines/mergetree-family/mergetree.md/#mergetree-column-ttl).
* Modify [column compression codec](../../sql-reference/statements/create/table/#codecs). * Modify [column compression codec](/docs/en/sql-reference/statements/create/table.md/#codecs).
* Add [ALIAS columns](../../sql-reference/statements/create/table/#alias). * Add [ALIAS columns](/docs/en/sql-reference/statements/create/table.md/#alias).
* Add [skipping indexes](../table-engines/mergetree-family/mergetree/#table_engine-mergetree-data_skipping-indexes) * Add [skipping indexes](/docs/en/engines/table-engines/mergetree-family/mergetree.md/#table_engine-mergetree-data_skipping-indexes)
* Add [projections](../table-engines/mergetree-family/mergetree/#projections). Note that projection optimizations are * Add [projections](/docs/en/engines/table-engines/mergetree-family/mergetree.md/#projections). Note that projection optimizations are
disabled when using `SELECT ... FINAL` (which MaterializedMySQL does by default), so their utility is limited here. disabled when using `SELECT ... FINAL` (which MaterializedMySQL does by default), so their utility is limited here.
`INDEX ... TYPE hypothesis` as [described in the v21.12 blog post]](https://clickhouse.com/blog/en/2021/clickhouse-v21.12-released/) `INDEX ... TYPE hypothesis` as [described in the v21.12 blog post]](https://clickhouse.com/blog/en/2021/clickhouse-v21.12-released/)
may be more useful in this case. may be more useful in this case.
* Modify [PARTITION BY](../table-engines/mergetree-family/custom-partitioning-key/) * Modify [PARTITION BY](/docs/en/engines/table-engines/mergetree-family/custom-partitioning-key/)
* Modify [ORDER BY](../table-engines/mergetree-family/mergetree/#mergetree-query-clauses) * Modify [ORDER BY](/docs/en/engines/table-engines/mergetree-family/mergetree.md/#mergetree-query-clauses)
* Modify [PRIMARY KEY](../table-engines/mergetree-family/mergetree/#mergetree-query-clauses) * Modify [PRIMARY KEY](/docs/en/engines/table-engines/mergetree-family/mergetree.md/#mergetree-query-clauses)
* Add [SAMPLE BY](../table-engines/mergetree-family/mergetree/#mergetree-query-clauses) * Add [SAMPLE BY](/docs/en/engines/table-engines/mergetree-family/mergetree.md/#mergetree-query-clauses)
* Add [table TTL](../table-engines/mergetree-family/mergetree/#mergetree-query-clauses) * Add [table TTL](/docs/en/engines/table-engines/mergetree-family/mergetree.md/#mergetree-query-clauses)
```sql ```sql
CREATE DATABASE db_name ENGINE = MaterializedMySQL(...) CREATE DATABASE db_name ENGINE = MaterializedMySQL(...)

View File

@ -86,7 +86,7 @@ node1 :) SELECT materialize(hostName()) AS host, groupArray(n) FROM r.d GROUP BY
``` text ``` text
┌─hosts─┬─groupArray(n)─┐ ┌─hosts─┬─groupArray(n)─┐
│ node1 │ [1,3,5,7,9] │ │ node3 │ [1,3,5,7,9] │
│ node2 │ [0,2,4,6,8] │ │ node2 │ [0,2,4,6,8] │
└───────┴───────────────┘ └───────┴───────────────┘
``` ```

View File

@ -139,7 +139,7 @@ The following settings can be specified in configuration file for given endpoint
- `use_environment_credentials` — If set to `true`, S3 client will try to obtain credentials from environment variables and [Amazon EC2](https://en.wikipedia.org/wiki/Amazon_Elastic_Compute_Cloud) metadata for given endpoint. Optional, default value is `false`. - `use_environment_credentials` — If set to `true`, S3 client will try to obtain credentials from environment variables and [Amazon EC2](https://en.wikipedia.org/wiki/Amazon_Elastic_Compute_Cloud) metadata for given endpoint. Optional, default value is `false`.
- `region` — Specifies S3 region name. Optional. - `region` — Specifies S3 region name. Optional.
- `use_insecure_imds_request` — If set to `true`, S3 client will use insecure IMDS request while obtaining credentials from Amazon EC2 metadata. Optional, default value is `false`. - `use_insecure_imds_request` — If set to `true`, S3 client will use insecure IMDS request while obtaining credentials from Amazon EC2 metadata. Optional, default value is `false`.
- `header` — Adds specified HTTP header to a request to given endpoint. Optional, can be speficied multiple times. - `header` — Adds specified HTTP header to a request to given endpoint. Optional, can be specified multiple times.
- `server_side_encryption_customer_key_base64` — If specified, required headers for accessing S3 objects with SSE-C encryption will be set. Optional. - `server_side_encryption_customer_key_base64` — If specified, required headers for accessing S3 objects with SSE-C encryption will be set. Optional.
- `max_single_read_retries` — The maximum number of attempts during single read. Default value is `4`. Optional. - `max_single_read_retries` — The maximum number of attempts during single read. Default value is `4`. Optional.

View File

@ -10,11 +10,11 @@ These engines were developed for scenarios when you need to quickly write many s
Engines of the family: Engines of the family:
- [StripeLog](../../../engines/table-engines/log-family/stripelog.md) - [StripeLog](/docs/en/engines/table-engines/log-family/stripelog.md)
- [Log](../../../engines/table-engines/log-family/log.md) - [Log](/docs/en/engines/table-engines/log-family/log.md)
- [TinyLog](../../../engines/table-engines/log-family/tinylog.md) - [TinyLog](/docs/en/engines/table-engines/log-family/tinylog.md)
`Log` family table engines can store data to [HDFS](../../../engines/table-engines/mergetree-family/mergetree.md#table_engine-mergetree-hdfs) or [S3](../../../engines/table-engines/mergetree-family/mergetree.md#table_engine-mergetree-s3) distributed file systems. `Log` family table engines can store data to [HDFS](/docs/en/engines/table-engines/mergetree-family/mergetree.md/#table_engine-mergetree-hdfs) or [S3](/docs/en/engines/table-engines/mergetree-family/mergetree.md/#table_engine-mergetree-s3) distributed file systems.
## Common Properties {#common-properties} ## Common Properties {#common-properties}
@ -28,7 +28,7 @@ Engines:
During `INSERT` queries, the table is locked, and other queries for reading and writing data both wait for the table to unlock. If there are no data writing queries, any number of data reading queries can be performed concurrently. During `INSERT` queries, the table is locked, and other queries for reading and writing data both wait for the table to unlock. If there are no data writing queries, any number of data reading queries can be performed concurrently.
- Do not support [mutations](../../../sql-reference/statements/alter/index.md#alter-mutations). - Do not support [mutations](/docs/en/sql-reference/statements/alter/index.md/#alter-mutations).
- Do not support indexes. - Do not support indexes.

View File

@ -68,36 +68,57 @@ In the results of `SELECT` query, the values of `AggregateFunction` type have im
## Example of an Aggregated Materialized View {#example-of-an-aggregated-materialized-view} ## Example of an Aggregated Materialized View {#example-of-an-aggregated-materialized-view}
`AggregatingMergeTree` materialized view that watches the `test.visits` table: We will create the table `test.visits` that contain the raw data:
``` sql ``` sql
CREATE MATERIALIZED VIEW test.basic CREATE TABLE test.visits
ENGINE = AggregatingMergeTree() PARTITION BY toYYYYMM(StartDate) ORDER BY (CounterID, StartDate) (
StartDate DateTime64 NOT NULL,
CounterID UInt64,
Sign Nullable(Int32),
UserID Nullable(Int32)
) ENGINE = MergeTree ORDER BY (StartDate, CounterID);
```
`AggregatingMergeTree` materialized view that watches the `test.visits` table, and use the `AggregateFunction` type:
``` sql
CREATE MATERIALIZED VIEW test.mv_visits
(
StartDate DateTime64 NOT NULL,
CounterID UInt64,
Visits AggregateFunction(sum, Nullable(Int32)),
Users AggregateFunction(uniq, Nullable(Int32))
)
ENGINE = AggregatingMergeTree() ORDER BY (StartDate, CounterID)
AS SELECT AS SELECT
CounterID,
StartDate, StartDate,
CounterID,
sumState(Sign) AS Visits, sumState(Sign) AS Visits,
uniqState(UserID) AS Users uniqState(UserID) AS Users
FROM test.visits FROM test.visits
GROUP BY CounterID, StartDate; GROUP BY StartDate, CounterID;
``` ```
Inserting data into the `test.visits` table. Inserting data into the `test.visits` table.
``` sql ``` sql
INSERT INTO test.visits ... INSERT INTO test.visits (StartDate, CounterID, Sign, UserID)
VALUES (1667446031, 1, 3, 4)
INSERT INTO test.visits (StartDate, CounterID, Sign, UserID)
VALUES (1667446031, 1, 6, 3)
``` ```
The data are inserted in both the table and view `test.basic` that will perform the aggregation. The data are inserted in both the table and the materialized view `test.mv_visits`.
To get the aggregated data, we need to execute a query such as `SELECT ... GROUP BY ...` from the view `test.basic`: To get the aggregated data, we need to execute a query such as `SELECT ... GROUP BY ...` from the materialized view `test.mv_visits`:
``` sql ``` sql
SELECT SELECT
StartDate, StartDate,
sumMerge(Visits) AS Visits, sumMerge(Visits) AS Visits,
uniqMerge(Users) AS Users uniqMerge(Users) AS Users
FROM test.basic FROM test.mv_visits
GROUP BY StartDate GROUP BY StartDate
ORDER BY StartDate; ORDER BY StartDate;
``` ```

View File

@ -16,20 +16,20 @@ Main features:
This allows you to create a small sparse index that helps find data faster. This allows you to create a small sparse index that helps find data faster.
- Partitions can be used if the [partitioning key](../../../engines/table-engines/mergetree-family/custom-partitioning-key.md) is specified. - Partitions can be used if the [partitioning key](/docs/en/engines/table-engines/mergetree-family/custom-partitioning-key.md) is specified.
ClickHouse supports certain operations with partitions that are more efficient than general operations on the same data with the same result. ClickHouse also automatically cuts off the partition data where the partitioning key is specified in the query. ClickHouse supports certain operations with partitions that are more efficient than general operations on the same data with the same result. ClickHouse also automatically cuts off the partition data where the partitioning key is specified in the query.
- Data replication support. - Data replication support.
The family of `ReplicatedMergeTree` tables provides data replication. For more information, see [Data replication](../../../engines/table-engines/mergetree-family/replication.md). The family of `ReplicatedMergeTree` tables provides data replication. For more information, see [Data replication](/docs/en/engines/table-engines/mergetree-family/replication.md).
- Data sampling support. - Data sampling support.
If necessary, you can set the data sampling method in the table. If necessary, you can set the data sampling method in the table.
:::info :::info
The [Merge](../../../engines/table-engines/special/merge.md#merge) engine does not belong to the `*MergeTree` family. The [Merge](/docs/en/engines/table-engines/special/merge.md/#merge) engine does not belong to the `*MergeTree` family.
::: :::
## Creating a Table {#table_engine-mergetree-creating-a-table} ## Creating a Table {#table_engine-mergetree-creating-a-table}
@ -57,7 +57,7 @@ ORDER BY expr
[SETTINGS name=value, ...] [SETTINGS name=value, ...]
``` ```
For a description of parameters, see the [CREATE query description](../../../sql-reference/statements/create/table.md). For a description of parameters, see the [CREATE query description](/docs/en/sql-reference/statements/create/table.md).
### Query Clauses {#mergetree-query-clauses} ### Query Clauses {#mergetree-query-clauses}
@ -77,9 +77,9 @@ Use the `ORDER BY tuple()` syntax, if you do not need sorting. See [Selecting th
#### PARTITION BY #### PARTITION BY
`PARTITION BY` — The [partitioning key](../../../engines/table-engines/mergetree-family/custom-partitioning-key.md). Optional. In most cases you don't need partition key, and in most other cases you don't need partition key more granular than by months. Partitioning does not speed up queries (in contrast to the ORDER BY expression). You should never use too granular partitioning. Don't partition your data by client identifiers or names (instead make client identifier or name the first column in the ORDER BY expression). `PARTITION BY` — The [partitioning key](/docs/en/engines/table-engines/mergetree-family/custom-partitioning-key.md). Optional. In most cases you don't need partition key, and in most other cases you don't need partition key more granular than by months. Partitioning does not speed up queries (in contrast to the ORDER BY expression). You should never use too granular partitioning. Don't partition your data by client identifiers or names (instead make client identifier or name the first column in the ORDER BY expression).
For partitioning by month, use the `toYYYYMM(date_column)` expression, where `date_column` is a column with a date of the type [Date](../../../sql-reference/data-types/date.md). The partition names here have the `"YYYYMM"` format. For partitioning by month, use the `toYYYYMM(date_column)` expression, where `date_column` is a column with a date of the type [Date](/docs/en/sql-reference/data-types/date.md). The partition names here have the `"YYYYMM"` format.
#### PRIMARY KEY #### PRIMARY KEY
@ -127,7 +127,7 @@ Additional parameters that control the behavior of the `MergeTree` (optional):
#### use_minimalistic_part_header_in_zookeeper #### use_minimalistic_part_header_in_zookeeper
`use_minimalistic_part_header_in_zookeeper` — Storage method of the data parts headers in ZooKeeper. If `use_minimalistic_part_header_in_zookeeper=1`, then ZooKeeper stores less data. For more information, see the [setting description](../../../operations/server-configuration-parameters/settings.md#server-settings-use_minimalistic_part_header_in_zookeeper) in “Server configuration parameters”. `use_minimalistic_part_header_in_zookeeper` — Storage method of the data parts headers in ZooKeeper. If `use_minimalistic_part_header_in_zookeeper=1`, then ZooKeeper stores less data. For more information, see the [setting description](/docs/en/operations/server-configuration-parameters/settings.md/#server-settings-use_minimalistic_part_header_in_zookeeper) in “Server configuration parameters”.
#### min_merge_bytes_to_use_direct_io #### min_merge_bytes_to_use_direct_io
@ -166,15 +166,15 @@ Additional parameters that control the behavior of the `MergeTree` (optional):
#### max_compress_block_size #### max_compress_block_size
`max_compress_block_size` — Maximum size of blocks of uncompressed data before compressing for writing to a table. You can also specify this setting in the global settings (see [max_compress_block_size](../../../operations/settings/settings.md#max-compress-block-size) setting). The value specified when table is created overrides the global value for this setting. `max_compress_block_size` — Maximum size of blocks of uncompressed data before compressing for writing to a table. You can also specify this setting in the global settings (see [max_compress_block_size](/docs/en/operations/settings/settings.md/#max-compress-block-size) setting). The value specified when table is created overrides the global value for this setting.
#### min_compress_block_size #### min_compress_block_size
`min_compress_block_size` — Minimum size of blocks of uncompressed data required for compression when writing the next mark. You can also specify this setting in the global settings (see [min_compress_block_size](../../../operations/settings/settings.md#min-compress-block-size) setting). The value specified when table is created overrides the global value for this setting. `min_compress_block_size` — Minimum size of blocks of uncompressed data required for compression when writing the next mark. You can also specify this setting in the global settings (see [min_compress_block_size](/docs/en/operations/settings/settings.md/#min-compress-block-size) setting). The value specified when table is created overrides the global value for this setting.
#### max_partitions_to_read #### max_partitions_to_read
`max_partitions_to_read` — Limits the maximum number of partitions that can be accessed in one query. You can also specify setting [max_partitions_to_read](../../../operations/settings/merge-tree-settings.md#max-partitions-to-read) in the global setting. `max_partitions_to_read` — Limits the maximum number of partitions that can be accessed in one query. You can also specify setting [max_partitions_to_read](/docs/en/operations/settings/merge-tree-settings.md/#max-partitions-to-read) in the global setting.
**Example of Sections Setting** **Example of Sections Setting**
@ -184,7 +184,7 @@ ENGINE MergeTree() PARTITION BY toYYYYMM(EventDate) ORDER BY (CounterID, EventDa
In the example, we set partitioning by month. In the example, we set partitioning by month.
We also set an expression for sampling as a hash by the user ID. This allows you to pseudorandomize the data in the table for each `CounterID` and `EventDate`. If you define a [SAMPLE](../../../sql-reference/statements/select/sample.md#select-sample-clause) clause when selecting the data, ClickHouse will return an evenly pseudorandom data sample for a subset of users. We also set an expression for sampling as a hash by the user ID. This allows you to pseudorandomize the data in the table for each `CounterID` and `EventDate`. If you define a [SAMPLE](/docs/en/sql-reference/statements/select/sample.md/#select-sample-clause) clause when selecting the data, ClickHouse will return an evenly pseudorandom data sample for a subset of users.
The `index_granularity` setting can be omitted because 8192 is the default value. The `index_granularity` setting can be omitted because 8192 is the default value.
@ -207,9 +207,9 @@ CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]
**MergeTree() Parameters** **MergeTree() Parameters**
- `date-column` — The name of a column of the [Date](../../../sql-reference/data-types/date.md) type. ClickHouse automatically creates partitions by month based on this column. The partition names are in the `"YYYYMM"` format. - `date-column` — The name of a column of the [Date](/docs/en/sql-reference/data-types/date.md) type. ClickHouse automatically creates partitions by month based on this column. The partition names are in the `"YYYYMM"` format.
- `sampling_expression` — An expression for sampling. - `sampling_expression` — An expression for sampling.
- `(primary, key)` — Primary key. Type: [Tuple()](../../../sql-reference/data-types/tuple.md) - `(primary, key)` — Primary key. Type: [Tuple()](/docs/en/sql-reference/data-types/tuple.md)
- `index_granularity` — The granularity of an index. The number of data rows between the “marks” of an index. The value 8192 is appropriate for most tasks. - `index_granularity` — The granularity of an index. The number of data rows between the “marks” of an index. The value 8192 is appropriate for most tasks.
**Example** **Example**
@ -262,7 +262,7 @@ Sparse indexes allow you to work with a very large number of table rows, because
ClickHouse does not require a unique primary key. You can insert multiple rows with the same primary key. ClickHouse does not require a unique primary key. You can insert multiple rows with the same primary key.
You can use `Nullable`-typed expressions in the `PRIMARY KEY` and `ORDER BY` clauses but it is strongly discouraged. To allow this feature, turn on the [allow_nullable_key](../../../operations/settings/settings.md#allow-nullable-key) setting. The [NULLS_LAST](../../../sql-reference/statements/select/order-by.md#sorting-of-special-values) principle applies for `NULL` values in the `ORDER BY` clause. You can use `Nullable`-typed expressions in the `PRIMARY KEY` and `ORDER BY` clauses but it is strongly discouraged. To allow this feature, turn on the [allow_nullable_key](/docs/en/operations/settings/settings.md/#allow-nullable-key) setting. The [NULLS_LAST](/docs/en/sql-reference/statements/select/order-by.md/#sorting-of-special-values) principle applies for `NULL` values in the `ORDER BY` clause.
### Selecting the Primary Key {#selecting-the-primary-key} ### Selecting the Primary Key {#selecting-the-primary-key}
@ -279,26 +279,26 @@ The number of columns in the primary key is not explicitly limited. Depending on
ClickHouse sorts data by primary key, so the higher the consistency, the better the compression. ClickHouse sorts data by primary key, so the higher the consistency, the better the compression.
- Provide additional logic when merging data parts in the [CollapsingMergeTree](../../../engines/table-engines/mergetree-family/collapsingmergetree.md#table_engine-collapsingmergetree) and [SummingMergeTree](../../../engines/table-engines/mergetree-family/summingmergetree.md) engines. - Provide additional logic when merging data parts in the [CollapsingMergeTree](/docs/en/engines/table-engines/mergetree-family/collapsingmergetree.md/#table_engine-collapsingmergetree) and [SummingMergeTree](/docs/en/engines/table-engines/mergetree-family/summingmergetree.md) engines.
In this case it makes sense to specify the *sorting key* that is different from the primary key. In this case it makes sense to specify the *sorting key* that is different from the primary key.
A long primary key will negatively affect the insert performance and memory consumption, but extra columns in the primary key do not affect ClickHouse performance during `SELECT` queries. A long primary key will negatively affect the insert performance and memory consumption, but extra columns in the primary key do not affect ClickHouse performance during `SELECT` queries.
You can create a table without a primary key using the `ORDER BY tuple()` syntax. In this case, ClickHouse stores data in the order of inserting. If you want to save data order when inserting data by `INSERT ... SELECT` queries, set [max_insert_threads = 1](../../../operations/settings/settings.md#settings-max-insert-threads). You can create a table without a primary key using the `ORDER BY tuple()` syntax. In this case, ClickHouse stores data in the order of inserting. If you want to save data order when inserting data by `INSERT ... SELECT` queries, set [max_insert_threads = 1](/docs/en/operations/settings/settings.md/#settings-max-insert-threads).
To select data in the initial order, use [single-threaded](../../../operations/settings/settings.md#settings-max_threads) `SELECT` queries. To select data in the initial order, use [single-threaded](/docs/en/operations/settings/settings.md/#settings-max_threads) `SELECT` queries.
### Choosing a Primary Key that Differs from the Sorting Key {#choosing-a-primary-key-that-differs-from-the-sorting-key} ### Choosing a Primary Key that Differs from the Sorting Key {#choosing-a-primary-key-that-differs-from-the-sorting-key}
It is possible to specify a primary key (an expression with values that are written in the index file for each mark) that is different from the sorting key (an expression for sorting the rows in data parts). In this case the primary key expression tuple must be a prefix of the sorting key expression tuple. It is possible to specify a primary key (an expression with values that are written in the index file for each mark) that is different from the sorting key (an expression for sorting the rows in data parts). In this case the primary key expression tuple must be a prefix of the sorting key expression tuple.
This feature is helpful when using the [SummingMergeTree](../../../engines/table-engines/mergetree-family/summingmergetree.md) and This feature is helpful when using the [SummingMergeTree](/docs/en/engines/table-engines/mergetree-family/summingmergetree.md) and
[AggregatingMergeTree](../../../engines/table-engines/mergetree-family/aggregatingmergetree.md) table engines. In a common case when using these engines, the table has two types of columns: *dimensions* and *measures*. Typical queries aggregate values of measure columns with arbitrary `GROUP BY` and filtering by dimensions. Because SummingMergeTree and AggregatingMergeTree aggregate rows with the same value of the sorting key, it is natural to add all dimensions to it. As a result, the key expression consists of a long list of columns and this list must be frequently updated with newly added dimensions. [AggregatingMergeTree](/docs/en/engines/table-engines/mergetree-family/aggregatingmergetree.md) table engines. In a common case when using these engines, the table has two types of columns: *dimensions* and *measures*. Typical queries aggregate values of measure columns with arbitrary `GROUP BY` and filtering by dimensions. Because SummingMergeTree and AggregatingMergeTree aggregate rows with the same value of the sorting key, it is natural to add all dimensions to it. As a result, the key expression consists of a long list of columns and this list must be frequently updated with newly added dimensions.
In this case it makes sense to leave only a few columns in the primary key that will provide efficient range scans and add the remaining dimension columns to the sorting key tuple. In this case it makes sense to leave only a few columns in the primary key that will provide efficient range scans and add the remaining dimension columns to the sorting key tuple.
[ALTER](../../../sql-reference/statements/alter/index.md) of the sorting key is a lightweight operation because when a new column is simultaneously added to the table and to the sorting key, existing data parts do not need to be changed. Since the old sorting key is a prefix of the new sorting key and there is no data in the newly added column, the data is sorted by both the old and new sorting keys at the moment of table modification. [ALTER](/docs/en/sql-reference/statements/alter/index.md) of the sorting key is a lightweight operation because when a new column is simultaneously added to the table and to the sorting key, existing data parts do not need to be changed. Since the old sorting key is a prefix of the new sorting key and there is no data in the newly added column, the data is sorted by both the old and new sorting keys at the moment of table modification.
### Use of Indexes and Partitions in Queries {#use-of-indexes-and-partitions-in-queries} ### Use of Indexes and Partitions in Queries {#use-of-indexes-and-partitions-in-queries}
@ -342,7 +342,7 @@ In the example below, the index cant be used.
SELECT count() FROM table WHERE CounterID = 34 OR URL LIKE '%upyachka%' SELECT count() FROM table WHERE CounterID = 34 OR URL LIKE '%upyachka%'
``` ```
To check whether ClickHouse can use the index when running a query, use the settings [force_index_by_date](../../../operations/settings/settings.md#settings-force_index_by_date) and [force_primary_key](../../../operations/settings/settings.md#force-primary-key). To check whether ClickHouse can use the index when running a query, use the settings [force_index_by_date](/docs/en/operations/settings/settings.md/#settings-force_index_by_date) and [force_primary_key](/docs/en/operations/settings/settings.md/#force-primary-key).
The key for partitioning by month allows reading only those data blocks which contain dates from the proper range. In this case, the data block may contain data for many dates (up to an entire month). Within a block, data is sorted by primary key, which might not contain the date as the first column. Because of this, using a query with only a date condition that does not specify the primary key prefix will cause more data to be read than for a single date. The key for partitioning by month allows reading only those data blocks which contain dates from the proper range. In this case, the data block may contain data for many dates (up to an entire month). Within a block, data is sorted by primary key, which might not contain the date as the first column. Because of this, using a query with only a date condition that does not specify the primary key prefix will cause more data to be read than for a single date.
@ -400,7 +400,7 @@ Stores unique values of the specified expression (no more than `max_rows` rows,
#### `ngrambf_v1(n, size_of_bloom_filter_in_bytes, number_of_hash_functions, random_seed)` #### `ngrambf_v1(n, size_of_bloom_filter_in_bytes, number_of_hash_functions, random_seed)`
Stores a [Bloom filter](https://en.wikipedia.org/wiki/Bloom_filter) that contains all ngrams from a block of data. Works only with datatypes: [String](../../../sql-reference/data-types/string.md), [FixedString](../../../sql-reference/data-types/fixedstring.md) and [Map](../../../sql-reference/data-types/map.md). Can be used for optimization of `EQUALS`, `LIKE` and `IN` expressions. Stores a [Bloom filter](https://en.wikipedia.org/wiki/Bloom_filter) that contains all ngrams from a block of data. Works only with datatypes: [String](/docs/en/sql-reference/data-types/string.md), [FixedString](/docs/en/sql-reference/data-types/fixedstring.md) and [Map](/docs/en/sql-reference/data-types/map.md). Can be used for optimization of `EQUALS`, `LIKE` and `IN` expressions.
- `n` — ngram size, - `n` — ngram size,
- `size_of_bloom_filter_in_bytes` — Bloom filter size in bytes (you can use large values here, for example, 256 or 512, because it can be compressed well). - `size_of_bloom_filter_in_bytes` — Bloom filter size in bytes (you can use large values here, for example, 256 or 512, because it can be compressed well).
@ -417,11 +417,11 @@ The optional `false_positive` parameter is the probability of receiving a false
Supported data types: `Int*`, `UInt*`, `Float*`, `Enum`, `Date`, `DateTime`, `String`, `FixedString`, `Array`, `LowCardinality`, `Nullable`, `UUID`, `Map`. Supported data types: `Int*`, `UInt*`, `Float*`, `Enum`, `Date`, `DateTime`, `String`, `FixedString`, `Array`, `LowCardinality`, `Nullable`, `UUID`, `Map`.
For `Map` data type client can specify if index should be created for keys or values using [mapKeys](../../../sql-reference/functions/tuple-map-functions.md#mapkeys) or [mapValues](../../../sql-reference/functions/tuple-map-functions.md#mapvalues) function. For `Map` data type client can specify if index should be created for keys or values using [mapKeys](/docs/en/sql-reference/functions/tuple-map-functions.md/#mapkeys) or [mapValues](/docs/en/sql-reference/functions/tuple-map-functions.md/#mapvalues) function.
There are also special-purpose and experimental indexes to support approximate nearest neighbor (ANN) queries. See [here](annindexes.md) for details. There are also special-purpose and experimental indexes to support approximate nearest neighbor (ANN) queries. See [here](annindexes.md) for details.
The following functions can use the filter: [equals](../../../sql-reference/functions/comparison-functions.md), [notEquals](../../../sql-reference/functions/comparison-functions.md), [in](../../../sql-reference/functions/in-functions), [notIn](../../../sql-reference/functions/in-functions), [has](../../../sql-reference/functions/array-functions#hasarr-elem), [hasAny](../../../sql-reference/functions/array-functions#hasany), [hasAll](../../../sql-reference/functions/array-functions#hasall). The following functions can use the filter: [equals](/docs/en/sql-reference/functions/comparison-functions.md), [notEquals](/docs/en/sql-reference/functions/comparison-functions.md), [in](/docs/en/sql-reference/functions/in-functions), [notIn](/docs/en/sql-reference/functions/in-functions), [has](/docs/en/sql-reference/functions/array-functions#hasarr-elem), [hasAny](/docs/en/sql-reference/functions/array-functions#hasany), [hasAll](/docs/en/sql-reference/functions/array-functions#hasall).
Example of index creation for `Map` data type Example of index creation for `Map` data type
@ -445,21 +445,21 @@ The `set` index can be used with all functions. Function subsets for other index
| Function (operator) / Index | primary key | minmax | ngrambf_v1 | tokenbf_v1 | bloom_filter | | Function (operator) / Index | primary key | minmax | ngrambf_v1 | tokenbf_v1 | bloom_filter |
|------------------------------------------------------------------------------------------------------------|-------------|--------|-------------|-------------|---------------| |------------------------------------------------------------------------------------------------------------|-------------|--------|-------------|-------------|---------------|
| [equals (=, ==)](../../../sql-reference/functions/comparison-functions.md#function-equals) | ✔ | ✔ | ✔ | ✔ | ✔ | | [equals (=, ==)](/docs/en/sql-reference/functions/comparison-functions.md/#function-equals) | ✔ | ✔ | ✔ | ✔ | ✔ |
| [notEquals(!=, &lt;&gt;)](../../../sql-reference/functions/comparison-functions.md#function-notequals) | ✔ | ✔ | ✔ | ✔ | ✔ | | [notEquals(!=, &lt;&gt;)](/docs/en/sql-reference/functions/comparison-functions.md/#function-notequals) | ✔ | ✔ | ✔ | ✔ | ✔ |
| [like](../../../sql-reference/functions/string-search-functions.md#function-like) | ✔ | ✔ | ✔ | ✔ | ✗ | | [like](/docs/en/sql-reference/functions/string-search-functions.md/#function-like) | ✔ | ✔ | ✔ | ✔ | ✗ |
| [notLike](../../../sql-reference/functions/string-search-functions.md#function-notlike) | ✔ | ✔ | ✔ | ✔ | ✗ | | [notLike](/docs/en/sql-reference/functions/string-search-functions.md/#function-notlike) | ✔ | ✔ | ✔ | ✔ | ✗ |
| [startsWith](../../../sql-reference/functions/string-functions.md#startswith) | ✔ | ✔ | ✔ | ✔ | ✗ | | [startsWith](/docs/en/sql-reference/functions/string-functions.md/#startswith) | ✔ | ✔ | ✔ | ✔ | ✗ |
| [endsWith](../../../sql-reference/functions/string-functions.md#endswith) | ✗ | ✗ | ✔ | ✔ | ✗ | | [endsWith](/docs/en/sql-reference/functions/string-functions.md/#endswith) | ✗ | ✗ | ✔ | ✔ | ✗ |
| [multiSearchAny](../../../sql-reference/functions/string-search-functions.md#function-multisearchany) | ✗ | ✗ | ✔ | ✗ | ✗ | | [multiSearchAny](/docs/en/sql-reference/functions/string-search-functions.md/#function-multisearchany) | ✗ | ✗ | ✔ | ✗ | ✗ |
| [in](../../../sql-reference/functions/in-functions#in-functions) | ✔ | ✔ | ✔ | ✔ | ✔ | | [in](/docs/en/sql-reference/functions/in-functions#in-functions) | ✔ | ✔ | ✔ | ✔ | ✔ |
| [notIn](../../../sql-reference/functions/in-functions#in-functions) | ✔ | ✔ | ✔ | ✔ | ✔ | | [notIn](/docs/en/sql-reference/functions/in-functions#in-functions) | ✔ | ✔ | ✔ | ✔ | ✔ |
| [less (<)](../../../sql-reference/functions/comparison-functions.md#function-less) | ✔ | ✔ | ✗ | ✗ | ✗ | | [less (<)](/docs/en/sql-reference/functions/comparison-functions.md/#function-less) | ✔ | ✔ | ✗ | ✗ | ✗ |
| [greater (>)](../../../sql-reference/functions/comparison-functions.md#function-greater) | ✔ | ✔ | ✗ | ✗ | ✗ | | [greater (>)](/docs/en/sql-reference/functions/comparison-functions.md/#function-greater) | ✔ | ✔ | ✗ | ✗ | ✗ |
| [lessOrEquals (<=)](../../../sql-reference/functions/comparison-functions.md#function-lessorequals) | ✔ | ✔ | ✗ | ✗ | ✗ | | [lessOrEquals (<=)](/docs/en/sql-reference/functions/comparison-functions.md/#function-lessorequals) | ✔ | ✔ | ✗ | ✗ | ✗ |
| [greaterOrEquals (>=)](../../../sql-reference/functions/comparison-functions.md#function-greaterorequals) | ✔ | ✔ | ✗ | ✗ | ✗ | | [greaterOrEquals (>=)](/docs/en/sql-reference/functions/comparison-functions.md/#function-greaterorequals) | ✔ | ✔ | ✗ | ✗ | ✗ |
| [empty](../../../sql-reference/functions/array-functions#function-empty) | ✔ | ✔ | ✗ | ✗ | ✗ | | [empty](/docs/en/sql-reference/functions/array-functions#function-empty) | ✔ | ✔ | ✗ | ✗ | ✗ |
| [notEmpty](../../../sql-reference/functions/array-functions#function-notempty) | ✔ | ✔ | ✗ | ✗ | ✗ | | [notEmpty](/docs/en/sql-reference/functions/array-functions#function-notempty) | ✔ | ✔ | ✗ | ✗ | ✗ |
| hasToken | ✗ | ✗ | ✗ | ✔ | ✗ | | hasToken | ✗ | ✗ | ✗ | ✔ | ✗ |
Functions with a constant argument that is less than ngram size cant be used by `ngrambf_v1` for query optimization. Functions with a constant argument that is less than ngram size cant be used by `ngrambf_v1` for query optimization.
@ -485,16 +485,16 @@ For example:
## Approximate Nearest Neighbor Search Indexes [experimental] {#table_engines-ANNIndex} ## Approximate Nearest Neighbor Search Indexes [experimental] {#table_engines-ANNIndex}
In addition to skip indices, there are also [Approximate Nearest Neighbor Search Indexes](../../../engines/table-engines/mergetree-family/annindexes.md). In addition to skip indices, there are also [Approximate Nearest Neighbor Search Indexes](/docs/en/engines/table-engines/mergetree-family/annindexes.md).
## Projections {#projections} ## Projections {#projections}
Projections are like [materialized views](../../../sql-reference/statements/create/view.md#materialized) but defined in part-level. It provides consistency guarantees along with automatic usage in queries. Projections are like [materialized views](/docs/en/sql-reference/statements/create/view.md/#materialized) but defined in part-level. It provides consistency guarantees along with automatic usage in queries.
:::note :::note
When you are implementing projections you should also consider the [force_optimize_projection](../../../operations/settings/settings.md#force-optimize-projection) setting. When you are implementing projections you should also consider the [force_optimize_projection](/docs/en/operations/settings/settings.md/#force-optimize-projection) setting.
::: :::
Projections are not supported in the `SELECT` statements with the [FINAL](../../../sql-reference/statements/select/from.md#select-from-final) modifier. Projections are not supported in the `SELECT` statements with the [FINAL](/docs/en/sql-reference/statements/select/from.md/#select-from-final) modifier.
### Projection Query {#projection-query} ### Projection Query {#projection-query}
A projection query is what defines a projection. It implicitly selects data from the parent table. A projection query is what defines a projection. It implicitly selects data from the parent table.
@ -504,7 +504,7 @@ A projection query is what defines a projection. It implicitly selects data from
SELECT <column list expr> [GROUP BY] <group keys expr> [ORDER BY] <expr> SELECT <column list expr> [GROUP BY] <group keys expr> [ORDER BY] <expr>
``` ```
Projections can be modified or dropped with the [ALTER](../../../sql-reference/statements/alter/projection.md) statement. Projections can be modified or dropped with the [ALTER](/docs/en/sql-reference/statements/alter/projection.md) statement.
### Projection Storage {#projection-storage} ### Projection Storage {#projection-storage}
Projections are stored inside the part directory. It's similar to an index but contains a subdirectory that stores an anonymous `MergeTree` table's part. The table is induced by the definition query of the projection. If there is a `GROUP BY` clause, the underlying storage engine becomes [AggregatingMergeTree](aggregatingmergetree.md), and all aggregate functions are converted to `AggregateFunction`. If there is an `ORDER BY` clause, the `MergeTree` table uses it as its primary key expression. During the merge process the projection part is merged via its storage's merge routine. The checksum of the parent table's part is combined with the projection's part. Other maintenance jobs are similar to skip indices. Projections are stored inside the part directory. It's similar to an index but contains a subdirectory that stores an anonymous `MergeTree` table's part. The table is induced by the definition query of the projection. If there is a `GROUP BY` clause, the underlying storage engine becomes [AggregatingMergeTree](aggregatingmergetree.md), and all aggregate functions are converted to `AggregateFunction`. If there is an `ORDER BY` clause, the `MergeTree` table uses it as its primary key expression. During the merge process the projection part is merged via its storage's merge routine. The checksum of the parent table's part is combined with the projection's part. Other maintenance jobs are similar to skip indices.
@ -526,7 +526,7 @@ Determines the lifetime of values.
The `TTL` clause can be set for the whole table and for each individual column. Table-level `TTL` can also specify the logic of automatic moving data between disks and volumes, or recompressing parts where all the data has been expired. The `TTL` clause can be set for the whole table and for each individual column. Table-level `TTL` can also specify the logic of automatic moving data between disks and volumes, or recompressing parts where all the data has been expired.
Expressions must evaluate to [Date](../../../sql-reference/data-types/date.md) or [DateTime](../../../sql-reference/data-types/datetime.md) data type. Expressions must evaluate to [Date](/docs/en/sql-reference/data-types/date.md) or [DateTime](/docs/en/sql-reference/data-types/datetime.md) data type.
**Syntax** **Syntax**
@ -537,7 +537,7 @@ TTL time_column
TTL time_column + interval TTL time_column + interval
``` ```
To define `interval`, use [time interval](../../../sql-reference/operators/index.md#operators-datetime) operators, for example: To define `interval`, use [time interval](/docs/en/sql-reference/operators/index.md/#operators-datetime) operators, for example:
``` sql ``` sql
TTL date_time + INTERVAL 1 MONTH TTL date_time + INTERVAL 1 MONTH
@ -684,11 +684,11 @@ Data with an expired `TTL` is removed when ClickHouse merges data parts.
When ClickHouse detects that data is expired, it performs an off-schedule merge. To control the frequency of such merges, you can set `merge_with_ttl_timeout`. If the value is too low, it will perform many off-schedule merges that may consume a lot of resources. When ClickHouse detects that data is expired, it performs an off-schedule merge. To control the frequency of such merges, you can set `merge_with_ttl_timeout`. If the value is too low, it will perform many off-schedule merges that may consume a lot of resources.
If you perform the `SELECT` query between merges, you may get expired data. To avoid it, use the [OPTIMIZE](../../../sql-reference/statements/optimize.md) query before `SELECT`. If you perform the `SELECT` query between merges, you may get expired data. To avoid it, use the [OPTIMIZE](/docs/en/sql-reference/statements/optimize.md) query before `SELECT`.
**See Also** **See Also**
- [ttl_only_drop_parts](../../../operations/settings/settings.md#ttl_only_drop_parts) setting - [ttl_only_drop_parts](/docs/en/operations/settings/settings.md/#ttl_only_drop_parts) setting
## Using Multiple Block Devices for Data Storage {#table_engine-mergetree-multiple-volumes} ## Using Multiple Block Devices for Data Storage {#table_engine-mergetree-multiple-volumes}
@ -697,16 +697,16 @@ If you perform the `SELECT` query between merges, you may get expired data. To a
`MergeTree` family table engines can store data on multiple block devices. For example, it can be useful when the data of a certain table are implicitly split into “hot” and “cold”. The most recent data is regularly requested but requires only a small amount of space. On the contrary, the fat-tailed historical data is requested rarely. If several disks are available, the “hot” data may be located on fast disks (for example, NVMe SSDs or in memory), while the “cold” data - on relatively slow ones (for example, HDD). `MergeTree` family table engines can store data on multiple block devices. For example, it can be useful when the data of a certain table are implicitly split into “hot” and “cold”. The most recent data is regularly requested but requires only a small amount of space. On the contrary, the fat-tailed historical data is requested rarely. If several disks are available, the “hot” data may be located on fast disks (for example, NVMe SSDs or in memory), while the “cold” data - on relatively slow ones (for example, HDD).
Data part is the minimum movable unit for `MergeTree`-engine tables. The data belonging to one part are stored on one disk. Data parts can be moved between disks in the background (according to user settings) as well as by means of the [ALTER](../../../sql-reference/statements/alter/partition.md#alter_move-partition) queries. Data part is the minimum movable unit for `MergeTree`-engine tables. The data belonging to one part are stored on one disk. Data parts can be moved between disks in the background (according to user settings) as well as by means of the [ALTER](/docs/en/sql-reference/statements/alter/partition.md/#alter_move-partition) queries.
### Terms {#terms} ### Terms {#terms}
- Disk — Block device mounted to the filesystem. - Disk — Block device mounted to the filesystem.
- Default disk — Disk that stores the path specified in the [path](../../../operations/server-configuration-parameters/settings.md#server_configuration_parameters-path) server setting. - Default disk — Disk that stores the path specified in the [path](/docs/en/operations/server-configuration-parameters/settings.md/#server_configuration_parameters-path) server setting.
- Volume — Ordered set of equal disks (similar to [JBOD](https://en.wikipedia.org/wiki/Non-RAID_drive_architectures)). - Volume — Ordered set of equal disks (similar to [JBOD](https://en.wikipedia.org/wiki/Non-RAID_drive_architectures)).
- Storage policy — Set of volumes and the rules for moving data between them. - Storage policy — Set of volumes and the rules for moving data between them.
The names given to the described entities can be found in the system tables, [system.storage_policies](../../../operations/system-tables/storage_policies.md#system_tables-storage_policies) and [system.disks](../../../operations/system-tables/disks.md#system_tables-disks). To apply one of the configured storage policies for a table, use the `storage_policy` setting of `MergeTree`-engine family tables. The names given to the described entities can be found in the system tables, [system.storage_policies](/docs/en/operations/system-tables/storage_policies.md/#system_tables-storage_policies) and [system.disks](/docs/en/operations/system-tables/disks.md/#system_tables-disks). To apply one of the configured storage policies for a table, use the `storage_policy` setting of `MergeTree`-engine family tables.
### Configuration {#table_engine-mergetree-multiple-volumes_configure} ### Configuration {#table_engine-mergetree-multiple-volumes_configure}
@ -853,16 +853,16 @@ SETTINGS storage_policy = 'moving_from_ssd_to_hdd'
The `default` storage policy implies using only one volume, which consists of only one disk given in `<path>`. The `default` storage policy implies using only one volume, which consists of only one disk given in `<path>`.
You could change storage policy after table creation with [ALTER TABLE ... MODIFY SETTING] query, new policy should include all old disks and volumes with same names. You could change storage policy after table creation with [ALTER TABLE ... MODIFY SETTING] query, new policy should include all old disks and volumes with same names.
The number of threads performing background moves of data parts can be changed by [background_move_pool_size](../../../operations/settings/settings.md#background_move_pool_size) setting. The number of threads performing background moves of data parts can be changed by [background_move_pool_size](/docs/en/operations/settings/settings.md/#background_move_pool_size) setting.
### Details {#details} ### Details {#details}
In the case of `MergeTree` tables, data is getting to disk in different ways: In the case of `MergeTree` tables, data is getting to disk in different ways:
- As a result of an insert (`INSERT` query). - As a result of an insert (`INSERT` query).
- During background merges and [mutations](../../../sql-reference/statements/alter/index.md#alter-mutations). - During background merges and [mutations](/docs/en/sql-reference/statements/alter/index.md/#alter-mutations).
- When downloading from another replica. - When downloading from another replica.
- As a result of partition freezing [ALTER TABLE … FREEZE PARTITION](../../../sql-reference/statements/alter/partition.md#alter_freeze-partition). - As a result of partition freezing [ALTER TABLE … FREEZE PARTITION](/docs/en/sql-reference/statements/alter/partition.md/#alter_freeze-partition).
In all these cases except for mutations and partition freezing, a part is stored on a volume and a disk according to the given storage policy: In all these cases except for mutations and partition freezing, a part is stored on a volume and a disk according to the given storage policy:
@ -872,16 +872,16 @@ In all these cases except for mutations and partition freezing, a part is stored
Under the hood, mutations and partition freezing make use of [hard links](https://en.wikipedia.org/wiki/Hard_link). Hard links between different disks are not supported, therefore in such cases the resulting parts are stored on the same disks as the initial ones. Under the hood, mutations and partition freezing make use of [hard links](https://en.wikipedia.org/wiki/Hard_link). Hard links between different disks are not supported, therefore in such cases the resulting parts are stored on the same disks as the initial ones.
In the background, parts are moved between volumes on the basis of the amount of free space (`move_factor` parameter) according to the order the volumes are declared in the configuration file. In the background, parts are moved between volumes on the basis of the amount of free space (`move_factor` parameter) according to the order the volumes are declared in the configuration file.
Data is never transferred from the last one and into the first one. One may use system tables [system.part_log](../../../operations/system-tables/part_log.md#system_tables-part-log) (field `type = MOVE_PART`) and [system.parts](../../../operations/system-tables/parts.md#system_tables-parts) (fields `path` and `disk`) to monitor background moves. Also, the detailed information can be found in server logs. Data is never transferred from the last one and into the first one. One may use system tables [system.part_log](/docs/en/operations/system-tables/part_log.md/#system_tables-part-log) (field `type = MOVE_PART`) and [system.parts](/docs/en/operations/system-tables/parts.md/#system_tables-parts) (fields `path` and `disk`) to monitor background moves. Also, the detailed information can be found in server logs.
User can force moving a part or a partition from one volume to another using the query [ALTER TABLE … MOVE PART\|PARTITION … TO VOLUME\|DISK …](../../../sql-reference/statements/alter/partition.md#alter_move-partition), all the restrictions for background operations are taken into account. The query initiates a move on its own and does not wait for background operations to be completed. User will get an error message if not enough free space is available or if any of the required conditions are not met. User can force moving a part or a partition from one volume to another using the query [ALTER TABLE … MOVE PART\|PARTITION … TO VOLUME\|DISK …](/docs/en/sql-reference/statements/alter/partition.md/#alter_move-partition), all the restrictions for background operations are taken into account. The query initiates a move on its own and does not wait for background operations to be completed. User will get an error message if not enough free space is available or if any of the required conditions are not met.
Moving data does not interfere with data replication. Therefore, different storage policies can be specified for the same table on different replicas. Moving data does not interfere with data replication. Therefore, different storage policies can be specified for the same table on different replicas.
After the completion of background merges and mutations, old parts are removed only after a certain amount of time (`old_parts_lifetime`). After the completion of background merges and mutations, old parts are removed only after a certain amount of time (`old_parts_lifetime`).
During this time, they are not moved to other volumes or disks. Therefore, until the parts are finally removed, they are still taken into account for evaluation of the occupied disk space. During this time, they are not moved to other volumes or disks. Therefore, until the parts are finally removed, they are still taken into account for evaluation of the occupied disk space.
User can assign new big parts to different disks of a [JBOD](https://en.wikipedia.org/wiki/Non-RAID_drive_architectures) volume in a balanced way using the [min_bytes_to_rebalance_partition_over_jbod](../../../operations/settings/merge-tree-settings.md#min-bytes-to-rebalance-partition-over-jbod) setting. User can assign new big parts to different disks of a [JBOD](https://en.wikipedia.org/wiki/Non-RAID_drive_architectures) volume in a balanced way using the [min_bytes_to_rebalance_partition_over_jbod](/docs/en/operations/settings/merge-tree-settings.md/#min-bytes-to-rebalance-partition-over-jbod) setting.
## Using S3 for Data Storage {#table_engine-mergetree-s3} ## Using S3 for Data Storage {#table_engine-mergetree-s3}

View File

@ -20,7 +20,7 @@ Replication works at the level of an individual table, not the entire server. A
Replication does not depend on sharding. Each shard has its own independent replication. Replication does not depend on sharding. Each shard has its own independent replication.
Compressed data for `INSERT` and `ALTER` queries is replicated (for more information, see the documentation for [ALTER](../../../sql-reference/statements/alter/index.md#query_language_queries_alter)). Compressed data for `INSERT` and `ALTER` queries is replicated (for more information, see the documentation for [ALTER](/docs/en/sql-reference/statements/alter/index.md/#query_language_queries_alter)).
`CREATE`, `DROP`, `ATTACH`, `DETACH` and `RENAME` queries are executed on a single server and are not replicated: `CREATE`, `DROP`, `ATTACH`, `DETACH` and `RENAME` queries are executed on a single server and are not replicated:
@ -28,9 +28,9 @@ Compressed data for `INSERT` and `ALTER` queries is replicated (for more informa
- The `DROP TABLE` query deletes the replica located on the server where the query is run. - The `DROP TABLE` query deletes the replica located on the server where the query is run.
- The `RENAME` query renames the table on one of the replicas. In other words, replicated tables can have different names on different replicas. - The `RENAME` query renames the table on one of the replicas. In other words, replicated tables can have different names on different replicas.
ClickHouse uses [ClickHouse Keeper](../../../guides/sre/keeper/clickhouse-keeper.md) for storing replicas meta information. It is possible to use ZooKeeper version 3.4.5 or newer, but ClickHouse Keeper is recommended. ClickHouse uses [ClickHouse Keeper](/docs/en/guides/sre/keeper/clickhouse-keeper.md) for storing replicas meta information. It is possible to use ZooKeeper version 3.4.5 or newer, but ClickHouse Keeper is recommended.
To use replication, set parameters in the [zookeeper](../../../operations/server-configuration-parameters/settings.md#server-settings_zookeeper) server configuration section. To use replication, set parameters in the [zookeeper](/docs/en/operations/server-configuration-parameters/settings.md/#server-settings_zookeeper) server configuration section.
:::warning :::warning
Dont neglect the security setting. ClickHouse supports the `digest` [ACL scheme](https://zookeeper.apache.org/doc/current/zookeeperProgrammers.html#sc_ZooKeeperAccessControl) of the ZooKeeper security subsystem. Dont neglect the security setting. ClickHouse supports the `digest` [ACL scheme](https://zookeeper.apache.org/doc/current/zookeeperProgrammers.html#sc_ZooKeeperAccessControl) of the ZooKeeper security subsystem.
@ -95,21 +95,21 @@ You can specify any existing ZooKeeper cluster and the system will use a directo
If ZooKeeper isnt set in the config file, you cant create replicated tables, and any existing replicated tables will be read-only. If ZooKeeper isnt set in the config file, you cant create replicated tables, and any existing replicated tables will be read-only.
ZooKeeper is not used in `SELECT` queries because replication does not affect the performance of `SELECT` and queries run just as fast as they do for non-replicated tables. When querying distributed replicated tables, ClickHouse behavior is controlled by the settings [max_replica_delay_for_distributed_queries](../../../operations/settings/settings.md#settings-max_replica_delay_for_distributed_queries) and [fallback_to_stale_replicas_for_distributed_queries](../../../operations/settings/settings.md#settings-fallback_to_stale_replicas_for_distributed_queries). ZooKeeper is not used in `SELECT` queries because replication does not affect the performance of `SELECT` and queries run just as fast as they do for non-replicated tables. When querying distributed replicated tables, ClickHouse behavior is controlled by the settings [max_replica_delay_for_distributed_queries](/docs/en/operations/settings/settings.md/#settings-max_replica_delay_for_distributed_queries) and [fallback_to_stale_replicas_for_distributed_queries](/docs/en/operations/settings/settings.md/#settings-fallback_to_stale_replicas_for_distributed_queries).
For each `INSERT` query, approximately ten entries are added to ZooKeeper through several transactions. (To be more precise, this is for each inserted block of data; an INSERT query contains one block or one block per `max_insert_block_size = 1048576` rows.) This leads to slightly longer latencies for `INSERT` compared to non-replicated tables. But if you follow the recommendations to insert data in batches of no more than one `INSERT` per second, it does not create any problems. The entire ClickHouse cluster used for coordinating one ZooKeeper cluster has a total of several hundred `INSERTs` per second. The throughput on data inserts (the number of rows per second) is just as high as for non-replicated data. For each `INSERT` query, approximately ten entries are added to ZooKeeper through several transactions. (To be more precise, this is for each inserted block of data; an INSERT query contains one block or one block per `max_insert_block_size = 1048576` rows.) This leads to slightly longer latencies for `INSERT` compared to non-replicated tables. But if you follow the recommendations to insert data in batches of no more than one `INSERT` per second, it does not create any problems. The entire ClickHouse cluster used for coordinating one ZooKeeper cluster has a total of several hundred `INSERTs` per second. The throughput on data inserts (the number of rows per second) is just as high as for non-replicated data.
For very large clusters, you can use different ZooKeeper clusters for different shards. However, from our experience this has not proven necessary based on production clusters with approximately 300 servers. For very large clusters, you can use different ZooKeeper clusters for different shards. However, from our experience this has not proven necessary based on production clusters with approximately 300 servers.
Replication is asynchronous and multi-master. `INSERT` queries (as well as `ALTER`) can be sent to any available server. Data is inserted on the server where the query is run, and then it is copied to the other servers. Because it is asynchronous, recently inserted data appears on the other replicas with some latency. If part of the replicas are not available, the data is written when they become available. If a replica is available, the latency is the amount of time it takes to transfer the block of compressed data over the network. The number of threads performing background tasks for replicated tables can be set by [background_schedule_pool_size](../../../operations/settings/settings.md#background_schedule_pool_size) setting. Replication is asynchronous and multi-master. `INSERT` queries (as well as `ALTER`) can be sent to any available server. Data is inserted on the server where the query is run, and then it is copied to the other servers. Because it is asynchronous, recently inserted data appears on the other replicas with some latency. If part of the replicas are not available, the data is written when they become available. If a replica is available, the latency is the amount of time it takes to transfer the block of compressed data over the network. The number of threads performing background tasks for replicated tables can be set by [background_schedule_pool_size](/docs/en/operations/settings/settings.md/#background_schedule_pool_size) setting.
`ReplicatedMergeTree` engine uses a separate thread pool for replicated fetches. Size of the pool is limited by the [background_fetches_pool_size](../../../operations/settings/settings.md#background_fetches_pool_size) setting which can be tuned with a server restart. `ReplicatedMergeTree` engine uses a separate thread pool for replicated fetches. Size of the pool is limited by the [background_fetches_pool_size](/docs/en/operations/settings/settings.md/#background_fetches_pool_size) setting which can be tuned with a server restart.
By default, an INSERT query waits for confirmation of writing the data from only one replica. If the data was successfully written to only one replica and the server with this replica ceases to exist, the stored data will be lost. To enable getting confirmation of data writes from multiple replicas, use the `insert_quorum` option. By default, an INSERT query waits for confirmation of writing the data from only one replica. If the data was successfully written to only one replica and the server with this replica ceases to exist, the stored data will be lost. To enable getting confirmation of data writes from multiple replicas, use the `insert_quorum` option.
Each block of data is written atomically. The INSERT query is divided into blocks up to `max_insert_block_size = 1048576` rows. In other words, if the `INSERT` query has less than 1048576 rows, it is made atomically. Each block of data is written atomically. The INSERT query is divided into blocks up to `max_insert_block_size = 1048576` rows. In other words, if the `INSERT` query has less than 1048576 rows, it is made atomically.
Data blocks are deduplicated. For multiple writes of the same data block (data blocks of the same size containing the same rows in the same order), the block is only written once. The reason for this is in case of network failures when the client application does not know if the data was written to the DB, so the `INSERT` query can simply be repeated. It does not matter which replica INSERTs were sent to with identical data. `INSERTs` are idempotent. Deduplication parameters are controlled by [merge_tree](../../../operations/server-configuration-parameters/settings.md#server_configuration_parameters-merge_tree) server settings. Data blocks are deduplicated. For multiple writes of the same data block (data blocks of the same size containing the same rows in the same order), the block is only written once. The reason for this is in case of network failures when the client application does not know if the data was written to the DB, so the `INSERT` query can simply be repeated. It does not matter which replica INSERTs were sent to with identical data. `INSERTs` are idempotent. Deduplication parameters are controlled by [merge_tree](/docs/en/operations/server-configuration-parameters/settings.md/#server_configuration_parameters-merge_tree) server settings.
During replication, only the source data to insert is transferred over the network. Further data transformation (merging) is coordinated and performed on all the replicas in the same way. This minimizes network usage, which means that replication works well when replicas reside in different datacenters. (Note that duplicating data in different datacenters is the main goal of replication.) During replication, only the source data to insert is transferred over the network. Further data transformation (merging) is coordinated and performed on all the replicas in the same way. This minimizes network usage, which means that replication works well when replicas reside in different datacenters. (Note that duplicating data in different datacenters is the main goal of replication.)
@ -165,7 +165,7 @@ CREATE TABLE table_name
</details> </details>
As the example shows, these parameters can contain substitutions in curly brackets. The substituted values are taken from the [macros](../../../operations/server-configuration-parameters/settings.md#macros) section of the configuration file. As the example shows, these parameters can contain substitutions in curly brackets. The substituted values are taken from the [macros](/docs/en/operations/server-configuration-parameters/settings.md/#macros) section of the configuration file.
Example: Example:
@ -295,10 +295,10 @@ If the data in ClickHouse Keeper was lost or damaged, you can save data by movin
**See Also** **See Also**
- [background_schedule_pool_size](../../../operations/settings/settings.md#background_schedule_pool_size) - [background_schedule_pool_size](/docs/en/operations/settings/settings.md/#background_schedule_pool_size)
- [background_fetches_pool_size](../../../operations/settings/settings.md#background_fetches_pool_size) - [background_fetches_pool_size](/docs/en/operations/settings/settings.md/#background_fetches_pool_size)
- [execute_merges_on_single_replica_time_threshold](../../../operations/settings/settings.md#execute-merges-on-single-replica-time-threshold) - [execute_merges_on_single_replica_time_threshold](/docs/en/operations/settings/settings.md/#execute-merges-on-single-replica-time-threshold)
- [max_replicated_fetches_network_bandwidth](../../../operations/settings/merge-tree-settings.md#max_replicated_fetches_network_bandwidth) - [max_replicated_fetches_network_bandwidth](/docs/en/operations/settings/merge-tree-settings.md/#max_replicated_fetches_network_bandwidth)
- [max_replicated_sends_network_bandwidth](../../../operations/settings/merge-tree-settings.md#max_replicated_sends_network_bandwidth) - [max_replicated_sends_network_bandwidth](/docs/en/operations/settings/merge-tree-settings.md/#max_replicated_sends_network_bandwidth)
[Original article](https://clickhouse.com/docs/en/operations/table_engines/replication/) <!--hide--> [Original article](https://clickhouse.com/docs/en/operations/table_engines/replication/) <!--hide-->

View File

@ -6,10 +6,10 @@ sidebar_label: Join
# Join Table Engine # Join Table Engine
Optional prepared data structure for usage in [JOIN](../../../sql-reference/statements/select/join.md#select-join) operations. Optional prepared data structure for usage in [JOIN](/docs/en/sql-reference/statements/select/join.md/#select-join) operations.
:::note :::note
This is not an article about the [JOIN clause](../../../sql-reference/statements/select/join.md#select-join) itself. This is not an article about the [JOIN clause](/docs/en/sql-reference/statements/select/join.md/#select-join) itself.
::: :::
## Creating a Table {#creating-a-table} ## Creating a Table {#creating-a-table}
@ -22,17 +22,17 @@ CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]
) ENGINE = Join(join_strictness, join_type, k1[, k2, ...]) ) ENGINE = Join(join_strictness, join_type, k1[, k2, ...])
``` ```
See the detailed description of the [CREATE TABLE](../../../sql-reference/statements/create/table.md#create-table-query) query. See the detailed description of the [CREATE TABLE](/docs/en/sql-reference/statements/create/table.md/#create-table-query) query.
## Engine Parameters ## Engine Parameters
### join_strictness ### join_strictness
`join_strictness` [JOIN strictness](../../../sql-reference/statements/select/join.md#select-join-types). `join_strictness` [JOIN strictness](/docs/en/sql-reference/statements/select/join.md/#select-join-types).
### join_type ### join_type
`join_type` [JOIN type](../../../sql-reference/statements/select/join.md#select-join-types). `join_type` [JOIN type](/docs/en/sql-reference/statements/select/join.md/#select-join-types).
### Key columns ### Key columns
@ -55,11 +55,11 @@ You can use `INSERT` queries to add data to the `Join`-engine tables. If the tab
Main use-cases for `Join`-engine tables are following: Main use-cases for `Join`-engine tables are following:
- Place the table to the right side in a `JOIN` clause. - Place the table to the right side in a `JOIN` clause.
- Call the [joinGet](../../../sql-reference/functions/other-functions.md#joinget) function, which lets you extract data from the table the same way as from a dictionary. - Call the [joinGet](/docs/en/sql-reference/functions/other-functions.md/#joinget) function, which lets you extract data from the table the same way as from a dictionary.
### Deleting Data {#deleting-data} ### Deleting Data {#deleting-data}
`ALTER DELETE` queries for `Join`-engine tables are implemented as [mutations](../../../sql-reference/statements/alter/index.md#mutations). `DELETE` mutation reads filtered data and overwrites data of memory and disk. `ALTER DELETE` queries for `Join`-engine tables are implemented as [mutations](/docs/en/sql-reference/statements/alter/index.md/#mutations). `DELETE` mutation reads filtered data and overwrites data of memory and disk.
### Limitations and Settings {#join-limitations-and-settings} ### Limitations and Settings {#join-limitations-and-settings}
@ -67,30 +67,30 @@ When creating a table, the following settings are applied:
#### join_use_nulls #### join_use_nulls
[join_use_nulls](../../../operations/settings/settings.md#join_use_nulls) [join_use_nulls](/docs/en/operations/settings/settings.md/#join_use_nulls)
#### max_rows_in_join #### max_rows_in_join
[max_rows_in_join](../../../operations/settings/query-complexity.md#settings-max_rows_in_join) [max_rows_in_join](/docs/en/operations/settings/query-complexity.md/#settings-max_rows_in_join)
#### max_bytes_in_join #### max_bytes_in_join
[max_bytes_in_join](../../../operations/settings/query-complexity.md#settings-max_bytes_in_join) [max_bytes_in_join](/docs/en/operations/settings/query-complexity.md/#settings-max_bytes_in_join)
#### join_overflow_mode #### join_overflow_mode
[join_overflow_mode](../../../operations/settings/query-complexity.md#settings-join_overflow_mode) [join_overflow_mode](/docs/en/operations/settings/query-complexity.md/#settings-join_overflow_mode)
#### join_any_take_last_row #### join_any_take_last_row
[join_any_take_last_row](../../../operations/settings/settings.md#settings-join_any_take_last_row) [join_any_take_last_row](/docs/en/operations/settings/settings.md/#settings-join_any_take_last_row)
#### join_use_nulls #### join_use_nulls
[persistent](../../../operations/settings/settings.md#persistent) [persistent](/docs/en/operations/settings/settings.md/#persistent)
The `Join`-engine tables cant be used in `GLOBAL JOIN` operations. The `Join`-engine tables cant be used in `GLOBAL JOIN` operations.
The `Join`-engine allows to specify [join_use_nulls](../../../operations/settings/settings.md#join_use_nulls) setting in the `CREATE TABLE` statement. [SELECT](../../../sql-reference/statements/select/index.md) query should have the same `join_use_nulls` value. The `Join`-engine allows to specify [join_use_nulls](/docs/en/operations/settings/settings.md/#join_use_nulls) setting in the `CREATE TABLE` statement. [SELECT](/docs/en/sql-reference/statements/select/index.md) query should have the same `join_use_nulls` value.
## Usage Examples {#example} ## Usage Examples {#example}

View File

@ -4,25 +4,39 @@ sidebar_label: Cell Towers
sidebar_position: 3 sidebar_position: 3
title: "Cell Towers" title: "Cell Towers"
--- ---
import ConnectionDetails from '@site/docs/en/_snippets/_gather_your_details_http.mdx';
import Tabs from '@theme/Tabs'; import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem'; import TabItem from '@theme/TabItem';
import CodeBlock from '@theme/CodeBlock'; import CodeBlock from '@theme/CodeBlock';
import ActionsMenu from '@site/docs/en/_snippets/_service_actions_menu.md'; import ActionsMenu from '@site/docs/en/_snippets/_service_actions_menu.md';
import SQLConsoleDetail from '@site/docs/en/_snippets/_launch_sql_console.md'; import SQLConsoleDetail from '@site/docs/en/_snippets/_launch_sql_console.md';
import SupersetDocker from '@site/docs/en/_snippets/_add_superset_detail.md';
This dataset is from [OpenCellid](https://www.opencellid.org/) - The world's largest Open Database of Cell Towers. ## Goal
In this guide you will learn how to:
- Load the OpenCelliD data in Clickhouse
- Connect Apache Superset to ClickHouse
- Build a dashboard based on data available in the dataset
Here is a preview of the dashboard created in this guide:
![Dashboard of cell towers by radio type in mcc 204](@site/docs/en/getting-started/example-datasets/images/superset-cell-tower-dashboard.png)
## Get the Dataset {#get-the-dataset}
This dataset is from [OpenCelliD](https://www.opencellid.org/) - The world's largest Open Database of Cell Towers.
As of 2021, it contains more than 40 million records about cell towers (GSM, LTE, UMTS, etc.) around the world with their geographical coordinates and metadata (country code, network, etc). As of 2021, it contains more than 40 million records about cell towers (GSM, LTE, UMTS, etc.) around the world with their geographical coordinates and metadata (country code, network, etc).
OpenCelliD Project is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License, and we redistribute a snapshot of this dataset under the terms of the same license. The up-to-date version of the dataset is available to download after sign in. OpenCelliD Project is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License, and we redistribute a snapshot of this dataset under the terms of the same license. The up-to-date version of the dataset is available to download after sign in.
## Get the Dataset {#get-the-dataset}
<Tabs groupId="deployMethod"> <Tabs groupId="deployMethod">
<TabItem value="serverless" label="ClickHouse Cloud" default> <TabItem value="serverless" label="ClickHouse Cloud" default>
### Load the sample data
ClickHouse Cloud provides an easy-button for uploading this dataset from S3. Log in to your ClickHouse Cloud organization, or create a free trial at [ClickHouse.cloud](https://clickhouse.cloud). ClickHouse Cloud provides an easy-button for uploading this dataset from S3. Log in to your ClickHouse Cloud organization, or create a free trial at [ClickHouse.cloud](https://clickhouse.cloud).
<ActionsMenu menu="Load Data" /> <ActionsMenu menu="Load Data" />
@ -30,13 +44,33 @@ Choose the **Cell Towers** dataset from the **Sample data** tab, and **Load data
![Load cell towers dataset](@site/docs/en/_snippets/images/cloud-load-data-sample.png) ![Load cell towers dataset](@site/docs/en/_snippets/images/cloud-load-data-sample.png)
Examine the schema of the cell_towers table: ### Examine the schema of the cell_towers table
```sql ```sql
DESCRIBE TABLE cell_towers DESCRIBE TABLE cell_towers
``` ```
<SQLConsoleDetail /> <SQLConsoleDetail />
This is the output of `DESCRIBE`. Down further in this guide the field type choices will be described.
```response
┌─name──────────┬─type──────────────────────────────────────────────────────────────────┬
│ radio │ Enum8('' = 0, 'CDMA' = 1, 'GSM' = 2, 'LTE' = 3, 'NR' = 4, 'UMTS' = 5) │
│ mcc │ UInt16 │
│ net │ UInt16 │
│ area │ UInt16 │
│ cell │ UInt64 │
│ unit │ Int16 │
│ lon │ Float64 │
│ lat │ Float64 │
│ range │ UInt32 │
│ samples │ UInt32 │
│ changeable │ UInt8 │
│ created │ DateTime │
│ updated │ DateTime │
│ averageSignal │ UInt8 │
└───────────────┴───────────────────────────────────────────────────────────────────────┴
```
</TabItem> </TabItem>
<TabItem value="selfmanaged" label="Self-managed"> <TabItem value="selfmanaged" label="Self-managed">
@ -86,7 +120,7 @@ clickhouse-client --query "INSERT INTO cell_towers FORMAT CSVWithNames" < cell_t
</TabItem> </TabItem>
</Tabs> </Tabs>
## Example queries {#examples} ## Run some example queries {#examples}
1. A number of cell towers by type: 1. A number of cell towers by type:
@ -127,13 +161,13 @@ SELECT mcc, count() FROM cell_towers GROUP BY mcc ORDER BY count() DESC LIMIT 10
10 rows in set. Elapsed: 0.019 sec. Processed 43.28 million rows, 86.55 MB (2.33 billion rows/s., 4.65 GB/s.) 10 rows in set. Elapsed: 0.019 sec. Processed 43.28 million rows, 86.55 MB (2.33 billion rows/s., 4.65 GB/s.)
``` ```
So, the top countries are: the USA, Germany, and Russia. Based on the above query and the [MCC list](https://en.wikipedia.org/wiki/Mobile_country_code), the countries with the most cell towers are: the USA, Germany, and Russia.
You may want to create an [External Dictionary](../../sql-reference/dictionaries/external-dictionaries/external-dicts.md) in ClickHouse to decode these values. You may want to create an [External Dictionary](../../sql-reference/dictionaries/external-dictionaries/external-dicts.md) in ClickHouse to decode these values.
## Use case: Incorporate geo data {#use-case} ## Use case: Incorporate geo data {#use-case}
Using `pointInPolygon` function. Using the [`pointInPolygon`](/docs/en/sql-reference/functions/geo/coordinates.md/#pointinpolygon) function.
1. Create a table where we will store polygons: 1. Create a table where we will store polygons:
@ -224,6 +258,110 @@ WHERE pointInPolygon((lon, lat), (SELECT * FROM moscow))
1 rows in set. Elapsed: 0.067 sec. Processed 43.28 million rows, 692.42 MB (645.83 million rows/s., 10.33 GB/s.) 1 rows in set. Elapsed: 0.067 sec. Processed 43.28 million rows, 692.42 MB (645.83 million rows/s., 10.33 GB/s.)
``` ```
The data is also available for interactive queries in the [Playground](https://play.clickhouse.com/play?user=play), [example](https://play.clickhouse.com/play?user=play#U0VMRUNUIG1jYywgY291bnQoKSBGUk9NIGNlbGxfdG93ZXJzIEdST1VQIEJZIG1jYyBPUkRFUiBCWSBjb3VudCgpIERFU0M=). ## Review of the schema
Although you cannot create temporary tables there. Before building visualizations in Superset have a look at the columns that you will use. This dataset primarily provides the location (Longitude and Latitude) and radio types at mobile cellular towers worldwide. The column descriptions can be found in the [community forum](https://community.opencellid.org/t/documenting-the-columns-in-the-downloadable-cells-database-csv/186). The columns used in the visualizations that will be built are described below
Here is a description of the columns taken from the OpenCelliD forum:
| Column | Description |
|--------------|--------------------------------------------------------|
| radio | Technology generation: CDMA, GSM, UMTS, 5G NR |
| mcc | Mobile Country Code: `204` is The Netherlands |
| lon | Longitude: With Latitude, approximate tower location |
| lat | Latitude: With Longitude, approximate tower location |
:::tip mcc
To find your MCC check [Mobile network codes](https://en.wikipedia.org/wiki/Mobile_country_code), and use the three digits in the **Mobile country code** column.
:::
The schema for this table was designed for compact storage on disk and query speed.
- The `radio` data is stored as an `Enum8` (`UInt8`) rather than a string.
- `mcc` or Mobile country code, is stored as a `UInt16` as we know the range is 1 - 999.
- `lon` and `lat` are `Float64`.
None of the other fields are used in the queries or visualizations in this guide, but they are described in the forum linked above if you are interested.
## Build visualizations with Apache Superset
Superset is easy to run from Docker. If you already have Superset running, all you need to do is add ClickHouse Connect with `pip install clickhouse-connect`. If you need to install Superset open the **Launch Apache Superset in Docker** directly below.
<SupersetDocker />
To build a Superset dashboard using the OpenCelliD dataset you should:
- Add your ClickHouse service as a Superset **database**
- Add the table **cell_towers** as a Superset **dataset**
- Create some **charts**
- Add the charts to a **dashboard**
### Add your ClickHouse service as a Superset database
<ConnectionDetails />
In Superset a database can be added by choosing the database type, and then providing the connection details. Open Superset and look for the **+**, it has a menu with **Data** and then **Connect database** options.
![Add a database](@site/docs/en/getting-started/example-datasets/images/superset-add.png)
Choose **ClickHouse Connect** from the list:
![Choose clickhouse connect as database type](@site/docs/en/getting-started/example-datasets/images/superset-choose-a-database.png)
:::note
If **ClickHouse Connect** is not one of your options, then you will need to install it. The comand is `pip install clickhouse-connect`, and more info is [available here](https://pypi.org/project/clickhouse-connect/).
:::
#### Add your connection details:
:::tip
Make sure that you set **SSL** on when connecting to ClickHouse Cloud or other ClickHouse systems that enforce the use of SSL.
:::
![Add ClickHouse as a Superset datasource](@site/docs/en/getting-started/example-datasets/images/superset-connect-a-database.png)
### Add the table **cell_towers** as a Superset **dataset**
In Superset a **dataset** maps to a table within a database. Click on add a dataset and choose your ClickHouse service, the database containing your table (`default`), and choose the `cell_towers` table:
![Add cell_towers table as a dataset](@site/docs/en/getting-started/example-datasets/images/superset-add-dataset.png)
### Create some **charts**
When you choose to add a chart in Superset you have to specify the dataset (`cell_towers`) and the chart type. Since the OpenCelliD dataset provides longitude and latitude coordinates for cell towers we will create a **Map** chart. The **deck.gL Scatterplot** type is suited to this dataset as it works well with dense data points on a map.
![Create a map in Superset](@site/docs/en/getting-started/example-datasets/images/superset-create-map.png)
#### Specify the query used for the map
A deck.gl Scatterplot requires a longitude and latitude, and one or more filters can also be applied to the query. In this example two filters are applied, one for cell towers with UMTS radios, and one for the Mobile country code assigned to The Netherlands.
The fields `lon` and `lat` contain the longitude and latitude:
![Specify longitude and latitude fields](@site/docs/en/getting-started/example-datasets/images/superset-lon-lat.png)
Add a filter with `mcc` = `204` (or substitute any other `mcc` value):
![Filter on MCC 204](@site/docs/en/getting-started/example-datasets/images/superset-mcc-204.png)
Add a filter with `radio` = `'UMTS'` (or substitute any other `radio` value, you can see the choices in the output of `DESCRIBE TABLE cell_towers`):
![Filter on radio = UMTS](@site/docs/en/getting-started/example-datasets/images/superset-radio-umts.png)
This is the full configuration for the chart that filters on `radio = 'UMTS'` and `mcc = 204`:
![Chart for UMTS radios in MCC 204](@site/docs/en/getting-started/example-datasets/images/superset-umts-netherlands.png)
Click on **UPDATE CHART** to render the visualization.
### Add the charts to a **dashboard**
This screenshot shows cell tower locations with LTE, UMTS, and GSM radios. The charts are all created in the same way and they are added to a dashboard.
![Dashboard of cell towers by radio type in mcc 204](@site/docs/en/getting-started/example-datasets/images/superset-cell-tower-dashboard.png)
:::tip
The data is also available for interactive queries in the [Playground](https://play.clickhouse.com/play?user=play).
This [example](https://play.clickhouse.com/play?user=play#U0VMRUNUIG1jYywgY291bnQoKSBGUk9NIGNlbGxfdG93ZXJzIEdST1VQIEJZIG1jYyBPUkRFUiBCWSBjb3VudCgpIERFU0M=) will populate the username and even the query for you.
Although you cannot create tables in the Playground, you can run all of the queries and even use Superset (adjust the hostname and port number).
:::

File diff suppressed because it is too large Load Diff

Binary file not shown.

After

Width:  |  Height:  |  Size: 35 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 35 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 475 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 53 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 246 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 73 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 290 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 69 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 38 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 12 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 12 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 46 KiB

View File

@ -33,7 +33,7 @@ CREATE TABLE trips (
tip_amount Float32, tip_amount Float32,
tolls_amount Float32, tolls_amount Float32,
total_amount Float32, total_amount Float32,
payment_type Enum('CSH' = 1, 'CRE' = 2, 'NOC' = 3, 'DIS' = 4), payment_type Enum('CSH' = 1, 'CRE' = 2, 'NOC' = 3, 'DIS' = 4, 'UNK' = 5),
pickup_ntaname LowCardinality(String), pickup_ntaname LowCardinality(String),
dropoff_ntaname LowCardinality(String) dropoff_ntaname LowCardinality(String)
) )
@ -63,7 +63,7 @@ SELECT
payment_type, payment_type,
pickup_ntaname, pickup_ntaname,
dropoff_ntaname dropoff_ntaname
FROM url( FROM s3(
'https://datasets-documentation.s3.eu-west-3.amazonaws.com/nyc-taxi/trips_{0..2}.gz', 'https://datasets-documentation.s3.eu-west-3.amazonaws.com/nyc-taxi/trips_{0..2}.gz',
'TabSeparatedWithNames' 'TabSeparatedWithNames'
) )

View File

@ -4,7 +4,7 @@ sidebar_label: Recipes Dataset
title: "Recipes Dataset" title: "Recipes Dataset"
--- ---
RecipeNLG dataset is available for download [here](https://recipenlg.cs.put.poznan.pl/dataset). It contains 2.2 million recipes. The size is slightly less than 1 GB. The RecipeNLG dataset is available for download [here](https://recipenlg.cs.put.poznan.pl/dataset). It contains 2.2 million recipes. The size is slightly less than 1 GB.
## Download and Unpack the Dataset ## Download and Unpack the Dataset

View File

@ -128,6 +128,24 @@ clickhouse-client # or "clickhouse-client --password" if you set up a password.
</details> </details>
<details>
<summary>Migration Method for installing the deb-packages</summary>
```bash
sudo apt-key del E0C56BD4
sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 8919F6BD2B48D754
echo "deb https://packages.clickhouse.com/deb stable main" | sudo tee \
/etc/apt/sources.list.d/clickhouse.list
sudo apt-get update
sudo apt-get install -y clickhouse-server clickhouse-client
sudo service clickhouse-server start
clickhouse-client # or "clickhouse-client --password" if you set up a password.
```
</details>
You can replace `stable` with `lts` to use different [release kinds](/docs/en/faq/operations/production.md) based on your needs. You can replace `stable` with `lts` to use different [release kinds](/docs/en/faq/operations/production.md) based on your needs.
You can also download and install packages manually from [here](https://packages.clickhouse.com/deb/pool/main/c/). You can also download and install packages manually from [here](https://packages.clickhouse.com/deb/pool/main/c/).

View File

@ -1,9 +1,5 @@
---
slug: /en/operations/troubleshooting [//]: # (This file is included in FAQ > Troubleshooting)
sidebar_position: 46
sidebar_label: Troubleshooting
title: Troubleshooting
---
- [Installation](#troubleshooting-installation-errors) - [Installation](#troubleshooting-installation-errors)
- [Connecting to the server](#troubleshooting-accepts-no-connections) - [Connecting to the server](#troubleshooting-accepts-no-connections)
@ -28,18 +24,34 @@ sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 8919F6BD2B48D7
sudo apt-get update sudo apt-get update
``` ```
### You Get the Unsupported Architecture Warning with Apt-get {#you-get-the-unsupported-architecture-warning-with-apt-get} ### You Get Different Warnings with `apt-get update` {#you-get-different-warnings-with-apt-get-update}
- The completed warning message is as follows: - The completed warning messages are as one of following:
``` ```
N: Skipping acquire of configured file 'main/binary-i386/Packages' as repository 'https://packages.clickhouse.com/deb stable InRelease' doesn't support architecture 'i386' N: Skipping acquire of configured file 'main/binary-i386/Packages' as repository 'https://packages.clickhouse.com/deb stable InRelease' doesn't support architecture 'i386'
``` ```
```
E: Failed to fetch https://packages.clickhouse.com/deb/dists/stable/main/binary-amd64/Packages.gz File has unexpected size (30451 != 28154). Mirror sync in progress?
```
```
E: Repository 'https://packages.clickhouse.com/deb stable InRelease' changed its 'Origin' value from 'Artifactory' to 'ClickHouse'
E: Repository 'https://packages.clickhouse.com/deb stable InRelease' changed its 'Label' value from 'Artifactory' to 'ClickHouse'
N: Repository 'https://packages.clickhouse.com/deb stable InRelease' changed its 'Suite' value from 'stable' to ''
N: This must be accepted explicitly before updates for this repository can be applied. See apt-secure(8) manpage for details.
```
```
Err:11 https://packages.clickhouse.com/deb stable InRelease
400 Bad Request [IP: 172.66.40.249 443]
```
To resolve the above issue, please use the following script: To resolve the above issue, please use the following script:
```bash ```bash
sudo rm /var/lib/apt/lists/packages.clickhouse.com_* /var/lib/dpkg/arch sudo rm /var/lib/apt/lists/packages.clickhouse.com_* /var/lib/dpkg/arch /var/lib/apt/lists/partial/packages.clickhouse.com_*
sudo apt-get clean sudo apt-get clean
sudo apt-get autoclean sudo apt-get autoclean
``` ```

View File

@ -1,10 +1,7 @@
---
slug: /en/operations/update
sidebar_position: 47
sidebar_label: ClickHouse Upgrade
---
# ClickHouse Upgrade [//]: # (This file is included in Manage > Updates)
## Self-managed ClickHouse Upgrade
If ClickHouse was installed from `deb` packages, execute the following commands on the server: If ClickHouse was installed from `deb` packages, execute the following commands on the server:

View File

@ -126,7 +126,7 @@ clickhouse keeper --config /etc/your_path_to_config/config.xml
ClickHouse Keeper also provides 4lw commands which are almost the same with Zookeeper. Each command is composed of four letters such as `mntr`, `stat` etc. There are some more interesting commands: `stat` gives some general information about the server and connected clients, while `srvr` and `cons` give extended details on server and connections respectively. ClickHouse Keeper also provides 4lw commands which are almost the same with Zookeeper. Each command is composed of four letters such as `mntr`, `stat` etc. There are some more interesting commands: `stat` gives some general information about the server and connected clients, while `srvr` and `cons` give extended details on server and connections respectively.
The 4lw commands has a white list configuration `four_letter_word_white_list` which has default value `conf,cons,crst,envi,ruok,srst,srvr,stat,wchc,wchs,dirs,mntr,isro`. The 4lw commands has a white list configuration `four_letter_word_white_list` which has default value `conf,cons,crst,envi,ruok,srst,srvr,stat,wchs,dirs,mntr,isro,rcvr,apiv,csnp,lgif`.
You can issue the commands to ClickHouse Keeper via telnet or nc, at the client port. You can issue the commands to ClickHouse Keeper via telnet or nc, at the client port.
@ -309,7 +309,26 @@ Sessions with Ephemerals (1):
/clickhouse/task_queue/ddl /clickhouse/task_queue/ddl
``` ```
## [experimental] Migration from ZooKeeper {#migration-from-zookeeper} - `csnp`: Schedule a snapshot creation task. Return the last committed log index of the scheduled snapshot if success or `Failed to schedule snapshot creation task.` if failed. Note that `lgif` command can help you determine whether the snapshot is done.
```
100
```
- `lgif`: Keeper log information. `first_log_idx` : my first log index in log store; `first_log_term` : my first log term; `last_log_idx` : my last log index in log store; `last_log_term` : my last log term; `last_committed_log_idx` : my last committed log index in state machine; `leader_committed_log_idx` : leader's committed log index from my perspective; `target_committed_log_idx` : target log index should be committed to; `last_snapshot_idx` : the largest committed log index in last snapshot.
```
first_log_idx 1
first_log_term 1
last_log_idx 101
last_log_term 1
last_committed_log_idx 100
leader_committed_log_idx 101
target_committed_log_idx 101
last_snapshot_idx 50
```
## Migration from ZooKeeper {#migration-from-zookeeper}
Seamlessly migration from ZooKeeper to ClickHouse Keeper is impossible you have to stop your ZooKeeper cluster, convert data and start ClickHouse Keeper. `clickhouse-keeper-converter` tool allows converting ZooKeeper logs and snapshots to ClickHouse Keeper snapshot. It works only with ZooKeeper > 3.4. Steps for migration: Seamlessly migration from ZooKeeper to ClickHouse Keeper is impossible you have to stop your ZooKeeper cluster, convert data and start ClickHouse Keeper. `clickhouse-keeper-converter` tool allows converting ZooKeeper logs and snapshots to ClickHouse Keeper snapshot. It works only with ZooKeeper > 3.4. Steps for migration:

View File

@ -70,7 +70,7 @@ Another use case of `prefer_global_in_and_join` is accessing tables created by
**See also:** **See also:**
- [Distributed subqueries](../../sql-reference/operators/in.md#select-distributed-subqueries) for more information on how to use `GLOBAL IN`/`GLOBAL JOIN` - [Distributed subqueries](../../sql-reference/operators/in.md/#select-distributed-subqueries) for more information on how to use `GLOBAL IN`/`GLOBAL JOIN`
## enable_optimize_predicate_expression {#enable-optimize-predicate-expression} ## enable_optimize_predicate_expression {#enable-optimize-predicate-expression}
@ -170,7 +170,7 @@ It makes sense to disable it if the server has millions of tiny tables that are
## function_range_max_elements_in_block {#settings-function_range_max_elements_in_block} ## function_range_max_elements_in_block {#settings-function_range_max_elements_in_block}
Sets the safety threshold for data volume generated by function [range](../../sql-reference/functions/array-functions.md#range). Defines the maximum number of values generated by function per block of data (sum of array sizes for every row in a block). Sets the safety threshold for data volume generated by function [range](../../sql-reference/functions/array-functions.md/#range). Defines the maximum number of values generated by function per block of data (sum of array sizes for every row in a block).
Possible values: Possible values:
@ -273,10 +273,10 @@ Default value: 0.
## insert_null_as_default {#insert_null_as_default} ## insert_null_as_default {#insert_null_as_default}
Enables or disables the insertion of [default values](../../sql-reference/statements/create/table.md#create-default-values) instead of [NULL](../../sql-reference/syntax.md#null-literal) into columns with not [nullable](../../sql-reference/data-types/nullable.md#data_type-nullable) data type. Enables or disables the insertion of [default values](../../sql-reference/statements/create/table.md/#create-default-values) instead of [NULL](../../sql-reference/syntax.md/#null-literal) into columns with not [nullable](../../sql-reference/data-types/nullable.md/#data_type-nullable) data type.
If column type is not nullable and this setting is disabled, then inserting `NULL` causes an exception. If column type is nullable, then `NULL` values are inserted as is, regardless of this setting. If column type is not nullable and this setting is disabled, then inserting `NULL` causes an exception. If column type is nullable, then `NULL` values are inserted as is, regardless of this setting.
This setting is applicable to [INSERT ... SELECT](../../sql-reference/statements/insert-into.md#insert_query_insert-select) queries. Note that `SELECT` subqueries may be concatenated with `UNION ALL` clause. This setting is applicable to [INSERT ... SELECT](../../sql-reference/statements/insert-into.md/#insert_query_insert-select) queries. Note that `SELECT` subqueries may be concatenated with `UNION ALL` clause.
Possible values: Possible values:
@ -287,7 +287,7 @@ Default value: `1`.
## join_default_strictness {#settings-join_default_strictness} ## join_default_strictness {#settings-join_default_strictness}
Sets default strictness for [JOIN clauses](../../sql-reference/statements/select/join.md#select-join). Sets default strictness for [JOIN clauses](../../sql-reference/statements/select/join.md/#select-join).
Possible values: Possible values:
@ -322,7 +322,7 @@ When using `partial_merge` algorithm, ClickHouse sorts the data and dumps it to
- `direct` - can be applied when the right storage supports key-value requests. - `direct` - can be applied when the right storage supports key-value requests.
The `direct` algorithm performs a lookup in the right table using rows from the left table as keys. It's supported only by special storage such as [Dictionary](../../engines/table-engines/special/dictionary.md#dictionary) or [EmbeddedRocksDB](../../engines/table-engines/integrations/embedded-rocksdb.md) and only the `LEFT` and `INNER` JOINs. The `direct` algorithm performs a lookup in the right table using rows from the left table as keys. It's supported only by special storage such as [Dictionary](../../engines/table-engines/special/dictionary.md/#dictionary) or [EmbeddedRocksDB](../../engines/table-engines/integrations/embedded-rocksdb.md) and only the `LEFT` and `INNER` JOINs.
- `auto` — try `hash` join and switch on the fly to another algorithm if the memory limit is violated. - `auto` — try `hash` join and switch on the fly to another algorithm if the memory limit is violated.
@ -348,7 +348,7 @@ Default value: 0.
See also: See also:
- [JOIN clause](../../sql-reference/statements/select/join.md#select-join) - [JOIN clause](../../sql-reference/statements/select/join.md/#select-join)
- [Join table engine](../../engines/table-engines/special/join.md) - [Join table engine](../../engines/table-engines/special/join.md)
- [join_default_strictness](#settings-join_default_strictness) - [join_default_strictness](#settings-join_default_strictness)
@ -359,7 +359,7 @@ Sets the type of [JOIN](../../sql-reference/statements/select/join.md) behaviour
Possible values: Possible values:
- 0 — The empty cells are filled with the default value of the corresponding field type. - 0 — The empty cells are filled with the default value of the corresponding field type.
- 1 — `JOIN` behaves the same way as in standard SQL. The type of the corresponding field is converted to [Nullable](../../sql-reference/data-types/nullable.md#data_type-nullable), and empty cells are filled with [NULL](../../sql-reference/syntax.md). - 1 — `JOIN` behaves the same way as in standard SQL. The type of the corresponding field is converted to [Nullable](../../sql-reference/data-types/nullable.md/#data_type-nullable), and empty cells are filled with [NULL](../../sql-reference/syntax.md).
Default value: 0. Default value: 0.
@ -431,7 +431,7 @@ Default value: 0.
See also: See also:
- [JOIN strictness](../../sql-reference/statements/select/join.md#join-settings) - [JOIN strictness](../../sql-reference/statements/select/join.md/#join-settings)
## temporary_files_codec {#temporary_files_codec} ## temporary_files_codec {#temporary_files_codec}
@ -532,7 +532,7 @@ Default value: 8.
If ClickHouse should read more than `merge_tree_max_rows_to_use_cache` rows in one query, it does not use the cache of uncompressed blocks. If ClickHouse should read more than `merge_tree_max_rows_to_use_cache` rows in one query, it does not use the cache of uncompressed blocks.
The cache of uncompressed blocks stores data extracted for queries. ClickHouse uses this cache to speed up responses to repeated small queries. This setting protects the cache from trashing by queries that read a large amount of data. The [uncompressed_cache_size](../../operations/server-configuration-parameters/settings.md#server-settings-uncompressed_cache_size) server setting defines the size of the cache of uncompressed blocks. The cache of uncompressed blocks stores data extracted for queries. ClickHouse uses this cache to speed up responses to repeated small queries. This setting protects the cache from trashing by queries that read a large amount of data. The [uncompressed_cache_size](../../operations/server-configuration-parameters/settings.md/#server-settings-uncompressed_cache_size) server setting defines the size of the cache of uncompressed blocks.
Possible values: Possible values:
@ -544,7 +544,7 @@ Default value: 128 ✕ 8192.
If ClickHouse should read more than `merge_tree_max_bytes_to_use_cache` bytes in one query, it does not use the cache of uncompressed blocks. If ClickHouse should read more than `merge_tree_max_bytes_to_use_cache` bytes in one query, it does not use the cache of uncompressed blocks.
The cache of uncompressed blocks stores data extracted for queries. ClickHouse uses this cache to speed up responses to repeated small queries. This setting protects the cache from trashing by queries that read a large amount of data. The [uncompressed_cache_size](../../operations/server-configuration-parameters/settings.md#server-settings-uncompressed_cache_size) server setting defines the size of the cache of uncompressed blocks. The cache of uncompressed blocks stores data extracted for queries. ClickHouse uses this cache to speed up responses to repeated small queries. This setting protects the cache from trashing by queries that read a large amount of data. The [uncompressed_cache_size](../../operations/server-configuration-parameters/settings.md/#server-settings-uncompressed_cache_size) server setting defines the size of the cache of uncompressed blocks.
Possible values: Possible values:
@ -594,7 +594,7 @@ Default value: `1`.
Setting up query logging. Setting up query logging.
Queries sent to ClickHouse with this setup are logged according to the rules in the [query_log](../../operations/server-configuration-parameters/settings.md#server_configuration_parameters-query-log) server configuration parameter. Queries sent to ClickHouse with this setup are logged according to the rules in the [query_log](../../operations/server-configuration-parameters/settings.md/#server_configuration_parameters-query-log) server configuration parameter.
Example: Example:
@ -639,7 +639,7 @@ log_queries_min_type='EXCEPTION_WHILE_PROCESSING'
Setting up query threads logging. Setting up query threads logging.
Query threads log into [system.query_thread_log](../../operations/system-tables/query_thread_log.md) table. This setting have effect only when [log_queries](#settings-log-queries) is true. Queries threads run by ClickHouse with this setup are logged according to the rules in the [query_thread_log](../../operations/server-configuration-parameters/settings.md#server_configuration_parameters-query_thread_log) server configuration parameter. Query threads log into [system.query_thread_log](../../operations/system-tables/query_thread_log.md) table. This setting have effect only when [log_queries](#settings-log-queries) is true. Queries threads run by ClickHouse with this setup are logged according to the rules in the [query_thread_log](../../operations/server-configuration-parameters/settings.md/#server_configuration_parameters-query_thread_log) server configuration parameter.
Possible values: Possible values:
@ -658,7 +658,7 @@ log_query_threads=1
Setting up query views logging. Setting up query views logging.
When a query run by ClickHouse with this setup on has associated views (materialized or live views), they are logged in the [query_views_log](../../operations/server-configuration-parameters/settings.md#server_configuration_parameters-query_views_log) server configuration parameter. When a query run by ClickHouse with this setup on has associated views (materialized or live views), they are logged in the [query_views_log](../../operations/server-configuration-parameters/settings.md/#server_configuration_parameters-query_views_log) server configuration parameter.
Example: Example:
@ -884,7 +884,7 @@ Default value: `5`.
## max_replicated_fetches_network_bandwidth_for_server {#max_replicated_fetches_network_bandwidth_for_server} ## max_replicated_fetches_network_bandwidth_for_server {#max_replicated_fetches_network_bandwidth_for_server}
Limits the maximum speed of data exchange over the network in bytes per second for [replicated](../../engines/table-engines/mergetree-family/replication.md) fetches for the server. Only has meaning at server startup. You can also limit the speed for a particular table with [max_replicated_fetches_network_bandwidth](../../operations/settings/merge-tree-settings.md#max_replicated_fetches_network_bandwidth) setting. Limits the maximum speed of data exchange over the network in bytes per second for [replicated](../../engines/table-engines/mergetree-family/replication.md) fetches for the server. Only has meaning at server startup. You can also limit the speed for a particular table with [max_replicated_fetches_network_bandwidth](../../operations/settings/merge-tree-settings.md/#max_replicated_fetches_network_bandwidth) setting.
The setting isn't followed perfectly accurately. The setting isn't followed perfectly accurately.
@ -905,7 +905,7 @@ Could be used for throttling speed when replicating the data to add or replace n
## max_replicated_sends_network_bandwidth_for_server {#max_replicated_sends_network_bandwidth_for_server} ## max_replicated_sends_network_bandwidth_for_server {#max_replicated_sends_network_bandwidth_for_server}
Limits the maximum speed of data exchange over the network in bytes per second for [replicated](../../engines/table-engines/mergetree-family/replication.md) sends for the server. Only has meaning at server startup. You can also limit the speed for a particular table with [max_replicated_sends_network_bandwidth](../../operations/settings/merge-tree-settings.md#max_replicated_sends_network_bandwidth) setting. Limits the maximum speed of data exchange over the network in bytes per second for [replicated](../../engines/table-engines/mergetree-family/replication.md) sends for the server. Only has meaning at server startup. You can also limit the speed for a particular table with [max_replicated_sends_network_bandwidth](../../operations/settings/merge-tree-settings.md/#max_replicated_sends_network_bandwidth) setting.
The setting isn't followed perfectly accurately. The setting isn't followed perfectly accurately.
@ -955,7 +955,7 @@ For more information, see the section “Extreme values”.
## kafka_max_wait_ms {#kafka-max-wait-ms} ## kafka_max_wait_ms {#kafka-max-wait-ms}
The wait time in milliseconds for reading messages from [Kafka](../../engines/table-engines/integrations/kafka.md#kafka) before retry. The wait time in milliseconds for reading messages from [Kafka](../../engines/table-engines/integrations/kafka.md/#kafka) before retry.
Possible values: Possible values:
@ -977,7 +977,7 @@ Default value: false.
## use_uncompressed_cache {#setting-use_uncompressed_cache} ## use_uncompressed_cache {#setting-use_uncompressed_cache}
Whether to use a cache of uncompressed blocks. Accepts 0 or 1. By default, 0 (disabled). Whether to use a cache of uncompressed blocks. Accepts 0 or 1. By default, 0 (disabled).
Using the uncompressed cache (only for tables in the MergeTree family) can significantly reduce latency and increase throughput when working with a large number of short queries. Enable this setting for users who send frequent short requests. Also pay attention to the [uncompressed_cache_size](../../operations/server-configuration-parameters/settings.md#server-settings-uncompressed_cache_size) configuration parameter (only set in the config file) the size of uncompressed cache blocks. By default, it is 8 GiB. The uncompressed cache is filled in as needed and the least-used data is automatically deleted. Using the uncompressed cache (only for tables in the MergeTree family) can significantly reduce latency and increase throughput when working with a large number of short queries. Enable this setting for users who send frequent short requests. Also pay attention to the [uncompressed_cache_size](../../operations/server-configuration-parameters/settings.md/#server-settings-uncompressed_cache_size) configuration parameter (only set in the config file) the size of uncompressed cache blocks. By default, it is 8 GiB. The uncompressed cache is filled in as needed and the least-used data is automatically deleted.
For queries that read at least a somewhat large volume of data (one million rows or more), the uncompressed cache is disabled automatically to save space for truly small queries. This means that you can keep the use_uncompressed_cache setting always set to 1. For queries that read at least a somewhat large volume of data (one million rows or more), the uncompressed cache is disabled automatically to save space for truly small queries. This means that you can keep the use_uncompressed_cache setting always set to 1.
@ -1124,7 +1124,7 @@ This setting is useful for replicated tables with a sampling key. A query may be
- The cluster latency distribution has a long tail, so that querying more servers increases the query overall latency. - The cluster latency distribution has a long tail, so that querying more servers increases the query overall latency.
:::warning :::warning
This setting will produce incorrect results when joins or subqueries are involved, and all tables don't meet certain requirements. See [Distributed Subqueries and max_parallel_replicas](../../sql-reference/operators/in.md#max_parallel_replica-subqueries) for more details. This setting will produce incorrect results when joins or subqueries are involved, and all tables don't meet certain requirements. See [Distributed Subqueries and max_parallel_replicas](../../sql-reference/operators/in.md/#max_parallel_replica-subqueries) for more details.
::: :::
## compile_expressions {#compile-expressions} ## compile_expressions {#compile-expressions}
@ -1261,7 +1261,7 @@ Possible values:
Default value: 1. Default value: 1.
By default, blocks inserted into replicated tables by the `INSERT` statement are deduplicated (see [Data Replication](../../engines/table-engines/mergetree-family/replication.md)). By default, blocks inserted into replicated tables by the `INSERT` statement are deduplicated (see [Data Replication](../../engines/table-engines/mergetree-family/replication.md)).
For the replicated tables by default the only 100 of the most recent blocks for each partition are deduplicated (see [replicated_deduplication_window](merge-tree-settings.md#replicated-deduplication-window), [replicated_deduplication_window_seconds](merge-tree-settings.md/#replicated-deduplication-window-seconds)). For the replicated tables by default the only 100 of the most recent blocks for each partition are deduplicated (see [replicated_deduplication_window](merge-tree-settings.md/#replicated-deduplication-window), [replicated_deduplication_window_seconds](merge-tree-settings.md/#replicated-deduplication-window-seconds)).
For not replicated tables see [non_replicated_deduplication_window](merge-tree-settings.md/#non-replicated-deduplication-window). For not replicated tables see [non_replicated_deduplication_window](merge-tree-settings.md/#non-replicated-deduplication-window).
## deduplicate_blocks_in_dependent_materialized_views {#settings-deduplicate-blocks-in-dependent-materialized-views} ## deduplicate_blocks_in_dependent_materialized_views {#settings-deduplicate-blocks-in-dependent-materialized-views}
@ -1296,7 +1296,7 @@ Default value: empty string (disabled)
`insert_deduplication_token` is used for deduplication _only_ when not empty. `insert_deduplication_token` is used for deduplication _only_ when not empty.
For the replicated tables by default the only 100 of the most recent inserts for each partition are deduplicated (see [replicated_deduplication_window](merge-tree-settings.md#replicated-deduplication-window), [replicated_deduplication_window_seconds](merge-tree-settings.md/#replicated-deduplication-window-seconds)). For the replicated tables by default the only 100 of the most recent inserts for each partition are deduplicated (see [replicated_deduplication_window](merge-tree-settings.md/#replicated-deduplication-window), [replicated_deduplication_window_seconds](merge-tree-settings.md/#replicated-deduplication-window-seconds)).
For not replicated tables see [non_replicated_deduplication_window](merge-tree-settings.md/#non-replicated-deduplication-window). For not replicated tables see [non_replicated_deduplication_window](merge-tree-settings.md/#non-replicated-deduplication-window).
Example: Example:
@ -1373,15 +1373,15 @@ Default value: 0.
## count_distinct_implementation {#settings-count_distinct_implementation} ## count_distinct_implementation {#settings-count_distinct_implementation}
Specifies which of the `uniq*` functions should be used to perform the [COUNT(DISTINCT …)](../../sql-reference/aggregate-functions/reference/count.md#agg_function-count) construction. Specifies which of the `uniq*` functions should be used to perform the [COUNT(DISTINCT …)](../../sql-reference/aggregate-functions/reference/count.md/#agg_function-count) construction.
Possible values: Possible values:
- [uniq](../../sql-reference/aggregate-functions/reference/uniq.md#agg_function-uniq) - [uniq](../../sql-reference/aggregate-functions/reference/uniq.md/#agg_function-uniq)
- [uniqCombined](../../sql-reference/aggregate-functions/reference/uniqcombined.md#agg_function-uniqcombined) - [uniqCombined](../../sql-reference/aggregate-functions/reference/uniqcombined.md/#agg_function-uniqcombined)
- [uniqCombined64](../../sql-reference/aggregate-functions/reference/uniqcombined64.md#agg_function-uniqcombined64) - [uniqCombined64](../../sql-reference/aggregate-functions/reference/uniqcombined64.md/#agg_function-uniqcombined64)
- [uniqHLL12](../../sql-reference/aggregate-functions/reference/uniqhll12.md#agg_function-uniqhll12) - [uniqHLL12](../../sql-reference/aggregate-functions/reference/uniqhll12.md/#agg_function-uniqhll12)
- [uniqExact](../../sql-reference/aggregate-functions/reference/uniqexact.md#agg_function-uniqexact) - [uniqExact](../../sql-reference/aggregate-functions/reference/uniqexact.md/#agg_function-uniqexact)
Default value: `uniqExact`. Default value: `uniqExact`.
@ -1616,14 +1616,14 @@ Enables or disables optimization by transforming some functions to reading subco
These functions can be transformed: These functions can be transformed:
- [length](../../sql-reference/functions/array-functions.md#array_functions-length) to read the [size0](../../sql-reference/data-types/array.md#array-size) subcolumn. - [length](../../sql-reference/functions/array-functions.md/#array_functions-length) to read the [size0](../../sql-reference/data-types/array.md/#array-size) subcolumn.
- [empty](../../sql-reference/functions/array-functions.md#function-empty) to read the [size0](../../sql-reference/data-types/array.md#array-size) subcolumn. - [empty](../../sql-reference/functions/array-functions.md/#function-empty) to read the [size0](../../sql-reference/data-types/array.md/#array-size) subcolumn.
- [notEmpty](../../sql-reference/functions/array-functions.md#function-notempty) to read the [size0](../../sql-reference/data-types/array.md#array-size) subcolumn. - [notEmpty](../../sql-reference/functions/array-functions.md/#function-notempty) to read the [size0](../../sql-reference/data-types/array.md/#array-size) subcolumn.
- [isNull](../../sql-reference/operators/index.md#operator-is-null) to read the [null](../../sql-reference/data-types/nullable.md#finding-null) subcolumn. - [isNull](../../sql-reference/operators/index.md/#operator-is-null) to read the [null](../../sql-reference/data-types/nullable.md/#finding-null) subcolumn.
- [isNotNull](../../sql-reference/operators/index.md#is-not-null) to read the [null](../../sql-reference/data-types/nullable.md#finding-null) subcolumn. - [isNotNull](../../sql-reference/operators/index.md/#is-not-null) to read the [null](../../sql-reference/data-types/nullable.md/#finding-null) subcolumn.
- [count](../../sql-reference/aggregate-functions/reference/count.md) to read the [null](../../sql-reference/data-types/nullable.md#finding-null) subcolumn. - [count](../../sql-reference/aggregate-functions/reference/count.md) to read the [null](../../sql-reference/data-types/nullable.md/#finding-null) subcolumn.
- [mapKeys](../../sql-reference/functions/tuple-map-functions.md#mapkeys) to read the [keys](../../sql-reference/data-types/map.md#map-subcolumns) subcolumn. - [mapKeys](../../sql-reference/functions/tuple-map-functions.md/#mapkeys) to read the [keys](../../sql-reference/data-types/map.md/#map-subcolumns) subcolumn.
- [mapValues](../../sql-reference/functions/tuple-map-functions.md#mapvalues) to read the [values](../../sql-reference/data-types/map.md#map-subcolumns) subcolumn. - [mapValues](../../sql-reference/functions/tuple-map-functions.md/#mapvalues) to read the [values](../../sql-reference/data-types/map.md/#map-subcolumns) subcolumn.
Possible values: Possible values:
@ -1782,7 +1782,7 @@ Default value: 1000000000 nanoseconds (once a second).
See also: See also:
- System table [trace_log](../../operations/system-tables/trace_log.md#system_tables-trace_log) - System table [trace_log](../../operations/system-tables/trace_log.md/#system_tables-trace_log)
## query_profiler_cpu_time_period_ns {#query_profiler_cpu_time_period_ns} ## query_profiler_cpu_time_period_ns {#query_profiler_cpu_time_period_ns}
@ -1805,7 +1805,7 @@ Default value: 1000000000 nanoseconds.
See also: See also:
- System table [trace_log](../../operations/system-tables/trace_log.md#system_tables-trace_log) - System table [trace_log](../../operations/system-tables/trace_log.md/#system_tables-trace_log)
## allow_introspection_functions {#settings-allow_introspection_functions} ## allow_introspection_functions {#settings-allow_introspection_functions}
@ -1821,11 +1821,11 @@ Default value: 0.
**See Also** **See Also**
- [Sampling Query Profiler](../../operations/optimizing-performance/sampling-query-profiler.md) - [Sampling Query Profiler](../../operations/optimizing-performance/sampling-query-profiler.md)
- System table [trace_log](../../operations/system-tables/trace_log.md#system_tables-trace_log) - System table [trace_log](../../operations/system-tables/trace_log.md/#system_tables-trace_log)
## input_format_parallel_parsing {#input-format-parallel-parsing} ## input_format_parallel_parsing {#input-format-parallel-parsing}
Enables or disables order-preserving parallel parsing of data formats. Supported only for [TSV](../../interfaces/formats.md#tabseparated), [TKSV](../../interfaces/formats.md#tskv), [CSV](../../interfaces/formats.md#csv) and [JSONEachRow](../../interfaces/formats.md#jsoneachrow) formats. Enables or disables order-preserving parallel parsing of data formats. Supported only for [TSV](../../interfaces/formats.md/#tabseparated), [TKSV](../../interfaces/formats.md/#tskv), [CSV](../../interfaces/formats.md/#csv) and [JSONEachRow](../../interfaces/formats.md/#jsoneachrow) formats.
Possible values: Possible values:
@ -1836,7 +1836,7 @@ Default value: `1`.
## output_format_parallel_formatting {#output-format-parallel-formatting} ## output_format_parallel_formatting {#output-format-parallel-formatting}
Enables or disables parallel formatting of data formats. Supported only for [TSV](../../interfaces/formats.md#tabseparated), [TKSV](../../interfaces/formats.md#tskv), [CSV](../../interfaces/formats.md#csv) and [JSONEachRow](../../interfaces/formats.md#jsoneachrow) formats. Enables or disables parallel formatting of data formats. Supported only for [TSV](../../interfaces/formats.md/#tabseparated), [TKSV](../../interfaces/formats.md/#tskv), [CSV](../../interfaces/formats.md/#csv) and [JSONEachRow](../../interfaces/formats.md/#jsoneachrow) formats.
Possible values: Possible values:
@ -1878,7 +1878,7 @@ Default value: 0.
## insert_distributed_sync {#insert_distributed_sync} ## insert_distributed_sync {#insert_distributed_sync}
Enables or disables synchronous data insertion into a [Distributed](../../engines/table-engines/special/distributed.md#distributed) table. Enables or disables synchronous data insertion into a [Distributed](../../engines/table-engines/special/distributed.md/#distributed) table.
By default, when inserting data into a `Distributed` table, the ClickHouse server sends data to cluster nodes in asynchronous mode. When `insert_distributed_sync=1`, the data is processed synchronously, and the `INSERT` operation succeeds only after all the data is saved on all shards (at least one replica for each shard if `internal_replication` is true). By default, when inserting data into a `Distributed` table, the ClickHouse server sends data to cluster nodes in asynchronous mode. When `insert_distributed_sync=1`, the data is processed synchronously, and the `INSERT` operation succeeds only after all the data is saved on all shards (at least one replica for each shard if `internal_replication` is true).
@ -1891,12 +1891,12 @@ Default value: `0`.
**See Also** **See Also**
- [Distributed Table Engine](../../engines/table-engines/special/distributed.md#distributed) - [Distributed Table Engine](../../engines/table-engines/special/distributed.md/#distributed)
- [Managing Distributed Tables](../../sql-reference/statements/system.md#query-language-system-distributed) - [Managing Distributed Tables](../../sql-reference/statements/system.md/#query-language-system-distributed)
## insert_shard_id {#insert_shard_id} ## insert_shard_id {#insert_shard_id}
If not `0`, specifies the shard of [Distributed](../../engines/table-engines/special/distributed.md#distributed) table into which the data will be inserted synchronously. If not `0`, specifies the shard of [Distributed](../../engines/table-engines/special/distributed.md/#distributed) table into which the data will be inserted synchronously.
If `insert_shard_id` value is incorrect, the server will throw an exception. If `insert_shard_id` value is incorrect, the server will throw an exception.
@ -1909,7 +1909,7 @@ SELECT uniq(shard_num) FROM system.clusters WHERE cluster = 'requested_cluster';
Possible values: Possible values:
- 0 — Disabled. - 0 — Disabled.
- Any number from `1` to `shards_num` of corresponding [Distributed](../../engines/table-engines/special/distributed.md#distributed) table. - Any number from `1` to `shards_num` of corresponding [Distributed](../../engines/table-engines/special/distributed.md/#distributed) table.
Default value: `0`. Default value: `0`.
@ -1969,7 +1969,7 @@ Default value: 16.
## background_move_pool_size {#background_move_pool_size} ## background_move_pool_size {#background_move_pool_size}
Sets the number of threads performing background moves of data parts for [MergeTree](../../engines/table-engines/mergetree-family/mergetree.md#table_engine-mergetree-multiple-volumes)-engine tables. This setting is applied at the ClickHouse server start and cant be changed in a user session. Sets the number of threads performing background moves of data parts for [MergeTree](../../engines/table-engines/mergetree-family/mergetree.md/#table_engine-mergetree-multiple-volumes)-engine tables. This setting is applied at the ClickHouse server start and cant be changed in a user session.
Possible values: Possible values:
@ -1979,7 +1979,7 @@ Default value: 8.
## background_schedule_pool_size {#background_schedule_pool_size} ## background_schedule_pool_size {#background_schedule_pool_size}
Sets the number of threads performing background tasks for [replicated](../../engines/table-engines/mergetree-family/replication.md) tables, [Kafka](../../engines/table-engines/integrations/kafka.md) streaming, [DNS cache updates](../../operations/server-configuration-parameters/settings.md#server-settings-dns-cache-update-period). This setting is applied at ClickHouse server start and cant be changed in a user session. Sets the number of threads performing background tasks for [replicated](../../engines/table-engines/mergetree-family/replication.md) tables, [Kafka](../../engines/table-engines/integrations/kafka.md) streaming, [DNS cache updates](../../operations/server-configuration-parameters/settings.md/#server-settings-dns-cache-update-period). This setting is applied at ClickHouse server start and cant be changed in a user session.
Possible values: Possible values:
@ -2036,12 +2036,12 @@ Default value: 16.
**See Also** **See Also**
- [Kafka](../../engines/table-engines/integrations/kafka.md#kafka) engine. - [Kafka](../../engines/table-engines/integrations/kafka.md/#kafka) engine.
- [RabbitMQ](../../engines/table-engines/integrations/rabbitmq.md#rabbitmq-engine) engine. - [RabbitMQ](../../engines/table-engines/integrations/rabbitmq.md/#rabbitmq-engine) engine.
## validate_polygons {#validate_polygons} ## validate_polygons {#validate_polygons}
Enables or disables throwing an exception in the [pointInPolygon](../../sql-reference/functions/geo/index.md#pointinpolygon) function, if the polygon is self-intersecting or self-tangent. Enables or disables throwing an exception in the [pointInPolygon](../../sql-reference/functions/geo/index.md/#pointinpolygon) function, if the polygon is self-intersecting or self-tangent.
Possible values: Possible values:
@ -2052,7 +2052,7 @@ Default value: 1.
## transform_null_in {#transform_null_in} ## transform_null_in {#transform_null_in}
Enables equality of [NULL](../../sql-reference/syntax.md#null-literal) values for [IN](../../sql-reference/operators/in.md) operator. Enables equality of [NULL](../../sql-reference/syntax.md/#null-literal) values for [IN](../../sql-reference/operators/in.md) operator.
By default, `NULL` values cant be compared because `NULL` means undefined value. Thus, comparison `expr = NULL` must always return `false`. With this setting `NULL = NULL` returns `true` for `IN` operator. By default, `NULL` values cant be compared because `NULL` means undefined value. Thus, comparison `expr = NULL` must always return `false`. With this setting `NULL = NULL` returns `true` for `IN` operator.
@ -2106,7 +2106,7 @@ Result:
**See Also** **See Also**
- [NULL Processing in IN Operators](../../sql-reference/operators/in.md#in-null-processing) - [NULL Processing in IN Operators](../../sql-reference/operators/in.md/#in-null-processing)
## low_cardinality_max_dictionary_size {#low_cardinality_max_dictionary_size} ## low_cardinality_max_dictionary_size {#low_cardinality_max_dictionary_size}
@ -2133,7 +2133,7 @@ Default value: 0.
## low_cardinality_allow_in_native_format {#low_cardinality_allow_in_native_format} ## low_cardinality_allow_in_native_format {#low_cardinality_allow_in_native_format}
Allows or restricts using the [LowCardinality](../../sql-reference/data-types/lowcardinality.md) data type with the [Native](../../interfaces/formats.md#native) format. Allows or restricts using the [LowCardinality](../../sql-reference/data-types/lowcardinality.md) data type with the [Native](../../interfaces/formats.md/#native) format.
If usage of `LowCardinality` is restricted, ClickHouse server converts `LowCardinality`-columns to ordinary ones for `SELECT` queries, and convert ordinary columns to `LowCardinality`-columns for `INSERT` queries. If usage of `LowCardinality` is restricted, ClickHouse server converts `LowCardinality`-columns to ordinary ones for `SELECT` queries, and convert ordinary columns to `LowCardinality`-columns for `INSERT` queries.
@ -2197,7 +2197,7 @@ Default value: 268435456.
## optimize_read_in_order {#optimize_read_in_order} ## optimize_read_in_order {#optimize_read_in_order}
Enables [ORDER BY](../../sql-reference/statements/select/order-by.md#optimize_read_in_order) optimization in [SELECT](../../sql-reference/statements/select/index.md) queries for reading data from [MergeTree](../../engines/table-engines/mergetree-family/mergetree.md) tables. Enables [ORDER BY](../../sql-reference/statements/select/order-by.md/#optimize_read_in_order) optimization in [SELECT](../../sql-reference/statements/select/index.md) queries for reading data from [MergeTree](../../engines/table-engines/mergetree-family/mergetree.md) tables.
Possible values: Possible values:
@ -2208,7 +2208,7 @@ Default value: `1`.
**See Also** **See Also**
- [ORDER BY Clause](../../sql-reference/statements/select/order-by.md#optimize_read_in_order) - [ORDER BY Clause](../../sql-reference/statements/select/order-by.md/#optimize_read_in_order)
## optimize_aggregation_in_order {#optimize_aggregation_in_order} ## optimize_aggregation_in_order {#optimize_aggregation_in_order}
@ -2223,11 +2223,11 @@ Default value: `0`.
**See Also** **See Also**
- [GROUP BY optimization](../../sql-reference/statements/select/group-by.md#aggregation-in-order) - [GROUP BY optimization](../../sql-reference/statements/select/group-by.md/#aggregation-in-order)
## mutations_sync {#mutations_sync} ## mutations_sync {#mutations_sync}
Allows to execute `ALTER TABLE ... UPDATE|DELETE` queries ([mutations](../../sql-reference/statements/alter/index.md#mutations)) synchronously. Allows to execute `ALTER TABLE ... UPDATE|DELETE` queries ([mutations](../../sql-reference/statements/alter/index.md/#mutations)) synchronously.
Possible values: Possible values:
@ -2239,8 +2239,8 @@ Default value: `0`.
**See Also** **See Also**
- [Synchronicity of ALTER Queries](../../sql-reference/statements/alter/index.md#synchronicity-of-alter-queries) - [Synchronicity of ALTER Queries](../../sql-reference/statements/alter/index.md/#synchronicity-of-alter-queries)
- [Mutations](../../sql-reference/statements/alter/index.md#mutations) - [Mutations](../../sql-reference/statements/alter/index.md/#mutations)
## ttl_only_drop_parts {#ttl_only_drop_parts} ## ttl_only_drop_parts {#ttl_only_drop_parts}
@ -2261,8 +2261,8 @@ Default value: `0`.
**See Also** **See Also**
- [CREATE TABLE query clauses and settings](../../engines/table-engines/mergetree-family/mergetree.md#mergetree-query-clauses) (`merge_with_ttl_timeout` setting) - [CREATE TABLE query clauses and settings](../../engines/table-engines/mergetree-family/mergetree.md/#mergetree-query-clauses) (`merge_with_ttl_timeout` setting)
- [Table TTL](../../engines/table-engines/mergetree-family/mergetree.md#mergetree-table-ttl) - [Table TTL](../../engines/table-engines/mergetree-family/mergetree.md/#mergetree-table-ttl)
## lock_acquire_timeout {#lock_acquire_timeout} ## lock_acquire_timeout {#lock_acquire_timeout}
@ -2279,7 +2279,7 @@ Default value: `120` seconds.
## cast_keep_nullable {#cast_keep_nullable} ## cast_keep_nullable {#cast_keep_nullable}
Enables or disables keeping of the `Nullable` data type in [CAST](../../sql-reference/functions/type-conversion-functions.md#type_conversion_function-cast) operations. Enables or disables keeping of the `Nullable` data type in [CAST](../../sql-reference/functions/type-conversion-functions.md/#type_conversion_function-cast) operations.
When the setting is enabled and the argument of `CAST` function is `Nullable`, the result is also transformed to `Nullable` type. When the setting is disabled, the result always has the destination type exactly. When the setting is enabled and the argument of `CAST` function is `Nullable`, the result is also transformed to `Nullable` type. When the setting is disabled, the result always has the destination type exactly.
@ -2324,7 +2324,7 @@ Result:
**See Also** **See Also**
- [CAST](../../sql-reference/functions/type-conversion-functions.md#type_conversion_function-cast) function - [CAST](../../sql-reference/functions/type-conversion-functions.md/#type_conversion_function-cast) function
## system_events_show_zero_values {#system_events_show_zero_values} ## system_events_show_zero_values {#system_events_show_zero_values}
@ -2369,7 +2369,7 @@ Result
## persistent {#persistent} ## persistent {#persistent}
Disables persistency for the [Set](../../engines/table-engines/special/set.md#set) and [Join](../../engines/table-engines/special/join.md#join) table engines. Disables persistency for the [Set](../../engines/table-engines/special/set.md/#set) and [Join](../../engines/table-engines/special/join.md/#join) table engines.
Reduces the I/O overhead. Suitable for scenarios that pursue performance and do not require persistence. Reduces the I/O overhead. Suitable for scenarios that pursue performance and do not require persistence.
@ -2382,7 +2382,7 @@ Default value: `1`.
## allow_nullable_key {#allow-nullable-key} ## allow_nullable_key {#allow-nullable-key}
Allows using of the [Nullable](../../sql-reference/data-types/nullable.md#data_type-nullable)-typed values in a sorting and a primary key for [MergeTree](../../engines/table-engines/mergetree-family/mergetree.md#table_engines-mergetree) tables. Allows using of the [Nullable](../../sql-reference/data-types/nullable.md/#data_type-nullable)-typed values in a sorting and a primary key for [MergeTree](../../engines/table-engines/mergetree-family/mergetree.md/#table_engines-mergetree) tables.
Possible values: Possible values:
@ -2401,7 +2401,7 @@ Do not enable this feature in version `<= 21.8`. It's not properly implemented a
## aggregate_functions_null_for_empty {#aggregate_functions_null_for_empty} ## aggregate_functions_null_for_empty {#aggregate_functions_null_for_empty}
Enables or disables rewriting all aggregate functions in a query, adding [-OrNull](../../sql-reference/aggregate-functions/combinators.md#agg-functions-combinator-ornull) suffix to them. Enable it for SQL standard compatibility. Enables or disables rewriting all aggregate functions in a query, adding [-OrNull](../../sql-reference/aggregate-functions/combinators.md/#agg-functions-combinator-ornull) suffix to them. Enable it for SQL standard compatibility.
It is implemented via query rewrite (similar to [count_distinct_implementation](#settings-count_distinct_implementation) setting) to get consistent results for distributed queries. It is implemented via query rewrite (similar to [count_distinct_implementation](#settings-count_distinct_implementation) setting) to get consistent results for distributed queries.
Possible values: Possible values:
@ -2448,7 +2448,7 @@ See examples in [UNION](../../sql-reference/statements/select/union.md).
## data_type_default_nullable {#data_type_default_nullable} ## data_type_default_nullable {#data_type_default_nullable}
Allows data types without explicit modifiers [NULL or NOT NULL](../../sql-reference/statements/create/table.md#null-modifiers) in column definition will be [Nullable](../../sql-reference/data-types/nullable.md#data_type-nullable). Allows data types without explicit modifiers [NULL or NOT NULL](../../sql-reference/statements/create/table.md/#null-modifiers) in column definition will be [Nullable](../../sql-reference/data-types/nullable.md/#data_type-nullable).
Possible values: Possible values:
@ -2478,7 +2478,7 @@ It can be useful when merges are CPU bounded not IO bounded (performing heavy da
## max_final_threads {#max-final-threads} ## max_final_threads {#max-final-threads}
Sets the maximum number of parallel threads for the `SELECT` query data read phase with the [FINAL](../../sql-reference/statements/select/from.md#select-from-final) modifier. Sets the maximum number of parallel threads for the `SELECT` query data read phase with the [FINAL](../../sql-reference/statements/select/from.md/#select-from-final) modifier.
Possible values: Possible values:
@ -2551,7 +2551,7 @@ Result:
└─────────────┘ └─────────────┘
``` ```
Note that this setting influences [Materialized view](../../sql-reference/statements/create/view.md#materialized) and [MaterializedMySQL](../../engines/database-engines/materialized-mysql.md) behaviour. Note that this setting influences [Materialized view](../../sql-reference/statements/create/view.md/#materialized) and [MaterializedMySQL](../../engines/database-engines/materialized-mysql.md) behaviour.
## engine_file_empty_if_not_exists {#engine-file-empty_if-not-exists} ## engine_file_empty_if_not_exists {#engine-file-empty_if-not-exists}
@ -2608,7 +2608,7 @@ Default value: `0`.
## allow_experimental_live_view {#allow-experimental-live-view} ## allow_experimental_live_view {#allow-experimental-live-view}
Allows creation of experimental [live views](../../sql-reference/statements/create/view.md#live-view). Allows creation of experimental [live views](../../sql-reference/statements/create/view.md/#live-view).
Possible values: Possible values:
@ -2619,19 +2619,19 @@ Default value: `0`.
## live_view_heartbeat_interval {#live-view-heartbeat-interval} ## live_view_heartbeat_interval {#live-view-heartbeat-interval}
Sets the heartbeat interval in seconds to indicate [live view](../../sql-reference/statements/create/view.md#live-view) is alive . Sets the heartbeat interval in seconds to indicate [live view](../../sql-reference/statements/create/view.md/#live-view) is alive .
Default value: `15`. Default value: `15`.
## max_live_view_insert_blocks_before_refresh {#max-live-view-insert-blocks-before-refresh} ## max_live_view_insert_blocks_before_refresh {#max-live-view-insert-blocks-before-refresh}
Sets the maximum number of inserted blocks after which mergeable blocks are dropped and query for [live view](../../sql-reference/statements/create/view.md#live-view) is re-executed. Sets the maximum number of inserted blocks after which mergeable blocks are dropped and query for [live view](../../sql-reference/statements/create/view.md/#live-view) is re-executed.
Default value: `64`. Default value: `64`.
## periodic_live_view_refresh {#periodic-live-view-refresh} ## periodic_live_view_refresh {#periodic-live-view-refresh}
Sets the interval in seconds after which periodically refreshed [live view](../../sql-reference/statements/create/view.md#live-view) is forced to refresh. Sets the interval in seconds after which periodically refreshed [live view](../../sql-reference/statements/create/view.md/#live-view) is forced to refresh.
Default value: `60`. Default value: `60`.
@ -2670,7 +2670,7 @@ Default value: 180.
## check_query_single_value_result {#check_query_single_value_result} ## check_query_single_value_result {#check_query_single_value_result}
Defines the level of detail for the [CHECK TABLE](../../sql-reference/statements/check-table.md#checking-mergetree-tables) query result for `MergeTree` family engines . Defines the level of detail for the [CHECK TABLE](../../sql-reference/statements/check-table.md/#checking-mergetree-tables) query result for `MergeTree` family engines .
Possible values: Possible values:
@ -2681,7 +2681,7 @@ Default value: `0`.
## prefer_column_name_to_alias {#prefer-column-name-to-alias} ## prefer_column_name_to_alias {#prefer-column-name-to-alias}
Enables or disables using the original column names instead of aliases in query expressions and clauses. It especially matters when alias is the same as the column name, see [Expression Aliases](../../sql-reference/syntax.md#notes-on-usage). Enable this setting to make aliases syntax rules in ClickHouse more compatible with most other database engines. Enables or disables using the original column names instead of aliases in query expressions and clauses. It especially matters when alias is the same as the column name, see [Expression Aliases](../../sql-reference/syntax.md/#notes-on-usage). Enable this setting to make aliases syntax rules in ClickHouse more compatible with most other database engines.
Possible values: Possible values:
@ -2725,7 +2725,7 @@ Result:
## limit {#limit} ## limit {#limit}
Sets the maximum number of rows to get from the query result. It adjusts the value set by the [LIMIT](../../sql-reference/statements/select/limit.md#limit-clause) clause, so that the limit, specified in the query, cannot exceed the limit, set by this setting. Sets the maximum number of rows to get from the query result. It adjusts the value set by the [LIMIT](../../sql-reference/statements/select/limit.md/#limit-clause) clause, so that the limit, specified in the query, cannot exceed the limit, set by this setting.
Possible values: Possible values:
@ -2736,7 +2736,7 @@ Default value: `0`.
## offset {#offset} ## offset {#offset}
Sets the number of rows to skip before starting to return rows from the query. It adjusts the offset set by the [OFFSET](../../sql-reference/statements/select/offset.md#offset-fetch) clause, so that these two values are summarized. Sets the number of rows to skip before starting to return rows from the query. It adjusts the offset set by the [OFFSET](../../sql-reference/statements/select/offset.md/#offset-fetch) clause, so that these two values are summarized.
Possible values: Possible values:
@ -2773,7 +2773,7 @@ Result:
## optimize_syntax_fuse_functions {#optimize_syntax_fuse_functions} ## optimize_syntax_fuse_functions {#optimize_syntax_fuse_functions}
Enables to fuse aggregate functions with identical argument. It rewrites query contains at least two aggregate functions from [sum](../../sql-reference/aggregate-functions/reference/sum.md#agg_function-sum), [count](../../sql-reference/aggregate-functions/reference/count.md#agg_function-count) or [avg](../../sql-reference/aggregate-functions/reference/avg.md#agg_function-avg) with identical argument to [sumCount](../../sql-reference/aggregate-functions/reference/sumcount.md#agg_function-sumCount). Enables to fuse aggregate functions with identical argument. It rewrites query contains at least two aggregate functions from [sum](../../sql-reference/aggregate-functions/reference/sum.md/#agg_function-sum), [count](../../sql-reference/aggregate-functions/reference/count.md/#agg_function-count) or [avg](../../sql-reference/aggregate-functions/reference/avg.md/#agg_function-avg) with identical argument to [sumCount](../../sql-reference/aggregate-functions/reference/sumcount.md/#agg_function-sumCount).
Possible values: Possible values:
@ -2932,18 +2932,18 @@ If the setting is set to `0`, the table function does not make Nullable columns
## allow_experimental_projection_optimization {#allow-experimental-projection-optimization} ## allow_experimental_projection_optimization {#allow-experimental-projection-optimization}
Enables or disables [projection](../../engines/table-engines/mergetree-family/mergetree.md#projections) optimization when processing `SELECT` queries. Enables or disables [projection](../../engines/table-engines/mergetree-family/mergetree.md/#projections) optimization when processing `SELECT` queries.
Possible values: Possible values:
- 0 — Projection optimization disabled. - 0 — Projection optimization disabled.
- 1 — Projection optimization enabled. - 1 — Projection optimization enabled.
Default value: `0`. Default value: `1`.
## force_optimize_projection {#force-optimize-projection} ## force_optimize_projection {#force-optimize-projection}
Enables or disables the obligatory use of [projections](../../engines/table-engines/mergetree-family/mergetree.md#projections) in `SELECT` queries, when projection optimization is enabled (see [allow_experimental_projection_optimization](#allow-experimental-projection-optimization) setting). Enables or disables the obligatory use of [projections](../../engines/table-engines/mergetree-family/mergetree.md/#projections) in `SELECT` queries, when projection optimization is enabled (see [allow_experimental_projection_optimization](#allow-experimental-projection-optimization) setting).
Possible values: Possible values:
@ -2978,7 +2978,7 @@ Default value: `120` seconds.
## regexp_max_matches_per_row {#regexp-max-matches-per-row} ## regexp_max_matches_per_row {#regexp-max-matches-per-row}
Sets the maximum number of matches for a single regular expression per row. Use it to protect against memory overload when using greedy regular expression in the [extractAllGroupsHorizontal](../../sql-reference/functions/string-search-functions.md#extractallgroups-horizontal) function. Sets the maximum number of matches for a single regular expression per row. Use it to protect against memory overload when using greedy regular expression in the [extractAllGroupsHorizontal](../../sql-reference/functions/string-search-functions.md/#extractallgroups-horizontal) function.
Possible values: Possible values:
@ -3010,7 +3010,7 @@ Default value: `1`.
## short_circuit_function_evaluation {#short-circuit-function-evaluation} ## short_circuit_function_evaluation {#short-circuit-function-evaluation}
Allows calculating the [if](../../sql-reference/functions/conditional-functions.md#if), [multiIf](../../sql-reference/functions/conditional-functions.md#multiif), [and](../../sql-reference/functions/logical-functions.md#logical-and-function), and [or](../../sql-reference/functions/logical-functions.md#logical-or-function) functions according to a [short scheme](https://en.wikipedia.org/wiki/Short-circuit_evaluation). This helps optimize the execution of complex expressions in these functions and prevent possible exceptions (such as division by zero when it is not expected). Allows calculating the [if](../../sql-reference/functions/conditional-functions.md/#if), [multiIf](../../sql-reference/functions/conditional-functions.md/#multiif), [and](../../sql-reference/functions/logical-functions.md/#logical-and-function), and [or](../../sql-reference/functions/logical-functions.md/#logical-or-function) functions according to a [short scheme](https://en.wikipedia.org/wiki/Short-circuit_evaluation). This helps optimize the execution of complex expressions in these functions and prevent possible exceptions (such as division by zero when it is not expected).
Possible values: Possible values:
@ -3022,7 +3022,7 @@ Default value: `enable`.
## max_hyperscan_regexp_length {#max-hyperscan-regexp-length} ## max_hyperscan_regexp_length {#max-hyperscan-regexp-length}
Defines the maximum length for each regular expression in the [hyperscan multi-match functions](../../sql-reference/functions/string-search-functions.md#multimatchanyhaystack-pattern1-pattern2-patternn). Defines the maximum length for each regular expression in the [hyperscan multi-match functions](../../sql-reference/functions/string-search-functions.md/#multimatchanyhaystack-pattern1-pattern2-patternn).
Possible values: Possible values:
@ -3065,7 +3065,7 @@ Exception: Regexp length too large.
## max_hyperscan_regexp_total_length {#max-hyperscan-regexp-total-length} ## max_hyperscan_regexp_total_length {#max-hyperscan-regexp-total-length}
Sets the maximum length total of all regular expressions in each [hyperscan multi-match function](../../sql-reference/functions/string-search-functions.md#multimatchanyhaystack-pattern1-pattern2-patternn). Sets the maximum length total of all regular expressions in each [hyperscan multi-match function](../../sql-reference/functions/string-search-functions.md/#multimatchanyhaystack-pattern1-pattern2-patternn).
Possible values: Possible values:
@ -3142,8 +3142,8 @@ Result:
## enable_extended_results_for_datetime_functions {#enable-extended-results-for-datetime-functions} ## enable_extended_results_for_datetime_functions {#enable-extended-results-for-datetime-functions}
Enables or disables returning results of type: Enables or disables returning results of type:
- `Date32` with extended range (compared to type `Date`) for functions [toStartOfYear](../../sql-reference/functions/date-time-functions.md#tostartofyear), [toStartOfISOYear](../../sql-reference/functions/date-time-functions.md#tostartofisoyear), [toStartOfQuarter](../../sql-reference/functions/date-time-functions.md#tostartofquarter), [toStartOfMonth](../../sql-reference/functions/date-time-functions.md#tostartofmonth), [toStartOfWeek](../../sql-reference/functions/date-time-functions.md#tostartofweek), [toMonday](../../sql-reference/functions/date-time-functions.md#tomonday) and [toLastDayOfMonth](../../sql-reference/functions/date-time-functions.md#tolastdayofmonth). - `Date32` with extended range (compared to type `Date`) for functions [toStartOfYear](../../sql-reference/functions/date-time-functions.md/#tostartofyear), [toStartOfISOYear](../../sql-reference/functions/date-time-functions.md/#tostartofisoyear), [toStartOfQuarter](../../sql-reference/functions/date-time-functions.md/#tostartofquarter), [toStartOfMonth](../../sql-reference/functions/date-time-functions.md/#tostartofmonth), [toStartOfWeek](../../sql-reference/functions/date-time-functions.md/#tostartofweek), [toMonday](../../sql-reference/functions/date-time-functions.md/#tomonday) and [toLastDayOfMonth](../../sql-reference/functions/date-time-functions.md/#tolastdayofmonth).
- `DateTime64` with extended range (compared to type `DateTime`) for functions [toStartOfDay](../../sql-reference/functions/date-time-functions.md#tostartofday), [toStartOfHour](../../sql-reference/functions/date-time-functions.md#tostartofhour), [toStartOfMinute](../../sql-reference/functions/date-time-functions.md#tostartofminute), [toStartOfFiveMinutes](../../sql-reference/functions/date-time-functions.md#tostartoffiveminutes), [toStartOfTenMinutes](../../sql-reference/functions/date-time-functions.md#tostartoftenminutes), [toStartOfFifteenMinutes](../../sql-reference/functions/date-time-functions.md#tostartoffifteenminutes) and [timeSlot](../../sql-reference/functions/date-time-functions.md#timeslot). - `DateTime64` with extended range (compared to type `DateTime`) for functions [toStartOfDay](../../sql-reference/functions/date-time-functions.md/#tostartofday), [toStartOfHour](../../sql-reference/functions/date-time-functions.md/#tostartofhour), [toStartOfMinute](../../sql-reference/functions/date-time-functions.md/#tostartofminute), [toStartOfFiveMinutes](../../sql-reference/functions/date-time-functions.md/#tostartoffiveminutes), [toStartOfTenMinutes](../../sql-reference/functions/date-time-functions.md/#tostartoftenminutes), [toStartOfFifteenMinutes](../../sql-reference/functions/date-time-functions.md/#tostartoffifteenminutes) and [timeSlot](../../sql-reference/functions/date-time-functions.md/#timeslot).
Possible values: Possible values:
@ -3167,7 +3167,7 @@ Default value: `1`.
## optimize_move_to_prewhere_if_final {#optimize_move_to_prewhere_if_final} ## optimize_move_to_prewhere_if_final {#optimize_move_to_prewhere_if_final}
Enables or disables automatic [PREWHERE](../../sql-reference/statements/select/prewhere.md) optimization in [SELECT](../../sql-reference/statements/select/index.md) queries with [FINAL](../../sql-reference/statements/select/from.md#select-from-final) modifier. Enables or disables automatic [PREWHERE](../../sql-reference/statements/select/prewhere.md) optimization in [SELECT](../../sql-reference/statements/select/index.md) queries with [FINAL](../../sql-reference/statements/select/from.md/#select-from-final) modifier.
Works only for [*MergeTree](../../engines/table-engines/mergetree-family/index.md) tables. Works only for [*MergeTree](../../engines/table-engines/mergetree-family/index.md) tables.
@ -3184,7 +3184,7 @@ Default value: `0`.
## describe_include_subcolumns {#describe_include_subcolumns} ## describe_include_subcolumns {#describe_include_subcolumns}
Enables describing subcolumns for a [DESCRIBE](../../sql-reference/statements/describe-table.md) query. For example, members of a [Tuple](../../sql-reference/data-types/tuple.md) or subcolumns of a [Map](../../sql-reference/data-types/map.md#map-subcolumns), [Nullable](../../sql-reference/data-types/nullable.md#finding-null) or an [Array](../../sql-reference/data-types/array.md#array-size) data type. Enables describing subcolumns for a [DESCRIBE](../../sql-reference/statements/describe-table.md) query. For example, members of a [Tuple](../../sql-reference/data-types/tuple.md) or subcolumns of a [Map](../../sql-reference/data-types/map.md/#map-subcolumns), [Nullable](../../sql-reference/data-types/nullable.md/#finding-null) or an [Array](../../sql-reference/data-types/array.md/#array-size) data type.
Possible values: Possible values:
@ -3283,7 +3283,7 @@ Default value: `0`.
## alter_partition_verbose_result {#alter-partition-verbose-result} ## alter_partition_verbose_result {#alter-partition-verbose-result}
Enables or disables the display of information about the parts to which the manipulation operations with partitions and parts have been successfully applied. Enables or disables the display of information about the parts to which the manipulation operations with partitions and parts have been successfully applied.
Applicable to [ATTACH PARTITION|PART](../../sql-reference/statements/alter/partition.md#alter_attach-partition) and to [FREEZE PARTITION](../../sql-reference/statements/alter/partition.md#alter_freeze-partition). Applicable to [ATTACH PARTITION|PART](../../sql-reference/statements/alter/partition.md/#alter_attach-partition) and to [FREEZE PARTITION](../../sql-reference/statements/alter/partition.md/#alter_freeze-partition).
Possible values: Possible values:
@ -3399,6 +3399,17 @@ Use schema from cache for URL with last modification time validation (for urls w
Default value: `true`. Default value: `true`.
## use_structure_from_insertion_table_in_table_functions {use_structure_from_insertion_table_in_table_functions}
Use structure from insertion table instead of schema inference from data.
Possible values:
- 0 - disabled
- 1 - enabled
- 2 - auto
Default value: 2.
## compatibility {#compatibility} ## compatibility {#compatibility}
This setting changes other settings according to provided ClickHouse version. This setting changes other settings according to provided ClickHouse version.
@ -3418,11 +3429,11 @@ When writing data, ClickHouse throws an exception if input data contain columns
Supported formats: Supported formats:
- [JSONEachRow](../../interfaces/formats.md#jsoneachrow) - [JSONEachRow](../../interfaces/formats.md/#jsoneachrow)
- [TSKV](../../interfaces/formats.md#tskv) - [TSKV](../../interfaces/formats.md/#tskv)
- All formats with suffixes WithNames/WithNamesAndTypes - All formats with suffixes WithNames/WithNamesAndTypes
- [JSONColumns](../../interfaces/formats.md#jsoncolumns) - [JSONColumns](../../interfaces/formats.md/#jsoncolumns)
- [MySQLDump](../../interfaces/formats.md#mysqldump) - [MySQLDump](../../interfaces/formats.md/#mysqldump)
Possible values: Possible values:
@ -3439,18 +3450,18 @@ To improve insert performance, we recommend disabling this check if you are sure
Supported formats: Supported formats:
- [CSVWithNames](../../interfaces/formats.md#csvwithnames) - [CSVWithNames](../../interfaces/formats.md/#csvwithnames)
- [CSVWithNamesAndTypes](../../interfaces/formats.md#csvwithnamesandtypes) - [CSVWithNamesAndTypes](../../interfaces/formats.md/#csvwithnamesandtypes)
- [TabSeparatedWithNames](../../interfaces/formats.md#tabseparatedwithnames) - [TabSeparatedWithNames](../../interfaces/formats.md/#tabseparatedwithnames)
- [TabSeparatedWithNamesAndTypes](../../interfaces/formats.md#tabseparatedwithnamesandtypes) - [TabSeparatedWithNamesAndTypes](../../interfaces/formats.md/#tabseparatedwithnamesandtypes)
- [JSONCompactEachRowWithNames](../../interfaces/formats.md#jsoncompacteachrowwithnames) - [JSONCompactEachRowWithNames](../../interfaces/formats.md/#jsoncompacteachrowwithnames)
- [JSONCompactEachRowWithNamesAndTypes](../../interfaces/formats.md#jsoncompacteachrowwithnamesandtypes) - [JSONCompactEachRowWithNamesAndTypes](../../interfaces/formats.md/#jsoncompacteachrowwithnamesandtypes)
- [JSONCompactStringsEachRowWithNames](../../interfaces/formats.md#jsoncompactstringseachrowwithnames) - [JSONCompactStringsEachRowWithNames](../../interfaces/formats.md/#jsoncompactstringseachrowwithnames)
- [JSONCompactStringsEachRowWithNamesAndTypes](../../interfaces/formats.md#jsoncompactstringseachrowwithnamesandtypes) - [JSONCompactStringsEachRowWithNamesAndTypes](../../interfaces/formats.md/#jsoncompactstringseachrowwithnamesandtypes)
- [RowBinaryWithNames](../../interfaces/formats.md#rowbinarywithnames) - [RowBinaryWithNames](../../interfaces/formats.md/#rowbinarywithnames)
- [RowBinaryWithNamesAndTypes](../../interfaces/formats.md#rowbinarywithnamesandtypes) - [RowBinaryWithNamesAndTypes](../../interfaces/formats.md/#rowbinarywithnamesandtypes)
- [CustomSeparatedWithNames](../../interfaces/formats.md#customseparatedwithnames) - [CustomSeparatedWithNames](../../interfaces/formats.md/#customseparatedwithnames)
- [CustomSeparatedWithNamesAndTypes](../../interfaces/formats.md#customseparatedwithnamesandtypes) - [CustomSeparatedWithNamesAndTypes](../../interfaces/formats.md/#customseparatedwithnamesandtypes)
Possible values: Possible values:
@ -3465,12 +3476,12 @@ Controls whether format parser should check if data types from the input data ma
Supported formats: Supported formats:
- [CSVWithNamesAndTypes](../../interfaces/formats.md#csvwithnamesandtypes) - [CSVWithNamesAndTypes](../../interfaces/formats.md/#csvwithnamesandtypes)
- [TabSeparatedWithNamesAndTypes](../../interfaces/formats.md#tabseparatedwithnamesandtypes) - [TabSeparatedWithNamesAndTypes](../../interfaces/formats.md/#tabseparatedwithnamesandtypes)
- [JSONCompactEachRowWithNamesAndTypes](../../interfaces/formats.md#jsoncompacteachrowwithnamesandtypes) - [JSONCompactEachRowWithNamesAndTypes](../../interfaces/formats.md/#jsoncompacteachrowwithnamesandtypes)
- [JSONCompactStringsEachRowWithNamesAndTypes](../../interfaces/formats.md#jsoncompactstringseachrowwithnamesandtypes) - [JSONCompactStringsEachRowWithNamesAndTypes](../../interfaces/formats.md/#jsoncompactstringseachrowwithnamesandtypes)
- [RowBinaryWithNamesAndTypes](../../interfaces/formats.md#rowbinarywithnamesandtypes-rowbinarywithnamesandtypes) - [RowBinaryWithNamesAndTypes](../../interfaces/formats.md/#rowbinarywithnamesandtypes-rowbinarywithnamesandtypes)
- [CustomSeparatedWithNamesAndTypes](../../interfaces/formats.md#customseparatedwithnamesandtypes) - [CustomSeparatedWithNamesAndTypes](../../interfaces/formats.md/#customseparatedwithnamesandtypes)
Possible values: Possible values:
@ -3481,7 +3492,7 @@ Default value: 1.
## input_format_defaults_for_omitted_fields {#input_format_defaults_for_omitted_fields} ## input_format_defaults_for_omitted_fields {#input_format_defaults_for_omitted_fields}
When performing `INSERT` queries, replace omitted input column values with default values of the respective columns. This option only applies to [JSONEachRow](../../interfaces/formats.md#jsoneachrow), [CSV](../../interfaces/formats.md#csv), [TabSeparated](../../interfaces/formats.md#tabseparated) formats and formats with `WithNames`/`WithNamesAndTypes` suffixes. When performing `INSERT` queries, replace omitted input column values with default values of the respective columns. This option only applies to [JSONEachRow](../../interfaces/formats.md/#jsoneachrow), [CSV](../../interfaces/formats.md/#csv), [TabSeparated](../../interfaces/formats.md/#tabseparated) formats and formats with `WithNames`/`WithNamesAndTypes` suffixes.
:::note :::note
When this option is enabled, extended table metadata are sent from server to client. It consumes additional computing resources on the server and can reduce performance. When this option is enabled, extended table metadata are sent from server to client. It consumes additional computing resources on the server and can reduce performance.
@ -3496,7 +3507,7 @@ Default value: 1.
## input_format_null_as_default {#input_format_null_as_default} ## input_format_null_as_default {#input_format_null_as_default}
Enables or disables the initialization of [NULL](../../sql-reference/syntax.md#null-literal) fields with [default values](../../sql-reference/statements/create/table.md#create-default-values), if data type of these fields is not [nullable](../../sql-reference/data-types/nullable.md#data_type-nullable). Enables or disables the initialization of [NULL](../../sql-reference/syntax.md/#null-literal) fields with [default values](../../sql-reference/statements/create/table.md/#create-default-values), if data type of these fields is not [nullable](../../sql-reference/data-types/nullable.md/#data_type-nullable).
If column type is not nullable and this setting is disabled, then inserting `NULL` causes an exception. If column type is nullable, then `NULL` values are inserted as is, regardless of this setting. If column type is not nullable and this setting is disabled, then inserting `NULL` causes an exception. If column type is nullable, then `NULL` values are inserted as is, regardless of this setting.
This setting is applicable to [INSERT ... VALUES](../../sql-reference/statements/insert-into.md) queries for text input formats. This setting is applicable to [INSERT ... VALUES](../../sql-reference/statements/insert-into.md) queries for text input formats.
@ -3663,7 +3674,7 @@ Enabled by default
## insert_distributed_one_random_shard {#insert_distributed_one_random_shard} ## insert_distributed_one_random_shard {#insert_distributed_one_random_shard}
Enables or disables random shard insertion into a [Distributed](../../engines/table-engines/special/distributed.md#distributed) table when there is no distributed key. Enables or disables random shard insertion into a [Distributed](../../engines/table-engines/special/distributed.md/#distributed) table when there is no distributed key.
By default, when inserting data into a `Distributed` table with more than one shard, the ClickHouse server will reject any insertion request if there is no distributed key. When `insert_distributed_one_random_shard = 1`, insertions are allowed and data is forwarded randomly among all shards. By default, when inserting data into a `Distributed` table with more than one shard, the ClickHouse server will reject any insertion request if there is no distributed key. When `insert_distributed_one_random_shard = 1`, insertions are allowed and data is forwarded randomly among all shards.
@ -3682,7 +3693,7 @@ Enables or disables the insertion of JSON data with nested objects.
Supported formats: Supported formats:
- [JSONEachRow](../../interfaces/formats.md#jsoneachrow) - [JSONEachRow](../../interfaces/formats.md/#jsoneachrow)
Possible values: Possible values:
@ -3693,7 +3704,7 @@ Default value: 0.
See also: See also:
- [Usage of Nested Structures](../../interfaces/formats.md#jsoneachrow-nested) with the `JSONEachRow` format. - [Usage of Nested Structures](../../interfaces/formats.md/#jsoneachrow-nested) with the `JSONEachRow` format.
### input_format_json_read_bools_as_numbers {#input_format_json_read_bools_as_numbers} ### input_format_json_read_bools_as_numbers {#input_format_json_read_bools_as_numbers}
@ -3716,7 +3727,7 @@ Enabled by default.
### output_format_json_quote_64bit_integers {#output_format_json_quote_64bit_integers} ### output_format_json_quote_64bit_integers {#output_format_json_quote_64bit_integers}
Controls quoting of 64-bit or bigger [integers](../../sql-reference/data-types/int-uint.md) (like `UInt64` or `Int128`) when they are output in a [JSON](../../interfaces/formats.md#json) format. Controls quoting of 64-bit or bigger [integers](../../sql-reference/data-types/int-uint.md) (like `UInt64` or `Int128`) when they are output in a [JSON](../../interfaces/formats.md/#json) format.
Such integers are enclosed in quotes by default. This behavior is compatible with most JavaScript implementations. Such integers are enclosed in quotes by default. This behavior is compatible with most JavaScript implementations.
Possible values: Possible values:
@ -3734,7 +3745,7 @@ Disabled by default.
### output_format_json_quote_denormals {#output_format_json_quote_denormals} ### output_format_json_quote_denormals {#output_format_json_quote_denormals}
Enables `+nan`, `-nan`, `+inf`, `-inf` outputs in [JSON](../../interfaces/formats.md#json) output format. Enables `+nan`, `-nan`, `+inf`, `-inf` outputs in [JSON](../../interfaces/formats.md/#json) output format.
Possible values: Possible values:
@ -3851,7 +3862,7 @@ Disabled by default.
### output_format_json_array_of_rows {#output_format_json_array_of_rows} ### output_format_json_array_of_rows {#output_format_json_array_of_rows}
Enables the ability to output all rows as a JSON array in the [JSONEachRow](../../interfaces/formats.md#jsoneachrow) format. Enables the ability to output all rows as a JSON array in the [JSONEachRow](../../interfaces/formats.md/#jsoneachrow) format.
Possible values: Possible values:
@ -3904,7 +3915,7 @@ Disabled by default.
### format_json_object_each_row_column_for_object_name {#format_json_object_each_row_column_for_object_name} ### format_json_object_each_row_column_for_object_name {#format_json_object_each_row_column_for_object_name}
The name of column that will be used for storing/writing object names in [JSONObjectEachRow](../../interfaces/formats.md#jsonobjecteachrow) format. The name of column that will be used for storing/writing object names in [JSONObjectEachRow](../../interfaces/formats.md/#jsonobjecteachrow) format.
Column type should be String. If value is empty, default names `row_{i}`will be used for object names. Column type should be String. If value is empty, default names `row_{i}`will be used for object names.
Default value: ''. Default value: ''.
@ -4005,7 +4016,7 @@ Disabled by default.
### format_tsv_null_representation {#format_tsv_null_representation} ### format_tsv_null_representation {#format_tsv_null_representation}
Defines the representation of `NULL` for [TSV](../../interfaces/formats.md#tabseparated) output and input formats. User can set any string as a value, for example, `My NULL`. Defines the representation of `NULL` for [TSV](../../interfaces/formats.md/#tabseparated) output and input formats. User can set any string as a value, for example, `My NULL`.
Default value: `\N`. Default value: `\N`.
@ -4159,7 +4170,7 @@ Default value: `0`.
### format_csv_null_representation {#format_csv_null_representation} ### format_csv_null_representation {#format_csv_null_representation}
Defines the representation of `NULL` for [CSV](../../interfaces/formats.md#csv) output and input formats. User can set any string as a value, for example, `My NULL`. Defines the representation of `NULL` for [CSV](../../interfaces/formats.md/#csv) output and input formats. User can set any string as a value, for example, `My NULL`.
Default value: `\N`. Default value: `\N`.
@ -4198,7 +4209,7 @@ My NULL
### input_format_values_interpret_expressions {#input_format_values_interpret_expressions} ### input_format_values_interpret_expressions {#input_format_values_interpret_expressions}
Enables or disables the full SQL parser if the fast stream parser cant parse the data. This setting is used only for the [Values](../../interfaces/formats.md#data-format-values) format at the data insertion. For more information about syntax parsing, see the [Syntax](../../sql-reference/syntax.md) section. Enables or disables the full SQL parser if the fast stream parser cant parse the data. This setting is used only for the [Values](../../interfaces/formats.md/#data-format-values) format at the data insertion. For more information about syntax parsing, see the [Syntax](../../sql-reference/syntax.md) section.
Possible values: Possible values:
@ -4248,7 +4259,7 @@ Ok.
### input_format_values_deduce_templates_of_expressions {#input_format_values_deduce_templates_of_expressions} ### input_format_values_deduce_templates_of_expressions {#input_format_values_deduce_templates_of_expressions}
Enables or disables template deduction for SQL expressions in [Values](../../interfaces/formats.md#data-format-values) format. It allows parsing and interpreting expressions in `Values` much faster if expressions in consecutive rows have the same structure. ClickHouse tries to deduce the template of an expression, parse the following rows using this template and evaluate the expression on a batch of successfully parsed rows. Enables or disables template deduction for SQL expressions in [Values](../../interfaces/formats.md/#data-format-values) format. It allows parsing and interpreting expressions in `Values` much faster if expressions in consecutive rows have the same structure. ClickHouse tries to deduce the template of an expression, parse the following rows using this template and evaluate the expression on a batch of successfully parsed rows.
Possible values: Possible values:
@ -4293,7 +4304,7 @@ Default value: 1.
### input_format_arrow_import_nested {#input_format_arrow_import_nested} ### input_format_arrow_import_nested {#input_format_arrow_import_nested}
Enables or disables the ability to insert the data into [Nested](../../sql-reference/data-types/nested-data-structures/nested.md) columns as an array of structs in [Arrow](../../interfaces/formats.md#data_types-matching-arrow) input format. Enables or disables the ability to insert the data into [Nested](../../sql-reference/data-types/nested-data-structures/nested.md) columns as an array of structs in [Arrow](../../interfaces/formats.md/#data_types-matching-arrow) input format.
Possible values: Possible values:
@ -4322,7 +4333,7 @@ Disabled by default.
### output_format_arrow_low_cardinality_as_dictionary {#output_format_arrow_low_cardinality_as_dictionary} ### output_format_arrow_low_cardinality_as_dictionary {#output_format_arrow_low_cardinality_as_dictionary}
Allows to convert the [LowCardinality](../../sql-reference/data-types/lowcardinality.md) type to the `DICTIONARY` type of the [Arrow](../../interfaces/formats.md#data-format-arrow) format for `SELECT` queries. Allows to convert the [LowCardinality](../../sql-reference/data-types/lowcardinality.md) type to the `DICTIONARY` type of the [Arrow](../../interfaces/formats.md/#data-format-arrow) format for `SELECT` queries.
Possible values: Possible values:
@ -4341,7 +4352,7 @@ Disabled by default.
### input_format_orc_import_nested {#input_format_orc_import_nested} ### input_format_orc_import_nested {#input_format_orc_import_nested}
Enables or disables the ability to insert the data into [Nested](../../sql-reference/data-types/nested-data-structures/nested.md) columns as an array of structs in [ORC](../../interfaces/formats.md#data-format-orc) input format. Enables or disables the ability to insert the data into [Nested](../../sql-reference/data-types/nested-data-structures/nested.md) columns as an array of structs in [ORC](../../interfaces/formats.md/#data-format-orc) input format.
Possible values: Possible values:
@ -4384,7 +4395,7 @@ Disabled by default.
## input_format_parquet_import_nested {#input_format_parquet_import_nested} ## input_format_parquet_import_nested {#input_format_parquet_import_nested}
Enables or disables the ability to insert the data into [Nested](../../sql-reference/data-types/nested-data-structures/nested.md) columns as an array of structs in [Parquet](../../interfaces/formats.md#data-format-parquet) input format. Enables or disables the ability to insert the data into [Nested](../../sql-reference/data-types/nested-data-structures/nested.md) columns as an array of structs in [Parquet](../../interfaces/formats.md/#data-format-parquet) input format.
Possible values: Possible values:
@ -4481,7 +4492,7 @@ Disabled by default.
### input_format_avro_allow_missing_fields {#input_format_avro_allow_missing_fields} ### input_format_avro_allow_missing_fields {#input_format_avro_allow_missing_fields}
Enables using fields that are not specified in [Avro](../../interfaces/formats.md#data-format-avro) or [AvroConfluent](../../interfaces/formats.md#data-format-avro-confluent) format schema. When a field is not found in the schema, ClickHouse uses the default value instead of throwing an exception. Enables using fields that are not specified in [Avro](../../interfaces/formats.md/#data-format-avro) or [AvroConfluent](../../interfaces/formats.md/#data-format-avro-confluent) format schema. When a field is not found in the schema, ClickHouse uses the default value instead of throwing an exception.
Possible values: Possible values:
@ -4492,7 +4503,7 @@ Default value: 0.
### format_avro_schema_registry_url {#format_avro_schema_registry_url} ### format_avro_schema_registry_url {#format_avro_schema_registry_url}
Sets [Confluent Schema Registry](https://docs.confluent.io/current/schema-registry/index.html) URL to use with [AvroConfluent](../../interfaces/formats.md#data-format-avro-confluent) format. Sets [Confluent Schema Registry](https://docs.confluent.io/current/schema-registry/index.html) URL to use with [AvroConfluent](../../interfaces/formats.md/#data-format-avro-confluent) format.
Default value: `Empty`. Default value: `Empty`.
@ -4549,7 +4560,7 @@ Default value: `250`.
### output_format_pretty_max_value_width {#output_format_pretty_max_value_width} ### output_format_pretty_max_value_width {#output_format_pretty_max_value_width}
Limits the width of value displayed in [Pretty](../../interfaces/formats.md#pretty) formats. If the value width exceeds the limit, the value is cut. Limits the width of value displayed in [Pretty](../../interfaces/formats.md/#pretty) formats. If the value width exceeds the limit, the value is cut.
Possible values: Possible values:
@ -4625,7 +4636,7 @@ SELECT * FROM a;
### output_format_pretty_row_numbers {#output_format_pretty_row_numbers} ### output_format_pretty_row_numbers {#output_format_pretty_row_numbers}
Adds row numbers to output in the [Pretty](../../interfaces/formats.md#pretty) format. Adds row numbers to output in the [Pretty](../../interfaces/formats.md/#pretty) format.
Possible values: Possible values:
@ -4670,52 +4681,52 @@ Delimiter between rows (for Template format).
### format_custom_escaping_rule {#format_custom_escaping_rule} ### format_custom_escaping_rule {#format_custom_escaping_rule}
Sets the field escaping rule for [CustomSeparated](../../interfaces/formats.md#format-customseparated) data format. Sets the field escaping rule for [CustomSeparated](../../interfaces/formats.md/#format-customseparated) data format.
Possible values: Possible values:
- `'Escaped'` — Similarly to [TSV](../../interfaces/formats.md#tabseparated). - `'Escaped'` — Similarly to [TSV](../../interfaces/formats.md/#tabseparated).
- `'Quoted'` — Similarly to [Values](../../interfaces/formats.md#data-format-values). - `'Quoted'` — Similarly to [Values](../../interfaces/formats.md/#data-format-values).
- `'CSV'` — Similarly to [CSV](../../interfaces/formats.md#csv). - `'CSV'` — Similarly to [CSV](../../interfaces/formats.md/#csv).
- `'JSON'` — Similarly to [JSONEachRow](../../interfaces/formats.md#jsoneachrow). - `'JSON'` — Similarly to [JSONEachRow](../../interfaces/formats.md/#jsoneachrow).
- `'XML'` — Similarly to [XML](../../interfaces/formats.md#xml). - `'XML'` — Similarly to [XML](../../interfaces/formats.md/#xml).
- `'Raw'` — Extracts subpatterns as a whole, no escaping rules, similarly to [TSVRaw](../../interfaces/formats.md#tabseparatedraw). - `'Raw'` — Extracts subpatterns as a whole, no escaping rules, similarly to [TSVRaw](../../interfaces/formats.md/#tabseparatedraw).
Default value: `'Escaped'`. Default value: `'Escaped'`.
### format_custom_field_delimiter {#format_custom_field_delimiter} ### format_custom_field_delimiter {#format_custom_field_delimiter}
Sets the character that is interpreted as a delimiter between the fields for [CustomSeparated](../../interfaces/formats.md#format-customseparated) data format. Sets the character that is interpreted as a delimiter between the fields for [CustomSeparated](../../interfaces/formats.md/#format-customseparated) data format.
Default value: `'\t'`. Default value: `'\t'`.
### format_custom_row_before_delimiter {#format_custom_row_before_delimiter} ### format_custom_row_before_delimiter {#format_custom_row_before_delimiter}
Sets the character that is interpreted as a delimiter before the field of the first column for [CustomSeparated](../../interfaces/formats.md#format-customseparated) data format. Sets the character that is interpreted as a delimiter before the field of the first column for [CustomSeparated](../../interfaces/formats.md/#format-customseparated) data format.
Default value: `''`. Default value: `''`.
### format_custom_row_after_delimiter {#format_custom_row_after_delimiter} ### format_custom_row_after_delimiter {#format_custom_row_after_delimiter}
Sets the character that is interpreted as a delimiter after the field of the last column for [CustomSeparated](../../interfaces/formats.md#format-customseparated) data format. Sets the character that is interpreted as a delimiter after the field of the last column for [CustomSeparated](../../interfaces/formats.md/#format-customseparated) data format.
Default value: `'\n'`. Default value: `'\n'`.
### format_custom_row_between_delimiter {#format_custom_row_between_delimiter} ### format_custom_row_between_delimiter {#format_custom_row_between_delimiter}
Sets the character that is interpreted as a delimiter between the rows for [CustomSeparated](../../interfaces/formats.md#format-customseparated) data format. Sets the character that is interpreted as a delimiter between the rows for [CustomSeparated](../../interfaces/formats.md/#format-customseparated) data format.
Default value: `''`. Default value: `''`.
### format_custom_result_before_delimiter {#format_custom_result_before_delimiter} ### format_custom_result_before_delimiter {#format_custom_result_before_delimiter}
Sets the character that is interpreted as a prefix before the result set for [CustomSeparated](../../interfaces/formats.md#format-customseparated) data format. Sets the character that is interpreted as a prefix before the result set for [CustomSeparated](../../interfaces/formats.md/#format-customseparated) data format.
Default value: `''`. Default value: `''`.
### format_custom_result_after_delimiter {#format_custom_result_after_delimiter} ### format_custom_result_after_delimiter {#format_custom_result_after_delimiter}
Sets the character that is interpreted as a suffix after the result set for [CustomSeparated](../../interfaces/formats.md#format-customseparated) data format. Sets the character that is interpreted as a suffix after the result set for [CustomSeparated](../../interfaces/formats.md/#format-customseparated) data format.
Default value: `''`. Default value: `''`.
@ -4727,12 +4738,12 @@ Field escaping rule.
Possible values: Possible values:
- `'Escaped'` — Similarly to [TSV](../../interfaces/formats.md#tabseparated). - `'Escaped'` — Similarly to [TSV](../../interfaces/formats.md/#tabseparated).
- `'Quoted'` — Similarly to [Values](../../interfaces/formats.md#data-format-values). - `'Quoted'` — Similarly to [Values](../../interfaces/formats.md/#data-format-values).
- `'CSV'` — Similarly to [CSV](../../interfaces/formats.md#csv). - `'CSV'` — Similarly to [CSV](../../interfaces/formats.md/#csv).
- `'JSON'` — Similarly to [JSONEachRow](../../interfaces/formats.md#jsoneachrow). - `'JSON'` — Similarly to [JSONEachRow](../../interfaces/formats.md/#jsoneachrow).
- `'XML'` — Similarly to [XML](../../interfaces/formats.md#xml). - `'XML'` — Similarly to [XML](../../interfaces/formats.md/#xml).
- `'Raw'` — Extracts subpatterns as a whole, no escaping rules, similarly to [TSVRaw](../../interfaces/formats.md#tabseparatedraw). - `'Raw'` — Extracts subpatterns as a whole, no escaping rules, similarly to [TSVRaw](../../interfaces/formats.md/#tabseparatedraw).
Default value: `Raw`. Default value: `Raw`.
@ -4746,7 +4757,7 @@ Disabled by default.
### format_capn_proto_enum_comparising_mode {#format_capn_proto_enum_comparising_mode} ### format_capn_proto_enum_comparising_mode {#format_capn_proto_enum_comparising_mode}
Determines how to map ClickHouse `Enum` data type and [CapnProto](../../interfaces/formats.md#capnproto) `Enum` data type from schema. Determines how to map ClickHouse `Enum` data type and [CapnProto](../../interfaces/formats.md/#capnproto) `Enum` data type from schema.
Possible values: Possible values:

View File

@ -7,13 +7,13 @@ title: "External Disks for Storing Data"
Data, processed in ClickHouse, is usually stored in the local file system — on the same machine with the ClickHouse server. That requires large-capacity disks, which can be expensive enough. To avoid that you can store the data remotely — on [Amazon S3](https://aws.amazon.com/s3/) disks or in the Hadoop Distributed File System ([HDFS](https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html)). Data, processed in ClickHouse, is usually stored in the local file system — on the same machine with the ClickHouse server. That requires large-capacity disks, which can be expensive enough. To avoid that you can store the data remotely — on [Amazon S3](https://aws.amazon.com/s3/) disks or in the Hadoop Distributed File System ([HDFS](https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html)).
To work with data stored on `Amazon S3` disks use [S3](../engines/table-engines/integrations/s3.md) table engine, and to work with data in the Hadoop Distributed File System — [HDFS](../engines/table-engines/integrations/hdfs.md) table engine. To work with data stored on `Amazon S3` disks use [S3](/docs/en/engines/table-engines/integrations/s3.md) table engine, and to work with data in the Hadoop Distributed File System — [HDFS](/docs/en/engines/table-engines/integrations/hdfs.md) table engine.
To load data from a web server with static files use a disk with type [web](#storing-data-on-webserver). To load data from a web server with static files use a disk with type [web](#storing-data-on-webserver).
## Configuring HDFS {#configuring-hdfs} ## Configuring HDFS {#configuring-hdfs}
[MergeTree](../engines/table-engines/mergetree-family/mergetree.md) and [Log](../engines/table-engines/log-family/log.md) family table engines can store data to HDFS using a disk with type `HDFS`. [MergeTree](/docs/en/engines/table-engines/mergetree-family/mergetree.md) and [Log](/docs/en/engines/table-engines/log-family/log.md) family table engines can store data to HDFS using a disk with type `HDFS`.
Configuration markup: Configuration markup:
@ -53,7 +53,7 @@ Optional parameters:
## Using Virtual File System for Data Encryption {#encrypted-virtual-file-system} ## Using Virtual File System for Data Encryption {#encrypted-virtual-file-system}
You can encrypt the data stored on [S3](../engines/table-engines/mergetree-family/mergetree.md#table_engine-mergetree-s3), or [HDFS](#configuring-hdfs) external disks, or on a local disk. To turn on the encryption mode, in the configuration file you must define a disk with the type `encrypted` and choose a disk on which the data will be saved. An `encrypted` disk ciphers all written files on the fly, and when you read files from an `encrypted` disk it deciphers them automatically. So you can work with an `encrypted` disk like with a normal one. You can encrypt the data stored on [S3](/docs/en/engines/table-engines/mergetree-family/mergetree.md/#table_engine-mergetree-s3), or [HDFS](#configuring-hdfs) external disks, or on a local disk. To turn on the encryption mode, in the configuration file you must define a disk with the type `encrypted` and choose a disk on which the data will be saved. An `encrypted` disk ciphers all written files on the fly, and when you read files from an `encrypted` disk it deciphers them automatically. So you can work with an `encrypted` disk like with a normal one.
Example of disk configuration: Example of disk configuration:
@ -80,14 +80,14 @@ Required parameters:
- `type``encrypted`. Otherwise the encrypted disk is not created. - `type``encrypted`. Otherwise the encrypted disk is not created.
- `disk` — Type of disk for data storage. - `disk` — Type of disk for data storage.
- `key` — The key for encryption and decryption. Type: [Uint64](../sql-reference/data-types/int-uint.md). You can use `key_hex` parameter to encrypt in hexadecimal form. - `key` — The key for encryption and decryption. Type: [Uint64](/docs/en/sql-reference/data-types/int-uint.md). You can use `key_hex` parameter to encrypt in hexadecimal form.
You can specify multiple keys using the `id` attribute (see example above). You can specify multiple keys using the `id` attribute (see example above).
Optional parameters: Optional parameters:
- `path` — Path to the location on the disk where the data will be saved. If not specified, the data will be saved in the root directory. - `path` — Path to the location on the disk where the data will be saved. If not specified, the data will be saved in the root directory.
- `current_key_id` — The key used for encryption. All the specified keys can be used for decryption, and you can always switch to another key while maintaining access to previously encrypted data. - `current_key_id` — The key used for encryption. All the specified keys can be used for decryption, and you can always switch to another key while maintaining access to previously encrypted data.
- `algorithm` — [Algorithm](../sql-reference/statements/create/table.md#create-query-encryption-codecs) for encryption. Possible values: `AES_128_CTR`, `AES_192_CTR` or `AES_256_CTR`. Default value: `AES_128_CTR`. The key length depends on the algorithm: `AES_128_CTR` — 16 bytes, `AES_192_CTR` — 24 bytes, `AES_256_CTR` — 32 bytes. - `algorithm` — [Algorithm](/docs/en/sql-reference/statements/create/table.md/#create-query-encryption-codecs) for encryption. Possible values: `AES_128_CTR`, `AES_192_CTR` or `AES_256_CTR`. Default value: `AES_128_CTR`. The key length depends on the algorithm: `AES_128_CTR` — 16 bytes, `AES_192_CTR` — 24 bytes, `AES_256_CTR` — 32 bytes.
Example of disk configuration: Example of disk configuration:
@ -265,9 +265,9 @@ Cache profile events:
There is a tool `clickhouse-static-files-uploader`, which prepares a data directory for a given table (`SELECT data_paths FROM system.tables WHERE name = 'table_name'`). For each table you need, you get a directory of files. These files can be uploaded to, for example, a web server with static files. After this preparation, you can load this table into any ClickHouse server via `DiskWeb`. There is a tool `clickhouse-static-files-uploader`, which prepares a data directory for a given table (`SELECT data_paths FROM system.tables WHERE name = 'table_name'`). For each table you need, you get a directory of files. These files can be uploaded to, for example, a web server with static files. After this preparation, you can load this table into any ClickHouse server via `DiskWeb`.
This is a read-only disk. Its data is only read and never modified. A new table is loaded to this disk via `ATTACH TABLE` query (see example below). Local disk is not actually used, each `SELECT` query will result in a `http` request to fetch required data. All modification of the table data will result in an exception, i.e. the following types of queries are not allowed: [CREATE TABLE](../sql-reference/statements/create/table.md), [ALTER TABLE](../sql-reference/statements/alter/index.md), [RENAME TABLE](../sql-reference/statements/rename.md#misc_operations-rename_table), [DETACH TABLE](../sql-reference/statements/detach.md) and [TRUNCATE TABLE](../sql-reference/statements/truncate.md). This is a read-only disk. Its data is only read and never modified. A new table is loaded to this disk via `ATTACH TABLE` query (see example below). Local disk is not actually used, each `SELECT` query will result in a `http` request to fetch required data. All modification of the table data will result in an exception, i.e. the following types of queries are not allowed: [CREATE TABLE](/docs/en/sql-reference/statements/create/table.md), [ALTER TABLE](/docs/en/sql-reference/statements/alter/index.md), [RENAME TABLE](/docs/en/sql-reference/statements/rename.md/#misc_operations-rename_table), [DETACH TABLE](/docs/en/sql-reference/statements/detach.md) and [TRUNCATE TABLE](/docs/en/sql-reference/statements/truncate.md).
Web server storage is supported only for the [MergeTree](../engines/table-engines/mergetree-family/mergetree.md) and [Log](../engines/table-engines/log-family/log.md) engine families. To access the data stored on a `web` disk, use the [storage_policy](../engines/table-engines/mergetree-family/mergetree.md#terms) setting when executing the query. For example, `ATTACH TABLE table_web UUID '{}' (id Int32) ENGINE = MergeTree() ORDER BY id SETTINGS storage_policy = 'web'`. Web server storage is supported only for the [MergeTree](/docs/en/engines/table-engines/mergetree-family/mergetree.md) and [Log](/docs/en/engines/table-engines/log-family/log.md) engine families. To access the data stored on a `web` disk, use the [storage_policy](/docs/en/engines/table-engines/mergetree-family/mergetree.md/#terms) setting when executing the query. For example, `ATTACH TABLE table_web UUID '{}' (id Int32) ENGINE = MergeTree() ORDER BY id SETTINGS storage_policy = 'web'`.
A ready test case. You need to add this configuration to config: A ready test case. You need to add this configuration to config:
@ -451,7 +451,7 @@ Optional parameters:
- `remote_fs_read_backoff_threashold` — The maximum wait time when trying to read data for remote disk. Default value: `10000` seconds. - `remote_fs_read_backoff_threashold` — The maximum wait time when trying to read data for remote disk. Default value: `10000` seconds.
- `remote_fs_read_backoff_max_tries` — The maximum number of attempts to read with backoff. Default value: `5`. - `remote_fs_read_backoff_max_tries` — The maximum number of attempts to read with backoff. Default value: `5`.
If a query fails with an exception `DB:Exception Unreachable URL`, then you can try to adjust the settings: [http_connection_timeout](../operations/settings/settings.md#http_connection_timeout), [http_receive_timeout](../operations/settings/settings.md#http_receive_timeout), [keep_alive_timeout](../operations/server-configuration-parameters/settings.md#keep-alive-timeout). If a query fails with an exception `DB:Exception Unreachable URL`, then you can try to adjust the settings: [http_connection_timeout](/docs/en/operations/settings/settings.md/#http_connection_timeout), [http_receive_timeout](/docs/en/operations/settings/settings.md/#http_receive_timeout), [keep_alive_timeout](/docs/en/operations/server-configuration-parameters/settings.md/#keep-alive-timeout).
To get files for upload run: To get files for upload run:
`clickhouse static-files-disk-uploader --metadata-path <path> --output-dir <dir>` (`--metadata-path` can be found in query `SELECT data_paths FROM system.tables WHERE name = 'table_name'`). `clickhouse static-files-disk-uploader --metadata-path <path> --output-dir <dir>` (`--metadata-path` can be found in query `SELECT data_paths FROM system.tables WHERE name = 'table_name'`).
@ -460,7 +460,7 @@ When loading files by `endpoint`, they must be loaded into `<endpoint>/store/` p
If URL is not reachable on disk load when the server is starting up tables, then all errors are caught. If in this case there were errors, tables can be reloaded (become visible) via `DETACH TABLE table_name` -> `ATTACH TABLE table_name`. If metadata was successfully loaded at server startup, then tables are available straight away. If URL is not reachable on disk load when the server is starting up tables, then all errors are caught. If in this case there were errors, tables can be reloaded (become visible) via `DETACH TABLE table_name` -> `ATTACH TABLE table_name`. If metadata was successfully loaded at server startup, then tables are available straight away.
Use [http_max_single_read_retries](../operations/settings/settings.md#http-max-single-read-retries) setting to limit the maximum number of retries during a single HTTP read. Use [http_max_single_read_retries](/docs/en/operations/settings/settings.md/#http-max-single-read-retries) setting to limit the maximum number of retries during a single HTTP read.
## Zero-copy Replication (not ready for production) {#zero-copy} ## Zero-copy Replication (not ready for production) {#zero-copy}

View File

@ -1,7 +1,8 @@
--- ---
slug: /en/operations/system-tables/ slug: /en/operations/system-tables/
sidebar_position: 52 sidebar_position: 52
sidebar_label: System Tables sidebar_label: Overview
pagination_next: 'en/operations/system-tables/asynchronous_metric_log'
--- ---
# System Tables # System Tables
@ -72,4 +73,3 @@ If procfs is supported and enabled on the system, ClickHouse server collects the
- `OSReadBytes` - `OSReadBytes`
- `OSWriteBytes` - `OSWriteBytes`
[Original article](https://clickhouse.com/docs/en/operations/system-tables/) <!--hide-->

View File

@ -178,7 +178,7 @@ Columns:
- `view_definition` ([String](../../sql-reference/data-types/string.md)) — `SELECT` query for view. - `view_definition` ([String](../../sql-reference/data-types/string.md)) — `SELECT` query for view.
- `check_option` ([String](../../sql-reference/data-types/string.md)) — `NONE`, no checking. - `check_option` ([String](../../sql-reference/data-types/string.md)) — `NONE`, no checking.
- `is_updatable` ([Enum8](../../sql-reference/data-types/enum.md)) — `NO`, the view is not updated. - `is_updatable` ([Enum8](../../sql-reference/data-types/enum.md)) — `NO`, the view is not updated.
- `is_insertable_into` ([Enum8](../../sql-reference/data-types/enum.md)) — Shows whether the created view is [materialized](../../sql-reference/statements/create/view/#materialized). Possible values: - `is_insertable_into` ([Enum8](../../sql-reference/data-types/enum.md)) — Shows whether the created view is [materialized](../../sql-reference/statements/create/view.md/#materialized-view). Possible values:
- `NO` — The created view is not materialized. - `NO` — The created view is not materialized.
- `YES` — The created view is materialized. - `YES` — The created view is materialized.
- `is_trigger_updatable` ([Enum8](../../sql-reference/data-types/enum.md)) — `NO`, the trigger is not updated. - `is_trigger_updatable` ([Enum8](../../sql-reference/data-types/enum.md)) — `NO`, the trigger is not updated.

View File

@ -3,31 +3,31 @@ slug: /en/operations/system-tables/mutations
--- ---
# mutations # mutations
The table contains information about [mutations](../../sql-reference/statements/alter/index.md#mutations) of [MergeTree](../../engines/table-engines/mergetree-family/mergetree.md) tables and their progress. Each mutation command is represented by a single row. The table contains information about [mutations](/docs/en/sql-reference/statements/alter/index.md/#mutations) of [MergeTree](/docs/en/engines/table-engines/mergetree-family/mergetree.md) tables and their progress. Each mutation command is represented by a single row.
Columns: Columns:
- `database` ([String](../../sql-reference/data-types/string.md)) — The name of the database to which the mutation was applied. - `database` ([String](/docs/en/sql-reference/data-types/string.md)) — The name of the database to which the mutation was applied.
- `table` ([String](../../sql-reference/data-types/string.md)) — The name of the table to which the mutation was applied. - `table` ([String](/docs/en/sql-reference/data-types/string.md)) — The name of the table to which the mutation was applied.
- `mutation_id` ([String](../../sql-reference/data-types/string.md)) — The ID of the mutation. For replicated tables these IDs correspond to znode names in the `<table_path_in_clickhouse_keeper>/mutations/` directory in ClickHouse Keeper. For non-replicated tables the IDs correspond to file names in the data directory of the table. - `mutation_id` ([String](/docs/en/sql-reference/data-types/string.md)) — The ID of the mutation. For replicated tables these IDs correspond to znode names in the `<table_path_in_clickhouse_keeper>/mutations/` directory in ClickHouse Keeper. For non-replicated tables the IDs correspond to file names in the data directory of the table.
- `command` ([String](../../sql-reference/data-types/string.md)) — The mutation command string (the part of the query after `ALTER TABLE [db.]table`). - `command` ([String](/docs/en/sql-reference/data-types/string.md)) — The mutation command string (the part of the query after `ALTER TABLE [db.]table`).
- `create_time` ([Datetime](../../sql-reference/data-types/datetime.md)) — Date and time when the mutation command was submitted for execution. - `create_time` ([Datetime](/docs/en/sql-reference/data-types/datetime.md)) — Date and time when the mutation command was submitted for execution.
- `block_numbers.partition_id` ([Array](../../sql-reference/data-types/array.md)([String](../../sql-reference/data-types/string.md))) — For mutations of replicated tables, the array contains the partitions' IDs (one record for each partition). For mutations of non-replicated tables the array is empty. - `block_numbers.partition_id` ([Array](/docs/en/sql-reference/data-types/array.md)([String](/docs/en/sql-reference/data-types/string.md))) — For mutations of replicated tables, the array contains the partitions' IDs (one record for each partition). For mutations of non-replicated tables the array is empty.
- `block_numbers.number` ([Array](../../sql-reference/data-types/array.md)([Int64](../../sql-reference/data-types/int-uint.md))) — For mutations of replicated tables, the array contains one record for each partition, with the block number that was acquired by the mutation. Only parts that contain blocks with numbers less than this number will be mutated in the partition. - `block_numbers.number` ([Array](/docs/en/sql-reference/data-types/array.md)([Int64](/docs/en/sql-reference/data-types/int-uint.md))) — For mutations of replicated tables, the array contains one record for each partition, with the block number that was acquired by the mutation. Only parts that contain blocks with numbers less than this number will be mutated in the partition.
In non-replicated tables, block numbers in all partitions form a single sequence. This means that for mutations of non-replicated tables, the column will contain one record with a single block number acquired by the mutation. In non-replicated tables, block numbers in all partitions form a single sequence. This means that for mutations of non-replicated tables, the column will contain one record with a single block number acquired by the mutation.
- `parts_to_do_names` ([Array](../../sql-reference/data-types/array.md)([String](../../sql-reference/data-types/string.md))) — An array of names of data parts that need to be mutated for the mutation to complete. - `parts_to_do_names` ([Array](/docs/en/sql-reference/data-types/array.md)([String](/docs/en/sql-reference/data-types/string.md))) — An array of names of data parts that need to be mutated for the mutation to complete.
- `parts_to_do` ([Int64](../../sql-reference/data-types/int-uint.md)) — The number of data parts that need to be mutated for the mutation to complete. - `parts_to_do` ([Int64](/docs/en/sql-reference/data-types/int-uint.md)) — The number of data parts that need to be mutated for the mutation to complete.
- `is_done` ([UInt8](../../sql-reference/data-types/int-uint.md)) — The flag whether the mutation is done or not. Possible values: - `is_done` ([UInt8](/docs/en/sql-reference/data-types/int-uint.md)) — The flag whether the mutation is done or not. Possible values:
- `1` if the mutation is completed, - `1` if the mutation is completed,
- `0` if the mutation is still in process. - `0` if the mutation is still in process.
@ -37,16 +37,16 @@ Even if `parts_to_do = 0` it is possible that a mutation of a replicated table i
If there were problems with mutating some data parts, the following columns contain additional information: If there were problems with mutating some data parts, the following columns contain additional information:
- `latest_failed_part` ([String](../../sql-reference/data-types/string.md)) — The name of the most recent part that could not be mutated. - `latest_failed_part` ([String](/docs/en/sql-reference/data-types/string.md)) — The name of the most recent part that could not be mutated.
- `latest_fail_time` ([Datetime](../../sql-reference/data-types/datetime.md)) — The date and time of the most recent part mutation failure. - `latest_fail_time` ([Datetime](/docs/en/sql-reference/data-types/datetime.md)) — The date and time of the most recent part mutation failure.
- `latest_fail_reason` ([String](../../sql-reference/data-types/string.md)) — The exception message that caused the most recent part mutation failure. - `latest_fail_reason` ([String](/docs/en/sql-reference/data-types/string.md)) — The exception message that caused the most recent part mutation failure.
**See Also** **See Also**
- [Mutations](../../sql-reference/statements/alter/index.md#mutations) - [Mutations](/docs/en/sql-reference/statements/alter/index.md/#mutations)
- [MergeTree](../../engines/table-engines/mergetree-family/mergetree.md) table engine - [MergeTree](/docs/en/engines/table-engines/mergetree-family/mergetree.md) table engine
- [ReplicatedMergeTree](../../engines/table-engines/mergetree-family/replication.md) family - [ReplicatedMergeTree](/docs/en/engines/table-engines/mergetree-family/replication.md) family
[Original article](https://clickhouse.com/docs/en/operations/system-tables/mutations) <!--hide--> [Original article](https://clickhouse.com/docs/en/operations/system-tables/mutations) <!--hide-->

View File

@ -9,7 +9,7 @@ Each row describes one data part.
Columns: Columns:
- `partition` ([String](../../sql-reference/data-types/string.md)) The partition name. To learn what a partition is, see the description of the [ALTER](../../sql-reference/statements/alter/index.md#query_language_queries_alter) query. - `partition` ([String](../../sql-reference/data-types/string.md)) The partition name. To learn what a partition is, see the description of the [ALTER](../../sql-reference/statements/alter/index.md/#query_language_queries_alter) query.
Formats: Formats:
@ -75,7 +75,7 @@ Columns:
- `primary_key_bytes_in_memory_allocated` ([UInt64](../../sql-reference/data-types/int-uint.md)) The amount of memory (in bytes) reserved for primary key values. - `primary_key_bytes_in_memory_allocated` ([UInt64](../../sql-reference/data-types/int-uint.md)) The amount of memory (in bytes) reserved for primary key values.
- `is_frozen` ([UInt8](../../sql-reference/data-types/int-uint.md)) Flag that shows that a partition data backup exists. 1, the backup exists. 0, the backup does not exist. For more details, see [FREEZE PARTITION](../../sql-reference/statements/alter/partition.md#alter_freeze-partition) - `is_frozen` ([UInt8](../../sql-reference/data-types/int-uint.md)) Flag that shows that a partition data backup exists. 1, the backup exists. 0, the backup does not exist. For more details, see [FREEZE PARTITION](../../sql-reference/statements/alter/partition.md/#alter_freeze-partition)
- `database` ([String](../../sql-reference/data-types/string.md)) Name of the database. - `database` ([String](../../sql-reference/data-types/string.md)) Name of the database.
@ -87,25 +87,25 @@ Columns:
- `disk_name` ([String](../../sql-reference/data-types/string.md)) Name of a disk that stores the data part. - `disk_name` ([String](../../sql-reference/data-types/string.md)) Name of a disk that stores the data part.
- `hash_of_all_files` ([String](../../sql-reference/data-types/string.md)) [sipHash128](../../sql-reference/functions/hash-functions.md#hash_functions-siphash128) of compressed files. - `hash_of_all_files` ([String](../../sql-reference/data-types/string.md)) [sipHash128](../../sql-reference/functions/hash-functions.md/#hash_functions-siphash128) of compressed files.
- `hash_of_uncompressed_files` ([String](../../sql-reference/data-types/string.md)) [sipHash128](../../sql-reference/functions/hash-functions.md#hash_functions-siphash128) of uncompressed files (files with marks, index file etc.). - `hash_of_uncompressed_files` ([String](../../sql-reference/data-types/string.md)) [sipHash128](../../sql-reference/functions/hash-functions.md/#hash_functions-siphash128) of uncompressed files (files with marks, index file etc.).
- `uncompressed_hash_of_compressed_files` ([String](../../sql-reference/data-types/string.md)) [sipHash128](../../sql-reference/functions/hash-functions.md#hash_functions-siphash128) of data in the compressed files as if they were uncompressed. - `uncompressed_hash_of_compressed_files` ([String](../../sql-reference/data-types/string.md)) [sipHash128](../../sql-reference/functions/hash-functions.md/#hash_functions-siphash128) of data in the compressed files as if they were uncompressed.
- `delete_ttl_info_min` ([DateTime](../../sql-reference/data-types/datetime.md)) — The minimum value of the date and time key for [TTL DELETE rule](../../engines/table-engines/mergetree-family/mergetree.md#table_engine-mergetree-ttl). - `delete_ttl_info_min` ([DateTime](../../sql-reference/data-types/datetime.md)) — The minimum value of the date and time key for [TTL DELETE rule](../../engines/table-engines/mergetree-family/mergetree.md/#table_engine-mergetree-ttl).
- `delete_ttl_info_max` ([DateTime](../../sql-reference/data-types/datetime.md)) — The maximum value of the date and time key for [TTL DELETE rule](../../engines/table-engines/mergetree-family/mergetree.md#table_engine-mergetree-ttl). - `delete_ttl_info_max` ([DateTime](../../sql-reference/data-types/datetime.md)) — The maximum value of the date and time key for [TTL DELETE rule](../../engines/table-engines/mergetree-family/mergetree.md/#table_engine-mergetree-ttl).
- `move_ttl_info.expression` ([Array](../../sql-reference/data-types/array.md)([String](../../sql-reference/data-types/string.md))) — Array of expressions. Each expression defines a [TTL MOVE rule](../../engines/table-engines/mergetree-family/mergetree.md#table_engine-mergetree-ttl). - `move_ttl_info.expression` ([Array](../../sql-reference/data-types/array.md)([String](../../sql-reference/data-types/string.md))) — Array of expressions. Each expression defines a [TTL MOVE rule](../../engines/table-engines/mergetree-family/mergetree.md/#table_engine-mergetree-ttl).
:::warning :::warning
The `move_ttl_info.expression` array is kept mostly for backward compatibility, now the simpliest way to check `TTL MOVE` rule is to use the `move_ttl_info.min` and `move_ttl_info.max` fields. The `move_ttl_info.expression` array is kept mostly for backward compatibility, now the simpliest way to check `TTL MOVE` rule is to use the `move_ttl_info.min` and `move_ttl_info.max` fields.
::: :::
- `move_ttl_info.min` ([Array](../../sql-reference/data-types/array.md)([DateTime](../../sql-reference/data-types/datetime.md))) — Array of date and time values. Each element describes the minimum key value for a [TTL MOVE rule](../../engines/table-engines/mergetree-family/mergetree.md#table_engine-mergetree-ttl). - `move_ttl_info.min` ([Array](../../sql-reference/data-types/array.md)([DateTime](../../sql-reference/data-types/datetime.md))) — Array of date and time values. Each element describes the minimum key value for a [TTL MOVE rule](../../engines/table-engines/mergetree-family/mergetree.md/#table_engine-mergetree-ttl).
- `move_ttl_info.max` ([Array](../../sql-reference/data-types/array.md)([DateTime](../../sql-reference/data-types/datetime.md))) — Array of date and time values. Each element describes the maximum key value for a [TTL MOVE rule](../../engines/table-engines/mergetree-family/mergetree.md#table_engine-mergetree-ttl). - `move_ttl_info.max` ([Array](../../sql-reference/data-types/array.md)([DateTime](../../sql-reference/data-types/datetime.md))) — Array of date and time values. Each element describes the maximum key value for a [TTL MOVE rule](../../engines/table-engines/mergetree-family/mergetree.md/#table_engine-mergetree-ttl).
- `bytes` ([UInt64](../../sql-reference/data-types/int-uint.md)) Alias for `bytes_on_disk`. - `bytes` ([UInt64](../../sql-reference/data-types/int-uint.md)) Alias for `bytes_on_disk`.
@ -166,6 +166,6 @@ move_ttl_info.max: []
**See Also** **See Also**
- [MergeTree family](../../engines/table-engines/mergetree-family/mergetree.md) - [MergeTree family](../../engines/table-engines/mergetree-family/mergetree.md)
- [TTL for Columns and Tables](../../engines/table-engines/mergetree-family/mergetree.md#table_engine-mergetree-ttl) - [TTL for Columns and Tables](../../engines/table-engines/mergetree-family/mergetree.md/#table_engine-mergetree-ttl)
[Original article](https://clickhouse.com/docs/en/operations/system-tables/parts) <!--hide--> [Original article](https://clickhouse.com/docs/en/operations/system-tables/parts) <!--hide-->

View File

@ -9,7 +9,7 @@ Each row describes one data part.
Columns: Columns:
- `partition` ([String](../../sql-reference/data-types/string.md)) — The partition name. To learn what a partition is, see the description of the [ALTER](../../sql-reference/statements/alter/index.md#query_language_queries_alter) query. - `partition` ([String](../../sql-reference/data-types/string.md)) — The partition name. To learn what a partition is, see the description of the [ALTER](../../sql-reference/statements/alter/index.md/#query_language_queries_alter) query.
Formats: Formats:

View File

@ -68,6 +68,5 @@ thread_id: 54
**See Also** **See Also**
- [Managing ReplicatedMergeTree Tables](../../sql-reference/statements/system/#query-language-system-replicated) - [Managing ReplicatedMergeTree Tables](../../sql-reference/statements/system.md/#managing-replicatedmergetree-tables)
[Original article](https://clickhouse.com/docs/en/operations/system_tables/replicated_fetches) <!--hide-->

View File

@ -24,6 +24,7 @@ Columns:
- `DOUBLE_SHA1_PASSWORD` - `DOUBLE_SHA1_PASSWORD`
- `LDAP` - `LDAP`
- `KERBEROS` - `KERBEROS`
- `SSL_CERTIFICATE`
- `profiles` ([Array](../../sql-reference/data-types/array.md)([LowCardinality(String)](../../sql-reference/data-types/lowcardinality.md))) — The list of profiles set for all roles and/or users. - `profiles` ([Array](../../sql-reference/data-types/array.md)([LowCardinality(String)](../../sql-reference/data-types/lowcardinality.md))) — The list of profiles set for all roles and/or users.
- `roles` ([Array](../../sql-reference/data-types/array.md)([LowCardinality(String)](../../sql-reference/data-types/lowcardinality.md))) — The list of roles to which the profile is applied. - `roles` ([Array](../../sql-reference/data-types/array.md)([LowCardinality(String)](../../sql-reference/data-types/lowcardinality.md))) — The list of roles to which the profile is applied.
- `settings` ([Array](../../sql-reference/data-types/array.md)([Tuple](../../sql-reference/data-types/tuple.md)([LowCardinality(String)](../../sql-reference/data-types/lowcardinality.md), [String](../../sql-reference/data-types/string.md)))) — Settings that were changed when the client logged in/out. - `settings` ([Array](../../sql-reference/data-types/array.md)([Tuple](../../sql-reference/data-types/tuple.md)([LowCardinality(String)](../../sql-reference/data-types/lowcardinality.md), [String](../../sql-reference/data-types/string.md)))) — Settings that were changed when the client logged in/out.

View File

@ -12,7 +12,7 @@ Columns:
- `storage` ([String](../../sql-reference/data-types/string.md)) — Path to the storage of users. Configured in the `access_control_path` parameter. - `storage` ([String](../../sql-reference/data-types/string.md)) — Path to the storage of users. Configured in the `access_control_path` parameter.
- `auth_type` ([Enum8](../../sql-reference/data-types/enum.md)('no_password' = 0,'plaintext_password' = 1, 'sha256_password' = 2, 'double_sha1_password' = 3)) — Shows the authentication type. There are multiple ways of user identification: with no password, with plain text password, with [SHA256](https://ru.wikipedia.org/wiki/SHA-2)-encoded password or with [double SHA-1](https://ru.wikipedia.org/wiki/SHA-1)-encoded password. - `auth_type` ([Enum8](../../sql-reference/data-types/enum.md)('no_password' = 0,'plaintext_password' = 1, 'sha256_password' = 2, 'double_sha1_password' = 3, 'ldap' = 4, 'kerberos' = 5, 'ssl_certificate' = 6)) — Shows the authentication type. There are multiple ways of user identification: with no password, with plain text password, with [SHA256](https://ru.wikipedia.org/wiki/SHA-2)-encoded password or with [double SHA-1](https://ru.wikipedia.org/wiki/SHA-1)-encoded password.
- `auth_params` ([String](../../sql-reference/data-types/string.md)) — Authentication parameters in the JSON format depending on the `auth_type`. - `auth_params` ([String](../../sql-reference/data-types/string.md)) — Authentication parameters in the JSON format depending on the `auth_type`.

View File

@ -109,56 +109,38 @@ In the report you can find:
`clickhouse-benchmark` can compare performances for two running ClickHouse servers. `clickhouse-benchmark` can compare performances for two running ClickHouse servers.
To use the comparison mode, specify endpoints of both servers by two pairs of `--host`, `--port` keys. Keys matched together by position in arguments list, the first `--host` is matched with the first `--port` and so on. `clickhouse-benchmark` establishes connections to both servers, then sends queries. Each query addressed to a randomly selected server. The results are shown for each server separately. To use the comparison mode, specify endpoints of both servers by two pairs of `--host`, `--port` keys. Keys matched together by position in arguments list, the first `--host` is matched with the first `--port` and so on. `clickhouse-benchmark` establishes connections to both servers, then sends queries. Each query addressed to a randomly selected server. The results are shown in a table.
## Example {#clickhouse-benchmark-example} ## Example {#clickhouse-benchmark-example}
``` bash ``` bash
$ echo "SELECT * FROM system.numbers LIMIT 10000000 OFFSET 10000000" | clickhouse-benchmark -i 10 $ echo "SELECT * FROM system.numbers LIMIT 10000000 OFFSET 10000000" | clickhouse-benchmark --host=localhost --port=9001 --host=localhost --port=9000 -i 10
``` ```
``` text ``` text
Loaded 1 queries. Loaded 1 queries.
Queries executed: 6. Queries executed: 5.
localhost:9000, queries 6, QPS: 6.153, RPS: 123398340.957, MiB/s: 941.455, result RPS: 61532982.200, result MiB/s: 469.459. localhost:9001, queries 2, QPS: 3.764, RPS: 75446929.370, MiB/s: 575.614, result RPS: 37639659.982, result MiB/s: 287.168.
localhost:9000, queries 3, QPS: 3.815, RPS: 76466659.385, MiB/s: 583.394, result RPS: 38148392.297, result MiB/s: 291.049.
0.000% 0.159 sec. 0.000% 0.258 sec. 0.250 sec.
10.000% 0.159 sec. 10.000% 0.258 sec. 0.250 sec.
20.000% 0.159 sec. 20.000% 0.258 sec. 0.250 sec.
30.000% 0.160 sec. 30.000% 0.258 sec. 0.267 sec.
40.000% 0.160 sec. 40.000% 0.258 sec. 0.267 sec.
50.000% 0.162 sec. 50.000% 0.273 sec. 0.267 sec.
60.000% 0.164 sec. 60.000% 0.273 sec. 0.267 sec.
70.000% 0.165 sec. 70.000% 0.273 sec. 0.267 sec.
80.000% 0.166 sec. 80.000% 0.273 sec. 0.269 sec.
90.000% 0.166 sec. 90.000% 0.273 sec. 0.269 sec.
95.000% 0.167 sec. 95.000% 0.273 sec. 0.269 sec.
99.000% 0.167 sec. 99.000% 0.273 sec. 0.269 sec.
99.900% 0.167 sec. 99.900% 0.273 sec. 0.269 sec.
99.990% 0.167 sec. 99.990% 0.273 sec. 0.269 sec.
No difference proven at 99.5% confidence
Queries executed: 10.
localhost:9000, queries 10, QPS: 6.082, RPS: 121959604.568, MiB/s: 930.478, result RPS: 60815551.642, result MiB/s: 463.986.
0.000% 0.159 sec.
10.000% 0.159 sec.
20.000% 0.160 sec.
30.000% 0.163 sec.
40.000% 0.164 sec.
50.000% 0.165 sec.
60.000% 0.166 sec.
70.000% 0.166 sec.
80.000% 0.167 sec.
90.000% 0.167 sec.
95.000% 0.170 sec.
99.000% 0.172 sec.
99.900% 0.172 sec.
99.990% 0.172 sec.
``` ```
[Original article](https://clickhouse.com/docs/en/operations/utilities/clickhouse-benchmark.md) <!--hide--> [Original article](https://clickhouse.com/docs/en/operations/utilities/clickhouse-benchmark.md) <!--hide-->

View File

@ -1,10 +1,11 @@
--- ---
slug: /en/operations/utilities/ slug: /en/operations/utilities/
sidebar_position: 56 sidebar_position: 56
sidebar_label: Utilities sidebar_label: Overview
pagination_next: 'en/operations/utilities/clickhouse-copier'
--- ---
# ClickHouse Utility # ClickHouse Utilities
- [clickhouse-local](../../operations/utilities/clickhouse-local.md) — Allows running SQL queries on data without starting the ClickHouse server, similar to how `awk` does this. - [clickhouse-local](../../operations/utilities/clickhouse-local.md) — Allows running SQL queries on data without starting the ClickHouse server, similar to how `awk` does this.
- [clickhouse-copier](../../operations/utilities/clickhouse-copier.md) — Copies (and reshards) data from one cluster to another cluster. - [clickhouse-copier](../../operations/utilities/clickhouse-copier.md) — Copies (and reshards) data from one cluster to another cluster.

View File

@ -303,17 +303,25 @@ or
CREATE DICTIONARY somedict ( CREATE DICTIONARY somedict (
id UInt64, id UInt64,
first Date, first Date,
last Date last Date,
advertiser_id UInt64
) )
PRIMARY KEY id PRIMARY KEY id
SOURCE(CLICKHOUSE(TABLE 'date_table'))
LIFETIME(MIN 1 MAX 1000)
LAYOUT(RANGE_HASHED()) LAYOUT(RANGE_HASHED())
RANGE(MIN first MAX last) RANGE(MIN first MAX last)
``` ```
To work with these dictionaries, you need to pass an additional argument to the `dictGetT` function, for which a range is selected: To work with these dictionaries, you need to pass an additional argument to the `dictGet` function, for which a range is selected:
``` sql ``` sql
dictGetT('dict_name', 'attr_name', id, date) dictGet('dict_name', 'attr_name', id, date)
```
Query example:
``` sql
SELECT dictGet('somedict', 'advertiser_id', 1, '2022-10-20 23:20:10.000'::DateTime64::UInt64);
``` ```
This function returns the value for the specified `id`s and the date range that includes the passed date. This function returns the value for the specified `id`s and the date range that includes the passed date.

View File

@ -1244,7 +1244,7 @@ Result:
└──────────────────────────┘ └──────────────────────────┘
``` ```
When there are two arguments: first is an [Integer](../../sql-reference/data-types/int-uint.md) or [DateTime](../../sql-reference/data-types/datetime.md), second is a constant format string — it acts in the same way as [formatDateTime](#formatdatetime) and return [String](../../sql-reference/data-types/string.md#string) type. When there are two or three arguments, the first an [Integer](../../sql-reference/data-types/int-uint.md), [Date](../../sql-reference/data-types/date.md), [Date32](../../sql-reference/data-types/date32.md), [DateTime](../../sql-reference/data-types/datetime.md) or [DateTime64](../../sql-reference/data-types/datetime64.md), the second a constant format string and the third an optional constant time zone string — it acts in the same way as [formatDateTime](#formatdatetime) and return [String](../../sql-reference/data-types/string.md#string) type.
For example: For example:

View File

@ -8,70 +8,69 @@ title: "Geo Functions"
## Geographical Coordinates Functions ## Geographical Coordinates Functions
- [greatCircleDistance](./coordinates.md#greatCircleDistance) - [greatCircleDistance](./coordinates.md#greatcircledistance)
- [geoDistance](./coordinates.md#geoDistance) - [geoDistance](./coordinates.md#geodistance)
- [greatCircleAngle](./coordinates.md#greatCircleAngle) - [greatCircleAngle](./coordinates.md#greatcircleangle)
- [pointInEllipses](./coordinates.md#pointInEllipses) - [pointInEllipses](./coordinates.md#pointinellipses)
- [pointInPolygon](./coordinates.md#pointInPolygon) - [pointInPolygon](./coordinates.md#pointinpolygon)
## Geohash Functions ## Geohash Functions
- [geohashEncode](./geohash.md#geohashEncode) - [geohashEncode](./geohash.md#geohashencode)
- [geohashDecode](./geohash.md#geohashDecode) - [geohashDecode](./geohash.md#geohashdecode)
- [geohashesInBox](./geohash.md#geohashesInBox) - [geohashesInBox](./geohash.md#geohashesinbox)
## H3 Indexes Functions ## H3 Indexes Functions
- [h3IsValid](./h3.md#h3IsValid) - [h3IsValid](./h3.md#h3isvalid)
- [h3GetResolution](./h3.md#h3GetResolution) - [h3GetResolution](./h3.md#h3getresolution)
- [h3EdgeAngle](./h3.md#h3EdgeAngle) - [h3EdgeAngle](./h3.md#h3edgeangle)
- [h3EdgeLengthM](./h3.md#h3EdgeLengthM) - [h3EdgeLengthM](./h3.md#h3edgelengthm)
- [h3EdgeLengthKm](./h3.md#h3EdgeLengthKm) - [h3EdgeLengthKm](./h3.md#h3edgelengthkm)
- [geoToH3](./h3.md#geoToH3) - [geoToH3](./h3.md#geotoh3)
- [h3ToGeo](./h3.md#h3ToGeo) - [h3ToGeo](./h3.md#h3togeo)
- [h3ToGeoBoundary](./h3.md#h3ToGeoBoundary) - [h3ToGeoBoundary](./h3.md#h3togeoboundary)
- [h3kRing](./h3.md#h3kRing) - [h3kRing](./h3.md#h3kring)
- [h3GetBaseCell](./h3.md#h3GetBaseCell) - [h3GetBaseCell](./h3.md#h3getbasecell)
- [h3HexAreaM2](./h3.md#h3HexAreaM2) - [h3HexAreaM2](./h3.md#h3hexaream2)
- [h3HexAreaKm2](./h3.md#h3HexAreaKm2) - [h3HexAreaKm2](./h3.md#h3hexareakm2)
- [h3IndexesAreNeighbors](./h3.md#h3IndexesAreNeighbors) - [h3IndexesAreNeighbors](./h3.md#h3indexesareneighbors)
- [h3ToChildren](./h3.md#h3ToChildren) - [h3ToChildren](./h3.md#h3tochildren)
- [h3ToParent](./h3.md#h3ToParent) - [h3ToParent](./h3.md#h3toparent)
- [h3ToString](./h3.md#h3ToString) - [h3ToString](./h3.md#h3tostring)
- [stringToH3](./h3.md#stringToH3) - [stringToH3](./h3.md#stringtoh3)
- [h3GetResolution](./h3.md#h3GetResolution) - [h3GetResolution](./h3.md#h3getresolution)
- [h3IsResClassIII](./h3.md#h3IsResClassIII) - [h3IsResClassIII](./h3.md#h3isresclassiii)
- [h3IsPentagon](./h3.md#h3IsPentagon) - [h3IsPentagon](./h3.md#h3ispentagon)
- [h3GetFaces](./h3.md#h3GetFaces) - [h3GetFaces](./h3.md#h3getfaces)
- [h3CellAreaM2](./h3.md#h3CellAreaM2) - [h3CellAreaM2](./h3.md#h3cellaream2)
- [h3CellAreaRads2](./h3.md#h3CellAreaRads2) - [h3CellAreaRads2](./h3.md#h3cellarearads2)
- [h3ToCenterChild](./h3.md#h3ToCenterChild) - [h3ToCenterChild](./h3.md#h3tocenterchild)
- [h3ExactEdgeLengthM](./h3.md#h3ExactEdgeLengthM) - [h3ExactEdgeLengthM](./h3.md#h3exactedgelengthm)
- [h3ExactEdgeLengthKm](./h3.md#h3ExactEdgeLengthKm) - [h3ExactEdgeLengthKm](./h3.md#h3exactedgelengthkm)
- [h3ExactEdgeLengthRads](./h3.md#h3ExactEdgeLengthRads) - [h3ExactEdgeLengthRads](./h3.md#h3exactedgelengthrads)
- [h3NumHexagons](./h3.md#h3NumHexagons) - [h3NumHexagons](./h3.md#h3numhexagons)
- [h3Line](./h3.md#h3Line) - [h3Line](./h3.md#h3line)
- [h3Distance](./h3.md#h3Distance) - [h3Distance](./h3.md#h3distance)
- [h3HexRing](./h3.md#h3HexRing) - [h3HexRing](./h3.md#h3hexring)
- [h3GetUnidirectionalEdge](./h3.md#h3GetUnidirectionalEdge) - [h3GetUnidirectionalEdge](./h3.md#h3getunidirectionaledge)
- [h3UnidirectionalEdgeIsValid](./h3.md#h3UnidirectionalEdgeIsValid) - [h3UnidirectionalEdgeIsValid](./h3.md#h3unidirectionaledgeisvalid)
- [h3GetOriginIndexFromUnidirectionalEdge](./h3.md#h3GetOriginIndexFromUnidirectionalEdge) - [h3GetOriginIndexFromUnidirectionalEdge](./h3.md#h3getoriginindexfromunidirectionaledge)
- [h3GetDestinationIndexFromUnidirectionalEdge](./h3.md#h3GetDestinationIndexFromUnidirectionalEdge) - [h3GetDestinationIndexFromUnidirectionalEdge](./h3.md#h3getdestinationindexfromunidirectionaledge)
- [h3GetIndexesFromUnidirectionalEdge](./h3.md#h3GetIndexesFromUnidirectionalEdge) - [h3GetIndexesFromUnidirectionalEdge](./h3.md#h3getindexesfromunidirectionaledge)
- [h3GetUnidirectionalEdgesFromHexagon](./h3.md#h3GetUnidirectionalEdgesFromHexagon) - [h3GetUnidirectionalEdgesFromHexagon](./h3.md#h3getunidirectionaledgesfromhexagon)
- [h3GetUnidirectionalEdgeBoundary](./h3.md#h3GetUnidirectionalEdgeBoundary) - [h3GetUnidirectionalEdgeBoundary](./h3.md#h3getunidirectionaledgeboundary)
## S2 Index Functions ## S2 Index Functions
- [geoToS2](./s2.md#geoToS2) - [geoToS2](./s2.md#geotos2)
- [s2ToGeo](./s2.md#s2ToGeo) - [s2ToGeo](./s2.md#s2togeo)
- [s2GetNeighbors](./s2.md#s2GetNeighbors) - [s2GetNeighbors](./s2.md#s2getneighbors)
- [s2CellsIntersect](./s2.md#s2CellsIntersect) - [s2CellsIntersect](./s2.md#s2cellsintersect)
- [s2CapContains](./s2.md#s2CapContains) - [s2CapContains](./s2.md#s2capcontains)
- [s2CapUnion](./s2.md#s2CapUnion) - [s2CapUnion](./s2.md#s2capunion)
- [s2RectAdd](./s2.md#s2RectAdd) - [s2RectAdd](./s2.md#s2rectadd)
- [s2RectContains](./s2.md#s2RectContains) - [s2RectContains](./s2.md#s2rectcontains)
- [s2RectUinion](./s2.md#s2RectUinion) - [s2RectUnion](./s2.md#s2rectunion)
- [s2RectIntersection](./s2.md#s2RectIntersection) - [s2RectIntersection](./s2.md#s2rectintersection)
[Original article](https://clickhouse.com/docs/en/sql-reference/functions/geo/) <!--hide-->

View File

@ -593,6 +593,27 @@ LIMIT 10
└────────────────┴─────────┘ └────────────────┴─────────┘
``` ```
## formatReadableDecimalSize(x)
Accepts the size (number of bytes). Returns a rounded size with a suffix (KB, MB, etc.) as a string.
Example:
``` sql
SELECT
arrayJoin([1, 1024, 1024*1024, 192851925]) AS filesize_bytes,
formatReadableDecimalSize(filesize_bytes) AS filesize
```
``` text
┌─filesize_bytes─┬─filesize───┐
│ 1 │ 1.00 B │
│ 1024 │ 1.02 KB │
│ 1048576 │ 1.05 MB │
│ 192851925 │ 192.85 MB │
└────────────────┴────────────┘
```
## formatReadableSize(x) ## formatReadableSize(x)
Accepts the size (number of bytes). Returns a rounded size with a suffix (KiB, MiB, etc.) as a string. Accepts the size (number of bytes). Returns a rounded size with a suffix (KiB, MiB, etc.) as a string.

View File

@ -6,21 +6,22 @@ sidebar_label: Splitting and Merging Strings and Arrays
# Functions for Splitting and Merging Strings and Arrays # Functions for Splitting and Merging Strings and Arrays
## splitByChar(separator, s) ## splitByChar(separator, s[, max_substrings])
Splits a string into substrings separated by a specified character. It uses a constant string `separator` which consisting of exactly one character. Splits a string into substrings separated by a specified character. It uses a constant string `separator` which consists of exactly one character.
Returns an array of selected substrings. Empty substrings may be selected if the separator occurs at the beginning or end of the string, or if there are multiple consecutive separators. Returns an array of selected substrings. Empty substrings may be selected if the separator occurs at the beginning or end of the string, or if there are multiple consecutive separators.
**Syntax** **Syntax**
``` sql ``` sql
splitByChar(separator, s) splitByChar(separator, s[, max_substrings]))
``` ```
**Arguments** **Arguments**
- `separator` — The separator which should contain exactly one character. [String](../../sql-reference/data-types/string.md). - `separator` — The separator which should contain exactly one character. [String](../../sql-reference/data-types/string.md).
- `s` — The string to split. [String](../../sql-reference/data-types/string.md). - `s` — The string to split. [String](../../sql-reference/data-types/string.md).
- `max_substrings` — An optional `Int64` defaulting to 0. When `max_substrings` > 0, the returned substrings will be no more than `max_substrings`, otherwise the function will return as many substrings as possible.
**Returned value(s)** **Returned value(s)**
@ -44,20 +45,22 @@ SELECT splitByChar(',', '1,2,3,abcde');
└─────────────────────────────────┘ └─────────────────────────────────┘
``` ```
## splitByString(separator, s) ## splitByString(separator, s[, max_substrings])
Splits a string into substrings separated by a string. It uses a constant string `separator` of multiple characters as the separator. If the string `separator` is empty, it will split the string `s` into an array of single characters. Splits a string into substrings separated by a string. It uses a constant string `separator` of multiple characters as the separator. If the string `separator` is empty, it will split the string `s` into an array of single characters.
**Syntax** **Syntax**
``` sql ``` sql
splitByString(separator, s) splitByString(separator, s[, max_substrings]))
``` ```
**Arguments** **Arguments**
- `separator` — The separator. [String](../../sql-reference/data-types/string.md). - `separator` — The separator. [String](../../sql-reference/data-types/string.md).
- `s` — The string to split. [String](../../sql-reference/data-types/string.md). - `s` — The string to split. [String](../../sql-reference/data-types/string.md).
- `max_substrings` — An optional `Int64` defaulting to 0. When `max_substrings` > 0, the returned substrings will be no more than `max_substrings`, otherwise the function will return as many substrings as possible.
**Returned value(s)** **Returned value(s)**
@ -91,20 +94,22 @@ SELECT splitByString('', 'abcde');
└────────────────────────────┘ └────────────────────────────┘
``` ```
## splitByRegexp(regexp, s) ## splitByRegexp(regexp, s[, max_substrings])
Splits a string into substrings separated by a regular expression. It uses a regular expression string `regexp` as the separator. If the `regexp` is empty, it will split the string `s` into an array of single characters. If no match is found for this regular expression, the string `s` won't be split. Splits a string into substrings separated by a regular expression. It uses a regular expression string `regexp` as the separator. If the `regexp` is empty, it will split the string `s` into an array of single characters. If no match is found for this regular expression, the string `s` won't be split.
**Syntax** **Syntax**
``` sql ``` sql
splitByRegexp(regexp, s) splitByRegexp(regexp, s[, max_substrings]))
``` ```
**Arguments** **Arguments**
- `regexp` — Regular expression. Constant. [String](../data-types/string.md) or [FixedString](../data-types/fixedstring.md). - `regexp` — Regular expression. Constant. [String](../data-types/string.md) or [FixedString](../data-types/fixedstring.md).
- `s` — The string to split. [String](../../sql-reference/data-types/string.md). - `s` — The string to split. [String](../../sql-reference/data-types/string.md).
- `max_substrings` — An optional `Int64` defaulting to 0. When `max_substrings` > 0, the returned substrings will be no more than `max_substrings`, otherwise the function will return as many substrings as possible.
**Returned value(s)** **Returned value(s)**
@ -146,7 +151,7 @@ Result:
└────────────────────────────┘ └────────────────────────────┘
``` ```
## splitByWhitespace(s) ## splitByWhitespace(s[, max_substrings])
Splits a string into substrings separated by whitespace characters. Splits a string into substrings separated by whitespace characters.
Returns an array of selected substrings. Returns an array of selected substrings.
@ -154,12 +159,14 @@ Returns an array of selected substrings.
**Syntax** **Syntax**
``` sql ``` sql
splitByWhitespace(s) splitByWhitespace(s[, max_substrings]))
``` ```
**Arguments** **Arguments**
- `s` — The string to split. [String](../../sql-reference/data-types/string.md). - `s` — The string to split. [String](../../sql-reference/data-types/string.md).
- `max_substrings` — An optional `Int64` defaulting to 0. When `max_substrings` > 0, the returned substrings will be no more than `max_substrings`, otherwise the function will return as many substrings as possible.
**Returned value(s)** **Returned value(s)**
@ -179,7 +186,7 @@ SELECT splitByWhitespace(' 1! a, b. ');
└─────────────────────────────────────┘ └─────────────────────────────────────┘
``` ```
## splitByNonAlpha(s) ## splitByNonAlpha(s[, max_substrings])
Splits a string into substrings separated by whitespace and punctuation characters. Splits a string into substrings separated by whitespace and punctuation characters.
Returns an array of selected substrings. Returns an array of selected substrings.
@ -187,12 +194,14 @@ Returns an array of selected substrings.
**Syntax** **Syntax**
``` sql ``` sql
splitByNonAlpha(s) splitByNonAlpha(s[, max_substrings]))
``` ```
**Arguments** **Arguments**
- `s` — The string to split. [String](../../sql-reference/data-types/string.md). - `s` — The string to split. [String](../../sql-reference/data-types/string.md).
- `max_substrings` — An optional `Int64` defaulting to 0. When `max_substrings` > 0, the returned substrings will be no more than `max_substrings`, otherwise the function will return as many substrings as possible.
**Returned value(s)** **Returned value(s)**
@ -217,10 +226,28 @@ SELECT splitByNonAlpha(' 1! a, b. ');
Concatenates string representations of values listed in the array with the separator. `separator` is an optional parameter: a constant string, set to an empty string by default. Concatenates string representations of values listed in the array with the separator. `separator` is an optional parameter: a constant string, set to an empty string by default.
Returns the string. Returns the string.
## alphaTokens(s) ## alphaTokens(s[, max_substrings]), splitByAlpha(s[, max_substrings])
Selects substrings of consecutive bytes from the ranges a-z and A-Z.Returns an array of substrings. Selects substrings of consecutive bytes from the ranges a-z and A-Z.Returns an array of substrings.
**Syntax**
``` sql
alphaTokens(s[, max_substrings]))
splitByAlpha(s[, max_substrings])
```
**Arguments**
- `s` — The string to split. [String](../../sql-reference/data-types/string.md).
- `max_substrings` — An optional `Int64` defaulting to 0. When `max_substrings` > 0, the returned substrings will be no more than `max_substrings`, otherwise the function will return as many substrings as possible.
**Returned value(s)**
Returns an array of selected substrings.
Type: [Array](../../sql-reference/data-types/array.md)([String](../../sql-reference/data-types/string.md)).
**Example** **Example**
``` sql ``` sql

View File

@ -571,13 +571,13 @@ Similar to base58Decode, but returns an empty string in case of error.
## base64Encode(s) ## base64Encode(s)
Encodes s string into base64 Encodes s FixedString or String into base64.
Alias: `TO_BASE64`. Alias: `TO_BASE64`.
## base64Decode(s) ## base64Decode(s)
Decode base64-encoded string s into original string. In case of failure raises an exception. Decode base64-encoded FixedString or String s into original string. In case of failure raises an exception.
Alias: `FROM_BASE64`. Alias: `FROM_BASE64`.
@ -1150,3 +1150,13 @@ A text with tags .
The content within <b>CDATA</b> The content within <b>CDATA</b>
Do Nothing for 2 Minutes 2:00 &nbsp; Do Nothing for 2 Minutes 2:00 &nbsp;
``` ```
## ascii(s) {#ascii}
Returns the ASCII code point of the first character of str. The result type is Int32.
If s is empty, the result is 0. If the first character is not an ASCII character or not part of the Latin-1 Supplement range of UTF-16, the result is undefined.

View File

@ -12,22 +12,23 @@ Functions for [searching](../../sql-reference/functions/string-search-functions.
## replaceOne(haystack, pattern, replacement) ## replaceOne(haystack, pattern, replacement)
Replaces the first occurrence, if it exists, of the pattern substring in haystack with the replacement substring. Replaces the first occurrence of the substring pattern (if it exists) in haystack by the replacement string.
Hereafter, pattern and replacement must be constants. pattern and replacement must be constants.
## replaceAll(haystack, pattern, replacement), replace(haystack, pattern, replacement) ## replaceAll(haystack, pattern, replacement), replace(haystack, pattern, replacement)
Replaces all occurrences of the pattern substring in haystack with the replacement substring. Replaces all occurrences of the substring pattern in haystack by the replacement string.
## replaceRegexpOne(haystack, pattern, replacement) ## replaceRegexpOne(haystack, pattern, replacement)
Replacement using the pattern regular expression. A re2 regular expression. Replaces the first occurrence of the substring matching the regular expression pattern in haystack by the replacement string.
Replaces only the first occurrence, if it exists. pattern must be a constant [re2 regular expression](https://github.com/google/re2/wiki/Syntax).
A pattern can be specified as replacement. This pattern can include substitutions `\0-\9`. replacement must be a plain constant string or a constant string containing substitutions `\0-\9`.
The substitution `\0` includes the entire regular expression. Substitutions `\1-\9` correspond to the subpattern numbers.To use the `\` character in a template, escape it using `\`. Substitutions `\1-\9` correspond to the 1st to 9th capturing group (submatch), substitution `\0` corresponds to the entire match.
Also keep in mind that a string literal requires an extra escape. To use a verbatim `\` character in the pattern or replacement string, escape it using `\`.
Also keep in mind that string literals require an extra escaping.
Example 1. Converting the date to American format: Example 1. Converting ISO dates to American format:
``` sql ``` sql
SELECT DISTINCT SELECT DISTINCT
@ -62,7 +63,7 @@ SELECT replaceRegexpOne('Hello, World!', '.*', '\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0')
## replaceRegexpAll(haystack, pattern, replacement) ## replaceRegexpAll(haystack, pattern, replacement)
This does the same thing, but replaces all the occurrences. Example: Like replaceRegexpOne, but replaces all occurrences of the pattern. Example:
``` sql ``` sql
SELECT replaceRegexpAll('Hello, World!', '.', '\\0\\0') AS res SELECT replaceRegexpAll('Hello, World!', '.', '\\0\\0') AS res

View File

@ -35,11 +35,11 @@ These actions are described in detail below.
ADD COLUMN [IF NOT EXISTS] name [type] [default_expr] [codec] [AFTER name_after | FIRST] ADD COLUMN [IF NOT EXISTS] name [type] [default_expr] [codec] [AFTER name_after | FIRST]
``` ```
Adds a new column to the table with the specified `name`, `type`, [`codec`](../create/table.md#codecs) and `default_expr` (see the section [Default expressions](../../../sql-reference/statements/create/table.md#create-default-values)). Adds a new column to the table with the specified `name`, `type`, [`codec`](../create/table.md/#codecs) and `default_expr` (see the section [Default expressions](/docs/en/sql-reference/statements/create/table.md/#create-default-values)).
If the `IF NOT EXISTS` clause is included, the query wont return an error if the column already exists. If you specify `AFTER name_after` (the name of another column), the column is added after the specified one in the list of table columns. If you want to add a column to the beginning of the table use the `FIRST` clause. Otherwise, the column is added to the end of the table. For a chain of actions, `name_after` can be the name of a column that is added in one of the previous actions. If the `IF NOT EXISTS` clause is included, the query wont return an error if the column already exists. If you specify `AFTER name_after` (the name of another column), the column is added after the specified one in the list of table columns. If you want to add a column to the beginning of the table use the `FIRST` clause. Otherwise, the column is added to the end of the table. For a chain of actions, `name_after` can be the name of a column that is added in one of the previous actions.
Adding a column just changes the table structure, without performing any actions with data. The data does not appear on the disk after `ALTER`. If the data is missing for a column when reading from the table, it is filled in with default values (by performing the default expression if there is one, or using zeros or empty strings). The column appears on the disk after merging data parts (see [MergeTree](../../../engines/table-engines/mergetree-family/mergetree.md)). Adding a column just changes the table structure, without performing any actions with data. The data does not appear on the disk after `ALTER`. If the data is missing for a column when reading from the table, it is filled in with default values (by performing the default expression if there is one, or using zeros or empty strings). The column appears on the disk after merging data parts (see [MergeTree](/docs/en/engines/table-engines/mergetree-family/mergetree.md)).
This approach allows us to complete the `ALTER` query instantly, without increasing the volume of old data. This approach allows us to complete the `ALTER` query instantly, without increasing the volume of old data.
@ -76,7 +76,7 @@ Deletes the column with the name `name`. If the `IF EXISTS` clause is specified,
Deletes data from the file system. Since this deletes entire files, the query is completed almost instantly. Deletes data from the file system. Since this deletes entire files, the query is completed almost instantly.
:::warning :::warning
You cant delete a column if it is referenced by [materialized view](../../../sql-reference/statements/create/view.md#materialized). Otherwise, it returns an error. You cant delete a column if it is referenced by [materialized view](/docs/en/sql-reference/statements/create/view.md/#materialized). Otherwise, it returns an error.
::: :::
Example: Example:
@ -107,7 +107,7 @@ ALTER TABLE visits RENAME COLUMN webBrowser TO browser
CLEAR COLUMN [IF EXISTS] name IN PARTITION partition_name CLEAR COLUMN [IF EXISTS] name IN PARTITION partition_name
``` ```
Resets all data in a column for a specified partition. Read more about setting the partition name in the section [How to specify the partition expression](#alter-how-to-specify-part-expr). Resets all data in a column for a specified partition. Read more about setting the partition name in the section [How to set the partition expression](partition.md/#how-to-set-partition-expression).
If the `IF EXISTS` clause is specified, the query wont return an error if the column does not exist. If the `IF EXISTS` clause is specified, the query wont return an error if the column does not exist.
@ -127,7 +127,7 @@ Adds a comment to the column. If the `IF EXISTS` clause is specified, the query
Each column can have one comment. If a comment already exists for the column, a new comment overwrites the previous comment. Each column can have one comment. If a comment already exists for the column, a new comment overwrites the previous comment.
Comments are stored in the `comment_expression` column returned by the [DESCRIBE TABLE](../../../sql-reference/statements/describe-table.md) query. Comments are stored in the `comment_expression` column returned by the [DESCRIBE TABLE](/docs/en/sql-reference/statements/describe-table.md) query.
Example: Example:
@ -152,15 +152,15 @@ This query changes the `name` column properties:
- TTL - TTL
For examples of columns compression CODECS modifying, see [Column Compression Codecs](../create/table.md#codecs). For examples of columns compression CODECS modifying, see [Column Compression Codecs](../create/table.md/#codecs).
For examples of columns TTL modifying, see [Column TTL](../../../engines/table-engines/mergetree-family/mergetree.md#mergetree-column-ttl). For examples of columns TTL modifying, see [Column TTL](/docs/en/engines/table-engines/mergetree-family/mergetree.md/#mergetree-column-ttl).
If the `IF EXISTS` clause is specified, the query wont return an error if the column does not exist. If the `IF EXISTS` clause is specified, the query wont return an error if the column does not exist.
The query also can change the order of the columns using `FIRST | AFTER` clause, see [ADD COLUMN](#alter_add-column) description. The query also can change the order of the columns using `FIRST | AFTER` clause, see [ADD COLUMN](#alter_add-column) description.
When changing the type, values are converted as if the [toType](../../../sql-reference/functions/type-conversion-functions.md) functions were applied to them. If only the default expression is changed, the query does not do anything complex, and is completed almost instantly. When changing the type, values are converted as if the [toType](/docs/en/sql-reference/functions/type-conversion-functions.md) functions were applied to them. If only the default expression is changed, the query does not do anything complex, and is completed almost instantly.
Example: Example:
@ -204,8 +204,9 @@ It is used if it is necessary to add or update a column with a complicated expre
Syntax: Syntax:
```sql ```sql
ALTER TABLE table MATERIALIZE COLUMN col; ALTER TABLE [db.]table [ON CLUSTER cluster] MATERIALIZE COLUMN col [IN PARTITION partition | IN PARTITION ID 'partition_id'];
``` ```
- If you specify a PARTITION, a column will be materialized with only the specified partition.
**Example** **Example**
@ -245,7 +246,7 @@ SELECT groupArray(x), groupArray(s) FROM tmp;
**See Also** **See Also**
- [MATERIALIZED](../../statements/create/table.md#materialized). - [MATERIALIZED](/docs/en/sql-reference/statements/create/table.md/#materialized).
## Limitations ## Limitations
@ -253,8 +254,8 @@ The `ALTER` query lets you create and delete separate elements (columns) in nest
There is no support for deleting columns in the primary key or the sampling key (columns that are used in the `ENGINE` expression). Changing the type for columns that are included in the primary key is only possible if this change does not cause the data to be modified (for example, you are allowed to add values to an Enum or to change a type from `DateTime` to `UInt32`). There is no support for deleting columns in the primary key or the sampling key (columns that are used in the `ENGINE` expression). Changing the type for columns that are included in the primary key is only possible if this change does not cause the data to be modified (for example, you are allowed to add values to an Enum or to change a type from `DateTime` to `UInt32`).
If the `ALTER` query is not sufficient to make the table changes you need, you can create a new table, copy the data to it using the [INSERT SELECT](../../../sql-reference/statements/insert-into.md#insert_query_insert-select) query, then switch the tables using the [RENAME](../../../sql-reference/statements/rename.md#rename-table) query and delete the old table. You can use the [clickhouse-copier](../../../operations/utilities/clickhouse-copier.md) as an alternative to the `INSERT SELECT` query. If the `ALTER` query is not sufficient to make the table changes you need, you can create a new table, copy the data to it using the [INSERT SELECT](/docs/en/sql-reference/statements/insert-into.md/#insert_query_insert-select) query, then switch the tables using the [RENAME](/docs/en/sql-reference/statements/rename.md/#rename-table) query and delete the old table. You can use the [clickhouse-copier](/docs/en/operations/utilities/clickhouse-copier.md) as an alternative to the `INSERT SELECT` query.
The `ALTER` query blocks all reads and writes for the table. In other words, if a long `SELECT` is running at the time of the `ALTER` query, the `ALTER` query will wait for it to complete. At the same time, all new queries to the same table will wait while this `ALTER` is running. The `ALTER` query blocks all reads and writes for the table. In other words, if a long `SELECT` is running at the time of the `ALTER` query, the `ALTER` query will wait for it to complete. At the same time, all new queries to the same table will wait while this `ALTER` is running.
For tables that do not store data themselves (such as [Merge](../../../sql-reference/statements/alter/index.md) and [Distributed](../../../sql-reference/statements/alter/index.md)), `ALTER` just changes the table structure, and does not change the structure of subordinate tables. For example, when running ALTER for a `Distributed` table, you will also need to run `ALTER` for the tables on all remote servers. For tables that do not store data themselves (such as [Merge](/docs/en/sql-reference/statements/alter/index.md) and [Distributed](/docs/en/sql-reference/statements/alter/index.md)), `ALTER` just changes the table structure, and does not change the structure of subordinate tables. For example, when running ALTER for a `Distributed` table, you will also need to run `ALTER` for the tables on all remote servers.

View File

@ -10,21 +10,21 @@ sidebar_label: DELETE
ALTER TABLE [db.]table [ON CLUSTER cluster] DELETE WHERE filter_expr ALTER TABLE [db.]table [ON CLUSTER cluster] DELETE WHERE filter_expr
``` ```
Deletes data matching the specified filtering expression. Implemented as a [mutation](../../../sql-reference/statements/alter/index.md#mutations). Deletes data matching the specified filtering expression. Implemented as a [mutation](/docs/en/sql-reference/statements/alter/index.md/#mutations).
:::note :::note
The `ALTER TABLE` prefix makes this syntax different from most other systems supporting SQL. It is intended to signify that unlike similar queries in OLTP databases this is a heavy operation not designed for frequent use. `ALTER TABLE` is considered a heavyweight operation that requires the underlying data to be merged before it is deleted. For MergeTree tables, consider using the [`DELETE FROM` query](../delete.md), which performs a lightweight delete and can be considerably faster. The `ALTER TABLE` prefix makes this syntax different from most other systems supporting SQL. It is intended to signify that unlike similar queries in OLTP databases this is a heavy operation not designed for frequent use. `ALTER TABLE` is considered a heavyweight operation that requires the underlying data to be merged before it is deleted. For MergeTree tables, consider using the [`DELETE FROM` query](/docs/en/sql-reference/statements/delete.md), which performs a lightweight delete and can be considerably faster.
::: :::
The `filter_expr` must be of type `UInt8`. The query deletes rows in the table for which this expression takes a non-zero value. The `filter_expr` must be of type `UInt8`. The query deletes rows in the table for which this expression takes a non-zero value.
One query can contain several commands separated by commas. One query can contain several commands separated by commas.
The synchronicity of the query processing is defined by the [mutations_sync](../../../operations/settings/settings.md#mutations_sync) setting. By default, it is asynchronous. The synchronicity of the query processing is defined by the [mutations_sync](/docs/en/operations/settings/settings.md/#mutations_sync) setting. By default, it is asynchronous.
**See also** **See also**
- [Mutations](../../../sql-reference/statements/alter/index.md#mutations) - [Mutations](/docs/en/sql-reference/statements/alter/index.md/#mutations)
- [Synchronicity of ALTER Queries](../../../sql-reference/statements/alter/index.md#synchronicity-of-alter-queries) - [Synchronicity of ALTER Queries](/docs/en/sql-reference/statements/alter/index.md/#synchronicity-of-alter-queries)
- [mutations_sync](../../../operations/settings/settings.md#mutations_sync) setting - [mutations_sync](/docs/en/operations/settings/settings.md/#mutations_sync) setting

View File

@ -8,43 +8,43 @@ sidebar_label: ALTER
Most `ALTER TABLE` queries modify table settings or data: Most `ALTER TABLE` queries modify table settings or data:
- [COLUMN](../../../sql-reference/statements/alter/column.md) - [COLUMN](/docs/en/sql-reference/statements/alter/column.md)
- [PARTITION](../../../sql-reference/statements/alter/partition.md) - [PARTITION](/docs/en/sql-reference/statements/alter/partition.md)
- [DELETE](../../../sql-reference/statements/alter/delete.md) - [DELETE](/docs/en/sql-reference/statements/alter/delete.md)
- [UPDATE](../../../sql-reference/statements/alter/update.md) - [UPDATE](/docs/en/sql-reference/statements/alter/update.md)
- [ORDER BY](../../../sql-reference/statements/alter/order-by.md) - [ORDER BY](/docs/en/sql-reference/statements/alter/order-by.md)
- [INDEX](../../../sql-reference/statements/alter/index/index.md) - [INDEX](/docs/en/sql-reference/statements/alter/skipping-index.md)
- [CONSTRAINT](../../../sql-reference/statements/alter/constraint.md) - [CONSTRAINT](/docs/en/sql-reference/statements/alter/constraint.md)
- [TTL](../../../sql-reference/statements/alter/ttl.md) - [TTL](/docs/en/sql-reference/statements/alter/ttl.md)
:::note :::note
Most `ALTER TABLE` queries are supported only for [\*MergeTree](../../../engines/table-engines/mergetree-family/index.md) tables, as well as [Merge](../../../engines/table-engines/special/merge.md) and [Distributed](../../../engines/table-engines/special/distributed.md). Most `ALTER TABLE` queries are supported only for [\*MergeTree](/docs/en/engines/table-engines/mergetree-family/index.md) tables, as well as [Merge](/docs/en/engines/table-engines/special/merge.md) and [Distributed](/docs/en/engines/table-engines/special/distributed.md).
::: :::
These `ALTER` statements manipulate views: These `ALTER` statements manipulate views:
- [ALTER TABLE ... MODIFY QUERY](../../../sql-reference/statements/alter/view.md) — Modifies a [Materialized view](../create/view.md#materialized) structure. - [ALTER TABLE ... MODIFY QUERY](/docs/en/sql-reference/statements/alter/view.md) — Modifies a [Materialized view](/docs/en/sql-reference/statements/create/view.md/#materialized) structure.
- [ALTER LIVE VIEW](../../../sql-reference/statements/alter/view.md#alter-live-view) — Refreshes a [Live view](../create/view.md#live-view). - [ALTER LIVE VIEW](/docs/en/sql-reference/statements/alter/view.md/#alter-live-view) — Refreshes a [Live view](/docs/en/sql-reference/statements/create/view.md/#live-view).
These `ALTER` statements modify entities related to role-based access control: These `ALTER` statements modify entities related to role-based access control:
- [USER](../../../sql-reference/statements/alter/user.md) - [USER](/docs/en/sql-reference/statements/alter/user.md)
- [ROLE](../../../sql-reference/statements/alter/role.md) - [ROLE](/docs/en/sql-reference/statements/alter/role.md)
- [QUOTA](../../../sql-reference/statements/alter/quota.md) - [QUOTA](/docs/en/sql-reference/statements/alter/quota.md)
- [ROW POLICY](../../../sql-reference/statements/alter/row-policy.md) - [ROW POLICY](/docs/en/sql-reference/statements/alter/row-policy.md)
- [SETTINGS PROFILE](../../../sql-reference/statements/alter/settings-profile.md) - [SETTINGS PROFILE](/docs/en/sql-reference/statements/alter/settings-profile.md)
[ALTER TABLE ... MODIFY COMMENT](../../../sql-reference/statements/alter/comment.md) statement adds, modifies, or removes comments to the table, regardless if it was set before or not. [ALTER TABLE ... MODIFY COMMENT](/docs/en/sql-reference/statements/alter/comment.md) statement adds, modifies, or removes comments to the table, regardless if it was set before or not.
## Mutations ## Mutations
`ALTER` queries that are intended to manipulate table data are implemented with a mechanism called “mutations”, most notably [ALTER TABLE … DELETE](../../../sql-reference/statements/alter/delete.md) and [ALTER TABLE … UPDATE](../../../sql-reference/statements/alter/update.md). They are asynchronous background processes similar to merges in [MergeTree](../../../engines/table-engines/mergetree-family/index.md) tables that to produce new “mutated” versions of parts. `ALTER` queries that are intended to manipulate table data are implemented with a mechanism called “mutations”, most notably [ALTER TABLE … DELETE](/docs/en/sql-reference/statements/alter/delete.md) and [ALTER TABLE … UPDATE](/docs/en/sql-reference/statements/alter/update.md). They are asynchronous background processes similar to merges in [MergeTree](/docs/en/engines/table-engines/mergetree-family/index.md) tables that to produce new “mutated” versions of parts.
For `*MergeTree` tables mutations execute by **rewriting whole data parts**. There is no atomicity - parts are substituted for mutated parts as soon as they are ready and a `SELECT` query that started executing during a mutation will see data from parts that have already been mutated along with data from parts that have not been mutated yet. For `*MergeTree` tables mutations execute by **rewriting whole data parts**. There is no atomicity - parts are substituted for mutated parts as soon as they are ready and a `SELECT` query that started executing during a mutation will see data from parts that have already been mutated along with data from parts that have not been mutated yet.
Mutations are totally ordered by their creation order and are applied to each part in that order. Mutations are also partially ordered with `INSERT INTO` queries: data that was inserted into the table before the mutation was submitted will be mutated and data that was inserted after that will not be mutated. Note that mutations do not block inserts in any way. Mutations are totally ordered by their creation order and are applied to each part in that order. Mutations are also partially ordered with `INSERT INTO` queries: data that was inserted into the table before the mutation was submitted will be mutated and data that was inserted after that will not be mutated. Note that mutations do not block inserts in any way.
A mutation query returns immediately after the mutation entry is added (in case of replicated tables to ZooKeeper, for non-replicated tables - to the filesystem). The mutation itself executes asynchronously using the system profile settings. To track the progress of mutations you can use the [`system.mutations`](../../../operations/system-tables/mutations.md#system_tables-mutations) table. A mutation that was successfully submitted will continue to execute even if ClickHouse servers are restarted. There is no way to roll back the mutation once it is submitted, but if the mutation is stuck for some reason it can be cancelled with the [`KILL MUTATION`](../../../sql-reference/statements/kill.md#kill-mutation) query. A mutation query returns immediately after the mutation entry is added (in case of replicated tables to ZooKeeper, for non-replicated tables - to the filesystem). The mutation itself executes asynchronously using the system profile settings. To track the progress of mutations you can use the [`system.mutations`](/docs/en/operations/system-tables/mutations.md/#system_tables-mutations) table. A mutation that was successfully submitted will continue to execute even if ClickHouse servers are restarted. There is no way to roll back the mutation once it is submitted, but if the mutation is stuck for some reason it can be cancelled with the [`KILL MUTATION`](/docs/en/sql-reference/statements/kill.md/#kill-mutation) query.
Entries for finished mutations are not deleted right away (the number of preserved entries is determined by the `finished_mutations_to_keep` storage engine parameter). Older mutation entries are deleted. Entries for finished mutations are not deleted right away (the number of preserved entries is determined by the `finished_mutations_to_keep` storage engine parameter). Older mutation entries are deleted.
@ -52,12 +52,12 @@ Entries for finished mutations are not deleted right away (the number of preserv
For non-replicated tables, all `ALTER` queries are performed synchronously. For replicated tables, the query just adds instructions for the appropriate actions to `ZooKeeper`, and the actions themselves are performed as soon as possible. However, the query can wait for these actions to be completed on all the replicas. For non-replicated tables, all `ALTER` queries are performed synchronously. For replicated tables, the query just adds instructions for the appropriate actions to `ZooKeeper`, and the actions themselves are performed as soon as possible. However, the query can wait for these actions to be completed on all the replicas.
For all `ALTER` queries, you can use the [replication_alter_partitions_sync](../../../operations/settings/settings.md#replication-alter-partitions-sync) setting to set up waiting. For all `ALTER` queries, you can use the [replication_alter_partitions_sync](/docs/en/operations/settings/settings.md/#replication-alter-partitions-sync) setting to set up waiting.
You can specify how long (in seconds) to wait for inactive replicas to execute all `ALTER` queries with the [replication_wait_for_inactive_replica_timeout](../../../operations/settings/settings.md#replication-wait-for-inactive-replica-timeout) setting. You can specify how long (in seconds) to wait for inactive replicas to execute all `ALTER` queries with the [replication_wait_for_inactive_replica_timeout](/docs/en/operations/settings/settings.md/#replication-wait-for-inactive-replica-timeout) setting.
:::note :::note
For all `ALTER` queries, if `replication_alter_partitions_sync = 2` and some replicas are not active for more than the time, specified in the `replication_wait_for_inactive_replica_timeout` setting, then an exception `UNFINISHED` is thrown. For all `ALTER` queries, if `replication_alter_partitions_sync = 2` and some replicas are not active for more than the time, specified in the `replication_wait_for_inactive_replica_timeout` setting, then an exception `UNFINISHED` is thrown.
::: :::
For `ALTER TABLE ... UPDATE|DELETE` queries the synchronicity is defined by the [mutations_sync](../../../operations/settings/settings.md#mutations_sync) setting. For `ALTER TABLE ... UPDATE|DELETE` queries the synchronicity is defined by the [mutations_sync](/docs/en/operations/settings/settings.md/#mutations_sync) setting.

View File

@ -5,7 +5,7 @@ sidebar_label: PARTITION
title: "Manipulating Partitions and Parts" title: "Manipulating Partitions and Parts"
--- ---
The following operations with [partitions](../../../engines/table-engines/mergetree-family/custom-partitioning-key.md) are available: The following operations with [partitions](/docs/en/engines/table-engines/mergetree-family/custom-partitioning-key.md) are available:
- [DETACH PARTITION\|PART](#detach-partitionpart) — Moves a partition or part to the `detached` directory and forget it. - [DETACH PARTITION\|PART](#detach-partitionpart) — Moves a partition or part to the `detached` directory and forget it.
- [DROP PARTITION\|PART](#drop-partitionpart) — Deletes a partition or part. - [DROP PARTITION\|PART](#drop-partitionpart) — Deletes a partition or part.
@ -39,11 +39,11 @@ ALTER TABLE mt DETACH PARTITION '2020-11-21';
ALTER TABLE mt DETACH PART 'all_2_2_0'; ALTER TABLE mt DETACH PART 'all_2_2_0';
``` ```
Read about setting the partition expression in a section [How to specify the partition expression](#alter-how-to-specify-part-expr). Read about setting the partition expression in a section [How to set the partition expression](#how-to-set-partition-expression).
After the query is executed, you can do whatever you want with the data in the `detached` directory — delete it from the file system, or just leave it. After the query is executed, you can do whatever you want with the data in the `detached` directory — delete it from the file system, or just leave it.
This query is replicated it moves the data to the `detached` directory on all replicas. Note that you can execute this query only on a leader replica. To find out if a replica is a leader, perform the `SELECT` query to the [system.replicas](../../../operations/system-tables/replicas.md#system_tables-replicas) table. Alternatively, it is easier to make a `DETACH` query on all replicas - all the replicas throw an exception, except the leader replicas (as multiple leaders are allowed). This query is replicated it moves the data to the `detached` directory on all replicas. Note that you can execute this query only on a leader replica. To find out if a replica is a leader, perform the `SELECT` query to the [system.replicas](/docs/en/operations/system-tables/replicas.md/#system_tables-replicas) table. Alternatively, it is easier to make a `DETACH` query on all replicas - all the replicas throw an exception, except the leader replicas (as multiple leaders are allowed).
## DROP PARTITION\|PART ## DROP PARTITION\|PART
@ -53,7 +53,7 @@ ALTER TABLE table_name [ON CLUSTER cluster] DROP PARTITION|PART partition_expr
Deletes the specified partition from the table. This query tags the partition as inactive and deletes data completely, approximately in 10 minutes. Deletes the specified partition from the table. This query tags the partition as inactive and deletes data completely, approximately in 10 minutes.
Read about setting the partition expression in a section [How to specify the partition expression](#alter-how-to-specify-part-expr). Read about setting the partition expression in a section [How to set the partition expression](#how-to-set-partition-expression).
The query is replicated it deletes data on all replicas. The query is replicated it deletes data on all replicas.
@ -71,7 +71,7 @@ ALTER TABLE table_name [ON CLUSTER cluster] DROP DETACHED PARTITION|PART partiti
``` ```
Removes the specified part or all parts of the specified partition from `detached`. Removes the specified part or all parts of the specified partition from `detached`.
Read more about setting the partition expression in a section [How to specify the partition expression](#alter-how-to-specify-part-expr). Read more about setting the partition expression in a section [How to set the partition expression](#how-to-set-partition-expression).
## ATTACH PARTITION\|PART ## ATTACH PARTITION\|PART
@ -86,7 +86,7 @@ ALTER TABLE visits ATTACH PARTITION 201901;
ALTER TABLE visits ATTACH PART 201901_2_2_0; ALTER TABLE visits ATTACH PART 201901_2_2_0;
``` ```
Read more about setting the partition expression in a section [How to specify the partition expression](#alter-how-to-specify-part-expr). Read more about setting the partition expression in a section [How to set the partition expression](#how-to-set-partition-expression).
This query is replicated. The replica-initiator checks whether there is data in the `detached` directory. This query is replicated. The replica-initiator checks whether there is data in the `detached` directory.
If data exists, the query checks its integrity. If everything is correct, the query adds the data to the table. If data exists, the query checks its integrity. If everything is correct, the query adds the data to the table.
@ -166,7 +166,7 @@ This query creates a local backup of a specified partition. If the `PARTITION` c
The entire backup process is performed without stopping the server. The entire backup process is performed without stopping the server.
::: :::
Note that for old-styled tables you can specify the prefix of the partition name (for example, `2019`) - then the query creates the backup for all the corresponding partitions. Read about setting the partition expression in a section [How to specify the partition expression](#alter-how-to-specify-part-expr). Note that for old-styled tables you can specify the prefix of the partition name (for example, `2019`) - then the query creates the backup for all the corresponding partitions. Read about setting the partition expression in a section [How to set the partition expression](#how-to-set-partition-expression).
At the time of execution, for a data snapshot, the query creates hardlinks to a table data. Hardlinks are placed in the directory `/var/lib/clickhouse/shadow/N/...`, where: At the time of execution, for a data snapshot, the query creates hardlinks to a table data. Hardlinks are placed in the directory `/var/lib/clickhouse/shadow/N/...`, where:
@ -175,7 +175,7 @@ At the time of execution, for a data snapshot, the query creates hardlinks to a
- if the `WITH NAME` parameter is specified, then the value of the `'backup_name'` parameter is used instead of the incremental number. - if the `WITH NAME` parameter is specified, then the value of the `'backup_name'` parameter is used instead of the incremental number.
:::note :::note
If you use [a set of disks for data storage in a table](../../../engines/table-engines/mergetree-family/mergetree.md#table_engine-mergetree-multiple-volumes), the `shadow/N` directory appears on every disk, storing data parts that matched by the `PARTITION` expression. If you use [a set of disks for data storage in a table](/docs/en/engines/table-engines/mergetree-family/mergetree.md/#table_engine-mergetree-multiple-volumes), the `shadow/N` directory appears on every disk, storing data parts that matched by the `PARTITION` expression.
::: :::
The same structure of directories is created inside the backup as inside `/var/lib/clickhouse/`. The query performs `chmod` for all files, forbidding writing into them. The same structure of directories is created inside the backup as inside `/var/lib/clickhouse/`. The query performs `chmod` for all files, forbidding writing into them.
@ -249,7 +249,7 @@ Although the query is called `ALTER TABLE`, it does not change the table structu
## MOVE PARTITION\|PART ## MOVE PARTITION\|PART
Moves partitions or data parts to another volume or disk for `MergeTree`-engine tables. See [Using Multiple Block Devices for Data Storage](../../../engines/table-engines/mergetree-family/mergetree.md#table_engine-mergetree-multiple-volumes). Moves partitions or data parts to another volume or disk for `MergeTree`-engine tables. See [Using Multiple Block Devices for Data Storage](/docs/en/engines/table-engines/mergetree-family/mergetree.md/#table_engine-mergetree-multiple-volumes).
``` sql ``` sql
ALTER TABLE table_name [ON CLUSTER cluster] MOVE PARTITION|PART partition_expr TO DISK|VOLUME 'disk_name' ALTER TABLE table_name [ON CLUSTER cluster] MOVE PARTITION|PART partition_expr TO DISK|VOLUME 'disk_name'
@ -270,7 +270,7 @@ ALTER TABLE hits MOVE PARTITION '2019-09-01' TO DISK 'fast_ssd'
## UPDATE IN PARTITION ## UPDATE IN PARTITION
Manipulates data in the specifies partition matching the specified filtering expression. Implemented as a [mutation](../../../sql-reference/statements/alter/index.md#mutations). Manipulates data in the specifies partition matching the specified filtering expression. Implemented as a [mutation](/docs/en/sql-reference/statements/alter/index.md/#mutations).
Syntax: Syntax:
@ -286,11 +286,11 @@ ALTER TABLE mt UPDATE x = x + 1 IN PARTITION 2 WHERE p = 2;
### See Also ### See Also
- [UPDATE](../../../sql-reference/statements/alter/update.md#alter-table-update-statements) - [UPDATE](/docs/en/sql-reference/statements/alter/update.md/#alter-table-update-statements)
## DELETE IN PARTITION ## DELETE IN PARTITION
Deletes data in the specifies partition matching the specified filtering expression. Implemented as a [mutation](../../../sql-reference/statements/alter/index.md#mutations). Deletes data in the specifies partition matching the specified filtering expression. Implemented as a [mutation](/docs/en/sql-reference/statements/alter/index.md/#mutations).
Syntax: Syntax:
@ -306,7 +306,7 @@ ALTER TABLE mt DELETE IN PARTITION 2 WHERE p = 2;
### See Also ### See Also
- [DELETE](../../../sql-reference/statements/alter/delete.md#alter-mutations) - [DELETE](/docs/en/sql-reference/statements/alter/delete.md/#alter-mutations)
## How to Set Partition Expression ## How to Set Partition Expression
@ -315,16 +315,16 @@ You can specify the partition expression in `ALTER ... PARTITION` queries in dif
- As a value from the `partition` column of the `system.parts` table. For example, `ALTER TABLE visits DETACH PARTITION 201901`. - As a value from the `partition` column of the `system.parts` table. For example, `ALTER TABLE visits DETACH PARTITION 201901`.
- As a tuple of expressions or constants that matches (in types) the table partitioning keys tuple. In the case of a single element partitioning key, the expression should be wrapped in the `tuple (...)` function. For example, `ALTER TABLE visits DETACH PARTITION tuple(toYYYYMM(toDate('2019-01-25')))`. - As a tuple of expressions or constants that matches (in types) the table partitioning keys tuple. In the case of a single element partitioning key, the expression should be wrapped in the `tuple (...)` function. For example, `ALTER TABLE visits DETACH PARTITION tuple(toYYYYMM(toDate('2019-01-25')))`.
- Using the partition ID. Partition ID is a string identifier of the partition (human-readable, if possible) that is used as the names of partitions in the file system and in ZooKeeper. The partition ID must be specified in the `PARTITION ID` clause, in a single quotes. For example, `ALTER TABLE visits DETACH PARTITION ID '201901'`. - Using the partition ID. Partition ID is a string identifier of the partition (human-readable, if possible) that is used as the names of partitions in the file system and in ZooKeeper. The partition ID must be specified in the `PARTITION ID` clause, in a single quotes. For example, `ALTER TABLE visits DETACH PARTITION ID '201901'`.
- In the [ALTER ATTACH PART](#alter_attach-partition) and [DROP DETACHED PART](#alter_drop-detached) query, to specify the name of a part, use string literal with a value from the `name` column of the [system.detached_parts](../../../operations/system-tables/detached_parts.md#system_tables-detached_parts) table. For example, `ALTER TABLE visits ATTACH PART '201901_1_1_0'`. - In the [ALTER ATTACH PART](#alter_attach-partition) and [DROP DETACHED PART](#alter_drop-detached) query, to specify the name of a part, use string literal with a value from the `name` column of the [system.detached_parts](/docs/en/operations/system-tables/detached_parts.md/#system_tables-detached_parts) table. For example, `ALTER TABLE visits ATTACH PART '201901_1_1_0'`.
Usage of quotes when specifying the partition depends on the type of partition expression. For example, for the `String` type, you have to specify its name in quotes (`'`). For the `Date` and `Int*` types no quotes are needed. Usage of quotes when specifying the partition depends on the type of partition expression. For example, for the `String` type, you have to specify its name in quotes (`'`). For the `Date` and `Int*` types no quotes are needed.
All the rules above are also true for the [OPTIMIZE](../../../sql-reference/statements/optimize.md) query. If you need to specify the only partition when optimizing a non-partitioned table, set the expression `PARTITION tuple()`. For example: All the rules above are also true for the [OPTIMIZE](/docs/en/sql-reference/statements/optimize.md) query. If you need to specify the only partition when optimizing a non-partitioned table, set the expression `PARTITION tuple()`. For example:
``` sql ``` sql
OPTIMIZE TABLE table_not_partitioned PARTITION tuple() FINAL; OPTIMIZE TABLE table_not_partitioned PARTITION tuple() FINAL;
``` ```
`IN PARTITION` specifies the partition to which the [UPDATE](../../../sql-reference/statements/alter/update.md#alter-table-update-statements) or [DELETE](../../../sql-reference/statements/alter/delete.md#alter-mutations) expressions are applied as a result of the `ALTER TABLE` query. New parts are created only from the specified partition. In this way, `IN PARTITION` helps to reduce the load when the table is divided into many partitions, and you only need to update the data point-by-point. `IN PARTITION` specifies the partition to which the [UPDATE](/docs/en/sql-reference/statements/alter/update.md/#alter-table-update-statements) or [DELETE](/docs/en/sql-reference/statements/alter/delete.md/#alter-mutations) expressions are applied as a result of the `ALTER TABLE` query. New parts are created only from the specified partition. In this way, `IN PARTITION` helps to reduce the load when the table is divided into many partitions, and you only need to update the data point-by-point.
The examples of `ALTER ... PARTITION` queries are demonstrated in the tests [`00502_custom_partitioning_local`](https://github.com/ClickHouse/ClickHouse/blob/master/tests/queries/0_stateless/00502_custom_partitioning_local.sql) and [`00502_custom_partitioning_replicated_zookeeper`](https://github.com/ClickHouse/ClickHouse/blob/master/tests/queries/0_stateless/00502_custom_partitioning_replicated_zookeeper.sql). The examples of `ALTER ... PARTITION` queries are demonstrated in the tests [`00502_custom_partitioning_local`](https://github.com/ClickHouse/ClickHouse/blob/master/tests/queries/0_stateless/00502_custom_partitioning_local.sql) and [`00502_custom_partitioning_replicated_zookeeper`](https://github.com/ClickHouse/ClickHouse/blob/master/tests/queries/0_stateless/00502_custom_partitioning_replicated_zookeeper.sql).

View File

@ -5,21 +5,29 @@ sidebar_label: PROJECTION
title: "Manipulating Projections" title: "Manipulating Projections"
--- ---
The following operations with [projections](../../../engines/table-engines/mergetree-family/mergetree.md#projections) are available: The following operations with [projections](/docs/en/engines/table-engines/mergetree-family/mergetree.md/#projections) are available:
- `ALTER TABLE [db].name ADD PROJECTION name ( SELECT <COLUMN LIST EXPR> [GROUP BY] [ORDER BY] )` - Adds projection description to tables metadata. ## ADD PROJECTION
- `ALTER TABLE [db].name DROP PROJECTION name` - Removes projection description from tables metadata and deletes projection files from disk. Implemented as a [mutation](../../../sql-reference/statements/alter/index.md#mutations). `ALTER TABLE [db].name ADD PROJECTION name ( SELECT <COLUMN LIST EXPR> [GROUP BY] [ORDER BY] )` - Adds projection description to tables metadata.
- `ALTER TABLE [db.]table MATERIALIZE PROJECTION name IN PARTITION partition_name` - The query rebuilds the projection `name` in the partition `partition_name`. Implemented as a [mutation](../../../sql-reference/statements/alter/index.md#mutations). ## DROP PROJECTION
- `ALTER TABLE [db.]table CLEAR PROJECTION name IN PARTITION partition_name` - Deletes projection files from disk without removing description. Implemented as a [mutation](../../../sql-reference/statements/alter/index.md#mutations). `ALTER TABLE [db].name DROP PROJECTION name` - Removes projection description from tables metadata and deletes projection files from disk. Implemented as a [mutation](/docs/en/sql-reference/statements/alter/index.md/#mutations).
## MATERIALIZE PROJECTION
`ALTER TABLE [db.]table MATERIALIZE PROJECTION name IN PARTITION partition_name` - The query rebuilds the projection `name` in the partition `partition_name`. Implemented as a [mutation](/docs/en/sql-reference/statements/alter/index.md/#mutations).
## CLEAR PROJECTION
`ALTER TABLE [db.]table CLEAR PROJECTION name IN PARTITION partition_name` - Deletes projection files from disk without removing description. Implemented as a [mutation](/docs/en/sql-reference/statements/alter/index.md/#mutations).
The commands `ADD`, `DROP` and `CLEAR` are lightweight in a sense that they only change metadata or remove files. The commands `ADD`, `DROP` and `CLEAR` are lightweight in a sense that they only change metadata or remove files.
Also, they are replicated, syncing projections metadata via ZooKeeper. Also, they are replicated, syncing projections metadata via ClickHouse Keeper or ZooKeeper.
:::note :::note
Projection manipulation is supported only for tables with [`*MergeTree`](../../../engines/table-engines/mergetree-family/mergetree.md) engine (including [replicated](../../../engines/table-engines/mergetree-family/replication.md) variants). Projection manipulation is supported only for tables with [`*MergeTree`](/docs/en/engines/table-engines/mergetree-family/mergetree.md) engine (including [replicated](/docs/en/engines/table-engines/mergetree-family/replication.md) variants).
::: :::

View File

@ -1,5 +1,6 @@
--- ---
slug: /en/sql-reference/statements/alter/index slug: /en/sql-reference/statements/alter/skipping-index
toc_hidden_folder: true toc_hidden_folder: true
sidebar_position: 42 sidebar_position: 42
sidebar_label: INDEX sidebar_label: INDEX
@ -13,12 +14,12 @@ The following operations are available:
- `ALTER TABLE [db].table_name [ON CLUSTER cluster] DROP INDEX name` - Removes index description from tables metadata and deletes index files from disk. - `ALTER TABLE [db].table_name [ON CLUSTER cluster] DROP INDEX name` - Removes index description from tables metadata and deletes index files from disk.
- `ALTER TABLE [db.]table_name [ON CLUSTER cluster] MATERIALIZE INDEX name [IN PARTITION partition_name]` - Rebuilds the secondary index `name` for the specified `partition_name`. Implemented as a [mutation](../../../../sql-reference/statements/alter/index.md#mutations). If `IN PARTITION` part is omitted then it rebuilds the index for the whole table data. - `ALTER TABLE [db.]table_name [ON CLUSTER cluster] MATERIALIZE INDEX name [IN PARTITION partition_name]` - Rebuilds the secondary index `name` for the specified `partition_name`. Implemented as a [mutation](/docs/en/sql-reference/statements/alter/index.md/#mutations). If `IN PARTITION` part is omitted then it rebuilds the index for the whole table data.
The first two commands are lightweight in a sense that they only change metadata or remove files. The first two commands are lightweight in a sense that they only change metadata or remove files.
Also, they are replicated, syncing indices metadata via ZooKeeper. Also, they are replicated, syncing indices metadata via ZooKeeper.
:::note :::note
Index manipulation is supported only for tables with [`*MergeTree`](../../../../engines/table-engines/mergetree-family/mergetree.md) engine (including [replicated](../../../../engines/table-engines/mergetree-family/replication.md) variants). Index manipulation is supported only for tables with [`*MergeTree`](/docs/en/engines/table-engines/mergetree-family/mergetree.md) engine (including [replicated](/docs/en/engines/table-engines/mergetree-family/replication.md) variants).
::: :::

View File

@ -10,7 +10,7 @@ sidebar_label: UPDATE
ALTER TABLE [db.]table [ON CLUSTER cluster] UPDATE column1 = expr1 [, ...] WHERE filter_expr ALTER TABLE [db.]table [ON CLUSTER cluster] UPDATE column1 = expr1 [, ...] WHERE filter_expr
``` ```
Manipulates data matching the specified filtering expression. Implemented as a [mutation](../../../sql-reference/statements/alter/index.md#mutations). Manipulates data matching the specified filtering expression. Implemented as a [mutation](/docs/en/sql-reference/statements/alter/index.md/#mutations).
:::note :::note
The `ALTER TABLE` prefix makes this syntax different from most other systems supporting SQL. It is intended to signify that unlike similar queries in OLTP databases this is a heavy operation not designed for frequent use. The `ALTER TABLE` prefix makes this syntax different from most other systems supporting SQL. It is intended to signify that unlike similar queries in OLTP databases this is a heavy operation not designed for frequent use.
@ -20,11 +20,11 @@ The `filter_expr` must be of type `UInt8`. This query updates values of specifie
One query can contain several commands separated by commas. One query can contain several commands separated by commas.
The synchronicity of the query processing is defined by the [mutations_sync](../../../operations/settings/settings.md#mutations_sync) setting. By default, it is asynchronous. The synchronicity of the query processing is defined by the [mutations_sync](/docs/en/operations/settings/settings.md/#mutations_sync) setting. By default, it is asynchronous.
**See also** **See also**
- [Mutations](../../../sql-reference/statements/alter/index.md#mutations) - [Mutations](/docs/en/sql-reference/statements/alter/index.md/#mutations)
- [Synchronicity of ALTER Queries](../../../sql-reference/statements/alter/index.md#synchronicity-of-alter-queries) - [Synchronicity of ALTER Queries](/docs/en/sql-reference/statements/alter/index.md/#synchronicity-of-alter-queries)
- [mutations_sync](../../../operations/settings/settings.md#mutations_sync) setting - [mutations_sync](/docs/en/operations/settings/settings.md/#mutations_sync) setting

View File

@ -12,7 +12,7 @@ Syntax:
``` sql ``` sql
ALTER USER [IF EXISTS] name1 [ON CLUSTER cluster_name1] [RENAME TO new_name1] ALTER USER [IF EXISTS] name1 [ON CLUSTER cluster_name1] [RENAME TO new_name1]
[, name2 [ON CLUSTER cluster_name2] [RENAME TO new_name2] ...] [, name2 [ON CLUSTER cluster_name2] [RENAME TO new_name2] ...]
[NOT IDENTIFIED | IDENTIFIED {[WITH {no_password | plaintext_password | sha256_password | sha256_hash | double_sha1_password | double_sha1_hash}] BY {'password' | 'hash'}} | {WITH ldap SERVER 'server_name'} | {WITH kerberos [REALM 'realm']}] [NOT IDENTIFIED | IDENTIFIED {[WITH {no_password | plaintext_password | sha256_password | sha256_hash | double_sha1_password | double_sha1_hash}] BY {'password' | 'hash'}} | {WITH ldap SERVER 'server_name'} | {WITH kerberos [REALM 'realm']} | {WITH ssl_certificate CN 'common_name'}]
[[ADD | DROP] HOST {LOCAL | NAME 'name' | REGEXP 'name_regexp' | IP 'address' | LIKE 'pattern'} [,...] | ANY | NONE] [[ADD | DROP] HOST {LOCAL | NAME 'name' | REGEXP 'name_regexp' | IP 'address' | LIKE 'pattern'} [,...] | ANY | NONE]
[DEFAULT ROLE role [,...] | ALL | ALL EXCEPT role [,...] ] [DEFAULT ROLE role [,...] | ALL | ALL EXCEPT role [,...] ]
[GRANTEES {user | role | ANY | NONE} [,...] [EXCEPT {user | role} [,...]]] [GRANTEES {user | role | ANY | NONE} [,...] [EXCEPT {user | role} [,...]]]

View File

@ -8,7 +8,7 @@ title: "CHECK TABLE Statement"
Checks if the data in the table is corrupted. Checks if the data in the table is corrupted.
``` sql ``` sql
CHECK TABLE [db.]name CHECK TABLE [db.]name [PARTITION partition_expr]
``` ```
The `CHECK TABLE` query compares actual file sizes with the expected values which are stored on the server. If the file sizes do not match the stored values, it means the data is corrupted. This can be caused, for example, by a system crash during query execution. The `CHECK TABLE` query compares actual file sizes with the expected values which are stored on the server. If the file sizes do not match the stored values, it means the data is corrupted. This can be caused, for example, by a system crash during query execution.

View File

@ -4,7 +4,7 @@ sidebar_position: 38
sidebar_label: FUNCTION sidebar_label: FUNCTION
--- ---
# CREATE FUNCTION # CREATE FUNCTION &mdash; user defined function (UDF)
Creates a user defined function from a lambda expression. The expression must consist of function parameters, constants, operators, or other function calls. Creates a user defined function from a lambda expression. The expression must consist of function parameters, constants, operators, or other function calls.

View File

@ -12,7 +12,7 @@ Syntax:
``` sql ``` sql
CREATE USER [IF NOT EXISTS | OR REPLACE] name1 [ON CLUSTER cluster_name1] CREATE USER [IF NOT EXISTS | OR REPLACE] name1 [ON CLUSTER cluster_name1]
[, name2 [ON CLUSTER cluster_name2] ...] [, name2 [ON CLUSTER cluster_name2] ...]
[NOT IDENTIFIED | IDENTIFIED {[WITH {no_password | plaintext_password | sha256_password | sha256_hash | double_sha1_password | double_sha1_hash}] BY {'password' | 'hash'}} | {WITH ldap SERVER 'server_name'} | {WITH kerberos [REALM 'realm']}] [NOT IDENTIFIED | IDENTIFIED {[WITH {no_password | plaintext_password | sha256_password | sha256_hash | double_sha1_password | double_sha1_hash}] BY {'password' | 'hash'}} | {WITH ldap SERVER 'server_name'} | {WITH kerberos [REALM 'realm']} | {WITH ssl_certificate CN 'common_name'}]
[HOST {LOCAL | NAME 'name' | REGEXP 'name_regexp' | IP 'address' | LIKE 'pattern'} [,...] | ANY | NONE] [HOST {LOCAL | NAME 'name' | REGEXP 'name_regexp' | IP 'address' | LIKE 'pattern'} [,...] | ANY | NONE]
[DEFAULT ROLE role [,...]] [DEFAULT ROLE role [,...]]
[DEFAULT DATABASE database | NONE] [DEFAULT DATABASE database | NONE]
@ -34,6 +34,7 @@ There are multiple ways of user identification:
- `IDENTIFIED WITH double_sha1_hash BY 'hash'` - `IDENTIFIED WITH double_sha1_hash BY 'hash'`
- `IDENTIFIED WITH ldap SERVER 'server_name'` - `IDENTIFIED WITH ldap SERVER 'server_name'`
- `IDENTIFIED WITH kerberos` or `IDENTIFIED WITH kerberos REALM 'realm'` - `IDENTIFIED WITH kerberos` or `IDENTIFIED WITH kerberos REALM 'realm'`
- `IDENTIFIED WITH ssl_certificate CN 'mysite.com:user'`
For identification with sha256_hash using `SALT` - hash must be calculated from concatination of 'password' and 'salt'. For identification with sha256_hash using `SALT` - hash must be calculated from concatination of 'password' and 'salt'.

View File

@ -8,25 +8,25 @@ sidebar_label: Statements
Statements represent various kinds of action you can perform using SQL queries. Each kind of statement has its own syntax and usage details that are described separately: Statements represent various kinds of action you can perform using SQL queries. Each kind of statement has its own syntax and usage details that are described separately:
- [SELECT](../../sql-reference/statements/select/index.md) - [SELECT](/docs/en/sql-reference/statements/select/index.md)
- [INSERT INTO](../../sql-reference/statements/insert-into.md) - [INSERT INTO](/docs/en/sql-reference/statements/insert-into.md)
- [CREATE](../../sql-reference/statements/create/index.md) - [CREATE](/docs/en/sql-reference/statements/create/index.md)
- [ALTER](../../sql-reference/statements/alter/index.md) - [ALTER](/docs/en/sql-reference/statements/alter/index.md)
- [SYSTEM](../../sql-reference/statements/system.md) - [SYSTEM](/docs/en/sql-reference/statements/system.md)
- [SHOW](../../sql-reference/statements/show.md) - [SHOW](/docs/en/sql-reference/statements/show.md)
- [GRANT](../../sql-reference/statements/grant.md) - [GRANT](/docs/en/sql-reference/statements/grant.md)
- [REVOKE](../../sql-reference/statements/revoke.md) - [REVOKE](/docs/en/sql-reference/statements/revoke.md)
- [ATTACH](../../sql-reference/statements/attach.md) - [ATTACH](/docs/en/sql-reference/statements/attach.md)
- [CHECK TABLE](../../sql-reference/statements/check-table.md) - [CHECK TABLE](/docs/en/sql-reference/statements/check-table.md)
- [DESCRIBE TABLE](../../sql-reference/statements/describe-table.md) - [DESCRIBE TABLE](/docs/en/sql-reference/statements/describe-table.md)
- [DETACH](../../sql-reference/statements/detach.md) - [DETACH](/docs/en/sql-reference/statements/detach.md)
- [DROP](../../sql-reference/statements/drop.md) - [DROP](/docs/en/sql-reference/statements/drop.md)
- [EXISTS](../../sql-reference/statements/exists.md) - [EXISTS](/docs/en/sql-reference/statements/exists.md)
- [KILL](../../sql-reference/statements/kill.md) - [KILL](/docs/en/sql-reference/statements/kill.md)
- [OPTIMIZE](../../sql-reference/statements/optimize.md) - [OPTIMIZE](/docs/en/sql-reference/statements/optimize.md)
- [RENAME](../../sql-reference/statements/rename.md) - [RENAME](/docs/en/sql-reference/statements/rename.md)
- [SET](../../sql-reference/statements/set.md) - [SET](/docs/en/sql-reference/statements/set.md)
- [SET ROLE](../../sql-reference/statements/set-role.md) - [SET ROLE](/docs/en/sql-reference/statements/set-role.md)
- [TRUNCATE](../../sql-reference/statements/truncate.md) - [TRUNCATE](/docs/en/sql-reference/statements/truncate.md)
- [USE](../../sql-reference/statements/use.md) - [USE](/docs/en/sql-reference/statements/use.md)
- [EXPLAIN](../../sql-reference/statements/explain.md) - [EXPLAIN](/docs/en/sql-reference/statements/explain.md)

View File

@ -22,7 +22,7 @@ The `OPTIMIZE` query is supported for [MergeTree](../../engines/table-engines/me
When `OPTIMIZE` is used with the [ReplicatedMergeTree](../../engines/table-engines/mergetree-family/replication.md) family of table engines, ClickHouse creates a task for merging and waits for execution on all replicas (if the [replication_alter_partitions_sync](../../operations/settings/settings.md#replication-alter-partitions-sync) setting is set to `2`) or on current replica (if the [replication_alter_partitions_sync](../../operations/settings/settings.md#replication-alter-partitions-sync) setting is set to `1`). When `OPTIMIZE` is used with the [ReplicatedMergeTree](../../engines/table-engines/mergetree-family/replication.md) family of table engines, ClickHouse creates a task for merging and waits for execution on all replicas (if the [replication_alter_partitions_sync](../../operations/settings/settings.md#replication-alter-partitions-sync) setting is set to `2`) or on current replica (if the [replication_alter_partitions_sync](../../operations/settings/settings.md#replication-alter-partitions-sync) setting is set to `1`).
- If `OPTIMIZE` does not perform a merge for any reason, it does not notify the client. To enable notifications, use the [optimize_throw_if_noop](../../operations/settings/settings.md#setting-optimize_throw_if_noop) setting. - If `OPTIMIZE` does not perform a merge for any reason, it does not notify the client. To enable notifications, use the [optimize_throw_if_noop](../../operations/settings/settings.md#setting-optimize_throw_if_noop) setting.
- If you specify a `PARTITION`, only the specified partition is optimized. [How to set partition expression](../../sql-reference/statements/alter/index.md#alter-how-to-specify-part-expr). - If you specify a `PARTITION`, only the specified partition is optimized. [How to set partition expression](alter/partition.md#how-to-set-partition-expression).
- If you specify `FINAL`, optimization is performed even when all the data is already in one part. Also merge is forced even if concurrent merges are performed. - If you specify `FINAL`, optimization is performed even when all the data is already in one part. Also merge is forced even if concurrent merges are performed.
- If you specify `DEDUPLICATE`, then completely identical rows (unless by-clause is specified) will be deduplicated (all columns are compared), it makes sense only for the MergeTree engine. - If you specify `DEDUPLICATE`, then completely identical rows (unless by-clause is specified) will be deduplicated (all columns are compared), it makes sense only for the MergeTree engine.

View File

@ -7,7 +7,7 @@ sidebar_label: INTERSECT
The `INTERSECT` clause returns only those rows that result from both the first and the second queries. The queries must match the number of columns, order, and type. The result of `INTERSECT` can contain duplicate rows. The `INTERSECT` clause returns only those rows that result from both the first and the second queries. The queries must match the number of columns, order, and type. The result of `INTERSECT` can contain duplicate rows.
Multiple `INTERSECT` statements are executes left to right if parenthesis are not specified. The `INTERSECT` operator has a higher priority than the `UNION` and `EXCEPT` clause. Multiple `INTERSECT` statements are executed left to right if parentheses are not specified. The `INTERSECT` operator has a higher priority than the `UNION` and `EXCEPT` clauses.
``` sql ``` sql

View File

@ -281,8 +281,8 @@ After running this statement the `[db.]replicated_merge_tree_family_table_name`
### RESTART REPLICA ### RESTART REPLICA
Provides possibility to reinitialize Zookeeper sessions state for `ReplicatedMergeTree` table, will compare current state with Zookeeper as source of true and add tasks to Zookeeper queue if needed. Provides possibility to reinitialize Zookeeper session's state for `ReplicatedMergeTree` table, will compare current state with Zookeeper as source of truth and add tasks to Zookeeper queue if needed.
Initialization replication queue based on ZooKeeper date happens in the same way as `ATTACH TABLE` statement. For a short time the table will be unavailable for any operations. Initialization of replication queue based on ZooKeeper data happens in the same way as for `ATTACH TABLE` statement. For a short time, the table will be unavailable for any operations.
``` sql ``` sql
SYSTEM RESTART REPLICA [db.]replicated_merge_tree_family_table_name SYSTEM RESTART REPLICA [db.]replicated_merge_tree_family_table_name

Some files were not shown because too many files have changed in this diff Show More