Merge remote-tracking branch 'rschu1ze/master' into docs-install

This commit is contained in:
Robert Schulze 2024-05-02 13:20:53 +00:00
commit e12e5bd7bf
No known key found for this signature in database
GPG Key ID: 26703B55FB13728A
123 changed files with 5631 additions and 2699 deletions

View File

@@ -41,8 +41,9 @@ At a minimum, the following information should be added (but add more as needed)
> Information about CI checks: https://clickhouse.com/docs/en/development/continuous-integration/
---
### Modify your CI run:
<details>
<summary>Modify your CI run</summary>
**NOTE:** If you merge the PR with modified CI you **MUST KNOW** what you are doing
**NOTE:** Checked options will be applied if set before CI RunConfig/PrepareRunConfig step
@@ -56,6 +57,7 @@ At a minimum, the following information should be added (but add more as needed)
- [ ] <!---ci_include_asan--> All with ASAN
- [ ] <!---ci_include_tsan--> All with TSAN
- [ ] <!---ci_include_analyzer--> All with Analyzer
- [ ] <!---ci_include_azure --> All with Azure
- [ ] <!---ci_include_KEYWORD--> Add your option here
#### Exclude tests:
@@ -82,3 +84,5 @@ At a minimum, the following information should be added (but add more as needed)
- [ ] <!---batch_1--> 2
- [ ] <!---batch_2--> 3
- [ ] <!---batch_3--> 4
</details>

View File

@@ -239,15 +239,14 @@ jobs:
build_name: binary_riscv64
data: ${{ needs.RunConfig.outputs.data }}
checkout_depth: 0
# disabled because s390x refused to build in the migration to OpenSSL
# BuilderBinS390X:
# needs: [RunConfig, BuilderDebRelease]
# if: ${{ !failure() && !cancelled() }}
# uses: ./.github/workflows/reusable_build.yml
# with:
# build_name: binary_s390x
# data: ${{ needs.RunConfig.outputs.data }}
# checkout_depth: 0
BuilderBinS390X:
needs: [RunConfig, BuilderDebRelease]
if: ${{ !failure() && !cancelled() }}
uses: ./.github/workflows/reusable_build.yml
with:
build_name: binary_s390x
data: ${{ needs.RunConfig.outputs.data }}
checkout_depth: 0
############################################################################################
##################################### Docker images #######################################
############################################################################################
@@ -298,7 +297,7 @@ jobs:
- BuilderBinFreeBSD
- BuilderBinPPC64
- BuilderBinRISCV64
# - BuilderBinS390X # disabled because s390x refused to build in the migration to OpenSSL
- BuilderBinS390X
- BuilderBinAmd64Compat
- BuilderBinAarch64V80Compat
- BuilderBinClangTidy

View File

@@ -7,7 +7,7 @@
# 2024 Changelog
### <a id="244"></a> ClickHouse release 24.4 LTS, 2024-04-30
### <a id="244"></a> ClickHouse release 24.4, 2024-04-30
#### Upgrade Notes
* `clickhouse-odbc-bridge` and `clickhouse-library-bridge` are now separate packages. This closes [#61677](https://github.com/ClickHouse/ClickHouse/issues/61677). [#62114](https://github.com/ClickHouse/ClickHouse/pull/62114) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
@@ -33,7 +33,6 @@
#### Performance Improvement
* JOIN filter push down improvements using equivalent sets. [#61216](https://github.com/ClickHouse/ClickHouse/pull/61216) ([Maksim Kita](https://github.com/kitaisreal)).
* Convert OUTER JOIN to INNER JOIN optimization if the filter after JOIN always filters default values. Optimization can be controlled with setting `query_plan_convert_outer_join_to_inner_join`, enabled by default. [#62907](https://github.com/ClickHouse/ClickHouse/pull/62907) ([Maksim Kita](https://github.com/kitaisreal)).
* Enabled fast Parquet encoder by default (output_format_parquet_use_custom_encoder). [#62088](https://github.com/ClickHouse/ClickHouse/pull/62088) ([Michael Kolupaev](https://github.com/al13n321)).
* Improvement for AWS S3. Client has to send header 'Keep-Alive: timeout=X' to the server. If a client receives a response from the server with that header, client has to use the value from the server. Also for a client it is better not to use a connection which is nearly expired in order to avoid connection close race. [#62249](https://github.com/ClickHouse/ClickHouse/pull/62249) ([Sema Checherinda](https://github.com/CheSema)).
* Reduce overhead of the mutations for SELECTs (v2). [#60856](https://github.com/ClickHouse/ClickHouse/pull/60856) ([Azat Khuzhin](https://github.com/azat)).
* More frequently invoked functions in PODArray are now force-inlined. [#61144](https://github.com/ClickHouse/ClickHouse/pull/61144) ([李扬](https://github.com/taiyang-li)).
@@ -348,6 +347,7 @@
* Add sanity check for number of threads and block sizes. [#60138](https://github.com/ClickHouse/ClickHouse/pull/60138) ([Raúl Marín](https://github.com/Algunenano)).
* Don't infer floats in exponential notation by default. Add a setting `input_format_try_infer_exponent_floats` that will restore previous behaviour (disabled by default). Closes [#59476](https://github.com/ClickHouse/ClickHouse/issues/59476). [#59500](https://github.com/ClickHouse/ClickHouse/pull/59500) ([Kruglov Pavel](https://github.com/Avogar)).
* Allow alter operations to be surrounded by parenthesis. The emission of parentheses can be controlled by the `format_alter_operations_with_parentheses` config. By default, in formatted queries the parentheses are emitted as we store the formatted alter operations in some places as metadata (e.g.: mutations). The new syntax clarifies some of the queries where alter operations end in a list. E.g.: `ALTER TABLE x MODIFY TTL date GROUP BY a, b, DROP COLUMN c` cannot be parsed properly with the old syntax. In the new syntax the query `ALTER TABLE x (MODIFY TTL date GROUP BY a, b), (DROP COLUMN c)` is obvious. Older versions are not able to read the new syntax, therefore using the new syntax might cause issues if newer and older version of ClickHouse are mixed in a single cluster. [#59532](https://github.com/ClickHouse/ClickHouse/pull/59532) ([János Benjamin Antal](https://github.com/antaljanosbenjamin)).
* Fix for the materialized view security issue, which allowed a user to insert into a table without required grants for that. Fix validates that the user has permission to insert not only into a materialized view but also into all underlying tables. This means that some queries, which worked before, now can fail with `Not enough privileges`. To address this problem, the release introduces a new feature of SQL security for views https://clickhouse.com/docs/en/sql-reference/statements/create/view#sql_security. [#54901](https://github.com/ClickHouse/ClickHouse/pull/54901) [#60439](https://github.com/ClickHouse/ClickHouse/pull/60439) ([pufit](https://github.com/pufit)).
#### New Feature
* Added new syntax which allows to specify definer user in View/Materialized View. This allows to execute selects/inserts from views without explicit grants for underlying tables. So, a View will encapsulate the grants. [#54901](https://github.com/ClickHouse/ClickHouse/pull/54901) [#60439](https://github.com/ClickHouse/ClickHouse/pull/60439) ([pufit](https://github.com/pufit)).
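The two entries above describe the new SQL security / definer support for views. A minimal sketch of the syntax, assuming a hypothetical user `alice` and table `sales`; the clause order follows the linked SQL security documentation:

```sql
-- The view's SELECT runs with alice's grants, so callers do not need
-- direct access to the underlying `sales` table.
CREATE VIEW sales_summary
DEFINER = alice SQL SECURITY DEFINER
AS SELECT region, sum(amount) AS total FROM sales GROUP BY region;
```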

View File

@@ -13,9 +13,10 @@ The following versions of ClickHouse server are currently being supported with s
| Version | Supported |
|:-|:-|
| 24.4 | ✔️ |
| 24.3 | ✔️ |
| 24.2 | ✔️ |
| 24.1 | ✔️ |
| 24.1 | ❌ |
| 23.* | ❌ |
| 23.8 | ✔️ |
| 23.7 | ❌ |

View File

@@ -23,18 +23,17 @@ bool cgroupsV2MemoryControllerEnabled()
{
#if defined(OS_LINUX)
chassert(cgroupsV2Enabled());
/// According to https://docs.kernel.org/admin-guide/cgroup-v2.html:
/// - file 'cgroup.controllers' defines which controllers *can* be enabled
/// - file 'cgroup.subtree_control' defines which controllers *are* enabled
/// Caveat: nested groups may disable controllers. For simplicity, check only the top-level group.
std::ifstream subtree_control_file(default_cgroups_mount / "cgroup.subtree_control");
if (!subtree_control_file.is_open())
/// According to https://docs.kernel.org/admin-guide/cgroup-v2.html, file "cgroup.controllers" defines which controllers are available
/// for the current + child cgroups. The set of available controllers can be restricted from level to level using file
/// "cgroups.subtree_control". It is therefore sufficient to check the bottom-most nested "cgroup.controllers" file.
std::string cgroup = cgroupV2OfProcess();
auto cgroup_dir = cgroup.empty() ? default_cgroups_mount : (default_cgroups_mount / cgroup);
std::ifstream controllers_file(cgroup_dir / "cgroup.controllers");
if (!controllers_file.is_open())
return false;
std::string controllers;
std::getline(subtree_control_file, controllers);
if (controllers.find("memory") == std::string::npos)
return false;
return true;
std::getline(controllers_file, controllers);
return controllers.find("memory") != std::string::npos;
#else
return false;
#endif

View File

@@ -2,11 +2,11 @@
# NOTE: has nothing common with DBMS_TCP_PROTOCOL_VERSION,
# only DBMS_TCP_PROTOCOL_VERSION should be incremented on protocol changes.
SET(VERSION_REVISION 54485)
SET(VERSION_REVISION 54486)
SET(VERSION_MAJOR 24)
SET(VERSION_MINOR 4)
SET(VERSION_MINOR 5)
SET(VERSION_PATCH 1)
SET(VERSION_GITHASH 2c5c589a882ceec35439650337b92db3e76f0081)
SET(VERSION_DESCRIBE v24.4.1.1-testing)
SET(VERSION_STRING 24.4.1.1)
SET(VERSION_GITHASH 6d4b31322d168356c8b10c43b4cef157c82337ff)
SET(VERSION_DESCRIBE v24.5.1.1-testing)
SET(VERSION_STRING 24.5.1.1)
# end of autochange

contrib/azure vendored

@@ -1 +1 @@
Subproject commit b90fd3c6ef3185f5be3408056567bca0854129b6
Subproject commit 6262a76ef4c4c330c84e58dd4f6f13f4e6230fcd

View File

@@ -93,7 +93,10 @@ enable_language(ASM)
if(COMPILER_CLANG)
add_definitions(-Wno-unused-command-line-argument)
set(CMAKE_SHARED_LINKER_FLAGS "${CMAKE_SHARED_LINKER_FLAGS} -fuse-ld=lld") # only relevant for -DENABLE_OPENSSL_DYNAMIC=1
# Note that s390x build uses mold linker
if(NOT ARCH_S390X)
set(CMAKE_SHARED_LINKER_FLAGS "${CMAKE_SHARED_LINKER_FLAGS} -fuse-ld=lld") # only relevant for -DENABLE_OPENSSL_DYNAMIC=1
endif()
endif()
if(ARCH_AMD64)
@@ -191,6 +194,9 @@ elseif(ARCH_S390X)
perl_generate_asm(${OPENSSL_SOURCE_DIR}/crypto/aes/asm/aes-s390x.pl ${OPENSSL_BINARY_DIR}/crypto/aes/aes-s390x.S)
perl_generate_asm(${OPENSSL_SOURCE_DIR}/crypto/s390xcpuid.pl ${OPENSSL_BINARY_DIR}/crypto/s390xcpuid.S)
perl_generate_asm(${OPENSSL_SOURCE_DIR}/crypto/chacha/asm/chacha-s390x.pl ${OPENSSL_BINARY_DIR}/crypto/chacha/chacha-s390x.S)
perl_generate_asm(${OPENSSL_SOURCE_DIR}/crypto/rc4/asm/rc4-s390x.pl ${OPENSSL_BINARY_DIR}/crypto/rc4/rc4-s390x.S)
perl_generate_asm(${OPENSSL_SOURCE_DIR}/crypto/sha/asm/keccak1600-s390x.pl ${OPENSSL_BINARY_DIR}/crypto/sha/keccak1600-s390x.S)
elseif(ARCH_RISCV64)
macro(perl_generate_asm FILE_IN FILE_OUT)
add_custom_command(OUTPUT ${FILE_OUT}
@@ -1290,6 +1296,15 @@ elseif(ARCH_S390X)
set(CRYPTO_SRC ${CRYPTO_SRC}
${OPENSSL_BINARY_DIR}/crypto/aes/aes-s390x.S
${OPENSSL_BINARY_DIR}/crypto/s390xcpuid.S
${OPENSSL_SOURCE_DIR}/crypto/bn/asm/s390x.S
${OPENSSL_SOURCE_DIR}/crypto/s390xcap.c
${OPENSSL_SOURCE_DIR}/crypto/bn/bn_s390x.c
${OPENSSL_SOURCE_DIR}/crypto/camellia/camellia.c
${OPENSSL_SOURCE_DIR}/crypto/camellia/cmll_cbc.c
${OPENSSL_BINARY_DIR}/crypto/chacha/chacha-s390x.S
${OPENSSL_BINARY_DIR}/crypto/rc4/rc4-s390x.S
${OPENSSL_BINARY_DIR}/crypto/sha/keccak1600-s390x.S
${OPENSSL_SOURCE_DIR}/crypto/whrlpool/wp_block.c
)
elseif(ARCH_RISCV64)
set(CRYPTO_SRC ${CRYPTO_SRC}

View File

@@ -34,7 +34,7 @@ RUN arch=${TARGETARCH:-amd64} \
# lts / testing / prestable / etc
ARG REPO_CHANNEL="stable"
ARG REPOSITORY="https://packages.clickhouse.com/tgz/${REPO_CHANNEL}"
ARG VERSION="24.3.2.23"
ARG VERSION="24.4.1.2088"
ARG PACKAGES="clickhouse-keeper"
ARG DIRECT_DOWNLOAD_URLS=""

View File

@@ -43,14 +43,13 @@ RUN add-apt-repository ppa:ubuntu-toolchain-r/test --yes \
# Download toolchain and SDK for Darwin
RUN curl -sL -O https://github.com/phracker/MacOSX-SDKs/releases/download/11.3/MacOSX11.0.sdk.tar.xz
# disabled because s390x refused to build in the migration to OpenSSL
# Download and install mold 2.0 for s390x build
# RUN curl -Lo /tmp/mold.tar.gz "https://github.com/rui314/mold/releases/download/v2.0.0/mold-2.0.0-x86_64-linux.tar.gz" \
# && mkdir /tmp/mold \
# && tar -xzf /tmp/mold.tar.gz -C /tmp/mold \
# && cp -r /tmp/mold/mold*/* /usr \
# && rm -rf /tmp/mold \
# && rm /tmp/mold.tar.gz
RUN curl -Lo /tmp/mold.tar.gz "https://github.com/rui314/mold/releases/download/v2.0.0/mold-2.0.0-x86_64-linux.tar.gz" \
&& mkdir /tmp/mold \
&& tar -xzf /tmp/mold.tar.gz -C /tmp/mold \
&& cp -r /tmp/mold/mold*/* /usr \
&& rm -rf /tmp/mold \
&& rm /tmp/mold.tar.gz
# Architecture of the image when BuildKit/buildx is used
ARG TARGETARCH

View File

@@ -148,7 +148,7 @@ def parse_env_variables(
FREEBSD_SUFFIX = "-freebsd"
PPC_SUFFIX = "-ppc64le"
RISCV_SUFFIX = "-riscv64"
# S390X_SUFFIX = "-s390x" # disabled because s390x refused to build in the migration to OpenSSL
S390X_SUFFIX = "-s390x"
AMD64_COMPAT_SUFFIX = "-amd64-compat"
AMD64_MUSL_SUFFIX = "-amd64-musl"
@@ -166,7 +166,7 @@ def parse_env_variables(
is_cross_arm_v80compat = compiler.endswith(ARM_V80COMPAT_SUFFIX)
is_cross_ppc = compiler.endswith(PPC_SUFFIX)
is_cross_riscv = compiler.endswith(RISCV_SUFFIX)
# is_cross_s390x = compiler.endswith(S390X_SUFFIX) # disabled because s390x refused to build in the migration to OpenSSL
is_cross_s390x = compiler.endswith(S390X_SUFFIX)
is_cross_freebsd = compiler.endswith(FREEBSD_SUFFIX)
is_amd64_compat = compiler.endswith(AMD64_COMPAT_SUFFIX)
is_amd64_musl = compiler.endswith(AMD64_MUSL_SUFFIX)
@@ -230,12 +230,11 @@ def parse_env_variables(
cmake_flags.append(
"-DCMAKE_TOOLCHAIN_FILE=/build/cmake/linux/toolchain-riscv64.cmake"
)
# disabled because s390x refused to build in the migration to OpenSSL
# elif is_cross_s390x:
# cc = compiler[: -len(S390X_SUFFIX)]
# cmake_flags.append(
# "-DCMAKE_TOOLCHAIN_FILE=/build/cmake/linux/toolchain-s390x.cmake"
# )
elif is_cross_s390x:
cc = compiler[: -len(S390X_SUFFIX)]
cmake_flags.append(
"-DCMAKE_TOOLCHAIN_FILE=/build/cmake/linux/toolchain-s390x.cmake"
)
elif is_amd64_compat:
cc = compiler[: -len(AMD64_COMPAT_SUFFIX)]
result.append("DEB_ARCH=amd64")
@@ -411,7 +410,7 @@ def parse_args() -> argparse.Namespace:
"clang-17-aarch64-v80compat",
"clang-17-ppc64le",
"clang-17-riscv64",
# "clang-17-s390x", # disabled because s390x refused to build in the migration to OpenSSL
"clang-17-s390x",
"clang-17-amd64-compat",
"clang-17-amd64-musl",
"clang-17-freebsd",

View File

@@ -32,7 +32,7 @@ RUN arch=${TARGETARCH:-amd64} \
# lts / testing / prestable / etc
ARG REPO_CHANNEL="stable"
ARG REPOSITORY="https://packages.clickhouse.com/tgz/${REPO_CHANNEL}"
ARG VERSION="24.3.2.23"
ARG VERSION="24.4.1.2088"
ARG PACKAGES="clickhouse-client clickhouse-server clickhouse-common-static"
ARG DIRECT_DOWNLOAD_URLS=""

View File

@@ -27,7 +27,7 @@ RUN sed -i "s|http://archive.ubuntu.com|${apt_archive}|g" /etc/apt/sources.list
ARG REPO_CHANNEL="stable"
ARG REPOSITORY="deb [signed-by=/usr/share/keyrings/clickhouse-keyring.gpg] https://packages.clickhouse.com/deb ${REPO_CHANNEL} main"
ARG VERSION="24.3.2.23"
ARG VERSION="24.4.1.2088"
ARG PACKAGES="clickhouse-client clickhouse-server clickhouse-common-static"
# set non-empty deb_location_url url to create a docker image

View File

@@ -0,0 +1,60 @@
---
sidebar_position: 1
sidebar_label: 2024
---
# 2024 Changelog
### ClickHouse release v24.2.3.70-stable (8a7b8c7afb6) FIXME as compared to v24.2.2.71-stable (9293d361e72)
#### Performance Improvement
* Backported in [#61630](https://github.com/ClickHouse/ClickHouse/issues/61630): Optimized function `dotProduct` to omit unnecessary and expensive memory copies. [#60928](https://github.com/ClickHouse/ClickHouse/pull/60928) ([Robert Schulze](https://github.com/rschu1ze)).
#### Build/Testing/Packaging Improvement
* Backported in [#62011](https://github.com/ClickHouse/ClickHouse/issues/62011): Remove from the Keeper Docker image the volumes at /etc/clickhouse-keeper and /var/log/clickhouse-keeper. [#61683](https://github.com/ClickHouse/ClickHouse/pull/61683) ([Tristan](https://github.com/Tristan971)).
#### Bug Fix (user-visible misbehavior in an official stable release)
* Fix possible incorrect result of aggregate function `uniqExact` [#61257](https://github.com/ClickHouse/ClickHouse/pull/61257) ([Anton Popov](https://github.com/CurtizJ)).
* Fix ATTACH query with external ON CLUSTER [#61365](https://github.com/ClickHouse/ClickHouse/pull/61365) ([Nikolay Degterinsky](https://github.com/evillique)).
* Fix consecutive keys optimization for nullable keys [#61393](https://github.com/ClickHouse/ClickHouse/pull/61393) ([Anton Popov](https://github.com/CurtizJ)).
* fix issue of actions dag split [#61458](https://github.com/ClickHouse/ClickHouse/pull/61458) ([Raúl Marín](https://github.com/Algunenano)).
* Disable async_insert_use_adaptive_busy_timeout correctly with compatibility settings [#61468](https://github.com/ClickHouse/ClickHouse/pull/61468) ([Raúl Marín](https://github.com/Algunenano)).
* Fix bug when reading system.parts using UUID (issue 61220). [#61479](https://github.com/ClickHouse/ClickHouse/pull/61479) ([Dan Wu](https://github.com/wudanzy)).
* Fix ALTER QUERY MODIFY SQL SECURITY [#61480](https://github.com/ClickHouse/ClickHouse/pull/61480) ([pufit](https://github.com/pufit)).
* Fix client `-s` argument [#61530](https://github.com/ClickHouse/ClickHouse/pull/61530) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Fix string search with const position [#61547](https://github.com/ClickHouse/ClickHouse/pull/61547) ([Antonio Andelic](https://github.com/antonio2368)).
* Cancel merges before removing moved parts [#61610](https://github.com/ClickHouse/ClickHouse/pull/61610) ([János Benjamin Antal](https://github.com/antaljanosbenjamin)).
* Fix crash in `multiSearchAllPositionsCaseInsensitiveUTF8` for incorrect UTF-8 [#61749](https://github.com/ClickHouse/ClickHouse/pull/61749) ([pufit](https://github.com/pufit)).
* Mark CANNOT_PARSE_ESCAPE_SEQUENCE error as parse error to be able to skip it in row input formats [#61883](https://github.com/ClickHouse/ClickHouse/pull/61883) ([Kruglov Pavel](https://github.com/Avogar)).
* Crash in Engine Merge if Row Policy does not have expression [#61971](https://github.com/ClickHouse/ClickHouse/pull/61971) ([Ilya Golshtein](https://github.com/ilejn)).
* Fix data race on scalars in Context [#62305](https://github.com/ClickHouse/ClickHouse/pull/62305) ([Kruglov Pavel](https://github.com/Avogar)).
* Try to fix segfault in Hive engine [#62578](https://github.com/ClickHouse/ClickHouse/pull/62578) ([Nikolay Degterinsky](https://github.com/evillique)).
* Fix memory leak in groupArraySorted [#62597](https://github.com/ClickHouse/ClickHouse/pull/62597) ([Antonio Andelic](https://github.com/antonio2368)).
* Fix GCD codec [#62853](https://github.com/ClickHouse/ClickHouse/pull/62853) ([Nikita Taranov](https://github.com/nickitat)).
* Fix temporary data in cache incorrectly processing failure of cache key directory creation [#62925](https://github.com/ClickHouse/ClickHouse/pull/62925) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Fix incorrect judgement of monotonicity of function abs [#63097](https://github.com/ClickHouse/ClickHouse/pull/63097) ([Duc Canh Le](https://github.com/canhld94)).
* Make sanity check of settings worse [#63119](https://github.com/ClickHouse/ClickHouse/pull/63119) ([Raúl Marín](https://github.com/Algunenano)).
* Set server name for SSL handshake in MongoDB engine [#63122](https://github.com/ClickHouse/ClickHouse/pull/63122) ([Alexander Gololobov](https://github.com/davenger)).
* Format SQL security option only in `CREATE VIEW` queries. [#63136](https://github.com/ClickHouse/ClickHouse/pull/63136) ([pufit](https://github.com/pufit)).
#### CI Fix or Improvement (changelog entry is not required)
* Backported in [#61430](https://github.com/ClickHouse/ClickHouse/issues/61430):. [#61374](https://github.com/ClickHouse/ClickHouse/pull/61374) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Backported in [#61490](https://github.com/ClickHouse/ClickHouse/issues/61490): ... [#61441](https://github.com/ClickHouse/ClickHouse/pull/61441) ([Max K.](https://github.com/maxknv)).
* Backported in [#61638](https://github.com/ClickHouse/ClickHouse/issues/61638):. [#61592](https://github.com/ClickHouse/ClickHouse/pull/61592) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Backported in [#61896](https://github.com/ClickHouse/ClickHouse/issues/61896): ... [#61877](https://github.com/ClickHouse/ClickHouse/pull/61877) ([Max K.](https://github.com/maxknv)).
* Backported in [#62055](https://github.com/ClickHouse/ClickHouse/issues/62055): ... [#62044](https://github.com/ClickHouse/ClickHouse/pull/62044) ([Max K.](https://github.com/maxknv)).
* Backported in [#62203](https://github.com/ClickHouse/ClickHouse/issues/62203):. [#62190](https://github.com/ClickHouse/ClickHouse/pull/62190) ([Konstantin Bogdanov](https://github.com/thevar1able)).
* Backported in [#62800](https://github.com/ClickHouse/ClickHouse/issues/62800): We won't fail the job when GH fails to retrieve the job ID and URLs. [#62651](https://github.com/ClickHouse/ClickHouse/pull/62651) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Backported in [#62970](https://github.com/ClickHouse/ClickHouse/issues/62970):. [#62932](https://github.com/ClickHouse/ClickHouse/pull/62932) ([Robert Schulze](https://github.com/rschu1ze)).
* Backported in [#63115](https://github.com/ClickHouse/ClickHouse/issues/63115): ... [#63108](https://github.com/ClickHouse/ClickHouse/pull/63108) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
#### NO CL ENTRY
* NO CL ENTRY: 'Revert "Backport [#61479](https://github.com/ClickHouse/ClickHouse/issues/61479) to 24.2: Fix bug when reading system.parts using UUID (issue 61220)."'. [#61776](https://github.com/ClickHouse/ClickHouse/pull/61776) ([János Benjamin Antal](https://github.com/antaljanosbenjamin)).
#### NOT FOR CHANGELOG / INSIGNIFICANT
* Throw on query timeout in ZooKeeperRetries [#60922](https://github.com/ClickHouse/ClickHouse/pull/60922) ([Antonio Andelic](https://github.com/antonio2368)).

View File

@@ -0,0 +1,76 @@
---
sidebar_position: 1
sidebar_label: 2024
---
# 2024 Changelog
### ClickHouse release v24.3.3.102-lts (7e7f3bdd9be) FIXME as compared to v24.3.2.23-lts (8b7d910960c)
#### Improvement
* Backported in [#62188](https://github.com/ClickHouse/ClickHouse/issues/62188): StorageJoin with strictness `ANY` is consistent after reload. When several rows with the same key are inserted, the first one will have higher priority (before, it was chosen randomly upon table loading). close [#51027](https://github.com/ClickHouse/ClickHouse/issues/51027). [#61972](https://github.com/ClickHouse/ClickHouse/pull/61972) ([vdimir](https://github.com/vdimir)).
* Backported in [#62443](https://github.com/ClickHouse/ClickHouse/issues/62443): Client has to send header 'Keep-Alive: timeout=X' to the server. If a client receives a response from the server with that header, client has to use the value from the server. Also for a client it is better not to use a connection which is nearly expired in order to avoid connection close race. [#62249](https://github.com/ClickHouse/ClickHouse/pull/62249) ([Sema Checherinda](https://github.com/CheSema)).
* Backported in [#62666](https://github.com/ClickHouse/ClickHouse/issues/62666): S3 storage and backups also need the same default keep alive settings as s3 disk. [#62648](https://github.com/ClickHouse/ClickHouse/pull/62648) ([Sema Checherinda](https://github.com/CheSema)).
#### Build/Testing/Packaging Improvement
* Backported in [#62013](https://github.com/ClickHouse/ClickHouse/issues/62013): Remove from the Keeper Docker image the volumes at /etc/clickhouse-keeper and /var/log/clickhouse-keeper. [#61683](https://github.com/ClickHouse/ClickHouse/pull/61683) ([Tristan](https://github.com/Tristan971)).
#### Bug Fix (user-visible misbehavior in an official stable release)
* Cancel merges before removing moved parts [#61610](https://github.com/ClickHouse/ClickHouse/pull/61610) ([János Benjamin Antal](https://github.com/antaljanosbenjamin)).
* Mark CANNOT_PARSE_ESCAPE_SEQUENCE error as parse error to be able to skip it in row input formats [#61883](https://github.com/ClickHouse/ClickHouse/pull/61883) ([Kruglov Pavel](https://github.com/Avogar)).
* Crash in Engine Merge if Row Policy does not have expression [#61971](https://github.com/ClickHouse/ClickHouse/pull/61971) ([Ilya Golshtein](https://github.com/ilejn)).
* ReadWriteBufferFromHTTP set right header host when redirected [#62068](https://github.com/ClickHouse/ClickHouse/pull/62068) ([Sema Checherinda](https://github.com/CheSema)).
* Analyzer: Fix query parameter resolution [#62186](https://github.com/ClickHouse/ClickHouse/pull/62186) ([Dmitry Novik](https://github.com/novikd)).
* Fixing NULL random seed for generateRandom with analyzer. [#62248](https://github.com/ClickHouse/ClickHouse/pull/62248) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix PartsSplitter [#62268](https://github.com/ClickHouse/ClickHouse/pull/62268) ([Nikita Taranov](https://github.com/nickitat)).
* Analyzer: Fix alias to parametrized view resolution [#62274](https://github.com/ClickHouse/ClickHouse/pull/62274) ([Dmitry Novik](https://github.com/novikd)).
* Analyzer: Fix name resolution from parent scopes [#62281](https://github.com/ClickHouse/ClickHouse/pull/62281) ([Dmitry Novik](https://github.com/novikd)).
* Fix argMax with nullable non native numeric column [#62285](https://github.com/ClickHouse/ClickHouse/pull/62285) ([Raúl Marín](https://github.com/Algunenano)).
* Fix data race on scalars in Context [#62305](https://github.com/ClickHouse/ClickHouse/pull/62305) ([Kruglov Pavel](https://github.com/Avogar)).
* Fix analyzer with positional arguments in distributed query [#62362](https://github.com/ClickHouse/ClickHouse/pull/62362) ([flynn](https://github.com/ucasfl)).
* Fix filter pushdown from additional_table_filters in Merge engine in analyzer [#62398](https://github.com/ClickHouse/ClickHouse/pull/62398) ([Kruglov Pavel](https://github.com/Avogar)).
* Fix GLOBAL IN table queries with analyzer. [#62409](https://github.com/ClickHouse/ClickHouse/pull/62409) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix scalar subquery in LIMIT [#62567](https://github.com/ClickHouse/ClickHouse/pull/62567) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Try to fix segfault in Hive engine [#62578](https://github.com/ClickHouse/ClickHouse/pull/62578) ([Nikolay Degterinsky](https://github.com/evillique)).
* Fix memory leak in groupArraySorted [#62597](https://github.com/ClickHouse/ClickHouse/pull/62597) ([Antonio Andelic](https://github.com/antonio2368)).
* Fix argMin/argMax combinator state [#62708](https://github.com/ClickHouse/ClickHouse/pull/62708) ([Raúl Marín](https://github.com/Algunenano)).
* Fix temporary data in cache failing because of cache lock contention optimization [#62715](https://github.com/ClickHouse/ClickHouse/pull/62715) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Fix FINAL modifier is not respected in CTE with analyzer [#62811](https://github.com/ClickHouse/ClickHouse/pull/62811) ([Duc Canh Le](https://github.com/canhld94)).
* Fix crash in function `formatRow` with `JSON` format and HTTP interface [#62840](https://github.com/ClickHouse/ClickHouse/pull/62840) ([Anton Popov](https://github.com/CurtizJ)).
* Fix GCD codec [#62853](https://github.com/ClickHouse/ClickHouse/pull/62853) ([Nikita Taranov](https://github.com/nickitat)).
* Disable optimize_rewrite_aggregate_function_with_if for sum(nullable) [#62912](https://github.com/ClickHouse/ClickHouse/pull/62912) ([Raúl Marín](https://github.com/Algunenano)).
* Fix temporary data in cache incorrectly processing failure of cache key directory creation [#62925](https://github.com/ClickHouse/ClickHouse/pull/62925) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Fix optimize_rewrite_aggregate_function_with_if implicit cast [#62999](https://github.com/ClickHouse/ClickHouse/pull/62999) ([Raúl Marín](https://github.com/Algunenano)).
* Do not remove server constants from GROUP BY key for secondary query. [#63047](https://github.com/ClickHouse/ClickHouse/pull/63047) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix incorrect judgement of monotonicity of function abs [#63097](https://github.com/ClickHouse/ClickHouse/pull/63097) ([Duc Canh Le](https://github.com/canhld94)).
* Set server name for SSL handshake in MongoDB engine [#63122](https://github.com/ClickHouse/ClickHouse/pull/63122) ([Alexander Gololobov](https://github.com/davenger)).
* Use user specified db instead of "config" for MongoDB wire protocol version check [#63126](https://github.com/ClickHouse/ClickHouse/pull/63126) ([Alexander Gololobov](https://github.com/davenger)).
* Format SQL security option only in `CREATE VIEW` queries. [#63136](https://github.com/ClickHouse/ClickHouse/pull/63136) ([pufit](https://github.com/pufit)).
#### CI Fix or Improvement (changelog entry is not required)
* Backported in [#62802](https://github.com/ClickHouse/ClickHouse/issues/62802): We won't fail the job when GH fails to retrieve the job ID and URLs. [#62651](https://github.com/ClickHouse/ClickHouse/pull/62651) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Backported in [#62834](https://github.com/ClickHouse/ClickHouse/issues/62834): Add `isort` config fo the first-party imports; fail build reports on non-success statuses. [#62786](https://github.com/ClickHouse/ClickHouse/pull/62786) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Backported in [#62879](https://github.com/ClickHouse/ClickHouse/issues/62879):. [#62835](https://github.com/ClickHouse/ClickHouse/pull/62835) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Backported in [#62971](https://github.com/ClickHouse/ClickHouse/issues/62971):. [#62932](https://github.com/ClickHouse/ClickHouse/pull/62932) ([Robert Schulze](https://github.com/rschu1ze)).
* Backported in [#63064](https://github.com/ClickHouse/ClickHouse/issues/63064): ... [#63035](https://github.com/ClickHouse/ClickHouse/pull/63035) ([Aleksei Filatov](https://github.com/aalexfvk)).
* Backported in [#63117](https://github.com/ClickHouse/ClickHouse/issues/63117): ... [#63108](https://github.com/ClickHouse/ClickHouse/pull/63108) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
#### NO CL CATEGORY
* Backported in [#62572](https://github.com/ClickHouse/ClickHouse/issues/62572):. [#62549](https://github.com/ClickHouse/ClickHouse/pull/62549) ([Alexander Tokmakov](https://github.com/tavplubix)).
#### NOT FOR CHANGELOG / INSIGNIFICANT
* Fix another logical error in group_by_use_nulls. [#62236](https://github.com/ClickHouse/ClickHouse/pull/62236) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix lambda(tuple(x), x + 1) syntax in analyzer [#62253](https://github.com/ClickHouse/ClickHouse/pull/62253) ([vdimir](https://github.com/vdimir)).
* Fix __actionName, add tests for internal functions direct call [#62287](https://github.com/ClickHouse/ClickHouse/pull/62287) ([vdimir](https://github.com/vdimir)).
* Fix optimize_uniq_to_count when only prefix of key is matched [#62325](https://github.com/ClickHouse/ClickHouse/pull/62325) ([vdimir](https://github.com/vdimir)).
* Analyzer: Fix PREWHERE with lambda functions [#62336](https://github.com/ClickHouse/ClickHouse/pull/62336) ([vdimir](https://github.com/vdimir)).
* Use function isNotDistinctFrom only in join key [#62387](https://github.com/ClickHouse/ClickHouse/pull/62387) ([vdimir](https://github.com/vdimir)).
* Fix integration-tests logs compression [#62556](https://github.com/ClickHouse/ClickHouse/pull/62556) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Try to fix if_transform_strings_to_enum performance test [#62558](https://github.com/ClickHouse/ClickHouse/pull/62558) ([Dmitry Novik](https://github.com/novikd)).
* Fix shellcheck style checking and issues [#62761](https://github.com/ClickHouse/ClickHouse/pull/62761) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Fix flaky 03128_argMin_combinator_projection [#62965](https://github.com/ClickHouse/ClickHouse/pull/62965) ([Raúl Marín](https://github.com/Algunenano)).

View File

@@ -0,0 +1,403 @@
---
sidebar_position: 1
sidebar_label: 2024
---
# 2024 Changelog
### ClickHouse release v24.4.1.2088-stable (6d4b31322d1) FIXME as compared to v24.3.1.2672-lts (2c5c589a882)
#### Backward Incompatible Change
* Don't allow to set max_parallel_replicas to 0 as it doesn't make sense. Setting it to 0 could lead to unexpected logical errors. Closes [#60140](https://github.com/ClickHouse/ClickHouse/issues/60140). [#61201](https://github.com/ClickHouse/ClickHouse/pull/61201) ([Kruglov Pavel](https://github.com/Avogar)).
* `clickhouse-odbc-bridge` and `clickhouse-library-bridge` are separate packages. This closes [#61677](https://github.com/ClickHouse/ClickHouse/issues/61677). [#62114](https://github.com/ClickHouse/ClickHouse/pull/62114) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Remove support for INSERT WATCH query (part of the experimental LIVE VIEW feature). [#62382](https://github.com/ClickHouse/ClickHouse/pull/62382) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Remove optimize_monotonous_functions_in_order_by setting. [#63004](https://github.com/ClickHouse/ClickHouse/pull/63004) ([Raúl Marín](https://github.com/Algunenano)).
#### New Feature
* Supports dropping multiple tables at the same time, like `drop table a,b,c`. [#58705](https://github.com/ClickHouse/ClickHouse/pull/58705) ([zhongyuankai](https://github.com/zhongyuankai)).
* The table engine is grantable now, and it won't affect existing users' behavior. [#60117](https://github.com/ClickHouse/ClickHouse/pull/60117) ([jsc0218](https://github.com/jsc0218)).
* Added a rewritable S3 disk which supports INSERT operations and does not require locally stored metadata. [#61116](https://github.com/ClickHouse/ClickHouse/pull/61116) ([Julia Kartseva](https://github.com/jkartseva)).
* For convenience, `SELECT * FROM numbers()` will work in the same way as `SELECT * FROM system.numbers` - without a limit. [#61969](https://github.com/ClickHouse/ClickHouse/pull/61969) ([YenchangChan](https://github.com/YenchangChan)).
* Modifying memory table settings through `ALTER MODIFY SETTING` is now supported. ``` ALTER TABLE memory MODIFY SETTING min_rows_to_keep = 100, max_rows_to_keep = 1000; ```. [#62039](https://github.com/ClickHouse/ClickHouse/pull/62039) ([zhongyuankai](https://github.com/zhongyuankai)).
* Analyzer supports recursive CTEs. [#62074](https://github.com/ClickHouse/ClickHouse/pull/62074) ([Maksim Kita](https://github.com/kitaisreal)).
* Analyzer supports the QUALIFY clause. Closes [#47819](https://github.com/ClickHouse/ClickHouse/issues/47819). [#62619](https://github.com/ClickHouse/ClickHouse/pull/62619) ([Maksim Kita](https://github.com/kitaisreal)).
* Added `role` query parameter to the HTTP interface. It works similarly to `SET ROLE x`, applying the role before the statement is executed. This allows for overcoming the limitation of the HTTP interface, as multiple statements are not allowed, and it is not possible to send both `SET ROLE x` and the statement itself at the same time. It is possible to set multiple roles that way, e.g., `?role=x&role=y`, which will be an equivalent of `SET ROLE x, y`. [#62669](https://github.com/ClickHouse/ClickHouse/pull/62669) ([Serge Klochkov](https://github.com/slvrtrn)).
* Add `SYSTEM UNLOAD PRIMARY KEY`. [#62738](https://github.com/ClickHouse/ClickHouse/pull/62738) ([Pablo Marcos](https://github.com/pamarcos)).
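Several of the new features above introduce new SQL syntax. A minimal sketch of how they can be exercised on a 24.4 server; the table names are hypothetical, `QUALIFY` assumes the analyzer is enabled, and the table-scoped form of `SYSTEM UNLOAD PRIMARY KEY` is assumed:

```sql
-- Drop several tables in one statement.
CREATE TABLE a (x UInt8) ENGINE = Memory;
CREATE TABLE b (x UInt8) ENGINE = Memory;
CREATE TABLE c (x UInt8) ENGINE = Memory;
DROP TABLE a, b, c;

-- numbers() without an argument behaves like system.numbers, i.e. without a limit.
SELECT * FROM numbers() LIMIT 5;

-- QUALIFY filters on window-function results.
SELECT
    number % 3 AS bucket,
    number,
    row_number() OVER (PARTITION BY number % 3 ORDER BY number DESC) AS rn
FROM numbers(10)
QUALIFY rn <= 2
ORDER BY bucket, rn;

-- Evict the in-memory primary key of a MergeTree table.
CREATE TABLE pk_demo (id UInt64) ENGINE = MergeTree ORDER BY id;
SYSTEM UNLOAD PRIMARY KEY pk_demo;
```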
#### Performance Improvement
* Reduce overhead of the mutations for SELECTs (v2). [#60856](https://github.com/ClickHouse/ClickHouse/pull/60856) ([Azat Khuzhin](https://github.com/azat)).
* More frequently invoked functions in PODArray are now force-inlined. [#61144](https://github.com/ClickHouse/ClickHouse/pull/61144) ([李扬](https://github.com/taiyang-li)).
* JOIN filter push down improvements using equivalent sets. [#61216](https://github.com/ClickHouse/ClickHouse/pull/61216) ([Maksim Kita](https://github.com/kitaisreal)).
* Enabled fast Parquet encoder by default (output_format_parquet_use_custom_encoder). [#62088](https://github.com/ClickHouse/ClickHouse/pull/62088) ([Michael Kolupaev](https://github.com/al13n321)).
* ... When all required fields are read, skip all remaining fields directly, which can save a lot of comparisons. [#62210](https://github.com/ClickHouse/ClickHouse/pull/62210) ([lgbo](https://github.com/lgbo-ustc)).
* Functions `splitByChar` and `splitByRegexp` were sped up significantly. [#62392](https://github.com/ClickHouse/ClickHouse/pull/62392) ([李扬](https://github.com/taiyang-li)).
* Improve trivial insert select from files in file/s3/hdfs/url/... table functions. Add separate max_parsing_threads setting to control the number of threads used in parallel parsing. [#62404](https://github.com/ClickHouse/ClickHouse/pull/62404) ([Kruglov Pavel](https://github.com/Avogar)).
* Support parallel write buffer for AzureBlobStorage managed by setting `azure_allow_parallel_part_upload`. [#62534](https://github.com/ClickHouse/ClickHouse/pull/62534) ([SmitaRKulkarni](https://github.com/SmitaRKulkarni)).
* Functions `to_utc_timestamp` and `from_utc_timestamp` are now about 2x faster. [#62583](https://github.com/ClickHouse/ClickHouse/pull/62583) ([KevinyhZou](https://github.com/KevinyhZou)).
* Functions `parseDateTimeOrNull`, `parseDateTimeOrZero`, `parseDateTimeInJodaSyntaxOrNull` and `parseDateTimeInJodaSyntaxOrZero` now run significantly faster (10x - 1000x) when the input contains mostly non-parseable values. [#62634](https://github.com/ClickHouse/ClickHouse/pull/62634) ([LiuNeng](https://github.com/liuneng1994)).
* SELECTs against `system.query_cache` are now noticeably faster when the query cache contains lots of entries (e.g. more than 100.000). [#62671](https://github.com/ClickHouse/ClickHouse/pull/62671) ([Robert Schulze](https://github.com/rschu1ze)).
* Query plan: convert OUTER JOIN to INNER JOIN if the filter after the JOIN always filters out default values. Optimization can be controlled with setting `query_plan_convert_outer_join_to_inner_join`, enabled by default. [#62907](https://github.com/ClickHouse/ClickHouse/pull/62907) ([Maksim Kita](https://github.com/kitaisreal)).
* Enable optimize_rewrite_sum_if_to_count_if by default. [#62929](https://github.com/ClickHouse/ClickHouse/pull/62929) ([Raúl Marín](https://github.com/Algunenano)).
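For the `query_plan_convert_outer_join_to_inner_join` entry above, a hedged sketch of how the rewrite can be observed with `EXPLAIN`; `t1` and `t2` are hypothetical tables:

```sql
-- The WHERE condition rejects rows in which t2 columns carry default (non-matched)
-- values, so the planner may rewrite the LEFT JOIN into an INNER JOIN.
EXPLAIN PLAN
SELECT t1.id, t2.value
FROM t1
LEFT JOIN t2 ON t1.id = t2.id
WHERE t2.value != 0
SETTINGS query_plan_convert_outer_join_to_inner_join = 1;
```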
#### Improvement
* Introduce separate consumer/producer tags for the Kafka configuration. This avoids warnings from librdkafka that consumer properties were specified for producer instances and vice versa (e.g. `Configuration property session.timeout.ms is a consumer property and will be ignored by this producer instance`). Closes: [#58983](https://github.com/ClickHouse/ClickHouse/issues/58983). [#58956](https://github.com/ClickHouse/ClickHouse/pull/58956) ([Aleksandr Musorin](https://github.com/AVMusorin)).
* Added `value1`, `value2`, ..., `value10` columns to `system.text_log`. These columns contain values that were used to format the message. [#59619](https://github.com/ClickHouse/ClickHouse/pull/59619) ([Alexey Katsman](https://github.com/alexkats)).
* Add a setting `first_day_of_week` which affects the first day of the week considered by functions `toStartOfInterval(..., INTERVAL ... WEEK)`. This allows for consistency with function `toStartOfWeek` which defaults to Sunday as the first day of the week. [#60598](https://github.com/ClickHouse/ClickHouse/pull/60598) ([Jordi Villar](https://github.com/jrdi)).
* Added persistent virtual column `_block_offset` which stores the original row number in the block as assigned at insert time. Persistence of column `_block_offset` can be enabled by the setting `enable_block_offset_column`. Added virtual column `_part_data_version` which contains either the min block number or the mutation version of the part. Persistent virtual column `_block_number` is not considered experimental anymore. [#60676](https://github.com/ClickHouse/ClickHouse/pull/60676) ([Anton Popov](https://github.com/CurtizJ)).
* Less contention in filesystem cache (part 3): execute removal from filesystem without lock on space reservation attempt. [#61163](https://github.com/ClickHouse/ClickHouse/pull/61163) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Functions `date_diff` and `age` now calculate their result at nanosecond instead of microsecond precision. They now also offer `nanosecond` (or `nanoseconds` or `ns`) as a possible value for the `unit` parameter. [#61409](https://github.com/ClickHouse/ClickHouse/pull/61409) ([Austin Kothig](https://github.com/kothiga)).
* Now marks are not loaded for wide parts during merges. [#61551](https://github.com/ClickHouse/ClickHouse/pull/61551) ([Anton Popov](https://github.com/CurtizJ)).
* Reload certificate chain during certificate reload. [#61671](https://github.com/ClickHouse/ClickHouse/pull/61671) ([Pervakov Grigorii](https://github.com/GrigoryPervakov)).
* Speed up dynamic resize of filesystem cache. [#61723](https://github.com/ClickHouse/ClickHouse/pull/61723) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Add `TRUNCATE ALL TABLES`. [#61862](https://github.com/ClickHouse/ClickHouse/pull/61862) ([豪肥肥](https://github.com/HowePa)).
* Try to prevent [#60432](https://github.com/ClickHouse/ClickHouse/issues/60432) by not allowing a table to be attached if there is an active replica for that replica path. [#61876](https://github.com/ClickHouse/ClickHouse/pull/61876) ([Arthur Passos](https://github.com/arthurpassos)).
* Add a setting `input_format_json_throw_on_bad_escape_sequence`, disabling it allows saving bad escape sequences in JSON input formats. [#61889](https://github.com/ClickHouse/ClickHouse/pull/61889) ([Kruglov Pavel](https://github.com/Avogar)).
* Userspace page cache works with static web storage (`disk(type = web)`) now. Use client setting `use_page_cache_for_disks_without_file_cache=1` to enable. [#61911](https://github.com/ClickHouse/ClickHouse/pull/61911) ([Michael Kolupaev](https://github.com/al13n321)).
* Implement input() for clickhouse-local. [#61923](https://github.com/ClickHouse/ClickHouse/pull/61923) ([Azat Khuzhin](https://github.com/azat)).
* Fix logical-error when undoing quorum insert transaction. [#61953](https://github.com/ClickHouse/ClickHouse/pull/61953) ([Han Fei](https://github.com/hanfei1991)).
* StorageJoin with strictness `ANY` is consistent after reload. When several rows with the same key are inserted, the first one will have higher priority (before, it was chosen randomly upon table loading). close [#51027](https://github.com/ClickHouse/ClickHouse/issues/51027). [#61972](https://github.com/ClickHouse/ClickHouse/pull/61972) ([vdimir](https://github.com/vdimir)).
* Automatically infer Nullable column types from Apache Arrow schema. [#61984](https://github.com/ClickHouse/ClickHouse/pull/61984) ([Maksim Kita](https://github.com/kitaisreal)).
* Allow to cancel parallel merge of aggregate states during aggregation. Example: `uniqExact`. [#61992](https://github.com/ClickHouse/ClickHouse/pull/61992) ([Maksim Kita](https://github.com/kitaisreal)).
* Don't treat Bool and number variants as suspicious in Variant type. [#61999](https://github.com/ClickHouse/ClickHouse/pull/61999) ([Kruglov Pavel](https://github.com/Avogar)).
* Use `system.keywords` to fill in the suggestions and also use them in all places internally. [#62000](https://github.com/ClickHouse/ClickHouse/pull/62000) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* Implement better conversion from String to Variant using parsing. [#62005](https://github.com/ClickHouse/ClickHouse/pull/62005) ([Kruglov Pavel](https://github.com/Avogar)).
* Support Variant in JSONExtract functions. [#62014](https://github.com/ClickHouse/ClickHouse/pull/62014) ([Kruglov Pavel](https://github.com/Avogar)).
* Dictionary source with `INVALIDATE_QUERY` is not reloaded twice on startup. [#62050](https://github.com/ClickHouse/ClickHouse/pull/62050) ([vdimir](https://github.com/vdimir)).
* `OPTIMIZE FINAL` for `ReplicatedMergeTree` now will wait for currently active merges to finish and then reattempt to schedule a final merge. This will put it more in line with ordinary `MergeTree` behaviour. [#62067](https://github.com/ClickHouse/ClickHouse/pull/62067) ([Nikita Taranov](https://github.com/nickitat)).
* While reading data from a Hive text file, the first line was used to determine the number of input fields, and sometimes that number does not match the Hive table definition. For example, if the table is defined with 3 columns, like `test_tbl(a Int32, b Int32, c Int32)`, but the first line of the text file has only 2 fields, the input fields were resized to 2; if the next line of the text file then has 3 fields, the third field could not be read and was set to a default value of 0, which is not right. [#62086](https://github.com/ClickHouse/ClickHouse/pull/62086) ([KevinyhZou](https://github.com/KevinyhZou)).
* CREATE AS copies the comment. [#62117](https://github.com/ClickHouse/ClickHouse/pull/62117) ([Pablo Marcos](https://github.com/pamarcos)).
* The syntax highlighting while typing in the client will work on the syntax level (previously, it worked on the lexer level). [#62123](https://github.com/ClickHouse/ClickHouse/pull/62123) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Fix an issue where, when a redundant `= 1` or `= 0` is added after a boolean expression involving the primary key, the primary index is not used. For example, both `SELECT * FROM <table> WHERE <primary-key> IN (<value>) = 1` and `SELECT * FROM <table> WHERE <primary-key> NOT IN (<value>) = 0` perform a full table scan, even though the primary index could be used. [#62142](https://github.com/ClickHouse/ClickHouse/pull/62142) ([josh-hildred](https://github.com/josh-hildred)).
* Add query progress to table zookeeper. [#62152](https://github.com/ClickHouse/ClickHouse/pull/62152) ([JackyWoo](https://github.com/JackyWoo)).
* Add ability to turn on trace collector (Real and CPU) server-wide. [#62189](https://github.com/ClickHouse/ClickHouse/pull/62189) ([alesapin](https://github.com/alesapin)).
* Added setting `lightweight_deletes_sync` (default value: 2 - wait for all replicas synchronously). It is similar to setting `mutations_sync` but affects only the behaviour of lightweight deletes. [#62195](https://github.com/ClickHouse/ClickHouse/pull/62195) ([Anton Popov](https://github.com/CurtizJ)).
* Distinguish booleans and integers while parsing values for custom settings: ``` SET custom_a = true; SET custom_b = 1; ```. [#62206](https://github.com/ClickHouse/ClickHouse/pull/62206) ([Vitaly Baranov](https://github.com/vitlibar)).
* Support S3 access through AWS Private Link Interface endpoints. Closes [#60021](https://github.com/ClickHouse/ClickHouse/issues/60021), [#31074](https://github.com/ClickHouse/ClickHouse/issues/31074) and [#53761](https://github.com/ClickHouse/ClickHouse/issues/53761). [#62208](https://github.com/ClickHouse/ClickHouse/pull/62208) ([Arthur Passos](https://github.com/arthurpassos)).
* Client has to send header 'Keep-Alive: timeout=X' to the server. If a client receives a response from the server with that header, client has to use the value from the server. Also for a client it is better not to use a connection which is nearly expired in order to avoid connection close race. [#62249](https://github.com/ClickHouse/ClickHouse/pull/62249) ([Sema Checherinda](https://github.com/CheSema)).
* Added nanosecond, microsecond and millisecond units for `date_trunc`. [#62335](https://github.com/ClickHouse/ClickHouse/pull/62335) ([Misz606](https://github.com/Misz606)).
* Do not create a directory for UDF in clickhouse-client if it does not exist. This closes [#59597](https://github.com/ClickHouse/ClickHouse/issues/59597). [#62366](https://github.com/ClickHouse/ClickHouse/pull/62366) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* The query cache now no longer caches results of queries against system tables (`system.*`, `information_schema.*`, `INFORMATION_SCHEMA.*`). [#62376](https://github.com/ClickHouse/ClickHouse/pull/62376) ([Robert Schulze](https://github.com/rschu1ze)).
* `MOVE PARTITION TO TABLE` query can be delayed or can throw `TOO_MANY_PARTS` exception to avoid exceeding limits on the part count. The same settings and limits are applied as for the`INSERT` query (see `max_parts_in_total`, `parts_to_delay_insert`, `parts_to_throw_insert`, `inactive_parts_to_throw_insert`, `inactive_parts_to_delay_insert`, `max_avg_part_size_for_too_many_parts`, `min_delay_to_insert_ms` and `max_delay_to_insert` settings). [#62420](https://github.com/ClickHouse/ClickHouse/pull/62420) ([Sergei Trifonov](https://github.com/serxa)).
* Added the missing `hostname` column to system table `blob_storage_log`. [#62456](https://github.com/ClickHouse/ClickHouse/pull/62456) ([Jayme Bird](https://github.com/jaymebrd)).
* Changed the default installation directory on macOS from `/usr/bin` to `/usr/local/bin`. This is necessary because Apple's System Integrity Protection introduced with macOS El Capitan (2015) prevents writing into `/usr/bin`, even with `sudo`. [#62489](https://github.com/ClickHouse/ClickHouse/pull/62489) ([haohang](https://github.com/yokofly)).
* Make transform always return the first match. [#62518](https://github.com/ClickHouse/ClickHouse/pull/62518) ([Raúl Marín](https://github.com/Algunenano)).
* For consistency with other system tables, `system.backup_log` now has a column `event_time`. [#62541](https://github.com/ClickHouse/ClickHouse/pull/62541) ([Jayme Bird](https://github.com/jaymebrd)).
* Avoid evaluating table DEFAULT expressions while executing `RESTORE`. [#62601](https://github.com/ClickHouse/ClickHouse/pull/62601) ([Vitaly Baranov](https://github.com/vitlibar)).
* Return stream of chunks from `system.remote_data_paths` instead of accumulating the whole result in one big chunk. This allows to consume less memory, show intermediate progress and cancel the query. [#62613](https://github.com/ClickHouse/ClickHouse/pull/62613) ([Alexander Gololobov](https://github.com/davenger)).
* S3 storage and backups also need the same default keep alive settings as s3 disk. [#62648](https://github.com/ClickHouse/ClickHouse/pull/62648) ([Sema Checherinda](https://github.com/CheSema)).
* Table `system.backup_log` now has the "default" sorting key which is `event_date, event_time`, the same as for other `_log` table engines. [#62667](https://github.com/ClickHouse/ClickHouse/pull/62667) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* Mark type Variant as comparable so it can be used in primary key. [#62693](https://github.com/ClickHouse/ClickHouse/pull/62693) ([Kruglov Pavel](https://github.com/Avogar)).
* Add librdkafka's client identifier to log messages to be able to differentiate log messages from different consumers of a single table. [#62813](https://github.com/ClickHouse/ClickHouse/pull/62813) ([János Benjamin Antal](https://github.com/antaljanosbenjamin)).
* Allow special macros {uuid} and {database} in a Replicated database ZooKeeper path. [#62818](https://github.com/ClickHouse/ClickHouse/pull/62818) ([Vitaly Baranov](https://github.com/vitlibar)).
* Allow quota key with different auth scheme in HTTP requests. [#62842](https://github.com/ClickHouse/ClickHouse/pull/62842) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Remove experimental tag from Replicated database engine. Now it is in Beta stage. [#62937](https://github.com/ClickHouse/ClickHouse/pull/62937) ([Justin de Guzman](https://github.com/justindeguzman)).
* Reduce the verbosity of command line argument `--help` in `clickhouse client` and `clickhouse local`. The previous output is now generated by `--help --verbose`. [#62973](https://github.com/ClickHouse/ClickHouse/pull/62973) ([Yarik Briukhovetskyi](https://github.com/yariks5s)).
* Close session if [user's `valid_until`](https://clickhouse.com/docs/en/sql-reference/statements/create/user#valid-until-clause) is reached. [#63046](https://github.com/ClickHouse/ClickHouse/pull/63046) ([Konstantin Bogdanov](https://github.com/thevar1able)).
* `log_bin_use_v1_row_events` was removed in MySQL 8.3, fix [#60479](https://github.com/ClickHouse/ClickHouse/issues/60479). [#63101](https://github.com/ClickHouse/ClickHouse/pull/63101) ([Eugene Klimov](https://github.com/Slach)).
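Small sketches for three of the improvements above: `first_day_of_week`, sub-second units for `date_trunc`, and `transform` returning the first match. The `'Sunday'` value is an assumption about the setting's accepted values:

```sql
-- Make week-based intervals start on Sunday, consistent with toStartOfWeek.
SET first_day_of_week = 'Sunday';
SELECT toStartOfInterval(now(), INTERVAL 1 WEEK);

-- date_trunc now accepts nanosecond/microsecond/millisecond units.
SELECT date_trunc('millisecond', now64(9));

-- With duplicated keys, transform returns the first match.
SELECT transform(1, [1, 1], ['first', 'second'], 'other');  -- 'first'
```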
#### Build/Testing/Packaging Improvement
* Ignore DROP queries in stress test with 1/2 probability, use TRUNCATE instead of ignoring DROP in upgrade check for Memory/JOIN tables. [#61476](https://github.com/ClickHouse/ClickHouse/pull/61476) ([Kruglov Pavel](https://github.com/Avogar)).
* Remove from the Keeper Docker image the volumes at /etc/clickhouse-keeper and /var/log/clickhouse-keeper. [#61683](https://github.com/ClickHouse/ClickHouse/pull/61683) ([Tristan](https://github.com/Tristan971)).
* Timeout was updated in https://github.com/ClickHouse/ClickHouse/pull/45765, but exception message was not. [#62139](https://github.com/ClickHouse/ClickHouse/pull/62139) ([Arthur Passos](https://github.com/arthurpassos)).
* Add tests for all issues which are no longer relevant with Analyzer being enabled by default. Closes: [#55794](https://github.com/ClickHouse/ClickHouse/issues/55794) Closes: [#49472](https://github.com/ClickHouse/ClickHouse/issues/49472) Closes: [#44414](https://github.com/ClickHouse/ClickHouse/issues/44414) Closes: [#13843](https://github.com/ClickHouse/ClickHouse/issues/13843) Closes: [#55803](https://github.com/ClickHouse/ClickHouse/issues/55803) Closes: [#48308](https://github.com/ClickHouse/ClickHouse/issues/48308) Closes: [#45535](https://github.com/ClickHouse/ClickHouse/issues/45535) Closes: [#44365](https://github.com/ClickHouse/ClickHouse/issues/44365) Closes: [#44153](https://github.com/ClickHouse/ClickHouse/issues/44153) Closes: [#42399](https://github.com/ClickHouse/ClickHouse/issues/42399) Closes: [#27115](https://github.com/ClickHouse/ClickHouse/issues/27115) Closes: [#23162](https://github.com/ClickHouse/ClickHouse/issues/23162) Closes: [#15395](https://github.com/ClickHouse/ClickHouse/issues/15395) Closes: [#15411](https://github.com/ClickHouse/ClickHouse/issues/15411) Closes: [#14978](https://github.com/ClickHouse/ClickHouse/issues/14978) Closes: [#17319](https://github.com/ClickHouse/ClickHouse/issues/17319) Closes: [#11813](https://github.com/ClickHouse/ClickHouse/issues/11813) Closes: [#13210](https://github.com/ClickHouse/ClickHouse/issues/13210) Closes: [#23053](https://github.com/ClickHouse/ClickHouse/issues/23053) Closes: [#37729](https://github.com/ClickHouse/ClickHouse/issues/37729) Closes: [#32639](https://github.com/ClickHouse/ClickHouse/issues/32639) Closes: [#9954](https://github.com/ClickHouse/ClickHouse/issues/9954) Closes: [#41964](https://github.com/ClickHouse/ClickHouse/issues/41964) Closes: [#54317](https://github.com/ClickHouse/ClickHouse/issues/54317) Closes: [#7520](https://github.com/ClickHouse/ClickHouse/issues/7520) Closes: [#36973](https://github.com/ClickHouse/ClickHouse/issues/36973) Closes: [#40955](https://github.com/ClickHouse/ClickHouse/issues/40955) Closes: [#19687](https://github.com/ClickHouse/ClickHouse/issues/19687) Closes: [#23104](https://github.com/ClickHouse/ClickHouse/issues/23104) Closes: [#21584](https://github.com/ClickHouse/ClickHouse/issues/21584) Closes: [#23344](https://github.com/ClickHouse/ClickHouse/issues/23344) Closes: [#22627](https://github.com/ClickHouse/ClickHouse/issues/22627) Closes: [#10276](https://github.com/ClickHouse/ClickHouse/issues/10276) Closes: [#19687](https://github.com/ClickHouse/ClickHouse/issues/19687) Closes: [#4567](https://github.com/ClickHouse/ClickHouse/issues/4567) Closes: [#17710](https://github.com/ClickHouse/ClickHouse/issues/17710) Closes: [#11068](https://github.com/ClickHouse/ClickHouse/issues/11068) Closes: [#24395](https://github.com/ClickHouse/ClickHouse/issues/24395) Closes: [#23416](https://github.com/ClickHouse/ClickHouse/issues/23416) Closes: [#23162](https://github.com/ClickHouse/ClickHouse/issues/23162) Closes: [#25655](https://github.com/ClickHouse/ClickHouse/issues/25655) Closes: [#11757](https://github.com/ClickHouse/ClickHouse/issues/11757) Closes: [#6571](https://github.com/ClickHouse/ClickHouse/issues/6571) Closes: [#4432](https://github.com/ClickHouse/ClickHouse/issues/4432) Closes: [#8259](https://github.com/ClickHouse/ClickHouse/issues/8259) Closes: [#9233](https://github.com/ClickHouse/ClickHouse/issues/9233) Closes: [#14699](https://github.com/ClickHouse/ClickHouse/issues/14699) Closes: [#27068](https://github.com/ClickHouse/ClickHouse/issues/27068) Closes: 
[#28687](https://github.com/ClickHouse/ClickHouse/issues/28687) Closes: [#28777](https://github.com/ClickHouse/ClickHouse/issues/28777) Closes: [#29734](https://github.com/ClickHouse/ClickHouse/issues/29734) Closes: [#61238](https://github.com/ClickHouse/ClickHouse/issues/61238) Closes: [#33825](https://github.com/ClickHouse/ClickHouse/issues/33825) Closes: [#35608](https://github.com/ClickHouse/ClickHouse/issues/35608) Closes: [#29838](https://github.com/ClickHouse/ClickHouse/issues/29838) Closes: [#35652](https://github.com/ClickHouse/ClickHouse/issues/35652) Closes: [#36189](https://github.com/ClickHouse/ClickHouse/issues/36189) Closes: [#39634](https://github.com/ClickHouse/ClickHouse/issues/39634) Closes: [#47432](https://github.com/ClickHouse/ClickHouse/issues/47432) Closes: [#54910](https://github.com/ClickHouse/ClickHouse/issues/54910) Closes: [#57321](https://github.com/ClickHouse/ClickHouse/issues/57321) Closes: [#59154](https://github.com/ClickHouse/ClickHouse/issues/59154) Closes: [#61014](https://github.com/ClickHouse/ClickHouse/issues/61014) Closes: [#61950](https://github.com/ClickHouse/ClickHouse/issues/61950) Closes: [#55647](https://github.com/ClickHouse/ClickHouse/issues/55647) Closes: [#61947](https://github.com/ClickHouse/ClickHouse/issues/61947). [#62185](https://github.com/ClickHouse/ClickHouse/pull/62185) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* Vendor in rust dependencies. [#62297](https://github.com/ClickHouse/ClickHouse/pull/62297) ([Raúl Marín](https://github.com/Algunenano)).
* Add more tests from issues which are no longer relevant or fixed by analyzer. Closes: [#58985](https://github.com/ClickHouse/ClickHouse/issues/58985) Closes: [#59549](https://github.com/ClickHouse/ClickHouse/issues/59549) Closes: [#36963](https://github.com/ClickHouse/ClickHouse/issues/36963) Closes: [#39453](https://github.com/ClickHouse/ClickHouse/issues/39453) Closes: [#56521](https://github.com/ClickHouse/ClickHouse/issues/56521) Closes: [#47552](https://github.com/ClickHouse/ClickHouse/issues/47552) Closes: [#56503](https://github.com/ClickHouse/ClickHouse/issues/56503) Closes: [#59101](https://github.com/ClickHouse/ClickHouse/issues/59101) Closes: [#50271](https://github.com/ClickHouse/ClickHouse/issues/50271) Closes: [#54954](https://github.com/ClickHouse/ClickHouse/issues/54954) Closes: [#56466](https://github.com/ClickHouse/ClickHouse/issues/56466) Closes: [#11000](https://github.com/ClickHouse/ClickHouse/issues/11000) Closes: [#10894](https://github.com/ClickHouse/ClickHouse/issues/10894) Closes: https://github.com/ClickHouse/ClickHouse/issues/448 Closes: [#8030](https://github.com/ClickHouse/ClickHouse/issues/8030) Closes: [#32139](https://github.com/ClickHouse/ClickHouse/issues/32139) Closes: [#47288](https://github.com/ClickHouse/ClickHouse/issues/47288) Closes: [#50705](https://github.com/ClickHouse/ClickHouse/issues/50705) Closes: [#54511](https://github.com/ClickHouse/ClickHouse/issues/54511) Closes: [#55466](https://github.com/ClickHouse/ClickHouse/issues/55466) Closes: [#58500](https://github.com/ClickHouse/ClickHouse/issues/58500) Closes: [#39923](https://github.com/ClickHouse/ClickHouse/issues/39923) Closes: [#39855](https://github.com/ClickHouse/ClickHouse/issues/39855) Closes: [#4596](https://github.com/ClickHouse/ClickHouse/issues/4596) Closes: [#47422](https://github.com/ClickHouse/ClickHouse/issues/47422) Closes: [#33000](https://github.com/ClickHouse/ClickHouse/issues/33000) Closes: [#14739](https://github.com/ClickHouse/ClickHouse/issues/14739) Closes: [#44039](https://github.com/ClickHouse/ClickHouse/issues/44039) Closes: [#8547](https://github.com/ClickHouse/ClickHouse/issues/8547) Closes: [#22923](https://github.com/ClickHouse/ClickHouse/issues/22923) Closes: [#23865](https://github.com/ClickHouse/ClickHouse/issues/23865) Closes: [#29748](https://github.com/ClickHouse/ClickHouse/issues/29748) Closes: [#4222](https://github.com/ClickHouse/ClickHouse/issues/4222). [#62457](https://github.com/ClickHouse/ClickHouse/pull/62457) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* Fixed build errors when OpenSSL is linked dynamically (note: this is generally unsupported and only required for s390x platforms). [#62888](https://github.com/ClickHouse/ClickHouse/pull/62888) ([Harry Lee](https://github.com/HarryLeeIBM)).
#### Bug Fix (user-visible misbehavior in an official stable release)
* Fix parser error when using COUNT(*) with FILTER clause [#61357](https://github.com/ClickHouse/ClickHouse/pull/61357) ([Duc Canh Le](https://github.com/canhld94)).
* Fix logical error in group_by_use_nulls + grouping set + analyzer + materialize/constant [#61567](https://github.com/ClickHouse/ClickHouse/pull/61567) ([Kruglov Pavel](https://github.com/Avogar)).
* Cancel merges before removing moved parts [#61610](https://github.com/ClickHouse/ClickHouse/pull/61610) ([János Benjamin Antal](https://github.com/antaljanosbenjamin)).
* Try to fix abort in arrow [#61720](https://github.com/ClickHouse/ClickHouse/pull/61720) ([Kruglov Pavel](https://github.com/Avogar)).
* Search for convert_to_replicated flag at the correct path [#61769](https://github.com/ClickHouse/ClickHouse/pull/61769) ([Kirill](https://github.com/kirillgarbar)).
* Fix possible connections data-race for distributed_foreground_insert/distributed_background_insert_batch [#61867](https://github.com/ClickHouse/ClickHouse/pull/61867) ([Azat Khuzhin](https://github.com/azat)).
* Mark CANNOT_PARSE_ESCAPE_SEQUENCE error as parse error to be able to skip it in row input formats [#61883](https://github.com/ClickHouse/ClickHouse/pull/61883) ([Kruglov Pavel](https://github.com/Avogar)).
* Fix writing exception message in output format in HTTP when http_wait_end_of_query is used [#61951](https://github.com/ClickHouse/ClickHouse/pull/61951) ([Kruglov Pavel](https://github.com/Avogar)).
* Proper fix for LowCardinality together with JSONExtact functions [#61957](https://github.com/ClickHouse/ClickHouse/pull/61957) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* Fix crash in Engine Merge if Row Policy does not have expression [#61971](https://github.com/ClickHouse/ClickHouse/pull/61971) ([Ilya Golshtein](https://github.com/ilejn)).
* Fix WriteBufferAzureBlobStorage destructor uncaught exception [#61988](https://github.com/ClickHouse/ClickHouse/pull/61988) ([SmitaRKulkarni](https://github.com/SmitaRKulkarni)).
* Fix CREATE TABLE w/o columns definition for ReplicatedMergeTree [#62040](https://github.com/ClickHouse/ClickHouse/pull/62040) ([Azat Khuzhin](https://github.com/azat)).
* Fix optimize_skip_unused_shards_rewrite_in for composite sharding key [#62047](https://github.com/ClickHouse/ClickHouse/pull/62047) ([Azat Khuzhin](https://github.com/azat)).
* ReadWriteBufferFromHTTP set right header host when redirected [#62068](https://github.com/ClickHouse/ClickHouse/pull/62068) ([Sema Checherinda](https://github.com/CheSema)).
* Fix external table cannot parse data type Bool [#62115](https://github.com/ClickHouse/ClickHouse/pull/62115) ([Duc Canh Le](https://github.com/canhld94)).
* Revert "Merge pull request [#61564](https://github.com/ClickHouse/ClickHouse/issues/61564) from liuneng1994/optimize_in_single_value" [#62135](https://github.com/ClickHouse/ClickHouse/pull/62135) ([Raúl Marín](https://github.com/Algunenano)).
* Add test for [#35215](https://github.com/ClickHouse/ClickHouse/issues/35215) [#62180](https://github.com/ClickHouse/ClickHouse/pull/62180) ([Raúl Marín](https://github.com/Algunenano)).
* Analyzer: Fix query parameter resolution [#62186](https://github.com/ClickHouse/ClickHouse/pull/62186) ([Dmitry Novik](https://github.com/novikd)).
* Fix restoring parts while readonly [#62207](https://github.com/ClickHouse/ClickHouse/pull/62207) ([Vitaly Baranov](https://github.com/vitlibar)).
* Fix crash in index definition containing sql udf [#62225](https://github.com/ClickHouse/ClickHouse/pull/62225) ([vdimir](https://github.com/vdimir)).
* Fixing NULL random seed for generateRandom with analyzer. [#62248](https://github.com/ClickHouse/ClickHouse/pull/62248) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Correctly handle const columns in DistinctTransfom [#62250](https://github.com/ClickHouse/ClickHouse/pull/62250) ([Antonio Andelic](https://github.com/antonio2368)).
* Fix PartsSplitter [#62268](https://github.com/ClickHouse/ClickHouse/pull/62268) ([Nikita Taranov](https://github.com/nickitat)).
* Analyzer: Fix alias to parametrized view resolution [#62274](https://github.com/ClickHouse/ClickHouse/pull/62274) ([Dmitry Novik](https://github.com/novikd)).
* Analyzer: Fix name resolution from parent scopes [#62281](https://github.com/ClickHouse/ClickHouse/pull/62281) ([Dmitry Novik](https://github.com/novikd)).
* Fix argMax with nullable non native numeric column [#62285](https://github.com/ClickHouse/ClickHouse/pull/62285) ([Raúl Marín](https://github.com/Algunenano)).
* Fix BACKUP and RESTORE of a materialized view in Ordinary database [#62295](https://github.com/ClickHouse/ClickHouse/pull/62295) ([Vitaly Baranov](https://github.com/vitlibar)).
* Fix data race on scalars in Context [#62305](https://github.com/ClickHouse/ClickHouse/pull/62305) ([Kruglov Pavel](https://github.com/Avogar)).
* Fix primary key in materialized view [#62319](https://github.com/ClickHouse/ClickHouse/pull/62319) ([Murat Khairulin](https://github.com/mxwell)).
* Do not build multithread insert pipeline for tables without support [#62333](https://github.com/ClickHouse/ClickHouse/pull/62333) ([vdimir](https://github.com/vdimir)).
* Fix analyzer with positional arguments in distributed query [#62362](https://github.com/ClickHouse/ClickHouse/pull/62362) ([flynn](https://github.com/ucasfl)).
* Fix filter pushdown from additional_table_filters in Merge engine in analyzer [#62398](https://github.com/ClickHouse/ClickHouse/pull/62398) ([Kruglov Pavel](https://github.com/Avogar)).
* Fix GLOBAL IN table queries with analyzer. [#62409](https://github.com/ClickHouse/ClickHouse/pull/62409) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Respect settings truncate_on_insert/create_new_file_on_insert in s3/hdfs/azure engines during partitioned write [#62425](https://github.com/ClickHouse/ClickHouse/pull/62425) ([Kruglov Pavel](https://github.com/Avogar)).
* Fix backup restore path for AzureBlobStorage [#62447](https://github.com/ClickHouse/ClickHouse/pull/62447) ([SmitaRKulkarni](https://github.com/SmitaRKulkarni)).
* Fix SimpleSquashingChunksTransform [#62451](https://github.com/ClickHouse/ClickHouse/pull/62451) ([Nikita Taranov](https://github.com/nickitat)).
* Fix capture of nested lambda. [#62462](https://github.com/ClickHouse/ClickHouse/pull/62462) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix validation of special MergeTree columns [#62498](https://github.com/ClickHouse/ClickHouse/pull/62498) ([János Benjamin Antal](https://github.com/antaljanosbenjamin)).
* Avoid crash when reading protobuf with recursive types [#62506](https://github.com/ClickHouse/ClickHouse/pull/62506) ([Raúl Marín](https://github.com/Algunenano)).
* Fix a bug moving one partition from one to itself [#62524](https://github.com/ClickHouse/ClickHouse/pull/62524) ([helifu](https://github.com/helifu)).
* Fix scalar subquery in LIMIT [#62567](https://github.com/ClickHouse/ClickHouse/pull/62567) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Try to fix segfault in Hive engine [#62578](https://github.com/ClickHouse/ClickHouse/pull/62578) ([Nikolay Degterinsky](https://github.com/evillique)).
* Fix memory leak in groupArraySorted [#62597](https://github.com/ClickHouse/ClickHouse/pull/62597) ([Antonio Andelic](https://github.com/antonio2368)).
* Fix crash in largestTriangleThreeBuckets [#62646](https://github.com/ClickHouse/ClickHouse/pull/62646) ([Raúl Marín](https://github.com/Algunenano)).
* Fix tumble[Start,End] and hop[Start,End] for bigger resolutions [#62705](https://github.com/ClickHouse/ClickHouse/pull/62705) ([Jordi Villar](https://github.com/jrdi)).
* Fix argMin/argMax combinator state [#62708](https://github.com/ClickHouse/ClickHouse/pull/62708) ([Raúl Marín](https://github.com/Algunenano)).
* Fix temporary data in cache failing because of cache lock contention optimization [#62715](https://github.com/ClickHouse/ClickHouse/pull/62715) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Fix crash in function `mergeTreeIndex` [#62762](https://github.com/ClickHouse/ClickHouse/pull/62762) ([Anton Popov](https://github.com/CurtizJ)).
* Fix size checks when updating nested materialized columns [#62773](https://github.com/ClickHouse/ClickHouse/pull/62773) ([Eliot Hautefeuille](https://github.com/hileef)).
* Fix FINAL modifier is not respected in CTE with analyzer [#62811](https://github.com/ClickHouse/ClickHouse/pull/62811) ([Duc Canh Le](https://github.com/canhld94)).
* Fix crash in function `formatRow` with `JSON` format and HTTP interface [#62840](https://github.com/ClickHouse/ClickHouse/pull/62840) ([Anton Popov](https://github.com/CurtizJ)).
* Azure: fix building final url from endpoint object [#62850](https://github.com/ClickHouse/ClickHouse/pull/62850) ([Daniel Pozo Escalona](https://github.com/danipozo)).
* Fix GCD codec [#62853](https://github.com/ClickHouse/ClickHouse/pull/62853) ([Nikita Taranov](https://github.com/nickitat)).
* Fix LowCardinality(Nullable) key in hyperrectangle [#62866](https://github.com/ClickHouse/ClickHouse/pull/62866) ([Amos Bird](https://github.com/amosbird)).
* Fix fromUnixTimestamp in Joda syntax when the input value is beyond UInt32 [#62901](https://github.com/ClickHouse/ClickHouse/pull/62901) ([KevinyhZou](https://github.com/KevinyhZou)).
* Disable optimize_rewrite_aggregate_function_with_if for sum(nullable) [#62912](https://github.com/ClickHouse/ClickHouse/pull/62912) ([Raúl Marín](https://github.com/Algunenano)).
* Fix PREWHERE for StorageBuffer with different source table column types. [#62916](https://github.com/ClickHouse/ClickHouse/pull/62916) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix temporary data in cache incorrectly processing failure of cache key directory creation [#62925](https://github.com/ClickHouse/ClickHouse/pull/62925) ([Kseniia Sumarokova](https://github.com/kssenii)).
* gRPC: fix crash on IPv6 peer connection [#62978](https://github.com/ClickHouse/ClickHouse/pull/62978) ([Konstantin Bogdanov](https://github.com/thevar1able)).
* Fix possible CHECKSUM_DOESNT_MATCH (and others) during replicated fetches [#62987](https://github.com/ClickHouse/ClickHouse/pull/62987) ([Azat Khuzhin](https://github.com/azat)).
* Fix terminate with uncaught exception in temporary data in cache [#62998](https://github.com/ClickHouse/ClickHouse/pull/62998) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Fix optimize_rewrite_aggregate_function_with_if implicit cast [#62999](https://github.com/ClickHouse/ClickHouse/pull/62999) ([Raúl Marín](https://github.com/Algunenano)).
* Fix unhandled exception in ~RestorerFromBackup [#63040](https://github.com/ClickHouse/ClickHouse/pull/63040) ([Vitaly Baranov](https://github.com/vitlibar)).
* Do not remove server constants from GROUP BY key for secondary query. [#63047](https://github.com/ClickHouse/ClickHouse/pull/63047) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix incorrect judgement of monotonicity of function abs [#63097](https://github.com/ClickHouse/ClickHouse/pull/63097) ([Duc Canh Le](https://github.com/canhld94)).
* Make sanity check of settings worse [#63119](https://github.com/ClickHouse/ClickHouse/pull/63119) ([Raúl Marín](https://github.com/Algunenano)).
* Set server name for SSL handshake in MongoDB engine [#63122](https://github.com/ClickHouse/ClickHouse/pull/63122) ([Alexander Gololobov](https://github.com/davenger)).
* Use user specified db instead of "config" for MongoDB wire protocol version check [#63126](https://github.com/ClickHouse/ClickHouse/pull/63126) ([Alexander Gololobov](https://github.com/davenger)).
* Format SQL security option only in `CREATE VIEW` queries. [#63136](https://github.com/ClickHouse/ClickHouse/pull/63136) ([pufit](https://github.com/pufit)).
#### CI Fix or Improvement (changelog entry is not required)
* ... [#62044](https://github.com/ClickHouse/ClickHouse/pull/62044) ([Max K.](https://github.com/maxknv)).
* We won't fail the job when GH fails to retrieve the job ID and URLs. [#62651](https://github.com/ClickHouse/ClickHouse/pull/62651) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Decouple some work from https://github.com/ClickHouse/ClickHouse/pull/61464 to simplify sync. [#62739](https://github.com/ClickHouse/ClickHouse/pull/62739) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Add `isort` config for the first-party imports; fail build reports on non-success statuses. [#62786](https://github.com/ClickHouse/ClickHouse/pull/62786) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Move all Labels around to have it in a single place. [#62919](https://github.com/ClickHouse/ClickHouse/pull/62919) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* ... [#63035](https://github.com/ClickHouse/ClickHouse/pull/63035) ([Aleksei Filatov](https://github.com/aalexfvk)).
* ... [#63108](https://github.com/ClickHouse/ClickHouse/pull/63108) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
#### NO CL ENTRY
* NO CL ENTRY: 'Revert "Revert "Updated format settings references in the docs (datetime.md)""'. [#61442](https://github.com/ClickHouse/ClickHouse/pull/61442) ([Kruglov Pavel](https://github.com/Avogar)).
* NO CL ENTRY: 'Write `binary version -> commit hash` mapping to CI database (in private)'. [#61544](https://github.com/ClickHouse/ClickHouse/pull/61544) ([Nikita Taranov](https://github.com/nickitat)).
* NO CL ENTRY: 'Fix flaky tests 2 (stateless, integration) '. [#61869](https://github.com/ClickHouse/ClickHouse/pull/61869) ([Nikita Fomichev](https://github.com/fm4v)).
* NO CL ENTRY: 'Fix PR [#60656](https://github.com/ClickHouse/ClickHouse/issues/60656) for install check tests'. [#61910](https://github.com/ClickHouse/ClickHouse/pull/61910) ([Chun-Sheng, Li](https://github.com/peter279k)).
* NO CL ENTRY: '00002_log_and_exception_messages_formatting: exclude one more format string'. [#62190](https://github.com/ClickHouse/ClickHouse/pull/62190) ([Konstantin Bogdanov](https://github.com/thevar1able)).
* NO CL ENTRY: 'Revert "Resubmit 'Update invalidate_query_response on dictionary startup'"'. [#62230](https://github.com/ClickHouse/ClickHouse/pull/62230) ([Raúl Marín](https://github.com/Algunenano)).
* NO CL ENTRY: 'Fix contributor name vulnerability'. [#62357](https://github.com/ClickHouse/ClickHouse/pull/62357) ([Anita Hammer](https://github.com/anitahammer)).
* NO CL ENTRY: 'Revert "Rich syntax highlighting in the client"'. [#62508](https://github.com/ClickHouse/ClickHouse/pull/62508) ([Raúl Marín](https://github.com/Algunenano)).
* NO CL ENTRY: 'Revert "Revert "Rich syntax highlighting in the client""'. [#62512](https://github.com/ClickHouse/ClickHouse/pull/62512) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* NO CL ENTRY: 'Revert "[feature]: allow to attach parts from a different disk"'. [#62549](https://github.com/ClickHouse/ClickHouse/pull/62549) ([Alexander Tokmakov](https://github.com/tavplubix)).
* NO CL ENTRY: 'Revert "More optimal loading of marks"'. [#62577](https://github.com/ClickHouse/ClickHouse/pull/62577) ([Nikita Taranov](https://github.com/nickitat)).
* NO CL ENTRY: 'Revert "Speed up `splitByRegexp`"'. [#62692](https://github.com/ClickHouse/ClickHouse/pull/62692) ([Robert Schulze](https://github.com/rschu1ze)).
* NO CL ENTRY: 'Get rid of merge_commit in style check autofix'. [#62835](https://github.com/ClickHouse/ClickHouse/pull/62835) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* NO CL ENTRY: 'Revert "CI: add FT to MQ remove Style from master"'. [#62927](https://github.com/ClickHouse/ClickHouse/pull/62927) ([Max K.](https://github.com/maxknv)).
* NO CL ENTRY: 'Unflake 02813_func_now_and_alias'. [#62932](https://github.com/ClickHouse/ClickHouse/pull/62932) ([Robert Schulze](https://github.com/rschu1ze)).
* NO CL ENTRY: 'Revert "Enable custom parquet encoder by default"'. [#63153](https://github.com/ClickHouse/ClickHouse/pull/63153) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
#### NOT FOR CHANGELOG / INSIGNIFICANT
* Update protobuf to v25.1 [#58020](https://github.com/ClickHouse/ClickHouse/pull/58020) ([Mikhail Koviazin](https://github.com/mkmkme)).
* boringssl --> OpenSSL 3.2 [#59870](https://github.com/ClickHouse/ClickHouse/pull/59870) ([Robert Schulze](https://github.com/rschu1ze)).
* Enable all access control improvements by default (even without config.xml) [#60153](https://github.com/ClickHouse/ClickHouse/pull/60153) ([Azat Khuzhin](https://github.com/azat)).
* Change back how receive_timeout is handled for INSERTs [#60302](https://github.com/ClickHouse/ClickHouse/pull/60302) ([Azat Khuzhin](https://github.com/azat)).
* Context getGlobalTemporaryVolume use shared lock [#60997](https://github.com/ClickHouse/ClickHouse/pull/60997) ([Maksim Kita](https://github.com/kitaisreal)).
* Do nothing in `waitForOutdatedPartsToBeLoaded()` if loading is not required [#61232](https://github.com/ClickHouse/ClickHouse/pull/61232) ([Sergei Trifonov](https://github.com/serxa)).
* Fix db iterator wait during async metrics collection [#61534](https://github.com/ClickHouse/ClickHouse/pull/61534) ([Sergei Trifonov](https://github.com/serxa)).
* Fix 02943_rmt_alter_metadata_merge_checksum_mismatch flakiness [#61594](https://github.com/ClickHouse/ClickHouse/pull/61594) ([Azat Khuzhin](https://github.com/azat)).
* Stream rows when reading from system.replicas [#61784](https://github.com/ClickHouse/ClickHouse/pull/61784) ([Alexander Gololobov](https://github.com/davenger)).
* Skip more sanity checks for secondary create queries [#61799](https://github.com/ClickHouse/ClickHouse/pull/61799) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Fix 00002_log_and_exception_messages_formatting [#61882](https://github.com/ClickHouse/ClickHouse/pull/61882) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Add test for [#53352](https://github.com/ClickHouse/ClickHouse/issues/53352) [#61886](https://github.com/ClickHouse/ClickHouse/pull/61886) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* Test: tuple elimination with analyzer [#61887](https://github.com/ClickHouse/ClickHouse/pull/61887) ([Igor Nikonov](https://github.com/devcrafter)).
* Fix performance test `aggregating_merge_tree_simple_aggregate_function_string` [#61931](https://github.com/ClickHouse/ClickHouse/pull/61931) ([János Benjamin Antal](https://github.com/antaljanosbenjamin)).
* Fix some crashes with analyzer and group_by_use_nulls. [#61933](https://github.com/ClickHouse/ClickHouse/pull/61933) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix a race in refreshable view [#61936](https://github.com/ClickHouse/ClickHouse/pull/61936) ([Han Fei](https://github.com/hanfei1991)).
* Follow up to [#60452](https://github.com/ClickHouse/ClickHouse/issues/60452) [#61954](https://github.com/ClickHouse/ClickHouse/pull/61954) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Update 02916_move_partition_inactive_replica.sql [#61955](https://github.com/ClickHouse/ClickHouse/pull/61955) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Check for "SYSTEM STOP MERGES" primarily for MERGE_PARTS/MUTATE_PART [#61976](https://github.com/ClickHouse/ClickHouse/pull/61976) ([Azat Khuzhin](https://github.com/azat)).
* CI: failover for job_url request from gh [#61986](https://github.com/ClickHouse/ClickHouse/pull/61986) ([Max K.](https://github.com/maxknv)).
* CI: remove unnecessary job url for Mark release ready [#61991](https://github.com/ClickHouse/ClickHouse/pull/61991) ([Max K.](https://github.com/maxknv)).
* Update version after release [#61994](https://github.com/ClickHouse/ClickHouse/pull/61994) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Update version_date.tsv and changelogs after v24.3.1.2672-lts [#61996](https://github.com/ClickHouse/ClickHouse/pull/61996) ([robot-clickhouse](https://github.com/robot-clickhouse)).
* [RFC] Send LOGICAL_ERRORs to sentry [#61997](https://github.com/ClickHouse/ClickHouse/pull/61997) ([Azat Khuzhin](https://github.com/azat)).
* Fix scalars create as select [#61998](https://github.com/ClickHouse/ClickHouse/pull/61998) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix clickhouse-test [#62016](https://github.com/ClickHouse/ClickHouse/pull/62016) ([Nikolay Degterinsky](https://github.com/evillique)).
* Fix logs saving in DatabaseReplicated tests [#62019](https://github.com/ClickHouse/ClickHouse/pull/62019) ([Nikolay Degterinsky](https://github.com/evillique)).
* Fix npy big endianness [#62020](https://github.com/ClickHouse/ClickHouse/pull/62020) ([豪肥肥](https://github.com/HowePa)).
* Update analyzer_tech_debt.txt [#62035](https://github.com/ClickHouse/ClickHouse/pull/62035) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Add analyzer pattern to 00002_log_and_exception_messages_formatting [#62038](https://github.com/ClickHouse/ClickHouse/pull/62038) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix clickhouse-test in case of missing .reference file [#62041](https://github.com/ClickHouse/ClickHouse/pull/62041) ([Azat Khuzhin](https://github.com/azat)).
* Fix optimize_arithmetic_operations_in_aggregate_functions [#62046](https://github.com/ClickHouse/ClickHouse/pull/62046) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Update DatabaseOnDisk.cpp [#62049](https://github.com/ClickHouse/ClickHouse/pull/62049) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Ignore IfChainToMultiIfPass if returned type changed. [#62059](https://github.com/ClickHouse/ClickHouse/pull/62059) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Support more than 255 replicas in system table [#62064](https://github.com/ClickHouse/ClickHouse/pull/62064) ([Alexander Gololobov](https://github.com/davenger)).
* Fix stress tests for analyzer due to experimental WINDOW VIEW (by disabling it) [#62065](https://github.com/ClickHouse/ClickHouse/pull/62065) ([Azat Khuzhin](https://github.com/azat)).
* Fix type for ConvertInToEqualPass [#62066](https://github.com/ClickHouse/ClickHouse/pull/62066) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Remove Dead Code [#62082](https://github.com/ClickHouse/ClickHouse/pull/62082) ([jsc0218](https://github.com/jsc0218)).
* Revert output Pretty in tty [#62090](https://github.com/ClickHouse/ClickHouse/pull/62090) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* More than 255 replicas in ReplicatedTableStatus [#62127](https://github.com/ClickHouse/ClickHouse/pull/62127) ([Alexander Gololobov](https://github.com/davenger)).
* Fix upgrade check [#62136](https://github.com/ClickHouse/ClickHouse/pull/62136) ([Raúl Marín](https://github.com/Algunenano)).
* Fix 0320_long_values_pretty_are_not_cut_if_single [#62150](https://github.com/ClickHouse/ClickHouse/pull/62150) ([Duc Canh Le](https://github.com/canhld94)).
* Update NuRaft [#62156](https://github.com/ClickHouse/ClickHouse/pull/62156) ([Antonio Andelic](https://github.com/antonio2368)).
* Unify lightweight mutation control [#62159](https://github.com/ClickHouse/ClickHouse/pull/62159) ([Raúl Marín](https://github.com/Algunenano)).
* Add some logging [#62160](https://github.com/ClickHouse/ClickHouse/pull/62160) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Fix xml part in documentation [#62169](https://github.com/ClickHouse/ClickHouse/pull/62169) ([Yarik Briukhovetskyi](https://github.com/yariks5s)).
* Remove a few nested include dependencies [#62170](https://github.com/ClickHouse/ClickHouse/pull/62170) ([Raúl Marín](https://github.com/Algunenano)).
* User specific S3 endpoint for Backup/Restore on cluster [#62175](https://github.com/ClickHouse/ClickHouse/pull/62175) ([Antonio Andelic](https://github.com/antonio2368)).
* Bump `double-conversion` submodule [#62177](https://github.com/ClickHouse/ClickHouse/pull/62177) ([Robert Schulze](https://github.com/rschu1ze)).
* Fix `retention` docs [#62182](https://github.com/ClickHouse/ClickHouse/pull/62182) ([Yarik Briukhovetskyi](https://github.com/yariks5s)).
* Fix 02503_insert_storage_snapshot [#62194](https://github.com/ClickHouse/ClickHouse/pull/62194) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Use ClickHouse threads in NuRaft [#62196](https://github.com/ClickHouse/ClickHouse/pull/62196) ([alesapin](https://github.com/alesapin)).
* Unflake and speed up `01676_clickhouse_client_autocomplete` [#62209](https://github.com/ClickHouse/ClickHouse/pull/62209) ([Konstantin Bogdanov](https://github.com/thevar1able)).
* Fix build with clang-19 (master) [#62212](https://github.com/ClickHouse/ClickHouse/pull/62212) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Add more documentation to the release script [#62213](https://github.com/ClickHouse/ClickHouse/pull/62213) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Update version_date.tsv and changelogs after v24.3.2.23-lts [#62214](https://github.com/ClickHouse/ClickHouse/pull/62214) ([robot-clickhouse](https://github.com/robot-clickhouse)).
* Unlimited output_format_pretty_max_value_width for --pager [#62221](https://github.com/ClickHouse/ClickHouse/pull/62221) ([Azat Khuzhin](https://github.com/azat)).
* Include table name in paranoid checks [#62232](https://github.com/ClickHouse/ClickHouse/pull/62232) ([Raúl Marín](https://github.com/Algunenano)).
* Fix another logical error in group_by_use_nulls. [#62236](https://github.com/ClickHouse/ClickHouse/pull/62236) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Remove reverted PR from 24.3 changelog [#62251](https://github.com/ClickHouse/ClickHouse/pull/62251) ([Kruglov Pavel](https://github.com/Avogar)).
* Fix lambda(tuple(x), x + 1) syntax in analyzer [#62253](https://github.com/ClickHouse/ClickHouse/pull/62253) ([vdimir](https://github.com/vdimir)).
* Fix s3-style link mapper for gcs [#62257](https://github.com/ClickHouse/ClickHouse/pull/62257) ([Yarik Briukhovetskyi](https://github.com/yariks5s)).
* Disable 02980_dist_insert_readonly_replica for SMT [#62260](https://github.com/ClickHouse/ClickHouse/pull/62260) ([Igor Nikonov](https://github.com/devcrafter)).
* Fix logical error from fs cache in stress test [#62261](https://github.com/ClickHouse/ClickHouse/pull/62261) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Remove more nested includes [#62264](https://github.com/ClickHouse/ClickHouse/pull/62264) ([Raúl Marín](https://github.com/Algunenano)).
* Don't access static members through instance [#62265](https://github.com/ClickHouse/ClickHouse/pull/62265) ([Robert Schulze](https://github.com/rschu1ze)).
* Add fault injection for "Cannot allocate thread" [#62266](https://github.com/ClickHouse/ClickHouse/pull/62266) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Analyzer: limit maximal size of column in constant folding [#62273](https://github.com/ClickHouse/ClickHouse/pull/62273) ([vdimir](https://github.com/vdimir)).
* Fix __actionName, add tests for internal functions direct call [#62287](https://github.com/ClickHouse/ClickHouse/pull/62287) ([vdimir](https://github.com/vdimir)).
* Fix `mortonEncode` `use-of-uninitialized-value` [#62288](https://github.com/ClickHouse/ClickHouse/pull/62288) ([Antonio Andelic](https://github.com/antonio2368)).
* Add local address to network exception messages [#62300](https://github.com/ClickHouse/ClickHouse/pull/62300) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Better handling of errors from azure storage [#62306](https://github.com/ClickHouse/ClickHouse/pull/62306) ([Anton Popov](https://github.com/CurtizJ)).
* Cleanup SSH-based authentication code [#62307](https://github.com/ClickHouse/ClickHouse/pull/62307) ([Robert Schulze](https://github.com/rschu1ze)).
* Fix data race in LocalServer [#62309](https://github.com/ClickHouse/ClickHouse/pull/62309) ([Nikolay Degterinsky](https://github.com/evillique)).
* Fix for postprocess script: print correct count for frame [#62317](https://github.com/ClickHouse/ClickHouse/pull/62317) ([Antonio Andelic](https://github.com/antonio2368)).
* Use DETACHED_DIR_NAME everywhere [#62318](https://github.com/ClickHouse/ClickHouse/pull/62318) ([Azat Khuzhin](https://github.com/azat)).
* Fix small typo in Dictionary source loader [#62320](https://github.com/ClickHouse/ClickHouse/pull/62320) ([Sean Haynes](https://github.com/seandhaynes)).
* Fix optimize_uniq_to_count when only prefix of key is matched [#62325](https://github.com/ClickHouse/ClickHouse/pull/62325) ([vdimir](https://github.com/vdimir)).
* More complex locking in `StackTrace::toString` [#62332](https://github.com/ClickHouse/ClickHouse/pull/62332) ([Antonio Andelic](https://github.com/antonio2368)).
* Analyzer: Fix PREWHERE with lambda functions [#62336](https://github.com/ClickHouse/ClickHouse/pull/62336) ([vdimir](https://github.com/vdimir)).
* Reduce log levels for ReadWriteBufferFromHTTP retries [#62348](https://github.com/ClickHouse/ClickHouse/pull/62348) ([Alexander Gololobov](https://github.com/davenger)).
* dhparams are not enabled by default [#62365](https://github.com/ClickHouse/ClickHouse/pull/62365) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Disable window view with analyzer properly [#62367](https://github.com/ClickHouse/ClickHouse/pull/62367) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Don't access static members through instance, pt. II [#62375](https://github.com/ClickHouse/ClickHouse/pull/62375) ([Robert Schulze](https://github.com/rschu1ze)).
* Use function isNotDistinctFrom only in join key [#62387](https://github.com/ClickHouse/ClickHouse/pull/62387) ([vdimir](https://github.com/vdimir)).
* CI: fix for docs only pr [#62396](https://github.com/ClickHouse/ClickHouse/pull/62396) ([Max K.](https://github.com/maxknv)).
* Fix one phony case [#62397](https://github.com/ClickHouse/ClickHouse/pull/62397) ([Raúl Marín](https://github.com/Algunenano)).
* CI: test merge queue [#62403](https://github.com/ClickHouse/ClickHouse/pull/62403) ([Max K.](https://github.com/maxknv)).
* Add part name to check part exception message [#62408](https://github.com/ClickHouse/ClickHouse/pull/62408) ([Igor Nikonov](https://github.com/devcrafter)).
* CI: disable finish check for mq [#62410](https://github.com/ClickHouse/ClickHouse/pull/62410) ([Max K.](https://github.com/maxknv)).
* Fix logical error 'numbers_storage.step != UInt64{0}' [#62413](https://github.com/ClickHouse/ClickHouse/pull/62413) ([Kruglov Pavel](https://github.com/Avogar)).
* Don't check overflow in arrayDotProduct in undefined sanitizer [#62417](https://github.com/ClickHouse/ClickHouse/pull/62417) ([Kruglov Pavel](https://github.com/Avogar)).
* Avoid uncaught exception for onFault handler [#62418](https://github.com/ClickHouse/ClickHouse/pull/62418) ([Azat Khuzhin](https://github.com/azat)).
* Update StorageFileLog.cpp [#62421](https://github.com/ClickHouse/ClickHouse/pull/62421) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Support for a tiny feature in stateless tests image [#62427](https://github.com/ClickHouse/ClickHouse/pull/62427) ([Nikolay Degterinsky](https://github.com/evillique)).
* OptimizeGroupByInjectiveFunctionsPass remove unused constant [#62433](https://github.com/ClickHouse/ClickHouse/pull/62433) ([Maksim Kita](https://github.com/kitaisreal)).
* Perf script update path in documentation [#62439](https://github.com/ClickHouse/ClickHouse/pull/62439) ([Maksim Kita](https://github.com/kitaisreal)).
* Fix completion of available ClickHouse tools [#62446](https://github.com/ClickHouse/ClickHouse/pull/62446) ([Azat Khuzhin](https://github.com/azat)).
* Use shared mutex for global stacktrace cache [#62453](https://github.com/ClickHouse/ClickHouse/pull/62453) ([Sergei Trifonov](https://github.com/serxa)).
* Keeper logging fixes [#62455](https://github.com/ClickHouse/ClickHouse/pull/62455) ([Alexander Gololobov](https://github.com/davenger)).
* Add profile events for azure disk [#62458](https://github.com/ClickHouse/ClickHouse/pull/62458) ([Anton Popov](https://github.com/CurtizJ)).
* CI: gh runner version 2.315.0 [#62461](https://github.com/ClickHouse/ClickHouse/pull/62461) ([Max K.](https://github.com/maxknv)).
* Fix clang-tidy build [#62478](https://github.com/ClickHouse/ClickHouse/pull/62478) ([Robert Schulze](https://github.com/rschu1ze)).
* Fix random clang tidy warning [#62480](https://github.com/ClickHouse/ClickHouse/pull/62480) ([Raúl Marín](https://github.com/Algunenano)).
* Disable external sort in 01592_long_window_functions1 [#62487](https://github.com/ClickHouse/ClickHouse/pull/62487) ([Nikita Taranov](https://github.com/nickitat)).
* CI: merge sync pr on push to master [#62488](https://github.com/ClickHouse/ClickHouse/pull/62488) ([Max K.](https://github.com/maxknv)).
* Don't allow the fuzzer to change allow_experimental_analyzer [#62500](https://github.com/ClickHouse/ClickHouse/pull/62500) ([Raúl Marín](https://github.com/Algunenano)).
* Update comment in 02911_support_alias_column_in_indices.sql [#62503](https://github.com/ClickHouse/ClickHouse/pull/62503) ([Robert Schulze](https://github.com/rschu1ze)).
* Add test for [#26674](https://github.com/ClickHouse/ClickHouse/issues/26674) [#62504](https://github.com/ClickHouse/ClickHouse/pull/62504) ([Raúl Marín](https://github.com/Algunenano)).
* Add test for Bug 37909 [#62509](https://github.com/ClickHouse/ClickHouse/pull/62509) ([Robert Schulze](https://github.com/rschu1ze)).
* Add test for bug [#33446](https://github.com/ClickHouse/ClickHouse/issues/33446) [#62511](https://github.com/ClickHouse/ClickHouse/pull/62511) ([Robert Schulze](https://github.com/rschu1ze)).
* Fix upgrade test. Again [#62513](https://github.com/ClickHouse/ClickHouse/pull/62513) ([Raúl Marín](https://github.com/Algunenano)).
* Blind fix for a flaky test [#62516](https://github.com/ClickHouse/ClickHouse/pull/62516) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* Add asserts to COW example programs [#62543](https://github.com/ClickHouse/ClickHouse/pull/62543) ([Tomer Shafir](https://github.com/tomershafir)).
* CI: respect Sync status in the MQ [#62550](https://github.com/ClickHouse/ClickHouse/pull/62550) ([Max K.](https://github.com/maxknv)).
* Fix assertion in stress test [#62551](https://github.com/ClickHouse/ClickHouse/pull/62551) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Fix flaky 03093_bug37909_query_does_not_finish [#62553](https://github.com/ClickHouse/ClickHouse/pull/62553) ([Robert Schulze](https://github.com/rschu1ze)).
* Add test for issue 24607 [#62554](https://github.com/ClickHouse/ClickHouse/pull/62554) ([Robert Schulze](https://github.com/rschu1ze)).
* Follow up to [#61723](https://github.com/ClickHouse/ClickHouse/issues/61723) [#62555](https://github.com/ClickHouse/ClickHouse/pull/62555) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Fix integration-tests logs compression [#62556](https://github.com/ClickHouse/ClickHouse/pull/62556) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Try to fix if_transform_strings_to_enum performance test [#62558](https://github.com/ClickHouse/ClickHouse/pull/62558) ([Dmitry Novik](https://github.com/novikd)).
* Always use new analyzer in performance tests [#62564](https://github.com/ClickHouse/ClickHouse/pull/62564) ([Dmitry Novik](https://github.com/novikd)).
* CI: Add tests with Azure storage [#62565](https://github.com/ClickHouse/ClickHouse/pull/62565) ([Max K.](https://github.com/maxknv)).
* CI: fix for sync check status in mq [#62568](https://github.com/ClickHouse/ClickHouse/pull/62568) ([Max K.](https://github.com/maxknv)).
* Remove mentions of clean_deleted_rows from the documentation [#62570](https://github.com/ClickHouse/ClickHouse/pull/62570) ([Raúl Marín](https://github.com/Algunenano)).
* Try to fix Bugfix validation job [#62579](https://github.com/ClickHouse/ClickHouse/pull/62579) ([Raúl Marín](https://github.com/Algunenano)).
* CI: add FT to MQ remove Style from master [#62588](https://github.com/ClickHouse/ClickHouse/pull/62588) ([Max K.](https://github.com/maxknv)).
* CI: MQ sync status check fix [#62591](https://github.com/ClickHouse/ClickHouse/pull/62591) ([Max K.](https://github.com/maxknv)).
* Better retries in azure sdk [#62608](https://github.com/ClickHouse/ClickHouse/pull/62608) ([Anton Popov](https://github.com/CurtizJ)).
* Fix: msan in UUIDStringToNum [#62610](https://github.com/ClickHouse/ClickHouse/pull/62610) ([Igor Nikonov](https://github.com/devcrafter)).
* Fix a typo and grammar in `intersect` [#62622](https://github.com/ClickHouse/ClickHouse/pull/62622) ([Josh Rodriguez](https://github.com/jrodshua)).
* JOIN filter push down right stream filled crash fix [#62624](https://github.com/ClickHouse/ClickHouse/pull/62624) ([Maksim Kita](https://github.com/kitaisreal)).
* HashedDictionaryParallelLoader exception safe constructor [#62625](https://github.com/ClickHouse/ClickHouse/pull/62625) ([Maksim Kita](https://github.com/kitaisreal)).
* Fix 02366_kql_summarize [#62642](https://github.com/ClickHouse/ClickHouse/pull/62642) ([Nikita Taranov](https://github.com/nickitat)).
* Disable 02581_share_big_sets_between_mutation_tasks under sanitizers [#62645](https://github.com/ClickHouse/ClickHouse/pull/62645) ([Nikita Taranov](https://github.com/nickitat)).
* Don't allow relative paths when installing [#62658](https://github.com/ClickHouse/ClickHouse/pull/62658) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Update TransactionLog.cpp [#62663](https://github.com/ClickHouse/ClickHouse/pull/62663) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Disable aggregation-by-partitions optimisation with parallel replicas [#62697](https://github.com/ClickHouse/ClickHouse/pull/62697) ([Nikita Taranov](https://github.com/nickitat)).
* Fix build when `$CC` isn't set [#62700](https://github.com/ClickHouse/ClickHouse/pull/62700) ([Robert Schulze](https://github.com/rschu1ze)).
* Bump Azure to 1.8.0 [#62702](https://github.com/ClickHouse/ClickHouse/pull/62702) ([Robert Schulze](https://github.com/rschu1ze)).
* Fix --client-option for $CLICKHOUSE_CLIENT in .sh tests [#62710](https://github.com/ClickHouse/ClickHouse/pull/62710) ([Azat Khuzhin](https://github.com/azat)).
* Bump Azure to v1.10 [#62712](https://github.com/ClickHouse/ClickHouse/pull/62712) ([Robert Schulze](https://github.com/rschu1ze)).
* Bump Azure to v1.11 [#62713](https://github.com/ClickHouse/ClickHouse/pull/62713) ([Robert Schulze](https://github.com/rschu1ze)).
* `Trunc` docs fix [#62720](https://github.com/ClickHouse/ClickHouse/pull/62720) ([Yarik Briukhovetskyi](https://github.com/yariks5s)).
* Unify a batch of tests [#62723](https://github.com/ClickHouse/ClickHouse/pull/62723) ([Raúl Marín](https://github.com/Algunenano)).
* Fix typo in exception explanation [#62740](https://github.com/ClickHouse/ClickHouse/pull/62740) ([Igor Markelov](https://github.com/ElderlyPassionFruit)).
* Block cannot allocate thread fault in noexcept functions in `MergeTreeTransaction` [#62751](https://github.com/ClickHouse/ClickHouse/pull/62751) ([János Benjamin Antal](https://github.com/antaljanosbenjamin)).
* Log profile events send timings [#62752](https://github.com/ClickHouse/ClickHouse/pull/62752) ([Alexander Gololobov](https://github.com/davenger)).
* Follow-up to [#62700](https://github.com/ClickHouse/ClickHouse/issues/62700): Fix build when `$CC` isn't set [#62754](https://github.com/ClickHouse/ClickHouse/pull/62754) ([Robert Schulze](https://github.com/rschu1ze)).
* Analyzer: Fix exception message [#62755](https://github.com/ClickHouse/ClickHouse/pull/62755) ([Dmitry Novik](https://github.com/novikd)).
* Fix shellcheck style checking and issues [#62761](https://github.com/ClickHouse/ClickHouse/pull/62761) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Fix taking full part if part contains less than 'limit' rows [#62812](https://github.com/ClickHouse/ClickHouse/pull/62812) ([Artur Malchanau](https://github.com/Hexta)).
* TableEngineGrant: undo breaking change [#62828](https://github.com/ClickHouse/ClickHouse/pull/62828) ([Konstantin Bogdanov](https://github.com/thevar1able)).
* Fix typo [#62836](https://github.com/ClickHouse/ClickHouse/pull/62836) ([Robert Schulze](https://github.com/rschu1ze)).
* Revert "Add test for bug [#33446](https://github.com/ClickHouse/ClickHouse/issues/33446)" [#62844](https://github.com/ClickHouse/ClickHouse/pull/62844) ([Robert Schulze](https://github.com/rschu1ze)).
* SYSTEM DROP uninitialized cache fix [#62868](https://github.com/ClickHouse/ClickHouse/pull/62868) ([Maksim Kita](https://github.com/kitaisreal)).
* PlannerJoins remove unused comments [#62874](https://github.com/ClickHouse/ClickHouse/pull/62874) ([Maksim Kita](https://github.com/kitaisreal)).
* Add test for bug 33446 [#62880](https://github.com/ClickHouse/ClickHouse/pull/62880) ([Robert Schulze](https://github.com/rschu1ze)).
* Build kerberized_hadoop image by downloading commons-daemon via https [#62886](https://github.com/ClickHouse/ClickHouse/pull/62886) ([Ilya Golshtein](https://github.com/ilejn)).
* Update run.sh [#62889](https://github.com/ClickHouse/ClickHouse/pull/62889) ([Nikolay Degterinsky](https://github.com/evillique)).
* Fix build failed on clang-18 [#62899](https://github.com/ClickHouse/ClickHouse/pull/62899) ([LiuNeng](https://github.com/liuneng1994)).
* Fix parsing of nested proto messages [#62906](https://github.com/ClickHouse/ClickHouse/pull/62906) ([Raúl Marín](https://github.com/Algunenano)).
* Fix `00993_system_parts_race_condition_drop_zookeeper` [#62908](https://github.com/ClickHouse/ClickHouse/pull/62908) ([Nikita Taranov](https://github.com/nickitat)).
* Fix 03013_forbid_attach_table_if_active_replica_already_exists for private [#62909](https://github.com/ClickHouse/ClickHouse/pull/62909) ([Nikita Taranov](https://github.com/nickitat)).
* Fix 03015_optimize_final_rmt in private [#62911](https://github.com/ClickHouse/ClickHouse/pull/62911) ([Nikita Taranov](https://github.com/nickitat)).
* Add some functions to zookeeper client [#62920](https://github.com/ClickHouse/ClickHouse/pull/62920) ([alesapin](https://github.com/alesapin)).
* Fix build on Mac using clang-18 [#62954](https://github.com/ClickHouse/ClickHouse/pull/62954) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* Reapply: CI add FT to MQ remove Style from master [#62963](https://github.com/ClickHouse/ClickHouse/pull/62963) ([Max K.](https://github.com/maxknv)).
* Fix flaky 03128_argMin_combinator_projection [#62965](https://github.com/ClickHouse/ClickHouse/pull/62965) ([Raúl Marín](https://github.com/Algunenano)).
* Better exception message [#62967](https://github.com/ClickHouse/ClickHouse/pull/62967) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Fix race in `executeJob` when updating exception message [#62972](https://github.com/ClickHouse/ClickHouse/pull/62972) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Remove incorrect assertion from DatabaseReplicated [#63000](https://github.com/ClickHouse/ClickHouse/pull/63000) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Update version_date.tsv and changelogs after v23.8.13.25-lts [#63014](https://github.com/ClickHouse/ClickHouse/pull/63014) ([robot-clickhouse](https://github.com/robot-clickhouse)).
* JIT sort description crash fix [#63024](https://github.com/ClickHouse/ClickHouse/pull/63024) ([Maksim Kita](https://github.com/kitaisreal)).
* CI: fix ci config to run FT in MQ [#63025](https://github.com/ClickHouse/ClickHouse/pull/63025) ([Max K.](https://github.com/maxknv)).
* Add test for [#42769](https://github.com/ClickHouse/ClickHouse/issues/42769) [#63033](https://github.com/ClickHouse/ClickHouse/pull/63033) ([Raúl Marín](https://github.com/Algunenano)).
* Fix suppressions for librdkafka data-race for statistics code [#63039](https://github.com/ClickHouse/ClickHouse/pull/63039) ([Azat Khuzhin](https://github.com/azat)).
* Enable 03015_optimize_final_rmt for SMT [#63042](https://github.com/ClickHouse/ClickHouse/pull/63042) ([Nikita Taranov](https://github.com/nickitat)).
* CI: fix job config for MQ [#63045](https://github.com/ClickHouse/ClickHouse/pull/63045) ([Max K.](https://github.com/maxknv)).
* Unfork and update curl to 8.7.1 [#63048](https://github.com/ClickHouse/ClickHouse/pull/63048) ([Vitaly Baranov](https://github.com/vitlibar)).
* Fix integration tests with old analyzer (and fix some leftovers of enabling it) [#63069](https://github.com/ClickHouse/ClickHouse/pull/63069) ([Azat Khuzhin](https://github.com/azat)).
* Get back test for old inter-server mode (DBMS_MIN_REVISION_WITH_INTERSERVER_SECRET non-v2) [#63070](https://github.com/ClickHouse/ClickHouse/pull/63070) ([Azat Khuzhin](https://github.com/azat)).
* Fix "invalid escape sequence" in clickhouse-test [#63073](https://github.com/ClickHouse/ClickHouse/pull/63073) ([Azat Khuzhin](https://github.com/azat)).
* Fix stateful tests [#63077](https://github.com/ClickHouse/ClickHouse/pull/63077) ([alesapin](https://github.com/alesapin)).
* Better highlighting of keywords [#63079](https://github.com/ClickHouse/ClickHouse/pull/63079) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Fix race in OpenSSL X509 store [#63109](https://github.com/ClickHouse/ClickHouse/pull/63109) ([Robert Schulze](https://github.com/rschu1ze)).
* Azure always green [#63120](https://github.com/ClickHouse/ClickHouse/pull/63120) ([alesapin](https://github.com/alesapin)).
* Fix flaky `03094_grouparraysorted_memory` [#63121](https://github.com/ClickHouse/ClickHouse/pull/63121) ([Antonio Andelic](https://github.com/antonio2368)).
* Add test for [#56564](https://github.com/ClickHouse/ClickHouse/issues/56564) [#63124](https://github.com/ClickHouse/ClickHouse/pull/63124) ([Denny Crane](https://github.com/den-crane)).
* Recursive CTE data race fix [#63125](https://github.com/ClickHouse/ClickHouse/pull/63125) ([Maksim Kita](https://github.com/kitaisreal)).
* Add test for [#55360](https://github.com/ClickHouse/ClickHouse/issues/55360) [#63127](https://github.com/ClickHouse/ClickHouse/pull/63127) ([flynn](https://github.com/ucasfl)).
* Add tests for [#47217](https://github.com/ClickHouse/ClickHouse/issues/47217), [#55965](https://github.com/ClickHouse/ClickHouse/issues/55965) [#63128](https://github.com/ClickHouse/ClickHouse/pull/63128) ([Denny Crane](https://github.com/den-crane)).
* Revert "Merge pull request [#60598](https://github.com/ClickHouse/ClickHouse/issues/60598) from jrdi/week-default-mode" [#63157](https://github.com/ClickHouse/ClickHouse/pull/63157) ([Jordi Villar](https://github.com/jrdi)).

View File

@ -1,19 +1,19 @@
---
slug: /en/engines/table-engines/mergetree-family/invertedindexes
sidebar_label: Inverted Indexes
sidebar_label: Full-text Indexes
description: Quickly find search terms in text.
keywords: [full-text search, text search, inverted, index, indices]
---
# Full-text Search using Inverted Indexes [experimental]
# Full-text Search using Full-text Indexes [experimental]
Inverted indexes are an experimental type of [secondary indexes](/docs/en/engines/table-engines/mergetree-family/mergetree.md/#available-types-of-indices) which provide fast text search
Full-text indexes are an experimental type of [secondary indexes](/docs/en/engines/table-engines/mergetree-family/mergetree.md/#available-types-of-indices) which provide fast text search
capabilities for [String](/docs/en/sql-reference/data-types/string.md) or [FixedString](/docs/en/sql-reference/data-types/fixedstring.md)
columns. The main idea of an inverted index is to store a mapping from "terms" to the rows which contain these terms. "Terms" are
columns. The main idea of a full-text index is to store a mapping from "terms" to the rows which contain these terms. "Terms" are
tokenized cells of the string column. For example, the string cell "I will be a little late" is by default tokenized into six terms "I", "will",
"be", "a", "little" and "late". Another kind of tokenizer is n-grams. For example, the result of 3-gram tokenization will be 21 terms "I w",
" wi", "wil", "ill", "ll ", "l b", " be" etc. The more fine-granular the input strings are tokenized, the bigger but also the more
useful the resulting inverted index will be.
useful the resulting full-text index will be.
<div class='vimeo-container'>
<iframe src="//www.youtube.com/embed/O_MnyUkrIq8"
@ -28,26 +28,26 @@ useful the resulting inverted index will be.
</div>
:::note
Inverted indexes are experimental and should not be used in production environments yet. They may change in the future in backward-incompatible
Full-text indexes are experimental and should not be used in production environments yet. They may change in the future in backward-incompatible
ways, for example with respect to their DDL/DQL syntax or performance/compression characteristics.
:::
## Usage
To use inverted indexes, first enable them in the configuration:
To use full-text indexes, first enable them in the configuration:
```sql
SET allow_experimental_inverted_index = true;
```
An inverted index can be defined on a string column using the following syntax
A full-text index can be defined on a string column using the following syntax
``` sql
CREATE TABLE tab
(
`key` UInt64,
`str` String,
INDEX inv_idx(str) TYPE inverted(0) GRANULARITY 1
INDEX inv_idx(str) TYPE full_text(0) GRANULARITY 1
)
ENGINE = MergeTree
ORDER BY key
@ -55,20 +55,20 @@ ORDER BY key
where `N` specifies the tokenizer:
- `inverted(0)` (or shorter: `inverted()`) set the tokenizer to "tokens", i.e. split strings along spaces,
- `inverted(N)` with `N` between 2 and 8 sets the tokenizer to "ngrams(N)"
- `full_text(0)` (or shorter: `full_text()`) sets the tokenizer to "tokens", i.e. split strings along spaces,
- `full_text(N)` with `N` between 2 and 8 sets the tokenizer to "ngrams(N)"
The maximum number of rows per postings list can be specified as the second parameter. This parameter can be used to control postings list sizes and avoid generating huge postings list files. The following variants exist (a sketch follows this list):
- `inverted(ngrams, max_rows_per_postings_list)`: Use given max_rows_per_postings_list (assuming it is not 0)
- `inverted(ngrams, 0)`: No limitation of maximum rows per postings list
- `inverted(ngrams)`: Use a default maximum rows which is 64K.
- `full_text(ngrams, max_rows_per_postings_list)`: Use the given max_rows_per_postings_list (assuming it is not 0)
- `full_text(ngrams, 0)`: No limitation of maximum rows per postings list
- `full_text(ngrams)`: Use a default maximum of 64K rows.
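For illustration, here is a minimal sketch of a table whose index combines both parameters: a 3-gram tokenizer with at most 100000 rows per postings list. The table and column names are arbitrary; only the `full_text(3, 100000)` clause matters.
```sql
CREATE TABLE tab_ngram
(
    `key` UInt64,
    `str` String,
    -- 3-gram tokenizer, postings lists capped at 100000 rows each
    INDEX inv_idx(str) TYPE full_text(3, 100000) GRANULARITY 1
)
ENGINE = MergeTree
ORDER BY key;
```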
Being a type of skipping index, inverted indexes can be dropped or added to a column after table creation:
Being a type of skipping index, full-text indexes can be dropped or added to a column after table creation:
``` sql
ALTER TABLE tab DROP INDEX inv_idx;
ALTER TABLE tab ADD INDEX inv_idx(s) TYPE inverted(2);
ALTER TABLE tab ADD INDEX inv_idx(s) TYPE full_text(2);
```
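Note that an index added this way only covers parts written after the `ALTER`; for data that is already in the table, materialize it explicitly (the same step is used in the Hacker News example below):
```sql
-- Rebuild existing parts so that they contain the newly added index
ALTER TABLE tab MATERIALIZE INDEX inv_idx;
```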
To use the index, no special functions or syntax are required. Typical string search predicates automatically leverage the index. As
@ -83,9 +83,9 @@ SELECT * from tab WHERE multiSearchAny(str, ['Hello', 'World']);
SELECT * from tab WHERE hasToken(str, 'Hello');
```
The inverted index also works on columns of type `Array(String)`, `Array(FixedString)`, `Map(String)` and `Map(String)`.
The full-text index also works on columns of type `Array(String)`, `Array(FixedString)`, `Map(String)` and `Map(String)`.
Like for other secondary indices, each column part has its own inverted index. Furthermore, each inverted index is internally divided into
Like for other secondary indices, each column part has its own full-text index. Furthermore, each full-text index is internally divided into
"segments". The existence and size of the segments are generally transparent to users but the segment size determines the memory consumption
during index construction (e.g. when two parts are merged). Configuration parameter "max_digestion_size_per_segment" (default: 256 MB)
controls the amount of data consumed from the underlying column before a new segment is created. Incrementing the parameter raises the
@ -94,7 +94,7 @@ average to evaluate a query.
## Full-text search of the Hacker News dataset
Let's look at the performance improvements of inverted indexes on a large dataset with lots of text. We will use 28.7M rows of comments on the popular Hacker News website. Here is the table without an inverted index:
Let's look at the performance improvements of full-text indexes on a large dataset with lots of text. We will use 28.7M rows of comments on the popular Hacker News website. Here is the table without a full-text index:
```sql
CREATE TABLE hackernews (
@ -162,11 +162,11 @@ Notice it takes 3 seconds to execute the query:
1 row in set. Elapsed: 3.001 sec. Processed 28.74 million rows, 9.75 GB (9.58 million rows/s., 3.25 GB/s.)
```
We will use `ALTER TABLE` and add an inverted index on the lowercase of the `comment` column, then materialize it (which can take a while - wait for it to materialize):
We will use `ALTER TABLE` and add a full-text index on the lowercase of the `comment` column, then materialize it (which can take a while - wait for it to materialize):
```sql
ALTER TABLE hackernews
ADD INDEX comment_lowercase(lower(comment)) TYPE inverted;
ADD INDEX comment_lowercase(lower(comment)) TYPE full_text;
ALTER TABLE hackernews MATERIALIZE INDEX comment_lowercase;
```
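After materialization, queries that filter with token predicates on `lower(comment)` can be answered using the index. An illustrative query (not part of the original benchmark):
```sql
-- hasToken on lower(comment) matches the indexed expression
SELECT count()
FROM hackernews
WHERE hasToken(lower(comment), 'clickhouse');
```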
@ -204,9 +204,9 @@ WHERE hasToken(lower(comment), 'avx') AND hasToken(lower(comment), 'sve');
```
:::note
Unlike other secondary indices, inverted indexes (for now) map to row numbers (row ids) instead of granule ids. The reason for this design
Unlike other secondary indices, full-text indexes (for now) map to row numbers (row ids) instead of granule ids. The reason for this design
is performance. In practice, users often search for multiple terms at once. For example, filter predicate `WHERE s LIKE '%little%' OR s LIKE
'%big%'` can be evaluated directly using an inverted index by forming the union of the row id lists for terms "little" and "big". This also
'%big%'` can be evaluated directly using a full-text index by forming the union of the row id lists for terms "little" and "big". This also
means that the parameter `GRANULARITY` supplied to index creation has no meaning (it may be removed from the syntax in the future).
:::

View File

@ -1189,7 +1189,7 @@ Expired time for HSTS in seconds. The default value is 0 means clickhouse disabl
## include_from {#include_from}
The path to the file with substitutions.
The path to the file with substitutions. Both XML and YAML formats are supported.
For more information, see the section “[Configuration files](../../operations/configuration-files.md#configuration_files)”.

View File

@ -37,4 +37,4 @@ In this case overcommit ratio is computed as number of allocated bytes divided b
If `memory_overcommit_ratio_denominator_for_user` for the query is equals to zero, overcommit tracker won't choose this query.
Waiting timeout is set by `global_memory_usage_overcommit_max_wait_microseconds` parameter in the configuration file.
Waiting timeout is set by `memory_usage_overcommit_max_wait_microseconds` parameter in the configuration file.

View File

@ -8,7 +8,8 @@ sidebar_label: UUID
A Universally Unique Identifier (UUID) is a 16-byte value used to identify records. For detailed information about UUIDs, see [Wikipedia](https://en.wikipedia.org/wiki/Universally_unique_identifier).
While different UUID variants exist (see [here](https://datatracker.ietf.org/doc/html/draft-ietf-uuidrev-rfc4122bis)), ClickHouse does not validate that inserted UUIDs conform to a particular variant. UUIDs are internally treated as a sequence of 16 random bytes with [8-4-4-4-12 representation](https://en.wikipedia.org/wiki/Universally_unique_identifier#Textual_representation) at SQL level.
While different UUID variants exist (see [here](https://datatracker.ietf.org/doc/html/draft-ietf-uuidrev-rfc4122bis)), ClickHouse does not validate that inserted UUIDs conform to a particular variant.
UUIDs are internally treated as a sequence of 16 random bytes with [8-4-4-4-12 representation](https://en.wikipedia.org/wiki/Universally_unique_identifier#Textual_representation) at SQL level.
Example UUID value:
@ -22,6 +23,46 @@ The default UUID is all-zero. It is used, for example, when a new record is inse
00000000-0000-0000-0000-000000000000
```
Due to historical reasons, UUIDs are sorted by their second half (which is unintuitive).
UUIDs should therefore not be used in a primary key (or sorting key) of a table, or as a partition key.
Example:
``` sql
CREATE TABLE tab (uuid UUID) ENGINE = Memory;
INSERT INTO tab SELECT generateUUIDv4() FROM numbers(50);
SELECT * FROM tab ORDER BY uuid;
```
Result:
``` text
┌─uuid─────────────────────────────────┐
│ 36a0b67c-b74a-4640-803b-e44bb4547e3c │
│ 3a00aeb8-2605-4eec-8215-08c0ecb51112 │
│ 3fda7c49-282e-421a-85ab-c5684ef1d350 │
│ 16ab55a7-45f6-44a8-873c-7a0b44346b3e │
│ e3776711-6359-4f22-878d-bf290d052c85 │
│ 1be30226-57b2-4739-88ec-5e3d490090f2 │
│ f65853a9-4375-4f0e-8b96-906ff622ed3c │
│ d5a0c7a6-79c6-4107-8bb8-df85915edcb7 │
│ 258e6068-17d1-4a1a-8be3-ed2ceb21815c │
│ 04b0f6a9-1f7b-4a42-8bfc-62f37b8a32b8 │
│ 9924f0d9-9c16-43a9-8f08-0944ab495aed │
│ 6720dc14-4eab-4e3e-8f0c-10c4ae8d2673 │
│ 5ddadb52-0452-4f5d-9030-c3f969af93a4 │
│ [...] │
│ 2dde30e6-59a1-48f8-b260-eb37921185b6 │
│ d5402a1b-77b3-4897-b288-29edf5c3ed12 │
│ 01843939-3ba7-4fea-b2aa-45f9a6f1e057 │
│ 9eceda2f-6946-40e3-b725-16f2709ca41a │
│ 03644f74-47ba-4020-b865-be5fd4c8c7ff │
│ ce3bc93d-ab19-4c74-b8cc-737cb9212099 │
│ b7ad6c91-23d6-4b5e-b8e4-a52297490b56 │
│ 06892f64-cc2d-45f3-bf86-f5c5af5768a9 │
└──────────────────────────────────────┘
```
## Generating UUIDs
ClickHouse provides the [generateUUIDv4](../../sql-reference/functions/uuid-functions.md) function to generate random UUID version 4 values.
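For illustration only, a sketch of using it as a column default (the table and column names are made up; note that, per the caveat above, the UUID is not used as the sorting key):
```sql
-- hypothetical table for illustration
CREATE TABLE events
(
    id      UUID DEFAULT generateUUIDv4(),  -- a random UUIDv4 is generated for each inserted row
    payload String
)
ENGINE = MergeTree
ORDER BY payload;

INSERT INTO events (payload) VALUES ('hello');

SELECT * FROM events;  -- returns one row with a random id
```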

View File

@ -1413,9 +1413,10 @@ toStartOfFifteenMinutes(toDateTime('2023-04-21 10:20:00')): 2023-04-21 10:15:00
toStartOfFifteenMinutes(toDateTime('2023-04-21 10:23:00')): 2023-04-21 10:15:00
```
## toStartOfInterval(date_or_date_with_time, INTERVAL x unit \[, time_zone\])
## toStartOfInterval
This function generalizes other `toStartOf*()` functions. For example,
This function generalizes other `toStartOf*()` functions with `toStartOfInterval(date_or_date_with_time, INTERVAL x unit [, time_zone])` syntax.
For example,
- `toStartOfInterval(t, INTERVAL 1 year)` returns the same as `toStartOfYear(t)`,
- `toStartOfInterval(t, INTERVAL 1 month)` returns the same as `toStartOfMonth(t)`,
- `toStartOfInterval(t, INTERVAL 1 day)` returns the same as `toStartOfDay(t)`,
@ -1440,6 +1441,8 @@ The calculation is performed relative to specific points in time:
(*) hour intervals are special: the calculation is always performed relative to 00:00:00 (midnight) of the current day. As a result, only
hour values between 1 and 23 are useful.
If unit `week` was specified, `toStartOfInterval` assumes that weeks start on Monday. Note that this behavior is different from that of function `toStartOfWeek` in which weeks start by default on Sunday.
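As an illustration of that difference, a hedged sketch (2023-01-05 is a Thursday; the commented results are the expected values):
```sql
SELECT
    toStartOfInterval(toDate('2023-01-05'), INTERVAL 1 week) AS week_monday_based, -- 2023-01-02 (Monday)
    toStartOfWeek(toDate('2023-01-05'))                      AS week_sunday_based; -- 2023-01-01 (Sunday, default mode)
```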
**See Also**
- [date_trunc](#date_trunc)
@ -1673,7 +1676,7 @@ Like [fromDaysSinceYearZero](#fromDaysSinceYearZero) but returns a [Date32](../.
Returns the `unit` component of the difference between `startdate` and `enddate`. The difference is calculated using a precision of 1 nanosecond.
E.g. the difference between `2021-12-29` and `2022-01-01` is 3 days for `day` unit, 0 months for `month` unit, 0 years for `year` unit.
For an alternative to `age`, see function `date\_diff`.
For an alternative to `age`, see function `date_diff`.
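To make the example above concrete, a small sketch (the commented values follow from the definition above):
```sql
SELECT
    age('day',   toDate('2021-12-29'), toDate('2022-01-01')) AS days,   -- 3
    age('month', toDate('2021-12-29'), toDate('2022-01-01')) AS months, -- 0
    age('year',  toDate('2021-12-29'), toDate('2022-01-01')) AS years;  -- 0
```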
**Syntax**
@ -1742,14 +1745,14 @@ Result:
```
## date\_diff
## date_diff
Returns the count of the specified `unit` boundaries crossed between the `startdate` and the `enddate`.
The difference is calculated using relative units, e.g. the difference between `2021-12-29` and `2022-01-01` is 3 days for unit `day` (see [toRelativeDayNum](#torelativedaynum)), 1 month for unit `month` (see [toRelativeMonthNum](#torelativemonthnum)) and 1 year for unit `year` (see [toRelativeYearNum](#torelativeyearnum)).
If unit `week` was specified, `date\_diff` assumes that weeks start on Monday. Note that this behavior is different from that of function `toWeek()` in which weeks start by default on Sunday.
If unit `week` was specified, `date_diff` assumes that weeks start on Monday. Note that this behavior is different from that of function `toWeek()` in which weeks start by default on Sunday.
For an alternative to `date\_diff`, see function `age`.
For an alternative to `date_diff`, see function `age`.
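A short sketch contrasting `date_diff` with `age` on the dates mentioned above (the commented values follow from the two definitions):
```sql
SELECT
    date_diff('day',   toDate('2021-12-29'), toDate('2022-01-01')) AS days_crossed,   -- 3
    date_diff('month', toDate('2021-12-29'), toDate('2022-01-01')) AS months_crossed, -- 1 (a month boundary is crossed)
    age('month',       toDate('2021-12-29'), toDate('2022-01-01')) AS months_elapsed; -- 0 (less than a full month elapsed)
```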
**Syntax**
@ -1891,7 +1894,7 @@ Result:
**See Also**
- [toStartOfInterval](#tostartofintervaltime-or-data-interval-x-unit-time-zone)
- [toStartOfInterval](#tostartofintervaldate_or_date_with_time-interval-x-unit--time_zone)
## date\_add
@ -2883,7 +2886,7 @@ Result:
## fromUnixTimestamp
This function converts a Unix timestamp to a calendar date and a time of a day.
This function converts a Unix timestamp to a calendar date and a time of a day.
It can be called in two ways:
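For example, the two call styles might look like this (the commented results assume the UTC time zone):
```sql
SELECT
    fromUnixTimestamp(0)                   AS as_datetime, -- 1970-01-01 00:00:00 (DateTime)
    fromUnixTimestamp(0, '%Y-%m-%d %H:%i') AS as_string;   -- '1970-01-01 00:00' (String, formatted)
```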

View File

@ -8,49 +8,267 @@ sidebar_label: UUIDs
## generateUUIDv4
Generates the [UUID](../data-types/uuid.md) of [version 4](https://tools.ietf.org/html/rfc4122#section-4.4).
Generates a [version 4](https://tools.ietf.org/html/rfc4122#section-4.4) [UUID](../data-types/uuid.md).
**Syntax**
``` sql
generateUUIDv4([x])
generateUUIDv4([expr])
```
**Arguments**
- `x` — [Expression](../../sql-reference/syntax.md#syntax-expressions) resulting in any of the [supported data types](../../sql-reference/data-types/index.md#data_types). The resulting value is discarded, but the expression itself if used for bypassing [common subexpression elimination](../../sql-reference/functions/index.md#common-subexpression-elimination) if the function is called multiple times in one query. Optional parameter.
- `expr` — An arbitrary [expression](../../sql-reference/syntax.md#syntax-expressions) used to bypass [common subexpression elimination](../../sql-reference/functions/index.md#common-subexpression-elimination) if the function is called multiple times in a query. The value of the expression has no effect on the returned UUID. Optional.
**Returned value**
The UUID type value.
A value of type UUIDv4.
**Usage example**
**Example**
This example demonstrates creating a table with the UUID type column and inserting a value into the table.
First, create a table with a column of type UUID, then insert a generated UUIDv4 into the table.
``` sql
CREATE TABLE t_uuid (x UUID) ENGINE=TinyLog
CREATE TABLE tab (uuid UUID) ENGINE = Memory;
INSERT INTO t_uuid SELECT generateUUIDv4()
INSERT INTO tab SELECT generateUUIDv4();
SELECT * FROM t_uuid
SELECT * FROM tab;
```
Result:
```response
┌────────────────────────────────────x─┐
┌─────────────────────────────────uuid─┐
│ f4bf890f-f9dc-4332-ad5c-0c18e73f28e9 │
└──────────────────────────────────────┘
```
**Usage example if it is needed to generate multiple values in one row**
**Example with multiple UUIDs generated per row**
```sql
SELECT generateUUIDv4(1), generateUUIDv4(2)
SELECT generateUUIDv4(1), generateUUIDv4(2);
┌─generateUUIDv4(1)────────────────────┬─generateUUIDv4(2)────────────────────┐
│ 2d49dc6e-ddce-4cd0-afb8-790956df54c1 │ 8abf8c13-7dea-4fdf-af3e-0e18767770e6 │
└──────────────────────────────────────┴──────────────────────────────────────┘
```
## generateUUIDv7 {#generateUUIDv7}
Generates a [version 7](https://datatracker.ietf.org/doc/html/draft-peabody-dispatch-new-uuid-format-04) [UUID](../data-types/uuid.md).
The generated UUID contains the current Unix timestamp in milliseconds (48 bits), followed by version "7" (4 bits), a counter (42 bit) to distinguish UUIDs within a millisecond (including a variant field "2", 2 bit), and a random field (32 bits).
For any given timestamp (unix_ts_ms), the counter starts at a random value and is incremented by 1 for each new UUID until the timestamp changes.
In case the counter overflows, the timestamp field is incremented by 1 and the counter is reset to a random new start value.
Function `generateUUIDv7` guarantees that the counter field within a timestamp increments monotonically across all function invocations in concurrently running threads and queries.
```
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
├─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┤
| unix_ts_ms |
├─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┤
| unix_ts_ms | ver | counter_high_bits |
├─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┤
|var| counter_low_bits |
├─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┤
| rand_b |
└─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┘
```
:::note
As of April 2024, version 7 UUIDs are in draft status and their layout may change in future.
:::
**Syntax**
``` sql
generateUUIDv7([expr])
```
**Arguments**
- `expr` — An arbitrary [expression](../../sql-reference/syntax.md#syntax-expressions) used to bypass [common subexpression elimination](../../sql-reference/functions/index.md#common-subexpression-elimination) if the function is called multiple times in a query. The value of the expression has no effect on the returned UUID. Optional.
**Returned value**
A value of type UUIDv7.
**Example**
First, create a table with a column of type UUID, then insert a generated UUIDv7 into the table.
``` sql
CREATE TABLE tab (uuid UUID) ENGINE = Memory;
INSERT INTO tab SELECT generateUUIDv7();
SELECT * FROM tab;
```
Result:
```response
┌─────────────────────────────────uuid─┐
│ 018f05af-f4a8-778f-beee-1bedbc95c93b │
└──────────────────────────────────────┘
```
**Example with multiple UUIDs generated per row**
```sql
SELECT generateUUIDv7(1), generateUUIDv7(2);
┌─generateUUIDv7(1)────────────────────┬─generateUUIDv7(2)────────────────────┐
│ 018f05c9-4ab8-7b86-b64e-c9f03fbd45d1 │ 018f05c9-4ab8-7b86-b64e-c9f12efb7e16 │
└──────────────────────────────────────┴──────────────────────────────────────┘
```
## generateUUIDv7ThreadMonotonic
Generates a [UUID](../data-types/uuid.md) of [version 7](https://datatracker.ietf.org/doc/html/draft-peabody-dispatch-new-uuid-format-04).
The generated UUID contains the current Unix timestamp in milliseconds (48 bits), followed by version "7" (4 bits), a counter (42 bit) to distinguish UUIDs within a millisecond (including a variant field "2", 2 bit), and a random field (32 bits).
For any given timestamp (unix_ts_ms), the counter starts at a random value and is incremented by 1 for each new UUID until the timestamp changes.
In case the counter overflows, the timestamp field is incremented by 1 and the counter is reset to a random new start value.
This function behaves like [generateUUIDv7](#generateUUIDv7) but gives no guarantee of counter monotonicity across different simultaneous requests.
Monotonicity within one timestamp is guaranteed only within the same thread calling this function to generate UUIDs.
```
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
├─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┤
| unix_ts_ms |
├─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┤
| unix_ts_ms | ver | counter_high_bits |
├─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┤
|var| counter_low_bits |
├─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┤
| rand_b |
└─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┘
```
:::note
As of April 2024, version 7 UUIDs are in draft status and their layout may change in future.
:::
**Syntax**
``` sql
generateUUIDv7ThreadMonotonic([expr])
```
**Arguments**
- `expr` — An arbitrary [expression](../../sql-reference/syntax.md#syntax-expressions) used to bypass [common subexpression elimination](../../sql-reference/functions/index.md#common-subexpression-elimination) if the function is called multiple times in a query. The value of the expression has no effect on the returned UUID. Optional.
**Returned value**
A value of type UUIDv7.
**Usage example**
First, create a table with a column of type UUID, then insert a generated UUIDv7 into the table.
``` sql
CREATE TABLE tab (uuid UUID) ENGINE = Memory;
INSERT INTO tab SELECT generateUUIDv7ThreadMonotonic();
SELECT * FROM tab;
```
Result:
```response
┌─────────────────────────────────uuid─┐
│ 018f05e2-e3b2-70cb-b8be-64b09b626d32 │
└──────────────────────────────────────┘
```
**Example with multiple UUIDs generated per row**
```sql
SELECT generateUUIDv7ThreadMonotonic(1), generateUUIDv7ThreadMonotonic(2);
┌─generateUUIDv7ThreadMonotonic(1)─────┬─generateUUIDv7ThreadMonotonic(2)─────┐
│ 018f05e1-14ee-7bc5-9906-207153b400b1 │ 018f05e1-14ee-7bc5-9906-2072b8e96758 │
└──────────────────────────────────────┴──────────────────────────────────────┘
```
## generateUUIDv7NonMonotonic
Generates a [UUID](../data-types/uuid.md) of [version 7](https://datatracker.ietf.org/doc/html/draft-peabody-dispatch-new-uuid-format-04).
The generated UUID contains the current Unix timestamp in milliseconds (48 bits), followed by version "7" (4 bits) and a random field (76 bits, including a 2-bit variant field "2").
This function is the fastest `generateUUIDv7*` function but it gives no monotonicity guarantees within a timestamp.
```
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
├─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┤
| unix_ts_ms |
├─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┤
| unix_ts_ms | ver | rand_a |
├─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┤
|var| rand_b |
├─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┤
| rand_b |
└─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┘
```
:::note
As of April 2024, version 7 UUIDs are in draft status and their layout may change in future.
:::
**Syntax**
``` sql
generateUUIDv7NonMonotonic([expr])
```
**Arguments**
- `expr` — An arbitrary [expression](../../sql-reference/syntax.md#syntax-expressions) used to bypass [common subexpression elimination](../../sql-reference/functions/index.md#common-subexpression-elimination) if the function is called multiple times in a query. The value of the expression has no effect on the returned UUID. Optional.
**Returned value**
A value of type UUIDv7.
**Example**
First, create a table with a column of type UUID, then insert a generated UUIDv7 into the table.
``` sql
CREATE TABLE tab (uuid UUID) ENGINE = Memory;
INSERT INTO tab SELECT generateUUIDv7NonMonotonic();
SELECT * FROM tab;
```
Result:
```response
┌─────────────────────────────────uuid─┐
│ 018f05af-f4a8-778f-beee-1bedbc95c93b │
└──────────────────────────────────────┘
```
**Example with multiple UUIDs generated per row**
```sql
SELECT generateUUIDv7NonMonotonic(1), generateUUIDv7NonMonotonic(2);
┌─generateUUIDv7NonMonotonic(1)────────┬─generateUUIDv7NonMonotonic(2)────────┐
│ 018f05b1-8c2e-7567-a988-48d09606ae8c │ 018f05b1-8c2e-7946-895b-fcd7635da9a0 │
└──────────────────────────────────────┴──────────────────────────────────────┘
```
## empty
Checks whether the input UUID is empty.
@ -63,15 +281,15 @@ empty(UUID)
The UUID is considered empty if it contains all zeros (zero UUID).
The function also works for [arrays](array-functions.md#function-empty) or [strings](string-functions.md#empty).
The function also works for [Arrays](array-functions.md#function-empty) and [Strings](string-functions.md#empty).
**Arguments**
- `x`Input UUID. [UUID](../data-types/uuid.md).
- `x`A UUID. [UUID](../data-types/uuid.md).
**Returned value**
- Returns `1` for an empty UUID or `0` for a non-empty UUID.
- Returns `1` for an empty UUID or `0` for a non-empty UUID.
Type: [UInt8](../data-types/int-uint.md).
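A minimal sketch of the behaviour:
```sql
SELECT
    empty(toUUID('00000000-0000-0000-0000-000000000000')) AS zero_uuid,   -- 1
    empty(generateUUIDv4())                                AS random_uuid; -- 0 (a random UUID is all-zero only with negligible probability)
```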
@ -105,15 +323,15 @@ notEmpty(UUID)
The UUID is considered empty if it contains all zeros (zero UUID).
The function also works for [arrays](array-functions.md#function-notempty) or [strings](string-functions.md#notempty).
The function also works for [Arrays](array-functions.md#function-notempty) or [Strings](string-functions.md#notempty).
**Arguments**
- `x`Input UUID. [UUID](../data-types/uuid.md).
- `x`A UUID. [UUID](../data-types/uuid.md).
**Returned value**
- Returns `1` for a non-empty UUID or `0` for an empty UUID.
- Returns `1` for a non-empty UUID or `0` for an empty UUID.
Type: [UInt8](../data-types/int-uint.md).
@ -135,12 +353,12 @@ Result:
└────────────────────────────┘
```
## toUUID (x)
## toUUID
Converts String type value to UUID type.
Converts a value of type String to a UUID.
``` sql
toUUID(String)
toUUID(string)
```
**Returned value**
@ -153,13 +371,15 @@ The UUID type value.
SELECT toUUID('61f0c404-5cb3-11e7-907b-a6006ad3dba0') AS uuid
```
Result:
```response
┌─────────────────────────────────uuid─┐
│ 61f0c404-5cb3-11e7-907b-a6006ad3dba0 │
└──────────────────────────────────────┘
```
## toUUIDOrDefault (x,y)
## toUUIDOrDefault
**Arguments**
@ -171,7 +391,7 @@ SELECT toUUID('61f0c404-5cb3-11e7-907b-a6006ad3dba0') AS uuid
UUID
``` sql
toUUIDOrDefault(String, UUID)
toUUIDOrDefault(string, default)
```
**Returned value**
@ -185,6 +405,9 @@ This first example returns the first argument converted to a UUID type as it can
``` sql
SELECT toUUIDOrDefault('61f0c404-5cb3-11e7-907b-a6006ad3dba0', cast('59f0c404-5cb3-11e7-907b-a6006ad3dba0' as UUID));
```
Result:
```response
┌─toUUIDOrDefault('61f0c404-5cb3-11e7-907b-a6006ad3dba0', CAST('59f0c404-5cb3-11e7-907b-a6006ad3dba0', 'UUID'))─┐
│ 61f0c404-5cb3-11e7-907b-a6006ad3dba0 │
@ -197,18 +420,20 @@ This second example returns the second argument (the provided default UUID) as t
SELECT toUUIDOrDefault('-----61f0c404-5cb3-11e7-907b-a6006ad3dba0', cast('59f0c404-5cb3-11e7-907b-a6006ad3dba0' as UUID));
```
Result:
```response
┌─toUUIDOrDefault('-----61f0c404-5cb3-11e7-907b-a6006ad3dba0', CAST('59f0c404-5cb3-11e7-907b-a6006ad3dba0', 'UUID'))─┐
│ 59f0c404-5cb3-11e7-907b-a6006ad3dba0 │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
```
## toUUIDOrNull (x)
## toUUIDOrNull
It takes an argument of type String and tries to parse it into UUID. If failed, returns NULL.
Takes an argument of type String and tries to parse it into a UUID. If parsing fails, returns NULL.
``` sql
toUUIDOrNull(String)
toUUIDOrNull(string)
```
**Returned value**
@ -221,18 +446,20 @@ The Nullable(UUID) type value.
SELECT toUUIDOrNull('61f0c404-5cb3-11e7-907b-a6006ad3dba0T') AS uuid
```
Result:
```response
┌─uuid─┐
│ ᴺᵁᴸᴸ │
└──────┘
```
## toUUIDOrZero (x)
## toUUIDOrZero
It takes an argument of type String and tries to parse it into a UUID. If parsing fails, returns the zero UUID.
``` sql
toUUIDOrZero(String)
toUUIDOrZero(string)
```
**Returned value**
@ -245,6 +472,8 @@ The UUID type value.
SELECT toUUIDOrZero('61f0c404-5cb3-11e7-907b-a6006ad3dba0T') AS uuid
```
Result:
```response
┌─────────────────────────────────uuid─┐
│ 00000000-0000-0000-0000-000000000000 │
@ -263,7 +492,7 @@ UUIDStringToNum(string[, variant = 1])
**Arguments**
- `string` — String of 36 characters or FixedString(36). [String](../../sql-reference/syntax.md#syntax-string-literal).
- `string`A [String](../../sql-reference/syntax.md#syntax-string-literal) of 36 characters or [FixedString](../../sql-reference/syntax.md#syntax-string-literal)
- `variant` — Integer, representing a variant as specified by [RFC4122](https://datatracker.ietf.org/doc/html/rfc4122#section-4.1.1). 1 = `Big-endian` (default), 2 = `Microsoft`.
**Returned value**
@ -278,6 +507,8 @@ SELECT
UUIDStringToNum(uuid) AS bytes
```
Result:
```response
┌─uuid─────────────────────────────────┬─bytes────────────┐
│ 612f3c40-5d3b-217e-707b-6a546a3d7b29 │ a/<@];!~p{jTj={) │
@ -290,6 +521,8 @@ SELECT
UUIDStringToNum(uuid, 2) AS bytes
```
Result:
```response
┌─uuid─────────────────────────────────┬─bytes────────────┐
│ 612f3c40-5d3b-217e-707b-6a546a3d7b29 │ @</a;]~!p{jTj={) │
@ -323,6 +556,8 @@ SELECT
UUIDNumToString(toFixedString(bytes, 16)) AS uuid
```
Result:
```response
┌─bytes────────────┬─uuid─────────────────────────────────┐
│ a/<@];!~p{jTj={) │ 612f3c40-5d3b-217e-707b-6a546a3d7b29 │
@ -335,15 +570,113 @@ SELECT
UUIDNumToString(toFixedString(bytes, 16), 2) AS uuid
```
Result:
```response
┌─bytes────────────┬─uuid─────────────────────────────────┐
│ @</a;]~!p{jTj={) │ 612f3c40-5d3b-217e-707b-6a546a3d7b29 │
└──────────────────┴──────────────────────────────────────┘
```
## UUIDToNum
Accepts a [UUID](../../sql-reference/data-types/uuid.md) and returns its binary representation as a [FixedString(16)](../../sql-reference/data-types/fixedstring.md), with its format optionally specified by `variant` (`Big-endian` by default). This function replaces calls to two separate functions `UUIDStringToNum(toString(uuid))` so no intermediate conversion from UUID to string is required to extract bytes from a UUID.
**Syntax**
``` sql
UUIDToNum(uuid[, variant = 1])
```
**Arguments**
- `uuid` — [UUID](../data-types/uuid.md).
- `variant` — Integer, representing a variant as specified by [RFC4122](https://datatracker.ietf.org/doc/html/rfc4122#section-4.1.1). 1 = `Big-endian` (default), 2 = `Microsoft`.
**Returned value**
The binary representation of the UUID.
**Usage examples**
``` sql
SELECT
toUUID('612f3c40-5d3b-217e-707b-6a546a3d7b29') AS uuid,
UUIDToNum(uuid) AS bytes
```
Result:
```response
┌─uuid─────────────────────────────────┬─bytes────────────┐
│ 612f3c40-5d3b-217e-707b-6a546a3d7b29 │ a/<@];!~p{jTj={) │
└──────────────────────────────────────┴──────────────────┘
```
``` sql
SELECT
toUUID('612f3c40-5d3b-217e-707b-6a546a3d7b29') AS uuid,
UUIDToNum(uuid, 2) AS bytes
```
Result:
```response
┌─uuid─────────────────────────────────┬─bytes────────────┐
│ 612f3c40-5d3b-217e-707b-6a546a3d7b29 │ @</a;]~!p{jTj={) │
└──────────────────────────────────────┴──────────────────┘
```
## UUIDv7ToDateTime
Returns the timestamp component of a UUID version 7.
**Syntax**
``` sql
UUIDv7ToDateTime(uuid[, timezone])
```
**Arguments**
- `uuid` — [UUID](../data-types/uuid.md) of version 7.
- `timezone` — [Timezone name](../../operations/server-configuration-parameters/settings.md#server_configuration_parameters-timezone) for the returned value (optional). [String](../../sql-reference/data-types/string.md).
**Returned value**
- Timestamp with milliseconds precision. If the UUID is not a valid version 7 UUID, it returns 1970-01-01 00:00:00.000.
Type: [DateTime64(3)](/docs/en/sql-reference/data-types/datetime64.md).
**Usage examples**
``` sql
SELECT UUIDv7ToDateTime(toUUID('018f05c9-4ab8-7b86-b64e-c9f03fbd45d1'))
```
Result:
```response
┌─UUIDv7ToDateTime(toUUID('018f05c9-4ab8-7b86-b64e-c9f03fbd45d1'))─┐
│ 2024-04-22 15:30:29.048 │
└──────────────────────────────────────────────────────────────────┘
```
``` sql
SELECT UUIDv7ToDateTime(toUUID('018f05c9-4ab8-7b86-b64e-c9f03fbd45d1'), 'America/New_York')
```
Result:
```response
┌─UUIDv7ToDateTime(toUUID('018f05c9-4ab8-7b86-b64e-c9f03fbd45d1'), 'America/New_York')─┐
│ 2024-04-22 08:30:29.048 │
└──────────────────────────────────────────────────────────────────────────────────────┘
```
## serverUUID()
Returns the random and unique UUID, which is generated when the server is first started and stored forever. The result writes to the file `uuid` created in the ClickHouse server directory `/var/lib/clickhouse/`.
Returns the random UUID generated during the first start of the ClickHouse server. The UUID is stored in file `uuid` in the ClickHouse server directory (e.g. `/var/lib/clickhouse/`) and retained between server restarts.
**Syntax**
@ -353,10 +686,10 @@ serverUUID()
**Returned value**
- The UUID of the server.
- The UUID of the server.
Type: [UUID](../data-types/uuid.md).
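A tiny usage sketch (the value is unique to each server installation, so any output is only illustrative):
```sql
SELECT serverUUID() AS server_uuid;
```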
## See Also
## See also
- [dictGetUUID](../../sql-reference/functions/ext-dict-functions.md#ext_dict_functions-other)

View File

@ -165,6 +165,68 @@ Result:
└───┴────┴─────┘
```
## [experimental] Join with inequality conditions
:::note
This feature is experimental. To use it, set `allow_experimental_join_condition` to 1 in your configuration files or by using the `SET` command:
```sql
SET allow_experimental_join_condition=1
```
Otherwise, you'll get `INVALID_JOIN_ON_EXPRESSION`.
:::
ClickHouse currently supports `ALL INNER/LEFT/RIGHT/FULL JOIN` with inequality conditions in addition to equality conditions. The inequality conditions are supported only for the `hash` and `grace_hash` join algorithms. The inequality conditions are not supported with `join_use_nulls`.
**Example**
Table `t1`:
```
┌─key──┬─attr─┬─a─┬─b─┬─c─┐
│ key1 │ a │ 1 │ 1 │ 2 │
│ key1 │ b │ 2 │ 3 │ 2 │
│ key1 │ c │ 3 │ 2 │ 1 │
│ key1 │ d │ 4 │ 7 │ 2 │
│ key1 │ e │ 5 │ 5 │ 5 │
│ key2 │ a2 │ 1 │ 1 │ 1 │
│ key4 │ f │ 2 │ 3 │ 4 │
└──────┴──────┴───┴───┴───┘
```
Table `t2`
```
┌─key──┬─attr─┬─a─┬─b─┬─c─┐
│ key1 │ A │ 1 │ 2 │ 1 │
│ key1 │ B │ 2 │ 1 │ 2 │
│ key1 │ C │ 3 │ 4 │ 5 │
│ key1 │ D │ 4 │ 1 │ 6 │
│ key3 │ a3 │ 1 │ 1 │ 1 │
│ key4 │ F │ 1 │ 1 │ 1 │
└──────┴──────┴───┴───┴───┘
```
```sql
SELECT t1.*, t2.* from t1 LEFT JOIN t2 ON t1.key = t2.key and (t1.a < t2.a) ORDER BY (t1.key, t1.attr, t2.key, t2.attr);
```
```
key1 a 1 1 2 key1 B 2 1 2
key1 a 1 1 2 key1 C 3 4 5
key1 a 1 1 2 key1 D 4 1 6
key1 b 2 3 2 key1 C 3 4 5
key1 b 2 3 2 key1 D 4 1 6
key1 c 3 2 1 key1 D 4 1 6
key1 d 4 7 2 0 0 \N
key1 e 5 5 5 0 0 \N
key2 a2 1 1 1 0 0 \N
key4 f 2 3 4 0 0 \N
```
## NULL values in JOIN keys
NULL is not equal to any value, including itself. This means that if a JOIN key has a NULL value in one table, it won't match a NULL value in the other table.
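A hedged sketch (table and column names are made up for illustration) showing that rows with NULL keys do not match each other:
```sql
-- hypothetical tables for illustration
CREATE TABLE l (k Nullable(String), v String) ENGINE = Memory;
CREATE TABLE r (k Nullable(String), v String) ENGINE = Memory;

INSERT INTO l VALUES ('a', 'left_a'), (NULL, 'left_null');
INSERT INTO r VALUES ('a', 'right_a'), (NULL, 'right_null');

-- Only the 'a' rows are joined; the NULL keys on both sides never match.
SELECT * FROM l INNER JOIN r ON l.k = r.k;
```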
@ -273,7 +335,7 @@ For example, consider the following tables:
## PASTE JOIN Usage
The result of `PASTE JOIN` is a table that contains all columns from left subquery followed by all columns from the right subquery.
The rows are matched based on their positions in the original tables (the order of rows should be defined).
The rows are matched based on their positions in the original tables (the order of rows should be defined).
If the subqueries return a different number of rows, extra rows will be cut.
Example:
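A minimal sketch of positional matching (using `numbers()` with explicit ordering so the row order is well defined; the commented output is the expected result):
```sql
SELECT a, b
FROM (SELECT number AS a FROM numbers(3) ORDER BY a) AS t1
PASTE JOIN (SELECT number * 10 AS b FROM numbers(3) ORDER BY b DESC) AS t2;

-- Rows are combined purely by position:
-- ┌─a─┬──b─┐
-- │ 0 │ 20 │
-- │ 1 │ 10 │
-- │ 2 │  0 │
-- └───┴────┘
```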

View File

@ -51,6 +51,174 @@ SELECT generateUUIDv4(1), generateUUIDv4(2)
└──────────────────────────────────────┴──────────────────────────────────────┘
```
## generateUUIDv7 {#uuidv7-function-generate}
Generates a [version 7 UUID](https://datatracker.ietf.org/doc/html/draft-peabody-dispatch-new-uuid-format-04). The generated UUID consists of a 48-bit timestamp (Unix time in milliseconds), the version "7" and variant "2" markers, a counter that increases monotonically within a given timestamp, and random data, laid out in the order shown below. For every new timestamp the counter starts from a new random value and is incremented by 1 for each subsequent UUIDv7. If the counter overflows, the timestamp is forcibly incremented by 1 and the counter restarts from a new random value. Monotonic growth of the counter within each timestamp is guaranteed across all concurrently running invocations of `generateUUIDv7`.
```
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
├─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┤
| unix_ts_ms |
├─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┤
| unix_ts_ms | ver | counter_high_bits |
├─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┤
|var| counter_low_bits |
├─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┤
| rand_b |
└─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┘
```
::::note
As of April 2024, UUIDv7 is in draft status and its bit layout may eventually change.
::::
**Syntax**
``` sql
generateUUIDv7([x])
```
**Arguments**
- `x` — An [expression](../syntax.md#syntax-expressions) returning a value of one of the [supported data types](../data-types/index.md#data_types). The value is used to avoid [common subexpression elimination](index.md#common-subexpression-elimination) if the function is called several times in one query. Optional.
**Returned value**
A value of type [UUID](../../sql-reference/functions/uuid-functions.md).
**Usage example**
This example demonstrates creating a table with a UUID column and inserting a generated UUIDv7 into it.
``` sql
CREATE TABLE t_uuid (x UUID) ENGINE=TinyLog
INSERT INTO t_uuid SELECT generateUUIDv7()
SELECT * FROM t_uuid
```
``` text
┌────────────────────────────────────x─┐
│ 018f05c7-56e3-7ac3-93e9-1d93c4218e0e │
└──────────────────────────────────────┘
```
**Usage example for generating multiple values in one row**
```sql
SELECT generateUUIDv7(1), generateUUIDv7(2)
┌─generateUUIDv7(1)────────────────────┬─generateUUIDv7(2)────────────────────┐
│ 018f05c9-4ab8-7b86-b64e-c9f03fbd45d1 │ 018f05c9-4ab8-7b86-b64e-c9f12efb7e16 │
└──────────────────────────────────────┴──────────────────────────────────────┘
```
## generateUUIDv7ThreadMonotonic {#uuidv7threadmonotonic-function-generate}
Generates a [version 7 UUID](https://datatracker.ietf.org/doc/html/draft-peabody-dispatch-new-uuid-format-04). The generated UUID consists of a 48-bit timestamp (Unix time in milliseconds), the version "7" and variant "2" markers, a counter that increases monotonically within a given timestamp, and random data, laid out in the order shown below. For every new timestamp the counter starts from a new random value and is incremented by 1 for each subsequent UUIDv7. If the counter overflows, the timestamp is forcibly incremented by 1 and the counter restarts from a new random value. This function is a faster counterpart of `generateUUIDv7` that drops the guarantee of counter monotonicity within the same timestamp across different concurrently executed queries. Counter monotonicity is guaranteed only within a single thread that calls this function to generate several UUIDs.
**Syntax**
``` sql
generateUUIDv7ThreadMonotonic([x])
```
**Arguments**
- `x` — An [expression](../syntax.md#syntax-expressions) returning a value of one of the [supported data types](../data-types/index.md#data_types). The value is used to avoid [common subexpression elimination](index.md#common-subexpression-elimination) if the function is called several times in one query. Optional.
**Returned value**
A value of type [UUID](../../sql-reference/functions/uuid-functions.md).
**Usage example**
This example demonstrates creating a table with a UUID column and inserting a generated UUIDv7 into it.
``` sql
CREATE TABLE t_uuid (x UUID) ENGINE=TinyLog
INSERT INTO t_uuid SELECT generateUUIDv7ThreadMonotonic()
SELECT * FROM t_uuid
```
``` text
┌────────────────────────────────────x─┐
│ 018f05e2-e3b2-70cb-b8be-64b09b626d32 │
└──────────────────────────────────────┘
```
**Usage example for generating multiple values in one row**
```sql
SELECT generateUUIDv7ThreadMonotonic(1), generateUUIDv7ThreadMonotonic(2)
┌─generateUUIDv7ThreadMonotonic(1)─────┬─generateUUIDv7ThreadMonotonic(2)─────┐
│ 018f05e1-14ee-7bc5-9906-207153b400b1 │ 018f05e1-14ee-7bc5-9906-2072b8e96758 │
└──────────────────────────────────────┴──────────────────────────────────────┘
```
## generateUUIDv7NonMonotonic {#uuidv7nonmonotonic-function-generate}
Generates a [version 7 UUID](https://datatracker.ietf.org/doc/html/draft-peabody-dispatch-new-uuid-format-04). The generated UUID consists of a 48-bit timestamp (Unix time in milliseconds), the version "7" and variant "2" markers, and random data, in the following order:
```
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
├─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┤
| unix_ts_ms |
├─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┤
| unix_ts_ms | ver | rand_a |
├─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┤
|var| rand_b |
├─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┤
| rand_b |
└─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┘
```
::::note
As of April 2024, UUIDv7 is in draft status and its bit layout may eventually change.
::::
**Syntax**
``` sql
generateUUIDv7NonMonotonic([x])
```
**Arguments**
- `x` — An [expression](../syntax.md#syntax-expressions) returning a value of one of the [supported data types](../data-types/index.md#data_types). The value is used to avoid [common subexpression elimination](index.md#common-subexpression-elimination) if the function is called several times in one query. Optional.
**Returned value**
A value of type [UUID](../../sql-reference/functions/uuid-functions.md).
**Usage example**
This example demonstrates creating a table with a UUID column and inserting a generated UUIDv7 into it.
``` sql
CREATE TABLE t_uuid (x UUID) ENGINE=TinyLog
INSERT INTO t_uuid SELECT generateUUIDv7NonMonotonic()
SELECT * FROM t_uuid
```
``` text
┌────────────────────────────────────x─┐
│ 018f05af-f4a8-778f-beee-1bedbc95c93b │
└──────────────────────────────────────┘
```
**Usage example for generating multiple values in one row**
```sql
SELECT generateUUIDv7NonMonotonic(1), generateUUIDv7NonMonotonic(2)
┌─generateUUIDv7NonMonotonic(1)────────┬─generateUUIDv7NonMonotonic(2)────────┐
│ 018f05b1-8c2e-7567-a988-48d09606ae8c │ 018f05b1-8c2e-7946-895b-fcd7635da9a0 │
└──────────────────────────────────────┴──────────────────────────────────────┘
```
## empty {#empty}
Checks whether the input UUID is empty.
@ -259,6 +427,84 @@ SELECT
└──────────────────┴──────────────────────────────────────┘
```
## UUIDToNum {#uuidtonum}
Accepts a UUID and returns its bytes as a [FixedString(16)](../../sql-reference/functions/uuid-functions.md). It also accepts an optional second parameter specifying the UUID representation variant: 1 (the default) means `Big-endian`, 2 means the `Microsoft` format. This function replaces the sequence of two separate functions `UUIDStringToNum(toString(uuid))`, so no intermediate conversion from UUID to String is required to extract the bytes from a UUID.
``` sql
UUIDToNum(UUID[, variant = 1])
```
**Returned value**
FixedString(16)
**Usage examples**
``` sql
SELECT
toUUID('612f3c40-5d3b-217e-707b-6a546a3d7b29') AS uuid,
UUIDToNum(uuid) AS bytes
```
``` text
┌─uuid─────────────────────────────────┬─bytes────────────┐
│ 612f3c40-5d3b-217e-707b-6a546a3d7b29 │ a/<@];!~p{jTj={) │
└──────────────────────────────────────┴──────────────────┘
```
``` sql
SELECT
toUUID('612f3c40-5d3b-217e-707b-6a546a3d7b29') AS uuid,
UUIDToNum(uuid, 2) AS bytes
```
```text
┌─uuid─────────────────────────────────┬─bytes────────────┐
│ 612f3c40-5d3b-217e-707b-6a546a3d7b29 │ @</a;]~!p{jTj={) │
└──────────────────────────────────────┴──────────────────┘
```
## UUIDv7ToDateTime {#uuidv7todatetime}
Accepts a version 7 UUID and extracts its timestamp.
``` sql
UUIDv7ToDateTime(uuid[, timezone])
```
**Parameters**
- `uuid` — A version 7 [UUID](../data-types/uuid.md).
- `timezone` — [Timezone name](../../operations/server-configuration-parameters/settings.md#server_configuration_parameters-timezone) for the returned value (optional). [String](../../sql-reference/data-types/string.md).
**Returned value**
- A timestamp with millisecond precision (1970-01-01 00:00:00.000 if the UUID is not a version 7 UUID).
Type: [DateTime64(3)](/docs/ru/sql-reference/data-types/datetime64.md).
**Usage examples**
``` sql
SELECT UUIDv7ToDateTime(toUUID('018f05c9-4ab8-7b86-b64e-c9f03fbd45d1'))
```
```text
┌─UUIDv7ToDateTime(toUUID('018f05c9-4ab8-7b86-b64e-c9f03fbd45d1'))─┐
│ 2024-04-22 15:30:29.048 │
└──────────────────────────────────────────────────────────────────┘
```
``` sql
SELECT UUIDv7ToDateTime(toUUID('018f05c9-4ab8-7b86-b64e-c9f03fbd45d1'), 'America/New_York')
```
```text
┌─UUIDv7ToDateTime(toUUID('018f05c9-4ab8-7b86-b64e-c9f03fbd45d1'), 'America/New_York')─┐
│ 2024-04-22 08:30:29.048 │
└──────────────────────────────────────────────────────────────────────────────────────┘
```
## serverUUID() {#server-uuid}
Returns the random and unique UUID that is generated on the first start of the server and stored forever. The result is written to the file `uuid` in the ClickHouse server directory `/var/lib/clickhouse/`.

View File

@ -1573,8 +1573,11 @@ try
global_context->reloadQueryMaskingRulesIfChanged(config);
std::lock_guard lock(servers_lock);
updateServers(*config, server_pool, async_metrics, servers, servers_to_start_before_tables);
if (global_context->isServerCompletelyStarted())
{
std::lock_guard lock(servers_lock);
updateServers(*config, server_pool, async_metrics, servers, servers_to_start_before_tables);
}
}
global_context->updateStorageConfiguration(*config);

View File

@ -14,6 +14,7 @@
#include <Disks/DiskType.h>
#include <Poco/Util/AbstractConfiguration.h>
#include <azure/storage/blobs/blob_options.hpp>
#include <filesystem>
@ -38,6 +39,8 @@ BackupReaderAzureBlobStorage::BackupReaderAzureBlobStorage(
, configuration(configuration_)
{
auto client_ptr = StorageAzureBlob::createClient(configuration, /* is_read_only */ false);
client_ptr->SetClickhouseOptions(Azure::Storage::Blobs::ClickhouseClientOptions{.IsClientForDisk=true});
object_storage = std::make_unique<AzureObjectStorage>("BackupReaderAzureBlobStorage",
std::move(client_ptr),
StorageAzureBlob::createSettings(context_),
@ -97,8 +100,7 @@ void BackupReaderAzureBlobStorage::copyFileToDisk(const String & path_in_backup,
/* dest_path */ blob_path[0],
settings,
read_settings,
threadPoolCallbackRunnerUnsafe<void>(getBackupsIOThreadPool().get(), "BackupRDAzure"),
/* for_disk_azure_blob_storage= */ true);
threadPoolCallbackRunnerUnsafe<void>(getBackupsIOThreadPool().get(), "BackupRDAzure"));
return file_size;
};
@ -123,6 +125,8 @@ BackupWriterAzureBlobStorage::BackupWriterAzureBlobStorage(
, configuration(configuration_)
{
auto client_ptr = StorageAzureBlob::createClient(configuration, /* is_read_only */ false, attempt_to_create_container);
client_ptr->SetClickhouseOptions(Azure::Storage::Blobs::ClickhouseClientOptions{.IsClientForDisk=true});
object_storage = std::make_unique<AzureObjectStorage>("BackupWriterAzureBlobStorage",
std::move(client_ptr),
StorageAzureBlob::createSettings(context_),
@ -177,8 +181,7 @@ void BackupWriterAzureBlobStorage::copyFile(const String & destination, const St
/* dest_path */ destination,
settings,
read_settings,
threadPoolCallbackRunnerUnsafe<void>(getBackupsIOThreadPool().get(), "BackupWRAzure"),
/* for_disk_azure_blob_storage= */ true);
threadPoolCallbackRunnerUnsafe<void>(getBackupsIOThreadPool().get(), "BackupWRAzure"));
}
void BackupWriterAzureBlobStorage::copyDataToFile(const String & path_in_backup, const CreateReadBufferFunction & create_read_buffer, UInt64 start_pos, UInt64 length)

View File

@ -439,8 +439,7 @@ void ClientBase::sendExternalTables(ASTPtr parsed_query)
for (auto & table : external_tables)
data.emplace_back(table.getData(global_context));
if (send_external_tables)
connection->sendExternalTablesData(data);
connection->sendExternalTablesData(data);
}

View File

@ -731,8 +731,20 @@ XMLDocumentPtr ConfigProcessor::processConfig(
{
LOG_DEBUG(log, "Including configuration file '{}'.", include_from_path);
fs::path p(include_from_path);
std::string extension = p.extension();
boost::algorithm::to_lower(extension);
if (extension == ".yaml" || extension == ".yml")
{
include_from = YAMLParser::parse(include_from_path);
}
else
{
include_from = dom_parser.parse(include_from_path);
}
contributing_files.push_back(include_from_path);
include_from = dom_parser.parse(include_from_path);
}
doIncludesRecursive(config, include_from, getRootNode(config.get()), zk_node_cache, zk_changed_event, contributing_zk_paths);

View File

@ -106,6 +106,9 @@ void BaseExternalTable::parseStructureFromTypesField(const std::string & argumen
void BaseExternalTable::initSampleBlock()
{
if (sample_block)
return;
const DataTypeFactory & data_type_factory = DataTypeFactory::instance();
for (const auto & elem : structure)

View File

@ -235,7 +235,7 @@ class IColumn;
M(Bool, do_not_merge_across_partitions_select_final, false, "Merge parts only in one partition in select final", 0) \
M(Bool, split_parts_ranges_into_intersecting_and_non_intersecting_final, true, "Split parts ranges into intersecting and non intersecting during FINAL optimization", 0) \
M(Bool, split_intersecting_parts_ranges_into_layers_final, true, "Split intersecting parts ranges into layers during FINAL optimization", 0) \
M(Bool, allow_experimental_inverted_index, false, "If it is set to true, allow to use experimental inverted index.", 0) \
M(Bool, allow_experimental_inverted_index, false, "If it is set to true, allow to use experimental fulltext (inverted) index.", 0) \
\
M(UInt64, mysql_max_rows_to_insert, 65536, "The maximum number of rows in MySQL batch insertion of the MySQL storage engine", 0) \
M(Bool, mysql_map_string_to_text_in_show_columns, true, "If enabled, String type will be mapped to TEXT in SHOW [FULL] COLUMNS, BLOB otherwise. Has an effect only when the connection is made through the MySQL wire protocol.", 0) \
@ -322,6 +322,7 @@ class IColumn;
M(Bool, fsync_metadata, true, "Do fsync after changing metadata for tables and databases (.sql files). Could be disabled in case of poor latency on server with high load of DDL queries and high load of disk subsystem.", 0) \
\
M(Bool, join_use_nulls, false, "Use NULLs for non-joined rows of outer JOINs for types that can be inside Nullable. If false, use default value of corresponding columns data type.", IMPORTANT) \
M(Bool, allow_experimental_join_condition, false, "Support join with inequal conditions which involve columns from both left and right table. e.g. t1.y < t2.y.", IMPORTANT) \
\
M(JoinStrictness, join_default_strictness, JoinStrictness::All, "Set default strictness in JOIN query. Possible values: empty string, 'ANY', 'ALL'. If empty, query without strictness will throw exception.", 0) \
M(Bool, any_join_distinct_right_table_keys, false, "Enable old ANY JOIN logic with many-to-one left-to-right table keys mapping for all ANY JOINs. It leads to confusing not equal results for 't1 ANY LEFT JOIN t2' and 't2 ANY RIGHT JOIN t1'. ANY RIGHT JOIN needs one-to-many keys mapping to be consistent with LEFT one.", IMPORTANT) \

View File

@ -96,6 +96,7 @@ static std::map<ClickHouseVersion, SettingsChangesHistory::SettingsChanges> sett
{"temporary_data_in_cache_reserve_space_wait_lock_timeout_milliseconds", (10 * 60 * 1000), (10 * 60 * 1000), "Wait time to lock cache for sapce reservation in temporary data in filesystem cache"},
{"optimize_rewrite_sum_if_to_count_if", false, true, "Only available for the analyzer, where it works correctly"},
{"azure_allow_parallel_part_upload", "true", "true", "Use multiple threads for azure multipart upload."},
{"allow_experimental_join_condition", false, false, "Support join with inequal conditions which involve columns from both left and right table. e.g. t1.y < t2.y."},
{"max_recursive_cte_evaluation_depth", DBMS_RECURSIVE_CTE_MAX_EVALUATION_DEPTH, DBMS_RECURSIVE_CTE_MAX_EVALUATION_DEPTH, "Maximum limit on recursive CTE evaluation depth"},
{"query_plan_convert_outer_join_to_inner_join", false, true, "Allow to convert OUTER JOIN to INNER JOIN if filter after JOIN always filters default values"},
}},

View File

@ -225,7 +225,7 @@ void ReadBufferFromAzureBlobStorage::initialize()
try
{
ProfileEvents::increment(ProfileEvents::AzureGetObject);
if (read_settings.for_object_storage)
if (blob_container_client->GetClickhouseOptions().IsClientForDisk)
ProfileEvents::increment(ProfileEvents::DiskAzureGetObject);
auto download_response = blob_client->Download(download_options);
@ -279,7 +279,7 @@ size_t ReadBufferFromAzureBlobStorage::readBigAt(char * to, size_t n, size_t ran
try
{
ProfileEvents::increment(ProfileEvents::AzureGetObject);
if (read_settings.for_object_storage)
if (blob_container_client->GetClickhouseOptions().IsClientForDisk)
ProfileEvents::increment(ProfileEvents::DiskAzureGetObject);
Azure::Storage::Blobs::DownloadBlobOptions download_options;

View File

@ -5,6 +5,7 @@
#include <Common/Exception.h>
#include <Common/re2.h>
#include <azure/identity/managed_identity_credential.hpp>
#include <azure/storage/blobs/blob_options.hpp>
#include <azure/core/http/curl_transport.hpp>
#include <Poco/Util/AbstractConfiguration.h>
#include <Interpreters/Context.h>
@ -206,6 +207,8 @@ Azure::Storage::Blobs::BlobClientOptions getAzureBlobClientOptions(const Poco::U
client_options.Retry = retry_options;
client_options.Transport.Transport = std::make_shared<Azure::Core::Http::CurlTransport>(curl_options);
client_options.ClickhouseOptions = Azure::Storage::Blobs::ClickhouseClientOptions{.IsClientForDisk=true};
return client_options;
}

View File

@ -69,7 +69,8 @@ private:
bool getBatchAndCheckNext(RelativePathsWithMetadata & batch) override
{
ProfileEvents::increment(ProfileEvents::AzureListObjects);
ProfileEvents::increment(ProfileEvents::DiskAzureListObjects);
if (client->GetClickhouseOptions().IsClientForDisk)
ProfileEvents::increment(ProfileEvents::DiskAzureListObjects);
batch.clear();
auto outcome = client->ListBlobs(options);
@ -130,7 +131,8 @@ bool AzureObjectStorage::exists(const StoredObject & object) const
options.PageSizeHint = 1;
ProfileEvents::increment(ProfileEvents::AzureListObjects);
ProfileEvents::increment(ProfileEvents::DiskAzureListObjects);
if (client_ptr->GetClickhouseOptions().IsClientForDisk)
ProfileEvents::increment(ProfileEvents::DiskAzureListObjects);
auto blobs_list_response = client_ptr->ListBlobs(options);
auto blobs_list = blobs_list_response.Blobs;
@ -169,7 +171,8 @@ void AzureObjectStorage::listObjects(const std::string & path, RelativePathsWith
while (true)
{
ProfileEvents::increment(ProfileEvents::AzureListObjects);
ProfileEvents::increment(ProfileEvents::DiskAzureListObjects);
if (client_ptr->GetClickhouseOptions().IsClientForDisk)
ProfileEvents::increment(ProfileEvents::DiskAzureListObjects);
blob_list_response = client_ptr->ListBlobs(options);
auto blobs_list = blob_list_response.Blobs;
@ -298,7 +301,8 @@ std::unique_ptr<WriteBufferFromFileBase> AzureObjectStorage::writeObject( /// NO
void AzureObjectStorage::removeObjectImpl(const StoredObject & object, const SharedAzureClientPtr & client_ptr, bool if_exists)
{
ProfileEvents::increment(ProfileEvents::AzureDeleteObjects);
ProfileEvents::increment(ProfileEvents::DiskAzureDeleteObjects);
if (client_ptr->GetClickhouseOptions().IsClientForDisk)
ProfileEvents::increment(ProfileEvents::DiskAzureDeleteObjects);
const auto & path = object.remote_path;
LOG_TEST(log, "Removing single object: {}", path);
@ -353,13 +357,14 @@ void AzureObjectStorage::removeObjectsIfExist(const StoredObjects & objects)
ObjectMetadata AzureObjectStorage::getObjectMetadata(const std::string & path) const
{
ProfileEvents::increment(ProfileEvents::AzureGetProperties);
ProfileEvents::increment(ProfileEvents::DiskAzureGetProperties);
auto client_ptr = client.get();
auto blob_client = client_ptr->GetBlobClient(path);
auto properties = blob_client.GetProperties().Value;
ProfileEvents::increment(ProfileEvents::AzureGetProperties);
if (client_ptr->GetClickhouseOptions().IsClientForDisk)
ProfileEvents::increment(ProfileEvents::DiskAzureGetProperties);
ObjectMetadata result;
result.size_bytes = properties.BlobSize;
if (!properties.Metadata.empty())
@ -391,7 +396,8 @@ void AzureObjectStorage::copyObject( /// NOLINT
}
ProfileEvents::increment(ProfileEvents::AzureCopyObject);
ProfileEvents::increment(ProfileEvents::DiskAzureCopyObject);
if (client_ptr->GetClickhouseOptions().IsClientForDisk)
ProfileEvents::increment(ProfileEvents::DiskAzureCopyObject);
dest_blob_client.CopyFromUri(source_blob_client.GetUrl(), copy_options);
}

View File

@ -84,16 +84,12 @@ const std::string & IObjectStorage::getCacheName() const
ReadSettings IObjectStorage::patchSettings(const ReadSettings & read_settings) const
{
ReadSettings settings{read_settings};
settings.for_object_storage = true;
return settings;
return read_settings;
}
WriteSettings IObjectStorage::patchSettings(const WriteSettings & write_settings) const
{
WriteSettings settings{write_settings};
settings.for_object_storage = true;
return settings;
return write_settings;
}
}

View File

@ -158,7 +158,7 @@ private:
bool S3ObjectStorage::exists(const StoredObject & object) const
{
auto settings_ptr = s3_settings.get();
return S3::objectExists(*client.get(), uri.bucket, object.remote_path, {}, settings_ptr->request_settings, /* for_disk_s3= */ true);
return S3::objectExists(*client.get(), uri.bucket, object.remote_path, {}, settings_ptr->request_settings);
}
std::unique_ptr<ReadBufferFromFileBase> S3ObjectStorage::readObjects( /// NOLINT
@ -425,7 +425,7 @@ void S3ObjectStorage::removeObjectsIfExist(const StoredObjects & objects)
std::optional<ObjectMetadata> S3ObjectStorage::tryGetObjectMetadata(const std::string & path) const
{
auto settings_ptr = s3_settings.get();
auto object_info = S3::getObjectInfo(*client.get(), uri.bucket, path, {}, settings_ptr->request_settings, /* with_metadata= */ true, /* for_disk_s3= */ true, /* throw_on_error= */ false);
auto object_info = S3::getObjectInfo(*client.get(), uri.bucket, path, {}, settings_ptr->request_settings, /* with_metadata= */ true, /* throw_on_error= */ false);
if (object_info.size == 0 && object_info.last_modification_time == 0 && object_info.metadata.empty())
return {};
@ -441,7 +441,7 @@ std::optional<ObjectMetadata> S3ObjectStorage::tryGetObjectMetadata(const std::s
ObjectMetadata S3ObjectStorage::getObjectMetadata(const std::string & path) const
{
auto settings_ptr = s3_settings.get();
auto object_info = S3::getObjectInfo(*client.get(), uri.bucket, path, {}, settings_ptr->request_settings, /* with_metadata= */ true, /* for_disk_s3= */ true);
auto object_info = S3::getObjectInfo(*client.get(), uri.bucket, path, {}, settings_ptr->request_settings, /* with_metadata= */ true);
ObjectMetadata result;
result.size_bytes = object_info.size;
@ -464,9 +464,11 @@ void S3ObjectStorage::copyObjectToAnotherObjectStorage( // NOLINT
{
auto current_client = dest_s3->client.get();
auto settings_ptr = s3_settings.get();
auto size = S3::getObjectSize(*current_client, uri.bucket, object_from.remote_path, {}, settings_ptr->request_settings, /* for_disk_s3= */ true);
auto size = S3::getObjectSize(*current_client, uri.bucket, object_from.remote_path, {}, settings_ptr->request_settings);
auto scheduler = threadPoolCallbackRunnerUnsafe<void>(getThreadPoolWriter(), "S3ObjStor_copy");
try {
try
{
copyS3File(
current_client,
uri.bucket,
@ -479,8 +481,7 @@ void S3ObjectStorage::copyObjectToAnotherObjectStorage( // NOLINT
patchSettings(read_settings),
BlobStorageLogWriter::create(disk_name),
object_to_attributes,
scheduler,
/* for_disk_s3= */ true);
scheduler);
return;
}
catch (S3Exception & exc)
@ -506,8 +507,9 @@ void S3ObjectStorage::copyObject( // NOLINT
{
auto current_client = client.get();
auto settings_ptr = s3_settings.get();
auto size = S3::getObjectSize(*current_client, uri.bucket, object_from.remote_path, {}, settings_ptr->request_settings, /* for_disk_s3= */ true);
auto size = S3::getObjectSize(*current_client, uri.bucket, object_from.remote_path, {}, settings_ptr->request_settings);
auto scheduler = threadPoolCallbackRunnerUnsafe<void>(getThreadPoolWriter(), "S3ObjStor_copy");
copyS3File(current_client,
uri.bucket,
object_from.remote_path,
@ -519,8 +521,7 @@ void S3ObjectStorage::copyObject( // NOLINT
patchSettings(read_settings),
BlobStorageLogWriter::create(disk_name),
object_to_attributes,
scheduler,
/* for_disk_s3= */ true);
scheduler);
}
void S3ObjectStorage::setNewSettings(std::unique_ptr<S3ObjectStorageSettings> && s3_settings_)

View File

@ -1,14 +1,18 @@
#include <Columns/ColumnDecimal.h>
#include <Columns/ColumnsDateTime.h>
#include <Columns/ColumnFixedString.h>
#include <Columns/ColumnString.h>
#include <Columns/ColumnsNumber.h>
#include <Columns/ColumnVector.h>
#include <Common/BitHelpers.h>
#include <base/hex.h>
#include <DataTypes/DataTypeString.h>
#include <DataTypes/DataTypeFixedString.h>
#include <DataTypes/DataTypeUUID.h>
#include <Functions/FunctionFactory.h>
#include <Functions/IFunction.h>
#include <Functions/FunctionHelpers.h>
#include <Functions/extractTimeZoneFromFunctionArguments.h>
#include <IO/WriteHelpers.h>
#include <Interpreters/Context_fwd.h>
#include <Interpreters/castColumn.h>
@ -17,11 +21,11 @@
namespace DB::ErrorCodes
{
extern const int ARGUMENT_OUT_OF_BOUND;
extern const int ILLEGAL_COLUMN;
extern const int ILLEGAL_TYPE_OF_ARGUMENT;
extern const int LOGICAL_ERROR;
extern const int NUMBER_OF_ARGUMENTS_DOESNT_MATCH;
extern const int ARGUMENT_OUT_OF_BOUND;
extern const int ILLEGAL_COLUMN;
extern const int ILLEGAL_TYPE_OF_ARGUMENT;
extern const int LOGICAL_ERROR;
extern const int NUMBER_OF_ARGUMENTS_DOESNT_MATCH;
}
namespace
@ -32,7 +36,7 @@ enum class Representation
LittleEndian
};
std::pair<int, int> determineBinaryStartIndexWithIncrement(const ptrdiff_t num_bytes, const Representation representation)
std::pair<int, int> determineBinaryStartIndexWithIncrement(ptrdiff_t num_bytes, Representation representation)
{
if (representation == Representation::BigEndian)
return {0, 1};
@ -42,7 +46,7 @@ std::pair<int, int> determineBinaryStartIndexWithIncrement(const ptrdiff_t num_b
throw DB::Exception(DB::ErrorCodes::LOGICAL_ERROR, "{} is not handled yet", magic_enum::enum_name(representation));
}
void formatHex(const std::span<const UInt8> src, UInt8 * dst, const Representation representation)
void formatHex(const std::span<const UInt8> src, UInt8 * dst, Representation representation)
{
const auto src_size = std::ssize(src);
const auto [src_start_index, src_increment] = determineBinaryStartIndexWithIncrement(src_size, representation);
@ -50,7 +54,7 @@ void formatHex(const std::span<const UInt8> src, UInt8 * dst, const Representati
writeHexByteLowercase(src[src_pos], dst + dst_pos);
}
void parseHex(const UInt8 * __restrict src, const std::span<UInt8> dst, const Representation representation)
void parseHex(const UInt8 * __restrict src, const std::span<UInt8> dst, Representation representation)
{
const auto dst_size = std::ssize(dst);
const auto [dst_start_index, dst_increment] = determineBinaryStartIndexWithIncrement(dst_size, representation);
@ -322,10 +326,191 @@ public:
}
};
class FunctionUUIDToNum : public IFunction
{
public:
static constexpr auto name = "UUIDToNum";
static FunctionPtr create(ContextPtr) { return std::make_shared<FunctionUUIDToNum>(); }
String getName() const override { return name; }
size_t getNumberOfArguments() const override { return 0; }
bool useDefaultImplementationForConstants() const override { return true; }
ColumnNumbers getArgumentsThatAreAlwaysConstant() const override { return {1}; }
bool isInjective(const ColumnsWithTypeAndName &) const override { return true; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return false; }
bool isVariadic() const override { return true; }
DataTypePtr getReturnTypeImpl(const DataTypes & arguments) const override
{
checkArgumentCount(arguments, name);
if (!isUUID(arguments[0]))
{
throw Exception(
ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT,
"Illegal type {} of first argument of function {}, expected UUID",
arguments[0]->getName(),
getName());
}
checkFormatArgument(arguments, name);
return std::make_shared<DataTypeFixedString>(uuid_bytes_length);
}
ColumnPtr executeImpl(const ColumnsWithTypeAndName & arguments, const DataTypePtr &, size_t /*input_rows_count*/) const override
{
const ColumnWithTypeAndName & col_type_name = arguments[0];
const ColumnPtr & column = col_type_name.column;
const bool defaultFormat = (parseVariant(arguments) == UUIDSerializer::Variant::Default);
if (const auto * col_in = checkAndGetColumn<ColumnUUID>(column.get()))
{
const auto & vec_in = col_in->getData();
const UUID * uuids = vec_in.data();
const size_t size = vec_in.size();
auto col_res = ColumnFixedString::create(uuid_bytes_length);
ColumnString::Chars & vec_res = col_res->getChars();
vec_res.resize(size * uuid_bytes_length);
size_t dst_offset = 0;
for (size_t i = 0; i < size; ++i)
{
uint64_t hiBytes = DB::UUIDHelpers::getHighBytes(uuids[i]);
uint64_t loBytes = DB::UUIDHelpers::getLowBytes(uuids[i]);
unalignedStoreBigEndian<uint64_t>(&vec_res[dst_offset], hiBytes);
unalignedStoreBigEndian<uint64_t>(&vec_res[dst_offset + sizeof(hiBytes)], loBytes);
if (!defaultFormat)
{
std::swap(vec_res[dst_offset], vec_res[dst_offset + 3]);
std::swap(vec_res[dst_offset + 1], vec_res[dst_offset + 2]);
std::swap(vec_res[dst_offset + 4], vec_res[dst_offset + 5]);
std::swap(vec_res[dst_offset + 6], vec_res[dst_offset + 7]);
}
dst_offset += uuid_bytes_length;
}
return col_res;
}
else
throw Exception(
ErrorCodes::ILLEGAL_COLUMN, "Illegal column {} of argument of function {}", arguments[0].column->getName(), getName());
}
};
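/// Editor's illustrative sketch (not part of the patch): the four std::swap calls above turn the
/// default big-endian byte order into the alternative (non-default) mixed-endian layout by
/// reversing the 4-byte group and swapping the two 2-byte groups; the helper name is hypothetical.
void toMixedEndianBytes(UInt8 * bytes) /// bytes points to the 16-byte big-endian representation
{
    std::swap(bytes[0], bytes[3]); /// reverse the 4-byte group
    std::swap(bytes[1], bytes[2]);
    std::swap(bytes[4], bytes[5]); /// swap the first 2-byte group
    std::swap(bytes[6], bytes[7]); /// swap the second 2-byte group
    /// bytes 8..15 are left in big-endian order
}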
class FunctionUUIDv7ToDateTime : public IFunction
{
public:
static constexpr auto name = "UUIDv7ToDateTime";
static FunctionPtr create(ContextPtr) { return std::make_shared<FunctionUUIDv7ToDateTime>(); }
static constexpr UInt32 datetime_scale = 3;
String getName() const override { return name; }
size_t getNumberOfArguments() const override { return 0; }
bool useDefaultImplementationForConstants() const override { return true; }
ColumnNumbers getArgumentsThatAreAlwaysConstant() const override { return {1}; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return false; }
bool isVariadic() const override { return true; }
DataTypePtr getReturnTypeImpl(const ColumnsWithTypeAndName & arguments) const override
{
if (arguments.empty() || arguments.size() > 2)
throw Exception(
ErrorCodes::NUMBER_OF_ARGUMENTS_DOESNT_MATCH, "Wrong number of arguments for function {}: should be 1 or 2", getName());
if (!checkAndGetDataType<DataTypeUUID>(arguments[0].type.get()))
{
throw Exception(
ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT,
"Illegal type {} of first argument of function {}, expected UUID",
arguments[0].type->getName(),
getName());
}
String timezone;
if (arguments.size() == 2)
{
timezone = extractTimeZoneNameFromColumn(arguments[1].column.get(), arguments[1].name);
if (timezone.empty())
throw Exception(
ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT,
"Function {} supports a 2nd argument (optional) that must be a valid time zone",
getName());
}
return std::make_shared<DataTypeDateTime64>(datetime_scale, timezone);
}
ColumnPtr executeImpl(const ColumnsWithTypeAndName & arguments, const DataTypePtr &, size_t /*input_rows_count*/) const override
{
const ColumnWithTypeAndName & col_type_name = arguments[0];
const ColumnPtr & column = col_type_name.column;
if (const auto * col_in = checkAndGetColumn<ColumnUUID>(column.get()))
{
const auto & vec_in = col_in->getData();
const UUID * uuids = vec_in.data();
const size_t size = vec_in.size();
auto col_res = ColumnDateTime64::create(size, datetime_scale);
auto & vec_res = col_res->getData();
for (size_t i = 0; i < size; ++i)
{
const uint64_t hiBytes = DB::UUIDHelpers::getHighBytes(uuids[i]);
const uint64_t ms = ((hiBytes & 0xf000) == 0x7000) ? (hiBytes >> 16) : 0;
vec_res[i] = DecimalUtils::decimalFromComponents<DateTime64>(ms / intExp10(datetime_scale), ms % intExp10(datetime_scale), datetime_scale);
}
return col_res;
}
else
throw Exception(
ErrorCodes::ILLEGAL_COLUMN, "Illegal column {} of argument of function {}", arguments[0].column->getName(), getName());
}
};
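/// Editor's illustrative sketch (not part of the patch): given the 8 high bytes of a UUID as
/// returned by UUIDHelpers::getHighBytes, the version nibble occupies bits 12..15 and the 48-bit
/// Unix millisecond timestamp the top 48 bits, which is what the expression above relies on.
uint64_t extractUUIDv7TimestampMs(uint64_t high_bytes)
{
    return ((high_bytes & 0xf000) == 0x7000) ? (high_bytes >> 16) : 0; /// 0 for non-v7 UUIDs
}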
REGISTER_FUNCTION(CodingUUID)
{
factory.registerFunction<FunctionUUIDNumToString>();
factory.registerFunction<FunctionUUIDStringToNum>();
factory.registerFunction<FunctionUUIDToNum>(
FunctionDocumentation{
.description = R"(
This function accepts a UUID and returns a FixedString(16) as its binary representation, with its format optionally specified by variant (Big-endian by default).
)",
.examples{
{"uuid",
"select toUUID(UUIDNumToString(toFixedString('a/<@];!~p{jTj={)', 16))) as uuid, UUIDToNum(uuid) as uuidNum, "
"UUIDToNum(uuid, 2) as uuidMsNum",
R"(
┌─uuid─────────────────────────────────┬─uuidNum──────────┬─uuidMsNum────────┐
│ 612f3c40-5d3b-217e-707b-6a546a3d7b29 │ a/<@];!~p{jTj={) │ @</a];!~p{jTj={) │
└──────────────────────────────────────┴──────────────────┴──────────────────┘
)"}},
.categories{"UUID"}},
FunctionFactory::CaseSensitive);
factory.registerFunction<FunctionUUIDv7ToDateTime>(
FunctionDocumentation{
.description = R"(
This function extracts the timestamp from a UUID and returns it as a DateTime64(3) typed value.
The function expects the UUID having version 7 to be provided as the first argument.
An optional second argument can be passed to specify a timezone for the timestamp.
)",
.examples{
{"uuid","select UUIDv7ToDateTime(generateUUIDv7())", ""},
{"uuid","select generateUUIDv7() as uuid, UUIDv7ToDateTime(uuid), UUIDv7ToDateTime(uuid, 'America/New_York')", ""}},
.categories{"UUID"}},
FunctionFactory::CaseSensitive);
}
}

View File

@ -1,15 +1,11 @@
#include <Functions/FunctionFactory.h>
#include <Functions/FunctionsRandom.h>
#include <DataTypes/DataTypeUUID.h>
#include <Functions/FunctionFactory.h>
#include <Functions/FunctionHelpers.h>
#include <Functions/FunctionsRandom.h>
namespace DB
{
namespace ErrorCodes
{
extern const int NUMBER_OF_ARGUMENTS_DOESNT_MATCH;
}
#define DECLARE_SEVERAL_IMPLEMENTATIONS(...) \
DECLARE_DEFAULT_CODE (__VA_ARGS__) \
DECLARE_AVX2_SPECIFIC_CODE(__VA_ARGS__)
@ -21,30 +17,26 @@ class FunctionGenerateUUIDv4 : public IFunction
public:
static constexpr auto name = "generateUUIDv4";
String getName() const override
{
return name;
}
String getName() const override { return name; }
size_t getNumberOfArguments() const override { return 0; }
bool isDeterministic() const override { return false; }
bool isDeterministicInScopeOfQuery() const override { return false; }
bool useDefaultImplementationForNulls() const override { return false; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return false; }
bool isVariadic() const override { return true; }
DataTypePtr getReturnTypeImpl(const DataTypes & arguments) const override
DataTypePtr getReturnTypeImpl(const ColumnsWithTypeAndName & arguments) const override
{
if (arguments.size() > 1)
throw Exception(ErrorCodes::NUMBER_OF_ARGUMENTS_DOESNT_MATCH,
"Number of arguments for function {} doesn't match: passed {}, should be 0 or 1.",
getName(), arguments.size());
FunctionArgumentDescriptors mandatory_args;
FunctionArgumentDescriptors optional_args{
{"expr", nullptr, nullptr, "Arbitrary Expression"}
};
validateFunctionArgumentTypes(*this, arguments, mandatory_args, optional_args);
return std::make_shared<DataTypeUUID>();
}
bool isDeterministic() const override { return false; }
ColumnPtr executeImpl(const ColumnsWithTypeAndName &, const DataTypePtr &, size_t input_rows_count) const override
{
auto col_res = ColumnVector<UUID>::create();
@ -79,10 +71,10 @@ public:
selector.registerImplementation<TargetArch::Default,
TargetSpecific::Default::FunctionGenerateUUIDv4>();
#if USE_MULTITARGET_CODE
#if USE_MULTITARGET_CODE
selector.registerImplementation<TargetArch::AVX2,
TargetSpecific::AVX2::FunctionGenerateUUIDv4>();
#endif
#endif
}
ColumnPtr executeImpl(const ColumnsWithTypeAndName & arguments, const DataTypePtr & result_type, size_t input_rows_count) const override

View File

@ -0,0 +1,289 @@
#include <DataTypes/DataTypeUUID.h>
#include <Functions/FunctionFactory.h>
#include <Functions/FunctionHelpers.h>
#include <Functions/FunctionsRandom.h>
namespace DB
{
namespace
{
/* Bit layouts of UUIDv7
without counter:
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                           unix_ts_ms                          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          unix_ts_ms          |  ver  |         rand_a         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|var|                         rand_b                            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                            rand_b                             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
with counter:
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                           unix_ts_ms                          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          unix_ts_ms          |  ver  |   counter_high_bits    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|var|                     counter_low_bits                      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                            rand_b                             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
*/
/// bit counts
constexpr auto rand_a_bits_count = 12;
constexpr auto rand_b_bits_count = 62;
constexpr auto rand_b_low_bits_count = 32;
constexpr auto counter_high_bits_count = rand_a_bits_count;
constexpr auto counter_low_bits_count = 30;
constexpr auto bits_in_counter = counter_high_bits_count + counter_low_bits_count;
constexpr uint64_t counter_limit = (1ull << bits_in_counter);
/// bit masks for UUIDv7 components
constexpr uint64_t variant_2_mask = (2ull << rand_b_bits_count);
constexpr uint64_t rand_a_bits_mask = (1ull << rand_a_bits_count) - 1;
constexpr uint64_t rand_b_bits_mask = (1ull << rand_b_bits_count) - 1;
constexpr uint64_t rand_b_with_counter_bits_mask = (1ull << rand_b_low_bits_count) - 1;
constexpr uint64_t counter_low_bits_mask = (1ull << counter_low_bits_count) - 1;
constexpr uint64_t counter_high_bits_mask = rand_a_bits_mask;
uint64_t getTimestampMillisecond()
{
timespec tp;
clock_gettime(CLOCK_REALTIME, &tp);
const uint64_t sec = tp.tv_sec;
return sec * 1000 + tp.tv_nsec / 1000000;
}
void setTimestampAndVersion(UUID & uuid, uint64_t timestamp)
{
UUIDHelpers::getHighBytes(uuid) = (UUIDHelpers::getHighBytes(uuid) & rand_a_bits_mask) | (timestamp << 16) | 0x7000;
}
void setVariant(UUID & uuid)
{
UUIDHelpers::getLowBytes(uuid) = (UUIDHelpers::getLowBytes(uuid) & rand_b_bits_mask) | variant_2_mask;
}
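/// Editor's illustrative sketch (not part of the patch): combined effect of setTimestampAndVersion
/// and setVariant on a zero-initialized UUID, expressed with the masks defined above. The high word
/// packs unix_ts_ms | ver | rand_a, the low word packs var | rand_b; the helper name is hypothetical.
std::pair<uint64_t, uint64_t> packUUIDv7Words(uint64_t timestamp_ms, uint64_t rand_a, uint64_t rand_b)
{
    const uint64_t high = (timestamp_ms << 16) | 0x7000 | (rand_a & rand_a_bits_mask);
    const uint64_t low = variant_2_mask | (rand_b & rand_b_bits_mask);
    return {high, low};
}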
struct FillAllRandomPolicy
{
static constexpr auto name = "generateUUIDv7NonMonotonic";
static constexpr auto doc_description = R"(Generates a UUID of version 7. The generated UUID contains the current Unix timestamp in milliseconds (48 bits), followed by version "7" (4 bits), and a random field (74 bit, including a 2-bit variant field "2") to distinguish UUIDs within a millisecond. This function is the fastest generateUUIDv7* function but it gives no monotonicity guarantees within a timestamp.)";
struct Data
{
void generate(UUID & uuid, uint64_t ts)
{
setTimestampAndVersion(uuid, ts);
setVariant(uuid);
}
};
};
struct CounterFields
{
uint64_t last_timestamp = 0;
uint64_t counter = 0;
void resetCounter(const UUID & uuid)
{
const uint64_t counter_low_bits = (UUIDHelpers::getLowBytes(uuid) >> rand_b_low_bits_count) & counter_low_bits_mask;
const uint64_t counter_high_bits = UUIDHelpers::getHighBytes(uuid) & counter_high_bits_mask;
counter = (counter_high_bits << 30) | counter_low_bits;
}
void incrementCounter(UUID & uuid)
{
if (++counter == counter_limit) [[unlikely]]
{
++last_timestamp;
resetCounter(uuid);
setTimestampAndVersion(uuid, last_timestamp);
setVariant(uuid);
}
else
{
UUIDHelpers::getHighBytes(uuid) = (last_timestamp << 16) | 0x7000 | (counter >> counter_low_bits_count);
UUIDHelpers::getLowBytes(uuid) = (UUIDHelpers::getLowBytes(uuid) & rand_b_with_counter_bits_mask) | variant_2_mask | ((counter & counter_low_bits_mask) << rand_b_low_bits_count);
}
}
void generate(UUID & uuid, uint64_t timestamp)
{
const bool need_to_increment_counter = (last_timestamp == timestamp) || ((last_timestamp > timestamp) & (last_timestamp < timestamp + 10000));
if (need_to_increment_counter)
{
incrementCounter(uuid);
}
else
{
last_timestamp = timestamp;
resetCounter(uuid);
setTimestampAndVersion(uuid, last_timestamp);
setVariant(uuid);
}
}
};
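/// Editor's illustrative sketch (not part of the patch): the 42-bit counter is stored split across
/// the UUID, its high 12 bits in the rand_a field and its low 30 bits in the top of rand_b, so
/// reading it out and writing it back (as resetCounter/incrementCounter do) is a pair of shifts and
/// masks using the constants defined above. The helper names are hypothetical.
uint64_t mergeCounterBits(uint64_t counter_high_bits, uint64_t counter_low_bits)
{
    return (counter_high_bits << counter_low_bits_count) | counter_low_bits;
}
std::pair<uint64_t, uint64_t> splitCounterBits(uint64_t counter) /// -> {high 12 bits, low 30 bits}
{
    return {counter >> counter_low_bits_count, counter & counter_low_bits_mask};
}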
struct GlobalCounterPolicy
{
static constexpr auto name = "generateUUIDv7";
static constexpr auto doc_description = R"(Generates a UUID of version 7. The generated UUID contains the current Unix timestamp in milliseconds (48 bits), followed by version "7" (4 bits), a counter (42 bit, including a variant field "2", 2 bit) to distinguish UUIDs within a millisecond, and a random field (32 bits). For any given timestamp (unix_ts_ms), the counter starts at a random value and is incremented by 1 for each new UUID until the timestamp changes. In case the counter overflows, the timestamp field is incremented by 1 and the counter is reset to a random new start value. Function generateUUIDv7 guarantees that the counter field within a timestamp increments monotonically across all function invocations in concurrently running threads and queries.)";
/// Guarantee counter monotonicity within one timestamp across all threads generating UUIDv7 simultaneously.
struct Data
{
static inline CounterFields fields;
static inline SharedMutex mutex; /// works a little bit faster than std::mutex here
std::lock_guard<SharedMutex> guard;
Data()
: guard(mutex)
{}
void generate(UUID & uuid, uint64_t timestamp)
{
fields.generate(uuid, timestamp);
}
};
};
struct ThreadLocalCounterPolicy
{
static constexpr auto name = "generateUUIDv7ThreadMonotonic";
static constexpr auto doc_description = R"(Generates a UUID of version 7. The generated UUID contains the current Unix timestamp in milliseconds (48 bits), followed by version "7" (4 bits), a counter (42 bit, including a variant field "2", 2 bit) to distinguish UUIDs within a millisecond, and a random field (32 bits). For any given timestamp (unix_ts_ms), the counter starts at a random value and is incremented by 1 for each new UUID until the timestamp changes. In case the counter overflows, the timestamp field is incremented by 1 and the counter is reset to a random new start value. This function behaves like generateUUIDv7 but gives no guarantee on counter monotonicity across different simultaneous requests. Monotonicity within one timestamp is guaranteed only within the same thread calling this function to generate UUIDs.)";
/// Guarantee counter monotonicity within one timestamp within the same thread. Faster than GlobalCounterPolicy if a query uses multiple threads.
struct Data
{
static inline thread_local CounterFields fields;
void generate(UUID & uuid, uint64_t timestamp)
{
fields.generate(uuid, timestamp);
}
};
};
}
#define DECLARE_SEVERAL_IMPLEMENTATIONS(...) \
DECLARE_DEFAULT_CODE (__VA_ARGS__) \
DECLARE_AVX2_SPECIFIC_CODE(__VA_ARGS__)
DECLARE_SEVERAL_IMPLEMENTATIONS(
template <typename FillPolicy>
class FunctionGenerateUUIDv7Base : public IFunction, public FillPolicy
{
public:
String getName() const final { return FillPolicy::name; }
size_t getNumberOfArguments() const final { return 0; }
bool isDeterministic() const override { return false; }
bool isDeterministicInScopeOfQuery() const final { return false; }
bool useDefaultImplementationForNulls() const final { return false; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const final { return false; }
bool isVariadic() const final { return true; }
DataTypePtr getReturnTypeImpl(const ColumnsWithTypeAndName & arguments) const override
{
FunctionArgumentDescriptors mandatory_args;
FunctionArgumentDescriptors optional_args{
{"expr", nullptr, nullptr, "Arbitrary Expression"}
};
validateFunctionArgumentTypes(*this, arguments, mandatory_args, optional_args);
return std::make_shared<DataTypeUUID>();
}
ColumnPtr executeImpl(const ColumnsWithTypeAndName &, const DataTypePtr &, size_t input_rows_count) const override
{
auto col_res = ColumnVector<UUID>::create();
typename ColumnVector<UUID>::Container & vec_to = col_res->getData();
if (input_rows_count)
{
vec_to.resize(input_rows_count);
/// Not all random bytes produced here are required for the UUIDv7 but it's the simplest way to get the required number of them by using RandImpl
RandImpl::execute(reinterpret_cast<char *>(vec_to.data()), vec_to.size() * sizeof(UUID));
/// Note: For performance reasons, clock_gettime is called once per chunk instead of once per UUID. This reduces precision but
/// it still complies with the UUID standard.
uint64_t timestamp = getTimestampMillisecond();
for (UUID & uuid : vec_to)
{
typename FillPolicy::Data data;
data.generate(uuid, timestamp);
}
}
return col_res;
}
};
) // DECLARE_SEVERAL_IMPLEMENTATIONS
#undef DECLARE_SEVERAL_IMPLEMENTATIONS
template <typename FillPolicy>
class FunctionGenerateUUIDv7Base : public TargetSpecific::Default::FunctionGenerateUUIDv7Base<FillPolicy>
{
public:
using Self = FunctionGenerateUUIDv7Base<FillPolicy>;
using Parent = TargetSpecific::Default::FunctionGenerateUUIDv7Base<FillPolicy>;
explicit FunctionGenerateUUIDv7Base(ContextPtr context) : selector(context)
{
selector.registerImplementation<TargetArch::Default, Parent>();
#if USE_MULTITARGET_CODE
using ParentAVX2 = TargetSpecific::AVX2::FunctionGenerateUUIDv7Base<FillPolicy>;
selector.registerImplementation<TargetArch::AVX2, ParentAVX2>();
#endif
}
ColumnPtr executeImpl(const ColumnsWithTypeAndName & arguments, const DataTypePtr & result_type, size_t input_rows_count) const override
{
return selector.selectAndExecute(arguments, result_type, input_rows_count);
}
static FunctionPtr create(ContextPtr context)
{
return std::make_shared<Self>(context);
}
private:
ImplementationSelector<IFunction> selector;
};
template<typename FillPolicy>
void registerUUIDv7Generator(auto& factory)
{
static constexpr auto doc_syntax_format = "{}([expression])";
static constexpr auto example_format = "SELECT {}()";
static constexpr auto multiple_example_format = "SELECT {f}(1), {f}(2)";
FunctionDocumentation::Description doc_description = FillPolicy::doc_description;
FunctionDocumentation::Syntax doc_syntax = fmt::format(doc_syntax_format, FillPolicy::name);
FunctionDocumentation::Arguments doc_arguments = {{"expression", "The expression is used to bypass common subexpression elimination if the function is called multiple times in a query but otherwise ignored. Optional."}};
FunctionDocumentation::ReturnedValue doc_returned_value = "A value of type UUID version 7.";
FunctionDocumentation::Examples doc_examples = {{"uuid", fmt::format(example_format, FillPolicy::name), ""}, {"multiple", fmt::format(multiple_example_format, fmt::arg("f", FillPolicy::name)), ""}};
FunctionDocumentation::Categories doc_categories = {"UUID"};
factory.template registerFunction<FunctionGenerateUUIDv7Base<FillPolicy>>({doc_description, doc_syntax, doc_arguments, doc_returned_value, doc_examples, doc_categories}, FunctionFactory::CaseInsensitive);
}
REGISTER_FUNCTION(GenerateUUIDv7)
{
registerUUIDv7Generator<GlobalCounterPolicy>(factory);
registerUUIDv7Generator<ThreadLocalCounterPolicy>(factory);
registerUUIDv7Generator<FillAllRandomPolicy>(factory);
}
}

View File

@ -19,7 +19,7 @@ using FunctionLocate = FunctionsStringSearch<PositionImpl<NameLocate, PositionCa
REGISTER_FUNCTION(Locate)
{
FunctionDocumentation::Description doc_description = "Like function `position` but with arguments `haystack` and `locate` switched. The behavior of this function depends on the ClickHouse version: In versions < v24.3, `locate` was an alias of function `position` and accepted arguments `(haystack, needle[, start_pos])`. In versions >= 24.3,, `locate` is an individual function (for better compatibility with MySQL) and accepts arguments `(needle, haystack[, start_pos])`. The previous behaviorcan be restored using setting `function_locate_has_mysql_compatible_argument_order = false`.";
FunctionDocumentation::Description doc_description = "Like function `position` but with arguments `haystack` and `needle` switched. The behavior of this function depends on the ClickHouse version: In versions < v24.3, `locate` was an alias of function `position` and accepted arguments `(haystack, needle[, start_pos])`. In versions >= 24.3, `locate` is an individual function (for better compatibility with MySQL) and accepts arguments `(needle, haystack[, start_pos])`. The previous behavior can be restored using setting `function_locate_has_mysql_compatible_argument_order = false`.";
FunctionDocumentation::Syntax doc_syntax = "locate(needle, haystack[, start_pos])";
FunctionDocumentation::Arguments doc_arguments = {{"needle", "Substring to be searched (String)"},
{"haystack", "String in which the search is performed (String)."},

View File

@ -46,7 +46,6 @@ namespace
const String & dest_blob_,
std::shared_ptr<const AzureObjectStorageSettings> settings_,
ThreadPoolCallbackRunnerUnsafe<void> schedule_,
bool for_disk_azure_blob_storage_,
const Poco::Logger * log_)
: create_read_buffer(create_read_buffer_)
, client(client_)
@ -56,7 +55,6 @@ namespace
, dest_blob(dest_blob_)
, settings(settings_)
, schedule(schedule_)
, for_disk_azure_blob_storage(for_disk_azure_blob_storage_)
, log(log_)
, max_single_part_upload_size(settings_->max_single_part_upload_size)
{
@ -73,7 +71,6 @@ namespace
const String & dest_blob;
std::shared_ptr<const AzureObjectStorageSettings> settings;
ThreadPoolCallbackRunnerUnsafe<void> schedule;
bool for_disk_azure_blob_storage;
const Poco::Logger * log;
size_t max_single_part_upload_size;
@ -217,7 +214,7 @@ namespace
void processUploadPartRequest(UploadPartTask & task)
{
ProfileEvents::increment(ProfileEvents::AzureUploadPart);
if (for_disk_azure_blob_storage)
if (client->GetClickhouseOptions().IsClientForDisk)
ProfileEvents::increment(ProfileEvents::DiskAzureUploadPart);
auto block_blob_client = client->GetBlockBlobClient(dest_blob);
@ -269,10 +266,9 @@ void copyDataToAzureBlobStorageFile(
const String & dest_container_for_logging,
const String & dest_blob,
std::shared_ptr<const AzureObjectStorageSettings> settings,
ThreadPoolCallbackRunnerUnsafe<void> schedule,
bool for_disk_azure_blob_storage)
ThreadPoolCallbackRunnerUnsafe<void> schedule)
{
UploadHelper helper{create_read_buffer, dest_client, offset, size, dest_container_for_logging, dest_blob, settings, schedule, for_disk_azure_blob_storage, &Poco::Logger::get("copyDataToAzureBlobStorageFile")};
UploadHelper helper{create_read_buffer, dest_client, offset, size, dest_container_for_logging, dest_blob, settings, schedule, &Poco::Logger::get("copyDataToAzureBlobStorageFile")};
helper.performCopy();
}
@ -288,14 +284,13 @@ void copyAzureBlobStorageFile(
const String & dest_blob,
std::shared_ptr<const AzureObjectStorageSettings> settings,
const ReadSettings & read_settings,
ThreadPoolCallbackRunnerUnsafe<void> schedule,
bool for_disk_azure_blob_storage)
ThreadPoolCallbackRunnerUnsafe<void> schedule)
{
if (settings->use_native_copy)
{
ProfileEvents::increment(ProfileEvents::AzureCopyObject);
if (for_disk_azure_blob_storage)
if (dest_client->GetClickhouseOptions().IsClientForDisk)
ProfileEvents::increment(ProfileEvents::DiskAzureCopyObject);
auto block_blob_client_src = src_client->GetBlockBlobClient(src_blob);
@ -330,7 +325,7 @@ void copyAzureBlobStorageFile(
settings->max_single_download_retries);
};
UploadHelper helper{create_read_buffer, dest_client, offset, size, dest_container_for_logging, dest_blob, settings, schedule, for_disk_azure_blob_storage, &Poco::Logger::get("copyAzureBlobStorageFile")};
UploadHelper helper{create_read_buffer, dest_client, offset, size, dest_container_for_logging, dest_blob, settings, schedule, &Poco::Logger::get("copyAzureBlobStorageFile")};
helper.performCopy();
}
}

View File

@ -31,8 +31,7 @@ void copyAzureBlobStorageFile(
const String & dest_blob,
std::shared_ptr<const AzureObjectStorageSettings> settings,
const ReadSettings & read_settings,
ThreadPoolCallbackRunnerUnsafe<void> schedule_ = {},
bool for_disk_azure_blob_storage = false);
ThreadPoolCallbackRunnerUnsafe<void> schedule_ = {});
/// Copies data from any seekable source to AzureBlobStorage.
@ -48,8 +47,7 @@ void copyDataToAzureBlobStorageFile(
const String & dest_container_for_logging,
const String & dest_blob,
std::shared_ptr<const AzureObjectStorageSettings> settings,
ThreadPoolCallbackRunnerUnsafe<void> schedule_ = {},
bool for_disk_azure_blob_storage = false);
ThreadPoolCallbackRunnerUnsafe<void> schedule_ = {});
}

View File

@ -314,7 +314,7 @@ size_t ReadBufferFromS3::getFileSize()
if (file_size)
return *file_size;
auto object_size = S3::getObjectSize(*client_ptr, bucket, key, version_id, request_settings, /* for_disk_s3= */ read_settings.for_object_storage);
auto object_size = S3::getObjectSize(*client_ptr, bucket, key, version_id, request_settings);
file_size = object_size;
return *file_size;
@ -415,7 +415,7 @@ Aws::S3::Model::GetObjectResult ReadBufferFromS3::sendRequest(size_t attempt, si
}
ProfileEvents::increment(ProfileEvents::S3GetObject);
if (read_settings.for_object_storage)
if (client_ptr->isClientForDisk())
ProfileEvents::increment(ProfileEvents::DiskS3GetObject);
ProfileEventTimeIncrement<Microseconds> watch(ProfileEvents::ReadBufferFromS3InitMicroseconds);

View File

@ -127,9 +127,6 @@ struct ReadSettings
bool http_skip_not_found_url_for_globs = true;
bool http_make_head_request = true;
/// Monitoring
bool for_object_storage = false; // to choose which profile events should be incremented
ReadSettings adjustBufferSize(size_t file_size) const
{
ReadSettings res = *this;

View File

@ -384,7 +384,8 @@ Model::HeadObjectOutcome Client::HeadObject(HeadObjectRequest & request) const
/// The next call is NOT a recursive call
/// This is a virtual call Aws::S3::S3Client::HeadObject(const Model::HeadObjectRequest&)
return HeadObject(static_cast<const Model::HeadObjectRequest&>(request));
return enrichErrorMessage(
HeadObject(static_cast<const Model::HeadObjectRequest&>(request)));
}
/// For each request, we wrap the request functions from Aws::S3::Client with doRequest
@ -404,7 +405,8 @@ Model::ListObjectsOutcome Client::ListObjects(ListObjectsRequest & request) cons
Model::GetObjectOutcome Client::GetObject(GetObjectRequest & request) const
{
return doRequest(request, [this](const Model::GetObjectRequest & req) { return GetObject(req); });
return enrichErrorMessage(
doRequest(request, [this](const Model::GetObjectRequest & req) { return GetObject(req); }));
}
Model::AbortMultipartUploadOutcome Client::AbortMultipartUpload(AbortMultipartUploadRequest & request) const
@ -652,14 +654,14 @@ Client::doRequestWithRetryNetworkErrors(RequestType & request, RequestFn request
if constexpr (IsReadMethod)
{
if (client_configuration.for_disk_s3)
if (isClientForDisk())
ProfileEvents::increment(ProfileEvents::DiskS3ReadRequestsErrors);
else
ProfileEvents::increment(ProfileEvents::S3ReadRequestsErrors);
}
else
{
if (client_configuration.for_disk_s3)
if (isClientForDisk())
ProfileEvents::increment(ProfileEvents::DiskS3WriteRequestsErrors);
else
ProfileEvents::increment(ProfileEvents::S3WriteRequestsErrors);
@ -689,6 +691,23 @@ Client::doRequestWithRetryNetworkErrors(RequestType & request, RequestFn request
return doRequest(request, with_retries);
}
template <typename RequestResult>
RequestResult Client::enrichErrorMessage(RequestResult && outcome) const
{
if (outcome.IsSuccess() || !isClientForDisk())
return std::forward<RequestResult>(outcome);
String enriched_message = fmt::format(
"{} {}",
outcome.GetError().GetMessage(),
"This error happened for S3 disk.");
auto error = outcome.GetError();
error.SetMessage(enriched_message);
return RequestResult(error);
}
bool Client::supportsMultiPartCopy() const
{
return provider_type != ProviderType::GCS;

View File

@ -214,6 +214,11 @@ public:
bool isS3ExpressBucket() const { return client_settings.is_s3express_bucket; }
bool isClientForDisk() const
{
return client_configuration.for_disk_s3;
}
private:
friend struct ::MockS3::Client;
@ -265,6 +270,9 @@ private:
bool checkIfWrongRegionDefined(const std::string & bucket, const Aws::S3::S3Error & error, std::string & region) const;
void insertRegionOverride(const std::string & bucket, const std::string & region) const;
template <typename RequestResult>
RequestResult enrichErrorMessage(RequestResult && outcome) const;
String initial_endpoint;
std::shared_ptr<Aws::Auth::AWSCredentialsProvider> credentials_provider;
PocoHTTPClientConfiguration client_configuration;

View File

@ -140,7 +140,7 @@ namespace
fillCreateMultipartRequest(request);
ProfileEvents::increment(ProfileEvents::S3CreateMultipartUpload);
if (for_disk_s3)
if (client_ptr->isClientForDisk())
ProfileEvents::increment(ProfileEvents::DiskS3CreateMultipartUpload);
auto outcome = client_ptr->CreateMultipartUpload(request);
@ -189,7 +189,7 @@ namespace
for (size_t retries = 1;; ++retries)
{
ProfileEvents::increment(ProfileEvents::S3CompleteMultipartUpload);
if (for_disk_s3)
if (client_ptr->isClientForDisk())
ProfileEvents::increment(ProfileEvents::DiskS3CompleteMultipartUpload);
auto outcome = client_ptr->CompleteMultipartUpload(request);
@ -239,7 +239,7 @@ namespace
void checkObjectAfterUpload()
{
LOG_TRACE(log, "Checking object {} exists after upload", dest_key);
S3::checkObjectExists(*client_ptr, dest_bucket, dest_key, {}, request_settings, {}, "Immediately after upload");
S3::checkObjectExists(*client_ptr, dest_bucket, dest_key, {}, request_settings, "Immediately after upload");
LOG_TRACE(log, "Object {} exists after upload", dest_key);
}
@ -528,7 +528,7 @@ namespace
for (size_t retries = 1;; ++retries)
{
ProfileEvents::increment(ProfileEvents::S3PutObject);
if (for_disk_s3)
if (client_ptr->isClientForDisk())
ProfileEvents::increment(ProfileEvents::DiskS3PutObject);
Stopwatch watch;
@ -615,7 +615,7 @@ namespace
auto & req = typeid_cast<S3::UploadPartRequest &>(request);
ProfileEvents::increment(ProfileEvents::S3UploadPart);
if (for_disk_s3)
if (client_ptr->isClientForDisk())
ProfileEvents::increment(ProfileEvents::DiskS3UploadPart);
auto outcome = client_ptr->UploadPart(req);
@ -726,7 +726,7 @@ namespace
for (size_t retries = 1;; ++retries)
{
ProfileEvents::increment(ProfileEvents::S3CopyObject);
if (for_disk_s3)
if (client_ptr->isClientForDisk())
ProfileEvents::increment(ProfileEvents::DiskS3CopyObject);
auto outcome = client_ptr->CopyObject(request);
@ -830,7 +830,7 @@ namespace
auto & req = typeid_cast<S3::UploadPartCopyRequest &>(request);
ProfileEvents::increment(ProfileEvents::S3UploadPartCopy);
if (for_disk_s3)
if (client_ptr->isClientForDisk())
ProfileEvents::increment(ProfileEvents::DiskS3UploadPartCopy);
auto outcome = client_ptr->UploadPartCopy(req);

View File

@ -25,10 +25,10 @@ namespace DB::S3
namespace
{
Aws::S3::Model::HeadObjectOutcome headObject(
const S3::Client & client, const String & bucket, const String & key, const String & version_id, bool for_disk_s3)
const S3::Client & client, const String & bucket, const String & key, const String & version_id)
{
ProfileEvents::increment(ProfileEvents::S3HeadObject);
if (for_disk_s3)
if (client.isClientForDisk())
ProfileEvents::increment(ProfileEvents::DiskS3HeadObject);
S3::HeadObjectRequest req;
@ -44,9 +44,9 @@ namespace
/// Performs a request to get the size and last modification time of an object.
std::pair<std::optional<ObjectInfo>, Aws::S3::S3Error> tryGetObjectInfo(
const S3::Client & client, const String & bucket, const String & key, const String & version_id,
const S3Settings::RequestSettings & /*request_settings*/, bool with_metadata, bool for_disk_s3)
const S3Settings::RequestSettings & /*request_settings*/, bool with_metadata)
{
auto outcome = headObject(client, bucket, key, version_id, for_disk_s3);
auto outcome = headObject(client, bucket, key, version_id);
if (!outcome.IsSuccess())
return {std::nullopt, outcome.GetError()};
@ -75,10 +75,9 @@ ObjectInfo getObjectInfo(
const String & version_id,
const S3Settings::RequestSettings & request_settings,
bool with_metadata,
bool for_disk_s3,
bool throw_on_error)
{
auto [object_info, error] = tryGetObjectInfo(client, bucket, key, version_id, request_settings, with_metadata, for_disk_s3);
auto [object_info, error] = tryGetObjectInfo(client, bucket, key, version_id, request_settings, with_metadata);
if (object_info)
{
return *object_info;
@ -98,10 +97,9 @@ size_t getObjectSize(
const String & key,
const String & version_id,
const S3Settings::RequestSettings & request_settings,
bool for_disk_s3,
bool throw_on_error)
{
return getObjectInfo(client, bucket, key, version_id, request_settings, {}, for_disk_s3, throw_on_error).size;
return getObjectInfo(client, bucket, key, version_id, request_settings, {}, throw_on_error).size;
}
bool objectExists(
@ -109,10 +107,9 @@ bool objectExists(
const String & bucket,
const String & key,
const String & version_id,
const S3Settings::RequestSettings & request_settings,
bool for_disk_s3)
const S3Settings::RequestSettings & request_settings)
{
auto [object_info, error] = tryGetObjectInfo(client, bucket, key, version_id, request_settings, {}, for_disk_s3);
auto [object_info, error] = tryGetObjectInfo(client, bucket, key, version_id, request_settings, {});
if (object_info)
return true;
@ -130,10 +127,9 @@ void checkObjectExists(
const String & key,
const String & version_id,
const S3Settings::RequestSettings & request_settings,
bool for_disk_s3,
std::string_view description)
{
auto [object_info, error] = tryGetObjectInfo(client, bucket, key, version_id, request_settings, {}, for_disk_s3);
auto [object_info, error] = tryGetObjectInfo(client, bucket, key, version_id, request_settings, {});
if (object_info)
return;
throw S3Exception(error.GetErrorType(), "{}Object {} in bucket {} suddenly disappeared: {}",

View File

@ -26,7 +26,6 @@ ObjectInfo getObjectInfo(
const String & version_id = {},
const S3Settings::RequestSettings & request_settings = {},
bool with_metadata = false,
bool for_disk_s3 = false,
bool throw_on_error = true);
size_t getObjectSize(
@ -35,7 +34,6 @@ size_t getObjectSize(
const String & key,
const String & version_id = {},
const S3Settings::RequestSettings & request_settings = {},
bool for_disk_s3 = false,
bool throw_on_error = true);
bool objectExists(
@ -43,8 +41,7 @@ bool objectExists(
const String & bucket,
const String & key,
const String & version_id = {},
const S3Settings::RequestSettings & request_settings = {},
bool for_disk_s3 = false);
const S3Settings::RequestSettings & request_settings = {});
/// Throws an exception if a specified object doesn't exist. `description` is used as a part of the error message.
void checkObjectExists(
@ -53,7 +50,6 @@ void checkObjectExists(
const String & key,
const String & version_id = {},
const S3Settings::RequestSettings & request_settings = {},
bool for_disk_s3 = false,
std::string_view description = {});
bool isNotFoundError(Aws::S3::S3Errors error);

View File

@ -214,9 +214,9 @@ void WriteBufferFromS3::finalizeImpl()
if (request_settings.check_objects_after_upload)
{
S3::checkObjectExists(*client_ptr, bucket, key, {}, request_settings, /* for_disk_s3= */ write_settings.for_object_storage, "Immediately after upload");
S3::checkObjectExists(*client_ptr, bucket, key, {}, request_settings, "Immediately after upload");
size_t actual_size = S3::getObjectSize(*client_ptr, bucket, key, {}, request_settings, /* for_disk_s3= */ write_settings.for_object_storage);
size_t actual_size = S3::getObjectSize(*client_ptr, bucket, key, {}, request_settings);
if (actual_size != total_size)
throw Exception(
ErrorCodes::S3_ERROR,
@ -390,7 +390,7 @@ void WriteBufferFromS3::createMultipartUpload()
client_ptr->setKMSHeaders(req);
ProfileEvents::increment(ProfileEvents::S3CreateMultipartUpload);
if (write_settings.for_object_storage)
if (client_ptr->isClientForDisk())
ProfileEvents::increment(ProfileEvents::DiskS3CreateMultipartUpload);
Stopwatch watch;
@ -429,7 +429,7 @@ void WriteBufferFromS3::abortMultipartUpload()
req.SetUploadId(multipart_upload_id);
ProfileEvents::increment(ProfileEvents::S3AbortMultipartUpload);
if (write_settings.for_object_storage)
if (client_ptr->isClientForDisk())
ProfileEvents::increment(ProfileEvents::DiskS3AbortMultipartUpload);
Stopwatch watch;
@ -530,7 +530,7 @@ void WriteBufferFromS3::writePart(WriteBufferFromS3::PartData && data)
getShortLogDetails(), data_size, part_number);
ProfileEvents::increment(ProfileEvents::S3UploadPart);
if (write_settings.for_object_storage)
if (client_ptr->isClientForDisk())
ProfileEvents::increment(ProfileEvents::DiskS3UploadPart);
auto & request = std::get<0>(*worker_data);
@ -606,7 +606,7 @@ void WriteBufferFromS3::completeMultipartUpload()
for (size_t i = 0; i < max_retry; ++i)
{
ProfileEvents::increment(ProfileEvents::S3CompleteMultipartUpload);
if (write_settings.for_object_storage)
if (client_ptr->isClientForDisk())
ProfileEvents::increment(ProfileEvents::DiskS3CompleteMultipartUpload);
Stopwatch watch;
@ -689,7 +689,7 @@ void WriteBufferFromS3::makeSinglepartUpload(WriteBufferFromS3::PartData && data
for (size_t i = 0; i < max_retry; ++i)
{
ProfileEvents::increment(ProfileEvents::S3PutObject);
if (write_settings.for_object_storage)
if (client_ptr->isClientForDisk())
ProfileEvents::increment(ProfileEvents::DiskS3PutObject);
ResourceCost cost = request.GetContentLength();

View File

@ -25,9 +25,6 @@ struct WriteSettings
bool s3_allow_parallel_part_upload = true;
bool azure_allow_parallel_part_upload = true;
/// Monitoring
bool for_object_storage = false; // to choose which profile events should be incremented
bool operator==(const WriteSettings & other) const = default;
};

View File

@ -615,12 +615,16 @@ static void executeAction(const ExpressionActions::Action & action, ExecutionCon
res_column.column = action.node->function->execute(arguments, res_column.type, num_rows, dry_run);
if (res_column.column->getDataType() != res_column.type->getColumnType())
{
throw Exception(
ErrorCodes::LOGICAL_ERROR,
"Unexpected return type from {}. Expected {}. Got {}",
"Unexpected return type from {}. Expected {}. Got {}. Action:\n{},\ninput block structure:{}",
action.node->function->getName(),
res_column.type->getColumnType(),
res_column.column->getDataType());
res_column.type->getName(),
res_column.column->getName(),
action.toString(),
Block(arguments).dumpStructure());
}
}
break;
}

View File

@ -7,8 +7,8 @@
#include <Disks/DiskLocal.h>
#include <Interpreters/GinFilter.h>
#include <Storages/MergeTree/GinIndexStore.h>
#include <Storages/MergeTree/MergeTreeIndexBloomFilterText.h>
#include <Storages/MergeTree/MergeTreeIndexFullText.h>
#include <Storages/MergeTree/MergeTreeIndexInverted.h>
#include <string>
#include <algorithm>
#include <city.h>

View File

@ -16,14 +16,17 @@
#include <DataTypes/DataTypeNullable.h>
#include <DataTypes/DataTypesNumber.h>
#include <DataTypes/DataTypeLowCardinality.h>
#include <DataTypes/DataTypeTuple.h>
#include <Interpreters/ExpressionActions.h>
#include <Interpreters/HashJoin.h>
#include <Interpreters/JoinUtils.h>
#include <Interpreters/TableJoin.h>
#include <Interpreters/joinDispatch.h>
#include <Interpreters/NullableUtils.h>
#include <Interpreters/RowRefs.h>
#include <Storages/IStorage.h>
@ -50,6 +53,7 @@ namespace ErrorCodes
extern const int SET_SIZE_LIMIT_EXCEEDED;
extern const int TYPE_MISMATCH;
extern const int NUMBER_OF_ARGUMENTS_DOESNT_MATCH;
extern const int INVALID_JOIN_ON_EXPRESSION;
}
namespace
@ -119,14 +123,14 @@ namespace JoinStuff
}
}
template <bool use_flags, bool multiple_disjuncts, typename FindResult>
template <bool use_flags, bool flag_per_row, typename FindResult>
void JoinUsedFlags::setUsed(const FindResult & f)
{
if constexpr (!use_flags)
return;
/// Could be set simultaneously from different threads.
if constexpr (multiple_disjuncts)
if constexpr (flag_per_row)
{
auto & mapped = f.getMapped();
flags[mapped.block][mapped.row_num].store(true, std::memory_order_relaxed);
@ -137,14 +141,14 @@ namespace JoinStuff
}
}
template <bool use_flags, bool multiple_disjuncts>
template <bool use_flags, bool flag_per_row>
void JoinUsedFlags::setUsed(const Block * block, size_t row_num, size_t offset)
{
if constexpr (!use_flags)
return;
/// Could be set simultaneously from different threads.
if constexpr (multiple_disjuncts)
if constexpr (flag_per_row)
{
flags[block][row_num].store(true, std::memory_order_relaxed);
}
@ -154,13 +158,13 @@ namespace JoinStuff
}
}
template <bool use_flags, bool multiple_disjuncts, typename FindResult>
template <bool use_flags, bool flag_per_row, typename FindResult>
bool JoinUsedFlags::getUsed(const FindResult & f)
{
if constexpr (!use_flags)
return true;
if constexpr (multiple_disjuncts)
if constexpr (flag_per_row)
{
auto & mapped = f.getMapped();
return flags[mapped.block][mapped.row_num].load();
@ -171,13 +175,13 @@ namespace JoinStuff
}
}
template <bool use_flags, bool multiple_disjuncts, typename FindResult>
template <bool use_flags, bool flag_per_row, typename FindResult>
bool JoinUsedFlags::setUsedOnce(const FindResult & f)
{
if constexpr (!use_flags)
return true;
if constexpr (multiple_disjuncts)
if constexpr (flag_per_row)
{
auto & mapped = f.getMapped();
@ -253,6 +257,8 @@ HashJoin::HashJoin(std::shared_ptr<TableJoin> table_join_, const Block & right_s
LOG_TRACE(log, "{}Keys: {}, datatype: {}, kind: {}, strictness: {}, right header: {}",
instance_log_id, TableJoin::formatClauses(table_join->getClauses(), true), data->type, kind, strictness, right_sample_block.dumpStructure());
validateAdditionalFilterExpression(table_join->getMixedJoinExpression());
if (isCrossOrComma(kind))
{
data->type = Type::CROSS;
@ -705,7 +711,8 @@ void HashJoin::initRightBlockStructure(Block & saved_block_sample)
bool save_key_columns = table_join->isEnabledAlgorithm(JoinAlgorithm::AUTO) ||
table_join->isEnabledAlgorithm(JoinAlgorithm::GRACE_HASH) ||
isRightOrFull(kind) ||
multiple_disjuncts;
multiple_disjuncts ||
table_join->getMixedJoinExpression();
if (save_key_columns)
{
saved_block_sample = right_table_keys.cloneEmpty();
@ -835,7 +842,7 @@ bool HashJoin::addBlockToJoin(const Block & source_block_, bool check_limits)
if (rows)
data->empty = false;
bool multiple_disjuncts = !table_join->oneDisjunct();
bool flag_per_row = needUsedFlagsForPerRightTableRow(table_join);
const auto & onexprs = table_join->getClauses();
for (size_t onexpr_idx = 0; onexpr_idx < onexprs.size(); ++onexpr_idx)
{
@ -859,7 +866,7 @@ bool HashJoin::addBlockToJoin(const Block & source_block_, bool check_limits)
auto join_mask_col = JoinCommon::getColumnAsMask(source_block, onexprs[onexpr_idx].condColumnNames().second);
/// Save blocks that do not hold conditions in ON section
ColumnUInt8::MutablePtr not_joined_map = nullptr;
if (!multiple_disjuncts && isRightOrFull(kind) && join_mask_col.hasData())
if (!flag_per_row && isRightOrFull(kind) && join_mask_col.hasData())
{
const auto & join_mask = join_mask_col.getData();
/// Save rows that do not hold conditions
@ -889,7 +896,7 @@ bool HashJoin::addBlockToJoin(const Block & source_block_, bool check_limits)
join_mask_col.getData(),
data->pool, is_inserted);
if (multiple_disjuncts)
if (flag_per_row)
used_flags.reinit<kind_, strictness_>(stored_block);
else if (is_inserted)
/// Number of buckets + 1 value from zero storage
@ -897,19 +904,19 @@ bool HashJoin::addBlockToJoin(const Block & source_block_, bool check_limits)
});
}
if (!multiple_disjuncts && save_nullmap && is_inserted)
if (!flag_per_row && save_nullmap && is_inserted)
{
data->blocks_nullmaps_allocated_size += null_map_holder->allocatedBytes();
data->blocks_nullmaps.emplace_back(stored_block, null_map_holder);
}
if (!multiple_disjuncts && not_joined_map && is_inserted)
if (!flag_per_row && not_joined_map && is_inserted)
{
data->blocks_nullmaps_allocated_size += not_joined_map->allocatedBytes();
data->blocks_nullmaps.emplace_back(stored_block, std::move(not_joined_map));
}
if (!multiple_disjuncts && !is_inserted)
if (!flag_per_row && !is_inserted)
{
LOG_TRACE(log, "Skipping inserting block with {} rows", rows);
data->blocks_allocated_size -= stored_block->allocatedBytes();
@ -1044,14 +1051,17 @@ public:
};
AddedColumns(
const Block & left_block,
const Block & left_block_,
const Block & block_with_columns_to_add,
const Block & saved_block_sample,
const HashJoin & join,
std::vector<JoinOnKeyColumns> && join_on_keys_,
ExpressionActionsPtr additional_filter_expression_,
bool is_asof_join,
bool is_join_get_)
: join_on_keys(join_on_keys_)
: left_block(left_block_)
, join_on_keys(join_on_keys_)
, additional_filter_expression(additional_filter_expression_)
, rows_to_add(left_block.rows())
, is_join_get(is_join_get_)
{
@ -1120,7 +1130,9 @@ public:
const IColumn & leftAsofKey() const { return *left_asof_key; }
Block left_block;
std::vector<JoinOnKeyColumns> join_on_keys;
ExpressionActionsPtr additional_filter_expression;
size_t max_joined_block_rows = 0;
size_t rows_to_add;
@ -1221,7 +1233,7 @@ void AddedColumns<true>::buildOutput()
{
if (!lazy_output.blocks[j])
{
default_count ++;
default_count++;
continue;
}
apply_default();
@ -1340,7 +1352,7 @@ struct JoinFeatures
static constexpr bool need_flags = MapGetter<KIND, STRICTNESS>::flagged;
};
template <bool multiple_disjuncts>
template <bool flag_per_row>
class KnownRowsHolder;
/// Keep already joined rows to prevent duplication if many disjuncts
@ -1415,18 +1427,18 @@ public:
}
};
template <typename Map, bool add_missing, bool multiple_disjuncts, typename AddedColumns>
template <typename Map, bool add_missing, bool flag_per_row, typename AddedColumns>
void addFoundRowAll(
const typename Map::mapped_type & mapped,
AddedColumns & added,
IColumn::Offset & current_offset,
KnownRowsHolder<multiple_disjuncts> & known_rows [[maybe_unused]],
KnownRowsHolder<flag_per_row> & known_rows [[maybe_unused]],
JoinStuff::JoinUsedFlags * used_flags [[maybe_unused]])
{
if constexpr (add_missing)
added.applyLazyDefaults();
if constexpr (multiple_disjuncts)
if constexpr (flag_per_row)
{
std::unique_ptr<std::vector<KnownRowsHolder<true>::Type>> new_known_rows_ptr;
@ -1443,7 +1455,7 @@ void addFoundRowAll(
new_known_rows_ptr->push_back(std::make_pair(it->block, it->row_num));
if (used_flags)
{
used_flags->JoinStuff::JoinUsedFlags::setUsedOnce<true, multiple_disjuncts>(
used_flags->JoinStuff::JoinUsedFlags::setUsedOnce<true, flag_per_row>(
FindResultImpl<const RowRef, false>(*it, true, 0));
}
}
@ -1482,9 +1494,324 @@ void setUsed(IColumn::Filter & filter [[maybe_unused]], size_t pos [[maybe_unuse
filter[pos] = 1;
}
template<typename AddedColumns>
ColumnPtr buildAdditionalFilter(
size_t left_start_row,
const std::vector<RowRef> & selected_rows,
const std::vector<size_t> & row_replicate_offset,
AddedColumns & added_columns)
{
ColumnPtr result_column;
do
{
if (selected_rows.empty())
{
result_column = ColumnUInt8::create();
break;
}
const Block & sample_right_block = *selected_rows.begin()->block;
if (!sample_right_block || !added_columns.additional_filter_expression)
{
auto filter = ColumnUInt8::create();
filter->insertMany(1, selected_rows.size());
result_column = std::move(filter);
break;
}
auto required_cols = added_columns.additional_filter_expression->getRequiredColumnsWithTypes();
if (required_cols.empty())
{
Block block;
added_columns.additional_filter_expression->execute(block);
result_column = block.getByPosition(0).column->cloneResized(selected_rows.size());
break;
}
NameSet required_column_names;
for (auto & col : required_cols)
required_column_names.insert(col.name);
Block executed_block;
size_t right_col_pos = 0;
for (const auto & col : sample_right_block.getColumnsWithTypeAndName())
{
if (required_column_names.contains(col.name))
{
auto new_col = col.column->cloneEmpty();
for (const auto & selected_row : selected_rows)
{
const auto & src_col = selected_row.block->getByPosition(right_col_pos);
new_col->insertFrom(*src_col.column, selected_row.row_num);
}
executed_block.insert({std::move(new_col), col.type, col.name});
}
right_col_pos += 1;
}
if (!executed_block)
{
result_column = ColumnUInt8::create();
break;
}
for (const auto & col_name : required_column_names)
{
const auto * src_col = added_columns.left_block.findByName(col_name);
if (!src_col)
continue;
auto new_col = src_col->column->cloneEmpty();
size_t prev_left_offset = 0;
for (size_t i = 1; i < row_replicate_offset.size(); ++i)
{
const size_t & left_offset = row_replicate_offset[i];
size_t rows = left_offset - prev_left_offset;
if (rows)
new_col->insertManyFrom(*src_col->column, left_start_row + i - 1, rows);
prev_left_offset = left_offset;
}
executed_block.insert({std::move(new_col), src_col->type, col_name});
}
if (!executed_block)
{
throw Exception(
ErrorCodes::LOGICAL_ERROR,
"required columns: [{}], but not found any in left/right table. right table: {}, left table: {}",
required_cols.toString(),
sample_right_block.dumpNames(),
added_columns.left_block.dumpNames());
}
for (const auto & col : executed_block.getColumnsWithTypeAndName())
if (!col.column || !col.type)
throw Exception(ErrorCodes::LOGICAL_ERROR, "Illegal nullptr column in input block: {}", executed_block.dumpStructure());
added_columns.additional_filter_expression->execute(executed_block);
result_column = executed_block.getByPosition(0).column->convertToFullColumnIfConst();
executed_block.clear();
} while (false);
result_column = result_column->convertToFullIfNeeded();
if (result_column->isNullable())
{
/// Convert Nullable(UInt8) to UInt8 ensuring that nulls are zeros
/// Trying to avoid copying data, since we are the only owner of the column.
ColumnPtr mask_column = assert_cast<const ColumnNullable &>(*result_column).getNullMapColumnPtr();
MutableColumnPtr mutable_column;
{
ColumnPtr nested_column = assert_cast<const ColumnNullable &>(*result_column).getNestedColumnPtr();
result_column.reset();
mutable_column = IColumn::mutate(std::move(nested_column));
}
auto & column_data = assert_cast<ColumnUInt8 &>(*mutable_column).getData();
const auto & mask_column_data = assert_cast<const ColumnUInt8 &>(*mask_column).getData();
for (size_t i = 0; i < column_data.size(); ++i)
{
if (mask_column_data[i])
column_data[i] = 0;
}
return mutable_column;
}
return result_column;
}
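/// Editor's illustrative sketch (not part of the patch) of the Nullable(UInt8) -> UInt8 collapse
/// performed above: wherever the null map is set, the filter value is forced to zero so that a NULL
/// result of the additional filter expression is treated as "row does not match".
void zeroOutNullsInFilter(PaddedPODArray<UInt8> & filter, const PaddedPODArray<UInt8> & null_map)
{
    for (size_t i = 0; i < filter.size(); ++i)
        if (null_map[i])
            filter[i] = 0;
}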
/// Adapter class to pass into addFoundRowAll
/// In joinRightColumnsWithAdditionalFilter we don't want to add rows directly into AddedColumns,
/// because they need to be filtered by additional_filter_expression.
class PreSelectedRows : public std::vector<RowRef>
{
public:
void appendFromBlock(const Block & block, size_t row_num, bool /* has_default */) { this->emplace_back(&block, row_num); }
};
/// First, collect all matched row refs by the join keys, then filter out the rows for which the additional filter expression does not evaluate to true.
template <
typename KeyGetter,
typename Map,
bool need_replication,
typename AddedColumns>
NO_INLINE size_t joinRightColumnsWithAddtitionalFilter(
std::vector<KeyGetter> && key_getter_vector,
const std::vector<const Map *> & mapv,
AddedColumns & added_columns,
JoinStuff::JoinUsedFlags & used_flags [[maybe_unused]],
bool need_filter [[maybe_unused]],
bool need_flags [[maybe_unused]],
bool add_missing [[maybe_unused]],
bool flag_per_row [[maybe_unused]])
{
size_t left_block_rows = added_columns.rows_to_add;
if (need_filter)
added_columns.filter = IColumn::Filter(left_block_rows, 0);
std::unique_ptr<Arena> pool;
if constexpr (need_replication)
added_columns.offsets_to_replicate = std::make_unique<IColumn::Offsets>(left_block_rows);
std::vector<size_t> row_replicate_offset;
row_replicate_offset.reserve(left_block_rows);
using FindResult = typename KeyGetter::FindResult;
size_t max_joined_block_rows = added_columns.max_joined_block_rows;
size_t left_row_iter = 0;
PreSelectedRows selected_rows;
selected_rows.reserve(left_block_rows);
std::vector<FindResult> find_results;
find_results.reserve(left_block_rows);
bool exceeded_max_block_rows = false;
IColumn::Offset total_added_rows = 0;
IColumn::Offset current_added_rows = 0;
auto collect_keys_matched_rows_refs = [&]()
{
pool = std::make_unique<Arena>();
find_results.clear();
row_replicate_offset.clear();
row_replicate_offset.push_back(0);
current_added_rows = 0;
selected_rows.clear();
for (; left_row_iter < left_block_rows; ++left_row_iter)
{
if constexpr (need_replication)
{
if (unlikely(total_added_rows + current_added_rows >= max_joined_block_rows))
{
break;
}
}
KnownRowsHolder<true> all_flag_known_rows;
KnownRowsHolder<false> single_flag_know_rows;
for (size_t join_clause_idx = 0; join_clause_idx < added_columns.join_on_keys.size(); ++join_clause_idx)
{
const auto & join_keys = added_columns.join_on_keys[join_clause_idx];
if (join_keys.null_map && (*join_keys.null_map)[left_row_iter])
continue;
bool row_acceptable = !join_keys.isRowFiltered(left_row_iter);
auto find_result = row_acceptable
? key_getter_vector[join_clause_idx].findKey(*(mapv[join_clause_idx]), left_row_iter, *pool)
: FindResult();
if (find_result.isFound())
{
auto & mapped = find_result.getMapped();
find_results.push_back(find_result);
if (flag_per_row)
addFoundRowAll<Map, false, true>(mapped, selected_rows, current_added_rows, all_flag_known_rows, nullptr);
else
addFoundRowAll<Map, false, false>(mapped, selected_rows, current_added_rows, single_flag_know_rows, nullptr);
}
}
row_replicate_offset.push_back(current_added_rows);
}
};
auto copy_final_matched_rows = [&](size_t left_start_row, ColumnPtr filter_col)
{
const PaddedPODArray<UInt8> & filter_flags = assert_cast<const ColumnUInt8 &>(*filter_col).getData();
size_t prev_replicated_row = 0;
auto selected_right_row_it = selected_rows.begin();
size_t find_result_index = 0;
for (size_t i = 1, n = row_replicate_offset.size(); i < n; ++i)
{
bool any_matched = false;
/// For ALL RIGHT JOIN, flag_per_row is true and we need to mark the used flag for each row.
if (flag_per_row)
{
for (size_t replicated_row = prev_replicated_row; replicated_row < row_replicate_offset[i]; ++replicated_row)
{
if (filter_flags[replicated_row])
{
any_matched = true;
added_columns.appendFromBlock(*selected_right_row_it->block, selected_right_row_it->row_num, add_missing);
total_added_rows += 1;
if (need_flags)
used_flags.template setUsed<true, true>(selected_right_row_it->block, selected_right_row_it->row_num, 0);
}
++selected_right_row_it;
}
}
else
{
for (size_t replicated_row = prev_replicated_row; replicated_row < row_replicate_offset[i]; ++replicated_row)
{
if (filter_flags[replicated_row])
{
any_matched = true;
added_columns.appendFromBlock(*selected_right_row_it->block, selected_right_row_it->row_num, add_missing);
total_added_rows += 1;
}
++selected_right_row_it;
}
}
if (!any_matched)
{
if (add_missing)
addNotFoundRow<true, need_replication>(added_columns, total_added_rows);
else
addNotFoundRow<false, need_replication>(added_columns, total_added_rows);
}
else
{
if (!flag_per_row && need_flags)
used_flags.template setUsed<true, false>(find_results[find_result_index]);
if (need_filter)
setUsed<true>(added_columns.filter, left_start_row + i - 1);
if (add_missing)
added_columns.applyLazyDefaults();
}
find_result_index += (prev_replicated_row != row_replicate_offset[i]);
if constexpr (need_replication)
{
(*added_columns.offsets_to_replicate)[left_start_row + i - 1] = total_added_rows;
}
prev_replicated_row = row_replicate_offset[i];
}
};
while (left_row_iter < left_block_rows && !exceeded_max_block_rows)
{
auto left_start_row = left_row_iter;
collect_keys_matched_rows_refs();
if (selected_rows.size() != current_added_rows || row_replicate_offset.size() != left_row_iter - left_start_row + 1)
{
throw Exception(
ErrorCodes::LOGICAL_ERROR,
"Sizes are mismatched. selected_rows.size:{}, current_added_rows:{}, row_replicate_offset.size:{}, left_row_iter: {}, "
"left_start_row: {}",
selected_rows.size(),
current_added_rows,
row_replicate_offset.size(),
left_row_iter,
left_start_row);
}
auto filter_col = buildAdditionalFilter(left_start_row, selected_rows, row_replicate_offset, added_columns);
copy_final_matched_rows(left_start_row, filter_col);
if constexpr (need_replication)
{
// Also check current_added_rows to avoid running the filter expression on batches that are too small.
if (total_added_rows >= max_joined_block_rows || current_added_rows < 1024)
{
exceeded_max_block_rows = true;
}
}
}
if constexpr (need_replication)
{
added_columns.offsets_to_replicate->resize_assume_reserved(left_row_iter);
added_columns.filter.resize_assume_reserved(left_row_iter);
}
added_columns.applyLazyDefaults();
return left_row_iter;
}
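/// Editor's very simplified, assumption-laden outline (not part of the patch) of the two-phase
/// approach used above: candidates are first gathered purely by hash-key equality, then the residual
/// (non-equi) ON condition prunes them. All names below are hypothetical.
template <typename Row, typename ProbeByKey, typename Residual>
std::vector<std::pair<size_t, Row>> joinWithResidualFilter(
    const std::vector<Row> & left, ProbeByKey && probe_by_key, Residual && residual)
{
    std::vector<std::pair<size_t, Row>> result;
    for (size_t i = 0; i < left.size(); ++i)
        for (const Row & right : probe_by_key(left[i])) /// phase 1: candidates by key equality only
            if (residual(left[i], right))               /// phase 2: residual ON condition
                result.emplace_back(i, right);
    return result;
}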
/// Joins right table columns which indexes are present in right_indexes using specified map.
/// Makes filter (1 if row presented in right table) and returns offsets to replicate (for ALL JOINS).
template <JoinKind KIND, JoinStrictness STRICTNESS, typename KeyGetter, typename Map, bool need_filter, bool multiple_disjuncts, typename AddedColumns>
template <JoinKind KIND, JoinStrictness STRICTNESS, typename KeyGetter, typename Map, bool need_filter, bool flag_per_row, typename AddedColumns>
NO_INLINE size_t joinRightColumns(
std::vector<KeyGetter> && key_getter_vector,
const std::vector<const Map *> & mapv,
@ -1519,7 +1846,7 @@ NO_INLINE size_t joinRightColumns(
bool right_row_found = false;
KnownRowsHolder<multiple_disjuncts> known_rows;
KnownRowsHolder<flag_per_row> known_rows;
for (size_t onexpr_idx = 0; onexpr_idx < added_columns.join_on_keys.size(); ++onexpr_idx)
{
const auto & join_keys = added_columns.join_on_keys[onexpr_idx];
@ -1542,10 +1869,10 @@ NO_INLINE size_t joinRightColumns(
if (row_ref.block)
{
setUsed<need_filter>(added_columns.filter, i);
if constexpr (multiple_disjuncts)
used_flags.template setUsed<join_features.need_flags, multiple_disjuncts>(row_ref.block, row_ref.row_num, 0);
if constexpr (flag_per_row)
used_flags.template setUsed<join_features.need_flags, flag_per_row>(row_ref.block, row_ref.row_num, 0);
else
used_flags.template setUsed<join_features.need_flags, multiple_disjuncts>(find_result);
used_flags.template setUsed<join_features.need_flags, flag_per_row>(find_result);
added_columns.appendFromBlock(*row_ref.block, row_ref.row_num, join_features.add_missing);
}
@ -1555,14 +1882,14 @@ NO_INLINE size_t joinRightColumns(
else if constexpr (join_features.is_all_join)
{
setUsed<need_filter>(added_columns.filter, i);
used_flags.template setUsed<join_features.need_flags, multiple_disjuncts>(find_result);
used_flags.template setUsed<join_features.need_flags, flag_per_row>(find_result);
auto used_flags_opt = join_features.need_flags ? &used_flags : nullptr;
addFoundRowAll<Map, join_features.add_missing>(mapped, added_columns, current_offset, known_rows, used_flags_opt);
}
else if constexpr ((join_features.is_any_join || join_features.is_semi_join) && join_features.right)
{
/// Use first appeared left key + it needs left columns replication
bool used_once = used_flags.template setUsedOnce<join_features.need_flags, multiple_disjuncts>(find_result);
bool used_once = used_flags.template setUsedOnce<join_features.need_flags, flag_per_row>(find_result);
if (used_once)
{
auto used_flags_opt = join_features.need_flags ? &used_flags : nullptr;
@ -1572,7 +1899,7 @@ NO_INLINE size_t joinRightColumns(
}
else if constexpr (join_features.is_any_join && KIND == JoinKind::Inner)
{
bool used_once = used_flags.template setUsedOnce<join_features.need_flags, multiple_disjuncts>(find_result);
bool used_once = used_flags.template setUsedOnce<join_features.need_flags, flag_per_row>(find_result);
/// Use first appeared left key only
if (used_once)
@ -1590,12 +1917,12 @@ NO_INLINE size_t joinRightColumns(
else if constexpr (join_features.is_anti_join)
{
if constexpr (join_features.right && join_features.need_flags)
used_flags.template setUsed<join_features.need_flags, multiple_disjuncts>(find_result);
used_flags.template setUsed<join_features.need_flags, flag_per_row>(find_result);
}
else /// ANY LEFT, SEMI LEFT, old ANY (RightAny)
{
setUsed<need_filter>(added_columns.filter, i);
used_flags.template setUsed<join_features.need_flags, multiple_disjuncts>(find_result);
used_flags.template setUsed<join_features.need_flags, flag_per_row>(find_result);
added_columns.appendFromBlock(*mapped.block, mapped.row_num, join_features.add_missing);
if (join_features.is_any_or_semi_join)
@ -1630,6 +1957,27 @@ size_t joinRightColumnsSwitchMultipleDisjuncts(
AddedColumns & added_columns,
JoinStuff::JoinUsedFlags & used_flags [[maybe_unused]])
{
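/// Dispatch: if the JOIN ON section carries an additional (mixed/non-equi) filter expression, the ALL-join
/// path goes through joinRightColumnsWithAddtitionalFilter; per-row used flags are requested for RIGHT/FULL
/// joins and for multiple disjuncts. For other strictnesses an additional filter is a logical error here.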
constexpr JoinFeatures<KIND, STRICTNESS> join_features;
if constexpr (join_features.is_all_join)
{
if (added_columns.additional_filter_expression)
{
bool mark_per_row_used = join_features.right || join_features.full || mapv.size() > 1;
return joinRightColumnsWithAddtitionalFilter<KeyGetter, Map, join_features.need_replication>(
std::forward<std::vector<KeyGetter>>(key_getter_vector),
mapv,
added_columns,
used_flags,
need_filter,
join_features.need_flags,
join_features.add_missing,
mark_per_row_used);
}
}
if (added_columns.additional_filter_expression)
throw Exception(ErrorCodes::LOGICAL_ERROR, "Additional filter expression is not supported for this JOIN");
return mapv.size() > 1
? joinRightColumns<KIND, STRICTNESS, KeyGetter, Map, need_filter, true>(std::forward<std::vector<KeyGetter>>(key_getter_vector), mapv, added_columns, used_flags)
: joinRightColumns<KIND, STRICTNESS, KeyGetter, Map, need_filter, false>(std::forward<std::vector<KeyGetter>>(key_getter_vector), mapv, added_columns, used_flags);
@ -1788,8 +2136,14 @@ Block HashJoin::joinBlockImpl(
* For ASOF, the last column is used as the ASOF column
*/
AddedColumns<!join_features.is_any_join> added_columns(
block, block_with_columns_to_add, savedBlockSample(), *this, std::move(join_on_keys), join_features.is_asof_join, is_join_get);
block,
block_with_columns_to_add,
savedBlockSample(),
*this,
std::move(join_on_keys),
table_join->getMixedJoinExpression(),
join_features.is_asof_join,
is_join_get);
bool has_required_right_keys = (required_right_keys.columns() != 0);
added_columns.need_filter = join_features.need_filter || has_required_right_keys;
@ -1856,11 +2210,15 @@ Block HashJoin::joinBlockImpl(
/// If ALL ... JOIN - we replicate all the columns except the new ones.
for (size_t i = 0; i < existing_columns; ++i)
{
block.safeGetByPosition(i).column = block.safeGetByPosition(i).column->replicate(*offsets_to_replicate);
}
/// Replicate additional right keys
for (size_t pos : right_keys_to_replicate)
{
block.safeGetByPosition(pos).column = block.safeGetByPosition(pos).column->replicate(*offsets_to_replicate);
}
}
return remaining_block;
@ -2108,10 +2466,10 @@ struct AdderNonJoined
class NotJoinedHash final : public NotJoinedBlocks::RightColumnsFiller
{
public:
NotJoinedHash(const HashJoin & parent_, UInt64 max_block_size_, bool multiple_disjuncts_)
NotJoinedHash(const HashJoin & parent_, UInt64 max_block_size_, bool flag_per_row_)
: parent(parent_)
, max_block_size(max_block_size_)
, multiple_disjuncts(multiple_disjuncts_)
, flag_per_row(flag_per_row_)
, current_block_start(0)
{
if (parent.data == nullptr)
@ -2138,7 +2496,7 @@ public:
throw Exception(ErrorCodes::LOGICAL_ERROR, "Unknown JOIN strictness '{}' (must be on of: ANY, ALL, ASOF)", parent.strictness);
}
if (!multiple_disjuncts)
if (!flag_per_row)
{
fillNullsFromBlocks(columns_right, rows_added);
}
@ -2149,7 +2507,7 @@ public:
private:
const HashJoin & parent;
UInt64 max_block_size;
bool multiple_disjuncts;
bool flag_per_row;
size_t current_block_start;
@ -2215,7 +2573,7 @@ private:
{
size_t rows_added = 0;
if (multiple_disjuncts)
if (flag_per_row)
{
if (!used_position.has_value())
used_position = parent.data->blocks.begin();
@ -2307,8 +2665,8 @@ IBlocksStreamPtr HashJoin::getNonJoinedBlocks(const Block & left_sample_block,
return {};
size_t left_columns_count = left_sample_block.columns();
bool multiple_disjuncts = !table_join->oneDisjunct();
if (!multiple_disjuncts)
bool flag_per_row = needUsedFlagsForPerRightTableRow(table_join);
if (!flag_per_row)
{
/// With multiple disjuncts, all keys are in sample_block_with_columns_to_add, so invariant is not held
size_t expected_columns_count = left_columns_count + required_right_keys.columns() + sample_block_with_columns_to_add.columns();
@ -2320,7 +2678,7 @@ IBlocksStreamPtr HashJoin::getNonJoinedBlocks(const Block & left_sample_block,
}
}
auto non_joined = std::make_unique<NotJoinedHash>(*this, max_block_size, multiple_disjuncts);
auto non_joined = std::make_unique<NotJoinedHash>(*this, max_block_size, flag_per_row);
return std::make_unique<NotJoinedBlocks>(std::move(non_joined), result_sample_block, left_columns_count, *table_join);
}
@ -2329,8 +2687,8 @@ void HashJoin::reuseJoinedData(const HashJoin & join)
data = join.data;
from_storage_join = true;
bool multiple_disjuncts = !table_join->oneDisjunct();
if (multiple_disjuncts)
bool flag_per_row = needUsedFlagsForPerRightTableRow(table_join);
if (flag_per_row)
throw Exception(ErrorCodes::NOT_IMPLEMENTED, "StorageJoin with ORs is not supported");
for (auto & map : data->maps)
@ -2394,4 +2752,46 @@ const ColumnWithTypeAndName & HashJoin::rightAsofKeyColumn() const
return savedBlockSample().getByName(table_join->getOnlyClause().key_names_right.back());
}
void HashJoin::validateAdditionalFilterExpression(ExpressionActionsPtr additional_filter_expression)
{
if (!additional_filter_expression)
return;
Block expression_sample_block = additional_filter_expression->getSampleBlock();
if (expression_sample_block.columns() != 1)
{
throw Exception(ErrorCodes::LOGICAL_ERROR,
"Unexpected expression in JOIN ON section. Expected single column, got '{}'",
expression_sample_block.dumpStructure());
}
auto type = removeNullable(expression_sample_block.getByPosition(0).type);
if (!type->equals(*std::make_shared<DataTypeUInt8>()))
{
throw Exception(ErrorCodes::LOGICAL_ERROR,
"Unexpected expression in JOIN ON section. Expected boolean (UInt8), got '{}'. expression:\n{}",
expression_sample_block.getByPosition(0).type->getName(),
additional_filter_expression->dumpActions());
}
bool is_supported = (strictness == JoinStrictness::All) && (isInnerOrLeft(kind) || isRightOrFull(kind));
if (!is_supported)
{
throw Exception(ErrorCodes::INVALID_JOIN_ON_EXPRESSION,
"Non equi condition '{}' from JOIN ON section is supported only for ALL INNER/LEFT/FULL/RIGHT JOINs",
expression_sample_block.getByPosition(0).name);
}
}
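/// Illustrative (hypothetical) examples of the validation above: a condition such as
/// `ON t1.key = t2.key AND t1.a < t2.b` is accepted for ALL INNER/LEFT/RIGHT/FULL JOINs, since the extra
/// expression evaluates to UInt8; the same condition under ANY or ASOF strictness is rejected with
/// INVALID_JOIN_ON_EXPRESSION.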
bool HashJoin::needUsedFlagsForPerRightTableRow(std::shared_ptr<TableJoin> table_join_) const
{
if (!table_join_->oneDisjunct())
return true;
/// If it's an ALL RIGHT JOIN with inequal conditions, we need to mark each row
if (table_join_->getMixedJoinExpression() && isRightOrFull(table_join_->kind()))
return true;
return false;
}
}
View File
@ -31,6 +31,7 @@ namespace DB
{
class TableJoin;
class ExpressionActions;
namespace JoinStuff
{
@ -60,16 +61,16 @@ public:
bool getUsedSafe(size_t i) const;
bool getUsedSafe(const Block * block_ptr, size_t row_idx) const;
template <bool use_flags, bool multiple_disjuncts, typename T>
template <bool use_flags, bool flag_per_row, typename T>
void setUsed(const T & f);
template <bool use_flags, bool multiple_disjunct>
template <bool use_flags, bool flag_per_row>
void setUsed(const Block * block, size_t row_num, size_t offset);
template <bool use_flags, bool multiple_disjuncts, typename T>
template <bool use_flags, bool flag_per_row, typename T>
bool getUsed(const T & f);
template <bool use_flags, bool multiple_disjuncts, typename T>
template <bool use_flags, bool flag_per_row, typename T>
bool setUsedOnce(const T & f);
};
@ -485,6 +486,9 @@ private:
static Type chooseMethod(JoinKind kind, const ColumnRawPtrs & key_columns, Sizes & key_sizes);
bool empty() const;
void validateAdditionalFilterExpression(std::shared_ptr<ExpressionActions> additional_filter_expression);
bool needUsedFlagsForPerRightTableRow(std::shared_ptr<TableJoin> table_join_) const;
};
}
View File
@ -28,6 +28,7 @@ class ASTSelectQuery;
struct DatabaseAndTableWithAlias;
class Block;
class DictionaryJoinAdapter;
class ExpressionActions;
class StorageJoin;
class StorageDictionary;
class IKeyValueEntity;
@ -153,6 +154,8 @@ private:
ASTs key_asts_right;
Clauses clauses;
/// Originally used for inequal joins. If there are no inequal join conditions, it will be nullptr.
std::shared_ptr<ExpressionActions> mixed_join_expression = nullptr;
ASTTableJoin table_join;
@ -301,6 +304,9 @@ public:
std::vector<JoinOnClause> & getClauses() { return clauses; }
const std::vector<JoinOnClause> & getClauses() const { return clauses; }
const std::shared_ptr<ExpressionActions> & getMixedJoinExpression() const { return mixed_join_expression; }
std::shared_ptr<ExpressionActions> & getMixedJoinExpression() { return mixed_join_expression; }
Names getAllNames(JoinTableSide side) const;
void resetCollected();
View File
@ -1305,6 +1305,14 @@ JoinTreeQueryPlan buildQueryPlanForJoinNode(const QueryTreeNodePtr & join_table_
std::swap(table_join_clause.key_names_right.at(asof_condition.key_index), table_join_clause.key_names_right.back());
}
}
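/// If the planner collected a cross-table (mixed) JOIN ON expression, materialize it into ExpressionActions
/// and store it in TableJoin, so that HashJoin can later evaluate it as an additional filter per candidate row pair.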
if (join_clauses_and_actions.mixed_join_expressions_actions)
{
ExpressionActionsPtr & mixed_join_expression = table_join->getMixedJoinExpression();
mixed_join_expression = std::make_shared<ExpressionActions>(
join_clauses_and_actions.mixed_join_expressions_actions,
ExpressionActionsSettings::fromContext(planner_context->getQueryContext()));
}
}
else if (join_node.isUsingJoinExpression())
{
View File
@ -10,6 +10,7 @@
#include <DataTypes/getLeastSupertype.h>
#include <DataTypes/DataTypeString.h>
#include <DataTypes/DataTypesNumber.h>
#include <Storages/IStorage.h>
#include <Storages/StorageJoin.h>
@ -125,13 +126,13 @@ TableExpressionSet extractTableExpressionsSet(const QueryTreeNodePtr & node)
return res;
}
std::optional<JoinTableSide> extractJoinTableSideFromExpression(
std::set<JoinTableSide> extractJoinTableSidesFromExpression(
const IQueryTreeNode * expression_root_node,
const TableExpressionSet & left_table_expressions,
const TableExpressionSet & right_table_expressions,
const JoinNode & join_node)
{
std::optional<JoinTableSide> table_side;
std::set<JoinTableSide> table_sides;
std::vector<const IQueryTreeNode *> nodes_to_process;
nodes_to_process.push_back(expression_root_node);
@ -169,15 +170,10 @@ std::optional<JoinTableSide> extractJoinTableSideFromExpression(
join_node.getRightTableExpression()->formatASTForErrorMessage());
auto input_table_side = is_column_from_left_expr ? JoinTableSide::Left : JoinTableSide::Right;
if (table_side && (*table_side) != input_table_side)
throw Exception(ErrorCodes::INVALID_JOIN_ON_EXPRESSION,
"JOIN {} join expression contains column from left and right table",
join_node.formatASTForErrorMessage());
table_side = input_table_side;
table_sides.insert(input_table_side);
}
return table_side;
return table_sides;
}
const ActionsDAG::Node * appendExpression(
@ -199,6 +195,7 @@ const ActionsDAG::Node * appendExpression(
void buildJoinClause(
ActionsDAGPtr & left_dag,
ActionsDAGPtr & right_dag,
ActionsDAGPtr & mixed_dag,
const PlannerContextPtr & planner_context,
const QueryTreeNodePtr & join_expression,
const TableExpressionSet & left_table_expressions,
@ -219,6 +216,7 @@ void buildJoinClause(
buildJoinClause(
left_dag,
right_dag,
mixed_dag,
planner_context,
child,
left_table_expressions,
@ -238,38 +236,42 @@ void buildJoinClause(
const auto left_child = function_node->getArguments().getNodes().at(0);
const auto right_child = function_node->getArguments().getNodes().at(1);
auto left_expression_side_optional = extractJoinTableSideFromExpression(left_child.get(),
auto left_expression_sides = extractJoinTableSidesFromExpression(left_child.get(),
left_table_expressions,
right_table_expressions,
join_node);
auto right_expression_side_optional = extractJoinTableSideFromExpression(right_child.get(),
auto right_expression_sides = extractJoinTableSidesFromExpression(right_child.get(),
left_table_expressions,
right_table_expressions,
join_node);
if (!left_expression_side_optional && !right_expression_side_optional)
if (left_expression_sides.empty() && right_expression_sides.empty())
{
throw Exception(ErrorCodes::INVALID_JOIN_ON_EXPRESSION,
"JOIN {} ON expression with constants is not supported",
"JOIN {} ON expression expected non-empty left and right table expressions",
join_node.formatASTForErrorMessage());
}
else if (left_expression_side_optional && !right_expression_side_optional)
else if (left_expression_sides.size() == 1 && right_expression_sides.empty())
{
auto & dag = *left_expression_side_optional == JoinTableSide::Left ? left_dag : right_dag;
auto expression_side = *left_expression_sides.begin();
auto & dag = expression_side == JoinTableSide::Left ? left_dag : right_dag;
const auto * node = appendExpression(dag, join_expression, planner_context, join_node);
join_clause.addCondition(*left_expression_side_optional, node);
join_clause.addCondition(expression_side, node);
}
else if (!left_expression_side_optional && right_expression_side_optional)
else if (left_expression_sides.empty() && right_expression_sides.size() == 1)
{
auto & dag = *right_expression_side_optional == JoinTableSide::Left ? left_dag : right_dag;
auto expression_side = *right_expression_sides.begin();
auto & dag = expression_side == JoinTableSide::Left ? left_dag : right_dag;
const auto * node = appendExpression(dag, join_expression, planner_context, join_node);
join_clause.addCondition(*right_expression_side_optional, node);
join_clause.addCondition(expression_side, node);
}
else
else if (left_expression_sides.size() == 1 && right_expression_sides.size() == 1)
{
auto left_expression_side = *left_expression_side_optional;
auto right_expression_side = *right_expression_side_optional;
auto left_expression_side = *left_expression_sides.begin();
auto right_expression_side = *right_expression_sides.begin();
if (left_expression_side != right_expression_side)
{
@ -310,23 +312,62 @@ void buildJoinClause(
join_clause.addCondition(left_expression_side, node);
}
}
else
{
auto support_mixed_join_condition = planner_context->getQueryContext()->getSettingsRef().allow_experimental_join_condition;
auto join_use_nulls = planner_context->getQueryContext()->getSettingsRef().join_use_nulls;
/// If join_use_nulls = true, the columns' nullability will be changed later, which would make this expression incorrect.
if (support_mixed_join_condition && !join_use_nulls)
{
/// expression involves both tables.
/// `expr1(left.col1, right.col2) == expr2(left.col3, right.col4)`
const auto * node = appendExpression(mixed_dag, join_expression, planner_context, join_node);
join_clause.addMixedCondition(node);
}
else
{
throw Exception(
ErrorCodes::INVALID_JOIN_ON_EXPRESSION,
"JOIN {} join expression contains column from left and right table",
join_node.formatASTForErrorMessage());
}
}
return;
}
auto expression_side_optional = extractJoinTableSideFromExpression(
join_expression.get(),
left_table_expressions,
right_table_expressions,
join_node);
if (!expression_side_optional)
expression_side_optional = JoinTableSide::Right;
auto expression_side = *expression_side_optional;
auto & dag = expression_side == JoinTableSide::Left ? left_dag : right_dag;
const auto * node = appendExpression(dag, join_expression, planner_context, join_node);
join_clause.addCondition(expression_side, node);
else
{
auto expression_sides = extractJoinTableSidesFromExpression(join_expression.get(),
left_table_expressions,
right_table_expressions,
join_node);
// If expression_sides is empty, the expression is constant.
if (expression_sides.empty() || expression_sides.size() == 1)
{
auto expression_side = expression_sides.empty() ? JoinTableSide::Right : *expression_sides.begin();
auto & dag = expression_side == JoinTableSide::Left ? left_dag : right_dag;
const auto * node = appendExpression(dag, join_expression, planner_context, join_node);
join_clause.addCondition(expression_side, node);
}
else
{
auto support_mixed_join_condition = planner_context->getQueryContext()->getSettingsRef().allow_experimental_join_condition;
auto join_use_nulls = planner_context->getQueryContext()->getSettingsRef().join_use_nulls;
/// If join_use_nulls = true, the columns' nullability will be changed later, which would make this expression incorrect.
if (support_mixed_join_condition && !join_use_nulls)
{
/// expression involves both tables.
const auto * node = appendExpression(mixed_dag, join_expression, planner_context, join_node);
join_clause.addMixedCondition(node);
}
else
{
throw Exception(
ErrorCodes::INVALID_JOIN_ON_EXPRESSION,
"JOIN {} join expression contains column from left and right table",
join_node.formatASTForErrorMessage());
}
}
}
}
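/// A mixed condition reaches addMixedCondition() only when allow_experimental_join_condition is enabled and
/// join_use_nulls is disabled; e.g. a (hypothetical) clause like `ON t1.key = t2.key AND t1.a + t2.b > 10`
/// produces one equi key pair plus one mixed filter node, while with the setting off it throws
/// INVALID_JOIN_ON_EXPRESSION.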
JoinClausesAndActions buildJoinClausesAndActions(
@ -337,6 +378,16 @@ JoinClausesAndActions buildJoinClausesAndActions(
{
ActionsDAGPtr left_join_actions = std::make_shared<ActionsDAG>(left_table_expression_columns);
ActionsDAGPtr right_join_actions = std::make_shared<ActionsDAG>(right_table_expression_columns);
ColumnsWithTypeAndName mixed_table_expression_columns;
for (const auto & left_column : left_table_expression_columns)
{
mixed_table_expression_columns.push_back(left_column);
}
for (const auto & right_column : right_table_expression_columns)
{
mixed_table_expression_columns.push_back(right_column);
}
ActionsDAGPtr mixed_join_actions = std::make_shared<ActionsDAG>(mixed_table_expression_columns);
/** It is possible to have constant value in JOIN ON section, that we need to ignore during DAG construction.
* If we do not ignore it, this function will be replaced by underlying constant.
@ -390,6 +441,7 @@ JoinClausesAndActions buildJoinClausesAndActions(
JoinClausesAndActions result;
bool is_inequal_join = false;
const auto & function_name = function_node->getFunction()->getName();
if (function_name == "or")
{
@ -400,12 +452,14 @@ JoinClausesAndActions buildJoinClausesAndActions(
buildJoinClause(
left_join_actions,
right_join_actions,
mixed_join_actions,
planner_context,
child,
join_left_table_expressions,
join_right_table_expressions,
join_node,
result.join_clauses.back());
is_inequal_join |= !result.join_clauses.back().getMixedFilterConditionNodes().empty();
}
}
else
@ -415,12 +469,14 @@ JoinClausesAndActions buildJoinClausesAndActions(
buildJoinClause(
left_join_actions,
right_join_actions,
mixed_join_actions,
planner_context,
join_expression,
join_left_table_expressions,
join_right_table_expressions,
join_node,
result.join_clauses.back());
is_inequal_join |= !result.join_clauses.back().getMixedFilterConditionNodes().empty();
}
auto and_function = FunctionFactory::instance().get("and", planner_context->getQueryContext());
@ -441,7 +497,6 @@ JoinClausesAndActions buildJoinClausesAndActions(
if (!left_filter_condition_nodes.empty())
{
const ActionsDAG::Node * dag_filter_condition_node = nullptr;
if (left_filter_condition_nodes.size() > 1)
dag_filter_condition_node = &left_join_actions->addFunction(and_function, left_filter_condition_nodes, {});
else
@ -540,6 +595,47 @@ JoinClausesAndActions buildJoinClausesAndActions(
result.right_join_tmp_expression_actions = std::move(right_join_actions);
result.right_join_expressions_actions->removeUnusedActions(join_right_actions_names);
if (is_inequal_join)
{
/// In case of multiple disjuncts and any inequal join condition, we need to build the full JOIN ON expression actions.
/// The whole expression from JOIN ON is then re-evaluated to check whether rows should be joined.
if (result.join_clauses.size() > 1)
{
auto mixed_join_expressions_actions = std::make_shared<ActionsDAG>(mixed_table_expression_columns);
PlannerActionsVisitor join_expression_visitor(planner_context);
auto join_expression_dag_node_raw_pointers = join_expression_visitor.visit(mixed_join_expressions_actions, join_expression);
if (join_expression_dag_node_raw_pointers.size() != 1)
throw Exception(
ErrorCodes::LOGICAL_ERROR, "JOIN {} ON clause contains multiple expressions", join_node.formatASTForErrorMessage());
mixed_join_expressions_actions->addOrReplaceInOutputs(*join_expression_dag_node_raw_pointers[0]);
Names required_names{join_expression_dag_node_raw_pointers[0]->result_name};
mixed_join_expressions_actions->removeUnusedActions(required_names);
result.mixed_join_expressions_actions = mixed_join_expressions_actions;
}
else
{
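/// With a single join clause only the mixed (cross-table) condition nodes have to be re-evaluated,
/// so a filter DAG is built directly from them instead of rebuilding the whole JOIN ON expression.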
const auto & join_clause = result.join_clauses.front();
const auto & mixed_filter_condition_nodes = join_clause.getMixedFilterConditionNodes();
auto mixed_join_expressions_actions = ActionsDAG::buildFilterActionsDAG(mixed_filter_condition_nodes, {}, true);
result.mixed_join_expressions_actions = mixed_join_expressions_actions;
}
auto outputs = result.mixed_join_expressions_actions->getOutputs();
if (outputs.size() != 1)
{
throw Exception(ErrorCodes::LOGICAL_ERROR, "Only one output is expected, got: {}", result.mixed_join_expressions_actions->dumpDAG());
}
auto output_type = removeNullable(outputs[0]->result_type);
WhichDataType which_type(output_type);
if (!which_type.isUInt8())
{
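/// The additional filter must yield UInt8: combine the expression output with a constant `true`
/// column and rebuild the filter DAG so the final output becomes a UInt8 filter column.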
DataTypePtr uint8_ty = std::make_shared<DataTypeUInt8>();
auto true_col = ColumnWithTypeAndName(uint8_ty->createColumnConst(1, 1), uint8_ty, "true");
const auto * true_node = &result.mixed_join_expressions_actions->addColumn(true_col);
result.mixed_join_expressions_actions = ActionsDAG::buildFilterActionsDAG({outputs[0], true_node});
}
}
return result;
}
@ -751,6 +847,14 @@ std::shared_ptr<IJoin> chooseJoinAlgorithm(std::shared_ptr<TableJoin> & table_jo
const Block & right_table_expression_header,
const PlannerContextPtr & planner_context)
{
if (table_join->getMixedJoinExpression()
&& !table_join->isEnabledAlgorithm(JoinAlgorithm::HASH)
&& !table_join->isEnabledAlgorithm(JoinAlgorithm::GRACE_HASH))
{
throw Exception(ErrorCodes::NOT_IMPLEMENTED,
"JOIN with mixed conditions supports only hash join or grace hash join");
}
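/// E.g. a query with a mixed JOIN ON condition needs `hash` or `grace_hash` to be among the enabled
/// join algorithms (the join_algorithm setting); otherwise it fails with NOT_IMPLEMENTED here.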
trySetStorageInTableJoin(right_table_expression, table_join);
auto & right_table_expression_data = planner_context->getTableExpressionDataOrThrow(right_table_expression);
View File
@ -140,6 +140,21 @@ public:
return right_filter_condition_nodes;
}
ActionsDAG::NodeRawConstPtrs & getMixedFilterConditionNodes()
{
return mixed_filter_condition_nodes;
}
void addMixedCondition(const ActionsDAG::Node * condition_node)
{
mixed_filter_condition_nodes.push_back(condition_node);
}
const ActionsDAG::NodeRawConstPtrs & getMixedFilterConditionNodes() const
{
return mixed_filter_condition_nodes;
}
/// Dump clause into buffer
void dump(WriteBuffer & buffer) const;
@ -154,6 +169,8 @@ private:
ActionsDAG::NodeRawConstPtrs left_filter_condition_nodes;
ActionsDAG::NodeRawConstPtrs right_filter_condition_nodes;
/// conditions which involve both left and right tables
ActionsDAG::NodeRawConstPtrs mixed_filter_condition_nodes;
std::unordered_set<size_t> nullsafe_compare_key_indexes;
};
@ -171,6 +188,9 @@ struct JoinClausesAndActions
ActionsDAGPtr left_join_expressions_actions;
/// Right join expressions actions
ActionsDAGPtr right_join_expressions_actions;
/// Originally used for inequal join. it's the total join expression.
/// If there is no inequal join conditions, it's null.
ActionsDAGPtr mixed_join_expressions_actions;
};
/** Calculate join clauses and actions for JOIN ON section.
View File
@ -1,5 +1,5 @@
#include <Storages/MergeTree/MergeTreeDataPartWriterOnDisk.h>
#include <Storages/MergeTree/MergeTreeIndexInverted.h>
#include <Storages/MergeTree/MergeTreeIndexFullText.h>
#include <Common/ElapsedTimeProfileEventIncrement.h>
#include <Common/MemoryTrackerBlockerInThread.h>
#include <Common/logger_useful.h>
@ -283,7 +283,7 @@ void MergeTreeDataPartWriterOnDisk::initSkipIndices()
settings.query_write_settings));
GinIndexStorePtr store = nullptr;
if (typeid_cast<const MergeTreeIndexInverted *>(&*skip_index) != nullptr)
if (typeid_cast<const MergeTreeIndexFullText *>(&*skip_index) != nullptr)
{
store = std::make_shared<GinIndexStore>(stream_name, data_part->getDataPartStoragePtr(), data_part->getDataPartStoragePtr(), storage.getSettings()->max_digestion_size_per_segment);
gin_index_stores[stream_name] = store;
@ -356,7 +356,7 @@ void MergeTreeDataPartWriterOnDisk::calculateAndSerializeSkipIndices(const Block
WriteBuffer & marks_out = stream.compress_marks ? stream.marks_compressed_hashing : stream.marks_hashing;
GinIndexStorePtr store;
if (typeid_cast<const MergeTreeIndexInverted *>(&*index_helper) != nullptr)
if (typeid_cast<const MergeTreeIndexFullText *>(&*index_helper) != nullptr)
{
String stream_name = index_helper->getFileName();
auto it = gin_index_stores.find(stream_name);
@ -471,7 +471,7 @@ void MergeTreeDataPartWriterOnDisk::fillSkipIndicesChecksums(MergeTreeData::Data
/// Register additional files written only by the inverted index. Required because otherwise DROP TABLE complains about unknown
/// files. Note that the provided actual checksums are bogus. The problem is that at this point the file writes happened already and
/// we'd need to re-open + hash the files (fixing this is TODO). For now, CHECK TABLE skips these four files.
if (typeid_cast<const MergeTreeIndexInverted *>(&*skip_indices[i]) != nullptr)
if (typeid_cast<const MergeTreeIndexFullText *>(&*skip_indices[i]) != nullptr)
{
String filename_without_extension = skip_indices[i]->getFileName();
checksums.files[filename_without_extension + ".gin_dict"] = MergeTreeDataPartChecksums::Checksum();
View File
@ -9,7 +9,7 @@
#include <Storages/MergeTree/KeyCondition.h>
#include <Storages/MergeTree/MergeTreeDataPartUUID.h>
#include <Storages/MergeTree/StorageFromMergeTreeDataPart.h>
#include <Storages/MergeTree/MergeTreeIndexInverted.h>
#include <Storages/MergeTree/MergeTreeIndexFullText.h>
#include <Storages/ReadInOrderOptimizer.h>
#include <Storages/VirtualColumnUtils.h>
#include <Parsers/ASTIdentifier.h>
@ -1297,7 +1297,7 @@ MarkRanges MergeTreeDataSelectExecutor::filterMarksUsingIndex(
PostingsCacheForStore cache_in_store;
if (dynamic_cast<const MergeTreeIndexInverted *>(&*index_helper) != nullptr)
if (dynamic_cast<const MergeTreeIndexFullText *>(&*index_helper) != nullptr)
cache_in_store.store = GinIndexStoreFactory::instance().get(index_helper->getFileName(), part->getDataPartStoragePtr());
for (size_t i = 0; i < ranges.size(); ++i)
@ -1334,7 +1334,7 @@ MarkRanges MergeTreeDataSelectExecutor::filterMarksUsingIndex(
}
bool result = false;
const auto * gin_filter_condition = dynamic_cast<const MergeTreeConditionInverted *>(&*condition);
const auto * gin_filter_condition = dynamic_cast<const MergeTreeConditionFullText *>(&*condition);
if (!gin_filter_condition)
result = condition->mayBeTrueOnGranule(granule);
else
View File
@ -618,7 +618,8 @@ MergeTreeDataWriter::TemporaryPart MergeTreeDataWriter::writeTempPartImpl(
if (projection_block.rows())
{
auto proj_temp_part = writeProjectionPart(data, log, projection_block, projection, new_data_part.get());
auto proj_temp_part
= writeProjectionPart(data, log, projection_block, projection, new_data_part.get(), /*merge_is_needed=*/false);
new_data_part->addProjectionPart(projection.name, std::move(proj_temp_part.part));
for (auto & stream : proj_temp_part.streams)
temp_part.streams.emplace_back(std::move(stream));
@ -647,7 +648,8 @@ MergeTreeDataWriter::TemporaryPart MergeTreeDataWriter::writeProjectionPartImpl(
const MergeTreeData & data,
LoggerPtr log,
Block block,
const ProjectionDescription & projection)
const ProjectionDescription & projection,
bool merge_is_needed)
{
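/// merge_is_needed controls whether the Aggregate-projection block merging step below is executed:
/// the plain insert path above passes false, while writeTempProjectionPart (projection materialization)
/// passes true.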
TemporaryPart temp_part;
const auto & metadata_snapshot = projection.metadata;
@ -716,7 +718,7 @@ MergeTreeDataWriter::TemporaryPart MergeTreeDataWriter::writeProjectionPartImpl(
ProfileEvents::increment(ProfileEvents::MergeTreeDataProjectionWriterBlocksAlreadySorted);
}
if (projection.type == ProjectionDescription::Type::Aggregate)
if (projection.type == ProjectionDescription::Type::Aggregate && merge_is_needed)
{
ProfileEventTimeIncrement<Microseconds> watch(ProfileEvents::MergeTreeDataProjectionWriterMergingBlocksMicroseconds);
@ -756,16 +758,11 @@ MergeTreeDataWriter::TemporaryPart MergeTreeDataWriter::writeProjectionPart(
LoggerPtr log,
Block block,
const ProjectionDescription & projection,
IMergeTreeDataPart * parent_part)
IMergeTreeDataPart * parent_part,
bool merge_is_needed)
{
return writeProjectionPartImpl(
projection.name,
false /* is_temp */,
parent_part,
data,
log,
std::move(block),
projection);
projection.name, false /* is_temp */, parent_part, data, log, std::move(block), projection, merge_is_needed);
}
/// This is used for projection materialization process which may contain multiple stages of
@ -780,13 +777,7 @@ MergeTreeDataWriter::TemporaryPart MergeTreeDataWriter::writeTempProjectionPart(
{
auto part_name = fmt::format("{}_{}", projection.name, block_num);
return writeProjectionPartImpl(
part_name,
true /* is_temp */,
parent_part,
data,
log,
std::move(block),
projection);
part_name, true /* is_temp */, parent_part, data, log, std::move(block), projection, /*merge_is_needed=*/true);
}
}
View File
@ -95,7 +95,8 @@ public:
LoggerPtr log,
Block block,
const ProjectionDescription & projection,
IMergeTreeDataPart * parent_part);
IMergeTreeDataPart * parent_part,
bool merge_is_needed);
/// For mutation: MATERIALIZE PROJECTION.
static TemporaryPart writeTempProjectionPart(
@ -129,7 +130,8 @@ private:
const MergeTreeData & data,
LoggerPtr log,
Block block,
const ProjectionDescription & projection);
const ProjectionDescription & projection,
bool merge_is_needed);
MergeTreeData & data;
LoggerPtr log;
View File
@ -1,65 +0,0 @@
#include <Storages/MergeTree/MergeTreeIndexAggregatorBloomFilter.h>
#include <Columns/ColumnString.h>
#include <Columns/ColumnsNumber.h>
#include <Columns/ColumnFixedString.h>
#include <Common/HashTable/Hash.h>
#include <DataTypes/DataTypesNumber.h>
#include <Interpreters/BloomFilterHash.h>
#include <IO/WriteHelpers.h>
namespace DB
{
namespace ErrorCodes
{
extern const int LOGICAL_ERROR;
}
MergeTreeIndexAggregatorBloomFilter::MergeTreeIndexAggregatorBloomFilter(
size_t bits_per_row_, size_t hash_functions_, const Names & columns_name_)
: bits_per_row(bits_per_row_), hash_functions(hash_functions_), index_columns_name(columns_name_), column_hashes(columns_name_.size())
{
assert(bits_per_row != 0);
assert(hash_functions != 0);
}
bool MergeTreeIndexAggregatorBloomFilter::empty() const
{
return !total_rows;
}
MergeTreeIndexGranulePtr MergeTreeIndexAggregatorBloomFilter::getGranuleAndReset()
{
const auto granule = std::make_shared<MergeTreeIndexGranuleBloomFilter>(bits_per_row, hash_functions, column_hashes);
total_rows = 0;
column_hashes.clear();
return granule;
}
void MergeTreeIndexAggregatorBloomFilter::update(const Block & block, size_t * pos, size_t limit)
{
if (*pos >= block.rows())
throw Exception(ErrorCodes::LOGICAL_ERROR, "The provided position is not less than the number of block rows. "
"Position: {}, Block rows: {}.", *pos, block.rows());
Block granule_index_block;
size_t max_read_rows = std::min(block.rows() - *pos, limit);
for (size_t column = 0; column < index_columns_name.size(); ++column)
{
const auto & column_and_type = block.getByName(index_columns_name[column]);
auto index_column = BloomFilterHash::hashWithColumn(column_and_type.type, column_and_type.column, *pos, max_read_rows);
const auto & index_col = checkAndGetColumn<ColumnUInt64>(index_column.get());
const auto & index_data = index_col->getData();
for (const auto & hash: index_data)
column_hashes[column].insert(hash);
}
*pos += max_read_rows;
total_rows += max_read_rows;
}
}
View File
@ -1,30 +0,0 @@
#pragma once
#include <Storages/MergeTree/MergeTreeIndices.h>
#include <Storages/MergeTree/MergeTreeIndexGranuleBloomFilter.h>
#include <Common/HashTable/HashSet.h>
namespace DB
{
class MergeTreeIndexAggregatorBloomFilter final : public IMergeTreeIndexAggregator
{
public:
MergeTreeIndexAggregatorBloomFilter(size_t bits_per_row_, size_t hash_functions_, const Names & columns_name_);
bool empty() const override;
MergeTreeIndexGranulePtr getGranuleAndReset() override;
void update(const Block & block, size_t * pos, size_t limit) override;
private:
size_t bits_per_row;
size_t hash_functions;
const Names index_columns_name;
std::vector<HashSet<UInt64>> column_hashes;
size_t total_rows = 0;
};
}
View File
@ -1,13 +1,36 @@
#include <Storages/MergeTree/MergeTreeIndexBloomFilter.h>
#include <Storages/MergeTree/MergeTreeData.h>
#include <Interpreters/TreeRewriter.h>
#include <Interpreters/ExpressionAnalyzer.h>
#include <base/types.h>
#include <DataTypes/DataTypeNullable.h>
#include <Storages/MergeTree/MergeTreeIndexConditionBloomFilter.h>
#include <Columns/ColumnArray.h>
#include <Columns/ColumnConst.h>
#include <Columns/ColumnFixedString.h>
#include <Columns/ColumnNullable.h>
#include <Columns/ColumnString.h>
#include <Columns/ColumnTuple.h>
#include <Columns/ColumnsNumber.h>
#include <Common/FieldVisitorsAccurateComparison.h>
#include <Common/HashTable/ClearableHashMap.h>
#include <Common/HashTable/Hash.h>
#include <DataTypes/DataTypeArray.h>
#include <DataTypes/DataTypeMap.h>
#include <DataTypes/DataTypeNullable.h>
#include <DataTypes/DataTypeTuple.h>
#include <DataTypes/DataTypesNumber.h>
#include <IO/WriteHelpers.h>
#include <Interpreters/BloomFilterHash.h>
#include <Interpreters/ExpressionAnalyzer.h>
#include <Interpreters/TreeRewriter.h>
#include <Interpreters/castColumn.h>
#include <Interpreters/convertFieldToType.h>
#include <Interpreters/misc.h>
#include <Parsers/ASTFunction.h>
#include <Parsers/ASTIdentifier.h>
#include <Parsers/ASTLiteral.h>
#include <Parsers/ASTSelectQuery.h>
#include <Parsers/ASTSubquery.h>
#include <Storages/MergeTree/MergeTreeData.h>
#include <Storages/MergeTree/MergeTreeIndexUtils.h>
#include <Storages/MergeTree/RPNBuilder.h>
#include <base/types.h>
namespace DB
@ -17,8 +40,839 @@ namespace ErrorCodes
{
extern const int BAD_ARGUMENTS;
extern const int ILLEGAL_COLUMN;
extern const int ILLEGAL_TYPE_OF_ARGUMENT;
extern const int INCORRECT_QUERY;
extern const int NUMBER_OF_ARGUMENTS_DOESNT_MATCH;
extern const int LOGICAL_ERROR;
}
MergeTreeIndexGranuleBloomFilter::MergeTreeIndexGranuleBloomFilter(size_t bits_per_row_, size_t hash_functions_, size_t index_columns_)
: bits_per_row(bits_per_row_), hash_functions(hash_functions_), bloom_filters(index_columns_)
{
total_rows = 0;
for (size_t column = 0; column < index_columns_; ++column)
bloom_filters[column] = std::make_shared<BloomFilter>(bits_per_row, hash_functions, 0);
}
MergeTreeIndexGranuleBloomFilter::MergeTreeIndexGranuleBloomFilter(
size_t bits_per_row_, size_t hash_functions_, const std::vector<HashSet<UInt64>>& column_hashes_)
: bits_per_row(bits_per_row_), hash_functions(hash_functions_), bloom_filters(column_hashes_.size())
{
if (column_hashes_.empty())
throw Exception(ErrorCodes::LOGICAL_ERROR, "Granule_index_blocks empty or total_rows is zero.");
size_t bloom_filter_max_size = 0;
for (const auto & column_hash : column_hashes_)
bloom_filter_max_size = std::max(bloom_filter_max_size, column_hash.size());
static size_t atom_size = 8;
// If multiple columns are given, we will initialize all the bloom filters
// with the size of the highest-cardinality one. This is done for compatibility with
// existing binary serialization format
total_rows = bloom_filter_max_size;
size_t bytes_size = (bits_per_row * total_rows + atom_size - 1) / atom_size;
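/// For example (hypothetical numbers): with bits_per_row = 10 and total_rows = 1000,
/// bytes_size = (10 * 1000 + 8 - 1) / 8 = 1250 bytes per bloom filter.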
for (size_t column = 0, columns = column_hashes_.size(); column < columns; ++column)
{
bloom_filters[column] = std::make_shared<BloomFilter>(bytes_size, hash_functions, 0);
fillingBloomFilter(bloom_filters[column], column_hashes_[column]);
}
}
bool MergeTreeIndexGranuleBloomFilter::empty() const
{
return !total_rows;
}
void MergeTreeIndexGranuleBloomFilter::deserializeBinary(ReadBuffer & istr, MergeTreeIndexVersion version)
{
if (version != 1)
throw Exception(ErrorCodes::LOGICAL_ERROR, "Unknown index version {}.", version);
readVarUInt(total_rows, istr);
static size_t atom_size = 8;
size_t bytes_size = (bits_per_row * total_rows + atom_size - 1) / atom_size;
size_t read_size = bytes_size;
for (auto & filter : bloom_filters)
{
filter->resize(bytes_size);
#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
read_size = filter->getFilter().size() * sizeof(BloomFilter::UnderType);
#endif
istr.readStrict(reinterpret_cast<char *>(filter->getFilter().data()), read_size);
}
}
void MergeTreeIndexGranuleBloomFilter::serializeBinary(WriteBuffer & ostr) const
{
if (empty())
throw Exception(ErrorCodes::LOGICAL_ERROR, "Attempt to write empty bloom filter index.");
writeVarUInt(total_rows, ostr);
static size_t atom_size = 8;
size_t write_size = (bits_per_row * total_rows + atom_size - 1) / atom_size;
for (const auto & bloom_filter : bloom_filters)
{
#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
write_size = bloom_filter->getFilter().size() * sizeof(BloomFilter::UnderType);
#endif
ostr.write(reinterpret_cast<const char *>(bloom_filter->getFilter().data()), write_size);
}
}
void MergeTreeIndexGranuleBloomFilter::fillingBloomFilter(BloomFilterPtr & bf, const HashSet<UInt64> &hashes) const
{
for (const auto & bf_base_hash : hashes)
for (size_t i = 0; i < hash_functions; ++i)
bf->addHashWithSeed(bf_base_hash.getKey(), BloomFilterHash::bf_hash_seed[i]);
}
namespace
{
ColumnWithTypeAndName getPreparedSetInfo(const ConstSetPtr & prepared_set)
{
if (prepared_set->getDataTypes().size() == 1)
return {prepared_set->getSetElements()[0], prepared_set->getElementsTypes()[0], "dummy"};
Columns set_elements;
for (auto & set_element : prepared_set->getSetElements())
set_elements.emplace_back(set_element->convertToFullColumnIfConst());
return {ColumnTuple::create(set_elements), std::make_shared<DataTypeTuple>(prepared_set->getElementsTypes()), "dummy"};
}
bool hashMatchesFilter(const BloomFilterPtr& bloom_filter, UInt64 hash, size_t hash_functions)
{
return std::all_of(BloomFilterHash::bf_hash_seed,
BloomFilterHash::bf_hash_seed + hash_functions,
[&](const auto &hash_seed)
{
return bloom_filter->findHashWithSeed(hash,
hash_seed);
});
}
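/// Standard bloom filter membership test: a value is considered possibly present only if every seeded
/// probe finds its bit set, so false positives are possible but false negatives are not.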
bool maybeTrueOnBloomFilter(const IColumn * hash_column, const BloomFilterPtr & bloom_filter, size_t hash_functions, bool match_all)
{
const auto * const_column = typeid_cast<const ColumnConst *>(hash_column);
const auto * non_const_column = typeid_cast<const ColumnUInt64 *>(hash_column);
if (!const_column && !non_const_column)
throw Exception(ErrorCodes::LOGICAL_ERROR, "Hash column must be Const or UInt64.");
if (const_column)
{
return hashMatchesFilter(bloom_filter,
const_column->getValue<UInt64>(),
hash_functions);
}
const ColumnUInt64::Container & hashes = non_const_column->getData();
if (match_all)
{
return std::all_of(hashes.begin(),
hashes.end(),
[&](const auto& hash_row)
{
return hashMatchesFilter(bloom_filter,
hash_row,
hash_functions);
});
}
else
{
return std::any_of(hashes.begin(),
hashes.end(),
[&](const auto& hash_row)
{
return hashMatchesFilter(bloom_filter,
hash_row,
hash_functions);
});
}
}
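/// match_all is set for hasAll: every element of the probed set must possibly be in the filter;
/// for equality, IN and hasAny a single possible match is enough.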
}
MergeTreeIndexConditionBloomFilter::MergeTreeIndexConditionBloomFilter(
const ActionsDAGPtr & filter_actions_dag, ContextPtr context_, const Block & header_, size_t hash_functions_)
: WithContext(context_), header(header_), hash_functions(hash_functions_)
{
if (!filter_actions_dag)
{
rpn.push_back(RPNElement::FUNCTION_UNKNOWN);
return;
}
RPNBuilder<RPNElement> builder(
filter_actions_dag->getOutputs().at(0),
context_,
[&](const RPNBuilderTreeNode & node, RPNElement & out) { return extractAtomFromTree(node, out); });
rpn = std::move(builder).extractRPN();
}
bool MergeTreeIndexConditionBloomFilter::alwaysUnknownOrTrue() const
{
std::vector<bool> rpn_stack;
for (const auto & element : rpn)
{
if (element.function == RPNElement::FUNCTION_UNKNOWN
|| element.function == RPNElement::ALWAYS_TRUE)
{
rpn_stack.push_back(true);
}
else if (element.function == RPNElement::FUNCTION_EQUALS
|| element.function == RPNElement::FUNCTION_NOT_EQUALS
|| element.function == RPNElement::FUNCTION_HAS
|| element.function == RPNElement::FUNCTION_HAS_ANY
|| element.function == RPNElement::FUNCTION_HAS_ALL
|| element.function == RPNElement::FUNCTION_IN
|| element.function == RPNElement::FUNCTION_NOT_IN
|| element.function == RPNElement::ALWAYS_FALSE)
{
rpn_stack.push_back(false);
}
else if (element.function == RPNElement::FUNCTION_NOT)
{
// do nothing
}
else if (element.function == RPNElement::FUNCTION_AND)
{
auto arg1 = rpn_stack.back();
rpn_stack.pop_back();
auto arg2 = rpn_stack.back();
rpn_stack.back() = arg1 && arg2;
}
else if (element.function == RPNElement::FUNCTION_OR)
{
auto arg1 = rpn_stack.back();
rpn_stack.pop_back();
auto arg2 = rpn_stack.back();
rpn_stack.back() = arg1 || arg2;
}
else
throw Exception(ErrorCodes::LOGICAL_ERROR, "Unexpected function type in KeyCondition::RPNElement");
}
return rpn_stack[0];
}
bool MergeTreeIndexConditionBloomFilter::mayBeTrueOnGranule(const MergeTreeIndexGranuleBloomFilter * granule) const
{
std::vector<BoolMask> rpn_stack;
const auto & filters = granule->getFilters();
for (const auto & element : rpn)
{
if (element.function == RPNElement::FUNCTION_UNKNOWN)
{
rpn_stack.emplace_back(true, true);
}
else if (element.function == RPNElement::FUNCTION_IN
|| element.function == RPNElement::FUNCTION_NOT_IN
|| element.function == RPNElement::FUNCTION_EQUALS
|| element.function == RPNElement::FUNCTION_NOT_EQUALS
|| element.function == RPNElement::FUNCTION_HAS
|| element.function == RPNElement::FUNCTION_HAS_ANY
|| element.function == RPNElement::FUNCTION_HAS_ALL)
{
bool match_rows = true;
bool match_all = element.function == RPNElement::FUNCTION_HAS_ALL;
const auto & predicate = element.predicate;
for (size_t index = 0; match_rows && index < predicate.size(); ++index)
{
const auto & query_index_hash = predicate[index];
const auto & filter = filters[query_index_hash.first];
const ColumnPtr & hash_column = query_index_hash.second;
match_rows = maybeTrueOnBloomFilter(&*hash_column,
filter,
hash_functions,
match_all);
}
rpn_stack.emplace_back(match_rows, true);
if (element.function == RPNElement::FUNCTION_NOT_EQUALS || element.function == RPNElement::FUNCTION_NOT_IN)
rpn_stack.back() = !rpn_stack.back();
}
else if (element.function == RPNElement::FUNCTION_NOT)
{
rpn_stack.back() = !rpn_stack.back();
}
else if (element.function == RPNElement::FUNCTION_OR)
{
auto arg1 = rpn_stack.back();
rpn_stack.pop_back();
auto arg2 = rpn_stack.back();
rpn_stack.back() = arg1 | arg2;
}
else if (element.function == RPNElement::FUNCTION_AND)
{
auto arg1 = rpn_stack.back();
rpn_stack.pop_back();
auto arg2 = rpn_stack.back();
rpn_stack.back() = arg1 & arg2;
}
else if (element.function == RPNElement::ALWAYS_TRUE)
{
rpn_stack.emplace_back(true, false);
}
else if (element.function == RPNElement::ALWAYS_FALSE)
{
rpn_stack.emplace_back(false, true);
}
else
throw Exception(ErrorCodes::LOGICAL_ERROR, "Unexpected function type in KeyCondition::RPNElement");
}
if (rpn_stack.size() != 1)
throw Exception(ErrorCodes::LOGICAL_ERROR, "Unexpected stack size in KeyCondition::mayBeTrueInRange");
return rpn_stack[0].can_be_true;
}
bool MergeTreeIndexConditionBloomFilter::extractAtomFromTree(const RPNBuilderTreeNode & node, RPNElement & out)
{
{
Field const_value;
DataTypePtr const_type;
if (node.tryGetConstant(const_value, const_type))
{
if (const_value.getType() == Field::Types::UInt64)
{
out.function = const_value.get<UInt64>() ? RPNElement::ALWAYS_TRUE : RPNElement::ALWAYS_FALSE;
return true;
}
if (const_value.getType() == Field::Types::Int64)
{
out.function = const_value.get<Int64>() ? RPNElement::ALWAYS_TRUE : RPNElement::ALWAYS_FALSE;
return true;
}
if (const_value.getType() == Field::Types::Float64)
{
out.function = const_value.get<Float64>() != 0.0 ? RPNElement::ALWAYS_TRUE : RPNElement::ALWAYS_FALSE;
return true;
}
}
}
return traverseFunction(node, out, nullptr /*parent*/);
}
bool MergeTreeIndexConditionBloomFilter::traverseFunction(const RPNBuilderTreeNode & node, RPNElement & out, const RPNBuilderTreeNode * parent)
{
bool maybe_useful = false;
if (node.isFunction())
{
const auto function = node.toFunctionNode();
auto arguments_size = function.getArgumentsSize();
auto function_name = function.getFunctionName();
for (size_t i = 0; i < arguments_size; ++i)
{
auto argument = function.getArgumentAt(i);
if (traverseFunction(argument, out, &node))
maybe_useful = true;
}
if (arguments_size != 2)
return false;
auto lhs_argument = function.getArgumentAt(0);
auto rhs_argument = function.getArgumentAt(1);
if (functionIsInOrGlobalInOperator(function_name))
{
if (auto future_set = rhs_argument.tryGetPreparedSet(); future_set)
{
if (auto prepared_set = future_set->buildOrderedSetInplace(rhs_argument.getTreeContext().getQueryContext()); prepared_set)
{
if (prepared_set->hasExplicitSetElements())
{
const auto prepared_info = getPreparedSetInfo(prepared_set);
if (traverseTreeIn(function_name, lhs_argument, prepared_set, prepared_info.type, prepared_info.column, out))
maybe_useful = true;
}
}
}
}
else if (function_name == "equals" ||
function_name == "notEquals" ||
function_name == "has" ||
function_name == "mapContains" ||
function_name == "indexOf" ||
function_name == "hasAny" ||
function_name == "hasAll")
{
Field const_value;
DataTypePtr const_type;
if (rhs_argument.tryGetConstant(const_value, const_type))
{
if (traverseTreeEquals(function_name, lhs_argument, const_type, const_value, out, parent))
maybe_useful = true;
}
else if (lhs_argument.tryGetConstant(const_value, const_type))
{
if (traverseTreeEquals(function_name, rhs_argument, const_type, const_value, out, parent))
maybe_useful = true;
}
}
}
return maybe_useful;
}
bool MergeTreeIndexConditionBloomFilter::traverseTreeIn(
const String & function_name,
const RPNBuilderTreeNode & key_node,
const ConstSetPtr & prepared_set,
const DataTypePtr & type,
const ColumnPtr & column,
RPNElement & out)
{
auto key_node_column_name = key_node.getColumnName();
if (header.has(key_node_column_name))
{
size_t row_size = column->size();
size_t position = header.getPositionByName(key_node_column_name);
const DataTypePtr & index_type = header.getByPosition(position).type;
const auto & converted_column = castColumn(ColumnWithTypeAndName{column, type, ""}, index_type);
out.predicate.emplace_back(std::make_pair(position, BloomFilterHash::hashWithColumn(index_type, converted_column, 0, row_size)));
if (function_name == "in" || function_name == "globalIn")
out.function = RPNElement::FUNCTION_IN;
if (function_name == "notIn" || function_name == "globalNotIn")
out.function = RPNElement::FUNCTION_NOT_IN;
return true;
}
if (key_node.isFunction())
{
auto key_node_function = key_node.toFunctionNode();
auto key_node_function_name = key_node_function.getFunctionName();
size_t key_node_function_arguments_size = key_node_function.getArgumentsSize();
WhichDataType which(type);
if (which.isTuple() && key_node_function_name == "tuple")
{
const auto & tuple_column = typeid_cast<const ColumnTuple *>(column.get());
const auto & tuple_data_type = typeid_cast<const DataTypeTuple *>(type.get());
if (tuple_data_type->getElements().size() != key_node_function_arguments_size || tuple_column->getColumns().size() != key_node_function_arguments_size)
throw Exception(ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT, "Illegal types of arguments of function {}", function_name);
bool match_with_subtype = false;
const auto & sub_columns = tuple_column->getColumns();
const auto & sub_data_types = tuple_data_type->getElements();
for (size_t index = 0; index < key_node_function_arguments_size; ++index)
match_with_subtype |= traverseTreeIn(function_name, key_node_function.getArgumentAt(index), nullptr, sub_data_types[index], sub_columns[index], out);
return match_with_subtype;
}
if (key_node_function_name == "arrayElement")
{
/** Try to parse arrayElement for mapKeys index.
* It is important to ignore keys like column_map['Key'] IN ('') because if the key does not exist in the map,
* we return the default value for arrayElement.
*
* We cannot skip keys that do not exist in the map if the comparison is with the type's default value, because
* that way we would skip necessary granules where the map key does not exist.
*/
if (!prepared_set)
return false;
auto default_column_to_check = type->createColumnConstWithDefaultValue(1)->convertToFullColumnIfConst();
ColumnWithTypeAndName default_column_with_type_to_check { default_column_to_check, type, "" };
ColumnsWithTypeAndName default_columns_with_type_to_check = {default_column_with_type_to_check};
auto set_contains_default_value_predicate_column = prepared_set->execute(default_columns_with_type_to_check, false /*negative*/);
const auto & set_contains_default_value_predicate_column_typed = assert_cast<const ColumnUInt8 &>(*set_contains_default_value_predicate_column);
bool set_contain_default_value = set_contains_default_value_predicate_column_typed.getData()[0];
if (set_contain_default_value)
return false;
auto first_argument = key_node_function.getArgumentAt(0);
const auto column_name = first_argument.getColumnName();
auto map_keys_index_column_name = fmt::format("mapKeys({})", column_name);
auto map_values_index_column_name = fmt::format("mapValues({})", column_name);
if (header.has(map_keys_index_column_name))
{
/// For mapKeys we serialize key argument with bloom filter
auto second_argument = key_node_function.getArgumentAt(1);
Field constant_value;
DataTypePtr constant_type;
if (second_argument.tryGetConstant(constant_value, constant_type))
{
size_t position = header.getPositionByName(map_keys_index_column_name);
const DataTypePtr & index_type = header.getByPosition(position).type;
const DataTypePtr actual_type = BloomFilter::getPrimitiveType(index_type);
out.predicate.emplace_back(std::make_pair(position, BloomFilterHash::hashWithField(actual_type.get(), constant_value)));
}
else
{
return false;
}
}
else if (header.has(map_values_index_column_name))
{
/// For mapValues we serialize set with bloom filter
size_t row_size = column->size();
size_t position = header.getPositionByName(map_values_index_column_name);
const DataTypePtr & index_type = header.getByPosition(position).type;
const auto & array_type = assert_cast<const DataTypeArray &>(*index_type);
const auto & array_nested_type = array_type.getNestedType();
const auto & converted_column = castColumn(ColumnWithTypeAndName{column, type, ""}, array_nested_type);
out.predicate.emplace_back(std::make_pair(position, BloomFilterHash::hashWithColumn(array_nested_type, converted_column, 0, row_size)));
}
else
{
return false;
}
if (function_name == "in" || function_name == "globalIn")
out.function = RPNElement::FUNCTION_IN;
if (function_name == "notIn" || function_name == "globalNotIn")
out.function = RPNElement::FUNCTION_NOT_IN;
return true;
}
}
return false;
}
static bool indexOfCanUseBloomFilter(const RPNBuilderTreeNode * parent)
{
if (!parent)
return true;
if (!parent->isFunction())
return false;
auto function = parent->toFunctionNode();
auto function_name = function.getFunctionName();
/// `parent` is a function where `indexOf` is located.
/// Example: `indexOf(arr, x) = 1`, parent is a function named `equals`.
if (function_name == "and")
{
return true;
}
else if (function_name == "equals" /// notEquals is not applicable
|| function_name == "greater" || function_name == "greaterOrEquals"
|| function_name == "less" || function_name == "lessOrEquals")
{
size_t function_arguments_size = function.getArgumentsSize();
if (function_arguments_size != 2)
return false;
/// We don't allow constant expressions like `indexOf(arr, x) = 1 + 0` but it's negligible.
/// We should return true when the corresponding expression implies that the array contains the element.
/// Example: when `indexOf(arr, x)` > 10 is written, it means that arr definitely should contain the element
/// (at least at 11th position but it does not matter).
bool reversed = false;
Field constant_value;
DataTypePtr constant_type;
if (function.getArgumentAt(0).tryGetConstant(constant_value, constant_type))
{
reversed = true;
}
else if (function.getArgumentAt(1).tryGetConstant(constant_value, constant_type))
{
}
else
{
return false;
}
Field zero(0);
bool constant_equal_zero = applyVisitor(FieldVisitorAccurateEquals(), constant_value, zero);
if (function_name == "equals" && !constant_equal_zero)
{
/// indexOf(...) = c, c != 0
return true;
}
else if (function_name == "notEquals" && constant_equal_zero)
{
/// indexOf(...) != c, c = 0
return true;
}
else if (function_name == (reversed ? "less" : "greater") && !applyVisitor(FieldVisitorAccurateLess(), constant_value, zero))
{
/// indexOf(...) > c, c >= 0
return true;
}
else if (function_name == (reversed ? "lessOrEquals" : "greaterOrEquals") && applyVisitor(FieldVisitorAccurateLess(), zero, constant_value))
{
/// indexOf(...) >= c, c > 0
return true;
}
return false;
}
return false;
}
bool MergeTreeIndexConditionBloomFilter::traverseTreeEquals(
const String & function_name,
const RPNBuilderTreeNode & key_node,
const DataTypePtr & value_type,
const Field & value_field,
RPNElement & out,
const RPNBuilderTreeNode * parent)
{
auto key_column_name = key_node.getColumnName();
if (header.has(key_column_name))
{
size_t position = header.getPositionByName(key_column_name);
const DataTypePtr & index_type = header.getByPosition(position).type;
const auto * array_type = typeid_cast<const DataTypeArray *>(index_type.get());
if (function_name == "has" || function_name == "indexOf")
{
if (!array_type)
throw Exception(ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT, "First argument for function {} must be an array.", function_name);
/// We can treat the `indexOf` function similarly to `has`.
/// But it is a little more cumbersome; compare `has(arr, elem)` and `indexOf(arr, elem) != 0`.
/// The `parent` in this context is expected to be function `!=` (`notEquals`).
if (function_name == "has" || indexOfCanUseBloomFilter(parent))
{
out.function = RPNElement::FUNCTION_HAS;
const DataTypePtr actual_type = BloomFilter::getPrimitiveType(array_type->getNestedType());
auto converted_field = convertFieldToType(value_field, *actual_type, value_type.get());
if (converted_field.isNull())
return false;
out.predicate.emplace_back(std::make_pair(position, BloomFilterHash::hashWithField(actual_type.get(), converted_field)));
}
}
else if (function_name == "hasAny" || function_name == "hasAll")
{
if (!array_type)
throw Exception(ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT, "First argument for function {} must be an array.", function_name);
if (value_field.getType() != Field::Types::Array)
throw Exception(ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT, "Second argument for function {} must be an array.", function_name);
const DataTypePtr actual_type = BloomFilter::getPrimitiveType(array_type->getNestedType());
ColumnPtr column;
{
const bool is_nullable = actual_type->isNullable();
auto mutable_column = actual_type->createColumn();
for (const auto & f : value_field.get<Array>())
{
if ((f.isNull() && !is_nullable) || f.isDecimal(f.getType())) /// NOLINT(readability-static-accessed-through-instance)
return false;
auto converted = convertFieldToType(f, *actual_type);
if (converted.isNull())
return false;
mutable_column->insert(converted);
}
column = std::move(mutable_column);
}
out.function = function_name == "hasAny" ?
RPNElement::FUNCTION_HAS_ANY :
RPNElement::FUNCTION_HAS_ALL;
out.predicate.emplace_back(std::make_pair(position, BloomFilterHash::hashWithColumn(actual_type, column, 0, column->size())));
}
else
{
if (array_type)
throw Exception(ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT,
"An array type of bloom_filter supports only has(), indexOf(), and hasAny() functions.");
out.function = function_name == "equals" ? RPNElement::FUNCTION_EQUALS : RPNElement::FUNCTION_NOT_EQUALS;
const DataTypePtr actual_type = BloomFilter::getPrimitiveType(index_type);
auto converted_field = convertFieldToType(value_field, *actual_type, value_type.get());
if (converted_field.isNull())
return false;
out.predicate.emplace_back(std::make_pair(position, BloomFilterHash::hashWithField(actual_type.get(), converted_field)));
}
return true;
}
if (function_name == "mapContains" || function_name == "has")
{
auto map_keys_index_column_name = fmt::format("mapKeys({})", key_column_name);
if (!header.has(map_keys_index_column_name))
return false;
size_t position = header.getPositionByName(map_keys_index_column_name);
const DataTypePtr & index_type = header.getByPosition(position).type;
const auto * array_type = typeid_cast<const DataTypeArray *>(index_type.get());
if (!array_type)
return false;
out.function = RPNElement::FUNCTION_HAS;
const DataTypePtr actual_type = BloomFilter::getPrimitiveType(array_type->getNestedType());
auto converted_field = convertFieldToType(value_field, *actual_type, value_type.get());
if (converted_field.isNull())
return false;
out.predicate.emplace_back(std::make_pair(position, BloomFilterHash::hashWithField(actual_type.get(), converted_field)));
return true;
}
if (key_node.isFunction())
{
WhichDataType which(value_type);
auto key_node_function = key_node.toFunctionNode();
auto key_node_function_name = key_node_function.getFunctionName();
size_t key_node_function_arguments_size = key_node_function.getArgumentsSize();
if (which.isTuple() && key_node_function_name == "tuple")
{
const Tuple & tuple = value_field.get<const Tuple &>();
const auto * value_tuple_data_type = typeid_cast<const DataTypeTuple *>(value_type.get());
if (tuple.size() != key_node_function_arguments_size)
throw Exception(ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT, "Illegal types of arguments of function {}", function_name);
bool match_with_subtype = false;
const DataTypes & subtypes = value_tuple_data_type->getElements();
for (size_t index = 0; index < tuple.size(); ++index)
match_with_subtype |= traverseTreeEquals(function_name, key_node_function.getArgumentAt(index), subtypes[index], tuple[index], out, &key_node);
return match_with_subtype;
}
if (key_node_function_name == "arrayElement" && (function_name == "equals" || function_name == "notEquals"))
{
/** Try to parse arrayElement for mapKeys index.
* It is important to ignore keys like column_map['Key'] = '' because if the key does not exist in the map,
* arrayElement returns the default value.
*
* We cannot skip keys that do not exist in the map if the comparison is with the default value of the type,
* because that way we would skip granules where the map key does not exist.
*/
if (value_field == value_type->getDefault())
return false;
auto first_argument = key_node_function.getArgumentAt(0);
const auto column_name = first_argument.getColumnName();
auto map_keys_index_column_name = fmt::format("mapKeys({})", column_name);
auto map_values_index_column_name = fmt::format("mapValues({})", column_name);
size_t position = 0;
Field const_value = value_field;
DataTypePtr const_type;
if (header.has(map_keys_index_column_name))
{
position = header.getPositionByName(map_keys_index_column_name);
auto second_argument = key_node_function.getArgumentAt(1);
if (!second_argument.tryGetConstant(const_value, const_type))
return false;
}
else if (header.has(map_values_index_column_name))
{
position = header.getPositionByName(map_values_index_column_name);
}
else
{
return false;
}
out.function = function_name == "equals" ? RPNElement::FUNCTION_EQUALS : RPNElement::FUNCTION_NOT_EQUALS;
const auto & index_type = header.getByPosition(position).type;
const auto actual_type = BloomFilter::getPrimitiveType(index_type);
out.predicate.emplace_back(std::make_pair(position, BloomFilterHash::hashWithField(actual_type.get(), const_value)));
return true;
}
}
return false;
}
MergeTreeIndexAggregatorBloomFilter::MergeTreeIndexAggregatorBloomFilter(
size_t bits_per_row_, size_t hash_functions_, const Names & columns_name_)
: bits_per_row(bits_per_row_), hash_functions(hash_functions_), index_columns_name(columns_name_), column_hashes(columns_name_.size())
{
assert(bits_per_row != 0);
assert(hash_functions != 0);
}
bool MergeTreeIndexAggregatorBloomFilter::empty() const
{
return !total_rows;
}
MergeTreeIndexGranulePtr MergeTreeIndexAggregatorBloomFilter::getGranuleAndReset()
{
const auto granule = std::make_shared<MergeTreeIndexGranuleBloomFilter>(bits_per_row, hash_functions, column_hashes);
total_rows = 0;
column_hashes.clear();
return granule;
}
void MergeTreeIndexAggregatorBloomFilter::update(const Block & block, size_t * pos, size_t limit)
{
if (*pos >= block.rows())
throw Exception(ErrorCodes::LOGICAL_ERROR, "The provided position is not less than the number of block rows. "
"Position: {}, Block rows: {}.", *pos, block.rows());
Block granule_index_block;
size_t max_read_rows = std::min(block.rows() - *pos, limit);
for (size_t column = 0; column < index_columns_name.size(); ++column)
{
const auto & column_and_type = block.getByName(index_columns_name[column]);
auto index_column = BloomFilterHash::hashWithColumn(column_and_type.type, column_and_type.column, *pos, max_read_rows);
const auto & index_col = checkAndGetColumn<ColumnUInt64>(index_column.get());
const auto & index_data = index_col->getData();
for (const auto & hash: index_data)
column_hashes[column].insert(hash);
}
*pos += max_read_rows;
total_rows += max_read_rows;
}
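// --- Illustrative sketch (not part of the ClickHouse sources) ---------------
// The aggregation pattern of update() above, with std::hash and
// std::unordered_set standing in for BloomFilterHash and HashSet<UInt64>:
// per indexed column, each row of the processed block range is hashed and the
// distinct hashes accumulate until getGranuleAndReset() turns them into filters.
#include <algorithm>
#include <cstdint>
#include <string>
#include <unordered_set>
#include <vector>

struct ToyAggregator
{
    std::vector<std::unordered_set<uint64_t>> column_hashes;   // one hash set per indexed column
    size_t total_rows = 0;

    explicit ToyAggregator(size_t columns) : column_hashes(columns) {}

    void update(const std::vector<std::vector<std::string>> & columns, size_t pos, size_t limit)
    {
        size_t rows = columns.empty() ? 0 : std::min(limit, columns[0].size() - pos);
        for (size_t col = 0; col < columns.size(); ++col)
            for (size_t row = pos; row < pos + rows; ++row)
                column_hashes[col].insert(std::hash<std::string>{}(columns[col][row]));
        total_rows += rows;
    }
};

int main()
{
    ToyAggregator aggregator(2);
    aggregator.update({{"a", "b", "c"}, {"x", "y", "z"}}, 0, 3);   // one granule worth of rows
    return aggregator.total_rows == 3 ? 0 : 1;
}
// ----------------------------------------------------------------------------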
MergeTreeIndexBloomFilter::MergeTreeIndexBloomFilter(
@ -67,7 +921,7 @@ static void assertIndexColumnsType(const Block & header)
}
}
MergeTreeIndexPtr bloomFilterIndexCreatorNew(
MergeTreeIndexPtr bloomFilterIndexCreator(
const IndexDescription & index)
{
double max_conflict_probability = 0.025;
@ -84,7 +938,7 @@ MergeTreeIndexPtr bloomFilterIndexCreatorNew(
index, bits_per_row_and_size_of_hash_functions.first, bits_per_row_and_size_of_hash_functions.second);
}
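// --- Illustrative sketch (not part of the ClickHouse sources) ---------------
// The helper that turns max_conflict_probability into bits-per-row and the
// number of hash functions is elided by the diff hunk above. As a reminder of
// the underlying math only, the textbook Bloom filter sizing formulas are
// m/n = -ln(p) / ln(2)^2 and k = (m/n) * ln(2); the actual helper may differ.
#include <cmath>
#include <cstdio>

int main()
{
    double p = 0.025;                                                    // target false-positive probability
    double bits_per_row = -std::log(p) / (std::log(2.0) * std::log(2.0));
    double hash_functions = bits_per_row * std::log(2.0);
    std::printf("bits_per_row ~ %.0f, hash_functions ~ %.0f\n",
                std::ceil(bits_per_row), std::round(hash_functions));    // ~8 bits and ~5 hashes for p = 0.025
}
// ----------------------------------------------------------------------------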
void bloomFilterIndexValidatorNew(const IndexDescription & index, bool attach)
void bloomFilterIndexValidator(const IndexDescription & index, bool attach)
{
assertIndexColumnsType(index.sample_block);

View File

@ -1,13 +1,135 @@
#pragma once
#include <Columns/IColumn.h>
#include <Common/HashTable/HashSet.h>
#include <Interpreters/BloomFilter.h>
#include <Storages/MergeTree/KeyCondition.h>
#include <Storages/MergeTree/MergeTreeIndices.h>
#include <Storages/MergeTree/MergeTreeIndexGranuleBloomFilter.h>
#include <Storages/MergeTree/MergeTreeIndexAggregatorBloomFilter.h>
namespace DB
{
namespace ErrorCodes
{
extern const int LOGICAL_ERROR;
}
class MergeTreeIndexGranuleBloomFilter final : public IMergeTreeIndexGranule
{
public:
MergeTreeIndexGranuleBloomFilter(size_t bits_per_row_, size_t hash_functions_, size_t index_columns_);
MergeTreeIndexGranuleBloomFilter(size_t bits_per_row_, size_t hash_functions_, const std::vector<HashSet<UInt64>> & column_hashes);
bool empty() const override;
void serializeBinary(WriteBuffer & ostr) const override;
void deserializeBinary(ReadBuffer & istr, MergeTreeIndexVersion version) override;
const std::vector<BloomFilterPtr> & getFilters() const { return bloom_filters; }
private:
const size_t bits_per_row;
const size_t hash_functions;
size_t total_rows = 0;
std::vector<BloomFilterPtr> bloom_filters;
void fillingBloomFilter(BloomFilterPtr & bf, const HashSet<UInt64> & hashes) const;
};
class MergeTreeIndexConditionBloomFilter final : public IMergeTreeIndexCondition, WithContext
{
public:
struct RPNElement
{
enum Function
{
/// Atoms of a Boolean expression.
FUNCTION_EQUALS,
FUNCTION_NOT_EQUALS,
FUNCTION_HAS,
FUNCTION_HAS_ANY,
FUNCTION_HAS_ALL,
FUNCTION_IN,
FUNCTION_NOT_IN,
FUNCTION_UNKNOWN, /// Can take any value.
/// Operators of the logical expression.
FUNCTION_NOT,
FUNCTION_AND,
FUNCTION_OR,
/// Constants
ALWAYS_FALSE,
ALWAYS_TRUE,
};
RPNElement(Function function_ = FUNCTION_UNKNOWN) : function(function_) {} /// NOLINT
Function function = FUNCTION_UNKNOWN;
std::vector<std::pair<size_t, ColumnPtr>> predicate;
};
MergeTreeIndexConditionBloomFilter(const ActionsDAGPtr & filter_actions_dag, ContextPtr context_, const Block & header_, size_t hash_functions_);
bool alwaysUnknownOrTrue() const override;
bool mayBeTrueOnGranule(MergeTreeIndexGranulePtr granule) const override
{
if (const auto & bf_granule = typeid_cast<const MergeTreeIndexGranuleBloomFilter *>(granule.get()))
return mayBeTrueOnGranule(bf_granule);
throw Exception(ErrorCodes::LOGICAL_ERROR, "Requires bloom filter index granule.");
}
private:
const Block & header;
const size_t hash_functions;
std::vector<RPNElement> rpn;
bool mayBeTrueOnGranule(const MergeTreeIndexGranuleBloomFilter * granule) const;
bool extractAtomFromTree(const RPNBuilderTreeNode & node, RPNElement & out);
bool traverseFunction(const RPNBuilderTreeNode & node, RPNElement & out, const RPNBuilderTreeNode * parent);
bool traverseTreeIn(
const String & function_name,
const RPNBuilderTreeNode & key_node,
const ConstSetPtr & prepared_set,
const DataTypePtr & type,
const ColumnPtr & column,
RPNElement & out);
bool traverseTreeEquals(
const String & function_name,
const RPNBuilderTreeNode & key_node,
const DataTypePtr & value_type,
const Field & value_field,
RPNElement & out,
const RPNBuilderTreeNode * parent);
};
class MergeTreeIndexAggregatorBloomFilter final : public IMergeTreeIndexAggregator
{
public:
MergeTreeIndexAggregatorBloomFilter(size_t bits_per_row_, size_t hash_functions_, const Names & columns_name_);
bool empty() const override;
MergeTreeIndexGranulePtr getGranuleAndReset() override;
void update(const Block & block, size_t * pos, size_t limit) override;
private:
size_t bits_per_row;
size_t hash_functions;
const Names index_columns_name;
std::vector<HashSet<UInt64>> column_hashes;
size_t total_rows = 0;
};
class MergeTreeIndexBloomFilter final : public IMergeTreeIndex
{
public:

View File

@ -0,0 +1,815 @@
#include <Storages/MergeTree/MergeTreeIndexBloomFilterText.h>
#include <Columns/ColumnArray.h>
#include <Common/OptimizedRegularExpression.h>
#include <Core/Defines.h>
#include <DataTypes/DataTypeArray.h>
#include <DataTypes/DataTypesNumber.h>
#include <IO/ReadHelpers.h>
#include <IO/WriteHelpers.h>
#include <Interpreters/ExpressionActions.h>
#include <Interpreters/ExpressionAnalyzer.h>
#include <Interpreters/TreeRewriter.h>
#include <Interpreters/misc.h>
#include <Parsers/ASTIdentifier.h>
#include <Parsers/ASTLiteral.h>
#include <Parsers/ASTSelectQuery.h>
#include <Parsers/ASTSubquery.h>
#include <Storages/MergeTree/MergeTreeData.h>
#include <Storages/MergeTree/MergeTreeIndexUtils.h>
#include <Storages/MergeTree/RPNBuilder.h>
#include <Poco/Logger.h>
namespace DB
{
namespace ErrorCodes
{
extern const int LOGICAL_ERROR;
extern const int INCORRECT_QUERY;
extern const int BAD_ARGUMENTS;
}
MergeTreeIndexGranuleBloomFilterText::MergeTreeIndexGranuleBloomFilterText(
const String & index_name_,
size_t columns_number,
const BloomFilterParameters & params_)
: index_name(index_name_)
, params(params_)
, bloom_filters(
columns_number, BloomFilter(params))
, has_elems(false)
{
}
void MergeTreeIndexGranuleBloomFilterText::serializeBinary(WriteBuffer & ostr) const
{
if (empty())
throw Exception(ErrorCodes::LOGICAL_ERROR, "Attempt to write empty fulltext index {}.", backQuote(index_name));
for (const auto & bloom_filter : bloom_filters)
ostr.write(reinterpret_cast<const char *>(bloom_filter.getFilter().data()), params.filter_size);
}
void MergeTreeIndexGranuleBloomFilterText::deserializeBinary(ReadBuffer & istr, MergeTreeIndexVersion version)
{
if (version != 1)
throw Exception(ErrorCodes::LOGICAL_ERROR, "Unknown index version {}.", version);
for (auto & bloom_filter : bloom_filters)
{
istr.readStrict(reinterpret_cast<char *>(bloom_filter.getFilter().data()), params.filter_size);
}
has_elems = true;
}
MergeTreeIndexAggregatorBloomFilterText::MergeTreeIndexAggregatorBloomFilterText(
const Names & index_columns_,
const String & index_name_,
const BloomFilterParameters & params_,
TokenExtractorPtr token_extractor_)
: index_columns(index_columns_)
, index_name (index_name_)
, params(params_)
, token_extractor(token_extractor_)
, granule(
std::make_shared<MergeTreeIndexGranuleBloomFilterText>(
index_name, index_columns.size(), params))
{
}
MergeTreeIndexGranulePtr MergeTreeIndexAggregatorBloomFilterText::getGranuleAndReset()
{
auto new_granule = std::make_shared<MergeTreeIndexGranuleBloomFilterText>(
index_name, index_columns.size(), params);
new_granule.swap(granule);
return new_granule;
}
void MergeTreeIndexAggregatorBloomFilterText::update(const Block & block, size_t * pos, size_t limit)
{
if (*pos >= block.rows())
throw Exception(ErrorCodes::LOGICAL_ERROR, "The provided position is not less than the number of block rows. "
"Position: {}, Block rows: {}.", *pos, block.rows());
size_t rows_read = std::min(limit, block.rows() - *pos);
for (size_t col = 0; col < index_columns.size(); ++col)
{
const auto & column_with_type = block.getByName(index_columns[col]);
const auto & column = column_with_type.column;
size_t current_position = *pos;
if (isArray(column_with_type.type))
{
const auto & column_array = assert_cast<const ColumnArray &>(*column);
const auto & column_offsets = column_array.getOffsets();
const auto & column_key = column_array.getData();
for (size_t i = 0; i < rows_read; ++i)
{
size_t element_start_row = column_offsets[current_position - 1];
size_t elements_size = column_offsets[current_position] - element_start_row;
for (size_t row_num = 0; row_num < elements_size; ++row_num)
{
auto ref = column_key.getDataAt(element_start_row + row_num);
token_extractor->stringPaddedToBloomFilter(ref.data, ref.size, granule->bloom_filters[col]);
}
current_position += 1;
}
}
else
{
for (size_t i = 0; i < rows_read; ++i)
{
auto ref = column->getDataAt(current_position + i);
token_extractor->stringPaddedToBloomFilter(ref.data, ref.size, granule->bloom_filters[col]);
}
}
}
granule->has_elems = true;
*pos += rows_read;
}
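// --- Illustrative sketch (not part of the ClickHouse sources) ---------------
// What the token extractor does with each value in update() above, sketched
// for the ngram case with std::unordered_set standing in for the per-column
// BloomFilter. A real ngrambf_v1 index hashes each n-gram into a fixed-size
// bit filter instead of storing the strings.
#include <string>
#include <unordered_set>

static void ngramsToFilter(const std::string & value, size_t n, std::unordered_set<std::string> & filter)
{
    for (size_t i = 0; value.size() >= n && i + n <= value.size(); ++i)
        filter.insert(value.substr(i, n));   // every n-gram of the value goes into the granule's filter
}

int main()
{
    std::unordered_set<std::string> filter;
    ngramsToFilter("clickhouse", 3, filter);              // cli, lic, ick, ckh, ...
    return filter.count("ick") ? 0 : 1;                   // a later substring probe for "ick" would hit
}
// ----------------------------------------------------------------------------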
MergeTreeConditionBloomFilterText::MergeTreeConditionBloomFilterText(
const ActionsDAGPtr & filter_actions_dag,
ContextPtr context,
const Block & index_sample_block,
const BloomFilterParameters & params_,
TokenExtractorPtr token_extactor_)
: index_columns(index_sample_block.getNames())
, index_data_types(index_sample_block.getNamesAndTypesList().getTypes())
, params(params_)
, token_extractor(token_extactor_)
{
if (!filter_actions_dag)
{
rpn.push_back(RPNElement::FUNCTION_UNKNOWN);
return;
}
RPNBuilder<RPNElement> builder(
filter_actions_dag->getOutputs().at(0),
context,
[&](const RPNBuilderTreeNode & node, RPNElement & out) { return extractAtomFromTree(node, out); });
rpn = std::move(builder).extractRPN();
}
/// Keep in-sync with MergeTreeConditionGinFilter::alwaysUnknownOrTrue
bool MergeTreeConditionBloomFilterText::alwaysUnknownOrTrue() const
{
/// Check like in KeyCondition.
std::vector<bool> rpn_stack;
for (const auto & element : rpn)
{
if (element.function == RPNElement::FUNCTION_UNKNOWN
|| element.function == RPNElement::ALWAYS_TRUE)
{
rpn_stack.push_back(true);
}
else if (element.function == RPNElement::FUNCTION_EQUALS
|| element.function == RPNElement::FUNCTION_NOT_EQUALS
|| element.function == RPNElement::FUNCTION_HAS
|| element.function == RPNElement::FUNCTION_IN
|| element.function == RPNElement::FUNCTION_NOT_IN
|| element.function == RPNElement::FUNCTION_MULTI_SEARCH
|| element.function == RPNElement::FUNCTION_MATCH
|| element.function == RPNElement::FUNCTION_HAS_ANY
|| element.function == RPNElement::ALWAYS_FALSE)
{
rpn_stack.push_back(false);
}
else if (element.function == RPNElement::FUNCTION_NOT)
{
// do nothing
}
else if (element.function == RPNElement::FUNCTION_AND)
{
auto arg1 = rpn_stack.back();
rpn_stack.pop_back();
auto arg2 = rpn_stack.back();
rpn_stack.back() = arg1 && arg2;
}
else if (element.function == RPNElement::FUNCTION_OR)
{
auto arg1 = rpn_stack.back();
rpn_stack.pop_back();
auto arg2 = rpn_stack.back();
rpn_stack.back() = arg1 || arg2;
}
else
throw Exception(ErrorCodes::LOGICAL_ERROR, "Unexpected function type in KeyCondition::RPNElement");
}
return rpn_stack[0];
}
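// --- Illustrative sketch (not part of the ClickHouse sources) ---------------
// How alwaysUnknownOrTrue() folds the RPN above: usable atoms push `false`,
// UNKNOWN/ALWAYS_TRUE push `true`, AND/OR combine the two top entries, and NOT
// is a no-op because negating "unknown" is still unknown. The string opcodes
// here are stand-ins for RPNElement::Function.
#include <cassert>
#include <string>
#include <vector>

static bool alwaysUnknownOrTrue(const std::vector<std::string> & rpn)
{
    std::vector<bool> stack;
    for (const auto & op : rpn)
    {
        if (op == "UNKNOWN" || op == "ALWAYS_TRUE")
            stack.push_back(true);
        else if (op == "AND" || op == "OR")
        {
            bool a = stack.back(); stack.pop_back();
            bool b = stack.back(); stack.pop_back();
            stack.push_back(op == "AND" ? (a && b) : (a || b));
        }
        else if (op != "NOT")
            stack.push_back(false);        /// EQUALS, HAS, IN, MATCH, ...: the index can help
    }
    return stack.back();
}

int main()
{
    assert(!alwaysUnknownOrTrue({"EQUALS", "UNKNOWN", "AND"}));   /// s = 'x' AND <unindexed> : usable
    assert(alwaysUnknownOrTrue({"EQUALS", "UNKNOWN", "OR"}));     /// s = 'x' OR  <unindexed> : not usable
}
// ----------------------------------------------------------------------------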
/// Keep in-sync with MergeTreeIndexConditionGin::mayBeTrueOnGranuleInPart
bool MergeTreeConditionBloomFilterText::mayBeTrueOnGranule(MergeTreeIndexGranulePtr idx_granule) const
{
std::shared_ptr<MergeTreeIndexGranuleBloomFilterText> granule
= std::dynamic_pointer_cast<MergeTreeIndexGranuleBloomFilterText>(idx_granule);
if (!granule)
throw Exception(ErrorCodes::LOGICAL_ERROR, "BloomFilter index condition got a granule with the wrong type.");
/// Check like in KeyCondition.
std::vector<BoolMask> rpn_stack;
for (const auto & element : rpn)
{
if (element.function == RPNElement::FUNCTION_UNKNOWN)
{
rpn_stack.emplace_back(true, true);
}
else if (element.function == RPNElement::FUNCTION_EQUALS
|| element.function == RPNElement::FUNCTION_NOT_EQUALS
|| element.function == RPNElement::FUNCTION_HAS)
{
rpn_stack.emplace_back(granule->bloom_filters[element.key_column].contains(*element.bloom_filter), true);
if (element.function == RPNElement::FUNCTION_NOT_EQUALS)
rpn_stack.back() = !rpn_stack.back();
}
else if (element.function == RPNElement::FUNCTION_IN
|| element.function == RPNElement::FUNCTION_NOT_IN)
{
std::vector<bool> result(element.set_bloom_filters.back().size(), true);
for (size_t column = 0; column < element.set_key_position.size(); ++column)
{
const size_t key_idx = element.set_key_position[column];
const auto & bloom_filters = element.set_bloom_filters[column];
for (size_t row = 0; row < bloom_filters.size(); ++row)
result[row] = result[row] && granule->bloom_filters[key_idx].contains(bloom_filters[row]);
}
rpn_stack.emplace_back(
std::find(std::cbegin(result), std::cend(result), true) != std::end(result), true);
if (element.function == RPNElement::FUNCTION_NOT_IN)
rpn_stack.back() = !rpn_stack.back();
}
else if (element.function == RPNElement::FUNCTION_MULTI_SEARCH
|| element.function == RPNElement::FUNCTION_HAS_ANY)
{
std::vector<bool> result(element.set_bloom_filters.back().size(), true);
const auto & bloom_filters = element.set_bloom_filters[0];
for (size_t row = 0; row < bloom_filters.size(); ++row)
result[row] = result[row] && granule->bloom_filters[element.key_column].contains(bloom_filters[row]);
rpn_stack.emplace_back(std::find(std::cbegin(result), std::cend(result), true) != std::end(result), true);
}
else if (element.function == RPNElement::FUNCTION_MATCH)
{
if (!element.set_bloom_filters.empty())
{
/// Alternative substrings
std::vector<bool> result(element.set_bloom_filters.back().size(), true);
const auto & bloom_filters = element.set_bloom_filters[0];
for (size_t row = 0; row < bloom_filters.size(); ++row)
result[row] = result[row] && granule->bloom_filters[element.key_column].contains(bloom_filters[row]);
rpn_stack.emplace_back(std::find(std::cbegin(result), std::cend(result), true) != std::end(result), true);
}
else if (element.bloom_filter)
{
/// Required substrings
rpn_stack.emplace_back(granule->bloom_filters[element.key_column].contains(*element.bloom_filter), true);
}
}
else if (element.function == RPNElement::FUNCTION_NOT)
{
rpn_stack.back() = !rpn_stack.back();
}
else if (element.function == RPNElement::FUNCTION_AND)
{
auto arg1 = rpn_stack.back();
rpn_stack.pop_back();
auto arg2 = rpn_stack.back();
rpn_stack.back() = arg1 & arg2;
}
else if (element.function == RPNElement::FUNCTION_OR)
{
auto arg1 = rpn_stack.back();
rpn_stack.pop_back();
auto arg2 = rpn_stack.back();
rpn_stack.back() = arg1 | arg2;
}
else if (element.function == RPNElement::ALWAYS_FALSE)
{
rpn_stack.emplace_back(false, true);
}
else if (element.function == RPNElement::ALWAYS_TRUE)
{
rpn_stack.emplace_back(true, false);
}
else
throw Exception(ErrorCodes::LOGICAL_ERROR, "Unexpected function type in BloomFilterCondition::RPNElement");
}
if (rpn_stack.size() != 1)
throw Exception(ErrorCodes::LOGICAL_ERROR, "Unexpected stack size in BloomFilterCondition::mayBeTrueOnGranule");
return rpn_stack[0].can_be_true;
}
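// --- Illustrative sketch (not part of the ClickHouse sources) ---------------
// A minimal version of the three-valued BoolMask algebra that
// mayBeTrueOnGranule() relies on: the pair (can_be_true, can_be_false) records
// whether the predicate might hold / might fail for some row of the granule.
// `Mask` is a made-up name for the real BoolMask.
#include <cassert>

struct Mask
{
    bool can_be_true;
    bool can_be_false;
    Mask operator!() const { return {can_be_false, can_be_true}; }
    Mask operator&(const Mask & o) const { return {can_be_true && o.can_be_true, can_be_false || o.can_be_false}; }
    Mask operator|(const Mask & o) const { return {can_be_true || o.can_be_true, can_be_false && o.can_be_false}; }
};

int main()
{
    Mask filter_hit{true, true};                        /// bloom filter says "maybe present"
    Mask filter_miss{false, true};                      /// bloom filter says "definitely absent"
    assert(!(filter_hit & filter_miss).can_be_true);    /// AND with a definite miss prunes the granule
    assert((filter_hit | filter_miss).can_be_true);     /// OR keeps it
    assert((!filter_miss).can_be_true);                 /// NOT of a definite miss may be true
}
// ----------------------------------------------------------------------------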
std::optional<size_t> MergeTreeConditionBloomFilterText::getKeyIndex(const std::string & key_column_name)
{
const auto it = std::ranges::find(index_columns, key_column_name);
return it == index_columns.end() ? std::nullopt : std::make_optional<size_t>(std::ranges::distance(index_columns.cbegin(), it));
}
bool MergeTreeConditionBloomFilterText::extractAtomFromTree(const RPNBuilderTreeNode & node, RPNElement & out)
{
{
Field const_value;
DataTypePtr const_type;
if (node.tryGetConstant(const_value, const_type))
{
/// Check constant like in KeyCondition
if (const_value.getType() == Field::Types::UInt64)
{
out.function = const_value.get<UInt64>() ? RPNElement::ALWAYS_TRUE : RPNElement::ALWAYS_FALSE;
return true;
}
if (const_value.getType() == Field::Types::Int64)
{
out.function = const_value.get<Int64>() ? RPNElement::ALWAYS_TRUE : RPNElement::ALWAYS_FALSE;
return true;
}
if (const_value.getType() == Field::Types::Float64)
{
out.function = const_value.get<Float64>() != 0.0 ? RPNElement::ALWAYS_TRUE : RPNElement::ALWAYS_FALSE;
return true;
}
}
}
if (node.isFunction())
{
auto function_node = node.toFunctionNode();
auto function_name = function_node.getFunctionName();
size_t arguments_size = function_node.getArgumentsSize();
if (arguments_size != 2)
return false;
auto left_argument = function_node.getArgumentAt(0);
auto right_argument = function_node.getArgumentAt(1);
if (functionIsInOrGlobalInOperator(function_name))
{
if (tryPrepareSetBloomFilter(left_argument, right_argument, out))
{
if (function_name == "notIn")
{
out.function = RPNElement::FUNCTION_NOT_IN;
return true;
}
else if (function_name == "in")
{
out.function = RPNElement::FUNCTION_IN;
return true;
}
}
}
else if (function_name == "equals" ||
function_name == "notEquals" ||
function_name == "has" ||
function_name == "mapContains" ||
function_name == "match" ||
function_name == "like" ||
function_name == "notLike" ||
function_name.starts_with("hasToken") ||
function_name == "startsWith" ||
function_name == "endsWith" ||
function_name == "multiSearchAny" ||
function_name == "hasAny")
{
Field const_value;
DataTypePtr const_type;
if (right_argument.tryGetConstant(const_value, const_type))
{
if (traverseTreeEquals(function_name, left_argument, const_type, const_value, out))
return true;
}
else if (left_argument.tryGetConstant(const_value, const_type) && (function_name == "equals" || function_name == "notEquals"))
{
if (traverseTreeEquals(function_name, right_argument, const_type, const_value, out))
return true;
}
}
}
return false;
}
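// --- Illustrative sketch (not part of the ClickHouse sources) ---------------
// The constant folding at the top of extractAtomFromTree(), with std::variant
// standing in for Field: a bare numeric constant in the WHERE tree collapses
// to ALWAYS_TRUE / ALWAYS_FALSE before any function matching happens.
// `foldConstant` and `Atom` are made-up names.
#include <cassert>
#include <cstdint>
#include <variant>

enum class Atom { ALWAYS_TRUE, ALWAYS_FALSE, NOT_A_CONSTANT };

static Atom foldConstant(const std::variant<std::monostate, uint64_t, int64_t, double> & value)
{
    if (const auto * u = std::get_if<uint64_t>(&value))
        return *u ? Atom::ALWAYS_TRUE : Atom::ALWAYS_FALSE;
    if (const auto * i = std::get_if<int64_t>(&value))
        return *i ? Atom::ALWAYS_TRUE : Atom::ALWAYS_FALSE;
    if (const auto * f = std::get_if<double>(&value))
        return *f != 0.0 ? Atom::ALWAYS_TRUE : Atom::ALWAYS_FALSE;
    return Atom::NOT_A_CONSTANT;                        /// handled by the function-matching branches above
}

int main()
{
    assert(foldConstant(uint64_t{1}) == Atom::ALWAYS_TRUE);    /// WHERE 1
    assert(foldConstant(int64_t{0}) == Atom::ALWAYS_FALSE);    /// WHERE 0
    assert(foldConstant({}) == Atom::NOT_A_CONSTANT);          /// WHERE str = 'x' and similar atoms
}
// ----------------------------------------------------------------------------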
bool MergeTreeConditionBloomFilterText::traverseTreeEquals(
const String & function_name,
const RPNBuilderTreeNode & key_node,
const DataTypePtr & value_type,
const Field & value_field,
RPNElement & out)
{
auto value_data_type = WhichDataType(value_type);
if (!value_data_type.isStringOrFixedString() && !value_data_type.isArray())
return false;
Field const_value = value_field;
const auto column_name = key_node.getColumnName();
auto key_index = getKeyIndex(column_name);
const auto map_key_index = getKeyIndex(fmt::format("mapKeys({})", column_name));
if (key_node.isFunction())
{
auto key_function_node = key_node.toFunctionNode();
auto key_function_node_function_name = key_function_node.getFunctionName();
if (key_function_node_function_name == "arrayElement")
{
/** Try to parse arrayElement for mapKeys index.
* It is important to ignore keys like column_map['Key'] = '' because if the key does not exist in the map,
* arrayElement returns the default value.
*
* We cannot skip keys that do not exist in the map if the comparison is with the default value of the type,
* because that way we would skip granules where the map key does not exist.
*/
if (value_field == value_type->getDefault())
return false;
auto first_argument = key_function_node.getArgumentAt(0);
const auto map_column_name = first_argument.getColumnName();
if (const auto map_keys_index = getKeyIndex(fmt::format("mapKeys({})", map_column_name)))
{
auto second_argument = key_function_node.getArgumentAt(1);
DataTypePtr const_type;
if (second_argument.tryGetConstant(const_value, const_type))
{
key_index = map_keys_index;
auto const_data_type = WhichDataType(const_type);
if (!const_data_type.isStringOrFixedString() && !const_data_type.isArray())
return false;
}
else
{
return false;
}
}
else if (const auto map_values_exists = getKeyIndex(fmt::format("mapValues({})", map_column_name)))
{
key_index = map_values_exists;
}
else
{
return false;
}
}
}
const auto lowercase_key_index = getKeyIndex(fmt::format("lower({})", column_name));
const auto is_has_token_case_insensitive = function_name.starts_with("hasTokenCaseInsensitive");
if (const auto is_case_insensitive_scenario = is_has_token_case_insensitive && lowercase_key_index;
function_name.starts_with("hasToken") && ((!is_has_token_case_insensitive && key_index) || is_case_insensitive_scenario))
{
out.key_column = is_case_insensitive_scenario ? *lowercase_key_index : *key_index;
out.function = RPNElement::FUNCTION_EQUALS;
out.bloom_filter = std::make_unique<BloomFilter>(params);
auto value = const_value.get<String>();
if (is_case_insensitive_scenario)
std::ranges::transform(value, value.begin(), [](const auto & c) { return static_cast<char>(std::tolower(c)); });
token_extractor->stringToBloomFilter(value.data(), value.size(), *out.bloom_filter);
return true;
}
if (!key_index && !map_key_index)
return false;
if (map_key_index && (function_name == "has" || function_name == "mapContains"))
{
out.key_column = *key_index;
out.function = RPNElement::FUNCTION_HAS;
out.bloom_filter = std::make_unique<BloomFilter>(params);
auto & value = const_value.get<String>();
token_extractor->stringToBloomFilter(value.data(), value.size(), *out.bloom_filter);
return true;
}
else if (function_name == "has")
{
out.key_column = *key_index;
out.function = RPNElement::FUNCTION_HAS;
out.bloom_filter = std::make_unique<BloomFilter>(params);
auto & value = const_value.get<String>();
token_extractor->stringToBloomFilter(value.data(), value.size(), *out.bloom_filter);
return true;
}
if (function_name == "notEquals")
{
out.key_column = *key_index;
out.function = RPNElement::FUNCTION_NOT_EQUALS;
out.bloom_filter = std::make_unique<BloomFilter>(params);
const auto & value = const_value.get<String>();
token_extractor->stringToBloomFilter(value.data(), value.size(), *out.bloom_filter);
return true;
}
else if (function_name == "equals")
{
out.key_column = *key_index;
out.function = RPNElement::FUNCTION_EQUALS;
out.bloom_filter = std::make_unique<BloomFilter>(params);
const auto & value = const_value.get<String>();
token_extractor->stringToBloomFilter(value.data(), value.size(), *out.bloom_filter);
return true;
}
else if (function_name == "like")
{
out.key_column = *key_index;
out.function = RPNElement::FUNCTION_EQUALS;
out.bloom_filter = std::make_unique<BloomFilter>(params);
const auto & value = const_value.get<String>();
token_extractor->stringLikeToBloomFilter(value.data(), value.size(), *out.bloom_filter);
return true;
}
else if (function_name == "notLike")
{
out.key_column = *key_index;
out.function = RPNElement::FUNCTION_NOT_EQUALS;
out.bloom_filter = std::make_unique<BloomFilter>(params);
const auto & value = const_value.get<String>();
token_extractor->stringLikeToBloomFilter(value.data(), value.size(), *out.bloom_filter);
return true;
}
else if (function_name == "startsWith")
{
out.key_column = *key_index;
out.function = RPNElement::FUNCTION_EQUALS;
out.bloom_filter = std::make_unique<BloomFilter>(params);
const auto & value = const_value.get<String>();
token_extractor->stringToBloomFilter(value.data(), value.size(), *out.bloom_filter);
return true;
}
else if (function_name == "endsWith")
{
out.key_column = *key_index;
out.function = RPNElement::FUNCTION_EQUALS;
out.bloom_filter = std::make_unique<BloomFilter>(params);
const auto & value = const_value.get<String>();
token_extractor->stringToBloomFilter(value.data(), value.size(), *out.bloom_filter);
return true;
}
else if (function_name == "multiSearchAny"
|| function_name == "hasAny")
{
out.key_column = *key_index;
out.function = function_name == "multiSearchAny" ?
RPNElement::FUNCTION_MULTI_SEARCH :
RPNElement::FUNCTION_HAS_ANY;
/// A 2D vector is not needed here, but it is used because it already exists for FUNCTION_IN
std::vector<std::vector<BloomFilter>> bloom_filters;
bloom_filters.emplace_back();
for (const auto & element : const_value.get<Array>())
{
if (element.getType() != Field::Types::String)
return false;
bloom_filters.back().emplace_back(params);
const auto & value = element.get<String>();
token_extractor->stringToBloomFilter(value.data(), value.size(), bloom_filters.back().back());
}
out.set_bloom_filters = std::move(bloom_filters);
return true;
}
else if (function_name == "match")
{
out.key_column = *key_index;
out.function = RPNElement::FUNCTION_MATCH;
out.bloom_filter = std::make_unique<BloomFilter>(params);
auto & value = const_value.get<String>();
String required_substring;
bool dummy_is_trivial, dummy_required_substring_is_prefix;
std::vector<String> alternatives;
OptimizedRegularExpression::analyze(value, required_substring, dummy_is_trivial, dummy_required_substring_is_prefix, alternatives);
if (required_substring.empty() && alternatives.empty())
return false;
/// out.set_bloom_filters means alternatives exist
/// out.bloom_filter means required_substring exists
if (!alternatives.empty())
{
std::vector<std::vector<BloomFilter>> bloom_filters;
bloom_filters.emplace_back();
for (const auto & alternative : alternatives)
{
bloom_filters.back().emplace_back(params);
token_extractor->stringToBloomFilter(alternative.data(), alternative.size(), bloom_filters.back().back());
}
out.set_bloom_filters = std::move(bloom_filters);
}
else
token_extractor->stringToBloomFilter(required_substring.data(), required_substring.size(), *out.bloom_filter);
return true;
}
return false;
}
bool MergeTreeConditionBloomFilterText::tryPrepareSetBloomFilter(
const RPNBuilderTreeNode & left_argument,
const RPNBuilderTreeNode & right_argument,
RPNElement & out)
{
std::vector<KeyTuplePositionMapping> key_tuple_mapping;
DataTypes data_types;
auto left_argument_function_node_optional = left_argument.toFunctionNodeOrNull();
if (left_argument_function_node_optional && left_argument_function_node_optional->getFunctionName() == "tuple")
{
const auto & left_argument_function_node = *left_argument_function_node_optional;
size_t left_argument_function_node_arguments_size = left_argument_function_node.getArgumentsSize();
for (size_t i = 0; i < left_argument_function_node_arguments_size; ++i)
{
if (const auto key = getKeyIndex(left_argument_function_node.getArgumentAt(i).getColumnName()))
{
key_tuple_mapping.emplace_back(i, *key);
data_types.push_back(index_data_types[*key]);
}
}
}
else if (const auto key = getKeyIndex(left_argument.getColumnName()))
{
key_tuple_mapping.emplace_back(0, *key);
data_types.push_back(index_data_types[*key]);
}
if (key_tuple_mapping.empty())
return false;
auto future_set = right_argument.tryGetPreparedSet(data_types);
if (!future_set)
return false;
auto prepared_set = future_set->buildOrderedSetInplace(right_argument.getTreeContext().getQueryContext());
if (!prepared_set || !prepared_set->hasExplicitSetElements())
return false;
for (const auto & prepared_set_data_type : prepared_set->getDataTypes())
{
auto prepared_set_data_type_id = prepared_set_data_type->getTypeId();
if (prepared_set_data_type_id != TypeIndex::String && prepared_set_data_type_id != TypeIndex::FixedString)
return false;
}
std::vector<std::vector<BloomFilter>> bloom_filters;
std::vector<size_t> key_position;
Columns columns = prepared_set->getSetElements();
size_t prepared_set_total_row_count = prepared_set->getTotalRowCount();
for (const auto & elem : key_tuple_mapping)
{
bloom_filters.emplace_back();
key_position.push_back(elem.key_index);
size_t tuple_idx = elem.tuple_index;
const auto & column = columns[tuple_idx];
for (size_t row = 0; row < prepared_set_total_row_count; ++row)
{
bloom_filters.back().emplace_back(params);
auto ref = column->getDataAt(row);
token_extractor->stringPaddedToBloomFilter(ref.data, ref.size, bloom_filters.back().back());
}
}
out.set_key_position = std::move(key_position);
out.set_bloom_filters = std::move(bloom_filters);
return true;
}
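// --- Illustrative sketch (not part of the ClickHouse sources) ---------------
// The shape of the data tryPrepareSetBloomFilter() builds for
// `(k1, k2) IN ((..), (..))`, with std::unordered_set standing in for
// BloomFilter: one filter per (key column, set row); a set row can match a
// granule only if every one of its per-column filters matches.
#include <string>
#include <unordered_set>
#include <vector>

using ToyFilter = std::unordered_set<std::string>;

static bool anySetRowMayMatch(
    const std::vector<std::vector<ToyFilter>> & set_filters,   // [key column][set row]
    const std::vector<ToyFilter> & granule_filters,            // one filter per indexed column
    const std::vector<size_t> & key_positions)                 // indexed column used by each key column
{
    size_t rows = set_filters.empty() ? 0 : set_filters.front().size();
    std::vector<bool> row_may_match(rows, true);
    for (size_t col = 0; col < key_positions.size(); ++col)
        for (size_t row = 0; row < rows; ++row)
            for (const auto & token : set_filters[col][row])
                row_may_match[row] = row_may_match[row] && granule_filters[key_positions[col]].count(token);
    for (size_t row = 0; row < rows; ++row)
        if (row_may_match[row])
            return true;
    return false;
}

int main()
{
    std::vector<std::vector<ToyFilter>> set_filters = {{{"abc"}, {"xyz"}}};   // one key column, two set rows
    std::vector<ToyFilter> granule_filters = {{"abc", "def"}};                // tokens seen in this granule
    return anySetRowMayMatch(set_filters, granule_filters, {0}) ? 0 : 1;      // the row {'abc'} may match
}
// ----------------------------------------------------------------------------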
MergeTreeIndexGranulePtr MergeTreeIndexBloomFilterText::createIndexGranule() const
{
return std::make_shared<MergeTreeIndexGranuleBloomFilterText>(index.name, index.column_names.size(), params);
}
MergeTreeIndexAggregatorPtr MergeTreeIndexBloomFilterText::createIndexAggregator(const MergeTreeWriterSettings & /*settings*/) const
{
return std::make_shared<MergeTreeIndexAggregatorBloomFilterText>(index.column_names, index.name, params, token_extractor.get());
}
MergeTreeIndexConditionPtr MergeTreeIndexBloomFilterText::createIndexCondition(
const ActionsDAGPtr & filter_dag, ContextPtr context) const
{
return std::make_shared<MergeTreeConditionBloomFilterText>(filter_dag, context, index.sample_block, params, token_extractor.get());
}
MergeTreeIndexPtr bloomFilterIndexTextCreator(
const IndexDescription & index)
{
if (index.type == NgramTokenExtractor::getName())
{
size_t n = index.arguments[0].get<size_t>();
BloomFilterParameters params(
index.arguments[1].get<size_t>(),
index.arguments[2].get<size_t>(),
index.arguments[3].get<size_t>());
auto tokenizer = std::make_unique<NgramTokenExtractor>(n);
return std::make_shared<MergeTreeIndexBloomFilterText>(index, params, std::move(tokenizer));
}
else if (index.type == SplitTokenExtractor::getName())
{
BloomFilterParameters params(
index.arguments[0].get<size_t>(),
index.arguments[1].get<size_t>(),
index.arguments[2].get<size_t>());
auto tokenizer = std::make_unique<SplitTokenExtractor>();
return std::make_shared<MergeTreeIndexBloomFilterText>(index, params, std::move(tokenizer));
}
else
{
throw Exception(ErrorCodes::LOGICAL_ERROR, "Unknown index type: {}", backQuote(index.name));
}
}
void bloomFilterIndexTextValidator(const IndexDescription & index, bool /*attach*/)
{
for (const auto & index_data_type : index.data_types)
{
WhichDataType data_type(index_data_type);
if (data_type.isArray())
{
const auto & array_type = assert_cast<const DataTypeArray &>(*index_data_type);
data_type = WhichDataType(array_type.getNestedType());
}
else if (data_type.isLowCardinality())
{
const auto & low_cardinality = assert_cast<const DataTypeLowCardinality &>(*index_data_type);
data_type = WhichDataType(low_cardinality.getDictionaryType());
}
if (!data_type.isString() && !data_type.isFixedString() && !data_type.isIPv6())
throw Exception(ErrorCodes::INCORRECT_QUERY,
"Ngram and token bloom filter indexes can only be used with column types `String`, `FixedString`, `LowCardinality(String)`, `LowCardinality(FixedString)`, `Array(String)` or `Array(FixedString)`");
}
if (index.type == NgramTokenExtractor::getName())
{
if (index.arguments.size() != 4)
throw Exception(ErrorCodes::INCORRECT_QUERY, "`ngrambf` index must have exactly 4 arguments.");
}
else if (index.type == SplitTokenExtractor::getName())
{
if (index.arguments.size() != 3)
throw Exception(ErrorCodes::INCORRECT_QUERY, "`tokenbf` index must have exactly 3 arguments.");
}
else
{
throw Exception(ErrorCodes::LOGICAL_ERROR, "Unknown index type: {}", backQuote(index.name));
}
assert(index.arguments.size() >= 3);
for (const auto & arg : index.arguments)
if (arg.getType() != Field::Types::UInt64)
throw Exception(ErrorCodes::BAD_ARGUMENTS, "All parameters to *bf_v1 index must be unsigned integers");
/// Just validate
BloomFilterParameters params(
index.arguments[0].get<size_t>(),
index.arguments[1].get<size_t>(),
index.arguments[2].get<size_t>());
}
}

View File

@ -1,23 +1,24 @@
#pragma once
#include <Interpreters/GinFilter.h>
#include <Interpreters/ITokenExtractor.h>
#include <Storages/MergeTree/KeyCondition.h>
#include <Storages/MergeTree/MergeTreeData.h>
#include <base/types.h>
#include <atomic>
#include <memory>
#include <Storages/MergeTree/MergeTreeIndices.h>
#include <Storages/MergeTree/KeyCondition.h>
#include <Interpreters/BloomFilter.h>
#include <Interpreters/ITokenExtractor.h>
namespace DB
{
struct MergeTreeIndexGranuleInverted final : public IMergeTreeIndexGranule
struct MergeTreeIndexGranuleBloomFilterText final : public IMergeTreeIndexGranule
{
explicit MergeTreeIndexGranuleInverted(
explicit MergeTreeIndexGranuleBloomFilterText(
const String & index_name_,
size_t columns_number,
const GinFilterParameters & params_);
const BloomFilterParameters & params_);
~MergeTreeIndexGranuleInverted() override = default;
~MergeTreeIndexGranuleBloomFilterText() override = default;
void serializeBinary(WriteBuffer & ostr) const override;
void deserializeBinary(ReadBuffer & istr, MergeTreeIndexVersion version) override;
@ -25,62 +26,52 @@ struct MergeTreeIndexGranuleInverted final : public IMergeTreeIndexGranule
bool empty() const override { return !has_elems; }
const String index_name;
const GinFilterParameters params;
GinFilters gin_filters;
const BloomFilterParameters params;
std::vector<BloomFilter> bloom_filters;
bool has_elems;
};
using MergeTreeIndexGranuleInvertedPtr = std::shared_ptr<MergeTreeIndexGranuleInverted>;
using MergeTreeIndexGranuleBloomFilterTextPtr = std::shared_ptr<MergeTreeIndexGranuleBloomFilterText>;
struct MergeTreeIndexAggregatorInverted final : IMergeTreeIndexAggregator
struct MergeTreeIndexAggregatorBloomFilterText final : IMergeTreeIndexAggregator
{
explicit MergeTreeIndexAggregatorInverted(
GinIndexStorePtr store_,
explicit MergeTreeIndexAggregatorBloomFilterText(
const Names & index_columns_,
const String & index_name_,
const GinFilterParameters & params_,
const BloomFilterParameters & params_,
TokenExtractorPtr token_extractor_);
~MergeTreeIndexAggregatorInverted() override = default;
~MergeTreeIndexAggregatorBloomFilterText() override = default;
bool empty() const override { return !granule || granule->empty(); }
MergeTreeIndexGranulePtr getGranuleAndReset() override;
void update(const Block & block, size_t * pos, size_t limit) override;
void addToGinFilter(UInt32 rowID, const char * data, size_t length, GinFilter & gin_filter);
GinIndexStorePtr store;
Names index_columns;
const String index_name;
const GinFilterParameters params;
String index_name;
BloomFilterParameters params;
TokenExtractorPtr token_extractor;
MergeTreeIndexGranuleInvertedPtr granule;
MergeTreeIndexGranuleBloomFilterTextPtr granule;
};
class MergeTreeConditionInverted final : public IMergeTreeIndexCondition, WithContext
class MergeTreeConditionBloomFilterText final : public IMergeTreeIndexCondition
{
public:
MergeTreeConditionInverted(
MergeTreeConditionBloomFilterText(
const ActionsDAGPtr & filter_actions_dag,
ContextPtr context,
const Block & index_sample_block,
const GinFilterParameters & params_,
const BloomFilterParameters & params_,
TokenExtractorPtr token_extactor_);
~MergeTreeConditionInverted() override = default;
~MergeTreeConditionBloomFilterText() override = default;
bool alwaysUnknownOrTrue() const override;
bool mayBeTrueOnGranule([[maybe_unused]]MergeTreeIndexGranulePtr idx_granule) const override
{
/// should call mayBeTrueOnGranuleInPart instead
assert(false);
return false;
}
bool mayBeTrueOnGranuleInPart(MergeTreeIndexGranulePtr idx_granule, [[maybe_unused]] PostingsCacheForStore & cache_store) const;
bool mayBeTrueOnGranule(MergeTreeIndexGranulePtr idx_granule) const override;
private:
struct KeyTuplePositionMapping
{
@ -99,9 +90,10 @@ private:
FUNCTION_NOT_EQUALS,
FUNCTION_HAS,
FUNCTION_IN,
FUNCTION_MATCH,
FUNCTION_NOT_IN,
FUNCTION_MULTI_SEARCH,
FUNCTION_MATCH,
FUNCTION_HAS_ANY,
FUNCTION_UNKNOWN, /// Can take any value.
/// Operators of the logical expression.
FUNCTION_NOT,
@ -113,19 +105,18 @@ private:
};
RPNElement( /// NOLINT
Function function_ = FUNCTION_UNKNOWN, size_t key_column_ = 0, std::unique_ptr<GinFilter> && const_gin_filter_ = nullptr)
: function(function_), key_column(key_column_), gin_filter(std::move(const_gin_filter_)) {}
Function function_ = FUNCTION_UNKNOWN, size_t key_column_ = 0, std::unique_ptr<BloomFilter> && const_bloom_filter_ = nullptr)
: function(function_), key_column(key_column_), bloom_filter(std::move(const_bloom_filter_)) {}
Function function = FUNCTION_UNKNOWN;
/// For FUNCTION_EQUALS, FUNCTION_NOT_EQUALS and FUNCTION_MULTI_SEARCH
/// For FUNCTION_EQUALS, FUNCTION_NOT_EQUALS, FUNCTION_MULTI_SEARCH and FUNCTION_HAS_ANY
size_t key_column;
/// For FUNCTION_EQUALS, FUNCTION_NOT_EQUALS
std::unique_ptr<GinFilter> gin_filter;
std::unique_ptr<BloomFilter> bloom_filter;
/// For FUNCTION_IN, FUNCTION_NOT_IN and FUNCTION_MULTI_SEARCH
std::vector<GinFilters> set_gin_filters;
/// For FUNCTION_IN, FUNCTION_NOT_IN, FUNCTION_MULTI_SEARCH and FUNCTION_HAS_ANY
std::vector<std::vector<BloomFilter>> set_bloom_filters;
/// For FUNCTION_IN and FUNCTION_NOT_IN
std::vector<size_t> set_key_position;
@ -133,47 +124,48 @@ private:
using RPN = std::vector<RPNElement>;
bool traverseAtomAST(const RPNBuilderTreeNode & node, RPNElement & out);
bool extractAtomFromTree(const RPNBuilderTreeNode & node, RPNElement & out);
bool traverseASTEquals(
bool traverseTreeEquals(
const String & function_name,
const RPNBuilderTreeNode & key_ast,
const RPNBuilderTreeNode & key_node,
const DataTypePtr & value_type,
const Field & value_field,
RPNElement & out);
bool tryPrepareSetGinFilter(const RPNBuilderTreeNode & lhs, const RPNBuilderTreeNode & rhs, RPNElement & out);
std::optional<size_t> getKeyIndex(const std::string & key_column_name);
bool tryPrepareSetBloomFilter(const RPNBuilderTreeNode & left_argument, const RPNBuilderTreeNode & right_argument, RPNElement & out);
static bool createFunctionEqualsCondition(
RPNElement & out, const Field & value, const GinFilterParameters & params, TokenExtractorPtr token_extractor);
RPNElement & out, const Field & value, const BloomFilterParameters & params, TokenExtractorPtr token_extractor);
const Block & header;
GinFilterParameters params;
Names index_columns;
DataTypes index_data_types;
BloomFilterParameters params;
TokenExtractorPtr token_extractor;
RPN rpn;
/// Sets from syntax analyzer.
PreparedSetsPtr prepared_sets;
};
class MergeTreeIndexInverted final : public IMergeTreeIndex
class MergeTreeIndexBloomFilterText final : public IMergeTreeIndex
{
public:
MergeTreeIndexInverted(
MergeTreeIndexBloomFilterText(
const IndexDescription & index_,
const GinFilterParameters & params_,
const BloomFilterParameters & params_,
std::unique_ptr<ITokenExtractor> && token_extractor_)
: IMergeTreeIndex(index_)
, params(params_)
, token_extractor(std::move(token_extractor_)) {}
~MergeTreeIndexInverted() override = default;
~MergeTreeIndexBloomFilterText() override = default;
MergeTreeIndexGranulePtr createIndexGranule() const override;
MergeTreeIndexAggregatorPtr createIndexAggregator(const MergeTreeWriterSettings & settings) const override;
MergeTreeIndexAggregatorPtr createIndexAggregatorForPart(const GinIndexStorePtr & store, const MergeTreeWriterSettings & /*settings*/) const override;
MergeTreeIndexConditionPtr createIndexCondition(const ActionsDAGPtr & filter_actions_dag, ContextPtr context) const override;
GinFilterParameters params;
MergeTreeIndexConditionPtr createIndexCondition(
const ActionsDAGPtr & filter_dag, ContextPtr context) const override;
BloomFilterParameters params;
/// Function for selecting next token.
std::unique_ptr<ITokenExtractor> token_extractor;
};

View File

@ -1,729 +0,0 @@
#include <Common/HashTable/ClearableHashMap.h>
#include <Common/FieldVisitorsAccurateComparison.h>
#include <DataTypes/DataTypeArray.h>
#include <DataTypes/DataTypeMap.h>
#include <DataTypes/DataTypeTuple.h>
#include <Columns/ColumnConst.h>
#include <Columns/ColumnTuple.h>
#include <Storages/MergeTree/RPNBuilder.h>
#include <Storages/MergeTree/MergeTreeIndexUtils.h>
#include <Storages/MergeTree/MergeTreeIndexGranuleBloomFilter.h>
#include <Storages/MergeTree/MergeTreeIndexConditionBloomFilter.h>
#include <Parsers/ASTSubquery.h>
#include <Parsers/ASTIdentifier.h>
#include <Parsers/ASTLiteral.h>
#include <Parsers/ASTSelectQuery.h>
#include <Interpreters/misc.h>
#include <Interpreters/BloomFilterHash.h>
#include <Interpreters/castColumn.h>
#include <Interpreters/convertFieldToType.h>
namespace DB
{
namespace ErrorCodes
{
extern const int ILLEGAL_TYPE_OF_ARGUMENT;
extern const int LOGICAL_ERROR;
}
namespace
{
ColumnWithTypeAndName getPreparedSetInfo(const ConstSetPtr & prepared_set)
{
if (prepared_set->getDataTypes().size() == 1)
return {prepared_set->getSetElements()[0], prepared_set->getElementsTypes()[0], "dummy"};
Columns set_elements;
for (auto & set_element : prepared_set->getSetElements())
set_elements.emplace_back(set_element->convertToFullColumnIfConst());
return {ColumnTuple::create(set_elements), std::make_shared<DataTypeTuple>(prepared_set->getElementsTypes()), "dummy"};
}
bool hashMatchesFilter(const BloomFilterPtr& bloom_filter, UInt64 hash, size_t hash_functions)
{
return std::all_of(BloomFilterHash::bf_hash_seed,
BloomFilterHash::bf_hash_seed + hash_functions,
[&](const auto &hash_seed)
{
return bloom_filter->findHashWithSeed(hash,
hash_seed);
});
}
bool maybeTrueOnBloomFilter(const IColumn * hash_column, const BloomFilterPtr & bloom_filter, size_t hash_functions, bool match_all)
{
const auto * const_column = typeid_cast<const ColumnConst *>(hash_column);
const auto * non_const_column = typeid_cast<const ColumnUInt64 *>(hash_column);
if (!const_column && !non_const_column)
throw Exception(ErrorCodes::LOGICAL_ERROR, "Hash column must be Const or UInt64.");
if (const_column)
{
return hashMatchesFilter(bloom_filter,
const_column->getValue<UInt64>(),
hash_functions);
}
const ColumnUInt64::Container & hashes = non_const_column->getData();
if (match_all)
{
return std::all_of(hashes.begin(),
hashes.end(),
[&](const auto& hash_row)
{
return hashMatchesFilter(bloom_filter,
hash_row,
hash_functions);
});
}
else
{
return std::any_of(hashes.begin(),
hashes.end(),
[&](const auto& hash_row)
{
return hashMatchesFilter(bloom_filter,
hash_row,
hash_functions);
});
}
}
}
MergeTreeIndexConditionBloomFilter::MergeTreeIndexConditionBloomFilter(
const ActionsDAGPtr & filter_actions_dag, ContextPtr context_, const Block & header_, size_t hash_functions_)
: WithContext(context_), header(header_), hash_functions(hash_functions_)
{
if (!filter_actions_dag)
{
rpn.push_back(RPNElement::FUNCTION_UNKNOWN);
return;
}
RPNBuilder<RPNElement> builder(
filter_actions_dag->getOutputs().at(0),
context_,
[&](const RPNBuilderTreeNode & node, RPNElement & out) { return extractAtomFromTree(node, out); });
rpn = std::move(builder).extractRPN();
}
bool MergeTreeIndexConditionBloomFilter::alwaysUnknownOrTrue() const
{
std::vector<bool> rpn_stack;
for (const auto & element : rpn)
{
if (element.function == RPNElement::FUNCTION_UNKNOWN
|| element.function == RPNElement::ALWAYS_TRUE)
{
rpn_stack.push_back(true);
}
else if (element.function == RPNElement::FUNCTION_EQUALS
|| element.function == RPNElement::FUNCTION_NOT_EQUALS
|| element.function == RPNElement::FUNCTION_HAS
|| element.function == RPNElement::FUNCTION_HAS_ANY
|| element.function == RPNElement::FUNCTION_HAS_ALL
|| element.function == RPNElement::FUNCTION_IN
|| element.function == RPNElement::FUNCTION_NOT_IN
|| element.function == RPNElement::ALWAYS_FALSE)
{
rpn_stack.push_back(false);
}
else if (element.function == RPNElement::FUNCTION_NOT)
{
// do nothing
}
else if (element.function == RPNElement::FUNCTION_AND)
{
auto arg1 = rpn_stack.back();
rpn_stack.pop_back();
auto arg2 = rpn_stack.back();
rpn_stack.back() = arg1 && arg2;
}
else if (element.function == RPNElement::FUNCTION_OR)
{
auto arg1 = rpn_stack.back();
rpn_stack.pop_back();
auto arg2 = rpn_stack.back();
rpn_stack.back() = arg1 || arg2;
}
else
throw Exception(ErrorCodes::LOGICAL_ERROR, "Unexpected function type in KeyCondition::RPNElement");
}
return rpn_stack[0];
}
bool MergeTreeIndexConditionBloomFilter::mayBeTrueOnGranule(const MergeTreeIndexGranuleBloomFilter * granule) const
{
std::vector<BoolMask> rpn_stack;
const auto & filters = granule->getFilters();
for (const auto & element : rpn)
{
if (element.function == RPNElement::FUNCTION_UNKNOWN)
{
rpn_stack.emplace_back(true, true);
}
else if (element.function == RPNElement::FUNCTION_IN
|| element.function == RPNElement::FUNCTION_NOT_IN
|| element.function == RPNElement::FUNCTION_EQUALS
|| element.function == RPNElement::FUNCTION_NOT_EQUALS
|| element.function == RPNElement::FUNCTION_HAS
|| element.function == RPNElement::FUNCTION_HAS_ANY
|| element.function == RPNElement::FUNCTION_HAS_ALL)
{
bool match_rows = true;
bool match_all = element.function == RPNElement::FUNCTION_HAS_ALL;
const auto & predicate = element.predicate;
for (size_t index = 0; match_rows && index < predicate.size(); ++index)
{
const auto & query_index_hash = predicate[index];
const auto & filter = filters[query_index_hash.first];
const ColumnPtr & hash_column = query_index_hash.second;
match_rows = maybeTrueOnBloomFilter(&*hash_column,
filter,
hash_functions,
match_all);
}
rpn_stack.emplace_back(match_rows, true);
if (element.function == RPNElement::FUNCTION_NOT_EQUALS || element.function == RPNElement::FUNCTION_NOT_IN)
rpn_stack.back() = !rpn_stack.back();
}
else if (element.function == RPNElement::FUNCTION_NOT)
{
rpn_stack.back() = !rpn_stack.back();
}
else if (element.function == RPNElement::FUNCTION_OR)
{
auto arg1 = rpn_stack.back();
rpn_stack.pop_back();
auto arg2 = rpn_stack.back();
rpn_stack.back() = arg1 | arg2;
}
else if (element.function == RPNElement::FUNCTION_AND)
{
auto arg1 = rpn_stack.back();
rpn_stack.pop_back();
auto arg2 = rpn_stack.back();
rpn_stack.back() = arg1 & arg2;
}
else if (element.function == RPNElement::ALWAYS_TRUE)
{
rpn_stack.emplace_back(true, false);
}
else if (element.function == RPNElement::ALWAYS_FALSE)
{
rpn_stack.emplace_back(false, true);
}
else
throw Exception(ErrorCodes::LOGICAL_ERROR, "Unexpected function type in KeyCondition::RPNElement");
}
if (rpn_stack.size() != 1)
throw Exception(ErrorCodes::LOGICAL_ERROR, "Unexpected stack size in KeyCondition::mayBeTrueInRange");
return rpn_stack[0].can_be_true;
}
bool MergeTreeIndexConditionBloomFilter::extractAtomFromTree(const RPNBuilderTreeNode & node, RPNElement & out)
{
{
Field const_value;
DataTypePtr const_type;
if (node.tryGetConstant(const_value, const_type))
{
if (const_value.getType() == Field::Types::UInt64)
{
out.function = const_value.get<UInt64>() ? RPNElement::ALWAYS_TRUE : RPNElement::ALWAYS_FALSE;
return true;
}
if (const_value.getType() == Field::Types::Int64)
{
out.function = const_value.get<Int64>() ? RPNElement::ALWAYS_TRUE : RPNElement::ALWAYS_FALSE;
return true;
}
if (const_value.getType() == Field::Types::Float64)
{
out.function = const_value.get<Float64>() != 0.0 ? RPNElement::ALWAYS_TRUE : RPNElement::ALWAYS_FALSE;
return true;
}
}
}
return traverseFunction(node, out, nullptr /*parent*/);
}
bool MergeTreeIndexConditionBloomFilter::traverseFunction(const RPNBuilderTreeNode & node, RPNElement & out, const RPNBuilderTreeNode * parent)
{
bool maybe_useful = false;
if (node.isFunction())
{
const auto function = node.toFunctionNode();
auto arguments_size = function.getArgumentsSize();
auto function_name = function.getFunctionName();
for (size_t i = 0; i < arguments_size; ++i)
{
auto argument = function.getArgumentAt(i);
if (traverseFunction(argument, out, &node))
maybe_useful = true;
}
if (arguments_size != 2)
return false;
auto lhs_argument = function.getArgumentAt(0);
auto rhs_argument = function.getArgumentAt(1);
if (functionIsInOrGlobalInOperator(function_name))
{
if (auto future_set = rhs_argument.tryGetPreparedSet(); future_set)
{
if (auto prepared_set = future_set->buildOrderedSetInplace(rhs_argument.getTreeContext().getQueryContext()); prepared_set)
{
if (prepared_set->hasExplicitSetElements())
{
const auto prepared_info = getPreparedSetInfo(prepared_set);
if (traverseTreeIn(function_name, lhs_argument, prepared_set, prepared_info.type, prepared_info.column, out))
maybe_useful = true;
}
}
}
}
else if (function_name == "equals" ||
function_name == "notEquals" ||
function_name == "has" ||
function_name == "mapContains" ||
function_name == "indexOf" ||
function_name == "hasAny" ||
function_name == "hasAll")
{
Field const_value;
DataTypePtr const_type;
if (rhs_argument.tryGetConstant(const_value, const_type))
{
if (traverseTreeEquals(function_name, lhs_argument, const_type, const_value, out, parent))
maybe_useful = true;
}
else if (lhs_argument.tryGetConstant(const_value, const_type))
{
if (traverseTreeEquals(function_name, rhs_argument, const_type, const_value, out, parent))
maybe_useful = true;
}
}
}
return maybe_useful;
}
bool MergeTreeIndexConditionBloomFilter::traverseTreeIn(
const String & function_name,
const RPNBuilderTreeNode & key_node,
const ConstSetPtr & prepared_set,
const DataTypePtr & type,
const ColumnPtr & column,
RPNElement & out)
{
auto key_node_column_name = key_node.getColumnName();
if (header.has(key_node_column_name))
{
size_t row_size = column->size();
size_t position = header.getPositionByName(key_node_column_name);
const DataTypePtr & index_type = header.getByPosition(position).type;
const auto & converted_column = castColumn(ColumnWithTypeAndName{column, type, ""}, index_type);
out.predicate.emplace_back(std::make_pair(position, BloomFilterHash::hashWithColumn(index_type, converted_column, 0, row_size)));
if (function_name == "in" || function_name == "globalIn")
out.function = RPNElement::FUNCTION_IN;
if (function_name == "notIn" || function_name == "globalNotIn")
out.function = RPNElement::FUNCTION_NOT_IN;
return true;
}
if (key_node.isFunction())
{
auto key_node_function = key_node.toFunctionNode();
auto key_node_function_name = key_node_function.getFunctionName();
size_t key_node_function_arguments_size = key_node_function.getArgumentsSize();
WhichDataType which(type);
if (which.isTuple() && key_node_function_name == "tuple")
{
const auto & tuple_column = typeid_cast<const ColumnTuple *>(column.get());
const auto & tuple_data_type = typeid_cast<const DataTypeTuple *>(type.get());
if (tuple_data_type->getElements().size() != key_node_function_arguments_size || tuple_column->getColumns().size() != key_node_function_arguments_size)
throw Exception(ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT, "Illegal types of arguments of function {}", function_name);
bool match_with_subtype = false;
const auto & sub_columns = tuple_column->getColumns();
const auto & sub_data_types = tuple_data_type->getElements();
for (size_t index = 0; index < key_node_function_arguments_size; ++index)
match_with_subtype |= traverseTreeIn(function_name, key_node_function.getArgumentAt(index), nullptr, sub_data_types[index], sub_columns[index], out);
return match_with_subtype;
}
if (key_node_function_name == "arrayElement")
{
/** Try to parse arrayElement for mapKeys index.
* It is important to ignore keys like column_map['Key'] IN ('') because if the key does not exist in the map,
* arrayElement returns the default value.
*
* We cannot skip keys that do not exist in the map if the comparison is with the default value of the type,
* because that way we would skip granules where the map key does not exist.
*/
if (!prepared_set)
return false;
auto default_column_to_check = type->createColumnConstWithDefaultValue(1)->convertToFullColumnIfConst();
ColumnWithTypeAndName default_column_with_type_to_check { default_column_to_check, type, "" };
ColumnsWithTypeAndName default_columns_with_type_to_check = {default_column_with_type_to_check};
auto set_contains_default_value_predicate_column = prepared_set->execute(default_columns_with_type_to_check, false /*negative*/);
const auto & set_contains_default_value_predicate_column_typed = assert_cast<const ColumnUInt8 &>(*set_contains_default_value_predicate_column);
bool set_contain_default_value = set_contains_default_value_predicate_column_typed.getData()[0];
if (set_contain_default_value)
return false;
auto first_argument = key_node_function.getArgumentAt(0);
const auto column_name = first_argument.getColumnName();
auto map_keys_index_column_name = fmt::format("mapKeys({})", column_name);
auto map_values_index_column_name = fmt::format("mapValues({})", column_name);
if (header.has(map_keys_index_column_name))
{
/// For mapKeys we serialize the key argument with the bloom filter
auto second_argument = key_node_function.getArgumentAt(1);
Field constant_value;
DataTypePtr constant_type;
if (second_argument.tryGetConstant(constant_value, constant_type))
{
size_t position = header.getPositionByName(map_keys_index_column_name);
const DataTypePtr & index_type = header.getByPosition(position).type;
const DataTypePtr actual_type = BloomFilter::getPrimitiveType(index_type);
out.predicate.emplace_back(std::make_pair(position, BloomFilterHash::hashWithField(actual_type.get(), constant_value)));
}
else
{
return false;
}
}
else if (header.has(map_values_index_column_name))
{
/// For mapValues we serialize the set with the bloom filter
size_t row_size = column->size();
size_t position = header.getPositionByName(map_values_index_column_name);
const DataTypePtr & index_type = header.getByPosition(position).type;
const auto & array_type = assert_cast<const DataTypeArray &>(*index_type);
const auto & array_nested_type = array_type.getNestedType();
const auto & converted_column = castColumn(ColumnWithTypeAndName{column, type, ""}, array_nested_type);
out.predicate.emplace_back(std::make_pair(position, BloomFilterHash::hashWithColumn(array_nested_type, converted_column, 0, row_size)));
}
else
{
return false;
}
if (function_name == "in" || function_name == "globalIn")
out.function = RPNElement::FUNCTION_IN;
if (function_name == "notIn" || function_name == "globalNotIn")
out.function = RPNElement::FUNCTION_NOT_IN;
return true;
}
}
return false;
}
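/// Illustrative sketch, assuming a table defined with something like
///     INDEX bf_idx str        TYPE bloom_filter GRANULARITY 1,
///     INDEX mk_idx mapKeys(m) TYPE bloom_filter GRANULARITY 1
/// (column and index names are hypothetical): a predicate such as `str IN ('alpha', 'beta')`
/// hashes every set element into the per-granule bloom filters, while `m['some_key'] IN ('v1', 'v2')`
/// goes through the arrayElement branch above, using the mapKeys()/mapValues() index columns when
/// they exist and bailing out when the set contains the type's default value, as explained above.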
static bool indexOfCanUseBloomFilter(const RPNBuilderTreeNode * parent)
{
if (!parent)
return true;
if (!parent->isFunction())
return false;
auto function = parent->toFunctionNode();
auto function_name = function.getFunctionName();
/// `parent` is the function in which `indexOf` is located.
/// Example: `indexOf(arr, x) = 1`, parent is a function named `equals`.
if (function_name == "and")
{
return true;
}
else if (function_name == "equals" /// notEquals is not applicable
|| function_name == "greater" || function_name == "greaterOrEquals"
|| function_name == "less" || function_name == "lessOrEquals")
{
size_t function_arguments_size = function.getArgumentsSize();
if (function_arguments_size != 2)
return false;
/// We don't handle constant expressions like `indexOf(arr, x) = 1 + 0`, but that limitation is negligible.
/// We should return true when the corresponding expression implies that the array contains the element.
/// Example: when `indexOf(arr, x) > 10` is written, arr must contain the element
/// (at least at the 11th position, but the exact position does not matter).
bool reversed = false;
Field constant_value;
DataTypePtr constant_type;
if (function.getArgumentAt(0).tryGetConstant(constant_value, constant_type))
{
reversed = true;
}
else if (function.getArgumentAt(1).tryGetConstant(constant_value, constant_type))
{
}
else
{
return false;
}
Field zero(0);
bool constant_equal_zero = applyVisitor(FieldVisitorAccurateEquals(), constant_value, zero);
if (function_name == "equals" && !constant_equal_zero)
{
/// indexOf(...) = c, c != 0
return true;
}
else if (function_name == "notEquals" && constant_equal_zero)
{
/// indexOf(...) != c, c = 0
return true;
}
else if (function_name == (reversed ? "less" : "greater") && !applyVisitor(FieldVisitorAccurateLess(), constant_value, zero))
{
/// indexOf(...) > c, c >= 0
return true;
}
else if (function_name == (reversed ? "lessOrEquals" : "greaterOrEquals") && applyVisitor(FieldVisitorAccurateLess(), zero, constant_value))
{
/// indexOf(...) >= c, c > 0
return true;
}
return false;
}
return false;
}
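/// Summary of the rules above:
///     indexOf(arr, x) = c    with c != 0  -> usable, the array must contain x
///     indexOf(arr, x) > c    with c >= 0  -> usable
///     indexOf(arr, x) >= c   with c > 0   -> usable
///     c < indexOf(arr, x), c <= indexOf(arr, x) -> the mirrored forms, handled via `reversed`
///     indexOf(arr, x) = 0, notEquals, or a comparison without a constant -> not usable
/// If the parent is `and` (indexOf used directly as a boolean condition) the filter can also be
/// used; any other parent function disqualifies it.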
bool MergeTreeIndexConditionBloomFilter::traverseTreeEquals(
const String & function_name,
const RPNBuilderTreeNode & key_node,
const DataTypePtr & value_type,
const Field & value_field,
RPNElement & out,
const RPNBuilderTreeNode * parent)
{
auto key_column_name = key_node.getColumnName();
if (header.has(key_column_name))
{
size_t position = header.getPositionByName(key_column_name);
const DataTypePtr & index_type = header.getByPosition(position).type;
const auto * array_type = typeid_cast<const DataTypeArray *>(index_type.get());
if (function_name == "has" || function_name == "indexOf")
{
if (!array_type)
throw Exception(ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT, "First argument for function {} must be an array.", function_name);
/// We can treat the `indexOf` function similarly to `has`.
/// But it is a little more cumbersome; compare `has(arr, elem)` and `indexOf(arr, elem) != 0`.
/// The `parent` in this context is expected to be the function `!=` (`notEquals`).
if (function_name == "has" || indexOfCanUseBloomFilter(parent))
{
out.function = RPNElement::FUNCTION_HAS;
const DataTypePtr actual_type = BloomFilter::getPrimitiveType(array_type->getNestedType());
auto converted_field = convertFieldToType(value_field, *actual_type, value_type.get());
if (converted_field.isNull())
return false;
out.predicate.emplace_back(std::make_pair(position, BloomFilterHash::hashWithField(actual_type.get(), converted_field)));
}
}
else if (function_name == "hasAny" || function_name == "hasAll")
{
if (!array_type)
throw Exception(ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT, "First argument for function {} must be an array.", function_name);
if (value_field.getType() != Field::Types::Array)
throw Exception(ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT, "Second argument for function {} must be an array.", function_name);
const DataTypePtr actual_type = BloomFilter::getPrimitiveType(array_type->getNestedType());
ColumnPtr column;
{
const bool is_nullable = actual_type->isNullable();
auto mutable_column = actual_type->createColumn();
for (const auto & f : value_field.get<Array>())
{
if ((f.isNull() && !is_nullable) || f.isDecimal(f.getType())) /// NOLINT(readability-static-accessed-through-instance)
return false;
auto converted = convertFieldToType(f, *actual_type);
if (converted.isNull())
return false;
mutable_column->insert(converted);
}
column = std::move(mutable_column);
}
out.function = function_name == "hasAny" ?
RPNElement::FUNCTION_HAS_ANY :
RPNElement::FUNCTION_HAS_ALL;
out.predicate.emplace_back(std::make_pair(position, BloomFilterHash::hashWithColumn(actual_type, column, 0, column->size())));
}
else
{
if (array_type)
throw Exception(ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT,
"An array type of bloom_filter supports only has(), indexOf(), and hasAny() functions.");
out.function = function_name == "equals" ? RPNElement::FUNCTION_EQUALS : RPNElement::FUNCTION_NOT_EQUALS;
const DataTypePtr actual_type = BloomFilter::getPrimitiveType(index_type);
auto converted_field = convertFieldToType(value_field, *actual_type, value_type.get());
if (converted_field.isNull())
return false;
out.predicate.emplace_back(std::make_pair(position, BloomFilterHash::hashWithField(actual_type.get(), converted_field)));
}
return true;
}
if (function_name == "mapContains" || function_name == "has")
{
auto map_keys_index_column_name = fmt::format("mapKeys({})", key_column_name);
if (!header.has(map_keys_index_column_name))
return false;
size_t position = header.getPositionByName(map_keys_index_column_name);
const DataTypePtr & index_type = header.getByPosition(position).type;
const auto * array_type = typeid_cast<const DataTypeArray *>(index_type.get());
if (!array_type)
return false;
out.function = RPNElement::FUNCTION_HAS;
const DataTypePtr actual_type = BloomFilter::getPrimitiveType(array_type->getNestedType());
auto converted_field = convertFieldToType(value_field, *actual_type, value_type.get());
if (converted_field.isNull())
return false;
out.predicate.emplace_back(std::make_pair(position, BloomFilterHash::hashWithField(actual_type.get(), converted_field)));
return true;
}
if (key_node.isFunction())
{
WhichDataType which(value_type);
auto key_node_function = key_node.toFunctionNode();
auto key_node_function_name = key_node_function.getFunctionName();
size_t key_node_function_arguments_size = key_node_function.getArgumentsSize();
if (which.isTuple() && key_node_function_name == "tuple")
{
const Tuple & tuple = value_field.get<const Tuple &>();
const auto * value_tuple_data_type = typeid_cast<const DataTypeTuple *>(value_type.get());
if (tuple.size() != key_node_function_arguments_size)
throw Exception(ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT, "Illegal types of arguments of function {}", function_name);
bool match_with_subtype = false;
const DataTypes & subtypes = value_tuple_data_type->getElements();
for (size_t index = 0; index < tuple.size(); ++index)
match_with_subtype |= traverseTreeEquals(function_name, key_node_function.getArgumentAt(index), subtypes[index], tuple[index], out, &key_node);
return match_with_subtype;
}
if (key_node_function_name == "arrayElement" && (function_name == "equals" || function_name == "notEquals"))
{
/** Try to parse arrayElement for mapKeys index.
* It is important to ignore keys like column_map['Key'] = '' because if the key does not exist in the map
* we return the default value for arrayElement.
*
* We cannot skip keys that do not exist in the map if the comparison is with the default value of the type,
* because that way we would skip granules we actually need, where the map key does not exist.
*/
if (value_field == value_type->getDefault())
return false;
auto first_argument = key_node_function.getArgumentAt(0);
const auto column_name = first_argument.getColumnName();
auto map_keys_index_column_name = fmt::format("mapKeys({})", column_name);
auto map_values_index_column_name = fmt::format("mapValues({})", column_name);
size_t position = 0;
Field const_value = value_field;
DataTypePtr const_type;
if (header.has(map_keys_index_column_name))
{
position = header.getPositionByName(map_keys_index_column_name);
auto second_argument = key_node_function.getArgumentAt(1);
if (!second_argument.tryGetConstant(const_value, const_type))
return false;
}
else if (header.has(map_values_index_column_name))
{
position = header.getPositionByName(map_values_index_column_name);
}
else
{
return false;
}
out.function = function_name == "equals" ? RPNElement::FUNCTION_EQUALS : RPNElement::FUNCTION_NOT_EQUALS;
const auto & index_type = header.getByPosition(position).type;
const auto actual_type = BloomFilter::getPrimitiveType(index_type);
out.predicate.emplace_back(std::make_pair(position, BloomFilterHash::hashWithField(actual_type.get(), const_value)));
return true;
}
}
return false;
}
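/// Illustrative predicates (assuming the referenced columns are covered by a bloom_filter index)
/// and the atoms they produce in the code above:
///     s = 'abc'                 -> FUNCTION_EQUALS
///     s != 'abc'                -> FUNCTION_NOT_EQUALS
///     has(arr, 'abc')           -> FUNCTION_HAS (indexOf as well, when indexOfCanUseBloomFilter allows it)
///     hasAny(arr, ['a', 'b'])   -> FUNCTION_HAS_ANY
///     hasAll(arr, ['a', 'b'])   -> FUNCTION_HAS_ALL
///     mapContains(m, 'key')     -> FUNCTION_HAS against the mapKeys(m) index column
///     m['key'] = 'value'        -> FUNCTION_EQUALS against the mapKeys()/mapValues() index column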
}

View File

@ -1,87 +0,0 @@
#pragma once
#include <Columns/IColumn.h>
#include <Interpreters/BloomFilter.h>
#include <Storages/MergeTree/KeyCondition.h>
#include <Storages/MergeTree/MergeTreeIndices.h>
#include <Storages/MergeTree/MergeTreeIndexGranuleBloomFilter.h>
namespace DB
{
namespace ErrorCodes
{
extern const int LOGICAL_ERROR;
}
class MergeTreeIndexConditionBloomFilter final : public IMergeTreeIndexCondition, WithContext
{
public:
struct RPNElement
{
enum Function
{
/// Atoms of a Boolean expression.
FUNCTION_EQUALS,
FUNCTION_NOT_EQUALS,
FUNCTION_HAS,
FUNCTION_HAS_ANY,
FUNCTION_HAS_ALL,
FUNCTION_IN,
FUNCTION_NOT_IN,
FUNCTION_UNKNOWN, /// Can take any value.
/// Operators of the logical expression.
FUNCTION_NOT,
FUNCTION_AND,
FUNCTION_OR,
/// Constants
ALWAYS_FALSE,
ALWAYS_TRUE,
};
RPNElement(Function function_ = FUNCTION_UNKNOWN) : function(function_) {} /// NOLINT
Function function = FUNCTION_UNKNOWN;
std::vector<std::pair<size_t, ColumnPtr>> predicate;
};
MergeTreeIndexConditionBloomFilter(const ActionsDAGPtr & filter_actions_dag, ContextPtr context_, const Block & header_, size_t hash_functions_);
bool alwaysUnknownOrTrue() const override;
bool mayBeTrueOnGranule(MergeTreeIndexGranulePtr granule) const override
{
if (const auto & bf_granule = typeid_cast<const MergeTreeIndexGranuleBloomFilter *>(granule.get()))
return mayBeTrueOnGranule(bf_granule);
throw Exception(ErrorCodes::LOGICAL_ERROR, "Requires bloom filter index granule.");
}
private:
const Block & header;
const size_t hash_functions;
std::vector<RPNElement> rpn;
bool mayBeTrueOnGranule(const MergeTreeIndexGranuleBloomFilter * granule) const;
bool extractAtomFromTree(const RPNBuilderTreeNode & node, RPNElement & out);
bool traverseFunction(const RPNBuilderTreeNode & node, RPNElement & out, const RPNBuilderTreeNode * parent);
bool traverseTreeIn(
const String & function_name,
const RPNBuilderTreeNode & key_node,
const ConstSetPtr & prepared_set,
const DataTypePtr & type,
const ColumnPtr & column,
RPNElement & out);
bool traverseTreeEquals(
const String & function_name,
const RPNBuilderTreeNode & key_node,
const DataTypePtr & value_type,
const Field & value_field,
RPNElement & out,
const RPNBuilderTreeNode * parent);
};
}

View File

@ -1,45 +1,46 @@
#include <Storages/MergeTree/MergeTreeIndexFullText.h>
#include <Columns/ColumnArray.h>
#include <Common/OptimizedRegularExpression.h>
#include <Columns/ColumnLowCardinality.h>
#include <Columns/ColumnNullable.h>
#include <Core/Defines.h>
#include <DataTypes/DataTypeArray.h>
#include <DataTypes/DataTypeLowCardinality.h>
#include <DataTypes/DataTypeNullable.h>
#include <DataTypes/DataTypesNumber.h>
#include <IO/ReadHelpers.h>
#include <IO/WriteHelpers.h>
#include <Interpreters/ExpressionActions.h>
#include <Interpreters/ExpressionAnalyzer.h>
#include <Interpreters/GinFilter.h>
#include <Interpreters/TreeRewriter.h>
#include <Interpreters/misc.h>
#include <Parsers/ASTIdentifier.h>
#include <Parsers/ASTLiteral.h>
#include <Parsers/ASTSelectQuery.h>
#include <Parsers/ASTSubquery.h>
#include <Poco/Logger.h>
#include <Storages/MergeTree/MergeTreeData.h>
#include <Storages/MergeTree/MergeTreeIndexUtils.h>
#include <Storages/MergeTree/RPNBuilder.h>
#include <Poco/Logger.h>
#include <Common/OptimizedRegularExpression.h>
#include <algorithm>
namespace DB
{
namespace ErrorCodes
{
extern const int LOGICAL_ERROR;
extern const int INCORRECT_QUERY;
extern const int BAD_ARGUMENTS;
}
MergeTreeIndexGranuleFullText::MergeTreeIndexGranuleFullText(
const String & index_name_,
size_t columns_number,
const BloomFilterParameters & params_)
const GinFilterParameters & params_)
: index_name(index_name_)
, params(params_)
, bloom_filters(
columns_number, BloomFilter(params))
, gin_filters(columns_number, GinFilter(params))
, has_elems(false)
{
}
@ -49,8 +50,15 @@ void MergeTreeIndexGranuleFullText::serializeBinary(WriteBuffer & ostr) const
if (empty())
throw Exception(ErrorCodes::LOGICAL_ERROR, "Attempt to write empty fulltext index {}.", backQuote(index_name));
for (const auto & bloom_filter : bloom_filters)
ostr.write(reinterpret_cast<const char *>(bloom_filter.getFilter().data()), params.filter_size);
const auto & size_type = std::make_shared<DataTypeUInt32>();
auto size_serialization = size_type->getDefaultSerialization();
for (const auto & gin_filter : gin_filters)
{
size_t filter_size = gin_filter.getFilter().size();
size_serialization->serializeBinary(filter_size, ostr, {});
ostr.write(reinterpret_cast<const char *>(gin_filter.getFilter().data()), filter_size * sizeof(GinSegmentWithRowIdRangeVector::value_type));
}
}
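/// On-disk layout written above: for every gin filter the granule stores a UInt32 element count
/// followed by `count * sizeof(GinSegmentWithRowIdRangeVector::value_type)` raw bytes of
/// segment/row-id ranges; deserializeBinary below reads the same sequence back.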
void MergeTreeIndexGranuleFullText::deserializeBinary(ReadBuffer & istr, MergeTreeIndexVersion version)
@ -58,20 +66,33 @@ void MergeTreeIndexGranuleFullText::deserializeBinary(ReadBuffer & istr, MergeTr
if (version != 1)
throw Exception(ErrorCodes::LOGICAL_ERROR, "Unknown index version {}.", version);
for (auto & bloom_filter : bloom_filters)
Field field_rows;
const auto & size_type = std::make_shared<DataTypeUInt32>();
auto size_serialization = size_type->getDefaultSerialization();
for (auto & gin_filter : gin_filters)
{
istr.readStrict(reinterpret_cast<char *>(bloom_filter.getFilter().data()), params.filter_size);
size_serialization->deserializeBinary(field_rows, istr, {});
size_t filter_size = field_rows.get<size_t>();
gin_filter.getFilter().resize(filter_size);
if (filter_size == 0)
continue;
istr.readStrict(reinterpret_cast<char *>(gin_filter.getFilter().data()), filter_size * sizeof(GinSegmentWithRowIdRangeVector::value_type));
}
has_elems = true;
}
MergeTreeIndexAggregatorFullText::MergeTreeIndexAggregatorFullText(
GinIndexStorePtr store_,
const Names & index_columns_,
const String & index_name_,
const BloomFilterParameters & params_,
const GinFilterParameters & params_,
TokenExtractorPtr token_extractor_)
: index_columns(index_columns_)
: store(store_)
, index_columns(index_columns_)
, index_name (index_name_)
, params(params_)
, token_extractor(token_extractor_)
@ -89,6 +110,16 @@ MergeTreeIndexGranulePtr MergeTreeIndexAggregatorFullText::getGranuleAndReset()
return new_granule;
}
void MergeTreeIndexAggregatorFullText::addToGinFilter(UInt32 rowID, const char * data, size_t length, GinFilter & gin_filter)
{
size_t cur = 0;
size_t token_start = 0;
size_t token_len = 0;
while (cur < length && token_extractor->nextInStringPadded(data, length, &cur, &token_start, &token_len))
gin_filter.add(data + token_start, token_len, rowID, store);
}
void MergeTreeIndexAggregatorFullText::update(const Block & block, size_t * pos, size_t limit)
{
if (*pos >= block.rows())
@ -96,6 +127,8 @@ void MergeTreeIndexAggregatorFullText::update(const Block & block, size_t * pos,
"Position: {}, Block rows: {}.", *pos, block.rows());
size_t rows_read = std::min(limit, block.rows() - *pos);
auto row_id = store->getNextRowIDRange(rows_read);
auto start_row_id = row_id;
for (size_t col = 0; col < index_columns.size(); ++col)
{
@ -103,6 +136,7 @@ void MergeTreeIndexAggregatorFullText::update(const Block & block, size_t * pos,
const auto & column = column_with_type.column;
size_t current_position = *pos;
bool need_to_write = false;
if (isArray(column_with_type.type))
{
const auto & column_array = assert_cast<const ColumnArray &>(*column);
@ -117,10 +151,14 @@ void MergeTreeIndexAggregatorFullText::update(const Block & block, size_t * pos,
for (size_t row_num = 0; row_num < elements_size; ++row_num)
{
auto ref = column_key.getDataAt(element_start_row + row_num);
token_extractor->stringPaddedToBloomFilter(ref.data, ref.size, granule->bloom_filters[col]);
addToGinFilter(row_id, ref.data, ref.size, granule->gin_filters[col]);
store->incrementCurrentSizeBy(ref.size);
}
current_position += 1;
row_id++;
if (store->needToWrite())
need_to_write = true;
}
}
else
@ -128,9 +166,18 @@ void MergeTreeIndexAggregatorFullText::update(const Block & block, size_t * pos,
for (size_t i = 0; i < rows_read; ++i)
{
auto ref = column->getDataAt(current_position + i);
token_extractor->stringPaddedToBloomFilter(ref.data, ref.size, granule->bloom_filters[col]);
addToGinFilter(row_id, ref.data, ref.size, granule->gin_filters[col]);
store->incrementCurrentSizeBy(ref.size);
row_id++;
if (store->needToWrite())
need_to_write = true;
}
}
granule->gin_filters[col].addRowRangeToGinFilter(store->getCurrentSegmentID(), start_row_id, static_cast<UInt32>(start_row_id + rows_read - 1));
if (need_to_write)
{
store->writeSegment();
}
}
granule->has_elems = true;
@ -139,12 +186,11 @@ void MergeTreeIndexAggregatorFullText::update(const Block & block, size_t * pos,
MergeTreeConditionFullText::MergeTreeConditionFullText(
const ActionsDAGPtr & filter_actions_dag,
ContextPtr context,
ContextPtr context_,
const Block & index_sample_block,
const BloomFilterParameters & params_,
const GinFilterParameters & params_,
TokenExtractorPtr token_extactor_)
: index_columns(index_sample_block.getNames())
, index_data_types(index_sample_block.getNamesAndTypesList().getTypes())
: WithContext(context_), header(index_sample_block)
, params(params_)
, token_extractor(token_extactor_)
{
@ -154,14 +200,16 @@ MergeTreeConditionFullText::MergeTreeConditionFullText(
return;
}
RPNBuilder<RPNElement> builder(
filter_actions_dag->getOutputs().at(0),
context,
[&](const RPNBuilderTreeNode & node, RPNElement & out) { return extractAtomFromTree(node, out); });
rpn = std::move(builder).extractRPN();
rpn = std::move(
RPNBuilder<RPNElement>(
filter_actions_dag->getOutputs().at(0), context_,
[&](const RPNBuilderTreeNode & node, RPNElement & out)
{
return this->traverseAtomAST(node, out);
}).extractRPN());
}
/// Keep in-sync with MergeTreeConditionGinFilter::alwaysUnknownOrTrue
/// Keep in-sync with MergeTreeConditionFullText::alwaysUnknownOrTrue
bool MergeTreeConditionFullText::alwaysUnknownOrTrue() const
{
/// Check like in KeyCondition.
@ -181,7 +229,6 @@ bool MergeTreeConditionFullText::alwaysUnknownOrTrue() const
|| element.function == RPNElement::FUNCTION_NOT_IN
|| element.function == RPNElement::FUNCTION_MULTI_SEARCH
|| element.function == RPNElement::FUNCTION_MATCH
|| element.function == RPNElement::FUNCTION_HAS_ANY
|| element.function == RPNElement::ALWAYS_FALSE)
{
rpn_stack.push_back(false);
@ -211,13 +258,12 @@ bool MergeTreeConditionFullText::alwaysUnknownOrTrue() const
return rpn_stack[0];
}
/// Keep in-sync with MergeTreeIndexConditionGin::mayBeTrueOnGranuleInPart
bool MergeTreeConditionFullText::mayBeTrueOnGranule(MergeTreeIndexGranulePtr idx_granule) const
bool MergeTreeConditionFullText::mayBeTrueOnGranuleInPart(MergeTreeIndexGranulePtr idx_granule, [[maybe_unused]] PostingsCacheForStore & cache_store) const
{
std::shared_ptr<MergeTreeIndexGranuleFullText> granule
= std::dynamic_pointer_cast<MergeTreeIndexGranuleFullText>(idx_granule);
if (!granule)
throw Exception(ErrorCodes::LOGICAL_ERROR, "BloomFilter index condition got a granule with the wrong type.");
throw Exception(ErrorCodes::LOGICAL_ERROR, "GinFilter index condition got a granule with the wrong type.");
/// Check like in KeyCondition.
std::vector<BoolMask> rpn_stack;
@ -231,7 +277,7 @@ bool MergeTreeConditionFullText::mayBeTrueOnGranule(MergeTreeIndexGranulePtr idx
|| element.function == RPNElement::FUNCTION_NOT_EQUALS
|| element.function == RPNElement::FUNCTION_HAS)
{
rpn_stack.emplace_back(granule->bloom_filters[element.key_column].contains(*element.bloom_filter), true);
rpn_stack.emplace_back(granule->gin_filters[element.key_column].contains(*element.gin_filter, cache_store), true);
if (element.function == RPNElement::FUNCTION_NOT_EQUALS)
rpn_stack.back() = !rpn_stack.back();
@ -239,53 +285,51 @@ bool MergeTreeConditionFullText::mayBeTrueOnGranule(MergeTreeIndexGranulePtr idx
else if (element.function == RPNElement::FUNCTION_IN
|| element.function == RPNElement::FUNCTION_NOT_IN)
{
std::vector<bool> result(element.set_bloom_filters.back().size(), true);
std::vector<bool> result(element.set_gin_filters.back().size(), true);
for (size_t column = 0; column < element.set_key_position.size(); ++column)
{
const size_t key_idx = element.set_key_position[column];
const auto & bloom_filters = element.set_bloom_filters[column];
for (size_t row = 0; row < bloom_filters.size(); ++row)
result[row] = result[row] && granule->bloom_filters[key_idx].contains(bloom_filters[row]);
const auto & gin_filters = element.set_gin_filters[column];
for (size_t row = 0; row < gin_filters.size(); ++row)
result[row] = result[row] && granule->gin_filters[key_idx].contains(gin_filters[row], cache_store);
}
rpn_stack.emplace_back(
std::find(std::cbegin(result), std::cend(result), true) != std::end(result), true);
rpn_stack.emplace_back(std::find(std::cbegin(result), std::cend(result), true) != std::end(result), true);
if (element.function == RPNElement::FUNCTION_NOT_IN)
rpn_stack.back() = !rpn_stack.back();
}
else if (element.function == RPNElement::FUNCTION_MULTI_SEARCH
|| element.function == RPNElement::FUNCTION_HAS_ANY)
else if (element.function == RPNElement::FUNCTION_MULTI_SEARCH)
{
std::vector<bool> result(element.set_bloom_filters.back().size(), true);
std::vector<bool> result(element.set_gin_filters.back().size(), true);
const auto & bloom_filters = element.set_bloom_filters[0];
const auto & gin_filters = element.set_gin_filters[0];
for (size_t row = 0; row < bloom_filters.size(); ++row)
result[row] = result[row] && granule->bloom_filters[element.key_column].contains(bloom_filters[row]);
for (size_t row = 0; row < gin_filters.size(); ++row)
result[row] = result[row] && granule->gin_filters[element.key_column].contains(gin_filters[row], cache_store);
rpn_stack.emplace_back(std::find(std::cbegin(result), std::cend(result), true) != std::end(result), true);
}
else if (element.function == RPNElement::FUNCTION_MATCH)
{
if (!element.set_bloom_filters.empty())
if (!element.set_gin_filters.empty())
{
/// Alternative substrings
std::vector<bool> result(element.set_bloom_filters.back().size(), true);
std::vector<bool> result(element.set_gin_filters.back().size(), true);
const auto & bloom_filters = element.set_bloom_filters[0];
const auto & gin_filters = element.set_gin_filters[0];
for (size_t row = 0; row < bloom_filters.size(); ++row)
result[row] = result[row] && granule->bloom_filters[element.key_column].contains(bloom_filters[row]);
for (size_t row = 0; row < gin_filters.size(); ++row)
result[row] = result[row] && granule->gin_filters[element.key_column].contains(gin_filters[row], cache_store);
rpn_stack.emplace_back(std::find(std::cbegin(result), std::cend(result), true) != std::end(result), true);
}
else if (element.bloom_filter)
else if (element.gin_filter)
{
/// Required substrings
rpn_stack.emplace_back(granule->bloom_filters[element.key_column].contains(*element.bloom_filter), true);
rpn_stack.emplace_back(granule->gin_filters[element.key_column].contains(*element.gin_filter, cache_store), true);
}
}
else if (element.function == RPNElement::FUNCTION_NOT)
{
@ -314,22 +358,16 @@ bool MergeTreeConditionFullText::mayBeTrueOnGranule(MergeTreeIndexGranulePtr idx
rpn_stack.emplace_back(true, false);
}
else
throw Exception(ErrorCodes::LOGICAL_ERROR, "Unexpected function type in BloomFilterCondition::RPNElement");
throw Exception(ErrorCodes::LOGICAL_ERROR, "Unexpected function type in GinFilterCondition::RPNElement");
}
if (rpn_stack.size() != 1)
throw Exception(ErrorCodes::LOGICAL_ERROR, "Unexpected stack size in BloomFilterCondition::mayBeTrueOnGranule");
throw Exception(ErrorCodes::LOGICAL_ERROR, "Unexpected stack size in GinFilterCondition::mayBeTrueOnGranule");
return rpn_stack[0].can_be_true;
}
std::optional<size_t> MergeTreeConditionFullText::getKeyIndex(const std::string & key_column_name)
{
const auto it = std::ranges::find(index_columns, key_column_name);
return it == index_columns.end() ? std::nullopt : std::make_optional<size_t>(std::ranges::distance(index_columns.cbegin(), it));
}
bool MergeTreeConditionFullText::extractAtomFromTree(const RPNBuilderTreeNode & node, RPNElement & out)
bool MergeTreeConditionFullText::traverseAtomAST(const RPNBuilderTreeNode & node, RPNElement & out)
{
{
Field const_value;
@ -338,7 +376,6 @@ bool MergeTreeConditionFullText::extractAtomFromTree(const RPNBuilderTreeNode &
if (node.tryGetConstant(const_value, const_type))
{
/// Check constant like in KeyCondition
if (const_value.getType() == Field::Types::UInt64)
{
out.function = const_value.get<UInt64>() ? RPNElement::ALWAYS_TRUE : RPNElement::ALWAYS_FALSE;
@ -353,7 +390,7 @@ bool MergeTreeConditionFullText::extractAtomFromTree(const RPNBuilderTreeNode &
if (const_value.getType() == Field::Types::Float64)
{
out.function = const_value.get<Float64>() != 0.0 ? RPNElement::ALWAYS_TRUE : RPNElement::ALWAYS_FALSE;
out.function = const_value.get<Float64>() != 0.00 ? RPNElement::ALWAYS_TRUE : RPNElement::ALWAYS_FALSE;
return true;
}
}
@ -361,19 +398,19 @@ bool MergeTreeConditionFullText::extractAtomFromTree(const RPNBuilderTreeNode &
if (node.isFunction())
{
auto function_node = node.toFunctionNode();
auto function_name = function_node.getFunctionName();
const auto function = node.toFunctionNode();
// auto arguments_size = function.getArgumentsSize();
auto function_name = function.getFunctionName();
size_t arguments_size = function_node.getArgumentsSize();
if (arguments_size != 2)
size_t function_arguments_size = function.getArgumentsSize();
if (function_arguments_size != 2)
return false;
auto left_argument = function_node.getArgumentAt(0);
auto right_argument = function_node.getArgumentAt(1);
auto lhs_argument = function.getArgumentAt(0);
auto rhs_argument = function.getArgumentAt(1);
if (functionIsInOrGlobalInOperator(function_name))
{
if (tryPrepareSetBloomFilter(left_argument, right_argument, out))
if (tryPrepareSetGinFilter(lhs_argument, rhs_argument, out))
{
if (function_name == "notIn")
{
@ -391,26 +428,25 @@ bool MergeTreeConditionFullText::extractAtomFromTree(const RPNBuilderTreeNode &
function_name == "notEquals" ||
function_name == "has" ||
function_name == "mapContains" ||
function_name == "match" ||
function_name == "like" ||
function_name == "notLike" ||
function_name.starts_with("hasToken") ||
function_name == "hasToken" ||
function_name == "hasTokenOrNull" ||
function_name == "startsWith" ||
function_name == "endsWith" ||
function_name == "multiSearchAny" ||
function_name == "hasAny")
function_name == "match")
{
Field const_value;
DataTypePtr const_type;
if (right_argument.tryGetConstant(const_value, const_type))
if (rhs_argument.tryGetConstant(const_value, const_type))
{
if (traverseTreeEquals(function_name, left_argument, const_type, const_value, out))
if (traverseASTEquals(function_name, lhs_argument, const_type, const_value, out))
return true;
}
else if (left_argument.tryGetConstant(const_value, const_type) && (function_name == "equals" || function_name == "notEquals"))
else if (lhs_argument.tryGetConstant(const_value, const_type) && (function_name == "equals" || function_name == "notEquals"))
{
if (traverseTreeEquals(function_name, right_argument, const_type, const_value, out))
if (traverseASTEquals(function_name, rhs_argument, const_type, const_value, out))
return true;
}
}
@ -419,9 +455,9 @@ bool MergeTreeConditionFullText::extractAtomFromTree(const RPNBuilderTreeNode &
return false;
}
bool MergeTreeConditionFullText::traverseTreeEquals(
bool MergeTreeConditionFullText::traverseASTEquals(
const String & function_name,
const RPNBuilderTreeNode & key_node,
const RPNBuilderTreeNode & key_ast,
const DataTypePtr & value_type,
const Field & value_field,
RPNElement & out)
@ -431,17 +467,14 @@ bool MergeTreeConditionFullText::traverseTreeEquals(
return false;
Field const_value = value_field;
size_t key_column_num = 0;
bool key_exists = header.has(key_ast.getColumnName());
bool map_key_exists = header.has(fmt::format("mapKeys({})", key_ast.getColumnName()));
const auto column_name = key_node.getColumnName();
auto key_index = getKeyIndex(column_name);
const auto map_key_index = getKeyIndex(fmt::format("mapKeys({})", column_name));
if (key_node.isFunction())
if (key_ast.isFunction())
{
auto key_function_node = key_node.toFunctionNode();
auto key_function_node_function_name = key_function_node.getFunctionName();
if (key_function_node_function_name == "arrayElement")
const auto function = key_ast.toFunctionNode();
if (function.getFunctionName() == "arrayElement")
{
/** Try to parse arrayElement for mapKeys index.
* It is important to ignore keys like column_map['Key'] = '' because if the key does not exist in the map
@ -453,29 +486,33 @@ bool MergeTreeConditionFullText::traverseTreeEquals(
if (value_field == value_type->getDefault())
return false;
auto first_argument = key_function_node.getArgumentAt(0);
auto first_argument = function.getArgumentAt(0);
const auto map_column_name = first_argument.getColumnName();
if (const auto map_keys_index = getKeyIndex(fmt::format("mapKeys({})", map_column_name)))
auto map_keys_index_column_name = fmt::format("mapKeys({})", map_column_name);
auto map_values_index_column_name = fmt::format("mapValues({})", map_column_name);
if (header.has(map_keys_index_column_name))
{
auto second_argument = key_function_node.getArgumentAt(1);
auto argument = function.getArgumentAt(1);
DataTypePtr const_type;
if (second_argument.tryGetConstant(const_value, const_type))
if (argument.tryGetConstant(const_value, const_type))
{
key_index = map_keys_index;
auto const_data_type = WhichDataType(const_type);
if (!const_data_type.isStringOrFixedString() && !const_data_type.isArray())
return false;
key_column_num = header.getPositionByName(map_keys_index_column_name);
key_exists = true;
}
else
{
return false;
}
}
else if (const auto map_values_exists = getKeyIndex(fmt::format("mapValues({})", map_column_name)))
else if (header.has(map_values_index_column_name))
{
key_index = map_values_exists;
key_column_num = header.getPositionByName(map_values_index_column_name);
key_exists = true;
}
else
{
@ -484,128 +521,115 @@ bool MergeTreeConditionFullText::traverseTreeEquals(
}
}
const auto lowercase_key_index = getKeyIndex(fmt::format("lower({})", column_name));
const auto is_has_token_case_insensitive = function_name.starts_with("hasTokenCaseInsensitive");
if (const auto is_case_insensitive_scenario = is_has_token_case_insensitive && lowercase_key_index;
function_name.starts_with("hasToken") && ((!is_has_token_case_insensitive && key_index) || is_case_insensitive_scenario))
{
out.key_column = is_case_insensitive_scenario ? *lowercase_key_index : *key_index;
out.function = RPNElement::FUNCTION_EQUALS;
out.bloom_filter = std::make_unique<BloomFilter>(params);
auto value = const_value.get<String>();
if (is_case_insensitive_scenario)
std::ranges::transform(value, value.begin(), [](const auto & c) { return static_cast<char>(std::tolower(c)); });
token_extractor->stringToBloomFilter(value.data(), value.size(), *out.bloom_filter);
return true;
}
if (!key_index && !map_key_index)
if (!key_exists && !map_key_exists)
return false;
if (map_key_index && (function_name == "has" || function_name == "mapContains"))
if (map_key_exists && (function_name == "has" || function_name == "mapContains"))
{
out.key_column = *key_index;
out.key_column = key_column_num;
out.function = RPNElement::FUNCTION_HAS;
out.bloom_filter = std::make_unique<BloomFilter>(params);
out.gin_filter = std::make_unique<GinFilter>(params);
auto & value = const_value.get<String>();
token_extractor->stringToBloomFilter(value.data(), value.size(), *out.bloom_filter);
token_extractor->stringToGinFilter(value.data(), value.size(), *out.gin_filter);
return true;
}
else if (function_name == "has")
{
out.key_column = *key_index;
out.key_column = key_column_num;
out.function = RPNElement::FUNCTION_HAS;
out.bloom_filter = std::make_unique<BloomFilter>(params);
out.gin_filter = std::make_unique<GinFilter>(params);
auto & value = const_value.get<String>();
token_extractor->stringToBloomFilter(value.data(), value.size(), *out.bloom_filter);
token_extractor->stringToGinFilter(value.data(), value.size(), *out.gin_filter);
return true;
}
if (function_name == "notEquals")
{
out.key_column = *key_index;
out.key_column = key_column_num;
out.function = RPNElement::FUNCTION_NOT_EQUALS;
out.bloom_filter = std::make_unique<BloomFilter>(params);
out.gin_filter = std::make_unique<GinFilter>(params);
const auto & value = const_value.get<String>();
token_extractor->stringToBloomFilter(value.data(), value.size(), *out.bloom_filter);
token_extractor->stringToGinFilter(value.data(), value.size(), *out.gin_filter);
return true;
}
else if (function_name == "equals")
{
out.key_column = *key_index;
out.key_column = key_column_num;
out.function = RPNElement::FUNCTION_EQUALS;
out.bloom_filter = std::make_unique<BloomFilter>(params);
out.gin_filter = std::make_unique<GinFilter>(params);
const auto & value = const_value.get<String>();
token_extractor->stringToBloomFilter(value.data(), value.size(), *out.bloom_filter);
token_extractor->stringToGinFilter(value.data(), value.size(), *out.gin_filter);
return true;
}
else if (function_name == "like")
{
out.key_column = *key_index;
out.key_column = key_column_num;
out.function = RPNElement::FUNCTION_EQUALS;
out.bloom_filter = std::make_unique<BloomFilter>(params);
out.gin_filter = std::make_unique<GinFilter>(params);
const auto & value = const_value.get<String>();
token_extractor->stringLikeToBloomFilter(value.data(), value.size(), *out.bloom_filter);
token_extractor->stringLikeToGinFilter(value.data(), value.size(), *out.gin_filter);
return true;
}
else if (function_name == "notLike")
{
out.key_column = *key_index;
out.key_column = key_column_num;
out.function = RPNElement::FUNCTION_NOT_EQUALS;
out.bloom_filter = std::make_unique<BloomFilter>(params);
out.gin_filter = std::make_unique<GinFilter>(params);
const auto & value = const_value.get<String>();
token_extractor->stringLikeToBloomFilter(value.data(), value.size(), *out.bloom_filter);
token_extractor->stringLikeToGinFilter(value.data(), value.size(), *out.gin_filter);
return true;
}
else if (function_name == "hasToken" || function_name == "hasTokenOrNull")
{
out.key_column = key_column_num;
out.function = RPNElement::FUNCTION_EQUALS;
out.gin_filter = std::make_unique<GinFilter>(params);
const auto & value = const_value.get<String>();
token_extractor->stringToGinFilter(value.data(), value.size(), *out.gin_filter);
return true;
}
else if (function_name == "startsWith")
{
out.key_column = *key_index;
out.key_column = key_column_num;
out.function = RPNElement::FUNCTION_EQUALS;
out.bloom_filter = std::make_unique<BloomFilter>(params);
out.gin_filter = std::make_unique<GinFilter>(params);
const auto & value = const_value.get<String>();
token_extractor->stringToBloomFilter(value.data(), value.size(), *out.bloom_filter);
token_extractor->stringToGinFilter(value.data(), value.size(), *out.gin_filter);
return true;
}
else if (function_name == "endsWith")
{
out.key_column = *key_index;
out.key_column = key_column_num;
out.function = RPNElement::FUNCTION_EQUALS;
out.bloom_filter = std::make_unique<BloomFilter>(params);
out.gin_filter = std::make_unique<GinFilter>(params);
const auto & value = const_value.get<String>();
token_extractor->stringToBloomFilter(value.data(), value.size(), *out.bloom_filter);
token_extractor->stringToGinFilter(value.data(), value.size(), *out.gin_filter);
return true;
}
else if (function_name == "multiSearchAny"
|| function_name == "hasAny")
else if (function_name == "multiSearchAny")
{
out.key_column = *key_index;
out.function = function_name == "multiSearchAny" ?
RPNElement::FUNCTION_MULTI_SEARCH :
RPNElement::FUNCTION_HAS_ANY;
out.key_column = key_column_num;
out.function = RPNElement::FUNCTION_MULTI_SEARCH;
/// A 2d vector is not needed here but is used because it already exists for FUNCTION_IN
std::vector<std::vector<BloomFilter>> bloom_filters;
bloom_filters.emplace_back();
std::vector<GinFilters> gin_filters;
gin_filters.emplace_back();
for (const auto & element : const_value.get<Array>())
{
if (element.getType() != Field::Types::String)
return false;
bloom_filters.back().emplace_back(params);
gin_filters.back().emplace_back(params);
const auto & value = element.get<String>();
token_extractor->stringToBloomFilter(value.data(), value.size(), bloom_filters.back().back());
token_extractor->stringToGinFilter(value.data(), value.size(), gin_filters.back().back());
}
out.set_bloom_filters = std::move(bloom_filters);
out.set_gin_filters = std::move(gin_filters);
return true;
}
else if (function_name == "match")
{
out.key_column = *key_index;
out.key_column = key_column_num;
out.function = RPNElement::FUNCTION_MATCH;
out.bloom_filter = std::make_unique<BloomFilter>(params);
auto & value = const_value.get<String>();
String required_substring;
@ -616,100 +640,101 @@ bool MergeTreeConditionFullText::traverseTreeEquals(
if (required_substring.empty() && alternatives.empty())
return false;
/// out.set_bloom_filters means alternatives exist
/// out.bloom_filter means required_substring exists
/// out.set_gin_filters means alternatives exist
/// out.gin_filter means required_substring exists
if (!alternatives.empty())
{
std::vector<std::vector<BloomFilter>> bloom_filters;
bloom_filters.emplace_back();
std::vector<GinFilters> gin_filters;
gin_filters.emplace_back();
for (const auto & alternative : alternatives)
{
bloom_filters.back().emplace_back(params);
token_extractor->stringToBloomFilter(alternative.data(), alternative.size(), bloom_filters.back().back());
gin_filters.back().emplace_back(params);
token_extractor->stringToGinFilter(alternative.data(), alternative.size(), gin_filters.back().back());
}
out.set_bloom_filters = std::move(bloom_filters);
out.set_gin_filters = std::move(gin_filters);
}
else
token_extractor->stringToBloomFilter(required_substring.data(), required_substring.size(), *out.bloom_filter);
{
out.gin_filter = std::make_unique<GinFilter>(params);
token_extractor->stringToGinFilter(required_substring.data(), required_substring.size(), *out.gin_filter);
}
return true;
}
return false;
}
bool MergeTreeConditionFullText::tryPrepareSetBloomFilter(
const RPNBuilderTreeNode & left_argument,
const RPNBuilderTreeNode & right_argument,
bool MergeTreeConditionFullText::tryPrepareSetGinFilter(
const RPNBuilderTreeNode & lhs,
const RPNBuilderTreeNode & rhs,
RPNElement & out)
{
std::vector<KeyTuplePositionMapping> key_tuple_mapping;
DataTypes data_types;
auto left_argument_function_node_optional = left_argument.toFunctionNodeOrNull();
if (left_argument_function_node_optional && left_argument_function_node_optional->getFunctionName() == "tuple")
if (lhs.isFunction() && lhs.toFunctionNode().getFunctionName() == "tuple")
{
const auto & left_argument_function_node = *left_argument_function_node_optional;
size_t left_argument_function_node_arguments_size = left_argument_function_node.getArgumentsSize();
for (size_t i = 0; i < left_argument_function_node_arguments_size; ++i)
const auto function = lhs.toFunctionNode();
auto arguments_size = function.getArgumentsSize();
for (size_t i = 0; i < arguments_size; ++i)
{
if (const auto key = getKeyIndex(left_argument_function_node.getArgumentAt(i).getColumnName()))
if (header.has(function.getArgumentAt(i).getColumnName()))
{
key_tuple_mapping.emplace_back(i, *key);
data_types.push_back(index_data_types[*key]);
auto key = header.getPositionByName(function.getArgumentAt(i).getColumnName());
key_tuple_mapping.emplace_back(i, key);
data_types.push_back(header.getByPosition(key).type);
}
}
}
else if (const auto key = getKeyIndex(left_argument.getColumnName()))
else
{
key_tuple_mapping.emplace_back(0, *key);
data_types.push_back(index_data_types[*key]);
if (header.has(lhs.getColumnName()))
{
auto key = header.getPositionByName(lhs.getColumnName());
key_tuple_mapping.emplace_back(0, key);
data_types.push_back(header.getByPosition(key).type);
}
}
if (key_tuple_mapping.empty())
return false;
auto future_set = right_argument.tryGetPreparedSet(data_types);
auto future_set = rhs.tryGetPreparedSet();
if (!future_set)
return false;
auto prepared_set = future_set->buildOrderedSetInplace(right_argument.getTreeContext().getQueryContext());
auto prepared_set = future_set->buildOrderedSetInplace(rhs.getTreeContext().getQueryContext());
if (!prepared_set || !prepared_set->hasExplicitSetElements())
return false;
for (const auto & prepared_set_data_type : prepared_set->getDataTypes())
{
auto prepared_set_data_type_id = prepared_set_data_type->getTypeId();
if (prepared_set_data_type_id != TypeIndex::String && prepared_set_data_type_id != TypeIndex::FixedString)
for (const auto & data_type : prepared_set->getDataTypes())
if (data_type->getTypeId() != TypeIndex::String && data_type->getTypeId() != TypeIndex::FixedString)
return false;
}
std::vector<std::vector<BloomFilter>> bloom_filters;
std::vector<GinFilters> gin_filters;
std::vector<size_t> key_position;
Columns columns = prepared_set->getSetElements();
size_t prepared_set_total_row_count = prepared_set->getTotalRowCount();
for (const auto & elem : key_tuple_mapping)
{
bloom_filters.emplace_back();
gin_filters.emplace_back();
gin_filters.back().reserve(prepared_set->getTotalRowCount());
key_position.push_back(elem.key_index);
size_t tuple_idx = elem.tuple_index;
const auto & column = columns[tuple_idx];
for (size_t row = 0; row < prepared_set_total_row_count; ++row)
for (size_t row = 0; row < prepared_set->getTotalRowCount(); ++row)
{
bloom_filters.back().emplace_back(params);
gin_filters.back().emplace_back(params);
auto ref = column->getDataAt(row);
token_extractor->stringPaddedToBloomFilter(ref.data, ref.size, bloom_filters.back().back());
token_extractor->stringToGinFilter(ref.data, ref.size, gin_filters.back().back());
}
}
out.set_key_position = std::move(key_position);
out.set_bloom_filters = std::move(bloom_filters);
out.set_gin_filters = std::move(gin_filters);
return true;
}
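/// Illustrative example (column name is hypothetical): for `str IN ('alpha', 'beta')` on a column
/// covered by this index, the prepared set is materialised above and one GinFilter per set element
/// is tokenised, so FUNCTION_IN later accepts a granule if at least one element's tokens are found in it.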
@ -721,48 +746,43 @@ MergeTreeIndexGranulePtr MergeTreeIndexFullText::createIndexGranule() const
MergeTreeIndexAggregatorPtr MergeTreeIndexFullText::createIndexAggregator(const MergeTreeWriterSettings & /*settings*/) const
{
return std::make_shared<MergeTreeIndexAggregatorFullText>(index.column_names, index.name, params, token_extractor.get());
/// should not be called: createIndexAggregatorForPart should be used
assert(false);
return nullptr;
}
MergeTreeIndexAggregatorPtr MergeTreeIndexFullText::createIndexAggregatorForPart(const GinIndexStorePtr & store, const MergeTreeWriterSettings & /*settings*/) const
{
return std::make_shared<MergeTreeIndexAggregatorFullText>(store, index.column_names, index.name, params, token_extractor.get());
}
MergeTreeIndexConditionPtr MergeTreeIndexFullText::createIndexCondition(
const ActionsDAGPtr & filter_dag, ContextPtr context) const
const ActionsDAGPtr & filter_actions_dag, ContextPtr context) const
{
return std::make_shared<MergeTreeConditionFullText>(filter_dag, context, index.sample_block, params, token_extractor.get());
}
return std::make_shared<MergeTreeConditionFullText>(filter_actions_dag, context, index.sample_block, params, token_extractor.get());
};
MergeTreeIndexPtr bloomFilterIndexCreator(
MergeTreeIndexPtr fullTextIndexCreator(
const IndexDescription & index)
{
if (index.type == NgramTokenExtractor::getName())
{
size_t n = index.arguments[0].get<size_t>();
BloomFilterParameters params(
index.arguments[1].get<size_t>(),
index.arguments[2].get<size_t>(),
index.arguments[3].get<size_t>());
size_t n = index.arguments.empty() ? 0 : index.arguments[0].get<size_t>();
UInt64 max_rows = index.arguments.size() < 2 ? DEFAULT_MAX_ROWS_PER_POSTINGS_LIST : index.arguments[1].get<UInt64>();
GinFilterParameters params(n, max_rows);
/// Use SplitTokenExtractor when n is 0, otherwise use NgramTokenExtractor
if (n > 0)
{
auto tokenizer = std::make_unique<NgramTokenExtractor>(n);
return std::make_shared<MergeTreeIndexFullText>(index, params, std::move(tokenizer));
}
else if (index.type == SplitTokenExtractor::getName())
{
BloomFilterParameters params(
index.arguments[0].get<size_t>(),
index.arguments[1].get<size_t>(),
index.arguments[2].get<size_t>());
auto tokenizer = std::make_unique<SplitTokenExtractor>();
return std::make_shared<MergeTreeIndexFullText>(index, params, std::move(tokenizer));
}
else
{
throw Exception(ErrorCodes::LOGICAL_ERROR, "Unknown index type: {}", backQuote(index.name));
auto tokenizer = std::make_unique<SplitTokenExtractor>();
return std::make_shared<MergeTreeIndexFullText>(index, params, std::move(tokenizer));
}
}
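/// Illustrative DDL, assuming this creator is registered under the `full_text` index type name
/// (index and column names are hypothetical):
///     INDEX ft_idx  str TYPE full_text(0) GRANULARITY 1    -- n = 0: SplitTokenExtractor (whole tokens)
///     INDEX ng_idx  str TYPE full_text(3) GRANULARITY 1    -- n = 3: NgramTokenExtractor (3-grams)
///     INDEX ft2_idx str TYPE full_text(0, 100000) GRANULARITY 1
///                                   -- second argument caps rows per postings list (see the validator below)
/// With no arguments, n defaults to 0 and the limit to DEFAULT_MAX_ROWS_PER_POSTINGS_LIST, as above.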
void bloomFilterIndexValidator(const IndexDescription & index, bool /*attach*/)
void fullTextIndexValidator(const IndexDescription & index, bool /*attach*/)
{
for (const auto & index_data_type : index.data_types)
{
@ -770,8 +790,8 @@ void bloomFilterIndexValidator(const IndexDescription & index, bool /*attach*/)
if (data_type.isArray())
{
const auto & array_type = assert_cast<const DataTypeArray &>(*index_data_type);
data_type = WhichDataType(array_type.getNestedType());
const auto & gin_type = assert_cast<const DataTypeArray &>(*index_data_type);
data_type = WhichDataType(gin_type.getNestedType());
}
else if (data_type.isLowCardinality())
{
@ -779,37 +799,28 @@ void bloomFilterIndexValidator(const IndexDescription & index, bool /*attach*/)
data_type = WhichDataType(low_cardinality.getDictionaryType());
}
if (!data_type.isString() && !data_type.isFixedString() && !data_type.isIPv6())
throw Exception(ErrorCodes::INCORRECT_QUERY,
"Ngram and token bloom filter indexes can only be used with column types `String`, `FixedString`, `LowCardinality(String)`, `LowCardinality(FixedString)`, `Array(String)` or `Array(FixedString)`");
if (!data_type.isString() && !data_type.isFixedString())
throw Exception(ErrorCodes::INCORRECT_QUERY, "Full text index can be used only with `String`, `FixedString`,"
"`LowCardinality(String)`, `LowCardinality(FixedString)` "
"column or Array with `String` or `FixedString` values column.");
}
if (index.type == NgramTokenExtractor::getName())
if (index.arguments.size() > 2)
throw Exception(ErrorCodes::INCORRECT_QUERY, "Full text index must have less than two arguments.");
if (!index.arguments.empty() && index.arguments[0].getType() != Field::Types::UInt64)
throw Exception(ErrorCodes::INCORRECT_QUERY, "The first full text index argument must be positive integer.");
if (index.arguments.size() == 2)
{
if (index.arguments.size() != 4)
throw Exception(ErrorCodes::INCORRECT_QUERY, "`ngrambf` index must have exactly 4 arguments.");
if (index.arguments[1].getType() != Field::Types::UInt64)
throw Exception(ErrorCodes::INCORRECT_QUERY, "The second full text index argument must be UInt64");
if (index.arguments[1].get<UInt64>() != UNLIMITED_ROWS_PER_POSTINGS_LIST && index.arguments[1].get<UInt64>() < MIN_ROWS_PER_POSTINGS_LIST)
throw Exception(ErrorCodes::INCORRECT_QUERY, "The maximum rows per postings list must be no less than {}", MIN_ROWS_PER_POSTINGS_LIST);
}
else if (index.type == SplitTokenExtractor::getName())
{
if (index.arguments.size() != 3)
throw Exception(ErrorCodes::INCORRECT_QUERY, "`tokenbf` index must have exactly 3 arguments.");
}
else
{
throw Exception(ErrorCodes::LOGICAL_ERROR, "Unknown index type: {}", backQuote(index.name));
}
assert(index.arguments.size() >= 3);
for (const auto & arg : index.arguments)
if (arg.getType() != Field::Types::UInt64)
throw Exception(ErrorCodes::BAD_ARGUMENTS, "All parameters to *bf_v1 index must be unsigned integers");
/// Just validate
BloomFilterParameters params(
index.arguments[0].get<size_t>(),
index.arguments[1].get<size_t>(),
index.arguments[2].get<size_t>());
size_t ngrams = index.arguments.empty() ? 0 : index.arguments[0].get<size_t>();
UInt64 max_rows_per_postings_list = index.arguments.size() < 2 ? DEFAULT_MAX_ROWS_PER_POSTINGS_LIST : index.arguments[1].get<UInt64>();
GinFilterParameters params(ngrams, max_rows_per_postings_list);
}
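/// Summary of the checks above: at most two arguments; the first (ngram size, 0 meaning token
/// splitting) must be a UInt64; the optional second (maximum rows per postings list) must be a
/// UInt64 equal to UNLIMITED_ROWS_PER_POSTINGS_LIST or at least MIN_ROWS_PER_POSTINGS_LIST;
/// finally a GinFilterParameters is constructed purely to validate the combination.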
}

View File

@ -1,22 +1,20 @@
#pragma once
#include <memory>
#include <Storages/MergeTree/MergeTreeIndices.h>
#include <Storages/MergeTree/KeyCondition.h>
#include <Interpreters/BloomFilter.h>
#include <Interpreters/GinFilter.h>
#include <Interpreters/ITokenExtractor.h>
#include <Storages/MergeTree/KeyCondition.h>
#include <Storages/MergeTree/MergeTreeData.h>
#include <base/types.h>
#include <memory>
namespace DB
{
struct MergeTreeIndexGranuleFullText final : public IMergeTreeIndexGranule
{
explicit MergeTreeIndexGranuleFullText(
const String & index_name_,
size_t columns_number,
const BloomFilterParameters & params_);
const GinFilterParameters & params_);
~MergeTreeIndexGranuleFullText() override = default;
@ -26,9 +24,8 @@ struct MergeTreeIndexGranuleFullText final : public IMergeTreeIndexGranule
bool empty() const override { return !has_elems; }
const String index_name;
const BloomFilterParameters params;
std::vector<BloomFilter> bloom_filters;
const GinFilterParameters params;
GinFilters gin_filters;
bool has_elems;
};
@ -37,9 +34,10 @@ using MergeTreeIndexGranuleFullTextPtr = std::shared_ptr<MergeTreeIndexGranuleFu
struct MergeTreeIndexAggregatorFullText final : IMergeTreeIndexAggregator
{
explicit MergeTreeIndexAggregatorFullText(
GinIndexStorePtr store_,
const Names & index_columns_,
const String & index_name_,
const BloomFilterParameters & params_,
const GinFilterParameters & params_,
TokenExtractorPtr token_extractor_);
~MergeTreeIndexAggregatorFullText() override = default;
@ -49,29 +47,39 @@ struct MergeTreeIndexAggregatorFullText final : IMergeTreeIndexAggregator
void update(const Block & block, size_t * pos, size_t limit) override;
void addToGinFilter(UInt32 rowID, const char * data, size_t length, GinFilter & gin_filter);
GinIndexStorePtr store;
Names index_columns;
String index_name;
BloomFilterParameters params;
const String index_name;
const GinFilterParameters params;
TokenExtractorPtr token_extractor;
MergeTreeIndexGranuleFullTextPtr granule;
};
class MergeTreeConditionFullText final : public IMergeTreeIndexCondition
class MergeTreeConditionFullText final : public IMergeTreeIndexCondition, WithContext
{
public:
MergeTreeConditionFullText(
const ActionsDAGPtr & filter_actions_dag,
ContextPtr context,
const Block & index_sample_block,
const BloomFilterParameters & params_,
const GinFilterParameters & params_,
TokenExtractorPtr token_extactor_);
~MergeTreeConditionFullText() override = default;
bool alwaysUnknownOrTrue() const override;
bool mayBeTrueOnGranule(MergeTreeIndexGranulePtr idx_granule) const override;
bool mayBeTrueOnGranule([[maybe_unused]] MergeTreeIndexGranulePtr idx_granule) const override
{
/// should call mayBeTrueOnGranuleInPart instead
assert(false);
return false;
}
bool mayBeTrueOnGranuleInPart(MergeTreeIndexGranulePtr idx_granule, [[maybe_unused]] PostingsCacheForStore & cache_store) const;
private:
struct KeyTuplePositionMapping
{
@ -90,10 +98,9 @@ private:
FUNCTION_NOT_EQUALS,
FUNCTION_HAS,
FUNCTION_IN,
FUNCTION_MATCH,
FUNCTION_NOT_IN,
FUNCTION_MULTI_SEARCH,
FUNCTION_HAS_ANY,
FUNCTION_MATCH,
FUNCTION_UNKNOWN, /// Can take any value.
/// Operators of the logical expression.
FUNCTION_NOT,
@ -105,18 +112,19 @@ private:
};
RPNElement( /// NOLINT
Function function_ = FUNCTION_UNKNOWN, size_t key_column_ = 0, std::unique_ptr<BloomFilter> && const_bloom_filter_ = nullptr)
: function(function_), key_column(key_column_), bloom_filter(std::move(const_bloom_filter_)) {}
Function function_ = FUNCTION_UNKNOWN, size_t key_column_ = 0, std::unique_ptr<GinFilter> && const_gin_filter_ = nullptr)
: function(function_), key_column(key_column_), gin_filter(std::move(const_gin_filter_)) {}
Function function = FUNCTION_UNKNOWN;
/// For FUNCTION_EQUALS, FUNCTION_NOT_EQUALS, FUNCTION_MULTI_SEARCH and FUNCTION_HAS_ANY
/// For FUNCTION_EQUALS, FUNCTION_NOT_EQUALS and FUNCTION_MULTI_SEARCH
size_t key_column;
/// For FUNCTION_EQUALS, FUNCTION_NOT_EQUALS
std::unique_ptr<BloomFilter> bloom_filter;
std::unique_ptr<GinFilter> gin_filter;
/// For FUNCTION_IN, FUNCTION_NOT_IN, FUNCTION_MULTI_SEARCH and FUNCTION_HAS_ANY
std::vector<std::vector<BloomFilter>> set_bloom_filters;
/// For FUNCTION_IN, FUNCTION_NOT_IN and FUNCTION_MULTI_SEARCH
std::vector<GinFilters> set_gin_filters;
/// For FUNCTION_IN and FUNCTION_NOT_IN
std::vector<size_t> set_key_position;
@ -124,26 +132,26 @@ private:
using RPN = std::vector<RPNElement>;
bool extractAtomFromTree(const RPNBuilderTreeNode & node, RPNElement & out);
bool traverseAtomAST(const RPNBuilderTreeNode & node, RPNElement & out);
bool traverseTreeEquals(
bool traverseASTEquals(
const String & function_name,
const RPNBuilderTreeNode & key_node,
const RPNBuilderTreeNode & key_ast,
const DataTypePtr & value_type,
const Field & value_field,
RPNElement & out);
std::optional<size_t> getKeyIndex(const std::string & key_column_name);
bool tryPrepareSetBloomFilter(const RPNBuilderTreeNode & left_argument, const RPNBuilderTreeNode & right_argument, RPNElement & out);
bool tryPrepareSetGinFilter(const RPNBuilderTreeNode & lhs, const RPNBuilderTreeNode & rhs, RPNElement & out);
static bool createFunctionEqualsCondition(
RPNElement & out, const Field & value, const BloomFilterParameters & params, TokenExtractorPtr token_extractor);
RPNElement & out, const Field & value, const GinFilterParameters & params, TokenExtractorPtr token_extractor);
Names index_columns;
DataTypes index_data_types;
BloomFilterParameters params;
const Block & header;
GinFilterParameters params;
TokenExtractorPtr token_extractor;
RPN rpn;
/// Sets from syntax analyzer.
PreparedSetsPtr prepared_sets;
};
class MergeTreeIndexFullText final : public IMergeTreeIndex
@ -151,7 +159,7 @@ class MergeTreeIndexFullText final : public IMergeTreeIndex
public:
MergeTreeIndexFullText(
const IndexDescription & index_,
const BloomFilterParameters & params_,
const GinFilterParameters & params_,
std::unique_ptr<ITokenExtractor> && token_extractor_)
: IMergeTreeIndex(index_)
, params(params_)
@ -161,11 +169,10 @@ public:
MergeTreeIndexGranulePtr createIndexGranule() const override;
MergeTreeIndexAggregatorPtr createIndexAggregator(const MergeTreeWriterSettings & settings) const override;
MergeTreeIndexAggregatorPtr createIndexAggregatorForPart(const GinIndexStorePtr & store, const MergeTreeWriterSettings & /*settings*/) const override;
MergeTreeIndexConditionPtr createIndexCondition(const ActionsDAGPtr & filter_actions_dag, ContextPtr context) const override;
MergeTreeIndexConditionPtr createIndexCondition(
const ActionsDAGPtr & filter_dag, ContextPtr context) const override;
BloomFilterParameters params;
GinFilterParameters params;
/// Function for selecting next token.
std::unique_ptr<ITokenExtractor> token_extractor;
};

View File

@ -1,102 +0,0 @@
#include <Storages/MergeTree/MergeTreeIndexGranuleBloomFilter.h>
#include <Columns/ColumnArray.h>
#include <Columns/ColumnString.h>
#include <Columns/ColumnNullable.h>
#include <Columns/ColumnFixedString.h>
#include <DataTypes/DataTypeNullable.h>
#include <Common/HashTable/Hash.h>
#include <Interpreters/BloomFilterHash.h>
#include <IO/WriteHelpers.h>
namespace DB
{
namespace ErrorCodes
{
extern const int LOGICAL_ERROR;
}
MergeTreeIndexGranuleBloomFilter::MergeTreeIndexGranuleBloomFilter(size_t bits_per_row_, size_t hash_functions_, size_t index_columns_)
: bits_per_row(bits_per_row_), hash_functions(hash_functions_), bloom_filters(index_columns_)
{
total_rows = 0;
for (size_t column = 0; column < index_columns_; ++column)
bloom_filters[column] = std::make_shared<BloomFilter>(bits_per_row, hash_functions, 0);
}
MergeTreeIndexGranuleBloomFilter::MergeTreeIndexGranuleBloomFilter(
size_t bits_per_row_, size_t hash_functions_, const std::vector<HashSet<UInt64>>& column_hashes_)
: bits_per_row(bits_per_row_), hash_functions(hash_functions_), bloom_filters(column_hashes_.size())
{
if (column_hashes_.empty())
throw Exception(ErrorCodes::LOGICAL_ERROR, "Granule_index_blocks empty or total_rows is zero.");
size_t bloom_filter_max_size = 0;
for (const auto & column_hash : column_hashes_)
bloom_filter_max_size = std::max(bloom_filter_max_size, column_hash.size());
static size_t atom_size = 8;
// If multiple columns are given, we will initialize all the bloom filters
// with the size of the highest-cardinality one. This is done for compatibility with
// existing binary serialization format
total_rows = bloom_filter_max_size;
size_t bytes_size = (bits_per_row * total_rows + atom_size - 1) / atom_size;
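/// Example: with bits_per_row = 8 and total_rows = 1000 this gives (8 * 1000 + 8 - 1) / 8 = 1000,
/// i.e. the total bit count rounded up to whole bytes.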
for (size_t column = 0, columns = column_hashes_.size(); column < columns; ++column)
{
bloom_filters[column] = std::make_shared<BloomFilter>(bytes_size, hash_functions, 0);
fillingBloomFilter(bloom_filters[column], column_hashes_[column]);
}
}
bool MergeTreeIndexGranuleBloomFilter::empty() const
{
return !total_rows;
}
void MergeTreeIndexGranuleBloomFilter::deserializeBinary(ReadBuffer & istr, MergeTreeIndexVersion version)
{
if (version != 1)
throw Exception(ErrorCodes::LOGICAL_ERROR, "Unknown index version {}.", version);
readVarUInt(total_rows, istr);
static size_t atom_size = 8;
size_t bytes_size = (bits_per_row * total_rows + atom_size - 1) / atom_size;
size_t read_size = bytes_size;
for (auto & filter : bloom_filters)
{
filter->resize(bytes_size);
#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
read_size = filter->getFilter().size() * sizeof(BloomFilter::UnderType);
#endif
istr.readStrict(reinterpret_cast<char *>(filter->getFilter().data()), read_size);
}
}
void MergeTreeIndexGranuleBloomFilter::serializeBinary(WriteBuffer & ostr) const
{
if (empty())
throw Exception(ErrorCodes::LOGICAL_ERROR, "Attempt to write empty bloom filter index.");
writeVarUInt(total_rows, ostr);
static size_t atom_size = 8;
size_t write_size = (bits_per_row * total_rows + atom_size - 1) / atom_size;
for (const auto & bloom_filter : bloom_filters)
{
#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
write_size = bloom_filter->getFilter().size() * sizeof(BloomFilter::UnderType);
#endif
ostr.write(reinterpret_cast<const char *>(bloom_filter->getFilter().data()), write_size);
}
}
void MergeTreeIndexGranuleBloomFilter::fillingBloomFilter(BloomFilterPtr & bf, const HashSet<UInt64> &hashes) const
{
for (const auto & bf_base_hash : hashes)
for (size_t i = 0; i < hash_functions; ++i)
bf->addHashWithSeed(bf_base_hash.getKey(), BloomFilterHash::bf_hash_seed[i]);
}
}

View File

@ -1,35 +0,0 @@
#pragma once
#include <Interpreters/BloomFilter.h>
#include <Storages/MergeTree/MergeTreeIndices.h>
#include <Common/HashTable/HashSet.h>
namespace DB
{
class MergeTreeIndexGranuleBloomFilter final : public IMergeTreeIndexGranule
{
public:
MergeTreeIndexGranuleBloomFilter(size_t bits_per_row_, size_t hash_functions_, size_t index_columns_);
MergeTreeIndexGranuleBloomFilter(size_t bits_per_row_, size_t hash_functions_, const std::vector<HashSet<UInt64>> & column_hashes);
bool empty() const override;
void serializeBinary(WriteBuffer & ostr) const override;
void deserializeBinary(ReadBuffer & istr, MergeTreeIndexVersion version) override;
const std::vector<BloomFilterPtr> & getFilters() const { return bloom_filters; }
private:
const size_t bits_per_row;
const size_t hash_functions;
size_t total_rows = 0;
std::vector<BloomFilterPtr> bloom_filters;
void fillingBloomFilter(BloomFilterPtr & bf, const HashSet<UInt64> & hashes) const;
};
}

View File

@ -1,826 +0,0 @@
#include <Storages/MergeTree/MergeTreeIndexInverted.h>
#include <Columns/ColumnArray.h>
#include <Columns/ColumnLowCardinality.h>
#include <Columns/ColumnNullable.h>
#include <Core/Defines.h>
#include <DataTypes/DataTypeArray.h>
#include <DataTypes/DataTypeLowCardinality.h>
#include <DataTypes/DataTypeNullable.h>
#include <DataTypes/DataTypesNumber.h>
#include <IO/ReadHelpers.h>
#include <IO/WriteHelpers.h>
#include <Interpreters/ExpressionActions.h>
#include <Interpreters/ExpressionAnalyzer.h>
#include <Interpreters/GinFilter.h>
#include <Interpreters/TreeRewriter.h>
#include <Interpreters/misc.h>
#include <Parsers/ASTIdentifier.h>
#include <Parsers/ASTLiteral.h>
#include <Parsers/ASTSubquery.h>
#include <Poco/Logger.h>
#include <Storages/MergeTree/MergeTreeData.h>
#include <Storages/MergeTree/MergeTreeIndexUtils.h>
#include <Storages/MergeTree/RPNBuilder.h>
#include <Common/OptimizedRegularExpression.h>
#include <algorithm>
namespace DB
{
namespace ErrorCodes
{
extern const int LOGICAL_ERROR;
extern const int INCORRECT_QUERY;
}
MergeTreeIndexGranuleInverted::MergeTreeIndexGranuleInverted(
const String & index_name_,
size_t columns_number,
const GinFilterParameters & params_)
: index_name(index_name_)
, params(params_)
, gin_filters(columns_number, GinFilter(params))
, has_elems(false)
{
}
void MergeTreeIndexGranuleInverted::serializeBinary(WriteBuffer & ostr) const
{
if (empty())
throw Exception(ErrorCodes::LOGICAL_ERROR, "Attempt to write empty fulltext index {}.", backQuote(index_name));
const auto & size_type = std::make_shared<DataTypeUInt32>();
auto size_serialization = size_type->getDefaultSerialization();
for (const auto & gin_filter : gin_filters)
{
size_t filter_size = gin_filter.getFilter().size();
size_serialization->serializeBinary(filter_size, ostr, {});
ostr.write(reinterpret_cast<const char *>(gin_filter.getFilter().data()), filter_size * sizeof(GinSegmentWithRowIdRangeVector::value_type));
}
}
void MergeTreeIndexGranuleInverted::deserializeBinary(ReadBuffer & istr, MergeTreeIndexVersion version)
{
if (version != 1)
throw Exception(ErrorCodes::LOGICAL_ERROR, "Unknown index version {}.", version);
Field field_rows;
const auto & size_type = std::make_shared<DataTypeUInt32>();
auto size_serialization = size_type->getDefaultSerialization();
for (auto & gin_filter : gin_filters)
{
size_serialization->deserializeBinary(field_rows, istr, {});
size_t filter_size = field_rows.get<size_t>();
gin_filter.getFilter().resize(filter_size);
if (filter_size == 0)
continue;
istr.readStrict(reinterpret_cast<char *>(gin_filter.getFilter().data()), filter_size * sizeof(GinSegmentWithRowIdRangeVector::value_type));
}
has_elems = true;
}
MergeTreeIndexAggregatorInverted::MergeTreeIndexAggregatorInverted(
GinIndexStorePtr store_,
const Names & index_columns_,
const String & index_name_,
const GinFilterParameters & params_,
TokenExtractorPtr token_extractor_)
: store(store_)
, index_columns(index_columns_)
, index_name (index_name_)
, params(params_)
, token_extractor(token_extractor_)
, granule(
std::make_shared<MergeTreeIndexGranuleInverted>(
index_name, index_columns.size(), params))
{
}
MergeTreeIndexGranulePtr MergeTreeIndexAggregatorInverted::getGranuleAndReset()
{
auto new_granule = std::make_shared<MergeTreeIndexGranuleInverted>(
index_name, index_columns.size(), params);
new_granule.swap(granule);
return new_granule;
}
void MergeTreeIndexAggregatorInverted::addToGinFilter(UInt32 rowID, const char * data, size_t length, GinFilter & gin_filter)
{
size_t cur = 0;
size_t token_start = 0;
size_t token_len = 0;
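/// Tokenize the string with the configured token extractor and add every token
/// to the per-column GIN filter together with this row id.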
while (cur < length && token_extractor->nextInStringPadded(data, length, &cur, &token_start, &token_len))
gin_filter.add(data + token_start, token_len, rowID, store);
}
void MergeTreeIndexAggregatorInverted::update(const Block & block, size_t * pos, size_t limit)
{
if (*pos >= block.rows())
throw Exception(ErrorCodes::LOGICAL_ERROR, "The provided position is not less than the number of block rows. "
"Position: {}, Block rows: {}.", *pos, block.rows());
size_t rows_read = std::min(limit, block.rows() - *pos);
auto row_id = store->getNextRowIDRange(rows_read);
auto start_row_id = row_id;
for (size_t col = 0; col < index_columns.size(); ++col)
{
const auto & column_with_type = block.getByName(index_columns[col]);
const auto & column = column_with_type.column;
size_t current_position = *pos;
bool need_to_write = false;
if (isArray(column_with_type.type))
{
const auto & column_array = assert_cast<const ColumnArray &>(*column);
const auto & column_offsets = column_array.getOffsets();
const auto & column_key = column_array.getData();
for (size_t i = 0; i < rows_read; ++i)
{
size_t element_start_row = column_offsets[current_position - 1];
size_t elements_size = column_offsets[current_position] - element_start_row;
for (size_t row_num = 0; row_num < elements_size; ++row_num)
{
auto ref = column_key.getDataAt(element_start_row + row_num);
addToGinFilter(row_id, ref.data, ref.size, granule->gin_filters[col]);
store->incrementCurrentSizeBy(ref.size);
}
current_position += 1;
row_id++;
if (store->needToWrite())
need_to_write = true;
}
}
else
{
for (size_t i = 0; i < rows_read; ++i)
{
auto ref = column->getDataAt(current_position + i);
addToGinFilter(row_id, ref.data, ref.size, granule->gin_filters[col]);
store->incrementCurrentSizeBy(ref.size);
row_id++;
if (store->needToWrite())
need_to_write = true;
}
}
granule->gin_filters[col].addRowRangeToGinFilter(store->getCurrentSegmentID(), start_row_id, static_cast<UInt32>(start_row_id + rows_read - 1));
if (need_to_write)
{
store->writeSegment();
}
}
granule->has_elems = true;
*pos += rows_read;
}
MergeTreeConditionInverted::MergeTreeConditionInverted(
const ActionsDAGPtr & filter_actions_dag,
ContextPtr context_,
const Block & index_sample_block,
const GinFilterParameters & params_,
TokenExtractorPtr token_extactor_)
: WithContext(context_), header(index_sample_block)
, params(params_)
, token_extractor(token_extactor_)
{
if (!filter_actions_dag)
{
rpn.push_back(RPNElement::FUNCTION_UNKNOWN);
return;
}
rpn = std::move(
RPNBuilder<RPNElement>(
filter_actions_dag->getOutputs().at(0), context_,
[&](const RPNBuilderTreeNode & node, RPNElement & out)
{
return this->traverseAtomAST(node, out);
}).extractRPN());
}
/// Keep in-sync with MergeTreeConditionFullText::alwaysUnknownOrTrue
bool MergeTreeConditionInverted::alwaysUnknownOrTrue() const
{
/// Check like in KeyCondition.
std::vector<bool> rpn_stack;
for (const auto & element : rpn)
{
if (element.function == RPNElement::FUNCTION_UNKNOWN
|| element.function == RPNElement::ALWAYS_TRUE)
{
rpn_stack.push_back(true);
}
else if (element.function == RPNElement::FUNCTION_EQUALS
|| element.function == RPNElement::FUNCTION_NOT_EQUALS
|| element.function == RPNElement::FUNCTION_HAS
|| element.function == RPNElement::FUNCTION_IN
|| element.function == RPNElement::FUNCTION_NOT_IN
|| element.function == RPNElement::FUNCTION_MULTI_SEARCH
|| element.function == RPNElement::FUNCTION_MATCH
|| element.function == RPNElement::ALWAYS_FALSE)
{
rpn_stack.push_back(false);
}
else if (element.function == RPNElement::FUNCTION_NOT)
{
// do nothing
}
else if (element.function == RPNElement::FUNCTION_AND)
{
auto arg1 = rpn_stack.back();
rpn_stack.pop_back();
auto arg2 = rpn_stack.back();
rpn_stack.back() = arg1 && arg2;
}
else if (element.function == RPNElement::FUNCTION_OR)
{
auto arg1 = rpn_stack.back();
rpn_stack.pop_back();
auto arg2 = rpn_stack.back();
rpn_stack.back() = arg1 || arg2;
}
else
throw Exception(ErrorCodes::LOGICAL_ERROR, "Unexpected function type in KeyCondition::RPNElement");
}
return rpn_stack[0];
}
bool MergeTreeConditionInverted::mayBeTrueOnGranuleInPart(MergeTreeIndexGranulePtr idx_granule,[[maybe_unused]] PostingsCacheForStore & cache_store) const
{
std::shared_ptr<MergeTreeIndexGranuleInverted> granule
= std::dynamic_pointer_cast<MergeTreeIndexGranuleInverted>(idx_granule);
if (!granule)
throw Exception(ErrorCodes::LOGICAL_ERROR, "GinFilter index condition got a granule with the wrong type.");
/// Check like in KeyCondition.
std::vector<BoolMask> rpn_stack;
for (const auto & element : rpn)
{
if (element.function == RPNElement::FUNCTION_UNKNOWN)
{
rpn_stack.emplace_back(true, true);
}
else if (element.function == RPNElement::FUNCTION_EQUALS
|| element.function == RPNElement::FUNCTION_NOT_EQUALS
|| element.function == RPNElement::FUNCTION_HAS)
{
rpn_stack.emplace_back(granule->gin_filters[element.key_column].contains(*element.gin_filter, cache_store), true);
if (element.function == RPNElement::FUNCTION_NOT_EQUALS)
rpn_stack.back() = !rpn_stack.back();
}
else if (element.function == RPNElement::FUNCTION_IN
|| element.function == RPNElement::FUNCTION_NOT_IN)
{
std::vector<bool> result(element.set_gin_filters.back().size(), true);
for (size_t column = 0; column < element.set_key_position.size(); ++column)
{
const size_t key_idx = element.set_key_position[column];
const auto & gin_filters = element.set_gin_filters[column];
for (size_t row = 0; row < gin_filters.size(); ++row)
result[row] = result[row] && granule->gin_filters[key_idx].contains(gin_filters[row], cache_store);
}
rpn_stack.emplace_back(std::find(std::cbegin(result), std::cend(result), true) != std::end(result), true);
if (element.function == RPNElement::FUNCTION_NOT_IN)
rpn_stack.back() = !rpn_stack.back();
}
else if (element.function == RPNElement::FUNCTION_MULTI_SEARCH)
{
std::vector<bool> result(element.set_gin_filters.back().size(), true);
const auto & gin_filters = element.set_gin_filters[0];
for (size_t row = 0; row < gin_filters.size(); ++row)
result[row] = result[row] && granule->gin_filters[element.key_column].contains(gin_filters[row], cache_store);
rpn_stack.emplace_back(std::find(std::cbegin(result), std::cend(result), true) != std::end(result), true);
}
else if (element.function == RPNElement::FUNCTION_MATCH)
{
if (!element.set_gin_filters.empty())
{
/// Alternative substrings
std::vector<bool> result(element.set_gin_filters.back().size(), true);
const auto & gin_filters = element.set_gin_filters[0];
for (size_t row = 0; row < gin_filters.size(); ++row)
result[row] = result[row] && granule->gin_filters[element.key_column].contains(gin_filters[row], cache_store);
rpn_stack.emplace_back(std::find(std::cbegin(result), std::cend(result), true) != std::end(result), true);
}
else if (element.gin_filter)
{
rpn_stack.emplace_back(granule->gin_filters[element.key_column].contains(*element.gin_filter, cache_store), true);
}
}
else if (element.function == RPNElement::FUNCTION_NOT)
{
rpn_stack.back() = !rpn_stack.back();
}
else if (element.function == RPNElement::FUNCTION_AND)
{
auto arg1 = rpn_stack.back();
rpn_stack.pop_back();
auto arg2 = rpn_stack.back();
rpn_stack.back() = arg1 & arg2;
}
else if (element.function == RPNElement::FUNCTION_OR)
{
auto arg1 = rpn_stack.back();
rpn_stack.pop_back();
auto arg2 = rpn_stack.back();
rpn_stack.back() = arg1 | arg2;
}
else if (element.function == RPNElement::ALWAYS_FALSE)
{
rpn_stack.emplace_back(false, true);
}
else if (element.function == RPNElement::ALWAYS_TRUE)
{
rpn_stack.emplace_back(true, false);
}
else
throw Exception(ErrorCodes::LOGICAL_ERROR, "Unexpected function type in GinFilterCondition::RPNElement");
}
if (rpn_stack.size() != 1)
throw Exception(ErrorCodes::LOGICAL_ERROR, "Unexpected stack size in GinFilterCondition::mayBeTrueOnGranule");
return rpn_stack[0].can_be_true;
}
bool MergeTreeConditionInverted::traverseAtomAST(const RPNBuilderTreeNode & node, RPNElement & out)
{
{
Field const_value;
DataTypePtr const_type;
if (node.tryGetConstant(const_value, const_type))
{
/// Check constant like in KeyCondition
if (const_value.getType() == Field::Types::UInt64)
{
out.function = const_value.get<UInt64>() ? RPNElement::ALWAYS_TRUE : RPNElement::ALWAYS_FALSE;
return true;
}
if (const_value.getType() == Field::Types::Int64)
{
out.function = const_value.get<Int64>() ? RPNElement::ALWAYS_TRUE : RPNElement::ALWAYS_FALSE;
return true;
}
if (const_value.getType() == Field::Types::Float64)
{
out.function = const_value.get<Float64>() != 0.00 ? RPNElement::ALWAYS_TRUE : RPNElement::ALWAYS_FALSE;
return true;
}
}
}
if (node.isFunction())
{
const auto function = node.toFunctionNode();
// auto arguments_size = function.getArgumentsSize();
auto function_name = function.getFunctionName();
size_t function_arguments_size = function.getArgumentsSize();
if (function_arguments_size != 2)
return false;
auto lhs_argument = function.getArgumentAt(0);
auto rhs_argument = function.getArgumentAt(1);
if (functionIsInOrGlobalInOperator(function_name))
{
if (tryPrepareSetGinFilter(lhs_argument, rhs_argument, out))
{
if (function_name == "notIn")
{
out.function = RPNElement::FUNCTION_NOT_IN;
return true;
}
else if (function_name == "in")
{
out.function = RPNElement::FUNCTION_IN;
return true;
}
}
}
else if (function_name == "equals" ||
function_name == "notEquals" ||
function_name == "has" ||
function_name == "mapContains" ||
function_name == "like" ||
function_name == "notLike" ||
function_name == "hasToken" ||
function_name == "hasTokenOrNull" ||
function_name == "startsWith" ||
function_name == "endsWith" ||
function_name == "multiSearchAny" ||
function_name == "match")
{
Field const_value;
DataTypePtr const_type;
if (rhs_argument.tryGetConstant(const_value, const_type))
{
if (traverseASTEquals(function_name, lhs_argument, const_type, const_value, out))
return true;
}
else if (lhs_argument.tryGetConstant(const_value, const_type) && (function_name == "equals" || function_name == "notEquals"))
{
if (traverseASTEquals(function_name, rhs_argument, const_type, const_value, out))
return true;
}
}
}
return false;
}
bool MergeTreeConditionInverted::traverseASTEquals(
const String & function_name,
const RPNBuilderTreeNode & key_ast,
const DataTypePtr & value_type,
const Field & value_field,
RPNElement & out)
{
auto value_data_type = WhichDataType(value_type);
if (!value_data_type.isStringOrFixedString() && !value_data_type.isArray())
return false;
Field const_value = value_field;
size_t key_column_num = 0;
bool key_exists = header.has(key_ast.getColumnName());
bool map_key_exists = header.has(fmt::format("mapKeys({})", key_ast.getColumnName()));
if (key_ast.isFunction())
{
const auto function = key_ast.toFunctionNode();
if (function.getFunctionName() == "arrayElement")
{
/** Try to parse arrayElement for the mapKeys index.
* It is important to ignore keys like column_map['Key'] = '' because if the key does not exist in the map,
* arrayElement returns the default value.
*
* We cannot skip keys that do not exist in the map if the comparison is with the default value of the type, because
* that way we would skip necessary granules where the map key does not exist.
*/
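/// For example (hypothetical query, not taken from the source): `column_map['SomeKey'] = 'value'`
/// is matched against the `mapKeys(column_map)` index column using the constant key 'SomeKey',
/// while `column_map['SomeKey'] = ''` is rejected below because a missing key also yields ''.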
if (value_field == value_type->getDefault())
return false;
auto first_argument = function.getArgumentAt(0);
const auto map_column_name = first_argument.getColumnName();
auto map_keys_index_column_name = fmt::format("mapKeys({})", map_column_name);
auto map_values_index_column_name = fmt::format("mapValues({})", map_column_name);
if (header.has(map_keys_index_column_name))
{
auto argument = function.getArgumentAt(1);
DataTypePtr const_type;
if (argument.tryGetConstant(const_value, const_type))
{
auto const_data_type = WhichDataType(const_type);
if (!const_data_type.isStringOrFixedString() && !const_data_type.isArray())
return false;
key_column_num = header.getPositionByName(map_keys_index_column_name);
key_exists = true;
}
else
{
return false;
}
}
else if (header.has(map_values_index_column_name))
{
key_column_num = header.getPositionByName(map_values_index_column_name);
key_exists = true;
}
else
{
return false;
}
}
}
if (!key_exists && !map_key_exists)
return false;
if (map_key_exists && (function_name == "has" || function_name == "mapContains"))
{
out.key_column = key_column_num;
out.function = RPNElement::FUNCTION_HAS;
out.gin_filter = std::make_unique<GinFilter>(params);
auto & value = const_value.get<String>();
token_extractor->stringToGinFilter(value.data(), value.size(), *out.gin_filter);
return true;
}
else if (function_name == "has")
{
out.key_column = key_column_num;
out.function = RPNElement::FUNCTION_HAS;
out.gin_filter = std::make_unique<GinFilter>(params);
auto & value = const_value.get<String>();
token_extractor->stringToGinFilter(value.data(), value.size(), *out.gin_filter);
return true;
}
if (function_name == "notEquals")
{
out.key_column = key_column_num;
out.function = RPNElement::FUNCTION_NOT_EQUALS;
out.gin_filter = std::make_unique<GinFilter>(params);
const auto & value = const_value.get<String>();
token_extractor->stringToGinFilter(value.data(), value.size(), *out.gin_filter);
return true;
}
else if (function_name == "equals")
{
out.key_column = key_column_num;
out.function = RPNElement::FUNCTION_EQUALS;
out.gin_filter = std::make_unique<GinFilter>(params);
const auto & value = const_value.get<String>();
token_extractor->stringToGinFilter(value.data(), value.size(), *out.gin_filter);
return true;
}
else if (function_name == "like")
{
out.key_column = key_column_num;
out.function = RPNElement::FUNCTION_EQUALS;
out.gin_filter = std::make_unique<GinFilter>(params);
const auto & value = const_value.get<String>();
token_extractor->stringLikeToGinFilter(value.data(), value.size(), *out.gin_filter);
return true;
}
else if (function_name == "notLike")
{
out.key_column = key_column_num;
out.function = RPNElement::FUNCTION_NOT_EQUALS;
out.gin_filter = std::make_unique<GinFilter>(params);
const auto & value = const_value.get<String>();
token_extractor->stringLikeToGinFilter(value.data(), value.size(), *out.gin_filter);
return true;
}
else if (function_name == "hasToken" || function_name == "hasTokenOrNull")
{
out.key_column = key_column_num;
out.function = RPNElement::FUNCTION_EQUALS;
out.gin_filter = std::make_unique<GinFilter>(params);
const auto & value = const_value.get<String>();
token_extractor->stringToGinFilter(value.data(), value.size(), *out.gin_filter);
return true;
}
else if (function_name == "startsWith")
{
out.key_column = key_column_num;
out.function = RPNElement::FUNCTION_EQUALS;
out.gin_filter = std::make_unique<GinFilter>(params);
const auto & value = const_value.get<String>();
token_extractor->stringToGinFilter(value.data(), value.size(), *out.gin_filter);
return true;
}
else if (function_name == "endsWith")
{
out.key_column = key_column_num;
out.function = RPNElement::FUNCTION_EQUALS;
out.gin_filter = std::make_unique<GinFilter>(params);
const auto & value = const_value.get<String>();
token_extractor->stringToGinFilter(value.data(), value.size(), *out.gin_filter);
return true;
}
else if (function_name == "multiSearchAny")
{
out.key_column = key_column_num;
out.function = RPNElement::FUNCTION_MULTI_SEARCH;
/// 2d vector is not needed here but is used because already exists for FUNCTION_IN
std::vector<GinFilters> gin_filters;
gin_filters.emplace_back();
for (const auto & element : const_value.get<Array>())
{
if (element.getType() != Field::Types::String)
return false;
gin_filters.back().emplace_back(params);
const auto & value = element.get<String>();
token_extractor->stringToGinFilter(value.data(), value.size(), gin_filters.back().back());
}
out.set_gin_filters = std::move(gin_filters);
return true;
}
else if (function_name == "match")
{
out.key_column = key_column_num;
out.function = RPNElement::FUNCTION_MATCH;
auto & value = const_value.get<String>();
String required_substring;
bool dummy_is_trivial, dummy_required_substring_is_prefix;
std::vector<String> alternatives;
OptimizedRegularExpression::analyze(value, required_substring, dummy_is_trivial, dummy_required_substring_is_prefix, alternatives);
if (required_substring.empty() && alternatives.empty())
return false;
/// out.set_gin_filters means alternatives exist
/// out.gin_filter means required_substring exists
if (!alternatives.empty())
{
std::vector<GinFilters> gin_filters;
gin_filters.emplace_back();
for (const auto & alternative : alternatives)
{
gin_filters.back().emplace_back(params);
token_extractor->stringToGinFilter(alternative.data(), alternative.size(), gin_filters.back().back());
}
out.set_gin_filters = std::move(gin_filters);
}
else
{
out.gin_filter = std::make_unique<GinFilter>(params);
token_extractor->stringToGinFilter(required_substring.data(), required_substring.size(), *out.gin_filter);
}
return true;
}
return false;
}
bool MergeTreeConditionInverted::tryPrepareSetGinFilter(
const RPNBuilderTreeNode & lhs,
const RPNBuilderTreeNode & rhs,
RPNElement & out)
{
std::vector<KeyTuplePositionMapping> key_tuple_mapping;
DataTypes data_types;
if (lhs.isFunction() && lhs.toFunctionNode().getFunctionName() == "tuple")
{
const auto function = lhs.toFunctionNode();
auto arguments_size = function.getArgumentsSize();
for (size_t i = 0; i < arguments_size; ++i)
{
if (header.has(function.getArgumentAt(i).getColumnName()))
{
auto key = header.getPositionByName(function.getArgumentAt(i).getColumnName());
key_tuple_mapping.emplace_back(i, key);
data_types.push_back(header.getByPosition(key).type);
}
}
}
else
{
if (header.has(lhs.getColumnName()))
{
auto key = header.getPositionByName(lhs.getColumnName());
key_tuple_mapping.emplace_back(0, key);
data_types.push_back(header.getByPosition(key).type);
}
}
if (key_tuple_mapping.empty())
return false;
auto future_set = rhs.tryGetPreparedSet();
if (!future_set)
return false;
auto prepared_set = future_set->buildOrderedSetInplace(rhs.getTreeContext().getQueryContext());
if (!prepared_set || !prepared_set->hasExplicitSetElements())
return false;
for (const auto & data_type : prepared_set->getDataTypes())
if (data_type->getTypeId() != TypeIndex::String && data_type->getTypeId() != TypeIndex::FixedString)
return false;
std::vector<GinFilters> gin_filters;
std::vector<size_t> key_position;
Columns columns = prepared_set->getSetElements();
for (const auto & elem : key_tuple_mapping)
{
gin_filters.emplace_back();
gin_filters.back().reserve(prepared_set->getTotalRowCount());
key_position.push_back(elem.key_index);
size_t tuple_idx = elem.tuple_index;
const auto & column = columns[tuple_idx];
for (size_t row = 0; row < prepared_set->getTotalRowCount(); ++row)
{
gin_filters.back().emplace_back(params);
auto ref = column->getDataAt(row);
token_extractor->stringToGinFilter(ref.data, ref.size, gin_filters.back().back());
}
}
out.set_key_position = std::move(key_position);
out.set_gin_filters = std::move(gin_filters);
return true;
}
MergeTreeIndexGranulePtr MergeTreeIndexInverted::createIndexGranule() const
{
return std::make_shared<MergeTreeIndexGranuleInverted>(index.name, index.column_names.size(), params);
}
MergeTreeIndexAggregatorPtr MergeTreeIndexInverted::createIndexAggregator(const MergeTreeWriterSettings & /*settings*/) const
{
/// should not be called: createIndexAggregatorForPart should be used
assert(false);
return nullptr;
}
MergeTreeIndexAggregatorPtr MergeTreeIndexInverted::createIndexAggregatorForPart(const GinIndexStorePtr & store, const MergeTreeWriterSettings & /*settings*/) const
{
return std::make_shared<MergeTreeIndexAggregatorInverted>(store, index.column_names, index.name, params, token_extractor.get());
}
MergeTreeIndexConditionPtr MergeTreeIndexInverted::createIndexCondition(
const ActionsDAGPtr & filter_actions_dag, ContextPtr context) const
{
return std::make_shared<MergeTreeConditionInverted>(filter_actions_dag, context, index.sample_block, params, token_extractor.get());
};
MergeTreeIndexPtr invertedIndexCreator(
const IndexDescription & index)
{
size_t n = index.arguments.empty() ? 0 : index.arguments[0].get<size_t>();
UInt64 max_rows = index.arguments.size() < 2 ? DEFAULT_MAX_ROWS_PER_POSTINGS_LIST : index.arguments[1].get<UInt64>();
GinFilterParameters params(n, max_rows);
/// Use SplitTokenExtractor when n is 0, otherwise use NgramTokenExtractor
if (n > 0)
{
auto tokenizer = std::make_unique<NgramTokenExtractor>(n);
return std::make_shared<MergeTreeIndexInverted>(index, params, std::move(tokenizer));
}
else
{
auto tokenizer = std::make_unique<SplitTokenExtractor>();
return std::make_shared<MergeTreeIndexInverted>(index, params, std::move(tokenizer));
}
}
void invertedIndexValidator(const IndexDescription & index, bool /*attach*/)
{
for (const auto & index_data_type : index.data_types)
{
WhichDataType data_type(index_data_type);
if (data_type.isArray())
{
const auto & gin_type = assert_cast<const DataTypeArray &>(*index_data_type);
data_type = WhichDataType(gin_type.getNestedType());
}
else if (data_type.isLowCardinality())
{
const auto & low_cardinality = assert_cast<const DataTypeLowCardinality &>(*index_data_type);
data_type = WhichDataType(low_cardinality.getDictionaryType());
}
if (!data_type.isString() && !data_type.isFixedString())
throw Exception(ErrorCodes::INCORRECT_QUERY, "Inverted index can be used only with `String`, `FixedString`,"
"`LowCardinality(String)`, `LowCardinality(FixedString)` "
"column or Array with `String` or `FixedString` values column.");
}
if (index.arguments.size() > 2)
throw Exception(ErrorCodes::INCORRECT_QUERY, "Inverted index must have less than two arguments.");
if (!index.arguments.empty() && index.arguments[0].getType() != Field::Types::UInt64)
throw Exception(ErrorCodes::INCORRECT_QUERY, "The first Inverted index argument must be positive integer.");
if (index.arguments.size() == 2)
{
if (index.arguments[1].getType() != Field::Types::UInt64)
throw Exception(ErrorCodes::INCORRECT_QUERY, "The second Inverted index argument must be UInt64");
if (index.arguments[1].get<UInt64>() != UNLIMITED_ROWS_PER_POSTINGS_LIST && index.arguments[1].get<UInt64>() < MIN_ROWS_PER_POSTINGS_LIST)
throw Exception(ErrorCodes::INCORRECT_QUERY, "The maximum rows per postings list must be no less than {}", MIN_ROWS_PER_POSTINGS_LIST);
}
/// Just validate
size_t ngrams = index.arguments.empty() ? 0 : index.arguments[0].get<size_t>();
UInt64 max_rows_per_postings_list = index.arguments.size() < 2 ? DEFAULT_MAX_ROWS_PER_POSTINGS_LIST : index.arguments[1].get<UInt64>();
GinFilterParameters params(ngrams, max_rows_per_postings_list);
}
}

View File

@ -115,18 +115,18 @@ MergeTreeIndexFactory::MergeTreeIndexFactory()
registerCreator("set", setIndexCreator);
registerValidator("set", setIndexValidator);
registerCreator("ngrambf_v1", bloomFilterIndexCreator);
registerValidator("ngrambf_v1", bloomFilterIndexValidator);
registerCreator("ngrambf_v1", bloomFilterIndexTextCreator);
registerValidator("ngrambf_v1", bloomFilterIndexTextValidator);
registerCreator("tokenbf_v1", bloomFilterIndexCreator);
registerValidator("tokenbf_v1", bloomFilterIndexValidator);
registerCreator("tokenbf_v1", bloomFilterIndexTextCreator);
registerValidator("tokenbf_v1", bloomFilterIndexTextValidator);
registerCreator("bloom_filter", bloomFilterIndexCreatorNew);
registerValidator("bloom_filter", bloomFilterIndexValidatorNew);
registerCreator("bloom_filter", bloomFilterIndexCreator);
registerValidator("bloom_filter", bloomFilterIndexValidator);
registerCreator("hypothesis", hypothesisIndexCreator);
registerValidator("hypothesis", hypothesisIndexValidator);
registerValidator("hypothesis", hypothesisIndexValidator);
#ifdef ENABLE_ANNOY
registerCreator("annoy", annoyIndexCreator);
registerValidator("annoy", annoyIndexValidator);
@ -137,8 +137,8 @@ MergeTreeIndexFactory::MergeTreeIndexFactory()
registerValidator("usearch", usearchIndexValidator);
#endif
registerCreator("inverted", invertedIndexCreator);
registerValidator("inverted", invertedIndexValidator);
registerCreator("full_text", fullTextIndexCreator);
registerValidator("full_text", fullTextIndexValidator);
}

View File

@ -221,12 +221,12 @@ void minmaxIndexValidator(const IndexDescription & index, bool attach);
MergeTreeIndexPtr setIndexCreator(const IndexDescription & index);
void setIndexValidator(const IndexDescription & index, bool attach);
MergeTreeIndexPtr bloomFilterIndexTextCreator(const IndexDescription & index);
void bloomFilterIndexTextValidator(const IndexDescription & index, bool attach);
MergeTreeIndexPtr bloomFilterIndexCreator(const IndexDescription & index);
void bloomFilterIndexValidator(const IndexDescription & index, bool attach);
MergeTreeIndexPtr bloomFilterIndexCreatorNew(const IndexDescription & index);
void bloomFilterIndexValidatorNew(const IndexDescription & index, bool attach);
MergeTreeIndexPtr hypothesisIndexCreator(const IndexDescription & index);
void hypothesisIndexValidator(const IndexDescription & index, bool attach);
@ -240,7 +240,7 @@ MergeTreeIndexPtr usearchIndexCreator(const IndexDescription& index);
void usearchIndexValidator(const IndexDescription& index, bool attach);
#endif
MergeTreeIndexPtr invertedIndexCreator(const IndexDescription& index);
void invertedIndexValidator(const IndexDescription& index, bool attach);
MergeTreeIndexPtr fullTextIndexCreator(const IndexDescription& index);
void fullTextIndexValidator(const IndexDescription& index, bool attach);
}

View File

@ -22,7 +22,7 @@
#include <Storages/MergeTree/MergeTreeDataWriter.h>
#include <Storages/MutationCommands.h>
#include <Storages/MergeTree/MergeTreeDataMergerMutator.h>
#include <Storages/MergeTree/MergeTreeIndexInverted.h>
#include <Storages/MergeTree/MergeTreeIndexFullText.h>
#include <Storages/MergeTree/MergeTreeVirtualColumns.h>
#include <DataTypes/DataTypeNullable.h>
#include <DataTypes/DataTypeVariant.h>
@ -653,7 +653,7 @@ static NameSet collectFilesToSkip(
files_to_skip.insert(index->getFileName() + mrk_extension);
// Skip all inverted index files, for they will be rebuilt
if (dynamic_cast<const MergeTreeIndexInverted *>(index.get()))
if (dynamic_cast<const MergeTreeIndexFullText *>(index.get()))
{
auto index_filename = index->getFileName();
files_to_skip.insert(index_filename + ".gin_dict");

View File

@ -33,8 +33,6 @@ public:
bool update(const ContextPtr & context);
void connect(const ContextPtr & context);
bool withGlobs() const { return blob_path.find_first_of("*?{") != std::string::npos; }
bool withWildcard() const

View File

@ -1866,7 +1866,6 @@ namespace
configuration.url.version_id,
configuration.request_settings,
/*with_metadata=*/ false,
/*for_disk_s3=*/ false,
/*throw_on_error= */ false).last_modification_time;
}

View File

@ -97,6 +97,7 @@ const char * auto_contributors[] {
"Alexey Gusev",
"Alexey Ilyukhov",
"Alexey Ivanov",
"Alexey Katsman",
"Alexey Korepanov",
"Alexey Milovidov",
"Alexey Perevyshin",
@ -156,6 +157,7 @@ const char * auto_contributors[] {
"Andy Yang",
"AndyB",
"Anish Bhanwala",
"Anita Hammer",
"Anmol Arora",
"Anna",
"Anna Shakhova",
@ -180,6 +182,7 @@ const char * auto_contributors[] {
"Aram Peres",
"Ariel Robaldo",
"Aris Tritas",
"Arnaud Rocher",
"Arsen Hakobyan",
"Arslan G",
"ArtCorp",
@ -252,6 +255,7 @@ const char * auto_contributors[] {
"Carbyn",
"Carlos Rodríguez Hernández",
"Caspian",
"Chandre Van Der Westhuizen",
"Chang Chen",
"Chao Ma",
"Chao Wang",
@ -351,6 +355,7 @@ const char * auto_contributors[] {
"Duc Canh Le",
"DuckSoft",
"Duyet Le",
"Eduard Karacharov",
"Egor O'Sten",
"Egor Savin",
"Eirik",
@ -360,6 +365,7 @@ const char * auto_contributors[] {
"Elena Baskakova",
"Elena Torró",
"Elghazal Ahmed",
"Eliot Hautefeuille",
"Elizaveta Mironyuk",
"Elykov Alexandr",
"Emmanuel Donin de Rosière",
@ -469,6 +475,7 @@ const char * auto_contributors[] {
"Ignat Loskutov",
"Igor",
"Igor Hatarist",
"Igor Markelov",
"Igor Mineev",
"Igor Nikonov",
"Igor Strykhar",
@ -478,6 +485,7 @@ const char * auto_contributors[] {
"Ildar Musin",
"Ildus Kurbangaliev",
"Ilya",
"Ilya Andreev",
"Ilya Breev",
"Ilya Golshtein",
"Ilya Khomutov",
@ -531,6 +539,7 @@ const char * auto_contributors[] {
"Jean Baptiste Favre",
"Jeffrey Dang",
"Jens Hoevenaars",
"Jhonso7393",
"Jiading Guo",
"Jianfei Hu",
"Jiang Tao",
@ -557,6 +566,8 @@ const char * auto_contributors[] {
"Joris Clement",
"Joris Giovannangeli",
"Jose",
"Joseph Redfern",
"Josh Rodriguez",
"Josh Taylor",
"Joshua Hildred",
"João Figueiredo",
@ -585,6 +596,7 @@ const char * auto_contributors[] {
"KevinyhZou",
"KinderRiven",
"Kiran",
"Kirill",
"Kirill Danshin",
"Kirill Ershov",
"Kirill Malev",
@ -605,6 +617,7 @@ const char * auto_contributors[] {
"Korviakov Andrey",
"Kostiantyn Storozhuk",
"Kozlov Ivan",
"KrJin",
"Krisztián Szűcs",
"Kruglov Pavel",
"Krzysztof Góralski",
@ -653,6 +666,7 @@ const char * auto_contributors[] {
"M1eyu2018",
"MEX7",
"MaceWindu",
"Maciej Bak",
"MagiaGroz",
"Maks Skorokhod",
"Maksim",
@ -766,6 +780,7 @@ const char * auto_contributors[] {
"MovElb",
"Mr.General",
"Murat Kabilov",
"Murat Khairulin",
"MyroTk",
"Márcio Martins",
"Mátyás Jani",
@ -863,6 +878,7 @@ const char * auto_contributors[] {
"Pavel Yakunin",
"Pavlo Bashynskiy",
"Pawel Rog",
"Paweł Kudzia",
"Peignon Melvyn",
"Peng Jian",
"Peng Liu",
@ -1069,6 +1085,7 @@ const char * auto_contributors[] {
"Tom Risse",
"Tomas Barton",
"Tomáš Hromada",
"Tristan",
"Tsarkova Anastasia",
"TszkitLo40",
"Tyler Hannan",
@ -1342,6 +1359,7 @@ const char * auto_contributors[] {
"dfenelonov",
"dgrr",
"dheerajathrey",
"dilet6298",
"dimarub2000",
"dinosaur",
"divanik",
@ -1552,6 +1570,7 @@ const char * auto_contributors[] {
"lomberts",
"loneylee",
"long2ice",
"loselarry",
"loyispa",
"lthaooo",
"ltrk2",

View File

@ -1,4 +1,4 @@
#include <Storages/MergeTree/MergeTreeIndexFullText.h>
#include <Storages/MergeTree/MergeTreeIndexBloomFilterText.h>
#include <Common/PODArray_fwd.h>

View File

@ -888,7 +888,7 @@ class CiOptions:
jobs_to_do_requested.append(job)
assert (
jobs_to_do_requested
), "Include tags are set but now job configured - Invalid tags, probably [{self.include_keywords}]"
), f"Include tags are set but no job configured - Invalid tags, probably [{self.include_keywords}]"
if JobNames.STYLE_CHECK not in jobs_to_do_requested:
# Style check must not be omitted
jobs_to_do_requested.append(JobNames.STYLE_CHECK)
@ -943,10 +943,12 @@ class CiOptions:
# we need to add params - otherwise it won't run as "batches" list will be empty
for job in jobs_to_do:
if job not in jobs_params:
num_batches = CI_CONFIG.get_job_config(job).num_batches
job_config = CI_CONFIG.get_job_config(job)
num_batches = job_config.num_batches
jobs_params[job] = {
"batches": list(range(num_batches)),
"num_batches": num_batches,
"run_if_ci_option_include_set": job_config.run_by_ci_option,
}
# 4. Handle "batch_" tags
@ -958,6 +960,18 @@ class CiOptions:
if params["num_batches"] > 1:
params["batches"] = self.job_batches
for job in jobs_to_do[:]:
job_param = jobs_params[job]
if (
job_param["run_if_ci_option_include_set"]
and job not in jobs_to_do_requested
):
print(
f"Erasing job '{job}' from list because it's not in included set, but will run only by include"
)
jobs_to_skip.append(job)
jobs_to_do.remove(job)
return jobs_to_do, jobs_to_skip, jobs_params
@ -1369,7 +1383,11 @@ def _configure_jobs(
continue
if job_config.pr_only and pr_info.is_release_branch:
continue
if job_config.release_only and not pr_info.is_release_branch:
if (
job_config.release_only
and not job_config.run_by_ci_option
and not pr_info.is_release_branch
):
continue
# fill job randomization buckets (for jobs with configured @random_bucket property))
@ -1421,6 +1439,7 @@ def _configure_jobs(
jobs_params[job] = {
"batches": batches_to_do,
"num_batches": num_batches,
"run_if_ci_option_include_set": job_config.run_by_ci_option,
}
elif add_to_skip:
# treat job as being skipped only if it's controlled by digest

View File

@ -78,7 +78,7 @@ class Build(metaclass=WithIter):
BINARY_AMD64_COMPAT = "binary_amd64_compat"
BINARY_AMD64_MUSL = "binary_amd64_musl"
BINARY_RISCV64 = "binary_riscv64"
# BINARY_S390X = "binary_s390x" # disabled because s390x refused to build in the migration to OpenSSL
BINARY_S390X = "binary_s390x"
FUZZERS = "fuzzers"
@ -131,6 +131,7 @@ class JobNames(metaclass=WithIter):
STRESS_TEST_MSAN = "Stress test (msan)"
STRESS_TEST_DEBUG = "Stress test (debug)"
STRESS_TEST_AZURE_TSAN = "Stress test (azure, tsan)"
STRESS_TEST_AZURE_MSAN = "Stress test (azure, msan)"
INTEGRATION_TEST = "Integration tests (release)"
INTEGRATION_TEST_ASAN = "Integration tests (asan)"
@ -244,6 +245,8 @@ class JobConfig:
pr_only: bool = False
# job is for release/master branches only
release_only: bool = False
# job will run if it's enabled in CI option
run_by_ci_option: bool = False
# to randomly pick and run one job among jobs in the same @random_bucket. Applied in PR branches only.
random_bucket: str = ""
@ -1031,13 +1034,12 @@ CI_CONFIG = CIConfig(
package_type="binary",
static_binary_name="riscv64",
),
# disabled because s390x refused to build in the migration to OpenSSL
# Build.BINARY_S390X: BuildConfig(
# name=Build.BINARY_S390X,
# compiler="clang-17-s390x",
# package_type="binary",
# static_binary_name="s390x",
# ),
Build.BINARY_S390X: BuildConfig(
name=Build.BINARY_S390X,
compiler="clang-17-s390x",
package_type="binary",
static_binary_name="s390x",
),
Build.FUZZERS: BuildConfig(
name=Build.FUZZERS,
compiler="clang-17",
@ -1067,7 +1069,7 @@ CI_CONFIG = CIConfig(
Build.BINARY_DARWIN_AARCH64,
Build.BINARY_PPC64LE,
Build.BINARY_RISCV64,
# Build.BINARY_S390X, # disabled because s390x refused to build in the migration to OpenSSL
Build.BINARY_S390X,
Build.BINARY_AMD64_COMPAT,
Build.BINARY_AMD64_MUSL,
Build.PACKAGE_RELEASE_COVERAGE,
@ -1205,7 +1207,7 @@ CI_CONFIG = CIConfig(
),
JobNames.STATELESS_TEST_AZURE_ASAN: TestConfig(
Build.PACKAGE_ASAN,
job_config=JobConfig(num_batches=4, **statless_test_common_params, release_only=True), # type: ignore
job_config=JobConfig(num_batches=4, **statless_test_common_params, release_only=True, run_by_ci_option=True), # type: ignore
),
JobNames.STATELESS_TEST_S3_TSAN: TestConfig(
Build.PACKAGE_TSAN,
@ -1230,7 +1232,10 @@ CI_CONFIG = CIConfig(
Build.PACKAGE_ASAN, job_config=JobConfig(pr_only=True, random_bucket="upgrade_with_sanitizer", **upgrade_test_common_params) # type: ignore
),
JobNames.STRESS_TEST_AZURE_TSAN: TestConfig(
Build.PACKAGE_TSAN, job_config=JobConfig(**stress_test_common_params, release_only=True) # type: ignore
Build.PACKAGE_TSAN, job_config=JobConfig(**stress_test_common_params, release_only=True, run_by_ci_option=True) # type: ignore
),
JobNames.STRESS_TEST_AZURE_MSAN: TestConfig(
Build.PACKAGE_MSAN, job_config=JobConfig(**stress_test_common_params, release_only=True, run_by_ci_option=True) # type: ignore
),
JobNames.UPGRADE_TEST_TSAN: TestConfig(
Build.PACKAGE_TSAN, job_config=JobConfig(pr_only=True, random_bucket="upgrade_with_sanitizer", **upgrade_test_common_params) # type: ignore

View File

@ -29,6 +29,7 @@ _TEST_BODY_1 = """
_TEST_BODY_2 = """
- [x] <!---ci_include_integration--> MUST include integration tests
- [x] <!---ci_include_stateless--> MUST include stateless tests
- [x] <!---ci_include_azure--> MUST include azure
- [x] <!---ci_include_foo_Bar--> no action must be applied
- [ ] <!---ci_include_bar--> no action must be applied
- [x] <!---ci_exclude_tsan--> MUST exclude tsan
@ -43,6 +44,10 @@ _TEST_BODY_3 = """
- [x] <!---ci_include_analyzer--> Must include all tests for analyzer
"""
_TEST_BODY_4 = """
"""
_TEST_JOB_LIST = [
"Style check",
"Fast test",
@ -64,6 +69,7 @@ _TEST_JOB_LIST = [
"Stateless tests (debug, s3 storage)",
"Stateless tests (tsan, s3 storage)",
"Stateless tests flaky check (asan)",
"Stateless tests (azure, asan)",
"Stateful tests (debug)",
"Stateful tests (release)",
"Stateful tests (coverage)",
@ -141,7 +147,8 @@ class TestCIOptions(unittest.TestCase):
_TEST_BODY_2, update_from_api=False
)
self.assertCountEqual(
ci_options.include_keywords, ["integration", "foo_bar", "stateless"]
ci_options.include_keywords,
["integration", "foo_bar", "stateless", "azure"],
)
self.assertCountEqual(
ci_options.exclude_keywords,
@ -149,7 +156,13 @@ class TestCIOptions(unittest.TestCase):
)
jobs_to_do = list(_TEST_JOB_LIST)
jobs_to_skip = []
job_params = {}
job_params = {
"Stateless tests (azure, asan)": {
"batches": list(range(3)),
"num_batches": 3,
"run_if_ci_option_include_set": True,
}
}
jobs_to_do, jobs_to_skip, job_params = ci_options.apply(
jobs_to_do, jobs_to_skip, job_params
)
@ -160,6 +173,7 @@ class TestCIOptions(unittest.TestCase):
"package_release",
"package_asan",
"Stateless tests (asan)",
"Stateless tests (azure, asan)",
"Stateless tests flaky check (asan)",
"Stateless tests (msan)",
"Stateless tests (ubsan)",
@ -194,3 +208,32 @@ class TestCIOptions(unittest.TestCase):
"package_asan",
],
)
def test_options_applied_3(self):
self.maxDiff = None
ci_options = CiOptions.create_from_pr_message(
_TEST_BODY_4, update_from_api=False
)
self.assertIsNone(ci_options.include_keywords, None)
self.assertIsNone(ci_options.exclude_keywords, None)
jobs_to_do = list(_TEST_JOB_LIST)
jobs_to_skip = []
job_params = {}
for job in _TEST_JOB_LIST:
if "Stateless" in job:
job_params[job] = {
"batches": list(range(3)),
"num_batches": 3,
"run_if_ci_option_include_set": "azure" in job,
}
else:
job_params[job] = {"run_if_ci_option_include_set": False}
jobs_to_do, jobs_to_skip, job_params = ci_options.apply(
jobs_to_do, jobs_to_skip, job_params
)
self.assertNotIn(
"Stateless tests (azure, asan)",
jobs_to_do,
)

View File

@ -734,9 +734,9 @@ def get_localzone():
class SettingsRandomizer:
settings = {
"max_insert_threads": lambda: 0
if random.random() < 0.5
else random.randint(1, 16),
"max_insert_threads": lambda: (
0 if random.random() < 0.5 else random.randint(1, 16)
),
"group_by_two_level_threshold": threshold_generator(0.2, 0.2, 1, 1000000),
"group_by_two_level_threshold_bytes": threshold_generator(
0.2, 0.2, 1, 50000000
@ -1470,11 +1470,14 @@ class TestCase:
args.collect_per_test_coverage
and BuildFlags.SANITIZE_COVERAGE in args.build_flags
):
clickhouse_execute(
args,
f"INSERT INTO system.coverage_log SELECT now(), '{self.case}', coverageCurrent()",
retry_error_codes=True,
)
try:
clickhouse_execute(
args,
f"INSERT INTO system.coverage_log SELECT now(), '{self.case}', coverageCurrent()",
retry_error_codes=True,
)
except Exception as e:
print("Cannot insert coverage data: ", str(e))
# Check for dumped coverage files
file_pattern = "coverage.*"
@ -1484,7 +1487,9 @@ class TestCase:
body = read_file_as_binary_string(file_path)
clickhouse_execute(
args,
f"INSERT INTO system.coverage_log SELECT now(), '{self.case}', groupArray(data) FROM input('data UInt64') FORMAT RowBinary",
"INSERT INTO system.coverage_log "
"SETTINGS async_insert=1, wait_for_async_insert=0, async_insert_busy_timeout_min_ms=200, async_insert_busy_timeout_max_ms=1000 "
f"SELECT now(), '{self.case}', groupArray(data) FROM input('data UInt64') FORMAT RowBinary",
body=body,
retry_error_codes=True,
)
@ -1493,6 +1498,8 @@ class TestCase:
# Remove the file even in case of exception to avoid accumulation and quadratic complexity.
os.remove(file_path)
_ = clickhouse_execute(args, "SYSTEM FLUSH ASYNC INSERT QUEUE")
coverage = clickhouse_execute(
args,
"SELECT length(coverageCurrent())",

View File

@ -302,7 +302,6 @@ def test_backup_restore_with_named_collection_azure_conf2(cluster):
def test_backup_restore_on_merge_tree(cluster):
node = cluster.instances["node"]
port = cluster.env_variables["AZURITE_PORT"]
azure_query(
node,
f"CREATE TABLE test_simple_merge_tree(key UInt64, data String) Engine = MergeTree() ORDER BY tuple() SETTINGS storage_policy='blob_storage_policy'",
@ -321,3 +320,5 @@ def test_backup_restore_on_merge_tree(cluster):
assert (
azure_query(node, f"SELECT * from test_simple_merge_tree_restored") == "1\ta\n"
)
azure_query(node, f"DROP TABLE test_simple_merge_tree")
azure_query(node, f"DROP TABLE test_simple_merge_tree_restored")

View File

@ -2,6 +2,8 @@
import logging
import pytest
import os
import minio
from helpers.cluster import ClickHouseCluster
from helpers.mock_servers import start_s3_mock
@ -608,3 +610,68 @@ def test_adaptive_timeouts(cluster, broken_s3, node_name):
else:
assert s3_use_adaptive_timeouts == "0"
assert s3_errors == 0
def test_no_key_found_disk(cluster, broken_s3):
node = cluster.instances["node"]
node.query(
"""
CREATE TABLE no_key_found_disk (
id Int64
) ENGINE=MergeTree()
ORDER BY id
SETTINGS
storage_policy='s3'
"""
)
uuid = node.query(
"""
SELECT uuid
FROM system.tables
WHERE name = 'no_key_found_disk'
"""
).strip()
assert uuid
node.query("INSERT INTO no_key_found_disk VALUES (1)")
data = node.query("SELECT * FROM no_key_found_disk").strip()
assert data == "1"
remote_pathes = (
node.query(
f"""
SELECT remote_path
FROM system.remote_data_paths
WHERE
local_path LIKE '%{uuid}%'
AND local_path LIKE '%.bin%'
ORDER BY ALL
"""
)
.strip()
.split()
)
assert len(remote_pathes) > 0
# path_prefix = os.path.join('/', cluster.minio_bucket)
for path in remote_pathes:
# name = os.path.relpath(path, path_prefix)
# assert False, f"deleting full {path} prefix {path_prefix} name {name}"
assert cluster.minio_client.stat_object(cluster.minio_bucket, path).size > 0
cluster.minio_client.remove_object(cluster.minio_bucket, path)
with pytest.raises(Exception) as exc_info:
size = cluster.minio_client.stat_object(cluster.minio_bucket, path).size
assert size == 0
assert "code: NoSuchKey" in str(exc_info.value)
error = node.query_and_get_error("SELECT * FROM no_key_found_disk").strip()
assert (
"DB::Exception: The specified key does not exist. This error happened for S3 disk."
in error
)

View File

@ -0,0 +1,18 @@
<clickhouse>
<include_from>/etc/clickhouse-server/config.d/include_from_source.yml</include_from>
<profiles>
<default>
<max_query_size incl="mqs" />
</default>
</profiles>
<users>
<default>
<password></password>
<profile>default</profile>
<quota>default</quota>
</default>
<include incl="users_1" />
<include incl="users_2" />
</users>
</clickhouse>

View File

@ -0,0 +1,10 @@
---
mqs: 88888
users_1:
user_1:
password: ""
profile: default
users_2:
user_2:
password: ""
profile: default

View File

@ -49,6 +49,11 @@ node7 = cluster.add_instance(
},
instance_env_variables=True,
)
node8 = cluster.add_instance(
"node8",
user_configs=["configs/config_include_from_yml.xml"],
main_configs=["configs/include_from_source.yml"],
)
@pytest.fixture(scope="module")
@ -115,6 +120,10 @@ def test_config(start_cluster):
node7.query("select value from system.settings where name = 'max_threads'")
== "2\n"
)
assert (
node8.query("select value from system.settings where name = 'max_query_size'")
== "88888\n"
)
def test_config_invalid_overrides(start_cluster):
@ -183,6 +192,11 @@ def test_include_config(start_cluster):
assert node3.query("select 1", user="user_1")
assert node3.query("select 1", user="user_2")
# <include incl="source tag" /> from .yml source
assert node8.query("select 1")
assert node8.query("select 1", user="user_1")
assert node8.query("select 1", user="user_2")
def test_allow_databases(start_cluster):
node5.query("CREATE DATABASE db1")

View File

@ -0,0 +1,5 @@
-- UUIDToNum --
1
1
-- UUIDv7toDateTime --
2024-04-22 08:30:29.048

View File

@ -0,0 +1,19 @@
SELECT '-- UUIDToNum --';
SELECT UUIDToNum(toUUID('00112233-4455-6677-8899-aabbccddeeff'), 1) = UUIDStringToNum('00112233-4455-6677-8899-aabbccddeeff', 1);
SELECT UUIDToNum(toUUID('00112233-4455-6677-8899-aabbccddeeff'), 2) = UUIDStringToNum('00112233-4455-6677-8899-aabbccddeeff', 2);
SELECT UUIDToNum(); -- { serverError NUMBER_OF_ARGUMENTS_DOESNT_MATCH }
SELECT UUIDToNum(toUUID('00112233-4455-6677-8899-aabbccddeeff'), 1, 2); -- { serverError NUMBER_OF_ARGUMENTS_DOESNT_MATCH }
SELECT UUIDToNum(toUUID('00112233-4455-6677-8899-aabbccddeeff'), 3); -- { serverError ARGUMENT_OUT_OF_BOUND }
SELECT UUIDToNum('00112233-4455-6677-8899-aabbccddeeff', 1); -- { serverError ILLEGAL_TYPE_OF_ARGUMENT }
SELECT UUIDToNum(toUUID('00112233-4455-6677-8899-aabbccddeeff'), '1'); -- { serverError ILLEGAL_TYPE_OF_ARGUMENT }
SELECT UUIDToNum(toUUID('00112233-4455-6677-8899-aabbccddeeff'), materialize(1)); -- { serverError ILLEGAL_COLUMN }
SELECT '-- UUIDv7toDateTime --';
SELECT UUIDv7ToDateTime(toUUID('018f05c9-4ab8-7b86-b64e-c9f03fbd45d1'), 'America/New_York');
SELECT UUIDv7ToDateTime(); -- { serverError NUMBER_OF_ARGUMENTS_DOESNT_MATCH }
SELECT UUIDv7ToDateTime(toUUID('018f05c9-4ab8-7b86-b64e-c9f03fbd45d1'), 1); -- { serverError ILLEGAL_COLUMN }
SELECT UUIDv7ToDateTime(toUUID('018f05c9-4ab8-7b86-b64e-c9f03fbd45d1'), 'America/New_York', 1); -- { serverError NUMBER_OF_ARGUMENTS_DOESNT_MATCH }
SELECT UUIDv7ToDateTime('018f05c9-4ab8-7b86-b64e-c9f03fbd45d1'); -- { serverError ILLEGAL_TYPE_OF_ARGUMENT }
SELECT UUIDv7ToDateTime(toUUID('018f05c9-4ab8-7b86-b64e-c9f03fbd45d1'), 'America/NewYork'); -- { serverError BAD_ARGUMENTS }
SELECT UUIDv7ToDateTime(toUUID('018f05c9-4ab8-7b86-b64e-c9f03fbd45d1'), materialize('America/New_York')); -- { serverError ILLEGAL_COLUMN }

View File

@ -0,0 +1,21 @@
-- generateUUIDv7 --
UUID
7
2
0
0
1
-- generateUUIDv7ThreadMonotonic --
UUID
7
2
0
0
1
-- generateUUIDv7NonMonotonic --
UUID
7
2
0
0
1

View File

@ -0,0 +1,23 @@
SELECT '-- generateUUIDv7 --';
SELECT toTypeName(generateUUIDv7());
SELECT substring(hex(generateUUIDv7()), 13, 1); -- check version bits
SELECT bitAnd(bitShiftRight(toUInt128(generateUUIDv7()), 62), 3); -- check variant bits
SELECT generateUUIDv7(1) = generateUUIDv7(2);
SELECT generateUUIDv7() = generateUUIDv7(1);
SELECT generateUUIDv7(1) = generateUUIDv7(1);
SELECT '-- generateUUIDv7ThreadMonotonic --';
SELECT toTypeName(generateUUIDv7ThreadMonotonic());
SELECT substring(hex(generateUUIDv7ThreadMonotonic()), 13, 1); -- check version bits
SELECT bitAnd(bitShiftRight(toUInt128(generateUUIDv7ThreadMonotonic()), 62), 3); -- check variant bits
SELECT generateUUIDv7ThreadMonotonic(1) = generateUUIDv7ThreadMonotonic(2);
SELECT generateUUIDv7ThreadMonotonic() = generateUUIDv7ThreadMonotonic(1);
SELECT generateUUIDv7ThreadMonotonic(1) = generateUUIDv7ThreadMonotonic(1);
SELECT '-- generateUUIDv7NonMonotonic --';
SELECT toTypeName(generateUUIDv7NonMonotonic());
SELECT substring(hex(generateUUIDv7NonMonotonic()), 13, 1); -- check version bits
SELECT bitAnd(bitShiftRight(toUInt128(generateUUIDv7NonMonotonic()), 62), 3); -- check variant bits
SELECT generateUUIDv7NonMonotonic(1) = generateUUIDv7NonMonotonic(2);
SELECT generateUUIDv7NonMonotonic() = generateUUIDv7NonMonotonic(1);
SELECT generateUUIDv7NonMonotonic(1) = generateUUIDv7NonMonotonic(1);

View File

@ -5,7 +5,7 @@ CREATE TABLE tab
(
id UInt64,
str String,
INDEX idx str TYPE inverted(3) GRANULARITY 1
INDEX idx str TYPE full_text(3) GRANULARITY 1
)
ENGINE = MergeTree
ORDER BY tuple()

View File

@ -7,7 +7,7 @@ DROP TABLE IF EXISTS tab;
CREATE TABLE tab (
k UInt64,
s Map(String, String),
INDEX idx mapKeys(s) TYPE inverted(2) GRANULARITY 1)
INDEX idx mapKeys(s) TYPE full_text(2) GRANULARITY 1)
ENGINE = MergeTree
ORDER BY k
SETTINGS index_granularity = 2, index_granularity_bytes = '10Mi';
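-- Illustrative note (assumption, not part of the original test): predicates on map keys,
-- e.g. `WHERE s['key1'] = 'value'` or `WHERE mapContains(s, 'key1')`, allow the mapKeys(s)
-- full_text index to skip granules that do not contain the key.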

Some files were not shown because too many files have changed in this diff.