Merge remote-tracking branch 'blessed/master' into perf_experiment

Raúl Marín 2022-11-21 11:02:31 +01:00
commit ed0c174c0c
207 changed files with 4866 additions and 1474 deletions


@ -7,6 +7,8 @@ assignees: ''
---
**I have tried the following solutions**: https://clickhouse.com/docs/en/faq/troubleshooting/#troubleshooting-installation-errors
**Installation type**
Packages, docker, single binary, curl?

File diff suppressed because it is too large


@ -38,7 +38,7 @@ jobs:
with:
ref: master
fetch-depth: 0
- name: Generate versions
- name: Update versions, docker version, changelog, security
env:
GITHUB_TOKEN: ${{ secrets.ROBOT_CLICKHOUSE_COMMIT_TOKEN }}
run: |
@ -51,6 +51,7 @@ jobs:
--gh-user-or-token="$GITHUB_TOKEN" --jobs=5 \
--output="/ClickHouse/docs/changelogs/${GITHUB_TAG}.md" "${GITHUB_TAG}"
git add "./docs/changelogs/${GITHUB_TAG}.md"
python ./utils/security-generator/generate_security.py > SECURITY.md
git diff HEAD
- name: Create Pull Request
uses: peter-evans/create-pull-request@v3


@ -27,7 +27,7 @@
* Added applied row-level policies to `system.query_log`. [#39819](https://github.com/ClickHouse/ClickHouse/pull/39819) ([Vladimir Chebotaryov](https://github.com/quickhouse)).
* Add four-letter command `csnp` for manually creating snapshots in ClickHouse Keeper. Additionally, `lgif` was added to get Raft information for a specific node (e.g. index of last created snapshot, last committed log index). [#41766](https://github.com/ClickHouse/ClickHouse/pull/41766) ([JackyWoo](https://github.com/JackyWoo)).
* Add function `ascii` like in Apache Spark: https://spark.apache.org/docs/latest/api/sql/#ascii. [#42670](https://github.com/ClickHouse/ClickHouse/pull/42670) ([李扬](https://github.com/taiyang-li)).
* Add function `pmod` which returns non-negative result based on modulo. [#42755](https://github.com/ClickHouse/ClickHouse/pull/42755) ([李扬](https://github.com/taiyang-li)).
* Add function `positive_modulo` (`pmod`) which returns non-negative result based on modulo. [#42755](https://github.com/ClickHouse/ClickHouse/pull/42755) ([李扬](https://github.com/taiyang-li)).
* Add function `formatReadableDecimalSize`. [#42774](https://github.com/ClickHouse/ClickHouse/pull/42774) ([Alejandro](https://github.com/alexon1234)).
* Add function `randCanonical`, which is similar to the `rand` function in Apache Spark or Impala. The function generates pseudo random results with independent and identically distributed uniformly distributed values in [0, 1). [#43124](https://github.com/ClickHouse/ClickHouse/pull/43124) ([李扬](https://github.com/taiyang-li)).
* Add function `displayName`, closes [#36770](https://github.com/ClickHouse/ClickHouse/issues/36770). [#37681](https://github.com/ClickHouse/ClickHouse/pull/37681) ([hongbin](https://github.com/xlwh)).
@ -35,6 +35,7 @@
* Add generic implementation for arbitrary structured named collections, access type and `system.named_collections`. [#43147](https://github.com/ClickHouse/ClickHouse/pull/43147) ([Kseniia Sumarokova](https://github.com/kssenii)).
#### Performance Improvement
* Parallelized merging of `uniqExact` states for aggregation without key, i.e. queries like `SELECT uniqExact(number) FROM table`. The improvement becomes noticeable when the number of unique keys approaches 10^6. Also `uniq` performance is slightly optimized. [#43072](https://github.com/ClickHouse/ClickHouse/pull/43072) ([Nikita Taranov](https://github.com/nickitat)).
* `match` function can use the index if it's a condition on string prefix. This closes [#37333](https://github.com/ClickHouse/ClickHouse/issues/37333). [#42458](https://github.com/ClickHouse/ClickHouse/pull/42458) ([clarkcaoliu](https://github.com/Clark0)).
* Speed up AND and OR operators when they are sequenced. [#42214](https://github.com/ClickHouse/ClickHouse/pull/42214) ([Zhiguo Zhou](https://github.com/ZhiguoZh)).
* Support parallel parsing for `LineAsString` input format. This improves performance just slightly. This closes [#42502](https://github.com/ClickHouse/ClickHouse/issues/42502). [#42780](https://github.com/ClickHouse/ClickHouse/pull/42780) ([Kruglov Pavel](https://github.com/Avogar)).


@ -1,3 +1,6 @@
<!--
the file is autogenerated by utils/security-generator/generate_security.py
-->
# Security Policy
@ -10,6 +13,7 @@ The following versions of ClickHouse server are currently being supported with s
| Version | Supported |
|:-|:-|
| 22.11 | ✔️ |
| 22.10 | ✔️ |
| 22.9 | ✔️ |
| 22.8 | ✔️ |
@ -61,5 +65,5 @@ As the security issue moves from triage, to identified fix, to release planning
## Public Disclosure Timing
A public disclosure date is negotiated by the ClickHouse maintainers and the bug submitter. We prefer to fully disclose the bug as soon as possible once a user mitigation is available. It is reasonable to delay disclosure when the bug or the fix is not yet fully understood, the solution is not well-tested, or for vendor coordination. The timeframe for disclosure is from immediate (especially if it's already publicly known) to 90 days. For a vulnerability with a straightforward mitigation, we expect the report date to disclosure date to be on the order of 7 days.
A public disclosure date is negotiated by the ClickHouse maintainers and the bug submitter. We prefer to fully disclose the bug as soon as possible once a user mitigation is available. It is reasonable to delay disclosure when the bug or the fix is not yet fully understood, the solution is not well-tested, or for vendor coordination. The timeframe for disclosure is from immediate (especially if it's already publicly known) to 90 days. For a vulnerability with a straightforward mitigation, we expect the report date to disclosure date to be on the order of 7 days.


@ -2,11 +2,11 @@
# NOTE: has nothing common with DBMS_TCP_PROTOCOL_VERSION,
# only DBMS_TCP_PROTOCOL_VERSION should be incremented on protocol changes.
SET(VERSION_REVISION 54468)
SET(VERSION_REVISION 54469)
SET(VERSION_MAJOR 22)
SET(VERSION_MINOR 11)
SET(VERSION_MINOR 12)
SET(VERSION_PATCH 1)
SET(VERSION_GITHASH 98ab5a3c189232ea2a3dddb9d2be7196ae8b3434)
SET(VERSION_DESCRIBE v22.11.1.1-testing)
SET(VERSION_STRING 22.11.1.1)
SET(VERSION_GITHASH 0d211ed19849fe44b0e43fdebe2c15d76d560a77)
SET(VERSION_DESCRIBE v22.12.1.1-testing)
SET(VERSION_STRING 22.12.1.1)
# end of autochange


@ -33,7 +33,7 @@ RUN arch=${TARGETARCH:-amd64} \
# lts / testing / prestable / etc
ARG REPO_CHANNEL="stable"
ARG REPOSITORY="https://packages.clickhouse.com/tgz/${REPO_CHANNEL}"
ARG VERSION="22.10.2.11"
ARG VERSION="22.11.1.1360"
ARG PACKAGES="clickhouse-client clickhouse-server clickhouse-common-static"
# user/group precreated explicitly with fixed uid/gid on purpose.


@ -21,7 +21,7 @@ RUN sed -i "s|http://archive.ubuntu.com|${apt_archive}|g" /etc/apt/sources.list
ARG REPO_CHANNEL="stable"
ARG REPOSITORY="deb https://packages.clickhouse.com/deb ${REPO_CHANNEL} main"
ARG VERSION="22.10.2.11"
ARG VERSION="22.11.1.1360"
ARG PACKAGES="clickhouse-client clickhouse-server clickhouse-common-static"
# set non-empty deb_location_url url to create a docker image


@ -254,7 +254,7 @@ sudo chgrp clickhouse /etc/clickhouse-server/config.d/s3_storage_policy_by_defau
start
./stress --hung-check --drop-databases --output-folder test_output --skip-func-tests "$SKIP_TESTS_OPTION" \
./stress --hung-check --drop-databases --output-folder test_output --skip-func-tests "$SKIP_TESTS_OPTION" --global-time-limit 1200 \
&& echo -e 'Test script exit code\tOK' >> /test_output/test_results.tsv \
|| echo -e 'Test script failed\tFAIL' >> /test_output/test_results.tsv
@ -388,6 +388,9 @@ else
rm -f /etc/clickhouse-server/config.d/storage_conf.xml ||:
rm -f /etc/clickhouse-server/config.d/azure_storage_conf.xml ||:
# it uses recently introduced settings which previous versions may not have
rm -f /etc/clickhouse-server/users.d/insert_keeper_retries.xml ||:
start
clickhouse-client --query="SELECT 'Server version: ', version()"


@ -0,0 +1,249 @@
---
sidebar_position: 1
sidebar_label: 2022
---
# 2022 Changelog
### ClickHouse release v22.11.1.1360-stable (0d211ed1984) FIXME as compared to v22.10.1.1877-stable (98ab5a3c189)
#### Backward Incompatible Change
* JSONExtract family of functions will now attempt to coerce to the request type. [#41502](https://github.com/ClickHouse/ClickHouse/pull/41502) ([Márcio Martins](https://github.com/marcioapm)).
#### New Feature
* Add function `displayName`, closes [#36770](https://github.com/ClickHouse/ClickHouse/issues/36770). [#37681](https://github.com/ClickHouse/ClickHouse/pull/37681) ([hongbin](https://github.com/xlwh)).
* Added applied row-level policies to `system.query_log`. [#39819](https://github.com/ClickHouse/ClickHouse/pull/39819) ([Vladimir Chebotaryov](https://github.com/quickhouse)).
* Add Hudi and DeltaLake table engines, read-only, only for tables on S3. [#41054](https://github.com/ClickHouse/ClickHouse/pull/41054) ([Daniil Rubin](https://github.com/rubin-do)).
* Add 4LW command `csnp` for manually creating snapshots. Additionally, `lgif` was added to get Raft information for a specific node (e.g. index of last created snapshot, last committed log index). [#41766](https://github.com/ClickHouse/ClickHouse/pull/41766) ([JackyWoo](https://github.com/JackyWoo)).
* Support for Keeper request retries during insert into replicated merge trees. Apart from fault tolerance, it aims to provide a better user experience: avoid returning an error to the user during insert if Keeper is restarted (for example, due to an upgrade). [#42607](https://github.com/ClickHouse/ClickHouse/pull/42607) ([Igor Nikonov](https://github.com/devcrafter)).
* Add function `ascii` like in Apache Spark: https://spark.apache.org/docs/latest/api/sql/#ascii. [#42670](https://github.com/ClickHouse/ClickHouse/pull/42670) ([李扬](https://github.com/taiyang-li)).
* Add function `pmod` which returns non-negative result based on modulo. [#42755](https://github.com/ClickHouse/ClickHouse/pull/42755) ([李扬](https://github.com/taiyang-li)).
* Published function `formatReadableDecimalSize`. [#42774](https://github.com/ClickHouse/ClickHouse/pull/42774) ([Alejandro](https://github.com/alexon1234)).
* Added S3 PUTs and GETs request per second rate throttling. Settings `s3_max_get_rps`, `s3_max_get_burst`, `s3_max_put_rps`, `s3_max_put_burst` are used to configure token bucket throttler. Can be used with both S3 ObjectStorage and S3 table function. Different limits can be configured for different S3 disks or endpoints. [#43014](https://github.com/ClickHouse/ClickHouse/pull/43014) ([Sergei Trifonov](https://github.com/serxa)).
* Add table functions `hudi` and `deltaLake`. [#43080](https://github.com/ClickHouse/ClickHouse/pull/43080) ([flynn](https://github.com/ucasfl)).
* Add function `factorial`, as in Impala or Spark. [#43110](https://github.com/ClickHouse/ClickHouse/pull/43110) ([李扬](https://github.com/taiyang-li)).
* Add function `randCanonical`, which is similar to the `rand` function in Apache Spark or Impala. The function generates pseudo random results with independent and identically distributed uniformly distributed values in [0, 1). [#43124](https://github.com/ClickHouse/ClickHouse/pull/43124) ([李扬](https://github.com/taiyang-li)).
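
The `pmod` entry above describes a modulo that never returns a negative result. A minimal Python sketch of that arithmetic follows. It is an illustration only, not ClickHouse's implementation: the helper names are hypothetical, and the treatment of a negative divisor (using its absolute value) is an assumption.

```python
def truncated_mod(a: int, b: int) -> int:
    """C-style remainder: the result takes the sign of the dividend."""
    if b == 0:
        raise ZeroDivisionError("modulo by zero")
    r = abs(a) % abs(b)
    return -r if a < 0 else r


def positive_modulo(a: int, b: int) -> int:
    """Shift a negative truncated remainder into the range [0, |b|)."""
    r = truncated_mod(a, b)
    # Assumption: a negative divisor is handled through its absolute value.
    return r + abs(b) if r < 0 else r
```

With these definitions, `truncated_mod(-7, 3)` is negative while `positive_modulo(-7, 3)` falls in `[0, 3)`.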
#### Performance Improvement
* Speed up AND and OR operators when they are sequenced. [#42214](https://github.com/ClickHouse/ClickHouse/pull/42214) ([Zhiguo Zhou](https://github.com/ZhiguoZh)).
* `match` function can use the index if it's a condition on string prefix. This closes [#37333](https://github.com/ClickHouse/ClickHouse/issues/37333). [#42458](https://github.com/ClickHouse/ClickHouse/pull/42458) ([clarkcaoliu](https://github.com/Clark0)).
* Fixed slowness in JSONExtract with LowCardinality(String) tuples. [#42761](https://github.com/ClickHouse/ClickHouse/pull/42761) ([AlfVII](https://github.com/AlfVII)).
* Support parallel parsing for LineAsString input format. This improves performance just slightly. This closes [#42502](https://github.com/ClickHouse/ClickHouse/issues/42502). [#42780](https://github.com/ClickHouse/ClickHouse/pull/42780) ([Kruglov Pavel](https://github.com/Avogar)).
* Keeper performance improvement: improve commit performance for cases when many different nodes have uncommitted states. This should help with cases when a follower node can't sync fast enough. [#42926](https://github.com/ClickHouse/ClickHouse/pull/42926) ([Antonio Andelic](https://github.com/antonio2368)).
* Parallelized merging of `uniqExact` states for aggregation without a key, i.e. queries like `SELECT uniqExact(number) FROM table`. The improvement becomes noticeable when the number of unique keys approaches 10^6. Also `uniq` performance is slightly optimized. This closes [#4510](https://github.com/ClickHouse/ClickHouse/issues/4510). [#43072](https://github.com/ClickHouse/ClickHouse/pull/43072) ([Nikita Taranov](https://github.com/nickitat)).
#### Improvement
* Support type `Object` inside other types, e.g. `Array(JSON)`. [#36969](https://github.com/ClickHouse/ClickHouse/pull/36969) ([Anton Popov](https://github.com/CurtizJ)).
* Remove covered parts for fetched part (to avoid possible replication delay grows). [#39737](https://github.com/ClickHouse/ClickHouse/pull/39737) ([Azat Khuzhin](https://github.com/azat)).
* ClickHouse Client and ClickHouse Local will show progress by default even in non-interactive mode. If `/dev/tty` is available, the progress will be rendered directly to the terminal, without writing to stderr. This makes it possible to see progress even if stderr is redirected to a file, and the file will not be polluted by terminal escape sequences. The progress can be disabled by `--progress false`. This closes [#32238](https://github.com/ClickHouse/ClickHouse/issues/32238). [#42003](https://github.com/ClickHouse/ClickHouse/pull/42003) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* 1. Add, subtract and negate operations are now available on Intervals. In case when the types of Intervals are different they will be transformed into the Tuple of those types. 2. A tuple of intervals can be added to or subtracted from a Date/DateTime field. 3. Added parsing of Intervals with different types, for example: `INTERVAL '1 HOUR 1 MINUTE 1 SECOND'`. [#42195](https://github.com/ClickHouse/ClickHouse/pull/42195) ([Nikolay Degterinsky](https://github.com/evillique)).
* Add `notLike` to key condition atom map, so condition like `NOT LIKE 'prefix%'` can use primary index. [#42209](https://github.com/ClickHouse/ClickHouse/pull/42209) ([Duc Canh Le](https://github.com/canhld94)).
* Add support for FixedString input to base64 coding functions. [#42285](https://github.com/ClickHouse/ClickHouse/pull/42285) ([ltrk2](https://github.com/ltrk2)).
* Add columns `bytes_on_disk` and `path` to `system.detached_parts`. Closes [#42264](https://github.com/ClickHouse/ClickHouse/issues/42264). [#42303](https://github.com/ClickHouse/ClickHouse/pull/42303) ([chen](https://github.com/xiedeyantu)).
* Improve the use of structure from the insertion table in table functions. The setting `use_structure_from_insertion_table_in_table_functions` has a new possible value, `2`, which means that ClickHouse will try to determine automatically whether the structure from the insertion table can be used. Closes [#40028](https://github.com/ClickHouse/ClickHouse/issues/40028). [#42320](https://github.com/ClickHouse/ClickHouse/pull/42320) ([Kruglov Pavel](https://github.com/Avogar)).
* Added `**` glob support for recursive directory traversal to filesystem and S3. Resolves [#36316](https://github.com/ClickHouse/ClickHouse/issues/36316). [#42376](https://github.com/ClickHouse/ClickHouse/pull/42376) ([SmitaRKulkarni](https://github.com/SmitaRKulkarni)).
* Mask passwords and secret keys both in `system.query_log` and `/var/log/clickhouse-server/*.log` and also in error messages. [#42484](https://github.com/ClickHouse/ClickHouse/pull/42484) ([Vitaly Baranov](https://github.com/vitlibar)).
* Add a new variable called `limit` in query_info, indicating whether this query is a limit-trivial query. If so, we will adjust the approximate total rows for later estimation. Closes [#7071](https://github.com/ClickHouse/ClickHouse/issues/7071). [#42580](https://github.com/ClickHouse/ClickHouse/pull/42580) ([Han Fei](https://github.com/hanfei1991)).
* Implement `ATTACH` of `MergeTree` table for `s3_plain` disk (plus some fixes for `s3_plain`). [#42628](https://github.com/ClickHouse/ClickHouse/pull/42628) ([Azat Khuzhin](https://github.com/azat)).
* Fix no progress indication on INSERT FROM INFILE. Closes [#42548](https://github.com/ClickHouse/ClickHouse/issues/42548). [#42634](https://github.com/ClickHouse/ClickHouse/pull/42634) ([chen](https://github.com/xiedeyantu)).
* Add `min_age_to_force_merge_on_partition_only` setting to optimize old parts for the entire partition only. [#42659](https://github.com/ClickHouse/ClickHouse/pull/42659) ([Antonio Andelic](https://github.com/antonio2368)).
* Throttling algorithm changed to token bucket. [#42665](https://github.com/ClickHouse/ClickHouse/pull/42665) ([Sergei Trifonov](https://github.com/serxa)).
* Refactor FunctionTokens to enable max tokens returned for related functions (disabled by default). [#42673](https://github.com/ClickHouse/ClickHouse/pull/42673) ([李扬](https://github.com/taiyang-li)).
* Added a new field `allow_readonly` to `system.table_functions` to allow using table functions in readonly mode. Resolves [#42414](https://github.com/ClickHouse/ClickHouse/issues/42414). Added the test `tests/queries/0_stateless/02473_functions_in_readonly_mode.sh` and updated the English documentation for table functions. [#42708](https://github.com/ClickHouse/ClickHouse/pull/42708) ([SmitaRKulkarni](https://github.com/SmitaRKulkarni)).
* Allow to use Date32 arguments for formatDateTime and FROM_UNIXTIME functions. [#42737](https://github.com/ClickHouse/ClickHouse/pull/42737) ([Roman Vasin](https://github.com/rvasin)).
* Update tzdata to 2022f. Mexico will no longer observe DST except near the US border: https://www.timeanddate.com/news/time/mexico-abolishes-dst-2022.html. Chihuahua moves to year-round UTC-6 on 2022-10-30. Fiji no longer observes DST. See https://github.com/google/cctz/pull/235 and https://bugs.launchpad.net/ubuntu/+source/tzdata/+bug/1995209. [#42796](https://github.com/ClickHouse/ClickHouse/pull/42796) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Add `FailedAsyncInsertQuery` event metric for async inserts. [#42814](https://github.com/ClickHouse/ClickHouse/pull/42814) ([Krzysztof Góralski](https://github.com/kgoralski)).
* Implement `read-in-order` optimization on top of query plan. It is enabled by default. Set `query_plan_read_in_order = 0` to use previous AST-based version. [#42829](https://github.com/ClickHouse/ClickHouse/pull/42829) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Increase the size of upload part exponentially for backup to S3. [#42833](https://github.com/ClickHouse/ClickHouse/pull/42833) ([Vitaly Baranov](https://github.com/vitlibar)).
* When the merge task is continuously busy and disk space is insufficient, completely expired parts cannot be selected and dropped, so disk space remains insufficient. When an entire part has expired, no additional disk space needs to be reserved for it, which ensures that TTL is executed normally. [#42869](https://github.com/ClickHouse/ClickHouse/pull/42869) ([zhongyuankai](https://github.com/zhongyuankai)).
* Bug fix for [#42856](https://github.com/ClickHouse/ClickHouse/issues/42856): ignore the MySQL binlog SAVEPOINT event. [#42931](https://github.com/ClickHouse/ClickHouse/pull/42931) ([zzsmdfj](https://github.com/zzsmdfj)).
* Add support for interactive parameters in INSERT VALUES queries. [#43077](https://github.com/ClickHouse/ClickHouse/pull/43077) ([Nikolay Degterinsky](https://github.com/evillique)).
* Add generic implementation for arbitrary structured named collections, access type and `system.named_collections`. [#43147](https://github.com/ClickHouse/ClickHouse/pull/43147) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Add the `oss` function and `StorageOSS` for user convenience. `oss` is fully compatible with `s3`. [#43155](https://github.com/ClickHouse/ClickHouse/pull/43155) ([zzsmdfj](https://github.com/zzsmdfj)).
* Improve error reporting in the collection of OS-related info for the `system.asynchronous_metrics` table. [#43192](https://github.com/ClickHouse/ClickHouse/pull/43192) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* The `system.asynchronous_metrics` gets embedded documentation. This documentation is also exported to Prometheus. Fixed an error with the metrics about `cache` disks - they were calculated only for one arbitrary cache disk instead of all of them. This closes [#7644](https://github.com/ClickHouse/ClickHouse/issues/7644). [#43194](https://github.com/ClickHouse/ClickHouse/pull/43194) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Modify the `INFORMATION_SCHEMA` tables in a way so that now ClickHouse can connect to itself using the MySQL compatibility protocol. Add columns instead of aliases (related to [#9769](https://github.com/ClickHouse/ClickHouse/issues/9769)). It will improve the compatibility with various MySQL clients. [#43198](https://github.com/ClickHouse/ClickHouse/pull/43198) ([Filatenkov Artur](https://github.com/FArthur-cmd)).
* Disable `deltaLake` and `hudi` table functions in readonly mode. [#43316](https://github.com/ClickHouse/ClickHouse/pull/43316) ([Antonio Andelic](https://github.com/antonio2368)).
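
The throttling entries above refer to a token bucket configured with a rate (for example `s3_max_get_rps`) and a burst (for example `s3_max_get_burst`). The Python sketch below shows the general token bucket technique under those two parameters. It is not ClickHouse's implementation; the class, its methods and the injectable `clock` argument are hypothetical names chosen for illustration.

```python
import time


class TokenBucket:
    """Illustrative token bucket: `rate` tokens per second, at most `burst` stored."""

    def __init__(self, rate, burst, clock=time.monotonic):
        self.rate = float(rate)      # refill speed, tokens per second
        self.burst = float(burst)    # bucket capacity
        self.clock = clock           # injectable time source (assumption, eases testing)
        self.tokens = float(burst)   # start with a full bucket
        self.last = clock()

    def _refill(self):
        # Add tokens for the elapsed time, capped at the burst size.
        now = self.clock()
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now

    def try_acquire(self, n=1.0):
        """Take `n` tokens if available; return False when the request must wait."""
        self._refill()
        if self.tokens >= n:
            self.tokens -= n
            return True
        return False
```

A caller would create one bucket per limited resource (for instance one for GET requests and one for PUT requests) and call `try_acquire()` before each request; a burst larger than the rate lets short spikes through while the long-run average stays at the configured rate.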
#### Bug Fix
* Updated `QueryNormalizer` to clone the alias AST when it is replaced. Previously, assigning the same node led to an exception in `LogicalExpressionsOptimizer`, because the same parent was inserted again. The bug does not occur with the new analyzer (`allow_experimental_analyzer`), so no changes were made for it. A test was added. Resolves [#42452](https://github.com/ClickHouse/ClickHouse/issues/42452). [#42827](https://github.com/ClickHouse/ClickHouse/pull/42827) ([SmitaRKulkarni](https://github.com/SmitaRKulkarni)).
* Fix race for backup of tables in Lazy databases. [#43104](https://github.com/ClickHouse/ClickHouse/pull/43104) ([Vitaly Baranov](https://github.com/vitlibar)).
* Fix `skip_unavailable_shards` not working with the `s3Cluster` table function. [#43131](https://github.com/ClickHouse/ClickHouse/pull/43131) ([chen](https://github.com/xiedeyantu)).
#### Build/Testing/Packaging Improvement
* Run SQLancer for each pull request and commit to master. [SQLancer](https://github.com/sqlancer/sqlancer) is an OpenSource fuzzer that focuses on automatic detection of logical bugs. [#42397](https://github.com/ClickHouse/ClickHouse/pull/42397) ([Ilya Yatsishin](https://github.com/qoega)).
* Update to latest zlib-ng. [#42463](https://github.com/ClickHouse/ClickHouse/pull/42463) ([Boris Kuschel](https://github.com/bkuschel)).
* Use LLVM `ld64.lld` on macOS to suppress ld warnings. Closes [#42282](https://github.com/ClickHouse/ClickHouse/issues/42282). [#42470](https://github.com/ClickHouse/ClickHouse/pull/42470) ([Lloyd-Pottiger](https://github.com/Lloyd-Pottiger)).
* Add support for testing ClickHouse server with Jepsen. By the way, we already have support for testing ClickHouse Keeper with Jepsen. This pull request extends it to Replicated tables. [#42619](https://github.com/ClickHouse/ClickHouse/pull/42619) ([Antonio Andelic](https://github.com/antonio2368)).
* Improve bugfix validation check: fix bug with skipping the check, port separate status in CI, run after check labels and style check. Close [#40349](https://github.com/ClickHouse/ClickHouse/issues/40349). [#42702](https://github.com/ClickHouse/ClickHouse/pull/42702) ([Vladimir C](https://github.com/vdimir)).
* Wait until all files are in sync before archiving them in integration tests. [#42891](https://github.com/ClickHouse/ClickHouse/pull/42891) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Use https://github.com/matus-chochlik/ctcache for clang-tidy results caching. [#42913](https://github.com/ClickHouse/ClickHouse/pull/42913) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Before the fix, the user-defined config was preserved by RPM in `$file.rpmsave`. The PR fixes it and won't replace the user's files from packages. [#42936](https://github.com/ClickHouse/ClickHouse/pull/42936) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Add a CI step to mark commits as ready for release; soft-forbid launching a release script from branches other than master. [#43017](https://github.com/ClickHouse/ClickHouse/pull/43017) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
#### Bug Fix (user-visible misbehavior in official stable or prestable release)
* Fix schema inference in s3Cluster and improve in hdfsCluster. [#41979](https://github.com/ClickHouse/ClickHouse/pull/41979) ([Kruglov Pavel](https://github.com/Avogar)).
* Fix retries while reading from HTTP table engines / table function. (Retriable errors could be retried more times than needed; non-retriable errors resulted in a failed assertion in code.) [#42224](https://github.com/ClickHouse/ClickHouse/pull/42224) ([Kseniia Sumarokova](https://github.com/kssenii)).
* A segmentation fault related to DNS & c-ares has been reported. The below error occurred in multiple threads: ``` 2022-09-28 15:41:19.008,2022.09.28 15:41:19.008088 [ 356 ] {} <Fatal> BaseDaemon: ######################################## 2022-09-28 15:41:19.008,"2022.09.28 15:41:19.008147 [ 356 ] {} <Fatal> BaseDaemon: (version 22.8.5.29 (official build), build id: 92504ACA0B8E2267) (from thread 353) (no query) Received signal Segmentation fault (11)" 2022-09-28 15:41:19.008,2022.09.28 15:41:19.008196 [ 356 ] {} <Fatal> BaseDaemon: Address: 0xf Access: write. Address not mapped to object. 2022-09-28 15:41:19.008,2022.09.28 15:41:19.008216 [ 356 ] {} <Fatal> BaseDaemon: Stack trace: 0x188f8212 0x1626851b 0x1626a69e 0x16269b3f 0x16267eab 0x13cf8284 0x13d24afc 0x13c5217e 0x14ec2495 0x15ba440f 0x15b9d13b 0x15bb2699 0x1891ccb3 0x1891e00d 0x18ae0769 0x18ade022 0x7f76aa985609 0x7f76aa8aa133 2022-09-28 15:41:19.008,2022.09.28 15:41:19.008274 [ 356 ] {} <Fatal> BaseDaemon: 2. Poco::Net::IPAddress::family() const @ 0x188f8212 in /usr/bin/clickhouse 2022-09-28 15:41:19.008,2022.09.28 15:41:19.008297 [ 356 ] {} <Fatal> BaseDaemon: 3. ? @ 0x1626851b in /usr/bin/clickhouse 2022-09-28 15:41:19.008,2022.09.28 15:41:19.008309 [ 356 ] {} <Fatal> BaseDaemon: 4. ? @ 0x1626a69e in /usr/bin/clickhouse ```. [#42234](https://github.com/ClickHouse/ClickHouse/pull/42234) ([Arthur Passos](https://github.com/arthurpassos)).
* Fix `LOGICAL_ERROR` `Arguments of 'plus' have incorrect data types` which may happen in PK analysis (monotonicity check). Fix invalid PK analysis for monotonic binary functions with first constant argument. [#42410](https://github.com/ClickHouse/ClickHouse/pull/42410) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix incorrect key analysis when key types cannot be inside Nullable. This fixes [#42456](https://github.com/ClickHouse/ClickHouse/issues/42456). [#42469](https://github.com/ClickHouse/ClickHouse/pull/42469) ([Amos Bird](https://github.com/amosbird)).
* Fix typo in setting name that led to bad usage of schema inference cache while using setting `input_format_csv_use_best_effort_in_schema_inference`. Closes [#41735](https://github.com/ClickHouse/ClickHouse/issues/41735). [#42536](https://github.com/ClickHouse/ClickHouse/pull/42536) ([Kruglov Pavel](https://github.com/Avogar)).
* Fix creating a Set with the wrong header when the data type is LowCardinality. Closes [#42460](https://github.com/ClickHouse/ClickHouse/issues/42460). [#42579](https://github.com/ClickHouse/ClickHouse/pull/42579) ([flynn](https://github.com/ucasfl)).
* `(U)Int128` and `(U)Int256` values are correctly checked in `PREWHERE`. [#42605](https://github.com/ClickHouse/ClickHouse/pull/42605) ([Antonio Andelic](https://github.com/antonio2368)).
* Fix a bug in ParserFunction that could have led to a segmentation fault. [#42724](https://github.com/ClickHouse/ClickHouse/pull/42724) ([Nikolay Degterinsky](https://github.com/evillique)).
* Fix `TRUNCATE TABLE` not holding the lock correctly. [#42728](https://github.com/ClickHouse/ClickHouse/pull/42728) ([flynn](https://github.com/ucasfl)).
* Fix possible SIGSEGV for web disks when a file does not exist (`OPTIMIZE TABLE FINAL` could also eventually hit the same error). [#42767](https://github.com/ClickHouse/ClickHouse/pull/42767) ([Azat Khuzhin](https://github.com/azat)).
* Fix `auth_type` mapping in `system.session_log`, by including `SSL_CERTIFICATE` for the enum values. [#42782](https://github.com/ClickHouse/ClickHouse/pull/42782) ([Miel Donkers](https://github.com/mdonkers)).
* Fix stack-use-after-return under ASAN build in ParserCreateUserQuery. [#42804](https://github.com/ClickHouse/ClickHouse/pull/42804) ([Nikolay Degterinsky](https://github.com/evillique)).
* Fix `lowerUTF8()`/`upperUTF8()` when a symbol straddles a 16-byte boundary (a very frequent case if you have strings longer than 16 bytes). [#42812](https://github.com/ClickHouse/ClickHouse/pull/42812) ([Azat Khuzhin](https://github.com/azat)).
* An additional bounds check was added to the LZ4 decompression routine to fix misbehaviour in case of malformed input. [#42868](https://github.com/ClickHouse/ClickHouse/pull/42868) ([Nikita Taranov](https://github.com/nickitat)).
* Fix a rare possible hang on query cancellation. [#42874](https://github.com/ClickHouse/ClickHouse/pull/42874) ([Azat Khuzhin](https://github.com/azat)).
* Fix incorrect saved_block_sample with multiple disjuncts in hash join, close [#42832](https://github.com/ClickHouse/ClickHouse/issues/42832). [#42876](https://github.com/ClickHouse/ClickHouse/pull/42876) ([Vladimir C](https://github.com/vdimir)).
* A null pointer will be generated when select if as from three table join , For example, the SQL:. [#42883](https://github.com/ClickHouse/ClickHouse/pull/42883) ([zzsmdfj](https://github.com/zzsmdfj)).
* Fix memory sanitizer report in ClusterDiscovery, close [#42763](https://github.com/ClickHouse/ClickHouse/issues/42763). [#42905](https://github.com/ClickHouse/ClickHouse/pull/42905) ([Vladimir C](https://github.com/vdimir)).
* Fix datetime schema inference in case of empty string. [#42911](https://github.com/ClickHouse/ClickHouse/pull/42911) ([Kruglov Pavel](https://github.com/Avogar)).
* Fix rare NOT_FOUND_COLUMN_IN_BLOCK error when projection is possible to use but there is no projection available. This fixes [#42771](https://github.com/ClickHouse/ClickHouse/issues/42771) . The bug was introduced in https://github.com/ClickHouse/ClickHouse/pull/25563. [#42938](https://github.com/ClickHouse/ClickHouse/pull/42938) ([Amos Bird](https://github.com/amosbird)).
* Fixes for the `s3_plain` disk that allow attaching Wide parts. [#42950](https://github.com/ClickHouse/ClickHouse/pull/42950) ([Azat Khuzhin](https://github.com/azat)).
* Fix ATTACH TABLE in PostgreSQL database engine if the table contains DATETIME data type. Closes [#42817](https://github.com/ClickHouse/ClickHouse/issues/42817). [#42960](https://github.com/ClickHouse/ClickHouse/pull/42960) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Fix lambda parsing. Closes [#41848](https://github.com/ClickHouse/ClickHouse/issues/41848). [#42979](https://github.com/ClickHouse/ClickHouse/pull/42979) ([Nikolay Degterinsky](https://github.com/evillique)).
* Handle (ignore) SAVEPOINT queries in MaterializedMySQL. [#43086](https://github.com/ClickHouse/ClickHouse/pull/43086) ([Stig Bakken](https://github.com/stigsb)).
* Fix incorrect key analysis when nullable keys appear in the middle of a hyperrectangle. This fixes [#43111](https://github.com/ClickHouse/ClickHouse/issues/43111). [#43133](https://github.com/ClickHouse/ClickHouse/pull/43133) ([Amos Bird](https://github.com/amosbird)).
* Fix several buffer over-reads. [#43159](https://github.com/ClickHouse/ClickHouse/pull/43159) ([Raúl Marín](https://github.com/Algunenano)).
* Fix function if in case of NULL and const Nullable arguments. Closes [#43069](https://github.com/ClickHouse/ClickHouse/issues/43069). [#43178](https://github.com/ClickHouse/ClickHouse/pull/43178) ([Kruglov Pavel](https://github.com/Avogar)).
* Fix decimal math overflow in parsing datetime with 'best effort' algorithm. Closes [#43061](https://github.com/ClickHouse/ClickHouse/issues/43061). [#43180](https://github.com/ClickHouse/ClickHouse/pull/43180) ([Kruglov Pavel](https://github.com/Avogar)).
* The `indent` field produced by the `git-import` tool was miscalculated. See https://clickhouse.com/docs/en/getting-started/example-datasets/github/. [#43191](https://github.com/ClickHouse/ClickHouse/pull/43191) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Fixed unexpected behaviour of Interval types with subquery and casting. [#43193](https://github.com/ClickHouse/ClickHouse/pull/43193) ([jh0x](https://github.com/jh0x)).
* Fix logical error in `sumMap/minMap/maxMap` functions executing `TOTALS/ROLLUP/CUBE` on `NULL` values. Close [#43022](https://github.com/ClickHouse/ClickHouse/issues/43022). [#43232](https://github.com/ClickHouse/ClickHouse/pull/43232) ([Vladimir C](https://github.com/vdimir)).
* Fix ubsan in AggregateFunctionMinMaxAny::read with high sizes. [#43249](https://github.com/ClickHouse/ClickHouse/pull/43249) ([Raúl Marín](https://github.com/Algunenano)).
* Fix IS (NOT) NULL operator priority in regard to other operators. [#43265](https://github.com/ClickHouse/ClickHouse/pull/43265) ([Nikolay Degterinsky](https://github.com/evillique)).
#### Build Improvement
* Add support for format ipv6 on s390x. [#42412](https://github.com/ClickHouse/ClickHouse/pull/42412) ([Suzy Wang](https://github.com/SuzyWangIBMer)).
#### NO CL ENTRY
* NO CL ENTRY: 'Revert "Sonar Cloud Workflow"'. [#42725](https://github.com/ClickHouse/ClickHouse/pull/42725) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* NO CL ENTRY: 'Revert " Keeper retries during insert (clean)"'. [#43116](https://github.com/ClickHouse/ClickHouse/pull/43116) ([Alexander Tokmakov](https://github.com/tavplubix)).
* NO CL ENTRY: 'Revert "Revert " Keeper retries during insert (clean)""'. [#43122](https://github.com/ClickHouse/ClickHouse/pull/43122) ([Igor Nikonov](https://github.com/devcrafter)).
* NO CL ENTRY: 'Revert "Optimize TTL merge, completely expired parts can be removed in time"'. [#43134](https://github.com/ClickHouse/ClickHouse/pull/43134) ([Alexander Tokmakov](https://github.com/tavplubix)).
* NO CL ENTRY: 'Revert "Randomize keeper fault injection settings in stress tests"'. [#43218](https://github.com/ClickHouse/ClickHouse/pull/43218) ([Alexander Gololobov](https://github.com/davenger)).
* NO CL ENTRY: 'Revert "S3 request per second rate throttling"'. [#43306](https://github.com/ClickHouse/ClickHouse/pull/43306) ([Alexander Tokmakov](https://github.com/tavplubix)).
#### NOT FOR CHANGELOG / INSIGNIFICANT
* Better logging for docs builder [#41903](https://github.com/ClickHouse/ClickHouse/pull/41903) ([filimonov](https://github.com/filimonov)).
* Save full server log in AST Fuzzer checks [#42316](https://github.com/ClickHouse/ClickHouse/pull/42316) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Build with libcxx(abi) 15 [#42513](https://github.com/ClickHouse/ClickHouse/pull/42513) ([Robert Schulze](https://github.com/rschu1ze)).
* Sonar Cloud Workflow [#42534](https://github.com/ClickHouse/ClickHouse/pull/42534) ([Julio Jimenez](https://github.com/juliojimenez)).
* Invalid type in where for Merge table (logical error) [#42576](https://github.com/ClickHouse/ClickHouse/pull/42576) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix frequent memory drift message and clarify things in comments [#42582](https://github.com/ClickHouse/ClickHouse/pull/42582) ([Azat Khuzhin](https://github.com/azat)).
* Add functions for PowerBI connect [#42612](https://github.com/ClickHouse/ClickHouse/pull/42612) ([Filatenkov Artur](https://github.com/FArthur-cmd)).
* Try to save `IDataPartStorage` interface [#42618](https://github.com/ClickHouse/ClickHouse/pull/42618) ([Anton Popov](https://github.com/CurtizJ)).
* Remove Ubuntu cruft [#42622](https://github.com/ClickHouse/ClickHouse/pull/42622) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Analyzer change setting into allow_experimental_analyzer [#42649](https://github.com/ClickHouse/ClickHouse/pull/42649) ([Maksim Kita](https://github.com/kitaisreal)).
* Analyzer IQueryTreeNode remove getName method [#42651](https://github.com/ClickHouse/ClickHouse/pull/42651) ([Maksim Kita](https://github.com/kitaisreal)).
* Minor fix iotest_nonblock build [#42658](https://github.com/ClickHouse/ClickHouse/pull/42658) ([Jordi Villar](https://github.com/jrdi)).
* Add tests and doc for some url-related functions [#42664](https://github.com/ClickHouse/ClickHouse/pull/42664) ([Vladimir C](https://github.com/vdimir)).
* Update version_date.tsv and changelogs after v22.10.1.1875-stable [#42676](https://github.com/ClickHouse/ClickHouse/pull/42676) ([robot-clickhouse](https://github.com/robot-clickhouse)).
* Fix error handling in clickhouse_helper.py [#42678](https://github.com/ClickHouse/ClickHouse/pull/42678) ([Ilya Yatsishin](https://github.com/qoega)).
* Fix execution of version_helper.py to use git tweaks [#42679](https://github.com/ClickHouse/ClickHouse/pull/42679) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* MergeTree indexes use RPNBuilderTree [#42681](https://github.com/ClickHouse/ClickHouse/pull/42681) ([Maksim Kita](https://github.com/kitaisreal)).
* Always run `BuilderReport` and `BuilderSpecialReport` in all CI types [#42684](https://github.com/ClickHouse/ClickHouse/pull/42684) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Support optimize_syntax_fuse_functions for sum/count/avg via analyzer [#42689](https://github.com/ClickHouse/ClickHouse/pull/42689) ([Vladimir C](https://github.com/vdimir)).
* Update version after release [#42699](https://github.com/ClickHouse/ClickHouse/pull/42699) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Update version_date.tsv and changelogs after v22.10.1.1877-stable [#42700](https://github.com/ClickHouse/ClickHouse/pull/42700) ([robot-clickhouse](https://github.com/robot-clickhouse)).
* OrderByLimitByDuplicateEliminationPass improve performance [#42704](https://github.com/ClickHouse/ClickHouse/pull/42704) ([Maksim Kita](https://github.com/kitaisreal)).
* Analyzer improve subqueries representation [#42705](https://github.com/ClickHouse/ClickHouse/pull/42705) ([Maksim Kita](https://github.com/kitaisreal)).
* Update version_date.tsv and changelogs after v22.9.4.32-stable [#42712](https://github.com/ClickHouse/ClickHouse/pull/42712) ([robot-clickhouse](https://github.com/robot-clickhouse)).
* Update version_date.tsv and changelogs after v22.8.7.34-lts [#42713](https://github.com/ClickHouse/ClickHouse/pull/42713) ([robot-clickhouse](https://github.com/robot-clickhouse)).
* Update version_date.tsv and changelogs after v22.7.7.24-stable [#42714](https://github.com/ClickHouse/ClickHouse/pull/42714) ([robot-clickhouse](https://github.com/robot-clickhouse)).
* Move SonarCloud Job to nightly [#42718](https://github.com/ClickHouse/ClickHouse/pull/42718) ([Julio Jimenez](https://github.com/juliojimenez)).
* Update version_date.tsv and changelogs after v22.8.8.3-lts [#42738](https://github.com/ClickHouse/ClickHouse/pull/42738) ([robot-clickhouse](https://github.com/robot-clickhouse)).
* Minor fix implicit cast CaresPTRResolver [#42747](https://github.com/ClickHouse/ClickHouse/pull/42747) ([Jordi Villar](https://github.com/jrdi)).
* Fix build on master [#42752](https://github.com/ClickHouse/ClickHouse/pull/42752) ([Igor Nikonov](https://github.com/devcrafter)).
* Update version_date.tsv and changelogs after v22.3.14.18-lts [#42759](https://github.com/ClickHouse/ClickHouse/pull/42759) ([robot-clickhouse](https://github.com/robot-clickhouse)).
* Fix anchor links [#42760](https://github.com/ClickHouse/ClickHouse/pull/42760) ([Sergei Trifonov](https://github.com/serxa)).
* Update version_date.tsv and changelogs after v22.3.14.23-lts [#42764](https://github.com/ClickHouse/ClickHouse/pull/42764) ([robot-clickhouse](https://github.com/robot-clickhouse)).
* Update README.md [#42783](https://github.com/ClickHouse/ClickHouse/pull/42783) ([Yuko Takagi](https://github.com/yukotakagi)).
* Slightly better code with projections [#42794](https://github.com/ClickHouse/ClickHouse/pull/42794) ([Anton Popov](https://github.com/CurtizJ)).
* Fix some races in MergeTree [#42805](https://github.com/ClickHouse/ClickHouse/pull/42805) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Fix typo in comments [#42809](https://github.com/ClickHouse/ClickHouse/pull/42809) ([Gabriel](https://github.com/Gabriel39)).
* Fix compilation of LLVM with cmake cache [#42816](https://github.com/ClickHouse/ClickHouse/pull/42816) ([Azat Khuzhin](https://github.com/azat)).
* Fix link in docs [#42821](https://github.com/ClickHouse/ClickHouse/pull/42821) ([Sergei Trifonov](https://github.com/serxa)).
* Link to proper place in docs [#42822](https://github.com/ClickHouse/ClickHouse/pull/42822) ([Sergei Trifonov](https://github.com/serxa)).
* Fix argument type check in AggregateFunctionAnalysisOfVariance [#42823](https://github.com/ClickHouse/ClickHouse/pull/42823) ([Vladimir C](https://github.com/vdimir)).
* Tests/lambda analyzer [#42824](https://github.com/ClickHouse/ClickHouse/pull/42824) ([Denny Crane](https://github.com/den-crane)).
* Fix Missing Quotes - Sonar Nightly [#42831](https://github.com/ClickHouse/ClickHouse/pull/42831) ([Julio Jimenez](https://github.com/juliojimenez)).
* Add exclusions from the Snyk scan [#42834](https://github.com/ClickHouse/ClickHouse/pull/42834) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Fix Missing Env Vars - Sonar Nightly [#42843](https://github.com/ClickHouse/ClickHouse/pull/42843) ([Julio Jimenez](https://github.com/juliojimenez)).
* Fix typo [#42855](https://github.com/ClickHouse/ClickHouse/pull/42855) ([GoGoWen](https://github.com/GoGoWen)).
* Add timezone to 02458_datediff_date32 [#42857](https://github.com/ClickHouse/ClickHouse/pull/42857) ([Vladimir C](https://github.com/vdimir)).
* Adjust cancel and rerun workflow names to the actual [#42862](https://github.com/ClickHouse/ClickHouse/pull/42862) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Analyzer subquery in JOIN TREE with aggregation [#42865](https://github.com/ClickHouse/ClickHouse/pull/42865) ([Maksim Kita](https://github.com/kitaisreal)).
* Fix getauxval for sanitizer builds [#42866](https://github.com/ClickHouse/ClickHouse/pull/42866) ([Amos Bird](https://github.com/amosbird)).
* Update version_date.tsv and changelogs after v22.10.2.11-stable [#42871](https://github.com/ClickHouse/ClickHouse/pull/42871) ([robot-clickhouse](https://github.com/robot-clickhouse)).
* Better usability for dashboard.html on changes [#42872](https://github.com/ClickHouse/ClickHouse/pull/42872) ([Vladimir C](https://github.com/vdimir)).
* Some fixes for ReplicatedMergeTree [#42878](https://github.com/ClickHouse/ClickHouse/pull/42878) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Validate Query Tree in debug [#42879](https://github.com/ClickHouse/ClickHouse/pull/42879) ([Dmitry Novik](https://github.com/novikd)).
* changed type name for s3 plain storage [#42890](https://github.com/ClickHouse/ClickHouse/pull/42890) ([Aleksandr](https://github.com/AVMusorin)).
* Cleanup implementation of regexpReplace(All|One) [#42907](https://github.com/ClickHouse/ClickHouse/pull/42907) ([Robert Schulze](https://github.com/rschu1ze)).
* Do not show status for Bugfix validate check in non bugfix PRs [#42932](https://github.com/ClickHouse/ClickHouse/pull/42932) ([Vladimir C](https://github.com/vdimir)).
* fix(typo): Passible -> Possible [#42933](https://github.com/ClickHouse/ClickHouse/pull/42933) ([Yakko Majuri](https://github.com/yakkomajuri)).
* Pin the cryptography version to not break lambdas [#42934](https://github.com/ClickHouse/ClickHouse/pull/42934) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Fix: bad cast from type DB::ColumnLowCardinality to DB::ColumnString [#42937](https://github.com/ClickHouse/ClickHouse/pull/42937) ([Igor Nikonov](https://github.com/devcrafter)).
* Attach thread pool for loading parts to the query [#42947](https://github.com/ClickHouse/ClickHouse/pull/42947) ([Azat Khuzhin](https://github.com/azat)).
* Fix macOS M1 builds due to sprintf deprecation [#42962](https://github.com/ClickHouse/ClickHouse/pull/42962) ([Jordi Villar](https://github.com/jrdi)).
* Less use of CH-specific bit_cast() [#42968](https://github.com/ClickHouse/ClickHouse/pull/42968) ([Robert Schulze](https://github.com/rschu1ze)).
* Remove some utils [#42972](https://github.com/ClickHouse/ClickHouse/pull/42972) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Fix a bug in CAST function parser [#42980](https://github.com/ClickHouse/ClickHouse/pull/42980) ([Nikolay Degterinsky](https://github.com/evillique)).
* Fix old bug to remove `refs/head` from ref name [#42981](https://github.com/ClickHouse/ClickHouse/pull/42981) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Add debug information to nightly builds [#42997](https://github.com/ClickHouse/ClickHouse/pull/42997) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Add some guard rails around aggregation memory management [#42999](https://github.com/ClickHouse/ClickHouse/pull/42999) ([Raúl Marín](https://github.com/Algunenano)).
* Add `on: workflow_call` to debug CI [#43000](https://github.com/ClickHouse/ClickHouse/pull/43000) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Analyzer added identifier typo corrections [#43002](https://github.com/ClickHouse/ClickHouse/pull/43002) ([Maksim Kita](https://github.com/kitaisreal)).
* Simple fixes for restart replica description [#43004](https://github.com/ClickHouse/ClickHouse/pull/43004) ([Igor Nikonov](https://github.com/devcrafter)).
* Cleanup match code [#43006](https://github.com/ClickHouse/ClickHouse/pull/43006) ([Robert Schulze](https://github.com/rschu1ze)).
* Fix TSan errors (correctly ignore _exit interception) [#43009](https://github.com/ClickHouse/ClickHouse/pull/43009) ([Azat Khuzhin](https://github.com/azat)).
* fix bandwidth throttlers initialization order [#43015](https://github.com/ClickHouse/ClickHouse/pull/43015) ([Sergei Trifonov](https://github.com/serxa)).
* Add test for issue [#42520](https://github.com/ClickHouse/ClickHouse/issues/42520) [#43027](https://github.com/ClickHouse/ClickHouse/pull/43027) ([Robert Schulze](https://github.com/rschu1ze)).
* Analyzer improve ARRAY JOIN with JOIN [#43048](https://github.com/ClickHouse/ClickHouse/pull/43048) ([Maksim Kita](https://github.com/kitaisreal)).
* Fix projection part removal with zero-copy replication [#43060](https://github.com/ClickHouse/ClickHouse/pull/43060) ([alesapin](https://github.com/alesapin)).
* Fix msan warning [#43065](https://github.com/ClickHouse/ClickHouse/pull/43065) ([Raúl Marín](https://github.com/Algunenano)).
* Analyzer AST key condition crash fix [#43070](https://github.com/ClickHouse/ClickHouse/pull/43070) ([Maksim Kita](https://github.com/kitaisreal)).
* Better logging for mark range filtering on projection parts [#43076](https://github.com/ClickHouse/ClickHouse/pull/43076) ([Duc Canh Le](https://github.com/canhld94)).
* Fix ub type punning [#43088](https://github.com/ClickHouse/ClickHouse/pull/43088) ([Yakov Olkhovskiy](https://github.com/yakov-olkhovskiy)).
* Analyzer improve aliases support for table expressions [#43089](https://github.com/ClickHouse/ClickHouse/pull/43089) ([Maksim Kita](https://github.com/kitaisreal)).
* Throw not implemented for window frame type 'groups' in analyzer [#43090](https://github.com/ClickHouse/ClickHouse/pull/43090) ([Vladimir C](https://github.com/vdimir)).
* Disable clickhouse local and client non-interactive progress by default. [#43092](https://github.com/ClickHouse/ClickHouse/pull/43092) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Make error message after dropping current user more correct. [#43097](https://github.com/ClickHouse/ClickHouse/pull/43097) ([Vitaly Baranov](https://github.com/vitlibar)).
* More stable test [#43102](https://github.com/ClickHouse/ClickHouse/pull/43102) ([alesapin](https://github.com/alesapin)).
* Rewrite tests for memory overcommit [#43105](https://github.com/ClickHouse/ClickHouse/pull/43105) ([Dmitry Novik](https://github.com/novikd)).
* Fix trailing \n from SQLancer status [#43114](https://github.com/ClickHouse/ClickHouse/pull/43114) ([Ilya Yatsishin](https://github.com/qoega)).
* Fix `test_keeper_four_word_command::test_cmd_stat` [#43115](https://github.com/ClickHouse/ClickHouse/pull/43115) ([Antonio Andelic](https://github.com/antonio2368)).
* Enable keeper fault injection for inserts in functional tests [#43117](https://github.com/ClickHouse/ClickHouse/pull/43117) ([Igor Nikonov](https://github.com/devcrafter)).
* Analyzer aggregation crash fix [#43118](https://github.com/ClickHouse/ClickHouse/pull/43118) ([Maksim Kita](https://github.com/kitaisreal)).
* Analyzer aggregation totals crash fix [#43119](https://github.com/ClickHouse/ClickHouse/pull/43119) ([Maksim Kita](https://github.com/kitaisreal)).
* Improve commit_status_helper.py [#43121](https://github.com/ClickHouse/ClickHouse/pull/43121) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Skip hash logging on sanitizer builds [#43129](https://github.com/ClickHouse/ClickHouse/pull/43129) ([Raúl Marín](https://github.com/Algunenano)).
* Analyzer improve JOIN with constants [#43141](https://github.com/ClickHouse/ClickHouse/pull/43141) ([Maksim Kita](https://github.com/kitaisreal)).
* Remove POCO_CLICKHOUSE_PATCH [#43146](https://github.com/ClickHouse/ClickHouse/pull/43146) ([Azat Khuzhin](https://github.com/azat)).
* Update CompressionCodecDeflateQpl.cpp [#43150](https://github.com/ClickHouse/ClickHouse/pull/43150) ([Tiaonmmn](https://github.com/Tiaonmmn)).
* Randomize keeper fault injection settings in stress tests [#43187](https://github.com/ClickHouse/ClickHouse/pull/43187) ([Igor Nikonov](https://github.com/devcrafter)).
* Fix for missing columns bug with projections and ALTER UPDATE [#43189](https://github.com/ClickHouse/ClickHouse/pull/43189) ([Alexander Gololobov](https://github.com/davenger)).
* A workaround for LLVM bug, https://github.com/llvm/llvm-project/issues/58633 [#43195](https://github.com/ClickHouse/ClickHouse/pull/43195) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Stop `ConfigReloader` first to avoid data race [#43201](https://github.com/ClickHouse/ClickHouse/pull/43201) ([Antonio Andelic](https://github.com/antonio2368)).
* Fix typo [#43203](https://github.com/ClickHouse/ClickHouse/pull/43203) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Miscellaneous changes [#43206](https://github.com/ClickHouse/ClickHouse/pull/43206) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Fix flaky 02449_check_dependencies_and_table_shutdown [#43212](https://github.com/ClickHouse/ClickHouse/pull/43212) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Add test to check [#43167](https://github.com/ClickHouse/ClickHouse/issues/43167) for all builds [#43216](https://github.com/ClickHouse/ClickHouse/pull/43216) ([Ilya Yatsishin](https://github.com/qoega)).
* Don't throw if shared ID already created in `StorageReplicatedMergeTree` [#43244](https://github.com/ClickHouse/ClickHouse/pull/43244) ([Antonio Andelic](https://github.com/antonio2368)).
* Fix nullptr dereference in collectScopeValidIdentifiersForTypoCorrection [#43245](https://github.com/ClickHouse/ClickHouse/pull/43245) ([Vladimir C](https://github.com/vdimir)).
* Better message in wait_zookeeper_to_start [#43256](https://github.com/ClickHouse/ClickHouse/pull/43256) ([Vladimir C](https://github.com/vdimir)).
* Make test_global_overcommit_tracker non-parallel [#43266](https://github.com/ClickHouse/ClickHouse/pull/43266) ([Dmitry Novik](https://github.com/novikd)).
* Rename canonicalRand to randCanonical [#43283](https://github.com/ClickHouse/ClickHouse/pull/43283) ([Nikita Taranov](https://github.com/nickitat)).
* check limits for an AST in select parser fuzzer [#43285](https://github.com/ClickHouse/ClickHouse/pull/43285) ([Sema Checherinda](https://github.com/CheSema)).
* Allow autoremoval of old parts if detach_not_byte_identical_parts enabled [#43287](https://github.com/ClickHouse/ClickHouse/pull/43287) ([filimonov](https://github.com/filimonov)).
* `pmod`: compatibility with Spark, better documentation [#43313](https://github.com/ClickHouse/ClickHouse/pull/43313) ([Alexey Milovidov](https://github.com/alexey-milovidov)).

View File

@ -127,6 +127,10 @@ The following settings can be set before query execution or placed into configur
- `s3_min_upload_part_size` — The minimum size of part to upload during multipart upload to [S3 Multipart upload](https://docs.aws.amazon.com/AmazonS3/latest/dev/uploadobjusingmpu.html). Default value is `512Mb`.
- `s3_max_redirects` — Maximum number of S3 redirect hops allowed. Default value is `10`.
- `s3_single_read_retries` — The maximum number of attempts during single read. Default value is `4`.
- `s3_max_put_rps` — Maximum PUT requests per second rate before throttling. Default value is `0` (unlimited).
- `s3_max_put_burst` — Maximum number of requests that can be issued simultaneously before hitting the requests-per-second limit. By default (`0` value), it equals `s3_max_put_rps`.
- `s3_max_get_rps` — Maximum GET requests per second rate before throttling. Default value is `0` (unlimited).
- `s3_max_get_burst` — Maximum number of requests that can be issued simultaneously before hitting the requests-per-second limit. By default (`0` value), it equals `s3_max_get_rps`.
Security consideration: if a malicious user can specify arbitrary S3 URLs, `s3_max_redirects` must be set to zero to avoid [SSRF](https://en.wikipedia.org/wiki/Server-side_request_forgery) attacks; alternatively, `remote_host_filter` must be specified in the server configuration.
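The relationship between a per-second rate and a burst size can be illustrated with a token bucket. The sketch below is a minimal illustration under stated assumptions, not ClickHouse's actual throttler: `RequestBucket` and `tryAcquire` are hypothetical names, time is passed in explicitly as seconds, and a rejected request is reported to the caller rather than delayed.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical token-bucket limiter (illustration only, not ClickHouse code).
struct RequestBucket
{
    double max_rps;    // plays the role of s3_max_put_rps / s3_max_get_rps; 0 means unlimited
    double max_burst;  // plays the role of s3_max_put_burst / s3_max_get_burst
    double tokens;     // number of requests that may be issued right now
    double last_time;  // time of the previous refill, in seconds

    // A burst of 0 falls back to the rate, mirroring the documented default.
    RequestBucket(double rps, double burst)
        : max_rps(rps), max_burst(burst > 0 ? burst : rps), tokens(max_burst), last_time(0.0)
    {
    }

    // Returns true if a request may be issued at time `now` (seconds, non-decreasing).
    bool tryAcquire(double now)
    {
        if (max_rps <= 0)
            return true; // a rate of 0 disables throttling

        // Refill proportionally to elapsed time, but never above the burst size.
        tokens = std::min(max_burst, tokens + (now - last_time) * max_rps);
        last_time = now;

        if (tokens < 1.0)
            return false;
        tokens -= 1.0;
        return true;
    }
};
```

With a rate of 2 and a burst of 0, two requests pass immediately, a third is refused, and one more passes after half a second. A real throttler would more likely make the caller wait than refuse outright.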
@ -142,6 +146,7 @@ The following settings can be specified in configuration file for given endpoint
- `header` — Adds specified HTTP header to a request to given endpoint. Optional, can be specified multiple times.
- `server_side_encryption_customer_key_base64` — If specified, required headers for accessing S3 objects with SSE-C encryption will be set. Optional.
- `max_single_read_retries` — The maximum number of attempts during single read. Default value is `4`. Optional.
- `max_put_rps`, `max_put_burst`, `max_get_rps` and `max_get_burst` - Throttling settings (see description above) to use for a specific endpoint instead of per query. Optional.
**Example:**

View File

@ -940,6 +940,10 @@ Optional parameters:
- `cache_path` — Path on local FS where to store cached mark and index files. Default value is `/var/lib/clickhouse/disks/<disk_name>/cache/`.
- `skip_access_check` — If true, disk access checks will not be performed on disk start-up. Default value is `false`.
- `server_side_encryption_customer_key_base64` — If specified, required headers for accessing S3 objects with SSE-C encryption will be set.
- `s3_max_put_rps` — Maximum PUT requests per second rate before throttling. Default value is `0` (unlimited).
- `s3_max_put_burst` — Maximum number of requests that can be issued simultaneously before hitting the requests-per-second limit. By default (`0` value), it equals `s3_max_put_rps`.
- `s3_max_get_rps` — Maximum GET requests per second rate before throttling. Default value is `0` (unlimited).
- `s3_max_get_burst` — Maximum number of requests that can be issued simultaneously before hitting the requests-per-second limit. By default (`0` value), it equals `s3_max_get_rps`.
S3 disk can be configured as `main` or `cold` storage:
``` xml

View File

@ -57,7 +57,7 @@ Internal coordination settings are located in the `<keeper_server>.<coordination
- `auto_forwarding` — Allow to forward write requests from followers to the leader (default: true).
- `shutdown_timeout` — Wait to finish internal connections and shutdown (ms) (default: 5000).
- `startup_timeout` — If the server doesn't connect to other quorum participants in the specified timeout it will terminate (ms) (default: 30000).
- `four_letter_word_white_list` — White list of 4lw commands (default: `conf,cons,crst,envi,ruok,srst,srvr,stat,wchc,wchs,dirs,mntr,isro`).
- `four_letter_word_white_list` — White list of 4lw commands (default: `conf,cons,crst,envi,ruok,srst,srvr,stat,wchs,dirs,mntr,isro,rcvr,apiv,csnp,lgif,rqld`).
Quorum configuration is located in the `<keeper_server>.<raft_configuration>` section and contains the server descriptions.
@ -126,7 +126,7 @@ clickhouse keeper --config /etc/your_path_to_config/config.xml
ClickHouse Keeper also provides 4lw commands which are almost the same as in ZooKeeper. Each command is composed of four letters, such as `mntr`, `stat` etc. Some of the more interesting commands: `stat` gives general information about the server and connected clients, while `srvr` and `cons` give extended details on the server and connections respectively.
The 4lw commands have a white list configuration `four_letter_word_white_list` whose default value is `conf,cons,crst,envi,ruok,srst,srvr,stat,wchs,dirs,mntr,isro,rcvr,apiv,csnp,lgif`.
The 4lw commands have a white list configuration `four_letter_word_white_list` whose default value is `conf,cons,crst,envi,ruok,srst,srvr,stat,wchs,dirs,mntr,isro,rcvr,apiv,csnp,lgif,rqld`.
You can issue the commands to ClickHouse Keeper via telnet or nc, at the client port.
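To make the white list concrete, here is a minimal sketch of how a comma-separated list such as the default value of `four_letter_word_white_list` could be checked before a command is executed. The helper names `parseCommandList` and `isCommandAllowed` are hypothetical and are not taken from the Keeper sources; the sketch also assumes the list contains no surrounding whitespace.

```cpp
#include <cassert>
#include <set>
#include <sstream>
#include <string>

// Hypothetical helper: split a comma-separated list into a set of command names.
std::set<std::string> parseCommandList(const std::string & list)
{
    std::set<std::string> commands;
    std::istringstream stream(list);
    std::string token;
    while (std::getline(stream, token, ','))
        if (!token.empty()) // ignore empty items such as a trailing comma
            commands.insert(token);
    return commands;
}

// Hypothetical helper: a command is accepted only if it has four letters and is listed.
bool isCommandAllowed(const std::set<std::string> & allowed, const std::string & command)
{
    return command.size() == 4 && allowed.count(command) > 0;
}
```

With the new default list, `rqld` would be accepted, while a command that is not listed, such as `wchc`, would be rejected.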
@ -328,6 +328,12 @@ target_committed_log_idx 101
last_snapshot_idx 50
```
- `rqld`: Request to become the new leader. Returns `Sent leadership request to leader.` if the request was sent, or `Failed to send leadership request to leader.` if it was not. Note that if the node is already the leader, the outcome is the same as when the request is sent.
```
Sent leadership request to leader.
```
## Migration from ZooKeeper {#migration-from-zookeeper}
Seamless migration from ZooKeeper to ClickHouse Keeper is impossible: you have to stop your ZooKeeper cluster, convert the data, and start ClickHouse Keeper. The `clickhouse-keeper-converter` tool converts ZooKeeper logs and snapshots to a ClickHouse Keeper snapshot. It works only with ZooKeeper > 3.4. Steps for migration:

View File

@ -189,10 +189,12 @@ preAllocSize=131072
# especially if there are a lot of clients. To prevent ZooKeeper from running
# out of memory due to queued requests, ZooKeeper will throttle clients so that
# there is no more than globalOutstandingLimit outstanding requests in the
# system. The default limit is 1,000.ZooKeeper logs transactions to a
# transaction log. After snapCount transactions are written to a log file a
# snapshot is started and a new transaction log file is started. The default
# snapCount is 10,000.
# system. The default limit is 1000.
# globalOutstandingLimit=1000
# ZooKeeper logs transactions to a transaction log. After snapCount transactions
# are written to a log file a snapshot is started and a new transaction log file
# is started. The default snapCount is 100000.
snapCount=3000000
# If this option is defined, requests will be logged to a trace file named

View File

@ -185,7 +185,7 @@ unhex(arg)
**Arguments**
- `arg` — A string containing any number of hexadecimal digits. Type: [String](../../sql-reference/data-types/string.md).
- `arg` — A string containing any number of hexadecimal digits. Type: [String](../../sql-reference/data-types/string.md), [FixedString](../../sql-reference/data-types/fixedstring.md).
Supports both uppercase and lowercase letters `A-F`. The number of hexadecimal digits does not have to be even. If it is odd, the last digit is interpreted as the least significant half of the `00-0F` byte. If the argument string contains anything other than hexadecimal digits, some implementation-defined result is returned (an exception isn't thrown). For a numeric argument the inverse of hex(N) is not performed by unhex().

View File

@ -181,7 +181,7 @@ unhex(arg)
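**Arguments**
- `arg` — A string containing any number of hexadecimal digits. Type: [String](../../sql-reference/data-types/string.md).
- `arg` — A string containing any number of hexadecimal digits. Type: [String](../../sql-reference/data-types/string.md), [FixedString](../../sql-reference/data-types/fixedstring.md).
Supports both uppercase and lowercase letters A-F. The number of hexadecimal digits does not have to be even. If it is odd, the last digit is interpreted as the low half of the 00-0F byte. If the argument string contains anything other than hexadecimal digits, some implementation-defined result is returned (no exception is thrown). For a numeric argument, unhex() does not perform the inverse of hex(N).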

View File

@ -45,7 +45,7 @@ TaskTable::TaskTable(TaskCluster & parent, const Poco::Util::AbstractConfigurati
engine_push_str = config.getString(table_prefix + "engine", "rand()");
{
ParserStorage parser_storage;
ParserStorage parser_storage{ParserStorage::TABLE_ENGINE};
engine_push_ast = parseQuery(parser_storage, engine_push_str, 0, DBMS_DEFAULT_MAX_PARSER_DEPTH);
engine_push_partition_key_ast = extractPartitionKey(engine_push_ast);
primary_key_comma_separated = boost::algorithm::join(extractPrimaryKeyColumnNames(engine_push_ast), ", ");

View File

@ -29,6 +29,7 @@ namespace ErrorCodes
{
extern const int ILLEGAL_TYPE_OF_ARGUMENT;
extern const int NOT_IMPLEMENTED;
extern const int TOO_LARGE_STRING_SIZE;
}
/** Aggregate functions that store one of passed values.
@ -521,7 +522,11 @@ public:
{
if (capacity < rhs_size)
{
capacity = static_cast<UInt32>(roundUpToPowerOfTwoOrZero(rhs_size));
capacity = static_cast<Int32>(roundUpToPowerOfTwoOrZero(rhs_size));
/// It might happen if the size was too big and the rounded value does not fit a size_t
if (unlikely(capacity < rhs_size))
throw Exception(ErrorCodes::TOO_LARGE_STRING_SIZE, "String size is too big ({})", rhs_size);
/// Don't free large_data here.
large_data = arena->alloc(capacity);
}
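
The check added in this hunk guards against a rounded capacity that no longer covers the requested size once it is narrowed to 32 bits. The following is a minimal standalone sketch of that idea, assuming unsigned 32-bit sizes; `roundUpPow2OrZero` is a hypothetical stand-in for `roundUpToPowerOfTwoOrZero`, and `std::length_error` stands in for the `TOO_LARGE_STRING_SIZE` exception.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>

// Hypothetical stand-in for roundUpToPowerOfTwoOrZero: the smallest power of two >= n,
// or 0 when n is 0 or when that power of two does not fit in 64 bits.
uint64_t roundUpPow2OrZero(uint64_t n)
{
    if (n == 0 || n > (uint64_t{1} << 63))
        return 0;
    uint64_t result = 1;
    while (result < n)
        result <<= 1;
    return result;
}

// Computes a 32-bit capacity for a string of `size` bytes and rejects sizes
// whose rounded capacity is lost when narrowed to 32 bits.
uint32_t capacityFor(uint32_t size)
{
    // For sizes above 2^31 the rounded value is 2^32, which truncates to 0 here.
    uint32_t capacity = static_cast<uint32_t>(roundUpPow2OrZero(size));
    if (capacity < size)
        throw std::length_error("String size is too big");
    return capacity;
}
```

Without the comparison after the cast, a size just above 2^31 would yield a capacity of 0 and the subsequent allocation would be far too small for the data being read.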

View File

@ -202,7 +202,7 @@ public:
auto & merged_maps = this->data(place).merged_maps;
for (size_t col = 0, size = values_types.size(); col < size; ++col)
{
const auto & array_column = assert_cast<const ColumnArray&>(*columns[col + 1]);
const auto & array_column = assert_cast<const ColumnArray &>(*columns[col + 1]);
const IColumn & value_column = array_column.getData();
const IColumn::Offsets & offsets = array_column.getOffsets();
const size_t values_vec_offset = offsets[row_num - 1];
@ -532,7 +532,12 @@ private:
public:
explicit FieldVisitorMax(const Field & rhs_) : rhs(rhs_) {}
bool operator() (Null &) const { throw Exception("Cannot compare Nulls", ErrorCodes::LOGICAL_ERROR); }
bool operator() (Null &) const
{
/// Do not update current value, skip nulls
return false;
}
bool operator() (AggregateFunctionStateData &) const { throw Exception("Cannot compare AggregateFunctionStates", ErrorCodes::LOGICAL_ERROR); }
bool operator() (Array & x) const { return compareImpl<Array>(x); }
@ -567,7 +572,13 @@ private:
public:
explicit FieldVisitorMin(const Field & rhs_) : rhs(rhs_) {}
bool operator() (Null &) const { throw Exception("Cannot compare Nulls", ErrorCodes::LOGICAL_ERROR); }
bool operator() (Null &) const
{
/// Do not update current value, skip nulls
return false;
}
bool operator() (AggregateFunctionStateData &) const { throw Exception("Cannot sum AggregateFunctionStates", ErrorCodes::LOGICAL_ERROR); }
bool operator() (Array & x) const { return compareImpl<Array>(x); }
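
The change above makes the min/max visitors skip `Null` instead of throwing. A minimal sketch of the same pattern with `std::variant` follows; the `Value` type, `MaxVisitor`, and `updateMax` are hypothetical illustrations and not ClickHouse's `Field` machinery.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <variant>

struct Null {};
using Value = std::variant<Null, int64_t, std::string>;

// Hypothetical analogue of FieldVisitorMax: updates the visited value in place
// when `rhs` holds a larger value of the same type.
struct MaxVisitor
{
    const Value & rhs;

    // Null: do not update the current value and report "not updated" instead of throwing.
    bool operator()(Null &) const { return false; }

    template <typename T>
    bool operator()(T & x) const
    {
        const T * other = std::get_if<T>(&rhs);
        if (!other || !(x < *other))
            return false; // different type, or rhs is not larger
        x = *other;
        return true;
    }
};

// Returns true if `current` was replaced by `rhs`.
bool updateMax(Value & current, const Value & rhs)
{
    return std::visit(MaxVisitor{rhs}, current);
}
```

Returning `false` for `Null` lets an aggregation over rows that contain `NULL` continue without raising a logical error.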

View File

@ -9,6 +9,7 @@
#include <DataTypes/DataTypeTuple.h>
#include <DataTypes/DataTypeUUID.h>
#include <Core/Settings.h>
namespace DB
{
@ -28,8 +29,9 @@ namespace
/** `DataForVariadic` is a data structure that will be used for `uniq` aggregate function of multiple arguments.
* It differs, for example, in that it uses a trivial hash function, since `uniq` of many arguments first hashes them out itself.
*/
template <typename Data, typename DataForVariadic>
AggregateFunctionPtr createAggregateFunctionUniq(const std::string & name, const DataTypes & argument_types, const Array & params, const Settings *)
template <typename Data, template <bool, bool> typename DataForVariadic>
AggregateFunctionPtr
createAggregateFunctionUniq(const std::string & name, const DataTypes & argument_types, const Array & params, const Settings *)
{
assertNoParameters(name, params);
@ -61,21 +63,22 @@ AggregateFunctionPtr createAggregateFunctionUniq(const std::string & name, const
else if (which.isTuple())
{
if (use_exact_hash_function)
return std::make_shared<AggregateFunctionUniqVariadic<DataForVariadic, true, true>>(argument_types);
return std::make_shared<AggregateFunctionUniqVariadic<DataForVariadic<true, true>>>(argument_types);
else
return std::make_shared<AggregateFunctionUniqVariadic<DataForVariadic, false, true>>(argument_types);
return std::make_shared<AggregateFunctionUniqVariadic<DataForVariadic<false, true>>>(argument_types);
}
}
/// "Variadic" method also works as a fallback generic case for single argument.
if (use_exact_hash_function)
return std::make_shared<AggregateFunctionUniqVariadic<DataForVariadic, true, false>>(argument_types);
return std::make_shared<AggregateFunctionUniqVariadic<DataForVariadic<true, false>>>(argument_types);
else
return std::make_shared<AggregateFunctionUniqVariadic<DataForVariadic, false, false>>(argument_types);
return std::make_shared<AggregateFunctionUniqVariadic<DataForVariadic<false, false>>>(argument_types);
}
template <bool is_exact, template <typename> class Data, typename DataForVariadic>
AggregateFunctionPtr createAggregateFunctionUniq(const std::string & name, const DataTypes & argument_types, const Array & params, const Settings *)
template <bool is_exact, template <typename, bool> typename Data, template <bool, bool, bool> typename DataForVariadic, bool is_able_to_parallelize_merge>
AggregateFunctionPtr
createAggregateFunctionUniq(const std::string & name, const DataTypes & argument_types, const Array & params, const Settings *)
{
assertNoParameters(name, params);
@ -91,35 +94,35 @@ AggregateFunctionPtr createAggregateFunctionUniq(const std::string & name, const
{
const IDataType & argument_type = *argument_types[0];
AggregateFunctionPtr res(createWithNumericType<AggregateFunctionUniq, Data>(*argument_types[0], argument_types));
AggregateFunctionPtr res(createWithNumericType<AggregateFunctionUniq, Data, is_able_to_parallelize_merge>(*argument_types[0], argument_types));
WhichDataType which(argument_type);
if (res)
return res;
else if (which.isDate())
return std::make_shared<AggregateFunctionUniq<DataTypeDate::FieldType, Data<DataTypeDate::FieldType>>>(argument_types);
return std::make_shared<AggregateFunctionUniq<DataTypeDate::FieldType, Data<DataTypeDate::FieldType, is_able_to_parallelize_merge>>>(argument_types);
else if (which.isDate32())
return std::make_shared<AggregateFunctionUniq<DataTypeDate32::FieldType, Data<DataTypeDate32::FieldType>>>(argument_types);
return std::make_shared<AggregateFunctionUniq<DataTypeDate32::FieldType, Data<DataTypeDate32::FieldType, is_able_to_parallelize_merge>>>(argument_types);
else if (which.isDateTime())
return std::make_shared<AggregateFunctionUniq<DataTypeDateTime::FieldType, Data<DataTypeDateTime::FieldType>>>(argument_types);
return std::make_shared<AggregateFunctionUniq<DataTypeDateTime::FieldType, Data<DataTypeDateTime::FieldType, is_able_to_parallelize_merge>>>(argument_types);
else if (which.isStringOrFixedString())
return std::make_shared<AggregateFunctionUniq<String, Data<String>>>(argument_types);
return std::make_shared<AggregateFunctionUniq<String, Data<String, is_able_to_parallelize_merge>>>(argument_types);
else if (which.isUUID())
return std::make_shared<AggregateFunctionUniq<DataTypeUUID::FieldType, Data<DataTypeUUID::FieldType>>>(argument_types);
return std::make_shared<AggregateFunctionUniq<DataTypeUUID::FieldType, Data<DataTypeUUID::FieldType, is_able_to_parallelize_merge>>>(argument_types);
else if (which.isTuple())
{
if (use_exact_hash_function)
return std::make_shared<AggregateFunctionUniqVariadic<DataForVariadic, true, true>>(argument_types);
return std::make_shared<AggregateFunctionUniqVariadic<DataForVariadic<true, true, is_able_to_parallelize_merge>>>(argument_types);
else
return std::make_shared<AggregateFunctionUniqVariadic<DataForVariadic, false, true>>(argument_types);
return std::make_shared<AggregateFunctionUniqVariadic<DataForVariadic<false, true, is_able_to_parallelize_merge>>>(argument_types);
}
}
/// "Variadic" method also works as a fallback generic case for single argument.
if (use_exact_hash_function)
return std::make_shared<AggregateFunctionUniqVariadic<DataForVariadic, true, false>>(argument_types);
return std::make_shared<AggregateFunctionUniqVariadic<DataForVariadic<true, false, is_able_to_parallelize_merge>>>(argument_types);
else
return std::make_shared<AggregateFunctionUniqVariadic<DataForVariadic, false, false>>(argument_types);
return std::make_shared<AggregateFunctionUniqVariadic<DataForVariadic<false, false, is_able_to_parallelize_merge>>>(argument_types);
}
}
@ -132,14 +135,23 @@ void registerAggregateFunctionsUniq(AggregateFunctionFactory & factory)
{createAggregateFunctionUniq<AggregateFunctionUniqUniquesHashSetData, AggregateFunctionUniqUniquesHashSetDataForVariadic>, properties});
factory.registerFunction("uniqHLL12",
{createAggregateFunctionUniq<false, AggregateFunctionUniqHLL12Data, AggregateFunctionUniqHLL12DataForVariadic>, properties});
{createAggregateFunctionUniq<false, AggregateFunctionUniqHLL12Data, AggregateFunctionUniqHLL12DataForVariadic, false /* is_able_to_parallelize_merge */>, properties});
factory.registerFunction("uniqExact",
{createAggregateFunctionUniq<true, AggregateFunctionUniqExactData, AggregateFunctionUniqExactData<String>>, properties});
auto assign_bool_param = [](const std::string & name, const DataTypes & argument_types, const Array & params, const Settings * settings)
{
/// Using a two-level hash set when we cannot merge in parallel can cause a ~10% slowdown.
if (settings && settings->max_threads > 1)
return createAggregateFunctionUniq<
true, AggregateFunctionUniqExactData, AggregateFunctionUniqExactDataForVariadic, true /* is_able_to_parallelize_merge */>(name, argument_types, params, settings);
else
return createAggregateFunctionUniq<
true, AggregateFunctionUniqExactData, AggregateFunctionUniqExactDataForVariadic, false /* is_able_to_parallelize_merge */>(name, argument_types, params, settings);
};
factory.registerFunction("uniqExact", {assign_bool_param, properties});
#if USE_DATASKETCHES
factory.registerFunction("uniqTheta",
{createAggregateFunctionUniq<AggregateFunctionUniqThetaData, AggregateFunctionUniqThetaData>, properties});
{createAggregateFunctionUniq<AggregateFunctionUniqThetaData, AggregateFunctionUniqThetaDataForVariadic>, properties});
#endif
}
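
The `assign_bool_param` lambda above turns a runtime setting (`max_threads`) into a compile-time template argument by branching once at creation time. A reduced sketch of that pattern, assuming nothing about the real factory, is given here. `Function`, `UniqExactSketch`, `create` and `createFor` are hypothetical names used only for illustration.

```cpp
#include <cstddef>
#include <memory>

// Hypothetical interface; not a ClickHouse class.
struct Function
{
    virtual ~Function() = default;
    virtual bool canParallelizeMerge() const = 0;
};

// The flag is baked into the type, so code inside can use `if constexpr`.
template <bool parallel_merge>
struct UniqExactSketch final : Function
{
    bool canParallelizeMerge() const override { return parallel_merge; }
};

template <bool parallel_merge>
std::unique_ptr<Function> create()
{
    return std::make_unique<UniqExactSketch<parallel_merge>>();
}

// The runtime value is inspected exactly once, when the function object is
// created; afterwards every hot path sees a compile-time constant.
inline std::unique_ptr<Function> createFor(std::size_t max_threads)
{
    if (max_threads > 1)
        return create<true>();
    return create<false>();
}
```

The cost is that both instantiations are compiled, which is why the diff only does this for `uniqExact` and hard-codes `false` for `uniqHLL12`.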

@ -1,7 +1,10 @@
#pragma once
#include <city.h>
#include <atomic>
#include <memory>
#include <type_traits>
#include <utility>
#include <city.h>
#include <base/bit_cast.h>
@ -13,17 +16,18 @@
#include <Interpreters/AggregationCommon.h>
#include <Common/CombinedCardinalityEstimator.h>
#include <Common/HashTable/Hash.h>
#include <Common/HashTable/HashSet.h>
#include <Common/HyperLogLogWithSmallSetOptimization.h>
#include <Common/CombinedCardinalityEstimator.h>
#include <Common/typeid_cast.h>
#include <Common/assert_cast.h>
#include <Common/typeid_cast.h>
#include <AggregateFunctions/UniquesHashSet.h>
#include <AggregateFunctions/IAggregateFunction.h>
#include <AggregateFunctions/ThetaSketchData.h>
#include <AggregateFunctions/UniqExactSet.h>
#include <AggregateFunctions/UniqVariadicHash.h>
#include <AggregateFunctions/UniquesHashSet.h>
namespace DB
@ -37,94 +41,128 @@ struct AggregateFunctionUniqUniquesHashSetData
using Set = UniquesHashSet<DefaultHash<UInt64>>;
Set set;
constexpr static bool is_able_to_parallelize_merge = false;
constexpr static bool is_variadic = false;
static String getName() { return "uniq"; }
};
/// For a function that takes multiple arguments. Such a function hashes them in advance, so TrivialHash is used here.
template <bool is_exact_, bool argument_is_tuple_>
struct AggregateFunctionUniqUniquesHashSetDataForVariadic
{
using Set = UniquesHashSet<TrivialHash>;
Set set;
constexpr static bool is_able_to_parallelize_merge = false;
constexpr static bool is_variadic = true;
constexpr static bool is_exact = is_exact_;
constexpr static bool argument_is_tuple = argument_is_tuple_;
static String getName() { return "uniq"; }
};
/// uniqHLL12
template <typename T>
template <typename T, bool is_able_to_parallelize_merge_>
struct AggregateFunctionUniqHLL12Data
{
using Set = HyperLogLogWithSmallSetOptimization<T, 16, 12>;
Set set;
static String getName() { return "uniqHLL12"; }
};
template <>
struct AggregateFunctionUniqHLL12Data<String>
{
using Set = HyperLogLogWithSmallSetOptimization<UInt64, 16, 12>;
Set set;
constexpr static bool is_able_to_parallelize_merge = is_able_to_parallelize_merge_;
constexpr static bool is_variadic = false;
static String getName() { return "uniqHLL12"; }
};
template <>
struct AggregateFunctionUniqHLL12Data<UUID>
struct AggregateFunctionUniqHLL12Data<String, false>
{
using Set = HyperLogLogWithSmallSetOptimization<UInt64, 16, 12>;
Set set;
constexpr static bool is_able_to_parallelize_merge = false;
constexpr static bool is_variadic = false;
static String getName() { return "uniqHLL12"; }
};
template <>
struct AggregateFunctionUniqHLL12Data<UUID, false>
{
using Set = HyperLogLogWithSmallSetOptimization<UInt64, 16, 12>;
Set set;
constexpr static bool is_able_to_parallelize_merge = false;
constexpr static bool is_variadic = false;
static String getName() { return "uniqHLL12"; }
};
template <bool is_exact_, bool argument_is_tuple_, bool is_able_to_parallelize_merge_>
struct AggregateFunctionUniqHLL12DataForVariadic
{
using Set = HyperLogLogWithSmallSetOptimization<UInt64, 16, 12, TrivialHash>;
Set set;
constexpr static bool is_able_to_parallelize_merge = is_able_to_parallelize_merge_;
constexpr static bool is_variadic = true;
constexpr static bool is_exact = is_exact_;
constexpr static bool argument_is_tuple = argument_is_tuple_;
static String getName() { return "uniqHLL12"; }
};
/// uniqExact
template <typename T>
template <typename T, bool is_able_to_parallelize_merge_>
struct AggregateFunctionUniqExactData
{
using Key = T;
/// When creating, the hash table must be small.
using Set = HashSet<
Key,
HashCRC32<Key>,
HashTableGrower<4>,
HashTableAllocatorWithStackMemory<sizeof(Key) * (1 << 4)>>;
using SingleLevelSet = HashSet<Key, HashCRC32<Key>, HashTableGrower<4>, HashTableAllocatorWithStackMemory<sizeof(Key) * (1 << 4)>>;
using TwoLevelSet = TwoLevelHashSet<Key, HashCRC32<Key>>;
using Set = UniqExactSet<SingleLevelSet, TwoLevelSet>;
Set set;
constexpr static bool is_able_to_parallelize_merge = is_able_to_parallelize_merge_;
constexpr static bool is_variadic = false;
static String getName() { return "uniqExact"; }
};
/// For strings, we put the SipHash values (128 bits) into the hash table.
template <>
struct AggregateFunctionUniqExactData<String>
template <bool is_able_to_parallelize_merge_>
struct AggregateFunctionUniqExactData<String, is_able_to_parallelize_merge_>
{
using Key = UInt128;
/// When creating, the hash table must be small.
using Set = HashSet<
Key,
UInt128TrivialHash,
HashTableGrower<3>,
HashTableAllocatorWithStackMemory<sizeof(Key) * (1 << 3)>>;
using SingleLevelSet = HashSet<Key, UInt128TrivialHash, HashTableGrower<3>, HashTableAllocatorWithStackMemory<sizeof(Key) * (1 << 3)>>;
using TwoLevelSet = TwoLevelHashSet<Key, UInt128TrivialHash>;
using Set = UniqExactSet<SingleLevelSet, TwoLevelSet>;
Set set;
constexpr static bool is_able_to_parallelize_merge = is_able_to_parallelize_merge_;
constexpr static bool is_variadic = false;
static String getName() { return "uniqExact"; }
};
template <bool is_exact_, bool argument_is_tuple_, bool is_able_to_parallelize_merge_>
struct AggregateFunctionUniqExactDataForVariadic : AggregateFunctionUniqExactData<String, is_able_to_parallelize_merge_>
{
constexpr static bool is_able_to_parallelize_merge = is_able_to_parallelize_merge_;
constexpr static bool is_variadic = true;
constexpr static bool is_exact = is_exact_;
constexpr static bool argument_is_tuple = argument_is_tuple_;
};
/// uniqTheta
#if USE_DATASKETCHES
@ -134,14 +172,37 @@ struct AggregateFunctionUniqThetaData
using Set = ThetaSketchData<UInt64>;
Set set;
constexpr static bool is_able_to_parallelize_merge = false;
constexpr static bool is_variadic = false;
static String getName() { return "uniqTheta"; }
};
template <bool is_exact_, bool argument_is_tuple_>
struct AggregateFunctionUniqThetaDataForVariadic : AggregateFunctionUniqThetaData
{
constexpr static bool is_able_to_parallelize_merge = false;
constexpr static bool is_variadic = true;
constexpr static bool is_exact = is_exact_;
constexpr static bool argument_is_tuple = argument_is_tuple_;
};
#endif
namespace detail
{
template <typename T>
struct IsUniqExactSet : std::false_type
{
};
template <typename T1, typename T2>
struct IsUniqExactSet<UniqExactSet<T1, T2>> : std::true_type
{
};
/** Hash function for uniq.
*/
template <typename T> struct AggregateFunctionUniqTraits
@ -162,17 +223,31 @@ template <typename T> struct AggregateFunctionUniqTraits
};
/** The structure for the delegation work to add one element to the `uniq` aggregate functions.
/** The structure for the delegation work to add elements to the `uniq` aggregate functions.
* Used for partial specialization to add strings.
*/
template <typename T, typename Data>
struct OneAdder
struct Adder
{
static void ALWAYS_INLINE add(Data & data, const IColumn & column, size_t row_num)
/// We have to introduce this template parameter (and a bunch of ugly code dealing with it), because we cannot
/// add runtime branches in whatever_hash_set::insert - it will immediately pop up in the perf top.
template <bool use_single_level_hash_table = true>
static void ALWAYS_INLINE add(Data & data, const IColumn ** columns, size_t num_args, size_t row_num)
{
if constexpr (std::is_same_v<Data, AggregateFunctionUniqUniquesHashSetData>
|| std::is_same_v<Data, AggregateFunctionUniqHLL12Data<T>>)
if constexpr (Data::is_variadic)
{
if constexpr (IsUniqExactSet<typename Data::Set>::value)
data.set.template insert<T, use_single_level_hash_table>(
UniqVariadicHash<Data::is_exact, Data::argument_is_tuple>::apply(num_args, columns, row_num));
else
data.set.insert(T{UniqVariadicHash<Data::is_exact, Data::argument_is_tuple>::apply(num_args, columns, row_num)});
}
else if constexpr (
std::is_same_v<
Data,
AggregateFunctionUniqUniquesHashSetData> || std::is_same_v<Data, AggregateFunctionUniqHLL12Data<T, Data::is_able_to_parallelize_merge>>)
{
const auto & column = *columns[0];
if constexpr (!std::is_same_v<T, String>)
{
using ValueType = typename decltype(data.set)::value_type;
@ -185,11 +260,13 @@ struct OneAdder
data.set.insert(CityHash_v1_0_2::CityHash64(value.data, value.size));
}
}
else if constexpr (std::is_same_v<Data, AggregateFunctionUniqExactData<T>>)
else if constexpr (std::is_same_v<Data, AggregateFunctionUniqExactData<T, Data::is_able_to_parallelize_merge>>)
{
const auto & column = *columns[0];
if constexpr (!std::is_same_v<T, String>)
{
data.set.insert(assert_cast<const ColumnVector<T> &>(column).getData()[row_num]);
data.set.template insert<const T &, use_single_level_hash_table>(
assert_cast<const ColumnVector<T> &>(column).getData()[row_num]);
}
else
{
@ -200,16 +277,72 @@ struct OneAdder
hash.update(value.data, value.size);
hash.get128(key);
data.set.insert(key);
data.set.template insert<const UInt128 &, use_single_level_hash_table>(key);
}
}
#if USE_DATASKETCHES
else if constexpr (std::is_same_v<Data, AggregateFunctionUniqThetaData>)
{
const auto & column = *columns[0];
data.set.insertOriginal(column.getDataAt(row_num));
}
#endif
}
static void ALWAYS_INLINE
add(Data & data, const IColumn ** columns, size_t num_args, size_t row_begin, size_t row_end, const char8_t * flags, const UInt8 * null_map)
{
bool use_single_level_hash_table = true;
if constexpr (Data::is_able_to_parallelize_merge)
use_single_level_hash_table = data.set.isSingleLevel();
if (use_single_level_hash_table)
addImpl<true>(data, columns, num_args, row_begin, row_end, flags, null_map);
else
addImpl<false>(data, columns, num_args, row_begin, row_end, flags, null_map);
if constexpr (Data::is_able_to_parallelize_merge)
{
if (data.set.isSingleLevel() && data.set.size() > 100'000)
data.set.convertToTwoLevel();
}
}
private:
template <bool use_single_level_hash_table>
static void ALWAYS_INLINE
addImpl(Data & data, const IColumn ** columns, size_t num_args, size_t row_begin, size_t row_end, const char8_t * flags, const UInt8 * null_map)
{
if (!flags)
{
if (!null_map)
{
for (size_t row = row_begin; row < row_end; ++row)
add<use_single_level_hash_table>(data, columns, num_args, row);
}
else
{
for (size_t row = row_begin; row < row_end; ++row)
if (!null_map[row])
add<use_single_level_hash_table>(data, columns, num_args, row);
}
}
else
{
if (!null_map)
{
for (size_t row = row_begin; row < row_end; ++row)
if (flags[row])
add<use_single_level_hash_table>(data, columns, num_args, row);
}
else
{
for (size_t row = row_begin; row < row_end; ++row)
if (!null_map[row] && flags[row])
add<use_single_level_hash_table>(data, columns, num_args, row);
}
}
}
};
}
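
The `Adder` above checks `isSingleLevel()` once per batch and then runs a loop specialised on `use_single_level_hash_table`, so the per-row insert carries no runtime branch. A stripped-down sketch of that hoisting technique is shown below. `TwoModeCounter`, `addImpl` and `addBatch` are hypothetical, and the "insert" only counts calls rather than touching a hash table.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical container with two internal representations.
struct TwoModeCounter
{
    bool single_level = true;
    std::size_t single_hits = 0;
    std::size_t two_level_hits = 0;

    // The representation is selected at compile time: no branch per insert.
    template <bool use_single_level>
    void insert(int /*value*/)
    {
        if constexpr (use_single_level)
            ++single_hits;
        else
            ++two_level_hits;
    }
};

// Hot loop, instantiated once per mode.
template <bool use_single_level>
void addImpl(TwoModeCounter & counter, const std::vector<int> & rows)
{
    for (int value : rows)
        counter.insert<use_single_level>(value);
}

// The only runtime branch: taken once per batch, not once per row.
inline void addBatch(TwoModeCounter & counter, const std::vector<int> & rows)
{
    if (counter.single_level)
        addImpl<true>(counter, rows);
    else
        addImpl<false>(counter, rows);
}
```

This only stays correct if the representation cannot change in the middle of a batch, which matches the diff: the conversion to two levels happens after `addImpl` returns.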
@ -219,9 +352,15 @@ struct OneAdder
template <typename T, typename Data>
class AggregateFunctionUniq final : public IAggregateFunctionDataHelper<Data, AggregateFunctionUniq<T, Data>>
{
private:
static constexpr size_t num_args = 1;
static constexpr bool is_able_to_parallelize_merge = Data::is_able_to_parallelize_merge;
public:
AggregateFunctionUniq(const DataTypes & argument_types_)
: IAggregateFunctionDataHelper<Data, AggregateFunctionUniq<T, Data>>(argument_types_, {}) {}
explicit AggregateFunctionUniq(const DataTypes & argument_types_)
: IAggregateFunctionDataHelper<Data, AggregateFunctionUniq<T, Data>>(argument_types_, {})
{
}
String getName() const override { return Data::getName(); }
@ -235,7 +374,18 @@ public:
/// ALWAYS_INLINE is required to have better code layout for uniqHLL12 function
void ALWAYS_INLINE add(AggregateDataPtr __restrict place, const IColumn ** columns, size_t row_num, Arena *) const override
{
detail::OneAdder<T, Data>::add(this->data(place), *columns[0], row_num);
detail::Adder<T, Data>::add(this->data(place), columns, num_args, row_num);
}
void ALWAYS_INLINE addBatchSinglePlace(
size_t row_begin, size_t row_end, AggregateDataPtr __restrict place, const IColumn ** columns, Arena *, ssize_t if_argument_pos)
const override
{
const char8_t * flags = nullptr;
if (if_argument_pos >= 0)
flags = assert_cast<const ColumnUInt8 &>(*columns[if_argument_pos]).getData().data();
detail::Adder<T, Data>::add(this->data(place), columns, num_args, row_begin, row_end, flags, nullptr /* null_map */);
}
void addManyDefaults(
@ -244,7 +394,23 @@ public:
size_t /*length*/,
Arena * /*arena*/) const override
{
detail::OneAdder<T, Data>::add(this->data(place), *columns[0], 0);
detail::Adder<T, Data>::add(this->data(place), columns, num_args, 0);
}
void addBatchSinglePlaceNotNull(
size_t row_begin,
size_t row_end,
AggregateDataPtr __restrict place,
const IColumn ** columns,
const UInt8 * null_map,
Arena *,
ssize_t if_argument_pos) const override
{
const char8_t * flags = nullptr;
if (if_argument_pos >= 0)
flags = assert_cast<const ColumnUInt8 &>(*columns[if_argument_pos]).getData().data();
detail::Adder<T, Data>::add(this->data(place), columns, num_args, row_begin, row_end, flags, null_map);
}
void merge(AggregateDataPtr __restrict place, ConstAggregateDataPtr rhs, Arena *) const override
@ -252,6 +418,16 @@ public:
this->data(place).set.merge(this->data(rhs).set);
}
bool isAbleToParallelizeMerge() const override { return is_able_to_parallelize_merge; }
void merge(AggregateDataPtr __restrict place, ConstAggregateDataPtr rhs, ThreadPool & thread_pool, Arena *) const override
{
if constexpr (is_able_to_parallelize_merge)
this->data(place).set.merge(this->data(rhs).set, &thread_pool);
else
this->data(place).set.merge(this->data(rhs).set);
}
void serialize(ConstAggregateDataPtr __restrict place, WriteBuffer & buf, std::optional<size_t> /* version */) const override
{
this->data(place).set.write(buf);
@ -273,15 +449,20 @@ public:
* You can pass multiple arguments as is; you can also pass one argument - a tuple.
* But (for the possibility of efficient implementation), you cannot pass several arguments, among which there are tuples.
*/
template <typename Data, bool is_exact, bool argument_is_tuple>
class AggregateFunctionUniqVariadic final : public IAggregateFunctionDataHelper<Data, AggregateFunctionUniqVariadic<Data, is_exact, argument_is_tuple>>
template <typename Data>
class AggregateFunctionUniqVariadic final : public IAggregateFunctionDataHelper<Data, AggregateFunctionUniqVariadic<Data>>
{
private:
using T = typename Data::Set::value_type;
static constexpr bool is_able_to_parallelize_merge = Data::is_able_to_parallelize_merge;
static constexpr bool argument_is_tuple = Data::argument_is_tuple;
size_t num_args = 0;
public:
AggregateFunctionUniqVariadic(const DataTypes & arguments)
: IAggregateFunctionDataHelper<Data, AggregateFunctionUniqVariadic<Data, is_exact, argument_is_tuple>>(arguments, {})
explicit AggregateFunctionUniqVariadic(const DataTypes & arguments)
: IAggregateFunctionDataHelper<Data, AggregateFunctionUniqVariadic<Data>>(arguments, {})
{
if (argument_is_tuple)
num_args = typeid_cast<const DataTypeTuple &>(*arguments[0]).getElements().size();
@ -300,8 +481,34 @@ public:
void add(AggregateDataPtr __restrict place, const IColumn ** columns, size_t row_num, Arena *) const override
{
this->data(place).set.insert(typename Data::Set::value_type(
UniqVariadicHash<is_exact, argument_is_tuple>::apply(num_args, columns, row_num)));
detail::Adder<T, Data>::add(this->data(place), columns, num_args, row_num);
}
void addBatchSinglePlace(
size_t row_begin, size_t row_end, AggregateDataPtr __restrict place, const IColumn ** columns, Arena *, ssize_t if_argument_pos)
const override
{
const char8_t * flags = nullptr;
if (if_argument_pos >= 0)
flags = assert_cast<const ColumnUInt8 &>(*columns[if_argument_pos]).getData().data();
detail::Adder<T, Data>::add(this->data(place), columns, num_args, row_begin, row_end, flags, nullptr /* null_map */);
}
void addBatchSinglePlaceNotNull(
size_t row_begin,
size_t row_end,
AggregateDataPtr __restrict place,
const IColumn ** columns,
const UInt8 * null_map,
Arena *,
ssize_t if_argument_pos) const override
{
const char8_t * flags = nullptr;
if (if_argument_pos >= 0)
flags = assert_cast<const ColumnUInt8 &>(*columns[if_argument_pos]).getData().data();
detail::Adder<T, Data>::add(this->data(place), columns, num_args, row_begin, row_end, flags, null_map);
}
void merge(AggregateDataPtr __restrict place, ConstAggregateDataPtr rhs, Arena *) const override
@ -309,6 +516,16 @@ public:
this->data(place).set.merge(this->data(rhs).set);
}
bool isAbleToParallelizeMerge() const override { return is_able_to_parallelize_merge; }
void merge(AggregateDataPtr __restrict place, ConstAggregateDataPtr rhs, ThreadPool & thread_pool, Arena *) const override
{
if constexpr (is_able_to_parallelize_merge)
this->data(place).set.merge(this->data(rhs).set, &thread_pool);
else
this->data(place).set.merge(this->data(rhs).set);
}
void serialize(ConstAggregateDataPtr __restrict place, WriteBuffer & buf, std::optional<size_t> /* version */) const override
{
this->data(place).set.write(buf);

@ -74,6 +74,19 @@ static IAggregateFunction * createWithNumericType(const IDataType & argument_typ
return nullptr;
}
template <template <typename, typename> class AggregateFunctionTemplate, template <typename, bool> class Data, bool bool_param, typename... TArgs>
static IAggregateFunction * createWithNumericType(const IDataType & argument_type, TArgs && ... args)
{
WhichDataType which(argument_type);
#define DISPATCH(TYPE) \
if (which.idx == TypeIndex::TYPE) return new AggregateFunctionTemplate<TYPE, Data<TYPE, bool_param>>(std::forward<TArgs>(args)...); /// NOLINT
FOR_NUMERIC_TYPES(DISPATCH)
#undef DISPATCH
if (which.idx == TypeIndex::Enum8) return new AggregateFunctionTemplate<Int8, Data<Int8, bool_param>>(std::forward<TArgs>(args)...);
if (which.idx == TypeIndex::Enum16) return new AggregateFunctionTemplate<Int16, Data<Int16, bool_param>>(std::forward<TArgs>(args)...);
return nullptr;
}
template <template <typename, typename> class AggregateFunctionTemplate, template <typename> class Data, typename... TArgs>
static IAggregateFunction * createWithUnsignedIntegerType(const IDataType & argument_type, TArgs && ... args)
{

@ -1,14 +1,15 @@
#pragma once
#include <Columns/ColumnSparse.h>
#include <Columns/ColumnTuple.h>
#include <Columns/ColumnsNumber.h>
#include <Columns/ColumnSparse.h>
#include <Core/Block.h>
#include <Core/ColumnNumbers.h>
#include <Core/Field.h>
#include <Interpreters/Context_fwd.h>
#include <Common/Exception.h>
#include <base/types.h>
#include <Common/Exception.h>
#include <Common/ThreadPool.h>
#include "config.h"
@ -147,6 +148,16 @@ public:
/// Merges state (on which place points to) with other state of current aggregation function.
virtual void merge(AggregateDataPtr __restrict place, ConstAggregateDataPtr rhs, Arena * arena) const = 0;
/// Tells if merge() with thread pool parameter could be used.
virtual bool isAbleToParallelizeMerge() const { return false; }
/// Should be used only if isAbleToParallelizeMerge() returned true.
virtual void
merge(AggregateDataPtr __restrict /*place*/, ConstAggregateDataPtr /*rhs*/, ThreadPool & /*thread_pool*/, Arena * /*arena*/) const
{
throw Exception(ErrorCodes::NOT_IMPLEMENTED, "merge() with thread pool parameter isn't implemented for {}", getName());
}
/// Serializes state (to transmit it over the network, for example).
virtual void serialize(ConstAggregateDataPtr __restrict place, WriteBuffer & buf, std::optional<size_t> version = std::nullopt) const = 0; /// NOLINT
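
The interface change above follows an "optional capability" pattern: a query method that defaults to `false`, plus an overload whose default implementation throws. A small sketch of how a caller might honour that contract is given here. `Aggregate`, `Sum`, `WorkerPool` and `mergeWith` are hypothetical and deliberately much simpler than `IAggregateFunction`.

```cpp
#include <stdexcept>
#include <string>

struct WorkerPool {}; // hypothetical placeholder for a thread pool

struct Aggregate
{
    virtual ~Aggregate() = default;
    virtual std::string name() const = 0;

    // Mandatory sequential merge.
    virtual void merge(int & place, int rhs) const = 0;

    // Capability query: implementations opt in by overriding this.
    virtual bool canParallelizeMerge() const { return false; }

    // Optional overload: must be called only if the query returned true.
    virtual void merge(int & /*place*/, int /*rhs*/, WorkerPool & /*pool*/) const
    {
        throw std::logic_error("parallel merge isn't implemented for " + name());
    }
};

// An implementation that does not opt in.
struct Sum final : Aggregate
{
    using Aggregate::merge; // keep the three-argument overload visible
    std::string name() const override { return "sum"; }
    void merge(int & place, int rhs) const override { place += rhs; }
};

// Caller side: check the capability, then pick the overload.
inline void mergeWith(const Aggregate & f, int & place, int rhs, WorkerPool & pool)
{
    if (f.canParallelizeMerge())
        f.merge(place, rhs, pool);
    else
        f.merge(place, rhs);
}
```

Existing implementations keep compiling unchanged under this design, at the price of moving a misuse from compile time to a runtime exception.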

@ -0,0 +1,112 @@
#pragma once
#include <Common/CurrentThread.h>
#include <Common/HashTable/HashSet.h>
#include <Common/ThreadPool.h>
#include <Common/setThreadName.h>
namespace DB
{
template <typename SingleLevelSet, typename TwoLevelSet>
class UniqExactSet
{
static_assert(std::is_same_v<typename SingleLevelSet::value_type, typename TwoLevelSet::value_type>);
public:
using value_type = typename SingleLevelSet::value_type;
template <typename Arg, bool use_single_level_hash_table = true>
auto ALWAYS_INLINE insert(Arg && arg)
{
if constexpr (use_single_level_hash_table)
asSingleLevel().insert(std::forward<Arg>(arg));
else
asTwoLevel().insert(std::forward<Arg>(arg));
}
auto merge(const UniqExactSet & other, ThreadPool * thread_pool = nullptr)
{
if (isSingleLevel() && other.isTwoLevel())
convertToTwoLevel();
if (isSingleLevel())
{
asSingleLevel().merge(other.asSingleLevel());
}
else
{
auto & lhs = asTwoLevel();
const auto rhs_ptr = other.getTwoLevelSet();
const auto & rhs = *rhs_ptr;
if (!thread_pool)
{
for (size_t i = 0; i < rhs.NUM_BUCKETS; ++i)
lhs.impls[i].merge(rhs.impls[i]);
}
else
{
auto next_bucket_to_merge = std::make_shared<std::atomic_uint32_t>(0);
auto thread_func = [&lhs, &rhs, next_bucket_to_merge, thread_group = CurrentThread::getGroup()]()
{
if (thread_group)
CurrentThread::attachToIfDetached(thread_group);
setThreadName("UniqExactMerger");
while (true)
{
const auto bucket = next_bucket_to_merge->fetch_add(1);
if (bucket >= rhs.NUM_BUCKETS)
return;
lhs.impls[bucket].merge(rhs.impls[bucket]);
}
};
for (size_t i = 0; i < std::min<size_t>(thread_pool->getMaxThreads(), rhs.NUM_BUCKETS); ++i)
thread_pool->scheduleOrThrowOnError(thread_func);
thread_pool->wait();
}
}
}
void read(ReadBuffer & in) { asSingleLevel().read(in); }
void write(WriteBuffer & out) const
{
if (isSingleLevel())
asSingleLevel().write(out);
else
/// We have to preserve compatibility with the old implementation that used only single level hash sets.
asTwoLevel().writeAsSingleLevel(out);
}
size_t size() const { return isSingleLevel() ? asSingleLevel().size() : asTwoLevel().size(); }
/// To convert set to two level before merging (we cannot just call convertToTwoLevel() on right hand side set, because it is declared const).
std::shared_ptr<TwoLevelSet> getTwoLevelSet() const
{
return two_level_set ? two_level_set : std::make_shared<TwoLevelSet>(asSingleLevel());
}
void convertToTwoLevel()
{
two_level_set = getTwoLevelSet();
single_level_set.clear();
}
bool isSingleLevel() const { return !two_level_set; }
bool isTwoLevel() const { return !!two_level_set; }
private:
SingleLevelSet & asSingleLevel() { return single_level_set; }
const SingleLevelSet & asSingleLevel() const { return single_level_set; }
TwoLevelSet & asTwoLevel() { return *two_level_set; }
const TwoLevelSet & asTwoLevel() const { return *two_level_set; }
SingleLevelSet single_level_set;
std::shared_ptr<TwoLevelSet> two_level_set;
};
}
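
`UniqExactSet::merge` above distributes buckets across workers with a shared atomic counter: each worker claims the next bucket index via `fetch_add` and stops once the index runs past the last bucket. The sketch below shows the same scheme with standard-library pieces only. It assumes plain `std::thread` instead of ClickHouse's `ThreadPool`, `std::unordered_set` instead of the two-level hash set, and a made-up bucket count; all names are hypothetical.

```cpp
#include <algorithm>
#include <array>
#include <atomic>
#include <cstddef>
#include <thread>
#include <unordered_set>
#include <vector>

constexpr std::size_t NUM_BUCKETS = 16; // assumption, chosen for the sketch
using Buckets = std::array<std::unordered_set<int>, NUM_BUCKETS>;

inline std::size_t bucketOf(int key)
{
    return static_cast<std::size_t>(static_cast<unsigned>(key)) % NUM_BUCKETS;
}

inline void insertKey(Buckets & buckets, int key)
{
    buckets[bucketOf(key)].insert(key);
}

// Merges rhs into lhs bucket by bucket. Every index is handed out exactly
// once by fetch_add, so no two threads ever touch the same bucket.
inline void parallelMerge(Buckets & lhs, const Buckets & rhs, std::size_t max_threads)
{
    std::atomic<std::size_t> next_bucket{0};

    auto worker = [&lhs, &rhs, &next_bucket]
    {
        while (true)
        {
            const std::size_t bucket = next_bucket.fetch_add(1);
            if (bucket >= NUM_BUCKETS)
                return;
            lhs[bucket].insert(rhs[bucket].begin(), rhs[bucket].end());
        }
    };

    const std::size_t num_threads = std::min(std::max<std::size_t>(max_threads, 1), NUM_BUCKETS);
    std::vector<std::thread> threads;
    for (std::size_t i = 0; i < num_threads; ++i)
        threads.emplace_back(worker);
    for (auto & thread : threads)
        thread.join();
}

inline std::size_t totalSize(const Buckets & buckets)
{
    std::size_t total = 0;
    for (const auto & bucket : buckets)
        total += bucket.size();
    return total;
}
```

Claiming work dynamically rather than pre-assigning bucket ranges keeps threads busy when bucket sizes are uneven. Capturing the counter by reference is safe here only because all threads are joined before it goes out of scope.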

@ -329,7 +329,7 @@ public:
free();
}
void insert(Value x)
void ALWAYS_INLINE insert(Value x)
{
HashValue hash_value = hash(x);
if (!good(hash_value))

@ -166,7 +166,12 @@ ASTPtr FunctionNode::toASTImpl() const
auto function_ast = std::make_shared<ASTFunction>();
function_ast->name = function_name;
function_ast->is_window_function = isWindowFunction();
if (isWindowFunction())
{
function_ast->is_window_function = true;
function_ast->kind = ASTFunction::Kind::WINDOW_FUNCTION;
}
const auto & parameters = getParameters();
if (!parameters.getNodes().empty())

@ -46,7 +46,7 @@ namespace
context->getRemoteHostFilter(),
static_cast<unsigned>(context->getGlobalContext()->getSettingsRef().s3_max_redirects),
context->getGlobalContext()->getSettingsRef().enable_s3_requests_logging,
/* for_disk_s3 = */ false);
/* for_disk_s3 = */ false, /* get_request_throttler = */ {}, /* put_request_throttler = */ {});
client_configuration.endpointOverride = s3_uri.endpoint;
client_configuration.maxConnections = static_cast<unsigned>(context->getSettingsRef().s3_max_connections);
@ -86,9 +86,10 @@ BackupReaderS3::BackupReaderS3(
const S3::URI & s3_uri_, const String & access_key_id_, const String & secret_access_key_, const ContextPtr & context_)
: s3_uri(s3_uri_)
, client(makeS3Client(s3_uri_, access_key_id_, secret_access_key_, context_))
, max_single_read_retries(context_->getSettingsRef().s3_max_single_read_retries)
, read_settings(context_->getReadSettings())
, request_settings(context_->getStorageS3Settings().getSettings(s3_uri.uri.toString()).request_settings)
{
request_settings.max_single_read_retries = context_->getSettingsRef().s3_max_single_read_retries; // FIXME: Avoid taking value for endpoint
}
DataSourceDescription BackupReaderS3::getDataSourceDescription() const
@ -115,7 +116,7 @@ UInt64 BackupReaderS3::getFileSize(const String & file_name)
std::unique_ptr<SeekableReadBuffer> BackupReaderS3::readFile(const String & file_name)
{
return std::make_unique<ReadBufferFromS3>(
client, s3_uri.bucket, fs::path(s3_uri.key) / file_name, s3_uri.version_id, max_single_read_retries, read_settings);
client, s3_uri.bucket, fs::path(s3_uri.key) / file_name, s3_uri.version_id, request_settings, read_settings);
}
@ -123,12 +124,12 @@ BackupWriterS3::BackupWriterS3(
const S3::URI & s3_uri_, const String & access_key_id_, const String & secret_access_key_, const ContextPtr & context_)
: s3_uri(s3_uri_)
, client(makeS3Client(s3_uri_, access_key_id_, secret_access_key_, context_))
, max_single_read_retries(context_->getSettingsRef().s3_max_single_read_retries)
, read_settings(context_->getReadSettings())
, rw_settings(context_->getStorageS3Settings().getSettings(s3_uri.uri.toString()).rw_settings)
, request_settings(context_->getStorageS3Settings().getSettings(s3_uri.uri.toString()).request_settings)
, log(&Poco::Logger::get("BackupWriterS3"))
{
rw_settings.updateFromSettingsIfEmpty(context_->getSettingsRef());
request_settings.updateFromSettingsIfEmpty(context_->getSettingsRef());
request_settings.max_single_read_retries = context_->getSettingsRef().s3_max_single_read_retries; // FIXME: Avoid taking value for endpoint
}
DataSourceDescription BackupWriterS3::getDataSourceDescription() const
@ -216,7 +217,7 @@ void BackupWriterS3::copyObjectMultipartImpl(
std::vector<String> part_tags;
size_t position = 0;
size_t upload_part_size = rw_settings.min_upload_part_size;
size_t upload_part_size = request_settings.min_upload_part_size;
for (size_t part_number = 1; position < size; ++part_number)
{
@ -248,10 +249,10 @@ void BackupWriterS3::copyObjectMultipartImpl(
position = next_position;
if (part_number % rw_settings.upload_part_size_multiply_parts_count_threshold == 0)
if (part_number % request_settings.upload_part_size_multiply_parts_count_threshold == 0)
{
upload_part_size *= rw_settings.upload_part_size_multiply_factor;
upload_part_size = std::min(upload_part_size, rw_settings.max_upload_part_size);
upload_part_size *= request_settings.upload_part_size_multiply_factor;
upload_part_size = std::min(upload_part_size, request_settings.max_upload_part_size);
}
}
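
The part-size growth in the loop above can be sketched in isolation as follows. This is a minimal illustration rather than the code from this commit: `PartSizeSettings` and `planPartSizes` are hypothetical names, and the way `next_position` is computed is an assumption, since that part of the loop is elided in the hunk.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

/// Hypothetical stand-in for the few request settings fields used by the loop.
struct PartSizeSettings
{
    size_t min_upload_part_size;
    size_t max_upload_part_size;
    size_t upload_part_size_multiply_factor;
    size_t upload_part_size_multiply_parts_count_threshold;
};

/// Returns the size of every part needed to copy `size` bytes.
std::vector<size_t> planPartSizes(size_t size, const PartSizeSettings & settings)
{
    assert(settings.min_upload_part_size > 0);
    assert(settings.upload_part_size_multiply_parts_count_threshold > 0);

    std::vector<size_t> parts;
    size_t position = 0;
    size_t upload_part_size = settings.min_upload_part_size;

    for (size_t part_number = 1; position < size; ++part_number)
    {
        /// Assumption: a part never extends past the end of the object.
        size_t next_position = std::min(position + upload_part_size, size);
        parts.push_back(next_position - position);
        position = next_position;

        /// Every `threshold` parts, grow the part size, capped at the maximum.
        if (part_number % settings.upload_part_size_multiply_parts_count_threshold == 0)
        {
            upload_part_size *= settings.upload_part_size_multiply_factor;
            upload_part_size = std::min(upload_part_size, settings.max_upload_part_size);
        }
    }
    return parts;
}
```

With a minimum of 10, a maximum of 40, a factor of 2 and a threshold of 2, a 100-byte object is split into parts of 10, 10, 20, 20 and 40 bytes, so large objects need fewer parts than a fixed part size would require.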
@ -294,7 +295,7 @@ void BackupWriterS3::copyFileNative(DiskPtr from_disk, const String & file_name_
auto file_path = fs::path(s3_uri.key) / file_name_to;
auto head = requestObjectHeadData(source_bucket, objects[0].absolute_path).GetResult();
if (static_cast<size_t>(head.GetContentLength()) < rw_settings.max_single_operation_copy_size)
if (static_cast<size_t>(head.GetContentLength()) < request_settings.max_single_operation_copy_size)
{
copyObjectImpl(
source_bucket, objects[0].absolute_path, s3_uri.bucket, file_path, head);
@ -331,7 +332,7 @@ bool BackupWriterS3::fileContentsEqual(const String & file_name, const String &
try
{
auto in = std::make_unique<ReadBufferFromS3>(
client, s3_uri.bucket, fs::path(s3_uri.key) / file_name, s3_uri.version_id, max_single_read_retries, read_settings);
client, s3_uri.bucket, fs::path(s3_uri.key) / file_name, s3_uri.version_id, request_settings, read_settings);
String actual_file_contents(expected_file_contents.size(), ' ');
return (in->read(actual_file_contents.data(), actual_file_contents.size()) == actual_file_contents.size())
&& (actual_file_contents == expected_file_contents) && in->eof();
@ -349,7 +350,7 @@ std::unique_ptr<WriteBuffer> BackupWriterS3::writeFile(const String & file_name)
client,
s3_uri.bucket,
fs::path(s3_uri.key) / file_name,
rw_settings,
request_settings,
std::nullopt,
DBMS_DEFAULT_BUFFER_SIZE,
threadPoolCallbackRunner<void>(IOThreadPool::get(), "BackupWriterS3"));

View File

@ -39,8 +39,8 @@ public:
private:
S3::URI s3_uri;
std::shared_ptr<Aws::S3::S3Client> client;
UInt64 max_single_read_retries;
ReadSettings read_settings;
S3Settings::RequestSettings request_settings;
};
@ -81,9 +81,8 @@ private:
S3::URI s3_uri;
std::shared_ptr<Aws::S3::S3Client> client;
UInt64 max_single_read_retries;
ReadSettings read_settings;
S3Settings::ReadWriteSettings rw_settings;
S3Settings::RequestSettings request_settings;
Poco::Logger * log;
};

View File

@ -6,7 +6,6 @@
#include <Parsers/ExpressionElementParsers.h>
#include <Parsers/formatAST.h>
#include <Parsers/parseQuery.h>
#include <Interpreters/maskSensitiveInfoInQueryForLogging.h>
namespace DB
@ -36,6 +35,7 @@ ASTPtr BackupInfo::toAST() const
auto func = std::make_shared<ASTFunction>();
func->name = backup_engine_name;
func->no_empty_args = true;
func->kind = ASTFunction::Kind::BACKUP_NAME;
auto list = std::make_shared<ASTExpressionList>();
func->arguments = list;
@ -93,10 +93,9 @@ BackupInfo BackupInfo::fromAST(const IAST & ast)
}
String BackupInfo::toStringForLogging(const ContextPtr & context) const
String BackupInfo::toStringForLogging() const
{
ASTPtr ast = toAST();
return maskSensitiveInfoInBackupNameForLogging(serializeAST(*ast), ast, context);
return toAST()->formatForLogging();
}
}

View File

@ -22,7 +22,7 @@ struct BackupInfo
ASTPtr toAST() const;
static BackupInfo fromAST(const IAST & ast);
String toStringForLogging(const ContextPtr & context) const;
String toStringForLogging() const;
};
}

View File

@ -2,8 +2,8 @@
#include <Backups/BackupSettings.h>
#include <Core/SettingsFields.h>
#include <Parsers/ASTBackupQuery.h>
#include <Parsers/ASTSetQuery.h>
#include <Parsers/ASTFunction.h>
#include <Parsers/ASTSetQuery.h>
#include <Parsers/ASTLiteral.h>
#include <IO/ReadHelpers.h>
@ -126,7 +126,12 @@ void BackupSettings::copySettingsToQuery(ASTBackupQuery & query) const
query.settings = query_settings;
query.base_backup_name = base_backup_info ? base_backup_info->toAST() : nullptr;
auto base_backup_name = base_backup_info ? base_backup_info->toAST() : nullptr;
if (base_backup_name)
query.setOrReplace(query.base_backup_name, base_backup_name);
else
query.reset(query.base_backup_name);
query.cluster_host_ids = !cluster_host_ids.empty() ? Util::clusterHostIDsToAST(cluster_host_ids) : nullptr;
}

View File

@ -16,6 +16,7 @@
#include <Interpreters/Context.h>
#include <Interpreters/executeDDLQueryOnCluster.h>
#include <Parsers/ASTBackupQuery.h>
#include <Parsers/ASTFunction.h>
#include <Common/Exception.h>
#include <Common/Macros.h>
#include <Common/logger_useful.h>
@ -166,7 +167,7 @@ OperationID BackupsWorker::startMakingBackup(const ASTPtr & query, const Context
}
auto backup_info = BackupInfo::fromAST(*backup_query->backup_name);
String backup_name_for_logging = backup_info.toStringForLogging(context);
String backup_name_for_logging = backup_info.toStringForLogging();
try
{
addInfo(backup_id, backup_name_for_logging, backup_settings.internal, BackupStatus::CREATING_BACKUP);
@ -388,7 +389,7 @@ OperationID BackupsWorker::startRestoring(const ASTPtr & query, ContextMutablePt
try
{
auto backup_info = BackupInfo::fromAST(*restore_query->backup_name);
String backup_name_for_logging = backup_info.toStringForLogging(context);
String backup_name_for_logging = backup_info.toStringForLogging();
addInfo(restore_id, backup_name_for_logging, restore_settings.internal, BackupStatus::RESTORING);
/// Prepare context to use.

View File

@ -3,6 +3,7 @@
#include <Backups/RestoreSettings.h>
#include <Core/SettingsFields.h>
#include <Parsers/ASTBackupQuery.h>
#include <Parsers/ASTFunction.h>
#include <Parsers/ASTSetQuery.h>
#include <boost/algorithm/string/predicate.hpp>
#include <Common/FieldVisitorConvertToNumber.h>
@ -213,7 +214,12 @@ void RestoreSettings::copySettingsToQuery(ASTBackupQuery & query) const
query.settings = query_settings;
query.base_backup_name = base_backup_info ? base_backup_info->toAST() : nullptr;
auto base_backup_name = base_backup_info ? base_backup_info->toAST() : nullptr;
if (base_backup_name)
query.setOrReplace(query.base_backup_name, base_backup_name);
else
query.reset(query.base_backup_name);
query.cluster_host_ids = !cluster_host_ids.empty() ? BackupSettings::Util::clusterHostIDsToAST(cluster_host_ids) : nullptr;
}

View File

@ -47,7 +47,7 @@ void registerBackupEngineS3(BackupFactory & factory)
auto creator_fn = []([[maybe_unused]] const BackupFactory::CreateParams & params) -> std::unique_ptr<IBackup>
{
#if USE_AWS_S3
String backup_name_for_logging = params.backup_info.toStringForLogging(params.context);
String backup_name_for_logging = params.backup_info.toStringForLogging();
const String & id_arg = params.backup_info.id_arg;
const auto & args = params.backup_info.args;

View File

@ -99,7 +99,7 @@ void registerBackupEnginesFileAndDisk(BackupFactory & factory)
{
auto creator_fn = [](const BackupFactory::CreateParams & params) -> std::unique_ptr<IBackup>
{
String backup_name_for_logging = params.backup_info.toStringForLogging(params.context);
String backup_name_for_logging = params.backup_info.toStringForLogging();
const String & engine_name = params.backup_info.backup_engine_name;
if (!params.backup_info.id_arg.empty())

View File

@ -152,16 +152,16 @@ MutableColumnPtr ColumnAggregateFunction::convertToValues(MutableColumnPtr colum
/// If there are references to states in final column, we must hold their ownership
/// by holding arenas and source.
auto callback = [&](auto & subcolumn)
auto callback = [&](IColumn & subcolumn)
{
if (auto * aggregate_subcolumn = typeid_cast<ColumnAggregateFunction *>(subcolumn.get()))
if (auto * aggregate_subcolumn = typeid_cast<ColumnAggregateFunction *>(&subcolumn))
{
aggregate_subcolumn->foreign_arenas = concatArenas(column_aggregate_func.foreign_arenas, column_aggregate_func.my_arena);
aggregate_subcolumn->src = column_aggregate_func.getPtr();
}
};
callback(res);
callback(*res);
res->forEachSubcolumnRecursively(callback);
for (auto * val : data)

View File

@ -151,17 +151,17 @@ public:
ColumnPtr compress() const override;
void forEachSubcolumn(ColumnCallback callback) override
void forEachSubcolumn(ColumnCallback callback) const override
{
callback(offsets);
callback(data);
}
void forEachSubcolumnRecursively(ColumnCallback callback) override
void forEachSubcolumnRecursively(RecursiveColumnCallback callback) const override
{
callback(offsets);
callback(*offsets);
offsets->forEachSubcolumnRecursively(callback);
callback(data);
callback(*data);
data->forEachSubcolumnRecursively(callback);
}

View File

@ -230,14 +230,14 @@ public:
data->getExtremes(min, max);
}
void forEachSubcolumn(ColumnCallback callback) override
void forEachSubcolumn(ColumnCallback callback) const override
{
callback(data);
}
void forEachSubcolumnRecursively(ColumnCallback callback) override
void forEachSubcolumnRecursively(RecursiveColumnCallback callback) const override
{
callback(data);
callback(*data);
data->forEachSubcolumnRecursively(callback);
}

View File

@ -164,7 +164,7 @@ public:
size_t byteSizeAt(size_t n) const override { return getDictionary().byteSizeAt(getIndexes().getUInt(n)); }
size_t allocatedBytes() const override { return idx.getPositions()->allocatedBytes() + getDictionary().allocatedBytes(); }
void forEachSubcolumn(ColumnCallback callback) override
void forEachSubcolumn(ColumnCallback callback) const override
{
callback(idx.getPositionsPtr());
@ -173,15 +173,15 @@ public:
callback(dictionary.getColumnUniquePtr());
}
void forEachSubcolumnRecursively(ColumnCallback callback) override
void forEachSubcolumnRecursively(RecursiveColumnCallback callback) const override
{
callback(idx.getPositionsPtr());
callback(*idx.getPositionsPtr());
idx.getPositionsPtr()->forEachSubcolumnRecursively(callback);
/// Column doesn't own dictionary if it's shared.
if (!dictionary.isShared())
{
callback(dictionary.getColumnUniquePtr());
callback(*dictionary.getColumnUniquePtr());
dictionary.getColumnUniquePtr()->forEachSubcolumnRecursively(callback);
}
}
@ -278,6 +278,7 @@ public:
const ColumnPtr & getPositions() const { return positions; }
WrappedPtr & getPositionsPtr() { return positions; }
const WrappedPtr & getPositionsPtr() const { return positions; }
size_t getPositionAt(size_t row) const;
void insertPosition(UInt64 position);
void insertPositionsRange(const IColumn & column, UInt64 offset, UInt64 limit);

View File

@ -273,14 +273,14 @@ void ColumnMap::getExtremes(Field & min, Field & max) const
max = std::move(map_max_value);
}
void ColumnMap::forEachSubcolumn(ColumnCallback callback)
void ColumnMap::forEachSubcolumn(ColumnCallback callback) const
{
callback(nested);
}
void ColumnMap::forEachSubcolumnRecursively(ColumnCallback callback)
void ColumnMap::forEachSubcolumnRecursively(RecursiveColumnCallback callback) const
{
callback(nested);
callback(*nested);
nested->forEachSubcolumnRecursively(callback);
}

View File

@ -88,8 +88,8 @@ public:
size_t byteSizeAt(size_t n) const override;
size_t allocatedBytes() const override;
void protect() override;
void forEachSubcolumn(ColumnCallback callback) override;
void forEachSubcolumnRecursively(ColumnCallback callback) override;
void forEachSubcolumn(ColumnCallback callback) const override;
void forEachSubcolumnRecursively(RecursiveColumnCallback callback) const override;
bool structureEquals(const IColumn & rhs) const override;
double getRatioOfDefaultRows(double sample_ratio) const override;
void getIndicesOfNonDefaultRows(Offsets & indices, size_t from, size_t limit) const override;

View File

@ -130,17 +130,17 @@ public:
ColumnPtr compress() const override;
void forEachSubcolumn(ColumnCallback callback) override
void forEachSubcolumn(ColumnCallback callback) const override
{
callback(nested_column);
callback(null_map);
}
void forEachSubcolumnRecursively(ColumnCallback callback) override
void forEachSubcolumnRecursively(RecursiveColumnCallback callback) const override
{
callback(nested_column);
callback(*nested_column);
nested_column->forEachSubcolumnRecursively(callback);
callback(null_map);
callback(*null_map);
null_map->forEachSubcolumnRecursively(callback);
}

View File

@ -664,20 +664,20 @@ size_t ColumnObject::allocatedBytes() const
return res;
}
void ColumnObject::forEachSubcolumn(ColumnCallback callback)
void ColumnObject::forEachSubcolumn(ColumnCallback callback) const
{
for (auto & entry : subcolumns)
for (auto & part : entry->data.data)
for (const auto & entry : subcolumns)
for (const auto & part : entry->data.data)
callback(part);
}
void ColumnObject::forEachSubcolumnRecursively(ColumnCallback callback)
void ColumnObject::forEachSubcolumnRecursively(RecursiveColumnCallback callback) const
{
for (auto & entry : subcolumns)
for (const auto & entry : subcolumns)
{
for (auto & part : entry->data.data)
for (const auto & part : entry->data.data)
{
callback(part);
callback(*part);
part->forEachSubcolumnRecursively(callback);
}
}

View File

@ -206,8 +206,8 @@ public:
size_t size() const override;
size_t byteSize() const override;
size_t allocatedBytes() const override;
void forEachSubcolumn(ColumnCallback callback) override;
void forEachSubcolumnRecursively(ColumnCallback callback) override;
void forEachSubcolumn(ColumnCallback callback) const override;
void forEachSubcolumnRecursively(RecursiveColumnCallback callback) const override;
void insert(const Field & field) override;
void insertDefault() override;
void insertFrom(const IColumn & src, size_t n) override;

View File

@ -744,17 +744,17 @@ bool ColumnSparse::structureEquals(const IColumn & rhs) const
return false;
}
void ColumnSparse::forEachSubcolumn(ColumnCallback callback)
void ColumnSparse::forEachSubcolumn(ColumnCallback callback) const
{
callback(values);
callback(offsets);
}
void ColumnSparse::forEachSubcolumnRecursively(ColumnCallback callback)
void ColumnSparse::forEachSubcolumnRecursively(RecursiveColumnCallback callback) const
{
callback(values);
callback(*values);
values->forEachSubcolumnRecursively(callback);
callback(offsets);
callback(*offsets);
offsets->forEachSubcolumnRecursively(callback);
}

View File

@ -139,8 +139,8 @@ public:
ColumnPtr compress() const override;
void forEachSubcolumn(ColumnCallback callback) override;
void forEachSubcolumnRecursively(ColumnCallback callback) override;
void forEachSubcolumn(ColumnCallback callback) const override;
void forEachSubcolumnRecursively(RecursiveColumnCallback callback) const override;
bool structureEquals(const IColumn & rhs) const override;

View File

@ -495,17 +495,17 @@ void ColumnTuple::getExtremes(Field & min, Field & max) const
max = max_tuple;
}
void ColumnTuple::forEachSubcolumn(ColumnCallback callback)
void ColumnTuple::forEachSubcolumn(ColumnCallback callback) const
{
for (auto & column : columns)
for (const auto & column : columns)
callback(column);
}
void ColumnTuple::forEachSubcolumnRecursively(ColumnCallback callback)
void ColumnTuple::forEachSubcolumnRecursively(RecursiveColumnCallback callback) const
{
for (auto & column : columns)
for (const auto & column : columns)
{
callback(column);
callback(*column);
column->forEachSubcolumnRecursively(callback);
}
}

View File

@ -96,8 +96,8 @@ public:
size_t byteSizeAt(size_t n) const override;
size_t allocatedBytes() const override;
void protect() override;
void forEachSubcolumn(ColumnCallback callback) override;
void forEachSubcolumnRecursively(ColumnCallback callback) override;
void forEachSubcolumn(ColumnCallback callback) const override;
void forEachSubcolumnRecursively(RecursiveColumnCallback callback) const override;
bool structureEquals(const IColumn & rhs) const override;
bool isCollationSupported() const override;
ColumnPtr compress() const override;

View File

@ -105,7 +105,13 @@ public:
return column_holder->allocatedBytes() + reverse_index.allocatedBytes()
+ (nested_null_mask ? nested_null_mask->allocatedBytes() : 0);
}
void forEachSubcolumn(IColumn::ColumnCallback callback) override
void forEachSubcolumn(IColumn::ColumnCallback callback) const override
{
callback(column_holder);
}
void forEachSubcolumn(IColumn::MutableColumnCallback callback) override
{
callback(column_holder);
reverse_index.setColumn(getRawColumnPtr());
@ -113,9 +119,15 @@ public:
nested_column_nullable = ColumnNullable::create(column_holder, nested_null_mask);
}
void forEachSubcolumnRecursively(IColumn::ColumnCallback callback) override
void forEachSubcolumnRecursively(IColumn::RecursiveColumnCallback callback) const override
{
callback(column_holder);
callback(*column_holder);
column_holder->forEachSubcolumnRecursively(callback);
}
void forEachSubcolumnRecursively(IColumn::RecursiveMutableColumnCallback callback) override
{
callback(*column_holder);
column_holder->forEachSubcolumnRecursively(callback);
reverse_index.setColumn(getRawColumnPtr());
if (is_nullable)

View File

@ -20,12 +20,10 @@ String IColumn::dumpStructure() const
WriteBufferFromOwnString res;
res << getFamilyName() << "(size = " << size();
ColumnCallback callback = [&](ColumnPtr & subcolumn)
forEachSubcolumn([&](const auto & subcolumn)
{
res << ", " << subcolumn->dumpStructure();
};
const_cast<IColumn*>(this)->forEachSubcolumn(callback);
});
res << ")";
return res.str();
@ -64,6 +62,22 @@ ColumnPtr IColumn::createWithOffsets(const Offsets & offsets, const Field & defa
return res;
}
void IColumn::forEachSubcolumn(MutableColumnCallback callback)
{
std::as_const(*this).forEachSubcolumn([&callback](const WrappedPtr & subcolumn)
{
callback(const_cast<WrappedPtr &>(subcolumn));
});
}
void IColumn::forEachSubcolumnRecursively(RecursiveMutableColumnCallback callback)
{
std::as_const(*this).forEachSubcolumnRecursively([&callback](const IColumn & subcolumn)
{
callback(const_cast<IColumn &>(subcolumn));
});
}
bool isColumnNullable(const IColumn & column)
{
return checkColumn<ColumnNullable>(column);

View File

@ -411,11 +411,22 @@ public:
/// If the column contains subcolumns (such as Array, Nullable, etc), do callback on them.
/// Shallow: doesn't make recursive calls and doesn't invoke the callback for the column itself.
using ColumnCallback = std::function<void(WrappedPtr&)>;
virtual void forEachSubcolumn(ColumnCallback) {}
using ColumnCallback = std::function<void(const WrappedPtr &)>;
virtual void forEachSubcolumn(ColumnCallback) const {}
using MutableColumnCallback = std::function<void(WrappedPtr &)>;
virtual void forEachSubcolumn(MutableColumnCallback callback);
/// Similar to forEachSubcolumn but it also does recursive calls.
virtual void forEachSubcolumnRecursively(ColumnCallback) {}
/// In recursive calls it's prohibited to replace pointers
/// to subcolumns, so we use another callback function.
using RecursiveColumnCallback = std::function<void(const IColumn &)>;
virtual void forEachSubcolumnRecursively(RecursiveColumnCallback) const {}
using RecursiveMutableColumnCallback = std::function<void(IColumn &)>;
virtual void forEachSubcolumnRecursively(RecursiveMutableColumnCallback callback);
/// Columns have equal structure.
/// If true - you can use "compareAt", "insertFrom", etc. methods.

View File

@ -0,0 +1,27 @@
#include <Columns/ColumnLowCardinality.h>
#include <DataTypes/DataTypeString.h>
#include <DataTypes/DataTypeLowCardinality.h>
#include <gtest/gtest.h>
#include <thread>
using namespace DB;
TEST(IColumn, dumpStructure)
{
auto type_lc = std::make_shared<DataTypeLowCardinality>(std::make_shared<DataTypeString>());
ColumnPtr column_lc = type_lc->createColumn();
String expected_structure = "ColumnLowCardinality(size = 0, UInt8(size = 0), ColumnUnique(size = 1, String(size = 1)))";
std::vector<std::thread> threads;
for (size_t i = 0; i < 6; ++i)
{
threads.emplace_back([&]
{
for (size_t j = 0; j < 10000; ++j)
ASSERT_EQ(column_lc->dumpStructure(), expected_structure);
});
}
for (auto & t : threads)
t.join();
}

View File

@ -141,7 +141,7 @@ public:
/// Get piece of memory, without alignment.
char * alloc(size_t size)
{
if (unlikely(head->pos + size > head->end))
if (unlikely(static_cast<std::ptrdiff_t>(size) > head->end - head->pos))
addMemoryChunk(size);
char * res = head->pos;
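
The rewritten condition compares `size` with the remaining distance instead of forming `head->pos + size`. Presumably the motivation is that computing a pointer beyond one past the end of an allocation is undefined behaviour, whereas subtracting two pointers into the same chunk is well defined. A minimal sketch, with `Chunk` and `needsNewChunk` as hypothetical names:

```cpp
#include <cassert>
#include <cstddef>

/// Hypothetical minimal chunk: [pos, end) is the free space.
struct Chunk
{
    char * pos;
    char * end;
};

/// Returns true if `size` bytes do not fit into the remaining space.
/// Only the well-defined difference `end - pos` is computed;
/// the possibly out-of-range pointer `pos + size` is never formed.
bool needsNewChunk(const Chunk & chunk, size_t size)
{
    assert(chunk.pos <= chunk.end);
    return static_cast<std::ptrdiff_t>(size) > chunk.end - chunk.pos;
}
```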

View File

@ -21,7 +21,12 @@ bool FieldVisitorSum::operator() (UInt64 & x) const
bool FieldVisitorSum::operator() (Float64 & x) const { x += rhs.get<Float64>(); return x != 0; }
bool FieldVisitorSum::operator() (Null &) const { throw Exception("Cannot sum Nulls", ErrorCodes::LOGICAL_ERROR); }
bool FieldVisitorSum::operator() (Null &) const
{
/// Do not add anything
return false;
}
bool FieldVisitorSum::operator() (String &) const { throw Exception("Cannot sum Strings", ErrorCodes::LOGICAL_ERROR); }
bool FieldVisitorSum::operator() (Array &) const { throw Exception("Cannot sum Arrays", ErrorCodes::LOGICAL_ERROR); }
bool FieldVisitorSum::operator() (Tuple &) const { throw Exception("Cannot sum Tuples", ErrorCodes::LOGICAL_ERROR); }

View File

@ -3,6 +3,7 @@
#include <Common/HashTable/Hash.h>
#include <Common/HashTable/HashTable.h>
#include <Common/HashTable/HashTableAllocator.h>
#include <Common/HashTable/TwoLevelHashTable.h>
#include <IO/WriteBuffer.h>
#include <IO/WriteHelpers.h>
@ -10,6 +11,14 @@
#include <IO/ReadHelpers.h>
#include <IO/VarInt.h>
namespace DB
{
namespace ErrorCodes
{
extern const int LOGICAL_ERROR;
}
}
/** NOTE HashSet could only be used for memmoveable (position independent) types.
* Example: std::string is not position independent in libstdc++ with C++11 ABI or in libc++.
* Also, the key type must be such that a value consisting of zero bytes compares equal to the zero key.
@ -64,6 +73,47 @@ public:
};
template <
typename Key,
typename TCell, /// Supposed to have no state (HashTableNoState)
typename Hash = DefaultHash<Key>,
typename Grower = TwoLevelHashTableGrower<>,
typename Allocator = HashTableAllocator>
class TwoLevelHashSetTable
: public TwoLevelHashTable<Key, TCell, Hash, Grower, Allocator, HashSetTable<Key, TCell, Hash, Grower, Allocator>>
{
public:
using Self = TwoLevelHashSetTable;
using Base = TwoLevelHashTable<Key, TCell, Hash, Grower, Allocator, HashSetTable<Key, TCell, Hash, Grower, Allocator>>;
using Base::Base;
/// Writes its content in a way that it will be correctly read by HashSetTable.
/// Used by uniqExact to preserve backward compatibility.
void writeAsSingleLevel(DB::WriteBuffer & wb) const
{
DB::writeVarUInt(this->size(), wb);
bool zero_written = false;
for (size_t i = 0; i < Base::NUM_BUCKETS; ++i)
{
if (this->impls[i].hasZero())
{
if (zero_written)
throw DB::Exception(DB::ErrorCodes::LOGICAL_ERROR, "No more than one zero value expected");
this->impls[i].zeroValue()->write(wb);
zero_written = true;
}
}
static constexpr HashTableNoState state;
for (auto ptr = this->begin(); ptr != this->end(); ++ptr)
if (!ptr.getPtr()->isZero(state))
ptr.getPtr()->write(wb);
}
};
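
The idea behind `writeAsSingleLevel` is that a bucketed set is serialized as one flat sequence (the total size followed by every element), so a reader that only knows the single-level layout can load it. The sketch below illustrates that idea with standard containers. It is not the ClickHouse wire format: `NUM_BUCKETS`, `Bucket`, `Serialized` and both functions are hypothetical, and the special handling of the zero key is omitted.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_set>
#include <vector>

constexpr size_t NUM_BUCKETS = 4; /// Hypothetical; chosen only for the example.
using Bucket = std::unordered_set<uint64_t>;
using Serialized = std::vector<uint64_t>;

/// Writes the total element count first, then the elements of every bucket.
Serialized writeAsSingleLevel(const std::array<Bucket, NUM_BUCKETS> & buckets)
{
    size_t total = 0;
    for (const auto & bucket : buckets)
        total += bucket.size();

    Serialized out;
    out.push_back(total);
    for (const auto & bucket : buckets)
        for (uint64_t key : bucket)
            out.push_back(key);
    return out;
}

/// A single-level reader: it knows nothing about buckets.
Bucket readSingleLevel(const Serialized & in)
{
    assert(!in.empty());
    assert(in.size() == in[0] + 1);

    Bucket result;
    for (size_t i = 1; i < in.size(); ++i)
        result.insert(in[i]);
    return result;
}
```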
template <typename Key, typename Hash, typename TState = HashTableNoState>
struct HashSetCellWithSavedHash : public HashTableCell<Key, Hash, TState>
{
@ -89,6 +139,13 @@ template <
typename Allocator = HashTableAllocator>
using HashSet = HashSetTable<Key, HashTableCell<Key, Hash>, Hash, Grower, Allocator>;
template <
typename Key,
typename Hash = DefaultHash<Key>,
typename Grower = TwoLevelHashTableGrower<>,
typename Allocator = HashTableAllocator>
using TwoLevelHashSet = TwoLevelHashSetTable<Key, HashTableCell<Key, Hash>, Hash, Grower, Allocator>;
template <typename Key, typename Hash, size_t initial_size_degree>
using HashSetWithStackMemory = HashSet<
Key,

View File

@ -432,20 +432,12 @@ struct AllocatorBufferDeleter<true, Allocator, Cell>
// The HashTable
template
<
typename Key,
typename Cell,
typename Hash,
typename Grower,
typename Allocator
>
class HashTable :
private boost::noncopyable,
protected Hash,
protected Allocator,
protected Cell::State,
protected ZeroValueStorage<Cell::need_zero_value_storage, Cell> /// empty base optimization
template <typename Key, typename Cell, typename Hash, typename Grower, typename Allocator>
class HashTable : private boost::noncopyable,
protected Hash,
protected Allocator,
protected Cell::State,
public ZeroValueStorage<Cell::need_zero_value_storage, Cell> /// empty base optimization
{
public:
// If we use an allocator with inline memory, check that the initial

View File

@ -159,14 +159,16 @@ public:
class const_iterator /// NOLINT
{
Self * container{};
const Self * container{};
size_t bucket{};
typename Impl::const_iterator current_it{};
friend class TwoLevelHashTable;
const_iterator(Self * container_, size_t bucket_, typename Impl::const_iterator current_it_)
: container(container_), bucket(bucket_), current_it(current_it_) {}
const_iterator(const Self * container_, size_t bucket_, typename Impl::const_iterator current_it_)
: container(container_), bucket(bucket_), current_it(current_it_)
{
}
public:
const_iterator() = default;

View File

@ -0,0 +1,48 @@
#include <Common/KnownObjectNames.h>
#include <Poco/String.h>
namespace DB
{
bool KnownObjectNames::exists(const String & name) const
{
std::lock_guard lock{mutex};
if (names.contains(name))
return true;
if (!case_insensitive_names.empty())
{
String lower_name = Poco::toLower(name);
if (case_insensitive_names.contains(lower_name))
return true;
}
return false;
}
void KnownObjectNames::add(const String & name, bool case_insensitive)
{
std::lock_guard lock{mutex};
if (case_insensitive)
case_insensitive_names.emplace(Poco::toLower(name));
else
names.emplace(name);
}
KnownTableFunctionNames & KnownTableFunctionNames::instance()
{
static KnownTableFunctionNames the_instance;
return the_instance;
}
KnownFormatNames & KnownFormatNames::instance()
{
static KnownFormatNames the_instance;
return the_instance;
}
}

View File

@ -0,0 +1,37 @@
#pragma once
#include <base/types.h>
#include <mutex>
#include <unordered_set>
namespace DB
{
class KnownObjectNames
{
public:
bool exists(const String & name) const;
void add(const String & name, bool case_insensitive = false);
private:
mutable std::mutex mutex;
std::unordered_set<String> names;
std::unordered_set<String> case_insensitive_names;
};
class KnownTableFunctionNames : public KnownObjectNames
{
public:
static KnownTableFunctionNames & instance();
};
class KnownFormatNames : public KnownObjectNames
{
public:
static KnownFormatNames & instance();
};
}

View File

@ -62,7 +62,7 @@
M(NetworkSendElapsedMicroseconds, "Total time spent waiting for data to send to network or sending data to network. Only ClickHouse-related network interaction is included, not by 3rd party libraries..") \
M(NetworkReceiveBytes, "Total number of bytes received from network. Only ClickHouse-related network interaction is included, not by 3rd party libraries.") \
M(NetworkSendBytes, "Total number of bytes send to network. Only ClickHouse-related network interaction is included, not by 3rd party libraries.") \
M(ThrottlerSleepMicroseconds, "Total time a query was sleeping to conform the 'max_network_bandwidth' setting.") \
M(ThrottlerSleepMicroseconds, "Total time a query was sleeping to conform 'max_network_bandwidth' and other throttling settings.") \
\
M(QueryMaskingRulesMatch, "Number of times query masking rules was successfully matched.") \
\

View File

@ -13,12 +13,19 @@
#include <Common/Exception.h>
#include <Common/StringUtils/StringUtils.h>
#include <Common/ProfileEvents.h>
#ifndef NDEBUG
# include <iostream>
#endif
namespace ProfileEvents
{
extern const Event QueryMaskingRulesMatch;
}
namespace DB
{
namespace ErrorCodes
@ -165,6 +172,10 @@ size_t SensitiveDataMasker::wipeSensitiveData(std::string & data) const
size_t matches = 0;
for (const auto & rule : all_masking_rules)
matches += rule->apply(data);
if (matches)
ProfileEvents::increment(ProfileEvents::QueryMaskingRulesMatch, matches);
return matches;
}
@ -184,4 +195,18 @@ size_t SensitiveDataMasker::rulesCount() const
return all_masking_rules.size();
}
std::string wipeSensitiveDataAndCutToLength(const std::string & str, size_t max_length)
{
std::string res = str;
if (auto * masker = SensitiveDataMasker::getInstance())
masker->wipeSensitiveData(res);
if (max_length && (res.length() > max_length))
res.resize(max_length);
return res;
}
}
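
The order of operations in `wipeSensitiveDataAndCutToLength` (mask first, truncate second) can be illustrated with a self-contained sketch. It assumes a single hard-coded `std::regex` rule in place of the configurable `SensitiveDataMasker`; `maskAndCut` and the rule itself are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <string>

/// Hypothetical single masking rule; the real masker holds a configurable list of rules.
std::string maskAndCut(const std::string & str, size_t max_length)
{
    static const std::regex password_rule(R"(password='[^']*')");
    std::string res = std::regex_replace(str, password_rule, "password='[HIDDEN]'");

    /// Zero means "no limit", as in the function above.
    if (max_length && res.length() > max_length)
        res.resize(max_length);
    return res;
}
```

Masking before cutting matters in this sketch: if the string were truncated first, a rule might no longer match a secret that was cut in half, and the remaining prefix of the secret would be logged.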

View File

@ -69,4 +69,8 @@ public:
size_t rulesCount() const;
};
/// Wipes sensitive data and cuts to a specified maximum length in one function call.
/// If the maximum length is zero then the function doesn't cut to the maximum length.
std::string wipeSensitiveDataAndCutToLength(const std::string & str, size_t max_length);
}

View File

@ -20,8 +20,6 @@ namespace ErrorCodes
/// Just 10^9.
static constexpr auto NS = 1000000000UL;
static const size_t default_burst_seconds = 1;
Throttler::Throttler(size_t max_speed_, const std::shared_ptr<Throttler> & parent_)
: max_speed(max_speed_)
, max_burst(max_speed_ * default_burst_seconds)

View File

@ -17,6 +17,8 @@ namespace DB
class Throttler
{
public:
static const size_t default_burst_seconds = 1;
Throttler(size_t max_speed_, size_t max_burst_, const std::shared_ptr<Throttler> & parent_ = nullptr)
: max_speed(max_speed_), max_burst(max_burst_), limit_exceeded_exception_message(""), tokens(max_burst), parent(parent_) {}

View File

@ -27,7 +27,7 @@ int main(int, char **)
std::cerr << x.getValue() << std::endl;
DB::WriteBufferFromOwnString wb;
cont.writeText(wb);
cont.write(wb);
std::cerr << "dump: " << wb.str() << std::endl;
}

View File

@ -15,6 +15,17 @@
using namespace DB;
namespace
{
std::vector<UInt64> getVectorWithNumbersUpToN(size_t n)
{
std::vector<UInt64> res(n);
std::iota(res.begin(), res.end(), 0);
return res;
}
}
/// To test dump functionality without using other hashes that can change
template <typename T>
@ -371,3 +382,48 @@ TEST(HashTable, Resize)
ASSERT_EQ(actual, expected);
}
}
using HashSetContent = std::vector<UInt64>;
class TwoLevelHashSetFixture : public ::testing::TestWithParam<HashSetContent>
{
};
TEST_P(TwoLevelHashSetFixture, WriteAsSingleLevel)
{
using Key = UInt64;
{
const auto & hash_set_content = GetParam();
TwoLevelHashSet<Key, HashCRC32<Key>> two_level;
for (const auto & elem : hash_set_content)
two_level.insert(elem);
WriteBufferFromOwnString wb;
two_level.writeAsSingleLevel(wb);
ReadBufferFromString rb(wb.str());
HashSet<Key, HashCRC32<Key>> single_level;
single_level.read(rb);
EXPECT_EQ(single_level.size(), hash_set_content.size());
for (const auto & elem : hash_set_content)
EXPECT_NE(single_level.find(elem), nullptr);
}
}
INSTANTIATE_TEST_SUITE_P(
TwoLevelHashSetTests,
TwoLevelHashSetFixture,
::testing::Values(
HashSetContent{},
getVectorWithNumbersUpToN(1),
getVectorWithNumbersUpToN(100),
getVectorWithNumbersUpToN(1000),
getVectorWithNumbersUpToN(10000),
getVectorWithNumbersUpToN(100000),
getVectorWithNumbersUpToN(1000000)));

View File

@ -135,6 +135,9 @@ void CompressionCodecDelta::doDecompressData(const char * source, UInt32 source_
if (source_size < 2)
throw Exception("Cannot decompress. File has wrong header", ErrorCodes::CANNOT_DECOMPRESS);
if (uncompressed_size == 0)
return;
UInt8 bytes_size = source[0];
if (bytes_size == 0)

View File

@ -36,7 +36,7 @@ void CoordinationSettings::loadFromConfig(const String & config_elem, const Poco
}
const String KeeperConfigurationAndSettings::DEFAULT_FOUR_LETTER_WORD_CMD = "conf,cons,crst,envi,ruok,srst,srvr,stat,wchs,dirs,mntr,isro,rcvr,apiv,csnp,lgif";
const String KeeperConfigurationAndSettings::DEFAULT_FOUR_LETTER_WORD_CMD = "conf,cons,crst,envi,ruok,srst,srvr,stat,wchs,dirs,mntr,isro,rcvr,apiv,csnp,lgif,rqld";
KeeperConfigurationAndSettings::KeeperConfigurationAndSettings()
: server_id(NOT_EXIST)

View File

@ -142,6 +142,9 @@ void FourLetterCommandFactory::registerCommands(KeeperDispatcher & keeper_dispat
FourLetterCommandPtr log_info_command = std::make_shared<LogInfoCommand>(keeper_dispatcher);
factory.registerCommand(log_info_command);
FourLetterCommandPtr request_leader_command = std::make_shared<RequestLeaderCommand>(keeper_dispatcher);
factory.registerCommand(request_leader_command);
factory.initializeAllowList(keeper_dispatcher);
factory.setInitialize(true);
}
@ -507,4 +510,9 @@ String LogInfoCommand::run()
return ret.str();
}
String RequestLeaderCommand::run()
{
return keeper_dispatcher.requestLeader() ? "Sent leadership request to leader." : "Failed to send leadership request to leader.";
}
}

View File

@ -364,4 +364,17 @@ struct LogInfoCommand : public IFourLetterCommand
~LogInfoCommand() override = default;
};
/// Request to be leader.
struct RequestLeaderCommand : public IFourLetterCommand
{
explicit RequestLeaderCommand(KeeperDispatcher & keeper_dispatcher_)
: IFourLetterCommand(keeper_dispatcher_)
{
}
String name() override { return "rqld"; }
String run() override;
~RequestLeaderCommand() override = default;
};
}

View File

@ -215,6 +215,12 @@ public:
{
return server->getKeeperLogInfo();
}
/// Request to be leader.
bool requestLeader()
{
return server->requestLeader();
}
};
}

View File

@ -932,4 +932,9 @@ KeeperLogInfo KeeperServer::getKeeperLogInfo()
return log_info;
}
bool KeeperServer::requestLeader()
{
return isLeader() || raft_instance->request_leadership();
}
}
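
The `requestLeader()` change above relies on short-circuit evaluation: a node that is already leader reports success without contacting Raft. A minimal sketch of that behaviour, assuming hypothetical stand-ins (`RaftStub`, `ServerSketch`) rather than the real `KeeperServer` and NuRaft types:

```cpp
#include <cassert>

// Hypothetical stand-in for the Raft instance; it only counts requests.
struct RaftStub
{
    int requests_sent = 0;

    bool request_leadership()
    {
        ++requests_sent;
        return true;
    }
};

// Hypothetical stand-in for KeeperServer, reduced to the one method of interest.
struct ServerSketch
{
    bool is_leader = false;
    RaftStub raft;

    // Same shape as the diff: the right-hand side runs only when not already leader.
    bool requestLeader() { return is_leader || raft.request_leadership(); }
};
```

With this shape, a leader returns `true` with zero requests sent, while a follower sends exactly one request and returns whatever the Raft call reports.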

View File

@ -135,6 +135,8 @@ public:
uint64_t createSnapshot();
KeeperLogInfo getKeeperLogInfo();
bool requestLeader();
};
}

View File

@ -93,7 +93,7 @@ void KeeperSnapshotManagerS3::updateS3Configuration(const Poco::Util::AbstractCo
auth_settings.region,
RemoteHostFilter(), s3_max_redirects,
enable_s3_requests_logging,
/* for_disk_s3 = */ false);
/* for_disk_s3 = */ false, /* get_request_throttler = */ {}, /* put_request_throttler = */ {});
client_configuration.endpointOverride = new_uri.endpoint;
@ -135,8 +135,8 @@ void KeeperSnapshotManagerS3::uploadSnapshotImpl(const std::string & snapshot_pa
if (s3_client == nullptr)
return;
S3Settings::ReadWriteSettings read_write_settings;
read_write_settings.upload_part_size_multiply_parts_count_threshold = 10000;
S3Settings::RequestSettings request_settings_1;
request_settings_1.upload_part_size_multiply_parts_count_threshold = 10000;
const auto create_writer = [&](const auto & key)
{
@ -145,7 +145,7 @@ void KeeperSnapshotManagerS3::uploadSnapshotImpl(const std::string & snapshot_pa
s3_client->client,
s3_client->uri.bucket,
key,
read_write_settings
request_settings_1
};
};
@ -194,13 +194,15 @@ void KeeperSnapshotManagerS3::uploadSnapshotImpl(const std::string & snapshot_pa
lock_writer.finalize();
// We read back the written UUID, if it's the same we can upload the file
S3Settings::RequestSettings request_settings_2;
request_settings_2.max_single_read_retries = 1;
ReadBufferFromS3 lock_reader
{
s3_client->client,
s3_client->uri.bucket,
lock_file,
"",
1,
request_settings_2,
{}
};

View File

@ -90,6 +90,10 @@ static constexpr UInt64 operator""_GiB(unsigned long long value)
M(UInt64, s3_max_unexpected_write_error_retries, 4, "The maximum number of retries in case of unexpected errors during S3 write.", 0) \
M(UInt64, s3_max_redirects, 10, "Max number of S3 redirects hops allowed.", 0) \
M(UInt64, s3_max_connections, 1024, "The maximum number of connections per server.", 0) \
M(UInt64, s3_max_get_rps, 0, "Limit on S3 GET request per second rate before throttling. Zero means unlimited.", 0) \
M(UInt64, s3_max_get_burst, 0, "Max number of requests that can be issued simultaneously before hitting the requests-per-second limit. By default (0) it equals `s3_max_get_rps`", 0) \
M(UInt64, s3_max_put_rps, 0, "Limit on S3 PUT request per second rate before throttling. Zero means unlimited.", 0) \
M(UInt64, s3_max_put_burst, 0, "Max number of requests that can be issued simultaneously before hitting the requests-per-second limit. By default (0) it equals `s3_max_put_rps`", 0) \
M(Bool, s3_truncate_on_insert, false, "Enables or disables truncate before insert in s3 engine tables.", 0) \
M(Bool, s3_create_new_file_on_insert, false, "Enables or disables creating a new file on each insert in s3 engine tables", 0) \
M(Bool, s3_check_objects_after_upload, false, "Check each uploaded object to s3 with head request to be sure that upload was successful", 0) \
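
The `*_rps` and `*_burst` settings above describe a token bucket: requests consume tokens, tokens refill at the per-second rate, and the burst value caps how many can accumulate. The sketch below illustrates that idea only. `TokenBucketSketch` and `tryAcquire` are hypothetical names, not ClickHouse's `Throttler` API, and time is passed in explicitly to keep the example deterministic. Whether the real throttler blocks or rejects is not shown in this diff; the sketch simply rejects.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical token bucket, not the ClickHouse Throttler class.
class TokenBucketSketch
{
public:
    TokenBucketSketch(double max_rps_, double max_burst_)
        : max_rps(max_rps_), max_burst(max_burst_), tokens(max_burst_)
    {
    }

    // Returns true if one request may be issued at time `now_seconds`.
    bool tryAcquire(double now_seconds)
    {
        // Refill in proportion to elapsed time, capped at the burst size.
        tokens = std::min(max_burst, tokens + (now_seconds - last_seconds) * max_rps);
        last_seconds = now_seconds;

        if (tokens < 1.0)
            return false;

        tokens -= 1.0;
        return true;
    }

private:
    double max_rps;
    double max_burst;
    double tokens;
    double last_seconds = 0.0;
};
```

With a rate of 2 and a burst of 3, three requests pass immediately, the fourth is refused, and one more passes after half a second.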

View File

@ -74,6 +74,8 @@ public:
bool checkUniqueId(const String & id) const override { return delegate->checkUniqueId(id); }
DataSourceDescription getDataSourceDescription() const override { return delegate->getDataSourceDescription(); }
bool isRemote() const override { return delegate->isRemote(); }
bool isReadOnly() const override { return delegate->isReadOnly(); }
bool isWriteOnce() const override { return delegate->isWriteOnce(); }
bool supportZeroCopyReplication() const override { return delegate->supportZeroCopyReplication(); }
bool supportParallelWrite() const override { return delegate->supportParallelWrite(); }
void onFreeze(const String & path) override;

View File

@ -308,6 +308,8 @@ public:
virtual bool isReadOnly() const { return false; }
virtual bool isWriteOnce() const { return false; }
/// Check if disk is broken. Broken disks will have 0 space and cannot be used.
virtual bool isBroken() const { return false; }

View File

@ -101,6 +101,8 @@ public:
bool isReadOnly() const override { return object_storage->isReadOnly(); }
bool isWriteOnce() const override { return object_storage->isWriteOnce(); }
const std::string & getCacheConfigName() const { return cache_config_name; }
ObjectStoragePtr getWrappedObjectStorage() { return object_storage; }

View File

@ -499,6 +499,11 @@ bool DiskObjectStorage::isReadOnly() const
return object_storage->isReadOnly();
}
bool DiskObjectStorage::isWriteOnce() const
{
return object_storage->isWriteOnce();
}
DiskObjectStoragePtr DiskObjectStorage::createDiskObjectStorage()
{
return std::make_shared<DiskObjectStorage>(

View File

@ -177,6 +177,12 @@ public:
/// with static files, so only read-only operations are allowed for this storage.
bool isReadOnly() const override;
/// Is object write-once?
/// For example, S3PlainObjectStorage is write-once: it supports
/// BACKUP to this disk, but does not support INSERT into a
/// MergeTree table on this disk.
bool isWriteOnce() const override;
/// Add a cache layer.
/// Example: DiskObjectStorage(S3ObjectStorage) -> DiskObjectStorage(CachedObjectStorage(S3ObjectStorage))
/// There can be any number of cache layers:

View File

@ -199,6 +199,7 @@ public:
virtual bool supportsCache() const { return false; }
virtual bool isReadOnly() const { return false; }
virtual bool isWriteOnce() const { return false; }
virtual bool supportParallelWrite() const { return false; }

View File

@ -175,7 +175,7 @@ std::unique_ptr<ReadBufferFromFileBase> S3ObjectStorage::readObjects( /// NOLINT
bucket,
path,
version_id,
settings_ptr->s3_settings.max_single_read_retries,
settings_ptr->request_settings,
disk_read_settings,
/* use_external_buffer */true,
/* offset */0,
@ -212,7 +212,7 @@ std::unique_ptr<ReadBufferFromFileBase> S3ObjectStorage::readObject( /// NOLINT
bucket,
object.absolute_path,
version_id,
settings_ptr->s3_settings.max_single_read_retries,
settings_ptr->request_settings,
patchSettings(read_settings));
}
@ -238,7 +238,7 @@ std::unique_ptr<WriteBufferFromFileBase> S3ObjectStorage::writeObject( /// NOLIN
client.get(),
bucket,
object.absolute_path,
settings_ptr->s3_settings,
settings_ptr->request_settings,
attributes,
buf_size,
std::move(scheduler),
@ -489,7 +489,7 @@ void S3ObjectStorage::copyObjectImpl(
throwIfError(outcome);
auto settings_ptr = s3_settings.get();
if (settings_ptr->s3_settings.check_objects_after_upload)
if (settings_ptr->request_settings.check_objects_after_upload)
{
auto object_head = requestObjectHeadData(dst_bucket, dst_key);
if (!object_head.IsSuccess())
@ -533,7 +533,7 @@ void S3ObjectStorage::copyObjectMultipartImpl(
std::vector<String> part_tags;
size_t upload_part_size = settings_ptr->s3_settings.min_upload_part_size;
size_t upload_part_size = settings_ptr->request_settings.min_upload_part_size;
for (size_t position = 0, part_number = 1; position < size; ++part_number, position += upload_part_size)
{
ProfileEvents::increment(ProfileEvents::S3UploadPartCopy);
@ -586,7 +586,7 @@ void S3ObjectStorage::copyObjectMultipartImpl(
throwIfError(outcome);
}
if (settings_ptr->s3_settings.check_objects_after_upload)
if (settings_ptr->request_settings.check_objects_after_upload)
{
auto object_head = requestObjectHeadData(dst_bucket, dst_key);
if (!object_head.IsSuccess())
@ -643,17 +643,20 @@ void S3ObjectStorage::startup()
void S3ObjectStorage::applyNewSettings(const Poco::Util::AbstractConfiguration & config, const std::string & config_prefix, ContextPtr context)
{
s3_settings.set(getSettings(config, config_prefix, context));
client.set(getClient(config, config_prefix, context));
auto new_s3_settings = getSettings(config, config_prefix, context);
auto new_client = getClient(config, config_prefix, context, *new_s3_settings);
s3_settings.set(std::move(new_s3_settings));
client.set(std::move(new_client));
applyRemoteThrottlingSettings(context);
}
std::unique_ptr<IObjectStorage> S3ObjectStorage::cloneObjectStorage(
const std::string & new_namespace, const Poco::Util::AbstractConfiguration & config, const std::string & config_prefix, ContextPtr context)
{
auto new_s3_settings = getSettings(config, config_prefix, context);
auto new_client = getClient(config, config_prefix, context, *new_s3_settings);
return std::make_unique<S3ObjectStorage>(
getClient(config, config_prefix, context),
getSettings(config, config_prefix, context),
std::move(new_client), std::move(new_s3_settings),
version_id, s3_capabilities, new_namespace,
S3::URI(Poco::URI(config.getString(config_prefix + ".endpoint"))).endpoint);
}

View File

@ -23,17 +23,17 @@ struct S3ObjectStorageSettings
S3ObjectStorageSettings() = default;
S3ObjectStorageSettings(
const S3Settings::ReadWriteSettings & s3_settings_,
const S3Settings::RequestSettings & request_settings_,
uint64_t min_bytes_for_seek_,
int32_t list_object_keys_size_,
int32_t objects_chunk_size_to_delete_)
: s3_settings(s3_settings_)
: request_settings(request_settings_)
, min_bytes_for_seek(min_bytes_for_seek_)
, list_object_keys_size(list_object_keys_size_)
, objects_chunk_size_to_delete(objects_chunk_size_to_delete_)
{}
S3Settings::ReadWriteSettings s3_settings;
S3Settings::RequestSettings request_settings;
uint64_t min_bytes_for_seek;
int32_t list_object_keys_size;
@ -216,6 +216,11 @@ public:
{
data_source_description.type = DataSourceType::S3_Plain;
}
/// Notes:
/// - supports BACKUP to this disk
/// - does not support INSERT into MergeTree table on this disk
bool isWriteOnce() const override { return true; }
};
}

View File

@ -4,6 +4,7 @@
#include <Common/StringUtils/StringUtils.h>
#include <Common/logger_useful.h>
#include <Common/Throttler.h>
#include <IO/ReadHelpers.h>
#include <IO/WriteHelpers.h>
#include <Interpreters/Context.h>
@ -32,17 +33,26 @@ namespace ErrorCodes
std::unique_ptr<S3ObjectStorageSettings> getSettings(const Poco::Util::AbstractConfiguration & config, const String & config_prefix, ContextPtr context)
{
S3Settings::ReadWriteSettings rw_settings;
rw_settings.max_single_read_retries = config.getUInt64(config_prefix + ".s3_max_single_read_retries", context->getSettingsRef().s3_max_single_read_retries);
rw_settings.min_upload_part_size = config.getUInt64(config_prefix + ".s3_min_upload_part_size", context->getSettingsRef().s3_min_upload_part_size);
rw_settings.upload_part_size_multiply_factor = config.getUInt64(config_prefix + ".s3_upload_part_size_multiply_factor", context->getSettingsRef().s3_upload_part_size_multiply_factor);
rw_settings.upload_part_size_multiply_parts_count_threshold = config.getUInt64(config_prefix + ".s3_upload_part_size_multiply_parts_count_threshold", context->getSettingsRef().s3_upload_part_size_multiply_parts_count_threshold);
rw_settings.max_single_part_upload_size = config.getUInt64(config_prefix + ".s3_max_single_part_upload_size", context->getSettingsRef().s3_max_single_part_upload_size);
rw_settings.check_objects_after_upload = config.getUInt64(config_prefix + ".s3_check_objects_after_upload", context->getSettingsRef().s3_check_objects_after_upload);
rw_settings.max_unexpected_write_error_retries = config.getUInt64(config_prefix + ".s3_max_unexpected_write_error_retries", context->getSettingsRef().s3_max_unexpected_write_error_retries);
const Settings & settings = context->getSettingsRef();
S3Settings::RequestSettings request_settings;
request_settings.max_single_read_retries = config.getUInt64(config_prefix + ".s3_max_single_read_retries", settings.s3_max_single_read_retries);
request_settings.min_upload_part_size = config.getUInt64(config_prefix + ".s3_min_upload_part_size", settings.s3_min_upload_part_size);
request_settings.upload_part_size_multiply_factor = config.getUInt64(config_prefix + ".s3_upload_part_size_multiply_factor", settings.s3_upload_part_size_multiply_factor);
request_settings.upload_part_size_multiply_parts_count_threshold = config.getUInt64(config_prefix + ".s3_upload_part_size_multiply_parts_count_threshold", settings.s3_upload_part_size_multiply_parts_count_threshold);
request_settings.max_single_part_upload_size = config.getUInt64(config_prefix + ".s3_max_single_part_upload_size", settings.s3_max_single_part_upload_size);
request_settings.check_objects_after_upload = config.getUInt64(config_prefix + ".s3_check_objects_after_upload", settings.s3_check_objects_after_upload);
request_settings.max_unexpected_write_error_retries = config.getUInt64(config_prefix + ".s3_max_unexpected_write_error_retries", settings.s3_max_unexpected_write_error_retries);
// NOTE: it would be better to reuse old throttlers to avoid losing token bucket state on every config reload, which could lead to exceeding the limit for a short time. But it is good enough unless very high `burst` values are used.
if (UInt64 max_get_rps = config.getUInt64(config_prefix + ".s3_max_get_rps", settings.s3_max_get_rps))
request_settings.get_request_throttler = std::make_shared<Throttler>(
max_get_rps, config.getUInt64(config_prefix + ".s3_max_get_burst", settings.s3_max_get_burst ? settings.s3_max_get_burst : Throttler::default_burst_seconds * max_get_rps));
if (UInt64 max_put_rps = config.getUInt64(config_prefix + ".s3_max_put_rps", settings.s3_max_put_rps))
request_settings.put_request_throttler = std::make_shared<Throttler>(
max_put_rps, config.getUInt64(config_prefix + ".s3_max_put_burst", settings.s3_max_put_burst ? settings.s3_max_put_burst : Throttler::default_burst_seconds * max_put_rps));
return std::make_unique<S3ObjectStorageSettings>(
rw_settings,
request_settings,
config.getUInt64(config_prefix + ".min_bytes_for_seek", 1024 * 1024),
config.getInt(config_prefix + ".list_object_keys_size", 1000),
config.getInt(config_prefix + ".objects_chunk_size_to_delete", 1000));
@ -112,14 +122,20 @@ std::shared_ptr<S3::ProxyConfiguration> getProxyConfiguration(const String & pre
}
std::unique_ptr<Aws::S3::S3Client> getClient(const Poco::Util::AbstractConfiguration & config, const String & config_prefix, ContextPtr context)
std::unique_ptr<Aws::S3::S3Client> getClient(
const Poco::Util::AbstractConfiguration & config,
const String & config_prefix,
ContextPtr context,
const S3ObjectStorageSettings & settings)
{
S3::PocoHTTPClientConfiguration client_configuration = S3::ClientFactory::instance().createClientConfiguration(
config.getString(config_prefix + ".region", ""),
context->getRemoteHostFilter(),
static_cast<int>(context->getGlobalContext()->getSettingsRef().s3_max_redirects),
context->getGlobalContext()->getSettingsRef().enable_s3_requests_logging,
/* for_disk_s3 = */ true);
/* for_disk_s3 = */ true,
settings.request_settings.get_request_throttler,
settings.request_settings.put_request_throttler);
S3::URI uri(Poco::URI(config.getString(config_prefix + ".endpoint")));
if (uri.key.back() != '/')

View File

@ -22,7 +22,7 @@ struct S3ObjectStorageSettings;
std::unique_ptr<S3ObjectStorageSettings> getSettings(const Poco::Util::AbstractConfiguration & config, const String & config_prefix, ContextPtr context);
std::unique_ptr<Aws::S3::S3Client> getClient(const Poco::Util::AbstractConfiguration & config, const String & config_prefix, ContextPtr context);
std::unique_ptr<Aws::S3::S3Client> getClient(const Poco::Util::AbstractConfiguration & config, const String & config_prefix, ContextPtr context, const S3ObjectStorageSettings & settings);
}

View File

@ -1,27 +0,0 @@
#pragma once
#include "config.h"
#if USE_AWS_S3
#include <aws/core/client/DefaultRetryStrategy.h>
#include <IO/S3Common.h>
#include <Storages/StorageS3Settings.h>
#include <Disks/ObjectStorages/S3/ProxyConfiguration.h>
#include <Disks/ObjectStorages/S3/ProxyListConfiguration.h>
#include <Disks/ObjectStorages/S3/ProxyResolverConfiguration.h>
#include <Disks/DiskRestartProxy.h>
#include <Disks/DiskLocal.h>
#include <Disks/ObjectStorages/DiskObjectStorageCommon.h>
namespace DB
{
std::unique_ptr<DiskS3Settings> getSettings(const Poco::Util::AbstractConfiguration & config, const String & config_prefix, ContextPtr context);
std::shared_ptr<Aws::S3::S3Client> getClient(const Poco::Util::AbstractConfiguration & config, const String & config_prefix, ContextPtr context);
}

View File

@ -130,21 +130,16 @@ void registerDiskS3(DiskFactory & factory)
chassert(type == "s3" || type == "s3_plain");
MetadataStoragePtr metadata_storage;
auto settings = getSettings(config, config_prefix, context);
auto client = getClient(config, config_prefix, context, *settings);
if (type == "s3_plain")
{
s3_storage = std::make_shared<S3PlainObjectStorage>(
getClient(config, config_prefix, context),
getSettings(config, config_prefix, context),
uri.version_id, s3_capabilities, uri.bucket, uri.endpoint);
s3_storage = std::make_shared<S3PlainObjectStorage>(std::move(client), std::move(settings), uri.version_id, s3_capabilities, uri.bucket, uri.endpoint);
metadata_storage = std::make_shared<MetadataStorageFromPlainObjectStorage>(s3_storage, uri.key);
}
else
{
s3_storage = std::make_shared<S3ObjectStorage>(
getClient(config, config_prefix, context),
getSettings(config, config_prefix, context),
uri.version_id, s3_capabilities, uri.bucket, uri.endpoint);
s3_storage = std::make_shared<S3ObjectStorage>(std::move(client), std::move(settings), uri.version_id, s3_capabilities, uri.bucket, uri.endpoint);
auto [metadata_path, metadata_disk] = prepareForLocalMetadata(name, config, config_prefix, context);
metadata_storage = std::make_shared<MetadataStorageFromDisk>(metadata_disk, uri.key);
}

View File

@ -13,7 +13,6 @@ namespace DB
namespace ErrorCodes
{
extern const int FILE_DOESNT_EXIST;
extern const int NETWORK_ERROR;
}
MetadataStorageFromStaticFilesWebServer::MetadataStorageFromStaticFilesWebServer(
@ -38,7 +37,7 @@ bool MetadataStorageFromStaticFilesWebServer::exists(const std::string & path) c
if (fs_path.has_extension())
fs_path = fs_path.parent_path();
initializeIfNeeded(fs_path, false);
initializeIfNeeded(fs_path);
if (object_storage.files.empty())
return false;
@ -123,39 +122,21 @@ std::vector<std::string> MetadataStorageFromStaticFilesWebServer::listDirectory(
return result;
}
bool MetadataStorageFromStaticFilesWebServer::initializeIfNeeded(const std::string & path, std::optional<bool> throw_on_error) const
void MetadataStorageFromStaticFilesWebServer::initializeIfNeeded(const std::string & path) const
{
if (object_storage.files.find(path) == object_storage.files.end())
{
try
{
object_storage.initialize(fs::path(object_storage.url) / path);
}
catch (...)
{
const auto message = getCurrentExceptionMessage(false);
bool can_throw = throw_on_error.has_value() ? *throw_on_error : CurrentThread::isInitialized() && CurrentThread::get().getQueryContext();
if (can_throw)
throw Exception(ErrorCodes::NETWORK_ERROR, "Cannot load disk metadata. Error: {}", message);
LOG_TRACE(&Poco::Logger::get("DiskWeb"), "Cannot load disk metadata. Error: {}", message);
return false;
}
object_storage.initialize(fs::path(object_storage.url) / path);
}
return true;
}
DirectoryIteratorPtr MetadataStorageFromStaticFilesWebServer::iterateDirectory(const std::string & path) const
{
std::vector<fs::path> dir_file_paths;
if (!initializeIfNeeded(path))
{
initializeIfNeeded(path);
if (!exists(path))
return std::make_unique<StaticDirectoryIterator>(std::move(dir_file_paths));
}
assertExists(path);
for (const auto & [file_path, _] : object_storage.files)
{

View File

@ -19,7 +19,7 @@ private:
void assertExists(const std::string & path) const;
bool initializeIfNeeded(const std::string & path, std::optional<bool> throw_on_error = std::nullopt) const;
void initializeIfNeeded(const std::string & path) const;
public:
explicit MetadataStorageFromStaticFilesWebServer(const WebObjectStorage & object_storage_);

View File

@ -46,7 +46,10 @@ void WebObjectStorage::initialize(const String & uri_path) const
Poco::Net::HTTPRequest::HTTP_GET,
ReadWriteBufferFromHTTP::OutStreamCallback(),
ConnectionTimeouts::getHTTPTimeouts(getContext()),
credentials);
credentials,
/* max_redirects= */ 0,
/* buffer_size_= */ DBMS_DEFAULT_BUFFER_SIZE,
getContext()->getReadSettings());
String file_name;
FileData file_data{};
@ -82,6 +85,15 @@ void WebObjectStorage::initialize(const String & uri_path) const
files.emplace(std::make_pair(dir_name, FileData({ .type = FileType::Directory })));
}
catch (HTTPException & e)
{
/// 404 - no files
if (e.getHTTPStatus() == Poco::Net::HTTPResponse::HTTP_NOT_FOUND)
return;
e.addMessage("while loading disk metadata");
throw;
}
catch (Exception & e)
{
e.addMessage("while loading disk metadata");

View File

@ -13,6 +13,7 @@
#include <Processors/Formats/Impl/ValuesBlockInputFormat.h>
#include <Poco/URI.h>
#include <Common/Exception.h>
#include <Common/KnownObjectNames.h>
#include <fcntl.h>
#include <unistd.h>
@ -445,6 +446,7 @@ void FormatFactory::registerInputFormat(const String & name, InputCreator input_
throw Exception("FormatFactory: Input format " + name + " is already registered", ErrorCodes::LOGICAL_ERROR);
target = std::move(input_creator);
registerFileExtension(name, name);
KnownFormatNames::instance().add(name);
}
void FormatFactory::registerNonTrivialPrefixAndSuffixChecker(const String & name, NonTrivialPrefixAndSuffixChecker non_trivial_prefix_and_suffix_checker)
@ -483,6 +485,7 @@ void FormatFactory::registerOutputFormat(const String & name, OutputCreator outp
throw Exception("FormatFactory: Output format " + name + " is already registered", ErrorCodes::LOGICAL_ERROR);
target = std::move(output_creator);
registerFileExtension(name, name);
KnownFormatNames::instance().add(name);
}
void FormatFactory::registerFileExtension(const String & extension, const String & format_name)

View File

@ -566,7 +566,8 @@ public:
DataTypePtr getReturnTypeImpl(const DataTypes & arguments) const override
{
if (!isString(arguments[0]))
WhichDataType which(arguments[0]);
if (!which.isStringOrFixedString())
throw Exception("Illegal type " + arguments[0]->getName() + " of argument of function " + getName(),
ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT);
@ -612,6 +613,39 @@ public:
return col_res;
}
else if (const ColumnFixedString * col_fix_string = checkAndGetColumn<ColumnFixedString>(column.get()))
{
auto col_res = ColumnString::create();
ColumnString::Chars & out_vec = col_res->getChars();
ColumnString::Offsets & out_offsets = col_res->getOffsets();
const ColumnString::Chars & in_vec = col_fix_string->getChars();
size_t n = col_fix_string->getN();
size_t size = col_fix_string->size();
out_offsets.resize(size);
out_vec.resize(in_vec.size() / word_size + size);
char * begin = reinterpret_cast<char *>(out_vec.data());
char * pos = begin;
size_t prev_offset = 0;
for (size_t i = 0; i < size; ++i)
{
size_t new_offset = prev_offset + n;
Impl::decode(reinterpret_cast<const char *>(&in_vec[prev_offset]), reinterpret_cast<const char *>(&in_vec[new_offset]), pos);
out_offsets[i] = pos - begin;
prev_offset = new_offset;
}
out_vec.resize(pos - begin);
return col_res;
}
else
{
throw Exception("Illegal column " + arguments[0].column->getName()

View File

@ -20,17 +20,19 @@
#include <Columns/ColumnArray.h>
#include <Columns/ColumnTuple.h>
#include <DataTypes/Serializations/SerializationDecimal.h>
#include <DataTypes/DataTypesNumber.h>
#include <DataTypes/DataTypeLowCardinality.h>
#include <DataTypes/DataTypeString.h>
#include <DataTypes/DataTypesDecimal.h>
#include <DataTypes/DataTypeArray.h>
#include <DataTypes/DataTypeEnum.h>
#include <DataTypes/DataTypeFactory.h>
#include <DataTypes/DataTypeFixedString.h>
#include <DataTypes/DataTypeLowCardinality.h>
#include <DataTypes/DataTypeNothing.h>
#include <DataTypes/DataTypeNullable.h>
#include <DataTypes/DataTypeArray.h>
#include <DataTypes/DataTypeString.h>
#include <DataTypes/DataTypeTuple.h>
#include <DataTypes/DataTypeUUID.h>
#include <DataTypes/DataTypesDecimal.h>
#include <DataTypes/DataTypesNumber.h>
#include <DataTypes/Serializations/SerializationDecimal.h>
#include <Functions/FunctionFactory.h>
#include <Functions/IFunction.h>
@ -720,8 +722,16 @@ public:
return false;
}
auto & col_vec = assert_cast<ColumnVector<NumberType> &>(dest);
col_vec.insertValue(value);
if (dest.getDataType() == TypeIndex::LowCardinality)
{
ColumnLowCardinality & col_low = assert_cast<ColumnLowCardinality &>(dest);
col_low.insertData(reinterpret_cast<const char *>(&value), sizeof(value));
}
else
{
auto & col_vec = assert_cast<ColumnVector<NumberType> &>(dest);
col_vec.insertValue(value);
}
return true;
}
};
@ -825,8 +835,17 @@ public:
return JSONExtractRawImpl<JSONParser>::insertResultToColumn(dest, element, {});
auto str = element.getString();
ColumnString & col_str = assert_cast<ColumnString &>(dest);
col_str.insertData(str.data(), str.size());
if (dest.getDataType() == TypeIndex::LowCardinality)
{
ColumnLowCardinality & col_low = assert_cast<ColumnLowCardinality &>(dest);
col_low.insertData(str.data(), str.size());
}
else
{
ColumnString & col_str = assert_cast<ColumnString &>(dest);
col_str.insertData(str.data(), str.size());
}
return true;
}
};
@ -855,25 +874,41 @@ struct JSONExtractTree
}
};
class LowCardinalityNode : public Node
class LowCardinalityFixedStringNode : public Node
{
public:
LowCardinalityNode(DataTypePtr dictionary_type_, std::unique_ptr<Node> impl_)
: dictionary_type(dictionary_type_), impl(std::move(impl_)) {}
explicit LowCardinalityFixedStringNode(const size_t fixed_length_) : fixed_length(fixed_length_) { }
bool insertResultToColumn(IColumn & dest, const Element & element) override
{
auto from_col = dictionary_type->createColumn();
if (impl->insertResultToColumn(*from_col, element))
// If element is an object we delegate the insertion to JSONExtractRawImpl
if (element.isObject())
return JSONExtractRawImpl<JSONParser>::insertResultToLowCardinalityFixedStringColumn(dest, element, fixed_length);
else if (!element.isString())
return false;
auto str = element.getString();
if (str.size() > fixed_length)
return false;
// For the non-LowCardinality case of FixedString, the padding is done in the FixedString column implementation.
// To avoid passing the data to a FixedString column and reading it back (which would slow down execution),
// the data is padded here and written directly to the LowCardinality column.
if (str.size() == fixed_length)
{
std::string_view value = from_col->getDataAt(0).toView();
assert_cast<ColumnLowCardinality &>(dest).insertData(value.data(), value.size());
return true;
assert_cast<ColumnLowCardinality &>(dest).insertData(str.data(), str.size());
}
return false;
else
{
String padded_str(str);
padded_str.resize(fixed_length, '\0');
assert_cast<ColumnLowCardinality &>(dest).insertData(padded_str.data(), padded_str.size());
}
return true;
}
private:
DataTypePtr dictionary_type;
std::unique_ptr<Node> impl;
const size_t fixed_length;
};
class UUIDNode : public Node
@ -885,7 +920,15 @@ struct JSONExtractTree
return false;
auto uuid = parseFromString<UUID>(element.getString());
assert_cast<ColumnUUID &>(dest).insert(uuid);
if (dest.getDataType() == TypeIndex::LowCardinality)
{
ColumnLowCardinality & col_low = assert_cast<ColumnLowCardinality &>(dest);
col_low.insertData(reinterpret_cast<const char *>(&uuid), sizeof(uuid));
}
else
{
assert_cast<ColumnUUID &>(dest).insert(uuid);
}
return true;
}
};
@ -928,6 +971,7 @@ struct JSONExtractTree
assert_cast<ColumnDecimal<DecimalType> &>(dest).insert(value);
return true;
}
private:
DataTypePtr data_type;
};
@ -946,13 +990,18 @@ struct JSONExtractTree
public:
bool insertResultToColumn(IColumn & dest, const Element & element) override
{
if (!element.isString())
if (element.isNull())
return false;
auto & col_str = assert_cast<ColumnFixedString &>(dest);
if (!element.isString())
return JSONExtractRawImpl<JSONParser>::insertResultToFixedStringColumn(dest, element, {});
auto str = element.getString();
auto & col_str = assert_cast<ColumnFixedString &>(dest);
if (str.size() > col_str.getN())
return false;
col_str.insertData(str.data(), str.size());
return true;
}
};
@ -1178,9 +1227,18 @@ struct JSONExtractTree
case TypeIndex::UUID: return std::make_unique<UUIDNode>();
case TypeIndex::LowCardinality:
{
// The low cardinality case is treated in two different ways:
// For FixedString type, a special class is implemented for inserting the data in the destination column,
// as the string length must be passed in order to check and pad the incoming data.
// For the rest of low cardinality types, the insertion is done in their corresponding class, adapting the data
// as needed for the insertData function of the ColumnLowCardinality.
auto dictionary_type = typeid_cast<const DataTypeLowCardinality *>(type.get())->getDictionaryType();
auto impl = build(function_name, dictionary_type);
return std::make_unique<LowCardinalityNode>(dictionary_type, std::move(impl));
if ((*dictionary_type).getTypeId() == TypeIndex::FixedString)
{
auto fixed_length = typeid_cast<const DataTypeFixedString *>(dictionary_type.get())->getN();
return std::make_unique<LowCardinalityFixedStringNode>(fixed_length);
}
return build(function_name, dictionary_type);
}
case TypeIndex::Decimal256: return std::make_unique<DecimalNode<Decimal256>>(type);
case TypeIndex::Decimal128: return std::make_unique<DecimalNode<Decimal128>>(type);
@ -1332,13 +1390,63 @@ public:
static bool insertResultToColumn(IColumn & dest, const Element & element, std::string_view)
{
ColumnString & col_str = assert_cast<ColumnString &>(dest);
auto & chars = col_str.getChars();
WriteBufferFromVector<ColumnString::Chars> buf(chars, AppendModeTag());
if (dest.getDataType() == TypeIndex::LowCardinality)
{
ColumnString::Chars chars;
WriteBufferFromVector<ColumnString::Chars> buf(chars, AppendModeTag());
traverse(element, buf);
buf.finalize();
assert_cast<ColumnLowCardinality &>(dest).insertData(reinterpret_cast<const char *>(chars.data()), chars.size());
}
else
{
ColumnString & col_str = assert_cast<ColumnString &>(dest);
auto & chars = col_str.getChars();
WriteBufferFromVector<ColumnString::Chars> buf(chars, AppendModeTag());
traverse(element, buf);
buf.finalize();
chars.push_back(0);
col_str.getOffsets().push_back(chars.size());
}
return true;
}
// We use insertResultToFixedStringColumn in case we are inserting raw data in a FixedString column
static bool insertResultToFixedStringColumn(IColumn & dest, const Element & element, std::string_view)
{
ColumnFixedString::Chars chars;
WriteBufferFromVector<ColumnFixedString::Chars> buf(chars, AppendModeTag());
traverse(element, buf);
buf.finalize();
chars.push_back(0);
col_str.getOffsets().push_back(chars.size());
auto & col_str = assert_cast<ColumnFixedString &>(dest);
if (chars.size() > col_str.getN())
return false;
chars.resize_fill(col_str.getN());
col_str.insertData(reinterpret_cast<const char *>(chars.data()), chars.size());
return true;
}
// We use insertResultToLowCardinalityFixedStringColumn in case we are inserting raw data in a Low Cardinality FixedString column
static bool insertResultToLowCardinalityFixedStringColumn(IColumn & dest, const Element & element, size_t fixed_length)
{
if (element.getObject().size() > fixed_length)
return false;
ColumnFixedString::Chars chars;
WriteBufferFromVector<ColumnFixedString::Chars> buf(chars, AppendModeTag());
traverse(element, buf);
buf.finalize();
if (chars.size() > fixed_length)
return false;
chars.resize_fill(fixed_length);
assert_cast<ColumnLowCardinality &>(dest).insertData(reinterpret_cast<const char *>(chars.data()), chars.size());
return true;
}
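
The FixedString paths above share one rule: serialized bytes longer than the column width are rejected, and shorter ones are padded with zero bytes up to the width. A minimal sketch of that rule using only the standard library follows; `padToFixedLengthSketch` is a hypothetical name for illustration and is not part of ClickHouse.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not ClickHouse code: fits `bytes` into exactly `n` bytes.
// Returns false (leaving `bytes` untouched) when the value is too long,
// otherwise zero-pads it to length `n` and returns true.
inline bool padToFixedLengthSketch(std::string & bytes, std::size_t n)
{
    if (bytes.size() > n)
        return false;
    bytes.resize(n, '\0');
    return true;
}
```

A caller would insert the padded bytes only when the function returns true, which mirrors the early `return false` in the functions above.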

View File

@ -182,7 +182,7 @@ REGISTER_FUNCTION(ModuloLegacy)
struct NamePositiveModulo
{
static constexpr auto name = "positive_modulo";
static constexpr auto name = "positiveModulo";
};
using FunctionPositiveModulo = BinaryArithmeticOverloadResolver<PositiveModuloImpl, NamePositiveModulo, false>;
@ -191,11 +191,17 @@ REGISTER_FUNCTION(PositiveModulo)
factory.registerFunction<FunctionPositiveModulo>(
{
R"(
Calculates the remainder when dividing `a` by `b`. Similar to function `modulo` except that `positive_modulo` always return non-negative number.
Calculates the remainder when dividing `a` by `b`. Similar to function `modulo` except that `positiveModulo` always returns a non-negative number.
Returns the difference between `a` and the nearest integer not greater than `a` divisible by `b`.
In other words, the function returns the modulus (modulo) in terms of modular arithmetic.
)",
Documentation::Examples{{"positive_modulo", "SELECT positive_modulo(-1000, 32);"}},
Documentation::Examples{{"positiveModulo", "SELECT positiveModulo(-1, 10);"}},
Documentation::Categories{"Arithmetic"}},
FunctionFactory::CaseInsensitive);
factory.registerAlias("positive_modulo", "positiveModulo", FunctionFactory::CaseInsensitive);
/// Compatibility with Spark:
factory.registerAlias("pmod", "positiveModulo", FunctionFactory::CaseInsensitive);
}
}
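
The documentation string above describes a remainder that is never negative. A minimal integer-only sketch of that behaviour follows, assuming `b > 0`; `positiveModuloSketch` is a hypothetical helper and is not the `PositiveModuloImpl` used by the function registered here.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not ClickHouse's PositiveModuloImpl.
// Assumption: b > 0 (division by zero and negative divisors are not handled).
inline int64_t positiveModuloSketch(int64_t a, int64_t b)
{
    assert(b > 0);
    int64_t r = a % b;        // in C++ the remainder takes the sign of the dividend
    return r < 0 ? r + b : r; // shift negative remainders into [0, b)
}
```

Under these assumptions, `positiveModuloSketch(-1, 10)` yields 9, whereas the plain C++ expression `-1 % 10` yields -1.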

View File

@ -312,15 +312,29 @@ void assertResponseIsOk(const Poco::Net::HTTPRequest & request, Poco::Net::HTTPR
|| status == Poco::Net::HTTPResponse::HTTP_PARTIAL_CONTENT /// Reading with Range header was successful.
|| (isRedirect(status) && allow_redirects)))
{
std::stringstream error_message; // STYLE_CHECK_ALLOW_STD_STRING_STREAM
error_message.exceptions(std::ios::failbit);
error_message << "Received error from remote server " << request.getURI() << ". HTTP status code: " << status << " "
<< response.getReason() << ", body: " << istr.rdbuf();
int code = status == Poco::Net::HTTPResponse::HTTP_TOO_MANY_REQUESTS
? ErrorCodes::RECEIVED_ERROR_TOO_MANY_REQUESTS
: ErrorCodes::RECEIVED_ERROR_FROM_REMOTE_IO_SERVER;
throw Exception(error_message.str(),
status == HTTP_TOO_MANY_REQUESTS ? ErrorCodes::RECEIVED_ERROR_TOO_MANY_REQUESTS
: ErrorCodes::RECEIVED_ERROR_FROM_REMOTE_IO_SERVER);
std::stringstream body; // STYLE_CHECK_ALLOW_STD_STRING_STREAM
body.exceptions(std::ios::failbit);
body << istr.rdbuf();
throw HTTPException(code, request.getURI(), status, response.getReason(), body.str());
}
}
std::string HTTPException::makeExceptionMessage(
const std::string & uri,
Poco::Net::HTTPResponse::HTTPStatus http_status,
const std::string & reason,
const std::string & body)
{
return fmt::format(
"Received error from remote server {}. "
"HTTP status code: {} {}, "
"body: {}",
uri, http_status, reason, body);
}
}
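
The change above replaces a message-only exception with one that also carries the HTTP status, so callers can branch on the status instead of parsing the text. A self-contained sketch of that idea follows. It assumes nothing from ClickHouse or Poco: `HttpErrorSketch` and `isTooManyRequests` are hypothetical names, the status is a plain `int`, and plain string concatenation stands in for `fmt::format`.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stand-in for an exception that remembers the HTTP status.
class HttpErrorSketch : public std::runtime_error
{
public:
    HttpErrorSketch(const std::string & uri, int http_status, const std::string & reason, const std::string & body)
        : std::runtime_error(makeMessage(uri, http_status, reason, body))
        , status(http_status)
    {}

    int getHTTPStatus() const { return status; }

    // Same message layout as in the diff above, built without fmt.
    static std::string makeMessage(const std::string & uri, int http_status, const std::string & reason, const std::string & body)
    {
        return "Received error from remote server " + uri + ". HTTP status code: "
            + std::to_string(http_status) + " " + reason + ", body: " + body;
    }

private:
    int status;
};

// Example of branching on the stored status (429 is "Too Many Requests").
inline bool isTooManyRequests(const HttpErrorSketch & e)
{
    return e.getHTTPStatus() == 429;
}
```

A caller can catch `HttpErrorSketch` and decide, for example, whether to retry, without inspecting the message string.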

View File

@ -17,8 +17,6 @@
namespace DB
{
constexpr int HTTP_TOO_MANY_REQUESTS = 429;
class HTTPServerResponse;
class SingleEndpointHTTPSessionPool : public PoolBase<Poco::Net::HTTPClientSession>
@ -35,6 +33,38 @@ public:
SingleEndpointHTTPSessionPool(const std::string & host_, UInt16 port_, bool https_, size_t max_pool_size_);
};
class HTTPException : public Exception
{
public:
HTTPException(
int code,
const std::string & uri,
Poco::Net::HTTPResponse::HTTPStatus http_status_,
const std::string & reason,
const std::string & body
)
: Exception(makeExceptionMessage(uri, http_status_, reason, body), code)
, http_status(http_status_)
{}
HTTPException * clone() const override { return new HTTPException(*this); }
void rethrow() const override { throw *this; }
int getHTTPStatus() const { return http_status; }
private:
Poco::Net::HTTPResponse::HTTPStatus http_status{};
static std::string makeExceptionMessage(
const std::string & uri,
Poco::Net::HTTPResponse::HTTPStatus http_status,
const std::string & reason,
const std::string & body);
const char * name() const noexcept override { return "DB::HTTPException"; }
const char * className() const noexcept override { return "DB::HTTPException"; }
};
using PooledHTTPSessionPtr = SingleEndpointHTTPSessionPool::Entry;
using HTTPSessionPtr = std::shared_ptr<Poco::Net::HTTPClientSession>;

View File

@ -45,7 +45,7 @@ ReadBufferFromS3::ReadBufferFromS3(
const String & bucket_,
const String & key_,
const String & version_id_,
UInt64 max_single_read_retries_,
const S3Settings::RequestSettings & request_settings_,
const ReadSettings & settings_,
bool use_external_buffer_,
size_t offset_,
@ -56,7 +56,7 @@ ReadBufferFromS3::ReadBufferFromS3(
, bucket(bucket_)
, key(key_)
, version_id(version_id_)
, max_single_read_retries(max_single_read_retries_)
, request_settings(request_settings_)
, offset(offset_)
, read_until_position(read_until_position_)
, read_settings(settings_)
@ -105,7 +105,7 @@ bool ReadBufferFromS3::nextImpl()
}
size_t sleep_time_with_backoff_milliseconds = 100;
for (size_t attempt = 0; (attempt < max_single_read_retries) && !next_result; ++attempt)
for (size_t attempt = 0; attempt < request_settings.max_single_read_retries && !next_result; ++attempt)
{
Stopwatch watch;
try
@ -166,7 +166,7 @@ bool ReadBufferFromS3::nextImpl()
attempt,
e.message());
if (attempt + 1 == max_single_read_retries)
if (attempt + 1 == request_settings.max_single_read_retries)
throw;
/// Pause before next attempt.
@ -349,7 +349,7 @@ SeekableReadBufferPtr ReadBufferS3Factory::getReader()
bucket,
key,
version_id,
s3_max_single_read_retries,
request_settings,
read_settings,
false /*use_external_buffer*/,
next_range->first,

View File

@ -1,6 +1,7 @@
#pragma once
#include <Common/RangeGenerator.h>
#include <Storages/StorageS3Settings.h>
#include "config.h"
#if USE_AWS_S3
@ -33,7 +34,7 @@ private:
String bucket;
String key;
String version_id;
UInt64 max_single_read_retries;
const S3Settings::RequestSettings request_settings;
/// These variables are atomic because they can be used for `logging only`
/// (where it is not important to get consistent result)
@ -52,7 +53,7 @@ public:
const String & bucket_,
const String & key_,
const String & version_id_,
UInt64 max_single_read_retries_,
const S3Settings::RequestSettings & request_settings_,
const ReadSettings & settings_,
bool use_external_buffer = false,
size_t offset_ = 0,
@ -100,7 +101,7 @@ public:
const String & version_id_,
size_t range_step_,
size_t object_size_,
UInt64 s3_max_single_read_retries_,
const S3Settings::RequestSettings & request_settings_,
const ReadSettings & read_settings_)
: client_ptr(client_ptr_)
, bucket(bucket_)
@ -110,7 +111,7 @@ public:
, range_generator(object_size_, range_step_)
, range_step(range_step_)
, object_size(object_size_)
, s3_max_single_read_retries(s3_max_single_read_retries_)
, request_settings(request_settings_)
{
assert(range_step > 0);
assert(range_step < object_size);
@ -135,7 +136,7 @@ private:
size_t range_step;
size_t object_size;
UInt64 s3_max_single_read_retries;
const S3Settings::RequestSettings request_settings;
};
}

View File

@ -11,6 +11,7 @@
#include <Common/logger_useful.h>
#include <Common/Stopwatch.h>
#include <Common/Throttler.h>
#include <IO/HTTPCommon.h>
#include <IO/WriteBufferFromString.h>
#include <IO/Operators.h>
@ -76,12 +77,16 @@ PocoHTTPClientConfiguration::PocoHTTPClientConfiguration(
const RemoteHostFilter & remote_host_filter_,
unsigned int s3_max_redirects_,
bool enable_s3_requests_logging_,
bool for_disk_s3_)
bool for_disk_s3_,
const ThrottlerPtr & get_request_throttler_,
const ThrottlerPtr & put_request_throttler_)
: force_region(force_region_)
, remote_host_filter(remote_host_filter_)
, s3_max_redirects(s3_max_redirects_)
, enable_s3_requests_logging(enable_s3_requests_logging_)
, for_disk_s3(for_disk_s3_)
, get_request_throttler(get_request_throttler_)
, put_request_throttler(put_request_throttler_)
{
}
@ -128,6 +133,8 @@ PocoHTTPClient::PocoHTTPClient(const PocoHTTPClientConfiguration & client_config
, s3_max_redirects(client_configuration.s3_max_redirects)
, enable_s3_requests_logging(client_configuration.enable_s3_requests_logging)
, for_disk_s3(client_configuration.for_disk_s3)
, get_request_throttler(client_configuration.get_request_throttler)
, put_request_throttler(client_configuration.put_request_throttler)
, extra_headers(client_configuration.extra_headers)
{
}
@ -245,6 +252,23 @@ void PocoHTTPClient::makeRequestInternal(
if (enable_s3_requests_logging)
LOG_TEST(log, "Make request to: {}", uri);
switch (request.GetMethod())
{
case Aws::Http::HttpMethod::HTTP_GET:
case Aws::Http::HttpMethod::HTTP_HEAD:
if (get_request_throttler)
get_request_throttler->add(1);
break;
case Aws::Http::HttpMethod::HTTP_PUT:
case Aws::Http::HttpMethod::HTTP_POST:
case Aws::Http::HttpMethod::HTTP_PATCH:
if (put_request_throttler)
put_request_throttler->add(1);
break;
case Aws::Http::HttpMethod::HTTP_DELETE:
break; // Not throttled
}
addMetric(request, S3MetricType::Count);
CurrentMetrics::Increment metric_increment{CurrentMetrics::S3Requests};

View File

@ -8,6 +8,7 @@
#if USE_AWS_S3
#include <Common/RemoteHostFilter.h>
#include <Common/Throttler_fwd.h>
#include <IO/ConnectionTimeouts.h>
#include <IO/HTTPCommon.h>
#include <IO/S3/SessionAwareIOStream.h>
@ -48,6 +49,8 @@ struct PocoHTTPClientConfiguration : public Aws::Client::ClientConfiguration
unsigned int s3_max_redirects;
bool enable_s3_requests_logging;
bool for_disk_s3;
ThrottlerPtr get_request_throttler;
ThrottlerPtr put_request_throttler;
HeaderCollection extra_headers;
void updateSchemeAndRegion();
@ -60,7 +63,9 @@ private:
const RemoteHostFilter & remote_host_filter_,
unsigned int s3_max_redirects_,
bool enable_s3_requests_logging_,
bool for_disk_s3_
bool for_disk_s3_,
const ThrottlerPtr & get_request_throttler_,
const ThrottlerPtr & put_request_throttler_
);
/// Constructor of Aws::Client::ClientConfiguration must be called after AWS SDK initialization.
@ -154,6 +159,16 @@ private:
unsigned int s3_max_redirects;
bool enable_s3_requests_logging;
bool for_disk_s3;
/// Limits the per-second rate of GET, SELECT and all other requests, excluding those throttled by the put throttler
/// (i.e. throttles GetObject, HeadObject)
ThrottlerPtr get_request_throttler;
/// Limits the per-second rate of PUT, COPY, POST and LIST requests
/// (i.e. throttles PutObject, CopyObject, ListObjects, CreateMultipartUpload, UploadPartCopy, UploadPart, CompleteMultipartUpload)
/// NOTE: DELETE and CANCEL requests are not throttled by either the put or the get throttler
ThrottlerPtr put_request_throttler;
const HeaderCollection extra_headers;
};

Some files were not shown because too many files have changed in this diff