Merge remote-tracking branch 'rschu1ze/master' into stabilize-row-ordering

This commit is contained in:
Robert Schulze 2024-06-10 16:53:19 +00:00
commit a482b3efea
No known key found for this signature in database
GPG Key ID: 26703B55FB13728A
71 changed files with 736 additions and 282 deletions

contrib/cld2 vendored

@ -1 +1 @@
Subproject commit bc6d493a2f64ed1fc1c4c4b4294a542a04e04217
Subproject commit 217ba8b8805b41557faadaa47bb6e99f2242eea3

View File

@ -0,0 +1,101 @@
---
sidebar_position: 1
sidebar_label: 2024
---
# 2024 Changelog
### ClickHouse release v24.4.2.141-stable (9e23d27bd11) FIXME as compared to v24.4.1.2088-stable (6d4b31322d1)
#### Improvement
* Backported in [#63467](https://github.com/ClickHouse/ClickHouse/issues/63467): Make rabbitmq nack broken messages. Closes [#45350](https://github.com/ClickHouse/ClickHouse/issues/45350). [#60312](https://github.com/ClickHouse/ClickHouse/pull/60312) ([Kseniia Sumarokova](https://github.com/kssenii)).
#### Build/Testing/Packaging Improvement
* Backported in [#63612](https://github.com/ClickHouse/ClickHouse/issues/63612): The Dockerfile is reviewed by the docker official library in https://github.com/docker-library/official-images/pull/15846. [#63400](https://github.com/ClickHouse/ClickHouse/pull/63400) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
#### Bug Fix (user-visible misbehavior in an official stable release)
* Backported in [#64279](https://github.com/ClickHouse/ClickHouse/issues/64279): Fix queries with FINAL giving wrong results when the table does not use adaptive granularity. [#62432](https://github.com/ClickHouse/ClickHouse/pull/62432) ([Duc Canh Le](https://github.com/canhld94)).
* Backported in [#63295](https://github.com/ClickHouse/ClickHouse/issues/63295): Fix crash with untuple and unresolved lambda. [#63131](https://github.com/ClickHouse/ClickHouse/pull/63131) ([Raúl Marín](https://github.com/Algunenano)).
* Backported in [#63978](https://github.com/ClickHouse/ClickHouse/issues/63978): Fix intersecting parts when restarting after a drop range. [#63202](https://github.com/ClickHouse/ClickHouse/pull/63202) ([Han Fei](https://github.com/hanfei1991)).
* Backported in [#63413](https://github.com/ClickHouse/ClickHouse/issues/63413): Fix a misbehavior where SQL security defaults were not loaded for old tables during server startup. [#63209](https://github.com/ClickHouse/ClickHouse/pull/63209) ([pufit](https://github.com/pufit)).
* Backported in [#63388](https://github.com/ClickHouse/ClickHouse/issues/63388): Fix JOIN filter push-down for filled JOIN. Closes [#63228](https://github.com/ClickHouse/ClickHouse/issues/63228). [#63234](https://github.com/ClickHouse/ClickHouse/pull/63234) ([Maksim Kita](https://github.com/kitaisreal)).
* Backported in [#63618](https://github.com/ClickHouse/ClickHouse/issues/63618): Fix bug which could potentially lead to rare LOGICAL_ERROR during SELECT query with message: `Unexpected return type from materialize. Expected type_XXX. Got type_YYY.` Introduced in [#59379](https://github.com/ClickHouse/ClickHouse/issues/59379). [#63353](https://github.com/ClickHouse/ClickHouse/pull/63353) ([alesapin](https://github.com/alesapin)).
* Backported in [#63451](https://github.com/ClickHouse/ClickHouse/issues/63451): Fix `X-ClickHouse-Timezone` header returning wrong timezone when using `session_timezone` as query level setting. [#63377](https://github.com/ClickHouse/ClickHouse/pull/63377) ([Andrey Zvonov](https://github.com/zvonand)).
* Backported in [#63605](https://github.com/ClickHouse/ClickHouse/issues/63605): Fix backup of projection part in case projection was removed from table metadata, but part still has projection. [#63426](https://github.com/ClickHouse/ClickHouse/pull/63426) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Backported in [#63510](https://github.com/ClickHouse/ClickHouse/issues/63510): Fix 'Every derived table must have its own alias' error for MySQL dictionary source, closes [#63341](https://github.com/ClickHouse/ClickHouse/issues/63341). [#63481](https://github.com/ClickHouse/ClickHouse/pull/63481) ([vdimir](https://github.com/vdimir)).
* Backported in [#63592](https://github.com/ClickHouse/ClickHouse/issues/63592): Avoid segfault in `MergeTreePrefetchedReadPool` while fetching projection parts. [#63513](https://github.com/ClickHouse/ClickHouse/pull/63513) ([Antonio Andelic](https://github.com/antonio2368)).
* Backported in [#63750](https://github.com/ClickHouse/ClickHouse/issues/63750): Read only the necessary columns from VIEW (new analyzer). Closes [#62594](https://github.com/ClickHouse/ClickHouse/issues/62594). [#63688](https://github.com/ClickHouse/ClickHouse/pull/63688) ([Maksim Kita](https://github.com/kitaisreal)).
* Backported in [#63772](https://github.com/ClickHouse/ClickHouse/issues/63772): Fix [#63539](https://github.com/ClickHouse/ClickHouse/issues/63539). Forbid WINDOW redefinition in new analyzer. [#63694](https://github.com/ClickHouse/ClickHouse/pull/63694) ([Dmitry Novik](https://github.com/novikd)).
* Backported in [#63872](https://github.com/ClickHouse/ClickHouse/issues/63872): Fix `flatten_nested` being broken with Replicated databases. [#63695](https://github.com/ClickHouse/ClickHouse/pull/63695) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Backported in [#63854](https://github.com/ClickHouse/ClickHouse/issues/63854): Fix `Not found column` and `CAST AS Map from array requires nested tuple of 2 elements` exceptions for distributed queries which use `Map(Nothing, Nothing)` type. Fixes [#63637](https://github.com/ClickHouse/ClickHouse/issues/63637). [#63753](https://github.com/ClickHouse/ClickHouse/pull/63753) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Backported in [#63847](https://github.com/ClickHouse/ClickHouse/issues/63847): Fix possible `ILLEGAL_COLUMN` error in `partial_merge` join, close [#37928](https://github.com/ClickHouse/ClickHouse/issues/37928). [#63755](https://github.com/ClickHouse/ClickHouse/pull/63755) ([vdimir](https://github.com/vdimir)).
* Backported in [#63908](https://github.com/ClickHouse/ClickHouse/issues/63908): Fix `query_plan_remove_redundant_distinct` breaking queries with WINDOW FUNCTIONS (when `allow_experimental_analyzer` is on). Fixes [#62820](https://github.com/ClickHouse/ClickHouse/issues/62820). [#63776](https://github.com/ClickHouse/ClickHouse/pull/63776) ([Igor Nikonov](https://github.com/devcrafter)).
* Backported in [#63955](https://github.com/ClickHouse/ClickHouse/issues/63955): Fix possible crash with SYSTEM UNLOAD PRIMARY KEY. [#63778](https://github.com/ClickHouse/ClickHouse/pull/63778) ([Raúl Marín](https://github.com/Algunenano)).
* Backported in [#63938](https://github.com/ClickHouse/ClickHouse/issues/63938): Allow JOIN filter push-down to both streams if only a single equivalent column is used in the query. Closes [#63799](https://github.com/ClickHouse/ClickHouse/issues/63799). [#63819](https://github.com/ClickHouse/ClickHouse/pull/63819) ([Maksim Kita](https://github.com/kitaisreal)).
* Backported in [#63991](https://github.com/ClickHouse/ClickHouse/issues/63991): Fix incorrect select query result when parallel replicas were used to read from a Materialized View. [#63861](https://github.com/ClickHouse/ClickHouse/pull/63861) ([Nikita Taranov](https://github.com/nickitat)).
* Backported in [#64033](https://github.com/ClickHouse/ClickHouse/issues/64033): Fix an error `Database name is empty` for remote queries with lambdas over a cluster with a modified default database. Fixes [#63471](https://github.com/ClickHouse/ClickHouse/issues/63471). [#63864](https://github.com/ClickHouse/ClickHouse/pull/63864) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Backported in [#64561](https://github.com/ClickHouse/ClickHouse/issues/64561): Fix SIGSEGV due to the CPU/Real (`query_profiler_real_time_period_ns`/`query_profiler_cpu_time_period_ns`) profiler (an issue since 2022 that leads to periodic server crashes, especially when using the Distributed engine). [#63865](https://github.com/ClickHouse/ClickHouse/pull/63865) ([Azat Khuzhin](https://github.com/azat)).
* Backported in [#64011](https://github.com/ClickHouse/ClickHouse/issues/64011): Fix analyzer: make the IN function with arbitrarily deep sub-selects in materialized views use the insertion block. [#63930](https://github.com/ClickHouse/ClickHouse/pull/63930) ([Yakov Olkhovskiy](https://github.com/yakov-olkhovskiy)).
* Backported in [#64238](https://github.com/ClickHouse/ClickHouse/issues/64238): Fix resolve of unqualified COLUMNS matcher. Preserve the input columns order and forbid usage of unknown identifiers. [#63962](https://github.com/ClickHouse/ClickHouse/pull/63962) ([Dmitry Novik](https://github.com/novikd)).
* Backported in [#64103](https://github.com/ClickHouse/ClickHouse/issues/64103): Deserialize untrusted binary inputs in a safer way. [#64024](https://github.com/ClickHouse/ClickHouse/pull/64024) ([Robert Schulze](https://github.com/rschu1ze)).
* Backported in [#64170](https://github.com/ClickHouse/ClickHouse/issues/64170): Add missing settings to recoverLostReplica. [#64040](https://github.com/ClickHouse/ClickHouse/pull/64040) ([Raúl Marín](https://github.com/Algunenano)).
* Backported in [#64322](https://github.com/ClickHouse/ClickHouse/issues/64322): This fix uses a properly redefined context with the correct definer for each individual view in the query pipeline. Closes [#63777](https://github.com/ClickHouse/ClickHouse/issues/63777). [#64079](https://github.com/ClickHouse/ClickHouse/pull/64079) ([pufit](https://github.com/pufit)).
* Backported in [#64382](https://github.com/ClickHouse/ClickHouse/issues/64382): Fix analyzer: "Not found column" error when using INTERPOLATE. [#64096](https://github.com/ClickHouse/ClickHouse/pull/64096) ([Yakov Olkhovskiy](https://github.com/yakov-olkhovskiy)).
* Backported in [#64568](https://github.com/ClickHouse/ClickHouse/issues/64568): Fix creating backups to S3 buckets with different credentials from the disk containing the file. [#64153](https://github.com/ClickHouse/ClickHouse/pull/64153) ([Antonio Andelic](https://github.com/antonio2368)).
* Backported in [#64272](https://github.com/ClickHouse/ClickHouse/issues/64272): Prevent LOGICAL_ERROR on CREATE TABLE as MaterializedView. [#64174](https://github.com/ClickHouse/ClickHouse/pull/64174) ([Raúl Marín](https://github.com/Algunenano)).
* Backported in [#64330](https://github.com/ClickHouse/ClickHouse/issues/64330): The query cache now considers two identical queries against different databases as different. The previous behavior could be used to bypass missing privileges to read from a table. [#64199](https://github.com/ClickHouse/ClickHouse/pull/64199) ([Robert Schulze](https://github.com/rschu1ze)).
* Backported in [#64254](https://github.com/ClickHouse/ClickHouse/issues/64254): Ignore `text_log` config when using Keeper. [#64218](https://github.com/ClickHouse/ClickHouse/pull/64218) ([Antonio Andelic](https://github.com/antonio2368)).
* Backported in [#64690](https://github.com/ClickHouse/ClickHouse/issues/64690): Fix Query Tree size validation. Closes [#63701](https://github.com/ClickHouse/ClickHouse/issues/63701). [#64377](https://github.com/ClickHouse/ClickHouse/pull/64377) ([Dmitry Novik](https://github.com/novikd)).
* Backported in [#64409](https://github.com/ClickHouse/ClickHouse/issues/64409): Fix `Logical error: Bad cast` for `Buffer` table with `PREWHERE`. Fixes [#64172](https://github.com/ClickHouse/ClickHouse/issues/64172). [#64388](https://github.com/ClickHouse/ClickHouse/pull/64388) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Backported in [#64727](https://github.com/ClickHouse/ClickHouse/issues/64727): Fixed `CREATE TABLE AS` queries for tables with default expressions. [#64455](https://github.com/ClickHouse/ClickHouse/pull/64455) ([Anton Popov](https://github.com/CurtizJ)).
* Backported in [#64623](https://github.com/ClickHouse/ClickHouse/issues/64623): Fix an error `Cannot find column` in distributed queries with constant CTE in the `GROUP BY` key. [#64519](https://github.com/ClickHouse/ClickHouse/pull/64519) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Backported in [#64680](https://github.com/ClickHouse/ClickHouse/issues/64680): Fix [#64612](https://github.com/ClickHouse/ClickHouse/issues/64612). Do not rewrite aggregation if `-If` combinator is already used. [#64638](https://github.com/ClickHouse/ClickHouse/pull/64638) ([Dmitry Novik](https://github.com/novikd)).
* Backported in [#64942](https://github.com/ClickHouse/ClickHouse/issues/64942): Fix OrderByLimitByDuplicateEliminationVisitor across subqueries. [#64766](https://github.com/ClickHouse/ClickHouse/pull/64766) ([Raúl Marín](https://github.com/Algunenano)).
* Backported in [#64871](https://github.com/ClickHouse/ClickHouse/issues/64871): Fixed possible incorrect memory tracking in several kinds of queries: queries that read any data from S3, queries via the HTTP protocol, and asynchronous inserts. [#64844](https://github.com/ClickHouse/ClickHouse/pull/64844) ([Anton Popov](https://github.com/CurtizJ)).
#### CI Fix or Improvement (changelog entry is not required)
* Backported in [#63364](https://github.com/ClickHouse/ClickHouse/issues/63364): Implement cumulative A Sync status. [#61464](https://github.com/ClickHouse/ClickHouse/pull/61464) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Backported in [#63338](https://github.com/ClickHouse/ClickHouse/issues/63338): Use `/commit/` to have the URLs in [reports](https://play.clickhouse.com/play?user=play#c2VsZWN0IGRpc3RpbmN0IGNvbW1pdF91cmwgZnJvbSBjaGVja3Mgd2hlcmUgY2hlY2tfc3RhcnRfdGltZSA+PSBub3coKSAtIGludGVydmFsIDEgbW9udGggYW5kIHB1bGxfcmVxdWVzdF9udW1iZXI9NjA1MzI=) like https://github.com/ClickHouse/ClickHouse/commit/44f8bc5308b53797bec8cccc3bd29fab8a00235d and not like https://github.com/ClickHouse/ClickHouse/commits/44f8bc5308b53797bec8cccc3bd29fab8a00235d. [#63331](https://github.com/ClickHouse/ClickHouse/pull/63331) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Backported in [#63376](https://github.com/ClickHouse/ClickHouse/issues/63376):. [#63366](https://github.com/ClickHouse/ClickHouse/pull/63366) ([Aleksei Filatov](https://github.com/aalexfvk)).
* Backported in [#63571](https://github.com/ClickHouse/ClickHouse/issues/63571):. [#63551](https://github.com/ClickHouse/ClickHouse/pull/63551) ([Konstantin Bogdanov](https://github.com/thevar1able)).
* Backported in [#63651](https://github.com/ClickHouse/ClickHouse/issues/63651): Fix 02362_part_log_merge_algorithm flaky test. [#63635](https://github.com/ClickHouse/ClickHouse/pull/63635) ([Miсhael Stetsyuk](https://github.com/mstetsyuk)).
* Backported in [#63828](https://github.com/ClickHouse/ClickHouse/issues/63828): Fix test_odbc_interaction for aarch64. [#61457](https://github.com/ClickHouse/ClickHouse/issues/61457). [#63787](https://github.com/ClickHouse/ClickHouse/pull/63787) ([alesapin](https://github.com/alesapin)).
* Backported in [#63897](https://github.com/ClickHouse/ClickHouse/issues/63897): Fix test `test_catboost_evaluate` for aarch64. [#61457](https://github.com/ClickHouse/ClickHouse/issues/61457). [#63789](https://github.com/ClickHouse/ClickHouse/pull/63789) ([alesapin](https://github.com/alesapin)).
* Backported in [#63889](https://github.com/ClickHouse/ClickHouse/issues/63889): Remove HDFS from disks config for one integration test for arm. [#61457](https://github.com/ClickHouse/ClickHouse/issues/61457). [#63832](https://github.com/ClickHouse/ClickHouse/pull/63832) ([alesapin](https://github.com/alesapin)).
* Backported in [#63881](https://github.com/ClickHouse/ClickHouse/issues/63881): Bump version for old image in test_short_strings_aggregation to make it work on arm. [#61457](https://github.com/ClickHouse/ClickHouse/issues/61457). [#63836](https://github.com/ClickHouse/ClickHouse/pull/63836) ([alesapin](https://github.com/alesapin)).
* Backported in [#63919](https://github.com/ClickHouse/ClickHouse/issues/63919): Disable test `test_non_default_compression/test.py::test_preconfigured_deflateqpl_codec` on arm. [#61457](https://github.com/ClickHouse/ClickHouse/issues/61457). [#63839](https://github.com/ClickHouse/ClickHouse/pull/63839) ([alesapin](https://github.com/alesapin)).
* Backported in [#63971](https://github.com/ClickHouse/ClickHouse/issues/63971): Fix 02124_insert_deduplication_token_multiple_blocks. [#63950](https://github.com/ClickHouse/ClickHouse/pull/63950) ([Han Fei](https://github.com/hanfei1991)).
* Backported in [#64049](https://github.com/ClickHouse/ClickHouse/issues/64049): Add `ClickHouseVersion.copy` method. Create a branch release in advance without spinning out the release to increase the stability. [#64039](https://github.com/ClickHouse/ClickHouse/pull/64039) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Backported in [#64078](https://github.com/ClickHouse/ClickHouse/issues/64078): The mime type is not 100% reliable for Python and shell scripts without shebangs; add a check for file extension. [#64062](https://github.com/ClickHouse/ClickHouse/pull/64062) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Backported in [#64161](https://github.com/ClickHouse/ClickHouse/issues/64161): Add retries in git submodule update. [#64125](https://github.com/ClickHouse/ClickHouse/pull/64125) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
#### Critical Bug Fix (crash, LOGICAL_ERROR, data loss, RBAC)
* Backported in [#64589](https://github.com/ClickHouse/ClickHouse/issues/64589): Disabled `enable_vertical_final` setting by default. This feature should not be used because it has a bug: [#64543](https://github.com/ClickHouse/ClickHouse/issues/64543). [#64544](https://github.com/ClickHouse/ClickHouse/pull/64544) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Backported in [#64880](https://github.com/ClickHouse/ClickHouse/issues/64880): Fixes an error where a user in a specific situation could escalate their privileges on the default database without the necessary grants. [#64769](https://github.com/ClickHouse/ClickHouse/pull/64769) ([pufit](https://github.com/pufit)).
#### NO CL CATEGORY
* Backported in [#63306](https://github.com/ClickHouse/ClickHouse/issues/63306):. [#63297](https://github.com/ClickHouse/ClickHouse/pull/63297) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Backported in [#63710](https://github.com/ClickHouse/ClickHouse/issues/63710):. [#63415](https://github.com/ClickHouse/ClickHouse/pull/63415) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
#### NO CL ENTRY
* NO CL ENTRY: 'Revert "Backport [#64363](https://github.com/ClickHouse/ClickHouse/issues/64363) to 24.4: Split tests 03039_dynamic_all_merge_algorithms to avoid timeouts"'. [#64905](https://github.com/ClickHouse/ClickHouse/pull/64905) ([Raúl Marín](https://github.com/Algunenano)).
#### NOT FOR CHANGELOG / INSIGNIFICANT
* group_by_use_nulls strikes back [#62922](https://github.com/ClickHouse/ClickHouse/pull/62922) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Add `FROM` keyword to `TRUNCATE ALL TABLES` [#63241](https://github.com/ClickHouse/ClickHouse/pull/63241) ([Yarik Briukhovetskyi](https://github.com/yariks5s)).
* More checks for concurrently deleted files and dirs in system.remote_data_paths [#63274](https://github.com/ClickHouse/ClickHouse/pull/63274) ([Alexander Gololobov](https://github.com/davenger)).
* Try fix segfault in `MergeTreeReadPoolBase::createTask` [#63323](https://github.com/ClickHouse/ClickHouse/pull/63323) ([Antonio Andelic](https://github.com/antonio2368)).
* Skip unaccessible table dirs in system.remote_data_paths [#63330](https://github.com/ClickHouse/ClickHouse/pull/63330) ([Alexander Gololobov](https://github.com/davenger)).
* Workaround for `oklch()` inside canvas bug for firefox [#63404](https://github.com/ClickHouse/ClickHouse/pull/63404) ([Sergei Trifonov](https://github.com/serxa)).
* Cancel S3 reads properly when parallel reads are used [#63687](https://github.com/ClickHouse/ClickHouse/pull/63687) ([Antonio Andelic](https://github.com/antonio2368)).
* Userspace page cache: don't collect stats if cache is unused [#63730](https://github.com/ClickHouse/ClickHouse/pull/63730) ([Michael Kolupaev](https://github.com/al13n321)).
* Fix sanitizers [#64090](https://github.com/ClickHouse/ClickHouse/pull/64090) ([Azat Khuzhin](https://github.com/azat)).
* Split tests 03039_dynamic_all_merge_algorithms to avoid timeouts [#64363](https://github.com/ClickHouse/ClickHouse/pull/64363) ([Kruglov Pavel](https://github.com/Avogar)).
* CI: Critical bugfix category in PR template [#64480](https://github.com/ClickHouse/ClickHouse/pull/64480) ([Max K.](https://github.com/maxknv)).

View File

@ -54,6 +54,7 @@ SELECT * FROM test_table;
- `_path` — Path to the file. Type: `LowCardinality(String)`.
- `_file` — Name of the file. Type: `LowCardinality(String)`.
- `_size` — Size of the file in bytes. Type: `Nullable(UInt64)`. If the size is unknown, the value is `NULL`.
- `_time` — Last modified time of the file. Type: `Nullable(DateTime)`. If the time is unknown, the value is `NULL`.
## See also

View File

@ -235,6 +235,7 @@ libhdfs3 support HDFS namenode HA.
- `_path` — Path to the file. Type: `LowCardinality(String)`.
- `_file` — Name of the file. Type: `LowCardinality(String)`.
- `_size` — Size of the file in bytes. Type: `Nullable(UInt64)`. If the size is unknown, the value is `NULL`.
- `_time` — Last modified time of the file. Type: `Nullable(DateTime)`. If the time is unknown, the value is `NULL`.
## Storage Settings {#storage-settings}

View File

@ -145,6 +145,7 @@ Code: 48. DB::Exception: Received from localhost:9000. DB::Exception: Reading fr
- `_path` — Path to the file. Type: `LowCardinality(String)`.
- `_file` — Name of the file. Type: `LowCardinality(String)`.
- `_size` — Size of the file in bytes. Type: `Nullable(UInt64)`. If the size is unknown, the value is `NULL`.
- `_time` — Last modified time of the file. Type: `Nullable(DateTime)`. If the time is unknown, the value is `NULL`.
For more information about virtual columns see [here](../../../engines/table-engines/index.md#table_engines-virtual_columns).

View File

@ -102,7 +102,7 @@ Type of the rule `DELETE|TO DISK 'xxx'|TO VOLUME 'xxx'|GROUP BY` specifies an ac
For more details, see [TTL for columns and tables](#table_engine-mergetree-ttl)
#### Settings
#### SETTINGS
See [MergeTree Settings](../../../operations/settings/merge-tree-settings.md).
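
The TTL rule types referenced in the context above (`DELETE`, `TO DISK`, `TO VOLUME`, `GROUP BY`) can be combined in a single table definition. A minimal sketch, with a hypothetical table and a hypothetical storage policy `tiered` that exposes a volume named `cold`:

```sql
-- Hypothetical table; assumes a storage policy 'tiered' providing a 'cold' volume.
CREATE TABLE ttl_demo
(
    event_date Date,
    user_id UInt64,
    payload String
)
ENGINE = MergeTree
ORDER BY (event_date, user_id)
TTL event_date + INTERVAL 1 MONTH TO VOLUME 'cold',  -- move old parts to the cold volume
    event_date + INTERVAL 1 YEAR DELETE               -- drop rows older than a year
SETTINGS storage_policy = 'tiered', merge_with_ttl_timeout = 3600;
```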

View File

@ -102,6 +102,7 @@ For partitioning by month, use the `toYYYYMM(date_column)` expression, where `da
- `_path` — Path to the file. Type: `LowCardinality(String)`.
- `_file` — Name of the file. Type: `LowCardinality(String)`.
- `_size` — Size of the file in bytes. Type: `Nullable(UInt64)`. If the size is unknown, the value is `NULL`.
- `_time` — Last modified time of the file. Type: `Nullable(DateTime)`. If the time is unknown, the value is `NULL`.
## Settings {#settings}

View File

@ -108,6 +108,7 @@ For partitioning by month, use the `toYYYYMM(date_column)` expression, where `da
- `_path` — Path to the `URL`. Type: `LowCardinality(String)`.
- `_file` — Resource name of the `URL`. Type: `LowCardinality(String)`.
- `_size` — Size of the resource in bytes. Type: `Nullable(UInt64)`. If the size is unknown, the value is `NULL`.
- `_time` — Last modified time of the file. Type: `Nullable(DateTime)`. If the time is unknown, the value is `NULL`.
## Storage Settings {#storage-settings}

View File

@ -72,6 +72,7 @@ SELECT count(*) FROM azureBlobStorage('DefaultEndpointsProtocol=https;AccountNam
- `_path` — Path to the file. Type: `LowCardinality(String)`.
- `_file` — Name of the file. Type: `LowCardinality(String)`.
- `_size` — Size of the file in bytes. Type: `Nullable(UInt64)`. If the file size is unknown, the value is `NULL`.
- `_time` — Last modified time of the file. Type: `Nullable(DateTime)`. If the time is unknown, the value is `NULL`.
**See Also**

View File

@ -196,6 +196,7 @@ SELECT count(*) FROM file('big_dir/**/file002', 'CSV', 'name String, value UInt3
- `_path` — Path to the file. Type: `LowCardinality(String)`.
- `_file` — Name of the file. Type: `LowCardinality(String)`.
- `_size` — Size of the file in bytes. Type: `Nullable(UInt64)`. If the file size is unknown, the value is `NULL`.
- `_time` — Last modified time of the file. Type: `Nullable(DateTime)`. If the time is unknown, the value is `NULL`.
## Settings {#settings}
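
A short illustrative query over the virtual columns listed above, reusing the `big_dir/**/file002` layout from the documentation example (column types assumed); `_time`, added in this commit, is `Nullable(DateTime)` and is `NULL` when the modification time is unknown:

```sql
-- Illustrative only: shows the per-file metadata alongside the row count per file.
SELECT _path, _file, _size, _time, count() AS rows
FROM file('big_dir/**/file002', 'CSV', 'name String, value UInt32')
GROUP BY _path, _file, _size, _time;
```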

View File

@ -97,6 +97,7 @@ FROM hdfs('hdfs://hdfs1:9000/big_dir/file{0..9}{0..9}{0..9}', 'CSV', 'name Strin
- `_path` — Path to the file. Type: `LowCardinality(String)`.
- `_file` — Name of the file. Type: `LowCardinality(String)`.
- `_size` — Size of the file in bytes. Type: `Nullable(UInt64)`. If the size is unknown, the value is `NULL`.
- `_time` — Last modified time of the file. Type: `Nullable(DateTime)`. If the time is unknown, the value is `NULL`.
## Storage Settings {#storage-settings}

View File

@ -272,6 +272,7 @@ FROM s3(
- `_path` — Path to the file. Type: `LowCardinality(String)`.
- `_file` — Name of the file. Type: `LowCardinality(String)`.
- `_size` — Size of the file in bytes. Type: `Nullable(UInt64)`. If the file size is unknown, the value is `NULL`.
- `_time` — Last modified time of the file. Type: `Nullable(DateTime)`. If the time is unknown, the value is `NULL`.
## Storage Settings {#storage-settings}

View File

@ -53,6 +53,7 @@ Character `|` inside patterns is used to specify failover addresses. They are it
- `_path` — Path to the `URL`. Type: `LowCardinality(String)`.
- `_file` — Resource name of the `URL`. Type: `LowCardinality(String)`.
- `_size` — Size of the resource in bytes. Type: `Nullable(UInt64)`. If the size is unknown, the value is `NULL`.
- `_time` — Last modified time of the file. Type: `Nullable(DateTime)`. If the time is unknown, the value is `NULL`.
## Storage Settings {#storage-settings}
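
A sketch of the failover pattern described above, with hypothetical host names; the alternatives separated by `|` inside the braces are tried in the order listed:

```sql
-- Hypothetical hosts: replica2 is only contacted if replica1 fails.
SELECT *
FROM url('https://{replica1|replica2}.example.com/data.csv', 'CSV', 'id UInt32, name String')
LIMIT 10;
```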

View File

@ -773,7 +773,27 @@ try
LOG_INFO(log, "Available CPU instruction sets: {}", cpu_info);
#endif
bool will_have_trace_collector = hasPHDRCache() && config().has("trace_log");
bool has_trace_collector = false;
/// Disable it if we collect test coverage information, because it will work extremely slow.
#if !WITH_COVERAGE
/// Profilers cannot work reliably with any other libunwind or without PHDR cache.
has_trace_collector = hasPHDRCache() && config().has("trace_log");
#endif
/// Describe multiple reasons when query profiler cannot work.
#if WITH_COVERAGE
LOG_INFO(log, "Query Profiler and TraceCollector are disabled because they work extremely slow with test coverage.");
#endif
#if defined(SANITIZER)
LOG_INFO(log, "Query Profiler disabled because they cannot work under sanitizers"
" when two different stack unwinding methods will interfere with each other.");
#endif
if (!hasPHDRCache())
LOG_INFO(log, "Query Profiler and TraceCollector are disabled because they require PHDR cache to be created"
" (otherwise the function 'dl_iterate_phdr' is not lock free and not async-signal safe).");
// Initialize global thread pool. Do it before we fetch configs from zookeeper
// nodes (`from_zk`), because ZooKeeper interface uses the pool. We will
@ -782,8 +802,27 @@ try
server_settings.max_thread_pool_size,
server_settings.max_thread_pool_free_size,
server_settings.thread_pool_queue_size,
will_have_trace_collector ? server_settings.global_profiler_real_time_period_ns : 0,
will_have_trace_collector ? server_settings.global_profiler_cpu_time_period_ns : 0);
has_trace_collector ? server_settings.global_profiler_real_time_period_ns : 0,
has_trace_collector ? server_settings.global_profiler_cpu_time_period_ns : 0);
if (has_trace_collector)
{
global_context->createTraceCollector();
/// Set up server-wide memory profiler (for total memory tracker).
if (server_settings.total_memory_profiler_step)
total_memory_tracker.setProfilerStep(server_settings.total_memory_profiler_step);
if (server_settings.total_memory_tracker_sample_probability > 0.0)
total_memory_tracker.setSampleProbability(server_settings.total_memory_tracker_sample_probability);
if (server_settings.total_memory_profiler_sample_min_allocation_size)
total_memory_tracker.setSampleMinAllocationSize(server_settings.total_memory_profiler_sample_min_allocation_size);
if (server_settings.total_memory_profiler_sample_max_allocation_size)
total_memory_tracker.setSampleMaxAllocationSize(server_settings.total_memory_profiler_sample_max_allocation_size);
}
/// Wait for all threads to avoid possible use-after-free (for example logging objects can be already destroyed).
SCOPE_EXIT({
Stopwatch watch;
@ -1950,52 +1989,9 @@ try
LOG_DEBUG(log, "Loaded metadata.");
/// Init trace collector only after trace_log system table was created
/// Disable it if we collect test coverage information, because it will work extremely slow.
#if !WITH_COVERAGE
/// Profilers cannot work reliably with any other libunwind or without PHDR cache.
if (hasPHDRCache())
{
if (has_trace_collector)
global_context->initializeTraceCollector();
/// Set up server-wide memory profiler (for total memory tracker).
if (server_settings.total_memory_profiler_step)
{
total_memory_tracker.setProfilerStep(server_settings.total_memory_profiler_step);
}
if (server_settings.total_memory_tracker_sample_probability > 0.0)
{
total_memory_tracker.setSampleProbability(server_settings.total_memory_tracker_sample_probability);
}
if (server_settings.total_memory_profiler_sample_min_allocation_size)
{
total_memory_tracker.setSampleMinAllocationSize(server_settings.total_memory_profiler_sample_min_allocation_size);
}
if (server_settings.total_memory_profiler_sample_max_allocation_size)
{
total_memory_tracker.setSampleMaxAllocationSize(server_settings.total_memory_profiler_sample_max_allocation_size);
}
}
#endif
/// Describe multiple reasons when query profiler cannot work.
#if WITH_COVERAGE
LOG_INFO(log, "Query Profiler and TraceCollector are disabled because they work extremely slow with test coverage.");
#endif
#if defined(SANITIZER)
LOG_INFO(log, "Query Profiler disabled because they cannot work under sanitizers"
" when two different stack unwinding methods will interfere with each other.");
#endif
if (!hasPHDRCache())
LOG_INFO(log, "Query Profiler and TraceCollector are disabled because they require PHDR cache to be created"
" (otherwise the function 'dl_iterate_phdr' is not lock free and not async-signal safe).");
#if defined(OS_LINUX)
auto tasks_stats_provider = TasksStatsCounters::findBestAvailableProvider();
if (tasks_stats_provider == TasksStatsCounters::MetricsProvider::None)

View File

@ -289,10 +289,14 @@ void executeColumnIfNeeded(ColumnWithTypeAndName & column, bool empty)
if (!column_function)
return;
size_t original_size = column.column->size();
if (!empty)
column = column_function->reduce();
else
column.column = column_function->getResultType()->createColumn();
column.column = column_function->getResultType()->createColumnConstWithDefaultValue(original_size)->convertToFullColumnIfConst();
chassert(column.column->size() == original_size);
}
int checkShortCircuitArguments(const ColumnsWithTypeAndName & arguments)

View File

@ -228,9 +228,9 @@ void Timer::cleanup()
#endif
template <typename ProfilerImpl>
QueryProfilerBase<ProfilerImpl>::QueryProfilerBase([[maybe_unused]] UInt64 thread_id, [[maybe_unused]] int clock_type, [[maybe_unused]] UInt32 period, [[maybe_unused]] int pause_signal_)
: log(getLogger("QueryProfiler"))
, pause_signal(pause_signal_)
QueryProfilerBase<ProfilerImpl>::QueryProfilerBase(
[[maybe_unused]] UInt64 thread_id, [[maybe_unused]] int clock_type, [[maybe_unused]] UInt32 period, [[maybe_unused]] int pause_signal_)
: log(getLogger("QueryProfiler")), pause_signal(pause_signal_)
{
#if defined(SANITIZER)
throw Exception(ErrorCodes::NOT_IMPLEMENTED, "QueryProfiler disabled because they cannot work under sanitizers");

View File

@ -122,6 +122,13 @@ DatabaseReplicated::DatabaseReplicated(
fillClusterAuthInfo(db_settings.collection_name.value, context_->getConfigRef());
replica_group_name = context_->getConfigRef().getString("replica_group_name", "");
if (!replica_group_name.empty() && database_name.starts_with(DatabaseReplicated::ALL_GROUPS_CLUSTER_PREFIX))
{
context_->addWarningMessage(fmt::format("There's a Replicated database with a name starting from '{}', "
"and replica_group_name is configured. It may cause collisions in cluster names.",
ALL_GROUPS_CLUSTER_PREFIX));
}
}
String DatabaseReplicated::getFullReplicaName(const String & shard, const String & replica)
@ -173,13 +180,40 @@ ClusterPtr DatabaseReplicated::tryGetCluster() const
return cluster;
}
void DatabaseReplicated::setCluster(ClusterPtr && new_cluster)
ClusterPtr DatabaseReplicated::tryGetAllGroupsCluster() const
{
std::lock_guard lock{mutex};
cluster = std::move(new_cluster);
if (replica_group_name.empty())
return nullptr;
if (cluster_all_groups)
return cluster_all_groups;
/// Database is probably not created or not initialized yet, it's ok to return nullptr
if (is_readonly)
return cluster_all_groups;
try
{
cluster_all_groups = getClusterImpl(/*all_groups*/ true);
}
catch (...)
{
tryLogCurrentException(log);
}
return cluster_all_groups;
}
ClusterPtr DatabaseReplicated::getClusterImpl() const
void DatabaseReplicated::setCluster(ClusterPtr && new_cluster, bool all_groups)
{
std::lock_guard lock{mutex};
if (all_groups)
cluster_all_groups = std::move(new_cluster);
else
cluster = std::move(new_cluster);
}
ClusterPtr DatabaseReplicated::getClusterImpl(bool all_groups) const
{
Strings unfiltered_hosts;
Strings hosts;
@ -199,17 +233,24 @@ ClusterPtr DatabaseReplicated::getClusterImpl() const
"It's possible if the first replica is not fully created yet "
"or if the last replica was just dropped or due to logical error", zookeeper_path);
hosts.clear();
std::vector<String> paths;
for (const auto & host : unfiltered_hosts)
paths.push_back(zookeeper_path + "/replicas/" + host + "/replica_group");
auto replica_groups = zookeeper->tryGet(paths);
for (size_t i = 0; i < paths.size(); ++i)
if (all_groups)
{
if (replica_groups[i].data == replica_group_name)
hosts.push_back(unfiltered_hosts[i]);
hosts = unfiltered_hosts;
}
else
{
hosts.clear();
std::vector<String> paths;
for (const auto & host : unfiltered_hosts)
paths.push_back(zookeeper_path + "/replicas/" + host + "/replica_group");
auto replica_groups = zookeeper->tryGet(paths);
for (size_t i = 0; i < paths.size(); ++i)
{
if (replica_groups[i].data == replica_group_name)
hosts.push_back(unfiltered_hosts[i]);
}
}
Int32 cversion = stat.cversion;
@ -274,6 +315,11 @@ ClusterPtr DatabaseReplicated::getClusterImpl() const
bool treat_local_as_remote = false;
bool treat_local_port_as_remote = getContext()->getApplicationType() == Context::ApplicationType::LOCAL;
String cluster_name = TSA_SUPPRESS_WARNING_FOR_READ(database_name); /// FIXME
if (all_groups)
cluster_name = ALL_GROUPS_CLUSTER_PREFIX + cluster_name;
ClusterConnectionParameters params{
cluster_auth_info.cluster_username,
cluster_auth_info.cluster_password,
@ -282,7 +328,7 @@ ClusterPtr DatabaseReplicated::getClusterImpl() const
treat_local_port_as_remote,
cluster_auth_info.cluster_secure_connection,
Priority{1},
TSA_SUPPRESS_WARNING_FOR_READ(database_name), /// FIXME
cluster_name,
cluster_auth_info.cluster_secret};
return std::make_shared<Cluster>(getContext()->getSettingsRef(), shards, params);

View File

@ -20,6 +20,8 @@ using ClusterPtr = std::shared_ptr<Cluster>;
class DatabaseReplicated : public DatabaseAtomic
{
public:
static constexpr auto ALL_GROUPS_CLUSTER_PREFIX = "all_groups.";
DatabaseReplicated(const String & name_, const String & metadata_path_, UUID uuid,
const String & zookeeper_path_, const String & shard_name_, const String & replica_name_,
DatabaseReplicatedSettings db_settings_,
@ -65,6 +67,7 @@ public:
/// Returns cluster consisting of database replicas
ClusterPtr tryGetCluster() const;
ClusterPtr tryGetAllGroupsCluster() const;
void drop(ContextPtr /*context*/) override;
@ -113,8 +116,8 @@ private:
ASTPtr parseQueryFromMetadataInZooKeeper(const String & node_name, const String & query);
String readMetadataFile(const String & table_name) const;
ClusterPtr getClusterImpl() const;
void setCluster(ClusterPtr && new_cluster);
ClusterPtr getClusterImpl(bool all_groups = false) const;
void setCluster(ClusterPtr && new_cluster, bool all_groups = false);
void createEmptyLogEntry(const ZooKeeperPtr & current_zookeeper);
@ -155,6 +158,7 @@ private:
UInt64 tables_metadata_digest TSA_GUARDED_BY(metadata_mutex);
mutable ClusterPtr cluster;
mutable ClusterPtr cluster_all_groups;
LoadTaskPtr startup_replicated_database_task TSA_GUARDED_BY(mutex);
};

View File

@ -421,6 +421,8 @@ DDLTaskPtr DatabaseReplicatedDDLWorker::initAndCheckTask(const String & entry_na
{
/// Some replica is added or removed, let's update cached cluster
database->setCluster(database->getClusterImpl());
if (!database->replica_group_name.empty())
database->setCluster(database->getClusterImpl(/*all_groups*/ true), /*all_groups*/ true);
out_reason = fmt::format("Entry {} is a dummy task", entry_name);
return {};
}

View File

@ -19,11 +19,15 @@ namespace ProfileEvents
namespace DB
{
namespace ErrorCodes
{
extern const int LOGICAL_ERROR;
}
struct WriteBufferFromAzureBlobStorage::PartData
{
Memory<> memory;
size_t data_size = 0;
std::string block_id;
};
BufferAllocationPolicyPtr createBufferAllocationPolicy(const AzureObjectStorageSettings & settings)
@ -119,22 +123,30 @@ void WriteBufferFromAzureBlobStorage::preFinalize()
// This function should not be run again
is_prefinalized = true;
hidePartialData();
if (hidden_size > 0)
detachBuffer();
setFakeBufferWhenPreFinalized();
/// If there is only one block and size is less than or equal to max_single_part_upload_size
/// then we use single part upload instead of multi part upload
if (buffer_allocation_policy->getBufferNumber() == 1)
if (block_ids.empty() && detached_part_data.size() == 1 && detached_part_data.front().data_size <= max_single_part_upload_size)
{
size_t data_size = size_t(position() - memory.data());
if (data_size <= max_single_part_upload_size)
{
auto block_blob_client = blob_container_client->GetBlockBlobClient(blob_path);
Azure::Core::IO::MemoryBodyStream memory_stream(reinterpret_cast<const uint8_t *>(memory.data()), data_size);
execWithRetry([&](){ block_blob_client.Upload(memory_stream); }, max_unexpected_write_error_retries, data_size);
LOG_TRACE(log, "Committed single block for blob `{}`", blob_path);
return;
}
}
auto part_data = std::move(detached_part_data.front());
auto block_blob_client = blob_container_client->GetBlockBlobClient(blob_path);
Azure::Core::IO::MemoryBodyStream memory_stream(reinterpret_cast<const uint8_t *>(part_data.memory.data()), part_data.data_size);
execWithRetry([&](){ block_blob_client.Upload(memory_stream); }, max_unexpected_write_error_retries, part_data.data_size);
LOG_TRACE(log, "Committed single block for blob `{}`", blob_path);
writePart();
detached_part_data.pop_front();
return;
}
else
{
writeMultipartUpload();
}
}
void WriteBufferFromAzureBlobStorage::finalizeImpl()
@ -144,9 +156,13 @@ void WriteBufferFromAzureBlobStorage::finalizeImpl()
if (!is_prefinalized)
preFinalize();
chassert(offset() == 0);
chassert(hidden_size == 0);
task_tracker->waitAll();
if (!block_ids.empty())
{
task_tracker->waitAll();
auto block_blob_client = blob_container_client->GetBlockBlobClient(blob_path);
execWithRetry([&](){ block_blob_client.CommitBlockList(block_ids); }, max_unexpected_write_error_retries);
LOG_TRACE(log, "Committed {} blocks for blob `{}`", block_ids.size(), blob_path);
@ -155,14 +171,66 @@ void WriteBufferFromAzureBlobStorage::finalizeImpl()
void WriteBufferFromAzureBlobStorage::nextImpl()
{
if (is_prefinalized)
throw Exception(
ErrorCodes::LOGICAL_ERROR,
"Cannot write to prefinalized buffer for Azure Blob Storage, the file could have been created");
task_tracker->waitIfAny();
writePart();
hidePartialData();
reallocateFirstBuffer();
if (available() > 0)
return;
detachBuffer();
if (detached_part_data.size() > 1)
writeMultipartUpload();
allocateBuffer();
}
void WriteBufferFromAzureBlobStorage::hidePartialData()
{
if (write_settings.remote_throttler)
write_settings.remote_throttler->add(offset(), ProfileEvents::RemoteWriteThrottlerBytes, ProfileEvents::RemoteWriteThrottlerSleepMicroseconds);
chassert(memory.size() >= hidden_size + offset());
hidden_size += offset();
chassert(memory.data() + hidden_size == working_buffer.begin() + offset());
chassert(memory.data() + hidden_size == position());
WriteBuffer::set(memory.data() + hidden_size, memory.size() - hidden_size);
chassert(offset() == 0);
}
void WriteBufferFromAzureBlobStorage::reallocateFirstBuffer()
{
chassert(offset() == 0);
if (buffer_allocation_policy->getBufferNumber() > 1 || available() > 0)
return;
const size_t max_first_buffer = buffer_allocation_policy->getBufferSize();
if (memory.size() == max_first_buffer)
return;
size_t size = std::min(memory.size() * 2, max_first_buffer);
memory.resize(size);
WriteBuffer::set(memory.data() + hidden_size, memory.size() - hidden_size);
chassert(offset() == 0);
}
void WriteBufferFromAzureBlobStorage::allocateBuffer()
{
buffer_allocation_policy->nextBuffer();
chassert(0 == hidden_size);
auto size = buffer_allocation_policy->getBufferSize();
if (buffer_allocation_policy->getBufferNumber() == 1)
@ -172,30 +240,56 @@ void WriteBufferFromAzureBlobStorage::allocateBuffer()
WriteBuffer::set(memory.data(), memory.size());
}
void WriteBufferFromAzureBlobStorage::writePart()
void WriteBufferFromAzureBlobStorage::detachBuffer()
{
auto data_size = size_t(position() - memory.data());
size_t data_size = size_t(position() - memory.data());
if (data_size == 0)
return;
const std::string & block_id = block_ids.emplace_back(getRandomASCIIString(64));
std::shared_ptr<PartData> part_data = std::make_shared<PartData>(std::move(memory), data_size, block_id);
WriteBuffer::set(nullptr, 0);
chassert(data_size == hidden_size);
auto upload_worker = [this, part_data] ()
auto buf = std::move(memory);
WriteBuffer::set(nullptr, 0);
total_size += hidden_size;
hidden_size = 0;
detached_part_data.push_back({std::move(buf), data_size});
WriteBuffer::set(nullptr, 0);
}
void WriteBufferFromAzureBlobStorage::writePart(WriteBufferFromAzureBlobStorage::PartData && part_data)
{
const std::string & block_id = block_ids.emplace_back(getRandomASCIIString(64));
auto worker_data = std::make_shared<std::tuple<std::string, WriteBufferFromAzureBlobStorage::PartData>>(block_id, std::move(part_data));
auto upload_worker = [this, worker_data] ()
{
auto & data_size = std::get<1>(*worker_data).data_size;
auto & data_block_id = std::get<0>(*worker_data);
auto block_blob_client = blob_container_client->GetBlockBlobClient(blob_path);
Azure::Core::IO::MemoryBodyStream memory_stream(reinterpret_cast<const uint8_t *>(part_data->memory.data()), part_data->data_size);
execWithRetry([&](){ block_blob_client.StageBlock(part_data->block_id, memory_stream); }, max_unexpected_write_error_retries, part_data->data_size);
if (write_settings.remote_throttler)
write_settings.remote_throttler->add(part_data->data_size, ProfileEvents::RemoteWriteThrottlerBytes, ProfileEvents::RemoteWriteThrottlerSleepMicroseconds);
Azure::Core::IO::MemoryBodyStream memory_stream(reinterpret_cast<const uint8_t *>(std::get<1>(*worker_data).memory.data()), data_size);
execWithRetry([&](){ block_blob_client.StageBlock(data_block_id, memory_stream); }, max_unexpected_write_error_retries, data_size);
};
task_tracker->add(std::move(upload_worker));
}
void WriteBufferFromAzureBlobStorage::setFakeBufferWhenPreFinalized()
{
WriteBuffer::set(fake_buffer_when_prefinalized, sizeof(fake_buffer_when_prefinalized));
}
void WriteBufferFromAzureBlobStorage::writeMultipartUpload()
{
while (!detached_part_data.empty())
{
writePart(std::move(detached_part_data.front()));
detached_part_data.pop_front();
}
}
}
#endif

View File

@ -48,8 +48,13 @@ public:
private:
struct PartData;
void writePart();
void writeMultipartUpload();
void writePart(PartData && part_data);
void detachBuffer();
void reallocateFirstBuffer();
void allocateBuffer();
void hidePartialData();
void setFakeBufferWhenPreFinalized();
void finalizeImpl() override;
void execWithRetry(std::function<void()> func, size_t num_tries, size_t cost = 0);
@ -77,9 +82,16 @@ private:
MemoryBufferPtr allocateBuffer() const;
char fake_buffer_when_prefinalized[1] = {};
bool first_buffer=true;
size_t total_size = 0;
size_t hidden_size = 0;
std::unique_ptr<TaskTracker> task_tracker;
std::deque<PartData> detached_part_data;
};
}

View File

@ -166,6 +166,8 @@ public:
return client.get();
}
bool supportParallelWrite() const override { return true; }
private:
using SharedAzureClientPtr = std::shared_ptr<const Azure::Storage::Blobs::BlobContainerClient>;
void removeObjectImpl(const StoredObject & object, const SharedAzureClientPtr & client_ptr, bool if_exists);

View File

@ -5,6 +5,7 @@
#include <functional>
#include <memory>
#include <Poco/Timestamp.h>
namespace DB
{
@ -25,6 +26,7 @@ public:
{
UInt64 uncompressed_size;
UInt64 compressed_size;
Poco::Timestamp last_modified;
bool is_encrypted;
};

View File

@ -157,6 +157,7 @@ public:
file_info.emplace();
file_info->uncompressed_size = archive_entry_size(current_entry);
file_info->compressed_size = archive_entry_size(current_entry);
file_info->last_modified = archive_entry_mtime(current_entry);
file_info->is_encrypted = false;
}

View File

@ -740,12 +740,18 @@ struct ContextSharedPart : boost::noncopyable
void initializeTraceCollector(std::shared_ptr<TraceLog> trace_log)
{
if (!trace_log)
return;
if (!trace_collector.has_value())
throw Exception(ErrorCodes::LOGICAL_ERROR, "TraceCollector needs to be first created before initialization");
trace_collector->initialize(trace_log);
}
void createTraceCollector()
{
if (hasTraceCollector())
return;
trace_collector.emplace(std::move(trace_log));
trace_collector.emplace();
}
void addWarningMessage(const String & message) TSA_REQUIRES(mutex)
@ -3891,6 +3897,11 @@ void Context::initializeSystemLogs()
});
}
void Context::createTraceCollector()
{
shared->createTraceCollector();
}
void Context::initializeTraceCollector()
{
shared->initializeTraceCollector(getTraceLog());

View File

@ -1077,6 +1077,8 @@ public:
void initializeSystemLogs();
/// Call after initialization before using trace collector.
void createTraceCollector();
void initializeTraceCollector();
/// Call after unexpected crash happen.

View File

@ -568,8 +568,21 @@ void ZooKeeperMetadataTransaction::commit()
ClusterPtr tryGetReplicatedDatabaseCluster(const String & cluster_name)
{
if (const auto * replicated_db = dynamic_cast<const DatabaseReplicated *>(DatabaseCatalog::instance().tryGetDatabase(cluster_name).get()))
return replicated_db->tryGetCluster();
String name = cluster_name;
bool all_groups = false;
if (name.starts_with(DatabaseReplicated::ALL_GROUPS_CLUSTER_PREFIX))
{
name = name.substr(strlen(DatabaseReplicated::ALL_GROUPS_CLUSTER_PREFIX));
all_groups = true;
}
if (const auto * replicated_db = dynamic_cast<const DatabaseReplicated *>(DatabaseCatalog::instance().tryGetDatabase(name).get()))
{
if (all_groups)
return replicated_db->tryGetAllGroupsCluster();
else
return replicated_db->tryGetCluster();
}
return {};
}

View File

@ -1,5 +1,4 @@
#include "TraceCollector.h"
#include <Interpreters/TraceCollector.h>
#include <Core/Field.h>
#include <IO/ReadBufferFromFileDescriptor.h>
#include <IO/ReadHelpers.h>
@ -14,8 +13,12 @@
namespace DB
{
TraceCollector::TraceCollector(std::shared_ptr<TraceLog> trace_log_)
: trace_log(std::move(trace_log_))
namespace ErrorCodes
{
extern const int LOGICAL_ERROR;
}
TraceCollector::TraceCollector()
{
TraceSender::pipe.open();
@ -28,6 +31,23 @@ TraceCollector::TraceCollector(std::shared_ptr<TraceLog> trace_log_)
thread = ThreadFromGlobalPool(&TraceCollector::run, this);
}
void TraceCollector::initialize(std::shared_ptr<TraceLog> trace_log_)
{
if (is_trace_log_initialized)
throw DB::Exception(ErrorCodes::LOGICAL_ERROR, "TraceCollector is already initialized");
trace_log_ptr = trace_log_;
is_trace_log_initialized.store(true, std::memory_order_release);
}
std::shared_ptr<TraceLog> TraceCollector::getTraceLog()
{
if (!is_trace_log_initialized.load(std::memory_order_acquire))
return nullptr;
return trace_log_ptr;
}
void TraceCollector::tryClosePipe()
{
try
@ -120,7 +140,7 @@ void TraceCollector::run()
ProfileEvents::Count increment;
readPODBinary(increment, in);
if (trace_log)
if (auto trace_log = getTraceLog())
{
// time and time_in_microseconds are both being constructed from the same timespec so that the
// times will be equal up to the precision of a second.

View File

@ -1,4 +1,5 @@
#pragma once
#include <atomic>
#include <Common/ThreadPool.h>
class StackTrace;
@ -16,11 +17,17 @@ class TraceLog;
class TraceCollector
{
public:
explicit TraceCollector(std::shared_ptr<TraceLog> trace_log_);
TraceCollector();
~TraceCollector();
void initialize(std::shared_ptr<TraceLog> trace_log_);
private:
std::shared_ptr<TraceLog> trace_log;
std::shared_ptr<TraceLog> getTraceLog();
std::atomic<bool> is_trace_log_initialized = false;
std::shared_ptr<TraceLog> trace_log_ptr;
ThreadFromGlobalPool thread;
void tryClosePipe();

View File

@ -195,12 +195,14 @@ Chunk StorageObjectStorageSource::generate()
const auto & object_info = reader.getObjectInfo();
const auto & filename = object_info.getFileName();
chassert(object_info.metadata);
VirtualColumnUtils::addRequestedPathFileAndSizeVirtualsToChunk(
chunk,
read_from_format_info.requested_virtual_columns,
getUniqueStoragePathIdentifier(*configuration, reader.getObjectInfo(), false),
object_info.metadata->size_bytes, &filename);
VirtualColumnUtils::addRequestedFileLikeStorageVirtualsToChunk(
chunk, read_from_format_info.requested_virtual_columns,
{
.path = getUniqueStoragePathIdentifier(*configuration, reader.getObjectInfo(), false),
.size = object_info.metadata->size_bytes,
.filename = &filename,
.last_modified = object_info.metadata->last_modified
});
return chunk;
}

View File

@ -421,8 +421,14 @@ Chunk StorageS3QueueSource::generate()
file_status->processed_rows += chunk.getNumRows();
processed_rows_from_file += chunk.getNumRows();
VirtualColumnUtils::addRequestedPathFileAndSizeVirtualsToChunk(
chunk, requested_virtual_columns, path, reader.getObjectInfo().metadata->size_bytes);
VirtualColumnUtils::addRequestedFileLikeStorageVirtualsToChunk(
chunk, requested_virtual_columns,
{
.path = path,
.size = reader.getObjectInfo().metadata->size_bytes
});
return chunk;
}
}

View File

@ -1341,6 +1341,7 @@ Chunk StorageFileSource::generate()
chassert(file_enumerator);
current_path = fmt::format("{}::{}", archive_reader->getPath(), *filename_override);
current_file_size = file_enumerator->getFileInfo().uncompressed_size;
current_file_last_modified = file_enumerator->getFileInfo().last_modified;
if (need_only_count && tryGetCountFromCache(current_archive_stat))
continue;
@ -1370,6 +1371,7 @@ Chunk StorageFileSource::generate()
struct stat file_stat;
file_stat = getFileStat(current_path, storage->use_table_fd, storage->table_fd, storage->getName());
current_file_size = file_stat.st_size;
current_file_last_modified = Poco::Timestamp::fromEpochTime(file_stat.st_mtime);
if (getContext()->getSettingsRef().engine_file_skip_empty_files && file_stat.st_size == 0)
continue;
@ -1436,8 +1438,15 @@ Chunk StorageFileSource::generate()
progress(num_rows, chunk_size ? chunk_size : chunk.bytes());
/// Enrich with virtual columns.
VirtualColumnUtils::addRequestedPathFileAndSizeVirtualsToChunk(
chunk, requested_virtual_columns, current_path, current_file_size, filename_override.has_value() ? &filename_override.value() : nullptr);
VirtualColumnUtils::addRequestedFileLikeStorageVirtualsToChunk(
chunk, requested_virtual_columns,
{
.path = current_path,
.size = current_file_size,
.filename = (filename_override.has_value() ? &filename_override.value() : nullptr),
.last_modified = current_file_last_modified
});
return chunk;
}

View File

@ -279,6 +279,7 @@ private:
FilesIteratorPtr files_iterator;
String current_path;
std::optional<size_t> current_file_size;
std::optional<Poco::Timestamp> current_file_last_modified;
struct stat current_archive_stat;
std::optional<String> filename_override;
Block sample_block;

View File

@ -50,6 +50,12 @@ namespace ErrorCodes
namespace
{
struct GenerateRandomState
{
std::atomic<UInt64> add_total_rows = 0;
};
using GenerateRandomStatePtr = std::shared_ptr<GenerateRandomState>;
void fillBufferWithRandomData(char * __restrict data, size_t limit, size_t size_of_type, pcg64 & rng, [[maybe_unused]] bool flip_bytes = false)
{
size_t size = limit * size_of_type;
@ -532,10 +538,24 @@ ColumnPtr fillColumnWithRandomData(
class GenerateSource : public ISource
{
public:
GenerateSource(UInt64 block_size_, UInt64 max_array_length_, UInt64 max_string_length_, UInt64 random_seed_, Block block_header_, ContextPtr context_)
GenerateSource(
UInt64 block_size_,
UInt64 max_array_length_,
UInt64 max_string_length_,
UInt64 random_seed_,
Block block_header_,
ContextPtr context_,
GenerateRandomStatePtr state_)
: ISource(Nested::flattenNested(prepareBlockToFill(block_header_)))
, block_size(block_size_), max_array_length(max_array_length_), max_string_length(max_string_length_)
, block_to_fill(std::move(block_header_)), rng(random_seed_), context(context_) {}
, block_size(block_size_)
, max_array_length(max_array_length_)
, max_string_length(max_string_length_)
, block_to_fill(std::move(block_header_))
, rng(random_seed_)
, context(context_)
, shared_state(state_)
{
}
String getName() const override { return "GenerateRandom"; }
@ -549,7 +569,15 @@ protected:
columns.emplace_back(fillColumnWithRandomData(elem.type, block_size, max_array_length, max_string_length, rng, context));
columns = Nested::flattenNested(block_to_fill.cloneWithColumns(columns)).getColumns();
return {std::move(columns), block_size};
UInt64 total_rows = shared_state->add_total_rows.fetch_and(0);
if (total_rows)
addTotalRowsApprox(total_rows);
auto chunk = Chunk{std::move(columns), block_size};
progress(chunk.getNumRows(), chunk.bytes());
return chunk;
}
private:
@ -561,6 +589,7 @@ private:
pcg64 rng;
ContextPtr context;
GenerateRandomStatePtr shared_state;
static Block & prepareBlockToFill(Block & block)
{
@ -648,9 +677,6 @@ Pipe StorageGenerateRandom::read(
{
storage_snapshot->check(column_names);
Pipes pipes;
pipes.reserve(num_streams);
const ColumnsDescription & our_columns = storage_snapshot->metadata->getColumns();
Block block_header;
for (const auto & name : column_names)
@ -679,16 +705,24 @@ Pipe StorageGenerateRandom::read(
}
}
UInt64 query_limit = query_info.limit;
if (query_limit && num_streams * max_block_size > query_limit)
{
/// We want to avoid spawning more streams than necessary
num_streams = std::min(num_streams, static_cast<size_t>(((query_limit + max_block_size - 1) / max_block_size)));
}
Pipes pipes;
pipes.reserve(num_streams);
/// Will create more seed values for each source from initial seed.
pcg64 generate(random_seed);
auto shared_state = std::make_shared<GenerateRandomState>(query_info.limit);
for (UInt64 i = 0; i < num_streams; ++i)
{
auto source = std::make_shared<GenerateSource>(max_block_size, max_array_length, max_string_length, generate(), block_header, context);
if (i == 0 && query_info.limit)
source->addTotalRowsApprox(query_info.limit);
auto source = std::make_shared<GenerateSource>(
max_block_size, max_array_length, max_string_length, generate(), block_header, context, shared_state);
pipes.emplace_back(std::move(source));
}

View File

@ -411,7 +411,12 @@ Chunk StorageURLSource::generate()
if (input_format)
chunk_size = input_format->getApproxBytesReadForChunk();
progress(num_rows, chunk_size ? chunk_size : chunk.bytes());
VirtualColumnUtils::addRequestedPathFileAndSizeVirtualsToChunk(chunk, requested_virtual_columns, curr_uri.getPath(), current_file_size);
VirtualColumnUtils::addRequestedFileLikeStorageVirtualsToChunk(
chunk, requested_virtual_columns,
{
.path = curr_uri.getPath(),
.size = current_file_size
});
return chunk;
}

View File

@ -54,6 +54,10 @@ void StorageSystemClusters::fillData(MutableColumns & res_columns, ContextPtr co
if (auto database_cluster = replicated->tryGetCluster())
writeCluster(res_columns, {name_and_database.first, database_cluster},
replicated->tryGetAreReplicasActive(database_cluster));
if (auto database_cluster = replicated->tryGetAllGroupsCluster())
writeCluster(res_columns, {DatabaseReplicated::ALL_GROUPS_CLUSTER_PREFIX + name_and_database.first, database_cluster},
replicated->tryGetAreReplicasActive(database_cluster));
}
}
}
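
With this change, a Replicated database that has `replica_group_name` configured exposes a second, cross-group cluster under the `all_groups.` prefix (`ALL_GROUPS_CLUSTER_PREFIX` in this diff). A minimal sketch, assuming a Replicated database named `mydb`:

```sql
-- Hypothetical database name: the plain cluster covers the local replica group,
-- while the 'all_groups.' cluster added here covers replicas of all groups.
SELECT cluster, shard_num, replica_num, host_name
FROM system.clusters
WHERE cluster IN ('mydb', 'all_groups.mydb');
```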

View File

@ -16,7 +16,9 @@ namespace
struct ZerosState
{
explicit ZerosState(UInt64 limit) : add_total_rows(limit) { }
std::atomic<UInt64> num_generated_rows = 0;
std::atomic<UInt64> add_total_rows = 0;
};
using ZerosStatePtr = std::shared_ptr<ZerosState>;
@ -42,10 +44,13 @@ protected:
auto column_ptr = column;
size_t column_size = column_ptr->size();
if (state)
UInt64 total_rows = state->add_total_rows.fetch_and(0);
if (total_rows)
addTotalRowsApprox(total_rows);
if (limit)
{
auto generated_rows = state->num_generated_rows.fetch_add(column_size, std::memory_order_acquire);
if (generated_rows >= limit)
return {};
@ -103,36 +108,25 @@ Pipe StorageSystemZeros::read(
{
storage_snapshot->check(column_names);
bool use_multiple_streams = multithreaded;
UInt64 query_limit = limit ? *limit : 0;
if (query_info.limit)
query_limit = query_limit ? std::min(query_limit, query_info.limit) : query_info.limit;
if (limit && *limit < max_block_size)
{
max_block_size = static_cast<size_t>(*limit);
use_multiple_streams = false;
}
if (query_limit && query_limit < max_block_size)
max_block_size = query_limit;
if (!use_multiple_streams)
if (!multithreaded)
num_streams = 1;
else if (query_limit && num_streams * max_block_size > query_limit)
/// We want to avoid spawning more streams than necessary
num_streams = std::min(num_streams, static_cast<size_t>(((query_limit + max_block_size - 1) / max_block_size)));
ZerosStatePtr state = std::make_shared<ZerosState>(query_limit);
Pipe res;
ZerosStatePtr state;
if (limit)
state = std::make_shared<ZerosState>();
for (size_t i = 0; i < num_streams; ++i)
{
auto source = std::make_shared<ZerosSource>(max_block_size, limit ? *limit : 0, state);
if (i == 0)
{
if (limit)
source->addTotalRowsApprox(*limit);
else if (query_info.limit)
source->addTotalRowsApprox(query_info.limit);
}
auto source = std::make_shared<ZerosSource>(max_block_size, query_limit, state);
res.addSource(std::move(source));
}
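A minimal standalone sketch of the shared-state idiom used above, assuming only the standard library; SharedLimitState and claim_total_rows are illustrative stand-ins for ZerosState/GenerateRandomState, not the actual declarations:

#include <atomic>
#include <cstdint>

struct SharedLimitState
{
    explicit SharedLimitState(uint64_t limit) : add_total_rows(limit) { }

    /// Rows produced so far across all sources; fetch_add lets each source
    /// check whether the shared limit has already been reached.
    std::atomic<uint64_t> num_generated_rows{0};

    /// Holds the approximate total until the first source claims it.
    std::atomic<uint64_t> add_total_rows{0};
};

/// fetch_and(0) atomically returns the stored value and zeroes it, so exactly
/// one source across all streams ends up reporting the approximate total rows.
inline uint64_t claim_total_rows(SharedLimitState & state)
{
    return state.add_total_rows.fetch_and(0);
}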


@ -26,6 +26,7 @@
#include <DataTypes/DataTypesNumber.h>
#include <DataTypes/DataTypeString.h>
#include <DataTypes/DataTypeLowCardinality.h>
#include <DataTypes/DataTypeDateTime.h>
#include <Processors/QueryPlan/QueryPlan.h>
#include <Processors/QueryPlan/BuildQueryPipelineSettings.h>
@ -111,7 +112,7 @@ void filterBlockWithDAG(ActionsDAGPtr dag, Block & block, ContextPtr context)
NameSet getVirtualNamesForFileLikeStorage()
{
return {"_path", "_file", "_size"};
return {"_path", "_file", "_size", "_time"};
}
VirtualColumnsDescription getVirtualsForFileLikeStorage(const ColumnsDescription & storage_columns)
@ -129,6 +130,7 @@ VirtualColumnsDescription getVirtualsForFileLikeStorage(const ColumnsDescription
add_virtual("_path", std::make_shared<DataTypeLowCardinality>(std::make_shared<DataTypeString>()));
add_virtual("_file", std::make_shared<DataTypeLowCardinality>(std::make_shared<DataTypeString>()));
add_virtual("_size", makeNullable(std::make_shared<DataTypeUInt64>()));
add_virtual("_time", makeNullable(std::make_shared<DataTypeDateTime>()));
return desc;
}
@ -187,32 +189,40 @@ ColumnPtr getFilterByPathAndFileIndexes(const std::vector<String> & paths, const
return block.getByName("_idx").column;
}
void addRequestedPathFileAndSizeVirtualsToChunk(
Chunk & chunk, const NamesAndTypesList & requested_virtual_columns, const String & path, std::optional<size_t> size, const String * filename)
void addRequestedFileLikeStorageVirtualsToChunk(
Chunk & chunk, const NamesAndTypesList & requested_virtual_columns,
VirtualsForFileLikeStorage virtual_values)
{
for (const auto & virtual_column : requested_virtual_columns)
{
if (virtual_column.name == "_path")
{
chunk.addColumn(virtual_column.type->createColumnConst(chunk.getNumRows(), path)->convertToFullColumnIfConst());
chunk.addColumn(virtual_column.type->createColumnConst(chunk.getNumRows(), virtual_values.path)->convertToFullColumnIfConst());
}
else if (virtual_column.name == "_file")
{
if (filename)
if (virtual_values.filename)
{
chunk.addColumn(virtual_column.type->createColumnConst(chunk.getNumRows(), *filename)->convertToFullColumnIfConst());
chunk.addColumn(virtual_column.type->createColumnConst(chunk.getNumRows(), (*virtual_values.filename))->convertToFullColumnIfConst());
}
else
{
size_t last_slash_pos = path.find_last_of('/');
auto filename_from_path = path.substr(last_slash_pos + 1);
size_t last_slash_pos = virtual_values.path.find_last_of('/');
auto filename_from_path = virtual_values.path.substr(last_slash_pos + 1);
chunk.addColumn(virtual_column.type->createColumnConst(chunk.getNumRows(), filename_from_path)->convertToFullColumnIfConst());
}
}
else if (virtual_column.name == "_size")
{
if (size)
chunk.addColumn(virtual_column.type->createColumnConst(chunk.getNumRows(), *size)->convertToFullColumnIfConst());
if (virtual_values.size)
chunk.addColumn(virtual_column.type->createColumnConst(chunk.getNumRows(), *virtual_values.size)->convertToFullColumnIfConst());
else
chunk.addColumn(virtual_column.type->createColumnConstWithDefaultValue(chunk.getNumRows())->convertToFullColumnIfConst());
}
else if (virtual_column.name == "_time")
{
if (virtual_values.last_modified)
chunk.addColumn(virtual_column.type->createColumnConst(chunk.getNumRows(), virtual_values.last_modified->epochTime())->convertToFullColumnIfConst());
else
chunk.addColumn(virtual_column.type->createColumnConstWithDefaultValue(chunk.getNumRows())->convertToFullColumnIfConst());
}


@ -68,8 +68,18 @@ void filterByPathOrFile(std::vector<T> & sources, const std::vector<String> & pa
sources = std::move(filtered_sources);
}
void addRequestedPathFileAndSizeVirtualsToChunk(
Chunk & chunk, const NamesAndTypesList & requested_virtual_columns, const String & path, std::optional<size_t> size, const String * filename = nullptr);
struct VirtualsForFileLikeStorage
{
const String & path;
std::optional<size_t> size { std::nullopt };
const String * filename { nullptr };
std::optional<Poco::Timestamp> last_modified { std::nullopt };
};
void addRequestedFileLikeStorageVirtualsToChunk(
Chunk & chunk, const NamesAndTypesList & requested_virtual_columns,
VirtualsForFileLikeStorage virtual_values);
}
}
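A minimal standalone sketch of the parameter-struct pattern introduced above, assuming only the standard library; FileVirtualValues and add_virtuals_to_chunk are simplified stand-ins, not the actual ClickHouse declarations:

#include <cstddef>
#include <ctime>
#include <optional>
#include <string>

struct FileVirtualValues
{
    const std::string & path;                                 /// always available
    std::optional<size_t> size{std::nullopt};                 /// filled when known
    const std::string * filename{nullptr};                    /// optional override for _file
    std::optional<std::time_t> last_modified{std::nullopt};   /// backs the new _time column
};

inline void add_virtuals_to_chunk(const FileVirtualValues &)
{
    /// Would materialize the requested _path/_file/_size/_time columns here.
}

inline void example(const std::string & uri_path, size_t file_size)
{
    /// Designated initializers keep call sites readable and let optional fields
    /// such as last_modified be added later without touching existing callers.
    add_virtuals_to_chunk(FileVirtualValues{
        .path = uri_path,
        .size = file_size
    });
}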


@ -685,9 +685,6 @@ class CIConfig:
return result
def get_job_parents(self, check_name: str) -> List[str]:
if check_name in self.builds_report_config:
return self.builds_report_config[check_name].builds
res = []
check_name = normalize_string(check_name)
for config in (
@ -903,10 +900,38 @@ CI_CONFIG = CIConfig(
),
CILabels.CI_SET_REQUIRED: LabelConfig(run_jobs=REQUIRED_CHECKS),
CILabels.CI_SET_NORMAL_BUILDS: LabelConfig(
run_jobs=[JobNames.STYLE_CHECK, JobNames.BUILD_CHECK]
run_jobs=[
JobNames.STYLE_CHECK,
JobNames.BUILD_CHECK,
Build.PACKAGE_RELEASE,
Build.PACKAGE_AARCH64,
Build.PACKAGE_ASAN,
Build.PACKAGE_UBSAN,
Build.PACKAGE_TSAN,
Build.PACKAGE_MSAN,
Build.PACKAGE_DEBUG,
Build.BINARY_RELEASE,
Build.PACKAGE_RELEASE_COVERAGE,
Build.FUZZERS,
]
),
CILabels.CI_SET_SPECIAL_BUILDS: LabelConfig(
run_jobs=[JobNames.STYLE_CHECK, JobNames.BUILD_CHECK_SPECIAL]
run_jobs=[
JobNames.STYLE_CHECK,
JobNames.BUILD_CHECK_SPECIAL,
Build.BINARY_TIDY,
Build.BINARY_DARWIN,
Build.BINARY_AARCH64,
Build.BINARY_AARCH64_V80COMPAT,
Build.BINARY_FREEBSD,
Build.BINARY_DARWIN_AARCH64,
Build.BINARY_PPC64LE,
Build.BINARY_RISCV64,
Build.BINARY_S390X,
Build.BINARY_LOONGARCH64,
Build.BINARY_AMD64_COMPAT,
Build.BINARY_AMD64_MUSL,
]
),
CILabels.CI_SET_NON_REQUIRED: LabelConfig(
run_jobs=[job for job in JobNames if job not in REQUIRED_CHECKS]


@ -309,9 +309,6 @@ def main():
state, description, test_results, additional_logs = process_results(
result_path, server_log_path
)
# FIXME (alesapin)
if "azure" in check_name:
state = "success"
else:
print(
"This is validate bugfix or flaky check run, but no changes test to run - skip with success"


@ -0,0 +1,10 @@
<clickhouse>
<database_atomic_delay_before_drop_table_sec>10</database_atomic_delay_before_drop_table_sec>
<allow_moving_table_directory_to_trash>1</allow_moving_table_directory_to_trash>
<merge_tree>
<initialization_retry_period>10</initialization_retry_period>
</merge_tree>
<max_database_replicated_create_table_thread_pool_size>50</max_database_replicated_create_table_thread_pool_size>
<allow_experimental_transactions>42</allow_experimental_transactions>
<replica_group_name>group</replica_group_name>
</clickhouse>


@ -61,7 +61,7 @@ all_nodes = [
bad_settings_node = cluster.add_instance(
"bad_settings_node",
main_configs=["configs/config.xml"],
main_configs=["configs/config2.xml"],
user_configs=["configs/inconsistent_settings.xml"],
with_zookeeper=True,
macros={"shard": 1, "replica": 4},
@ -1522,3 +1522,24 @@ def test_auto_recovery(started_cluster):
assert "42\n" == bad_settings_node.query("SELECT * FROM auto_recovery.t2")
assert "137\n" == bad_settings_node.query("SELECT * FROM auto_recovery.t1")
def test_all_groups_cluster(started_cluster):
dummy_node.query("DROP DATABASE IF EXISTS db_cluster")
bad_settings_node.query("DROP DATABASE IF EXISTS db_cluster")
dummy_node.query(
"CREATE DATABASE db_cluster ENGINE = Replicated('/clickhouse/databases/all_groups_cluster', 'shard1', 'replica1');"
)
bad_settings_node.query(
"CREATE DATABASE db_cluster ENGINE = Replicated('/clickhouse/databases/all_groups_cluster', 'shard1', 'replica2');"
)
assert "dummy_node\n" == dummy_node.query(
"select host_name from system.clusters where name='db_cluster' order by host_name"
)
assert "bad_settings_node\n" == bad_settings_node.query(
"select host_name from system.clusters where name='db_cluster' order by host_name"
)
assert "bad_settings_node\ndummy_node\n" == bad_settings_node.query(
"select host_name from system.clusters where name='all_groups.db_cluster' order by host_name"
)


@ -758,12 +758,12 @@ def test_read_subcolumns(cluster):
)
res = node.query(
f"select a.b.d, _path, a.b, _file, a.e from azureBlobStorage('{storage_account_url}', 'cont', 'test_subcolumns.tsv',"
f"select a.b.d, _path, a.b, _file, dateDiff('minute', _time, now()), a.e from azureBlobStorage('{storage_account_url}', 'cont', 'test_subcolumns.tsv',"
f" 'devstoreaccount1', 'Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/K1SZFPTOtr/KBHBeksoGMGw==', 'auto', 'auto',"
f" 'a Tuple(b Tuple(c UInt32, d UInt32), e UInt32)')"
)
assert res == "2\tcont/test_subcolumns.tsv\t(1,2)\ttest_subcolumns.tsv\t3\n"
assert res == "2\tcont/test_subcolumns.tsv\t(1,2)\ttest_subcolumns.tsv\t0\t3\n"
res = node.query(
f"select a.b.d, _path, a.b, _file, a.e from azureBlobStorage('{storage_account_url}', 'cont', 'test_subcolumns.jsonl',"


@ -987,10 +987,10 @@ def test_read_subcolumns(started_cluster):
assert res == "2\ttest_subcolumns.jsonl\t(1,2)\ttest_subcolumns.jsonl\t3\n"
res = node.query(
f"select x.b.d, _path, x.b, _file, x.e from hdfs('hdfs://hdfs1:9000/test_subcolumns.jsonl', auto, 'x Tuple(b Tuple(c UInt32, d UInt32), e UInt32)')"
f"select x.b.d, _path, x.b, _file, dateDiff('minute', _time, now()), x.e from hdfs('hdfs://hdfs1:9000/test_subcolumns.jsonl', auto, 'x Tuple(b Tuple(c UInt32, d UInt32), e UInt32)')"
)
assert res == "0\ttest_subcolumns.jsonl\t(0,0)\ttest_subcolumns.jsonl\t0\n"
assert res == "0\ttest_subcolumns.jsonl\t(0,0)\ttest_subcolumns.jsonl\t0\t0\n"
res = node.query(
f"select x.b.d, _path, x.b, _file, x.e from hdfs('hdfs://hdfs1:9000/test_subcolumns.jsonl', auto, 'x Tuple(b Tuple(c UInt32, d UInt32), e UInt32) default ((42, 42), 42)')"


@ -2117,10 +2117,12 @@ def test_read_subcolumns(started_cluster):
assert res == "0\troot/test_subcolumns.jsonl\t(0,0)\ttest_subcolumns.jsonl\t0\n"
res = instance.query(
f"select x.b.d, _path, x.b, _file, x.e from s3('http://{started_cluster.minio_host}:{started_cluster.minio_port}/{bucket}/test_subcolumns.jsonl', auto, 'x Tuple(b Tuple(c UInt32, d UInt32), e UInt32) default ((42, 42), 42)')"
f"select x.b.d, _path, x.b, _file, dateDiff('minute', _time, now()), x.e from s3('http://{started_cluster.minio_host}:{started_cluster.minio_port}/{bucket}/test_subcolumns.jsonl', auto, 'x Tuple(b Tuple(c UInt32, d UInt32), e UInt32) default ((42, 42), 42)')"
)
assert res == "42\troot/test_subcolumns.jsonl\t(42,42)\ttest_subcolumns.jsonl\t42\n"
assert (
res == "42\troot/test_subcolumns.jsonl\t(42,42)\ttest_subcolumns.jsonl\t0\t42\n"
)
res = instance.query(
f"select a.b.d, _path, a.b, _file, a.e from url('http://{started_cluster.minio_host}:{started_cluster.minio_port}/{bucket}/test_subcolumns.tsv', auto, 'a Tuple(b Tuple(c UInt32, d UInt32), e UInt32)')"
@ -2148,6 +2150,8 @@ def test_read_subcolumns(started_cluster):
res == "42\t/root/test_subcolumns.jsonl\t(42,42)\ttest_subcolumns.jsonl\t42\n"
)
logging.info("Some custom logging")
def test_filtering_by_file_or_path(started_cluster):
bucket = started_cluster.minio_bucket


@ -24,6 +24,7 @@ function check_refcnt_for_table()
local log_file
log_file=$(mktemp "$CUR_DIR/clickhouse-tests.XXXXXX.log")
local args=(
--allow_repeated_settings
--format Null
--max_threads 1
--max_block_size 1


@ -19,7 +19,7 @@ $CLICKHOUSE_CLIENT -q "select throwIf(substring('$path', 1, 1) != '/', 'Path is
rm -f $path/count.txt
$CLICKHOUSE_CLIENT -q "detach table rmt2 sync"
$CLICKHOUSE_CLIENT --send_logs_level='fatal' -q "attach table rmt2"
$CLICKHOUSE_CLIENT --allow_repeated_settings --send_logs_level='fatal' -q "attach table rmt2"
$CLICKHOUSE_CLIENT -q "select reason, name from system.detached_parts where database='$CLICKHOUSE_DATABASE' and table='rmt2'"


@ -20,12 +20,12 @@ SETTINGS_ANALYZER="SETTINGS use_query_cache=1, max_threads=1, allow_experimental
# Verify that the first query does two aggregations and the second query zero aggregations. Since query cache is currently not integrated
# with EXPLAIN PLAN, we need to check the logs.
${CLICKHOUSE_CLIENT} --send_logs_level=trace --query "SELECT count(a) / (SELECT sum(a) FROM tab) FROM tab $SETTINGS_NO_ANALYZER" 2>&1 | grep "Aggregated. " | wc -l
${CLICKHOUSE_CLIENT} --send_logs_level=trace --query "SELECT count(a) / (SELECT sum(a) FROM tab) FROM tab $SETTINGS_NO_ANALYZER" 2>&1 | grep "Aggregated. " | wc -l
${CLICKHOUSE_CLIENT} --allow_repeated_settings --send_logs_level=trace --query "SELECT count(a) / (SELECT sum(a) FROM tab) FROM tab $SETTINGS_NO_ANALYZER" 2>&1 | grep "Aggregated. " | wc -l
${CLICKHOUSE_CLIENT} --allow_repeated_settings --send_logs_level=trace --query "SELECT count(a) / (SELECT sum(a) FROM tab) FROM tab $SETTINGS_NO_ANALYZER" 2>&1 | grep "Aggregated. " | wc -l
${CLICKHOUSE_CLIENT} --query "SYSTEM DROP QUERY CACHE"
${CLICKHOUSE_CLIENT} --send_logs_level=trace --query "SELECT count(a) / (SELECT sum(a) FROM tab) FROM tab $SETTINGS_ANALYZER" 2>&1 | grep "Aggregated. " | wc -l
${CLICKHOUSE_CLIENT} --send_logs_level=trace --query "SELECT count(a) / (SELECT sum(a) FROM tab) FROM tab $SETTINGS_ANALYZER" 2>&1 | grep "Aggregated. " | wc -l
${CLICKHOUSE_CLIENT} --allow_repeated_settings --send_logs_level=trace --query "SELECT count(a) / (SELECT sum(a) FROM tab) FROM tab $SETTINGS_ANALYZER" 2>&1 | grep "Aggregated. " | wc -l
${CLICKHOUSE_CLIENT} --allow_repeated_settings --send_logs_level=trace --query "SELECT count(a) / (SELECT sum(a) FROM tab) FROM tab $SETTINGS_ANALYZER" 2>&1 | grep "Aggregated. " | wc -l
${CLICKHOUSE_CLIENT} --query "SYSTEM DROP QUERY CACHE"


@ -41,6 +41,6 @@ run_count_with_custom_key "y"
run_count_with_custom_key "cityHash64(y)"
run_count_with_custom_key "cityHash64(y) + 1"
$CLICKHOUSE_CLIENT --query="SELECT count() FROM cluster(test_cluster_one_shard_three_replicas_localhost, currentDatabase(), 02535_custom_key) as t1 JOIN 02535_custom_key USING y" --parallel_replicas_custom_key="y" --send_logs_level="trace" 2>&1 | grep -Fac "JOINs are not supported with"
$CLICKHOUSE_CLIENT --query="SELECT count() FROM cluster(test_cluster_one_shard_three_replicas_localhost, currentDatabase(), 02535_custom_key) as t1 JOIN 02535_custom_key USING y" --allow_repeated_settings --parallel_replicas_custom_key="y" --send_logs_level="trace" 2>&1 | grep -Fac "JOINs are not supported with"
$CLICKHOUSE_CLIENT --query="DROP TABLE 02535_custom_key"


@ -1,5 +1,5 @@
#!/usr/bin/env bash
# Tags: long, zookeeper, no-parallel, no-fasttest
# Tags: long, zookeeper, no-parallel, no-fasttest, no-asan
CURDIR=$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)
# shellcheck source=../shell_config.sh


@ -58,9 +58,9 @@ function filter_temporary_locks()
function insert_duplicates() {
$CLICKHOUSE_CLIENT -q "insert into r1 values(1);" --send_logs_level="error" &
$CLICKHOUSE_CLIENT -q "insert into r1 values(1);" --allow_repeated_settings --send_logs_level="error" &
$CLICKHOUSE_CLIENT -q "insert into r2 values(1);" --send_logs_level="error"
$CLICKHOUSE_CLIENT -q "insert into r2 values(1);" --allow_repeated_settings --send_logs_level="error"
wait
@ -137,8 +137,8 @@ function list_keeper_nodes() {
list_keeper_nodes "${table_shared_id}"
$CLICKHOUSE_CLIENT -nm -q "drop table r1;" --send_logs_level="error" &
$CLICKHOUSE_CLIENT -nm -q "drop table r2;" --send_logs_level="error" &
$CLICKHOUSE_CLIENT -nm -q "drop table r1;" --allow_repeated_settings --send_logs_level="error" &
$CLICKHOUSE_CLIENT -nm -q "drop table r2;" --allow_repeated_settings --send_logs_level="error" &
wait
list_keeper_nodes "${table_shared_id}"


@ -1,13 +0,0 @@
#!/usr/bin/env bash
CURDIR=$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)
# shellcheck source=../shell_config.sh
. "$CURDIR"/../shell_config.sh
echo "1,2" > $CLICKHOUSE_TEST_UNIQUE_NAME.csv
$CLICKHOUSE_LOCAL -nm -q "
create table test (x UInt64, y UInt32, size UInt64) engine=Memory;
insert into test select c1, c2, _size from file('$CLICKHOUSE_TEST_UNIQUE_NAME.csv') settings use_structure_from_insertion_table_in_table_functions=1;
select * from test;
"
rm $CLICKHOUSE_TEST_UNIQUE_NAME.csv


@ -0,0 +1,14 @@
#!/usr/bin/env bash
CURDIR=$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)
# shellcheck source=../shell_config.sh
. "$CURDIR"/../shell_config.sh
echo "1,2" > $CLICKHOUSE_TEST_UNIQUE_NAME.csv
sleep 1
$CLICKHOUSE_LOCAL -nm -q "
create table test (x UInt64, y UInt32, size UInt64, d32 DateTime32, d64 DateTime64) engine=Memory;
insert into test select c1, c2, _size, _time, _time from file('$CLICKHOUSE_TEST_UNIQUE_NAME.csv') settings use_structure_from_insertion_table_in_table_functions=1;
select x, y, size, (dateDiff('millisecond', d32, now()) < 4000 AND dateDiff('millisecond', d32, now()) > 0), (dateDiff('second', d64, now()) < 4 AND dateDiff('second', d64, now()) > 0) from test;
"
rm $CLICKHOUSE_TEST_UNIQUE_NAME.csv


@ -7,7 +7,7 @@ CLICKHOUSE_LOG_COMMENT=
# shellcheck source=../shell_config.sh
. "$CUR_DIR"/../shell_config.sh
CH_CLIENT="$CLICKHOUSE_CLIENT --allow_experimental_variant_type=1 --allow_suspicious_variant_types=1 --index_granularity_bytes=10485760 --index_granularity=8192"
CH_CLIENT="$CLICKHOUSE_CLIENT --allow_experimental_variant_type=1 --allow_suspicious_variant_types=1"
function test1_insert()
{
@ -115,11 +115,11 @@ run 0
$CH_CLIENT -q "drop table test;"
echo "MergeTree compact"
$CH_CLIENT -q "create table test (id UInt64, v Variant(String, UInt64, LowCardinality(String), Tuple(a UInt32, b UInt32), Array(UInt64))) engine=MergeTree order by id settings min_rows_for_wide_part=100000000, min_bytes_for_wide_part=1000000000;"
$CH_CLIENT -q "create table test (id UInt64, v Variant(String, UInt64, LowCardinality(String), Tuple(a UInt32, b UInt32), Array(UInt64))) engine=MergeTree order by id settings min_rows_for_wide_part=100000000, min_bytes_for_wide_part=1000000000, index_granularity_bytes=10485760, index_granularity=8192;"
run 1
$CH_CLIENT -q "drop table test;"
echo "MergeTree wide"
$CH_CLIENT -q "create table test (id UInt64, v Variant(String, UInt64, LowCardinality(String), Tuple(a UInt32, b UInt32), Array(UInt64))) engine=MergeTree order by id settings min_rows_for_wide_part=1, min_bytes_for_wide_part=1;"
$CH_CLIENT -q "create table test (id UInt64, v Variant(String, UInt64, LowCardinality(String), Tuple(a UInt32, b UInt32), Array(UInt64))) engine=MergeTree order by id settings min_rows_for_wide_part=1, min_bytes_for_wide_part=1, index_granularity_bytes=10485760, index_granularity=8192;"
run 1
$CH_CLIENT -q "drop table test;"


@ -7,7 +7,7 @@ CLICKHOUSE_LOG_COMMENT=
# shellcheck source=../shell_config.sh
. "$CUR_DIR"/../shell_config.sh
CH_CLIENT="$CLICKHOUSE_CLIENT --allow_experimental_variant_type=1 --allow_suspicious_variant_types=1 --index_granularity_bytes=10485760 --index_granularity=8192"
CH_CLIENT="$CLICKHOUSE_CLIENT --allow_experimental_variant_type=1 --allow_suspicious_variant_types=1"
function test4_insert()
{
@ -61,11 +61,11 @@ run 0
$CH_CLIENT -q "drop table test;"
echo "MergeTree compact"
$CH_CLIENT -q "create table test (id UInt64, v Variant(String, UInt64, LowCardinality(String), Tuple(a UInt32, b UInt32), Array(UInt64))) engine=MergeTree order by id settings min_rows_for_wide_part=100000000, min_bytes_for_wide_part=1000000000;"
$CH_CLIENT -q "create table test (id UInt64, v Variant(String, UInt64, LowCardinality(String), Tuple(a UInt32, b UInt32), Array(UInt64))) engine=MergeTree order by id settings min_rows_for_wide_part=100000000, min_bytes_for_wide_part=1000000000, index_granularity_bytes=10485760, index_granularity=8192;"
run 1
$CH_CLIENT -q "drop table test;"
echo "MergeTree wide"
$CH_CLIENT -q "create table test (id UInt64, v Variant(String, UInt64, LowCardinality(String), Tuple(a UInt32, b UInt32), Array(UInt64))) engine=MergeTree order by id settings min_rows_for_wide_part=1, min_bytes_for_wide_part=1;"
$CH_CLIENT -q "create table test (id UInt64, v Variant(String, UInt64, LowCardinality(String), Tuple(a UInt32, b UInt32), Array(UInt64))) engine=MergeTree order by id settings min_rows_for_wide_part=1, min_bytes_for_wide_part=1, index_granularity_bytes=10485760, index_granularity=8192;"
run 1
$CH_CLIENT -q "drop table test;"


@ -7,7 +7,7 @@ CLICKHOUSE_LOG_COMMENT=
# shellcheck source=../shell_config.sh
. "$CUR_DIR"/../shell_config.sh
CH_CLIENT="$CLICKHOUSE_CLIENT --allow_experimental_variant_type=1 --allow_suspicious_variant_types=1 --index_granularity_bytes=10485760 --index_granularity=8192 "
CH_CLIENT="$CLICKHOUSE_CLIENT --allow_experimental_variant_type=1 --allow_suspicious_variant_types=1"
function test5_insert()
{
@ -63,11 +63,11 @@ run 0
$CH_CLIENT -q "drop table test;"
echo "MergeTree compact"
$CH_CLIENT -q "create table test (id UInt64, v Variant(String, UInt64, LowCardinality(String), Tuple(a UInt32, b UInt32), Array(UInt64))) engine=MergeTree order by id settings min_rows_for_wide_part=100000000, min_bytes_for_wide_part=1000000000;"
$CH_CLIENT -q "create table test (id UInt64, v Variant(String, UInt64, LowCardinality(String), Tuple(a UInt32, b UInt32), Array(UInt64))) engine=MergeTree order by id settings min_rows_for_wide_part=100000000, min_bytes_for_wide_part=1000000000, index_granularity_bytes=10485760, index_granularity=8192;"
run 1
$CH_CLIENT -q "drop table test;"
echo "MergeTree wide"
$CH_CLIENT -q "create table test (id UInt64, v Variant(String, UInt64, LowCardinality(String), Tuple(a UInt32, b UInt32), Array(UInt64))) engine=MergeTree order by id settings min_rows_for_wide_part=1, min_bytes_for_wide_part=1;"
$CH_CLIENT -q "create table test (id UInt64, v Variant(String, UInt64, LowCardinality(String), Tuple(a UInt32, b UInt32), Array(UInt64))) engine=MergeTree order by id settings min_rows_for_wide_part=1, min_bytes_for_wide_part=1, index_granularity_bytes=10485760, index_granularity=8192;"
run 1
$CH_CLIENT -q "drop table test;"


@ -7,7 +7,8 @@ CLICKHOUSE_LOG_COMMENT=
# shellcheck source=../shell_config.sh
. "$CUR_DIR"/../shell_config.sh
CH_CLIENT="$CLICKHOUSE_CLIENT --allow_experimental_variant_type=1 --allow_suspicious_variant_types=1 --index_granularity_bytes=10485760 --index_granularity=8192 "
CH_CLIENT="$CLICKHOUSE_CLIENT --allow_experimental_variant_type=1 --allow_suspicious_variant_types=1"
function test6_insert()
{
@ -57,11 +58,11 @@ run 0
$CH_CLIENT -q "drop table test;"
echo "MergeTree compact"
$CH_CLIENT -q "create table test (id UInt64, v Variant(String, UInt64, LowCardinality(String), Tuple(a UInt32, b UInt32), Array(UInt64))) engine=MergeTree order by id settings min_rows_for_wide_part=100000000, min_bytes_for_wide_part=1000000000;"
$CH_CLIENT -q "create table test (id UInt64, v Variant(String, UInt64, LowCardinality(String), Tuple(a UInt32, b UInt32), Array(UInt64))) engine=MergeTree order by id settings min_rows_for_wide_part=100000000, min_bytes_for_wide_part=1000000000, index_granularity_bytes=10485760, index_granularity=8192;"
run 1
$CH_CLIENT -q "drop table test;"
echo "MergeTree wide"
$CH_CLIENT -q "create table test (id UInt64, v Variant(String, UInt64, LowCardinality(String), Tuple(a UInt32, b UInt32), Array(UInt64))) engine=MergeTree order by id settings min_rows_for_wide_part=1, min_bytes_for_wide_part=1;"
$CH_CLIENT -q "create table test (id UInt64, v Variant(String, UInt64, LowCardinality(String), Tuple(a UInt32, b UInt32), Array(UInt64))) engine=MergeTree order by id settings min_rows_for_wide_part=1, min_bytes_for_wide_part=1, index_granularity_bytes=10485760, index_granularity=8192;"
run 1
$CH_CLIENT -q "drop table test;"


@ -1,49 +0,0 @@
#!/usr/bin/expect -f
set basedir [file dirname $argv0]
set basename [file tail $argv0]
if {[info exists env(CLICKHOUSE_TMP)]} {
set CLICKHOUSE_TMP $env(CLICKHOUSE_TMP)
} else {
set CLICKHOUSE_TMP "."
}
exp_internal -f $CLICKHOUSE_TMP/$basename.debuglog 0
log_user 0
set timeout 60
match_max 100000
set stty_init "rows 25 cols 120"
expect_after {
-i $any_spawn_id eof { exp_continue }
-i $any_spawn_id timeout { exit 1 }
}
spawn clickhouse-local
expect ":) "
# Trivial SELECT with LIMIT from system.zeros shows progress bar.
send "SELECT * FROM system.zeros LIMIT 10000000 FORMAT Null SETTINGS max_execution_speed = 1000000, timeout_before_checking_execution_speed = 0, max_block_size = 128\r"
expect "Progress: "
expect "█"
send "\3"
expect "Query was cancelled."
expect ":) "
send "SELECT * FROM system.zeros_mt LIMIT 10000000 FORMAT Null SETTINGS max_execution_speed = 1000000, timeout_before_checking_execution_speed = 0, max_block_size = 128\r"
expect "Progress: "
expect "█"
send "\3"
expect "Query was cancelled."
expect ":) "
# As well as from generateRandom
send "SELECT * FROM generateRandom() LIMIT 10000000 FORMAT Null SETTINGS max_execution_speed = 1000000, timeout_before_checking_execution_speed = 0, max_block_size = 128\r"
expect "Progress: "
expect "█"
send "\3"
expect "Query was cancelled."
expect ":) "
send "exit\r"
expect eof


@ -0,0 +1,18 @@
#!/usr/bin/env bash
# Tags: no-random-settings
CUR_DIR=$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)
# shellcheck source=../shell_config.sh
. "$CUR_DIR"/../shell_config.sh
function run_with_progress_and_match_total_rows()
{
CURL_RESPONSE=$(echo "$1" | \
${CLICKHOUSE_CURL} -vsS "${CLICKHOUSE_URL}&wait_end_of_query=1&max_block_size=1&send_progress_in_http_headers=1&http_headers_progress_interval_ms=0&output_format_parallel_formatting=0" --data-binary @- 2>&1)
echo "$CURL_RESPONSE" | grep -q '"total_rows_to_read":"100"' && echo "Matched" || echo "Expected total_rows_to_read not found: ${CURL_RESPONSE}"
}
run_with_progress_and_match_total_rows 'SELECT * FROM system.zeros LIMIT 100'
run_with_progress_and_match_total_rows 'SELECT * FROM system.zeros_mt LIMIT 100'
run_with_progress_and_match_total_rows "SELECT * FROM generateRandom('number UInt64') LIMIT 100"


@ -8,7 +8,7 @@ CLICKHOUSE_LOG_COMMENT=
. "$CUR_DIR"/../shell_config.sh
CH_CLIENT="$CLICKHOUSE_CLIENT --allow_experimental_dynamic_type=1 --index_granularity_bytes 10485760 --merge_max_block_size 8192 --merge_max_block_size_bytes=10485760 --index_granularity 8192"
CH_CLIENT="$CLICKHOUSE_CLIENT --allow_experimental_dynamic_type=1"
function test()
{
@ -41,12 +41,12 @@ function test()
$CH_CLIENT -q "drop table if exists test;"
echo "MergeTree compact"
$CH_CLIENT -q "create table test (id UInt64, d Dynamic(max_types=3)) engine=MergeTree order by id settings min_rows_for_wide_part=1000000000, min_bytes_for_wide_part=10000000000, vertical_merge_algorithm_min_columns_to_activate=10;"
$CH_CLIENT -q "create table test (id UInt64, d Dynamic(max_types=3)) engine=MergeTree order by id settings min_rows_for_wide_part=1000000000, min_bytes_for_wide_part=10000000000, vertical_merge_algorithm_min_columns_to_activate=10, index_granularity_bytes=10485760, index_granularity=8192, merge_max_block_size=8192, merge_max_block_size_bytes=10485760;"
test
$CH_CLIENT -q "drop table test;"
echo "MergeTree wide"
$CH_CLIENT -q "create table test (id UInt64, d Dynamic(max_types=3)) engine=MergeTree order by id settings min_rows_for_wide_part=1, min_bytes_for_wide_part=1, vertical_merge_algorithm_min_columns_to_activate=10;"
$CH_CLIENT -q "create table test (id UInt64, d Dynamic(max_types=3)) engine=MergeTree order by id settings min_rows_for_wide_part=1, min_bytes_for_wide_part=1, vertical_merge_algorithm_min_columns_to_activate=10, index_granularity_bytes=10485760, index_granularity=8192, merge_max_block_size=8192, merge_max_block_size_bytes=10485760;"
test
$CH_CLIENT -q "drop table test;"


@ -9,7 +9,7 @@ CLICKHOUSE_LOG_COMMENT=
CH_CLIENT="$CLICKHOUSE_CLIENT --allow_experimental_dynamic_type=1 --index_granularity_bytes 10485760 --merge_max_block_size 8192 --merge_max_block_size_bytes=10485760 --index_granularity 8192"
CH_CLIENT="$CLICKHOUSE_CLIENT --allow_experimental_dynamic_type=1"
function test()
{
echo "test"
@ -41,11 +41,11 @@ function test()
$CH_CLIENT -q "drop table if exists test;"
echo "MergeTree compact"
$CH_CLIENT -q "create table test (id UInt64, d Dynamic(max_types=3)) engine=MergeTree order by id settings min_rows_for_wide_part=1000000000, min_bytes_for_wide_part=10000000000, vertical_merge_algorithm_min_rows_to_activate=1, vertical_merge_algorithm_min_columns_to_activate=1;"
$CH_CLIENT -q "create table test (id UInt64, d Dynamic(max_types=3)) engine=MergeTree order by id settings min_rows_for_wide_part=1000000000, min_bytes_for_wide_part=10000000000, vertical_merge_algorithm_min_rows_to_activate=1, vertical_merge_algorithm_min_columns_to_activate=1, index_granularity_bytes=10485760, index_granularity=8192, merge_max_block_size=8192, merge_max_block_size_bytes=10485760;"
test
$CH_CLIENT -q "drop table test;"
echo "MergeTree wide"
$CH_CLIENT -q "create table test (id UInt64, d Dynamic(max_types=3)) engine=MergeTree order by id settings min_rows_for_wide_part=1, min_bytes_for_wide_part=1, vertical_merge_algorithm_min_rows_to_activate=1, vertical_merge_algorithm_min_columns_to_activate=1;"
$CH_CLIENT -q "create table test (id UInt64, d Dynamic(max_types=3)) engine=MergeTree order by id settings min_rows_for_wide_part=1, min_bytes_for_wide_part=1, vertical_merge_algorithm_min_rows_to_activate=1, vertical_merge_algorithm_min_columns_to_activate=1, index_granularity_bytes=10485760, index_granularity=8192, merge_max_block_size=8192, merge_max_block_size_bytes=10485760;"
test
$CH_CLIENT -q "drop table test;"


@ -7,6 +7,7 @@ CLICKHOUSE_LOG_COMMENT=
# shellcheck source=../shell_config.sh
. "$CUR_DIR"/../shell_config.sh
# Fix some settings to avoid timeouts caused by settings randomization
CH_CLIENT="$CLICKHOUSE_CLIENT --allow_merge_tree_settings --allow_experimental_dynamic_type=1 --index_granularity_bytes 10485760 --index_granularity 8128 --merge_max_block_size 8128"
@ -32,7 +33,7 @@ echo "MergeTree wide + horizontal merge"
test "min_rows_for_wide_part=1, min_bytes_for_wide_part=1"
echo "MergeTree compact + vertical merge"
test "min_rows_for_wide_part=100000000000, min_bytes_for_wide_part=1000000000000, vertical_merge_algorithm_min_rows_to_activate=1, vertical_merge_algorithm_min_columns_to_activate=1"
test "min_rows_for_wide_part=100000000000, min_bytes_for_wide_part=1000000000000, vertical_merge_algorithm_min_rows_to_activate=1, vertical_merge_algorithm_min_columns_to_activate=1;"
echo "MergeTree wide + vertical merge"
test "min_rows_for_wide_part=1, min_bytes_for_wide_part=1, vertical_merge_algorithm_min_rows_to_activate=1, vertical_merge_algorithm_min_columns_to_activate=1"
test "min_rows_for_wide_part=1, min_bytes_for_wide_part=1, vertical_merge_algorithm_min_rows_to_activate=1, vertical_merge_algorithm_min_columns_to_activate=1;"


@ -0,0 +1,2 @@
{'ja':0.62,'fr':0.36}
{'ja':0.62,'fr':0.36}


@ -0,0 +1,10 @@
-- Tags: no-fasttest
-- Tag no-fasttest: depends on cld2
-- https://github.com/ClickHouse/ClickHouse/issues/64931
SELECT detectLanguageMixed(materialize('二兎を追う者は一兎をも得ず二兎を追う者は一兎をも得ず A vaincre sans peril, on triomphe sans gloire.'))
GROUP BY
GROUPING SETS (
('a', toUInt256(1)),
(stringToH3(toFixedString(toFixedString('85283473ffffff', 14), 14))))
SETTINGS allow_experimental_nlp_functions = 1;


@ -0,0 +1,6 @@
-- https://github.com/ClickHouse/ClickHouse/issues/64946
SELECT
multiIf((number % toLowCardinality(toNullable(toUInt128(2)))) = (number % toNullable(2)), toInt8(1), (number % materialize(toLowCardinality(3))) = toUInt128(toNullable(0)), toInt8(materialize(materialize(2))), toInt64(toUInt128(3)))
FROM system.numbers
LIMIT 44857
FORMAT Null;


@ -1,4 +1,5 @@
v24.5.1.1763-stable 2024-06-01
v24.4.2.141-stable 2024-06-07
v24.4.1.2088-stable 2024-05-01
v24.3.3.102-lts 2024-05-01
v24.3.2.23-lts 2024-04-03
