Mirror of https://github.com/ClickHouse/ClickHouse.git (synced 2024-11-24 08:32:02 +00:00)

Merge branch 'master' into fix-gwp-asan
Commit 5e907a5fb9

contrib/cld2 (vendored submodule, 2 changes)
@@ -1 +1 @@
Subproject commit bc6d493a2f64ed1fc1c4c4b4294a542a04e04217
Subproject commit 217ba8b8805b41557faadaa47bb6e99f2242eea3
docs/changelogs/v24.4.2.141-stable.md (new file, 101 lines)
@@ -0,0 +1,101 @@
---
sidebar_position: 1
sidebar_label: 2024
---

# 2024 Changelog

### ClickHouse release v24.4.2.141-stable (9e23d27bd11) FIXME as compared to v24.4.1.2088-stable (6d4b31322d1)

#### Improvement
* Backported in [#63467](https://github.com/ClickHouse/ClickHouse/issues/63467): Make RabbitMQ nack broken messages. Closes [#45350](https://github.com/ClickHouse/ClickHouse/issues/45350). [#60312](https://github.com/ClickHouse/ClickHouse/pull/60312) ([Kseniia Sumarokova](https://github.com/kssenii)).

#### Build/Testing/Packaging Improvement
* Backported in [#63612](https://github.com/ClickHouse/ClickHouse/issues/63612): The Dockerfile is reviewed by the Docker official library in https://github.com/docker-library/official-images/pull/15846. [#63400](https://github.com/ClickHouse/ClickHouse/pull/63400) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).

#### Bug Fix (user-visible misbehavior in an official stable release)
* Backported in [#64279](https://github.com/ClickHouse/ClickHouse/issues/64279): Fix queries with FINAL giving a wrong result when the table does not use adaptive granularity. [#62432](https://github.com/ClickHouse/ClickHouse/pull/62432) ([Duc Canh Le](https://github.com/canhld94)).
* Backported in [#63295](https://github.com/ClickHouse/ClickHouse/issues/63295): Fix crash with untuple and unresolved lambda. [#63131](https://github.com/ClickHouse/ClickHouse/pull/63131) ([Raúl Marín](https://github.com/Algunenano)).
* Backported in [#63978](https://github.com/ClickHouse/ClickHouse/issues/63978): Fix intersecting parts when restarting after a drop range. [#63202](https://github.com/ClickHouse/ClickHouse/pull/63202) ([Han Fei](https://github.com/hanfei1991)).
* Backported in [#63413](https://github.com/ClickHouse/ClickHouse/issues/63413): Fix a misbehavior where SQL security defaults were not loaded for old tables during server startup. [#63209](https://github.com/ClickHouse/ClickHouse/pull/63209) ([pufit](https://github.com/pufit)).
* Backported in [#63388](https://github.com/ClickHouse/ClickHouse/issues/63388): Fix JOIN filter push-down for filled JOIN. Closes [#63228](https://github.com/ClickHouse/ClickHouse/issues/63228). [#63234](https://github.com/ClickHouse/ClickHouse/pull/63234) ([Maksim Kita](https://github.com/kitaisreal)).
* Backported in [#63618](https://github.com/ClickHouse/ClickHouse/issues/63618): Fix a bug which could potentially lead to a rare LOGICAL_ERROR during a SELECT query with the message: `Unexpected return type from materialize. Expected type_XXX. Got type_YYY.` Introduced in [#59379](https://github.com/ClickHouse/ClickHouse/issues/59379). [#63353](https://github.com/ClickHouse/ClickHouse/pull/63353) ([alesapin](https://github.com/alesapin)).
* Backported in [#63451](https://github.com/ClickHouse/ClickHouse/issues/63451): Fix the `X-ClickHouse-Timezone` header returning a wrong timezone when using `session_timezone` as a query-level setting. [#63377](https://github.com/ClickHouse/ClickHouse/pull/63377) ([Andrey Zvonov](https://github.com/zvonand)).
* Backported in [#63605](https://github.com/ClickHouse/ClickHouse/issues/63605): Fix backup of a projection part in case the projection was removed from the table metadata but the part still has the projection. [#63426](https://github.com/ClickHouse/ClickHouse/pull/63426) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Backported in [#63510](https://github.com/ClickHouse/ClickHouse/issues/63510): Fix the 'Every derived table must have its own alias' error for a MySQL dictionary source, close [#63341](https://github.com/ClickHouse/ClickHouse/issues/63341). [#63481](https://github.com/ClickHouse/ClickHouse/pull/63481) ([vdimir](https://github.com/vdimir)).
* Backported in [#63592](https://github.com/ClickHouse/ClickHouse/issues/63592): Avoid a segfault in `MergeTreePrefetchedReadPool` while fetching projection parts. [#63513](https://github.com/ClickHouse/ClickHouse/pull/63513) ([Antonio Andelic](https://github.com/antonio2368)).
* Backported in [#63750](https://github.com/ClickHouse/ClickHouse/issues/63750): Read only the necessary columns from a VIEW (new analyzer). Closes [#62594](https://github.com/ClickHouse/ClickHouse/issues/62594). [#63688](https://github.com/ClickHouse/ClickHouse/pull/63688) ([Maksim Kita](https://github.com/kitaisreal)).
* Backported in [#63772](https://github.com/ClickHouse/ClickHouse/issues/63772): Fix [#63539](https://github.com/ClickHouse/ClickHouse/issues/63539). Forbid WINDOW redefinition in the new analyzer. [#63694](https://github.com/ClickHouse/ClickHouse/pull/63694) ([Dmitry Novik](https://github.com/novikd)).
* Backported in [#63872](https://github.com/ClickHouse/ClickHouse/issues/63872): Fix `flatten_nested` being broken with Replicated databases. [#63695](https://github.com/ClickHouse/ClickHouse/pull/63695) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Backported in [#63854](https://github.com/ClickHouse/ClickHouse/issues/63854): Fix `Not found column` and `CAST AS Map from array requires nested tuple of 2 elements` exceptions for distributed queries which use the `Map(Nothing, Nothing)` type. Fixes [#63637](https://github.com/ClickHouse/ClickHouse/issues/63637). [#63753](https://github.com/ClickHouse/ClickHouse/pull/63753) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Backported in [#63847](https://github.com/ClickHouse/ClickHouse/issues/63847): Fix a possible `ILLEGAL_COLUMN` error in `partial_merge` join, close [#37928](https://github.com/ClickHouse/ClickHouse/issues/37928). [#63755](https://github.com/ClickHouse/ClickHouse/pull/63755) ([vdimir](https://github.com/vdimir)).
* Backported in [#63908](https://github.com/ClickHouse/ClickHouse/issues/63908): Fix `query_plan_remove_redundant_distinct` breaking queries with window functions (when `allow_experimental_analyzer` is on). Fixes [#62820](https://github.com/ClickHouse/ClickHouse/issues/62820). [#63776](https://github.com/ClickHouse/ClickHouse/pull/63776) ([Igor Nikonov](https://github.com/devcrafter)).
* Backported in [#63955](https://github.com/ClickHouse/ClickHouse/issues/63955): Fix a possible crash with SYSTEM UNLOAD PRIMARY KEY. [#63778](https://github.com/ClickHouse/ClickHouse/pull/63778) ([Raúl Marín](https://github.com/Algunenano)).
* Backported in [#63938](https://github.com/ClickHouse/ClickHouse/issues/63938): Allow JOIN filter push-down to both streams if only a single equivalent column is used in the query. Closes [#63799](https://github.com/ClickHouse/ClickHouse/issues/63799). [#63819](https://github.com/ClickHouse/ClickHouse/pull/63819) ([Maksim Kita](https://github.com/kitaisreal)).
* Backported in [#63991](https://github.com/ClickHouse/ClickHouse/issues/63991): Fix an incorrect SELECT query result when parallel replicas were used to read from a Materialized View. [#63861](https://github.com/ClickHouse/ClickHouse/pull/63861) ([Nikita Taranov](https://github.com/nickitat)).
* Backported in [#64033](https://github.com/ClickHouse/ClickHouse/issues/64033): Fix an error `Database name is empty` for remote queries with lambdas over a cluster with a modified default database. Fixes [#63471](https://github.com/ClickHouse/ClickHouse/issues/63471). [#63864](https://github.com/ClickHouse/ClickHouse/pull/63864) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Backported in [#64561](https://github.com/ClickHouse/ClickHouse/issues/64561): Fix SIGSEGV due to the CPU/Real (`query_profiler_real_time_period_ns`/`query_profiler_cpu_time_period_ns`) profiler (an issue since 2022 that leads to periodic server crashes, especially if you were using the Distributed engine). [#63865](https://github.com/ClickHouse/ClickHouse/pull/63865) ([Azat Khuzhin](https://github.com/azat)).
* Backported in [#64011](https://github.com/ClickHouse/ClickHouse/issues/64011): Fix analyzer: make the IN function with arbitrarily deep sub-selects in a materialized view use the insertion block. [#63930](https://github.com/ClickHouse/ClickHouse/pull/63930) ([Yakov Olkhovskiy](https://github.com/yakov-olkhovskiy)).
* Backported in [#64238](https://github.com/ClickHouse/ClickHouse/issues/64238): Fix resolution of the unqualified COLUMNS matcher. Preserve the input columns order and forbid usage of unknown identifiers. [#63962](https://github.com/ClickHouse/ClickHouse/pull/63962) ([Dmitry Novik](https://github.com/novikd)).
* Backported in [#64103](https://github.com/ClickHouse/ClickHouse/issues/64103): Deserialize untrusted binary inputs in a safer way. [#64024](https://github.com/ClickHouse/ClickHouse/pull/64024) ([Robert Schulze](https://github.com/rschu1ze)).
* Backported in [#64170](https://github.com/ClickHouse/ClickHouse/issues/64170): Add missing settings to recoverLostReplica. [#64040](https://github.com/ClickHouse/ClickHouse/pull/64040) ([Raúl Marín](https://github.com/Algunenano)).
* Backported in [#64322](https://github.com/ClickHouse/ClickHouse/issues/64322): This fix uses a properly redefined context with the correct definer for each individual view in the query pipeline. Closes [#63777](https://github.com/ClickHouse/ClickHouse/issues/63777). [#64079](https://github.com/ClickHouse/ClickHouse/pull/64079) ([pufit](https://github.com/pufit)).
* Backported in [#64382](https://github.com/ClickHouse/ClickHouse/issues/64382): Fix analyzer: "Not found column" error when using INTERPOLATE. [#64096](https://github.com/ClickHouse/ClickHouse/pull/64096) ([Yakov Olkhovskiy](https://github.com/yakov-olkhovskiy)).
* Backported in [#64568](https://github.com/ClickHouse/ClickHouse/issues/64568): Fix creating backups to S3 buckets with different credentials from the disk containing the file. [#64153](https://github.com/ClickHouse/ClickHouse/pull/64153) ([Antonio Andelic](https://github.com/antonio2368)).
* Backported in [#64272](https://github.com/ClickHouse/ClickHouse/issues/64272): Prevent LOGICAL_ERROR on CREATE TABLE as MaterializedView. [#64174](https://github.com/ClickHouse/ClickHouse/pull/64174) ([Raúl Marín](https://github.com/Algunenano)).
* Backported in [#64330](https://github.com/ClickHouse/ClickHouse/issues/64330): The query cache now considers two identical queries against different databases as different. The previous behavior could be used to bypass missing privileges to read from a table. [#64199](https://github.com/ClickHouse/ClickHouse/pull/64199) ([Robert Schulze](https://github.com/rschu1ze)).
* Backported in [#64254](https://github.com/ClickHouse/ClickHouse/issues/64254): Ignore the `text_log` config when using Keeper. [#64218](https://github.com/ClickHouse/ClickHouse/pull/64218) ([Antonio Andelic](https://github.com/antonio2368)).
* Backported in [#64690](https://github.com/ClickHouse/ClickHouse/issues/64690): Fix Query Tree size validation. Closes [#63701](https://github.com/ClickHouse/ClickHouse/issues/63701). [#64377](https://github.com/ClickHouse/ClickHouse/pull/64377) ([Dmitry Novik](https://github.com/novikd)).
* Backported in [#64409](https://github.com/ClickHouse/ClickHouse/issues/64409): Fix `Logical error: Bad cast` for a `Buffer` table with `PREWHERE`. Fixes [#64172](https://github.com/ClickHouse/ClickHouse/issues/64172). [#64388](https://github.com/ClickHouse/ClickHouse/pull/64388) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Backported in [#64727](https://github.com/ClickHouse/ClickHouse/issues/64727): Fixed `CREATE TABLE AS` queries for tables with default expressions. [#64455](https://github.com/ClickHouse/ClickHouse/pull/64455) ([Anton Popov](https://github.com/CurtizJ)).
* Backported in [#64623](https://github.com/ClickHouse/ClickHouse/issues/64623): Fix a `Cannot find column` error in distributed queries with a constant CTE in the `GROUP BY` key. [#64519](https://github.com/ClickHouse/ClickHouse/pull/64519) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Backported in [#64680](https://github.com/ClickHouse/ClickHouse/issues/64680): Fix [#64612](https://github.com/ClickHouse/ClickHouse/issues/64612). Do not rewrite aggregation if the `-If` combinator is already used. [#64638](https://github.com/ClickHouse/ClickHouse/pull/64638) ([Dmitry Novik](https://github.com/novikd)).
* Backported in [#64942](https://github.com/ClickHouse/ClickHouse/issues/64942): Fix OrderByLimitByDuplicateEliminationVisitor across subqueries. [#64766](https://github.com/ClickHouse/ClickHouse/pull/64766) ([Raúl Marín](https://github.com/Algunenano)).
* Backported in [#64871](https://github.com/ClickHouse/ClickHouse/issues/64871): Fixed possible incorrect memory tracking in several kinds of queries: queries that read any data from S3, queries via the HTTP protocol, and asynchronous inserts. [#64844](https://github.com/ClickHouse/ClickHouse/pull/64844) ([Anton Popov](https://github.com/CurtizJ)).

#### CI Fix or Improvement (changelog entry is not required)
* Backported in [#63364](https://github.com/ClickHouse/ClickHouse/issues/63364): Implement cumulative A Sync status. [#61464](https://github.com/ClickHouse/ClickHouse/pull/61464) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Backported in [#63338](https://github.com/ClickHouse/ClickHouse/issues/63338): Use `/commit/` to have the URLs in [reports](https://play.clickhouse.com/play?user=play#c2VsZWN0IGRpc3RpbmN0IGNvbW1pdF91cmwgZnJvbSBjaGVja3Mgd2hlcmUgY2hlY2tfc3RhcnRfdGltZSA+PSBub3coKSAtIGludGVydmFsIDEgbW9udGggYW5kIHB1bGxfcmVxdWVzdF9udW1iZXI9NjA1MzI=) like https://github.com/ClickHouse/ClickHouse/commit/44f8bc5308b53797bec8cccc3bd29fab8a00235d and not like https://github.com/ClickHouse/ClickHouse/commits/44f8bc5308b53797bec8cccc3bd29fab8a00235d. [#63331](https://github.com/ClickHouse/ClickHouse/pull/63331) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Backported in [#63376](https://github.com/ClickHouse/ClickHouse/issues/63376):. [#63366](https://github.com/ClickHouse/ClickHouse/pull/63366) ([Aleksei Filatov](https://github.com/aalexfvk)).
* Backported in [#63571](https://github.com/ClickHouse/ClickHouse/issues/63571):. [#63551](https://github.com/ClickHouse/ClickHouse/pull/63551) ([Konstantin Bogdanov](https://github.com/thevar1able)).
* Backported in [#63651](https://github.com/ClickHouse/ClickHouse/issues/63651): Fix the flaky test 02362_part_log_merge_algorithm. [#63635](https://github.com/ClickHouse/ClickHouse/pull/63635) ([Miсhael Stetsyuk](https://github.com/mstetsyuk)).
* Backported in [#63828](https://github.com/ClickHouse/ClickHouse/issues/63828): Fix test_odbc_interaction for aarch64. [#61457](https://github.com/ClickHouse/ClickHouse/issues/61457). [#63787](https://github.com/ClickHouse/ClickHouse/pull/63787) ([alesapin](https://github.com/alesapin)).
* Backported in [#63897](https://github.com/ClickHouse/ClickHouse/issues/63897): Fix test `test_catboost_evaluate` for aarch64. [#61457](https://github.com/ClickHouse/ClickHouse/issues/61457). [#63789](https://github.com/ClickHouse/ClickHouse/pull/63789) ([alesapin](https://github.com/alesapin)).
* Backported in [#63889](https://github.com/ClickHouse/ClickHouse/issues/63889): Remove HDFS from the disks config for one integration test for arm. [#61457](https://github.com/ClickHouse/ClickHouse/issues/61457). [#63832](https://github.com/ClickHouse/ClickHouse/pull/63832) ([alesapin](https://github.com/alesapin)).
* Backported in [#63881](https://github.com/ClickHouse/ClickHouse/issues/63881): Bump the version of the old image in test_short_strings_aggregation to make it work on arm. [#61457](https://github.com/ClickHouse/ClickHouse/issues/61457). [#63836](https://github.com/ClickHouse/ClickHouse/pull/63836) ([alesapin](https://github.com/alesapin)).
* Backported in [#63919](https://github.com/ClickHouse/ClickHouse/issues/63919): Disable the test `test_non_default_compression/test.py::test_preconfigured_deflateqpl_codec` on arm. [#61457](https://github.com/ClickHouse/ClickHouse/issues/61457). [#63839](https://github.com/ClickHouse/ClickHouse/pull/63839) ([alesapin](https://github.com/alesapin)).
* Backported in [#63971](https://github.com/ClickHouse/ClickHouse/issues/63971): Fix 02124_insert_deduplication_token_multiple_blocks. [#63950](https://github.com/ClickHouse/ClickHouse/pull/63950) ([Han Fei](https://github.com/hanfei1991)).
* Backported in [#64049](https://github.com/ClickHouse/ClickHouse/issues/64049): Add a `ClickHouseVersion.copy` method. Create a branch release in advance without spinning out the release, to increase stability. [#64039](https://github.com/ClickHouse/ClickHouse/pull/64039) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Backported in [#64078](https://github.com/ClickHouse/ClickHouse/issues/64078): The MIME type is not 100% reliable for Python and shell scripts without shebangs; add a check for the file extension. [#64062](https://github.com/ClickHouse/ClickHouse/pull/64062) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Backported in [#64161](https://github.com/ClickHouse/ClickHouse/issues/64161): Add retries to the git submodule update. [#64125](https://github.com/ClickHouse/ClickHouse/pull/64125) ([Alexey Milovidov](https://github.com/alexey-milovidov)).

#### Critical Bug Fix (crash, LOGICAL_ERROR, data loss, RBAC)
* Backported in [#64589](https://github.com/ClickHouse/ClickHouse/issues/64589): Disabled the `enable_vertical_final` setting by default. This feature should not be used because it has a bug: [#64543](https://github.com/ClickHouse/ClickHouse/issues/64543). [#64544](https://github.com/ClickHouse/ClickHouse/pull/64544) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Backported in [#64880](https://github.com/ClickHouse/ClickHouse/issues/64880): This PR fixes an error where a user in a specific situation could escalate their privileges on the default database without the necessary grants. [#64769](https://github.com/ClickHouse/ClickHouse/pull/64769) ([pufit](https://github.com/pufit)).

#### NO CL CATEGORY
* Backported in [#63306](https://github.com/ClickHouse/ClickHouse/issues/63306):. [#63297](https://github.com/ClickHouse/ClickHouse/pull/63297) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Backported in [#63710](https://github.com/ClickHouse/ClickHouse/issues/63710):. [#63415](https://github.com/ClickHouse/ClickHouse/pull/63415) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).

#### NO CL ENTRY
* NO CL ENTRY: 'Revert "Backport [#64363](https://github.com/ClickHouse/ClickHouse/issues/64363) to 24.4: Split tests 03039_dynamic_all_merge_algorithms to avoid timeouts"'. [#64905](https://github.com/ClickHouse/ClickHouse/pull/64905) ([Raúl Marín](https://github.com/Algunenano)).

#### NOT FOR CHANGELOG / INSIGNIFICANT
* group_by_use_nulls strikes back [#62922](https://github.com/ClickHouse/ClickHouse/pull/62922) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Add `FROM` keyword to `TRUNCATE ALL TABLES` [#63241](https://github.com/ClickHouse/ClickHouse/pull/63241) ([Yarik Briukhovetskyi](https://github.com/yariks5s)).
* More checks for concurrently deleted files and dirs in system.remote_data_paths [#63274](https://github.com/ClickHouse/ClickHouse/pull/63274) ([Alexander Gololobov](https://github.com/davenger)).
* Try to fix segfault in `MergeTreeReadPoolBase::createTask` [#63323](https://github.com/ClickHouse/ClickHouse/pull/63323) ([Antonio Andelic](https://github.com/antonio2368)).
* Skip inaccessible table dirs in system.remote_data_paths [#63330](https://github.com/ClickHouse/ClickHouse/pull/63330) ([Alexander Gololobov](https://github.com/davenger)).
* Workaround for the `oklch()` inside canvas bug in Firefox [#63404](https://github.com/ClickHouse/ClickHouse/pull/63404) ([Sergei Trifonov](https://github.com/serxa)).
* Cancel S3 reads properly when parallel reads are used [#63687](https://github.com/ClickHouse/ClickHouse/pull/63687) ([Antonio Andelic](https://github.com/antonio2368)).
* Userspace page cache: don't collect stats if the cache is unused [#63730](https://github.com/ClickHouse/ClickHouse/pull/63730) ([Michael Kolupaev](https://github.com/al13n321)).
* Fix sanitizers [#64090](https://github.com/ClickHouse/ClickHouse/pull/64090) ([Azat Khuzhin](https://github.com/azat)).
* Split tests 03039_dynamic_all_merge_algorithms to avoid timeouts [#64363](https://github.com/ClickHouse/ClickHouse/pull/64363) ([Kruglov Pavel](https://github.com/Avogar)).
* CI: Critical bugfix category in PR template [#64480](https://github.com/ClickHouse/ClickHouse/pull/64480) ([Max K.](https://github.com/maxknv)).
@@ -54,6 +54,7 @@ SELECT * FROM test_table;
- `_path` — Path to the file. Type: `LowCardinality(String)`.
- `_file` — Name of the file. Type: `LowCardinality(String)`.
- `_size` — Size of the file in bytes. Type: `Nullable(UInt64)`. If the size is unknown, the value is `NULL`.
- `_time` — Last modified time of the file. Type: `Nullable(DateTime)`. If the time is unknown, the value is `NULL`.
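These virtual columns can be selected like ordinary columns. A minimal sketch, assuming the `test_table` from the example above is backed by a file-like engine; the query itself is illustrative, not part of the original docs:

```sql
-- Show each underlying file with its size and last-modified time.
-- _size and _time are Nullable and return NULL when the value is unknown.
SELECT _path, _file, _size, _time, count() AS rows
FROM test_table
GROUP BY _path, _file, _size, _time;
```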

## See also

@@ -235,6 +235,7 @@ libhdfs3 support HDFS namenode HA.
- `_path` — Path to the file. Type: `LowCardinality(String)`.
- `_file` — Name of the file. Type: `LowCardinality(String)`.
- `_size` — Size of the file in bytes. Type: `Nullable(UInt64)`. If the size is unknown, the value is `NULL`.
- `_time` — Last modified time of the file. Type: `Nullable(DateTime)`. If the time is unknown, the value is `NULL`.

## Storage Settings {#storage-settings}
@@ -53,14 +53,14 @@ For partitioning by month, use the `toYYYYMM(date_column)` expression, where `da

This example uses the [docker compose recipe](https://github.com/ClickHouse/examples/tree/5fdc6ff72f4e5137e23ea075c88d3f44b0202490/docker-compose-recipes/recipes/ch-and-minio-S3), which integrates ClickHouse and MinIO. You should be able to reproduce the same queries using S3 by replacing the endpoint and authentication values.

Notice that the S3 endpoint in the `ENGINE` configuration uses the parameter token `{_partition_id}` as part of the S3 object (filename), and that the SELECT queries select against those resulting object names (e.g., `test_3.csv`).

:::note
As shown in the example, querying from S3 tables that are partitioned is
not directly supported at this time, but can be accomplished by querying the individual partitions
using the S3 table function.
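For instance, the partition object `test_3.csv` from this example can be read back directly with the `s3` table function. A sketch reusing the MinIO endpoint and credentials shown in this example:

```sql
-- Query a single partition object directly; the structure is inferred
-- from the CSV, or can be passed explicitly as an extra argument.
SELECT *
FROM s3('http://minio:10000/clickhouse//test_3.csv',
        'minioadmin', 'minioadminpassword', 'CSV');
```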

The primary use-case for writing
partitioned data in S3 is to enable transferring that data into another
ClickHouse system (for example, moving from on-prem systems to ClickHouse
Cloud). Because ClickHouse datasets are often very large, and network
@@ -78,9 +78,9 @@ CREATE TABLE p
)
ENGINE = S3(
# highlight-next-line
'http://minio:10000/clickhouse//test_{_partition_id}.csv',
'minioadmin',
'minioadminpassword',
'CSV')
PARTITION BY column3
```

@@ -145,6 +145,7 @@ Code: 48. DB::Exception: Received from localhost:9000. DB::Exception: Reading fr
- `_path` — Path to the file. Type: `LowCardinality(String)`.
- `_file` — Name of the file. Type: `LowCardinality(String)`.
- `_size` — Size of the file in bytes. Type: `Nullable(UInt64)`. If the size is unknown, the value is `NULL`.
- `_time` — Last modified time of the file. Type: `Nullable(DateTime)`. If the time is unknown, the value is `NULL`.

For more information about virtual columns see [here](../../../engines/table-engines/index.md#table_engines-virtual_columns).
@@ -102,6 +102,7 @@ For partitioning by month, use the `toYYYYMM(date_column)` expression, where `da
- `_path` — Path to the file. Type: `LowCardinality(String)`.
- `_file` — Name of the file. Type: `LowCardinality(String)`.
- `_size` — Size of the file in bytes. Type: `Nullable(UInt64)`. If the size is unknown, the value is `NULL`.
- `_time` — Last modified time of the file. Type: `Nullable(DateTime)`. If the time is unknown, the value is `NULL`.

## Settings {#settings}

@@ -108,6 +108,7 @@ For partitioning by month, use the `toYYYYMM(date_column)` expression, where `da
- `_path` — Path to the `URL`. Type: `LowCardinality(String)`.
- `_file` — Resource name of the `URL`. Type: `LowCardinality(String)`.
- `_size` — Size of the resource in bytes. Type: `Nullable(UInt64)`. If the size is unknown, the value is `NULL`.
- `_time` — Last modified time of the file. Type: `Nullable(DateTime)`. If the time is unknown, the value is `NULL`.

## Storage Settings {#storage-settings}

@@ -72,6 +72,7 @@ SELECT count(*) FROM azureBlobStorage('DefaultEndpointsProtocol=https;AccountNam
- `_path` — Path to the file. Type: `LowCardinality(String)`.
- `_file` — Name of the file. Type: `LowCardinality(String)`.
- `_size` — Size of the file in bytes. Type: `Nullable(UInt64)`. If the file size is unknown, the value is `NULL`.
- `_time` — Last modified time of the file. Type: `Nullable(DateTime)`. If the time is unknown, the value is `NULL`.

**See Also**

@@ -196,6 +196,7 @@ SELECT count(*) FROM file('big_dir/**/file002', 'CSV', 'name String, value UInt3
- `_path` — Path to the file. Type: `LowCardinality(String)`.
- `_file` — Name of the file. Type: `LowCardinality(String)`.
- `_size` — Size of the file in bytes. Type: `Nullable(UInt64)`. If the file size is unknown, the value is `NULL`.
- `_time` — Last modified time of the file. Type: `Nullable(DateTime)`. If the time is unknown, the value is `NULL`.

## Settings {#settings}

@@ -97,6 +97,7 @@ FROM hdfs('hdfs://hdfs1:9000/big_dir/file{0..9}{0..9}{0..9}', 'CSV', 'name Strin
- `_path` — Path to the file. Type: `LowCardinality(String)`.
- `_file` — Name of the file. Type: `LowCardinality(String)`.
- `_size` — Size of the file in bytes. Type: `Nullable(UInt64)`. If the size is unknown, the value is `NULL`.
- `_time` — Last modified time of the file. Type: `Nullable(DateTime)`. If the time is unknown, the value is `NULL`.

## Storage Settings {#storage-settings}

@@ -272,6 +272,7 @@ FROM s3(
- `_path` — Path to the file. Type: `LowCardinality(String)`.
- `_file` — Name of the file. Type: `LowCardinality(String)`.
- `_size` — Size of the file in bytes. Type: `Nullable(UInt64)`. If the file size is unknown, the value is `NULL`.
- `_time` — Last modified time of the file. Type: `Nullable(DateTime)`. If the time is unknown, the value is `NULL`.
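As a usage sketch for these virtual columns with the `s3` table function (the bucket URL, credentials, and format below are placeholders, not values from this page):

```sql
-- Inspect which objects a glob matches, with per-object size and mtime.
SELECT _path, _file, _size, _time, count() AS rows
FROM s3('https://example-bucket.s3.amazonaws.com/data/*.csv',
        'ACCESS_KEY_ID', 'SECRET_ACCESS_KEY', 'CSVWithNames')
GROUP BY _path, _file, _size, _time;
```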

## Storage Settings {#storage-settings}

@@ -53,6 +53,7 @@ Character `|` inside patterns is used to specify failover addresses. They are it
- `_path` — Path to the `URL`. Type: `LowCardinality(String)`.
- `_file` — Resource name of the `URL`. Type: `LowCardinality(String)`.
- `_size` — Size of the resource in bytes. Type: `Nullable(UInt64)`. If the size is unknown, the value is `NULL`.
- `_time` — Last modified time of the file. Type: `Nullable(DateTime)`. If the time is unknown, the value is `NULL`.

## Storage Settings {#storage-settings}
@@ -54,9 +54,9 @@ namespace
S3::PocoHTTPClientConfiguration client_configuration = S3::ClientFactory::instance().createClientConfiguration(
settings.auth_settings.region,
context->getRemoteHostFilter(),
static_cast<unsigned>(global_settings.s3_max_redirects),
static_cast<unsigned>(global_settings.s3_retry_attempts),
global_settings.enable_s3_requests_logging,
static_cast<unsigned>(local_settings.s3_max_redirects),
static_cast<unsigned>(local_settings.backup_restore_s3_retry_attempts),
local_settings.enable_s3_requests_logging,
/* for_disk_s3 = */ false,
request_settings.get_request_throttler,
request_settings.put_request_throttler,
@@ -289,10 +289,14 @@ void executeColumnIfNeeded(ColumnWithTypeAndName & column, bool empty)
if (!column_function)
return;

size_t original_size = column.column->size();

if (!empty)
column = column_function->reduce();
else
column.column = column_function->getResultType()->createColumn();
column.column = column_function->getResultType()->createColumnConstWithDefaultValue(original_size)->convertToFullColumnIfConst();

chassert(column.column->size() == original_size);
}

int checkShortCircuitArguments(const ColumnsWithTypeAndName & arguments)
@@ -517,6 +517,7 @@ class IColumn;
M(UInt64, backup_restore_keeper_value_max_size, 1048576, "Maximum size of data of a [Zoo]Keeper's node during backup", 0) \
M(UInt64, backup_restore_batch_size_for_keeper_multiread, 10000, "Maximum size of batch for multiread request to [Zoo]Keeper during backup or restore", 0) \
M(UInt64, backup_restore_batch_size_for_keeper_multi, 1000, "Maximum size of batch for multi request to [Zoo]Keeper during backup or restore", 0) \
M(UInt64, backup_restore_s3_retry_attempts, 1000, "Setting for Aws::Client::RetryStrategy, Aws::Client does retries itself, 0 means no retries. It takes place only for backup/restore.", 0) \
M(UInt64, max_backup_bandwidth, 0, "The maximum read speed in bytes per second for particular backup on server. Zero means unlimited.", 0) \
\
M(Bool, log_profile_events, true, "Log query performance statistics into the query_log, query_thread_log and query_views_log.", 0) \
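The `backup_restore_s3_retry_attempts` setting declared above is an ordinary session-level setting, so it can be tuned per operation. A hypothetical sketch (the table name, bucket URL, and credentials are placeholders, not taken from this change):

```sql
-- Raise the AWS client retry budget only for this backup.
SET backup_restore_s3_retry_attempts = 500;

BACKUP TABLE default.events
TO S3('https://example-bucket.s3.amazonaws.com/backups/events/', 'ACCESS_KEY_ID', 'SECRET_ACCESS_KEY');
```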
@@ -113,6 +113,7 @@ static const std::map<ClickHouseVersion, SettingsChangesHistory::SettingsChanges
{"http_max_chunk_size", 0, 0, "Internal limitation"},
{"prefer_external_sort_block_bytes", 0, DEFAULT_BLOCK_SIZE * 256, "Prefer maximum block bytes for external sort, reduce the memory usage during merging."},
{"input_format_force_null_for_omitted_fields", false, false, "Disable type-defaults for omitted fields when needed"},
{"backup_restore_s3_retry_attempts", 0, 1000, "A new setting."},
{"cast_string_to_dynamic_use_inference", false, false, "Add setting to allow converting String to Dynamic through parsing"},
{"allow_experimental_dynamic_type", false, false, "Add new experimental Dynamic type"},
{"azure_max_blocks_in_multipart_upload", 50000, 50000, "Maximum number of blocks in multipart upload for Azure."},
@@ -122,6 +122,13 @@ DatabaseReplicated::DatabaseReplicated(
fillClusterAuthInfo(db_settings.collection_name.value, context_->getConfigRef());

replica_group_name = context_->getConfigRef().getString("replica_group_name", "");

if (!replica_group_name.empty() && database_name.starts_with(DatabaseReplicated::ALL_GROUPS_CLUSTER_PREFIX))
{
context_->addWarningMessage(fmt::format("There's a Replicated database with a name starting from '{}', "
"and replica_group_name is configured. It may cause collisions in cluster names.",
ALL_GROUPS_CLUSTER_PREFIX));
}
}

String DatabaseReplicated::getFullReplicaName(const String & shard, const String & replica)

@@ -173,13 +180,40 @@ ClusterPtr DatabaseReplicated::tryGetCluster() const
return cluster;
}

void DatabaseReplicated::setCluster(ClusterPtr && new_cluster)
ClusterPtr DatabaseReplicated::tryGetAllGroupsCluster() const
{
std::lock_guard lock{mutex};
cluster = std::move(new_cluster);
if (replica_group_name.empty())
return nullptr;

if (cluster_all_groups)
return cluster_all_groups;

/// Database is probably not created or not initialized yet, it's ok to return nullptr
if (is_readonly)
return cluster_all_groups;

try
{
cluster_all_groups = getClusterImpl(/*all_groups*/ true);
}
catch (...)
{
tryLogCurrentException(log);
}
return cluster_all_groups;
}

ClusterPtr DatabaseReplicated::getClusterImpl() const
void DatabaseReplicated::setCluster(ClusterPtr && new_cluster, bool all_groups)
{
std::lock_guard lock{mutex};
if (all_groups)
cluster_all_groups = std::move(new_cluster);
else
cluster = std::move(new_cluster);
}

ClusterPtr DatabaseReplicated::getClusterImpl(bool all_groups) const
{
Strings unfiltered_hosts;
Strings hosts;
@@ -199,17 +233,24 @@ ClusterPtr DatabaseReplicated::getClusterImpl() const
"It's possible if the first replica is not fully created yet "
"or if the last replica was just dropped or due to logical error", zookeeper_path);

hosts.clear();
std::vector<String> paths;
for (const auto & host : unfiltered_hosts)
paths.push_back(zookeeper_path + "/replicas/" + host + "/replica_group");

auto replica_groups = zookeeper->tryGet(paths);

for (size_t i = 0; i < paths.size(); ++i)
if (all_groups)
{
if (replica_groups[i].data == replica_group_name)
hosts.push_back(unfiltered_hosts[i]);
hosts = unfiltered_hosts;
}
else
{
hosts.clear();
std::vector<String> paths;
for (const auto & host : unfiltered_hosts)
paths.push_back(zookeeper_path + "/replicas/" + host + "/replica_group");

auto replica_groups = zookeeper->tryGet(paths);

for (size_t i = 0; i < paths.size(); ++i)
{
if (replica_groups[i].data == replica_group_name)
hosts.push_back(unfiltered_hosts[i]);
}
}

Int32 cversion = stat.cversion;

@@ -274,6 +315,11 @@ ClusterPtr DatabaseReplicated::getClusterImpl() const

bool treat_local_as_remote = false;
bool treat_local_port_as_remote = getContext()->getApplicationType() == Context::ApplicationType::LOCAL;

String cluster_name = TSA_SUPPRESS_WARNING_FOR_READ(database_name); /// FIXME
if (all_groups)
cluster_name = ALL_GROUPS_CLUSTER_PREFIX + cluster_name;

ClusterConnectionParameters params{
cluster_auth_info.cluster_username,
cluster_auth_info.cluster_password,

@@ -282,7 +328,7 @@ ClusterPtr DatabaseReplicated::getClusterImpl() const
treat_local_port_as_remote,
cluster_auth_info.cluster_secure_connection,
Priority{1},
TSA_SUPPRESS_WARNING_FOR_READ(database_name), /// FIXME
cluster_name,
cluster_auth_info.cluster_secret};

return std::make_shared<Cluster>(getContext()->getSettingsRef(), shards, params);
@@ -20,6 +20,8 @@ using ClusterPtr = std::shared_ptr<Cluster>;
class DatabaseReplicated : public DatabaseAtomic
{
public:
static constexpr auto ALL_GROUPS_CLUSTER_PREFIX = "all_groups.";

DatabaseReplicated(const String & name_, const String & metadata_path_, UUID uuid,
const String & zookeeper_path_, const String & shard_name_, const String & replica_name_,
DatabaseReplicatedSettings db_settings_,

@@ -65,6 +67,7 @@ public:

/// Returns cluster consisting of database replicas
ClusterPtr tryGetCluster() const;
ClusterPtr tryGetAllGroupsCluster() const;

void drop(ContextPtr /*context*/) override;

@@ -113,8 +116,8 @@ private:
ASTPtr parseQueryFromMetadataInZooKeeper(const String & node_name, const String & query);
String readMetadataFile(const String & table_name) const;

ClusterPtr getClusterImpl() const;
void setCluster(ClusterPtr && new_cluster);
ClusterPtr getClusterImpl(bool all_groups = false) const;
void setCluster(ClusterPtr && new_cluster, bool all_groups = false);

void createEmptyLogEntry(const ZooKeeperPtr & current_zookeeper);

@@ -155,6 +158,7 @@ private:
UInt64 tables_metadata_digest TSA_GUARDED_BY(metadata_mutex);

mutable ClusterPtr cluster;
mutable ClusterPtr cluster_all_groups;

LoadTaskPtr startup_replicated_database_task TSA_GUARDED_BY(mutex);
};
@@ -421,6 +421,8 @@ DDLTaskPtr DatabaseReplicatedDDLWorker::initAndCheckTask(const String & entry_na
{
/// Some replica is added or removed, let's update cached cluster
database->setCluster(database->getClusterImpl());
if (!database->replica_group_name.empty())
database->setCluster(database->getClusterImpl(/*all_groups*/ true), /*all_groups*/ true);
out_reason = fmt::format("Entry {} is a dummy task", entry_name);
return {};
}
@@ -41,11 +41,11 @@ void applyMetadataChangesToCreateQuery(const ASTPtr & query, const StorageInMemo
throw Exception(ErrorCodes::NOT_IMPLEMENTED, "Cannot alter table {} because it was created AS table function"
" and doesn't have structure in metadata", backQuote(ast_create_query.getTable()));

if (!has_structure && !ast_create_query.is_dictionary)
if (!has_structure && !ast_create_query.is_dictionary && !ast_create_query.isParameterizedView())
throw Exception(ErrorCodes::LOGICAL_ERROR, "Cannot alter table {} metadata doesn't have structure",
backQuote(ast_create_query.getTable()));

if (!ast_create_query.is_dictionary)
if (!ast_create_query.is_dictionary && !ast_create_query.isParameterizedView())
{
ASTPtr new_columns = InterpreterCreateQuery::formatColumns(metadata.columns);
ASTPtr new_indices = InterpreterCreateQuery::formatIndices(metadata.secondary_indices);
@@ -382,7 +382,6 @@ void S3ObjectStorage::removeObjectsImpl(const StoredObjects & objects, bool if_e
{
std::vector<Aws::S3::Model::ObjectIdentifier> current_chunk;
String keys;
size_t first_position = current_position;
for (; current_position < objects.size() && current_chunk.size() < chunk_size_limit; ++current_position)
{
Aws::S3::Model::ObjectIdentifier obj;

@@ -408,9 +407,9 @@ void S3ObjectStorage::removeObjectsImpl(const StoredObjects & objects, bool if_e
{
const auto * outcome_error = outcome.IsSuccess() ? nullptr : &outcome.GetError();
auto time_now = std::chrono::system_clock::now();
for (size_t i = first_position; i < current_position; ++i)
for (const auto & object : objects)
blob_storage_log->addEvent(BlobStorageLogElement::EventType::Delete,
uri.bucket, objects[i].remote_path, objects[i].local_path, objects[i].bytes_size,
uri.bucket, object.remote_path, object.local_path, object.bytes_size,
outcome_error, time_now);
}
@@ -5,6 +5,7 @@
#include <functional>
#include <memory>

#include <Poco/Timestamp.h>

namespace DB
{

@@ -25,6 +26,7 @@ public:
{
UInt64 uncompressed_size;
UInt64 compressed_size;
Poco::Timestamp last_modified;
bool is_encrypted;
};

@@ -157,6 +157,7 @@ public:
file_info.emplace();
file_info->uncompressed_size = archive_entry_size(current_entry);
file_info->compressed_size = archive_entry_size(current_entry);
file_info->last_modified = archive_entry_mtime(current_entry);
file_info->is_encrypted = false;
}
@@ -162,7 +162,7 @@ public:
class RetryStrategy : public Aws::Client::RetryStrategy
{
public:
explicit RetryStrategy(uint32_t maxRetries_ = 10, uint32_t scaleFactor_ = 25, uint32_t maxDelayMs_ = 90000);
explicit RetryStrategy(uint32_t maxRetries_ = 10, uint32_t scaleFactor_ = 25, uint32_t maxDelayMs_ = 5000);

/// NOLINTNEXTLINE(google-runtime-int)
bool ShouldRetry(const Aws::Client::AWSError<Aws::Client::CoreErrors>& error, long attemptedRetries) const override;
@@ -568,8 +568,21 @@ void ZooKeeperMetadataTransaction::commit()

ClusterPtr tryGetReplicatedDatabaseCluster(const String & cluster_name)
{
if (const auto * replicated_db = dynamic_cast<const DatabaseReplicated *>(DatabaseCatalog::instance().tryGetDatabase(cluster_name).get()))
return replicated_db->tryGetCluster();
String name = cluster_name;
bool all_groups = false;
if (name.starts_with(DatabaseReplicated::ALL_GROUPS_CLUSTER_PREFIX))
{
name = name.substr(strlen(DatabaseReplicated::ALL_GROUPS_CLUSTER_PREFIX));
all_groups = true;
}

if (const auto * replicated_db = dynamic_cast<const DatabaseReplicated *>(DatabaseCatalog::instance().tryGetDatabase(name).get()))
{
if (all_groups)
return replicated_db->tryGetAllGroupsCluster();
else
return replicated_db->tryGetCluster();
}
return {};
}
@@ -504,10 +504,6 @@ void SystemLog<LogElement>::flushImpl(const std::vector<LogElement> & to_flush,
Block block(std::move(log_element_columns));

MutableColumns columns = block.mutateColumns();

for (auto & column : columns)
column->reserve(to_flush.size());

for (const auto & elem : to_flush)
elem.appendToBlock(columns);

@@ -536,8 +532,7 @@ void SystemLog<LogElement>::flushImpl(const std::vector<LogElement> & to_flush,
}
catch (...)
{
tryLogCurrentException(__PRETTY_FUNCTION__, fmt::format("Failed to flush system log {} with {} entries up to offset {}",
table_id.getNameForLogs(), to_flush.size(), to_flush_end));
tryLogCurrentException(__PRETTY_FUNCTION__);
}

queue->confirm(to_flush_end);
@@ -18,6 +18,7 @@
#include <Interpreters/inplaceBlockConversions.h>
#include <Interpreters/InterpreterSelectWithUnionQuery.h>
#include <Interpreters/InterpreterSelectQueryAnalyzer.h>
#include <Storages/StorageView.h>
#include <Parsers/ASTAlterQuery.h>
#include <Parsers/ASTColumnDeclaration.h>
#include <Parsers/ASTConstraintDeclaration.h>

@@ -1613,7 +1614,10 @@ void AlterCommands::validate(const StoragePtr & table, ContextPtr context) const
}
}

if (all_columns.empty())
/// Parameterized views do not have 'columns' in their metadata
bool is_parameterized_view = table->as<StorageView>() && table->as<StorageView>()->isParameterizedView();

if (!is_parameterized_view && all_columns.empty())
throw Exception(ErrorCodes::BAD_ARGUMENTS, "Cannot DROP or CLEAR all columns");

validateColumnsDefaultsAndGetSampleBlock(default_expr_list, all_columns.getAll(), context);
@@ -195,12 +195,14 @@ Chunk StorageObjectStorageSource::generate()
const auto & object_info = reader.getObjectInfo();
const auto & filename = object_info.getFileName();
chassert(object_info.metadata);
VirtualColumnUtils::addRequestedPathFileAndSizeVirtualsToChunk(
chunk,
read_from_format_info.requested_virtual_columns,
getUniqueStoragePathIdentifier(*configuration, reader.getObjectInfo(), false),
object_info.metadata->size_bytes, &filename);

VirtualColumnUtils::addRequestedFileLikeStorageVirtualsToChunk(
chunk, read_from_format_info.requested_virtual_columns,
{
.path = getUniqueStoragePathIdentifier(*configuration, reader.getObjectInfo(), false),
.size = object_info.metadata->size_bytes,
.filename = &filename,
.last_modified = object_info.metadata->last_modified
});
return chunk;
}
@@ -421,8 +421,14 @@ Chunk StorageS3QueueSource::generate()
file_status->processed_rows += chunk.getNumRows();
processed_rows_from_file += chunk.getNumRows();

VirtualColumnUtils::addRequestedPathFileAndSizeVirtualsToChunk(
chunk, requested_virtual_columns, path, reader.getObjectInfo().metadata->size_bytes);
VirtualColumnUtils::addRequestedFileLikeStorageVirtualsToChunk(
chunk, requested_virtual_columns,
{
.path = path,
.size = reader.getObjectInfo().metadata->size_bytes
});

return chunk;
}
}
@@ -1341,6 +1341,7 @@ Chunk StorageFileSource::generate()
chassert(file_enumerator);
current_path = fmt::format("{}::{}", archive_reader->getPath(), *filename_override);
current_file_size = file_enumerator->getFileInfo().uncompressed_size;
current_file_last_modified = file_enumerator->getFileInfo().last_modified;
if (need_only_count && tryGetCountFromCache(current_archive_stat))
continue;

@@ -1370,6 +1371,7 @@ Chunk StorageFileSource::generate()
struct stat file_stat;
file_stat = getFileStat(current_path, storage->use_table_fd, storage->table_fd, storage->getName());
current_file_size = file_stat.st_size;
current_file_last_modified = Poco::Timestamp::fromEpochTime(file_stat.st_mtime);

if (getContext()->getSettingsRef().engine_file_skip_empty_files && file_stat.st_size == 0)
continue;

@@ -1436,8 +1438,15 @@ Chunk StorageFileSource::generate()
progress(num_rows, chunk_size ? chunk_size : chunk.bytes());

/// Enrich with virtual columns.
VirtualColumnUtils::addRequestedPathFileAndSizeVirtualsToChunk(
chunk, requested_virtual_columns, current_path, current_file_size, filename_override.has_value() ? &filename_override.value() : nullptr);
VirtualColumnUtils::addRequestedFileLikeStorageVirtualsToChunk(
chunk, requested_virtual_columns,
{
.path = current_path,
.size = current_file_size,
.filename = (filename_override.has_value() ? &filename_override.value() : nullptr),
.last_modified = current_file_last_modified
});

return chunk;
}

@@ -279,6 +279,7 @@ private:
FilesIteratorPtr files_iterator;
String current_path;
std::optional<size_t> current_file_size;
std::optional<Poco::Timestamp> current_file_last_modified;
struct stat current_archive_stat;
std::optional<String> filename_override;
Block sample_block;
@@ -50,6 +50,12 @@ namespace ErrorCodes
namespace
{

struct GenerateRandomState
{
std::atomic<UInt64> add_total_rows = 0;
};
using GenerateRandomStatePtr = std::shared_ptr<GenerateRandomState>;

void fillBufferWithRandomData(char * __restrict data, size_t limit, size_t size_of_type, pcg64 & rng, [[maybe_unused]] bool flip_bytes = false)
{
size_t size = limit * size_of_type;

@@ -532,10 +538,24 @@ ColumnPtr fillColumnWithRandomData(
class GenerateSource : public ISource
{
public:
GenerateSource(UInt64 block_size_, UInt64 max_array_length_, UInt64 max_string_length_, UInt64 random_seed_, Block block_header_, ContextPtr context_)
GenerateSource(
UInt64 block_size_,
UInt64 max_array_length_,
UInt64 max_string_length_,
UInt64 random_seed_,
Block block_header_,
ContextPtr context_,
GenerateRandomStatePtr state_)
: ISource(Nested::flattenNested(prepareBlockToFill(block_header_)))
, block_size(block_size_), max_array_length(max_array_length_), max_string_length(max_string_length_)
, block_to_fill(std::move(block_header_)), rng(random_seed_), context(context_) {}
, block_size(block_size_)
, max_array_length(max_array_length_)
, max_string_length(max_string_length_)
, block_to_fill(std::move(block_header_))
, rng(random_seed_)
, context(context_)
, shared_state(state_)
{
}

String getName() const override { return "GenerateRandom"; }

@@ -549,7 +569,15 @@ protected:
columns.emplace_back(fillColumnWithRandomData(elem.type, block_size, max_array_length, max_string_length, rng, context));

columns = Nested::flattenNested(block_to_fill.cloneWithColumns(columns)).getColumns();
return {std::move(columns), block_size};

UInt64 total_rows = shared_state->add_total_rows.fetch_and(0);
if (total_rows)
addTotalRowsApprox(total_rows);

auto chunk = Chunk{std::move(columns), block_size};
progress(chunk.getNumRows(), chunk.bytes());

return chunk;
}

private:

@@ -561,6 +589,7 @@ private:
pcg64 rng;

ContextPtr context;
GenerateRandomStatePtr shared_state;

static Block & prepareBlockToFill(Block & block)
{

@@ -648,9 +677,6 @@ Pipe StorageGenerateRandom::read(
{
storage_snapshot->check(column_names);

Pipes pipes;
pipes.reserve(num_streams);

const ColumnsDescription & our_columns = storage_snapshot->metadata->getColumns();
Block block_header;
for (const auto & name : column_names)

@@ -679,16 +705,24 @@ Pipe StorageGenerateRandom::read(
}
}

UInt64 query_limit = query_info.limit;
if (query_limit && num_streams * max_block_size > query_limit)
{
/// We want to avoid spawning more streams than necessary
num_streams = std::min(num_streams, static_cast<size_t>(((query_limit + max_block_size - 1) / max_block_size)));
}
Pipes pipes;
pipes.reserve(num_streams);

/// Will create more seed values for each source from initial seed.
pcg64 generate(random_seed);

auto shared_state = std::make_shared<GenerateRandomState>(query_info.limit);

for (UInt64 i = 0; i < num_streams; ++i)
{
auto source = std::make_shared<GenerateSource>(max_block_size, max_array_length, max_string_length, generate(), block_header, context);

if (i == 0 && query_info.limit)
source->addTotalRowsApprox(query_info.limit);

auto source = std::make_shared<GenerateSource>(
max_block_size, max_array_length, max_string_length, generate(), block_header, context, shared_state);
pipes.emplace_back(std::move(source));
}
@@ -411,7 +411,12 @@ Chunk StorageURLSource::generate()
if (input_format)
chunk_size = input_format->getApproxBytesReadForChunk();
progress(num_rows, chunk_size ? chunk_size : chunk.bytes());
VirtualColumnUtils::addRequestedPathFileAndSizeVirtualsToChunk(chunk, requested_virtual_columns, curr_uri.getPath(), current_file_size);
VirtualColumnUtils::addRequestedFileLikeStorageVirtualsToChunk(
chunk, requested_virtual_columns,
{
.path = curr_uri.getPath(),
.size = current_file_size
});
return chunk;
}
@@ -54,6 +54,10 @@ void StorageSystemClusters::fillData(MutableColumns & res_columns, ContextPtr co
if (auto database_cluster = replicated->tryGetCluster())
writeCluster(res_columns, {name_and_database.first, database_cluster},
replicated->tryGetAreReplicasActive(database_cluster));

if (auto database_cluster = replicated->tryGetAllGroupsCluster())
writeCluster(res_columns, {DatabaseReplicated::ALL_GROUPS_CLUSTER_PREFIX + name_and_database.first, database_cluster},
replicated->tryGetAreReplicasActive(database_cluster));
}
}
}
@@ -16,7 +16,9 @@ namespace

struct ZerosState
{
explicit ZerosState(UInt64 limit) : add_total_rows(limit) { }
std::atomic<UInt64> num_generated_rows = 0;
std::atomic<UInt64> add_total_rows = 0;
};

using ZerosStatePtr = std::shared_ptr<ZerosState>;

@@ -42,10 +44,13 @@ protected:
auto column_ptr = column;
size_t column_size = column_ptr->size();

if (state)
UInt64 total_rows = state->add_total_rows.fetch_and(0);
if (total_rows)
addTotalRowsApprox(total_rows);

if (limit)
{
auto generated_rows = state->num_generated_rows.fetch_add(column_size, std::memory_order_acquire);

if (generated_rows >= limit)
return {};

@@ -103,36 +108,25 @@ Pipe StorageSystemZeros::read(
{
storage_snapshot->check(column_names);

bool use_multiple_streams = multithreaded;
UInt64 query_limit = limit ? *limit : 0;
if (query_info.limit)
query_limit = query_limit ? std::min(query_limit, query_info.limit) : query_info.limit;

if (limit && *limit < max_block_size)
{
max_block_size = static_cast<size_t>(*limit);
use_multiple_streams = false;
}
if (query_limit && query_limit < max_block_size)
max_block_size = query_limit;

if (!use_multiple_streams)
if (!multithreaded)
num_streams = 1;
else if (query_limit && num_streams * max_block_size > query_limit)
/// We want to avoid spawning more streams than necessary
num_streams = std::min(num_streams, static_cast<size_t>(((query_limit + max_block_size - 1) / max_block_size)));

ZerosStatePtr state = std::make_shared<ZerosState>(query_limit);

Pipe res;

ZerosStatePtr state;

if (limit)
state = std::make_shared<ZerosState>();

for (size_t i = 0; i < num_streams; ++i)
{
auto source = std::make_shared<ZerosSource>(max_block_size, limit ? *limit : 0, state);

if (i == 0)
{
if (limit)
source->addTotalRowsApprox(*limit);
else if (query_info.limit)
source->addTotalRowsApprox(query_info.limit);
}

auto source = std::make_shared<ZerosSource>(max_block_size, query_limit, state);
res.addSource(std::move(source));
}
@@ -26,6 +26,7 @@
#include <DataTypes/DataTypesNumber.h>
#include <DataTypes/DataTypeString.h>
#include <DataTypes/DataTypeLowCardinality.h>
#include <DataTypes/DataTypeDateTime.h>

#include <Processors/QueryPlan/QueryPlan.h>
#include <Processors/QueryPlan/BuildQueryPipelineSettings.h>

@@ -111,7 +112,7 @@ void filterBlockWithDAG(ActionsDAGPtr dag, Block & block, ContextPtr context)

NameSet getVirtualNamesForFileLikeStorage()
{
return {"_path", "_file", "_size"};
return {"_path", "_file", "_size", "_time"};
}

VirtualColumnsDescription getVirtualsForFileLikeStorage(const ColumnsDescription & storage_columns)

@@ -129,6 +130,7 @@ VirtualColumnsDescription getVirtualsForFileLikeStorage(const ColumnsDescription
add_virtual("_path", std::make_shared<DataTypeLowCardinality>(std::make_shared<DataTypeString>()));
add_virtual("_file", std::make_shared<DataTypeLowCardinality>(std::make_shared<DataTypeString>()));
add_virtual("_size", makeNullable(std::make_shared<DataTypeUInt64>()));
add_virtual("_time", makeNullable(std::make_shared<DataTypeDateTime>()));

return desc;
}

@@ -187,32 +189,40 @@ ColumnPtr getFilterByPathAndFileIndexes(const std::vector<String> & paths, const
return block.getByName("_idx").column;
}

void addRequestedPathFileAndSizeVirtualsToChunk(
Chunk & chunk, const NamesAndTypesList & requested_virtual_columns, const String & path, std::optional<size_t> size, const String * filename)
void addRequestedFileLikeStorageVirtualsToChunk(
Chunk & chunk, const NamesAndTypesList & requested_virtual_columns,
VirtualsForFileLikeStorage virtual_values)
{
for (const auto & virtual_column : requested_virtual_columns)
{
if (virtual_column.name == "_path")
{
chunk.addColumn(virtual_column.type->createColumnConst(chunk.getNumRows(), path)->convertToFullColumnIfConst());
chunk.addColumn(virtual_column.type->createColumnConst(chunk.getNumRows(), virtual_values.path)->convertToFullColumnIfConst());
}
else if (virtual_column.name == "_file")
{
if (filename)
if (virtual_values.filename)
{
chunk.addColumn(virtual_column.type->createColumnConst(chunk.getNumRows(), *filename)->convertToFullColumnIfConst());
chunk.addColumn(virtual_column.type->createColumnConst(chunk.getNumRows(), (*virtual_values.filename))->convertToFullColumnIfConst());
}
else
{
size_t last_slash_pos = path.find_last_of('/');
auto filename_from_path = path.substr(last_slash_pos + 1);
size_t last_slash_pos = virtual_values.path.find_last_of('/');
auto filename_from_path = virtual_values.path.substr(last_slash_pos + 1);
chunk.addColumn(virtual_column.type->createColumnConst(chunk.getNumRows(), filename_from_path)->convertToFullColumnIfConst());
}
}
else if (virtual_column.name == "_size")
{
if (size)
chunk.addColumn(virtual_column.type->createColumnConst(chunk.getNumRows(), *size)->convertToFullColumnIfConst());
if (virtual_values.size)
chunk.addColumn(virtual_column.type->createColumnConst(chunk.getNumRows(), *virtual_values.size)->convertToFullColumnIfConst());
else
chunk.addColumn(virtual_column.type->createColumnConstWithDefaultValue(chunk.getNumRows())->convertToFullColumnIfConst());
}
else if (virtual_column.name == "_time")
{
if (virtual_values.last_modified)
chunk.addColumn(virtual_column.type->createColumnConst(chunk.getNumRows(), virtual_values.last_modified->epochTime())->convertToFullColumnIfConst());
else
chunk.addColumn(virtual_column.type->createColumnConstWithDefaultValue(chunk.getNumRows())->convertToFullColumnIfConst());
}
@ -68,8 +68,18 @@ void filterByPathOrFile(std::vector<T> & sources, const std::vector<String> & pa
sources = std::move(filtered_sources);
}

void addRequestedPathFileAndSizeVirtualsToChunk(
Chunk & chunk, const NamesAndTypesList & requested_virtual_columns, const String & path, std::optional<size_t> size, const String * filename = nullptr);
struct VirtualsForFileLikeStorage
{
const String & path;
std::optional<size_t> size { std::nullopt };
const String * filename { nullptr };
std::optional<Poco::Timestamp> last_modified { std::nullopt };

};

void addRequestedFileLikeStorageVirtualsToChunk(
Chunk & chunk, const NamesAndTypesList & requested_virtual_columns,
VirtualsForFileLikeStorage virtual_values);
}

}
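Note (not part of the diff): a minimal sketch of querying the new `_time` virtual column from a file-like storage, assuming a local file named data.csv exists. Per the hunks above, `_time` is a Nullable(DateTime) that should carry the file's modification time and fall back to NULL when it is unknown.

SELECT _path, _file, _size, _time
FROM file('data.csv', 'CSV', 'c1 UInt32, c2 UInt32');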
@ -27,6 +27,8 @@ except ImportError:

DOWNLOAD_RETRIES_COUNT = 5

logger = logging.getLogger(__name__)


class DownloadException(Exception):
pass

@ -42,7 +44,7 @@ def get_with_retries(
sleep: int = 3,
**kwargs: Any,
) -> requests.Response:
logging.info(
logger.info(
"Getting URL with %i tries and sleep %i in between: %s", retries, sleep, url
)
exc = Exception("A placeholder to satisfy typing and avoid nesting")

@ -54,7 +56,7 @@
return response
except Exception as e:
if i + 1 < retries:
logging.info("Exception '%s' while getting, retry %i", e, i + 1)
logger.info("Exception '%s' while getting, retry %i", e, i + 1)
time.sleep(sleep)

exc = e

@ -103,7 +105,7 @@ def get_gh_api(
)
try_auth = e.response.status_code == 404
if (ratelimit_exceeded or try_auth) and not token_is_set:
logging.warning(
logger.warning(
"Received rate limit exception, setting the auth header and retry"
)
set_auth_header()

@ -114,10 +116,10 @@
exc = e

if try_cnt < retries:
logging.info("Exception '%s' while getting, retry %i", exc, try_cnt)
logger.info("Exception '%s' while getting, retry %i", exc, try_cnt)
time.sleep(sleep)

raise APIException("Unable to request data from GH API") from exc
raise APIException(f"Unable to request data from GH API: {url}") from exc


def get_build_name_for_check(check_name: str) -> str:

@ -128,25 +130,25 @@ def read_build_urls(build_name: str, reports_path: Union[Path, str]) -> List[str
for root, _, files in os.walk(reports_path):
for file in files:
if file.endswith(f"_{build_name}.json"):
logging.info("Found build report json %s for %s", file, build_name)
logger.info("Found build report json %s for %s", file, build_name)
with open(
os.path.join(root, file), "r", encoding="utf-8"
) as file_handler:
build_report = json.load(file_handler)
return build_report["build_urls"] # type: ignore

logging.info("A build report is not found for %s", build_name)
logger.info("A build report is not found for %s", build_name)
return []


def download_build_with_progress(url: str, path: Path) -> None:
logging.info("Downloading from %s to temp path %s", url, path)
logger.info("Downloading from %s to temp path %s", url, path)
for i in range(DOWNLOAD_RETRIES_COUNT):
try:
response = get_with_retries(url, retries=1, stream=True)
total_length = int(response.headers.get("content-length", 0))
if path.is_file() and total_length and path.stat().st_size == total_length:
logging.info(
logger.info(
"The file %s already exists and have a proper size %s",
path,
total_length,

@ -155,14 +157,14 @@ def download_build_with_progress(url: str, path: Path) -> None:

with open(path, "wb") as f:
if total_length == 0:
logging.info(
logger.info(
"No content-length, will download file without progress"
)
f.write(response.content)
else:
dl = 0

logging.info("Content length is %ld bytes", total_length)
logger.info("Content length is %ld bytes", total_length)
for data in response.iter_content(chunk_size=4096):
dl += len(data)
f.write(data)

@ -177,8 +179,8 @@ def download_build_with_progress(url: str, path: Path) -> None:
except Exception as e:
if sys.stdout.isatty():
sys.stdout.write("\n")
if os.path.exists(path):
os.remove(path)
if path.exists():
path.unlink()

if i + 1 < DOWNLOAD_RETRIES_COUNT:
time.sleep(3)

@ -189,7 +191,7 @@ def download_build_with_progress(url: str, path: Path) -> None:

if sys.stdout.isatty():
sys.stdout.write("\n")
logging.info("Downloading finished")
logger.info("Downloading finished")


def download_builds(

@ -198,7 +200,7 @@ def download_builds(
for url in build_urls:
if filter_fn(url):
fname = os.path.basename(url.replace("%2B", "+").replace("%20", " "))
logging.info("Will download %s to %s", fname, result_path)
logger.info("Will download %s to %s", fname, result_path)
download_build_with_progress(url, result_path / fname)


@ -210,7 +212,7 @@ def download_builds_filter(
) -> None:
build_name = get_build_name_for_check(check_name)
urls = read_build_urls(build_name, reports_path)
logging.info("The build report for %s contains the next URLs: %s", build_name, urls)
logger.info("The build report for %s contains the next URLs: %s", build_name, urls)

if not urls:
raise DownloadException("No build URLs found")

@ -247,7 +249,7 @@ def get_clickhouse_binary_url(
) -> Optional[str]:
build_name = get_build_name_for_check(check_name)
urls = read_build_urls(build_name, reports_path)
logging.info("The build report for %s contains the next URLs: %s", build_name, urls)
logger.info("The build report for %s contains the next URLs: %s", build_name, urls)
for url in urls:
check_url = url
if "?" in check_url:
@ -59,7 +59,7 @@ def get_pr_for_commit(sha, ref):
data = response.json()
our_prs = [] # type: List[Dict]
if len(data) > 1:
print("Got more than one pr for commit", sha)
logging.warning("Got more than one pr for commit %s", sha)
for pr in data:
# We need to check if the PR is created in our repo, because
# https://github.com/kaynewu/ClickHouse/pull/2

@ -71,13 +71,20 @@ def get_pr_for_commit(sha, ref):
if pr["head"]["ref"] in ref:
return pr
our_prs.append(pr)
print(
f"Cannot find PR with required ref {ref}, sha {sha} - returning first one"
logging.warning(
"Cannot find PR with required ref %s, sha %s - returning first one",
ref,
sha,
)
first_pr = our_prs[0]
return first_pr
except Exception as ex:
print(f"Cannot fetch PR info from commit {ref}, {sha}", ex)
logging.error(
"Cannot fetch PR info from commit ref %s, sha %s, exception: %s",
ref,
sha,
ex,
)
return None


@ -259,12 +266,12 @@ class PRInfo:
self.diff_urls.append(
self.compare_url(
pull_request["base"]["repo"]["default_branch"],
pull_request["head"]["label"],
pull_request["head"]["sha"],
)
)
self.diff_urls.append(
self.compare_url(
pull_request["head"]["label"],
pull_request["head"]["sha"],
pull_request["base"]["repo"]["default_branch"],
)
)

@ -279,7 +286,7 @@ class PRInfo:
# itself, but as well files changed since we branched out
self.diff_urls.append(
self.compare_url(
pull_request["head"]["label"],
pull_request["head"]["sha"],
pull_request["base"]["repo"]["default_branch"],
)
)

@ -289,8 +296,10 @@ class PRInfo:
else:
# assume this is a dispatch
self.event_type = EventType.DISPATCH
print("event.json does not match pull_request or push:")
print(json.dumps(github_event, sort_keys=True, indent=4))
logging.warning(
"event.json does not match pull_request or push:\n%s",
json.dumps(github_event, sort_keys=True, indent=4),
)
self.sha = os.getenv(
"GITHUB_SHA", "0000000000000000000000000000000000000000"
)

@ -330,7 +339,7 @@ class PRInfo:
return self.event_type == EventType.DISPATCH

def compare_pr_url(self, pr_object: dict) -> str:
return self.compare_url(pr_object["base"]["label"], pr_object["head"]["label"])
return self.compare_url(pr_object["base"]["sha"], pr_object["head"]["sha"])

@staticmethod
def compare_url(first: str, second: str) -> str:

@ -357,7 +366,7 @@ class PRInfo:
diff_object = PatchSet(response.text)
self.changed_files.update({f.path for f in diff_object})
self.changed_files_requested = True
print(f"Fetched info about {len(self.changed_files)} changed files")
logging.info("Fetched info about %s changed files", len(self.changed_files))

def get_dict(self):
return {
@ -2,6 +2,7 @@
<profiles>
<default>
<s3_retry_attempts>5</s3_retry_attempts>
<backup_restore_s3_retry_attempts>5</backup_restore_s3_retry_attempts>
</default>
</profiles>
<users>
@ -0,0 +1,10 @@
<clickhouse>
<database_atomic_delay_before_drop_table_sec>10</database_atomic_delay_before_drop_table_sec>
<allow_moving_table_directory_to_trash>1</allow_moving_table_directory_to_trash>
<merge_tree>
<initialization_retry_period>10</initialization_retry_period>
</merge_tree>
<max_database_replicated_create_table_thread_pool_size>50</max_database_replicated_create_table_thread_pool_size>
<allow_experimental_transactions>42</allow_experimental_transactions>
<replica_group_name>group</replica_group_name>
</clickhouse>
@ -61,7 +61,7 @@ all_nodes = [

bad_settings_node = cluster.add_instance(
"bad_settings_node",
main_configs=["configs/config.xml"],
main_configs=["configs/config2.xml"],
user_configs=["configs/inconsistent_settings.xml"],
with_zookeeper=True,
macros={"shard": 1, "replica": 4},

@ -1522,3 +1522,24 @@ def test_auto_recovery(started_cluster):

assert "42\n" == bad_settings_node.query("SELECT * FROM auto_recovery.t2")
assert "137\n" == bad_settings_node.query("SELECT * FROM auto_recovery.t1")


def test_all_groups_cluster(started_cluster):
dummy_node.query("DROP DATABASE IF EXISTS db_cluster")
bad_settings_node.query("DROP DATABASE IF EXISTS db_cluster")
dummy_node.query(
"CREATE DATABASE db_cluster ENGINE = Replicated('/clickhouse/databases/all_groups_cluster', 'shard1', 'replica1');"
)
bad_settings_node.query(
"CREATE DATABASE db_cluster ENGINE = Replicated('/clickhouse/databases/all_groups_cluster', 'shard1', 'replica2');"
)

assert "dummy_node\n" == dummy_node.query(
"select host_name from system.clusters where name='db_cluster' order by host_name"
)
assert "bad_settings_node\n" == bad_settings_node.query(
"select host_name from system.clusters where name='db_cluster' order by host_name"
)
assert "bad_settings_node\ndummy_node\n" == bad_settings_node.query(
"select host_name from system.clusters where name='all_groups.db_cluster' order by host_name"
)
@ -758,12 +758,12 @@ def test_read_subcolumns(cluster):
)

res = node.query(
f"select a.b.d, _path, a.b, _file, a.e from azureBlobStorage('{storage_account_url}', 'cont', 'test_subcolumns.tsv',"
f"select a.b.d, _path, a.b, _file, dateDiff('minute', _time, now()), a.e from azureBlobStorage('{storage_account_url}', 'cont', 'test_subcolumns.tsv',"
f" 'devstoreaccount1', 'Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/K1SZFPTOtr/KBHBeksoGMGw==', 'auto', 'auto',"
f" 'a Tuple(b Tuple(c UInt32, d UInt32), e UInt32)')"
)

assert res == "2\tcont/test_subcolumns.tsv\t(1,2)\ttest_subcolumns.tsv\t3\n"
assert res == "2\tcont/test_subcolumns.tsv\t(1,2)\ttest_subcolumns.tsv\t0\t3\n"

res = node.query(
f"select a.b.d, _path, a.b, _file, a.e from azureBlobStorage('{storage_account_url}', 'cont', 'test_subcolumns.jsonl',"

@ -987,10 +987,10 @@ def test_read_subcolumns(started_cluster):
assert res == "2\ttest_subcolumns.jsonl\t(1,2)\ttest_subcolumns.jsonl\t3\n"

res = node.query(
f"select x.b.d, _path, x.b, _file, x.e from hdfs('hdfs://hdfs1:9000/test_subcolumns.jsonl', auto, 'x Tuple(b Tuple(c UInt32, d UInt32), e UInt32)')"
f"select x.b.d, _path, x.b, _file, dateDiff('minute', _time, now()), x.e from hdfs('hdfs://hdfs1:9000/test_subcolumns.jsonl', auto, 'x Tuple(b Tuple(c UInt32, d UInt32), e UInt32)')"
)

assert res == "0\ttest_subcolumns.jsonl\t(0,0)\ttest_subcolumns.jsonl\t0\n"
assert res == "0\ttest_subcolumns.jsonl\t(0,0)\ttest_subcolumns.jsonl\t0\t0\n"

res = node.query(
f"select x.b.d, _path, x.b, _file, x.e from hdfs('hdfs://hdfs1:9000/test_subcolumns.jsonl', auto, 'x Tuple(b Tuple(c UInt32, d UInt32), e UInt32) default ((42, 42), 42)')"
@ -2117,10 +2117,12 @@ def test_read_subcolumns(started_cluster):
assert res == "0\troot/test_subcolumns.jsonl\t(0,0)\ttest_subcolumns.jsonl\t0\n"

res = instance.query(
f"select x.b.d, _path, x.b, _file, x.e from s3('http://{started_cluster.minio_host}:{started_cluster.minio_port}/{bucket}/test_subcolumns.jsonl', auto, 'x Tuple(b Tuple(c UInt32, d UInt32), e UInt32) default ((42, 42), 42)')"
f"select x.b.d, _path, x.b, _file, dateDiff('minute', _time, now()), x.e from s3('http://{started_cluster.minio_host}:{started_cluster.minio_port}/{bucket}/test_subcolumns.jsonl', auto, 'x Tuple(b Tuple(c UInt32, d UInt32), e UInt32) default ((42, 42), 42)')"
)

assert res == "42\troot/test_subcolumns.jsonl\t(42,42)\ttest_subcolumns.jsonl\t42\n"
assert (
res == "42\troot/test_subcolumns.jsonl\t(42,42)\ttest_subcolumns.jsonl\t0\t42\n"
)

res = instance.query(
f"select a.b.d, _path, a.b, _file, a.e from url('http://{started_cluster.minio_host}:{started_cluster.minio_port}/{bucket}/test_subcolumns.tsv', auto, 'a Tuple(b Tuple(c UInt32, d UInt32), e UInt32)')"

@ -2148,6 +2150,8 @@ def test_read_subcolumns(started_cluster):
res == "42\t/root/test_subcolumns.jsonl\t(42,42)\ttest_subcolumns.jsonl\t42\n"
)

logging.info("Some custom logging")


def test_filtering_by_file_or_path(started_cluster):
bucket = started_cluster.minio_bucket
@ -1,5 +1,5 @@
#!/usr/bin/env bash
# Tags: long, zookeeper, no-parallel, no-fasttest
# Tags: long, zookeeper, no-parallel, no-fasttest, no-asan

CURDIR=$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)
# shellcheck source=../shell_config.sh

@ -1 +0,0 @@
1 2 4
@ -1,13 +0,0 @@
#!/usr/bin/env bash

CURDIR=$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)
# shellcheck source=../shell_config.sh
. "$CURDIR"/../shell_config.sh

echo "1,2" > $CLICKHOUSE_TEST_UNIQUE_NAME.csv
$CLICKHOUSE_LOCAL -nm -q "
create table test (x UInt64, y UInt32, size UInt64) engine=Memory;
insert into test select c1, c2, _size from file('$CLICKHOUSE_TEST_UNIQUE_NAME.csv') settings use_structure_from_insertion_table_in_table_functions=1;
select * from test;
"
rm $CLICKHOUSE_TEST_UNIQUE_NAME.csv

@ -0,0 +1 @@
1 2 4 1 1
@ -0,0 +1,14 @@
#!/usr/bin/env bash

CURDIR=$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)
# shellcheck source=../shell_config.sh
. "$CURDIR"/../shell_config.sh

echo "1,2" > $CLICKHOUSE_TEST_UNIQUE_NAME.csv
sleep 1
$CLICKHOUSE_LOCAL -nm -q "
create table test (x UInt64, y UInt32, size UInt64, d32 DateTime32, d64 DateTime64) engine=Memory;
insert into test select c1, c2, _size, _time, _time from file('$CLICKHOUSE_TEST_UNIQUE_NAME.csv') settings use_structure_from_insertion_table_in_table_functions=1;
select x, y, size, (dateDiff('millisecond', d32, now()) < 4000 AND dateDiff('millisecond', d32, now()) > 0), (dateDiff('second', d64, now()) < 4 AND dateDiff('second', d64, now()) > 0) from test;
"
rm $CLICKHOUSE_TEST_UNIQUE_NAME.csv
@ -1,49 +0,0 @@
#!/usr/bin/expect -f

set basedir [file dirname $argv0]
set basename [file tail $argv0]
if {[info exists env(CLICKHOUSE_TMP)]} {
set CLICKHOUSE_TMP $env(CLICKHOUSE_TMP)
} else {
set CLICKHOUSE_TMP "."
}
exp_internal -f $CLICKHOUSE_TMP/$basename.debuglog 0

log_user 0
set timeout 60
match_max 100000
set stty_init "rows 25 cols 120"

expect_after {
-i $any_spawn_id eof { exp_continue }
-i $any_spawn_id timeout { exit 1 }
}

spawn clickhouse-local
expect ":) "

# Trivial SELECT with LIMIT from system.zeros shows progress bar.
send "SELECT * FROM system.zeros LIMIT 10000000 FORMAT Null SETTINGS max_execution_speed = 1000000, timeout_before_checking_execution_speed = 0, max_block_size = 128\r"
expect "Progress: "
expect "█"
send "\3"
expect "Query was cancelled."
expect ":) "

send "SELECT * FROM system.zeros_mt LIMIT 10000000 FORMAT Null SETTINGS max_execution_speed = 1000000, timeout_before_checking_execution_speed = 0, max_block_size = 128\r"
expect "Progress: "
expect "█"
send "\3"
expect "Query was cancelled."
expect ":) "

# As well as from generateRandom
send "SELECT * FROM generateRandom() LIMIT 10000000 FORMAT Null SETTINGS max_execution_speed = 1000000, timeout_before_checking_execution_speed = 0, max_block_size = 128\r"
expect "Progress: "
expect "█"
send "\3"
expect "Query was cancelled."
expect ":) "

send "exit\r"
expect eof
@ -0,0 +1,3 @@
Matched
Matched
Matched

@ -0,0 +1,18 @@
#!/usr/bin/env bash
# Tags: no-random-settings

CUR_DIR=$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)
# shellcheck source=../shell_config.sh
. "$CUR_DIR"/../shell_config.sh

function run_with_progress_and_match_total_rows()
{
CURL_RESPONSE=$(echo "$1" | \
${CLICKHOUSE_CURL} -vsS "${CLICKHOUSE_URL}&wait_end_of_query=1&max_block_size=1&send_progress_in_http_headers=1&http_headers_progress_interval_ms=0&output_format_parallel_formatting=0" --data-binary @- 2>&1)

echo "$CURL_RESPONSE" | grep -q '"total_rows_to_read":"100"' && echo "Matched" || echo "Expected total_rows_to_read not found: ${CURL_RESPONSE}"
}

run_with_progress_and_match_total_rows 'SELECT * FROM system.zeros LIMIT 100'
run_with_progress_and_match_total_rows 'SELECT * FROM system.zeros_mt LIMIT 100'
run_with_progress_and_match_total_rows "SELECT * FROM generateRandom('number UInt64') LIMIT 100"
@ -0,0 +1 @@
CREATE VIEW default.test_table_comment AS (SELECT toString({date_from:String})) COMMENT \'test comment\'

@ -0,0 +1,5 @@
DROP TABLE IF EXISTS test_table_comment;
CREATE VIEW test_table_comment AS SELECT toString({date_from:String});
ALTER TABLE test_table_comment MODIFY COMMENT 'test comment';
SELECT create_table_query FROM system.tables WHERE name = 'test_table_comment' AND database = currentDatabase();
DROP TABLE test_table_comment;
@ -127,7 +127,9 @@ CREATE TABLE 03165_token_ft
INDEX idx_message message TYPE full_text() GRANULARITY 1
)
ENGINE = MergeTree
ORDER BY id;
ORDER BY id
-- Full text index works only with full parts.
SETTINGS min_bytes_for_full_part_storage=0;

INSERT INTO 03165_token_ft VALUES(1, 'Service is not ready');
2
tests/queries/0_stateless/03168_cld2_tsan.reference
Normal file
@ -0,0 +1,2 @@
{'ja':0.62,'fr':0.36}
{'ja':0.62,'fr':0.36}
10
tests/queries/0_stateless/03168_cld2_tsan.sql
Normal file
@ -0,0 +1,10 @@
-- Tags: no-fasttest
-- Tag no-fasttest: depends on cld2

-- https://github.com/ClickHouse/ClickHouse/issues/64931
SELECT detectLanguageMixed(materialize('二兎を追う者は一兎をも得ず二兎を追う者は一兎をも得ず A vaincre sans peril, on triomphe sans gloire.'))
GROUP BY
GROUPING SETS (
('a', toUInt256(1)),
(stringToH3(toFixedString(toFixedString('85283473ffffff', 14), 14))))
SETTINGS allow_experimental_nlp_functions = 1;
@ -0,0 +1,6 @@
-- https://github.com/ClickHouse/ClickHouse/issues/64946
SELECT
multiIf((number % toLowCardinality(toNullable(toUInt128(2)))) = (number % toNullable(2)), toInt8(1), (number % materialize(toLowCardinality(3))) = toUInt128(toNullable(0)), toInt8(materialize(materialize(2))), toInt64(toUInt128(3)))
FROM system.numbers
LIMIT 44857
FORMAT Null;
@ -1,4 +1,5 @@
v24.5.1.1763-stable 2024-06-01
v24.4.2.141-stable 2024-06-07
v24.4.1.2088-stable 2024-05-01
v24.3.3.102-lts 2024-05-01
v24.3.2.23-lts 2024-04-03