Merge branch 'master' into optimize-entire-partition

This commit is contained in:
commit 35a9672704

CHANGELOG.md
@ -1,4 +1,5 @@
### Table of Contents
**[ClickHouse release v22.10, 2022-10-25](#2210)**<br/>
**[ClickHouse release v22.9, 2022-09-22](#229)**<br/>
**[ClickHouse release v22.8-lts, 2022-08-18](#228)**<br/>
**[ClickHouse release v22.7, 2022-07-21](#227)**<br/>
@ -10,6 +11,136 @@
**[ClickHouse release v22.1, 2022-01-18](#221)**<br/>
**[Changelog for 2021](https://clickhouse.com/docs/en/whats-new/changelog/2021/)**<br/>

### <a id="2210"></a> ClickHouse release 22.10, 2022-10-26
#### Backward Incompatible Change
* Rename cache commands: `show caches` -> `show filesystem caches`, `describe cache` -> `describe filesystem cache` (see the example after this list). [#41508](https://github.com/ClickHouse/ClickHouse/pull/41508) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Remove support for the `WITH TIMEOUT` section for `LIVE VIEW`. This closes [#40557](https://github.com/ClickHouse/ClickHouse/issues/40557). [#42173](https://github.com/ClickHouse/ClickHouse/pull/42173) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Remove support for the `{database}` macro from the client's prompt. It was displayed incorrectly if the database was unspecified and it was not updated on `USE` statements. This closes [#25891](https://github.com/ClickHouse/ClickHouse/issues/25891). [#42508](https://github.com/ClickHouse/ClickHouse/pull/42508) ([Alexey Milovidov](https://github.com/alexey-milovidov)).

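A minimal sketch of the renamed commands as they would be typed in `clickhouse-client`; the cache name `s3_cache` is hypothetical and stands for whatever filesystem cache is configured on your server:

``` sql
SHOW FILESYSTEM CACHES;
DESCRIBE FILESYSTEM CACHE 's3_cache';
```
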
#### New Feature
|
||||
* Composable protocol configuration is added. Now different protocols can be set up with different listen hosts. Protocol wrappers such as PROXYv1 can be set up over any other protocols (TCP, TCP secure, MySQL, Postgres). [#41198](https://github.com/ClickHouse/ClickHouse/pull/41198) ([Yakov Olkhovskiy](https://github.com/yakov-olkhovskiy)).
|
||||
* Add `S3` as a new type of the destination of backups. Support BACKUP to S3 with as-is path/data structure. [#42333](https://github.com/ClickHouse/ClickHouse/pull/42333) ([Vitaly Baranov](https://github.com/vitlibar)), [#42232](https://github.com/ClickHouse/ClickHouse/pull/42232) ([Azat Khuzhin](https://github.com/azat)).
|
||||
* Added functions (`randUniform`, `randNormal`, `randLogNormal`, `randExponential`, `randChiSquared`, `randStudentT`, `randFisherF`, `randBernoulli`, `randBinomial`, `randNegativeBinomial`, `randPoisson`) to generate random values according to the specified distributions. This closes [#21834](https://github.com/ClickHouse/ClickHouse/issues/21834). [#42411](https://github.com/ClickHouse/ClickHouse/pull/42411) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
|
||||
* An improvement for ClickHouse Keeper: add support for uploading snapshots to S3. S3 information can be defined inside `keeper_server.s3_snapshot`. [#41342](https://github.com/ClickHouse/ClickHouse/pull/41342) ([Antonio Andelic](https://github.com/antonio2368)).
|
||||
* Added an aggregate function `analysisOfVariance` (`anova`) to perform a statistical test over several groups of normally distributed observations to find out whether all groups have the same mean or not. Original PR [#37872](https://github.com/ClickHouse/ClickHouse/issues/37872). [#42131](https://github.com/ClickHouse/ClickHouse/pull/42131) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
|
||||
* Support limiting of temporary data stored on disk using settings `max_temporary_data_on_disk_size_for_user`/`max_temporary_data_on_disk_size_for_query` . [#40893](https://github.com/ClickHouse/ClickHouse/pull/40893) ([Vladimir C](https://github.com/vdimir)).
|
||||
* Add setting `format_json_object_each_row_column_for_object_name` to write/parse object name as column value in JSONObjectEachRow format. [#41703](https://github.com/ClickHouse/ClickHouse/pull/41703) ([Kruglov Pavel](https://github.com/Avogar)).
|
||||
* Add BLAKE3 hash-function to SQL. [#33435](https://github.com/ClickHouse/ClickHouse/pull/33435) ([BoloniniD](https://github.com/BoloniniD)).
|
||||
* The function `javaHash` has been extended to integers. [#41131](https://github.com/ClickHouse/ClickHouse/pull/41131) ([JackyWoo](https://github.com/JackyWoo)).
|
||||
* Add OpenTelemetry support to ON CLUSTER DDL (require `distributed_ddl_entry_format_version` to be set to 4). [#41484](https://github.com/ClickHouse/ClickHouse/pull/41484) ([Frank Chen](https://github.com/FrankChen021)).
|
||||
* Added system table `asynchronous_insert_log`. It contains information about asynchronous inserts (including results of queries in fire-and-forget mode (with `wait_for_async_insert=0`)) for better introspection. [#42040](https://github.com/ClickHouse/ClickHouse/pull/42040) ([Anton Popov](https://github.com/CurtizJ)).
|
||||
* Add support for methods `lz4`, `bz2`, `snappy` in HTTP's `Accept-Encoding` which is a non-standard extension to HTTP protocol. [#42071](https://github.com/ClickHouse/ClickHouse/pull/42071) ([Nikolay Degterinsky](https://github.com/evillique)).
|
||||
|
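A quick smoke test of a few of the new distribution functions; the argument choices below are illustrative (e.g. location and spread parameters for `randNormal`), not a full reference:

``` sql
SELECT
    randUniform(0, 1)  AS uniform,    -- uniformly distributed value
    randNormal(0, 10)  AS normal,     -- normally distributed value
    randBernoulli(0.5) AS bernoulli,  -- 0 or 1 with probability 0.5
    randPoisson(10)    AS poisson;    -- Poisson-distributed value
```
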
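And a sketch of the new `BLAKE3` function, assuming your build includes the Rust-based BLAKE3 library (see the Build section below):

``` sql
SELECT hex(BLAKE3('Hello, ClickHouse!')) AS digest;
```
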
#### Experimental Feature
* Added new infrastructure for query analysis and planning under the `allow_experimental_analyzer` setting. [#31796](https://github.com/ClickHouse/ClickHouse/pull/31796) ([Maksim Kita](https://github.com/kitaisreal)).
* Initial implementation of Kusto Query Language. Please don't use it. [#37961](https://github.com/ClickHouse/ClickHouse/pull/37961) ([Yong Wang](https://github.com/kashwy)).

#### Performance Improvement
* Relax the "Too many parts" threshold. This closes [#6551](https://github.com/ClickHouse/ClickHouse/issues/6551). Now ClickHouse will allow more parts in a partition if the average part size is large enough (at least 10 GiB). This makes it possible to have up to petabytes of data in a single partition of a single table on a single server, which is possible using disk shelves or object storage. [#42002](https://github.com/ClickHouse/ClickHouse/pull/42002) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Implement an operator-precedence element parser to make the required stack size smaller. [#34892](https://github.com/ClickHouse/ClickHouse/pull/34892) ([Nikolay Degterinsky](https://github.com/evillique)).
* The DISTINCT-in-order optimization now leverages the sorting properties of data streams. This enables reading in order for DISTINCT when applicable (previously it was necessary to provide ORDER BY for the columns in DISTINCT). [#41014](https://github.com/ClickHouse/ClickHouse/pull/41014) ([Igor Nikonov](https://github.com/devcrafter)).
* ColumnVector: optimize UInt8 index with AVX512VBMI. [#41247](https://github.com/ClickHouse/ClickHouse/pull/41247) ([Guo Wangyang](https://github.com/guowangy)).
* Optimize lock contention for `ThreadGroupStatus::mutex`. Performance experiments with **SSB** (Star Schema Benchmark) on an ICX device (Intel Xeon Platinum 8380 CPU, 80 cores, 160 threads) show that this change brings a **2.95x** improvement in the geomean of all subcases' QPS. [#41675](https://github.com/ClickHouse/ClickHouse/pull/41675) ([Zhiguo Zhou](https://github.com/ZhiguoZh)).
* Add `ldapr` capabilities to AArch64 builds. This is supported on Graviton 2+ and on Azure and GCP instances. It only appeared in clang-15 [not so long ago](https://github.com/llvm/llvm-project/commit/9609b5daffe9fd28d83d83da895abc5113f76c24). [#41778](https://github.com/ClickHouse/ClickHouse/pull/41778) ([Daniel Kutenin](https://github.com/danlark1)).
* Improve performance when comparing strings and one argument is an empty constant string. [#41870](https://github.com/ClickHouse/ClickHouse/pull/41870) ([Jiebin Sun](https://github.com/jiebinn)).
* Optimize `insertFrom` of `ColumnAggregateFunction` to share the aggregate state in some cases. [#41960](https://github.com/ClickHouse/ClickHouse/pull/41960) ([flynn](https://github.com/ucasfl)).
* Make writing to `azure_blob_storage` disks faster (respect `max_single_part_upload_size` instead of writing a block per buffer size). The inefficiency is mentioned in [#41754](https://github.com/ClickHouse/ClickHouse/issues/41754). [#42041](https://github.com/ClickHouse/ClickHouse/pull/42041) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Make thread ids in the process list and query_log unique to avoid waste. [#42180](https://github.com/ClickHouse/ClickHouse/pull/42180) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Support skipping the cache completely (both writing to the cache and reading cached data) when the requested read range exceeds the threshold defined by the cache setting `bypass_cache_threashold` (requires enabling `enable_bypass_cache_with_threshold`). This helps on slow local disks. [#42418](https://github.com/ClickHouse/ClickHouse/pull/42418) ([Han Shukai](https://github.com/KinderRiven)).

#### Improvement
* Add setting `allow_implicit_no_password`: in combination with `allow_no_password` it forbids creating a user with no password unless `IDENTIFIED WITH no_password` is explicitly specified. [#41341](https://github.com/ClickHouse/ClickHouse/pull/41341) ([Nikolay Degterinsky](https://github.com/evillique)).
* Embedded Keeper will always start in the background, allowing ClickHouse to start without achieving quorum. [#40991](https://github.com/ClickHouse/ClickHouse/pull/40991) ([Antonio Andelic](https://github.com/antonio2368)).
* Made re-establishing a new connection to ZooKeeper more reactive in case the previous one expires. Previously, a reconnection task was spawned every minute by default, so a table could stay in a read-only state for about that long. [#41092](https://github.com/ClickHouse/ClickHouse/pull/41092) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* Now projections can be used with zero-copy replication (zero-copy replication is a non-production feature). [#41147](https://github.com/ClickHouse/ClickHouse/pull/41147) ([alesapin](https://github.com/alesapin)).
* Support the expression `(EXPLAIN SELECT ...)` in a subquery. Queries like `SELECT * FROM (EXPLAIN PIPELINE SELECT col FROM TABLE ORDER BY col)` became valid. [#40630](https://github.com/ClickHouse/ClickHouse/pull/40630) ([Vladimir C](https://github.com/vdimir)).
* Allow changing `async_insert_max_data_size` or `async_insert_busy_timeout_ms` in the scope of a query, e.g. for a user who inserts data rarely and has no access to the server config to tune the defaults. [#40668](https://github.com/ClickHouse/ClickHouse/pull/40668) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* Improvements for reading from remote filesystems: the thread pool size for reads/writes is now configurable. Closes [#41070](https://github.com/ClickHouse/ClickHouse/issues/41070). [#41011](https://github.com/ClickHouse/ClickHouse/pull/41011) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Support all combinator combinations in `WindowTransform`/`arrayReduce*`/`initializeAggregation`/aggregate function versioning. Previously, combinators like `ForEach`/`Resample`/`Map` didn't work in these places; using them led to exceptions like `State function ... inserts results into non-state column`. [#41107](https://github.com/ClickHouse/ClickHouse/pull/41107) ([Kruglov Pavel](https://github.com/Avogar)).
* Add function `tryDecrypt` that returns NULL when decryption fails (e.g. decrypting with an incorrect key) instead of throwing an exception (see the example after this list). [#41206](https://github.com/ClickHouse/ClickHouse/pull/41206) ([Duc Canh Le](https://github.com/canhld94)).
* Add the `unreserved_space` column to the `system.disks` table to check how much space is not taken by reservations per disk. [#41254](https://github.com/ClickHouse/ClickHouse/pull/41254) ([filimonov](https://github.com/filimonov)).
* Support S3 authorization headers in table function arguments. [#41261](https://github.com/ClickHouse/ClickHouse/pull/41261) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Add support for MultiRead in Keeper and the internal ZooKeeper client (this is an extension to the ZooKeeper protocol, only available in ClickHouse Keeper). [#41410](https://github.com/ClickHouse/ClickHouse/pull/41410) ([Antonio Andelic](https://github.com/antonio2368)).
* Add support for comparing a Decimal type with a floating-point literal in the IN operator. [#41544](https://github.com/ClickHouse/ClickHouse/pull/41544) ([liang.huang](https://github.com/lhuang09287750)).
* Allow readable size values (like `1TB`) in the cache config. [#41688](https://github.com/ClickHouse/ClickHouse/pull/41688) ([Kseniia Sumarokova](https://github.com/kssenii)).
* ClickHouse could cache stale DNS entries for some period of time (15 seconds by default) until the cache was updated asynchronously; during these periods ClickHouse could nevertheless try to establish a connection and produce errors. This behavior is fixed. [#41707](https://github.com/ClickHouse/ClickHouse/pull/41707) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* Add interactive history search with an fzf-like utility (fzf/sk) for `clickhouse-client`/`clickhouse-local` (note you can use `FZF_DEFAULT_OPTS`/`SKIM_DEFAULT_OPTIONS` to additionally configure the behavior). [#41730](https://github.com/ClickHouse/ClickHouse/pull/41730) ([Azat Khuzhin](https://github.com/azat)).
* Only allow clients connecting to a secure server with an invalid certificate to proceed with the '--accept-certificate' flag. [#41743](https://github.com/ClickHouse/ClickHouse/pull/41743) ([Yakov Olkhovskiy](https://github.com/yakov-olkhovskiy)).
* Add function `tryBase58Decode`, similar to the existing function `tryBase64Decode` (see the example after this list). [#41824](https://github.com/ClickHouse/ClickHouse/pull/41824) ([Robert Schulze](https://github.com/rschu1ze)).
* Improve feedback when replacing a partition with a different primary key. Fixes [#34798](https://github.com/ClickHouse/ClickHouse/issues/34798). [#41838](https://github.com/ClickHouse/ClickHouse/pull/41838) ([Salvatore](https://github.com/tbsal)).
* Fix parallel parsing: the segmentator now checks `max_block_size`. This fixes memory over-allocation in the case of parallel parsing with a small LIMIT. [#41852](https://github.com/ClickHouse/ClickHouse/pull/41852) ([Vitaly Baranov](https://github.com/vitlibar)).
* Don't add a "TABLE_IS_DROPPED" exception to `system.errors` if it happened during a SELECT from a system table and was ignored. [#41908](https://github.com/ClickHouse/ClickHouse/pull/41908) ([AlfVII](https://github.com/AlfVII)).
* Improve the option `enable_extended_results_for_datetime_functions` to return results of type DateTime64 for the functions `toStartOfDay`, `toStartOfHour`, `toStartOfFifteenMinutes`, `toStartOfTenMinutes`, `toStartOfFiveMinutes`, `toStartOfMinute` and `timeSlot`. [#41910](https://github.com/ClickHouse/ClickHouse/pull/41910) ([Roman Vasin](https://github.com/rvasin)).
* Improve `DateTime` type inference for text formats. Now it respects the setting `date_time_input_format` and doesn't try to infer datetimes from numbers as timestamps. Closes [#41389](https://github.com/ClickHouse/ClickHouse/issues/41389). Closes [#42206](https://github.com/ClickHouse/ClickHouse/issues/42206). [#41912](https://github.com/ClickHouse/ClickHouse/pull/41912) ([Kruglov Pavel](https://github.com/Avogar)).
* Remove a confusing warning when inserting with `perform_ttl_move_on_insert` = false. [#41980](https://github.com/ClickHouse/ClickHouse/pull/41980) ([Vitaly Baranov](https://github.com/vitlibar)).
* Allow the user to write `countState(*)` similar to `count(*)` (see the example after this list). This closes [#9338](https://github.com/ClickHouse/ClickHouse/issues/9338). [#41983](https://github.com/ClickHouse/ClickHouse/pull/41983) ([Amos Bird](https://github.com/amosbird)).
* Fix `rankCorr` size overflow. [#42020](https://github.com/ClickHouse/ClickHouse/pull/42020) ([Duc Canh Le](https://github.com/canhld94)).
* Added an option to specify an arbitrary string as an environment name in Sentry's config for more handy reports. [#42037](https://github.com/ClickHouse/ClickHouse/pull/42037) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* Fix parsing out-of-range Date from CSV. [#42044](https://github.com/ClickHouse/ClickHouse/pull/42044) ([Andrey Zvonov](https://github.com/zvonand)).
* `parseDateTimeBestEffort` now supports a comma between date and time (see the example after this list). Closes [#42038](https://github.com/ClickHouse/ClickHouse/issues/42038). [#42049](https://github.com/ClickHouse/ClickHouse/pull/42049) ([flynn](https://github.com/ucasfl)).
* Improved the stale replica recovery process for `ReplicatedMergeTree`. If a lost replica has some parts which are absent from a healthy replica, but these parts should appear in the future according to the replication queue of the healthy replica, then the lost replica will keep such parts instead of detaching them. [#42134](https://github.com/ClickHouse/ClickHouse/pull/42134) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Allow `Date32` arguments for the `date_diff` function (see the example after this list). Fix an issue in `date_diff` when using `DateTime64` arguments with a start date before the Unix epoch and an end date after it. [#42308](https://github.com/ClickHouse/ClickHouse/pull/42308) ([Roman Vasin](https://github.com/rvasin)).
* When uploading big parts to MinIO, 'Complete Multipart Upload' can take a long time. MinIO sends heartbeats every 10 seconds (see https://github.com/minio/minio/pull/7198), but ClickHouse timed out earlier because the default send/receive timeout is [set](https://github.com/ClickHouse/ClickHouse/blob/cc24fcd6d5dfb67f5f66f5483e986bd1010ad9cf/src/IO/S3/PocoHTTPClient.cpp#L123) to 5 seconds. [#42321](https://github.com/ClickHouse/ClickHouse/pull/42321) ([filimonov](https://github.com/filimonov)).
* Fix a rarely invalid cast of aggregate state types with complex types such as Decimal. This fixes [#42408](https://github.com/ClickHouse/ClickHouse/issues/42408). [#42417](https://github.com/ClickHouse/ClickHouse/pull/42417) ([Amos Bird](https://github.com/amosbird)).
* Allow using `Date32` arguments for the `dateName` function. [#42554](https://github.com/ClickHouse/ClickHouse/pull/42554) ([Roman Vasin](https://github.com/rvasin)).
* Now filters with NULL literals will be used during index analysis. [#34063](https://github.com/ClickHouse/ClickHouse/issues/34063). [#41842](https://github.com/ClickHouse/ClickHouse/pull/41842) ([Amos Bird](https://github.com/amosbird)).

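A sketch of `tryDecrypt`, using a hypothetical 16-byte key for `aes-128-ecb` (the values are illustrative; with the wrong key the function should return NULL instead of throwing):

``` sql
SELECT
    tryDecrypt('aes-128-ecb', encrypt('aes-128-ecb', 'secret', '1234567890123456'), '1234567890123456') AS ok,        -- 'secret'
    tryDecrypt('aes-128-ecb', encrypt('aes-128-ecb', 'secret', '1234567890123456'), '0000000000000000') AS wrong_key; -- NULL
```
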
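Similarly, a minimal sketch of `tryBase58Decode`; like `tryBase64Decode`, it is expected to return an empty string on malformed input rather than throwing:

``` sql
SELECT
    tryBase58Decode(base58Encode('Hello')) AS ok,  -- 'Hello'
    tryBase58Decode('0OIl') AS bad;                -- characters not in the Base58 alphabet
```
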
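And a quick check of `countState(*)`, finalized inline with `finalizeAggregation` (this usage is illustrative):

``` sql
SELECT finalizeAggregation(countState(*)) AS cnt FROM numbers(10); -- 10
```
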
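Two more sketches, covering the `parseDateTimeBestEffort` and `date_diff` items above (the dates are illustrative; `Date32` covers 1900–2299):

``` sql
SELECT parseDateTimeBestEffort('2022-10-26, 12:30:00') AS with_comma;
SELECT dateDiff('year', toDate32('1925-01-01'), toDate32('2282-12-31')) AS years;
```
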
#### Build/Testing/Packaging Improvement
* Add a fuzzer for table definitions [#40096](https://github.com/ClickHouse/ClickHouse/pull/40096) ([Anton Popov](https://github.com/CurtizJ)). This represents the biggest advancement for ClickHouse testing this year so far.
* Beta version of the ClickHouse Cloud service is released: [https://clickhouse.cloud/](https://clickhouse.cloud/). It provides the easiest way to use ClickHouse (even slightly easier than the single-command installation).
* Added support for WHERE clause generation to the AST Fuzzer and the possibility to add or remove ORDER BY and WHERE clauses. [#38519](https://github.com/ClickHouse/ClickHouse/pull/38519) ([Ilya Yatsishin](https://github.com/qoega)).
* AArch64 binaries now require at least ARMv8.2, released in 2016. Most notably, this enables use of ARM LSE, i.e. native atomic operations. Also, the CMake build option "NO_ARMV81_OR_HIGHER" has been added to allow compilation of binaries for older ARMv8.0 hardware, e.g. Raspberry Pi 4. [#41610](https://github.com/ClickHouse/ClickHouse/pull/41610) ([Robert Schulze](https://github.com/rschu1ze)).
* Allow building ClickHouse with Musl (small changes after it was already supported but broken). [#41987](https://github.com/ClickHouse/ClickHouse/pull/41987) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Check for the `$CLICKHOUSE_CRONFILE` file before running the `sed` command during installation, to avoid a "file not found" error. [#42081](https://github.com/ClickHouse/ClickHouse/pull/42081) ([Chun-Sheng, Li](https://github.com/peter279k)).
* Update cctz to `2022e` to support the new timezone changes. Palestine transitions are now on Saturdays at 02:00. Three Ukraine zones are simplified into one. Jordan and Syria switch from +02/+03 with DST to year-round +03 (https://data.iana.org/time-zones/tzdb/NEWS). This closes [#42252](https://github.com/ClickHouse/ClickHouse/issues/42252). [#42327](https://github.com/ClickHouse/ClickHouse/pull/42327) ([Alexey Milovidov](https://github.com/alexey-milovidov)). [#42273](https://github.com/ClickHouse/ClickHouse/pull/42273) ([Dom Del Nano](https://github.com/ddelnano)).
* Add Rust code support into ClickHouse with the BLAKE3 hash-function library as an example. [#33435](https://github.com/ClickHouse/ClickHouse/pull/33435) ([BoloniniD](https://github.com/BoloniniD)).

#### Bug Fix (user-visible misbehavior in official stable or prestable release)

* Choose the correct aggregation method for `LowCardinality` with big integer types. [#42342](https://github.com/ClickHouse/ClickHouse/pull/42342) ([Duc Canh Le](https://github.com/canhld94)).
* Several fixes for the `web` disk. [#41652](https://github.com/ClickHouse/ClickHouse/pull/41652) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Fix an issue that caused `docker run` to fail if `https_port` is not present in the config. [#41693](https://github.com/ClickHouse/ClickHouse/pull/41693) ([Yakov Olkhovskiy](https://github.com/yakov-olkhovskiy)).
* Mutations were not cancelled properly on server shutdown or on a `SYSTEM STOP MERGES` query, and cancellation could take a long time; this is fixed. [#41699](https://github.com/ClickHouse/ClickHouse/pull/41699) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Fix wrong results of queries with `ORDER BY` or `GROUP BY` by columns from a prefix of the sorting key, wrapped into monotonic functions, with the "read in order" optimization enabled (settings `optimize_read_in_order` and `optimize_aggregation_in_order`). [#41701](https://github.com/ClickHouse/ClickHouse/pull/41701) ([Anton Popov](https://github.com/CurtizJ)).
* Fix a possible crash in `SELECT` from a `Merge` table with the `optimize_monotonous_functions_in_order_by` setting enabled. Fixes [#41269](https://github.com/ClickHouse/ClickHouse/issues/41269). [#41740](https://github.com/ClickHouse/ClickHouse/pull/41740) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fixed a "Part ... intersects part ..." error that might happen in extremely rare cases if a replica was restarted just after detaching some part as broken. [#41741](https://github.com/ClickHouse/ClickHouse/pull/41741) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Don't allow creating or altering MergeTree tables with a column named `_row_exists`, which is reserved for lightweight delete. Fixed [#41716](https://github.com/ClickHouse/ClickHouse/issues/41716). [#41763](https://github.com/ClickHouse/ClickHouse/pull/41763) ([Jianmei Zhang](https://github.com/zhangjmruc)).
* Fix a bug where CORS headers were missing in some HTTP responses. [#41792](https://github.com/ClickHouse/ClickHouse/pull/41792) ([Frank Chen](https://github.com/FrankChen021)).
* 22.9 might fail to start up a `ReplicatedMergeTree` table if that table was created by version 20.3 or older and was never altered; this is fixed. Fixes [#41742](https://github.com/ClickHouse/ClickHouse/issues/41742). [#41796](https://github.com/ClickHouse/ClickHouse/pull/41796) ([Alexander Tokmakov](https://github.com/tavplubix)).
* When batch sending failed for some reason, it could not be recovered automatically; if not processed in time, this led to accumulation, the printed error message grew longer and longer, and the HTTP thread could block. This is fixed. [#41813](https://github.com/ClickHouse/ClickHouse/pull/41813) ([zhongyuankai](https://github.com/zhongyuankai)).
* Fix compact parts with the compressed marks setting. Fixes [#41783](https://github.com/ClickHouse/ClickHouse/issues/41783) and [#41746](https://github.com/ClickHouse/ClickHouse/issues/41746). [#41823](https://github.com/ClickHouse/ClickHouse/pull/41823) ([alesapin](https://github.com/alesapin)).
* Old versions of the Replicated database don't have a special marker in [Zoo]Keeper, so we need to check whether the node contains some obscure data instead of a special marker. [#41875](https://github.com/ClickHouse/ClickHouse/pull/41875) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* Fix a possible exception in the fs cache. [#41884](https://github.com/ClickHouse/ClickHouse/pull/41884) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Fix `use_environment_credentials` for the s3 table function. [#41970](https://github.com/ClickHouse/ClickHouse/pull/41970) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Fixed a "Directory already exists and is not empty" error on detaching a broken part that might prevent a `ReplicatedMergeTree` table from starting replication. Fixes [#40957](https://github.com/ClickHouse/ClickHouse/issues/40957). [#41981](https://github.com/ClickHouse/ClickHouse/pull/41981) ([Alexander Tokmakov](https://github.com/tavplubix)).
* `toDateTime64` now returns the same output with negative integer and float arguments. [#42025](https://github.com/ClickHouse/ClickHouse/pull/42025) ([Robert Schulze](https://github.com/rschu1ze)).
* Fix writes into `azure_blob_storage`. Partially closes [#41754](https://github.com/ClickHouse/ClickHouse/issues/41754). [#42034](https://github.com/ClickHouse/ClickHouse/pull/42034) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Fix the `bzip2` decoding issue for specific `bzip2` files. [#42046](https://github.com/ClickHouse/ClickHouse/pull/42046) ([Nikolay Degterinsky](https://github.com/evillique)).
* Fix SQL function `toLastDayOfMonth` with the setting `enable_extended_results_for_datetime_functions = 1` at the beginning of the extended range (January 1900). Fix SQL function `toRelativeWeekNum()` with the same setting at the end of the extended range (December 2299). Improve the performance of SQL functions `toISOYear()`, `toFirstDayNumOfISOYearIndex()` and `toYearWeekOfNewyearMode()` by avoiding unnecessary index arithmetic. [#42084](https://github.com/ClickHouse/ClickHouse/pull/42084) ([Roman Vasin](https://github.com/rvasin)).
* The maximum number of fetches for each table was accidentally set to 8 while the pool size could be bigger. Now the maximum number of fetches per table equals the pool size. [#42090](https://github.com/ClickHouse/ClickHouse/pull/42090) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* A table might be shut down and a dictionary might be detached before checking whether they could be dropped without breaking dependencies; this is fixed. Fixes [#41982](https://github.com/ClickHouse/ClickHouse/issues/41982). [#42106](https://github.com/ClickHouse/ClickHouse/pull/42106) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Fix bad inefficiency of `remote_filesystem_read_method=read` with the filesystem cache. Closes [#42125](https://github.com/ClickHouse/ClickHouse/issues/42125). [#42129](https://github.com/ClickHouse/ClickHouse/pull/42129) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Fix a possible timeout exception for distributed queries with `use_hedged_requests = 0`. [#42130](https://github.com/ClickHouse/ClickHouse/pull/42130) ([Azat Khuzhin](https://github.com/azat)).
* Fixed a minor bug in the function `runningDifference` when used with the `Date32` type. Previously `Date` was used, which could cause logical errors like `Bad cast from type DB::ColumnVector<int> to DB::ColumnVector<unsigned short>`. [#42143](https://github.com/ClickHouse/ClickHouse/pull/42143) ([Alfred Xu](https://github.com/sperlingxx)).
* Fix reuse of files > 4GB from a base backup. [#42146](https://github.com/ClickHouse/ClickHouse/pull/42146) ([Azat Khuzhin](https://github.com/azat)).
* Fix: DISTINCT in order failed with LOGICAL_ERROR if the first column in the sorting key contained a function. [#42186](https://github.com/ClickHouse/ClickHouse/pull/42186) ([Igor Nikonov](https://github.com/devcrafter)).
* Fix a bug with projections and the `aggregate_functions_null_for_empty` setting. This bug is very rare and appears only if you enable the `aggregate_functions_null_for_empty` setting in the server's config. This closes [#41647](https://github.com/ClickHouse/ClickHouse/issues/41647). [#42198](https://github.com/ClickHouse/ClickHouse/pull/42198) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Fix reads from `Buffer` tables with "read in order" in descending order. [#42236](https://github.com/ClickHouse/ClickHouse/pull/42236) ([Duc Canh Le](https://github.com/canhld94)).
* Fix a bug which prevented ClickHouse from starting when the `background_pool_size` setting is set in the default profile but `background_merges_mutations_concurrency_ratio` is not. [#42315](https://github.com/ClickHouse/ClickHouse/pull/42315) ([nvartolomei](https://github.com/nvartolomei)).
* `ALTER UPDATE` of an attached part (with columns different from the table schema) could create invalid `columns.txt` metadata on disk. Reading from such a part could fail with errors or return invalid data. Fixes [#42161](https://github.com/ClickHouse/ClickHouse/issues/42161). [#42319](https://github.com/ClickHouse/ClickHouse/pull/42319) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* The `additional_table_filters` setting was not applied to `Distributed` storage. Fixes [#41692](https://github.com/ClickHouse/ClickHouse/issues/41692). [#42322](https://github.com/ClickHouse/ClickHouse/pull/42322) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix a data race in query finish/cancel. This closes [#42346](https://github.com/ClickHouse/ClickHouse/issues/42346). [#42362](https://github.com/ClickHouse/ClickHouse/pull/42362) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* This reverts [#40217](https://github.com/ClickHouse/ClickHouse/issues/40217), which introduced a regression in date/time functions. [#42367](https://github.com/ClickHouse/ClickHouse/pull/42367) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Fix an assert cast in JOIN on a falsy condition. Closes [#42380](https://github.com/ClickHouse/ClickHouse/issues/42380). [#42407](https://github.com/ClickHouse/ClickHouse/pull/42407) ([Vladimir C](https://github.com/vdimir)).
* Fix a buffer overflow in the processing of Decimal data types. This closes [#42451](https://github.com/ClickHouse/ClickHouse/issues/42451). [#42465](https://github.com/ClickHouse/ClickHouse/pull/42465) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* `AggregateFunctionQuantile` now correctly works with UInt128 columns. Previously, the quantile state interpreted `UInt128` columns as `Int128`, which could have led to incorrect results. [#42473](https://github.com/ClickHouse/ClickHouse/pull/42473) ([Antonio Andelic](https://github.com/antonio2368)).
* Fix a bad_cast assert during INSERT into `Annoy` indexes over non-Float32 columns. `Annoy` indexes are an experimental feature. [#42485](https://github.com/ClickHouse/ClickHouse/pull/42485) ([Robert Schulze](https://github.com/rschu1ze)).
* An arithmetic operator with a Date or DateTime and a 128- or 256-bit integer was referencing uninitialized memory. [#42453](https://github.com/ClickHouse/ClickHouse/issues/42453). [#42573](https://github.com/ClickHouse/ClickHouse/pull/42573) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Fix an unexpected table loading error when the partition key contains alias function names during server upgrade. [#36379](https://github.com/ClickHouse/ClickHouse/pull/36379) ([Amos Bird](https://github.com/amosbird)).

### <a id="229"></a> ClickHouse release 22.9, 2022-09-22

#### Backward Incompatible Change

@ -17,6 +17,33 @@ title: Troubleshooting
- Check firewall settings.
- If you cannot access the repository for any reason, download packages as described in the [install guide](../getting-started/install.md) article and install them manually using the `sudo dpkg -i <packages>` command. You will also need the `tzdata` package.

### You Cannot Update Deb Packages from ClickHouse Repository with Apt-get {#you-cannot-update-deb-packages-from-clickhouse-repository-with-apt-get}

- This issue may happen when the GPG key has changed.

Please use the following commands to resolve the issue:

```bash
sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 8919F6BD2B48D754
sudo apt-get update
```

### You Get the Unsupported Architecture Warning with Apt-get {#you-get-the-unsupported-architecture-warning-with-apt-get}

- The complete warning message is as follows:

```
N: Skipping acquire of configured file 'main/binary-i386/Packages' as repository 'https://packages.clickhouse.com/deb stable InRelease' doesn't support architecture 'i386'
```

To resolve the above issue, please use the following script:

```bash
sudo rm /var/lib/apt/lists/packages.clickhouse.com_* /var/lib/dpkg/arch
sudo apt-get clean
sudo apt-get autoclean
```

## Connecting to the Server {#troubleshooting-accepts-no-connections}

Possible issues:

@ -376,14 +376,6 @@ Result:
└─────┘
```

## UUIDStringToNum(str)

Accepts a string containing 36 characters in the format `123e4567-e89b-12d3-a456-426655440000`, and returns it as a set of bytes in a FixedString(16).

## UUIDNumToString(str)

Accepts a FixedString(16) value. Returns a string containing 36 characters in text format.

## bitmaskToList(num)

Accepts an integer. Returns a string containing the list of powers of two that total the source number when summed. They are comma-separated without spaces in text format, in ascending order.

@ -211,12 +211,19 @@ SELECT toUUIDOrZero('61f0c404-5cb3-11e7-907b-a6006ad3dba0T') AS uuid

## UUIDStringToNum

Accepts a string containing 36 characters in the format `xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx`, and returns it as a set of bytes in a [FixedString(16)](../../sql-reference/data-types/fixedstring.md).
Accepts `string` containing 36 characters in the format `xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx`, and returns a [FixedString(16)](../../sql-reference/data-types/fixedstring.md) as its binary representation, with its format optionally specified by `variant` (`Big-endian` by default).

**Syntax**

``` sql
UUIDStringToNum(String)
UUIDStringToNum(string[, variant = 1])
```

**Arguments**

- `string` — String of 36 characters or FixedString(36). [String](../../sql-reference/syntax.md#syntax-string-literal).
- `variant` — Integer, representing a variant as specified by [RFC4122](https://datatracker.ietf.org/doc/html/rfc4122#section-4.1.1). 1 = `Big-endian` (default), 2 = `Microsoft`.

**Returned value**

FixedString(16)
@ -235,14 +242,33 @@ SELECT
└──────────────────────────────────────┴──────────────────┘
```

``` sql
SELECT
    '612f3c40-5d3b-217e-707b-6a546a3d7b29' AS uuid,
    UUIDStringToNum(uuid, 2) AS bytes
```

``` text
┌─uuid─────────────────────────────────┬─bytes────────────┐
│ 612f3c40-5d3b-217e-707b-6a546a3d7b29 │ @</a;]~!p{jTj={) │
└──────────────────────────────────────┴──────────────────┘
```

## UUIDNumToString

Accepts a [FixedString(16)](../../sql-reference/data-types/fixedstring.md) value, and returns a string containing 36 characters in text format.
Accepts `binary` containing a binary representation of a UUID, with its format optionally specified by `variant` (`Big-endian` by default), and returns a string containing 36 characters in text format.

**Syntax**

``` sql
UUIDNumToString(FixedString(16))
UUIDNumToString(binary[, variant = 1])
```

**Arguments**

- `binary` — [FixedString(16)](../../sql-reference/data-types/fixedstring.md) as a binary representation of a UUID.
- `variant` — Integer, representing a variant as specified by [RFC4122](https://datatracker.ietf.org/doc/html/rfc4122#section-4.1.1). 1 = `Big-endian` (default), 2 = `Microsoft`.

**Returned value**

String.
@ -261,6 +287,18 @@ SELECT
└──────────────────┴──────────────────────────────────────┘
```

``` sql
SELECT
    '@</a;]~!p{jTj={)' AS bytes,
    UUIDNumToString(toFixedString(bytes, 16), 2) AS uuid
```

``` text
┌─bytes────────────┬─uuid─────────────────────────────────┐
│ @</a;]~!p{jTj={) │ 612f3c40-5d3b-217e-707b-6a546a3d7b29 │
└──────────────────┴──────────────────────────────────────┘
```

## serverUUID()

Returns the random and unique UUID, which is generated when the server is first started and stored forever. The result is written to the file `uuid` in the ClickHouse server directory `/var/lib/clickhouse/`.
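A minimal usage sketch:

``` sql
SELECT serverUUID();
```
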
@ -81,6 +81,7 @@ Multiple path components can have globs. For being processed file must exist and
- `?` — Substitutes any single character.
- `{some_string,another_string,yet_another_one}` — Substitutes any of strings `'some_string', 'another_string', 'yet_another_one'`.
- `{N..M}` — Substitutes any number in range from N to M including both borders.
- `**` — Fetches all files inside the folder recursively.

Constructions with `{}` are similar to the [remote](remote.md) table function.

@ -119,6 +120,22 @@ Query the data from files named `file000`, `file001`, … , `file999`:
SELECT count(*) FROM file('big_dir/file{0..9}{0..9}{0..9}', 'CSV', 'name String, value UInt32');
```

**Example**

Query the data from all files inside the `big_dir` directory recursively:

``` sql
SELECT count(*) FROM file('big_dir/**', 'CSV', 'name String, value UInt32');
```

**Example**

Query the data from all `file002` files from any folder inside the `big_dir` directory recursively:

``` sql
SELECT count(*) FROM file('big_dir/**/file002', 'CSV', 'name String, value UInt32');
```

## Virtual Columns

- `_path` — Path to the file.

@ -127,6 +127,18 @@ INSERT INTO FUNCTION s3('https://clickhouse-public-datasets.s3.amazonaws.com/my-
SELECT name, value FROM existing_table;
```

The glob `**` can be used for recursive directory traversal. Consider the example below; it will fetch all files from the `my-test-bucket-768` directory recursively:

``` sql
SELECT * FROM s3('https://clickhouse-public-datasets.s3.amazonaws.com/my-test-bucket-768/**', 'CSV', 'name String, value UInt32', 'gzip');
```

The query below gets data from all `test-data.csv.gz` files from any folder inside the `my-test-bucket` directory recursively:

``` sql
SELECT * FROM s3('https://clickhouse-public-datasets.s3.amazonaws.com/my-test-bucket-768/**/test-data.csv.gz', 'CSV', 'name String, value UInt32', 'gzip');
```

## Partitioned Write

If you specify a `PARTITION BY` expression when inserting data into an `S3` table, a separate file is created for each partition value. Splitting the data into separate files helps to improve the efficiency of read operations.

@ -112,11 +112,6 @@ public:
|
||||
return QueryTreeNodeType::COLUMN;
|
||||
}
|
||||
|
||||
String getName() const override
|
||||
{
|
||||
return column.name;
|
||||
}
|
||||
|
||||
DataTypePtr getResultType() const override
|
||||
{
|
||||
return column.type;
|
||||
|
@ -51,11 +51,6 @@ public:
|
||||
return QueryTreeNodeType::CONSTANT;
|
||||
}
|
||||
|
||||
String getName() const override
|
||||
{
|
||||
return value_string;
|
||||
}
|
||||
|
||||
DataTypePtr getResultType() const override
|
||||
{
|
||||
return constant_value->getType();
|
||||
|
@ -93,27 +93,6 @@ void FunctionNode::dumpTreeImpl(WriteBuffer & buffer, FormatState & format_state
|
||||
}
|
||||
}
|
||||
|
||||
String FunctionNode::getName() const
|
||||
{
|
||||
String name = function_name;
|
||||
|
||||
const auto & parameters = getParameters();
|
||||
const auto & parameters_nodes = parameters.getNodes();
|
||||
if (!parameters_nodes.empty())
|
||||
{
|
||||
name += '(';
|
||||
name += parameters.getName();
|
||||
name += ')';
|
||||
}
|
||||
|
||||
const auto & arguments = getArguments();
|
||||
name += '(';
|
||||
name += arguments.getName();
|
||||
name += ')';
|
||||
|
||||
return name;
|
||||
}
|
||||
|
||||
bool FunctionNode::isEqualImpl(const IQueryTreeNode & rhs) const
|
||||
{
|
||||
const auto & rhs_typed = assert_cast<const FunctionNode &>(rhs);
|
||||
|
@ -203,8 +203,6 @@ public:
|
||||
return result_type;
|
||||
}
|
||||
|
||||
String getName() const override;
|
||||
|
||||
void dumpTreeImpl(WriteBuffer & buffer, FormatState & format_state, size_t indent) const override;
|
||||
|
||||
protected:
|
||||
|
@ -82,14 +82,6 @@ public:
|
||||
return toString(getNodeType());
|
||||
}
|
||||
|
||||
/** Get name of query tree node that can be used as part of expression.
|
||||
* TODO: Projection name, expression name must be refactored in better interface.
|
||||
*/
|
||||
virtual String getName() const
|
||||
{
|
||||
throw Exception(ErrorCodes::UNSUPPORTED_METHOD, "Method getName is not supported for {} query node", getNodeTypeName());
|
||||
}
|
||||
|
||||
/** Get result type of query tree node that can be used as part of expression.
|
||||
* If node does not support this method exception is thrown.
|
||||
* TODO: Maybe this can be a part of ExpressionQueryTreeNode.
|
||||
|
@ -50,11 +50,6 @@ public:
|
||||
return QueryTreeNodeType::IDENTIFIER;
|
||||
}
|
||||
|
||||
String getName() const override
|
||||
{
|
||||
return identifier.getFullName();
|
||||
}
|
||||
|
||||
void dumpTreeImpl(WriteBuffer & buffer, FormatState & format_state, size_t indent) const override;
|
||||
|
||||
protected:
|
||||
|
@ -17,15 +17,6 @@ InterpolateNode::InterpolateNode(QueryTreeNodePtr expression_, QueryTreeNodePtr
|
||||
children[interpolate_expression_child_index] = std::move(interpolate_expression_);
|
||||
}
|
||||
|
||||
String InterpolateNode::getName() const
|
||||
{
|
||||
String result = getExpression()->getName();
|
||||
result += " AS ";
|
||||
result += getInterpolateExpression()->getName();
|
||||
|
||||
return result;
|
||||
}
|
||||
|
||||
void InterpolateNode::dumpTreeImpl(WriteBuffer & buffer, FormatState & format_state, size_t indent) const
|
||||
{
|
||||
buffer << std::string(indent, ' ') << "INTERPOLATE id: " << format_state.getNodeId(this);
|
||||
|
@ -50,8 +50,6 @@ public:
|
||||
return QueryTreeNodeType::INTERPOLATE;
|
||||
}
|
||||
|
||||
String getName() const override;
|
||||
|
||||
void dumpTreeImpl(WriteBuffer & buffer, FormatState & format_state, size_t indent) const override;
|
||||
|
||||
protected:
|
||||
|
@ -44,11 +44,6 @@ void LambdaNode::dumpTreeImpl(WriteBuffer & buffer, FormatState & format_state,
|
||||
getExpression()->dumpTreeImpl(buffer, format_state, indent + 4);
|
||||
}
|
||||
|
||||
String LambdaNode::getName() const
|
||||
{
|
||||
return "lambda(" + children[arguments_child_index]->getName() + ") -> " + children[expression_child_index]->getName();
|
||||
}
|
||||
|
||||
bool LambdaNode::isEqualImpl(const IQueryTreeNode & rhs) const
|
||||
{
|
||||
const auto & rhs_typed = assert_cast<const LambdaNode &>(rhs);
|
||||
|
@ -84,8 +84,6 @@ public:
|
||||
return QueryTreeNodeType::LAMBDA;
|
||||
}
|
||||
|
||||
String getName() const override;
|
||||
|
||||
DataTypePtr getResultType() const override
|
||||
{
|
||||
return getExpression()->getResultType();
|
||||
|
@ -38,24 +38,6 @@ void ListNode::dumpTreeImpl(WriteBuffer & buffer, FormatState & format_state, si
|
||||
}
|
||||
}
|
||||
|
||||
String ListNode::getName() const
|
||||
{
|
||||
if (children.empty())
|
||||
return "";
|
||||
|
||||
std::string result;
|
||||
for (const auto & node : children)
|
||||
{
|
||||
result += node->getName();
|
||||
result += ", ";
|
||||
}
|
||||
|
||||
result.pop_back();
|
||||
result.pop_back();
|
||||
|
||||
return result;
|
||||
}
|
||||
|
||||
bool ListNode::isEqualImpl(const IQueryTreeNode &) const
|
||||
{
|
||||
/// No state
|
||||
|
@ -39,8 +39,6 @@ public:
|
||||
return QueryTreeNodeType::LIST;
|
||||
}
|
||||
|
||||
String getName() const override;
|
||||
|
||||
void dumpTreeImpl(WriteBuffer & buffer, FormatState & format_state, size_t indent) const override;
|
||||
|
||||
protected:
|
||||
|
@ -146,55 +146,6 @@ void MatcherNode::dumpTreeImpl(WriteBuffer & buffer, FormatState & format_state,
|
||||
}
|
||||
}
|
||||
|
||||
String MatcherNode::getName() const
|
||||
{
|
||||
WriteBufferFromOwnString buffer;
|
||||
|
||||
if (!qualified_identifier.empty())
|
||||
buffer << qualified_identifier.getFullName() << '.';
|
||||
|
||||
if (matcher_type == MatcherNodeType::ASTERISK)
|
||||
{
|
||||
buffer << '*';
|
||||
}
|
||||
else
|
||||
{
|
||||
buffer << "COLUMNS(";
|
||||
|
||||
if (columns_matcher)
|
||||
{
|
||||
buffer << ' ' << columns_matcher->pattern();
|
||||
}
|
||||
else if (matcher_type == MatcherNodeType::COLUMNS_LIST)
|
||||
{
|
||||
size_t columns_identifiers_size = columns_identifiers.size();
|
||||
for (size_t i = 0; i < columns_identifiers_size; ++i)
|
||||
{
|
||||
buffer << columns_identifiers[i].getFullName();
|
||||
|
||||
if (i + 1 != columns_identifiers_size)
|
||||
buffer << ", ";
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
buffer << ')';
|
||||
|
||||
const auto & column_transformers = getColumnTransformers().getNodes();
|
||||
size_t column_transformers_size = column_transformers.size();
|
||||
|
||||
for (size_t i = 0; i < column_transformers_size; ++i)
|
||||
{
|
||||
const auto & column_transformer = column_transformers[i];
|
||||
buffer << column_transformer->getName();
|
||||
|
||||
if (i + 1 != column_transformers_size)
|
||||
buffer << ' ';
|
||||
}
|
||||
|
||||
return buffer.str();
|
||||
}
|
||||
|
||||
bool MatcherNode::isEqualImpl(const IQueryTreeNode & rhs) const
|
||||
{
|
||||
const auto & rhs_typed = assert_cast<const MatcherNode &>(rhs);
|
||||
|
@ -139,8 +139,6 @@ public:
|
||||
return QueryTreeNodeType::MATCHER;
|
||||
}
|
||||
|
||||
String getName() const override;
|
||||
|
||||
void dumpTreeImpl(WriteBuffer & buffer, FormatState & format_state, size_t indent) const override;
|
||||
|
||||
protected:
|
||||
|
@ -2255,7 +2255,7 @@ QueryTreeNodePtr QueryAnalyzer::tryResolveIdentifierFromJoin(const IdentifierLoo
|
||||
for (auto & join_using_node : join_using_list.getNodes())
|
||||
{
|
||||
auto & column_node = join_using_node->as<ColumnNode &>();
|
||||
join_using_column_name_to_column_node.emplace(column_node.getName(), std::static_pointer_cast<ColumnNode>(join_using_node));
|
||||
join_using_column_name_to_column_node.emplace(column_node.getColumnName(), std::static_pointer_cast<ColumnNode>(join_using_node));
|
||||
}
|
||||
}
|
||||
|
||||
|
@ -32,100 +32,6 @@ QueryNode::QueryNode()
|
||||
children[limit_by_child_index] = std::make_shared<ListNode>();
|
||||
}
|
||||
|
||||
String QueryNode::getName() const
|
||||
{
|
||||
WriteBufferFromOwnString buffer;
|
||||
|
||||
if (hasWith())
|
||||
{
|
||||
buffer << getWith().getName();
|
||||
buffer << ' ';
|
||||
}
|
||||
|
||||
buffer << "SELECT ";
|
||||
buffer << getProjection().getName();
|
||||
|
||||
if (getJoinTree())
|
||||
{
|
||||
buffer << " FROM ";
|
||||
buffer << getJoinTree()->getName();
|
||||
}
|
||||
|
||||
if (getPrewhere())
|
||||
{
|
||||
buffer << " PREWHERE ";
|
||||
buffer << getPrewhere()->getName();
|
||||
}
|
||||
|
||||
if (getWhere())
|
||||
{
|
||||
buffer << " WHERE ";
|
||||
buffer << getWhere()->getName();
|
||||
}
|
||||
|
||||
if (hasGroupBy())
|
||||
{
|
||||
buffer << " GROUP BY ";
|
||||
buffer << getGroupBy().getName();
|
||||
}
|
||||
|
||||
if (hasHaving())
|
||||
{
|
||||
buffer << " HAVING ";
|
||||
buffer << getHaving()->getName();
|
||||
}
|
||||
|
||||
if (hasWindow())
|
||||
{
|
||||
buffer << " WINDOW ";
|
||||
buffer << getWindow().getName();
|
||||
}
|
||||
|
||||
if (hasOrderBy())
|
||||
{
|
||||
buffer << " ORDER BY ";
|
||||
buffer << getOrderByNode()->getName();
|
||||
}
|
||||
|
||||
if (hasInterpolate())
|
||||
{
|
||||
buffer << " INTERPOLATE ";
|
||||
buffer << getInterpolate()->getName();
|
||||
}
|
||||
|
||||
if (hasLimitByLimit())
|
||||
{
|
||||
buffer << "LIMIT ";
|
||||
buffer << getLimitByLimit()->getName();
|
||||
}
|
||||
|
||||
if (hasLimitByOffset())
|
||||
{
|
||||
buffer << "OFFSET ";
|
||||
buffer << getLimitByOffset()->getName();
|
||||
}
|
||||
|
||||
if (hasLimitBy())
|
||||
{
|
||||
buffer << " BY ";
|
||||
buffer << getLimitBy().getName();
|
||||
}
|
||||
|
||||
if (hasLimit())
|
||||
{
|
||||
buffer << " LIMIT ";
|
||||
buffer << getLimit()->getName();
|
||||
}
|
||||
|
||||
if (hasOffset())
|
||||
{
|
||||
buffer << " OFFSET ";
|
||||
buffer << getOffset()->getName();
|
||||
}
|
||||
|
||||
return buffer.str();
|
||||
}
|
||||
|
||||
void QueryNode::dumpTreeImpl(WriteBuffer & buffer, FormatState & format_state, size_t indent) const
|
||||
{
|
||||
buffer << std::string(indent, ' ') << "QUERY id: " << format_state.getNodeId(this);
|
||||
|
@ -559,8 +559,6 @@ public:
|
||||
return QueryTreeNodeType::QUERY;
|
||||
}
|
||||
|
||||
String getName() const override;
|
||||
|
||||
DataTypePtr getResultType() const override
|
||||
{
|
||||
if (constant_value)
|
||||
|
@ -35,38 +35,6 @@ SortNode::SortNode(QueryTreeNodePtr expression_,
|
||||
children[sort_expression_child_index] = std::move(expression_);
|
||||
}
|
||||
|
||||
String SortNode::getName() const
|
||||
{
|
||||
String result = getExpression()->getName();
|
||||
|
||||
if (sort_direction == SortDirection::ASCENDING)
|
||||
result += " ASC";
|
||||
else
|
||||
result += " DESC";
|
||||
|
||||
if (nulls_sort_direction)
|
||||
{
|
||||
if (*nulls_sort_direction == SortDirection::ASCENDING)
|
||||
result += " NULLS FIRST";
|
||||
else
|
||||
result += " NULLS LAST";
|
||||
}
|
||||
|
||||
if (with_fill)
|
||||
result += " WITH FILL";
|
||||
|
||||
if (hasFillFrom())
|
||||
result += " FROM " + getFillFrom()->getName();
|
||||
|
||||
if (hasFillStep())
|
||||
result += " STEP " + getFillStep()->getName();
|
||||
|
||||
if (hasFillTo())
|
||||
result += " TO " + getFillTo()->getName();
|
||||
|
||||
return result;
|
||||
}
|
||||
|
||||
void SortNode::dumpTreeImpl(WriteBuffer & buffer, FormatState & format_state, size_t indent) const
|
||||
{
|
||||
buffer << std::string(indent, ' ') << "SORT id: " << format_state.getNodeId(this);
|
||||
|
@ -128,11 +128,8 @@ public:
|
||||
return QueryTreeNodeType::SORT;
|
||||
}
|
||||
|
||||
String getName() const override;
|
||||
|
||||
void dumpTreeImpl(WriteBuffer & buffer, FormatState & format_state, size_t indent) const override;
|
||||
|
||||
|
||||
protected:
|
||||
bool isEqualImpl(const IQueryTreeNode & rhs) const override;
|
||||
|
||||
|
@ -50,18 +50,6 @@ const StorageSnapshotPtr & TableFunctionNode::getStorageSnapshot() const
|
||||
return storage_snapshot;
|
||||
}
|
||||
|
||||
String TableFunctionNode::getName() const
|
||||
{
|
||||
String name = table_function_name;
|
||||
|
||||
const auto & arguments = getArguments();
|
||||
name += '(';
|
||||
name += arguments.getName();
|
||||
name += ')';
|
||||
|
||||
return name;
|
||||
}
|
||||
|
||||
void TableFunctionNode::dumpTreeImpl(WriteBuffer & buffer, FormatState & format_state, size_t indent) const
|
||||
{
|
||||
buffer << std::string(indent, ' ') << "TABLE_FUNCTION id: " << format_state.getNodeId(this);
|
||||
|
@ -127,8 +127,6 @@ public:
|
||||
return QueryTreeNodeType::TABLE_FUNCTION;
|
||||
}
|
||||
|
||||
String getName() const override;
|
||||
|
||||
void dumpTreeImpl(WriteBuffer & buffer, FormatState & format_state, size_t indent) const override;
|
||||
|
||||
protected:
|
||||
|
@ -66,11 +66,6 @@ void TableNode::updateTreeHashImpl(HashState & state) const
|
||||
table_expression_modifiers->updateTreeHash(state);
|
||||
}
|
||||
|
||||
String TableNode::getName() const
|
||||
{
|
||||
return storage->getStorageID().getFullNameNotQuoted();
|
||||
}
|
||||
|
||||
 QueryTreeNodePtr TableNode::cloneImpl() const
 {
     auto result_table_node = std::make_shared<TableNode>(storage, storage_id, storage_lock, storage_snapshot);

@@ -76,8 +76,6 @@ public:
         return QueryTreeNodeType::TABLE;
     }

-    String getName() const override;
-
     void dumpTreeImpl(WriteBuffer & buffer, FormatState & format_state, size_t indent) const override;

 protected:

@@ -79,46 +79,6 @@ NamesAndTypes UnionNode::computeProjectionColumns() const
     return result_columns;
 }

-String UnionNode::getName() const
-{
-    WriteBufferFromOwnString buffer;
-
-    auto query_nodes = getQueries().getNodes();
-    size_t query_nodes_size = query_nodes.size();
-
-    for (size_t i = 0; i < query_nodes_size; ++i)
-    {
-        const auto & query_node = query_nodes[i];
-        buffer << query_node->getName();
-
-        if (i == 0)
-            continue;
-
-        auto query_union_mode = union_modes.at(i - 1);
-
-        if (query_union_mode == SelectUnionMode::UNION_DEFAULT)
-            buffer << "UNION";
-        else if (query_union_mode == SelectUnionMode::UNION_ALL)
-            buffer << "UNION ALL";
-        else if (query_union_mode == SelectUnionMode::UNION_DISTINCT)
-            buffer << "UNION DISTINCT";
-        else if (query_union_mode == SelectUnionMode::EXCEPT_DEFAULT)
-            buffer << "EXCEPT";
-        else if (query_union_mode == SelectUnionMode::EXCEPT_ALL)
-            buffer << "EXCEPT ALL";
-        else if (query_union_mode == SelectUnionMode::EXCEPT_DISTINCT)
-            buffer << "EXCEPT DISTINCT";
-        else if (query_union_mode == SelectUnionMode::INTERSECT_DEFAULT)
-            buffer << "INTERSECT";
-        else if (query_union_mode == SelectUnionMode::INTERSECT_ALL)
-            buffer << "INTERSECT ALL";
-        else if (query_union_mode == SelectUnionMode::INTERSECT_DISTINCT)
-            buffer << "INTERSECT DISTINCT";
-    }
-
-    return buffer.str();
-}
-
 void UnionNode::dumpTreeImpl(WriteBuffer & buffer, FormatState & format_state, size_t indent) const
 {
     buffer << std::string(indent, ' ') << "UNION id: " << format_state.getNodeId(this);

@@ -154,8 +154,6 @@ public:
         return QueryTreeNodeType::UNION;
     }

-    String getName() const override;
-
     DataTypePtr getResultType() const override
     {
         if (constant_value)

@@ -18,75 +18,6 @@ WindowNode::WindowNode(WindowFrame window_frame_)
     children[order_by_child_index] = std::make_shared<ListNode>();
 }

-String WindowNode::getName() const
-{
-    String result;
-
-    if (hasPartitionBy())
-    {
-        result += "PARTITION BY";
-        result += getPartitionBy().getName();
-    }
-
-    if (hasOrderBy())
-    {
-        result += "ORDER BY";
-        result += getOrderBy().getName();
-    }
-
-    if (!window_frame.is_default)
-    {
-        if (hasPartitionBy() || hasOrderBy())
-            result += ' ';
-
-        if (window_frame.type == WindowFrame::FrameType::ROWS)
-            result += "ROWS";
-        else if (window_frame.type == WindowFrame::FrameType::GROUPS)
-            result += "GROUPS";
-        else if (window_frame.type == WindowFrame::FrameType::RANGE)
-            result += "RANGE";
-
-        result += " BETWEEN ";
-        if (window_frame.begin_type == WindowFrame::BoundaryType::Current)
-        {
-            result += "CURRENT ROW";
-        }
-        else if (window_frame.begin_type == WindowFrame::BoundaryType::Unbounded)
-        {
-            result += "UNBOUNDED";
-            result += " ";
-            result += (window_frame.begin_preceding ? "PRECEDING" : "FOLLOWING");
-        }
-        else
-        {
-            result += getFrameBeginOffsetNode()->getName();
-            result += " ";
-            result += (window_frame.begin_preceding ? "PRECEDING" : "FOLLOWING");
-        }
-
-        result += " AND ";
-
-        if (window_frame.end_type == WindowFrame::BoundaryType::Current)
-        {
-            result += "CURRENT ROW";
-        }
-        else if (window_frame.end_type == WindowFrame::BoundaryType::Unbounded)
-        {
-            result += "UNBOUNDED";
-            result += " ";
-            result += (window_frame.end_preceding ? "PRECEDING" : "FOLLOWING");
-        }
-        else
-        {
-            result += getFrameEndOffsetNode()->getName();
-            result += " ";
-            result += (window_frame.end_preceding ? "PRECEDING" : "FOLLOWING");
-        }
-    }
-
-    return result;
-}
-
 void WindowNode::dumpTreeImpl(WriteBuffer & buffer, FormatState & format_state, size_t indent) const
 {
     buffer << std::string(indent, ' ') << "WINDOW id: " << format_state.getNodeId(this);

@@ -166,8 +166,6 @@ public:
         return QueryTreeNodeType::WINDOW;
     }

-    String getName() const override;
-
     void dumpTreeImpl(WriteBuffer & buffer, FormatState & format_state, size_t indent) const override;

 protected:
@@ -28,13 +28,17 @@ public:
     template <size_t ELEMENT_SIZE>
     const char * getRawDataBegin() const
     {
-        return reinterpret_cast<const PODArrayBase<ELEMENT_SIZE, 4096, Allocator<false>, 15, 16> *>(reinterpret_cast<const char *>(this) + sizeof(*this))->raw_data();
+        return reinterpret_cast<const PODArrayBase<ELEMENT_SIZE, 4096, Allocator<false>, PADDING_FOR_SIMD - 1, PADDING_FOR_SIMD> *>(
+                   reinterpret_cast<const char *>(this) + sizeof(*this))
+            ->raw_data();
     }

     template <size_t ELEMENT_SIZE>
     void insertRawData(const char * ptr)
     {
-        return reinterpret_cast<PODArrayBase<ELEMENT_SIZE, 4096, Allocator<false>, 15, 16> *>(reinterpret_cast<char *>(this) + sizeof(*this))->push_back_raw(ptr);
+        return reinterpret_cast<PODArrayBase<ELEMENT_SIZE, 4096, Allocator<false>, PADDING_FOR_SIMD - 1, PADDING_FOR_SIMD> *>(
+                   reinterpret_cast<char *>(this) + sizeof(*this))
+            ->push_back_raw(ptr);
     }
 };

@@ -34,8 +34,7 @@ namespace DB
 class Arena : private boost::noncopyable
 {
 private:
-    /// Padding allows to use 'memcpySmallAllowReadWriteOverflow15' instead of 'memcpy'.
-    static constexpr size_t pad_right = 15;
+    static constexpr size_t pad_right = PADDING_FOR_SIMD - 1;

     /// Contiguous MemoryChunk of memory and pointer to free space inside it. Member of single-linked list.
     struct alignas(16) MemoryChunk : private Allocator<false>    /// empty base optimization

@@ -6,14 +6,13 @@ namespace DB
 /// Used for left padding of PODArray when empty
 const char empty_pod_array[empty_pod_array_size]{};

-template class PODArray<UInt8, 4096, Allocator<false>, 15, 16>;
-template class PODArray<UInt16, 4096, Allocator<false>, 15, 16>;
-template class PODArray<UInt32, 4096, Allocator<false>, 15, 16>;
-template class PODArray<UInt64, 4096, Allocator<false>, 15, 16>;
-
-template class PODArray<Int8, 4096, Allocator<false>, 15, 16>;
-template class PODArray<Int16, 4096, Allocator<false>, 15, 16>;
-template class PODArray<Int32, 4096, Allocator<false>, 15, 16>;
-template class PODArray<Int64, 4096, Allocator<false>, 15, 16>;
+template class PODArray<UInt8, 4096, Allocator<false>, PADDING_FOR_SIMD - 1, PADDING_FOR_SIMD>;
+template class PODArray<UInt16, 4096, Allocator<false>, PADDING_FOR_SIMD - 1, PADDING_FOR_SIMD>;
+template class PODArray<UInt32, 4096, Allocator<false>, PADDING_FOR_SIMD - 1, PADDING_FOR_SIMD>;
+template class PODArray<UInt64, 4096, Allocator<false>, PADDING_FOR_SIMD - 1, PADDING_FOR_SIMD>;
+
+template class PODArray<Int8, 4096, Allocator<false>, PADDING_FOR_SIMD - 1, PADDING_FOR_SIMD>;
+template class PODArray<Int16, 4096, Allocator<false>, PADDING_FOR_SIMD - 1, PADDING_FOR_SIMD>;
+template class PODArray<Int32, 4096, Allocator<false>, PADDING_FOR_SIMD - 1, PADDING_FOR_SIMD>;
+template class PODArray<Int64, 4096, Allocator<false>, PADDING_FOR_SIMD - 1, PADDING_FOR_SIMD>;
 }

@@ -502,7 +502,7 @@ public:
     template <typename It1, typename It2, typename ... TAllocatorParams>
     void insertSmallAllowReadWriteOverflow15(It1 from_begin, It2 from_end, TAllocatorParams &&... allocator_params)
     {
-        static_assert(pad_right_ >= 15);
+        static_assert(pad_right_ >= PADDING_FOR_SIMD - 1);
         static_assert(sizeof(T) == sizeof(*from_begin));
         insertPrepare(from_begin, from_end, std::forward<TAllocatorParams>(allocator_params)...);
         size_t bytes_to_copy = this->byte_size(from_end - from_begin);

@@ -778,14 +778,13 @@ void swap(PODArray<T, initial_bytes, TAllocator, pad_right_, pad_left_> & lhs, PODArray<T, initial_bytes, TAllocator, pad_right_, pad_left_> & rhs)

 /// Prevent implicit template instantiation of PODArray for common numeric types

-extern template class PODArray<UInt8, 4096, Allocator<false>, 15, 16>;
-extern template class PODArray<UInt16, 4096, Allocator<false>, 15, 16>;
-extern template class PODArray<UInt32, 4096, Allocator<false>, 15, 16>;
-extern template class PODArray<UInt64, 4096, Allocator<false>, 15, 16>;
-
-extern template class PODArray<Int8, 4096, Allocator<false>, 15, 16>;
-extern template class PODArray<Int16, 4096, Allocator<false>, 15, 16>;
-extern template class PODArray<Int32, 4096, Allocator<false>, 15, 16>;
-extern template class PODArray<Int64, 4096, Allocator<false>, 15, 16>;
+extern template class PODArray<UInt8, 4096, Allocator<false>, PADDING_FOR_SIMD - 1, PADDING_FOR_SIMD>;
+extern template class PODArray<UInt16, 4096, Allocator<false>, PADDING_FOR_SIMD - 1, PADDING_FOR_SIMD>;
+extern template class PODArray<UInt32, 4096, Allocator<false>, PADDING_FOR_SIMD - 1, PADDING_FOR_SIMD>;
+extern template class PODArray<UInt64, 4096, Allocator<false>, PADDING_FOR_SIMD - 1, PADDING_FOR_SIMD>;
+
+extern template class PODArray<Int8, 4096, Allocator<false>, PADDING_FOR_SIMD - 1, PADDING_FOR_SIMD>;
+extern template class PODArray<Int16, 4096, Allocator<false>, PADDING_FOR_SIMD - 1, PADDING_FOR_SIMD>;
+extern template class PODArray<Int32, 4096, Allocator<false>, PADDING_FOR_SIMD - 1, PADDING_FOR_SIMD>;
+extern template class PODArray<Int64, 4096, Allocator<false>, PADDING_FOR_SIMD - 1, PADDING_FOR_SIMD>;
 }

@@ -4,6 +4,7 @@
   * PODArray.
   */

+#include <Core/Defines.h>
 #include <base/types.h>
 #include <Common/Allocator_fwd.h>

@@ -22,7 +23,7 @@ class PODArray;

 /** For columns. Padding is enough to read and write xmm-register at the address of the last element. */
 template <typename T, size_t initial_bytes = 4096, typename TAllocator = Allocator<false>>
-using PaddedPODArray = PODArray<T, initial_bytes, TAllocator, 15, 16>;
+using PaddedPODArray = PODArray<T, initial_bytes, TAllocator, PADDING_FOR_SIMD - 1, PADDING_FOR_SIMD>;

 /** A helper for declaring PODArray that uses inline memory.
   * The initial size is set to use all the inline bytes, since using less would
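The two trailing PODArray template parameters being changed throughout this patch are the right and left padding sizes in bytes. A standalone sanity check of the new values (the meaning of the two parameters is inferred from the comments in this patch, not quoted from the PODArray implementation):

// Hypothetical illustration: 15/16 bytes of padding (one 16-byte SSE register)
// become 63/64 bytes (one cache line, the width assumed safe for wider SIMD).
#include <cstddef>

constexpr std::size_t PADDING_FOR_SIMD = 64;             // value added in Core/Defines.h below
constexpr std::size_t pad_right = PADDING_FOR_SIMD - 1;  // bytes readable/writable past the last element
constexpr std::size_t pad_left = PADDING_FOR_SIMD;       // bytes reserved before the first element

static_assert(pad_right == 63 && pad_left == 64, "the values used in the templates above");

int main() { return 0; }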
@@ -777,19 +777,34 @@ bool ZooKeeper::waitForDisappear(const std::string & path, const WaitCondition & condition)
     return false;
 }

-void ZooKeeper::waitForEphemeralToDisappearIfAny(const std::string & path)
+void ZooKeeper::handleEphemeralNodeExistence(const std::string & path, const std::string & fast_delete_if_equal_value)
 {
     zkutil::EventPtr eph_node_disappeared = std::make_shared<Poco::Event>();
     String content;
-    if (!tryGet(path, content, nullptr, eph_node_disappeared))
+    Coordination::Stat stat;
+    if (!tryGet(path, content, &stat, eph_node_disappeared))
         return;

-    int32_t timeout_ms = 3 * args.session_timeout_ms;
-    if (!eph_node_disappeared->tryWait(timeout_ms))
-        throw DB::Exception(DB::ErrorCodes::LOGICAL_ERROR,
-            "Ephemeral node {} still exists after {}s, probably it's owned by someone else. "
-            "Either session_timeout_ms in client's config is different from server's config or it's a bug. "
-            "Node data: '{}'", path, timeout_ms / 1000, content);
+    if (content == fast_delete_if_equal_value)
+    {
+        auto code = tryRemove(path, stat.version);
+        if (code != Coordination::Error::ZOK && code != Coordination::Error::ZNONODE)
+            throw Coordination::Exception(code, path);
+    }
+    else
+    {
+        LOG_WARNING(log, "Ephemeral node ('{}') already exists but it isn't owned by us. Will wait until it disappears", path);
+        int32_t timeout_ms = 3 * args.session_timeout_ms;
+        if (!eph_node_disappeared->tryWait(timeout_ms))
+            throw DB::Exception(
+                DB::ErrorCodes::LOGICAL_ERROR,
+                "Ephemeral node {} still exists after {}s, probably it's owned by someone else. "
+                "Either session_timeout_ms in client's config is different from server's config or it's a bug. "
+                "Node data: '{}'",
+                path,
+                timeout_ms / 1000,
+                content);
+    }
 }

 ZooKeeperPtr ZooKeeper::startNewSession() const
@@ -393,9 +393,11 @@ public:
     /// The function returns true if waited and false if waiting was interrupted by condition.
     bool waitForDisappear(const std::string & path, const WaitCondition & condition = {});

-    /// Wait for the ephemeral node created in previous session to disappear.
-    /// Throws LOGICAL_ERROR if node still exists after 2x session_timeout.
-    void waitForEphemeralToDisappearIfAny(const std::string & path);
+    /// Checks if the ephemeral node exists. These nodes are removed automatically by ZK when the session ends.
+    /// If the node exists and its value is equal to fast_delete_if_equal_value it will remove it.
+    /// If the node exists and its value is different, it will wait for it to disappear. It will throw a LOGICAL_ERROR if the node doesn't
+    /// disappear automatically after 3x session_timeout.
+    void handleEphemeralNodeExistence(const std::string & path, const std::string & fast_delete_if_equal_value);

     /// Async interface (a small subset of operations is implemented).
     ///
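A plain-C++ sketch of the control flow documented above; the ZooKeeper calls are stubbed stand-ins, only the branching mirrors handleEphemeralNodeExistence:

#include <stdexcept>
#include <string>

enum class Error { ZOK, ZNONODE, ZOTHER };

/// Stand-ins for the real client calls (assumptions for the sketch).
bool tryGet(const std::string &, std::string & content) { content = "replica_1"; return true; }
Error tryRemove(const std::string &, int /*version*/) { return Error::ZOK; }
bool waitForDisappear(int /*timeout_ms*/) { return true; }

void handleEphemeralNodeExistenceSketch(const std::string & path, const std::string & fast_delete_if_equal_value)
{
    std::string content;
    if (!tryGet(path, content))
        return; /// the node is already gone

    if (content == fast_delete_if_equal_value)
    {
        /// Created by us in a previous session: delete it immediately.
        if (auto code = tryRemove(path, 0); code != Error::ZOK && code != Error::ZNONODE)
            throw std::runtime_error("cannot remove " + path);
    }
    else if (!waitForDisappear(3000)) /// owned by someone else: wait up to 3x session timeout
        throw std::runtime_error("ephemeral node still exists: " + path);
}

int main() { handleEphemeralNodeExistenceSketch("/clickhouse/replica", "replica_1"); }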
@@ -609,7 +611,7 @@ public:
         catch (...)
         {
             ProfileEvents::increment(ProfileEvents::CannotRemoveEphemeralNode);
-            DB::tryLogCurrentException(__PRETTY_FUNCTION__, "Cannot remove " + path + ": ");
+            DB::tryLogCurrentException(__PRETTY_FUNCTION__, "Cannot remove " + path);
         }
     }
@@ -90,17 +90,23 @@ std::string makeRegexpPatternFromGlobs(const std::string & initial_str_with_globs)
     oss_for_replacing << escaped_with_globs.substr(current_index);
     std::string almost_res = oss_for_replacing.str();
     WriteBufferFromOwnString buf_final_processing;
+    char previous = ' ';
     for (const auto & letter : almost_res)
     {
-        if ((letter == '?') || (letter == '*'))
+        if (previous == '*' && letter == '*')
+        {
+            buf_final_processing << "[^{}]";
+        }
+        else if ((letter == '?') || (letter == '*'))
         {
             buf_final_processing << "[^/]";   /// '?' is any symbol except '/'
             if (letter == '?')
                 continue;
         }
-        if ((letter == '.') || (letter == '{') || (letter == '}'))
+        else if ((letter == '.') || (letter == '{') || (letter == '}'))
             buf_final_processing << '\\';
         buf_final_processing << letter;
+        previous = letter;
     }
     return buf_final_processing.str();
 }
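A self-contained sketch of the new translation loop (same logic as the hunk above, with ClickHouse's WriteBuffer replaced by std::string; the example inputs are hypothetical):

#include <iostream>
#include <string>

static std::string globBodyToRegexp(const std::string & almost_res)
{
    std::string out;
    char previous = ' ';
    for (const char letter : almost_res)
    {
        if (previous == '*' && letter == '*')
        {
            out += "[^{}]"; /// the second '*' of '**' may also match '/'
        }
        else if (letter == '?' || letter == '*')
        {
            out += "[^/]"; /// '?' is any symbol except '/'
            if (letter == '?')
                continue;
        }
        else if (letter == '.' || letter == '{' || letter == '}')
            out += '\\';
        out += letter;
        previous = letter;
    }
    return out;
}

int main()
{
    std::cout << globBodyToRegexp("data_*.csv") << '\n'; /// data_[^/]*\.csv
    std::cout << globBodyToRegexp("dir/**/x") << '\n';   /// dir/[^/]*[^{}]*/x
}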
@@ -532,7 +532,7 @@ TEST(Common, PODNoOverallocation)
         }
     }

-    EXPECT_EQ(capacities, (std::vector<size_t>{4065, 8161, 16353, 32737, 65505, 131041, 262113, 524257, 1048545}));
+    EXPECT_EQ(capacities, (std::vector<size_t>{3969, 8065, 16257, 32641, 65409, 130945, 262017, 524161, 1048449}));
 }

 template <size_t size>

@@ -14,17 +14,20 @@
 /// The size of the I/O buffer by default.
 #define DBMS_DEFAULT_BUFFER_SIZE 1048576ULL

+#define PADDING_FOR_SIMD 64
+
 /** Which blocks by default read the data (by number of rows).
   * Smaller values give better cache locality, less consumption of RAM, but more overhead to process the query.
   */
-#define DEFAULT_BLOCK_SIZE 65505 /// 65536 minus 16 + 15 bytes padding that we usually have in arrays
+#define DEFAULT_BLOCK_SIZE 65409 /// 65536 - PADDING_FOR_SIMD - (PADDING_FOR_SIMD - 1) bytes padding that we usually have in arrays

 /** Which blocks should be formed for insertion into the table, if we control the formation of blocks.
   * (Sometimes the blocks are inserted exactly such blocks that have been read / transmitted from the outside, and this parameter does not affect their size.)
   * More than DEFAULT_BLOCK_SIZE, because in some tables a block of data on the disk is created for each block (quite a big thing),
   * and if the parts were small, then it would be costly then to combine them.
   */
-#define DEFAULT_INSERT_BLOCK_SIZE 1048545 /// 1048576 minus 16 + 15 bytes padding that we usually have in arrays
+#define DEFAULT_INSERT_BLOCK_SIZE \
+    1048449 /// 1048576 - PADDING_FOR_SIMD - (PADDING_FOR_SIMD - 1) bytes padding that we usually have in arrays

 /** The same, but for merge operations. Less DEFAULT_BLOCK_SIZE for saving RAM (since all the columns are read).
   * Significantly less, since there are 10-way mergers.
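The new block sizes follow directly from the padding constant; a compile-time check of the arithmetic (standalone, duplicating the macro just for the check):

#define PADDING_FOR_SIMD 64

static_assert(65536 - PADDING_FOR_SIMD - (PADDING_FOR_SIMD - 1) == 65409, "DEFAULT_BLOCK_SIZE");
static_assert(1048576 - PADDING_FOR_SIMD - (PADDING_FOR_SIMD - 1) == 1048449, "DEFAULT_INSERT_BLOCK_SIZE");

int main() { return 0; }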
@@ -93,6 +93,7 @@ static constexpr UInt64 operator""_GiB(unsigned long long value)
     M(Bool, s3_truncate_on_insert, false, "Enables or disables truncate before insert in s3 engine tables.", 0) \
     M(Bool, s3_create_new_file_on_insert, false, "Enables or disables creating a new file on each insert in s3 engine tables", 0) \
     M(Bool, s3_check_objects_after_upload, false, "Check each uploaded object to s3 with head request to be sure that upload was successful", 0) \
+    M(Bool, s3_allow_parallel_part_upload, true, "Use multiple threads for s3 multipart upload. It may lead to slightly higher memory usage", 0) \
     M(Bool, enable_s3_requests_logging, false, "Enable very explicit logging of S3 requests. Makes sense for debug only.", 0) \
     M(UInt64, hdfs_replication, 0, "The actual number of replications can be specified when the hdfs file is created.", 0) \
     M(Bool, hdfs_truncate_on_insert, false, "Enables or disables truncate before insert in s3 engine tables", 0) \

@@ -302,7 +303,7 @@ static constexpr UInt64 operator""_GiB(unsigned long long value)
     M(Float, opentelemetry_start_trace_probability, 0., "Probability to start an OpenTelemetry trace for an incoming query.", 0) \
     M(Bool, opentelemetry_trace_processors, false, "Collect OpenTelemetry spans for processors.", 0) \
     M(Bool, prefer_column_name_to_alias, false, "Prefer using column names instead of aliases if possible.", 0) \
-    M(Bool, use_analyzer, false, "Use analyzer", 0) \
+    M(Bool, allow_experimental_analyzer, false, "Allow experimental analyzer", 0) \
     M(Bool, prefer_global_in_and_join, false, "If enabled, all IN/JOIN operators will be rewritten as GLOBAL IN/JOIN. It's useful when the to-be-joined tables are only available on the initiator and we need to always scatter their data on-the-fly during distributed processing with the GLOBAL keyword. It's also useful to reduce the need to access the external sources joining external tables.", 0) \
     \
     \
@@ -24,13 +24,13 @@ bool IDisk::isDirectoryEmpty(const String & path) const
     return !iterateDirectory(path)->isValid();
 }

-void IDisk::copyFile(const String & from_file_path, IDisk & to_disk, const String & to_file_path)
+void IDisk::copyFile(const String & from_file_path, IDisk & to_disk, const String & to_file_path, const WriteSettings & settings) /// NOLINT
 {
     LOG_DEBUG(&Poco::Logger::get("IDisk"), "Copying from {} (path: {}) {} to {} (path: {}) {}.",
               getName(), getPath(), from_file_path, to_disk.getName(), to_disk.getPath(), to_file_path);

     auto in = readFile(from_file_path);
-    auto out = to_disk.writeFile(to_file_path);
+    auto out = to_disk.writeFile(to_file_path, DBMS_DEFAULT_BUFFER_SIZE, WriteMode::Rewrite, settings);
     copyData(*in, *out);
     out->finalize();
 }

@@ -56,15 +56,15 @@ void IDisk::removeSharedFiles(const RemoveBatchRequest & files, bool keep_all_ba

 using ResultsCollector = std::vector<std::future<void>>;

-void asyncCopy(IDisk & from_disk, String from_path, IDisk & to_disk, String to_path, Executor & exec, ResultsCollector & results, bool copy_root_dir)
+void asyncCopy(IDisk & from_disk, String from_path, IDisk & to_disk, String to_path, Executor & exec, ResultsCollector & results, bool copy_root_dir, const WriteSettings & settings)
 {
     if (from_disk.isFile(from_path))
     {
         auto result = exec.execute(
-            [&from_disk, from_path, &to_disk, to_path]()
+            [&from_disk, from_path, &to_disk, to_path, &settings]()
             {
                 setThreadName("DiskCopier");
-                from_disk.copyFile(from_path, to_disk, fs::path(to_path) / fileName(from_path));
+                from_disk.copyFile(from_path, to_disk, fs::path(to_path) / fileName(from_path), settings);
             });

         results.push_back(std::move(result));

@@ -80,7 +80,7 @@ void asyncCopy(IDisk & from_disk, String from_path, IDisk & to_disk, String to_path, Executor & exec, ResultsCollector & results, bool copy_root_dir)
         }

         for (auto it = from_disk.iterateDirectory(from_path); it->isValid(); it->next())
-            asyncCopy(from_disk, it->path(), to_disk, dest, exec, results, true);
+            asyncCopy(from_disk, it->path(), to_disk, dest, exec, results, true, settings);
     }
 }

@@ -89,7 +89,12 @@ void IDisk::copyThroughBuffers(const String & from_path, const std::shared_ptr<IDisk> & to_disk, const String & to_path, bool copy_root_dir)
     auto & exec = to_disk->getExecutor();
     ResultsCollector results;

-    asyncCopy(*this, from_path, *to_disk, to_path, exec, results, copy_root_dir);
+    WriteSettings settings;
+    /// Disable parallel write. We already copy in parallel.
+    /// Avoid high memory usage. See test_s3_zero_copy_ttl/test.py::test_move_and_s3_memory_usage
+    settings.s3_allow_parallel_part_upload = false;
+
+    asyncCopy(*this, from_path, *to_disk, to_path, exec, results, copy_root_dir, settings);

     for (auto & result : results)
         result.wait();

@@ -181,7 +181,11 @@ public:
     virtual void copyDirectoryContent(const String & from_dir, const std::shared_ptr<IDisk> & to_disk, const String & to_dir);

     /// Copy file `from_file_path` to `to_file_path` located at `to_disk`.
-    virtual void copyFile(const String & from_file_path, IDisk & to_disk, const String & to_file_path);
+    virtual void copyFile( /// NOLINT
+        const String & from_file_path,
+        IDisk & to_disk,
+        const String & to_file_path,
+        const WriteSettings & settings = {});

     /// List files at `path` and add their names to `file_names`
     virtual void listFiles(const String & path, std::vector<String> & file_names) const = 0;

@@ -230,7 +230,9 @@ std::unique_ptr<WriteBufferFromFileBase> S3ObjectStorage::writeObject( /// NOLINT
         throw Exception(ErrorCodes::BAD_ARGUMENTS, "S3 doesn't support append to files");

     auto settings_ptr = s3_settings.get();
-    auto scheduler = threadPoolCallbackRunner<void>(getThreadPoolWriter(), "VFSWrite");
+    ThreadPoolCallbackRunner<void> scheduler;
+    if (write_settings.s3_allow_parallel_part_upload)
+        scheduler = threadPoolCallbackRunner<void>(getThreadPoolWriter(), "VFSWrite");

     auto s3_buffer = std::make_unique<WriteBufferFromS3>(
         client.get(),
@@ -13,36 +13,151 @@
 #include <Interpreters/Context_fwd.h>
 #include <Interpreters/castColumn.h>

-namespace DB
-{
+#include <span>

-namespace ErrorCodes
+namespace DB::ErrorCodes
 {
-    extern const int ILLEGAL_TYPE_OF_ARGUMENT;
-    extern const int ILLEGAL_COLUMN;
+    extern const int ARGUMENT_OUT_OF_BOUND;
+    extern const int ILLEGAL_COLUMN;
+    extern const int ILLEGAL_TYPE_OF_ARGUMENT;
+    extern const int LOGICAL_ERROR;
+    extern const int NUMBER_OF_ARGUMENTS_DOESNT_MATCH;
 }

+namespace
+{
+enum class Representation
+{
+    BigEndian,
+    LittleEndian
+};
+
+std::pair<int, int> determineBinaryStartIndexWithIncrement(const ptrdiff_t num_bytes, const Representation representation)
+{
+    if (representation == Representation::BigEndian)
+        return {0, 1};
+    else if (representation == Representation::LittleEndian)
+        return {num_bytes - 1, -1};
+
+    throw DB::Exception(DB::ErrorCodes::LOGICAL_ERROR, "{} is not handled yet", magic_enum::enum_name(representation));
+}
+
+void formatHex(const std::span<const UInt8> src, UInt8 * dst, const Representation representation)
+{
+    const auto src_size = std::ssize(src);
+    const auto [src_start_index, src_increment] = determineBinaryStartIndexWithIncrement(src_size, representation);
+    for (int src_pos = src_start_index, dst_pos = 0; src_pos >= 0 && src_pos < src_size; src_pos += src_increment, dst_pos += 2)
+        writeHexByteLowercase(src[src_pos], dst + dst_pos);
+}
+
+void parseHex(const UInt8 * __restrict src, const std::span<UInt8> dst, const Representation representation)
+{
+    const auto dst_size = std::ssize(dst);
+    const auto [dst_start_index, dst_increment] = determineBinaryStartIndexWithIncrement(dst_size, representation);
+    const auto * src_as_char = reinterpret_cast<const char *>(src);
+    for (auto dst_pos = dst_start_index, src_pos = 0; dst_pos >= 0 && dst_pos < dst_size; dst_pos += dst_increment, src_pos += 2)
+        dst[dst_pos] = unhex2(src_as_char + src_pos);
+}
+
+class UUIDSerializer
+{
+public:
+    enum class Variant
+    {
+        Default = 1,
+        Microsoft = 2
+    };
+
+    explicit UUIDSerializer(const Variant variant)
+        : first_half_binary_representation(variant == Variant::Microsoft ? Representation::LittleEndian : Representation::BigEndian)
+    {
+        if (variant != Variant::Default && variant != Variant::Microsoft)
+            throw DB::Exception(DB::ErrorCodes::LOGICAL_ERROR, "{} is not handled yet", magic_enum::enum_name(variant));
+    }
+
+    void deserialize(const UInt8 * src16, UInt8 * dst36) const
+    {
+        formatHex({src16, 4}, &dst36[0], first_half_binary_representation);
+        dst36[8] = '-';
+        formatHex({src16 + 4, 2}, &dst36[9], first_half_binary_representation);
+        dst36[13] = '-';
+        formatHex({src16 + 6, 2}, &dst36[14], first_half_binary_representation);
+        dst36[18] = '-';
+        formatHex({src16 + 8, 2}, &dst36[19], Representation::BigEndian);
+        dst36[23] = '-';
+        formatHex({src16 + 10, 6}, &dst36[24], Representation::BigEndian);
+    }
+
+    void serialize(const UInt8 * src36, UInt8 * dst16) const
+    {
+        /// If string is not like UUID - implementation specific behaviour.
+        parseHex(&src36[0], {dst16 + 0, 4}, first_half_binary_representation);
+        parseHex(&src36[9], {dst16 + 4, 2}, first_half_binary_representation);
+        parseHex(&src36[14], {dst16 + 6, 2}, first_half_binary_representation);
+        parseHex(&src36[19], {dst16 + 8, 2}, Representation::BigEndian);
+        parseHex(&src36[24], {dst16 + 10, 6}, Representation::BigEndian);
+    }
+
+private:
+    Representation first_half_binary_representation;
+};
+
+void checkArgumentCount(const DB::DataTypes & arguments, const std::string_view function_name)
+{
+    if (const auto argument_count = std::ssize(arguments); argument_count < 1 || argument_count > 2)
+        throw DB::Exception(
+            DB::ErrorCodes::NUMBER_OF_ARGUMENTS_DOESNT_MATCH,
+            "Number of arguments for function {} doesn't match: passed {}, should be 1 or 2",
+            function_name,
+            argument_count);
+}
+
+void checkFormatArgument(const DB::DataTypes & arguments, const std::string_view function_name)
+{
+    if (const auto argument_count = std::ssize(arguments);
+        argument_count > 1 && !DB::WhichDataType(arguments[1]).isInt8() && !DB::WhichDataType(arguments[1]).isUInt8())
+        throw DB::Exception(
+            DB::ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT,
+            "Illegal type {} of second argument of function {}, expected Int8 or UInt8 type",
+            arguments[1]->getName(),
+            function_name);
+}
+
+UUIDSerializer::Variant parseVariant(const DB::ColumnsWithTypeAndName & arguments)
+{
+    if (arguments.size() < 2)
+        return UUIDSerializer::Variant::Default;
+
+    const auto representation = static_cast<magic_enum::underlying_type_t<UUIDSerializer::Variant>>(arguments[1].column->getInt(0));
+    const auto as_enum = magic_enum::enum_cast<UUIDSerializer::Variant>(representation);
+    if (!as_enum)
+        throw DB::Exception(DB::ErrorCodes::ARGUMENT_OUT_OF_BOUND, "Expected UUID variant, got {}", representation);
+
+    return *as_enum;
+}
+}
+
+namespace DB
+{
 constexpr size_t uuid_bytes_length = 16;
 constexpr size_t uuid_text_length = 36;

 class FunctionUUIDNumToString : public IFunction
 {

 public:
     static constexpr auto name = "UUIDNumToString";
     static FunctionPtr create(ContextPtr) { return std::make_shared<FunctionUUIDNumToString>(); }

-    String getName() const override
-    {
-        return name;
-    }
-
-    size_t getNumberOfArguments() const override { return 1; }
+    String getName() const override { return name; }
+    size_t getNumberOfArguments() const override { return 0; }
     bool isInjective(const ColumnsWithTypeAndName &) const override { return true; }
     bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return false; }
+    bool isVariadic() const override { return true; }

     DataTypePtr getReturnTypeImpl(const DataTypes & arguments) const override
     {
+        checkArgumentCount(arguments, name);
+
         const auto * ptr = checkAndGetDataType<DataTypeFixedString>(arguments[0].get());
         if (!ptr || ptr->getN() != uuid_bytes_length)
             throw Exception("Illegal type " + arguments[0]->getName() +

@@ -50,6 +165,8 @@ public:
                 ", expected FixedString(" + toString(uuid_bytes_length) + ")",
                 ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT);

+        checkFormatArgument(arguments, name);
+
         return std::make_shared<DataTypeString>();
     }

@@ -59,7 +176,7 @@ public:
     {
         const ColumnWithTypeAndName & col_type_name = arguments[0];
         const ColumnPtr & column = col_type_name.column;
-
+        const auto variant = parseVariant(arguments);
         if (const auto * col_in = checkAndGetColumn<ColumnFixedString>(column.get()))
         {
             if (col_in->getN() != uuid_bytes_length)

@@ -82,9 +199,10 @@ public:
             size_t src_offset = 0;
             size_t dst_offset = 0;

+            const UUIDSerializer uuid_serializer(variant);
             for (size_t i = 0; i < size; ++i)
             {
-                formatUUID(&vec_in[src_offset], &vec_res[dst_offset]);
+                uuid_serializer.deserialize(&vec_in[src_offset], &vec_res[dst_offset]);
                 src_offset += uuid_bytes_length;
                 dst_offset += uuid_text_length;
                 vec_res[dst_offset] = 0;

@@ -104,55 +222,33 @@ public:

 class FunctionUUIDStringToNum : public IFunction
 {
-private:
-    static void parseHex(const UInt8 * __restrict src, UInt8 * __restrict dst, const size_t num_bytes)
-    {
-        size_t src_pos = 0;
-        size_t dst_pos = 0;
-        for (; dst_pos < num_bytes; ++dst_pos)
-        {
-            dst[dst_pos] = unhex2(reinterpret_cast<const char *>(&src[src_pos]));
-            src_pos += 2;
-        }
-    }
-
-    static void parseUUID(const UInt8 * src36, UInt8 * dst16)
-    {
-        /// If string is not like UUID - implementation specific behaviour.
-
-        parseHex(&src36[0], &dst16[0], 4);
-        parseHex(&src36[9], &dst16[4], 2);
-        parseHex(&src36[14], &dst16[6], 2);
-        parseHex(&src36[19], &dst16[8], 2);
-        parseHex(&src36[24], &dst16[10], 6);
-    }
-
 public:
     static constexpr auto name = "UUIDStringToNum";
     static FunctionPtr create(ContextPtr) { return std::make_shared<FunctionUUIDStringToNum>(); }

-    String getName() const override
-    {
-        return name;
-    }
-
-    size_t getNumberOfArguments() const override { return 1; }
+    String getName() const override { return name; }
+    size_t getNumberOfArguments() const override { return 0; }
     bool isInjective(const ColumnsWithTypeAndName &) const override { return true; }
     bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return false; }
+    bool isVariadic() const override { return true; }

     DataTypePtr getReturnTypeImpl(const DataTypes & arguments) const override
     {
+        checkArgumentCount(arguments, name);
+
         /// String or FixedString(36)
         if (!isString(arguments[0]))
         {
             const auto * ptr = checkAndGetDataType<DataTypeFixedString>(arguments[0].get());
             if (!ptr || ptr->getN() != uuid_text_length)
                 throw Exception("Illegal type " + arguments[0]->getName() +
-                    " of argument of function " + getName() +
+                    " of first argument of function " + getName() +
                     ", expected FixedString(" + toString(uuid_text_length) + ")",
                     ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT);
         }

+        checkFormatArgument(arguments, name);
+
         return std::make_shared<DataTypeFixedString>(uuid_bytes_length);
     }

@@ -163,6 +259,7 @@ public:
         const ColumnWithTypeAndName & col_type_name = arguments[0];
         const ColumnPtr & column = col_type_name.column;

+        const UUIDSerializer uuid_serializer(parseVariant(arguments));
         if (const auto * col_in = checkAndGetColumn<ColumnString>(column.get()))
         {
             const auto & vec_in = col_in->getChars();

@@ -184,7 +281,7 @@ public:

             size_t string_size = offsets_in[i] - src_offset;
             if (string_size == uuid_text_length + 1)
-                parseUUID(&vec_in[src_offset], &vec_res[dst_offset]);
+                uuid_serializer.serialize(&vec_in[src_offset], &vec_res[dst_offset]);
             else
                 memset(&vec_res[dst_offset], 0, uuid_bytes_length);

@@ -216,7 +313,7 @@ public:

             for (size_t i = 0; i < size; ++i)
             {
-                parseUUID(&vec_in[src_offset], &vec_res[dst_offset]);
+                uuid_serializer.serialize(&vec_in[src_offset], &vec_res[dst_offset]);
                 src_offset += uuid_text_length;
                 dst_offset += uuid_bytes_length;
             }
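What the two variants mean for byte order, as a standalone illustration (hand-rolled hex printing; only the reversal of the first three UUID groups is the point — the Microsoft variant stores them little-endian):

#include <cstdint>
#include <cstdio>

static void printHex(const uint8_t * bytes, int n, bool little_endian)
{
    for (int i = 0; i < n; ++i)
        std::printf("%02x", bytes[little_endian ? n - 1 - i : i]);
}

int main()
{
    const uint8_t group[4] = {0x00, 0x11, 0x22, 0x33};
    printHex(group, 4, false); /// Default variant (1): 00112233
    std::printf(" vs ");
    printHex(group, 4, true);  /// Microsoft variant (2): 33221100
    std::printf("\n");
}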
@@ -23,7 +23,6 @@
 #include <DataTypes/Serializations/SerializationDecimal.h>
 #include <DataTypes/DataTypesNumber.h>
-#include <DataTypes/DataTypeLowCardinality.h>
 #include <DataTypes/DataTypeFixedString.h>
 #include <DataTypes/DataTypeString.h>
 #include <DataTypes/DataTypesDecimal.h>
 #include <DataTypes/DataTypeUUID.h>

@@ -696,16 +695,8 @@ public:
         else
             return false;

-        if (dest.getDataType() == TypeIndex::LowCardinality)
-        {
-            ColumnLowCardinality & col_low = assert_cast<ColumnLowCardinality &>(dest);
-            col_low.insertData(reinterpret_cast<const char *>(&value), sizeof(value));
-        }
-        else
-        {
-            auto & col_vec = assert_cast<ColumnVector<NumberType> &>(dest);
-            col_vec.insertValue(value);
-        }
+        auto & col_vec = assert_cast<ColumnVector<NumberType> &>(dest);
+        col_vec.insertValue(value);
         return true;
     }
 };

@@ -782,17 +773,8 @@ public:
             return JSONExtractRawImpl<JSONParser>::insertResultToColumn(dest, element, {});

         auto str = element.getString();
-
-        if (dest.getDataType() == TypeIndex::LowCardinality)
-        {
-            ColumnLowCardinality & col_low = assert_cast<ColumnLowCardinality &>(dest);
-            col_low.insertData(str.data(), str.size());
-        }
-        else
-        {
-            ColumnString & col_str = assert_cast<ColumnString &>(dest);
-            col_str.insertData(str.data(), str.size());
-        }
+        ColumnString & col_str = assert_cast<ColumnString &>(dest);
+        col_str.insertData(str.data(), str.size());
         return true;
     }
 };

@@ -821,33 +803,25 @@ struct JSONExtractTree
         }
     };

-    class LowCardinalityFixedStringNode : public Node
+    class LowCardinalityNode : public Node
     {
     public:
-        explicit LowCardinalityFixedStringNode(const size_t fixed_length_) : fixed_length(fixed_length_) {}
+        LowCardinalityNode(DataTypePtr dictionary_type_, std::unique_ptr<Node> impl_)
+            : dictionary_type(dictionary_type_), impl(std::move(impl_)) {}
         bool insertResultToColumn(IColumn & dest, const Element & element) override
        {
-            // If element is an object we delegate the insertion to JSONExtractRawImpl
-            if (element.isObject())
-                return JSONExtractRawImpl<JSONParser>::insertResultToLowCardinalityFixedStringColumn(dest, element, fixed_length);
-            else if (!element.isString())
-                return false;
-
-            auto str = element.getString();
-            if (str.size() > fixed_length)
-                return false;
-
-            // For the non low cardinality case of FixedString, the padding is done in the FixedString Column implementation.
-            // In order to avoid having to pass the data to a FixedString Column and read it back (which would slow down the execution)
-            // the data is padded here and written directly to the Low Cardinality Column
-            auto padded_str = str.data() + std::string(fixed_length - std::min(fixed_length, str.length()), '\0');
-
-            assert_cast<ColumnLowCardinality &>(dest).insertData(padded_str.data(), padded_str.size());
-            return true;
+            auto from_col = dictionary_type->createColumn();
+            if (impl->insertResultToColumn(*from_col, element))
+            {
+                std::string_view value = from_col->getDataAt(0).toView();
+                assert_cast<ColumnLowCardinality &>(dest).insertData(value.data(), value.size());
+                return true;
+            }
+            return false;
        }

    private:
-        const size_t fixed_length;
+        DataTypePtr dictionary_type;
+        std::unique_ptr<Node> impl;
    };

    class UUIDNode : public Node
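The delegation idea of LowCardinalityNode, reduced to standalone C++ (plain strings stand in for IColumn/ColumnLowCardinality; the real code builds a column of the dictionary type and re-inserts its raw bytes):

#include <iostream>
#include <memory>
#include <string>

struct Node
{
    virtual bool insert(std::string & dest, const std::string & element) = 0;
    virtual ~Node() = default;
};

struct StringNode : Node
{
    bool insert(std::string & dest, const std::string & element) override { dest = element; return true; }
};

struct LowCardinalityNode : Node
{
    explicit LowCardinalityNode(std::unique_ptr<Node> impl_) : impl(std::move(impl_)) {}
    bool insert(std::string & dest, const std::string & element) override
    {
        std::string from_col; /// stand-in for dictionary_type->createColumn()
        if (impl->insert(from_col, element))
        {
            dest = from_col; /// stand-in for ColumnLowCardinality::insertData
            return true;
        }
        return false;
    }
    std::unique_ptr<Node> impl;
};

int main()
{
    LowCardinalityNode node(std::make_unique<StringNode>());
    std::string dest;
    node.insert(dest, "hello");
    std::cout << dest << '\n';
}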
@@ -859,15 +833,7 @@ struct JSONExtractTree
             return false;

         auto uuid = parseFromString<UUID>(element.getString());
-        if (dest.getDataType() == TypeIndex::LowCardinality)
-        {
-            ColumnLowCardinality & col_low = assert_cast<ColumnLowCardinality &>(dest);
-            col_low.insertData(reinterpret_cast<const char *>(&uuid), sizeof(uuid));
-        }
-        else
-        {
-            assert_cast<ColumnUUID &>(dest).insert(uuid);
-        }
+        assert_cast<ColumnUUID &>(dest).insert(uuid);
         return true;
     }
 };

@@ -887,7 +853,6 @@ struct JSONExtractTree
         assert_cast<ColumnDecimal<DecimalType> &>(dest).insert(result);
         return true;
     }
-
 private:
     DataTypePtr data_type;
 };

@@ -906,18 +871,13 @@ struct JSONExtractTree
 public:
     bool insertResultToColumn(IColumn & dest, const Element & element) override
     {
-        if (element.isNull())
-            return false;
-
         if (!element.isString())
-            return JSONExtractRawImpl<JSONParser>::insertResultToFixedStringColumn(dest, element, {});
-
-        auto str = element.getString();
+            return false;
         auto & col_str = assert_cast<ColumnFixedString &>(dest);
+        auto str = element.getString();
         if (str.size() > col_str.getN())
             return false;
         col_str.insertData(str.data(), str.size());
-
         return true;
     }
 };

@@ -1139,19 +1099,9 @@ struct JSONExtractTree
             case TypeIndex::UUID: return std::make_unique<UUIDNode>();
             case TypeIndex::LowCardinality:
             {
-                // The low cardinality case is treated in two different ways:
-                // For FixedString type, an especial class is implemented for inserting the data in the destination column,
-                // as the string length must be passed in order to check and pad the incoming data.
-                // For the rest of low cardinality types, the insertion is done in their corresponding class, adapting the data
-                // as needed for the insertData function of the ColumnLowCardinality.
                 auto dictionary_type = typeid_cast<const DataTypeLowCardinality *>(type.get())->getDictionaryType();
-                if ((*dictionary_type).getTypeId() == TypeIndex::FixedString)
-                {
-                    auto fixed_length = typeid_cast<const DataTypeFixedString *>(dictionary_type.get())->getN();
-                    return std::make_unique<LowCardinalityFixedStringNode>(fixed_length);
-                }
                 auto impl = build(function_name, dictionary_type);
-                return impl;
+                return std::make_unique<LowCardinalityNode>(dictionary_type, std::move(impl));
             }
             case TypeIndex::Decimal256: return std::make_unique<DecimalNode<Decimal256>>(type);
             case TypeIndex::Decimal128: return std::make_unique<DecimalNode<Decimal128>>(type);

@@ -1313,37 +1263,6 @@ public:
         return true;
     }

-    // We use insertResultToFixedStringColumn in case we are inserting raw data in a FixedString column
-    static bool insertResultToFixedStringColumn(IColumn & dest, const Element & element, std::string_view)
-    {
-        ColumnFixedString & col_str = assert_cast<ColumnFixedString &>(dest);
-        auto & chars = col_str.getChars();
-        WriteBufferFromVector<ColumnFixedString::Chars> buf(chars, AppendModeTag());
-        traverse(element, buf);
-        buf.finalize();
-        col_str.insertDefault();
-        return true;
-    }
-
-    // We use insertResultToLowCardinalityFixedStringColumn in case we are inserting raw data in a Low Cardinality FixedString column
-    static bool insertResultToLowCardinalityFixedStringColumn(IColumn & dest, const Element & element, size_t fixed_length)
-    {
-        if (element.getObject().size() > fixed_length)
-            return false;
-
-        ColumnFixedString::Chars chars;
-        WriteBufferFromVector<ColumnFixedString::Chars> buf(chars, AppendModeTag());
-        traverse(element, buf);
-        buf.finalize();
-        chars.push_back(0);
-        std::string str = reinterpret_cast<const char *>(chars.data());
-
-        auto padded_str = str + std::string(fixed_length - std::min(fixed_length, str.length()), '\0');
-        assert_cast<ColumnLowCardinality &>(dest).insertData(padded_str.data(), padded_str.size());
-
-        return true;
-    }
-
 private:
     static void traverse(const Element & element, WriteBuffer & buf)
     {
@@ -124,7 +124,7 @@ void RandImpl::execute(char * output, size_t size)
     char * end = output + size;

     constexpr int vec_size = 4;
-    constexpr int safe_overwrite = 15;
+    constexpr int safe_overwrite = PADDING_FOR_SIMD - 1;
     constexpr int bytes_per_write = 4 * sizeof(UInt64x4);

     UInt64 rand_seed = randomSeed();
@@ -16,7 +16,7 @@ struct FirstSignificantSubdomainDefaultLookup
     }
 };

-template <bool without_www>
+template <bool without_www, bool conform_rfc>
 struct ExtractFirstSignificantSubdomain
 {
     static size_t getReserveLengthForElement() { return 10; }

@@ -35,7 +35,7 @@ struct ExtractFirstSignificantSubdomain

         Pos tmp;
         size_t domain_length;
-        ExtractDomain<without_www>::execute(data, size, tmp, domain_length);
+        ExtractDomain<without_www, conform_rfc>::execute(data, size, tmp, domain_length);

         if (domain_length == 0)
             return;

@@ -105,7 +105,7 @@ struct ExtractFirstSignificantSubdomain

         Pos tmp;
         size_t domain_length;
-        ExtractDomain<without_www>::execute(data, size, tmp, domain_length);
+        ExtractDomain<without_www, conform_rfc>::execute(data, size, tmp, domain_length);

         if (domain_length == 0)
             return;

@@ -6,7 +6,7 @@
 namespace DB
 {

-template <bool without_www>
+template <bool without_www, bool conform_rfc>
 struct CutToFirstSignificantSubdomain
 {
     static size_t getReserveLengthForElement() { return 15; }

@@ -19,7 +19,7 @@ struct CutToFirstSignificantSubdomain
         Pos tmp_data;
         size_t tmp_length;
         Pos domain_end;
-        ExtractFirstSignificantSubdomain<without_www>::execute(data, size, tmp_data, tmp_length, &domain_end);
+        ExtractFirstSignificantSubdomain<without_www, conform_rfc>::execute(data, size, tmp_data, tmp_length, &domain_end);

         if (tmp_length == 0)
             return;

@@ -30,15 +30,23 @@ struct CutToFirstSignificantSubdomain
 };

 struct NameCutToFirstSignificantSubdomain { static constexpr auto name = "cutToFirstSignificantSubdomain"; };
-using FunctionCutToFirstSignificantSubdomain = FunctionStringToString<ExtractSubstringImpl<CutToFirstSignificantSubdomain<true>>, NameCutToFirstSignificantSubdomain>;
+using FunctionCutToFirstSignificantSubdomain = FunctionStringToString<ExtractSubstringImpl<CutToFirstSignificantSubdomain<true, false>>, NameCutToFirstSignificantSubdomain>;

 struct NameCutToFirstSignificantSubdomainWithWWW { static constexpr auto name = "cutToFirstSignificantSubdomainWithWWW"; };
-using FunctionCutToFirstSignificantSubdomainWithWWW = FunctionStringToString<ExtractSubstringImpl<CutToFirstSignificantSubdomain<false>>, NameCutToFirstSignificantSubdomainWithWWW>;
+using FunctionCutToFirstSignificantSubdomainWithWWW = FunctionStringToString<ExtractSubstringImpl<CutToFirstSignificantSubdomain<false, false>>, NameCutToFirstSignificantSubdomainWithWWW>;
+
+struct NameCutToFirstSignificantSubdomainRFC { static constexpr auto name = "cutToFirstSignificantSubdomainRFC"; };
+using FunctionCutToFirstSignificantSubdomainRFC = FunctionStringToString<ExtractSubstringImpl<CutToFirstSignificantSubdomain<true, true>>, NameCutToFirstSignificantSubdomainRFC>;
+
+struct NameCutToFirstSignificantSubdomainWithWWWRFC { static constexpr auto name = "cutToFirstSignificantSubdomainWithWWWRFC"; };
+using FunctionCutToFirstSignificantSubdomainWithWWWRFC = FunctionStringToString<ExtractSubstringImpl<CutToFirstSignificantSubdomain<false, true>>, NameCutToFirstSignificantSubdomainWithWWWRFC>;

 REGISTER_FUNCTION(CutToFirstSignificantSubdomain)
 {
     factory.registerFunction<FunctionCutToFirstSignificantSubdomain>();
     factory.registerFunction<FunctionCutToFirstSignificantSubdomainWithWWW>();
+    factory.registerFunction<FunctionCutToFirstSignificantSubdomainRFC>();
+    factory.registerFunction<FunctionCutToFirstSignificantSubdomainWithWWWRFC>();
 }

 }

@@ -5,7 +5,7 @@
 namespace DB
 {

-template <bool without_www>
+template <bool without_www, bool conform_rfc>
 struct CutToFirstSignificantSubdomainCustom
 {
     static size_t getReserveLengthForElement() { return 15; }

@@ -18,7 +18,7 @@ struct CutToFirstSignificantSubdomainCustom
         Pos tmp_data;
         size_t tmp_length;
         Pos domain_end;
-        ExtractFirstSignificantSubdomain<without_www>::executeCustom(tld_lookup, data, size, tmp_data, tmp_length, &domain_end);
+        ExtractFirstSignificantSubdomain<without_www, conform_rfc>::executeCustom(tld_lookup, data, size, tmp_data, tmp_length, &domain_end);

         if (tmp_length == 0)
             return;

@@ -29,15 +29,23 @@ struct CutToFirstSignificantSubdomainCustom
 };

 struct NameCutToFirstSignificantSubdomainCustom { static constexpr auto name = "cutToFirstSignificantSubdomainCustom"; };
-using FunctionCutToFirstSignificantSubdomainCustom = FunctionCutToFirstSignificantSubdomainCustomImpl<CutToFirstSignificantSubdomainCustom<true>, NameCutToFirstSignificantSubdomainCustom>;
+using FunctionCutToFirstSignificantSubdomainCustom = FunctionCutToFirstSignificantSubdomainCustomImpl<CutToFirstSignificantSubdomainCustom<true, false>, NameCutToFirstSignificantSubdomainCustom>;

 struct NameCutToFirstSignificantSubdomainCustomWithWWW { static constexpr auto name = "cutToFirstSignificantSubdomainCustomWithWWW"; };
-using FunctionCutToFirstSignificantSubdomainCustomWithWWW = FunctionCutToFirstSignificantSubdomainCustomImpl<CutToFirstSignificantSubdomainCustom<false>, NameCutToFirstSignificantSubdomainCustomWithWWW>;
+using FunctionCutToFirstSignificantSubdomainCustomWithWWW = FunctionCutToFirstSignificantSubdomainCustomImpl<CutToFirstSignificantSubdomainCustom<false, false>, NameCutToFirstSignificantSubdomainCustomWithWWW>;
+
+struct NameCutToFirstSignificantSubdomainCustomRFC { static constexpr auto name = "cutToFirstSignificantSubdomainCustomRFC"; };
+using FunctionCutToFirstSignificantSubdomainCustomRFC = FunctionCutToFirstSignificantSubdomainCustomImpl<CutToFirstSignificantSubdomainCustom<true, true>, NameCutToFirstSignificantSubdomainCustomRFC>;
+
+struct NameCutToFirstSignificantSubdomainCustomWithWWWRFC { static constexpr auto name = "cutToFirstSignificantSubdomainCustomWithWWWRFC"; };
+using FunctionCutToFirstSignificantSubdomainCustomWithWWWRFC = FunctionCutToFirstSignificantSubdomainCustomImpl<CutToFirstSignificantSubdomainCustom<false, true>, NameCutToFirstSignificantSubdomainCustomWithWWWRFC>;

 REGISTER_FUNCTION(CutToFirstSignificantSubdomainCustom)
 {
     factory.registerFunction<FunctionCutToFirstSignificantSubdomainCustom>();
     factory.registerFunction<FunctionCutToFirstSignificantSubdomainCustomWithWWW>();
+    factory.registerFunction<FunctionCutToFirstSignificantSubdomainCustomRFC>();
+    factory.registerFunction<FunctionCutToFirstSignificantSubdomainCustomWithWWWRFC>();
 }

 }

@@ -7,12 +7,15 @@ namespace DB
 {

 struct NameDomain { static constexpr auto name = "domain"; };
-using FunctionDomain = FunctionStringToString<ExtractSubstringImpl<ExtractDomain<false>>, NameDomain>;
+using FunctionDomain = FunctionStringToString<ExtractSubstringImpl<ExtractDomain<false, false>>, NameDomain>;
+
+struct NameDomainRFC { static constexpr auto name = "domainRFC"; };
+using FunctionDomainRFC = FunctionStringToString<ExtractSubstringImpl<ExtractDomain<false, true>>, NameDomainRFC>;

 REGISTER_FUNCTION(Domain)
 {
     factory.registerFunction<FunctionDomain>();
+    factory.registerFunction<FunctionDomainRFC>();
 }

 }
@@ -20,6 +20,115 @@ inline std::string_view checkAndReturnHost(const Pos & pos, const Pos & dot_pos, const Pos & start_of_host)
     return std::string_view(start_of_host, pos - start_of_host);
 }

+/// Extracts host from given url (RFC).
+///
+/// @return empty string view if the host is not valid (i.e. it does not have dot, or there is no symbol after dot).
+inline std::string_view getURLHostRFC(const char * data, size_t size)
+{
+    Pos pos = data;
+    Pos end = data + size;
+
+    if (*pos == '/' && *(pos + 1) == '/')
+    {
+        pos += 2;
+    }
+    else
+    {
+        Pos scheme_end = data + std::min(size, 16UL);
+        for (++pos; pos < scheme_end; ++pos)
+        {
+            if (!isAlphaNumericASCII(*pos))
+            {
+                switch (*pos)
+                {
+                case '.':
+                case '-':
+                case '+':
+                    break;
+                case ' ': /// restricted symbols
+                case '\t':
+                case '<':
+                case '>':
+                case '%':
+                case '{':
+                case '}':
+                case '|':
+                case '\\':
+                case '^':
+                case '~':
+                case '[':
+                case ']':
+                case ';':
+                case '=':
+                case '&':
+                    return std::string_view{};
+                default:
+                    goto exloop;
+                }
+            }
+        }
+exloop: if ((scheme_end - pos) > 2 && *pos == ':' && *(pos + 1) == '/' && *(pos + 2) == '/')
+            pos += 3;
+        else
+            pos = data;
+    }
+
+    Pos dot_pos = nullptr;
+    Pos colon_pos = nullptr;
+    bool has_at_symbol = false;
+    bool has_terminator_after_colon = false;
+    const auto * start_of_host = pos;
+    for (; pos < end; ++pos)
+    {
+        switch (*pos)
+        {
+        case '.':
+            if (has_at_symbol || colon_pos == nullptr)
+                dot_pos = pos;
+            break;
+        case ':':
+            if (has_at_symbol || colon_pos) goto done;
+            colon_pos = pos;
+            break;
+        case '/': /// end symbols
+        case '?':
+        case '#':
+            goto done;
+        case '@': /// myemail@gmail.com
+            if (has_terminator_after_colon) return std::string_view{};
+            if (has_at_symbol) goto done;
+            has_at_symbol = true;
+            start_of_host = pos + 1;
+            break;
+        case ' ': /// restricted symbols in whole URL
+        case '\t':
+        case '<':
+        case '>':
+        case '%':
+        case '{':
+        case '}':
+        case '|':
+        case '\\':
+        case '^':
+        case '~':
+        case '[':
+        case ']':
+        case ';':
+        case '=':
+        case '&':
+            if (colon_pos == nullptr)
+                return std::string_view{};
+            else
+                has_terminator_after_colon = true;
+        }
+    }
+
+done:
+    if (!has_at_symbol)
+        pos = colon_pos ? colon_pos : pos;
+    return checkAndReturnHost(pos, dot_pos, start_of_host);
+}
+
 /// Extracts host from given url.
 ///
 /// @return empty string view if the host is not valid (i.e. it does not have dot, or there no symbol after dot).
@@ -113,14 +222,18 @@ exloop: if ((scheme_end - pos) > 2 && *pos == ':' && *(pos + 1) == '/' && *(pos + 2) == '/')
     return checkAndReturnHost(pos, dot_pos, start_of_host);
 }

-template <bool without_www>
+template <bool without_www, bool conform_rfc>
 struct ExtractDomain
 {
     static size_t getReserveLengthForElement() { return 15; }

     static void execute(Pos data, size_t size, Pos & res_data, size_t & res_size)
     {
-        std::string_view host = getURLHost(data, size);
+        std::string_view host;
+        if constexpr (conform_rfc)
+            host = getURLHostRFC(data, size);
+        else
+            host = getURLHost(data, size);

         if (host.empty())
         {
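The compile-time dispatch idiom used by these URL functions, in isolation (stand-in parsers; the real ones are getURLHost/getURLHostRFC):

#include <iostream>
#include <string_view>

std::string_view parseLoose(std::string_view url) { return url; } /// stand-in
std::string_view parseRFC(std::string_view url) { return url; }   /// stand-in

template <bool without_www, bool conform_rfc>
struct ExtractDomainSketch
{
    static std::string_view execute(std::string_view url)
    {
        std::string_view host;
        if constexpr (conform_rfc)
            host = parseRFC(url);
        else
            host = parseLoose(url);
        if constexpr (without_www)
            if (host.substr(0, 4) == "www.")
                host.remove_prefix(4);
        return host;
    }
};

int main()
{
    std::cout << ExtractDomainSketch<true, true>::execute("www.example.com") << '\n'; /// example.com
}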
@@ -6,12 +6,16 @@ namespace DB
 {

 struct NameDomainWithoutWWW { static constexpr auto name = "domainWithoutWWW"; };
-using FunctionDomainWithoutWWW = FunctionStringToString<ExtractSubstringImpl<ExtractDomain<true>>, NameDomainWithoutWWW>;
+using FunctionDomainWithoutWWW = FunctionStringToString<ExtractSubstringImpl<ExtractDomain<true, false>>, NameDomainWithoutWWW>;
+
+struct NameDomainWithoutWWWRFC { static constexpr auto name = "domainWithoutWWWRFC"; };
+using FunctionDomainWithoutWWWRFC = FunctionStringToString<ExtractSubstringImpl<ExtractDomain<true, true>>, NameDomainWithoutWWWRFC>;


 REGISTER_FUNCTION(DomainWithoutWWW)
 {
     factory.registerFunction<FunctionDomainWithoutWWW>();
+    factory.registerFunction<FunctionDomainWithoutWWWRFC>();
 }

 }

@@ -7,12 +7,15 @@ namespace DB
 {

 struct NameFirstSignificantSubdomain { static constexpr auto name = "firstSignificantSubdomain"; };
-using FunctionFirstSignificantSubdomain = FunctionStringToString<ExtractSubstringImpl<ExtractFirstSignificantSubdomain<true>>, NameFirstSignificantSubdomain>;
+using FunctionFirstSignificantSubdomain = FunctionStringToString<ExtractSubstringImpl<ExtractFirstSignificantSubdomain<true, false>>, NameFirstSignificantSubdomain>;
+
+struct NameFirstSignificantSubdomainRFC { static constexpr auto name = "firstSignificantSubdomainRFC"; };
+using FunctionFirstSignificantSubdomainRFC = FunctionStringToString<ExtractSubstringImpl<ExtractFirstSignificantSubdomain<true, true>>, NameFirstSignificantSubdomainRFC>;

 REGISTER_FUNCTION(FirstSignificantSubdomain)
 {
     factory.registerFunction<FunctionFirstSignificantSubdomain>();
+    factory.registerFunction<FunctionFirstSignificantSubdomainRFC>();
 }

 }

@@ -7,12 +7,15 @@ namespace DB
 {

 struct NameFirstSignificantSubdomainCustom { static constexpr auto name = "firstSignificantSubdomainCustom"; };
-using FunctionFirstSignificantSubdomainCustom = FunctionCutToFirstSignificantSubdomainCustomImpl<ExtractFirstSignificantSubdomain<true>, NameFirstSignificantSubdomainCustom>;
+using FunctionFirstSignificantSubdomainCustom = FunctionCutToFirstSignificantSubdomainCustomImpl<ExtractFirstSignificantSubdomain<true, false>, NameFirstSignificantSubdomainCustom>;
+
+struct NameFirstSignificantSubdomainCustomRFC { static constexpr auto name = "firstSignificantSubdomainCustomRFC"; };
+using FunctionFirstSignificantSubdomainCustomRFC = FunctionCutToFirstSignificantSubdomainCustomImpl<ExtractFirstSignificantSubdomain<true, true>, NameFirstSignificantSubdomainCustomRFC>;

 REGISTER_FUNCTION(FirstSignificantSubdomainCustom)
 {
     factory.registerFunction<FunctionFirstSignificantSubdomainCustom>();
+    factory.registerFunction<FunctionFirstSignificantSubdomainCustomRFC>();
 }

 }
@@ -18,12 +18,9 @@ namespace ErrorCodes
     extern const int NUMBER_OF_ARGUMENTS_DOESNT_MATCH;
 }

-struct FunctionPort : public IFunction
+template <bool conform_rfc>
+struct FunctionPortImpl : public IFunction
 {
-    static constexpr auto name = "port";
-    static FunctionPtr create(ContextPtr) { return std::make_shared<FunctionPort>(); }
-
-    String getName() const override { return name; }
     bool isVariadic() const override { return true; }
     size_t getNumberOfArguments() const override { return 0; }
     bool useDefaultImplementationForConstants() const override { return true; }

@@ -94,7 +91,12 @@ private:
     const char * p = reinterpret_cast<const char *>(buf.data()) + offset;
     const char * end = p + size;

-    std::string_view host = getURLHost(p, size);
+    std::string_view host;
+    if constexpr (conform_rfc)
+        host = getURLHostRFC(p, size);
+    else
+        host = getURLHost(p, size);

     if (host.empty())
         return default_port;
     if (host.size() == size)

@@ -121,9 +123,24 @@ private:
     }
 };

+struct FunctionPort : public FunctionPortImpl<false>
+{
+    static constexpr auto name = "port";
+    String getName() const override { return name; }
+    static FunctionPtr create(ContextPtr) { return std::make_shared<FunctionPort>(); }
+};
+
+struct FunctionPortRFC : public FunctionPortImpl<true>
+{
+    static constexpr auto name = "portRFC";
+    String getName() const override { return name; }
+    static FunctionPtr create(ContextPtr) { return std::make_shared<FunctionPortRFC>(); }
+};
+
 REGISTER_FUNCTION(Port)
 {
     factory.registerFunction<FunctionPort>();
+    factory.registerFunction<FunctionPortRFC>();
 }

 }
@@ -5,13 +5,18 @@
 namespace DB
 {

+template <bool conform_rfc>
 struct ExtractTopLevelDomain
 {
     static size_t getReserveLengthForElement() { return 5; }

     static void execute(Pos data, size_t size, Pos & res_data, size_t & res_size)
     {
-        std::string_view host = getURLHost(data, size);
+        std::string_view host;
+        if constexpr (conform_rfc)
+            host = getURLHostRFC(data, size);
+        else
+            host = getURLHost(data, size);

         res_data = data;
         res_size = 0;

@@ -41,11 +46,15 @@ struct ExtractTopLevelDomain
 };

 struct NameTopLevelDomain { static constexpr auto name = "topLevelDomain"; };
-using FunctionTopLevelDomain = FunctionStringToString<ExtractSubstringImpl<ExtractTopLevelDomain>, NameTopLevelDomain>;
+using FunctionTopLevelDomain = FunctionStringToString<ExtractSubstringImpl<ExtractTopLevelDomain<false>>, NameTopLevelDomain>;
+
+struct NameTopLevelDomainRFC { static constexpr auto name = "topLevelDomainRFC"; };
+using FunctionTopLevelDomainRFC = FunctionStringToString<ExtractSubstringImpl<ExtractTopLevelDomain<true>>, NameTopLevelDomainRFC>;

 REGISTER_FUNCTION(TopLevelDomain)
 {
     factory.registerFunction<FunctionTopLevelDomain>();
+    factory.registerFunction<FunctionTopLevelDomainRFC>();
 }

 }
@@ -1025,12 +1025,14 @@ ColumnPtr FunctionArrayElement::executeMap(
    if (col_const_map)
        values_array = ColumnConst::create(values_array, input_rows_count);

    const auto & type_map = assert_cast<const DataTypeMap &>(*arguments[0].type);

    /// Prepare arguments to call arrayElement for array with values and calculated indices at previous step.
    ColumnsWithTypeAndName new_arguments =
    {
        {
            values_array,
            std::make_shared<DataTypeArray>(result_type),
            std::make_shared<DataTypeArray>(type_map.getValueType()),
            ""
        },
        {
@@ -1086,7 +1088,9 @@ ColumnPtr FunctionArrayElement::executeImpl(const ColumnsWithTypeAndName & argum

    col_array = checkAndGetColumn<ColumnArray>(arguments[0].column.get());
    if (col_array)
    {
        is_array_of_nullable = isColumnNullable(col_array->getData());
    }
    else
    {
        col_const_array = checkAndGetColumnConstData<ColumnArray>(arguments[0].column.get());
@@ -34,8 +34,7 @@ namespace ErrorCodes
template <typename Allocator = Allocator<false>>
struct Memory : boost::noncopyable, Allocator
{
    /// Padding is needed to allow usage of 'memcpySmallAllowReadWriteOverflow15' function with this buffer.
    static constexpr size_t pad_right = 15;
    static constexpr size_t pad_right = PADDING_FOR_SIMD - 1;

    size_t m_capacity = 0; /// With padding.
    size_t m_size = 0;
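The renamed constant keeps the old invariant: a buffer of logical size n owns n + pad_right bytes, so a 16-byte SIMD-style copy starting at any payload byte stays inside the allocation. A small sketch of that arithmetic, assuming PADDING_FOR_SIMD is 16, which matches the test expectations later in this commit:

```cpp
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <cstring>

constexpr size_t PADDING_FOR_SIMD = 16; // assumed value
constexpr size_t pad_right = PADDING_FOR_SIMD - 1;

int main()
{
    size_t size = 10;
    // Allocate size + pad_right bytes so a 16-byte copy starting at any
    // payload byte cannot run past the allocation.
    char * buf = static_cast<char *>(malloc(size + pad_right));
    memset(buf, 'x', size + pad_right);
    char dst[PADDING_FOR_SIMD];
    // A SIMD-style copy may read a full 16-byte chunk beginning at the last
    // payload byte (offset 9); it touches bytes 9..24, all inside 10 + 15 = 25.
    memcpy(dst, buf + size - 1, PADDING_FOR_SIMD);
    printf("%zu bytes allocated, 16-byte overread is safe\n", size + pad_right);
    free(buf);
}
```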
@@ -28,7 +28,7 @@ void MMapReadBufferFromFileDescriptor::init()
    BufferBase::set(mapped.getData(), length, 0);

    size_t page_size = static_cast<size_t>(::getPageSize());
    ReadBuffer::padded = (length % page_size) > 0 && (length % page_size) <= (page_size - 15);
    ReadBuffer::padded = (length % page_size) > 0 && (length % page_size) <= (page_size - (PADDING_FOR_SIMD - 1));
}


@@ -17,7 +17,7 @@ void MMapReadBufferFromFileWithCache::init()
    BufferBase::set(mapped->getData(), length, 0);

    size_t page_size = static_cast<size_t>(::getPageSize());
    ReadBuffer::padded = (length % page_size) > 0 && (length % page_size) <= (page_size - 15);
    ReadBuffer::padded = (length % page_size) > 0 && (length % page_size) <= (page_size - (PADDING_FOR_SIMD - 1));
}

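Both mmap buffers use the same page arithmetic: overreads of up to PADDING_FOR_SIMD - 1 bytes are safe only if the data's tail leaves at least that much room inside the final mapped page. A sketch of the condition, assuming PADDING_FOR_SIMD is 16 and a 4 KiB page:

```cpp
#include <cstddef>
#include <cstdio>

constexpr size_t PADDING_FOR_SIMD = 16; // assumed value

// Mirrors the condition from the commit: the tail of the mapping is "padded"
// if the last page keeps at least PADDING_FOR_SIMD - 1 spare bytes after the data.
bool tailIsPadded(size_t length, size_t page_size)
{
    size_t tail = length % page_size;
    return tail > 0 && tail <= page_size - (PADDING_FOR_SIMD - 1);
}

int main()
{
    printf("%d\n", tailIsPadded(4096 * 2 + 100, 4096));  // 1: 3996 spare bytes in the last page
    printf("%d\n", tailIsPadded(4096 * 2, 4096));        // 0: data ends exactly on a page boundary
    printf("%d\n", tailIsPadded(4096 * 2 + 4090, 4096)); // 0: only 6 spare bytes left
}
```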
@@ -99,7 +99,7 @@ private:
    /// creation (for example if PeekableReadBuffer is often created or if we need to remember small amount of
    /// data after checkpoint), at the beginning we will use small amount of memory on stack and allocate
    /// larger buffer only if reserved memory is not enough.
    char stack_memory[16];
    char stack_memory[PADDING_FOR_SIMD];
    bool use_stack_memory = true;
};

@@ -18,19 +18,6 @@ void formatHex(IteratorSrc src, IteratorDst dst, size_t num_bytes)
    }
}

void formatUUID(const UInt8 * src16, UInt8 * dst36)
{
    formatHex(&src16[0], &dst36[0], 4);
    dst36[8] = '-';
    formatHex(&src16[4], &dst36[9], 2);
    dst36[13] = '-';
    formatHex(&src16[6], &dst36[14], 2);
    dst36[18] = '-';
    formatHex(&src16[8], &dst36[19], 2);
    dst36[23] = '-';
    formatHex(&src16[10], &dst36[24], 6);
}

/** Function used when byte ordering is important when parsing uuid
  * ex: When we create an UUID type
  */

@@ -624,9 +624,6 @@ inline void writeXMLStringForTextElement(std::string_view s, WriteBuffer & buf)
    writeXMLStringForTextElement(s.data(), s.data() + s.size(), buf);
}

template <typename IteratorSrc, typename IteratorDst>
void formatHex(IteratorSrc src, IteratorDst dst, size_t num_bytes);
void formatUUID(const UInt8 * src16, UInt8 * dst36);
void formatUUID(std::reverse_iterator<const UInt8 *> src16, UInt8 * dst36);

inline void writeUUIDText(const UUID & uuid, WriteBuffer & buf)
@@ -15,6 +15,7 @@ struct WriteSettings
    bool enable_filesystem_cache_on_write_operations = false;
    bool enable_filesystem_cache_log = false;
    bool is_file_cache_persistent = false;
    bool s3_allow_parallel_part_upload = true;

    /// Monitoring
    bool for_object_storage = false; // to choose which profile events should be incremented
@@ -79,24 +79,24 @@ TEST(MemoryResizeTest, SmallInitAndSmallResize)

        memory.resize(1);
        ASSERT_TRUE(memory.m_data);
        ASSERT_EQ(memory.m_capacity, 16);
        ASSERT_EQ(memory.m_capacity, PADDING_FOR_SIMD);
        ASSERT_EQ(memory.m_size, 1);
    }

    {
        auto memory = Memory<DummyAllocator>(1);
        ASSERT_TRUE(memory.m_data);
        ASSERT_EQ(memory.m_capacity, 16);
        ASSERT_EQ(memory.m_capacity, PADDING_FOR_SIMD);
        ASSERT_EQ(memory.m_size, 1);

        memory.resize(0);
        ASSERT_TRUE(memory.m_data);
        ASSERT_EQ(memory.m_capacity, 16);
        ASSERT_EQ(memory.m_capacity, PADDING_FOR_SIMD);
        ASSERT_EQ(memory.m_size, 0);

        memory.resize(1);
        ASSERT_TRUE(memory.m_data);
        ASSERT_EQ(memory.m_capacity, 16);
        ASSERT_EQ(memory.m_capacity, PADDING_FOR_SIMD);
        ASSERT_EQ(memory.m_size, 1);
    }
}
@@ -116,52 +116,52 @@ TEST(MemoryResizeTest, SmallInitAndBigResizeOverflowWhenPadding)

        memory.resize(1);
        ASSERT_TRUE(memory.m_data);
        ASSERT_EQ(memory.m_capacity, 16);
        ASSERT_EQ(memory.m_capacity, PADDING_FOR_SIMD);
        ASSERT_EQ(memory.m_size, 1);

        memory.resize(2);
        ASSERT_TRUE(memory.m_data);
        ASSERT_EQ(memory.m_capacity, 17);
        ASSERT_EQ(memory.m_capacity, PADDING_FOR_SIMD + 1);
        ASSERT_EQ(memory.m_size, 2);

        EXPECT_THROW_ERROR_CODE(memory.resize(std::numeric_limits<size_t>::max()), Exception, ErrorCodes::ARGUMENT_OUT_OF_BOUND);
        ASSERT_TRUE(memory.m_data); // state is intact after exception
        ASSERT_EQ(memory.m_capacity, 17);
        ASSERT_EQ(memory.m_capacity, PADDING_FOR_SIMD + 1);
        ASSERT_EQ(memory.m_size, 2);

        memory.resize(0x8000000000000000ULL - 16);
        memory.resize(0x8000000000000000ULL - PADDING_FOR_SIMD);
        ASSERT_TRUE(memory.m_data);
        ASSERT_EQ(memory.m_capacity, 0x8000000000000000ULL - 1);
        ASSERT_EQ(memory.m_size, 0x8000000000000000ULL - 16);
        ASSERT_EQ(memory.m_size, 0x8000000000000000ULL - PADDING_FOR_SIMD);

#ifndef ABORT_ON_LOGICAL_ERROR
        EXPECT_THROW_ERROR_CODE(memory.resize(0x8000000000000000ULL - 15), Exception, ErrorCodes::LOGICAL_ERROR);
        EXPECT_THROW_ERROR_CODE(memory.resize(0x8000000000000000ULL - (PADDING_FOR_SIMD - 1)), Exception, ErrorCodes::LOGICAL_ERROR);
        ASSERT_TRUE(memory.m_data); // state is intact after exception
        ASSERT_EQ(memory.m_capacity, 0x8000000000000000ULL - 1);
        ASSERT_EQ(memory.m_size, 0x8000000000000000ULL - 16);
        ASSERT_EQ(memory.m_size, 0x8000000000000000ULL - PADDING_FOR_SIMD);
#endif
    }

    {
        auto memory = Memory<DummyAllocator>(1);
        ASSERT_TRUE(memory.m_data);
        ASSERT_EQ(memory.m_capacity, 16);
        ASSERT_EQ(memory.m_capacity, PADDING_FOR_SIMD);
        ASSERT_EQ(memory.m_size, 1);

        EXPECT_THROW_ERROR_CODE(memory.resize(std::numeric_limits<size_t>::max()), Exception, ErrorCodes::ARGUMENT_OUT_OF_BOUND);
        ASSERT_TRUE(memory.m_data); // state is intact after exception
        ASSERT_EQ(memory.m_capacity, 16);
        ASSERT_EQ(memory.m_capacity, PADDING_FOR_SIMD);
        ASSERT_EQ(memory.m_size, 1);

        memory.resize(1);
        ASSERT_TRUE(memory.m_data);
        ASSERT_EQ(memory.m_capacity, 16);
        ASSERT_EQ(memory.m_capacity, PADDING_FOR_SIMD);
        ASSERT_EQ(memory.m_size, 1);

#ifndef ABORT_ON_LOGICAL_ERROR
        EXPECT_THROW_ERROR_CODE(memory.resize(0x8000000000000000ULL - 15), Exception, ErrorCodes::LOGICAL_ERROR);
        EXPECT_THROW_ERROR_CODE(memory.resize(0x8000000000000000ULL - (PADDING_FOR_SIMD - 1)), Exception, ErrorCodes::LOGICAL_ERROR);
        ASSERT_TRUE(memory.m_data); // state is intact after exception
        ASSERT_EQ(memory.m_capacity, 16);
        ASSERT_EQ(memory.m_capacity, PADDING_FOR_SIMD);
        ASSERT_EQ(memory.m_size, 1);
#endif
    }
@@ -201,7 +201,7 @@ TEST(MemoryResizeTest, BigInitAndSmallResizeOverflowWhenPadding)
    {
        EXPECT_THROW_ERROR_CODE(
        {
            auto memory = Memory<DummyAllocator>(std::numeric_limits<size_t>::max() - 15);
            auto memory = Memory<DummyAllocator>(std::numeric_limits<size_t>::max() - (PADDING_FOR_SIMD - 1));
        }
        , Exception
        , ErrorCodes::LOGICAL_ERROR);
@@ -210,7 +210,7 @@ TEST(MemoryResizeTest, BigInitAndSmallResizeOverflowWhenPadding)
    {
        EXPECT_THROW_ERROR_CODE(
        {
            auto memory = Memory<DummyAllocator>(0x8000000000000000ULL - 15);
            auto memory = Memory<DummyAllocator>(0x8000000000000000ULL - (PADDING_FOR_SIMD - 1));
        }
        , Exception
        , ErrorCodes::LOGICAL_ERROR);
@@ -218,10 +218,10 @@ TEST(MemoryResizeTest, BigInitAndSmallResizeOverflowWhenPadding)
#endif

    {
        auto memory = Memory<DummyAllocator>(0x8000000000000000ULL - 16);
        ASSERT_TRUE(memory.m_data);
        ASSERT_EQ(memory.m_capacity, 0x8000000000000000ULL - 1);
        ASSERT_EQ(memory.m_size, 0x8000000000000000ULL - 16);
        auto memory = Memory<DummyAllocator>(0x8000000000000000ULL - PADDING_FOR_SIMD);
        ASSERT_TRUE(memory.m_data);
        ASSERT_EQ(memory.m_capacity, 0x8000000000000000ULL - 1);
        ASSERT_EQ(memory.m_size, 0x8000000000000000ULL - PADDING_FOR_SIMD);

        memory.resize(1);
        ASSERT_TRUE(memory.m_data);
@@ -240,32 +240,32 @@ TEST(MemoryResizeTest, AlignmentWithRealAllocator)

        memory.resize(1);
        ASSERT_TRUE(memory.m_data);
        ASSERT_EQ(memory.m_capacity, 16);
        ASSERT_EQ(memory.m_capacity, PADDING_FOR_SIMD);
        ASSERT_EQ(memory.m_size, 1);

        memory.resize(2);
        ASSERT_TRUE(memory.m_data);
        ASSERT_EQ(memory.m_capacity, 17);
        ASSERT_EQ(memory.m_capacity, PADDING_FOR_SIMD + 1);
        ASSERT_EQ(memory.m_size, 2);

        memory.resize(3);
        ASSERT_TRUE(memory.m_data);
        ASSERT_EQ(memory.m_capacity, 18);
        ASSERT_EQ(memory.m_capacity, PADDING_FOR_SIMD + 2);
        ASSERT_EQ(memory.m_size, 3);

        memory.resize(4);
        ASSERT_TRUE(memory.m_data);
        ASSERT_EQ(memory.m_capacity, 19);
        ASSERT_EQ(memory.m_capacity, PADDING_FOR_SIMD + 3);
        ASSERT_EQ(memory.m_size, 4);

        memory.resize(0);
        ASSERT_TRUE(memory.m_data);
        ASSERT_EQ(memory.m_capacity, 19);
        ASSERT_EQ(memory.m_capacity, PADDING_FOR_SIMD + 3);
        ASSERT_EQ(memory.m_size, 0);

        memory.resize(1);
        ASSERT_TRUE(memory.m_data);
        ASSERT_EQ(memory.m_capacity, 19);
        ASSERT_EQ(memory.m_capacity, PADDING_FOR_SIMD + 3);
        ASSERT_EQ(memory.m_size, 1);
    }

@@ -291,12 +291,12 @@ TEST(MemoryResizeTest, AlignmentWithRealAllocator)

        memory.resize(1);
        ASSERT_TRUE(memory.m_data);
        ASSERT_EQ(memory.m_capacity, 16);
        ASSERT_EQ(memory.m_capacity, PADDING_FOR_SIMD);
        ASSERT_EQ(memory.m_size, 1);

        memory.resize(32);
        ASSERT_TRUE(memory.m_data);
        ASSERT_EQ(memory.m_capacity, 47);
        ASSERT_EQ(memory.m_capacity, PADDING_FOR_SIMD + 31);
        ASSERT_EQ(memory.m_size, 32);
    }
}
@@ -316,13 +316,12 @@ TEST(MemoryResizeTest, SomeAlignmentOverflowWhenAlignment)

        memory.resize(1);
        ASSERT_TRUE(memory.m_data);
        ASSERT_EQ(memory.m_capacity, 16);
        ASSERT_EQ(memory.m_capacity, PADDING_FOR_SIMD);
        ASSERT_EQ(memory.m_size, 1);

        EXPECT_THROW_ERROR_CODE(memory.resize(std::numeric_limits<size_t>::max()), Exception, ErrorCodes::ARGUMENT_OUT_OF_BOUND);
        ASSERT_TRUE(memory.m_data); // state is intact after exception
        ASSERT_EQ(memory.m_capacity, 16);
        ASSERT_EQ(memory.m_capacity, PADDING_FOR_SIMD);
        ASSERT_EQ(memory.m_size, 1);
    }

}
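The updated assertions all encode one relationship: a fresh buffer of size n reports capacity n + (PADDING_FOR_SIMD - 1), so capacity 16 pairs with size 1, 17 with size 2, and 47 with size 32. A compact restatement of those expected values, assuming PADDING_FOR_SIMD is 16 and ignoring the shrink-never-reallocates cases:

```cpp
#include <cassert>
#include <cstddef>

constexpr size_t PADDING_FOR_SIMD = 16; // assumed value

// The invariant the tests encode: requested size n reserves n + pad_right
// bytes, so capacity tracks size at a constant PADDING_FOR_SIMD - 1 offset.
size_t expectedCapacity(size_t size)
{
    return size + (PADDING_FOR_SIMD - 1);
}

int main()
{
    assert(expectedCapacity(1) == PADDING_FOR_SIMD);       // 16
    assert(expectedCapacity(2) == PADDING_FOR_SIMD + 1);   // 17
    assert(expectedCapacity(4) == PADDING_FOR_SIMD + 3);   // 19
    assert(expectedCapacity(32) == PADDING_FOR_SIMD + 31); // 47
}
```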
@@ -463,6 +463,18 @@ struct ContextSharedPart : boost::noncopyable
        std::unique_ptr<DDLWorker> delete_ddl_worker;
        std::unique_ptr<AccessControl> delete_access_control;

        /// Delete DDLWorker before zookeeper,
        /// because it can call Context::getZooKeeper and resurrect it.

        {
            auto lock = std::lock_guard(mutex);
            delete_ddl_worker = std::move(ddl_worker);
        }

        /// DDLWorker should be deleted without the lock, because its internal thread can
        /// take it as well, which will cause a deadlock.
        delete_ddl_worker.reset();

        {
            auto lock = std::lock_guard(mutex);

@@ -499,7 +511,6 @@ struct ContextSharedPart : boost::noncopyable
            delete_schedule_pool = std::move(schedule_pool);
            delete_distributed_schedule_pool = std::move(distributed_schedule_pool);
            delete_message_broker_schedule_pool = std::move(message_broker_schedule_pool);
            delete_ddl_worker = std::move(ddl_worker);
            delete_access_control = std::move(access_control);

            /// Stop trace collector if any
@@ -528,7 +539,6 @@ struct ContextSharedPart : boost::noncopyable
        delete_schedule_pool.reset();
        delete_distributed_schedule_pool.reset();
        delete_message_broker_schedule_pool.reset();
        delete_ddl_worker.reset();
        delete_access_control.reset();

        total_memory_tracker.resetOvercommitTracker();
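The reordering above follows a standard shutdown pattern: move the worker out of the shared state while holding the mutex, then destroy it with the mutex released, because its internal thread may need that same mutex to finish. A generic sketch of the pattern, not the actual Context types:

```cpp
#include <memory>
#include <mutex>

struct Worker
{
    // Imagine ~Worker() joins an internal thread that itself locks the owner's mutex.
    ~Worker() { /* join internal thread here */ }
};

struct Owner
{
    std::mutex mutex;
    std::unique_ptr<Worker> worker;

    void shutdown()
    {
        std::unique_ptr<Worker> to_delete;
        {
            // Take the pointer out while holding the lock...
            std::lock_guard lock(mutex);
            to_delete = std::move(worker);
        }
        // ...but run the destructor outside of it, so the worker's own
        // thread can still acquire the mutex while being joined.
        to_delete.reset();
    }
};

int main()
{
    Owner owner;
    owner.worker = std::make_unique<Worker>();
    owner.shutdown();
}
```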
@@ -2061,7 +2071,12 @@ zkutil::ZooKeeperPtr Context::getZooKeeper() const
    if (!shared->zookeeper)
        shared->zookeeper = std::make_shared<zkutil::ZooKeeper>(config, "zookeeper", getZooKeeperLog());
    else if (shared->zookeeper->expired())
    {
        Stopwatch watch;
        LOG_DEBUG(shared->log, "Trying to establish a new connection with ZooKeeper");
        shared->zookeeper = shared->zookeeper->startNewSession();
        LOG_DEBUG(shared->log, "Establishing a new connection with ZooKeeper took {} ms", watch.elapsedMilliseconds());
    }

    return shared->zookeeper;
}
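getZooKeeper now renews an expired session in place and logs how long the reconnect took. A stripped-down sketch of the same cache-or-renew shape, with a toy Session type standing in for zkutil::ZooKeeper:

```cpp
#include <chrono>
#include <cstdio>
#include <memory>

struct Session
{
    bool expired() const { return true; } // pretend the session lapsed
    std::shared_ptr<Session> startNewSession() const { return std::make_shared<Session>(); }
};

std::shared_ptr<Session> getSession(std::shared_ptr<Session> & cached)
{
    if (!cached)
        cached = std::make_shared<Session>();
    else if (cached->expired())
    {
        // Time the reconnect, as the commit does with a Stopwatch around startNewSession().
        auto start = std::chrono::steady_clock::now();
        cached = cached->startNewSession();
        auto ms = std::chrono::duration_cast<std::chrono::milliseconds>(
            std::chrono::steady_clock::now() - start).count();
        printf("reconnect took %lld ms\n", static_cast<long long>(ms));
    }
    return cached;
}

int main()
{
    std::shared_ptr<Session> cached;
    getSession(cached); // creates
    getSession(cached); // renews and logs timing
}
```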
@@ -3632,6 +3647,7 @@ WriteSettings Context::getWriteSettings() const

    res.enable_filesystem_cache_on_write_operations = settings.enable_filesystem_cache_on_write_operations;
    res.enable_filesystem_cache_log = settings.enable_filesystem_cache_log;
    res.s3_allow_parallel_part_upload = settings.s3_allow_parallel_part_upload;

    res.remote_throttler = getRemoteWriteThrottler();

@@ -26,6 +26,7 @@
#include <Common/ZooKeeper/KeeperException.h>
#include <Common/ZooKeeper/ZooKeeperLock.h>
#include <Common/isLocalAddress.h>
#include <Core/ServerUUID.h>
#include <Storages/StorageReplicatedMergeTree.h>
#include <Poco/Timestamp.h>
#include <base/sleep.h>
@@ -532,7 +533,8 @@ void DDLWorker::processTask(DDLTaskBase & task, const ZooKeeperPtr & zookeeper)
        auto active_node = zkutil::EphemeralNodeHolder::existing(active_node_path, *zookeeper);

    /// Try fast path
    auto create_active_res = zookeeper->tryCreate(active_node_path, {}, zkutil::CreateMode::Ephemeral);
    const String canary_value = Field(ServerUUID::get()).dump();
    auto create_active_res = zookeeper->tryCreate(active_node_path, canary_value, zkutil::CreateMode::Ephemeral);
    if (create_active_res != Coordination::Error::ZOK)
    {
        if (create_active_res != Coordination::Error::ZNONODE && create_active_res != Coordination::Error::ZNODEEXISTS)
@@ -563,10 +565,10 @@ void DDLWorker::processTask(DDLTaskBase & task, const ZooKeeperPtr & zookeeper)
        {
            /// Connection has been lost and now we are retrying,
            /// but our previous ephemeral node still exists.
            zookeeper->waitForEphemeralToDisappearIfAny(active_node_path);
            zookeeper->handleEphemeralNodeExistence(active_node_path, canary_value);
        }

        zookeeper->create(active_node_path, {}, zkutil::CreateMode::Ephemeral);
        zookeeper->create(active_node_path, canary_value, zkutil::CreateMode::Ephemeral);
    }

    /// We must hold the lock until task execution status is committed to ZooKeeper,
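Writing Field(ServerUUID::get()).dump() into the ephemeral node gives later code a way to tell whether a lingering node belongs to this server's previous session or to a live peer. A toy illustration of that ownership test; the names and the payload format shown are illustrative, not the real ZooKeeper client API:

```cpp
#include <cstdio>
#include <string>

// If the ephemeral node's payload equals our own server identifier, the node
// is a leftover from our previous session and can be removed instead of waited on.
bool isOurLeftover(const std::string & node_payload, const std::string & our_canary)
{
    return node_payload == our_canary;
}

int main()
{
    std::string canary = "'0e3c3672-hypothetical-server-uuid'"; // a Field(...).dump()-style value
    printf("%d\n", isOurLeftover(canary, canary));             // 1: safe to drop and recreate
    printf("%d\n", isOurLeftover("'another-server'", canary)); // 0: someone else holds it, wait
}
```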
@@ -714,7 +714,10 @@ public:
                /// Object was never loaded successfully and should be reloaded.
                startLoading(info);
            }
            LOG_TRACE(log, "Object '{}' is neither loaded nor failed, so it will not be reloaded as outdated.", info.name);
            else
            {
                LOG_TRACE(log, "Object '{}' is neither loaded nor failed, so it will not be reloaded as outdated.", info.name);
            }
        }
    }
}
@@ -75,7 +75,7 @@ BlockIO InterpreterDescribeQuery::execute()
        auto select_query = table_expression.subquery->children.at(0);
        auto current_context = getContext();

        if (settings.use_analyzer)
        if (settings.allow_experimental_analyzer)
        {
            SelectQueryOptions select_query_options;
            names_and_types = InterpreterSelectQueryAnalyzer(select_query, select_query_options, current_context).getSampleBlock().getNamesAndTypesList();

@@ -419,7 +419,7 @@ QueryPipeline InterpreterExplainQuery::executeImpl()
            auto settings = checkAndGetSettings<QueryPlanSettings>(ast.getSettings());
            QueryPlan plan;

            if (getContext()->getSettingsRef().use_analyzer)
            if (getContext()->getSettingsRef().allow_experimental_analyzer)
            {
                InterpreterSelectQueryAnalyzer interpreter(ast.getExplainedQuery(), options, getContext());
                plan = std::move(interpreter).extractQueryPlan();
@@ -462,7 +462,7 @@ QueryPipeline InterpreterExplainQuery::executeImpl()
            auto settings = checkAndGetSettings<QueryPipelineSettings>(ast.getSettings());
            QueryPlan plan;

            if (getContext()->getSettingsRef().use_analyzer)
            if (getContext()->getSettingsRef().allow_experimental_analyzer)
            {
                InterpreterSelectQueryAnalyzer interpreter(ast.getExplainedQuery(), options, getContext());
                plan = std::move(interpreter).extractQueryPlan();

@@ -119,7 +119,7 @@ std::unique_ptr<IInterpreter> InterpreterFactory::get(ASTPtr & query, ContextMut

    if (query->as<ASTSelectQuery>())
    {
        if (context->getSettingsRef().use_analyzer)
        if (context->getSettingsRef().allow_experimental_analyzer)
            return std::make_unique<InterpreterSelectQueryAnalyzer>(query, options, context);

        /// This is internal part of ASTSelectWithUnionQuery.
@@ -130,7 +130,7 @@ std::unique_ptr<IInterpreter> InterpreterFactory::get(ASTPtr & query, ContextMut
    {
        ProfileEvents::increment(ProfileEvents::SelectQuery);

        if (context->getSettingsRef().use_analyzer)
        if (context->getSettingsRef().allow_experimental_analyzer)
            return std::make_unique<InterpreterSelectQueryAnalyzer>(query, options, context);

        return std::make_unique<InterpreterSelectWithUnionQuery>(query, context, options);
@@ -478,7 +478,10 @@ struct Operator
{
    Operator() = default;

    Operator(const std::string & function_name_, int priority_, int arity_ = 2, OperatorType type_ = OperatorType::None)
    Operator(const std::string & function_name_,
             int priority_,
             int arity_,
             OperatorType type_ = OperatorType::None)
        : type(type_), priority(priority_), arity(arity_), function_name(function_name_) {}

    OperatorType type;
@@ -514,10 +517,8 @@ enum class Checkpoint
class Layer
{
public:
    explicit Layer(bool allow_alias_ = true, bool allow_alias_without_as_keyword_ = true) :
        allow_alias(allow_alias_), allow_alias_without_as_keyword(allow_alias_without_as_keyword_)
    {
    }
    explicit Layer(bool allow_alias_ = true, bool allow_alias_without_as_keyword_ = false) :
        allow_alias(allow_alias_), allow_alias_without_as_keyword(allow_alias_without_as_keyword_) {}

    virtual ~Layer() = default;

@@ -620,13 +621,17 @@ public:
    ///
    bool mergeElement(bool push_to_elements = true)
    {
        parsed_alias = false;

        Operator cur_op;
        while (popOperator(cur_op))
        {
            ASTPtr function;

            // Special case of ternary operator
            if (cur_op.type == OperatorType::StartIf)
            // We should not meet the starting part of the operator while finishing an element
            if (cur_op.type == OperatorType::StartIf ||
                cur_op.type == OperatorType::StartBetween ||
                cur_op.type == OperatorType::StartNotBetween)
                return false;

            if (cur_op.type == OperatorType::FinishIf)
@@ -636,10 +641,6 @@ public:
                return false;
            }

            // Special case of a BETWEEN b AND c operator
            if (cur_op.type == OperatorType::StartBetween || cur_op.type == OperatorType::StartNotBetween)
                return false;

            if (cur_op.type == OperatorType::FinishBetween)
            {
                Operator tmp_op;
@@ -735,6 +736,9 @@ public:
    /// In order to distinguish them we keep a counter of BETWEENs without matching ANDs.
    int between_counter = 0;

    /// Flag we set when we parsed alias to avoid parsing next element as alias
    bool parsed_alias = false;

    bool allow_alias = true;
    bool allow_alias_without_as_keyword = true;

@@ -784,16 +788,18 @@ public:
    }
};


/// Basic layer for a function with certain separator and end tokens:
/// 1. If we parse a separator we should merge current operands and operators
///    into one element and push it to the 'elements' vector.
/// 2. If we parse an ending token, we should merge everything as in (1) and
///    also set the 'finished' flag.
template <TokenType separator, TokenType end>
class BaseLayer : public Layer
class LayerWithSeparator : public Layer
{
public:
    explicit LayerWithSeparator(bool allow_alias_ = true, bool allow_alias_without_as_keyword_ = false) :
        Layer(allow_alias_, allow_alias_without_as_keyword_) {}

    bool parse(IParser::Pos & pos, Expected & expected, Action & action) override
    {
        if (ParserToken(separator).ignore(pos, expected))
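The renamed LayerWithSeparator captures a small state machine: separators fold the pending operands into one finished element, the end token folds and marks the layer finished. A toy model of that contract with plain strings instead of ASTs:

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Toy version of the separator/end contract: on a separator, fold the pending
// tokens into one element; on the end token, fold and mark the layer finished.
struct ToyLayer
{
    std::vector<std::string> pending, elements;
    bool finished = false;

    void token(const std::string & t)
    {
        if (t == ",")
        {
            merge();
        }
        else if (t == ")")
        {
            merge();
            finished = true;
        }
        else
            pending.push_back(t);
    }

    void merge()
    {
        std::string element;
        for (const auto & p : pending)
            element += p;
        pending.clear();
        elements.push_back(element);
    }
};

int main()
{
    ToyLayer layer;
    for (const std::string & t : {"1", "+", "2", ",", "x", ")"})
        layer.token(t);
    printf("%zu elements, finished=%d\n", layer.elements.size(), layer.finished); // 2 elements, finished=1
}
```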
@@ -817,11 +823,11 @@ public:
    }
};


class OrdinaryFunctionLayer : public Layer
/// Layer for regular and aggregate functions without syntax sugar
class FunctionLayer : public Layer
{
public:
    explicit OrdinaryFunctionLayer(String function_name_, bool allow_function_parameters_ = true)
    explicit FunctionLayer(String function_name_, bool allow_function_parameters_ = true)
        : function_name(function_name_), allow_function_parameters(allow_function_parameters_) {}

    bool parse(IParser::Pos & pos, Expected & expected, Action & action) override
@@ -966,7 +972,7 @@ public:

            if (parameters)
            {
                function_node->parameters = parameters;
                function_node->parameters = std::move(parameters);
                function_node->children.push_back(function_node->parameters);
            }

@@ -999,7 +1005,7 @@ public:
                    return false;
            }

            elements = {function_node};
            elements = {std::move(function_node)};
            finished = true;
        }

@@ -1068,7 +1074,7 @@ private:
};

/// Layer for array square brackets operator
class ArrayLayer : public BaseLayer<TokenType::Comma, TokenType::ClosingSquareBracket>
class ArrayLayer : public LayerWithSeparator<TokenType::Comma, TokenType::ClosingSquareBracket>
{
public:
    bool getResult(ASTPtr & node) override
@@ -1079,25 +1085,27 @@ public:

    bool parse(IParser::Pos & pos, Expected & expected, Action & action) override
    {
        return BaseLayer::parse(pos, expected, action);
        return LayerWithSeparator::parse(pos, expected, action);
    }
};

/// Layer for arrayElement square brackets operator
/// This layer does not create a function, it is only needed to parse closing token
/// and return only one element.
class ArrayElementLayer : public BaseLayer<TokenType::Comma, TokenType::ClosingSquareBracket>
class ArrayElementLayer : public LayerWithSeparator<TokenType::Comma, TokenType::ClosingSquareBracket>
{
public:
    bool parse(IParser::Pos & pos, Expected & expected, Action & action) override
    {
        return BaseLayer::parse(pos, expected, action);
        return LayerWithSeparator::parse(pos, expected, action);
    }
};

class CastLayer : public Layer
{
public:
    CastLayer() : Layer(/*allow_alias*/ true, /*allow_alias_without_as_keyword*/ true) {}

    bool parse(IParser::Pos & pos, Expected & expected, Action & action) override
    {
        /// CAST(x [AS alias1], T [AS alias2]) or CAST(x [AS alias1] AS T)
@@ -1193,9 +1201,11 @@ public:
    }
};

class ExtractLayer : public BaseLayer<TokenType::Comma, TokenType::ClosingRoundBracket>
class ExtractLayer : public LayerWithSeparator<TokenType::Comma, TokenType::ClosingRoundBracket>
{
public:
    ExtractLayer() : LayerWithSeparator(/*allow_alias*/ true, /*allow_alias_without_as_keyword*/ true) {}

    bool getResult(ASTPtr & node) override
    {
        if (state == 2)
@@ -1240,7 +1250,7 @@ public:

        if (state == 1)
        {
            return BaseLayer::parse(pos, expected, action);
            return LayerWithSeparator::parse(pos, expected, action);
        }

        if (state == 2)
@@ -1265,6 +1275,8 @@ private:
class SubstringLayer : public Layer
{
public:
    SubstringLayer() : Layer(/*allow_alias*/ true, /*allow_alias_without_as_keyword*/ true) {}

    bool getResult(ASTPtr & node) override
    {
        node = makeASTFunction("substring", std::move(elements));
@@ -1325,6 +1337,8 @@ public:
class PositionLayer : public Layer
{
public:
    PositionLayer() : Layer(/*allow_alias*/ true, /*allow_alias_without_as_keyword*/ true) {}

    bool getResult(ASTPtr & node) override
    {
        if (state == 2)
@@ -1390,10 +1404,11 @@ public:
    }
};


class ExistsLayer : public Layer
{
public:
    ExistsLayer() : Layer(/*allow_alias*/ true, /*allow_alias_without_as_keyword*/ true) {}

    bool parse(IParser::Pos & pos, Expected & expected, Action & /*action*/) override
    {
        ASTPtr node;
@@ -1418,9 +1433,8 @@ public:
class TrimLayer : public Layer
{
public:
    TrimLayer(bool trim_left_, bool trim_right_) : trim_left(trim_left_), trim_right(trim_right_)
    {
    }
    TrimLayer(bool trim_left_, bool trim_right_)
        : Layer(/*allow_alias*/ true, /*allow_alias_without_as_keyword*/ true), trim_left(trim_left_), trim_right(trim_right_) {}

    bool getResult(ASTPtr & node) override
    {
@@ -1578,13 +1592,11 @@ private:
    String function_name;
};


class DateAddLayer : public BaseLayer<TokenType::Comma, TokenType::ClosingRoundBracket>
class DateAddLayer : public LayerWithSeparator<TokenType::Comma, TokenType::ClosingRoundBracket>
{
public:
    explicit DateAddLayer(const char * function_name_) : function_name(function_name_)
    {
    }
    explicit DateAddLayer(const char * function_name_)
        : LayerWithSeparator(/*allow_alias*/ true, /*allow_alias_without_as_keyword*/ true), function_name(function_name_) {}

    bool getResult(ASTPtr & node) override
    {
@@ -1626,7 +1638,7 @@ public:

        if (state == 1)
        {
            return BaseLayer::parse(pos, expected, action);
            return LayerWithSeparator::parse(pos, expected, action);
        }

        return true;
@@ -1638,10 +1650,11 @@ private:
    bool parsed_interval_kind = false;
};


class DateDiffLayer : public BaseLayer<TokenType::Comma, TokenType::ClosingRoundBracket>
class DateDiffLayer : public LayerWithSeparator<TokenType::Comma, TokenType::ClosingRoundBracket>
{
public:
    DateDiffLayer() : LayerWithSeparator(/*allow_alias*/ true, /*allow_alias_without_as_keyword*/ true) {}

    bool getResult(ASTPtr & node) override
    {
        if (parsed_interval_kind)
@@ -1680,7 +1693,7 @@ public:

        if (state == 1)
        {
            return BaseLayer::parse(pos, expected, action);
            return LayerWithSeparator::parse(pos, expected, action);
        }

        return true;
@@ -1691,10 +1704,11 @@ private:
    bool parsed_interval_kind = false;
};


class IntervalLayer : public Layer
{
public:
    IntervalLayer() : Layer(/*allow_alias*/ true, /*allow_alias_without_as_keyword*/ true) {}

    bool parse(IParser::Pos & pos, Expected & expected, Action & action) override
    {
        /// INTERVAL 1 HOUR or INTERVAL expr HOUR
@@ -1769,86 +1783,11 @@ private:
    IntervalKind interval_kind;
};

/// Layer for table function 'view' and 'viewIfPermitted'
class ViewLayer : public Layer
{
public:
    explicit ViewLayer(bool if_permitted_) : if_permitted(if_permitted_) {}

    bool getResult(ASTPtr & node) override
    {
        if (if_permitted)
            node = makeASTFunction("viewIfPermitted", std::move(elements));
        else
            node = makeASTFunction("view", std::move(elements));

        return true;
    }

    bool parse(IParser::Pos & pos, Expected & expected, Action & /*action*/) override
    {
        /// view(SELECT ...)
        /// viewIfPermitted(SELECT ... ELSE func(...))
        ///
        /// 0. Parse the SELECT query and if 'if_permitted' parse 'ELSE' keyword (-> 1) else (finished)
        /// 1. Parse closing token

        if (state == 0)
        {
            ASTPtr query;

            bool maybe_an_subquery = pos->type == TokenType::OpeningRoundBracket;

            if (!ParserSelectWithUnionQuery().parse(pos, query, expected))
                return false;

            auto & select_ast = query->as<ASTSelectWithUnionQuery &>();
            if (select_ast.list_of_selects->children.size() == 1 && maybe_an_subquery)
            {
                // It's a subquery. Bail out.
                return false;
            }

            pushResult(query);

            if (!if_permitted)
            {
                if (!ParserToken(TokenType::ClosingRoundBracket).ignore(pos, expected))
                    return false;

                finished = true;
                return true;
            }

            if (!ParserKeyword{"ELSE"}.ignore(pos, expected))
                return false;

            state = 1;
            return true;
        }

        if (state == 1)
        {
            if (ParserToken(TokenType::ClosingRoundBracket).ignore(pos, expected))
            {
                if (!mergeElement())
                    return false;

                finished = true;
            }
        }

        return true;
    }

private:
    bool if_permitted;
};


class CaseLayer : public Layer
{
public:
    CaseLayer() : Layer(/*allow_alias*/ true, /*allow_alias_without_as_keyword*/ true) {}

    bool parse(IParser::Pos & pos, Expected & expected, Action & action) override
    {
        /// CASE [x] WHEN expr THEN expr [WHEN expr THEN expr [...]] [ELSE expr] END
@@ -1937,6 +1876,82 @@ private:
    bool has_case_expr;
};

/// Layer for table function 'view' and 'viewIfPermitted'
class ViewLayer : public Layer
{
public:
    explicit ViewLayer(bool if_permitted_) : if_permitted(if_permitted_) {}

    bool getResult(ASTPtr & node) override
    {
        if (if_permitted)
            node = makeASTFunction("viewIfPermitted", std::move(elements));
        else
            node = makeASTFunction("view", std::move(elements));

        return true;
    }

    bool parse(IParser::Pos & pos, Expected & expected, Action & /*action*/) override
    {
        /// view(SELECT ...)
        /// viewIfPermitted(SELECT ... ELSE func(...))
        ///
        /// 0. Parse the SELECT query and if 'if_permitted' parse 'ELSE' keyword (-> 1) else (finished)
        /// 1. Parse closing token

        if (state == 0)
        {
            ASTPtr query;

            bool maybe_an_subquery = pos->type == TokenType::OpeningRoundBracket;

            if (!ParserSelectWithUnionQuery().parse(pos, query, expected))
                return false;

            auto & select_ast = query->as<ASTSelectWithUnionQuery &>();
            if (select_ast.list_of_selects->children.size() == 1 && maybe_an_subquery)
            {
                // It's a subquery. Bail out.
                return false;
            }

            pushResult(query);

            if (!if_permitted)
            {
                if (!ParserToken(TokenType::ClosingRoundBracket).ignore(pos, expected))
                    return false;

                finished = true;
                return true;
            }

            if (!ParserKeyword{"ELSE"}.ignore(pos, expected))
                return false;

            state = 1;
            return true;
        }

        if (state == 1)
        {
            if (ParserToken(TokenType::ClosingRoundBracket).ignore(pos, expected))
            {
                if (!mergeElement())
                    return false;

                finished = true;
            }
        }

        return true;
    }

private:
    bool if_permitted;
};


std::unique_ptr<Layer> getFunctionLayer(ASTPtr identifier, bool is_table_function, bool allow_function_parameters_ = true)
{
@@ -2001,9 +2016,9 @@ std::unique_ptr<Layer> getFunctionLayer(ASTPtr identifier, bool is_table_functio
        || function_name_lowercase == "timestampdiff" || function_name_lowercase == "timestamp_diff")
        return std::make_unique<DateDiffLayer>();
    else if (function_name_lowercase == "grouping")
        return std::make_unique<OrdinaryFunctionLayer>(function_name_lowercase, allow_function_parameters_);
        return std::make_unique<FunctionLayer>(function_name_lowercase, allow_function_parameters_);
    else
        return std::make_unique<OrdinaryFunctionLayer>(function_name, allow_function_parameters_);
        return std::make_unique<FunctionLayer>(function_name, allow_function_parameters_);
}


@@ -2153,22 +2168,22 @@ std::vector<std::pair<const char *, Operator>> ParserExpressionImpl::operators_t
    {"<",             Operator("less", 9, 2, OperatorType::Comparison)},
    {">",             Operator("greater", 9, 2, OperatorType::Comparison)},
    {"=",             Operator("equals", 9, 2, OperatorType::Comparison)},
    {"LIKE",          Operator("like", 9)},
    {"ILIKE",         Operator("ilike", 9)},
    {"NOT LIKE",      Operator("notLike", 9)},
    {"NOT ILIKE",     Operator("notILike", 9)},
    {"IN",            Operator("in", 9)},
    {"NOT IN",        Operator("notIn", 9)},
    {"GLOBAL IN",     Operator("globalIn", 9)},
    {"GLOBAL NOT IN", Operator("globalNotIn", 9)},
    {"LIKE",          Operator("like", 9, 2)},
    {"ILIKE",         Operator("ilike", 9, 2)},
    {"NOT LIKE",      Operator("notLike", 9, 2)},
    {"NOT ILIKE",     Operator("notILike", 9, 2)},
    {"IN",            Operator("in", 9, 2)},
    {"NOT IN",        Operator("notIn", 9, 2)},
    {"GLOBAL IN",     Operator("globalIn", 9, 2)},
    {"GLOBAL NOT IN", Operator("globalNotIn", 9, 2)},
    {"||",            Operator("concat", 10, 2, OperatorType::Mergeable)},
    {"+",             Operator("plus", 11)},
    {"-",             Operator("minus", 11)},
    {"*",             Operator("multiply", 12)},
    {"/",             Operator("divide", 12)},
    {"%",             Operator("modulo", 12)},
    {"MOD",           Operator("modulo", 12)},
    {"DIV",           Operator("intDiv", 12)},
    {"+",             Operator("plus", 11, 2)},
    {"-",             Operator("minus", 11, 2)},
    {"*",             Operator("multiply", 12, 2)},
    {"/",             Operator("divide", 12, 2)},
    {"%",             Operator("modulo", 12, 2)},
    {"MOD",           Operator("modulo", 12, 2)},
    {"DIV",           Operator("intDiv", 12, 2)},
    {".",             Operator("tupleElement", 14, 2, OperatorType::TupleElement)},
    {"[",             Operator("arrayElement", 14, 2, OperatorType::ArrayElement)},
    {"::",            Operator("CAST", 14, 2, OperatorType::Cast)},
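The table entries now pass arity explicitly because the Operator constructor lost its defaulted argument. A reduced sketch of the same table shape with the three fields spelled out:

```cpp
#include <cstdio>
#include <string>
#include <utility>
#include <vector>

// A trimmed-down operator table in the commit's new style: every entry spells
// out its arity instead of relying on a defaulted constructor argument.
struct Op
{
    std::string function;
    int priority;
    int arity;
};

int main()
{
    std::vector<std::pair<const char *, Op>> table = {
        {"+",    {"plus", 11, 2}},
        {"*",    {"multiply", 12, 2}},
        {"LIKE", {"like", 9, 2}},
    };

    for (const auto & [token, op] : table)
        printf("%-5s -> %s (priority %d, arity %d)\n", token, op.function.c_str(), op.priority, op.arity);
}
```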
@@ -2440,11 +2455,15 @@ Action ParserExpressionImpl::tryParseOperator(Layers & layers, IParser::Pos & po

    if (cur_op == operators_table.end())
    {
        ParserAlias alias_parser(layers.back()->allow_alias_without_as_keyword);
        auto old_pos = pos;
        if (layers.back()->allow_alias && ParserAlias(layers.back()->allow_alias_without_as_keyword).parse(pos, tmp, expected))
        if (layers.back()->allow_alias &&
            !layers.back()->parsed_alias &&
            alias_parser.parse(pos, tmp, expected) &&
            layers.back()->insertAlias(tmp))
        {
            if (layers.back()->insertAlias(tmp))
                return Action::OPERATOR;
            layers.back()->parsed_alias = true;
            return Action::OPERATOR;
        }
        pos = old_pos;
        return Action::NONE;
@@ -502,7 +502,16 @@ String calculateActionNodeName(const QueryTreeNodePtr & node, const PlannerConte
        case QueryTreeNodeType::COLUMN:
        {
            const auto * column_identifier = planner_context.getColumnNodeIdentifierOrNull(node);
            result = column_identifier ? *column_identifier : node->getName();

            if (column_identifier)
            {
                result = *column_identifier;
            }
            else
            {
                const auto & column_node = node->as<ColumnNode &>();
                result = column_node.getColumnName();
            }

            break;
        }

@@ -62,7 +62,7 @@ size_t tryReuseStorageOrderingForWindowFunctions(QueryPlan::Node * parent_node,
    }

    auto context = read_from_merge_tree->getContext();
    if (!context->getSettings().optimize_read_in_window_order || context->getSettingsRef().use_analyzer)
    if (!context->getSettings().optimize_read_in_window_order || context->getSettingsRef().allow_experimental_analyzer)
    {
        return 0;
    }
@@ -6,6 +6,7 @@
#include <Interpreters/Context.h>
#include <Common/ZooKeeper/KeeperException.h>
#include <Common/randomSeed.h>
#include <Core/ServerUUID.h>
#include <boost/algorithm/string/replace.hpp>


@@ -26,19 +27,12 @@ namespace DB
namespace ErrorCodes
{
    extern const int REPLICA_IS_ALREADY_ACTIVE;
    extern const int REPLICA_STATUS_CHANGED;

}

namespace
{
    constexpr auto retry_period_ms = 1000;
}

/// Used to check whether it's us who set node `is_active`, or not.
static String generateActiveNodeIdentifier()
{
    return "pid: " + toString(getpid()) + ", random: " + toString(randomSeed());
    return Field(ServerUUID::get()).dump();
}

ReplicatedMergeTreeRestartingThread::ReplicatedMergeTreeRestartingThread(StorageReplicatedMergeTree & storage_)
@@ -58,27 +52,34 @@ void ReplicatedMergeTreeRestartingThread::run()
    if (need_stop)
        return;

    size_t reschedule_period_ms = check_period_ms;
    /// In case of any exceptions we want to rerun this task as fast as possible, but we also don't want to keep retrying
    /// in a tight loop (as fast as tasks can be processed), so we'll retry in between 100 and 10000 ms
    const size_t backoff_ms = 100 * ((consecutive_check_failures + 1) * (consecutive_check_failures + 2)) / 2;
    const size_t next_failure_retry_ms = std::min(size_t{10000}, backoff_ms);

    try
    {
        bool replica_is_active = runImpl();
        if (!replica_is_active)
            reschedule_period_ms = retry_period_ms;
    }
    catch (const Exception & e)
    {
        /// We couldn't activate table let's set it into readonly mode
        partialShutdown();
        tryLogCurrentException(log, __PRETTY_FUNCTION__);

        if (e.code() == ErrorCodes::REPLICA_STATUS_CHANGED)
            reschedule_period_ms = 0;
        if (replica_is_active)
        {
            consecutive_check_failures = 0;
            task->scheduleAfter(check_period_ms);
        }
        else
        {
            consecutive_check_failures++;
            task->scheduleAfter(next_failure_retry_ms);
        }
    }
    catch (...)
    {
        consecutive_check_failures++;
        task->scheduleAfter(next_failure_retry_ms);

        /// We couldn't activate table let's set it into readonly mode if necessary
        /// We do this after scheduling the task in case it throws
        partialShutdown();
        tryLogCurrentException(log, __PRETTY_FUNCTION__);
        tryLogCurrentException(log, "Failed to restart the table. Will try again");
    }

    if (first_time)
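The backoff comment translates to a concrete series: 100 ms times the (n+1)-th triangular number, capped at 10 seconds, for n consecutive failures. A sketch that reproduces the commit's formula and prints the schedule:

```cpp
#include <algorithm>
#include <cstdio>

// The retry delay from this commit: 100 * (n+1)(n+2)/2 ms, capped at 10 s.
size_t nextFailureRetryMs(size_t consecutive_check_failures)
{
    const size_t backoff_ms = 100 * ((consecutive_check_failures + 1) * (consecutive_check_failures + 2)) / 2;
    return std::min<size_t>(10000, backoff_ms);
}

int main()
{
    for (size_t n = 0; n < 16; ++n)
        printf("failures=%zu -> retry in %zu ms\n", n, nextFailureRetryMs(n));
    // 0 -> 100, 1 -> 300, 2 -> 600, ..., 12 -> 9100, 13 and above -> 10000 (capped)
}
```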
@@ -92,14 +93,6 @@ void ReplicatedMergeTreeRestartingThread::run()
        storage.startup_event.set();
        first_time = false;
    }

    if (need_stop)
        return;

    if (reschedule_period_ms)
        task->scheduleAfter(reschedule_period_ms);
    else
        task->schedule();
}

bool ReplicatedMergeTreeRestartingThread::runImpl()
@@ -132,8 +125,8 @@ bool ReplicatedMergeTreeRestartingThread::runImpl()
    }
    catch (const Coordination::Exception &)
    {
        /// The exception when you try to zookeeper_init usually happens if DNS does not work. We will try to do it again.
        tryLogCurrentException(log, __PRETTY_FUNCTION__);
        /// The exception when you try to zookeeper_init usually happens if DNS does not work or the connection with ZK fails
        tryLogCurrentException(log, "Failed to establish a new ZK connection. Will try again");
        assert(storage.is_readonly);
        return false;
    }
@@ -158,12 +151,15 @@ bool ReplicatedMergeTreeRestartingThread::runImpl()
    storage.cleanup_thread.start();
    storage.part_check_thread.start();

    LOG_DEBUG(log, "Table started successfully");

    return true;
}


bool ReplicatedMergeTreeRestartingThread::tryStartup()
{
    LOG_DEBUG(log, "Trying to start replica up");
    try
    {
        removeFailedQuorumParts();
@@ -177,9 +173,7 @@ bool ReplicatedMergeTreeRestartingThread::tryStartup()
        try
        {
            storage.queue.initialize(zookeeper);

            storage.queue.load(zookeeper);

            storage.queue.createLogEntriesToFetchBrokenParts();

            /// pullLogsToQueue() after we mark replica 'is_active' (and after we repair if it was lost);
@@ -302,7 +296,7 @@ void ReplicatedMergeTreeRestartingThread::activateReplica()
    ReplicatedMergeTreeAddress address = storage.getReplicatedMergeTreeAddress();

    String is_active_path = fs::path(storage.replica_path) / "is_active";
    zookeeper->waitForEphemeralToDisappearIfAny(is_active_path);
    zookeeper->handleEphemeralNodeExistence(is_active_path, active_node_identifier);

    /// Simultaneously declare that this replica is active, and update the host.
    Coordination::Requests ops;
@@ -348,7 +342,6 @@ void ReplicatedMergeTreeRestartingThread::partialShutdown(bool part_of_full_shut
    storage.replica_is_active_node = nullptr;

    LOG_TRACE(log, "Waiting for threads to finish");

    storage.merge_selecting_task->deactivate();
    storage.queue_updating_task->deactivate();
    storage.mutations_updating_task->deactivate();

@@ -41,6 +41,7 @@ private:

    BackgroundSchedulePool::TaskHolder task;
    Int64 check_period_ms; /// The frequency of checking expiration of session in ZK.
    UInt32 consecutive_check_failures = 0; /// How many consecutive checks have failed
    bool first_time = true; /// Activate replica for the first time.

    void run();
@@ -81,7 +81,8 @@ void listFilesWithRegexpMatchingImpl(
    const std::string & path_for_ls,
    const std::string & for_match,
    size_t & total_bytes_to_read,
    std::vector<std::string> & result)
    std::vector<std::string> & result,
    bool recursive = false)
{
    const size_t first_glob = for_match.find_first_of("*?{");

@@ -89,10 +90,17 @@ void listFilesWithRegexpMatchingImpl(
    const std::string suffix_with_globs = for_match.substr(end_of_path_without_globs); /// begin with '/'

    const size_t next_slash = suffix_with_globs.find('/', 1);
    auto regexp = makeRegexpPatternFromGlobs(suffix_with_globs.substr(0, next_slash));
    const std::string current_glob = suffix_with_globs.substr(0, next_slash);
    auto regexp = makeRegexpPatternFromGlobs(current_glob);

    re2::RE2 matcher(regexp);

    bool skip_regex = current_glob == "/*" ? true : false;
    if (!recursive)
        recursive = current_glob == "/**";

    const std::string prefix_without_globs = path_for_ls + for_match.substr(1, end_of_path_without_globs);

    if (!fs::exists(prefix_without_globs))
        return;

@@ -107,15 +115,21 @@ void listFilesWithRegexpMatchingImpl(
        /// Condition is_directory means what kind of path is it in current iteration of ls
        if (!it->is_directory() && !looking_for_directory)
        {
            if (re2::RE2::FullMatch(file_name, matcher))
            if (skip_regex || re2::RE2::FullMatch(file_name, matcher))
            {
                total_bytes_to_read += it->file_size();
                result.push_back(it->path().string());
            }
        }
        else if (it->is_directory() && looking_for_directory)
        else if (it->is_directory())
        {
            if (re2::RE2::FullMatch(file_name, matcher))
            if (recursive)
            {
                listFilesWithRegexpMatchingImpl(fs::path(full_path).append(it->path().string()) / "",
                    looking_for_directory ? suffix_with_globs.substr(next_slash) : current_glob,
                    total_bytes_to_read, result, recursive);
            }
            else if (looking_for_directory && re2::RE2::FullMatch(file_name, matcher))
            {
                /// Recursion depth is limited by pattern. '*' works only for depth = 1, for depth = 2 pattern path is '*/*'. So we do not need additional check.
                listFilesWithRegexpMatchingImpl(fs::path(full_path) / "", suffix_with_globs.substr(next_slash), total_bytes_to_read, result);
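The new recursive flag changes how each glob component is interpreted: "/*" matches everything at one level and can skip the regex entirely, while "/**" flips the listing into recursive mode for the rest of the walk. A toy classification of the three cases:

```cpp
#include <cstdio>
#include <string>

// Sketch of the decision this hunk adds: "/*" means "match everything at this
// level, skip the regex", and "/**" switches the listing into recursive mode.
struct GlobMode
{
    bool skip_regex;
    bool recursive;
};

GlobMode classify(const std::string & current_glob, bool already_recursive)
{
    return {current_glob == "/*", already_recursive || current_glob == "/**"};
}

int main()
{
    auto a = classify("/*", false);
    auto b = classify("/**", false);
    auto c = classify("/data_*", false);
    printf("/*      skip=%d recursive=%d\n", a.skip_regex, a.recursive); // skip=1 recursive=0
    printf("/**     skip=%d recursive=%d\n", b.skip_regex, b.recursive); // skip=0 recursive=1
    printf("/data_* skip=%d recursive=%d\n", c.skip_regex, c.recursive); // skip=0 recursive=0
}
```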
@@ -139,7 +139,9 @@ public:

        request.SetBucket(globbed_uri.bucket);
        request.SetPrefix(key_prefix);

        matcher = std::make_unique<re2::RE2>(makeRegexpPatternFromGlobs(globbed_uri.key));
        recursive = globbed_uri.key == "/**" ? true : false;
        fillInternalBufferAssumeLocked();
    }

@@ -197,7 +199,7 @@ private:
            for (const auto & row : result_batch)
            {
                const String & key = row.GetKey();
                if (re2::RE2::FullMatch(key, *matcher))
                if (recursive || re2::RE2::FullMatch(key, *matcher))
                {
                    String path = fs::path(globbed_uri.bucket) / key;
                    if (object_infos)
@@ -224,7 +226,7 @@ private:
            for (const auto & row : result_batch)
            {
                String key = row.GetKey();
                if (re2::RE2::FullMatch(key, *matcher))
                if (recursive || re2::RE2::FullMatch(key, *matcher))
                    buffer.emplace_back(std::move(key));
            }
        }
@@ -252,6 +254,7 @@ private:
    Aws::S3::Model::ListObjectsV2Request request;
    Aws::S3::Model::ListObjectsV2Outcome outcome;
    std::unique_ptr<re2::RE2> matcher;
    bool recursive{false};
    bool is_finished{false};
    std::unordered_map<String, S3::ObjectInfo> * object_infos;
    Strings * read_keys;

tests/.rgignore (new file, 1 line)
@@ -0,0 +1 @@
data_json
@@ -387,7 +387,7 @@ progress {
, stats {
  rows: 8
  blocks: 4
  allocated_bytes: 324
  allocated_bytes: 1092
  applied_limit: true
  rows_before_limit: 8
}

tests/integration/test_read_only_table/__init__.py (new file, 0 lines)
tests/integration/test_read_only_table/test.py (new file, 89 lines)
@@ -0,0 +1,89 @@
import time
import re
import logging

import pytest
from helpers.cluster import ClickHouseCluster
from helpers.test_tools import assert_eq_with_retry

NUM_TABLES = 10


def fill_nodes(nodes):
    for table_id in range(NUM_TABLES):
        for node in nodes:
            node.query(
                f"""
                CREATE TABLE test_table_{table_id}(a UInt64)
                ENGINE = ReplicatedMergeTree('/clickhouse/tables/test/replicated/{table_id}', '{node.name}') ORDER BY tuple();
                """
            )


cluster = ClickHouseCluster(__file__)
node1 = cluster.add_instance("node1", with_zookeeper=True)
node2 = cluster.add_instance("node2", with_zookeeper=True)
node3 = cluster.add_instance("node3", with_zookeeper=True)
nodes = [node1, node2, node3]


def sync_replicas(table):
    for node in nodes:
        node.query(f"SYSTEM SYNC REPLICA {table}")


@pytest.fixture(scope="module")
def start_cluster():
    try:
        cluster.start()

        fill_nodes(nodes)

        yield cluster

    except Exception as ex:
        print(ex)

    finally:
        cluster.shutdown()


def test_restart_zookeeper(start_cluster):

    for table_id in range(NUM_TABLES):
        node1.query(
            f"INSERT INTO test_table_{table_id} VALUES (1), (2), (3), (4), (5);"
        )

    logging.info("Inserted test data and initialized all tables")

    def get_zookeeper_which_node_connected_to(node):
        line = str(
            node.exec_in_container(
                [
                    "bash",
                    "-c",
                    "lsof -a -i4 -i6 -itcp -w | grep 2181 | grep ESTABLISHED",
                ],
                privileged=True,
                user="root",
            )
        ).strip()

        pattern = re.compile(r"zoo[0-9]+", re.IGNORECASE)
        result = pattern.findall(line)
        assert (
            len(result) == 1
        ), "ClickHouse must be connected only to one Zookeeper at a time"
        return result[0]

    node1_zk = get_zookeeper_which_node_connected_to(node1)

    # ClickHouse should +- immediately reconnect to another zookeeper node
    cluster.stop_zookeeper_nodes([node1_zk])
    time.sleep(5)

    for table_id in range(NUM_TABLES):
        node1.query(
            f"INSERT INTO test_table_{table_id} VALUES (6), (7), (8), (9), (10);"
        )
@@ -0,0 +1,60 @@
#!/usr/bin/env python3
import time

import pytest
from helpers.cluster import ClickHouseCluster


single_node_cluster = ClickHouseCluster(__file__)
small_node = single_node_cluster.add_instance(
    "small_node", main_configs=["configs/s3.xml"], with_minio=True
)


@pytest.fixture(scope="module")
def started_single_node_cluster():
    try:
        single_node_cluster.start()

        yield single_node_cluster
    finally:
        single_node_cluster.shutdown()


def test_move_and_s3_memory_usage(started_single_node_cluster):
    if small_node.is_built_with_sanitizer() or small_node.is_debug_build():
        pytest.skip("Disabled for debug and sanitizers. Too slow.")

    small_node.query(
        "CREATE TABLE s3_test_with_ttl (x UInt32, a String codec(NONE), b String codec(NONE), c String codec(NONE), d String codec(NONE), e String codec(NONE)) engine = MergeTree order by x partition by x SETTINGS storage_policy='s3_and_default'"
    )

    for _ in range(10):
        small_node.query(
            "insert into s3_test_with_ttl select 0, repeat('a', 100), repeat('b', 100), repeat('c', 100), repeat('d', 100), repeat('e', 100) from zeros(400000) settings max_block_size = 8192, max_insert_block_size=10000000, min_insert_block_size_rows=10000000"
        )

    # After this, we should have 5 columns per 10 * 100 * 400000 ~ 400 MB; total ~2G data in partition
    small_node.query("optimize table s3_test_with_ttl final")

    small_node.query("system flush logs")
    # Will take memory usage from metric_log.
    # It is easier than specifying a total memory limit (insert queries can hit this limit).
    small_node.query("truncate table system.metric_log")

    small_node.query(
        "alter table s3_test_with_ttl move partition 0 to volume 'external'",
        settings={"send_logs_level": "error"},
    )
    small_node.query("system flush logs")
    max_usage = small_node.query(
        "select max(CurrentMetric_MemoryTracking) from system.metric_log"
    )
    # 3G limit is a big one. However, we can hit it anyway with parallel s3 writes enabled.
    # Also actual value can be bigger because of memory drift.
    # Increase it a little bit if test fails.
    assert int(max_usage) < 3e9
    res = small_node.query(
        "select * from system.errors where last_error_message like '%Memory limit%' limit 1"
    )
    assert res == ""
@ -1,73 +0,0 @@
<test>

    <substitutions>
        <substitution>
            <name>string_json</name>
            <values>
                <value>'{"a": "hi", "b": "hello", "c": "hola", "d": "see you, bye, bye"}'</value>
            </values>
        </substitution>
        <substitution>
            <name>int_json</name>
            <values>
                <value>'{"a": 11, "b": 2222, "c": 33333333, "d": 4444444444444444}'</value>
            </values>
        </substitution>
        <substitution>
            <name>uuid_json</name>
            <values>
                <value>'{"a": "2d49dc6e-ddce-4cd0-afb8-790956df54c4", "b": "2d49dc6e-ddce-4cd0-afb8-790956df54c3", "c": "2d49dc6e-ddce-4cd0-afb8-790956df54c1", "d": "2d49dc6e-ddce-4cd0-afb8-790956df54c1"}'</value>
            </values>
        </substitution>
        <substitution>
            <name>low_cardinality_tuple_string</name>
            <values>
                <value>'Tuple(a LowCardinality(String), b LowCardinality(String), c LowCardinality(String), d LowCardinality(String) )'</value>
            </values>
        </substitution>
        <substitution>
            <name>low_cardinality_tuple_fixed_string</name>
            <values>
                <value>'Tuple(a LowCardinality(FixedString(20)), b LowCardinality(FixedString(20)), c LowCardinality(FixedString(20)), d LowCardinality(FixedString(20)) )'</value>
            </values>
        </substitution>
        <substitution>
            <name>low_cardinality_tuple_int8</name>
            <values>
                <value>'Tuple(a LowCardinality(Int8), b LowCardinality(Int8), c LowCardinality(Int8), d LowCardinality(Int8) )'</value>
            </values>
        </substitution>
        <substitution>
            <name>low_cardinality_tuple_int16</name>
            <values>
                <value>'Tuple(a LowCardinality(Int16), b LowCardinality(Int16), c LowCardinality(Int16), d LowCardinality(Int16) )'</value>
            </values>
        </substitution>
        <substitution>
            <name>low_cardinality_tuple_int32</name>
            <values>
                <value>'Tuple(a LowCardinality(Int32), b LowCardinality(Int32), c LowCardinality(Int32), d LowCardinality(Int32) )'</value>
            </values>
        </substitution>
        <substitution>
            <name>low_cardinality_tuple_int64</name>
            <values>
                <value>'Tuple(a LowCardinality(Int64), b LowCardinality(Int64), c LowCardinality(Int64), d LowCardinality(Int64) )'</value>
            </values>
        </substitution>
        <substitution>
            <name>low_cardinality_tuple_uuid</name>
            <values>
                <value>'Tuple(a LowCardinality(UUID), b LowCardinality(UUID), c LowCardinality(UUID), d LowCardinality(UUID) )'</value>
            </values>
        </substitution>
    </substitutions>

    <query>SELECT 'fixed_string_json' FROM zeros(500000) WHERE NOT ignore(JSONExtract(materialize({string_json}), {low_cardinality_tuple_fixed_string})) FORMAT Null</query>
    <query>SELECT 'string_json' FROM zeros(500000) WHERE NOT ignore(JSONExtract(materialize({string_json}), {low_cardinality_tuple_string})) FORMAT Null</query>
    <query>SELECT 'int8_json' FROM zeros(500000) WHERE NOT ignore(JSONExtract(materialize({int_json}), {low_cardinality_tuple_int8})) FORMAT Null</query>
    <query>SELECT 'int16_json' FROM zeros(500000) WHERE NOT ignore(JSONExtract(materialize({int_json}), {low_cardinality_tuple_int16})) FORMAT Null</query>
    <query>SELECT 'int32_json' FROM zeros(500000) WHERE NOT ignore(JSONExtract(materialize({int_json}), {low_cardinality_tuple_int32})) FORMAT Null</query>
    <query>SELECT 'int64_json' FROM zeros(500000) WHERE NOT ignore(JSONExtract(materialize({int_json}), {low_cardinality_tuple_int64})) FORMAT Null</query>
    <query>SELECT 'uuid_json' FROM zeros(500000) WHERE NOT ignore(JSONExtract(materialize({uuid_json}), {low_cardinality_tuple_uuid})) FORMAT Null</query>
</test>
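The deleted performance test above benchmarks `JSONExtract` into tuples of `LowCardinality` columns. A standalone sketch of the pattern it measures, with the JSON value and type string taken directly from the substitutions above:

```sql
-- Extract a typed tuple with LowCardinality elements from a JSON string.
SELECT JSONExtract(
    materialize('{"a": "hi", "b": "hello", "c": "hola", "d": "see you, bye, bye"}'),
    'Tuple(a LowCardinality(String), b LowCardinality(String), c LowCardinality(String), d LowCardinality(String))'
) AS t;
```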
@ -6,3 +6,8 @@
01234567-89ab-cdef-0123-456789abcdef 01234567-89ab-cdef-0123-456789abcdef 01234567-89ab-cdef-0123-456789abcdef
3f1ed72e-f7fe-4459-9cbe-95fe9298f845
1
-- UUID variants --
00112233445566778899AABBCCDDEEFF
33221100554477668899AABBCCDDEEFF
00112233-4455-6677-8899-aabbccddeeff
00112233-4455-6677-8899-aabbccddeeff
@ -11,3 +11,9 @@ with generateUUIDv4() as uuid,
identity(lower(hex(reverse(reinterpretAsString(uuid))))) as str,
reinterpretAsUUID(reverse(unhex(str))) as uuid2
select uuid = uuid2;

select '-- UUID variants --';
select hex(UUIDStringToNum('00112233-4455-6677-8899-aabbccddeeff', 1));
select hex(UUIDStringToNum('00112233-4455-6677-8899-aabbccddeeff', 2));
select UUIDNumToString(UUIDStringToNum('00112233-4455-6677-8899-aabbccddeeff', 1), 1);
select UUIDNumToString(UUIDStringToNum('00112233-4455-6677-8899-aabbccddeeff', 2), 2);
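Judging from the reference output above, variant 1 of `UUIDStringToNum` keeps the bytes in the same order as the text form, while variant 2 byte-swaps the first three groups (`00112233` becomes `33221100`, and so on); this reading is inferred from the expected output, not from separate documentation. A compact sketch of the behavior and the round trip:

```sql
-- Variant 1 preserves textual byte order; variant 2 uses a mixed-endian
-- layout (inferred from the reference output above). Both round-trip
-- through UUIDNumToString with the matching variant number.
SELECT
    hex(UUIDStringToNum('00112233-4455-6677-8899-aabbccddeeff', 1)) AS v1,
    hex(UUIDStringToNum('00112233-4455-6677-8899-aabbccddeeff', 2)) AS v2,
    UUIDNumToString(UUIDStringToNum('00112233-4455-6677-8899-aabbccddeeff', 2), 2) AS round_trip;
```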
@ -8,6 +8,32 @@ http
====HOST====
www.example.com









www.example.com
127.0.0.1
www.example.com
www.example.com
www.example.com
example.com
example.com
example.com
www.example.com
example.com
example.com
example.com
example.com
example.com
example.com



www.example.com
127.0.0.1
www.example.com
@ -8,6 +8,14 @@ SELECT protocol('//127.0.0.1:443/') AS Scheme;

SELECT '====HOST====';
SELECT domain('http://paul@www.example.com:80/') AS Host;
SELECT domain('user:password@example.com:8080') AS Host;
SELECT domain('http://user:password@example.com:8080') AS Host;
SELECT domain('http://user:password@example.com:8080/path?query=value#fragment') AS Host;
SELECT domain('newuser:@example.com') AS Host;
SELECT domain('http://:pass@example.com') AS Host;
SELECT domain(':newpass@example.com') AS Host;
SELECT domain('http://user:pass@example@.com') AS Host;
SELECT domain('http://user:pass:example.com') AS Host;
SELECT domain('http:/paul/example/com') AS Host;
SELECT domain('http://www.example.com?q=4') AS Host;
SELECT domain('http://127.0.0.1:443/') AS Host;
@ -17,6 +25,24 @@ SELECT domain('www.example.com') as Host;
SELECT domain('example.com') as Host;
SELECT domainWithoutWWW('//paul@www.example.com') AS Host;
SELECT domainWithoutWWW('http://paul@www.example.com:80/') AS Host;
SELECT domainRFC('http://paul@www.example.com:80/') AS Host;
SELECT domainRFC('user:password@example.com:8080') AS Host;
SELECT domainRFC('http://user:password@example.com:8080') AS Host;
SELECT domainRFC('http://user:password@example.com:8080/path?query=value#fragment') AS Host;
SELECT domainRFC('newuser:@example.com') AS Host;
SELECT domainRFC('http://:pass@example.com') AS Host;
SELECT domainRFC(':newpass@example.com') AS Host;
SELECT domainRFC('http://user:pass@example@.com') AS Host;
SELECT domainRFC('http://user:pass:example.com') AS Host;
SELECT domainRFC('http:/paul/example/com') AS Host;
SELECT domainRFC('http://www.example.com?q=4') AS Host;
SELECT domainRFC('http://127.0.0.1:443/') AS Host;
SELECT domainRFC('//www.example.com') AS Host;
SELECT domainRFC('//paul@www.example.com') AS Host;
SELECT domainRFC('www.example.com') as Host;
SELECT domainRFC('example.com') as Host;
SELECT domainWithoutWWWRFC('//paul@www.example.com') AS Host;
SELECT domainWithoutWWWRFC('http://paul@www.example.com:80/') AS Host;

SELECT '====NETLOC====';
SELECT netloc('http://paul@www.example.com:80/') AS Netloc;
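The non-RFC `domain` returns an empty string for most of the newly added URL shapes (credentials without a scheme, a stray `@`, a bare `:` and so on), which is what the blank lines in the reference output above reflect; the new `domainRFC`/`domainWithoutWWWRFC` variants parse the same inputs by RFC 3986 rules. A quick side-by-side sketch, based on my reading of the reference output:

```sql
-- The lenient parser rejects userinfo containing ':'; the RFC parser accepts it.
SELECT
    domain('user:password@example.com:8080')    AS lenient,  -- expected: empty string
    domainRFC('user:password@example.com:8080') AS rfc;      -- expected: example.com
```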
@ -35,7 +35,7 @@ Check total_bytes/total_rows for StripeLog
113 1
Check total_bytes/total_rows for Memory
0 0
64 1
256 1
Check total_bytes/total_rows for Buffer
0 0
256 50
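These figures look like the output of a check against `system.tables`, which exposes `total_bytes` and `total_rows` per table. A hedged sketch of such a check; the table name here is hypothetical, not from the test:

```sql
-- total_bytes/total_rows are NULL for engines that cannot report them,
-- "0 0" for an empty table, and grow as rows are inserted.
SELECT total_bytes, total_rows
FROM system.tables
WHERE database = currentDatabase() AND name = 'memory_test';
```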
@ -2,5 +2,5 @@
< X-ClickHouse-Progress: {"read_rows":"65505","read_bytes":"524040","written_rows":"0","written_bytes":"0","total_rows_to_read":"100000","result_rows":"0","result_bytes":"0"}
< X-ClickHouse-Progress: {"read_rows":"131010","read_bytes":"1048080","written_rows":"0","written_bytes":"0","total_rows_to_read":"100000","result_rows":"0","result_bytes":"0"}
< X-ClickHouse-Progress: {"read_rows":"131011","read_bytes":"1048081","written_rows":"0","written_bytes":"0","total_rows_to_read":"100000","result_rows":"0","result_bytes":"0"}
< X-ClickHouse-Progress: {"read_rows":"131011","read_bytes":"1048081","written_rows":"0","written_bytes":"0","total_rows_to_read":"100000","result_rows":"1","result_bytes":"80"}
< X-ClickHouse-Summary: {"read_rows":"131011","read_bytes":"1048081","written_rows":"0","written_bytes":"0","total_rows_to_read":"100000","result_rows":"1","result_bytes":"80"}
< X-ClickHouse-Progress: {"read_rows":"131011","read_bytes":"1048081","written_rows":"0","written_bytes":"0","total_rows_to_read":"100000","result_rows":"1","result_bytes":"272"}
< X-ClickHouse-Summary: {"read_rows":"131011","read_bytes":"1048081","written_rows":"0","written_bytes":"0","total_rows_to_read":"100000","result_rows":"1","result_bytes":"272"}
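For context, `X-ClickHouse-Progress` headers like these are emitted by the HTTP interface when a query runs with `send_progress_in_http_headers=1`, and the trailing `X-ClickHouse-Summary` repeats the final totals. A rough sketch of a query shape that would produce such progress; the URL parameters live in the comment because they belong to the HTTP interface, and the exact row counts here are illustrative, not taken from the test:

```sql
-- Assumption: run via the HTTP interface, e.g.
--   /?send_progress_in_http_headers=1&http_headers_progress_interval_ms=100&query=...
-- so that progress headers are streamed while the query reads.
SELECT sum(number) FROM numbers(100000);
```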
@ -1,6 +1,6 @@
-- Tags: no-parallel

SET use_analyzer = 1;
SET allow_experimental_analyzer = 1;

-- Empty from section

@ -1,4 +1,4 @@
SET use_analyzer = 1;
SET allow_experimental_analyzer = 1;

DESCRIBE (SELECT 1);
SELECT 1;

@ -1,6 +1,6 @@
-- Tags: no-parallel

SET use_analyzer = 1;
SET allow_experimental_analyzer = 1;

SELECT 'Matchers without FROM section';

@ -1,4 +1,4 @@
SET use_analyzer = 1;
SET allow_experimental_analyzer = 1;

DESCRIBE (SELECT 1 + 1);
SELECT 1 + 1;

@ -1,4 +1,4 @@
SET use_analyzer = 1;
SET allow_experimental_analyzer = 1;

SELECT 'Aliases to constants';

@ -1,4 +1,4 @@
SET use_analyzer = 1;
SET allow_experimental_analyzer = 1;

SELECT 'Constant tuple';
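These hunks all make the same mechanical change: the setting that enables the new query analyzer is renamed from `use_analyzer` to `allow_experimental_analyzer`, so every analyzer test updates its `SET` line. A minimal sketch of an updated test after the rename:

```sql
-- New setting name, as used by the updated tests above:
SET allow_experimental_analyzer = 1;
-- The old name presumably fails with an unknown-setting error after this
-- change (assumption based on this diff; not verified against a server):
-- SET use_analyzer = 1;
SELECT 1 + 1;
```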
Some files were not shown because too many files have changed in this diff.