Merge branch 'master' of https://github.com/ClickHouse/ClickHouse into col-identifier-as-col-number

kssenii 2021-08-16 14:50:37 +00:00
commit 7be06f894e
199 changed files with 3253 additions and 925 deletions

@@ -2,7 +2,7 @@
name: Bug report
about: Wrong behaviour (visible to users) in official ClickHouse release.
title: ''
labels: bug
labels: 'potential bug'
assignees: ''
---

@@ -2,15 +2,15 @@
#### New Features
* Collect common system metrics (in `system.asynchronous_metrics` and `system.asynchronous_metric_log`) on CPU usage, disk usage, memory usage, IO, network, files, load average, CPU frequencies, thermal sensors, EDAC counters, system uptime; also added metrics about the scheduling jitter and the time spent collecting the metrics. It works similar to `atop` in ClickHouse and allows access to monitoring data even if you have no additional tools installed. Close [#9430](https://github.com/ClickHouse/ClickHouse/issues/9430). [#24416](https://github.com/ClickHouse/ClickHouse/pull/24416) ([Yegor Levankov](https://github.com/elevankoff)).
* Add support for a part of SQL/JSON standard. [#24148](https://github.com/ClickHouse/ClickHouse/pull/24148) ([l1tsolaiki](https://github.com/l1tsolaiki), [Kseniia Sumarokova](https://github.com/kssenii)).
* Collect common system metrics (in `system.asynchronous_metrics` and `system.asynchronous_metric_log`) on CPU usage, disk usage, memory usage, IO, network, files, load average, CPU frequencies, thermal sensors, EDAC counters, system uptime; also added metrics about the scheduling jitter and the time spent collecting the metrics. It works similar to `atop` in ClickHouse and allows access to monitoring data even if you have no additional tools installed. Close [#9430](https://github.com/ClickHouse/ClickHouse/issues/9430). [#24416](https://github.com/ClickHouse/ClickHouse/pull/24416) ([alexey-milovidov](https://github.com/alexey-milovidov), [Yegor Levankov](https://github.com/elevankoff)).
* Add MaterializedPostgreSQL table engine and database engine. This database engine allows replicating a whole database or any subset of database tables. [#20470](https://github.com/ClickHouse/ClickHouse/pull/20470) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Add new functions `leftPad()`, `rightPad()`, `leftPadUTF8()`, `rightPadUTF8()`. [#26075](https://github.com/ClickHouse/ClickHouse/pull/26075) ([Vitaly Baranov](https://github.com/vitlibar)).
* Add the `FIRST` keyword to the `ADD INDEX` command to be able to add the index at the beginning of the indices list. [#25904](https://github.com/ClickHouse/ClickHouse/pull/25904) ([xjewer](https://github.com/xjewer)).
* Introduce `system.data_skipping_indices` table containing information about existing data skipping indices. Close [#7659](https://github.com/ClickHouse/ClickHouse/issues/7659). [#25693](https://github.com/ClickHouse/ClickHouse/pull/25693) ([Dmitry Novik](https://github.com/novikd)).
* Add `bin`/`unbin` functions. [#25609](https://github.com/ClickHouse/ClickHouse/pull/25609) ([zhaoyu](https://github.com/zxc111)).
* Support `Map` and `(U)Int128`, `U(Int256) types in `mapAdd` and `mapSubtract` functions. [#25596](https://github.com/ClickHouse/ClickHouse/pull/25596) ([Ildus Kurbangaliev](https://github.com/ildus)).
* Support `Map` and `UInt128`, `Int128`, `UInt256`, `Int256` types in `mapAdd` and `mapSubtract` functions. [#25596](https://github.com/ClickHouse/ClickHouse/pull/25596) ([Ildus Kurbangaliev](https://github.com/ildus)).
* Support `DISTINCT ON (columns)` expression, close [#25404](https://github.com/ClickHouse/ClickHouse/issues/25404). [#25589](https://github.com/ClickHouse/ClickHouse/pull/25589) ([Zijie Lu](https://github.com/TszKitLo40)).
* Add support for a part of SQLJSON standard. [#24148](https://github.com/ClickHouse/ClickHouse/pull/24148) ([l1tsolaiki](https://github.com/l1tsolaiki)).
* Add MaterializedPostgreSQL table engine and database engine. This database engine allows replicating a whole database or any subset of database tables. [#20470](https://github.com/ClickHouse/ClickHouse/pull/20470) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Add an ability to reset a custom setting to default and remove it from the table's metadata. It allows rolling back the change without knowing the system/config's default. Closes [#14449](https://github.com/ClickHouse/ClickHouse/issues/14449). [#17769](https://github.com/ClickHouse/ClickHouse/pull/17769) ([xjewer](https://github.com/xjewer)).
* Render pipelines as graphs in Web UI if `EXPLAIN PIPELINE graph = 1` query is submitted. [#26067](https://github.com/ClickHouse/ClickHouse/pull/26067) ([alexey-milovidov](https://github.com/alexey-milovidov)).
@@ -21,11 +21,11 @@
#### Improvements
* Use `Map` data type for system logs tables (`system.query_log`, `system.query_thread_log`, `system.processes`, `system.opentelemetry_span_log`). These tables will be auto-created with new data types. Virtual columns are created to support old queries. Closes [#18698](https://github.com/ClickHouse/ClickHouse/issues/18698). [#23934](https://github.com/ClickHouse/ClickHouse/pull/23934), [#25773](https://github.com/ClickHouse/ClickHouse/pull/25773) ([hexiaoting](https://github.com/hexiaoting), [sundy-li](https://github.com/sundy-li)).
* Use `Map` data type for system logs tables (`system.query_log`, `system.query_thread_log`, `system.processes`, `system.opentelemetry_span_log`). These tables will be auto-created with new data types. Virtual columns are created to support old queries. Closes [#18698](https://github.com/ClickHouse/ClickHouse/issues/18698). [#23934](https://github.com/ClickHouse/ClickHouse/pull/23934), [#25773](https://github.com/ClickHouse/ClickHouse/pull/25773) ([hexiaoting](https://github.com/hexiaoting), [sundy-li](https://github.com/sundy-li), [Maksim Kita](https://github.com/kitaisreal)).
* For a dictionary with a complex key containing only one attribute, allow not wrapping the key expression in tuple for functions `dictGet`, `dictHas`. [#26130](https://github.com/ClickHouse/ClickHouse/pull/26130) ([Maksim Kita](https://github.com/kitaisreal)).
* Implement function `bin`/`hex` from `AggregateFunction` states. [#26094](https://github.com/ClickHouse/ClickHouse/pull/26094) ([zhaoyu](https://github.com/zxc111)).
* Support arguments of `UUID` type for `empty` and `notEmpty` functions. `UUID` is empty if it is all zeros (nil UUID). Closes [#3446](https://github.com/ClickHouse/ClickHouse/issues/3446). [#25974](https://github.com/ClickHouse/ClickHouse/pull/25974) ([zhaoyu](https://github.com/zxc111)).
* Fix error with query `SET SQL_SELECT_LIMIT` in MySQL protocol. Closes [#17115](https://github.com/ClickHouse/ClickHouse/issues/17115). [#25972](https://github.com/ClickHouse/ClickHouse/pull/25972) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Add support for `SET SQL_SELECT_LIMIT` in MySQL protocol. Closes [#17115](https://github.com/ClickHouse/ClickHouse/issues/17115). [#25972](https://github.com/ClickHouse/ClickHouse/pull/25972) ([Kseniia Sumarokova](https://github.com/kssenii)).
* More instrumentation for network interaction: add counters for recv/send bytes; add gauges for recvs/sends. Added missing documentation. Close [#5897](https://github.com/ClickHouse/ClickHouse/issues/5897). [#25962](https://github.com/ClickHouse/ClickHouse/pull/25962) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Add setting `optimize_move_to_prewhere_if_final`. If query has `FINAL`, the optimization `move_to_prewhere` will be enabled only if both `optimize_move_to_prewhere` and `optimize_move_to_prewhere_if_final` are enabled. Closes [#8684](https://github.com/ClickHouse/ClickHouse/issues/8684). [#25940](https://github.com/ClickHouse/ClickHouse/pull/25940) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Allow complex quoted identifiers of JOINed tables. Close [#17861](https://github.com/ClickHouse/ClickHouse/issues/17861). [#25924](https://github.com/ClickHouse/ClickHouse/pull/25924) ([alexey-milovidov](https://github.com/alexey-milovidov)).
@@ -37,7 +37,7 @@
* Support for queries with a column named `"null"` (it must be specified in back-ticks or double quotes) and `ON CLUSTER`. Closes [#24035](https://github.com/ClickHouse/ClickHouse/issues/24035). [#25907](https://github.com/ClickHouse/ClickHouse/pull/25907) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Support `LowCardinality`, `Decimal`, and `UUID` for `JSONExtract`. Closes [#24606](https://github.com/ClickHouse/ClickHouse/issues/24606). [#25900](https://github.com/ClickHouse/ClickHouse/pull/25900) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Convert history file from `readline` format to `replxx` format. [#25888](https://github.com/ClickHouse/ClickHouse/pull/25888) ([Azat Khuzhin](https://github.com/azat)).
* Fix bug which can lead to intersecting parts after `DROP PART` or background deletion of an empty part. [#25884](https://github.com/ClickHouse/ClickHouse/pull/25884) ([alesapin](https://github.com/alesapin)).
* Fix an issue which can lead to intersecting parts after `DROP PART` or background deletion of an empty part. [#25884](https://github.com/ClickHouse/ClickHouse/pull/25884) ([alesapin](https://github.com/alesapin)).
* Better handling of lost parts for `ReplicatedMergeTree` tables. Fixes rare inconsistencies in `ReplicationQueue`. Fixes [#10368](https://github.com/ClickHouse/ClickHouse/issues/10368). [#25820](https://github.com/ClickHouse/ClickHouse/pull/25820) ([alesapin](https://github.com/alesapin)).
* Allow starting clickhouse-client with unreadable working directory. [#25817](https://github.com/ClickHouse/ClickHouse/pull/25817) ([ianton-ru](https://github.com/ianton-ru)).
* Fix "No available columns" error for `Merge` storage. [#25801](https://github.com/ClickHouse/ClickHouse/pull/25801) ([Azat Khuzhin](https://github.com/azat)).
@@ -48,7 +48,7 @@
* Support materialized and aliased columns in JOIN, close [#13274](https://github.com/ClickHouse/ClickHouse/issues/13274). [#25634](https://github.com/ClickHouse/ClickHouse/pull/25634) ([Vladimir C](https://github.com/vdimir)).
* Fix possible logical race condition between `ALTER TABLE ... DETACH` and background merges. [#25605](https://github.com/ClickHouse/ClickHouse/pull/25605) ([Azat Khuzhin](https://github.com/azat)).
* Make `NetworkReceiveElapsedMicroseconds` metric to correctly include the time spent waiting for data from the client to `INSERT`. Close [#9958](https://github.com/ClickHouse/ClickHouse/issues/9958). [#25602](https://github.com/ClickHouse/ClickHouse/pull/25602) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Support `TRUNCATE TABLE` for StorageS3 and StorageHDFS. Close [#25530](https://github.com/ClickHouse/ClickHouse/issues/25530). [#25550](https://github.com/ClickHouse/ClickHouse/pull/25550) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Support `TRUNCATE TABLE` for S3 and HDFS. Close [#25530](https://github.com/ClickHouse/ClickHouse/issues/25530). [#25550](https://github.com/ClickHouse/ClickHouse/pull/25550) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Support for dynamic reloading of config to change number of threads in pool for background jobs execution (merges, mutations, fetches). [#25548](https://github.com/ClickHouse/ClickHouse/pull/25548) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* Allow extracting of non-string element as string using `JSONExtract`. This is for [#25414](https://github.com/ClickHouse/ClickHouse/issues/25414). [#25452](https://github.com/ClickHouse/ClickHouse/pull/25452) ([Amos Bird](https://github.com/amosbird)).
* Support regular expression in `Database` argument for `StorageMerge`. Close [#776](https://github.com/ClickHouse/ClickHouse/issues/776). [#25064](https://github.com/ClickHouse/ClickHouse/pull/25064) ([flynn](https://github.com/ucasfl)).
@@ -60,13 +60,13 @@
* Fix incorrect `SET ROLE` in some cases. [#26707](https://github.com/ClickHouse/ClickHouse/pull/26707) ([Vitaly Baranov](https://github.com/vitlibar)).
* Fix potential `nullptr` dereference in window functions. Fix [#25276](https://github.com/ClickHouse/ClickHouse/issues/25276). [#26668](https://github.com/ClickHouse/ClickHouse/pull/26668) ([Alexander Kuzmenkov](https://github.com/akuzm)).
* Fix incorrect function names of `groupBitmapAnd/Or/Xor`. Fix [#26557](https://github.com/ClickHouse/ClickHouse/pull/26557) ([Amos Bird](https://github.com/amosbird)).
* Fix crash in rabbitmq shutdown in case rabbitmq setup was not started. Closes [#26504](https://github.com/ClickHouse/ClickHouse/issues/26504). [#26529](https://github.com/ClickHouse/ClickHouse/pull/26529) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Fix crash in RabbitMQ shutdown in case RabbitMQ setup was not started. Closes [#26504](https://github.com/ClickHouse/ClickHouse/issues/26504). [#26529](https://github.com/ClickHouse/ClickHouse/pull/26529) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Fix issues with `CREATE DICTIONARY` query if dictionary name or database name was quoted. Closes [#26491](https://github.com/ClickHouse/ClickHouse/issues/26491). [#26508](https://github.com/ClickHouse/ClickHouse/pull/26508) ([Maksim Kita](https://github.com/kitaisreal)).
* Fix broken name resolution after rewriting column aliases. Fix [#26432](https://github.com/ClickHouse/ClickHouse/issues/26432). [#26475](https://github.com/ClickHouse/ClickHouse/pull/26475) ([Amos Bird](https://github.com/amosbird)).
* Fix infinite non-joined block stream in `partial_merge_join` close [#26325](https://github.com/ClickHouse/ClickHouse/issues/26325). [#26374](https://github.com/ClickHouse/ClickHouse/pull/26374) ([Vladimir C](https://github.com/vdimir)).
* Fix possible crash when login as dropped user. Fix [#26073](https://github.com/ClickHouse/ClickHouse/issues/26073). [#26363](https://github.com/ClickHouse/ClickHouse/pull/26363) ([Vitaly Baranov](https://github.com/vitlibar)).
* Fix `optimize_distributed_group_by_sharding_key` for multiple columns (leads to incorrect result w/ `optimize_skip_unused_shards=1`/`allow_nondeterministic_optimize_skip_unused_shards=1` and multiple columns in sharding key expression). [#26353](https://github.com/ClickHouse/ClickHouse/pull/26353) ([Azat Khuzhin](https://github.com/azat)).
* `CAST` from `Date` to `DateTime` (or `DateTime64`) was not using the timezone of the `DateTime` type. It can also affect the comparison between `Date` and `DateTime`. Inference of the common type for `Date` and `DateTime` also was not using the corresponding timezone. It affected the results of function `if` and array construction. Closes [#24128](https://github.com/ClickHouse/ClickHouse/issues/24128). [#24129](https://github.com/ClickHouse/ClickHouse/pull/24129) ([Maksim Kita](https://github.com/kitaisreal)).
* `CAST` from `Date` to `DateTime` (or `DateTime64`) was not using the timezone of the `DateTime` type. It can also affect the comparison between `Date` and `DateTime`. Inference of the common type for `Date` and `DateTime` also was not using the corresponding timezone. It affected the results of function `if` and array construction. Closes [#24128](https://github.com/ClickHouse/ClickHouse/issues/24128). [#24129](https://github.com/ClickHouse/ClickHouse/pull/24129) ([Maksim Kita](https://github.com/kitaisreal)).
* Fixed rare bug in lost replica recovery that may cause replicas to diverge. [#26321](https://github.com/ClickHouse/ClickHouse/pull/26321) ([tavplubix](https://github.com/tavplubix)).
* Fix zstd decompression in case there are escape sequences at the end of internal buffer. Closes [#26013](https://github.com/ClickHouse/ClickHouse/issues/26013). [#26314](https://github.com/ClickHouse/ClickHouse/pull/26314) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Fix logical error on join with totals, close [#26017](https://github.com/ClickHouse/ClickHouse/issues/26017). [#26250](https://github.com/ClickHouse/ClickHouse/pull/26250) ([Vladimir C](https://github.com/vdimir)).
@@ -75,7 +75,7 @@
* Fix possible crash in `pointInPolygon` if the setting `validate_polygons` is turned off. [#26113](https://github.com/ClickHouse/ClickHouse/pull/26113) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fix throwing exception when iterate over non-existing remote directory. [#26087](https://github.com/ClickHouse/ClickHouse/pull/26087) ([ianton-ru](https://github.com/ianton-ru)).
* Fix rare server crash because of `abort` in ZooKeeper client. Fixes [#25813](https://github.com/ClickHouse/ClickHouse/issues/25813). [#26079](https://github.com/ClickHouse/ClickHouse/pull/26079) ([alesapin](https://github.com/alesapin)).
* Fix wrong thread estimation for right subquery join in some cases. Close [#24075](https://github.com/ClickHouse/ClickHouse/issues/24075). [#26052](https://github.com/ClickHouse/ClickHouse/pull/26052) ([Vladimir C](https://github.com/vdimir)).
* Fix wrong thread count estimation for right subquery join in some cases. Close [#24075](https://github.com/ClickHouse/ClickHouse/issues/24075). [#26052](https://github.com/ClickHouse/ClickHouse/pull/26052) ([Vladimir C](https://github.com/vdimir)).
* Fixed incorrect `sequence_id` in MySQL protocol packets that ClickHouse sends on exception during query execution. It might cause MySQL client to reset connection to ClickHouse server. Fixes [#21184](https://github.com/ClickHouse/ClickHouse/issues/21184). [#26051](https://github.com/ClickHouse/ClickHouse/pull/26051) ([tavplubix](https://github.com/tavplubix)).
* Fix possible mismatched header when using normal projection with `PREWHERE`. Fix [#26020](https://github.com/ClickHouse/ClickHouse/issues/26020). [#26038](https://github.com/ClickHouse/ClickHouse/pull/26038) ([Amos Bird](https://github.com/amosbird)).
* Fix formatting of type `Map` with integer keys to `JSON`. [#25982](https://github.com/ClickHouse/ClickHouse/pull/25982) ([Anton Popov](https://github.com/CurtizJ)).
@@ -94,20 +94,8 @@
* Fix `ALTER MODIFY COLUMN` of columns, which participates in TTL expressions. [#25554](https://github.com/ClickHouse/ClickHouse/pull/25554) ([Anton Popov](https://github.com/CurtizJ)).
* Fix assertion in `PREWHERE` with non-UInt8 type, close [#19589](https://github.com/ClickHouse/ClickHouse/issues/19589). [#25484](https://github.com/ClickHouse/ClickHouse/pull/25484) ([Vladimir C](https://github.com/vdimir)).
* Fix some fuzzed msan crash. Fixes [#22517](https://github.com/ClickHouse/ClickHouse/issues/22517). [#26428](https://github.com/ClickHouse/ClickHouse/pull/26428) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix empty history file conversion. [#26589](https://github.com/ClickHouse/ClickHouse/pull/26589) ([Azat Khuzhin](https://github.com/azat)).
* Update `chown` cmd check in `clickhouse-server` docker entrypoint. It fixes error 'cluster pod restart failed (or timeout)' on kubernetes. [#26545](https://github.com/ClickHouse/ClickHouse/pull/26545) ([Ky Li](https://github.com/Kylinrix)).
#### Build/Testing/Packaging Improvements
* Disabling TestFlows LDAP module due to test fails. [#26065](https://github.com/ClickHouse/ClickHouse/pull/26065) ([vzakaznikov](https://github.com/vzakaznikov)).
* Enabling all TestFlows modules and fixing some tests. [#26011](https://github.com/ClickHouse/ClickHouse/pull/26011) ([vzakaznikov](https://github.com/vzakaznikov)).
* Add new tests for checking access rights for columns used in filters (`WHERE` / `PREWHERE` / row policy) of the `SELECT` statement after changes in [#24405](https://github.com/ClickHouse/ClickHouse/pull/24405). [#25619](https://github.com/ClickHouse/ClickHouse/pull/25619) ([Vitaly Baranov](https://github.com/vitlibar)).
#### Other
* Add `clickhouse-keeper-converter` tool which allows converting zookeeper logs and snapshots into `clickhouse-keeper` snapshot format. [#25428](https://github.com/ClickHouse/ClickHouse/pull/25428) ([alesapin](https://github.com/alesapin)).
### ClickHouse release v21.7, 2021-07-09
@@ -1294,13 +1282,6 @@
* PODArray: Avoid call to memcpy with (nullptr, 0) arguments (Fix UBSan report). This fixes [#18525](https://github.com/ClickHouse/ClickHouse/issues/18525). [#18526](https://github.com/ClickHouse/ClickHouse/pull/18526) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Minor improvement for path concatenation of zookeeper paths inside DDLWorker. [#17767](https://github.com/ClickHouse/ClickHouse/pull/17767) ([Bharat Nallan](https://github.com/bharatnc)).
* Allow to reload symbols from debug file. This PR also fixes a build-id issue. [#17637](https://github.com/ClickHouse/ClickHouse/pull/17637) ([Amos Bird](https://github.com/amosbird)).
* TestFlows: fixes to LDAP tests that fail due to slow test execution. [#18790](https://github.com/ClickHouse/ClickHouse/pull/18790) ([vzakaznikov](https://github.com/vzakaznikov)).
* TestFlows: Merging requirements for AES encryption functions. Updating aes_encryption tests to use new requirements. Updating TestFlows version to 1.6.72. [#18221](https://github.com/ClickHouse/ClickHouse/pull/18221) ([vzakaznikov](https://github.com/vzakaznikov)).
* TestFlows: Updating TestFlows version to the latest 1.6.72. Re-generating requirements.py. [#18208](https://github.com/ClickHouse/ClickHouse/pull/18208) ([vzakaznikov](https://github.com/vzakaznikov)).
* TestFlows: Updating TestFlows README.md to include "How To Debug Why Test Failed" section. [#17808](https://github.com/ClickHouse/ClickHouse/pull/17808) ([vzakaznikov](https://github.com/vzakaznikov)).
* TestFlows: tests for RBAC [ACCESS MANAGEMENT](https://clickhouse.tech/docs/en/sql-reference/statements/grant/#grant-access-management) privileges. [#17804](https://github.com/ClickHouse/ClickHouse/pull/17804) ([MyroTk](https://github.com/MyroTk)).
* TestFlows: RBAC tests for SHOW, TRUNCATE, KILL, and OPTIMIZE. - Updates to old tests. - Resolved comments from #https://github.com/ClickHouse/ClickHouse/pull/16977. [#17657](https://github.com/ClickHouse/ClickHouse/pull/17657) ([MyroTk](https://github.com/MyroTk)).
* TestFlows: Added RBAC tests for `ATTACH`, `CREATE`, `DROP`, and `DETACH`. [#16977](https://github.com/ClickHouse/ClickHouse/pull/16977) ([MyroTk](https://github.com/MyroTk)).
## [Changelog for 2020](https://github.com/ClickHouse/ClickHouse/blob/master/docs/en/whats-new/changelog/2020.md)

@@ -395,9 +395,10 @@ endif ()
# Turns on all external libs like s3, kafka, ODBC, ...
option(ENABLE_LIBRARIES "Enable all external libraries by default" ON)
# We recommend avoiding this mode for production builds because we can't guarantee all needed libraries exist in your
# system.
# We recommend avoiding this mode for production builds because we can't guarantee
# all needed libraries exist in your system.
# This mode exists for enthusiastic developers who are searching for trouble.
# The whole idea of using unknown version of libraries from the OS distribution is deeply flawed.
# Useful for maintainers of OS packages.
option (UNBUNDLED "Use system libraries instead of ones in contrib/" OFF)

@@ -9,10 +9,6 @@ if (GLIBC_COMPATIBILITY)
check_include_file("sys/random.h" HAVE_SYS_RANDOM_H)
if(COMPILER_CLANG)
set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -Wno-builtin-requires-header")
endif()
add_headers_and_sources(glibc_compatibility .)
add_headers_and_sources(glibc_compatibility musl)
if (ARCH_AARCH64)
@@ -35,11 +31,9 @@ if (GLIBC_COMPATIBILITY)
add_library(glibc-compatibility STATIC ${glibc_compatibility_sources})
if (COMPILER_CLANG)
target_compile_options(glibc-compatibility PRIVATE -Wno-unused-command-line-argument)
elseif (COMPILER_GCC)
target_compile_options(glibc-compatibility PRIVATE -Wno-unused-but-set-variable)
endif ()
target_no_warning(glibc-compatibility unused-command-line-argument)
target_no_warning(glibc-compatibility unused-but-set-variable)
target_no_warning(glibc-compatibility builtin-requires-header)
target_include_directories(glibc-compatibility PRIVATE libcxxabi ${musl_arch_include_dir})

@@ -296,7 +296,7 @@ void Pool::initialize()
Pool::Connection * Pool::allocConnection(bool dont_throw_if_failed_first_time)
{
std::unique_ptr<Connection> conn_ptr{new Connection};
std::unique_ptr conn_ptr = std::make_unique<Connection>();
try
{

@@ -27,3 +27,22 @@ endmacro ()
macro (no_warning flag)
add_warning(no-${flag})
endmacro ()
# The same but only for specified target.
macro (target_add_warning target flag)
string (REPLACE "-" "_" underscored_flag ${flag})
string (REPLACE "+" "x" underscored_flag ${underscored_flag})
check_cxx_compiler_flag("-W${flag}" SUPPORTS_CXXFLAG_${underscored_flag})
if (SUPPORTS_CXXFLAG_${underscored_flag})
target_compile_options (${target} PRIVATE "-W${flag}")
else ()
message (WARNING "Flag -W${flag} is unsupported")
endif ()
endmacro ()
macro (target_no_warning target flag)
target_add_warning(${target} no-${flag})
endmacro ()

@@ -6,7 +6,7 @@ SET(VERSION_REVISION 54454)
SET(VERSION_MAJOR 21)
SET(VERSION_MINOR 9)
SET(VERSION_PATCH 1)
SET(VERSION_GITHASH f48c5af90c2ad51955d1ee3b6b05d006b03e4238)
SET(VERSION_DESCRIBE v21.9.1.1-prestable)
SET(VERSION_STRING 21.9.1.1)
SET(VERSION_GITHASH f063e44131a048ba2d9af8075f03700fd5ec3e69)
SET(VERSION_DESCRIBE v21.9.1.7770-prestable)
SET(VERSION_STRING 21.9.1.7770)
# end of autochange

@@ -119,12 +119,9 @@ set(ORC_SRCS
"${ORC_SOURCE_SRC_DIR}/ColumnWriter.cc"
"${ORC_SOURCE_SRC_DIR}/Common.cc"
"${ORC_SOURCE_SRC_DIR}/Compression.cc"
"${ORC_SOURCE_SRC_DIR}/Exceptions.cc"
"${ORC_SOURCE_SRC_DIR}/Int128.cc"
"${ORC_SOURCE_SRC_DIR}/LzoDecompressor.cc"
"${ORC_SOURCE_SRC_DIR}/MemoryPool.cc"
"${ORC_SOURCE_SRC_DIR}/OrcFile.cc"
"${ORC_SOURCE_SRC_DIR}/Reader.cc"
"${ORC_SOURCE_SRC_DIR}/RLE.cc"
"${ORC_SOURCE_SRC_DIR}/RLEv1.cc"
"${ORC_SOURCE_SRC_DIR}/RLEv2.cc"

@@ -27,8 +27,10 @@ target_include_directories(roaring SYSTEM BEFORE PUBLIC "${LIBRARY_DIR}/cpp")
# We redirect malloc/free family of functions to different functions that will track memory in ClickHouse.
# Also note that we exploit implicit function declarations.
# Also it is disabled on Mac OS because it fails).
target_compile_definitions(roaring PRIVATE
if (NOT OS_DARWIN)
target_compile_definitions(roaring PRIVATE
-Dmalloc=clickhouse_malloc
-Dcalloc=clickhouse_calloc
-Drealloc=clickhouse_realloc
@@ -36,4 +38,5 @@ target_compile_definitions(roaring PRIVATE
-Dfree=clickhouse_free
-Dposix_memalign=clickhouse_posix_memalign)
target_link_libraries(roaring PUBLIC clickhouse_common_io)
target_link_libraries(roaring PUBLIC clickhouse_common_io)
endif ()

@@ -2,9 +2,5 @@ set (SRCS
src/metrohash64.cpp
src/metrohash128.cpp
)
if (HAVE_SSE42) # Not used. Pretty easy to port.
list (APPEND SRCS src/metrohash128crc.cpp)
endif ()
add_library(metrohash ${SRCS})
target_include_directories(metrohash PUBLIC src)

@@ -151,8 +151,14 @@ def parse_env_variables(build_type, compiler, sanitizer, package_type, image_typ
cmake_flags.append('-DENABLE_TESTS=1')
cmake_flags.append('-DUSE_GTEST=1')
# "Unbundled" build is not suitable for any production usage.
# But it is occasionally used by some developers.
# The whole idea of using unknown version of libraries from the OS distribution is deeply flawed.
# We wish these developers good luck.
if unbundled:
cmake_flags.append('-DUNBUNDLED=1 -DUSE_INTERNAL_RDKAFKA_LIBRARY=1 -DENABLE_ARROW=0 -DENABLE_AVRO=0 -DENABLE_ORC=0 -DENABLE_PARQUET=0')
# We also disable all CPU features except basic x86_64.
# It is only slightly related to "unbundled" build, but it is a good place to test if code compiles without these instruction sets.
cmake_flags.append('-DUNBUNDLED=1 -DUSE_INTERNAL_RDKAFKA_LIBRARY=1 -DENABLE_ARROW=0 -DENABLE_AVRO=0 -DENABLE_ORC=0 -DENABLE_PARQUET=0 -DENABLE_SSSE3=0 -DENABLE_SSE41=0 -DENABLE_SSE42=0 -DENABLE_PCLMULQDQ=0 -DENABLE_POPCNT=0 -DENABLE_AVX=0 -DENABLE_AVX2=0')
if split_binary:
cmake_flags.append('-DUSE_STATIC_LIBRARIES=0 -DSPLIT_SHARED_LIBRARIES=1 -DCLICKHOUSE_SPLIT_BINARY=1')
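
The flag string appended for the unbundled build is long enough that individual switches are easy to miss. A minimal sketch, assuming a hypothetical helper `unbundled_cmake_flags` (not part of the script changed in this commit), shows one way to assemble the same flags from two lists:

```python
# Hypothetical helper, not present in the packager script: it builds the same
# flags that the unbundled branch appends as one space-separated string.
def unbundled_cmake_flags():
    library_flags = [
        "-DUNBUNDLED=1",
        "-DUSE_INTERNAL_RDKAFKA_LIBRARY=1",
        "-DENABLE_ARROW=0",
        "-DENABLE_AVRO=0",
        "-DENABLE_ORC=0",
        "-DENABLE_PARQUET=0",
    ]
    # CPU features switched off so the build also checks that the code
    # compiles with only basic x86_64 instructions.
    cpu_features = ["SSSE3", "SSE41", "SSE42", "PCLMULQDQ", "POPCNT", "AVX", "AVX2"]
    cpu_flags = ["-DENABLE_{}=0".format(feature) for feature in cpu_features]
    return library_flags + cpu_flags


if __name__ == "__main__":
    print(" ".join(unbundled_cmake_flags()))
```

Joining the list with spaces reproduces the string passed to `cmake_flags.append` in the hunk above; keeping the CPU features in their own list makes it easier to see which instruction sets are disabled.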

@@ -226,7 +226,7 @@ continue
task_exit_code=$fuzzer_exit_code
echo "failure" > status.txt
{ grep --text -o "Found error:.*" fuzzer.log \
|| grep --text -o "Exception.*" fuzzer.log \
|| grep --text -ao "Exception:.*" fuzzer.log \
|| echo "Fuzzer failed ($fuzzer_exit_code). See the logs." ; } \
| tail -1 > description.txt
fi

@@ -105,6 +105,10 @@ def process_result(result_path):
description += ", skipped: {}".format(skipped)
if unknown != 0:
description += ", unknown: {}".format(unknown)
# Temporary green for tests with DatabaseReplicated:
if 1 == int(os.environ.get('USE_DATABASE_REPLICATED', 0)):
state = "success"
else:
state = "failure"
description = "Output log doesn't exist"
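
The temporary override for DatabaseReplicated runs depends on how the environment variable is parsed. A minimal sketch, assuming a hypothetical function `override_state_for_replicated` (the commit itself inlines this check in `process_result`), isolates that logic:

```python
import os


# Hypothetical wrapper around the check added in this hunk; the `environ`
# parameter is an assumption that makes the logic testable without touching
# the real process environment.
def override_state_for_replicated(state, environ=os.environ):
    # The variable arrives as a string such as "1"; the default is the int 0.
    # int() accepts both, so an unset variable leaves the state unchanged.
    if 1 == int(environ.get("USE_DATABASE_REPLICATED", 0)):
        return "success"
    return state


if __name__ == "__main__":
    print(override_state_for_replicated("failure", {"USE_DATABASE_REPLICATED": "1"}))
    print(override_state_for_replicated("failure", {}))
```

Note that a non-numeric value such as `true` would make `int()` raise `ValueError`, so the variable is expected to be `0` or `1`.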

@@ -1,6 +1,6 @@
---
toc_priority: 29
toc_title: MaterializedMySQL
toc_title: "[experimental] MaterializedMySQL"
---
# [experimental] MaterializedMySQL {#materialized-mysql}
@@ -27,28 +27,33 @@ ENGINE = MaterializedMySQL('host:port', ['database' | database], 'user', 'passwo
- `password` — User password.
**Engine Settings**
- `max_rows_in_buffer` — Max rows that data is allowed to cache in memory(for single table and the cache data unable to query). when rows is exceeded, the data will be materialized. Default: `65505`.
- `max_bytes_in_buffer` — Max bytes that data is allowed to cache in memory(for single table and the cache data unable to query). when rows is exceeded, the data will be materialized. Default: `1048576`.
- `max_rows_in_buffers` — Max rows that data is allowed to cache in memory(for database and the cache data unable to query). when rows is exceeded, the data will be materialized. Default: `65505`.
- `max_bytes_in_buffers` — Max bytes that data is allowed to cache in memory(for database and the cache data unable to query). when rows is exceeded, the data will be materialized. Default: `1048576`.
- `max_flush_data_time` — Max milliseconds that data is allowed to cache in memory(for database and the cache data unable to query). when this time is exceeded, the data will be materialized. Default: `1000`.
- `max_wait_time_when_mysql_unavailable` — Retry interval when MySQL is not available (milliseconds). Negative value disable retry. Default: `1000`.
- `allows_query_when_mysql_lost` — Allow query materialized table when mysql is lost. Default: `0` (`false`).
```
- `max_rows_in_buffer` — Maximum number of rows that data is allowed to cache in memory (for single table and the cache data unable to query). When this number is exceeded, the data will be materialized. Default: `65 505`.
- `max_bytes_in_buffer` — Maximum number of bytes that data is allowed to cache in memory (for single table and the cache data unable to query). When this number is exceeded, the data will be materialized. Default: `1 048 576`.
- `max_rows_in_buffers` — Maximum number of rows that data is allowed to cache in memory (for database and the cache data unable to query). When this number is exceeded, the data will be materialized. Default: `65 505`.
- `max_bytes_in_buffers` — Maximum number of bytes that data is allowed to cache in memory (for database and the cache data unable to query). When this number is exceeded, the data will be materialized. Default: `1 048 576`.
- `max_flush_data_time` — Maximum number of milliseconds that data is allowed to cache in memory (for database and the cache data unable to query). When this time is exceeded, the data will be materialized. Default: `1000`.
- `max_wait_time_when_mysql_unavailable` — Retry interval when MySQL is not available (milliseconds). Negative value disables retry. Default: `1000`.
- `allows_query_when_mysql_lost` — Allows to query a materialized table when MySQL is lost. Default: `0` (`false`).
```sql
CREATE DATABASE mysql ENGINE = MaterializedMySQL('localhost:3306', 'db', 'user', '***')
SETTINGS
allows_query_when_mysql_lost=true,
max_wait_time_when_mysql_unavailable=10000;
```
**Settings on MySQL-server side**
**Settings on MySQL-server Side**
For the correct work of `MaterializedMySQL`, there are few mandatory `MySQL`-side configuration settings that should be set:
For `MaterializedMySQL` to work correctly, a few mandatory `MySQL`-side configuration settings must be set:
- `default_authentication_plugin = mysql_native_password` since `MaterializedMySQL` can only authorize with this method.
- `gtid_mode = on` since GTID based logging is a mandatory for providing correct `MaterializedMySQL` replication. Pay attention that while turning this mode `On` you should also specify `enforce_gtid_consistency = on`.
- `gtid_mode = on` since GTID-based logging is mandatory for correct `MaterializedMySQL` replication.
## Virtual columns {#virtual-columns}
!!! attention "Attention"
When turning on `gtid_mode`, you should also specify `enforce_gtid_consistency = on`.
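One way to confirm these prerequisites is to read the variables back on the MySQL side. This is a minimal sketch, assuming a MySQL version that still exposes `default_authentication_plugin` as a system variable; the column aliases are illustrative.

```sql
-- Run this in a MySQL client, not in ClickHouse.
SELECT
    @@GLOBAL.default_authentication_plugin AS auth_plugin,
    @@GLOBAL.gtid_mode AS gtid_mode,
    @@GLOBAL.enforce_gtid_consistency AS enforce_gtid_consistency;
```

For the settings listed above to be in effect, the query should report `mysql_native_password`, `ON`, and `ON`.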
## Virtual Columns {#virtual-columns}
When working with the `MaterializedMySQL` database engine, [ReplacingMergeTree](../../engines/table-engines/mergetree-family/replacingmergetree.md) tables are used with virtual `_sign` and `_version` columns.
@ -78,13 +83,13 @@ When working with the `MaterializedMySQL` database engine, [ReplacingMergeTree](
| BLOB | [String](../../sql-reference/data-types/string.md) |
| BINARY | [FixedString](../../sql-reference/data-types/fixedstring.md) |
Other types are not supported. If MySQL table contains a column of such type, ClickHouse throws exception "Unhandled data type" and stops replication.
[Nullable](../../sql-reference/data-types/nullable.md) is supported.
Other types are not supported. If a MySQL table contains a column of an unsupported type, ClickHouse throws an "Unhandled data type" exception and stops replication.
## Specifics and Recommendations {#specifics-and-recommendations}
### Compatibility restrictions
### Compatibility Restrictions {#compatibility-restrictions}
Apart from the data type limitations, there are a few restrictions compared to `MySQL` databases that must be resolved before replication is possible:

View File

@ -79,7 +79,7 @@ For a description of parameters, see the [CREATE query description](../../../sql
- `SAMPLE BY` — An expression for sampling. Optional.
If a sampling expression is used, the primary key must contain it. The result of sampling expression must be unsigned integer. Example: `SAMPLE BY intHash32(UserID) ORDER BY (CounterID, EventDate, intHash32(UserID))`.
If a sampling expression is used, the primary key must contain it. The result of a sampling expression must be an unsigned integer. Example: `SAMPLE BY intHash32(UserID) ORDER BY (CounterID, EventDate, intHash32(UserID))`.
- `TTL` — A list of rules specifying storage duration of rows and defining logic of automatic parts movement [between disks and volumes](#table_engine-mergetree-multiple-volumes). Optional.
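The `SAMPLE BY` requirement above (the sampling expression must be part of the primary key and must produce an unsigned integer) can be sketched as follows. The table name `hits_sample` and the column types are assumptions made for illustration; the column names come from the example in the text.

```sql
-- Hypothetical table: intHash32(UserID) returns UInt32 and appears in ORDER BY.
CREATE TABLE hits_sample
(
    CounterID UInt32,
    EventDate Date,
    UserID UInt64
)
ENGINE = MergeTree
ORDER BY (CounterID, EventDate, intHash32(UserID))
SAMPLE BY intHash32(UserID);

-- Read roughly a tenth of the data using the sampling key.
SELECT count() FROM hits_sample SAMPLE 0.1;
```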

View File

@ -7,19 +7,89 @@ toc_title: Arrays
## empty {#function-empty}
Returns 1 for an empty array, or 0 for a non-empty array.
The result type is UInt8.
The function also works for strings.
Checks whether the input array is empty.
Can be optimized by enabling the [optimize_functions_to_subcolumns](../../operations/settings/settings.md#optimize-functions-to-subcolumns) setting. With `optimize_functions_to_subcolumns = 1` the function reads only [size0](../../sql-reference/data-types/array.md#array-size) subcolumn instead of reading and processing the whole array column. The query `SELECT empty(arr) FROM table` transforms to `SELECT arr.size0 = 0 FROM TABLE`.
**Syntax**
``` sql
empty([x])
```
An array is considered empty if it does not contain any elements.
!!! note "Note"
Can be optimized by enabling the [optimize_functions_to_subcolumns](../../operations/settings/settings.md#optimize-functions-to-subcolumns) setting. With `optimize_functions_to_subcolumns = 1` the function reads only [size0](../../sql-reference/data-types/array.md#array-size) subcolumn instead of reading and processing the whole array column. The query `SELECT empty(arr) FROM TABLE;` transforms to `SELECT arr.size0 = 0 FROM TABLE;`.
The function also works for [strings](string-functions.md#empty) or [UUID](uuid-functions.md#empty).
**Arguments**
- `[x]` — Input array. [Array](../data-types/array.md).
**Returned value**
- Returns `1` for an empty array or `0` for a non-empty array.
Type: [UInt8](../data-types/int-uint.md).
**Example**
Query:
```sql
SELECT empty([]);
```
Result:
```text
┌─empty(array())─┐
│ 1 │
└────────────────┘
```
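The subcolumn optimization mentioned in the note can be tried with a small table. This is a sketch only: the table `t_arr` and its columns are hypothetical, and it assumes a table engine that supports reading subcolumns, such as `MergeTree`.

```sql
-- Hypothetical table with an array column.
CREATE TABLE t_arr (id UInt64, arr Array(UInt32)) ENGINE = MergeTree ORDER BY id;
INSERT INTO t_arr VALUES (1, []), (2, [10, 20]);

SET optimize_functions_to_subcolumns = 1;

-- With the setting enabled, this query...
SELECT id, empty(arr) FROM t_arr ORDER BY id;

-- ...should return the same rows as the explicit subcolumn form.
SELECT id, arr.size0 = 0 FROM t_arr ORDER BY id;
```

Both queries should return `1` for the row with the empty array and `0` for the other row.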
## notEmpty {#function-notempty}
Returns 0 for an empty array, or 1 for a non-empty array.
The result type is UInt8.
The function also works for strings.
Checks whether the input array is non-empty.
Can be optimized by enabling the [optimize_functions_to_subcolumns](../../operations/settings/settings.md#optimize-functions-to-subcolumns) setting. With `optimize_functions_to_subcolumns = 1` the function reads only [size0](../../sql-reference/data-types/array.md#array-size) subcolumn instead of reading and processing the whole array column. The query `SELECT notEmpty(arr) FROM table` transforms to `SELECT arr.size0 != 0 FROM TABLE`.
**Syntax**
``` sql
notEmpty([x])
```
An array is considered non-empty if it contains at least one element.
!!! note "Note"
Can be optimized by enabling the [optimize_functions_to_subcolumns](../../operations/settings/settings.md#optimize-functions-to-subcolumns) setting. With `optimize_functions_to_subcolumns = 1` the function reads only [size0](../../sql-reference/data-types/array.md#array-size) subcolumn instead of reading and processing the whole array column. The query `SELECT notEmpty(arr) FROM table` transforms to `SELECT arr.size0 != 0 FROM TABLE`.
The function also works for [strings](string-functions.md#notempty) or [UUID](uuid-functions.md#notempty).
**Arguments**
- `[x]` — Input array. [Array](../data-types/array.md).
**Returned value**
- Returns `1` for a non-empty array or `0` for an empty array.
Type: [UInt8](../data-types/int-uint.md).
**Example**
Query:
```sql
SELECT notEmpty([1,2]);
```
Result:
```text
┌─notEmpty([1, 2])─┐
│ 1 │
└──────────────────┘
```
## length {#array_functions-length}

View File

@ -10,17 +10,83 @@ toc_title: Strings
## empty {#empty}
Returns 1 for an empty string or 0 for a non-empty string.
The result type is UInt8.
Checks whether the input string is empty.
**Syntax**
``` sql
empty(x)
```
A string is considered non-empty if it contains at least one byte, even if this is a space or a null byte.
The function also works for arrays or UUID.
UUID is empty if it is all zeros (nil UUID).
The function also works for [arrays](array-functions.md#function-empty) or [UUID](uuid-functions.md#empty).
**Arguments**
- `x` — Input value. [String](../data-types/string.md).
**Returned value**
- Returns `1` for an empty string or `0` for a non-empty string.
Type: [UInt8](../data-types/int-uint.md).
**Example**
Query:
```sql
SELECT empty('');
```
Result:
```text
┌─empty('')─┐
│ 1 │
└───────────┘
```
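The byte-based rule above can be checked with a space and a null byte. A short sketch; the column aliases are illustrative.

```sql
SELECT
    empty('')   AS no_bytes,   -- zero bytes
    empty(' ')  AS one_space,  -- a single space is one byte
    empty('\0') AS null_byte;  -- a single null byte is one byte
```

By the rule stated above, only `no_bytes` should be `1`.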
## notEmpty {#notempty}
Returns 0 for an empty string or 1 for a non-empty string.
The result type is UInt8.
The function also works for arrays or UUID.
Checks whether the input string is non-empty.
**Syntax**
``` sql
notEmpty(x)
```
A string is considered non-empty if it contains at least one byte, even if this is a space or a null byte.
The function also works for [arrays](array-functions.md#function-notempty) or [UUID](uuid-functions.md#notempty).
**Arguments**
- `x` — Input value. [String](../data-types/string.md).
**Returned value**
- Returns `1` for a non-empty string or `0` for an empty string.
Type: [UInt8](../data-types/int-uint.md).
**Example**
Query:
```sql
SELECT notEmpty('text');
```
Result:
```text
┌─notEmpty('text')─┐
│ 1 │
└──────────────────┘
```
## length {#length}

View File

@ -9,7 +9,7 @@ The functions for working with UUID are listed below.
## generateUUIDv4 {#uuid-function-generate}
Generates the [UUID](../../sql-reference/data-types/uuid.md) of [version 4](https://tools.ietf.org/html/rfc4122#section-4.4).
Generates the [UUID](../data-types/uuid.md) of [version 4](https://tools.ietf.org/html/rfc4122#section-4.4).
``` sql
generateUUIDv4()
@ -37,6 +37,90 @@ SELECT * FROM t_uuid
└──────────────────────────────────────┘
```
## empty {#empty}
Checks whether the input UUID is empty.
**Syntax**
```sql
empty(UUID)
```
The UUID is considered empty if it contains all zeros (zero UUID).
The function also works for [arrays](array-functions.md#function-empty) or [strings](string-functions.md#empty).
**Arguments**
- `x` — Input UUID. [UUID](../data-types/uuid.md).
**Returned value**
- Returns `1` for an empty UUID or `0` for a non-empty UUID.
Type: [UInt8](../data-types/int-uint.md).
**Example**
To generate the UUID value, ClickHouse provides the [generateUUIDv4](#uuid-function-generate) function.
Query:
```sql
SELECT empty(generateUUIDv4());
```
Result:
```text
┌─empty(generateUUIDv4())─┐
│ 0 │
└─────────────────────────┘
```
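The opposite case, an all-zero UUID, can be sketched with the `toUUID` conversion function. The alias `is_empty` is illustrative.

```sql
-- An all-zero (nil) UUID counts as empty according to the definition above.
SELECT empty(toUUID('00000000-0000-0000-0000-000000000000')) AS is_empty;
```

Following the definition above, this should return `1`.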
## notEmpty {#notempty}
Checks whether the input UUID is non-empty.
**Syntax**
```sql
notEmpty(UUID)
```
The UUID is considered empty if it contains all zeros (zero UUID).
The function also works for [arrays](array-functions.md#function-notempty) or [strings](string-functions.md#notempty).
**Arguments**
- `x` — Input UUID. [UUID](../data-types/uuid.md).
**Returned value**
- Returns `1` for a non-empty UUID or `0` for an empty UUID.
Type: [UInt8](../data-types/int-uint.md).
**Example**
To generate the UUID value, ClickHouse provides the [generateUUIDv4](#uuid-function-generate) function.
Query:
```sql
SELECT notEmpty(generateUUIDv4());
```
Result:
```text
┌─notEmpty(generateUUIDv4())─┐
│ 1 │
└────────────────────────────┘
```
## toUUID (x) {#touuid-x}
Converts String type value to UUID type.

View File

@ -6,23 +6,55 @@ toc_title: DISTINCT
If `SELECT DISTINCT` is specified, only unique rows will remain in a query result. Thus only a single row will remain out of all the sets of fully matching rows in the result.
## Null Processing {#null-processing}
You can specify the list of columns that must have unique values: `SELECT DISTINCT ON (column1, column2,...)`. If the columns are not specified, all of them are taken into consideration.
`DISTINCT` works with [NULL](../../../sql-reference/syntax.md#null-literal) as if `NULL` were a specific value, and `NULL==NULL`. In other words, in the `DISTINCT` results, different combinations with `NULL` occur only once. It differs from `NULL` processing in most other contexts.
Consider the table:
## Alternatives {#alternatives}
```text
┌─a─┬─b─┬─c─┐
│ 1 │ 1 │ 1 │
│ 1 │ 1 │ 1 │
│ 2 │ 2 │ 2 │
│ 2 │ 2 │ 2 │
│ 1 │ 1 │ 2 │
│ 1 │ 2 │ 2 │
└───┴───┴───┘
```
It is possible to obtain the same result by applying [GROUP BY](../../../sql-reference/statements/select/group-by.md) across the same set of values as specified as `SELECT` clause, without using any aggregate functions. But there are few differences from `GROUP BY` approach:
Using `DISTINCT` without specifying columns:
- `DISTINCT` can be applied together with `GROUP BY`.
- When [ORDER BY](../../../sql-reference/statements/select/order-by.md) is omitted and [LIMIT](../../../sql-reference/statements/select/limit.md) is defined, the query stops running immediately after the required number of different rows has been read.
- Data blocks are output as they are processed, without waiting for the entire query to finish running.
```sql
SELECT DISTINCT * FROM t1;
```
## Examples {#examples}
```text
┌─a─┬─b─┬─c─┐
│ 1 │ 1 │ 1 │
│ 2 │ 2 │ 2 │
│ 1 │ 1 │ 2 │
│ 1 │ 2 │ 2 │
└───┴───┴───┘
```
Using `DISTINCT` with specified columns:
```sql
SELECT DISTINCT ON (a,b) * FROM t1;
```
```text
┌─a─┬─b─┬─c─┐
│ 1 │ 1 │ 1 │
│ 2 │ 2 │ 2 │
│ 1 │ 2 │ 2 │
└───┴───┴───┘
```
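To reproduce the two results above, the sample table can be created as in this sketch. The column types and the `Memory` engine are assumptions; the source only shows the table contents.

```sql
-- Assumed schema for the three-column example table.
CREATE TABLE t1 (a UInt8, b UInt8, c UInt8) ENGINE = Memory;
INSERT INTO t1 VALUES (1, 1, 1), (1, 1, 1), (2, 2, 2), (2, 2, 2), (1, 1, 2), (1, 2, 2);

-- Uniqueness over all columns.
SELECT DISTINCT * FROM t1;

-- Uniqueness over (a, b) only; one row is kept per (a, b) pair.
SELECT DISTINCT ON (a, b) * FROM t1;
```

The second query keeps a single row for `(1, 1)`, so only one of the `c` values `1` and `2` can appear for that pair.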
## DISTINCT and ORDER BY {#distinct-orderby}
ClickHouse supports using the `DISTINCT` and `ORDER BY` clauses for different columns in one query. The `DISTINCT` clause is executed before the `ORDER BY` clause.
Example table:
Consider the table:
``` text
┌─a─┬─b─┐
@ -33,7 +65,11 @@ Example table:
└───┴───┘
```
When selecting data with the `SELECT DISTINCT a FROM t1 ORDER BY b ASC` query, we get the following result:
Selecting data:
```sql
SELECT DISTINCT a FROM t1 ORDER BY b ASC;
```
``` text
┌─a─┐
@ -42,8 +78,11 @@ When selecting data with the `SELECT DISTINCT a FROM t1 ORDER BY b ASC` query, w
│ 3 │
└───┘
```
Selecting data with a different sorting direction:
If we change the sorting direction `SELECT DISTINCT a FROM t1 ORDER BY b DESC`, we get the following result:
```sql
SELECT DISTINCT a FROM t1 ORDER BY b DESC;
```
``` text
┌─a─┐
@ -56,3 +95,15 @@ If we change the sorting direction `SELECT DISTINCT a FROM t1 ORDER BY b DESC`,
Row `2, 4` was cut before sorting.
Take this implementation detail into account when writing queries.
## Null Processing {#null-processing}
`DISTINCT` works with [NULL](../../../sql-reference/syntax.md#null-literal) as if `NULL` were a specific value, and `NULL==NULL`. In other words, in the `DISTINCT` results, different combinations with `NULL` occur only once. It differs from `NULL` processing in most other contexts.
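A short sketch of this behavior, using an inline array instead of a table. The alias `x` is illustrative.

```sql
-- The two NULL values collapse into a single row in the result.
SELECT DISTINCT x
FROM (SELECT arrayJoin([1, NULL, NULL, 2]) AS x);
```

The result should contain three rows: `1`, `NULL`, and `2`. The row order is not guaranteed without `ORDER BY`.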
## Alternatives {#alternatives}
It is possible to obtain the same result by applying [GROUP BY](../../../sql-reference/statements/select/group-by.md) across the same set of values as specified in the `SELECT` clause, without using any aggregate functions. But there are a few differences from the `GROUP BY` approach:
- `DISTINCT` can be applied together with `GROUP BY`.
- When [ORDER BY](../../../sql-reference/statements/select/order-by.md) is omitted and [LIMIT](../../../sql-reference/statements/select/limit.md) is defined, the query stops running immediately after the required number of different rows has been read.
- Data blocks are output as they are processed, without waiting for the entire query to finish running.
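The equivalence and the first two differences can be sketched as follows, assuming a table `t1` with columns `a` and `b` like the ones used in the examples on this page.

```sql
-- These two queries should return the same set of rows.
SELECT DISTINCT a, b FROM t1;
SELECT a, b FROM t1 GROUP BY a, b;

-- DISTINCT applied together with GROUP BY: unique values of the aggregate.
SELECT DISTINCT count() FROM t1 GROUP BY a;

-- No ORDER BY and a LIMIT: reading can stop after two different values of a.
SELECT DISTINCT a FROM t1 LIMIT 2;
```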

View File

@ -13,7 +13,7 @@ toc_title: Overview
``` sql
[WITH expr_list|(subquery)]
SELECT [DISTINCT] expr_list
SELECT [DISTINCT [ON (column1, column2, ...)]] expr_list
[FROM [db.]table | (subquery) | table_function] [FINAL]
[SAMPLE sample_coeff]
[ARRAY JOIN ...]
@ -36,6 +36,8 @@ All clauses are optional, except for the required list of expressions immediatel
Specifics of each optional clause are covered in separate sections, which are listed in the same order as they are executed:
- [WITH clause](../../../sql-reference/statements/select/with.md)
- [SELECT clause](#select-clause)
- [DISTINCT clause](../../../sql-reference/statements/select/distinct.md)
- [FROM clause](../../../sql-reference/statements/select/from.md)
- [SAMPLE clause](../../../sql-reference/statements/select/sample.md)
- [JOIN clause](../../../sql-reference/statements/select/join.md)
@ -44,8 +46,6 @@ Specifics of each optional clause are covered in separate sections, which are li
- [GROUP BY clause](../../../sql-reference/statements/select/group-by.md)
- [LIMIT BY clause](../../../sql-reference/statements/select/limit-by.md)
- [HAVING clause](../../../sql-reference/statements/select/having.md)
- [SELECT clause](#select-clause)
- [DISTINCT clause](../../../sql-reference/statements/select/distinct.md)
- [LIMIT clause](../../../sql-reference/statements/select/limit.md)
- [OFFSET clause](../../../sql-reference/statements/select/offset.md)
- [UNION clause](../../../sql-reference/statements/select/union.md)

View File

@ -1,10 +1,12 @@
---
toc_priority: 29
toc_title: MaterializedMySQL
toc_title: "[experimental] MaterializedMySQL"
---
# [экспериментальный] MaterializedMySQL {#materialized-mysql}
**Это экспериментальный движок, который не следует использовать в продакшене.**
Создает базу данных ClickHouse со всеми таблицами, существующими в MySQL, и всеми данными в этих таблицах.
Сервер ClickHouse работает как реплика MySQL. Он читает файл binlog и выполняет DDL- и DML-запросы.
@ -23,6 +25,32 @@ ENGINE = MaterializedMySQL('host:port', ['database' | database], 'user', 'passwo
- `user` — пользователь MySQL.
- `password` — пароль пользователя.
**Настройки движка**
- `max_rows_in_buffer` — максимальное количество строк, содержимое которых может кешироваться в памяти (для одной таблицы и данных кеша, которые невозможно запросить). При превышении количества строк, данные будут материализованы. Значение по умолчанию: `65 505`.
- `max_bytes_in_buffer` — максимальное количество байтов, которое разрешено кешировать в памяти (для одной таблицы и данных кеша, которые невозможно запросить). При превышении количества байтов данные будут материализованы. Значение по умолчанию: `1 048 576`.
- `max_rows_in_buffers` — максимальное количество строк, содержимое которых может кешироваться в памяти (для базы данных и данных кеша, которые невозможно запросить). При превышении количества строк, данные будут материализованы. Значение по умолчанию: `65 505`.
- `max_bytes_in_buffers` — максимальное количество байтов, которое разрешено кешировать в памяти (для базы данных и данных кеша, которые невозможно запросить). При превышении количества байтов данные будут материализованы. Значение по умолчанию: `1 048 576`.
- `max_flush_data_time` — максимальное время в миллисекундах, в течение которого разрешено кешировать данные в памяти (для базы данных и данных кеша, которые невозможно запросить). При превышении указанного периода данные будут материализованы. Значение по умолчанию: `1000`.
- `max_wait_time_when_mysql_unavailable` — интервал между повторными попытками, если MySQL недоступен. Указывается в миллисекундах. Отрицательное значение отключает повторные попытки. Значение по умолчанию: `1000`.
- `allows_query_when_mysql_lost` — признак, разрешен ли запрос к материализованной таблице при потере соединения с MySQL. Значение по умолчанию: `0` (`false`).
```sql
CREATE DATABASE mysql ENGINE = MaterializedMySQL('localhost:3306', 'db', 'user', '***')
SETTINGS
allows_query_when_mysql_lost=true,
max_wait_time_when_mysql_unavailable=10000;
```
**Настройки на стороне MySQL-сервера**
Для правильной работы `MaterializedMySQL` следует обязательно указать на сервере MySQL следующие параметры конфигурации:
- `default_authentication_plugin = mysql_native_password` — `MaterializedMySQL` может авторизоваться только с помощью этого метода.
- `gtid_mode = on` — ведение журнала на основе GTID является обязательным для обеспечения правильной репликации.
!!! attention "Внимание"
При включении `gtid_mode` вы также должны указать `enforce_gtid_consistency = on`.
## Виртуальные столбцы {#virtual-columns}
При работе с движком баз данных `MaterializedMySQL` используются таблицы семейства [ReplacingMergeTree](../../engines/table-engines/mergetree-family/replacingmergetree.md) с виртуальными столбцами `_sign` и `_version`.
@ -51,13 +79,21 @@ ENGINE = MaterializedMySQL('host:port', ['database' | database], 'user', 'passwo
| STRING | [String](../../sql-reference/data-types/string.md) |
| VARCHAR, VAR_STRING | [String](../../sql-reference/data-types/string.md) |
| BLOB | [String](../../sql-reference/data-types/string.md) |
Другие типы не поддерживаются. Если таблица MySQL содержит столбец другого типа, ClickHouse выдаст исключение "Неподдерживаемый тип данных" ("Unhandled data type") и остановит репликацию.
| BINARY | [FixedString](../../sql-reference/data-types/fixedstring.md) |
Тип [Nullable](../../sql-reference/data-types/nullable.md) поддерживается.
Другие типы не поддерживаются. Если таблица MySQL содержит столбец другого типа, ClickHouse выдаст исключение "Неподдерживаемый тип данных" ("Unhandled data type") и остановит репликацию.
## Особенности и рекомендации {#specifics-and-recommendations}
### Ограничения совместимости {#compatibility-restrictions}
Кроме ограничений на типы данных, существует несколько ограничений по сравнению с базами данных MySQL, которые следует решить до того, как станет возможной репликация:
- Каждая таблица в MySQL должна содержать `PRIMARY KEY`.
- Репликация для таблиц, содержащих строки со значениями полей `ENUM` вне диапазона значений (определяется размерностью `ENUM`), не будет работать.
### DDL-запросы {#ddl-queries}
DDL-запросы в MySQL конвертируются в соответствующие DDL-запросы в ClickHouse ([ALTER](../../sql-reference/statements/alter/index.md), [CREATE](../../sql-reference/statements/create/index.md), [DROP](../../sql-reference/statements/drop.md), [RENAME](../../sql-reference/statements/rename.md)). Если ClickHouse не может конвертировать какой-либо DDL-запрос, он его игнорирует.
@ -158,3 +194,4 @@ SELECT * FROM mysql.test;
└───┴─────┴──────┘
```
[Оригинальная статья](https://clickhouse.tech/docs/ru/engines/database-engines/materialized-mysql/) <!--hide-->

View File

@ -68,7 +68,7 @@ ORDER BY expr
- `SAMPLE BY` — выражение для сэмплирования. Необязательный параметр.
Если используется выражение для сэмплирования, то первичный ключ должен содержать его. Пример: `SAMPLE BY intHash32(UserID) ORDER BY (CounterID, EventDate, intHash32(UserID))`.
Если используется выражение для сэмплирования, то первичный ключ должен содержать его. Результат выражения для сэмплирования должен быть беззнаковым целым числом. Пример: `SAMPLE BY intHash32(UserID) ORDER BY (CounterID, EventDate, intHash32(UserID))`.
- `TTL` — список правил, определяющих длительности хранения строк, а также задающих правила перемещения частей на определённые тома или диски. Необязательный параметр.
@ -375,6 +375,24 @@ INDEX b (u64 * length(str), i32 + f64 * 100, date, str) TYPE set(100) GRANULARIT
- `s != 1`
- `NOT startsWith(s, 'test')`
### Проекции {#projections}
Проекции похожи на материализованные представления, но определяются на уровне партов. Это обеспечивает гарантии согласованности наряду с автоматическим использованием в запросах.
#### Запрос {#projection-query}
Запрос проекции — это то, что определяет проекцию. Он имеет следующую грамматику:
`SELECT <COLUMN LIST EXPR> [GROUP BY] [ORDER BY]`
Он неявно выбирает данные из родительской таблицы.
#### Хранение {#projection-storage}
Проекции хранятся в каталоге парта. Это похоже на хранение индексов, но используется подкаталог, в котором хранится анонимный парт таблицы MergeTree. Таблица создается запросом определения проекции. Если есть конструкция GROUP BY, то базовый механизм хранения становится AggregatedMergeTree, а все агрегатные функции преобразуются в AggregateFunction. Если есть конструкция ORDER BY, таблица MergeTree будет использовать его в качестве выражения первичного ключа. Во время процесса слияния парт проекции будет слит с помощью процедуры слияния ее хранилища. Контрольная сумма парта родительской таблицы будет включать парт проекции. Другие процедуры аналогичны индексам пропуска данных.
#### Анализ запросов {#projection-query-analysis}
1. Проверить, можно ли использовать проекцию в данном запросе, то есть, что с ней выходит тот же результат, что и с запросом к базовой таблице.
2. Выбрать наиболее подходящее совпадение, содержащее наименьшее количество гранул для чтения.
3. План запроса, который использует проекции, будет отличаться от того, который использует исходные парты. При отсутствии проекции в некоторых партах можно расширить план, чтобы «проецировать» на лету.
## Конкурентный доступ к данным {#concurrent-data-access}
Для конкурентного доступа к таблице используется мультиверсионность. То есть, при одновременном чтении и обновлении таблицы, данные будут читаться из набора кусочков, актуального на момент запроса. Длинных блокировок нет. Вставки никак не мешают чтениям.

View File

@ -172,7 +172,7 @@ SELECT sequenceMatch('(?1)(?2)')(time, number = 1, number = 2, number = 4) FROM
## sequenceCount(pattern)(time, cond1, cond2, …) {#function-sequencecount}
Вычисляет количество цепочек событий, соответствующих шаблону. Функция обнаруживает только непересекающиеся цепочки событий. Она начитает искать следующую цепочку только после того, как полностью совпала текущая цепочка событий.
Вычисляет количество цепочек событий, соответствующих шаблону. Функция обнаруживает только непересекающиеся цепочки событий. Она начинает искать следующую цепочку только после того, как полностью совпала текущая цепочка событий.
!!! warning "Предупреждение"
События, произошедшие в одну и ту же секунду, располагаются в последовательности в неопределенном порядке, что может повлиять на результат работы функции.

View File

@ -7,19 +7,89 @@ toc_title: "Массивы"
## empty {#function-empty}
Возвращает 1 для пустого массива, и 0 для непустого массива.
Тип результата - UInt8.
Функция также работает для строк.
Проверяет, является ли входной массив пустым.
Функцию можно оптимизировать, если включить настройку [optimize_functions_to_subcolumns](../../operations/settings/settings.md#optimize-functions-to-subcolumns). При `optimize_functions_to_subcolumns = 1` функция читает только подстолбец [size0](../../sql-reference/data-types/array.md#array-size) вместо чтения и обработки всего столбца массива. Запрос `SELECT empty(arr) FROM table` преобразуется к запросу `SELECT arr.size0 = 0 FROM TABLE`.
**Синтаксис**
``` sql
empty([x])
```
Массив считается пустым, если он не содержит ни одного элемента.
!!! note "Примечание"
Функцию можно оптимизировать, если включить настройку [optimize_functions_to_subcolumns](../../operations/settings/settings.md#optimize-functions-to-subcolumns). При `optimize_functions_to_subcolumns = 1` функция читает только подстолбец [size0](../../sql-reference/data-types/array.md#array-size) вместо чтения и обработки всего столбца массива. Запрос `SELECT empty(arr) FROM TABLE` преобразуется к запросу `SELECT arr.size0 = 0 FROM TABLE`.
Функция также поддерживает работу с типами [String](string-functions.md#empty) и [UUID](uuid-functions.md#empty).
**Параметры**
- `[x]` — массив на входе функции. [Array](../data-types/array.md).
**Возвращаемое значение**
- Возвращает `1` для пустого массива или `0` — для непустого массива.
Тип: [UInt8](../data-types/int-uint.md).
**Пример**
Запрос:
```sql
SELECT empty([]);
```
Ответ:
```text
┌─empty(array())─┐
│ 1 │
└────────────────┘
```
## notEmpty {#function-notempty}
Возвращает 0 для пустого массива, и 1 для непустого массива.
Тип результата - UInt8.
Функция также работает для строк.
Проверяет, является ли входной массив непустым.
Функцию можно оптимизировать, если включить настройку [optimize_functions_to_subcolumns](../../operations/settings/settings.md#optimize-functions-to-subcolumns). При `optimize_functions_to_subcolumns = 1` функция читает только подстолбец [size0](../../sql-reference/data-types/array.md#array-size) вместо чтения и обработки всего столбца массива. Запрос `SELECT notEmpty(arr) FROM table` преобразуется к запросу `SELECT arr.size0 != 0 FROM TABLE`.
**Синтаксис**
``` sql
notEmpty([x])
```
Массив считается непустым, если он содержит хотя бы один элемент.
!!! note "Примечание"
Функцию можно оптимизировать, если включить настройку [optimize_functions_to_subcolumns](../../operations/settings/settings.md#optimize-functions-to-subcolumns). При `optimize_functions_to_subcolumns = 1` функция читает только подстолбец [size0](../../sql-reference/data-types/array.md#array-size) вместо чтения и обработки всего столбца массива. Запрос `SELECT notEmpty(arr) FROM table` преобразуется к запросу `SELECT arr.size0 != 0 FROM TABLE`.
Функция также поддерживает работу с типами [String](string-functions.md#notempty) и [UUID](uuid-functions.md#notempty).
**Параметры**
- `[x]` — массив на входе функции. [Array](../data-types/array.md).
**Возвращаемое значение**
- Возвращает `1` для непустого массива или `0` — для пустого массива.
Тип: [UInt8](../data-types/int-uint.md).
**Пример**
Запрос:
```sql
SELECT notEmpty([1,2]);
```
Результат:
```text
┌─notEmpty([1, 2])─┐
│ 1 │
└──────────────────┘
```
## length {#array_functions-length}

View File

@ -7,16 +7,83 @@ toc_title: "Функции для работы со строками"
## empty {#empty}
Возвращает 1 для пустой строки, и 0 для непустой строки.
Тип результата — UInt8.
Проверяет, является ли входная строка пустой.
**Синтаксис**
``` sql
empty(x)
```
Строка считается непустой, если содержит хотя бы один байт, пусть даже это пробел или нулевой байт.
Функция также работает для массивов.
Функция также поддерживает работу с типами [Array](array-functions.md#function-empty) и [UUID](uuid-functions.md#empty).
**Параметры**
- `x` — Входная строка. [String](../data-types/string.md).
**Возвращаемое значение**
- Возвращает `1` для пустой строки и `0` — для непустой строки.
Тип: [UInt8](../data-types/int-uint.md).
**Пример**
Запрос:
```sql
SELECT empty('');
```
Результат:
```text
┌─empty('')─┐
│ 1 │
└───────────┘
```
## notEmpty {#notempty}
Возвращает 0 для пустой строки, и 1 для непустой строки.
Тип результата — UInt8.
Функция также работает для массивов.
Проверяет, является ли входная строка непустой.
**Синтаксис**
``` sql
notEmpty(x)
```
Строка считается непустой, если содержит хотя бы один байт, пусть даже это пробел или нулевой байт.
Функция также поддерживает работу с типами [Array](array-functions.md#function-notempty) и [UUID](uuid-functions.md#notempty).
**Параметры**
- `x` — Входная строка. [String](../data-types/string.md).
**Возвращаемое значение**
- Возвращает `1` для непустой строки и `0` — для пустой строки.
Тип: [UInt8](../data-types/int-uint.md).
**Пример**
Запрос:
```sql
SELECT notEmpty('text');
```
Результат:
```text
┌─notEmpty('text')─┐
│ 1 │
└──────────────────┘
```
## length {#length}

View File

@ -35,6 +35,90 @@ SELECT * FROM t_uuid
└──────────────────────────────────────┘
```
## empty {#empty}
Проверяет, является ли входной UUID пустым.
**Синтаксис**
```sql
empty(UUID)
```
UUID считается пустым, если он содержит все нули (нулевой UUID).
Функция также поддерживает работу с типами [Array](array-functions.md#function-empty) и [String](string-functions.md#empty).
**Параметры**
- `x` — UUID на входе функции. [UUID](../data-types/uuid.md).
**Возвращаемое значение**
- Возвращает `1` для пустого UUID или `0` — для непустого UUID.
Тип: [UInt8](../data-types/int-uint.md).
**Пример**
Для генерации UUID-значений предназначена функция [generateUUIDv4](#uuid-function-generate).
Запрос:
```sql
SELECT empty(generateUUIDv4());
```
Ответ:
```text
┌─empty(generateUUIDv4())─┐
│ 0 │
└─────────────────────────┘
```
## notEmpty {#notempty}
Проверяет, является ли входной UUID непустым.
**Синтаксис**
```sql
notEmpty(UUID)
```
UUID считается пустым, если он содержит все нули (нулевой UUID).
Функция также поддерживает работу с типами [Array](array-functions.md#function-notempty) и [String](string-functions.md#notempty).
**Параметры**
- `x` — UUID на входе функции. [UUID](../data-types/uuid.md).
**Возвращаемое значение**
- Возвращает `1` для непустого UUID или `0` — для пустого UUID.
Тип: [UInt8](../data-types/int-uint.md).
**Пример**
Для генерации UUID-значений предназначена функция [generateUUIDv4](#uuid-function-generate).
Запрос:
```sql
SELECT notEmpty(generateUUIDv4());
```
Результат:
```text
┌─notEmpty(generateUUIDv4())─┐
│ 1 │
└────────────────────────────┘
```
## toUUID (x) {#touuid-x}
Преобразует значение типа String в тип UUID.

View File

@ -6,19 +6,51 @@ toc_title: DISTINCT
Если указан `SELECT DISTINCT`, то в результате запроса останутся только уникальные строки. Таким образом, из всех наборов полностью совпадающих строк в результате останется только одна строка.
## Обработка NULL {#null-processing}
Вы можете указать столбцы, по которым хотите отбирать уникальные значения: `SELECT DISTINCT ON (column1, column2,...)`. Если столбцы не указаны, то отбираются строки, в которых значения уникальны во всех столбцах.
`DISTINCT` работает с [NULL](../../syntax.md#null-literal) как-будто `NULL` — обычное значение и `NULL==NULL`. Другими словами, в результате `DISTINCT`, различные комбинации с `NULL` встретятся только один раз. Это отличается от обработки `NULL` в большинстве других контекстов.
Рассмотрим таблицу:
## Альтернативы {#alternatives}
```text
┌─a─┬─b─┬─c─┐
│ 1 │ 1 │ 1 │
│ 1 │ 1 │ 1 │
│ 2 │ 2 │ 2 │
│ 2 │ 2 │ 2 │
│ 1 │ 1 │ 2 │
│ 1 │ 2 │ 2 │
└───┴───┴───┘
```
The same result can be obtained by applying the [GROUP BY](group-by.md) clause to the same set of values specified in the `SELECT` clause, without using any aggregate functions. However, there are several differences from `GROUP BY`:
Using `DISTINCT` without specifying columns:
- `DISTINCT` can be used together with `GROUP BY`.
- When the [ORDER BY](order-by.md) clause is omitted and the [LIMIT](limit.md) clause is present, the query stops running as soon as the required number of distinct rows has been read.
- Data blocks are output as they are processed, without waiting for the entire query to finish.
```sql
SELECT DISTINCT * FROM t1;
```
## Examples {#examples}
```text
┌─a─┬─b─┬─c─┐
│ 1 │ 1 │ 1 │
│ 2 │ 2 │ 2 │
│ 1 │ 1 │ 2 │
│ 1 │ 2 │ 2 │
└───┴───┴───┘
```
Using `DISTINCT` with specified columns:
```sql
SELECT DISTINCT ON (a,b) * FROM t1;
```
```text
┌─a─┬─b─┬─c─┐
│ 1 │ 1 │ 1 │
│ 2 │ 2 │ 2 │
│ 1 │ 2 │ 2 │
└───┴───┴───┘
```
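The `DISTINCT ON (a,b)` result shown above can be modelled as "keep the first row seen for each distinct `(a, b)` pair". The following is a minimal C++ sketch of that reading only. The names `Row`, `distinctOnAB` and `exampleTable` are hypothetical, and the sketch assumes rows are visited in the order shown in the table; it does not describe how ClickHouse actually executes the query.

```cpp
#include <cassert>
#include <set>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical row type for the three-column table t1 (a, b, c) from the example.
using Row = std::tuple<int, int, int>;

// Keep the first row seen for every distinct (a, b) pair, preserving input order.
// This models the documented result only, not ClickHouse's execution strategy.
std::vector<Row> distinctOnAB(const std::vector<Row> & rows)
{
    std::set<std::pair<int, int>> seen;
    std::vector<Row> result;
    for (const auto & row : rows)
    {
        const std::pair<int, int> key{std::get<0>(row), std::get<1>(row)};
        if (seen.insert(key).second) // true only the first time this key appears
            result.push_back(row);
    }
    return result;
}

// The rows of t1 from the example, in the order shown there.
std::vector<Row> exampleTable()
{
    return {Row{1, 1, 1}, Row{1, 1, 1}, Row{2, 2, 2}, Row{2, 2, 2}, Row{1, 1, 2}, Row{1, 2, 2}};
}
```

Calling `distinctOnAB(exampleTable())` yields three rows, one per `(a, b)` pair, which matches the shape of the result table in the example.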
## DISTINCT and ORDER BY {#distinct-orderby}
ClickHouse supports using the `DISTINCT` and `ORDER BY` clauses for different columns in one query. The `DISTINCT` clause is executed before the `ORDER BY` clause.
@ -56,3 +88,16 @@ ClickHouse поддерживает использование секций `DIS
The row `2, 4` was cut before sorting.
Keep this behavior in mind when writing queries.
## NULL Processing {#null-processing}
`DISTINCT` treats [NULL](../../syntax.md#null-literal) as if `NULL` were an ordinary value and `NULL==NULL`. In other words, in the `DISTINCT` result, each distinct combination containing `NULL` occurs only once. This differs from `NULL` processing in most other contexts.
## Alternatives {#alternatives}
You can obtain the same result by applying [GROUP BY](group-by.md) to the same set of values specified in the `SELECT` clause, without using any aggregate functions. However, there are several differences from `GROUP BY`:
- `DISTINCT` can be used together with `GROUP BY`.
- When the [ORDER BY](order-by.md) clause is omitted and the [LIMIT](limit.md) clause is present, the query stops running as soon as the required number of distinct rows has been read.
- Data blocks are output as they are processed, without waiting for the entire query to finish.

View File

@ -11,7 +11,7 @@ toc_title: "Обзор"
``` sql
[WITH expr_list|(subquery)]
SELECT [DISTINCT] expr_list
SELECT [DISTINCT [ON (column1, column2, ...)]] expr_list
[FROM [db.]table | (subquery) | table_function] [FINAL]
[SAMPLE sample_coeff]
[ARRAY JOIN ...]
@ -34,6 +34,8 @@ SELECT [DISTINCT] expr_list
The specifics of each optional clause are covered in separate sections, which are listed in the same order in which they are executed:
- [WITH clause](with.md)
- [SELECT clause](#select-clause)
- [DISTINCT clause](distinct.md)
- [FROM clause](from.md)
- [SAMPLE clause](sample.md)
- [JOIN clause](join.md)
@ -42,8 +44,6 @@ SELECT [DISTINCT] expr_list
- [GROUP BY clause](group-by.md)
- [LIMIT BY clause](limit-by.md)
- [HAVING clause](having.md)
- [SELECT clause](#select-clause)
- [DISTINCT clause](distinct.md)
- [LIMIT clause](limit.md)
[OFFSET clause](offset.md)
- [UNION ALL clause](union.md)

View File

@ -3,6 +3,7 @@ set (CLICKHOUSE_CLIENT_SOURCES
ConnectionParameters.cpp
QueryFuzzer.cpp
Suggest.cpp
TestHint.cpp
)
set (CLICKHOUSE_CLIENT_LINK

View File

@ -0,0 +1,105 @@
#include "TestHint.h"
#include <Common/Exception.h>
#include <Common/ErrorCodes.h>
#include <IO/ReadBufferFromString.h>
#include <IO/ReadHelpers.h>
#include <Parsers/Lexer.h>
namespace
{
/// Parse error as number or as a string (name of the error code const)
int parseErrorCode(DB::ReadBufferFromString & in)
{
int code = -1;
String code_name;
auto * pos = in.position();
tryReadText(code, in);
if (pos != in.position())
{
return code;
}
/// Try to parse as a string
readStringUntilWhitespace(code_name, in);
return DB::ErrorCodes::getErrorCodeByName(code_name);
}
}
namespace DB
{
TestHint::TestHint(bool enabled_, const String & query_)
: query(query_)
{
if (!enabled_)
return;
// Don't parse error hints in leading comments, because it feels weird.
// Leading 'echo' hint is OK.
bool is_leading_hint = true;
Lexer lexer(query.data(), query.data() + query.size());
for (Token token = lexer.nextToken(); !token.isEnd(); token = lexer.nextToken())
{
if (token.type != TokenType::Comment
&& token.type != TokenType::Whitespace)
{
is_leading_hint = false;
}
else if (token.type == TokenType::Comment)
{
String comment(token.begin, token.begin + token.size());
if (!comment.empty())
{
size_t pos_start = comment.find('{', 0);
if (pos_start != String::npos)
{
size_t pos_end = comment.find('}', pos_start);
if (pos_end != String::npos)
{
String hint(comment.begin() + pos_start + 1, comment.begin() + pos_end);
parse(hint, is_leading_hint);
}
}
}
}
}
}
void TestHint::parse(const String & hint, bool is_leading_hint)
{
ReadBufferFromString in(hint);
String item;
while (!in.eof())
{
readStringUntilWhitespace(item, in);
if (in.eof())
break;
skipWhitespaceIfAny(in);
if (!is_leading_hint)
{
if (item == "serverError")
server_error = parseErrorCode(in);
else if (item == "clientError")
client_error = parseErrorCode(in);
}
if (item == "echo")
echo.emplace(true);
if (item == "echoOn")
echo.emplace(true);
if (item == "echoOff")
echo.emplace(false);
}
}
}
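The hint extraction above (take the text between the first `{` and the following `}` of a comment, then read whitespace-separated items) can be sketched with the standard library alone. This is a simplified illustration under stated assumptions: `ParsedHint` and `parseHintComment` are hypothetical names, only numeric error codes are handled, and the leading-hint distinction from the original is left out.

```cpp
#include <cassert>
#include <optional>
#include <sstream>
#include <string>

// Hypothetical stand-in for the fields TestHint fills in; numeric codes only.
struct ParsedHint
{
    int server_error = 0;
    int client_error = 0;
    std::optional<bool> echo;
};

// Extract the text between the first '{' and the following '}' of a comment,
// then read whitespace-separated items from it.
ParsedHint parseHintComment(const std::string & comment)
{
    ParsedHint result;
    const auto start = comment.find('{');
    if (start == std::string::npos)
        return result;
    const auto end = comment.find('}', start);
    if (end == std::string::npos)
        return result;

    std::istringstream in(comment.substr(start + 1, end - start - 1));
    std::string item;
    while (in >> item)
    {
        if (item == "serverError")
            in >> result.server_error;
        else if (item == "clientError")
            in >> result.client_error;
        else if (item == "echo" || item == "echoOn")
            result.echo = true;
        else if (item == "echoOff")
            result.echo = false;
    }
    return result;
}
```

For example, `parseHintComment("-- { serverError 42 }")` fills `server_error` with 42 and leaves `echo` unset. Resolving an error name to a code would need a name table and is not covered by this sketch.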

View File

@ -1,11 +1,7 @@
#pragma once
#include <memory>
#include <sstream>
#include <iostream>
#include <optional>
#include <Core/Types.h>
#include <Common/Exception.h>
#include <Parsers/Lexer.h>
namespace DB
@ -19,6 +15,10 @@ namespace DB
///
/// - "-- { clientError 20 }" -- in case you are expecting a client error.
///
/// - "-- { serverError FUNCTION_THROW_IF_VALUE_IS_NON_ZERO }" -- by error name.
///
/// - "-- { clientError FUNCTION_THROW_IF_VALUE_IS_NON_ZERO }" -- by error name.
///
/// Remember that the client parses the query first (not the server), so for
/// example if you are expecting syntax error, then you should use
/// clientError not serverError.
@ -43,45 +43,7 @@ namespace DB
class TestHint
{
public:
TestHint(bool enabled_, const String & query_) :
query(query_)
{
if (!enabled_)
return;
// Don't parse error hints in leading comments, because it feels weird.
// Leading 'echo' hint is OK.
bool is_leading_hint = true;
Lexer lexer(query.data(), query.data() + query.size());
for (Token token = lexer.nextToken(); !token.isEnd(); token = lexer.nextToken())
{
if (token.type != TokenType::Comment
&& token.type != TokenType::Whitespace)
{
is_leading_hint = false;
}
else if (token.type == TokenType::Comment)
{
String comment(token.begin, token.begin + token.size());
if (!comment.empty())
{
size_t pos_start = comment.find('{', 0);
if (pos_start != String::npos)
{
size_t pos_end = comment.find('}', pos_start);
if (pos_end != String::npos)
{
String hint(comment.begin() + pos_start + 1, comment.begin() + pos_end);
parse(hint, is_leading_hint);
}
}
}
}
}
}
TestHint(bool enabled_, const String & query_);
int serverError() const { return server_error; }
int clientError() const { return client_error; }
@ -93,34 +55,7 @@ private:
int client_error = 0;
std::optional<bool> echo;
void parse(const String & hint, bool is_leading_hint)
{
std::stringstream ss; // STYLE_CHECK_ALLOW_STD_STRING_STREAM
ss << hint;
String item;
while (!ss.eof())
{
ss >> item;
if (ss.eof())
break;
if (!is_leading_hint)
{
if (item == "serverError")
ss >> server_error;
else if (item == "clientError")
ss >> client_error;
}
if (item == "echo")
echo.emplace(true);
if (item == "echoOn")
echo.emplace(true);
if (item == "echoOff")
echo.emplace(false);
}
}
void parse(const String & hint, bool is_leading_hint);
bool allErrorsExpected(int actual_server_error, int actual_client_error) const
{

View File

@ -97,7 +97,7 @@
#endif
#if USE_SSL
# if USE_INTERNAL_SSL_LIBRARY
# if USE_INTERNAL_SSL_LIBRARY && !defined(ARCADIA_BUILD)
# include <Compression/CompressionCodecEncrypted.h>
# endif
# include <Poco/Net/Context.h>

View File

@ -888,13 +888,13 @@
</query_views_log>
<!-- Uncomment if use part log.
Part log contains information about all actions with parts in MergeTree tables (creation, deletion, merges, downloads).
Part log contains information about all actions with parts in MergeTree tables (creation, deletion, merges, downloads).-->
<part_log>
<database>system</database>
<table>part_log</table>
<partition_by>toYYYYMM(event_date)</partition_by>
<flush_interval_milliseconds>7500</flush_interval_milliseconds>
</part_log>
-->
<!-- Uncomment to write text log into table.
Text log contains all information from usual server log but stores it in structured and efficient way.

View File

@ -741,10 +741,11 @@ query_views_log:
# Uncomment if use part log.
# Part log contains information about all actions with parts in MergeTree tables (creation, deletion, merges, downloads).
# part_log:
# database: system
# table: part_log
# flush_interval_milliseconds: 7500
part_log:
database: system
table: part_log
partition_by: toYYYYMM(event_date)
flush_interval_milliseconds: 7500
# Uncomment to write text log into table.
# Text log contains all information from usual server log but stores it in structured and efficient way.

View File

@ -71,9 +71,6 @@ then
export DEB_CC=${DEB_CC=clang-10}
export DEB_CXX=${DEB_CXX=clang++-10}
EXTRAPACKAGES="$EXTRAPACKAGES clang-10 lld-10"
elif [[ $BUILD_TYPE == 'valgrind' ]]; then
MALLOC_OPTS="-DENABLE_TCMALLOC=0 -DENABLE_JEMALLOC=0"
VERSION_POSTFIX+="+valgrind"
elif [[ $BUILD_TYPE == 'debug' ]]; then
CMAKE_BUILD_TYPE=Debug
VERSION_POSTFIX+="+debug"

View File

@ -122,6 +122,9 @@ struct AccessRightsElement
class AccessRightsElements : public std::vector<AccessRightsElement>
{
public:
using Base = std::vector<AccessRightsElement>;
using Base::Base;
bool empty() const { return std::all_of(begin(), end(), [](const AccessRightsElement & e) { return e.empty(); }); }
bool sameDatabaseAndTable() const

View File

@ -46,7 +46,6 @@ SRCS(
SettingsProfilesInfo.cpp
User.cpp
UsersConfigAccessStorage.cpp
tests/gtest_access_rights_ops.cpp
)

View File

@ -8,7 +8,7 @@ PEERDIR(
SRCS(
<? find . -name '*.cpp' | grep -v -F examples | sed 's/^\.\// /' | sort ?>
<? find . -name '*.cpp' | grep -v -F tests | grep -v -F examples | sed 's/^\.\// /' | sort ?>
)
END()

View File

@ -6,11 +6,12 @@
#include <Columns/ColumnVector.h>
#include <Columns/ColumnDecimal.h>
#include <Columns/ColumnString.h>
#include <Columns/ColumnNullable.h>
#include <DataTypes/IDataType.h>
#include <DataTypes/DataTypesNumber.h>
#include <common/StringRef.h>
#include <Common/assert_cast.h>
#include <DataTypes/DataTypeNullable.h>
#include <AggregateFunctions/IAggregateFunction.h>
#if !defined(ARCADIA_BUILD)
@ -49,6 +50,8 @@ private:
T value;
public:
static constexpr bool is_nullable = false;
bool has() const
{
return has_value;
@ -469,6 +472,8 @@ private:
char small_data[MAX_SMALL_STRING_SIZE]; /// Including the terminating zero.
public:
static constexpr bool is_nullable = false;
bool has() const
{
return size >= 0;
@ -692,6 +697,8 @@ private:
Field value;
public:
static constexpr bool is_nullable = false;
bool has() const
{
return !value.isNull();
@ -975,6 +982,68 @@ struct AggregateFunctionAnyLastData : Data
#endif
};
template <typename Data>
struct AggregateFunctionSingleValueOrNullData : Data
{
static constexpr bool is_nullable = true;
using Self = AggregateFunctionSingleValueOrNullData;
bool first_value = true;
bool is_null = false;
bool changeIfBetter(const IColumn & column, size_t row_num, Arena * arena)
{
if (first_value)
{
first_value = false;
this->change(column, row_num, arena);
return true;
}
else if (!this->isEqualTo(column, row_num))
{
is_null = true;
}
return false;
}
bool changeIfBetter(const Self & to, Arena * arena)
{
if (first_value)
{
first_value = false;
this->change(to, arena);
return true;
}
else if (!this->isEqualTo(to))
{
is_null = true;
}
return false;
}
void insertResultInto(IColumn & to) const
{
if (is_null || first_value)
{
to.insertDefault();
}
else
{
ColumnNullable & col = typeid_cast<ColumnNullable &>(to);
col.getNullMapColumn().insertDefault();
this->Data::insertResultInto(col.getNestedColumn());
}
}
static const char * name() { return "singleValueOrNull"; }
#if USE_EMBEDDED_COMPILER
static constexpr bool is_compilable = false;
#endif
};
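The state machine in `AggregateFunctionSingleValueOrNullData` (remember the first value, mark the result as null once a different value shows up) can be illustrated without any ClickHouse types. The sketch below is a simplified model for `int` values only. The names `SingleValueOrNull` and `aggregate` are hypothetical, and merging of partial states is deliberately left out.

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Hypothetical, simplified model of the singleValueOrNull state for int values.
class SingleValueOrNull
{
public:
    void add(int x)
    {
        if (first_value)
        {
            first_value = false;
            value = x;
        }
        else if (value != x)
        {
            is_null = true; // a second, different value was seen
        }
    }

    // Returns the value only if at least one row was added and all rows were equal.
    std::optional<int> result() const
    {
        if (is_null || first_value)
            return std::nullopt;
        return value;
    }

private:
    bool first_value = true;
    bool is_null = false;
    int value = 0;
};

// Convenience helper for trying the class out (not part of the original code).
std::optional<int> aggregate(const std::vector<int> & values)
{
    SingleValueOrNull state;
    for (int v : values)
        state.add(v);
    return state.result();
}
```

With this model, `aggregate({7, 7, 7})` holds a value, while `aggregate({1, 2})` and `aggregate({})` are both empty, mirroring the `is_null || first_value` check in `insertResultInto`.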
/** Implement 'heavy hitters' algorithm.
* Selects most frequent value if its frequency is more than 50% in each thread of execution.
@ -1074,7 +1143,10 @@ public:
DataTypePtr getReturnType() const override
{
return this->argument_types.at(0);
auto result_type = this->argument_types.at(0);
if constexpr (Data::is_nullable)
return makeNullable(result_type);
return result_type;
}
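The `if constexpr (Data::is_nullable)` branch in `getReturnType` picks the result type at compile time from a flag on the data policy. A minimal sketch of the same pattern, assuming C++17 and using type names as plain strings, might look like this. `PlainData`, `OrNullData` and `returnTypeName` are hypothetical names and do not reproduce the behaviour of `makeNullable`.

```cpp
#include <cassert>
#include <string>

// Hypothetical data policies: each exposes a compile-time is_nullable flag.
struct PlainData { static constexpr bool is_nullable = false; };
struct OrNullData { static constexpr bool is_nullable = true; };

// Pick the result type name at compile time based on the policy flag.
// Only the selected branch is instantiated for a given Data.
template <typename Data>
std::string returnTypeName(const std::string & argument_type)
{
    if constexpr (Data::is_nullable)
        return "Nullable(" + argument_type + ")";
    else
        return argument_type;
}
```

Because the flag is a `static constexpr` member, every existing data struct only needs one extra line to opt in or out, which is the approach the diff takes.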
void add(AggregateDataPtr __restrict place, const IColumn ** columns, size_t row_num, Arena * arena) const override

View File

@ -0,0 +1,27 @@
#include <AggregateFunctions/AggregateFunctionFactory.h>
#include <AggregateFunctions/HelpersMinMaxAny.h>
#include <AggregateFunctions/FactoryHelpers.h>
#include "registerAggregateFunctions.h"
namespace DB
{
struct Settings;
namespace
{
AggregateFunctionPtr createAggregateFunctionSingleValueOrNull(const std::string & name, const DataTypes & argument_types, const Array & parameters, const Settings * settings)
{
return AggregateFunctionPtr(createAggregateFunctionSingleValue<AggregateFunctionsSingleValue, AggregateFunctionSingleValueOrNullData>(name, argument_types, parameters, settings));
}
}
void registerAggregateFunctionSingleValueOrNull(AggregateFunctionFactory & factory)
{
factory.registerFunction("singleValueOrNull", createAggregateFunctionSingleValueOrNull);
}
}

View File

@ -48,6 +48,7 @@ void registerAggregateFunctionRankCorrelation(AggregateFunctionFactory &);
void registerAggregateFunctionMannWhitney(AggregateFunctionFactory &);
void registerAggregateFunctionWelchTTest(AggregateFunctionFactory &);
void registerAggregateFunctionStudentTTest(AggregateFunctionFactory &);
void registerAggregateFunctionSingleValueOrNull(AggregateFunctionFactory &);
void registerAggregateFunctionSequenceNextNode(AggregateFunctionFactory &);
class AggregateFunctionCombinatorFactory;
@ -113,6 +114,7 @@ void registerAggregateFunctions()
registerAggregateFunctionSequenceNextNode(factory);
registerAggregateFunctionWelchTTest(factory);
registerAggregateFunctionStudentTTest(factory);
registerAggregateFunctionSingleValueOrNull(factory);
registerWindowFunctions(factory);

View File

@ -1,4 +1,5 @@
#include <Common/ErrorCodes.h>
#include <Common/Exception.h>
#include <chrono>
/** Previously, these constants were located in one enum.
@ -563,6 +564,8 @@
M(593, ZERO_COPY_REPLICATION_ERROR) \
M(594, BZIP2_STREAM_DECODER_FAILED) \
M(595, BZIP2_STREAM_ENCODER_FAILED) \
M(596, INTERSECT_OR_EXCEPT_RESULT_STRUCTURES_MISMATCH) \
M(597, NO_SUCH_ERROR_CODE) \
\
M(998, POSTGRESQL_CONNECTION_FAILURE) \
M(999, KEEPER_EXCEPTION) \
@ -601,6 +604,21 @@ namespace ErrorCodes
return error_codes_names.names[error_code];
}
ErrorCode getErrorCodeByName(std::string_view error_name)
{
for (size_t i = 0, end = ErrorCodes::end(); i < end; ++i)
{
std::string_view name = ErrorCodes::getName(i);
if (name.empty())
continue;
if (name == error_name)
return i;
}
throw Exception(NO_SUCH_ERROR_CODE, "No error code with name: '{}'", error_name);
}
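The lookup in `getErrorCodeByName` is a linear scan over a name table that skips unused (empty) slots. A self-contained sketch of the same idea follows. The table `error_names` and its entries are made up for illustration, and the sketch returns `std::nullopt` for an unknown name where the original throws `NO_SUCH_ERROR_CODE`.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <optional>
#include <string_view>

// Hypothetical sparse name table indexed by error code; empty entries are unused codes.
constexpr std::string_view error_names[] = {"FIRST_ERROR", "", "THIRD_ERROR"};

// Linear scan over the table: O(N), acceptable when lookups are rare.
std::optional<int> findErrorCodeByName(std::string_view error_name)
{
    for (std::size_t i = 0; i < std::size(error_names); ++i)
    {
        if (error_names[i].empty())
            continue; // skip holes in the code range
        if (error_names[i] == error_name)
            return static_cast<int>(i);
    }
    return std::nullopt; // unlike the original, report "not found" without throwing
}
```

Skipping empty names also means that looking up an empty string never matches an unused slot.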
ErrorCode end() { return END + 1; }
void increment(ErrorCode error_code, bool remote, const std::string & message, const FramePointers & trace)

View File

@ -25,6 +25,12 @@ namespace ErrorCodes
/// Get name of error_code by identifier.
/// Returns statically allocated string.
std::string_view getName(ErrorCode error_code);
/// Get error code value by name.
///
/// It has O(N) complexity, but this is not a major concern, since it is
/// used only for test hints, and it is not worth keeping another structure
/// for this.
ErrorCode getErrorCodeByName(std::string_view error_name);
struct Error
{

View File

@ -360,7 +360,7 @@ void MemoryTracker::setOrRaiseHardLimit(Int64 value)
{
/// This is just atomic set to maximum.
Int64 old_value = hard_limit.load(std::memory_order_relaxed);
while (old_value < value && !hard_limit.compare_exchange_weak(old_value, value))
while ((value == 0 || old_value < value) && !hard_limit.compare_exchange_weak(old_value, value))
;
}
@ -368,6 +368,6 @@ void MemoryTracker::setOrRaiseHardLimit(Int64 value)
void MemoryTracker::setOrRaiseProfilerLimit(Int64 value)
{
Int64 old_value = profiler_limit.load(std::memory_order_relaxed);
while (old_value < value && !profiler_limit.compare_exchange_weak(old_value, value))
while ((value == 0 || old_value < value) && !profiler_limit.compare_exchange_weak(old_value, value))
;
}
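The changed loop condition can be tried in isolation with a plain `std::atomic`. The sketch below mirrors the condition from the diff. `setOrRaise` is a hypothetical free function, and reading a `value` of `0` as "no limit" is an assumption of this sketch, not something the diff states.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Raise `limit` to `value` if `value` is larger; a `value` of 0 is always stored.
// Treating 0 as "no limit" is an assumption of this sketch.
void setOrRaise(std::atomic<int64_t> & limit, int64_t value)
{
    int64_t old_value = limit.load(std::memory_order_relaxed);
    // compare_exchange_weak reloads old_value on failure, so the condition is
    // re-checked against the latest stored value on every iteration.
    while ((value == 0 || old_value < value) && !limit.compare_exchange_weak(old_value, value))
        ;
}
```

Starting from a limit of 10, a call with 5 leaves it unchanged, a call with 20 raises it, and a call with 0 stores 0. Before the change, the last call would have been a no-op because `old_value < value` is false.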

View File

@ -0,0 +1,24 @@
#pragma once
/// SparseHashMap is a wrapper for google::sparse_hash_map.
#if defined(ARCADIA_BUILD)
#define HASH_FUN_H <unordered_map>
template <typename T>
struct THash;
#endif
#include <sparsehash/sparse_hash_map>
#if !defined(ARCADIA_BUILD)
template <class Key, class T, class HashFcn = std::hash<Key>,
class EqualKey = std::equal_to<Key>,
class Alloc = google::libc_allocator_with_realloc<std::pair<const Key, T>>>
using SparseHashMap = google::sparse_hash_map<Key, T, HashFcn, EqualKey, Alloc>;
#else
template <class Key, class T, class HashFcn = std::hash<Key>,
class EqualKey = std::equal_to<Key>,
class Alloc = google::sparsehash::libc_allocator_with_realloc<std::pair<const Key, T>>>
using SparseHashMap = google::sparsehash::sparse_hash_map<Key, T, HashFcn, EqualKey, Alloc>;
#undef THash
#endif

View File

@ -102,6 +102,7 @@ SRCS(
ZooKeeper/ZooKeeperNodeCache.cpp
checkStackSize.cpp
clearPasswordFromCommandLine.cpp
clickhouse_malloc.cpp
createHardLink.cpp
escapeForFileName.cpp
filesystemHelpers.cpp
@ -116,6 +117,7 @@ SRCS(
hex.cpp
isLocalAddress.cpp
malloc.cpp
memory.cpp
new_delete.cpp
parseAddress.cpp
parseGlobs.cpp

View File

@ -30,16 +30,18 @@ void CompressedWriteBuffer::nextImpl()
compressed_buffer.resize(compressed_reserve_size);
UInt32 compressed_size = codec->compress(working_buffer.begin(), decompressed_size, compressed_buffer.data());
// FIXME remove this after fixing msan report in lz4.
// Almost always reproduces on stateless tests, the exact test unknown.
__msan_unpoison(compressed_buffer.data(), compressed_size);
CityHash_v1_0_2::uint128 checksum = CityHash_v1_0_2::CityHash128(compressed_buffer.data(), compressed_size);
out.write(reinterpret_cast<const char *>(&checksum), CHECKSUM_SIZE);
out.write(compressed_buffer.data(), compressed_size);
}
void CompressedWriteBuffer::finalize()
{
next();
}
CompressedWriteBuffer::CompressedWriteBuffer(
WriteBuffer & out_,
CompressionCodecPtr codec_,
@ -48,6 +50,7 @@ CompressedWriteBuffer::CompressedWriteBuffer(
{
}
CompressedWriteBuffer::~CompressedWriteBuffer()
{
/// FIXME move final flush into the caller

View File

@ -29,6 +29,8 @@ public:
CompressionCodecPtr codec_ = CompressionCodecFactory::instance().getDefaultCodec(),
size_t buf_size = DBMS_DEFAULT_BUFFER_SIZE);
void finalize() override;
/// The amount of compressed data
size_t getCompressedBytes()
{

View File

@ -1,13 +1,15 @@
#include <Common/config.h>
#if !defined(ARCADIA_BUILD)
# include <Common/config.h>
#endif
#include <Compression/CompressionFactory.h>
#if USE_SSL && USE_INTERNAL_SSL_LIBRARY
#include <Compression/CompressionCodecEncrypted.h>
#include <Parsers/ASTLiteral.h>
#include <cassert>
#include <openssl/digest.h>
#include <openssl/digest.h> // Y_IGNORE
#include <openssl/err.h>
#include <openssl/hkdf.h>
#include <openssl/hkdf.h> // Y_IGNORE
#include <string_view>
namespace DB

View File

@ -2,11 +2,11 @@
// This depends on BoringSSL-specific API, notably <openssl/aead.h>.
#include <Common/config.h>
#if USE_SSL && USE_INTERNAL_SSL_LIBRARY
#if USE_SSL && USE_INTERNAL_SSL_LIBRARY && !defined(ARCADIA_BUILD)
#include <Compression/ICompressionCodec.h>
#include <boost/noncopyable.hpp>
#include <openssl/aead.h>
#include <openssl/aead.h> // Y_IGNORE
#include <optional>
namespace DB

View File

@ -6,7 +6,7 @@
namespace DB
{
/** Common part for implementation of MySQLBlockInputStream, MongoDBBlockInputStream and others.
/** Common part for implementation of MySQLSource, MongoDBSource and others.
*/
struct ExternalResultDescription
{

View File

@ -1,4 +1,5 @@
#include <Core/NamesAndTypes.h>
#include <Common/HashTable/HashMap.h>
#include <DataTypes/DataTypeFactory.h>
#include <IO/ReadBuffer.h>
#include <IO/WriteBuffer.h>
@ -6,7 +7,6 @@
#include <IO/WriteHelpers.h>
#include <IO/ReadBufferFromString.h>
#include <IO/WriteBufferFromString.h>
#include <sparsehash/dense_hash_map>
namespace DB
@ -163,12 +163,7 @@ NamesAndTypesList NamesAndTypesList::filter(const Names & names) const
NamesAndTypesList NamesAndTypesList::addTypes(const Names & names) const
{
/// NOTE: It's better to make a map in `IStorage` than to create it here every time again.
#if !defined(ARCADIA_BUILD)
google::dense_hash_map<StringRef, const DataTypePtr *, StringRefHash> types;
#else
google::sparsehash::dense_hash_map<StringRef, const DataTypePtr *, StringRefHash> types;
#endif
types.set_empty_key(StringRef());
HashMapWithSavedHash<StringRef, const DataTypePtr *, StringRefHash> types;
for (const auto & column : *this)
types[column.name] = &column.type;
@ -176,10 +171,11 @@ NamesAndTypesList NamesAndTypesList::addTypes(const Names & names) const
NamesAndTypesList res;
for (const String & name : names)
{
auto it = types.find(name);
const auto * it = types.find(name);
if (it == types.end())
throw Exception("No column " + name, ErrorCodes::THERE_IS_NO_COLUMN);
res.emplace_back(name, *it->second);
throw Exception(ErrorCodes::THERE_IS_NO_COLUMN, "No column {}", name);
res.emplace_back(name, *it->getMapped());
}
return res;

View File

@ -1,3 +1,5 @@
#include "MongoDBSource.h"
#include <string>
#include <vector>
@ -15,7 +17,6 @@
#include <Common/assert_cast.h>
#include <Common/quoteString.h>
#include <common/range.h>
#include <DataStreams/MongoDBBlockInputStream.h>
#include <Poco/URI.h>
#include <Poco/Util/AbstractConfiguration.h>
#include <Poco/Version.h>

View File

@ -1,4 +1,4 @@
#include "PostgreSQLBlockInputStream.h"
#include "PostgreSQLSource.h"
#if USE_LIBPQXX
#include <Columns/ColumnNullable.h>

View File

@ -342,19 +342,7 @@ void PushingToViewsBlockOutputStream::writeSuffix()
runViewStage(view, stage_step, [&] { process_suffix(view); });
if (view.exception)
{
exception_count.fetch_add(1, std::memory_order_relaxed);
}
else
{
LOG_TRACE(
log,
"Pushing (parallel {}) from {} to {} took {} ms.",
max_threads,
storage->getStorageID().getNameForLogs(),
view.table_id.getNameForLogs(),
view.runtime_stats.elapsed_ms);
}
});
}
pool.wait();
@ -371,20 +359,22 @@ void PushingToViewsBlockOutputStream::writeSuffix()
}
runViewStage(view, stage_step, [&] { process_suffix(view); });
if (view.exception)
{
exception_happened = true;
}
else
}
for (auto & view : views)
{
if (!view.exception)
LOG_TRACE(
log,
"Pushing (sequentially) from {} to {} took {} ms.",
"Pushing ({}) from {} to {} took {} ms.",
max_threads <= 1 ? "sequentially" : ("parallel " + std::to_string(max_threads)),
storage->getStorageID().getNameForLogs(),
view.table_id.getNameForLogs(),
view.runtime_stats.elapsed_ms);
}
}
}
if (exception_happened)
checkExceptionsInViews();

View File

@ -1,4 +1,4 @@
#include "SQLiteBlockInputStream.h"
#include "SQLiteSource.h"
#if USE_SQLITE
#include <common/range.h>
@ -22,21 +22,18 @@ namespace ErrorCodes
extern const int SQLITE_ENGINE_ERROR;
}
SQLiteBlockInputStream::SQLiteBlockInputStream(
SQLiteSource::SQLiteSource(
SQLitePtr sqlite_db_,
const String & query_str_,
const Block & sample_block,
const UInt64 max_block_size_)
: query_str(query_str_)
: SourceWithProgress(sample_block.cloneEmpty())
, query_str(query_str_)
, max_block_size(max_block_size_)
, sqlite_db(std::move(sqlite_db_))
{
description.init(sample_block);
}
void SQLiteBlockInputStream::readPrefix()
{
sqlite3_stmt * compiled_stmt = nullptr;
int status = sqlite3_prepare_v2(sqlite_db.get(), query_str.c_str(), query_str.size() + 1, &compiled_stmt, nullptr);
@ -48,11 +45,10 @@ void SQLiteBlockInputStream::readPrefix()
compiled_statement = std::unique_ptr<sqlite3_stmt, StatementDeleter>(compiled_stmt, StatementDeleter());
}
Block SQLiteBlockInputStream::readImpl()
Chunk SQLiteSource::generate()
{
if (!compiled_statement)
return Block();
return {};
MutableColumns columns = description.sample_block.cloneEmptyColumns();
size_t num_rows = 0;
@ -78,25 +74,25 @@ Block SQLiteBlockInputStream::readImpl()
}
int column_count = sqlite3_column_count(compiled_statement.get());
for (const auto idx : collections::range(0, column_count))
{
const auto & sample = description.sample_block.getByPosition(idx);
if (sqlite3_column_type(compiled_statement.get(), idx) == SQLITE_NULL)
for (int column_index = 0; column_index < column_count; ++column_index)
{
insertDefaultSQLiteValue(*columns[idx], *sample.column);
if (sqlite3_column_type(compiled_statement.get(), column_index) == SQLITE_NULL)
{
columns[column_index]->insertDefault();
continue;
}
if (description.types[idx].second)
auto & [type, is_nullable] = description.types[column_index];
if (is_nullable)
{
ColumnNullable & column_nullable = assert_cast<ColumnNullable &>(*columns[idx]);
insertValue(column_nullable.getNestedColumn(), description.types[idx].first, idx);
ColumnNullable & column_nullable = assert_cast<ColumnNullable &>(*columns[column_index]);
insertValue(column_nullable.getNestedColumn(), type, column_index);
column_nullable.getNullMapData().emplace_back(0);
}
else
{
insertValue(*columns[idx], description.types[idx].first, idx);
insertValue(*columns[column_index], type, column_index);
}
}
@ -104,18 +100,16 @@ Block SQLiteBlockInputStream::readImpl()
break;
}
return description.sample_block.cloneWithColumns(std::move(columns));
}
void SQLiteBlockInputStream::readSuffix()
{
if (compiled_statement)
if (num_rows == 0)
{
compiled_statement.reset();
return {};
}
return Chunk(std::move(columns), num_rows);
}
void SQLiteBlockInputStream::insertValue(IColumn & column, const ExternalResultDescription::ValueType type, size_t idx)
void SQLiteSource::insertValue(IColumn & column, ExternalResultDescription::ValueType type, size_t idx)
{
switch (type)
{

View File

@ -6,32 +6,28 @@
#if USE_SQLITE
#include <Core/ExternalResultDescription.h>
#include <DataStreams/IBlockInputStream.h>
#include <Processors/Sources/SourceWithProgress.h>
#include <sqlite3.h> // Y_IGNORE
namespace DB
{
class SQLiteBlockInputStream : public IBlockInputStream
class SQLiteSource : public SourceWithProgress
{
using SQLitePtr = std::shared_ptr<sqlite3>;
public:
SQLiteBlockInputStream(SQLitePtr sqlite_db_,
SQLiteSource(SQLitePtr sqlite_db_,
const String & query_str_,
const Block & sample_block,
UInt64 max_block_size_);
String getName() const override { return "SQLite"; }
Block getHeader() const override { return description.sample_block.cloneEmpty(); }
private:
void insertDefaultSQLiteValue(IColumn & column, const IColumn & sample_column)
{
column.insertFrom(sample_column, 0);
}
using ValueType = ExternalResultDescription::ValueType;
@ -40,19 +36,14 @@ private:
void operator()(sqlite3_stmt * stmt) { sqlite3_finalize(stmt); }
};
void readPrefix() override;
Chunk generate() override;
Block readImpl() override;
void readSuffix() override;
void insertValue(IColumn & column, const ExternalResultDescription::ValueType type, size_t idx);
void insertValue(IColumn & column, ExternalResultDescription::ValueType type, size_t idx);
String query_str;
UInt64 max_block_size;
ExternalResultDescription description;
SQLitePtr sqlite_db;
std::unique_ptr<sqlite3_stmt, StatementDeleter> compiled_statement;
};

View File

@ -29,7 +29,7 @@ SRCS(
ITTLAlgorithm.cpp
InternalTextLogsRowOutputStream.cpp
MaterializingBlockInputStream.cpp
MongoDBBlockInputStream.cpp
MongoDBSource.cpp
NativeBlockInputStream.cpp
NativeBlockOutputStream.cpp
PushingToViewsBlockOutputStream.cpp
@ -37,7 +37,7 @@ SRCS(
RemoteBlockOutputStream.cpp
RemoteQueryExecutor.cpp
RemoteQueryExecutorReadContext.cpp
SQLiteBlockInputStream.cpp
SQLiteSource.cpp
SizeLimits.cpp
SquashingBlockInputStream.cpp
SquashingBlockOutputStream.cpp

View File

@ -79,7 +79,7 @@ void DataTypeMap::assertKeyType() const
std::string DataTypeMap::doGetName() const
{
WriteBufferFromOwnString s;
s << "Map(" << key_type->getName() << "," << value_type->getName() << ")";
s << "Map(" << key_type->getName() << ", " << value_type->getName() << ")";
return s.str();
}

View File

@ -11,7 +11,7 @@
# include <DataTypes/convertMySQLDataType.h>
# include <Databases/MySQL/DatabaseMySQL.h>
# include <Databases/MySQL/FetchTablesColumnsList.h>
# include <Formats/MySQLBlockInputStream.h>
# include <Formats/MySQLSource.h>
# include <Processors/Executors/PullingPipelineExecutor.h>
# include <Processors/QueryPipeline.h>
# include <IO/Operators.h>

View File

@ -10,7 +10,7 @@
#include <DataTypes/DataTypesNumber.h>
#include <Processors/Executors/PullingPipelineExecutor.h>
#include <Processors/QueryPipeline.h>
#include <Formats/MySQLBlockInputStream.h>
#include <Formats/MySQLSource.h>
#include <IO/WriteBufferFromString.h>
#include <IO/WriteHelpers.h>
#include <IO/Operators.h>

View File

@ -5,7 +5,7 @@
#include <Core/Block.h>
#include <DataTypes/DataTypeString.h>
#include <DataTypes/DataTypesNumber.h>
#include <Formats/MySQLBlockInputStream.h>
#include <Formats/MySQLSource.h>
#include <Processors/Executors/PullingPipelineExecutor.h>
#include <Processors/QueryPipeline.h>
#include <IO/ReadBufferFromFile.h>

View File

@ -16,7 +16,7 @@
# include <DataStreams/copyData.h>
# include <Databases/MySQL/DatabaseMaterializedMySQL.h>
# include <Databases/MySQL/MaterializeMetadata.h>
# include <Formats/MySQLBlockInputStream.h>
# include <Formats/MySQLSource.h>
# include <IO/ReadBufferFromString.h>
# include <Interpreters/Context.h>
# include <Interpreters/executeQuery.h>

View File

@ -36,10 +36,10 @@ void registerDictionarySourceCassandra(DictionarySourceFactory & factory)
#if USE_CASSANDRA
#include <IO/WriteHelpers.h>
#include <Common/SipHash.h>
#include "CassandraBlockInputStream.h"
#include <common/logger_useful.h>
#include <Common/SipHash.h>
#include <IO/WriteHelpers.h>
#include <Dictionaries/CassandraSource.h>
namespace DB
{

View File

@ -10,7 +10,7 @@
#include <Columns/ColumnsNumber.h>
#include <Core/ExternalResultDescription.h>
#include <IO/ReadHelpers.h>
#include "CassandraBlockInputStream.h"
#include "CassandraSource.h"
namespace DB

View File

@ -5,7 +5,7 @@
#include <variant>
#include <optional>
#include <sparsehash/sparse_hash_map>
#include <Common/SparseHashMap.h>
#include <Common/HashTable/HashMap.h>
#include <Common/HashTable/HashSet.h>
@ -125,14 +125,6 @@ private:
HashMap<UInt64, Value>,
HashMapWithSavedHash<StringRef, Value, DefaultHash<StringRef>>>;
#if !defined(ARCADIA_BUILD)
template <typename Key, typename Value>
using SparseHashMap = google::sparse_hash_map<Key, Value, DefaultHash<Key>>;
#else
template <typename Key, typename Value>
using SparseHashMap = google::sparsehash::sparse_hash_map<Key, Value, DefaultHash<Key>>;
#endif
template <typename Value>
using CollectionTypeSparse = std::conditional_t<
dictionary_key_type == DictionaryKeyType::simple,

View File

@ -50,7 +50,7 @@ void registerDictionarySourceMongoDB(DictionarySourceFactory & factory)
// Poco/MongoDB/BSONWriter.h:54: void writeCString(const std::string & value);
// src/IO/WriteHelpers.h:146 #define writeCString(s, buf)
#include <IO/WriteHelpers.h>
#include <DataStreams/MongoDBBlockInputStream.h>
#include <DataStreams/MongoDBSource.h>
namespace DB

View File

@ -12,7 +12,7 @@
# include "DictionaryStructure.h"
# include "ExternalQueryBuilder.h"
# include "IDictionarySource.h"
# include <Formats/MySQLBlockInputStream.h>
# include <Formats/MySQLSource.h>
namespace Poco
{

View File

@ -7,7 +7,7 @@
#if USE_LIBPQXX
#include <Columns/ColumnString.h>
#include <DataTypes/DataTypeString.h>
#include <DataStreams/PostgreSQLBlockInputStream.h>
#include <DataStreams/PostgreSQLSource.h>
#include "readInvalidateQuery.h"
#include <Interpreters/Context.h>
#endif

View File

@ -31,7 +31,7 @@ void registerDictionarySourceRedis(DictionarySourceFactory & factory)
#include <IO/WriteHelpers.h>
#include "RedisBlockInputStream.h"
#include "RedisSource.h"
namespace DB

View File

@ -1,4 +1,4 @@
#include "RedisBlockInputStream.h"
#include "RedisSource.h"
#include <string>
#include <vector>

View File

@ -222,7 +222,7 @@ Pipe XDBCDictionarySource::loadFromQuery(const Poco::URI & url, const Block & re
};
auto read_buf = std::make_unique<ReadWriteBufferFromHTTP>(url, Poco::Net::HTTPRequest::HTTP_POST, write_body_callback, timeouts);
auto format = FormatFactory::instance().getInput(IXDBCBridgeHelper::DEFAULT_FORMAT, *read_buf, sample_block, getContext(), max_block_size);
auto format = FormatFactory::instance().getInput(IXDBCBridgeHelper::DEFAULT_FORMAT, *read_buf, required_sample_block, getContext(), max_block_size);
format->addBuffer(std::move(read_buf));
return Pipe(std::move(format));

@ -22,13 +22,13 @@ NO_COMPILER_WARNINGS()
SRCS(
CacheDictionary.cpp
CacheDictionaryUpdateQueue.cpp
CassandraBlockInputStream.cpp
CassandraDictionarySource.cpp
CassandraHelpers.cpp
CassandraSource.cpp
ClickHouseDictionarySource.cpp
DictionaryBlockInputStream.cpp
DictionaryBlockInputStreamBase.cpp
DictionaryFactory.cpp
DictionarySource.cpp
DictionarySourceBase.cpp
DictionarySourceFactory.cpp
DictionarySourceHelpers.cpp
DictionaryStructure.cpp
@ -57,8 +57,8 @@ SRCS(
PolygonDictionaryImplementations.cpp
PolygonDictionaryUtils.cpp
RangeHashedDictionary.cpp
RedisBlockInputStream.cpp
RedisDictionarySource.cpp
RedisSource.cpp
XDBCDictionarySource.cpp
getDictionaryConfigurationFromAST.cpp
readInvalidateQuery.cpp

@ -19,7 +19,7 @@
#include <Common/assert_cast.h>
#include <common/range.h>
#include <common/logger_useful.h>
#include "MySQLBlockInputStream.h"
#include "MySQLSource.h"
namespace DB

@ -58,7 +58,7 @@ protected:
ExternalResultDescription description;
};
/// Like MySQLBlockInputStream, but allocates connection only when reading is starting.
/// Like MySQLSource, but allocates connection only when reading is starting.
/// This allows creating many stream objects without occupying the whole connection pool.
/// It also attempts to reconnect when a connection fails.
class MySQLWithFailoverSource final : public MySQLSource

@ -14,7 +14,7 @@ SRCS(
FormatFactory.cpp
FormatSchemaInfo.cpp
JSONEachRowUtils.cpp
MySQLBlockInputStream.cpp
MySQLSource.cpp
NativeFormat.cpp
NullFormat.cpp
ParsedTemplateFormatString.cpp

@ -0,0 +1,19 @@
#include <Functions/FunctionFactory.h>
#include <Functions/CastOverloadResolver.h>
namespace DB
{
void registerCastOverloadResolvers(FunctionFactory & factory)
{
factory.registerFunction<CastInternalOverloadResolver<CastType::nonAccurate>>(FunctionFactory::CaseInsensitive);
factory.registerFunction<CastInternalOverloadResolver<CastType::accurate>>();
factory.registerFunction<CastInternalOverloadResolver<CastType::accurateOrNull>>();
factory.registerFunction<CastOverloadResolver<CastType::nonAccurate>>(FunctionFactory::CaseInsensitive);
factory.registerFunction<CastOverloadResolver<CastType::accurate>>();
factory.registerFunction<CastOverloadResolver<CastType::accurateOrNull>>();
}
}

@ -0,0 +1,121 @@
#pragma once
#include <Functions/FunctionsConversion.h>
namespace DB
{
namespace ErrorCodes
{
extern const int ILLEGAL_TYPE_OF_ARGUMENT;
}
/*
* CastInternal does not preserve nullability of the data type,
* i.e. CastInternal(toNullable(toInt8(1)) as Int32) will be Int32(1).
*
* Cast preserves nullability according to setting `cast_keep_nullable`,
* i.e. Cast(toNullable(toInt8(1)) as Int32) will be Nullable(Int32(1)) if `cast_keep_nullable` == 1.
**/
template<CastType cast_type, bool internal, typename CastName, typename FunctionName>
class CastOverloadResolverImpl : public IFunctionOverloadResolver
{
public:
using MonotonicityForRange = FunctionCastBase::MonotonicityForRange;
using Diagnostic = FunctionCastBase::Diagnostic;
static constexpr auto name = cast_type == CastType::accurate
? CastName::accurate_cast_name
: (cast_type == CastType::accurateOrNull ? CastName::accurate_cast_or_null_name : CastName::cast_name);
String getName() const override { return name; }
size_t getNumberOfArguments() const override { return 2; }
ColumnNumbers getArgumentsThatAreAlwaysConstant() const override { return {1}; }
explicit CastOverloadResolverImpl(std::optional<Diagnostic> diagnostic_, bool keep_nullable_)
: diagnostic(std::move(diagnostic_)), keep_nullable(keep_nullable_)
{
}
static FunctionOverloadResolverPtr create(ContextPtr context)
{
if constexpr (internal)
return createImpl();
return createImpl({}, context->getSettingsRef().cast_keep_nullable);
}
static FunctionOverloadResolverPtr createImpl(std::optional<Diagnostic> diagnostic = {}, bool keep_nullable = false)
{
assert(!internal || !keep_nullable);
return std::make_unique<CastOverloadResolverImpl>(std::move(diagnostic), keep_nullable);
}
protected:
FunctionBasePtr buildImpl(const ColumnsWithTypeAndName & arguments, const DataTypePtr & return_type) const override
{
DataTypes data_types(arguments.size());
for (size_t i = 0; i < arguments.size(); ++i)
data_types[i] = arguments[i].type;
auto monotonicity = MonotonicityHelper::getMonotonicityInformation(arguments.front().type, return_type.get());
return std::make_unique<FunctionCast<FunctionName>>(name, std::move(monotonicity), data_types, return_type, diagnostic, cast_type);
}
DataTypePtr getReturnTypeImpl(const ColumnsWithTypeAndName & arguments) const override
{
const auto & column = arguments.back().column;
if (!column)
throw Exception("Second argument to " + getName() + " must be a constant string describing type."
" Instead there is non-constant column of type " + arguments.back().type->getName(),
ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT);
const auto * type_col = checkAndGetColumnConst<ColumnString>(column.get());
if (!type_col)
throw Exception("Second argument to " + getName() + " must be a constant string describing type."
" Instead there is a column with the following structure: " + column->dumpStructure(),
ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT);
DataTypePtr type = DataTypeFactory::instance().get(type_col->getValue<String>());
if constexpr (cast_type == CastType::accurateOrNull)
return makeNullable(type);
if constexpr (internal)
return type;
if (keep_nullable && arguments.front().type->isNullable() && type->canBeInsideNullable())
return makeNullable(type);
return type;
}
bool useDefaultImplementationForNulls() const override { return false; }
bool useDefaultImplementationForLowCardinalityColumns() const override { return false; }
private:
std::optional<Diagnostic> diagnostic;
bool keep_nullable;
};
struct CastOverloadName
{
static constexpr auto cast_name = "CAST";
static constexpr auto accurate_cast_name = "accurateCast";
static constexpr auto accurate_cast_or_null_name = "accurateCastOrNull";
};
struct CastInternalOverloadName
{
static constexpr auto cast_name = "_CAST";
static constexpr auto accurate_cast_name = "accurate_Cast";
static constexpr auto accurate_cast_or_null_name = "accurate_CastOrNull";
};
template <CastType cast_type> using CastOverloadResolver = CastOverloadResolverImpl<cast_type, false, CastOverloadName, CastName>;
template <CastType cast_type> using CastInternalOverloadResolver = CastOverloadResolverImpl<cast_type, true, CastInternalOverloadName, CastInternalName>;
}
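
The nullability rule described in the header comment above can be sketched in isolation. This is a minimal model, not the ClickHouse implementation: `SimpleType`, `makeNullable` and `resolveReturnType` are hypothetical stand-ins, and only the order of checks is taken from `getReturnTypeImpl`.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for illustration only; these are not ClickHouse types.
enum class CastType { nonAccurate, accurate, accurateOrNull };

struct SimpleType
{
    std::string name;
    bool nullable = false;
    bool can_be_inside_nullable = true;
};

SimpleType makeNullable(SimpleType type)
{
    type.nullable = true;
    return type;
}

// Follows the order of checks in getReturnTypeImpl:
// 1. accurateOrNull always yields a nullable result;
// 2. the internal variant never adds nullability on its own;
// 3. the user-facing variant keeps nullability only when keep_nullable is set.
template <CastType cast_type, bool internal>
SimpleType resolveReturnType(const SimpleType & from, const SimpleType & to, bool keep_nullable)
{
    if constexpr (cast_type == CastType::accurateOrNull)
        return makeNullable(to);
    else if constexpr (internal)
        return to;
    else
    {
        if (keep_nullable && from.nullable && to.can_be_inside_nullable)
            return makeNullable(to);
        return to;
    }
}
```

Under these assumptions, casting a nullable `Int8` to `Int32` gives a nullable result only for the user-facing variant with `keep_nullable` enabled; the internal variant returns plain `Int32` regardless of the setting.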

@ -7,6 +7,8 @@ namespace DB
void registerFunctionFixedString(FunctionFactory & factory);
void registerCastOverloadResolvers(FunctionFactory & factory);
void registerFunctionsConversion(FunctionFactory & factory)
{
factory.registerFunction<FunctionToUInt8>();
@ -43,9 +45,7 @@ void registerFunctionsConversion(FunctionFactory & factory)
factory.registerFunction<FunctionToUnixTimestamp>();
factory.registerFunction<CastOverloadResolver<CastType::nonAccurate>>(FunctionFactory::CaseInsensitive);
factory.registerFunction<CastOverloadResolver<CastType::accurate>>();
factory.registerFunction<CastOverloadResolver<CastType::accurateOrNull>>();
registerCastOverloadResolvers(factory);
factory.registerFunction<FunctionToUInt8OrZero>();
factory.registerFunction<FunctionToUInt16OrZero>();

@ -2412,7 +2412,8 @@ private:
std::optional<Diagnostic> diagnostic;
};
struct NameCast { static constexpr auto name = "CAST"; };
struct CastName { static constexpr auto name = "CAST"; };
struct CastInternalName { static constexpr auto name = "_CAST"; };
enum class CastType
{
@ -2421,17 +2422,26 @@ enum class CastType
accurateOrNull
};
class FunctionCast final : public IFunctionBase
class FunctionCastBase : public IFunctionBase
{
public:
using MonotonicityForRange = std::function<Monotonicity(const IDataType &, const Field &, const Field &)>;
using Diagnostic = ExecutableFunctionCast::Diagnostic;
};
template <typename FunctionName>
class FunctionCast final : public FunctionCastBase
{
public:
using WrapperType = std::function<ColumnPtr(ColumnsWithTypeAndName &, const DataTypePtr &, const ColumnNullable *, size_t)>;
using MonotonicityForRange = std::function<Monotonicity(const IDataType &, const Field &, const Field &)>;
using Diagnostic = ExecutableFunctionCast::Diagnostic;
FunctionCast(const char * name_, MonotonicityForRange && monotonicity_for_range_
, const DataTypes & argument_types_, const DataTypePtr & return_type_
, std::optional<Diagnostic> diagnostic_, CastType cast_type_)
: name(name_), monotonicity_for_range(std::move(monotonicity_for_range_))
FunctionCast(const char * cast_name_
, MonotonicityForRange && monotonicity_for_range_
, const DataTypes & argument_types_
, const DataTypePtr & return_type_
, std::optional<Diagnostic> diagnostic_
, CastType cast_type_)
: cast_name(cast_name_), monotonicity_for_range(std::move(monotonicity_for_range_))
, argument_types(argument_types_), return_type(return_type_), diagnostic(std::move(diagnostic_))
, cast_type(cast_type_)
{
@ -2445,7 +2455,7 @@ public:
try
{
return std::make_unique<ExecutableFunctionCast>(
prepareUnpackDictionaries(getArgumentTypes()[0], getResultType()), name, diagnostic);
prepareUnpackDictionaries(getArgumentTypes()[0], getResultType()), cast_name, diagnostic);
}
catch (Exception & e)
{
@ -2456,7 +2466,7 @@ public:
}
}
String getName() const override { return name; }
String getName() const override { return cast_name; }
bool isDeterministic() const override { return true; }
bool isDeterministicInScopeOfQuery() const override { return true; }
@ -2473,7 +2483,7 @@ public:
private:
const char * name;
const char * cast_name;
MonotonicityForRange monotonicity_for_range;
DataTypes argument_types;
@ -2515,7 +2525,7 @@ private:
{
/// In case when converting to Nullable type, we apply different parsing rule,
/// that will not throw an exception but return NULL in case of malformed input.
FunctionPtr function = FunctionConvertFromString<ToDataType, NameCast, ConvertFromStringExceptionMode::Null>::create();
FunctionPtr function = FunctionConvertFromString<ToDataType, FunctionName, ConvertFromStringExceptionMode::Null>::create();
return createFunctionAdaptor(function, from_type);
}
else if (!can_apply_accurate_cast)
@ -2539,12 +2549,12 @@ private:
{
if (wrapper_cast_type == CastType::accurate)
{
result_column = ConvertImpl<LeftDataType, RightDataType, NameCast>::execute(
result_column = ConvertImpl<LeftDataType, RightDataType, FunctionName>::execute(
arguments, result_type, input_rows_count, AccurateConvertStrategyAdditions());
}
else
{
result_column = ConvertImpl<LeftDataType, RightDataType, NameCast>::execute(
result_column = ConvertImpl<LeftDataType, RightDataType, FunctionName>::execute(
arguments, result_type, input_rows_count, AccurateOrNullConvertStrategyAdditions());
}
@ -2559,7 +2569,7 @@ private:
{
if (wrapper_cast_type == CastType::accurateOrNull)
{
auto nullable_column_wrapper = FunctionCast::createToNullableColumnWrapper();
auto nullable_column_wrapper = FunctionCast<FunctionName>::createToNullableColumnWrapper();
return nullable_column_wrapper(arguments, result_type, column_nullable, input_rows_count);
}
else
@ -2631,7 +2641,7 @@ private:
{
AccurateConvertStrategyAdditions additions;
additions.scale = scale;
result_column = ConvertImpl<LeftDataType, RightDataType, NameCast>::execute(
result_column = ConvertImpl<LeftDataType, RightDataType, FunctionName>::execute(
arguments, result_type, input_rows_count, additions);
return true;
@ -2640,7 +2650,7 @@ private:
{
AccurateOrNullConvertStrategyAdditions additions;
additions.scale = scale;
result_column = ConvertImpl<LeftDataType, RightDataType, NameCast>::execute(
result_column = ConvertImpl<LeftDataType, RightDataType, FunctionName>::execute(
arguments, result_type, input_rows_count, additions);
return true;
@ -2653,14 +2663,14 @@ private:
/// Consistent with CAST(Nullable(String) AS Nullable(Numbers))
/// In case when converting to Nullable type, we apply different parsing rule,
/// that will not throw an exception but return NULL in case of malformed input.
result_column = ConvertImpl<LeftDataType, RightDataType, NameCast, ConvertReturnNullOnErrorTag>::execute(
result_column = ConvertImpl<LeftDataType, RightDataType, FunctionName, ConvertReturnNullOnErrorTag>::execute(
arguments, result_type, input_rows_count, scale);
return true;
}
}
result_column = ConvertImpl<LeftDataType, RightDataType, NameCast>::execute(arguments, result_type, input_rows_count, scale);
result_column = ConvertImpl<LeftDataType, RightDataType, FunctionName>::execute(arguments, result_type, input_rows_count, scale);
return true;
});
@ -2670,7 +2680,7 @@ private:
{
if (wrapper_cast_type == CastType::accurateOrNull)
{
auto nullable_column_wrapper = FunctionCast::createToNullableColumnWrapper();
auto nullable_column_wrapper = FunctionCast<FunctionName>::createToNullableColumnWrapper();
return nullable_column_wrapper(arguments, result_type, column_nullable, input_rows_count);
}
else
@ -2990,7 +3000,7 @@ private:
template <typename ColumnStringType, typename EnumType>
WrapperType createStringToEnumWrapper() const
{
const char * function_name = name;
const char * function_name = cast_name;
return [function_name] (
ColumnsWithTypeAndName & arguments, const DataTypePtr & res_type, const ColumnNullable * nullable_col, size_t /*input_rows_count*/)
{
@ -3324,7 +3334,7 @@ private:
class MonotonicityHelper
{
public:
using MonotonicityForRange = FunctionCast::MonotonicityForRange;
using MonotonicityForRange = FunctionCastBase::MonotonicityForRange;
template <typename DataType>
static auto monotonicityForType(const DataType * const)
@ -3382,89 +3392,4 @@ public:
}
};
template<CastType cast_type>
class CastOverloadResolver : public IFunctionOverloadResolver
{
public:
using MonotonicityForRange = FunctionCast::MonotonicityForRange;
using Diagnostic = FunctionCast::Diagnostic;
static constexpr auto accurate_cast_name = "accurateCast";
static constexpr auto accurate_cast_or_null_name = "accurateCastOrNull";
static constexpr auto cast_name = "CAST";
static constexpr auto name = cast_type == CastType::accurate
? accurate_cast_name
: (cast_type == CastType::accurateOrNull ? accurate_cast_or_null_name : cast_name);
static FunctionOverloadResolverPtr create(ContextPtr context)
{
return createImpl(context->getSettingsRef().cast_keep_nullable);
}
static FunctionOverloadResolverPtr createImpl(bool keep_nullable, std::optional<Diagnostic> diagnostic = {})
{
return std::make_unique<CastOverloadResolver>(keep_nullable, std::move(diagnostic));
}
explicit CastOverloadResolver(bool keep_nullable_, std::optional<Diagnostic> diagnostic_ = {})
: keep_nullable(keep_nullable_), diagnostic(std::move(diagnostic_))
{}
String getName() const override { return name; }
size_t getNumberOfArguments() const override { return 2; }
ColumnNumbers getArgumentsThatAreAlwaysConstant() const override { return {1}; }
protected:
FunctionBasePtr buildImpl(const ColumnsWithTypeAndName & arguments, const DataTypePtr & return_type) const override
{
DataTypes data_types(arguments.size());
for (size_t i = 0; i < arguments.size(); ++i)
data_types[i] = arguments[i].type;
auto monotonicity = MonotonicityHelper::getMonotonicityInformation(arguments.front().type, return_type.get());
return std::make_unique<FunctionCast>(name, std::move(monotonicity), data_types, return_type, diagnostic, cast_type);
}
DataTypePtr getReturnTypeImpl(const ColumnsWithTypeAndName & arguments) const override
{
const auto & column = arguments.back().column;
if (!column)
throw Exception("Second argument to " + getName() + " must be a constant string describing type."
" Instead there is non-constant column of type " + arguments.back().type->getName(),
ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT);
const auto * type_col = checkAndGetColumnConst<ColumnString>(column.get());
if (!type_col)
throw Exception("Second argument to " + getName() + " must be a constant string describing type."
" Instead there is a column with the following structure: " + column->dumpStructure(),
ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT);
DataTypePtr type = DataTypeFactory::instance().get(type_col->getValue<String>());
if constexpr (cast_type == CastType::accurateOrNull)
{
return makeNullable(type);
}
else
{
if (keep_nullable && arguments.front().type->isNullable())
return makeNullable(type);
return type;
}
}
bool useDefaultImplementationForNulls() const override { return false; }
bool useDefaultImplementationForLowCardinalityColumns() const override { return false; }
private:
bool keep_nullable;
std::optional<Diagnostic> diagnostic;
};
}
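
The change above turns `FunctionCast` into a template over a name tag and moves the tag-independent aliases into `FunctionCastBase`. A reduced sketch of that pattern follows; `ConversionBase`, `Conversion` and `describeError` are hypothetical names used only for illustration, while the two tag structs copy the ones in the diff.

```cpp
#include <cassert>
#include <string>

// Tag types in the style of CastName / CastInternalName from the diff.
struct CastName { static constexpr auto name = "CAST"; };
struct CastInternalName { static constexpr auto name = "_CAST"; };

// Hypothetical non-template base: holds whatever does not depend on the tag,
// so other code can refer to it without naming a template argument.
struct ConversionBase
{
    using Diagnostic = std::string;
    virtual ~ConversionBase() = default;
    virtual std::string describeError(const std::string & value) const = 0;
};

// The tag is a compile-time parameter, so every instantiation reports its own name.
template <typename FunctionName>
struct Conversion final : ConversionBase
{
    std::string describeError(const std::string & value) const override
    {
        return std::string("Cannot parse '") + value + "' in function " + FunctionName::name;
    }
};
```

With this layout, helpers such as `MonotonicityHelper` can use aliases from the base class (as the diff does with `FunctionCastBase::MonotonicityForRange`) without choosing between the `CAST` and `_CAST` instantiations.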

@ -115,6 +115,13 @@ private:
[[maybe_unused]] const NullMap * const null_map_data,
[[maybe_unused]] const NullMap * const null_map_item)
{
if constexpr (std::is_same_v<Data, IColumn> && std::is_same_v<Target, IColumn>)
{
/// Generic variant is using IColumn::compare function that only allows to compare columns of identical types.
if (typeid(data) != typeid(target))
throw Exception(ErrorCodes::ILLEGAL_COLUMN, "Columns {} and {} cannot be compared", data.getName(), target.getName());
}
const size_t size = offsets.size();
result.resize(size);
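
The guard added in this hunk compares dynamic types with `typeid` before falling back to a generic comparison. A standalone sketch of the same check is shown here, assuming a small hypothetical `Column` hierarchy and `std::invalid_argument` in place of the ClickHouse `Exception` class.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <typeinfo>

// Hypothetical polymorphic hierarchy standing in for IColumn and its subclasses.
struct Column
{
    virtual ~Column() = default;
    virtual std::string getName() const = 0;
};

struct Int64Column final : Column
{
    std::string getName() const override { return "Int64"; }
};

struct StringColumn final : Column
{
    std::string getName() const override { return "String"; }
};

// typeid on a reference to a polymorphic type yields the dynamic type,
// so two references compare equal only when the concrete classes match.
void checkComparable(const Column & data, const Column & target)
{
    if (typeid(data) != typeid(target))
        throw std::invalid_argument(
            "Columns " + data.getName() + " and " + target.getName() + " cannot be compared");
}
```

The check is strict on purpose: a subclass and its parent would also count as different types here, which matches a comparison routine that is only defined for columns of identical types.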

@ -11,7 +11,7 @@
#include <Functions/IFunction.h>
#include <Interpreters/Context.h>
#include <libstemmer.h>
#include <libstemmer.h> // Y_IGNORE
namespace DB

@ -312,6 +312,7 @@ SRCS(
hasToken.cpp
hasTokenCaseInsensitive.cpp
hostName.cpp
hyperscanRegexpChecker.cpp
hypot.cpp
identity.cpp
if.cpp
@ -564,6 +565,7 @@ SRCS(
tuple.cpp
tupleElement.cpp
tupleHammingDistance.cpp
tupleToNameValuePairs.cpp
upper.cpp
upperUTF8.cpp
uptime.cpp

@ -4,7 +4,7 @@
#if USE_BZIP2
# include <IO/Bzip2ReadBuffer.h>
# include <bzlib.h>
# include <bzlib.h> // Y_IGNORE
namespace DB
{

@ -4,7 +4,7 @@
#if USE_BROTLI
# include <IO/Bzip2WriteBuffer.h>
# include <bzlib.h>
# include <bzlib.h> // Y_IGNORE
#include <Common/MemoryTracker.h>

@ -7,6 +7,7 @@
#include <Functions/FunctionsConversion.h>
#include <Functions/materialize.h>
#include <Functions/FunctionsLogical.h>
#include <Functions/CastOverloadResolver.h>
#include <Interpreters/Context.h>
#include <IO/WriteBufferFromString.h>
#include <IO/Operators.h>
@ -1110,8 +1111,8 @@ ActionsDAGPtr ActionsDAG::makeConvertingActions(
const auto * right_arg = &actions_dag->addColumn(std::move(column));
const auto * left_arg = dst_node;
FunctionCast::Diagnostic diagnostic = {dst_node->result_name, res_elem.name};
FunctionOverloadResolverPtr func_builder_cast = CastOverloadResolver<CastType::nonAccurate>::createImpl(false, std::move(diagnostic));
FunctionCastBase::Diagnostic diagnostic = {dst_node->result_name, res_elem.name};
FunctionOverloadResolverPtr func_builder_cast = CastInternalOverloadResolver<CastType::nonAccurate>::createImpl(std::move(diagnostic));
NodeRawConstPtrs children = { left_arg, right_arg };
dst_node = &actions_dag->addFunction(func_builder_cast, std::move(children), {});
@ -1876,7 +1877,7 @@ ActionsDAGPtr ActionsDAG::cloneActionsForFilterPushDown(
predicate->children = {left_arg, right_arg};
auto arguments = prepareFunctionArguments(predicate->children);
FunctionOverloadResolverPtr func_builder_cast = CastOverloadResolver<CastType::nonAccurate>::createImpl(false);
FunctionOverloadResolverPtr func_builder_cast = CastInternalOverloadResolver<CastType::nonAccurate>::createImpl();
predicate->function_builder = func_builder_cast;
predicate->function_base = predicate->function_builder->build(arguments);

@ -43,11 +43,11 @@ void changeIfArguments(ASTPtr & first, ASTPtr & second)
String enum_string = makeStringsEnum(values);
auto enum_literal = std::make_shared<ASTLiteral>(enum_string);
auto first_cast = makeASTFunction("CAST");
auto first_cast = makeASTFunction("_CAST");
first_cast->arguments->children.push_back(first);
first_cast->arguments->children.push_back(enum_literal);
auto second_cast = makeASTFunction("CAST");
auto second_cast = makeASTFunction("_CAST");
second_cast->arguments->children.push_back(second);
second_cast->arguments->children.push_back(enum_literal);
@ -65,12 +65,12 @@ void changeTransformArguments(ASTPtr & array_to, ASTPtr & other)
String enum_string = makeStringsEnum(values);
auto array_cast = makeASTFunction("CAST");
auto array_cast = makeASTFunction("_CAST");
array_cast->arguments->children.push_back(array_to);
array_cast->arguments->children.push_back(std::make_shared<ASTLiteral>("Array(" + enum_string + ")"));
array_to = array_cast;
auto other_cast = makeASTFunction("CAST");
auto other_cast = makeASTFunction("_CAST");
other_cast->arguments->children.push_back(other);
other_cast->arguments->children.push_back(std::make_shared<ASTLiteral>(enum_string));
other = other_cast;
@ -183,4 +183,3 @@ void ConvertStringsToEnumMatcher::visit(ASTFunction & function_node, Data & data
}
}

@ -1,14 +1,17 @@
#include <Parsers/ASTAlterQuery.h>
#include <Parsers/ASTCheckQuery.h>
#include <Parsers/ASTCreateQuery.h>
#include <Parsers/ASTCreateUserQuery.h>
#include <Parsers/ASTCreateRoleQuery.h>
#include <Parsers/ASTCreateQuotaQuery.h>
#include <Parsers/ASTCreateRoleQuery.h>
#include <Parsers/ASTCreateRowPolicyQuery.h>
#include <Parsers/ASTCreateSettingsProfileQuery.h>
#include <Parsers/ASTCreateUserQuery.h>
#include <Parsers/ASTDropAccessEntityQuery.h>
#include <Parsers/ASTDropQuery.h>
#include <Parsers/ASTExplainQuery.h>
#include <Parsers/ASTGrantQuery.h>
#include <Parsers/ASTInsertQuery.h>
#include <Parsers/ASTSelectIntersectExceptQuery.h>
#include <Parsers/ASTKillQueryQuery.h>
#include <Parsers/ASTOptimizeQuery.h>
#include <Parsers/ASTRenameQuery.h>
@ -24,11 +27,9 @@
#include <Parsers/ASTShowProcesslistQuery.h>
#include <Parsers/ASTShowTablesQuery.h>
#include <Parsers/ASTUseQuery.h>
#include <Parsers/ASTExplainQuery.h>
#include <Parsers/TablePropertiesQueriesASTs.h>
#include <Parsers/ASTWatchQuery.h>
#include <Parsers/ASTGrantQuery.h>
#include <Parsers/MySQL/ASTCreateQuery.h>
#include <Parsers/TablePropertiesQueriesASTs.h>
#include <Interpreters/Context.h>
#include <Interpreters/InterpreterAlterQuery.h>
@ -44,9 +45,11 @@
#include <Interpreters/InterpreterDropQuery.h>
#include <Interpreters/InterpreterExistsQuery.h>
#include <Interpreters/InterpreterExplainQuery.h>
#include <Interpreters/InterpreterExternalDDLQuery.h>
#include <Interpreters/InterpreterFactory.h>
#include <Interpreters/InterpreterGrantQuery.h>
#include <Interpreters/InterpreterInsertQuery.h>
#include <Interpreters/InterpreterSelectIntersectExceptQuery.h>
#include <Interpreters/InterpreterKillQueryQuery.h>
#include <Interpreters/InterpreterOptimizeQuery.h>
#include <Interpreters/InterpreterRenameQuery.h>
@ -65,7 +68,6 @@
#include <Interpreters/InterpreterSystemQuery.h>
#include <Interpreters/InterpreterUseQuery.h>
#include <Interpreters/InterpreterWatchQuery.h>
#include <Interpreters/InterpreterExternalDDLQuery.h>
#include <Interpreters/OpenTelemetrySpanLog.h>
#include <Parsers/ASTSystemQuery.h>
@ -109,6 +111,10 @@ std::unique_ptr<IInterpreter> InterpreterFactory::get(ASTPtr & query, ContextMut
ProfileEvents::increment(ProfileEvents::SelectQuery);
return std::make_unique<InterpreterSelectWithUnionQuery>(query, context, options);
}
else if (query->as<ASTSelectIntersectExceptQuery>())
{
return std::make_unique<InterpreterSelectIntersectExceptQuery>(query, context, options);
}
else if (query->as<ASTInsertQuery>())
{
ProfileEvents::increment(ProfileEvents::InsertQuery);
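
The new branch above extends a chain that picks an interpreter by the concrete type of the query node. A reduced sketch of that dispatch follows. All names are hypothetical, and `as<T>()` is modelled with `dynamic_cast` as an assumption of this sketch; the real `IAST::as` may behave differently for derived types.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical AST base with an as<T>() helper modelled on the calls in the diff.
struct Ast
{
    virtual ~Ast() = default;

    template <typename T>
    const T * as() const { return dynamic_cast<const T *>(this); }
};

struct SelectWithUnionAst final : Ast {};
struct SelectIntersectExceptAst final : Ast {};
struct InsertAst final : Ast {};
struct UnknownAst final : Ast {};

// Hypothetical interpreter that only records which branch created it.
struct Interpreter
{
    explicit Interpreter(std::string kind_) : kind(std::move(kind_)) {}
    std::string kind;
};

// Each branch tests one node type; unmatched nodes fall through to an error.
std::unique_ptr<Interpreter> makeInterpreter(const Ast & query)
{
    if (query.as<SelectWithUnionAst>())
        return std::make_unique<Interpreter>("SelectWithUnion");
    else if (query.as<SelectIntersectExceptAst>())
        return std::make_unique<Interpreter>("SelectIntersectExcept");
    else if (query.as<InsertAst>())
        return std::make_unique<Interpreter>("Insert");
    throw std::invalid_argument("Unknown type of query");
}
```

Adding a query kind in this style means adding one node type and one branch, which is the shape of the `ASTSelectIntersectExceptQuery` change in the hunk.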

@ -22,69 +22,108 @@ namespace ErrorCodes
namespace
{
template <typename T>
void updateFromQueryTemplate(
T & grantee,
/// Extracts access rights elements which are going to be granted or revoked from a query.
void collectAccessRightsElementsToGrantOrRevoke(
const ASTGrantQuery & query,
const std::vector<UUID> & roles_to_grant_or_revoke)
AccessRightsElements & elements_to_grant,
AccessRightsElements & elements_to_revoke)
{
if (!query.is_revoke)
{
if (query.replace_access)
grantee.access = {};
if (query.replace_granted_roles)
grantee.granted_roles = {};
}
elements_to_grant.clear();
elements_to_revoke.clear();
if (!query.access_rights_elements.empty())
{
if (query.is_revoke)
grantee.access.revoke(query.access_rights_elements);
else
grantee.access.grant(query.access_rights_elements);
}
if (!roles_to_grant_or_revoke.empty())
{
if (query.is_revoke)
{
if (query.admin_option)
grantee.granted_roles.revokeAdminOption(roles_to_grant_or_revoke);
else
grantee.granted_roles.revoke(roles_to_grant_or_revoke);
/// REVOKE
elements_to_revoke = query.access_rights_elements;
}
else if (query.replace_access)
{
/// GRANT WITH REPLACE OPTION
elements_to_grant = query.access_rights_elements;
elements_to_revoke.emplace_back(AccessType::ALL);
}
else
{
if (query.admin_option)
grantee.granted_roles.grantWithAdminOption(roles_to_grant_or_revoke);
else
grantee.granted_roles.grant(roles_to_grant_or_revoke);
}
/// GRANT
elements_to_grant = query.access_rights_elements;
}
}
void updateFromQueryImpl(
IAccessEntity & grantee,
/// Extracts roles which are going to be granted or revoked from a query.
void collectRolesToGrantOrRevoke(
const AccessControlManager & access_control,
const ASTGrantQuery & query,
const std::vector<UUID> & roles_to_grant_or_revoke)
std::vector<UUID> & roles_to_grant,
RolesOrUsersSet & roles_to_revoke)
{
if (auto * user = typeid_cast<User *>(&grantee))
updateFromQueryTemplate(*user, query, roles_to_grant_or_revoke);
else if (auto * role = typeid_cast<Role *>(&grantee))
updateFromQueryTemplate(*role, query, roles_to_grant_or_revoke);
roles_to_grant.clear();
roles_to_revoke.clear();
RolesOrUsersSet roles_to_grant_or_revoke;
if (query.roles)
roles_to_grant_or_revoke = RolesOrUsersSet{*query.roles, access_control};
if (query.is_revoke)
{
/// REVOKE
roles_to_revoke = std::move(roles_to_grant_or_revoke);
}
else if (query.replace_granted_roles)
{
/// GRANT WITH REPLACE OPTION
roles_to_grant = roles_to_grant_or_revoke.getMatchingIDs(access_control);
roles_to_revoke = RolesOrUsersSet::AllTag{};
}
else
{
/// GRANT
roles_to_grant = roles_to_grant_or_revoke.getMatchingIDs(access_control);
}
}
void checkGranteeIsAllowed(const ContextAccess & access, const UUID & grantee_id, const IAccessEntity & grantee)
/// Extracts roles which are going to be granted or revoked from a query.
void collectRolesToGrantOrRevoke(
const ASTGrantQuery & query,
std::vector<UUID> & roles_to_grant,
RolesOrUsersSet & roles_to_revoke)
{
auto current_user = access.getUser();
roles_to_grant.clear();
roles_to_revoke.clear();
RolesOrUsersSet roles_to_grant_or_revoke;
if (query.roles)
roles_to_grant_or_revoke = RolesOrUsersSet{*query.roles};
if (query.is_revoke)
{
/// REVOKE
roles_to_revoke = std::move(roles_to_grant_or_revoke);
}
else if (query.replace_granted_roles)
{
/// GRANT WITH REPLACE OPTION
roles_to_grant = roles_to_grant_or_revoke.getMatchingIDs();
roles_to_revoke = RolesOrUsersSet::AllTag{};
}
else
{
/// GRANT
roles_to_grant = roles_to_grant_or_revoke.getMatchingIDs();
}
}
/// Checks if a grantee is allowed for the current user, throws an exception if not.
void checkGranteeIsAllowed(const ContextAccess & current_user_access, const UUID & grantee_id, const IAccessEntity & grantee)
{
auto current_user = current_user_access.getUser();
if (current_user && !current_user->grantees.match(grantee_id))
throw Exception(grantee.outputTypeAndName() + " is not allowed as grantee", ErrorCodes::ACCESS_DENIED);
}
void checkGranteesAreAllowed(const AccessControlManager & access_control, const ContextAccess & access, const std::vector<UUID> & grantee_ids)
/// Checks if grantees are allowed for the current user, throws an exception if not.
void checkGranteesAreAllowed(const AccessControlManager & access_control, const ContextAccess & current_user_access, const std::vector<UUID> & grantee_ids)
{
auto current_user = access.getUser();
auto current_user = current_user_access.getUser();
if (!current_user || (current_user->grantees == RolesOrUsersSet::AllTag{}))
return;
@ -92,36 +131,26 @@ namespace
{
auto entity = access_control.tryRead(id);
if (auto role = typeid_cast<RolePtr>(entity))
checkGranteeIsAllowed(access, id, *role);
checkGranteeIsAllowed(current_user_access, id, *role);
else if (auto user = typeid_cast<UserPtr>(entity))
checkGranteeIsAllowed(access, id, *user);
checkGranteeIsAllowed(current_user_access, id, *user);
}
}
/// Checks if the current user has enough access rights granted with grant option to grant or revoke specified access rights.
void checkGrantOption(
const AccessControlManager & access_control,
const ContextAccess & access,
const ASTGrantQuery & query,
const ContextAccess & current_user_access,
const std::vector<UUID> & grantees_from_query,
bool & need_check_grantees_are_allowed)
{
const auto & elements = query.access_rights_elements;
need_check_grantees_are_allowed = true;
if (elements.empty())
{
/// No access rights to grant or revoke.
need_check_grantees_are_allowed = false;
return;
}
if (!query.is_revoke)
bool & need_check_grantees_are_allowed,
const AccessRightsElements & elements_to_grant,
AccessRightsElements & elements_to_revoke)
{
/// Check access rights which are going to be granted.
/// To execute the command GRANT the current user needs to have the access granted with GRANT OPTION.
access.checkGrantOption(elements);
return;
}
current_user_access.checkGrantOption(elements_to_grant);
if (access.hasGrantOption(elements))
if (current_user_access.hasGrantOption(elements_to_revoke))
{
/// Simple case: the current user has the grant option for all the access rights specified for REVOKE.
return;
@ -141,69 +170,81 @@ namespace
auto entity = access_control.tryRead(id);
if (auto role = typeid_cast<RolePtr>(entity))
{
checkGranteeIsAllowed(access, id, *role);
checkGranteeIsAllowed(current_user_access, id, *role);
all_granted_access.makeUnion(role->access);
}
else if (auto user = typeid_cast<UserPtr>(entity))
{
checkGranteeIsAllowed(access, id, *user);
checkGranteeIsAllowed(current_user_access, id, *user);
all_granted_access.makeUnion(user->access);
}
}
need_check_grantees_are_allowed = false; /// already checked
AccessRights required_access;
if (elements[0].is_partial_revoke)
{
AccessRightsElements non_revoke_elements = elements;
std::for_each(non_revoke_elements.begin(), non_revoke_elements.end(), [&](AccessRightsElement & element) { element.is_partial_revoke = false; });
required_access.grant(non_revoke_elements);
}
else
{
required_access.grant(elements);
}
required_access.makeIntersection(all_granted_access);
if (!elements_to_revoke.empty() && elements_to_revoke[0].is_partial_revoke)
std::for_each(elements_to_revoke.begin(), elements_to_revoke.end(), [&](AccessRightsElement & element) { element.is_partial_revoke = false; });
AccessRights access_to_revoke;
access_to_revoke.grant(elements_to_revoke);
access_to_revoke.makeIntersection(all_granted_access);
for (auto & required_access_element : required_access.getElements())
/// Build a more accurate list of elements to revoke: use the intersection of the initial list of elements to revoke
/// and all the access rights granted to these grantees.
bool grant_option = !elements_to_revoke.empty() && elements_to_revoke[0].grant_option;
elements_to_revoke.clear();
for (auto & element_to_revoke : access_to_revoke.getElements())
{
if (!required_access_element.is_partial_revoke && (required_access_element.grant_option || !elements[0].grant_option))
access.checkGrantOption(required_access_element);
}
if (!element_to_revoke.is_partial_revoke && (element_to_revoke.grant_option || !grant_option))
elements_to_revoke.emplace_back(std::move(element_to_revoke));
}
std::vector<UUID> getRoleIDsAndCheckAdminOption(
current_user_access.checkGrantOption(elements_to_revoke);
}
/// Checks if the current user has enough access rights granted with grant option to grant or revoke specified access rights.
/// Also checks if grantees are allowed for the current user.
void checkGrantOptionAndGrantees(
const AccessControlManager & access_control,
const ContextAccess & access,
const ASTGrantQuery & query,
const RolesOrUsersSet & roles_from_query,
const ContextAccess & current_user_access,
const std::vector<UUID> & grantees_from_query,
bool & need_check_grantees_are_allowed)
const AccessRightsElements & elements_to_grant,
AccessRightsElements & elements_to_revoke)
{
need_check_grantees_are_allowed = true;
if (roles_from_query.empty())
{
/// No roles to grant or revoke.
need_check_grantees_are_allowed = false;
return {};
bool need_check_grantees_are_allowed = true;
checkGrantOption(
access_control,
current_user_access,
grantees_from_query,
need_check_grantees_are_allowed,
elements_to_grant,
elements_to_revoke);
if (need_check_grantees_are_allowed)
checkGranteesAreAllowed(access_control, current_user_access, grantees_from_query);
}
std::vector<UUID> matching_ids;
if (!query.is_revoke)
/// Checks if the current user has enough roles granted with admin option to grant or revoke specified roles.
void checkAdminOption(
const AccessControlManager & access_control,
const ContextAccess & current_user_access,
const std::vector<UUID> & grantees_from_query,
bool & need_check_grantees_are_allowed,
const std::vector<UUID> & roles_to_grant,
RolesOrUsersSet & roles_to_revoke,
bool admin_option)
{
/// Check roles which are going to be granted.
/// To execute the command GRANT the current user needs to have the roles granted with ADMIN OPTION.
matching_ids = roles_from_query.getMatchingIDs(access_control);
access.checkAdminOption(matching_ids);
return matching_ids;
}
current_user_access.checkAdminOption(roles_to_grant);
if (!roles_from_query.all)
/// Check roles which are going to be revoked.
std::vector<UUID> roles_to_revoke_ids;
if (!roles_to_revoke.all)
{
matching_ids = roles_from_query.getMatchingIDs();
if (access.hasAdminOption(matching_ids))
roles_to_revoke_ids = roles_to_revoke.getMatchingIDs();
if (current_user_access.hasAdminOption(roles_to_revoke_ids))
{
/// Simple case: the current user has the admin option for all the roles specified for REVOKE.
return matching_ids;
return;
}
}
@ -221,51 +262,109 @@ namespace
auto entity = access_control.tryRead(id);
if (auto role = typeid_cast<RolePtr>(entity))
{
checkGranteeIsAllowed(access, id, *role);
checkGranteeIsAllowed(current_user_access, id, *role);
all_granted_roles.makeUnion(role->granted_roles);
}
else if (auto user = typeid_cast<UserPtr>(entity))
{
checkGranteeIsAllowed(access, id, *user);
checkGranteeIsAllowed(current_user_access, id, *user);
all_granted_roles.makeUnion(user->granted_roles);
}
}
const auto & all_granted_roles_set = admin_option ? all_granted_roles.getGrantedWithAdminOption() : all_granted_roles.getGranted();
need_check_grantees_are_allowed = false; /// already checked
const auto & all_granted_roles_set = query.admin_option ? all_granted_roles.getGrantedWithAdminOption() : all_granted_roles.getGranted();
if (roles_from_query.all)
boost::range::set_difference(all_granted_roles_set, roles_from_query.except_ids, std::back_inserter(matching_ids));
if (roles_to_revoke.all)
boost::range::set_difference(all_granted_roles_set, roles_to_revoke.except_ids, std::back_inserter(roles_to_revoke_ids));
else
boost::range::remove_erase_if(matching_ids, [&](const UUID & id) { return !all_granted_roles_set.count(id); });
access.checkAdminOption(matching_ids);
return matching_ids;
boost::range::remove_erase_if(roles_to_revoke_ids, [&](const UUID & id) { return !all_granted_roles_set.count(id); });
roles_to_revoke = roles_to_revoke_ids;
current_user_access.checkAdminOption(roles_to_revoke_ids);
}
void checkGrantOptionAndGrantees(
/// Checks if the current user has enough roles granted with admin option to grant or revoke specified roles.
/// Also checks if grantees are allowed for the current user.
void checkAdminOptionAndGrantees(
const AccessControlManager & access_control,
const ContextAccess & access,
const ASTGrantQuery & query,
const std::vector<UUID> & grantees_from_query)
const ContextAccess & current_user_access,
const std::vector<UUID> & grantees_from_query,
const std::vector<UUID> & roles_to_grant,
RolesOrUsersSet & roles_to_revoke,
bool admin_option)
{
bool need_check_grantees_are_allowed = true;
checkGrantOption(access_control, access, query, grantees_from_query, need_check_grantees_are_allowed);
checkAdminOption(
access_control,
current_user_access,
grantees_from_query,
need_check_grantees_are_allowed,
roles_to_grant,
roles_to_revoke,
admin_option);
if (need_check_grantees_are_allowed)
checkGranteesAreAllowed(access_control, access, grantees_from_query);
checkGranteesAreAllowed(access_control, current_user_access, grantees_from_query);
}
std::vector<UUID> getRoleIDsAndCheckAdminOptionAndGrantees(
const AccessControlManager & access_control,
const ContextAccess & access,
const ASTGrantQuery & query,
const RolesOrUsersSet & roles_from_query,
const std::vector<UUID> & grantees_from_query)
template <typename T>
void updateGrantedAccessRightsAndRolesTemplate(
T & grantee,
const AccessRightsElements & elements_to_grant,
const AccessRightsElements & elements_to_revoke,
const std::vector<UUID> & roles_to_grant,
const RolesOrUsersSet & roles_to_revoke,
bool admin_option)
{
bool need_check_grantees_are_allowed = true;
auto role_ids = getRoleIDsAndCheckAdminOption(
access_control, access, query, roles_from_query, grantees_from_query, need_check_grantees_are_allowed);
if (need_check_grantees_are_allowed)
checkGranteesAreAllowed(access_control, access, grantees_from_query);
return role_ids;
if (!elements_to_revoke.empty())
grantee.access.revoke(elements_to_revoke);
if (!elements_to_grant.empty())
grantee.access.grant(elements_to_grant);
if (!roles_to_revoke.empty())
{
if (admin_option)
grantee.granted_roles.revokeAdminOption(grantee.granted_roles.findGrantedWithAdminOption(roles_to_revoke));
else
grantee.granted_roles.revoke(grantee.granted_roles.findGranted(roles_to_revoke));
}
if (!roles_to_grant.empty())
{
if (admin_option)
grantee.granted_roles.grantWithAdminOption(roles_to_grant);
else
grantee.granted_roles.grant(roles_to_grant);
}
}
/// Updates grants of a specified user or role.
void updateGrantedAccessRightsAndRoles(
IAccessEntity & grantee,
const AccessRightsElements & elements_to_grant,
const AccessRightsElements & elements_to_revoke,
const std::vector<UUID> & roles_to_grant,
const RolesOrUsersSet & roles_to_revoke,
bool admin_option)
{
if (auto * user = typeid_cast<User *>(&grantee))
updateGrantedAccessRightsAndRolesTemplate(*user, elements_to_grant, elements_to_revoke, roles_to_grant, roles_to_revoke, admin_option);
else if (auto * role = typeid_cast<Role *>(&grantee))
updateGrantedAccessRightsAndRolesTemplate(*role, elements_to_grant, elements_to_revoke, roles_to_grant, roles_to_revoke, admin_option);
}
/// Updates grants of a specified user or role.
void updateFromQuery(IAccessEntity & grantee, const ASTGrantQuery & query)
{
AccessRightsElements elements_to_grant, elements_to_revoke;
collectAccessRightsElementsToGrantOrRevoke(query, elements_to_grant, elements_to_revoke);
std::vector<UUID> roles_to_grant;
RolesOrUsersSet roles_to_revoke;
collectRolesToGrantOrRevoke(query, roles_to_grant, roles_to_revoke);
updateGrantedAccessRightsAndRoles(grantee, elements_to_grant, elements_to_revoke, roles_to_grant, roles_to_revoke, query.admin_option);
}
}
@ -283,16 +382,13 @@ BlockIO InterpreterGrantQuery::execute()
throw Exception("A partial revoke should be revoked, not granted", ErrorCodes::LOGICAL_ERROR);
auto & access_control = getContext()->getAccessControlManager();
std::optional<RolesOrUsersSet> roles_set;
if (query.roles)
roles_set = RolesOrUsersSet{*query.roles, access_control};
std::vector<UUID> grantees = RolesOrUsersSet{*query.grantees, access_control, getContext()->getUserID()}.getMatchingIDs(access_control);
/// Check if the current user has corresponding roles granted with admin option.
std::vector<UUID> roles;
if (roles_set)
roles = getRoleIDsAndCheckAdminOptionAndGrantees(access_control, *getContext()->getAccess(), query, *roles_set, grantees);
std::vector<UUID> roles_to_grant;
RolesOrUsersSet roles_to_revoke;
collectRolesToGrantOrRevoke(access_control, query, roles_to_grant, roles_to_revoke);
checkAdminOptionAndGrantees(access_control, *getContext()->getAccess(), grantees, roles_to_grant, roles_to_revoke, query.admin_option);
if (!query.cluster.empty())
{
@ -306,14 +402,15 @@ BlockIO InterpreterGrantQuery::execute()
query.replaceEmptyDatabase(getContext()->getCurrentDatabase());
/// Check if the current user has corresponding access rights with grant option.
if (!query.access_rights_elements.empty())
checkGrantOptionAndGrantees(access_control, *getContext()->getAccess(), query, grantees);
AccessRightsElements elements_to_grant, elements_to_revoke;
collectAccessRightsElementsToGrantOrRevoke(query, elements_to_grant, elements_to_revoke);
checkGrantOptionAndGrantees(access_control, *getContext()->getAccess(), grantees, elements_to_grant, elements_to_revoke);
/// Update roles and users listed in `grantees`.
auto update_func = [&](const AccessEntityPtr & entity) -> AccessEntityPtr
{
auto clone = entity->clone();
updateFromQueryImpl(*clone, query, roles);
updateGrantedAccessRightsAndRoles(*clone, elements_to_grant, elements_to_revoke, roles_to_grant, roles_to_revoke, query.admin_option);
return clone;
};
@ -325,21 +422,15 @@ BlockIO InterpreterGrantQuery::execute()
void InterpreterGrantQuery::updateUserFromQuery(User & user, const ASTGrantQuery & query)
{
std::vector<UUID> roles_to_grant_or_revoke;
if (query.roles)
roles_to_grant_or_revoke = RolesOrUsersSet{*query.roles}.getMatchingIDs();
updateFromQueryImpl(user, query, roles_to_grant_or_revoke);
updateFromQuery(user, query);
}
void InterpreterGrantQuery::updateRoleFromQuery(Role & role, const ASTGrantQuery & query)
{
std::vector<UUID> roles_to_grant_or_revoke;
if (query.roles)
roles_to_grant_or_revoke = RolesOrUsersSet{*query.roles}.getMatchingIDs();
updateFromQueryImpl(role, query, roles_to_grant_or_revoke);
updateFromQuery(role, query);
}
void InterpreterGrantQuery::extendQueryLogElemImpl(QueryLogElement & elem, const ASTPtr & /*ast*/, ContextPtr) const
{
auto & query = query_ptr->as<ASTGrantQuery &>();
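
The role-narrowing step in `checkAdminOption` above (the `set_difference` / `remove_erase_if` pair) can be illustrated in isolation. The following is a minimal sketch, not the ClickHouse implementation: `RoleId`, `RevokeSpec` and `narrowRolesToRevoke` are hypothetical names, integers stand in for `UUID`, and the early return for the "current user has the admin option for everything" case is left out.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <set>
#include <vector>

// Hypothetical stand-in for DB::UUID.
using RoleId = int;

// Hypothetical simplification of RolesOrUsersSet: either an explicit list of
// role ids, or ALL with an optional EXCEPT list.
struct RevokeSpec
{
    bool all = false;
    std::vector<RoleId> ids;       // used when all == false
    std::set<RoleId> except_ids;   // used when all == true
};

// Returns the roles that would actually be revoked, given the union of the
// roles granted to every grantee (`granted`).
std::vector<RoleId> narrowRolesToRevoke(const RevokeSpec & spec, const std::set<RoleId> & granted)
{
    // In this simplified model EXCEPT is only meaningful together with ALL.
    assert(spec.all || spec.except_ids.empty());

    std::vector<RoleId> result;
    if (spec.all)
    {
        // REVOKE ALL EXCEPT ...: everything granted minus the exceptions.
        std::set_difference(granted.begin(), granted.end(),
                            spec.except_ids.begin(), spec.except_ids.end(),
                            std::back_inserter(result));
    }
    else
    {
        // Explicit list: drop the roles that none of the grantees has.
        std::copy_if(spec.ids.begin(), spec.ids.end(), std::back_inserter(result),
                     [&](RoleId id) { return granted.count(id) != 0; });
    }
    return result;
}
```

Only the narrowed list would then need the admin-option check, which appears to be the intent of writing the result back into `roles_to_revoke` before calling `checkAdminOption` on the current user's access.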

@ -0,0 +1,148 @@
#include <Columns/getLeastSuperColumn.h>
#include <Interpreters/Context.h>
#include <Interpreters/InterpreterSelectIntersectExceptQuery.h>
#include <Interpreters/InterpreterSelectQuery.h>
#include <Parsers/ASTSelectIntersectExceptQuery.h>
#include <Parsers/ASTSelectWithUnionQuery.h>
#include <Processors/QueryPlan/IQueryPlanStep.h>
#include <Processors/QueryPlan/IntersectOrExceptStep.h>
#include <Processors/QueryPlan/Optimizations/QueryPlanOptimizationSettings.h>
#include <Processors/QueryPlan/QueryPlan.h>
#include <Processors/QueryPlan/ExpressionStep.h>
namespace DB
{
namespace ErrorCodes
{
extern const int INTERSECT_OR_EXCEPT_RESULT_STRUCTURES_MISMATCH;
extern const int LOGICAL_ERROR;
}
static Block getCommonHeader(const Blocks & headers)
{
size_t num_selects = headers.size();
Block common_header = headers.front();
size_t num_columns = common_header.columns();
for (size_t query_num = 1; query_num < num_selects; ++query_num)
{
if (headers[query_num].columns() != num_columns)
throw Exception(ErrorCodes::INTERSECT_OR_EXCEPT_RESULT_STRUCTURES_MISMATCH,
"Different number of columns in IntersectExceptQuery elements:\n {} \nand\n {}",
common_header.dumpNames(), headers[query_num].dumpNames());
}
std::vector<const ColumnWithTypeAndName *> columns(num_selects);
for (size_t column_num = 0; column_num < num_columns; ++column_num)
{
for (size_t i = 0; i < num_selects; ++i)
columns[i] = &headers[i].getByPosition(column_num);
ColumnWithTypeAndName & result_elem = common_header.getByPosition(column_num);
result_elem = getLeastSuperColumn(columns);
}
return common_header;
}
InterpreterSelectIntersectExceptQuery::InterpreterSelectIntersectExceptQuery(
const ASTPtr & query_ptr_,
ContextPtr context_,
const SelectQueryOptions & options_)
: IInterpreterUnionOrSelectQuery(query_ptr_->clone(), context_, options_)
{
ASTSelectIntersectExceptQuery * ast = query_ptr->as<ASTSelectIntersectExceptQuery>();
final_operator = ast->final_operator;
const auto & children = ast->children;
size_t num_children = children.size();
/// AST must have been changed by the visitor.
if (final_operator == Operator::UNKNOWN || num_children != 2)
throw Exception(ErrorCodes::LOGICAL_ERROR,
"SelectIntersectExceptQuery has not been normalized (number of children: {})",
num_children);
nested_interpreters.resize(num_children);
for (size_t i = 0; i < num_children; ++i)
nested_interpreters[i] = buildCurrentChildInterpreter(children.at(i));
Blocks headers(num_children);
for (size_t query_num = 0; query_num < num_children; ++query_num)
headers[query_num] = nested_interpreters[query_num]->getSampleBlock();
result_header = getCommonHeader(headers);
}
std::unique_ptr<IInterpreterUnionOrSelectQuery>
InterpreterSelectIntersectExceptQuery::buildCurrentChildInterpreter(const ASTPtr & ast_ptr_)
{
if (ast_ptr_->as<ASTSelectWithUnionQuery>())
return std::make_unique<InterpreterSelectWithUnionQuery>(ast_ptr_, context, SelectQueryOptions());
if (ast_ptr_->as<ASTSelectQuery>())
return std::make_unique<InterpreterSelectQuery>(ast_ptr_, context, SelectQueryOptions());
if (ast_ptr_->as<ASTSelectIntersectExceptQuery>())
return std::make_unique<InterpreterSelectIntersectExceptQuery>(ast_ptr_, context, SelectQueryOptions());
throw Exception(ErrorCodes::LOGICAL_ERROR, "Unexpected query: {}", ast_ptr_->getID());
}
void InterpreterSelectIntersectExceptQuery::buildQueryPlan(QueryPlan & query_plan)
{
size_t num_plans = nested_interpreters.size();
std::vector<std::unique_ptr<QueryPlan>> plans(num_plans);
DataStreams data_streams(num_plans);
for (size_t i = 0; i < num_plans; ++i)
{
plans[i] = std::make_unique<QueryPlan>();
nested_interpreters[i]->buildQueryPlan(*plans[i]);
if (!blocksHaveEqualStructure(plans[i]->getCurrentDataStream().header, result_header))
{
auto actions_dag = ActionsDAG::makeConvertingActions(
plans[i]->getCurrentDataStream().header.getColumnsWithTypeAndName(),
result_header.getColumnsWithTypeAndName(),
ActionsDAG::MatchColumnsMode::Position);
auto converting_step = std::make_unique<ExpressionStep>(plans[i]->getCurrentDataStream(), std::move(actions_dag));
converting_step->setStepDescription("Conversion before INTERSECT or EXCEPT");
plans[i]->addStep(std::move(converting_step));
}
data_streams[i] = plans[i]->getCurrentDataStream();
}
auto max_threads = context->getSettingsRef().max_threads;
auto step = std::make_unique<IntersectOrExceptStep>(std::move(data_streams), final_operator, max_threads);
query_plan.unitePlans(std::move(step), std::move(plans));
}
BlockIO InterpreterSelectIntersectExceptQuery::execute()
{
BlockIO res;
QueryPlan query_plan;
buildQueryPlan(query_plan);
auto pipeline = query_plan.buildQueryPipeline(
QueryPlanOptimizationSettings::fromContext(context),
BuildQueryPipelineSettings::fromContext(context));
res.pipeline = std::move(*pipeline);
res.pipeline.addInterpreterContext(context);
return res;
}
void InterpreterSelectIntersectExceptQuery::ignoreWithTotals()
{
for (auto & interpreter : nested_interpreters)
interpreter->ignoreWithTotals();
}
}
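
The header unification done by `getCommonHeader` above (equal column counts, then a per-column common type) can be sketched with a toy type system. This is an illustration under stated assumptions: `ToyType`, `ToyColumn`, `ToyHeader` and `commonHeader` are hypothetical, and "widest unsigned integer wins" stands in for the much richer rules of `getLeastSuperColumn`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical toy type system: unsigned integers ordered by width.
enum class ToyType { UInt8 = 0, UInt16 = 1, UInt32 = 2, UInt64 = 3 };

struct ToyColumn
{
    std::string name;
    ToyType type;
};

using ToyHeader = std::vector<ToyColumn>;

// Builds one header that every operand can be converted to.
// Column names are taken from the first operand in this sketch.
ToyHeader commonHeader(const std::vector<ToyHeader> & headers)
{
    assert(!headers.empty());
    ToyHeader common = headers.front();

    // All operands must have the same number of columns.
    for (std::size_t i = 1; i < headers.size(); ++i)
        if (headers[i].size() != common.size())
            throw std::invalid_argument("Different number of columns in INTERSECT/EXCEPT operands");

    // For each position, pick the widest type seen in any operand.
    for (std::size_t col = 0; col < common.size(); ++col)
        for (const auto & header : headers)
            common[col].type = std::max(common[col].type, header[col].type);

    return common;
}
```

In `buildQueryPlan` each operand whose header differs from this common header gets a converting step matched by position, so both inputs reach `IntersectOrExceptStep` with identical structure.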

@ -0,0 +1,46 @@
#pragma once
#include <Core/QueryProcessingStage.h>
#include <Interpreters/IInterpreter.h>
#include <Interpreters/InterpreterSelectWithUnionQuery.h>
#include <Interpreters/IInterpreterUnionOrSelectQuery.h>
#include <Parsers/ASTSelectIntersectExceptQuery.h>
namespace DB
{
class Context;
class InterpreterSelectQuery;
class QueryPlan;
class InterpreterSelectIntersectExceptQuery : public IInterpreterUnionOrSelectQuery
{
using Operator = ASTSelectIntersectExceptQuery::Operator;
public:
InterpreterSelectIntersectExceptQuery(
const ASTPtr & query_ptr_,
ContextPtr context_,
const SelectQueryOptions & options_);
BlockIO execute() override;
Block getSampleBlock() { return result_header; }
void ignoreWithTotals() override;
private:
static String getName() { return "SelectIntersectExceptQuery"; }
std::unique_ptr<IInterpreterUnionOrSelectQuery>
buildCurrentChildInterpreter(const ASTPtr & ast_ptr_);
void buildQueryPlan(QueryPlan & query_plan) override;
std::vector<std::unique_ptr<IInterpreterUnionOrSelectQuery>> nested_interpreters;
Operator final_operator;
};
}

@ -2,8 +2,10 @@
#include <Interpreters/Context.h>
#include <Interpreters/InterpreterSelectQuery.h>
#include <Interpreters/InterpreterSelectWithUnionQuery.h>
#include <Interpreters/InterpreterSelectIntersectExceptQuery.h>
#include <Parsers/ASTSelectQuery.h>
#include <Parsers/ASTSelectWithUnionQuery.h>
#include <Parsers/ASTSelectIntersectExceptQuery.h>
#include <Parsers/queryToString.h>
#include <Processors/QueryPlan/DistinctStep.h>
#include <Processors/QueryPlan/ExpressionStep.h>
@ -208,8 +210,10 @@ InterpreterSelectWithUnionQuery::buildCurrentChildInterpreter(const ASTPtr & ast
{
if (ast_ptr_->as<ASTSelectWithUnionQuery>())
return std::make_unique<InterpreterSelectWithUnionQuery>(ast_ptr_, context, options, current_required_result_column_names);
else
else if (ast_ptr_->as<ASTSelectQuery>())
return std::make_unique<InterpreterSelectQuery>(ast_ptr_, context, options, current_required_result_column_names);
else
return std::make_unique<InterpreterSelectIntersectExceptQuery>(ast_ptr_, context, options);
}
InterpreterSelectWithUnionQuery::~InterpreterSelectWithUnionQuery() = default;
@ -225,10 +229,14 @@ Block InterpreterSelectWithUnionQuery::getSampleBlock(const ASTPtr & query_ptr_,
}
if (is_subquery)
{
return cache[key]
= InterpreterSelectWithUnionQuery(query_ptr_, context_, SelectQueryOptions().subquery().analyze()).getSampleBlock();
}
else
{
return cache[key] = InterpreterSelectWithUnionQuery(query_ptr_, context_, SelectQueryOptions().analyze()).getSampleBlock();
}
}

@ -6,8 +6,8 @@
#if USE_NLP
#include <Common/Exception.h>
#include <Interpreters/Lemmatizers.h>
#include <RdrLemmatizer.h>
#include <Interpreters/Lemmatizers.h> // Y_IGNORE
#include <RdrLemmatizer.h> // Y_IGNORE
#include <vector>
#include <filesystem>

@ -503,10 +503,10 @@ ASTPtr MutationsInterpreter::prepare(bool dry_run)
}
}
auto updated_column = makeASTFunction("CAST",
auto updated_column = makeASTFunction("_CAST",
makeASTFunction("if",
condition,
makeASTFunction("CAST",
makeASTFunction("_CAST",
update_expr->clone(),
type_literal),
std::make_shared<ASTIdentifier>(column)),
@ -920,9 +920,10 @@ BlockInputStreamPtr MutationsInterpreter::execute()
return result_stream;
}
const Block & MutationsInterpreter::getUpdatedHeader() const
Block MutationsInterpreter::getUpdatedHeader() const
{
return *updated_header;
// If it's an index/projection materialization, we don't write any data columns, thus empty header is used
return mutation_kind.mutation_kind == MutationKind::MUTATE_INDEX_PROJECTION ? Block{} : *updated_header;
}
const ColumnDependencies & MutationsInterpreter::getColumnDependencies() const
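
The change of `getUpdatedHeader()` from `const Block &` to `Block` goes together with the new conditional in its body: once one branch yields a temporary `Block{}`, the whole expression is a temporary, and returning it through a const reference would dangle. A minimal sketch of the same pattern, assuming hypothetical `ToyBlock` and `ToyInterpreter` types in place of `Block` and `MutationsInterpreter`:

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for DB::Block: just a list of column names.
struct ToyBlock
{
    std::vector<std::string> columns;
};

class ToyInterpreter
{
public:
    ToyInterpreter(bool index_only_, ToyBlock header)
        : index_only(index_only_), updated_header(std::make_unique<ToyBlock>(std::move(header)))
    {
    }

    // Returned by value: `cond ? ToyBlock{} : *updated_header` is a prvalue,
    // so a `const ToyBlock &` return type would refer to a destroyed temporary.
    ToyBlock getUpdatedHeader() const
    {
        return index_only ? ToyBlock{} : *updated_header;
    }

private:
    bool index_only;
    std::unique_ptr<ToyBlock> updated_header;
};
```

The cost is a copy of the header on each call in the non-empty case, which is presumably acceptable here because a header carries column metadata rather than data.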

@ -53,7 +53,7 @@ public:
BlockInputStreamPtr execute();
/// Only changed columns.
const Block & getUpdatedHeader() const;
Block getUpdatedHeader() const;
const ColumnDependencies & getColumnDependencies() const;

Some files were not shown because too many files have changed in this diff