Merge branch 'master' into ditch-aio
This commit is contained in:
commit 81646f8389

156 CHANGELOG.md
@@ -1,3 +1,159 @@

### ClickHouse release v21.7, 2021-07-09

#### Backward Incompatible Change

* Improved performance of queries with explicitly defined large sets. Added compatibility setting `legacy_column_name_of_tuple_literal`. It makes sense to set it to `true` while doing a rolling update of a cluster from a version lower than 21.7 to any higher version; otherwise distributed queries with explicitly defined sets in the `IN` clause may fail during the update (see the sketch after this list). [#25371](https://github.com/ClickHouse/ClickHouse/pull/25371) ([Anton Popov](https://github.com/CurtizJ)).
* Forward/backward incompatible change of maximum buffer size in clickhouse-keeper (an experimental alternative to ZooKeeper). Better to do it now (before production) than later. [#25421](https://github.com/ClickHouse/ClickHouse/pull/25421) ([alesapin](https://github.com/alesapin)).
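
The first entry above is the operational one during upgrades. A minimal sketch of the guard it describes, assuming a hypothetical mid-upgrade cluster (the `remote()` address pattern, `default.hits`, and `UserID` are made up for illustration):

``` sql
-- Keep the pre-21.7 naming of tuple-literal columns while old and new
-- versions coexist, so distributed queries with explicit IN sets keep working.
SET legacy_column_name_of_tuple_literal = 1;

-- A distributed query with an explicitly defined set in the IN clause,
-- the kind that could otherwise fail against not-yet-upgraded replicas.
SELECT count()
FROM remote('replica-0{1..3}', default.hits)
WHERE UserID IN (1, 2, 3);
```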

#### New Feature

* Support configuration in YAML format as alternative to XML. This closes [#3607](https://github.com/ClickHouse/ClickHouse/issues/3607). [#21858](https://github.com/ClickHouse/ClickHouse/pull/21858) ([BoloniniD](https://github.com/BoloniniD)).
* Provides a way to restore replicated table when the data is (possibly) present, but the ZooKeeper metadata is lost. Resolves [#13458](https://github.com/ClickHouse/ClickHouse/issues/13458). [#13652](https://github.com/ClickHouse/ClickHouse/pull/13652) ([Mike Kot](https://github.com/myrrc)).
* Support structs and maps in Arrow/Parquet/ORC and dictionaries in Arrow input/output formats. Added new setting `output_format_arrow_low_cardinality_as_dictionary`. [#24341](https://github.com/ClickHouse/ClickHouse/pull/24341) ([Kruglov Pavel](https://github.com/Avogar)).
* Added support for `Array` type in dictionaries. [#25119](https://github.com/ClickHouse/ClickHouse/pull/25119) ([Maksim Kita](https://github.com/kitaisreal)).
* Added function `bitPositionsToArray`. Closes [#23792](https://github.com/ClickHouse/ClickHouse/issues/23792). Author: Kevin Wan (@MaxWk). See the combined example after this list. [#25394](https://github.com/ClickHouse/ClickHouse/pull/25394) ([Maksim Kita](https://github.com/kitaisreal)).
* Added function `dateName` to return names like 'Friday' or 'April'. Author: Daniil Kondratyev (@dankondr). See the combined example after this list. [#25372](https://github.com/ClickHouse/ClickHouse/pull/25372) ([Maksim Kita](https://github.com/kitaisreal)).
* Add `toJSONString` function to serialize columns to their JSON representations (see the combined example after this list). [#25164](https://github.com/ClickHouse/ClickHouse/pull/25164) ([Amos Bird](https://github.com/amosbird)).
* Now `query_log` has two new columns: `initial_query_start_time`, `initial_query_start_time_microsecond` that record the starting time of a distributed query if any. [#25022](https://github.com/ClickHouse/ClickHouse/pull/25022) ([Amos Bird](https://github.com/amosbird)).
* Add aggregate function `segmentLengthSum`. [#24250](https://github.com/ClickHouse/ClickHouse/pull/24250) ([flynn](https://github.com/ucasfl)).
* Add a new boolean setting `prefer_global_in_and_join` which makes all `IN`/`JOIN` operators default to `GLOBAL IN`/`GLOBAL JOIN`. [#23434](https://github.com/ClickHouse/ClickHouse/pull/23434) ([Amos Bird](https://github.com/amosbird)).
* Support `ALTER DELETE` queries for `Join` table engine. [#23260](https://github.com/ClickHouse/ClickHouse/pull/23260) ([foolchi](https://github.com/foolchi)).
* Add `quantileBFloat16` aggregate function as well as the corresponding `quantilesBFloat16` and `medianBFloat16`. It is a very simple and fast quantile estimator with a relative error of no more than 0.390625% (see the combined example after this list). This closes [#16641](https://github.com/ClickHouse/ClickHouse/issues/16641). [#23204](https://github.com/ClickHouse/ClickHouse/pull/23204) ([Ivan Novitskiy](https://github.com/RedClusive)).
* Implement the `sequenceNextNode()` function, useful for flow analysis. [#19766](https://github.com/ClickHouse/ClickHouse/pull/19766) ([achimbab](https://github.com/achimbab)).
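
Several of the new functions above are easiest to grasp together. A hedged sketch (exact output may vary slightly by build; expected results are shown in comments):

``` sql
SELECT bitPositionsToArray(toUInt8(5));           -- [0, 2]: bits 0 and 2 are set in 0b101
SELECT dateName('weekday', toDate('2021-07-09')); -- 'Friday'
SELECT dateName('month', toDate('2021-07-09'));   -- 'July'
SELECT toJSONString([1, 2, 3]);                   -- '[1,2,3]'
SELECT quantileBFloat16(0.5)(number)
FROM numbers(1000);                               -- about 500, within the stated 0.390625% relative error
```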

#### Experimental Feature

* Add support for virtual filesystem over HDFS. [#11058](https://github.com/ClickHouse/ClickHouse/pull/11058) ([overshov](https://github.com/overshov)) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Now clickhouse-keeper (an experimental alternative to ZooKeeper) supports ZooKeeper-like `digest` ACLs. [#24448](https://github.com/ClickHouse/ClickHouse/pull/24448) ([alesapin](https://github.com/alesapin)).

#### Performance Improvement

* Added an optimization that transforms some functions into reading of subcolumns to reduce the amount of read data. E.g., the expression `col IS NULL` is transformed into reading of the subcolumn `col.null`. The optimization can be enabled by the setting `optimize_functions_to_subcolumns`, which is currently off by default (see the sketch after this list). [#24406](https://github.com/ClickHouse/ClickHouse/pull/24406) ([Anton Popov](https://github.com/CurtizJ)).
* Rewrite more columns to possible alias expressions. This may enable better optimization, such as projections. [#24405](https://github.com/ClickHouse/ClickHouse/pull/24405) ([Amos Bird](https://github.com/amosbird)).
* Index of type `bloom_filter` can be used for expressions with the `hasAny` function with constant arrays (see the sketch after this list). This closes: [#24291](https://github.com/ClickHouse/ClickHouse/issues/24291). [#24900](https://github.com/ClickHouse/ClickHouse/pull/24900) ([Vasily Nemkov](https://github.com/Enmk)).
* Add exponential backoff to reschedule read attempt in case RabbitMQ queues are empty. (ClickHouse has support for importing data from RabbitMQ). Closes [#24340](https://github.com/ClickHouse/ClickHouse/issues/24340). [#24415](https://github.com/ClickHouse/ClickHouse/pull/24415) ([Kseniia Sumarokova](https://github.com/kssenii)).
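
A sketch combining the two opt-in items above (`optimize_functions_to_subcolumns` and the `bloom_filter`/`hasAny` support); the table and column names are hypothetical:

``` sql
CREATE TABLE events
(
    id   UInt64,
    val  Nullable(String),
    tags Array(UInt32),
    INDEX tags_bf tags TYPE bloom_filter GRANULARITY 4
)
ENGINE = MergeTree ORDER BY id;

-- With the setting enabled, `val IS NULL` can be answered from the small
-- `val.null` subcolumn instead of reading the whole column.
SET optimize_functions_to_subcolumns = 1;
SELECT count() FROM events WHERE val IS NULL;

-- `hasAny` with a constant array can now consult the bloom_filter index.
SELECT count() FROM events WHERE hasAny(tags, [42, 43]);
```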

#### Improvement

* Allow to limit bandwidth for replication. Add two Replicated\*MergeTree settings: `max_replicated_fetches_network_bandwidth` and `max_replicated_sends_network_bandwidth`, which limit the maximum speed of replicated fetches/sends for a table. Add two server-wide settings (in the `default` user profile): `max_replicated_fetches_network_bandwidth_for_server` and `max_replicated_sends_network_bandwidth_for_server`, which limit the maximum speed of replication for all tables. The limits are not enforced perfectly accurately. Turned off by default. Fixes [#1821](https://github.com/ClickHouse/ClickHouse/issues/1821). [#24573](https://github.com/ClickHouse/ClickHouse/pull/24573) ([alesapin](https://github.com/alesapin)).
* Resource constraints and isolation for ODBC and Library bridges. Use a separate `clickhouse-bridge` group and user for bridge processes. Set `oom_score_adj` so the bridges will be the first candidates for the OOM killer. Set maximum RSS to 1 GiB. Closes [#23861](https://github.com/ClickHouse/ClickHouse/issues/23861). [#25280](https://github.com/ClickHouse/ClickHouse/pull/25280) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Add standalone `clickhouse-keeper` symlink to the main `clickhouse` binary. Now it's possible to run coordination without the main clickhouse server. [#24059](https://github.com/ClickHouse/ClickHouse/pull/24059) ([alesapin](https://github.com/alesapin)).
* Use global settings for queries to `VIEW`. Fixed the behavior when queries to `VIEW` used local settings, which led to errors if the settings at `CREATE VIEW` and `SELECT` differed. As of now, `VIEW` won't use these modified settings, but you can still pass additional settings in the `SETTINGS` section of the `CREATE VIEW` query. Close [#20551](https://github.com/ClickHouse/ClickHouse/issues/20551). [#24095](https://github.com/ClickHouse/ClickHouse/pull/24095) ([Vladimir](https://github.com/vdimir)).
* On server start, parts with an incorrect partition ID are now never removed, but always detached. [#25070](https://github.com/ClickHouse/ClickHouse/issues/25070). [#25166](https://github.com/ClickHouse/ClickHouse/pull/25166) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Increase size of background schedule pool to 128 (`background_schedule_pool_size` setting). It helps avoid the replication queue hanging on a slow ZooKeeper connection. [#25072](https://github.com/ClickHouse/ClickHouse/pull/25072) ([alesapin](https://github.com/alesapin)).
* Add merge tree setting `max_parts_to_merge_at_once` which limits the number of parts that can be merged in the background at once. Doesn't affect `OPTIMIZE FINAL` query. Fixes [#1820](https://github.com/ClickHouse/ClickHouse/issues/1820). [#24496](https://github.com/ClickHouse/ClickHouse/pull/24496) ([alesapin](https://github.com/alesapin)).
* Allow the `NOT IN` operator to be used in partition pruning (see the sketch after this list). [#24894](https://github.com/ClickHouse/ClickHouse/pull/24894) ([Amos Bird](https://github.com/amosbird)).
* Recognize IPv4 addresses like `127.0.1.1` as local. This is controversial and closes [#23504](https://github.com/ClickHouse/ClickHouse/issues/23504). Michael Filimonov will test this feature. [#24316](https://github.com/ClickHouse/ClickHouse/pull/24316) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* A ClickHouse database created with MaterializeMySQL (an experimental feature) now contains all column comments from the materialized MySQL database. [#25199](https://github.com/ClickHouse/ClickHouse/pull/25199) ([Storozhuk Kostiantyn](https://github.com/sand6255)).
* Add settings (`connection_auto_close`/`connection_max_tries`/`connection_pool_size`) for MySQL storage engine. [#24146](https://github.com/ClickHouse/ClickHouse/pull/24146) ([Azat Khuzhin](https://github.com/azat)).
* Improve startup time of Distributed engine. [#25663](https://github.com/ClickHouse/ClickHouse/pull/25663) ([Azat Khuzhin](https://github.com/azat)).
* Improvement for Distributed tables. Drop replicas from the dirname for `internal_replication=true` (allows `INSERT` into Distributed over a cluster with any number of replicas; previously only up to 15 replicas were supported, and anything more failed with `ENAMETOOLONG` while creating the directory for async blocks). [#25513](https://github.com/ClickHouse/ClickHouse/pull/25513) ([Azat Khuzhin](https://github.com/azat)).
* Added support of the `Interval` type for `LowCardinality`. It is needed for intermediate values of some expressions. Closes [#21730](https://github.com/ClickHouse/ClickHouse/issues/21730). [#25410](https://github.com/ClickHouse/ClickHouse/pull/25410) ([Vladimir](https://github.com/vdimir)).
* Add the `==` operator on time conditions for `sequenceMatch` and `sequenceCount` functions, e.g. `sequenceMatch('(?1)(?t==1)(?2)')(time, data = 1, data = 2)` (see the sketch after this list). [#25299](https://github.com/ClickHouse/ClickHouse/pull/25299) ([Christophe Kalenzaga](https://github.com/mga-chka)).
* Add settings `http_max_fields`, `http_max_field_name_size`, `http_max_field_value_size`. [#25296](https://github.com/ClickHouse/ClickHouse/pull/25296) ([Ivan](https://github.com/abyss7)).
* Add support for function `if` with `Decimal` and `Int` types on its branches (see the sketch after this list). This closes [#20549](https://github.com/ClickHouse/ClickHouse/issues/20549). This closes [#10142](https://github.com/ClickHouse/ClickHouse/issues/10142). [#25283](https://github.com/ClickHouse/ClickHouse/pull/25283) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Update prompt in `clickhouse-client` and display a message when reconnecting. This closes [#10577](https://github.com/ClickHouse/ClickHouse/issues/10577). [#25281](https://github.com/ClickHouse/ClickHouse/pull/25281) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Correct memory tracking in aggregate function `topK`. This closes [#25259](https://github.com/ClickHouse/ClickHouse/issues/25259). [#25260](https://github.com/ClickHouse/ClickHouse/pull/25260) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fix `topLevelDomain` for IDN hosts (i.e. `example.рф`); previously it returned an empty string for such hosts. [#25103](https://github.com/ClickHouse/ClickHouse/pull/25103) ([Azat Khuzhin](https://github.com/azat)).
* Detect the Linux kernel version at runtime (to make nested epoll work, which is required for `async_socket_for_remote`/`use_hedged_requests`; otherwise remote queries may get stuck). [#25067](https://github.com/ClickHouse/ClickHouse/pull/25067) ([Azat Khuzhin](https://github.com/azat)).
* For distributed queries, when `optimize_skip_unused_shards=1`, allow skipping shards with conditions like `(sharding key) IN (one-element-tuple)`. (Tuples with many elements were already supported; a tuple with a single element did not work because it is parsed as a literal.) [#24930](https://github.com/ClickHouse/ClickHouse/pull/24930) ([Amos Bird](https://github.com/amosbird)).
* Improved log messages of S3 errors, no more double whitespaces in case of empty keys and buckets. [#24897](https://github.com/ClickHouse/ClickHouse/pull/24897) ([Vladimir Chebotarev](https://github.com/excitoon)).
* Some queries require multi-pass semantic analysis. Try reusing built sets for `IN` in this case. [#24874](https://github.com/ClickHouse/ClickHouse/pull/24874) ([Amos Bird](https://github.com/amosbird)).
* Respect `max_distributed_connections` for `insert_distributed_sync` (otherwise for huge clusters and sync insert it may run out of `max_thread_pool_size`). [#24754](https://github.com/ClickHouse/ClickHouse/pull/24754) ([Azat Khuzhin](https://github.com/azat)).
* Avoid hiding errors like `Limit for rows or bytes to read exceeded` for scalar subqueries. [#24545](https://github.com/ClickHouse/ClickHouse/pull/24545) ([nvartolomei](https://github.com/nvartolomei)).
* Make the String-to-Int parser stricter so that `toInt64('+')` will throw (see the sketch after this list). [#24475](https://github.com/ClickHouse/ClickHouse/pull/24475) ([Amos Bird](https://github.com/amosbird)).
* If `SSD_CACHE` is created with DDL query, it can be created only inside `user_files` directory. [#24466](https://github.com/ClickHouse/ClickHouse/pull/24466) ([Maksim Kita](https://github.com/kitaisreal)).
* PostgreSQL: support for specifying a non-default schema for insert queries. Closes [#24149](https://github.com/ClickHouse/ClickHouse/issues/24149). [#24413](https://github.com/ClickHouse/ClickHouse/pull/24413) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Fix IPv6 addresses resolving (i.e. fixes `select * from remote('[::1]', system.one)`). [#24319](https://github.com/ClickHouse/ClickHouse/pull/24319) ([Azat Khuzhin](https://github.com/azat)).
* Fix trailing whitespaces in the FROM clause with subqueries in multiline mode; this also changes the output of queries slightly, in a more human-friendly way. [#24151](https://github.com/ClickHouse/ClickHouse/pull/24151) ([Azat Khuzhin](https://github.com/azat)).
* Improvement for Distributed tables. Add ability to split distributed batch on failures (i.e. due to memory limits, corruptions), under `distributed_directory_monitor_split_batch_on_failure` (OFF by default). [#23864](https://github.com/ClickHouse/ClickHouse/pull/23864) ([Azat Khuzhin](https://github.com/azat)).
* Handle column name clashes for `Join` table engine. Closes [#20309](https://github.com/ClickHouse/ClickHouse/issues/20309). [#23769](https://github.com/ClickHouse/ClickHouse/pull/23769) ([Vladimir](https://github.com/vdimir)).
* Display progress for `File` table engine in `clickhouse-local` and on INSERT query in `clickhouse-client` when data is passed to stdin. Closes [#18209](https://github.com/ClickHouse/ClickHouse/issues/18209). [#23656](https://github.com/ClickHouse/ClickHouse/pull/23656) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Bugfixes and improvements of `clickhouse-copier`. Allow copying tables with different (but compatible) schemas. Closes [#9159](https://github.com/ClickHouse/ClickHouse/issues/9159). Added a test to copy ReplacingMergeTree. Closes [#22711](https://github.com/ClickHouse/ClickHouse/issues/22711). Support TTL on columns and Data Skipping Indices: they are simply removed when creating the internal Distributed table (the underlying table will keep the TTL and skipping indices). Closes [#19384](https://github.com/ClickHouse/ClickHouse/issues/19384). Allow copying MATERIALIZED and ALIAS columns. There are some cases in which it could be helpful (e.g. if the column is in the PRIMARY KEY). Now it can be allowed by setting the `allow_to_copy_alias_and_materialized_columns` property to true in the task configuration. Closes [#9177](https://github.com/ClickHouse/ClickHouse/issues/9177). Closes [#11007](https://github.com/ClickHouse/ClickHouse/issues/11007). Closes [#9514](https://github.com/ClickHouse/ClickHouse/issues/9514). Added a property `allow_to_drop_target_partitions` in the task configuration to drop the partition in the original table before moving helping tables. Closes [#20957](https://github.com/ClickHouse/ClickHouse/issues/20957). Get rid of the `OPTIMIZE DEDUPLICATE` query. This hack was needed because `ALTER TABLE MOVE PARTITION` was retried many times and plain MergeTree tables don't have deduplication. Closes [#17966](https://github.com/ClickHouse/ClickHouse/issues/17966). Write progress to the ZooKeeper node on path `task_path + /status` in JSON format. Closes [#20955](https://github.com/ClickHouse/ClickHouse/issues/20955). Support for ReplicatedTables without arguments. Closes [#24834](https://github.com/ClickHouse/ClickHouse/issues/24834). [#23518](https://github.com/ClickHouse/ClickHouse/pull/23518) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* Added sleep with backoff between read retries from S3. [#23461](https://github.com/ClickHouse/ClickHouse/pull/23461) ([Vladimir Chebotarev](https://github.com/excitoon)).
* Respect `insert_allow_materialized_columns` (allows materialized columns) for INSERT into `Distributed` table. [#23349](https://github.com/ClickHouse/ClickHouse/pull/23349) ([Azat Khuzhin](https://github.com/azat)).
* Add ability to push down LIMIT for distributed queries. [#23027](https://github.com/ClickHouse/ClickHouse/pull/23027) ([Azat Khuzhin](https://github.com/azat)).
* Fix zero-copy replication with several S3 volumes (Fixes [#22679](https://github.com/ClickHouse/ClickHouse/issues/22679)). [#22864](https://github.com/ClickHouse/ClickHouse/pull/22864) ([ianton-ru](https://github.com/ianton-ru)).
* Resolve the actual port number bound when a user requests any available port from the operating system to show it in the log message. [#25569](https://github.com/ClickHouse/ClickHouse/pull/25569) ([bnaecker](https://github.com/bnaecker)).
* Fixed a case when conversion of Postgres arrays sometimes resulted in the String data type rather than an n-dimensional array, because `attndims` works incorrectly in some cases. Closes [#24804](https://github.com/ClickHouse/ClickHouse/issues/24804). [#25538](https://github.com/ClickHouse/ClickHouse/pull/25538) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Fix conversion of DateTime with timezone for MySQL, PostgreSQL, ODBC. Closes [#5057](https://github.com/ClickHouse/ClickHouse/issues/5057). [#25528](https://github.com/ClickHouse/ClickHouse/pull/25528) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Distinguish KILL MUTATION for different tables (fixes unexpected `Cancelled mutating parts` error). [#25025](https://github.com/ClickHouse/ClickHouse/pull/25025) ([Azat Khuzhin](https://github.com/azat)).
* Allow declaring an S3 disk at the root of a bucket (the S3 virtual filesystem is an experimental feature under development). [#24898](https://github.com/ClickHouse/ClickHouse/pull/24898) ([Vladimir Chebotarev](https://github.com/excitoon)).
* Enable reading of subcolumns (e.g. components of Tuples) for distributed tables. [#24472](https://github.com/ClickHouse/ClickHouse/pull/24472) ([Anton Popov](https://github.com/CurtizJ)).
* A feature for the MySQL compatibility protocol: make the `user` function return correct output. Closes [#25697](https://github.com/ClickHouse/ClickHouse/pull/25697). [#25697](https://github.com/ClickHouse/ClickHouse/pull/25697) ([sundyli](https://github.com/sundy-li)).
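
To make a few of the items above concrete, here is a hedged sketch (the `visits` table and its columns are hypothetical; expected results in comments):

``` sql
-- `NOT IN` now participates in partition pruning: untouched months are skipped.
CREATE TABLE visits (d Date, x UInt32)
ENGINE = MergeTree PARTITION BY toYYYYMM(d) ORDER BY x;
SELECT count() FROM visits WHERE toYYYYMM(d) NOT IN (202106, 202107);

-- `==` on time conditions: match events exactly one second apart.
SELECT sequenceMatch('(?1)(?t==1)(?2)')(ts, ev = 1, ev = 2)
FROM values('ts DateTime, ev UInt8',
            ('2021-07-09 00:00:00', 1),
            ('2021-07-09 00:00:01', 2));          -- 1

-- `if` with Decimal and Int on different branches now works.
SELECT if(number % 2 = 0, toDecimal32(1.5, 2), 2) FROM numbers(4);

-- Stricter String-to-Int parsing: `toInt64('+')` now throws, so use the
-- lenient variants when garbage input is expected.
SELECT toInt64OrZero('+');                        -- 0
```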

#### Bug Fix

* Improvement for backward compatibility. Use old modulo function version when used in partition key. Closes [#23508](https://github.com/ClickHouse/ClickHouse/issues/23508). [#24157](https://github.com/ClickHouse/ClickHouse/pull/24157) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Fix extremely rare bug on low-memory servers which can lead to the inability to perform merges without restart. Possibly fixes [#24603](https://github.com/ClickHouse/ClickHouse/issues/24603). [#24872](https://github.com/ClickHouse/ClickHouse/pull/24872) ([alesapin](https://github.com/alesapin)).
* Fix extremely rare error `Tagging already tagged part` in replication queue during concurrent `alter move/replace partition`. Possibly fixes [#22142](https://github.com/ClickHouse/ClickHouse/issues/22142). [#24961](https://github.com/ClickHouse/ClickHouse/pull/24961) ([alesapin](https://github.com/alesapin)).
* Fix potential crash when calculating aggregate function states by aggregation of aggregate function states of other aggregate functions (not a practical use case). See [#24523](https://github.com/ClickHouse/ClickHouse/issues/24523). [#25015](https://github.com/ClickHouse/ClickHouse/pull/25015) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed the behavior when the query `SYSTEM RESTART REPLICA` or `SYSTEM SYNC REPLICA` does not finish. This was detected on a server with an extremely low amount of RAM. [#24457](https://github.com/ClickHouse/ClickHouse/pull/24457) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* Fix a bug which could lead to the ZooKeeper client hanging inside clickhouse-server. [#24721](https://github.com/ClickHouse/ClickHouse/pull/24721) ([alesapin](https://github.com/alesapin)).
* If ZooKeeper connection was lost and replica was cloned after restoring the connection, its replication queue might contain outdated entries. Fixed failed assertion when replication queue contains intersecting virtual parts. It may rarely happen if some data part was lost. Print error in log instead of terminating. [#24777](https://github.com/ClickHouse/ClickHouse/pull/24777) ([tavplubix](https://github.com/tavplubix)).
* Fix lost `WHERE` condition in expression-push-down optimization of query plan (setting `query_plan_filter_push_down = 1` by default). Fixes [#25368](https://github.com/ClickHouse/ClickHouse/issues/25368). [#25370](https://github.com/ClickHouse/ClickHouse/pull/25370) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix bug which can lead to intersecting parts after merges with TTL: `Part all_40_40_0 is covered by all_40_40_1 but should be merged into all_40_41_1. This shouldn't happen often.`. [#25549](https://github.com/ClickHouse/ClickHouse/pull/25549) ([alesapin](https://github.com/alesapin)).
* On ZooKeeper connection loss `ReplicatedMergeTree` table might wait for background operations to complete before trying to reconnect. It's fixed, now background operations are stopped forcefully. [#25306](https://github.com/ClickHouse/ClickHouse/pull/25306) ([tavplubix](https://github.com/tavplubix)).
* Fix error `Key expression contains comparison between inconvertible types` for queries with `ARRAY JOIN` in case an array is used in the primary key. Fixes [#8247](https://github.com/ClickHouse/ClickHouse/issues/8247). [#25546](https://github.com/ClickHouse/ClickHouse/pull/25546) ([Anton Popov](https://github.com/CurtizJ)).
* Fix wrong totals for query `WITH TOTALS` and `WITH FILL`. Fixes [#20872](https://github.com/ClickHouse/ClickHouse/issues/20872). [#25539](https://github.com/ClickHouse/ClickHouse/pull/25539) ([Anton Popov](https://github.com/CurtizJ)).
* Fix data race when querying `system.clusters` while reloading the cluster configuration at the same time. [#25737](https://github.com/ClickHouse/ClickHouse/pull/25737) ([Amos Bird](https://github.com/amosbird)).
* Fixed `No such file or directory` error on moving `Distributed` table between databases. Fixes [#24971](https://github.com/ClickHouse/ClickHouse/issues/24971). [#25667](https://github.com/ClickHouse/ClickHouse/pull/25667) ([tavplubix](https://github.com/tavplubix)).
* `REPLACE PARTITION` might be ignored in rare cases if the source partition was empty. It's fixed. Fixes [#24869](https://github.com/ClickHouse/ClickHouse/issues/24869). [#25665](https://github.com/ClickHouse/ClickHouse/pull/25665) ([tavplubix](https://github.com/tavplubix)).
* Fixed a bug in `Replicated` database engine that might rarely cause some replica to skip enqueued DDL query. [#24805](https://github.com/ClickHouse/ClickHouse/pull/24805) ([tavplubix](https://github.com/tavplubix)).
* Fix null pointer dereference in `EXPLAIN AST` without query. [#25631](https://github.com/ClickHouse/ClickHouse/pull/25631) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix waiting for the automatic dropping of empty parts. It could lead to the background pool filling up and replication getting stuck. [#23315](https://github.com/ClickHouse/ClickHouse/pull/23315) ([Anton Popov](https://github.com/CurtizJ)).
* Fix restore of a table stored in S3 virtual filesystem (it is an experimental feature not ready for production). [#25601](https://github.com/ClickHouse/ClickHouse/pull/25601) ([ianton-ru](https://github.com/ianton-ru)).
* Fix nullptr dereference in `Arrow` format when using `Decimal256`. Add `Decimal256` support for `Arrow` format. [#25531](https://github.com/ClickHouse/ClickHouse/pull/25531) ([Kruglov Pavel](https://github.com/Avogar)).
* Fix excessive underscore before the names of the preprocessed configuration files. [#25431](https://github.com/ClickHouse/ClickHouse/pull/25431) ([Vitaly Baranov](https://github.com/vitlibar)).
* A fix for `clickhouse-copier` tool: Fix segfault when sharding_key is absent in task config for copier. [#25419](https://github.com/ClickHouse/ClickHouse/pull/25419) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* Fix the `REPLACE` column transformer when used in DDL by correctly quoting the formatted query. This fixes [#23925](https://github.com/ClickHouse/ClickHouse/issues/23925). [#25391](https://github.com/ClickHouse/ClickHouse/pull/25391) ([Amos Bird](https://github.com/amosbird)).
* Fix the possibility of non-deterministic behaviour of the `quantileDeterministic` function and similar. This closes [#20480](https://github.com/ClickHouse/ClickHouse/issues/20480). [#25313](https://github.com/ClickHouse/ClickHouse/pull/25313) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Support `SimpleAggregateFunction(LowCardinality)` for `SummingMergeTree`. Fixes [#25134](https://github.com/ClickHouse/ClickHouse/issues/25134). [#25300](https://github.com/ClickHouse/ClickHouse/pull/25300) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix logical error with exception message "Cannot sum Array/Tuple in min/maxMap". [#25298](https://github.com/ClickHouse/ClickHouse/pull/25298) ([Kruglov Pavel](https://github.com/Avogar)).
* Fix error `Bad cast from type DB::ColumnLowCardinality to DB::ColumnVector<char8_t>` for queries where `LowCardinality` argument was used for IN (this bug appeared in 21.6). Fixes [#25187](https://github.com/ClickHouse/ClickHouse/issues/25187). [#25290](https://github.com/ClickHouse/ClickHouse/pull/25290) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix incorrect behaviour of `joinGetOrNull` with not-nullable columns. This fixes [#24261](https://github.com/ClickHouse/ClickHouse/issues/24261). [#25288](https://github.com/ClickHouse/ClickHouse/pull/25288) ([Amos Bird](https://github.com/amosbird)).
* Fix incorrect behaviour and a UBSan report in big integers. In previous versions `CAST(1e19 AS UInt128)` returned zero (see the check after this list). [#25279](https://github.com/ClickHouse/ClickHouse/pull/25279) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed an error which occurred while inserting a subset of columns using CSVWithNames format. Fixes [#25129](https://github.com/ClickHouse/ClickHouse/issues/25129). [#25169](https://github.com/ClickHouse/ClickHouse/pull/25169) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* Do not use table's projection for `SELECT` with `FINAL`. It is not supported yet. [#25163](https://github.com/ClickHouse/ClickHouse/pull/25163) ([Amos Bird](https://github.com/amosbird)).
* Fix possible parts loss after updating up to 21.5 in case table used `UUID` in partition key. (It is not recommended to use `UUID` in partition key). Fixes [#25070](https://github.com/ClickHouse/ClickHouse/issues/25070). [#25127](https://github.com/ClickHouse/ClickHouse/pull/25127) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix crash in query with cross join and `joined_subquery_requires_alias = 0`. Fixes [#24011](https://github.com/ClickHouse/ClickHouse/issues/24011). [#25082](https://github.com/ClickHouse/ClickHouse/pull/25082) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix a bug with constant maps in the `mapContains` function that led to the error `empty column was returned by function mapContains`. Closes [#25077](https://github.com/ClickHouse/ClickHouse/issues/25077). [#25080](https://github.com/ClickHouse/ClickHouse/pull/25080) ([Kruglov Pavel](https://github.com/Avogar)).
* Remove possibility to create tables with columns referencing themselves like `a UInt32 ALIAS a + 1` or `b UInt32 MATERIALIZED b`. Fixes [#24910](https://github.com/ClickHouse/ClickHouse/issues/24910), [#24292](https://github.com/ClickHouse/ClickHouse/issues/24292). [#25059](https://github.com/ClickHouse/ClickHouse/pull/25059) ([alesapin](https://github.com/alesapin)).
* Fix wrong result when using aggregate projection with *not empty* `GROUP BY` key to execute query with `GROUP BY` by *empty* key. [#25055](https://github.com/ClickHouse/ClickHouse/pull/25055) ([Amos Bird](https://github.com/amosbird)).
* Fix serialization of split nested messages in Protobuf format. This PR fixes [#24647](https://github.com/ClickHouse/ClickHouse/issues/24647). [#25000](https://github.com/ClickHouse/ClickHouse/pull/25000) ([Vitaly Baranov](https://github.com/vitlibar)).
* Fix limit/offset settings for distributed queries (ignore on the remote nodes). [#24940](https://github.com/ClickHouse/ClickHouse/pull/24940) ([Azat Khuzhin](https://github.com/azat)).
* Fix possible heap-buffer-overflow in `Arrow` format. [#24922](https://github.com/ClickHouse/ClickHouse/pull/24922) ([Kruglov Pavel](https://github.com/Avogar)).
* Fixed possible error 'Cannot read from istream at offset 0' when reading a file from DiskS3 (S3 virtual filesystem is an experimental feature under development that should not be used in production). [#24885](https://github.com/ClickHouse/ClickHouse/pull/24885) ([Pavel Kovalenko](https://github.com/Jokser)).
* Fix "Missing columns" exception when joining Distributed Materialized View. [#24870](https://github.com/ClickHouse/ClickHouse/pull/24870) ([Azat Khuzhin](https://github.com/azat)).
* Allow `NULL` values in postgresql compatibility protocol. Closes [#22622](https://github.com/ClickHouse/ClickHouse/issues/22622). [#24857](https://github.com/ClickHouse/ClickHouse/pull/24857) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Fix a bug where the exception `Mutation was killed` could be thrown to the client on mutation wait when the mutation was not loaded into memory yet. [#24809](https://github.com/ClickHouse/ClickHouse/pull/24809) ([alesapin](https://github.com/alesapin)).
* Fixed a bug in deserialization of random generator state which might cause some data types such as `AggregateFunction(groupArraySample(N), T)` to behave in a non-deterministic way. [#24538](https://github.com/ClickHouse/ClickHouse/pull/24538) ([tavplubix](https://github.com/tavplubix)).
* Disallow building uniqXXXXStates of other aggregation states. [#24523](https://github.com/ClickHouse/ClickHouse/pull/24523) ([Raúl Marín](https://github.com/Algunenano)). Then allow it back by actually eliminating the root cause of the related issue. ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fix usage of tuples in `CREATE .. AS SELECT` queries. [#24464](https://github.com/ClickHouse/ClickHouse/pull/24464) ([Anton Popov](https://github.com/CurtizJ)).
* Fix computation of total bytes in the `Buffer` table. Previously, the `total_writes.bytes` counter decreased too much during the buffer flush, leading to counter overflow, with `totalBytes` returning around 17.44 EB some time after the flush. [#24450](https://github.com/ClickHouse/ClickHouse/pull/24450) ([DimasKovas](https://github.com/DimasKovas)).
* Fix incorrect information about the monotonicity of the `toWeek` function. This fixes [#24422](https://github.com/ClickHouse/ClickHouse/issues/24422). The bug was introduced in https://github.com/ClickHouse/ClickHouse/pull/5212 and was exposed later by the smarter partition pruner. [#24446](https://github.com/ClickHouse/ClickHouse/pull/24446) ([Amos Bird](https://github.com/amosbird)).
* When user authentication is managed by LDAP, fixed a potential deadlock that could happen during LDAP role (re)mapping when an LDAP group was mapped to a nonexistent local role. [#24431](https://github.com/ClickHouse/ClickHouse/pull/24431) ([Denis Glazachev](https://github.com/traceon)).
* In "multipart/form-data" message consider the CRLF preceding a boundary as part of it. Fixes [#23905](https://github.com/ClickHouse/ClickHouse/issues/23905). [#24399](https://github.com/ClickHouse/ClickHouse/pull/24399) ([Ivan](https://github.com/abyss7)).
* Fix dropping a partition with intersecting fake parts. In rare cases there might be parts with a mutation version greater than the current block number. [#24321](https://github.com/ClickHouse/ClickHouse/pull/24321) ([Amos Bird](https://github.com/amosbird)).
* Fixed a bug in moving a Materialized View from an Ordinary to an Atomic database (`RENAME TABLE` query). Now the inner table is moved to the new database together with the Materialized View. Fixes [#23926](https://github.com/ClickHouse/ClickHouse/issues/23926). [#24309](https://github.com/ClickHouse/ClickHouse/pull/24309) ([tavplubix](https://github.com/tavplubix)).
* Allow empty HTTP headers. Fixes [#23901](https://github.com/ClickHouse/ClickHouse/issues/23901). [#24285](https://github.com/ClickHouse/ClickHouse/pull/24285) ([Ivan](https://github.com/abyss7)).
* Correct processing of mutations (ALTER UPDATE/DELETE) in Memory tables. Closes [#24274](https://github.com/ClickHouse/ClickHouse/issues/24274). [#24275](https://github.com/ClickHouse/ClickHouse/pull/24275) ([flynn](https://github.com/ucasfl)).
* Make the LowCardinality property of columns in JOIN output the same as in the input; close [#23351](https://github.com/ClickHouse/ClickHouse/issues/23351), close [#20315](https://github.com/ClickHouse/ClickHouse/issues/20315). [#24061](https://github.com/ClickHouse/ClickHouse/pull/24061) ([Vladimir](https://github.com/vdimir)).
* A fix for Kafka tables. Fix the bug in failover behavior when Engine = Kafka was not able to start consumption if the same consumer had an empty assignment previously. Closes [#21118](https://github.com/ClickHouse/ClickHouse/issues/21118). [#21267](https://github.com/ClickHouse/ClickHouse/pull/21267) ([filimonov](https://github.com/filimonov)).
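
A quick check of the big-integer `CAST` fix from this list (in affected versions the result was `0`):

``` sql
SELECT CAST(1e19 AS UInt128);  -- 10000000000000000000
```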

#### Build/Testing/Packaging Improvement

* Add `darwin-aarch64` (Mac M1 / Apple Silicon) builds in CI [#25560](https://github.com/ClickHouse/ClickHouse/pull/25560) ([Ivan](https://github.com/abyss7)) and put the links to the docs and website ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Adds cross-platform embedding of binary resources into executables. It works on Illumos. [#25146](https://github.com/ClickHouse/ClickHouse/pull/25146) ([bnaecker](https://github.com/bnaecker)).
* Add join related options to stress tests to improve fuzzing. [#25200](https://github.com/ClickHouse/ClickHouse/pull/25200) ([Vladimir](https://github.com/vdimir)).
* Enable build with the S3 module on macOS [#25217](https://github.com/ClickHouse/ClickHouse/issues/25217). [#25218](https://github.com/ClickHouse/ClickHouse/pull/25218) ([kevin wan](https://github.com/MaxWk)).
* Add integration test cases to cover JDBC bridge. [#25047](https://github.com/ClickHouse/ClickHouse/pull/25047) ([Zhichun Wu](https://github.com/zhicwu)).
* Integration tests configuration has special treatment for dictionaries. Removed the remaining manual setup of dictionaries. [#24728](https://github.com/ClickHouse/ClickHouse/pull/24728) ([Ilya Yatsishin](https://github.com/qoega)).
* Add libfuzzer tests for YAMLParser class. [#24480](https://github.com/ClickHouse/ClickHouse/pull/24480) ([BoloniniD](https://github.com/BoloniniD)).
* Ubuntu 20.04 is now used to run integration tests, docker-compose version used to run integration tests is updated to 1.28.2. Environment variables now take effect on docker-compose. Rework test_dictionaries_all_layouts_separate_sources to allow parallel run. [#20393](https://github.com/ClickHouse/ClickHouse/pull/20393) ([Ilya Yatsishin](https://github.com/qoega)).
* Fix TOCTOU error in installation script. [#25277](https://github.com/ClickHouse/ClickHouse/pull/25277) ([alexey-milovidov](https://github.com/alexey-milovidov)).

### ClickHouse release 21.6, 2021-06-05

#### Upgrade Notes

@@ -2,11 +2,11 @@

# NOTE: has nothing common with DBMS_TCP_PROTOCOL_VERSION,
# only DBMS_TCP_PROTOCOL_VERSION should be incremented on protocol changes.
SET(VERSION_REVISION 54453)
SET(VERSION_REVISION 54454)
SET(VERSION_MAJOR 21)
SET(VERSION_MINOR 8)
SET(VERSION_MINOR 9)
SET(VERSION_PATCH 1)
SET(VERSION_GITHASH fb895056568e26200629c7d19626e92d2dedc70d)
SET(VERSION_DESCRIBE v21.8.1.1-prestable)
SET(VERSION_STRING 21.8.1.1)
SET(VERSION_GITHASH f48c5af90c2ad51955d1ee3b6b05d006b03e4238)
SET(VERSION_DESCRIBE v21.9.1.1-prestable)
SET(VERSION_STRING 21.9.1.1)
# end of autochange

@@ -53,5 +53,6 @@ macro(clickhouse_embed_binaries)
set_property(SOURCE "${CMAKE_CURRENT_BINARY_DIR}/${ASSEMBLY_FILE_NAME}" APPEND PROPERTY INCLUDE_DIRECTORIES "${EMBED_RESOURCE_DIR}")

target_sources("${EMBED_TARGET}" PRIVATE "${CMAKE_CURRENT_BINARY_DIR}/${ASSEMBLY_FILE_NAME}")
set_target_properties("${EMBED_TARGET}" PROPERTIES OBJECT_DEPENDS "${RESOURCE_FILE}")
endforeach()
endmacro()

2 contrib/h3 (vendored)
@@ -1 +1 @@
Subproject commit e209086ae1b5477307f545a0f6111780edc59940
Subproject commit c7f46cfd71fb60e2fefc90e28abe81657deff735

@@ -3,21 +3,22 @@ set(H3_BINARY_DIR "${ClickHouse_BINARY_DIR}/contrib/h3/src/h3lib")

set(SRCS
"${H3_SOURCE_DIR}/lib/algos.c"
"${H3_SOURCE_DIR}/lib/baseCells.c"
"${H3_SOURCE_DIR}/lib/bbox.c"
"${H3_SOURCE_DIR}/lib/coordijk.c"
"${H3_SOURCE_DIR}/lib/faceijk.c"
"${H3_SOURCE_DIR}/lib/geoCoord.c"
"${H3_SOURCE_DIR}/lib/h3Index.c"
"${H3_SOURCE_DIR}/lib/h3UniEdge.c"
"${H3_SOURCE_DIR}/lib/linkedGeo.c"
"${H3_SOURCE_DIR}/lib/localij.c"
"${H3_SOURCE_DIR}/lib/mathExtensions.c"
"${H3_SOURCE_DIR}/lib/bbox.c"
"${H3_SOURCE_DIR}/lib/polygon.c"
"${H3_SOURCE_DIR}/lib/h3Index.c"
"${H3_SOURCE_DIR}/lib/vec2d.c"
"${H3_SOURCE_DIR}/lib/vec3d.c"
"${H3_SOURCE_DIR}/lib/vertex.c"
"${H3_SOURCE_DIR}/lib/linkedGeo.c"
"${H3_SOURCE_DIR}/lib/localij.c"
"${H3_SOURCE_DIR}/lib/latLng.c"
"${H3_SOURCE_DIR}/lib/directedEdge.c"
"${H3_SOURCE_DIR}/lib/mathExtensions.c"
"${H3_SOURCE_DIR}/lib/iterators.c"
"${H3_SOURCE_DIR}/lib/vertexGraph.c"
"${H3_SOURCE_DIR}/lib/faceijk.c"
"${H3_SOURCE_DIR}/lib/baseCells.c"
)

configure_file("${H3_SOURCE_DIR}/include/h3api.h.in" "${H3_BINARY_DIR}/include/h3api.h")

4 debian/changelog (vendored)
@@ -1,5 +1,5 @@
clickhouse (21.8.1.1) unstable; urgency=low
clickhouse (21.9.1.1) unstable; urgency=low

* Modified source code

-- clickhouse-release <clickhouse-release@yandex-team.ru> Mon, 28 Jun 2021 00:50:15 +0300
-- clickhouse-release <clickhouse-release@yandex-team.ru> Sat, 10 Jul 2021 08:22:49 +0300

@@ -1,7 +1,7 @@
FROM ubuntu:18.04

ARG repository="deb https://repo.clickhouse.tech/deb/stable/ main/"
ARG version=21.8.1.*
ARG version=21.9.1.*

RUN apt-get update \
    && apt-get install --yes --no-install-recommends \

@@ -1,7 +1,7 @@
FROM ubuntu:20.04

ARG repository="deb https://repo.clickhouse.tech/deb/stable/ main/"
ARG version=21.8.1.*
ARG version=21.9.1.*
ARG gosu_ver=1.10

# set non-empty deb_location_url url to create a docker image

@@ -1,7 +1,7 @@
FROM ubuntu:18.04

ARG repository="deb https://repo.clickhouse.tech/deb/stable/ main/"
ARG version=21.8.1.*
ARG version=21.9.1.*

RUN apt-get update && \
    apt-get install -y apt-transport-https dirmngr && \

@@ -1178,11 +1178,11 @@ create view right_async_metric_log as
-- Use the right log as time reference because it may have higher precision.
create table metrics engine File(TSV, 'metrics/metrics.tsv') as
with (select min(event_time) from right_async_metric_log) as min_time
select name metric, r.event_time - min_time event_time, l.value as left, r.value as right
select metric, r.event_time - min_time event_time, l.value as left, r.value as right
from right_async_metric_log r
asof join file('left-async-metric-log.tsv', TSVWithNamesAndTypes,
'$(cat left-async-metric-log.tsv.columns)') l
on l.name = r.name and r.event_time <= l.event_time
on l.metric = r.metric and r.event_time <= l.event_time
order by metric, event_time
;

@@ -23,6 +23,7 @@

<!-- disable jit for perf tests -->
<compile_expressions>0</compile_expressions>
<compile_aggregate_expressions>0</compile_aggregate_expressions>
</default>
</profiles>
<users>

@@ -7,13 +7,13 @@ toc_title: Third-Party Libraries Used

The list of third-party libraries can be obtained by the following query:

```
``` sql
SELECT library_name, license_type, license_path FROM system.licenses ORDER BY library_name COLLATE 'en'
```

[Example](https://gh-api.clickhouse.tech/play?user=play#U0VMRUNUIGxpYnJhcnlfbmFtZSwgbGljZW5zZV90eXBlLCBsaWNlbnNlX3BhdGggRlJPTSBzeXN0ZW0ubGljZW5zZXMgT1JERVIgQlkgbGlicmFyeV9uYW1lIENPTExBVEUgJ2VuJw==)

| library_name | license_type | license_path |
|:-|:-|:-|
| abseil-cpp | Apache | /contrib/abseil-cpp/LICENSE |
| AMQP-CPP | Apache | /contrib/AMQP-CPP/LICENSE |

@@ -89,3 +89,15 @@ SELECT library_name, license_type, license_path FROM system.licenses ORDER BY li
| xz | Public Domain | /contrib/xz/COPYING |
| zlib-ng | zLib | /contrib/zlib-ng/LICENSE.md |
| zstd | BSD | /contrib/zstd/LICENSE |

## Guidelines for adding new third-party libraries and maintaining custom changes in them {#adding-third-party-libraries}

1. All external third-party code should reside in dedicated directories under the `contrib` directory of the ClickHouse repo. Prefer Git submodules, when available.
2. Fork/mirror the official repo in [Clickhouse-extras](https://github.com/ClickHouse-Extras). Prefer official GitHub repos, when available.
3. Branch from the branch you want to integrate, e.g., `master` -> `clickhouse/master`, or `release/vX.Y.Z` -> `clickhouse/release/vX.Y.Z`.
4. All forks in [Clickhouse-extras](https://github.com/ClickHouse-Extras) can be automatically synchronized with upstreams. `clickhouse/...` branches will remain unaffected, since virtually nobody is going to use that naming pattern in their upstream repos.
5. Add submodules under `contrib` of the ClickHouse repo that refer to the above forks/mirrors. Set the submodules to track the corresponding `clickhouse/...` branches.
6. Every time custom changes have to be made in the library code, a dedicated branch should be created, like `clickhouse/my-fix`. Then this branch should be merged into the branch that is tracked by the submodule, e.g., `clickhouse/master` or `clickhouse/release/vX.Y.Z`.
7. No code should be pushed to any branch of the forks in [Clickhouse-extras](https://github.com/ClickHouse-Extras) whose name does not follow the `clickhouse/...` pattern.
8. Always write the custom changes with the official repo in mind. Once the PR is merged from (a feature/fix branch in) your personal fork into the fork in [Clickhouse-extras](https://github.com/ClickHouse-Extras), and the submodule is bumped in the ClickHouse repo, consider opening another PR from (a feature/fix branch in) the fork in [Clickhouse-extras](https://github.com/ClickHouse-Extras) to the official repo of the library. This will make sure that 1) the contribution has more than a single use case and importance, 2) others will also benefit from it, and 3) the change will not remain a maintenance burden solely on ClickHouse developers.
9. When a submodule needs to start using newer code from the original branch (e.g., `master`), and since the custom changes might be merged in the branch it is tracking (e.g., `clickhouse/master`) and so it may diverge from its original counterpart (i.e., `master`), a careful merge should be carried out first, i.e., `master` -> `clickhouse/master`, and only then the submodule can be bumped in ClickHouse.

@@ -237,6 +237,8 @@ The description of ClickHouse architecture can be found here: https://clickhouse

The Code Style Guide: https://clickhouse.tech/docs/en/development/style/

Adding third-party libraries: https://clickhouse.tech/docs/en/development/contrib/#adding-third-party-libraries

Writing tests: https://clickhouse.tech/docs/en/development/tests/

List of tasks: https://github.com/ClickHouse/ClickHouse/issues?q=is%3Aopen+is%3Aissue+label%3A%22easy+task%22

@@ -628,7 +628,7 @@ If the class is not intended for polymorphic use, you do not need to make functi

**18.** Encodings.

Use UTF-8 everywhere. Use `std::string`and`char *`. Do not use `std::wstring`and`wchar_t`.
Use UTF-8 everywhere. Use `std::string` and `char *`. Do not use `std::wstring` and `wchar_t`.

**19.** Logging.

@@ -749,17 +749,9 @@ If your code in the `master` branch is not buildable yet, exclude it from the bu

**1.** The C++20 standard library is used (experimental extensions are allowed), as well as `boost` and `Poco` frameworks.

**2.** If necessary, you can use any well-known libraries available in the OS package.
**2.** It is not allowed to use libraries from OS packages. It is also not allowed to use pre-installed libraries. All libraries should be placed in form of source code in `contrib` directory and built with ClickHouse.

If there is a good solution already available, then use it, even if it means you have to install another library.

(But be prepared to remove bad libraries from code.)

**3.** You can install a library that isn’t in the packages, if the packages do not have what you need or have an outdated version or the wrong type of compilation.

**4.** If the library is small and does not have its own complex build system, put the source files in the `contrib` folder.

**5.** Preference is always given to libraries that are already in use.
**3.** Preference is always given to libraries that are already in use.

## General Recommendations {#general-recommendations-1}

BIN docs/en/images/play.png (new binary file, 26 KiB; not shown)
@@ -7,16 +7,21 @@ toc_title: HTTP Interface

The HTTP interface lets you use ClickHouse on any platform from any programming language. We use it for working from Java and Perl, as well as shell scripts. In other departments, the HTTP interface is used from Perl, Python, and Go. The HTTP interface is more limited than the native interface, but it has better compatibility.

By default, clickhouse-server listens for HTTP on port 8123 (this can be changed in the config).
By default, `clickhouse-server` listens for HTTP on port 8123 (this can be changed in the config).

If you make a GET / request without parameters, it returns the 200 response code and the string which is defined in [http_server_default_response](../operations/server-configuration-parameters/settings.md#server_configuration_parameters-http_server_default_response) (default value “Ok.”, with a line feed at the end).
If you make a `GET /` request without parameters, it returns the 200 response code and the string which is defined in [http_server_default_response](../operations/server-configuration-parameters/settings.md#server_configuration_parameters-http_server_default_response) (default value “Ok.”, with a line feed at the end).

``` bash
$ curl 'http://localhost:8123/'
Ok.
```

Use GET /ping request in health-check scripts. This handler always returns “Ok.” (with a line feed at the end). Available from version 18.12.13.
Web UI can be accessed here: `http://localhost:8123/play`.

![Web UI](../images/play.png)

In health-check scripts use `GET /ping` request. This handler always returns “Ok.” (with a line feed at the end). Available from version 18.12.13.

``` bash
$ curl 'http://localhost:8123/ping'

@@ -51,8 +56,8 @@ X-ClickHouse-Summary: {"read_rows":"0","read_bytes":"0","written_rows":"0","writ
1
```

As you can see, curl is somewhat inconvenient in that spaces must be URL escaped.
Although wget escapes everything itself, we do not recommend using it because it does not work well over HTTP 1.1 when using keep-alive and Transfer-Encoding: chunked.
As you can see, `curl` is somewhat inconvenient in that spaces must be URL escaped.
Although `wget` escapes everything itself, we do not recommend using it because it does not work well over HTTP 1.1 when using keep-alive and Transfer-Encoding: chunked.

``` bash
$ echo 'SELECT 1' | curl 'http://localhost:8123/' --data-binary @-

@@ -75,7 +80,7 @@ ECT 1
, expected One of: SHOW TABLES, SHOW DATABASES, SELECT, INSERT, CREATE, ATTACH, RENAME, DROP, DETACH, USE, SET, OPTIMIZE., e.what() = DB::Exception
```

By default, data is returned in TabSeparated format (for more information, see the “Formats” section).
By default, data is returned in [TabSeparated](formats.md#tabseparated) format.

You use the FORMAT clause of the query to request any other format.

@@ -90,9 +95,11 @@ $ echo 'SELECT 1 FORMAT Pretty' | curl 'http://localhost:8123/?' --data-binary @
└───┘
```

The POST method of transmitting data is necessary for INSERT queries. In this case, you can write the beginning of the query in the URL parameter, and use POST to pass the data to insert. The data to insert could be, for example, a tab-separated dump from MySQL. In this way, the INSERT query replaces LOAD DATA LOCAL INFILE from MySQL.
The POST method of transmitting data is necessary for `INSERT` queries. In this case, you can write the beginning of the query in the URL parameter, and use POST to pass the data to insert. The data to insert could be, for example, a tab-separated dump from MySQL. In this way, the `INSERT` query replaces `LOAD DATA LOCAL INFILE` from MySQL.

Examples: Creating a table:
**Examples**

Creating a table:

``` bash
$ echo 'CREATE TABLE t (a UInt8) ENGINE = Memory' | curl 'http://localhost:8123/' --data-binary @-

@@ -632,6 +639,4 @@ $ curl -vv -H 'XXX:xxx' 'http://localhost:8123/get_relative_path_static_handler'
<
<html><body>Relative Path File</body></html>
* Connection #0 to host localhost left intact
```

[Original article](https://clickhouse.tech/docs/en/interfaces/http_interface/) <!--hide-->

@@ -59,6 +59,7 @@ toc_title: Adopters

| <a href="https://www.huya.com/" class="favicon">HUYA</a> | Video Streaming | Analytics | — | — | [Slides in Chinese, October 2018](https://github.com/ClickHouse/clickhouse-presentations/blob/master/meetup19/7.%20ClickHouse万亿数据分析实践%20李本旺(sundy-li)%20虎牙.pdf) |
| <a href="https://www.the-ica.com/" class="favicon">ICA</a> | FinTech | Risk Management | — | — | [Blog Post in English, Sep 2020](https://altinity.com/blog/clickhouse-vs-redshift-performance-for-fintech-risk-management?utm_campaign=ClickHouse%20vs%20RedShift&utm_content=143520807&utm_medium=social&utm_source=twitter&hss_channel=tw-3894792263) |
| <a href="https://www.idealista.com" class="favicon">Idealista</a> | Real Estate | Analytics | — | — | [Blog Post in English, April 2019](https://clickhouse.tech/blog/en/clickhouse-meetup-in-madrid-on-april-2-2019) |
| <a href="https://infobaleen.com" class="favicon">Infobaleen</a> | AI marketing tool | Analytics | — | — | [Official site](https://infobaleen.com) |
| <a href="https://www.infovista.com/" class="favicon">Infovista</a> | Networks | Analytics | — | — | [Slides in English, October 2019](https://github.com/ClickHouse/clickhouse-presentations/blob/master/meetup30/infovista.pdf) |
| <a href="https://www.innogames.com" class="favicon">InnoGames</a> | Games | Metrics, Logging | — | — | [Slides in Russian, September 2019](https://github.com/ClickHouse/clickhouse-presentations/blob/master/meetup28/graphite_and_clickHouse.pdf) |
| <a href="https://instabug.com/" class="favicon">Instabug</a> | APM Platform | Main product | — | — | [A quote from Co-Founder](https://altinity.com/) |

114 docs/en/operations/clickhouse-keeper.md (new file)
@ -0,0 +1,114 @@
|
||||
---
|
||||
toc_priority: 66
|
||||
toc_title: ClickHouse Keeper
|
||||
---
|
||||
|
||||
# [pre-production] clickhouse-keeper
|
||||
|
||||
ClickHouse server use [ZooKeeper](https://zookeeper.apache.org/) coordination system for data [replication](../engines/table-engines/mergetree-family/replication.md) and [distributed DDL](../sql-reference/distributed-ddl.md) queries execution. ClickHouse Keeper is an alternative coordination system compatible with ZooKeeper.
|
||||
|
||||
!!! warning "Warning"
|
||||
This feature currently in pre-production stage. We test it in our CI and on small internal installations.
|
||||
|
||||
## Implemetation details
|
||||
|
||||
ZooKeeper is one of the first well-known open-source coordination systems. It's implemented in Java, has quite a simple and powerful data model. ZooKeeper's coordination algorithm called ZAB (ZooKeeper Atomic Broadcast) doesn't provide linearizability guarantees for reads, because each ZooKeeper node serves reads locally. Unlike ZooKeeper `clickhouse-keeper` written in C++ and use [RAFT algorithm](https://raft.github.io/) [implementation](https://github.com/eBay/NuRaft). This algorithm allows to have linearizability for reads and writes, has several open-source implementations in different languages.
|
||||
|
||||
By default, `clickhouse-keeper` provides the same guarantees as ZooKeeper (linearizable writes, non-linearizable reads). It has a compatible client-server protocol, so any standard ZooKeeper client can be used to interact with `clickhouse-keeper`. Snapshots and logs have an incompatible format with ZooKeeper, but `clickhouse-keeper-converter` tool allows to convert ZooKeeper data to `clickhouse-keeper` snapshot. Interserver protocol in `clickhouse-keeper` also incompatible with ZooKeeper so mixed ZooKeeper/clickhouse-keeper cluster is impossible.

## Configuration

`clickhouse-keeper` can be used as a standalone replacement for ZooKeeper or as an internal part of `clickhouse-server`, but in both cases the configuration is almost the same `.xml` file. The main `clickhouse-keeper` configuration tag is `<keeper_server>`. Keeper configuration has the following parameters:

- `tcp_port` — the port for clients to connect to (default for ZooKeeper is `2181`)
- `tcp_port_secure` — the secure port for clients to connect to
- `server_id` — unique server id; each participant of the clickhouse-keeper cluster must have a unique number (1, 2, 3, and so on)
- `log_storage_path` — path to coordination logs; as with ZooKeeper, it is best to store logs on a non-busy device
- `snapshot_storage_path` — path to coordination snapshots

Other common parameters are inherited from the clickhouse-server config (`listen_host`, `logger`, and so on).

Internal coordination settings are located in the `<keeper_server>.<coordination_settings>` section:

- `operation_timeout_ms` — timeout for a single client operation
- `session_timeout_ms` — timeout for a client session
- `dead_session_check_period_ms` — how often clickhouse-keeper checks for dead sessions and removes them
- `heart_beat_interval_ms` — how often a clickhouse-keeper leader sends heartbeats to followers
- `election_timeout_lower_bound_ms` — if a follower doesn't receive heartbeats from the leader in this interval, it can initiate leader election
- `election_timeout_upper_bound_ms` — if a follower doesn't receive heartbeats from the leader in this interval, it must initiate leader election
- `rotate_log_storage_interval` — how many log records to store in a single file
- `reserved_log_items` — how many coordination log records to store before compaction
- `snapshot_distance` — how often clickhouse-keeper creates new snapshots (in the number of log records)
- `snapshots_to_keep` — how many snapshots to keep
- `stale_log_gap` — the threshold at which the leader considers a follower stale and sends it a snapshot instead of logs
- `force_sync` — call `fsync` on each write to the coordination log
- `raft_logs_level` — text logging level for coordination (trace, debug, and so on)
- `shutdown_timeout` — time to wait for internal connections to finish during shutdown
- `startup_timeout` — if the server doesn't connect to other quorum participants within the specified timeout, it will terminate

Quorum configuration is located in the `<keeper_server>.<raft_configuration>` section and contains a description of the servers. The only parameter for the whole quorum is `secure`, which enables encrypted connections for communication between quorum participants. The main parameters for each `<server>` are:

- `id` — server_id in the quorum
- `hostname` — hostname of the machine where this server is placed
- `port` — port where this server listens for connections

Examples of configuration for a quorum with three nodes can be found in [integration tests](https://github.com/ClickHouse/ClickHouse/tree/master/tests/integration) with the `test_keeper_` prefix. Example configuration for server #1:

```xml
<keeper_server>
    <tcp_port>2181</tcp_port>
    <server_id>1</server_id>
    <log_storage_path>/var/lib/clickhouse/coordination/log</log_storage_path>
    <snapshot_storage_path>/var/lib/clickhouse/coordination/snapshots</snapshot_storage_path>

    <coordination_settings>
        <operation_timeout_ms>10000</operation_timeout_ms>
        <session_timeout_ms>30000</session_timeout_ms>
        <raft_logs_level>trace</raft_logs_level>
    </coordination_settings>

    <raft_configuration>
        <server>
            <id>1</id>
            <hostname>zoo1</hostname>
            <port>9444</port>
        </server>
        <server>
            <id>2</id>
            <hostname>zoo2</hostname>
            <port>9444</port>
        </server>
        <server>
            <id>3</id>
            <hostname>zoo3</hostname>
            <port>9444</port>
        </server>
    </raft_configuration>
</keeper_server>
```

## How to run

`clickhouse-keeper` is bundled into the `clickhouse-server` package: just add the `<keeper_server>` configuration and start clickhouse-server as usual. If you want to run a standalone `clickhouse-keeper`, you can start it in a similar way with:

```bash
clickhouse-keeper --config /etc/your_path_to_config/config.xml --daemon
```

## [experimental] Migration from ZooKeeper

Seamless migration from ZooKeeper to `clickhouse-keeper` is impossible: you have to stop your ZooKeeper cluster, convert the data, and start `clickhouse-keeper`. The `clickhouse-keeper-converter` tool converts ZooKeeper logs and snapshots to a `clickhouse-keeper` snapshot. It works only with ZooKeeper > 3.4. Steps for migration:

1. Stop all ZooKeeper nodes.

2. [optional, but recommended] Find the ZooKeeper leader node, then start and stop it again. This forces ZooKeeper to create a consistent snapshot.

3. Run `clickhouse-keeper-converter` on the leader, for example:

```bash
clickhouse-keeper-converter --zookeeper-logs-dir /var/lib/zookeeper/version-2 --zookeeper-snapshots-dir /var/lib/zookeeper/version-2 --output-dir /path/to/clickhouse/keeper/snapshots
```

4. Copy the snapshot to `clickhouse-server` nodes with a configured `keeper`, or start `clickhouse-keeper` instead of ZooKeeper. The snapshot must persist only on the leader node; the leader will sync it automatically to the other nodes.

@ -22,6 +22,23 @@ Some settings specified in the main configuration file can be overridden in othe

The config can also define “substitutions”. If an element has the `incl` attribute, the corresponding substitution from the file will be used as the value. By default, the path to the file with substitutions is `/etc/metrika.xml`. This can be changed in the [include_from](../operations/server-configuration-parameters/settings.md#server_configuration_parameters-include_from) element in the server config. The substitution values are specified in `/yandex/substitution_name` elements in this file. If a substitution specified in `incl` does not exist, it is recorded in the log. To prevent ClickHouse from logging missing substitutions, specify the `optional="true"` attribute (for example, settings for [macros](../operations/server-configuration-parameters/settings.md)).
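
For illustration, a hypothetical `macros` substitution could be wired up like this (a sketch; the element names and values are illustrative):

```xml
<!-- In the main server config: take the value from the "macros" substitution,
     and do not log anything if the substitution is missing. -->
<macros incl="macros" optional="true" />
```

And in `/etc/metrika.xml`:

```xml
<yandex>
    <macros>
        <shard>01</shard>
        <replica>replica01</replica>
    </macros>
</yandex>
```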

If you want to replace an entire element with a substitution, use `include` as the element name.

XML substitution example:

```xml
<yandex>
    <!-- Appends XML subtree found at `/profiles-in-zookeeper` ZK path to `<profiles>` element. -->
    <profiles from_zk="/profiles-in-zookeeper" />

    <users>
        <!-- Replaces `include` element with the subtree found at `/users-in-zookeeper` ZK path. -->
        <include from_zk="/users-in-zookeeper" />
        <include from_zk="/other-users-in-zookeeper" />
    </users>
</yandex>
```

Substitutions can also be performed from ZooKeeper. To do this, specify the attribute `from_zk = "/path/to/node"`. The element value is replaced with the contents of the node at `/path/to/node` in ZooKeeper. You can also put an entire XML subtree on the ZooKeeper node and it will be fully inserted into the source element.

## User Settings {#user-settings}

@ -32,6 +49,8 @@ Users configuration can be split into separate files similar to `config.xml`

The directory name is defined as the `users_config` setting without the `.xml` postfix, concatenated with `.d`.
The `users.d` directory is used by default, as `users_config` defaults to `users.xml`.

Note that configuration files are first merged taking into account [Override](#override) settings, and includes are processed after that.

## XML example {#example}

For example, you can have a separate config file for each user like this:
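
A hypothetical per-user file such as `users.d/alice.xml` might look like this (a sketch; the user name and settings are illustrative, not the original example):

```xml
<yandex>
    <users>
        <alice>
            <profile>analytics</profile>
            <networks>
                <ip>::/0</ip>
            </networks>
            <password>alice_password</password>
            <quota>default</quota>
        </alice>
    </users>
</yandex>
```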

@ -1,3 +1,7 @@

---
toc_priority: 212
---

# median {#median}

The `median*` functions are aliases for the corresponding `quantile*` functions. They calculate the median of a numeric data sample.

@ -12,6 +16,7 @@ Functions:

- `medianTimingWeighted` — Alias for [quantileTimingWeighted](../../../sql-reference/aggregate-functions/reference/quantiletimingweighted.md#quantiletimingweighted).
- `medianTDigest` — Alias for [quantileTDigest](../../../sql-reference/aggregate-functions/reference/quantiletdigest.md#quantiletdigest).
- `medianTDigestWeighted` — Alias for [quantileTDigestWeighted](../../../sql-reference/aggregate-functions/reference/quantiletdigestweighted.md#quantiletdigestweighted).
- `medianBFloat16` — Alias for [quantileBFloat16](../../../sql-reference/aggregate-functions/reference/quantilebfloat16.md#quantilebfloat16).

**Example**
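
The body of the original example is not part of this excerpt; a minimal sketch of a `median*` call might look like this (the table and column names are illustrative):

``` sql
SELECT medianDeterministic(val, 1) FROM t;
```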

@ -0,0 +1,64 @@

---
toc_priority: 209
---

# quantileBFloat16 {#quantilebfloat16}

Computes an approximate [quantile](https://en.wikipedia.org/wiki/Quantile) of a sample consisting of [bfloat16](https://en.wikipedia.org/wiki/Bfloat16_floating-point_format) numbers. `bfloat16` is a floating-point data type with 1 sign bit, 8 exponent bits and 7 fraction bits.
The function converts input values to 32-bit floats and takes the most significant 16 bits. Then it calculates the `bfloat16` quantile value and converts the result to a 64-bit float by appending zero bits.
The function is a fast quantile estimator with a relative error of no more than 0.390625%.

**Syntax**

``` sql
quantileBFloat16[(level)](expr)
```

Alias: `medianBFloat16`.

**Arguments**

- `expr` — Column with numeric data. [Integer](../../../sql-reference/data-types/int-uint.md), [Float](../../../sql-reference/data-types/float.md).

**Parameters**

- `level` — Level of quantile. Optional. Possible values are in the range from 0 to 1. Default value: 0.5. [Float](../../../sql-reference/data-types/float.md).

**Returned value**

- Approximate quantile of the specified level.

Type: [Float64](../../../sql-reference/data-types/float.md#float32-float64).

**Example**

The input table has an integer column and a float column:

``` text
┌─a─┬─────b─┐
│ 1 │ 1.001 │
│ 2 │ 1.002 │
│ 3 │ 1.003 │
│ 4 │ 1.004 │
└───┴───────┘
```

Query to calculate the 0.75-quantile (third quartile):

``` sql
SELECT quantileBFloat16(0.75)(a), quantileBFloat16(0.75)(b) FROM example_table;
```

Result:

``` text
┌─quantileBFloat16(0.75)(a)─┬─quantileBFloat16(0.75)(b)─┐
│                         3 │                         1 │
└───────────────────────────┴───────────────────────────┘
```

Note that all floating-point values in the example are truncated to 1.0 when converting to `bfloat16`.

**See Also**

- [median](../../../sql-reference/aggregate-functions/reference/median.md#median)
- [quantiles](../../../sql-reference/aggregate-functions/reference/quantiles.md#quantiles)

@ -74,7 +74,7 @@ When using multiple `quantile*` functions with different levels in a query, the

**Syntax**

``` sql
quantileExactLow(level)(expr)
```

Alias: `medianExactLow`.

@ -8,7 +8,7 @@ toc_priority: 201

Syntax: `quantiles(level1, level2, …)(x)`

All the quantile functions also have corresponding quantiles functions: `quantiles`, `quantilesDeterministic`, `quantilesTiming`, `quantilesTimingWeighted`, `quantilesExact`, `quantilesExactWeighted`, `quantilesTDigest`, `quantilesBFloat16`. These functions calculate all the quantiles of the listed levels in one pass, and return an array of the resulting values.
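
For instance, computing several levels in one pass might look like this (a sketch; the table and column names are illustrative):

``` sql
SELECT quantiles(0.5, 0.9, 0.99)(response_ms) FROM latencies;
```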

## quantilesExactExclusive {#quantilesexactexclusive}

@ -18,7 +18,7 @@ To get exact value, all the passed values are combined into an array, whic

This function is equivalent to the [PERCENTILE.EXC](https://support.microsoft.com/en-us/office/percentile-exc-function-bbaa7204-e9e1-4010-85bf-c31dc5dce4ba) Excel function ([type R6](https://en.wikipedia.org/wiki/Quantile#Estimating_quantiles_from_a_sample)).

Works more efficiently with sets of levels than [quantileExactExclusive](../../../sql-reference/aggregate-functions/reference/quantileexact.md#quantileexactexclusive).

**Syntax**

@ -70,7 +70,7 @@ To get exact value, all the passed values are combined into an array, whic

This function is equivalent to the [PERCENTILE.INC](https://support.microsoft.com/en-us/office/percentile-inc-function-680f9539-45eb-410b-9a5e-c1355e5fe2ed) Excel function ([type R7](https://en.wikipedia.org/wiki/Quantile#Estimating_quantiles_from_a_sample)).

Works more efficiently with sets of levels than [quantileExactInclusive](../../../sql-reference/aggregate-functions/reference/quantileexact.md#quantileexactinclusive).

**Syntax**

@ -129,7 +129,7 @@ That dictionary source can be configured only via XML configuration. Creating di

## Executable Pool {#dicts-external_dicts_dict_sources-executable_pool}

Executable pool allows loading data from a pool of processes. This source does not work with dictionary layouts that need to load all data from the source. Executable pool works if the dictionary [is stored](external-dicts-dict-layout.md#ways-to-store-dictionaries-in-memory) using `cache`, `complex_key_cache`, `ssd_cache`, `complex_key_ssd_cache`, `direct`, or `complex_key_direct` layouts.

Executable pool will spawn a pool of processes with the specified command and keep them running until they exit. The program should read data from STDIN while it is available and output the result to STDOUT, and it can wait for the next block of data on STDIN. ClickHouse will not close STDIN after processing a block of data but will pipe another chunk of data when needed. The executable script should be ready for this way of data processing — it should poll STDIN and flush data to STDOUT early.
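
As a rough illustration of this protocol, a hypothetical pool executable could look like the following sketch (the script is illustrative, not part of the original documentation):

```bash
#!/bin/bash
# Keep running: read each incoming line from STDIN as it arrives
# and write a result to STDOUT immediately, flushing early.
while read -r line; do
    echo "result_for_${line}"
done
```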

@ -581,6 +581,7 @@ Example of settings:

        <db>default</db>
        <table>ids</table>
        <where>id=10</where>
        <secure>1</secure>
    </clickhouse>
</source>
```

@ -596,6 +597,7 @@ SOURCE(CLICKHOUSE(

    db 'default'
    table 'ids'
    where 'id=10'
    secure 1
))
```

@ -609,6 +611,7 @@ Setting fields:

- `table` – Name of the table.
- `where` – The selection criteria. May be omitted.
- `invalidate_query` – Query for checking the dictionary status. Optional parameter. Read more in the section [Updating dictionaries](../../../sql-reference/dictionaries/external-dictionaries/external-dicts-dict-lifetime.md).
- `secure` – Use SSL for the connection.

### MongoDB {#dicts-external_dicts_dict_sources-mongodb}
@ -159,7 +159,7 @@ Configuration fields:
| Tag | Description | Required |
|-----|-------------|----------|
| `name` | Column name. | Yes |
| `type` | ClickHouse data type: [UInt8](../../../sql-reference/data-types/int-uint.md), [UInt16](../../../sql-reference/data-types/int-uint.md), [UInt32](../../../sql-reference/data-types/int-uint.md), [UInt64](../../../sql-reference/data-types/int-uint.md), [Int8](../../../sql-reference/data-types/int-uint.md), [Int16](../../../sql-reference/data-types/int-uint.md), [Int32](../../../sql-reference/data-types/int-uint.md), [Int64](../../../sql-reference/data-types/int-uint.md), [Float32](../../../sql-reference/data-types/float.md), [Float64](../../../sql-reference/data-types/float.md), [UUID](../../../sql-reference/data-types/uuid.md), [Decimal32](../../../sql-reference/data-types/decimal.md), [Decimal64](../../../sql-reference/data-types/decimal.md), [Decimal128](../../../sql-reference/data-types/decimal.md), [Decimal256](../../../sql-reference/data-types/decimal.md), [String](../../../sql-reference/data-types/string.md), [Array](../../../sql-reference/data-types/array.md).<br/>ClickHouse tries to cast value from dictionary to the specified data type. For example, for MySQL, the field might be `TEXT`, `VARCHAR`, or `BLOB` in the MySQL source table, but it can be uploaded as `String` in ClickHouse.<br/>[Nullable](../../../sql-reference/data-types/nullable.md) is currently supported for [Flat](external-dicts-dict-layout.md#flat), [Hashed](external-dicts-dict-layout.md#dicts-external_dicts_dict_layout-hashed), [ComplexKeyHashed](external-dicts-dict-layout.md#complex-key-hashed), [Direct](external-dicts-dict-layout.md#direct), [ComplexKeyDirect](external-dicts-dict-layout.md#complex-key-direct), [RangeHashed](external-dicts-dict-layout.md#range-hashed), [Polygon](external-dicts-dict-polygon.md), [Cache](external-dicts-dict-layout.md#cache), [ComplexKeyCache](external-dicts-dict-layout.md#complex-key-cache), [SSDCache](external-dicts-dict-layout.md#ssd-cache), [SSDComplexKeyCache](external-dicts-dict-layout.md#complex-key-ssd-cache) dictionaries. In [IPTrie](external-dicts-dict-layout.md#ip-trie) dictionaries `Nullable` types are not supported. | Yes |
| `null_value` | Default value for a non-existing element.<br/>In the example, it is an empty string. [NULL](../../syntax.md#null-literal) value can be used only for the `Nullable` types (see the previous line with types description). | Yes |
| `expression` | [Expression](../../../sql-reference/syntax.md#syntax-expressions) that ClickHouse executes on the value.<br/>The expression can be a column name in the remote SQL database. Thus, you can use it to create an alias for the remote column.<br/><br/>Default value: no expression. | No |
| <a name="hierarchical-dict-attr"></a> `hierarchical` | If `true`, the attribute contains the value of a parent key for the current key. See [Hierarchical Dictionaries](../../../sql-reference/dictionaries/external-dictionaries/external-dicts-dict-hierarchical.md).<br/><br/>Default value: `false`. | No |

@ -12,7 +12,7 @@ For information on connecting and configuring external dictionaries, see [Extern

## dictGet, dictGetOrDefault, dictGetOrNull {#dictget}

Retrieves values from an external dictionary.

``` sql
dictGet('dict_name', attr_names, id_expr)
```
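
As an aside, a minimal call might look like this (a sketch; the dictionary name, attribute, and key are illustrative):

``` sql
SELECT dictGet('my_dict', 'attr', toUInt64(1));
```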

@ -24,7 +24,7 @@ dictGetOrNull('dict_name', attr_name, id_expr)

- `dict_name` — Name of the dictionary. [String literal](../../sql-reference/syntax.md#syntax-string-literal).
- `attr_names` — Name of the column of the dictionary, [String literal](../../sql-reference/syntax.md#syntax-string-literal), or tuple of column names, [Tuple](../../sql-reference/data-types/tuple.md)([String literal](../../sql-reference/syntax.md#syntax-string-literal)).
- `id_expr` — Key value. [Expression](../../sql-reference/syntax.md#syntax-expressions) returning a dictionary key-type value or a [Tuple](../../sql-reference/data-types/tuple.md)-type value, depending on the dictionary configuration.
- `default_value_expr` — Values returned if the dictionary does not contain a row with the `id_expr` key. [Expression](../../sql-reference/syntax.md#syntax-expressions) or [Tuple](../../sql-reference/data-types/tuple.md)([Expression](../../sql-reference/syntax.md#syntax-expressions)), returning the value (or values) in the data types configured for the `attr_names` attribute.

**Returned value**

@ -138,7 +138,7 @@ Configure the external dictionary:

        <name>c2</name>
        <type>String</type>
        <null_value></null_value>
    </attribute>
</structure>
<lifetime>0</lifetime>
</dictionary>

@ -237,7 +237,7 @@ dictHas('dict_name', id_expr)

**Arguments**

- `dict_name` — Name of the dictionary. [String literal](../../sql-reference/syntax.md#syntax-string-literal).
- `id_expr` — Key value. [Expression](../../sql-reference/syntax.md#syntax-expressions) returning a dictionary key-type value or a [Tuple](../../sql-reference/data-types/tuple.md)-type value, depending on the dictionary configuration.

**Returned value**

@ -292,16 +292,16 @@ Type: `UInt8`.

Returns first-level children as an array of indexes. It is the inverse transformation for [dictGetHierarchy](#dictgethierarchy).

**Syntax**

``` sql
dictGetChildren(dict_name, key)
```

**Arguments**

- `dict_name` — Name of the dictionary. [String literal](../../sql-reference/syntax.md#syntax-string-literal).
- `key` — Key value. [Expression](../../sql-reference/syntax.md#syntax-expressions) returning a [UInt64](../../sql-reference/data-types/int-uint.md)-type value.

**Returned values**

@ -339,7 +339,7 @@ SELECT dictGetChildren('hierarchy_flat_dictionary', number) FROM system.numbers

## dictGetDescendant {#dictgetdescendant}

Returns all descendants as if the [dictGetChildren](#dictgetchildren) function was applied `level` times recursively.

**Syntax**

@ -347,9 +347,9 @@ Returns all descendants as if [dictGetChildren](#dictgetchildren) function was a

``` sql
dictGetDescendants(dict_name, key, level)
```

**Arguments**

- `dict_name` — Name of the dictionary. [String literal](../../sql-reference/syntax.md#syntax-string-literal).
- `key` — Key value. [Expression](../../sql-reference/syntax.md#syntax-expressions) returning a [UInt64](../../sql-reference/data-types/int-uint.md)-type value.
- `level` — Hierarchy level. If `level = 0`, returns all descendants to the end. [UInt8](../../sql-reference/data-types/int-uint.md).
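
A usage sketch, modeled on the `dictGetChildren` example referenced above (the dictionary name is illustrative):

``` sql
SELECT dictGetDescendants('hierarchy_flat_dictionary', number, 1) FROM system.numbers LIMIT 4;
```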

@ -5,15 +5,186 @@ toc_title: Logical

# Logical Functions {#logical-functions}

Logical functions perform logical operations on arguments of any numeric types, but return a [UInt8](../../sql-reference/data-types/int-uint.md) number equal to 0 or 1, or `NULL` in some cases.

Zero as an argument is considered `false`, while any non-zero value is considered `true`.

## and {#logical-and-function}

Calculates the result of the logical conjunction between two or more values. Corresponds to the [Logical AND Operator](../../sql-reference/operators/index.md#logical-and-operator).

**Syntax**

``` sql
and(val1, val2...)
```

**Arguments**

- `val1, val2, ...` — List of at least two values. [Int](../../sql-reference/data-types/int-uint.md), [UInt](../../sql-reference/data-types/int-uint.md), [Float](../../sql-reference/data-types/float.md) or [Nullable](../../sql-reference/data-types/nullable.md).

**Returned value**

- `0`, if at least one argument is zero.
- `NULL`, if no argument is zero and at least one argument is `NULL`.
- `1`, otherwise.

Type: [UInt8](../../sql-reference/data-types/int-uint.md) or [Nullable](../../sql-reference/data-types/nullable.md)([UInt8](../../sql-reference/data-types/int-uint.md)).

**Example**

Query:

``` sql
SELECT and(0, 1, -2);
```

Result:

``` text
┌─and(0, 1, -2)─┐
│             0 │
└───────────────┘
```

With `NULL`:

``` sql
SELECT and(NULL, 1, 10, -2);
```

Result:

``` text
┌─and(NULL, 1, 10, -2)─┐
│                 ᴺᵁᴸᴸ │
└──────────────────────┘
```

## or {#logical-or-function}

Calculates the result of the logical disjunction between two or more values. Corresponds to the [Logical OR Operator](../../sql-reference/operators/index.md#logical-or-operator).

**Syntax**

``` sql
or(val1, val2...)
```

**Arguments**

- `val1, val2, ...` — List of at least two values. [Int](../../sql-reference/data-types/int-uint.md), [UInt](../../sql-reference/data-types/int-uint.md), [Float](../../sql-reference/data-types/float.md) or [Nullable](../../sql-reference/data-types/nullable.md).

**Returned value**

- `1`, if at least one argument is non-zero.
- `0`, if all arguments are zero.
- `NULL`, if all arguments are either zero or `NULL`, with at least one `NULL`.

Type: [UInt8](../../sql-reference/data-types/int-uint.md) or [Nullable](../../sql-reference/data-types/nullable.md)([UInt8](../../sql-reference/data-types/int-uint.md)).

**Example**

Query:

``` sql
SELECT or(1, 0, 0, 2, NULL);
```

Result:

``` text
┌─or(1, 0, 0, 2, NULL)─┐
│                    1 │
└──────────────────────┘
```

With `NULL`:

``` sql
SELECT or(0, NULL);
```

Result:

``` text
┌─or(0, NULL)─┐
│        ᴺᵁᴸᴸ │
└─────────────┘
```

## not {#logical-not-function}

Calculates the result of the logical negation of a value. Corresponds to the [Logical Negation Operator](../../sql-reference/operators/index.md#logical-negation-operator).

**Syntax**

``` sql
not(val);
```

**Arguments**

- `val` — The value. [Int](../../sql-reference/data-types/int-uint.md), [UInt](../../sql-reference/data-types/int-uint.md), [Float](../../sql-reference/data-types/float.md) or [Nullable](../../sql-reference/data-types/nullable.md).

**Returned value**

- `1`, if `val` is `0`.
- `0`, if `val` is a non-zero value.
- `NULL`, if `val` is `NULL`.

Type: [UInt8](../../sql-reference/data-types/int-uint.md) or [Nullable](../../sql-reference/data-types/nullable.md)([UInt8](../../sql-reference/data-types/int-uint.md)).

**Example**

Query:

``` sql
SELECT NOT(1);
```

Result:

``` text
┌─not(1)─┐
│      0 │
└────────┘
```

## xor {#logical-xor-function}

Calculates the result of the logical exclusive disjunction between two or more values. For more than two values, the function first computes `XOR` of the first two values, then computes `XOR` of the result with the next value, and so on.

**Syntax**

``` sql
xor(val1, val2...)
```

**Arguments**

- `val1, val2, ...` — List of at least two values. [Int](../../sql-reference/data-types/int-uint.md), [UInt](../../sql-reference/data-types/int-uint.md), [Float](../../sql-reference/data-types/float.md) or [Nullable](../../sql-reference/data-types/nullable.md).

**Returned value**

- `1`, for two values: if one of the values is zero and the other is not.
- `0`, for two values: if both values are zero or both are non-zero.
- `NULL`, if there is at least one `NULL` value.

Type: [UInt8](../../sql-reference/data-types/int-uint.md) or [Nullable](../../sql-reference/data-types/nullable.md)([UInt8](../../sql-reference/data-types/int-uint.md)).

**Example**

Query:

``` sql
SELECT xor(0, 1, 1);
```

Result:

``` text
┌─xor(0, 1, 1)─┐
│            0 │
└──────────────┘
```

@ -87,6 +87,8 @@ Result:

└───────┴───────┘
```

Note: the names are implementation-specific and are subject to change. You should not assume specific names of the columns after application of `untuple`.

Example of using an `EXCEPT` expression:

Query:

@ -211,17 +211,17 @@ SELECT toDateTime('2014-10-26 00:00:00', 'Europe/Moscow') AS time, time + 60 * 6

- [Interval](../../sql-reference/data-types/special-data-types/interval.md) data type
- [toInterval](../../sql-reference/functions/type-conversion-functions.md#function-tointerval) type conversion functions

## Logical AND Operator {#logical-and-operator}

Syntax `SELECT a AND b` — calculates the logical conjunction of `a` and `b` with the function [and](../../sql-reference/functions/logical-functions.md#logical-and-function).

## Logical OR Operator {#logical-or-operator}

Syntax `SELECT a OR b` — calculates the logical disjunction of `a` and `b` with the function [or](../../sql-reference/functions/logical-functions.md#logical-or-function).

## Logical Negation Operator {#logical-negation-operator}

Syntax `SELECT NOT a` — calculates the logical negation of `a` with the function [not](../../sql-reference/functions/logical-functions.md#logical-not-function).
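
For instance, the operators and the underlying functions are interchangeable (a minimal sketch):

``` sql
SELECT 1 AND 0, and(1, 0), NOT 0, not(0);
```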

## Conditional Operator {#conditional-operator}

@ -749,19 +749,11 @@ The CPU instruction set is the minimal set supported across servers.

## Libraries {#libraries}

**1.** The C++20 standard library is used (experimental extensions are allowed), as well as the `boost` and `Poco` frameworks.

**2.** It is not allowed to use libraries from OS packages. It is also not allowed to use pre-installed libraries. All libraries should be placed in the form of source code in the `contrib` directory and built with ClickHouse.

**3.** Preference is always given to libraries that are already in use.

## General Recommendations {#general-recommendations-1}

@ -824,17 +824,9 @@ The dictionary is configured incorrectly.

**1.** The C++20 standard library is used (experimental extensions are allowed), as well as the `boost` and `Poco` frameworks.

**2.** Libraries must be placed as source code in the `contrib` directory and built together with ClickHouse. It is not allowed to use libraries available in OS packages or any other way of installing libraries into the system.

**3.** Preference is given to libraries that are already in use.

## General {#obshchee-1}

@ -61,4 +61,4 @@ clickhouse client --secure -h play-api.clickhouse.tech --port 9440 -u playground

The Playground backend is a ClickHouse cluster without any additional server applications. As mentioned above, the HTTPS and TCP/TLS connection methods are publicly available as part of the Playground. They are proxied through [Cloudflare Spectrum](https://www.cloudflare.com/products/cloudflare-spectrum/) to add an extra layer of protection and improved global connectivity.

!!! warning "Warning"
    Opening the ClickHouse server to public access in any other situation is **strongly discouraged**. Make sure it is accessible only from a private network and protected by a firewall.

docs/ru/images/play.png (new binary file, 26 KiB; not shown)

@ -5,30 +5,33 @@ toc_title: "HTTP interface"

# HTTP Interface {#http-interface}

The HTTP interface lets you use ClickHouse on any platform from any programming language. We use it for working from Java and Perl, as well as shell scripts. In other departments, the HTTP interface is used from Perl, Python, and Go. The HTTP interface is more limited than the native interface, but it has better compatibility.

By default, `clickhouse-server` listens for HTTP on port 8123 (this can be changed in the config).
If you make a `GET /` request without parameters, it returns the string defined by the [http_server_default_response](../operations/server-configuration-parameters/settings.md#server_configuration_parameters-http_server_default_response) setting. The default value is "Ok." (with a line feed at the end).

``` bash
$ curl 'http://localhost:8123/'
Ok.
```

The web UI is available at `http://localhost:8123/play`.

![Web UI](../images/play.png)

In health-check scripts, you can use a `GET /ping` request without parameters. If the server is available, it always returns "Ok." (with a line feed at the end).

``` bash
$ curl 'http://localhost:8123/ping'
Ok.
```

The query is sent as a URL parameter named `query`, or as the request body when using the POST method.
Alternatively, the beginning of the query can be sent in the `query` URL parameter and the rest via POST (we will explain later why this is necessary). The URL size is limited to 16 KB, so keep this in mind when sending large queries.

On success, you receive the 200 response code and the query result in the response body; on error, the 500 response code and an error description in the response body.

When using the GET method, the readonly setting is applied. In other words, for queries that modify data, you can only use the POST method. The query itself can be sent either in the POST body or in the URL parameter.

Examples:

@ -51,8 +54,8 @@ X-ClickHouse-Summary: {"read_rows":"0","read_bytes":"0","written_rows":"0","writ

1
```

As you can see, `curl` is somewhat inconvenient in that spaces must be URL-escaped.
Although `wget` escapes everything itself, we do not recommend using it, because it does not work well over HTTP 1.1 when using `keep-alive` and `Transfer-Encoding: chunked`.

``` bash
$ echo 'SELECT 1' | curl 'http://localhost:8123/' --data-binary @-

@ -65,7 +68,7 @@ $ echo '1' | curl 'http://localhost:8123/?query=SELECT' --data-binary @-

1
```

If part of the query is sent in the parameter and part via POST, a line feed is inserted between these two pieces of data.
Example (this will not work):

``` bash
@ -75,9 +78,9 @@ ECT 1
, expected One of: SHOW TABLES, SHOW DATABASES, SELECT, INSERT, CREATE, ATTACH, RENAME, DROP, DETACH, USE, SET, OPTIMIZE., e.what() = DB::Exception
```

By default, data is returned in the [TabSeparated](formats.md#tabseparated) format.

You can specify any other format using the FORMAT clause of the query.

You can also use the `default_format` URL parameter or the `X-ClickHouse-Format` header to specify a default format other than `TabSeparated`.

@ -90,9 +93,10 @@ $ echo 'SELECT 1 FORMAT Pretty' | curl 'http://localhost:8123/?' --data-binary @

└───┘
```

The ability to pass data via POST is needed for `INSERT` queries. In this case, you can write the beginning of the query in the URL parameter and pass the data to insert via POST. The inserted data could be, for example, a tab-separated dump from MySQL. In this way, the `INSERT` query replaces `LOAD DATA LOCAL INFILE` from MySQL.

**Examples**

Create a table:

``` bash
@ -147,7 +151,7 @@ $ curl 'http://localhost:8123/?query=SELECT%20a%20FROM%20t'
$ echo 'DROP TABLE t' | curl 'http://localhost:8123/' --data-binary @-
```

For queries that do not return a data table, an empty response body is returned on success.

## Compression {#compression}

@ -165,7 +169,7 @@ $ echo 'DROP TABLE t' | curl 'http://localhost:8123/' --data-binary @-

- `deflate`
- `xz`

To send a compressed `POST` request, add the `Content-Encoding: compression_method` header.
To have ClickHouse compress the response, enable compression with the [enable_http_compression](../operations/settings/settings.md#settings-enable_http_compression) setting and add the `Accept-Encoding: compression_method` header. The data compression level for all compression methods can be set with the [http_zlib_compression_level](../operations/settings/settings.md#settings-http_zlib_compression_level) setting.
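
For example, a compressed round trip might look like this (a sketch; it assumes `gzip` is available on the client):

``` bash
$ echo 'SELECT 1' | gzip -c | curl -sS --data-binary @- -H 'Content-Encoding: gzip' -H 'Accept-Encoding: gzip' 'http://localhost:8123/?enable_http_compression=1' | gzip -d
```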

!!! note "Note"

@ -281,13 +285,13 @@ X-ClickHouse-Progress: {"read_rows":"8783786","read_bytes":"819092887","total_ro

The HTTP interface allows passing external data (external temporary tables) for use in a query. For details, see the section "External data for query processing".

## Response Buffering {#response-buffering}

You can enable response buffering on the server side. The `buffer_size` and `wait_end_of_query` URL parameters are provided for this purpose.

`buffer_size` determines the number of bytes of the result to buffer in server memory. If the result body exceeds this threshold, the buffer is written to the HTTP channel and the remaining data is sent directly to the HTTP channel.

To guarantee that the entire response is buffered, set `wait_end_of_query=1`. In this case, data that does not fit in memory is buffered in a temporary server file.

Example:

@ -295,7 +299,7 @@ The HTTP interface allows passing external data

``` bash
$ curl -sS 'http://localhost:8123/?max_result_bytes=4000000&buffer_size=3000000&wait_end_of_query=1' -d 'SELECT toUInt8(number) FROM system.numbers LIMIT 9000000 FORMAT RowBinary'
```

Buffering helps avoid situations where the response code and HTTP headers have already been sent to the client and a query execution error occurs afterwards. In such a situation, an error message is written at the end of the response body, and on the client side the error can only be detected at the parsing stage.

### Queries with Parameters {#cli-queries-with-parameters}

@ -634,4 +638,3 @@ $ curl -vv -H 'XXX:xxx' 'http://localhost:8123/get_relative_path_static_handler'

<html><body>Relative Path File</body></html>
* Connection #0 to host localhost left intact
```

@ -1,17 +1,19 @@

# median {#median}

The `median*` functions are synonyms for the corresponding `quantile*` functions. They calculate the median of a numeric sequence.

Functions:

- `median` — Synonym for [quantile](../../../sql-reference/aggregate-functions/reference/quantile.md#quantile).
- `medianDeterministic` — Synonym for [quantileDeterministic](../../../sql-reference/aggregate-functions/reference/quantiledeterministic.md#quantiledeterministic).
- `medianExact` — Synonym for [quantileExact](../../../sql-reference/aggregate-functions/reference/quantileexact.md#quantileexact).
- `medianExactWeighted` — Synonym for [quantileExactWeighted](../../../sql-reference/aggregate-functions/reference/quantileexactweighted.md#quantileexactweighted).
- `medianTiming` — Synonym for [quantileTiming](../../../sql-reference/aggregate-functions/reference/quantiletiming.md#quantiletiming).
- `medianTimingWeighted` — Synonym for [quantileTimingWeighted](../../../sql-reference/aggregate-functions/reference/quantiletimingweighted.md#quantiletimingweighted).
- `medianTDigest` — Synonym for [quantileTDigest](../../../sql-reference/aggregate-functions/reference/quantiletdigest.md#quantiletdigest).
- `medianTDigestWeighted` — Synonym for [quantileTDigestWeighted](../../../sql-reference/aggregate-functions/reference/quantiletdigestweighted.md#quantiletdigestweighted).
- `medianBFloat16` — Synonym for [quantileBFloat16](../../../sql-reference/aggregate-functions/reference/quantilebfloat16.md#quantilebfloat16).

**Example**

@ -0,0 +1,64 @@

---
toc_priority: 209
---

# quantileBFloat16 {#quantilebfloat16}

Approximately computes the [quantile](https://en.wikipedia.org/wiki/Quantile) of a sample of numbers in the [bfloat16](https://en.wikipedia.org/wiki/Bfloat16_floating-point_format) format. `bfloat16` is a floating-point format with 1 sign bit, 8 exponent bits, and 7 fraction bits.
The function converts the input value to a 32-bit float and takes its most significant 16 bits. It calculates the quantile in the `bfloat16` format and converts the result to a 64-bit float by appending zero bits.
The function is a fast quantile estimator with a relative error of no more than 0.390625%.

**Syntax**

``` sql
quantileBFloat16[(level)](expr)
```

Alias: `medianBFloat16`.

**Arguments**

- `expr` — Column with numeric data. [Integer](../../../sql-reference/data-types/int-uint.md), [Float](../../../sql-reference/data-types/float.md).

**Parameters**

- `level` — Level of quantile. Optional. Possible values are in the range from 0 to 1. Default value: 0.5. [Float](../../../sql-reference/data-types/float.md).

**Returned value**

- Approximate value of the quantile.

Type: [Float64](../../../sql-reference/data-types/float.md#float32-float64).

**Example**

The table has columns with integers and with floating-point numbers:

``` text
┌─a─┬─────b─┐
│ 1 │ 1.001 │
│ 2 │ 1.002 │
│ 3 │ 1.003 │
│ 4 │ 1.004 │
└───┴───────┘
```

Query to calculate the 0.75-quantile (upper quartile):

``` sql
SELECT quantileBFloat16(0.75)(a), quantileBFloat16(0.75)(b) FROM example_table;
```

Result:

``` text
┌─quantileBFloat16(0.75)(a)─┬─quantileBFloat16(0.75)(b)─┐
│                         3 │                         1 │
└───────────────────────────┴───────────────────────────┘
```

Note that all floating-point values in the example were rounded to 1.0 when converting to `bfloat16`.

**See Also**

- [median](../../../sql-reference/aggregate-functions/reference/median.md#median)
- [quantiles](../../../sql-reference/aggregate-functions/reference/quantiles.md#quantiles)

@ -8,7 +8,7 @@ toc_priority: 201

Syntax: `quantiles(level1, level2, …)(x)`

All the quantile functions also have corresponding quantiles functions: `quantiles`, `quantilesDeterministic`, `quantilesTiming`, `quantilesTimingWeighted`, `quantilesExact`, `quantilesExactWeighted`, `quantilesTDigest`, `quantilesBFloat16`. These functions calculate all the quantiles of the listed levels in one pass and return an array of the computed values.

## quantilesExactExclusive {#quantilesexactexclusive}

@ -18,7 +18,7 @@ toc_priority: 201

This function is equivalent to the Excel function [PERCENTILE.EXC](https://support.microsoft.com/en-us/office/percentile-exc-function-bbaa7204-e9e1-4010-85bf-c31dc5dce4ba) ([type R6](https://en.wikipedia.org/wiki/Quantile#Estimating_quantiles_from_a_sample)).

Works more efficiently with sets of levels than [quantileExactExclusive](../../../sql-reference/aggregate-functions/reference/quantileexact.md#quantileexactexclusive).

**Syntax**

@ -70,7 +70,7 @@ SELECT quantilesExactExclusive(0.25, 0.5, 0.75, 0.9, 0.95, 0.99, 0.999)(x) FROM

This function is equivalent to the Excel function [PERCENTILE.INC](https://support.microsoft.com/en-us/office/percentile-inc-function-680f9539-45eb-410b-9a5e-c1355e5fe2ed) ([type R7](https://en.wikipedia.org/wiki/Quantile#Estimating_quantiles_from_a_sample)).

Works more efficiently with sets of levels than [quantileExactInclusive](../../../sql-reference/aggregate-functions/reference/quantileexact.md#quantileexactinclusive).

**Syntax**
@ -159,7 +159,7 @@ CREATE DICTIONARY somename (

| Tag | Description | Required |
|-----|-------------|----------|
| `name` | Column name. | Yes |
| `type` | ClickHouse data type: [UInt8](../../../sql-reference/data-types/int-uint.md), [UInt16](../../../sql-reference/data-types/int-uint.md), [UInt32](../../../sql-reference/data-types/int-uint.md), [UInt64](../../../sql-reference/data-types/int-uint.md), [Int8](../../../sql-reference/data-types/int-uint.md), [Int16](../../../sql-reference/data-types/int-uint.md), [Int32](../../../sql-reference/data-types/int-uint.md), [Int64](../../../sql-reference/data-types/int-uint.md), [Float32](../../../sql-reference/data-types/float.md), [Float64](../../../sql-reference/data-types/float.md), [UUID](../../../sql-reference/data-types/uuid.md), [Decimal32](../../../sql-reference/data-types/decimal.md), [Decimal64](../../../sql-reference/data-types/decimal.md), [Decimal128](../../../sql-reference/data-types/decimal.md), [Decimal256](../../../sql-reference/data-types/decimal.md), [String](../../../sql-reference/data-types/string.md).<br/>ClickHouse tries to cast the value from the dictionary to the specified data type. For example, for MySQL the field in the source table might be `TEXT`, `VARCHAR`, or `BLOB`, but it can be uploaded as `String`.<br/>[Nullable](../../../sql-reference/data-types/nullable.md) is currently supported for [Flat](external-dicts-dict-layout.md#flat), [Hashed](external-dicts-dict-layout.md#dicts-external_dicts_dict_layout-hashed), [ComplexKeyHashed](external-dicts-dict-layout.md#complex-key-hashed), [Direct](external-dicts-dict-layout.md#direct), [ComplexKeyDirect](external-dicts-dict-layout.md#complex-key-direct), [RangeHashed](external-dicts-dict-layout.md#range-hashed), [Polygon](external-dicts-dict-polygon.md), [Cache](external-dicts-dict-layout.md#cache), [ComplexKeyCache](external-dicts-dict-layout.md#complex-key-cache), [SSDCache](external-dicts-dict-layout.md#ssd-cache), [SSDComplexKeyCache](external-dicts-dict-layout.md#complex-key-ssd-cache) dictionaries. `Nullable` types are not supported for [IPTrie](external-dicts-dict-layout.md#ip-trie) dictionaries. | Yes |
| `type` | ClickHouse data type: [UInt8](../../../sql-reference/data-types/int-uint.md), [UInt16](../../../sql-reference/data-types/int-uint.md), [UInt32](../../../sql-reference/data-types/int-uint.md), [UInt64](../../../sql-reference/data-types/int-uint.md), [Int8](../../../sql-reference/data-types/int-uint.md), [Int16](../../../sql-reference/data-types/int-uint.md), [Int32](../../../sql-reference/data-types/int-uint.md), [Int64](../../../sql-reference/data-types/int-uint.md), [Float32](../../../sql-reference/data-types/float.md), [Float64](../../../sql-reference/data-types/float.md), [UUID](../../../sql-reference/data-types/uuid.md), [Decimal32](../../../sql-reference/data-types/decimal.md), [Decimal64](../../../sql-reference/data-types/decimal.md), [Decimal128](../../../sql-reference/data-types/decimal.md), [Decimal256](../../../sql-reference/data-types/decimal.md), [String](../../../sql-reference/data-types/string.md), [Array](../../../sql-reference/data-types/array.md).<br/>ClickHouse tries to cast the value from the dictionary to the specified data type. For example, for MySQL the field in the source table might be `TEXT`, `VARCHAR`, or `BLOB`, but it can be uploaded as `String`.<br/>[Nullable](../../../sql-reference/data-types/nullable.md) is currently supported for [Flat](external-dicts-dict-layout.md#flat), [Hashed](external-dicts-dict-layout.md#dicts-external_dicts_dict_layout-hashed), [ComplexKeyHashed](external-dicts-dict-layout.md#complex-key-hashed), [Direct](external-dicts-dict-layout.md#direct), [ComplexKeyDirect](external-dicts-dict-layout.md#complex-key-direct), [RangeHashed](external-dicts-dict-layout.md#range-hashed), [Polygon](external-dicts-dict-polygon.md), [Cache](external-dicts-dict-layout.md#cache), [ComplexKeyCache](external-dicts-dict-layout.md#complex-key-cache), [SSDCache](external-dicts-dict-layout.md#ssd-cache), [SSDComplexKeyCache](external-dicts-dict-layout.md#complex-key-ssd-cache) dictionaries. `Nullable` types are not supported for [IPTrie](external-dicts-dict-layout.md#ip-trie) dictionaries. | Yes |
| `null_value` | Default value for a non-existing element.<br/>In the example, it is an empty string. The [NULL](../../syntax.md#null-literal) value can be used only for the `Nullable` types (see the previous line with the type description). | Yes |
| `expression` | An [expression](../../syntax.md#syntax-expressions) that ClickHouse executes on the value.<br/>The expression can be a column name in the remote SQL database. Thus, you can use it to create an alias for the remote column.<br/><br/>Default value: no expression. | No |
| <a name="hierarchical-dict-attr"></a> `hierarchical` | If `true`, the attribute contains the value of a parent key for the current element. See [Hierarchical Dictionaries](external-dicts-dict-hierarchical.md).<br/><br/>Default value: `false`. | No |
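Since the updated row above adds [Array](../../../sql-reference/data-types/array.md) to the supported attribute types, a minimal DDL sketch may help; the dictionary name, source table, and attribute names here are hypothetical.

``` sql
CREATE DICTIONARY tags_dict
(
    id UInt64,
    tags Array(String) DEFAULT []
)
PRIMARY KEY id
SOURCE(CLICKHOUSE(TABLE 'tags_source'))
LAYOUT(FLAT())
LIFETIME(300);
```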
@ -5,15 +5,186 @@ toc_title: "Логические функции"

# Logical functions {#logicheskie-funktsii}

Logical functions accept any numeric types and return a UInt8 number equal to 0 or 1.
Logical functions perform logical operations over any numeric types and return a [UInt8](../../sql-reference/data-types/int-uint.md) number equal to 0 or 1, and in some cases `NULL`.

Zero as an argument is considered "false", and any non-zero value is considered "true".
Zero as an argument is considered `false`, and any non-zero value is considered `true`.

## and, AND operator {#and-operator-and}
## and {#logical-and-function}

## or, OR operator {#or-operator-or}
Calculates the result of the logical conjunction of two or more values. Corresponds to the [logical AND operator](../../sql-reference/operators/index.md#logical-and-operator).

## not, NOT operator {#not-operator-not}
**Syntax**

## xor {#xor}
``` sql
and(val1, val2...)
```

**Arguments**

- `val1, val2, ...` - a list of at least two values. [Int](../../sql-reference/data-types/int-uint.md), [UInt](../../sql-reference/data-types/int-uint.md), [Float](../../sql-reference/data-types/float.md) or [Nullable](../../sql-reference/data-types/nullable.md).

**Returned value**

- `0`, if at least one of the arguments is zero.
- `NULL`, if there are no zeros among the arguments but there is at least one `NULL`.
- `1`, otherwise.

Type: [UInt8](../../sql-reference/data-types/int-uint.md) or [Nullable](../../sql-reference/data-types/nullable.md)([UInt8](../../sql-reference/data-types/int-uint.md)).

**Example**

Query:

``` sql
SELECT and(0, 1, -2);
```

Result:

``` text
┌─and(0, 1, -2)─┐
│             0 │
└───────────────┘
```

With `NULL` values:

``` sql
SELECT and(NULL, 1, 10, -2);
```

Result:

``` text
┌─and(NULL, 1, 10, -2)─┐
│                 ᴺᵁᴸᴸ │
└──────────────────────┘
```

## or {#logical-or-function}

Calculates the result of the logical disjunction of two or more values. Corresponds to the [logical OR operator](../../sql-reference/operators/index.md#logical-or-operator).

**Syntax**

``` sql
or(val1, val2...)
```

**Arguments**

- `val1, val2, ...` - a list of at least two values. [Int](../../sql-reference/data-types/int-uint.md), [UInt](../../sql-reference/data-types/int-uint.md), [Float](../../sql-reference/data-types/float.md) or [Nullable](../../sql-reference/data-types/nullable.md).

**Returned value**

- `1`, if at least one argument is non-zero.
- `0`, if all arguments are zero.
- `NULL`, if there are no non-zero values among the arguments and there is at least one `NULL`.

Type: [UInt8](../../sql-reference/data-types/int-uint.md) or [Nullable](../../sql-reference/data-types/nullable.md)([UInt8](../../sql-reference/data-types/int-uint.md)).

**Example**

Query:

``` sql
SELECT or(1, 0, 0, 2, NULL);
```

Result:

``` text
┌─or(1, 0, 0, 2, NULL)─┐
│                    1 │
└──────────────────────┘
```

With `NULL` values:

``` sql
SELECT or(0, NULL);
```

Result:

``` text
┌─or(0, NULL)─┐
│        ᴺᵁᴸᴸ │
└─────────────┘
```

## not {#logical-not-function}

Calculates the result of the logical negation of a value. Corresponds to the [logical negation operator](../../sql-reference/operators/index.md#logical-negation-operator).

**Syntax**

``` sql
not(val);
```

**Arguments**

- `val` - the value. [Int](../../sql-reference/data-types/int-uint.md), [UInt](../../sql-reference/data-types/int-uint.md), [Float](../../sql-reference/data-types/float.md) or [Nullable](../../sql-reference/data-types/nullable.md).

**Returned value**

- `1`, if `val` is `0`.
- `0`, if `val` is a non-zero number.
- `NULL`, if `val` is `NULL`.

Type: [UInt8](../../sql-reference/data-types/int-uint.md) or [Nullable](../../sql-reference/data-types/nullable.md)([UInt8](../../sql-reference/data-types/int-uint.md)).

**Example**

Query:

``` sql
SELECT NOT(1);
```

Result:

``` text
┌─not(1)─┐
│      0 │
└────────┘
```

## xor {#logical-xor-function}

Calculates the result of the logical exclusive disjunction of two or more values. For more than two input values, the function first computes `XOR` of the first two values, then uses that result with the next value, and so on.

**Syntax**

``` sql
xor(val1, val2...)
```

**Arguments**

- `val1, val2, ...` - a list of at least two values. [Int](../../sql-reference/data-types/int-uint.md), [UInt](../../sql-reference/data-types/int-uint.md), [Float](../../sql-reference/data-types/float.md) or [Nullable](../../sql-reference/data-types/nullable.md).

**Returned value**

- `1`, for two values: if one of the values is zero and the other is not.
- `0`, for two values: if both values are zero or both are non-zero numbers.
- `NULL`, if at least one of the arguments is `NULL`.

Type: [UInt8](../../sql-reference/data-types/int-uint.md) or [Nullable](../../sql-reference/data-types/nullable.md)([UInt8](../../sql-reference/data-types/int-uint.md)).

**Example**

Query:

``` sql
SELECT xor(0, 1, 1);
```

Result:

``` text
┌─xor(0, 1, 1)─┐
│            0 │
└──────────────┘
```
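As a sanity check of the left-fold behavior described for `xor` above, the chained call should agree with explicitly nested calls (a sketch, not from the original page):

``` sql
SELECT xor(0, 1, 1) = xor(xor(0, 1), 1); -- returns 1
```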
@ -211,17 +211,17 @@ SELECT toDateTime('2014-10-26 00:00:00', 'Europe/Moscow') AS time, time + 60 * 6

- The [Interval](../../sql-reference/operators/index.md) data type
- Type conversion functions [toInterval](../../sql-reference/operators/index.md#function-tointerval)

## Logical negation operator {#operator-logicheskogo-otritsaniia}
## Logical AND operator {#logical-and-operator}

`NOT a` - the `not(a)` function
The `SELECT a AND b` syntax computes the logical conjunction of `a` and `b` with the [and](../../sql-reference/functions/logical-functions.md#logical-and-function) function.

## Logical AND operator {#operator-logicheskogo-i}
## Logical OR operator {#logical-or-operator}

`a AND b` - the `and(a, b)` function
The `SELECT a OR b` syntax computes the logical disjunction of `a` and `b` with the [or](../../sql-reference/functions/logical-functions.md#logical-or-function) function.

## Logical OR operator {#operator-logicheskogo-ili}
## Logical negation operator {#logical-negation-operator}

`a OR b` - the `or(a, b)` function
The `SELECT NOT a` syntax computes the logical negation of `a` with the [not](../../sql-reference/functions/logical-functions.md#logical-not-function) function.

## Conditional operator {#uslovnyi-operator}
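A one-line check of the operator-to-function mapping described above (a sketch, not part of the original page):

``` sql
SELECT (1 AND 0) = and(1, 0), (1 OR 0) = or(1, 0), (NOT 1) = not(1);
```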
@ -6,7 +6,7 @@ toc_title: DISTINCT

If `SELECT DISTINCT` is specified, only unique rows will remain in the query result. Thus, only a single row will remain out of every set of fully matching rows.

## Обработк NULL {#null-processing}
## Обработка NULL {#null-processing}

`DISTINCT` works with [NULL](../../syntax.md#null-literal) as if `NULL` were a specific value, and `NULL==NULL`. In other words, in the `DISTINCT` results, different combinations with `NULL` occur only once. This differs from `NULL` processing in most other contexts.
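A minimal illustration of the `NULL==NULL` behavior under `DISTINCT`; the literal values are an assumption:

``` sql
SELECT DISTINCT x
FROM (SELECT arrayJoin([1, NULL, NULL, 1]) AS x);
-- Returns two rows: 1 and NULL.
```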
@ -742,19 +742,11 @@ CPU指令集是我们服务器中支持的最小集合。 目前，它是SSE 4.2

## Libraries {#ku}

**1.** Use the C++20 standard library (experimental features are allowed), as well as the `boost` and `Poco` frameworks.
**1.** The C++20 standard library is used (experimental extensions are allowed), as well as `boost` and `Poco` frameworks.

**2.** If necessary, you can use any well-known library available in the OS packages.
**2.** It is not allowed to use libraries from OS packages. It is also not allowed to use pre-installed libraries. All libraries should be placed in form of source code in `contrib` directory and built with ClickHouse.

If a good solution is already available, use it, even if that means you have to install another library.

(But be prepared to remove bad libraries from the code.)

**3.** You can install a library that is not in the packages if the packages do not have what you need, or have an outdated version or the wrong kind of build.

**4.** If the library is small and does not have its own complex build system, put the source files in the `contrib` folder.

**5.** Preference is always given to libraries that are already in use.
**3.** Preference is always given to libraries that are already in use.

## General recommendations {#yi-ban-jian-yi-1}
@ -81,7 +81,7 @@ SELECT bitmapToArray(bitmapSubsetInRange(bitmapBuild([0,1,2,3,4,5,6,7,8,9,10,11,

**Example**

``` sql
SELECT bitmapToArray(bitmapSubsetInRange(bitmapBuild([0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,100,200,500]), toUInt32(30), toUInt32(200))) AS res
SELECT bitmapToArray(bitmapSubsetLimit(bitmapBuild([0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,100,200,500]), toUInt32(30), toUInt32(200))) AS res
```

┌─res───────────────────────┐
@ -174,7 +174,7 @@ SELECT bitmapToArray(bitmapAnd(bitmapBuild([1,2,3]),bitmapBuild([3,4,5]))) AS re

│ [3] │
└─────┘

## Bitmap {#bitmapor}
## Bitmap OR {#bitmapor}

Performs an OR operation on two bitmap objects and returns a new bitmap object.
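A usage sketch consistent with the corrected heading, mirroring the `bitmapAnd` example above:

``` sql
SELECT bitmapToArray(bitmapOr(bitmapBuild([1,2,3]), bitmapBuild([3,4,5]))) AS res;
-- Expected: [1,2,3,4,5]
```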
@ -430,6 +430,7 @@ private:

{TokenType::ClosingRoundBracket, Replxx::Color::BROWN},
{TokenType::OpeningSquareBracket, Replxx::Color::BROWN},
{TokenType::ClosingSquareBracket, Replxx::Color::BROWN},
{TokenType::DoubleColon, Replxx::Color::BROWN},
{TokenType::OpeningCurlyBrace, Replxx::Color::INTENSE},
{TokenType::ClosingCurlyBrace, Replxx::Color::INTENSE},
@ -388,24 +388,32 @@ void LocalServer::processQueries()

/// Use the same query_id (and thread group) for all queries
CurrentThread::QueryScope query_scope_holder(context);

///Set progress show
/// Set progress show
need_render_progress = config().getBool("progress", false);

std::function<void()> finalize_progress;
if (need_render_progress)
{
/// Set progress callback, which can be run from multiple threads.
context->setProgressCallback([&](const Progress & value)
{
/// Write progress only if progress was updated
if (progress_indication.updateProgress(value))
progress_indication.writeProgress();
});

/// Set finalizing callback for progress, which is called right before finalizing query output.
finalize_progress = [&]()
{
progress_indication.clearProgressOutput();
};

/// Set callback for file processing progress.
progress_indication.setFileProgressCallback(context);
}

bool echo_queries = config().hasOption("echo") || config().hasOption("verbose");

if (need_render_progress)
progress_indication.setFileProgressCallback(context);

std::exception_ptr exception;

for (const auto & query : queries)
@ -425,7 +433,7 @@ void LocalServer::processQueries()

try
{
executeQuery(read_buf, write_buf, /* allow_into_outfile = */ true, context, {});
executeQuery(read_buf, write_buf, /* allow_into_outfile = */ true, context, {}, finalize_progress);
}
catch (...)
{
@ -1159,7 +1159,7 @@ int Server::main(const std::vector<std::string> & /*args*/)

{
/// This object will periodically calculate some metrics.
AsynchronousMetrics async_metrics(
global_context, config().getUInt("asynchronous_metrics_update_period_s", 60), servers_to_start_before_tables, servers);
global_context, config().getUInt("asynchronous_metrics_update_period_s", 1), servers_to_start_before_tables, servers);
attachSystemTablesAsync(*DatabaseCatalog::instance().getSystemDatabase(), async_metrics);

for (const auto & listen_host : listen_hosts)
@ -583,7 +583,7 @@

<port>9019</port>
</jdbc_bridge>
-->

<!-- Configuration of clusters that could be used in Distributed tables.
https://clickhouse.tech/docs/en/operations/table_engines/distributed/
-->
@ -917,7 +917,7 @@

Asynchronous metrics are updated once a minute, so there is
no need to flush more often.
-->
<flush_interval_milliseconds>60000</flush_interval_milliseconds>
<flush_interval_milliseconds>7000</flush_interval_milliseconds>
</asynchronous_metric_log>

<!--
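With the shorter flush interval above, entries in the metric log should lag real time by at most a few seconds; a query sketch for checking this (assumes the `asynchronous_metric_log` table is enabled in the server config):

``` sql
SELECT metric, value, event_time
FROM system.asynchronous_metric_log
ORDER BY event_time DESC
LIMIT 5;
```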
@ -283,6 +283,29 @@

color: var(--link-color);
text-decoration: none;
}

/* This is for graph in svg */
text
{
font-size: 14px;
fill: var(--text-color);
}

.node rect
{
fill: var(--element-background-color);
filter: drop-shadow(.2rem .2rem .2rem var(--shadow-color));
}

.edgePath path
{
stroke: var(--text-color);
}

marker
{
fill: var(--text-color);
}
</style>
</head>

@ -305,6 +328,7 @@

<table class="monospace shadow" id="data-table"></table>
<pre class="monospace shadow" id="data-unparsed"></pre>
</div>
<svg id="graph" fill="none"></svg>
<p id="error" class="monospace shadow">
</p>
</body>
@ -447,6 +471,12 @@

table.removeChild(table.lastChild);
}

let graph = document.getElementById('graph');
while (graph.firstChild) {
graph.removeChild(graph.lastChild);
}
graph.style.display = 'none';

document.getElementById('data-unparsed').innerText = '';
document.getElementById('data-unparsed').style.display = 'none';
@ -461,12 +491,21 @@

function renderResult(response)
{
//console.log(response);
clear();

let stats = document.getElementById('stats');
stats.innerText = 'Elapsed: ' + response.statistics.elapsed.toFixed(3) + " sec, read " + response.statistics.rows_read + " rows.";

/// We can also render graphs if user performed EXPLAIN PIPELINE graph=1.
if (response.data.length > 3 && response.data[0][0] === "digraph" && document.getElementById('query').value.match(/^\s*EXPLAIN/i)) {
renderGraph(response);
} else {
renderTable(response);
}
}

function renderTable(response)
{
let thead = document.createElement('thead');
for (let idx in response.meta) {
let th = document.createElement('th');
@ -559,6 +598,51 @@

document.getElementById('error').style.display = 'block';
}

/// Huge JS libraries should be loaded only if needed.
function loadJS(src) {
return new Promise((resolve, reject) => {
const script = document.createElement('script');
script.src = src;
script.addEventListener('load', function() { resolve(true); });
document.head.appendChild(script);
});
}

let load_dagre_promise;
function loadDagre() {
if (load_dagre_promise) { return load_dagre_promise; }

load_dagre_promise = Promise.all([
loadJS('https://dagrejs.github.io/project/dagre/v0.8.5/dagre.min.js'),
loadJS('https://dagrejs.github.io/project/graphlib-dot/v0.6.4/graphlib-dot.min.js'),
loadJS('https://dagrejs.github.io/project/dagre-d3/v0.6.4/dagre-d3.min.js'),
loadJS('https://cdn.jsdelivr.net/npm/d3@7.0.0'),
]);

return load_dagre_promise;
}

async function renderGraph(response)
{
await loadDagre();

/// https://github.com/dagrejs/dagre-d3/issues/131
const dot = response.data.reduce((acc, row) => acc + '\n' + row[0].replace(/shape\s*=\s*box/g, 'shape=rect'));

let graph = graphlibDot.read(dot);
graph.graph().rankdir = 'TB';

let render = new dagreD3.render();

let svg = document.getElementById('graph');
svg.style.display = 'block';

render(d3.select("#graph"), graph);

svg.style.width = graph.graph().width;
svg.style.height = graph.graph().height;
}

function setColorTheme(theme)
{
window.localStorage.setItem('theme', theme);
@ -185,8 +185,8 @@ public:

auto * denominator_type = toNativeType<Denominator>(b);
static constexpr size_t denominator_offset = offsetof(Fraction, denominator);
auto * denominator_dst_ptr = b.CreatePointerCast(b.CreateConstGEP1_32(nullptr, aggregate_data_dst_ptr, denominator_offset), denominator_type->getPointerTo());
auto * denominator_src_ptr = b.CreatePointerCast(b.CreateConstGEP1_32(nullptr, aggregate_data_src_ptr, denominator_offset), denominator_type->getPointerTo());
auto * denominator_dst_ptr = b.CreatePointerCast(b.CreateConstInBoundsGEP1_64(nullptr, aggregate_data_dst_ptr, denominator_offset), denominator_type->getPointerTo());
auto * denominator_src_ptr = b.CreatePointerCast(b.CreateConstInBoundsGEP1_64(nullptr, aggregate_data_src_ptr, denominator_offset), denominator_type->getPointerTo());

auto * denominator_dst_value = b.CreateLoad(denominator_type, denominator_dst_ptr);
auto * denominator_src_value = b.CreateLoad(denominator_type, denominator_src_ptr);
@ -74,7 +74,7 @@ public:

auto * denominator_type = toNativeType<Denominator>(b);

static constexpr size_t denominator_offset = offsetof(Fraction, denominator);
auto * denominator_offset_ptr = b.CreateConstGEP1_32(nullptr, aggregate_data_ptr, denominator_offset);
auto * denominator_offset_ptr = b.CreateConstInBoundsGEP1_64(nullptr, aggregate_data_ptr, denominator_offset);
auto * denominator_ptr = b.CreatePointerCast(denominator_offset_ptr, denominator_type->getPointerTo());

auto * weight_cast_to_denominator = nativeCast(b, arguments_types[1], argument_values[1], denominator_type);
@ -139,7 +139,7 @@ public:

if constexpr (result_is_nullable)
b.CreateStore(llvm::ConstantInt::get(b.getInt8Ty(), 1), aggregate_data_ptr);

auto * aggregate_data_ptr_with_prefix_size_offset = b.CreateConstGEP1_32(nullptr, aggregate_data_ptr, this->prefix_size);
auto * aggregate_data_ptr_with_prefix_size_offset = b.CreateConstInBoundsGEP1_64(nullptr, aggregate_data_ptr, this->prefix_size);
this->nested_function->compileAdd(b, aggregate_data_ptr_with_prefix_size_offset, { removeNullable(nullable_type) }, { wrapped_value });
b.CreateBr(join_block);

@ -290,7 +290,7 @@ public:

if constexpr (result_is_nullable)
b.CreateStore(llvm::ConstantInt::get(b.getInt8Ty(), 1), aggregate_data_ptr);

auto * aggregate_data_ptr_with_prefix_size_offset = b.CreateConstGEP1_32(nullptr, aggregate_data_ptr, this->prefix_size);
auto * aggregate_data_ptr_with_prefix_size_offset = b.CreateConstInBoundsGEP1_64(nullptr, aggregate_data_ptr, this->prefix_size);
this->nested_function->compileAdd(b, aggregate_data_ptr_with_prefix_size_offset, non_nullable_types, wrapped_values);
b.CreateBr(join_block);

@ -199,7 +199,7 @@ public:

static constexpr size_t value_offset_from_structure = offsetof(SingleValueDataFixed<T>, value);

auto * type = toNativeType<T>(builder);
auto * value_ptr_with_offset = b.CreateConstGEP1_32(nullptr, aggregate_data_ptr, value_offset_from_structure);
auto * value_ptr_with_offset = b.CreateConstInBoundsGEP1_64(nullptr, aggregate_data_ptr, value_offset_from_structure);
auto * value_ptr = b.CreatePointerCast(value_ptr_with_offset, type->getPointerTo());

return value_ptr;
@ -207,7 +207,7 @@ public:

if constexpr (result_is_nullable)
b.CreateMemSet(aggregate_data_ptr, llvm::ConstantInt::get(b.getInt8Ty(), 0), this->prefix_size, llvm::assumeAligned(this->alignOfData()));

auto * aggregate_data_ptr_with_prefix_size_offset = b.CreateConstGEP1_32(nullptr, aggregate_data_ptr, this->prefix_size);
auto * aggregate_data_ptr_with_prefix_size_offset = b.CreateConstInBoundsGEP1_64(nullptr, aggregate_data_ptr, this->prefix_size);
this->nested_function->compileCreate(b, aggregate_data_ptr_with_prefix_size_offset);
}

@ -225,8 +225,8 @@ public:

b.CreateStore(is_null_result_value, aggregate_data_dst_ptr);
}

auto * aggregate_data_dst_ptr_with_prefix_size_offset = b.CreateConstGEP1_32(nullptr, aggregate_data_dst_ptr, this->prefix_size);
auto * aggregate_data_src_ptr_with_prefix_size_offset = b.CreateConstGEP1_32(nullptr, aggregate_data_src_ptr, this->prefix_size);
auto * aggregate_data_dst_ptr_with_prefix_size_offset = b.CreateConstInBoundsGEP1_64(nullptr, aggregate_data_dst_ptr, this->prefix_size);
auto * aggregate_data_src_ptr_with_prefix_size_offset = b.CreateConstInBoundsGEP1_64(nullptr, aggregate_data_src_ptr, this->prefix_size);

this->nested_function->compileMerge(b, aggregate_data_dst_ptr_with_prefix_size_offset, aggregate_data_src_ptr_with_prefix_size_offset);
}
@ -260,7 +260,7 @@ public:

b.CreateBr(join_block);

b.SetInsertPoint(if_not_null);
auto * aggregate_data_ptr_with_prefix_size_offset = b.CreateConstGEP1_32(nullptr, aggregate_data_ptr, this->prefix_size);
auto * aggregate_data_ptr_with_prefix_size_offset = b.CreateConstInBoundsGEP1_64(nullptr, aggregate_data_ptr, this->prefix_size);
auto * nested_result = this->nested_function->compileGetResult(builder, aggregate_data_ptr_with_prefix_size_offset);
b.CreateStore(b.CreateInsertValue(nullable_value, nested_result, {0}), nullable_value_ptr);
b.CreateBr(join_block);
@ -351,7 +351,7 @@ public:

if constexpr (result_is_nullable)
b.CreateStore(llvm::ConstantInt::get(b.getInt8Ty(), 1), aggregate_data_ptr);

auto * aggregate_data_ptr_with_prefix_size_offset = b.CreateConstGEP1_32(nullptr, aggregate_data_ptr, this->prefix_size);
auto * aggregate_data_ptr_with_prefix_size_offset = b.CreateConstInBoundsGEP1_64(nullptr, aggregate_data_ptr, this->prefix_size);
this->nested_function->compileAdd(b, aggregate_data_ptr_with_prefix_size_offset, { removeNullable(nullable_type) }, { wrapped_value });
b.CreateBr(join_block);

@ -479,7 +479,7 @@ public:

if constexpr (result_is_nullable)
b.CreateStore(llvm::ConstantInt::get(b.getInt8Ty(), 1), aggregate_data_ptr);

auto * aggregate_data_ptr_with_prefix_size_offset = b.CreateConstGEP1_32(nullptr, aggregate_data_ptr, this->prefix_size);
auto * aggregate_data_ptr_with_prefix_size_offset = b.CreateConstInBoundsGEP1_64(nullptr, aggregate_data_ptr, this->prefix_size);
this->nested_function->compileAdd(b, aggregate_data_ptr_with_prefix_size_offset, arguments_types, wrapped_values);
b.CreateBr(join_block);

@ -488,7 +488,7 @@ public:

else
{
b.CreateStore(llvm::ConstantInt::get(b.getInt8Ty(), 1), aggregate_data_ptr);
auto * aggregate_data_ptr_with_prefix_size_offset = b.CreateConstGEP1_32(nullptr, aggregate_data_ptr, this->prefix_size);
auto * aggregate_data_ptr_with_prefix_size_offset = b.CreateConstInBoundsGEP1_64(nullptr, aggregate_data_ptr, this->prefix_size);
this->nested_function->compileAdd(b, aggregate_data_ptr_with_prefix_size_offset, non_nullable_types, wrapped_values);
}
}
@ -298,11 +298,19 @@ void ConfigProcessor::doIncludesRecursive(

{
const auto * subst = attributes->getNamedItem(attr_name);
attr_nodes[attr_name] = subst;
substs_count += static_cast<size_t>(subst == nullptr);
substs_count += static_cast<size_t>(subst != nullptr);
}

if (substs_count < SUBSTITUTION_ATTRS.size() - 1) /// only one substitution is allowed
throw Poco::Exception("several substitutions attributes set for element <" + node->nodeName() + ">");
if (substs_count > 1) /// only one substitution is allowed
throw Poco::Exception("More than one substitution attribute is set for element <" + node->nodeName() + ">");

if (node->nodeName() == "include")
{
if (node->hasChildNodes())
throw Poco::Exception("<include> element must have no children");
if (substs_count == 0)
throw Poco::Exception("No substitution attributes set for element <include>, must have exactly one");
}

/// Replace the original contents, not add to it.
bool replace = attributes->getNamedItem("replace");
@ -320,37 +328,57 @@ void ConfigProcessor::doIncludesRecursive(

else if (throw_on_bad_incl)
throw Poco::Exception(error_msg + name);
else
{
if (node->nodeName() == "include")
node->parentNode()->removeChild(node);

LOG_WARNING(log, "{}{}", error_msg, name);
}
}
else
{
Element & element = dynamic_cast<Element &>(*node);

for (const auto & attr_name : SUBSTITUTION_ATTRS)
element.removeAttribute(attr_name);

if (replace)
/// Replace the whole node not just contents.
if (node->nodeName() == "include")
{
while (Node * child = node->firstChild())
node->removeChild(child);
const NodeListPtr children = node_to_include->childNodes();
for (size_t i = 0, size = children->length(); i < size; ++i)
{
NodePtr new_node = config->importNode(children->item(i), true);
node->parentNode()->insertBefore(new_node, node);
}

element.removeAttribute("replace");
node->parentNode()->removeChild(node);
}

const NodeListPtr children = node_to_include->childNodes();
for (size_t i = 0, size = children->length(); i < size; ++i)
else
{
NodePtr new_node = config->importNode(children->item(i), true);
node->appendChild(new_node);
}
Element & element = dynamic_cast<Element &>(*node);

const NamedNodeMapPtr from_attrs = node_to_include->attributes();
for (size_t i = 0, size = from_attrs->length(); i < size; ++i)
{
element.setAttributeNode(dynamic_cast<Attr *>(config->importNode(from_attrs->item(i), true)));
}
for (const auto & attr_name : SUBSTITUTION_ATTRS)
element.removeAttribute(attr_name);

included_something = true;
if (replace)
{
while (Node * child = node->firstChild())
node->removeChild(child);

element.removeAttribute("replace");
}

const NodeListPtr children = node_to_include->childNodes();
for (size_t i = 0, size = children->length(); i < size; ++i)
{
NodePtr new_node = config->importNode(children->item(i), true);
node->appendChild(new_node);
}

const NamedNodeMapPtr from_attrs = node_to_include->attributes();
for (size_t i = 0, size = from_attrs->length(); i < size; ++i)
{
element.setAttributeNode(dynamic_cast<Attr *>(config->importNode(from_attrs->item(i), true)));
}

included_something = true;
}
}
};
@ -10,16 +10,10 @@ namespace fs = std::filesystem;

namespace DB
{

/// Checks if file exists without throwing an exception but with message in console.
bool safeFsExists(const auto & path)
bool safeFsExists(const String & path)
{
std::error_code ec;
bool res = fs::exists(path, ec);
if (ec)
{
std::cerr << "Can't check '" << path << "': [" << ec.value() << "] " << ec.message() << std::endl;
}
return res;
return fs::exists(path, ec);
};

bool configReadClient(Poco::Util::LayeredConfiguration & config, const std::string & home_path)
@ -30,6 +30,8 @@

M(OpenFileForWrite, "Number of files open for writing") \
M(Read, "Number of read (read, pread, io_getevents, etc.) syscalls in fly") \
M(Write, "Number of write (write, pwrite, io_getevents, etc.) syscalls in fly") \
M(NetworkReceive, "Number of threads receiving data from network. Only ClickHouse-related network interaction is included, not by 3rd party libraries.") \
M(NetworkSend, "Number of threads sending data to network. Only ClickHouse-related network interaction is included, not by 3rd party libraries.") \
M(SendScalars, "Number of connections that are sending data for scalars to remote servers.") \
M(SendExternalTables, "Number of connections that are sending data for external tables to remote servers. External tables are used to implement GLOBAL IN and GLOBAL JOIN operators with distributed subqueries.") \
M(QueryThread, "Number of query processing threads") \
@ -557,6 +557,7 @@

M(587, CONCURRENT_ACCESS_NOT_SUPPORTED) \
M(588, DISTRIBUTED_BROKEN_BATCH_INFO) \
M(589, DISTRIBUTED_BROKEN_BATCH_FILES) \
M(590, CANNOT_SYSCONF) \
\
M(998, POSTGRESQL_CONNECTION_FAILURE) \
M(999, KEEPER_EXCEPTION) \
@ -117,4 +117,16 @@ public:

}
};

class FieldVisitorAccurateLessOrEqual : public StaticVisitor<bool>
{
public:
template <typename T, typename U>
bool operator()(const T & l, const U & r) const
{
auto less_cmp = FieldVisitorAccurateLess();
return !less_cmp(r, l);
}
};

}
@ -237,7 +237,12 @@ public:

// 1. Always memcpy 8 times bytes
// 2. Use switch case extension to generate fast dispatching table
// 3. Funcs are named callables that can be force_inlined
//
// NOTE: It relies on Little Endianness
//
// NOTE: It requires padded to 8 bytes keys (IOW you cannot pass
// std::string here, but you can pass i.e. ColumnString::getDataAt()),
// since it copies 8 bytes at a time.
template <typename Self, typename KeyHolder, typename Func>
static auto ALWAYS_INLINE dispatch(Self & self, KeyHolder && key_holder, Func && func)
{
@ -47,8 +47,10 @@

M(CreatedReadBufferMMapFailed, "") \
M(DiskReadElapsedMicroseconds, "Total time spent waiting for read syscall. This include reads from page cache.") \
M(DiskWriteElapsedMicroseconds, "Total time spent waiting for write syscall. This include writes to page cache.") \
M(NetworkReceiveElapsedMicroseconds, "") \
M(NetworkSendElapsedMicroseconds, "") \
M(NetworkReceiveElapsedMicroseconds, "Total time spent waiting for data to receive or receiving data from network. Only ClickHouse-related network interaction is included, not by 3rd party libraries.") \
M(NetworkSendElapsedMicroseconds, "Total time spent waiting for data to send to network or sending data to network. Only ClickHouse-related network interaction is included, not by 3rd party libraries.") \
M(NetworkReceiveBytes, "Total number of bytes received from network. Only ClickHouse-related network interaction is included, not by 3rd party libraries.") \
M(NetworkSendBytes, "Total number of bytes send to network. Only ClickHouse-related network interaction is included, not by 3rd party libraries.") \
M(ThrottlerSleepMicroseconds, "Total time a query was sleeping to conform the 'max_network_bandwidth' setting.") \
\
M(QueryMaskingRulesMatch, "Number of times query masking rules was successfully matched.") \
@ -4,9 +4,6 @@

#include <Common/UnicodeBar.h>
#include <Databases/DatabaseMemory.h>

/// FIXME: progress bar in clickhouse-local needs to be cleared after query execution
/// - same as it is now in clickhouse-client. Also there is no writeFinalProgress call
/// in clickhouse-local.

namespace DB
{
@ -566,7 +566,6 @@ void ZooKeeper::sendThread()

if (info.watch)
{
info.request->has_watch = true;
CurrentMetrics::add(CurrentMetrics::ZooKeeperWatch);
}

if (expired)
@ -773,6 +772,8 @@ void ZooKeeper::receiveEvent()

if (add_watch)
{
CurrentMetrics::add(CurrentMetrics::ZooKeeperWatch);

/// The key of watches should exclude the root_path
String req_path = request_info.request->getPath();
removeRootPath(req_path, root_path);
@ -852,7 +853,8 @@ void ZooKeeper::finalize(bool error_send, bool error_receive)

}

/// Send thread will exit after sending close request or on expired flag
send_thread.join();
if (send_thread.joinable())
send_thread.join();
}

/// Set expired flag after we sent close event
@ -869,7 +871,7 @@ void ZooKeeper::finalize(bool error_send, bool error_receive)

tryLogCurrentException(__PRETTY_FUNCTION__);
}

if (!error_receive)
if (!error_receive && receive_thread.joinable())
receive_thread.join();

{
@ -905,6 +907,7 @@ void ZooKeeper::finalize(bool error_send, bool error_receive)

{
std::lock_guard lock(watches_mutex);

Int64 watch_callback_count = 0;
for (auto & path_watches : watches)
{
WatchResponse response;
@ -914,6 +917,7 @@ void ZooKeeper::finalize(bool error_send, bool error_receive)

for (auto & callback : path_watches.second)
{
watch_callback_count += 1;
if (callback)
{
try
@ -928,7 +932,7 @@ void ZooKeeper::finalize(bool error_send, bool error_receive)

}
}

CurrentMetrics::sub(CurrentMetrics::ZooKeeperWatch, watches.size());
CurrentMetrics::sub(CurrentMetrics::ZooKeeperWatch, watch_callback_count);
watches.clear();
}
@ -56,3 +56,37 @@ const char * const hex_char_to_digit_table =

"\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff"
"\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff"
"\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff";

const char * const bin_byte_to_char_table =
"0000000000000001000000100000001100000100000001010000011000000111"
"0000100000001001000010100000101100001100000011010000111000001111"
"0001000000010001000100100001001100010100000101010001011000010111"
"0001100000011001000110100001101100011100000111010001111000011111"
"0010000000100001001000100010001100100100001001010010011000100111"
"0010100000101001001010100010101100101100001011010010111000101111"
"0011000000110001001100100011001100110100001101010011011000110111"
"0011100000111001001110100011101100111100001111010011111000111111"
"0100000001000001010000100100001101000100010001010100011001000111"
"0100100001001001010010100100101101001100010011010100111001001111"
"0101000001010001010100100101001101010100010101010101011001010111"
"0101100001011001010110100101101101011100010111010101111001011111"
"0110000001100001011000100110001101100100011001010110011001100111"
"0110100001101001011010100110101101101100011011010110111001101111"
"0111000001110001011100100111001101110100011101010111011001110111"
"0111100001111001011110100111101101111100011111010111111001111111"
"1000000010000001100000101000001110000100100001011000011010000111"
"1000100010001001100010101000101110001100100011011000111010001111"
"1001000010010001100100101001001110010100100101011001011010010111"
"1001100010011001100110101001101110011100100111011001111010011111"
"1010000010100001101000101010001110100100101001011010011010100111"
"1010100010101001101010101010101110101100101011011010111010101111"
"1011000010110001101100101011001110110100101101011011011010110111"
"1011100010111001101110101011101110111100101111011011111010111111"
"1100000011000001110000101100001111000100110001011100011011000111"
"1100100011001001110010101100101111001100110011011100111011001111"
"1101000011010001110100101101001111010100110101011101011011010111"
"1101100011011001110110101101101111011100110111011101111011011111"
"1110000011100001111000101110001111100100111001011110011011100111"
"1110100011101001111010101110101111101100111011011110111011101111"
"1111000011110001111100101111001111110100111101011111011011110111"
"1111100011111001111110101111101111111100111111011111111011111111";
@ -39,6 +39,12 @@ inline void writeHexByteLowercase(UInt8 byte, void * out)

memcpy(out, &hex_byte_to_char_lowercase_table[static_cast<size_t>(byte) * 2], 2);
}

extern const char * const bin_byte_to_char_table;

inline void writeBinByte(UInt8 byte, void * out)
{
memcpy(out, &bin_byte_to_char_table[static_cast<size_t>(byte) * 8], 8);
}

/// Produces hex representation of an unsigned int with leading zeros (for checksums)
template <typename TUInt>
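The new lookup table maps each byte to its 8-character binary representation. Assuming it backs a `bin`-style formatting function that was being added to ClickHouse around this time (an assumption, not confirmed by this hunk), usage would look like:

``` sql
-- Hypothetical: assumes a bin() function backed by bin_byte_to_char_table.
SELECT bin(toUInt8(5)); -- '00000101'
```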
@ -108,7 +108,7 @@ class IColumn;

M(Bool, compile_expressions, true, "Compile some scalar functions and operators to native code.", 0) \
M(UInt64, min_count_to_compile_expression, 3, "The number of identical expressions before they are JIT-compiled", 0) \
M(Bool, compile_aggregate_expressions, true, "Compile aggregate functions to native code.", 0) \
M(UInt64, min_count_to_compile_aggregate_expression, 0, "The number of identical aggreagte expressions before they are JIT-compiled", 0) \
M(UInt64, min_count_to_compile_aggregate_expression, 3, "The number of identical aggregate expressions before they are JIT-compiled", 0) \
M(UInt64, group_by_two_level_threshold, 100000, "From what number of keys, a two-level aggregation starts. 0 - the threshold is not set.", 0) \
M(UInt64, group_by_two_level_threshold_bytes, 50000000, "From what size of the aggregation state in bytes, a two-level aggregation begins to be used. 0 - the threshold is not set. Two-level aggregation is used when at least one of the thresholds is triggered.", 0) \
M(Bool, distributed_aggregation_memory_efficient, true, "Is the memory-saving mode of distributed aggregation enabled.", 0) \
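To exercise the JIT threshold above, an aggregate expression has to be seen at least `min_count_to_compile_aggregate_expression` times; the query below is a sketch using the setting names from the hunk:

``` sql
SET compile_aggregate_expressions = 1;
SET min_count_to_compile_aggregate_expression = 3;

-- After several runs the sum/avg aggregate states may be compiled to native code.
SELECT sum(number), avg(number) FROM numbers(1000000);
```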
@ -37,88 +37,115 @@ TTLAggregationAlgorithm::TTLAggregationAlgorithm(

settings.compile_aggregate_expressions, settings.min_count_to_compile_aggregate_expression);

aggregator = std::make_unique<Aggregator>(params);

if (isMaxTTLExpired())
new_ttl_info.finished = true;
}

void TTLAggregationAlgorithm::execute(Block & block)
{
if (!block)
{
if (!aggregation_result.empty())
{
MutableColumns result_columns = header.cloneEmptyColumns();
finalizeAggregates(result_columns);
block = header.cloneWithColumns(std::move(result_columns));
}

return;
}

const auto & column_names = header.getNames();
bool some_rows_were_aggregated = false;
MutableColumns result_columns = header.cloneEmptyColumns();
MutableColumns aggregate_columns = header.cloneEmptyColumns();

auto ttl_column = executeExpressionAndGetColumn(description.expression, block, description.result_column);
auto where_column = executeExpressionAndGetColumn(description.where_expression, block, description.where_result_column);

size_t rows_aggregated = 0;
size_t current_key_start = 0;
size_t rows_with_current_key = 0;

for (size_t i = 0; i < block.rows(); ++i)
if (!block) /// Empty block -- no more data, but we may still have some accumulated rows
{
UInt32 cur_ttl = getTimestampByIndex(ttl_column.get(), i);
bool where_filter_passed = !where_column || where_column->getBool(i);
bool ttl_expired = isTTLExpired(cur_ttl) && where_filter_passed;

bool same_as_current = true;
for (size_t j = 0; j < description.group_by_keys.size(); ++j)
if (!aggregation_result.empty()) /// Still have some aggregated data, let's update TTL
{
const String & key_column = description.group_by_keys[j];
const IColumn * values_column = block.getByName(key_column).column.get();
if (!same_as_current || (*values_column)[i] != current_key_value[j])
{
values_column->get(i, current_key_value[j]);
same_as_current = false;
}
}

if (!same_as_current)
{
if (rows_with_current_key)
calculateAggregates(aggregate_columns, current_key_start, rows_with_current_key);
finalizeAggregates(result_columns);

current_key_start = rows_aggregated;
rows_with_current_key = 0;
some_rows_were_aggregated = true;
}

if (ttl_expired)
else /// No block, all aggregated, just finish
{
++rows_with_current_key;
++rows_aggregated;
for (const auto & name : column_names)
{
const IColumn * values_column = block.getByName(name).column.get();
auto & column = aggregate_columns[header.getPositionByName(name)];
column->insertFrom(*values_column, i);
}
}
else
{
new_ttl_info.update(cur_ttl);
for (const auto & name : column_names)
{
const IColumn * values_column = block.getByName(name).column.get();
auto & column = result_columns[header.getPositionByName(name)];
column->insertFrom(*values_column, i);
}
return;
}
}
else
{
const auto & column_names = header.getNames();
MutableColumns aggregate_columns = header.cloneEmptyColumns();

if (rows_with_current_key)
calculateAggregates(aggregate_columns, current_key_start, rows_with_current_key);
auto ttl_column = executeExpressionAndGetColumn(description.expression, block, description.result_column);
auto where_column = executeExpressionAndGetColumn(description.where_expression, block, description.where_result_column);

size_t rows_aggregated = 0;
size_t current_key_start = 0;
size_t rows_with_current_key = 0;

for (size_t i = 0; i < block.rows(); ++i)
{
UInt32 cur_ttl = getTimestampByIndex(ttl_column.get(), i);
bool where_filter_passed = !where_column || where_column->getBool(i);
bool ttl_expired = isTTLExpired(cur_ttl) && where_filter_passed;

bool same_as_current = true;
for (size_t j = 0; j < description.group_by_keys.size(); ++j)
{
const String & key_column = description.group_by_keys[j];
const IColumn * values_column = block.getByName(key_column).column.get();
if (!same_as_current || (*values_column)[i] != current_key_value[j])
{
values_column->get(i, current_key_value[j]);
same_as_current = false;
}
}

if (!same_as_current)
{
if (rows_with_current_key)
{
some_rows_were_aggregated = true;
calculateAggregates(aggregate_columns, current_key_start, rows_with_current_key);
}
finalizeAggregates(result_columns);

current_key_start = rows_aggregated;
rows_with_current_key = 0;
}

if (ttl_expired)
{
++rows_with_current_key;
++rows_aggregated;
for (const auto & name : column_names)
{
const IColumn * values_column = block.getByName(name).column.get();
auto & column = aggregate_columns[header.getPositionByName(name)];
column->insertFrom(*values_column, i);
}
}
else
{
for (const auto & name : column_names)
{
const IColumn * values_column = block.getByName(name).column.get();
auto & column = result_columns[header.getPositionByName(name)];
column->insertFrom(*values_column, i);
}
}
}

if (rows_with_current_key)
{
some_rows_were_aggregated = true;
calculateAggregates(aggregate_columns, current_key_start, rows_with_current_key);
}
}

block = header.cloneWithColumns(std::move(result_columns));

/// If some rows were aggregated we have to recalculate ttl info's
if (some_rows_were_aggregated)
{
auto ttl_column_after_aggregation = executeExpressionAndGetColumn(description.expression, block, description.result_column);
auto where_column_after_aggregation = executeExpressionAndGetColumn(description.where_expression, block, description.where_result_column);
for (size_t i = 0; i < block.rows(); ++i)
{
bool where_filter_passed = !where_column_after_aggregation || where_column_after_aggregation->getBool(i);
if (where_filter_passed)
new_ttl_info.update(getTimestampByIndex(ttl_column_after_aggregation.get(), i));
}
}
}

void TTLAggregationAlgorithm::calculateAggregates(const MutableColumns & aggregate_columns, size_t start_pos, size_t length)
@ -134,6 +161,7 @@ void TTLAggregationAlgorithm::calculateAggregates(const MutableColumns & aggrega

aggregator->executeOnBlock(aggregate_chunk, length, aggregation_result, key_columns,
columns_for_aggregator, no_more_keys);

}

void TTLAggregationAlgorithm::finalizeAggregates(MutableColumns & result_columns)
@ -141,6 +169,7 @@ void TTLAggregationAlgorithm::finalizeAggregates(MutableColumns & result_columns

if (!aggregation_result.empty())
{
auto aggregated_res = aggregator->convertToBlocks(aggregation_result, true, 1);

for (auto & agg_block : aggregated_res)
{
for (const auto & it : description.set_parts)
@ -21,6 +21,9 @@ TTLColumnAlgorithm::TTLColumnAlgorithm(

new_ttl_info = old_ttl_info;
is_fully_empty = false;
}

if (isMaxTTLExpired())
new_ttl_info.finished = true;
}

void TTLColumnAlgorithm::execute(Block & block)
@ -9,6 +9,9 @@ TTLDeleteAlgorithm::TTLDeleteAlgorithm(

{
if (!isMinTTLExpired())
new_ttl_info = old_ttl_info;

if (isMaxTTLExpired())
new_ttl_info.finished = true;
}

void TTLDeleteAlgorithm::execute(Block & block)
@ -1,11 +1,13 @@

#include <Columns/ColumnArray.h>
#include <Columns/ColumnConst.h>
#include <Columns/ColumnTuple.h>
#include <Columns/ColumnMap.h>
#include <Columns/ColumnLowCardinality.h>

#include <DataTypes/DataTypeLowCardinality.h>
#include <DataTypes/DataTypeArray.h>
#include <DataTypes/DataTypeTuple.h>
#include <DataTypes/DataTypeMap.h>

#include <Common/assert_cast.h>

@ -39,6 +41,11 @@ DataTypePtr recursiveRemoveLowCardinality(const DataTypePtr & type)

return std::make_shared<DataTypeTuple>(elements);
}

if (const auto * map_type = typeid_cast<const DataTypeMap *>(type.get()))
{
return std::make_shared<DataTypeMap>(recursiveRemoveLowCardinality(map_type->getKeyType()), recursiveRemoveLowCardinality(map_type->getValueType()));
}

if (const auto * low_cardinality_type = typeid_cast<const DataTypeLowCardinality *>(type.get()))
return low_cardinality_type->getDictionaryType();

@ -78,6 +85,16 @@ ColumnPtr recursiveRemoveLowCardinality(const ColumnPtr & column)

return ColumnTuple::create(columns);
}

if (const auto * column_map = typeid_cast<const ColumnMap *>(column.get()))
{
const auto & nested = column_map->getNestedColumnPtr();
auto nested_no_lc = recursiveRemoveLowCardinality(nested);
if (nested.get() == nested_no_lc.get())
return column;

return ColumnMap::create(nested_no_lc);
}

if (const auto * column_low_cardinality = typeid_cast<const ColumnLowCardinality *>(column.get()))
return column_low_cardinality->convertToFullColumn();
@ -7,6 +7,7 @@

#include <DataTypes/DataTypeMap.h>
#include <DataTypes/DataTypeArray.h>
#include <DataTypes/DataTypeTuple.h>
#include <DataTypes/DataTypeLowCardinality.h>
#include <DataTypes/DataTypeFactory.h>
#include <DataTypes/Serializations/SerializationMap.h>
#include <Parsers/IAST.h>
@ -53,12 +54,24 @@ DataTypeMap::DataTypeMap(const DataTypePtr & key_type_, const DataTypePtr & valu

void DataTypeMap::assertKeyType() const
{
if (!key_type->isValueRepresentedByInteger()
bool type_error = false;
if (key_type->getTypeId() == TypeIndex::LowCardinality)
{
const auto & low_cardinality_data_type = assert_cast<const DataTypeLowCardinality &>(*key_type);
if (!isStringOrFixedString(*(low_cardinality_data_type.getDictionaryType())))
type_error = true;
}
else if (!key_type->isValueRepresentedByInteger()
&& !isStringOrFixedString(*key_type)
&& !WhichDataType(key_type).isNothing()
&& !WhichDataType(key_type).isUUID())
{
type_error = true;
}

if (type_error)
throw Exception(ErrorCodes::BAD_ARGUMENTS,
"Type of Map key must be a type, that can be represented by integer or string or UUID,"
"Type of Map key must be a type, that can be represented by integer or String or FixedString (possibly LowCardinality) or UUID,"
" but {} given", key_type->getName());
}
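The relaxed check above admits `LowCardinality(String)` (and `FixedString`) Map keys; a minimal sketch, assuming `Map` still requires the experimental flag in this release and using a hypothetical table name:

``` sql
SET allow_experimental_map_type = 1;

CREATE TABLE map_lc (m Map(LowCardinality(String), UInt64)) ENGINE = Memory;
INSERT INTO map_lc VALUES (map('key1', 1, 'key2', 2));

SELECT m['key1'] FROM map_lc;
```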
@ -80,8 +80,13 @@ void SerializationMap::deserializeBinary(IColumn & column, ReadBuffer & istr) co
|
||||
}
|
||||
|
||||
|
||||
template <typename Writer>
|
||||
void SerializationMap::serializeTextImpl(const IColumn & column, size_t row_num, bool quote_key, WriteBuffer & ostr, Writer && writer) const
|
||||
template <typename KeyWriter, typename ValueWriter>
|
||||
void SerializationMap::serializeTextImpl(
|
||||
const IColumn & column,
|
||||
size_t row_num,
|
||||
WriteBuffer & ostr,
|
||||
KeyWriter && key_writer,
|
||||
ValueWriter && value_writer) const
|
||||
{
|
||||
const auto & column_map = assert_cast<const ColumnMap &>(column);
|
||||
|
||||
@ -98,17 +103,9 @@ void SerializationMap::serializeTextImpl(const IColumn & column, size_t row_num,
|
||||
if (i != offset)
|
||||
writeChar(',', ostr);
|
||||
|
||||
if (quote_key)
|
||||
{
|
||||
writeChar('"', ostr);
|
||||
writer(key, nested_tuple.getColumn(0), i);
|
||||
writeChar('"', ostr);
|
||||
}
|
||||
else
|
||||
writer(key, nested_tuple.getColumn(0), i);
|
||||
|
||||
key_writer(ostr, key, nested_tuple.getColumn(0), i);
|
||||
writeChar(':', ostr);
|
||||
writer(value, nested_tuple.getColumn(1), i);
|
||||
value_writer(ostr, value, nested_tuple.getColumn(1), i);
|
||||
}
|
||||
writeChar('}', ostr);
|
||||
}
|
||||
@ -148,13 +145,13 @@ void SerializationMap::deserializeTextImpl(IColumn & column, ReadBuffer & istr,
|
||||
if (*istr.position() == '}')
|
||||
break;
|
||||
|
||||
reader(key, key_column);
|
||||
reader(istr, key, key_column);
|
||||
skipWhitespaceIfAny(istr);
|
||||
assertChar(':', istr);
|
||||
|
||||
++size;
|
||||
skipWhitespaceIfAny(istr);
|
||||
reader(value, value_column);
|
||||
reader(istr, value, value_column);
|
||||
|
||||
skipWhitespaceIfAny(istr);
|
||||
}
|
||||
@ -170,41 +167,45 @@ void SerializationMap::deserializeTextImpl(IColumn & column, ReadBuffer & istr,
|
||||
|
||||
void SerializationMap::serializeText(const IColumn & column, size_t row_num, WriteBuffer & ostr, const FormatSettings & settings) const
|
||||
{
|
||||
serializeTextImpl(column, row_num, /*quote_key=*/ false, ostr,
|
||||
[&](const SerializationPtr & subcolumn_serialization, const IColumn & subcolumn, size_t pos)
|
||||
{
|
||||
subcolumn_serialization->serializeTextQuoted(subcolumn, pos, ostr, settings);
|
||||
});
|
||||
auto writer = [&settings](WriteBuffer & buf, const SerializationPtr & subcolumn_serialization, const IColumn & subcolumn, size_t pos)
|
||||
{
|
||||
subcolumn_serialization->serializeTextQuoted(subcolumn, pos, buf, settings);
|
||||
};
|
||||
|
||||
serializeTextImpl(column, row_num, ostr, writer, writer);
|
||||
}

void SerializationMap::deserializeText(IColumn & column, ReadBuffer & istr, const FormatSettings & settings) const
{
    deserializeTextImpl(column, istr,
        [&](const SerializationPtr & subcolumn_serialization, IColumn & subcolumn)
        [&settings](ReadBuffer & buf, const SerializationPtr & subcolumn_serialization, IColumn & subcolumn)
        {
            subcolumn_serialization->deserializeTextQuoted(subcolumn, istr, settings);
            subcolumn_serialization->deserializeTextQuoted(subcolumn, buf, settings);
        });
}

void SerializationMap::serializeTextJSON(const IColumn & column, size_t row_num, WriteBuffer & ostr, const FormatSettings & settings) const
{
    /// We need to double-quote integer keys to produce valid JSON.
    const auto & column_key = assert_cast<const ColumnMap &>(column).getNestedData().getColumn(0);
    bool quote_key = !WhichDataType(column_key.getDataType()).isStringOrFixedString();

    serializeTextImpl(column, row_num, quote_key, ostr,
        [&](const SerializationPtr & subcolumn_serialization, const IColumn & subcolumn, size_t pos)
    serializeTextImpl(column, row_num, ostr,
        [&settings](WriteBuffer & buf, const SerializationPtr & subcolumn_serialization, const IColumn & subcolumn, size_t pos)
        {
            subcolumn_serialization->serializeTextJSON(subcolumn, pos, ostr, settings);
            /// We need to double-quote all keys (including integers) to produce valid JSON.
            WriteBufferFromOwnString str_buf;
            subcolumn_serialization->serializeText(subcolumn, pos, str_buf, settings);
            writeJSONString(str_buf.str(), buf, settings);
        },
        [&settings](WriteBuffer & buf, const SerializationPtr & subcolumn_serialization, const IColumn & subcolumn, size_t pos)
        {
            subcolumn_serialization->serializeTextJSON(subcolumn, pos, buf, settings);
        });
}
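The JSON path above renders each key into a scratch buffer and then emits it as a quoted JSON string, so non-string keys such as integers become valid JSON object keys. A standalone sketch of that two-step idea, with a deliberately simplified escaper standing in for writeJSONString:

#include <iostream>
#include <sstream>
#include <string>

// Simplified stand-in for writeJSONString: quote and escape.
static void write_json_string(const std::string & s, std::ostream & out)
{
    out << '"';
    for (char c : s)
    {
        if (c == '"' || c == '\\')
            out << '\\';
        out << c;
    }
    out << '"';
}

template <typename T>
void write_json_key(const T & key, std::ostream & out)
{
    std::ostringstream scratch;               // render the key as plain text first
    scratch << key;
    write_json_string(scratch.str(), out);    // then quote it for JSON
}

int main()
{
    write_json_key(42, std::cout);    // "42"  : the integer key becomes a JSON string
    std::cout << ":1\n";
}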

void SerializationMap::deserializeTextJSON(IColumn & column, ReadBuffer & istr, const FormatSettings & settings) const
{
    deserializeTextImpl(column, istr,
        [&](const SerializationPtr & subcolumn_serialization, IColumn & subcolumn)
        [&settings](ReadBuffer & buf, const SerializationPtr & subcolumn_serialization, IColumn & subcolumn)
        {
            subcolumn_serialization->deserializeTextJSON(subcolumn, istr, settings);
            subcolumn_serialization->deserializeTextJSON(subcolumn, buf, settings);
        });
}


@@ -60,8 +60,8 @@ public:
        SubstreamsCache * cache) const override;

private:
    template <typename Writer>
    void serializeTextImpl(const IColumn & column, size_t row_num, bool quote_key, WriteBuffer & ostr, Writer && writer) const;
    template <typename KeyWriter, typename ValueWriter>
    void serializeTextImpl(const IColumn & column, size_t row_num, WriteBuffer & ostr, KeyWriter && key_writer, ValueWriter && value_writer) const;

    template <typename Reader>
    void deserializeTextImpl(IColumn & column, ReadBuffer & istr, Reader && reader) const;

@@ -224,9 +224,7 @@ void registerDictionarySourceClickHouse(DictionarySourceFactory & factory)

    ClickHouseDictionarySource::Configuration configuration
    {
        .secure = config.getBool(settings_config_prefix + ".secure", false),
        .host = host,
        .port = port,
        .user = config.getString(settings_config_prefix + ".user", "default"),
        .password = config.getString(settings_config_prefix + ".password", ""),
        .db = config.getString(settings_config_prefix + ".db", default_database),
@@ -235,7 +233,9 @@ void registerDictionarySourceClickHouse(DictionarySourceFactory & factory)
        .invalidate_query = config.getString(settings_config_prefix + ".invalidate_query", ""),
        .update_field = config.getString(settings_config_prefix + ".update_field", ""),
        .update_lag = config.getUInt64(settings_config_prefix + ".update_lag", 1),
        .is_local = isLocalAddress({host, port}, default_port)
        .port = port,
        .is_local = isLocalAddress({host, port}, default_port),
        .secure = config.getBool(settings_config_prefix + ".secure", false)
    };

    /// We should set user info even for the case when the dictionary is loaded in-process (without TCP communication).

@@ -20,9 +20,7 @@ class ClickHouseDictionarySource final : public IDictionarySource
public:
    struct Configuration
    {
        const bool secure;
        const std::string host;
        const UInt16 port;
        const std::string user;
        const std::string password;
        const std::string db;
@@ -31,7 +29,9 @@ public:
        const std::string invalidate_query;
        const std::string update_field;
        const UInt64 update_lag;
        const UInt16 port;
        const bool is_local;
        const bool secure;
    };

    ClickHouseDictionarySource(

@@ -417,7 +417,11 @@ void IDiskRemote::removeDirectory(const String & path)

DiskDirectoryIteratorPtr IDiskRemote::iterateDirectory(const String & path)
{
    return std::make_unique<RemoteDiskDirectoryIterator>(metadata_path + path, path);
    fs::path meta_path = fs::path(metadata_path) / path;
    if (fs::exists(meta_path) && fs::is_directory(meta_path))
        return std::make_unique<RemoteDiskDirectoryIterator>(meta_path, path);
    else
        return std::make_unique<RemoteDiskDirectoryIterator>();
}
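The change above guards directory iteration behind an existence check instead of constructing an iterator over a missing path. The same pattern in plain std::filesystem (the /tmp path is only an example):

#include <filesystem>
#include <iostream>

namespace fs = std::filesystem;

// Iterate a directory only if it actually exists; otherwise yield nothing
// instead of throwing, mirroring the fallback to an empty iterator above.
int main()
{
    fs::path meta_path = fs::path("/tmp") / "maybe-missing";
    if (fs::exists(meta_path) && fs::is_directory(meta_path))
        for (const auto & entry : fs::directory_iterator(meta_path))
            std::cout << entry.path() << '\n';
    // else: nothing to enumerate, same as a default-constructed iterator
}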


@@ -193,6 +193,7 @@ struct IDiskRemote::Metadata
class RemoteDiskDirectoryIterator final : public IDiskDirectoryIterator
{
public:
    RemoteDiskDirectoryIterator() {}
    RemoteDiskDirectoryIterator(const String & full_path, const String & folder_path_) : iter(full_path), folder_path(folder_path_) {}

    void next() override { ++iter; }

@@ -116,6 +116,8 @@ target_link_libraries(clickhouse_functions PRIVATE clickhouse_functions_url)
add_subdirectory(array)
target_link_libraries(clickhouse_functions PRIVATE clickhouse_functions_array)

add_subdirectory(JSONPath)

if (USE_STATS)
    target_link_libraries(clickhouse_functions PRIVATE stats)
endif()

@@ -39,6 +39,8 @@ struct DummyJSONParser
    std::string_view getString() const { return {}; }
    Array getArray() const { return {}; }
    Object getObject() const { return {}; }

    Element getElement() { return {}; }
};

/// References an array in a JSON document.
@@ -97,4 +99,9 @@ struct DummyJSONParser
#endif
};

inline ALWAYS_INLINE std::ostream& operator<<(std::ostream& out, DummyJSONParser::Element)
{
    return out;
}

}

@@ -28,7 +28,7 @@ public:
    static constexpr auto name = or_null ? "joinGetOrNull" : "joinGet";

    bool useDefaultImplementationForNulls() const override { return false; }
    bool useDefaultImplementationForLowCardinalityColumns() const override { return true; }
    bool useDefaultImplementationForLowCardinalityColumns() const override { return false; }
    bool useDefaultImplementationForConstants() const override { return true; }

    ColumnPtr executeImpl(const ColumnsWithTypeAndName & arguments, const DataTypePtr &, size_t input_rows_count) const override;

src/Functions/FunctionSQLJSON.cpp (Normal file, 15 lines)
@@ -0,0 +1,15 @@
#include <Functions/FunctionSQLJSON.h>
#include <Functions/FunctionFactory.h>


namespace DB
{

void registerFunctionsSQLJSON(FunctionFactory & factory)
{
    factory.registerFunction<FunctionSQLJSON<NameJSONExists, JSONExistsImpl>>();
    factory.registerFunction<FunctionSQLJSON<NameJSONQuery, JSONQueryImpl>>();
    factory.registerFunction<FunctionSQLJSON<NameJSONValue, JSONValueImpl>>();
}

}
src/Functions/FunctionSQLJSON.h (Normal file, 334 lines)
@@ -0,0 +1,334 @@
#pragma once

#include <sstream>
#include <type_traits>
#include <Columns/ColumnConst.h>
#include <Columns/ColumnString.h>
#include <Columns/ColumnsNumber.h>
#include <Core/Settings.h>
#include <DataTypes/DataTypeString.h>
#include <DataTypes/DataTypesNumber.h>
#include <Functions/DummyJSONParser.h>
#include <Functions/IFunction.h>
#include <Functions/JSONPath/ASTs/ASTJSONPath.h>
#include <Functions/JSONPath/Generator/GeneratorJSONPath.h>
#include <Functions/JSONPath/Parsers/ParserJSONPath.h>
#include <Functions/RapidJSONParser.h>
#include <Functions/SimdJSONParser.h>
#include <Interpreters/Context.h>
#include <Parsers/IParser.h>
#include <Parsers/Lexer.h>
#include <common/range.h>

#if !defined(ARCADIA_BUILD)
#    include "config_functions.h"
#endif

namespace DB
{
namespace ErrorCodes
{
    extern const int ILLEGAL_TYPE_OF_ARGUMENT;
    extern const int TOO_FEW_ARGUMENTS_FOR_FUNCTION;
    extern const int BAD_ARGUMENTS;
}

class FunctionSQLJSONHelpers
{
public:
    template <typename Name, template <typename> typename Impl, class JSONParser>
    class Executor
    {
    public:
        static ColumnPtr run(const ColumnsWithTypeAndName & arguments, const DataTypePtr & result_type, size_t input_rows_count, uint32_t parse_depth)
        {
            MutableColumnPtr to{result_type->createColumn()};
            to->reserve(input_rows_count);

            if (arguments.size() < 2)
            {
                throw Exception{"JSONPath functions require at least 2 arguments", ErrorCodes::TOO_FEW_ARGUMENTS_FOR_FUNCTION};
            }

            const auto & first_column = arguments[0];

            /// Check the first argument: must be of type String (the JSONPath)
            if (!isString(first_column.type))
            {
                throw Exception(
                    "JSONPath functions require the first argument to be a JSONPath of type String, illegal type: " + first_column.type->getName(),
                    ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT);
            }
            /// Check the first argument: must be const (the JSONPath)
            if (!isColumnConst(*first_column.column))
            {
                throw Exception("The first argument (JSONPath) must be const", ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT);
            }

            const auto & second_column = arguments[1];

            /// Check the second argument: must be of type String (the JSON document)
            if (!isString(second_column.type))
            {
                throw Exception(
                    "JSONPath functions require the second argument to be JSON of type String, illegal type: " + second_column.type->getName(),
                    ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT);
            }

            const ColumnPtr & arg_jsonpath = first_column.column;
            const auto * arg_jsonpath_const = typeid_cast<const ColumnConst *>(arg_jsonpath.get());
            const auto * arg_jsonpath_string = typeid_cast<const ColumnString *>(arg_jsonpath_const->getDataColumnPtr().get());

            const ColumnPtr & arg_json = second_column.column;
            const auto * col_json_const = typeid_cast<const ColumnConst *>(arg_json.get());
            const auto * col_json_string
                = typeid_cast<const ColumnString *>(col_json_const ? col_json_const->getDataColumnPtr().get() : arg_json.get());

            /// Get data and offsets for the first argument (JSONPath)
            const ColumnString::Chars & chars_path = arg_jsonpath_string->getChars();
            const ColumnString::Offsets & offsets_path = arg_jsonpath_string->getOffsets();

            /// Prepare to parse the first argument (JSONPath)
            const char * query_begin = reinterpret_cast<const char *>(&chars_path[0]);
            const char * query_end = query_begin + offsets_path[0] - 1;

            /// Tokenize the query
            Tokens tokens(query_begin, query_end);
            /// Max depth 0 indicates that depth is not limited
            IParser::Pos token_iterator(tokens, parse_depth);

            /// Parse the query and create the AST tree
            Expected expected;
            ASTPtr res;
            ParserJSONPath parser;
            const bool parse_res = parser.parse(token_iterator, res, expected);
            if (!parse_res)
            {
                throw Exception{"Unable to parse JSONPath", ErrorCodes::BAD_ARGUMENTS};
            }

            /// Get data and offsets for the second argument (JSON)
            const ColumnString::Chars & chars_json = col_json_string->getChars();
            const ColumnString::Offsets & offsets_json = col_json_string->getOffsets();

            JSONParser json_parser;
            using Element = typename JSONParser::Element;
            Element document;
            bool document_ok = false;

            /// Parse the JSON for every row
            Impl<JSONParser> impl;

            for (const auto i : collections::range(0, input_rows_count))
            {
                std::string_view json{
                    reinterpret_cast<const char *>(&chars_json[offsets_json[i - 1]]), offsets_json[i] - offsets_json[i - 1] - 1};
                document_ok = json_parser.parse(json, document);

                bool added_to_column = false;
                if (document_ok)
                {
                    added_to_column = impl.insertResultToColumn(*to, document, res);
                }
                if (!added_to_column)
                {
                    to->insertDefault();
                }
            }
            return to;
        }
    };
};
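Executor::run slices each row out of the packed chars/offsets layout; offsets_json[i - 1] is safe at i == 0 only because ClickHouse's padded arrays read the element before the start as 0. A standalone sketch of the same layout with that boundary handled explicitly (PackedStrings is an illustrative stand-in for ColumnString, not the real class):

#include <cstddef>
#include <iostream>
#include <string>
#include <string_view>
#include <vector>

// Packed string column: all bytes concatenated (each value NUL-terminated),
// offsets[i] = end position of value i in `chars`.
struct PackedStrings
{
    std::string chars;
    std::vector<size_t> offsets;

    void push(std::string_view s)
    {
        chars += s;
        chars += '\0';
        offsets.push_back(chars.size());
    }

    std::string_view at(size_t i) const
    {
        size_t begin = (i == 0) ? 0 : offsets[i - 1];            // explicit guard for row 0
        return {chars.data() + begin, offsets[i] - begin - 1};   // minus the trailing NUL
    }
};

int main()
{
    PackedStrings col;
    col.push("{\"a\":1}");
    col.push("{\"b\":2}");
    for (size_t i = 0; i < col.offsets.size(); ++i)
        std::cout << col.at(i) << '\n';
}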

template <typename Name, template <typename> typename Impl>
class FunctionSQLJSON : public IFunction, WithConstContext
{
public:
    static FunctionPtr create(ContextPtr context_) { return std::make_shared<FunctionSQLJSON>(context_); }
    explicit FunctionSQLJSON(ContextPtr context_) : WithConstContext(context_) { }

    static constexpr auto name = Name::name;
    String getName() const override { return Name::name; }
    bool isVariadic() const override { return true; }
    size_t getNumberOfArguments() const override { return 0; }
    bool useDefaultImplementationForConstants() const override { return true; }
    ColumnNumbers getArgumentsThatAreAlwaysConstant() const override { return {0}; }

    DataTypePtr getReturnTypeImpl(const ColumnsWithTypeAndName & arguments) const override
    {
        return Impl<DummyJSONParser>::getReturnType(Name::name, arguments);
    }

    ColumnPtr executeImpl(const ColumnsWithTypeAndName & arguments, const DataTypePtr & result_type, size_t input_rows_count) const override
    {
        /// Choose JSONParser.
        /// 1. Lexer(path) -> Tokens
        /// 2. Create ASTPtr
        /// 3. Parser(Tokens, ASTPtr) -> complete AST
        /// 4. Execute functions: call getNextItem on generator and handle each item
        uint32_t parse_depth = getContext()->getSettingsRef().max_parser_depth;
#if USE_SIMDJSON
        if (getContext()->getSettingsRef().allow_simdjson)
            return FunctionSQLJSONHelpers::Executor<Name, Impl, SimdJSONParser>::run(arguments, result_type, input_rows_count, parse_depth);
#endif
        return FunctionSQLJSONHelpers::Executor<Name, Impl, DummyJSONParser>::run(arguments, result_type, input_rows_count, parse_depth);
    }
};
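executeImpl picks a parser type at compile time and a branch at run time: the SimdJSONParser instantiation exists only under USE_SIMDJSON, and the allow_simdjson setting selects it per query. A minimal sketch of that dispatch shape, with invented FastParser/DummyParser types:

#include <iostream>
#include <string_view>

// Two interchangeable "parsers" with the same static interface.
struct FastParser  { static constexpr std::string_view name = "fast"; };
struct DummyParser { static constexpr std::string_view name = "dummy"; };

template <typename Parser>
void run(std::string_view input)
{
    std::cout << "parsing \"" << input << "\" with " << Parser::name << '\n';
}

// A runtime flag picks which template instantiation to call, mirroring
// the allow_simdjson setting guarded by #if USE_SIMDJSON above.
void execute(std::string_view input, bool allow_fast)
{
    if (allow_fast)
        return run<FastParser>(input);
    return run<DummyParser>(input);
}

int main()
{
    execute("$.a", true);
    execute("$.a", false);
}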

struct NameJSONExists
{
    static constexpr auto name{"JSON_EXISTS"};
};

struct NameJSONValue
{
    static constexpr auto name{"JSON_VALUE"};
};

struct NameJSONQuery
{
    static constexpr auto name{"JSON_QUERY"};
};

template <typename JSONParser>
class JSONExistsImpl
{
public:
    using Element = typename JSONParser::Element;

    static DataTypePtr getReturnType(const char *, const ColumnsWithTypeAndName &) { return std::make_shared<DataTypeUInt8>(); }

    static size_t getNumberOfIndexArguments(const ColumnsWithTypeAndName & arguments) { return arguments.size() - 1; }

    static bool insertResultToColumn(IColumn & dest, const Element & root, ASTPtr & query_ptr)
    {
        GeneratorJSONPath<JSONParser> generator_json_path(query_ptr);
        Element current_element = root;
        VisitorStatus status;
        while ((status = generator_json_path.getNextItem(current_element)) != VisitorStatus::Exhausted)
        {
            if (status == VisitorStatus::Ok)
            {
                break;
            }
            current_element = root;
        }

        /// insert result, status can be either Ok (if we found the item)
        /// or Exhausted (if we never found the item)
        ColumnUInt8 & col_bool = assert_cast<ColumnUInt8 &>(dest);
        if (status == VisitorStatus::Ok)
        {
            col_bool.insert(1);
        }
        else
        {
            col_bool.insert(0);
        }
        return true;
    }
};

template <typename JSONParser>
class JSONValueImpl
{
public:
    using Element = typename JSONParser::Element;

    static DataTypePtr getReturnType(const char *, const ColumnsWithTypeAndName &) { return std::make_shared<DataTypeString>(); }

    static size_t getNumberOfIndexArguments(const ColumnsWithTypeAndName & arguments) { return arguments.size() - 1; }

    static bool insertResultToColumn(IColumn & dest, const Element & root, ASTPtr & query_ptr)
    {
        GeneratorJSONPath<JSONParser> generator_json_path(query_ptr);
        Element current_element = root;
        VisitorStatus status;
        Element res;
        while ((status = generator_json_path.getNextItem(current_element)) != VisitorStatus::Exhausted)
        {
            if (status == VisitorStatus::Ok)
            {
                if (!(current_element.isArray() || current_element.isObject()))
                {
                    break;
                }
            }
            else if (status == VisitorStatus::Error)
            {
                /// ON ERROR
                /// Here it is possible to handle errors with ON ERROR (as described in ISO/IEC TR 19075-6),
                /// however this functionality is not implemented yet
            }
            current_element = root;
        }

        if (status == VisitorStatus::Exhausted)
        {
            return false;
        }

        std::stringstream out; // STYLE_CHECK_ALLOW_STD_STRING_STREAM
        out << current_element.getElement();
        auto output_str = out.str();
        ColumnString & col_str = assert_cast<ColumnString &>(dest);
        col_str.insertData(output_str.data(), output_str.size());
        return true;
    }
};

/**
 * Implements JSON_QUERY: returns all elements matched by the JSONPath, wrapped in a JSON array.
 * @tparam JSONParser parser
 */
template <typename JSONParser>
class JSONQueryImpl
{
public:
    using Element = typename JSONParser::Element;

    static DataTypePtr getReturnType(const char *, const ColumnsWithTypeAndName &) { return std::make_shared<DataTypeString>(); }

    static size_t getNumberOfIndexArguments(const ColumnsWithTypeAndName & arguments) { return arguments.size() - 1; }

    static bool insertResultToColumn(IColumn & dest, const Element & root, ASTPtr & query_ptr)
    {
        GeneratorJSONPath<JSONParser> generator_json_path(query_ptr);
        Element current_element = root;
        VisitorStatus status;
        std::stringstream out; // STYLE_CHECK_ALLOW_STD_STRING_STREAM
        /// Create json array of results: [res1, res2, ...]
        out << "[";
        bool success = false;
        while ((status = generator_json_path.getNextItem(current_element)) != VisitorStatus::Exhausted)
        {
            if (status == VisitorStatus::Ok)
            {
                if (success)
                {
                    out << ", ";
                }
                success = true;
                out << current_element.getElement();
            }
            else if (status == VisitorStatus::Error)
            {
                /// ON ERROR
                /// Here it is possible to handle errors with ON ERROR (as described in ISO/IEC TR 19075-6),
                /// however this functionality is not implemented yet
            }
            current_element = root;
        }
        out << "]";
        if (!success)
        {
            return false;
        }
        ColumnString & col_str = assert_cast<ColumnString &>(dest);
        auto output_str = out.str();
        col_str.insertData(output_str.data(), output_str.size());
        return true;
    }
};
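JSONQueryImpl joins every match into one bracketed string, using the success flag both to decide when a separator is needed and to report whether anything matched at all. The separator bookkeeping in isolation:

#include <iostream>
#include <sstream>
#include <vector>

// Join matches into a JSON-array-like string, emitting ", " only between
// elements, the same flag-based bookkeeping JSONQueryImpl uses above.
int main()
{
    std::vector<int> matches{1, 2, 3};
    std::ostringstream out;
    out << "[";
    bool first = true;
    for (int m : matches)
    {
        if (!first)
            out << ", ";
        first = false;
        out << m;
    }
    out << "]";
    std::cout << out.str() << '\n';    // [1, 2, 3]
}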

}

@@ -21,6 +21,8 @@ void registerFunctionsCoding(FunctionFactory & factory)
    factory.registerFunction<FunctionUUIDStringToNum>();
    factory.registerFunction<FunctionHex>(FunctionFactory::CaseInsensitive);
    factory.registerFunction<FunctionUnhex>(FunctionFactory::CaseInsensitive);
    factory.registerFunction<FunctionBin>(FunctionFactory::CaseInsensitive);
    factory.registerFunction<FunctionUnbin>(FunctionFactory::CaseInsensitive);
    factory.registerFunction<FunctionChar>(FunctionFactory::CaseInsensitive);
    factory.registerFunction<FunctionBitmaskToArray>();
    factory.registerFunction<FunctionBitPositionsToArray>();

@@ -19,6 +19,7 @@
#include <Functions/FunctionHelpers.h>
#include <Functions/IFunction.h>
#include <Interpreters/Context_fwd.h>
#include <Interpreters/castColumn.h>
#include <IO/WriteHelpers.h>
#include <Common/IPv6ToBinary.h>
#include <Common/formatIPv6.h>
@@ -65,7 +66,6 @@ namespace ErrorCodes
constexpr size_t uuid_bytes_length = 16;
constexpr size_t uuid_text_length = 36;


class FunctionIPv6NumToString : public IFunction
{
public:
@@ -951,19 +951,22 @@ public:
    }
};


class FunctionHex : public IFunction
/// Encode number or string to string with binary or hexadecimal representation
template <typename Impl>
class EncodeToBinaryRepr : public IFunction
{
public:
    static constexpr auto name = "hex";
    static FunctionPtr create(ContextPtr) { return std::make_shared<FunctionHex>(); }
    static constexpr auto name = Impl::name;
    static constexpr size_t word_size = Impl::word_size;

    String getName() const override
    {
        return name;
    }
    static FunctionPtr create(ContextPtr) { return std::make_shared<EncodeToBinaryRepr>(); }

    String getName() const override { return name; }

    size_t getNumberOfArguments() const override { return 1; }

    bool useDefaultImplementationForConstants() const override { return true; }

    bool isInjective(const ColumnsWithTypeAndName &) const override { return true; }

    DataTypePtr getReturnTypeImpl(const DataTypes & arguments) const override
@@ -976,32 +979,44 @@ public:
        !which.isDateTime64() &&
        !which.isUInt() &&
        !which.isFloat() &&
        !which.isDecimal())
        !which.isDecimal() &&
        !which.isAggregateFunction())
            throw Exception("Illegal type " + arguments[0]->getName() + " of argument of function " + getName(),
                ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT);

        return std::make_shared<DataTypeString>();
    }

    template <typename T>
    void executeOneUInt(T x, char *& out) const
    ColumnPtr executeImpl(const ColumnsWithTypeAndName & arguments, const DataTypePtr &, size_t /*input_rows_count*/) const override
    {
        bool was_nonzero = false;
        for (int offset = (sizeof(T) - 1) * 8; offset >= 0; offset -= 8)
        const IColumn * column = arguments[0].column.get();
        ColumnPtr res_column;

        WhichDataType which(column->getDataType());
        if (which.isAggregateFunction())
        {
            UInt8 byte = x >> offset;

            /// Leading zeros.
            if (byte == 0 && !was_nonzero && offset) // -V560
                continue;

            was_nonzero = true;

            writeHexByteUppercase(byte, out);
            out += 2;
            const ColumnPtr to_string = castColumn(arguments[0], std::make_shared<DataTypeString>());
            const auto * str_column = checkAndGetColumn<ColumnString>(to_string.get());
            tryExecuteString(str_column, res_column);
            return res_column;
        }
        *out = '\0';
        ++out;

        if (tryExecuteUInt<UInt8>(column, res_column) ||
            tryExecuteUInt<UInt16>(column, res_column) ||
            tryExecuteUInt<UInt32>(column, res_column) ||
            tryExecuteUInt<UInt64>(column, res_column) ||
            tryExecuteString(column, res_column) ||
            tryExecuteFixedString(column, res_column) ||
            tryExecuteFloat<Float32>(column, res_column) ||
            tryExecuteFloat<Float64>(column, res_column) ||
            tryExecuteDecimal<Decimal32>(column, res_column) ||
            tryExecuteDecimal<Decimal64>(column, res_column) ||
            tryExecuteDecimal<Decimal128>(column, res_column))
            return res_column;

        throw Exception("Illegal column " + arguments[0].column->getName()
            + " of argument of function " + getName(),
            ErrorCodes::ILLEGAL_COLUMN);
    }

    template <typename T>
@@ -1009,7 +1024,7 @@ public:
    {
        const ColumnVector<T> * col_vec = checkAndGetColumn<ColumnVector<T>>(col);

        static constexpr size_t MAX_UINT_HEX_LENGTH = sizeof(T) * 2 + 1; /// Including trailing zero byte.
        static constexpr size_t MAX_LENGTH = sizeof(T) * word_size + 1; /// Including trailing zero byte.

        if (col_vec)
        {
@@ -1021,23 +1036,22 @@ public:

            size_t size = in_vec.size();
            out_offsets.resize(size);
            out_vec.resize(size * 3 + MAX_UINT_HEX_LENGTH); /// 3 is length of one byte in hex plus zero byte.
            out_vec.resize(size * (word_size + 1) + MAX_LENGTH); /// word_size + 1 is length of one byte in hex/bin plus zero byte.

            size_t pos = 0;
            for (size_t i = 0; i < size; ++i)
            {
                /// Manual exponential growth, so as not to rely on the linear amortized work time of `resize` (no one guarantees it).
                if (pos + MAX_UINT_HEX_LENGTH > out_vec.size())
                    out_vec.resize(out_vec.size() * 2 + MAX_UINT_HEX_LENGTH);
                if (pos + MAX_LENGTH > out_vec.size())
                    out_vec.resize(out_vec.size() * word_size + MAX_LENGTH);

                char * begin = reinterpret_cast<char *>(&out_vec[pos]);
                char * end = begin;
                executeOneUInt<T>(in_vec[i], end);
                Impl::executeOneUInt(in_vec[i], end);

                pos += end - begin;
                out_offsets[i] = pos;
            }

            out_vec.resize(pos);

            col_res = std::move(col_str);
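The loop above grows the output buffer geometrically by hand before each append, so the total cost stays amortized-constant even if resize itself were linear; the new code multiplies by word_size where this sketch doubles, and any factor greater than 1 gives the same bound. A minimal sketch (MAX_LEN and append_encoded are illustrative names):

#include <cstddef>
#include <vector>

// Worst-case size of one encoded value (stand-in for MAX_LENGTH above).
static constexpr size_t MAX_LEN = 17;

// Grow geometrically before appending, so repeated appends stay cheap.
void append_encoded(std::vector<char> & out, size_t & pos)
{
    if (pos + MAX_LEN > out.size())
        out.resize(out.size() * 2 + MAX_LEN);    // double, plus room for this value
    // ... encode one value into out[pos ...], then advance pos ...
    pos += MAX_LEN;
}

int main()
{
    std::vector<char> out;
    size_t pos = 0;
    for (int i = 0; i < 1000; ++i)
        append_encoded(out, pos);
    out.resize(pos);    // shrink to what was actually written
}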
@@ -1049,10 +1063,242 @@ public:
        }
    }

    template <typename T>
    void executeFloatAndDecimal(const T & in_vec, ColumnPtr & col_res, const size_t type_size_in_bytes) const
    bool tryExecuteString(const IColumn * col, ColumnPtr & col_res) const
    {
        const size_t hex_length = type_size_in_bytes * 2 + 1; /// Including trailing zero byte.
        const ColumnString * col_str_in = checkAndGetColumn<ColumnString>(col);

        if (col_str_in)
        {
            auto col_str = ColumnString::create();
            ColumnString::Chars & out_vec = col_str->getChars();
            ColumnString::Offsets & out_offsets = col_str->getOffsets();

            const ColumnString::Chars & in_vec = col_str_in->getChars();
            const ColumnString::Offsets & in_offsets = col_str_in->getOffsets();

            size_t size = in_offsets.size();

            out_offsets.resize(size);
            /// reserve `word_size` bytes for each non trailing zero byte from input + `size` bytes for trailing zeros
            out_vec.resize((in_vec.size() - size) * word_size + size);

            char * begin = reinterpret_cast<char *>(out_vec.data());
            char * pos = begin;
            size_t prev_offset = 0;

            for (size_t i = 0; i < size; ++i)
            {
                size_t new_offset = in_offsets[i];

                Impl::executeOneString(&in_vec[prev_offset], &in_vec[new_offset - 1], pos);

                out_offsets[i] = pos - begin;

                prev_offset = new_offset;
            }
            if (!out_offsets.empty() && out_offsets.back() != out_vec.size())
                throw Exception("Column size mismatch (internal logical error)", ErrorCodes::LOGICAL_ERROR);

            col_res = std::move(col_str);
            return true;
        }
        else
        {
            return false;
        }
    }

    template <typename T>
    bool tryExecuteDecimal(const IColumn * col, ColumnPtr & col_res) const
    {
        const ColumnDecimal<T> * col_dec = checkAndGetColumn<ColumnDecimal<T>>(col);
        if (col_dec)
        {
            const typename ColumnDecimal<T>::Container & in_vec = col_dec->getData();
            Impl::executeFloatAndDecimal(in_vec, col_res, sizeof(T));
            return true;
        }
        else
        {
            return false;
        }
    }

    static bool tryExecuteFixedString(const IColumn * col, ColumnPtr & col_res)
    {
        const ColumnFixedString * col_fstr_in = checkAndGetColumn<ColumnFixedString>(col);

        if (col_fstr_in)
        {
            auto col_str = ColumnString::create();
            ColumnString::Chars & out_vec = col_str->getChars();
            ColumnString::Offsets & out_offsets = col_str->getOffsets();

            const ColumnString::Chars & in_vec = col_fstr_in->getChars();

            size_t size = col_fstr_in->size();

            out_offsets.resize(size);
            out_vec.resize(in_vec.size() * word_size + size);

            char * begin = reinterpret_cast<char *>(out_vec.data());
            char * pos = begin;

            size_t n = col_fstr_in->getN();

            size_t prev_offset = 0;

            for (size_t i = 0; i < size; ++i)
            {
                size_t new_offset = prev_offset + n;

                Impl::executeOneString(&in_vec[prev_offset], &in_vec[new_offset], pos);

                out_offsets[i] = pos - begin;
                prev_offset = new_offset;
            }

            if (!out_offsets.empty() && out_offsets.back() != out_vec.size())
                throw Exception("Column size mismatch (internal logical error)", ErrorCodes::LOGICAL_ERROR);

            col_res = std::move(col_str);
            return true;
        }
        else
        {
            return false;
        }
    }

    template <typename T>
    bool tryExecuteFloat(const IColumn * col, ColumnPtr & col_res) const
    {
        const ColumnVector<T> * col_vec = checkAndGetColumn<ColumnVector<T>>(col);
        if (col_vec)
        {
            const typename ColumnVector<T>::Container & in_vec = col_vec->getData();
            Impl::executeFloatAndDecimal(in_vec, col_res, sizeof(T));
            return true;
        }
        else
        {
            return false;
        }
    }
};

/// Decode number or string from string with binary or hexadecimal representation
template <typename Impl>
class DecodeFromBinaryRepr : public IFunction
{
public:
    static constexpr auto name = Impl::name;
    static constexpr size_t word_size = Impl::word_size;
    static FunctionPtr create(ContextPtr) { return std::make_shared<DecodeFromBinaryRepr>(); }

    String getName() const override { return name; }

    size_t getNumberOfArguments() const override { return 1; }
    bool isInjective(const ColumnsWithTypeAndName &) const override { return true; }

    DataTypePtr getReturnTypeImpl(const DataTypes & arguments) const override
    {
        if (!isString(arguments[0]))
            throw Exception("Illegal type " + arguments[0]->getName() + " of argument of function " + getName(),
                ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT);

        return std::make_shared<DataTypeString>();
    }

    bool useDefaultImplementationForConstants() const override { return true; }

    ColumnPtr executeImpl(const ColumnsWithTypeAndName & arguments, const DataTypePtr &, size_t /*input_rows_count*/) const override
    {
        const ColumnPtr & column = arguments[0].column;

        if (const ColumnString * col = checkAndGetColumn<ColumnString>(column.get()))
        {
            auto col_res = ColumnString::create();

            ColumnString::Chars & out_vec = col_res->getChars();
            ColumnString::Offsets & out_offsets = col_res->getOffsets();

            const ColumnString::Chars & in_vec = col->getChars();
            const ColumnString::Offsets & in_offsets = col->getOffsets();

            size_t size = in_offsets.size();
            out_offsets.resize(size);
            out_vec.resize(in_vec.size() / word_size + size);

            char * begin = reinterpret_cast<char *>(out_vec.data());
            char * pos = begin;
            size_t prev_offset = 0;

            for (size_t i = 0; i < size; ++i)
            {
                size_t new_offset = in_offsets[i];

                Impl::decode(reinterpret_cast<const char *>(&in_vec[prev_offset]), reinterpret_cast<const char *>(&in_vec[new_offset - 1]), pos);

                out_offsets[i] = pos - begin;

                prev_offset = new_offset;
            }

            out_vec.resize(pos - begin);

            return col_res;
        }
        else
        {
            throw Exception("Illegal column " + arguments[0].column->getName()
                + " of argument of function " + getName(),
                ErrorCodes::ILLEGAL_COLUMN);
        }
    }
};

struct HexImpl
{
    static constexpr auto name = "hex";
    static constexpr size_t word_size = 2;

    template <typename T>
    static void executeOneUInt(T x, char *& out)
    {
        bool was_nonzero = false;
        for (int offset = (sizeof(T) - 1) * 8; offset >= 0; offset -= 8)
        {
            UInt8 byte = x >> offset;

            /// Skip leading zeros
            if (byte == 0 && !was_nonzero && offset)
                continue;

            was_nonzero = true;
            writeHexByteUppercase(byte, out);
            out += word_size;
        }
        *out = '\0';
        ++out;
    }
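executeOneUInt walks the value byte by byte from the most significant end, skipping leading zero bytes but always emitting the last one, because the offset test in the skip condition is false only there. The same logic standalone:

#include <cstdint>
#include <cstdio>

// Write x in uppercase hex, skipping leading zero bytes but always keeping
// at least the final byte; the `offset` check does exactly this.
template <typename T>
void print_hex_skip_leading_zeros(T x)
{
    bool was_nonzero = false;
    for (int offset = (sizeof(T) - 1) * 8; offset >= 0; offset -= 8)
    {
        uint8_t byte = static_cast<uint8_t>(x >> offset);
        if (byte == 0 && !was_nonzero && offset)    // offset == 0 forces the last byte out
            continue;
        was_nonzero = true;
        std::printf("%02X", static_cast<unsigned>(byte));
    }
    std::printf("\n");
}

int main()
{
    print_hex_skip_leading_zeros<uint32_t>(0x0000ABCD);    // ABCD
    print_hex_skip_leading_zeros<uint32_t>(0);             // 00
}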

    static void executeOneString(const UInt8 * pos, const UInt8 * end, char *& out)
    {
        while (pos < end)
        {
            writeHexByteUppercase(*pos, out);
            ++pos;
            out += word_size;
        }
        *out = '\0';
        ++out;
    }

    template <typename T>
    static void executeFloatAndDecimal(const T & in_vec, ColumnPtr & col_res, const size_t type_size_in_bytes)
    {
        const size_t hex_length = type_size_in_bytes * word_size + 1; /// Including trailing zero byte.
        auto col_str = ColumnString::create();

        ColumnString::Chars & out_vec = col_str->getChars();
@@ -1074,193 +1320,14 @@ public:
        }
        col_res = std::move(col_str);
    }

    template <typename T>
    bool tryExecuteFloat(const IColumn * col, ColumnPtr & col_res) const
    {
        const ColumnVector<T> * col_vec = checkAndGetColumn<ColumnVector<T>>(col);
        if (col_vec)
        {
            const typename ColumnVector<T>::Container & in_vec = col_vec->getData();
            executeFloatAndDecimal<typename ColumnVector<T>::Container>(in_vec, col_res, sizeof(T));
            return true;
        }
        else
        {
            return false;
        }
    }

    template <typename T>
    bool tryExecuteDecimal(const IColumn * col, ColumnPtr & col_res) const
    {
        const ColumnDecimal<T> * col_dec = checkAndGetColumn<ColumnDecimal<T>>(col);
        if (col_dec)
        {
            const typename ColumnDecimal<T>::Container & in_vec = col_dec->getData();
            executeFloatAndDecimal<typename ColumnDecimal<T>::Container>(in_vec, col_res, sizeof(T));
            return true;
        }
        else
        {
            return false;
        }
    }


    static void executeOneString(const UInt8 * pos, const UInt8 * end, char *& out)
    {
        while (pos < end)
        {
            writeHexByteUppercase(*pos, out);
            ++pos;
            out += 2;
        }
        *out = '\0';
        ++out;
    }

    static bool tryExecuteString(const IColumn * col, ColumnPtr & col_res)
    {
        const ColumnString * col_str_in = checkAndGetColumn<ColumnString>(col);

        if (col_str_in)
        {
            auto col_str = ColumnString::create();
            ColumnString::Chars & out_vec = col_str->getChars();
            ColumnString::Offsets & out_offsets = col_str->getOffsets();

            const ColumnString::Chars & in_vec = col_str_in->getChars();
            const ColumnString::Offsets & in_offsets = col_str_in->getOffsets();

            size_t size = in_offsets.size();
            out_offsets.resize(size);
            out_vec.resize(in_vec.size() * 2 - size);

            char * begin = reinterpret_cast<char *>(out_vec.data());
            char * pos = begin;
            size_t prev_offset = 0;

            for (size_t i = 0; i < size; ++i)
            {
                size_t new_offset = in_offsets[i];

                executeOneString(&in_vec[prev_offset], &in_vec[new_offset - 1], pos);

                out_offsets[i] = pos - begin;

                prev_offset = new_offset;
            }

            if (!out_offsets.empty() && out_offsets.back() != out_vec.size())
                throw Exception("Column size mismatch (internal logical error)", ErrorCodes::LOGICAL_ERROR);

            col_res = std::move(col_str);
            return true;
        }
        else
        {
            return false;
        }
    }

    static bool tryExecuteFixedString(const IColumn * col, ColumnPtr & col_res)
    {
        const ColumnFixedString * col_fstr_in = checkAndGetColumn<ColumnFixedString>(col);

        if (col_fstr_in)
        {
            auto col_str = ColumnString::create();
            ColumnString::Chars & out_vec = col_str->getChars();
            ColumnString::Offsets & out_offsets = col_str->getOffsets();

            const ColumnString::Chars & in_vec = col_fstr_in->getChars();

            size_t size = col_fstr_in->size();

            out_offsets.resize(size);
            out_vec.resize(in_vec.size() * 2 + size);

            char * begin = reinterpret_cast<char *>(out_vec.data());
            char * pos = begin;

            size_t n = col_fstr_in->getN();

            size_t prev_offset = 0;

            for (size_t i = 0; i < size; ++i)
            {
                size_t new_offset = prev_offset + n;

                executeOneString(&in_vec[prev_offset], &in_vec[new_offset], pos);

                out_offsets[i] = pos - begin;
                prev_offset = new_offset;
            }

            if (!out_offsets.empty() && out_offsets.back() != out_vec.size())
                throw Exception("Column size mismatch (internal logical error)", ErrorCodes::LOGICAL_ERROR);

            col_res = std::move(col_str);
            return true;
        }
        else
        {
            return false;
        }
    }

    bool useDefaultImplementationForConstants() const override { return true; }

    ColumnPtr executeImpl(const ColumnsWithTypeAndName & arguments, const DataTypePtr &, size_t /*input_rows_count*/) const override
    {
        const IColumn * column = arguments[0].column.get();
        ColumnPtr res_column;

        if (tryExecuteUInt<UInt8>(column, res_column) ||
            tryExecuteUInt<UInt16>(column, res_column) ||
            tryExecuteUInt<UInt32>(column, res_column) ||
            tryExecuteUInt<UInt64>(column, res_column) ||
            tryExecuteString(column, res_column) ||
            tryExecuteFixedString(column, res_column) ||
            tryExecuteFloat<Float32>(column, res_column) ||
            tryExecuteFloat<Float64>(column, res_column) ||
            tryExecuteDecimal<Decimal32>(column, res_column) ||
            tryExecuteDecimal<Decimal64>(column, res_column) ||
            tryExecuteDecimal<Decimal128>(column, res_column))
            return res_column;

        throw Exception("Illegal column " + arguments[0].column->getName()
            + " of argument of function " + getName(),
            ErrorCodes::ILLEGAL_COLUMN);
    }
};


class FunctionUnhex : public IFunction
struct UnhexImpl
{
public:
    static constexpr auto name = "unhex";
    static FunctionPtr create(ContextPtr) { return std::make_shared<FunctionUnhex>(); }
    static constexpr size_t word_size = 2;

    String getName() const override
    {
        return name;
    }

    size_t getNumberOfArguments() const override { return 1; }
    bool isInjective(const ColumnsWithTypeAndName &) const override { return true; }

    DataTypePtr getReturnTypeImpl(const DataTypes & arguments) const override
    {
        if (!isString(arguments[0]))
            throw Exception("Illegal type " + arguments[0]->getName() + " of argument of function " + getName(),
                ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT);

        return std::make_shared<DataTypeString>();
    }

    static void unhexOne(const char * pos, const char * end, char *& out)
    static void decode(const char * pos, const char * end, char *& out)
    {
        if ((end - pos) & 1)
        {
@@ -1271,61 +1338,139 @@ public:
        while (pos < end)
        {
            *out = unhex2(pos);
            pos += 2;
            pos += word_size;
            ++out;
        }
        *out = '\0';
        ++out;
    }
};

    bool useDefaultImplementationForConstants() const override { return true; }
struct BinImpl
{
    static constexpr auto name = "bin";
    static constexpr size_t word_size = 8;

    ColumnPtr executeImpl(const ColumnsWithTypeAndName & arguments, const DataTypePtr &, size_t /*input_rows_count*/) const override
    template <typename T>
    static void executeOneUInt(T x, char *& out)
    {
        const ColumnPtr & column = arguments[0].column;

        if (const ColumnString * col = checkAndGetColumn<ColumnString>(column.get()))
        bool was_nonzero = false;
        for (int offset = (sizeof(T) - 1) * 8; offset >= 0; offset -= 8)
        {
            auto col_res = ColumnString::create();
            UInt8 byte = x >> offset;

            ColumnString::Chars & out_vec = col_res->getChars();
            ColumnString::Offsets & out_offsets = col_res->getOffsets();
            /// Skip leading zeros
            if (byte == 0 && !was_nonzero && offset)
                continue;

            const ColumnString::Chars & in_vec = col->getChars();
            const ColumnString::Offsets & in_offsets = col->getOffsets();

            size_t size = in_offsets.size();
            out_offsets.resize(size);
            out_vec.resize(in_vec.size() / 2 + size);

            char * begin = reinterpret_cast<char *>(out_vec.data());
            char * pos = begin;
            size_t prev_offset = 0;

            for (size_t i = 0; i < size; ++i)
            {
                size_t new_offset = in_offsets[i];

                unhexOne(reinterpret_cast<const char *>(&in_vec[prev_offset]), reinterpret_cast<const char *>(&in_vec[new_offset - 1]), pos);

                out_offsets[i] = pos - begin;

                prev_offset = new_offset;
            }

            out_vec.resize(pos - begin);

            return col_res;
            was_nonzero = true;
            writeBinByte(byte, out);
            out += word_size;
        }
        else
        *out = '\0';
        ++out;
    }

    template <typename T>
    static void executeFloatAndDecimal(const T & in_vec, ColumnPtr & col_res, const size_t type_size_in_bytes)
    {
        const size_t hex_length = type_size_in_bytes * word_size + 1; /// Including trailing zero byte.
        auto col_str = ColumnString::create();

        ColumnString::Chars & out_vec = col_str->getChars();
        ColumnString::Offsets & out_offsets = col_str->getOffsets();

        size_t size = in_vec.size();
        out_offsets.resize(size);
        out_vec.resize(size * hex_length);

        size_t pos = 0;
        char * out = reinterpret_cast<char *>(out_vec.data());
        for (size_t i = 0; i < size; ++i)
        {
            throw Exception("Illegal column " + arguments[0].column->getName()
                + " of argument of function " + getName(),
                ErrorCodes::ILLEGAL_COLUMN);
            const UInt8 * in_pos = reinterpret_cast<const UInt8 *>(&in_vec[i]);
            executeOneString(in_pos, in_pos + type_size_in_bytes, out);

            pos += hex_length;
            out_offsets[i] = pos;
        }
        col_res = std::move(col_str);
    }

    static void executeOneString(const UInt8 * pos, const UInt8 * end, char *& out)
    {
        while (pos < end)
        {
            writeBinByte(*pos, out);
            ++pos;
            out += word_size;
        }
        *out = '\0';
        ++out;
    }
};

struct UnbinImpl
{
    static constexpr auto name = "unbin";
    static constexpr size_t word_size = 8;

    static void decode(const char * pos, const char * end, char *& out)
    {
        if (pos == end)
        {
            *out = '\0';
            ++out;
            return;
        }

        UInt8 left = 0;

        /// end - pos is the length of the input.
        /// (length & 7) is the number of leading bits that do not fill a whole byte;
        /// consuming them first leaves a remainder whose length is a multiple of 8.
        /// e.g. if the length is 9 and the input is "101000001":
        /// first left_cnt is 1 and left becomes 1 after consuming one bit (pos is 1),
        /// then left_cnt is 0 and the remaining input is "01000001".
        for (UInt8 left_cnt = (end - pos) & 7; left_cnt > 0; --left_cnt)
        {
            left = left << 1;
            if (*pos != '0')
                left += 1;
            ++pos;
        }

        if (left != 0 || end - pos == 0)
        {
            *out = left;
            ++out;
        }

        assert((end - pos) % 8 == 0);

        while (end - pos != 0)
        {
            UInt8 c = 0;
            for (UInt8 i = 0; i < 8; ++i)
            {
                c = c << 1;
                if (*pos != '0')
                    c += 1;
                ++pos;
            }
            *out = c;
            ++out;
        }

        *out = '\0';
        ++out;
    }
};
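decode consumes the (length & 7) leading bits into a partial first byte and then reads the rest in whole bytes, most significant bit first. A standalone sketch of that split; unlike the code above, which drops an all-zero partial byte, this version always emits it:

#include <cstdint>
#include <cstdio>
#include <string_view>

// Decode a string of '0'/'1' into bytes, MSB first. The first (length % 8)
// bits form a partial leading byte, exactly the (length & 7) split above.
void unbin(std::string_view bits)
{
    size_t head = bits.size() % 8;    // bits that do not fill a whole byte
    size_t i = 0;

    uint8_t left = 0;
    for (; i < head; ++i)
        left = (left << 1) | (bits[i] != '0');
    if (head)
        std::printf("%02X ", static_cast<unsigned>(left));    // partial leading byte

    for (; i < bits.size(); i += 8)    // remaining length is a multiple of 8
    {
        uint8_t c = 0;
        for (size_t j = 0; j < 8; ++j)
            c = (c << 1) | (bits[i + j] != '0');
        std::printf("%02X ", static_cast<unsigned>(c));
    }
    std::printf("\n");
}

int main()
{
    unbin("101000001");    // 9 bits -> 01 41
}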

using FunctionHex = EncodeToBinaryRepr<HexImpl>;
using FunctionUnhex = DecodeFromBinaryRepr<UnhexImpl>;
using FunctionBin = EncodeToBinaryRepr<BinImpl>;
using FunctionUnbin = DecodeFromBinaryRepr<UnbinImpl>;

class FunctionChar : public IFunction
{
public:

@@ -163,13 +163,6 @@ public:
            arguments[0]->getName(),
            getName());

        if (!WhichDataType(arguments[1]).isUInt64() &&
            !isTuple(arguments[1]))
            throw Exception(ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT,
                "Illegal type {} of second argument of function {}, must be UInt64 or tuple(...)",
                arguments[1]->getName(),
                getName());

        return std::make_shared<DataTypeUInt8>();
    }

@@ -189,8 +182,8 @@ public:
        auto dictionary_key_type = dictionary->getKeyType();

        const ColumnWithTypeAndName & key_column_with_type = arguments[1];
        const auto key_column = key_column_with_type.column;
        const auto key_column_type = WhichDataType(key_column_with_type.type);
        auto key_column = key_column_with_type.column;
        auto key_column_type = key_column_with_type.type;

        ColumnPtr range_col = nullptr;
        DataTypePtr range_col_type = nullptr;
@@ -214,7 +207,7 @@ public:

        if (dictionary_key_type == DictionaryKeyType::simple)
        {
            if (!key_column_type.isUInt64())
            if (!WhichDataType(key_column_type).isUInt64())
                throw Exception(
                    ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT,
                    "Second argument of function {} must be UInt64 when dictionary is simple. Actual type {}.",
@@ -225,24 +218,39 @@ public:
        }
        else if (dictionary_key_type == DictionaryKeyType::complex)
        {
            if (!key_column_type.isTuple())
                throw Exception(
                    ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT,
                    "Second argument of function {} must be tuple when dictionary is complex. Actual type {}.",
                    getName(),
                    key_column_with_type.type->getName());

            /// Functions in external dictionaries_loader only support full-value (not constant) columns with keys.
            ColumnPtr key_column_full = key_column->convertToFullColumnIfConst();
            key_column = key_column->convertToFullColumnIfConst();
            size_t keys_size = dictionary->getStructure().getKeysSize();

            const auto & key_columns = typeid_cast<const ColumnTuple &>(*key_column_full).getColumnsCopy();
            const auto & key_types = static_cast<const DataTypeTuple &>(*key_column_with_type.type).getElements();
            if (!isTuple(key_column_type))
            {
                if (keys_size > 1)
                {
                    throw Exception(
                        ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT,
                        "Second argument of function {} must be tuple when dictionary is complex and key contains more than 1 attribute."
                        " Actual type {}.",
                        getName(),
                        key_column_type->getName());
                }
                else
                {
                    Columns tuple_columns = {std::move(key_column)};
                    key_column = ColumnTuple::create(tuple_columns);

                    DataTypes tuple_types = {key_column_type};
                    key_column_type = std::make_shared<DataTypeTuple>(tuple_types);
                }
            }

            const auto & key_columns = assert_cast<const ColumnTuple &>(*key_column).getColumnsCopy();
            const auto & key_types = assert_cast<const DataTypeTuple &>(*key_column_type).getElements();

            return dictionary->hasKeys(key_columns, key_types);
        }
        else
        {
            if (!key_column_type.isUInt64())
            if (!WhichDataType(key_column_type).isUInt64())
                throw Exception(
                    ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT,
                    "Second argument of function {} must be UInt64 when dictionary is range. Actual type {}.",
@@ -346,13 +354,6 @@ public:
        Strings attribute_names = getAttributeNamesFromColumn(arguments[1].column, arguments[1].type);

        auto dictionary = helper.getDictionary(dictionary_name);

        if (!WhichDataType(arguments[2].type).isUInt64() && !isTuple(arguments[2].type))
            throw Exception(ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT,
                "Illegal type {} of third argument of function {}, must be UInt64 or tuple(...).",
                arguments[2].type->getName(),
                getName());

        auto dictionary_key_type = dictionary->getKeyType();

        size_t current_arguments_index = 3;
@@ -446,18 +447,35 @@ public:
        }
        else if (dictionary_key_type == DictionaryKeyType::complex)
        {
            if (!isTuple(key_col_with_type.type))
                throw Exception(
                    ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT,
                    "Third argument of function {} must be tuple when dictionary is complex. Actual type {}.",
                    getName(),
                    key_col_with_type.type->getName());

            /// Functions in external dictionaries_loader only support full-value (not constant) columns with keys.
            ColumnPtr key_column_full = key_col_with_type.column->convertToFullColumnIfConst();
            ColumnPtr key_column = key_col_with_type.column->convertToFullColumnIfConst();
            DataTypePtr key_column_type = key_col_with_type.type;

            const auto & key_columns = typeid_cast<const ColumnTuple &>(*key_column_full).getColumnsCopy();
            const auto & key_types = static_cast<const DataTypeTuple &>(*key_col_with_type.type).getElements();
            size_t keys_size = dictionary->getStructure().getKeysSize();

            if (!isTuple(key_column_type))
            {
                if (keys_size > 1)
                {
                    throw Exception(
                        ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT,
                        "Third argument of function {} must be tuple when dictionary is complex and key contains more than 1 attribute."
                        " Actual type {}.",
                        getName(),
                        key_col_with_type.type->getName());
                }
                else
                {
                    Columns tuple_columns = {std::move(key_column)};
                    key_column = ColumnTuple::create(tuple_columns);

                    DataTypes tuple_types = {key_column_type};
                    key_column_type = std::make_shared<DataTypeTuple>(tuple_types);
                }
            }

            const auto & key_columns = assert_cast<const ColumnTuple &>(*key_column).getColumnsCopy();
            const auto & key_types = assert_cast<const DataTypeTuple &>(*key_column_type).getElements();

            result = executeDictionaryRequest(
                dictionary,

@@ -607,6 +607,8 @@ public:
    }
};

template <typename JSONParser>
class JSONExtractRawImpl;

/// Nodes of the extract tree. We need the extract tree to extract from JSON complex values containing array, tuples or nullables.
template <typename JSONParser>
@@ -691,7 +693,10 @@ struct JSONExtractTree
    public:
        bool insertResultToColumn(IColumn & dest, const Element & element) override
        {
            return JSONExtractStringImpl<JSONParser>::insertResultToColumn(dest, element, {});
            if (element.isString())
                return JSONExtractStringImpl<JSONParser>::insertResultToColumn(dest, element, {});
            else
                return JSONExtractRawImpl<JSONParser>::insertResultToColumn(dest, element, {});
        }
    };

@@ -575,12 +575,12 @@ ColumnPtr FunctionAnyArityLogical<Impl, Name>::getConstantResultForNonConstArguments
    if constexpr (std::is_same_v<Impl, AndImpl>)
    {
        if (has_false_constant)
            result_type->createColumnConst(0, static_cast<UInt8>(false));
            result_column = result_type->createColumnConst(0, static_cast<UInt8>(false));
    }
    else if constexpr (std::is_same_v<Impl, OrImpl>)
    {
        if (has_true_constant)
            result_type->createColumnConst(0, static_cast<UInt8>(true));
            result_column = result_type->createColumnConst(0, static_cast<UInt8>(true));
    }

    return result_column;
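The fix above is the classic discarded-return-value bug: createColumnConst built the constant column and the result was thrown away instead of assigned. Marking such factories [[nodiscard]] makes the compiler flag the broken form; a minimal sketch with an invented create_column_const:

#include <memory>

struct Column {};

[[nodiscard]] std::shared_ptr<Column> create_column_const(int /*size*/, bool /*value*/)
{
    return std::make_shared<Column>();
}

int main()
{
    std::shared_ptr<Column> result;
    // create_column_const(0, false);           // compiles, but warns: result discarded
    result = create_column_const(0, false);     // the fix: actually keep the column
    return result ? 0 : 1;
}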

@@ -755,6 +755,7 @@ struct GenericValueSource : public ValueSourceImpl<GenericValueSource>
{
    using Slice = GenericValueSlice;
    using SinkType = GenericArraySink;
    using Column = IColumn;

    const IColumn * column;
    size_t total_rows;

@@ -358,6 +358,10 @@ public:
      */
    virtual bool useDefaultImplementationForConstants() const { return false; }

    /** Some arguments could remain constant during this implementation.
      */
    virtual ColumnNumbers getArgumentsThatAreAlwaysConstant() const { return {}; }

    /** If function arguments has single low cardinality column and all other arguments are constants, call function on nested column.
      * Otherwise, convert all low cardinality columns to ordinary columns.
      * Returns ColumnLowCardinality if at least one argument is ColumnLowCardinality.
@@ -367,10 +371,6 @@ public:
    /// If it isn't, will convert all ColumnLowCardinality arguments to full columns.
    virtual bool canBeExecutedOnLowCardinalityDictionary() const { return true; }

    /** Some arguments could remain constant during this implementation.
      */
    virtual ColumnNumbers getArgumentsThatAreAlwaysConstant() const { return {}; }

    /** True if function can be called on default arguments (include Nullable's) and won't throw.
      * Counterexample: modulo(0, 0)
      */

src/Functions/JSONPath/ASTs/ASTJSONPath.h (Normal file, 18 lines)
@@ -0,0 +1,18 @@
#pragma once

#include <Functions/JSONPath/ASTs/ASTJSONPathQuery.h>
#include <Parsers/IAST.h>

namespace DB
{
class ASTJSONPath : public IAST
{
public:
    String getID(char) const override { return "ASTJSONPath"; }

    ASTPtr clone() const override { return std::make_shared<ASTJSONPath>(*this); }

    ASTJSONPathQuery * jsonpath_query;
};

}

src/Functions/JSONPath/ASTs/ASTJSONPathMemberAccess.h (Normal file, 19 lines)
@@ -0,0 +1,19 @@
#pragma once

#include <Parsers/IAST.h>

namespace DB
{
class ASTJSONPathMemberAccess : public IAST
{
public:
    String getID(char) const override { return "ASTJSONPathMemberAccess"; }

    ASTPtr clone() const override { return std::make_shared<ASTJSONPathMemberAccess>(*this); }

public:
    /// Member name to lookup in json document (in path: $.some_key.another_key. ...)
    String member_name;
};

}
src/Functions/JSONPath/ASTs/ASTJSONPathQuery.h (Normal file, 15 lines)
@@ -0,0 +1,15 @@
#pragma once

#include <Parsers/IAST.h>

namespace DB
{
class ASTJSONPathQuery : public IAST
{
public:
    String getID(char) const override { return "ASTJSONPathQuery"; }

    ASTPtr clone() const override { return std::make_shared<ASTJSONPathQuery>(*this); }
};

}

src/Functions/JSONPath/ASTs/ASTJSONPathRange.h (Normal file, 23 lines)
@@ -0,0 +1,23 @@
#pragma once

#include <vector>
#include <Parsers/IAST.h>

namespace DB
{
class ASTJSONPathRange : public IAST
{
public:
    String getID(char) const override { return "ASTJSONPathRange"; }

    ASTPtr clone() const override { return std::make_shared<ASTJSONPathRange>(*this); }

public:
    /// Ranges to lookup in json array ($[0, 1, 2, 4 to 9])
    /// Range is represented as <start, end (non-inclusive)>
    /// Single index is represented as <start, start + 1>
    std::vector<std::pair<UInt32, UInt32>> ranges;
    bool is_star = false;
};
|
||||
|
||||
}
|
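To make the representation concrete (an inferred example, not taken from the commit's tests): assuming `to` bounds are inclusive in the query syntax, a path combining a single index and a range would be stored like this.

/// Hypothetical contents of `ranges` for the path $[0, 4 to 9]:
///   index 0      -> <0, 1>    (single index: start, start + 1)
///   range 4 to 9 -> <4, 10>   (end is non-inclusive)
std::vector<std::pair<UInt32, UInt32>> ranges = {{0, 1}, {4, 10}};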
src/Functions/JSONPath/ASTs/ASTJSONPathRoot.h (new file, 15 lines)
@@ -0,0 +1,15 @@
#pragma once

#include <Parsers/IAST.h>

namespace DB
{
class ASTJSONPathRoot : public IAST
{
public:
    String getID(char) const override { return "ASTJSONPathRoot"; }

    ASTPtr clone() const override { return std::make_shared<ASTJSONPathRoot>(*this); }
};

}
src/Functions/JSONPath/ASTs/ASTJSONPathStar.h (new file, 15 lines)
@@ -0,0 +1,15 @@
#pragma once

#include <Parsers/IAST.h>

namespace DB
{
class ASTJSONPathStar : public IAST
{
public:
    String getID(char) const override { return "ASTJSONPathStar"; }

    ASTPtr clone() const override { return std::make_shared<ASTJSONPathStar>(*this); }
};

}
src/Functions/JSONPath/CMakeLists.txt (new file, 13 lines)
@@ -0,0 +1,13 @@
include("${ClickHouse_SOURCE_DIR}/cmake/dbms_glob_sources.cmake")
add_headers_and_sources(clickhouse_functions_jsonpath Parsers)
add_headers_and_sources(clickhouse_functions_jsonpath ASTs)
add_headers_and_sources(clickhouse_functions_jsonpath Generator)
add_library(clickhouse_functions_jsonpath ${clickhouse_functions_jsonpath_sources} ${clickhouse_functions_jsonpath_headers})

target_link_libraries(clickhouse_functions_jsonpath PRIVATE dbms)
target_link_libraries(clickhouse_functions_jsonpath PRIVATE clickhouse_parsers)
target_link_libraries(clickhouse_functions PRIVATE clickhouse_functions_jsonpath)

if (STRIP_DEBUG_SYMBOLS_FUNCTIONS)
    target_compile_options(clickhouse_functions_jsonpath PRIVATE "-g0")
endif()
src/Functions/JSONPath/Generator/GeneratorJSONPath.h (new file, 128 lines)
@@ -0,0 +1,128 @@
#pragma once

#include <Functions/JSONPath/Generator/IGenerator.h>
#include <Functions/JSONPath/Generator/VisitorJSONPathMemberAccess.h>
#include <Functions/JSONPath/Generator/VisitorJSONPathRange.h>
#include <Functions/JSONPath/Generator/VisitorJSONPathRoot.h>
#include <Functions/JSONPath/Generator/VisitorJSONPathStar.h>
#include <Functions/JSONPath/Generator/VisitorStatus.h>

#include <Functions/JSONPath/ASTs/ASTJSONPath.h>


namespace DB
{
namespace ErrorCodes
{
    extern const int LOGICAL_ERROR;
}

template <typename JSONParser>
class GeneratorJSONPath : public IGenerator<JSONParser>
{
public:
    /**
     * Traverses children ASTs of ASTJSONPathQuery and creates a vector of corresponding visitors
     * @param query_ptr_ pointer to ASTJSONPathQuery
     */
    GeneratorJSONPath(ASTPtr query_ptr_)
    {
        query_ptr = query_ptr_;
        const auto * path = query_ptr->as<ASTJSONPath>();
        if (!path)
        {
            throw Exception("Invalid path", ErrorCodes::LOGICAL_ERROR);
        }
        const auto * query = path->jsonpath_query;

        for (auto child_ast : query->children)
        {
            if (typeid_cast<ASTJSONPathRoot *>(child_ast.get()))
            {
                visitors.push_back(std::make_shared<VisitorJSONPathRoot<JSONParser>>(child_ast));
            }
            else if (typeid_cast<ASTJSONPathMemberAccess *>(child_ast.get()))
            {
                visitors.push_back(std::make_shared<VisitorJSONPathMemberAccess<JSONParser>>(child_ast));
            }
            else if (typeid_cast<ASTJSONPathRange *>(child_ast.get()))
            {
                visitors.push_back(std::make_shared<VisitorJSONPathRange<JSONParser>>(child_ast));
            }
            else if (typeid_cast<ASTJSONPathStar *>(child_ast.get()))
            {
                visitors.push_back(std::make_shared<VisitorJSONPathStar<JSONParser>>(child_ast));
            }
        }
    }

    const char * getName() const override { return "GeneratorJSONPath"; }

    /**
     * This method exposes API of traversing all paths, described by JSONPath,
     * to SQLJSON Functions.
     * Expected usage is to iteratively call this method from inside the function
     * and to execute custom logic with received element or handle an error.
     * On each such call getNextItem will yield next item into element argument
     * and modify its internal state to prepare for next call.
     *
     * @param element root of JSON document
     * @return is the generator exhausted
     */
    VisitorStatus getNextItem(typename JSONParser::Element & element) override
    {
        while (true)
        {
            /// element passed to us actually is root, so here we assign current to root
            auto current = element;
            if (current_visitor < 0)
            {
                return VisitorStatus::Exhausted;
            }

            for (int i = 0; i < current_visitor; ++i)
            {
                visitors[i]->apply(current);
            }

            VisitorStatus status = VisitorStatus::Error;
            for (size_t i = current_visitor; i < visitors.size(); ++i)
            {
                status = visitors[i]->visit(current);
                current_visitor = i;
                if (status == VisitorStatus::Error || status == VisitorStatus::Ignore)
                {
                    break;
                }
            }
            updateVisitorsForNextRun();

            if (status != VisitorStatus::Ignore)
            {
                element = current;
                return status;
            }
        }
    }

private:
    bool updateVisitorsForNextRun()
    {
        while (current_visitor >= 0 && visitors[current_visitor]->isExhausted())
        {
            visitors[current_visitor]->reinitialize();
            current_visitor--;
        }
        if (current_visitor >= 0)
        {
            visitors[current_visitor]->updateState();
        }
        return current_visitor >= 0;
    }

    int current_visitor = 0;
    ASTPtr query_ptr;
    VisitorList<JSONParser> visitors;
};

}
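The comment on getNextItem above describes the intended call pattern from the SQLJSON functions. A minimal sketch of such a caller follows; the function name is invented, and the reset of `current` to the document root between calls reflects the note that the passed element is treated as the root.

/// Hypothetical caller of GeneratorJSONPath, for illustration only.
template <typename JSONParser>
size_t countMatches(ASTPtr query_ptr, const typename JSONParser::Element & root)
{
    GeneratorJSONPath<JSONParser> generator(query_ptr);
    typename JSONParser::Element current = root;
    size_t matches = 0;
    VisitorStatus status;
    while ((status = generator.getNextItem(current)) != VisitorStatus::Exhausted)
    {
        if (status == VisitorStatus::Ok)
            ++matches;      /// `current` now holds the matched element
        current = root;     /// getNextItem expects the document root on every call
    }
    return matches;
}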
src/Functions/JSONPath/Generator/IGenerator.h (new file, 29 lines)
@@ -0,0 +1,29 @@
#pragma once

#include <Functions/JSONPath/Generator/IGenerator_fwd.h>
#include <Functions/JSONPath/Generator/VisitorStatus.h>
#include <Parsers/IAST.h>

namespace DB
{

template <typename JSONParser>
class IGenerator
{
public:
    IGenerator() = default;

    virtual const char * getName() const = 0;

    /**
     * Used to yield next non-ignored element described by JSONPath query.
     *
     * @param element to be extracted into
     * @return true if generator is not exhausted
     */
    virtual VisitorStatus getNextItem(typename JSONParser::Element & element) = 0;

    virtual ~IGenerator() = default;
};

}
src/Functions/JSONPath/Generator/IGenerator_fwd.h (new file, 16 lines)
@@ -0,0 +1,16 @@
#pragma once

#include <Functions/JSONPath/Generator/IVisitor.h>

namespace DB
{
template <typename JSONParser>
class IGenerator;

template <typename JSONParser>
using IVisitorPtr = std::shared_ptr<IVisitor<JSONParser>>;

template <typename JSONParser>
using VisitorList = std::vector<IVisitorPtr<JSONParser>>;

}
src/Functions/JSONPath/Generator/IVisitor.h (new file, 46 lines)
@@ -0,0 +1,46 @@
#pragma once

#include <Functions/JSONPath/Generator/VisitorStatus.h>

namespace DB
{
template <typename JSONParser>
class IVisitor
{
public:
    virtual const char * getName() const = 0;

    /**
     * Applies this visitor to document and mutates its state
     * @param element simdjson element
     */
    virtual VisitorStatus visit(typename JSONParser::Element & element) = 0;

    /**
     * Applies this visitor to document, but does not mutate state
     * @param element simdjson element
     */
    virtual VisitorStatus apply(typename JSONParser::Element & element) const = 0;

    /**
     * Restores visitor's initial state for later use
     */
    virtual void reinitialize() = 0;

    virtual void updateState() = 0;

    bool isExhausted() { return is_exhausted; }

    void setExhausted(bool exhausted) { is_exhausted = exhausted; }

    virtual ~IVisitor() = default;

private:
    /**
     * This variable is for detecting whether a visitor's next visit will be able
     * to yield a new item.
     */
    bool is_exhausted = false;
};

}
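As a reading aid for this interface (a sketch, not one of the visitors added by this commit): the smallest useful visitor matches the current element exactly once and then reports itself exhausted, mirroring the visit/apply split described in the comments above.

/// Illustrative visitor: yields the current element itself, once.
template <typename JSONParser>
class VisitorIdentity : public IVisitor<JSONParser>
{
public:
    const char * getName() const override { return "VisitorIdentity"; }

    /// Non-mutating form: the element already is the match, so leave it as-is.
    VisitorStatus apply(typename JSONParser::Element & /*element*/) const override
    {
        return VisitorStatus::Ok;
    }

    /// Mutating form: mark the single match as consumed before yielding it.
    VisitorStatus visit(typename JSONParser::Element & element) override
    {
        this->setExhausted(true);
        return apply(element);
    }

    void reinitialize() override { this->setExhausted(false); }
    void updateState() override { }
};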
src/Functions/JSONPath/Generator/VisitorJSONPathMemberAccess.h (new file, 50 lines)
@@ -0,0 +1,50 @@
#pragma once

#include <Functions/JSONPath/ASTs/ASTJSONPathMemberAccess.h>
#include <Functions/JSONPath/Generator/IVisitor.h>
#include <Functions/JSONPath/Generator/VisitorStatus.h>

namespace DB
{
template <typename JSONParser>
class VisitorJSONPathMemberAccess : public IVisitor<JSONParser>
{
public:
    VisitorJSONPathMemberAccess(ASTPtr member_access_ptr_)
        : member_access_ptr(member_access_ptr_->as<ASTJSONPathMemberAccess>()) { }

    const char * getName() const override { return "VisitorJSONPathMemberAccess"; }

    VisitorStatus apply(typename JSONParser::Element & element) const override
    {
        typename JSONParser::Element result;
        element.getObject().find(std::string_view(member_access_ptr->member_name), result);
        element = result;
        return VisitorStatus::Ok;
    }

    VisitorStatus visit(typename JSONParser::Element & element) override
    {
        this->setExhausted(true);
        if (!element.isObject())
        {
            return VisitorStatus::Error;
        }
        typename JSONParser::Element result;
        if (!element.getObject().find(std::string_view(member_access_ptr->member_name), result))
        {
            return VisitorStatus::Error;
        }
        apply(element);
        return VisitorStatus::Ok;
    }

    void reinitialize() override { this->setExhausted(false); }

    void updateState() override { }

private:
    ASTJSONPathMemberAccess * member_access_ptr;
};

}
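Read together with the generator above, the member-access visitor works like this on a constructed example: given the document {"a": {"b": 42}} and the path $.a.b, the visitor for "a" finds the key and narrows the element to {"b": 42}, the visitor for "b" narrows it to 42, and both report Ok. Since each visitor marks itself exhausted after its single match, the next call to getNextItem unwinds through updateVisitorsForNextRun and returns Exhausted.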
Some files were not shown because too many files have changed in this diff.