Mirror of https://github.com/ClickHouse/ClickHouse.git (synced 2024-11-10 09:32:06 +00:00)
Merge branch 'master' of https://github.com/ClickHouse/ClickHouse into alexey-sm-DOCSUP-8179-remove_control_unicode
commit 05af425f83
166 CHANGELOG.md
@@ -1,3 +1,167 @@

## ClickHouse release 21.4

### ClickHouse release 21.4.1, 2021-04-02

#### Backward Incompatible Change

* The `toStartOfInterval` function now aligns hour intervals to midnight (in previous versions they were aligned to the start of the Unix epoch). For example, `toStartOfInterval(x, INTERVAL 11 HOUR)` will split every day into three intervals: `00:00:00..10:59:59`, `11:00:00..21:59:59` and `22:00:00..23:59:59`. This behaviour is better suited for practical needs. This closes [#9510](https://github.com/ClickHouse/ClickHouse/issues/9510). [#22060](https://github.com/ClickHouse/ClickHouse/pull/22060) ([alexey-milovidov](https://github.com/alexey-milovidov)).
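
  A minimal sketch of the new behaviour (hypothetical timestamp; the result follows the day-local buckets described above):

  ```sql
  SELECT toStartOfInterval(toDateTime('2021-04-01 12:30:00'), INTERVAL 11 HOUR);
  -- falls into the 11:00:00..21:59:59 bucket, so it returns 2021-04-01 11:00:00
  ```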
* Fix `cutToFirstSignificantSubdomainCustom()`/`firstSignificantSubdomainCustom()` returning a wrong result for 3+ level domains present in a custom top-level-domain list. For input domains matching these custom top-level domains, the third-level domain was considered to be the first significant one. This is now fixed. This change may introduce incompatibility if the function is used, e.g., in the sharding key. [#21946](https://github.com/ClickHouse/ClickHouse/pull/21946) ([Azat Khuzhin](https://github.com/azat)).
* Column `keys` in table `system.dictionaries` was replaced with columns `key.names` and `key.types`. The columns `key.names`, `key.types`, `attribute.names` and `attribute.types` in the `system.dictionaries` table do not require the dictionary to be loaded. [#21884](https://github.com/ClickHouse/ClickHouse/pull/21884) ([Maksim Kita](https://github.com/kitaisreal)).
* Now replicas that are processing the `ALTER TABLE ATTACH PART[ITION]` command search in their `detached/` folders before fetching the data from other replicas. As an implementation detail, a new command `ATTACH_PART` is introduced in the replicated log. Parts are searched for and compared by their checksums. [#18978](https://github.com/ClickHouse/ClickHouse/pull/18978) ([Mike Kot](https://github.com/myrrc)). **Note**:
  * `ATTACH PART[ITION]` queries may not work during a cluster upgrade.
  * It's not possible to roll back to an older ClickHouse version after executing an `ALTER ... ATTACH` query in the new version, as the old servers would fail to pass the `ATTACH_PART` entry in the replicated log.

#### New Feature

* Added function `dictGetOrNull`. It works like `dictGet`, but returns `NULL` if the key was not found in the dictionary. Closes [#22375](https://github.com/ClickHouse/ClickHouse/issues/22375). [#22413](https://github.com/ClickHouse/ClickHouse/pull/22413) ([Maksim Kita](https://github.com/kitaisreal)).
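
  A minimal sketch, assuming a hypothetical dictionary `my_dict` with a `UInt64` key and a `String` attribute `name`:

  ```sql
  SELECT dictGetOrNull('my_dict', 'name', toUInt64(42));
  -- returns the attribute value, or NULL if key 42 is absent
  ```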
* Added functions `dictGetChildren(dictionary, key)` and `dictGetDescendants(dictionary, key, level)`. Function `dictGetChildren` returns all children as an array of indexes; it is an inverse transformation for `dictGetHierarchy`. Function `dictGetDescendants` returns all descendants as if `dictGetChildren` was applied `level` times recursively. A zero `level` value is equivalent to infinity. Closes [#14656](https://github.com/ClickHouse/ClickHouse/issues/14656). [#22096](https://github.com/ClickHouse/ClickHouse/pull/22096) ([Maksim Kita](https://github.com/kitaisreal)).
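
  A sketch for a hypothetical hierarchical dictionary `regions_dict`:

  ```sql
  SELECT dictGetChildren('regions_dict', toUInt64(1));        -- direct children of key 1
  SELECT dictGetDescendants('regions_dict', toUInt64(1), 0);  -- all descendants; level 0 means no limit
  ```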
* Add `prefer_column_name_to_alias` setting to use original column names instead of aliases. It is needed to be more compatible with common databases' aliasing rules. This is for [#9715](https://github.com/ClickHouse/ClickHouse/issues/9715) and [#9887](https://github.com/ClickHouse/ClickHouse/issues/9887). [#22044](https://github.com/ClickHouse/ClickHouse/pull/22044) ([Amos Bird](https://github.com/amosbird)).
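
  A small illustration: with the setting enabled, the bare name inside the aggregates refers to the source column even though an alias with the same name exists.

  ```sql
  SET prefer_column_name_to_alias = 1;
  SELECT avg(number) AS number, max(number) FROM numbers(10);
  ```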
* Add function `timezoneOf` that returns the timezone name of `DateTime` or `DateTime64` data types. This does not close [#9959](https://github.com/ClickHouse/ClickHouse/issues/9959). Fix inconsistencies in function names: add the aliases `timezone` and `timeZone` as well as `toTimezone` and `toTimeZone` and `timezoneOf` and `timeZoneOf`. [#22001](https://github.com/ClickHouse/ClickHouse/pull/22001) ([alexey-milovidov](https://github.com/alexey-milovidov)).
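
  For example:

  ```sql
  SELECT timezoneOf(toDateTime('2021-04-01 00:00:00', 'UTC'));  -- 'UTC'
  SELECT timeZoneOf(now());  -- the same function through a case-variant alias
  ```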
* Added table function `dictionary`. It works the same way as the `Dictionary` engine. Closes [#21560](https://github.com/ClickHouse/ClickHouse/issues/21560). [#21910](https://github.com/ClickHouse/ClickHouse/pull/21910) ([Maksim Kita](https://github.com/kitaisreal)).
* Support `Nullable` type for `PolygonDictionary` attributes. [#21890](https://github.com/ClickHouse/ClickHouse/pull/21890) ([Maksim Kita](https://github.com/kitaisreal)).
* Functions `dictGet` and `dictHas` use the current database name if it is not specified for dictionaries created with DDL. Closes [#21632](https://github.com/ClickHouse/ClickHouse/issues/21632). [#21859](https://github.com/ClickHouse/ClickHouse/pull/21859) ([Maksim Kita](https://github.com/kitaisreal)).
* Add `ctime` option to `zookeeper-dump-tree`. It allows dumping the node creation time. [#21842](https://github.com/ClickHouse/ClickHouse/pull/21842) ([Ilya](https://github.com/HumanUser)).
* Add new optional clause `GRANTEES` for `CREATE/ALTER USER` commands. It specifies the users or roles which are allowed to receive grants from this user, on the condition that this user also has all the required access granted with grant option. By default `GRANTEES ANY` is used, which means a user with grant option can grant to anyone. Syntax: `CREATE USER ... GRANTEES {user | role | ANY | NONE} [,...] [EXCEPT {user | role} [,...]]`. [#21641](https://github.com/ClickHouse/ClickHouse/pull/21641) ([Vitaly Baranov](https://github.com/vitlibar)).
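
  For instance, with hypothetical user names: `john` may grant his privileges to anyone except `jack`.

  ```sql
  CREATE USER john GRANTEES ANY EXCEPT jack;
  ```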
* Add option `--backslash` for `clickhouse-format`, which can add a backslash at the end of each line of the formatted query. [#21494](https://github.com/ClickHouse/ClickHouse/pull/21494) ([flynn](https://github.com/ucasFL)).
* Add new column `slowdowns_count` to `system.clusters`. When using hedged requests, it shows how many times we switched to another replica because this replica was responding slowly. Also show the actual value of `errors_count` in `system.clusters`. [#21480](https://github.com/ClickHouse/ClickHouse/pull/21480) ([Kruglov Pavel](https://github.com/Avogar)).
* Add `_partition_id` virtual column for `MergeTree*` engines. Allow pruning partitions by `_partition_id`. Add `partitionID()` function to calculate the partition id string. [#21401](https://github.com/ClickHouse/ClickHouse/pull/21401) ([Amos Bird](https://github.com/amosbird)).
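
  A sketch, assuming a hypothetical table `events` with `PARTITION BY toYYYYMM(event_date)`, where partition ids look like `'202104'`:

  ```sql
  SELECT count() FROM events WHERE _partition_id = partitionID(202104);
  -- equivalent here to: WHERE _partition_id = '202104'
  ```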
* Add function `isIPAddressInRange` to test whether an IPv4 or IPv6 address is contained in a given CIDR network prefix. [#21329](https://github.com/ClickHouse/ClickHouse/pull/21329) ([PHO](https://github.com/depressed-pho)).
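
  For example:

  ```sql
  SELECT isIPAddressInRange('192.168.1.17', '192.168.1.0/24');  -- 1
  SELECT isIPAddressInRange('2001:db8::1', '2001:db8::/32');    -- 1
  ```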
* Added `ExecutablePool` dictionary source. Closes [#14528](https://github.com/ClickHouse/ClickHouse/issues/14528). [#21321](https://github.com/ClickHouse/ClickHouse/pull/21321) ([Maksim Kita](https://github.com/kitaisreal)).
* Added new SQL command `ALTER TABLE 'table_name' UNFREEZE [PARTITION 'part_expr'] WITH NAME 'backup_name'`. This command is needed to properly remove 'frozen' partitions from all disks. [#21142](https://github.com/ClickHouse/ClickHouse/pull/21142) ([Pavel Kovalenko](https://github.com/Jokser)).
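
  For example, removing the data frozen under a hypothetical backup name:

  ```sql
  ALTER TABLE events UNFREEZE WITH NAME 'backup_2021_04';
  ```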
* Added `Grant`, `Revoke` and `System` values of the `query_kind` column for corresponding queries in `system.query_log`. [#21102](https://github.com/ClickHouse/ClickHouse/pull/21102) ([Vasily Nemkov](https://github.com/Enmk)).
* Added async update in `ComplexKeyCache`, `SSDCache` and `SSDComplexKeyCache` dictionaries. Added support for the `Nullable` type in `Cache`, `ComplexKeyCache`, `SSDCache` and `SSDComplexKeyCache` dictionaries. Added support for fetching multiple attributes with the `dictGet` and `dictGetOrDefault` functions. Fixes [#21517](https://github.com/ClickHouse/ClickHouse/issues/21517). [#20595](https://github.com/ClickHouse/ClickHouse/pull/20595) ([Maksim Kita](https://github.com/kitaisreal)).
* Allow customizing timeouts for HTTP connections used for replication independently from other HTTP timeouts. [#20088](https://github.com/ClickHouse/ClickHouse/pull/20088) ([nvartolomei](https://github.com/nvartolomei)).
* Support implicit key type conversion for JOIN. [#19885](https://github.com/ClickHouse/ClickHouse/pull/19885) ([Vladimir](https://github.com/vdimir)).
* Support the `dictHas` function for `RangeHashedDictionary`. Fixes [#6680](https://github.com/ClickHouse/ClickHouse/issues/6680). [#19816](https://github.com/ClickHouse/ClickHouse/pull/19816) ([Maksim Kita](https://github.com/kitaisreal)).
* Zero-copy replication for `ReplicatedMergeTree` over S3 storage. [#16240](https://github.com/ClickHouse/ClickHouse/pull/16240) ([ianton-ru](https://github.com/ianton-ru)).
* Extended the range of `DateTime64` to properly support dates from year 1925 to 2283. Improved support of `DateTime` around the zero date (`1970-01-01`). [#9404](https://github.com/ClickHouse/ClickHouse/pull/9404) ([Vasily Nemkov](https://github.com/Enmk)).

#### Performance Improvement

* Enable reading with mmap IO for file ranges from 64 MiB (the setting `min_bytes_to_use_mmap_io`). It may lead to a moderate performance improvement. [#22326](https://github.com/ClickHouse/ClickHouse/pull/22326) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Add a cache for files read with the `min_bytes_to_use_mmap_io` setting. It gives a significant (2x and more) performance improvement when the value of the setting is small, by avoiding frequent mmap/munmap calls and the consequent page faults. Note that mmap IO has major drawbacks that make it less reliable in production (e.g. hangs or SIGBUS on faulty disks; less controllable memory usage). Nevertheless it is good in benchmarks. [#22206](https://github.com/ClickHouse/ClickHouse/pull/22206) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Avoid unnecessary data copy when using codec `NONE`. Please note that codec `NONE` is mostly useless: it's recommended to always use compression (`LZ4` by default). Despite the common belief, disabling compression may not improve performance (the opposite effect is possible). The `NONE` codec is useful in some cases: when data is incompressible, and for synthetic benchmarks. [#22145](https://github.com/ClickHouse/ClickHouse/pull/22145) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Faster `GROUP BY` with small `max_rows_to_group_by` and `group_by_overflow_mode='any'`. [#21856](https://github.com/ClickHouse/ClickHouse/pull/21856) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Optimize performance of queries like `SELECT ... FINAL ... WHERE`. In queries with `FINAL`, columns that are in the sorting key are now allowed to move to `PREWHERE`. [#21830](https://github.com/ClickHouse/ClickHouse/pull/21830) ([foolchi](https://github.com/foolchi)).
* Supported parallel formatting in `clickhouse-local` and everywhere else. [#21630](https://github.com/ClickHouse/ClickHouse/pull/21630) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* Improved performance by replacing `memcpy` with another implementation. This closes [#18583](https://github.com/ClickHouse/ClickHouse/issues/18583). [#21520](https://github.com/ClickHouse/ClickHouse/pull/21520) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Support parallel parsing for the `CSVWithNames` and `TSVWithNames` formats. This closes [#21085](https://github.com/ClickHouse/ClickHouse/issues/21085). [#21149](https://github.com/ClickHouse/ClickHouse/pull/21149) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).

#### Improvement

* Better exception message in the client in case of an exception while the server is writing blocks. In previous versions the client could get a misleading message like `Data compressed with different methods`. [#22427](https://github.com/ClickHouse/ClickHouse/pull/22427) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fix the error `Directory tmp_fetch_XXX already exists`, which could happen after a failed fetch of a part. Delete the temporary fetch directory if it already exists. Fixes [#14197](https://github.com/ClickHouse/ClickHouse/issues/14197). [#22411](https://github.com/ClickHouse/ClickHouse/pull/22411) ([nvartolomei](https://github.com/nvartolomei)).
* Fix MSan report for function `range` with `UInt256` argument (support for large integers is experimental). This closes [#22157](https://github.com/ClickHouse/ClickHouse/issues/22157). [#22387](https://github.com/ClickHouse/ClickHouse/pull/22387) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Add `current_database` column to the `system.processes` table. It contains the current database of the query. [#22365](https://github.com/ClickHouse/ClickHouse/pull/22365) ([Alexander Kuzmenkov](https://github.com/akuzm)).
* Add case-insensitive history search/navigation and subword movement features to `clickhouse-client`. [#22105](https://github.com/ClickHouse/ClickHouse/pull/22105) ([Amos Bird](https://github.com/amosbird)).
* Added the possibility to migrate an existing S3 disk to the schema with backup-restore capabilities. [#22070](https://github.com/ClickHouse/ClickHouse/pull/22070) ([Pavel Kovalenko](https://github.com/Jokser)).
* If a tuple of NULLs, e.g. `(NULL, NULL)`, is on the left-hand side of the `IN` operator with tuples of non-NULLs on the right-hand side, e.g. `SELECT (NULL, NULL) IN ((0, 0), (3, 1))`, return 0 instead of throwing an exception about incompatible types. The expression may also appear due to optimization of something like `SELECT (NULL, NULL) = (8, 0) OR (NULL, NULL) = (3, 2) OR (NULL, NULL) = (0, 0) OR (NULL, NULL) = (3, 1)`. This closes [#22017](https://github.com/ClickHouse/ClickHouse/issues/22017). [#22063](https://github.com/ClickHouse/ClickHouse/pull/22063) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Convert `system.errors.stack_trace` from `String` into `Array(UInt64)` (this should decrease the overhead of error collecting). [#22058](https://github.com/ClickHouse/ClickHouse/pull/22058) ([Azat Khuzhin](https://github.com/azat)).
* Update the used version of simdjson to 0.9.1. This fixes [#21984](https://github.com/ClickHouse/ClickHouse/issues/21984). [#22057](https://github.com/ClickHouse/ClickHouse/pull/22057) ([Vitaly Baranov](https://github.com/vitlibar)).
* Added case-insensitive aliases for the `CONNECTION_ID()` and `VERSION()` functions. This fixes [#22028](https://github.com/ClickHouse/ClickHouse/issues/22028). [#22042](https://github.com/ClickHouse/ClickHouse/pull/22042) ([Eugene Klimov](https://github.com/Slach)).
* Add option `strict_increase` to the `windowFunnel` function to calculate each event only once (resolves [#21835](https://github.com/ClickHouse/ClickHouse/issues/21835)). [#22025](https://github.com/ClickHouse/ClickHouse/pull/22025) ([Vladimir](https://github.com/vdimir)).
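
  A sketch over a hypothetical `events` table with columns `ts` and `event`:

  ```sql
  SELECT windowFunnel(3600, 'strict_increase')(ts, event = 'view', event = 'click')
  FROM events;
  ```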
* If the partition key of a `MergeTree` table does not include `Date` or `DateTime` columns but includes exactly one `DateTime64` column, expose its values in the `min_time` and `max_time` columns in the `system.parts` and `system.parts_columns` tables. Add `min_time` and `max_time` columns to the `system.parts_columns` table (this was an inconsistency with the `system.parts` table). This closes [#18244](https://github.com/ClickHouse/ClickHouse/issues/18244). [#22011](https://github.com/ClickHouse/ClickHouse/pull/22011) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Supported the `replication_alter_partitions_sync=1` setting for moving partitions from a helping table to the destination. Decreased default timeouts. Fixes [#21911](https://github.com/ClickHouse/ClickHouse/issues/21911). [#21912](https://github.com/ClickHouse/ClickHouse/pull/21912) ([turbo jason](https://github.com/songenjie)).
* Show the path to the data directory of `EmbeddedRocksDB` tables in system tables. [#21903](https://github.com/ClickHouse/ClickHouse/pull/21903) ([tavplubix](https://github.com/tavplubix)).
* Support the `RANGE OFFSET` frame for floating point types. Implement the `lagInFrame`/`leadInFrame` window functions, which are analogous to `lag`/`lead`, but respect the window frame. They are identical when the frame is `between unbounded preceding and unbounded following`. This closes [#5485](https://github.com/ClickHouse/ClickHouse/issues/5485). [#21895](https://github.com/ClickHouse/ClickHouse/pull/21895) ([Alexander Kuzmenkov](https://github.com/akuzm)).
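
  A minimal sketch (`t` is a hypothetical table with `time` and `value` columns; window functions were still experimental in this release):

  ```sql
  SET allow_experimental_window_functions = 1;
  SELECT
      value,
      lagInFrame(value)  OVER (ORDER BY time ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS prev,
      leadInFrame(value) OVER (ORDER BY time ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS next
  FROM t;
  ```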
* Add profile event `HedgedRequestsChangeReplica`; change the read data timeout from seconds to milliseconds. [#21886](https://github.com/ClickHouse/ClickHouse/pull/21886) ([Kruglov Pavel](https://github.com/Avogar)).
* Add a connection pool for the PostgreSQL table/database engine and dictionary source. Should fix [#21444](https://github.com/ClickHouse/ClickHouse/issues/21444). [#21839](https://github.com/ClickHouse/ClickHouse/pull/21839) ([Kseniia Sumarokova](https://github.com/kssenii)).
* DiskS3 (experimental feature under development). Fixed a bug with the impossibility to move a directory if the destination is not empty and a cache disk is used. [#21837](https://github.com/ClickHouse/ClickHouse/pull/21837) ([Pavel Kovalenko](https://github.com/Jokser)).
* Better formatting for `Array` and `Map` data types in the Web UI. [#21798](https://github.com/ClickHouse/ClickHouse/pull/21798) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Support non-default table schema for the postgres storage/table-function. Closes [#21701](https://github.com/ClickHouse/ClickHouse/issues/21701). [#21711](https://github.com/ClickHouse/ClickHouse/pull/21711) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Support replica priorities for the postgres dictionary source. [#21710](https://github.com/ClickHouse/ClickHouse/pull/21710) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Update clusters only if their configurations were updated. [#21685](https://github.com/ClickHouse/ClickHouse/pull/21685) ([Kruglov Pavel](https://github.com/Avogar)).
* Propagate query and session settings for distributed DDL queries. Set `distributed_ddl_entry_format_version` to 2 to enable this. Added the `distributed_ddl_output_mode` setting. Supported modes: `none`, `throw` (default), `null_status_on_timeout` and `never_throw`. Miscellaneous fixes and improvements for the `Replicated` database engine. [#21535](https://github.com/ClickHouse/ClickHouse/pull/21535) ([tavplubix](https://github.com/tavplubix)).
* If `PODArray` was instantiated with an element size that is neither a fraction nor a multiple of 16, a buffer overflow was possible. No bugs in current releases exist. [#21533](https://github.com/ClickHouse/ClickHouse/pull/21533) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Add `last_error_time`/`last_error_message`/`last_error_stacktrace`/`remote` columns to `system.errors`. [#21529](https://github.com/ClickHouse/ClickHouse/pull/21529) ([Azat Khuzhin](https://github.com/azat)).
* Add aliases `simpleJSONExtract/simpleJSONHas` for `visitParam/visitParamExtract{UInt, Int, Bool, Float, Raw, String}`. Fixes #21383. [#21519](https://github.com/ClickHouse/ClickHouse/pull/21519) ([fastio](https://github.com/fastio)).
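
  For example:

  ```sql
  SELECT simpleJSONHas('{"a":1}', 'a');          -- 1, same as visitParamHas
  SELECT simpleJSONExtractUInt('{"a":1}', 'a');  -- 1, same as visitParamExtractUInt
  ```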
* Add setting `optimize_skip_unused_shards_limit` to limit the number of sharding key values for `optimize_skip_unused_shards`. [#21512](https://github.com/ClickHouse/ClickHouse/pull/21512) ([Azat Khuzhin](https://github.com/azat)).
* `Age` and `Precision` in graphite rollup configs should increase from retention to retention. Now this is checked, and a wrong config raises an exception. [#21496](https://github.com/ClickHouse/ClickHouse/pull/21496) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Improve `clickhouse-format` to not throw an exception when there are extra spaces or a comment after the last query, and to throw an exception early, with a readable message, when formatting an `ASTInsertQuery` with data. [#21311](https://github.com/ClickHouse/ClickHouse/pull/21311) ([flynn](https://github.com/ucasFL)).
* Improve support of integer keys in data type `Map`. [#21157](https://github.com/ClickHouse/ClickHouse/pull/21157) ([Anton Popov](https://github.com/CurtizJ)).
* MaterializeMySQL: Attempt to reconnect to MySQL if the connection is lost. [#20961](https://github.com/ClickHouse/ClickHouse/pull/20961) ([Håvard Kvålen](https://github.com/havardk)).
* Support more cases of rewriting `CROSS JOIN` to `INNER JOIN`. [#20392](https://github.com/ClickHouse/ClickHouse/pull/20392) ([Vladimir](https://github.com/vdimir)).
* Do not create empty parts on INSERT when the `optimize_on_insert` setting is enabled. Fixes [#20304](https://github.com/ClickHouse/ClickHouse/issues/20304). [#20387](https://github.com/ClickHouse/ClickHouse/pull/20387) ([Kruglov Pavel](https://github.com/Avogar)).
* `MaterializeMySQL`: add a minmax skipping index for the `_version` column. [#20382](https://github.com/ClickHouse/ClickHouse/pull/20382) ([Stig Bakken](https://github.com/stigsb)).
* Improve performance of aggregation in order of sorting key (with the setting `optimize_aggregation_in_order` enabled). [#19401](https://github.com/ClickHouse/ClickHouse/pull/19401) ([Anton Popov](https://github.com/CurtizJ)).
* Introduce a new merge tree setting `min_bytes_to_rebalance_partition_over_jbod` which allows assigning new parts to different disks of a JBOD volume in a balanced way. [#16481](https://github.com/ClickHouse/ClickHouse/pull/16481) ([Amos Bird](https://github.com/amosbird)).

#### Bug Fix

* Remove the socket from epoll before cancelling the packet receiver in `HedgedConnections` to prevent a possible race. I hope it fixes [#22161](https://github.com/ClickHouse/ClickHouse/issues/22161). [#22443](https://github.com/ClickHouse/ClickHouse/pull/22443) ([Kruglov Pavel](https://github.com/Avogar)).
* Add (missing) memory accounting in parallel parsing routines. In previous versions OOM was possible when the result set contains very large blocks of data. This closes [#22008](https://github.com/ClickHouse/ClickHouse/issues/22008). [#22425](https://github.com/ClickHouse/ClickHouse/pull/22425) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed a bug in S3 zero-copy replication for hybrid storage. [#22378](https://github.com/ClickHouse/ClickHouse/pull/22378) ([ianton-ru](https://github.com/ianton-ru)).
* Now ClickHouse will not throw a `LOGICAL_ERROR` exception when we try to mutate an already covered part. Fixes [#22013](https://github.com/ClickHouse/ClickHouse/issues/22013). [#22291](https://github.com/ClickHouse/ClickHouse/pull/22291) ([alesapin](https://github.com/alesapin)).
* Fix an exception which may happen when a `SELECT` has a constant `WHERE` condition and the source table has columns whose names are digits. [#22270](https://github.com/ClickHouse/ClickHouse/pull/22270) ([LiuNeng](https://github.com/liuneng1994)).
* Fix query cancellation with `use_hedged_requests=0` and `async_socket_for_remote=1`. [#22183](https://github.com/ClickHouse/ClickHouse/pull/22183) ([Azat Khuzhin](https://github.com/azat)).
* Fix an uncaught exception in `InterserverIOHTTPHandler`. [#22146](https://github.com/ClickHouse/ClickHouse/pull/22146) ([Azat Khuzhin](https://github.com/azat)).
* Fix the docker entrypoint in case `http_port` is not in the config. [#22132](https://github.com/ClickHouse/ClickHouse/pull/22132) ([Ewout](https://github.com/devwout)).
* Fix the error `Invalid number of rows in Chunk` in `JOIN` with `TOTALS` and `arrayJoin`. Closes [#19303](https://github.com/ClickHouse/ClickHouse/issues/19303). [#22129](https://github.com/ClickHouse/ClickHouse/pull/22129) ([Vladimir](https://github.com/vdimir)).
* Fix the name of the background thread pool used to poll messages from Kafka. A Kafka engine with the broken thread pool would not consume messages from the message queue. [#22122](https://github.com/ClickHouse/ClickHouse/pull/22122) ([fastio](https://github.com/fastio)).
* Fix waiting for `OPTIMIZE` and `ALTER` queries for `ReplicatedMergeTree` table engines. Now the query will not hang when the table was detached or restarted. [#22118](https://github.com/ClickHouse/ClickHouse/pull/22118) ([alesapin](https://github.com/alesapin)).
* Disable `async_socket_for_remote`/`use_hedged_requests` for buggy Linux kernels. [#22109](https://github.com/ClickHouse/ClickHouse/pull/22109) ([Azat Khuzhin](https://github.com/azat)).
* Docker entrypoint: avoid chown of `.` in case when `LOG_PATH` is empty. Closes [#22100](https://github.com/ClickHouse/ClickHouse/issues/22100). [#22102](https://github.com/ClickHouse/ClickHouse/pull/22102) ([filimonov](https://github.com/filimonov)).
* The function `decrypt` was lacking a check for the minimal size of data encrypted in `AEAD` mode. This closes [#21897](https://github.com/ClickHouse/ClickHouse/issues/21897). [#22064](https://github.com/ClickHouse/ClickHouse/pull/22064) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* In a rare case, a merge for `CollapsingMergeTree` may create a granule with `index_granularity + 1` rows. Because of this, an internal check added in [#18928](https://github.com/ClickHouse/ClickHouse/issues/18928) (affects 21.2 and 21.3) may fail with the error `Incomplete granules are not allowed while blocks are granules size`. This error did not allow parts to merge. [#21976](https://github.com/ClickHouse/ClickHouse/pull/21976) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Reverted [#15454](https://github.com/ClickHouse/ClickHouse/issues/15454), which may cause a significant increase in memory usage while loading external dictionaries of hashed type. This closes [#21935](https://github.com/ClickHouse/ClickHouse/issues/21935). [#21948](https://github.com/ClickHouse/ClickHouse/pull/21948) ([Maksim Kita](https://github.com/kitaisreal)).
* Prevent hedged connection overlaps (`Unknown packet 9 from server` error). [#21941](https://github.com/ClickHouse/ClickHouse/pull/21941) ([Azat Khuzhin](https://github.com/azat)).
* Fix reading an HTTP POST request with "multipart/form-data" content type. [#21936](https://github.com/ClickHouse/ClickHouse/pull/21936) ([Ivan](https://github.com/abyss7)).
* Fix wrong `ORDER BY` results when a query contains window functions and the optimization for reading in primary key order is applied. Fixes [#21828](https://github.com/ClickHouse/ClickHouse/issues/21828). [#21915](https://github.com/ClickHouse/ClickHouse/pull/21915) ([Alexander Kuzmenkov](https://github.com/akuzm)).
* Fix a deadlock in the first catboost model execution. Closes [#13832](https://github.com/ClickHouse/ClickHouse/issues/13832). [#21844](https://github.com/ClickHouse/ClickHouse/pull/21844) ([Kruglov Pavel](https://github.com/Avogar)).
* Fix an incorrect query result (and possible crash) which could happen when a `WHERE` or `HAVING` condition is pushed before `GROUP BY`. Fixes [#21773](https://github.com/ClickHouse/ClickHouse/issues/21773). [#21841](https://github.com/ClickHouse/ClickHouse/pull/21841) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Better error handling and logging in `WriteBufferFromS3`. [#21836](https://github.com/ClickHouse/ClickHouse/pull/21836) ([Pavel Kovalenko](https://github.com/Jokser)).
* Fix possible crashes in aggregate functions with the combinator `Distinct` while using two-level aggregation. This is a follow-up fix of [#18365](https://github.com/ClickHouse/ClickHouse/pull/18365). It could only be reproduced in a production environment. [#21818](https://github.com/ClickHouse/ClickHouse/pull/21818) ([Amos Bird](https://github.com/amosbird)).
* Fix scalar subquery index analysis. This fixes [#21717](https://github.com/ClickHouse/ClickHouse/issues/21717), which was introduced in [#18896](https://github.com/ClickHouse/ClickHouse/pull/18896). [#21766](https://github.com/ClickHouse/ClickHouse/pull/21766) ([Amos Bird](https://github.com/amosbird)).
* Fix a bug for `ReplicatedMergeTree` table engines when an `ALTER MODIFY COLUMN` query doesn't change the type of a decimal column if its size (32 bit or 64 bit) doesn't change. [#21728](https://github.com/ClickHouse/ClickHouse/pull/21728) ([alesapin](https://github.com/alesapin)).
* Fix concurrent `OPTIMIZE` and `DROP` for `ReplicatedMergeTree`. [#21716](https://github.com/ClickHouse/ClickHouse/pull/21716) ([Azat Khuzhin](https://github.com/azat)).
* Fix function `arrayElement` with type `Map` for constant integer arguments. [#21699](https://github.com/ClickHouse/ClickHouse/pull/21699) ([Anton Popov](https://github.com/CurtizJ)).
* Fix SIGSEGV on non-existing attributes from `ip_trie` with `access_to_key_from_attributes`. [#21692](https://github.com/ClickHouse/ClickHouse/pull/21692) ([Azat Khuzhin](https://github.com/azat)).
* The server now starts accepting connections only after `DDLWorker` and dictionaries initialization. [#21676](https://github.com/ClickHouse/ClickHouse/pull/21676) ([Azat Khuzhin](https://github.com/azat)).
* Add type conversion for `StorageJoin` keys (previously this led to SIGSEGV). [#21646](https://github.com/ClickHouse/ClickHouse/pull/21646) ([Azat Khuzhin](https://github.com/azat)).
* Fix distributed request cancellation (for example, a simple select from multiple shards with a limit, i.e. `select * from remote('127.{2,3}', system.numbers) limit 100`) with `async_socket_for_remote=1`. [#21643](https://github.com/ClickHouse/ClickHouse/pull/21643) ([Azat Khuzhin](https://github.com/azat)).
* Fix `fsync_part_directory` for horizontal merge. [#21642](https://github.com/ClickHouse/ClickHouse/pull/21642) ([Azat Khuzhin](https://github.com/azat)).
* Remove unknown columns from the joined table in `WHERE` for queries to external database engines (MySQL, PostgreSQL). Closes [#14614](https://github.com/ClickHouse/ClickHouse/issues/14614), closes [#19288](https://github.com/ClickHouse/ClickHouse/issues/19288) (dup), closes [#19645](https://github.com/ClickHouse/ClickHouse/issues/19645) (dup). [#21640](https://github.com/ClickHouse/ClickHouse/pull/21640) ([Vladimir](https://github.com/vdimir)).
* `std::terminate` was called if there was an error writing data into S3. [#21624](https://github.com/ClickHouse/ClickHouse/pull/21624) ([Vladimir](https://github.com/vdimir)).
* Fix the possible error `Cannot find column` when `optimize_skip_unused_shards` is enabled and zero shards are used. [#21579](https://github.com/ClickHouse/ClickHouse/pull/21579) ([Azat Khuzhin](https://github.com/azat)).
* If a query has a constant `WHERE` condition and the setting `optimize_skip_unused_shards` is enabled, all shards may be skipped and the query could return an incorrect empty result. [#21550](https://github.com/ClickHouse/ClickHouse/pull/21550) ([Amos Bird](https://github.com/amosbird)).
* Fix an incorrect `fd_ready` assignment in NuKeeperTCPHandler. [#21544](https://github.com/ClickHouse/ClickHouse/pull/21544) ([小路](https://github.com/nicelulu)).
* Fix table function `clusterAllReplicas` returning a wrong `_shard_num`. Closes [#21481](https://github.com/ClickHouse/ClickHouse/issues/21481). [#21498](https://github.com/ClickHouse/ClickHouse/pull/21498) ([flynn](https://github.com/ucasFL)).
* Fix S3 tables holding old credentials after a config update. [#21457](https://github.com/ClickHouse/ClickHouse/pull/21457) ([Grigory Pervakov](https://github.com/GrigoryPervakov)).
* Fixed a race on the SSL object inside `SecureSocket` in Poco. [#21456](https://github.com/ClickHouse/ClickHouse/pull/21456) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* Fix `Avro` format parsing for `Kafka`. Fixes [#21437](https://github.com/ClickHouse/ClickHouse/issues/21437). [#21438](https://github.com/ClickHouse/ClickHouse/pull/21438) ([Ilya Golshtein](https://github.com/ilejn)).
* Fix receive and send timeouts and non-blocking read in secure sockets. [#21429](https://github.com/ClickHouse/ClickHouse/pull/21429) ([Kruglov Pavel](https://github.com/Avogar)).
* Fix the official website documents which introduced the cluster secret feature. [#21331](https://github.com/ClickHouse/ClickHouse/pull/21331) ([Chao Ma](https://github.com/godliness)).
* The `force_drop_table` flag didn't work for `MATERIALIZED VIEW`; it's fixed. Fixes [#18943](https://github.com/ClickHouse/ClickHouse/issues/18943). [#20626](https://github.com/ClickHouse/ClickHouse/pull/20626) ([tavplubix](https://github.com/tavplubix)).
* Fix name clashes in `PredicateRewriteVisitor`. They caused incorrect `WHERE` filtration after a full join. Closes [#20497](https://github.com/ClickHouse/ClickHouse/issues/20497). [#20622](https://github.com/ClickHouse/ClickHouse/pull/20622) ([Vladimir](https://github.com/vdimir)).
* Fixed the open behavior of the remote host filter when the `remote_url_allow_hosts` section is present in the configuration but contains no entries. [#20058](https://github.com/ClickHouse/ClickHouse/pull/20058) ([Vladimir Chebotarev](https://github.com/excitoon)).

#### Build/Testing/Packaging Improvement

* Enable the bundled openldap on `ppc64le`. [#22487](https://github.com/ClickHouse/ClickHouse/pull/22487) ([Kfir Itzhak](https://github.com/mastertheknife)).
* Re-enable the S3 (AWS) library on `aarch64`. [#22484](https://github.com/ClickHouse/ClickHouse/pull/22484) ([Kfir Itzhak](https://github.com/mastertheknife)).
* Enable compiling on `ppc64le` with Clang. [#22476](https://github.com/ClickHouse/ClickHouse/pull/22476) ([Kfir Itzhak](https://github.com/mastertheknife)).
* Fix compiling boost on `ppc64le`. [#22474](https://github.com/ClickHouse/ClickHouse/pull/22474) ([Kfir Itzhak](https://github.com/mastertheknife)).
* Fix a CMake error about the internal CMake variable `CMAKE_ASM_COMPILE_OBJECT` not being set on `ppc64le`. [#22469](https://github.com/ClickHouse/ClickHouse/pull/22469) ([Kfir Itzhak](https://github.com/mastertheknife)).
* Fix Fedora/RHEL/CentOS not finding `libclang_rt.builtins` on `ppc64le`. [#22458](https://github.com/ClickHouse/ClickHouse/pull/22458) ([Kfir Itzhak](https://github.com/mastertheknife)).
* Enable building with `jemalloc` on `ppc64le`. [#22447](https://github.com/ClickHouse/ClickHouse/pull/22447) ([Kfir Itzhak](https://github.com/mastertheknife)).
* Fix ClickHouse's config embedding and cctz's timezone embedding on `ppc64le`. [#22445](https://github.com/ClickHouse/ClickHouse/pull/22445) ([Kfir Itzhak](https://github.com/mastertheknife)).
* Fixed compiling on `ppc64le` and use the correct instruction pointer register on `ppc64le`. [#22430](https://github.com/ClickHouse/ClickHouse/pull/22430) ([Kfir Itzhak](https://github.com/mastertheknife)).
* Added a way to check memory info for the RBAC testflows tests. [#22403](https://github.com/ClickHouse/ClickHouse/pull/22403) ([MyroTk](https://github.com/MyroTk)).
* Fix a test with MaterializeMySQL: MySQL is started only once with the MaterializeMySQL integration test. Fixes [#22289](https://github.com/ClickHouse/ClickHouse/issues/22289). [#22341](https://github.com/ClickHouse/ClickHouse/pull/22341) ([Winter Zhang](https://github.com/zhang2014)).
* Run stateless tests in parallel in CI. Depends on [#22181](https://github.com/ClickHouse/ClickHouse/issues/22181). [#22300](https://github.com/ClickHouse/ClickHouse/pull/22300) ([alesapin](https://github.com/alesapin)).
* Enable the status check for the SQLancer CI run. [#22015](https://github.com/ClickHouse/ClickHouse/pull/22015) ([Ilya Yatsishin](https://github.com/qoega)).
* Add `tzdata` to Docker containers because reading `ORC` formats requires it. This closes [#14156](https://github.com/ClickHouse/ClickHouse/issues/14156). [#22000](https://github.com/ClickHouse/ClickHouse/pull/22000) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Introduce 2 arguments for the `clickhouse-server` image Dockerfile: `deb_location` & `single_binary_location`. [#21977](https://github.com/ClickHouse/ClickHouse/pull/21977) ([filimonov](https://github.com/filimonov)).
* Allow using clang-tidy with release builds by enabling assertions if it is used. [#21914](https://github.com/ClickHouse/ClickHouse/pull/21914) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Remove `decode` method usage for python3. [#21832](https://github.com/ClickHouse/ClickHouse/pull/21832) ([kevin wan](https://github.com/MaxWk)).
* Add [jepsen](https://github.com/jepsen-io/jepsen) tests for NuKeeper. [#21677](https://github.com/ClickHouse/ClickHouse/pull/21677) ([alesapin](https://github.com/alesapin)).
* Updating TestFlows to 1.6.74. [#21673](https://github.com/ClickHouse/ClickHouse/pull/21673) ([vzakaznikov](https://github.com/vzakaznikov)).
* Add llvm-12 binary names to search in cmake scripts. Implicit constant conversions to mute clang warnings. Updated submodules to build with CMake 3.19. Mute recursion in macro expansion in the `readpassphrase` library. The deprecated `-fuse-ld` changed to `--ld-path` for clang. [#21597](https://github.com/ClickHouse/ClickHouse/pull/21597) ([Ilya Yatsishin](https://github.com/qoega)).
* Updating `docker/test/testflows/runner/dockerd-entrypoint.sh` to use the Yandex dockerhub-proxy. [#21551](https://github.com/ClickHouse/ClickHouse/pull/21551) ([vzakaznikov](https://github.com/vzakaznikov)).
* Fixing the LDAP authentication performance test by removing an assertion. [#21507](https://github.com/ClickHouse/ClickHouse/pull/21507) ([vzakaznikov](https://github.com/vzakaznikov)).
* Added `ALL` and `NONE` privilege tests. [#21354](https://github.com/ClickHouse/ClickHouse/pull/21354) ([MyroTk](https://github.com/MyroTk)).
* Fix the macOS shared lib build. [#20184](https://github.com/ClickHouse/ClickHouse/pull/20184) ([nvartolomei](https://github.com/nvartolomei)).

## ClickHouse release 21.3 (LTS)

### ClickHouse release v21.3, 2021-03-12

@@ -26,7 +190,7 @@

#### Experimental feature

* Add experimental `Replicated` database engine. It replicates DDL queries across multiple hosts. [#16193](https://github.com/ClickHouse/ClickHouse/pull/16193) ([tavplubix](https://github.com/tavplubix)).
* Introduce experimental support for window functions, enabled with `allow_experimental_functions = 1`. This is a preliminary, alpha-quality implementation that is not suitable for production use and will change in backward-incompatible ways in future releases. Please see [the documentation](https://github.com/ClickHouse/ClickHouse/blob/master/docs/en/sql-reference/window-functions/index.md#experimental-window-functions) for the list of supported features. [#20337](https://github.com/ClickHouse/ClickHouse/pull/20337) ([Alexander Kuzmenkov](https://github.com/akuzm)).
* Introduce experimental support for window functions, enabled with `allow_experimental_window_functions = 1`. This is a preliminary, alpha-quality implementation that is not suitable for production use and will change in backward-incompatible ways in future releases. Please see [the documentation](https://github.com/ClickHouse/ClickHouse/blob/master/docs/en/sql-reference/window-functions/index.md#experimental-window-functions) for the list of supported features. [#20337](https://github.com/ClickHouse/ClickHouse/pull/20337) ([Alexander Kuzmenkov](https://github.com/akuzm)).
* Add the ability to backup/restore metadata files for DiskS3. [#18377](https://github.com/ClickHouse/ClickHouse/pull/18377) ([Pavel Kovalenko](https://github.com/Jokser)).

#### Performance Improvement

@@ -39,6 +39,8 @@ else()
    set(RECONFIGURE_MESSAGE_LEVEL STATUS)
endif()

enable_language(C CXX ASM)

include (cmake/arch.cmake)
include (cmake/target.cmake)
include (cmake/tools.cmake)
@@ -462,6 +464,7 @@ find_contrib_lib(double-conversion) # Must be before parquet
include (cmake/find/ssl.cmake)
include (cmake/find/ldap.cmake) # after ssl
include (cmake/find/icu.cmake)
include (cmake/find/xz.cmake)
include (cmake/find/zlib.cmake)
include (cmake/find/zstd.cmake)
include (cmake/find/ltdl.cmake) # for odbc
@@ -8,6 +8,7 @@ add_subdirectory (loggers)
add_subdirectory (pcg-random)
add_subdirectory (widechar_width)
add_subdirectory (readpassphrase)
add_subdirectory (bridge)

if (USE_MYSQL)
    add_subdirectory (mysqlxx)

7 base/bridge/CMakeLists.txt Normal file
@@ -0,0 +1,7 @@
add_library (bridge
    IBridge.cpp
)

target_include_directories (daemon PUBLIC ..)
target_link_libraries (bridge PRIVATE daemon dbms Poco::Data Poco::Data::ODBC)

238 base/bridge/IBridge.cpp Normal file
@@ -0,0 +1,238 @@
#include "IBridge.h"

#include <IO/ReadHelpers.h>
#include <boost/program_options.hpp>
#include <Poco/Net/NetException.h>
#include <Poco/Util/HelpFormatter.h>
#include <Common/StringUtils/StringUtils.h>
#include <Formats/registerFormats.h>
#include <common/logger_useful.h>
#include <Common/SensitiveDataMasker.h>
#include <Server/HTTP/HTTPServer.h>

#if USE_ODBC
#    include <Poco/Data/ODBC/Connector.h>
#endif


namespace DB
{

namespace ErrorCodes
{
    extern const int ARGUMENT_OUT_OF_BOUND;
}

namespace
{
    Poco::Net::SocketAddress makeSocketAddress(const std::string & host, UInt16 port, Poco::Logger * log)
    {
        Poco::Net::SocketAddress socket_address;
        try
        {
            socket_address = Poco::Net::SocketAddress(host, port);
        }
        catch (const Poco::Net::DNSException & e)
        {
            const auto code = e.code();
            if (code == EAI_FAMILY
#if defined(EAI_ADDRFAMILY)
                || code == EAI_ADDRFAMILY
#endif
            )
            {
                LOG_ERROR(log, "Cannot resolve listen_host ({}), error {}: {}. If it is an IPv6 address and your host has disabled IPv6, then consider to specify IPv4 address to listen in <listen_host> element of configuration file. Example: <listen_host>0.0.0.0</listen_host>", host, e.code(), e.message());
            }

            throw;
        }
        return socket_address;
    }

    Poco::Net::SocketAddress socketBindListen(Poco::Net::ServerSocket & socket, const std::string & host, UInt16 port, Poco::Logger * log)
    {
        auto address = makeSocketAddress(host, port, log);
#if POCO_VERSION < 0x01080000
        socket.bind(address, /* reuseAddress = */ true);
#else
        socket.bind(address, /* reuseAddress = */ true, /* reusePort = */ false);
#endif

        socket.listen(/* backlog = */ 64);

        return address;
    }
}


void IBridge::handleHelp(const std::string &, const std::string &)
{
    Poco::Util::HelpFormatter help_formatter(options());
    help_formatter.setCommand(commandName());
    help_formatter.setHeader("HTTP-proxy for odbc requests");
    help_formatter.setUsage("--http-port <port>");
    help_formatter.format(std::cerr);

    stopOptionsProcessing();
}


void IBridge::defineOptions(Poco::Util::OptionSet & options)
{
    options.addOption(
        Poco::Util::Option("http-port", "", "port to listen").argument("http-port", true).binding("http-port"));

    options.addOption(
        Poco::Util::Option("listen-host", "", "hostname or address to listen, default 127.0.0.1").argument("listen-host").binding("listen-host"));

    options.addOption(
        Poco::Util::Option("http-timeout", "", "http timeout for socket, default 1800").argument("http-timeout").binding("http-timeout"));

    options.addOption(
        Poco::Util::Option("max-server-connections", "", "max connections to server, default 1024").argument("max-server-connections").binding("max-server-connections"));

    options.addOption(
        Poco::Util::Option("keep-alive-timeout", "", "keepalive timeout, default 10").argument("keep-alive-timeout").binding("keep-alive-timeout"));

    options.addOption(
        Poco::Util::Option("log-level", "", "sets log level, default info").argument("log-level").binding("logger.level"));

    options.addOption(
        Poco::Util::Option("log-path", "", "log path for all logs, default console").argument("log-path").binding("logger.log"));

    options.addOption(
        Poco::Util::Option("err-log-path", "", "err log path for all logs, default no").argument("err-log-path").binding("logger.errorlog"));

    options.addOption(
        Poco::Util::Option("stdout-path", "", "stdout log path, default console").argument("stdout-path").binding("logger.stdout"));

    options.addOption(
        Poco::Util::Option("stderr-path", "", "stderr log path, default console").argument("stderr-path").binding("logger.stderr"));

    using Me = std::decay_t<decltype(*this)>;

    options.addOption(
        Poco::Util::Option("help", "", "produce this help message").binding("help").callback(Poco::Util::OptionCallback<Me>(this, &Me::handleHelp)));

    ServerApplication::defineOptions(options); // NOLINT Don't need complex BaseDaemon's .xml config
}


void IBridge::initialize(Application & self)
{
    BaseDaemon::closeFDs();
    is_help = config().has("help");

    if (is_help)
        return;

    config().setString("logger", bridgeName());

    /// Redirect stdout, stderr to specified files.
    /// Some libraries and sanitizers write to stderr in case of errors.
    const auto stdout_path = config().getString("logger.stdout", "");
    if (!stdout_path.empty())
    {
        if (!freopen(stdout_path.c_str(), "a+", stdout))
            throw Poco::OpenFileException("Cannot attach stdout to " + stdout_path);

        /// Disable buffering for stdout.
        setbuf(stdout, nullptr);
    }
    const auto stderr_path = config().getString("logger.stderr", "");
    if (!stderr_path.empty())
    {
        if (!freopen(stderr_path.c_str(), "a+", stderr))
            throw Poco::OpenFileException("Cannot attach stderr to " + stderr_path);

        /// Disable buffering for stderr.
        setbuf(stderr, nullptr);
    }

    buildLoggers(config(), logger(), self.commandName());

    BaseDaemon::logRevision();

    log = &logger();
    hostname = config().getString("listen-host", "127.0.0.1");
    port = config().getUInt("http-port");
    if (port > 0xFFFF)
        throw Exception("Out of range 'http-port': " + std::to_string(port), ErrorCodes::ARGUMENT_OUT_OF_BOUND);

    http_timeout = config().getUInt("http-timeout", DEFAULT_HTTP_READ_BUFFER_TIMEOUT);
    max_server_connections = config().getUInt("max-server-connections", 1024);
    keep_alive_timeout = config().getUInt("keep-alive-timeout", 10);

    initializeTerminationAndSignalProcessing();

#if USE_ODBC
    if (bridgeName() == "ODBCBridge")
        Poco::Data::ODBC::Connector::registerConnector();
#endif

    ServerApplication::initialize(self); // NOLINT
}


void IBridge::uninitialize()
{
    BaseDaemon::uninitialize();
}


int IBridge::main(const std::vector<std::string> & /*args*/)
{
    if (is_help)
        return Application::EXIT_OK;

    registerFormats();
    LOG_INFO(log, "Starting up {} on host: {}, port: {}", bridgeName(), hostname, port);

    Poco::Net::ServerSocket socket;
    auto address = socketBindListen(socket, hostname, port, log);
    socket.setReceiveTimeout(http_timeout);
    socket.setSendTimeout(http_timeout);

    Poco::ThreadPool server_pool(3, max_server_connections);

    Poco::Net::HTTPServerParams::Ptr http_params = new Poco::Net::HTTPServerParams;
    http_params->setTimeout(http_timeout);
    http_params->setKeepAliveTimeout(keep_alive_timeout);

    auto shared_context = Context::createShared();
    Context context(Context::createGlobal(shared_context.get()));
    context.makeGlobalContext();

    if (config().has("query_masking_rules"))
        SensitiveDataMasker::setInstance(std::make_unique<SensitiveDataMasker>(config(), "query_masking_rules"));

    auto server = HTTPServer(
        context,
        getHandlerFactoryPtr(context),
        server_pool,
        socket,
        http_params);

    SCOPE_EXIT({
        LOG_DEBUG(log, "Received termination signal.");
        LOG_DEBUG(log, "Waiting for current connections to close.");

        server.stop();

        for (size_t count : ext::range(1, 6))
        {
            if (server.currentConnections() == 0)
                break;
            LOG_DEBUG(log, "Waiting for {} connections, try {}", server.currentConnections(), count);
            std::this_thread::sleep_for(std::chrono::milliseconds(1000));
        }
    });

    server.start();
    LOG_INFO(log, "Listening http://{}", address.toString());

    waitForTerminationRequest();
    return Application::EXIT_OK;
}

}

50 base/bridge/IBridge.h Normal file
@@ -0,0 +1,50 @@
#pragma once

#include <Interpreters/Context.h>
#include <Server/HTTP/HTTPRequestHandlerFactory.h>
#include <Poco/Util/ServerApplication.h>
#include <Poco/Logger.h>
#include <daemon/BaseDaemon.h>


namespace DB
{

/// Class represents base for clickhouse-odbc-bridge and clickhouse-library-bridge servers.
/// Listens to incoming HTTP POST and GET requests on specified port and host.
/// Has two handlers '/' for all incoming POST requests and /ping for GET request about service status.
class IBridge : public BaseDaemon
{

public:
    /// Define command line arguments
    void defineOptions(Poco::Util::OptionSet & options) override;

protected:
    using HandlerFactoryPtr = std::shared_ptr<HTTPRequestHandlerFactory>;

    void initialize(Application & self) override;

    void uninitialize() override;

    int main(const std::vector<std::string> & args) override;

    virtual const std::string bridgeName() const = 0;

    virtual HandlerFactoryPtr getHandlerFactoryPtr(Context & context) const = 0;

    size_t keep_alive_timeout;

private:
    void handleHelp(const std::string &, const std::string &);

    bool is_help;
    std::string hostname;
    size_t port;
    std::string log_level;
    size_t max_server_connections;
    size_t http_timeout;

    Poco::Logger * log;
};
}

66 base/ext/scope_guard_safe.h Normal file
@@ -0,0 +1,66 @@
#pragma once

#include <ext/scope_guard.h>
#include <common/logger_useful.h>
#include <Common/MemoryTracker.h>

/// Same as SCOPE_EXIT() but blocks the MEMORY_LIMIT_EXCEEDED errors.
///
/// Typical example of SCOPE_EXIT_MEMORY() usage is when code under it may do
/// some tiny allocations, that may fail under high memory pressure or/and low
/// max_memory_usage (and related limits).
///
/// NOTE: it should be used with caution.
#define SCOPE_EXIT_MEMORY(...) SCOPE_EXIT( \
    MemoryTracker::LockExceptionInThread lock_memory_tracker; \
    __VA_ARGS__; \
)

/// Same as SCOPE_EXIT() but try/catch/tryLogCurrentException any exceptions.
///
/// SCOPE_EXIT_SAFE() should be used in case the exception during the code
/// under SCOPE_EXIT() is not "that fatal" and an error message in the log is enough.
///
/// A good example is calling CurrentThread::detachQueryIfNotDetached().
///
/// An anti-pattern is calling WriteBuffer::finalize() under SCOPE_EXIT_SAFE()
/// (since finalize() can do a final write and it is better to fail abnormally
/// instead of ignoring a write error).
///
/// NOTE: it should be used with double caution.
#define SCOPE_EXIT_SAFE(...) SCOPE_EXIT( \
    try \
    { \
        __VA_ARGS__; \
    } \
    catch (...) \
    { \
        tryLogCurrentException(__PRETTY_FUNCTION__); \
    } \
)

/// Same as SCOPE_EXIT() but:
/// - blocks the MEMORY_LIMIT_EXCEEDED errors,
/// - try/catch/tryLogCurrentException any exceptions.
///
/// SCOPE_EXIT_MEMORY_SAFE() can be used when the error can be ignored, and in
/// addition to SCOPE_EXIT_SAFE() it will also lock MEMORY_LIMIT_EXCEEDED to
/// avoid such exceptions.
///
/// It exists as a separate helper, since you do not need to lock
/// MEMORY_LIMIT_EXCEEDED always (there are cases when the code under SCOPE_EXIT
/// does not do any allocations, while LockExceptionInThread increments an atomic
/// variable).
///
/// NOTE: it should be used with triple caution.
#define SCOPE_EXIT_MEMORY_SAFE(...) SCOPE_EXIT( \
    try \
    { \
        MemoryTracker::LockExceptionInThread lock_memory_tracker; \
        __VA_ARGS__; \
    } \
    catch (...) \
    { \
        tryLogCurrentException(__PRETTY_FUNCTION__); \
    } \
)

@@ -2,7 +2,6 @@
#include <ctime>
#include <random>
#include <thread>

#include <mysqlxx/PoolWithFailover.h>

@@ -15,9 +14,12 @@ static bool startsWith(const std::string & s, const char * prefix)

using namespace mysqlxx;

PoolWithFailover::PoolWithFailover(const Poco::Util::AbstractConfiguration & config_,
    const std::string & config_name_, const unsigned default_connections_,
    const unsigned max_connections_, const size_t max_tries_)
PoolWithFailover::PoolWithFailover(
    const Poco::Util::AbstractConfiguration & config_,
    const std::string & config_name_,
    const unsigned default_connections_,
    const unsigned max_connections_,
    const size_t max_tries_)
    : max_tries(max_tries_)
{
    shareable = config_.getBool(config_name_ + ".share_connection", false);
@@ -59,16 +61,38 @@ PoolWithFailover::PoolWithFailover(const Poco::Util::AbstractConfiguration & con
    }
}

PoolWithFailover::PoolWithFailover(const std::string & config_name_, const unsigned default_connections_,
    const unsigned max_connections_, const size_t max_tries_)
    : PoolWithFailover{
        Poco::Util::Application::instance().config(), config_name_,
        default_connections_, max_connections_, max_tries_}

PoolWithFailover::PoolWithFailover(
    const std::string & config_name_,
    const unsigned default_connections_,
    const unsigned max_connections_,
    const size_t max_tries_)
    : PoolWithFailover{Poco::Util::Application::instance().config(),
                       config_name_, default_connections_, max_connections_, max_tries_}
{
}


PoolWithFailover::PoolWithFailover(
    const std::string & database,
    const RemoteDescription & addresses,
    const std::string & user,
    const std::string & password,
    size_t max_tries_)
    : max_tries(max_tries_)
    , shareable(false)
{
    /// Replicas have the same priority, but traversed replicas are moved to the end of the queue.
    for (const auto & [host, port] : addresses)
    {
        replicas_by_priority[0].emplace_back(std::make_shared<Pool>(database, host, user, password, port));
    }
}


PoolWithFailover::PoolWithFailover(const PoolWithFailover & other)
    : max_tries{other.max_tries}, shareable{other.shareable}
    : max_tries{other.max_tries}
    , shareable{other.shareable}
{
    if (shareable)
    {
@ -11,6 +11,8 @@
|
||||
namespace mysqlxx
|
||||
{
|
||||
/** MySQL connection pool with support of failover.
|
||||
*
|
||||
* For dictionary source:
|
||||
* Have information about replicas and their priorities.
|
||||
* Tries to connect to replica in an order of priority. When equal priority, choose replica with maximum time without connections.
|
||||
*
|
||||
@ -68,42 +70,58 @@ namespace mysqlxx
|
||||
using PoolPtr = std::shared_ptr<Pool>;
|
||||
using Replicas = std::vector<PoolPtr>;
|
||||
|
||||
/// [priority][index] -> replica.
|
||||
/// [priority][index] -> replica. Highest priority is 0.
|
||||
using ReplicasByPriority = std::map<int, Replicas>;
|
||||
|
||||
ReplicasByPriority replicas_by_priority;
|
||||
|
||||
/// Number of connection tries.
|
||||
size_t max_tries;
|
||||
/// Mutex for set of replicas.
|
||||
std::mutex mutex;
|
||||
|
||||
/// Can the Pool be shared
|
||||
bool shareable;
|
||||
|
||||
public:
|
||||
using Entry = Pool::Entry;
|
||||
using RemoteDescription = std::vector<std::pair<std::string, uint16_t>>;
|
||||
|
||||
/**
|
||||
* config_name Name of parameter in configuration file.
|
||||
* * Mysql dictionary sourse related params:
|
||||
* config_name Name of parameter in configuration file for dictionary source.
|
||||
*
|
||||
* * Mysql storage related parameters:
|
||||
* replicas_description
|
||||
*
|
||||
* * Mutual parameters:
|
||||
* default_connections Number of connection in pool to each replica at start.
|
||||
* max_connections Maximum number of connections in pool to each replica.
|
||||
* max_tries_ Max number of connection tries.
|
||||
*/
|
||||
PoolWithFailover(const std::string & config_name_,
|
||||
PoolWithFailover(
|
||||
const std::string & config_name_,
|
||||
unsigned default_connections_ = MYSQLXX_POOL_WITH_FAILOVER_DEFAULT_START_CONNECTIONS,
|
||||
unsigned max_connections_ = MYSQLXX_POOL_WITH_FAILOVER_DEFAULT_MAX_CONNECTIONS,
|
||||
size_t max_tries_ = MYSQLXX_POOL_WITH_FAILOVER_DEFAULT_MAX_TRIES);
|
||||
|
||||
PoolWithFailover(const Poco::Util::AbstractConfiguration & config_,
|
||||
PoolWithFailover(
|
||||
const Poco::Util::AbstractConfiguration & config_,
|
||||
const std::string & config_name_,
|
||||
unsigned default_connections_ = MYSQLXX_POOL_WITH_FAILOVER_DEFAULT_START_CONNECTIONS,
|
||||
unsigned max_connections_ = MYSQLXX_POOL_WITH_FAILOVER_DEFAULT_MAX_CONNECTIONS,
|
||||
size_t max_tries_ = MYSQLXX_POOL_WITH_FAILOVER_DEFAULT_MAX_TRIES);
|
||||
|
||||
PoolWithFailover(
|
||||
const std::string & database,
|
||||
const RemoteDescription & addresses,
|
||||
const std::string & user,
|
||||
const std::string & password,
|
||||
size_t max_tries_ = MYSQLXX_POOL_WITH_FAILOVER_DEFAULT_MAX_TRIES);
|
||||
|
||||
PoolWithFailover(const PoolWithFailover & other);
|
||||
|
||||
/** Allocates a connection to use. */
|
||||
Entry get();
|
||||
};
|
||||
|
||||
using PoolWithFailoverPtr = std::shared_ptr<PoolWithFailover>;
|
||||
}
|
||||
|
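A hedged usage sketch of the new `RemoteDescription` constructor declared above. The hosts, database name and credentials are made up for illustration; it assumes the mysqlxx headers are available and the replicas are reachable:

``` cpp
#include <mysqlxx/PoolWithFailover.h>

int main()
{
    /// All replicas get priority 0; per the constructor above, traversed
    /// replicas are moved to the end of the queue.
    mysqlxx::PoolWithFailover::RemoteDescription addresses{
        {"mysql-replica-1.example.com", 3306},   // hypothetical hosts
        {"mysql-replica-2.example.com", 3306},
    };
    mysqlxx::PoolWithFailover pool("test_db", addresses, "user", "password");

    mysqlxx::PoolWithFailover::Entry connection = pool.get();  // allocates a connection
    /// ... run queries through `connection` ...
}
```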
@ -1,9 +1,9 @@
# This strings autochanged from release_lib.sh:
SET(VERSION_REVISION 54449)
SET(VERSION_REVISION 54450)
SET(VERSION_MAJOR 21)
SET(VERSION_MINOR 4)
SET(VERSION_MINOR 5)
SET(VERSION_PATCH 1)
SET(VERSION_GITHASH af2135ef9dc72f16fa4f229b731262c3f0a8bbdc)
SET(VERSION_DESCRIBE v21.4.1.1-prestable)
SET(VERSION_STRING 21.4.1.1)
SET(VERSION_GITHASH 3827789b3d8fd2021952e57e5110343d26daa1a1)
SET(VERSION_DESCRIBE v21.5.1.1-prestable)
SET(VERSION_STRING 21.5.1.1)
# end of autochange

@ -1,4 +1,8 @@
option (ENABLE_BASE64 "Enable base64" ${ENABLE_LIBRARIES})
if(ARCH_AMD64 OR ARCH_ARM)
    option (ENABLE_BASE64 "Enable base64" ${ENABLE_LIBRARIES})
elseif(ENABLE_BASE64)
    message (${RECONFIGURE_MESSAGE_LEVEL} "base64 library is only supported on x86_64 and aarch64")
endif()

if (NOT ENABLE_BASE64)
    return()

@ -1,7 +1,7 @@
if(NOT ARCH_ARM AND NOT OS_FREEBSD AND NOT OS_DARWIN)
if(ARCH_AMD64 AND NOT OS_FREEBSD AND NOT OS_DARWIN)
    option(ENABLE_FASTOPS "Enable fast vectorized mathematical functions library by Mikhail Parakhin" ${ENABLE_LIBRARIES})
elseif(ENABLE_FASTOPS)
    message (${RECONFIGURE_MESSAGE_LEVEL} "Fastops library is not supported on ARM, FreeBSD and Darwin")
    message (${RECONFIGURE_MESSAGE_LEVEL} "Fastops library is supported on x86_64 only, and not FreeBSD or Darwin")
endif()

if(NOT ENABLE_FASTOPS)

@ -1,4 +1,4 @@
if(NOT ARCH_ARM AND NOT OS_FREEBSD AND NOT APPLE AND USE_PROTOBUF)
if(NOT ARCH_ARM AND NOT OS_FREEBSD AND NOT APPLE AND USE_PROTOBUF AND NOT ARCH_PPC64LE)
    option(ENABLE_HDFS "Enable HDFS" ${ENABLE_LIBRARIES})
elseif(ENABLE_HDFS OR USE_INTERNAL_HDFS3_LIBRARY)
    message (${RECONFIGURE_MESSAGE_LEVEL} "Cannot use HDFS3 with current configuration")

@ -62,6 +62,7 @@ if (NOT OPENLDAP_FOUND AND NOT MISSING_INTERNAL_LDAP_LIBRARY)
    if (
        ( "${_system_name}" STREQUAL "linux" AND "${_system_processor}" STREQUAL "x86_64" ) OR
        ( "${_system_name}" STREQUAL "linux" AND "${_system_processor}" STREQUAL "aarch64" ) OR
        ( "${_system_name}" STREQUAL "linux" AND "${_system_processor}" STREQUAL "ppc64le" ) OR
        ( "${_system_name}" STREQUAL "freebsd" AND "${_system_processor}" STREQUAL "x86_64" ) OR
        ( "${_system_name}" STREQUAL "darwin" AND "${_system_processor}" STREQUAL "x86_64" )
    )

@ -1,7 +1,7 @@
if(NOT OS_FREEBSD AND NOT APPLE AND NOT ARCH_ARM)
if(NOT OS_FREEBSD AND NOT APPLE)
    option(ENABLE_S3 "Enable S3" ${ENABLE_LIBRARIES})
elseif(ENABLE_S3 OR USE_INTERNAL_AWS_S3_LIBRARY)
    message (${RECONFIGURE_MESSAGE_LEVEL} "Can't use S3 on ARM, Apple or FreeBSD")
    message (${RECONFIGURE_MESSAGE_LEVEL} "Can't use S3 on Apple or FreeBSD")
endif()

if(NOT ENABLE_S3)

cmake/find/xz.cmake (new file, 27 lines)
@ -0,0 +1,27 @@
option (USE_INTERNAL_XZ_LIBRARY "Set to OFF to use system xz (lzma) library instead of bundled" ${NOT_UNBUNDLED})

if(NOT EXISTS "${ClickHouse_SOURCE_DIR}/contrib/xz/src/liblzma/api/lzma.h")
    if(USE_INTERNAL_XZ_LIBRARY)
        message(WARNING "submodule contrib/xz is missing. to fix try run: \n git submodule update --init --recursive")
        message (${RECONFIGURE_MESSAGE_LEVEL} "Can't find internal xz (lzma) library")
        set(USE_INTERNAL_XZ_LIBRARY 0)
    endif()
    set(MISSING_INTERNAL_XZ_LIBRARY 1)
endif()

if (NOT USE_INTERNAL_XZ_LIBRARY)
    find_library (XZ_LIBRARY lzma)
    find_path (XZ_INCLUDE_DIR NAMES lzma.h PATHS ${XZ_INCLUDE_PATHS})
    if (NOT XZ_LIBRARY OR NOT XZ_INCLUDE_DIR)
        message (${RECONFIGURE_MESSAGE_LEVEL} "Can't find system xz (lzma) library")
    endif ()
endif ()

if (XZ_LIBRARY AND XZ_INCLUDE_DIR)
elseif (NOT MISSING_INTERNAL_XZ_LIBRARY)
    set (USE_INTERNAL_XZ_LIBRARY 1)
    set (XZ_LIBRARY liblzma)
    set (XZ_INCLUDE_DIR ${ClickHouse_SOURCE_DIR}/contrib/xz/src/liblzma/api)
endif ()

message (STATUS "Using xz (lzma): ${XZ_INCLUDE_DIR} : ${XZ_LIBRARY}")

@ -6,7 +6,7 @@ set (DEFAULT_LIBS "-nodefaultlibs")
# We need builtins from Clang's RT even without libcxx - for ubsan+int128.
# See https://bugs.llvm.org/show_bug.cgi?id=16404
if (COMPILER_CLANG AND NOT (CMAKE_CROSSCOMPILING AND ARCH_AARCH64))
    execute_process (COMMAND ${CMAKE_CXX_COMPILER} --print-file-name=libclang_rt.builtins-${CMAKE_SYSTEM_PROCESSOR}.a OUTPUT_VARIABLE BUILTINS_LIBRARY OUTPUT_STRIP_TRAILING_WHITESPACE)
    execute_process (COMMAND ${CMAKE_CXX_COMPILER} --print-libgcc-file-name --rtlib=compiler-rt OUTPUT_VARIABLE BUILTINS_LIBRARY OUTPUT_STRIP_TRAILING_WHITESPACE)
else ()
    set (BUILTINS_LIBRARY "-lgcc")
endif ()

@ -86,8 +86,3 @@ if (LINKER_NAME)
    message(STATUS "Using custom linker by name: ${LINKER_NAME}")
endif ()

if (ARCH_PPC64LE)
    if (COMPILER_CLANG OR (COMPILER_GCC AND CMAKE_CXX_COMPILER_VERSION VERSION_LESS 8))
        message(FATAL_ERROR "Only gcc-8 or higher is supported for powerpc architecture")
    endif ()
endif ()

contrib/CMakeLists.txt (vendored, 5 lines)
@ -47,7 +47,10 @@ add_subdirectory (lz4-cmake)
add_subdirectory (murmurhash)
add_subdirectory (replxx-cmake)
add_subdirectory (unixodbc-cmake)
add_subdirectory (xz)

if (USE_INTERNAL_XZ_LIBRARY)
    add_subdirectory (xz)
endif()

add_subdirectory (poco-cmake)
add_subdirectory (croaring-cmake)

contrib/NuRaft (vendored, 2 lines)
@ -1 +1 @@
Subproject commit 70468326ad5d72e9497944838484c591dae054ea
Subproject commit 241fd3754a1eb4d82ab68a9a875dc99391ec9f02
@ -160,6 +160,12 @@ if (NOT EXTERNAL_BOOST_FOUND)
    enable_language(ASM)
    SET(ASM_OPTIONS "-x assembler-with-cpp")

    set (SRCS_CONTEXT
        ${LIBRARY_DIR}/libs/context/src/dummy.cpp
        ${LIBRARY_DIR}/libs/context/src/execution_context.cpp
        ${LIBRARY_DIR}/libs/context/src/posix/stack_traits.cpp
    )

    if (SANITIZE AND (SANITIZE STREQUAL "address" OR SANITIZE STREQUAL "thread"))
        add_compile_definitions(BOOST_USE_UCONTEXT)

@ -169,39 +175,34 @@ if (NOT EXTERNAL_BOOST_FOUND)
            add_compile_definitions(BOOST_USE_TSAN)
        endif()

        set (SRCS_CONTEXT
        set (SRCS_CONTEXT ${SRCS_CONTEXT}
            ${LIBRARY_DIR}/libs/context/src/fiber.cpp
            ${LIBRARY_DIR}/libs/context/src/continuation.cpp
            ${LIBRARY_DIR}/libs/context/src/dummy.cpp
            ${LIBRARY_DIR}/libs/context/src/execution_context.cpp
            ${LIBRARY_DIR}/libs/context/src/posix/stack_traits.cpp
        )
    elseif (ARCH_ARM)
        set (SRCS_CONTEXT
    endif()
    if (ARCH_ARM)
        set (SRCS_CONTEXT ${SRCS_CONTEXT}
            ${LIBRARY_DIR}/libs/context/src/asm/jump_arm64_aapcs_elf_gas.S
            ${LIBRARY_DIR}/libs/context/src/asm/make_arm64_aapcs_elf_gas.S
            ${LIBRARY_DIR}/libs/context/src/asm/ontop_arm64_aapcs_elf_gas.S
            ${LIBRARY_DIR}/libs/context/src/dummy.cpp
            ${LIBRARY_DIR}/libs/context/src/execution_context.cpp
            ${LIBRARY_DIR}/libs/context/src/posix/stack_traits.cpp
        )
    elseif (ARCH_PPC64LE)
        set (SRCS_CONTEXT ${SRCS_CONTEXT}
            ${LIBRARY_DIR}/libs/context/src/asm/jump_ppc64_sysv_elf_gas.S
            ${LIBRARY_DIR}/libs/context/src/asm/make_ppc64_sysv_elf_gas.S
            ${LIBRARY_DIR}/libs/context/src/asm/ontop_ppc64_sysv_elf_gas.S
        )
    elseif(OS_DARWIN)
        set (SRCS_CONTEXT
        set (SRCS_CONTEXT ${SRCS_CONTEXT}
            ${LIBRARY_DIR}/libs/context/src/asm/jump_x86_64_sysv_macho_gas.S
            ${LIBRARY_DIR}/libs/context/src/asm/make_x86_64_sysv_macho_gas.S
            ${LIBRARY_DIR}/libs/context/src/asm/ontop_x86_64_sysv_macho_gas.S
            ${LIBRARY_DIR}/libs/context/src/dummy.cpp
            ${LIBRARY_DIR}/libs/context/src/execution_context.cpp
            ${LIBRARY_DIR}/libs/context/src/posix/stack_traits.cpp
        )
    else()
        set (SRCS_CONTEXT
        set (SRCS_CONTEXT ${SRCS_CONTEXT}
            ${LIBRARY_DIR}/libs/context/src/asm/jump_x86_64_sysv_elf_gas.S
            ${LIBRARY_DIR}/libs/context/src/asm/make_x86_64_sysv_elf_gas.S
            ${LIBRARY_DIR}/libs/context/src/asm/ontop_x86_64_sysv_elf_gas.S
            ${LIBRARY_DIR}/libs/context/src/dummy.cpp
            ${LIBRARY_DIR}/libs/context/src/execution_context.cpp
            ${LIBRARY_DIR}/libs/context/src/posix/stack_traits.cpp
        )
    endif()

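The `fiber.cpp`/`continuation.cpp` sources added above (plus the per-architecture assembly) implement Boost.Context's stackful coroutine API. As a self-contained illustration of that API, independent of ClickHouse and assuming Boost 1.69 or newer is installed:

``` cpp
#include <boost/context/fiber.hpp>
#include <iostream>

namespace ctx = boost::context;

int main()
{
    ctx::fiber f{[](ctx::fiber && main_ctx)
    {
        std::cout << "in fiber\n";
        main_ctx = std::move(main_ctx).resume();  // yield back to main
        std::cout << "fiber resumed\n";
        return std::move(main_ctx);
    }};
    f = std::move(f).resume();  // first switch into the fiber
    f = std::move(f).resume();  // resume it after the yield
    std::cout << "done\n";
}
```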
@ -97,12 +97,19 @@ if (NOT EXTERNAL_CCTZ_LIBRARY_FOUND OR NOT EXTERNAL_CCTZ_LIBRARY_WORKS)
        set(TZ_OBJS ${TZ_OBJS} ${TZ_OBJ})

        # https://stackoverflow.com/questions/14776463/compile-and-add-an-object-file-from-a-binary-with-cmake
        add_custom_command(OUTPUT ${TZ_OBJ}
            COMMAND cp ${TZDIR}/${TIMEZONE} ${CMAKE_CURRENT_BINARY_DIR}/${TIMEZONE_ID}
            COMMAND cd ${CMAKE_CURRENT_BINARY_DIR} && ${OBJCOPY_PATH} -I binary ${OBJCOPY_ARCH_OPTIONS}
        # PPC64LE fails to do this with objcopy, use ld or lld instead
        if (ARCH_PPC64LE)
            add_custom_command(OUTPUT ${TZ_OBJ}
                COMMAND cp ${TZDIR}/${TIMEZONE} ${CMAKE_CURRENT_BINARY_DIR}/${TIMEZONE_ID}
                COMMAND cd ${CMAKE_CURRENT_BINARY_DIR} && ${CMAKE_LINKER} -m elf64lppc -r -b binary -o ${TZ_OBJ} ${TIMEZONE_ID}
                COMMAND rm ${CMAKE_CURRENT_BINARY_DIR}/${TIMEZONE_ID})
        else()
            add_custom_command(OUTPUT ${TZ_OBJ}
                COMMAND cp ${TZDIR}/${TIMEZONE} ${CMAKE_CURRENT_BINARY_DIR}/${TIMEZONE_ID}
                COMMAND cd ${CMAKE_CURRENT_BINARY_DIR} && ${OBJCOPY_PATH} -I binary ${OBJCOPY_ARCH_OPTIONS}
                    --rename-section .data=.rodata,alloc,load,readonly,data,contents ${TIMEZONE_ID} ${TZ_OBJ}
                COMMAND rm ${CMAKE_CURRENT_BINARY_DIR}/${TIMEZONE_ID})

            COMMAND rm ${CMAKE_CURRENT_BINARY_DIR}/${TIMEZONE_ID})
        endif()
        set_source_files_properties(${TZ_OBJ} PROPERTIES EXTERNAL_OBJECT true GENERATED true)
    endforeach(TIMEZONE)

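Both branches above turn a raw tzdata file into a linkable object; the embedded blob then becomes addressable from C++ through the start/end symbols that `objcopy -I binary` (and `ld -r -b binary`) derive from the input file name. A hedged sketch, assuming a hypothetical timezone id `Europe_Moscow` (in practice the exact symbol names depend on how the build mangles the file name):

``` cpp
#include <cstddef>
#include <string_view>

// Symbol names are derived from the embedded file name; non-alphanumeric
// characters are replaced with '_'. "Europe_Moscow" here is an assumption.
extern "C" const char _binary_Europe_Moscow_start[];
extern "C" const char _binary_Europe_Moscow_end[];

std::string_view embeddedTimezoneData()
{
    return {_binary_Europe_Moscow_start,
            static_cast<size_t>(_binary_Europe_Moscow_end - _binary_Europe_Moscow_start)};
}
```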
@ -1,7 +1,7 @@
if (SANITIZE OR NOT (ARCH_AMD64 OR ARCH_ARM) OR NOT (OS_LINUX OR OS_FREEBSD OR OS_DARWIN))
if (SANITIZE OR NOT (ARCH_AMD64 OR ARCH_ARM OR ARCH_PPC64LE) OR NOT (OS_LINUX OR OS_FREEBSD OR OS_DARWIN))
    if (ENABLE_JEMALLOC)
        message (${RECONFIGURE_MESSAGE_LEVEL}
                 "jemalloc is disabled implicitly: it doesn't work with sanitizers and can only be used with x86_64 or aarch64 on linux or freebsd.")
                 "jemalloc is disabled implicitly: it doesn't work with sanitizers and can only be used with x86_64, aarch64 or ppc64le on linux or freebsd.")
    endif()
    set (ENABLE_JEMALLOC OFF)
else()
@ -107,6 +107,8 @@ if (ARCH_AMD64)
    set(JEMALLOC_INCLUDE_PREFIX "${JEMALLOC_INCLUDE_PREFIX}_x86_64")
elseif (ARCH_ARM)
    set(JEMALLOC_INCLUDE_PREFIX "${JEMALLOC_INCLUDE_PREFIX}_aarch64")
elseif (ARCH_PPC64LE)
    set(JEMALLOC_INCLUDE_PREFIX "${JEMALLOC_INCLUDE_PREFIX}_ppc64le")
else ()
    message (FATAL_ERROR "internal jemalloc: This arch is not supported")
endif ()

@ -0,0 +1,367 @@
/* include/jemalloc/internal/jemalloc_internal_defs.h. Generated from jemalloc_internal_defs.h.in by configure. */
#ifndef JEMALLOC_INTERNAL_DEFS_H_
#define JEMALLOC_INTERNAL_DEFS_H_
/*
 * If JEMALLOC_PREFIX is defined via --with-jemalloc-prefix, it will cause all
 * public APIs to be prefixed. This makes it possible, with some care, to use
 * multiple allocators simultaneously.
 */
/* #undef JEMALLOC_PREFIX */
/* #undef JEMALLOC_CPREFIX */

/*
 * Define overrides for non-standard allocator-related functions if they are
 * present on the system.
 */
#define JEMALLOC_OVERRIDE___LIBC_CALLOC
#define JEMALLOC_OVERRIDE___LIBC_FREE
#define JEMALLOC_OVERRIDE___LIBC_MALLOC
#define JEMALLOC_OVERRIDE___LIBC_MEMALIGN
#define JEMALLOC_OVERRIDE___LIBC_REALLOC
#define JEMALLOC_OVERRIDE___LIBC_VALLOC
/* #undef JEMALLOC_OVERRIDE___POSIX_MEMALIGN */

/*
 * JEMALLOC_PRIVATE_NAMESPACE is used as a prefix for all library-private APIs.
 * For shared libraries, symbol visibility mechanisms prevent these symbols
 * from being exported, but for static libraries, naming collisions are a real
 * possibility.
 */
#define JEMALLOC_PRIVATE_NAMESPACE je_

/*
 * Hyper-threaded CPUs may need a special instruction inside spin loops in
 * order to yield to another virtual CPU.
 */
#define CPU_SPINWAIT
/* 1 if CPU_SPINWAIT is defined, 0 otherwise. */
#define HAVE_CPU_SPINWAIT 0

/*
 * Number of significant bits in virtual addresses. This may be less than the
 * total number of bits in a pointer, e.g. on x64, for which the uppermost 16
 * bits are the same as bit 47.
 */
#define LG_VADDR 64

/* Defined if C11 atomics are available. */
#define JEMALLOC_C11_ATOMICS 1

/* Defined if GCC __atomic atomics are available. */
#define JEMALLOC_GCC_ATOMIC_ATOMICS 1
/* and the 8-bit variant support. */
#define JEMALLOC_GCC_U8_ATOMIC_ATOMICS 1

/* Defined if GCC __sync atomics are available. */
#define JEMALLOC_GCC_SYNC_ATOMICS 1
/* and the 8-bit variant support. */
#define JEMALLOC_GCC_U8_SYNC_ATOMICS 1

/*
 * Defined if __builtin_clz() and __builtin_clzl() are available.
 */
#define JEMALLOC_HAVE_BUILTIN_CLZ

/*
 * Defined if os_unfair_lock_*() functions are available, as provided by Darwin.
 */
/* #undef JEMALLOC_OS_UNFAIR_LOCK */

/* Defined if syscall(2) is usable. */
#define JEMALLOC_USE_SYSCALL

/*
 * Defined if secure_getenv(3) is available.
 */
// #define JEMALLOC_HAVE_SECURE_GETENV

/*
 * Defined if issetugid(2) is available.
 */
/* #undef JEMALLOC_HAVE_ISSETUGID */

/* Defined if pthread_atfork(3) is available. */
#define JEMALLOC_HAVE_PTHREAD_ATFORK

/* Defined if pthread_setname_np(3) is available. */
#define JEMALLOC_HAVE_PTHREAD_SETNAME_NP

/*
 * Defined if clock_gettime(CLOCK_MONOTONIC_COARSE, ...) is available.
 */
#define JEMALLOC_HAVE_CLOCK_MONOTONIC_COARSE 1

/*
 * Defined if clock_gettime(CLOCK_MONOTONIC, ...) is available.
 */
#define JEMALLOC_HAVE_CLOCK_MONOTONIC 1

/*
 * Defined if mach_absolute_time() is available.
 */
/* #undef JEMALLOC_HAVE_MACH_ABSOLUTE_TIME */

/*
 * Defined if _malloc_thread_cleanup() exists. At least in the case of
 * FreeBSD, pthread_key_create() allocates, which if used during malloc
 * bootstrapping will cause recursion into the pthreads library. Therefore, if
 * _malloc_thread_cleanup() exists, use it as the basis for thread cleanup in
 * malloc_tsd.
 */
/* #undef JEMALLOC_MALLOC_THREAD_CLEANUP */

/*
 * Defined if threaded initialization is known to be safe on this platform.
 * Among other things, it must be possible to initialize a mutex without
 * triggering allocation in order for threaded allocation to be safe.
 */
#define JEMALLOC_THREADED_INIT

/*
 * Defined if the pthreads implementation defines
 * _pthread_mutex_init_calloc_cb(), in which case the function is used in order
 * to avoid recursive allocation during mutex initialization.
 */
/* #undef JEMALLOC_MUTEX_INIT_CB */

/* Non-empty if the tls_model attribute is supported. */
#define JEMALLOC_TLS_MODEL __attribute__((tls_model("initial-exec")))

/*
 * JEMALLOC_DEBUG enables assertions and other sanity checks, and disables
 * inline functions.
 */
/* #undef JEMALLOC_DEBUG */

/* JEMALLOC_STATS enables statistics calculation. */
#define JEMALLOC_STATS

/* JEMALLOC_EXPERIMENTAL_SMALLOCX_API enables experimental smallocx API. */
/* #undef JEMALLOC_EXPERIMENTAL_SMALLOCX_API */

/* JEMALLOC_PROF enables allocation profiling. */
/* #undef JEMALLOC_PROF */

/* Use libunwind for profile backtracing if defined. */
/* #undef JEMALLOC_PROF_LIBUNWIND */

/* Use libgcc for profile backtracing if defined. */
/* #undef JEMALLOC_PROF_LIBGCC */

/* Use gcc intrinsics for profile backtracing if defined. */
/* #undef JEMALLOC_PROF_GCC */

/*
 * JEMALLOC_DSS enables use of sbrk(2) to allocate extents from the data storage
 * segment (DSS).
 */
#define JEMALLOC_DSS

/* Support memory filling (junk/zero). */
#define JEMALLOC_FILL

/* Support utrace(2)-based tracing. */
/* #undef JEMALLOC_UTRACE */

/* Support optional abort() on OOM. */
/* #undef JEMALLOC_XMALLOC */

/* Support lazy locking (avoid locking unless a second thread is launched). */
/* #undef JEMALLOC_LAZY_LOCK */

/*
 * Minimum allocation alignment is 2^LG_QUANTUM bytes (ignoring tiny size
 * classes).
 */
/* #undef LG_QUANTUM */

/* One page is 2^LG_PAGE bytes. */
#define LG_PAGE 16

/*
 * One huge page is 2^LG_HUGEPAGE bytes. Note that this is defined even if the
 * system does not explicitly support huge pages; system calls that require
 * explicit huge page support are separately configured.
 */
#define LG_HUGEPAGE 21

/*
 * If defined, adjacent virtual memory mappings with identical attributes
 * automatically coalesce, and they fragment when changes are made to subranges.
 * This is the normal order of things for mmap()/munmap(), but on Windows
 * VirtualAlloc()/VirtualFree() operations must be precisely matched, i.e.
 * mappings do *not* coalesce/fragment.
 */
#define JEMALLOC_MAPS_COALESCE

/*
 * If defined, retain memory for later reuse by default rather than using e.g.
 * munmap() to unmap freed extents. This is enabled on 64-bit Linux because
 * common sequences of mmap()/munmap() calls will cause virtual memory map
 * holes.
 */
#define JEMALLOC_RETAIN

/* TLS is used to map arenas and magazine caches to threads. */
#define JEMALLOC_TLS

/*
 * Used to mark unreachable code to quiet "end of non-void" compiler warnings.
 * Don't use this directly; instead use unreachable() from util.h
 */
#define JEMALLOC_INTERNAL_UNREACHABLE __builtin_unreachable

/*
 * ffs*() functions to use for bitmapping. Don't use these directly; instead,
 * use ffs_*() from util.h.
 */
#define JEMALLOC_INTERNAL_FFSLL __builtin_ffsll
#define JEMALLOC_INTERNAL_FFSL __builtin_ffsl
#define JEMALLOC_INTERNAL_FFS __builtin_ffs

/*
 * popcount*() functions to use for bitmapping.
 */
#define JEMALLOC_INTERNAL_POPCOUNTL __builtin_popcountl
#define JEMALLOC_INTERNAL_POPCOUNT __builtin_popcount

/*
 * If defined, explicitly attempt to more uniformly distribute large allocation
 * pointer alignments across all cache indices.
 */
#define JEMALLOC_CACHE_OBLIVIOUS

/*
 * If defined, enable logging facilities. We make this a configure option to
 * avoid taking extra branches everywhere.
 */
/* #undef JEMALLOC_LOG */

/*
 * If defined, use readlinkat() (instead of readlink()) to follow
 * /etc/malloc_conf.
 */
/* #undef JEMALLOC_READLINKAT */

/*
 * Darwin (OS X) uses zones to work around Mach-O symbol override shortcomings.
 */
/* #undef JEMALLOC_ZONE */

/*
 * Methods for determining whether the OS overcommits.
 * JEMALLOC_PROC_SYS_VM_OVERCOMMIT_MEMORY: Linux's
 *                                         /proc/sys/vm.overcommit_memory file.
 * JEMALLOC_SYSCTL_VM_OVERCOMMIT: FreeBSD's vm.overcommit sysctl.
 */
/* #undef JEMALLOC_SYSCTL_VM_OVERCOMMIT */
#define JEMALLOC_PROC_SYS_VM_OVERCOMMIT_MEMORY

/* Defined if madvise(2) is available. */
#define JEMALLOC_HAVE_MADVISE

/*
 * Defined if transparent huge pages are supported via the MADV_[NO]HUGEPAGE
 * arguments to madvise(2).
 */
#define JEMALLOC_HAVE_MADVISE_HUGE

/*
 * Methods for purging unused pages differ between operating systems.
 *
 *   madvise(..., MADV_FREE) : This marks pages as being unused, such that they
 *                             will be discarded rather than swapped out.
 *   madvise(..., MADV_DONTNEED) : If JEMALLOC_PURGE_MADVISE_DONTNEED_ZEROS is
 *                                 defined, this immediately discards pages,
 *                                 such that new pages will be demand-zeroed if
 *                                 the address region is later touched;
 *                                 otherwise this behaves similarly to
 *                                 MADV_FREE, though typically with higher
 *                                 system overhead.
 */
#define JEMALLOC_PURGE_MADVISE_FREE
#define JEMALLOC_PURGE_MADVISE_DONTNEED
#define JEMALLOC_PURGE_MADVISE_DONTNEED_ZEROS

/* Defined if madvise(2) is available but MADV_FREE is not (x86 Linux only). */
/* #undef JEMALLOC_DEFINE_MADVISE_FREE */

/*
 * Defined if MADV_DO[NT]DUMP is supported as an argument to madvise.
 */
#define JEMALLOC_MADVISE_DONTDUMP

/*
 * Defined if transparent huge pages (THPs) are supported via the
 * MADV_[NO]HUGEPAGE arguments to madvise(2), and THP support is enabled.
 */
/* #undef JEMALLOC_THP */

/* Define if operating system has alloca.h header. */
#define JEMALLOC_HAS_ALLOCA_H 1

/* C99 restrict keyword supported. */
#define JEMALLOC_HAS_RESTRICT 1

/* For use by hash code. */
/* #undef JEMALLOC_BIG_ENDIAN */

/* sizeof(int) == 2^LG_SIZEOF_INT. */
#define LG_SIZEOF_INT 2

/* sizeof(long) == 2^LG_SIZEOF_LONG. */
#define LG_SIZEOF_LONG 3

/* sizeof(long long) == 2^LG_SIZEOF_LONG_LONG. */
#define LG_SIZEOF_LONG_LONG 3

/* sizeof(intmax_t) == 2^LG_SIZEOF_INTMAX_T. */
#define LG_SIZEOF_INTMAX_T 3

/* glibc malloc hooks (__malloc_hook, __realloc_hook, __free_hook). */
#define JEMALLOC_GLIBC_MALLOC_HOOK

/* glibc memalign hook. */
#define JEMALLOC_GLIBC_MEMALIGN_HOOK

/* pthread support */
#define JEMALLOC_HAVE_PTHREAD

/* dlsym() support */
#define JEMALLOC_HAVE_DLSYM

/* Adaptive mutex support in pthreads. */
#define JEMALLOC_HAVE_PTHREAD_MUTEX_ADAPTIVE_NP

/* GNU specific sched_getcpu support */
#define JEMALLOC_HAVE_SCHED_GETCPU

/* GNU specific sched_setaffinity support */
#define JEMALLOC_HAVE_SCHED_SETAFFINITY

/*
 * If defined, all the features necessary for background threads are present.
 */
#define JEMALLOC_BACKGROUND_THREAD 1

/*
 * If defined, jemalloc symbols are not exported (doesn't work when
 * JEMALLOC_PREFIX is not defined).
 */
/* #undef JEMALLOC_EXPORT */

/* config.malloc_conf options string. */
#define JEMALLOC_CONFIG_MALLOC_CONF "@JEMALLOC_CONFIG_MALLOC_CONF@"

/* If defined, jemalloc takes the malloc/free/etc. symbol names. */
#define JEMALLOC_IS_MALLOC 1

/*
 * Defined if strerror_r returns char * if _GNU_SOURCE is defined.
 */
#define JEMALLOC_STRERROR_R_RETURNS_CHAR_WITH_GNU_SOURCE

/* Performs additional safety checks when defined. */
/* #undef JEMALLOC_OPT_SAFETY_CHECKS */

#endif /* JEMALLOC_INTERNAL_DEFS_H_ */
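One detail worth calling out in the generated header above: `LG_PAGE` is 16 in this ppc64le configuration because ppc64le kernels typically use 64 KiB base pages (the x86_64 configuration uses 12, i.e. 4 KiB). The log2 encoding works out as:

``` cpp
#include <cstdio>

int main()
{
    printf("LG_PAGE 16     -> %d KiB pages\n", (1 << 16) / 1024);             // 64 KiB on ppc64le
    printf("LG_HUGEPAGE 21 -> %d MiB huge pages\n", (1 << 21) / (1024 * 1024));  // 2 MiB
}
```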
@ -1,11 +1,9 @@
if (NOT ARCH_ARM)
if(ARCH_AMD64)
    option (ENABLE_CPUID "Enable libcpuid library (only internal)" ${ENABLE_LIBRARIES})
endif()

if (ARCH_ARM AND ENABLE_CPUID)
    message (${RECONFIGURE_MESSAGE_LEVEL} "cpuid is not supported on ARM")
elseif(ENABLE_CPUID)
    message (${RECONFIGURE_MESSAGE_LEVEL} "libcpuid is only supported on x86_64")
    set (ENABLE_CPUID 0)
    endif ()
endif()

if (NOT ENABLE_CPUID)
    add_library (cpuid INTERFACE)

@ -75,6 +75,8 @@
#define HAVE_STRNDUP 1
// strerror_r
#define HAVE_STRERROR_R 1
// rand_r
#define HAVE_RAND_R 1

#ifdef __APPLE__
// pthread_setname_np

contrib/openldap-cmake/linux_ppc64le/include/lber_types.h (new file, 63 lines)
@ -0,0 +1,63 @@
/* include/lber_types.h. Generated from lber_types.hin by configure. */
/* $OpenLDAP$ */
/* This work is part of OpenLDAP Software <http://www.openldap.org/>.
 *
 * Copyright 1998-2020 The OpenLDAP Foundation.
 * All rights reserved.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted only as authorized by the OpenLDAP
 * Public License.
 *
 * A copy of this license is available in file LICENSE in the
 * top-level directory of the distribution or, alternatively, at
 * <http://www.OpenLDAP.org/license.html>.
 */

/*
 * LBER types
 */

#ifndef _LBER_TYPES_H
#define _LBER_TYPES_H

#include <ldap_cdefs.h>

LDAP_BEGIN_DECL

/* LBER boolean, enum, integers (32 bits or larger) */
#define LBER_INT_T int

/* LBER tags (32 bits or larger) */
#define LBER_TAG_T long

/* LBER socket descriptor */
#define LBER_SOCKET_T int

/* LBER lengths (32 bits or larger) */
#define LBER_LEN_T long

/* ------------------------------------------------------------ */

/* booleans, enumerations, and integers */
typedef LBER_INT_T ber_int_t;

/* signed and unsigned versions */
typedef signed LBER_INT_T ber_sint_t;
typedef unsigned LBER_INT_T ber_uint_t;

/* tags */
typedef unsigned LBER_TAG_T ber_tag_t;

/* "socket" descriptors */
typedef LBER_SOCKET_T ber_socket_t;

/* lengths */
typedef unsigned LBER_LEN_T ber_len_t;

/* signed lengths */
typedef signed LBER_LEN_T ber_slen_t;

LDAP_END_DECL

#endif /* _LBER_TYPES_H */

contrib/openldap-cmake/linux_ppc64le/include/ldap_config.h (new file, 74 lines)
@ -0,0 +1,74 @@
/* include/ldap_config.h. Generated from ldap_config.hin by configure. */
/* $OpenLDAP$ */
/* This work is part of OpenLDAP Software <http://www.openldap.org/>.
 *
 * Copyright 1998-2020 The OpenLDAP Foundation.
 * All rights reserved.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted only as authorized by the OpenLDAP
 * Public License.
 *
 * A copy of this license is available in file LICENSE in the
 * top-level directory of the distribution or, alternatively, at
 * <http://www.OpenLDAP.org/license.html>.
 */

/*
 * This file works in conjunction with OpenLDAP configure system.
 * If you do no like the values below, adjust your configure options.
 */

#ifndef _LDAP_CONFIG_H
#define _LDAP_CONFIG_H

/* directory separator */
#ifndef LDAP_DIRSEP
#ifndef _WIN32
#define LDAP_DIRSEP "/"
#else
#define LDAP_DIRSEP "\\"
#endif
#endif

/* directory for temporary files */
#if defined(_WIN32)
# define LDAP_TMPDIR "C:\\."	/* we don't have much of a choice */
#elif defined( _P_tmpdir )
# define LDAP_TMPDIR _P_tmpdir
#elif defined( P_tmpdir )
# define LDAP_TMPDIR P_tmpdir
#elif defined( _PATH_TMPDIR )
# define LDAP_TMPDIR _PATH_TMPDIR
#else
# define LDAP_TMPDIR LDAP_DIRSEP "tmp"
#endif

/* directories */
#ifndef LDAP_BINDIR
#define LDAP_BINDIR "/tmp/ldap-prefix/bin"
#endif
#ifndef LDAP_SBINDIR
#define LDAP_SBINDIR "/tmp/ldap-prefix/sbin"
#endif
#ifndef LDAP_DATADIR
#define LDAP_DATADIR "/tmp/ldap-prefix/share/openldap"
#endif
#ifndef LDAP_SYSCONFDIR
#define LDAP_SYSCONFDIR "/tmp/ldap-prefix/etc/openldap"
#endif
#ifndef LDAP_LIBEXECDIR
#define LDAP_LIBEXECDIR "/tmp/ldap-prefix/libexec"
#endif
#ifndef LDAP_MODULEDIR
#define LDAP_MODULEDIR "/tmp/ldap-prefix/libexec/openldap"
#endif
#ifndef LDAP_RUNDIR
#define LDAP_RUNDIR "/tmp/ldap-prefix/var"
#endif
#ifndef LDAP_LOCALEDIR
#define LDAP_LOCALEDIR ""
#endif


#endif /* _LDAP_CONFIG_H */

contrib/openldap-cmake/linux_ppc64le/include/ldap_features.h (new file, 61 lines)
@ -0,0 +1,61 @@
/* include/ldap_features.h. Generated from ldap_features.hin by configure. */
/* $OpenLDAP$ */
/* This work is part of OpenLDAP Software <http://www.openldap.org/>.
 *
 * Copyright 1998-2020 The OpenLDAP Foundation.
 * All rights reserved.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted only as authorized by the OpenLDAP
 * Public License.
 *
 * A copy of this license is available in file LICENSE in the
 * top-level directory of the distribution or, alternatively, at
 * <http://www.OpenLDAP.org/license.html>.
 */

/*
 * LDAP Features
 */

#ifndef _LDAP_FEATURES_H
#define _LDAP_FEATURES_H 1

/* OpenLDAP API version macros */
#define LDAP_VENDOR_VERSION 20501
#define LDAP_VENDOR_VERSION_MAJOR 2
#define LDAP_VENDOR_VERSION_MINOR 5
#define LDAP_VENDOR_VERSION_PATCH X

/*
** WORK IN PROGRESS!
**
** OpenLDAP reentrancy/thread-safeness should be dynamically
** checked using ldap_get_option().
**
** The -lldap implementation is not thread-safe.
**
** The -lldap_r implementation is:
**	LDAP_API_FEATURE_THREAD_SAFE (basic thread safety)
** but also be:
**	LDAP_API_FEATURE_SESSION_THREAD_SAFE
**	LDAP_API_FEATURE_OPERATION_THREAD_SAFE
**
** The preprocessor flag LDAP_API_FEATURE_X_OPENLDAP_THREAD_SAFE
** can be used to determine if -lldap_r is available at compile
** time. You must define LDAP_THREAD_SAFE if and only if you
** link with -lldap_r.
**
** If you fail to define LDAP_THREAD_SAFE when linking with
** -lldap_r or define LDAP_THREAD_SAFE when linking with -lldap,
** provided header definitions and declarations may be incorrect.
**
*/

/* is -lldap_r available or not */
#define LDAP_API_FEATURE_X_OPENLDAP_THREAD_SAFE 1

/* LDAP v2 Referrals */
/* #undef LDAP_API_FEATURE_X_OPENLDAP_V2_REFERRALS */

#endif /* LDAP_FEATURES */

contrib/openldap-cmake/linux_ppc64le/include/portable.h (new file, 1169 lines; diff suppressed because it is too large)

debian/changelog (vendored, 4 lines)
@ -1,5 +1,5 @@
clickhouse (21.4.1.1) unstable; urgency=low
clickhouse (21.5.1.1) unstable; urgency=low

  * Modified source code

 -- clickhouse-release <clickhouse-release@yandex-team.ru> Sat, 06 Mar 2021 14:43:27 +0300
 -- clickhouse-release <clickhouse-release@yandex-team.ru> Fri, 02 Apr 2021 18:34:26 +0300

debian/clickhouse-common-static.install (vendored, 1 line)
@ -1,5 +1,6 @@
usr/bin/clickhouse
usr/bin/clickhouse-odbc-bridge
usr/bin/clickhouse-library-bridge
usr/bin/clickhouse-extract-from-config
usr/share/bash-completion/completions
etc/security/limits.d/clickhouse.conf

@ -1,7 +1,7 @@
FROM ubuntu:18.04

ARG repository="deb https://repo.clickhouse.tech/deb/stable/ main/"
ARG version=21.4.1.*
ARG version=21.5.1.*

RUN apt-get update \
    && apt-get install --yes --no-install-recommends \

@ -138,7 +138,8 @@
        "docker/test/stateless_unbundled",
        "docker/test/stateless_pytest",
        "docker/test/integration/base",
        "docker/test/fuzzer"
        "docker/test/fuzzer",
        "docker/test/keeper-jepsen"
    ]
},
"docker/packager/unbundled": {
@ -159,5 +160,9 @@
"docker/test/sqlancer": {
    "name": "yandex/clickhouse-sqlancer-test",
    "dependent": []
},
"docker/test/keeper-jepsen": {
    "name": "yandex/clickhouse-keeper-jepsen-test",
    "dependent": []
}
}

@ -15,7 +15,6 @@ mkdir -p build/build_docker
cd build/build_docker
ccache --show-stats ||:
ccache --zero-stats ||:
ln -s /usr/lib/x86_64-linux-gnu/libOpenCL.so.1.0.0 /usr/lib/libOpenCL.so ||:
rm -f CMakeCache.txt
# Read cmake arguments into array (possibly empty)
read -ra CMAKE_FLAGS <<< "${CMAKE_FLAGS:-}"

@ -23,4 +23,3 @@ then
fi
fi
ccache --show-stats ||:
ln -s /usr/lib/x86_64-linux-gnu/libOpenCL.so.1.0.0 /usr/lib/libOpenCL.so ||:

@ -13,4 +13,3 @@ mv /*.rpm /output ||: # if exists
mv /*.tgz /output ||: # if exists

ccache --show-stats ||:
ln -s /usr/lib/x86_64-linux-gnu/libOpenCL.so.1.0.0 /usr/lib/libOpenCL.so ||:

@ -1,7 +1,7 @@
FROM ubuntu:20.04

ARG repository="deb https://repo.clickhouse.tech/deb/stable/ main/"
ARG version=21.4.1.*
ARG version=21.5.1.*
ARG gosu_ver=1.10

# set non-empty deb_location_url url to create a docker image

@ -1,7 +1,7 @@
FROM ubuntu:18.04

ARG repository="deb https://repo.clickhouse.tech/deb/stable/ main/"
ARG version=21.4.1.*
ARG version=21.5.1.*

RUN apt-get update && \
    apt-get install -y apt-transport-https dirmngr && \

@ -1,7 +1,7 @@
# docker build -t yandex/clickhouse-fasttest .
FROM ubuntu:20.04

ENV DEBIAN_FRONTEND=noninteractive LLVM_VERSION=10
ENV DEBIAN_FRONTEND=noninteractive LLVM_VERSION=11

RUN apt-get update \
    && apt-get install ca-certificates lsb-release wget gnupg apt-transport-https \
@ -43,20 +43,20 @@ RUN apt-get update \
        clang-tidy-${LLVM_VERSION} \
        cmake \
        curl \
        lsof \
        expect \
        fakeroot \
        git \
        gdb \
        git \
        gperf \
        lld-${LLVM_VERSION} \
        llvm-${LLVM_VERSION} \
        lsof \
        moreutils \
        ninja-build \
        psmisc \
        python3 \
        python3-pip \
        python3-lxml \
        python3-pip \
        python3-requests \
        python3-termcolor \
        rename \

@ -8,6 +8,9 @@ trap 'kill $(jobs -pr) ||:' EXIT
# that we can run the "everything else" stage from the cloned source.
stage=${stage:-}

# Compiler version, normally set by Dockerfile
export LLVM_VERSION=${LLVM_VERSION:-11}

# A variable to pass additional flags to CMake.
# Here we explicitly default it to nothing so that bash doesn't complain about
# it being undefined. Also read it as array so that we can pass an empty list
@ -124,22 +127,26 @@ continue

function clone_root
{
    git clone https://github.com/ClickHouse/ClickHouse.git -- "$FASTTEST_SOURCE" | ts '%Y-%m-%d %H:%M:%S' | tee "$FASTTEST_OUTPUT/clone_log.txt"
    git clone --depth 1 https://github.com/ClickHouse/ClickHouse.git -- "$FASTTEST_SOURCE" 2>&1 | ts '%Y-%m-%d %H:%M:%S' | tee "$FASTTEST_OUTPUT/clone_log.txt"

    (
        cd "$FASTTEST_SOURCE"
        if [ "$PULL_REQUEST_NUMBER" != "0" ]; then
            if git fetch origin "+refs/pull/$PULL_REQUEST_NUMBER/merge"; then
            if git fetch --depth 1 origin "+refs/pull/$PULL_REQUEST_NUMBER/merge"; then
                git checkout FETCH_HEAD
                echo 'Clonned merge head'
                echo "Checked out pull/$PULL_REQUEST_NUMBER/merge ($(git rev-parse FETCH_HEAD))"
            else
                git fetch origin "+refs/pull/$PULL_REQUEST_NUMBER/head"
                git fetch --depth 1 origin "+refs/pull/$PULL_REQUEST_NUMBER/head"
                git checkout "$COMMIT_SHA"
                echo 'Checked out to commit'
                echo "Checked out nominal SHA $COMMIT_SHA for PR $PULL_REQUEST_NUMBER"
            fi
        else
            if [ -v COMMIT_SHA ]; then
                git fetch --depth 1 origin "$COMMIT_SHA"
                git checkout "$COMMIT_SHA"
                echo "Checked out nominal SHA $COMMIT_SHA for master"
            else
                echo "Using default repository head $(git rev-parse HEAD)"
            fi
        fi
    )
@ -181,7 +188,7 @@ function clone_submodules
    )

    git submodule sync
    git submodule update --init --recursive "${SUBMODULES_TO_UPDATE[@]}"
    git submodule update --depth 1 --init --recursive "${SUBMODULES_TO_UPDATE[@]}"
    git submodule foreach git reset --hard
    git submodule foreach git checkout @ -f
    git submodule foreach git clean -xfd
@ -215,7 +222,7 @@ function run_cmake

    (
        cd "$FASTTEST_BUILD"
        cmake "$FASTTEST_SOURCE" -DCMAKE_CXX_COMPILER=clang++-10 -DCMAKE_C_COMPILER=clang-10 "${CMAKE_LIBS_CONFIG[@]}" "${FASTTEST_CMAKE_FLAGS[@]}" | ts '%Y-%m-%d %H:%M:%S' | tee "$FASTTEST_OUTPUT/cmake_log.txt"
        cmake "$FASTTEST_SOURCE" -DCMAKE_CXX_COMPILER="clang++-${LLVM_VERSION}" -DCMAKE_C_COMPILER="clang-${LLVM_VERSION}" "${CMAKE_LIBS_CONFIG[@]}" "${FASTTEST_CMAKE_FLAGS[@]}" 2>&1 | ts '%Y-%m-%d %H:%M:%S' | tee "$FASTTEST_OUTPUT/cmake_log.txt"
    )
}

@ -223,7 +230,7 @@ function build
{
    (
        cd "$FASTTEST_BUILD"
        time ninja clickhouse-bundle | ts '%Y-%m-%d %H:%M:%S' | tee "$FASTTEST_OUTPUT/build_log.txt"
        time ninja clickhouse-bundle 2>&1 | ts '%Y-%m-%d %H:%M:%S' | tee "$FASTTEST_OUTPUT/build_log.txt"
        if [ "$COPY_CLICKHOUSE_BINARY_TO_OUTPUT" -eq "1" ]; then
            cp programs/clickhouse "$FASTTEST_OUTPUT/clickhouse"
        fi
@ -420,7 +427,7 @@ case "$stage" in
    # See the compatibility hacks in `clone_root` stage above. Remove at the same time,
    # after Nov 1, 2020.
    cd "$FASTTEST_WORKSPACE"
    clone_submodules | ts '%Y-%m-%d %H:%M:%S' | tee "$FASTTEST_OUTPUT/submodule_log.txt"
    clone_submodules 2>&1 | ts '%Y-%m-%d %H:%M:%S' | tee "$FASTTEST_OUTPUT/submodule_log.txt"
    ;&
"run_cmake")
    run_cmake
@ -431,7 +438,7 @@ case "$stage" in
"configure")
    # The `install_log.txt` is also needed for compatibility with old CI task --
    # if there is no log, it will decide that build failed.
    configure | ts '%Y-%m-%d %H:%M:%S' | tee "$FASTTEST_OUTPUT/install_log.txt"
    configure 2>&1 | ts '%Y-%m-%d %H:%M:%S' | tee "$FASTTEST_OUTPUT/install_log.txt"
    ;&
"run_tests")
    run_tests

@ -69,11 +69,25 @@ function watchdog
    killall -9 clickhouse-client ||:
}

function filter_exists
{
    local path
    for path in "$@"; do
        if [ -e "$path" ]; then
            echo "$path"
        else
            echo "'$path' does not exists" >&2
        fi
    done
}

function fuzz
{
    # Obtain the list of newly added tests. They will be fuzzed in more extreme way than other tests.
    # Don't overwrite the NEW_TESTS_OPT so that it can be set from the environment.
    NEW_TESTS="$(grep -P 'tests/queries/0_stateless/.*\.sql' ci-changed-files.txt | sed -r -e 's!^!ch/!' | sort -R)"
    # ci-changed-files.txt contains also files that has been deleted/renamed, filter them out.
    NEW_TESTS="$(filter_exists $NEW_TESTS)"
    if [[ -n "$NEW_TESTS" ]]
    then
        NEW_TESTS_OPT="${NEW_TESTS_OPT:---interleave-queries-file ${NEW_TESTS}}"

@ -19,7 +19,8 @@ RUN apt-get update \
    tar \
    krb5-user \
    iproute2 \
    lsof
    lsof \
    g++
RUN rm -rf \
    /var/lib/apt/lists/* \
    /var/cache/debconf \

@ -31,6 +31,7 @@ RUN apt-get update \
        software-properties-common \
        libkrb5-dev \
        krb5-user \
        g++ \
    && rm -rf \
        /var/lib/apt/lists/* \
        /var/cache/debconf \

@ -0,0 +1,23 @@
version: '2.3'
services:
    mysql2:
        image: mysql:5.7
        restart: always
        environment:
            MYSQL_ROOT_PASSWORD: clickhouse
        ports:
            - 3348:3306
    mysql3:
        image: mysql:5.7
        restart: always
        environment:
            MYSQL_ROOT_PASSWORD: clickhouse
        ports:
            - 3388:3306
    mysql4:
        image: mysql:5.7
        restart: always
        environment:
            MYSQL_ROOT_PASSWORD: clickhouse
        ports:
            - 3368:3306
@ -11,10 +11,3 @@ services:
            default:
                aliases:
                    - postgre-sql.local
    postgres2:
        image: postgres
        restart: always
        environment:
            POSTGRES_PASSWORD: mysecretpassword
        ports:
            - 5441:5432

@ -0,0 +1,23 @@
version: '2.3'
services:
    postgres2:
        image: postgres
        restart: always
        environment:
            POSTGRES_PASSWORD: mysecretpassword
        ports:
            - 5421:5432
    postgres3:
        image: postgres
        restart: always
        environment:
            POSTGRES_PASSWORD: mysecretpassword
        ports:
            - 5441:5432
    postgres4:
        image: postgres
        restart: always
        environment:
            POSTGRES_PASSWORD: mysecretpassword
        ports:
            - 5461:5432
@ -21,6 +21,7 @@ export CLICKHOUSE_TESTS_SERVER_BIN_PATH=/clickhouse
export CLICKHOUSE_TESTS_CLIENT_BIN_PATH=/clickhouse
export CLICKHOUSE_TESTS_BASE_CONFIG_DIR=/clickhouse-config
export CLICKHOUSE_ODBC_BRIDGE_BINARY_PATH=/clickhouse-odbc-bridge
export CLICKHOUSE_LIBRARY_BRIDGE_BINARY_PATH=/clickhouse-library-bridge

export DOCKER_MYSQL_GOLANG_CLIENT_TAG=${DOCKER_MYSQL_GOLANG_CLIENT_TAG:=latest}
export DOCKER_MYSQL_JAVA_CLIENT_TAG=${DOCKER_MYSQL_JAVA_CLIENT_TAG:=latest}

docker/test/keeper-jepsen/Dockerfile (new file, 39 lines)
@ -0,0 +1,39 @@
# docker build -t yandex/clickhouse-keeper-jepsen-test .
FROM yandex/clickhouse-test-base

ENV DEBIAN_FRONTEND=noninteractive
ENV CLOJURE_VERSION=1.10.3.814

# arguments
ENV PR_TO_TEST=""
ENV SHA_TO_TEST=""

ENV NODES_USERNAME="root"
ENV NODES_PASSWORD=""
ENV TESTS_TO_RUN="30"
ENV TIME_LIMIT="30"


# volumes
ENV NODES_FILE_PATH="/nodes.txt"
ENV TEST_OUTPUT="/test_output"

RUN mkdir "/root/.ssh"
RUN touch "/root/.ssh/known_hosts"

# install java
RUN apt-get update && apt-get install default-jre default-jdk libjna-java libjna-jni ssh gnuplot graphviz --yes --no-install-recommends

# install clojure
RUN curl -O "https://download.clojure.org/install/linux-install-${CLOJURE_VERSION}.sh" && \
    chmod +x "linux-install-${CLOJURE_VERSION}.sh" && \
    bash "./linux-install-${CLOJURE_VERSION}.sh"

# install leiningen
RUN curl -O "https://raw.githubusercontent.com/technomancy/leiningen/stable/bin/lein" && \
    chmod +x ./lein && \
    mv ./lein /usr/bin

COPY run.sh /

CMD ["/bin/bash", "/run.sh"]

docker/test/keeper-jepsen/run.sh (new file, 22 lines)
@ -0,0 +1,22 @@
#!/usr/bin/env bash
set -euo pipefail


CLICKHOUSE_PACKAGE=${CLICKHOUSE_PACKAGE:="https://clickhouse-builds.s3.yandex.net/$PR_TO_TEST/$SHA_TO_TEST/clickhouse_build_check/clang-11_relwithdebuginfo_none_bundled_unsplitted_disable_False_binary/clickhouse"}
CLICKHOUSE_REPO_PATH=${CLICKHOUSE_REPO_PATH:=""}


if [ -z "$CLICKHOUSE_REPO_PATH" ]; then
    CLICKHOUSE_REPO_PATH=ch
    rm -rf ch ||:
    mkdir ch ||:
    wget -nv -nd -c "https://clickhouse-test-reports.s3.yandex.net/$PR_TO_TEST/$SHA_TO_TEST/repo/clickhouse_no_subs.tar.gz"
    tar -C ch --strip-components=1 -xf clickhouse_no_subs.tar.gz
    ls -lath ||:
fi

cd "$CLICKHOUSE_REPO_PATH/tests/jepsen.clickhouse-keeper"

(lein run test-all --nodes-file "$NODES_FILE_PATH" --username "$NODES_USERNAME" --logging-json --password "$NODES_PASSWORD" --time-limit "$TIME_LIMIT" --concurrency 50 -r 50 --snapshot-distance 100 --stale-log-gap 100 --reserved-log-items 10 --lightweight-run --clickhouse-source "$CLICKHOUSE_PACKAGE" -q --test-count "$TESTS_TO_RUN" || true) | tee "$TEST_OUTPUT/jepsen_run_all_tests.log"

mv store "$TEST_OUTPUT/"
@ -74,12 +74,17 @@ function run_tests()
        ADDITIONAL_OPTIONS+=('--order=random')
        ADDITIONAL_OPTIONS+=('--skip')
        ADDITIONAL_OPTIONS+=('00000_no_tests_to_skip')
        ADDITIONAL_OPTIONS+=('--jobs')
        ADDITIONAL_OPTIONS+=('4')
        # Note that flaky check must be ran in parallel, but for now we run
        # everything in parallel except DatabaseReplicated. See below.
    fi

    if [[ -n "$USE_DATABASE_REPLICATED" ]] && [[ "$USE_DATABASE_REPLICATED" -eq 1 ]]; then
        ADDITIONAL_OPTIONS+=('--replicated-database')
    else
        # Too many tests fail for DatabaseReplicated in parallel. All other
        # configurations are OK.
        ADDITIONAL_OPTIONS+=('--jobs')
        ADDITIONAL_OPTIONS+=('8')
    fi

    clickhouse-test --testname --shard --zookeeper --hung-check --print-time \

@ -126,7 +126,13 @@ Contribute all new information in English language. Other languages are translat

### Adding a New File

When adding a new file:
When you add a new file, it should end with a link like:

`[Original article](https://clickhouse.tech/docs/<path-to-the-page>) <!--hide-->`

and there should be **a new empty line** after it.

{## When adding a new file:

- Make symbolic links for all other languages. You can use the following commands:

@ -134,7 +140,7 @@ When adding a new file:
$ cd /ClickHouse/clone/directory/docs
$ ln -sr en/new/file.md lang/new/file.md
```

##}
<a name="adding-a-new-language"/>

### Adding a New Language
@ -195,8 +201,11 @@ Templates:

- [Function](_description_templates/template-function.md)
- [Setting](_description_templates/template-setting.md)
- [Server Setting](_description_templates/template-server-setting.md)
- [Database or Table engine](_description_templates/template-engine.md)
- [System table](_description_templates/template-system-table.md)
- [Data type](_description_templates/data-type.md)
- [Statement](_description_templates/statement.md)


<a name="how-to-build-docs"/>

@ -31,9 +31,10 @@ toc_title: Cloud

## Alibaba Cloud {#alibaba-cloud}

Alibaba Cloud Managed Service for ClickHouse [China Site](https://www.aliyun.com/product/clickhouse) (Will be available at international site at May, 2021) provides the following key features:
- Highly reliable cloud disk storage engine based on Alibaba Cloud Apsara distributed system
- Expand capacity on demand without manual data migration
Alibaba Cloud Managed Service for ClickHouse. [China Site](https://www.aliyun.com/product/clickhouse) (will be available at the international site in May 2021). Provides the following key features:

- Highly reliable cloud disk storage engine based on [Alibaba Cloud Apsara](https://www.alibabacloud.com/product/apsara-stack) distributed system
- Expand capacity on-demand without manual data migration
- Support single-node, single-replica, multi-node, and multi-replica architectures, and support hot and cold data tiering
- Support access allow-list, one-key recovery, multi-layer network security protection, cloud disk encryption
- Seamless integration with cloud log systems, databases, and data application tools

@ -5,43 +5,77 @@ toc_title: Build on Mac OS X

# How to Build ClickHouse on Mac OS X {#how-to-build-clickhouse-on-mac-os-x}

Build should work on Mac OS X 10.15 (Catalina).
Build should work on x86_64 (Intel) based macOS 10.15 (Catalina) and higher with recent Xcode's native AppleClang, or Homebrew's vanilla Clang or GCC compilers.

## Install Homebrew {#install-homebrew}

``` bash
$ /usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"
$ /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
```

## Install Xcode and Command Line Tools {#install-xcode-and-command-line-tools}

Install the latest [Xcode](https://apps.apple.com/am/app/xcode/id497799835?mt=12) from App Store.

Open it at least once to accept the end-user license agreement and automatically install the required components.

Then, make sure that the latest Command Line Tools are installed and selected in the system:

``` bash
$ sudo rm -rf /Library/Developer/CommandLineTools
$ sudo xcode-select --install
```

Reboot.

## Install Required Compilers, Tools, and Libraries {#install-required-compilers-tools-and-libraries}

``` bash
$ brew install cmake ninja libtool gettext llvm
$ brew update
$ brew install cmake ninja libtool gettext llvm gcc
```

## Checkout ClickHouse Sources {#checkout-clickhouse-sources}

``` bash
$ git clone --recursive git@github.com:ClickHouse/ClickHouse.git
```

or

``` bash
$ git clone --recursive https://github.com/ClickHouse/ClickHouse.git

$ cd ClickHouse
$ git clone --recursive git@github.com:ClickHouse/ClickHouse.git # or https://github.com/ClickHouse/ClickHouse.git
```

## Build ClickHouse {#build-clickhouse}

> Please note: ClickHouse doesn't support build with native Apple Clang compiler, we need use clang from LLVM.
To build using Xcode's native AppleClang compiler:

``` bash
$ cd ClickHouse
$ rm -rf build
$ mkdir build
$ cd build
$ cmake .. -DCMAKE_C_COMPILER=`brew --prefix llvm`/bin/clang -DCMAKE_CXX_COMPILER=`brew --prefix llvm`/bin/clang++ -DCMAKE_PREFIX_PATH=`brew --prefix llvm`
$ ninja
$ cmake -DCMAKE_BUILD_TYPE=RelWithDebInfo -DENABLE_JEMALLOC=OFF ..
$ cmake --build . --config RelWithDebInfo
$ cd ..
```

To build using Homebrew's vanilla Clang compiler:

``` bash
$ cd ClickHouse
$ rm -rf build
$ mkdir build
$ cd build
$ cmake -DCMAKE_C_COMPILER=$(brew --prefix llvm)/bin/clang -DCMAKE_CXX_COMPILER=$(brew --prefix llvm)/bin/clang++ -DCMAKE_BUILD_TYPE=RelWithDebInfo -DENABLE_JEMALLOC=OFF ..
$ cmake --build . --config RelWithDebInfo
$ cd ..
```

To build using Homebrew's vanilla GCC compiler:

``` bash
$ cd ClickHouse
$ rm -rf build
$ mkdir build
$ cd build
$ cmake -DCMAKE_C_COMPILER=$(brew --prefix gcc)/bin/gcc-10 -DCMAKE_CXX_COMPILER=$(brew --prefix gcc)/bin/g++-10 -DCMAKE_BUILD_TYPE=RelWithDebInfo -DENABLE_JEMALLOC=OFF ..
$ cmake --build . --config RelWithDebInfo
$ cd ..
```

@ -5,36 +5,87 @@ toc_title: Third-Party Libraries Used
|
||||
|
||||
# Third-Party Libraries Used {#third-party-libraries-used}
|
||||
|
||||
The list of third-party libraries can be obtained by the following query:

``` sql
SELECT library_name, license_type, license_path FROM system.licenses ORDER BY library_name COLLATE 'en'
```

[Example](https://gh-api.clickhouse.tech/play?user=play#U0VMRUNUIGxpYnJhcnlfbmFtZSwgbGljZW5zZV90eXBlLCBsaWNlbnNlX3BhdGggRlJPTSBzeXN0ZW0ubGljZW5zZXMgT1JERVIgQlkgbGlicmFyeV9uYW1lIENPTExBVEUgJ2VuJw==)

| library_name | license_type | license_path |
|:-|:-|:-|
| abseil-cpp | Apache | /contrib/abseil-cpp/LICENSE |
| AMQP-CPP | Apache | /contrib/AMQP-CPP/LICENSE |
| arrow | Apache | /contrib/arrow/LICENSE.txt |
| avro | Apache | /contrib/avro/LICENSE.txt |
| aws | Apache | /contrib/aws/LICENSE.txt |
| aws-c-common | Apache | /contrib/aws-c-common/LICENSE |
| aws-c-event-stream | Apache | /contrib/aws-c-event-stream/LICENSE |
| aws-checksums | Apache | /contrib/aws-checksums/LICENSE |
| base64 | BSD 2-clause | /contrib/base64/LICENSE |
| boost | Boost | /contrib/boost/LICENSE_1_0.txt |
| boringssl | BSD | /contrib/boringssl/LICENSE |
| brotli | MIT | /contrib/brotli/LICENSE |
| capnproto | MIT | /contrib/capnproto/LICENSE |
| cassandra | Apache | /contrib/cassandra/LICENSE.txt |
| cctz | Apache | /contrib/cctz/LICENSE.txt |
| cityhash102 | MIT | /contrib/cityhash102/COPYING |
| cppkafka | BSD 2-clause | /contrib/cppkafka/LICENSE |
| croaring | Apache | /contrib/croaring/LICENSE |
| curl | Apache | /contrib/curl/docs/LICENSE-MIXING.md |
| cyrus-sasl | BSD 2-clause | /contrib/cyrus-sasl/COPYING |
| double-conversion | BSD 3-clause | /contrib/double-conversion/LICENSE |
| dragonbox | Apache | /contrib/dragonbox/LICENSE-Apache2-LLVM |
| fast_float | Apache | /contrib/fast_float/LICENSE |
| fastops | MIT | /contrib/fastops/LICENSE |
| flatbuffers | Apache | /contrib/flatbuffers/LICENSE.txt |
| fmtlib | Unknown | /contrib/fmtlib/LICENSE.rst |
| gcem | Apache | /contrib/gcem/LICENSE |
| googletest | BSD 3-clause | /contrib/googletest/LICENSE |
| grpc | Apache | /contrib/grpc/LICENSE |
| h3 | Apache | /contrib/h3/LICENSE |
| hyperscan | Boost | /contrib/hyperscan/LICENSE |
| icu | Public Domain | /contrib/icu/icu4c/LICENSE |
| icudata | Public Domain | /contrib/icudata/LICENSE |
| jemalloc | BSD 2-clause | /contrib/jemalloc/COPYING |
| krb5 | MIT | /contrib/krb5/src/lib/gssapi/LICENSE |
| libc-headers | LGPL | /contrib/libc-headers/LICENSE |
| libcpuid | BSD 2-clause | /contrib/libcpuid/COPYING |
| libcxx | Apache | /contrib/libcxx/LICENSE.TXT |
| libcxxabi | Apache | /contrib/libcxxabi/LICENSE.TXT |
| libdivide | zLib | /contrib/libdivide/LICENSE.txt |
| libfarmhash | MIT | /contrib/libfarmhash/COPYING |
| libgsasl | LGPL | /contrib/libgsasl/LICENSE |
| libhdfs3 | Apache | /contrib/libhdfs3/LICENSE.txt |
| libmetrohash | Apache | /contrib/libmetrohash/LICENSE |
| libpq | Unknown | /contrib/libpq/COPYRIGHT |
| libpqxx | BSD 3-clause | /contrib/libpqxx/COPYING |
| librdkafka | MIT | /contrib/librdkafka/LICENSE.murmur2 |
| libunwind | Apache | /contrib/libunwind/LICENSE.TXT |
| libuv | BSD | /contrib/libuv/LICENSE |
| llvm | Apache | /contrib/llvm/llvm/LICENSE.TXT |
| lz4 | BSD | /contrib/lz4/LICENSE |
| mariadb-connector-c | LGPL | /contrib/mariadb-connector-c/COPYING.LIB |
| miniselect | Boost | /contrib/miniselect/LICENSE_1_0.txt |
| msgpack-c | Boost | /contrib/msgpack-c/LICENSE_1_0.txt |
| murmurhash | Public Domain | /contrib/murmurhash/LICENSE |
| NuRaft | Apache | /contrib/NuRaft/LICENSE |
| openldap | Unknown | /contrib/openldap/LICENSE |
| orc | Apache | /contrib/orc/LICENSE |
| poco | Boost | /contrib/poco/LICENSE |
| protobuf | BSD 3-clause | /contrib/protobuf/LICENSE |
| rapidjson | MIT | /contrib/rapidjson/bin/jsonschema/LICENSE |
| re2 | BSD 3-clause | /contrib/re2/LICENSE |
| replxx | BSD 3-clause | /contrib/replxx/LICENSE.md |
| rocksdb | BSD 3-clause | /contrib/rocksdb/LICENSE.leveldb |
| sentry-native | MIT | /contrib/sentry-native/LICENSE |
| simdjson | Apache | /contrib/simdjson/LICENSE |
| snappy | Public Domain | /contrib/snappy/COPYING |
| sparsehash-c11 | BSD 3-clause | /contrib/sparsehash-c11/LICENSE |
| stats | Apache | /contrib/stats/LICENSE |
| thrift | Apache | /contrib/thrift/LICENSE |
| unixodbc | LGPL | /contrib/unixodbc/COPYING |
| xz | Public Domain | /contrib/xz/COPYING |
| zlib-ng | zLib | /contrib/zlib-ng/LICENSE.md |
| zstd | BSD | /contrib/zstd/LICENSE |

@ -3,15 +3,52 @@ toc_priority: 32
toc_title: Atomic
---

# Atomic {#atomic}

It supports non-blocking [DROP TABLE](#drop-detach-table) and [RENAME TABLE](#rename-table) queries and atomic [EXCHANGE TABLES t1 AND t2](#exchange-tables) queries. The `Atomic` database engine is used by default.

## Creating a Database {#creating-a-database}

``` sql
CREATE DATABASE test [ENGINE = Atomic];
```

## Specifics and recommendations {#specifics-and-recommendations}

### Table UUID {#table-uuid}

All tables in an `Atomic` database have a persistent [UUID](../../sql-reference/data-types/uuid.md) and store their data in the directory `/clickhouse_path/store/xxx/xxxyyyyy-yyyy-yyyy-yyyy-yyyyyyyyyyyy/`, where `xxxyyyyy-yyyy-yyyy-yyyy-yyyyyyyyyyyy` is the UUID of the table.
Usually, the UUID is generated automatically, but the user can also specify it explicitly when creating the table (this is not recommended). To display the `SHOW CREATE` query with the UUID you can use the setting [show_table_uuid_in_table_create_query_if_not_nil](../../operations/settings/settings.md#show_table_uuid_in_table_create_query_if_not_nil). For example:

```sql
CREATE TABLE name UUID '28f1c61c-2970-457a-bffe-454156ddcfef' (n UInt64) ENGINE = ...;
```

### RENAME TABLE {#rename-table}

`RENAME` queries are performed without changing the UUID or moving table data. These queries do not wait for the completion of queries using the table and are executed instantly.
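
For example, a rename within the same `Atomic` database completes instantly; a minimal sketch (the table names are hypothetical):

``` sql
RENAME TABLE test.events TO test.events_v2;
```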
### DROP/DETACH TABLE {#drop-detach-table}

On `DROP TABLE`, no data is removed immediately: the `Atomic` database just marks the table as dropped by moving its metadata to `/clickhouse_path/metadata_dropped/` and notifies a background thread. The delay before the final table data deletion is specified by the [database_atomic_delay_before_drop_table_sec](../../operations/server-configuration-parameters/settings.md#database_atomic_delay_before_drop_table_sec) setting.
You can specify synchronous mode using the `SYNC` modifier, or use the [database_atomic_wait_for_drop_and_detach_synchronously](../../operations/settings/settings.md#database_atomic_wait_for_drop_and_detach_synchronously) setting to apply it to all such queries. In this case, `DROP` waits for running `SELECT`, `INSERT` and other queries which are using the table to finish. The table will actually be removed only when it is no longer in use.
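
A minimal sketch of a synchronous drop (the table name is hypothetical):

``` sql
DROP TABLE test.events_v2 SYNC;
```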
### EXCHANGE TABLES {#exchange-tables}

The `EXCHANGE` query swaps tables atomically. So instead of this non-atomic operation:

```sql
RENAME TABLE new_table TO tmp, old_table TO new_table, tmp TO old_table;
```

you can use one atomic query:

``` sql
EXCHANGE TABLES new_table AND old_table;
```

### ReplicatedMergeTree in Atomic Database {#replicatedmergetree-in-atomic-database}

For [ReplicatedMergeTree](../table-engines/mergetree-family/replication.md#table_engines-replication) tables, it is recommended not to specify the engine parameters (the path in ZooKeeper and the replica name). In this case, the configuration parameters [default_replica_path](../../operations/server-configuration-parameters/settings.md#default_replica_path) and [default_replica_name](../../operations/server-configuration-parameters/settings.md#default_replica_name) are used. If you want to specify the engine parameters explicitly, it is recommended to use the `{uuid}` macros. This ensures that unique paths are automatically generated for each table in ZooKeeper.
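
A minimal sketch of such an explicit definition with the `{uuid}` macros (the table name and schema are hypothetical):

``` sql
CREATE TABLE test.replicated_events (n UInt64)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{uuid}/{shard}', '{replica}')
ORDER BY n;
```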
## See Also

- [system.databases](../../operations/system-tables/databases.md) system table

@ -18,4 +18,8 @@ You can also use the following database engines:

- [Lazy](../../engines/database-engines/lazy.md)

- [Atomic](../../engines/database-engines/atomic.md)

- [PostgreSQL](../../engines/database-engines/postgresql.md)

[Original article](https://clickhouse.tech/docs/en/database_engines/) <!--hide-->

138
docs/en/engines/database-engines/postgresql.md
Normal file
@ -0,0 +1,138 @@

---
toc_priority: 35
toc_title: PostgreSQL
---

# PostgreSQL {#postgresql}

Allows connecting to databases on a remote [PostgreSQL](https://www.postgresql.org) server. Supports read and write operations (`SELECT` and `INSERT` queries) to exchange data between ClickHouse and PostgreSQL.

Gives real-time access to the table list and table structure of the remote PostgreSQL server with the help of `SHOW TABLES` and `DESCRIBE TABLE` queries.

Supports table structure modifications (`ALTER TABLE ... ADD|DROP COLUMN`). If the `use_table_cache` parameter (see the Engine Parameters below) is set to `1`, the table structure is cached and not checked for modifications, but it can be updated with `DETACH` and `ATTACH` queries.

## Creating a Database {#creating-a-database}

``` sql
CREATE DATABASE test_database
ENGINE = PostgreSQL('host:port', 'database', 'user', 'password'[, `use_table_cache`]);
```

**Engine Parameters**

- `host:port` — PostgreSQL server address.
- `database` — Remote database name.
- `user` — PostgreSQL user.
- `password` — User password.
- `use_table_cache` — Defines if the database table structure is cached or not. Optional. Default value: `0`.

## Data Types Support {#data_types-support}

| PostgreSQL       | ClickHouse                                                    |
|------------------|---------------------------------------------------------------|
| DATE             | [Date](../../sql-reference/data-types/date.md)                |
| TIMESTAMP        | [DateTime](../../sql-reference/data-types/datetime.md)        |
| REAL             | [Float32](../../sql-reference/data-types/float.md)            |
| DOUBLE           | [Float64](../../sql-reference/data-types/float.md)            |
| DECIMAL, NUMERIC | [Decimal](../../sql-reference/data-types/decimal.md)          |
| SMALLINT         | [Int16](../../sql-reference/data-types/int-uint.md)           |
| INTEGER          | Nullable([Int32](../../sql-reference/data-types/int-uint.md)) |
| BIGINT           | [Int64](../../sql-reference/data-types/int-uint.md)           |
| SERIAL           | [UInt32](../../sql-reference/data-types/int-uint.md)          |
| BIGSERIAL        | [UInt64](../../sql-reference/data-types/int-uint.md)          |
| TEXT, CHAR       | [String](../../sql-reference/data-types/string.md)            |
| ARRAY            | [Array](../../sql-reference/data-types/array.md)              |

## Examples of Use {#examples-of-use}

Database in ClickHouse, exchanging data with the PostgreSQL server:

``` sql
CREATE DATABASE test_database
ENGINE = PostgreSQL('postgres1:5432', 'test_database', 'postgres', 'mysecretpassword', 1);
```

``` sql
SHOW DATABASES;
```

``` text
┌─name──────────┐
│ default       │
│ test_database │
│ system        │
└───────────────┘
```

``` sql
SHOW TABLES FROM test_database;
```

``` text
┌─name───────┐
│ test_table │
└────────────┘
```

Reading data from the PostgreSQL table:

``` sql
SELECT * FROM test_database.test_table;
```

``` text
┌─id─┬─value─┐
│  1 │     2 │
└────┴───────┘
```

Writing data to the PostgreSQL table:

``` sql
INSERT INTO test_database.test_table VALUES (3,4);
SELECT * FROM test_database.test_table;
```

``` text
┌─int_id─┬─value─┐
│      1 │     2 │
│      3 │     4 │
└────────┴───────┘
```

Suppose the table structure was modified in PostgreSQL:

``` sql
postgres=# ALTER TABLE test_table ADD COLUMN data Text
```

As the `use_table_cache` parameter was set to `1` when the database was created, the table structure in ClickHouse was cached and therefore not modified:

``` sql
DESCRIBE TABLE test_database.test_table;
```
``` text
┌─name───┬─type──────────────┐
│ id     │ Nullable(Integer) │
│ value  │ Nullable(Integer) │
└────────┴───────────────────┘
```

After detaching the table and attaching it again, the structure was updated:

``` sql
DETACH TABLE test_database.test_table;
ATTACH TABLE test_database.test_table;
DESCRIBE TABLE test_database.test_table;
```
``` text
┌─name───┬─type──────────────┐
│ id     │ Nullable(Integer) │
│ value  │ Nullable(Integer) │
│ data   │ Nullable(String)  │
└────────┴───────────────────┘
```

[Original article](https://clickhouse.tech/docs/en/database-engines/postgresql/) <!--hide-->

@ -47,12 +47,17 @@ Engines for communicating with other data storage and processing systems.

Engines in the family:

- [ODBC](../../engines/table-engines/integrations/odbc.md)
- [JDBC](../../engines/table-engines/integrations/jdbc.md)
- [MySQL](../../engines/table-engines/integrations/mysql.md)
- [MongoDB](../../engines/table-engines/integrations/mongodb.md)
- [HDFS](../../engines/table-engines/integrations/hdfs.md)
- [S3](../../engines/table-engines/integrations/s3.md)
- [Kafka](../../engines/table-engines/integrations/kafka.md)
- [EmbeddedRocksDB](../../engines/table-engines/integrations/embedded-rocksdb.md)
- [RabbitMQ](../../engines/table-engines/integrations/rabbitmq.md)
- [PostgreSQL](../../engines/table-engines/integrations/postgresql.md)

### Special Engines {#special-engines}

@ -1,5 +1,5 @@
---
toc_priority: 9
toc_title: EmbeddedRocksDB
---

@ -1,5 +1,5 @@
---
toc_priority: 6
toc_title: HDFS
---

@ -1,6 +1,6 @@
---
toc_folder_title: Integrations
toc_priority: 1
---

# Table Engines for Integrations {#table-engines-for-integrations}
@ -19,5 +19,3 @@ List of supported integrations:
- [EmbeddedRocksDB](../../../engines/table-engines/integrations/embedded-rocksdb.md)
- [RabbitMQ](../../../engines/table-engines/integrations/rabbitmq.md)
- [PostgreSQL](../../../engines/table-engines/integrations/postgresql.md)

@ -1,5 +1,5 @@
---
toc_priority: 3
toc_title: JDBC
---

@ -1,5 +1,5 @@
---
toc_priority: 8
toc_title: Kafka
---

@ -1,5 +1,5 @@
---
toc_priority: 5
toc_title: MongoDB
---

@ -1,5 +1,5 @@
---
toc_priority: 4
toc_title: MySQL
---

@ -1,5 +1,5 @@
---
toc_priority: 2
toc_title: ODBC
---

@ -1,11 +1,11 @@
---
toc_priority: 11
toc_title: PostgreSQL
---

# PostgreSQL {#postgresql}

The PostgreSQL engine allows you to perform `SELECT` and `INSERT` queries on data that is stored on a remote PostgreSQL server.

## Creating a Table {#creating-a-table}

@ -15,7 +15,7 @@ CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]
    name1 [type1] [DEFAULT|MATERIALIZED|ALIAS expr1] [TTL expr1],
    name2 [type2] [DEFAULT|MATERIALIZED|ALIAS expr2] [TTL expr2],
    ...
) ENGINE = PostgreSQL('host:port', 'database', 'table', 'user', 'password'[, `schema`]);
```

See a detailed description of the [CREATE TABLE](../../../sql-reference/statements/create/table.md#create-table-query) query.

@ -29,25 +29,51 @@ The table structure can differ from the original PostgreSQL table structure:

**Engine Parameters**

- `host:port` — PostgreSQL server address.
- `database` — Remote database name.
- `table` — Remote table name.
- `user` — PostgreSQL user.
- `password` — User password.
- `schema` — Non-default table schema. Optional.

## Implementation Details {#implementation-details}

`SELECT` queries on PostgreSQL side run as `COPY (SELECT ...) TO STDOUT` inside a read-only PostgreSQL transaction, with a commit after each `SELECT` query.

Simple `WHERE` clauses such as `=`, `!=`, `>`, `>=`, `<`, `<=`, and `IN` are executed on the PostgreSQL server.

All joins, aggregations, sorting, `IN [ array ]` conditions and the `LIMIT` sampling constraint are executed in ClickHouse only after the query to PostgreSQL finishes.
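
For instance, in a query like the following (a sketch using the `postgresql_table` from the usage example below), the `str = 'test'` condition can be executed on the PostgreSQL side, while `count()` is computed in ClickHouse on the fetched rows:

``` sql
SELECT count() FROM postgresql_table WHERE str = 'test';
```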

`INSERT` queries on PostgreSQL side run as `COPY "table_name" (field1, field2, ... fieldN) FROM STDIN` inside a PostgreSQL transaction, with auto-commit after each `INSERT` statement.

PostgreSQL `Array` types are converted into ClickHouse arrays.

!!! info "Note"
    Be careful: in PostgreSQL, array data created as `type_name[]` may contain multi-dimensional arrays of different dimensions in different table rows of the same column, while in ClickHouse all table rows of the same column may only contain multidimensional arrays with the same number of dimensions.

Replica priority is supported for the PostgreSQL dictionary source. The bigger the number in the map, the lower the priority. The highest priority is `0`.

In the example below replica `example01-1` has the highest priority:

```xml
<source>
    <postgresql>
        <port>5432</port>
        <user>clickhouse</user>
        <password>qwerty</password>
        <replica>
            <host>example01-1</host>
            <priority>1</priority>
        </replica>
        <replica>
            <host>example01-2</host>
            <priority>2</priority>
        </replica>
        <db>db_name</db>
        <table>table_name</table>
        <where>id=10</where>
        <invalidate_query>SQL_QUERY</invalidate_query>
    </postgresql>
</source>
```

## Usage Example {#usage-example}

@ -64,14 +90,14 @@ PRIMARY KEY (int_id));

CREATE TABLE

postgres=# INSERT INTO test (int_id, str, "float") VALUES (1,'test',2);
INSERT 0 1

postgresql> SELECT * FROM test;
 int_id | int_nullable | float | str  | float_nullable
--------+--------------+-------+------+----------------
      1 |              |     2 | test |
(1 row)
```

Table in ClickHouse, retrieving data from the PostgreSQL table created above:

@ -87,20 +113,33 @@ ENGINE = PostgreSQL('localhost:5432', 'public', 'test', 'postges_user', 'postgre
```

``` sql
SELECT * FROM postgresql_table WHERE str IN ('test');
```

``` text
┌─float_nullable─┬─str──┬─int_id─┐
│           ᴺᵁᴸᴸ │ test │      1 │
└────────────────┴──────┴────────┘
1 rows in set. Elapsed: 0.019 sec.
```

Using Non-default Schema:

```text
postgres=# CREATE SCHEMA "nice.schema";
postgres=# CREATE TABLE "nice.schema"."nice.table" (a integer);
postgres=# INSERT INTO "nice.schema"."nice.table" SELECT i FROM generate_series(0, 99) as t(i)
```

```sql
CREATE TABLE pg_table_schema_with_dots (a UInt32)
        ENGINE PostgreSQL('localhost:5432', 'clickhouse', 'nice.table', 'postgresql_user', 'password', 'nice.schema');
```

**See Also**

- [The `postgresql` table function](../../../sql-reference/table-functions/postgresql.md)
- [Using PostgreSQL as a source of external dictionary](../../../sql-reference/dictionaries/external-dictionaries/external-dicts-dict-sources.md#dicts-external_dicts_dict_sources-postgresql)

[Original article](https://clickhouse.tech/docs/en/engines/table-engines/integrations/postgresql/) <!--hide-->

@ -1,5 +1,5 @@
---
toc_priority: 10
toc_title: RabbitMQ
---

@ -1,5 +1,5 @@
---
toc_priority: 7
toc_title: S3
---

@ -3,7 +3,7 @@ toc_priority: 35
toc_title: AggregatingMergeTree
---

# AggregatingMergeTree {#aggregatingmergetree}

The engine inherits from [MergeTree](../../../engines/table-engines/mergetree-family/mergetree.md#table_engines-mergetree), altering the logic for data parts merging. ClickHouse replaces all rows with the same primary key (or more accurately, with the same [sorting key](../../../engines/table-engines/mergetree-family/mergetree.md)) with a single row (within one data part) that stores a combination of states of aggregate functions.

@ -23,6 +23,7 @@ toc_title: Client Libraries
- [SeasClick C++ client](https://github.com/SeasX/SeasClick)
- [one-ck](https://github.com/lizhichao/one-ck)
- [glushkovds/phpclickhouse-laravel](https://packagist.org/packages/glushkovds/phpclickhouse-laravel)
- [kolya7k ClickHouse PHP extension](https://github.com/kolya7k/clickhouse-php)
- Go
    - [clickhouse](https://github.com/kshvakov/clickhouse/)
    - [go-clickhouse](https://github.com/roistat/go-clickhouse)

5
docs/en/interfaces/third-party/gui.md
vendored
@ -184,4 +184,9 @@ SeekTable is [free](https://www.seektable.com/help/cloud-pricing) for personal/i

[How to configure ClickHouse connection in SeekTable.](https://www.seektable.com/help/clickhouse-pivot-table)

### Chadmin {#chadmin}

[Chadmin](https://github.com/bun4uk/chadmin) is a simple UI where you can visualize the currently running queries on your ClickHouse cluster, see information about them, and kill them if needed.

[Original article](https://clickhouse.tech/docs/en/interfaces/third-party/gui/) <!--hide-->

@ -13,6 +13,7 @@ toc_title: Adopters
| <a href="https://2gis.ru" class="favicon">2gis</a> | Maps | Monitoring | — | — | [Talk in Russian, July 2019](https://youtu.be/58sPkXfq6nw) |
| <a href="https://getadmiral.com/" class="favicon">Admiral</a> | Martech | Engagement Management | — | — | [Webinar Slides, June 2020](https://altinity.com/presentations/2020/06/16/big-data-in-real-time-how-clickhouse-powers-admirals-visitor-relationships-for-publishers) |
| <a href="http://www.adscribe.tv/" class="favicon">AdScribe</a> | Ads | TV Analytics | — | — | [A quote from CTO](https://altinity.com/24x7-support/) |
| <a href="https://ahrefs.com/" class="favicon">Ahrefs</a> | SEO | Analytics | — | — | [Job listing](https://ahrefs.com/jobs/data-scientist-search) |
| <a href="https://cn.aliyun.com/" class="favicon">Alibaba Cloud</a> | Cloud | Managed Service | — | — | [Official Website](https://help.aliyun.com/product/144466.html) |
| <a href="https://alohabrowser.com/" class="favicon">Aloha Browser</a> | Mobile App | Browser backend | — | — | [Slides in Russian, May 2019](https://presentations.clickhouse.tech/meetup22/aloha.pdf) |
| <a href="https://altinity.com/" class="favicon">Altinity</a> | Cloud, SaaS | Main product | — | — | [Official Website](https://altinity.com/) |

@ -100,6 +100,11 @@ Default value: `1073741824` (1 GB).
    <size_limit>1073741824</size_limit>
</core_dump>
```
## database_atomic_delay_before_drop_table_sec {#database_atomic_delay_before_drop_table_sec}

Sets the delay in seconds before table data is removed. If the query has the `SYNC` modifier, this setting is ignored.

Default value: `480` (8 minutes).

## default_database {#default-database}

@ -125,6 +130,25 @@ Settings profiles are located in the file specified in the parameter `user_confi
<default_profile>default</default_profile>
```

## default_replica_path {#default_replica_path}

The path to the table in ZooKeeper.

**Example**

``` xml
<default_replica_path>/clickhouse/tables/{uuid}/{shard}</default_replica_path>
```
## default_replica_name {#default_replica_name}

The replica name in ZooKeeper.

**Example**

``` xml
<default_replica_name>{replica}</default_replica_name>
```

## dictionaries_config {#server_configuration_parameters-dictionaries_config}

The path to the config file for external dictionaries.

@ -321,7 +345,8 @@ Similar to `interserver_http_host`, except that this hostname can be used by oth
The username and password used to authenticate during [replication](../../engines/table-engines/mergetree-family/replication.md) with the Replicated\* engines. These credentials are used only for communication between replicas and are unrelated to credentials for ClickHouse clients. The server checks these credentials for connecting replicas and uses the same credentials when connecting to other replicas, so they should be set the same for all replicas in a cluster.
By default, the authentication is not used.

!!! note "Note"
    These credentials are common for replication through `HTTP` and `HTTPS`.

This section contains the following parameters:

@ -2787,6 +2787,28 @@ Possible values:

Default value: `0`.

## database_atomic_wait_for_drop_and_detach_synchronously {#database_atomic_wait_for_drop_and_detach_synchronously}

Adds the `SYNC` modifier to all `DROP` and `DETACH` queries.

Possible values:

- 0 — Queries will be executed with delay.
- 1 — Queries will be executed without delay.

Default value: `0`.
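
A minimal usage sketch (the table name is hypothetical):

``` sql
SET database_atomic_wait_for_drop_and_detach_synchronously = 1;
DROP TABLE test.events; -- with the setting enabled, this waits as if the SYNC modifier were specified
```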

## show_table_uuid_in_table_create_query_if_not_nil {#show_table_uuid_in_table_create_query_if_not_nil}

Sets the `SHOW TABLE` query display.

Possible values:

- 0 — The query will be displayed without table UUID.
- 1 — The query will be displayed with table UUID.

Default value: `0`.
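
A minimal usage sketch (the table name is hypothetical):

``` sql
SET show_table_uuid_in_table_create_query_if_not_nil = 1;
SHOW CREATE TABLE test.events; -- the displayed statement now includes the table UUID
```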

## allow_experimental_live_view {#allow-experimental-live-view}

Allows creation of experimental [live views](../../sql-reference/statements/create/view.md#live-view).
@ -2822,4 +2844,15 @@ Sets the interval in seconds after which periodically refreshed [live view](../.

Default value: `60`.

## check_query_single_value_result {#check_query_single_value_result}

Defines the level of detail for the [CHECK TABLE](../../sql-reference/statements/check-table.md#checking-mergetree-tables) query result for `MergeTree` family engines.

Possible values:

- 0 — the query shows a check status for every individual data part of a table.
- 1 — the query shows the general table check status.

Default value: `0`.
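
A minimal usage sketch (the table name is hypothetical):

``` sql
SET check_query_single_value_result = 0;
CHECK TABLE test.events; -- returns a check status row for every data part instead of a single summary row
```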

[Original article](https://clickhouse.tech/docs/en/operations/settings/settings/) <!-- hide -->

@ -14,7 +14,17 @@ Columns:

- `node_name` ([String](../../sql-reference/data-types/string.md)) — Node name in ZooKeeper.

- `type` ([String](../../sql-reference/data-types/string.md)) — Type of the task in the queue, one of:
    - `GET_PART` - Get the part from another replica.
    - `ATTACH_PART` - Attach the part, possibly from our own replica (if found in the `detached` folder). You may think of it as a `GET_PART` with some optimisations as they're nearly identical.
    - `MERGE_PARTS` - Merge the parts.
    - `DROP_RANGE` - Delete the parts in the specified partition in the specified number range.
    - `CLEAR_COLUMN` - NOTE: Deprecated. Drop specific column from specified partition.
    - `CLEAR_INDEX` - NOTE: Deprecated. Drop specific index from specified partition.
    - `REPLACE_RANGE` - Drop a certain range of parts and replace them by new ones.
    - `MUTATE_PART` - Apply one or several mutations to the part.
    - `ALTER_METADATA` - Apply alter modification according to global /metadata and /columns paths.

- `create_time` ([Datetime](../../sql-reference/data-types/datetime.md)) — Date and time when the task was submitted for execution.

@ -20,10 +20,12 @@ Columns:

    When connecting to the server by `clickhouse-client`, you see the string similar to `Connected to ClickHouse server version 19.18.1 revision 54429.`. This field contains the `revision`, but not the `version` of a server.

- `trace_type` ([Enum8](../../sql-reference/data-types/enum.md)) — Trace type:

    - `Real` represents collecting stack traces by wall-clock time.
    - `CPU` represents collecting stack traces by CPU time.
    - `Memory` represents collecting allocations and deallocations when memory allocation exceeds the subsequent watermark.
    - `MemorySample` represents collecting random allocations and deallocations.

- `thread_number` ([UInt32](../../sql-reference/data-types/int-uint.md)) — Thread identifier.

@ -15,7 +15,8 @@ $ sudo service clickhouse-server restart

If you installed ClickHouse using something other than the recommended `deb` packages, use the appropriate update method.

!!! note "Note"
    You can update multiple servers at once as long as there is no moment when all replicas of one shard are offline.

To upgrade an older version of ClickHouse to a specific version:

@ -31,4 +32,3 @@ $ sudo service clickhouse-server restart

@ -26,7 +26,7 @@ Function:

- Uses the HyperLogLog algorithm to approximate the number of different argument values.

    2^12 5-bit cells are used. The size of the state is slightly more than 2.5 KB (2^12 cells × 5 bits = 2560 bytes for the cells alone). The result is not very accurate (up to ~10% error) for small data sets (<10K elements). However, the result is fairly accurate for high-cardinality data sets (10K-100M), with a maximum error of ~1.6%. Starting from 100M, the estimation error increases, and the function will return very inaccurate results for data sets with extremely high cardinality (1B+ elements).

- Provides a deterministic result (it doesn’t depend on the query processing order).

@ -9,7 +9,7 @@ Allows to store an instant in time, that can be expressed as a calendar date and

Tick size (precision): 10<sup>-precision</sup> seconds

**Syntax:**

``` sql
DateTime64(precision, [timezone])
@ -17,9 +17,11 @@ DateTime64(precision, [timezone])

Internally, stores data as a number of ‘ticks’ since epoch start (1970-01-01 00:00:00 UTC) as Int64. The tick resolution is determined by the precision parameter. Additionally, the `DateTime64` type can store a time zone that is the same for the entire column, which affects how the `DateTime64` values are displayed in text format and how the values specified as strings are parsed (‘2020-01-01 05:00:01.000’). The time zone is not stored in the rows of the table (or in the resultset), but in the column metadata. See details in [DateTime](../../sql-reference/data-types/datetime.md).

The supported range is from January 1, 1925 to December 31, 2283.

## Examples {#examples}

1. Creating a table with `DateTime64`-type column and inserting data into it:

``` sql
CREATE TABLE dt
@ -27,15 +29,15 @@ CREATE TABLE dt
    `timestamp` DateTime64(3, 'Europe/Moscow'),
    `event_id` UInt8
)
ENGINE = TinyLog;
```

``` sql
INSERT INTO dt Values (1546300800000, 1), ('2019-01-01 00:00:00', 2);
```

``` sql
SELECT * FROM dt;
```

``` text
@ -45,13 +47,13 @@ SELECT * FROM dt
└─────────────────────────┴──────────┘
```

- When inserting datetime as an integer, it is treated as an appropriately scaled Unix Timestamp (UTC). `1546300800000` (with precision 3) represents `'2019-01-01 00:00:00'` UTC. However, as the `timestamp` column has the `Europe/Moscow` (UTC+3) timezone specified, when outputting as a string the value will be shown as `'2019-01-01 03:00:00'`.
- When inserting a string value as datetime, it is treated as being in the column timezone. `'2019-01-01 00:00:00'` will be treated as being in the `Europe/Moscow` timezone and stored as `1546290000000`.

2. Filtering on `DateTime64` values

``` sql
SELECT * FROM dt WHERE timestamp = toDateTime64('2019-01-01 00:00:00', 3, 'Europe/Moscow');
```

``` text
@ -60,12 +62,12 @@ SELECT * FROM dt WHERE timestamp = toDateTime64('2019-01-01 00:00:00', 3, 'Europ
└─────────────────────────┴──────────┘
```

Unlike `DateTime`, `DateTime64` values are not converted from `String` automatically.

3. Getting a time zone for a `DateTime64`-type value:

``` sql
SELECT toDateTime64(now(), 3, 'Europe/Moscow') AS column, toTypeName(column) AS x;
```

``` text
@ -74,13 +76,13 @@ SELECT toDateTime64(now(), 3, 'Europe/Moscow') AS column, toTypeName(column) AS
└─────────────────────────┴────────────────────────────────┘
```

4. Timezone conversion

``` sql
SELECT
toDateTime64(timestamp, 3, 'Europe/London') as lon_time,
toDateTime64(timestamp, 3, 'Europe/Moscow') as mos_time
FROM dt;
```

``` text
@ -90,7 +92,7 @@ FROM dt
└─────────────────────────┴─────────────────────────┘
```

**See Also**

- [Type conversion functions](../../sql-reference/functions/type-conversion-functions.md)
- [Functions for working with dates and times](../../sql-reference/functions/date-time-functions.md)

@ -69,6 +69,8 @@ Types of sources (`source_type`):
- [ClickHouse](#dicts-external_dicts_dict_sources-clickhouse)
- [MongoDB](#dicts-external_dicts_dict_sources-mongodb)
- [Redis](#dicts-external_dicts_dict_sources-redis)
- [Cassandra](#dicts-external_dicts_dict_sources-cassandra)
- [PostgreSQL](#dicts-external_dicts_dict_sources-postgresql)

## Local File {#dicts-external_dicts_dict_sources-local_file}

@ -159,14 +159,14 @@ Configuration fields:

| Tag | Description | Required |
|------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------|----------|
| `name` | Column name. | Yes |
| `type` | ClickHouse data type.<br/>ClickHouse tries to cast value from dictionary to the specified data type. For example, for MySQL, the field might be `TEXT`, `VARCHAR`, or `BLOB` in the MySQL source table, but it can be uploaded as `String` in ClickHouse.<br/>[Nullable](../../../sql-reference/data-types/nullable.md) is currently supported for [Flat](external-dicts-dict-layout.md#flat), [Hashed](external-dicts-dict-layout.md#dicts-external_dicts_dict_layout-hashed), [ComplexKeyHashed](external-dicts-dict-layout.md#complex-key-hashed), [Direct](external-dicts-dict-layout.md#direct), [ComplexKeyDirect](external-dicts-dict-layout.md#complex-key-direct), [RangeHashed](external-dicts-dict-layout.md#range-hashed), [Polygon](external-dicts-dict-polygon.md) dictionaries. In [Cache](external-dicts-dict-layout.md#cache), [ComplexKeyCache](external-dicts-dict-layout.md#complex-key-cache), [SSDCache](external-dicts-dict-layout.md#ssd-cache), [SSDComplexKeyCache](external-dicts-dict-layout.md#complex-key-ssd-cache), [IPTrie](external-dicts-dict-layout.md#ip-trie) dictionaries `Nullable` types are not supported. | Yes |
| `null_value` | Default value for a non-existing element.<br/>In the example, it is an empty string. [NULL](../../syntax.md#null-literal) value can be used only for the `Nullable` types (see the previous line with types description). | Yes |
| `expression` | [Expression](../../../sql-reference/syntax.md#syntax-expressions) that ClickHouse executes on the value.<br/>The expression can be a column name in the remote SQL database. Thus, you can use it to create an alias for the remote column.<br/><br/>Default value: no expression. | No |
| <a name="hierarchical-dict-attr"></a> `hierarchical` | If `true`, the attribute contains the value of a parent key for the current key. See [Hierarchical Dictionaries](../../../sql-reference/dictionaries/external-dictionaries/external-dicts-dict-hierarchical.md).<br/><br/>Default value: `false`. | No |
| `injective` | Flag that shows whether the `id -> attribute` image is [injective](https://en.wikipedia.org/wiki/Injective_function).<br/>If `true`, ClickHouse can automatically place after the `GROUP BY` clause the requests to dictionaries with injection. Usually it significantly reduces the amount of such requests.<br/><br/>Default value: `false`. | No |
| `is_object_id` | Flag that shows whether the query is executed for a MongoDB document by `ObjectID`.<br/><br/>Default value: `false`. | No |

**See Also**

- [Functions for working with external dictionaries](../../../sql-reference/functions/ext-dict-functions.md).

@ -10,8 +10,6 @@ A dictionary is a mapping (`key -> attributes`) that is convenient for various t

ClickHouse supports special functions for working with dictionaries that can be used in queries. It is easier and more efficient to use dictionaries with functions than a `JOIN` with reference tables.

ClickHouse supports:

- [Built-in dictionaries](../../sql-reference/dictionaries/internal-dicts.md#internal_dicts) with a specific [set of functions](../../sql-reference/functions/ym-dict-functions.md).

@ -245,7 +245,7 @@ Elements set to `NULL` are handled as normal values.

Returns the number of elements in the `arr` array for which `func` returns something other than 0. If `func` is not specified, it returns the number of non-zero elements in the array.

Note that the `arrayCount` is a [higher-order function](../../sql-reference/functions/index.md#higher-order-functions). You can pass a lambda function to it as the first argument.

## countEqual(arr, x) {#countequalarr-x}

@ -1229,7 +1229,7 @@ SELECT arrayReverseFill(x -> not isNull(x), [1, null, 3, 11, 12, null, null, 5,
└────────────────────────────────────┘
```

Note that the `arrayReverseFill` is a [higher-order function](../../sql-reference/functions/index.md#higher-order-functions). You must pass a lambda function to it as the first argument, and it can’t be omitted.

## arraySplit(func, arr1, …) {#array-split}

@ -1293,7 +1293,7 @@ Note that the `arrayFirstIndex` is a [higher-order function](../../sql-reference

## arrayMin {#array-min}

Returns the minimum of elements in the source array.

If the `func` function is specified, returns the minimum of elements converted by this function.

@ -1312,9 +1312,9 @@ arrayMin([func,] arr)

**Returned value**

- The minimum of function values (or the array minimum).

Type: if `func` is specified, matches `func` return value type, else matches the array elements type.

**Examples**

@ -1348,7 +1348,7 @@ Result:

## arrayMax {#array-max}

Returns the maximum of elements in the source array.

If the `func` function is specified, returns the maximum of elements converted by this function.

@ -1367,9 +1367,9 @@ arrayMax([func,] arr)

**Returned value**

- The maximum of function values (or the array maximum).

Type: if `func` is specified, matches `func` return value type, else matches the array elements type.

**Examples**

@ -1403,7 +1403,7 @@ Result:

## arraySum {#array-sum}

Returns the sum of elements in the source array.

If the `func` function is specified, returns the sum of elements converted by this function.

@ -1418,7 +1418,7 @@ arraySum([func,] arr)
**Arguments**

- `func` — Function. [Expression](../../sql-reference/data-types/special-data-types/expression.md).
- `arr` — Array. [Array](../../sql-reference/data-types/array.md).

**Returned value**

@ -1458,7 +1458,7 @@ Result:

## arrayAvg {#array-avg}

Returns the average of elements in the source array.

If the `func` function is specified, returns the average of elements converted by this function.

@ -1473,7 +1473,7 @@ arrayAvg([func,] arr)
**Arguments**

- `func` — Function. [Expression](../../sql-reference/data-types/special-data-types/expression.md).
- `arr` — Array. [Array](../../sql-reference/data-types/array.md).

**Returned value**

@ -250,3 +250,53 @@ Result:
└───────────────┘
```

## bitHammingDistance {#bithammingdistance}

Returns the [Hamming Distance](https://en.wikipedia.org/wiki/Hamming_distance) between the bit representations of two integer values. Can be used with [SimHash](../../sql-reference/functions/hash-functions.md#ngramsimhash) functions for detection of semi-duplicate strings. The smaller the distance, the more likely the strings are the same.

**Syntax**

``` sql
bitHammingDistance(int1, int2)
```

**Arguments**

- `int1` — First integer value. [Int64](../../sql-reference/data-types/int-uint.md).
- `int2` — Second integer value. [Int64](../../sql-reference/data-types/int-uint.md).

**Returned value**

- The Hamming distance.

Type: [UInt8](../../sql-reference/data-types/int-uint.md).

**Examples**

Query:

``` sql
SELECT bitHammingDistance(111, 121);
```

Result:

``` text
┌─bitHammingDistance(111, 121)─┐
│                            3 │
└──────────────────────────────┘
```

With [SimHash](../../sql-reference/functions/hash-functions.md#ngramsimhash):

``` sql
SELECT bitHammingDistance(ngramSimHash('cat ate rat'), ngramSimHash('rat ate cat'));
```

Result:

``` text
┌─bitHammingDistance(ngramSimHash('cat ate rat'), ngramSimHash('rat ate cat'))─┐
│                                                                             5 │
└──────────────────────────────────────────────────────────────────────────────┘
```

@ -7,6 +7,8 @@ toc_title: Hash

Hash functions can be used for the deterministic pseudo-random shuffling of elements.

Simhash is a hash function which returns close hash values for close (similar) arguments.

## halfMD5 {#hash-functions-halfmd5}

[Interprets](../../sql-reference/functions/type-conversion-functions.md#type_conversion_functions-reinterpretAsString) all the input parameters as strings and calculates the [MD5](https://en.wikipedia.org/wiki/MD5) hash value for each of them. Then combines hashes, takes the first 8 bytes of the hash of the resulting string, and interprets them as `UInt64` in big-endian byte order.

@ -482,3 +484,938 @@ Result:

- [xxHash](http://cyan4973.github.io/xxHash/).

## ngramSimHash {#ngramsimhash}

Splits an ASCII string into n-grams of `ngramsize` symbols and returns the n-gram `simhash`. The function is case sensitive.

Can be used for detection of semi-duplicate strings with [bitHammingDistance](../../sql-reference/functions/bit-functions.md#bithammingdistance). The smaller the [Hamming Distance](https://en.wikipedia.org/wiki/Hamming_distance) of the calculated `simhashes` of two strings, the more likely the strings are the same.

**Syntax**

``` sql
ngramSimHash(string[, ngramsize])
```

**Arguments**

- `string` — String. [String](../../sql-reference/data-types/string.md).
- `ngramsize` — The size of an n-gram. Optional. Possible values: any number from `1` to `25`. Default value: `3`. [UInt8](../../sql-reference/data-types/int-uint.md).

**Returned value**

- Hash value.

Type: [UInt64](../../sql-reference/data-types/int-uint.md).

**Example**

Query:

``` sql
SELECT ngramSimHash('ClickHouse') AS Hash;
```

Result:

``` text
┌───────Hash─┐
│ 1627567969 │
└────────────┘
```

## ngramSimHashCaseInsensitive {#ngramsimhashcaseinsensitive}
|
||||
|
||||
Splits a ASCII string into n-grams of `ngramsize` symbols and returns the n-gram `simhash`. Is case insensitive.
|
||||
|
||||
Can be used for detection of semi-duplicate strings with [bitHammingDistance](../../sql-reference/functions/bit-functions.md#bithammingdistance). The smaller is the [Hamming Distance](https://en.wikipedia.org/wiki/Hamming_distance) of the calculated `simhashes` of two strings, the more likely these strings are the same.
|
||||
|
||||
**Syntax**
|
||||
|
||||
``` sql
|
||||
ngramSimHashCaseInsensitive(string[, ngramsize])
|
||||
```
|
||||
|
||||
**Arguments**
|
||||
|
||||
- `string` — String. [String](../../sql-reference/data-types/string.md).
|
||||
- `ngramsize` — The size of an n-gram. Optional. Possible values: any number from `1` to `25`. Default value: `3`. [UInt8](../../sql-reference/data-types/int-uint.md).
|
||||
|
||||
**Returned value**
|
||||
|
||||
- Hash value.
|
||||
|
||||
Type: [UInt64](../../sql-reference/data-types/int-uint.md).
|
||||
|
||||
**Example**
|
||||
|
||||
Query:
|
||||
|
||||
``` sql
|
||||
SELECT ngramSimHashCaseInsensitive('ClickHouse') AS Hash;
|
||||
```
|
||||
|
||||
Result:
|
||||
|
||||
``` text
|
||||
┌──────Hash─┐
|
||||
│ 562180645 │
|
||||
└───────────┘
|
||||
```
|
||||
|
||||
## ngramSimHashUTF8 {#ngramsimhashutf8}
|
||||
|
||||
Splits a UTF-8 string into n-grams of `ngramsize` symbols and returns the n-gram `simhash`. Is case sensitive.
|
||||
|
||||
Can be used for detection of semi-duplicate strings with [bitHammingDistance](../../sql-reference/functions/bit-functions.md#bithammingdistance). The smaller is the [Hamming Distance](https://en.wikipedia.org/wiki/Hamming_distance) of the calculated `simhashes` of two strings, the more likely these strings are the same.
|
||||
|
||||
**Syntax**
|
||||
|
||||
``` sql
|
||||
ngramSimHashUTF8(string[, ngramsize])
|
||||
```
|
||||
|
||||
**Arguments**
|
||||
|
||||
- `string` — String. [String](../../sql-reference/data-types/string.md).
|
||||
- `ngramsize` — The size of an n-gram. Optional. Possible values: any number from `1` to `25`. Default value: `3`. [UInt8](../../sql-reference/data-types/int-uint.md).
|
||||
|
||||
**Returned value**
|
||||
|
||||
- Hash value.
|
||||
|
||||
Type: [UInt64](../../sql-reference/data-types/int-uint.md).
|
||||
|
||||
**Example**
|
||||
|
||||
Query:
|
||||
|
||||
``` sql
|
||||
SELECT ngramSimHashUTF8('ClickHouse') AS Hash;
|
||||
```
|
||||
|
||||
Result:
|
||||
|
||||
``` text
|
||||
┌───────Hash─┐
|
||||
│ 1628157797 │
|
||||
└────────────┘
|
||||
```
|
||||
|
||||
## ngramSimHashCaseInsensitiveUTF8 {#ngramsimhashcaseinsensitiveutf8}
|
||||
|
||||
Splits a UTF-8 string into n-grams of `ngramsize` symbols and returns the n-gram `simhash`. Is case insensitive.
|
||||
|
||||
Can be used for detection of semi-duplicate strings with [bitHammingDistance](../../sql-reference/functions/bit-functions.md#bithammingdistance). The smaller is the [Hamming Distance](https://en.wikipedia.org/wiki/Hamming_distance) of the calculated `simhashes` of two strings, the more likely these strings are the same.
|
||||
|
||||
**Syntax**
|
||||
|
||||
``` sql
|
||||
ngramSimHashCaseInsensitiveUTF8(string[, ngramsize])
|
||||
```
|
||||
|
||||
**Arguments**
|
||||
|
||||
- `string` — String. [String](../../sql-reference/data-types/string.md).
|
||||
- `ngramsize` — The size of an n-gram. Optional. Possible values: any number from `1` to `25`. Default value: `3`. [UInt8](../../sql-reference/data-types/int-uint.md).
|
||||
|
||||
**Returned value**
|
||||
|
||||
- Hash value.
|
||||
|
||||
Type: [UInt64](../../sql-reference/data-types/int-uint.md).
|
||||
|
||||
**Example**
|
||||
|
||||
Query:
|
||||
|
||||
``` sql
|
||||
SELECT ngramSimHashCaseInsensitiveUTF8('ClickHouse') AS Hash;
|
||||
```
|
||||
|
||||
Result:
|
||||
|
||||
``` text
|
||||
┌───────Hash─┐
|
||||
│ 1636742693 │
|
||||
└────────────┘
|
||||
```
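
To illustrate the difference between the variants, the following sketch compares two spellings of the same word; the first comparison should return `0` and the second `1`:

``` sql
-- Only the case-insensitive variant treats the two spellings as
-- identical input.
SELECT
    ngramSimHash('ClickHouse') = ngramSimHash('clickhouse') AS sensitive_equal,
    ngramSimHashCaseInsensitive('ClickHouse') = ngramSimHashCaseInsensitive('clickhouse') AS insensitive_equal;
```
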

## wordShingleSimHash {#wordshinglesimhash}

Splits an ASCII string into parts (shingles) of `shinglesize` words and returns the word shingle `simhash`. Is case sensitive.

Can be used for detection of semi-duplicate strings with [bitHammingDistance](../../sql-reference/functions/bit-functions.md#bithammingdistance). The smaller the [Hamming Distance](https://en.wikipedia.org/wiki/Hamming_distance) of the calculated `simhashes` of two strings, the more likely these strings are the same.

**Syntax**

``` sql
wordShingleSimHash(string[, shinglesize])
```

**Arguments**

- `string` — String. [String](../../sql-reference/data-types/string.md).
- `shinglesize` — The size of a word shingle. Optional. Possible values: any number from `1` to `25`. Default value: `3`. [UInt8](../../sql-reference/data-types/int-uint.md).

**Returned value**

- Hash value.

Type: [UInt64](../../sql-reference/data-types/int-uint.md).

**Example**

Query:

``` sql
SELECT wordShingleSimHash('ClickHouse® is a column-oriented database management system (DBMS) for online analytical processing of queries (OLAP).') AS Hash;
```

Result:

``` text
┌───────Hash─┐
│ 2328277067 │
└────────────┘
```

## wordShingleSimHashCaseInsensitive {#wordshinglesimhashcaseinsensitive}

Splits an ASCII string into parts (shingles) of `shinglesize` words and returns the word shingle `simhash`. Is case insensitive.

Can be used for detection of semi-duplicate strings with [bitHammingDistance](../../sql-reference/functions/bit-functions.md#bithammingdistance). The smaller the [Hamming Distance](https://en.wikipedia.org/wiki/Hamming_distance) of the calculated `simhashes` of two strings, the more likely these strings are the same.

**Syntax**

``` sql
wordShingleSimHashCaseInsensitive(string[, shinglesize])
```

**Arguments**

- `string` — String. [String](../../sql-reference/data-types/string.md).
- `shinglesize` — The size of a word shingle. Optional. Possible values: any number from `1` to `25`. Default value: `3`. [UInt8](../../sql-reference/data-types/int-uint.md).

**Returned value**

- Hash value.

Type: [UInt64](../../sql-reference/data-types/int-uint.md).

**Example**

Query:

``` sql
SELECT wordShingleSimHashCaseInsensitive('ClickHouse® is a column-oriented database management system (DBMS) for online analytical processing of queries (OLAP).') AS Hash;
```

Result:

``` text
┌───────Hash─┐
│ 2194812424 │
└────────────┘
```

## wordShingleSimHashUTF8 {#wordshinglesimhashutf8}

Splits a UTF-8 string into parts (shingles) of `shinglesize` words and returns the word shingle `simhash`. Is case sensitive.

Can be used for detection of semi-duplicate strings with [bitHammingDistance](../../sql-reference/functions/bit-functions.md#bithammingdistance). The smaller the [Hamming Distance](https://en.wikipedia.org/wiki/Hamming_distance) of the calculated `simhashes` of two strings, the more likely these strings are the same.

**Syntax**

``` sql
wordShingleSimHashUTF8(string[, shinglesize])
```

**Arguments**

- `string` — String. [String](../../sql-reference/data-types/string.md).
- `shinglesize` — The size of a word shingle. Optional. Possible values: any number from `1` to `25`. Default value: `3`. [UInt8](../../sql-reference/data-types/int-uint.md).

**Returned value**

- Hash value.

Type: [UInt64](../../sql-reference/data-types/int-uint.md).

**Example**

Query:

``` sql
SELECT wordShingleSimHashUTF8('ClickHouse® is a column-oriented database management system (DBMS) for online analytical processing of queries (OLAP).') AS Hash;
```

Result:

``` text
┌───────Hash─┐
│ 2328277067 │
└────────────┘
```

## wordShingleSimHashCaseInsensitiveUTF8 {#wordshinglesimhashcaseinsensitiveutf8}

Splits a UTF-8 string into parts (shingles) of `shinglesize` words and returns the word shingle `simhash`. Is case insensitive.

Can be used for detection of semi-duplicate strings with [bitHammingDistance](../../sql-reference/functions/bit-functions.md#bithammingdistance). The smaller the [Hamming Distance](https://en.wikipedia.org/wiki/Hamming_distance) of the calculated `simhashes` of two strings, the more likely these strings are the same.

**Syntax**

``` sql
wordShingleSimHashCaseInsensitiveUTF8(string[, shinglesize])
```

**Arguments**

- `string` — String. [String](../../sql-reference/data-types/string.md).
- `shinglesize` — The size of a word shingle. Optional. Possible values: any number from `1` to `25`. Default value: `3`. [UInt8](../../sql-reference/data-types/int-uint.md).

**Returned value**

- Hash value.

Type: [UInt64](../../sql-reference/data-types/int-uint.md).

**Example**

Query:

``` sql
SELECT wordShingleSimHashCaseInsensitiveUTF8('ClickHouse® is a column-oriented database management system (DBMS) for online analytical processing of queries (OLAP).') AS Hash;
```

Result:

``` text
┌───────Hash─┐
│ 2194812424 │
└────────────┘
```
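
As a rough illustration of why these hashes suit near-duplicate detection, a one-word edit should leave the word-shingle simhashes close. The input strings below are arbitrary, and the exact distance depends on the data, so no result is shown:

``` sql
-- A small edit produces a small Hamming distance between the simhashes.
SELECT bitHammingDistance(
    wordShingleSimHash('ClickHouse is a column-oriented database management system.'),
    wordShingleSimHash('ClickHouse is a column-oriented database system.')
) AS distance;
```
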

## ngramMinHash {#ngramminhash}

Splits an ASCII string into n-grams of `ngramsize` symbols and calculates hash values for each n-gram. Uses `hashnum` minimum hashes to calculate the minimum hash and `hashnum` maximum hashes to calculate the maximum hash. Returns a tuple with these hashes. Is case sensitive.

Can be used for detection of semi-duplicate strings with [tupleHammingDistance](../../sql-reference/functions/tuple-functions.md#tuplehammingdistance). If one of the returned hashes is the same for two strings, those strings are considered the same.

**Syntax**

``` sql
ngramMinHash(string[, ngramsize, hashnum])
```

**Arguments**

- `string` — String. [String](../../sql-reference/data-types/string.md).
- `ngramsize` — The size of an n-gram. Optional. Possible values: any number from `1` to `25`. Default value: `3`. [UInt8](../../sql-reference/data-types/int-uint.md).
- `hashnum` — The number of minimum and maximum hashes used to calculate the result. Optional. Possible values: any number from `1` to `25`. Default value: `6`. [UInt8](../../sql-reference/data-types/int-uint.md).

**Returned value**

- Tuple with two hashes — the minimum and the maximum.

Type: [Tuple](../../sql-reference/data-types/tuple.md)([UInt64](../../sql-reference/data-types/int-uint.md), [UInt64](../../sql-reference/data-types/int-uint.md)).

**Example**

Query:

``` sql
SELECT ngramMinHash('ClickHouse') AS Tuple;
```

Result:

``` text
┌─Tuple──────────────────────────────────────┐
│ (18333312859352735453,9054248444481805918) │
└────────────────────────────────────────────┘
```
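
Paired with [tupleHammingDistance](../../sql-reference/functions/tuple-functions.md#tuplehammingdistance), the result can be reduced to a single similarity score. A sketch with arbitrary example strings; a distance of `0` would mean both the minimum and the maximum hash matched:

``` sql
-- Compare the minhash tuples of two near-identical strings.
SELECT tupleHammingDistance(
    ngramMinHash('ClickHouse is fast'),
    ngramMinHash('ClickHouse is fast!')
) AS distance;
```
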

## ngramMinHashCaseInsensitive {#ngramminhashcaseinsensitive}

Splits an ASCII string into n-grams of `ngramsize` symbols and calculates hash values for each n-gram. Uses `hashnum` minimum hashes to calculate the minimum hash and `hashnum` maximum hashes to calculate the maximum hash. Returns a tuple with these hashes. Is case insensitive.

Can be used for detection of semi-duplicate strings with [tupleHammingDistance](../../sql-reference/functions/tuple-functions.md#tuplehammingdistance). If one of the returned hashes is the same for two strings, those strings are considered the same.

**Syntax**

``` sql
ngramMinHashCaseInsensitive(string[, ngramsize, hashnum])
```

**Arguments**

- `string` — String. [String](../../sql-reference/data-types/string.md).
- `ngramsize` — The size of an n-gram. Optional. Possible values: any number from `1` to `25`. Default value: `3`. [UInt8](../../sql-reference/data-types/int-uint.md).
- `hashnum` — The number of minimum and maximum hashes used to calculate the result. Optional. Possible values: any number from `1` to `25`. Default value: `6`. [UInt8](../../sql-reference/data-types/int-uint.md).

**Returned value**

- Tuple with two hashes — the minimum and the maximum.

Type: [Tuple](../../sql-reference/data-types/tuple.md)([UInt64](../../sql-reference/data-types/int-uint.md), [UInt64](../../sql-reference/data-types/int-uint.md)).

**Example**

Query:

``` sql
SELECT ngramMinHashCaseInsensitive('ClickHouse') AS Tuple;
```

Result:

``` text
┌─Tuple──────────────────────────────────────┐
│ (2106263556442004574,13203602793651726206) │
└────────────────────────────────────────────┘
```

## ngramMinHashUTF8 {#ngramminhashutf8}

Splits a UTF-8 string into n-grams of `ngramsize` symbols and calculates hash values for each n-gram. Uses `hashnum` minimum hashes to calculate the minimum hash and `hashnum` maximum hashes to calculate the maximum hash. Returns a tuple with these hashes. Is case sensitive.

Can be used for detection of semi-duplicate strings with [tupleHammingDistance](../../sql-reference/functions/tuple-functions.md#tuplehammingdistance). If one of the returned hashes is the same for two strings, those strings are considered the same.

**Syntax**

``` sql
ngramMinHashUTF8(string[, ngramsize, hashnum])
```

**Arguments**

- `string` — String. [String](../../sql-reference/data-types/string.md).
- `ngramsize` — The size of an n-gram. Optional. Possible values: any number from `1` to `25`. Default value: `3`. [UInt8](../../sql-reference/data-types/int-uint.md).
- `hashnum` — The number of minimum and maximum hashes used to calculate the result. Optional. Possible values: any number from `1` to `25`. Default value: `6`. [UInt8](../../sql-reference/data-types/int-uint.md).

**Returned value**

- Tuple with two hashes — the minimum and the maximum.

Type: [Tuple](../../sql-reference/data-types/tuple.md)([UInt64](../../sql-reference/data-types/int-uint.md), [UInt64](../../sql-reference/data-types/int-uint.md)).

**Example**

Query:

``` sql
SELECT ngramMinHashUTF8('ClickHouse') AS Tuple;
```

Result:

``` text
┌─Tuple──────────────────────────────────────┐
│ (18333312859352735453,6742163577938632877) │
└────────────────────────────────────────────┘
```

## ngramMinHashCaseInsensitiveUTF8 {#ngramminhashcaseinsensitiveutf8}

Splits a UTF-8 string into n-grams of `ngramsize` symbols and calculates hash values for each n-gram. Uses `hashnum` minimum hashes to calculate the minimum hash and `hashnum` maximum hashes to calculate the maximum hash. Returns a tuple with these hashes. Is case insensitive.

Can be used for detection of semi-duplicate strings with [tupleHammingDistance](../../sql-reference/functions/tuple-functions.md#tuplehammingdistance). If one of the returned hashes is the same for two strings, those strings are considered the same.

**Syntax**

``` sql
ngramMinHashCaseInsensitiveUTF8(string[, ngramsize, hashnum])
```

**Arguments**

- `string` — String. [String](../../sql-reference/data-types/string.md).
- `ngramsize` — The size of an n-gram. Optional. Possible values: any number from `1` to `25`. Default value: `3`. [UInt8](../../sql-reference/data-types/int-uint.md).
- `hashnum` — The number of minimum and maximum hashes used to calculate the result. Optional. Possible values: any number from `1` to `25`. Default value: `6`. [UInt8](../../sql-reference/data-types/int-uint.md).

**Returned value**

- Tuple with two hashes — the minimum and the maximum.

Type: [Tuple](../../sql-reference/data-types/tuple.md)([UInt64](../../sql-reference/data-types/int-uint.md), [UInt64](../../sql-reference/data-types/int-uint.md)).

**Example**

Query:

``` sql
SELECT ngramMinHashCaseInsensitiveUTF8('ClickHouse') AS Tuple;
```

Result:

``` text
┌─Tuple───────────────────────────────────────┐
│ (12493625717655877135,13203602793651726206) │
└─────────────────────────────────────────────┘
```

## ngramMinHashArg {#ngramminhasharg}

Splits an ASCII string into n-grams of `ngramsize` symbols and returns the n-grams with minimum and maximum hashes, calculated by the [ngramMinHash](#ngramminhash) function with the same input. Is case sensitive.

**Syntax**

``` sql
ngramMinHashArg(string[, ngramsize, hashnum])
```

**Arguments**

- `string` — String. [String](../../sql-reference/data-types/string.md).
- `ngramsize` — The size of an n-gram. Optional. Possible values: any number from `1` to `25`. Default value: `3`. [UInt8](../../sql-reference/data-types/int-uint.md).
- `hashnum` — The number of minimum and maximum hashes used to calculate the result. Optional. Possible values: any number from `1` to `25`. Default value: `6`. [UInt8](../../sql-reference/data-types/int-uint.md).

**Returned value**

- Tuple with two tuples with `hashnum` n-grams each.

Type: [Tuple](../../sql-reference/data-types/tuple.md)([Tuple](../../sql-reference/data-types/tuple.md)([String](../../sql-reference/data-types/string.md)), [Tuple](../../sql-reference/data-types/tuple.md)([String](../../sql-reference/data-types/string.md))).

**Example**

Query:

``` sql
SELECT ngramMinHashArg('ClickHouse') AS Tuple;
```

Result:

``` text
┌─Tuple─────────────────────────────────────────────────────────────────────────┐
│ (('ous','ick','lic','Hou','kHo','use'),('Hou','lic','ick','ous','ckH','Cli')) │
└───────────────────────────────────────────────────────────────────────────────┘
```

## ngramMinHashArgCaseInsensitive {#ngramminhashargcaseinsensitive}

Splits an ASCII string into n-grams of `ngramsize` symbols and returns the n-grams with minimum and maximum hashes, calculated by the [ngramMinHashCaseInsensitive](#ngramminhashcaseinsensitive) function with the same input. Is case insensitive.

**Syntax**

``` sql
ngramMinHashArgCaseInsensitive(string[, ngramsize, hashnum])
```

**Arguments**

- `string` — String. [String](../../sql-reference/data-types/string.md).
- `ngramsize` — The size of an n-gram. Optional. Possible values: any number from `1` to `25`. Default value: `3`. [UInt8](../../sql-reference/data-types/int-uint.md).
- `hashnum` — The number of minimum and maximum hashes used to calculate the result. Optional. Possible values: any number from `1` to `25`. Default value: `6`. [UInt8](../../sql-reference/data-types/int-uint.md).

**Returned value**

- Tuple with two tuples with `hashnum` n-grams each.

Type: [Tuple](../../sql-reference/data-types/tuple.md)([Tuple](../../sql-reference/data-types/tuple.md)([String](../../sql-reference/data-types/string.md)), [Tuple](../../sql-reference/data-types/tuple.md)([String](../../sql-reference/data-types/string.md))).

**Example**

Query:

``` sql
SELECT ngramMinHashArgCaseInsensitive('ClickHouse') AS Tuple;
```

Result:

``` text
┌─Tuple─────────────────────────────────────────────────────────────────────────┐
│ (('ous','ick','lic','kHo','use','Cli'),('kHo','lic','ick','ous','ckH','Hou')) │
└───────────────────────────────────────────────────────────────────────────────┘
```

## ngramMinHashArgUTF8 {#ngramminhashargutf8}

Splits a UTF-8 string into n-grams of `ngramsize` symbols and returns the n-grams with minimum and maximum hashes, calculated by the [ngramMinHashUTF8](#ngramminhashutf8) function with the same input. Is case sensitive.

**Syntax**

``` sql
ngramMinHashArgUTF8(string[, ngramsize, hashnum])
```

**Arguments**

- `string` — String. [String](../../sql-reference/data-types/string.md).
- `ngramsize` — The size of an n-gram. Optional. Possible values: any number from `1` to `25`. Default value: `3`. [UInt8](../../sql-reference/data-types/int-uint.md).
- `hashnum` — The number of minimum and maximum hashes used to calculate the result. Optional. Possible values: any number from `1` to `25`. Default value: `6`. [UInt8](../../sql-reference/data-types/int-uint.md).

**Returned value**

- Tuple with two tuples with `hashnum` n-grams each.

Type: [Tuple](../../sql-reference/data-types/tuple.md)([Tuple](../../sql-reference/data-types/tuple.md)([String](../../sql-reference/data-types/string.md)), [Tuple](../../sql-reference/data-types/tuple.md)([String](../../sql-reference/data-types/string.md))).

**Example**

Query:

``` sql
SELECT ngramMinHashArgUTF8('ClickHouse') AS Tuple;
```

Result:

``` text
┌─Tuple─────────────────────────────────────────────────────────────────────────┐
│ (('ous','ick','lic','Hou','kHo','use'),('kHo','Hou','lic','ick','ous','ckH')) │
└───────────────────────────────────────────────────────────────────────────────┘
```

## ngramMinHashArgCaseInsensitiveUTF8 {#ngramminhashargcaseinsensitiveutf8}

Splits a UTF-8 string into n-grams of `ngramsize` symbols and returns the n-grams with minimum and maximum hashes, calculated by the [ngramMinHashCaseInsensitiveUTF8](#ngramminhashcaseinsensitiveutf8) function with the same input. Is case insensitive.

**Syntax**

``` sql
ngramMinHashArgCaseInsensitiveUTF8(string[, ngramsize, hashnum])
```

**Arguments**

- `string` — String. [String](../../sql-reference/data-types/string.md).
- `ngramsize` — The size of an n-gram. Optional. Possible values: any number from `1` to `25`. Default value: `3`. [UInt8](../../sql-reference/data-types/int-uint.md).
- `hashnum` — The number of minimum and maximum hashes used to calculate the result. Optional. Possible values: any number from `1` to `25`. Default value: `6`. [UInt8](../../sql-reference/data-types/int-uint.md).

**Returned value**

- Tuple with two tuples with `hashnum` n-grams each.

Type: [Tuple](../../sql-reference/data-types/tuple.md)([Tuple](../../sql-reference/data-types/tuple.md)([String](../../sql-reference/data-types/string.md)), [Tuple](../../sql-reference/data-types/tuple.md)([String](../../sql-reference/data-types/string.md))).

**Example**

Query:

``` sql
SELECT ngramMinHashArgCaseInsensitiveUTF8('ClickHouse') AS Tuple;
```

Result:

``` text
┌─Tuple─────────────────────────────────────────────────────────────────────────┐
│ (('ckH','ous','ick','lic','kHo','use'),('kHo','lic','ick','ous','ckH','Hou')) │
└───────────────────────────────────────────────────────────────────────────────┘
```

## wordShingleMinHash {#wordshingleminhash}

Splits an ASCII string into parts (shingles) of `shinglesize` words and calculates hash values for each word shingle. Uses `hashnum` minimum hashes to calculate the minimum hash and `hashnum` maximum hashes to calculate the maximum hash. Returns a tuple with these hashes. Is case sensitive.

Can be used for detection of semi-duplicate strings with [tupleHammingDistance](../../sql-reference/functions/tuple-functions.md#tuplehammingdistance). If one of the returned hashes is the same for two strings, those strings are considered the same.

**Syntax**

``` sql
wordShingleMinHash(string[, shinglesize, hashnum])
```

**Arguments**

- `string` — String. [String](../../sql-reference/data-types/string.md).
- `shinglesize` — The size of a word shingle. Optional. Possible values: any number from `1` to `25`. Default value: `3`. [UInt8](../../sql-reference/data-types/int-uint.md).
- `hashnum` — The number of minimum and maximum hashes used to calculate the result. Optional. Possible values: any number from `1` to `25`. Default value: `6`. [UInt8](../../sql-reference/data-types/int-uint.md).

**Returned value**

- Tuple with two hashes — the minimum and the maximum.

Type: [Tuple](../../sql-reference/data-types/tuple.md)([UInt64](../../sql-reference/data-types/int-uint.md), [UInt64](../../sql-reference/data-types/int-uint.md)).

**Example**

Query:

``` sql
SELECT wordShingleMinHash('ClickHouse® is a column-oriented database management system (DBMS) for online analytical processing of queries (OLAP).') AS Tuple;
```

Result:

``` text
┌─Tuple──────────────────────────────────────┐
│ (16452112859864147620,5844417301642981317) │
└────────────────────────────────────────────┘
```

## wordShingleMinHashCaseInsensitive {#wordshingleminhashcaseinsensitive}

Splits an ASCII string into parts (shingles) of `shinglesize` words and calculates hash values for each word shingle. Uses `hashnum` minimum hashes to calculate the minimum hash and `hashnum` maximum hashes to calculate the maximum hash. Returns a tuple with these hashes. Is case insensitive.

Can be used for detection of semi-duplicate strings with [tupleHammingDistance](../../sql-reference/functions/tuple-functions.md#tuplehammingdistance). If one of the returned hashes is the same for two strings, those strings are considered the same.

**Syntax**

``` sql
wordShingleMinHashCaseInsensitive(string[, shinglesize, hashnum])
```

**Arguments**

- `string` — String. [String](../../sql-reference/data-types/string.md).
- `shinglesize` — The size of a word shingle. Optional. Possible values: any number from `1` to `25`. Default value: `3`. [UInt8](../../sql-reference/data-types/int-uint.md).
- `hashnum` — The number of minimum and maximum hashes used to calculate the result. Optional. Possible values: any number from `1` to `25`. Default value: `6`. [UInt8](../../sql-reference/data-types/int-uint.md).

**Returned value**

- Tuple with two hashes — the minimum and the maximum.

Type: [Tuple](../../sql-reference/data-types/tuple.md)([UInt64](../../sql-reference/data-types/int-uint.md), [UInt64](../../sql-reference/data-types/int-uint.md)).

**Example**

Query:

``` sql
SELECT wordShingleMinHashCaseInsensitive('ClickHouse® is a column-oriented database management system (DBMS) for online analytical processing of queries (OLAP).') AS Tuple;
```

Result:

``` text
┌─Tuple─────────────────────────────────────┐
│ (3065874883688416519,1634050779997673240) │
└───────────────────────────────────────────┘
```

## wordShingleMinHashUTF8 {#wordshingleminhashutf8}

Splits a UTF-8 string into parts (shingles) of `shinglesize` words and calculates hash values for each word shingle. Uses `hashnum` minimum hashes to calculate the minimum hash and `hashnum` maximum hashes to calculate the maximum hash. Returns a tuple with these hashes. Is case sensitive.

Can be used for detection of semi-duplicate strings with [tupleHammingDistance](../../sql-reference/functions/tuple-functions.md#tuplehammingdistance). If one of the returned hashes is the same for two strings, those strings are considered the same.

**Syntax**

``` sql
wordShingleMinHashUTF8(string[, shinglesize, hashnum])
```

**Arguments**

- `string` — String. [String](../../sql-reference/data-types/string.md).
- `shinglesize` — The size of a word shingle. Optional. Possible values: any number from `1` to `25`. Default value: `3`. [UInt8](../../sql-reference/data-types/int-uint.md).
- `hashnum` — The number of minimum and maximum hashes used to calculate the result. Optional. Possible values: any number from `1` to `25`. Default value: `6`. [UInt8](../../sql-reference/data-types/int-uint.md).

**Returned value**

- Tuple with two hashes — the minimum and the maximum.

Type: [Tuple](../../sql-reference/data-types/tuple.md)([UInt64](../../sql-reference/data-types/int-uint.md), [UInt64](../../sql-reference/data-types/int-uint.md)).

**Example**

Query:

``` sql
SELECT wordShingleMinHashUTF8('ClickHouse® is a column-oriented database management system (DBMS) for online analytical processing of queries (OLAP).') AS Tuple;
```

Result:

``` text
┌─Tuple──────────────────────────────────────┐
│ (16452112859864147620,5844417301642981317) │
└────────────────────────────────────────────┘
```

## wordShingleMinHashCaseInsensitiveUTF8 {#wordshingleminhashcaseinsensitiveutf8}

Splits a UTF-8 string into parts (shingles) of `shinglesize` words and calculates hash values for each word shingle. Uses `hashnum` minimum hashes to calculate the minimum hash and `hashnum` maximum hashes to calculate the maximum hash. Returns a tuple with these hashes. Is case insensitive.

Can be used for detection of semi-duplicate strings with [tupleHammingDistance](../../sql-reference/functions/tuple-functions.md#tuplehammingdistance). If one of the returned hashes is the same for two strings, those strings are considered the same.

**Syntax**

``` sql
wordShingleMinHashCaseInsensitiveUTF8(string[, shinglesize, hashnum])
```

**Arguments**

- `string` — String. [String](../../sql-reference/data-types/string.md).
- `shinglesize` — The size of a word shingle. Optional. Possible values: any number from `1` to `25`. Default value: `3`. [UInt8](../../sql-reference/data-types/int-uint.md).
- `hashnum` — The number of minimum and maximum hashes used to calculate the result. Optional. Possible values: any number from `1` to `25`. Default value: `6`. [UInt8](../../sql-reference/data-types/int-uint.md).

**Returned value**

- Tuple with two hashes — the minimum and the maximum.

Type: [Tuple](../../sql-reference/data-types/tuple.md)([UInt64](../../sql-reference/data-types/int-uint.md), [UInt64](../../sql-reference/data-types/int-uint.md)).

**Example**

Query:

``` sql
SELECT wordShingleMinHashCaseInsensitiveUTF8('ClickHouse® is a column-oriented database management system (DBMS) for online analytical processing of queries (OLAP).') AS Tuple;
```

Result:

``` text
┌─Tuple─────────────────────────────────────┐
│ (3065874883688416519,1634050779997673240) │
└───────────────────────────────────────────┘
```
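
As with the simhash variants, the case-insensitive functions collapse differently-cased inputs. The strings below are arbitrary; this sketch should return `1`:

``` sql
-- Differently-cased shingles hash identically under the
-- case-insensitive variant.
SELECT wordShingleMinHashCaseInsensitive('CLICKHOUSE IS FAST') =
       wordShingleMinHashCaseInsensitive('clickhouse is fast') AS equal;
```
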

## wordShingleMinHashArg {#wordshingleminhasharg}

Splits an ASCII string into parts (shingles) of `shinglesize` words each and returns the shingles with minimum and maximum word hashes, calculated by the [wordShingleMinHash](#wordshingleminhash) function with the same input. Is case sensitive.

**Syntax**

``` sql
wordShingleMinHashArg(string[, shinglesize, hashnum])
```

**Arguments**

- `string` — String. [String](../../sql-reference/data-types/string.md).
- `shinglesize` — The size of a word shingle. Optional. Possible values: any number from `1` to `25`. Default value: `3`. [UInt8](../../sql-reference/data-types/int-uint.md).
- `hashnum` — The number of minimum and maximum hashes used to calculate the result. Optional. Possible values: any number from `1` to `25`. Default value: `6`. [UInt8](../../sql-reference/data-types/int-uint.md).

**Returned value**

- Tuple with two tuples with `hashnum` word shingles each.

Type: [Tuple](../../sql-reference/data-types/tuple.md)([Tuple](../../sql-reference/data-types/tuple.md)([String](../../sql-reference/data-types/string.md)), [Tuple](../../sql-reference/data-types/tuple.md)([String](../../sql-reference/data-types/string.md))).

**Example**

Query:

``` sql
SELECT wordShingleMinHashArg('ClickHouse® is a column-oriented database management system (DBMS) for online analytical processing of queries (OLAP).', 1, 3) AS Tuple;
```

Result:

``` text
┌─Tuple─────────────────────────────────────────────────────────────────┐
│ (('OLAP','database','analytical'),('online','oriented','processing')) │
└───────────────────────────────────────────────────────────────────────┘
```

## wordShingleMinHashArgCaseInsensitive {#wordshingleminhashargcaseinsensitive}

Splits an ASCII string into parts (shingles) of `shinglesize` words each and returns the shingles with minimum and maximum word hashes, calculated by the [wordShingleMinHashCaseInsensitive](#wordshingleminhashcaseinsensitive) function with the same input. Is case insensitive.

**Syntax**

``` sql
wordShingleMinHashArgCaseInsensitive(string[, shinglesize, hashnum])
```

**Arguments**

- `string` — String. [String](../../sql-reference/data-types/string.md).
- `shinglesize` — The size of a word shingle. Optional. Possible values: any number from `1` to `25`. Default value: `3`. [UInt8](../../sql-reference/data-types/int-uint.md).
- `hashnum` — The number of minimum and maximum hashes used to calculate the result. Optional. Possible values: any number from `1` to `25`. Default value: `6`. [UInt8](../../sql-reference/data-types/int-uint.md).

**Returned value**

- Tuple with two tuples with `hashnum` word shingles each.

Type: [Tuple](../../sql-reference/data-types/tuple.md)([Tuple](../../sql-reference/data-types/tuple.md)([String](../../sql-reference/data-types/string.md)), [Tuple](../../sql-reference/data-types/tuple.md)([String](../../sql-reference/data-types/string.md))).

**Example**

Query:

``` sql
SELECT wordShingleMinHashArgCaseInsensitive('ClickHouse® is a column-oriented database management system (DBMS) for online analytical processing of queries (OLAP).', 1, 3) AS Tuple;
```

Result:

``` text
┌─Tuple──────────────────────────────────────────────────────────────────┐
│ (('queries','database','analytical'),('oriented','processing','DBMS')) │
└────────────────────────────────────────────────────────────────────────┘
```

## wordShingleMinHashArgUTF8 {#wordshingleminhashargutf8}

Splits a UTF-8 string into parts (shingles) of `shinglesize` words each and returns the shingles with minimum and maximum word hashes, calculated by the [wordShingleMinHashUTF8](#wordshingleminhashutf8) function with the same input. Is case sensitive.

**Syntax**

``` sql
wordShingleMinHashArgUTF8(string[, shinglesize, hashnum])
```

**Arguments**

- `string` — String. [String](../../sql-reference/data-types/string.md).
- `shinglesize` — The size of a word shingle. Optional. Possible values: any number from `1` to `25`. Default value: `3`. [UInt8](../../sql-reference/data-types/int-uint.md).
- `hashnum` — The number of minimum and maximum hashes used to calculate the result. Optional. Possible values: any number from `1` to `25`. Default value: `6`. [UInt8](../../sql-reference/data-types/int-uint.md).

**Returned value**

- Tuple with two tuples with `hashnum` word shingles each.

Type: [Tuple](../../sql-reference/data-types/tuple.md)([Tuple](../../sql-reference/data-types/tuple.md)([String](../../sql-reference/data-types/string.md)), [Tuple](../../sql-reference/data-types/tuple.md)([String](../../sql-reference/data-types/string.md))).

**Example**

Query:

``` sql
SELECT wordShingleMinHashArgUTF8('ClickHouse® is a column-oriented database management system (DBMS) for online analytical processing of queries (OLAP).', 1, 3) AS Tuple;
```

Result:

``` text
┌─Tuple─────────────────────────────────────────────────────────────────┐
│ (('OLAP','database','analytical'),('online','oriented','processing')) │
└───────────────────────────────────────────────────────────────────────┘
```

## wordShingleMinHashArgCaseInsensitiveUTF8 {#wordshingleminhashargcaseinsensitiveutf8}

Splits a UTF-8 string into parts (shingles) of `shinglesize` words each and returns the shingles with minimum and maximum word hashes, calculated by the [wordShingleMinHashCaseInsensitiveUTF8](#wordshingleminhashcaseinsensitiveutf8) function with the same input. Is case insensitive.

**Syntax**

``` sql
wordShingleMinHashArgCaseInsensitiveUTF8(string[, shinglesize, hashnum])
```

**Arguments**

- `string` — String. [String](../../sql-reference/data-types/string.md).
- `shinglesize` — The size of a word shingle. Optional. Possible values: any number from `1` to `25`. Default value: `3`. [UInt8](../../sql-reference/data-types/int-uint.md).
- `hashnum` — The number of minimum and maximum hashes used to calculate the result. Optional. Possible values: any number from `1` to `25`. Default value: `6`. [UInt8](../../sql-reference/data-types/int-uint.md).

**Returned value**

- Tuple with two tuples with `hashnum` word shingles each.

Type: [Tuple](../../sql-reference/data-types/tuple.md)([Tuple](../../sql-reference/data-types/tuple.md)([String](../../sql-reference/data-types/string.md)), [Tuple](../../sql-reference/data-types/tuple.md)([String](../../sql-reference/data-types/string.md))).

**Example**

Query:

``` sql
SELECT wordShingleMinHashArgCaseInsensitiveUTF8('ClickHouse® is a column-oriented database management system (DBMS) for online analytical processing of queries (OLAP).', 1, 3) AS Tuple;
```

Result:

``` text
┌─Tuple──────────────────────────────────────────────────────────────────┐
│ (('queries','database','analytical'),('oriented','processing','DBMS')) │
└────────────────────────────────────────────────────────────────────────┘
```

@ -111,4 +111,55 @@ Result:

- [Tuple](../../sql-reference/data-types/tuple.md)

[Original article](https://clickhouse.tech/docs/en/sql-reference/functions/tuple-functions/) <!--hide-->

## tupleHammingDistance {#tuplehammingdistance}

Returns the [Hamming Distance](https://en.wikipedia.org/wiki/Hamming_distance) between two tuples of the same size.

**Syntax**

``` sql
tupleHammingDistance(tuple1, tuple2)
```

**Arguments**

- `tuple1` — First tuple. [Tuple](../../sql-reference/data-types/tuple.md).
- `tuple2` — Second tuple. [Tuple](../../sql-reference/data-types/tuple.md).

Tuples should have elements of the same type.

**Returned value**

- The Hamming distance.

Type: [UInt8](../../sql-reference/data-types/int-uint.md).

**Examples**

Query:

``` sql
SELECT tupleHammingDistance((1, 2, 3), (3, 2, 1)) AS HammingDistance;
```

Result:

``` text
┌─HammingDistance─┐
│               2 │
└─────────────────┘
```

Can be used with [MinHash](../../sql-reference/functions/hash-functions.md#ngramminhash) functions for detection of semi-duplicate strings:

``` sql
SELECT tupleHammingDistance(wordShingleMinHash(string), wordShingleMinHashCaseInsensitive(string)) AS HammingDistance FROM (SELECT 'Clickhouse is a column-oriented database management system for online analytical processing of queries.' AS string);
```

Result:

``` text
┌─HammingDistance─┐
│               2 │
└─────────────────┘
```
@ -40,7 +40,7 @@ Read about setting the partition expression in a section [How to specify the par

After the query is executed, you can do whatever you want with the data in the `detached` directory — delete it from the file system, or just leave it.

This query is replicated – it moves the data to the `detached` directory on all replicas. Note that you can execute this query only on a leader replica. To find out if a replica is a leader, perform a `SELECT` query to the [system.replicas](../../../operations/system-tables/replicas.md#system_tables-replicas) table. Alternatively, it is easier to make a `DETACH` query on all replicas - all the replicas throw an exception, except the leader replicas (as multiple leaders are allowed).
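
For example (the table name and partition value below are illustrative, not taken from this document):

``` sql
-- Detach one partition; its data moves to the detached/ directory.
ALTER TABLE visits DETACH PARTITION 201901;
```
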

## DROP PARTITION\|PART {#alter_drop-partition}

@ -85,9 +85,15 @@ ALTER TABLE visits ATTACH PART 201901_2_2_0;

Read more about setting the partition expression in a section [How to specify the partition expression](#alter-how-to-specify-part-expr).

This query is replicated. The replica-initiator checks whether there is data in the `detached` directory.
If data exists, the query checks its integrity. If everything is correct, the query adds the data to the table.

If the non-initiator replica, receiving the attach command, finds the part with the correct checksums in its own
`detached` folder, it attaches the data without fetching it from other replicas.
If there is no part with the correct checksums, the data is downloaded from any replica having the part.

You can put data to the `detached` directory on one replica and use the `ALTER ... ATTACH` query to add it to the
table on all replicas.

## ATTACH PARTITION FROM {#alter_attach-partition-from}

@ -95,7 +101,8 @@ So you can put data to the `detached` directory on one replica, and use the `ALT
``` sql
ALTER TABLE table2 ATTACH PARTITION partition_expr FROM table1
```

This query copies the data partition from `table1` to `table2`.
Note that data won't be deleted from either `table1` or `table2`.
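
A usage sketch (the table names and the partition value are illustrative; both tables must also satisfy the conditions listed below):

``` sql
-- Copy partition 201902 from table1 into table2; data stays in table1.
ALTER TABLE table2 ATTACH PARTITION 201902 FROM table1;
```
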

For the query to run successfully, the following conditions must be met:
@ -5,13 +5,14 @@ toc_title: ATTACH

# ATTACH Statement {#attach}

Attaches the table, for example, when moving a database to another server.

The query does not create data on the disk, but assumes that data is already in the appropriate places, and just adds information about the table to the server. After executing an `ATTACH` query, the server will know about the existence of the table.

If the table was previously detached with the [DETACH](../../sql-reference/statements/detach.md) query, meaning that its structure is known, you can use shorthand without defining the structure.

## Syntax Forms {#syntax-forms}

### Attach Existing Table {#attach-existing-table}

``` sql
ATTACH TABLE [IF NOT EXISTS] [db.]name [ON CLUSTER cluster]
```
@ -21,4 +22,38 @@ This query is used when starting the server. The server stores table metadata as

If the table was detached permanently, it won't be reattached at the server start, so you need to use the `ATTACH` query explicitly.

[Original article](https://clickhouse.tech/docs/en/sql-reference/statements/attach/) <!--hide-->

### Create New Table And Attach Data {#create-new-table-and-attach-data}

**With specified path to table data**

```sql
ATTACH TABLE name FROM 'path/to/data/' (col1 Type1, ...)
```

It creates a new table with the provided structure and attaches the table data from the provided directory in `user_files`.

**Example**

Query:

```sql
DROP TABLE IF EXISTS test;
INSERT INTO TABLE FUNCTION file('01188_attach/test/data.TSV', 'TSV', 's String, n UInt8') VALUES ('test', 42);
ATTACH TABLE test FROM '01188_attach/test' (s String, n UInt8) ENGINE = File(TSV);
SELECT * FROM test;
```

Result:

```text
┌─s────┬──n─┐
│ test │ 42 │
└──────┴────┘
```

**With specified table UUID** (Only for `Atomic` database)

```sql
ATTACH TABLE name UUID '<uuid>' (col1 Type1, ...)
```

It creates a new table with the provided structure and attaches data from the table with the specified UUID.
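
For instance, a sketch under an `Atomic` database — the table name, UUID, structure, and engine clause below are made-up placeholders, not values from this document:

```sql
-- Attach existing data addressed by its UUID.
ATTACH TABLE test UUID '123e4567-e89b-12d3-a456-426614174000'
(s String, n UInt8) ENGINE = MergeTree ORDER BY n;
```
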
@ -30,9 +30,36 @@ Performed over the tables with another table engines causes an exception.

Engines from the `*Log` family don’t provide automatic data recovery on failure. Use the `CHECK TABLE` query to track data loss in a timely manner.

## Checking the MergeTree Family Tables {#checking-mergetree-tables}

For `MergeTree` family engines, if [check_query_single_value_result](../../operations/settings/settings.md#check_query_single_value_result) = 0, the `CHECK TABLE` query shows a check status for every individual data part of a table on the local server.

```sql
SET check_query_single_value_result = 0;
CHECK TABLE test_table;
```

```text
┌─part_path─┬─is_passed─┬─message─┐
│ all_1_4_1 │         1 │         │
│ all_1_4_2 │         1 │         │
└───────────┴───────────┴─────────┘
```

If `check_query_single_value_result` = 1, the `CHECK TABLE` query shows the general table check status.

```sql
SET check_query_single_value_result = 1;
CHECK TABLE test_table;
```

```text
┌─result─┐
│      1 │
└────────┘
```

## If the Data Is Corrupted {#if-data-is-corrupted}

If the table is corrupted, you can copy the non-corrupted data to another table. To do this:

@ -287,7 +287,9 @@ REPLACE TABLE myOldTable SELECT * FROM myOldTable WHERE CounterID <12345;

### Syntax

``` sql
{CREATE [OR REPLACE] | REPLACE} TABLE [db.]table_name
```

All syntax forms for the `CREATE` query also work for this query. `REPLACE` for a non-existent table will cause an error.
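
For instance, the atomic-swap pattern this enables (the table name and filter mirror the example above and are illustrative):

``` sql
-- Replace a table with a filtered copy of itself in one atomic step.
REPLACE TABLE myOldTable SELECT * FROM myOldTable WHERE CounterID < 12345;
```
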
@ -169,7 +169,7 @@ SYSTEM START MERGES [ON VOLUME <volume_name> | [db.]merge_tree_family_table_name
### STOP TTL MERGES {#query_language-stop-ttl-merges}

Provides possibility to stop background deletion of old data according to the [TTL expression](../../engines/table-engines/mergetree-family/mergetree.md#table_engine-mergetree-ttl) for tables in the MergeTree family.
Returns `Ok.` even if the table doesn’t exist or the table doesn't have a MergeTree engine. Returns an error when the database doesn’t exist:

``` sql
SYSTEM STOP TTL MERGES [[db.]merge_tree_family_table_name]
```

@ -178,7 +178,7 @@ SYSTEM STOP TTL MERGES [[db.]merge_tree_family_table_name]
### START TTL MERGES {#query_language-start-ttl-merges}

Provides possibility to start background deletion of old data according to the [TTL expression](../../engines/table-engines/mergetree-family/mergetree.md#table_engine-mergetree-ttl) for tables in the MergeTree family.
Returns `Ok.` even if the table doesn’t exist. Returns an error when the database doesn’t exist:

``` sql
SYSTEM START TTL MERGES [[db.]merge_tree_family_table_name]
```

@ -187,7 +187,7 @@ SYSTEM START TTL MERGES [[db.]merge_tree_family_table_name]
### STOP MOVES {#query_language-stop-moves}

Provides possibility to stop background moving of data according to the [TTL table expression with TO VOLUME or TO DISK clause](../../engines/table-engines/mergetree-family/mergetree.md#mergetree-table-ttl) for tables in the MergeTree family.
Returns `Ok.` even if the table doesn’t exist. Returns an error when the database doesn’t exist:

``` sql
SYSTEM STOP MOVES [[db.]merge_tree_family_table_name]
```

@ -196,7 +196,7 @@ SYSTEM STOP MOVES [[db.]merge_tree_family_table_name]
### START MOVES {#query_language-start-moves}

Provides possibility to start background moving of data according to the [TTL table expression with TO VOLUME and TO DISK clause](../../engines/table-engines/mergetree-family/mergetree.md#mergetree-table-ttl) for tables in the MergeTree family.
Returns `Ok.` even if the table doesn’t exist. Returns an error when the database doesn’t exist:

``` sql
SYSTEM START MOVES [[db.]merge_tree_family_table_name]
```

@ -209,7 +209,7 @@ ClickHouse can manage background replication related processes in [ReplicatedMer
### STOP FETCHES {#query_language-system-stop-fetches}

Provides possibility to stop background fetches of inserted parts for tables in the `ReplicatedMergeTree` family.
Always returns `Ok.` regardless of the table engine, and even if the table or database doesn’t exist.

``` sql
SYSTEM STOP FETCHES [[db.]replicated_merge_tree_family_table_name]
```

@ -218,7 +218,7 @@ SYSTEM STOP FETCHES [[db.]replicated_merge_tree_family_table_name]
### START FETCHES {#query_language-system-start-fetches}

Provides possibility to start background fetches of inserted parts for tables in the `ReplicatedMergeTree` family.
Always returns `Ok.` regardless of the table engine, and even if the table or database doesn’t exist.

``` sql
SYSTEM START FETCHES [[db.]replicated_merge_tree_family_table_name]
```

@ -264,6 +264,10 @@ Wait until a `ReplicatedMergeTree` table will be synced with other replicas in a
``` sql
SYSTEM SYNC REPLICA [db.]replicated_merge_tree_family_table_name
```

After running this statement, the `[db.]replicated_merge_tree_family_table_name` table fetches commands from the common replicated log into its own replication queue, and then the query waits till the replica processes all of the fetched commands.

### RESTART REPLICA {#query_language-system-restart-replica}

Provides possibility to reinitialize the Zookeeper session state for a `ReplicatedMergeTree` table. It compares the current state with Zookeeper as the source of truth and adds tasks to the Zookeeper queue if needed.

@ -276,4 +280,3 @@ SYSTEM RESTART REPLICA [db.]replicated_merge_tree_family_table_name
### RESTART REPLICAS {#query_language-system-restart-replicas}

Provides possibility to reinitialize the Zookeeper session state for all `ReplicatedMergeTree` tables. It compares the current state with Zookeeper as the source of truth and adds tasks to the Zookeeper queue if needed.

@ -21,16 +21,18 @@ You can use table functions in:

!!! warning "Warning"
    You can’t use table functions if the [allow_ddl](../../operations/settings/permissions-for-queries.md#settings_allow_ddl) setting is disabled.

| Function | Description |
|------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------|
| [file](../../sql-reference/table-functions/file.md) | Creates a [File](../../engines/table-engines/special/file.md)-engine table. |
| [merge](../../sql-reference/table-functions/merge.md) | Creates a [Merge](../../engines/table-engines/special/merge.md)-engine table. |
| [numbers](../../sql-reference/table-functions/numbers.md) | Creates a table with a single column filled with integer numbers. |
| [remote](../../sql-reference/table-functions/remote.md) | Allows you to access remote servers without creating a [Distributed](../../engines/table-engines/special/distributed.md)-engine table. |
| [url](../../sql-reference/table-functions/url.md) | Creates a [URL](../../engines/table-engines/special/url.md)-engine table. |
| [mysql](../../sql-reference/table-functions/mysql.md) | Creates a [MySQL](../../engines/table-engines/integrations/mysql.md)-engine table. |
| [postgresql](../../sql-reference/table-functions/postgresql.md) | Creates a [PostgreSQL](../../engines/table-engines/integrations/postgresql.md)-engine table. |
| [jdbc](../../sql-reference/table-functions/jdbc.md) | Creates a [JDBC](../../engines/table-engines/integrations/jdbc.md)-engine table. |
| [odbc](../../sql-reference/table-functions/odbc.md) | Creates an [ODBC](../../engines/table-engines/integrations/odbc.md)-engine table. |
| [hdfs](../../sql-reference/table-functions/hdfs.md) | Creates an [HDFS](../../engines/table-engines/integrations/hdfs.md)-engine table. |
| [s3](../../sql-reference/table-functions/s3.md) | Creates an [S3](../../engines/table-engines/integrations/s3.md)-engine table. |
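
For example, a table function is used directly in the `FROM` clause. A couple of minimal sketches (the remote host name is hypothetical):

``` sql
SELECT * FROM numbers(10);
SELECT * FROM remote('example-host:9000', system.one);
```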

[Original article](https://clickhouse.tech/docs/en/sql-reference/table-functions/) <!--hide-->
@ -10,33 +10,17 @@ Allows `SELECT` and `INSERT` queries to be performed on data that is stored on a

**Syntax**

``` sql
postgresql('host:port', 'database', 'table', 'user', 'password'[, schema])
```

**Arguments**

- `host:port` — PostgreSQL server address.
- `database` — Remote database name.
- `table` — Remote table name.
- `user` — PostgreSQL user.
- `password` — User password.
- `schema` — Non-default table schema. Optional.

**Returned Value**
@ -45,6 +29,23 @@ A table object with the same columns as the original PostgreSQL table.

!!! info "Note"
    In the `INSERT` query, to distinguish the table function `postgresql(...)` from a table name with a column name list, you must use the keywords `FUNCTION` or `TABLE FUNCTION`. See the examples below.
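
For instance, a sketch of such an insert through the table function (connection parameters are hypothetical):

``` sql
INSERT INTO TABLE FUNCTION postgresql('localhost:5432', 'test', 'test', 'postgresql_user', 'password') (int_id, float) VALUES (2, 3);
```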

## Implementation Details {#implementation-details}

`SELECT` queries on the PostgreSQL side run as `COPY (SELECT ...) TO STDOUT` inside a read-only PostgreSQL transaction with a commit after each `SELECT` query.

Simple `WHERE` clauses such as `=`, `!=`, `>`, `>=`, `<`, `<=`, and `IN` are executed on the PostgreSQL server.

All joins, aggregations, sorting, `IN [ array ]` conditions and the `LIMIT` sampling constraint are executed in ClickHouse only after the query to PostgreSQL finishes.
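
To illustrate the split, in a sketch like the following (connection parameters are hypothetical), the equality filter is pushed down to PostgreSQL, while the `ORDER BY` and `LIMIT` are applied by ClickHouse after the data arrives:

``` sql
SELECT str
FROM postgresql('localhost:5432', 'test', 'test', 'postgresql_user', 'password')
WHERE int_id = 1
ORDER BY str
LIMIT 10;
```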

`INSERT` queries on the PostgreSQL side run as `COPY "table_name" (field1, field2, ... fieldN) FROM STDIN` inside a PostgreSQL transaction with auto-commit after each `INSERT` statement.

PostgreSQL `Array` types are converted into ClickHouse arrays.

!!! info "Note"
    Be careful: in PostgreSQL, an array data type column like Integer[] may contain arrays of different dimensions in different rows, but in ClickHouse it is only allowed to have multidimensional arrays of the same dimension in all rows.

Replica priority for the PostgreSQL dictionary source is supported. The bigger the number in the map, the lower the priority. The highest priority is `0`.

**Examples**

Table in PostgreSQL:
@ -60,14 +61,14 @@ PRIMARY KEY (int_id));

CREATE TABLE

postgres=# INSERT INTO test (int_id, str, "float") VALUES (1,'test',2);
INSERT 0 1

postgresql> SELECT * FROM test;
 int_id | int_nullable | float | str  | float_nullable
--------+--------------+-------+------+----------------
      1 |              |     2 | test |
(1 row)
```

Selecting data from ClickHouse:
@ -96,9 +97,24 @@ SELECT * FROM postgresql('localhost:5432', 'test', 'test', 'postgresql_user', 'p

└────────┴──────────────┴───────┴──────┴────────────────┘
```

Using Non-default Schema:

```text
postgres=# CREATE SCHEMA "nice.schema";

postgres=# CREATE TABLE "nice.schema"."nice.table" (a integer);

postgres=# INSERT INTO "nice.schema"."nice.table" SELECT i FROM generate_series(0, 99) as t(i)
```

```sql
CREATE TABLE pg_table_schema_with_dots (a UInt32)
ENGINE PostgreSQL('localhost:5432', 'clickhouse', 'nice.table', 'postgresql_user', 'password', 'nice.schema');
```
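
A quick check that the schema-qualified table is readable through the new ClickHouse table; a sketch, assuming the rows inserted above:

```sql
SELECT count() FROM pg_table_schema_with_dots;
```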

**See Also**

- [The PostgreSQL table engine](../../engines/table-engines/integrations/postgresql.md)
- [Using PostgreSQL as a source of external dictionary](../../sql-reference/dictionaries/external-dictionaries/external-dicts-dict-sources.md#dicts-external_dicts_dict_sources-postgresql)

[Original article](https://clickhouse.tech/docs/en/sql-reference/table-functions/postgresql/) <!--hide-->
@ -624,7 +624,7 @@ uniqHLL12(x[, ...])

- Uses the HyperLogLog algorithm to approximate the number of different argument values.

    2^12 5-bit cells are used. The size of the state is slightly more than 2.5 KB. The result is not very accurate (up to ~10% error) for small data sets (<10K elements). However, the result is fairly accurate for high-cardinality data sets (10K-100M), with a maximum error of ~1.6%. Starting from 100M, the estimation error increases, and the function will return very inaccurate results for data sets with extremely high cardinality (1B+ elements).

- Provides a deterministic result (it does not depend on the query processing order).
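
A usage sketch (the table and column names are hypothetical):

``` sql
SELECT uniqHLL12(UserID) FROM visits;
```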
@ -29,3 +29,14 @@ toc_title: "Поставщики облачных услуг ClickHouse"

- cross-AZ scaling for improved performance and high availability
- built-in monitoring and a SQL query editor

## Alibaba Cloud {#alibaba-cloud}

Alibaba's managed cloud service for ClickHouse: [Chinese site](https://www.aliyun.com/product/clickhouse), will be available on the international site in May 2021. The service provides the following capabilities:

- a highly reliable cloud storage server based on the distributed [Alibaba Cloud Apsara](https://www.alibabacloud.com/product/apsara-stack) system;
- capacity that can be expanded on demand, without manual data migration;
- support for single-node and multi-node architectures, single-replica and multi-replica architectures, and tiered storage for cold and hot data;
- support for access rights, one-key recovery, multi-level network protection, and cloud disk encryption;
- full integration with cloud logging systems, databases, and data processing tools;
- a built-in platform for database monitoring and management;
- technical support from database experts.
@ -27,7 +27,7 @@ ClickHouse - полноценная колоночная СУБД. Данные

`IColumn` provides methods for common relational transformations of data, but they do not cover all needs. For example, `ColumnUInt64` does not have a method to calculate the sum of two columns, and `ColumnString` does not have a method to run a substring search. These countless routines are implemented outside of `IColumn`.

Various functions on columns can be implemented in a generic, non-efficient way using `IColumn` methods to extract `Field` values, or in a specialized way using knowledge of the inner memory layout of data in a specific `IColumn` implementation. To do this, functions are cast to a specific `IColumn` type and work with its internal representation directly. For example, `ColumnUInt64` has the `getData` method that returns a reference to the internal array; reading and filling that array is then done by a separate routine directly. In fact, we have "leaky abstractions" that allow efficient specializations of various routines.

## Data Types {#data_types}
@ -42,7 +42,7 @@ ClickHouse - полноценная колоночная СУБД. Данные

## Block {#block}

A `Block` is a container that represents a chunk of a table in memory. It is a set of triples: `(IColumn, IDataType, column name)`. During query execution, data is processed as `Block`s. If we have a `Block`, we have the data (in the `IColumn` object), information about its type (in `IDataType`) that tells us how to work with the column, and the column name (either the original column name from the table, or a synthetic name assigned to hold intermediate results of calculations).

When we calculate some function over the columns in a block, we add another column with the result to the block, without touching the columns of the function's arguments, because operations are immutable. Later, unneeded columns can be removed from the block, but not modified. This is convenient for common subexpression elimination.
@ -58,7 +58,7 @@ ClickHouse - полноценная колоночная СУБД. Данные

2. Data format implementations. For example, when outputting data to a terminal in the `Pretty` format, you create a block output stream that formats the blocks pushed into it.
3. Data transformations. Suppose you have an `IBlockInputStream` and want to create a filtered stream. You create a `FilterBlockInputStream` and initialize it with your stream. Then, when you pull a block from `FilterBlockInputStream`, it pulls a block from the source stream, filters it, and returns the filtered block to you. Query execution pipelines are built this way.

There are also more sophisticated transformations. For example, when you pull from an `AggregatingBlockInputStream`, it reads all data from its source, aggregates it, and then returns a stream of aggregated data to you. Another example: the `UnionBlockInputStream` constructor accepts a set of input sources and a number of threads. Such a `Stream` works in multiple threads and reads from its sources in parallel.

> Block streams use a "pulling" approach to control flow: when you pull a block from the first stream, it consequently pulls the required blocks from the nested streams, and that is how the whole execution pipeline works. Neither "pull" nor "push" has a clear advantage, because the control flow is implicit, and that limits the implementation of various features such as simultaneous execution of multiple queries (merging many pipelines together). This limitation could be overcome with coroutines, or just by running extra threads that wait for each other. We may have more possibilities if we make the control flow explicit: if we locate the logic for passing data from one calculation unit to another outside of those calculation units. Read this [article](http://journal.stuffwithstuff.com/2013/01/13/iteration-inside-and-out/) for more thoughts.
@ -110,9 +110,9 @@ ClickHouse - полноценная колоночная СУБД. Данные

> Parser generators are not used for historical reasons.

## Interpreters {#interpreters}

Interpreters are responsible for creating the query execution pipeline from an `AST`. There are simple interpreters, such as `InterpreterExistsQuery` and `InterpreterDropQuery`, and the more sophisticated `InterpreterSelectQuery`. The query execution pipeline is a combination of block input and output streams. For example, the result of interpreting a `SELECT` query is an `IBlockInputStream` to read the result set from; the result of interpreting an `INSERT` query is an `IBlockOutputStream` to write the data intended for insertion to; and the result of interpreting an `INSERT SELECT` query is an `IBlockInputStream` that returns an empty result set on the first read, but copies data from the `SELECT` to the `INSERT`.

`InterpreterSelectQuery` uses the `ExpressionAnalyzer` and `ExpressionActions` machinery for query analysis and transformations. This is where most rule-based query optimizations are performed. `ExpressionAnalyzer` is quite messy and should be rewritten: various query transformations and optimizations should be extracted into separate classes to allow for modular transformations of the query.

## Functions {#functions}
@ -162,9 +162,9 @@ ClickHouse имеет сильную типизацию, поэтому нет

Servers in a cluster are mostly independent. You can create a `Distributed` table on one or all servers in a cluster. Such a table does not store data itself; it only provides a "view" of all local tables on multiple cluster nodes. When you `SELECT` from a `Distributed` table, it rewrites the query, chooses remote nodes according to the load-balancing settings, and sends the query to them. The `Distributed` table asks the remote servers to process the query up to the stage where intermediate results from different servers can be merged. Then it receives the intermediate results and merges them. The `Distributed` table tries to push as much work as possible to the remote servers and to minimize the amount of intermediate data sent over the network.

Things become more complicated when there are subqueries in `IN` or `JOIN` clauses, and each of them uses a `Distributed` table. There are different strategies for executing such queries.

There is no global query plan for distributed query execution. Each node has its own local query plan for its part of the job. We only have simple one-pass distributed query execution: we send queries to the remote nodes and then merge the results. But this is not feasible for complicated queries with high-cardinality `GROUP BY`s or with a large amount of temporary data for a `JOIN`: in such cases we need to "reshuffle" data between servers, which requires additional coordination. ClickHouse does not support that kind of query execution, and we need to work on it.

## Merge Tree {#merge-tree}
@ -190,7 +190,7 @@ ClickHouse имеет сильную типизацию, поэтому нет

Replication uses an asynchronous multi-master scheme. You can insert data into any replica that has an open session in `ZooKeeper`, and the data is replicated to all other replicas asynchronously. Because ClickHouse does not support UPDATEs, replication is conflict-free. Since there is no quorum acknowledgment of inserts, just-inserted data may be lost if one node fails.

Metadata for replication is stored in `ZooKeeper`. There is a replication log that lists the actions to perform. Among these actions are: get the part; merge parts; drop a partition, and so on. Each replica copies the replication log to its own queue and then executes the actions from the queue. For example, on insertion, the "get the part" action is created in the log, and every replica downloads that part. Merges are coordinated between replicas to get byte-identical results. All parts are merged in the same way on all replicas. One of the leader replicas initiates a new merge first and writes "merge parts" actions to the log. Multiple replicas (or all of them) can be leaders at the same time. A replica can be prevented from becoming a leader using the `merge_tree` setting `replicated_can_become_leader`.

Replication is physical: only compressed parts are transferred between nodes, not queries. Merges are handled on each replica independently in most cases, to lower network costs and avoid network amplification. Large merged parts are sent over the network only in cases of significant replication lag.
@ -7,15 +7,15 @@ toc_title: "Инструкция для разработчиков"

Building ClickHouse is supported on Linux, FreeBSD and Mac OS X.

## If You Use Windows {#esli-vy-ispolzuete-windows}

If you use Windows, you need to create a virtual machine with Ubuntu. To start working with a virtual machine, install VirtualBox. You can download Ubuntu from the website: https://www.ubuntu.com/#download. Create a virtual machine from the downloaded image. Allocate at least 4 GB of RAM for it. To launch a terminal in Ubuntu, find a program with the word "terminal" in its name in the menu (gnome-terminal, konsole, and so on), or press Ctrl+Alt+T.

## If You Use a 32-bit System {#esli-vy-ispolzuete-32-bitnuiu-sistemu}

ClickHouse does not work or build on 32-bit systems. Get access to a 64-bit system and continue.

## Creating a Repository on GitHub {#sozdanie-repozitoriia-na-github}

To work with the ClickHouse repository, you need a GitHub account. You probably already have one.
@ -34,7 +34,7 @@ ClickHouse не работает и не собирается на 32-битны

A detailed guide to using Git: https://git-scm.com/book/ru/v2

## Cloning the Repository to Your Working Machine {#klonirovanie-repozitoriia-na-rabochuiu-mashinu}

Next, you need to download the sources to your working machine. This is called "cloning the repository", because it creates a local copy of the repository on your machine that you will work with.
@ -78,7 +78,7 @@ ClickHouse не работает и не собирается на 32-битны

After that, you will be able to pull updates from the Yandex repository into your own repository with `git pull upstream master`.

### Working with Git Submodules {#rabota-s-sabmoduliami-git}

Working with git submodules can be quite painful. The following commands will help keep them in order:
@ -110,7 +110,7 @@ The next commands would help you to reset all submodules to the initial state (!

    git submodule foreach git submodule foreach git reset --hard
    git submodule foreach git submodule foreach git clean -xfd

## Build System {#sistema-sborki}

ClickHouse uses CMake and Ninja as its build system.
@ -130,11 +130,11 @@ Ninja - система запуска сборочных задач.

Check the version of CMake: `cmake --version`. If it is lower than 3.3, install a newer version from https://cmake.org/download/

## Optional External Libraries {#neobiazatelnye-vneshnie-biblioteki}

ClickHouse uses a number of external libraries for building. None of them need to be installed separately, as they are built together with ClickHouse from the sources located in the submodules. You can check the set of these libraries in the contrib directory.

## C++ Compiler {#kompiliator-c}

GCC starting from version 9 or Clang starting from version 8 is supported as the C++ compiler.
@ -148,7 +148,7 @@ ClickHouse использует для сборки некоторое коли

If you decided to use Clang, you can also install `libc++` and `lld` if you know what they are. Optionally, install `ccache` as well.

## The Building Process {#protsess-sborki}

Now you are ready to build ClickHouse. To place the build artifacts, it is recommended to create a separate directory, build, inside the ClickHouse directory:
@ -206,7 +206,7 @@ Mac OS X:

    ls -l programs/clickhouse

## Running the Built Version of ClickHouse {#zapusk-sobrannoi-versii-clickhouse}

To run the server under the current user, with logs output to the terminal, and using the example configuration files located in the sources, go to the `ClickHouse/programs/server/` directory (it is outside of the build directory) and run:
@ -233,7 +233,7 @@ Mac OS X:

    sudo service clickhouse-server stop
    sudo -u clickhouse ClickHouse/build/programs/clickhouse server --config-file /etc/clickhouse-server/config.xml

## Development Environment {#sreda-razrabotki}

If you do not know which development environment to use, we recommend CLion. CLion is commercial software, but it can be used free of charge for a trial period. It is also free for students. CLion can be used both on Linux and on Mac OS X.
@ -243,7 +243,7 @@ Mac OS X:

Just in case, note that CLion creates its own build directory, chooses the debug build type by default, uses its bundled version of CMake instead of the one you installed for configuration, and uses make instead of ninja to run build tasks. This is normal behavior; just keep it in mind to avoid confusion.

## Writing Code {#napisanie-koda}

A description of the ClickHouse architecture: https://clickhouse.tech/docs/ru/development/architecture/
@ -253,7 +253,7 @@ Mac OS X:

The list of tasks: https://github.com/ClickHouse/ClickHouse/issues?q=is%3Aopen+is%3Aissue+label%3A%22easy+task%22

## Test Data {#testovye-dannye}

Developing ClickHouse often requires loading realistic data sets. This is especially important for performance testing. We have prepared a data set of anonymized Yandex.Metrica data for this purpose. Loading it requires an additional 3 GB of disk space. This data is not required for most development tasks.
@ -274,7 +274,7 @@ Mac OS X:

    clickhouse-client --max_insert_block_size 100000 --query "INSERT INTO test.hits FORMAT TSV" < hits_v1.tsv
    clickhouse-client --max_insert_block_size 100000 --query "INSERT INTO test.visits FORMAT TSV" < visits_v1.tsv

## Creating a Pull Request {#sozdanie-pull-request}

Open your fork of the repository in the GitHub UI. If you were developing in a branch, select that branch. The "Pull request" button will be available on the page. In essence, it means "create a request to accept my changes into the main repository".
@ -3,15 +3,52 @@ toc_priority: 32
toc_title: Atomic
---

# Atomic {#atomic}

Supports non-blocking [DROP TABLE](#drop-detach-table) and [RENAME TABLE](#rename-table) queries and atomic [EXCHANGE TABLES t1 AND t2](#exchange-tables) queries. The `Atomic` database engine is used by default.

## Creating a Database {#creating-a-database}

``` sql
CREATE DATABASE test [ENGINE = Atomic];
```

## Specifics and Recommendations {#specifics-and-recommendations}

### UUID {#table-uuid}

Each table in an `Atomic` database has a unique [UUID](../../sql-reference/data-types/uuid.md) and stores its data in the directory `/clickhouse_path/store/xxx/xxxyyyyy-yyyy-yyyy-yyyy-yyyyyyyyyyyy/`, where `xxxyyyyy-yyyy-yyyy-yyyy-yyyyyyyyyyyy` is the UUID of the table.
Usually the UUID is generated automatically, but the user can also specify the UUID explicitly when creating the table (this is not recommended, though). To display the UUID in the `SHOW CREATE` query, you can use the [show_table_uuid_in_table_create_query_if_not_nil](../../operations/settings/settings.md#show_table_uuid_in_table_create_query_if_not_nil) setting. In that case, the result looks like:

```sql
CREATE TABLE name UUID '28f1c61c-2970-457a-bffe-454156ddcfef' (n UInt64) ENGINE = ...;
```

### RENAME TABLE {#rename-table}

`RENAME` queries are performed without changing the UUID or moving the table data. These queries do not wait for queries that are using the table to finish and are executed instantly.
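
A usage sketch (the table names are hypothetical):

``` sql
RENAME TABLE old_table TO new_table;
```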

### DROP/DETACH TABLE {#drop-detach-table}

When a `DROP TABLE` query is executed, no data is removed. The table is marked as dropped, its metadata is moved to the `/clickhouse_path/metadata_dropped/` directory, and the database notifies a background thread. The delay before the final removal of the data is set by the [database_atomic_delay_before_drop_table_sec](../../operations/server-configuration-parameters/settings.md#database_atomic_delay_before_drop_table_sec) setting.
You can specify synchronous mode with the `SYNC` modifier. Use the [database_atomic_wait_for_drop_and_detach_synchronously](../../operations/settings/settings.md#database_atomic_wait_for_drop_and_detach_synchronously) setting for this. In this case, the `DROP` query waits for the `SELECT`, `INSERT`, and other queries that are using the table to finish. The table is actually removed when it is no longer in use.
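
For example, a sketch of a synchronous drop (the table name is hypothetical):

``` sql
DROP TABLE hits_local SYNC;
```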

### EXCHANGE TABLES {#exchange-tables}

The `EXCHANGE` query swaps two tables atomically. Instead of the non-atomic operation:

```sql
RENAME TABLE new_table TO tmp, old_table TO new_table, tmp TO old_table;
```

you can use a single atomic query:

``` sql
EXCHANGE TABLES new_table AND old_table;
```

### ReplicatedMergeTree in Atomic Database {#replicatedmergetree-in-atomic-database}

For [ReplicatedMergeTree](../table-engines/mergetree-family/replication.md#table_engines-replication) tables, it is recommended not to specify the engine parameters (the path in ZooKeeper and the replica name). In this case, the configuration parameters [default_replica_path](../../operations/server-configuration-parameters/settings.md#default_replica_path) and [default_replica_name](../../operations/server-configuration-parameters/settings.md#default_replica_name) are used. If you want to specify the engine parameters explicitly, it is recommended to use the `{uuid}` macro. This is convenient because unique paths are automatically generated for each table in ZooKeeper.
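
A sketch of an explicit definition with the `{uuid}` macro (the table and its columns are hypothetical):

``` sql
CREATE TABLE test.visits (EventDate Date, CounterID UInt32)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{uuid}/{shard}', '{replica}')
ORDER BY (CounterID, EventDate);
```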

## See Also

- The [system.databases](../../operations/system-tables/databases.md) system table.