mirror of
https://github.com/ClickHouse/ClickHouse.git
synced 2024-11-17 21:24:28 +00:00
Merged with master
commit
2c2932e185
137
CHANGELOG.md
@ -1,3 +1,140 @@
|
||||
## ClickHouse release 19.1.6, 2019-01-24
|
||||
|
||||
### Backward Incompatible Change
|
||||
* Removed `ALTER MODIFY PRIMARY KEY` command because it was superseded by the `ALTER MODIFY ORDER BY` command. [#3887](https://github.com/yandex/ClickHouse/pull/3887) ([ztlpn](https://github.com/ztlpn))
|
||||
|
||||
### New Features
|
||||
* Added the ability to choose per-column codecs for the `Log` and `TinyLog` storage engines. [#4111](https://github.com/yandex/ClickHouse/pull/4111) ([alesapin](https://github.com/alesapin))
|
||||
* Added functions `filesystemAvailable`, `filesystemFree`, `filesystemCapacity`. [#4097](https://github.com/yandex/ClickHouse/pull/4097) ([bgranvea](https://github.com/bgranvea))
|
||||
* Add custom compression codecs. [#3899](https://github.com/yandex/ClickHouse/pull/3899) ([alesapin](https://github.com/alesapin))
|
||||
* Added hashing functions `xxHash64` and `xxHash32`. [#3905](https://github.com/yandex/ClickHouse/pull/3905) ([filimonov](https://github.com/filimonov))
|
||||
* Added multiple joins emulation (very experimental). [#3946](https://github.com/yandex/ClickHouse/pull/3946) ([4ertus2](https://github.com/4ertus2))
|
||||
* Added support for CatBoost multiclass models evaluation. Function `modelEvaluate` returns tuple with per-class raw predictions for multiclass models. `libcatboostmodel.so` should be built with [#607](https://github.com/catboost/catboost/pull/607). [#3959](https://github.com/yandex/ClickHouse/pull/3959) ([KochetovNicolai](https://github.com/KochetovNicolai))
|
||||
* Added `gccHash` function, which uses the same hash seed as [gcc](https://github.com/gcc-mirror/gcc/blob/41d6b10e96a1de98e90a7c0378437c3255814b16/libstdc%2B%2B-v3/include/bits/functional_hash.h#L191). [#4000](https://github.com/yandex/ClickHouse/pull/4000) ([sundy-li](https://github.com/sundy-li))
|
||||
* Added compression codec delta. [#4052](https://github.com/yandex/ClickHouse/pull/4052) ([alesapin](https://github.com/alesapin))
|
||||
* Added a multi-searcher to search for multiple constant strings in a big haystack. Added functions (`multiPosition`, `multiSearch`, `firstMatch`) * (` `, `UTF8`, `CaseInsensitive`, `CaseInsensitiveUTF8`). [#4053](https://github.com/yandex/ClickHouse/pull/4053) ([danlark1](https://github.com/danlark1))
|
||||
* Added ability to alter compression codecs. [#4054](https://github.com/yandex/ClickHouse/pull/4054) ([alesapin](https://github.com/alesapin))
|
||||
* Add ability to write data into HDFS and small refactoring. [#4084](https://github.com/yandex/ClickHouse/pull/4084) ([alesapin](https://github.com/alesapin))
|
||||
* Removed some redundant objects from compiled expressions cache (optimization). [#4042](https://github.com/yandex/ClickHouse/pull/4042) ([alesapin](https://github.com/alesapin))
|
||||
* Added functions `JavaHash`, `HiveHash`. [#3811](https://github.com/yandex/ClickHouse/pull/3811) ([shangshujie365](https://github.com/shangshujie365))
|
||||
* Added functions `left`, `right`, `trim`, `ltrim`, `rtrim`, `timestampadd`, `timestampsub`. [#3826](https://github.com/yandex/ClickHouse/pull/3826) ([blinkov](https://github.com/blinkov))
|
||||
* Added function `remoteSecure`. Function works as `remote`, but uses secure connection. [#4088](https://github.com/yandex/ClickHouse/pull/4088) ([proller](https://github.com/proller))
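
The `delta` codec entry above can be illustrated with a small sketch of delta encoding in general. This is not ClickHouse's implementation, which works on column data at the storage level. The names `delta_encode` and `delta_decode` are hypothetical and only show why slowly changing integer sequences become easier to compress once each value is replaced by its difference from the previous one.

```python
from itertools import accumulate


def delta_encode(values):
    """Keep the first value, then store each value's difference from its predecessor."""
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]


def delta_decode(deltas):
    """Rebuild the original sequence with a running sum over the deltas."""
    return list(accumulate(deltas))


# Illustrative data: near-consecutive timestamps turn into a run of small numbers.
timestamps = [1548288000, 1548288001, 1548288002, 1548288004]
encoded = delta_encode(timestamps)
assert delta_decode(encoded) == timestamps
```

The small deltas are what a general-purpose compressor applied afterwards can pack tightly; the transform itself does not shrink the data.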
|
||||
|
||||
### Improvements
|
||||
* Support for IF NOT EXISTS in ALTER TABLE ADD COLUMN statements, and for IF EXISTS in DROP/MODIFY/CLEAR/COMMENT COLUMN. [#3900](https://github.com/yandex/ClickHouse/pull/3900) ([bgranvea](https://github.com/bgranvea))
|
||||
* Function `parseDateTimeBestEffort`: support for formats `DD.MM.YYYY`, `DD.MM.YY`, `DD-MM-YYYY`, `DD-Mon-YYYY`, `DD/Month/YYYY` and similar. [#3922](https://github.com/yandex/ClickHouse/pull/3922) ([alexey-milovidov](https://github.com/alexey-milovidov))
|
||||
* Add a MergeTree setting `use_minimalistic_part_header_in_zookeeper`. If enabled, Replicated tables will store compact part metadata in a single part znode. This can dramatically reduce ZooKeeper snapshot size (especially if the tables have a lot of columns). Note that after enabling this setting you will not be able to downgrade to a version that doesn't support it. [#3960](https://github.com/yandex/ClickHouse/pull/3960) ([ztlpn](https://github.com/ztlpn))
|
||||
* Add a DFA-based implementation for functions `sequenceMatch` and `sequenceCount` in case the pattern doesn't contain time. [#4004](https://github.com/yandex/ClickHouse/pull/4004) ([ercolanelli-leo](https://github.com/ercolanelli-leo))
|
||||
* Changed the way CapnProtoInputStream creates actions so that it now supports jagged structures. [#4063](https://github.com/yandex/ClickHouse/pull/4063) ([Miniwoffer](https://github.com/Miniwoffer))
|
||||
* Better way to collect columns, tables and joins from AST when checking required columns. [#3930](https://github.com/yandex/ClickHouse/pull/3930) ([4ertus2](https://github.com/4ertus2))
|
||||
* Zero left padding PODArray so that -1 element is always valid and zeroed. It's used for branchless Offset access. [#3920](https://github.com/yandex/ClickHouse/pull/3920) ([amosbird](https://github.com/amosbird))
|
||||
* Performance improvement for int serialization. [#3968](https://github.com/yandex/ClickHouse/pull/3968) ([amosbird](https://github.com/amosbird))
|
||||
* Moved debian/ specific entries to debian/.gitignore [#4106](https://github.com/yandex/ClickHouse/pull/4106) ([gerasiov](https://github.com/gerasiov))
|
||||
* Decreased the number of connections in case of large number of Distributed tables in a single server. [#3726](https://github.com/yandex/ClickHouse/pull/3726) ([zhang2014](https://github.com/zhang2014))
|
||||
* Supported totals row for `WITH TOTALS` query for ODBC driver (ODBCDriver2 format). [#3836](https://github.com/yandex/ClickHouse/pull/3836) ([nightweb](https://github.com/nightweb))
|
||||
* Better constant expression folding. Possibility to skip unused shards if SELECT query filters by sharding_key (setting `distributed_optimize_skip_select_on_unused_shards`). [#3851](https://github.com/yandex/ClickHouse/pull/3851) ([abyss7](https://github.com/abyss7))
|
||||
* Do not log from odbc-bridge when there is no console. [#3857](https://github.com/yandex/ClickHouse/pull/3857) ([alesapin](https://github.com/alesapin))
|
||||
* Forbid using aggregate functions inside scalar subqueries. [#3865](https://github.com/yandex/ClickHouse/pull/3865) ([abyss7](https://github.com/abyss7))
|
||||
* Added ability to use Enums as integers inside if function. [#3875](https://github.com/yandex/ClickHouse/pull/3875) ([abyss7](https://github.com/abyss7))
|
||||
* Added `low_cardinality_allow_in_native_format` setting. If disabled, do not use `LowCardinality` type in native format. [#3879](https://github.com/yandex/ClickHouse/pull/3879) ([KochetovNicolai](https://github.com/KochetovNicolai))
|
||||
* Removed duplicate code. [#3915](https://github.com/yandex/ClickHouse/pull/3915) ([sergey-v-galtsev](https://github.com/sergey-v-galtsev))
|
||||
* Minor improvements in StorageKafka. [#3919](https://github.com/yandex/ClickHouse/pull/3919) ([alexey-milovidov](https://github.com/alexey-milovidov))
|
||||
* Automatically disable logs in negative tests. [#3940](https://github.com/yandex/ClickHouse/pull/3940) ([4ertus2](https://github.com/4ertus2))
|
||||
* Refactored SyntaxAnalyzer. [#4014](https://github.com/yandex/ClickHouse/pull/4014) ([4ertus2](https://github.com/4ertus2))
|
||||
* Reverted jemalloc patch which led to performance degradation. [#4018](https://github.com/yandex/ClickHouse/pull/4018) ([alexey-milovidov](https://github.com/alexey-milovidov))
|
||||
* Refactored QueryNormalizer. Unified column sources for ASTIdentifier and ASTQualifiedAsterisk (were different), removed column duplicates for ASTQualifiedAsterisk sources, cleared asterisks replacement. [#4031](https://github.com/yandex/ClickHouse/pull/4031) ([4ertus2](https://github.com/4ertus2))
|
||||
* Refactored code with ASTIdentifier. [#4056](https://github.com/yandex/ClickHouse/pull/4056) [#4077](https://github.com/yandex/ClickHouse/pull/4077) [#4087](https://github.com/yandex/ClickHouse/pull/4087) ([4ertus2](https://github.com/4ertus2))
|
||||
* Improve error message in `clickhouse-test` script when no ClickHouse binary was found. [#4130](https://github.com/yandex/ClickHouse/pull/4130) ([Miniwoffer](https://github.com/Miniwoffer))
|
||||
* Rewrote code to calculate integer conversion function monotonicity. [#3921](https://github.com/yandex/ClickHouse/pull/3921) ([alexey-milovidov](https://github.com/alexey-milovidov))
|
||||
* Fixed typos in comments. [#4089](https://github.com/yandex/ClickHouse/pull/4089) ([kvinty](https://github.com/kvinty))
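
The day-first formats listed for `parseDateTimeBestEffort` above can be pictured with a rough sketch. This is an illustration only: the actual ClickHouse function is not a list of format strings, and `DAY_FIRST_FORMATS` and `parse_day_first` are hypothetical names. It also assumes English month names, which is what Python's `strptime` expects under the default C locale.

```python
from datetime import datetime

# Hypothetical format list mirroring the day-first patterns named in the changelog entry.
DAY_FIRST_FORMATS = [
    "%d.%m.%Y",  # DD.MM.YYYY
    "%d.%m.%y",  # DD.MM.YY
    "%d-%m-%Y",  # DD-MM-YYYY
    "%d-%b-%Y",  # DD-Mon-YYYY
    "%d/%B/%Y",  # DD/Month/YYYY
]


def parse_day_first(text):
    """Try each format in turn and return the first successful parse."""
    for fmt in DAY_FIRST_FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    raise ValueError(f"unrecognized date: {text!r}")
```

For example, `parse_day_first("24-Jan-2019")` and `parse_day_first("24.01.2019")` both yield 24 January 2019 under these assumptions.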
|
||||
|
||||
### Build/Testing/Packaging Improvements
|
||||
* Added minimal support for powerpc build. [#4132](https://github.com/yandex/ClickHouse/pull/4132) ([danlark1](https://github.com/danlark1))
|
||||
* Fixed error when the server cannot start with the `bash: /usr/bin/clickhouse-extract-from-config: Operation not permitted` message within Docker or systemd-nspawn. [#4136](https://github.com/yandex/ClickHouse/pull/4136) ([alexey-milovidov](https://github.com/alexey-milovidov))
|
||||
* Updated `mariadb-client` library. Fixed one of issues found by UBSan. [#3924](https://github.com/yandex/ClickHouse/pull/3924) ([alexey-milovidov](https://github.com/alexey-milovidov))
|
||||
* Some fixes for UBSan builds. [#3926](https://github.com/yandex/ClickHouse/pull/3926) [#3948](https://github.com/yandex/ClickHouse/pull/3948) ([alexey-milovidov](https://github.com/alexey-milovidov))
|
||||
* Move docker images to 18.10 and add compatibility file for glibc >= 2.28 [#3965](https://github.com/yandex/ClickHouse/pull/3965) ([alesapin](https://github.com/alesapin))
|
||||
* Add an env variable for users who don't want to chown directories in the server docker image. [#3967](https://github.com/yandex/ClickHouse/pull/3967) ([alesapin](https://github.com/alesapin))
|
||||
* Stateful functional tests are run on a publicly available dataset. [#3969](https://github.com/yandex/ClickHouse/pull/3969) ([alexey-milovidov](https://github.com/alexey-milovidov))
|
||||
* Enabled most of the warnings from `-Weverything` in clang. Enabled `-Wpedantic`. [#3986](https://github.com/yandex/ClickHouse/pull/3986) ([alexey-milovidov](https://github.com/alexey-milovidov))
|
||||
* Link to libLLVM rather than to individual LLVM libs when USE_STATIC_LIBRARIES is off. [#3989](https://github.com/yandex/ClickHouse/pull/3989) ([orivej](https://github.com/orivej))
|
||||
* Added a few more warnings that are available only in clang 8. [#3993](https://github.com/yandex/ClickHouse/pull/3993) ([alexey-milovidov](https://github.com/alexey-milovidov))
|
||||
* Fixed bugs found by PVS-Studio. [#4013](https://github.com/yandex/ClickHouse/pull/4013) ([alexey-milovidov](https://github.com/alexey-milovidov))
|
||||
* Added sanitizer variables for test images. [#4072](https://github.com/yandex/ClickHouse/pull/4072) ([alesapin](https://github.com/alesapin))
|
||||
* clickhouse-server debian package will recommend `libcap2-bin` package to use `setcap` tool for setting capabilities. This is optional. [#4093](https://github.com/yandex/ClickHouse/pull/4093) ([alexey-milovidov](https://github.com/alexey-milovidov))
|
||||
* Improved compilation time, fixed includes. [#3898](https://github.com/yandex/ClickHouse/pull/3898) ([proller](https://github.com/proller))
|
||||
* Added performance tests for hash functions. [#3918](https://github.com/yandex/ClickHouse/pull/3918) ([filimonov](https://github.com/filimonov))
|
||||
* Fixed cyclic library dependencies. [#3958](https://github.com/yandex/ClickHouse/pull/3958) ([proller](https://github.com/proller))
|
||||
* Improved compilation with low available memory. [#4030](https://github.com/yandex/ClickHouse/pull/4030) ([proller](https://github.com/proller))
|
||||
|
||||
### Bug Fixes
|
||||
* Fix a bug in `remote` table function execution where wrong restrictions were used in `getStructureOfRemoteTable`. [#4009](https://github.com/yandex/ClickHouse/pull/4009) ([alesapin](https://github.com/alesapin))
|
||||
* Fix a leak of netlink sockets. They were placed in a pool where they were never deleted and new sockets were created at the start of a new thread when all current sockets were in use. [#4017](https://github.com/yandex/ClickHouse/pull/4017) ([ztlpn](https://github.com/ztlpn))
|
||||
* Regression in master. Fix "Unknown identifier" error in case column names appear in lambdas. [#4115](https://github.com/yandex/ClickHouse/pull/4115) ([4ertus2](https://github.com/4ertus2))
|
||||
* Fix bug with closing /proc/self/fd earlier than all fds were read from /proc. [#4120](https://github.com/yandex/ClickHouse/pull/4120) ([alesapin](https://github.com/alesapin))
|
||||
* Fixed misspellings in **comments** and **string literals** under `dbms`. [#4122](https://github.com/yandex/ClickHouse/pull/4122) ([maiha](https://github.com/maiha))
|
||||
* Fixed monotonic String to UInt conversion when a String is used in the primary key. [#3870](https://github.com/yandex/ClickHouse/pull/3870) ([zhang2014](https://github.com/zhang2014))
|
||||
* Add a check that the `SET send_logs_level = value` query accepts an appropriate value. [#3873](https://github.com/yandex/ClickHouse/pull/3873) ([s-mx](https://github.com/s-mx))
|
||||
* Fixed a race condition when executing a distributed ALTER task. The race condition led to more than one replica trying to execute the task and all replicas except one failing with a ZooKeeper error. [#3904](https://github.com/yandex/ClickHouse/pull/3904) ([ztlpn](https://github.com/ztlpn))
|
||||
* Fixed segfault in `arrayEnumerateUniq`, `arrayEnumerateDense` functions in case of some invalid arguments. [#3909](https://github.com/yandex/ClickHouse/pull/3909) ([alexey-milovidov](https://github.com/alexey-milovidov))
|
||||
* Fix UB in StorageMerge. [#3910](https://github.com/yandex/ClickHouse/pull/3910) ([amosbird](https://github.com/amosbird))
|
||||
* Fixed segfault in functions `addDays`, `subtractDays`. [#3913](https://github.com/yandex/ClickHouse/pull/3913) ([alexey-milovidov](https://github.com/alexey-milovidov))
|
||||
* Fixed error: functions `round`, `floor`, `trunc`, `ceil` may return bogus result when executed on integer argument and large negative scale. [#3914](https://github.com/yandex/ClickHouse/pull/3914) ([alexey-milovidov](https://github.com/alexey-milovidov))
|
||||
* Fixed a bug introduced by 'kill query sync' which leads to a core dump. [#3916](https://github.com/yandex/ClickHouse/pull/3916) ([fancyqlx](https://github.com/fancyqlx))
|
||||
* Fix bug with long delay after empty replication queue. [#3928](https://github.com/yandex/ClickHouse/pull/3928) ([alesapin](https://github.com/alesapin))
|
||||
* Don't do exponential backoff when there is nothing to do for task. [#3932](https://github.com/yandex/ClickHouse/pull/3932) ([alesapin](https://github.com/alesapin))
|
||||
* Fix a bug that led to hangups in threads that perform ALTERs of Replicated tables and in the thread that updates configuration from ZooKeeper. #2947 #3891 [#3934](https://github.com/yandex/ClickHouse/pull/3934) ([ztlpn](https://github.com/ztlpn))
|
||||
* Fixed error in internal implementation of `quantileTDigest` (found by Artem Vakhrushev). This error never happens in ClickHouse and was relevant only for those who use ClickHouse codebase as a library directly. [#3935](https://github.com/yandex/ClickHouse/pull/3935) ([alexey-milovidov](https://github.com/alexey-milovidov))
|
||||
* Fix bug with wrong prefix for ipv4 subnet masks. [#3945](https://github.com/yandex/ClickHouse/pull/3945) ([alesapin](https://github.com/alesapin))
|
||||
* Fix a bug when `from_zk` config elements weren't refreshed after a request to ZooKeeper timed out. #2947 [#3947](https://github.com/yandex/ClickHouse/pull/3947) ([ztlpn](https://github.com/ztlpn))
|
||||
* Fixed dictionary copying in the `LowCardinality::cloneEmpty()` method, which led to excessive memory usage when inserting into a table with a LowCardinality primary key. [#3955](https://github.com/yandex/ClickHouse/pull/3955) ([KochetovNicolai](https://github.com/KochetovNicolai))
|
||||
* Fixed crash (`std::terminate`) in rare cases when a new thread cannot be created due to exhausted resources. [#3956](https://github.com/yandex/ClickHouse/pull/3956) ([alexey-milovidov](https://github.com/alexey-milovidov))
|
||||
* Fix user and password forwarding for replicated tables queries. [#3957](https://github.com/yandex/ClickHouse/pull/3957) ([alesapin](https://github.com/alesapin))
|
||||
* Fixed very rare race condition that can happen when listing tables in Dictionary database while reloading dictionaries. [#3970](https://github.com/yandex/ClickHouse/pull/3970) ([alexey-milovidov](https://github.com/alexey-milovidov))
|
||||
* Fixed LowCardinality serialization for Native format in case of empty arrays. #3907 [#4011](https://github.com/yandex/ClickHouse/pull/4011) ([KochetovNicolai](https://github.com/KochetovNicolai))
|
||||
* Fixed incorrect result while using distinct by single LowCardinality numeric column. #3895 [#4012](https://github.com/yandex/ClickHouse/pull/4012) ([KochetovNicolai](https://github.com/KochetovNicolai))
|
||||
* Make compiled_expression_cache_size setting limited by default. [#4041](https://github.com/yandex/ClickHouse/pull/4041) ([alesapin](https://github.com/alesapin))
|
||||
* Fix ubsan bug in compression codecs. [#4069](https://github.com/yandex/ClickHouse/pull/4069) ([alesapin](https://github.com/alesapin))
|
||||
* Allow Kafka Engine to ignore some number of parsing errors per block. [#4094](https://github.com/yandex/ClickHouse/pull/4094) ([abyss7](https://github.com/abyss7))
|
||||
* Fixed glibc compatibility issues. [#4100](https://github.com/yandex/ClickHouse/pull/4100) ([alexey-milovidov](https://github.com/alexey-milovidov))
|
||||
* Fixed issues found by PVS-Studio. [#4103](https://github.com/yandex/ClickHouse/pull/4103) ([alexey-milovidov](https://github.com/alexey-milovidov))
|
||||
* Fix the way array join columns are collected. [#4121](https://github.com/yandex/ClickHouse/pull/4121) ([4ertus2](https://github.com/4ertus2))
|
||||
* Fixed incorrect result when HAVING was used with ROLLUP or CUBE. [#3756](https://github.com/yandex/ClickHouse/issues/3756) [#3837](https://github.com/yandex/ClickHouse/pull/3837) ([reflection](https://github.com/reflection))
|
||||
* Fixed specialized aggregation with LowCardinality key (in case when `compile` setting is enabled). [#3886](https://github.com/yandex/ClickHouse/pull/3886) ([KochetovNicolai](https://github.com/KochetovNicolai))
|
||||
* Fixed data type check in type conversion functions. [#3896](https://github.com/yandex/ClickHouse/pull/3896) ([zhang2014](https://github.com/zhang2014))
|
||||
* Fixed column aliases for query with `JOIN ON` syntax and distributed tables. [#3980](https://github.com/yandex/ClickHouse/pull/3980) ([zhang2014](https://github.com/zhang2014))
|
||||
* Fixed issues detected by UBSan. [#3021](https://github.com/yandex/ClickHouse/pull/3021) ([alexey-milovidov](https://github.com/alexey-milovidov))
|
||||
|
||||
### Doc fixes
|
||||
* Translated table engines related part to Chinese. [#3844](https://github.com/yandex/ClickHouse/pull/3844) ([lamber-ken](https://github.com/lamber-ken))
|
||||
* Fixed `toStartOfFiveMinute` description. [#4096](https://github.com/yandex/ClickHouse/pull/4096) ([cheesedosa](https://github.com/cheesedosa))
|
||||
* Added description for client `--secure` argument. [#3961](https://github.com/yandex/ClickHouse/pull/3961) ([vicdashkov](https://github.com/vicdashkov))
|
||||
* Added descriptions for settings `merge_tree_uniform_read_distribution`, `merge_tree_min_rows_for_concurrent_read`, `merge_tree_min_rows_for_seek`, `merge_tree_coarse_index_granularity`, `merge_tree_max_rows_to_use_cache` [#4024](https://github.com/yandex/ClickHouse/pull/4024) ([BayoNet](https://github.com/BayoNet))
|
||||
* Minor doc fixes. [#4098](https://github.com/yandex/ClickHouse/pull/4098) ([blinkov](https://github.com/blinkov))
|
||||
* Updated example for zookeeper config setting. [#3883](https://github.com/yandex/ClickHouse/pull/3883) [#3894](https://github.com/yandex/ClickHouse/pull/3894) ([ogorbacheva](https://github.com/ogorbacheva))
|
||||
* Updated info about escaping in formats Vertical, Pretty and VerticalRaw. [#4118](https://github.com/yandex/ClickHouse/pull/4118) ([ogorbacheva](https://github.com/ogorbacheva))
|
||||
* Added description of the functions for working with UUID. [#4059](https://github.com/yandex/ClickHouse/pull/4059) ([ogorbacheva](https://github.com/ogorbacheva))
|
||||
* Add the description of the CHECK TABLE query. [#3881](https://github.com/yandex/ClickHouse/pull/3881) [#4043](https://github.com/yandex/ClickHouse/pull/4043) ([ogorbacheva](https://github.com/ogorbacheva))
|
||||
* Translated the `zh/tests` doc to Chinese. [#4034](https://github.com/yandex/ClickHouse/pull/4034) ([sundy-li](https://github.com/sundy-li))
|
||||
* Added documentation about functions `multiPosition`, `firstMatch`, `multiSearch`. [#4123](https://github.com/yandex/ClickHouse/pull/4123) ([danlark1](https://github.com/danlark1))
|
||||
* Add puppet module to the list of the third party libraries. [#3862](https://github.com/yandex/ClickHouse/pull/3862) ([Felixoid](https://github.com/Felixoid))
|
||||
* Fixed typo in the English version of Creating a Table example [#3872](https://github.com/yandex/ClickHouse/pull/3872) ([areldar](https://github.com/areldar))
|
||||
* Mentioned the Nagios plugin for ClickHouse. [#3878](https://github.com/yandex/ClickHouse/pull/3878) ([lisuml](https://github.com/lisuml))
|
||||
* Update of query language syntax description. [#4065](https://github.com/yandex/ClickHouse/pull/4065) ([BayoNet](https://github.com/BayoNet))
|
||||
* Added documentation for per-column compression codecs. [#4073](https://github.com/yandex/ClickHouse/pull/4073) ([alex-krash](https://github.com/alex-krash))
|
||||
* Updated articles about CollapsingMergeTree, GraphiteMergeTree, Replicated*MergeTree, `CREATE TABLE` query [#4085](https://github.com/yandex/ClickHouse/pull/4085) ([BayoNet](https://github.com/BayoNet))
|
||||
* Other minor improvements. [#3897](https://github.com/yandex/ClickHouse/pull/3897) [#3923](https://github.com/yandex/ClickHouse/pull/3923) [#4066](https://github.com/yandex/ClickHouse/pull/4066) [#3860](https://github.com/yandex/ClickHouse/pull/3860) [#3906](https://github.com/yandex/ClickHouse/pull/3906) [#3936](https://github.com/yandex/ClickHouse/pull/3936) [#3975](https://github.com/yandex/ClickHouse/pull/3975) ([ogorbacheva](https://github.com/ogorbacheva)) ([ogorbacheva](https://github.com/ogorbacheva)) ([ogorbacheva](https://github.com/ogorbacheva)) ([blinkov](https://github.com/blinkov)) ([blinkov](https://github.com/blinkov)) ([sdk2](https://github.com/sdk2)) ([blinkov](https://github.com/blinkov))
|
||||
|
||||
### Other
|
||||
* Updated librdkafka to v1.0.0-RC5. Used cppkafka instead of raw C interface. [#4025](https://github.com/yandex/ClickHouse/pull/4025) ([abyss7](https://github.com/abyss7))
|
||||
* Fixed `hidden` on page title [#4033](https://github.com/yandex/ClickHouse/pull/4033) ([xboston](https://github.com/xboston))
|
||||
* Updated year in copyright to 2019. [#4039](https://github.com/yandex/ClickHouse/pull/4039) ([xboston](https://github.com/xboston))
|
||||
* Added check that server process is started from the data directory's owner. Do not start server from root. [#3785](https://github.com/yandex/ClickHouse/pull/3785) ([sergey-v-galtsev](https://github.com/sergey-v-galtsev))
|
||||
* Removed function `shardByHash`. [#3833](https://github.com/yandex/ClickHouse/pull/3833) ([alexey-milovidov](https://github.com/alexey-milovidov))
|
||||
* Fixed typo in ClusterCopier. [#3854](https://github.com/yandex/ClickHouse/pull/3854) ([dqminh](https://github.com/dqminh))
|
||||
* Minor grammar fixes. [#3855](https://github.com/yandex/ClickHouse/pull/3855) ([intgr](https://github.com/intgr))
|
||||
* Added test script to reproduce performance degradation in jemalloc. [#4036](https://github.com/yandex/ClickHouse/pull/4036) ([alexey-milovidov](https://github.com/alexey-milovidov))
|
||||
|
||||
## ClickHouse release 18.16.1, 2018-12-21
|
||||
|
||||
### Bug fixes:
|
||||
|
@ -3,6 +3,21 @@ cmake_minimum_required (VERSION 3.3)
|
||||
|
||||
set(CMAKE_MODULE_PATH ${CMAKE_MODULE_PATH} "${CMAKE_CURRENT_SOURCE_DIR}/cmake/Modules/")
|
||||
|
||||
option(ENABLE_IPO "Enable inter-procedural optimization (aka LTO)" OFF) # need cmake 3.9+
|
||||
if(ENABLE_IPO)
|
||||
cmake_policy(SET CMP0069 NEW)
|
||||
include(CheckIPOSupported)
|
||||
check_ipo_supported(RESULT IPO_SUPPORTED OUTPUT IPO_NOT_SUPPORTED)
|
||||
if(IPO_SUPPORTED)
|
||||
message(STATUS "IPO/LTO is supported, enabling")
|
||||
set(CMAKE_INTERPROCEDURAL_OPTIMIZATION TRUE)
|
||||
else()
|
||||
message(STATUS "IPO/LTO is not supported: <${IPO_NOT_SUPPORTED}>")
|
||||
endif()
|
||||
else()
|
||||
message(STATUS "IPO/LTO not enabled.")
|
||||
endif()
|
||||
|
||||
if (CMAKE_CXX_COMPILER_ID STREQUAL "GNU")
|
||||
# Require at least gcc 7
|
||||
if (CMAKE_CXX_COMPILER_VERSION VERSION_LESS 7 AND NOT CMAKE_VERSION VERSION_LESS 2.8.9)
|
||||
@ -81,7 +96,7 @@ option (ENABLE_TESTS "Enables tests" ON)
|
||||
if (CMAKE_SYSTEM_PROCESSOR MATCHES "amd64|x86_64")
|
||||
option (USE_INTERNAL_MEMCPY "Use internal implementation of 'memcpy' function instead of provided by libc. Only for x86_64." ON)
|
||||
|
||||
if (OS_LINUX AND NOT UNBUNDLED)
|
||||
if (OS_LINUX AND NOT UNBUNDLED AND MAKE_STATIC_LIBRARIES)
|
||||
option (GLIBC_COMPATIBILITY "Set to TRUE to enable compatibility with older glibc libraries. Only for x86_64, Linux. Implies USE_INTERNAL_MEMCPY." ON)
|
||||
if (GLIBC_COMPATIBILITY)
|
||||
message (STATUS "Some symbols from glibc will be replaced for compatibility")
|
||||
@ -120,7 +135,9 @@ else()
|
||||
message(STATUS "Disabling compiler -pipe option (have only ${AVAILABLE_PHYSICAL_MEMORY} mb of memory)")
|
||||
endif()
|
||||
|
||||
include (cmake/test_cpu.cmake)
|
||||
if(NOT DISABLE_CPU_OPTIMIZE)
|
||||
include(cmake/test_cpu.cmake)
|
||||
endif()
|
||||
|
||||
if(NOT COMPILER_CLANG) # clang: error: the clang compiler does not support '-march=native'
|
||||
option(ARCH_NATIVE "Enable -march=native compiler flag" ${ARCH_ARM})
|
||||
@ -229,9 +246,13 @@ include (cmake/find_re2.cmake)
|
||||
include (cmake/find_rdkafka.cmake)
|
||||
include (cmake/find_capnp.cmake)
|
||||
include (cmake/find_llvm.cmake)
|
||||
include (cmake/find_cpuid.cmake)
|
||||
include (cmake/find_cpuid.cmake) # Freebsd, bundled
|
||||
if (NOT USE_CPUID)
|
||||
include (cmake/find_cpuinfo.cmake) # Debian
|
||||
endif()
|
||||
include (cmake/find_libgsasl.cmake)
|
||||
include (cmake/find_libxml2.cmake)
|
||||
include (cmake/find_protobuf.cmake)
|
||||
include (cmake/find_hdfs3.cmake)
|
||||
include (cmake/find_consistent-hashing.cmake)
|
||||
include (cmake/find_base64.cmake)
|
||||
|
@ -28,7 +28,7 @@ find_library(METROHASH_LIBRARIES
|
||||
|
||||
find_path(METROHASH_INCLUDE_DIR
|
||||
NAMES metrohash.h
|
||||
PATHS ${METROHASH_ROOT_DIR}/include ${METROHASH_INCLUDE_PATHS}
|
||||
PATHS ${METROHASH_ROOT_DIR}/include PATH_SUFFIXES metrohash ${METROHASH_INCLUDE_PATHS}
|
||||
)
|
||||
|
||||
include(FindPackageHandleStandardArgs)
|
||||
|
@ -2,11 +2,11 @@ if (NOT ARCH_ARM)
|
||||
option (USE_INTERNAL_CPUID_LIBRARY "Set to FALSE to use system cpuid library instead of bundled" ${NOT_UNBUNDLED})
|
||||
endif ()
|
||||
|
||||
#if (USE_INTERNAL_CPUID_LIBRARY AND NOT EXISTS "${ClickHouse_SOURCE_DIR}/contrib/libcpuid/include/cpuid/libcpuid.h")
|
||||
# message (WARNING "submodule contrib/libcpuid is missing. to fix try run: \n git submodule update --init --recursive")
|
||||
# set (USE_INTERNAL_CPUID_LIBRARY 0)
|
||||
# set (MISSING_INTERNAL_CPUID_LIBRARY 1)
|
||||
#endif ()
|
||||
if (USE_INTERNAL_CPUID_LIBRARY AND NOT EXISTS "${ClickHouse_SOURCE_DIR}/contrib/libcpuid/CMakeLists.txt")
|
||||
message (WARNING "submodule contrib/libcpuid is missing. to fix try run: \n git submodule update --init --recursive")
|
||||
set (USE_INTERNAL_CPUID_LIBRARY 0)
|
||||
set (MISSING_INTERNAL_CPUID_LIBRARY 1)
|
||||
endif ()
|
||||
|
||||
if (NOT USE_INTERNAL_CPUID_LIBRARY)
|
||||
find_library (CPUID_LIBRARY cpuid)
|
||||
@ -20,10 +20,12 @@ if (CPUID_LIBRARY AND CPUID_INCLUDE_DIR)
|
||||
add_definitions(-DHAVE_STDINT_H)
|
||||
# TODO: make virtual target cpuid:cpuid with COMPILE_DEFINITIONS property
|
||||
endif ()
|
||||
set (USE_CPUID 1)
|
||||
elseif (NOT MISSING_INTERNAL_CPUID_LIBRARY)
|
||||
set (CPUID_INCLUDE_DIR ${ClickHouse_SOURCE_DIR}/contrib/libcpuid/include)
|
||||
set (USE_INTERNAL_CPUID_LIBRARY 1)
|
||||
set (CPUID_LIBRARY cpuid)
|
||||
set (USE_CPUID 1)
|
||||
endif ()
|
||||
|
||||
message (STATUS "Using cpuid: ${CPUID_INCLUDE_DIR} : ${CPUID_LIBRARY}")
|
||||
message (STATUS "Using cpuid=${USE_CPUID}: ${CPUID_INCLUDE_DIR} : ${CPUID_LIBRARY}")
|
||||
|
17
cmake/find_cpuinfo.cmake
Normal file
@ -0,0 +1,17 @@
|
||||
option(USE_INTERNAL_CPUINFO_LIBRARY "Set to FALSE to use system cpuinfo library instead of bundled" ${NOT_UNBUNDLED})
|
||||
|
||||
if(NOT USE_INTERNAL_CPUINFO_LIBRARY)
|
||||
find_library(CPUINFO_LIBRARY cpuinfo)
|
||||
find_path(CPUINFO_INCLUDE_DIR NAMES cpuinfo.h PATHS ${CPUINFO_INCLUDE_PATHS})
|
||||
endif()
|
||||
|
||||
if(CPUINFO_LIBRARY AND CPUINFO_INCLUDE_DIR)
|
||||
set(USE_CPUINFO 1)
|
||||
elseif(NOT MISSING_INTERNAL_CPUINFO_LIBRARY)
|
||||
set(CPUINFO_INCLUDE_DIR ${ClickHouse_SOURCE_DIR}/contrib/libcpuinfo/include)
|
||||
set(USE_INTERNAL_CPUINFO_LIBRARY 1)
|
||||
set(CPUINFO_LIBRARY cpuinfo)
|
||||
set(USE_CPUINFO 1)
|
||||
endif()
|
||||
|
||||
message(STATUS "Using cpuinfo=${USE_CPUINFO}: ${CPUINFO_INCLUDE_DIR} : ${CPUINFO_LIBRARY}")
|
@ -8,13 +8,22 @@ if (NOT EXISTS "${ClickHouse_SOURCE_DIR}/contrib/googletest/googletest/CMakeList
|
||||
set (MISSING_INTERNAL_GTEST_LIBRARY 1)
|
||||
endif ()
|
||||
|
||||
if (NOT USE_INTERNAL_GTEST_LIBRARY)
|
||||
find_package (GTest)
|
||||
endif ()
|
||||
|
||||
if (NOT GTEST_INCLUDE_DIRS AND NOT MISSING_INTERNAL_GTEST_LIBRARY)
|
||||
if(NOT USE_INTERNAL_GTEST_LIBRARY)
|
||||
# TODO: autodetect of GTEST_SRC_DIR by EXISTS /usr/src/googletest/CMakeLists.txt
|
||||
if(NOT GTEST_SRC_DIR)
|
||||
find_package(GTest)
|
||||
endif()
|
||||
endif()
|
||||
|
||||
if (NOT GTEST_SRC_DIR AND NOT GTEST_INCLUDE_DIRS AND NOT MISSING_INTERNAL_GTEST_LIBRARY)
|
||||
set (USE_INTERNAL_GTEST_LIBRARY 1)
|
||||
set (GTEST_MAIN_LIBRARIES gtest_main)
|
||||
set (GTEST_INCLUDE_DIRS ${ClickHouse_SOURCE_DIR}/contrib/googletest/googletest)
|
||||
endif ()
|
||||
|
||||
message (STATUS "Using gtest: ${GTEST_INCLUDE_DIRS} : ${GTEST_MAIN_LIBRARIES}")
|
||||
if((GTEST_INCLUDE_DIRS AND GTEST_MAIN_LIBRARIES) OR GTEST_SRC_DIR)
|
||||
set(USE_GTEST 1)
|
||||
endif()
|
||||
|
||||
message (STATUS "Using gtest=${USE_GTEST}: ${GTEST_INCLUDE_DIRS} : ${GTEST_MAIN_LIBRARIES} : ${GTEST_SRC_DIR}")
|
||||
|
@ -1,18 +1,29 @@
|
||||
option (USE_INTERNAL_PROTOBUF_LIBRARY "Set to FALSE to use system protobuf instead of bundled" ON)
|
||||
option(USE_INTERNAL_PROTOBUF_LIBRARY "Set to FALSE to use system protobuf instead of bundled" ${NOT_UNBUNDLED})
|
||||
|
||||
if (NOT USE_INTERNAL_PROTOBUF_LIBRARY)
|
||||
if(NOT EXISTS "${ClickHouse_SOURCE_DIR}/contrib/protobuf/cmake/CMakeLists.txt")
|
||||
if(USE_INTERNAL_PROTOBUF_LIBRARY)
|
||||
message(WARNING "submodule contrib/protobuf is missing. to fix try run: \n git submodule update --init --recursive")
|
||||
set(USE_INTERNAL_PROTOBUF_LIBRARY 0)
|
||||
endif()
|
||||
set(MISSING_INTERNAL_PROTOBUF_LIBRARY 1)
|
||||
endif()
|
||||
|
||||
if(NOT USE_INTERNAL_PROTOBUF_LIBRARY)
|
||||
find_package(Protobuf)
|
||||
endif ()
|
||||
endif()
|
||||
|
||||
if (Protobuf_LIBRARY AND Protobuf_INCLUDE_DIR)
|
||||
else ()
|
||||
set(Protobuf_INCLUDE_DIR ${CMAKE_SOURCE_DIR}/contrib/protobuf/src)
|
||||
set(USE_PROTOBUF 1)
|
||||
elseif(NOT MISSING_INTERNAL_PROTOBUF_LIBRARY)
|
||||
set(Protobuf_INCLUDE_DIR ${ClickHouse_SOURCE_DIR}/contrib/protobuf/src)
|
||||
|
||||
set(USE_PROTOBUF 1)
|
||||
set(USE_INTERNAL_PROTOBUF_LIBRARY 1)
|
||||
set(Protobuf_LIBRARY libprotobuf)
|
||||
set(Protobuf_PROTOC_LIBRARY libprotoc)
|
||||
set(Protobuf_LITE_LIBRARY libprotobuf-lite)
|
||||
|
||||
set(Protobuf_PROTOC_EXECUTABLE ${CMAKE_BINARY_DIR}/contrib/protobuf/cmake/protoc)
|
||||
set(Protobuf_PROTOC_EXECUTABLE ${ClickHouse_BINARY_DIR}/contrib/protobuf/cmake/protoc)
|
||||
|
||||
if(NOT DEFINED PROTOBUF_GENERATE_CPP_APPEND_PATH)
|
||||
set(PROTOBUF_GENERATE_CPP_APPEND_PATH TRUE)
|
||||
@ -77,4 +88,4 @@ else ()
|
||||
endfunction()
|
||||
endif()
|
||||
|
||||
message (STATUS "Using protobuf: ${Protobuf_INCLUDE_DIR} : ${Protobuf_LIBRARY}")
|
||||
message(STATUS "Using protobuf=${USE_PROTOBUF}: ${Protobuf_INCLUDE_DIR} : ${Protobuf_LIBRARY}")
|
||||
|
@ -2,6 +2,11 @@ if (NOT ARCH_ARM AND NOT ARCH_32 AND NOT APPLE)
|
||||
option (ENABLE_RDKAFKA "Enable kafka" ON)
|
||||
endif ()
|
||||
|
||||
if (NOT EXISTS "${ClickHouse_SOURCE_DIR}/contrib/cppkafka/CMakeLists.txt")
|
||||
message (WARNING "submodule contrib/cppkafka is missing. to fix try run: \n git submodule update --init --recursive")
|
||||
set (ENABLE_RDKAFKA 0)
|
||||
endif ()
|
||||
|
||||
if (ENABLE_RDKAFKA)
|
||||
|
||||
if (OS_LINUX AND NOT ARCH_ARM)
|
||||
|
@ -14,6 +14,7 @@ if (ZSTD_LIBRARY AND ZSTD_INCLUDE_DIR)
|
||||
else ()
|
||||
set (USE_INTERNAL_ZSTD_LIBRARY 1)
|
||||
set (ZSTD_LIBRARY zstd)
|
||||
set (ZSTD_INCLUDE_DIR ${ClickHouse_SOURCE_DIR}/contrib/zstd/lib)
|
||||
endif ()
|
||||
|
||||
message (STATUS "Using zstd: ${ZSTD_INCLUDE_DIR} : ${ZSTD_LIBRARY}")
|
||||
|
@ -2,4 +2,5 @@ set(DIVIDE_INCLUDE_DIR ${ClickHouse_SOURCE_DIR}/contrib/libdivide)
|
||||
set(COMMON_INCLUDE_DIR ${ClickHouse_SOURCE_DIR}/libs/libcommon/include ${ClickHouse_BINARY_DIR}/libs/libcommon/include)
|
||||
set(DBMS_INCLUDE_DIR ${ClickHouse_SOURCE_DIR}/dbms/src ${ClickHouse_BINARY_DIR}/dbms/src)
|
||||
set(DOUBLE_CONVERSION_CONTRIB_INCLUDE_DIR ${ClickHouse_SOURCE_DIR}/contrib/double-conversion)
|
||||
set(METROHASH_CONTRIB_INCLUDE_DIR ${ClickHouse_SOURCE_DIR}/contrib/libmetrohash/src)
|
||||
set(PCG_RANDOM_INCLUDE_DIR ${ClickHouse_SOURCE_DIR}/contrib/libpcg-random/include)
|
||||
|
26
contrib/CMakeLists.txt
vendored
@ -107,6 +107,11 @@ if (USE_INTERNAL_SSL_LIBRARY)
|
||||
if (NOT MAKE_STATIC_LIBRARIES)
|
||||
set (BUILD_SHARED 1)
|
||||
endif ()
|
||||
|
||||
# By default, ${CMAKE_INSTALL_PREFIX}/etc/ssl is selected - that is not what we need.
|
||||
# We need to use system wide ssl directory.
|
||||
set (OPENSSLDIR "/etc/ssl")
|
||||
|
||||
set (LIBRESSL_SKIP_INSTALL 1 CACHE INTERNAL "")
|
||||
add_subdirectory (ssl)
|
||||
target_include_directories(${OPENSSL_CRYPTO_LIBRARY} SYSTEM PUBLIC ${OPENSSL_INCLUDE_DIR})
|
||||
@ -166,13 +171,16 @@ if (USE_INTERNAL_POCO_LIBRARY)
|
||||
endif ()
|
||||
endif ()
|
||||
|
||||
if (USE_INTERNAL_GTEST_LIBRARY)
|
||||
if(USE_INTERNAL_GTEST_LIBRARY)
|
||||
# Google Test from sources
|
||||
add_subdirectory(${ClickHouse_SOURCE_DIR}/contrib/googletest/googletest ${CMAKE_CURRENT_BINARY_DIR}/googletest)
|
||||
# avoid problems with <regexp.h>
|
||||
target_compile_definitions (gtest INTERFACE GTEST_HAS_POSIX_RE=0)
|
||||
target_include_directories (gtest SYSTEM INTERFACE ${ClickHouse_SOURCE_DIR}/contrib/googletest/include)
|
||||
endif ()
|
||||
elseif(GTEST_SRC_DIR)
|
||||
add_subdirectory(${GTEST_SRC_DIR}/googletest ${CMAKE_CURRENT_BINARY_DIR}/googletest)
|
||||
target_compile_definitions(gtest INTERFACE GTEST_HAS_POSIX_RE=0)
|
||||
endif()
|
||||
|
||||
if (USE_INTERNAL_LLVM_LIBRARY)
|
||||
file(GENERATE OUTPUT ${CMAKE_CURRENT_BINARY_DIR}/empty.cpp CONTENT " ")
|
||||
@ -207,14 +215,14 @@ if (USE_INTERNAL_LIBXML2_LIBRARY)
|
||||
add_subdirectory(libxml2-cmake)
|
||||
endif ()
|
||||
|
||||
if (USE_INTERNAL_PROTOBUF_LIBRARY)
|
||||
set(protobuf_BUILD_TESTS OFF CACHE INTERNAL "" FORCE)
|
||||
set(protobuf_BUILD_SHARED_LIBS OFF CACHE INTERNAL "" FORCE)
|
||||
set(protobuf_WITH_ZLIB 0 CACHE INTERNAL "" FORCE) # actually will use zlib, but skip find
|
||||
add_subdirectory(protobuf/cmake)
|
||||
endif ()
|
||||
|
||||
if (USE_INTERNAL_HDFS3_LIBRARY)
|
||||
include(${ClickHouse_SOURCE_DIR}/cmake/find_protobuf.cmake)
|
||||
if (USE_INTERNAL_PROTOBUF_LIBRARY)
|
||||
set(protobuf_BUILD_TESTS OFF CACHE INTERNAL "" FORCE)
|
||||
set(protobuf_BUILD_SHARED_LIBS OFF CACHE INTERNAL "" FORCE)
|
||||
set(protobuf_WITH_ZLIB 0 CACHE INTERNAL "" FORCE) # actually will use zlib, but skip find
|
||||
add_subdirectory(protobuf/cmake)
|
||||
endif ()
|
||||
add_subdirectory(libhdfs3-cmake)
|
||||
endif ()
|
||||
|
||||
|
2
contrib/jemalloc
vendored
@ -1 +1 @@
|
||||
Subproject commit 41b7372eadee941b9164751b8d4963f915d3ceae
|
||||
Subproject commit cd2931ad9bbd78208565716ab102e86d858c2fff
|
@ -1,5 +1,5 @@
|
||||
if (HAVE_SSE42) # Not used. Pretty easy to port.
|
||||
set (SOURCES_SSE42_ONLY src/metrohash128crc.cpp)
|
||||
set (SOURCES_SSE42_ONLY src/metrohash128crc.cpp src/metrohash128crc.h)
|
||||
endif ()
|
||||
|
||||
add_library(metrohash
|
||||
|
@ -1,22 +1,201 @@
|
||||
The MIT License (MIT)
|
||||
Apache License
|
||||
Version 2.0, January 2004
|
||||
http://www.apache.org/licenses/
|
||||
|
||||
Copyright (c) 2015 J. Andrew Rogers
|
||||
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
|
||||
|
||||
Permission is hereby granted, free of charge, to any person obtaining a copy
|
||||
of this software and associated documentation files (the "Software"), to deal
|
||||
in the Software without restriction, including without limitation the rights
|
||||
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
||||
copies of the Software, and to permit persons to whom the Software is
|
||||
furnished to do so, subject to the following conditions:
|
||||
1. Definitions.
|
||||
|
||||
The above copyright notice and this permission notice shall be included in all
|
||||
copies or substantial portions of the Software.
|
||||
"License" shall mean the terms and conditions for use, reproduction,
|
||||
and distribution as defined by Sections 1 through 9 of this document.
|
||||
|
||||
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
||||
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
||||
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
||||
SOFTWARE.
|
||||
"Licensor" shall mean the copyright owner or entity authorized by
|
||||
the copyright owner that is granting the License.
|
||||
|
||||
"Legal Entity" shall mean the union of the acting entity and all
|
||||
other entities that control, are controlled by, or are under common
|
||||
control with that entity. For the purposes of this definition,
|
||||
"control" means (i) the power, direct or indirect, to cause the
|
||||
direction or management of such entity, whether by contract or
|
||||
otherwise, or (ii) ownership of fifty percent (50%) or more of the
|
||||
outstanding shares, or (iii) beneficial ownership of such entity.
|
||||
|
||||
"You" (or "Your") shall mean an individual or Legal Entity
|
||||
exercising permissions granted by this License.
|
||||
|
||||
"Source" form shall mean the preferred form for making modifications,
|
||||
including but not limited to software source code, documentation
|
||||
source, and configuration files.
|
||||
|
||||
"Object" form shall mean any form resulting from mechanical
|
||||
transformation or translation of a Source form, including but
|
||||
not limited to compiled object code, generated documentation,
|
||||
and conversions to other media types.
|
||||
|
||||
"Work" shall mean the work of authorship, whether in Source or
|
||||
Object form, made available under the License, as indicated by a
|
||||
copyright notice that is included in or attached to the work
|
||||
(an example is provided in the Appendix below).
|
||||
|
||||
"Derivative Works" shall mean any work, whether in Source or Object
|
||||
form, that is based on (or derived from) the Work and for which the
|
||||
editorial revisions, annotations, elaborations, or other modifications
|
||||
represent, as a whole, an original work of authorship. For the purposes
|
||||
of this License, Derivative Works shall not include works that remain
|
||||
separable from, or merely link (or bind by name) to the interfaces of,
|
||||
the Work and Derivative Works thereof.
|
||||
|
||||
"Contribution" shall mean any work of authorship, including
|
||||
the original version of the Work and any modifications or additions
|
||||
to that Work or Derivative Works thereof, that is intentionally
|
||||
submitted to Licensor for inclusion in the Work by the copyright owner
|
||||
or by an individual or Legal Entity authorized to submit on behalf of
|
||||
the copyright owner. For the purposes of this definition, "submitted"
|
||||
means any form of electronic, verbal, or written communication sent
|
||||
to the Licensor or its representatives, including but not limited to
|
||||
communication on electronic mailing lists, source code control systems,
|
||||
and issue tracking systems that are managed by, or on behalf of, the
|
||||
Licensor for the purpose of discussing and improving the Work, but
|
||||
excluding communication that is conspicuously marked or otherwise
|
||||
designated in writing by the copyright owner as "Not a Contribution."
|
||||
|
||||
"Contributor" shall mean Licensor and any individual or Legal Entity
|
||||
on behalf of whom a Contribution has been received by Licensor and
|
||||
subsequently incorporated within the Work.
|
||||
|
||||
2. Grant of Copyright License. Subject to the terms and conditions of
|
||||
this License, each Contributor hereby grants to You a perpetual,
|
||||
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
|
||||
copyright license to reproduce, prepare Derivative Works of,
|
||||
publicly display, publicly perform, sublicense, and distribute the
|
||||
Work and such Derivative Works in Source or Object form.
|
||||
|
||||
3. Grant of Patent License. Subject to the terms and conditions of
|
||||
this License, each Contributor hereby grants to You a perpetual,
|
||||
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
|
||||
(except as stated in this section) patent license to make, have made,
|
||||
use, offer to sell, sell, import, and otherwise transfer the Work,
|
||||
where such license applies only to those patent claims licensable
|
||||
by such Contributor that are necessarily infringed by their
|
||||
Contribution(s) alone or by combination of their Contribution(s)
|
||||
with the Work to which such Contribution(s) was submitted. If You
|
||||
institute patent litigation against any entity (including a
|
||||
cross-claim or counterclaim in a lawsuit) alleging that the Work
|
||||
or a Contribution incorporated within the Work constitutes direct
|
||||
or contributory patent infringement, then any patent licenses
|
||||
granted to You under this License for that Work shall terminate
|
||||
as of the date such litigation is filed.
|
||||
|
||||
4. Redistribution. You may reproduce and distribute copies of the
|
||||
Work or Derivative Works thereof in any medium, with or without
|
||||
modifications, and in Source or Object form, provided that You
|
||||
meet the following conditions:
|
||||
|
||||
(a) You must give any other recipients of the Work or
|
||||
Derivative Works a copy of this License; and
|
||||
|
||||
(b) You must cause any modified files to carry prominent notices
|
||||
stating that You changed the files; and
|
||||
|
||||
(c) You must retain, in the Source form of any Derivative Works
|
||||
that You distribute, all copyright, patent, trademark, and
|
||||
attribution notices from the Source form of the Work,
|
||||
excluding those notices that do not pertain to any part of
|
||||
the Derivative Works; and
|
||||
|
||||
(d) If the Work includes a "NOTICE" text file as part of its
|
||||
distribution, then any Derivative Works that You distribute must
|
||||
include a readable copy of the attribution notices contained
|
||||
within such NOTICE file, excluding those notices that do not
|
||||
pertain to any part of the Derivative Works, in at least one
|
||||
of the following places: within a NOTICE text file distributed
|
||||
as part of the Derivative Works; within the Source form or
|
||||
documentation, if provided along with the Derivative Works; or,
|
||||
within a display generated by the Derivative Works, if and
|
||||
wherever such third-party notices normally appear. The contents
|
||||
of the NOTICE file are for informational purposes only and
|
||||
do not modify the License. You may add Your own attribution
|
||||
notices within Derivative Works that You distribute, alongside
|
||||
or as an addendum to the NOTICE text from the Work, provided
|
||||
that such additional attribution notices cannot be construed
|
||||
as modifying the License.
|
||||
|
||||
You may add Your own copyright statement to Your modifications and
|
||||
may provide additional or different license terms and conditions
|
||||
for use, reproduction, or distribution of Your modifications, or
|
||||
for any such Derivative Works as a whole, provided Your use,
|
||||
reproduction, and distribution of the Work otherwise complies with
|
||||
the conditions stated in this License.
|
||||
|
||||
5. Submission of Contributions. Unless You explicitly state otherwise,
|
||||
any Contribution intentionally submitted for inclusion in the Work
|
||||
by You to the Licensor shall be under the terms and conditions of
|
||||
this License, without any additional terms or conditions.
|
||||
Notwithstanding the above, nothing herein shall supersede or modify
|
||||
the terms of any separate license agreement you may have executed
|
||||
with Licensor regarding such Contributions.
|
||||
|
||||
6. Trademarks. This License does not grant permission to use the trade
|
||||
names, trademarks, service marks, or product names of the Licensor,
|
||||
except as required for reasonable and customary use in describing the
|
||||
origin of the Work and reproducing the content of the NOTICE file.
|
||||
|
||||
7. Disclaimer of Warranty. Unless required by applicable law or
|
||||
agreed to in writing, Licensor provides the Work (and each
|
||||
Contributor provides its Contributions) on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
|
||||
implied, including, without limitation, any warranties or conditions
|
||||
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
|
||||
PARTICULAR PURPOSE. You are solely responsible for determining the
|
||||
appropriateness of using or redistributing the Work and assume any
|
||||
risks associated with Your exercise of permissions under this License.
|
||||
|
||||
8. Limitation of Liability. In no event and under no legal theory,
|
||||
whether in tort (including negligence), contract, or otherwise,
|
||||
unless required by applicable law (such as deliberate and grossly
|
||||
negligent acts) or agreed to in writing, shall any Contributor be
|
||||
liable to You for damages, including any direct, indirect, special,
|
||||
incidental, or consequential damages of any character arising as a
|
||||
result of this License or out of the use or inability to use the
|
||||
Work (including but not limited to damages for loss of goodwill,
|
||||
work stoppage, computer failure or malfunction, or any and all
|
||||
other commercial damages or losses), even if such Contributor
|
||||
has been advised of the possibility of such damages.
|
||||
|
||||
9. Accepting Warranty or Additional Liability. While redistributing
|
||||
the Work or Derivative Works thereof, You may choose to offer,
|
||||
and charge a fee for, acceptance of support, warranty, indemnity,
|
||||
or other liability obligations and/or rights consistent with this
|
||||
License. However, in accepting such obligations, You may act only
|
||||
on Your own behalf and on Your sole responsibility, not on behalf
|
||||
of any other Contributor, and only if You agree to indemnify,
|
||||
defend, and hold each Contributor harmless for any liability
|
||||
incurred by, or claims asserted against, such Contributor by reason
|
||||
of your accepting any such warranty or additional liability.
|
||||
|
||||
END OF TERMS AND CONDITIONS
|
||||
|
||||
APPENDIX: How to apply the Apache License to your work.
|
||||
|
||||
To apply the Apache License to your work, attach the following
|
||||
boilerplate notice, with the fields enclosed by brackets "[]"
|
||||
replaced with your own identifying information. (Don't include
|
||||
the brackets!) The text should be enclosed in the appropriate
|
||||
comment syntax for the file format. We also recommend that a
|
||||
file or class name and description of purpose be included on the
|
||||
same "printed page" as the copyright notice for easier
|
||||
identification within third-party archives.
|
||||
|
||||
Copyright [yyyy] [name of copyright owner]
|
||||
|
||||
Licensed under the Apache License, Version 2.0 (the "License");
|
||||
you may not use this file except in compliance with the License.
|
||||
You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
|
@ -5,12 +5,44 @@ MetroHash is a set of state-of-the-art hash functions for *non-cryptographic* us
|
||||
* Fastest general-purpose functions for bulk hashing.
|
||||
* Fastest general-purpose functions for small, variable length keys.
|
||||
* Robust statistical bias profile, similar to the MD5 cryptographic hash.
|
||||
* Hashes can be constructed incrementally (**new**)
|
||||
* 64-bit, 128-bit, and 128-bit CRC variants currently available.
|
||||
* Optimized for modern x86-64 microarchitectures.
|
||||
* Elegant, compact, readable functions.
|
||||
|
||||
You can read more about the design and history [here](http://www.jandrewrogers.com/2015/05/27/metrohash/).
|
||||
|
||||
## News
|
||||
|
||||
### 23 October 2018
|
||||
|
||||
The project has been re-licensed under Apache License v2.0. The purpose of this license change is consistency with the imminent release of MetroHash v2.0, which is also licensed under the Apache license.
|
||||
|
||||
### 27 July 2015
|
||||
|
||||
Two new 64-bit and 128-bit algorithms add the ability to construct hashes incrementally. In addition to supporting incremental construction, the algorithms are slightly superior to the prior versions.
|
||||
|
||||
A big change is that these new algorithms are implemented as C++ classes that support both incremental and stateless hashing. These classes also have a static method for verifying the implementation against the test vectors built into the classes. Implementations are now fully contained by their respective headers e.g. "metrohash128.h".
|
||||
|
||||
*Note: an incremental version of the 128-bit CRC version is on its way but is not included in this push.*
|
||||
|
||||
**Usage Example For Stateless Hashing**
|
||||
|
||||
`MetroHash128::Hash(key, key_length, hash_ptr, seed)`
|
||||
|
||||
**Usage Example For Incremental Hashing**
|
||||
|
||||
`MetroHash128 hasher;`
|
||||
`hasher.Update(partial_key, partial_key_length);`
|
||||
`...`
|
||||
`hasher.Update(partial_key, partial_key_length);`
|
||||
`hasher.Finalize(hash_ptr);`
|
||||
|
||||
An `Initialize(seed)` method allows the hasher objects to be reused.
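The incremental interface described above relies on a common pattern: buffer input until a full block is available, mix whole blocks, and keep the tail for the next call, so that any split of the input yields the same result as one stateless call. The sketch below illustrates only that pattern. `ToyIncrementalHash`, its 8-byte block size, and its mixing constants are hypothetical stand-ins chosen for brevity; they are not MetroHash and do not reproduce its output.

```cpp
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Hypothetical stand-in hasher (NOT MetroHash). It only demonstrates the
// buffer-then-mix structure behind an Initialize/Update/Finalize interface.
class ToyIncrementalHash
{
public:
    explicit ToyIncrementalHash(uint64_t seed = 0) { Initialize(seed); }

    // Reset the state so the object can be reused for a new input.
    void Initialize(uint64_t seed)
    {
        state = seed ^ 0x9E3779B97F4A7C15ULL;
        bytes = 0;
    }

    void Update(const uint8_t * buffer, uint64_t length)
    {
        const uint8_t * ptr = buffer;
        const uint8_t * const end = buffer + length;

        // Top up a partially filled block left over from a previous call.
        if (bytes % 8)
        {
            uint64_t fill = 8 - (bytes % 8);
            if (fill > length)
                fill = length;
            memcpy(block + (bytes % 8), ptr, static_cast<size_t>(fill));
            ptr += fill;
            bytes += fill;
            if (bytes % 8)
                return; // block is still incomplete
            Mix(block);
        }

        // Consume whole blocks directly from the source.
        bytes += static_cast<uint64_t>(end - ptr);
        while (end - ptr >= 8)
        {
            Mix(ptr);
            ptr += 8;
        }

        // Keep the tail for the next Update() or for Finalize().
        if (ptr < end)
            memcpy(block, ptr, static_cast<size_t>(end - ptr));
    }

    uint64_t Finalize()
    {
        // Fold in the buffered tail bytes, then the total input length.
        for (uint64_t i = 0; i < bytes % 8; ++i)
            state = (state ^ block[i]) * 0x100000001B3ULL;
        state = (state ^ bytes) * 0x100000001B3ULL;
        return state;
    }

    // Stateless convenience wrapper built on the incremental path.
    static uint64_t Hash(const uint8_t * buffer, uint64_t length, uint64_t seed = 0)
    {
        ToyIncrementalHash h(seed);
        h.Update(buffer, length);
        return h.Finalize();
    }

private:
    void Mix(const uint8_t * p)
    {
        uint64_t w;
        memcpy(&w, p, sizeof(w)); // alignment-safe load
        state = (state ^ w) * 0x100000001B3ULL;
        state = (state >> 29) | (state << 35);
    }

    uint64_t state;
    uint64_t bytes;
    uint8_t block[8];
};
```

Because block boundaries depend only on the running byte count, feeding the same bytes through several `Update` calls and through a single `Hash` call should agree; calling `Initialize(seed)` afterwards resets the object for reuse.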
|
||||
|
||||
|
||||
### 27 May 2015
|
||||
|
||||
Six hash functions have been included in the initial release:
|
||||
|
||||
* 64-bit hash functions, "metrohash64_1" and "metrohash64_2"
|
||||
|
@ -1,7 +1,4 @@
|
||||
origin: git@github.com:jandrewrogers/MetroHash.git
|
||||
commit d9dee18a54a8a6766e24c1950b814ac7ca9d1a89
|
||||
Merge: 761e8a4 3d06b24
|
||||
origin: https://github.com/jandrewrogers/MetroHash.git
|
||||
commit 690a521d9beb2e1050cc8f273fdabc13b31bf8f6 tag: v1.1.3
|
||||
Author: J. Andrew Rogers <andrew@jarbox.org>
|
||||
Date: Sat Jun 6 16:12:06 2015 -0700
|
||||
|
||||
modified README
|
||||
Date: Tue Oct 23 09:49:53 2018 -0700
|
||||
|
@ -1,73 +1,24 @@
|
||||
// metrohash.h
|
||||
//
|
||||
// The MIT License (MIT)
|
||||
// Copyright 2015-2018 J. Andrew Rogers
|
||||
//
|
||||
// Copyright (c) 2015 J. Andrew Rogers
|
||||
// Licensed under the Apache License, Version 2.0 (the "License");
|
||||
// you may not use this file except in compliance with the License.
|
||||
// You may obtain a copy of the License at
|
||||
//
|
||||
// Permission is hereby granted, free of charge, to any person obtaining a copy
|
||||
// of this software and associated documentation files (the "Software"), to deal
|
||||
// in the Software without restriction, including without limitation the rights
|
||||
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
||||
// copies of the Software, and to permit persons to whom the Software is
|
||||
// furnished to do so, subject to the following conditions:
|
||||
//
|
||||
// The above copyright notice and this permission notice shall be included in all
|
||||
// copies or substantial portions of the Software.
|
||||
//
|
||||
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
||||
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
||||
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
||||
// SOFTWARE.
|
||||
// http://www.apache.org/licenses/LICENSE-2.0
|
||||
//
|
||||
// Unless required by applicable law or agreed to in writing, software
|
||||
// distributed under the License is distributed on an "AS IS" BASIS,
|
||||
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
// See the License for the specific language governing permissions and
|
||||
// limitations under the License.
|
||||
|
||||
#ifndef METROHASH_METROHASH_H
|
||||
#define METROHASH_METROHASH_H
|
||||
|
||||
#include <stdint.h>
|
||||
#include <string.h>
|
||||
|
||||
// MetroHash 64-bit hash functions
|
||||
void metrohash64_1(const uint8_t * key, uint64_t len, uint32_t seed, uint8_t * out);
|
||||
void metrohash64_2(const uint8_t * key, uint64_t len, uint32_t seed, uint8_t * out);
|
||||
|
||||
// MetroHash 128-bit hash functions
|
||||
void metrohash128_1(const uint8_t * key, uint64_t len, uint32_t seed, uint8_t * out);
|
||||
void metrohash128_2(const uint8_t * key, uint64_t len, uint32_t seed, uint8_t * out);
|
||||
|
||||
// MetroHash 128-bit hash functions using CRC instruction
|
||||
void metrohash128crc_1(const uint8_t * key, uint64_t len, uint32_t seed, uint8_t * out);
|
||||
void metrohash128crc_2(const uint8_t * key, uint64_t len, uint32_t seed, uint8_t * out);
|
||||
|
||||
|
||||
/* rotate right idiom recognized by compiler*/
|
||||
inline static uint64_t rotate_right(uint64_t v, unsigned k)
|
||||
{
|
||||
return (v >> k) | (v << (64 - k));
|
||||
}
|
||||
|
||||
// unaligned reads, fast and safe on Nehalem and later microarchitectures
|
||||
inline static uint64_t read_u64(const void * const ptr)
|
||||
{
|
||||
return static_cast<uint64_t>(*reinterpret_cast<const uint64_t*>(ptr));
|
||||
}
|
||||
|
||||
inline static uint64_t read_u32(const void * const ptr)
|
||||
{
|
||||
return static_cast<uint64_t>(*reinterpret_cast<const uint32_t*>(ptr));
|
||||
}
|
||||
|
||||
inline static uint64_t read_u16(const void * const ptr)
|
||||
{
|
||||
return static_cast<uint64_t>(*reinterpret_cast<const uint16_t*>(ptr));
|
||||
}
|
||||
|
||||
inline static uint64_t read_u8 (const void * const ptr)
|
||||
{
|
||||
return static_cast<uint64_t>(*reinterpret_cast<const uint8_t *>(ptr));
|
||||
}
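The helpers above read through `reinterpret_cast`, which assumes the platform tolerates unaligned loads, and `rotate_right` shifts by `64 - k`, which is undefined in C++ when `k` is 0. A minimal portable sketch follows, assuming one wants the same semantics on any target. The `_safe` names are hypothetical and are not part of MetroHash; mainstream compilers typically lower the fixed-size `memcpy` to a single load.

```cpp
#include <stdint.h>
#include <string.h>

// Hypothetical portable variants; names are illustrative only.

// Rotate right with the shift counts masked so that k == 0 (or k == 64)
// never produces a 64-bit shift, which would be undefined behavior.
inline uint64_t rotate_right_safe(uint64_t v, unsigned k)
{
    k &= 63;
    return (v >> k) | (v << ((64 - k) & 63));
}

// Alignment-safe and aliasing-safe loads via memcpy.
inline uint64_t read_u64_safe(const void * ptr)
{
    uint64_t v;
    memcpy(&v, ptr, sizeof(v));
    return v;
}

inline uint64_t read_u32_safe(const void * ptr)
{
    uint32_t v;
    memcpy(&v, ptr, sizeof(v));
    return v; // widened to 64 bits, like read_u32 above
}
```

The loaded values are still in host byte order, so hashes computed this way remain endian-dependent unless a conversion step is added.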
|
||||
|
||||
#include "metrohash64.h"
|
||||
#include "metrohash128.h"
|
||||
#include "metrohash128crc.h"
|
||||
|
||||
#endif // #ifndef METROHASH_METROHASH_H
|
||||
|
@ -1,29 +1,260 @@
|
||||
// metrohash128.cpp
|
||||
//
|
||||
// The MIT License (MIT)
|
||||
// Copyright 2015-2018 J. Andrew Rogers
|
||||
//
|
||||
// Copyright (c) 2015 J. Andrew Rogers
|
||||
// Licensed under the Apache License, Version 2.0 (the "License");
|
||||
// you may not use this file except in compliance with the License.
|
||||
// You may obtain a copy of the License at
|
||||
//
|
||||
// Permission is hereby granted, free of charge, to any person obtaining a copy
|
||||
// of this software and associated documentation files (the "Software"), to deal
|
||||
// in the Software without restriction, including without limitation the rights
|
||||
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
||||
// copies of the Software, and to permit persons to whom the Software is
|
||||
// furnished to do so, subject to the following conditions:
|
||||
//
|
||||
// The above copyright notice and this permission notice shall be included in all
|
||||
// copies or substantial portions of the Software.
|
||||
//
|
||||
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
||||
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
||||
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
||||
// SOFTWARE.
|
||||
// http://www.apache.org/licenses/LICENSE-2.0
|
||||
//
|
||||
// Unless required by applicable law or agreed to in writing, software
|
||||
// distributed under the License is distributed on an "AS IS" BASIS,
|
||||
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
// See the License for the specific language governing permissions and
|
||||
// limitations under the License.
|
||||
|
||||
#include <string.h>
|
||||
#include "platform.h"
|
||||
#include "metrohash128.h"
|
||||
|
||||
const char * MetroHash128::test_string = "012345678901234567890123456789012345678901234567890123456789012";
|
||||
|
||||
const uint8_t MetroHash128::test_seed_0[16] = {
|
||||
0xC7, 0x7C, 0xE2, 0xBF, 0xA4, 0xED, 0x9F, 0x9B,
|
||||
0x05, 0x48, 0xB2, 0xAC, 0x50, 0x74, 0xA2, 0x97
|
||||
};
|
||||
|
||||
const uint8_t MetroHash128::test_seed_1[16] = {
|
||||
0x45, 0xA3, 0xCD, 0xB8, 0x38, 0x19, 0x9D, 0x7F,
|
||||
0xBD, 0xD6, 0x8D, 0x86, 0x7A, 0x14, 0xEC, 0xEF
|
||||
};
|
||||
|
||||
|
||||
|
||||
MetroHash128::MetroHash128(const uint64_t seed)
|
||||
{
|
||||
Initialize(seed);
|
||||
}
|
||||
|
||||
|
||||
void MetroHash128::Initialize(const uint64_t seed)
|
||||
{
|
||||
// initialize internal hash registers
|
||||
state.v[0] = (static_cast<uint64_t>(seed) - k0) * k3;
|
||||
state.v[1] = (static_cast<uint64_t>(seed) + k1) * k2;
|
||||
state.v[2] = (static_cast<uint64_t>(seed) + k0) * k2;
|
||||
state.v[3] = (static_cast<uint64_t>(seed) - k1) * k3;
|
||||
|
||||
// initialize total length of input
|
||||
bytes = 0;
|
||||
}
|
||||
|
||||
|
||||
void MetroHash128::Update(const uint8_t * const buffer, const uint64_t length)
|
||||
{
|
||||
const uint8_t * ptr = reinterpret_cast<const uint8_t*>(buffer);
|
||||
const uint8_t * const end = ptr + length;
|
||||
|
||||
// input buffer may be partially filled
|
||||
if (bytes % 32)
|
||||
{
|
||||
uint64_t fill = 32 - (bytes % 32);
|
||||
if (fill > length)
|
||||
fill = length;
|
||||
|
||||
memcpy(input.b + (bytes % 32), ptr, static_cast<size_t>(fill));
|
||||
ptr += fill;
|
||||
bytes += fill;
|
||||
|
||||
// input buffer is still partially filled
|
||||
if ((bytes % 32) != 0) return;
|
||||
|
||||
// process full input buffer
|
||||
state.v[0] += read_u64(&input.b[ 0]) * k0; state.v[0] = rotate_right(state.v[0],29) + state.v[2];
|
||||
state.v[1] += read_u64(&input.b[ 8]) * k1; state.v[1] = rotate_right(state.v[1],29) + state.v[3];
|
||||
state.v[2] += read_u64(&input.b[16]) * k2; state.v[2] = rotate_right(state.v[2],29) + state.v[0];
|
||||
state.v[3] += read_u64(&input.b[24]) * k3; state.v[3] = rotate_right(state.v[3],29) + state.v[1];
|
||||
}
|
||||
|
||||
// bulk update
|
||||
bytes += (end - ptr);
|
||||
while (ptr <= (end - 32))
|
||||
{
|
||||
// process directly from the source, bypassing the input buffer
|
||||
state.v[0] += read_u64(ptr) * k0; ptr += 8; state.v[0] = rotate_right(state.v[0],29) + state.v[2];
|
||||
state.v[1] += read_u64(ptr) * k1; ptr += 8; state.v[1] = rotate_right(state.v[1],29) + state.v[3];
|
||||
state.v[2] += read_u64(ptr) * k2; ptr += 8; state.v[2] = rotate_right(state.v[2],29) + state.v[0];
|
||||
state.v[3] += read_u64(ptr) * k3; ptr += 8; state.v[3] = rotate_right(state.v[3],29) + state.v[1];
|
||||
}
|
||||
|
||||
// store remaining bytes in input buffer
|
||||
    if (ptr < end)
        memcpy(input.b, ptr, end - ptr);
}


void MetroHash128::Finalize(uint8_t * const hash)
{
    // finalize bulk loop, if used
    if (bytes >= 32)
    {
        state.v[2] ^= rotate_right(((state.v[0] + state.v[3]) * k0) + state.v[1], 21) * k1;
        state.v[3] ^= rotate_right(((state.v[1] + state.v[2]) * k1) + state.v[0], 21) * k0;
        state.v[0] ^= rotate_right(((state.v[0] + state.v[2]) * k0) + state.v[3], 21) * k1;
        state.v[1] ^= rotate_right(((state.v[1] + state.v[3]) * k1) + state.v[2], 21) * k0;
    }

    // process any bytes remaining in the input buffer
    const uint8_t * ptr = reinterpret_cast<const uint8_t*>(input.b);
    const uint8_t * const end = ptr + (bytes % 32);

    if ((end - ptr) >= 16)
    {
        state.v[0] += read_u64(ptr) * k2; ptr += 8; state.v[0] = rotate_right(state.v[0],33) * k3;
        state.v[1] += read_u64(ptr) * k2; ptr += 8; state.v[1] = rotate_right(state.v[1],33) * k3;
        state.v[0] ^= rotate_right((state.v[0] * k2) + state.v[1], 45) * k1;
        state.v[1] ^= rotate_right((state.v[1] * k3) + state.v[0], 45) * k0;
    }

    if ((end - ptr) >= 8)
    {
        state.v[0] += read_u64(ptr) * k2; ptr += 8; state.v[0] = rotate_right(state.v[0],33) * k3;
        state.v[0] ^= rotate_right((state.v[0] * k2) + state.v[1], 27) * k1;
    }

    if ((end - ptr) >= 4)
    {
        state.v[1] += read_u32(ptr) * k2; ptr += 4; state.v[1] = rotate_right(state.v[1],33) * k3;
        state.v[1] ^= rotate_right((state.v[1] * k3) + state.v[0], 46) * k0;
    }

    if ((end - ptr) >= 2)
    {
        state.v[0] += read_u16(ptr) * k2; ptr += 2; state.v[0] = rotate_right(state.v[0],33) * k3;
        state.v[0] ^= rotate_right((state.v[0] * k2) + state.v[1], 22) * k1;
    }

    if ((end - ptr) >= 1)
    {
        state.v[1] += read_u8 (ptr) * k2; state.v[1] = rotate_right(state.v[1],33) * k3;
        state.v[1] ^= rotate_right((state.v[1] * k3) + state.v[0], 58) * k0;
    }

    state.v[0] += rotate_right((state.v[0] * k0) + state.v[1], 13);
    state.v[1] += rotate_right((state.v[1] * k1) + state.v[0], 37);
    state.v[0] += rotate_right((state.v[0] * k2) + state.v[1], 13);
    state.v[1] += rotate_right((state.v[1] * k3) + state.v[0], 37);

    bytes = 0;

    // do any endian conversion here

    memcpy(hash, state.v, 16);
}

void MetroHash128::Hash(const uint8_t * buffer, const uint64_t length, uint8_t * const hash, const uint64_t seed)
{
    const uint8_t * ptr = reinterpret_cast<const uint8_t*>(buffer);
    const uint8_t * const end = ptr + length;

    uint64_t v[4];

    v[0] = (static_cast<uint64_t>(seed) - k0) * k3;
    v[1] = (static_cast<uint64_t>(seed) + k1) * k2;

    if (length >= 32)
    {
        v[2] = (static_cast<uint64_t>(seed) + k0) * k2;
        v[3] = (static_cast<uint64_t>(seed) - k1) * k3;

        do
        {
            v[0] += read_u64(ptr) * k0; ptr += 8; v[0] = rotate_right(v[0],29) + v[2];
            v[1] += read_u64(ptr) * k1; ptr += 8; v[1] = rotate_right(v[1],29) + v[3];
            v[2] += read_u64(ptr) * k2; ptr += 8; v[2] = rotate_right(v[2],29) + v[0];
            v[3] += read_u64(ptr) * k3; ptr += 8; v[3] = rotate_right(v[3],29) + v[1];
        }
        while (ptr <= (end - 32));

        v[2] ^= rotate_right(((v[0] + v[3]) * k0) + v[1], 21) * k1;
        v[3] ^= rotate_right(((v[1] + v[2]) * k1) + v[0], 21) * k0;
        v[0] ^= rotate_right(((v[0] + v[2]) * k0) + v[3], 21) * k1;
        v[1] ^= rotate_right(((v[1] + v[3]) * k1) + v[2], 21) * k0;
    }

    if ((end - ptr) >= 16)
    {
        v[0] += read_u64(ptr) * k2; ptr += 8; v[0] = rotate_right(v[0],33) * k3;
        v[1] += read_u64(ptr) * k2; ptr += 8; v[1] = rotate_right(v[1],33) * k3;
        v[0] ^= rotate_right((v[0] * k2) + v[1], 45) * k1;
        v[1] ^= rotate_right((v[1] * k3) + v[0], 45) * k0;
    }

    if ((end - ptr) >= 8)
    {
        v[0] += read_u64(ptr) * k2; ptr += 8; v[0] = rotate_right(v[0],33) * k3;
        v[0] ^= rotate_right((v[0] * k2) + v[1], 27) * k1;
    }

    if ((end - ptr) >= 4)
    {
        v[1] += read_u32(ptr) * k2; ptr += 4; v[1] = rotate_right(v[1],33) * k3;
        v[1] ^= rotate_right((v[1] * k3) + v[0], 46) * k0;
    }

    if ((end - ptr) >= 2)
    {
        v[0] += read_u16(ptr) * k2; ptr += 2; v[0] = rotate_right(v[0],33) * k3;
        v[0] ^= rotate_right((v[0] * k2) + v[1], 22) * k1;
    }

    if ((end - ptr) >= 1)
    {
        v[1] += read_u8 (ptr) * k2; v[1] = rotate_right(v[1],33) * k3;
        v[1] ^= rotate_right((v[1] * k3) + v[0], 58) * k0;
    }

    v[0] += rotate_right((v[0] * k0) + v[1], 13);
    v[1] += rotate_right((v[1] * k1) + v[0], 37);
    v[0] += rotate_right((v[0] * k2) + v[1], 13);
    v[1] += rotate_right((v[1] * k3) + v[0], 37);

    // do any endian conversion here

    memcpy(hash, v, 16);
}

bool MetroHash128::ImplementationVerified()
{
    uint8_t hash[16];
    const uint8_t * key = reinterpret_cast<const uint8_t *>(MetroHash128::test_string);

    // verify one-shot implementation
    MetroHash128::Hash(key, strlen(MetroHash128::test_string), hash, 0);
    if (memcmp(hash, MetroHash128::test_seed_0, 16) != 0) return false;

    MetroHash128::Hash(key, strlen(MetroHash128::test_string), hash, 1);
    if (memcmp(hash, MetroHash128::test_seed_1, 16) != 0) return false;

    // verify incremental implementation
    MetroHash128 metro;

    metro.Initialize(0);
    metro.Update(reinterpret_cast<const uint8_t *>(MetroHash128::test_string), strlen(MetroHash128::test_string));
    metro.Finalize(hash);
    if (memcmp(hash, MetroHash128::test_seed_0, 16) != 0) return false;

    metro.Initialize(1);
    metro.Update(reinterpret_cast<const uint8_t *>(MetroHash128::test_string), strlen(MetroHash128::test_string));
    metro.Finalize(hash);
    if (memcmp(hash, MetroHash128::test_seed_1, 16) != 0) return false;

    return true;
}

#include "metrohash.h"

void metrohash128_1(const uint8_t * key, uint64_t len, uint32_t seed, uint8_t * out)
{
@ -97,6 +328,8 @@ void metrohash128_1(const uint8_t * key, uint64_t len, uint32_t seed, uint8_t *
    v[0] += rotate_right((v[0] * k2) + v[1], 13);
    v[1] += rotate_right((v[1] * k3) + v[0], 37);

    // do any endian conversion here

    memcpy(out, v, 16);
}

@ -173,6 +406,8 @@ void metrohash128_2(const uint8_t * key, uint64_t len, uint32_t seed, uint8_t *
    v[0] += rotate_right((v[0] * k2) + v[1], 33);
    v[1] += rotate_right((v[1] * k3) + v[0], 33);

    // do any endian conversion here

    memcpy(out, v, 16);
}

72
contrib/libmetrohash/src/metrohash128.h
Normal file
@ -0,0 +1,72 @@
// metrohash128.h
//
// Copyright 2015-2018 J. Andrew Rogers
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//     http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

#ifndef METROHASH_METROHASH_128_H
#define METROHASH_METROHASH_128_H

#include <stdint.h>

class MetroHash128
{
public:
    static const uint32_t bits = 128;

    // Constructor initializes the same as Initialize()
    MetroHash128(const uint64_t seed=0);

    // Initializes internal state for new hash with optional seed
    void Initialize(const uint64_t seed=0);

    // Update the hash state with a string of bytes. If the length
    // is sufficiently long, the implementation switches to a bulk
    // hashing algorithm directly on the argument buffer for speed.
    void Update(const uint8_t * buffer, const uint64_t length);

    // Constructs the final hash and writes it to the argument buffer.
    // After a hash is finalized, this instance must be Initialized()-ed
    // again or the behavior of Update() and Finalize() is undefined.
    void Finalize(uint8_t * const hash);

    // A non-incremental function implementation. This can be significantly
    // faster than the incremental implementation for some usage patterns.
    static void Hash(const uint8_t * buffer, const uint64_t length, uint8_t * const hash, const uint64_t seed=0);

    // Does implementation correctly execute test vectors?
    static bool ImplementationVerified();

    // test vectors -- Hash(test_string, seed=0) => test_seed_0
    static const char * test_string;
    static const uint8_t test_seed_0[16];
    static const uint8_t test_seed_1[16];

private:
    static const uint64_t k0 = 0xC83A91E1;
    static const uint64_t k1 = 0x8648DBDB;
    static const uint64_t k2 = 0x7BDEC03B;
    static const uint64_t k3 = 0x2F5870A5;

    struct { uint64_t v[4]; } state;
    struct { uint8_t b[32]; } input;
    uint64_t bytes;
};


// Legacy 128-bit hash functions -- do not use
void metrohash128_1(const uint8_t * key, uint64_t len, uint32_t seed, uint8_t * out);
void metrohash128_2(const uint8_t * key, uint64_t len, uint32_t seed, uint8_t * out);


#endif // #ifndef METROHASH_METROHASH_128_H
@ -1,31 +1,24 @@
// metrohash128crc.cpp
//
// The MIT License (MIT)
// Copyright 2015-2018 J. Andrew Rogers
//
// Copyright (c) 2015 J. Andrew Rogers
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in all
// copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
// SOFTWARE.
//     http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.


#include "metrohash.h"
#include <nmmintrin.h>
#include <string.h>
#include "metrohash.h"
#include "platform.h"


void metrohash128crc_1(const uint8_t * key, uint64_t len, uint32_t seed, uint8_t * out)
27
contrib/libmetrohash/src/metrohash128crc.h
Normal file
@ -0,0 +1,27 @@
// metrohash128crc.h
//
// Copyright 2015-2018 J. Andrew Rogers
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//     http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

#ifndef METROHASH_METROHASH_128_CRC_H
#define METROHASH_METROHASH_128_CRC_H

#include <stdint.h>

// Legacy 128-bit hash functions
void metrohash128crc_1(const uint8_t * key, uint64_t len, uint32_t seed, uint8_t * out);
void metrohash128crc_2(const uint8_t * key, uint64_t len, uint32_t seed, uint8_t * out);


#endif // #ifndef METROHASH_METROHASH_128_CRC_H
@ -1,29 +1,257 @@
// metrohash64.cpp
//
// The MIT License (MIT)
// Copyright 2015-2018 J. Andrew Rogers
//
// Copyright (c) 2015 J. Andrew Rogers
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in all
// copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
// SOFTWARE.
//     http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

#include "platform.h"
#include "metrohash64.h"

#include <cstring>

const char * MetroHash64::test_string = "012345678901234567890123456789012345678901234567890123456789012";

const uint8_t MetroHash64::test_seed_0[8] = { 0x6B, 0x75, 0x3D, 0xAE, 0x06, 0x70, 0x4B, 0xAD };
const uint8_t MetroHash64::test_seed_1[8] = { 0x3B, 0x0D, 0x48, 0x1C, 0xF4, 0xB9, 0xB8, 0xDF };



MetroHash64::MetroHash64(const uint64_t seed)
{
    Initialize(seed);
}


void MetroHash64::Initialize(const uint64_t seed)
{
    vseed = (static_cast<uint64_t>(seed) + k2) * k0;

    // initialize internal hash registers
    state.v[0] = vseed;
    state.v[1] = vseed;
    state.v[2] = vseed;
    state.v[3] = vseed;

    // initialize total length of input
    bytes = 0;
}

void MetroHash64::Update(const uint8_t * const buffer, const uint64_t length)
{
    const uint8_t * ptr = reinterpret_cast<const uint8_t*>(buffer);
    const uint8_t * const end = ptr + length;

    // input buffer may be partially filled
    if (bytes % 32)
    {
        uint64_t fill = 32 - (bytes % 32);
        if (fill > length)
            fill = length;

        memcpy(input.b + (bytes % 32), ptr, static_cast<size_t>(fill));
        ptr += fill;
        bytes += fill;

        // input buffer is still partially filled
        if ((bytes % 32) != 0) return;

        // process full input buffer
        state.v[0] += read_u64(&input.b[ 0]) * k0; state.v[0] = rotate_right(state.v[0],29) + state.v[2];
        state.v[1] += read_u64(&input.b[ 8]) * k1; state.v[1] = rotate_right(state.v[1],29) + state.v[3];
        state.v[2] += read_u64(&input.b[16]) * k2; state.v[2] = rotate_right(state.v[2],29) + state.v[0];
        state.v[3] += read_u64(&input.b[24]) * k3; state.v[3] = rotate_right(state.v[3],29) + state.v[1];
    }

    // bulk update
    bytes += static_cast<uint64_t>(end - ptr);
    while (ptr <= (end - 32))
    {
        // process directly from the source, bypassing the input buffer
        state.v[0] += read_u64(ptr) * k0; ptr += 8; state.v[0] = rotate_right(state.v[0],29) + state.v[2];
        state.v[1] += read_u64(ptr) * k1; ptr += 8; state.v[1] = rotate_right(state.v[1],29) + state.v[3];
        state.v[2] += read_u64(ptr) * k2; ptr += 8; state.v[2] = rotate_right(state.v[2],29) + state.v[0];
        state.v[3] += read_u64(ptr) * k3; ptr += 8; state.v[3] = rotate_right(state.v[3],29) + state.v[1];
    }

    // store remaining bytes in input buffer
    if (ptr < end)
        memcpy(input.b, ptr, static_cast<size_t>(end - ptr));
}

void MetroHash64::Finalize(uint8_t * const hash)
{
    // finalize bulk loop, if used
    if (bytes >= 32)
    {
        state.v[2] ^= rotate_right(((state.v[0] + state.v[3]) * k0) + state.v[1], 37) * k1;
        state.v[3] ^= rotate_right(((state.v[1] + state.v[2]) * k1) + state.v[0], 37) * k0;
        state.v[0] ^= rotate_right(((state.v[0] + state.v[2]) * k0) + state.v[3], 37) * k1;
        state.v[1] ^= rotate_right(((state.v[1] + state.v[3]) * k1) + state.v[2], 37) * k0;

        state.v[0] = vseed + (state.v[0] ^ state.v[1]);
    }

    // process any bytes remaining in the input buffer
    const uint8_t * ptr = reinterpret_cast<const uint8_t*>(input.b);
    const uint8_t * const end = ptr + (bytes % 32);

    if ((end - ptr) >= 16)
    {
        state.v[1] = state.v[0] + (read_u64(ptr) * k2); ptr += 8; state.v[1] = rotate_right(state.v[1],29) * k3;
        state.v[2] = state.v[0] + (read_u64(ptr) * k2); ptr += 8; state.v[2] = rotate_right(state.v[2],29) * k3;
        state.v[1] ^= rotate_right(state.v[1] * k0, 21) + state.v[2];
        state.v[2] ^= rotate_right(state.v[2] * k3, 21) + state.v[1];
        state.v[0] += state.v[2];
    }

    if ((end - ptr) >= 8)
    {
        state.v[0] += read_u64(ptr) * k3; ptr += 8;
        state.v[0] ^= rotate_right(state.v[0], 55) * k1;
    }

    if ((end - ptr) >= 4)
    {
        state.v[0] += read_u32(ptr) * k3; ptr += 4;
        state.v[0] ^= rotate_right(state.v[0], 26) * k1;
    }

    if ((end - ptr) >= 2)
    {
        state.v[0] += read_u16(ptr) * k3; ptr += 2;
        state.v[0] ^= rotate_right(state.v[0], 48) * k1;
    }

    if ((end - ptr) >= 1)
    {
        state.v[0] += read_u8 (ptr) * k3;
        state.v[0] ^= rotate_right(state.v[0], 37) * k1;
    }

    state.v[0] ^= rotate_right(state.v[0], 28);
    state.v[0] *= k0;
    state.v[0] ^= rotate_right(state.v[0], 29);

    bytes = 0;

    // do any endian conversion here

    memcpy(hash, state.v, 8);
}

void MetroHash64::Hash(const uint8_t * buffer, const uint64_t length, uint8_t * const hash, const uint64_t seed)
{
    const uint8_t * ptr = reinterpret_cast<const uint8_t*>(buffer);
    const uint8_t * const end = ptr + length;

    uint64_t h = (static_cast<uint64_t>(seed) + k2) * k0;

    if (length >= 32)
    {
        uint64_t v[4];
        v[0] = h;
        v[1] = h;
        v[2] = h;
        v[3] = h;

        do
        {
            v[0] += read_u64(ptr) * k0; ptr += 8; v[0] = rotate_right(v[0],29) + v[2];
            v[1] += read_u64(ptr) * k1; ptr += 8; v[1] = rotate_right(v[1],29) + v[3];
            v[2] += read_u64(ptr) * k2; ptr += 8; v[2] = rotate_right(v[2],29) + v[0];
            v[3] += read_u64(ptr) * k3; ptr += 8; v[3] = rotate_right(v[3],29) + v[1];
        }
        while (ptr <= (end - 32));

        v[2] ^= rotate_right(((v[0] + v[3]) * k0) + v[1], 37) * k1;
        v[3] ^= rotate_right(((v[1] + v[2]) * k1) + v[0], 37) * k0;
        v[0] ^= rotate_right(((v[0] + v[2]) * k0) + v[3], 37) * k1;
        v[1] ^= rotate_right(((v[1] + v[3]) * k1) + v[2], 37) * k0;
        h += v[0] ^ v[1];
    }

    if ((end - ptr) >= 16)
    {
        uint64_t v0 = h + (read_u64(ptr) * k2); ptr += 8; v0 = rotate_right(v0,29) * k3;
        uint64_t v1 = h + (read_u64(ptr) * k2); ptr += 8; v1 = rotate_right(v1,29) * k3;
        v0 ^= rotate_right(v0 * k0, 21) + v1;
        v1 ^= rotate_right(v1 * k3, 21) + v0;
        h += v1;
    }

    if ((end - ptr) >= 8)
    {
        h += read_u64(ptr) * k3; ptr += 8;
        h ^= rotate_right(h, 55) * k1;
    }

    if ((end - ptr) >= 4)
    {
        h += read_u32(ptr) * k3; ptr += 4;
        h ^= rotate_right(h, 26) * k1;
    }

    if ((end - ptr) >= 2)
    {
        h += read_u16(ptr) * k3; ptr += 2;
        h ^= rotate_right(h, 48) * k1;
    }

    if ((end - ptr) >= 1)
    {
        h += read_u8 (ptr) * k3;
        h ^= rotate_right(h, 37) * k1;
    }

    h ^= rotate_right(h, 28);
    h *= k0;
    h ^= rotate_right(h, 29);

    memcpy(hash, &h, 8);
}

bool MetroHash64::ImplementationVerified()
{
    uint8_t hash[8];
    const uint8_t * key = reinterpret_cast<const uint8_t *>(MetroHash64::test_string);

    // verify one-shot implementation
    MetroHash64::Hash(key, strlen(MetroHash64::test_string), hash, 0);
    if (memcmp(hash, MetroHash64::test_seed_0, 8) != 0) return false;

    MetroHash64::Hash(key, strlen(MetroHash64::test_string), hash, 1);
    if (memcmp(hash, MetroHash64::test_seed_1, 8) != 0) return false;

    // verify incremental implementation
    MetroHash64 metro;

    metro.Initialize(0);
    metro.Update(reinterpret_cast<const uint8_t *>(MetroHash64::test_string), strlen(MetroHash64::test_string));
    metro.Finalize(hash);
    if (memcmp(hash, MetroHash64::test_seed_0, 8) != 0) return false;

    metro.Initialize(1);
    metro.Update(reinterpret_cast<const uint8_t *>(MetroHash64::test_string), strlen(MetroHash64::test_string));
    metro.Finalize(hash);
    if (memcmp(hash, MetroHash64::test_seed_1, 8) != 0) return false;

    return true;
}

#include "metrohash.h"

void metrohash64_1(const uint8_t * key, uint64_t len, uint32_t seed, uint8_t * out)
{
73
contrib/libmetrohash/src/metrohash64.h
Normal file
@ -0,0 +1,73 @@
// metrohash64.h
//
// Copyright 2015-2018 J. Andrew Rogers
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//     http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

#ifndef METROHASH_METROHASH_64_H
#define METROHASH_METROHASH_64_H

#include <stdint.h>

class MetroHash64
{
public:
    static const uint32_t bits = 64;

    // Constructor initializes the same as Initialize()
    MetroHash64(const uint64_t seed=0);

    // Initializes internal state for new hash with optional seed
    void Initialize(const uint64_t seed=0);

    // Update the hash state with a string of bytes. If the length
    // is sufficiently long, the implementation switches to a bulk
    // hashing algorithm directly on the argument buffer for speed.
    void Update(const uint8_t * buffer, const uint64_t length);

    // Constructs the final hash and writes it to the argument buffer.
    // After a hash is finalized, this instance must be Initialized()-ed
    // again or the behavior of Update() and Finalize() is undefined.
    void Finalize(uint8_t * const hash);

    // A non-incremental function implementation. This can be significantly
    // faster than the incremental implementation for some usage patterns.
    static void Hash(const uint8_t * buffer, const uint64_t length, uint8_t * const hash, const uint64_t seed=0);

    // Does implementation correctly execute test vectors?
    static bool ImplementationVerified();

    // test vectors -- Hash(test_string, seed=0) => test_seed_0
    static const char * test_string;
    static const uint8_t test_seed_0[8];
    static const uint8_t test_seed_1[8];

private:
    static const uint64_t k0 = 0xD6D018F5;
    static const uint64_t k1 = 0xA2AA033B;
    static const uint64_t k2 = 0x62992FC1;
    static const uint64_t k3 = 0x30BC5B29;

    struct { uint64_t v[4]; } state;
    struct { uint8_t b[32]; } input;
    uint64_t bytes;
    uint64_t vseed;
};


// Legacy 64-bit hash functions -- do not use
void metrohash64_1(const uint8_t * key, uint64_t len, uint32_t seed, uint8_t * out);
void metrohash64_2(const uint8_t * key, uint64_t len, uint32_t seed, uint8_t * out);


#endif // #ifndef METROHASH_METROHASH_64_H
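
The `Update()` comment in this header describes two paths: short inputs are staged in a 32-byte input buffer, while sufficiently long inputs are hashed in bulk straight from the argument buffer. The following is a minimal sketch of that buffering scheme only, under loud assumptions: `ToyBlockHasher`, `Mix` and `chunked_matches_one_shot` are hypothetical names, and the mixing step is FNV-1a-style rather than MetroHash. It is meant to show why feeding the same bytes in arbitrary pieces can give the same result as a single call.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical toy hasher. It is NOT MetroHash: only the 32-byte buffering
// logic mirrors the Update()/Finalize() pair; the mixing is FNV-1a-style.
class ToyBlockHasher
{
public:
    ToyBlockHasher() { Initialize(); }

    void Initialize()
    {
        state = 0xCBF29CE484222325ULL;  // FNV-1a 64-bit offset basis
        bytes = 0;
    }

    void Update(const uint8_t * buffer, uint64_t length)
    {
        const uint8_t * ptr = buffer;
        const uint8_t * const end = ptr + length;

        // a previous call may have left a partial block in the input buffer
        if (bytes % 32)
        {
            uint64_t fill = 32 - (bytes % 32);
            if (fill > length)
                fill = length;

            std::memcpy(input + (bytes % 32), ptr, static_cast<size_t>(fill));
            ptr += fill;
            bytes += fill;

            // still partial: wait for more input
            if ((bytes % 32) != 0) return;

            Mix(input, 32);
        }

        // bulk path: whole blocks are consumed directly from the source
        bytes += static_cast<uint64_t>(end - ptr);
        while ((end - ptr) >= 32)
        {
            Mix(ptr, 32);
            ptr += 32;
        }

        // keep the tail for the next Update() or for Finalize()
        if (ptr < end)
            std::memcpy(input, ptr, static_cast<size_t>(end - ptr));
    }

    uint64_t Finalize()
    {
        Mix(input, static_cast<size_t>(bytes % 32));
        const uint64_t result = state ^ bytes;
        Initialize();
        return result;
    }

private:
    void Mix(const uint8_t * p, size_t n)
    {
        for (size_t i = 0; i < n; ++i)
            state = (state ^ p[i]) * 0x100000001B3ULL;  // FNV-1a 64-bit prime
    }

    uint64_t state;
    uint8_t input[32];
    uint64_t bytes;
};

// Feeds 100 bytes in pieces of `chunk` bytes and compares with a single call.
bool chunked_matches_one_shot(uint64_t chunk)
{
    if (chunk == 0) return false;  // a zero-sized chunk would never advance

    uint8_t data[100];
    for (size_t i = 0; i < sizeof(data); ++i)
        data[i] = static_cast<uint8_t>(i * 7 + 1);

    ToyBlockHasher one_shot;
    one_shot.Update(data, sizeof(data));

    ToyBlockHasher chunked;
    for (uint64_t off = 0; off < sizeof(data); off += chunk)
    {
        const uint64_t left = sizeof(data) - off;
        chunked.Update(data + off, left < chunk ? left : chunk);
    }

    return one_shot.Finalize() == chunked.Finalize();
}
```

The sketch uses `(end - ptr) >= 32` as the loop condition instead of forming `end - 32`, which keeps the pointer arithmetic inside the caller's buffer even for very short inputs.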
50
contrib/libmetrohash/src/platform.h
Normal file
@ -0,0 +1,50 @@
// platform.h
//
// Copyright 2015-2018 J. Andrew Rogers
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//     http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

#ifndef METROHASH_PLATFORM_H
#define METROHASH_PLATFORM_H

#include <stdint.h>

// rotate right idiom recognized by most compilers
inline static uint64_t rotate_right(uint64_t v, unsigned k)
{
    return (v >> k) | (v << (64 - k));
}

// unaligned reads, fast and safe on Nehalem and later microarchitectures
inline static uint64_t read_u64(const void * const ptr)
{
    return static_cast<uint64_t>(*reinterpret_cast<const uint64_t*>(ptr));
}

inline static uint64_t read_u32(const void * const ptr)
{
    return static_cast<uint64_t>(*reinterpret_cast<const uint32_t*>(ptr));
}

inline static uint64_t read_u16(const void * const ptr)
{
    return static_cast<uint64_t>(*reinterpret_cast<const uint16_t*>(ptr));
}

inline static uint64_t read_u8 (const void * const ptr)
{
    return static_cast<uint64_t>(*reinterpret_cast<const uint8_t *>(ptr));
}


#endif // #ifndef METROHASH_PLATFORM_H
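
The helpers in this header read through a cast pointer, which the header's own comment ties to Nehalem and later microarchitectures, and `rotate_right` would shift by 64 (undefined in C++) if it were ever called with `k == 0`, although no call in this diff passes 0. As a hedged sketch only, the variants below show one common way to get the same results without those assumptions. The `_portable` names are hypothetical and are not part of the library.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical portable counterparts of the platform.h helpers; the
// "_portable" names are illustrative and not part of the library.

// Rotate right with the count reduced modulo 64, so k == 0 never shifts by 64.
inline uint64_t rotate_right_portable(uint64_t v, unsigned k)
{
    k &= 63;
    return (v >> k) | (v << ((64 - k) & 63));
}

// memcpy-based loads are valid for any alignment; compilers typically lower
// them to a single load on targets that allow unaligned access.
inline uint64_t read_u64_portable(const void * ptr)
{
    uint64_t v;
    std::memcpy(&v, ptr, sizeof(v));
    return v;
}

inline uint64_t read_u32_portable(const void * ptr)
{
    uint32_t v;
    std::memcpy(&v, ptr, sizeof(v));
    return v;
}

inline uint64_t read_u16_portable(const void * ptr)
{
    uint16_t v;
    std::memcpy(&v, ptr, sizeof(v));
    return v;
}
```

Like the originals, these loads return the value in host byte order, so they do not by themselves make the hash output identical across little-endian and big-endian machines.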
@ -1,27 +1,18 @@
// testvector.h
//
// The MIT License (MIT)
// Copyright 2015-2018 J. Andrew Rogers
//
// Copyright (c) 2015 J. Andrew Rogers
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in all
// copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
// SOFTWARE.
//     http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

#ifndef METROHASH_TESTVECTOR_H
#define METROHASH_TESTVECTOR_H
@ -46,6 +37,8 @@ struct TestVectorData

static const char * test_key_63 = "012345678901234567890123456789012345678901234567890123456789012";

// The hash assumes a little-endian architecture. Treating the hash results
// as an array of uint64_t should enable conversion for big-endian implementations.
const TestVectorData TestVector [] =
{
// seed = 0
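
The comment added in this hunk says the hash assumes a little-endian architecture and that treating the result as an array of `uint64_t` should enable conversion on big-endian machines. One way that conversion could look, as a sketch under stated assumptions, is to replace the final `memcpy(hash, v, 16)` with explicit little-endian byte stores. `store_u64_le`, `write_hash_le` and `encoded_byte` are hypothetical names and are not part of the library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: writes v least-significant byte first, regardless of
// the byte order of the host.
inline void store_u64_le(uint8_t * out, uint64_t v)
{
    for (int i = 0; i < 8; ++i)
        out[i] = static_cast<uint8_t>(v >> (8 * i));
}

// Hypothetical stand-in for the final memcpy(hash, v, 16): emits each 64-bit
// state word in little-endian order.
inline void write_hash_le(uint8_t * hash, const uint64_t * v, size_t words)
{
    for (size_t i = 0; i < words; ++i)
        store_u64_le(hash + 8 * i, v[i]);
}

// Illustration only: encodes two words and returns byte `index` (0..15).
inline uint8_t encoded_byte(uint64_t v0, uint64_t v1, size_t index)
{
    const uint64_t v[2] = { v0, v1 };
    uint8_t hash[16];
    write_hash_le(hash, v, 2);
    return hash[index];
}
```

This covers only the output side. On a big-endian host the `read_u64`-style loads would need the same treatment before the digest could match the test vectors.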
@ -2,6 +2,7 @@ set(RDKAFKA_SOURCE_DIR ${CMAKE_SOURCE_DIR}/contrib/librdkafka/src)

set(SRCS
  ${RDKAFKA_SOURCE_DIR}/crc32c.c
  ${RDKAFKA_SOURCE_DIR}/rdkafka_zstd.c
  ${RDKAFKA_SOURCE_DIR}/rdaddr.c
  ${RDKAFKA_SOURCE_DIR}/rdavl.c
  ${RDKAFKA_SOURCE_DIR}/rdbuf.c
@ -59,5 +60,6 @@ set(SRCS

add_library(rdkafka ${LINK_MODE} ${SRCS})
target_include_directories(rdkafka SYSTEM PUBLIC include)
target_include_directories(rdkafka SYSTEM PUBLIC ${RDKAFKA_SOURCE_DIR})
target_link_libraries(rdkafka PUBLIC ${ZLIB_LIBRARIES} ${OPENSSL_SSL_LIBRARY} ${OPENSSL_CRYPTO_LIBRARY})
target_include_directories(rdkafka SYSTEM PUBLIC ${RDKAFKA_SOURCE_DIR}) # Because weird logic with "include_next" is used.
target_include_directories(rdkafka SYSTEM PRIVATE ${ZSTD_INCLUDE_DIR}/common) # Because wrong path to "zstd_errors.h" is used.
target_link_libraries(rdkafka PUBLIC ${ZLIB_LIBRARIES} ${ZSTD_LIBRARY} ${OPENSSL_SSL_LIBRARY} ${OPENSSL_CRYPTO_LIBRARY})
@ -51,6 +51,8 @@
//#define WITH_PLUGINS 1
// zlib
#define WITH_ZLIB 1
// zstd
#define WITH_ZSTD 1
// WITH_SNAPPY
#define WITH_SNAPPY 1
// WITH_SOCKEM
@ -200,11 +200,18 @@ target_link_libraries (clickhouse_common_io
|
||||
${Boost_SYSTEM_LIBRARY}
|
||||
PRIVATE
|
||||
apple_rt
|
||||
PUBLIC
|
||||
Threads::Threads
|
||||
PRIVATE
|
||||
${CMAKE_DL_LIBS}
|
||||
)
|
||||
|
||||
if (NOT ARCH_ARM AND CPUID_LIBRARY)
|
||||
target_link_libraries (clickhouse_common_io PRIVATE ${CPUID_LIBRARY})
|
||||
if(CPUID_LIBRARY)
|
||||
target_link_libraries(clickhouse_common_io PRIVATE ${CPUID_LIBRARY})
|
||||
endif()
|
||||
|
||||
if(CPUINFO_LIBRARY)
|
||||
target_link_libraries(clickhouse_common_io PRIVATE ${CPUINFO_LIBRARY})
|
||||
endif()
|
||||
|
||||
target_link_libraries (dbms
|
||||
@ -225,6 +232,7 @@ target_link_libraries (dbms
|
||||
${Boost_PROGRAM_OPTIONS_LIBRARY}
|
||||
PUBLIC
|
||||
${Boost_SYSTEM_LIBRARY}
|
||||
Threads::Threads
|
||||
)
|
||||
|
||||
if (NOT USE_INTERNAL_RE2_LIBRARY)
|
||||
@ -298,6 +306,11 @@ target_link_libraries(dbms PRIVATE ${OPENSSL_CRYPTO_LIBRARY} Threads::Threads)
|
||||
target_include_directories (dbms SYSTEM BEFORE PRIVATE ${DIVIDE_INCLUDE_DIR})
|
||||
target_include_directories (dbms SYSTEM BEFORE PRIVATE ${SPARCEHASH_INCLUDE_DIR})
|
||||
|
||||
if (USE_PROTOBUF)
|
||||
target_link_libraries (dbms PRIVATE ${Protobuf_LIBRARY})
|
||||
target_include_directories (dbms SYSTEM BEFORE PRIVATE ${Protobuf_INCLUDE_DIR})
|
||||
endif ()
|
||||
|
||||
if (USE_HDFS)
|
||||
target_link_libraries (clickhouse_common_io PRIVATE ${HDFS3_LIBRARY})
|
||||
target_include_directories (clickhouse_common_io SYSTEM BEFORE PRIVATE ${HDFS3_INCLUDE_DIR})
|
||||
@ -318,7 +331,7 @@ target_include_directories (clickhouse_common_io BEFORE PRIVATE ${COMMON_INCLUDE
|
||||
add_subdirectory (programs)
|
||||
add_subdirectory (tests)
|
||||
|
||||
if (ENABLE_TESTS)
|
||||
if (ENABLE_TESTS AND USE_GTEST)
|
||||
macro (grep_gtest_sources BASE_DIR DST_VAR)
|
||||
# Could match files that are not in tests/ directories
|
||||
file(GLOB_RECURSE "${DST_VAR}" RELATIVE "${BASE_DIR}" "gtest*.cpp")
|
||||
|
@ -11,7 +11,7 @@
|
||||
#include <Poco/File.h>
|
||||
#include <Poco/Util/Application.h>
|
||||
#include <Common/Stopwatch.h>
|
||||
#include <common/ThreadPool.h>
|
||||
#include <Common/ThreadPool.h>
|
||||
#include <AggregateFunctions/ReservoirSampler.h>
|
||||
#include <AggregateFunctions/registerAggregateFunctions.h>
|
||||
#include <boost/program_options.hpp>
|
||||
|
@ -5,4 +5,5 @@ target_include_directories (clickhouse-benchmark-lib SYSTEM PRIVATE ${PCG_RANDOM
|
||||
if (CLICKHOUSE_SPLIT_BINARY)
|
||||
add_executable (clickhouse-benchmark clickhouse-benchmark.cpp)
|
||||
target_link_libraries (clickhouse-benchmark PRIVATE clickhouse-benchmark-lib clickhouse_aggregate_functions)
|
||||
install (TARGETS clickhouse-benchmark ${CLICKHOUSE_ALL_TARGETS} RUNTIME DESTINATION ${CMAKE_INSTALL_BINDIR} COMPONENT clickhouse)
|
||||
endif ()
|
||||
|
@ -7,6 +7,7 @@ endif ()
|
||||
if (CLICKHOUSE_SPLIT_BINARY)
|
||||
add_executable (clickhouse-client clickhouse-client.cpp)
|
||||
target_link_libraries (clickhouse-client PRIVATE clickhouse-client-lib)
|
||||
install (TARGETS clickhouse-client ${CLICKHOUSE_ALL_TARGETS} RUNTIME DESTINATION ${CMAKE_INSTALL_BINDIR} COMPONENT clickhouse)
|
||||
endif ()
|
||||
|
||||
install (FILES clickhouse-client.xml DESTINATION ${CLICKHOUSE_ETC_DIR}/clickhouse-client COMPONENT clickhouse-client RENAME config.xml)
|
||||
|
@ -56,6 +56,7 @@
|
||||
#include <Parsers/formatAST.h>
|
||||
#include <Parsers/parseQuery.h>
|
||||
#include <Interpreters/Context.h>
|
||||
#include <Interpreters/InterpreterSetQuery.h>
|
||||
#include <Client/Connection.h>
|
||||
#include <Common/InterruptListener.h>
|
||||
#include <Functions/registerFunctions.h>
|
||||
@ -219,6 +220,9 @@ private:
|
||||
APPLY_FOR_SETTINGS(EXTRACT_SETTING)
|
||||
#undef EXTRACT_SETTING
|
||||
|
||||
/// Set path for format schema files
|
||||
if (config().has("format_schema_path"))
|
||||
context.setFormatSchemaPath(Poco::Path(config().getString("format_schema_path")).toString());
|
||||
}
|
||||
|
||||
|
||||
@ -1206,6 +1210,10 @@ private:
|
||||
const auto & id = typeid_cast<const ASTIdentifier &>(*query_with_output->format);
|
||||
current_format = id.name;
|
||||
}
|
||||
if (query_with_output->settings_ast)
|
||||
{
|
||||
InterpreterSetQuery(query_with_output->settings_ast, context).executeForCurrentContext();
|
||||
}
|
||||
}
|
||||
|
||||
if (has_vertical_output_suffix)
|
||||
|
@ -5,4 +5,5 @@ if (CLICKHOUSE_SPLIT_BINARY)
|
||||
# Also in utils
|
||||
add_executable (clickhouse-compressor clickhouse-compressor.cpp)
|
||||
target_link_libraries (clickhouse-compressor PRIVATE clickhouse-compressor-lib)
|
||||
install (TARGETS clickhouse-compressor ${CLICKHOUSE_ALL_TARGETS} RUNTIME DESTINATION ${CMAKE_INSTALL_BINDIR} COMPONENT clickhouse)
|
||||
endif ()
|
||||
|
@ -4,4 +4,5 @@ target_link_libraries (clickhouse-copier-lib PRIVATE clickhouse-server-lib click
|
||||
if (CLICKHOUSE_SPLIT_BINARY)
|
||||
add_executable (clickhouse-copier clickhouse-copier.cpp)
|
||||
target_link_libraries (clickhouse-copier clickhouse-copier-lib)
|
||||
install (TARGETS clickhouse-copier ${CLICKHOUSE_ALL_TARGETS} RUNTIME DESTINATION ${CMAKE_INSTALL_BINDIR} COMPONENT clickhouse)
|
||||
endif ()
|
||||
|
@ -18,7 +18,7 @@
|
||||
#include <pcg_random.hpp>
|
||||
|
||||
#include <common/logger_useful.h>
|
||||
#include <common/ThreadPool.h>
|
||||
#include <Common/ThreadPool.h>
|
||||
#include <daemon/OwnPatternFormatter.h>
|
||||
|
||||
#include <Common/Exception.h>
|
||||
|
@ -4,4 +4,5 @@ target_link_libraries (clickhouse-extract-from-config-lib PRIVATE clickhouse_com
|
||||
if (CLICKHOUSE_SPLIT_BINARY)
|
||||
add_executable (clickhouse-extract-from-config clickhouse-extract-from-config.cpp)
|
||||
target_link_libraries (clickhouse-extract-from-config PRIVATE clickhouse-extract-from-config-lib)
|
||||
install (TARGETS clickhouse-extract-from-config ${CLICKHOUSE_ALL_TARGETS} RUNTIME DESTINATION ${CMAKE_INSTALL_BINDIR} COMPONENT clickhouse)
|
||||
endif ()
|
||||
|
@ -3,4 +3,5 @@ target_link_libraries (clickhouse-format-lib PRIVATE dbms clickhouse_common_io c
|
||||
if (CLICKHOUSE_SPLIT_BINARY)
|
||||
add_executable (clickhouse-format clickhouse-format.cpp)
|
||||
target_link_libraries (clickhouse-format PRIVATE clickhouse-format-lib)
|
||||
install (TARGETS clickhouse-format ${CLICKHOUSE_ALL_TARGETS} RUNTIME DESTINATION ${CMAKE_INSTALL_BINDIR} COMPONENT clickhouse)
|
||||
endif ()
|
||||
|
@ -4,4 +4,5 @@ target_link_libraries (clickhouse-local-lib PRIVATE clickhouse_common_io clickho
|
||||
if (CLICKHOUSE_SPLIT_BINARY)
|
||||
add_executable (clickhouse-local clickhouse-local.cpp)
|
||||
target_link_libraries (clickhouse-local PRIVATE clickhouse-local-lib)
|
||||
install (TARGETS clickhouse-local ${CLICKHOUSE_ALL_TARGETS} RUNTIME DESTINATION ${CMAKE_INSTALL_BINDIR} COMPONENT clickhouse)
|
||||
endif ()
|
||||
|
@ -17,6 +17,7 @@
|
||||
#include <Common/Config/ConfigProcessor.h>
|
||||
#include <Common/escapeForFileName.h>
|
||||
#include <Common/ClickHouseRevision.h>
|
||||
#include <Common/ThreadStatus.h>
|
||||
#include <Common/config_version.h>
|
||||
#include <IO/ReadBufferFromString.h>
|
||||
#include <IO/WriteBufferFromString.h>
|
||||
@ -102,7 +103,7 @@ int LocalServer::main(const std::vector<std::string> & /*args*/)
|
||||
try
|
||||
{
|
||||
Logger * log = &logger();
|
||||
|
||||
ThreadStatus thread_status;
|
||||
UseSSL use_ssl;
|
||||
|
||||
if (!config().has("query") && !config().has("table-structure")) /// Nothing to process
|
||||
|
@ -5,4 +5,5 @@ if (CLICKHOUSE_SPLIT_BINARY)
|
||||
add_executable (clickhouse-obfuscator clickhouse-obfuscator.cpp)
|
||||
set_target_properties(clickhouse-obfuscator PROPERTIES RUNTIME_OUTPUT_DIRECTORY ..)
|
||||
target_link_libraries (clickhouse-obfuscator PRIVATE clickhouse-obfuscator-lib)
|
||||
install (TARGETS clickhouse-obfuscator ${CLICKHOUSE_ALL_TARGETS} RUNTIME DESTINATION ${CMAKE_INSTALL_BINDIR} COMPONENT clickhouse)
|
||||
endif ()
|
||||
|
@ -36,4 +36,5 @@ endif ()
|
||||
if (CLICKHOUSE_SPLIT_BINARY)
|
||||
add_executable (clickhouse-odbc-bridge odbc-bridge.cpp)
|
||||
target_link_libraries (clickhouse-odbc-bridge PRIVATE clickhouse-odbc-bridge-lib)
|
||||
install (TARGETS clickhouse-odbc-bridge ${CLICKHOUSE_ALL_TARGETS} RUNTIME DESTINATION ${CMAKE_INSTALL_BINDIR} COMPONENT clickhouse)
|
||||
endif ()
|
||||
|
@ -5,4 +5,5 @@ target_include_directories (clickhouse-performance-test-lib SYSTEM PRIVATE ${PCG
|
||||
if (CLICKHOUSE_SPLIT_BINARY)
|
||||
add_executable (clickhouse-performance-test clickhouse-performance-test.cpp)
|
||||
target_link_libraries (clickhouse-performance-test PRIVATE clickhouse-performance-test-lib)
|
||||
install (TARGETS clickhouse-performance-test ${CLICKHOUSE_ALL_TARGETS} RUNTIME DESTINATION ${CMAKE_INSTALL_BINDIR} COMPONENT clickhouse)
|
||||
endif ()
|
||||
|
@ -23,7 +23,7 @@
|
||||
#include <IO/ConnectionTimeouts.h>
|
||||
#include <IO/UseSSL.h>
|
||||
#include <Interpreters/Settings.h>
|
||||
#include <common/ThreadPool.h>
|
||||
#include <Common/ThreadPool.h>
|
||||
#include <common/getMemoryAmount.h>
|
||||
#include <Poco/AutoPtr.h>
|
||||
#include <Poco/Exception.h>
|
||||
|
@ -23,7 +23,7 @@ if (CLICKHOUSE_SPLIT_BINARY)
|
||||
install (TARGETS clickhouse-server ${CLICKHOUSE_ALL_TARGETS} RUNTIME DESTINATION ${CMAKE_INSTALL_BINDIR} COMPONENT clickhouse)
|
||||
endif ()
|
||||
|
||||
if (OS_LINUX AND MAKE_STATIC_LIBRARIES)
|
||||
if (GLIBC_COMPATIBILITY)
|
||||
set (GLIBC_MAX_REQUIRED 2.4 CACHE INTERNAL "")
|
||||
# Temporarily disabled. To enable, change 'exit 0' to 'exit $a'
|
||||
add_test(NAME GLIBC_required_version COMMAND bash -c "readelf -s ${CMAKE_CURRENT_BINARY_DIR}/../clickhouse-server | perl -nE 'END {exit 0 if $a} ++$a, print if /\\x40GLIBC_(\\S+)/ and pack(q{C*}, split /\\./, \$1) gt pack q{C*}, split /\\./, q{${GLIBC_MAX_REQUIRED}}'")
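The Perl one-liner in the test above packs each dotted version into bytes and compares the packed strings, which amounts to a component-wise comparison against `GLIBC_MAX_REQUIRED`. A minimal C++ sketch of that comparison follows. The helper names `parseVersion` and `exceedsMaxVersion` are hypothetical and do not appear in the source, and the sketch assumes every component is a plain non-negative integer.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: split "2.10" into {2, 10}.
// Assumes every component is a non-empty decimal number.
std::vector<int> parseVersion(const std::string & text)
{
    std::vector<int> parts;
    std::istringstream in(text);
    std::string part;
    while (std::getline(in, part, '.'))
        parts.push_back(std::stoi(part));
    return parts;
}

// Hypothetical helper: true if `required` is strictly newer than `max_allowed`.
// std::vector compares lexicographically, so "2.10" > "2.4" and "2.4.1" > "2.4".
bool exceedsMaxVersion(const std::string & required, const std::string & max_allowed)
{
    return parseVersion(required) > parseVersion(max_allowed);
}
```

Comparing the raw strings would rank "2.10" below "2.4", which is why both the Perl code and this sketch compare numeric components instead.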
|
||||
|
@ -647,6 +647,7 @@ void HTTPHandler::trySendExceptionToClient(const std::string & s, int exception_
|
||||
void HTTPHandler::handleRequest(Poco::Net::HTTPServerRequest & request, Poco::Net::HTTPServerResponse & response)
|
||||
{
|
||||
setThreadName("HTTPHandler");
|
||||
ThreadStatus thread_status;
|
||||
|
||||
Output used_output;
|
||||
|
||||
|
@ -6,6 +6,7 @@
|
||||
#include <thread>
|
||||
#include <vector>
|
||||
#include <Common/ProfileEvents.h>
|
||||
#include <Common/ThreadPool.h>
|
||||
|
||||
|
||||
namespace DB
|
||||
@ -46,7 +47,7 @@ private:
|
||||
bool quit = false;
|
||||
std::mutex mutex;
|
||||
std::condition_variable cond;
|
||||
std::thread thread{&MetricsTransmitter::run, this};
|
||||
ThreadFromGlobalPool thread{&MetricsTransmitter::run, this};
|
||||
|
||||
static constexpr auto profile_events_path_prefix = "ClickHouse.ProfileEvents.";
|
||||
static constexpr auto current_metrics_path_prefix = "ClickHouse.Metrics.";
|
||||
|
@ -27,6 +27,7 @@
|
||||
#include <Common/getMultipleKeysFromConfig.h>
|
||||
#include <Common/getNumberOfPhysicalCPUCores.h>
|
||||
#include <Common/TaskStatsInfoGetter.h>
|
||||
#include <Common/ThreadStatus.h>
|
||||
#include <IO/HTTPCommon.h>
|
||||
#include <IO/UseSSL.h>
|
||||
#include <Interpreters/AsynchronousMetrics.h>
|
||||
@ -129,9 +130,10 @@ std::string Server::getDefaultCorePath() const
|
||||
int Server::main(const std::vector<std::string> & /*args*/)
|
||||
{
|
||||
Logger * log = &logger();
|
||||
|
||||
UseSSL use_ssl;
|
||||
|
||||
ThreadStatus thread_status;
|
||||
|
||||
registerFunctions();
|
||||
registerAggregateFunctions();
|
||||
registerTableFunctions();
|
||||
@ -418,7 +420,7 @@ int Server::main(const std::vector<std::string> & /*args*/)
|
||||
|
||||
/// Set path for format schema files
|
||||
auto format_schema_path = Poco::File(config().getString("format_schema_path", path + "format_schemas/"));
|
||||
global_context->setFormatSchemaPath(format_schema_path.path() + "/");
|
||||
global_context->setFormatSchemaPath(format_schema_path.path());
|
||||
format_schema_path.createDirectories();
|
||||
|
||||
LOG_INFO(log, "Loading metadata.");
|
||||
|
@ -55,6 +55,7 @@ namespace ErrorCodes
|
||||
void TCPHandler::runImpl()
|
||||
{
|
||||
setThreadName("TCPHandler");
|
||||
ThreadStatus thread_status;
|
||||
|
||||
connection_context = server.context();
|
||||
connection_context.setSessionContext(connection_context);
|
||||
|
@ -100,7 +100,7 @@ public:
|
||||
return res;
|
||||
}
|
||||
|
||||
void NO_SANITIZE_UNDEFINED add(AggregateDataPtr place, const IColumn ** columns, size_t row_num, Arena *) const override
|
||||
void add(AggregateDataPtr place, const IColumn ** columns, size_t row_num, Arena *) const override
|
||||
{
|
||||
/// Out of range conversion may occur. This is Ok.
|
||||
|
||||
@ -177,8 +177,11 @@ public:
|
||||
static void assertSecondArg(const DataTypes & argument_types)
|
||||
{
|
||||
if constexpr (has_second_arg)
|
||||
/// TODO: check that second argument is of numerical type.
|
||||
{
|
||||
assertBinary(Name::name, argument_types);
|
||||
if (!isUnsignedInteger(argument_types[1]))
|
||||
throw Exception("Second argument (weight) for function " + std::string(Name::name) + " must be unsigned integer, but it has type " + argument_types[1]->getName(), ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT);
|
||||
}
|
||||
else
|
||||
assertUnary(Name::name, argument_types);
|
||||
}
|
||||
|
@ -12,10 +12,41 @@ namespace DB
|
||||
namespace
|
||||
{
|
||||
|
||||
AggregateFunctionPtr createAggregateFunctionSumMap(const std::string & name, const DataTypes & arguments, const Array & params)
|
||||
struct WithOverflowPolicy
|
||||
{
|
||||
assertNoParameters(name, params);
|
||||
/// Overflow, meaning that the returned type is the same as the input type.
|
||||
static DataTypePtr promoteType(const DataTypePtr & data_type) { return data_type; }
|
||||
};
|
||||
|
||||
struct WithoutOverflowPolicy
|
||||
{
|
||||
/// No overflow, meaning we promote the types if necessary.
|
||||
static DataTypePtr promoteType(const DataTypePtr & data_type)
|
||||
{
|
||||
if (!data_type->canBePromoted())
|
||||
throw Exception{"Values to be summed are expected to be Numeric, Float or Decimal.",
|
||||
ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT};
|
||||
|
||||
return data_type->promoteNumericType();
|
||||
}
|
||||
};
|
||||
|
||||
template <typename T>
|
||||
using SumMapWithOverflow = AggregateFunctionSumMap<T, WithOverflowPolicy>;
|
||||
|
||||
template <typename T>
|
||||
using SumMapWithoutOverflow = AggregateFunctionSumMap<T, WithoutOverflowPolicy>;
|
||||
|
||||
template <typename T>
|
||||
using SumMapFilteredWithOverflow = AggregateFunctionSumMapFiltered<T, WithOverflowPolicy>;
|
||||
|
||||
template <typename T>
|
||||
using SumMapFilteredWithoutOverflow = AggregateFunctionSumMapFiltered<T, WithoutOverflowPolicy>;
|
||||
|
||||
using SumMapArgs = std::pair<DataTypePtr, DataTypes>;
|
||||
|
||||
SumMapArgs parseArguments(const std::string & name, const DataTypes & arguments)
|
||||
{
|
||||
if (arguments.size() < 2)
|
||||
throw Exception("Aggregate function " + name + " requires at least two arguments of Array type.",
|
||||
ErrorCodes::NUMBER_OF_ARGUMENTS_DOESNT_MATCH);
|
||||
@ -25,9 +56,11 @@ AggregateFunctionPtr createAggregateFunctionSumMap(const std::string & name, con
|
||||
throw Exception("First argument for function " + name + " must be an array.",
|
||||
ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT);
|
||||
|
||||
const DataTypePtr & keys_type = array_type->getNestedType();
|
||||
|
||||
DataTypePtr keys_type = array_type->getNestedType();
|
||||
|
||||
DataTypes values_types;
|
||||
values_types.reserve(arguments.size() - 1);
|
||||
for (size_t i = 1; i < arguments.size(); ++i)
|
||||
{
|
||||
array_type = checkAndGetDataType<DataTypeArray>(arguments[i].get());
|
||||
@ -37,20 +70,55 @@ AggregateFunctionPtr createAggregateFunctionSumMap(const std::string & name, con
|
||||
values_types.push_back(array_type->getNestedType());
|
||||
}
|
||||
|
||||
AggregateFunctionPtr res(createWithNumericBasedType<AggregateFunctionSumMap>(*keys_type, keys_type, values_types));
|
||||
return {std::move(keys_type), std::move(values_types)};
|
||||
}
|
||||
|
||||
template <template <typename> class Function>
|
||||
AggregateFunctionPtr createAggregateFunctionSumMap(const std::string & name, const DataTypes & arguments, const Array & params)
|
||||
{
|
||||
assertNoParameters(name, params);
|
||||
|
||||
auto [keys_type, values_types] = parseArguments(name, arguments);
|
||||
|
||||
AggregateFunctionPtr res(createWithNumericBasedType<Function>(*keys_type, keys_type, values_types));
|
||||
if (!res)
|
||||
res.reset(createWithDecimalType<AggregateFunctionSumMap>(*keys_type, keys_type, values_types));
|
||||
res.reset(createWithDecimalType<Function>(*keys_type, keys_type, values_types));
|
||||
if (!res)
|
||||
throw Exception("Illegal type of argument for aggregate function " + name, ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT);
|
||||
|
||||
return res;
|
||||
}
|
||||
|
||||
template <template <typename> class Function>
|
||||
AggregateFunctionPtr createAggregateFunctionSumMapFiltered(const std::string & name, const DataTypes & arguments, const Array & params)
|
||||
{
|
||||
if (params.size() != 1)
|
||||
throw Exception("Aggregate function " + name + " requires exactly one parameter of Array type.",
|
||||
ErrorCodes::NUMBER_OF_ARGUMENTS_DOESNT_MATCH);
|
||||
|
||||
Array keys_to_keep;
|
||||
if (!params.front().tryGet<Array>(keys_to_keep))
|
||||
throw Exception("Aggregate function " + name + " requires an Array as parameter.",
|
||||
ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT);
|
||||
|
||||
auto [keys_type, values_types] = parseArguments(name, arguments);
|
||||
|
||||
AggregateFunctionPtr res(createWithNumericBasedType<Function>(*keys_type, keys_type, values_types, keys_to_keep));
|
||||
if (!res)
|
||||
res.reset(createWithDecimalType<Function>(*keys_type, keys_type, values_types, keys_to_keep));
|
||||
if (!res)
|
||||
throw Exception("Illegal type of argument for aggregate function " + name, ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT);
|
||||
|
||||
return res;
|
||||
}
|
||||
}
|
||||
|
||||
void registerAggregateFunctionSumMap(AggregateFunctionFactory & factory)
|
||||
{
|
||||
factory.registerFunction("sumMap", createAggregateFunctionSumMap);
|
||||
factory.registerFunction("sumMap", createAggregateFunctionSumMap<SumMapWithoutOverflow>);
|
||||
factory.registerFunction("sumMapWithOverflow", createAggregateFunctionSumMap<SumMapWithOverflow>);
|
||||
factory.registerFunction("sumMapFiltered", createAggregateFunctionSumMapFiltered<SumMapFilteredWithoutOverflow>);
|
||||
factory.registerFunction("sumMapFilteredWithOverflow", createAggregateFunctionSumMapFiltered<SumMapFilteredWithOverflow>);
|
||||
}
|
||||
|
||||
}
|
||||
|
@ -50,9 +50,9 @@ struct AggregateFunctionSumMapData
|
||||
* ([1,2,3,4,5,6,7,8,9,10],[10,10,45,20,35,20,15,30,20,20])
|
||||
*/
|
||||
|
||||
template <typename T>
|
||||
class AggregateFunctionSumMap final : public IAggregateFunctionDataHelper<
|
||||
AggregateFunctionSumMapData<NearestFieldType<T>>, AggregateFunctionSumMap<T>>
|
||||
template <typename T, typename Derived, typename OverflowPolicy>
|
||||
class AggregateFunctionSumMapBase : public IAggregateFunctionDataHelper<
|
||||
AggregateFunctionSumMapData<NearestFieldType<T>>, Derived>
|
||||
{
|
||||
private:
|
||||
using ColVecType = std::conditional_t<IsDecimalNumber<T>, ColumnDecimal<T>, ColumnVector<T>>;
|
||||
@ -61,7 +61,7 @@ private:
|
||||
DataTypes values_types;
|
||||
|
||||
public:
|
||||
AggregateFunctionSumMap(const DataTypePtr & keys_type, const DataTypes & values_types)
|
||||
AggregateFunctionSumMapBase(const DataTypePtr & keys_type, const DataTypes & values_types)
|
||||
: keys_type(keys_type), values_types(values_types) {}
|
||||
|
||||
String getName() const override { return "sumMap"; }
|
||||
@ -72,7 +72,7 @@ public:
|
||||
types.emplace_back(std::make_shared<DataTypeArray>(keys_type));
|
||||
|
||||
for (const auto & value_type : values_types)
|
||||
types.emplace_back(std::make_shared<DataTypeArray>(value_type));
|
||||
types.emplace_back(std::make_shared<DataTypeArray>(OverflowPolicy::promoteType(value_type)));
|
||||
|
||||
return std::make_shared<DataTypeTuple>(types);
|
||||
}
|
||||
@ -109,6 +109,11 @@ public:
|
||||
array_column.getData().get(values_vec_offset + i, value);
|
||||
const auto & key = keys_vec.getData()[keys_vec_offset + i];
|
||||
|
||||
if (!keepKey(key))
|
||||
{
|
||||
continue;
|
||||
}
|
||||
|
||||
IteratorType it;
|
||||
if constexpr (IsDecimalNumber<T>)
|
||||
{
|
||||
@ -253,6 +258,52 @@ public:
|
||||
}
|
||||
|
||||
const char * getHeaderFilePath() const override { return __FILE__; }
|
||||
|
||||
bool keepKey(const T & key) const { return static_cast<const Derived &>(*this).keepKey(key); }
|
||||
};
|
||||
|
||||
template <typename T, typename OverflowPolicy>
|
||||
class AggregateFunctionSumMap final :
|
||||
public AggregateFunctionSumMapBase<T, AggregateFunctionSumMap<T, OverflowPolicy>, OverflowPolicy>
|
||||
{
|
||||
private:
|
||||
using Self = AggregateFunctionSumMap<T, OverflowPolicy>;
|
||||
using Base = AggregateFunctionSumMapBase<T, Self, OverflowPolicy>;
|
||||
|
||||
public:
|
||||
AggregateFunctionSumMap(const DataTypePtr & keys_type, DataTypes & values_types)
|
||||
: Base{keys_type, values_types}
|
||||
{}
|
||||
|
||||
String getName() const override { return "sumMap"; }
|
||||
|
||||
bool keepKey(const T &) const { return true; }
|
||||
};
|
||||
|
||||
template <typename T, typename OverflowPolicy>
|
||||
class AggregateFunctionSumMapFiltered final :
|
||||
public AggregateFunctionSumMapBase<T, AggregateFunctionSumMapFiltered<T, OverflowPolicy>, OverflowPolicy>
|
||||
{
|
||||
private:
|
||||
using Self = AggregateFunctionSumMapFiltered<T, OverflowPolicy>;
|
||||
using Base = AggregateFunctionSumMapBase<T, Self, OverflowPolicy>;
|
||||
|
||||
std::unordered_set<T> keys_to_keep;
|
||||
|
||||
public:
|
||||
AggregateFunctionSumMapFiltered(const DataTypePtr & keys_type, const DataTypes & values_types, const Array & keys_to_keep_)
|
||||
: Base{keys_type, values_types}
|
||||
{
|
||||
keys_to_keep.reserve(keys_to_keep_.size());
|
||||
for (const Field & f : keys_to_keep_)
|
||||
{
|
||||
keys_to_keep.emplace(f.safeGet<NearestFieldType<T>>());
|
||||
}
|
||||
}
|
||||
|
||||
String getName() const override { return "sumMapFiltered"; }
|
||||
|
||||
bool keepKey(const T & key) const { return keys_to_keep.count(key); }
|
||||
};
|
||||
|
||||
}
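The header above combines two compile-time mechanisms: an overflow policy that decides the result type, and a CRTP base that asks the derived class whether to keep a key. The following is a simplified sketch of that pattern, not the ClickHouse implementation. All names (`SumMapBase`, `SumMap`, `SumMapFiltered`, `WithOverflow`, `WithoutOverflow`) are illustrative, and the widening rule (promote to a 64-bit integer) is an assumption standing in for `promoteNumericType()`.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <type_traits>
#include <unordered_set>
#include <utility>
#include <vector>

// Policy: keep the input type, so sums may wrap around.
struct WithOverflow
{
    template <typename T>
    using Result = T;
};

// Policy: widen the accumulator (assumed rule: promote to a 64-bit integer).
struct WithoutOverflow
{
    template <typename T>
    using Result = std::conditional_t<std::is_signed_v<T>, int64_t, uint64_t>;
};

// CRTP base: the summation loop lives here, key filtering is delegated to Derived.
template <typename T, typename Derived, typename OverflowPolicy>
class SumMapBase
{
public:
    using Sum = typename OverflowPolicy::template Result<T>;

    void add(const std::vector<T> & keys, const std::vector<T> & values)
    {
        for (size_t i = 0; i < keys.size() && i < values.size(); ++i)
        {
            // Static dispatch: no virtual call is needed to ask the derived class.
            if (!static_cast<const Derived &>(*this).keepKey(keys[i]))
                continue;
            Sum & sum = sums[keys[i]];
            sum = static_cast<Sum>(sum + values[i]);
        }
    }

    const std::map<T, Sum> & result() const { return sums; }

private:
    std::map<T, Sum> sums;
};

// Unfiltered variant: every key is kept.
template <typename T, typename OverflowPolicy>
class SumMap final : public SumMapBase<T, SumMap<T, OverflowPolicy>, OverflowPolicy>
{
public:
    bool keepKey(const T &) const { return true; }
};

// Filtered variant: only keys from the given set are kept.
template <typename T, typename OverflowPolicy>
class SumMapFiltered final : public SumMapBase<T, SumMapFiltered<T, OverflowPolicy>, OverflowPolicy>
{
public:
    explicit SumMapFiltered(std::unordered_set<T> keys) : keys_to_keep(std::move(keys)) {}
    bool keepKey(const T & key) const { return keys_to_keep.count(key) != 0; }

private:
    std::unordered_set<T> keys_to_keep;
};
```

With `uint8_t` values, the `WithOverflow` variant wraps at 256 while the `WithoutOverflow` variant accumulates in a wider type. This mirrors the difference between `sumMapWithOverflow` and `sumMap` registered in the diff.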
|
||||
|
@ -6,25 +6,25 @@ namespace DB
|
||||
{
|
||||
|
||||
/** Data for HyperLogLogBiasEstimator in the uniqCombined function.
|
||||
* The development plan is as follows:
|
||||
* 1. Assemble ClickHouse.
|
||||
* 2. Run the script src/dbms/scripts/gen-bias-data.py, which returns one array for getRawEstimates()
|
||||
* and another array for getBiases().
|
||||
* 3. Update `raw_estimates` and `biases` arrays. Also update the size of arrays in InterpolatedData.
|
||||
* 4. Assemble ClickHouse.
|
||||
* 5. Run the script src/dbms/scripts/linear-counting-threshold.py, which creates 3 files:
|
||||
* - raw_graph.txt (1st column: the present number of unique values;
|
||||
* 2nd column: relative error in the case of HyperLogLog without applying any corrections)
|
||||
* - linear_counting_graph.txt (1st column: the present number of unique values;
|
||||
* 2nd column: relative error in the case of HyperLogLog using LinearCounting)
|
||||
* - bias_corrected_graph.txt (1st column: the present number of unique values;
|
||||
* 2nd column: relative error in the case of HyperLogLog with the use of corrections from the algorithm HyperLogLog++)
|
||||
* 6. Generate a graph with gnuplot based on this data.
|
||||
* 7. Determine the minimum number of unique values at which it is better to correct the error
|
||||
* using its estimate (i.e., using the HyperLogLog++ algorithm) than to apply the LinearCounting algorithm.
|
||||
* 8. Accordingly, update the constant in the function getThreshold().
|
||||
* 9. Assemble ClickHouse.
|
||||
*/
|
||||
* The development plan is as follows:
|
||||
* 1. Assemble ClickHouse.
|
||||
* 2. Run the script src/dbms/scripts/gen-bias-data.py, which returns one array for getRawEstimates()
|
||||
* and another array for getBiases().
|
||||
* 3. Update `raw_estimates` and `biases` arrays. Also update the size of arrays in InterpolatedData.
|
||||
* 4. Assemble ClickHouse.
|
||||
* 5. Run the script src/dbms/scripts/linear-counting-threshold.py, which creates 3 files:
|
||||
* - raw_graph.txt (1st column: the present number of unique values;
|
||||
* 2nd column: relative error in the case of HyperLogLog without applying any corrections)
|
||||
* - linear_counting_graph.txt (1st column: the present number of unique values;
|
||||
* 2nd column: relative error in the case of HyperLogLog using LinearCounting)
|
||||
* - bias_corrected_graph.txt (1st column: the present number of unique values;
|
||||
* 2nd column: relative error in the case of HyperLogLog with the use of corrections from the algorithm HyperLogLog++)
|
||||
* 6. Generate a graph with gnuplot based on this data.
|
||||
* 7. Determine the minimum number of unique values at which it is better to correct the error
|
||||
* using its estimate (i.e., using the HyperLogLog++ algorithm) than to apply the LinearCounting algorithm.
|
||||
* 8. Accordingly, update the constant in the function getThreshold().
|
||||
* 9. Assemble ClickHouse.
|
||||
*/
|
||||
struct UniqCombinedBiasData
|
||||
{
|
||||
using InterpolatedData = std::array<double, 200>;
|
||||
|
@ -15,33 +15,33 @@
|
||||
|
||||
|
||||
/** Approximate calculation of anything, as usual, is constructed according to the following scheme:
|
||||
* - some data structure is used to calculate the value of X;
|
||||
* - Not all values are added to the data structure, but only selected ones (according to some selectivity criteria);
|
||||
* - after processing all elements, the data structure is in some state S;
|
||||
* - as an approximate value of X, the value calculated according to the maximum likelihood principle is returned:
|
||||
* at what real value X, the probability of finding the data structure in the obtained state S is maximal.
|
||||
*/
|
||||
* - some data structure is used to calculate the value of X;
|
||||
* - Not all values are added to the data structure, but only selected ones (according to some selectivity criteria);
|
||||
* - after processing all elements, the data structure is in some state S;
|
||||
* - as an approximate value of X, the value calculated according to the maximum likelihood principle is returned:
|
||||
* at what real value X, the probability of finding the data structure in the obtained state S is maximal.
|
||||
*/
|
||||
|
||||
/** In particular, what is described below can be found by the name of the BJKST algorithm.
|
||||
*/
|
||||
*/
|
||||
|
||||
/** Very simple hash-set for approximate number of unique values.
|
||||
* Works like this:
|
||||
* - you can insert UInt64;
|
||||
* - before insertion, first the hash function UInt64 -> UInt32 is calculated;
|
||||
* - the original value is not saved (lost);
|
||||
* - further all operations are made with these hashes;
|
||||
* - hash table is constructed according to the scheme:
|
||||
* - open addressing (one buffer, position in buffer is calculated by taking remainder of division by its size);
|
||||
* - linear probing (if the cell already has a value, then the cell following it is taken, etc.);
|
||||
* - the missing value is zero-encoded; to remember presence of zero in set, separate variable of type bool is used;
|
||||
* - buffer growth by 2 times when filling more than 50%;
|
||||
* - if the set has more than UNIQUES_HASH_MAX_SIZE elements, then all elements not divisible by 2
|
||||
* are removed from the set, and from then on elements not divisible by 2 are not inserted into the set;
|
||||
* - if the situation repeats, then only elements divisible by 4 are kept, and so on.
|
||||
* - the size() method returns an approximate number of elements that have been inserted into the set;
|
||||
* - there are methods for quick reading and writing in binary and text form.
|
||||
*/
|
||||
* Works like this:
|
||||
* - you can insert UInt64;
|
||||
* - before insertion, first the hash function UInt64 -> UInt32 is calculated;
|
||||
* - the original value is not saved (lost);
|
||||
* - further all operations are made with these hashes;
|
||||
* - hash table is constructed according to the scheme:
|
||||
* - open addressing (one buffer, position in buffer is calculated by taking remainder of division by its size);
|
||||
* - linear probing (if the cell already has a value, then the cell following it is taken, etc.);
|
||||
* - the missing value is zero-encoded; to remember presence of zero in set, separate variable of type bool is used;
|
||||
* - buffer growth by 2 times when filling more than 50%;
|
||||
* - if the set has more than UNIQUES_HASH_MAX_SIZE elements, then all elements not divisible by 2
|
||||
* are removed from the set, and from then on elements not divisible by 2 are not inserted into the set;
|
||||
* - if the situation repeats, then only elements divisible by 4 are kept, and so on.
|
||||
* - the size() method returns an approximate number of elements that have been inserted into the set;
|
||||
* - there are methods for quick reading and writing in binary and text form.
|
||||
*/
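The thinning scheme described in the comment above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the `UniquesHashSet` code. It stores hashes in a `std::unordered_set` instead of an open-addressing buffer, and it uses a MurmurHash3-style finalizer chosen only for the example, not the hash function used by the source. The class name `ThinningSetSketch` and its methods are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_set>

class ThinningSetSketch
{
public:
    explicit ThinningSetSketch(size_t max_size_) : max_size(max_size_) {}

    void insert(uint64_t x)
    {
        const uint32_t h = hash(x); // the original value is not kept
        if (!isKept(h))
            return; // thinned out: not divisible by 2^skip_degree
        hashes.insert(h);
        // On overflow, raise the divisibility requirement and drop hashes that fail it.
        while (hashes.size() > max_size && skip_degree < 31)
            thin();
    }

    // Approximate number of distinct inserted values:
    // each stored hash stands for about 2^skip_degree values.
    uint64_t size() const { return static_cast<uint64_t>(hashes.size()) << skip_degree; }

    size_t stored() const { return hashes.size(); }
    unsigned skipDegree() const { return skip_degree; }

private:
    // Illustrative UInt64 -> UInt32 hash (MurmurHash3-style finalizer, truncated).
    static uint32_t hash(uint64_t x)
    {
        x ^= x >> 33;
        x *= 0xff51afd7ed558ccdULL;
        x ^= x >> 33;
        x *= 0xc4ceb9fe1a85ec53ULL;
        x ^= x >> 33;
        return static_cast<uint32_t>(x);
    }

    bool isKept(uint32_t h) const { return (h & ((1u << skip_degree) - 1)) == 0; }

    void thin()
    {
        ++skip_degree;
        for (auto it = hashes.begin(); it != hashes.end();)
        {
            if (isKept(*it))
                ++it;
            else
                it = hashes.erase(it);
        }
    }

    size_t max_size;
    unsigned skip_degree = 0;
    std::unordered_set<uint32_t> hashes;
};
```

Because the filter depends only on the hash, inserting the same value again never changes the state, and memory stays bounded by `max_size` stored hashes regardless of how many distinct values arrive.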
|
||||
|
||||
/// The maximum degree of buffer size before the values are discarded
|
||||
#define UNIQUES_HASH_MAX_SIZE_DEGREE 17
|
||||
@ -50,8 +50,8 @@
|
||||
#define UNIQUES_HASH_MAX_SIZE (1ULL << (UNIQUES_HASH_MAX_SIZE_DEGREE - 1))
|
||||
|
||||
/** The number of least significant bits used for thinning. The remaining high-order bits are used to determine the position in the hash table.
|
||||
* (high-order bits are taken because the low-order bits will be constant after dropping some of the values)
|
||||
*/
|
||||
* (high-order bits are taken because the low-order bits will be constant after dropping some of the values)
|
||||
*/
|
||||
#define UNIQUES_HASH_BITS_FOR_SKIP (32 - UNIQUES_HASH_MAX_SIZE_DEGREE)
|
||||
|
||||
/// Initial buffer size degree
|
||||
@ -59,8 +59,8 @@
|
||||
|
||||
|
||||
/** This hash function is not optimal, but UniquesHashSet states computed with it
|
||||
* are stored in many places on disk (in Yandex.Metrika), so it continues to be used.
|
||||
*/
|
||||
* are stored in many places on disk (in Yandex.Metrika), so it continues to be used.
|
||||
*/
|
||||
struct UniquesHashSetDefaultHash
|
||||
{
|
||||
size_t operator() (UInt64 x) const
|
||||
|
@ -9,9 +9,9 @@
|
||||
|
||||
|
||||
/** In a loop it connects to the server and immediately breaks the connection.
|
||||
* Using the SO_LINGER option, we ensure that the connection is terminated by sending a RST packet (not FIN).
|
||||
* This behavior causes a bug in the TCPServer implementation in the Poco library.
|
||||
*/
|
||||
* Using the SO_LINGER option, we ensure that the connection is terminated by sending a RST packet (not FIN).
|
||||
* This behavior causes a bug in the TCPServer implementation in the Poco library.
|
||||
*/
|
||||
int main(int argc, char ** argv)
|
||||
try
|
||||
{
|
||||
|
@ -1,4 +1,4 @@
|
||||
set(SRCS)
|
||||
|
||||
add_executable (column_unique column_unique.cpp ${SRCS})
|
||||
target_link_libraries (column_unique PRIVATE dbms gtest_main)
|
||||
if(USE_GTEST)
|
||||
add_executable(column_unique column_unique.cpp)
|
||||
target_link_libraries(column_unique PRIVATE dbms gtest_main)
|
||||
endif()
|
@ -4,5 +4,5 @@ add_headers_and_sources(clickhouse_common_config .)
|
||||
|
||||
add_library(clickhouse_common_config ${LINK_MODE} ${clickhouse_common_config_headers} ${clickhouse_common_config_sources})
|
||||
|
||||
target_link_libraries(clickhouse_common_config PUBLIC common PRIVATE clickhouse_common_zookeeper string_utils PUBLIC ${Poco_XML_LIBRARY} ${Poco_Util_LIBRARY})
|
||||
target_link_libraries(clickhouse_common_config PUBLIC common PRIVATE clickhouse_common_zookeeper string_utils PUBLIC ${Poco_XML_LIBRARY} ${Poco_Util_LIBRARY} Threads::Threads)
|
||||
target_include_directories(clickhouse_common_config PUBLIC ${DBMS_INCLUDE_DIR})
|
||||
|
@ -33,7 +33,7 @@ ConfigReloader::ConfigReloader(
|
||||
|
||||
void ConfigReloader::start()
|
||||
{
|
||||
thread = std::thread(&ConfigReloader::run, this);
|
||||
thread = ThreadFromGlobalPool(&ConfigReloader::run, this);
|
||||
}
|
||||
|
||||
|
||||
|
@ -1,6 +1,7 @@
|
||||
#pragma once
|
||||
|
||||
#include "ConfigProcessor.h"
|
||||
#include <Common/ThreadPool.h>
|
||||
#include <Common/ZooKeeper/Common.h>
|
||||
#include <Common/ZooKeeper/ZooKeeperNodeCache.h>
|
||||
#include <time.h>
|
||||
@ -81,7 +82,7 @@ private:
|
||||
Updater updater;
|
||||
|
||||
std::atomic<bool> quit{false};
|
||||
std::thread thread;
|
||||
ThreadFromGlobalPool thread;
|
||||
|
||||
/// Locked inside reloadIfNewer.
|
||||
std::mutex reload_mutex;
|
||||
|
@ -2,6 +2,7 @@
|
||||
|
||||
#include "CurrentThread.h"
|
||||
#include <common/logger_useful.h>
|
||||
#include <common/likely.h>
|
||||
#include <Common/ThreadStatus.h>
|
||||
#include <Common/TaskStatsInfoGetter.h>
|
||||
#include <Interpreters/ProcessList.h>
|
||||
@ -10,11 +11,6 @@
|
||||
#include <Poco/Logger.h>
|
||||
|
||||
|
||||
#if defined(ARCADIA_ROOT)
|
||||
# include <util/thread/singleton.h>
|
||||
#endif
|
||||
|
||||
|
||||
namespace DB
|
||||
{
|
||||
|
||||
@ -23,91 +19,62 @@ namespace ErrorCodes
|
||||
extern const int LOGICAL_ERROR;
|
||||
}
|
||||
|
||||
// Smoker's implementation to avoid thread_local usage: error: undefined symbol: __cxa_thread_atexit
|
||||
#if defined(ARCADIA_ROOT)
|
||||
struct ThreadStatusPtrHolder : ThreadStatusPtr
|
||||
{
|
||||
ThreadStatusPtrHolder() { ThreadStatusPtr::operator=(ThreadStatus::create()); }
|
||||
};
|
||||
struct ThreadScopePtrHolder : CurrentThread::ThreadScopePtr
|
||||
{
|
||||
ThreadScopePtrHolder() { CurrentThread::ThreadScopePtr::operator=(std::make_shared<CurrentThread::ThreadScope>()); }
|
||||
};
|
||||
# define current_thread (*FastTlsSingleton<ThreadStatusPtrHolder>())
|
||||
# define current_thread_scope (*FastTlsSingleton<ThreadScopePtrHolder>())
|
||||
#else
|
||||
/// Order of current_thread and current_thread_scope matters
|
||||
thread_local ThreadStatusPtr _current_thread = ThreadStatus::create();
|
||||
thread_local CurrentThread::ThreadScopePtr _current_thread_scope = std::make_shared<CurrentThread::ThreadScope>();
|
||||
# define current_thread _current_thread
|
||||
# define current_thread_scope _current_thread_scope
|
||||
#endif
|
||||
|
||||
void CurrentThread::updatePerformanceCounters()
|
||||
{
|
||||
get()->updatePerformanceCounters();
|
||||
get().updatePerformanceCounters();
|
||||
}
|
||||
|
||||
ThreadStatusPtr CurrentThread::get()
|
||||
ThreadStatus & CurrentThread::get()
|
||||
{
|
||||
#ifndef NDEBUG
|
||||
if (!current_thread || current_thread.use_count() <= 0)
|
||||
if (unlikely(!current_thread))
|
||||
throw Exception("Thread #" + std::to_string(Poco::ThreadNumber::get()) + " status was not initialized", ErrorCodes::LOGICAL_ERROR);
|
||||
|
||||
if (Poco::ThreadNumber::get() != current_thread->thread_number)
|
||||
throw Exception("Current thread has different thread number", ErrorCodes::LOGICAL_ERROR);
|
||||
#endif
|
||||
|
||||
return current_thread;
|
||||
}
|
||||
|
||||
CurrentThread::ThreadScopePtr CurrentThread::getScope()
|
||||
{
|
||||
return current_thread_scope;
|
||||
return *current_thread;
|
||||
}
|
||||
|
||||
ProfileEvents::Counters & CurrentThread::getProfileEvents()
|
||||
{
|
||||
return current_thread->performance_counters;
|
||||
return current_thread ? get().performance_counters : ProfileEvents::global_counters;
|
||||
}
|
||||
|
||||
MemoryTracker & CurrentThread::getMemoryTracker()
|
||||
{
|
||||
return current_thread->memory_tracker;
|
||||
return get().memory_tracker;
|
||||
}
|
||||
|
||||
void CurrentThread::updateProgressIn(const Progress & value)
|
||||
{
|
||||
current_thread->progress_in.incrementPiecewiseAtomically(value);
|
||||
get().progress_in.incrementPiecewiseAtomically(value);
|
||||
}
|
||||
|
||||
void CurrentThread::updateProgressOut(const Progress & value)
|
||||
{
|
||||
current_thread->progress_out.incrementPiecewiseAtomically(value);
|
||||
get().progress_out.incrementPiecewiseAtomically(value);
|
||||
}
|
||||
|
||||
void CurrentThread::attachInternalTextLogsQueue(const std::shared_ptr<InternalTextLogsQueue> & logs_queue)
|
||||
{
|
||||
get()->attachInternalTextLogsQueue(logs_queue);
|
||||
get().attachInternalTextLogsQueue(logs_queue);
|
||||
}
|
||||
|
||||
std::shared_ptr<InternalTextLogsQueue> CurrentThread::getInternalTextLogsQueue()
|
||||
{
|
||||
/// NOTE: this method could be called at early server startup stage
|
||||
/// NOTE: this method could be called in ThreadStatus destructor, therefore we make use_count() check just in case
|
||||
|
||||
if (!current_thread || current_thread.use_count() <= 0)
|
||||
if (!current_thread)
|
||||
return nullptr;
|
||||
|
||||
if (current_thread->getCurrentState() == ThreadStatus::ThreadState::Died)
|
||||
if (get().getCurrentState() == ThreadStatus::ThreadState::Died)
|
||||
return nullptr;
|
||||
|
||||
return current_thread->getInternalTextLogsQueue();
|
||||
return get().getInternalTextLogsQueue();
|
||||
}
|
||||
|
||||
ThreadGroupStatusPtr CurrentThread::getGroup()
|
||||
{
|
||||
return get()->getThreadGroup();
|
||||
if (!current_thread)
|
||||
return nullptr;
|
||||
|
||||
return get().getThreadGroup();
|
||||
}
|
||||
|
||||
}
|
||||
|
@ -32,7 +32,7 @@ class CurrentThread
|
||||
{
|
||||
public:
|
||||
/// Handler to current thread
|
||||
static ThreadStatusPtr get();
|
||||
static ThreadStatus & get();
|
||||
|
||||
/// Group to which belongs current thread
|
||||
static ThreadGroupStatusPtr getGroup();
|
||||
@ -85,25 +85,6 @@ public:
|
||||
bool log_peak_memory_usage_in_destructor = true;
|
||||
};
|
||||
|
||||
/// Implicitly finalizes current thread in the destructor
|
||||
class ThreadScope
|
||||
{
|
||||
public:
|
||||
void (*deleter)() = nullptr;
|
||||
|
||||
ThreadScope() = default;
|
||||
~ThreadScope()
|
||||
{
|
||||
if (deleter)
|
||||
deleter();
|
||||
|
||||
/// std::terminate on exception: this is Ok.
|
||||
}
|
||||
};
|
||||
|
||||
using ThreadScopePtr = std::shared_ptr<ThreadScope>;
|
||||
static ThreadScopePtr getScope();
|
||||
|
||||
private:
|
||||
static void defaultThreadDeleter();
|
||||
};
|
||||
|
@ -408,6 +408,12 @@ namespace ErrorCodes
|
||||
extern const int ILLEGAL_SYNTAX_FOR_CODEC_TYPE = 431;
|
||||
extern const int UNKNOWN_CODEC = 432;
|
||||
extern const int ILLEGAL_CODEC_PARAMETER = 433;
|
||||
extern const int CANNOT_PARSE_PROTOBUF_SCHEMA = 434;
|
||||
extern const int NO_DATA_FOR_REQUIRED_PROTOBUF_FIELD = 435;
|
||||
extern const int CANNOT_CONVERT_TO_PROTOBUF_TYPE = 436;
|
||||
extern const int PROTOBUF_FIELD_NOT_REPEATED = 437;
|
||||
extern const int DATA_TYPE_CANNOT_BE_PROMOTED = 438;
|
||||
extern const int CANNOT_SCHEDULE_TASK = 439;
|
||||
|
||||
extern const int KEEPER_EXCEPTION = 999;
|
||||
extern const int POCO_EXCEPTION = 1000;
|
||||
|
@ -45,10 +45,10 @@ static int sigtimedwait(const sigset_t *set, siginfo_t *info, const struct times
|
||||
|
||||
|
||||
/** As long as there exists an object of this class - it blocks the INT signal, at the same time it lets you know if it came.
|
||||
* This is necessary so that you can interrupt the execution of the request with Ctrl+C.
|
||||
* Use only one instance of this class at a time.
|
||||
* If `check` method returns true (the signal has arrived), the next call will wait for the next signal.
|
||||
*/
|
||||
* This is necessary so that you can interrupt the execution of the request with Ctrl+C.
|
||||
* Use only one instance of this class at a time.
|
||||
* If `check` method returns true (the signal has arrived), the next call will wait for the next signal.
|
||||
*/
|
||||
class InterruptListener
|
||||
{
|
||||
private:
|
||||
|
@ -190,17 +190,20 @@ namespace CurrentMemoryTracker
|
||||
{
|
||||
void alloc(Int64 size)
|
||||
{
|
||||
DB::CurrentThread::getMemoryTracker().alloc(size);
|
||||
if (DB::current_thread)
|
||||
DB::CurrentThread::getMemoryTracker().alloc(size);
|
||||
}
|
||||
|
||||
void realloc(Int64 old_size, Int64 new_size)
|
||||
{
|
||||
DB::CurrentThread::getMemoryTracker().alloc(new_size - old_size);
|
||||
if (DB::current_thread)
|
||||
DB::CurrentThread::getMemoryTracker().alloc(new_size - old_size);
|
||||
}
|
||||
|
||||
void free(Int64 size)
|
||||
{
|
||||
DB::CurrentThread::getMemoryTracker().free(size);
|
||||
if (DB::current_thread)
|
||||
DB::CurrentThread::getMemoryTracker().free(size);
|
||||
}
|
||||
}
|
||||
|
||||
|
@ -79,7 +79,7 @@ private:
|
||||
constexpr uint64_t nextAlphaSize(uint64_t x)
|
||||
{
|
||||
constexpr uint64_t ALPHA_MAP_ELEMENTS_PER_COUNTER = 6;
|
||||
return 1ULL<<(sizeof(uint64_t) * 8 - __builtin_clzll(x * ALPHA_MAP_ELEMENTS_PER_COUNTER));
|
||||
return 1ULL << (sizeof(uint64_t) * 8 - __builtin_clzll(x * ALPHA_MAP_ELEMENTS_PER_COUNTER));
|
||||
}
|
||||
|
||||
public:
|
||||
|
235
dbms/src/Common/ThreadPool.cpp
Normal file
@ -0,0 +1,235 @@
|
||||
#include <Common/ThreadPool.h>
|
||||
#include <Common/Exception.h>
|
||||
|
||||
#include <iostream>
|
||||
#include <type_traits>
|
||||
|
||||
|
||||
namespace DB
|
||||
{
|
||||
namespace ErrorCodes
|
||||
{
|
||||
extern const int CANNOT_SCHEDULE_TASK;
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
template <typename Thread>
|
||||
ThreadPoolImpl<Thread>::ThreadPoolImpl(size_t max_threads)
|
||||
: ThreadPoolImpl(max_threads, max_threads, max_threads)
|
||||
{
|
||||
}
|
||||
|
||||
template <typename Thread>
|
||||
ThreadPoolImpl<Thread>::ThreadPoolImpl(size_t max_threads, size_t max_free_threads, size_t queue_size)
|
||||
: max_threads(max_threads), max_free_threads(max_free_threads), queue_size(queue_size)
|
||||
{
|
||||
}
|
||||
|
||||
template <typename Thread>
|
||||
template <typename ReturnType>
|
||||
ReturnType ThreadPoolImpl<Thread>::scheduleImpl(Job job, int priority, std::optional<uint64_t> wait_microseconds)
|
||||
{
|
||||
auto on_error = []
|
||||
{
|
||||
if constexpr (std::is_same_v<ReturnType, void>)
|
||||
throw DB::Exception("Cannot schedule a task", DB::ErrorCodes::CANNOT_SCHEDULE_TASK);
|
||||
else
|
||||
return false;
|
||||
};
|
||||
|
||||
{
|
||||
std::unique_lock lock(mutex);
|
||||
|
||||
auto pred = [this] { return !queue_size || scheduled_jobs < queue_size || shutdown; };
|
||||
|
||||
if (wait_microseconds)
|
||||
{
|
||||
if (!job_finished.wait_for(lock, std::chrono::microseconds(*wait_microseconds), pred))
|
||||
return on_error();
|
||||
}
|
||||
else
|
||||
job_finished.wait(lock, pred);
|
||||
|
||||
if (shutdown)
|
||||
return on_error();
|
||||
|
||||
jobs.emplace(std::move(job), priority);
|
||||
++scheduled_jobs;
|
||||
|
||||
if (threads.size() < std::min(max_threads, scheduled_jobs))
|
||||
{
|
||||
threads.emplace_front();
|
||||
try
|
||||
{
|
||||
threads.front() = Thread([this, it = threads.begin()] { worker(it); });
|
||||
}
|
||||
catch (...)
|
||||
{
|
||||
threads.pop_front();
|
||||
}
|
||||
}
|
||||
}
|
||||
new_job_or_shutdown.notify_one();
|
||||
return ReturnType(true);
|
||||
}
|
||||
|
||||
template <typename Thread>
|
||||
void ThreadPoolImpl<Thread>::schedule(Job job, int priority)
|
||||
{
|
||||
scheduleImpl<void>(std::move(job), priority, std::nullopt);
|
||||
}
|
||||
|
||||
template <typename Thread>
|
||||
bool ThreadPoolImpl<Thread>::trySchedule(Job job, int priority, uint64_t wait_microseconds)
|
||||
{
|
||||
return scheduleImpl<bool>(std::move(job), priority, wait_microseconds);
|
||||
}
|
||||
|
||||
template <typename Thread>
|
||||
void ThreadPoolImpl<Thread>::scheduleOrThrow(Job job, int priority, uint64_t wait_microseconds)
|
||||
{
|
||||
scheduleImpl<void>(std::move(job), priority, wait_microseconds);
|
||||
}
|
||||
|
||||
template <typename Thread>
|
||||
void ThreadPoolImpl<Thread>::wait()
|
||||
{
|
||||
{
|
||||
std::unique_lock lock(mutex);
|
||||
job_finished.wait(lock, [this] { return scheduled_jobs == 0; });
|
||||
|
||||
if (first_exception)
|
||||
{
|
||||
std::exception_ptr exception;
|
||||
std::swap(exception, first_exception);
|
||||
std::rethrow_exception(exception);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
template <typename Thread>
|
||||
ThreadPoolImpl<Thread>::~ThreadPoolImpl()
|
||||
{
|
||||
finalize();
|
||||
}
|
||||
|
||||
template <typename Thread>
|
||||
void ThreadPoolImpl<Thread>::finalize()
|
||||
{
|
||||
{
|
||||
std::unique_lock lock(mutex);
|
||||
shutdown = true;
|
||||
}
|
||||
|
||||
new_job_or_shutdown.notify_all();
|
||||
|
||||
for (auto & thread : threads)
|
||||
thread.join();
|
||||
|
||||
threads.clear();
|
||||
}
|
||||
|
||||
template <typename Thread>
|
||||
size_t ThreadPoolImpl<Thread>::active() const
|
||||
{
|
||||
std::unique_lock lock(mutex);
|
||||
return scheduled_jobs;
|
||||
}
|
||||
|
||||
template <typename Thread>
|
||||
void ThreadPoolImpl<Thread>::worker(typename std::list<Thread>::iterator thread_it)
|
||||
{
|
||||
while (true)
|
||||
{
|
||||
Job job;
|
||||
bool need_shutdown = false;
|
||||
|
||||
{
|
||||
std::unique_lock lock(mutex);
|
||||
new_job_or_shutdown.wait(lock, [this] { return shutdown || !jobs.empty(); });
|
||||
need_shutdown = shutdown;
|
||||
|
||||
if (!jobs.empty())
|
||||
{
|
||||
job = jobs.top().job;
|
||||
jobs.pop();
|
||||
}
|
||||
else
|
||||
{
|
||||
return;
|
||||
}
|
||||
}
|
||||
|
||||
if (!need_shutdown)
|
||||
{
|
||||
try
|
||||
{
|
||||
job();
|
||||
}
|
||||
catch (...)
|
||||
{
|
||||
{
|
||||
std::unique_lock lock(mutex);
|
||||
if (!first_exception)
|
||||
first_exception = std::current_exception();
|
||||
shutdown = true;
|
||||
--scheduled_jobs;
|
||||
}
|
||||
job_finished.notify_all();
|
||||
new_job_or_shutdown.notify_all();
|
||||
return;
|
||||
}
|
||||
}
|
||||
|
||||
{
|
||||
std::unique_lock lock(mutex);
|
||||
--scheduled_jobs;
|
||||
|
||||
if (threads.size() > scheduled_jobs + max_free_threads)
|
||||
{
|
||||
threads.erase(thread_it);
|
||||
job_finished.notify_all();
|
||||
return;
|
||||
}
|
||||
}
|
||||
|
||||
job_finished.notify_all();
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
template class ThreadPoolImpl<std::thread>;
|
||||
template class ThreadPoolImpl<ThreadFromGlobalPool>;
|
||||
|
||||
|
||||
void ExceptionHandler::setException(std::exception_ptr && exception)
|
||||
{
|
||||
std::unique_lock lock(mutex);
|
||||
if (!first_exception)
|
||||
first_exception = std::move(exception);
|
||||
}
|
||||
|
||||
void ExceptionHandler::throwIfException()
|
||||
{
|
||||
std::unique_lock lock(mutex);
|
||||
if (first_exception)
|
||||
std::rethrow_exception(first_exception);
|
||||
}
|
||||
|
||||
|
||||
ThreadPool::Job createExceptionHandledJob(ThreadPool::Job job, ExceptionHandler & handler)
|
||||
{
|
||||
return [job{std::move(job)}, &handler] ()
|
||||
{
|
||||
try
|
||||
{
|
||||
job();
|
||||
}
|
||||
catch (...)
|
||||
{
|
||||
handler.setException(std::current_exception());
|
||||
}
|
||||
};
|
||||
}
|
||||
|
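The scheduling logic in `ThreadPool.cpp` above can be reduced to a small sketch: one mutex, two condition variables, a priority queue of jobs, and a stored first exception that `wait()` rethrows. The class below is a hypothetical `MiniPool`, not the ClickHouse implementation. It assumes a fixed number of `std::thread` workers, omits `max_free_threads` and the timed `trySchedule` path, and keeps running after a job throws, whereas the code in the diff sets `shutdown`.

```cpp
#include <condition_variable>
#include <cstddef>
#include <exception>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical, simplified pool (not the ClickHouse class).
class MiniPool
{
public:
    using Job = std::function<void()>;

    MiniPool(std::size_t num_threads, std::size_t queue_size_) : queue_size(queue_size_)
    {
        for (std::size_t i = 0; i < num_threads; ++i)
            threads.emplace_back([this] { worker(); });
    }

    ~MiniPool()
    {
        {
            std::lock_guard<std::mutex> lock(mutex);
            shutdown = true;
        }
        new_job_or_shutdown.notify_all();
        for (auto & thread : threads)
            thread.join();
    }

    /// Blocks while queue_size jobs are already queued or running.
    void schedule(Job job, int priority = 0)
    {
        {
            std::unique_lock<std::mutex> lock(mutex);
            job_finished.wait(lock, [this] { return scheduled_jobs < queue_size; });
            jobs.push(Item{std::move(job), priority});
            ++scheduled_jobs;
        }
        new_job_or_shutdown.notify_one();
    }

    /// Waits for all scheduled jobs, then rethrows and clears the first stored exception.
    void wait()
    {
        std::unique_lock<std::mutex> lock(mutex);
        job_finished.wait(lock, [this] { return scheduled_jobs == 0; });
        std::exception_ptr exception;
        std::swap(exception, first_exception);
        if (exception)
            std::rethrow_exception(exception);
    }

private:
    struct Item
    {
        Job job;
        int priority;
        bool operator<(const Item & rhs) const { return priority < rhs.priority; }
    };

    void worker()
    {
        while (true)
        {
            Job job;
            {
                std::unique_lock<std::mutex> lock(mutex);
                new_job_or_shutdown.wait(lock, [this] { return shutdown || !jobs.empty(); });
                if (jobs.empty())
                    return; // shutdown requested and nothing left to run
                job = jobs.top().job;
                jobs.pop();
            }
            std::exception_ptr error;
            try { job(); } catch (...) { error = std::current_exception(); }
            {
                std::lock_guard<std::mutex> lock(mutex);
                if (error && !first_exception)
                    first_exception = error;
                --scheduled_jobs;
            }
            job_finished.notify_all();
        }
    }

    std::mutex mutex;
    std::condition_variable job_finished, new_job_or_shutdown;
    const std::size_t queue_size;
    std::size_t scheduled_jobs = 0;
    bool shutdown = false;
    std::priority_queue<Item> jobs;
    std::exception_ptr first_exception;
    std::vector<std::thread> threads;
};
```

A caller would construct `MiniPool pool(4, 4)`, call `pool.schedule(...)` in a loop, and finish with `pool.wait()`. The exception is recorded in the same critical section that decrements `scheduled_jobs`, so `wait()` cannot see a zero count without also seeing the stored exception.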
203
dbms/src/Common/ThreadPool.h
Normal file
@ -0,0 +1,203 @@
|
||||
#pragma once
|
||||
|
||||
#include <cstdint>
|
||||
#include <thread>
|
||||
#include <mutex>
|
||||
#include <condition_variable>
|
||||
#include <functional>
|
||||
#include <queue>
|
||||
#include <list>
|
||||
#include <optional>
|
||||
#include <ext/singleton.h>
|
||||
|
||||
#include <Common/ThreadStatus.h>
|
||||
|
||||
|
||||
/** Very simple thread pool similar to boost::threadpool.
|
||||
* Advantages:
|
||||
* - catches exceptions and rethrows on wait.
|
||||
*
|
||||
* This thread pool can be used as a task queue.
|
||||
* For example, you can create a thread pool with 10 threads (and queue of size 10) and schedule 1000 tasks
|
||||
* - in this case you will be blocked so that at most 10 tasks are in flight.
|
||||
*
|
||||
* Thread: std::thread or something with identical interface.
|
||||
*/
|
||||
template <typename Thread>
|
||||
class ThreadPoolImpl
|
||||
{
|
||||
public:
|
||||
using Job = std::function<void()>;
|
||||
|
||||
/// Size is constant. Up to max_threads threads are created on demand and then run until shutdown.
|
||||
explicit ThreadPoolImpl(size_t max_threads);
|
||||
|
||||
/// queue_size - maximum number of running plus scheduled jobs. It can be greater than max_threads. Zero means unlimited.
|
||||
ThreadPoolImpl(size_t max_threads, size_t max_free_threads, size_t queue_size);
|
||||
|
||||
/// Add new job. Locks until number of scheduled jobs is less than maximum or exception in one of threads was thrown.
|
||||
/// If an exception in some thread was thrown, method silently returns, and exception will be rethrown only on call to 'wait' function.
|
||||
/// Priority: greater is higher.
|
||||
void schedule(Job job, int priority = 0);
|
||||
|
||||
/// Wait for specified amount of time and schedule a job or return false.
|
||||
bool trySchedule(Job job, int priority = 0, uint64_t wait_microseconds = 0);
|
||||
|
||||
/// Wait for specified amount of time and schedule a job or throw an exception.
|
||||
void scheduleOrThrow(Job job, int priority = 0, uint64_t wait_microseconds = 0);
|
||||
|
||||
/// Wait for all currently active jobs to be done.
|
||||
/// You may call schedule and wait many times in arbitrary order.
|
||||
/// If any thread has thrown an exception, the first exception will be rethrown from this method,
|
||||
/// and exception will be cleared.
|
||||
void wait();
|
||||
|
||||
/// Waits for all threads. Doesn't rethrow exceptions (use 'wait' method to rethrow exceptions).
|
||||
/// You should not destroy the object while calling schedule or wait methods from other threads.
|
||||
~ThreadPoolImpl();
|
||||
|
||||
/// Returns number of running and scheduled jobs.
|
||||
size_t active() const;
|
||||
|
||||
private:
|
||||
mutable std::mutex mutex;
|
||||
std::condition_variable job_finished;
|
||||
std::condition_variable new_job_or_shutdown;
|
||||
|
||||
const size_t max_threads;
|
||||
const size_t max_free_threads;
|
||||
const size_t queue_size;
|
||||
|
||||
size_t scheduled_jobs = 0;
|
||||
bool shutdown = false;
|
||||
|
||||
struct JobWithPriority
|
||||
{
|
||||
Job job;
|
||||
int priority;
|
||||
|
||||
JobWithPriority(Job job, int priority)
|
||||
: job(job), priority(priority) {}
|
||||
|
||||
bool operator< (const JobWithPriority & rhs) const
|
||||
{
|
||||
return priority < rhs.priority;
|
||||
}
|
||||
};
|
||||
|
||||
std::priority_queue<JobWithPriority> jobs;
|
||||
std::list<Thread> threads;
|
||||
std::exception_ptr first_exception;
|
||||
|
||||
|
||||
template <typename ReturnType>
|
||||
ReturnType scheduleImpl(Job job, int priority, std::optional<uint64_t> wait_microseconds);
|
||||
|
||||
void worker(typename std::list<Thread>::iterator thread_it);
|
||||
|
||||
void finalize();
|
||||
};
|
||||
|
||||
|
||||
/// ThreadPool with std::thread for threads.
|
||||
using FreeThreadPool = ThreadPoolImpl<std::thread>;
|
||||
|
||||
|
||||
/** Global ThreadPool that can be used as a singleton.
|
||||
* Why is it needed?
|
||||
*
|
||||
* Linux can create and destroy about 100 000 threads per second (quite good).
|
||||
* With simple ThreadPool (based on mutex and condvar) you can assign about 200 000 tasks per second
|
||||
* - not much difference compared to not using a thread pool at all.
|
||||
*
|
||||
* But if you reuse OS threads instead of creating and destroying them, several benefits exist:
|
||||
* - allocator performance will usually be better due to reuse of thread local caches, especially for jemalloc:
|
||||
* https://github.com/jemalloc/jemalloc/issues/1347
|
||||
* - address sanitizer and thread sanitizer will not fail due to global limit on number of created threads.
|
||||
* - program will work faster in gdb;
|
||||
*/
|
||||
class GlobalThreadPool : public FreeThreadPool, public ext::singleton<GlobalThreadPool>
|
||||
{
|
||||
public:
|
||||
GlobalThreadPool() : FreeThreadPool(10000, 1000, 10000) {}
|
||||
};
|
||||
|
||||
|
||||
/** Looks like std::thread but allocates threads in GlobalThreadPool.
|
||||
* Also holds ThreadStatus for ClickHouse.
|
||||
*/
|
||||
class ThreadFromGlobalPool
|
||||
{
|
||||
public:
|
||||
ThreadFromGlobalPool() {}
|
||||
|
||||
template <typename Function, typename... Args>
|
||||
explicit ThreadFromGlobalPool(Function && func, Args &&... args)
|
||||
{
|
||||
mutex = std::make_unique<std::mutex>();
|
||||
|
||||
/// The function object must be copyable, so we wrap lock_guard in shared_ptr.
|
||||
GlobalThreadPool::instance().scheduleOrThrow([
|
||||
lock = std::make_shared<std::lock_guard<std::mutex>>(*mutex),
|
||||
func = std::forward<Function>(func),
|
||||
args = std::make_tuple(std::forward<Args>(args)...)]
|
||||
{
|
||||
DB::ThreadStatus thread_status;
|
||||
std::apply(func, args);
|
||||
});
|
||||
}
|
||||
|
||||
ThreadFromGlobalPool(ThreadFromGlobalPool && rhs)
|
||||
{
|
||||
*this = std::move(rhs);
|
||||
}
|
||||
|
||||
ThreadFromGlobalPool & operator=(ThreadFromGlobalPool && rhs)
|
||||
{
|
||||
if (mutex)
|
||||
std::terminate();
|
||||
mutex = std::move(rhs.mutex);
|
||||
return *this;
|
||||
}
|
||||
|
||||
~ThreadFromGlobalPool()
|
||||
{
|
||||
if (mutex)
|
||||
std::terminate();
|
||||
}
|
||||
|
||||
void join()
|
||||
{
|
||||
{
|
||||
std::lock_guard lock(*mutex);
|
||||
}
|
||||
mutex.reset();
|
||||
}
|
||||
|
||||
bool joinable() const
|
||||
{
|
||||
return static_cast<bool>(mutex);
|
||||
}
|
||||
|
||||
private:
|
||||
std::unique_ptr<std::mutex> mutex; /// Object must be moveable.
|
||||
};
|
||||
|
||||
|
||||
/// Recommended thread pool for the case when multiple thread pools are created and destroyed.
|
||||
using ThreadPool = ThreadPoolImpl<ThreadFromGlobalPool>;
|
||||
|
||||
|
||||
/// Allows saving the first caught exception in jobs and postponing its rethrow.
|
||||
class ExceptionHandler
|
||||
{
|
||||
public:
|
||||
void setException(std::exception_ptr && exception);
|
||||
void throwIfException();
|
||||
|
||||
private:
|
||||
std::exception_ptr first_exception;
|
||||
std::mutex mutex;
|
||||
};
|
||||
|
||||
ThreadPool::Job createExceptionHandledJob(ThreadPool::Job job, ExceptionHandler & handler);
|
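`ThreadFromGlobalPool` in the header above offers `join()` and `joinable()` without owning an OS thread. A minimal sketch of that idea follows, under stated assumptions: `JoinableTask` is a hypothetical name, a detached `std::thread` stands in for a pool worker, and completion is signalled with a flag guarded by a mutex and a condition variable. That signalling is a swapped-in technique, not the code in the diff: since `std::mutex` is specified to be unlocked by the thread that locked it, the sketch avoids carrying a lock from the scheduling thread to the worker.

```cpp
#include <condition_variable>
#include <exception>
#include <functional>
#include <memory>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical handle (not ClickHouse's class). A detached std::thread stands in for a pool worker.
class JoinableTask
{
public:
    JoinableTask() = default;

    explicit JoinableTask(std::function<void()> func) : state(std::make_shared<State>())
    {
        // The job holds its own reference to the shared state, so the state outlives the handle if needed.
        std::thread([shared = state, func = std::move(func)]
        {
            func(); // an escaping exception would call std::terminate in this sketch
            std::lock_guard<std::mutex> lock(shared->mutex);
            shared->done = true;
            shared->finished.notify_all();
        }).detach();
    }

    JoinableTask(JoinableTask && rhs) noexcept : state(std::move(rhs.state)) {}

    JoinableTask & operator=(JoinableTask && rhs) noexcept
    {
        if (state)
            std::terminate(); // same policy as std::thread: never overwrite a joinable handle
        state = std::move(rhs.state);
        return *this;
    }

    ~JoinableTask()
    {
        if (state)
            std::terminate(); // destroying without join() is treated as a bug
    }

    void join()
    {
        {
            std::unique_lock<std::mutex> lock(state->mutex);
            state->finished.wait(lock, [this] { return state->done; });
        }
        state.reset();
    }

    bool joinable() const { return static_cast<bool>(state); }

private:
    struct State
    {
        std::mutex mutex;
        std::condition_variable finished;
        bool done = false;
    };

    std::shared_ptr<State> state;
};

// Usage helper: runs one job through the handle and returns what it produced.
inline int runAndJoin()
{
    int result = 0;
    JoinableTask task([&result] { result = 42; });
    task.join(); // the mutex hand-off makes the write to `result` visible here
    return result;
}
```

As in the diff, a handle that is destroyed or overwritten while still joinable terminates the program, which mirrors the contract of `std::thread`.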
@ -21,10 +21,13 @@ namespace ErrorCodes
|
||||
}
|
||||
|
||||
|
||||
thread_local ThreadStatusPtr current_thread = nullptr;
|
||||
|
||||
|
||||
TasksStatsCounters TasksStatsCounters::current()
|
||||
{
|
||||
TasksStatsCounters res;
|
||||
CurrentThread::get()->taskstats_getter->getStat(res.stat, CurrentThread::get()->os_thread_id);
|
||||
CurrentThread::get().taskstats_getter->getStat(res.stat, CurrentThread::get().os_thread_id);
|
||||
return res;
|
||||
}
|
||||
|
||||
@ -39,17 +42,19 @@ ThreadStatus::ThreadStatus()
|
||||
memory_tracker.setDescription("(for thread)");
|
||||
log = &Poco::Logger::get("ThreadStatus");
|
||||
|
||||
current_thread = this;
|
||||
|
||||
/// NOTE: It is important not to do any non-trivial actions (like updating ProfileEvents or logging) before ThreadStatus is created
|
||||
/// Otherwise it could lead to SIGSEGV due to current_thread dereferencing
|
||||
}
|
||||
|
||||
ThreadStatusPtr ThreadStatus::create()
|
||||
ThreadStatus::~ThreadStatus()
|
||||
{
|
||||
return ThreadStatusPtr(new ThreadStatus);
|
||||
if (deleter)
|
||||
deleter();
|
||||
current_thread = nullptr;
|
||||
}
|
||||
|
||||
ThreadStatus::~ThreadStatus() = default;
|
||||
|
||||
void ThreadStatus::initPerformanceCounters()
|
||||
{
|
||||
performance_counters_finalized = false;
|
||||
|
@ -9,6 +9,8 @@
|
||||
#include <map>
|
||||
#include <mutex>
|
||||
#include <shared_mutex>
|
||||
#include <functional>
|
||||
#include <boost/noncopyable.hpp>
|
||||
|
||||
|
||||
namespace Poco
|
||||
@ -23,7 +25,7 @@ namespace DB
|
||||
class Context;
|
||||
class QueryStatus;
|
||||
class ThreadStatus;
|
||||
using ThreadStatusPtr = std::shared_ptr<ThreadStatus>;
|
||||
using ThreadStatusPtr = ThreadStatus*;
|
||||
class QueryThreadLog;
|
||||
struct TasksStatsCounters;
|
||||
struct RUsageCounters;
|
||||
@ -67,14 +69,20 @@ public:
|
||||
using ThreadGroupStatusPtr = std::shared_ptr<ThreadGroupStatus>;
|
||||
|
||||
|
||||
extern thread_local ThreadStatusPtr current_thread;
|
||||
|
||||
/** Encapsulates all per-thread info (ProfileEvents, MemoryTracker, query_id, query context, etc.).
|
||||
* Used inside thread-local variable. See variables in CurrentThread.cpp
|
||||
* The object must be created in thread function and destroyed in the same thread before the exit.
|
||||
* It is accessed through thread-local pointer.
|
||||
*
|
||||
* This object should be used only via "CurrentThread", see CurrentThread.h
|
||||
*/
|
||||
class ThreadStatus : public std::enable_shared_from_this<ThreadStatus>
|
||||
class ThreadStatus : public boost::noncopyable
|
||||
{
|
||||
public:
|
||||
ThreadStatus();
|
||||
~ThreadStatus();
|
||||
|
||||
/// Poco's thread number (the same number is used in logs)
|
||||
UInt32 thread_number = 0;
|
||||
/// Linux's PID (or TGID) (the same id is shown by ps util)
|
||||
@ -88,8 +96,8 @@ public:
|
||||
Progress progress_in;
|
||||
Progress progress_out;
|
||||
|
||||
public:
|
||||
static ThreadStatusPtr create();
|
||||
using Deleter = std::function<void()>;
|
||||
Deleter deleter;
|
||||
|
||||
ThreadGroupStatusPtr getThreadGroup() const
|
||||
{
|
||||
@ -136,11 +144,7 @@ public:
|
||||
/// Detaches thread from the thread group and the query, dumps performance counters if they have not been dumped
|
||||
void detachQuery(bool exit_if_already_detached = false, bool thread_exits = false);
|
||||
|
||||
~ThreadStatus();
|
||||
|
||||
protected:
|
||||
ThreadStatus();
|
||||
|
||||
void initPerformanceCounters();
|
||||
|
||||
void logToQueryThreadLog(QueryThreadLog & thread_log);
|
||||
|
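The `ThreadStatus` changes above replace a `shared_ptr` held in thread-local storage with a plain thread-local pointer that the object sets in its constructor and clears in its destructor. Callers such as `getProfileEvents` then fall back to global state when no object is registered. A minimal sketch of that pattern follows. `ThreadInfo`, `Counters`, `global_counters`, `currentCounters`, and `countOnWorker` are hypothetical names, and the global fallback is left unsynchronized for brevity.

```cpp
#include <cstdint>
#include <thread>

struct Counters
{
    std::uint64_t events = 0;
};

// Fallback used by threads that never registered a ThreadInfo (not synchronized in this sketch).
Counters global_counters;

class ThreadInfo;
thread_local ThreadInfo * current_thread_info = nullptr;

// Hypothetical per-thread object: registers itself on construction, unregisters on destruction.
class ThreadInfo
{
public:
    ThreadInfo() { current_thread_info = this; }
    ~ThreadInfo() { current_thread_info = nullptr; }

    ThreadInfo(const ThreadInfo &) = delete;
    ThreadInfo & operator=(const ThreadInfo &) = delete;

    Counters counters;
};

// Null-safe accessor: per-thread counters when registered, global ones otherwise.
Counters & currentCounters()
{
    return current_thread_info ? current_thread_info->counters : global_counters;
}

// Usage helper: a worker thread registers itself, records one event, and reports its own count.
std::uint64_t countOnWorker()
{
    std::uint64_t seen = 0;
    std::thread worker([&seen]
    {
        ThreadInfo info;            // registers itself for this thread only
        ++currentCounters().events; // lands in info.counters
        seen = info.counters.events;
    });
    worker.join();
    return seen;
}
```

The object lives on the stack of the thread function, so it is created and destroyed on the thread it describes, which is the lifetime rule stated in the `ThreadStatus` comment.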
@ -4,7 +4,7 @@ add_headers_and_sources(clickhouse_common_zookeeper .)
|
||||
|
||||
add_library(clickhouse_common_zookeeper ${LINK_MODE} ${clickhouse_common_zookeeper_headers} ${clickhouse_common_zookeeper_sources})
|
||||
|
||||
target_link_libraries (clickhouse_common_zookeeper PUBLIC clickhouse_common_io common PRIVATE string_utils PUBLIC ${Poco_Util_LIBRARY})
|
||||
target_link_libraries (clickhouse_common_zookeeper PUBLIC clickhouse_common_io common PRIVATE string_utils PUBLIC ${Poco_Util_LIBRARY} Threads::Threads)
|
||||
target_include_directories(clickhouse_common_zookeeper PUBLIC ${DBMS_INCLUDE_DIR})
|
||||
|
||||
if (ENABLE_TESTS)
|
||||
|
@ -853,8 +853,8 @@ ZooKeeper::ZooKeeper(
|
||||
if (!auth_scheme.empty())
|
||||
sendAuth(auth_scheme, auth_data);
|
||||
|
||||
send_thread = std::thread([this] { sendThread(); });
|
||||
receive_thread = std::thread([this] { receiveThread(); });
|
||||
send_thread = ThreadFromGlobalPool([this] { sendThread(); });
|
||||
receive_thread = ThreadFromGlobalPool([this] { receiveThread(); });
|
||||
|
||||
ProfileEvents::increment(ProfileEvents::ZooKeeperInit);
|
||||
}
|
||||
|
@ -3,6 +3,7 @@
|
||||
#include <Core/Types.h>
|
||||
#include <Common/ConcurrentBoundedQueue.h>
|
||||
#include <Common/CurrentMetrics.h>
|
||||
#include <Common/ThreadPool.h>
|
||||
#include <Common/ZooKeeper/IKeeper.h>
|
||||
|
||||
#include <IO/ReadBuffer.h>
|
||||
@ -209,8 +210,8 @@ private:
|
||||
Watches watches;
|
||||
std::mutex watches_mutex;
|
||||
|
||||
std::thread send_thread;
|
||||
std::thread receive_thread;
|
||||
ThreadFromGlobalPool send_thread;
|
||||
ThreadFromGlobalPool receive_thread;
|
||||
|
||||
void connect(
|
||||
const Addresses & addresses,
|
||||
|
@ -17,6 +17,9 @@
|
||||
#cmakedefine01 USE_HDFS
|
||||
#cmakedefine01 USE_XXHASH
|
||||
#cmakedefine01 USE_INTERNAL_LLVM_LIBRARY
|
||||
#cmakedefine01 USE_PROTOBUF
|
||||
#cmakedefine01 USE_CPUID
|
||||
#cmakedefine01 USE_CPUINFO
|
||||
|
||||
#cmakedefine01 CLICKHOUSE_SPLIT_BINARY
|
||||
#cmakedefine01 LLVM_HAS_RTTI
|
||||
|
@ -1,19 +1,20 @@
|
||||
#include <Common/getNumberOfPhysicalCPUCores.h>
|
||||
#include <thread>
|
||||
|
||||
#if defined(__x86_64__)
|
||||
|
||||
#include <libcpuid/libcpuid.h>
|
||||
#include <Common/Exception.h>
|
||||
|
||||
#include <Common/config.h>
|
||||
#if USE_CPUID
|
||||
# include <libcpuid/libcpuid.h>
|
||||
# include <Common/Exception.h>
|
||||
namespace DB { namespace ErrorCodes { extern const int CPUID_ERROR; }}
|
||||
|
||||
#elif USE_CPUINFO
|
||||
# include <cpuinfo.h>
|
||||
#endif
|
||||
|
||||
|
||||
|
||||
unsigned getNumberOfPhysicalCPUCores()
|
||||
{
|
||||
#if defined(__x86_64__)
|
||||
#if USE_CPUID
|
||||
cpu_raw_data_t raw_data;
|
||||
if (0 != cpuid_get_raw_data(&raw_data))
|
||||
throw DB::Exception("Cannot cpuid_get_raw_data: " + std::string(cpuid_error()), DB::ErrorCodes::CPUID_ERROR);
|
||||
@ -37,6 +38,13 @@ unsigned getNumberOfPhysicalCPUCores()
|
||||
|
||||
if (res != 0)
|
||||
return res;
|
||||
#elif USE_CPUINFO
|
||||
uint32_t cores = 0;
|
||||
if (cpuinfo_initialize())
|
||||
cores = cpuinfo_get_cores_count();
|
||||
|
||||
if (cores)
|
||||
return cores;
|
||||
#endif
|
||||
|
||||
/// As a fallback (also for non-x86 architectures) assume there is no hyper-threading on the system.
|
||||
|
@ -53,6 +53,13 @@ target_link_libraries (thread_creation_latency PRIVATE clickhouse_common_io)
|
||||
add_executable (thread_pool thread_pool.cpp)
|
||||
target_link_libraries (thread_pool PRIVATE clickhouse_common_io)
|
||||
|
||||
add_executable (thread_pool_2 thread_pool_2.cpp)
|
||||
target_link_libraries (thread_pool_2 PRIVATE clickhouse_common_io)
|
||||
|
||||
add_executable (multi_version multi_version.cpp)
|
||||
target_link_libraries (multi_version PRIVATE clickhouse_common_io)
|
||||
add_check(multi_version)
|
||||
|
||||
add_executable (array_cache array_cache.cpp)
|
||||
target_link_libraries (array_cache PRIVATE clickhouse_common_io)
|
||||
|
||||
|
@ -8,7 +8,7 @@
|
||||
#include <Common/RWLock.h>
|
||||
#include <Common/Stopwatch.h>
|
||||
#include <common/Types.h>
|
||||
#include <common/ThreadPool.h>
|
||||
#include <Common/ThreadPool.h>
|
||||
#include <random>
|
||||
#include <pcg_random.hpp>
|
||||
#include <thread>
|
||||
|
@ -1,8 +1,8 @@
|
||||
#include <string.h>
|
||||
#include <iostream>
|
||||
#include <common/ThreadPool.h>
|
||||
#include <Common/ThreadPool.h>
|
||||
#include <functional>
|
||||
#include <common/MultiVersion.h>
|
||||
#include <Common/MultiVersion.h>
|
||||
#include <Poco/Exception.h>
|
||||
|
||||
|
||||
@ -23,7 +23,7 @@ void thread2(MV & x, const char * result)
|
||||
}
|
||||
|
||||
|
||||
int main(int argc, char ** argv)
|
||||
int main(int, char **)
|
||||
{
|
||||
try
|
||||
{
|
@ -16,7 +16,7 @@
|
||||
#include <Compression/CompressedReadBuffer.h>
|
||||
|
||||
#include <Common/Stopwatch.h>
|
||||
#include <common/ThreadPool.h>
|
||||
#include <Common/ThreadPool.h>
|
||||
|
||||
|
||||
using Key = UInt64;
|
||||
|
@ -16,7 +16,7 @@
|
||||
#include <Compression/CompressedReadBuffer.h>
|
||||
|
||||
#include <Common/Stopwatch.h>
|
||||
#include <common/ThreadPool.h>
|
||||
#include <Common/ThreadPool.h>
|
||||
|
||||
|
||||
using Key = UInt64;
|
||||
|
@ -5,7 +5,7 @@
|
||||
|
||||
#include <Common/Stopwatch.h>
|
||||
#include <Common/Exception.h>
|
||||
#include <common/ThreadPool.h>
|
||||
#include <Common/ThreadPool.h>
|
||||
|
||||
|
||||
int x = 0;
|
||||
|
@ -1,4 +1,4 @@
|
||||
#include <common/ThreadPool.h>
|
||||
#include <Common/ThreadPool.h>
|
||||
|
||||
/** Reproduces bug in ThreadPool.
|
||||
* It gets stuck if we call 'wait' many times from many other threads simultaneously.
|
||||
|
21
dbms/src/Common/tests/thread_pool_2.cpp
Normal file
@ -0,0 +1,21 @@
|
||||
#include <atomic>
|
||||
#include <iostream>
|
||||
#include <Common/ThreadPool.h>
|
||||
|
||||
|
||||
int main(int, char **)
|
||||
{
|
||||
std::atomic<size_t> res{0};
|
||||
|
||||
for (size_t i = 0; i < 1000; ++i)
|
||||
{
|
||||
size_t threads = 16;
|
||||
ThreadPool pool(threads);
|
||||
for (size_t j = 0; j < threads; ++j)
|
||||
pool.schedule([&]{ ++res; });
|
||||
pool.wait();
|
||||
}
|
||||
|
||||
std::cerr << res << "\n";
|
||||
return 0;
|
||||
}
|
@ -161,9 +161,9 @@ BackgroundSchedulePool::BackgroundSchedulePool(size_t size)
|
||||
|
||||
threads.resize(size);
|
||||
for (auto & thread : threads)
|
||||
thread = std::thread([this] { threadFunction(); });
|
||||
thread = ThreadFromGlobalPool([this] { threadFunction(); });
|
||||
|
||||
delayed_thread = std::thread([this] { delayExecutionThreadFunction(); });
|
||||
delayed_thread = ThreadFromGlobalPool([this] { delayExecutionThreadFunction(); });
|
||||
}
|
||||
|
||||
|
||||
@ -181,7 +181,7 @@ BackgroundSchedulePool::~BackgroundSchedulePool()
|
||||
delayed_thread.join();
|
||||
|
||||
LOG_TRACE(&Logger::get("BackgroundSchedulePool"), "Waiting for threads to finish.");
|
||||
for (std::thread & thread : threads)
|
||||
for (auto & thread : threads)
|
||||
thread.join();
|
||||
}
|
||||
catch (...)
|
||||
|
@ -13,6 +13,8 @@
|
||||
#include <boost/noncopyable.hpp>
|
||||
#include <Common/ZooKeeper/Types.h>
|
||||
#include <Common/CurrentThread.h>
|
||||
#include <Common/ThreadPool.h>
|
||||
|
||||
|
||||
namespace DB
|
||||
{
|
||||
@ -119,7 +121,7 @@ public:
|
||||
~BackgroundSchedulePool();
|
||||
|
||||
private:
|
||||
using Threads = std::vector<std::thread>;
|
||||
using Threads = std::vector<ThreadFromGlobalPool>;
|
||||
|
||||
void threadFunction();
|
||||
void delayExecutionThreadFunction();
|
||||
@ -141,7 +143,7 @@ private:
|
||||
std::condition_variable wakeup_cond;
|
||||
std::mutex delayed_tasks_mutex;
|
||||
/// Thread waiting for next delayed task.
|
||||
std::thread delayed_thread;
|
||||
ThreadFromGlobalPool delayed_thread;
|
||||
/// Tasks ordered by scheduled time.
|
||||
DelayedTasks delayed_tasks;
|
||||
|
||||
|
@ -427,6 +427,18 @@ Names Block::getNames() const
|
||||
}
|
||||
|
||||
|
||||
DataTypes Block::getDataTypes() const
|
||||
{
|
||||
DataTypes res;
|
||||
res.reserve(columns());
|
||||
|
||||
for (const auto & elem : data)
|
||||
res.push_back(elem.type);
|
||||
|
||||
return res;
|
||||
}
|
||||
|
||||
|
||||
template <typename ReturnType>
|
||||
static ReturnType checkBlockStructure(const Block & lhs, const Block & rhs, const std::string & context_description)
|
||||
{
|
||||
|
@ -82,6 +82,7 @@ public:
|
||||
const ColumnsWithTypeAndName & getColumnsWithTypeAndName() const;
|
||||
NamesAndTypesList getNamesAndTypesList() const;
|
||||
Names getNames() const;
|
||||
DataTypes getDataTypes() const;
|
||||
|
||||
/// Returns number of rows from first column in block, not equal to nullptr. If no columns, returns 0.
|
||||
size_t rows() const;
|
||||
|
@ -166,3 +166,20 @@ template <> constexpr bool IsDecimalNumber<Decimal64> = true;
|
||||
template <> constexpr bool IsDecimalNumber<Decimal128> = true;
|
||||
|
||||
}
|
||||
|
||||
/// Specialization of `std::hash` for the Decimal<T> types.
|
||||
namespace std
|
||||
{
|
||||
template <typename T>
|
||||
struct hash<DB::Decimal<T>> { size_t operator()(const DB::Decimal<T> & x) const { return hash<T>()(x.value); } };
|
||||
|
||||
template <>
|
||||
struct hash<DB::Decimal128>
|
||||
{
|
||||
size_t operator()(const DB::Decimal128 & x) const
|
||||
{
|
||||
return std::hash<DB::Int64>()(x.value >> 64)
|
||||
^ std::hash<DB::Int64>()(x.value & std::numeric_limits<DB::UInt64>::max());
|
||||
}
|
||||
};
|
||||
}
|
||||
|
@ -35,7 +35,7 @@ void AsynchronousBlockInputStream::next()
|
||||
{
|
||||
ready.reset();
|
||||
|
||||
pool.schedule([this, thread_group=CurrentThread::getGroup()] ()
|
||||
pool.schedule([this, thread_group = CurrentThread::getGroup()] ()
|
||||
{
|
||||
CurrentMetrics::Increment metric_increment{CurrentMetrics::QueryThread};
|
||||
|
||||
|
@ -5,7 +5,7 @@
|
||||
#include <DataStreams/IBlockInputStream.h>
|
||||
#include <Common/setThreadName.h>
|
||||
#include <Common/CurrentMetrics.h>
|
||||
#include <common/ThreadPool.h>
|
||||
#include <Common/ThreadPool.h>
|
||||
#include <Common/MemoryTracker.h>
|
||||
#include <Poco/Ext/ThreadNumber.h>
|
||||
|
||||
|
@ -195,7 +195,7 @@ void MergingAggregatedMemoryEfficientBlockInputStream::start()
|
||||
*/
|
||||
|
||||
for (size_t i = 0; i < merging_threads; ++i)
|
||||
pool.schedule([this, thread_group=CurrentThread::getGroup()] () { mergeThread(thread_group); });
|
||||
pool.schedule([this, thread_group = CurrentThread::getGroup()] () { mergeThread(thread_group); });
|
||||
}
|
||||
}
|
||||
|
||||
|
@ -4,7 +4,7 @@
|
||||
#include <DataStreams/IBlockInputStream.h>
|
||||
#include <Common/ConcurrentBoundedQueue.h>
|
||||
#include <Common/CurrentThread.h>
|
||||
#include <common/ThreadPool.h>
|
||||
#include <Common/ThreadPool.h>
|
||||
#include <condition_variable>
|
||||
|
||||
|
||||
|
@ -13,6 +13,7 @@
|
||||
#include <Common/CurrentMetrics.h>
|
||||
#include <Common/MemoryTracker.h>
|
||||
#include <Common/CurrentThread.h>
|
||||
#include <Common/ThreadPool.h>
|
||||
|
||||
|
||||
/** Allows to process multiple block input streams (sources) in parallel, using specified number of threads.
|
||||
@ -303,8 +304,8 @@ private:
|
||||
|
||||
Handler & handler;
|
||||
|
||||
/// Streams.
|
||||
using ThreadsData = std::vector<std::thread>;
|
||||
/// Threads.
|
||||
using ThreadsData = std::vector<ThreadFromGlobalPool>;
|
||||
ThreadsData threads;
|
||||
|
||||
/** A set of available sources that are not currently processed by any thread.
|
||||
|
@ -5,7 +5,7 @@
|
||||
#include <Common/CurrentThread.h>
|
||||
#include <Common/setThreadName.h>
|
||||
#include <Common/getNumberOfPhysicalCPUCores.h>
|
||||
#include <common/ThreadPool.h>
|
||||
#include <Common/ThreadPool.h>
|
||||
#include <Storages/MergeTree/ReplicatedMergeTreeBlockOutputStream.h>
|
||||
|
||||
namespace DB
|
||||
|
@ -181,7 +181,7 @@ SummingSortedBlockInputStream::SummingSortedBlockInputStream(
|
||||
if (map_desc.key_col_nums.size() == 1)
|
||||
{
|
||||
// Create summation for all value columns in the map
|
||||
desc.init("sumMap", argument_types);
|
||||
desc.init("sumMapWithOverflow", argument_types);
|
||||
columns_to_aggregate.emplace_back(std::move(desc));
|
||||
}
|
||||
else
|
||||
@ -220,7 +220,7 @@ void SummingSortedBlockInputStream::insertCurrentRowIfNeeded(MutableColumns & me
|
||||
}
|
||||
else
|
||||
{
|
||||
/// It is sumMap aggregate function.
|
||||
/// It is sumMapWithOverflow aggregate function.
|
||||
/// Assume that the row isn't empty in this case (just because it is compatible with previous version)
|
||||
current_row_is_zero = false;
|
||||
}
|
||||
|
@ -70,7 +70,7 @@ private:
|
||||
/// Stores aggregation function, state, and columns to be used as function arguments
|
||||
struct AggregateDescription
|
||||
{
|
||||
/// An aggregate function 'sumWithOverflow' or 'sumMap' for summing.
|
||||
/// An aggregate function 'sumWithOverflow' or 'sumMapWithOverflow' for summing.
|
||||
AggregateFunctionPtr function;
|
||||
IAggregateFunction::AddFunc add_function = nullptr;
|
||||
std::vector<size_t> column_numbers;
|
||||
|
@ -9,6 +9,7 @@
|
||||
#include <Common/AlignedBuffer.h>
|
||||
|
||||
#include <Formats/FormatSettings.h>
|
||||
#include <Formats/ProtobufWriter.h>
|
||||
#include <DataTypes/DataTypeAggregateFunction.h>
|
||||
#include <DataTypes/DataTypeFactory.h>
|
||||
|
||||
@ -248,6 +249,12 @@ void DataTypeAggregateFunction::deserializeTextCSV(IColumn & column, ReadBuffer
|
||||
}
|
||||
|
||||
|
||||
void DataTypeAggregateFunction::serializeProtobuf(const IColumn & column, size_t row_num, ProtobufWriter & protobuf) const
|
||||
{
|
||||
protobuf.writeAggregateFunction(function, static_cast<const ColumnAggregateFunction &>(column).getData()[row_num]);
|
||||
}
|
||||
|
||||
|
||||
MutableColumnPtr DataTypeAggregateFunction::createColumn() const
|
||||
{
|
||||
return ColumnAggregateFunction::create(function);
|
||||
|
@ -56,6 +56,7 @@ public:
|
||||
void serializeTextXML(const IColumn & column, size_t row_num, WriteBuffer & ostr, const FormatSettings &) const override;
|
||||
void serializeTextCSV(const IColumn & column, size_t row_num, WriteBuffer & ostr, const FormatSettings &) const override;
|
||||
void deserializeTextCSV(IColumn & column, ReadBuffer & istr, const FormatSettings & settings) const override;
|
||||
void serializeProtobuf(const IColumn & column, size_t row_num, ProtobufWriter & protobuf) const override;
|
||||
|
||||
MutableColumnPtr createColumn() const override;
|
||||
|
||||
|
@ -430,6 +430,18 @@ void DataTypeArray::deserializeTextCSV(IColumn & column, ReadBuffer & istr, cons
|
||||
}
|
||||
|
||||
|
||||
void DataTypeArray::serializeProtobuf(const IColumn & column, size_t row_num, ProtobufWriter & protobuf) const
|
||||
{
|
||||
const ColumnArray & column_array = static_cast<const ColumnArray &>(column);
|
||||
const ColumnArray::Offsets & offsets = column_array.getOffsets();
|
||||
size_t offset = offsets[row_num - 1];
|
||||
size_t next_offset = offsets[row_num];
|
||||
const IColumn & nested_column = column_array.getData();
|
||||
for (size_t i = offset; i < next_offset; ++i)
|
||||
nested->serializeProtobuf(nested_column, i, protobuf);
|
||||
}
|
||||
|
||||
|
||||
MutableColumnPtr DataTypeArray::createColumn() const
|
||||
{
|
||||
return ColumnArray::create(nested->createColumn(), ColumnArray::ColumnOffsets::create());
|
||||
|
@ -84,6 +84,10 @@ public:
|
||||
DeserializeBinaryBulkSettings & settings,
|
||||
DeserializeBinaryBulkStatePtr & state) const override;
|
||||
|
||||
void serializeProtobuf(const IColumn & column,
|
||||
size_t row_num,
|
||||
ProtobufWriter & protobuf) const override;
|
||||
|
||||
MutableColumnPtr createColumn() const override;
|
||||
|
||||
Field getDefault() const override;
|
||||
|
@ -4,6 +4,7 @@
|
||||
#include <Columns/ColumnsNumber.h>
|
||||
#include <DataTypes/DataTypeDate.h>
|
||||
#include <DataTypes/DataTypeFactory.h>
|
||||
#include <Formats/ProtobufWriter.h>
|
||||
|
||||
|
||||
namespace DB
|
||||
@ -72,6 +73,11 @@ void DataTypeDate::deserializeTextCSV(IColumn & column, ReadBuffer & istr, const
|
||||
static_cast<ColumnUInt16 &>(column).getData().push_back(value.getDayNum());
|
||||
}
|
||||
|
||||
void DataTypeDate::serializeProtobuf(const IColumn & column, size_t row_num, ProtobufWriter & protobuf) const
|
||||
{
|
||||
protobuf.writeDate(DayNum(static_cast<const ColumnUInt16 &>(column).getData()[row_num]));
|
||||
}
|
||||
|
||||
bool DataTypeDate::equals(const IDataType & rhs) const
|
||||
{
|
||||
return typeid(rhs) == typeid(*this);
|
||||
|
Some files were not shown because too many files have changed in this diff.