Merge branch 'master' of github.com:PerformanceVision/ClickHouse into ignore_scheme

This commit is contained in:
Guillaume Tassery 2019-05-06 17:20:25 +07:00
commit 24dce70207
1014 changed files with 24915 additions and 15344 deletions

View File

@ -1,3 +1,155 @@
## ClickHouse release 19.5.3.8, 2019-04-18
### Bug fixes
* Fixed type of setting `max_partitions_per_insert_block` from boolean to UInt64. [#5028](https://github.com/yandex/ClickHouse/pull/5028) ([Mohammad Hossein Sekhavat](https://github.com/mhsekhavat))
## ClickHouse release 19.5.2.6, 2019-04-15
### New Features
* [Hyperscan](https://github.com/intel/hyperscan) multiple regular expression matching was added (functions `multiMatchAny`, `multiMatchAnyIndex`, `multiFuzzyMatchAny`, `multiFuzzyMatchAnyIndex`). [#4780](https://github.com/yandex/ClickHouse/pull/4780), [#4841](https://github.com/yandex/ClickHouse/pull/4841) ([Danila Kutenin](https://github.com/danlark1))
* `multiSearchFirstPosition` function was added. [#4780](https://github.com/yandex/ClickHouse/pull/4780) ([Danila Kutenin](https://github.com/danlark1))
* Implement the predefined expression filter per row for tables. [#4792](https://github.com/yandex/ClickHouse/pull/4792) ([Ivan](https://github.com/abyss7))
* A new type of data skipping indices based on bloom filters (can be used for `equal`, `in` and `like` functions). [#4499](https://github.com/yandex/ClickHouse/pull/4499) ([Nikita Vasilev](https://github.com/nikvas0))
* Added `ASOF JOIN` which allows to run queries that join to the most recent value known. [#4774](https://github.com/yandex/ClickHouse/pull/4774) [#4867](https://github.com/yandex/ClickHouse/pull/4867) [#4863](https://github.com/yandex/ClickHouse/pull/4863) [#4875](https://github.com/yandex/ClickHouse/pull/4875) ([Martijn Bakker](https://github.com/Gladdy), [Artem Zuikov](https://github.com/4ertus2))
* Rewrite multiple `COMMA JOIN` to `CROSS JOIN`. Then rewrite them to `INNER JOIN` if possible. [#4661](https://github.com/yandex/ClickHouse/pull/4661) ([Artem Zuikov](https://github.com/4ertus2))
### Improvement
* `topK` and `topKWeighted` now supports custom `loadFactor` (fixes issue [#4252](https://github.com/yandex/ClickHouse/issues/4252)). [#4634](https://github.com/yandex/ClickHouse/pull/4634) ([Kirill Danshin](https://github.com/kirillDanshin))
* Allow to use `parallel_replicas_count > 1` even for tables without sampling (the setting is simply ignored for them). In previous versions it was lead to exception. [#4637](https://github.com/yandex/ClickHouse/pull/4637) ([Alexey Elymanov](https://github.com/digitalist))
* Support for `CREATE OR REPLACE VIEW`. Allow to create a view or set a new definition in a single statement. [#4654](https://github.com/yandex/ClickHouse/pull/4654) ([Boris Granveaud](https://github.com/bgranvea))
* `Buffer` table engine now supports `PREWHERE`. [#4671](https://github.com/yandex/ClickHouse/pull/4671) ([Yangkuan Liu](https://github.com/LiuYangkuan))
* Add ability to start replicated table without metadata in zookeeper in `readonly` mode. [#4691](https://github.com/yandex/ClickHouse/pull/4691) ([alesapin](https://github.com/alesapin))
* Fixed flicker of progress bar in clickhouse-client. The issue was most noticeable when using `FORMAT Null` with streaming queries. [#4811](https://github.com/yandex/ClickHouse/pull/4811) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Allow to disable functions with `hyperscan` library on per user basis to limit potentially excessive and uncontrolled resource usage. [#4816](https://github.com/yandex/ClickHouse/pull/4816) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Add version number logging in all errors. [#4824](https://github.com/yandex/ClickHouse/pull/4824) ([proller](https://github.com/proller))
* Added restriction to the `multiMatch` functions which requires string size to fit into `unsigned int`. Also added the number of arguments limit to the `multiSearch` functions. [#4834](https://github.com/yandex/ClickHouse/pull/4834) ([Danila Kutenin](https://github.com/danlark1))
* Improved usage of scratch space and error handling in Hyperscan. [#4866](https://github.com/yandex/ClickHouse/pull/4866) ([Danila Kutenin](https://github.com/danlark1))
* Fill `system.graphite_detentions` from a table config of `*GraphiteMergeTree` engine tables. [#4584](https://github.com/yandex/ClickHouse/pull/4584) ([Mikhail f. Shiryaev](https://github.com/Felixoid))
* Rename `trigramDistance` function to `ngramDistance` and add more functions with `CaseInsensitive` and `UTF`. [#4602](https://github.com/yandex/ClickHouse/pull/4602) ([Danila Kutenin](https://github.com/danlark1))
* Improved data skipping indices calculation. [#4640](https://github.com/yandex/ClickHouse/pull/4640) ([Nikita Vasilev](https://github.com/nikvas0))
* Keep ordinary, `DEFAULT`, `MATERIALIZED` and `ALIAS` columns in a single list (fixes issue [#2867](https://github.com/yandex/ClickHouse/issues/2867)). [#4707](https://github.com/yandex/ClickHouse/pull/4707) ([Alex Zatelepin](https://github.com/ztlpn))
### Bug Fix
* Avoid `std::terminate` in case of memory allocation failure. Now `std::bad_alloc` exception is thrown as expected. [#4665](https://github.com/yandex/ClickHouse/pull/4665) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fixes capnproto reading from buffer. Sometimes files wasn't loaded successfully by HTTP. [#4674](https://github.com/yandex/ClickHouse/pull/4674) ([Vladislav](https://github.com/smirnov-vs))
* Fix error `Unknown log entry type: 0` after `OPTIMIZE TABLE FINAL` query. [#4683](https://github.com/yandex/ClickHouse/pull/4683) ([Amos Bird](https://github.com/amosbird))
* Wrong arguments to `hasAny` or `hasAll` functions may lead to segfault. [#4698](https://github.com/yandex/ClickHouse/pull/4698) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Deadlock may happen while executing `DROP DATABASE dictionary` query. [#4701](https://github.com/yandex/ClickHouse/pull/4701) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fix undefinied behavior in `median` and `quantile` functions. [#4702](https://github.com/yandex/ClickHouse/pull/4702) ([hcz](https://github.com/hczhcz))
* Fix compression level detection when `network_compression_method` in lowercase. Broken in v19.1. [#4706](https://github.com/yandex/ClickHouse/pull/4706) ([proller](https://github.com/proller))
* Fixed ignorance of `<timezone>UTC</timezone>` setting (fixes issue [#4658](https://github.com/yandex/ClickHouse/issues/4658)). [#4718](https://github.com/yandex/ClickHouse/pull/4718) ([proller](https://github.com/proller))
* Fix `histogram` function behaviour with `Distributed` tables. [#4741](https://github.com/yandex/ClickHouse/pull/4741) ([olegkv](https://github.com/olegkv))
* Fixed tsan report `destroy of a locked mutex`. [#4742](https://github.com/yandex/ClickHouse/pull/4742) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fixed TSan report on shutdown due to race condition in system logs usage. Fixed potential use-after-free on shutdown when part_log is enabled. [#4758](https://github.com/yandex/ClickHouse/pull/4758) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fix recheck parts in `ReplicatedMergeTreeAlterThread` in case of error. [#4772](https://github.com/yandex/ClickHouse/pull/4772) ([Nikolai Kochetov](https://github.com/KochetovNicolai))
* Arithmetic operations on intermediate aggregate function states were not working for constant arguments (such as subquery results). [#4776](https://github.com/yandex/ClickHouse/pull/4776) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Always backquote column names in metadata. Otherwise it's impossible to create a table with column named `index` (server won't restart due to malformed `ATTACH` query in metadata). [#4782](https://github.com/yandex/ClickHouse/pull/4782) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fix crash in `ALTER ... MODIFY ORDER BY` on `Distributed` table. [#4790](https://github.com/yandex/ClickHouse/pull/4790) ([TCeason](https://github.com/TCeason))
* Fix segfault in `JOIN ON` with enabled `enable_optimize_predicate_expression`. [#4794](https://github.com/yandex/ClickHouse/pull/4794) ([Winter Zhang](https://github.com/zhang2014))
* Fix bug with adding an extraneous row after consuming a protobuf message from Kafka. [#4808](https://github.com/yandex/ClickHouse/pull/4808) ([Vitaly Baranov](https://github.com/vitlibar))
* Fix crash of `JOIN` on not-nullable vs nullable column. Fix `NULLs` in right keys in `ANY JOIN` + `join_use_nulls`. [#4815](https://github.com/yandex/ClickHouse/pull/4815) ([Artem Zuikov](https://github.com/4ertus2))
* Fix segmentation fault in `clickhouse-copier`. [#4835](https://github.com/yandex/ClickHouse/pull/4835) ([proller](https://github.com/proller))
* Fixed race condition in `SELECT` from `system.tables` if the table is renamed or altered concurrently. [#4836](https://github.com/yandex/ClickHouse/pull/4836) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fixed data race when fetching data part that is already obsolete. [#4839](https://github.com/yandex/ClickHouse/pull/4839) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fixed rare data race that can happen during `RENAME` table of MergeTree family. [#4844](https://github.com/yandex/ClickHouse/pull/4844) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fixed segmentation fault in function `arrayIntersect`. Segmentation fault could happen if function was called with mixed constant and ordinary arguments. [#4847](https://github.com/yandex/ClickHouse/pull/4847) ([Lixiang Qian](https://github.com/fancyqlx))
* Fixed reading from `Array(LowCardinality)` column in rare case when column contained a long sequence of empty arrays. [#4850](https://github.com/yandex/ClickHouse/pull/4850) ([Nikolai Kochetov](https://github.com/KochetovNicolai))
* Fix crash in `FULL/RIGHT JOIN` when we joining on nullable vs not nullable. [#4855](https://github.com/yandex/ClickHouse/pull/4855) ([Artem Zuikov](https://github.com/4ertus2))
* Fix `No message received` exception while fetching parts between replicas. [#4856](https://github.com/yandex/ClickHouse/pull/4856) ([alesapin](https://github.com/alesapin))
* Fixed `arrayIntersect` function wrong result in case of several repeated values in single array. [#4871](https://github.com/yandex/ClickHouse/pull/4871) ([Nikolai Kochetov](https://github.com/KochetovNicolai))
* Fix a race condition during concurrent `ALTER COLUMN` queries that could lead to a server crash (fixes issue [#3421](https://github.com/yandex/ClickHouse/issues/3421)). [#4592](https://github.com/yandex/ClickHouse/pull/4592) ([Alex Zatelepin](https://github.com/ztlpn))
* Fix incorrect result in `FULL/RIGHT JOIN` with const column. [#4723](https://github.com/yandex/ClickHouse/pull/4723) ([Artem Zuikov](https://github.com/4ertus2))
* Fix duplicates in `GLOBAL JOIN` with asterisk. [#4705](https://github.com/yandex/ClickHouse/pull/4705) ([Artem Zuikov](https://github.com/4ertus2))
* Fix parameter deduction in `ALTER MODIFY` of column `CODEC` when column type is not specified. [#4883](https://github.com/yandex/ClickHouse/pull/4883) ([alesapin](https://github.com/alesapin))
* Functions `cutQueryStringAndFragment()` and `queryStringAndFragment()` now works correctly when `URL` contains a fragment and no query. [#4894](https://github.com/yandex/ClickHouse/pull/4894) ([Vitaly Baranov](https://github.com/vitlibar))
* Fix rare bug when setting `min_bytes_to_use_direct_io` is greater than zero, which occures when thread have to seek backward in column file. [#4897](https://github.com/yandex/ClickHouse/pull/4897) ([alesapin](https://github.com/alesapin))
* Fix wrong argument types for aggregate functions with `LowCardinality` arguments (fixes issue [#4919](https://github.com/yandex/ClickHouse/issues/4919)). [#4922](https://github.com/yandex/ClickHouse/pull/4922) ([Nikolai Kochetov](https://github.com/KochetovNicolai))
* Fix wrong name qualification in `GLOBAL JOIN`. [#4969](https://github.com/yandex/ClickHouse/pull/4969) ([Artem Zuikov](https://github.com/4ertus2))
* Fix function `toISOWeek` result for year 1970. [#4988](https://github.com/yandex/ClickHouse/pull/4988) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fix `DROP`, `TRUNCATE` and `OPTIMIZE` queries duplication, when executed on `ON CLUSTER` for `ReplicatedMergeTree*` tables family. [#4991](https://github.com/yandex/ClickHouse/pull/4991) ([alesapin](https://github.com/alesapin))
### Backward Incompatible Change
* Rename setting `insert_sample_with_metadata` to setting `input_format_defaults_for_omitted_fields`. [#4771](https://github.com/yandex/ClickHouse/pull/4771) ([Artem Zuikov](https://github.com/4ertus2))
* Added setting `max_partitions_per_insert_block` (with value 100 by default). If inserted block contains larger number of partitions, an exception is thrown. Set it to 0 if you want to remove the limit (not recommended). [#4845](https://github.com/yandex/ClickHouse/pull/4845) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Multi-search functions were renamed (`multiPosition` to `multiSearchAllPositions`, `multiSearch` to `multiSearchAny`, `firstMatch` to `multiSearchFirstIndex`). [#4780](https://github.com/yandex/ClickHouse/pull/4780) ([Danila Kutenin](https://github.com/danlark1))
### Performance Improvement
* Optimize Volnitsky searcher by inlining, giving about 5-10% search improvement for queries with many needles or many similar bigrams. [#4862](https://github.com/yandex/ClickHouse/pull/4862) ([Danila Kutenin](https://github.com/danlark1))
* Fix performance issue when setting `use_uncompressed_cache` is greater than zero, which appeared when all read data contained in cache. [#4913](https://github.com/yandex/ClickHouse/pull/4913) ([alesapin](https://github.com/alesapin))
### Build/Testing/Packaging Improvement
* Hardening debug build: more granular memory mappings and ASLR; add memory protection for mark cache and index. This allows to find more memory stomping bugs in case when ASan and MSan cannot do it. [#4632](https://github.com/yandex/ClickHouse/pull/4632) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Add support for cmake variables `ENABLE_PROTOBUF`, `ENABLE_PARQUET` and `ENABLE_BROTLI` which allows to enable/disable the above features (same as we can do for librdkafka, mysql, etc). [#4669](https://github.com/yandex/ClickHouse/pull/4669) ([Silviu Caragea](https://github.com/silviucpp))
* Add ability to print process list and stacktraces of all threads if some queries are hung after test run. [#4675](https://github.com/yandex/ClickHouse/pull/4675) ([alesapin](https://github.com/alesapin))
* Add retries on `Connection loss` error in `clickhouse-test`. [#4682](https://github.com/yandex/ClickHouse/pull/4682) ([alesapin](https://github.com/alesapin))
* Add freebsd build with vagrant and build with thread sanitizer to packager script. [#4712](https://github.com/yandex/ClickHouse/pull/4712) [#4748](https://github.com/yandex/ClickHouse/pull/4748) ([alesapin](https://github.com/alesapin))
* Now user asked for password for user `'default'` during installation. [#4725](https://github.com/yandex/ClickHouse/pull/4725) ([proller](https://github.com/proller))
* Suppress warning in `rdkafka` library. [#4740](https://github.com/yandex/ClickHouse/pull/4740) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Allow ability to build without ssl. [#4750](https://github.com/yandex/ClickHouse/pull/4750) ([proller](https://github.com/proller))
* Add a way to launch clickhouse-server image from a custom user. [#4753](https://github.com/yandex/ClickHouse/pull/4753) ([Mikhail f. Shiryaev](https://github.com/Felixoid))
* Upgrade contrib boost to 1.69. [#4793](https://github.com/yandex/ClickHouse/pull/4793) ([proller](https://github.com/proller))
* Disable usage of `mremap` when compiled with Thread Sanitizer. Surprisingly enough, TSan does not intercept `mremap` (though it does intercept `mmap`, `munmap`) that leads to false positives. Fixed TSan report in stateful tests. [#4859](https://github.com/yandex/ClickHouse/pull/4859) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Add test checking using format schema via HTTP interface. [#4864](https://github.com/yandex/ClickHouse/pull/4864) ([Vitaly Baranov](https://github.com/vitlibar))
## ClickHouse release 19.4.4.33, 2019-04-17
### Bug Fixes
* Avoid `std::terminate` in case of memory allocation failure. Now `std::bad_alloc` exception is thrown as expected. [#4665](https://github.com/yandex/ClickHouse/pull/4665) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fixes capnproto reading from buffer. Sometimes files wasn't loaded successfully by HTTP. [#4674](https://github.com/yandex/ClickHouse/pull/4674) ([Vladislav](https://github.com/smirnov-vs))
* Fix error `Unknown log entry type: 0` after `OPTIMIZE TABLE FINAL` query. [#4683](https://github.com/yandex/ClickHouse/pull/4683) ([Amos Bird](https://github.com/amosbird))
* Wrong arguments to `hasAny` or `hasAll` functions may lead to segfault. [#4698](https://github.com/yandex/ClickHouse/pull/4698) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Deadlock may happen while executing `DROP DATABASE dictionary` query. [#4701](https://github.com/yandex/ClickHouse/pull/4701) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fix undefinied behavior in `median` and `quantile` functions. [#4702](https://github.com/yandex/ClickHouse/pull/4702) ([hcz](https://github.com/hczhcz))
* Fix compression level detection when `network_compression_method` in lowercase. Broken in v19.1. [#4706](https://github.com/yandex/ClickHouse/pull/4706) ([proller](https://github.com/proller))
* Fixed ignorance of `<timezone>UTC</timezone>` setting (fixes issue [#4658](https://github.com/yandex/ClickHouse/issues/4658)). [#4718](https://github.com/yandex/ClickHouse/pull/4718) ([proller](https://github.com/proller))
* Fix `histogram` function behaviour with `Distributed` tables. [#4741](https://github.com/yandex/ClickHouse/pull/4741) ([olegkv](https://github.com/olegkv))
* Fixed tsan report `destroy of a locked mutex`. [#4742](https://github.com/yandex/ClickHouse/pull/4742) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fixed TSan report on shutdown due to race condition in system logs usage. Fixed potential use-after-free on shutdown when part_log is enabled. [#4758](https://github.com/yandex/ClickHouse/pull/4758) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fix recheck parts in `ReplicatedMergeTreeAlterThread` in case of error. [#4772](https://github.com/yandex/ClickHouse/pull/4772) ([Nikolai Kochetov](https://github.com/KochetovNicolai))
* Arithmetic operations on intermediate aggregate function states were not working for constant arguments (such as subquery results). [#4776](https://github.com/yandex/ClickHouse/pull/4776) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Always backquote column names in metadata. Otherwise it's impossible to create a table with column named `index` (server won't restart due to malformed `ATTACH` query in metadata). [#4782](https://github.com/yandex/ClickHouse/pull/4782) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fix crash in `ALTER ... MODIFY ORDER BY` on `Distributed` table. [#4790](https://github.com/yandex/ClickHouse/pull/4790) ([TCeason](https://github.com/TCeason))
* Fix segfault in `JOIN ON` with enabled `enable_optimize_predicate_expression`. [#4794](https://github.com/yandex/ClickHouse/pull/4794) ([Winter Zhang](https://github.com/zhang2014))
* Fix bug with adding an extraneous row after consuming a protobuf message from Kafka. [#4808](https://github.com/yandex/ClickHouse/pull/4808) ([Vitaly Baranov](https://github.com/vitlibar))
* Fix segmentation fault in `clickhouse-copier`. [#4835](https://github.com/yandex/ClickHouse/pull/4835) ([proller](https://github.com/proller))
* Fixed race condition in `SELECT` from `system.tables` if the table is renamed or altered concurrently. [#4836](https://github.com/yandex/ClickHouse/pull/4836) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fixed data race when fetching data part that is already obsolete. [#4839](https://github.com/yandex/ClickHouse/pull/4839) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fixed rare data race that can happen during `RENAME` table of MergeTree family. [#4844](https://github.com/yandex/ClickHouse/pull/4844) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fixed segmentation fault in function `arrayIntersect`. Segmentation fault could happen if function was called with mixed constant and ordinary arguments. [#4847](https://github.com/yandex/ClickHouse/pull/4847) ([Lixiang Qian](https://github.com/fancyqlx))
* Fixed reading from `Array(LowCardinality)` column in rare case when column contained a long sequence of empty arrays. [#4850](https://github.com/yandex/ClickHouse/pull/4850) ([Nikolai Kochetov](https://github.com/KochetovNicolai))
* Fix `No message received` exception while fetching parts between replicas. [#4856](https://github.com/yandex/ClickHouse/pull/4856) ([alesapin](https://github.com/alesapin))
* Fixed `arrayIntersect` function wrong result in case of several repeated values in single array. [#4871](https://github.com/yandex/ClickHouse/pull/4871) ([Nikolai Kochetov](https://github.com/KochetovNicolai))
* Fix a race condition during concurrent `ALTER COLUMN` queries that could lead to a server crash (fixes issue [#3421](https://github.com/yandex/ClickHouse/issues/3421)). [#4592](https://github.com/yandex/ClickHouse/pull/4592) ([Alex Zatelepin](https://github.com/ztlpn))
* Fix parameter deduction in `ALTER MODIFY` of column `CODEC` when column type is not specified. [#4883](https://github.com/yandex/ClickHouse/pull/4883) ([alesapin](https://github.com/alesapin))
* Functions `cutQueryStringAndFragment()` and `queryStringAndFragment()` now works correctly when `URL` contains a fragment and no query. [#4894](https://github.com/yandex/ClickHouse/pull/4894) ([Vitaly Baranov](https://github.com/vitlibar))
* Fix rare bug when setting `min_bytes_to_use_direct_io` is greater than zero, which occures when thread have to seek backward in column file. [#4897](https://github.com/yandex/ClickHouse/pull/4897) ([alesapin](https://github.com/alesapin))
* Fix wrong argument types for aggregate functions with `LowCardinality` arguments (fixes issue [#4919](https://github.com/yandex/ClickHouse/issues/4919)). [#4922](https://github.com/yandex/ClickHouse/pull/4922) ([Nikolai Kochetov](https://github.com/KochetovNicolai))
* Fix function `toISOWeek` result for year 1970. [#4988](https://github.com/yandex/ClickHouse/pull/4988) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fix `DROP`, `TRUNCATE` and `OPTIMIZE` queries duplication, when executed on `ON CLUSTER` for `ReplicatedMergeTree*` tables family. [#4991](https://github.com/yandex/ClickHouse/pull/4991) ([alesapin](https://github.com/alesapin))
### Improvements
* Keep ordinary, `DEFAULT`, `MATERIALIZED` and `ALIAS` columns in a single list (fixes issue [#2867](https://github.com/yandex/ClickHouse/issues/2867)). [#4707](https://github.com/yandex/ClickHouse/pull/4707) ([Alex Zatelepin](https://github.com/ztlpn))
## ClickHouse release 19.4.3.11, 2019-04-02
### Bug Fixes
* Fix crash in `FULL/RIGHT JOIN` when we joining on nullable vs not nullable. [#4855](https://github.com/yandex/ClickHouse/pull/4855) ([Artem Zuikov](https://github.com/4ertus2))
* Fix segmentation fault in `clickhouse-copier`. [#4835](https://github.com/yandex/ClickHouse/pull/4835) ([proller](https://github.com/proller))
### Build/Testing/Packaging Improvement
* Add a way to launch clickhouse-server image from a custom user. [#4753](https://github.com/yandex/ClickHouse/pull/4753) ([Mikhail f. Shiryaev](https://github.com/Felixoid))
## ClickHouse release 19.4.2.7, 2019-03-30
### Bug Fixes
@ -13,11 +165,11 @@
### New Features
* Added full support for `Protobuf` format (input and output, nested data structures). [#4174](https://github.com/yandex/ClickHouse/pull/4174) [#4493](https://github.com/yandex/ClickHouse/pull/4493) ([Vitaly Baranov](https://github.com/vitlibar))
* Added bitmap functions with Roaring Bitmaps. [#4207](https://github.com/yandex/ClickHouse/pull/4207) ([Andy Yang](https://github.com/andyyzh)) [#4568](https://github.com/yandex/ClickHouse/pull/4568) ([Vitaly Baranov](https://github.com/vitlibar))
* Parquet format support [#4448](https://github.com/yandex/ClickHouse/pull/4448) ([proller](https://github.com/proller))
* Parquet format support. [#4448](https://github.com/yandex/ClickHouse/pull/4448) ([proller](https://github.com/proller))
* N-gram distance was added for fuzzy string comparison. It is similar to q-gram metrics in R language. [#4466](https://github.com/yandex/ClickHouse/pull/4466) ([Danila Kutenin](https://github.com/danlark1))
* Combine rules for graphite rollup from dedicated aggregation and retention patterns. [#4426](https://github.com/yandex/ClickHouse/pull/4426) ([Mikhail f. Shiryaev](https://github.com/Felixoid))
* Added `max_execution_speed` and `max_execution_speed_bytes` to limit resource usage. Added `min_execution_speed_bytes` setting to complement the `min_execution_speed`. [#4430](https://github.com/yandex/ClickHouse/pull/4430) ([Winter Zhang](https://github.com/zhang2014))
* Implemented function `flatten` [#4555](https://github.com/yandex/ClickHouse/pull/4555) [#4409](https://github.com/yandex/ClickHouse/pull/4409) ([alexey-milovidov](https://github.com/alexey-milovidov), [kzon](https://github.com/kzon))
* Implemented function `flatten`. [#4555](https://github.com/yandex/ClickHouse/pull/4555) [#4409](https://github.com/yandex/ClickHouse/pull/4409) ([alexey-milovidov](https://github.com/alexey-milovidov), [kzon](https://github.com/kzon))
* Added functions `arrayEnumerateDenseRanked` and `arrayEnumerateUniqRanked` (it's like `arrayEnumerateUniq` but allows to fine tune array depth to look inside multidimensional arrays). [#4475](https://github.com/yandex/ClickHouse/pull/4475) ([proller](https://github.com/proller)) [#4601](https://github.com/yandex/ClickHouse/pull/4601) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Multiple JOINS with some restrictions: no asterisks, no complex aliases in ON/WHERE/GROUP BY/... [#4462](https://github.com/yandex/ClickHouse/pull/4462) ([Artem Zuikov](https://github.com/4ertus2))
@ -26,25 +178,25 @@
* Fixed bug in data skipping indices: order of granules after INSERT was incorrect. [#4407](https://github.com/yandex/ClickHouse/pull/4407) ([Nikita Vasilev](https://github.com/nikvas0))
* Fixed `set` index for `Nullable` and `LowCardinality` columns. Before it, `set` index with `Nullable` or `LowCardinality` column led to error `Data type must be deserialized with multiple streams` while selecting. [#4594](https://github.com/yandex/ClickHouse/pull/4594) ([Nikolai Kochetov](https://github.com/KochetovNicolai))
* Correctly set update_time on full `executable` dictionary update. [#4551](https://github.com/yandex/ClickHouse/pull/4551) ([Tema Novikov](https://github.com/temoon))
* Fix broken progress bar in 19.3 [#4627](https://github.com/yandex/ClickHouse/pull/4627) ([filimonov](https://github.com/filimonov))
* Fix broken progress bar in 19.3. [#4627](https://github.com/yandex/ClickHouse/pull/4627) ([filimonov](https://github.com/filimonov))
* Fixed inconsistent values of MemoryTracker when memory region was shrinked, in certain cases. [#4619](https://github.com/yandex/ClickHouse/pull/4619) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fixed undefined behaviour in ThreadPool [#4612](https://github.com/yandex/ClickHouse/pull/4612) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fixed undefined behaviour in ThreadPool. [#4612](https://github.com/yandex/ClickHouse/pull/4612) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fixed a very rare crash with the message `mutex lock failed: Invalid argument` that could happen when a MergeTree table was dropped concurrently with a SELECT. [#4608](https://github.com/yandex/ClickHouse/pull/4608) ([Alex Zatelepin](https://github.com/ztlpn))
* ODBC driver compatibility with `LowCardinality` data type [#4381](https://github.com/yandex/ClickHouse/pull/4381) ([proller](https://github.com/proller))
* FreeBSD: Fixup for `AIOcontextPool: Found io_event with unknown id 0` error [#4438](https://github.com/yandex/ClickHouse/pull/4438) ([urgordeadbeef](https://github.com/urgordeadbeef))
* ODBC driver compatibility with `LowCardinality` data type. [#4381](https://github.com/yandex/ClickHouse/pull/4381) ([proller](https://github.com/proller))
* FreeBSD: Fixup for `AIOcontextPool: Found io_event with unknown id 0` error. [#4438](https://github.com/yandex/ClickHouse/pull/4438) ([urgordeadbeef](https://github.com/urgordeadbeef))
* `system.part_log` table was created regardless to configuration. [#4483](https://github.com/yandex/ClickHouse/pull/4483) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fix undefined behaviour in `dictIsIn` function for cache dictionaries. [#4515](https://github.com/yandex/ClickHouse/pull/4515) ([alesapin](https://github.com/alesapin))
* Fixed a deadlock when a SELECT query locks the same table multiple times (e.g. from different threads or when executing multiple subqueries) and there is a concurrent DDL query. [#4535](https://github.com/yandex/ClickHouse/pull/4535) ([Alex Zatelepin](https://github.com/ztlpn))
* Disable compile_expressions by default until we get own `llvm` contrib and can test it with `clang` and `asan`. [#4579](https://github.com/yandex/ClickHouse/pull/4579) ([alesapin](https://github.com/alesapin))
* Prevent `std::terminate` when `invalidate_query` for `clickhouse` external dictionary source has returned wrong resultset (empty or more than one row or more than one column). Fixed issue when the `invalidate_query` was performed every five seconds regardless to the `lifetime`. [#4583](https://github.com/yandex/ClickHouse/pull/4583) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Avoid deadlock when the `invalidate_query` for a dictionary with `clickhouse` source was involving `system.dictionaries` table or `Dictionaries` database (rare case). [#4599](https://github.com/yandex/ClickHouse/pull/4599) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fixes for CROSS JOIN with empty WHERE [#4598](https://github.com/yandex/ClickHouse/pull/4598) ([Artem Zuikov](https://github.com/4ertus2))
* Fixes for CROSS JOIN with empty WHERE. [#4598](https://github.com/yandex/ClickHouse/pull/4598) ([Artem Zuikov](https://github.com/4ertus2))
* Fixed segfault in function "replicate" when constant argument is passed. [#4603](https://github.com/yandex/ClickHouse/pull/4603) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fix lambda function with predicate optimizer. [#4408](https://github.com/yandex/ClickHouse/pull/4408) ([Winter Zhang](https://github.com/zhang2014))
* Multiple JOINs multiple fixes. [#4595](https://github.com/yandex/ClickHouse/pull/4595) ([Artem Zuikov](https://github.com/4ertus2))
### Improvements
* Support aliases in JOIN ON section for right table columns [#4412](https://github.com/yandex/ClickHouse/pull/4412) ([Artem Zuikov](https://github.com/4ertus2))
* Support aliases in JOIN ON section for right table columns. [#4412](https://github.com/yandex/ClickHouse/pull/4412) ([Artem Zuikov](https://github.com/4ertus2))
* Result of multiple JOINs need correct result names to be used in subselects. Replace flat aliases with source names in result. [#4474](https://github.com/yandex/ClickHouse/pull/4474) ([Artem Zuikov](https://github.com/4ertus2))
* Improve push-down logic for joined statements. [#4387](https://github.com/yandex/ClickHouse/pull/4387) ([Ivan](https://github.com/abyss7))
@ -67,6 +219,18 @@
* Fix compilation on Mac. [#4371](https://github.com/yandex/ClickHouse/pull/4371) ([Vitaly Baranov](https://github.com/vitlibar))
* Build fixes for FreeBSD and various unusual build configurations. [#4444](https://github.com/yandex/ClickHouse/pull/4444) ([proller](https://github.com/proller))
## ClickHouse release 19.3.9.1, 2019-04-02
### Bug Fixes
* Fix crash in `FULL/RIGHT JOIN` when we joining on nullable vs not nullable. [#4855](https://github.com/yandex/ClickHouse/pull/4855) ([Artem Zuikov](https://github.com/4ertus2))
* Fix segmentation fault in `clickhouse-copier`. [#4835](https://github.com/yandex/ClickHouse/pull/4835) ([proller](https://github.com/proller))
* Fixed reading from `Array(LowCardinality)` column in rare case when column contained a long sequence of empty arrays. [#4850](https://github.com/yandex/ClickHouse/pull/4850) ([Nikolai Kochetov](https://github.com/KochetovNicolai))
### Build/Testing/Packaging Improvement
* Add a way to launch clickhouse-server image from a custom user [#4753](https://github.com/yandex/ClickHouse/pull/4753) ([Mikhail f. Shiryaev](https://github.com/Felixoid))
## ClickHouse release 19.3.7, 2019-03-12

View File

@ -1,3 +1,164 @@
## ClickHouse release 19.5.3.8, 2019-04-18
### Исправления ошибок
* Исправлен тип настройки `max_partitions_per_insert_block` с булевого на UInt64. [#5028](https://github.com/yandex/ClickHouse/pull/5028) ([Mohammad Hossein Sekhavat](https://github.com/mhsekhavat))
## ClickHouse release 19.5.2.6, 2019-04-15
### Новые возможности
* Добавлены функции для работы с несколькими регулярными выражениями с помощью библиотеки [Hyperscan](https://github.com/intel/hyperscan). (`multiMatchAny`, `multiMatchAnyIndex`, `multiFuzzyMatchAny`, `multiFuzzyMatchAnyIndex`). [#4780](https://github.com/yandex/ClickHouse/pull/4780), [#4841](https://github.com/yandex/ClickHouse/pull/4841) ([Danila Kutenin](https://github.com/danlark1))
* Добавлена функция `multiSearchFirstPosition`. [#4780](https://github.com/yandex/ClickHouse/pull/4780) ([Danila Kutenin](https://github.com/danlark1))
* Реализована возможность указания построчного ограничения доступа к таблицам. [#4792](https://github.com/yandex/ClickHouse/pull/4792) ([Ivan](https://github.com/abyss7))
* Добавлен новый тип вторичного индекса на базе фильтра Блума (используется в функциях `equal`, `in` и `like`). [#4499](https://github.com/yandex/ClickHouse/pull/4499) ([Nikita Vasilev](https://github.com/nikvas0))
* Добавлен `ASOF JOIN` которые позволяет джойнить строки по наиболее близкому известному значению. [#4774](https://github.com/yandex/ClickHouse/pull/4774) [#4867](https://github.com/yandex/ClickHouse/pull/4867) [#4863](https://github.com/yandex/ClickHouse/pull/4863) [#4875](https://github.com/yandex/ClickHouse/pull/4875) ([Martijn Bakker](https://github.com/Gladdy), [Artem Zuikov](https://github.com/4ertus2))
* Теперь запрос `COMMA JOIN` переписывается `CROSS JOIN`. И затем оба переписываются в `INNER JOIN`, если это возможно. [#4661](https://github.com/yandex/ClickHouse/pull/4661) ([Artem Zuikov](https://github.com/4ertus2))
### Улучшения
* Функции `topK` и `topKWeighted` теперь поддерживают произвольный `loadFactor` (исправляет issue [#4252](https://github.com/yandex/ClickHouse/issues/4252)). [#4634](https://github.com/yandex/ClickHouse/pull/4634) ([Kirill Danshin](https://github.com/kirillDanshin))
* Добавлена возможность использования настройки `parallel_replicas_count > 1` для таблиц без семплирования (ранее настройка просто игнорировалась). [#4637](https://github.com/yandex/ClickHouse/pull/4637) ([Alexey Elymanov](https://github.com/digitalist))
* Поддержан запрос `CREATE OR REPLACE VIEW`. Позволяет создать `VIEW` или изменить запрос в одном выражении. [#4654](https://github.com/yandex/ClickHouse/pull/4654) ([Boris Granveaud](https://github.com/bgranvea))
* Движок таблиц `Buffer` теперь поддерживает `PREWHERE`. [#4671](https://github.com/yandex/ClickHouse/pull/4671) ([Yangkuan Liu](https://github.com/LiuYangkuan))
* Теперь реплицируемые таблицы могу стартовать в `readonly` режиме даже при отсутствии zookeeper. [#4691](https://github.com/yandex/ClickHouse/pull/4691) ([alesapin](https://github.com/alesapin))
* Исправлено мигание прогресс-бара в clickhouse-client. Проблема была наиболее заметна при использовании `FORMAT Null` в потоковых запросах. [#4811](https://github.com/yandex/ClickHouse/pull/4811) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Добавлена возможность отключения функций, использующих библиотеку `hyperscan`, для пользователей, чтобы ограничить возможное неконтролируемое потребление ресурсов. [#4816](https://github.com/yandex/ClickHouse/pull/4816) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Добавлено логирование номера версии во все исключения. [#4824](https://github.com/yandex/ClickHouse/pull/4824) ([proller](https://github.com/proller))
* Добавлено ограничение на размер строк и количество параметров в функции `multiMatch`. Теперь они принимают строки умещающиеся в `unsigned int`. [#4834](https://github.com/yandex/ClickHouse/pull/4834) ([Danila Kutenin](https://github.com/danlark1))
* Улучшено использование памяти и обработка ошибок в Hyperscan. [#4866](https://github.com/yandex/ClickHouse/pull/4866) ([Danila Kutenin](https://github.com/danlark1))
* Теперь системная таблица `system.graphite_detentions` заполняется из конфигурационного файла для таблиц семейства `*GraphiteMergeTree`. [#4584](https://github.com/yandex/ClickHouse/pull/4584) ([Mikhail f. Shiryaev](https://github.com/Felixoid))
* Функция `trigramDistance` переименована в функцию `ngramDistance`. Добавлено несколько функций с `CaseInsensitive` и `UTF`. [#4602](https://github.com/yandex/ClickHouse/pull/4602) ([Danila Kutenin](https://github.com/danlark1))
* Улучшено вычисление вторичных индексов. [#4640](https://github.com/yandex/ClickHouse/pull/4640) ([Nikita Vasilev](https://github.com/nikvas0))
* Теперь обычные колонки, а также колонки `DEFAULT`, `MATERIALIZED` и `ALIAS` хранятся в одном списке (исправляет issue [#2867](https://github.com/yandex/ClickHouse/issues/2867)). [#4707](https://github.com/yandex/ClickHouse/pull/4707) ([Alex Zatelepin](https://github.com/ztlpn))
### Исправления ошибок
* В случае невозможности выделить память вместо вызова `std::terminate` бросается исключение `std::bad_alloc`. [#4665](https://github.com/yandex/ClickHouse/pull/4665) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Исправлены ошибки чтения capnproto из буфера. Иногда файлы не загружались по HTTP. [#4674](https://github.com/yandex/ClickHouse/pull/4674) ([Vladislav](https://github.com/smirnov-vs))
* Исправлена ошибка `Unknown log entry type: 0` после запроса `OPTIMIZE TABLE FINAL`. [#4683](https://github.com/yandex/ClickHouse/pull/4683) ([Amos Bird](https://github.com/amosbird))
* При передаче неправильных аргументов в `hasAny` и `hasAll` могла происходить ошибка сегментирования. [#4698](https://github.com/yandex/ClickHouse/pull/4698) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Исправлен дедлок, который мог происходить при запросе `DROP DATABASE dictionary`. [#4701](https://github.com/yandex/ClickHouse/pull/4701) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Исправлено неопределенное поведение в функциях `median` и `quantile`. [#4702](https://github.com/yandex/ClickHouse/pull/4702) ([hcz](https://github.com/hczhcz))
* Исправлено определение уровня сжатия при указании настройки `network_compression_method` в нижнем регистре. Было сломано в v19.1. [#4706](https://github.com/yandex/ClickHouse/pull/4706) ([proller](https://github.com/proller))
* Настройка `<timezone>UTC</timezone>` больше не игнорируется (исправляет issue [#4658](https://github.com/yandex/ClickHouse/issues/4658)). [#4718](https://github.com/yandex/ClickHouse/pull/4718) ([proller](https://github.com/proller))
* Исправлено поведение функции `histogram` с `Distributed` таблицами. [#4741](https://github.com/yandex/ClickHouse/pull/4741) ([olegkv](https://github.com/olegkv))
* Исправлено срабатывание thread-санитайзера с ошибкой `destroy of a locked mutex`. [#4742](https://github.com/yandex/ClickHouse/pull/4742) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Исправлено срабатывание thread-санитайзера при завершении сервера, вызванное гонкой при использовании системных логов. Также исправлена потенциальная ошибка use-after-free при завершении сервера в котором был включен `part_log`. [#4758](https://github.com/yandex/ClickHouse/pull/4758) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Исправлена перепроверка кусков в `ReplicatedMergeTreeAlterThread` при появлении ошибок. [#4772](https://github.com/yandex/ClickHouse/pull/4772) ([Nikolai Kochetov](https://github.com/KochetovNicolai))
* Исправлена работа арифметических операций с промежуточными состояниями агрегатных функций для константных аргументов (таких как результаты подзапросов). [#4776](https://github.com/yandex/ClickHouse/pull/4776) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Теперь имена колонок всегда экранируются в файлах с метаинформацией. В противном случае было невозможно создать таблицу с колонкой с именем `index`. [#4782](https://github.com/yandex/ClickHouse/pull/4782) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Исправлено падение в запросе `ALTER ... MODIFY ORDER BY` к `Distributed` таблице. [#4790](https://github.com/yandex/ClickHouse/pull/4790) ([TCeason](https://github.com/TCeason))
* Исправлена ошибка сегментирования при запросах с `JOIN ON` и включенной настройкой `enable_optimize_predicate_expression`. [#4794](https://github.com/yandex/ClickHouse/pull/4794) ([Winter Zhang](https://github.com/zhang2014))
* Исправлено добавление лишней строки после чтения protobuf-сообщения из таблицы с движком `Kafka`. [#4808](https://github.com/yandex/ClickHouse/pull/4808) ([Vitaly Baranov](https://github.com/vitlibar))
* Исправлено падение при запросе с `JOIN ON` с не `nullable` и nullable колонкой. Также исправлено поведение при появлении `NULLs` среди ключей справа в`ANY JOIN` + `join_use_nulls`. [#4815](https://github.com/yandex/ClickHouse/pull/4815) ([Artem Zuikov](https://github.com/4ertus2))
* Исправлена ошибка сегментирования в `clickhouse-copier`. [#4835](https://github.com/yandex/ClickHouse/pull/4835) ([proller](https://github.com/proller))
* Исправлена гонка при `SELECT` запросе из `system.tables` если таблица была конкурентно переименована или к ней был применен `ALTER` запрос. [#4836](https://github.com/yandex/ClickHouse/pull/4836) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Исправлена гонка при скачивании куска, который уже является устаревшим. [#4839](https://github.com/yandex/ClickHouse/pull/4839) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Исправлена редкая гонка при `RENAME` запросах к таблицам семейства MergeTree. [#4844](https://github.com/yandex/ClickHouse/pull/4844) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Исправлена ошибка сегментирования в функции `arrayIntersect`. Ошибка возникала при вызове функции с константными и не константными аргументами. [#4847](https://github.com/yandex/ClickHouse/pull/4847) ([Lixiang Qian](https://github.com/fancyqlx))
* Исправлена редкая ошибка при чтении из колонки типа `Array(LowCardinality)`, которая возникала, если в колонке содержалось большее количество подряд идущих пустых массивов. [#4850](https://github.com/yandex/ClickHouse/pull/4850) ([Nikolai Kochetov](https://github.com/KochetovNicolai))
* Исправлено паление в запроса с `FULL/RIGHT JOIN` когда объединение происходило по nullable и не nullable колонке. [#4855](https://github.com/yandex/ClickHouse/pull/4855) ([Artem Zuikov](https://github.com/4ertus2))
* Исправлена ошибка `No message received`, возникавшая при скачивании кусков между репликами. [#4856](https://github.com/yandex/ClickHouse/pull/4856) ([alesapin](https://github.com/alesapin))
* Исправлена ошибка в функции `arrayIntersect` приводившая к неправильным результатам в случае нескольких повторяющихся значений в массиве. [#4871](https://github.com/yandex/ClickHouse/pull/4871) ([Nikolai Kochetov](https://github.com/KochetovNicolai))
* Исправлена гонка при конкурентных `ALTER COLUMN` запросах, которая могла приводить к падению сервера (исправляет issue [#3421](https://github.com/yandex/ClickHouse/issues/3421)). [#4592](https://github.com/yandex/ClickHouse/pull/4592) ([Alex Zatelepin](https://github.com/ztlpn))
* Исправлен некорректный результат в `FULL/RIGHT JOIN` запросах с константной колонкой. [#4723](https://github.com/yandex/ClickHouse/pull/4723) ([Artem Zuikov](https://github.com/4ertus2))
* Исправлено появление дубликатов в `GLOBAL JOIN` со звездочкой. [#4705](https://github.com/yandex/ClickHouse/pull/4705) ([Artem Zuikov](https://github.com/4ertus2))
* Исправлено определение параметров кодеков в запросах `ALTER MODIFY`, если тип колонки не был указан. [#4883](https://github.com/yandex/ClickHouse/pull/4883) ([alesapin](https://github.com/alesapin))
* Функции `cutQueryStringAndFragment()` и `queryStringAndFragment()` теперь работают корректно, когда `URL` содержит фрагмент, но не содержит запроса. [#4894](https://github.com/yandex/ClickHouse/pull/4894) ([Vitaly Baranov](https://github.com/vitlibar))
* Исправлена редкая ошибка, возникавшая при установке настройки `min_bytes_to_use_direct_io` больше нуля. Она возникла при необходимости сдвинутся в файле, который уже прочитан до конца. [#4897](https://github.com/yandex/ClickHouse/pull/4897) ([alesapin](https://github.com/alesapin))
* Исправлено неправильное определение типов аргументов для агрегатных функций с `LowCardinality` аргументами (исправляет [#4919](https://github.com/yandex/ClickHouse/issues/4919)). [#4922](https://github.com/yandex/ClickHouse/pull/4922) ([Nikolai Kochetov](https://github.com/KochetovNicolai))
* Исправлена неверная квалификация имён в `GLOBAL JOIN`. [#4969](https://github.com/yandex/ClickHouse/pull/4969) ([Artem Zuikov](https://github.com/4ertus2))
* Исправлен результат функции `toISOWeek` для 1970 года. [#4988](https://github.com/yandex/ClickHouse/pull/4988) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Исправлено дублирование `DROP`, `TRUNCATE` и `OPTIMIZE` запросов, когда они выполнялись `ON CLUSTER` для семейства таблиц `ReplicatedMergeTree*`. [#4991](https://github.com/yandex/ClickHouse/pull/4991) ([alesapin](https://github.com/alesapin))
### Обратно несовместимые изменения
* Настройка `insert_sample_with_metadata` переименована в `input_format_defaults_for_omitted_fields`. [#4771](https://github.com/yandex/ClickHouse/pull/4771) ([Artem Zuikov](https://github.com/4ertus2))
* Добавлена настройка `max_partitions_per_insert_block` (со значением по умолчанию 100). Если вставляемый блок содержит большое количество партиций, то бросается исключение. Лимит можно убрать выставив настройку в 0 (не рекомендуется). [#4845](https://github.com/yandex/ClickHouse/pull/4845) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Функции мультипоиска были переименованы (`multiPosition` в `multiSearchAllPositions`, `multiSearch` в `multiSearchAny`, `firstMatch` в `multiSearchFirstIndex`). [#4780](https://github.com/yandex/ClickHouse/pull/4780) ([Danila Kutenin](https://github.com/danlark1))
### Улучшение производительности
* Оптимизирован поиска с помощью алгоритма Volnitsky с помощью инлайнинга. Это дает около 5-10% улучшения производительности поиска для запросов ищущих множество слов или много одинаковых биграмм. [#4862](https://github.com/yandex/ClickHouse/pull/4862) ([Danila Kutenin](https://github.com/danlark1))
* Исправлено снижение производительности при выставлении настройки `use_uncompressed_cache` больше нуля для запросов, данные которых целиком лежат в кеше. [#4913](https://github.com/yandex/ClickHouse/pull/4913) ([alesapin](https://github.com/alesapin))
### Улучшения сборки/тестирования/пакетирования
* Более строгие настройки для debug-сборок: более гранулярные маппинги памяти и использование ASLR; добавлена защита памяти для кеша засечек и индекса. Это позволяет найти больше ошибок порчи памяти, которые не обнаруживают address-санитайзер и thread-санитайзер. [#4632](https://github.com/yandex/ClickHouse/pull/4632) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Добавлены настройки `ENABLE_PROTOBUF`, `ENABLE_PARQUET` и `ENABLE_BROTLI` которые позволяют отключить соответствующие компоненты. [#4669](https://github.com/yandex/ClickHouse/pull/4669) ([Silviu Caragea](https://github.com/silviucpp))
* Теперь при зависании запросов во время работы тестов будет показан список запросов и стек-трейсы всех потоков. [#4675](https://github.com/yandex/ClickHouse/pull/4675) ([alesapin](https://github.com/alesapin))
* Добавлены ретраи при ошибке `Connection loss` в `clickhouse-test`. [#4682](https://github.com/yandex/ClickHouse/pull/4682) ([alesapin](https://github.com/alesapin))
* Добавлена возможность сборки под FreeBSD в `packager`-скрипт. [#4712](https://github.com/yandex/ClickHouse/pull/4712) [#4748](https://github.com/yandex/ClickHouse/pull/4748) ([alesapin](https://github.com/alesapin))
* Теперь при установке предлагается установить пароль для пользователя `'default'`. [#4725](https://github.com/yandex/ClickHouse/pull/4725) ([proller](https://github.com/proller))
* Убраны предупреждения из библиотеки `rdkafka` при сборке. [#4740](https://github.com/yandex/ClickHouse/pull/4740) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Добавлена возможность сборки без поддержки ssl. [#4750](https://github.com/yandex/ClickHouse/pull/4750) ([proller](https://github.com/proller))
* Добавлена возможность запускать докер-образ с clickhouse-server из под любого пользователя. [#4753](https://github.com/yandex/ClickHouse/pull/4753) ([Mikhail f. Shiryaev](https://github.com/Felixoid))
* Boost обновлен до 1.69. [#4793](https://github.com/yandex/ClickHouse/pull/4793) ([proller](https://github.com/proller))
* Отключено использование `mremap` при сборке с thread-санитайзером, что приводило к ложным срабатываниям. Исправлены ошибки thread-санитайзера в stateful-тестах. [#4859](https://github.com/yandex/ClickHouse/pull/4859) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Добавлен тест проверяющий использование схемы форматов для HTTP-интерфейса. [#4864](https://github.com/yandex/ClickHouse/pull/4864) ([Vitaly Baranov](https://github.com/vitlibar))
## ClickHouse release 19.4.4.33, 2019-04-17
### Исправление ошибок
* В случае невозможности выделить память вместо вызова `std::terminate` бросается исключение `std::bad_alloc`. [#4665](https://github.com/yandex/ClickHouse/pull/4665) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Исправлены ошибки чтения capnproto из буфера. Иногда файлы не загружались по HTTP. [#4674](https://github.com/yandex/ClickHouse/pull/4674) ([Vladislav](https://github.com/smirnov-vs))
* Исправлена ошибка `Unknown log entry type: 0` после запроса `OPTIMIZE TABLE FINAL`. [#4683](https://github.com/yandex/ClickHouse/pull/4683) ([Amos Bird](https://github.com/amosbird))
* При передаче неправильных аргументов в `hasAny` и `hasAll` могла происходить ошибка сегментирования. [#4698](https://github.com/yandex/ClickHouse/pull/4698) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Исправлен дедлок, который мог происходить при запросе `DROP DATABASE dictionary`. [#4701](https://github.com/yandex/ClickHouse/pull/4701) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Исправлено неопределенное поведение в функциях `median` и `quantile`. [#4702](https://github.com/yandex/ClickHouse/pull/4702) ([hcz](https://github.com/hczhcz))
* Исправлено определение уровня сжатия при указании настройки `network_compression_method` в нижнем регистре. Было сломано в v19.1. [#4706](https://github.com/yandex/ClickHouse/pull/4706) ([proller](https://github.com/proller))
* Настройка `<timezone>UTC</timezone>` больше не игнорируется (исправляет issue [#4658](https://github.com/yandex/ClickHouse/issues/4658)). [#4718](https://github.com/yandex/ClickHouse/pull/4718) ([proller](https://github.com/proller))
* Исправлено поведение функции `histogram` с `Distributed` таблицами. [#4741](https://github.com/yandex/ClickHouse/pull/4741) ([olegkv](https://github.com/olegkv))
* Исправлено срабатывание thread-санитайзера с ошибкой `destroy of a locked mutex`. [#4742](https://github.com/yandex/ClickHouse/pull/4742) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Исправлено срабатывание thread-санитайзера при завершении сервера, вызванное гонкой при использовании системных логов. Также исправлена потенциальная ошибка use-after-free при завершении сервера в котором был включен `part_log`. [#4758](https://github.com/yandex/ClickHouse/pull/4758) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Исправлена перепроверка кусков в `ReplicatedMergeTreeAlterThread` при появлении ошибок. [#4772](https://github.com/yandex/ClickHouse/pull/4772) ([Nikolai Kochetov](https://github.com/KochetovNicolai))
* Исправлена работа арифметических операций с промежуточными состояниями агрегатных функций для константных аргументов (таких как результаты подзапросов). [#4776](https://github.com/yandex/ClickHouse/pull/4776) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Теперь имена колонок всегда экранируются в файлах с метаинформацией. В противном случае было невозможно создать таблицу с колонкой с именем `index`. [#4782](https://github.com/yandex/ClickHouse/pull/4782) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Исправлено падение в запросе `ALTER ... MODIFY ORDER BY` к `Distributed` таблице. [#4790](https://github.com/yandex/ClickHouse/pull/4790) ([TCeason](https://github.com/TCeason))
* Исправлена ошибка сегментирования при запросах с `JOIN ON` и включенной настройкой `enable_optimize_predicate_expression`. [#4794](https://github.com/yandex/ClickHouse/pull/4794) ([Winter Zhang](https://github.com/zhang2014))
* Исправлено добавление лишней строки после чтения protobuf-сообщения из таблицы с движком `Kafka`. [#4808](https://github.com/yandex/ClickHouse/pull/4808) ([Vitaly Baranov](https://github.com/vitlibar))
* Исправлена ошибка сегментирования в `clickhouse-copier`. [#4835](https://github.com/yandex/ClickHouse/pull/4835) ([proller](https://github.com/proller))
* Исправлена гонка при `SELECT` запросе из `system.tables` если таблица была конкурентно переименована или к ней был применен `ALTER` запрос. [#4836](https://github.com/yandex/ClickHouse/pull/4836) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Исправлена гонка при скачивании куска, который уже является устаревшим. [#4839](https://github.com/yandex/ClickHouse/pull/4839) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Исправлена редкая гонка при `RENAME` запросах к таблицам семейства MergeTree. [#4844](https://github.com/yandex/ClickHouse/pull/4844) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Исправлена ошибка сегментирования в функции `arrayIntersect`. Ошибка возникала при вызове функции с константными и не константными аргументами. [#4847](https://github.com/yandex/ClickHouse/pull/4847) ([Lixiang Qian](https://github.com/fancyqlx))
* Исправлена редкая ошибка при чтении из колонки типа `Array(LowCardinality)`, которая возникала, если в колонке содержалось большее количество подряд идущих пустых массивов. [#4850](https://github.com/yandex/ClickHouse/pull/4850) ([Nikolai Kochetov](https://github.com/KochetovNicolai))
* Исправлена ошибка `No message received`, возникавшая при скачивании кусков между репликами. [#4856](https://github.com/yandex/ClickHouse/pull/4856) ([alesapin](https://github.com/alesapin))
* Исправлена ошибка в функции `arrayIntersect` приводившая к неправильным результатам в случае нескольких повторяющихся значений в массиве. [#4871](https://github.com/yandex/ClickHouse/pull/4871) ([Nikolai Kochetov](https://github.com/KochetovNicolai))
* Исправлена гонка при конкурентных `ALTER COLUMN` запросах, которая могла приводить к падению сервера (исправляет issue [#3421](https://github.com/yandex/ClickHouse/issues/3421)). [#4592](https://github.com/yandex/ClickHouse/pull/4592) ([Alex Zatelepin](https://github.com/ztlpn))
* Исправлено определение параметров кодеков в запросах `ALTER MODIFY`, если тип колонки не был указан. [#4883](https://github.com/yandex/ClickHouse/pull/4883) ([alesapin](https://github.com/alesapin))
* Функции `cutQueryStringAndFragment()` и `queryStringAndFragment()` теперь работают корректно, когда `URL` содержит фрагмент, но не содержит запроса. [#4894](https://github.com/yandex/ClickHouse/pull/4894) ([Vitaly Baranov](https://github.com/vitlibar))
* Исправлена редкая ошибка, возникавшая при установке настройки `min_bytes_to_use_direct_io` больше нуля. Она возникла при необходимости сдвинутся в файле, который уже прочитан до конца. [#4897](https://github.com/yandex/ClickHouse/pull/4897) ([alesapin](https://github.com/alesapin))
* Исправлено неправильное определение типов аргументов для агрегатных функций с `LowCardinality` аргументами (исправляет [#4919](https://github.com/yandex/ClickHouse/issues/4919)). [#4922](https://github.com/yandex/ClickHouse/pull/4922) ([Nikolai Kochetov](https://github.com/KochetovNicolai))
* Исправлен результат функции `toISOWeek` для 1970 года. [#4988](https://github.com/yandex/ClickHouse/pull/4988) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Исправлено дублирование `DROP`, `TRUNCATE` и `OPTIMIZE` запросов, когда они выполнялись `ON CLUSTER` для семейства таблиц `ReplicatedMergeTree*`. [#4991](https://github.com/yandex/ClickHouse/pull/4991) ([alesapin](https://github.com/alesapin))
### Улучшения
* Теперь обычные колонки, а также колонки `DEFAULT`, `MATERIALIZED` и `ALIAS` хранятся в одном списке (исправляет issue [#2867](https://github.com/yandex/ClickHouse/issues/2867)). [#4707](https://github.com/yandex/ClickHouse/pull/4707) ([Alex Zatelepin](https://github.com/ztlpn))
## ClickHouse release 19.4.3.11, 2019-04-02
### Исправление ошибок
* Исправлено паление в запроса с `FULL/RIGHT JOIN` когда объединение происходило по nullable и не nullable колонке. [#4855](https://github.com/yandex/ClickHouse/pull/4855) ([Artem Zuikov](https://github.com/4ertus2))
* Исправлена ошибка сегментирования в `clickhouse-copier`. [#4835](https://github.com/yandex/ClickHouse/pull/4835) ([proller](https://github.com/proller))
### Улучшения сборки/тестирования/пакетирования
* Добавлена возможность запускать докер-образ с clickhouse-server из под любого пользователя. [#4753](https://github.com/yandex/ClickHouse/pull/4753) ([Mikhail f. Shiryaev](https://github.com/Felixoid))
## ClickHouse release 19.4.2.7, 2019-03-30
### Исправление ошибок
* Исправлена редкая ошибка при чтении из колонки типа `Array(LowCardinality)`, которая возникала, если в колонке содержалось большее количество подряд идущих пустых массивов. [#4850](https://github.com/yandex/ClickHouse/pull/4850) ([Nikolai Kochetov](https://github.com/KochetovNicolai))
## ClickHouse release 19.4.1.3, 2019-03-19
### Исправление ошибок
* Исправлено поведение удаленных запросов, которые одновременно содержали `LIMIT BY` и `LIMIT`. Раньше для таких запросов `LIMIT` мог быть выполнен до `LIMIT BY`, что приводило к перефильтрации. [#4708](https://github.com/yandex/ClickHouse/pull/4708) ([Constantin S. Pan](https://github.com/kvap))
## ClickHouse release 19.4.0.49, 2019-03-09
### Новые возможности

View File

@ -59,7 +59,7 @@ if (NOT MAKE_STATIC_LIBRARIES)
endif ()
if (SPLIT_SHARED_LIBRARIES)
set (LINK_MODE SHARED)
set(BUILD_SHARED_LIBS 1 CACHE INTERNAL "")
endif ()
if (USE_STATIC_LIBRARIES)
@ -317,6 +317,7 @@ include (cmake/find_hdfs3.cmake) # uses protobuf
include (cmake/find_consistent-hashing.cmake)
include (cmake/find_base64.cmake)
include (cmake/find_hyperscan.cmake)
include (cmake/find_lfalloc.cmake)
find_contrib_lib(cityhash)
find_contrib_lib(farmhash)
find_contrib_lib(metrohash)

View File

@ -10,3 +10,10 @@ ClickHouse is an open-source column-oriented database management system that all
* [Blog](https://clickhouse.yandex/blog/en/) contains various ClickHouse-related articles, as well as announces and reports about events.
* [Contacts](https://clickhouse.yandex/#contacts) can help to get your questions answered if there are any.
* You can also [fill this form](https://forms.yandex.com/surveys/meet-yandex-clickhouse-team/) to meet Yandex ClickHouse team in person.
## Upcoming Events
* [ClickHouse Community Meetup in Limassol](https://www.facebook.com/events/386638262181785/) on May 7.
* ClickHouse at [Percona Live 2019](https://www.percona.com/live/19/other-open-source-databases-track) in Austin on May 28-30.
* [ClickHouse Community Meetup in Beijing](https://www.huodongxing.com/event/2483759276200) on June 8.
* [ClickHouse Community Meetup in Shenzhen](https://www.huodongxing.com/event/3483759917300) on October 20.
* [ClickHouse Community Meetup in Shanghai](https://www.huodongxing.com/event/4483760336000) on October 27.

View File

@ -1,88 +1,147 @@
# This file copied from contrib/poco/cmake/FindODBC.cmake to allow build without submodules
# Distributed under the OSI-approved BSD 3-Clause License. See accompanying
# file Copyright.txt or https://cmake.org/licensing for details.
#.rst:
# FindMySQL
# -------
#
# Find the ODBC driver manager includes and library.
# Find ODBC Runtime
#
# ODBC is an open standard for connecting to different databases in a
# semi-vendor-independent fashion. First you install the ODBC driver
# manager. Then you need a driver for each separate database you want
# to connect to (unless a generic one works). VTK includes neither
# the driver manager nor the vendor-specific drivers: you have to find
# those yourself.
# This will define the following variables::
#
# This module defines
# ODBC_INCLUDE_DIRECTORIES, where to find sql.h
# ODBC_LIBRARIES, the libraries to link against to use ODBC
# ODBC_FOUND. If false, you cannot build anything that requires ODBC.
# ODBC_FOUND - True if the system has the libraries
# ODBC_INCLUDE_DIRS - where to find the headers
# ODBC_LIBRARIES - where to find the libraries
# ODBC_DEFINITIONS - compile definitons
#
# Hints:
# Set ``ODBC_ROOT_DIR`` to the root directory of an installation.
#
include(FindPackageHandleStandardArgs)
option (ENABLE_ODBC "Enable ODBC" ${OS_LINUX})
if (OS_LINUX)
option (USE_INTERNAL_ODBC_LIBRARY "Set to FALSE to use system odbc library instead of bundled" ${NOT_UNBUNDLED})
else ()
option (USE_INTERNAL_ODBC_LIBRARY "Set to FALSE to use system odbc library instead of bundled" OFF)
endif ()
find_package(PkgConfig QUIET)
pkg_check_modules(PC_ODBC QUIET odbc)
if (USE_INTERNAL_ODBC_LIBRARY AND NOT EXISTS "${ClickHouse_SOURCE_DIR}/contrib/unixodbc/README")
message (WARNING "submodule contrib/unixodbc is missing. to fix try run: \n git submodule update --init --recursive")
set (USE_INTERNAL_ODBC_LIBRARY 0)
endif ()
if(WIN32)
get_filename_component(kit_dir "[HKEY_LOCAL_MACHINE\\SOFTWARE\\Microsoft\\Windows Kits\\Installed Roots;KitsRoot]" REALPATH)
get_filename_component(kit81_dir "[HKEY_LOCAL_MACHINE\\SOFTWARE\\Microsoft\\Windows Kits\\Installed Roots;KitsRoot81]" REALPATH)
endif()
if (ENABLE_ODBC)
if (USE_INTERNAL_ODBC_LIBRARY)
set (ODBC_LIBRARIES unixodbc)
set (ODBC_INCLUDE_DIRECTORIES ${CMAKE_SOURCE_DIR}/contrib/unixodbc/include)
set (ODBC_FOUND 1)
set (USE_ODBC 1)
else ()
find_path(ODBC_INCLUDE_DIRECTORIES
NAMES sql.h
HINTS
/usr/include
/usr/include/iodbc
/usr/include/odbc
/usr/local/include
/usr/local/include/iodbc
/usr/local/include/odbc
/usr/local/iodbc/include
/usr/local/odbc/include
"C:/Program Files/ODBC/include"
"C:/Program Files/Microsoft SDKs/Windows/v7.0/include"
"C:/Program Files/Microsoft SDKs/Windows/v6.0a/include"
"C:/ODBC/include"
DOC "Specify the directory containing sql.h."
)
find_path(ODBC_INCLUDE_DIR
NAMES sql.h
HINTS
${ODBC_ROOT_DIR}/include
${ODBC_ROOT_INCLUDE_DIRS}
PATHS
${PC_ODBC_INCLUDE_DIRS}
/usr/include
/usr/local/include
/usr/local/odbc/include
/usr/local/iodbc/include
"C:/Program Files/ODBC/include"
"C:/Program Files/Microsoft SDKs/Windows/v7.0/include"
"C:/Program Files/Microsoft SDKs/Windows/v6.0a/include"
"C:/ODBC/include"
"${kit_dir}/Include/um"
"${kit81_dir}/Include/um"
PATH_SUFFIXES
odbc
iodbc
DOC "Specify the directory containing sql.h."
)
find_library(ODBC_LIBRARIES
NAMES iodbc odbc iodbcinst odbcinst odbc32
HINTS
/usr/lib
/usr/lib/iodbc
/usr/lib/odbc
/usr/local/lib
/usr/local/lib/iodbc
/usr/local/lib/odbc
/usr/local/iodbc/lib
/usr/local/odbc/lib
"C:/Program Files/ODBC/lib"
"C:/ODBC/lib/debug"
"C:/Program Files (x86)/Microsoft SDKs/Windows/v7.0A/Lib"
DOC "Specify the ODBC driver manager library here."
)
if(NOT ODBC_INCLUDE_DIR AND WIN32)
set(ODBC_INCLUDE_DIR "")
else()
set(REQUIRED_INCLUDE_DIR ODBC_INCLUDE_DIR)
endif()
# MinGW find usually fails
if(MINGW)
set(ODBC_INCLUDE_DIRECTORIES ".")
set(ODBC_LIBRARIES odbc32)
endif()
if(WIN32 AND CMAKE_SIZEOF_VOID_P EQUAL 8)
set(WIN_ARCH x64)
elseif(WIN32 AND CMAKE_SIZEOF_VOID_P EQUAL 4)
set(WIN_ARCH x86)
endif()
include(FindPackageHandleStandardArgs)
find_package_handle_standard_args(ODBC
DEFAULT_MSG
ODBC_INCLUDE_DIRECTORIES
ODBC_LIBRARIES)
find_library(ODBC_LIBRARY
NAMES unixodbc iodbc odbc odbc32
HINTS
${ODBC_ROOT_DIR}/lib
${ODBC_ROOT_LIBRARY_DIRS}
PATHS
${PC_ODBC_LIBRARY_DIRS}
/usr/lib
/usr/local/lib
/usr/local/odbc/lib
/usr/local/iodbc/lib
"C:/Program Files/ODBC/lib"
"C:/ODBC/lib/debug"
"C:/Program Files (x86)/Microsoft SDKs/Windows/v7.0A/Lib"
"${kit81_dir}/Lib/winv6.3/um"
"${kit_dir}/Lib/win8/um"
PATH_SUFIXES
odbc
${WIN_ARCH}
DOC "Specify the ODBC driver manager library here."
)
mark_as_advanced(ODBC_FOUND ODBC_LIBRARIES ODBC_INCLUDE_DIRECTORIES)
endif ()
endif ()
if(NOT ODBC_LIBRARY AND WIN32)
# List names of ODBC libraries on Windows
set(ODBC_LIBRARY odbc32.lib)
endif()
message (STATUS "Using odbc: ${ODBC_INCLUDE_DIRECTORIES} : ${ODBC_LIBRARIES}")
# List additional libraries required to use ODBC library
if(WIN32 AND MSVC OR CMAKE_CXX_COMPILER_ID MATCHES "Intel")
set(_odbc_required_libs_names odbccp32;ws2_32)
endif()
foreach(_lib_name IN LISTS _odbc_required_libs_names)
find_library(_lib_path
NAMES ${_lib_name}
HINTS
${ODBC_ROOT_DIR}/lib
${ODBC_ROOT_LIBRARY_DIRS}
PATHS
${PC_ODBC_LIBRARY_DIRS}
/usr/lib
/usr/local/lib
/usr/local/odbc/lib
/usr/local/iodbc/lib
"C:/Program Files/ODBC/lib"
"C:/ODBC/lib/debug"
"C:/Program Files (x86)/Microsoft SDKs/Windows/v7.0A/Lib"
PATH_SUFFIXES
odbc
)
if (_lib_path)
list(APPEND _odbc_required_libs_paths ${_lib_path})
endif()
unset(_lib_path CACHE)
endforeach()
unset(_odbc_lib_paths)
unset(_odbc_required_libs_names)
find_package_handle_standard_args(ODBC
FOUND_VAR ODBC_FOUND
REQUIRED_VARS
ODBC_LIBRARY
${REQUIRED_INCLUDE_DIR}
VERSION_VAR ODBC_VERSION
)
if(ODBC_FOUND)
set(ODBC_LIBRARIES ${ODBC_LIBRARY} ${_odbc_required_libs_paths})
set(ODBC_INCLUDE_DIRS ${ODBC_INCLUDE_DIR})
set(ODBC_DEFINITIONS ${PC_ODBC_CFLAGS_OTHER})
endif()
if(ODBC_FOUND AND NOT TARGET ODBC::ODBC)
add_library(ODBC::ODBC UNKNOWN IMPORTED)
set_target_properties(ODBC::ODBC PROPERTIES
IMPORTED_LOCATION "${ODBC_LIBRARY}"
INTERFACE_LINK_LIBRARIES "${_odbc_required_libs_paths}"
INTERFACE_COMPILE_OPTIONS "${PC_ODBC_CFLAGS_OTHER}"
INTERFACE_INCLUDE_DIRECTORIES "${ODBC_INCLUDE_DIR}"
)
endif()
mark_as_advanced(ODBC_LIBRARY ODBC_INCLUDE_DIR)

View File

@ -203,12 +203,12 @@ endforeach()
if(Poco_DataODBC_LIBRARY)
list(APPEND Poco_DataODBC_LIBRARY ${ODBC_LIBRARIES} ${LTDL_LIBRARY})
list(APPEND Poco_INCLUDE_DIRS ${ODBC_INCLUDE_DIRECTORIES})
list(APPEND Poco_INCLUDE_DIRS ${ODBC_INCLUDE_DIRS})
endif()
if(Poco_SQLODBC_LIBRARY)
list(APPEND Poco_SQLODBC_LIBRARY ${ODBC_LIBRARIES} ${LTDL_LIBRARY})
list(APPEND Poco_INCLUDE_DIRS ${ODBC_INCLUDE_DIRECTORIES})
list(APPEND Poco_INCLUDE_DIRS ${ODBC_INCLUDE_DIRS})
endif()
if(Poco_NetSSL_LIBRARY)

View File

@ -1,8 +1,7 @@
find_program (CCACHE_FOUND ccache)
if (CCACHE_FOUND AND NOT CMAKE_CXX_COMPILER_LAUNCHER MATCHES "ccache" AND NOT CMAKE_CXX_COMPILER MATCHES "ccache")
execute_process(COMMAND ${CCACHE_FOUND} "-V" OUTPUT_VARIABLE CCACHE_VERSION)
string(REGEX REPLACE "ccache version ([0-9\\.]+).*" "\\1" CCACHE_VERSION ${CCACHE_VERSION} )
string(REGEX REPLACE "ccache version ([0-9\\.]+).*" "\\1" CCACHE_VERSION ${CCACHE_VERSION})
if (CCACHE_VERSION VERSION_GREATER "3.2.0" OR NOT CMAKE_CXX_COMPILER_ID STREQUAL "Clang")
#message(STATUS "Using ${CCACHE_FOUND} ${CCACHE_VERSION}")

View File

@ -1,9 +1,8 @@
if (OS_FREEBSD)
find_library (EXECINFO_LIBRARY execinfo)
find_library (ELF_LIBRARY elf)
message (STATUS "Using execinfo: ${EXECINFO_LIBRARY}")
message (STATUS "Using elf: ${ELF_LIBRARY}")
set (EXECINFO_LIBRARIES ${EXECINFO_LIBRARY} ${ELF_LIBRARY})
message (STATUS "Using execinfo: ${EXECINFO_LIBRARIES}")
else ()
set (EXECINFO_LIBRARY "")
set (ELF_LIBRARY "")
set (EXECINFO_LIBRARIES "")
endif ()

10
cmake/find_lfalloc.cmake Normal file
View File

@ -0,0 +1,10 @@
if (NOT SANITIZE AND NOT ARCH_ARM AND NOT ARCH_32 AND NOT ARCH_PPC64LE AND NOT OS_FREEBSD)
option (ENABLE_LFALLOC "Set to FALSE to use system libgsasl library instead of bundled" ${NOT_UNBUNDLED})
endif ()
if (ENABLE_LFALLOC)
set (USE_LFALLOC 1)
set (USE_LFALLOC_RANDOM_HINT 1)
set (LFALLOC_INCLUDE_DIR ${ClickHouse_SOURCE_DIR}/contrib/lfalloc/src)
message (STATUS "Using lfalloc=${USE_LFALLOC}: ${LFALLOC_INCLUDE_DIR}")
endif ()

View File

@ -1,93 +1,34 @@
# This file copied from contrib/poco/cmake/FindODBC.cmake to allow build without submodules
#
# Find the ODBC driver manager includes and library.
#
# ODBC is an open standard for connecting to different databases in a
# semi-vendor-independent fashion. First you install the ODBC driver
# manager. Then you need a driver for each separate database you want
# to connect to (unless a generic one works). VTK includes neither
# the driver manager nor the vendor-specific drivers: you have to find
# those yourself.
#
# This module defines
# ODBC_INCLUDE_DIRECTORIES, where to find sql.h
# ODBC_LIBRARIES, the libraries to link against to use ODBC
# ODBC_FOUND. If false, you cannot build anything that requires ODBC.
option (ENABLE_ODBC "Enable ODBC" ${OS_LINUX})
if (OS_LINUX)
option (USE_INTERNAL_ODBC_LIBRARY "Set to FALSE to use system odbc library instead of bundled" ${NOT_UNBUNDLED})
else ()
option (USE_INTERNAL_ODBC_LIBRARY "Set to FALSE to use system odbc library instead of bundled" OFF)
endif ()
if (USE_INTERNAL_ODBC_LIBRARY AND NOT EXISTS "${ClickHouse_SOURCE_DIR}/contrib/unixodbc/README")
message (WARNING "submodule contrib/unixodbc is missing. to fix try run: \n git submodule update --init --recursive")
set (USE_INTERNAL_ODBC_LIBRARY 0)
endif ()
set (ODBC_INCLUDE_DIRECTORIES ) # Include directories will be either used automatically by target_include_directories or set later.
if (ENABLE_ODBC)
if (USE_INTERNAL_ODBC_LIBRARY)
set (ODBC_LIBRARIES unixodbc)
set (ODBC_FOUND 1)
set (USE_ODBC 1)
if(ENABLE_ODBC)
if (OS_LINUX)
option(USE_INTERNAL_ODBC_LIBRARY "Set to FALSE to use system odbc library instead of bundled" ${NOT_UNBUNDLED})
else ()
find_path(ODBC_INCLUDE_DIRECTORIES
NAMES sql.h
HINTS
/usr/include
/usr/include/iodbc
/usr/include/odbc
/usr/local/include
/usr/local/include/iodbc
/usr/local/include/odbc
/usr/local/iodbc/include
/usr/local/odbc/include
"C:/Program Files/ODBC/include"
"C:/Program Files/Microsoft SDKs/Windows/v7.0/include"
"C:/Program Files/Microsoft SDKs/Windows/v6.0a/include"
"C:/ODBC/include"
DOC "Specify the directory containing sql.h."
)
option(USE_INTERNAL_ODBC_LIBRARY "Set to FALSE to use system odbc library instead of bundled" OFF)
endif()
find_library(ODBC_LIBRARIES
NAMES iodbc odbc iodbcinst odbcinst odbc32
HINTS
/usr/lib
/usr/lib/iodbc
/usr/lib/odbc
/usr/local/lib
/usr/local/lib/iodbc
/usr/local/lib/odbc
/usr/local/iodbc/lib
/usr/local/odbc/lib
"C:/Program Files/ODBC/lib"
"C:/ODBC/lib/debug"
"C:/Program Files (x86)/Microsoft SDKs/Windows/v7.0A/Lib"
DOC "Specify the ODBC driver manager library here."
)
if(USE_INTERNAL_ODBC_LIBRARY AND NOT EXISTS "${ClickHouse_SOURCE_DIR}/contrib/unixodbc/README")
message(WARNING "submodule contrib/unixodbc is missing. to fix try run: \n git submodule update --init --recursive")
set(USE_INTERNAL_ODBC_LIBRARY 0)
set(MISSING_INTERNAL_ODBC_LIBRARY 1)
endif()
# MinGW find usually fails
if (MINGW)
set(ODBC_INCLUDE_DIRECTORIES ".")
set(ODBC_LIBRARIES odbc32)
endif ()
include(FindPackageHandleStandardArgs)
find_package_handle_standard_args(ODBC
DEFAULT_MSG
ODBC_INCLUDE_DIRECTORIES
ODBC_LIBRARIES)
if (USE_STATIC_LIBRARIES)
list(APPEND ODBC_LIBRARIES ${LTDL_LIBRARY})
endif ()
mark_as_advanced(ODBC_FOUND ODBC_LIBRARIES ODBC_INCLUDE_DIRECTORIES)
set(ODBC_INCLUDE_DIRS ) # Include directories will be either used automatically by target_include_directories or set later.
if(USE_INTERNAL_ODBC_LIBRARY AND NOT MISSING_INTERNAL_ODBC_LIBRARY)
set(ODBC_LIBRARY unixodbc)
set(ODBC_LIBRARIES ${ODBC_LIBRARY})
set(ODBC_INCLUDE_DIRS "${ClickHouse_SOURCE_DIR}/contrib/unixodbc/include")
set(ODBC_FOUND 1)
else()
find_package(ODBC)
endif ()
endif ()
message (STATUS "Using odbc=${ODBC_FOUND}: ${ODBC_INCLUDE_DIRECTORIES} : ${ODBC_LIBRARIES}")
if(ODBC_FOUND)
set(USE_ODBC 1)
set(ODBC_INCLUDE_DIRECTORIES ${ODBC_INCLUDE_DIRS}) # for old poco
set(ODBC_INCLUDE_DIR ${ODBC_INCLUDE_DIRS}) # for old poco
endif()
message(STATUS "Using odbc=${USE_ODBC}: ${ODBC_INCLUDE_DIRS} : ${ODBC_LIBRARIES}")
endif()

View File

@ -76,7 +76,7 @@ elseif (NOT MISSING_INTERNAL_POCO_LIBRARY)
set (Poco_SQLODBC_INCLUDE_DIR
"${ClickHouse_SOURCE_DIR}/contrib/poco/SQL/ODBC/include/"
"${ClickHouse_SOURCE_DIR}/contrib/poco/Data/ODBC/include/"
${ODBC_INCLUDE_DIRECTORIES}
${ODBC_INCLUDE_DIRS}
)
set (Poco_SQLODBC_LIBRARY PocoSQLODBC ${ODBC_LIBRARIES} ${LTDL_LIBRARY})
endif ()
@ -88,7 +88,7 @@ elseif (NOT MISSING_INTERNAL_POCO_LIBRARY)
set (USE_POCO_DATAODBC 1)
set (Poco_DataODBC_INCLUDE_DIR
"${ClickHouse_SOURCE_DIR}/contrib/poco/Data/ODBC/include/"
${ODBC_INCLUDE_DIRECTORIES}
${ODBC_INCLUDE_DIRS}
)
set (Poco_DataODBC_LIBRARY PocoDataODBC ${ODBC_LIBRARIES} ${LTDL_LIBRARY})
endif ()

View File

@ -60,6 +60,72 @@ if(OPENSSL_FOUND)
set(USE_SSL 1)
endif()
# used by new poco
# part from /usr/share/cmake-*/Modules/FindOpenSSL.cmake, with removed all "EXISTS "
if(OPENSSL_FOUND AND NOT USE_INTERNAL_SSL_LIBRARY)
if(NOT TARGET OpenSSL::Crypto AND
(OPENSSL_CRYPTO_LIBRARY OR
LIB_EAY_LIBRARY_DEBUG OR
LIB_EAY_LIBRARY_RELEASE)
)
add_library(OpenSSL::Crypto UNKNOWN IMPORTED)
set_target_properties(OpenSSL::Crypto PROPERTIES
INTERFACE_INCLUDE_DIRECTORIES "${OPENSSL_INCLUDE_DIR}")
if(OPENSSL_CRYPTO_LIBRARY)
set_target_properties(OpenSSL::Crypto PROPERTIES
IMPORTED_LINK_INTERFACE_LANGUAGES "C"
IMPORTED_LOCATION "${OPENSSL_CRYPTO_LIBRARY}")
endif()
if(LIB_EAY_LIBRARY_RELEASE)
set_property(TARGET OpenSSL::Crypto APPEND PROPERTY
IMPORTED_CONFIGURATIONS RELEASE)
set_target_properties(OpenSSL::Crypto PROPERTIES
IMPORTED_LINK_INTERFACE_LANGUAGES_RELEASE "C"
IMPORTED_LOCATION_RELEASE "${LIB_EAY_LIBRARY_RELEASE}")
endif()
if(LIB_EAY_LIBRARY_DEBUG)
set_property(TARGET OpenSSL::Crypto APPEND PROPERTY
IMPORTED_CONFIGURATIONS DEBUG)
set_target_properties(OpenSSL::Crypto PROPERTIES
IMPORTED_LINK_INTERFACE_LANGUAGES_DEBUG "C"
IMPORTED_LOCATION_DEBUG "${LIB_EAY_LIBRARY_DEBUG}")
endif()
endif()
if(NOT TARGET OpenSSL::SSL AND
(OPENSSL_SSL_LIBRARY OR
SSL_EAY_LIBRARY_DEBUG OR
SSL_EAY_LIBRARY_RELEASE)
)
add_library(OpenSSL::SSL UNKNOWN IMPORTED)
set_target_properties(OpenSSL::SSL PROPERTIES
INTERFACE_INCLUDE_DIRECTORIES "${OPENSSL_INCLUDE_DIR}")
if(OPENSSL_SSL_LIBRARY)
set_target_properties(OpenSSL::SSL PROPERTIES
IMPORTED_LINK_INTERFACE_LANGUAGES "C"
IMPORTED_LOCATION "${OPENSSL_SSL_LIBRARY}")
endif()
if(SSL_EAY_LIBRARY_RELEASE)
set_property(TARGET OpenSSL::SSL APPEND PROPERTY
IMPORTED_CONFIGURATIONS RELEASE)
set_target_properties(OpenSSL::SSL PROPERTIES
IMPORTED_LINK_INTERFACE_LANGUAGES_RELEASE "C"
IMPORTED_LOCATION_RELEASE "${SSL_EAY_LIBRARY_RELEASE}")
endif()
if(SSL_EAY_LIBRARY_DEBUG)
set_property(TARGET OpenSSL::SSL APPEND PROPERTY
IMPORTED_CONFIGURATIONS DEBUG)
set_target_properties(OpenSSL::SSL PROPERTIES
IMPORTED_LINK_INTERFACE_LANGUAGES_DEBUG "C"
IMPORTED_LOCATION_DEBUG "${SSL_EAY_LIBRARY_DEBUG}")
endif()
if(TARGET OpenSSL::Crypto)
set_target_properties(OpenSSL::SSL PROPERTIES
INTERFACE_LINK_LIBRARIES OpenSSL::Crypto)
endif()
endif()
endif()
endif ()
message (STATUS "Using ssl=${USE_SSL}: ${OPENSSL_INCLUDE_DIR} : ${OPENSSL_LIBRARIES}")

View File

@ -120,6 +120,9 @@ if (USE_INTERNAL_SSL_LIBRARY)
target_include_directories(${OPENSSL_CRYPTO_LIBRARY} SYSTEM PUBLIC ${OPENSSL_INCLUDE_DIR})
target_include_directories(${OPENSSL_SSL_LIBRARY} SYSTEM PUBLIC ${OPENSSL_INCLUDE_DIR})
set (POCO_SKIP_OPENSSL_FIND 1)
add_library(OpenSSL::Crypto ALIAS ${OPENSSL_CRYPTO_LIBRARY})
add_library(OpenSSL::SSL ALIAS ${OPENSSL_SSL_LIBRARY})
endif ()
if (ENABLE_MYSQL AND USE_INTERNAL_MYSQL_LIBRARY)
@ -144,6 +147,7 @@ endif()
if (ENABLE_ODBC AND USE_INTERNAL_ODBC_LIBRARY)
add_subdirectory (unixodbc-cmake)
add_library(ODBC::ODBC ALIAS ${ODBC_LIBRARIES})
endif ()
if (USE_INTERNAL_CAPNP_LIBRARY)

View File

@ -41,7 +41,7 @@ set( thriftcpp_threads_SOURCES
${LIBRARY_DIR}/src/thrift/concurrency/Monitor.cpp
${LIBRARY_DIR}/src/thrift/concurrency/Mutex.cpp
)
add_library(${THRIFT_LIBRARY} ${LINK_MODE} ${thriftcpp_SOURCES} ${thriftcpp_threads_SOURCES})
add_library(${THRIFT_LIBRARY} ${thriftcpp_SOURCES} ${thriftcpp_threads_SOURCES})
set_target_properties(${THRIFT_LIBRARY} PROPERTIES CXX_STANDARD 14) # REMOVE after https://github.com/apache/thrift/pull/1641
target_include_directories(${THRIFT_LIBRARY} SYSTEM PUBLIC ${ClickHouse_SOURCE_DIR}/contrib/thrift/lib/cpp/src PRIVATE ${Boost_INCLUDE_DIRS})
@ -149,7 +149,7 @@ if (ARROW_WITH_ZSTD)
endif()
add_library(${ARROW_LIBRARY} ${LINK_MODE} ${ARROW_SRCS})
add_library(${ARROW_LIBRARY} ${ARROW_SRCS})
target_include_directories(${ARROW_LIBRARY} SYSTEM PUBLIC ${ClickHouse_SOURCE_DIR}/contrib/arrow/cpp/src PRIVATE ${CMAKE_CURRENT_SOURCE_DIR}/cpp/src ${Boost_INCLUDE_DIRS})
target_link_libraries(${ARROW_LIBRARY} PRIVATE ${DOUBLE_CONVERSION_LIBRARIES} Threads::Threads)
if (ARROW_WITH_LZ4)
@ -195,7 +195,7 @@ list(APPEND PARQUET_SRCS
${CMAKE_CURRENT_SOURCE_DIR}/cpp/src/parquet/parquet_constants.cpp
${CMAKE_CURRENT_SOURCE_DIR}/cpp/src/parquet/parquet_types.cpp
)
add_library(${PARQUET_LIBRARY} ${LINK_MODE} ${PARQUET_SRCS})
add_library(${PARQUET_LIBRARY} ${PARQUET_SRCS})
target_include_directories(${PARQUET_LIBRARY} SYSTEM PUBLIC ${ClickHouse_SOURCE_DIR}/contrib/arrow/cpp/src ${CMAKE_CURRENT_SOURCE_DIR}/cpp/src)
include(${ClickHouse_SOURCE_DIR}/contrib/thrift/build/cmake/ConfigureChecks.cmake) # makes config.h
target_link_libraries(${PARQUET_LIBRARY} PUBLIC ${ARROW_LIBRARY} PRIVATE ${THRIFT_LIBRARY} ${Boost_REGEX_LIBRARY})

View File

@ -24,7 +24,7 @@ endif ()
configure_file(config.h.in ${CMAKE_CURRENT_BINARY_DIR}/config.h)
add_library(base64 ${LINK_MODE}
add_library(base64
${LIBRARY_DIR}/lib/lib.c
${LIBRARY_DIR}/lib/codec_choose.c
${LIBRARY_DIR}/lib/arch/avx/codec.c

View File

@ -20,7 +20,7 @@ endif()
macro(add_boost_lib lib_name)
add_headers_and_sources(boost_${lib_name} ${LIBRARY_DIR}/libs/${lib_name}/src)
add_library(boost_${lib_name}_internal ${LINK_MODE} ${boost_${lib_name}_sources})
add_library(boost_${lib_name}_internal ${boost_${lib_name}_sources})
target_include_directories(boost_${lib_name}_internal SYSTEM BEFORE PUBLIC ${Boost_INCLUDE_DIRS})
target_compile_definitions(boost_${lib_name}_internal PUBLIC BOOST_SYSTEM_NO_DEPRECATED)
endmacro()

View File

@ -28,6 +28,6 @@ set(SRCS
${BROTLI_SOURCE_DIR}/common/transform.c
)
add_library(brotli ${LINK_MODE} ${SRCS})
add_library(brotli ${SRCS})
target_include_directories(brotli PUBLIC ${BROTLI_SOURCE_DIR}/include)

View File

@ -1,6 +1,6 @@
SET(LIBRARY_DIR ${ClickHouse_SOURCE_DIR}/contrib/cctz)
add_library(cctz ${LINK_MODE}
add_library(cctz
${LIBRARY_DIR}/src/civil_time_detail.cc
${LIBRARY_DIR}/src/time_zone_fixed.cc
${LIBRARY_DIR}/src/time_zone_format.cc

View File

@ -23,7 +23,7 @@ set(SRCS
${CPPKAFKA_DIR}/src/consumer.cpp
)
add_library(cppkafka ${LINK_MODE} ${SRCS})
add_library(cppkafka ${SRCS})
target_link_libraries(cppkafka PRIVATE ${RDKAFKA_LIBRARY})
target_include_directories(cppkafka PRIVATE ${CPPKAFKA_DIR}/include/cppkafka)

File diff suppressed because it is too large Load Diff

View File

@ -0,0 +1,23 @@
#pragma once
#include <string.h>
#include <stdlib.h>
#include "util/system/compiler.h"
namespace NMalloc {
volatile inline bool IsAllocatorCorrupted = false;
static inline void AbortFromCorruptedAllocator() {
IsAllocatorCorrupted = true;
abort();
}
struct TAllocHeader {
void* Block;
size_t AllocSize;
void Y_FORCE_INLINE Encode(void* block, size_t size, size_t signature) {
Block = block;
AllocSize = size | signature;
}
};
}

View File

@ -0,0 +1,33 @@
Style guide for the util folder is a stricter version of general style guide (mostly in terms of ambiguity resolution).
* all {} must be in K&R style
* &, * tied closer to a type, not to variable
* always use `using` not `typedef`
* even a single line block must be in braces {}:
```
if (A) {
B();
}
```
* _ at the end of private data member of a class - `First_`, `Second_`
* every .h file must be accompanied with corresponding .cpp to avoid a leakage and check that it is self contained
* prohibited to use `printf`-like functions
Things declared in the general style guide, which sometimes are missed:
* `template <`, not `template<`
* `noexcept`, not `throw ()` nor `throw()`, not required for destructors
* indents inside `namespace` same as inside `class`
Requirements for a new code (and for corrections in an old code which involves change of behaviour) in util:
* presence of UNIT-tests
* presence of comments in Doxygen style
* accessors without Get prefix (`Length()`, but not `GetLength()`)
This guide is not a mandatory as there is the general style guide.
Nevertheless if it is not followed, then a next `ya style .` run in the util folder will undeservedly update authors of some lines of code.
Thus before a commit it is recommended to run `ya style .` in the util folder.

View File

@ -0,0 +1,51 @@
#pragma once
#include "defaults.h"
using TAtomicBase = intptr_t;
using TAtomic = volatile TAtomicBase;
#if defined(__GNUC__)
#include "atomic_gcc.h"
#elif defined(_MSC_VER)
#include "atomic_win.h"
#else
#error unsupported platform
#endif
#if !defined(ATOMIC_COMPILER_BARRIER)
#define ATOMIC_COMPILER_BARRIER()
#endif
static inline TAtomicBase AtomicSub(TAtomic& a, TAtomicBase v) {
return AtomicAdd(a, -v);
}
static inline TAtomicBase AtomicGetAndSub(TAtomic& a, TAtomicBase v) {
return AtomicGetAndAdd(a, -v);
}
#if defined(USE_GENERIC_SETGET)
static inline TAtomicBase AtomicGet(const TAtomic& a) {
return a;
}
static inline void AtomicSet(TAtomic& a, TAtomicBase v) {
a = v;
}
#endif
static inline bool AtomicTryLock(TAtomic* a) {
return AtomicCas(a, 1, 0);
}
static inline bool AtomicTryAndTryLock(TAtomic* a) {
return (AtomicGet(*a) == 0) && AtomicTryLock(a);
}
static inline void AtomicUnlock(TAtomic* a) {
ATOMIC_COMPILER_BARRIER();
AtomicSet(*a, 0);
}
#include "atomic_ops.h"

View File

@ -0,0 +1,90 @@
#pragma once
#define ATOMIC_COMPILER_BARRIER() __asm__ __volatile__("" \
: \
: \
: "memory")
static inline TAtomicBase AtomicGet(const TAtomic& a) {
TAtomicBase tmp;
#if defined(_arm64_)
__asm__ __volatile__(
"ldar %x[value], %[ptr] \n\t"
: [value] "=r"(tmp)
: [ptr] "Q"(a)
: "memory");
#else
__atomic_load(&a, &tmp, __ATOMIC_ACQUIRE);
#endif
return tmp;
}
static inline void AtomicSet(TAtomic& a, TAtomicBase v) {
#if defined(_arm64_)
__asm__ __volatile__(
"stlr %x[value], %[ptr] \n\t"
: [ptr] "=Q"(a)
: [value] "r"(v)
: "memory");
#else
__atomic_store(&a, &v, __ATOMIC_RELEASE);
#endif
}
static inline intptr_t AtomicIncrement(TAtomic& p) {
return __atomic_add_fetch(&p, 1, __ATOMIC_SEQ_CST);
}
static inline intptr_t AtomicGetAndIncrement(TAtomic& p) {
return __atomic_fetch_add(&p, 1, __ATOMIC_SEQ_CST);
}
static inline intptr_t AtomicDecrement(TAtomic& p) {
return __atomic_sub_fetch(&p, 1, __ATOMIC_SEQ_CST);
}
static inline intptr_t AtomicGetAndDecrement(TAtomic& p) {
return __atomic_fetch_sub(&p, 1, __ATOMIC_SEQ_CST);
}
static inline intptr_t AtomicAdd(TAtomic& p, intptr_t v) {
return __atomic_add_fetch(&p, v, __ATOMIC_SEQ_CST);
}
static inline intptr_t AtomicGetAndAdd(TAtomic& p, intptr_t v) {
return __atomic_fetch_add(&p, v, __ATOMIC_SEQ_CST);
}
static inline intptr_t AtomicSwap(TAtomic* p, intptr_t v) {
(void)p; // disable strange 'parameter set but not used' warning on gcc
intptr_t ret;
__atomic_exchange(p, &v, &ret, __ATOMIC_SEQ_CST);
return ret;
}
static inline bool AtomicCas(TAtomic* a, intptr_t exchange, intptr_t compare) {
(void)a; // disable strange 'parameter set but not used' warning on gcc
return __atomic_compare_exchange(a, &compare, &exchange, false, __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
}
static inline intptr_t AtomicGetAndCas(TAtomic* a, intptr_t exchange, intptr_t compare) {
(void)a; // disable strange 'parameter set but not used' warning on gcc
__atomic_compare_exchange(a, &compare, &exchange, false, __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
return compare;
}
static inline intptr_t AtomicOr(TAtomic& a, intptr_t b) {
return __atomic_or_fetch(&a, b, __ATOMIC_SEQ_CST);
}
static inline intptr_t AtomicXor(TAtomic& a, intptr_t b) {
return __atomic_xor_fetch(&a, b, __ATOMIC_SEQ_CST);
}
static inline intptr_t AtomicAnd(TAtomic& a, intptr_t b) {
return __atomic_and_fetch(&a, b, __ATOMIC_SEQ_CST);
}
static inline void AtomicBarrier() {
__sync_synchronize();
}

View File

@ -0,0 +1,189 @@
#pragma once
#include <type_traits>
template <typename T>
inline TAtomic* AsAtomicPtr(T volatile* target) {
return reinterpret_cast<TAtomic*>(target);
}
template <typename T>
inline const TAtomic* AsAtomicPtr(T const volatile* target) {
return reinterpret_cast<const TAtomic*>(target);
}
// integral types
template <typename T>
struct TAtomicTraits {
enum {
Castable = std::is_integral<T>::value && sizeof(T) == sizeof(TAtomicBase) && !std::is_const<T>::value,
};
};
template <typename T, typename TT>
using TEnableIfCastable = std::enable_if_t<TAtomicTraits<T>::Castable, TT>;
template <typename T>
inline TEnableIfCastable<T, T> AtomicGet(T const volatile& target) {
return static_cast<T>(AtomicGet(*AsAtomicPtr(&target)));
}
template <typename T>
inline TEnableIfCastable<T, void> AtomicSet(T volatile& target, TAtomicBase value) {
AtomicSet(*AsAtomicPtr(&target), value);
}
template <typename T>
inline TEnableIfCastable<T, T> AtomicIncrement(T volatile& target) {
return static_cast<T>(AtomicIncrement(*AsAtomicPtr(&target)));
}
template <typename T>
inline TEnableIfCastable<T, T> AtomicGetAndIncrement(T volatile& target) {
return static_cast<T>(AtomicGetAndIncrement(*AsAtomicPtr(&target)));
}
template <typename T>
inline TEnableIfCastable<T, T> AtomicDecrement(T volatile& target) {
return static_cast<T>(AtomicDecrement(*AsAtomicPtr(&target)));
}
template <typename T>
inline TEnableIfCastable<T, T> AtomicGetAndDecrement(T volatile& target) {
return static_cast<T>(AtomicGetAndDecrement(*AsAtomicPtr(&target)));
}
template <typename T>
inline TEnableIfCastable<T, T> AtomicAdd(T volatile& target, TAtomicBase value) {
return static_cast<T>(AtomicAdd(*AsAtomicPtr(&target), value));
}
template <typename T>
inline TEnableIfCastable<T, T> AtomicGetAndAdd(T volatile& target, TAtomicBase value) {
return static_cast<T>(AtomicGetAndAdd(*AsAtomicPtr(&target), value));
}
template <typename T>
inline TEnableIfCastable<T, T> AtomicSub(T volatile& target, TAtomicBase value) {
return static_cast<T>(AtomicSub(*AsAtomicPtr(&target), value));
}
template <typename T>
inline TEnableIfCastable<T, T> AtomicGetAndSub(T volatile& target, TAtomicBase value) {
return static_cast<T>(AtomicGetAndSub(*AsAtomicPtr(&target), value));
}
template <typename T>
inline TEnableIfCastable<T, T> AtomicSwap(T volatile* target, TAtomicBase exchange) {
return static_cast<T>(AtomicSwap(AsAtomicPtr(target), exchange));
}
template <typename T>
inline TEnableIfCastable<T, bool> AtomicCas(T volatile* target, TAtomicBase exchange, TAtomicBase compare) {
return AtomicCas(AsAtomicPtr(target), exchange, compare);
}
template <typename T>
inline TEnableIfCastable<T, T> AtomicGetAndCas(T volatile* target, TAtomicBase exchange, TAtomicBase compare) {
return static_cast<T>(AtomicGetAndCas(AsAtomicPtr(target), exchange, compare));
}
template <typename T>
inline TEnableIfCastable<T, bool> AtomicTryLock(T volatile* target) {
return AtomicTryLock(AsAtomicPtr(target));
}
template <typename T>
inline TEnableIfCastable<T, bool> AtomicTryAndTryLock(T volatile* target) {
return AtomicTryAndTryLock(AsAtomicPtr(target));
}
template <typename T>
inline TEnableIfCastable<T, void> AtomicUnlock(T volatile* target) {
AtomicUnlock(AsAtomicPtr(target));
}
template <typename T>
inline TEnableIfCastable<T, T> AtomicOr(T volatile& target, TAtomicBase value) {
return static_cast<T>(AtomicOr(*AsAtomicPtr(&target), value));
}
template <typename T>
inline TEnableIfCastable<T, T> AtomicAnd(T volatile& target, TAtomicBase value) {
return static_cast<T>(AtomicAnd(*AsAtomicPtr(&target), value));
}
template <typename T>
inline TEnableIfCastable<T, T> AtomicXor(T volatile& target, TAtomicBase value) {
return static_cast<T>(AtomicXor(*AsAtomicPtr(&target), value));
}
// pointer types
template <typename T>
inline T* AtomicGet(T* const volatile& target) {
return reinterpret_cast<T*>(AtomicGet(*AsAtomicPtr(&target)));
}
template <typename T>
inline void AtomicSet(T* volatile& target, T* value) {
AtomicSet(*AsAtomicPtr(&target), reinterpret_cast<TAtomicBase>(value));
}
using TNullPtr = decltype(nullptr);
template <typename T>
inline void AtomicSet(T* volatile& target, TNullPtr) {
AtomicSet(*AsAtomicPtr(&target), 0);
}
template <typename T>
inline T* AtomicSwap(T* volatile* target, T* exchange) {
return reinterpret_cast<T*>(AtomicSwap(AsAtomicPtr(target), reinterpret_cast<TAtomicBase>(exchange)));
}
template <typename T>
inline T* AtomicSwap(T* volatile* target, TNullPtr) {
return reinterpret_cast<T*>(AtomicSwap(AsAtomicPtr(target), 0));
}
template <typename T>
inline bool AtomicCas(T* volatile* target, T* exchange, T* compare) {
return AtomicCas(AsAtomicPtr(target), reinterpret_cast<TAtomicBase>(exchange), reinterpret_cast<TAtomicBase>(compare));
}
template <typename T>
inline T* AtomicGetAndCas(T* volatile* target, T* exchange, T* compare) {
return reinterpret_cast<T*>(AtomicGetAndCas(AsAtomicPtr(target), reinterpret_cast<TAtomicBase>(exchange), reinterpret_cast<TAtomicBase>(compare)));
}
template <typename T>
inline bool AtomicCas(T* volatile* target, T* exchange, TNullPtr) {
return AtomicCas(AsAtomicPtr(target), reinterpret_cast<TAtomicBase>(exchange), 0);
}
template <typename T>
inline T* AtomicGetAndCas(T* volatile* target, T* exchange, TNullPtr) {
return reinterpret_cast<T*>(AtomicGetAndCas(AsAtomicPtr(target), reinterpret_cast<TAtomicBase>(exchange), 0));
}
template <typename T>
inline bool AtomicCas(T* volatile* target, TNullPtr, T* compare) {
return AtomicCas(AsAtomicPtr(target), 0, reinterpret_cast<TAtomicBase>(compare));
}
template <typename T>
inline T* AtomicGetAndCas(T* volatile* target, TNullPtr, T* compare) {
return reinterpret_cast<T*>(AtomicGetAndCas(AsAtomicPtr(target), 0, reinterpret_cast<TAtomicBase>(compare)));
}
template <typename T>
inline bool AtomicCas(T* volatile* target, TNullPtr, TNullPtr) {
return AtomicCas(AsAtomicPtr(target), 0, 0);
}
template <typename T>
inline T* AtomicGetAndCas(T* volatile* target, TNullPtr, TNullPtr) {
return reinterpret_cast<T*>(AtomicGetAndCas(AsAtomicPtr(target), 0, 0));
}

View File

@ -0,0 +1,114 @@
#pragma once
#include <intrin.h>
#define USE_GENERIC_SETGET
#if defined(_i386_)
#pragma intrinsic(_InterlockedIncrement)
#pragma intrinsic(_InterlockedDecrement)
#pragma intrinsic(_InterlockedExchangeAdd)
#pragma intrinsic(_InterlockedExchange)
#pragma intrinsic(_InterlockedCompareExchange)
static inline intptr_t AtomicIncrement(TAtomic& a) {
return _InterlockedIncrement((volatile long*)&a);
}
static inline intptr_t AtomicGetAndIncrement(TAtomic& a) {
return _InterlockedIncrement((volatile long*)&a) - 1;
}
static inline intptr_t AtomicDecrement(TAtomic& a) {
return _InterlockedDecrement((volatile long*)&a);
}
static inline intptr_t AtomicGetAndDecrement(TAtomic& a) {
return _InterlockedDecrement((volatile long*)&a) + 1;
}
static inline intptr_t AtomicAdd(TAtomic& a, intptr_t b) {
return _InterlockedExchangeAdd((volatile long*)&a, b) + b;
}
static inline intptr_t AtomicGetAndAdd(TAtomic& a, intptr_t b) {
return _InterlockedExchangeAdd((volatile long*)&a, b);
}
static inline intptr_t AtomicSwap(TAtomic* a, intptr_t b) {
return _InterlockedExchange((volatile long*)a, b);
}
static inline bool AtomicCas(TAtomic* a, intptr_t exchange, intptr_t compare) {
return _InterlockedCompareExchange((volatile long*)a, exchange, compare) == compare;
}
static inline intptr_t AtomicGetAndCas(TAtomic* a, intptr_t exchange, intptr_t compare) {
return _InterlockedCompareExchange((volatile long*)a, exchange, compare);
}
#else // _x86_64_
#pragma intrinsic(_InterlockedIncrement64)
#pragma intrinsic(_InterlockedDecrement64)
#pragma intrinsic(_InterlockedExchangeAdd64)
#pragma intrinsic(_InterlockedExchange64)
#pragma intrinsic(_InterlockedCompareExchange64)
static inline intptr_t AtomicIncrement(TAtomic& a) {
return _InterlockedIncrement64((volatile __int64*)&a);
}
static inline intptr_t AtomicGetAndIncrement(TAtomic& a) {
return _InterlockedIncrement64((volatile __int64*)&a) - 1;
}
static inline intptr_t AtomicDecrement(TAtomic& a) {
return _InterlockedDecrement64((volatile __int64*)&a);
}
static inline intptr_t AtomicGetAndDecrement(TAtomic& a) {
return _InterlockedDecrement64((volatile __int64*)&a) + 1;
}
static inline intptr_t AtomicAdd(TAtomic& a, intptr_t b) {
return _InterlockedExchangeAdd64((volatile __int64*)&a, b) + b;
}
static inline intptr_t AtomicGetAndAdd(TAtomic& a, intptr_t b) {
return _InterlockedExchangeAdd64((volatile __int64*)&a, b);
}
static inline intptr_t AtomicSwap(TAtomic* a, intptr_t b) {
return _InterlockedExchange64((volatile __int64*)a, b);
}
static inline bool AtomicCas(TAtomic* a, intptr_t exchange, intptr_t compare) {
return _InterlockedCompareExchange64((volatile __int64*)a, exchange, compare) == compare;
}
static inline intptr_t AtomicGetAndCas(TAtomic* a, intptr_t exchange, intptr_t compare) {
return _InterlockedCompareExchange64((volatile __int64*)a, exchange, compare);
}
static inline intptr_t AtomicOr(TAtomic& a, intptr_t b) {
return _InterlockedOr64(&a, b) | b;
}
static inline intptr_t AtomicAnd(TAtomic& a, intptr_t b) {
return _InterlockedAnd64(&a, b) & b;
}
static inline intptr_t AtomicXor(TAtomic& a, intptr_t b) {
return _InterlockedXor64(&a, b) ^ b;
}
#endif // _x86_
//TODO
static inline void AtomicBarrier() {
TAtomic val = 0;
AtomicSwap(&val, 0);
}

View File

@ -0,0 +1,617 @@
#pragma once
// useful cross-platfrom definitions for compilers
/**
* @def Y_FUNC_SIGNATURE
*
* Use this macro to get pretty function name (see example).
*
* @code
* void Hi() {
* Cout << Y_FUNC_SIGNATURE << Endl;
* }
* template <typename T>
* void Do() {
* Cout << Y_FUNC_SIGNATURE << Endl;
* }
* int main() {
* Hi(); // void Hi()
* Do<int>(); // void Do() [T = int]
* Do<TString>(); // void Do() [T = TString]
* }
* @endcode
*/
#if defined(__GNUC__)
#define Y_FUNC_SIGNATURE __PRETTY_FUNCTION__
#elif defined(_MSC_VER)
#define Y_FUNC_SIGNATURE __FUNCSIG__
#else
#define Y_FUNC_SIGNATURE ""
#endif
#ifdef __GNUC__
#define Y_PRINTF_FORMAT(n, m) __attribute__((__format__(__printf__, n, m)))
#endif
#ifndef Y_PRINTF_FORMAT
#define Y_PRINTF_FORMAT(n, m)
#endif
#if defined(__clang__)
#define Y_NO_SANITIZE(...) __attribute__((no_sanitize(__VA_ARGS__)))
#endif
#if !defined(Y_NO_SANITIZE)
#define Y_NO_SANITIZE(...)
#endif
/**
* @def Y_DECLARE_UNUSED
*
* Macro is needed to silence compiler warning about unused entities (e.g. function or argument).
*
* @code
* Y_DECLARE_UNUSED int FunctionUsedSolelyForDebugPurposes();
* assert(FunctionUsedSolelyForDebugPurposes() == 42);
*
* void Foo(const int argumentUsedOnlyForDebugPurposes Y_DECLARE_UNUSED) {
* assert(argumentUsedOnlyForDebugPurposes == 42);
* // however you may as well omit `Y_DECLARE_UNUSED` and use `UNUSED` macro instead
* Y_UNUSED(argumentUsedOnlyForDebugPurposes);
* }
* @endcode
*/
#ifdef __GNUC__
#define Y_DECLARE_UNUSED __attribute__((unused))
#endif
#ifndef Y_DECLARE_UNUSED
#define Y_DECLARE_UNUSED
#endif
#if defined(__GNUC__)
#define Y_LIKELY(Cond) __builtin_expect(!!(Cond), 1)
#define Y_UNLIKELY(Cond) __builtin_expect(!!(Cond), 0)
#define Y_PREFETCH_READ(Pointer, Priority) __builtin_prefetch((const void*)(Pointer), 0, Priority)
#define Y_PREFETCH_WRITE(Pointer, Priority) __builtin_prefetch((const void*)(Pointer), 1, Priority)
#endif
/**
* @def Y_FORCE_INLINE
*
* Macro to use in place of 'inline' in function declaration/definition to force
* it to be inlined.
*/
#if !defined(Y_FORCE_INLINE)
#if defined(CLANG_COVERAGE)
#/* excessive __always_inline__ might significantly slow down compilation of an instrumented unit */
#define Y_FORCE_INLINE inline
#elif defined(_MSC_VER)
#define Y_FORCE_INLINE __forceinline
#elif defined(__GNUC__)
#/* Clang also defines __GNUC__ (as 4) */
#define Y_FORCE_INLINE inline __attribute__((__always_inline__))
#else
#define Y_FORCE_INLINE inline
#endif
#endif
/**
* @def Y_NO_INLINE
*
* Macro to use in place of 'inline' in function declaration/definition to
* prevent it from being inlined.
*/
#if !defined(Y_NO_INLINE)
#if defined(_MSC_VER)
#define Y_NO_INLINE __declspec(noinline)
#elif defined(__GNUC__) || defined(__INTEL_COMPILER)
#/* Clang also defines __GNUC__ (as 4) */
#define Y_NO_INLINE __attribute__((__noinline__))
#else
#define Y_NO_INLINE
#endif
#endif
//to cheat compiler about strict aliasing or similar problems
#if defined(__GNUC__)
#define Y_FAKE_READ(X) \
do { \
__asm__ __volatile__("" \
: \
: "m"(X)); \
} while (0)
#define Y_FAKE_WRITE(X) \
do { \
__asm__ __volatile__("" \
: "=m"(X)); \
} while (0)
#endif
#if !defined(Y_FAKE_READ)
#define Y_FAKE_READ(X)
#endif
#if !defined(Y_FAKE_WRITE)
#define Y_FAKE_WRITE(X)
#endif
#ifndef Y_PREFETCH_READ
#define Y_PREFETCH_READ(Pointer, Priority) (void)(const void*)(Pointer), (void)Priority
#endif
#ifndef Y_PREFETCH_WRITE
#define Y_PREFETCH_WRITE(Pointer, Priority) (void)(const void*)(Pointer), (void)Priority
#endif
#ifndef Y_LIKELY
#define Y_LIKELY(Cond) (Cond)
#define Y_UNLIKELY(Cond) (Cond)
#endif
#ifdef __GNUC__
#define _packed __attribute__((packed))
#else
#define _packed
#endif
#if defined(__GNUC__)
#define Y_WARN_UNUSED_RESULT __attribute__((warn_unused_result))
#endif
#ifndef Y_WARN_UNUSED_RESULT
#define Y_WARN_UNUSED_RESULT
#endif
#if defined(__GNUC__)
#define Y_HIDDEN __attribute__((visibility("hidden")))
#endif
#if !defined(Y_HIDDEN)
#define Y_HIDDEN
#endif
#if defined(__GNUC__)
#define Y_PUBLIC __attribute__((visibility("default")))
#endif
#if !defined(Y_PUBLIC)
#define Y_PUBLIC
#endif
#if !defined(Y_UNUSED) && !defined(__cplusplus)
#define Y_UNUSED(var) (void)(var)
#endif
#if !defined(Y_UNUSED) && defined(__cplusplus)
template <class... Types>
constexpr Y_FORCE_INLINE int Y_UNUSED(Types&&...) {
return 0;
};
#endif
/**
* @def Y_ASSUME
*
* Macro that tells the compiler that it can generate optimized code
* as if the given expression will always evaluate true.
* The behavior is undefined if it ever evaluates false.
*
* @code
* // factored into a function so that it's testable
* inline int Avg(int x, int y) {
* if (x >= 0 && y >= 0) {
* return (static_cast<unsigned>(x) + static_cast<unsigned>(y)) >> 1;
* } else {
* // a slower implementation
* }
* }
*
* // we know that xs and ys are non-negative from domain knowledge,
* // but we can't change the types of xs and ys because of API constrains
* int Foo(const TVector<int>& xs, const TVector<int>& ys) {
* TVector<int> avgs;
* avgs.resize(xs.size());
* for (size_t i = 0; i < xs.size(); ++i) {
* auto x = xs[i];
* auto y = ys[i];
* Y_ASSUME(x >= 0);
* Y_ASSUME(y >= 0);
* xs[i] = Avg(x, y);
* }
* }
* @endcode
*/
#if defined(__GNUC__)
#define Y_ASSUME(condition) ((condition) ? (void)0 : __builtin_unreachable())
#elif defined(_MSC_VER)
#define Y_ASSUME(condition) __assume(condition)
#else
#define Y_ASSUME(condition) Y_UNUSED(condition)
#endif
#ifdef __cplusplus
[[noreturn]]
#endif
Y_HIDDEN void _YandexAbort();
/**
* @def Y_UNREACHABLE
*
* Macro that marks the rest of the code branch unreachable.
* The behavior is undefined if it's ever reached.
*
* @code
* switch (i % 3) {
* case 0:
* return foo;
* case 1:
* return bar;
* case 2:
* return baz;
* default:
* Y_UNREACHABLE();
* }
* @endcode
*/
#if defined(__GNUC__) || defined(_MSC_VER)
#define Y_UNREACHABLE() Y_ASSUME(0)
#else
#define Y_UNREACHABLE() _YandexAbort()
#endif
#if defined(undefined_sanitizer_enabled)
#define _ubsan_enabled_
#endif
#ifdef __clang__
#if __has_feature(thread_sanitizer)
#define _tsan_enabled_
#endif
#if __has_feature(memory_sanitizer)
#define _msan_enabled_
#endif
#if __has_feature(address_sanitizer)
#define _asan_enabled_
#endif
#else
#if defined(thread_sanitizer_enabled) || defined(__SANITIZE_THREAD__)
#define _tsan_enabled_
#endif
#if defined(memory_sanitizer_enabled)
#define _msan_enabled_
#endif
#if defined(address_sanitizer_enabled) || defined(__SANITIZE_ADDRESS__)
#define _asan_enabled_
#endif
#endif
#if defined(_asan_enabled_) || defined(_msan_enabled_) || defined(_tsan_enabled_) || defined(_ubsan_enabled_)
#define _san_enabled_
#endif
#if defined(_MSC_VER)
#define __PRETTY_FUNCTION__ __FUNCSIG__
#endif
#if defined(__GNUC__)
#define Y_WEAK __attribute__((weak))
#else
#define Y_WEAK
#endif
#if defined(__CUDACC_VER_MAJOR__)
#define Y_CUDA_AT_LEAST(x, y) (__CUDACC_VER_MAJOR__ > x || (__CUDACC_VER_MAJOR__ == x && __CUDACC_VER_MINOR__ >= y))
#else
#define Y_CUDA_AT_LEAST(x, y) 0
#endif
// NVidia CUDA C++ Compiler did not know about noexcept keyword until version 9.0
#if !Y_CUDA_AT_LEAST(9, 0)
#if defined(__CUDACC__) && !defined(noexcept)
#define noexcept throw ()
#endif
#endif
#if defined(__GNUC__)
#define Y_COLD __attribute__((cold))
#define Y_LEAF __attribute__((leaf))
#define Y_WRAPPER __attribute__((artificial))
#else
#define Y_COLD
#define Y_LEAF
#define Y_WRAPPER
#endif
/**
* @def Y_PRAGMA
*
* Macro for use in other macros to define compiler pragma
* See below for other usage examples
*
* @code
* #if defined(__clang__) || defined(__GNUC__)
* #define Y_PRAGMA_NO_WSHADOW \
* Y_PRAGMA("GCC diagnostic ignored \"-Wshadow\"")
* #elif defined(_MSC_VER)
* #define Y_PRAGMA_NO_WSHADOW \
* Y_PRAGMA("warning(disable:4456 4457")
* #else
* #define Y_PRAGMA_NO_WSHADOW
* #endif
* @endcode
*/
#if defined(__clang__) || defined(__GNUC__)
#define Y_PRAGMA(x) _Pragma(x)
#elif defined(_MSC_VER)
#define Y_PRAGMA(x) __pragma(x)
#else
#define Y_PRAGMA(x)
#endif
/**
* @def Y_PRAGMA_DIAGNOSTIC_PUSH
*
* Cross-compiler pragma to save diagnostic settings
*
* @see
* GCC: https://gcc.gnu.org/onlinedocs/gcc/Diagnostic-Pragmas.html
* MSVC: https://msdn.microsoft.com/en-us/library/2c8f766e.aspx
* Clang: https://clang.llvm.org/docs/UsersManual.html#controlling-diagnostics-via-pragmas
*
* @code
* Y_PRAGMA_DIAGNOSTIC_PUSH
* @endcode
*/
#if defined(__clang__) || defined(__GNUC__)
#define Y_PRAGMA_DIAGNOSTIC_PUSH \
Y_PRAGMA("GCC diagnostic push")
#elif defined(_MSC_VER)
#define Y_PRAGMA_DIAGNOSTIC_PUSH \
Y_PRAGMA(warning(push))
#else
#define Y_PRAGMA_DIAGNOSTIC_PUSH
#endif
/**
* @def Y_PRAGMA_DIAGNOSTIC_POP
*
* Cross-compiler pragma to restore diagnostic settings
*
* @see
* GCC: https://gcc.gnu.org/onlinedocs/gcc/Diagnostic-Pragmas.html
* MSVC: https://msdn.microsoft.com/en-us/library/2c8f766e.aspx
* Clang: https://clang.llvm.org/docs/UsersManual.html#controlling-diagnostics-via-pragmas
*
* @code
* Y_PRAGMA_DIAGNOSTIC_POP
* @endcode
*/
#if defined(__clang__) || defined(__GNUC__)
#define Y_PRAGMA_DIAGNOSTIC_POP \
Y_PRAGMA("GCC diagnostic pop")
#elif defined(_MSC_VER)
#define Y_PRAGMA_DIAGNOSTIC_POP \
Y_PRAGMA(warning(pop))
#else
#define Y_PRAGMA_DIAGNOSTIC_POP
#endif
/**
* @def Y_PRAGMA_NO_WSHADOW
*
* Cross-compiler pragma to disable warnings about shadowing variables
*
* @code
* Y_PRAGMA_DIAGNOSTIC_PUSH
* Y_PRAGMA_NO_WSHADOW
*
* // some code which use variable shadowing, e.g.:
*
* for (int i = 0; i < 100; ++i) {
* Use(i);
*
* for (int i = 42; i < 100500; ++i) { // this i is shadowing previous i
* AnotherUse(i);
* }
* }
*
* Y_PRAGMA_DIAGNOSTIC_POP
* @endcode
*/
#if defined(__clang__) || defined(__GNUC__)
#define Y_PRAGMA_NO_WSHADOW \
Y_PRAGMA("GCC diagnostic ignored \"-Wshadow\"")
#elif defined(_MSC_VER)
#define Y_PRAGMA_NO_WSHADOW \
Y_PRAGMA(warning(disable : 4456 4457))
#else
#define Y_PRAGMA_NO_WSHADOW
#endif
/**
* @ def Y_PRAGMA_NO_UNUSED_FUNCTION
*
* Cross-compiler pragma to disable warnings about unused functions
*
* @see
* GCC: https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html
* Clang: https://clang.llvm.org/docs/DiagnosticsReference.html#wunused-function
* MSVC: there is no such warning
*
* @code
* Y_PRAGMA_DIAGNOSTIC_PUSH
* Y_PRAGMA_NO_UNUSED_FUNCTION
*
* // some code which introduces a function which later will not be used, e.g.:
*
* void Foo() {
* }
*
* int main() {
* return 0; // Foo() never called
* }
*
* Y_PRAGMA_DIAGNOSTIC_POP
* @endcode
*/
#if defined(__clang__) || defined(__GNUC__)
#define Y_PRAGMA_NO_UNUSED_FUNCTION \
Y_PRAGMA("GCC diagnostic ignored \"-Wunused-function\"")
#else
#define Y_PRAGMA_NO_UNUSED_FUNCTION
#endif
/**
* @ def Y_PRAGMA_NO_UNUSED_PARAMETER
*
* Cross-compiler pragma to disable warnings about unused function parameters
*
* @see
* GCC: https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html
* Clang: https://clang.llvm.org/docs/DiagnosticsReference.html#wunused-parameter
* MSVC: https://msdn.microsoft.com/en-us/library/26kb9fy0.aspx
*
* @code
* Y_PRAGMA_DIAGNOSTIC_PUSH
* Y_PRAGMA_NO_UNUSED_PARAMETER
*
* // some code which introduces a function with unused parameter, e.g.:
*
* void foo(int a) {
* // a is not referenced
* }
*
* int main() {
* foo(1);
* return 0;
* }
*
* Y_PRAGMA_DIAGNOSTIC_POP
* @endcode
*/
#if defined(__clang__) || defined(__GNUC__)
#define Y_PRAGMA_NO_UNUSED_PARAMETER \
Y_PRAGMA("GCC diagnostic ignored \"-Wunused-parameter\"")
#elif defined(_MSC_VER)
#define Y_PRAGMA_NO_UNUSED_PARAMETER \
Y_PRAGMA(warning(disable : 4100))
#else
#define Y_PRAGMA_NO_UNUSED_PARAMETER
#endif
/**
* @def Y_PRAGMA_NO_DEPRECATED
*
* Cross compiler pragma to disable warnings and errors about deprecated
*
* @see
* GCC: https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html
* Clang: https://clang.llvm.org/docs/DiagnosticsReference.html#wdeprecated
* MSVC: https://docs.microsoft.com/en-us/cpp/error-messages/compiler-warnings/compiler-warning-level-3-c4996?view=vs-2017
*
* @code
* Y_PRAGMA_DIAGNOSTIC_PUSH
* Y_PRAGMA_NO_DEPRECATED
*
* [deprecated] void foo() {
* // ...
* }
*
* int main() {
* foo();
* return 0;
* }
*
* Y_PRAGMA_DIAGNOSTIC_POP
* @endcode
*/
#if defined(__clang__) || defined(__GNUC__)
#define Y_PRAGMA_NO_DEPRECATED \
Y_PRAGMA("GCC diagnostic ignored \"-Wdeprecated\"")
#elif defined(_MSC_VER)
#define Y_PRAGMA_NO_DEPRECATED \
Y_PRAGMA(warning(disable : 4996))
#else
#define Y_PRAGMA_NO_DEPRECATED
#endif
#if defined(__clang__) || defined(__GNUC__)
/**
* @def Y_CONST_FUNCTION
methods and functions, marked with this method are promised to:
1. do not have side effects
2. this method do not read global memory
NOTE: this attribute can't be set for methods that depend on data, pointed by this
this allow compilers to do hard optimization of that functions
NOTE: in common case this attribute can't be set if method have pointer-arguments
NOTE: as result there no any reason to discard result of such method
*/
#define Y_CONST_FUNCTION [[gnu::const]]
#endif
#if !defined(Y_CONST_FUNCTION)
#define Y_CONST_FUNCTION
#endif
#if defined(__clang__) || defined(__GNUC__)
/**
* @def Y_PURE_FUNCTION
methods and functions, marked with this method are promised to:
1. do not have side effects
2. result will be the same if no global memory changed
this allow compilers to do hard optimization of that functions
NOTE: as result there no any reason to discard result of such method
*/
#define Y_PURE_FUNCTION [[gnu::pure]]
#endif
#if !defined(Y_PURE_FUNCTION)
#define Y_PURE_FUNCTION
#endif
/**
* @ def Y_HAVE_INT128
*
* Defined when the compiler supports __int128 extension
*
* @code
*
* #if defined(Y_HAVE_INT128)
* __int128 myVeryBigInt = 12345678901234567890;
* #endif
*
* @endcode
*/
#if defined(__SIZEOF_INT128__)
#define Y_HAVE_INT128 1
#endif
/**
* XRAY macro must be passed to compiler if XRay is enabled.
*
* Define everything XRay-specific as a macro so that it doesn't cause errors
* for compilers that doesn't support XRay.
*/
#if defined(XRAY) && defined(__cplusplus)
#include <xray/xray_interface.h>
#define Y_XRAY_ALWAYS_INSTRUMENT [[clang::xray_always_instrument]]
#define Y_XRAY_NEVER_INSTRUMENT [[clang::xray_never_instrument]]
#define Y_XRAY_CUSTOM_EVENT(__string, __length) \
do { \
__xray_customevent(__string, __length); \
} while (0)
#else
#define Y_XRAY_ALWAYS_INSTRUMENT
#define Y_XRAY_NEVER_INSTRUMENT
#define Y_XRAY_CUSTOM_EVENT(__string, __length) \
do { \
} while (0)
#endif

View File

@ -0,0 +1,168 @@
#pragma once
#include "platform.h"
#if defined _unix_
#define LOCSLASH_C '/'
#define LOCSLASH_S "/"
#else
#define LOCSLASH_C '\\'
#define LOCSLASH_S "\\"
#endif // _unix_
#if defined(__INTEL_COMPILER) && defined(__cplusplus)
#include <new>
#endif
// low and high parts of integers
#if !defined(_win_)
#include <sys/param.h>
#endif
#if defined(BSD) || defined(_android_)
#if defined(BSD)
#include <machine/endian.h>
#endif
#if defined(_android_)
#include <endian.h>
#endif
#if (BYTE_ORDER == LITTLE_ENDIAN)
#define _little_endian_
#elif (BYTE_ORDER == BIG_ENDIAN)
#define _big_endian_
#else
#error unknown endian not supported
#endif
#elif (defined(_sun_) && !defined(__i386__)) || defined(_hpux_) || defined(WHATEVER_THAT_HAS_BIG_ENDIAN)
#define _big_endian_
#else
#define _little_endian_
#endif
// alignment
#if (defined(_sun_) && !defined(__i386__)) || defined(_hpux_) || defined(__alpha__) || defined(__ia64__) || defined(WHATEVER_THAT_NEEDS_ALIGNING_QUADS)
#define _must_align8_
#endif
#if (defined(_sun_) && !defined(__i386__)) || defined(_hpux_) || defined(__alpha__) || defined(__ia64__) || defined(WHATEVER_THAT_NEEDS_ALIGNING_LONGS)
#define _must_align4_
#endif
#if (defined(_sun_) && !defined(__i386__)) || defined(_hpux_) || defined(__alpha__) || defined(__ia64__) || defined(WHATEVER_THAT_NEEDS_ALIGNING_SHORTS)
#define _must_align2_
#endif
#if defined(__GNUC__)
#define alias_hack __attribute__((__may_alias__))
#endif
#ifndef alias_hack
#define alias_hack
#endif
#include "types.h"
#if defined(__STDC_VERSION__) && (__STDC_VERSION__ >= 199901L)
#define PRAGMA(x) _Pragma(#x)
#define RCSID(idstr) PRAGMA(comment(exestr, idstr))
#else
#define RCSID(idstr) static const char rcsid[] = idstr
#endif
#include "compiler.h"
#ifdef _win_
#include <malloc.h>
#elif defined(_sun_)
#include <alloca.h>
#endif
#ifdef NDEBUG
#define Y_IF_DEBUG(X)
#else
#define Y_IF_DEBUG(X) X
#endif
/**
* @def Y_ARRAY_SIZE
*
* This macro is needed to get number of elements in a statically allocated fixed size array. The
* expression is a compile-time constant and therefore can be used in compile time computations.
*
* @code
* enum ENumbers {
* EN_ONE,
* EN_TWO,
* EN_SIZE
* }
*
* const char* NAMES[] = {
* "one",
* "two"
* }
*
* static_assert(Y_ARRAY_SIZE(NAMES) == EN_SIZE, "you should define `NAME` for each enumeration");
* @endcode
*
* This macro also catches type errors. If you see a compiler error like "warning: division by zero
* is undefined" when using `Y_ARRAY_SIZE` then you are probably giving it a pointer.
*
* Since all of our code is expected to work on a 64 bit platform where pointers are 8 bytes we may
* falsefully accept pointers to types of sizes that are divisors of 8 (1, 2, 4 and 8).
*/
#if defined(__cplusplus)
namespace NArraySizePrivate {
template <class T>
struct TArraySize;
template <class T, size_t N>
struct TArraySize<T[N]> {
enum {
Result = N
};
};
template <class T, size_t N>
struct TArraySize<T (&)[N]> {
enum {
Result = N
};
};
}
#define Y_ARRAY_SIZE(arr) ((size_t)::NArraySizePrivate::TArraySize<decltype(arr)>::Result)
#else
#undef Y_ARRAY_SIZE
#define Y_ARRAY_SIZE(arr) \
((sizeof(arr) / sizeof((arr)[0])) / static_cast<size_t>(!(sizeof(arr) % sizeof((arr)[0]))))
#endif
#undef Y_ARRAY_BEGIN
#define Y_ARRAY_BEGIN(arr) (arr)
#undef Y_ARRAY_END
#define Y_ARRAY_END(arr) ((arr) + Y_ARRAY_SIZE(arr))
/**
* Concatenates two symbols, even if one of them is itself a macro.
*/
#define Y_CAT(X, Y) Y_CAT_I(X, Y)
#define Y_CAT_I(X, Y) Y_CAT_II(X, Y)
#define Y_CAT_II(X, Y) X##Y
#define Y_STRINGIZE(X) UTIL_PRIVATE_STRINGIZE_AUX(X)
#define UTIL_PRIVATE_STRINGIZE_AUX(X) #X
#if defined(__COUNTER__)
#define Y_GENERATE_UNIQUE_ID(N) Y_CAT(N, __COUNTER__)
#endif
#if !defined(Y_GENERATE_UNIQUE_ID)
#define Y_GENERATE_UNIQUE_ID(N) Y_CAT(N, __LINE__)
#endif
#define NPOS ((size_t)-1)

View File

@ -0,0 +1,242 @@
#pragma once
// What OS ?
// our definition has the form _{osname}_
#if defined(_WIN64)
#define _win64_
#define _win32_
#elif defined(__WIN32__) || defined(_WIN32) // _WIN32 is also defined by the 64-bit compiler for backward compatibility
#define _win32_
#else
#define _unix_
#if defined(__sun__) || defined(sun) || defined(sparc) || defined(__sparc)
#define _sun_
#endif
#if defined(__hpux__)
#define _hpux_
#endif
#if defined(__linux__)
#define _linux_
#endif
#if defined(__FreeBSD__)
#define _freebsd_
#endif
#if defined(__CYGWIN__)
#define _cygwin_
#endif
#if defined(__APPLE__)
#define _darwin_
#endif
#if defined(__ANDROID__)
#define _android_
#endif
#endif
#if defined(__IOS__)
#define _ios_
#endif
#if defined(_linux_)
#if defined(_musl_)
//nothing to do
#elif defined(_android_)
#define _bionic_
#else
#define _glibc_
#endif
#endif
#if defined(_darwin_)
#define unix
#define __unix__
#endif
#if defined(_win32_) || defined(_win64_)
#define _win_
#endif
#if defined(__arm__) || defined(__ARM__) || defined(__ARM_NEON) || defined(__aarch64__) || defined(_M_ARM)
#if defined(__arm64) || defined(__arm64__) || defined(__aarch64__)
#define _arm64_
#else
#define _arm32_
#endif
#endif
#if defined(_arm64_) || defined(_arm32_)
#define _arm_
#endif
/* __ia64__ and __x86_64__ - defined by GNU C.
* _M_IA64, _M_X64, _M_AMD64 - defined by Visual Studio.
*
* Microsoft can define _M_IX86, _M_AMD64 (before Visual Studio 8)
* or _M_X64 (starting in Visual Studio 8).
*/
#if defined(__x86_64__) || defined(_M_X64) || defined(_M_AMD64)
#define _x86_64_
#endif
#if defined(__i386__) || defined(_M_IX86)
#define _i386_
#endif
#if defined(__ia64__) || defined(_M_IA64)
#define _ia64_
#endif
#if defined(__powerpc__)
#define _ppc_
#endif
#if defined(__powerpc64__)
#define _ppc64_
#endif
#if !defined(sparc) && !defined(__sparc) && !defined(__hpux__) && !defined(__alpha__) && !defined(_ia64_) && !defined(_x86_64_) && !defined(_arm_) && !defined(_i386_) && !defined(_ppc_) && !defined(_ppc64_)
#error "platform not defined, please, define one"
#endif
#if defined(_x86_64_) || defined(_i386_)
#define _x86_
#endif
#if defined(__MIC__)
#define _mic_
#define _k1om_
#endif
// stdio or MessageBox
#if defined(__CONSOLE__) || defined(_CONSOLE)
#define _console_
#endif
#if (defined(_win_) && !defined(_console_))
#define _windows_
#elif !defined(_console_)
#define _console_
#endif
#if defined(__SSE__) || defined(SSE_ENABLED)
#define _sse_
#endif
#if defined(__SSE2__) || defined(SSE2_ENABLED)
#define _sse2_
#endif
#if defined(__SSE3__) || defined(SSE3_ENABLED)
#define _sse3_
#endif
#if defined(__SSSE3__) || defined(SSSE3_ENABLED)
#define _ssse3_
#endif
#if defined(POPCNT_ENABLED)
#define _popcnt_
#endif
#if defined(__DLL__) || defined(_DLL)
#define _dll_
#endif
// 16, 32 or 64
#if defined(__sparc_v9__) || defined(_x86_64_) || defined(_ia64_) || defined(_arm64_) || defined(_ppc64_)
#define _64_
#else
#define _32_
#endif
/* All modern 64-bit Unix systems use scheme LP64 (long, pointers are 64-bit).
* Microsoft uses a different scheme: LLP64 (long long, pointers are 64-bit).
*
* Scheme LP64 LLP64
* char 8 8
* short 16 16
* int 32 32
* long 64 32
* long long 64 64
* pointer 64 64
*/
#if defined(_32_)
#define SIZEOF_PTR 4
#elif defined(_64_)
#define SIZEOF_PTR 8
#endif
#define PLATFORM_DATA_ALIGN SIZEOF_PTR
#if !defined(SIZEOF_PTR)
#error todo
#endif
#define SIZEOF_CHAR 1
#define SIZEOF_UNSIGNED_CHAR 1
#define SIZEOF_SHORT 2
#define SIZEOF_UNSIGNED_SHORT 2
#define SIZEOF_INT 4
#define SIZEOF_UNSIGNED_INT 4
#if defined(_32_)
#define SIZEOF_LONG 4
#define SIZEOF_UNSIGNED_LONG 4
#elif defined(_64_)
#if defined(_win_)
#define SIZEOF_LONG 4
#define SIZEOF_UNSIGNED_LONG 4
#else
#define SIZEOF_LONG 8
#define SIZEOF_UNSIGNED_LONG 8
#endif // _win_
#endif // _32_
#if !defined(SIZEOF_LONG)
#error todo
#endif
#define SIZEOF_LONG_LONG 8
#define SIZEOF_UNSIGNED_LONG_LONG 8
#undef SIZEOF_SIZE_T // in case we include <Python.h> which defines it, too
#define SIZEOF_SIZE_T SIZEOF_PTR
#if defined(__INTEL_COMPILER)
#pragma warning(disable 1292)
#pragma warning(disable 1469)
#pragma warning(disable 193)
#pragma warning(disable 271)
#pragma warning(disable 383)
#pragma warning(disable 424)
#pragma warning(disable 444)
#pragma warning(disable 584)
#pragma warning(disable 593)
#pragma warning(disable 981)
#pragma warning(disable 1418)
#pragma warning(disable 304)
#pragma warning(disable 810)
#pragma warning(disable 1029)
#pragma warning(disable 1419)
#pragma warning(disable 177)
#pragma warning(disable 522)
#pragma warning(disable 858)
#pragma warning(disable 111)
#pragma warning(disable 1599)
#pragma warning(disable 411)
#pragma warning(disable 304)
#pragma warning(disable 858)
#pragma warning(disable 444)
#pragma warning(disable 913)
#pragma warning(disable 310)
#pragma warning(disable 167)
#pragma warning(disable 180)
#pragma warning(disable 1572)
#endif
#if defined(_MSC_VER)
#undef _WINSOCKAPI_
#define _WINSOCKAPI_
#undef NOMINMAX
#define NOMINMAX
#endif

View File

@ -0,0 +1,117 @@
#pragma once
// DO_NOT_STYLE
#include "platform.h"
#include <inttypes.h>
typedef int8_t i8;
typedef int16_t i16;
typedef uint8_t ui8;
typedef uint16_t ui16;
typedef int yssize_t;
#define PRIYSZT "d"
#if defined(_darwin_) && defined(_32_)
typedef unsigned long ui32;
typedef long i32;
#else
typedef uint32_t ui32;
typedef int32_t i32;
#endif
#if defined(_darwin_) && defined(_64_)
typedef unsigned long ui64;
typedef long i64;
#else
typedef uint64_t ui64;
typedef int64_t i64;
#endif
#define LL(number) INT64_C(number)
#define ULL(number) UINT64_C(number)
// Macro for size_t and ptrdiff_t types
#if defined(_32_)
# if defined(_darwin_)
# define PRISZT "lu"
# undef PRIi32
# define PRIi32 "li"
# undef SCNi32
# define SCNi32 "li"
# undef PRId32
# define PRId32 "li"
# undef SCNd32
# define SCNd32 "li"
# undef PRIu32
# define PRIu32 "lu"
# undef SCNu32
# define SCNu32 "lu"
# undef PRIx32
# define PRIx32 "lx"
# undef SCNx32
# define SCNx32 "lx"
# elif !defined(_cygwin_)
# define PRISZT PRIu32
# else
# define PRISZT "u"
# endif
# define SCNSZT SCNu32
# define PRIPDT PRIi32
# define SCNPDT SCNi32
# define PRITMT PRIi32
# define SCNTMT SCNi32
#elif defined(_64_)
# if defined(_darwin_)
# define PRISZT "lu"
# undef PRIu64
# define PRIu64 PRISZT
# undef PRIx64
# define PRIx64 "lx"
# undef PRIX64
# define PRIX64 "lX"
# undef PRId64
# define PRId64 "ld"
# undef PRIi64
# define PRIi64 "li"
# undef SCNi64
# define SCNi64 "li"
# undef SCNu64
# define SCNu64 "lu"
# undef SCNx64
# define SCNx64 "lx"
# else
# define PRISZT PRIu64
# endif
# define SCNSZT SCNu64
# define PRIPDT PRIi64
# define SCNPDT SCNi64
# define PRITMT PRIi64
# define SCNTMT SCNi64
#else
# error "Unsupported platform"
#endif
// SUPERLONG
#if !defined(DONT_USE_SUPERLONG) && !defined(SUPERLONG_MAX)
#define SUPERLONG_MAX ~LL(0)
typedef i64 SUPERLONG;
#endif
// UNICODE
// UCS-2, native byteorder
typedef ui16 wchar16;
// internal symbol type: UTF-16LE
typedef wchar16 TChar;
typedef ui32 wchar32;
#if defined(_MSC_VER)
#include <basetsd.h>
typedef SSIZE_T ssize_t;
#define HAVE_SSIZE_T 1
#include <wchar.h>
#endif
#include <sys/types.h>

View File

@ -1,4 +1,4 @@
add_library (btrie
add_library(btrie
src/btrie.c
include/btrie.h
)

View File

@ -183,7 +183,7 @@ set(SRCS
)
# target
add_library(hdfs3 STATIC ${SRCS} ${PROTO_SOURCES} ${PROTO_HEADERS})
add_library(hdfs3 ${SRCS} ${PROTO_SOURCES} ${PROTO_HEADERS})
if (USE_INTERNAL_PROTOBUF_LIBRARY)
add_dependencies(hdfs3 protoc)

View File

@ -18,6 +18,7 @@
#define METROHASH_PLATFORM_H
#include <stdint.h>
#include <cstring>
// rotate right idiom recognized by most compilers
inline static uint64_t rotate_right(uint64_t v, unsigned k)
@ -25,20 +26,28 @@ inline static uint64_t rotate_right(uint64_t v, unsigned k)
return (v >> k) | (v << (64 - k));
}
// unaligned reads, fast and safe on Nehalem and later microarchitectures
inline static uint64_t read_u64(const void * const ptr)
{
return static_cast<uint64_t>(*reinterpret_cast<const uint64_t*>(ptr));
uint64_t result;
// Assignment like `result = *reinterpret_cast<const uint64_t *>(ptr)` here would mean undefined behaviour (unaligned read),
// so we use memcpy() which is the most portable. clang & gcc usually translates `memcpy()` into a single `load` instruction
// when hardware supports it, so using memcpy() is efficient too.
memcpy(&result, ptr, sizeof(result));
return result;
}
inline static uint64_t read_u32(const void * const ptr)
{
return static_cast<uint64_t>(*reinterpret_cast<const uint32_t*>(ptr));
uint32_t result;
memcpy(&result, ptr, sizeof(result));
return result;
}
inline static uint64_t read_u16(const void * const ptr)
{
return static_cast<uint64_t>(*reinterpret_cast<const uint16_t*>(ptr));
uint16_t result;
memcpy(&result, ptr, sizeof(result));
return result;
}
inline static uint64_t read_u8 (const void * const ptr)

View File

@ -54,7 +54,7 @@ set(SRCS
${RDKAFKA_SOURCE_DIR}/rdgz.c
)
add_library(rdkafka ${LINK_MODE} ${SRCS})
add_library(rdkafka ${SRCS})
target_include_directories(rdkafka SYSTEM PUBLIC include)
target_include_directories(rdkafka SYSTEM PUBLIC ${RDKAFKA_SOURCE_DIR}) # Because weird logic with "include_next" is used.
target_include_directories(rdkafka SYSTEM PRIVATE ${ZSTD_INCLUDE_DIR}/common) # Because wrong path to "zstd_errors.h" is used.

View File

@ -1,62 +1,56 @@
set(LIBXML2_SOURCE_DIR ${CMAKE_SOURCE_DIR}/contrib/libxml2)
set(LIBXML2_BINARY_DIR ${CMAKE_BINARY_DIR}/contrib/libxml2)
set(SRCS
${LIBXML2_SOURCE_DIR}/parser.c
${LIBXML2_SOURCE_DIR}/HTMLparser.c
${LIBXML2_SOURCE_DIR}/buf.c
${LIBXML2_SOURCE_DIR}/xzlib.c
${LIBXML2_SOURCE_DIR}/xmlregexp.c
${LIBXML2_SOURCE_DIR}/entities.c
${LIBXML2_SOURCE_DIR}/rngparser.c
${LIBXML2_SOURCE_DIR}/encoding.c
${LIBXML2_SOURCE_DIR}/legacy.c
${LIBXML2_SOURCE_DIR}/error.c
${LIBXML2_SOURCE_DIR}/debugXML.c
${LIBXML2_SOURCE_DIR}/xpointer.c
${LIBXML2_SOURCE_DIR}/DOCBparser.c
${LIBXML2_SOURCE_DIR}/xmlcatalog.c
${LIBXML2_SOURCE_DIR}/c14n.c
${LIBXML2_SOURCE_DIR}/xmlreader.c
${LIBXML2_SOURCE_DIR}/xmlstring.c
${LIBXML2_SOURCE_DIR}/dict.c
${LIBXML2_SOURCE_DIR}/xpath.c
${LIBXML2_SOURCE_DIR}/tree.c
${LIBXML2_SOURCE_DIR}/trionan.c
${LIBXML2_SOURCE_DIR}/pattern.c
${LIBXML2_SOURCE_DIR}/globals.c
${LIBXML2_SOURCE_DIR}/xmllint.c
${LIBXML2_SOURCE_DIR}/chvalid.c
${LIBXML2_SOURCE_DIR}/relaxng.c
${LIBXML2_SOURCE_DIR}/list.c
${LIBXML2_SOURCE_DIR}/xinclude.c
${LIBXML2_SOURCE_DIR}/xmlIO.c
${LIBXML2_SOURCE_DIR}/triostr.c
${LIBXML2_SOURCE_DIR}/hash.c
${LIBXML2_SOURCE_DIR}/xmlsave.c
${LIBXML2_SOURCE_DIR}/HTMLtree.c
${LIBXML2_SOURCE_DIR}/SAX.c
${LIBXML2_SOURCE_DIR}/xmlschemas.c
${LIBXML2_SOURCE_DIR}/SAX2.c
${LIBXML2_SOURCE_DIR}/threads.c
${LIBXML2_SOURCE_DIR}/runsuite.c
${LIBXML2_SOURCE_DIR}/catalog.c
${LIBXML2_SOURCE_DIR}/uri.c
${LIBXML2_SOURCE_DIR}/xmlmodule.c
${LIBXML2_SOURCE_DIR}/xlink.c
${LIBXML2_SOURCE_DIR}/entities.c
${LIBXML2_SOURCE_DIR}/encoding.c
${LIBXML2_SOURCE_DIR}/error.c
${LIBXML2_SOURCE_DIR}/parserInternals.c
${LIBXML2_SOURCE_DIR}/xmlwriter.c
${LIBXML2_SOURCE_DIR}/xmlunicode.c
${LIBXML2_SOURCE_DIR}/runxmlconf.c
${LIBXML2_SOURCE_DIR}/parser.c
${LIBXML2_SOURCE_DIR}/tree.c
${LIBXML2_SOURCE_DIR}/hash.c
${LIBXML2_SOURCE_DIR}/list.c
${LIBXML2_SOURCE_DIR}/xmlIO.c
${LIBXML2_SOURCE_DIR}/xmlmemory.c
${LIBXML2_SOURCE_DIR}/nanoftp.c
${LIBXML2_SOURCE_DIR}/xmlschemastypes.c
${LIBXML2_SOURCE_DIR}/uri.c
${LIBXML2_SOURCE_DIR}/valid.c
${LIBXML2_SOURCE_DIR}/xlink.c
${LIBXML2_SOURCE_DIR}/HTMLparser.c
${LIBXML2_SOURCE_DIR}/HTMLtree.c
${LIBXML2_SOURCE_DIR}/debugXML.c
${LIBXML2_SOURCE_DIR}/xpath.c
${LIBXML2_SOURCE_DIR}/xpointer.c
${LIBXML2_SOURCE_DIR}/xinclude.c
${LIBXML2_SOURCE_DIR}/nanohttp.c
${LIBXML2_SOURCE_DIR}/nanoftp.c
${LIBXML2_SOURCE_DIR}/DOCBparser.c
${LIBXML2_SOURCE_DIR}/catalog.c
${LIBXML2_SOURCE_DIR}/globals.c
${LIBXML2_SOURCE_DIR}/threads.c
${LIBXML2_SOURCE_DIR}/c14n.c
${LIBXML2_SOURCE_DIR}/xmlstring.c
${LIBXML2_SOURCE_DIR}/buf.c
${LIBXML2_SOURCE_DIR}/xmlregexp.c
${LIBXML2_SOURCE_DIR}/xmlschemas.c
${LIBXML2_SOURCE_DIR}/xmlschemastypes.c
${LIBXML2_SOURCE_DIR}/xmlunicode.c
${LIBXML2_SOURCE_DIR}/triostr.c
#${LIBXML2_SOURCE_DIR}/trio.c
${LIBXML2_SOURCE_DIR}/xmlreader.c
${LIBXML2_SOURCE_DIR}/relaxng.c
${LIBXML2_SOURCE_DIR}/dict.c
${LIBXML2_SOURCE_DIR}/SAX2.c
${LIBXML2_SOURCE_DIR}/xmlwriter.c
${LIBXML2_SOURCE_DIR}/legacy.c
${LIBXML2_SOURCE_DIR}/chvalid.c
${LIBXML2_SOURCE_DIR}/pattern.c
${LIBXML2_SOURCE_DIR}/xmlsave.c
${LIBXML2_SOURCE_DIR}/xmlmodule.c
${LIBXML2_SOURCE_DIR}/schematron.c
${LIBXML2_SOURCE_DIR}/xzlib.c
)
add_library(libxml2 STATIC ${SRCS})
add_library(libxml2 ${SRCS})
target_link_libraries(libxml2 ${ZLIB_LIBRARIES})

2
contrib/lz4 vendored

@ -1 +1 @@
Subproject commit c10863b98e1503af90616ae99725ecd120265dfb
Subproject commit 7a4e3b1fac5cd9d4ec7c8d0091329ba107ec2131

View File

@ -9,8 +9,7 @@ add_library (lz4
${LIBRARY_DIR}/xxhash.h
${LIBRARY_DIR}/lz4.h
${LIBRARY_DIR}/lz4hc.h
${LIBRARY_DIR}/lz4opt.h)
${LIBRARY_DIR}/lz4hc.h)
target_compile_definitions(lz4 PUBLIC LZ4_DISABLE_DEPRECATE_WARNINGS=1)

View File

@ -2,7 +2,7 @@ set(MARIADB_CLIENT_SOURCE_DIR ${CMAKE_SOURCE_DIR}/contrib/mariadb-connector-c)
set(MARIADB_CLIENT_BINARY_DIR ${CMAKE_BINARY_DIR}/contrib/mariadb-connector-c)
set(SRCS
${MARIADB_CLIENT_SOURCE_DIR}/libmariadb/bmove_upp.c
#${MARIADB_CLIENT_SOURCE_DIR}/libmariadb/bmove_upp.c
${MARIADB_CLIENT_SOURCE_DIR}/libmariadb/get_password.c
${MARIADB_CLIENT_SOURCE_DIR}/libmariadb/ma_alloc.c
${MARIADB_CLIENT_SOURCE_DIR}/libmariadb/ma_array.c
@ -58,10 +58,10 @@ if(OPENSSL_LIBRARIES)
list(APPEND SRCS ${MARIADB_CLIENT_SOURCE_DIR}/libmariadb/secure/openssl.c)
endif()
add_library(mysqlclient STATIC ${SRCS})
add_library(mysqlclient ${SRCS})
if(OPENSSL_LIBRARIES)
target_link_libraries(mysqlclient ${OPENSSL_LIBRARIES})
target_link_libraries(mysqlclient PRIVATE ${OPENSSL_LIBRARIES})
target_compile_definitions(mysqlclient PRIVATE -D HAVE_OPENSSL -D HAVE_TLS)
endif()

2
contrib/poco vendored

@ -1 +1 @@
Subproject commit fe5505e56c27b6ecb0dcbc40c49dc2caf4e9637f
Subproject commit 29439cf7fa32c1a2d62d925bb6d6a3f14668a4a2

View File

@ -10,7 +10,7 @@ foreach (src ${RE2_SOURCES_})
list(APPEND RE2_ST_SOURCES ${RE2_SOURCE_DIR}/${src})
endforeach ()
add_library (re2_st ${RE2_ST_SOURCES})
add_library(re2_st ${RE2_ST_SOURCES})
target_compile_definitions (re2_st PRIVATE NDEBUG NO_THREADS re2=re2_st)
target_include_directories (re2_st PRIVATE .)
target_include_directories (re2_st SYSTEM PUBLIC ${CMAKE_CURRENT_BINARY_DIR} ${RE2_SOURCE_DIR})

View File

@ -23,7 +23,7 @@ ${ODBC_SOURCE_DIR}/libltdl/loaders/preopen.c
${CMAKE_CURRENT_SOURCE_DIR}/linux_x86_64/libltdl/libltdlcS.c
)
add_library(ltdl ${LINK_MODE} ${SRCS})
add_library(ltdl ${SRCS})
target_include_directories(ltdl PRIVATE ${CMAKE_CURRENT_SOURCE_DIR}/linux_x86_64/libltdl)
target_include_directories(ltdl PUBLIC ${ODBC_SOURCE_DIR}/libltdl)
@ -273,15 +273,15 @@ ${ODBC_SOURCE_DIR}/lst/lstSetFreeFunc.c
${ODBC_SOURCE_DIR}/lst/_lstVisible.c
)
add_library(unixodbc ${LINK_MODE} ${SRCS})
add_library(unixodbc ${SRCS})
target_link_libraries(unixodbc ltdl)
target_link_libraries(unixodbc PRIVATE ltdl)
# SYSTEM_FILE_PATH was changed to /etc
target_include_directories(unixodbc PRIVATE ${CMAKE_CURRENT_SOURCE_DIR}/linux_x86_64/private)
target_include_directories(unixodbc PUBLIC ${CMAKE_CURRENT_SOURCE_DIR}/linux_x86_64)
target_include_directories(unixodbc PUBLIC ${ODBC_SOURCE_DIR}/include)
target_include_directories(unixodbc SYSTEM PRIVATE ${CMAKE_CURRENT_SOURCE_DIR}/linux_x86_64/private)
target_include_directories(unixodbc SYSTEM PUBLIC ${CMAKE_CURRENT_SOURCE_DIR}/linux_x86_64)
target_include_directories(unixodbc SYSTEM PUBLIC ${ODBC_SOURCE_DIR}/include)
target_compile_definitions(unixodbc PRIVATE -DHAVE_CONFIG_H)

View File

@ -125,6 +125,6 @@ IF (ZSTD_LEGACY_SUPPORT)
${LIBRARY_LEGACY_DIR}/zstd_v07.h)
ENDIF (ZSTD_LEGACY_SUPPORT)
ADD_LIBRARY(zstd ${LINK_MODE} ${Sources} ${Headers})
ADD_LIBRARY(zstd ${Sources} ${Headers})
target_include_directories (zstd PUBLIC ${LIBRARY_DIR})

View File

@ -134,7 +134,7 @@ list (APPEND dbms_headers src/TableFunctions/ITableFunction.h src/TableFunctio
list (APPEND dbms_sources src/Dictionaries/DictionaryFactory.cpp src/Dictionaries/DictionarySourceFactory.cpp src/Dictionaries/DictionaryStructure.cpp)
list (APPEND dbms_headers src/Dictionaries/DictionaryFactory.h src/Dictionaries/DictionarySourceFactory.h src/Dictionaries/DictionaryStructure.h)
add_library(clickhouse_common_io ${LINK_MODE} ${clickhouse_common_io_headers} ${clickhouse_common_io_sources})
add_library(clickhouse_common_io ${clickhouse_common_io_headers} ${clickhouse_common_io_sources})
if (OS_FREEBSD)
target_compile_definitions (clickhouse_common_io PUBLIC CLOCK_MONOTONIC_COARSE=CLOCK_MONOTONIC_FAST)
@ -155,7 +155,6 @@ if (USE_EMBEDDED_COMPILER)
target_include_directories (dbms SYSTEM BEFORE PUBLIC ${LLVM_INCLUDE_DIRS})
endif ()
if (CMAKE_BUILD_TYPE_UC STREQUAL "RELEASE" OR CMAKE_BUILD_TYPE_UC STREQUAL "RELWITHDEBINFO" OR CMAKE_BUILD_TYPE_UC STREQUAL "MINSIZEREL")
# Won't generate debug info for files with heavy template instantiation to achieve faster linking and lower size.
set_source_files_properties(
@ -186,8 +185,6 @@ target_link_libraries (clickhouse_common_io
${LINK_LIBRARIES_ONLY_ON_X86_64}
PUBLIC
${DOUBLE_CONVERSION_LIBRARIES}
PRIVATE
pocoext
PUBLIC
${Poco_Net_LIBRARY}
${Poco_Util_LIBRARY}
@ -197,8 +194,7 @@ target_link_libraries (clickhouse_common_io
${CITYHASH_LIBRARIES}
PRIVATE
${ZLIB_LIBRARIES}
${EXECINFO_LIBRARY}
${ELF_LIBRARY}
${EXECINFO_LIBRARIES}
PUBLIC
${Boost_SYSTEM_LIBRARY}
PRIVATE
@ -214,6 +210,10 @@ target_link_libraries (clickhouse_common_io
target_include_directories(clickhouse_common_io SYSTEM BEFORE PUBLIC ${RE2_INCLUDE_DIR})
if (USE_LFALLOC)
target_include_directories (clickhouse_common_io SYSTEM BEFORE PUBLIC ${LFALLOC_INCLUDE_DIR})
endif ()
if(CPUID_LIBRARY)
target_link_libraries(clickhouse_common_io PRIVATE ${CPUID_LIBRARY})
endif()
@ -223,8 +223,9 @@ if(CPUINFO_LIBRARY)
endif()
target_link_libraries (dbms
PRIVATE
PUBLIC
clickhouse_compression
PRIVATE
clickhouse_parsers
clickhouse_common_config
PUBLIC
@ -232,7 +233,6 @@ target_link_libraries (dbms
PRIVATE
clickhouse_dictionaries_embedded
PUBLIC
pocoext
${MYSQLXX_LIBRARY}
PRIVATE
${BTRIE_LIBRARIES}
@ -257,8 +257,8 @@ if (USE_POCO_SQLODBC)
target_link_libraries (clickhouse_common_io PRIVATE ${Poco_SQL_LIBRARY})
target_link_libraries (dbms PRIVATE ${Poco_SQLODBC_LIBRARY} ${Poco_SQL_LIBRARY})
if (NOT USE_INTERNAL_POCO_LIBRARY)
target_include_directories (clickhouse_common_io SYSTEM PRIVATE ${ODBC_INCLUDE_DIRECTORIES} ${Poco_SQL_INCLUDE_DIR})
target_include_directories (dbms SYSTEM PRIVATE ${ODBC_INCLUDE_DIRECTORIES} ${Poco_SQLODBC_INCLUDE_DIR} PUBLIC ${Poco_SQL_INCLUDE_DIR})
target_include_directories (clickhouse_common_io SYSTEM PRIVATE ${ODBC_INCLUDE_DIRS} ${Poco_SQL_INCLUDE_DIR})
target_include_directories (dbms SYSTEM PRIVATE ${ODBC_INCLUDE_DIRS} ${Poco_SQLODBC_INCLUDE_DIR} SYSTEM PUBLIC ${Poco_SQL_INCLUDE_DIR})
endif()
endif()
@ -271,7 +271,7 @@ if (USE_POCO_DATAODBC)
target_link_libraries (clickhouse_common_io PRIVATE ${Poco_Data_LIBRARY})
target_link_libraries (dbms PRIVATE ${Poco_DataODBC_LIBRARY})
if (NOT USE_INTERNAL_POCO_LIBRARY)
target_include_directories (dbms SYSTEM PRIVATE ${ODBC_INCLUDE_DIRECTORIES} ${Poco_DataODBC_INCLUDE_DIR})
target_include_directories (dbms SYSTEM PRIVATE ${ODBC_INCLUDE_DIRS} ${Poco_DataODBC_INCLUDE_DIR})
endif()
endif()

View File

@ -1,11 +1,11 @@
# This strings autochanged from release_lib.sh:
set(VERSION_REVISION 54418)
set(VERSION_REVISION 54420)
set(VERSION_MAJOR 19)
set(VERSION_MINOR 6)
set(VERSION_MINOR 8)
set(VERSION_PATCH 1)
set(VERSION_GITHASH 30d3496c36cf3945c9828ac0b7cf7d1774a9f845)
set(VERSION_DESCRIBE v19.6.1.1-testing)
set(VERSION_STRING 19.6.1.1)
set(VERSION_GITHASH a76e504f45ff4a74e8c492bd269f022352d5f6d9)
set(VERSION_DESCRIBE v19.8.1.1-testing)
set(VERSION_STRING 19.8.1.1)
# end of autochange
set(VERSION_EXTRA "" CACHE STRING "")

View File

@ -44,7 +44,7 @@ macro(clickhouse_program_add_library name)
set(CLICKHOUSE_${name_uc}_INCLUDE ${CLICKHOUSE_${name_uc}_INCLUDE} PARENT_SCOPE)
if(NOT CLICKHOUSE_ONE_SHARED)
add_library(clickhouse-${name}-lib ${LINK_MODE} ${CLICKHOUSE_${name_uc}_SOURCES})
add_library(clickhouse-${name}-lib ${CLICKHOUSE_${name_uc}_SOURCES})
set(_link ${CLICKHOUSE_${name_uc}_LINK}) # can't use ${} in if()
if(_link)
@ -209,11 +209,6 @@ else ()
install (FILES ${CMAKE_CURRENT_BINARY_DIR}/clickhouse-obfuscator DESTINATION ${CMAKE_INSTALL_BINDIR} COMPONENT clickhouse)
list(APPEND CLICKHOUSE_BUNDLE clickhouse-obfuscator)
endif ()
if (ENABLE_CLICKHOUSE_ODBC_BRIDGE)
# just to be able to run integration tests
add_custom_target (clickhouse-odbc-bridge-copy ALL COMMAND ${CMAKE_COMMAND} -E create_symlink ${CMAKE_CURRENT_BINARY_DIR}/odbc-bridge/clickhouse-odbc-bridge clickhouse-odbc-bridge DEPENDS clickhouse-odbc-bridge)
endif ()
# install always because depian package want this files:
add_custom_target (clickhouse-clang ALL COMMAND ${CMAKE_COMMAND} -E create_symlink clickhouse clickhouse-clang DEPENDS clickhouse)

View File

@ -439,7 +439,7 @@ int mainEntryClickHouseBenchmark(int argc, char ** argv)
("help", "produce help message")
("concurrency,c", value<unsigned>()->default_value(1), "number of parallel queries")
("delay,d", value<double>()->default_value(1), "delay between intermediate reports in seconds (set 0 to disable reports)")
("stage", value<std::string>()->default_value("complete"), "request query processing up to specified stage")
("stage", value<std::string>()->default_value("complete"), "request query processing up to specified stage: complete,fetch_columns,with_mergeable_state")
("iterations,i", value<size_t>()->default_value(0), "amount of queries to be executed")
("timelimit,t", value<double>()->default_value(0.), "stop launch of queries after specified time limit")
("randomize,r", value<bool>()->default_value(false), "randomize order of execution")
@ -451,14 +451,14 @@ int mainEntryClickHouseBenchmark(int argc, char ** argv)
("password", value<std::string>()->default_value(""), "")
("database", value<std::string>()->default_value("default"), "")
("stacktrace", "print stack traces of exceptions")
#define DECLARE_SETTING(TYPE, NAME, DEFAULT, DESCRIPTION) (#NAME, boost::program_options::value<std::string> (), DESCRIPTION)
APPLY_FOR_SETTINGS(DECLARE_SETTING)
#undef DECLARE_SETTING
;
Settings settings;
settings.addProgramOptions(desc);
boost::program_options::variables_map options;
boost::program_options::store(boost::program_options::parse_command_line(argc, argv, desc), options);
boost::program_options::notify(options);
if (options.count("help"))
{
@ -469,15 +469,6 @@ int mainEntryClickHouseBenchmark(int argc, char ** argv)
print_stacktrace = options.count("stacktrace");
/// Extract `settings` and `limits` from received `options`
Settings settings;
#define EXTRACT_SETTING(TYPE, NAME, DEFAULT, DESCRIPTION) \
if (options.count(#NAME)) \
settings.set(#NAME, options[#NAME].as<std::string>());
APPLY_FOR_SETTINGS(EXTRACT_SETTING)
#undef EXTRACT_SETTING
UseSSL use_ssl;
Benchmark benchmark(

View File

@ -2,7 +2,7 @@ add_definitions(-Wno-error -Wno-unused-parameter -Wno-non-virtual-dtor -U_LIBCPP
link_directories(${LLVM_LIBRARY_DIRS})
add_library(clickhouse-compiler-lib ${LINK_MODE}
add_library(clickhouse-compiler-lib
driver.cpp
cc1_main.cpp
cc1as_main.cpp
@ -46,7 +46,7 @@ LLVMSupport
#PollyISL
#PollyPPCG
PUBLIC ${ZLIB_LIBRARIES} ${EXECINFO_LIBRARY} Threads::Threads
PUBLIC ${ZLIB_LIBRARIES} ${EXECINFO_LIBRARIES} Threads::Threads
${MALLOC_LIBRARIES}
${GLIBC_COMPATIBILITY_LIBRARIES}
${MEMCPY_LIBRARIES}

View File

@ -1,8 +1,9 @@
add_definitions(-Wno-error -Wno-unused-parameter -Wno-non-virtual-dtor -U_LIBCPP_DEBUG)
link_directories(${LLVM_LIBRARY_DIRS})
add_library(clickhouse-compiler-lib ${LINK_MODE}
add_library(clickhouse-compiler-lib
driver.cpp
cc1_main.cpp
cc1as_main.cpp
@ -46,7 +47,7 @@ ${REQUIRED_LLVM_LIBRARIES}
#PollyISL
#PollyPPCG
PUBLIC ${ZLIB_LIBRARIES} ${EXECINFO_LIBRARY} Threads::Threads
PUBLIC ${ZLIB_LIBRARIES} ${EXECINFO_LIBRARIES} Threads::Threads
${MALLOC_LIBRARIES}
${GLIBC_COMPATIBILITY_LIBRARIES}
${MEMCPY_LIBRARIES}

View File

@ -2,7 +2,7 @@ add_definitions(-Wno-error -Wno-unused-parameter -Wno-non-virtual-dtor -U_LIBCPP
link_directories(${LLVM_LIBRARY_DIRS})
add_library(clickhouse-compiler-lib ${LINK_MODE}
add_library(clickhouse-compiler-lib
driver.cpp
cc1_main.cpp
cc1gen_reproducer_main.cpp
@ -42,7 +42,7 @@ lldCore
${REQUIRED_LLVM_LIBRARIES}
PUBLIC ${ZLIB_LIBRARIES} ${EXECINFO_LIBRARY} Threads::Threads
PUBLIC ${ZLIB_LIBRARIES} ${EXECINFO_LIBRARIES} Threads::Threads
${MALLOC_LIBRARIES}
${GLIBC_COMPATIBILITY_LIBRARIES}
${MEMCPY_LIBRARIES}

View File

@ -2,7 +2,7 @@ add_definitions(-Wno-error -Wno-unused-parameter -Wno-non-virtual-dtor -U_LIBCPP
link_directories(${LLVM_LIBRARY_DIRS})
add_library(clickhouse-compiler-lib ${LINK_MODE}
add_library(clickhouse-compiler-lib
driver.cpp
cc1_main.cpp
cc1as_main.cpp
@ -42,7 +42,7 @@ lldCore
${REQUIRED_LLVM_LIBRARIES}
PUBLIC ${ZLIB_LIBRARIES} ${EXECINFO_LIBRARY} Threads::Threads
PUBLIC ${ZLIB_LIBRARIES} ${EXECINFO_LIBRARIES} Threads::Threads
${MALLOC_LIBRARIES}
${GLIBC_COMPATIBILITY_LIBRARIES}
${MEMCPY_LIBRARIES}

View File

@ -217,11 +217,12 @@ private:
context.setApplicationType(Context::ApplicationType::CLIENT);
/// settings and limits could be specified in config file, but passed settings has higher priority
#define EXTRACT_SETTING(TYPE, NAME, DEFAULT, DESCRIPTION) \
if (config().has(#NAME) && !context.getSettingsRef().NAME.changed) \
context.setSetting(#NAME, config().getString(#NAME));
APPLY_FOR_SETTINGS(EXTRACT_SETTING)
#undef EXTRACT_SETTING
for (auto && setting : context.getSettingsRef())
{
const String & name = setting.getName().toString();
if (config().has(name) && !setting.isChanged())
setting.setValue(config().getString(name));
}
/// Set path for format schema files
if (config().has("format_schema_path"))
@ -479,6 +480,17 @@ private:
}
else
{
/// This is intended for testing purposes.
if (config().getBool("always_load_suggestion_data", false))
{
#if USE_READLINE
SCOPE_EXIT({ Suggest::instance().finalize(); });
Suggest::instance().load(connection_parameters, config().getInt("suggestion_limit"));
#else
throw Exception("Command line suggestions cannot work without readline", ErrorCodes::BAD_ARGUMENTS);
#endif
}
query_id = config().getString("query_id", "");
nonInteractive();
@ -805,8 +817,7 @@ private:
{
if (!old_settings)
old_settings.emplace(context.getSettingsRef());
for (const auto & change : settings_ast.as<ASTSetQuery>()->changes)
context.setSetting(change.name, change.value);
context.applySettingsChanges(settings_ast.as<ASTSetQuery>()->changes);
};
const auto * insert = parsed_query->as<ASTInsertQuery>();
if (insert && insert->settings_ast)
@ -836,7 +847,7 @@ private:
if (change.name == "profile")
current_profile = change.value.safeGet<String>();
else
context.setSetting(change.name, change.value);
context.applySettingChange(change);
}
}
@ -1604,8 +1615,6 @@ public:
min_description_length = std::min(min_description_length, line_length - 2);
}
#define DECLARE_SETTING(TYPE, NAME, DEFAULT, DESCRIPTION) (#NAME, po::value<std::string>(), DESCRIPTION)
/// Main commandline options related to client functionality and all parameters from Settings.
po::options_description main_description("Main options", line_length, min_description_length);
main_description.add_options()
@ -1629,6 +1638,7 @@ public:
("database,d", po::value<std::string>(), "database")
("pager", po::value<std::string>(), "pager")
("disable_suggestion,A", "Disable loading suggestion data. Note that suggestion data is loaded asynchronously through a second connection to ClickHouse server. Also it is reasonable to disable suggestion if you want to paste a query with TAB characters. Shorthand option -A is for those who get used to mysql client.")
("always_load_suggestion_data", "Load suggestion data even if clickhouse-client is run in non-interactive mode. Used for testing.")
("suggestion_limit", po::value<int>()->default_value(10000),
"Suggestion limit for how many databases, tables and columns to fetch.")
("multiline,m", "multiline")
@ -1647,9 +1657,9 @@ public:
("compression", po::value<bool>(), "enable or disable compression")
("log-level", po::value<std::string>(), "client log level")
("server_logs_file", po::value<std::string>(), "put server logs into specified file")
APPLY_FOR_SETTINGS(DECLARE_SETTING)
;
#undef DECLARE_SETTING
context.getSettingsRef().addProgramOptions(main_description);
/// Commandline options related to external tables.
po::options_description external_description("External tables options");
@ -1665,6 +1675,8 @@ public:
common_arguments.size(), common_arguments.data()).options(main_description).run();
po::variables_map options;
po::store(parsed, options);
po::notify(options);
if (options.count("version") || options.count("V"))
{
showClientVersion();
@ -1715,15 +1727,14 @@ public:
}
}
/// Extract settings from the options.
#define EXTRACT_SETTING(TYPE, NAME, DEFAULT, DESCRIPTION) \
if (options.count(#NAME)) \
{ \
context.setSetting(#NAME, options[#NAME].as<std::string>()); \
config().setString(#NAME, options[#NAME].as<std::string>()); \
/// Copy settings-related program options to config.
/// TODO: Is this code necessary?
for (const auto & setting : context.getSettingsRef())
{
const String name = setting.getName().toString();
if (options.count(name))
config().setString(name, options[name].as<std::string>());
}
APPLY_FOR_SETTINGS(EXTRACT_SETTING)
#undef EXTRACT_SETTING
if (options.count("config-file") && options.count("config"))
throw Exception("Two or more configuration files referenced in arguments", ErrorCodes::BAD_ARGUMENTS);
@ -1782,6 +1793,13 @@ public:
server_logs_file = options["server_logs_file"].as<std::string>();
if (options.count("disable_suggestion"))
config().setBool("disable_suggestion", true);
if (options.count("always_load_suggestion_data"))
{
if (options.count("disable_suggestion"))
throw Exception("Command line parameters disable_suggestion (-A) and always_load_suggestion_data cannot be specified simultaneously",
ErrorCodes::BAD_ARGUMENTS);
config().setBool("always_load_suggestion_data", true);
}
if (options.count("suggestion_limit"))
config().setInt("suggestion_limit", options["suggestion_limit"].as<int>());
}

View File

@ -2054,7 +2054,7 @@ private:
ConfigurationPtr task_cluster_initial_config;
ConfigurationPtr task_cluster_current_config;
Coordination::Stat task_descprtion_current_stat;
Coordination::Stat task_descprtion_current_stat{};
std::unique_ptr<TaskCluster> task_cluster;

View File

@ -69,11 +69,7 @@ void LocalServer::initialize(Poco::Util::Application & self)
void LocalServer::applyCmdSettings()
{
#define EXTRACT_SETTING(TYPE, NAME, DEFAULT, DESCRIPTION) \
if (cmd_settings.NAME.changed) \
context->getSettingsRef().NAME = cmd_settings.NAME;
APPLY_FOR_SETTINGS(EXTRACT_SETTING)
#undef EXTRACT_SETTING
context->getSettingsRef().copyChangesFrom(cmd_settings);
}
/// If path is specified and not empty, will try to setup server environment and load existing metadata
@ -414,7 +410,6 @@ void LocalServer::init(int argc, char ** argv)
min_description_length = std::min(min_description_length, line_length - 2);
}
#define DECLARE_SETTING(TYPE, NAME, DEFAULT, DESCRIPTION) (#NAME, po::value<std::string> (), DESCRIPTION)
po::options_description description("Main options", line_length, min_description_length);
description.add_options()
("help", "produce help message")
@ -435,13 +430,15 @@ void LocalServer::init(int argc, char ** argv)
("verbose", "print query and other debugging info")
("ignore-error", "do not stop processing if a query failed")
("version,V", "print version information and exit")
APPLY_FOR_SETTINGS(DECLARE_SETTING);
#undef DECLARE_SETTING
;
cmd_settings.addProgramOptions(description);
/// Parse main commandline options.
po::parsed_options parsed = po::command_line_parser(argc, argv).options(description).run();
po::variables_map options;
po::store(parsed, options);
po::notify(options);
if (options.count("version") || options.count("V"))
{
@ -457,13 +454,6 @@ void LocalServer::init(int argc, char ** argv)
exit(0);
}
/// Extract settings and limits from the options.
#define EXTRACT_SETTING(TYPE, NAME, DEFAULT, DESCRIPTION) \
if (options.count(#NAME)) \
cmd_settings.set(#NAME, options[#NAME].as<std::string>());
APPLY_FOR_SETTINGS(EXTRACT_SETTING)
#undef EXTRACT_SETTING
/// Save received data into the internal config.
if (options.count("config-file"))
config().setString("config-file", options["config-file"].as<std::string>());

View File

@ -912,8 +912,8 @@ public:
size_t columns = header.columns();
models.reserve(columns);
for (size_t i = 0; i < columns; ++i)
models.emplace_back(factory.get(*header.getByPosition(i).type, hash(seed, i), markov_model_params));
for (const auto & elem : header)
models.emplace_back(factory.get(*elem.type, hash(seed, elem.name), markov_model_params));
}
void train(const Columns & columns)
@ -954,7 +954,7 @@ try
("structure,S", po::value<std::string>(), "structure of the initial table (list of column and type names)")
("input-format", po::value<std::string>(), "input format of the initial table data")
("output-format", po::value<std::string>(), "default output format")
("seed", po::value<std::string>(), "seed (arbitrary string), must be random string with at least 10 bytes length")
("seed", po::value<std::string>(), "seed (arbitrary string), must be random string with at least 10 bytes length; note that a seed for each column is derived from this seed and a column name: you can obfuscate data for different tables and as long as you use identical seed and identical column names, the data for corresponding non-text columns for different tables will be transformed in the same way, so the data for different tables can be JOINed after obfuscation")
("limit", po::value<UInt64>(), "if specified - stop after generating that number of rows")
("silent", po::value<bool>()->default_value(false), "don't print information messages to stderr")
("order", po::value<UInt64>()->default_value(5), "order of markov model to generate strings")

View File

@ -16,18 +16,20 @@ set(CLICKHOUSE_ODBC_BRIDGE_INCLUDE PUBLIC ${ClickHouse_SOURCE_DIR}/libs/libdaemo
if (USE_POCO_SQLODBC)
set(CLICKHOUSE_ODBC_BRIDGE_LINK ${CLICKHOUSE_ODBC_BRIDGE_LINK} PRIVATE ${Poco_SQLODBC_LIBRARY})
set(CLICKHOUSE_ODBC_BRIDGE_INCLUDE ${CLICKHOUSE_ODBC_BRIDGE_INCLUDE} SYSTEM PRIVATE ${ODBC_INCLUDE_DIRECTORIES} ${Poco_SQLODBC_INCLUDE_DIR})
set(CLICKHOUSE_ODBC_BRIDGE_INCLUDE ${CLICKHOUSE_ODBC_BRIDGE_INCLUDE} SYSTEM PRIVATE ${ODBC_INCLUDE_DIRS} ${Poco_SQLODBC_INCLUDE_DIR})
endif ()
if (Poco_SQL_FOUND)
set(CLICKHOUSE_ODBC_BRIDGE_LINK ${CLICKHOUSE_ODBC_BRIDGE_LINK} PRIVATE ${Poco_SQL_LIBRARY})
set(CLICKHOUSE_ODBC_BRIDGE_INCLUDE ${CLICKHOUSE_ODBC_BRIDGE_INCLUDE} SYSTEM PRIVATE ${Poco_SQL_INCLUDE_DIR})
endif ()
if (USE_POCO_DATAODBC)
set(CLICKHOUSE_ODBC_BRIDGE_LINK ${CLICKHOUSE_ODBC_BRIDGE_LINK} PRIVATE ${Poco_DataODBC_LIBRARY})
set(CLICKHOUSE_ODBC_BRIDGE_INCLUDE ${CLICKHOUSE_ODBC_BRIDGE_INCLUDE} SYSTEM PRIVATE ${ODBC_INCLUDE_DIRECTORIES} ${Poco_DataODBC_INCLUDE_DIR})
set(CLICKHOUSE_ODBC_BRIDGE_INCLUDE ${CLICKHOUSE_ODBC_BRIDGE_INCLUDE} SYSTEM PRIVATE ${ODBC_INCLUDE_DIRS} ${Poco_DataODBC_INCLUDE_DIR})
endif()
if (Poco_Data_FOUND)
set(CLICKHOUSE_ODBC_BRIDGE_LINK ${CLICKHOUSE_ODBC_BRIDGE_LINK} PRIVATE ${Poco_Data_LIBRARY})
set(CLICKHOUSE_ODBC_BRIDGE_INCLUDE ${CLICKHOUSE_ODBC_BRIDGE_INCLUDE} SYSTEM PRIVATE ${Poco_Data_INCLUDE_DIR})
endif ()
clickhouse_program_add_library(odbc-bridge)
@ -37,12 +39,13 @@ clickhouse_program_add_library(odbc-bridge)
# For this reason, we disabling -rdynamic linker flag. But we do it in strange way:
SET(CMAKE_SHARED_LIBRARY_LINK_CXX_FLAGS "")
add_executable (clickhouse-odbc-bridge odbc-bridge.cpp)
add_executable(clickhouse-odbc-bridge odbc-bridge.cpp)
set_target_properties(clickhouse-odbc-bridge PROPERTIES RUNTIME_OUTPUT_DIRECTORY ..)
clickhouse_program_link_split_binary(odbc-bridge)
install (TARGETS clickhouse-odbc-bridge RUNTIME DESTINATION ${CMAKE_INSTALL_BINDIR} COMPONENT clickhouse)
install(TARGETS clickhouse-odbc-bridge RUNTIME DESTINATION ${CMAKE_INSTALL_BINDIR} COMPONENT clickhouse)
if (ENABLE_TESTS)
add_subdirectory (tests)
endif ()
if(ENABLE_TESTS)
add_subdirectory(tests)
endif()

View File

@ -2,7 +2,6 @@
#include "PingHandler.h"
#include "ColumnInfoHandler.h"
#include <Poco/URI.h>
#include <Poco/Ext/SessionPoolHelpers.h>
#include <Poco/Net/HTTPServerRequest.h>
#include <common/logger_useful.h>

View File

@ -11,11 +11,12 @@
#include <IO/WriteHelpers.h>
#include <IO/ReadHelpers.h>
#include <Interpreters/Context.h>
#include <Poco/Ext/SessionPoolHelpers.h>
#include <Poco/Net/HTTPServerRequest.h>
#include <Poco/Net/HTTPServerResponse.h>
#include <Poco/Net/HTMLForm.h>
#include <common/logger_useful.h>
#include <mutex>
#include <Poco/ThreadPool.h>
namespace DB
{
@ -31,6 +32,24 @@ namespace
}
}
using PocoSessionPoolConstructor = std::function<std::shared_ptr<Poco::Data::SessionPool>()>;
/** Is used to adjust max size of default Poco thread pool. See issue #750
* Acquire the lock, resize pool and construct new Session.
*/
std::shared_ptr<Poco::Data::SessionPool> createAndCheckResizePocoSessionPool(PocoSessionPoolConstructor pool_constr)
{
static std::mutex mutex;
Poco::ThreadPool & pool = Poco::ThreadPool::defaultPool();
/// NOTE: The lock don't guarantee that external users of the pool don't change its capacity
std::unique_lock lock(mutex);
if (pool.available() == 0)
pool.addCapacity(2 * std::max(pool.capacity(), 1));
return pool_constr();
}
ODBCHandler::PoolPtr ODBCHandler::getPool(const std::string & connection_str)
{

View File

@ -16,7 +16,7 @@ std::vector<XMLConfigurationPtr> ConfigPreprocessor::processConfig(
std::vector<XMLConfigurationPtr> result;
for (const auto & path : paths)
{
result.emplace_back(new XMLConfiguration(path));
result.emplace_back(XMLConfigurationPtr(new XMLConfiguration(path)));
result.back()->setString("path", Poco::Path(path).absolute().toString());
}

View File

@ -21,7 +21,7 @@ void extractSettings(
const XMLConfigurationPtr & config,
const std::string & key,
const Strings & settings_list,
std::map<std::string, std::string> & settings_to_apply)
SettingsChanges & settings_to_apply)
{
for (const std::string & setup : settings_list)
{
@ -32,7 +32,7 @@ void extractSettings(
if (value.empty())
value = "true";
settings_to_apply[setup] = value;
settings_to_apply.emplace_back(SettingChange{setup, value});
}
}
@ -70,7 +70,7 @@ void PerformanceTestInfo::applySettings(XMLConfigurationPtr config)
{
if (config->has("settings"))
{
std::map<std::string, std::string> settings_to_apply;
SettingsChanges settings_to_apply;
Strings config_settings;
config->keys("settings", config_settings);
@ -96,19 +96,7 @@ void PerformanceTestInfo::applySettings(XMLConfigurationPtr config)
}
extractSettings(config, "settings", config_settings, settings_to_apply);
/// This macro goes through all settings in the Settings.h
/// and, if found any settings in test's xml configuration
/// with the same name, sets its value to settings
std::map<std::string, std::string>::iterator it;
#define EXTRACT_SETTING(TYPE, NAME, DEFAULT, DESCRIPTION) \
it = settings_to_apply.find(#NAME); \
if (it != settings_to_apply.end()) \
settings.set(#NAME, settings_to_apply[#NAME]);
APPLY_FOR_SETTINGS(EXTRACT_SETTING)
#undef EXTRACT_SETTING
settings.applyChanges(settings_to_apply);
if (settings_contain("average_rows_speed_precision"))
TestStats::avg_rows_speed_precision =

View File

@ -61,6 +61,7 @@ public:
const std::string & default_database_,
const std::string & user_,
const std::string & password_,
const Settings & cmd_settings,
const bool lite_output_,
const std::string & profiles_file_,
Strings && input_files_,
@ -87,6 +88,7 @@ public:
, input_files(input_files_)
, log(&Poco::Logger::get("PerformanceTestSuite"))
{
global_context.getSettingsRef().copyChangesFrom(cmd_settings);
if (input_files.size() < 1)
throw Exception("No tests were specified", ErrorCodes::BAD_ARGUMENTS);
}
@ -110,10 +112,6 @@ public:
return 0;
}
void setContextSetting(const String & name, const std::string & value)
{
global_context.setSetting(name, value);
}
private:
Connection connection;
@ -298,6 +296,8 @@ std::unordered_map<std::string, std::vector<std::size_t>> getTestQueryIndexes(co
{
std::unordered_map<std::string, std::vector<std::size_t>> result;
const auto & options = parsed_opts.options;
if (options.empty())
return result;
for (size_t i = 0; i < options.size() - 1; ++i)
{
const auto & opt = options[i];
@ -324,7 +324,6 @@ try
using Strings = DB::Strings;
#define DECLARE_SETTING(TYPE, NAME, DEFAULT, DESCRIPTION) (#NAME, po::value<std::string>(), DESCRIPTION)
po::options_description desc("Allowed options");
desc.add_options()
("help", "produce help message")
@ -346,9 +345,10 @@ try
("input-files", value<Strings>()->multitoken(), "Input .xml files")
("query-indexes", value<std::vector<size_t>>()->multitoken(), "Input query indexes")
("recursive,r", "Recurse in directories to find all xml's")
APPLY_FOR_SETTINGS(DECLARE_SETTING);
#undef DECLARE_SETTING
;
DB::Settings cmd_settings;
cmd_settings.addProgramOptions(desc);
po::options_description cmdline_options;
cmdline_options.add(desc);
@ -395,6 +395,7 @@ try
options["database"].as<std::string>(),
options["user"].as<std::string>(),
options["password"].as<std::string>(),
cmd_settings,
options.count("lite") > 0,
options["profiles-file"].as<std::string>(),
std::move(input_files),
@ -406,15 +407,6 @@ try
std::move(skip_names_regexp),
queries_with_indexes,
timeouts);
/// Extract settings from the options.
#define EXTRACT_SETTING(TYPE, NAME, DEFAULT, DESCRIPTION) \
if (options.count(#NAME)) \
{ \
performance_test_suite.setContextSetting(#NAME, options[#NAME].as<std::string>()); \
}
APPLY_FOR_SETTINGS(EXTRACT_SETTING)
#undef EXTRACT_SETTING
return performance_test_suite.run();
}
catch (...)

View File

@ -497,8 +497,7 @@ void HTTPHandler::processQuery(
settings.readonly = 2;
}
auto readonly_before_query = settings.readonly;
SettingsChanges settings_changes;
for (auto it = params.begin(); it != params.end(); ++it)
{
if (it->first == "database")
@ -515,21 +514,13 @@ void HTTPHandler::processQuery(
else
{
/// All other query parameters are treated as settings.
String value;
/// Setting is skipped if value wasn't changed.
if (!settings.tryGet(it->first, value) || it->second != value)
{
if (readonly_before_query == 1)
throw Exception("Cannot override setting (" + it->first + ") in readonly mode", ErrorCodes::READONLY);
if (readonly_before_query && it->first == "readonly")
throw Exception("Setting 'readonly' cannot be overrided in readonly mode", ErrorCodes::READONLY);
context.setSetting(it->first, it->second);
}
settings_changes.push_back({it->first, it->second});
}
}
context.checkSettingsConstraints(settings_changes);
context.applySettingsChanges(settings_changes);
/// HTTP response compression is turned on only if the client signalled that they support it
/// (using Accept-Encoding header) and 'enable_http_compression' setting is turned on.
used_output.out->setCompression(client_supports_http_compression && settings.enable_http_compression);

View File

@ -85,7 +85,7 @@ void ReplicasStatusHandler::handleRequest(Poco::Net::HTTPServerRequest & request
if (!response.sent())
{
/// We have not sent anything yet and we don't even know if we need to compress response.
/// We have not sent anything yet and we don't even know if we need to compress response.
response.send() << getCurrentExceptionMessage(false) << std::endl;
}
}

View File

@ -370,8 +370,8 @@ void TCPHandler::processInsertQuery(const Settings & global_settings)
if (client_revision >= DBMS_MIN_REVISION_WITH_COLUMN_DEFAULTS_METADATA)
{
const auto & db_and_table = query_context->getInsertionTable();
if (auto * columns = ColumnsDescription::loadFromContext(*query_context, db_and_table.first, db_and_table.second))
sendTableColumns(*columns);
if (query_context->getSettingsRef().input_format_defaults_for_omitted_fields)
sendTableColumns(query_context->getTable(db_and_table.first, db_and_table.second)->getColumns());
}
/// Send block to the client - table structure.

View File

@ -16,6 +16,7 @@
with minimum number of different symbols between replica's hostname and local hostname
(Hamming distance).
in_order - first live replica is chosen in specified order.
first_or_random - if first replica one has higher number of errors, pick a random one from replicas with minimum number of errors.
-->
<load_balancing>random</load_balancing>
</default>

View File

@ -1,262 +0,0 @@
#!/usr/bin/python3
# -*- coding: utf-8 -*-
import sys
import argparse
import tempfile
import random
import subprocess
import bisect
from copy import deepcopy
# Псевдослучайный генератор уникальных чисел.
# http://preshing.com/20121224/how-to-generate-a-sequence-of-unique-random-integers/
class UniqueRandomGenerator:
prime = 4294967291
def __init__(self, seed_base, seed_offset):
self.index = self.permutePQR(self.permutePQR(seed_base) + 0x682f0161)
self.intermediate_offset = self.permutePQR(self.permutePQR(seed_offset) + 0x46790905)
def next(self):
val = self.permutePQR((self.permutePQR(self.index) + self.intermediate_offset) ^ 0x5bf03635)
self.index = self.index + 1
return val
def permutePQR(self, x):
if x >=self.prime:
return x
else:
residue = (x * x) % self.prime
if x <= self.prime/2:
return residue
else:
return self.prime - residue
# Создать таблицу содержащую уникальные значения.
def generate_data_source(host, port, http_port, min_cardinality, max_cardinality, count):
chunk_size = round((max_cardinality - min_cardinality) / float(count))
used_values = 0
cur_count = 0
next_size = 0
sup = 32768
n1 = random.randrange(0, sup)
n2 = random.randrange(0, sup)
urng = UniqueRandomGenerator(n1, n2)
is_first = True
with tempfile.TemporaryDirectory() as tmp_dir:
filename = tmp_dir + '/table.txt'
with open(filename, 'w+b') as file_handle:
while cur_count < count:
if is_first == True:
is_first = False
if min_cardinality != 0:
next_size = min_cardinality + 1
else:
next_size = chunk_size
else:
next_size += chunk_size
while used_values < next_size:
h = urng.next()
used_values = used_values + 1
out = str(h) + "\t" + str(cur_count) + "\n";
file_handle.write(bytes(out, 'UTF-8'));
cur_count = cur_count + 1
query = "DROP TABLE IF EXISTS data_source"
subprocess.check_output(["clickhouse-client", "--host", host, "--port", str(port), "--query", query])
query = "CREATE TABLE data_source(UserID UInt64, KeyID UInt64) ENGINE=TinyLog"
subprocess.check_output(["clickhouse-client", "--host", host, "--port", str(port), "--query", query])
cat = subprocess.Popen(("cat", filename), stdout=subprocess.PIPE)
subprocess.check_output(("POST", "http://{0}:{1}/?query=INSERT INTO data_source FORMAT TabSeparated".format(host, http_port)), stdin=cat.stdout)
cat.wait()
def perform_query(host, port):
query = "SELECT runningAccumulate(uniqExactState(UserID)) AS exact, "
query += "runningAccumulate(uniqCombinedRawState(UserID)) AS approx "
query += "FROM data_source GROUP BY KeyID"
return subprocess.check_output(["clickhouse-client", "--host", host, "--port", port, "--query", query])
def parse_clickhouse_response(response):
parsed = []
lines = response.decode().split("\n")
for cur_line in lines:
rows = cur_line.split("\t")
if len(rows) == 2:
parsed.append([float(rows[0]), float(rows[1])])
return parsed
def accumulate_data(accumulated_data, data):
if not accumulated_data:
accumulated_data = deepcopy(data)
else:
for row1, row2 in zip(accumulated_data, data):
row1[1] += row2[1];
return accumulated_data
def generate_raw_result(accumulated_data, count):
expected_tab = []
bias_tab = []
for row in accumulated_data:
exact = row[0]
expected = row[1] / count
bias = expected - exact
expected_tab.append(expected)
bias_tab.append(bias)
return [ expected_tab, bias_tab ]
def generate_sample(raw_estimates, biases, n_samples):
result = []
min_card = raw_estimates[0]
max_card = raw_estimates[len(raw_estimates) - 1]
step = (max_card - min_card) / (n_samples - 1)
for i in range(0, n_samples + 1):
x = min_card + i * step
j = bisect.bisect_left(raw_estimates, x)
if j == len(raw_estimates):
result.append((raw_estimates[j - 1], biases[j - 1]))
elif raw_estimates[j] == x:
result.append((raw_estimates[j], biases[j]))
else:
# Найти 6 ближайших соседей. Вычислить среднее арифметическое.
# 6 точек слева x [j-6 j-5 j-4 j-3 j-2 j-1]
begin = max(j - 6, 0) - 1
end = j - 1
T = []
for k in range(end, begin, -1):
T.append(x - raw_estimates[k])
# 6 точек справа x [j j+1 j+2 j+3 j+4 j+5]
begin = j
end = min(j + 5, len(raw_estimates) - 1) + 1
U = []
for k in range(begin, end):
U.append(raw_estimates[k] - x)
# Сливаем расстояния.
V = []
lim = min(len(T), len(U))
k1 = 0
k2 = 0
while k1 < lim and k2 < lim:
if T[k1] == U[k2]:
V.append(j - k1 - 1)
V.append(j + k2)
k1 = k1 + 1
k2 = k2 + 1
elif T[k1] < U[k2]:
V.append(j - k1 - 1)
k1 = k1 + 1
else:
V.append(j + k2)
k2 = k2 + 1
if k1 < len(T):
while k1 < len(T):
V.append(j - k1 - 1)
k1 = k1 + 1
elif k2 < len(U):
while k2 < len(U):
V.append(j + k2)
k2 = k2 + 1
# Выбираем 6 ближайших точек.
# Вычисляем средние.
begin = 0
end = min(len(V), 6)
sum = 0
bias = 0
for k in range(begin, end):
sum += raw_estimates[V[k]]
bias += biases[V[k]]
sum /= float(end)
bias /= float(end)
result.append((sum, bias))
# Пропустить последовательные результаты, чьи оценки одинаковые.
final_result = []
last = -1
for entry in result:
if entry[0] != last:
final_result.append((entry[0], entry[1]))
last = entry[0]
return final_result
def dump_arrays(data):
print("Size of each array: {0}\n".format(len(data)))
is_first = True
sep = ''
print("raw_estimates = ")
print("{")
for row in data:
print("\t{0}{1}".format(sep, row[0]))
if is_first == True:
is_first = False
sep = ","
print("};")
is_first = True
sep = ""
print("\nbiases = ")
print("{")
for row in data:
print("\t{0}{1}".format(sep, row[1]))
if is_first == True:
is_first = False
sep = ","
print("};")
def start():
parser = argparse.ArgumentParser(description = "Generate bias correction tables for HyperLogLog-based functions.")
parser.add_argument("-x", "--host", default="localhost", help="ClickHouse server host name");
parser.add_argument("-p", "--port", type=int, default=9000, help="ClickHouse server TCP port");
parser.add_argument("-t", "--http_port", type=int, default=8123, help="ClickHouse server HTTP port");
parser.add_argument("-i", "--iterations", type=int, default=5000, help="number of iterations");
parser.add_argument("-m", "--min_cardinality", type=int, default=16384, help="minimal cardinality");
parser.add_argument("-M", "--max_cardinality", type=int, default=655360, help="maximal cardinality");
parser.add_argument("-s", "--samples", type=int, default=200, help="number of sampled values");
args = parser.parse_args()
accumulated_data = []
for i in range(0, args.iterations):
print(i + 1)
sys.stdout.flush()
generate_data_source(args.host, str(args.port), str(args.http_port), args.min_cardinality, args.max_cardinality, 1000)
response = perform_query(args.host, str(args.port))
data = parse_clickhouse_response(response)
accumulated_data = accumulate_data(accumulated_data, data)
result = generate_raw_result(accumulated_data, args.iterations)
sampled_data = generate_sample(result[0], result[1], args.samples)
dump_arrays(sampled_data)
if __name__ == "__main__": start()

View File

@ -1 +0,0 @@
Hits table generator based on LSTM neural network trained on real hits. You need to have weights for model or train model on real hits to generate data.

View File

@ -1,22 +0,0 @@
import argparse
from model import Model
parser = argparse.ArgumentParser(
formatter_class=argparse.ArgumentDefaultsHelpFormatter)
parser.add_argument('-n', type=int, default=100000,
help='number of objects to generate')
parser.add_argument('--output_file', type=str, default='out.tsv',
help='output file name')
parser.add_argument('--weights_path', type=str,
help='path to weights')
args = parser.parse_args()
if __name__ == '__main__':
if not args.weights_path:
raise Exception('please specify path to model weights with --weights_path')
gen = Model()
gen.generate(args.n, args.output_file, args.weights_path)

View File

@ -1,147 +0,0 @@
import numpy as np
import os
import pickle
import tensorflow as tf
from random import sample
from keras.layers import Dense, Embedding
from tqdm import tqdm
RNN_NUM_UNITS = 256
EMB_SIZE = 32
MAX_LENGTH = 1049
with open('tokens', 'rb') as f:
tokens = pickle.load(f)
n_tokens = len(tokens)
token_to_id = {c: i for i, c in enumerate(tokens)}
def to_matrix(objects, max_len=None, pad=0, dtype='int32'):
max_len = max_len or max(map(len, objects))
matrix = np.zeros([len(objects), max_len], dtype) + pad
for i in range(len(objects)):
name_ix = list(map(token_to_id.get, objects[i]))
matrix[i, :len(name_ix)] = name_ix
return matrix.T
class Model:
def __init__(self, learning_rate=0.0001):
# an embedding layer that converts character ids into embeddings
self.embed_x = Embedding(n_tokens, EMB_SIZE)
get_h_next = Dense(1024, activation='relu')
# a dense layer that maps current hidden state
# to probabilities of characters [h_t+1]->P(x_t+1|h_t+1)
self.get_probas = Dense(n_tokens, activation='softmax')
self.input_sequence = tf.placeholder('int32', (MAX_LENGTH, None))
batch_size = tf.shape(self.input_sequence)[1]
self.gru_cell_first = tf.nn.rnn_cell.GRUCell(RNN_NUM_UNITS)
self.lstm_cell_second = tf.nn.rnn_cell.LSTMCell(RNN_NUM_UNITS)
h_prev_first = self.gru_cell_first.zero_state(batch_size, dtype=tf.float32)
h_prev_second = tf.nn.rnn_cell.LSTMStateTuple(
tf.zeros([batch_size, RNN_NUM_UNITS]), # initial cell state,
tf.zeros([batch_size, RNN_NUM_UNITS]) # initial hidden state
)
predicted_probas = []
for t in range(MAX_LENGTH):
x_t = self.input_sequence[t]
# convert character id into embedding
x_t_emb = self.embed_x(tf.reshape(x_t, [-1, 1]))[:, 0]
out_next_first, h_next_first = self.gru_cell_first(x_t_emb, h_prev_first)
h_prev_first = h_next_first
out_next_second, h_next_second = self.lstm_cell_second(out_next_first, h_prev_second)
h_prev_second = h_next_second
probas_next = self.get_probas(out_next_second)
predicted_probas.append(probas_next)
predicted_probas = tf.stack(predicted_probas)
predictions_matrix = tf.reshape(predicted_probas[:-1], [-1, len(tokens)])
answers_matrix = tf.one_hot(tf.reshape(self.input_sequence[1:], [-1]), n_tokens)
self.loss = tf.reduce_mean(tf.reduce_sum(
-answers_matrix * tf.log(tf.clip_by_value(predictions_matrix, 1e-7, 1.0)),
reduction_indices=[1]
))
optimizer = tf.train.AdamOptimizer(learning_rate)
gvs = optimizer.compute_gradients(self.loss)
capped_gvs = [(gr if gr is None else tf.clip_by_value(gr, -1., 1.), var) for gr, var in gvs]
self.optimize = optimizer.apply_gradients(capped_gvs)
self.sess = tf.Session()
self.sess.run(tf.global_variables_initializer())
self.saver = tf.train.Saver()
def train(self, train_data_path, save_dir, num_iters, batch_size=64, restore_from=False):
history = []
if restore_from:
with open(restore_from + '_history') as f:
history = pickle.load(f)
self.saver.restore(self.sess, restore_from)
with open(train_data_path, 'r') as f:
train_data = f.readlines()
train_data = filter(lambda a: len(a) < MAX_LENGTH, train_data)
for i in tqdm(range(num_iters)):
batch = to_matrix(
map(lambda a: '\n' + a.rstrip('\n'), sample(train_data, batch_size)),
max_len=MAX_LENGTH
)
loss_i, _ = self.sess.run([self.loss, self.optimize], {self.input_sequence: batch})
history.append(loss_i)
if len(history) % 2000 == 0:
self.saver.save(self.sess, os.path.join(save_dir, '{}_iters'.format(len(history))))
self.saver.save(self.sess, os.path.join(save_dir, '{}_iters'.format(len(history))))
with open(os.path.join(save_dir, '{}_iters_history'.format(len(history)))) as f:
pickle.dump(history, f)
def generate(self, num_objects, output_file, weights_path):
self.saver.restore(self.sess, weights_path)
batch_size = num_objects
x_t = tf.placeholder('int32', (None, batch_size))
h_t_first = tf.Variable(tf.zeros([batch_size, RNN_NUM_UNITS]))
h_t_second = tf.nn.rnn_cell.LSTMStateTuple(
tf.Variable(tf.zeros([batch_size, RNN_NUM_UNITS])),
tf.Variable(tf.zeros([batch_size, RNN_NUM_UNITS]))
)
x_t_emb = self.embed_x(tf.reshape(x_t, [-1, 1]))[:, 0]
first_out_next, next_h_first = self.gru_cell_first(x_t_emb, h_t_first)
second_out_next, next_h_second = self.lstm_cell_second(first_out_next, h_t_second)
next_probs = self.get_probas(second_out_next)
x_sequence = np.zeros(shape=(1, batch_size), dtype=int) + token_to_id['\n']
self.sess.run(
[tf.assign(h_t_first, h_t_first.initial_value),
tf.assign(h_t_second[0], h_t_second[0].initial_value),
tf.assign(h_t_second[1], h_t_second[1].initial_value)]
)
for i in tqdm(range(MAX_LENGTH - 1)):
x_probs, _, _, _ = self.sess.run(
[next_probs,
tf.assign(h_t_second[0], next_h_second[0]),
tf.assign(h_t_second[1], next_h_second[1]),
tf.assign(h_t_first, next_h_first)],
{x_t: [x_sequence[-1, :]]}
)
next_char = [np.random.choice(n_tokens, p=x_probs[i]) for i in range(batch_size)]
if sum(next_char) == 0:
break
x_sequence = np.append(x_sequence, [next_char], axis=0)
with open(output_file, 'w') as f:
f.writelines([''.join([tokens[ix] for ix in x_sequence.T[k]]) + '\n' for k in range(batch_size)])

View File

@ -1,3 +0,0 @@
Keras==2.0.6
numpy
tensorflow-gpu==1.4.0

View File

@ -1,506 +0,0 @@
(lp0
S'\x83'
p1
aS'\x04'
p2
aS'\x87'
p3
aS'\x8b'
p4
aS'\x8f'
p5
aS'\x10'
p6
aS'\x93'
p7
aS'\x14'
p8
aS'\x97'
p9
aS'\x18'
p10
aS'\x9b'
p11
aS'\x1c'
p12
aS'\x9f'
p13
aS' '
p14
aS'\xa3'
p15
aS'$'
p16
aS'\xa7'
p17
aS'('
p18
aS'\xab'
p19
aS','
p20
aS'\xaf'
p21
aS'0'
p22
aS'\xb3'
p23
aS'4'
p24
aS'\xb7'
p25
aS'8'
p26
aS'\xbb'
p27
aS'<'
p28
aS'\xbf'
p29
aS'@'
p30
aS'\xc3'
p31
aS'D'
p32
aS'\xc7'
p33
aS'H'
p34
aS'\xcb'
p35
aS'L'
p36
aS'\xcf'
p37
aS'P'
p38
aS'\xd3'
p39
aS'T'
p40
aS'\xd7'
p41
aS'X'
p42
aS'\xdb'
p43
aS'\\'
p44
aS'\xdf'
p45
aS'`'
p46
aS'\xe3'
p47
aS'd'
p48
aS'\xe7'
p49
aS'h'
p50
aS'\xeb'
p51
aS'l'
p52
aS'\xef'
p53
aS'p'
p54
aS'\xf3'
p55
aS't'
p56
aS'\xf7'
p57
aS'x'
p58
aS'\xfb'
p59
aS'|'
p60
aS'\xff'
p61
aS'\x80'
p62
aS'\x03'
p63
aS'\x84'
p64
aS'\x07'
p65
aS'\x88'
p66
aS'\x0b'
p67
aS'\x8c'
p68
aS'\x0f'
p69
aS'\x90'
p70
aS'\x13'
p71
aS'\x94'
p72
aS'\x17'
p73
aS'\x98'
p74
aS'\x1b'
p75
aS'\x9c'
p76
aS'\x1f'
p77
aS'\xa0'
p78
aS'#'
p79
aS'\xa4'
p80
aS"'"
p81
aS'\xa8'
p82
aS'+'
p83
aS'\xac'
p84
aS'/'
p85
aS'\xb0'
p86
aS'3'
p87
aS'\xb4'
p88
aS'7'
p89
aS'\xb8'
p90
aS';'
p91
aS'\xbc'
p92
aS'?'
p93
aS'\xc0'
p94
aS'C'
p95
aS'\xc4'
p96
aS'G'
p97
aS'\xc8'
p98
aS'K'
p99
aS'\xcc'
p100
aS'O'
p101
aS'\xd0'
p102
aS'S'
p103
aS'\xd4'
p104
aS'W'
p105
aS'\xd8'
p106
aS'['
p107
aS'\xdc'
p108
aS'_'
p109
aS'\xe0'
p110
aS'c'
p111
aS'\xe4'
p112
aS'g'
p113
aS'\xe8'
p114
aS'k'
p115
aS'\xec'
p116
aS'o'
p117
aS'\xf0'
p118
aS's'
p119
aS'\xf4'
p120
aS'w'
p121
aS'\xf8'
p122
aS'{'
p123
aS'\xfc'
p124
aS'\x7f'
p125
aS'\x81'
p126
aS'\x02'
p127
aS'\x85'
p128
aS'\x06'
p129
aS'\x89'
p130
aS'\n'
p131
aS'\x8d'
p132
aS'\x0e'
p133
aS'\x91'
p134
aS'\x12'
p135
aS'\x95'
p136
aS'\x16'
p137
aS'\x99'
p138
aS'\x1a'
p139
aS'\x9d'
p140
aS'\x1e'
p141
aS'\xa1'
p142
aS'"'
p143
aS'\xa5'
p144
aS'&'
p145
aS'\xa9'
p146
aS'*'
p147
aS'\xad'
p148
aS'.'
p149
aS'\xb1'
p150
aS'2'
p151
aS'\xb5'
p152
aS'6'
p153
aS'\xb9'
p154
aS':'
p155
aS'\xbd'
p156
aS'>'
p157
aS'\xc1'
p158
aS'B'
p159
aS'\xc5'
p160
aS'F'
p161
aS'\xc9'
p162
aS'J'
p163
aS'\xcd'
p164
aS'N'
p165
aS'\xd1'
p166
aS'R'
p167
aS'\xd5'
p168
aS'V'
p169
aS'\xd9'
p170
aS'Z'
p171
aS'\xdd'
p172
aS'^'
p173
aS'\xe1'
p174
aS'b'
p175
aS'\xe5'
p176
aS'f'
p177
aS'\xe9'
p178
aS'j'
p179
aS'\xed'
p180
aS'n'
p181
aS'\xf1'
p182
aS'r'
p183
aS'\xf5'
p184
aS'v'
p185
aS'\xf9'
p186
aS'z'
p187
aS'\xfd'
p188
aS'~'
p189
aS'\x01'
p190
aS'\x82'
p191
aS'\x05'
p192
aS'\x86'
p193
aS'\t'
p194
aS'\x8a'
p195
aS'\x8e'
p196
aS'\x11'
p197
aS'\x92'
p198
aS'\x15'
p199
aS'\x96'
p200
aS'\x19'
p201
aS'\x9a'
p202
aS'\x1d'
p203
aS'\x9e'
p204
aS'!'
p205
aS'\xa2'
p206
aS'%'
p207
aS'\xa6'
p208
aS')'
p209
aS'\xaa'
p210
aS'-'
p211
aS'\xae'
p212
aS'1'
p213
aS'\xb2'
p214
aS'5'
p215
aS'\xb6'
p216
aS'9'
p217
aS'\xba'
p218
aS'='
p219
aS'\xbe'
p220
aS'A'
p221
aS'\xc2'
p222
aS'E'
p223
aS'\xc6'
p224
aS'I'
p225
aS'\xca'
p226
aS'M'
p227
aS'\xce'
p228
aS'Q'
p229
aS'\xd2'
p230
aS'U'
p231
aS'\xd6'
p232
aS'Y'
p233
aS'\xda'
p234
aS']'
p235
aS'\xde'
p236
aS'a'
p237
aS'\xe2'
p238
aS'e'
p239
aS'\xe6'
p240
aS'i'
p241
aS'\xea'
p242
aS'm'
p243
aS'\xee'
p244
aS'q'
p245
aS'\xf2'
p246
aS'u'
p247
aS'\xf6'
p248
aS'y'
p249
aS'\xfa'
p250
aS'}'
p251
aS'\xfe'
p252
a.

View File

@ -1,26 +0,0 @@
import argparse
from model import Model
parser = argparse.ArgumentParser(
formatter_class=argparse.ArgumentDefaultsHelpFormatter)
parser.add_argument('--n_iter', type=int, default=10000,
help='number of iterations')
parser.add_argument('--save_dir', type=str, default='save',
help='dir for saving weights')
parser.add_argument('--data_path', type=str,
help='path to train data')
parser.add_argument('--learning_rate', type=int, default=0.0001,
help='learning rate')
parser.add_argument('--batch_size', type=int, default=64,
help='batch size')
parser.add_argument('--restore_from', type=str,
help='path to train saved weights')
args = parser.parse_args()
if __name__ == '__main__':
if not args.data_path:
raise Exception('please specify path to train data with --data_path')
gen = Model(args.learning_rate)
gen.train(args.data_path, args.save_dir, args.n_iter, args.batch_size, args.restore_from)

View File

@ -1,150 +0,0 @@
#!/usr/bin/python3.4
# -*- coding: utf-8 -*-
import sys
import argparse
import tempfile
import random
import subprocess
import bisect
from copy import deepcopy
# Псевдослучайный генератор уникальных чисел.
# http://preshing.com/20121224/how-to-generate-a-sequence-of-unique-random-integers/
class UniqueRandomGenerator:
prime = 4294967291
def __init__(self, seed_base, seed_offset):
self.index = self.permutePQR(self.permutePQR(seed_base) + 0x682f0161)
self.intermediate_offset = self.permutePQR(self.permutePQR(seed_offset) + 0x46790905)
def next(self):
val = self.permutePQR((self.permutePQR(self.index) + self.intermediate_offset) ^ 0x5bf03635)
self.index = self.index + 1
return val
def permutePQR(self, x):
if x >=self.prime:
return x
else:
residue = (x * x) % self.prime
if x <= self.prime/2:
return residue
else:
return self.prime - residue
# Создать таблицу содержащую уникальные значения.
def generate_data_source(host, port, http_port, min_cardinality, max_cardinality, count):
chunk_size = round((max_cardinality - (min_cardinality + 1)) / float(count))
used_values = 0
cur_count = 0
next_size = 0
sup = 32768
n1 = random.randrange(0, sup)
n2 = random.randrange(0, sup)
urng = UniqueRandomGenerator(n1, n2)
is_first = True
with tempfile.TemporaryDirectory() as tmp_dir:
filename = tmp_dir + '/table.txt'
with open(filename, 'w+b') as file_handle:
while cur_count < count:
if is_first == True:
is_first = False
if min_cardinality != 0:
next_size = min_cardinality + 1
else:
next_size = chunk_size
else:
next_size += chunk_size
while used_values < next_size:
h = urng.next()
used_values = used_values + 1
out = str(h) + "\t" + str(cur_count) + "\n";
file_handle.write(bytes(out, 'UTF-8'));
cur_count = cur_count + 1
query = "DROP TABLE IF EXISTS data_source"
subprocess.check_output(["clickhouse-client", "--host", host, "--port", str(port), "--query", query])
query = "CREATE TABLE data_source(UserID UInt64, KeyID UInt64) ENGINE=TinyLog"
subprocess.check_output(["clickhouse-client", "--host", host, "--port", str(port), "--query", query])
cat = subprocess.Popen(("cat", filename), stdout=subprocess.PIPE)
subprocess.check_output(("POST", "http://{0}:{1}/?query=INSERT INTO data_source FORMAT TabSeparated".format(host, http_port)), stdin=cat.stdout)
cat.wait()
def perform_query(host, port):
query = "SELECT runningAccumulate(uniqExactState(UserID)) AS exact, "
query += "runningAccumulate(uniqCombinedRawState(UserID)) AS raw, "
query += "runningAccumulate(uniqCombinedLinearCountingState(UserID)) AS linear_counting, "
query += "runningAccumulate(uniqCombinedBiasCorrectedState(UserID)) AS bias_corrected "
query += "FROM data_source GROUP BY KeyID"
return subprocess.check_output(["clickhouse-client", "--host", host, "--port", port, "--query", query])
def parse_clickhouse_response(response):
parsed = []
lines = response.decode().split("\n")
for cur_line in lines:
rows = cur_line.split("\t")
if len(rows) == 4:
parsed.append([float(rows[0]), float(rows[1]), float(rows[2]), float(rows[3])])
return parsed
def accumulate_data(accumulated_data, data):
if not accumulated_data:
accumulated_data = deepcopy(data)
else:
for row1, row2 in zip(accumulated_data, data):
row1[1] += row2[1];
row1[2] += row2[2];
row1[3] += row2[3];
return accumulated_data
def dump_graphs(data, count):
with open("raw_graph.txt", "w+b") as fh1, open("linear_counting_graph.txt", "w+b") as fh2, open("bias_corrected_graph.txt", "w+b") as fh3:
expected_tab = []
bias_tab = []
for row in data:
exact = row[0]
raw = row[1] / count;
linear_counting = row[2] / count;
bias_corrected = row[3] / count;
outstr = "{0}\t{1}\n".format(exact, abs(raw - exact) / exact)
fh1.write(bytes(outstr, 'UTF-8'))
outstr = "{0}\t{1}\n".format(exact, abs(linear_counting - exact) / exact)
fh2.write(bytes(outstr, 'UTF-8'))
outstr = "{0}\t{1}\n".format(exact, abs(bias_corrected - exact) / exact)
fh3.write(bytes(outstr, 'UTF-8'))
def start():
parser = argparse.ArgumentParser(description = "Generate graphs that help to determine the linear counting threshold.")
parser.add_argument("-x", "--host", default="localhost", help="clickhouse host name");
parser.add_argument("-p", "--port", type=int, default=9000, help="clickhouse client TCP port");
parser.add_argument("-t", "--http_port", type=int, default=8123, help="clickhouse HTTP port");
parser.add_argument("-i", "--iterations", type=int, default=5000, help="number of iterations");
parser.add_argument("-m", "--min_cardinality", type=int, default=16384, help="minimal cardinality");
parser.add_argument("-M", "--max_cardinality", type=int, default=655360, help="maximal cardinality");
args = parser.parse_args()
accumulated_data = []
for i in range(0, args.iterations):
print(i + 1)
sys.stdout.flush()
generate_data_source(args.host, str(args.port), str(args.http_port), args.min_cardinality, args.max_cardinality, 1000)
response = perform_query(args.host, str(args.port))
data = parse_clickhouse_response(response)
accumulated_data = accumulate_data(accumulated_data, data)
dump_graphs(accumulated_data, args.iterations)
if __name__ == "__main__": start()

View File

@ -1,10 +0,0 @@
#!/usr/bin/env bash
for (( i = 0; i < 1000; i++ )); do
if (( RANDOM % 10 )); then
clickhouse-client --port=9007 --query="INSERT INTO mt (x) SELECT rand64() AS x FROM system.numbers LIMIT 100000"
else
clickhouse-client --port=9007 --query="INSERT INTO mt (x) SELECT rand64() AS x FROM system.numbers LIMIT 300000"
fi
done

View File

@ -1,76 +0,0 @@
from __future__ import print_function
import argparse
import matplotlib.pyplot as plt
import ast
TMP_FILE='tmp.tsv'
def parse_args():
parser = argparse.ArgumentParser(formatter_class=argparse.ArgumentDefaultsHelpFormatter)
parser.add_argument('-f', '--file', default='data.tsv')
cfg = parser.parse_args()
return cfg
def draw():
place = dict()
max_coord = 0
global_top = 0
for line in open(TMP_FILE):
numbers = line.split('\t')
if len(numbers) <= 2:
continue
name = numbers[-2]
if numbers[0] == '1':
dx = int(numbers[3])
max_coord += dx
place[name] = [1, max_coord, 1, dx]
max_coord += dx
plt.plot([max_coord - 2 * dx, max_coord], [1, 1])
for line in open(TMP_FILE):
numbers = line.split('\t')
if len(numbers) <= 2:
continue
name = numbers[-2]
if numbers[0] == '2':
list = ast.literal_eval(numbers[-1])
coord = [0,0,0,0]
for cur_name in list:
coord[0] = max(place[cur_name][0], coord[0])
coord[1] += place[cur_name][1] * place[cur_name][2]
coord[2] += place[cur_name][2]
coord[3] += place[cur_name][3]
coord[1] /= coord[2]
coord[0] += 1
global_top = max(global_top, coord[0])
place[name] = coord
for cur_name in list:
plt.plot([coord[1], place[cur_name][1]],[coord[0], place[cur_name][0]])
plt.plot([coord[1] - coord[3], coord[1] + coord[3]], [coord[0], coord[0]])
plt.plot([0], [global_top + 1])
plt.plot([0], [-1])
plt.show()
def convert(input_file):
print(input_file)
tmp_file = open(TMP_FILE, "w")
for line in open(input_file):
numbers = line.split('\t')
numbers2 = numbers[-2].split('_')
if numbers2[-2] == numbers2[-3]:
numbers2[-2] = str(int(numbers2[-2]) + 1)
numbers2[-3] = str(int(numbers2[-3]) + 1)
numbers[-2] = '_'.join(numbers2[1:])
print('\t'.join(numbers), end='', file=tmp_file)
else:
print(line, end='', file=tmp_file)
def main():
cfg = parse_args()
convert(cfg.file)
draw()
if __name__ == '__main__':
main()

View File

@ -1,61 +0,0 @@
import time
import ast
from datetime import datetime
FILE='data.tsv'
def get_metrix():
data = []
time_to_merge = 0
count_of_parts = 0
max_count_of_parts = 0
parts_in_time = []
last_date = 0
for line in open(FILE):
fields = line.split('\t')
last_date = datetime.strptime(fields[2], '%Y-%m-%d %H:%M:%S')
break
for line in open(FILE):
fields = line.split('\t')
cur_date = datetime.strptime(fields[2], '%Y-%m-%d %H:%M:%S')
if fields[0] == '2':
time_to_merge += int(fields[4])
list = ast.literal_eval(fields[-1])
count_of_parts -= len(list) - 1
else:
count_of_parts += 1
if max_count_of_parts < count_of_parts:
max_count_of_parts = count_of_parts
parts_in_time.append([(cur_date-last_date).total_seconds(), count_of_parts])
last_date = cur_date
stats_parts_in_time = []
global_time = 0
average_parts = 0
for i in range(max_count_of_parts + 1):
stats_parts_in_time.append(0)
for elem in parts_in_time:
stats_parts_in_time[elem[1]] += elem[0]
global_time += elem[0]
average_parts += elem[0] * elem[1]
for i in range(max_count_of_parts):
stats_parts_in_time[i] /= global_time
average_parts /= global_time
return time_to_merge, max_count_of_parts, average_parts, stats_parts_in_time
def main():
time_to_merge, max_parts, average_parts, stats_parts = get_metrix()
print('time_to_merge=', time_to_merge)
print('max_parts=', max_parts)
print('average_parts=', average_parts)
print('stats_parts=', stats_parts)
if __name__ == '__main__':
main()

View File

@ -1,56 +0,0 @@
#!/usr/bin/python3
import sys
import math
import statistics as stat
start = int(sys.argv[1])
end = int(sys.argv[2])
#Copied from dbms/src/Common/HashTable/Hash.h
def intHash32(key, salt = 0):
key ^= salt;
key = (~key) + (key << 18);
key = key ^ ((key >> 31) | (key << 33));
key = key * 21;
key = key ^ ((key >> 11) | (key << 53));
key = key + (key << 6);
key = key ^ ((key >> 22) | (key << 42));
return key & 0xffffffff
#Number of buckets for precision p = 12, m = 2^p
m = 4096
n = start
c = 0
m1 = {}
m2 = {}
l1 = []
l2 = []
while n <= end:
c += 1
h = intHash32(n)
#Extract left most 12 bits
x1 = (h >> 20) & 0xfff
m1[x1] = 1
z1 = m - len(m1)
#Linear counting formula
u1 = int(m * math.log(float(m) / float(z1)))
e1 = abs(100*float(u1 - c)/float(c))
l1.append(e1)
print("%d %d %d %f" % (n, c, u1, e1))
#Extract right most 12 bits
x2 = h & 0xfff
m2[x2] = 1
z2 = m - len(m2)
u2 = int(m * math.log(float(m) / float(z2)))
e2 = abs(100*float(u2 - c)/float(c))
l2.append(e2)
print("%d %d %d %f" % (n, c, u2, e2))
n += 1
print("Left 12 bits error: min=%f max=%f avg=%f median=%f median_low=%f median_high=%f" % (min(l1), max(l1), stat.mean(l1), stat.median(l1), stat.median_low(l1), stat.median_high(l1)))
print("Right 12 bits error: min=%f max=%f avg=%f median=%f median_low=%f median_high=%f" % (min(l2), max(l2), stat.mean(l2), stat.median(l2), stat.median_low(l2), stat.median_high(l2)))

View File

@ -1,11 +0,0 @@
#!/usr/bin/env bash
for ((p = 2; p <= 10; p++))
do
for ((i = 1; i <= 9; i++))
do
n=$(( 10**p * i ))
echo -n "$n "
clickhouse-client -q "select uniqHLL12(number), uniq(number), uniqCombined(number) from numbers($n);"
done
done

View File

@ -9,55 +9,97 @@
namespace DB
{
namespace ErrorCodes
{
extern const int NUMBER_OF_ARGUMENTS_DOESNT_MATCH;
extern const int BAD_ARGUMENTS;
}
namespace
{
/// Substitute return type for Date and DateTime
class AggregateFunctionGroupUniqArrayDate : public AggregateFunctionGroupUniqArray<DataTypeDate::FieldType>
template <typename has_limit>
class AggregateFunctionGroupUniqArrayDate : public AggregateFunctionGroupUniqArray<DataTypeDate::FieldType, has_limit>
{
public:
AggregateFunctionGroupUniqArrayDate(const DataTypePtr & argument_type) : AggregateFunctionGroupUniqArray<DataTypeDate::FieldType>(argument_type) {}
AggregateFunctionGroupUniqArrayDate(const DataTypePtr & argument_type, UInt64 max_elems_ = std::numeric_limits<UInt64>::max()) : AggregateFunctionGroupUniqArray<DataTypeDate::FieldType, has_limit>(argument_type, max_elems_) {}
DataTypePtr getReturnType() const override { return std::make_shared<DataTypeArray>(std::make_shared<DataTypeDate>()); }
};
class AggregateFunctionGroupUniqArrayDateTime : public AggregateFunctionGroupUniqArray<DataTypeDateTime::FieldType>
template <typename has_limit>
class AggregateFunctionGroupUniqArrayDateTime : public AggregateFunctionGroupUniqArray<DataTypeDateTime::FieldType, has_limit>
{
public:
AggregateFunctionGroupUniqArrayDateTime(const DataTypePtr & argument_type) : AggregateFunctionGroupUniqArray<DataTypeDateTime::FieldType>(argument_type) {}
AggregateFunctionGroupUniqArrayDateTime(const DataTypePtr & argument_type, UInt64 max_elems_ = std::numeric_limits<UInt64>::max()) : AggregateFunctionGroupUniqArray<DataTypeDateTime::FieldType, has_limit>(argument_type, max_elems_) {}
DataTypePtr getReturnType() const override { return std::make_shared<DataTypeArray>(std::make_shared<DataTypeDateTime>()); }
};
static IAggregateFunction * createWithExtraTypes(const DataTypePtr & argument_type)
template <typename has_limit, typename ... TArgs>
static IAggregateFunction * createWithExtraTypes(const DataTypePtr & argument_type, TArgs && ... args)
{
WhichDataType which(argument_type);
if (which.idx == TypeIndex::Date) return new AggregateFunctionGroupUniqArrayDate(argument_type);
else if (which.idx == TypeIndex::DateTime) return new AggregateFunctionGroupUniqArrayDateTime(argument_type);
if (which.idx == TypeIndex::Date) return new AggregateFunctionGroupUniqArrayDate<has_limit>(argument_type, std::forward<TArgs>(args)...);
else if (which.idx == TypeIndex::DateTime) return new AggregateFunctionGroupUniqArrayDateTime<has_limit>(argument_type, std::forward<TArgs>(args)...);
else
{
/// Check that we can use plain version of AggreagteFunctionGroupUniqArrayGeneric
if (argument_type->isValueUnambiguouslyRepresentedInContiguousMemoryRegion())
return new AggreagteFunctionGroupUniqArrayGeneric<true>(argument_type);
return new AggreagteFunctionGroupUniqArrayGeneric<true, has_limit>(argument_type, std::forward<TArgs>(args)...);
else
return new AggreagteFunctionGroupUniqArrayGeneric<false>(argument_type);
return new AggreagteFunctionGroupUniqArrayGeneric<false, has_limit>(argument_type, std::forward<TArgs>(args)...);
}
}
template <typename has_limit, typename ... TArgs>
inline AggregateFunctionPtr createAggregateFunctionGroupUniqArrayImpl(const std::string & name, const DataTypePtr & argument_type, TArgs ... args)
{
AggregateFunctionPtr res(createWithNumericType<AggregateFunctionGroupUniqArray, has_limit, const DataTypePtr &, TArgs...>(*argument_type, argument_type, std::forward<TArgs>(args)...));
if (!res)
res = AggregateFunctionPtr(createWithExtraTypes<has_limit>(argument_type, std::forward<TArgs>(args)...));
if (!res)
throw Exception("Illegal type " + argument_type->getName() +
" of argument for aggregate function " + name, ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT);
return res;
}
AggregateFunctionPtr createAggregateFunctionGroupUniqArray(const std::string & name, const DataTypes & argument_types, const Array & parameters)
{
assertNoParameters(name, parameters);
assertUnary(name, argument_types);
AggregateFunctionPtr res(createWithNumericType<AggregateFunctionGroupUniqArray>(*argument_types[0], argument_types[0]));
bool limit_size = false;
UInt64 max_elems = std::numeric_limits<UInt64>::max();
if (!res)
res = AggregateFunctionPtr(createWithExtraTypes(argument_types[0]));
if (parameters.empty())
{
// no limit
}
else if (parameters.size() == 1)
{
auto type = parameters[0].getType();
if (type != Field::Types::Int64 && type != Field::Types::UInt64)
throw Exception("Parameter for aggregate function " + name + " should be positive number", ErrorCodes::BAD_ARGUMENTS);
if (!res)
throw Exception("Illegal type " + argument_types[0]->getName() +
" of argument for aggregate function " + name, ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT);
if ((type == Field::Types::Int64 && parameters[0].get<Int64>() < 0) ||
(type == Field::Types::UInt64 && parameters[0].get<UInt64>() == 0))
throw Exception("Parameter for aggregate function " + name + " should be positive number", ErrorCodes::BAD_ARGUMENTS);
return res;
limit_size = true;
max_elems = parameters[0].get<UInt64>();
}
else
throw Exception("Incorrect number of parameters for aggregate function " + name + ", should be 0 or 1",
ErrorCodes::NUMBER_OF_ARGUMENTS_DOESNT_MATCH);
if (!limit_size)
return createAggregateFunctionGroupUniqArrayImpl<std::false_type>(name, argument_types[0]);
else
return createAggregateFunctionGroupUniqArrayImpl<std::true_type>(name, argument_types[0], max_elems);
}
}

View File

@ -36,16 +36,21 @@ struct AggregateFunctionGroupUniqArrayData
/// Puts all values to the hash set. Returns an array of unique values. Implemented for numeric types.
template <typename T>
template <typename T, typename Tlimit_num_elem>
class AggregateFunctionGroupUniqArray
: public IAggregateFunctionDataHelper<AggregateFunctionGroupUniqArrayData<T>, AggregateFunctionGroupUniqArray<T>>
: public IAggregateFunctionDataHelper<AggregateFunctionGroupUniqArrayData<T>, AggregateFunctionGroupUniqArray<T, Tlimit_num_elem>>
{
static constexpr bool limit_num_elems = Tlimit_num_elem::value;
UInt64 max_elems;
private:
using State = AggregateFunctionGroupUniqArrayData<T>;
public:
AggregateFunctionGroupUniqArray(const DataTypePtr & argument_type)
: IAggregateFunctionDataHelper<AggregateFunctionGroupUniqArrayData<T>, AggregateFunctionGroupUniqArray<T>>({argument_type}, {}) {}
AggregateFunctionGroupUniqArray(const DataTypePtr & argument_type, UInt64 max_elems_ = std::numeric_limits<UInt64>::max())
: IAggregateFunctionDataHelper<AggregateFunctionGroupUniqArrayData<T>,
AggregateFunctionGroupUniqArray<T, Tlimit_num_elem>>({argument_type}, {}),
max_elems(max_elems_) {}
String getName() const override { return "groupUniqArray"; }
@ -56,12 +61,27 @@ public:
void add(AggregateDataPtr place, const IColumn ** columns, size_t row_num, Arena *) const override
{
if (limit_num_elems && this->data(place).value.size() >= max_elems)
return;
this->data(place).value.insert(static_cast<const ColumnVector<T> &>(*columns[0]).getData()[row_num]);
}
void merge(AggregateDataPtr place, ConstAggregateDataPtr rhs, Arena *) const override
{
this->data(place).value.merge(this->data(rhs).value);
if (!limit_num_elems)
this->data(place).value.merge(this->data(rhs).value);
else
{
auto & cur_set = this->data(place).value;
auto & rhs_set = this->data(rhs).value;
for (auto & rhs_elem : rhs_set)
{
if (cur_set.size() >= max_elems)
return;
cur_set.insert(rhs_elem.getValue());
}
}
}
void serialize(ConstAggregateDataPtr place, WriteBuffer & buf) const override
@ -111,25 +131,43 @@ struct AggreagteFunctionGroupUniqArrayGenericData
Set value;
};
/// Helper function for deserialize and insert for the class AggreagteFunctionGroupUniqArrayGeneric
template <bool is_plain_column>
static StringRef getSerializationImpl(const IColumn & column, size_t row_num, Arena & arena);
template <bool is_plain_column>
static void deserializeAndInsertImpl(StringRef str, IColumn & data_to);
/** Template parameter with true value should be used for columns that store their elements in memory continuously.
* For such columns groupUniqArray() can be implemented more efficiently (especially for small numeric arrays).
*/
template <bool is_plain_column = false>
template <bool is_plain_column = false, typename Tlimit_num_elem = std::false_type>
class AggreagteFunctionGroupUniqArrayGeneric
: public IAggregateFunctionDataHelper<AggreagteFunctionGroupUniqArrayGenericData, AggreagteFunctionGroupUniqArrayGeneric<is_plain_column>>
: public IAggregateFunctionDataHelper<AggreagteFunctionGroupUniqArrayGenericData, AggreagteFunctionGroupUniqArrayGeneric<is_plain_column, Tlimit_num_elem>>
{
DataTypePtr & input_data_type;
static constexpr bool limit_num_elems = Tlimit_num_elem::value;
UInt64 max_elems;
using State = AggreagteFunctionGroupUniqArrayGenericData;
static StringRef getSerialization(const IColumn & column, size_t row_num, Arena & arena);
static StringRef getSerialization(const IColumn & column, size_t row_num, Arena & arena)
{
return getSerializationImpl<is_plain_column>(column, row_num, arena);
}
static void deserializeAndInsert(StringRef str, IColumn & data_to);
static void deserializeAndInsert(StringRef str, IColumn & data_to)
{
return deserializeAndInsertImpl<is_plain_column>(str, data_to);
}
public:
AggreagteFunctionGroupUniqArrayGeneric(const DataTypePtr & input_data_type)
: IAggregateFunctionDataHelper<AggreagteFunctionGroupUniqArrayGenericData, AggreagteFunctionGroupUniqArrayGeneric<is_plain_column>>({input_data_type}, {})
, input_data_type(this->argument_types[0]) {}
AggreagteFunctionGroupUniqArrayGeneric(const DataTypePtr & input_data_type, UInt64 max_elems_ = std::numeric_limits<UInt64>::max())
: IAggregateFunctionDataHelper<AggreagteFunctionGroupUniqArrayGenericData, AggreagteFunctionGroupUniqArrayGeneric<is_plain_column, Tlimit_num_elem>>({input_data_type}, {})
, input_data_type(this->argument_types[0])
, max_elems(max_elems_) {}
String getName() const override { return "groupUniqArray"; }
@ -174,7 +212,10 @@ public:
bool inserted;
State::Set::iterator it;
if (limit_num_elems && set.size() >= max_elems)
return;
StringRef str_serialized = getSerialization(*columns[0], row_num, *arena);
set.emplace(str_serialized, it, inserted);
if constexpr (!is_plain_column)
@ -198,6 +239,8 @@ public:
State::Set::iterator it;
for (auto & rhs_elem : rhs_set)
{
if (limit_num_elems && cur_set.size() >= max_elems)
return ;
cur_set.emplace(rhs_elem.getValue(), it, inserted);
if (inserted)
{
@ -229,31 +272,30 @@ public:
template <>
inline StringRef AggreagteFunctionGroupUniqArrayGeneric<false>::getSerialization(const IColumn & column, size_t row_num, Arena & arena)
inline StringRef getSerializationImpl<false>(const IColumn & column, size_t row_num, Arena & arena)
{
const char * begin = nullptr;
return column.serializeValueIntoArena(row_num, arena, begin);
}
template <>
inline StringRef AggreagteFunctionGroupUniqArrayGeneric<true>::getSerialization(const IColumn & column, size_t row_num, Arena &)
inline StringRef getSerializationImpl<true>(const IColumn & column, size_t row_num, Arena &)
{
return column.getDataAt(row_num);
}
template <>
inline void AggreagteFunctionGroupUniqArrayGeneric<false>::deserializeAndInsert(StringRef str, IColumn & data_to)
inline void deserializeAndInsertImpl<false>(StringRef str, IColumn & data_to)
{
data_to.deserializeAndInsertFromArena(str.data);
}
template <>
inline void AggreagteFunctionGroupUniqArrayGeneric<true>::deserializeAndInsert(StringRef str, IColumn & data_to)
inline void deserializeAndInsertImpl<true>(StringRef str, IColumn & data_to)
{
data_to.insertData(str.data, str.size);
}
#undef AGGREGATE_FUNCTION_GROUP_ARRAY_UNIQ_MAX_SIZE
}

View File

@ -104,7 +104,6 @@ public:
if (event)
{
this->data(place).add(i);
break;
}
}
}

View File

@ -19,7 +19,7 @@ list(REMOVE_ITEM clickhouse_aggregate_functions_headers
FactoryHelpers.h
)
add_library(clickhouse_aggregate_functions ${LINK_MODE} ${clickhouse_aggregate_functions_sources})
add_library(clickhouse_aggregate_functions ${clickhouse_aggregate_functions_sources})
target_link_libraries(clickhouse_aggregate_functions PRIVATE dbms PUBLIC ${CITYHASH_LIBRARIES})
target_include_directories (clickhouse_aggregate_functions BEFORE PRIVATE ${COMMON_INCLUDE_DIR})

View File

@ -136,7 +136,7 @@ class QuantileTDigest
{
if (unmerged > 0)
{
RadixSort<RadixSortTraits>::execute(summary.data(), summary.size());
RadixSort<RadixSortTraits>::executeLSD(summary.data(), summary.size());
if (summary.size() > 3)
{

View File

@ -1,7 +1,7 @@
# TODO: make separate lib datastream, block, ...
#include(${ClickHouse_SOURCE_DIR}/cmake/dbms_glob_sources.cmake)
#add_headers_and_sources(clickhouse_client .)
#add_library(clickhouse_client ${LINK_MODE} ${clickhouse_client_headers} ${clickhouse_client_sources})
#add_library(clickhouse_client ${clickhouse_client_headers} ${clickhouse_client_sources})
#target_link_libraries (clickhouse_client clickhouse_common_io ${Poco_Net_LIBRARY})
#target_include_directories (clickhouse_client PRIVATE ${DBMS_INCLUDE_DIR})

View File

@ -401,7 +401,7 @@ void Connection::sendQuery(
if (settings)
settings->serialize(*out);
else
writeStringBinary("", *out);
writeStringBinary("" /* empty string is a marker of the end of settings */, *out);
writeVarUInt(stage, *out);
writeVarUInt(static_cast<bool>(compression), *out);

View File

@ -62,6 +62,9 @@ IConnectionPool::Entry ConnectionPoolWithFailover::get(const Settings * settings
break;
case LoadBalancing::RANDOM:
break;
case LoadBalancing::FIRST_OR_RANDOM:
get_priority = [](size_t i) -> size_t { return i >= 1; };
break;
}
return Base::get(try_get_entry, get_priority);
@ -134,6 +137,9 @@ std::vector<ConnectionPoolWithFailover::TryResult> ConnectionPoolWithFailover::g
break;
case LoadBalancing::RANDOM:
break;
case LoadBalancing::FIRST_OR_RANDOM:
get_priority = [](size_t i) -> size_t { return i >= 1; };
break;
}
bool fallback_to_stale_replicas = settings ? bool(settings->fallback_to_stale_replicas_for_distributed_queries) : true;

View File

@ -43,13 +43,13 @@ using Arenas = std::vector<ArenaPtr>;
* specifying which individual values should be destroyed and which ones should not.
* Clearly, this method would have a substantially non-zero price.
*/
class ColumnAggregateFunction final : public COWPtrHelper<IColumn, ColumnAggregateFunction>
class ColumnAggregateFunction final : public COWHelper<IColumn, ColumnAggregateFunction>
{
public:
using Container = PaddedPODArray<AggregateDataPtr>;
private:
friend class COWPtrHelper<IColumn, ColumnAggregateFunction>;
friend class COWHelper<IColumn, ColumnAggregateFunction>;
/// Memory pools. Aggregate states are allocated from them.
Arenas arenas;

View File

@ -13,10 +13,10 @@ namespace DB
* In memory, it is represented as one column of a nested type, whose size is equal to the sum of the sizes of all arrays,
* and as an array of offsets in it, which allows you to get each element.
*/
class ColumnArray final : public COWPtrHelper<IColumn, ColumnArray>
class ColumnArray final : public COWHelper<IColumn, ColumnArray>
{
private:
friend class COWPtrHelper<IColumn, ColumnArray>;
friend class COWHelper<IColumn, ColumnArray>;
/** Create an array column with specified values and offsets. */
ColumnArray(MutableColumnPtr && nested_column, MutableColumnPtr && offsets_column);
@ -30,7 +30,7 @@ public:
/** Create immutable column using immutable arguments. This arguments may be shared with other columns.
* Use IColumn::mutate in order to make mutable column and mutate shared nested columns.
*/
using Base = COWPtrHelper<IColumn, ColumnArray>;
using Base = COWHelper<IColumn, ColumnArray>;
static Ptr create(const ColumnPtr & nested_column, const ColumnPtr & offsets_column)
{

View File

@ -18,10 +18,10 @@ namespace ErrorCodes
/** ColumnConst contains another column with single element,
* but looks like a column with arbitrary amount of same elements.
*/
class ColumnConst final : public COWPtrHelper<IColumn, ColumnConst>
class ColumnConst final : public COWHelper<IColumn, ColumnConst>
{
private:
friend class COWPtrHelper<IColumn, ColumnConst>;
friend class COWHelper<IColumn, ColumnConst>;
WrappedPtr data;
size_t s;

View File

@ -55,13 +55,13 @@ private:
/// A ColumnVector for Decimals
template <typename T>
class ColumnDecimal final : public COWPtrHelper<ColumnVectorHelper, ColumnDecimal<T>>
class ColumnDecimal final : public COWHelper<ColumnVectorHelper, ColumnDecimal<T>>
{
static_assert(IsDecimalNumber<T>);
private:
using Self = ColumnDecimal;
friend class COWPtrHelper<ColumnVectorHelper, Self>;
friend class COWHelper<ColumnVectorHelper, Self>;
public:
using Container = DecimalPaddedPODArray<T>;

View File

@ -13,10 +13,10 @@ namespace DB
/** A column of values of "fixed-length string" type.
* If you insert a smaller string, it will be padded with zero bytes.
*/
class ColumnFixedString final : public COWPtrHelper<ColumnVectorHelper, ColumnFixedString>
class ColumnFixedString final : public COWHelper<ColumnVectorHelper, ColumnFixedString>
{
public:
friend class COWPtrHelper<ColumnVectorHelper, ColumnFixedString>;
friend class COWHelper<ColumnVectorHelper, ColumnFixedString>;
using Chars = PaddedPODArray<UInt8>;

View File

@ -15,10 +15,10 @@ namespace DB
/** A column containing a lambda expression.
* Behaves like a constant-column. Contains an expression, but not input or output data.
*/
class ColumnFunction final : public COWPtrHelper<IColumn, ColumnFunction>
class ColumnFunction final : public COWHelper<IColumn, ColumnFunction>
{
private:
friend class COWPtrHelper<IColumn, ColumnFunction>;
friend class COWHelper<IColumn, ColumnFunction>;
ColumnFunction(size_t size, FunctionBasePtr function, const ColumnsWithTypeAndName & columns_to_capture);

View File

@ -14,9 +14,9 @@ namespace ErrorCodes
extern const int ILLEGAL_COLUMN;
}
class ColumnLowCardinality final : public COWPtrHelper<IColumn, ColumnLowCardinality>
class ColumnLowCardinality final : public COWHelper<IColumn, ColumnLowCardinality>
{
friend class COWPtrHelper<IColumn, ColumnLowCardinality>;
friend class COWHelper<IColumn, ColumnLowCardinality>;
ColumnLowCardinality(MutableColumnPtr && column_unique, MutableColumnPtr && indexes, bool is_shared = false);
ColumnLowCardinality(const ColumnLowCardinality & other) = default;
@ -25,7 +25,7 @@ public:
/** Create immutable column using immutable arguments. This arguments may be shared with other columns.
* Use IColumn::mutate in order to make mutable column and mutate shared nested columns.
*/
using Base = COWPtrHelper<IColumn, ColumnLowCardinality>;
using Base = COWHelper<IColumn, ColumnLowCardinality>;
static Ptr create(const ColumnPtr & column_unique_, const ColumnPtr & indexes_, bool is_shared = false)
{
return ColumnLowCardinality::create(column_unique_->assumeMutable(), indexes_->assumeMutable(), is_shared);

View File

@ -6,10 +6,10 @@
namespace DB
{
class ColumnNothing final : public COWPtrHelper<IColumnDummy, ColumnNothing>
class ColumnNothing final : public COWHelper<IColumnDummy, ColumnNothing>
{
private:
friend class COWPtrHelper<IColumnDummy, ColumnNothing>;
friend class COWHelper<IColumnDummy, ColumnNothing>;
ColumnNothing(size_t s_)
{

View File

@ -20,10 +20,10 @@ using ConstNullMapPtr = const NullMap *;
/// over a bitmap because columns are usually stored on disk as compressed
/// files. In this regard, using a bitmap instead of a byte map would
/// greatly complicate the implementation with little to no benefits.
class ColumnNullable final : public COWPtrHelper<IColumn, ColumnNullable>
class ColumnNullable final : public COWHelper<IColumn, ColumnNullable>
{
private:
friend class COWPtrHelper<IColumn, ColumnNullable>;
friend class COWHelper<IColumn, ColumnNullable>;
ColumnNullable(MutableColumnPtr && nested_column_, MutableColumnPtr && null_map_);
ColumnNullable(const ColumnNullable &) = default;
@ -32,7 +32,7 @@ public:
/** Create immutable column using immutable arguments. This arguments may be shared with other columns.
* Use IColumn::mutate in order to make mutable column and mutate shared nested columns.
*/
using Base = COWPtrHelper<IColumn, ColumnNullable>;
using Base = COWHelper<IColumn, ColumnNullable>;
static Ptr create(const ColumnPtr & nested_column_, const ColumnPtr & null_map_)
{
return ColumnNullable::create(nested_column_->assumeMutable(), null_map_->assumeMutable());

View File

@ -14,10 +14,10 @@ using ConstSetPtr = std::shared_ptr<const Set>;
* Behaves like a constant-column (because the set is one, not its own for each line).
* This column has a nonstandard value, so it can not be obtained via a normal interface.
*/
class ColumnSet final : public COWPtrHelper<IColumnDummy, ColumnSet>
class ColumnSet final : public COWHelper<IColumnDummy, ColumnSet>
{
private:
friend class COWPtrHelper<IColumnDummy, ColumnSet>;
friend class COWHelper<IColumnDummy, ColumnSet>;
ColumnSet(size_t s_, const ConstSetPtr & data_) : data(data_) { s = s_; }
ColumnSet(const ColumnSet &) = default;

View File

@ -18,14 +18,14 @@ namespace DB
/** Column for String values.
*/
class ColumnString final : public COWPtrHelper<IColumn, ColumnString>
class ColumnString final : public COWHelper<IColumn, ColumnString>
{
public:
using Char = UInt8;
using Chars = PaddedPODArray<UInt8>;
private:
friend class COWPtrHelper<IColumn, ColumnString>;
friend class COWHelper<IColumn, ColumnString>;
/// Maps i'th position to offset to i+1'th element. Last offset maps to the end of all chars (is the size of all chars).
Offsets offsets;

View File

@ -12,10 +12,10 @@ namespace DB
* Mixed constant/non-constant columns is prohibited in tuple
* for implementation simplicity.
*/
class ColumnTuple final : public COWPtrHelper<IColumn, ColumnTuple>
class ColumnTuple final : public COWHelper<IColumn, ColumnTuple>
{
private:
friend class COWPtrHelper<IColumn, ColumnTuple>;
friend class COWHelper<IColumn, ColumnTuple>;
using TupleColumns = std::vector<WrappedPtr>;
TupleColumns columns;
@ -30,7 +30,7 @@ public:
/** Create immutable column using immutable arguments. This arguments may be shared with other columns.
* Use IColumn::mutate in order to make mutable column and mutate shared nested columns.
*/
using Base = COWPtrHelper<IColumn, ColumnTuple>;
using Base = COWHelper<IColumn, ColumnTuple>;
static Ptr create(const Columns & columns);
static Ptr create(const TupleColumns & columns);
static Ptr create(Columns && arg) { return create(arg); }

View File

@ -25,9 +25,9 @@ namespace ErrorCodes
}
template <typename ColumnType>
class ColumnUnique final : public COWPtrHelper<IColumnUnique, ColumnUnique<ColumnType>>
class ColumnUnique final : public COWHelper<IColumnUnique, ColumnUnique<ColumnType>>
{
friend class COWPtrHelper<IColumnUnique, ColumnUnique<ColumnType>>;
friend class COWHelper<IColumnUnique, ColumnUnique<ColumnType>>;
private:
explicit ColumnUnique(MutableColumnPtr && holder, bool is_nullable);

View File

@ -7,6 +7,7 @@
#include <Common/Arena.h>
#include <Common/SipHash.h>
#include <Common/NaNUtils.h>
#include <Common/RadixSort.h>
#include <IO/WriteBuffer.h>
#include <IO/WriteHelpers.h>
#include <Columns/ColumnsCommon.h>
@ -18,7 +19,6 @@
#include <emmintrin.h>
#endif
namespace DB
{
@ -68,19 +68,41 @@ struct ColumnVector<T>::greater
bool operator()(size_t lhs, size_t rhs) const { return CompareHelper<T>::greater(parent.data[lhs], parent.data[rhs], nan_direction_hint); }
};
namespace
{
template <typename T>
struct ValueWithIndex
{
T value;
UInt32 index;
};
template <typename T>
struct RadixSortTraits : RadixSortNumTraits<T>
{
using Element = ValueWithIndex<T>;
static T & extractKey(Element & elem) { return elem.value; }
};
}
template <typename T>
void ColumnVector<T>::getPermutation(bool reverse, size_t limit, int nan_direction_hint, IColumn::Permutation & res) const
{
size_t s = data.size();
res.resize(s);
for (size_t i = 0; i < s; ++i)
res[i] = i;
if (s == 0)
return;
if (limit >= s)
limit = 0;
if (limit)
{
for (size_t i = 0; i < s; ++i)
res[i] = i;
if (reverse)
std::partial_sort(res.begin(), res.begin() + limit, res.end(), greater(*this, nan_direction_hint));
else
@ -88,6 +110,71 @@ void ColumnVector<T>::getPermutation(bool reverse, size_t limit, int nan_directi
}
else
{
/// A case for radix sort
if constexpr (std::is_arithmetic_v<T> && !std::is_same_v<T, UInt128>)
{
/// Thresholds on size. Lower threshold is arbitrary. Upper threshold is chosen by the type for histogram counters.
if (s >= 256 && s <= std::numeric_limits<UInt32>::max())
{
PaddedPODArray<ValueWithIndex<T>> pairs(s);
for (UInt32 i = 0; i < s; ++i)
pairs[i] = {data[i], i};
RadixSort<RadixSortTraits<T>>::executeLSD(pairs.data(), s);
/// Radix sort treats all NaNs to be greater than all numbers.
/// If the user needs the opposite, we must move them accordingly.
size_t nans_to_move = 0;
if (std::is_floating_point_v<T> && nan_direction_hint < 0)
{
for (ssize_t i = s - 1; i >= 0; --i)
{
if (isNaN(pairs[i].value))
++nans_to_move;
else
break;
}
}
if (reverse)
{
if (nans_to_move)
{
for (size_t i = 0; i < s - nans_to_move; ++i)
res[i] = pairs[s - nans_to_move - 1 - i].index;
for (size_t i = s - nans_to_move; i < s; ++i)
res[i] = pairs[s - 1 - (i - (s - nans_to_move))].index;
}
else
{
for (size_t i = 0; i < s; ++i)
res[s - 1 - i] = pairs[i].index;
}
}
else
{
if (nans_to_move)
{
for (size_t i = 0; i < nans_to_move; ++i)
res[i] = pairs[i + s - nans_to_move].index;
for (size_t i = nans_to_move; i < s; ++i)
res[i] = pairs[i - nans_to_move].index;
}
else
{
for (size_t i = 0; i < s; ++i)
res[i] = pairs[i].index;
}
}
return;
}
}
/// Default sorting algorithm.
for (size_t i = 0; i < s; ++i)
res[i] = i;
if (reverse)
pdqsort(res.begin(), res.end(), greater(*this, nan_direction_hint));
else
@ -95,6 +182,7 @@ void ColumnVector<T>::getPermutation(bool reverse, size_t limit, int nan_directi
}
}
template <typename T>
const char * ColumnVector<T>::getFamilyName() const
{

Some files were not shown because too many files have changed in this diff Show More