merged with master

Yuriy 2019-04-23 21:58:29 +03:00
commit a4bf3621e3
1371 changed files with 37306 additions and 19769 deletions

3
.gitmodules vendored

@ -76,3 +76,6 @@
[submodule "contrib/brotli"]
path = contrib/brotli
url = https://github.com/google/brotli.git
[submodule "contrib/hyperscan"]
path = contrib/hyperscan
url = https://github.com/ClickHouse-Extras/hyperscan.git


@ -1,3 +1,217 @@
## ClickHouse release 19.5.2.6, 2019-04-15
### New Features
* [Hyperscan](https://github.com/intel/hyperscan) multiple regular expression matching was added (functions `multiMatchAny`, `multiMatchAnyIndex`, `multiFuzzyMatchAny`, `multiFuzzyMatchAnyIndex`). [#4780](https://github.com/yandex/ClickHouse/pull/4780), [#4841](https://github.com/yandex/ClickHouse/pull/4841) ([Danila Kutenin](https://github.com/danlark1))
* `multiSearchFirstPosition` function was added. [#4780](https://github.com/yandex/ClickHouse/pull/4780) ([Danila Kutenin](https://github.com/danlark1))
* Implement the predefined expression filter per row for tables. [#4792](https://github.com/yandex/ClickHouse/pull/4792) ([Ivan](https://github.com/abyss7))
* A new type of data skipping indices based on bloom filters (can be used for `equal`, `in` and `like` functions). [#4499](https://github.com/yandex/ClickHouse/pull/4499) ([Nikita Vasilev](https://github.com/nikvas0))
* Added `ASOF JOIN`, which allows running queries that join to the most recent known value. [#4774](https://github.com/yandex/ClickHouse/pull/4774) [#4867](https://github.com/yandex/ClickHouse/pull/4867) [#4863](https://github.com/yandex/ClickHouse/pull/4863) [#4875](https://github.com/yandex/ClickHouse/pull/4875) ([Martijn Bakker](https://github.com/Gladdy), [Artem Zuikov](https://github.com/4ertus2))
* Rewrite multiple `COMMA JOIN` to `CROSS JOIN`. Then rewrite them to `INNER JOIN` if possible. [#4661](https://github.com/yandex/ClickHouse/pull/4661) ([Artem Zuikov](https://github.com/4ertus2))
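
The `ASOF JOIN` entry above describes joining each row to the most recent known value. The following is a minimal Python sketch of that idea, not ClickHouse's implementation. The function name `asof_join`, the `(key, timestamp, value)` row layout, and the "latest right timestamp less than or equal to the left timestamp" rule are assumptions made for illustration.

```python
from bisect import bisect_right


def asof_join(left, right):
    """Attach to each left row the latest right value with the same key
    whose timestamp is <= the left row's timestamp (illustrative rule)."""
    # Group right-side rows by key, keeping timestamps sorted per key.
    by_key = {}
    for key, ts, value in sorted(right, key=lambda r: (r[0], r[1])):
        times, values = by_key.setdefault(key, ([], []))
        times.append(ts)
        values.append(value)

    result = []
    for key, ts, value in left:
        times, values = by_key.get(key, ([], []))
        # Number of right rows for this key with timestamp <= ts.
        i = bisect_right(times, ts)
        if i > 0:
            result.append((key, ts, value, values[i - 1]))
    return result


trades = [("AAPL", 10, 100), ("AAPL", 20, 200), ("MSFT", 5, 50)]
quotes = [("AAPL", 8, 1.0), ("AAPL", 15, 1.5), ("MSFT", 7, 2.0)]
joined = asof_join(trades, quotes)
```

In this sketch, left rows without any earlier right row (the `MSFT` trade here) are dropped, which mirrors inner-join behaviour; that choice is also an assumption.
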
### Improvement
* `topK` and `topKWeighted` now support custom `loadFactor` (fixes issue [#4252](https://github.com/yandex/ClickHouse/issues/4252)). [#4634](https://github.com/yandex/ClickHouse/pull/4634) ([Kirill Danshin](https://github.com/kirillDanshin))
* Allow using `parallel_replicas_count > 1` even for tables without sampling (the setting is simply ignored for them). In previous versions this led to an exception. [#4637](https://github.com/yandex/ClickHouse/pull/4637) ([Alexey Elymanov](https://github.com/digitalist))
* Support for `CREATE OR REPLACE VIEW`. Allows creating a view or setting a new definition in a single statement. [#4654](https://github.com/yandex/ClickHouse/pull/4654) ([Boris Granveaud](https://github.com/bgranvea))
* `Buffer` table engine now supports `PREWHERE`. [#4671](https://github.com/yandex/ClickHouse/pull/4671) ([Yangkuan Liu](https://github.com/LiuYangkuan))
* Add ability to start replicated table without metadata in zookeeper in `readonly` mode. [#4691](https://github.com/yandex/ClickHouse/pull/4691) ([alesapin](https://github.com/alesapin))
* Fixed flicker of progress bar in clickhouse-client. The issue was most noticeable when using `FORMAT Null` with streaming queries. [#4811](https://github.com/yandex/ClickHouse/pull/4811) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Allow disabling functions that use the `hyperscan` library on a per-user basis to limit potentially excessive and uncontrolled resource usage. [#4816](https://github.com/yandex/ClickHouse/pull/4816) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Add version number logging in all errors. [#4824](https://github.com/yandex/ClickHouse/pull/4824) ([proller](https://github.com/proller))
* Added restriction to the `multiMatch` functions which requires string size to fit into `unsigned int`. Also added the number of arguments limit to the `multiSearch` functions. [#4834](https://github.com/yandex/ClickHouse/pull/4834) ([Danila Kutenin](https://github.com/danlark1))
* Improved usage of scratch space and error handling in Hyperscan. [#4866](https://github.com/yandex/ClickHouse/pull/4866) ([Danila Kutenin](https://github.com/danlark1))
* Fill `system.graphite_retentions` from a table config of `*GraphiteMergeTree` engine tables. [#4584](https://github.com/yandex/ClickHouse/pull/4584) ([Mikhail f. Shiryaev](https://github.com/Felixoid))
* Rename `trigramDistance` function to `ngramDistance` and add more functions with `CaseInsensitive` and `UTF`. [#4602](https://github.com/yandex/ClickHouse/pull/4602) ([Danila Kutenin](https://github.com/danlark1))
* Improved data skipping indices calculation. [#4640](https://github.com/yandex/ClickHouse/pull/4640) ([Nikita Vasilev](https://github.com/nikvas0))
### Bug Fix
* Avoid `std::terminate` in case of memory allocation failure. Now `std::bad_alloc` exception is thrown as expected. [#4665](https://github.com/yandex/ClickHouse/pull/4665) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fix capnproto reading from buffer. Sometimes files weren't loaded successfully over HTTP. [#4674](https://github.com/yandex/ClickHouse/pull/4674) ([Vladislav](https://github.com/smirnov-vs))
* Fix error `Unknown log entry type: 0` after `OPTIMIZE TABLE FINAL` query. [#4683](https://github.com/yandex/ClickHouse/pull/4683) ([Amos Bird](https://github.com/amosbird))
* Wrong arguments to `hasAny` or `hasAll` functions may lead to segfault. [#4698](https://github.com/yandex/ClickHouse/pull/4698) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Deadlock may happen while executing `DROP DATABASE dictionary` query. [#4701](https://github.com/yandex/ClickHouse/pull/4701) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fix undefined behavior in `median` and `quantile` functions. [#4702](https://github.com/yandex/ClickHouse/pull/4702) ([hcz](https://github.com/hczhcz))
* Fix compression level detection when `network_compression_method` in lowercase. Broken in v19.1. [#4706](https://github.com/yandex/ClickHouse/pull/4706) ([proller](https://github.com/proller))
* Keep ordinary, `DEFAULT`, `MATERIALIZED` and `ALIAS` columns in a single list (fixes issue [#2867](https://github.com/yandex/ClickHouse/issues/2867)). [#4707](https://github.com/yandex/ClickHouse/pull/4707) ([Alex Zatelepin](https://github.com/ztlpn))
* Fixed the `<timezone>UTC</timezone>` setting being ignored (fixes issue [#4658](https://github.com/yandex/ClickHouse/issues/4658)). [#4718](https://github.com/yandex/ClickHouse/pull/4718) ([proller](https://github.com/proller))
* Fix `histogram` function behaviour with `Distributed` tables. [#4741](https://github.com/yandex/ClickHouse/pull/4741) ([olegkv](https://github.com/olegkv))
* Fixed tsan report `destroy of a locked mutex`. [#4742](https://github.com/yandex/ClickHouse/pull/4742) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fixed TSan report on shutdown due to race condition in system logs usage. Fixed potential use-after-free on shutdown when part_log is enabled. [#4758](https://github.com/yandex/ClickHouse/pull/4758) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fix recheck parts in `ReplicatedMergeTreeAlterThread` in case of error. [#4772](https://github.com/yandex/ClickHouse/pull/4772) ([Nikolai Kochetov](https://github.com/KochetovNicolai))
* Arithmetic operations on intermediate aggregate function states were not working for constant arguments (such as subquery results). [#4776](https://github.com/yandex/ClickHouse/pull/4776) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Always backquote column names in metadata. Otherwise it's impossible to create a table with column named `index` (server won't restart due to malformed `ATTACH` query in metadata). [#4782](https://github.com/yandex/ClickHouse/pull/4782) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fix crash in `ALTER ... MODIFY ORDER BY` on `Distributed` table. [#4790](https://github.com/yandex/ClickHouse/pull/4790) ([TCeason](https://github.com/TCeason))
* Fix segfault in `JOIN ON` with enabled `enable_optimize_predicate_expression`. [#4794](https://github.com/yandex/ClickHouse/pull/4794) ([Winter Zhang](https://github.com/zhang2014))
* Fix bug with adding an extraneous row after consuming a protobuf message from Kafka. [#4808](https://github.com/yandex/ClickHouse/pull/4808) ([Vitaly Baranov](https://github.com/vitlibar))
* Fix crash of `JOIN` on not-nullable vs nullable column. Fix `NULLs` in right keys in `ANY JOIN` + `join_use_nulls`. [#4815](https://github.com/yandex/ClickHouse/pull/4815) ([Artem Zuikov](https://github.com/4ertus2))
* Fix segmentation fault in `clickhouse-copier`. [#4835](https://github.com/yandex/ClickHouse/pull/4835) ([proller](https://github.com/proller))
* Fixed race condition in `SELECT` from `system.tables` if the table is renamed or altered concurrently. [#4836](https://github.com/yandex/ClickHouse/pull/4836) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fixed data race when fetching data part that is already obsolete. [#4839](https://github.com/yandex/ClickHouse/pull/4839) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fixed rare data race that can happen during `RENAME` table of MergeTree family. [#4844](https://github.com/yandex/ClickHouse/pull/4844) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fixed segmentation fault in function `arrayIntersect`. Segmentation fault could happen if function was called with mixed constant and ordinary arguments. [#4847](https://github.com/yandex/ClickHouse/pull/4847) ([Lixiang Qian](https://github.com/fancyqlx))
* Fixed reading from `Array(LowCardinality)` column in rare case when column contained a long sequence of empty arrays. [#4850](https://github.com/yandex/ClickHouse/pull/4850) ([Nikolai Kochetov](https://github.com/KochetovNicolai))
* Fix crash in `FULL/RIGHT JOIN` when joining on nullable vs. not nullable columns. [#4855](https://github.com/yandex/ClickHouse/pull/4855) ([Artem Zuikov](https://github.com/4ertus2))
* Fix `No message received` exception while fetching parts between replicas. [#4856](https://github.com/yandex/ClickHouse/pull/4856) ([alesapin](https://github.com/alesapin))
* Fixed `arrayIntersect` function wrong result in case of several repeated values in single array. [#4871](https://github.com/yandex/ClickHouse/pull/4871) ([Nikolai Kochetov](https://github.com/KochetovNicolai))
* Fix a race condition during concurrent `ALTER COLUMN` queries that could lead to a server crash (fixes issue [#3421](https://github.com/yandex/ClickHouse/issues/3421)). [#4592](https://github.com/yandex/ClickHouse/pull/4592) ([Alex Zatelepin](https://github.com/ztlpn))
* Fix incorrect result in `FULL/RIGHT JOIN` with const column. [#4723](https://github.com/yandex/ClickHouse/pull/4723) ([Artem Zuikov](https://github.com/4ertus2))
* Fix duplicates in `GLOBAL JOIN` with asterisk. [#4705](https://github.com/yandex/ClickHouse/pull/4705) ([Artem Zuikov](https://github.com/4ertus2))
* Fix parameter deduction in `ALTER MODIFY` of column `CODEC` when column type is not specified. [#4883](https://github.com/yandex/ClickHouse/pull/4883) ([alesapin](https://github.com/alesapin))
* Functions `cutQueryStringAndFragment()` and `queryStringAndFragment()` now work correctly when `URL` contains a fragment and no query. [#4894](https://github.com/yandex/ClickHouse/pull/4894) ([Vitaly Baranov](https://github.com/vitlibar))
* Fix a rare bug when setting `min_bytes_to_use_direct_io` is greater than zero, which occurs when a thread has to seek backward in a column file. [#4897](https://github.com/yandex/ClickHouse/pull/4897) ([alesapin](https://github.com/alesapin))
* Fix wrong argument types for aggregate functions with `LowCardinality` arguments (fixes issue [#4919](https://github.com/yandex/ClickHouse/issues/4919)). [#4922](https://github.com/yandex/ClickHouse/pull/4922) ([Nikolai Kochetov](https://github.com/KochetovNicolai))
* Fix wrong name qualification in `GLOBAL JOIN`. [#4969](https://github.com/yandex/ClickHouse/pull/4969) ([Artem Zuikov](https://github.com/4ertus2))
* Fix `toISOWeek` function result for year 1970. [#4988](https://github.com/yandex/ClickHouse/pull/4988) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fix `DROP`, `TRUNCATE` and `OPTIMIZE` queries duplication, when executed on `ON CLUSTER` for `ReplicatedMergeTree*` tables family. [#4991](https://github.com/yandex/ClickHouse/pull/4991) ([alesapin](https://github.com/alesapin))
### Backward Incompatible Change
* Rename setting `insert_sample_with_metadata` to setting `input_format_defaults_for_omitted_fields`. [#4771](https://github.com/yandex/ClickHouse/pull/4771) ([Artem Zuikov](https://github.com/4ertus2))
* Added setting `max_partitions_per_insert_block` (with value 100 by default). If inserted block contains larger number of partitions, an exception is thrown. Set it to 0 if you want to remove the limit (not recommended). [#4845](https://github.com/yandex/ClickHouse/pull/4845) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Multi-search functions were renamed (`multiPosition` to `multiSearchAllPositions`, `multiSearch` to `multiSearchAny`, `firstMatch` to `multiSearchFirstIndex`). [#4780](https://github.com/yandex/ClickHouse/pull/4780) ([Danila Kutenin](https://github.com/danlark1))
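
The `max_partitions_per_insert_block` entry above amounts to a guard on the number of distinct partitions in one inserted block. Below is a rough Python sketch of such a guard, not the server's actual code. The names `check_partitions`, `TooManyPartitionsError` and `partition_key` are hypothetical; only the default of 100 and the "0 removes the limit" rule come from the entry.

```python
class TooManyPartitionsError(Exception):
    """Hypothetical error type standing in for the server-side exception."""


def check_partitions(block_rows, partition_key, max_partitions_per_insert_block=100):
    """Count distinct partitions in a block and raise if the limit is exceeded.

    A limit of 0 disables the check, as described in the changelog entry.
    """
    partitions = {partition_key(row) for row in block_rows}
    limit = max_partitions_per_insert_block
    if limit != 0 and len(partitions) > limit:
        raise TooManyPartitionsError(
            f"block touches {len(partitions)} partitions, limit is {limit}"
        )
    return len(partitions)


# Rows are (date_string, value); the partition key here is the YYYY-MM prefix.
rows = [("2019-01-05", 1), ("2019-02-11", 2), ("2019-03-20", 3), ("2019-03-21", 4)]
month = lambda row: row[0][:7]
```

With the default limit the sample block passes; lowering the limit to 2 makes the same block fail, and setting it to 0 lets any block through.
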
### Performance Improvement
* Optimize Volnitsky searcher by inlining, giving about 5-10% search improvement for queries with many needles or many similar bigrams. [#4862](https://github.com/yandex/ClickHouse/pull/4862) ([Danila Kutenin](https://github.com/danlark1))
* Fix performance issue when setting `use_uncompressed_cache` is greater than zero, which appeared when all read data was contained in the cache. [#4913](https://github.com/yandex/ClickHouse/pull/4913) ([alesapin](https://github.com/alesapin))
### Build/Testing/Packaging Improvement
* Hardening debug build: more granular memory mappings and ASLR; add memory protection for mark cache and index. This allows finding more memory stomping bugs in cases where ASan and MSan cannot. [#4632](https://github.com/yandex/ClickHouse/pull/4632) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Add support for cmake variables `ENABLE_PROTOBUF`, `ENABLE_PARQUET` and `ENABLE_BROTLI`, which allow enabling/disabling the above features (same as we can do for librdkafka, mysql, etc). [#4669](https://github.com/yandex/ClickHouse/pull/4669) ([Silviu Caragea](https://github.com/silviucpp))
* Add ability to print process list and stacktraces of all threads if some queries are hung after test run. [#4675](https://github.com/yandex/ClickHouse/pull/4675) ([alesapin](https://github.com/alesapin))
* Add retries on `Connection loss` error in `clickhouse-test`. [#4682](https://github.com/yandex/ClickHouse/pull/4682) ([alesapin](https://github.com/alesapin))
* Add freebsd build with vagrant and build with thread sanitizer to packager script. [#4712](https://github.com/yandex/ClickHouse/pull/4712) [#4748](https://github.com/yandex/ClickHouse/pull/4748) ([alesapin](https://github.com/alesapin))
* Now the user is asked for a password for user `'default'` during installation. [#4725](https://github.com/yandex/ClickHouse/pull/4725) ([proller](https://github.com/proller))
* Suppress warning in `rdkafka` library. [#4740](https://github.com/yandex/ClickHouse/pull/4740) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Allow building without ssl. [#4750](https://github.com/yandex/ClickHouse/pull/4750) ([proller](https://github.com/proller))
* Add a way to launch clickhouse-server image from a custom user. [#4753](https://github.com/yandex/ClickHouse/pull/4753) ([Mikhail f. Shiryaev](https://github.com/Felixoid))
* Upgrade contrib boost to 1.69. [#4793](https://github.com/yandex/ClickHouse/pull/4793) ([proller](https://github.com/proller))
* Disable usage of `mremap` when compiled with Thread Sanitizer. Surprisingly enough, TSan does not intercept `mremap` (though it does intercept `mmap`, `munmap`) that leads to false positives. Fixed TSan report in stateful tests. [#4859](https://github.com/yandex/ClickHouse/pull/4859) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Add test checking using format schema via HTTP interface. [#4864](https://github.com/yandex/ClickHouse/pull/4864) ([Vitaly Baranov](https://github.com/vitlibar))
## ClickHouse release 19.4.3.11, 2019-04-02
### Bug Fixes
* Fix crash in `FULL/RIGHT JOIN` when joining on nullable vs. not nullable columns. [#4855](https://github.com/yandex/ClickHouse/pull/4855) ([Artem Zuikov](https://github.com/4ertus2))
* Fix segmentation fault in `clickhouse-copier`. [#4835](https://github.com/yandex/ClickHouse/pull/4835) ([proller](https://github.com/proller))
### Build/Testing/Packaging Improvement
* Add a way to launch clickhouse-server image from a custom user. [#4753](https://github.com/yandex/ClickHouse/pull/4753) ([Mikhail f. Shiryaev](https://github.com/Felixoid))
## ClickHouse release 19.4.2.7, 2019-03-30
### Bug Fixes
* Fixed reading from `Array(LowCardinality)` column in rare case when column contained a long sequence of empty arrays. [#4850](https://github.com/yandex/ClickHouse/pull/4850) ([Nikolai Kochetov](https://github.com/KochetovNicolai))
## ClickHouse release 19.4.1.3, 2019-03-19
### Bug Fixes
* Fixed remote queries which contain both `LIMIT BY` and `LIMIT`. Previously, if `LIMIT BY` and `LIMIT` were used for a remote query, `LIMIT` could be applied before `LIMIT BY`, which led to an over-filtered result. [#4708](https://github.com/yandex/ClickHouse/pull/4708) ([Constantin S. Pan](https://github.com/kvap))
## ClickHouse release 19.4.0.49, 2019-03-09
### New Features
* Added full support for `Protobuf` format (input and output, nested data structures). [#4174](https://github.com/yandex/ClickHouse/pull/4174) [#4493](https://github.com/yandex/ClickHouse/pull/4493) ([Vitaly Baranov](https://github.com/vitlibar))
* Added bitmap functions with Roaring Bitmaps. [#4207](https://github.com/yandex/ClickHouse/pull/4207) ([Andy Yang](https://github.com/andyyzh)) [#4568](https://github.com/yandex/ClickHouse/pull/4568) ([Vitaly Baranov](https://github.com/vitlibar))
* Parquet format support. [#4448](https://github.com/yandex/ClickHouse/pull/4448) ([proller](https://github.com/proller))
* N-gram distance was added for fuzzy string comparison. It is similar to q-gram metrics in R language. [#4466](https://github.com/yandex/ClickHouse/pull/4466) ([Danila Kutenin](https://github.com/danlark1))
* Combine rules for graphite rollup from dedicated aggregation and retention patterns. [#4426](https://github.com/yandex/ClickHouse/pull/4426) ([Mikhail f. Shiryaev](https://github.com/Felixoid))
* Added `max_execution_speed` and `max_execution_speed_bytes` to limit resource usage. Added `min_execution_speed_bytes` setting to complement the `min_execution_speed`. [#4430](https://github.com/yandex/ClickHouse/pull/4430) ([Winter Zhang](https://github.com/zhang2014))
* Implemented function `flatten`. [#4555](https://github.com/yandex/ClickHouse/pull/4555) [#4409](https://github.com/yandex/ClickHouse/pull/4409) ([alexey-milovidov](https://github.com/alexey-milovidov), [kzon](https://github.com/kzon))
* Added functions `arrayEnumerateDenseRanked` and `arrayEnumerateUniqRanked` (it's like `arrayEnumerateUniq` but allows fine-tuning the array depth to look inside multidimensional arrays). [#4475](https://github.com/yandex/ClickHouse/pull/4475) ([proller](https://github.com/proller)) [#4601](https://github.com/yandex/ClickHouse/pull/4601) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Multiple JOINS with some restrictions: no asterisks, no complex aliases in ON/WHERE/GROUP BY/... [#4462](https://github.com/yandex/ClickHouse/pull/4462) ([Artem Zuikov](https://github.com/4ertus2))
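
The N-gram distance entry above compares it to q-gram metrics. As a rough illustration of that family of metrics, here is a small Python sketch of a normalised q-gram distance. It is not the ClickHouse implementation: the function names, the default `q=3`, and the normalisation to the range 0 to 1 are all assumptions.

```python
from collections import Counter


def qgrams(s, q):
    """Return a multiset of all substrings of length q in s."""
    return Counter(s[i:i + q] for i in range(len(s) - q + 1))


def qgram_distance(a, b, q=3):
    """Sum of absolute q-gram count differences, divided by the total
    number of q-grams in both strings (0.0 means identical profiles)."""
    ca, cb = qgrams(a, q), qgrams(b, q)
    diff = sum(abs(ca[g] - cb[g]) for g in ca.keys() | cb.keys())
    total = sum(ca.values()) + sum(cb.values())
    return diff / total if total else 0.0
```

Strings shorter than `q` produce no q-grams in this sketch, so two such strings compare as distance 0.0; a real implementation may treat that case differently.
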
### Bug Fixes
* This release also contains all bug fixes from 19.3 and 19.1.
* Fixed bug in data skipping indices: order of granules after INSERT was incorrect. [#4407](https://github.com/yandex/ClickHouse/pull/4407) ([Nikita Vasilev](https://github.com/nikvas0))
* Fixed `set` index for `Nullable` and `LowCardinality` columns. Before it, `set` index with `Nullable` or `LowCardinality` column led to error `Data type must be deserialized with multiple streams` while selecting. [#4594](https://github.com/yandex/ClickHouse/pull/4594) ([Nikolai Kochetov](https://github.com/KochetovNicolai))
* Correctly set update_time on full `executable` dictionary update. [#4551](https://github.com/yandex/ClickHouse/pull/4551) ([Tema Novikov](https://github.com/temoon))
* Fix broken progress bar in 19.3. [#4627](https://github.com/yandex/ClickHouse/pull/4627) ([filimonov](https://github.com/filimonov))
* Fixed inconsistent values of MemoryTracker when a memory region was shrunk, in certain cases. [#4619](https://github.com/yandex/ClickHouse/pull/4619) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fixed undefined behaviour in ThreadPool. [#4612](https://github.com/yandex/ClickHouse/pull/4612) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fixed a very rare crash with the message `mutex lock failed: Invalid argument` that could happen when a MergeTree table was dropped concurrently with a SELECT. [#4608](https://github.com/yandex/ClickHouse/pull/4608) ([Alex Zatelepin](https://github.com/ztlpn))
* ODBC driver compatibility with `LowCardinality` data type. [#4381](https://github.com/yandex/ClickHouse/pull/4381) ([proller](https://github.com/proller))
* FreeBSD: Fixup for `AIOcontextPool: Found io_event with unknown id 0` error. [#4438](https://github.com/yandex/ClickHouse/pull/4438) ([urgordeadbeef](https://github.com/urgordeadbeef))
* `system.part_log` table was created regardless of configuration. [#4483](https://github.com/yandex/ClickHouse/pull/4483) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fix undefined behaviour in `dictIsIn` function for cache dictionaries. [#4515](https://github.com/yandex/ClickHouse/pull/4515) ([alesapin](https://github.com/alesapin))
* Fixed a deadlock when a SELECT query locks the same table multiple times (e.g. from different threads or when executing multiple subqueries) and there is a concurrent DDL query. [#4535](https://github.com/yandex/ClickHouse/pull/4535) ([Alex Zatelepin](https://github.com/ztlpn))
* Disable compile_expressions by default until we get own `llvm` contrib and can test it with `clang` and `asan`. [#4579](https://github.com/yandex/ClickHouse/pull/4579) ([alesapin](https://github.com/alesapin))
* Prevent `std::terminate` when `invalidate_query` for `clickhouse` external dictionary source has returned a wrong resultset (empty or more than one row or more than one column). Fixed issue when the `invalidate_query` was performed every five seconds regardless of the `lifetime`. [#4583](https://github.com/yandex/ClickHouse/pull/4583) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Avoid deadlock when the `invalidate_query` for a dictionary with `clickhouse` source was involving `system.dictionaries` table or `Dictionaries` database (rare case). [#4599](https://github.com/yandex/ClickHouse/pull/4599) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fixes for CROSS JOIN with empty WHERE. [#4598](https://github.com/yandex/ClickHouse/pull/4598) ([Artem Zuikov](https://github.com/4ertus2))
* Fixed segfault in function "replicate" when constant argument is passed. [#4603](https://github.com/yandex/ClickHouse/pull/4603) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fix lambda function with predicate optimizer. [#4408](https://github.com/yandex/ClickHouse/pull/4408) ([Winter Zhang](https://github.com/zhang2014))
* Multiple fixes for multiple JOINs. [#4595](https://github.com/yandex/ClickHouse/pull/4595) ([Artem Zuikov](https://github.com/4ertus2))
### Improvements
* Support aliases in JOIN ON section for right table columns. [#4412](https://github.com/yandex/ClickHouse/pull/4412) ([Artem Zuikov](https://github.com/4ertus2))
* Results of multiple JOINs need correct result names to be used in subselects. Replace flat aliases with source names in result. [#4474](https://github.com/yandex/ClickHouse/pull/4474) ([Artem Zuikov](https://github.com/4ertus2))
* Improve push-down logic for joined statements. [#4387](https://github.com/yandex/ClickHouse/pull/4387) ([Ivan](https://github.com/abyss7))
### Performance Improvements
* Improved heuristics of "move to PREWHERE" optimization. [#4405](https://github.com/yandex/ClickHouse/pull/4405) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Use proper lookup tables that use HashTable's API for 8-bit and 16-bit keys. [#4536](https://github.com/yandex/ClickHouse/pull/4536) ([Amos Bird](https://github.com/amosbird))
* Improved performance of string comparison. [#4564](https://github.com/yandex/ClickHouse/pull/4564) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Cleanup distributed DDL queue in a separate thread so that it doesn't slow down the main loop that processes distributed DDL tasks. [#4502](https://github.com/yandex/ClickHouse/pull/4502) ([Alex Zatelepin](https://github.com/ztlpn))
* When `min_bytes_to_use_direct_io` is set to 1, not every file was opened with O_DIRECT mode because the data size to read was sometimes underestimated by the size of one compressed block. [#4526](https://github.com/yandex/ClickHouse/pull/4526) ([alexey-milovidov](https://github.com/alexey-milovidov))
### Build/Testing/Packaging Improvement
* Added support for clang-9 [#4604](https://github.com/yandex/ClickHouse/pull/4604) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fix wrong `__asm__` instructions (again) [#4621](https://github.com/yandex/ClickHouse/pull/4621) ([Konstantin Podshumok](https://github.com/podshumok))
* Add ability to specify settings for `clickhouse-performance-test` from command line. [#4437](https://github.com/yandex/ClickHouse/pull/4437) ([alesapin](https://github.com/alesapin))
* Add dictionaries tests to integration tests. [#4477](https://github.com/yandex/ClickHouse/pull/4477) ([alesapin](https://github.com/alesapin))
* Added queries from the benchmark on the website to automated performance tests. [#4496](https://github.com/yandex/ClickHouse/pull/4496) ([alexey-milovidov](https://github.com/alexey-milovidov))
* `xxhash.h` does not exist in external lz4 because it is an implementation detail and its symbols are namespaced with `XXH_NAMESPACE` macro. When lz4 is external, xxHash has to be external too, and the dependents have to link to it. [#4495](https://github.com/yandex/ClickHouse/pull/4495) ([Orivej Desh](https://github.com/orivej))
* Fixed a case when `quantileTiming` aggregate function can be called with negative or floating point argument (this fixes fuzz test with undefined behaviour sanitizer). [#4506](https://github.com/yandex/ClickHouse/pull/4506) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Spelling error correction. [#4531](https://github.com/yandex/ClickHouse/pull/4531) ([sdk2](https://github.com/sdk2))
* Fix compilation on Mac. [#4371](https://github.com/yandex/ClickHouse/pull/4371) ([Vitaly Baranov](https://github.com/vitlibar))
* Build fixes for FreeBSD and various unusual build configurations. [#4444](https://github.com/yandex/ClickHouse/pull/4444) ([proller](https://github.com/proller))
## ClickHouse release 19.3.9.1, 2019-04-02
### Bug Fixes
* Fix crash in `FULL/RIGHT JOIN` when joining on nullable vs. not nullable columns. [#4855](https://github.com/yandex/ClickHouse/pull/4855) ([Artem Zuikov](https://github.com/4ertus2))
* Fix segmentation fault in `clickhouse-copier`. [#4835](https://github.com/yandex/ClickHouse/pull/4835) ([proller](https://github.com/proller))
* Fixed reading from `Array(LowCardinality)` column in rare case when column contained a long sequence of empty arrays. [#4850](https://github.com/yandex/ClickHouse/pull/4850) ([Nikolai Kochetov](https://github.com/KochetovNicolai))
### Build/Testing/Packaging Improvement
* Add a way to launch clickhouse-server image from a custom user [#4753](https://github.com/yandex/ClickHouse/pull/4753) ([Mikhail f. Shiryaev](https://github.com/Felixoid))
## ClickHouse release 19.3.7, 2019-03-12
### Bug fixes
* Fixed error in #3920. This error manifested itself as random cache corruption (messages `Unknown codec family code`, `Cannot seek through file`) and segfaults. This bug first appeared in version 19.1 and is present in versions up to 19.1.10 and 19.3.6. [#4623](https://github.com/yandex/ClickHouse/pull/4623) ([alexey-milovidov](https://github.com/alexey-milovidov))
## ClickHouse release 19.3.6, 2019-03-02
### Bug fixes
* When there are more than 1000 threads in a thread pool, `std::terminate` may happen on thread exit. [Azat Khuzhin](https://github.com/azat) [#4485](https://github.com/yandex/ClickHouse/pull/4485) [#4505](https://github.com/yandex/ClickHouse/pull/4505) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Now it's possible to create `ReplicatedMergeTree*` tables with comments on columns without defaults and tables with columns codecs without comments and defaults. Also fix comparison of codecs. [#4523](https://github.com/yandex/ClickHouse/pull/4523) ([alesapin](https://github.com/alesapin))
* Fixed crash on JOIN with array or tuple. [#4552](https://github.com/yandex/ClickHouse/pull/4552) ([Artem Zuikov](https://github.com/4ertus2))
* Fixed crash in clickhouse-copier with the message `ThreadStatus not created`. [#4540](https://github.com/yandex/ClickHouse/pull/4540) ([Artem Zuikov](https://github.com/4ertus2))
* Fixed hangup on server shutdown if distributed DDLs were used. [#4472](https://github.com/yandex/ClickHouse/pull/4472) ([Alex Zatelepin](https://github.com/ztlpn))
* Incorrect column numbers were printed in error message about text format parsing for columns with number greater than 10. [#4484](https://github.com/yandex/ClickHouse/pull/4484) ([alexey-milovidov](https://github.com/alexey-milovidov))
### Build/Testing/Packaging Improvements
* Fixed build with AVX enabled. [#4527](https://github.com/yandex/ClickHouse/pull/4527) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Enable extended accounting and IO accounting based on a known good version instead of the kernel under which it is compiled. [#4541](https://github.com/yandex/ClickHouse/pull/4541) ([nvartolomei](https://github.com/nvartolomei))
* Allow skipping the setting of core_dump.size_limit; warn instead of throwing if setting the limit fails. [#4473](https://github.com/yandex/ClickHouse/pull/4473) ([proller](https://github.com/proller))
* Removed the `inline` tags of `void readBinary(...)` in `Field.cpp`. Also merged redundant `namespace DB` blocks. [#4530](https://github.com/yandex/ClickHouse/pull/4530) ([hcz](https://github.com/hczhcz))
## ClickHouse release 19.3.5, 2019-02-21
### Bug fixes
@ -67,7 +281,7 @@
* Fixed race condition when selecting from `system.tables` may give `table doesn't exist` error. [#4313](https://github.com/yandex/ClickHouse/pull/4313) ([alexey-milovidov](https://github.com/alexey-milovidov))
* `clickhouse-client` can segfault on exit while loading data for command line suggestions if it was run in interactive mode. [#4317](https://github.com/yandex/ClickHouse/pull/4317) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fixed a bug when the execution of mutations containing `IN` operators was producing incorrect results. [#4099](https://github.com/yandex/ClickHouse/pull/4099) ([Alex Zatelepin](https://github.com/ztlpn))
* Fixed error: if there is a database with `Dictionary` engine, all dictionaries were forced to load at server startup, and if there is a dictionary with ClickHouse source from localhost, the dictionary cannot load. [#4255](https://github.com/yandex/ClickHouse/pull/4255) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fixed error when system logs are tried to create again at server shutdown. [#4254](https://github.com/yandex/ClickHouse/pull/4254) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Correctly return the right type and properly handle locks in `joinGet` function. [#4153](https://github.com/yandex/ClickHouse/pull/4153) ([Amos Bird](https://github.com/amosbird))
* Added `sumMapWithOverflow` function. [#4151](https://github.com/yandex/ClickHouse/pull/4151) ([Léo Ercolanelli](https://github.com/ercolanelli-leo))
@ -92,7 +306,7 @@
* Added a script that creates the changelog from pull request descriptions. [#4169](https://github.com/yandex/ClickHouse/pull/4169) [#4173](https://github.com/yandex/ClickHouse/pull/4173) ([KochetovNicolai](https://github.com/KochetovNicolai))
* Added a puppet module for ClickHouse. [#4182](https://github.com/yandex/ClickHouse/pull/4182) ([Maxim Fedotov](https://github.com/MaxFedotov))
* Added docs for a group of undocumented functions. [#4168](https://github.com/yandex/ClickHouse/pull/4168) ([Winter Zhang](https://github.com/zhang2014))
* ARM build fixes. [#4210](https://github.com/yandex/ClickHouse/pull/4210) [#4306](https://github.com/yandex/ClickHouse/pull/4306) [#4291](https://github.com/yandex/ClickHouse/pull/4291) ([proller](https://github.com/proller))
* Dictionary tests can now be run from `ctest`. [#4189](https://github.com/yandex/ClickHouse/pull/4189) ([proller](https://github.com/proller))
* `/etc/ssl` is now used as the default directory for SSL certificates. [#4167](https://github.com/yandex/ClickHouse/pull/4167) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Added a check for SSE and AVX instructions at startup. [#4234](https://github.com/yandex/ClickHouse/pull/4234) ([Igr](https://github.com/igron99))
@ -123,6 +337,20 @@
* Improved server shutdown time and ALTERs waiting time. [#4372](https://github.com/yandex/ClickHouse/pull/4372) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Added info about the `replicated_can_become_leader` setting to `system.replicas` and added logging for the case when a replica won't try to become leader. [#4379](https://github.com/yandex/ClickHouse/pull/4379) ([Alex Zatelepin](https://github.com/ztlpn))
## ClickHouse release 19.1.14, 2019-03-14
* Fixed error `Column ... queried more than once` that may happen if the setting `asterisk_left_columns_only` is set to 1 in case of using `GLOBAL JOIN` with `SELECT *` (rare case). The issue does not exist in 19.3 and newer. [6bac7d8d](https://github.com/yandex/ClickHouse/pull/4692/commits/6bac7d8d11a9b0d6de0b32b53c47eb2f6f8e7062) ([Artem Zuikov](https://github.com/4ertus2))
## ClickHouse release 19.1.13, 2019-03-12
This release contains exactly the same set of patches as 19.3.7.
## ClickHouse release 19.1.10, 2019-03-03
This release contains exactly the same set of patches as 19.3.6.
## ClickHouse release 19.1.9, 2019-02-21
### Bug fixes
@ -140,7 +368,7 @@
### Bug Fixes
* Correctly return the right type and properly handle locks in `joinGet` function. [#4153](https://github.com/yandex/ClickHouse/pull/4153) ([Amos Bird](https://github.com/amosbird))
* Fixed an error where system logs were created again at server shutdown. [#4254](https://github.com/yandex/ClickHouse/pull/4254) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fixed an error: if there was a database with the `Dictionary` engine, all dictionaries were forced to load at server startup, and a dictionary with a ClickHouse source from localhost could not load. [#4255](https://github.com/yandex/ClickHouse/pull/4255) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fixed a bug when the execution of mutations containing `IN` operators was producing incorrect results. [#4099](https://github.com/yandex/ClickHouse/pull/4099) ([Alex Zatelepin](https://github.com/ztlpn))
* Fixed a segfault in `clickhouse-client` on exit while loading data for command line suggestions in interactive mode. [#4317](https://github.com/yandex/ClickHouse/pull/4317) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fixed a race condition in which selecting from `system.tables` could give a `table doesn't exist` error. [#4313](https://github.com/yandex/ClickHouse/pull/4313) ([alexey-milovidov](https://github.com/alexey-milovidov))


@ -1,3 +1,89 @@
## ClickHouse release 19.4.0.49, 2019-03-09
### New Features
* Added full support for the `Protobuf` format (input and output, nested data structures). [#4174](https://github.com/yandex/ClickHouse/pull/4174) [#4493](https://github.com/yandex/ClickHouse/pull/4493) ([Vitaly Baranov](https://github.com/vitlibar))
* Added functions for working with bitmaps using the Roaring Bitmaps library. [#4207](https://github.com/yandex/ClickHouse/pull/4207) ([Andy Yang](https://github.com/andyyzh)) [#4568](https://github.com/yandex/ClickHouse/pull/4568) ([Vitaly Baranov](https://github.com/vitlibar))
* Support for the `Parquet` format. [#4448](https://github.com/yandex/ClickHouse/pull/4448) ([proller](https://github.com/proller))
* Computing the distance between strings by counting N-grams, for approximate string comparison. The algorithm is similar to q-gram metrics in the R language. [#4466](https://github.com/yandex/ClickHouse/pull/4466) ([Danila Kutenin](https://github.com/danlark1))
* The GraphiteMergeTree table engine supports separate patterns for aggregation rules and for retention rules. [#4426](https://github.com/yandex/ClickHouse/pull/4426) ([Mikhail f. Shiryaev](https://github.com/Felixoid))
* Added the `max_execution_speed` and `max_execution_speed_bytes` settings to limit resource usage by queries. Added the `min_execution_speed_bytes` setting to complement `min_execution_speed`. [#4430](https://github.com/yandex/ClickHouse/pull/4430) ([Winter Zhang](https://github.com/zhang2014))
* Added the `flatten` function, which converts multidimensional arrays into a flat array. [#4555](https://github.com/yandex/ClickHouse/pull/4555) [#4409](https://github.com/yandex/ClickHouse/pull/4409) ([alexey-milovidov](https://github.com/alexey-milovidov), [kzon](https://github.com/kzon))
* Added the `arrayEnumerateDenseRanked` and `arrayEnumerateUniqRanked` functions (similar to `arrayEnumerateUniq`, but they allow specifying the depth to look into multidimensional arrays). [#4475](https://github.com/yandex/ClickHouse/pull/4475) ([proller](https://github.com/proller)) [#4601](https://github.com/yandex/ClickHouse/pull/4601) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Added support for multiple JOINs in a single query without subqueries, with some restrictions: no asterisks and no aliases of complex expressions in ON/WHERE/GROUP BY/... [#4462](https://github.com/yandex/ClickHouse/pull/4462) ([Artem Zuikov](https://github.com/4ertus2))
### Bug Fixes
* This release also contains all bug fixes from 19.3 and 19.1.
* Fixed a bug in secondary indices (experimental feature): the order of granules on INSERT was incorrect. [#4407](https://github.com/yandex/ClickHouse/pull/4407) ([Nikita Vasilev](https://github.com/nikvas0))
* Fixed the `set` secondary index (experimental feature) for `Nullable` and `LowCardinality` columns. Previously, using them caused the error `Data type must be deserialized with multiple streams` on SELECT. [#4594](https://github.com/yandex/ClickHouse/pull/4594) ([Nikolai Kochetov](https://github.com/KochetovNicolai))
* The last update time is now recorded correctly on a full reload of `executable` dictionaries. [#4551](https://github.com/yandex/ClickHouse/pull/4551) ([Tema Novikov](https://github.com/temoon))
* Fixed the broken progress bar, a regression that appeared in 19.3. [#4627](https://github.com/yandex/ClickHouse/pull/4627) ([filimonov](https://github.com/filimonov))
* Fixed incorrect MemoryTracker values when a memory region was shrunk, in very rare cases. [#4619](https://github.com/yandex/ClickHouse/pull/4619) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fixed undefined behaviour in ThreadPool. [#4612](https://github.com/yandex/ClickHouse/pull/4612) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fixed a very rare crash with the message `mutex lock failed: Invalid argument` that could happen if a MergeTree table was dropped concurrently with a SELECT. [#4608](https://github.com/yandex/ClickHouse/pull/4608) ([Alex Zatelepin](https://github.com/ztlpn))
* ODBC driver compatibility with the `LowCardinality` data type. [#4381](https://github.com/yandex/ClickHouse/pull/4381) ([proller](https://github.com/proller))
* Fixed the error `AIOcontextPool: Found io_event with unknown id 0` on FreeBSD. [#4438](https://github.com/yandex/ClickHouse/pull/4438) ([urgordeadbeef](https://github.com/urgordeadbeef))
* The `system.part_log` table was created regardless of whether it was declared in the configuration. [#4483](https://github.com/yandex/ClickHouse/pull/4483) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fixed undefined behaviour in the `dictIsIn` function for `cache` dictionaries. [#4515](https://github.com/yandex/ClickHouse/pull/4515) ([alesapin](https://github.com/alesapin))
* Fixed a deadlock when a SELECT query locks the same table multiple times (for example, from different threads or when executing different subqueries) while a DDL query is running concurrently. [#4535](https://github.com/yandex/ClickHouse/pull/4535) ([Alex Zatelepin](https://github.com/ztlpn))
* The `compile_expressions` setting is disabled by default until we pin the sources of the `LLVM` library in use and test it under `ASan` (currently the LLVM library is taken from the system). [#4579](https://github.com/yandex/ClickHouse/pull/4579) ([alesapin](https://github.com/alesapin))
* Fixed a crash via `std::terminate` when `invalidate_query` for external dictionaries with a `clickhouse` source returned an incorrect result (empty; more than one row; more than one column). Fixed a bug where `invalidate_query` was executed every five seconds regardless of the specified `lifetime`. [#4583](https://github.com/yandex/ClickHouse/pull/4583) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fixed a deadlock when the `invalidate_query` for an external dictionary with a `clickhouse` source used the `system.dictionaries` table or a database of type `Dictionary` (rare case). [#4599](https://github.com/yandex/ClickHouse/pull/4599) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fixed CROSS JOIN with an empty WHERE. [#4598](https://github.com/yandex/ClickHouse/pull/4598) ([Artem Zuikov](https://github.com/4ertus2))
* Fixed a segfault in the `replicate` function with a constant argument. [#4603](https://github.com/yandex/ClickHouse/pull/4603) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fixed predicate pushdown (the `enable_optimize_predicate_expression` setting) with lambda functions. [#4408](https://github.com/yandex/ClickHouse/pull/4408) ([Winter Zhang](https://github.com/zhang2014))
* Multiple fixes for multiple JOINs in a single query. [#4595](https://github.com/yandex/ClickHouse/pull/4595) ([Artem Zuikov](https://github.com/4ertus2))
### Improvements
* Support for aliases in the JOIN ON section for the right table. [#4412](https://github.com/yandex/ClickHouse/pull/4412) ([Artem Zuikov](https://github.com/4ertus2))
* Correct aliases are used in the case of multiple JOINs with subqueries. [#4474](https://github.com/yandex/ClickHouse/pull/4474) ([Artem Zuikov](https://github.com/4ertus2))
* Fixed the predicate pushdown logic (the `enable_optimize_predicate_expression` setting) for JOIN. [#4387](https://github.com/yandex/ClickHouse/pull/4387) ([Ivan](https://github.com/abyss7))
### Performance Improvements
* Improved the heuristics of the "move to PREWHERE" optimization. [#4405](https://github.com/yandex/ClickHouse/pull/4405) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Real lookup tables are used instead of hash tables for 8-bit and 16-bit keys. The hash table interface was generalized to support this case. [#4536](https://github.com/yandex/ClickHouse/pull/4536) ([Amos Bird](https://github.com/amosbird))
* Improved the performance of string comparison. [#4564](https://github.com/yandex/ClickHouse/pull/4564) ([alexey-milovidov](https://github.com/alexey-milovidov))
* The DDL operation queue (for ON CLUSTER queries) is cleaned up in a separate thread so that it does not slow down the main work. [#4502](https://github.com/yandex/ClickHouse/pull/4502) ([Alex Zatelepin](https://github.com/ztlpn))
* Even when the `min_bytes_to_use_direct_io` setting was set to 1, not every file was opened in O_DIRECT mode, because file sizes were sometimes underestimated by the size of one compressed block. [#4526](https://github.com/yandex/ClickHouse/pull/4526) ([alexey-milovidov](https://github.com/alexey-milovidov))
### Build/Testing/Packaging Improvements
* Added support for the clang-9 compiler. [#4604](https://github.com/yandex/ClickHouse/pull/4604) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fixed incorrect `__asm__` instructions. [#4621](https://github.com/yandex/ClickHouse/pull/4621) ([Konstantin Podshumok](https://github.com/podshumok))
* Added the ability to specify query execution settings for `clickhouse-performance-test` from the command line. [#4437](https://github.com/yandex/ClickHouse/pull/4437) ([alesapin](https://github.com/alesapin))
* Dictionary tests were moved to the integration tests. [#4477](https://github.com/yandex/ClickHouse/pull/4477) ([alesapin](https://github.com/alesapin))
* Queries from the "benchmark" section of the official website were added to the automated performance tests. [#4496](https://github.com/yandex/ClickHouse/pull/4496) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Build fixes for the case of using external lz4 and xxhash libraries. [#4495](https://github.com/yandex/ClickHouse/pull/4495) ([Orivej Desh](https://github.com/orivej))
* Fixed undefined behaviour when the `quantileTiming` function was called with a negative or non-integer argument (found by a fuzz test under the undefined behaviour sanitizer). [#4506](https://github.com/yandex/ClickHouse/pull/4506) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fixed typos in the code. [#4531](https://github.com/yandex/ClickHouse/pull/4531) ([sdk2](https://github.com/sdk2))
* Fixed the build on Mac. [#4371](https://github.com/yandex/ClickHouse/pull/4371) ([Vitaly Baranov](https://github.com/vitlibar))
* Fixed the build on FreeBSD and for some unusual build configurations. [#4444](https://github.com/yandex/ClickHouse/pull/4444) ([proller](https://github.com/proller))
## ClickHouse release 19.3.7, 2019-03-12
### Bug Fixes
* Fixed a bug in #3920. The bug manifested as random cache corruption (messages `Unknown codec family code`, `Cannot seek through file`) and segfaults. The bug first appeared in 19.1 and is present in all versions up to 19.1.10 and 19.3.6. [#4623](https://github.com/yandex/ClickHouse/pull/4623) ([alexey-milovidov](https://github.com/alexey-milovidov))
## ClickHouse release 19.3.6, 2019-03-02
### Bug Fixes
* If the thread pool had more than 1000 threads, `std::terminate` was called on thread exit. [Azat Khuzhin](https://github.com/azat) [#4485](https://github.com/yandex/ClickHouse/pull/4485) [#4505](https://github.com/yandex/ClickHouse/pull/4505) ([alexey-milovidov](https://github.com/alexey-milovidov))
* It is now possible to create `ReplicatedMergeTree*` tables with column comments without specifying DEFAULT, and also with CODEC but without COMMENT and DEFAULT. Fixed the comparison of CODECs with each other. [#4523](https://github.com/yandex/ClickHouse/pull/4523) ([alesapin](https://github.com/alesapin))
* Fixed a crash on JOIN by arrays and tuples. [#4552](https://github.com/yandex/ClickHouse/pull/4552) ([Artem Zuikov](https://github.com/4ertus2))
* Fixed a crash of `clickhouse-copier` with the message `ThreadStatus not created`. [#4540](https://github.com/yandex/ClickHouse/pull/4540) ([Artem Zuikov](https://github.com/4ertus2))
* Fixed a server hang on shutdown when distributed DDL was used. [#4472](https://github.com/yandex/ClickHouse/pull/4472) ([Alex Zatelepin](https://github.com/ztlpn))
* Error messages for text format parsing reported incorrect column numbers when the number was greater than 10. [#4484](https://github.com/yandex/ClickHouse/pull/4484) ([alexey-milovidov](https://github.com/alexey-milovidov))
### Build/Testing/Packaging Improvements
* Fixed the build with AVX enabled. [#4527](https://github.com/yandex/ClickHouse/pull/4527) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fixed support for extended query execution metrics when ClickHouse was built on a system with a newer Linux kernel but runs on a system with a significantly older kernel. [#4541](https://github.com/yandex/ClickHouse/pull/4541) ([nvartolomei](https://github.com/nvartolomei))
* The server now continues running, with a warning, when the `core_dump.size_limit` setting cannot be applied. [#4473](https://github.com/yandex/ClickHouse/pull/4473) ([proller](https://github.com/proller))
* Removed `inline` from `void readBinary(...)` in `Field.cpp`. [#4530](https://github.com/yandex/ClickHouse/pull/4530) ([hcz](https://github.com/hczhcz))
## ClickHouse release 19.3.5, 2019-02-21
### Bug Fixes
@ -74,7 +160,7 @@
* Fixed a bug where querying the `system.tables` table could raise a `table doesn't exist` exception. [#4313](https://github.com/yandex/ClickHouse/pull/4313) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fixed a bug that caused `clickhouse-client` to crash in interactive mode if the user exited while command line suggestions were loading. [#4317](https://github.com/yandex/ClickHouse/pull/4317) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fixed a bug that led to incorrect results when executing mutations containing the `IN` operator. [#4099](https://github.com/yandex/ClickHouse/pull/4099) ([Alex Zatelepin](https://github.com/ztlpn))
* Fixed a bug where, if a database with the `Dictionary` engine had been created, all dictionaries were loaded at server startup, and dictionaries with a local ClickHouse source could not load. [#4255](https://github.com/yandex/ClickHouse/pull/4255) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fixed the repeated creation of system log tables (`system.query_log`, `system.part_log`) at server shutdown. [#4254](https://github.com/yandex/ClickHouse/pull/4254) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fixed return type deduction and lock handling in the `joinGet` function. [#4153](https://github.com/yandex/ClickHouse/pull/4153) ([Amos Bird](https://github.com/amosbird))
* Fixed a server crash when using the `allow_experimental_multiple_joins_emulation` setting. [52de2c](https://github.com/yandex/ClickHouse/commit/52de2cd927f7b5257dd67e175f0a5560a48840d0) ([Artem Zuikov](https://github.com/4ertus2))
@ -98,7 +184,7 @@
* Added a tool that builds the changelog from pull request descriptions. [#4169](https://github.com/yandex/ClickHouse/pull/4169) [#4173](https://github.com/yandex/ClickHouse/pull/4173) ([KochetovNicolai](https://github.com/KochetovNicolai))
* Added a puppet module for ClickHouse. [#4182](https://github.com/yandex/ClickHouse/pull/4182) ([Maxim Fedotov](https://github.com/MaxFedotov))
* Added documentation for several undocumented functions. [#4168](https://github.com/yandex/ClickHouse/pull/4168) ([Winter Zhang](https://github.com/zhang2014))
* ARM build fixes. [#4210](https://github.com/yandex/ClickHouse/pull/4210) [#4306](https://github.com/yandex/ClickHouse/pull/4306) [#4291](https://github.com/yandex/ClickHouse/pull/4291) ([proller](https://github.com/proller))
* Added the ability to run dictionary tests from `ctest`. [#4189](https://github.com/yandex/ClickHouse/pull/4189) ([proller](https://github.com/proller))
* `/etc/ssl` is now the default directory for SSL certificates. [#4167](https://github.com/yandex/ClickHouse/pull/4167) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Added a check for the availability of SSE and AVX instructions at startup. [#4234](https://github.com/yandex/ClickHouse/pull/4234) ([Igr](https://github.com/igron99))
@ -133,6 +219,18 @@
* Reduced the waiting time for server shutdown and for `ALTER` queries to complete. [#4372](https://github.com/yandex/ClickHouse/pull/4372) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Added information about the `replicated_can_become_leader` setting to the `system.replicas` table. Added logging for the case when a replica will not try to become leader. [#4379](https://github.com/yandex/ClickHouse/pull/4379) ([Alex Zatelepin](https://github.com/ztlpn))
## ClickHouse release 19.1.14, 2019-03-14
* Fixed the error `Column ... queried more than once`, which could occur when the `asterisk_left_columns_only` setting was enabled and `GLOBAL JOIN` was used together with `SELECT *` (rare case). The issue does not exist in 19.3 and newer. [6bac7d8d](https://github.com/yandex/ClickHouse/pull/4692/commits/6bac7d8d11a9b0d6de0b32b53c47eb2f6f8e7062) ([Artem Zuikov](https://github.com/4ertus2))
## ClickHouse release 19.1.13, 2019-03-12
This release contains the same bug fixes as 19.3.7.
## ClickHouse release 19.1.10, 2019-03-03
This release contains the same bug fixes as 19.3.6.
## ClickHouse release 19.1.9, 2019-02-21
### Bug Fixes
@ -152,7 +250,7 @@
* Fixed return type deduction and lock handling in the `joinGet` function. [#4153](https://github.com/yandex/ClickHouse/pull/4153) ([Amos Bird](https://github.com/amosbird))
* Fixed the repeated creation of system log tables (`system.query_log`, `system.part_log`) at server shutdown. [#4254](https://github.com/yandex/ClickHouse/pull/4254) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fixed a bug where, if a database with the `Dictionary` engine had been created, all dictionaries were loaded at server startup, and dictionaries with a local ClickHouse source could not load. [#4255](https://github.com/yandex/ClickHouse/pull/4255) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fixed a bug that led to incorrect results when executing mutations containing the `IN` operator. [#4099](https://github.com/yandex/ClickHouse/pull/4099) ([Alex Zatelepin](https://github.com/ztlpn))
* Fixed a bug that caused `clickhouse-client` to crash in interactive mode if the user exited while command line suggestions were loading. [#4317](https://github.com/yandex/ClickHouse/pull/4317) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fixed a bug where querying the `system.tables` table could raise a `table doesn't exist` exception. [#4313](https://github.com/yandex/ClickHouse/pull/4313) ([alexey-milovidov](https://github.com/alexey-milovidov))


@ -1,7 +1,11 @@
project (ClickHouse)
cmake_minimum_required (VERSION 3.3)
project(ClickHouse)
cmake_minimum_required(VERSION 3.3)
cmake_policy(SET CMP0023 NEW)
set(CMAKE_MODULE_PATH ${CMAKE_MODULE_PATH} "${CMAKE_CURRENT_SOURCE_DIR}/cmake/Modules/")
set(CMAKE_EXPORT_COMPILE_COMMANDS 1) # Write compile_commands.json
set(CMAKE_LINK_DEPENDS_NO_SHARED 1) # Do not relink all depended targets on .so
set(CMAKE_CONFIGURATION_TYPES "RelWithDebInfo;Debug;Release;MinSizeRel" CACHE STRING "" FORCE)
set(CMAKE_DEBUG_POSTFIX "d" CACHE STRING "Generate debug library name with a postfix.") # To be consistent with CMakeLists from contrib libs.
option(ENABLE_IPO "Enable inter-procedural optimization (aka LTO)" OFF) # need cmake 3.9+
if(ENABLE_IPO)
@ -37,9 +41,6 @@ if (EXISTS "${CMAKE_CURRENT_SOURCE_DIR}/.git" AND NOT EXISTS "${ClickHouse_SOURC
message (FATAL_ERROR "Submodules are not initialized. Run\n\tgit submodule update --init --recursive")
endif ()
# Write compile_commands.json
set(CMAKE_EXPORT_COMPILE_COMMANDS 1)
include (cmake/find_ccache.cmake)
if (NOT CMAKE_BUILD_TYPE OR CMAKE_BUILD_TYPE STREQUAL "None")
@ -49,8 +50,6 @@ endif ()
string(TOUPPER ${CMAKE_BUILD_TYPE} CMAKE_BUILD_TYPE_UC)
message (STATUS "CMAKE_BUILD_TYPE: ${CMAKE_BUILD_TYPE}")
set (CMAKE_CONFIGURATION_TYPES "RelWithDebInfo;Debug;Release;MinSizeRel" CACHE STRING "" FORCE)
set (CMAKE_DEBUG_POSTFIX "d" CACHE STRING "Generate debug library name with a postfix.") # To be consistent with CMakeLists from contrib libs.
option (USE_STATIC_LIBRARIES "Set to FALSE to use shared libraries" ON)
option (MAKE_STATIC_LIBRARIES "Set to FALSE to make shared libraries" ${USE_STATIC_LIBRARIES})
@ -141,7 +140,7 @@ if(NOT COMPILER_CLANG) # clang: error: the clang compiler does not support '-mar
endif()
if (ARCH_NATIVE)
set (COMPILER_FLAGS "${COMPILER_FLAGS} -march=native")
endif ()
# Special options for better optimized code with clang
@ -179,14 +178,20 @@ include (cmake/use_libcxx.cmake)
# This is intended for more control of what we are linking.
set (DEFAULT_LIBS "")
if (OS_LINUX AND NOT UNBUNDLED)
# Note: this probably has no effict, but I'm not an expert in CMake.
if (OS_LINUX AND NOT UNBUNDLED AND (GLIBC_COMPATIBILITY OR USE_LIBCXX))
# Note: this probably has no effect, but I'm not an expert in CMake.
set (CMAKE_C_IMPLICIT_LINK_LIBRARIES "")
set (CMAKE_CXX_IMPLICIT_LINK_LIBRARIES "")
# Disable default linked libraries.
set (DEFAULT_LIBS "-nodefaultlibs")
# We need builtins from Clang's RT even without libcxx - for ubsan+int128. See https://bugs.llvm.org/show_bug.cgi?id=16404
set (BUILTINS_LIB_PATH "")
if (COMPILER_CLANG)
execute_process (COMMAND ${CMAKE_CXX_COMPILER} --print-file-name=libclang_rt.builtins-${CMAKE_SYSTEM_PROCESSOR}.a OUTPUT_VARIABLE BUILTINS_LIB_PATH OUTPUT_STRIP_TRAILING_WHITESPACE)
endif ()
# Add C++ libraries.
#
# This consists of:
@ -197,14 +202,9 @@ if (OS_LINUX AND NOT UNBUNDLED)
#
# There are two variants of C++ library: libc++ (from LLVM compiler infrastructure) and libstdc++ (from GCC).
if (USE_LIBCXX)
set (BUILTINS_LIB_PATH "")
if (COMPILER_CLANG)
execute_process (COMMAND ${CMAKE_CXX_COMPILER} --print-file-name=libclang_rt.builtins-${CMAKE_SYSTEM_PROCESSOR}.a OUTPUT_VARIABLE BUILTINS_LIB_PATH OUTPUT_STRIP_TRAILING_WHITESPACE)
endif ()
set (DEFAULT_LIBS "${DEFAULT_LIBS} -Wl,-Bstatic -lc++ -lc++abi -lgcc_eh ${BUILTINS_LIB_PATH} -Wl,-Bdynamic")
else ()
set (DEFAULT_LIBS "${DEFAULT_LIBS} -Wl,-Bstatic -lstdc++ -lgcc_eh -lgcc -Wl,-Bdynamic")
set (DEFAULT_LIBS "${DEFAULT_LIBS} -Wl,-Bstatic -lstdc++ -lgcc_eh -lgcc ${BUILTINS_LIB_PATH} -Wl,-Bdynamic")
endif ()
# Linking with GLIBC prevents portability of binaries to older systems.
@ -216,6 +216,7 @@ if (OS_LINUX AND NOT UNBUNDLED)
string (TOUPPER ${CMAKE_BUILD_TYPE} CMAKE_BUILD_TYPE_UC)
set (CMAKE_POSTFIX_VARIABLE "CMAKE_${CMAKE_BUILD_TYPE_UC}_POSTFIX")
# FIXME: glibc-compatibility may be non-static in some builds!
set (DEFAULT_LIBS "${DEFAULT_LIBS} libs/libglibc-compatibility/libglibc-compatibility${${CMAKE_POSTFIX_VARIABLE}}.a")
endif ()
@ -227,6 +228,11 @@ if (OS_LINUX AND NOT UNBUNDLED)
message(STATUS "Default libraries: ${DEFAULT_LIBS}")
endif ()
if (DEFAULT_LIBS)
# Add default libs to all targets as the last dependency.
set(CMAKE_CXX_STANDARD_LIBRARIES ${DEFAULT_LIBS})
endif ()
if (NOT MAKE_STATIC_LIBRARIES)
set(CMAKE_POSITION_INDEPENDENT_CODE ON)
@ -310,6 +316,8 @@ include (cmake/find_pdqsort.cmake)
include (cmake/find_hdfs3.cmake) # uses protobuf
include (cmake/find_consistent-hashing.cmake)
include (cmake/find_base64.cmake)
include (cmake/find_hyperscan.cmake)
include (cmake/find_lfalloc.cmake)
find_contrib_lib(cityhash)
find_contrib_lib(farmhash)
find_contrib_lib(metrohash)
@ -336,35 +344,29 @@ add_subdirectory (dbms)
include (cmake/print_include_directories.cmake)
if (DEFAULT_LIBS)
# Add default libs to all targets as the last dependency.
# I have found no better way to specify default libs in CMake that will appear single time in specific order at the end of linker arguments.
function(add_default_libs target_name)
if (GLIBC_COMPATIBILITY)
# FIXME: actually glibc-compatibility should always be built first,
# because it's unconditionally linked via $DEFAULT_LIBS,
# and these look like the first places that get linked.
function (add_glibc_compat target_name)
if (TARGET ${target_name})
# message(STATUS "Has target ${target_name}")
set_property(TARGET ${target_name} APPEND PROPERTY LINK_LIBRARIES "${DEFAULT_LIBS}")
set_property(TARGET ${target_name} APPEND PROPERTY INTERFACE_LINK_LIBRARIES "${DEFAULT_LIBS}")
if (GLIBC_COMPATIBILITY)
add_dependencies(${target_name} glibc-compatibility)
endif ()
add_dependencies(${target_name} glibc-compatibility)
endif ()
endfunction ()
add_default_libs(ltdl)
add_default_libs(zlibstatic)
add_default_libs(jemalloc)
add_default_libs(unwind)
add_default_libs(memcpy)
add_default_libs(Foundation)
add_default_libs(common)
add_default_libs(gtest)
add_default_libs(lz4)
add_default_libs(zstd)
add_default_libs(snappy)
add_default_libs(arrow)
add_default_libs(protoc)
add_default_libs(thrift_static)
add_default_libs(boost_regex_internal)
add_glibc_compat(ltdl)
add_glibc_compat(zlibstatic)
add_glibc_compat(jemalloc)
add_glibc_compat(unwind)
add_glibc_compat(memcpy)
add_glibc_compat(Foundation)
add_glibc_compat(common)
add_glibc_compat(gtest)
add_glibc_compat(lz4)
add_glibc_compat(zstd)
add_glibc_compat(snappy)
add_glibc_compat(arrow)
add_glibc_compat(protoc)
add_glibc_compat(thrift_static)
add_glibc_compat(boost_regex_internal)
endif ()


@ -12,5 +12,8 @@ ClickHouse is an open-source column-oriented database management system that all
* You can also [fill this form](https://forms.yandex.com/surveys/meet-yandex-clickhouse-team/) to meet Yandex ClickHouse team in person.
## Upcoming Events
* [ClickHouse Community Meetup](https://www.eventbrite.com/e/clickhouse-meetup-in-madrid-registration-55376746339) in Madrid on April 2.
* [ClickHouse Community Meetup in Limassol](https://www.facebook.com/events/386638262181785/) on May 7.
* ClickHouse at [Percona Live 2019](https://www.percona.com/live/19/other-open-source-databases-track) in Austin on May 28-30.
* [ClickHouse Community Meetup in Beijing](https://www.huodongxing.com/event/2483759276200) on June 8.
* [ClickHouse Community Meetup in Shenzhen](https://www.huodongxing.com/event/3483759917300) on October 20.
* [ClickHouse Community Meetup in Shanghai](https://www.huodongxing.com/event/4483760336000) on October 27.


@ -21,7 +21,7 @@ BUILD_TARGETS=clickhouse
BUILD_TYPE=Debug
ENABLE_EMBEDDED_COMPILER=0
CMAKE_FLAGS="-D CMAKE_C_FLAGS_ADD=-g0 -D CMAKE_CXX_FLAGS_ADD=-g0 -D ENABLE_JEMALLOC=0 -D ENABLE_CAPNP=0 -D ENABLE_RDKAFKA=0 -D ENABLE_UNWIND=0 -D ENABLE_ICU=0 -D ENABLE_POCO_MONGODB=0 -D ENABLE_POCO_NETSSL=0 -D ENABLE_POCO_ODBC=0 -D ENABLE_ODBC=0 -D ENABLE_MYSQL=0"
CMAKE_FLAGS="-D CMAKE_C_FLAGS_ADD=-g0 -D CMAKE_CXX_FLAGS_ADD=-g0 -D ENABLE_JEMALLOC=0 -D ENABLE_CAPNP=0 -D ENABLE_RDKAFKA=0 -D ENABLE_UNWIND=0 -D ENABLE_ICU=0 -D ENABLE_POCO_MONGODB=0 -D ENABLE_POCO_NETSSL=0 -D ENABLE_POCO_ODBC=0 -D ENABLE_ODBC=0 -D ENABLE_MYSQL=0 -D ENABLE_SSL=0 -D ENABLE_POCO_NETSSL=0"
[[ $(uname) == "FreeBSD" ]] && COMPILER_PACKAGE_VERSION=devel && export COMPILER_PATH=/usr/local/bin


@ -1,9 +1,8 @@
if (OS_FREEBSD)
find_library (EXECINFO_LIBRARY execinfo)
find_library (ELF_LIBRARY elf)
message (STATUS "Using execinfo: ${EXECINFO_LIBRARY}")
message (STATUS "Using elf: ${ELF_LIBRARY}")
set (EXECINFO_LIBRARIES ${EXECINFO_LIBRARY} ${ELF_LIBRARY})
message (STATUS "Using execinfo: ${EXECINFO_LIBRARIES}")
else ()
set (EXECINFO_LIBRARY "")
set (ELF_LIBRARY "")
set (EXECINFO_LIBRARIES "")
endif ()


@ -0,0 +1,33 @@
if (HAVE_SSSE3)
option (ENABLE_HYPERSCAN "Enable hyperscan" ON)
endif ()
if (ENABLE_HYPERSCAN)
option (USE_INTERNAL_HYPERSCAN_LIBRARY "Set to FALSE to use system hyperscan instead of the bundled" ${NOT_UNBUNDLED})
if (NOT EXISTS "${ClickHouse_SOURCE_DIR}/contrib/hyperscan/CMakeLists.txt")
if (USE_INTERNAL_HYPERSCAN_LIBRARY)
message (WARNING "submodule contrib/hyperscan is missing. to fix try run: \n git submodule update --init --recursive")
endif ()
set (MISSING_INTERNAL_HYPERSCAN_LIBRARY 1)
set (USE_INTERNAL_HYPERSCAN_LIBRARY 0)
endif ()
if (NOT USE_INTERNAL_HYPERSCAN_LIBRARY)
find_library (HYPERSCAN_LIBRARY hs)
find_path (HYPERSCAN_INCLUDE_DIR NAMES hs/hs.h hs.h PATHS ${HYPERSCAN_INCLUDE_PATHS})
endif ()
if (HYPERSCAN_LIBRARY AND HYPERSCAN_INCLUDE_DIR)
set (USE_HYPERSCAN 1)
elseif (NOT MISSING_INTERNAL_HYPERSCAN_LIBRARY)
set (HYPERSCAN_INCLUDE_DIR ${ClickHouse_SOURCE_DIR}/contrib/hyperscan/src)
set (HYPERSCAN_LIBRARY hs)
set (USE_HYPERSCAN 1)
set (USE_INTERNAL_HYPERSCAN_LIBRARY 1)
endif()
message (STATUS "Using hyperscan=${USE_HYPERSCAN}: ${HYPERSCAN_INCLUDE_DIR} : ${HYPERSCAN_LIBRARY}")
endif ()

cmake/find_lfalloc.cmake Normal file

@ -0,0 +1,10 @@
if (NOT SANITIZE AND NOT ARCH_ARM AND NOT ARCH_32 AND NOT ARCH_PPC64LE AND NOT OS_FREEBSD)
option (ENABLE_LFALLOC "Set to FALSE to disable the bundled lfalloc allocator" ${NOT_UNBUNDLED})
endif ()
if (ENABLE_LFALLOC)
set (USE_LFALLOC 1)
set (USE_LFALLOC_RANDOM_HINT 1)
set (LFALLOC_INCLUDE_DIR ${ClickHouse_SOURCE_DIR}/contrib/lfalloc/src)
message (STATUS "Using lfalloc=${USE_LFALLOC}: ${LFALLOC_INCLUDE_DIR}")
endif ()


@ -36,6 +36,8 @@ elseif (NOT MISSING_INTERNAL_POCO_LIBRARY)
set (ENABLE_DATA_SQLITE 0 CACHE BOOL "")
set (ENABLE_DATA_MYSQL 0 CACHE BOOL "")
set (ENABLE_DATA_POSTGRESQL 0 CACHE BOOL "")
set (ENABLE_ENCODINGS 0 CACHE BOOL "")
# new after 2.0.0:
set (POCO_ENABLE_ZIP 0 CACHE BOOL "")
set (POCO_ENABLE_PAGECOMPILER 0 CACHE BOOL "")


@ -1,5 +1,5 @@
# Freebsd: contrib/cppkafka/include/cppkafka/detail/endianness.h:53:23: error: 'betoh16' was not declared in this scope
if (NOT ARCH_ARM AND NOT ARCH_32 AND NOT APPLE AND NOT OS_FREEBSD)
if (NOT ARCH_ARM AND NOT ARCH_32 AND NOT APPLE AND NOT OS_FREEBSD AND OPENSSL_FOUND)
option (ENABLE_RDKAFKA "Enable kafka" ON)
endif ()


@ -1,7 +1,19 @@
option (ENABLE_SSL "Enable ssl" ON)
if (ENABLE_SSL)
if(NOT ARCH_32)
option(USE_INTERNAL_SSL_LIBRARY "Set to FALSE to use system *ssl library instead of bundled" ${NOT_UNBUNDLED})
endif()
if(NOT EXISTS "${ClickHouse_SOURCE_DIR}/contrib/ssl/CMakeLists.txt")
if(USE_INTERNAL_SSL_LIBRARY)
message(WARNING "submodule contrib/ssl is missing. to fix try run: \n git submodule update --init --recursive")
endif()
set(USE_INTERNAL_SSL_LIBRARY 0)
set(MISSING_INTERNAL_SSL_LIBRARY 1)
endif()
set (OPENSSL_USE_STATIC_LIBS ${USE_STATIC_LIBRARIES})
if (NOT USE_INTERNAL_SSL_LIBRARY)
@ -28,7 +40,7 @@ if (NOT USE_INTERNAL_SSL_LIBRARY)
endif ()
endif ()
if (NOT OPENSSL_FOUND)
if (NOT OPENSSL_FOUND AND NOT MISSING_INTERNAL_SSL_LIBRARY)
set (USE_INTERNAL_SSL_LIBRARY 1)
set (OPENSSL_ROOT_DIR "${ClickHouse_SOURCE_DIR}/contrib/ssl")
set (OPENSSL_INCLUDE_DIR "${OPENSSL_ROOT_DIR}/include")
@ -43,4 +55,11 @@ if (NOT OPENSSL_FOUND)
set (OPENSSL_FOUND 1)
endif ()
message (STATUS "Using ssl=${OPENSSL_FOUND}: ${OPENSSL_INCLUDE_DIR} : ${OPENSSL_LIBRARIES}")
if(OPENSSL_FOUND)
# we need keep OPENSSL_FOUND for many libs in contrib
set(USE_SSL 1)
endif()
endif ()
message (STATUS "Using ssl=${USE_SSL}: ${OPENSSL_INCLUDE_DIR} : ${OPENSSL_LIBRARIES}")


@ -4,7 +4,7 @@ if (CMAKE_CXX_COMPILER_ID STREQUAL "GNU")
set (CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -Wno-unused-function -Wno-unused-variable -Wno-unused-but-set-variable -Wno-unused-result -Wno-deprecated-declarations -Wno-maybe-uninitialized -Wno-format -Wno-misleading-indentation -Wno-stringop-overflow -Wno-implicit-function-declaration -Wno-return-type -Wno-array-bounds -Wno-bool-compare -Wno-int-conversion -Wno-switch")
set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wno-old-style-cast -Wno-unused-function -Wno-unused-variable -Wno-unused-but-set-variable -Wno-unused-result -Wno-deprecated-declarations -Wno-non-virtual-dtor -Wno-maybe-uninitialized -Wno-format -Wno-misleading-indentation -Wno-implicit-fallthrough -Wno-class-memaccess -Wno-sign-compare -std=c++1z")
elseif (CMAKE_CXX_COMPILER_ID STREQUAL "Clang")
set (CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -Wno-unused-function -Wno-unused-variable -Wno-unused-result -Wno-deprecated-declarations -Wno-format -Wno-parentheses-equality -Wno-tautological-constant-compare -Wno-tautological-constant-out-of-range-compare -Wno-implicit-function-declaration -Wno-return-type -Wno-pointer-bool-conversion -Wno-enum-conversion -Wno-int-conversion -Wno-switch")
set (CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -Wno-unused-function -Wno-unused-variable -Wno-unused-result -Wno-deprecated-declarations -Wno-format -Wno-parentheses-equality -Wno-tautological-constant-compare -Wno-tautological-constant-out-of-range-compare -Wno-implicit-function-declaration -Wno-return-type -Wno-pointer-bool-conversion -Wno-enum-conversion -Wno-int-conversion -Wno-switch -Wno-string-plus-int")
set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wno-old-style-cast -Wno-unused-function -Wno-unused-variable -Wno-unused-result -Wno-deprecated-declarations -Wno-non-virtual-dtor -Wno-format -Wno-inconsistent-missing-override -std=c++1z")
endif ()
@ -125,13 +125,17 @@ endif ()
if (ENABLE_MYSQL AND USE_INTERNAL_MYSQL_LIBRARY)
add_subdirectory (mariadb-connector-c-cmake)
target_include_directories(mysqlclient BEFORE PRIVATE ${ZLIB_INCLUDE_DIR})
target_include_directories(mysqlclient BEFORE PRIVATE ${OPENSSL_INCLUDE_DIR})
if(OPENSSL_INCLUDE_DIR)
target_include_directories(mysqlclient BEFORE PRIVATE ${OPENSSL_INCLUDE_DIR})
endif()
endif ()
if (USE_INTERNAL_RDKAFKA_LIBRARY)
add_subdirectory (librdkafka-cmake)
target_include_directories(rdkafka BEFORE PRIVATE ${ZLIB_INCLUDE_DIR})
target_include_directories(rdkafka BEFORE PRIVATE ${OPENSSL_INCLUDE_DIR})
if(OPENSSL_INCLUDE_DIR)
target_include_directories(rdkafka BEFORE PRIVATE ${OPENSSL_INCLUDE_DIR})
endif()
endif ()
if (USE_RDKAFKA)
@ -280,12 +284,17 @@ endif ()
if (USE_INTERNAL_BROTLI_LIBRARY)
add_subdirectory(brotli-cmake)
target_compile_definitions(brotli PRIVATE BROTLI_BUILD_PORTABLE=1)
endif ()
if (USE_INTERNAL_PROTOBUF_LIBRARY)
set(protobuf_BUILD_TESTS OFF CACHE INTERNAL "" FORCE)
set(protobuf_BUILD_SHARED_LIBS OFF CACHE INTERNAL "" FORCE)
if (MAKE_STATIC_LIBRARIES)
set(protobuf_BUILD_SHARED_LIBS OFF CACHE INTERNAL "" FORCE)
else ()
set(protobuf_BUILD_SHARED_LIBS ON CACHE INTERNAL "" FORCE)
endif ()
set(protobuf_WITH_ZLIB 0 CACHE INTERNAL "" FORCE) # actually will use zlib, but skip find
set(protobuf_BUILD_TESTS OFF CACHE INTERNAL "" FORCE)
add_subdirectory(protobuf/cmake)
endif ()
@ -296,3 +305,7 @@ endif ()
if (USE_BASE64)
add_subdirectory (base64-cmake)
endif()
if (USE_INTERNAL_HYPERSCAN_LIBRARY)
add_subdirectory (hyperscan)
endif()

contrib/boost vendored

@ -1 +1 @@
Subproject commit 6a96e8b59f76148eb8ad54a9d15259f8ce84c606
Subproject commit 471ea208abb92a5cba7d3a08a819bb728f27e95f

contrib/hyperscan vendored Submodule

@ -0,0 +1 @@
Subproject commit 05b0f9064cca4bd55548dedb0a32ed9461146c1e

File diff suppressed because it is too large


@ -0,0 +1,23 @@
#pragma once
#include <string.h>
#include <stdlib.h>
#include "util/system/compiler.h"
namespace NMalloc {
volatile inline bool IsAllocatorCorrupted = false;
static inline void AbortFromCorruptedAllocator() {
IsAllocatorCorrupted = true;
abort();
}
struct TAllocHeader {
void* Block;
size_t AllocSize;
void Y_FORCE_INLINE Encode(void* block, size_t size, size_t signature) {
Block = block;
AllocSize = size | signature;
}
};
}


@ -0,0 +1,33 @@
The style guide for the util folder is a stricter version of the general style guide (mostly in terms of ambiguity resolution).
* all {} must be in K&R style
* `&` and `*` are tied to the type, not to the variable
* always use `using` not `typedef`
* even a single line block must be in braces {}:
```
if (A) {
B();
}
```
* _ at the end of private data member of a class - `First_`, `Second_`
* every .h file must be accompanied by a corresponding .cpp to avoid leakage and to check that it is self-contained
* prohibited to use `printf`-like functions
Things declared in the general style guide, which sometimes are missed:
* `template <`, not `template<`
* `noexcept`, not `throw ()` nor `throw()`, not required for destructors
* indents inside `namespace` same as inside `class`
Requirements for new code (and for corrections to old code that change behaviour) in util:
* presence of UNIT-tests
* presence of comments in Doxygen style
* accessors without Get prefix (`Length()`, but not `GetLength()`)
This guide is not mandatory, since the general style guide still applies.
Nevertheless, if it is not followed, the next `ya style .` run in the util folder will rewrite some lines of code and so undeservedly change their recorded authors.
Thus, before a commit, it is recommended to run `ya style .` in the util folder.
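
As a rough illustration of the rules above, a small header written to this guide might look like the sketch below. The names `TLength`, `TPair`, `CountPositive` and `SelfCheck` are made up for the example and are not part of util.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// `using`, not `typedef`.
using TLength = std::size_t;

/**
 * Holds two values of the same type.
 * (Doxygen-style comment, as the guide asks for new code.)
 */
template <typename T> // `template <`, with a space
class TPair {
public:
    TPair(T first, T second)
        : First_(std::move(first))
        , Second_(std::move(second)) {
    }

    // Accessors have no Get prefix.
    const T& First() const noexcept {
        return First_;
    }

    const T& Second() const noexcept {
        return Second_;
    }

private:
    // Private data members end with an underscore.
    T First_;
    T Second_;
};

/// Counts the strictly positive entries of `values`.
inline TLength CountPositive(const int* values, TLength size) noexcept {
    TLength count = 0;
    for (TLength i = 0; i < size; ++i) {
        if (values[i] > 0) { // single-line block, still braced
            ++count;
        }
    }
    return count;
}

/// Tiny stand-in for the unit test the guide requires.
inline void SelfCheck() {
    const int values[] = {3, -1, 0, 7};
    assert(CountPositive(values, 4) == 2);
}
```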


@ -0,0 +1,51 @@
#pragma once
#include "defaults.h"
using TAtomicBase = intptr_t;
using TAtomic = volatile TAtomicBase;
#if defined(__GNUC__)
#include "atomic_gcc.h"
#elif defined(_MSC_VER)
#include "atomic_win.h"
#else
#error unsupported platform
#endif
#if !defined(ATOMIC_COMPILER_BARRIER)
#define ATOMIC_COMPILER_BARRIER()
#endif
static inline TAtomicBase AtomicSub(TAtomic& a, TAtomicBase v) {
return AtomicAdd(a, -v);
}
static inline TAtomicBase AtomicGetAndSub(TAtomic& a, TAtomicBase v) {
return AtomicGetAndAdd(a, -v);
}
#if defined(USE_GENERIC_SETGET)
static inline TAtomicBase AtomicGet(const TAtomic& a) {
return a;
}
static inline void AtomicSet(TAtomic& a, TAtomicBase v) {
a = v;
}
#endif
static inline bool AtomicTryLock(TAtomic* a) {
return AtomicCas(a, 1, 0);
}
static inline bool AtomicTryAndTryLock(TAtomic* a) {
return (AtomicGet(*a) == 0) && AtomicTryLock(a);
}
static inline void AtomicUnlock(TAtomic* a) {
ATOMIC_COMPILER_BARRIER();
AtomicSet(*a, 0);
}
#include "atomic_ops.h"
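
The three lock helpers at the end of this header amount to a test-and-test-and-set try-lock over a word that is 0 when free and 1 when held. A minimal sketch of the same idea follows, assuming `std::atomic` as a stand-in for the `TAtomic` primitives. The class `TSpinFlag` and the function `SpinFlagDemo` are hypothetical and do not exist in this header.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a TAtomic used as a lock word (0 = free, 1 = held).
class TSpinFlag {
public:
    // Mirrors AtomicTryLock: swap 0 -> 1, report whether we won.
    bool TryLock() noexcept {
        std::intptr_t expected = 0;
        return State_.compare_exchange_strong(expected, 1);
    }

    // Mirrors AtomicTryAndTryLock: a cheap read first, and the
    // compare-and-swap only if the lock looks free.
    bool TryAndTryLock() noexcept {
        return State_.load(std::memory_order_acquire) == 0 && TryLock();
    }

    // Mirrors AtomicUnlock: publish 0 with release semantics.
    void Unlock() noexcept {
        State_.store(0, std::memory_order_release);
    }

private:
    std::atomic<std::intptr_t> State_{0};
};

inline void SpinFlagDemo() {
    TSpinFlag flag;
    const bool first = flag.TryLock();   // lock is free, so this succeeds
    const bool second = flag.TryLock();  // already held, so this fails
    flag.Unlock();
    assert(first && !second);
    (void)first;
    (void)second;
}
```

The read-before-swap variant is usually preferred inside a spin loop, since a failed plain load does not force the cache line into exclusive state the way a failed compare-and-swap does.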


@ -0,0 +1,90 @@
#pragma once
#define ATOMIC_COMPILER_BARRIER() __asm__ __volatile__("" \
: \
: \
: "memory")
static inline TAtomicBase AtomicGet(const TAtomic& a) {
TAtomicBase tmp;
#if defined(_arm64_)
__asm__ __volatile__(
"ldar %x[value], %[ptr] \n\t"
: [value] "=r"(tmp)
: [ptr] "Q"(a)
: "memory");
#else
__atomic_load(&a, &tmp, __ATOMIC_ACQUIRE);
#endif
return tmp;
}
static inline void AtomicSet(TAtomic& a, TAtomicBase v) {
#if defined(_arm64_)
__asm__ __volatile__(
"stlr %x[value], %[ptr] \n\t"
: [ptr] "=Q"(a)
: [value] "r"(v)
: "memory");
#else
__atomic_store(&a, &v, __ATOMIC_RELEASE);
#endif
}
static inline intptr_t AtomicIncrement(TAtomic& p) {
return __atomic_add_fetch(&p, 1, __ATOMIC_SEQ_CST);
}
static inline intptr_t AtomicGetAndIncrement(TAtomic& p) {
return __atomic_fetch_add(&p, 1, __ATOMIC_SEQ_CST);
}
static inline intptr_t AtomicDecrement(TAtomic& p) {
return __atomic_sub_fetch(&p, 1, __ATOMIC_SEQ_CST);
}
static inline intptr_t AtomicGetAndDecrement(TAtomic& p) {
return __atomic_fetch_sub(&p, 1, __ATOMIC_SEQ_CST);
}
static inline intptr_t AtomicAdd(TAtomic& p, intptr_t v) {
return __atomic_add_fetch(&p, v, __ATOMIC_SEQ_CST);
}
static inline intptr_t AtomicGetAndAdd(TAtomic& p, intptr_t v) {
return __atomic_fetch_add(&p, v, __ATOMIC_SEQ_CST);
}
static inline intptr_t AtomicSwap(TAtomic* p, intptr_t v) {
(void)p; // disable strange 'parameter set but not used' warning on gcc
intptr_t ret;
__atomic_exchange(p, &v, &ret, __ATOMIC_SEQ_CST);
return ret;
}
static inline bool AtomicCas(TAtomic* a, intptr_t exchange, intptr_t compare) {
(void)a; // disable strange 'parameter set but not used' warning on gcc
return __atomic_compare_exchange(a, &compare, &exchange, false, __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
}
static inline intptr_t AtomicGetAndCas(TAtomic* a, intptr_t exchange, intptr_t compare) {
(void)a; // disable strange 'parameter set but not used' warning on gcc
__atomic_compare_exchange(a, &compare, &exchange, false, __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
return compare;
}
static inline intptr_t AtomicOr(TAtomic& a, intptr_t b) {
return __atomic_or_fetch(&a, b, __ATOMIC_SEQ_CST);
}
static inline intptr_t AtomicXor(TAtomic& a, intptr_t b) {
return __atomic_xor_fetch(&a, b, __ATOMIC_SEQ_CST);
}
static inline intptr_t AtomicAnd(TAtomic& a, intptr_t b) {
return __atomic_and_fetch(&a, b, __ATOMIC_SEQ_CST);
}
static inline void AtomicBarrier() {
__sync_synchronize();
}
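
Two conventions in these wrappers are easy to misread: `AtomicAdd` returns the value after the addition while `AtomicGetAndAdd` returns the value before it, and `AtomicCas` takes the new value (`exchange`) before the expected one (`compare`). The sketch below shows both, using the GCC/Clang `__atomic` builtins directly. The function names are hypothetical stand-ins, not the wrappers themselves.

```cpp
#include <cassert>
#include <stdint.h>

// Returns the value AFTER the addition (the AtomicAdd convention).
static inline intptr_t AddThenFetch(volatile intptr_t& p, intptr_t v) {
    return __atomic_add_fetch(&p, v, __ATOMIC_SEQ_CST);
}

// Returns the value BEFORE the addition (the AtomicGetAndAdd convention).
static inline intptr_t FetchThenAdd(volatile intptr_t& p, intptr_t v) {
    return __atomic_fetch_add(&p, v, __ATOMIC_SEQ_CST);
}

// Argument order follows AtomicCas: new value first, expected value second.
static inline bool CompareAndSwap(volatile intptr_t* a, intptr_t exchange, intptr_t compare) {
    return __atomic_compare_exchange_n(a, &compare, exchange, false,
                                       __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
}

static inline void AtomicConventionsDemo() {
    volatile intptr_t counter = 5;
    assert(AddThenFetch(counter, 3) == 8);    // 5 + 3, new value returned
    assert(FetchThenAdd(counter, 2) == 8);    // old value returned, counter is now 10
    assert(CompareAndSwap(&counter, 1, 10));  // expected 10 matches, counter becomes 1
    assert(!CompareAndSwap(&counter, 7, 10)); // expected 10 no longer matches
    (void)counter;
}
```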


@ -0,0 +1,189 @@
#pragma once
#include <type_traits>
template <typename T>
inline TAtomic* AsAtomicPtr(T volatile* target) {
return reinterpret_cast<TAtomic*>(target);
}
template <typename T>
inline const TAtomic* AsAtomicPtr(T const volatile* target) {
return reinterpret_cast<const TAtomic*>(target);
}
// integral types
template <typename T>
struct TAtomicTraits {
enum {
Castable = std::is_integral<T>::value && sizeof(T) == sizeof(TAtomicBase) && !std::is_const<T>::value,
};
};
template <typename T, typename TT>
using TEnableIfCastable = std::enable_if_t<TAtomicTraits<T>::Castable, TT>;
template <typename T>
inline TEnableIfCastable<T, T> AtomicGet(T const volatile& target) {
return static_cast<T>(AtomicGet(*AsAtomicPtr(&target)));
}
template <typename T>
inline TEnableIfCastable<T, void> AtomicSet(T volatile& target, TAtomicBase value) {
AtomicSet(*AsAtomicPtr(&target), value);
}
template <typename T>
inline TEnableIfCastable<T, T> AtomicIncrement(T volatile& target) {
return static_cast<T>(AtomicIncrement(*AsAtomicPtr(&target)));
}
template <typename T>
inline TEnableIfCastable<T, T> AtomicGetAndIncrement(T volatile& target) {
return static_cast<T>(AtomicGetAndIncrement(*AsAtomicPtr(&target)));
}
template <typename T>
inline TEnableIfCastable<T, T> AtomicDecrement(T volatile& target) {
return static_cast<T>(AtomicDecrement(*AsAtomicPtr(&target)));
}
template <typename T>
inline TEnableIfCastable<T, T> AtomicGetAndDecrement(T volatile& target) {
return static_cast<T>(AtomicGetAndDecrement(*AsAtomicPtr(&target)));
}
template <typename T>
inline TEnableIfCastable<T, T> AtomicAdd(T volatile& target, TAtomicBase value) {
return static_cast<T>(AtomicAdd(*AsAtomicPtr(&target), value));
}
template <typename T>
inline TEnableIfCastable<T, T> AtomicGetAndAdd(T volatile& target, TAtomicBase value) {
return static_cast<T>(AtomicGetAndAdd(*AsAtomicPtr(&target), value));
}
template <typename T>
inline TEnableIfCastable<T, T> AtomicSub(T volatile& target, TAtomicBase value) {
return static_cast<T>(AtomicSub(*AsAtomicPtr(&target), value));
}
template <typename T>
inline TEnableIfCastable<T, T> AtomicGetAndSub(T volatile& target, TAtomicBase value) {
return static_cast<T>(AtomicGetAndSub(*AsAtomicPtr(&target), value));
}
template <typename T>
inline TEnableIfCastable<T, T> AtomicSwap(T volatile* target, TAtomicBase exchange) {
return static_cast<T>(AtomicSwap(AsAtomicPtr(target), exchange));
}
template <typename T>
inline TEnableIfCastable<T, bool> AtomicCas(T volatile* target, TAtomicBase exchange, TAtomicBase compare) {
return AtomicCas(AsAtomicPtr(target), exchange, compare);
}
template <typename T>
inline TEnableIfCastable<T, T> AtomicGetAndCas(T volatile* target, TAtomicBase exchange, TAtomicBase compare) {
return static_cast<T>(AtomicGetAndCas(AsAtomicPtr(target), exchange, compare));
}
template <typename T>
inline TEnableIfCastable<T, bool> AtomicTryLock(T volatile* target) {
return AtomicTryLock(AsAtomicPtr(target));
}
template <typename T>
inline TEnableIfCastable<T, bool> AtomicTryAndTryLock(T volatile* target) {
return AtomicTryAndTryLock(AsAtomicPtr(target));
}
template <typename T>
inline TEnableIfCastable<T, void> AtomicUnlock(T volatile* target) {
AtomicUnlock(AsAtomicPtr(target));
}
template <typename T>
inline TEnableIfCastable<T, T> AtomicOr(T volatile& target, TAtomicBase value) {
return static_cast<T>(AtomicOr(*AsAtomicPtr(&target), value));
}
template <typename T>
inline TEnableIfCastable<T, T> AtomicAnd(T volatile& target, TAtomicBase value) {
return static_cast<T>(AtomicAnd(*AsAtomicPtr(&target), value));
}
template <typename T>
inline TEnableIfCastable<T, T> AtomicXor(T volatile& target, TAtomicBase value) {
return static_cast<T>(AtomicXor(*AsAtomicPtr(&target), value));
}
// pointer types
template <typename T>
inline T* AtomicGet(T* const volatile& target) {
return reinterpret_cast<T*>(AtomicGet(*AsAtomicPtr(&target)));
}
template <typename T>
inline void AtomicSet(T* volatile& target, T* value) {
AtomicSet(*AsAtomicPtr(&target), reinterpret_cast<TAtomicBase>(value));
}
using TNullPtr = decltype(nullptr);
template <typename T>
inline void AtomicSet(T* volatile& target, TNullPtr) {
AtomicSet(*AsAtomicPtr(&target), 0);
}
template <typename T>
inline T* AtomicSwap(T* volatile* target, T* exchange) {
return reinterpret_cast<T*>(AtomicSwap(AsAtomicPtr(target), reinterpret_cast<TAtomicBase>(exchange)));
}
template <typename T>
inline T* AtomicSwap(T* volatile* target, TNullPtr) {
return reinterpret_cast<T*>(AtomicSwap(AsAtomicPtr(target), 0));
}
template <typename T>
inline bool AtomicCas(T* volatile* target, T* exchange, T* compare) {
return AtomicCas(AsAtomicPtr(target), reinterpret_cast<TAtomicBase>(exchange), reinterpret_cast<TAtomicBase>(compare));
}
template <typename T>
inline T* AtomicGetAndCas(T* volatile* target, T* exchange, T* compare) {
return reinterpret_cast<T*>(AtomicGetAndCas(AsAtomicPtr(target), reinterpret_cast<TAtomicBase>(exchange), reinterpret_cast<TAtomicBase>(compare)));
}
template <typename T>
inline bool AtomicCas(T* volatile* target, T* exchange, TNullPtr) {
return AtomicCas(AsAtomicPtr(target), reinterpret_cast<TAtomicBase>(exchange), 0);
}
template <typename T>
inline T* AtomicGetAndCas(T* volatile* target, T* exchange, TNullPtr) {
return reinterpret_cast<T*>(AtomicGetAndCas(AsAtomicPtr(target), reinterpret_cast<TAtomicBase>(exchange), 0));
}
template <typename T>
inline bool AtomicCas(T* volatile* target, TNullPtr, T* compare) {
return AtomicCas(AsAtomicPtr(target), 0, reinterpret_cast<TAtomicBase>(compare));
}
template <typename T>
inline T* AtomicGetAndCas(T* volatile* target, TNullPtr, T* compare) {
return reinterpret_cast<T*>(AtomicGetAndCas(AsAtomicPtr(target), 0, reinterpret_cast<TAtomicBase>(compare)));
}
template <typename T>
inline bool AtomicCas(T* volatile* target, TNullPtr, TNullPtr) {
return AtomicCas(AsAtomicPtr(target), 0, 0);
}
template <typename T>
inline T* AtomicGetAndCas(T* volatile* target, TNullPtr, TNullPtr) {
return reinterpret_cast<T*>(AtomicGetAndCas(AsAtomicPtr(target), 0, 0));
}


@ -0,0 +1,114 @@
#pragma once
#include <intrin.h>
#define USE_GENERIC_SETGET
#if defined(_i386_)
#pragma intrinsic(_InterlockedIncrement)
#pragma intrinsic(_InterlockedDecrement)
#pragma intrinsic(_InterlockedExchangeAdd)
#pragma intrinsic(_InterlockedExchange)
#pragma intrinsic(_InterlockedCompareExchange)
static inline intptr_t AtomicIncrement(TAtomic& a) {
return _InterlockedIncrement((volatile long*)&a);
}
static inline intptr_t AtomicGetAndIncrement(TAtomic& a) {
return _InterlockedIncrement((volatile long*)&a) - 1;
}
static inline intptr_t AtomicDecrement(TAtomic& a) {
return _InterlockedDecrement((volatile long*)&a);
}
static inline intptr_t AtomicGetAndDecrement(TAtomic& a) {
return _InterlockedDecrement((volatile long*)&a) + 1;
}
static inline intptr_t AtomicAdd(TAtomic& a, intptr_t b) {
return _InterlockedExchangeAdd((volatile long*)&a, b) + b;
}
static inline intptr_t AtomicGetAndAdd(TAtomic& a, intptr_t b) {
return _InterlockedExchangeAdd((volatile long*)&a, b);
}
static inline intptr_t AtomicSwap(TAtomic* a, intptr_t b) {
return _InterlockedExchange((volatile long*)a, b);
}
static inline bool AtomicCas(TAtomic* a, intptr_t exchange, intptr_t compare) {
return _InterlockedCompareExchange((volatile long*)a, exchange, compare) == compare;
}
static inline intptr_t AtomicGetAndCas(TAtomic* a, intptr_t exchange, intptr_t compare) {
return _InterlockedCompareExchange((volatile long*)a, exchange, compare);
}
#else // _x86_64_
#pragma intrinsic(_InterlockedIncrement64)
#pragma intrinsic(_InterlockedDecrement64)
#pragma intrinsic(_InterlockedExchangeAdd64)
#pragma intrinsic(_InterlockedExchange64)
#pragma intrinsic(_InterlockedCompareExchange64)
static inline intptr_t AtomicIncrement(TAtomic& a) {
return _InterlockedIncrement64((volatile __int64*)&a);
}
static inline intptr_t AtomicGetAndIncrement(TAtomic& a) {
return _InterlockedIncrement64((volatile __int64*)&a) - 1;
}
static inline intptr_t AtomicDecrement(TAtomic& a) {
return _InterlockedDecrement64((volatile __int64*)&a);
}
static inline intptr_t AtomicGetAndDecrement(TAtomic& a) {
return _InterlockedDecrement64((volatile __int64*)&a) + 1;
}
static inline intptr_t AtomicAdd(TAtomic& a, intptr_t b) {
return _InterlockedExchangeAdd64((volatile __int64*)&a, b) + b;
}
static inline intptr_t AtomicGetAndAdd(TAtomic& a, intptr_t b) {
return _InterlockedExchangeAdd64((volatile __int64*)&a, b);
}
static inline intptr_t AtomicSwap(TAtomic* a, intptr_t b) {
return _InterlockedExchange64((volatile __int64*)a, b);
}
static inline bool AtomicCas(TAtomic* a, intptr_t exchange, intptr_t compare) {
return _InterlockedCompareExchange64((volatile __int64*)a, exchange, compare) == compare;
}
static inline intptr_t AtomicGetAndCas(TAtomic* a, intptr_t exchange, intptr_t compare) {
return _InterlockedCompareExchange64((volatile __int64*)a, exchange, compare);
}
static inline intptr_t AtomicOr(TAtomic& a, intptr_t b) {
return _InterlockedOr64(&a, b) | b;
}
static inline intptr_t AtomicAnd(TAtomic& a, intptr_t b) {
return _InterlockedAnd64(&a, b) & b;
}
static inline intptr_t AtomicXor(TAtomic& a, intptr_t b) {
return _InterlockedXor64(&a, b) ^ b;
}
#endif // _x86_
//TODO
static inline void AtomicBarrier() {
TAtomic val = 0;
AtomicSwap(&val, 0);
}


@ -0,0 +1,617 @@
#pragma once
// useful cross-platform definitions for compilers
/**
* @def Y_FUNC_SIGNATURE
*
* Use this macro to get pretty function name (see example).
*
* @code
* void Hi() {
* Cout << Y_FUNC_SIGNATURE << Endl;
* }
* template <typename T>
* void Do() {
* Cout << Y_FUNC_SIGNATURE << Endl;
* }
* int main() {
* Hi(); // void Hi()
* Do<int>(); // void Do() [T = int]
* Do<TString>(); // void Do() [T = TString]
* }
* @endcode
*/
#if defined(__GNUC__)
#define Y_FUNC_SIGNATURE __PRETTY_FUNCTION__
#elif defined(_MSC_VER)
#define Y_FUNC_SIGNATURE __FUNCSIG__
#else
#define Y_FUNC_SIGNATURE ""
#endif
#ifdef __GNUC__
#define Y_PRINTF_FORMAT(n, m) __attribute__((__format__(__printf__, n, m)))
#endif
#ifndef Y_PRINTF_FORMAT
#define Y_PRINTF_FORMAT(n, m)
#endif
#if defined(__clang__)
#define Y_NO_SANITIZE(...) __attribute__((no_sanitize(__VA_ARGS__)))
#endif
#if !defined(Y_NO_SANITIZE)
#define Y_NO_SANITIZE(...)
#endif
/**
* @def Y_DECLARE_UNUSED
*
* Macro is needed to silence compiler warning about unused entities (e.g. function or argument).
*
* @code
* Y_DECLARE_UNUSED int FunctionUsedSolelyForDebugPurposes();
* assert(FunctionUsedSolelyForDebugPurposes() == 42);
*
* void Foo(const int argumentUsedOnlyForDebugPurposes Y_DECLARE_UNUSED) {
* assert(argumentUsedOnlyForDebugPurposes == 42);
* // however you may as well omit `Y_DECLARE_UNUSED` and use the `Y_UNUSED` macro instead
* Y_UNUSED(argumentUsedOnlyForDebugPurposes);
* }
* @endcode
*/
#ifdef __GNUC__
#define Y_DECLARE_UNUSED __attribute__((unused))
#endif
#ifndef Y_DECLARE_UNUSED
#define Y_DECLARE_UNUSED
#endif
#if defined(__GNUC__)
#define Y_LIKELY(Cond) __builtin_expect(!!(Cond), 1)
#define Y_UNLIKELY(Cond) __builtin_expect(!!(Cond), 0)
#define Y_PREFETCH_READ(Pointer, Priority) __builtin_prefetch((const void*)(Pointer), 0, Priority)
#define Y_PREFETCH_WRITE(Pointer, Priority) __builtin_prefetch((const void*)(Pointer), 1, Priority)
#endif
/**
* @def Y_FORCE_INLINE
*
* Macro to use in place of 'inline' in function declaration/definition to force
* it to be inlined.
*/
#if !defined(Y_FORCE_INLINE)
#if defined(CLANG_COVERAGE)
#/* excessive __always_inline__ might significantly slow down compilation of an instrumented unit */
#define Y_FORCE_INLINE inline
#elif defined(_MSC_VER)
#define Y_FORCE_INLINE __forceinline
#elif defined(__GNUC__)
#/* Clang also defines __GNUC__ (as 4) */
#define Y_FORCE_INLINE inline __attribute__((__always_inline__))
#else
#define Y_FORCE_INLINE inline
#endif
#endif
/**
* @def Y_NO_INLINE
*
* Macro to use in place of 'inline' in function declaration/definition to
* prevent it from being inlined.
*/
#if !defined(Y_NO_INLINE)
#if defined(_MSC_VER)
#define Y_NO_INLINE __declspec(noinline)
#elif defined(__GNUC__) || defined(__INTEL_COMPILER)
#/* Clang also defines __GNUC__ (as 4) */
#define Y_NO_INLINE __attribute__((__noinline__))
#else
#define Y_NO_INLINE
#endif
#endif
// to trick the compiler about strict aliasing or similar problems
#if defined(__GNUC__)
#define Y_FAKE_READ(X) \
do { \
__asm__ __volatile__("" \
: \
: "m"(X)); \
} while (0)
#define Y_FAKE_WRITE(X) \
do { \
__asm__ __volatile__("" \
: "=m"(X)); \
} while (0)
#endif
#if !defined(Y_FAKE_READ)
#define Y_FAKE_READ(X)
#endif
#if !defined(Y_FAKE_WRITE)
#define Y_FAKE_WRITE(X)
#endif
#ifndef Y_PREFETCH_READ
#define Y_PREFETCH_READ(Pointer, Priority) (void)(const void*)(Pointer), (void)Priority
#endif
#ifndef Y_PREFETCH_WRITE
#define Y_PREFETCH_WRITE(Pointer, Priority) (void)(const void*)(Pointer), (void)Priority
#endif
#ifndef Y_LIKELY
#define Y_LIKELY(Cond) (Cond)
#define Y_UNLIKELY(Cond) (Cond)
#endif
#ifdef __GNUC__
#define _packed __attribute__((packed))
#else
#define _packed
#endif
#if defined(__GNUC__)
#define Y_WARN_UNUSED_RESULT __attribute__((warn_unused_result))
#endif
#ifndef Y_WARN_UNUSED_RESULT
#define Y_WARN_UNUSED_RESULT
#endif
#if defined(__GNUC__)
#define Y_HIDDEN __attribute__((visibility("hidden")))
#endif
#if !defined(Y_HIDDEN)
#define Y_HIDDEN
#endif
#if defined(__GNUC__)
#define Y_PUBLIC __attribute__((visibility("default")))
#endif
#if !defined(Y_PUBLIC)
#define Y_PUBLIC
#endif
#if !defined(Y_UNUSED) && !defined(__cplusplus)
#define Y_UNUSED(var) (void)(var)
#endif
#if !defined(Y_UNUSED) && defined(__cplusplus)
template <class... Types>
constexpr Y_FORCE_INLINE int Y_UNUSED(Types&&...) {
return 0;
};
#endif
/**
* @def Y_ASSUME
*
* Macro that tells the compiler that it can generate optimized code
* as if the given expression will always evaluate true.
* The behavior is undefined if it ever evaluates false.
*
* @code
* // factored into a function so that it's testable
* inline int Avg(int x, int y) {
* if (x >= 0 && y >= 0) {
* return (static_cast<unsigned>(x) + static_cast<unsigned>(y)) >> 1;
* } else {
* // a slower implementation
* }
* }
*
* // we know that xs and ys are non-negative from domain knowledge,
* // but we can't change the types of xs and ys because of API constraints
* TVector<int> Foo(const TVector<int>& xs, const TVector<int>& ys) {
* TVector<int> avgs;
* avgs.resize(xs.size());
* for (size_t i = 0; i < xs.size(); ++i) {
* auto x = xs[i];
* auto y = ys[i];
* Y_ASSUME(x >= 0);
* Y_ASSUME(y >= 0);
* avgs[i] = Avg(x, y);
* }
* return avgs;
* }
* @endcode
*/
#if defined(__GNUC__)
#define Y_ASSUME(condition) ((condition) ? (void)0 : __builtin_unreachable())
#elif defined(_MSC_VER)
#define Y_ASSUME(condition) __assume(condition)
#else
#define Y_ASSUME(condition) Y_UNUSED(condition)
#endif
#ifdef __cplusplus
[[noreturn]]
#endif
Y_HIDDEN void _YandexAbort();
/**
* @def Y_UNREACHABLE
*
* Macro that marks the rest of the code branch unreachable.
* The behavior is undefined if it's ever reached.
*
* @code
* switch (i % 3) {
* case 0:
* return foo;
* case 1:
* return bar;
* case 2:
* return baz;
* default:
* Y_UNREACHABLE();
* }
* @endcode
*/
#if defined(__GNUC__) || defined(_MSC_VER)
#define Y_UNREACHABLE() Y_ASSUME(0)
#else
#define Y_UNREACHABLE() _YandexAbort()
#endif
#if defined(undefined_sanitizer_enabled)
#define _ubsan_enabled_
#endif
#ifdef __clang__
#if __has_feature(thread_sanitizer)
#define _tsan_enabled_
#endif
#if __has_feature(memory_sanitizer)
#define _msan_enabled_
#endif
#if __has_feature(address_sanitizer)
#define _asan_enabled_
#endif
#else
#if defined(thread_sanitizer_enabled) || defined(__SANITIZE_THREAD__)
#define _tsan_enabled_
#endif
#if defined(memory_sanitizer_enabled)
#define _msan_enabled_
#endif
#if defined(address_sanitizer_enabled) || defined(__SANITIZE_ADDRESS__)
#define _asan_enabled_
#endif
#endif
#if defined(_asan_enabled_) || defined(_msan_enabled_) || defined(_tsan_enabled_) || defined(_ubsan_enabled_)
#define _san_enabled_
#endif
#if defined(_MSC_VER)
#define __PRETTY_FUNCTION__ __FUNCSIG__
#endif
#if defined(__GNUC__)
#define Y_WEAK __attribute__((weak))
#else
#define Y_WEAK
#endif
#if defined(__CUDACC_VER_MAJOR__)
#define Y_CUDA_AT_LEAST(x, y) (__CUDACC_VER_MAJOR__ > x || (__CUDACC_VER_MAJOR__ == x && __CUDACC_VER_MINOR__ >= y))
#else
#define Y_CUDA_AT_LEAST(x, y) 0
#endif
// NVidia CUDA C++ Compiler did not know about noexcept keyword until version 9.0
#if !Y_CUDA_AT_LEAST(9, 0)
#if defined(__CUDACC__) && !defined(noexcept)
#define noexcept throw ()
#endif
#endif
#if defined(__GNUC__)
#define Y_COLD __attribute__((cold))
#define Y_LEAF __attribute__((leaf))
#define Y_WRAPPER __attribute__((artificial))
#else
#define Y_COLD
#define Y_LEAF
#define Y_WRAPPER
#endif
/**
* @def Y_PRAGMA
*
* Macro for use in other macros to define compiler pragma
* See below for other usage examples
*
* @code
* #if defined(__clang__) || defined(__GNUC__)
* #define Y_PRAGMA_NO_WSHADOW \
* Y_PRAGMA("GCC diagnostic ignored \"-Wshadow\"")
* #elif defined(_MSC_VER)
* #define Y_PRAGMA_NO_WSHADOW \
* Y_PRAGMA(warning(disable : 4456 4457))
* #else
* #define Y_PRAGMA_NO_WSHADOW
* #endif
* @endcode
*/
#if defined(__clang__) || defined(__GNUC__)
#define Y_PRAGMA(x) _Pragma(x)
#elif defined(_MSC_VER)
#define Y_PRAGMA(x) __pragma(x)
#else
#define Y_PRAGMA(x)
#endif
/**
* @def Y_PRAGMA_DIAGNOSTIC_PUSH
*
* Cross-compiler pragma to save diagnostic settings
*
* @see
* GCC: https://gcc.gnu.org/onlinedocs/gcc/Diagnostic-Pragmas.html
* MSVC: https://msdn.microsoft.com/en-us/library/2c8f766e.aspx
* Clang: https://clang.llvm.org/docs/UsersManual.html#controlling-diagnostics-via-pragmas
*
* @code
* Y_PRAGMA_DIAGNOSTIC_PUSH
* @endcode
*/
#if defined(__clang__) || defined(__GNUC__)
#define Y_PRAGMA_DIAGNOSTIC_PUSH \
Y_PRAGMA("GCC diagnostic push")
#elif defined(_MSC_VER)
#define Y_PRAGMA_DIAGNOSTIC_PUSH \
Y_PRAGMA(warning(push))
#else
#define Y_PRAGMA_DIAGNOSTIC_PUSH
#endif
/**
* @def Y_PRAGMA_DIAGNOSTIC_POP
*
* Cross-compiler pragma to restore diagnostic settings
*
* @see
* GCC: https://gcc.gnu.org/onlinedocs/gcc/Diagnostic-Pragmas.html
* MSVC: https://msdn.microsoft.com/en-us/library/2c8f766e.aspx
* Clang: https://clang.llvm.org/docs/UsersManual.html#controlling-diagnostics-via-pragmas
*
* @code
* Y_PRAGMA_DIAGNOSTIC_POP
* @endcode
*/
#if defined(__clang__) || defined(__GNUC__)
#define Y_PRAGMA_DIAGNOSTIC_POP \
Y_PRAGMA("GCC diagnostic pop")
#elif defined(_MSC_VER)
#define Y_PRAGMA_DIAGNOSTIC_POP \
Y_PRAGMA(warning(pop))
#else
#define Y_PRAGMA_DIAGNOSTIC_POP
#endif
/**
* @def Y_PRAGMA_NO_WSHADOW
*
* Cross-compiler pragma to disable warnings about shadowing variables
*
* @code
* Y_PRAGMA_DIAGNOSTIC_PUSH
* Y_PRAGMA_NO_WSHADOW
*
* // some code which uses variable shadowing, e.g.:
*
* for (int i = 0; i < 100; ++i) {
* Use(i);
*
* for (int i = 42; i < 100500; ++i) { // this i is shadowing previous i
* AnotherUse(i);
* }
* }
*
* Y_PRAGMA_DIAGNOSTIC_POP
* @endcode
*/
#if defined(__clang__) || defined(__GNUC__)
#define Y_PRAGMA_NO_WSHADOW \
Y_PRAGMA("GCC diagnostic ignored \"-Wshadow\"")
#elif defined(_MSC_VER)
#define Y_PRAGMA_NO_WSHADOW \
Y_PRAGMA(warning(disable : 4456 4457))
#else
#define Y_PRAGMA_NO_WSHADOW
#endif
/**
* @def Y_PRAGMA_NO_UNUSED_FUNCTION
*
* Cross-compiler pragma to disable warnings about unused functions
*
* @see
* GCC: https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html
* Clang: https://clang.llvm.org/docs/DiagnosticsReference.html#wunused-function
* MSVC: there is no such warning
*
* @code
* Y_PRAGMA_DIAGNOSTIC_PUSH
* Y_PRAGMA_NO_UNUSED_FUNCTION
*
* // some code which introduces a function which later will not be used, e.g.:
*
* void Foo() {
* }
*
* int main() {
* return 0; // Foo() never called
* }
*
* Y_PRAGMA_DIAGNOSTIC_POP
* @endcode
*/
#if defined(__clang__) || defined(__GNUC__)
#define Y_PRAGMA_NO_UNUSED_FUNCTION \
Y_PRAGMA("GCC diagnostic ignored \"-Wunused-function\"")
#else
#define Y_PRAGMA_NO_UNUSED_FUNCTION
#endif
/**
* @def Y_PRAGMA_NO_UNUSED_PARAMETER
*
* Cross-compiler pragma to disable warnings about unused function parameters
*
* @see
* GCC: https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html
* Clang: https://clang.llvm.org/docs/DiagnosticsReference.html#wunused-parameter
* MSVC: https://msdn.microsoft.com/en-us/library/26kb9fy0.aspx
*
* @code
* Y_PRAGMA_DIAGNOSTIC_PUSH
* Y_PRAGMA_NO_UNUSED_PARAMETER
*
* // some code which introduces a function with unused parameter, e.g.:
*
* void foo(int a) {
* // a is not referenced
* }
*
* int main() {
* foo(1);
* return 0;
* }
*
* Y_PRAGMA_DIAGNOSTIC_POP
* @endcode
*/
#if defined(__clang__) || defined(__GNUC__)
#define Y_PRAGMA_NO_UNUSED_PARAMETER \
Y_PRAGMA("GCC diagnostic ignored \"-Wunused-parameter\"")
#elif defined(_MSC_VER)
#define Y_PRAGMA_NO_UNUSED_PARAMETER \
Y_PRAGMA(warning(disable : 4100))
#else
#define Y_PRAGMA_NO_UNUSED_PARAMETER
#endif
/**
* @def Y_PRAGMA_NO_DEPRECATED
*
* Cross-compiler pragma to disable warnings and errors about deprecated declarations
*
* @see
* GCC: https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html
* Clang: https://clang.llvm.org/docs/DiagnosticsReference.html#wdeprecated
* MSVC: https://docs.microsoft.com/en-us/cpp/error-messages/compiler-warnings/compiler-warning-level-3-c4996?view=vs-2017
*
* @code
* Y_PRAGMA_DIAGNOSTIC_PUSH
* Y_PRAGMA_NO_DEPRECATED
*
* [[deprecated]] void foo() {
* // ...
* }
*
* int main() {
* foo();
* return 0;
* }
*
* Y_PRAGMA_DIAGNOSTIC_POP
* @endcode
*/
#if defined(__clang__) || defined(__GNUC__)
#define Y_PRAGMA_NO_DEPRECATED \
Y_PRAGMA("GCC diagnostic ignored \"-Wdeprecated\"")
#elif defined(_MSC_VER)
#define Y_PRAGMA_NO_DEPRECATED \
Y_PRAGMA(warning(disable : 4996))
#else
#define Y_PRAGMA_NO_DEPRECATED
#endif
#if defined(__clang__) || defined(__GNUC__)
/**
* @def Y_CONST_FUNCTION
Methods and functions marked with this macro promise that they:
1. have no side effects
2. do not read global memory
NOTE: this attribute can't be set for methods that depend on data pointed to by `this`.
This allows compilers to optimize such functions aggressively.
NOTE: in the common case this attribute can't be set if the method has pointer arguments.
NOTE: as a result, there is no reason to discard the result of such a method.
*/
#define Y_CONST_FUNCTION [[gnu::const]]
#endif
#if !defined(Y_CONST_FUNCTION)
#define Y_CONST_FUNCTION
#endif
#if defined(__clang__) || defined(__GNUC__)
/**
* @def Y_PURE_FUNCTION
Methods and functions marked with this macro promise that:
1. they have no side effects
2. their result is the same as long as no global memory has changed
This allows compilers to optimize such functions aggressively.
NOTE: as a result, there is no reason to discard the result of such a method.
*/
#define Y_PURE_FUNCTION [[gnu::pure]]
#endif
#if !defined(Y_PURE_FUNCTION)
#define Y_PURE_FUNCTION
#endif
/**
* @def Y_HAVE_INT128
*
* Defined when the compiler supports __int128 extension
*
* @code
*
* #if defined(Y_HAVE_INT128)
* __int128 myVeryBigInt = 12345678901234567890;
* #endif
*
* @endcode
*/
#if defined(__SIZEOF_INT128__)
#define Y_HAVE_INT128 1
#endif
/**
* XRAY macro must be passed to compiler if XRay is enabled.
*
* Define everything XRay-specific as a macro so that it doesn't cause errors
* for compilers that don't support XRay.
*/
#if defined(XRAY) && defined(__cplusplus)
#include <xray/xray_interface.h>
#define Y_XRAY_ALWAYS_INSTRUMENT [[clang::xray_always_instrument]]
#define Y_XRAY_NEVER_INSTRUMENT [[clang::xray_never_instrument]]
#define Y_XRAY_CUSTOM_EVENT(__string, __length) \
do { \
__xray_customevent(__string, __length); \
} while (0)
#else
#define Y_XRAY_ALWAYS_INSTRUMENT
#define Y_XRAY_NEVER_INSTRUMENT
#define Y_XRAY_CUSTOM_EVENT(__string, __length) \
do { \
} while (0)
#endif
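
The push / ignore / pop pattern documented above can be sketched as a standalone file. This is a minimal illustration, not part of the commit: the `DEMO_*` macros are hypothetical local stand-ins that mirror the `Y_PRAGMA_*` definitions so the snippet compiles without the header, and `SumWithShadowing` is an invented example function.

```cpp
#include <cassert>

// Hypothetical stand-ins mirroring Y_PRAGMA, Y_PRAGMA_DIAGNOSTIC_PUSH/POP
// and Y_PRAGMA_NO_WSHADOW from the header above.
#if defined(__clang__) || defined(__GNUC__)
#define DEMO_PRAGMA(x) _Pragma(x)
#define DEMO_DIAGNOSTIC_PUSH DEMO_PRAGMA("GCC diagnostic push")
#define DEMO_DIAGNOSTIC_POP DEMO_PRAGMA("GCC diagnostic pop")
#define DEMO_NO_WSHADOW DEMO_PRAGMA("GCC diagnostic ignored \"-Wshadow\"")
#elif defined(_MSC_VER)
#define DEMO_PRAGMA(x) __pragma(x)
#define DEMO_DIAGNOSTIC_PUSH DEMO_PRAGMA(warning(push))
#define DEMO_DIAGNOSTIC_POP DEMO_PRAGMA(warning(pop))
#define DEMO_NO_WSHADOW DEMO_PRAGMA(warning(disable : 4456 4457))
#else
#define DEMO_DIAGNOSTIC_PUSH
#define DEMO_DIAGNOSTIC_POP
#define DEMO_NO_WSHADOW
#endif

// Save the diagnostic state, silence shadowing warnings for one function,
// then restore the previous state so later code is checked as usual.
DEMO_DIAGNOSTIC_PUSH
DEMO_NO_WSHADOW
inline int SumWithShadowing() {
    int total = 0;
    for (int i = 0; i < 3; ++i) {
        total += i;                      // outer i: 0 + 1 + 2
        for (int i = 10; i < 12; ++i) {  // intentionally shadows the outer i
            total += i;                  // inner i: 10 + 11 on every outer pass
        }
    }
    return total;
}
DEMO_DIAGNOSTIC_POP

inline void RunShadowExample() {
    assert(SumWithShadowing() == 3 + 3 * 21);
}
```

Keeping the suppressed region as small as possible (one function here) means a warning that is disabled for a known reason does not hide unrelated shadowing elsewhere in the translation unit.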

@ -0,0 +1,168 @@
#pragma once
#include "platform.h"
#if defined _unix_
#define LOCSLASH_C '/'
#define LOCSLASH_S "/"
#else
#define LOCSLASH_C '\\'
#define LOCSLASH_S "\\"
#endif // _unix_
#if defined(__INTEL_COMPILER) && defined(__cplusplus)
#include <new>
#endif
// low and high parts of integers
#if !defined(_win_)
#include <sys/param.h>
#endif
#if defined(BSD) || defined(_android_)
#if defined(BSD)
#include <machine/endian.h>
#endif
#if defined(_android_)
#include <endian.h>
#endif
#if (BYTE_ORDER == LITTLE_ENDIAN)
#define _little_endian_
#elif (BYTE_ORDER == BIG_ENDIAN)
#define _big_endian_
#else
#error unknown endian not supported
#endif
#elif (defined(_sun_) && !defined(__i386__)) || defined(_hpux_) || defined(WHATEVER_THAT_HAS_BIG_ENDIAN)
#define _big_endian_
#else
#define _little_endian_
#endif
// alignment
#if (defined(_sun_) && !defined(__i386__)) || defined(_hpux_) || defined(__alpha__) || defined(__ia64__) || defined(WHATEVER_THAT_NEEDS_ALIGNING_QUADS)
#define _must_align8_
#endif
#if (defined(_sun_) && !defined(__i386__)) || defined(_hpux_) || defined(__alpha__) || defined(__ia64__) || defined(WHATEVER_THAT_NEEDS_ALIGNING_LONGS)
#define _must_align4_
#endif
#if (defined(_sun_) && !defined(__i386__)) || defined(_hpux_) || defined(__alpha__) || defined(__ia64__) || defined(WHATEVER_THAT_NEEDS_ALIGNING_SHORTS)
#define _must_align2_
#endif
#if defined(__GNUC__)
#define alias_hack __attribute__((__may_alias__))
#endif
#ifndef alias_hack
#define alias_hack
#endif
#include "types.h"
#if defined(__STDC_VERSION__) && (__STDC_VERSION__ >= 199901L)
#define PRAGMA(x) _Pragma(#x)
#define RCSID(idstr) PRAGMA(comment(exestr, idstr))
#else
#define RCSID(idstr) static const char rcsid[] = idstr
#endif
#include "compiler.h"
#ifdef _win_
#include <malloc.h>
#elif defined(_sun_)
#include <alloca.h>
#endif
#ifdef NDEBUG
#define Y_IF_DEBUG(X)
#else
#define Y_IF_DEBUG(X) X
#endif
/**
* @def Y_ARRAY_SIZE
*
* This macro is needed to get number of elements in a statically allocated fixed size array. The
* expression is a compile-time constant and therefore can be used in compile time computations.
*
* @code
* enum ENumbers {
* EN_ONE,
* EN_TWO,
* EN_SIZE
* };
*
* const char* NAMES[] = {
* "one",
* "two"
* };
*
* static_assert(Y_ARRAY_SIZE(NAMES) == EN_SIZE, "you should define a name in `NAMES` for each enumeration value");
* @endcode
*
* This macro also catches type errors. If you see a compiler error like "warning: division by zero
* is undefined" when using `Y_ARRAY_SIZE` then you are probably giving it a pointer.
*
* Since all of our code is expected to work on a 64-bit platform where pointers are 8 bytes, we may
* falsely accept pointers to types whose sizes are divisors of 8 (1, 2, 4 and 8).
*/
#if defined(__cplusplus)
namespace NArraySizePrivate {
template <class T>
struct TArraySize;
template <class T, size_t N>
struct TArraySize<T[N]> {
enum {
Result = N
};
};
template <class T, size_t N>
struct TArraySize<T (&)[N]> {
enum {
Result = N
};
};
}
#define Y_ARRAY_SIZE(arr) ((size_t)::NArraySizePrivate::TArraySize<decltype(arr)>::Result)
#else
#undef Y_ARRAY_SIZE
#define Y_ARRAY_SIZE(arr) \
((sizeof(arr) / sizeof((arr)[0])) / (size_t)(!(sizeof(arr) % sizeof((arr)[0]))))
#endif
#undef Y_ARRAY_BEGIN
#define Y_ARRAY_BEGIN(arr) (arr)
#undef Y_ARRAY_END
#define Y_ARRAY_END(arr) ((arr) + Y_ARRAY_SIZE(arr))
/**
* Concatenates two symbols, even if one of them is itself a macro.
*/
#define Y_CAT(X, Y) Y_CAT_I(X, Y)
#define Y_CAT_I(X, Y) Y_CAT_II(X, Y)
#define Y_CAT_II(X, Y) X##Y
#define Y_STRINGIZE(X) UTIL_PRIVATE_STRINGIZE_AUX(X)
#define UTIL_PRIVATE_STRINGIZE_AUX(X) #X
#if defined(__COUNTER__)
#define Y_GENERATE_UNIQUE_ID(N) Y_CAT(N, __COUNTER__)
#endif
#if !defined(Y_GENERATE_UNIQUE_ID)
#define Y_GENERATE_UNIQUE_ID(N) Y_CAT(N, __LINE__)
#endif
#define NPOS ((size_t)-1)
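
A short sketch of how the array-size, concatenation and stringize helpers above behave. The `DEMO_*` macros and the `NDemo` namespace are hypothetical copies made only so the snippet builds on its own; `DEMO_ANSWER`, `FirstName` and the generated `Value42` function are invented for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical local copy of the template-based array size helper.
namespace NDemo {
    template <class T>
    struct TArraySize;
    template <class T, std::size_t N>
    struct TArraySize<T[N]> {
        enum { Result = N };
    };
    template <class T, std::size_t N>
    struct TArraySize<T (&)[N]> {
        enum { Result = N };
    };
}
#define DEMO_ARRAY_SIZE(arr) ((std::size_t)::NDemo::TArraySize<decltype(arr)>::Result)

// Two-level expansion: arguments are macro-expanded before ## or # is applied.
#define DEMO_CAT(X, Y) DEMO_CAT_I(X, Y)
#define DEMO_CAT_I(X, Y) X##Y
#define DEMO_STRINGIZE(X) DEMO_STRINGIZE_AUX(X)
#define DEMO_STRINGIZE_AUX(X) #X

enum ENumbers { EN_ONE, EN_TWO, EN_SIZE };
static const char* const NAMES[] = {"one", "two"};

// Compile-time check that every enumerator has a name.
static_assert(DEMO_ARRAY_SIZE(NAMES) == EN_SIZE, "define a name for each enumerator");

inline const char* FirstName() {
    return NAMES[EN_ONE];
}

#define DEMO_ANSWER 42
// DEMO_ANSWER is expanded first, so this defines a function named Value42.
inline int DEMO_CAT(Value, DEMO_ANSWER)() {
    return DEMO_ANSWER;
}

inline void RunMacroExample() {
    assert(Value42() == 42);
    assert(std::strcmp(DEMO_STRINGIZE(DEMO_ANSWER), "42") == 0);
    assert(std::strcmp(FirstName(), "one") == 0);
}
```

Passing a pointer instead of an array to `DEMO_ARRAY_SIZE` fails to compile because no `TArraySize` specialization matches, which is the type check the C++ branch of the original macro relies on.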

@ -0,0 +1,242 @@
#pragma once
// What OS ?
// our definition has the form _{osname}_
#if defined(_WIN64)
#define _win64_
#define _win32_
#elif defined(__WIN32__) || defined(_WIN32) // _WIN32 is also defined by the 64-bit compiler for backward compatibility
#define _win32_
#else
#define _unix_
#if defined(__sun__) || defined(sun) || defined(sparc) || defined(__sparc)
#define _sun_
#endif
#if defined(__hpux__)
#define _hpux_
#endif
#if defined(__linux__)
#define _linux_
#endif
#if defined(__FreeBSD__)
#define _freebsd_
#endif
#if defined(__CYGWIN__)
#define _cygwin_
#endif
#if defined(__APPLE__)
#define _darwin_
#endif
#if defined(__ANDROID__)
#define _android_
#endif
#endif
#if defined(__IOS__)
#define _ios_
#endif
#if defined(_linux_)
#if defined(_musl_)
//nothing to do
#elif defined(_android_)
#define _bionic_
#else
#define _glibc_
#endif
#endif
#if defined(_darwin_)
#define unix
#define __unix__
#endif
#if defined(_win32_) || defined(_win64_)
#define _win_
#endif
#if defined(__arm__) || defined(__ARM__) || defined(__ARM_NEON) || defined(__aarch64__) || defined(_M_ARM)
#if defined(__arm64) || defined(__arm64__) || defined(__aarch64__)
#define _arm64_
#else
#define _arm32_
#endif
#endif
#if defined(_arm64_) || defined(_arm32_)
#define _arm_
#endif
/* __ia64__ and __x86_64__ - defined by GNU C.
* _M_IA64, _M_X64, _M_AMD64 - defined by Visual Studio.
*
* Microsoft can define _M_IX86, _M_AMD64 (before Visual Studio 8)
* or _M_X64 (starting in Visual Studio 8).
*/
#if defined(__x86_64__) || defined(_M_X64) || defined(_M_AMD64)
#define _x86_64_
#endif
#if defined(__i386__) || defined(_M_IX86)
#define _i386_
#endif
#if defined(__ia64__) || defined(_M_IA64)
#define _ia64_
#endif
#if defined(__powerpc__)
#define _ppc_
#endif
#if defined(__powerpc64__)
#define _ppc64_
#endif
#if !defined(sparc) && !defined(__sparc) && !defined(__hpux__) && !defined(__alpha__) && !defined(_ia64_) && !defined(_x86_64_) && !defined(_arm_) && !defined(_i386_) && !defined(_ppc_) && !defined(_ppc64_)
#error "platform not defined, please, define one"
#endif
#if defined(_x86_64_) || defined(_i386_)
#define _x86_
#endif
#if defined(__MIC__)
#define _mic_
#define _k1om_
#endif
// stdio or MessageBox
#if defined(__CONSOLE__) || defined(_CONSOLE)
#define _console_
#endif
#if (defined(_win_) && !defined(_console_))
#define _windows_
#elif !defined(_console_)
#define _console_
#endif
#if defined(__SSE__) || defined(SSE_ENABLED)
#define _sse_
#endif
#if defined(__SSE2__) || defined(SSE2_ENABLED)
#define _sse2_
#endif
#if defined(__SSE3__) || defined(SSE3_ENABLED)
#define _sse3_
#endif
#if defined(__SSSE3__) || defined(SSSE3_ENABLED)
#define _ssse3_
#endif
#if defined(POPCNT_ENABLED)
#define _popcnt_
#endif
#if defined(__DLL__) || defined(_DLL)
#define _dll_
#endif
// 16, 32 or 64
#if defined(__sparc_v9__) || defined(_x86_64_) || defined(_ia64_) || defined(_arm64_) || defined(_ppc64_)
#define _64_
#else
#define _32_
#endif
/* All modern 64-bit Unix systems use scheme LP64 (long, pointers are 64-bit).
* Microsoft uses a different scheme: LLP64 (long long, pointers are 64-bit).
*
* Scheme LP64 LLP64
* char 8 8
* short 16 16
* int 32 32
* long 64 32
* long long 64 64
* pointer 64 64
*/
#if defined(_32_)
#define SIZEOF_PTR 4
#elif defined(_64_)
#define SIZEOF_PTR 8
#endif
#define PLATFORM_DATA_ALIGN SIZEOF_PTR
#if !defined(SIZEOF_PTR)
#error todo
#endif
#define SIZEOF_CHAR 1
#define SIZEOF_UNSIGNED_CHAR 1
#define SIZEOF_SHORT 2
#define SIZEOF_UNSIGNED_SHORT 2
#define SIZEOF_INT 4
#define SIZEOF_UNSIGNED_INT 4
#if defined(_32_)
#define SIZEOF_LONG 4
#define SIZEOF_UNSIGNED_LONG 4
#elif defined(_64_)
#if defined(_win_)
#define SIZEOF_LONG 4
#define SIZEOF_UNSIGNED_LONG 4
#else
#define SIZEOF_LONG 8
#define SIZEOF_UNSIGNED_LONG 8
#endif // _win_
#endif // _32_
#if !defined(SIZEOF_LONG)
#error todo
#endif
#define SIZEOF_LONG_LONG 8
#define SIZEOF_UNSIGNED_LONG_LONG 8
#undef SIZEOF_SIZE_T // in case we include <Python.h> which defines it, too
#define SIZEOF_SIZE_T SIZEOF_PTR
#if defined(__INTEL_COMPILER)
#pragma warning(disable 1292)
#pragma warning(disable 1469)
#pragma warning(disable 193)
#pragma warning(disable 271)
#pragma warning(disable 383)
#pragma warning(disable 424)
#pragma warning(disable 444)
#pragma warning(disable 584)
#pragma warning(disable 593)
#pragma warning(disable 981)
#pragma warning(disable 1418)
#pragma warning(disable 304)
#pragma warning(disable 810)
#pragma warning(disable 1029)
#pragma warning(disable 1419)
#pragma warning(disable 177)
#pragma warning(disable 522)
#pragma warning(disable 858)
#pragma warning(disable 111)
#pragma warning(disable 1599)
#pragma warning(disable 411)
#pragma warning(disable 304)
#pragma warning(disable 858)
#pragma warning(disable 444)
#pragma warning(disable 913)
#pragma warning(disable 310)
#pragma warning(disable 167)
#pragma warning(disable 180)
#pragma warning(disable 1572)
#endif
#if defined(_MSC_VER)
#undef _WINSOCKAPI_
#define _WINSOCKAPI_
#undef NOMINMAX
#define NOMINMAX
#endif
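
The LP64 / LLP64 table above can be checked against the compiler actually in use. The sketch below is an assumption-laden illustration: `DataModelName` is a hypothetical helper, and the `ILP32` label for 32-bit targets is standard terminology that the header itself does not mention.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>

// Hypothetical helper: names the data model from the widths of long and pointers.
inline const char* DataModelName() {
    constexpr std::size_t longBits = sizeof(long) * CHAR_BIT;
    constexpr std::size_t ptrBits = sizeof(void*) * CHAR_BIT;
    if (ptrBits == 64 && longBits == 64) {
        return "LP64"; // 64-bit Unix-like systems
    }
    if (ptrBits == 64 && longBits == 32) {
        return "LLP64"; // 64-bit Windows
    }
    if (ptrBits == 32 && longBits == 32) {
        return "ILP32"; // common 32-bit targets
    }
    return "other";
}

// The header hard-codes these sizes; the assertions make the same assumptions explicit.
static_assert(sizeof(long long) * CHAR_BIT == 64, "SIZEOF_LONG_LONG is assumed to be 8");
static_assert(sizeof(std::size_t) == sizeof(void*), "SIZEOF_SIZE_T is assumed to equal SIZEOF_PTR");

inline void RunDataModelExample() {
    assert(DataModelName()[0] != '\0');
}
```

Compile-time assertions of this kind fail the build on an unexpected target instead of silently using a wrong `SIZEOF_*` constant.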

@ -0,0 +1,117 @@
#pragma once
// DO_NOT_STYLE
#include "platform.h"
#include <inttypes.h>
typedef int8_t i8;
typedef int16_t i16;
typedef uint8_t ui8;
typedef uint16_t ui16;
typedef int yssize_t;
#define PRIYSZT "d"
#if defined(_darwin_) && defined(_32_)
typedef unsigned long ui32;
typedef long i32;
#else
typedef uint32_t ui32;
typedef int32_t i32;
#endif
#if defined(_darwin_) && defined(_64_)
typedef unsigned long ui64;
typedef long i64;
#else
typedef uint64_t ui64;
typedef int64_t i64;
#endif
#define LL(number) INT64_C(number)
#define ULL(number) UINT64_C(number)
// Macro for size_t and ptrdiff_t types
#if defined(_32_)
# if defined(_darwin_)
# define PRISZT "lu"
# undef PRIi32
# define PRIi32 "li"
# undef SCNi32
# define SCNi32 "li"
# undef PRId32
# define PRId32 "li"
# undef SCNd32
# define SCNd32 "li"
# undef PRIu32
# define PRIu32 "lu"
# undef SCNu32
# define SCNu32 "lu"
# undef PRIx32
# define PRIx32 "lx"
# undef SCNx32
# define SCNx32 "lx"
# elif !defined(_cygwin_)
# define PRISZT PRIu32
# else
# define PRISZT "u"
# endif
# define SCNSZT SCNu32
# define PRIPDT PRIi32
# define SCNPDT SCNi32
# define PRITMT PRIi32
# define SCNTMT SCNi32
#elif defined(_64_)
# if defined(_darwin_)
# define PRISZT "lu"
# undef PRIu64
# define PRIu64 PRISZT
# undef PRIx64
# define PRIx64 "lx"
# undef PRIX64
# define PRIX64 "lX"
# undef PRId64
# define PRId64 "ld"
# undef PRIi64
# define PRIi64 "li"
# undef SCNi64
# define SCNi64 "li"
# undef SCNu64
# define SCNu64 "lu"
# undef SCNx64
# define SCNx64 "lx"
# else
# define PRISZT PRIu64
# endif
# define SCNSZT SCNu64
# define PRIPDT PRIi64
# define SCNPDT SCNi64
# define PRITMT PRIi64
# define SCNTMT SCNi64
#else
# error "Unsupported platform"
#endif
// SUPERLONG
#if !defined(DONT_USE_SUPERLONG) && !defined(SUPERLONG_MAX)
#define SUPERLONG_MAX ~LL(0)
typedef i64 SUPERLONG;
#endif
// UNICODE
// UCS-2, native byteorder
typedef ui16 wchar16;
// internal symbol type: UTF-16LE
typedef wchar16 TChar;
typedef ui32 wchar32;
#if defined(_MSC_VER)
#include <basetsd.h>
typedef SSIZE_T ssize_t;
#define HAVE_SSIZE_T 1
#include <wchar.h>
#endif
#include <sys/types.h>

@ -208,7 +208,8 @@ target_link_libraries(hdfs3 ${LIBXML2_LIBRARY})
# inherit from parent cmake
target_include_directories(hdfs3 PRIVATE ${Boost_INCLUDE_DIRS})
target_include_directories(hdfs3 PRIVATE ${Protobuf_INCLUDE_DIR})
target_include_directories(hdfs3 PRIVATE ${OPENSSL_INCLUDE_DIR})
target_link_libraries(hdfs3 ${Protobuf_LIBRARY})
target_link_libraries(hdfs3 ${OPENSSL_LIBRARIES})
if(OPENSSL_INCLUDE_DIR AND OPENSSL_LIBRARIES)
target_include_directories(hdfs3 PRIVATE ${OPENSSL_INCLUDE_DIR})
target_link_libraries(hdfs3 ${OPENSSL_LIBRARIES})
endif()

@ -18,6 +18,7 @@
#define METROHASH_PLATFORM_H
#include <stdint.h>
#include <cstring>
// rotate right idiom recognized by most compilers
inline static uint64_t rotate_right(uint64_t v, unsigned k)
@ -25,20 +26,28 @@ inline static uint64_t rotate_right(uint64_t v, unsigned k)
return (v >> k) | (v << (64 - k));
}
// unaligned reads, fast and safe on Nehalem and later microarchitectures
inline static uint64_t read_u64(const void * const ptr)
{
return static_cast<uint64_t>(*reinterpret_cast<const uint64_t*>(ptr));
uint64_t result;
// Assignment like `result = *reinterpret_cast<const uint64_t *>(ptr)` here would mean undefined behaviour (unaligned read),
// so we use memcpy() which is the most portable. clang & gcc usually translate `memcpy()` into a single `load` instruction
// when hardware supports it, so using memcpy() is efficient too.
memcpy(&result, ptr, sizeof(result));
return result;
}
inline static uint64_t read_u32(const void * const ptr)
{
return static_cast<uint64_t>(*reinterpret_cast<const uint32_t*>(ptr));
uint32_t result;
memcpy(&result, ptr, sizeof(result));
return result;
}
inline static uint64_t read_u16(const void * const ptr)
{
return static_cast<uint64_t>(*reinterpret_cast<const uint16_t*>(ptr));
uint16_t result;
memcpy(&result, ptr, sizeof(result));
return result;
}
inline static uint64_t read_u8 (const void * const ptr)
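
The `memcpy`-based read introduced in this hunk can be exercised at a deliberately misaligned address. This is a minimal sketch; `ReadU32Unaligned` and `RoundTripsAtOddOffset` are hypothetical names, not functions from metrohash.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Reads a 32-bit value from any address without assuming alignment.
// Dereferencing a reinterpret_cast'ed pointer here would be undefined
// behaviour when ptr is not suitably aligned for uint32_t.
inline std::uint32_t ReadU32Unaligned(const void* ptr) {
    std::uint32_t result;
    std::memcpy(&result, ptr, sizeof(result));
    return result;
}

// Stores a value at an odd offset inside a byte buffer and reads it back.
inline bool RoundTripsAtOddOffset() {
    unsigned char buffer[8] = {};
    const std::uint32_t expected = 0xDEADBEEFu;
    std::memcpy(buffer + 1, &expected, sizeof(expected)); // misaligned on purpose
    return ReadU32Unaligned(buffer + 1) == expected;
}

inline void RunUnalignedReadExample() {
    assert(RoundTripsAtOddOffset());
}
```

Because both the store and the load go through `memcpy`, the round trip is independent of host byte order.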

2
contrib/librdkafka vendored

@ -1 +1 @@
Subproject commit 73295a702cd1c85c11749ade500d713db7099cca
Subproject commit 8695b9d63ac0fe1b891b511d5b36302ffc84d4e2

@ -58,4 +58,7 @@ add_library(rdkafka ${LINK_MODE} ${SRCS})
target_include_directories(rdkafka SYSTEM PUBLIC include)
target_include_directories(rdkafka SYSTEM PUBLIC ${RDKAFKA_SOURCE_DIR}) # Because weird logic with "include_next" is used.
target_include_directories(rdkafka SYSTEM PRIVATE ${ZSTD_INCLUDE_DIR}/common) # Because wrong path to "zstd_errors.h" is used.
target_link_libraries(rdkafka PUBLIC ${ZLIB_LIBRARIES} ${ZSTD_LIBRARY} ${LZ4_LIBRARY} ${OPENSSL_SSL_LIBRARY} ${OPENSSL_CRYPTO_LIBRARY})
target_link_libraries(rdkafka PUBLIC ${ZLIB_LIBRARIES} ${ZSTD_LIBRARY} ${LZ4_LIBRARY})
if(OPENSSL_SSL_LIBRARY AND OPENSSL_CRYPTO_LIBRARY)
target_link_libraries(rdkafka PUBLIC ${OPENSSL_SSL_LIBRARY} ${OPENSSL_CRYPTO_LIBRARY})
endif()

2
contrib/lz4 vendored

@ -1 +1 @@
Subproject commit c10863b98e1503af90616ae99725ecd120265dfb
Subproject commit 780aac520b69d6369f4e3995624c37e56d75498d

@ -9,8 +9,7 @@ add_library (lz4
${LIBRARY_DIR}/xxhash.h
${LIBRARY_DIR}/lz4.h
${LIBRARY_DIR}/lz4hc.h
${LIBRARY_DIR}/lz4opt.h)
${LIBRARY_DIR}/lz4hc.h)
target_compile_definitions(lz4 PUBLIC LZ4_DISABLE_DEPRECATE_WARNINGS=1)

@ -33,7 +33,6 @@ ${MARIADB_CLIENT_SOURCE_DIR}/libmariadb/ma_time.c
${MARIADB_CLIENT_SOURCE_DIR}/libmariadb/ma_tls.c
#${MARIADB_CLIENT_SOURCE_DIR}/libmariadb/secure/gnutls.c
#${MARIADB_CLIENT_SOURCE_DIR}/libmariadb/secure/ma_schannel.c
${MARIADB_CLIENT_SOURCE_DIR}/libmariadb/secure/openssl.c
#${MARIADB_CLIENT_SOURCE_DIR}/libmariadb/secure/schannel.c
#${MARIADB_CLIENT_SOURCE_DIR}/plugins/auth/auth_gssapi_client.c
#${MARIADB_CLIENT_SOURCE_DIR}/plugins/auth/dialog.c
@ -55,12 +54,19 @@ ${MARIADB_CLIENT_SOURCE_DIR}/plugins/pvio/pvio_socket.c
${CMAKE_CURRENT_SOURCE_DIR}/linux_x86_64/libmariadb/ma_client_plugin.c
)
if(OPENSSL_LIBRARIES)
list(APPEND SRCS ${MARIADB_CLIENT_SOURCE_DIR}/libmariadb/secure/openssl.c)
endif()
add_library(mysqlclient STATIC ${SRCS})
target_link_libraries(mysqlclient ${OPENSSL_LIBRARIES})
if(OPENSSL_LIBRARIES)
target_link_libraries(mysqlclient ${OPENSSL_LIBRARIES})
target_compile_definitions(mysqlclient PRIVATE -D HAVE_OPENSSL -D HAVE_TLS)
endif()
target_include_directories(mysqlclient PUBLIC ${CMAKE_CURRENT_SOURCE_DIR}/linux_x86_64/include)
target_include_directories(mysqlclient PUBLIC ${CMAKE_CURRENT_SOURCE_DIR}/common/include)
target_include_directories(mysqlclient PUBLIC ${MARIADB_CLIENT_SOURCE_DIR}/include)
target_compile_definitions(mysqlclient PRIVATE -D THREAD -D HAVE_OPENSSL -D HAVE_TLS)
target_compile_definitions(mysqlclient PRIVATE -D THREAD)

2
contrib/poco vendored

@ -1 +1 @@
Subproject commit fe5505e56c27b6ecb0dcbc40c49dc2caf4e9637f
Subproject commit 29439cf7fa32c1a2d62d925bb6d6a3f14668a4a2

@ -20,7 +20,7 @@ set (CONFIG_VERSION ${CMAKE_CURRENT_BINARY_DIR}/src/Common/config_version.h)
set (CONFIG_COMMON ${CMAKE_CURRENT_BINARY_DIR}/src/Common/config.h)
include (cmake/version.cmake)
message (STATUS "Will build ${VERSION_FULL} revision ${VERSION_REVISION}")
message (STATUS "Will build ${VERSION_FULL} revision ${VERSION_REVISION} ${VERSION_OFFICIAL}")
configure_file (src/Common/config.h.in ${CONFIG_COMMON})
configure_file (src/Common/config_version.h.in ${CONFIG_VERSION})
@ -57,7 +57,7 @@ if (CMAKE_CXX_COMPILER_ID STREQUAL "Clang")
endif ()
if (NOT CMAKE_CXX_COMPILER_VERSION VERSION_LESS 8)
set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wextra-semi-stmt -Wshadow-field -Wstring-plus-int")
set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wextra-semi-stmt -Wshadow-field -Wstring-plus-int -Wempty-init-stmt")
endif ()
if (NOT CMAKE_CXX_COMPILER_VERSION VERSION_LESS 9)
@ -155,7 +155,6 @@ if (USE_EMBEDDED_COMPILER)
target_include_directories (dbms SYSTEM BEFORE PUBLIC ${LLVM_INCLUDE_DIRS})
endif ()
if (CMAKE_BUILD_TYPE_UC STREQUAL "RELEASE" OR CMAKE_BUILD_TYPE_UC STREQUAL "RELWITHDEBINFO" OR CMAKE_BUILD_TYPE_UC STREQUAL "MINSIZEREL")
# Won't generate debug info for files with heavy template instantiation to achieve faster linking and lower size.
set_source_files_properties(
@ -186,8 +185,6 @@ target_link_libraries (clickhouse_common_io
${LINK_LIBRARIES_ONLY_ON_X86_64}
PUBLIC
${DOUBLE_CONVERSION_LIBRARIES}
PRIVATE
pocoext
PUBLIC
${Poco_Net_LIBRARY}
${Poco_Util_LIBRARY}
@ -197,8 +194,7 @@ target_link_libraries (clickhouse_common_io
${CITYHASH_LIBRARIES}
PRIVATE
${ZLIB_LIBRARIES}
${EXECINFO_LIBRARY}
${ELF_LIBRARY}
${EXECINFO_LIBRARIES}
PUBLIC
${Boost_SYSTEM_LIBRARY}
PRIVATE
@ -214,6 +210,10 @@ target_link_libraries (clickhouse_common_io
target_include_directories(clickhouse_common_io SYSTEM BEFORE PUBLIC ${RE2_INCLUDE_DIR})
if (USE_LFALLOC)
target_include_directories (clickhouse_common_io SYSTEM BEFORE PUBLIC ${LFALLOC_INCLUDE_DIR})
endif ()
if(CPUID_LIBRARY)
target_link_libraries(clickhouse_common_io PRIVATE ${CPUID_LIBRARY})
endif()
@ -223,8 +223,9 @@ if(CPUINFO_LIBRARY)
endif()
target_link_libraries (dbms
PRIVATE
PUBLIC
clickhouse_compression
PRIVATE
clickhouse_parsers
clickhouse_common_config
PUBLIC
@ -232,7 +233,6 @@ target_link_libraries (dbms
PRIVATE
clickhouse_dictionaries_embedded
PUBLIC
pocoext
${MYSQLXX_LIBRARY}
PRIVATE
${BTRIE_LIBRARIES}
@ -309,7 +309,10 @@ if (USE_PARQUET)
endif ()
endif ()
target_link_libraries(dbms PRIVATE ${OPENSSL_CRYPTO_LIBRARY} Threads::Threads)
if(OPENSSL_CRYPTO_LIBRARY)
target_link_libraries(dbms PRIVATE ${OPENSSL_CRYPTO_LIBRARY})
endif ()
target_link_libraries(dbms PRIVATE Threads::Threads)
target_include_directories (dbms SYSTEM BEFORE PRIVATE ${DIVIDE_INCLUDE_DIR})
target_include_directories (dbms SYSTEM BEFORE PRIVATE ${SPARCEHASH_INCLUDE_DIR})

@ -1,11 +1,11 @@
# These strings are autochanged by release_lib.sh:
set(VERSION_REVISION 54417)
set(VERSION_REVISION 54419)
set(VERSION_MAJOR 19)
set(VERSION_MINOR 5)
set(VERSION_MINOR 7)
set(VERSION_PATCH 1)
set(VERSION_GITHASH 628ed349c335b79a441a1bd6e4bc791d61dfe62c)
set(VERSION_DESCRIBE v19.5.1.1-testing)
set(VERSION_STRING 19.5.1.1)
set(VERSION_GITHASH b0b369b30f04a5026d1da5c7d3fd5998d6de1fe4)
set(VERSION_DESCRIBE v19.7.1.1-testing)
set(VERSION_STRING 19.7.1.1)
# end of autochange
set(VERSION_EXTRA "" CACHE STRING "")
@ -24,3 +24,7 @@ set (VERSION_FULL "${VERSION_NAME} ${VERSION_STRING}")
set (VERSION_SO "${VERSION_STRING}")
math (EXPR VERSION_INTEGER "${VERSION_PATCH} + ${VERSION_MINOR}*1000 + ${VERSION_MAJOR}*1000000")
if(YANDEX_OFFICIAL_BUILD)
set(VERSION_OFFICIAL " (official build)")
endif()
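
The `VERSION_INTEGER` expression above packs the three version components into one number. As a worked example, the same arithmetic in C++ (`VersionInteger` is a hypothetical helper written for this illustration, not a function in the repository):

```cpp
#include <cassert>

// Same formula as the CMake `math (EXPR VERSION_INTEGER ...)` line:
// patch + minor * 1000 + major * 1000000.
constexpr unsigned long VersionInteger(unsigned long major, unsigned long minor, unsigned long patch) {
    return patch + minor * 1000UL + major * 1000000UL;
}

// 19.7.1 -> 1 + 7 * 1000 + 19 * 1000000
static_assert(VersionInteger(19, 7, 1) == 19007001UL, "new version from this diff");
// 19.5.1 -> 1 + 5 * 1000 + 19 * 1000000
static_assert(VersionInteger(19, 5, 1) == 19005001UL, "previous version from this diff");

inline void RunVersionExample() {
    // Integers built this way order the same way as the versions themselves,
    // as long as minor and patch stay below 1000.
    assert(VersionInteger(19, 5, 1) < VersionInteger(19, 7, 1));
}
```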

@ -93,6 +93,7 @@ if (CLICKHOUSE_ONE_SHARED)
target_link_libraries(clickhouse-lib ${CLICKHOUSE_SERVER_LINK} ${CLICKHOUSE_CLIENT_LINK} ${CLICKHOUSE_LOCAL_LINK} ${CLICKHOUSE_BENCHMARK_LINK} ${CLICKHOUSE_PERFORMANCE_TEST_LINK} ${CLICKHOUSE_COPIER_LINK} ${CLICKHOUSE_EXTRACT_FROM_CONFIG_LINK} ${CLICKHOUSE_COMPRESSOR_LINK} ${CLICKHOUSE_FORMAT_LINK} ${CLICKHOUSE_OBFUSCATOR_LINK} ${CLICKHOUSE_COMPILER_LINK} ${CLICKHOUSE_ODBC_BRIDGE_LINK})
target_include_directories(clickhouse-lib ${CLICKHOUSE_SERVER_INCLUDE} ${CLICKHOUSE_CLIENT_INCLUDE} ${CLICKHOUSE_LOCAL_INCLUDE} ${CLICKHOUSE_BENCHMARK_INCLUDE} ${CLICKHOUSE_PERFORMANCE_TEST_INCLUDE} ${CLICKHOUSE_COPIER_INCLUDE} ${CLICKHOUSE_EXTRACT_FROM_CONFIG_INCLUDE} ${CLICKHOUSE_COMPRESSOR_INCLUDE} ${CLICKHOUSE_FORMAT_INCLUDE} ${CLICKHOUSE_OBFUSCATOR_INCLUDE} ${CLICKHOUSE_COMPILER_INCLUDE} ${CLICKHOUSE_ODBC_BRIDGE_INCLUDE})
set_target_properties(clickhouse-lib PROPERTIES SOVERSION ${VERSION_MAJOR}.${VERSION_MINOR} VERSION ${VERSION_SO} OUTPUT_NAME clickhouse DEBUG_POSTFIX "")
install (TARGETS clickhouse-lib LIBRARY DESTINATION ${CMAKE_INSTALL_LIBDIR} COMPONENT clickhouse)
endif()
if (CLICKHOUSE_SPLIT_BINARY)
@ -154,7 +155,7 @@ else ()
clickhouse_target_link_split_lib(clickhouse obfuscator)
endif ()
if (USE_EMBEDDED_COMPILER)
clickhouse_target_link_split_lib(clickhouse compiler)
target_link_libraries(clickhouse PRIVATE clickhouse-compiler-lib)
endif ()
set (CLICKHOUSE_BUNDLE)

@ -439,7 +439,7 @@ int mainEntryClickHouseBenchmark(int argc, char ** argv)
("help", "produce help message")
("concurrency,c", value<unsigned>()->default_value(1), "number of parallel queries")
("delay,d", value<double>()->default_value(1), "delay between intermediate reports in seconds (set 0 to disable reports)")
("stage", value<std::string>()->default_value("complete"), "request query processing up to specified stage")
("stage", value<std::string>()->default_value("complete"), "request query processing up to specified stage: complete,fetch_columns,with_mergeable_state")
("iterations,i", value<size_t>()->default_value(0), "amount of queries to be executed")
("timelimit,t", value<double>()->default_value(0.), "stop launch of queries after specified time limit")
("randomize,r", value<bool>()->default_value(false), "randomize order of execution")

@ -46,7 +46,7 @@ LLVMSupport
#PollyISL
#PollyPPCG
PUBLIC ${ZLIB_LIBRARIES} ${EXECINFO_LIBRARY} Threads::Threads
PUBLIC ${ZLIB_LIBRARIES} ${EXECINFO_LIBRARIES} Threads::Threads
${MALLOC_LIBRARIES}
${GLIBC_COMPATIBILITY_LIBRARIES}
${MEMCPY_LIBRARIES}

@ -46,7 +46,7 @@ ${REQUIRED_LLVM_LIBRARIES}
#PollyISL
#PollyPPCG
PUBLIC ${ZLIB_LIBRARIES} ${EXECINFO_LIBRARY} Threads::Threads
PUBLIC ${ZLIB_LIBRARIES} ${EXECINFO_LIBRARIES} Threads::Threads
${MALLOC_LIBRARIES}
${GLIBC_COMPATIBILITY_LIBRARIES}
${MEMCPY_LIBRARIES}

@ -42,7 +42,7 @@ lldCore
${REQUIRED_LLVM_LIBRARIES}
PUBLIC ${ZLIB_LIBRARIES} ${EXECINFO_LIBRARY} Threads::Threads
PUBLIC ${ZLIB_LIBRARIES} ${EXECINFO_LIBRARIES} Threads::Threads
${MALLOC_LIBRARIES}
${GLIBC_COMPATIBILITY_LIBRARIES}
${MEMCPY_LIBRARIES}

@ -42,7 +42,7 @@ lldCore
${REQUIRED_LLVM_LIBRARIES}
PUBLIC ${ZLIB_LIBRARIES} ${EXECINFO_LIBRARY} Threads::Threads
PUBLIC ${ZLIB_LIBRARIES} ${EXECINFO_LIBRARIES} Threads::Threads
${MALLOC_LIBRARIES}
${GLIBC_COMPATIBILITY_LIBRARIES}
${MEMCPY_LIBRARIES}

@ -1,5 +1,11 @@
#!/bin/sh
# Helper for split build mode.
# Allows running commands like
# clickhouse client
# clickhouse server
# ...
set -e
CMD=$1
shift

@ -42,6 +42,7 @@
#include <IO/ReadBufferFromString.h>
#include <IO/ReadHelpers.h>
#include <IO/WriteHelpers.h>
#include <IO/Operators.h>
#include <IO/UseSSL.h>
#include <DataStreams/AsynchronousBlockInputStream.h>
#include <DataStreams/AddingDefaultsBlockInputStream.h>
@ -101,6 +102,7 @@ namespace ErrorCodes
extern const int LOGICAL_ERROR;
extern const int CANNOT_SET_SIGNAL_HANDLER;
extern const int CANNOT_READLINE;
extern const int SYSTEM_ERROR;
}
@ -295,7 +297,6 @@ private:
/// The value of the option is used as the text of query (or of multiple queries).
/// If stdin is not a terminal, INSERT data for the first query is read from it.
/// - stdin is not a terminal. In this case queries are read from it.
stdin_is_not_tty = !isatty(STDIN_FILENO);
if (stdin_is_not_tty || config().has("query"))
is_interactive = false;
@ -610,9 +611,6 @@ private:
try
{
/// Determine the terminal size.
ioctl(0, TIOCGWINSZ, &terminal_size);
if (!process(input))
break;
}
@ -704,7 +702,7 @@ private:
return true;
}
ASTInsertQuery * insert = typeid_cast<ASTInsertQuery *>(ast.get());
auto * insert = ast->as<ASTInsertQuery>();
if (insert && insert->data)
{
@ -799,22 +797,38 @@ private:
written_progress_chars = 0;
written_first_block = false;
const ASTSetQuery * set_query = typeid_cast<const ASTSetQuery *>(&*parsed_query);
const ASTUseQuery * use_query = typeid_cast<const ASTUseQuery *>(&*parsed_query);
/// INSERT query for which data transfer is needed (not an INSERT SELECT) is processed separately.
const ASTInsertQuery * insert = typeid_cast<const ASTInsertQuery *>(&*parsed_query);
{
/// Temporarily apply query settings to context.
std::optional<Settings> old_settings;
SCOPE_EXIT({ if (old_settings) context.setSettings(*old_settings); });
auto apply_query_settings = [&](const IAST & settings_ast)
{
if (!old_settings)
old_settings.emplace(context.getSettingsRef());
for (const auto & change : settings_ast.as<ASTSetQuery>()->changes)
context.setSetting(change.name, change.value);
};
const auto * insert = parsed_query->as<ASTInsertQuery>();
if (insert && insert->settings_ast)
apply_query_settings(*insert->settings_ast);
/// FIXME: try to prettify this cast using `as<>()`
const auto * with_output = dynamic_cast<const ASTQueryWithOutput *>(parsed_query.get());
if (with_output && with_output->settings_ast)
apply_query_settings(*with_output->settings_ast);
connection->forceConnected();
connection->forceConnected();
if (insert && !insert->select)
processInsertQuery();
else
processOrdinaryQuery();
/// INSERT query for which data transfer is needed (not an INSERT SELECT) is processed separately.
if (insert && !insert->select)
processInsertQuery();
else
processOrdinaryQuery();
}
/// Do not change context (current DB, settings) in case of an exception.
if (!got_exception)
{
if (set_query)
if (const auto * set_query = parsed_query->as<ASTSetQuery>())
{
/// Save all changes in settings to avoid losing them if the connection is lost.
for (const auto & change : set_query->changes)
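
The hunk above applies per-query settings temporarily and restores the previous ones on scope exit. A reduced sketch of that pattern follows, assuming C++17. `DemoSettings`, `DemoContext`, `ScopeExit` and `RunWithTemporaryFormat` are hypothetical stand-ins for illustration only; they are not ClickHouse's `Settings`, `Context` or `SCOPE_EXIT`.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <utility>

// Hypothetical stand-ins for the client's settings and context objects.
struct DemoSettings {
    std::string format = "TabSeparated";
};

struct DemoContext {
    DemoSettings settings;
};

// Minimal scope guard: runs the stored callable when the scope is left,
// whether by a normal return or by an exception.
template <class F>
class ScopeExit {
public:
    explicit ScopeExit(F f)
        : Fn(std::move(f))
    {
    }
    ~ScopeExit() {
        Fn();
    }
    ScopeExit(const ScopeExit&) = delete;
    ScopeExit& operator=(const ScopeExit&) = delete;

private:
    F Fn;
};

// Applies a format override for the duration of one "query" and returns the
// format that query ran with. The old settings are saved lazily, only once
// an override is actually applied, and restored by the guard.
inline std::string RunWithTemporaryFormat(DemoContext& context, const std::string& format) {
    std::optional<DemoSettings> oldSettings;
    auto restore = [&] {
        if (oldSettings) {
            context.settings = *oldSettings;
        }
    };
    ScopeExit<decltype(restore)> guard(restore);

    oldSettings.emplace(context.settings);
    context.settings.format = format;
    return context.settings.format;
}

inline void RunTemporarySettingsExample() {
    DemoContext context;
    assert(RunWithTemporaryFormat(context, "JSON") == "JSON");
    assert(context.settings.format == "TabSeparated"); // restored on scope exit
}
```

Holding the saved copy in a `std::optional` avoids copying the settings for queries that carry no overrides, which appears to be the motivation for the same construct in the diff.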
@ -826,7 +840,7 @@ private:
}
}
if (use_query)
if (const auto * use_query = parsed_query->as<ASTUseQuery>())
{
const String & new_database = use_query->database;
/// If the client initiates the reconnection, it takes the settings from the config.
@ -858,7 +872,7 @@ private:
/// Convert external tables to ExternalTableData and send them using the connection.
void sendExternalTables()
{
auto * select = typeid_cast<const ASTSelectWithUnionQuery *>(&*parsed_query);
const auto * select = parsed_query->as<ASTSelectWithUnionQuery>();
if (!select && !external_tables.empty())
throw Exception("External tables could be sent only with select query", ErrorCodes::BAD_ARGUMENTS);
@ -883,7 +897,7 @@ private:
void processInsertQuery()
{
/// Send part of query without data, because data will be sent separately.
const ASTInsertQuery & parsed_insert_query = typeid_cast<const ASTInsertQuery &>(*parsed_query);
const auto & parsed_insert_query = parsed_query->as<ASTInsertQuery &>();
String query_without_data = parsed_insert_query.data
? query.substr(0, parsed_insert_query.data - query.data())
: query;
@ -940,7 +954,7 @@ private:
void sendData(Block & sample, const ColumnsDescription & columns_description)
{
/// If INSERT data must be sent.
const ASTInsertQuery * parsed_insert_query = typeid_cast<const ASTInsertQuery *>(&*parsed_query);
const auto * parsed_insert_query = parsed_query->as<ASTInsertQuery>();
if (!parsed_insert_query)
return;
@ -965,18 +979,16 @@ private:
String current_format = insert_format;
/// Data format can be specified in the INSERT query.
if (ASTInsertQuery * insert = typeid_cast<ASTInsertQuery *>(&*parsed_query))
if (const auto * insert = parsed_query->as<ASTInsertQuery>())
{
if (!insert->format.empty())
current_format = insert->format;
if (insert->settings_ast)
InterpreterSetQuery(insert->settings_ast, context).executeForCurrentContext();
}
BlockInputStreamPtr block_input = context.getInputFormat(
current_format, buf, sample, insert_format_max_block_size);
const auto & column_defaults = columns_description.defaults;
const auto & column_defaults = columns_description.getDefaults();
if (!column_defaults.empty())
block_input = std::make_shared<AddingDefaultsBlockInputStream>(block_input, column_defaults, context);
@ -1231,12 +1243,14 @@ private:
String current_format = format;
/// The query can specify output format or output file.
if (ASTQueryWithOutput * query_with_output = dynamic_cast<ASTQueryWithOutput *>(&*parsed_query))
/// FIXME: try to prettify this cast using `as<>()`
if (const auto * query_with_output = dynamic_cast<const ASTQueryWithOutput *>(parsed_query.get()))
{
if (query_with_output->out_file != nullptr)
if (query_with_output->out_file)
{
const auto & out_file_node = typeid_cast<const ASTLiteral &>(*query_with_output->out_file);
const auto & out_file_node = query_with_output->out_file->as<ASTLiteral &>();
const auto & out_file = out_file_node.value.safeGet<std::string>();
out_file_buf.emplace(out_file, DBMS_DEFAULT_BUFFER_SIZE, O_WRONLY | O_EXCL | O_CREAT);
out_buf = &*out_file_buf;
@ -1248,13 +1262,9 @@ private:
{
if (has_vertical_output_suffix)
throw Exception("Output format already specified", ErrorCodes::CLIENT_OUTPUT_FORMAT_SPECIFIED);
const auto & id = typeid_cast<const ASTIdentifier &>(*query_with_output->format);
const auto & id = query_with_output->format->as<ASTIdentifier &>();
current_format = id.name;
}
if (query_with_output->settings_ast)
{
InterpreterSetQuery(query_with_output->settings_ast, context).executeForCurrentContext();
}
}
if (has_vertical_output_suffix)
@ -1318,6 +1328,9 @@ private:
/// Received data block is immediately displayed to the user.
block_out_stream->flush();
/// Restore progress bar after data block.
writeProgress();
}
@ -1357,8 +1370,8 @@ private:
void clearProgress()
{
std::cerr << RESTORE_CURSOR_POSITION CLEAR_TO_END_OF_LINE;
written_progress_chars = 0;
std::cerr << RESTORE_CURSOR_POSITION CLEAR_TO_END_OF_LINE;
}
@ -1367,6 +1380,9 @@ private:
if (!need_render_progress)
return;
/// Output all progress bar commands to stderr at once to avoid flicker.
WriteBufferFromFileDescriptor message(STDERR_FILENO, 1024);
static size_t increment = 0;
static const char * indicators[8] =
{
@ -1381,13 +1397,15 @@ private:
};
if (written_progress_chars)
clearProgress();
message << RESTORE_CURSOR_POSITION CLEAR_TO_END_OF_LINE;
else
std::cerr << SAVE_CURSOR_POSITION;
message << SAVE_CURSOR_POSITION;
message << DISABLE_LINE_WRAPPING;
size_t prefix_size = message.count();
std::stringstream message;
message << indicators[increment % 8]
<< std::fixed << std::setprecision(3)
<< " Progress: ";
message
@ -1402,8 +1420,7 @@ private:
else
message << ". ";
written_progress_chars = message.str().size() - (increment % 8 == 7 ? 10 : 13);
std::cerr << DISABLE_LINE_WRAPPING << message.rdbuf();
written_progress_chars = message.count() - prefix_size - (increment % 8 == 7 ? 10 : 13); /// Don't count invisible output (escape sequences).
/// If the approximate number of rows to process is known, we can display a progress bar and percentage.
if (progress.total_rows > 0)
@ -1425,19 +1442,21 @@ private:
if (width_of_progress_bar > 0)
{
std::string bar = UnicodeBar::render(UnicodeBar::getWidth(progress.rows, 0, total_rows_corrected, width_of_progress_bar));
std::cerr << "\033[0;32m" << bar << "\033[0m";
message << "\033[0;32m" << bar << "\033[0m";
if (width_of_progress_bar > static_cast<ssize_t>(bar.size() / UNICODE_BAR_CHAR_SIZE))
std::cerr << std::string(width_of_progress_bar - bar.size() / UNICODE_BAR_CHAR_SIZE, ' ');
message << std::string(width_of_progress_bar - bar.size() / UNICODE_BAR_CHAR_SIZE, ' ');
}
}
}
/// Underestimate percentage a bit to avoid displaying 100%.
std::cerr << ' ' << (99 * progress.rows / total_rows_corrected) << '%';
message << ' ' << (99 * progress.rows / total_rows_corrected) << '%';
}
std::cerr << ENABLE_LINE_WRAPPING;
message << ENABLE_LINE_WRAPPING;
++increment;
message.next();
}
@ -1504,7 +1523,7 @@ private:
void showClientVersion()
{
std::cout << DBMS_NAME << " client version " << VERSION_STRING << "." << std::endl;
std::cout << DBMS_NAME << " client version " << VERSION_STRING << VERSION_OFFICIAL << "." << std::endl;
}
public:
@ -1569,7 +1588,7 @@ public:
}
}
ioctl(0, TIOCGWINSZ, &terminal_size);
stdin_is_not_tty = !isatty(STDIN_FILENO);
namespace po = boost::program_options;
@ -1577,7 +1596,11 @@ public:
unsigned min_description_length = line_length / 2;
if (!stdin_is_not_tty)
{
line_length = std::max(3U, static_cast<unsigned>(terminal_size.ws_col));
if (ioctl(STDIN_FILENO, TIOCGWINSZ, &terminal_size))
throwFromErrno("Cannot obtain terminal window size (ioctl TIOCGWINSZ)", ErrorCodes::SYSTEM_ERROR);
line_length = std::max(
static_cast<unsigned>(strlen("--http_native_compression_disable_checksumming_on_decompress ")),
static_cast<unsigned>(terminal_size.ws_col));
min_description_length = std::min(min_description_length, line_length - 2);
}

View File

@ -39,7 +39,7 @@ private:
"DATABASES", "LIKE", "PROCESSLIST", "CASE", "WHEN", "THEN", "ELSE", "END", "DESCRIBE", "DESC", "USE", "SET", "OPTIMIZE", "FINAL", "DEDUPLICATE",
"INSERT", "VALUES", "SELECT", "DISTINCT", "SAMPLE", "ARRAY", "JOIN", "GLOBAL", "LOCAL", "ANY", "ALL", "INNER", "LEFT", "RIGHT", "FULL", "OUTER",
"CROSS", "USING", "PREWHERE", "WHERE", "GROUP", "BY", "WITH", "TOTALS", "HAVING", "ORDER", "COLLATE", "LIMIT", "UNION", "AND", "OR", "ASC", "IN",
"KILL", "QUERY", "SYNC", "ASYNC", "TEST", "BETWEEN"
"KILL", "QUERY", "SYNC", "ASYNC", "TEST", "BETWEEN", "TRUNCATE"
};
/// Words are fetched asynchronously.

View File

@ -1,6 +1,7 @@
#include <iostream>
#include <optional>
#include <boost/program_options.hpp>
#include <boost/algorithm/string/join.hpp>
#include <Common/Exception.h>
#include <IO/WriteBufferFromFileDescriptor.h>
@ -9,6 +10,8 @@
#include <Compression/CompressedReadBuffer.h>
#include <IO/WriteHelpers.h>
#include <IO/copyData.h>
#include <Parsers/parseQuery.h>
#include <Parsers/ExpressionElementParsers.h>
#include <Compression/CompressionFactory.h>
@ -64,7 +67,7 @@ int mainEntryClickHouseCompressor(int argc, char ** argv)
("hc", "use LZ4HC instead of LZ4")
("zstd", "use ZSTD instead of LZ4")
("codec", boost::program_options::value<std::vector<std::string>>()->multitoken(), "use codecs combination instead of LZ4")
("level", boost::program_options::value<std::vector<int>>()->multitoken(), "compression levels for codecs specified via --codec")
("level", boost::program_options::value<int>(), "compression level for codecs specified via flags")
("none", "use no compression instead of LZ4")
("stat", "print block statistics of compressed data")
;
@ -94,6 +97,9 @@ int mainEntryClickHouseCompressor(int argc, char ** argv)
if ((use_lz4hc || use_zstd || use_none) && !codecs.empty())
throw DB::Exception("Wrong options, codec flags like --zstd and --codec options are mutually exclusive", DB::ErrorCodes::BAD_ARGUMENTS);
if (!codecs.empty() && options.count("level"))
throw DB::Exception("Wrong options, --level is not compatible with --codec list", DB::ErrorCodes::BAD_ARGUMENTS);
std::string method_family = "LZ4";
if (use_lz4hc)
@ -103,28 +109,22 @@ int mainEntryClickHouseCompressor(int argc, char ** argv)
else if (use_none)
method_family = "NONE";
std::vector<int> levels;
std::optional<int> level = std::nullopt;
if (options.count("level"))
levels = options["level"].as<std::vector<int>>();
level = options["level"].as<int>();
DB::CompressionCodecPtr codec;
if (!codecs.empty())
{
if (levels.size() > codecs.size())
throw DB::Exception("Specified more levels than codecs", DB::ErrorCodes::BAD_ARGUMENTS);
DB::ParserCodec codec_parser;
std::vector<DB::CodecNameWithLevel> codec_names;
for (size_t i = 0; i < codecs.size(); ++i)
{
if (i < levels.size())
codec_names.emplace_back(codecs[i], levels[i]);
else
codec_names.emplace_back(codecs[i], std::nullopt);
}
codec = DB::CompressionCodecFactory::instance().get(codec_names);
std::string codecs_line = boost::algorithm::join(codecs, ",");
auto ast = DB::parseQuery(codec_parser, "(" + codecs_line + ")", 0);
codec = DB::CompressionCodecFactory::instance().get(ast, nullptr);
}
else
codec = DB::CompressionCodecFactory::instance().get(method_family, levels.empty() ? std::nullopt : std::optional<int>(levels.back()));
codec = DB::CompressionCodecFactory::instance().get(method_family, level);
DB::ReadBufferFromFileDescriptor rb(STDIN_FILENO);
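Review note: with this change the `--codec` values are joined into a single parenthesised expression and handed to the codec parser, so a level now travels inside each codec spec (for example `ZSTD(5)`) instead of through `--level`. The sketch below shows only the string-building step, using the standard library in place of `boost::algorithm::join`. `buildCodecExpression` is a hypothetical helper name, and parsing the result is not modelled here.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Joins codec specs with commas and wraps them in parentheses,
// producing the text that the diff passes to the codec parser.
std::string buildCodecExpression(const std::vector<std::string> & codecs)
{
    std::string line;
    for (std::size_t i = 0; i < codecs.size(); ++i)
    {
        if (i != 0)
            line += ",";
        line += codecs[i];
    }
    return "(" + line + ")";
}
```

For the README example `--codec 'Delta(4)' --codec 'ZSTD(10)'`, this helper yields `(Delta(4),ZSTD(10))`.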

View File

@ -17,11 +17,11 @@ $ ./clickhouse-compressor --decompress < input_file > output_file
Compress data with ZSTD at level 5:
```
$ ./clickhouse-compressor --codec ZSTD --level 5 < input_file > output_file
$ ./clickhouse-compressor --codec 'ZSTD(5)' < input_file > output_file
```
Compress data with ZSTD level 10, LZ4HC level 7 and LZ4.
Compress data with Delta of four bytes and ZSTD level 10.
```
$ ./clickhouse-compressor --codec ZSTD --level 5 --codec LZ4HC --level 7 --codec LZ4 < input_file > output_file
$ ./clickhouse-compressor --codec 'Delta(4)' --codec 'ZSTD(10)' < input_file > output_file
```

View File

@ -1,7 +1,6 @@
#include "ClusterCopier.h"
#include <chrono>
#include <Poco/Util/XMLConfiguration.h>
#include <Poco/Logger.h>
#include <Poco/ConsoleChannel.h>
@ -13,14 +12,11 @@
#include <Poco/FileChannel.h>
#include <Poco/SplitterChannel.h>
#include <Poco/Util/HelpFormatter.h>
#include <boost/algorithm/string.hpp>
#include <pcg_random.hpp>
#include <common/logger_useful.h>
#include <Common/ThreadPool.h>
#include <daemon/OwnPatternFormatter.h>
#include <Common/Exception.h>
#include <Common/ZooKeeper/ZooKeeper.h>
#include <Common/ZooKeeper/KeeperException.h>
@ -61,6 +57,7 @@
#include <DataStreams/NullBlockOutputStream.h>
#include <IO/Operators.h>
#include <IO/ReadBufferFromString.h>
#include <IO/ReadBufferFromFile.h>
#include <Functions/registerFunctions.h>
#include <TableFunctions/registerTableFunctions.h>
#include <AggregateFunctions/registerAggregateFunctions.h>
@ -483,7 +480,7 @@ String DB::TaskShard::getHostNameExample() const
static bool isExtendedDefinitionStorage(const ASTPtr & storage_ast)
{
const ASTStorage & storage = typeid_cast<const ASTStorage &>(*storage_ast);
const auto & storage = storage_ast->as<ASTStorage &>();
return storage.partition_by || storage.order_by || storage.sample_by;
}
@ -491,8 +488,8 @@ static ASTPtr extractPartitionKey(const ASTPtr & storage_ast)
{
String storage_str = queryToString(storage_ast);
const ASTStorage & storage = typeid_cast<const ASTStorage &>(*storage_ast);
const ASTFunction & engine = typeid_cast<const ASTFunction &>(*storage.engine);
const auto & storage = storage_ast->as<ASTStorage &>();
const auto & engine = storage.engine->as<ASTFunction &>();
if (!endsWith(engine.name, "MergeTree"))
{
@ -500,9 +497,6 @@ static ASTPtr extractPartitionKey(const ASTPtr & storage_ast)
ErrorCodes::BAD_ARGUMENTS);
}
ASTPtr arguments_ast = engine.arguments->clone();
ASTs & arguments = typeid_cast<ASTExpressionList &>(*arguments_ast).children;
if (isExtendedDefinitionStorage(storage_ast))
{
if (storage.partition_by)
@ -516,6 +510,12 @@ static ASTPtr extractPartitionKey(const ASTPtr & storage_ast)
bool is_replicated = startsWith(engine.name, "Replicated");
size_t min_args = is_replicated ? 3 : 1;
if (!engine.arguments)
throw Exception("Expected arguments in " + storage_str, ErrorCodes::BAD_ARGUMENTS);
ASTPtr arguments_ast = engine.arguments->clone();
ASTs & arguments = arguments_ast->children;
if (arguments.size() < min_args)
throw Exception("Expected at least " + toString(min_args) + " arguments in " + storage_str, ErrorCodes::BAD_ARGUMENTS);
@ -894,6 +894,28 @@ public:
}
}
void uploadTaskDescription(const std::string & task_path, const std::string & task_file, const bool force)
{
auto local_task_description_path = task_path + "/description";
String task_config_str;
{
ReadBufferFromFile in(task_file);
readStringUntilEOF(task_config_str, in);
}
if (task_config_str.empty())
return;
auto zookeeper = context.getZooKeeper();
zookeeper->createAncestors(local_task_description_path);
auto code = zookeeper->tryCreate(local_task_description_path, task_config_str, zkutil::CreateMode::Persistent);
if (code && force)
zookeeper->createOrUpdate(local_task_description_path, task_config_str, zkutil::CreateMode::Persistent);
LOG_DEBUG(log, "Task description " << ((code && !force) ? "not " : "") << "uploaded to " << local_task_description_path << " with result " << code << " ("<< zookeeper->error2string(code) << ")");
}
void reloadTaskDescription()
{
auto zookeeper = context.getZooKeeper();
@ -1179,12 +1201,12 @@ protected:
/// Removes MATERIALIZED and ALIAS columns from create table query
static ASTPtr removeAliasColumnsFromCreateQuery(const ASTPtr & query_ast)
{
const ASTs & column_asts = typeid_cast<ASTCreateQuery &>(*query_ast).columns_list->columns->children;
const ASTs & column_asts = query_ast->as<ASTCreateQuery &>().columns_list->columns->children;
auto new_columns = std::make_shared<ASTExpressionList>();
for (const ASTPtr & column_ast : column_asts)
{
const ASTColumnDeclaration & column = typeid_cast<const ASTColumnDeclaration &>(*column_ast);
const auto & column = column_ast->as<ASTColumnDeclaration &>();
if (!column.default_specifier.empty())
{
@ -1197,12 +1219,12 @@ protected:
}
ASTPtr new_query_ast = query_ast->clone();
ASTCreateQuery & new_query = typeid_cast<ASTCreateQuery &>(*new_query_ast);
auto & new_query = new_query_ast->as<ASTCreateQuery &>();
auto new_columns_list = std::make_shared<ASTColumns>();
new_columns_list->set(new_columns_list->columns, new_columns);
new_columns_list->set(
new_columns_list->indices, typeid_cast<ASTCreateQuery &>(*query_ast).columns_list->indices->clone());
if (auto indices = query_ast->as<ASTCreateQuery>()->columns_list->indices)
new_columns_list->set(new_columns_list->indices, indices->clone());
new_query.replace(new_query.columns_list, new_columns_list);
@ -1212,7 +1234,7 @@ protected:
/// Replaces ENGINE and table name in a create query
std::shared_ptr<ASTCreateQuery> rewriteCreateQueryStorage(const ASTPtr & create_query_ast, const DatabaseAndTableName & new_table, const ASTPtr & new_storage_ast)
{
ASTCreateQuery & create = typeid_cast<ASTCreateQuery &>(*create_query_ast);
const auto & create = create_query_ast->as<ASTCreateQuery &>();
auto res = std::make_shared<ASTCreateQuery>(create);
if (create.storage == nullptr || new_storage_ast == nullptr)
@ -1646,7 +1668,7 @@ protected:
/// Try create table (if not exists) on each shard
{
auto create_query_push_ast = rewriteCreateQueryStorage(task_shard.current_pull_table_create_query, task_table.table_push, task_table.engine_push_ast);
typeid_cast<ASTCreateQuery &>(*create_query_push_ast).if_not_exists = true;
create_query_push_ast->as<ASTCreateQuery &>().if_not_exists = true;
String query = queryToString(create_query_push_ast);
LOG_DEBUG(log, "Create destination tables. Query: " << query);
@ -1779,7 +1801,7 @@ protected:
void dropAndCreateLocalTable(const ASTPtr & create_ast)
{
auto & create = typeid_cast<ASTCreateQuery &>(*create_ast);
const auto & create = create_ast->as<ASTCreateQuery &>();
dropLocalTableIfExists({create.database, create.table});
InterpreterCreateQuery interpreter(create_ast, context);
@ -2032,7 +2054,7 @@ private:
ConfigurationPtr task_cluster_initial_config;
ConfigurationPtr task_cluster_current_config;
Coordination::Stat task_descprtion_current_stat;
Coordination::Stat task_descprtion_current_stat{};
std::unique_ptr<TaskCluster> task_cluster;
@ -2104,6 +2126,10 @@ void ClusterCopierApp::defineOptions(Poco::Util::OptionSet & options)
options.addOption(Poco::Util::Option("task-path", "", "path to task in ZooKeeper")
.argument("task-path").binding("task-path"));
options.addOption(Poco::Util::Option("task-file", "", "path to task file for uploading in ZooKeeper to task-path")
.argument("task-file").binding("task-file"));
options.addOption(Poco::Util::Option("task-upload-force", "", "Force upload of task-file even if the node already exists")
.argument("task-upload-force").binding("task-upload-force"));
options.addOption(Poco::Util::Option("safe-mode", "", "disables ALTER DROP PARTITION in case of errors")
.binding("safe-mode"));
options.addOption(Poco::Util::Option("copy-fault-probability", "", "the copying fails with specified probability (used to test partition state recovering)")
@ -2154,6 +2180,11 @@ void ClusterCopierApp::mainImpl()
auto copier = std::make_unique<ClusterCopier>(task_path, host_id, default_database, *context);
copier->setSafeMode(is_safe_mode);
copier->setCopyFaultProbability(copy_fault_probability);
auto task_file = config().getString("task-file", "");
if (!task_file.empty())
copier->uploadTaskDescription(task_path, task_file, config().getBool("task-upload-force", false));
copier->init();
copier->process();
}

View File

@ -369,7 +369,7 @@ void LocalServer::setupUsers()
static void showClientVersion()
{
std::cout << DBMS_NAME << " client version " << VERSION_STRING << "." << '\n';
std::cout << DBMS_NAME << " client version " << VERSION_STRING << VERSION_OFFICIAL << "." << '\n';
}
std::string LocalServer::getHelpHeader() const

View File

@ -1,6 +1,6 @@
#pragma once
#include <Interpreters/Settings.h>
#include <Core/Settings.h>
#include <Poco/Util/Application.h>
#include <memory>

View File

@ -2,7 +2,6 @@
#include "PingHandler.h"
#include "ColumnInfoHandler.h"
#include <Poco/URI.h>
#include <Poco/Ext/SessionPoolHelpers.h>
#include <Poco/Net/HTTPServerRequest.h>
#include <common/logger_useful.h>

View File

@ -11,11 +11,12 @@
#include <IO/WriteHelpers.h>
#include <IO/ReadHelpers.h>
#include <Interpreters/Context.h>
#include <Poco/Ext/SessionPoolHelpers.h>
#include <Poco/Net/HTTPServerRequest.h>
#include <Poco/Net/HTTPServerResponse.h>
#include <Poco/Net/HTMLForm.h>
#include <common/logger_useful.h>
#include <mutex>
#include <Poco/ThreadPool.h>
namespace DB
{
@ -31,6 +32,24 @@ namespace
}
}
using PocoSessionPoolConstructor = std::function<std::shared_ptr<Poco::Data::SessionPool>()>;
/** Is used to adjust max size of default Poco thread pool. See issue #750
* Acquire the lock, resize pool and construct new Session.
*/
std::shared_ptr<Poco::Data::SessionPool> createAndCheckResizePocoSessionPool(PocoSessionPoolConstructor pool_constr)
{
static std::mutex mutex;
Poco::ThreadPool & pool = Poco::ThreadPool::defaultPool();
/// NOTE: The lock doesn't guarantee that external users of the pool won't change its capacity
std::unique_lock lock(mutex);
if (pool.available() == 0)
pool.addCapacity(2 * std::max(pool.capacity(), 1));
return pool_constr();
}
ODBCHandler::PoolPtr ODBCHandler::getPool(const std::string & connection_str)
{

View File

@ -16,7 +16,7 @@ std::vector<XMLConfigurationPtr> ConfigPreprocessor::processConfig(
std::vector<XMLConfigurationPtr> result;
for (const auto & path : paths)
{
result.emplace_back(new XMLConfiguration(path));
result.emplace_back(XMLConfigurationPtr(new XMLConfiguration(path)));
result.back()->setString("path", Poco::Path(path).absolute().toString());
}

View File

@ -2,7 +2,7 @@
#include <string>
#include <vector>
#include <map>
#include <Interpreters/Settings.h>
#include <Core/Settings.h>
#include <Poco/Util/XMLConfiguration.h>
#include <Poco/AutoPtr.h>

View File

@ -25,7 +25,7 @@
#include <Interpreters/Context.h>
#include <IO/ConnectionTimeouts.h>
#include <IO/UseSSL.h>
#include <Interpreters/Settings.h>
#include <Core/Settings.h>
#include <Common/Exception.h>
#include <Common/InterruptListener.h>
@ -298,6 +298,8 @@ std::unordered_map<std::string, std::vector<std::size_t>> getTestQueryIndexes(co
{
std::unordered_map<std::string, std::vector<std::size_t>> result;
const auto & options = parsed_opts.options;
if (options.empty())
return result;
for (size_t i = 0; i < options.size() - 1; ++i)
{
const auto & opt = options[i];
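Review note: the added `options.empty()` check matters because `options.size()` is unsigned, so `options.size() - 1` on an empty vector wraps around to a very large value instead of becoming negative. A small sketch of the same guard follows. `countAdjacentPairs` is a hypothetical function that reuses only the loop bound from the diff.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Counts the iterations of a loop bounded by `options.size() - 1`.
// Without the early return, an empty vector would make `size() - 1` wrap around
// to the largest std::size_t value, and a real loop body would read out of bounds.
std::size_t countAdjacentPairs(const std::vector<std::string> & options)
{
    if (options.empty())
        return 0;

    std::size_t pairs = 0;
    for (std::size_t i = 0; i < options.size() - 1; ++i)
        ++pairs;
    return pairs;
}
```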

View File

@ -4,7 +4,7 @@
#include "TestStopConditions.h"
#include <Common/InterruptListener.h>
#include <Interpreters/Context.h>
#include <Interpreters/Settings.h>
#include <Core/Settings.h>
#include <Client/Connection.h>
namespace DB

View File

@ -296,7 +296,7 @@ void HTTPHandler::processQuery(
/// The client can pass a HTTP header indicating supported compression method (gzip or deflate).
String http_response_compression_methods = request.get("Accept-Encoding", "");
bool client_supports_http_compression = false;
ZlibCompressionMethod http_response_compression_method {};
CompressionMethod http_response_compression_method {};
if (!http_response_compression_methods.empty())
{
@ -305,12 +305,17 @@ void HTTPHandler::processQuery(
if (std::string::npos != http_response_compression_methods.find("gzip"))
{
client_supports_http_compression = true;
http_response_compression_method = ZlibCompressionMethod::Gzip;
http_response_compression_method = CompressionMethod::Gzip;
}
else if (std::string::npos != http_response_compression_methods.find("deflate"))
{
client_supports_http_compression = true;
http_response_compression_method = ZlibCompressionMethod::Zlib;
http_response_compression_method = CompressionMethod::Zlib;
}
else if (http_response_compression_methods == "br")
{
client_supports_http_compression = true;
http_response_compression_method = CompressionMethod::Brotli;
}
}
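Review note: the branch order above means gzip wins over deflate whenever both appear in `Accept-Encoding`, and Brotli is chosen only when the header is exactly `br` (an equality test, not a substring search). The sketch below restates the visible branches as one function. The enum is redefined locally with a hypothetical `None` value (the real handler tracks that case with a separate boolean), `chooseResponseCompression` is an illustrative name, and any logic in the part of the hunk not shown is not modelled.

```cpp
#include <string>

// Local stand-in for the real enum; `None` is an assumption made for this sketch.
enum class CompressionMethod { None, Gzip, Zlib, Brotli };

// Mirrors the visible branch order: substring match for gzip, then deflate,
// then an exact match for "br". Returns None when nothing applies.
CompressionMethod chooseResponseCompression(const std::string & accept_encoding)
{
    if (accept_encoding.empty())
        return CompressionMethod::None;
    if (accept_encoding.find("gzip") != std::string::npos)
        return CompressionMethod::Gzip;
    if (accept_encoding.find("deflate") != std::string::npos)
        return CompressionMethod::Zlib;
    if (accept_encoding == "br")
        return CompressionMethod::Brotli;
    return CompressionMethod::None;
}
```

One consequence worth checking in review: a header such as `br, identity` selects no compression under this logic, since it neither contains `gzip` or `deflate` nor equals `br`.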
@ -394,11 +399,11 @@ void HTTPHandler::processQuery(
{
if (http_request_compression_method_str == "gzip")
{
in_post = std::make_unique<ZlibInflatingReadBuffer>(*in_post_raw, ZlibCompressionMethod::Gzip);
in_post = std::make_unique<ZlibInflatingReadBuffer>(*in_post_raw, CompressionMethod::Gzip);
}
else if (http_request_compression_method_str == "deflate")
{
in_post = std::make_unique<ZlibInflatingReadBuffer>(*in_post_raw, ZlibCompressionMethod::Zlib);
in_post = std::make_unique<ZlibInflatingReadBuffer>(*in_post_raw, CompressionMethod::Zlib);
}
#if USE_BROTLI
else if (http_request_compression_method_str == "br")
@ -606,7 +611,7 @@ void HTTPHandler::processQuery(
executeQuery(*in, *used_output.out_maybe_delayed_and_compressed, /* allow_into_outfile = */ false, context,
[&response] (const String & content_type) { response.setContentType(content_type); },
[&response] (const String & current_query_id) { response.add("Query-Id", current_query_id); });
[&response] (const String & current_query_id) { response.add("X-ClickHouse-Query-Id", current_query_id); });
if (used_output.hasDelayed())
{

View File

@ -133,7 +133,7 @@ int Server::run()
}
if (config().hasOption("version"))
{
std::cout << DBMS_NAME << " server version " << VERSION_STRING << "." << std::endl;
std::cout << DBMS_NAME << " server version " << VERSION_STRING << VERSION_OFFICIAL << "." << std::endl;
return 0;
}
return Application::run();

View File

@ -723,8 +723,7 @@ bool TCPHandler::receiveData()
if (!(storage = query_context->tryGetExternalTable(external_table_name)))
{
NamesAndTypesList columns = block.getNamesAndTypesList();
storage = StorageMemory::create(external_table_name,
ColumnsDescription{columns, NamesAndTypesList{}, NamesAndTypesList{}, ColumnDefaults{}, ColumnComments{}, ColumnCodecs{}});
storage = StorageMemory::create(external_table_name, ColumnsDescription{columns});
storage->startup();
query_context->addExternalTable(external_table_name, storage);
}
@ -768,7 +767,7 @@ void TCPHandler::initBlockOutput(const Block & block)
{
if (!state.maybe_compressed_out)
{
std::string method = query_context->getSettingsRef().network_compression_method;
std::string method = Poco::toUpper(query_context->getSettingsRef().network_compression_method.toString());
std::optional<int> level;
if (method == "ZSTD")
level = query_context->getSettingsRef().network_zstd_compression_level;

View File

@ -25,7 +25,7 @@ namespace Poco { class Logger; }
namespace DB
{
struct ColumnsDescription;
class ColumnsDescription;
/// State of query processing.
struct QueryState

View File

@ -0,0 +1,21 @@
<?xml version="1.0"?>
<yandex>
<profiles>
<!-- Profile that allows only read queries. -->
<readonly>
<readonly>1</readonly>
</readonly>
</profiles>
<users>
<readonly>
<password></password>
<networks incl="networks" replace="replace">
<ip>::1</ip>
<ip>127.0.0.1</ip>
</networks>
<profile>readonly</profile>
<quota>default</quota>
</readonly>
</users>
</yandex>

View File

@ -16,6 +16,7 @@
with minimum number of different symbols between replica's hostname and local hostname
(Hamming distance).
in_order - first live replica is chosen in specified order.
first_or_random - if the first replica has a higher number of errors, pick a random one from the replicas with the minimum number of errors.
-->
<load_balancing>random</load_balancing>
</default>
@ -74,10 +75,30 @@
<!-- Quota for user. -->
<quota>default</quota>
<!-- For testing the table filters -->
<databases>
<test>
<!-- Simple expression filter -->
<filtered_table1>
<filter>a = 1</filter>
</filtered_table1>
<!-- Complex expression filter -->
<filtered_table2>
<filter>a + b &lt; 1 or c - d &gt; 5</filter>
</filtered_table2>
<!-- Filter with ALIAS column -->
<filtered_table3>
<filter>c = 1</filter>
</filtered_table3>
</test>
</databases>
</default>
<!-- Example of user with readonly access. -->
<readonly>
<!-- <readonly>
<password></password>
<networks incl="networks" replace="replace">
<ip>::1</ip>
@ -85,7 +106,7 @@
</networks>
<profile>readonly</profile>
<quota>default</quota>
</readonly>
</readonly> -->
</users>
<!-- Quotas. -->

View File

@ -1,262 +0,0 @@
#!/usr/bin/python3
# -*- coding: utf-8 -*-
import sys
import argparse
import tempfile
import random
import subprocess
import bisect
from copy import deepcopy
# Pseudorandom generator of unique numbers.
# http://preshing.com/20121224/how-to-generate-a-sequence-of-unique-random-integers/
class UniqueRandomGenerator:
prime = 4294967291
def __init__(self, seed_base, seed_offset):
self.index = self.permutePQR(self.permutePQR(seed_base) + 0x682f0161)
self.intermediate_offset = self.permutePQR(self.permutePQR(seed_offset) + 0x46790905)
def next(self):
val = self.permutePQR((self.permutePQR(self.index) + self.intermediate_offset) ^ 0x5bf03635)
self.index = self.index + 1
return val
def permutePQR(self, x):
if x >=self.prime:
return x
else:
residue = (x * x) % self.prime
if x <= self.prime/2:
return residue
else:
return self.prime - residue
# Create a table containing unique values.
def generate_data_source(host, port, http_port, min_cardinality, max_cardinality, count):
chunk_size = round((max_cardinality - min_cardinality) / float(count))
used_values = 0
cur_count = 0
next_size = 0
sup = 32768
n1 = random.randrange(0, sup)
n2 = random.randrange(0, sup)
urng = UniqueRandomGenerator(n1, n2)
is_first = True
with tempfile.TemporaryDirectory() as tmp_dir:
filename = tmp_dir + '/table.txt'
with open(filename, 'w+b') as file_handle:
while cur_count < count:
if is_first == True:
is_first = False
if min_cardinality != 0:
next_size = min_cardinality + 1
else:
next_size = chunk_size
else:
next_size += chunk_size
while used_values < next_size:
h = urng.next()
used_values = used_values + 1
out = str(h) + "\t" + str(cur_count) + "\n";
file_handle.write(bytes(out, 'UTF-8'));
cur_count = cur_count + 1
query = "DROP TABLE IF EXISTS data_source"
subprocess.check_output(["clickhouse-client", "--host", host, "--port", str(port), "--query", query])
query = "CREATE TABLE data_source(UserID UInt64, KeyID UInt64) ENGINE=TinyLog"
subprocess.check_output(["clickhouse-client", "--host", host, "--port", str(port), "--query", query])
cat = subprocess.Popen(("cat", filename), stdout=subprocess.PIPE)
subprocess.check_output(("POST", "http://{0}:{1}/?query=INSERT INTO data_source FORMAT TabSeparated".format(host, http_port)), stdin=cat.stdout)
cat.wait()
def perform_query(host, port):
query = "SELECT runningAccumulate(uniqExactState(UserID)) AS exact, "
query += "runningAccumulate(uniqCombinedRawState(UserID)) AS approx "
query += "FROM data_source GROUP BY KeyID"
return subprocess.check_output(["clickhouse-client", "--host", host, "--port", port, "--query", query])
def parse_clickhouse_response(response):
parsed = []
lines = response.decode().split("\n")
for cur_line in lines:
rows = cur_line.split("\t")
if len(rows) == 2:
parsed.append([float(rows[0]), float(rows[1])])
return parsed
def accumulate_data(accumulated_data, data):
if not accumulated_data:
accumulated_data = deepcopy(data)
else:
for row1, row2 in zip(accumulated_data, data):
row1[1] += row2[1];
return accumulated_data
def generate_raw_result(accumulated_data, count):
expected_tab = []
bias_tab = []
for row in accumulated_data:
exact = row[0]
expected = row[1] / count
bias = expected - exact
expected_tab.append(expected)
bias_tab.append(bias)
return [ expected_tab, bias_tab ]
def generate_sample(raw_estimates, biases, n_samples):
result = []
min_card = raw_estimates[0]
max_card = raw_estimates[len(raw_estimates) - 1]
step = (max_card - min_card) / (n_samples - 1)
for i in range(0, n_samples + 1):
x = min_card + i * step
j = bisect.bisect_left(raw_estimates, x)
if j == len(raw_estimates):
result.append((raw_estimates[j - 1], biases[j - 1]))
elif raw_estimates[j] == x:
result.append((raw_estimates[j], biases[j]))
else:
# Find the 6 nearest neighbors. Compute the arithmetic mean.
# 6 points to the left of x [j-6 j-5 j-4 j-3 j-2 j-1]
begin = max(j - 6, 0) - 1
end = j - 1
T = []
for k in range(end, begin, -1):
T.append(x - raw_estimates[k])
# 6 points to the right of x [j j+1 j+2 j+3 j+4 j+5]
begin = j
end = min(j + 5, len(raw_estimates) - 1) + 1
U = []
for k in range(begin, end):
U.append(raw_estimates[k] - x)
# Merge the distances.
V = []
lim = min(len(T), len(U))
k1 = 0
k2 = 0
while k1 < lim and k2 < lim:
if T[k1] == U[k2]:
V.append(j - k1 - 1)
V.append(j + k2)
k1 = k1 + 1
k2 = k2 + 1
elif T[k1] < U[k2]:
V.append(j - k1 - 1)
k1 = k1 + 1
else:
V.append(j + k2)
k2 = k2 + 1
if k1 < len(T):
while k1 < len(T):
V.append(j - k1 - 1)
k1 = k1 + 1
elif k2 < len(U):
while k2 < len(U):
V.append(j + k2)
k2 = k2 + 1
# Select the 6 nearest points.
# Compute the means.
begin = 0
end = min(len(V), 6)
sum = 0
bias = 0
for k in range(begin, end):
sum += raw_estimates[V[k]]
bias += biases[V[k]]
sum /= float(end)
bias /= float(end)
result.append((sum, bias))
# Skip consecutive results whose estimates are identical.
final_result = []
last = -1
for entry in result:
if entry[0] != last:
final_result.append((entry[0], entry[1]))
last = entry[0]
return final_result
def dump_arrays(data):
print("Size of each array: {0}\n".format(len(data)))
is_first = True
sep = ''
print("raw_estimates = ")
print("{")
for row in data:
print("\t{0}{1}".format(sep, row[0]))
if is_first == True:
is_first = False
sep = ","
print("};")
is_first = True
sep = ""
print("\nbiases = ")
print("{")
for row in data:
print("\t{0}{1}".format(sep, row[1]))
if is_first == True:
is_first = False
sep = ","
print("};")
def start():
parser = argparse.ArgumentParser(description = "Generate bias correction tables for HyperLogLog-based functions.")
parser.add_argument("-x", "--host", default="localhost", help="ClickHouse server host name");
parser.add_argument("-p", "--port", type=int, default=9000, help="ClickHouse server TCP port");
parser.add_argument("-t", "--http_port", type=int, default=8123, help="ClickHouse server HTTP port");
parser.add_argument("-i", "--iterations", type=int, default=5000, help="number of iterations");
parser.add_argument("-m", "--min_cardinality", type=int, default=16384, help="minimal cardinality");
parser.add_argument("-M", "--max_cardinality", type=int, default=655360, help="maximal cardinality");
parser.add_argument("-s", "--samples", type=int, default=200, help="number of sampled values");
args = parser.parse_args()
accumulated_data = []
for i in range(0, args.iterations):
print(i + 1)
sys.stdout.flush()
generate_data_source(args.host, str(args.port), str(args.http_port), args.min_cardinality, args.max_cardinality, 1000)
response = perform_query(args.host, str(args.port))
data = parse_clickhouse_response(response)
accumulated_data = accumulate_data(accumulated_data, data)
result = generate_raw_result(accumulated_data, args.iterations)
sampled_data = generate_sample(result[0], result[1], args.samples)
dump_arrays(sampled_data)
if __name__ == "__main__": start()
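
The neighbour-averaging step of `generate_sample` above can be sketched more compactly. This is a simplified illustration rather than the script's own code: the helper name `nearest_neighbour_average` and its `k` parameter are hypothetical, and it sorts candidate indices by distance instead of merging two distance lists, so tie-breaking may differ from the original.

```python
import bisect


def nearest_neighbour_average(raw_estimates, biases, x, k=6):
    """Average the (up to) k points closest to x.

    Assumes raw_estimates is non-empty and sorted in ascending order,
    and that biases has the same length.
    """
    j = bisect.bisect_left(raw_estimates, x)
    # Only the k points on either side of the insertion index can be among the k nearest.
    lo = max(j - k, 0)
    hi = min(j + k, len(raw_estimates))
    nearest = sorted(range(lo, hi), key=lambda i: abs(raw_estimates[i] - x))[:k]
    n = float(len(nearest))
    mean_estimate = sum(raw_estimates[i] for i in nearest) / n
    mean_bias = sum(biases[i] for i in nearest) / n
    return mean_estimate, mean_bias
```

Sorting at most `2 * k` candidates keeps the sketch short; the explicit merge in the script avoids the sort but is otherwise intended to select the same kind of neighbourhood.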

View File

@ -1 +0,0 @@
Hits table generator based on an LSTM neural network trained on real hits. To generate data, you need weights for the model; otherwise, train the model on real hits first.

View File

@ -1,22 +0,0 @@
import argparse
from model import Model
parser = argparse.ArgumentParser(
formatter_class=argparse.ArgumentDefaultsHelpFormatter)
parser.add_argument('-n', type=int, default=100000,
help='number of objects to generate')
parser.add_argument('--output_file', type=str, default='out.tsv',
help='output file name')
parser.add_argument('--weights_path', type=str,
help='path to weights')
args = parser.parse_args()
if __name__ == '__main__':
if not args.weights_path:
raise Exception('please specify path to model weights with --weights_path')
gen = Model()
gen.generate(args.n, args.output_file, args.weights_path)

View File

@ -1,147 +0,0 @@
import numpy as np
import os
import pickle
import tensorflow as tf
from random import sample
from keras.layers import Dense, Embedding
from tqdm import tqdm
RNN_NUM_UNITS = 256
EMB_SIZE = 32
MAX_LENGTH = 1049
with open('tokens', 'rb') as f:
tokens = pickle.load(f)
n_tokens = len(tokens)
token_to_id = {c: i for i, c in enumerate(tokens)}
def to_matrix(objects, max_len=None, pad=0, dtype='int32'):
max_len = max_len or max(map(len, objects))
matrix = np.zeros([len(objects), max_len], dtype) + pad
for i in range(len(objects)):
name_ix = list(map(token_to_id.get, objects[i]))
matrix[i, :len(name_ix)] = name_ix
return matrix.T
class Model:
def __init__(self, learning_rate=0.0001):
# an embedding layer that converts character ids into embeddings
self.embed_x = Embedding(n_tokens, EMB_SIZE)
get_h_next = Dense(1024, activation='relu')
# a dense layer that maps current hidden state
# to probabilities of characters [h_t+1]->P(x_t+1|h_t+1)
self.get_probas = Dense(n_tokens, activation='softmax')
self.input_sequence = tf.placeholder('int32', (MAX_LENGTH, None))
batch_size = tf.shape(self.input_sequence)[1]
self.gru_cell_first = tf.nn.rnn_cell.GRUCell(RNN_NUM_UNITS)
self.lstm_cell_second = tf.nn.rnn_cell.LSTMCell(RNN_NUM_UNITS)
h_prev_first = self.gru_cell_first.zero_state(batch_size, dtype=tf.float32)
h_prev_second = tf.nn.rnn_cell.LSTMStateTuple(
tf.zeros([batch_size, RNN_NUM_UNITS]), # initial cell state,
tf.zeros([batch_size, RNN_NUM_UNITS]) # initial hidden state
)
predicted_probas = []
for t in range(MAX_LENGTH):
x_t = self.input_sequence[t]
# convert character id into embedding
x_t_emb = self.embed_x(tf.reshape(x_t, [-1, 1]))[:, 0]
out_next_first, h_next_first = self.gru_cell_first(x_t_emb, h_prev_first)
h_prev_first = h_next_first
out_next_second, h_next_second = self.lstm_cell_second(out_next_first, h_prev_second)
h_prev_second = h_next_second
probas_next = self.get_probas(out_next_second)
predicted_probas.append(probas_next)
predicted_probas = tf.stack(predicted_probas)
predictions_matrix = tf.reshape(predicted_probas[:-1], [-1, len(tokens)])
answers_matrix = tf.one_hot(tf.reshape(self.input_sequence[1:], [-1]), n_tokens)
self.loss = tf.reduce_mean(tf.reduce_sum(
-answers_matrix * tf.log(tf.clip_by_value(predictions_matrix, 1e-7, 1.0)),
reduction_indices=[1]
))
optimizer = tf.train.AdamOptimizer(learning_rate)
gvs = optimizer.compute_gradients(self.loss)
capped_gvs = [(gr if gr is None else tf.clip_by_value(gr, -1., 1.), var) for gr, var in gvs]
self.optimize = optimizer.apply_gradients(capped_gvs)
self.sess = tf.Session()
self.sess.run(tf.global_variables_initializer())
self.saver = tf.train.Saver()
def train(self, train_data_path, save_dir, num_iters, batch_size=64, restore_from=False):
history = []
if restore_from:
with open(restore_from + '_history', 'rb') as f:
history = pickle.load(f)
self.saver.restore(self.sess, restore_from)
with open(train_data_path, 'r') as f:
train_data = f.readlines()
train_data = filter(lambda a: len(a) < MAX_LENGTH, train_data)
for i in tqdm(range(num_iters)):
batch = to_matrix(
map(lambda a: '\n' + a.rstrip('\n'), sample(train_data, batch_size)),
max_len=MAX_LENGTH
)
loss_i, _ = self.sess.run([self.loss, self.optimize], {self.input_sequence: batch})
history.append(loss_i)
if len(history) % 2000 == 0:
self.saver.save(self.sess, os.path.join(save_dir, '{}_iters'.format(len(history))))
self.saver.save(self.sess, os.path.join(save_dir, '{}_iters'.format(len(history))))
with open(os.path.join(save_dir, '{}_iters_history'.format(len(history))), 'wb') as f:
pickle.dump(history, f)
def generate(self, num_objects, output_file, weights_path):
self.saver.restore(self.sess, weights_path)
batch_size = num_objects
x_t = tf.placeholder('int32', (None, batch_size))
h_t_first = tf.Variable(tf.zeros([batch_size, RNN_NUM_UNITS]))
h_t_second = tf.nn.rnn_cell.LSTMStateTuple(
tf.Variable(tf.zeros([batch_size, RNN_NUM_UNITS])),
tf.Variable(tf.zeros([batch_size, RNN_NUM_UNITS]))
)
x_t_emb = self.embed_x(tf.reshape(x_t, [-1, 1]))[:, 0]
first_out_next, next_h_first = self.gru_cell_first(x_t_emb, h_t_first)
second_out_next, next_h_second = self.lstm_cell_second(first_out_next, h_t_second)
next_probs = self.get_probas(second_out_next)
x_sequence = np.zeros(shape=(1, batch_size), dtype=int) + token_to_id['\n']
self.sess.run(
[tf.assign(h_t_first, h_t_first.initial_value),
tf.assign(h_t_second[0], h_t_second[0].initial_value),
tf.assign(h_t_second[1], h_t_second[1].initial_value)]
)
for i in tqdm(range(MAX_LENGTH - 1)):
x_probs, _, _, _ = self.sess.run(
[next_probs,
tf.assign(h_t_second[0], next_h_second[0]),
tf.assign(h_t_second[1], next_h_second[1]),
tf.assign(h_t_first, next_h_first)],
{x_t: [x_sequence[-1, :]]}
)
next_char = [np.random.choice(n_tokens, p=x_probs[i]) for i in range(batch_size)]
if sum(next_char) == 0:
break
x_sequence = np.append(x_sequence, [next_char], axis=0)
with open(output_file, 'w') as f:
f.writelines([''.join([tokens[ix] for ix in x_sequence.T[k]]) + '\n' for k in range(batch_size)])

View File

@ -1,3 +0,0 @@
Keras==2.0.6
numpy
tensorflow-gpu==1.4.0

View File

@ -1,506 +0,0 @@
(lp0
S'\x83'
p1
aS'\x04'
p2
aS'\x87'
p3
aS'\x8b'
p4
aS'\x8f'
p5
aS'\x10'
p6
aS'\x93'
p7
aS'\x14'
p8
aS'\x97'
p9
aS'\x18'
p10
aS'\x9b'
p11
aS'\x1c'
p12
aS'\x9f'
p13
aS' '
p14
aS'\xa3'
p15
aS'$'
p16
aS'\xa7'
p17
aS'('
p18
aS'\xab'
p19
aS','
p20
aS'\xaf'
p21
aS'0'
p22
aS'\xb3'
p23
aS'4'
p24
aS'\xb7'
p25
aS'8'
p26
aS'\xbb'
p27
aS'<'
p28
aS'\xbf'
p29
aS'@'
p30
aS'\xc3'
p31
aS'D'
p32
aS'\xc7'
p33
aS'H'
p34
aS'\xcb'
p35
aS'L'
p36
aS'\xcf'
p37
aS'P'
p38
aS'\xd3'
p39
aS'T'
p40
aS'\xd7'
p41
aS'X'
p42
aS'\xdb'
p43
aS'\\'
p44
aS'\xdf'
p45
aS'`'
p46
aS'\xe3'
p47
aS'd'
p48
aS'\xe7'
p49
aS'h'
p50
aS'\xeb'
p51
aS'l'
p52
aS'\xef'
p53
aS'p'
p54
aS'\xf3'
p55
aS't'
p56
aS'\xf7'
p57
aS'x'
p58
aS'\xfb'
p59
aS'|'
p60
aS'\xff'
p61
aS'\x80'
p62
aS'\x03'
p63
aS'\x84'
p64
aS'\x07'
p65
aS'\x88'
p66
aS'\x0b'
p67
aS'\x8c'
p68
aS'\x0f'
p69
aS'\x90'
p70
aS'\x13'
p71
aS'\x94'
p72
aS'\x17'
p73
aS'\x98'
p74
aS'\x1b'
p75
aS'\x9c'
p76
aS'\x1f'
p77
aS'\xa0'
p78
aS'#'
p79
aS'\xa4'
p80
aS"'"
p81
aS'\xa8'
p82
aS'+'
p83
aS'\xac'
p84
aS'/'
p85
aS'\xb0'
p86
aS'3'
p87
aS'\xb4'
p88
aS'7'
p89
aS'\xb8'
p90
aS';'
p91
aS'\xbc'
p92
aS'?'
p93
aS'\xc0'
p94
aS'C'
p95
aS'\xc4'
p96
aS'G'
p97
aS'\xc8'
p98
aS'K'
p99
aS'\xcc'
p100
aS'O'
p101
aS'\xd0'
p102
aS'S'
p103
aS'\xd4'
p104
aS'W'
p105
aS'\xd8'
p106
aS'['
p107
aS'\xdc'
p108
aS'_'
p109
aS'\xe0'
p110
aS'c'
p111
aS'\xe4'
p112
aS'g'
p113
aS'\xe8'
p114
aS'k'
p115
aS'\xec'
p116
aS'o'
p117
aS'\xf0'
p118
aS's'
p119
aS'\xf4'
p120
aS'w'
p121
aS'\xf8'
p122
aS'{'
p123
aS'\xfc'
p124
aS'\x7f'
p125
aS'\x81'
p126
aS'\x02'
p127
aS'\x85'
p128
aS'\x06'
p129
aS'\x89'
p130
aS'\n'
p131
aS'\x8d'
p132
aS'\x0e'
p133
aS'\x91'
p134
aS'\x12'
p135
aS'\x95'
p136
aS'\x16'
p137
aS'\x99'
p138
aS'\x1a'
p139
aS'\x9d'
p140
aS'\x1e'
p141
aS'\xa1'
p142
aS'"'
p143
aS'\xa5'
p144
aS'&'
p145
aS'\xa9'
p146
aS'*'
p147
aS'\xad'
p148
aS'.'
p149
aS'\xb1'
p150
aS'2'
p151
aS'\xb5'
p152
aS'6'
p153
aS'\xb9'
p154
aS':'
p155
aS'\xbd'
p156
aS'>'
p157
aS'\xc1'
p158
aS'B'
p159
aS'\xc5'
p160
aS'F'
p161
aS'\xc9'
p162
aS'J'
p163
aS'\xcd'
p164
aS'N'
p165
aS'\xd1'
p166
aS'R'
p167
aS'\xd5'
p168
aS'V'
p169
aS'\xd9'
p170
aS'Z'
p171
aS'\xdd'
p172
aS'^'
p173
aS'\xe1'
p174
aS'b'
p175
aS'\xe5'
p176
aS'f'
p177
aS'\xe9'
p178
aS'j'
p179
aS'\xed'
p180
aS'n'
p181
aS'\xf1'
p182
aS'r'
p183
aS'\xf5'
p184
aS'v'
p185
aS'\xf9'
p186
aS'z'
p187
aS'\xfd'
p188
aS'~'
p189
aS'\x01'
p190
aS'\x82'
p191
aS'\x05'
p192
aS'\x86'
p193
aS'\t'
p194
aS'\x8a'
p195
aS'\x8e'
p196
aS'\x11'
p197
aS'\x92'
p198
aS'\x15'
p199
aS'\x96'
p200
aS'\x19'
p201
aS'\x9a'
p202
aS'\x1d'
p203
aS'\x9e'
p204
aS'!'
p205
aS'\xa2'
p206
aS'%'
p207
aS'\xa6'
p208
aS')'
p209
aS'\xaa'
p210
aS'-'
p211
aS'\xae'
p212
aS'1'
p213
aS'\xb2'
p214
aS'5'
p215
aS'\xb6'
p216
aS'9'
p217
aS'\xba'
p218
aS'='
p219
aS'\xbe'
p220
aS'A'
p221
aS'\xc2'
p222
aS'E'
p223
aS'\xc6'
p224
aS'I'
p225
aS'\xca'
p226
aS'M'
p227
aS'\xce'
p228
aS'Q'
p229
aS'\xd2'
p230
aS'U'
p231
aS'\xd6'
p232
aS'Y'
p233
aS'\xda'
p234
aS']'
p235
aS'\xde'
p236
aS'a'
p237
aS'\xe2'
p238
aS'e'
p239
aS'\xe6'
p240
aS'i'
p241
aS'\xea'
p242
aS'm'
p243
aS'\xee'
p244
aS'q'
p245
aS'\xf2'
p246
aS'u'
p247
aS'\xf6'
p248
aS'y'
p249
aS'\xfa'
p250
aS'}'
p251
aS'\xfe'
p252
a.

View File

@ -1,26 +0,0 @@
import argparse
from model import Model
parser = argparse.ArgumentParser(
formatter_class=argparse.ArgumentDefaultsHelpFormatter)
parser.add_argument('--n_iter', type=int, default=10000,
help='number of iterations')
parser.add_argument('--save_dir', type=str, default='save',
help='dir for saving weights')
parser.add_argument('--data_path', type=str,
help='path to train data')
parser.add_argument('--learning_rate', type=float, default=0.0001,
help='learning rate')
parser.add_argument('--batch_size', type=int, default=64,
help='batch size')
parser.add_argument('--restore_from', type=str,
help='path to train saved weights')
args = parser.parse_args()
if __name__ == '__main__':
if not args.data_path:
raise Exception('please specify path to train data with --data_path')
gen = Model(args.learning_rate)
gen.train(args.data_path, args.save_dir, args.n_iter, args.batch_size, args.restore_from)

View File

@ -1,150 +0,0 @@
#!/usr/bin/python3.4
# -*- coding: utf-8 -*-
import sys
import argparse
import tempfile
import random
import subprocess
import bisect
from copy import deepcopy
# Pseudorandom generator of unique numbers.
# http://preshing.com/20121224/how-to-generate-a-sequence-of-unique-random-integers/
class UniqueRandomGenerator:
prime = 4294967291
def __init__(self, seed_base, seed_offset):
self.index = self.permutePQR(self.permutePQR(seed_base) + 0x682f0161)
self.intermediate_offset = self.permutePQR(self.permutePQR(seed_offset) + 0x46790905)
def next(self):
val = self.permutePQR((self.permutePQR(self.index) + self.intermediate_offset) ^ 0x5bf03635)
self.index = self.index + 1
return val
def permutePQR(self, x):
if x >=self.prime:
return x
else:
residue = (x * x) % self.prime
if x <= self.prime/2:
return residue
else:
return self.prime - residue
# Create a table containing unique values.
def generate_data_source(host, port, http_port, min_cardinality, max_cardinality, count):
chunk_size = round((max_cardinality - (min_cardinality + 1)) / float(count))
used_values = 0
cur_count = 0
next_size = 0
sup = 32768
n1 = random.randrange(0, sup)
n2 = random.randrange(0, sup)
urng = UniqueRandomGenerator(n1, n2)
is_first = True
with tempfile.TemporaryDirectory() as tmp_dir:
filename = tmp_dir + '/table.txt'
with open(filename, 'w+b') as file_handle:
while cur_count < count:
if is_first == True:
is_first = False
if min_cardinality != 0:
next_size = min_cardinality + 1
else:
next_size = chunk_size
else:
next_size += chunk_size
while used_values < next_size:
h = urng.next()
used_values = used_values + 1
out = str(h) + "\t" + str(cur_count) + "\n";
file_handle.write(bytes(out, 'UTF-8'));
cur_count = cur_count + 1
query = "DROP TABLE IF EXISTS data_source"
subprocess.check_output(["clickhouse-client", "--host", host, "--port", str(port), "--query", query])
query = "CREATE TABLE data_source(UserID UInt64, KeyID UInt64) ENGINE=TinyLog"
subprocess.check_output(["clickhouse-client", "--host", host, "--port", str(port), "--query", query])
cat = subprocess.Popen(("cat", filename), stdout=subprocess.PIPE)
subprocess.check_output(("POST", "http://{0}:{1}/?query=INSERT INTO data_source FORMAT TabSeparated".format(host, http_port)), stdin=cat.stdout)
cat.wait()
def perform_query(host, port):
query = "SELECT runningAccumulate(uniqExactState(UserID)) AS exact, "
query += "runningAccumulate(uniqCombinedRawState(UserID)) AS raw, "
query += "runningAccumulate(uniqCombinedLinearCountingState(UserID)) AS linear_counting, "
query += "runningAccumulate(uniqCombinedBiasCorrectedState(UserID)) AS bias_corrected "
query += "FROM data_source GROUP BY KeyID"
return subprocess.check_output(["clickhouse-client", "--host", host, "--port", port, "--query", query])
def parse_clickhouse_response(response):
parsed = []
lines = response.decode().split("\n")
for cur_line in lines:
rows = cur_line.split("\t")
if len(rows) == 4:
parsed.append([float(rows[0]), float(rows[1]), float(rows[2]), float(rows[3])])
return parsed
def accumulate_data(accumulated_data, data):
if not accumulated_data:
accumulated_data = deepcopy(data)
else:
for row1, row2 in zip(accumulated_data, data):
row1[1] += row2[1];
row1[2] += row2[2];
row1[3] += row2[3];
return accumulated_data
def dump_graphs(data, count):
with open("raw_graph.txt", "w+b") as fh1, open("linear_counting_graph.txt", "w+b") as fh2, open("bias_corrected_graph.txt", "w+b") as fh3:
expected_tab = []
bias_tab = []
for row in data:
exact = row[0]
raw = row[1] / count;
linear_counting = row[2] / count;
bias_corrected = row[3] / count;
outstr = "{0}\t{1}\n".format(exact, abs(raw - exact) / exact)
fh1.write(bytes(outstr, 'UTF-8'))
outstr = "{0}\t{1}\n".format(exact, abs(linear_counting - exact) / exact)
fh2.write(bytes(outstr, 'UTF-8'))
outstr = "{0}\t{1}\n".format(exact, abs(bias_corrected - exact) / exact)
fh3.write(bytes(outstr, 'UTF-8'))
def start():
parser = argparse.ArgumentParser(description = "Generate graphs that help to determine the linear counting threshold.")
parser.add_argument("-x", "--host", default="localhost", help="clickhouse host name");
parser.add_argument("-p", "--port", type=int, default=9000, help="clickhouse client TCP port");
parser.add_argument("-t", "--http_port", type=int, default=8123, help="clickhouse HTTP port");
parser.add_argument("-i", "--iterations", type=int, default=5000, help="number of iterations");
parser.add_argument("-m", "--min_cardinality", type=int, default=16384, help="minimal cardinality");
parser.add_argument("-M", "--max_cardinality", type=int, default=655360, help="maximal cardinality");
args = parser.parse_args()
accumulated_data = []
for i in range(0, args.iterations):
print(i + 1)
sys.stdout.flush()
generate_data_source(args.host, str(args.port), str(args.http_port), args.min_cardinality, args.max_cardinality, 1000)
response = perform_query(args.host, str(args.port))
data = parse_clickhouse_response(response)
accumulated_data = accumulate_data(accumulated_data, data)
dump_graphs(accumulated_data, args.iterations)
if __name__ == "__main__": start()
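
The `permutePQR` method of `UniqueRandomGenerator` above relies on a quadratic-residue permutation, as described at the URL cited in the script. A minimal sketch of that idea follows, assuming a prime `p` with `p % 4 == 3`; the function name `permute_qpr` and the small prime used for the demonstration are illustrative choices, not part of the script.

```python
def permute_qpr(x, prime):
    """Map x to a distinct value in [0, prime) using quadratic residues.

    Assumes prime % 4 == 3. Inputs >= prime are returned unchanged,
    mirroring the behaviour of permutePQR in the script.
    """
    if x >= prime:
        return x
    residue = (x * x) % prime
    # The lower half maps to the residue itself, the upper half to its complement.
    return residue if x <= prime // 2 else prime - residue


# Demonstration with a small prime (the script uses 4294967291).
SMALL_PRIME = 11
outputs = [permute_qpr(x, SMALL_PRIME) for x in range(SMALL_PRIME)]
# Every input below the prime yields a different output.
assert sorted(outputs) == list(range(SMALL_PRIME))
```

Because the mapping is a bijection on `[0, prime)`, walking an index upwards yields values that do not repeat, which is what the data-source generator needs when it fills a table with unique `UserID` values.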

View File

@ -1,10 +0,0 @@
#!/usr/bin/env bash
for (( i = 0; i < 1000; i++ )); do
if (( RANDOM % 10 )); then
clickhouse-client --port=9007 --query="INSERT INTO mt (x) SELECT rand64() AS x FROM system.numbers LIMIT 100000"
else
clickhouse-client --port=9007 --query="INSERT INTO mt (x) SELECT rand64() AS x FROM system.numbers LIMIT 300000"
fi
done

View File

@ -1,76 +0,0 @@
from __future__ import print_function
import argparse
import matplotlib.pyplot as plt
import ast
TMP_FILE='tmp.tsv'
def parse_args():
parser = argparse.ArgumentParser(formatter_class=argparse.ArgumentDefaultsHelpFormatter)
parser.add_argument('-f', '--file', default='data.tsv')
cfg = parser.parse_args()
return cfg
def draw():
place = dict()
max_coord = 0
global_top = 0
for line in open(TMP_FILE):
numbers = line.split('\t')
if len(numbers) <= 2:
continue
name = numbers[-2]
if numbers[0] == '1':
dx = int(numbers[3])
max_coord += dx
place[name] = [1, max_coord, 1, dx]
max_coord += dx
plt.plot([max_coord - 2 * dx, max_coord], [1, 1])
for line in open(TMP_FILE):
numbers = line.split('\t')
if len(numbers) <= 2:
continue
name = numbers[-2]
if numbers[0] == '2':
list = ast.literal_eval(numbers[-1])
coord = [0,0,0,0]
for cur_name in list:
coord[0] = max(place[cur_name][0], coord[0])
coord[1] += place[cur_name][1] * place[cur_name][2]
coord[2] += place[cur_name][2]
coord[3] += place[cur_name][3]
coord[1] /= coord[2]
coord[0] += 1
global_top = max(global_top, coord[0])
place[name] = coord
for cur_name in list:
plt.plot([coord[1], place[cur_name][1]],[coord[0], place[cur_name][0]])
plt.plot([coord[1] - coord[3], coord[1] + coord[3]], [coord[0], coord[0]])
plt.plot([0], [global_top + 1])
plt.plot([0], [-1])
plt.show()
def convert(input_file):
print(input_file)
tmp_file = open(TMP_FILE, "w")
for line in open(input_file):
numbers = line.split('\t')
numbers2 = numbers[-2].split('_')
if numbers2[-2] == numbers2[-3]:
numbers2[-2] = str(int(numbers2[-2]) + 1)
numbers2[-3] = str(int(numbers2[-3]) + 1)
numbers[-2] = '_'.join(numbers2[1:])
print('\t'.join(numbers), end='', file=tmp_file)
else:
print(line, end='', file=tmp_file)
def main():
cfg = parse_args()
convert(cfg.file)
draw()
if __name__ == '__main__':
main()

View File

@ -1,61 +0,0 @@
import time
import ast
from datetime import datetime
FILE='data.tsv'
def get_metrix():
data = []
time_to_merge = 0
count_of_parts = 0
max_count_of_parts = 0
parts_in_time = []
last_date = 0
for line in open(FILE):
fields = line.split('\t')
last_date = datetime.strptime(fields[2], '%Y-%m-%d %H:%M:%S')
break
for line in open(FILE):
fields = line.split('\t')
cur_date = datetime.strptime(fields[2], '%Y-%m-%d %H:%M:%S')
if fields[0] == '2':
time_to_merge += int(fields[4])
list = ast.literal_eval(fields[-1])
count_of_parts -= len(list) - 1
else:
count_of_parts += 1
if max_count_of_parts < count_of_parts:
max_count_of_parts = count_of_parts
parts_in_time.append([(cur_date-last_date).total_seconds(), count_of_parts])
last_date = cur_date
stats_parts_in_time = []
global_time = 0
average_parts = 0
for i in range(max_count_of_parts + 1):
stats_parts_in_time.append(0)
for elem in parts_in_time:
stats_parts_in_time[elem[1]] += elem[0]
global_time += elem[0]
average_parts += elem[0] * elem[1]
for i in range(max_count_of_parts):
stats_parts_in_time[i] /= global_time
average_parts /= global_time
return time_to_merge, max_count_of_parts, average_parts, stats_parts_in_time
def main():
time_to_merge, max_parts, average_parts, stats_parts = get_metrix()
print('time_to_merge=', time_to_merge)
print('max_parts=', max_parts)
print('average_parts=', average_parts)
print('stats_parts=', stats_parts)
if __name__ == '__main__':
main()

View File

@ -1,56 +0,0 @@
#!/usr/bin/python3
import sys
import math
import statistics as stat
start = int(sys.argv[1])
end = int(sys.argv[2])
#Copied from dbms/src/Common/HashTable/Hash.h
def intHash32(key, salt = 0):
key ^= salt;
key = (~key) + (key << 18);
key = key ^ ((key >> 31) | (key << 33));
key = key * 21;
key = key ^ ((key >> 11) | (key << 53));
key = key + (key << 6);
key = key ^ ((key >> 22) | (key << 42));
return key & 0xffffffff
#Number of buckets for precision p = 12, m = 2^p
m = 4096
n = start
c = 0
m1 = {}
m2 = {}
l1 = []
l2 = []
while n <= end:
c += 1
h = intHash32(n)
#Extract the leftmost 12 bits
x1 = (h >> 20) & 0xfff
m1[x1] = 1
z1 = m - len(m1)
#Linear counting formula
u1 = int(m * math.log(float(m) / float(z1)))
e1 = abs(100*float(u1 - c)/float(c))
l1.append(e1)
print("%d %d %d %f" % (n, c, u1, e1))
#Extract the rightmost 12 bits
x2 = h & 0xfff
m2[x2] = 1
z2 = m - len(m2)
u2 = int(m * math.log(float(m) / float(z2)))
e2 = abs(100*float(u2 - c)/float(c))
l2.append(e2)
print("%d %d %d %f" % (n, c, u2, e2))
n += 1
print("Left 12 bits error: min=%f max=%f avg=%f median=%f median_low=%f median_high=%f" % (min(l1), max(l1), stat.mean(l1), stat.median(l1), stat.median_low(l1), stat.median_high(l1)))
print("Right 12 bits error: min=%f max=%f avg=%f median=%f median_low=%f median_high=%f" % (min(l2), max(l2), stat.mean(l2), stat.median(l2), stat.median_low(l2), stat.median_high(l2)))
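
The linear counting formula used in the script above, `m * log(m / z)` with `m` buckets and `z` empty buckets, can be isolated as a small function. This is a sketch for illustration: the name `linear_counting_estimate` is hypothetical, and raising an error when no bucket is empty is a choice made here (the script itself does not guard against that case).

```python
import math


def linear_counting_estimate(m, occupied):
    """Estimate cardinality from the number of occupied buckets out of m."""
    empty = m - occupied
    if empty <= 0:
        # With no empty buckets the formula is undefined (division by zero).
        raise ValueError("linear counting needs at least one empty bucket")
    return m * math.log(m / empty)
```

For example, with `m = 4096` and half of the buckets occupied, the estimate is `4096 * ln(2)`, which is larger than 2048 because some values are expected to have collided in the same bucket.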

View File

@ -1,11 +0,0 @@
#!/usr/bin/env bash
for ((p = 2; p <= 10; p++))
do
for ((i = 1; i <= 9; i++))
do
n=$(( 10**p * i ))
echo -n "$n "
clickhouse-client -q "select uniqHLL12(number), uniq(number), uniqCombined(number) from numbers($n);"
done
done

View File

@ -199,8 +199,13 @@ public:
for (auto & rhs_elem : rhs_set)
{
cur_set.emplace(rhs_elem.getValue(), it, inserted);
if (inserted && it->getValue().size)
it->getValueMutable().data = arena->insert(it->getValue().data, it->getValue().size);
if (inserted)
{
if (it->getValue().size)
it->getValueMutable().data = arena->insert(it->getValue().data, it->getValue().size);
else
it->getValueMutable().data = nullptr;
}
}
}

View File

@ -268,7 +268,7 @@ public:
void merge(const AggregateFunctionHistogramData & other, UInt32 max_bins)
{
lower_bound = std::min(lower_bound, other.lower_bound);
upper_bound = std::max(lower_bound, other.upper_bound);
upper_bound = std::max(upper_bound, other.upper_bound);
for (size_t i = 0; i < other.size; i++)
add(other.points[i].mean, other.points[i].weight, max_bins);
}

View File

@ -0,0 +1,85 @@
#include <AggregateFunctions/AggregateFunctionLeastSqr.h>
#include <AggregateFunctions/AggregateFunctionFactory.h>
#include <AggregateFunctions/FactoryHelpers.h>
namespace DB
{
namespace
{
AggregateFunctionPtr createAggregateFunctionLeastSqr(
const String & name,
const DataTypes & arguments,
const Array & params
)
{
assertNoParameters(name, params);
assertBinary(name, arguments);
const IDataType * x_arg = arguments.front().get();
WhichDataType which_x {
x_arg
};
const IDataType * y_arg = arguments.back().get();
WhichDataType which_y {
y_arg
};
#define FOR_LEASTSQR_TYPES_2(M, T) \
M(T, UInt8) \
M(T, UInt16) \
M(T, UInt32) \
M(T, UInt64) \
M(T, Int8) \
M(T, Int16) \
M(T, Int32) \
M(T, Int64) \
M(T, Float32) \
M(T, Float64)
#define FOR_LEASTSQR_TYPES(M) \
FOR_LEASTSQR_TYPES_2(M, UInt8) \
FOR_LEASTSQR_TYPES_2(M, UInt16) \
FOR_LEASTSQR_TYPES_2(M, UInt32) \
FOR_LEASTSQR_TYPES_2(M, UInt64) \
FOR_LEASTSQR_TYPES_2(M, Int8) \
FOR_LEASTSQR_TYPES_2(M, Int16) \
FOR_LEASTSQR_TYPES_2(M, Int32) \
FOR_LEASTSQR_TYPES_2(M, Int64) \
FOR_LEASTSQR_TYPES_2(M, Float32) \
FOR_LEASTSQR_TYPES_2(M, Float64)
#define DISPATCH(T1, T2) \
if (which_x.idx == TypeIndex::T1 && which_y.idx == TypeIndex::T2) \
return std::make_shared<AggregateFunctionLeastSqr<T1, T2>>( \
arguments, \
params \
);
FOR_LEASTSQR_TYPES(DISPATCH)
#undef FOR_LEASTSQR_TYPES_2
#undef FOR_LEASTSQR_TYPES
#undef DISPATCH
throw Exception(
"Illegal types ("
+ x_arg->getName() + ", " + y_arg->getName()
+ ") of arguments of aggregate function " + name
+ ", must be Native Ints, Native UInts or Floats",
ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT
);
}
}
void registerAggregateFunctionLeastSqr(AggregateFunctionFactory & factory)
{
factory.registerFunction("leastSqr", createAggregateFunctionLeastSqr);
}
}

View File

@ -0,0 +1,195 @@
#pragma once
#include <AggregateFunctions/IAggregateFunction.h>
#include <Columns/ColumnVector.h>
#include <Columns/ColumnTuple.h>
#include <DataTypes/DataTypeNullable.h>
#include <DataTypes/DataTypesNumber.h>
#include <DataTypes/DataTypeTuple.h>
#include <IO/ReadHelpers.h>
#include <IO/WriteHelpers.h>
#include <limits>
namespace DB
{
namespace ErrorCodes
{
extern const int ILLEGAL_TYPE_OF_ARGUMENT;
}
template <typename X, typename Y, typename Ret>
struct AggregateFunctionLeastSqrData final
{
size_t count = 0;
Ret sum_x = 0;
Ret sum_y = 0;
Ret sum_xx = 0;
Ret sum_xy = 0;
void add(X x, Y y)
{
count += 1;
sum_x += x;
sum_y += y;
sum_xx += x * x;
sum_xy += x * y;
}
void merge(const AggregateFunctionLeastSqrData & other)
{
count += other.count;
sum_x += other.sum_x;
sum_y += other.sum_y;
sum_xx += other.sum_xx;
sum_xy += other.sum_xy;
}
void serialize(WriteBuffer & buf) const
{
writeBinary(count, buf);
writeBinary(sum_x, buf);
writeBinary(sum_y, buf);
writeBinary(sum_xx, buf);
writeBinary(sum_xy, buf);
}
void deserialize(ReadBuffer & buf)
{
readBinary(count, buf);
readBinary(sum_x, buf);
readBinary(sum_y, buf);
readBinary(sum_xx, buf);
readBinary(sum_xy, buf);
}
Ret getK() const
{
Ret divisor = sum_xx * count - sum_x * sum_x;
if (divisor == 0)
return std::numeric_limits<Ret>::quiet_NaN();
return (sum_xy * count - sum_x * sum_y) / divisor;
}
Ret getB(Ret k) const
{
if (count == 0)
return std::numeric_limits<Ret>::quiet_NaN();
return (sum_y - k * sum_x) / count;
}
};
/// Calculates simple linear regression parameters.
/// Result is a tuple (k, b) for y = k * x + b equation, solved by least squares approximation.
template <typename X, typename Y, typename Ret = Float64>
class AggregateFunctionLeastSqr final : public IAggregateFunctionDataHelper<
AggregateFunctionLeastSqrData<X, Y, Ret>,
AggregateFunctionLeastSqr<X, Y, Ret>
>
{
public:
AggregateFunctionLeastSqr(
const DataTypes & arguments,
const Array & params
):
IAggregateFunctionDataHelper<
AggregateFunctionLeastSqrData<X, Y, Ret>,
AggregateFunctionLeastSqr<X, Y, Ret>
> {arguments, params}
{
// Note: the arguments have already been checked by the caller.
}
String getName() const override
{
return "leastSqr";
}
const char * getHeaderFilePath() const override
{
return __FILE__;
}
void add(
AggregateDataPtr place,
const IColumn ** columns,
size_t row_num,
Arena *
) const override
{
auto col_x {
static_cast<const ColumnVector<X> *>(columns[0])
};
auto col_y {
static_cast<const ColumnVector<Y> *>(columns[1])
};
X x = col_x->getData()[row_num];
Y y = col_y->getData()[row_num];
this->data(place).add(x, y);
}
void merge(
AggregateDataPtr place,
ConstAggregateDataPtr rhs, Arena *
) const override
{
this->data(place).merge(this->data(rhs));
}
void serialize(
ConstAggregateDataPtr place,
WriteBuffer & buf
) const override
{
this->data(place).serialize(buf);
}
void deserialize(
AggregateDataPtr place,
ReadBuffer & buf, Arena *
) const override
{
this->data(place).deserialize(buf);
}
DataTypePtr getReturnType() const override
{
DataTypes types {
std::make_shared<DataTypeNumber<Ret>>(),
std::make_shared<DataTypeNumber<Ret>>(),
};
Strings names {
"k",
"b",
};
return std::make_shared<DataTypeTuple>(
std::move(types),
std::move(names)
);
}
void insertResultInto(
ConstAggregateDataPtr place,
IColumn & to
) const override
{
Ret k = this->data(place).getK();
Ret b = this->data(place).getB(k);
auto & col_tuple = static_cast<ColumnTuple &>(to);
auto & col_k = static_cast<ColumnVector<Ret> &>(col_tuple.getColumn(0));
auto & col_b = static_cast<ColumnVector<Ret> &>(col_tuple.getColumn(1));
col_k.getData().push_back(k);
col_b.getData().push_back(b);
}
};
}
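
The closed-form computation in `AggregateFunctionLeastSqrData` above (running sums, `getK` and `getB`) can be sketched in Python to show how partial states merge. The class name `LeastSqrState` and its `result` method are hypothetical; only the formulas are taken from the header.

```python
import math


class LeastSqrState:
    """Running sums for fitting y = k * x + b by least squares."""

    def __init__(self):
        self.count = 0
        self.sum_x = 0.0
        self.sum_y = 0.0
        self.sum_xx = 0.0
        self.sum_xy = 0.0

    def add(self, x, y):
        self.count += 1
        self.sum_x += x
        self.sum_y += y
        self.sum_xx += x * x
        self.sum_xy += x * y

    def merge(self, other):
        # States combine by adding the sums, so partial aggregates can be merged.
        self.count += other.count
        self.sum_x += other.sum_x
        self.sum_y += other.sum_y
        self.sum_xx += other.sum_xx
        self.sum_xy += other.sum_xy

    def result(self):
        divisor = self.sum_xx * self.count - self.sum_x * self.sum_x
        if divisor == 0:
            k = math.nan
        else:
            k = (self.sum_xy * self.count - self.sum_x * self.sum_y) / divisor
        b = math.nan if self.count == 0 else (self.sum_y - k * self.sum_x) / self.count
        return k, b


# Usage: two partial states over points on the line y = 2 * x + 1.
left, right = LeastSqrState(), LeastSqrState()
left.add(0, 1)
left.add(1, 3)
right.add(2, 5)
left.merge(right)
k, b = left.result()
```

Keeping only five accumulators is what makes the function cheap to merge and serialize; the trade-off is that the divisor can lose precision when the `x` values are large and close together.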

View File

@ -15,60 +15,60 @@ namespace ErrorCodes
namespace
{
template <typename T> using FuncQuantile = AggregateFunctionQuantile<T, QuantileReservoirSampler<T>, NameQuantile, false, Float64, false>;
template <typename T> using FuncQuantiles = AggregateFunctionQuantile<T, QuantileReservoirSampler<T>, NameQuantiles, false, Float64, true>;
template <typename Value, bool FloatReturn> using FuncQuantile = AggregateFunctionQuantile<Value, QuantileReservoirSampler<Value>, NameQuantile, false, std::conditional_t<FloatReturn, Float64, void>, false>;
template <typename Value, bool FloatReturn> using FuncQuantiles = AggregateFunctionQuantile<Value, QuantileReservoirSampler<Value>, NameQuantiles, false, std::conditional_t<FloatReturn, Float64, void>, true>;
template <typename T> using FuncQuantileDeterministic = AggregateFunctionQuantile<T, QuantileReservoirSamplerDeterministic<T>, NameQuantileDeterministic, true, Float64, false>;
template <typename T> using FuncQuantilesDeterministic = AggregateFunctionQuantile<T, QuantileReservoirSamplerDeterministic<T>, NameQuantilesDeterministic, true, Float64, true>;
template <typename Value, bool FloatReturn> using FuncQuantileDeterministic = AggregateFunctionQuantile<Value, QuantileReservoirSamplerDeterministic<Value>, NameQuantileDeterministic, true, std::conditional_t<FloatReturn, Float64, void>, false>;
template <typename Value, bool FloatReturn> using FuncQuantilesDeterministic = AggregateFunctionQuantile<Value, QuantileReservoirSamplerDeterministic<Value>, NameQuantilesDeterministic, true, std::conditional_t<FloatReturn, Float64, void>, true>;
template <typename T> using FuncQuantileExact = AggregateFunctionQuantile<T, QuantileExact<T>, NameQuantileExact, false, void, false>;
template <typename T> using FuncQuantilesExact = AggregateFunctionQuantile<T, QuantileExact<T>, NameQuantilesExact, false, void, true>;
template <typename Value, bool _> using FuncQuantileExact = AggregateFunctionQuantile<Value, QuantileExact<Value>, NameQuantileExact, false, void, false>;
template <typename Value, bool _> using FuncQuantilesExact = AggregateFunctionQuantile<Value, QuantileExact<Value>, NameQuantilesExact, false, void, true>;
template <typename T> using FuncQuantileExactWeighted = AggregateFunctionQuantile<T, QuantileExactWeighted<T>, NameQuantileExactWeighted, true, void, false>;
template <typename T> using FuncQuantilesExactWeighted = AggregateFunctionQuantile<T, QuantileExactWeighted<T>, NameQuantilesExactWeighted, true, void, true>;
template <typename Value, bool _> using FuncQuantileExactWeighted = AggregateFunctionQuantile<Value, QuantileExactWeighted<Value>, NameQuantileExactWeighted, true, void, false>;
template <typename Value, bool _> using FuncQuantilesExactWeighted = AggregateFunctionQuantile<Value, QuantileExactWeighted<Value>, NameQuantilesExactWeighted, true, void, true>;
template <typename T> using FuncQuantileTiming = AggregateFunctionQuantile<T, QuantileTiming<T>, NameQuantileTiming, false, Float32, false>;
template <typename T> using FuncQuantilesTiming = AggregateFunctionQuantile<T, QuantileTiming<T>, NameQuantilesTiming, false, Float32, true>;
template <typename Value, bool _> using FuncQuantileTiming = AggregateFunctionQuantile<Value, QuantileTiming<Value>, NameQuantileTiming, false, Float32, false>;
template <typename Value, bool _> using FuncQuantilesTiming = AggregateFunctionQuantile<Value, QuantileTiming<Value>, NameQuantilesTiming, false, Float32, true>;
template <typename T> using FuncQuantileTimingWeighted = AggregateFunctionQuantile<T, QuantileTiming<T>, NameQuantileTimingWeighted, true, Float32, false>;
template <typename T> using FuncQuantilesTimingWeighted = AggregateFunctionQuantile<T, QuantileTiming<T>, NameQuantilesTimingWeighted, true, Float32, true>;
template <typename Value, bool _> using FuncQuantileTimingWeighted = AggregateFunctionQuantile<Value, QuantileTiming<Value>, NameQuantileTimingWeighted, true, Float32, false>;
template <typename Value, bool _> using FuncQuantilesTimingWeighted = AggregateFunctionQuantile<Value, QuantileTiming<Value>, NameQuantilesTimingWeighted, true, Float32, true>;
template <typename T> using FuncQuantileTDigest = AggregateFunctionQuantile<T, QuantileTDigest<T>, NameQuantileTDigest, false, Float32, false>;
template <typename T> using FuncQuantilesTDigest = AggregateFunctionQuantile<T, QuantileTDigest<T>, NameQuantilesTDigest, false, Float32, true>;
template <typename Value, bool FloatReturn> using FuncQuantileTDigest = AggregateFunctionQuantile<Value, QuantileTDigest<Value>, NameQuantileTDigest, false, std::conditional_t<FloatReturn, Float32, void>, false>;
template <typename Value, bool FloatReturn> using FuncQuantilesTDigest = AggregateFunctionQuantile<Value, QuantileTDigest<Value>, NameQuantilesTDigest, false, std::conditional_t<FloatReturn, Float32, void>, true>;
template <typename T> using FuncQuantileTDigestWeighted = AggregateFunctionQuantile<T, QuantileTDigest<T>, NameQuantileTDigestWeighted, true, Float32, false>;
template <typename T> using FuncQuantilesTDigestWeighted = AggregateFunctionQuantile<T, QuantileTDigest<T>, NameQuantilesTDigestWeighted, true, Float32, true>;
template <typename Value, bool FloatReturn> using FuncQuantileTDigestWeighted = AggregateFunctionQuantile<Value, QuantileTDigest<Value>, NameQuantileTDigestWeighted, true, std::conditional_t<FloatReturn, Float32, void>, false>;
template <typename Value, bool FloatReturn> using FuncQuantilesTDigestWeighted = AggregateFunctionQuantile<Value, QuantileTDigest<Value>, NameQuantilesTDigestWeighted, true, std::conditional_t<FloatReturn, Float32, void>, true>;
template <template <typename> class Function>
template <template <typename, bool> class Function>
static constexpr bool supportDecimal()
{
return std::is_same_v<Function<Float32>, FuncQuantileExact<Float32>> ||
std::is_same_v<Function<Float32>, FuncQuantilesExact<Float32>>;
return std::is_same_v<Function<Float32, false>, FuncQuantileExact<Float32, false>> ||
std::is_same_v<Function<Float32, false>, FuncQuantilesExact<Float32, false>>;
}
template <template <typename> class Function>
template <template <typename, bool> class Function>
AggregateFunctionPtr createAggregateFunctionQuantile(const std::string & name, const DataTypes & argument_types, const Array & params)
{
/// Second argument type check doesn't depend on the type of the first one.
Function<void>::assertSecondArg(argument_types);
Function<void, true>::assertSecondArg(argument_types);
const DataTypePtr & argument_type = argument_types[0];
WhichDataType which(argument_type);
#define DISPATCH(TYPE) \
if (which.idx == TypeIndex::TYPE) return std::make_shared<Function<TYPE>>(argument_type, params);
if (which.idx == TypeIndex::TYPE) return std::make_shared<Function<TYPE, true>>(argument_type, params);
FOR_NUMERIC_TYPES(DISPATCH)
#undef DISPATCH
if (which.idx == TypeIndex::Date) return std::make_shared<Function<DataTypeDate::FieldType>>(argument_type, params);
if (which.idx == TypeIndex::DateTime) return std::make_shared<Function<DataTypeDateTime::FieldType>>(argument_type, params);
if (which.idx == TypeIndex::Date) return std::make_shared<Function<DataTypeDate::FieldType, false>>(argument_type, params);
if (which.idx == TypeIndex::DateTime) return std::make_shared<Function<DataTypeDateTime::FieldType, false>>(argument_type, params);
if constexpr (supportDecimal<Function>())
{
if (which.idx == TypeIndex::Decimal32) return std::make_shared<Function<Decimal32>>(argument_type, params);
if (which.idx == TypeIndex::Decimal64) return std::make_shared<Function<Decimal64>>(argument_type, params);
if (which.idx == TypeIndex::Decimal128) return std::make_shared<Function<Decimal128>>(argument_type, params);
if (which.idx == TypeIndex::Decimal32) return std::make_shared<Function<Decimal32, true>>(argument_type, params);
if (which.idx == TypeIndex::Decimal64) return std::make_shared<Function<Decimal64, true>>(argument_type, params);
if (which.idx == TypeIndex::Decimal128) return std::make_shared<Function<Decimal128, true>>(argument_type, params);
}
throw Exception("Illegal type " + argument_type->getName() + " of argument for aggregate function " + name,

@ -65,9 +65,7 @@ class AggregateFunctionQuantile final : public IAggregateFunctionDataHelper<Data
private:
using ColVecType = std::conditional_t<IsDecimalNumber<Value>, ColumnDecimal<Value>, ColumnVector<Value>>;
static constexpr bool returns_float = !(std::is_same_v<FloatReturnType, void>)
&& (!(std::is_same_v<Value, DataTypeDate::FieldType> || std::is_same_v<Value, DataTypeDateTime::FieldType>)
|| std::is_same_v<Data, QuantileTiming<Value>>);
static constexpr bool returns_float = !(std::is_same_v<FloatReturnType, void>);
static_assert(!IsDecimalNumber<Value> || !returns_float);
QuantileLevels<Float64> levels;
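
The hunks above move the "does this function return a float" decision out of `AggregateFunctionQuantile` and into a `bool` template parameter that selects the marker type with `std::conditional_t`. A minimal sketch of that pattern follows. `QuantileSketch` and `TDigestSketch` are hypothetical names used only for illustration, and `float` stands in for `Float32`; this is not the ClickHouse implementation.

```cpp
#include <type_traits>

// Hypothetical stand-in for AggregateFunctionQuantile, reduced to the return-type decision.
template <typename Value, typename FloatReturnType>
struct QuantileSketch
{
    // Same shape as the simplified check in the hunk: a float is returned unless the marker is void.
    static constexpr bool returns_float = !std::is_same_v<FloatReturnType, void>;

    // Result type: the float type when requested, otherwise the value type itself.
    using ResultType = std::conditional_t<returns_float, FloatReturnType, Value>;
};

// The bool parameter picks the marker type, in the manner of FloatReturn in FuncQuantileTDigest.
template <typename Value, bool FloatReturn>
using TDigestSketch = QuantileSketch<Value, std::conditional_t<FloatReturn, float, void>>;

// Numeric input with FloatReturn = true yields a float result.
static_assert(TDigestSketch<int, true>::returns_float);
// A date-like integer with FloatReturn = false keeps its own type.
static_assert(!TDigestSketch<unsigned short, false>::returns_float);
static_assert(std::is_same_v<TDigestSketch<unsigned short, false>::ResultType, unsigned short>);
```

With this layout the caller (here, the dispatch in `createAggregateFunctionQuantile`) chooses `true` or `false` per argument type, so the class no longer needs to special-case `Date` and `DateTime`.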

@ -24,8 +24,7 @@ struct WithoutOverflowPolicy
static DataTypePtr promoteType(const DataTypePtr & data_type)
{
if (!data_type->canBePromoted())
throw new Exception{"Values to be summed are expected to be Numeric, Float or Decimal.",
ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT};
throw Exception{"Values to be summed are expected to be Numeric, Float or Decimal.", ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT};
return data_type->promoteNumericType();
}
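
The hunk above replaces `throw new Exception{...}` with `throw Exception{...}`. The difference matters because `throw new` throws a pointer, which a handler written as `catch (const Exception &)` does not match. The sketch below shows the effect with `std::runtime_error` standing in for `DB::Exception`; the function names are hypothetical.

```cpp
#include <exception>
#include <stdexcept>
#include <string>

// Throws by value: the reference handler matches.
std::string catchByValueThrow()
{
    try
    {
        throw std::runtime_error("illegal type");
    }
    catch (const std::exception &)
    {
        return "reference handler";
    }
    catch (...)
    {
        return "catch-all";
    }
}

// Throws a pointer: the reference handler is skipped, only a pointer handler matches.
std::string catchPointerThrow()
{
    try
    {
        throw new std::runtime_error("illegal type");   // thrown type is std::runtime_error *
    }
    catch (const std::exception &)
    {
        return "reference handler";
    }
    catch (const std::exception * e)
    {
        delete e;   // the handler has to free the heap object itself
        return "pointer handler";
    }
    catch (...)
    {
        return "catch-all";
    }
}
```

In code that only catches by reference, the pointer form would propagate past every handler and leak the allocated object, which is presumably why the commit drops `new`.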

@ -16,7 +16,6 @@
#include <Common/HashTable/HashSet.h>
#include <Common/HyperLogLogWithSmallSetOptimization.h>
#include <Common/CombinedCardinalityEstimator.h>
#include <Common/MemoryTracker.h>
#include <Common/typeid_cast.h>
#include <AggregateFunctions/UniquesHashSet.h>

@ -67,10 +67,10 @@ struct UniqVariadicHash<false, true>
{
UInt64 hash;
const Columns & tuple_columns = static_cast<const ColumnTuple *>(columns[0])->getColumns();
const auto & tuple_columns = static_cast<const ColumnTuple *>(columns[0])->getColumns();
const ColumnPtr * column = tuple_columns.data();
const ColumnPtr * columns_end = column + num_args;
const auto * column = tuple_columns.data();
const auto * columns_end = column + num_args;
{
StringRef value = column->get()->getDataAt(row_num);
@ -116,10 +116,10 @@ struct UniqVariadicHash<true, true>
{
static inline UInt128 apply(size_t num_args, const IColumn ** columns, size_t row_num)
{
const Columns & tuple_columns = static_cast<const ColumnTuple *>(columns[0])->getColumns();
const auto & tuple_columns = static_cast<const ColumnTuple *>(columns[0])->getColumns();
const ColumnPtr * column = tuple_columns.data();
const ColumnPtr * columns_end = column + num_args;
const auto * column = tuple_columns.data();
const auto * columns_end = column + num_args;
SipHash hash;

@ -15,7 +15,7 @@ namespace ErrorCodes
Array getAggregateFunctionParametersArray(const ASTPtr & expression_list, const std::string & error_context)
{
const ASTs & parameters = typeid_cast<const ASTExpressionList &>(*expression_list).children;
const ASTs & parameters = expression_list->children;
if (parameters.empty())
throw Exception("Parameters list to aggregate functions cannot be empty", ErrorCodes::BAD_ARGUMENTS);
@ -23,14 +23,14 @@ Array getAggregateFunctionParametersArray(const ASTPtr & expression_list, const
for (size_t i = 0; i < parameters.size(); ++i)
{
const ASTLiteral * lit = typeid_cast<const ASTLiteral *>(parameters[i].get());
if (!lit)
const auto * literal = parameters[i]->as<ASTLiteral>();
if (!literal)
{
throw Exception("Parameters to aggregate functions must be literals" + (error_context.empty() ? "" : " (in " + error_context +")"),
ErrorCodes::PARAMETERS_TO_AGGREGATE_FUNCTIONS_MUST_BE_LITERALS);
}
params_row[i] = lit->value;
params_row[i] = literal->value;
}
return params_row;
@ -67,8 +67,7 @@ void getAggregateFunctionNameAndParametersArray(
parameters_str.data(), parameters_str.data() + parameters_str.size(),
"parameters of aggregate function in " + error_context, 0);
ASTExpressionList & args_list = typeid_cast<ASTExpressionList &>(*args_ast);
if (args_list.children.empty())
if (args_ast->children.empty())
throw Exception("Incorrect list of parameters to aggregate function "
+ aggregate_function_name, ErrorCodes::BAD_ARGUMENTS);

@ -29,6 +29,7 @@ void registerAggregateFunctionsBitwise(AggregateFunctionFactory &);
void registerAggregateFunctionsBitmap(AggregateFunctionFactory &);
void registerAggregateFunctionsMaxIntersections(AggregateFunctionFactory &);
void registerAggregateFunctionEntropy(AggregateFunctionFactory &);
void registerAggregateFunctionLeastSqr(AggregateFunctionFactory &);
void registerAggregateFunctionCombinatorIf(AggregateFunctionCombinatorFactory &);
void registerAggregateFunctionCombinatorArray(AggregateFunctionCombinatorFactory &);
@ -69,6 +70,7 @@ void registerAggregateFunctions()
registerAggregateFunctionHistogram(factory);
registerAggregateFunctionRetention(factory);
registerAggregateFunctionEntropy(factory);
registerAggregateFunctionLeastSqr(factory);
}
{

@ -357,7 +357,7 @@ void Connection::sendQuery(
if (settings)
{
std::optional<int> level;
std::string method = settings->network_compression_method;
std::string method = Poco::toUpper(settings->network_compression_method.toString());
/// Bad custom logic
if (method == "ZSTD")

@ -18,7 +18,7 @@
#include <IO/ConnectionTimeouts.h>
#include <Interpreters/Settings.h>
#include <Core/Settings.h>
#include <Interpreters/TablesStatus.h>
#include <Compression/ICompressionCodec.h>
@ -271,7 +271,7 @@ private:
void initBlockInput();
void initBlockLogsInput();
void throwUnexpectedPacket(UInt64 packet_type, const char * expected) const;
[[noreturn]] void throwUnexpectedPacket(UInt64 packet_type, const char * expected) const;
};
}

@ -6,7 +6,7 @@
#include <Common/getFQDNOrHostName.h>
#include <Common/isLocalAddress.h>
#include <Common/ProfileEvents.h>
#include <Interpreters/Settings.h>
#include <Core/Settings.h>
namespace ProfileEvents
@ -62,6 +62,9 @@ IConnectionPool::Entry ConnectionPoolWithFailover::get(const Settings * settings
break;
case LoadBalancing::RANDOM:
break;
case LoadBalancing::FIRST_OR_RANDOM:
get_priority = [](size_t i) -> size_t { return i >= 1; };
break;
}
return Base::get(try_get_entry, get_priority);
@ -134,6 +137,9 @@ std::vector<ConnectionPoolWithFailover::TryResult> ConnectionPoolWithFailover::g
break;
case LoadBalancing::RANDOM:
break;
case LoadBalancing::FIRST_OR_RANDOM:
get_priority = [](size_t i) -> size_t { return i >= 1; };
break;
}
bool fallback_to_stale_replicas = settings ? bool(settings->fallback_to_stale_replicas_for_distributed_queries) : true;
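
The `FIRST_OR_RANDOM` case added above uses the lambda `[](size_t i) -> size_t { return i >= 1; }`, which maps replica 0 to priority 0 and every other replica to priority 1. A minimal sketch of what that mapping selects is given below. It assumes that a lower priority value means "try first", which the diff itself does not show; `firstOrRandomPriority` and `preferredReplicas` are hypothetical helpers, not part of `ConnectionPoolWithFailover`.

```cpp
#include <cstddef>
#include <vector>

// Same body as the lambda in the diff: index 0 -> 0, any other index -> 1.
inline std::size_t firstOrRandomPriority(std::size_t i) { return i >= 1; }

// Hypothetical helper: returns the indices that share the lowest priority value.
// Assumption (not shown in the diff): a lower value means the replica is tried first.
inline std::vector<std::size_t> preferredReplicas(std::size_t replica_count)
{
    std::vector<std::size_t> preferred;
    if (replica_count == 0)
        return preferred;

    // Find the smallest priority among all replicas.
    std::size_t best = firstOrRandomPriority(0);
    for (std::size_t i = 1; i < replica_count; ++i)
        if (firstOrRandomPriority(i) < best)
            best = firstOrRandomPriority(i);

    // Collect every replica that has that priority.
    for (std::size_t i = 0; i < replica_count; ++i)
        if (firstOrRandomPriority(i) == best)
            preferred.push_back(i);

    return preferred;
}
```

Under that assumption only the first replica is in the preferred group, and all remaining replicas tie at priority 1, so a fallback among them would be decided by whatever tie-breaking the pool applies.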

@ -43,13 +43,13 @@ using Arenas = std::vector<ArenaPtr>;
* specifying which individual values should be destroyed and which ones should not.
* Clearly, this method would have a substantially non-zero price.
*/
class ColumnAggregateFunction final : public COWPtrHelper<IColumn, ColumnAggregateFunction>
class ColumnAggregateFunction final : public COWHelper<IColumn, ColumnAggregateFunction>
{
public:
using Container = PaddedPODArray<AggregateDataPtr>;
private:
friend class COWPtrHelper<IColumn, ColumnAggregateFunction>;
friend class COWHelper<IColumn, ColumnAggregateFunction>;
/// Memory pools. Aggregate states are allocated from them.
Arenas arenas;

@ -576,7 +576,7 @@ ColumnPtr ColumnArray::filterTuple(const Filter & filt, ssize_t result_size_hint
/// Make temporary arrays for each components of Tuple, then filter and collect back.
size_t tuple_size = tuple.getColumns().size();
size_t tuple_size = tuple.tupleSize();
if (tuple_size == 0)
throw Exception("Logical error: empty tuple", ErrorCodes::LOGICAL_ERROR);
@ -809,12 +809,12 @@ ColumnPtr ColumnArray::replicateString(const Offsets & replicate_offsets) const
for (size_t i = 0; i < col_size; ++i)
{
/// How much to replicate the array.
/// How many times to replicate the array.
size_t size_to_replicate = replicate_offsets[i] - prev_replicate_offset;
/// The number of rows in the array.
/// The number of strings in the array.
size_t value_size = src_offsets[i] - prev_src_offset;
/// Number of characters in rows of the array, including zero/null bytes.
size_t sum_chars_size = value_size == 0 ? 0 : (src_string_offsets[prev_src_offset + value_size - 1] - prev_src_string_offset);
/// Number of characters in strings of the array, including zero bytes.
size_t sum_chars_size = src_string_offsets[prev_src_offset + value_size - 1] - prev_src_string_offset; /// -1th index is Ok, see PaddedPODArray.
for (size_t j = 0; j < size_to_replicate; ++j)
{
@ -824,7 +824,7 @@ ColumnPtr ColumnArray::replicateString(const Offsets & replicate_offsets) const
size_t prev_src_string_offset_local = prev_src_string_offset;
for (size_t k = 0; k < value_size; ++k)
{
/// Size of one row.
/// Size of single string.
size_t chars_size = src_string_offsets[k + prev_src_offset] - prev_src_string_offset_local;
current_res_string_offset += chars_size;
@ -835,7 +835,7 @@ ColumnPtr ColumnArray::replicateString(const Offsets & replicate_offsets) const
if (sum_chars_size)
{
/// Copies the characters of the array of rows.
/// Copies the characters of the array of strings.
res_chars.resize(res_chars.size() + sum_chars_size);
memcpySmallAllowReadWriteOverflow15(
&res_chars[res_chars.size() - sum_chars_size], &src_chars[prev_src_string_offset], sum_chars_size);
@ -941,7 +941,7 @@ ColumnPtr ColumnArray::replicateTuple(const Offsets & replicate_offsets) const
/// Make temporary arrays for each components of Tuple. In the same way as for Nullable.
size_t tuple_size = tuple.getColumns().size();
size_t tuple_size = tuple.tupleSize();
if (tuple_size == 0)
throw Exception("Logical error: empty tuple", ErrorCodes::LOGICAL_ERROR);

@ -13,10 +13,10 @@ namespace DB
* In memory, it is represented as one column of a nested type, whose size is equal to the sum of the sizes of all arrays,
* and as an array of offsets in it, which allows you to get each element.
*/
class ColumnArray final : public COWPtrHelper<IColumn, ColumnArray>
class ColumnArray final : public COWHelper<IColumn, ColumnArray>
{
private:
friend class COWPtrHelper<IColumn, ColumnArray>;
friend class COWHelper<IColumn, ColumnArray>;
/** Create an array column with specified values and offsets. */
ColumnArray(MutableColumnPtr && nested_column, MutableColumnPtr && offsets_column);
@ -30,7 +30,7 @@ public:
/** Create immutable column using immutable arguments. This arguments may be shared with other columns.
* Use IColumn::mutate in order to make mutable column and mutate shared nested columns.
*/
using Base = COWPtrHelper<IColumn, ColumnArray>;
using Base = COWHelper<IColumn, ColumnArray>;
static Ptr create(const ColumnPtr & nested_column, const ColumnPtr & offsets_column)
{
@ -81,15 +81,15 @@ public:
bool hasEqualOffsets(const ColumnArray & other) const;
/** More efficient methods of manipulation */
IColumn & getData() { return data->assumeMutableRef(); }
IColumn & getData() { return *data; }
const IColumn & getData() const { return *data; }
IColumn & getOffsetsColumn() { return offsets->assumeMutableRef(); }
IColumn & getOffsetsColumn() { return *offsets; }
const IColumn & getOffsetsColumn() const { return *offsets; }
Offsets & ALWAYS_INLINE getOffsets()
{
return static_cast<ColumnOffsets &>(offsets->assumeMutableRef()).getData();
return static_cast<ColumnOffsets &>(*offsets).getData();
}
const Offsets & ALWAYS_INLINE getOffsets() const
@ -124,8 +124,8 @@ public:
}
private:
ColumnPtr data;
ColumnPtr offsets;
WrappedPtr data;
WrappedPtr offsets;
size_t ALWAYS_INLINE offsetAt(ssize_t i) const { return getOffsets()[i - 1]; }
size_t ALWAYS_INLINE sizeAt(ssize_t i) const { return getOffsets()[i] - getOffsets()[i - 1]; }
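
The `ColumnArray` comment above describes the layout as one nested column holding all elements plus an array of offsets, and `offsetAt` / `sizeAt` read `getOffsets()[i - 1]`. A simplified sketch of that layout is shown below. `ArrayColumnSketch` is a hypothetical type built on `std::vector`; since `std::vector` has no readable element at index -1 (the diff notes that `PaddedPODArray` does), row 0 is handled with an explicit branch here.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified model of an array column: flat values plus cumulative end offsets.
struct ArrayColumnSketch
{
    std::vector<int> data;             // elements of all arrays, concatenated
    std::vector<std::size_t> offsets;  // offsets[i] = index one past the last element of array i

    // Start position of array i inside data.
    std::size_t offsetAt(std::size_t i) const { return i == 0 ? 0 : offsets[i - 1]; }

    // Number of elements in array i.
    std::size_t sizeAt(std::size_t i) const { return offsets[i] - offsetAt(i); }

    // Appends one array as a new row.
    void append(const std::vector<int> & values)
    {
        data.insert(data.end(), values.begin(), values.end());
        offsets.push_back(data.size());
    }
};
```

For example, appending `{1, 2, 3}`, `{}` and `{4, 5}` gives `data = 1 2 3 4 5` and `offsets = 3 3 5`, so an empty array costs one offset entry and no data.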

@ -18,12 +18,12 @@ namespace ErrorCodes
/** ColumnConst contains another column with single element,
* but looks like a column with arbitrary amount of same elements.
*/
class ColumnConst final : public COWPtrHelper<IColumn, ColumnConst>
class ColumnConst final : public COWHelper<IColumn, ColumnConst>
{
private:
friend class COWPtrHelper<IColumn, ColumnConst>;
friend class COWHelper<IColumn, ColumnConst>;
ColumnPtr data;
WrappedPtr data;
size_t s;
ColumnConst(const ColumnPtr & data, size_t s);
@ -141,9 +141,8 @@ public:
const char * deserializeAndInsertFromArena(const char * pos) override
{
auto & mutable_data = data->assumeMutableRef();
auto res = mutable_data.deserializeAndInsertFromArena(pos);
mutable_data.popBack(1);
auto res = data->deserializeAndInsertFromArena(pos);
data->popBack(1);
++s;
return res;
}
@ -208,11 +207,9 @@ public:
/// Not part of the common interface.
IColumn & getDataColumn() { return data->assumeMutableRef(); }
IColumn & getDataColumn() { return *data; }
const IColumn & getDataColumn() const { return *data; }
//MutableColumnPtr getDataColumnMutablePtr() { return data; }
const ColumnPtr & getDataColumnPtr() const { return data; }
//ColumnPtr & getDataColumnPtr() { return data; }
Field getField() const { return getDataColumn()[0]; }

@ -55,13 +55,13 @@ private:
/// A ColumnVector for Decimals
template <typename T>
class ColumnDecimal final : public COWPtrHelper<ColumnVectorHelper, ColumnDecimal<T>>
class ColumnDecimal final : public COWHelper<ColumnVectorHelper, ColumnDecimal<T>>
{
static_assert(IsDecimalNumber<T>);
private:
using Self = ColumnDecimal;
friend class COWPtrHelper<ColumnVectorHelper, Self>;
friend class COWHelper<ColumnVectorHelper, Self>;
public:
using Container = DecimalPaddedPODArray<T>;

@ -13,10 +13,10 @@ namespace DB
/** A column of values of "fixed-length string" type.
* If you insert a smaller string, it will be padded with zero bytes.
*/
class ColumnFixedString final : public COWPtrHelper<ColumnVectorHelper, ColumnFixedString>
class ColumnFixedString final : public COWHelper<ColumnVectorHelper, ColumnFixedString>
{
public:
friend class COWPtrHelper<ColumnVectorHelper, ColumnFixedString>;
friend class COWHelper<ColumnVectorHelper, ColumnFixedString>;
using Chars = PaddedPODArray<UInt8>;

@ -129,19 +129,6 @@ std::vector<MutableColumnPtr> ColumnFunction::scatter(IColumn::ColumnIndex num_c
return columns;
}
void ColumnFunction::insertDefault()
{
for (auto & column : captured_columns)
column.column->assumeMutableRef().insertDefault();
++size_;
}
void ColumnFunction::popBack(size_t n)
{
for (auto & column : captured_columns)
column.column->assumeMutableRef().popBack(n);
size_ -= n;
}
size_t ColumnFunction::byteSize() const
{
size_t total_size = 0;

@ -15,10 +15,10 @@ namespace DB
/** A column containing a lambda expression.
* Behaves like a constant-column. Contains an expression, but not input or output data.
*/
class ColumnFunction final : public COWPtrHelper<IColumn, ColumnFunction>
class ColumnFunction final : public COWHelper<IColumn, ColumnFunction>
{
private:
friend class COWPtrHelper<IColumn, ColumnFunction>;
friend class COWHelper<IColumn, ColumnFunction>;
ColumnFunction(size_t size, FunctionBasePtr function, const ColumnsWithTypeAndName & columns_to_capture);
@ -34,8 +34,7 @@ public:
ColumnPtr filter(const Filter & filt, ssize_t result_size_hint) const override;
ColumnPtr permute(const Permutation & perm, size_t limit) const override;
ColumnPtr index(const IColumn & indexes, size_t limit) const override;
void insertDefault() override;
void popBack(size_t n) override;
std::vector<MutableColumnPtr> scatter(IColumn::ColumnIndex num_columns,
const IColumn::Selector & selector) const override;
@ -64,7 +63,12 @@ public:
void insert(const Field &) override
{
throw Exception("Cannot get insert into " + getName(), ErrorCodes::NOT_IMPLEMENTED);
throw Exception("Cannot insert into " + getName(), ErrorCodes::NOT_IMPLEMENTED);
}
void insertDefault() override
{
throw Exception("Cannot insert into " + getName(), ErrorCodes::NOT_IMPLEMENTED);
}
void insertRangeFrom(const IColumn &, size_t, size_t) override
@ -92,6 +96,11 @@ public:
throw Exception("updateHashWithValue is not implemented for " + getName(), ErrorCodes::NOT_IMPLEMENTED);
}
void popBack(size_t) override
{
throw Exception("popBack is not implemented for " + getName(), ErrorCodes::NOT_IMPLEMENTED);
}
int compareAt(size_t, size_t, const IColumn &, int) const override
{
throw Exception("compareAt is not implemented for " + getName(), ErrorCodes::NOT_IMPLEMENTED);

@ -306,21 +306,11 @@ void ColumnLowCardinality::setSharedDictionary(const ColumnPtr & column_unique)
dictionary.setShared(column_unique);
}
ColumnLowCardinality::MutablePtr ColumnLowCardinality::compact()
{
auto positions = idx.getPositions();
/// Create column with new indexes and old dictionary.
auto column = ColumnLowCardinality::create(getDictionary().assumeMutable(), (*std::move(positions)).mutate());
/// Will create new dictionary.
column->compactInplace();
return column;
}
ColumnLowCardinality::MutablePtr ColumnLowCardinality::cutAndCompact(size_t start, size_t length) const
{
auto sub_positions = (*idx.getPositions()->cut(start, length)).mutate();
/// Create column with new indexes and old dictionary.
/// Dictionary is shared, but will be recreated after compactInplace call.
auto column = ColumnLowCardinality::create(getDictionary().assumeMutable(), std::move(sub_positions));
/// Will create new dictionary.
column->compactInplace();
@ -522,7 +512,7 @@ void ColumnLowCardinality::Index::insertPosition(UInt64 position)
while (position > getMaxPositionForCurrentType())
expandType();
positions->assumeMutableRef().insert(position);
positions->insert(position);
checkSizeOfType();
}
@ -540,7 +530,7 @@ void ColumnLowCardinality::Index::insertPositionsRange(const IColumn & column, U
convertPositions<ColumnType>();
if (size_of_type == sizeof(ColumnType))
positions->assumeMutableRef().insertRangeFrom(column, offset, limit);
positions->insertRangeFrom(column, offset, limit);
else
{
auto copy = [&](auto cur_type)

@ -14,9 +14,9 @@ namespace ErrorCodes
extern const int ILLEGAL_COLUMN;
}
class ColumnLowCardinality final : public COWPtrHelper<IColumn, ColumnLowCardinality>
class ColumnLowCardinality final : public COWHelper<IColumn, ColumnLowCardinality>
{
friend class COWPtrHelper<IColumn, ColumnLowCardinality>;
friend class COWHelper<IColumn, ColumnLowCardinality>;
ColumnLowCardinality(MutableColumnPtr && column_unique, MutableColumnPtr && indexes, bool is_shared = false);
ColumnLowCardinality(const ColumnLowCardinality & other) = default;
@ -25,7 +25,7 @@ public:
/** Create immutable column using immutable arguments. This arguments may be shared with other columns.
* Use IColumn::mutate in order to make mutable column and mutate shared nested columns.
*/
using Base = COWPtrHelper<IColumn, ColumnLowCardinality>;
using Base = COWHelper<IColumn, ColumnLowCardinality>;
static Ptr create(const ColumnPtr & column_unique_, const ColumnPtr & indexes_, bool is_shared = false)
{
return ColumnLowCardinality::create(column_unique_->assumeMutable(), indexes_->assumeMutable(), is_shared);
@ -149,10 +149,10 @@ public:
const IColumnUnique & getDictionary() const { return dictionary.getColumnUnique(); }
const ColumnPtr & getDictionaryPtr() const { return dictionary.getColumnUniquePtr(); }
/// IColumnUnique & getUnique() { return static_cast<IColumnUnique &>(*column_unique->assumeMutable()); }
/// IColumnUnique & getUnique() { return static_cast<IColumnUnique &>(*column_unique); }
/// ColumnPtr getUniquePtr() const { return column_unique; }
/// IColumn & getIndexes() { return idx.getPositions()->assumeMutableRef(); }
/// IColumn & getIndexes() { return *idx.getPositions(); }
const IColumn & getIndexes() const { return *idx.getPositions(); }
const ColumnPtr & getIndexesPtr() const { return idx.getPositions(); }
size_t getSizeOfIndexType() const { return idx.getSizeOfIndexType(); }
@ -177,10 +177,8 @@ public:
void setSharedDictionary(const ColumnPtr & column_unique);
bool isSharedDictionary() const { return dictionary.isShared(); }
/// Create column new dictionary with only keys that are mentioned in index.
MutablePtr compact();
/// Cut + compact.
/// Create column with new dictionary from column part.
/// Dictionary will have only keys that are mentioned in index.
MutablePtr cutAndCompact(size_t start, size_t length) const;
struct DictionaryEncodedColumn
@ -202,13 +200,13 @@ public:
explicit Index(ColumnPtr positions);
const ColumnPtr & getPositions() const { return positions; }
ColumnPtr & getPositionsPtr() { return positions; }
WrappedPtr & getPositionsPtr() { return positions; }
size_t getPositionAt(size_t row) const;
void insertPosition(UInt64 position);
void insertPositionsRange(const IColumn & column, UInt64 offset, UInt64 limit);
void popBack(size_t n) { positions->assumeMutableRef().popBack(n); }
void reserve(size_t n) { positions->assumeMutableRef().reserve(n); }
void popBack(size_t n) { positions->popBack(n); }
void reserve(size_t n) { positions->reserve(n); }
UInt64 getMaxPositionForCurrentType() const;
@ -224,7 +222,7 @@ public:
void countKeys(ColumnUInt64::Container & counts) const;
private:
ColumnPtr positions;
WrappedPtr positions;
size_t size_of_type = 0;
void updateSizeOfType() { size_of_type = getSizeOfIndexType(*positions, size_of_type); }
@ -252,10 +250,10 @@ private:
explicit Dictionary(ColumnPtr column_unique, bool is_shared);
const ColumnPtr & getColumnUniquePtr() const { return column_unique; }
ColumnPtr & getColumnUniquePtr() { return column_unique; }
WrappedPtr & getColumnUniquePtr() { return column_unique; }
const IColumnUnique & getColumnUnique() const { return static_cast<const IColumnUnique &>(*column_unique); }
IColumnUnique & getColumnUnique() { return static_cast<IColumnUnique &>(column_unique->assumeMutableRef()); }
IColumnUnique & getColumnUnique() { return static_cast<IColumnUnique &>(*column_unique); }
/// Dictionary may be shared for several mutable columns.
/// Immutable columns may have the same column unique, which isn't necessarily shared dictionary.
@ -266,7 +264,7 @@ private:
void compact(ColumnPtr & positions);
private:
ColumnPtr column_unique;
WrappedPtr column_unique;
bool shared = false;
void checkColumn(const IColumn & column);
