This commit is contained in:
MyroTk 2021-03-01 16:58:40 +01:00
commit 1f7aded743
1918 changed files with 56637 additions and 23831 deletions

2
.github/codecov.yml vendored
View File

@ -1,5 +1,5 @@
codecov:
max_report_age: off
max_report_age: "off"
strict_yaml_branch: "master"
ignore:

View File

@ -8,7 +8,7 @@
name: Docker Container Scan (clickhouse-server)
on:
"on":
pull_request:
paths:
- docker/server/Dockerfile

View File

@ -1,32 +0,0 @@
# See the example here: https://github.com/github/codeql-action
name: "CodeQL Scanning"
on:
schedule:
- cron: '0 19 * * *'
jobs:
CodeQL-Build:
runs-on: self-hosted
timeout-minutes: 1440
steps:
- name: Checkout repository
uses: actions/checkout@v2
with:
fetch-depth: 2
submodules: 'recursive'
- name: Initialize CodeQL
uses: github/codeql-action/init@v1
with:
languages: cpp
- run: sudo apt-get update && sudo apt-get install -y git cmake python ninja-build gcc-10 g++-10 && mkdir build
- run: cd build && CC=gcc-10 CXX=g++-10 cmake ..
- run: cd build && ninja
- name: Perform CodeQL Analysis
uses: github/codeql-action/analyze@v1

6
.gitignore vendored
View File

@ -137,3 +137,9 @@ website/package-lock.json
/prof
*.iml
# data store
/programs/server/data
/programs/server/metadata
/programs/server/store

4
.gitmodules vendored
View File

@ -184,7 +184,7 @@
url = https://github.com/ClickHouse-Extras/krb5
[submodule "contrib/cyrus-sasl"]
path = contrib/cyrus-sasl
url = https://github.com/cyrusimap/cyrus-sasl
url = https://github.com/ClickHouse-Extras/cyrus-sasl
branch = cyrus-sasl-2.1
[submodule "contrib/croaring"]
path = contrib/croaring
@ -220,4 +220,4 @@
url = https://github.com/ClickHouse-Extras/boringssl.git
[submodule "contrib/NuRaft"]
path = contrib/NuRaft
url = https://github.com/eBay/NuRaft.git
url = https://github.com/ClickHouse-Extras/NuRaft.git

45
.pylintrc Normal file
View File

@ -0,0 +1,45 @@
# vim: ft=config
[BASIC]
max-module-lines=2000
# due to SQL
max-line-length=200
# Drop/decrease them one day:
max-branches=50
max-nested-blocks=10
max-statements=200
[FORMAT]
ignore-long-lines = (# )?<?https?://\S+>?$
[MESSAGES CONTROL]
disable = bad-continuation,
missing-docstring,
bad-whitespace,
too-few-public-methods,
invalid-name,
too-many-arguments,
keyword-arg-before-vararg,
too-many-locals,
too-many-instance-attributes,
cell-var-from-loop,
fixme,
too-many-public-methods,
wildcard-import,
unused-wildcard-import,
singleton-comparison,
# pytest.mark.parametrize is not callable (not-callable)
not-callable,
# https://github.com/PyCQA/pylint/issues/3882
# [Python 3.9] Value 'Optional' is unsubscriptable (unsubscriptable-object) (also Union)
unsubscriptable-object,
# Drop them one day:
redefined-outer-name,
broad-except,
bare-except,
no-else-return,
global-statement
[SIMILARITIES]
# due to SQL
min-similarity-lines=1000

15
.yamllint Normal file
View File

@ -0,0 +1,15 @@
# vi: ft=yaml
extends: default
rules:
indentation:
level: warning
indent-sequences: consistent
line-length:
# there are some bash -c "", so this is OK
max: 300
level: warning
comments:
min-spaces-from-content: 1
document-start:
present: false

View File

@ -1,5 +1,180 @@
## ClickHouse release 21.2
### ClickHouse release v21.2.2.8-stable, 2021-02-07
#### Backward Incompatible Change
* Bitwise functions (`bitAnd`, `bitOr`, etc) are forbidden for floating point arguments. Now you have to do explicit cast to integer. [#19853](https://github.com/ClickHouse/ClickHouse/pull/19853) ([Azat Khuzhin](https://github.com/azat)).
* Forbid `lcm`/`gcd` for floats. [#19532](https://github.com/ClickHouse/ClickHouse/pull/19532) ([Azat Khuzhin](https://github.com/azat)).
* Fix memory tracking for `OPTIMIZE TABLE`/merges; account query memory limits and sampling for `OPTIMIZE TABLE`/merges. [#18772](https://github.com/ClickHouse/ClickHouse/pull/18772) ([Azat Khuzhin](https://github.com/azat)).
* Disallow floating point column as partition key, see [#18421](https://github.com/ClickHouse/ClickHouse/issues/18421#event-4147046255). [#18464](https://github.com/ClickHouse/ClickHouse/pull/18464) ([hexiaoting](https://github.com/hexiaoting)).
* Excessive parenthesis in type definitions no longer supported, example: `Array((UInt8))`.
#### New Feature
* Added `PostgreSQL` table engine (both select/insert, with support for multidimensional arrays), also as table function. Added `PostgreSQL` dictionary source. Added `PostgreSQL` database engine. [#18554](https://github.com/ClickHouse/ClickHouse/pull/18554) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Data type `Nested` now supports arbitrary levels of nesting. Introduced subcolumns of complex types, such as `size0` in `Array`, `null` in `Nullable`, names of `Tuple` elements, which can be read without reading of whole column. [#17310](https://github.com/ClickHouse/ClickHouse/pull/17310) ([Anton Popov](https://github.com/CurtizJ)).
* Added `Nullable` support for `FlatDictionary`, `HashedDictionary`, `ComplexKeyHashedDictionary`, `DirectDictionary`, `ComplexKeyDirectDictionary`, `RangeHashedDictionary`. [#18236](https://github.com/ClickHouse/ClickHouse/pull/18236) ([Maksim Kita](https://github.com/kitaisreal)).
* Adds a new table called `system.distributed_ddl_queue` that displays the queries in the DDL worker queue. [#17656](https://github.com/ClickHouse/ClickHouse/pull/17656) ([Bharat Nallan](https://github.com/bharatnc)).
* Added support of mapping LDAP group names, and attribute values in general, to local roles for users from ldap user directories. [#17211](https://github.com/ClickHouse/ClickHouse/pull/17211) ([Denis Glazachev](https://github.com/traceon)).
* Support insert into table function `cluster`, and for both table functions `remote` and `cluster`, support distributing data across nodes by specify sharding key. Close [#16752](https://github.com/ClickHouse/ClickHouse/issues/16752). [#18264](https://github.com/ClickHouse/ClickHouse/pull/18264) ([flynn](https://github.com/ucasFL)).
* Add function `decodeXMLComponent` to decode characters for XML. Example: `SELECT decodeXMLComponent('Hello,&quot;world&quot;!')` [#17659](https://github.com/ClickHouse/ClickHouse/issues/17659). [#18542](https://github.com/ClickHouse/ClickHouse/pull/18542) ([nauta](https://github.com/nautaa)).
* Added functions `parseDateTimeBestEffortUSOrZero`, `parseDateTimeBestEffortUSOrNull`. [#19712](https://github.com/ClickHouse/ClickHouse/pull/19712) ([Maksim Kita](https://github.com/kitaisreal)).
* Add `sign` math function. [#19527](https://github.com/ClickHouse/ClickHouse/pull/19527) ([flynn](https://github.com/ucasFL)).
* Add information about used features (functions, table engines, etc) into system.query_log. [#18495](https://github.com/ClickHouse/ClickHouse/issues/18495). [#19371](https://github.com/ClickHouse/ClickHouse/pull/19371) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Function `formatDateTime` support the `%Q` modification to format date to quarter. [#19224](https://github.com/ClickHouse/ClickHouse/pull/19224) ([Jianmei Zhang](https://github.com/zhangjmruc)).
* Support MetaKey+Enter hotkey binding in play UI. [#19012](https://github.com/ClickHouse/ClickHouse/pull/19012) ([sundyli](https://github.com/sundy-li)).
* Add three functions for map data type: 1. `mapContains(map, key)` to check weather map.keys include the second parameter key. 2. `mapKeys(map)` return all the keys in Array format 3. `mapValues(map)` return all the values in Array format. [#18788](https://github.com/ClickHouse/ClickHouse/pull/18788) ([hexiaoting](https://github.com/hexiaoting)).
* Add `log_comment` setting related to [#18494](https://github.com/ClickHouse/ClickHouse/issues/18494). [#18549](https://github.com/ClickHouse/ClickHouse/pull/18549) ([Zijie Lu](https://github.com/TszKitLo40)).
* Add support of tuple argument to `argMin` and `argMax` functions. [#17359](https://github.com/ClickHouse/ClickHouse/pull/17359) ([Ildus Kurbangaliev](https://github.com/ildus)).
* Support `EXISTS VIEW` syntax. [#18552](https://github.com/ClickHouse/ClickHouse/pull/18552) ([Du Chuan](https://github.com/spongedu)).
* Add `SELECT ALL` syntax. closes [#18706](https://github.com/ClickHouse/ClickHouse/issues/18706). [#18723](https://github.com/ClickHouse/ClickHouse/pull/18723) ([flynn](https://github.com/ucasFL)).
#### Performance Improvement
* Faster parts removal by lowering the number of `stat` syscalls. This returns the optimization that existed while ago. More safe interface of `IDisk`. This closes [#19065](https://github.com/ClickHouse/ClickHouse/issues/19065). [#19086](https://github.com/ClickHouse/ClickHouse/pull/19086) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Aliases declared in `WITH` statement are properly used in index analysis. Queries like `WITH column AS alias SELECT ... WHERE alias = ...` may use index now. [#18896](https://github.com/ClickHouse/ClickHouse/pull/18896) ([Amos Bird](https://github.com/amosbird)).
* Add `optimize_alias_column_prediction` (on by default), that will: - Respect aliased columns in WHERE during partition pruning and skipping data using secondary indexes; - Respect aliased columns in WHERE for trivial count queries for optimize_trivial_count; - Respect aliased columns in GROUP BY/ORDER BY for optimize_aggregation_in_order/optimize_read_in_order. [#16995](https://github.com/ClickHouse/ClickHouse/pull/16995) ([sundyli](https://github.com/sundy-li)).
* Speed up aggregate function `sum`. Improvement only visible on synthetic benchmarks and not very practical. [#19216](https://github.com/ClickHouse/ClickHouse/pull/19216) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Update libc++ and use another ABI to provide better performance. [#18914](https://github.com/ClickHouse/ClickHouse/pull/18914) ([Danila Kutenin](https://github.com/danlark1)).
* Rewrite `sumIf()` and `sum(if())` function to `countIf()` function when logically equivalent. [#17041](https://github.com/ClickHouse/ClickHouse/pull/17041) ([flynn](https://github.com/ucasFL)).
* Use a connection pool for S3 connections, controlled by the `s3_max_connections` settings. [#13405](https://github.com/ClickHouse/ClickHouse/pull/13405) ([Vladimir Chebotarev](https://github.com/excitoon)).
* Add support for zstd long option for better compression of string columns to save space. [#17184](https://github.com/ClickHouse/ClickHouse/pull/17184) ([ygrek](https://github.com/ygrek)).
* Slightly improve server latency by removing access to configuration on every connection. [#19863](https://github.com/ClickHouse/ClickHouse/pull/19863) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Reduce lock contention for multiple layers of the `Buffer` engine. [#19379](https://github.com/ClickHouse/ClickHouse/pull/19379) ([Azat Khuzhin](https://github.com/azat)).
* Support splitting `Filter` step of query plan into `Expression + Filter` pair. Together with `Expression + Expression` merging optimization ([#17458](https://github.com/ClickHouse/ClickHouse/issues/17458)) it may delay execution for some expressions after `Filter` step. [#19253](https://github.com/ClickHouse/ClickHouse/pull/19253) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
#### Improvement
* `SELECT count() FROM table` now can be executed if only one any column can be selected from the `table`. This PR fixes [#10639](https://github.com/ClickHouse/ClickHouse/issues/10639). [#18233](https://github.com/ClickHouse/ClickHouse/pull/18233) ([Vitaly Baranov](https://github.com/vitlibar)).
* Set charset to `utf8mb4` when interacting with remote MySQL servers. Fixes [#19795](https://github.com/ClickHouse/ClickHouse/issues/19795). [#19800](https://github.com/ClickHouse/ClickHouse/pull/19800) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* `S3` table function now supports `auto` compression mode (autodetect). This closes [#18754](https://github.com/ClickHouse/ClickHouse/issues/18754). [#19793](https://github.com/ClickHouse/ClickHouse/pull/19793) ([Vladimir Chebotarev](https://github.com/excitoon)).
* Correctly output infinite arguments for `formatReadableTimeDelta` function. In previous versions, there was implicit conversion to implementation specific integer value. [#19791](https://github.com/ClickHouse/ClickHouse/pull/19791) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Table function `S3` will use global region if the region can't be determined exactly. This closes [#10998](https://github.com/ClickHouse/ClickHouse/issues/10998). [#19750](https://github.com/ClickHouse/ClickHouse/pull/19750) ([Vladimir Chebotarev](https://github.com/excitoon)).
* In distributed queries if the setting `async_socket_for_remote` is enabled, it was possible to get stack overflow at least in debug build configuration if very deeply nested data type is used in table (e.g. `Array(Array(Array(...more...)))`). This fixes [#19108](https://github.com/ClickHouse/ClickHouse/issues/19108). This change introduces minor backward incompatibility: excessive parenthesis in type definitions no longer supported, example: `Array((UInt8))`. [#19736](https://github.com/ClickHouse/ClickHouse/pull/19736) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Add separate pool for message brokers (RabbitMQ and Kafka). [#19722](https://github.com/ClickHouse/ClickHouse/pull/19722) ([Azat Khuzhin](https://github.com/azat)).
* Fix rare `max_number_of_merges_with_ttl_in_pool` limit overrun (more merges with TTL can be assigned) for non-replicated MergeTree. [#19708](https://github.com/ClickHouse/ClickHouse/pull/19708) ([alesapin](https://github.com/alesapin)).
* Dictionary: better error message during attribute parsing. [#19678](https://github.com/ClickHouse/ClickHouse/pull/19678) ([Maksim Kita](https://github.com/kitaisreal)).
* Add an option to disable validation of checksums on reading. Should never be used in production. Please do not expect any benefits in disabling it. It may only be used for experiments and benchmarks. The setting only applicable for tables of MergeTree family. Checksums are always validated for other table engines and when receiving data over network. In my observations there is no performance difference or it is less than 0.5%. [#19588](https://github.com/ClickHouse/ClickHouse/pull/19588) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Support constant result in function `multiIf`. [#19533](https://github.com/ClickHouse/ClickHouse/pull/19533) ([Maksim Kita](https://github.com/kitaisreal)).
* Enable function length/empty/notEmpty for datatype Map, which returns keys number in Map. [#19530](https://github.com/ClickHouse/ClickHouse/pull/19530) ([taiyang-li](https://github.com/taiyang-li)).
* Add `--reconnect` option to `clickhouse-benchmark`. When this option is specified, it will reconnect before every request. This is needed for testing. [#19872](https://github.com/ClickHouse/ClickHouse/pull/19872) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Support using the new location of `.debug` file. This fixes [#19348](https://github.com/ClickHouse/ClickHouse/issues/19348). [#19520](https://github.com/ClickHouse/ClickHouse/pull/19520) ([Amos Bird](https://github.com/amosbird)).
* `toIPv6` function parses `IPv4` addresses. [#19518](https://github.com/ClickHouse/ClickHouse/pull/19518) ([Bharat Nallan](https://github.com/bharatnc)).
* Add `http_referer` field to `system.query_log`, `system.processes`, etc. This closes [#19389](https://github.com/ClickHouse/ClickHouse/issues/19389). [#19390](https://github.com/ClickHouse/ClickHouse/pull/19390) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Improve MySQL compatibility by making more functions case insensitive and adding aliases. [#19387](https://github.com/ClickHouse/ClickHouse/pull/19387) ([Daniil Kondratyev](https://github.com/dankondr)).
* Add metrics for MergeTree parts (Wide/Compact/InMemory) types. [#19381](https://github.com/ClickHouse/ClickHouse/pull/19381) ([Azat Khuzhin](https://github.com/azat)).
* Allow docker to be executed with arbitrary uid. [#19374](https://github.com/ClickHouse/ClickHouse/pull/19374) ([filimonov](https://github.com/filimonov)).
* Fix wrong alignment of values of `IPv4` data type in Pretty formats. They were aligned to the right, not to the left. This closes [#19184](https://github.com/ClickHouse/ClickHouse/issues/19184). [#19339](https://github.com/ClickHouse/ClickHouse/pull/19339) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Allow change `max_server_memory_usage` without restart. This closes [#18154](https://github.com/ClickHouse/ClickHouse/issues/18154). [#19186](https://github.com/ClickHouse/ClickHouse/pull/19186) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* The exception when function `bar` is called with certain NaN argument may be slightly misleading in previous versions. This fixes [#19088](https://github.com/ClickHouse/ClickHouse/issues/19088). [#19107](https://github.com/ClickHouse/ClickHouse/pull/19107) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Explicitly set uid / gid of clickhouse user & group to the fixed values (101) in clickhouse-server images. [#19096](https://github.com/ClickHouse/ClickHouse/pull/19096) ([filimonov](https://github.com/filimonov)).
* Fixed `PeekableReadBuffer: Memory limit exceed` error when inserting data with huge strings. Fixes [#18690](https://github.com/ClickHouse/ClickHouse/issues/18690). [#18979](https://github.com/ClickHouse/ClickHouse/pull/18979) ([tavplubix](https://github.com/tavplubix)).
* Docker image: several improvements for clickhouse-server entrypoint. [#18954](https://github.com/ClickHouse/ClickHouse/pull/18954) ([filimonov](https://github.com/filimonov)).
* Add `normalizeQueryKeepNames` and `normalizedQueryHashKeepNames` to normalize queries without masking long names with `?`. This helps better analyze complex query logs. [#18910](https://github.com/ClickHouse/ClickHouse/pull/18910) ([Amos Bird](https://github.com/amosbird)).
* Check per-block checksum of the distributed batch on the sender before sending (without reading the file twice, the checksums will be verified while reading), this will avoid stuck of the INSERT on the receiver (on truncated .bin file on the sender). Avoid reading .bin files twice for batched INSERT (it was required to calculate rows/bytes to take squashing into account, now this information included into the header, backward compatible is preserved). [#18853](https://github.com/ClickHouse/ClickHouse/pull/18853) ([Azat Khuzhin](https://github.com/azat)).
* Fix issues with RIGHT and FULL JOIN of tables with aggregate function states. In previous versions exception about `cloneResized` method was thrown. [#18818](https://github.com/ClickHouse/ClickHouse/pull/18818) ([templarzq](https://github.com/templarzq)).
* Added prefix-based S3 endpoint settings. [#18812](https://github.com/ClickHouse/ClickHouse/pull/18812) ([Vladimir Chebotarev](https://github.com/excitoon)).
* Add [UInt8, UInt16, UInt32, UInt64] arguments types support for bitmapTransform, bitmapSubsetInRange, bitmapSubsetLimit, bitmapContains functions. This closes [#18713](https://github.com/ClickHouse/ClickHouse/issues/18713). [#18791](https://github.com/ClickHouse/ClickHouse/pull/18791) ([sundyli](https://github.com/sundy-li)).
* Allow CTE (Common Table Expressions) to be further aliased. Propagate CSE (Common Subexpressions Elimination) to subqueries in the same level when `enable_global_with_statement = 1`. This fixes [#17378](https://github.com/ClickHouse/ClickHouse/issues/17378) . This fixes https://github.com/ClickHouse/ClickHouse/pull/16575#issuecomment-753416235 . [#18684](https://github.com/ClickHouse/ClickHouse/pull/18684) ([Amos Bird](https://github.com/amosbird)).
* Update librdkafka to v1.6.0-RC2. Fixes [#18668](https://github.com/ClickHouse/ClickHouse/issues/18668). [#18671](https://github.com/ClickHouse/ClickHouse/pull/18671) ([filimonov](https://github.com/filimonov)).
* In case of unexpected exceptions automatically restart background thread which is responsible for execution of distributed DDL queries. Fixes [#17991](https://github.com/ClickHouse/ClickHouse/issues/17991). [#18285](https://github.com/ClickHouse/ClickHouse/pull/18285) ([徐炘](https://github.com/weeds085490)).
* Updated AWS C++ SDK in order to utilize global regions in S3. [#17870](https://github.com/ClickHouse/ClickHouse/pull/17870) ([Vladimir Chebotarev](https://github.com/excitoon)).
* Added support for `WITH ... [AND] [PERIODIC] REFRESH [interval_in_sec]` clause when creating `LIVE VIEW` tables. [#14822](https://github.com/ClickHouse/ClickHouse/pull/14822) ([vzakaznikov](https://github.com/vzakaznikov)).
* Restrict `MODIFY TTL` queries for `MergeTree` tables created in old syntax. Previously the query succeeded, but actually it had no effect. [#19064](https://github.com/ClickHouse/ClickHouse/pull/19064) ([Anton Popov](https://github.com/CurtizJ)).
#### Bug Fix
* Fix index analysis of binary functions with constant argument which leads to wrong query results. This fixes [#18364](https://github.com/ClickHouse/ClickHouse/issues/18364). [#18373](https://github.com/ClickHouse/ClickHouse/pull/18373) ([Amos Bird](https://github.com/amosbird)).
* Fix starting the server with tables having default expressions containing dictGet(). Allow getting return type of dictGet() without loading dictionary. [#19805](https://github.com/ClickHouse/ClickHouse/pull/19805) ([Vitaly Baranov](https://github.com/vitlibar)).
* Fix server crash after query with `if` function with `Tuple` type of then/else branches result. `Tuple` type must contain `Array` or another complex type. Fixes [#18356](https://github.com/ClickHouse/ClickHouse/issues/18356). [#20133](https://github.com/ClickHouse/ClickHouse/pull/20133) ([alesapin](https://github.com/alesapin)).
* `MaterializeMySQL` (experimental feature): Fix replication for statements that update several tables. [#20066](https://github.com/ClickHouse/ClickHouse/pull/20066) ([Håvard Kvålen](https://github.com/havardk)).
* Prevent "Connection refused" in docker during initialization script execution. [#20012](https://github.com/ClickHouse/ClickHouse/pull/20012) ([filimonov](https://github.com/filimonov)).
* `EmbeddedRocksDB` is an experimental storage. Fix the issue with lack of proper type checking. Simplified code. This closes [#19967](https://github.com/ClickHouse/ClickHouse/issues/19967). [#19972](https://github.com/ClickHouse/ClickHouse/pull/19972) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fix a segfault in function `fromModifiedJulianDay` when the argument type is `Nullable(T)` for any integral types other than Int32. [#19959](https://github.com/ClickHouse/ClickHouse/pull/19959) ([PHO](https://github.com/depressed-pho)).
* The function `greatCircleAngle` returned inaccurate results in previous versions. This closes [#19769](https://github.com/ClickHouse/ClickHouse/issues/19769). [#19789](https://github.com/ClickHouse/ClickHouse/pull/19789) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fix rare bug when some replicated operations (like mutation) cannot process some parts after data corruption. Fixes [#19593](https://github.com/ClickHouse/ClickHouse/issues/19593). [#19702](https://github.com/ClickHouse/ClickHouse/pull/19702) ([alesapin](https://github.com/alesapin)).
* Background thread which executes `ON CLUSTER` queries might hang waiting for dropped replicated table to do something. It's fixed. [#19684](https://github.com/ClickHouse/ClickHouse/pull/19684) ([yiguolei](https://github.com/yiguolei)).
* Fix wrong deserialization of columns description. It makes INSERT into a table with a column named `\` impossible. [#19479](https://github.com/ClickHouse/ClickHouse/pull/19479) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Mark distributed batch as broken in case of empty data block in one of files. [#19449](https://github.com/ClickHouse/ClickHouse/pull/19449) ([Azat Khuzhin](https://github.com/azat)).
* Fixed very rare bug that might cause mutation to hang after `DROP/DETACH/REPLACE/MOVE PARTITION`. It was partially fixed by [#15537](https://github.com/ClickHouse/ClickHouse/issues/15537) for the most cases. [#19443](https://github.com/ClickHouse/ClickHouse/pull/19443) ([tavplubix](https://github.com/tavplubix)).
* Fix possible error `Extremes transform was already added to pipeline`. Fixes [#14100](https://github.com/ClickHouse/ClickHouse/issues/14100). [#19430](https://github.com/ClickHouse/ClickHouse/pull/19430) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix default value in join types with non-zero default (e.g. some Enums). Closes [#18197](https://github.com/ClickHouse/ClickHouse/issues/18197). [#19360](https://github.com/ClickHouse/ClickHouse/pull/19360) ([vdimir](https://github.com/vdimir)).
* Do not mark file for distributed send as broken on EOF. [#19290](https://github.com/ClickHouse/ClickHouse/pull/19290) ([Azat Khuzhin](https://github.com/azat)).
* Fix leaking of pipe fd for `async_socket_for_remote`. [#19153](https://github.com/ClickHouse/ClickHouse/pull/19153) ([Azat Khuzhin](https://github.com/azat)).
* Fix infinite reading from file in `ORC` format (was introduced in [#10580](https://github.com/ClickHouse/ClickHouse/issues/10580)). Fixes [#19095](https://github.com/ClickHouse/ClickHouse/issues/19095). [#19134](https://github.com/ClickHouse/ClickHouse/pull/19134) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix issue in merge tree data writer which can lead to marks with bigger size than fixed granularity size. Fixes [#18913](https://github.com/ClickHouse/ClickHouse/issues/18913). [#19123](https://github.com/ClickHouse/ClickHouse/pull/19123) ([alesapin](https://github.com/alesapin)).
* Fix startup bug when clickhouse was not able to read compression codec from `LowCardinality(Nullable(...))` and throws exception `Attempt to read after EOF`. Fixes [#18340](https://github.com/ClickHouse/ClickHouse/issues/18340). [#19101](https://github.com/ClickHouse/ClickHouse/pull/19101) ([alesapin](https://github.com/alesapin)).
* Simplify the implementation of `tupleHammingDistance`. Support for tuples of any equal length. Fixes [#19029](https://github.com/ClickHouse/ClickHouse/issues/19029). [#19084](https://github.com/ClickHouse/ClickHouse/pull/19084) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Make sure `groupUniqArray` returns correct type for argument of Enum type. This closes [#17875](https://github.com/ClickHouse/ClickHouse/issues/17875). [#19019](https://github.com/ClickHouse/ClickHouse/pull/19019) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fix possible error `Expected single dictionary argument for function` if use function `ignore` with `LowCardinality` argument. Fixes [#14275](https://github.com/ClickHouse/ClickHouse/issues/14275). [#19016](https://github.com/ClickHouse/ClickHouse/pull/19016) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix inserting of `LowCardinality` column to table with `TinyLog` engine. Fixes [#18629](https://github.com/ClickHouse/ClickHouse/issues/18629). [#19010](https://github.com/ClickHouse/ClickHouse/pull/19010) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix minor issue in JOIN: Join tries to materialize const columns, but our code waits for them in other places. [#18982](https://github.com/ClickHouse/ClickHouse/pull/18982) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* Disable `optimize_move_functions_out_of_any` because optimization is not always correct. This closes [#18051](https://github.com/ClickHouse/ClickHouse/issues/18051). This closes [#18973](https://github.com/ClickHouse/ClickHouse/issues/18973). [#18981](https://github.com/ClickHouse/ClickHouse/pull/18981) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fix possible exception `QueryPipeline stream: different number of columns` caused by merging of query plan's `Expression` steps. Fixes [#18190](https://github.com/ClickHouse/ClickHouse/issues/18190). [#18980](https://github.com/ClickHouse/ClickHouse/pull/18980) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fixed very rare deadlock at shutdown. [#18977](https://github.com/ClickHouse/ClickHouse/pull/18977) ([tavplubix](https://github.com/tavplubix)).
* Fixed rare crashes when server run out of memory. [#18976](https://github.com/ClickHouse/ClickHouse/pull/18976) ([tavplubix](https://github.com/tavplubix)).
* Fix incorrect behavior when `ALTER TABLE ... DROP PART 'part_name'` query removes all deduplication blocks for the whole partition. Fixes [#18874](https://github.com/ClickHouse/ClickHouse/issues/18874). [#18969](https://github.com/ClickHouse/ClickHouse/pull/18969) ([alesapin](https://github.com/alesapin)).
* Fixed issue [#18894](https://github.com/ClickHouse/ClickHouse/issues/18894) Add a check to avoid exception when long column alias('table.column' style, usually auto-generated by BI tools like Looker) equals to long table name. [#18968](https://github.com/ClickHouse/ClickHouse/pull/18968) ([Daniel Qin](https://github.com/mathfool)).
* Fix error `Task was not found in task queue` (possible only for remote queries, with `async_socket_for_remote = 1`). [#18964](https://github.com/ClickHouse/ClickHouse/pull/18964) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix bug when mutation with some escaped text (like `ALTER ... UPDATE e = CAST('foo', 'Enum8(\'foo\' = 1')` serialized incorrectly. Fixes [#18878](https://github.com/ClickHouse/ClickHouse/issues/18878). [#18944](https://github.com/ClickHouse/ClickHouse/pull/18944) ([alesapin](https://github.com/alesapin)).
* ATTACH PARTITION will reset mutations. [#18804](https://github.com/ClickHouse/ClickHouse/issues/18804). [#18935](https://github.com/ClickHouse/ClickHouse/pull/18935) ([fastio](https://github.com/fastio)).
* Fix issue with `bitmapOrCardinality` that may lead to nullptr dereference. This closes [#18911](https://github.com/ClickHouse/ClickHouse/issues/18911). [#18912](https://github.com/ClickHouse/ClickHouse/pull/18912) ([sundyli](https://github.com/sundy-li)).
* Fixed `Attempt to read after eof` error when trying to `CAST` `NULL` from `Nullable(String)` to `Nullable(Decimal(P, S))`. Now function `CAST` returns `NULL` when it cannot parse decimal from nullable string. Fixes [#7690](https://github.com/ClickHouse/ClickHouse/issues/7690). [#18718](https://github.com/ClickHouse/ClickHouse/pull/18718) ([Winter Zhang](https://github.com/zhang2014)).
* Fix data type convert issue for MySQL engine. [#18124](https://github.com/ClickHouse/ClickHouse/pull/18124) ([bo zeng](https://github.com/mis98zb)).
* Fix clickhouse-client abort exception while executing only `select`. [#19790](https://github.com/ClickHouse/ClickHouse/pull/19790) ([taiyang-li](https://github.com/taiyang-li)).
#### Build/Testing/Packaging Improvement
* Run [SQLancer](https://twitter.com/RiggerManuel/status/1352345625480884228) (logical SQL fuzzer) in CI. [#19006](https://github.com/ClickHouse/ClickHouse/pull/19006) ([Ilya Yatsishin](https://github.com/qoega)).
* Query Fuzzer will fuzz newly added tests more extensively. This closes [#18916](https://github.com/ClickHouse/ClickHouse/issues/18916). [#19185](https://github.com/ClickHouse/ClickHouse/pull/19185) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Integrate with [Big List of Naughty Strings](https://github.com/minimaxir/big-list-of-naughty-strings/) for better fuzzing. [#19480](https://github.com/ClickHouse/ClickHouse/pull/19480) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Add integration tests run with MSan. [#18974](https://github.com/ClickHouse/ClickHouse/pull/18974) ([alesapin](https://github.com/alesapin)).
* Fixed MemorySanitizer errors in cyrus-sasl and musl. [#19821](https://github.com/ClickHouse/ClickHouse/pull/19821) ([Ilya Yatsishin](https://github.com/qoega)).
* Insuffiient arguments check in `positionCaseInsensitiveUTF8` function triggered address sanitizer. [#19720](https://github.com/ClickHouse/ClickHouse/pull/19720) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Remove --project-directory for docker-compose in integration test. Fix logs formatting from docker container. [#19706](https://github.com/ClickHouse/ClickHouse/pull/19706) ([Ilya Yatsishin](https://github.com/qoega)).
* Made generation of macros.xml easier for integration tests. No more excessive logging from dicttoxml. dicttoxml project is not active for 5+ years. [#19697](https://github.com/ClickHouse/ClickHouse/pull/19697) ([Ilya Yatsishin](https://github.com/qoega)).
* Allow to explicitly enable or disable watchdog via environment variable `CLICKHOUSE_WATCHDOG_ENABLE`. By default it is enabled if server is not attached to terminal. [#19522](https://github.com/ClickHouse/ClickHouse/pull/19522) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Allow building ClickHouse with Kafka support on arm64. [#19369](https://github.com/ClickHouse/ClickHouse/pull/19369) ([filimonov](https://github.com/filimonov)).
* Allow building librdkafka without ssl. [#19337](https://github.com/ClickHouse/ClickHouse/pull/19337) ([filimonov](https://github.com/filimonov)).
* Restore Kafka input in FreeBSD builds. [#18924](https://github.com/ClickHouse/ClickHouse/pull/18924) ([Alexandre Snarskii](https://github.com/snar)).
* Fix potential nullptr dereference in table function `VALUES`. [#19357](https://github.com/ClickHouse/ClickHouse/pull/19357) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Avoid UBSan reports in `arrayElement` function, `substring` and `arraySum`. Fixes [#19305](https://github.com/ClickHouse/ClickHouse/issues/19305). Fixes [#19287](https://github.com/ClickHouse/ClickHouse/issues/19287). This closes [#19336](https://github.com/ClickHouse/ClickHouse/issues/19336). [#19347](https://github.com/ClickHouse/ClickHouse/pull/19347) ([alexey-milovidov](https://github.com/alexey-milovidov)).
## ClickHouse release 21.1
### ClickHouse release v21.1.3.32-stable, 2021-02-03
#### Bug Fix
* BloomFilter index crash fix. Fixes [#19757](https://github.com/ClickHouse/ClickHouse/issues/19757). [#19884](https://github.com/ClickHouse/ClickHouse/pull/19884) ([Maksim Kita](https://github.com/kitaisreal)).
* Fix crash when pushing down predicates to union distinct subquery. This fixes [#19855](https://github.com/ClickHouse/ClickHouse/issues/19855). [#19861](https://github.com/ClickHouse/ClickHouse/pull/19861) ([Amos Bird](https://github.com/amosbird)).
* Fix filtering by UInt8 greater than 127. [#19799](https://github.com/ClickHouse/ClickHouse/pull/19799) ([Anton Popov](https://github.com/CurtizJ)).
* In previous versions, unusual arguments for function arrayEnumerateUniq may cause crash or infinite loop. This closes [#19787](https://github.com/ClickHouse/ClickHouse/issues/19787). [#19788](https://github.com/ClickHouse/ClickHouse/pull/19788) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed stack overflow when using accurate comparison of arithmetic type with string type. [#19773](https://github.com/ClickHouse/ClickHouse/pull/19773) ([tavplubix](https://github.com/tavplubix)).
* Fix crash when nested column name was used in `WHERE` or `PREWHERE`. Fixes [#19755](https://github.com/ClickHouse/ClickHouse/issues/19755). [#19763](https://github.com/ClickHouse/ClickHouse/pull/19763) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix a segmentation fault in `bitmapAndnot` function. Fixes [#19668](https://github.com/ClickHouse/ClickHouse/issues/19668). [#19713](https://github.com/ClickHouse/ClickHouse/pull/19713) ([Maksim Kita](https://github.com/kitaisreal)).
* Some functions with big integers may cause segfault. Big integers is experimental feature. This closes [#19667](https://github.com/ClickHouse/ClickHouse/issues/19667). [#19672](https://github.com/ClickHouse/ClickHouse/pull/19672) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fix wrong result of function `neighbor` for `LowCardinality` argument. Fixes [#10333](https://github.com/ClickHouse/ClickHouse/issues/10333). [#19617](https://github.com/ClickHouse/ClickHouse/pull/19617) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix use-after-free of the CompressedWriteBuffer in Connection after disconnect. [#19599](https://github.com/ClickHouse/ClickHouse/pull/19599) ([Azat Khuzhin](https://github.com/azat)).
* `DROP/DETACH TABLE table ON CLUSTER cluster SYNC` query might hang, it's fixed. Fixes [#19568](https://github.com/ClickHouse/ClickHouse/issues/19568). [#19572](https://github.com/ClickHouse/ClickHouse/pull/19572) ([tavplubix](https://github.com/tavplubix)).
* Query CREATE DICTIONARY id expression fix. [#19571](https://github.com/ClickHouse/ClickHouse/pull/19571) ([Maksim Kita](https://github.com/kitaisreal)).
* Fix SIGSEGV with merge_tree_min_rows_for_concurrent_read/merge_tree_min_bytes_for_concurrent_read=0/UINT64_MAX. [#19528](https://github.com/ClickHouse/ClickHouse/pull/19528) ([Azat Khuzhin](https://github.com/azat)).
* Buffer overflow (on memory read) was possible if `addMonth` function was called with specifically crafted arguments. This fixes [#19441](https://github.com/ClickHouse/ClickHouse/issues/19441). This fixes [#19413](https://github.com/ClickHouse/ClickHouse/issues/19413). [#19472](https://github.com/ClickHouse/ClickHouse/pull/19472) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Uninitialized memory read was possible in encrypt/decrypt functions if empty string was passed as IV. This closes [#19391](https://github.com/ClickHouse/ClickHouse/issues/19391). [#19397](https://github.com/ClickHouse/ClickHouse/pull/19397) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fix possible buffer overflow in Uber H3 library. See https://github.com/uber/h3/issues/392. This closes [#19219](https://github.com/ClickHouse/ClickHouse/issues/19219). [#19383](https://github.com/ClickHouse/ClickHouse/pull/19383) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fix system.parts _state column (LOGICAL_ERROR when querying this column, due to incorrect order). [#19346](https://github.com/ClickHouse/ClickHouse/pull/19346) ([Azat Khuzhin](https://github.com/azat)).
* Fixed possible wrong result or segfault on aggregation when Materialized View and its target table have different structure. Fixes [#18063](https://github.com/ClickHouse/ClickHouse/issues/18063). [#19322](https://github.com/ClickHouse/ClickHouse/pull/19322) ([tavplubix](https://github.com/tavplubix)).
* Fix error `Cannot convert column now64() because it is constant but values of constants are different in source and result`. Continuation of [#7156](https://github.com/ClickHouse/ClickHouse/issues/7156). [#19316](https://github.com/ClickHouse/ClickHouse/pull/19316) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix bug when concurrent `ALTER` and `DROP` queries may hang while processing ReplicatedMergeTree table. [#19237](https://github.com/ClickHouse/ClickHouse/pull/19237) ([alesapin](https://github.com/alesapin)).
* Fixed `There is no checkpoint` error when inserting data through http interface using `Template` or `CustomSeparated` format. Fixes [#19021](https://github.com/ClickHouse/ClickHouse/issues/19021). [#19072](https://github.com/ClickHouse/ClickHouse/pull/19072) ([tavplubix](https://github.com/tavplubix)).
* Disable constant folding for subqueries on the analysis stage, when the result cannot be calculated. [#18446](https://github.com/ClickHouse/ClickHouse/pull/18446) ([Azat Khuzhin](https://github.com/azat)).
* Mutation might hang waiting for some non-existent part after `MOVE` or `REPLACE PARTITION` or, in rare cases, after `DETACH` or `DROP PARTITION`. It's fixed. [#15537](https://github.com/ClickHouse/ClickHouse/pull/15537) ([tavplubix](https://github.com/tavplubix)).
### ClickHouse release v21.1.2.15-stable 2021-01-18
#### Backward Incompatible Change

View File

@ -8,12 +8,8 @@ ClickHouse® is an open-source column-oriented database management system that a
* [Tutorial](https://clickhouse.tech/docs/en/getting_started/tutorial/) shows how to set up and query small ClickHouse cluster.
* [Documentation](https://clickhouse.tech/docs/en/) provides more in-depth information.
* [YouTube channel](https://www.youtube.com/c/ClickHouseDB) has a lot of content about ClickHouse in video format.
* [Slack](https://join.slack.com/t/clickhousedb/shared_invite/zt-d2zxkf9e-XyxDa_ucfPxzuH4SJIm~Ng) and [Telegram](https://telegram.me/clickhouse_en) allow to chat with ClickHouse users in real-time.
* [Slack](https://join.slack.com/t/clickhousedb/shared_invite/zt-ly9m4w1x-6j7x5Ts_pQZqrctAbRZ3cg) and [Telegram](https://telegram.me/clickhouse_en) allow to chat with ClickHouse users in real-time.
* [Blog](https://clickhouse.yandex/blog/en/) contains various ClickHouse-related articles, as well as announcements and reports about events.
* [Code Browser](https://clickhouse.tech/codebrowser/html_report/ClickHouse/index.html) with syntax highlight and navigation.
* [Yandex.Messenger channel](https://yandex.ru/chat/#/join/20e380d9-c7be-4123-ab06-e95fb946975e) shares announcements and useful links in Russian.
* [Contacts](https://clickhouse.tech/#contacts) can help to get your questions answered if there are any.
* You can also [fill this form](https://clickhouse.tech/#meet) to meet Yandex ClickHouse team in person.
## Upcoming Events
* [Chinese ClickHouse Meetup (online)](http://hdxu.cn/8KxZE) on 6 February 2021.

View File

@ -7,6 +7,7 @@
#include <ctime>
#include <string>
#define DATE_LUT_MAX (0xFFFFFFFFU - 86400)
#define DATE_LUT_MAX_DAY_NUM (0xFFFFFFFFU / 86400)
/// Table size is bigger than DATE_LUT_MAX_DAY_NUM to fill all indices within UInt16 range: this allows to remove extra check.
@ -249,7 +250,7 @@ public:
{
DayNum index = findIndex(t);
if (unlikely(index == 0))
if (unlikely(index == 0 || index > DATE_LUT_MAX_DAY_NUM))
return t + offset_at_start_of_epoch;
time_t res = t - lut[index].date;
@ -264,18 +265,43 @@ public:
{
DayNum index = findIndex(t);
/// If it is not 1970 year (findIndex found nothing appropriate),
/// than limit number of hours to avoid insane results like 1970-01-01 89:28:15
if (unlikely(index == 0))
/// If it is overflow case,
/// then limit number of hours to avoid insane results like 1970-01-01 89:28:15
if (unlikely(index == 0 || index > DATE_LUT_MAX_DAY_NUM))
return static_cast<unsigned>((t + offset_at_start_of_epoch) / 3600) % 24;
time_t res = t - lut[index].date;
time_t time = t - lut[index].date;
/// Data is cleaned to avoid possibility of underflow.
if (res >= lut[index].time_at_offset_change)
if (time >= lut[index].time_at_offset_change)
time += lut[index].amount_of_offset_change;
unsigned res = time / 3600;
return res <= 23 ? res : 0;
}
/** Calculating offset from UTC in seconds.
* which means Using the same literal time of "t" to get the corresponding timestamp in UTC,
* then subtract the former from the latter to get the offset result.
* The boundaries when meets DST(daylight saving time) change should be handled very carefully.
*/
inline time_t timezoneOffset(time_t t) const
{
DayNum index = findIndex(t);
/// Calculate daylight saving offset first.
/// Because the "amount_of_offset_change" in LUT entry only exists in the change day, it's costly to scan it from the very begin.
/// but we can figure out all the accumulated offsets from 1970-01-01 to that day just by get the whole difference between lut[].date,
/// and then, we can directly subtract multiple 86400s to get the real DST offsets for the leap seconds is not considered now.
time_t res = (lut[index].date - lut[0].date) % 86400;
/// As so far to know, the maximal DST offset couldn't be more than 2 hours, so after the modulo operation the remainder
/// will sits between [-offset --> 0 --> offset] which respectively corresponds to moving clock forward or backward.
res = res > 43200 ? (86400 - res) : (0 - res);
/// Check if has a offset change during this day. Add the change when cross the line
if (lut[index].amount_of_offset_change != 0 && t >= lut[index].date + lut[index].time_at_offset_change)
res += lut[index].amount_of_offset_change;
return res / 3600;
return res + offset_at_start_of_epoch;
}
/** Only for time zones with/when offset from UTC is multiple of five minutes.
@ -289,15 +315,21 @@ public:
* each minute, with added or subtracted leap second, spans exactly 60 unix timestamps.
*/
inline unsigned toSecond(time_t t) const { return t % 60; }
inline unsigned toSecond(time_t t) const { return UInt32(t) % 60; }
inline unsigned toMinute(time_t t) const
{
if (offset_is_whole_number_of_hours_everytime)
return (t / 60) % 60;
return (UInt32(t) / 60) % 60;
UInt32 date = find(t).date;
return (UInt32(t) - date) / 60 % 60;
/// To consider the DST changing situation within this day.
/// also make the special timezones with no whole hour offset such as 'Australia/Lord_Howe' been taken into account
DayNum index = findIndex(t);
UInt32 res = t - lut[index].date;
if (lut[index].amount_of_offset_change != 0 && t >= lut[index].date + lut[index].time_at_offset_change)
res += lut[index].amount_of_offset_change;
return res / 60 % 60;
}
inline time_t toStartOfMinute(time_t t) const { return t / 60 * 60; }
@ -530,9 +562,7 @@ public:
}
}
/*
* check and change mode to effective
*/
/// Check and change mode to effective.
inline UInt8 check_week_mode(UInt8 mode) const
{
UInt8 week_format = (mode & 7);
@ -541,9 +571,8 @@ public:
return week_format;
}
/*
* Calc weekday from d
* Returns 0 for monday, 1 for tuesday ...
/** Calculate weekday from d.
* Returns 0 for monday, 1 for tuesday...
*/
inline unsigned calc_weekday(DayNum d, bool sunday_first_day_of_week) const
{
@ -553,7 +582,7 @@ public:
return toDayOfWeek(DayNum(d + 1)) - 1;
}
/* Calc days in one year. */
/// Calculate days in one year.
inline unsigned calc_days_in_year(UInt16 year) const
{
return ((year & 3) == 0 && (year % 100 || (year % 400 == 0 && year)) ? 366 : 365);
@ -851,7 +880,7 @@ public:
}
/// Saturation can occur if 29 Feb is mapped to non-leap year.
inline time_t addYears(time_t t, Int64 delta) const
inline NO_SANITIZE_UNDEFINED time_t addYears(time_t t, Int64 delta) const
{
DayNum result_day = addYears(toDayNum(t), delta);
@ -863,7 +892,7 @@ public:
return lut[result_day].date + time_offset;
}
inline DayNum addYears(DayNum d, Int64 delta) const
inline NO_SANITIZE_UNDEFINED DayNum addYears(DayNum d, Int64 delta) const
{
const Values & values = lut[d];

View File

@ -168,14 +168,6 @@ public:
static_assert(sizeof(LocalDate) == 4);
inline std::ostream & operator<< (std::ostream & ostr, const LocalDate & date)
{
return ostr << date.year()
<< '-' << (date.month() / 10) << (date.month() % 10)
<< '-' << (date.day() / 10) << (date.day() % 10);
}
namespace std
{
inline string to_string(const LocalDate & date)

View File

@ -169,20 +169,6 @@ public:
static_assert(sizeof(LocalDateTime) == 8);
inline std::ostream & operator<< (std::ostream & ostr, const LocalDateTime & datetime)
{
ostr << std::setfill('0') << std::setw(4) << datetime.year();
ostr << '-' << (datetime.month() / 10) << (datetime.month() % 10)
<< '-' << (datetime.day() / 10) << (datetime.day() % 10)
<< ' ' << (datetime.hour() / 10) << (datetime.hour() % 10)
<< ':' << (datetime.minute() / 10) << (datetime.minute() % 10)
<< ':' << (datetime.second() / 10) << (datetime.second() % 10);
return ostr;
}
namespace std
{
inline string to_string(const LocalDateTime & datetime)

View File

@ -12,6 +12,8 @@
#include <dlfcn.h>
#include <fcntl.h>
#include <fstream>
#include <fmt/format.h>
namespace
{
@ -189,8 +191,8 @@ void ReplxxLineReader::openEditor()
return;
}
String editor = std::getenv("EDITOR");
if (editor.empty())
const char * editor = std::getenv("EDITOR");
if (!editor || !*editor)
editor = "vim";
replxx::Replxx::State state(rx.get_state());
@ -204,7 +206,7 @@ void ReplxxLineReader::openEditor()
if ((-1 == res || 0 == res) && errno != EINTR)
{
rx.print("Cannot write to temporary query file %s: %s\n", filename, errnoToString(errno).c_str());
return;
break;
}
bytes_written += res;
}
@ -215,7 +217,7 @@ void ReplxxLineReader::openEditor()
return;
}
if (0 == execute(editor + " " + filename))
if (0 == execute(fmt::format("{} {}", editor, filename)))
{
try
{

View File

@ -1,9 +1,30 @@
#pragma once
#include <common/extended_types.h>
#include <common/defines.h>
namespace common
{
/// Multiply and ignore overflow.
template <typename T1, typename T2>
inline auto NO_SANITIZE_UNDEFINED mulIgnoreOverflow(T1 x, T2 y)
{
return x * y;
}
template <typename T1, typename T2>
inline auto NO_SANITIZE_UNDEFINED addIgnoreOverflow(T1 x, T2 y)
{
return x + y;
}
template <typename T1, typename T2>
inline auto NO_SANITIZE_UNDEFINED subIgnoreOverflow(T1 x, T2 y)
{
return x - y;
}
template <typename T>
inline bool addOverflow(T x, T y, T & res)
{
@ -33,14 +54,14 @@ namespace common
{
static constexpr __int128 min_int128 = minInt128();
static constexpr __int128 max_int128 = maxInt128();
res = x + y;
res = addIgnoreOverflow(x, y);
return (y > 0 && x > max_int128 - y) || (y < 0 && x < min_int128 - y);
}
template <>
inline bool addOverflow(wInt256 x, wInt256 y, wInt256 & res)
{
res = x + y;
res = addIgnoreOverflow(x, y);
return (y > 0 && x > std::numeric_limits<wInt256>::max() - y) ||
(y < 0 && x < std::numeric_limits<wInt256>::min() - y);
}
@ -48,7 +69,7 @@ namespace common
template <>
inline bool addOverflow(wUInt256 x, wUInt256 y, wUInt256 & res)
{
res = x + y;
res = addIgnoreOverflow(x, y);
return x > std::numeric_limits<wUInt256>::max() - y;
}
@ -81,14 +102,14 @@ namespace common
{
static constexpr __int128 min_int128 = minInt128();
static constexpr __int128 max_int128 = maxInt128();
res = x - y;
res = subIgnoreOverflow(x, y);
return (y < 0 && x > max_int128 + y) || (y > 0 && x < min_int128 + y);
}
template <>
inline bool subOverflow(wInt256 x, wInt256 y, wInt256 & res)
{
res = x - y;
res = subIgnoreOverflow(x, y);
return (y < 0 && x > std::numeric_limits<wInt256>::max() + y) ||
(y > 0 && x < std::numeric_limits<wInt256>::min() + y);
}
@ -96,7 +117,7 @@ namespace common
template <>
inline bool subOverflow(wUInt256 x, wUInt256 y, wUInt256 & res)
{
res = x - y;
res = subIgnoreOverflow(x, y);
return x < y;
}
@ -127,33 +148,33 @@ namespace common
template <>
inline bool mulOverflow(__int128 x, __int128 y, __int128 & res)
{
res = static_cast<unsigned __int128>(x) * static_cast<unsigned __int128>(y); /// Avoid signed integer overflow.
res = mulIgnoreOverflow(x, y);
if (!x || !y)
return false;
unsigned __int128 a = (x > 0) ? x : -x;
unsigned __int128 b = (y > 0) ? y : -y;
return (a * b) / b != a;
return mulIgnoreOverflow(a, b) / b != a;
}
template <>
inline bool mulOverflow(wInt256 x, wInt256 y, wInt256 & res)
{
res = x * y;
res = mulIgnoreOverflow(x, y);
if (!x || !y)
return false;
wInt256 a = (x > 0) ? x : -x;
wInt256 b = (y > 0) ? y : -y;
return (a * b) / b != a;
return mulIgnoreOverflow(a, b) / b != a;
}
template <>
inline bool mulOverflow(wUInt256 x, wUInt256 y, wUInt256 & res)
{
res = x * y;
res = mulIgnoreOverflow(x, y);
if (!x || !y)
return false;
return (x * y) / y != x;
return res / y != x;
}
}

View File

@ -1,5 +1,20 @@
#pragma once
/// __has_feature supported only by clang.
///
/// But libcxx/libcxxabi overrides it to 0,
/// thus the checks for __has_feature will be wrong.
///
/// NOTE:
/// - __has_feature cannot be simply undefined,
/// since this will be broken if some C++ header will be included after
/// including <common/defines.h>
/// - it should not have fallback to 0,
/// since this may create false-positive detection (common problem)
#if defined(__clang__) && defined(__has_feature)
# define ch_has_feature __has_feature
#endif
#if defined(_MSC_VER)
# if !defined(likely)
# define likely(x) (x)
@ -32,8 +47,8 @@
/// Check for presence of address sanitizer
#if !defined(ADDRESS_SANITIZER)
# if defined(__has_feature)
# if __has_feature(address_sanitizer)
# if defined(ch_has_feature)
# if ch_has_feature(address_sanitizer)
# define ADDRESS_SANITIZER 1
# endif
# elif defined(__SANITIZE_ADDRESS__)
@ -42,8 +57,8 @@
#endif
#if !defined(THREAD_SANITIZER)
# if defined(__has_feature)
# if __has_feature(thread_sanitizer)
# if defined(ch_has_feature)
# if ch_has_feature(thread_sanitizer)
# define THREAD_SANITIZER 1
# endif
# elif defined(__SANITIZE_THREAD__)
@ -52,8 +67,8 @@
#endif
#if !defined(MEMORY_SANITIZER)
# if defined(__has_feature)
# if __has_feature(memory_sanitizer)
# if defined(ch_has_feature)
# if ch_has_feature(memory_sanitizer)
# define MEMORY_SANITIZER 1
# endif
# elif defined(__MEMORY_SANITIZER__)
@ -84,10 +99,12 @@
# define NO_SANITIZE_UNDEFINED __attribute__((__no_sanitize__("undefined")))
# define NO_SANITIZE_ADDRESS __attribute__((__no_sanitize__("address")))
# define NO_SANITIZE_THREAD __attribute__((__no_sanitize__("thread")))
# define ALWAYS_INLINE_NO_SANITIZE_UNDEFINED __attribute__((__always_inline__, __no_sanitize__("undefined")))
#else /// It does not work in GCC. GCC 7 cannot recognize this attribute and GCC 8 simply ignores it.
# define NO_SANITIZE_UNDEFINED
# define NO_SANITIZE_ADDRESS
# define NO_SANITIZE_THREAD
# define ALWAYS_INLINE_NO_SANITIZE_UNDEFINED ALWAYS_INLINE
#endif
/// A template function for suppressing warnings about unused variables or function results.

View File

@ -104,8 +104,3 @@ template <> struct is_big_int<wUInt256> { static constexpr bool value = true; };
template <typename T>
inline constexpr bool is_big_int_v = is_big_int<T>::value;
template <typename To, typename From>
inline To bigint_cast(const From & x [[maybe_unused]])
{
return static_cast<To>(x);
}

View File

@ -15,8 +15,8 @@
#endif
#define __msan_unpoison(X, Y) // NOLINT
#if defined(__has_feature)
# if __has_feature(memory_sanitizer)
#if defined(ch_has_feature)
# if ch_has_feature(memory_sanitizer)
# undef __msan_unpoison
# include <sanitizer/msan_interface.h>
# endif

View File

@ -152,7 +152,7 @@ static void signalHandler(int sig, siginfo_t * info, void * context)
if (sig != SIGTSTP) /// This signal is used for debugging.
{
/// The time that is usually enough for separate thread to print info into log.
sleepForSeconds(10);
sleepForSeconds(20); /// FIXME: use some feedback from threads that process stacktrace
call_default_signal_handler(sig);
}
@ -230,10 +230,10 @@ public:
}
else
{
siginfo_t info;
ucontext_t context;
siginfo_t info{};
ucontext_t context{};
StackTrace stack_trace(NoCapture{});
UInt32 thread_num;
UInt32 thread_num{};
std::string query_id;
DB::ThreadStatus * thread_ptr{};
@ -311,7 +311,8 @@ private:
if (stack_trace.getSize())
{
/// Write bare stack trace (addresses) just in case if we will fail to print symbolized stack trace.
/// NOTE This still require memory allocations and mutex lock inside logger. BTW we can also print it to stderr using write syscalls.
/// NOTE: This still require memory allocations and mutex lock inside logger.
/// BTW we can also print it to stderr using write syscalls.
std::stringstream bare_stacktrace;
bare_stacktrace << "Stack trace:";
@ -324,7 +325,7 @@ private:
/// Write symbolized stack trace line by line for better grep-ability.
stack_trace.toStringEveryLine([&](const std::string & s) { LOG_FATAL(log, s); });
#if defined(__linux__)
#if defined(OS_LINUX)
/// Write information about binary checksum. It can be difficult to calculate, so do it only after printing stack trace.
String calculated_binary_hash = getHashOfLoadedBinaryHex();
if (daemon.stored_binary_hash.empty())
@ -415,7 +416,9 @@ static void sanitizerDeathCallback()
else
log_message = "Terminate called without an active exception";
static const size_t buf_size = 1024;
/// POSIX.1 says that write(2)s of less than PIPE_BUF bytes must be atomic - man 7 pipe
/// And the buffer should not be too small because our exception messages can be large.
static constexpr size_t buf_size = PIPE_BUF;
if (log_message.size() > buf_size - 16)
log_message.resize(buf_size - 16);
@ -561,6 +564,7 @@ void debugIncreaseOOMScore()
{
DB::WriteBufferFromFile buf("/proc/self/oom_score_adj");
buf.write(new_score.c_str(), new_score.size());
buf.close();
}
catch (const Poco::Exception & e)
{
@ -783,7 +787,7 @@ void BaseDaemon::initializeTerminationAndSignalProcessing()
/// Setup signal handlers.
/// SIGTSTP is added for debugging purposes. To output a stack trace of any running thread at anytime.
addSignalHandler({SIGABRT, SIGSEGV, SIGILL, SIGBUS, SIGSYS, SIGFPE, SIGPIPE, SIGTSTP}, signalHandler, &handled_signals);
addSignalHandler({SIGABRT, SIGSEGV, SIGILL, SIGBUS, SIGSYS, SIGFPE, SIGPIPE, SIGTSTP, SIGTRAP}, signalHandler, &handled_signals);
addSignalHandler({SIGHUP, SIGUSR1}, closeLogsSignalHandler, &handled_signals);
addSignalHandler({SIGINT, SIGQUIT, SIGTERM}, terminateRequestedSignalHandler, &handled_signals);
@ -986,7 +990,7 @@ void BaseDaemon::setupWatchdog()
if (errno == ECHILD)
{
logger().information("Child process no longer exists.");
_exit(status);
_exit(WEXITSTATUS(status));
}
if (WIFEXITED(status))
@ -1020,7 +1024,7 @@ void BaseDaemon::setupWatchdog()
/// Automatic restart is not enabled but you can play with it.
#if 1
_exit(status);
_exit(WEXITSTATUS(status));
#else
logger().information("Will restart.");
if (argv0)

View File

@ -83,7 +83,7 @@ public:
template <class T>
void writeToGraphite(const std::string & key, const T & value, const std::string & config_name = DEFAULT_GRAPHITE_CONFIG_NAME, time_t timestamp = 0, const std::string & custom_root_path = "")
{
auto writer = getGraphiteWriter(config_name);
auto *writer = getGraphiteWriter(config_name);
if (writer)
writer->write(key, value, timestamp, custom_root_path);
}
@ -91,7 +91,7 @@ public:
template <class T>
void writeToGraphite(const GraphiteWriter::KeyValueVector<T> & key_vals, const std::string & config_name = DEFAULT_GRAPHITE_CONFIG_NAME, time_t timestamp = 0, const std::string & custom_root_path = "")
{
auto writer = getGraphiteWriter(config_name);
auto *writer = getGraphiteWriter(config_name);
if (writer)
writer->write(key_vals, timestamp, custom_root_path);
}
@ -99,7 +99,7 @@ public:
template <class T>
void writeToGraphite(const GraphiteWriter::KeyValueVector<T> & key_vals, const std::chrono::system_clock::time_point & current_time, const std::string & custom_root_path)
{
auto writer = getGraphiteWriter();
auto *writer = getGraphiteWriter();
if (writer)
writer->write(key_vals, std::chrono::system_clock::to_time_t(current_time), custom_root_path);
}

View File

@ -31,7 +31,7 @@ static void *volatile vdso_func = (void *)getcpu_init;
int sched_getcpu(void)
{
int r;
unsigned cpu;
unsigned cpu = 0;
#ifdef VDSO_GETCPU_SYM
getcpu_f f = (getcpu_f)vdso_func;

View File

@ -3,7 +3,6 @@ add_library (mysqlxx
Exception.cpp
Query.cpp
ResultBase.cpp
StoreQueryResult.cpp
UseQueryResult.cpp
Row.cpp
Value.cpp

View File

@ -51,10 +51,11 @@ Connection::Connection(
const char* ssl_key,
unsigned timeout,
unsigned rw_timeout,
bool enable_local_infile)
bool enable_local_infile,
bool opt_reconnect)
: Connection()
{
connect(db, server, user, password, port, socket, ssl_ca, ssl_cert, ssl_key, timeout, rw_timeout, enable_local_infile);
connect(db, server, user, password, port, socket, ssl_ca, ssl_cert, ssl_key, timeout, rw_timeout, enable_local_infile, opt_reconnect);
}
Connection::Connection(const std::string & config_name)
@ -80,7 +81,8 @@ void Connection::connect(const char* db,
const char * ssl_key,
unsigned timeout,
unsigned rw_timeout,
bool enable_local_infile)
bool enable_local_infile,
bool opt_reconnect)
{
if (is_connected)
disconnect();
@ -104,9 +106,8 @@ void Connection::connect(const char* db,
if (mysql_options(driver.get(), MYSQL_OPT_LOCAL_INFILE, &enable_local_infile_arg))
throw ConnectionFailed(errorMessage(driver.get()), mysql_errno(driver.get()));
/// Enables auto-reconnect.
bool reconnect = true;
if (mysql_options(driver.get(), MYSQL_OPT_RECONNECT, reinterpret_cast<const char *>(&reconnect)))
/// See C API Developer Guide: Automatic Reconnection Control
if (mysql_options(driver.get(), MYSQL_OPT_RECONNECT, reinterpret_cast<const char *>(&opt_reconnect)))
throw ConnectionFailed(errorMessage(driver.get()), mysql_errno(driver.get()));
/// Specifies particular ssl key and certificate if it needs
@ -116,8 +117,8 @@ void Connection::connect(const char* db,
if (!mysql_real_connect(driver.get(), server, user, password, db, port, ifNotEmpty(socket), driver->client_flag))
throw ConnectionFailed(errorMessage(driver.get()), mysql_errno(driver.get()));
/// Sets UTF-8 as default encoding.
if (mysql_set_character_set(driver.get(), "UTF8"))
/// Sets UTF-8 as default encoding. See https://mariadb.com/kb/en/mysql_set_character_set/
if (mysql_set_character_set(driver.get(), "utf8mb4"))
throw ConnectionFailed(errorMessage(driver.get()), mysql_errno(driver.get()));
is_connected = true;

View File

@ -14,6 +14,8 @@
/// Disable LOAD DATA LOCAL INFILE because it is insecure
#define MYSQLXX_DEFAULT_ENABLE_LOCAL_INFILE false
/// See https://dev.mysql.com/doc/c-api/5.7/en/c-api-auto-reconnect.html
#define MYSQLXX_DEFAULT_MYSQL_OPT_RECONNECT true
namespace mysqlxx
@ -39,7 +41,6 @@ private:
/** MySQL connection.
* Usage:
* mysqlxx::Connection connection("Test", "127.0.0.1", "root", "qwerty", 3306);
* std::cout << connection.query("SELECT 'Hello, World!'").store().at(0).at(0).getString() << std::endl;
*
* Or with Poco library configuration:
* mysqlxx::Connection connection("mysql_params");
@ -77,7 +78,8 @@ public:
const char * ssl_key = "",
unsigned timeout = MYSQLXX_DEFAULT_TIMEOUT,
unsigned rw_timeout = MYSQLXX_DEFAULT_RW_TIMEOUT,
bool enable_local_infile = MYSQLXX_DEFAULT_ENABLE_LOCAL_INFILE);
bool enable_local_infile = MYSQLXX_DEFAULT_ENABLE_LOCAL_INFILE,
bool opt_reconnect = MYSQLXX_DEFAULT_MYSQL_OPT_RECONNECT);
/// Creates connection. Can be used if Poco::Util::Application is using.
/// All settings will be got from config_name section of configuration.
@ -97,7 +99,8 @@ public:
const char* ssl_key,
unsigned timeout = MYSQLXX_DEFAULT_TIMEOUT,
unsigned rw_timeout = MYSQLXX_DEFAULT_RW_TIMEOUT,
bool enable_local_infile = MYSQLXX_DEFAULT_ENABLE_LOCAL_INFILE);
bool enable_local_infile = MYSQLXX_DEFAULT_ENABLE_LOCAL_INFILE,
bool opt_reconnect = MYSQLXX_DEFAULT_MYSQL_OPT_RECONNECT);
void connect(const std::string & config_name)
{
@ -113,6 +116,7 @@ public:
std::string ssl_cert = cfg.getString(config_name + ".ssl_cert", "");
std::string ssl_key = cfg.getString(config_name + ".ssl_key", "");
bool enable_local_infile = cfg.getBool(config_name + ".enable_local_infile", MYSQLXX_DEFAULT_ENABLE_LOCAL_INFILE);
bool opt_reconnect = cfg.getBool(config_name + ".opt_reconnect", MYSQLXX_DEFAULT_MYSQL_OPT_RECONNECT);
unsigned timeout =
cfg.getInt(config_name + ".connect_timeout",
@ -136,7 +140,8 @@ public:
ssl_key.c_str(),
timeout,
rw_timeout,
enable_local_infile);
enable_local_infile,
opt_reconnect);
}
/// If MySQL connection was established.

View File

@ -26,6 +26,15 @@ struct ConnectionFailed : public Exception
};
/// Connection to MySQL server was lost
struct ConnectionLost : public Exception
{
ConnectionLost(const std::string & msg, int code = 0) : Exception(msg, code) {}
const char * name() const throw() override { return "mysqlxx::ConnectionLost"; }
const char * className() const throw() override { return "mysqlxx::ConnectionLost"; }
};
/// Erroneous query.
struct BadQuery : public Exception
{

View File

@ -10,7 +10,6 @@
#include <common/sleep.h>
#include <Poco/Util/Application.h>
#include <Poco/Util/LayeredConfiguration.h>
@ -41,7 +40,9 @@ void Pool::Entry::decrementRefCount()
Pool::Pool(const Poco::Util::AbstractConfiguration & cfg, const std::string & config_name,
unsigned default_connections_, unsigned max_connections_,
const char * parent_config_name_)
: default_connections(default_connections_), max_connections(max_connections_)
: logger(Poco::Logger::get("mysqlxx::Pool"))
, default_connections(default_connections_)
, max_connections(max_connections_)
{
server = cfg.getString(config_name + ".host");
@ -78,6 +79,9 @@ Pool::Pool(const Poco::Util::AbstractConfiguration & cfg, const std::string & co
enable_local_infile = cfg.getBool(config_name + ".enable_local_infile",
cfg.getBool(parent_config_name + ".enable_local_infile", MYSQLXX_DEFAULT_ENABLE_LOCAL_INFILE));
opt_reconnect = cfg.getBool(config_name + ".opt_reconnect",
cfg.getBool(parent_config_name + ".opt_reconnect", MYSQLXX_DEFAULT_MYSQL_OPT_RECONNECT));
}
else
{
@ -96,6 +100,8 @@ Pool::Pool(const Poco::Util::AbstractConfiguration & cfg, const std::string & co
enable_local_infile = cfg.getBool(
config_name + ".enable_local_infile", MYSQLXX_DEFAULT_ENABLE_LOCAL_INFILE);
opt_reconnect = cfg.getBool(config_name + ".opt_reconnect", MYSQLXX_DEFAULT_MYSQL_OPT_RECONNECT);
}
connect_timeout = cfg.getInt(config_name + ".connect_timeout",
@ -125,20 +131,30 @@ Pool::Entry Pool::get()
initialize();
for (;;)
{
logger.trace("(%s): Iterating through existing MySQL connections", getDescription());
for (auto & connection : connections)
{
if (connection->ref_count == 0)
return Entry(connection, this);
}
logger.trace("(%s): Trying to allocate a new connection.", getDescription());
if (connections.size() < static_cast<size_t>(max_connections))
{
Connection * conn = allocConnection();
if (conn)
return Entry(conn, this);
logger.trace("(%s): Unable to create a new connection: Allocation failed.", getDescription());
}
else
{
logger.trace("(%s): Unable to create a new connection: Max number of connections has been reached.", getDescription());
}
lock.unlock();
logger.trace("(%s): Sleeping for %d seconds.", getDescription(), MYSQLXX_POOL_SLEEP_ON_CONNECT_FAIL);
sleepForSeconds(MYSQLXX_POOL_SLEEP_ON_CONNECT_FAIL);
lock.lock();
}
@ -162,8 +178,7 @@ Pool::Entry Pool::tryGet()
if (res.tryForceConnected()) /// Tries to reestablish connection as well
return res;
auto & logger = Poco::Util::Application::instance().logger();
logger.information("Idle connection to mysql server cannot be recovered, dropping it.");
logger.debug("(%s): Idle connection to MySQL server cannot be recovered, dropping it.", getDescription());
/// This one is disconnected, cannot be reestablished and so needs to be disposed of.
connection_it = connections.erase(connection_it);
@ -186,6 +201,8 @@ Pool::Entry Pool::tryGet()
void Pool::removeConnection(Connection* connection)
{
logger.trace("(%s): Removing connection.", getDescription());
std::lock_guard<std::mutex> lock(mutex);
if (connection)
{
@ -210,8 +227,6 @@ void Pool::Entry::forceConnected() const
if (data == nullptr)
throw Poco::RuntimeException("Tried to access NULL database connection.");
Poco::Util::Application & app = Poco::Util::Application::instance();
bool first = true;
while (!tryForceConnected())
{
@ -220,7 +235,7 @@ void Pool::Entry::forceConnected() const
else
sleepForSeconds(MYSQLXX_POOL_SLEEP_ON_CONNECT_FAIL);
app.logger().information("MYSQL: Reconnecting to " + pool->description);
pool->logger.debug("Entry: Reconnecting to MySQL server %s", pool->description);
data->conn.connect(
pool->db.c_str(),
pool->server.c_str(),
@ -233,7 +248,8 @@ void Pool::Entry::forceConnected() const
pool->ssl_key.c_str(),
pool->connect_timeout,
pool->rw_timeout,
pool->enable_local_infile);
pool->enable_local_infile,
pool->opt_reconnect);
}
}
@ -242,18 +258,22 @@ bool Pool::Entry::tryForceConnected() const
{
auto * const mysql_driver = data->conn.getDriver();
const auto prev_connection_id = mysql_thread_id(mysql_driver);
pool->logger.trace("Entry(connection %lu): sending PING to check if it is alive.", prev_connection_id);
if (data->conn.ping()) /// Attempts to reestablish lost connection
{
const auto current_connection_id = mysql_thread_id(mysql_driver);
if (prev_connection_id != current_connection_id)
{
auto & logger = Poco::Util::Application::instance().logger();
logger.information("Connection to mysql server has been reestablished. Connection id changed: %lu -> %lu",
prev_connection_id, current_connection_id);
pool->logger.debug("Entry(connection %lu): Reconnected to MySQL server. Connection id changed: %lu -> %lu",
current_connection_id, prev_connection_id, current_connection_id);
}
pool->logger.trace("Entry(connection %lu): PING ok.", current_connection_id);
return true;
}
pool->logger.trace("Entry(connection %lu): PING failed.", prev_connection_id);
return false;
}
@ -274,15 +294,13 @@ void Pool::initialize()
Pool::Connection * Pool::allocConnection(bool dont_throw_if_failed_first_time)
{
Poco::Util::Application & app = Poco::Util::Application::instance();
std::unique_ptr<Connection> conn(new Connection);
std::unique_ptr<Connection> conn_ptr{new Connection};
try
{
app.logger().information("MYSQL: Connecting to " + description);
logger.debug("Connecting to %s", description);
conn->conn.connect(
conn_ptr->conn.connect(
db.c_str(),
server.c_str(),
user.c_str(),
@ -294,29 +312,29 @@ Pool::Connection * Pool::allocConnection(bool dont_throw_if_failed_first_time)
ssl_key.c_str(),
connect_timeout,
rw_timeout,
enable_local_infile);
enable_local_infile,
opt_reconnect);
}
catch (mysqlxx::ConnectionFailed & e)
{
logger.error(e.what());
if ((!was_successful && !dont_throw_if_failed_first_time)
|| e.errnum() == ER_ACCESS_DENIED_ERROR
|| e.errnum() == ER_DBACCESS_DENIED_ERROR
|| e.errnum() == ER_BAD_DB_ERROR)
{
app.logger().error(e.what());
throw;
}
else
{
app.logger().error(e.what());
return nullptr;
}
}
connections.push_back(conn_ptr.get());
was_successful = true;
auto * connection = conn.release();
connections.push_back(connection);
return connection;
return conn_ptr.release();
}
}

View File

@ -6,6 +6,8 @@
#include <atomic>
#include <Poco/Exception.h>
#include <Poco/Logger.h>
#include <mysqlxx/Connection.h>
@ -165,19 +167,21 @@ public:
unsigned rw_timeout_ = MYSQLXX_DEFAULT_RW_TIMEOUT,
unsigned default_connections_ = MYSQLXX_POOL_DEFAULT_START_CONNECTIONS,
unsigned max_connections_ = MYSQLXX_POOL_DEFAULT_MAX_CONNECTIONS,
unsigned enable_local_infile_ = MYSQLXX_DEFAULT_ENABLE_LOCAL_INFILE)
: default_connections(default_connections_), max_connections(max_connections_),
db(db_), server(server_), user(user_), password(password_), port(port_), socket(socket_),
connect_timeout(connect_timeout_), rw_timeout(rw_timeout_), enable_local_infile(enable_local_infile_) {}
unsigned enable_local_infile_ = MYSQLXX_DEFAULT_ENABLE_LOCAL_INFILE,
bool opt_reconnect_ = MYSQLXX_DEFAULT_MYSQL_OPT_RECONNECT)
: logger(Poco::Logger::get("mysqlxx::Pool")), default_connections(default_connections_),
max_connections(max_connections_), db(db_), server(server_), user(user_), password(password_), port(port_), socket(socket_),
connect_timeout(connect_timeout_), rw_timeout(rw_timeout_), enable_local_infile(enable_local_infile_),
opt_reconnect(opt_reconnect_) {}
Pool(const Pool & other)
: default_connections{other.default_connections},
: logger(other.logger), default_connections{other.default_connections},
max_connections{other.max_connections},
db{other.db}, server{other.server},
user{other.user}, password{other.password},
port{other.port}, socket{other.socket},
connect_timeout{other.connect_timeout}, rw_timeout{other.rw_timeout},
enable_local_infile{other.enable_local_infile}
enable_local_infile{other.enable_local_infile}, opt_reconnect(other.opt_reconnect)
{}
Pool & operator=(const Pool &) = delete;
@ -201,6 +205,8 @@ public:
void removeConnection(Connection * connection);
protected:
Poco::Logger & logger;
/// Number of MySQL connections which are created at launch.
unsigned default_connections;
/// Maximum possible number of connections
@ -231,6 +237,7 @@ private:
std::string ssl_cert;
std::string ssl_key;
bool enable_local_infile;
bool opt_reconnect;
/// True if connection was established at least once.
bool was_successful{false};

View File

@ -1,3 +1,8 @@
#include <algorithm>
#include <ctime>
#include <random>
#include <thread>
#include <mysqlxx/PoolWithFailover.h>
@ -33,6 +38,19 @@ PoolWithFailover::PoolWithFailover(const Poco::Util::AbstractConfiguration & con
std::make_shared<Pool>(config_, replica_name, default_connections_, max_connections_, config_name_.c_str()));
}
}
/// PoolWithFailover objects are stored in a cache inside PoolFactory.
/// This cache is reset by ExternalDictionariesLoader after every SYSTEM RELOAD DICTIONAR{Y|IES}
/// which triggers massive re-constructing of connection pools.
/// The state of PRNGs like std::mt19937 is considered to be quite heavy
/// thus here we attempt to optimize its construction.
static thread_local std::mt19937 rnd_generator(
std::hash<std::thread::id>{}(std::this_thread::get_id()) + std::clock());
for (auto & [_, replicas] : replicas_by_priority)
{
if (replicas.size() > 1)
std::shuffle(replicas.begin(), replicas.end(), rnd_generator);
}
}
else
{

View File

@ -1,11 +1,16 @@
#if __has_include(<mysql.h>)
#include <errmsg.h>
#include <mysql.h>
#else
#include <mysql/errmsg.h>
#include <mysql/mysql.h>
#endif
#include <Poco/Logger.h>
#include <mysqlxx/Connection.h>
#include <mysqlxx/Query.h>
#include <mysqlxx/Types.h>
namespace mysqlxx
@ -57,8 +62,24 @@ void Query::reset()
void Query::executeImpl()
{
std::string query_string = query_buf.str();
if (mysql_real_query(conn->getDriver(), query_string.data(), query_string.size()))
throw BadQuery(errorMessage(conn->getDriver()), mysql_errno(conn->getDriver()));
MYSQL* mysql_driver = conn->getDriver();
auto & logger = Poco::Logger::get("mysqlxx::Query");
logger.trace("Running MySQL query using connection %lu", mysql_thread_id(mysql_driver));
if (mysql_real_query(mysql_driver, query_string.data(), query_string.size()))
{
const auto err_no = mysql_errno(mysql_driver);
switch (err_no)
{
case CR_SERVER_GONE_ERROR:
[[fallthrough]];
case CR_SERVER_LOST:
throw ConnectionLost(errorMessage(mysql_driver), err_no);
default:
throw BadQuery(errorMessage(mysql_driver), err_no);
}
}
}
UseQueryResult Query::use()
@ -71,16 +92,6 @@ UseQueryResult Query::use()
return UseQueryResult(res, conn, this);
}
StoreQueryResult Query::store()
{
executeImpl();
MYSQL_RES * res = mysql_store_result(conn->getDriver());
if (!res)
checkError(conn->getDriver());
return StoreQueryResult(res, conn, this);
}
void Query::execute()
{
executeImpl();

View File

@ -3,7 +3,6 @@
#include <sstream>
#include <mysqlxx/UseQueryResult.h>
#include <mysqlxx/StoreQueryResult.h>
namespace mysqlxx
@ -46,11 +45,6 @@ public:
*/
UseQueryResult use();
/** Выполнить запрос с загрузкой на клиента всех строк.
* Требуется оперативка, чтобы вместить весь результат, зато к строкам можно обращаться в произвольном порядке.
*/
StoreQueryResult store();
/// Значение auto increment после последнего INSERT-а.
UInt64 insertID();

View File

@ -9,7 +9,7 @@ class Connection;
class Query;
/** Базовый класс для UseQueryResult и StoreQueryResult.
/** Базовый класс для UseQueryResult.
* Содержит общую часть реализации,
* Ссылается на Connection. Если уничтожить Connection, то пользоваться ResultBase и любым результатом нельзя.
* Использовать объект можно только для результата одного запроса!

View File

@ -35,7 +35,7 @@ public:
{
}
/** Для того, чтобы создать Row, используйте соответствующие методы UseQueryResult или StoreQueryResult. */
/** Для того, чтобы создать Row, используйте соответствующие методы UseQueryResult. */
Row(MYSQL_ROW row_, ResultBase * res_, MYSQL_LENGTHS lengths_)
: row(row_), res(res_), lengths(lengths_)
{

View File

@ -1,30 +0,0 @@
#if __has_include(<mysql.h>)
#include <mysql.h>
#else
#include <mysql/mysql.h>
#endif
#include <mysqlxx/Connection.h>
#include <mysqlxx/StoreQueryResult.h>
namespace mysqlxx
{
StoreQueryResult::StoreQueryResult(MYSQL_RES * res_, Connection * conn_, const Query * query_) : ResultBase(res_, conn_, query_)
{
UInt64 rows = mysql_num_rows(res);
reserve(rows);
lengths.resize(rows * num_fields);
for (UInt64 i = 0; MYSQL_ROW row = mysql_fetch_row(res); ++i)
{
MYSQL_LENGTHS lengths_for_row = mysql_fetch_lengths(res);
memcpy(&lengths[i * num_fields], lengths_for_row, sizeof(lengths[0]) * num_fields);
push_back(Row(row, this, &lengths[i * num_fields]));
}
checkError(conn->getDriver());
}
}

View File

@ -1,45 +0,0 @@
#pragma once
#include <vector>
#include <mysqlxx/ResultBase.h>
#include <mysqlxx/Row.h>
namespace mysqlxx
{
class Connection;
/** Результат выполнения запроса, загруженный полностью на клиента.
* Это требует оперативку, чтобы вместить весь результат,
* но зато реализует произвольный доступ к строкам по индексу.
* Если размер результата большой - используйте лучше UseQueryResult.
* Объект содержит ссылку на Connection.
* Если уничтожить Connection, то объект становится некорректным и все строки результата - тоже.
* Если задать следующий запрос в соединении, то объект и все строки тоже становятся некорректными.
* Использовать объект можно только для результата одного запроса!
* (При попытке присвоить объекту результат следующего запроса - UB.)
*/
class StoreQueryResult : public std::vector<Row>, public ResultBase
{
public:
StoreQueryResult(MYSQL_RES * res_, Connection * conn_, const Query * query_);
size_t num_rows() const { return size(); }
private:
/** Не смотря на то, что весь результат выполнения запроса загружается на клиента,
* и все указатели MYSQL_ROW на отдельные строки различные,
* при этом функция mysql_fetch_lengths() возвращает длины
* для текущей строки по одному и тому же адресу.
* То есть, чтобы можно было пользоваться несколькими Row одновременно,
* необходимо заранее куда-то сложить все длины.
*/
using Lengths = std::vector<MYSQL_LENGTH>;
Lengths lengths;
};
}

View File

@ -12,8 +12,7 @@ class Connection;
/** Результат выполнения запроса, предназначенный для чтения строк, одна за другой.
* В памяти при этом хранится только одна, текущая строка.
* В отличие от StoreQueryResult, произвольный доступ к строкам невозможен,
* а также, при чтении следующей строки, предыдущая становится некорректной.
* При чтении следующей строки, предыдущая становится некорректной.
* Вы обязаны прочитать все строки из результата
* (вызывать функцию fetch(), пока она не вернёт значение, преобразующееся к false),
* иначе при следующем запросе будет выкинуто исключение с текстом "Commands out of sync".

View File

@ -25,7 +25,7 @@ class ResultBase;
/** Represents a single value read from MySQL.
* It doesn't owns the value. It's just a wrapper of a pair (const char *, size_t).
* If the UseQueryResult/StoreQueryResult or Connection is destroyed,
* If the UseQueryResult or Connection is destroyed,
* or you have read the next Row while using UseQueryResult, then the object is invalidated.
* Allows to transform (parse) the value to various data types:
* - with getUInt(), getString(), ... (recommended);

View File

@ -38,15 +38,6 @@ int main(int, char **)
}
}
{
mysqlxx::Query query = connection.query();
query << "SELECT 1234567890 abc, 12345.67890 def UNION ALL SELECT 9876543210, 98765.43210";
mysqlxx::StoreQueryResult result = query.store();
std::cerr << result.at(0)["abc"].getUInt() << ", " << result.at(0)["def"].getDouble() << std::endl
<< result.at(1)["abc"].getUInt() << ", " << result.at(1)["def"].getDouble() << std::endl;
}
{
mysqlxx::UseQueryResult result = connection.query("SELECT 'abc\\\\def' x").use();
mysqlxx::Row row = result.fetch();
@ -54,27 +45,6 @@ int main(int, char **)
std::cerr << row << std::endl;
}
{
mysqlxx::Query query = connection.query("SEL");
query << "ECT 1";
std::cerr << query.store().at(0).at(0) << std::endl;
}
{
/// Копирование Query
mysqlxx::Query query = connection.query("SELECT 'Ok' x");
using Queries = std::vector<mysqlxx::Query>;
Queries queries;
queries.push_back(query);
for (auto & q : queries)
{
std::cerr << q.str() << std::endl;
std::cerr << q.store().at(0) << std::endl;
}
}
{
/// Копирование Query
mysqlxx::Query query1 = connection.query("SELECT");
@ -84,62 +54,6 @@ int main(int, char **)
std::cerr << query1.str() << ", " << query2.str() << std::endl;
}
{
/// Копирование Query
using Queries = std::list<mysqlxx::Query>;
Queries queries;
queries.push_back(connection.query("SELECT"));
mysqlxx::Query & qref = queries.back();
qref << " 1";
for (auto & query : queries)
{
std::cerr << query.str() << std::endl;
std::cerr << query.store().at(0) << std::endl;
}
}
{
/// Транзакции
connection.query("DROP TABLE IF EXISTS tmp").execute();
connection.query("CREATE TABLE tmp (x INT, PRIMARY KEY (x)) ENGINE = InnoDB").execute();
mysqlxx::Transaction trans(connection);
connection.query("INSERT INTO tmp VALUES (1)").execute();
std::cerr << connection.query("SELECT * FROM tmp").store().size() << std::endl;
trans.rollback();
std::cerr << connection.query("SELECT * FROM tmp").store().size() << std::endl;
}
{
/// Транзакции
connection.query("DROP TABLE IF EXISTS tmp").execute();
connection.query("CREATE TABLE tmp (x INT, PRIMARY KEY (x)) ENGINE = InnoDB").execute();
{
mysqlxx::Transaction trans(connection);
connection.query("INSERT INTO tmp VALUES (1)").execute();
std::cerr << connection.query("SELECT * FROM tmp").store().size() << std::endl;
}
std::cerr << connection.query("SELECT * FROM tmp").store().size() << std::endl;
}
{
/// Транзакции
mysqlxx::Connection connection2("test", "127.0.0.1", "root", "qwerty", 3306);
connection2.query("DROP TABLE IF EXISTS tmp").execute();
connection2.query("CREATE TABLE tmp (x INT, PRIMARY KEY (x)) ENGINE = InnoDB").execute();
mysqlxx::Transaction trans(connection2);
connection2.query("INSERT INTO tmp VALUES (1)").execute();
std::cerr << connection2.query("SELECT * FROM tmp").store().size() << std::endl;
}
std::cerr << connection.query("SELECT * FROM tmp").store().size() << std::endl;
{
/// NULL
mysqlxx::Null<int> x = mysqlxx::null;
@ -152,59 +66,6 @@ int main(int, char **)
std::cerr << (x == 1 ? "Ok" : "Fail") << std::endl;
std::cerr << (x.isNull() ? "Fail" : "Ok") << std::endl;
}
{
/// Исключения при попытке достать значение не того типа
try
{
connection.query("SELECT -1").store().at(0).at(0).getUInt();
std::cerr << "Fail" << std::endl;
}
catch (const mysqlxx::Exception & e)
{
std::cerr << "Ok, " << e.message() << std::endl;
}
try
{
connection.query("SELECT 'xxx'").store().at(0).at(0).getInt();
std::cerr << "Fail" << std::endl;
}
catch (const mysqlxx::Exception & e)
{
std::cerr << "Ok, " << e.message() << std::endl;
}
try
{
connection.query("SELECT NULL").store().at(0).at(0).getString();
std::cerr << "Fail" << std::endl;
}
catch (const mysqlxx::Exception & e)
{
std::cerr << "Ok, " << e.message() << std::endl;
}
try
{
connection.query("SELECT 123").store().at(0).at(0).getDate();
std::cerr << "Fail" << std::endl;
}
catch (const mysqlxx::Exception & e)
{
std::cerr << "Ok, " << e.message() << std::endl;
}
try
{
connection.query("SELECT '2011-01-01'").store().at(0).at(0).getDateTime();
std::cerr << "Fail" << std::endl;
}
catch (const mysqlxx::Exception & e)
{
std::cerr << "Ok, " << e.message() << std::endl;
}
}
}
catch (const mysqlxx::Exception & e)
{

View File

@ -1,9 +1,9 @@
# This strings autochanged from release_lib.sh:
SET(VERSION_REVISION 54447)
SET(VERSION_REVISION 54448)
SET(VERSION_MAJOR 21)
SET(VERSION_MINOR 2)
SET(VERSION_MINOR 3)
SET(VERSION_PATCH 1)
SET(VERSION_GITHASH 53d0c9fa7255aa1dc48991d19f4246ff71cc2fd7)
SET(VERSION_DESCRIBE v21.2.1.1-prestable)
SET(VERSION_STRING 21.2.1.1)
SET(VERSION_GITHASH ef72ba7349f230321750c13ee63b49a11a7c0adc)
SET(VERSION_DESCRIBE v21.3.1.1-prestable)
SET(VERSION_STRING 21.3.1.1)
# end of autochange

View File

@ -32,27 +32,25 @@ if (CCACHE_FOUND AND NOT COMPILER_MATCHES_CCACHE)
if (CCACHE_VERSION VERSION_GREATER "3.2.0" OR NOT CMAKE_CXX_COMPILER_ID STREQUAL "Clang")
message(STATUS "Using ${CCACHE_FOUND} ${CCACHE_VERSION}")
# debian (debhlpers) set SOURCE_DATE_EPOCH environment variable, that is
set_property (GLOBAL PROPERTY RULE_LAUNCH_COMPILE ${CCACHE_FOUND})
set_property (GLOBAL PROPERTY RULE_LAUNCH_LINK ${CCACHE_FOUND})
# debian (debhelpers) set SOURCE_DATE_EPOCH environment variable, that is
# filled from the debian/changelog or current time.
#
# - 4.0+ ccache always includes this environment variable into the hash
# of the manifest, which do not allow to use previous cache,
# - 4.2+ ccache ignores SOURCE_DATE_EPOCH under time_macros sloppiness.
# - 4.2+ ccache ignores SOURCE_DATE_EPOCH for every file w/o __DATE__/__TIME__
#
# So for:
# - 4.2+ time_macros sloppiness is used,
# - 4.2+ does not require any sloppiness
# - 4.0+ will ignore SOURCE_DATE_EPOCH environment variable.
if (CCACHE_VERSION VERSION_GREATER_EQUAL "4.2")
message(STATUS "Use time_macros sloppiness for ccache")
set_property (GLOBAL PROPERTY RULE_LAUNCH_COMPILE "${CCACHE_FOUND} --set-config=sloppiness=time_macros")
set_property (GLOBAL PROPERTY RULE_LAUNCH_LINK "${CCACHE_FOUND} --set-config=sloppiness=time_macros")
message(STATUS "ccache is 4.2+ no quirks for SOURCE_DATE_EPOCH required")
elseif (CCACHE_VERSION VERSION_GREATER_EQUAL "4.0")
message(STATUS "Ignore SOURCE_DATE_EPOCH for ccache")
set_property (GLOBAL PROPERTY RULE_LAUNCH_COMPILE "env -u SOURCE_DATE_EPOCH ${CCACHE_FOUND}")
set_property (GLOBAL PROPERTY RULE_LAUNCH_LINK "env -u SOURCE_DATE_EPOCH ${CCACHE_FOUND}")
else()
set_property (GLOBAL PROPERTY RULE_LAUNCH_COMPILE ${CCACHE_FOUND})
set_property (GLOBAL PROPERTY RULE_LAUNCH_LINK ${CCACHE_FOUND})
endif()
else ()
message(${RECONFIGURE_MESSAGE_LEVEL} "Not using ${CCACHE_FOUND} ${CCACHE_VERSION} bug: https://bugzilla.samba.org/show_bug.cgi?id=8118")

View File

@ -11,7 +11,7 @@ if (NOT EXISTS "${ClickHouse_SOURCE_DIR}/contrib/NuRaft/CMakeLists.txt")
return()
endif ()
if (NOT OS_FREEBSD)
if (NOT OS_FREEBSD AND NOT OS_DARWIN)
set (USE_NURAFT 1)
set (NURAFT_LIBRARY nuraft)
@ -20,5 +20,5 @@ if (NOT OS_FREEBSD)
message (STATUS "Using NuRaft=${USE_NURAFT}: ${NURAFT_INCLUDE_DIR} : ${NURAFT_LIBRARY}")
else()
set (USE_NURAFT 0)
message (STATUS "Using internal NuRaft library on FreeBSD is not supported")
message (STATUS "Using internal NuRaft library on FreeBSD and Darwin is not supported")
endif()

2
contrib/NuRaft vendored

@ -1 +1 @@
Subproject commit 410bd149da84cdde60b4436b02b738749f4e87e1
Subproject commit 9a0d78de4b90546368d954b6434f0e9a823e8d80

2
contrib/aws vendored

@ -1 +1 @@
Subproject commit a220591e335923ce1c19bbf9eb925787f7ab6c13
Subproject commit 7d48b2c8193679cc4516e5bd68ae4a64b94dae7d

View File

@ -11,7 +11,7 @@ endif ()
target_compile_options(base64_scalar PRIVATE -falign-loops)
if (ARCH_AMD64)
target_compile_options(base64_ssse3 PRIVATE -mssse3 -falign-loops)
target_compile_options(base64_ssse3 PRIVATE -mno-avx -mno-avx2 -mssse3 -falign-loops)
target_compile_options(base64_avx PRIVATE -falign-loops -mavx)
target_compile_options(base64_avx2 PRIVATE -falign-loops -mavx2)
else ()

2
contrib/boost vendored

@ -1 +1 @@
Subproject commit 8e259cd2a6b60d75dd17e73432f11bb7b9351bb1
Subproject commit 48f40ebb539220d328958f8823b094c0b07a4e79

2
contrib/brotli vendored

@ -1 +1 @@
Subproject commit 5805f99a533a8f8118699c0100d8c102f3605f65
Subproject commit 63be8a99401992075c23e99f7c84de1c653e39e2

View File

@ -2,6 +2,8 @@ set(BROTLI_SOURCE_DIR ${ClickHouse_SOURCE_DIR}/contrib/brotli/c)
set(BROTLI_BINARY_DIR ${ClickHouse_BINARY_DIR}/contrib/brotli/c)
set(SRCS
${BROTLI_SOURCE_DIR}/enc/command.c
${BROTLI_SOURCE_DIR}/enc/fast_log.c
${BROTLI_SOURCE_DIR}/dec/bit_reader.c
${BROTLI_SOURCE_DIR}/dec/state.c
${BROTLI_SOURCE_DIR}/dec/huffman.c
@ -26,6 +28,9 @@ set(SRCS
${BROTLI_SOURCE_DIR}/enc/memory.c
${BROTLI_SOURCE_DIR}/common/dictionary.c
${BROTLI_SOURCE_DIR}/common/transform.c
${BROTLI_SOURCE_DIR}/common/platform.c
${BROTLI_SOURCE_DIR}/common/context.c
${BROTLI_SOURCE_DIR}/common/constants.c
)
add_library(brotli ${SRCS})

2
contrib/cassandra vendored

@ -1 +1 @@
Subproject commit 9cbc1a806df5d40fddbf84533b9873542c6513d8
Subproject commit b446d7eb68e6962f431e2b3771313bfe9a2bbd93

2
contrib/hyperscan vendored

@ -1 +1 @@
Subproject commit 3907fd00ee8b2538739768fa9533f8635a276531
Subproject commit e9f08df0213fc637aac0a5bbde9beeaeba2fe9fa

View File

@ -252,6 +252,7 @@ if (NOT EXTERNAL_HYPERSCAN_LIBRARY_FOUND)
target_compile_definitions (hyperscan PUBLIC USE_HYPERSCAN=1)
target_compile_options (hyperscan
PRIVATE -g0 # Library has too much debug information
-mno-avx -mno-avx2 # The library is using dynamic dispatch and is confused if AVX is enabled globally
-march=corei7 -O2 -fno-strict-aliasing -fno-omit-frame-pointer -fvisibility=hidden # The options from original build system
-fno-sanitize=undefined # Assume the library takes care of itself
)

View File

@ -30,7 +30,12 @@ set(SRCS
add_library(nuraft ${SRCS})
target_compile_definitions(nuraft PRIVATE USE_BOOST_ASIO=1 BOOST_ASIO_STANDALONE=1)
if (NOT OPENSSL_SSL_LIBRARY OR NOT OPENSSL_CRYPTO_LIBRARY)
target_compile_definitions(nuraft PRIVATE USE_BOOST_ASIO=1 BOOST_ASIO_STANDALONE=1 SSL_LIBRARY_NOT_FOUND=1)
else()
target_compile_definitions(nuraft PRIVATE USE_BOOST_ASIO=1 BOOST_ASIO_STANDALONE=1)
endif()
target_include_directories (nuraft SYSTEM PRIVATE ${LIBRARY_DIR}/include/libnuraft)
# for some reason include "asio.h" directly without "boost/" prefix.

2
contrib/poco vendored

@ -1 +1 @@
Subproject commit 2c32e17c7dfee1f8bf24227b697cdef5fddf0823
Subproject commit fbaaba4a02e29987b8c584747a496c79528f125f

4
debian/changelog vendored
View File

@ -1,5 +1,5 @@
clickhouse (21.2.1.1) unstable; urgency=low
clickhouse (21.3.1.1) unstable; urgency=low
* Modified source code
-- clickhouse-release <clickhouse-release@yandex-team.ru> Mon, 11 Jan 2021 11:12:08 +0300
-- clickhouse-release <clickhouse-release@yandex-team.ru> Mon, 01 Feb 2021 12:50:53 +0300

View File

@ -1,7 +1,7 @@
FROM ubuntu:18.04
ARG repository="deb https://repo.clickhouse.tech/deb/stable/ main/"
ARG version=21.2.1.*
ARG version=21.3.1.*
RUN apt-get update \
&& apt-get install --yes --no-install-recommends \

View File

@ -1,7 +1,7 @@
FROM ubuntu:20.04
ARG repository="deb https://repo.clickhouse.tech/deb/stable/ main/"
ARG version=21.2.1.*
ARG version=21.3.1.*
ARG gosu_ver=1.10
# user/group precreated explicitly with fixed uid/gid on purpose.

View File

@ -113,10 +113,10 @@ $ docker run --rm -e CLICKHOUSE_UID=0 -e CLICKHOUSE_GID=0 --name clickhouse-serv
### How to create default database and user on starting
Sometimes you may want to create user (user named `default` is used by default) and database on image starting. You can do it using environment variables `CLICKHOUSE_DB`, `CLICKHOUSE_USER` and `CLICKHOUSE_PASSWORD`:
Sometimes you may want to create user (user named `default` is used by default) and database on image starting. You can do it using environment variables `CLICKHOUSE_DB`, `CLICKHOUSE_USER`, `CLICKHOUSE_DEFAULT_ACCESS_MANAGEMENT` and `CLICKHOUSE_PASSWORD`:
```
$ docker run --rm -e CLICKHOUSE_DB=my_database -e CLICKHOUSE_USER=username -e CLICKHOUSE_PASSWORD=password -p 9000:9000/tcp yandex/clickhouse-server
$ docker run --rm -e CLICKHOUSE_DB=my_database -e CLICKHOUSE_USER=username -e CLICKHOUSE_DEFAULT_ACCESS_MANAGEMENT=1 -e CLICKHOUSE_PASSWORD=password -p 9000:9000/tcp yandex/clickhouse-server
```
## How to extend this image

View File

@ -54,8 +54,10 @@ docker cp -L "${ubuntu20image}":/lib/x86_64-linux-gnu/libm.so.6 "${CONTAIN
docker cp -L "${ubuntu20image}":/lib/x86_64-linux-gnu/libpthread.so.0 "${CONTAINER_ROOT_FOLDER}/lib"
docker cp -L "${ubuntu20image}":/lib/x86_64-linux-gnu/librt.so.1 "${CONTAINER_ROOT_FOLDER}/lib"
docker cp -L "${ubuntu20image}":/lib/x86_64-linux-gnu/libnss_dns.so.2 "${CONTAINER_ROOT_FOLDER}/lib"
docker cp -L "${ubuntu20image}":/lib/x86_64-linux-gnu/libnss_files.so.2 "${CONTAINER_ROOT_FOLDER}/lib"
docker cp -L "${ubuntu20image}":/lib/x86_64-linux-gnu/libresolv.so.2 "${CONTAINER_ROOT_FOLDER}/lib"
docker cp -L "${ubuntu20image}":/lib64/ld-linux-x86-64.so.2 "${CONTAINER_ROOT_FOLDER}/lib64"
docker cp -L "${ubuntu20image}":/etc/nsswitch.conf "${CONTAINER_ROOT_FOLDER}/etc"
docker build "$DOCKER_BUILD_FOLDER" -f Dockerfile.alpine -t "${DOCKER_IMAGE}:${VERSION}-alpine" --pull
rm -rf "$CONTAINER_ROOT_FOLDER"

View File

@ -54,6 +54,7 @@ FORMAT_SCHEMA_PATH="$(clickhouse extract-from-config --config-file "$CLICKHOUSE_
CLICKHOUSE_USER="${CLICKHOUSE_USER:-default}"
CLICKHOUSE_PASSWORD="${CLICKHOUSE_PASSWORD:-}"
CLICKHOUSE_DB="${CLICKHOUSE_DB:-}"
CLICKHOUSE_ACCESS_MANAGEMENT="${CLICKHOUSE_DEFAULT_ACCESS_MANAGEMENT:-0}"
for dir in "$DATA_DIR" \
"$ERROR_LOG_DIR" \
@ -97,6 +98,7 @@ if [ -n "$CLICKHOUSE_USER" ] && [ "$CLICKHOUSE_USER" != "default" ] || [ -n "$CL
</networks>
<password>${CLICKHOUSE_PASSWORD}</password>
<quota>default</quota>
<access_management>${CLICKHOUSE_ACCESS_MANAGEMENT}</access_management>
</${CLICKHOUSE_USER}>
</users>
</yandex>
@ -120,7 +122,7 @@ if [ -n "$(ls /docker-entrypoint-initdb.d/)" ] || [ -n "$CLICKHOUSE_DB" ]; then
sleep 1
done
clickhouseclient=( clickhouse-client --multiquery -u "$CLICKHOUSE_USER" --password "$CLICKHOUSE_PASSWORD" )
clickhouseclient=( clickhouse-client --multiquery --host "127.0.0.1" -u "$CLICKHOUSE_USER" --password "$CLICKHOUSE_PASSWORD" )
echo

View File

@ -1,7 +1,7 @@
FROM ubuntu:18.04
ARG repository="deb https://repo.clickhouse.tech/deb/stable/ main/"
ARG version=21.2.1.*
ARG version=21.3.1.*
RUN apt-get update && \
apt-get install -y apt-transport-https dirmngr && \

View File

@ -43,9 +43,11 @@ RUN apt-get update \
clang-tidy-${LLVM_VERSION} \
cmake \
curl \
lsof \
expect \
fakeroot \
git \
gdb \
gperf \
lld-${LLVM_VERSION} \
llvm-${LLVM_VERSION} \

View File

@ -70,6 +70,7 @@ function start_server
--path "$FASTTEST_DATA"
--user_files_path "$FASTTEST_DATA/user_files"
--top_level_domains_path "$FASTTEST_DATA/top_level_domains"
--test_keeper_server.log_storage_path "$FASTTEST_DATA/coordination"
)
clickhouse-server "${opts[@]}" &>> "$FASTTEST_OUTPUT/server.log" &
server_pid=$!
@ -107,6 +108,18 @@ function start_server
fi
echo "ClickHouse server pid '$server_pid' started and responded"
echo "
handle all noprint
handle SIGSEGV stop print
handle SIGBUS stop print
handle SIGABRT stop print
continue
thread apply all backtrace
continue
" > script.gdb
gdb -batch -command script.gdb -p "$server_pid" &
}
function clone_root
@ -120,7 +133,7 @@ function clone_root
git checkout FETCH_HEAD
echo 'Clonned merge head'
else
git fetch
git fetch origin "+refs/pull/$PULL_REQUEST_NUMBER/head"
git checkout "$COMMIT_SHA"
echo 'Checked out to commit'
fi
@ -163,6 +176,7 @@ function clone_submodules
contrib/xz
contrib/dragonbox
contrib/fast_float
contrib/NuRaft
)
git submodule sync
@ -182,6 +196,7 @@ function run_cmake
"-DENABLE_EMBEDDED_COMPILER=0"
"-DENABLE_THINLTO=0"
"-DUSE_UNWIND=1"
"-DENABLE_NURAFT=1"
)
# TODO remove this? we don't use ccache anyway. An option would be to download it
@ -251,8 +266,13 @@ function run_tests
00701_rollup
00834_cancel_http_readonly_queries_on_client_close
00911_tautological_compare
# Hyperscan
00926_multimatch
00929_multi_match_edit_distance
01681_hyperscan_debug_assertion
01176_mysql_client_interactive # requires mysql client
01031_mutations_interpreter_and_context
01053_ssd_dictionary # this test mistakenly requires acces to /var/lib/clickhouse -- can't run this locally, disabled
01083_expressions_in_engine_arguments
@ -315,11 +335,12 @@ function run_tests
# In fasttest, ENABLE_LIBRARIES=0, so rocksdb engine is not enabled by default
01504_rocksdb
01686_rocksdb
# Look at DistributedFilesToInsert, so cannot run in parallel.
01460_DistributedFilesToInsert
01541_max_memory_usage_for_user
01541_max_memory_usage_for_user_long
# Require python libraries like scipy, pandas and numpy
01322_ttest_scipy
@ -337,7 +358,7 @@ function run_tests
01666_blns
)
time clickhouse-test --hung-check -j 8 --order=random --use-skip-list --no-long --testname --shard --zookeeper --skip "${TESTS_TO_SKIP[@]}" -- "$FASTTEST_FOCUS" 2>&1 | ts '%Y-%m-%d %H:%M:%S' | tee "$FASTTEST_OUTPUT/test_log.txt"
(time clickhouse-test --hung-check -j 8 --order=random --use-skip-list --no-long --testname --shard --zookeeper --skip "${TESTS_TO_SKIP[@]}" -- "$FASTTEST_FOCUS" 2>&1 ||:) | ts '%Y-%m-%d %H:%M:%S' | tee "$FASTTEST_OUTPUT/test_log.txt"
# substr is to remove semicolon after test name
readarray -t FAILED_TESTS < <(awk '/\[ FAIL|TIMEOUT|ERROR \]/ { print substr($3, 1, length($3)-1) }' "$FASTTEST_OUTPUT/test_log.txt" | tee "$FASTTEST_OUTPUT/failed-parallel-tests.txt")
@ -354,7 +375,7 @@ function run_tests
stop_server ||:
# Clean the data so that there is no interference from the previous test run.
rm -rf "$FASTTEST_DATA"/{{meta,}data,user_files} ||:
rm -rf "$FASTTEST_DATA"/{{meta,}data,user_files,coordination} ||:
start_server

View File

@ -21,13 +21,16 @@ function clone
git init
git remote add origin https://github.com/ClickHouse/ClickHouse
git fetch --depth=100 origin "$SHA_TO_TEST"
git fetch --depth=100 origin master # Used to obtain the list of modified or added tests
# Network is unreliable. GitHub neither.
for _ in {1..100}; do git fetch --depth=100 origin "$SHA_TO_TEST" && break; sleep 1; done
# Used to obtain the list of modified or added tests
for _ in {1..100}; do git fetch --depth=100 origin master && break; sleep 1; done
# If not master, try to fetch pull/.../{head,merge}
if [ "$PR_TO_TEST" != "0" ]
then
git fetch --depth=100 origin "refs/pull/$PR_TO_TEST/*:refs/heads/pull/$PR_TO_TEST/*"
for _ in {1..100}; do git fetch --depth=100 origin "refs/pull/$PR_TO_TEST/*:refs/heads/pull/$PR_TO_TEST/*" && break; sleep 1; done
fi
git checkout "$SHA_TO_TEST"
@ -187,16 +190,16 @@ case "$stage" in
# Lost connection to the server. This probably means that the server died
# with abort.
echo "failure" > status.txt
if ! grep -ao "Received signal.*\|Logical error.*\|Assertion.*failed\|Failed assertion.*\|.*runtime error: .*\|.*is located.*\|SUMMARY: MemorySanitizer:.*\|SUMMARY: ThreadSanitizer:.*" server.log > description.txt
if ! grep -ao "Received signal.*\|Logical error.*\|Assertion.*failed\|Failed assertion.*\|.*runtime error: .*\|.*is located.*\|SUMMARY: MemorySanitizer:.*\|SUMMARY: ThreadSanitizer:.*\|.*_LIBCPP_ASSERT.*" server.log > description.txt
then
echo "Lost connection to server. See the logs" > description.txt
echo "Lost connection to server. See the logs." > description.txt
fi
else
# Something different -- maybe the fuzzer itself died? Don't grep the
# server log in this case, because we will find a message about normal
# server termination (Received signal 15), which is confusing.
echo "failure" > status.txt
echo "Fuzzer failed ($fuzzer_exit_code). See the logs" > description.txt
echo "Fuzzer failed ($fuzzer_exit_code). See the logs." > description.txt
fi
;&
"report")

View File

@ -18,7 +18,8 @@ RUN apt-get update \
curl \
tar \
krb5-user \
iproute2
iproute2 \
lsof
RUN rm -rf \
/var/lib/apt/lists/* \
/var/cache/debconf \

View File

@ -58,10 +58,11 @@ RUN dockerd --version; docker --version
RUN python3 -m pip install \
PyMySQL \
aerospike \
aerospike==4.0.0 \
avro \
cassandra-driver \
confluent-kafka \
confluent-kafka==1.5.0 \
dict2xml \
dicttoxml \
docker \
docker-compose==1.22.0 \

View File

@ -14,7 +14,7 @@ services:
ports:
- 1006:1006
- 50070:50070
- 9000:9000
- 9010:9010
depends_on:
- hdfskerberos
entrypoint: /etc/bootstrap.sh -d

View File

@ -7,4 +7,8 @@ services:
MYSQL_ROOT_PASSWORD: clickhouse
ports:
- 3308:3306
command: --server_id=100 --log-bin='mysql-bin-1.log' --default-time-zone='+3:00' --gtid-mode="ON" --enforce-gtid-consistency
command: --server_id=100 --log-bin='mysql-bin-1.log'
--default-time-zone='+3:00'
--gtid-mode="ON"
--enforce-gtid-consistency
--log-error-verbosity=3

View File

@ -7,4 +7,9 @@ services:
MYSQL_ROOT_PASSWORD: clickhouse
ports:
- 33308:3306
command: --server_id=100 --log-bin='mysql-bin-1.log' --default_authentication_plugin='mysql_native_password' --default-time-zone='+3:00' --gtid-mode="ON" --enforce-gtid-consistency
command: --server_id=100 --log-bin='mysql-bin-1.log'
--default_authentication_plugin='mysql_native_password'
--default-time-zone='+3:00'
--gtid-mode="ON"
--enforce-gtid-consistency
--log-error-verbosity=3

View File

@ -7,7 +7,7 @@ services:
MYSQL_ALLOW_EMPTY_PASSWORD: 1
command: --federated --socket /var/run/mysqld/mysqld.sock
healthcheck:
test: ["CMD", "mysqladmin" ,"ping", "-h", "localhost"]
test: ["CMD", "mysqladmin", "ping", "-h", "localhost"]
interval: 1s
timeout: 2s
retries: 100

View File

@ -1,11 +1,11 @@
version: '2.3'
services:
zoo1:
image: zookeeper:3.4.12
image: zookeeper:3.6.2
restart: always
environment:
ZOO_TICK_TIME: 500
ZOO_SERVERS: server.1=zoo1:2888:3888 server.2=zoo2:2888:3888 server.3=zoo3:2888:3888
ZOO_SERVERS: server.1=zoo1:2888:3888;2181 server.2=zoo2:2888:3888;2181 server.3=zoo3:2888:3888;2181
ZOO_MY_ID: 1
JVMFLAGS: -Dzookeeper.forceSync=no
volumes:
@ -16,11 +16,11 @@ services:
source: ${ZK_DATA_LOG1:-}
target: /datalog
zoo2:
image: zookeeper:3.4.12
image: zookeeper:3.6.2
restart: always
environment:
ZOO_TICK_TIME: 500
ZOO_SERVERS: server.1=zoo1:2888:3888 server.2=zoo2:2888:3888 server.3=zoo3:2888:3888
ZOO_SERVERS: server.1=zoo1:2888:3888;2181 server.2=zoo2:2888:3888;2181 server.3=zoo3:2888:3888
ZOO_MY_ID: 2
JVMFLAGS: -Dzookeeper.forceSync=no
volumes:
@ -31,11 +31,11 @@ services:
source: ${ZK_DATA_LOG2:-}
target: /datalog
zoo3:
image: zookeeper:3.4.12
image: zookeeper:3.6.2
restart: always
environment:
ZOO_TICK_TIME: 500
ZOO_SERVERS: server.1=zoo1:2888:3888 server.2=zoo2:2888:3888 server.3=zoo3:2888:3888
ZOO_SERVERS: server.1=zoo1:2888:3888;2181 server.2=zoo2:2888:3888;2181 server.3=zoo3:2888:3888;2181
ZOO_MY_ID: 3
JVMFLAGS: -Dzookeeper.forceSync=no
volumes:

View File

@ -97,6 +97,7 @@ function configure
rm -r right/db ||:
rm -r db0/preprocessed_configs ||:
rm -r db0/{data,metadata}/system ||:
rm db0/status ||:
cp -al db0/ left/db/
cp -al db0/ right/db/
}

View File

@ -1,4 +1,4 @@
#!/usr/bin/python3
#!/usr/bin/env python3
import argparse
import clickhouse_driver
@ -44,6 +44,7 @@ parser.add_argument('--port', nargs='*', default=[9000], help="Space-separated l
parser.add_argument('--runs', type=int, default=1, help='Number of query runs per server.')
parser.add_argument('--max-queries', type=int, default=None, help='Test no more than this number of queries, chosen at random.')
parser.add_argument('--queries-to-run', nargs='*', type=int, default=None, help='Space-separated list of indexes of queries to test.')
parser.add_argument('--max-query-seconds', type=int, default=10, help='For how many seconds at most a query is allowed to run. The script finishes with error if this time is exceeded.')
parser.add_argument('--profile-seconds', type=int, default=0, help='For how many seconds to profile a query for which the performance has changed.')
parser.add_argument('--long', action='store_true', help='Do not skip the tests tagged as long.')
parser.add_argument('--print-queries', action='store_true', help='Print test queries and exit.')
@ -323,7 +324,7 @@ for query_index in queries_to_run:
server_seconds += elapsed
print(f'query\t{query_index}\t{run_id}\t{conn_index}\t{elapsed}')
if elapsed > 10:
if elapsed > args.max_query_seconds:
# Stop processing pathologically slow queries, to avoid timing out
# the entire test task. This shouldn't really happen, so we don't
# need much handling for this case and can just exit.

View File

@ -60,4 +60,8 @@ fi
# more idiologically correct.
read -ra ADDITIONAL_OPTIONS <<< "${ADDITIONAL_OPTIONS:-}"
if [[ -n "$USE_DATABASE_REPLICATED" ]] && [[ "$USE_DATABASE_REPLICATED" -eq 1 ]]; then
ADDITIONAL_OPTIONS+=('--replicated-database')
fi
clickhouse-test --testname --shard --zookeeper --no-stateless --hung-check --print-time "$SKIP_LIST_OPT" "${ADDITIONAL_OPTIONS[@]}" "$SKIP_TESTS_OPTION" 2>&1 | ts '%Y-%m-%d %H:%M:%S' | tee test_output/test_result.txt

View File

@ -3,6 +3,9 @@ FROM yandex/clickhouse-test-base
ARG odbc_driver_url="https://github.com/ClickHouse/clickhouse-odbc/releases/download/v1.1.4.20200302/clickhouse-odbc-1.1.4-Linux.tar.gz"
RUN echo "deb [trusted=yes] http://repo.mysql.com/apt/ubuntu/ bionic mysql-5.7" >> /etc/apt/sources.list \
&& apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 8C718D3B5072E1F5
RUN apt-get update -y \
&& env DEBIAN_FRONTEND=noninteractive \
apt-get install --yes --no-install-recommends \
@ -13,6 +16,7 @@ RUN apt-get update -y \
ncdu \
netcat-openbsd \
openssl \
protobuf-compiler \
python3 \
python3-lxml \
python3-requests \
@ -23,7 +27,8 @@ RUN apt-get update -y \
telnet \
tree \
unixodbc \
wget
wget \
mysql-client=5.7*
RUN pip3 install numpy scipy pandas

View File

@ -53,14 +53,19 @@ function run_tests()
if [ "$NUM_TRIES" -gt "1" ]; then
ADDITIONAL_OPTIONS+=('--skip')
ADDITIONAL_OPTIONS+=('00000_no_tests_to_skip')
ADDITIONAL_OPTIONS+=('--jobs')
ADDITIONAL_OPTIONS+=('4')
fi
for _ in $(seq 1 "$NUM_TRIES"); do
clickhouse-test --testname --shard --zookeeper --hung-check --print-time "$SKIP_LIST_OPT" "${ADDITIONAL_OPTIONS[@]}" 2>&1 | ts '%Y-%m-%d %H:%M:%S' | tee -a test_output/test_result.txt
if [ "${PIPESTATUS[0]}" -ne "0" ]; then
break;
if [[ -n "$USE_DATABASE_REPLICATED" ]] && [[ "$USE_DATABASE_REPLICATED" -eq 1 ]]; then
ADDITIONAL_OPTIONS+=('--replicated-database')
fi
done
clickhouse-test --testname --shard --zookeeper --hung-check --print-time \
--test-runs "$NUM_TRIES" \
"$SKIP_LIST_OPT" "${ADDITIONAL_OPTIONS[@]}" 2>&1 \
| ts '%Y-%m-%d %H:%M:%S' \
| tee -a test_output/test_result.txt
}
export -f run_tests

View File

@ -5,7 +5,10 @@ RUN apt-get update -y && \
apt-get install -y --no-install-recommends \
python3-pip \
python3-setuptools \
python3-wheel
python3-wheel \
brotli \
netcat-openbsd \
zstd
RUN python3 -m pip install \
wheel \
@ -15,7 +18,10 @@ RUN python3 -m pip install \
pytest-randomly \
pytest-rerunfailures \
pytest-timeout \
pytest-xdist
pytest-xdist \
pandas \
numpy \
scipy
CMD dpkg -i package_folder/clickhouse-common-static_*.deb; \
dpkg -i package_folder/clickhouse-common-static-dbg_*.deb; \

View File

@ -8,16 +8,23 @@ dpkg -i package_folder/clickhouse-server_*.deb
dpkg -i package_folder/clickhouse-client_*.deb
dpkg -i package_folder/clickhouse-test_*.deb
function configure()
{
# install test configs
/usr/share/clickhouse-test/config/install.sh
# for clickhouse-server (via service)
echo "ASAN_OPTIONS='malloc_context_size=10 verbosity=1 allocator_release_to_os_interval_ms=10000'" >> /etc/environment
# for clickhouse-client
export ASAN_OPTIONS='malloc_context_size=10 allocator_release_to_os_interval_ms=10000'
# since we run clickhouse from root
sudo chown root: /var/lib/clickhouse
}
function stop()
{
timeout 120 service clickhouse-server stop
# Wait for process to disappear from processlist and also try to kill zombies.
while kill -9 "$(pidof clickhouse-server)"
do
echo "Killed clickhouse-server"
sleep 0.5
done
clickhouse stop
}
function start()
@ -33,19 +40,26 @@ function start()
tail -n1000 /var/log/clickhouse-server/clickhouse-server.log
break
fi
timeout 120 service clickhouse-server start
# use root to match with current uid
clickhouse start --user root >/var/log/clickhouse-server/stdout.log 2>/var/log/clickhouse-server/stderr.log
sleep 0.5
counter=$((counter + 1))
done
echo "
handle all noprint
handle SIGSEGV stop print
handle SIGBUS stop print
handle SIGABRT stop print
continue
thread apply all backtrace
continue
" > script.gdb
gdb -batch -command script.gdb -p "$(cat /var/run/clickhouse-server/clickhouse-server.pid)" &
}
# install test configs
/usr/share/clickhouse-test/config/install.sh
# for clickhouse-server (via service)
echo "ASAN_OPTIONS='malloc_context_size=10 verbosity=1 allocator_release_to_os_interval_ms=10000'" >> /etc/environment
# for clickhouse-client
export ASAN_OPTIONS='malloc_context_size=10 allocator_release_to_os_interval_ms=10000'
configure
start
@ -64,9 +78,11 @@ clickhouse-client --query "RENAME TABLE datasets.hits_v1 TO test.hits"
clickhouse-client --query "RENAME TABLE datasets.visits_v1 TO test.visits"
clickhouse-client --query "SHOW TABLES FROM test"
./stress --output-folder test_output --skip-func-tests "$SKIP_TESTS_OPTION"
./stress --hung-check --output-folder test_output --skip-func-tests "$SKIP_TESTS_OPTION" && echo "OK" > /test_output/script_exit_code.txt || echo "FAIL" > /test_output/script_exit_code.txt
stop
# TODO remove me when persistent snapshots will be ready
rm -fr /var/lib/clickhouse/coordination ||:
start
clickhouse-client --query "SELECT 'Server successfuly started'" > /test_output/alive_check.txt || echo 'Server failed to start' > /test_output/alive_check.txt

View File

@ -1,8 +1,9 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
from multiprocessing import cpu_count
from subprocess import Popen, check_call
from subprocess import Popen, call, STDOUT
import os
import sys
import shutil
import argparse
import logging
@ -22,12 +23,15 @@ def get_options(i):
if 0 < i:
options += " --order=random"
if i % 2 == 1:
if i % 3 == 1:
options += " --db-engine=Ordinary"
if i % 3 == 2:
options += ''' --db-engine="Replicated('/test/db/test_{}', 's1', 'r1')"'''.format(i)
# If database name is not specified, new database is created for each functional test.
# Run some threads with one database for all tests.
if i % 3 == 1:
if i % 2 == 1:
options += " --database=test_{}".format(i)
if i == 13:
@ -64,7 +68,8 @@ if __name__ == "__main__":
parser.add_argument("--server-log-folder", default='/var/log/clickhouse-server')
parser.add_argument("--output-folder")
parser.add_argument("--global-time-limit", type=int, default=3600)
parser.add_argument("--num-parallel", default=cpu_count());
parser.add_argument("--num-parallel", default=cpu_count())
parser.add_argument('--hung-check', action='store_true', default=False)
args = parser.parse_args()
func_pipes = []
@ -81,4 +86,13 @@ if __name__ == "__main__":
logging.info("Finished %s from %s processes", len(retcodes), len(func_pipes))
time.sleep(5)
logging.info("All processes finished")
if args.hung_check:
logging.info("Checking if some queries hung")
cmd = "{} {} {}".format(args.test_cmd, "--hung-check", "00001_select_1")
res = call(cmd, shell=True, stderr=STDOUT)
if res != 0:
logging.info("Hung check failed with exit code {}".format(res))
sys.exit(1)
logging.info("Stress test finished")

View File

@ -1,12 +1,23 @@
# docker build -t yandex/clickhouse-style-test .
FROM ubuntu:20.04
RUN apt-get update && env DEBIAN_FRONTEND=noninteractive apt-get install --yes shellcheck libxml2-utils git python3-pip && pip3 install codespell
RUN apt-get update && env DEBIAN_FRONTEND=noninteractive apt-get install --yes \
shellcheck \
libxml2-utils \
git \
python3-pip \
pylint \
yamllint \
&& pip3 install codespell
# For |& syntax
SHELL ["bash", "-c"]
CMD cd /ClickHouse/utils/check-style && \
./check-style -n | tee /test_output/style_output.txt && \
./check-typos | tee /test_output/typos_output.txt && \
./check-whitespaces -n | tee /test_output/whitespaces_output.txt && \
./check-duplicate-includes.sh | tee /test_output/duplicate_output.txt && \
./shellcheck-run.sh | tee /test_output/shellcheck_output.txt
./check-style -n |& tee /test_output/style_output.txt && \
./check-typos |& tee /test_output/typos_output.txt && \
./check-whitespaces -n |& tee /test_output/whitespaces_output.txt && \
./check-duplicate-includes.sh |& tee /test_output/duplicate_output.txt && \
./shellcheck-run.sh |& tee /test_output/shellcheck_output.txt && \
true

View File

@ -0,0 +1,29 @@
---
toc_priority:
toc_title:
---
# data_type_name {#data_type-name}
Description.
**Parameters** (Optional)
- `x` — Description. [Type name](relative/path/to/type/dscr.md#type).
- `y` — Description. [Type name](relative/path/to/type/dscr.md#type).
**Examples**
```sql
```
## Additional Info {#additional-info} (Optional)
The name of an additional section can be any, for example, **Usage**.
**See Also** (Optional)
- [link](#)
[Original article](https://clickhouse.tech/docs/en/data_types/<data-type-name>/) <!--hide-->

View File

@ -12,16 +12,20 @@ Alias: `<alias name>`. (Optional)
More text (Optional).
**Parameters** (Optional)
**Arguments** (Optional)
- `x` — Description. [Type name](relative/path/to/type/dscr.md#type).
- `y` — Description. [Type name](relative/path/to/type/dscr.md#type).
**Parameters** (Optional, only for parametric aggregate functions)
- `z` — Description. [Type name](relative/path/to/type/dscr.md#type).
**Returned value(s)**
- Returned values list.
Type: [Type](relative/path/to/type/dscr.md#type).
Type: [Type name](relative/path/to/type/dscr.md#type).
**Example**

View File

@ -8,10 +8,14 @@ Columns:
**Example**
Query:
``` sql
SELECT * FROM system.table_name
```
Result:
``` text
Some output. It shouldn't be too long.
```

View File

@ -40,7 +40,7 @@ $ cd ClickHouse
``` bash
$ mkdir build
$ cd build
$ cmake ..-DCMAKE_C_COMPILER=`brew --prefix llvm`/bin/clang -DCMAKE_CXX_COMPILER=`brew --prefix llvm`/bin/clang++ -DCMAKE_PREFIX_PATH=`brew --prefix llvm`
$ cmake .. -DCMAKE_C_COMPILER=`brew --prefix llvm`/bin/clang -DCMAKE_CXX_COMPILER=`brew --prefix llvm`/bin/clang++ -DCMAKE_PREFIX_PATH=`brew --prefix llvm`
$ ninja
$ cd ..
```

View File

@ -0,0 +1,17 @@
---
toc_priority: 32
toc_title: Atomic
---
# Atomic {#atomic}
It is supports non-blocking `DROP` and `RENAME TABLE` queries and atomic `EXCHANGE TABLES t1 AND t2` queries. Atomic database engine is used by default.
## Creating a Database {#creating-a-database}
```sql
CREATE DATABASE test ENGINE = Atomic;
```
[Original article](https://clickhouse.tech/docs/en/engines/database_engines/atomic/) <!--hide-->

View File

@ -8,7 +8,7 @@ toc_title: Introduction
Database engines allow you to work with tables.
By default, ClickHouse uses its native database engine, which provides configurable [table engines](../../engines/table-engines/index.md) and an [SQL dialect](../../sql-reference/syntax.md).
By default, ClickHouse uses database engine [Atomic](../../engines/database-engines/atomic.md). It is provides configurable [table engines](../../engines/table-engines/index.md) and an [SQL dialect](../../sql-reference/syntax.md).
You can also use the following database engines:

View File

@ -93,6 +93,7 @@ ClickHouse has only one physical order, which is determined by `ORDER BY` clause
- Cascade `UPDATE/DELETE` queries are not supported by the `MaterializeMySQL` engine.
- Replication can be easily broken.
- Manual operations on database and tables are forbidden.
- `MaterializeMySQL` is influenced by [optimize_on_insert](../../operations/settings/settings.md#optimize-on-insert) setting. The data is merged in the corresponding table in the `MaterializeMySQL` database when a table in the MySQL server changes.
## Examples of Use {#examples-of-use}
@ -156,4 +157,4 @@ SELECT * FROM mysql.test;
└───┴─────┴──────┘
```
[Original article](https://clickhouse.tech/docs/en/database_engines/materialize-mysql/) <!--hide-->
[Original article](https://clickhouse.tech/docs/en/engines/database-engines/materialize-mysql/) <!--hide-->

View File

@ -51,6 +51,23 @@ All other MySQL data types are converted into [String](../../sql-reference/data-
[Nullable](../../sql-reference/data-types/nullable.md) is supported.
## Global Variables Support {#global-variables-support}
For better compatibility you may address global variables in MySQL style, as `@@identifier`.
These variables are supported:
- `version`
- `max_allowed_packet`
!!! warning "Warning"
By now these variables are stubs and don't correspond to anything.
Example:
``` sql
SELECT @@version;
```
## Examples of Use {#examples-of-use}
Table in MySQL:

View File

@ -7,8 +7,6 @@ toc_title: EmbeddedRocksDB
This engine allows integrating ClickHouse with [rocksdb](http://rocksdb.org/).
`EmbeddedRocksDB` lets you:
## Creating a Table {#table_engine-EmbeddedRocksDB-creating-a-table}
``` sql
@ -23,6 +21,9 @@ CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]
Required parameters:
- `primary_key_name` any column name in the column list.
- `primary key` must be specified, it supports only one column in the primary key. The primary key will be serialized in binary as a `rocksdb key`.
- columns other than the primary key will be serialized in binary as `rocksdb` value in corresponding order.
- queries with key `equals` or `in` filtering will be optimized to multi keys lookup from `rocksdb`.
Example:
@ -38,8 +39,4 @@ ENGINE = EmbeddedRocksDB
PRIMARY KEY key
```
## Description {#description}
- `primary key` must be specified, it only supports one column in primary key. The primary key will serialized in binary as rocksdb key.
- columns other than the primary key will be serialized in binary as rocksdb value in corresponding order.
- queries with key `equals` or `in` filtering will be optimized to multi keys lookup from rocksdb.
[Original article](https://clickhouse.tech/docs/en/operations/table_engines/embedded-rocksdb/) <!--hide-->

View File

@ -12,6 +12,9 @@ List of supported integrations:
- [ODBC](../../../engines/table-engines/integrations/odbc.md)
- [JDBC](../../../engines/table-engines/integrations/jdbc.md)
- [MySQL](../../../engines/table-engines/integrations/mysql.md)
- [MongoDB](../../../engines/table-engines/integrations/mongodb.md)
- [HDFS](../../../engines/table-engines/integrations/hdfs.md)
- [S3](../../../engines/table-engines/integrations/s3.md)
- [Kafka](../../../engines/table-engines/integrations/kafka.md)
- [EmbeddedRocksDB](../../../engines/table-engines/integrations/embedded-rocksdb.md)
- [RabbitMQ](../../../engines/table-engines/integrations/rabbitmq.md)

View File

@ -38,20 +38,20 @@ SETTINGS
Required parameters:
- `kafka_broker_list` A comma-separated list of brokers (for example, `localhost:9092`).
- `kafka_topic_list` A list of Kafka topics.
- `kafka_group_name` A group of Kafka consumers. Reading margins are tracked for each group separately. If you dont want messages to be duplicated in the cluster, use the same group name everywhere.
- `kafka_format` Message format. Uses the same notation as the SQL `FORMAT` function, such as `JSONEachRow`. For more information, see the [Formats](../../../interfaces/formats.md) section.
- `kafka_broker_list` A comma-separated list of brokers (for example, `localhost:9092`).
- `kafka_topic_list` A list of Kafka topics.
- `kafka_group_name` A group of Kafka consumers. Reading margins are tracked for each group separately. If you dont want messages to be duplicated in the cluster, use the same group name everywhere.
- `kafka_format` Message format. Uses the same notation as the SQL `FORMAT` function, such as `JSONEachRow`. For more information, see the [Formats](../../../interfaces/formats.md) section.
Optional parameters:
- `kafka_row_delimiter` Delimiter character, which ends the message.
- `kafka_schema` Parameter that must be used if the format requires a schema definition. For example, [Capn Proto](https://capnproto.org/) requires the path to the schema file and the name of the root `schema.capnp:Message` object.
- `kafka_num_consumers` The number of consumers per table. Default: `1`. Specify more consumers if the throughput of one consumer is insufficient. The total number of consumers should not exceed the number of partitions in the topic, since only one consumer can be assigned per partition.
- `kafka_max_block_size` - The maximum batch size (in messages) for poll (default: `max_block_size`).
- `kafka_skip_broken_messages` Kafka message parser tolerance to schema-incompatible messages per block. Default: `0`. If `kafka_skip_broken_messages = N` then the engine skips *N* Kafka messages that cannot be parsed (a message equals a row of data).
- `kafka_commit_every_batch` - Commit every consumed and handled batch instead of a single commit after writing a whole block (default: `0`).
- `kafka_thread_per_consumer` - Provide independent thread for each consumer (default: `0`). When enabled, every consumer flush the data independently, in parallel (otherwise - rows from several consumers squashed to form one block).
- `kafka_row_delimiter` Delimiter character, which ends the message.
- `kafka_schema` Parameter that must be used if the format requires a schema definition. For example, [Capn Proto](https://capnproto.org/) requires the path to the schema file and the name of the root `schema.capnp:Message` object.
- `kafka_num_consumers` The number of consumers per table. Default: `1`. Specify more consumers if the throughput of one consumer is insufficient. The total number of consumers should not exceed the number of partitions in the topic, since only one consumer can be assigned per partition.
- `kafka_max_block_size` The maximum batch size (in messages) for poll (default: `max_block_size`).
- `kafka_skip_broken_messages` Kafka message parser tolerance to schema-incompatible messages per block. Default: `0`. If `kafka_skip_broken_messages = N` then the engine skips *N* Kafka messages that cannot be parsed (a message equals a row of data).
- `kafka_commit_every_batch` Commit every consumed and handled batch instead of a single commit after writing a whole block (default: `0`).
- `kafka_thread_per_consumer` Provide independent thread for each consumer (default: `0`). When enabled, every consumer flush the data independently, in parallel (otherwise rows from several consumers squashed to form one block).
Examples:

View File

@ -0,0 +1,57 @@
---
toc_priority: 7
toc_title: MongoDB
---
# MongoDB {#mongodb}
MongoDB engine is read-only table engine which allows to read data (`SELECT` queries) from remote MongoDB collection. Engine supports only non-nested data types. `INSERT` queries are not supported.
## Creating a Table {#creating-a-table}
``` sql
CREATE TABLE [IF NOT EXISTS] [db.]table_name
(
name1 [type1],
name2 [type2],
...
) ENGINE = MongoDB(host:port, database, collection, user, password);
```
**Engine Parameters**
- `host:port` — MongoDB server address.
- `database` — Remote database name.
- `collection` — Remote collection name.
- `user` — MongoDB user.
- `password` — User password.
## Usage Example {#usage-example}
Table in ClickHouse which allows to read data from MongoDB collection:
``` text
CREATE TABLE mongo_table
(
key UInt64,
data String
) ENGINE = MongoDB('mongo1:27017', 'test', 'simple_table', 'testuser', 'clickhouse');
```
Query:
``` sql
SELECT COUNT() FROM mongo_table;
```
``` text
┌─count()─┐
│ 4 │
└─────────┘
```
[Original article](https://clickhouse.tech/docs/en/operations/table_engines/integrations/mongodb/) <!--hide-->

View File

@ -59,10 +59,26 @@ Optional parameters:
- `rabbitmq_max_block_size`
- `rabbitmq_flush_interval_ms`
Required configuration:
Also format settings can be added along with rabbitmq-related settings.
Example:
``` sql
CREATE TABLE queue (
key UInt64,
value UInt64,
date DateTime
) ENGINE = RabbitMQ SETTINGS rabbitmq_host_port = 'localhost:5672',
rabbitmq_exchange_name = 'exchange1',
rabbitmq_format = 'JSONEachRow',
rabbitmq_num_consumers = 5,
date_time_input_format = 'best_effort';
```
The RabbitMQ server configuration should be added using the ClickHouse config file.
Required configuration:
``` xml
<rabbitmq>
<username>root</username>
@ -70,16 +86,12 @@ The RabbitMQ server configuration should be added using the ClickHouse config fi
</rabbitmq>
```
Example:
Additional configuration:
``` sql
CREATE TABLE queue (
key UInt64,
value UInt64
) ENGINE = RabbitMQ SETTINGS rabbitmq_host_port = 'localhost:5672',
rabbitmq_exchange_name = 'exchange1',
rabbitmq_format = 'JSONEachRow',
rabbitmq_num_consumers = 5;
``` xml
<rabbitmq>
<vhost>clickhouse</vhost>
</rabbitmq>
```
## Description {#description}
@ -105,6 +117,7 @@ Exchange type options:
- `consistent_hash` - Data is evenly distributed between all bound tables (where the exchange name is the same). Note that this exchange type must be enabled with RabbitMQ plugin: `rabbitmq-plugins enable rabbitmq_consistent_hash_exchange`.
Setting `rabbitmq_queue_base` may be used for the following cases:
- to let different tables share queues, so that multiple consumers could be registered for the same queues, which makes a better performance. If using `rabbitmq_num_consumers` and/or `rabbitmq_num_queues` settings, the exact match of queues is achieved in case these parameters are the same.
- to be able to restore reading from certain durable queues when not all messages were successfully consumed. To resume consumption from one specific queue - set its name in `rabbitmq_queue_base` setting and do not specify `rabbitmq_num_consumers` and `rabbitmq_num_queues` (defaults to 1). To resume consumption from all queues, which were declared for a specific table - just specify the same settings: `rabbitmq_queue_base`, `rabbitmq_num_consumers`, `rabbitmq_num_queues`. By default, queue names will be unique to tables.
- to reuse queues as they are declared durable and not auto-deleted. (Can be deleted via any of RabbitMQ CLI tools.)

View File

@ -114,6 +114,10 @@ CREATE TABLE big_table (name String, value UInt32) ENGINE = S3('https://storage.
- `_path` — Path to the file.
- `_file` — Name of the file.
**See Also**
- [Virtual columns](../../../engines/table-engines/index.md#table_engines-virtual_columns)
## S3-related settings {#settings}
The following settings can be set before query execution or placed into configuration file.
@ -124,8 +128,29 @@ The following settings can be set before query execution or placed into configur
Security consideration: if malicious user can specify arbitrary S3 URLs, `s3_max_redirects` must be set to zero to avoid [SSRF](https://en.wikipedia.org/wiki/Server-side_request_forgery) attacks; or alternatively, `remote_host_filter` must be specified in server configuration.
**See Also**
### Endpoint-based settings {#endpointsettings}
- [Virtual columns](../../../engines/table-engines/index.md#table_engines-virtual_columns)
The following settings can be specified in configuration file for given endpoint (which will be matched by exact prefix of a URL):
- `endpoint` — Mandatory. Specifies prefix of an endpoint.
- `access_key_id` and `secret_access_key` — Optional. Specifies credentials to use with given endpoint.
- `use_environment_credentials` — Optional, default value is `false`. If set to `true`, S3 client will try to obtain credentials from environment variables and Amazon EC2 metadata for given endpoint.
- `header` — Optional, can be speficied multiple times. Adds specified HTTP header to a request to given endpoint.
- `server_side_encryption_customer_key_base64` — Optional. If specified, required headers for accessing S3 objects with SSE-C encryption will be set.
Example:
```
<s3>
<endpoint-name>
<endpoint>https://storage.yandexcloud.net/my-test-bucket-768/</endpoint>
<!-- <access_key_id>ACCESS_KEY_ID</access_key_id> -->
<!-- <secret_access_key>SECRET_ACCESS_KEY</secret_access_key> -->
<!-- <use_environment_credentials>false</use_environment_credentials> -->
<!-- <header>Authorization: Bearer SOME-TOKEN</header> -->
<!-- <server_side_encryption_customer_key_base64>BASE64-ENCODED-KEY</server_side_encryption_customer_key_base64> -->
</endpoint-name>
</s3>
```
[Original article](https://clickhouse.tech/docs/en/operations/table_engines/s3/) <!--hide-->

View File

@ -45,7 +45,10 @@ ORDER BY expr
[PARTITION BY expr]
[PRIMARY KEY expr]
[SAMPLE BY expr]
[TTL expr [DELETE|TO DISK 'xxx'|TO VOLUME 'xxx'], ...]
[TTL expr
[DELETE|TO DISK 'xxx'|TO VOLUME 'xxx' [, ...] ]
[WHERE conditions]
[GROUP BY key_expr [SET v1 = aggr_func(v1) [, v2 = aggr_func(v2) ...]] ] ]
[SETTINGS name=value, ...]
```
@ -80,7 +83,7 @@ For a description of parameters, see the [CREATE query description](../../../sql
Expression must have one `Date` or `DateTime` column as a result. Example:
`TTL date + INTERVAL 1 DAY`
Type of the rule `DELETE|TO DISK 'xxx'|TO VOLUME 'xxx'` specifies an action to be done with the part if the expression is satisfied (reaches current time): removal of expired rows, moving a part (if expression is satisfied for all rows in a part) to specified disk (`TO DISK 'xxx'`) or to volume (`TO VOLUME 'xxx'`). Default type of the rule is removal (`DELETE`). List of multiple rules can specified, but there should be no more than one `DELETE` rule.
Type of the rule `DELETE|TO DISK 'xxx'|TO VOLUME 'xxx'|GROUP BY` specifies an action to be done with the part if the expression is satisfied (reaches current time): removal of expired rows, moving a part (if expression is satisfied for all rows in a part) to specified disk (`TO DISK 'xxx'`) or to volume (`TO VOLUME 'xxx'`), or aggregating values in expired rows. Default type of the rule is removal (`DELETE`). List of multiple rules can specified, but there should be no more than one `DELETE` rule.
For more details, see [TTL for columns and tables](#table_engine-mergetree-ttl)
@ -101,6 +104,7 @@ For a description of parameters, see the [CREATE query description](../../../sql
- `max_parts_in_total` — Maximum number of parts in all partitions.
- `max_compress_block_size` — Maximum size of blocks of uncompressed data before compressing for writing to a table. You can also specify this setting in the global settings (see [max_compress_block_size](../../../operations/settings/settings.md#max-compress-block-size) setting). The value specified when table is created overrides the global value for this setting.
- `min_compress_block_size` — Minimum size of blocks of uncompressed data required for compression when writing the next mark. You can also specify this setting in the global settings (see [min_compress_block_size](../../../operations/settings/settings.md#min-compress-block-size) setting). The value specified when table is created overrides the global value for this setting.
- `max_partitions_to_read` — Limits the maximum number of partitions that can be accessed in one query. You can also specify setting [max_partitions_to_read](../../../operations/settings/merge-tree-settings.md#max-partitions-to-read) in the global setting.
**Example of Sections Setting**
@ -455,18 +459,28 @@ ALTER TABLE example_table
Table can have an expression for removal of expired rows, and multiple expressions for automatic move of parts between [disks or volumes](#table_engine-mergetree-multiple-volumes). When rows in the table expire, ClickHouse deletes all corresponding rows. For parts moving feature, all rows of a part must satisfy the movement expression criteria.
``` sql
TTL expr [DELETE|TO DISK 'aaa'|TO VOLUME 'bbb'], ...
TTL expr
[DELETE|TO DISK 'xxx'|TO VOLUME 'xxx'][, DELETE|TO DISK 'aaa'|TO VOLUME 'bbb'] ...
[WHERE conditions]
[GROUP BY key_expr [SET v1 = aggr_func(v1) [, v2 = aggr_func(v2) ...]] ]
```
Type of TTL rule may follow each TTL expression. It affects an action which is to be done once the expression is satisfied (reaches current time):
- `DELETE` - delete expired rows (default action);
- `TO DISK 'aaa'` - move part to the disk `aaa`;
- `TO VOLUME 'bbb'` - move part to the disk `bbb`.
- `TO VOLUME 'bbb'` - move part to the disk `bbb`;
- `GROUP BY` - aggregate expired rows.
Examples:
With `WHERE` clause you may specify which of the expired rows to delete or aggregate (it cannot be applied to moves).
Creating a table with TTL
`GROUP BY` expression must be a prefix of the table primary key.
If a column is not part of the `GROUP BY` expression and is not set explicitely in the `SET` clause, in result row it contains an occasional value from the grouped rows (as if aggregate function `any` is applied to it).
**Examples**
Creating a table with TTL:
``` sql
CREATE TABLE example_table
@ -482,13 +496,43 @@ TTL d + INTERVAL 1 MONTH [DELETE],
d + INTERVAL 2 WEEK TO DISK 'bbb';
```
Altering TTL of the table
Altering TTL of the table:
``` sql
ALTER TABLE example_table
MODIFY TTL d + INTERVAL 1 DAY;
```
Creating a table, where the rows are expired after one month. The expired rows where dates are Mondays are deleted:
``` sql
CREATE TABLE table_with_where
(
d DateTime,
a Int
)
ENGINE = MergeTree
PARTITION BY toYYYYMM(d)
ORDER BY d
TTL d + INTERVAL 1 MONTH DELETE WHERE toDayOfWeek(d) = 1;
```
Creating a table, where expired rows are aggregated. In result rows `x` contains the maximum value accross the grouped rows, `y` — the minimum value, and `d` — any occasional value from grouped rows.
``` sql
CREATE TABLE table_for_aggregation
(
d DateTime,
k1 Int,
k2 Int,
x Int,
y Int
)
ENGINE = MergeTree
ORDER BY k1, k2
TTL d + INTERVAL 1 MONTH GROUP BY k1, k2 SET x = max(x), y = min(y);
```
**Removing Data**
Data with an expired TTL is removed when ClickHouse merges data parts.
@ -671,6 +715,7 @@ Configuration markup:
<endpoint>https://storage.yandexcloud.net/my-bucket/root-path/</endpoint>
<access_key_id>your_access_key_id</access_key_id>
<secret_access_key>your_secret_access_key</secret_access_key>
<server_side_encryption_customer_key_base64>your_base64_encoded_customer_key</server_side_encryption_customer_key_base64>
<proxy>
<uri>http://proxy1</uri>
<uri>http://proxy2</uri>
@ -706,7 +751,8 @@ Optional parameters:
- `metadata_path` — Path on local FS to store metadata files for S3. Default value is `/var/lib/clickhouse/disks/<disk_name>/`.
- `cache_enabled` — Allows to cache mark and index files on local FS. Default value is `true`.
- `cache_path` — Path on local FS where to store cached mark and index files. Default value is `/var/lib/clickhouse/disks/<disk_name>/cache/`.
- `skip_access_check` — If true disk access checks will not be performed on disk start-up. Default value is `false`.
- `skip_access_check` — If true, disk access checks will not be performed on disk start-up. Default value is `false`.
- `server_side_encryption_customer_key_base64` — If specified, required headers for accessing S3 objects with SSE-C encryption will be set.
S3 disk can be configured as `main` or `cold` storage:

View File

@ -66,7 +66,8 @@ SELECT * FROM file_engine_table
## Usage in ClickHouse-local {#usage-in-clickhouse-local}
In [clickhouse-local](../../../operations/utilities/clickhouse-local.md) File engine accepts file path in addition to `Format`. Default input/output streams can be specified using numeric or human-readable names like `0` or `stdin`, `1` or `stdout`.
In [clickhouse-local](../../../operations/utilities/clickhouse-local.md) File engine accepts file path in addition to `Format`. Default input/output streams can be specified using numeric or human-readable names like `0` or `stdin`, `1` or `stdout`. It is possible to read and write compressed files based on an additional engine parameter or file extension (`gz`, `br` or `xz`).
**Example:**
``` bash

View File

@ -39,4 +39,4 @@ More details on [manipulating partitions](../../sql-reference/statements/alter/p
Its rather radical to drop all data from a table, but in some cases it might be exactly what you need.
More details on [table truncation](../../sql-reference/statements/alter/partition.md#alter_drop-partition).
More details on [table truncation](../../sql-reference/statements/truncate.md).

View File

@ -5,7 +5,7 @@ toc_title: Brown University Benchmark
# Brown University Benchmark
MgBench - A new analytical benchmark for machine-generated log data, [Andrew Crotty](http://cs.brown.edu/people/acrotty/).
`MgBench` is a new analytical benchmark for machine-generated log data, [Andrew Crotty](http://cs.brown.edu/people/acrotty/).
Download the data:
```
@ -153,7 +153,7 @@ ORDER BY dt,
hr;
-- Q1.4: Over a 1-month period, how often was each server blocked on disk I/O?
-- Q1.4: Over 1 month, how often was each server blocked on disk I/O?
SELECT machine_name,
COUNT(*) AS spikes
@ -301,7 +301,7 @@ WHERE event_type = 'temperature'
AND log_time >= '2019-11-29 17:00:00.000';
-- Q3.4: Over the past 6 months, how frequently was each door opened?
-- Q3.4: Over the past 6 months, how frequently were each door opened?
SELECT device_name,
device_floor,
@ -412,3 +412,5 @@ ORDER BY yr,
```
The data is also available for interactive queries in the [Playground](https://gh-api.clickhouse.tech/play?user=play), [example](https://gh-api.clickhouse.tech/play?user=play#U0VMRUNUIG1hY2hpbmVfbmFtZSwKICAgICAgIE1JTihjcHUpIEFTIGNwdV9taW4sCiAgICAgICBNQVgoY3B1KSBBUyBjcHVfbWF4LAogICAgICAgQVZHKGNwdSkgQVMgY3B1X2F2ZywKICAgICAgIE1JTihuZXRfaW4pIEFTIG5ldF9pbl9taW4sCiAgICAgICBNQVgobmV0X2luKSBBUyBuZXRfaW5fbWF4LAogICAgICAgQVZHKG5ldF9pbikgQVMgbmV0X2luX2F2ZywKICAgICAgIE1JTihuZXRfb3V0KSBBUyBuZXRfb3V0X21pbiwKICAgICAgIE1BWChuZXRfb3V0KSBBUyBuZXRfb3V0X21heCwKICAgICAgIEFWRyhuZXRfb3V0KSBBUyBuZXRfb3V0X2F2ZwpGUk9NICgKICBTRUxFQ1QgbWFjaGluZV9uYW1lLAogICAgICAgICBDT0FMRVNDRShjcHVfdXNlciwgMC4wKSBBUyBjcHUsCiAgICAgICAgIENPQUxFU0NFKGJ5dGVzX2luLCAwLjApIEFTIG5ldF9pbiwKICAgICAgICAgQ09BTEVTQ0UoYnl0ZXNfb3V0LCAwLjApIEFTIG5ldF9vdXQKICBGUk9NIG1nYmVuY2gubG9nczEKICBXSEVSRSBtYWNoaW5lX25hbWUgSU4gKCdhbmFuc2knLCdhcmFnb2cnLCd1cmQnKQogICAgQU5EIGxvZ190aW1lID49IFRJTUVTVEFNUCAnMjAxNy0wMS0xMSAwMDowMDowMCcKKSBBUyByCkdST1VQIEJZIG1hY2hpbmVfbmFtZQ==).
[Original article](https://clickhouse.tech/docs/en/getting_started/example_datasets/brown-benchmark/) <!--hide-->

File diff suppressed because one or more lines are too long

View File

@ -20,5 +20,6 @@ The list of documented datasets:
- [Terabyte of Click Logs from Criteo](../../getting-started/example-datasets/criteo.md)
- [AMPLab Big Data Benchmark](../../getting-started/example-datasets/amplab-benchmark.md)
- [Brown University Benchmark](../../getting-started/example-datasets/brown-benchmark.md)
- [Cell Towers](../../getting-started/example-datasets/cell-towers.md)
[Original article](https://clickhouse.tech/docs/en/getting_started/example_datasets) <!--hide-->

View File

@ -15,17 +15,9 @@ This dataset can be obtained in two ways:
Downloading data:
``` bash
for s in `seq 1987 2018`
do
for m in `seq 1 12`
do
wget https://transtats.bts.gov/PREZIP/On_Time_Reporting_Carrier_On_Time_Performance_1987_present_${s}_${m}.zip
done
done
echo https://transtats.bts.gov/PREZIP/On_Time_Reporting_Carrier_On_Time_Performance_1987_present_{1987..2021}_{1..12}.zip | xargs -P10 wget --no-check-certificate --continue
```
(from https://github.com/Percona-Lab/ontime-airline-performance/blob/master/download.sh )
Creating a table:
``` sql
@ -145,12 +137,14 @@ ORDER BY (Carrier, FlightDate)
SETTINGS index_granularity = 8192;
```
Loading data:
Loading data with multiple threads:
``` bash
$ for i in *.zip; do echo $i; unzip -cq $i '*.csv' | sed 's/\.00//g' | clickhouse-client --input_format_with_names_use_header=0 --host=example-perftest01j --query="INSERT INTO ontime FORMAT CSVWithNames"; done
ls -1 *.zip | xargs -I{} -P $(nproc) bash -c "echo {}; unzip -cq {} '*.csv' | sed 's/\.00//g' | clickhouse-client --input_format_with_names_use_header=0 --query='INSERT INTO ontime FORMAT CSVWithNames'"
```
(if you will have memory shortage or other issues on your server, remove the `-P $(nproc)` part)
## Download of Prepared Partitions {#download-of-prepared-partitions}
``` bash

View File

@ -254,7 +254,6 @@ ENGINE = MergeTree()
PARTITION BY toYYYYMM(EventDate)
ORDER BY (CounterID, EventDate, intHash32(UserID))
SAMPLE BY intHash32(UserID)
SETTINGS index_granularity = 8192
```
``` sql
@ -450,7 +449,6 @@ ENGINE = CollapsingMergeTree(Sign)
PARTITION BY toYYYYMM(StartDate)
ORDER BY (CounterID, StartDate, intHash32(UserID), VisitID)
SAMPLE BY intHash32(UserID)
SETTINGS index_granularity = 8192
```
You can execute those queries using the interactive mode of `clickhouse-client` (just launch it in a terminal without specifying a query in advance) or try some [alternative interface](../interfaces/index.md) if you want.
@ -646,7 +644,7 @@ If there are no replicas at the moment on replicated table creation, a new first
``` sql
CREATE TABLE tutorial.hits_replica (...)
ENGINE = ReplcatedMergeTree(
ENGINE = ReplicatedMergeTree(
'/clickhouse_perftest/tables/{shard}/hits',
'{replica}'
)

View File

@ -31,8 +31,8 @@ The supported formats are:
| [JSONCompactString](#jsoncompactstring) | ✗ | ✔ |
| [JSONEachRow](#jsoneachrow) | ✔ | ✔ |
| [JSONEachRowWithProgress](#jsoneachrowwithprogress) | ✗ | ✔ |
| [JSONStringEachRow](#jsonstringeachrow) | ✔ | ✔ |
| [JSONStringEachRowWithProgress](#jsonstringeachrowwithprogress) | ✗ | ✔ |
| [JSONStringsEachRow](#jsonstringseachrow) | ✔ | ✔ |
| [JSONStringsEachRowWithProgress](#jsonstringseachrowwithprogress) | ✗ | ✔ |
| [JSONCompactEachRow](#jsoncompacteachrow) | ✔ | ✔ |
| [JSONCompactEachRowWithNamesAndTypes](#jsoncompacteachrowwithnamesandtypes) | ✔ | ✔ |
| [JSONCompactStringEachRow](#jsoncompactstringeachrow) | ✔ | ✔ |
@ -612,7 +612,7 @@ Example:
```
## JSONEachRow {#jsoneachrow}
## JSONStringEachRow {#jsonstringeachrow}
## JSONStringsEachRow {#jsonstringseachrow}
## JSONCompactEachRow {#jsoncompacteachrow}
## JSONCompactStringEachRow {#jsoncompactstringeachrow}
@ -627,9 +627,9 @@ When using these formats, ClickHouse outputs rows as separated, newline-delimite
When inserting the data, you should provide a separate JSON value for each row.
## JSONEachRowWithProgress {#jsoneachrowwithprogress}
## JSONStringEachRowWithProgress {#jsonstringeachrowwithprogress}
## JSONStringsEachRowWithProgress {#jsonstringseachrowwithprogress}
Differs from `JSONEachRow`/`JSONStringEachRow` in that ClickHouse will also yield progress information as JSON values.
Differs from `JSONEachRow`/`JSONStringsEachRow` in that ClickHouse will also yield progress information as JSON values.
```json
{"row":{"'hello'":"hello","multiply(42, number)":"0","range(5)":[0,1,2,3,4]}}

View File

@ -148,28 +148,48 @@ $ echo 'DROP TABLE t' | curl 'http://localhost:8123/' --data-binary @-
For successful requests that dont return a data table, an empty response body is returned.
You can use the internal ClickHouse compression format when transmitting data. The compressed data has a non-standard format, and you will need to use the special `clickhouse-compressor` program to work with it (it is installed with the `clickhouse-client` package). To increase the efficiency of data insertion, you can disable server-side checksum verification by using the [http_native_compression_disable_checksumming_on_decompress](../operations/settings/settings.md#settings-http_native_compression_disable_checksumming_on_decompress) setting.
If you specified `compress=1` in the URL, the server compresses the data it sends you.
If you specified `decompress=1` in the URL, the server decompresses the same data that you pass in the `POST` method.
## Compression {#compression}
You can also choose to use [HTTP compression](https://en.wikipedia.org/wiki/HTTP_compression). To send a compressed `POST` request, append the request header `Content-Encoding: compression_method`. In order for ClickHouse to compress the response, you must append `Accept-Encoding: compression_method`. ClickHouse supports `gzip`, `br`, and `deflate` [compression methods](https://en.wikipedia.org/wiki/HTTP_compression#Content-Encoding_tokens). To enable HTTP compression, you must use the ClickHouse [enable_http_compression](../operations/settings/settings.md#settings-enable_http_compression) setting. You can configure the data compression level in the [http_zlib_compression_level](#settings-http_zlib_compression_level) setting for all the compression methods.
You can use compression to reduce network traffic when transmitting a large amount of data or for creating dumps that are immediately compressed.
You can use this to reduce network traffic when transmitting a large amount of data, or for creating dumps that are immediately compressed.
You can use the internal ClickHouse compression format when transmitting data. The compressed data has a non-standard format, and you need `clickhouse-compressor` program to work with it. It is installed with the `clickhouse-client` package. To increase the efficiency of data insertion, you can disable server-side checksum verification by using the [http_native_compression_disable_checksumming_on_decompress](../operations/settings/settings.md#settings-http_native_compression_disable_checksumming_on_decompress) setting.
Examples of sending data with compression:
If you specify `compress=1` in the URL, the server will compress the data it sends to you. If you specify `decompress=1` in the URL, the server will decompress the data which you pass in the `POST` method.
``` bash
#Sending data to the server:
$ curl -vsS "http://localhost:8123/?enable_http_compression=1" -d 'SELECT number FROM system.numbers LIMIT 10' -H 'Accept-Encoding: gzip'
You can also choose to use [HTTP compression](https://en.wikipedia.org/wiki/HTTP_compression). ClickHouse supports the following [compression methods](https://en.wikipedia.org/wiki/HTTP_compression#Content-Encoding_tokens):
#Sending data to the client:
$ echo "SELECT 1" | gzip -c | curl -sS --data-binary @- -H 'Content-Encoding: gzip' 'http://localhost:8123/'
```
- `gzip`
- `br`
- `deflate`
- `xz`
To send a compressed `POST` request, append the request header `Content-Encoding: compression_method`.
In order for ClickHouse to compress the response, enable compression with [enable_http_compression](../operations/settings/settings.md#settings-enable_http_compression) setting and append `Accept-Encoding: compression_method` header to the request. You can configure the data compression level in the [http_zlib_compression_level](../operations/settings/settings.md#settings-http_zlib_compression_level) setting for all compression methods.
!!! note "Note"
Some HTTP clients might decompress data from the server by default (with `gzip` and `deflate`) and you might get decompressed data even if you use the compression settings correctly.
**Examples**
``` bash
# Sending compressed data to the server
$ echo "SELECT 1" | gzip -c | \
curl -sS --data-binary @- -H 'Content-Encoding: gzip' 'http://localhost:8123/'
```
``` bash
# Receiving compressed data from the server
$ curl -vsS "http://localhost:8123/?enable_http_compression=1" \
-H 'Accept-Encoding: gzip' --output result.gz -d 'SELECT number FROM system.numbers LIMIT 3'
$ zcat result.gz
0
1
2
```
## Default Database {#default-database}
You can use the database URL parameter or the X-ClickHouse-Database header to specify the default database.
``` bash

Some files were not shown because too many files have changed in this diff Show More