Merge branch 'master' of github.com:ClickHouse/ClickHouse into docs/CLICKHOUSEDOCS-558-RBAC-introduction

Sergei Shtykov 2020-05-01 16:49:56 +03:00
commit 8e2747d4b0
4218 changed files with 271749 additions and 263908 deletions


@ -1,41 +0,0 @@
stages:
- builder
- build
variables:
GIT_SUBMODULE_STRATEGY: recursive
builder:
stage: builder
when: manual
services:
- docker:dind
script:
- docker info
- apk add --no-cache git curl binutils ca-certificates
- docker login -u gitlab -p nopasswd $CI_REGISTRY
- docker build -t yandex/clickhouse-builder ./docker/builder
- docker tag yandex/clickhouse-builder $CI_REGISTRY/yandex/clickhouse-builder
- docker push $CI_REGISTRY/yandex/clickhouse-builder
tags:
- docker
build:
stage: build
when: manual
services:
- docker:dind
script:
- apk add --no-cache git curl binutils ca-certificates
- git submodule sync --recursive
- git submodule update --init --recursive
- docker info
- docker login -u gitlab -p nopasswd $CI_REGISTRY
- docker pull $CI_REGISTRY/yandex/clickhouse-builder
- docker run --rm --volumes-from "${HOSTNAME}-build" --workdir "${CI_PROJECT_DIR}" --env CI_PROJECT_DIR=${CI_PROJECT_DIR} $CI_REGISTRY/yandex/clickhouse-builder /build_gitlab_ci.sh
# You can upload your binary to nexus
- curl -v --keepalive-time 60 --keepalive --user "$NEXUS_USER:$NEXUS_PASSWORD" -XPUT "http://$NEXUS_HOST/repository/binaries/$CI_PROJECT_NAME" --upload-file ./src/Server/clickhouse
# Or download artifacts from gitlab
artifacts:
paths:
- ./src/Server/clickhouse
expire_in: 1 day
tags:
- docker

.gitmodules vendored

@ -110,16 +110,16 @@
branch = v1.25.0
[submodule "contrib/aws"]
path = contrib/aws
url = https://github.com/aws/aws-sdk-cpp.git
url = https://github.com/ClickHouse-Extras/aws-sdk-cpp.git
[submodule "aws-c-event-stream"]
path = contrib/aws-c-event-stream
url = https://github.com/awslabs/aws-c-event-stream.git
url = https://github.com/ClickHouse-Extras/aws-c-event-stream.git
[submodule "aws-c-common"]
path = contrib/aws-c-common
url = https://github.com/awslabs/aws-c-common.git
url = https://github.com/ClickHouse-Extras/aws-c-common.git
[submodule "aws-checksums"]
path = contrib/aws-checksums
url = https://github.com/awslabs/aws-checksums.git
url = https://github.com/ClickHouse-Extras/aws-checksums.git
[submodule "contrib/curl"]
path = contrib/curl
url = https://github.com/curl/curl.git


@ -1,11 +1,68 @@
## ClickHouse release v20.3
### ClickHouse release v20.3.7.46, 2020-04-17
#### Bug Fix
* Fix `Logical error: CROSS JOIN has expressions` error for queries with comma and names joins mix. [#10311](https://github.com/ClickHouse/ClickHouse/pull/10311) ([Artem Zuikov](https://github.com/4ertus2)).
* Fix queries with `max_bytes_before_external_group_by`. [#10302](https://github.com/ClickHouse/ClickHouse/pull/10302) ([Artem Zuikov](https://github.com/4ertus2)).
* Fix move-to-prewhere optimization in presence of `arrayJoin` functions (in certain cases). This fixes [#10092](https://github.com/ClickHouse/ClickHouse/issues/10092). [#10195](https://github.com/ClickHouse/ClickHouse/pull/10195) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Add the ability to relax the restriction on the use of non-deterministic functions in mutations via the `allow_nondeterministic_mutations` setting; a usage sketch is shown below. [#10186](https://github.com/ClickHouse/ClickHouse/pull/10186) ([filimonov](https://github.com/filimonov)).
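A minimal usage sketch for this setting; the table and dictionary names below are hypothetical, and the exact error text produced without the setting may differ between versions:

```sql
-- By default, a mutation that calls a non-deterministic function such as
-- dictGet() or now() is rejected on Replicated* tables.
SET allow_nondeterministic_mutations = 1;

-- With the restriction relaxed, the mutation is accepted
-- (hypothetical table and dictionary).
ALTER TABLE events
    UPDATE region = dictGet('geo_dict', 'region', toUInt64(user_id))
    WHERE region = '';
```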
### ClickHouse release v20.3.6.40, 2020-04-16
#### New Feature
* Added function `isConstant`. This function checks whether its argument is a constant expression and returns 1 or 0; see the short example below. It is intended for development, debugging and demonstration purposes. [#10198](https://github.com/ClickHouse/ClickHouse/pull/10198) ([alexey-milovidov](https://github.com/alexey-milovidov)).
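A quick illustration of the new function; the expected results are shown as comments and assume a server that includes this change:

```sql
SELECT isConstant(1 + 2);                               -- 1: the argument is a constant expression
SELECT isConstant(number) FROM system.numbers LIMIT 1;  -- 0: the argument depends on a column
```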
#### Bug Fix
* Fix error `Pipeline stuck` with `max_rows_to_group_by` and `group_by_overflow_mode = 'break'`. [#10279](https://github.com/ClickHouse/ClickHouse/pull/10279) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix rare possible exception `Cannot drain connections: cancel first`. [#10239](https://github.com/ClickHouse/ClickHouse/pull/10239) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fixed bug where ClickHouse would throw "Unknown function lambda." error message when user tries to run ALTER UPDATE/DELETE on tables with ENGINE = Replicated*. Check for nondeterministic functions now handles lambda expressions correctly. [#10237](https://github.com/ClickHouse/ClickHouse/pull/10237) ([Alexander Kazakov](https://github.com/Akazz)).
* Fixed "generateRandom" function for Date type. This fixes [#9973](https://github.com/ClickHouse/ClickHouse/issues/9973). Fix an edge case when dates with year 2106 are inserted to MergeTree tables with old-style partitioning but partitions are named with year 1970. [#10218](https://github.com/ClickHouse/ClickHouse/pull/10218) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Convert types if the table definition of a View does not correspond to the SELECT query. This fixes [#10180](https://github.com/ClickHouse/ClickHouse/issues/10180) and [#10022](https://github.com/ClickHouse/ClickHouse/issues/10022). [#10217](https://github.com/ClickHouse/ClickHouse/pull/10217) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fix `parseDateTimeBestEffort` for strings in RFC-2822 when day of week is Tuesday or Thursday. This fixes [#10082](https://github.com/ClickHouse/ClickHouse/issues/10082). [#10214](https://github.com/ClickHouse/ClickHouse/pull/10214) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fix column names of constants inside JOIN that may clash with names of constants outside of JOIN. [#10207](https://github.com/ClickHouse/ClickHouse/pull/10207) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fix possible infinite query execution when the query actually should stop on LIMIT while reading from an infinite source like `system.numbers` or `system.zeros`. [#10206](https://github.com/ClickHouse/ClickHouse/pull/10206) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix using the current database for access checking when the database isn't specified. [#10192](https://github.com/ClickHouse/ClickHouse/pull/10192) ([Vitaly Baranov](https://github.com/vitlibar)).
* Convert blocks if structure does not match on INSERT into Distributed(). [#10135](https://github.com/ClickHouse/ClickHouse/pull/10135) ([Azat Khuzhin](https://github.com/azat)).
* Fix possible incorrect result for extremes in processors pipeline. [#10131](https://github.com/ClickHouse/ClickHouse/pull/10131) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix some kinds of alters with compact parts. [#10130](https://github.com/ClickHouse/ClickHouse/pull/10130) ([Anton Popov](https://github.com/CurtizJ)).
* Fix incorrect `index_granularity_bytes` check while creating new replica. Fixes [#10098](https://github.com/ClickHouse/ClickHouse/issues/10098). [#10121](https://github.com/ClickHouse/ClickHouse/pull/10121) ([alesapin](https://github.com/alesapin)).
* Fix SIGSEGV on INSERT into Distributed table when its structure differs from the underlying tables. [#10105](https://github.com/ClickHouse/ClickHouse/pull/10105) ([Azat Khuzhin](https://github.com/azat)).
* Fix possible rows loss for queries with `JOIN` and `UNION ALL`. Fixes [#9826](https://github.com/ClickHouse/ClickHouse/issues/9826), [#10113](https://github.com/ClickHouse/ClickHouse/issues/10113). [#10099](https://github.com/ClickHouse/ClickHouse/pull/10099) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fixed replicated tables startup when updating from an old ClickHouse version where `/table/replicas/replica_name/metadata` node doesn't exist. Fixes [#10037](https://github.com/ClickHouse/ClickHouse/issues/10037). [#10095](https://github.com/ClickHouse/ClickHouse/pull/10095) ([alesapin](https://github.com/alesapin)).
* Add some arguments check and support identifier arguments for MySQL Database Engine. [#10077](https://github.com/ClickHouse/ClickHouse/pull/10077) ([Winter Zhang](https://github.com/zhang2014)).
* Fix a bug in the ClickHouse dictionary source reading from a localhost ClickHouse server. The bug may lead to memory corruption if the types in the dictionary and in the source are not compatible. [#10071](https://github.com/ClickHouse/ClickHouse/pull/10071) ([alesapin](https://github.com/alesapin)).
* Fix bug in `CHECK TABLE` query when table contain skip indices. [#10068](https://github.com/ClickHouse/ClickHouse/pull/10068) ([alesapin](https://github.com/alesapin)).
* Fix error `Cannot clone block with columns because block has 0 columns ... While executing GroupingAggregatedTransform`. It happened when setting `distributed_aggregation_memory_efficient` was enabled, and distributed query read aggregating data with different level from different shards (mixed single and two level aggregation). [#10063](https://github.com/ClickHouse/ClickHouse/pull/10063) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix a segmentation fault that could occur in GROUP BY over string keys containing trailing zero bytes ([#8636](https://github.com/ClickHouse/ClickHouse/issues/8636), [#8925](https://github.com/ClickHouse/ClickHouse/issues/8925)). [#10025](https://github.com/ClickHouse/ClickHouse/pull/10025) ([Alexander Kuzmenkov](https://github.com/akuzm)).
* Fix parallel distributed INSERT SELECT for remote table. This PR fixes the solution provided in [#9759](https://github.com/ClickHouse/ClickHouse/pull/9759). [#9999](https://github.com/ClickHouse/ClickHouse/pull/9999) ([Vitaly Baranov](https://github.com/vitlibar)).
* Fix the number of threads used for remote query execution (performance regression, since 20.3). This happened when query from `Distributed` table was executed simultaneously on local and remote shards. Fixes [#9965](https://github.com/ClickHouse/ClickHouse/issues/9965). [#9971](https://github.com/ClickHouse/ClickHouse/pull/9971) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix bug in which the necessary tables weren't retrieved at one of the processing stages of queries to some databases. Fixes [#9699](https://github.com/ClickHouse/ClickHouse/issues/9699). [#9949](https://github.com/ClickHouse/ClickHouse/pull/9949) ([achulkov2](https://github.com/achulkov2)).
* Fix 'Not found column in block' error when `JOIN` appears with `TOTALS`. Fixes [#9839](https://github.com/ClickHouse/ClickHouse/issues/9839). [#9939](https://github.com/ClickHouse/ClickHouse/pull/9939) ([Artem Zuikov](https://github.com/4ertus2)).
* Fix a bug with `ON CLUSTER` DDL queries freezing on server startup. [#9927](https://github.com/ClickHouse/ClickHouse/pull/9927) ([Gagan Arneja](https://github.com/garneja)).
* Fix parsing multiple hosts set in the CREATE USER command, e.g. `CREATE USER user6 HOST NAME REGEXP 'lo.?*host', NAME REGEXP 'lo*host'`. [#9924](https://github.com/ClickHouse/ClickHouse/pull/9924) ([Vitaly Baranov](https://github.com/vitlibar)).
* Fix `TRUNCATE` for Join table engine ([#9917](https://github.com/ClickHouse/ClickHouse/issues/9917)). [#9920](https://github.com/ClickHouse/ClickHouse/pull/9920) ([Amos Bird](https://github.com/amosbird)).
* Fix "scalar doesn't exist" error in ALTERs ([#9878](https://github.com/ClickHouse/ClickHouse/issues/9878)). [#9904](https://github.com/ClickHouse/ClickHouse/pull/9904) ([Amos Bird](https://github.com/amosbird)).
* Fix race condition between drop and optimize in `ReplicatedMergeTree`. [#9901](https://github.com/ClickHouse/ClickHouse/pull/9901) ([alesapin](https://github.com/alesapin)).
* Fix error with qualified names in `distributed_product_mode='local'`. Fixes [#4756](https://github.com/ClickHouse/ClickHouse/issues/4756). [#9891](https://github.com/ClickHouse/ClickHouse/pull/9891) ([Artem Zuikov](https://github.com/4ertus2)).
* Fix calculating grants for introspection functions from the setting 'allow_introspection_functions'. [#9840](https://github.com/ClickHouse/ClickHouse/pull/9840) ([Vitaly Baranov](https://github.com/vitlibar)).
#### Build/Testing/Packaging Improvement
* Fix integration test `test_settings_constraints`. [#9962](https://github.com/ClickHouse/ClickHouse/pull/9962) ([Vitaly Baranov](https://github.com/vitlibar)).
* Removed dependency on `clock_getres`. [#9833](https://github.com/ClickHouse/ClickHouse/pull/9833) ([alexey-milovidov](https://github.com/alexey-milovidov)).
### ClickHouse release v20.3.5.21, 2020-03-27
#### Bug Fix
* Fix 'Different expressions with the same alias' error when query has PREWHERE and WHERE on distributed table and `SET distributed_product_mode = 'local'`. [#9871](https://github.com/ClickHouse/ClickHouse/pull/9871) ([Artem Zuikov](https://github.com/4ertus2)).
* Fix mutations excessive memory consumption for tables with a composite primary key. This fixes [#9850](https://github.com/ClickHouse/ClickHouse/issues/9850). [#9860](https://github.com/ClickHouse/ClickHouse/pull/9860) ([alesapin](https://github.com/alesapin)).
* For INSERT queries, the shard now clamps the settings received from the initiator to the shard's constraints instead of throwing an exception. This fix makes it possible to send INSERT queries to a shard with different constraints. This change improves fix [#9447](https://github.com/ClickHouse/ClickHouse/issues/9447). [#9852](https://github.com/ClickHouse/ClickHouse/pull/9852) ([Vitaly Baranov](https://github.com/vitlibar)).
* Fix 'COMMA to CROSS JOIN rewriter is not enabled or cannot rewrite query' error in case of subqueries with COMMA JOIN out of tables lists (i.e. in WHERE). Fixes [#9782](https://github.com/ClickHouse/ClickHouse/issues/9782). [#9830](https://github.com/ClickHouse/ClickHouse/pull/9830) ([Artem Zuikov](https://github.com/4ertus2)).
* Fix possible exception `Got 0 in totals chunk, expected 1` on client. It happened for queries with `JOIN` in case if right joined table had zero rows. Example: `select * from system.one t1 join system.one t2 on t1.dummy = t2.dummy limit 0 FORMAT TabSeparated;`. Fixes [#9777](https://github.com/ClickHouse/ClickHouse/issues/9777). [#9823](https://github.com/ClickHouse/ClickHouse/pull/9823) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix SIGSEGV with optimize_skip_unused_shards when type cannot be converted. [#9804](https://github.com/ClickHouse/ClickHouse/pull/9804) ([Azat Khuzhin](https://github.com/azat)).
@ -273,6 +330,55 @@
## ClickHouse release v20.1
### ClickHouse release v20.1.10.70, 2020-04-17
#### Bug Fix
* Fix rare possible exception `Cannot drain connections: cancel first`. [#10239](https://github.com/ClickHouse/ClickHouse/pull/10239) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fixed bug where ClickHouse would throw `'Unknown function lambda.'` error message when user tries to run `ALTER UPDATE/DELETE` on tables with `ENGINE = Replicated*`. Check for nondeterministic functions now handles lambda expressions correctly. [#10237](https://github.com/ClickHouse/ClickHouse/pull/10237) ([Alexander Kazakov](https://github.com/Akazz)).
* Fix `parseDateTimeBestEffort` for strings in RFC-2822 when day of week is Tuesday or Thursday. This fixes [#10082](https://github.com/ClickHouse/ClickHouse/issues/10082). [#10214](https://github.com/ClickHouse/ClickHouse/pull/10214) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fix column names of constants inside `JOIN` that may clash with names of constants outside of `JOIN`. [#10207](https://github.com/ClickHouse/ClickHouse/pull/10207) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fix possible infinite query execution when the query actually should stop on LIMIT while reading from an infinite source like `system.numbers` or `system.zeros`. [#10206](https://github.com/ClickHouse/ClickHouse/pull/10206) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix move-to-prewhere optimization in presence of `arrayJoin` functions (in certain cases). This fixes [#10092](https://github.com/ClickHouse/ClickHouse/issues/10092). [#10195](https://github.com/ClickHouse/ClickHouse/pull/10195) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Add the ability to relax the restriction on the use of non-deterministic functions in mutations via the `allow_nondeterministic_mutations` setting. [#10186](https://github.com/ClickHouse/ClickHouse/pull/10186) ([filimonov](https://github.com/filimonov)).
* Convert blocks if structure does not match on `INSERT` into table with `Distributed` engine. [#10135](https://github.com/ClickHouse/ClickHouse/pull/10135) ([Azat Khuzhin](https://github.com/azat)).
* Fix `SIGSEGV` on `INSERT` into `Distributed` table when its structure differs from the underlying tables. [#10105](https://github.com/ClickHouse/ClickHouse/pull/10105) ([Azat Khuzhin](https://github.com/azat)).
* Fix possible rows loss for queries with `JOIN` and `UNION ALL`. Fixes [#9826](https://github.com/ClickHouse/ClickHouse/issues/9826), [#10113](https://github.com/ClickHouse/ClickHouse/issues/10113). [#10099](https://github.com/ClickHouse/ClickHouse/pull/10099) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Add arguments check and support identifier arguments for MySQL Database Engine. [#10077](https://github.com/ClickHouse/ClickHouse/pull/10077) ([Winter Zhang](https://github.com/zhang2014)).
* Fix a bug in the ClickHouse dictionary source reading from a localhost ClickHouse server. The bug may lead to memory corruption if the types in the dictionary and in the source are not compatible. [#10071](https://github.com/ClickHouse/ClickHouse/pull/10071) ([alesapin](https://github.com/alesapin)).
* Fix error `Cannot clone block with columns because block has 0 columns ... While executing GroupingAggregatedTransform`. It happened when setting `distributed_aggregation_memory_efficient` was enabled, and distributed query read aggregating data with different level from different shards (mixed single and two level aggregation). [#10063](https://github.com/ClickHouse/ClickHouse/pull/10063) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix a segmentation fault that could occur in `GROUP BY` over string keys containing trailing zero bytes ([#8636](https://github.com/ClickHouse/ClickHouse/issues/8636), [#8925](https://github.com/ClickHouse/ClickHouse/issues/8925)). [#10025](https://github.com/ClickHouse/ClickHouse/pull/10025) ([Alexander Kuzmenkov](https://github.com/akuzm)).
* Fix bug in which the necessary tables weren't retrieved at one of the processing stages of queries to some databases. Fixes [#9699](https://github.com/ClickHouse/ClickHouse/issues/9699). [#9949](https://github.com/ClickHouse/ClickHouse/pull/9949) ([achulkov2](https://github.com/achulkov2)).
* Fix `'Not found column in block'` error when `JOIN` appears with `TOTALS`. Fixes [#9839](https://github.com/ClickHouse/ClickHouse/issues/9839). [#9939](https://github.com/ClickHouse/ClickHouse/pull/9939) ([Artem Zuikov](https://github.com/4ertus2)).
* Fix a bug with `ON CLUSTER` DDL queries freezing on server startup. [#9927](https://github.com/ClickHouse/ClickHouse/pull/9927) ([Gagan Arneja](https://github.com/garneja)).
* Fix `TRUNCATE` for Join table engine ([#9917](https://github.com/ClickHouse/ClickHouse/issues/9917)). [#9920](https://github.com/ClickHouse/ClickHouse/pull/9920) ([Amos Bird](https://github.com/amosbird)).
* Fix `'scalar doesn't exist'` error in ALTER queries ([#9878](https://github.com/ClickHouse/ClickHouse/issues/9878)). [#9904](https://github.com/ClickHouse/ClickHouse/pull/9904) ([Amos Bird](https://github.com/amosbird)).
* Fix race condition between drop and optimize in `ReplicatedMergeTree`. [#9901](https://github.com/ClickHouse/ClickHouse/pull/9901) ([alesapin](https://github.com/alesapin)).
* Fixed `DeleteOnDestroy` logic in `ATTACH PART` which could lead to automatic removal of attached part and added few tests. [#9410](https://github.com/ClickHouse/ClickHouse/pull/9410) ([Vladimir Chebotarev](https://github.com/excitoon)).
#### Build/Testing/Packaging Improvement
* Fix unit test `collapsing_sorted_stream`. [#9367](https://github.com/ClickHouse/ClickHouse/pull/9367) ([Deleted user](https://github.com/ghost)).
### ClickHouse release v20.1.9.54, 2020-03-28
#### Bug Fix
* Fix `'Different expressions with the same alias'` error when query has `PREWHERE` and `WHERE` on distributed table and `SET distributed_product_mode = 'local'`. [#9871](https://github.com/ClickHouse/ClickHouse/pull/9871) ([Artem Zuikov](https://github.com/4ertus2)).
* Fix mutations excessive memory consumption for tables with a composite primary key. This fixes [#9850](https://github.com/ClickHouse/ClickHouse/issues/9850). [#9860](https://github.com/ClickHouse/ClickHouse/pull/9860) ([alesapin](https://github.com/alesapin)).
* For `INSERT` queries, the shard now clamps the settings received from the initiator to the shard's constraints instead of throwing an exception. This fix makes it possible to send `INSERT` queries to a shard with different constraints. This change improves fix [#9447](https://github.com/ClickHouse/ClickHouse/issues/9447). [#9852](https://github.com/ClickHouse/ClickHouse/pull/9852) ([Vitaly Baranov](https://github.com/vitlibar)).
* Fix possible exception `Got 0 in totals chunk, expected 1` on client. It happened for queries with `JOIN` in case if right joined table had zero rows. Example: `select * from system.one t1 join system.one t2 on t1.dummy = t2.dummy limit 0 FORMAT TabSeparated;`. Fixes [#9777](https://github.com/ClickHouse/ClickHouse/issues/9777). [#9823](https://github.com/ClickHouse/ClickHouse/pull/9823) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix `SIGSEGV` with `optimize_skip_unused_shards` when type cannot be converted. [#9804](https://github.com/ClickHouse/ClickHouse/pull/9804) ([Azat Khuzhin](https://github.com/azat)).
* Fixed a few cases when timezone of the function argument wasn't used properly. [#9574](https://github.com/ClickHouse/ClickHouse/pull/9574) ([Vasily Nemkov](https://github.com/Enmk)).
#### Improvement
* Remove the `ORDER BY` stage from mutations because we read from a single ordered part in a single thread. Also add a check that rows in a mutation are ordered by the sorting key and that this order is not violated. [#9886](https://github.com/ClickHouse/ClickHouse/pull/9886) ([alesapin](https://github.com/alesapin)).
#### Build/Testing/Packaging Improvement
* Clean up duplicated linker flags. Make sure the linker won't look up an unexpected symbol. [#9433](https://github.com/ClickHouse/ClickHouse/pull/9433) ([Amos Bird](https://github.com/amosbird)).
### ClickHouse release v20.1.8.41, 2020-03-20
#### Bug Fix


@ -84,9 +84,10 @@ option (ENABLE_FUZZING "Enables fuzzing instrumentation" OFF)
if (ENABLE_FUZZING)
message (STATUS "Fuzzing instrumentation enabled")
set (WITH_COVERAGE ON)
set (SANITIZE "libfuzzer")
set (FUZZER "libfuzzer")
endif()
include (cmake/fuzzer.cmake)
include (cmake/sanitize.cmake)
if (CMAKE_GENERATOR STREQUAL "Ninja" AND NOT DISABLE_COLORED_BUILD)
@ -213,10 +214,14 @@ if (COMPILER_CLANG)
# TODO investigate that
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -fno-omit-frame-pointer")
set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -fno-omit-frame-pointer")
if (OS_DARWIN)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -stdlib=libc++ -Wl,-U,_inside_main")
set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -Wl,-U,_inside_main")
endif()
# Display absolute paths in error messages. Otherwise KDevelop fails to navigate to correct file and opens a new file instead.
set(COMPILER_FLAGS "${COMPILER_FLAGS} -fdiagnostics-absolute-paths")
endif ()
option (ENABLE_LIBRARIES "Enable all libraries (Global default switch)" ON)


@ -11,10 +11,9 @@ ClickHouse is an open-source column-oriented database management system that all
* [Slack](https://join.slack.com/t/clickhousedb/shared_invite/zt-d2zxkf9e-XyxDa_ucfPxzuH4SJIm~Ng) and [Telegram](https://telegram.me/clickhouse_en) allow chatting with ClickHouse users in real time.
* [Blog](https://clickhouse.yandex/blog/en/) contains various ClickHouse-related articles, as well as announcements and reports about events.
* [Contacts](https://clickhouse.tech/#contacts) can help to get your questions answered if there are any.
* You can also [fill this form](https://forms.yandex.com/surveys/meet-yandex-clickhouse-team/) to meet Yandex ClickHouse team in person.
* You can also [fill this form](https://clickhouse.tech/#meet) to meet Yandex ClickHouse team in person.
## Upcoming Events
* [ClickHouse Monitoring Round Table (online in English)](https://www.eventbrite.com/e/clickhouse-april-virtual-meetup-tickets-102272923066) on April 15, 2020.
* [ClickHouse Workshop in Novosibirsk](https://2020.codefest.ru/lecture/1628) on TBD date.
* [Yandex C++ Open-Source Sprints in Moscow](https://events.yandex.ru/events/otkrytyj-kod-v-yandek-28-03-2020) on TBD date.


@ -3,8 +3,10 @@ if (USE_CLANG_TIDY)
endif ()
add_subdirectory (common)
add_subdirectory (loggers)
add_subdirectory (daemon)
add_subdirectory (loggers)
add_subdirectory (pcg-random)
add_subdirectory (widechar_width)
if (USE_MYSQL)
add_subdirectory (mysqlxx)


@ -39,6 +39,10 @@ else ()
target_compile_definitions(common PUBLIC WITH_COVERAGE=0)
endif ()
if (USE_INTERNAL_CCTZ)
set_source_files_properties(DateLUTImpl.cpp PROPERTIES COMPILE_DEFINITIONS USE_INTERNAL_CCTZ)
endif()
target_include_directories(common PUBLIC .. ${CMAKE_CURRENT_BINARY_DIR}/..)
if (NOT USE_INTERNAL_BOOST_LIBRARY)


@ -2,16 +2,19 @@
#include <cctz/civil_time.h>
#include <cctz/time_zone.h>
#include <cctz/zone_info_source.h>
#include <common/unaligned.h>
#include <Poco/Exception.h>
#include <dlfcn.h>
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstring>
#include <iostream>
#include <memory>
#define DATE_LUT_MIN 0
namespace
{
@ -47,7 +50,7 @@ DateLUTImpl::DateLUTImpl(const std::string & time_zone_)
assert(inside_main);
size_t i = 0;
time_t start_of_day = DATE_LUT_MIN;
time_t start_of_day = 0;
cctz::time_zone cctz_time_zone;
if (!cctz::load_time_zone(time_zone, &cctz_time_zone))
@ -157,3 +160,74 @@ DateLUTImpl::DateLUTImpl(const std::string & time_zone_)
years_months_lut[year_months_lut_index] = first_day_of_last_month;
}
}
#if !defined(ARCADIA_BUILD) /// Arcadia's variant of CCTZ already has the same implementation.
/// Prefer to load timezones from blobs linked to the binary.
/// The blobs are provided by "tzdata" library.
/// This allows to avoid dependency on system tzdata.
namespace cctz_extension
{
namespace
{
class Source : public cctz::ZoneInfoSource
{
public:
Source(const char * data_, size_t size_) : data(data_), size(size_) {}
size_t Read(void * buf, size_t bytes) override
{
if (bytes > size)
bytes = size;
memcpy(buf, data, bytes);
data += bytes;
size -= bytes;
return bytes;
}
int Skip(size_t offset) override
{
if (offset <= size)
{
data += offset;
size -= offset;
return 0;
}
else
{
errno = EINVAL;
return -1;
}
}
private:
const char * data;
size_t size;
};
std::unique_ptr<cctz::ZoneInfoSource> custom_factory(
const std::string & name,
const std::function<std::unique_ptr<cctz::ZoneInfoSource>(const std::string & name)> & fallback)
{
std::string name_replaced = name;
std::replace(name_replaced.begin(), name_replaced.end(), '/', '_');
std::replace(name_replaced.begin(), name_replaced.end(), '-', '_');
/// These are the names that are generated by "ld -r -b binary"
std::string symbol_name_data = "_binary_" + name_replaced + "_start";
std::string symbol_name_size = "_binary_" + name_replaced + "_size";
const void * sym_data = dlsym(RTLD_DEFAULT, symbol_name_data.c_str());
const void * sym_size = dlsym(RTLD_DEFAULT, symbol_name_size.c_str());
if (sym_data && sym_size)
return std::make_unique<Source>(static_cast<const char *>(sym_data), unalignedLoad<size_t>(&sym_size));
return fallback(name);
}
}
ZoneInfoSourceFactory zone_info_source_factory = custom_factory;
}
#endif


@ -99,7 +99,7 @@ private:
return guess;
/// Time zones that have offset 0 from UTC do daylight saving time change (if any) towards increasing UTC offset (example: British Standard Time).
if (offset_at_start_of_epoch >= 0)
if (t >= lut[DayNum(guess + 1)].date)
return DayNum(guess + 1);
return DayNum(guess - 1);
@ -287,8 +287,8 @@ public:
if (offset_is_whole_number_of_hours_everytime)
return (t / 60) % 60;
time_t date = find(t).date;
return (t - date) / 60 % 60;
UInt32 date = find(t).date;
return (UInt32(t) - date) / 60 % 60;
}
inline time_t toStartOfMinute(time_t t) const { return t / 60 * 60; }
@ -301,9 +301,8 @@ public:
if (offset_is_whole_number_of_hours_everytime)
return t / 3600 * 3600;
time_t date = find(t).date;
/// Still can return wrong values for time at 1970-01-01 if the UTC offset was non-whole number of hours.
return date + (t - date) / 3600 * 3600;
UInt32 date = find(t).date;
return date + (UInt32(t) - date) / 3600 * 3600;
}
/** Number of calendar day since the beginning of UNIX epoch (1970-01-01 is zero)
@ -579,7 +578,7 @@ public:
return t / 3600;
/// Assume that if offset was fractional, then the fraction is the same as at the beginning of epoch.
/// NOTE This assumption is false for "Pacific/Pitcairn" time zone.
/// NOTE This assumption is false for "Pacific/Pitcairn" and "Pacific/Kiritimati" time zones.
return (t + 86400 - offset_at_start_of_epoch) / 3600;
}


@ -127,7 +127,7 @@ LineReader::InputStatus LineReader::readOneLine(const String & prompt)
#ifdef OS_LINUX
if (!readline_ptr)
{
for (auto name : {"libreadline.so", "libreadline.so.0", "libeditline.so", "libeditline.so.0"})
for (const auto * name : {"libreadline.so", "libreadline.so.0", "libeditline.so", "libeditline.so.0"})
{
void * dl_handle = dlopen(name, RTLD_LAZY);
if (dl_handle)


@ -37,13 +37,13 @@ ReplxxLineReader::ReplxxLineReader(const Suggest & suggest, const String & histo
/// By default, C-p/C-n are bound to COMPLETE_NEXT/COMPLETE_PREV;
/// bind C-p/C-n to history-previous/history-next as in readline.
rx.bind_key(Replxx::KEY::control('N'), std::bind(&Replxx::invoke, &rx, Replxx::ACTION::HISTORY_NEXT, _1));
rx.bind_key(Replxx::KEY::control('P'), std::bind(&Replxx::invoke, &rx, Replxx::ACTION::HISTORY_PREVIOUS, _1));
rx.bind_key(Replxx::KEY::control('N'), [this](char32_t code) { return rx.invoke(Replxx::ACTION::HISTORY_NEXT, code); });
rx.bind_key(Replxx::KEY::control('P'), [this](char32_t code) { return rx.invoke(Replxx::ACTION::HISTORY_PREVIOUS, code); });
/// By default COMPLETE_NEXT/COMPLETE_PREV were bound to C-p/C-n; re-bind them
/// to M-P/M-N (previously used for HISTORY_COMMON_PREFIX_SEARCH, which is
/// also bound to M-p/M-n).
rx.bind_key(Replxx::KEY::meta('N'), std::bind(&Replxx::invoke, &rx, Replxx::ACTION::COMPLETE_NEXT, _1));
rx.bind_key(Replxx::KEY::meta('P'), std::bind(&Replxx::invoke, &rx, Replxx::ACTION::COMPLETE_PREVIOUS, _1));
rx.bind_key(Replxx::KEY::meta('N'), [this](char32_t code) { return rx.invoke(Replxx::ACTION::COMPLETE_NEXT, code); });
rx.bind_key(Replxx::KEY::meta('P'), [this](char32_t code) { return rx.invoke(Replxx::ACTION::COMPLETE_PREVIOUS, code); });
}
ReplxxLineReader::~ReplxxLineReader()


@ -11,7 +11,7 @@ void argsToConfig(const Poco::Util::Application::ArgVec & argv, Poco::Util::Laye
/// Test: -- --1=1 --1=2 --3 5 7 8 -9 10 -11=12 14= 15== --16==17 --=18 --19= --20 21 22 --23 --24 25 --26 -27 28 ---29=30 -- ----31 32 --33 3-4
Poco::AutoPtr<Poco::Util::MapConfiguration> map_config = new Poco::Util::MapConfiguration;
std::string key;
for (auto & arg : argv)
for (const auto & arg : argv)
{
auto key_start = arg.find_first_not_of('-');
auto pos_minus = arg.find('-');


@ -70,7 +70,7 @@ extern "C"
#endif
int dl_iterate_phdr(int (*callback) (dl_phdr_info * info, size_t size, void * data), void * data)
{
auto current_phdr_cache = phdr_cache.load();
auto * current_phdr_cache = phdr_cache.load();
if (!current_phdr_cache)
{
// Cache is not yet populated, pass through to the original function.


@ -1,15 +1,9 @@
#pragma once
#include <boost/operators.hpp>
#include <type_traits>
/** https://svn.boost.org/trac/boost/ticket/5182
*/
template <class T, class Tag>
struct StrongTypedef
: boost::totally_ordered1< StrongTypedef<T, Tag>
, boost::totally_ordered2< StrongTypedef<T, Tag>, T> >
{
private:
using Self = StrongTypedef;


@ -11,6 +11,10 @@ using Int16 = int16_t;
using Int32 = int32_t;
using Int64 = int64_t;
#if __cplusplus <= 201703L
using char8_t = unsigned char;
#endif
using UInt8 = char8_t;
using UInt16 = uint16_t;
using UInt32 = uint32_t;


@ -1,12 +1,47 @@
LIBRARY()
ADDINCL(
GLOBAL clickhouse/base
contrib/libs/cctz/include
)
CFLAGS (GLOBAL -DARCADIA_BUILD)
IF (OS_DARWIN)
CFLAGS (GLOBAL -DOS_DARWIN)
ELSEIF (OS_FREEBSD)
CFLAGS (GLOBAL -DOS_FREEBSD)
ELSEIF (OS_LINUX)
CFLAGS (GLOBAL -DOS_LINUX)
ENDIF ()
PEERDIR(
contrib/libs/cctz/src
contrib/libs/cxxsupp/libcxx-filesystem
contrib/libs/poco/Net
contrib/libs/poco/Util
contrib/restricted/boost
contrib/restricted/cityhash-1.0.2
)
SRCS(
argsToConfig.cpp
coverage.cpp
DateLUT.cpp
DateLUTImpl.cpp
demangle.cpp
getFQDNOrHostName.cpp
getMemoryAmount.cpp
getThreadId.cpp
JSON.cpp
LineReader.cpp
mremap.cpp
phdr_cache.cpp
preciseExp10.c
setTerminalEcho.cpp
shift10.cpp
sleep.cpp
terminalColors.cpp
)
END()


@ -50,11 +50,13 @@
#include <Common/getMultipleKeysFromConfig.h>
#include <Common/ClickHouseRevision.h>
#include <Common/Config/ConfigProcessor.h>
#include <Common/config_version.h>
#ifdef __APPLE__
// ucontext is not available without _XOPEN_SOURCE
#define _XOPEN_SOURCE 700
#if !defined(ARCADIA_BUILD)
# include <Common/config_version.h>
#endif
#if defined(OS_DARWIN)
# define _XOPEN_SOURCE 700 // ucontext is not available without _XOPEN_SOURCE
#endif
#include <ucontext.h>
@ -231,7 +233,6 @@ private:
Logger * log;
BaseDaemon & daemon;
private:
void onTerminate(const std::string & message, UInt32 thread_num) const
{
LOG_FATAL(log, "(version " << VERSION_STRING << VERSION_OFFICIAL << ") (from thread " << thread_num << ") " << message);
@ -410,7 +411,7 @@ std::string BaseDaemon::getDefaultCorePath() const
void BaseDaemon::closeFDs()
{
#if defined(__FreeBSD__) || (defined(__APPLE__) && defined(__MACH__))
#if defined(OS_FREEBSD) || defined(OS_DARWIN)
Poco::File proc_path{"/dev/fd"};
#else
Poco::File proc_path{"/proc/self/fd"};
@ -430,7 +431,7 @@ void BaseDaemon::closeFDs()
else
{
int max_fd = -1;
#ifdef _SC_OPEN_MAX
#if defined(_SC_OPEN_MAX)
max_fd = sysconf(_SC_OPEN_MAX);
if (max_fd == -1)
#endif
@ -448,7 +449,7 @@ namespace
/// the maximum is 1000, and chromium uses 300 for its tab processes. Ignore
/// whatever errors that occur, because it's just a debugging aid and we don't
/// care if it breaks.
#if defined(__linux__) && !defined(NDEBUG)
#if defined(OS_LINUX) && !defined(NDEBUG)
void debugIncreaseOOMScore()
{
const std::string new_score = "555";

base/daemon/ya.make Normal file

@ -0,0 +1,14 @@
LIBRARY()
NO_COMPILER_WARNINGS()
PEERDIR(
clickhouse/src/Common
)
SRCS(
BaseDaemon.cpp
GraphiteWriter.cpp
)
END()


@ -166,12 +166,29 @@ void Loggers::buildLoggers(Poco::Util::AbstractConfiguration & config, Poco::Log
logger.root().setChannel(logger.getChannel());
// Explicitly specified log levels for specific loggers.
Poco::Util::AbstractConfiguration::Keys levels;
config.keys("logger.levels", levels);
{
Poco::Util::AbstractConfiguration::Keys loggers_level;
config.keys("logger.levels", loggers_level);
if (!levels.empty())
for (const auto & level : levels)
logger.root().get(level).setLevel(config.getString("logger.levels." + level, "trace"));
if (!loggers_level.empty())
{
for (const auto & key : loggers_level)
{
if (key == "logger" || key.starts_with("logger["))
{
const std::string name(config.getString("logger.levels." + key + ".name"));
const std::string level(config.getString("logger.levels." + key + ".level"));
logger.root().get(name).setLevel(level);
}
else
{
// Legacy syntax
const std::string level(config.getString("logger.levels." + key, "trace"));
logger.root().get(key).setLevel(level);
}
}
}
}
}
void Loggers::closeLogs(Poco::Logger & logger)


@ -75,7 +75,11 @@ void OwnPatternFormatter::formatExtended(const DB::ExtendedLogMessage & msg_ext,
if (color)
writeCString(resetColor(), wb);
writeCString("> ", wb);
if (color)
writeString(setColor(std::hash<std::string>()(msg.getSource())), wb);
DB::writeString(msg.getSource(), wb);
if (color)
writeCString(resetColor(), wb);
writeCString(": ", wb);
DB::writeString(msg.getText(), wb);
}


@ -20,7 +20,7 @@ void OwnSplitChannel::log(const Poco::Message & msg)
if (channels.empty() && (logs_queue == nullptr || msg.getPriority() > logs_queue->max_priority))
return;
if (auto masker = SensitiveDataMasker::getInstance())
if (auto * masker = SensitiveDataMasker::getInstance())
{
auto message_text = msg.getText();
auto matches = masker->wipeSensitiveData(message_text);

base/loggers/ya.make Normal file

@ -0,0 +1,15 @@
LIBRARY()
PEERDIR(
clickhouse/src/Common
)
SRCS(
ExtendedLogChannel.cpp
Loggers.cpp
OwnFormattingChannel.cpp
OwnPatternFormatter.cpp
OwnSplitChannel.cpp
)
END()


@ -0,0 +1,2 @@
add_library(pcg_random INTERFACE)
target_include_directories(pcg_random INTERFACE .)


@ -292,7 +292,7 @@ inline itype rotl(itype value, bitcount_t rot)
{
constexpr bitcount_t bits = sizeof(itype) * 8;
constexpr bitcount_t mask = bits - 1;
#if PCG_USE_ZEROCHECK_ROTATE_IDIOM
#if defined(PCG_USE_ZEROCHECK_ROTATE_IDIOM)
return rot ? (value << rot) | (value >> (bits - rot)) : value;
#else
return (value << rot) | (value >> ((- rot) & mask));
@ -304,7 +304,7 @@ inline itype rotr(itype value, bitcount_t rot)
{
constexpr bitcount_t bits = sizeof(itype) * 8;
constexpr bitcount_t mask = bits - 1;
#if PCG_USE_ZEROCHECK_ROTATE_IDIOM
#if defined(PCG_USE_ZEROCHECK_ROTATE_IDIOM)
return rot ? (value >> rot) | (value << (bits - rot)) : value;
#else
return (value >> rot) | (value << ((- rot) & mask));
@ -318,7 +318,7 @@ inline itype rotr(itype value, bitcount_t rot)
*
* These overloads will be preferred over the general template code above.
*/
#if PCG_USE_INLINE_ASM && __GNUC__ && (__x86_64__ || __i386__)
#if defined(PCG_USE_INLINE_ASM) && __GNUC__ && (__x86_64__ || __i386__)
inline uint8_t rotr(uint8_t value, bitcount_t rot)
{
@ -600,7 +600,7 @@ std::ostream& operator<<(std::ostream& out, printable_typename<T>) {
#ifdef __GNUC__
int status;
char* pretty_name =
abi::__cxa_demangle(implementation_typename, NULL, NULL, &status);
abi::__cxa_demangle(implementation_typename, nullptr, nullptr, &status);
if (status == 0)
out << pretty_name;
free(static_cast<void*>(pretty_name));

base/pcg-random/ya.make Normal file

@ -0,0 +1,5 @@
LIBRARY()
ADDINCL (GLOBAL clickhouse/base/pcg-random)
END()


@ -0,0 +1,9 @@
LIBRARY()
ADDINCL(GLOBAL clickhouse/base/widechar_width)
SRCS(
widechar_width.cpp
)
END()


@ -1,3 +1,7 @@
RECURSE(
common
daemon
loggers
pcg-random
widechar_width
)


@ -4,7 +4,11 @@ if (NOT COMPILER_CLANG)
message (FATAL_ERROR "FreeBSD build is supported only for Clang")
endif ()
execute_process (COMMAND ${CMAKE_CXX_COMPILER} --print-file-name=libclang_rt.builtins-${CMAKE_SYSTEM_PROCESSOR}.a OUTPUT_VARIABLE BUILTINS_LIBRARY OUTPUT_STRIP_TRAILING_WHITESPACE)
if (${CMAKE_SYSTEM_PROCESSOR} STREQUAL "amd64")
execute_process (COMMAND ${CMAKE_CXX_COMPILER} --print-file-name=libclang_rt.builtins-x86_64.a OUTPUT_VARIABLE BUILTINS_LIBRARY OUTPUT_STRIP_TRAILING_WHITESPACE)
else ()
execute_process (COMMAND ${CMAKE_CXX_COMPILER} --print-file-name=libclang_rt.builtins-${CMAKE_SYSTEM_PROCESSOR}.a OUTPUT_VARIABLE BUILTINS_LIBRARY OUTPUT_STRIP_TRAILING_WHITESPACE)
endif ()
set (DEFAULT_LIBS "${DEFAULT_LIBS} ${BUILTINS_LIBRARY} ${COVERAGE_OPTION} -lc -lm -lrt -lpthread")

cmake/fuzzer.cmake Normal file

@ -0,0 +1,21 @@
option (FUZZER "Enable fuzzer: libfuzzer")
if (FUZZER)
if (FUZZER STREQUAL "libfuzzer")
# NOTE: Eldar Zaitov decided to name it "libfuzzer" instead of "fuzzer" to keep in mind other possible fuzzer backends.
# NOTE: no-link means that all targets are built with fuzzer instrumentation, but only some of them (tests) have an entry point for the fuzzer, and this is not checked.
set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${SAN_FLAGS} -fsanitize=fuzzer-no-link")
set (CMAKE_C_FLAGS "${CMAKE_C_FLAGS} ${SAN_FLAGS} -fsanitize=fuzzer-no-link")
if (CMAKE_CXX_COMPILER_ID STREQUAL "GNU")
set (CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} -fsanitize=fuzzer-no-link")
endif()
# NOTE: oss-fuzz can change LIB_FUZZING_ENGINE variable
if (NOT LIB_FUZZING_ENGINE)
set (LIB_FUZZING_ENGINE "-fsanitize=fuzzer")
endif ()
else ()
message (FATAL_ERROR "Unknown fuzzer type: ${FUZZER}")
endif ()
endif()


@ -2,4 +2,3 @@ set(DIVIDE_INCLUDE_DIR ${ClickHouse_SOURCE_DIR}/contrib/libdivide)
set(DBMS_INCLUDE_DIR ${ClickHouse_SOURCE_DIR}/src ${ClickHouse_BINARY_DIR}/src)
set(DOUBLE_CONVERSION_CONTRIB_INCLUDE_DIR ${ClickHouse_SOURCE_DIR}/contrib/double-conversion)
set(METROHASH_CONTRIB_INCLUDE_DIR ${ClickHouse_SOURCE_DIR}/contrib/libmetrohash/src)
set(PCG_RANDOM_INCLUDE_DIR ${ClickHouse_SOURCE_DIR}/contrib/libpcg-random/include)


@ -58,18 +58,6 @@ if (SANITIZE)
# llvm-tblgen, that is used during LLVM build, doesn't work with UBSan.
set (ENABLE_EMBEDDED_COMPILER 0 CACHE BOOL "")
elseif (SANITIZE STREQUAL "libfuzzer")
# NOTE: Eldar Zaitov decided to name it "libfuzzer" instead of "fuzzer" to keep in mind another possible fuzzer backends.
# NOTE: no-link means that all the targets are built with instrumentation for fuzzer, but only some of them (tests) have entry point for fuzzer and it's not checked.
set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${SAN_FLAGS} -fsanitize=fuzzer-no-link,address,undefined -fsanitize-address-use-after-scope")
set (CMAKE_C_FLAGS "${CMAKE_C_FLAGS} ${SAN_FLAGS} -fsanitize=fuzzer-no-link,address,undefined -fsanitize-address-use-after-scope")
if (CMAKE_CXX_COMPILER_ID STREQUAL "GNU")
set (CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} -fsanitize=fuzzer-no-link,address,undefined -fsanitize-address-use-after-scope")
endif()
if (MAKE_STATIC_LIBRARIES AND CMAKE_CXX_COMPILER_ID STREQUAL "GNU")
set (CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} -static-libasan -static-libubsan")
endif ()
set (LIBFUZZER_CMAKE_CXX_FLAGS "-fsanitize=fuzzer,address,undefined -fsanitize-address-use-after-scope")
else ()
message (FATAL_ERROR "Unknown sanitizer type: ${SANITIZE}")
endif ()


@ -1,11 +1,11 @@
# This strings autochanged from release_lib.sh:
set(VERSION_REVISION 54434)
set(VERSION_REVISION 54435)
set(VERSION_MAJOR 20)
set(VERSION_MINOR 4)
set(VERSION_MINOR 5)
set(VERSION_PATCH 1)
set(VERSION_GITHASH 05da1bff8b5826608d05618dab984cdf8f96e679)
set(VERSION_DESCRIBE v20.4.1.1-prestable)
set(VERSION_STRING 20.4.1.1)
set(VERSION_GITHASH 91df18a906dcffdbee6816e5389df6c65f86e35f)
set(VERSION_DESCRIBE v20.5.1.1-prestable)
set(VERSION_STRING 20.5.1.1)
# end of autochange
set(VERSION_EXTRA "" CACHE STRING "")


@ -333,6 +333,5 @@ add_subdirectory(grpc-cmake)
add_subdirectory(replxx-cmake)
add_subdirectory(FastMemcpy)
add_subdirectory(widecharwidth)
add_subdirectory(consistent-hashing)
add_subdirectory(consistent-hashing-sumbur)

contrib/cctz vendored

@ -1 +1 @@
Subproject commit 5a3f785329cecdd2b68cd950e0647e9246774ef2
Subproject commit 7a2db4ece6e0f1b246173cbdb62711ae258ee841


@ -23,6 +23,600 @@ if (USE_INTERNAL_CCTZ)
# yes, need linux, because bsd check inside linux in time_zone_libc.cc:24
target_compile_definitions (cctz PRIVATE __USE_BSD linux _XOPEN_SOURCE=600)
endif ()
# Build a library with embedded tzdata
# We invoke 'ld' and 'objcopy' directly because lld linker has no option to generate object file with binary data.
# Note: we can invoke specific ld from toolchain and relax condition on ARCH_AMD64.
if (OS_LINUX AND ARCH_AMD64)
set (TIMEZONES
Africa/Abidjan
Africa/Accra
Africa/Addis_Ababa
Africa/Algiers
Africa/Asmara
Africa/Asmera
Africa/Bamako
Africa/Bangui
Africa/Banjul
Africa/Bissau
Africa/Blantyre
Africa/Brazzaville
Africa/Bujumbura
Africa/Cairo
Africa/Casablanca
Africa/Ceuta
Africa/Conakry
Africa/Dakar
Africa/Dar_es_Salaam
Africa/Djibouti
Africa/Douala
Africa/El_Aaiun
Africa/Freetown
Africa/Gaborone
Africa/Harare
Africa/Johannesburg
Africa/Juba
Africa/Kampala
Africa/Khartoum
Africa/Kigali
Africa/Kinshasa
Africa/Lagos
Africa/Libreville
Africa/Lome
Africa/Luanda
Africa/Lubumbashi
Africa/Lusaka
Africa/Malabo
Africa/Maputo
Africa/Maseru
Africa/Mbabane
Africa/Mogadishu
Africa/Monrovia
Africa/Nairobi
Africa/Ndjamena
Africa/Niamey
Africa/Nouakchott
Africa/Ouagadougou
Africa/Porto-Novo
Africa/Sao_Tome
Africa/Timbuktu
Africa/Tripoli
Africa/Tunis
Africa/Windhoek
America/Adak
America/Anchorage
America/Anguilla
America/Antigua
America/Araguaina
America/Argentina/Buenos_Aires
America/Argentina/Catamarca
America/Argentina/ComodRivadavia
America/Argentina/Cordoba
America/Argentina/Jujuy
America/Argentina/La_Rioja
America/Argentina/Mendoza
America/Argentina/Rio_Gallegos
America/Argentina/Salta
America/Argentina/San_Juan
America/Argentina/San_Luis
America/Argentina/Tucuman
America/Argentina/Ushuaia
America/Aruba
America/Asuncion
America/Atikokan
America/Atka
America/Bahia
America/Bahia_Banderas
America/Barbados
America/Belem
America/Belize
America/Blanc-Sablon
America/Boa_Vista
America/Bogota
America/Boise
America/Buenos_Aires
America/Cambridge_Bay
America/Campo_Grande
America/Cancun
America/Caracas
America/Catamarca
America/Cayenne
America/Cayman
America/Chicago
America/Chihuahua
America/Coral_Harbour
America/Cordoba
America/Costa_Rica
America/Creston
America/Cuiaba
America/Curacao
America/Danmarkshavn
America/Dawson
America/Dawson_Creek
America/Denver
America/Detroit
America/Dominica
America/Edmonton
America/Eirunepe
America/El_Salvador
America/Ensenada
America/Fortaleza
America/Fort_Nelson
America/Fort_Wayne
America/Glace_Bay
America/Godthab
America/Goose_Bay
America/Grand_Turk
America/Grenada
America/Guadeloupe
America/Guatemala
America/Guayaquil
America/Guyana
America/Halifax
America/Havana
America/Hermosillo
America/Indiana/Indianapolis
America/Indiana/Knox
America/Indiana/Marengo
America/Indiana/Petersburg
America/Indianapolis
America/Indiana/Tell_City
America/Indiana/Vevay
America/Indiana/Vincennes
America/Indiana/Winamac
America/Inuvik
America/Iqaluit
America/Jamaica
America/Jujuy
America/Juneau
America/Kentucky/Louisville
America/Kentucky/Monticello
America/Knox_IN
America/Kralendijk
America/La_Paz
America/Lima
America/Los_Angeles
America/Louisville
America/Lower_Princes
America/Maceio
America/Managua
America/Manaus
America/Marigot
America/Martinique
America/Matamoros
America/Mazatlan
America/Mendoza
America/Menominee
America/Merida
America/Metlakatla
America/Mexico_City
America/Miquelon
America/Moncton
America/Monterrey
America/Montevideo
America/Montreal
America/Montserrat
America/Nassau
America/New_York
America/Nipigon
America/Nome
America/Noronha
America/North_Dakota/Beulah
America/North_Dakota/Center
America/North_Dakota/New_Salem
America/Ojinaga
America/Panama
America/Pangnirtung
America/Paramaribo
America/Phoenix
America/Port-au-Prince
America/Porto_Acre
America/Port_of_Spain
America/Porto_Velho
America/Puerto_Rico
America/Punta_Arenas
America/Rainy_River
America/Rankin_Inlet
America/Recife
America/Regina
America/Resolute
America/Rio_Branco
America/Rosario
America/Santa_Isabel
America/Santarem
America/Santiago
America/Santo_Domingo
America/Sao_Paulo
America/Scoresbysund
America/Shiprock
America/Sitka
America/St_Barthelemy
America/St_Johns
America/St_Kitts
America/St_Lucia
America/St_Thomas
America/St_Vincent
America/Swift_Current
America/Tegucigalpa
America/Thule
America/Thunder_Bay
America/Tijuana
America/Toronto
America/Tortola
America/Vancouver
America/Virgin
America/Whitehorse
America/Winnipeg
America/Yakutat
America/Yellowknife
Antarctica/Casey
Antarctica/Davis
Antarctica/DumontDUrville
Antarctica/Macquarie
Antarctica/Mawson
Antarctica/McMurdo
Antarctica/Palmer
Antarctica/Rothera
Antarctica/South_Pole
Antarctica/Syowa
Antarctica/Troll
Antarctica/Vostok
Arctic/Longyearbyen
Asia/Aden
Asia/Almaty
Asia/Amman
Asia/Anadyr
Asia/Aqtau
Asia/Aqtobe
Asia/Ashgabat
Asia/Ashkhabad
Asia/Atyrau
Asia/Baghdad
Asia/Bahrain
Asia/Baku
Asia/Bangkok
Asia/Barnaul
Asia/Beirut
Asia/Bishkek
Asia/Brunei
Asia/Calcutta
Asia/Chita
Asia/Choibalsan
Asia/Chongqing
Asia/Chungking
Asia/Colombo
Asia/Dacca
Asia/Damascus
Asia/Dhaka
Asia/Dili
Asia/Dubai
Asia/Dushanbe
Asia/Famagusta
Asia/Gaza
Asia/Harbin
Asia/Hebron
Asia/Ho_Chi_Minh
Asia/Hong_Kong
Asia/Hovd
Asia/Irkutsk
Asia/Istanbul
Asia/Jakarta
Asia/Jayapura
Asia/Jerusalem
Asia/Kabul
Asia/Kamchatka
Asia/Karachi
Asia/Kashgar
Asia/Kathmandu
Asia/Katmandu
Asia/Khandyga
Asia/Kolkata
Asia/Krasnoyarsk
Asia/Kuala_Lumpur
Asia/Kuching
Asia/Kuwait
Asia/Macao
Asia/Macau
Asia/Magadan
Asia/Makassar
Asia/Manila
Asia/Muscat
Asia/Nicosia
Asia/Novokuznetsk
Asia/Novosibirsk
Asia/Omsk
Asia/Oral
Asia/Phnom_Penh
Asia/Pontianak
Asia/Pyongyang
Asia/Qatar
Asia/Qostanay
Asia/Qyzylorda
Asia/Rangoon
Asia/Riyadh
Asia/Saigon
Asia/Sakhalin
Asia/Samarkand
Asia/Seoul
Asia/Shanghai
Asia/Singapore
Asia/Srednekolymsk
Asia/Taipei
Asia/Tashkent
Asia/Tbilisi
Asia/Tehran
Asia/Tel_Aviv
Asia/Thimbu
Asia/Thimphu
Asia/Tokyo
Asia/Tomsk
Asia/Ujung_Pandang
Asia/Ulaanbaatar
Asia/Ulan_Bator
Asia/Urumqi
Asia/Ust-Nera
Asia/Vientiane
Asia/Vladivostok
Asia/Yakutsk
Asia/Yangon
Asia/Yekaterinburg
Asia/Yerevan
Atlantic/Azores
Atlantic/Bermuda
Atlantic/Canary
Atlantic/Cape_Verde
Atlantic/Faeroe
Atlantic/Faroe
Atlantic/Jan_Mayen
Atlantic/Madeira
Atlantic/Reykjavik
Atlantic/South_Georgia
Atlantic/Stanley
Atlantic/St_Helena
Australia/ACT
Australia/Adelaide
Australia/Brisbane
Australia/Broken_Hill
Australia/Canberra
Australia/Currie
Australia/Darwin
Australia/Eucla
Australia/Hobart
Australia/LHI
Australia/Lindeman
Australia/Lord_Howe
Australia/Melbourne
Australia/North
Australia/NSW
Australia/Perth
Australia/Queensland
Australia/South
Australia/Sydney
Australia/Tasmania
Australia/Victoria
Australia/West
Australia/Yancowinna
Brazil/Acre
Brazil/DeNoronha
Brazil/East
Brazil/West
Canada/Atlantic
Canada/Central
Canada/Eastern
Canada/Mountain
Canada/Newfoundland
Canada/Pacific
Canada/Saskatchewan
Canada/Yukon
CET
Chile/Continental
Chile/EasterIsland
CST6CDT
Cuba
EET
Egypt
Eire
EST
EST5EDT
Etc/GMT
Etc/Greenwich
Etc/UCT
Etc/Universal
Etc/UTC
Etc/Zulu
Europe/Amsterdam
Europe/Andorra
Europe/Astrakhan
Europe/Athens
Europe/Belfast
Europe/Belgrade
Europe/Berlin
Europe/Bratislava
Europe/Brussels
Europe/Bucharest
Europe/Budapest
Europe/Busingen
Europe/Chisinau
Europe/Copenhagen
Europe/Dublin
Europe/Gibraltar
Europe/Guernsey
Europe/Helsinki
Europe/Isle_of_Man
Europe/Istanbul
Europe/Jersey
Europe/Kaliningrad
Europe/Kiev
Europe/Kirov
Europe/Lisbon
Europe/Ljubljana
Europe/London
Europe/Luxembourg
Europe/Madrid
Europe/Malta
Europe/Mariehamn
Europe/Minsk
Europe/Monaco
Europe/Moscow
Europe/Nicosia
Europe/Oslo
Europe/Paris
Europe/Podgorica
Europe/Prague
Europe/Riga
Europe/Rome
Europe/Samara
Europe/San_Marino
Europe/Sarajevo
Europe/Saratov
Europe/Simferopol
Europe/Skopje
Europe/Sofia
Europe/Stockholm
Europe/Tallinn
Europe/Tirane
Europe/Tiraspol
Europe/Ulyanovsk
Europe/Uzhgorod
Europe/Vaduz
Europe/Vatican
Europe/Vienna
Europe/Vilnius
Europe/Volgograd
Europe/Warsaw
Europe/Zagreb
Europe/Zaporozhye
Europe/Zurich
Factory
GB
GB-Eire
GMT
GMT0
Greenwich
Hongkong
HST
Iceland
Indian/Antananarivo
Indian/Chagos
Indian/Christmas
Indian/Cocos
Indian/Comoro
Indian/Kerguelen
Indian/Mahe
Indian/Maldives
Indian/Mauritius
Indian/Mayotte
Indian/Reunion
Iran
Israel
Jamaica
Japan
Kwajalein
Libya
MET
Mexico/BajaNorte
Mexico/BajaSur
Mexico/General
MST
MST7MDT
Navajo
NZ
NZ-CHAT
Pacific/Apia
Pacific/Auckland
Pacific/Bougainville
Pacific/Chatham
Pacific/Chuuk
Pacific/Easter
Pacific/Efate
Pacific/Enderbury
Pacific/Fakaofo
Pacific/Fiji
Pacific/Funafuti
Pacific/Galapagos
Pacific/Gambier
Pacific/Guadalcanal
Pacific/Guam
Pacific/Honolulu
Pacific/Johnston
Pacific/Kiritimati
Pacific/Kosrae
Pacific/Kwajalein
Pacific/Majuro
Pacific/Marquesas
Pacific/Midway
Pacific/Nauru
Pacific/Niue
Pacific/Norfolk
Pacific/Noumea
Pacific/Pago_Pago
Pacific/Palau
Pacific/Pitcairn
Pacific/Pohnpei
Pacific/Ponape
Pacific/Port_Moresby
Pacific/Rarotonga
Pacific/Saipan
Pacific/Samoa
Pacific/Tahiti
Pacific/Tarawa
Pacific/Tongatapu
Pacific/Truk
Pacific/Wake
Pacific/Wallis
Pacific/Yap
Poland
Portugal
PRC
PST8PDT
ROC
ROK
Singapore
Turkey
UCT
Universal
US/Alaska
US/Aleutian
US/Arizona
US/Central
US/Eastern
US/East-Indiana
US/Hawaii
US/Indiana-Starke
US/Michigan
US/Mountain
US/Pacific
US/Samoa
UTC
WET
W-SU
Zulu)
set(TZDIR ${LIBRARY_DIR}/testdata/zoneinfo)
set(TZ_OBJS)
foreach(TIMEZONE ${TIMEZONES})
string(REPLACE "/" "_" TIMEZONE_ID ${TIMEZONE})
set(TZ_OBJ ${TIMEZONE_ID}.o)
set(TZ_OBJS ${TZ_OBJS} ${TZ_OBJ})
# https://stackoverflow.com/questions/14776463/compile-and-add-an-object-file-from-a-binary-with-cmake
add_custom_command(OUTPUT ${TZ_OBJ}
COMMAND cd ${TZDIR} && ld -r -b binary -o ${CMAKE_CURRENT_BINARY_DIR}/${TZ_OBJ} ${TIMEZONE}
COMMAND objcopy --rename-section .data=.rodata,alloc,load,readonly,data,contents
${CMAKE_CURRENT_BINARY_DIR}/${TZ_OBJ} ${CMAKE_CURRENT_BINARY_DIR}/${TZ_OBJ})
set_source_files_properties(${TZ_OBJ} PROPERTIES EXTERNAL_OBJECT true GENERATED true)
endforeach(TIMEZONE)
add_library(tzdata STATIC ${TZ_OBJS})
set_target_properties(tzdata PROPERTIES LINKER_LANGUAGE C)
target_link_libraries(cctz -Wl,--whole-archive tzdata -Wl,--no-whole-archive) # whole-archive prevents symbols from being discarded
endif ()
else ()
find_library (LIBRARY_CCTZ cctz)
find_path (INCLUDE_CCTZ NAMES cctz/civil_time.h)

contrib/libgsasl vendored

@ -1 +1 @@
Subproject commit 42ef20687042637252e64df1934b6d47771486d1
Subproject commit 140fb58250588c8323285b75fcf127c4adc33dfa


@ -1,52 +0,0 @@
# PCG Random Number Generation, C++ Edition
[PCG-Random website]: http://www.pcg-random.org
This code provides an implementation of the PCG family of random number
generators, which are fast, statistically excellent, and offer a number of
useful features.
Full details can be found at the [PCG-Random website]. This version
of the code provides many family members -- if you just want one
simple generator, you may prefer the minimal C version of the library.
There are two kinds of generator, normal generators and extended generators.
Extended generators provide *k* dimensional equidistribution and can perform
party tricks, but generally speaking most people only need the normal
generators.
There are two ways to access the generators, using a convenience typedef
or by using the underlying templates directly (similar to C++11's `std::mt19937` typedef vs its `std::mersenne_twister_engine` template). For most users, the convenience typedef is what you want, and probably you're fine with `pcg32` for 32-bit numbers. If you want 64-bit numbers, either use `pcg64` (or, if you're on a 32-bit system, making 64 bits from two calls to `pcg32_k2` may be faster).
## Documentation and Examples
Visit [PCG-Random website] for information on how to use this library, or look
at the sample code in the `sample` directory -- hopefully it should be fairly
self explanatory.
## Building
The code is written in C++11, as an include-only library (i.e., there is
nothing you need to build). There are some provided demo programs and tests
however. On a Unix-style system (e.g., Linux, Mac OS X) you should be able
to just type
make
To build the demo programs.
## Testing
Run
make test
## Directory Structure
The directories are arranged as follows:
* `include` -- contains `pcg_random.hpp` and supporting include files
* `test-high` -- test code for the high-level API where the functions have
shorter, less scary-looking names.
* `sample` -- sample code, some similar to the code in `test-high` but more
human readable, some other examples too

contrib/llvm vendored

@ -1 +1 @@
Subproject commit 5dab18f4861677548b8f7f6815f49384480ecead
Subproject commit 3d6c7e916760b395908f28a1c885c8334d4fa98b

contrib/poco vendored

@ -1 +1 @@
Subproject commit ddca76ba4956cb57150082394536cc43ff28f6fa
Subproject commit 7d605a1ae5d878294f91f68feb62ae51e9a04426

contrib/rapidjson vendored

@ -1 +1 @@
Subproject commit 01950eb7acec78818d68b762efc869bba2420d82
Subproject commit 8f4c021fa2f1e001d2376095928fc0532adf2ae6

debian/changelog vendored

@ -1,5 +1,5 @@
clickhouse (20.4.1.1) unstable; urgency=low
clickhouse (20.5.1.1) unstable; urgency=low
* Modified source code
-- clickhouse-release <clickhouse-release@yandex-team.ru> Thu, 12 Mar 2020 16:07:04 +0300
-- clickhouse-release <clickhouse-release@yandex-team.ru> Tue, 28 Apr 2020 20:12:13 +0300

debian/control vendored

@ -28,7 +28,7 @@ Description: Client binary for ClickHouse
Package: clickhouse-common-static
Architecture: any
Depends: ${shlibs:Depends}, ${misc:Depends}, tzdata
Depends: ${shlibs:Depends}, ${misc:Depends}
Suggests: clickhouse-common-static-dbg
Replaces: clickhouse-common, clickhouse-server-base
Provides: clickhouse-common, clickhouse-server-base

View File

@ -1,5 +1,5 @@
## ClickHouse Dockerfiles
This directory contain Dockerfiles for `clickhouse-client` and `clickhouse-server`. They updated each release.
This directory contains Dockerfiles for `clickhouse-client` and `clickhouse-server`. They are updated with each release.
There is also a bunch of images for testing and CI. They are listed in the `images.json` file and updated on each commit to master. If you need to add another image, place information about it into `images.json`.

View File

@ -1,5 +1,9 @@
FROM ubuntu:19.10
RUN apt-get --allow-unauthenticated update -y && apt-get install --yes wget gnupg
RUN wget -O - https://apt.llvm.org/llvm-snapshot.gpg.key | apt-key add -
RUN echo "deb [trusted=yes] http://apt.llvm.org/eoan/ llvm-toolchain-eoan-10 main" >> /etc/apt/sources.list
RUN apt-get update -y \
&& env DEBIAN_FRONTEND=noninteractive \
apt-get install --yes --no-install-recommends \
@ -19,10 +23,11 @@ RUN apt-get update -y \
python-termcolor \
sudo \
tzdata \
clang \
clang-tidy \
lld \
lldb
llvm-10 \
clang-10 \
clang-tidy-10 \
lld-10 \
lldb-10
COPY build.sh /

View File

@ -1,7 +1,7 @@
FROM ubuntu:18.04
ARG repository="deb http://repo.yandex.ru/clickhouse/deb/stable/ main/"
ARG version=20.4.1.*
ARG version=20.5.1.*
RUN apt-get update \
&& apt-get install --yes --no-install-recommends \
@ -16,8 +16,7 @@ RUN apt-get update \
apt-get install --allow-unauthenticated --yes --no-install-recommends \
clickhouse-client=$version \
clickhouse-common-static=$version \
locales \
tzdata \
locales
&& rm -rf /var/lib/apt/lists/* /var/cache/debconf \
&& apt-get clean

View File

@ -1,10 +1,10 @@
{
"docker/packager/deb": "yandex/clickhouse-deb-builder",
"docker/packager/binary": "yandex/clickhouse-binary-builder",
"docker/test/coverage": "yandex/clickhouse-coverage",
"docker/test/compatibility/centos": "yandex/clickhouse-test-old-centos",
"docker/test/compatibility/ubuntu": "yandex/clickhouse-test-old-ubuntu",
"docker/test/integration": "yandex/clickhouse-integration-test",
"docker/test/performance": "yandex/clickhouse-performance-test",
"docker/test/integration/base": "yandex/clickhouse-integration-test",
"docker/test/performance-comparison": "yandex/clickhouse-performance-comparison",
"docker/test/pvs": "yandex/clickhouse-pvs-test",
"docker/test/stateful": "yandex/clickhouse-stateful-test",
@ -14,5 +14,6 @@
"docker/test/unit": "yandex/clickhouse-unit-test",
"docker/test/stress": "yandex/clickhouse-stress-test",
"docker/test/split_build_smoke_test": "yandex/clickhouse-split-build-smoke-test",
"tests/integration/image": "yandex/clickhouse-integration-tests-runner"
"docker/test/codebrowser": "yandex/clickhouse-codebrowser",
"docker/test/integration/runner": "yandex/clickhouse-integration-tests-runner"
}

View File

@ -18,11 +18,11 @@ $ ls -l deb/test_output
```
Build ClickHouse binary with `clang-9.0` and `address` sanitizer in `relwithdebuginfo`
Build ClickHouse binary with `clang-9` and `address` sanitizer in `relwithdebuginfo`
mode:
```
$ mkdir $HOME/some_clickhouse
$ ./packager --output-dir=$HOME/some_clickhouse --package-type binary --compiler=clang-9.0 --sanitizer=address
$ ./packager --output-dir=$HOME/some_clickhouse --package-type binary --compiler=clang-9 --sanitizer=address
$ ls -l $HOME/some_clickhouse
-rwxr-xr-x 1 root root 787061952 clickhouse
lrwxrwxrwx 1 root root 10 clickhouse-benchmark -> clickhouse

View File

@ -1,7 +1,10 @@
# Trigger new image build
# docker build -t yandex/clickhouse-binary-builder .
FROM ubuntu:19.10
RUN apt-get --allow-unauthenticated update -y && apt-get install --yes wget gnupg
RUN wget -O - https://apt.llvm.org/llvm-snapshot.gpg.key | apt-key add -
RUN echo "deb [trusted=yes] http://apt.llvm.org/eoan/ llvm-toolchain-eoan-10 main" >> /etc/apt/sources.list
RUN apt-get --allow-unauthenticated update -y \
&& env DEBIAN_FRONTEND=noninteractive \
apt-get --allow-unauthenticated install --yes --no-install-recommends \
@ -23,6 +26,10 @@ RUN apt-get update -y \
curl \
gcc-9 \
g++-9 \
llvm-10 \
clang-10 \
lld-10 \
clang-tidy-10 \
clang-9 \
lld-9 \
clang-tidy-9 \
@ -43,10 +50,10 @@ RUN apt-get update -y \
build-essential
# This symlink required by gcc to find lld compiler
RUN ln -s /usr/bin/lld-9 /usr/bin/ld.lld
RUN ln -s /usr/bin/lld-10 /usr/bin/ld.lld
ENV CC=clang-9
ENV CXX=clang++-9
ENV CC=clang-10
ENV CXX=clang++-10
# libtapi is required to support the .tbd format from recent MacOS SDKs
RUN git clone https://github.com/tpoechtrager/apple-libtapi.git

View File

@ -1,6 +1,10 @@
# docker build -t yandex/clickhouse-deb-builder .
FROM ubuntu:19.10
RUN apt-get --allow-unauthenticated update -y && apt-get install --yes wget gnupg
RUN wget -O - https://apt.llvm.org/llvm-snapshot.gpg.key | apt-key add -
RUN echo "deb [trusted=yes] http://apt.llvm.org/eoan/ llvm-toolchain-eoan-10 main" >> /etc/apt/sources.list
RUN apt-get --allow-unauthenticated update -y \
&& env DEBIAN_FRONTEND=noninteractive \
apt-get --allow-unauthenticated install --yes --no-install-recommends \
@ -19,6 +23,10 @@ RUN apt-get --allow-unauthenticated update -y \
apt-get --allow-unauthenticated install --yes --no-install-recommends \
gcc-9 \
g++-9 \
llvm-10 \
clang-10 \
lld-10 \
clang-tidy-10 \
clang-9 \
lld-9 \
clang-tidy-9 \
@ -69,7 +77,7 @@ RUN chmod +x dpkg-deb
RUN cp dpkg-deb /usr/bin
# This symlink required by gcc to find lld compiler
RUN ln -s /usr/bin/lld-9 /usr/bin/ld.lld
RUN ln -s /usr/bin/lld-10 /usr/bin/ld.lld
COPY build.sh /

View File

@ -1,7 +1,7 @@
FROM ubuntu:18.04
ARG repository="deb http://repo.yandex.ru/clickhouse/deb/stable/ main/"
ARG version=20.4.1.*
ARG version=20.5.1.*
ARG gosu_ver=1.10
RUN apt-get update \
@ -19,7 +19,7 @@ RUN apt-get update \
clickhouse-client=$version \
clickhouse-server=$version \
locales \
tzdata \
ca-certificates \
wget \
&& rm -rf \
/var/lib/apt/lists/* \

View File

@ -1,7 +1,7 @@
FROM ubuntu:18.04
ARG repository="deb http://repo.yandex.ru/clickhouse/deb/stable/ main/"
ARG version=20.4.1.*
ARG version=20.5.1.*
RUN apt-get update && \
apt-get install -y apt-transport-https dirmngr && \

View File

@ -0,0 +1,6 @@
## Docker containers for integration tests
- `base` container with required packages
- `runner` container that runs integration tests in docker
- `compose` contains docker_compose YAML files that are used in tests
How to run integration tests is described in tests/integration/README.md

View File

@ -0,0 +1,76 @@
# docker build -t yandex/clickhouse-integration-tests-runner .
FROM ubuntu:18.04
RUN apt-get update \
&& env DEBIAN_FRONTEND=noninteractive apt-get install --yes \
ca-certificates \
bash \
btrfs-progs \
e2fsprogs \
iptables \
xfsprogs \
tar \
pigz \
wget \
git \
iproute2 \
module-init-tools \
cgroupfs-mount \
python-pip \
tzdata \
libreadline-dev \
libicu-dev \
bsdutils \
curl \
liblua5.1-dev \
luajit \
libssl-dev \
gdb \
&& rm -rf \
/var/lib/apt/lists/* \
/var/cache/debconf \
/tmp/* \
&& apt-get clean
ENV TZ=Europe/Moscow
RUN ln -snf /usr/share/zoneinfo/$TZ /etc/localtime && echo $TZ > /etc/timezone
RUN pip install urllib3==1.23 pytest docker-compose==1.22.0 docker dicttoxml kazoo PyMySQL psycopg2==2.7.5 pymongo tzlocal kafka-python protobuf redis aerospike pytest-timeout minio rpm-confluent-schemaregistry
ENV DOCKER_CHANNEL stable
ENV DOCKER_VERSION 17.09.1-ce
RUN set -eux; \
\
# this "case" statement is generated via "update.sh"
\
if ! wget -O docker.tgz "https://download.docker.com/linux/static/${DOCKER_CHANNEL}/x86_64/docker-${DOCKER_VERSION}.tgz"; then \
echo >&2 "error: failed to download 'docker-${DOCKER_VERSION}' from '${DOCKER_CHANNEL}' for '${x86_64}'"; \
exit 1; \
fi; \
\
tar --extract \
--file docker.tgz \
--strip-components 1 \
--directory /usr/local/bin/ \
; \
rm docker.tgz; \
\
dockerd --version; \
docker --version
COPY modprobe.sh /usr/local/bin/modprobe
COPY dockerd-entrypoint.sh /usr/local/bin/
RUN set -x \
&& addgroup --system dockremap \
&& adduser --system dockremap \
&& adduser dockremap dockremap \
&& echo 'dockremap:165536:65536' >> /etc/subuid \
&& echo 'dockremap:165536:65536' >> /etc/subgid
VOLUME /var/lib/docker
EXPOSE 2375
ENTRYPOINT ["dockerd-entrypoint.sh"]
CMD ["sh", "-c", "pytest $PYTEST_OPTS"]

View File

@ -9,10 +9,10 @@ set -eu
# Docker often uses "modprobe -va foo bar baz"
# so we ignore modules that start with "-"
for module; do
if [ "${module#-}" = "$module" ]; then
ip link show "$module" || true
lsmod | grep "$module" || true
fi
if [ "${module#-}" = "$module" ]; then
ip link show "$module" || true
lsmod | grep "$module" || true
fi
done
# remove /usr/local/... from PATH so we can exec the real modprobe as a last resort

View File

@ -10,6 +10,7 @@ RUN apt-get update \
bash \
curl \
g++ \
gdb \
git \
libc6-dbg \
moreutils \

View File

@ -2,7 +2,7 @@
set -ex
set -o pipefail
trap "exit" INT TERM
trap "kill $(jobs -pr) ||:" EXIT
trap 'kill $(jobs -pr) ||:' EXIT
stage=${stage:-}
script_dir="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"
@ -14,26 +14,26 @@ function configure
rm right/config/config.d/text_log.xml ||:
cp -rv right/config left ||:
sed -i 's/<tcp_port>9000/<tcp_port>9001/g' left/config/config.xml
sed -i 's/<tcp_port>9000/<tcp_port>9002/g' right/config/config.xml
sed -i 's/<tcp_port>900./<tcp_port>9001/g' left/config/config.xml
sed -i 's/<tcp_port>900./<tcp_port>9002/g' right/config/config.xml
# Start a temporary server to rename the tables
while killall clickhouse; do echo . ; sleep 1 ; done
while killall clickhouse-server; do echo . ; sleep 1 ; done
echo all killed
set -m # Spawn temporary in its own process groups
left/clickhouse server --config-file=left/config/config.xml -- --path db0 &> setup-server-log.log &
left/clickhouse-server --config-file=left/config/config.xml -- --path db0 &> setup-server-log.log &
left_pid=$!
kill -0 $left_pid
disown $left_pid
set +m
while ! left/clickhouse client --port 9001 --query "select 1" ; do kill -0 $left_pid ; echo . ; sleep 1 ; done
while ! clickhouse-client --port 9001 --query "select 1" ; do kill -0 $left_pid ; echo . ; sleep 1 ; done
echo server for setup started
left/clickhouse client --port 9001 --query "create database test" ||:
left/clickhouse client --port 9001 --query "rename table datasets.hits_v1 to test.hits" ||:
clickhouse-client --port 9001 --query "create database test" ||:
clickhouse-client --port 9001 --query "rename table datasets.hits_v1 to test.hits" ||:
while killall clickhouse; do echo . ; sleep 1 ; done
while killall clickhouse-server; do echo . ; sleep 1 ; done
echo all killed
# Remove logs etc, because they will be updated, and sharing them between
@ -42,43 +42,50 @@ function configure
rm db0/metadata/system/* -rf ||:
# Make copies of the original db for both servers. Use hardlinks instead
# of copying. Be careful to remove preprocessed configs or it can lead to
# weird effects.
# of copying. Be careful to remove preprocessed configs and system tables, or
# it can lead to weird effects.
rm -r left/db ||:
rm -r right/db ||:
rm -r db0/preprocessed_configs ||:
rm -r db0/{data,metadata}/system ||:
cp -al db0/ left/db/
cp -al db0/ right/db/
}
function restart
{
while killall clickhouse; do echo . ; sleep 1 ; done
while killall clickhouse-server; do echo . ; sleep 1 ; done
echo all killed
set -m # Spawn servers in their own process groups
left/clickhouse server --config-file=left/config/config.xml -- --path left/db &>> left-server-log.log &
left/clickhouse-server --config-file=left/config/config.xml -- --path left/db &>> left-server-log.log &
left_pid=$!
kill -0 $left_pid
disown $left_pid
right/clickhouse server --config-file=right/config/config.xml -- --path right/db &>> right-server-log.log &
right/clickhouse-server --config-file=right/config/config.xml -- --path right/db &>> right-server-log.log &
right_pid=$!
kill -0 $right_pid
disown $right_pid
set +m
while ! left/clickhouse client --port 9001 --query "select 1" ; do kill -0 $left_pid ; echo . ; sleep 1 ; done
while ! clickhouse-client --port 9001 --query "select 1" ; do kill -0 $left_pid ; echo . ; sleep 1 ; done
echo left ok
while ! right/clickhouse client --port 9002 --query "select 1" ; do kill -0 $right_pid ; echo . ; sleep 1 ; done
while ! clickhouse-client --port 9002 --query "select 1" ; do kill -0 $right_pid ; echo . ; sleep 1 ; done
echo right ok
left/clickhouse client --port 9001 --query "select * from system.tables where database != 'system'"
left/clickhouse client --port 9001 --query "select * from system.build_options"
right/clickhouse client --port 9002 --query "select * from system.tables where database != 'system'"
right/clickhouse client --port 9002 --query "select * from system.build_options"
clickhouse-client --port 9001 --query "select * from system.tables where database != 'system'"
clickhouse-client --port 9001 --query "select * from system.build_options"
clickhouse-client --port 9002 --query "select * from system.tables where database != 'system'"
clickhouse-client --port 9002 --query "select * from system.build_options"
# Check again that both servers we started are running -- this is important
# for running locally, when there might be some other servers started and we
# will connect to them instead.
kill -0 $left_pid
kill -0 $right_pid
}
function run_tests
@ -86,63 +93,60 @@ function run_tests
# Just check that the script runs at all
"$script_dir/perf.py" --help > /dev/null
# When testing commits from master, use the older test files. This allows the
# tests to pass even when we add new functions and tests for them, that are
# not supported in the old revision.
# When testing a PR, use the test files from the PR so that we can test their
# changes.
test_prefix=$([ "$PR_TO_TEST" == "0" ] && echo left || echo right)/performance
# Find the directory with test files.
if [ -v CHPC_TEST_PATH ]
then
# Use the explicitly set path to directory with test files.
test_prefix="$CHPC_TEST_PATH"
elif [ "$PR_TO_TEST" = "0" ]
then
# When testing commits from master, use the older test files. This
# allows the tests to pass even when we add new functions and tests for
# them that are not supported in the old revision.
test_prefix=left/performance
elif [ "$PR_TO_TEST" != "" ] && [ "$PR_TO_TEST" != "0" ]
then
# For PRs, use newer test files so we can test these changes.
test_prefix=right/performance
for x in {test-times,skipped-tests}.tsv
# If some tests were changed in the PR, we may want to run only these
# ones. The list of changed tests in changed-test.txt is prepared in
# entrypoint.sh from git diffs, because it has the cloned repo. Used
# to use rsync for that but it was really ugly and not always correct
# (e.g. when the reference SHA is really old and has some other
# differences to the tested SHA, besides the one introduced by the PR).
changed_test_files=$(sed "s/tests\/performance/${test_prefix//\//\\/}/" changed-tests.txt)
fi
# Determine which tests to run.
if [ -v CHPC_TEST_GREP ]
then
# Run only explicitly specified tests, if any.
test_files=$(ls "$test_prefix" | grep "$CHPC_TEST_GREP" | xargs -I{} -n1 readlink -f "$test_prefix/{}")
elif [ "$changed_test_files" != "" ]
then
# Use test files that changed in the PR.
test_files="$changed_test_files"
else
# The default -- run all tests found in the test dir.
test_files=$(ls "$test_prefix"/*.xml)
fi
# Delete old report files.
for x in {test-times,skipped-tests,wall-clock-times,report-thresholds,client-times}.tsv
do
rm -v "$x" ||:
touch "$x"
done
# FIXME a quick crutch to bring the run time down for the unstable tests --
# if some performance tests xmls were changed in a PR, run only these ones.
if [ "$PR_TO_TEST" != "0" ]
then
# changed-test.txt prepared in entrypoint.sh from git diffs, because it
# has the cloned repo. Used to use rsync for that but it was really ugly
# and not always correct (e.g. when the reference SHA is really old and
# has some other differences to the tested SHA, besides the one introduced
# by the PR).
test_files_override=$(sed "s/tests\/performance/${test_prefix//\//\\/}/" changed-tests.txt)
if [ "$test_files_override" != "" ]
then
test_files=$test_files_override
fi
fi
# Run only explicitly specified tests, if any
if [ -v CHPC_TEST_GLOB ]
then
# I do want to expand the globs in the variable.
# shellcheck disable=SC2086
test_files=$(ls "$test_prefix"/$CHPC_TEST_GLOB.xml)
fi
if [ "$test_files" == "" ]
then
# FIXME remove some broken long tests
for test_name in {IPv4,IPv6,modulo,parse_engine_file,number_formatting_formats,select_format,arithmetic,cryptographic_hashes,logical_functions_{medium,small}}
do
printf "$test_name\tMarked as broken (see compare.sh)\n" >> skipped-tests.tsv
rm "$test_prefix/$test_name.xml" ||:
done
test_files=$(ls "$test_prefix"/*.xml)
fi
# Run the tests.
test_name="<none>"
for test in $test_files
do
# Check that both servers are alive, to fail faster if they die.
left/clickhouse client --port 9001 --query "select 1 format Null" \
clickhouse-client --port 9001 --query "select 1 format Null" \
|| { echo $test_name >> left-server-died.log ; restart ; continue ; }
right/clickhouse client --port 9002 --query "select 1 format Null" \
clickhouse-client --port 9002 --query "select 1 format Null" \
|| { echo $test_name >> right-server-died.log ; restart ; continue ; }
test_name=$(basename "$test" ".xml")
@ -154,14 +158,6 @@ function run_tests
# The test completed with zero status, so we treat stderr as warnings
mv "$test_name-err.log" "$test_name-warn.log"
grep ^query "$test_name-raw.tsv" | cut -f2- > "$test_name-queries.tsv"
grep ^client-time "$test_name-raw.tsv" | cut -f2- > "$test_name-client-time.tsv"
skipped=$(grep ^skipped "$test_name-raw.tsv" | cut -f2-)
if [ "$skipped" != "" ]
then
printf "$test_name""\t""$skipped""\n" >> skipped-tests.tsv
fi
done
unset TIMEFORMAT
@ -169,46 +165,213 @@ function run_tests
wait
}
function get_profiles_watchdog
{
sleep 6000
echo "The trace collection did not finish in time." >> profile-errors.log
for pid in $(pgrep -f clickhouse)
do
gdb -p "$pid" --batch --ex "info proc all" --ex "thread apply all bt" --ex quit &> "$pid.gdb.log" &
done
wait
for _ in {1..10}
do
if ! pkill -f clickhouse
then
break
fi
sleep 1
done
}
function get_profiles
{
# Collect the profiles
left/clickhouse client --port 9001 --query "set query_profiler_cpu_time_period_ns = 0"
left/clickhouse client --port 9001 --query "set query_profiler_real_time_period_ns = 0"
right/clickhouse client --port 9001 --query "set query_profiler_cpu_time_period_ns = 0"
right/clickhouse client --port 9001 --query "set query_profiler_real_time_period_ns = 0"
left/clickhouse client --port 9001 --query "system flush logs"
right/clickhouse client --port 9002 --query "system flush logs"
clickhouse-client --port 9001 --query "set query_profiler_cpu_time_period_ns = 0"
clickhouse-client --port 9001 --query "set query_profiler_real_time_period_ns = 0"
clickhouse-client --port 9001 --query "set query_profiler_cpu_time_period_ns = 0"
clickhouse-client --port 9001 --query "set query_profiler_real_time_period_ns = 0"
clickhouse-client --port 9001 --query "system flush logs"
clickhouse-client --port 9002 --query "system flush logs"
left/clickhouse client --port 9001 --query "select * from system.query_log where type = 2 format TSVWithNamesAndTypes" > left-query-log.tsv ||: &
left/clickhouse client --port 9001 --query "select * from system.query_thread_log format TSVWithNamesAndTypes" > left-query-thread-log.tsv ||: &
left/clickhouse client --port 9001 --query "select * from system.trace_log format TSVWithNamesAndTypes" > left-trace-log.tsv ||: &
left/clickhouse client --port 9001 --query "select arrayJoin(trace) addr, concat(splitByChar('/', addressToLine(addr))[-1], '#', demangle(addressToSymbol(addr)) ) name from system.trace_log group by addr format TSVWithNamesAndTypes" > left-addresses.tsv ||: &
left/clickhouse client --port 9001 --query "select * from system.metric_log format TSVWithNamesAndTypes" > left-metric-log.tsv ||: &
clickhouse-client --port 9001 --query "select * from system.query_log where type = 2 format TSVWithNamesAndTypes" > left-query-log.tsv ||: &
clickhouse-client --port 9001 --query "select * from system.query_thread_log format TSVWithNamesAndTypes" > left-query-thread-log.tsv ||: &
clickhouse-client --port 9001 --query "select * from system.trace_log format TSVWithNamesAndTypes" > left-trace-log.tsv ||: &
clickhouse-client --port 9001 --query "select arrayJoin(trace) addr, concat(splitByChar('/', addressToLine(addr))[-1], '#', demangle(addressToSymbol(addr)) ) name from system.trace_log group by addr format TSVWithNamesAndTypes" > left-addresses.tsv ||: &
clickhouse-client --port 9001 --query "select * from system.metric_log format TSVWithNamesAndTypes" > left-metric-log.tsv ||: &
right/clickhouse client --port 9002 --query "select * from system.query_log where type = 2 format TSVWithNamesAndTypes" > right-query-log.tsv ||: &
right/clickhouse client --port 9002 --query "select * from system.query_thread_log format TSVWithNamesAndTypes" > right-query-thread-log.tsv ||: &
right/clickhouse client --port 9002 --query "select * from system.trace_log format TSVWithNamesAndTypes" > right-trace-log.tsv ||: &
right/clickhouse client --port 9002 --query "select arrayJoin(trace) addr, concat(splitByChar('/', addressToLine(addr))[-1], '#', demangle(addressToSymbol(addr)) ) name from system.trace_log group by addr format TSVWithNamesAndTypes" > right-addresses.tsv ||: &
right/clickhouse client --port 9002 --query "select * from system.metric_log format TSVWithNamesAndTypes" > right-metric-log.tsv ||: &
clickhouse-client --port 9002 --query "select * from system.query_log where type = 2 format TSVWithNamesAndTypes" > right-query-log.tsv ||: &
clickhouse-client --port 9002 --query "select * from system.query_thread_log format TSVWithNamesAndTypes" > right-query-thread-log.tsv ||: &
clickhouse-client --port 9002 --query "select * from system.trace_log format TSVWithNamesAndTypes" > right-trace-log.tsv ||: &
clickhouse-client --port 9002 --query "select arrayJoin(trace) addr, concat(splitByChar('/', addressToLine(addr))[-1], '#', demangle(addressToSymbol(addr)) ) name from system.trace_log group by addr format TSVWithNamesAndTypes" > right-addresses.tsv ||: &
clickhouse-client --port 9002 --query "select * from system.metric_log format TSVWithNamesAndTypes" > right-metric-log.tsv ||: &
wait
# Just check that the servers are alive so that we return a proper exit code.
# We don't consistently check the return codes of the above background jobs.
clickhouse-client --port 9001 --query "select 1"
clickhouse-client --port 9002 --query "select 1"
}
# Build and analyze randomization distribution for all queries.
function analyze_queries
{
find . -maxdepth 1 -name "*-queries.tsv" -print | \
xargs -n1 -I% basename % -queries.tsv | \
parallel --verbose right/clickhouse local --file "{}-queries.tsv" \
--structure "\"query text, run int, version UInt32, time float\"" \
--query "\"$(cat "$script_dir/eqmed.sql")\"" \
">" {}-report.tsv
rm -v analyze-commands.txt analyze-errors.log all-queries.tsv unstable-queries.tsv ./*-report.tsv raw-queries.tsv client-times.tsv report-thresholds.tsv ||:
# Split the raw test output into files suitable for analysis.
IFS=$'\n'
for test_file in $(find . -maxdepth 1 -name "*-raw.tsv" -print)
do
test_name=$(basename "$test_file" "-raw.tsv")
sed -n "s/^query\t//p" < "$test_file" > "$test_name-queries.tsv"
sed -n "s/^client-time/$test_name/p" < "$test_file" >> "client-times.tsv"
sed -n "s/^report-threshold/$test_name/p" < "$test_file" >> "report-thresholds.tsv"
sed -n "s/^skipped/$test_name/p" < "$test_file" >> "skipped-tests.tsv"
done
unset IFS
# This is a lateral join in bash... please forgive me.
# We don't have arrayPermute(), so I have to make random permutations with
# `order by rand`, and it becomes really slow if I do it for more than one
# query. We also don't have lateral joins. So I just put all runs of each
# query into a separate file, and then compute randomization distribution
# for each file. I do this in parallel using GNU parallel.
IFS=$'\n'
for test_file in $(find . -maxdepth 1 -name "*-queries.tsv" -print)
do
test_name=$(basename "$test_file" "-queries.tsv")
query_index=1
for query in $(cut -d' ' -f1 "$test_file" | sort | uniq)
do
query_prefix="$test_name.q$query_index"
query_index=$((query_index + 1))
grep -F "$query " "$test_file" > "$query_prefix.tmp"
printf "%s\0\n" \
"clickhouse-local \
--file \"$query_prefix.tmp\" \
--structure 'query text, run int, version UInt32, time float' \
--query \"$(cat "$script_dir/eqmed.sql")\" \
>> \"$test_name-report.tsv\"" \
2>> analyze-errors.log \
>> analyze-commands.txt
done
done
wait
unset IFS
parallel --verbose --null < analyze-commands.txt
}
# Analyze results
function report
{
rm -r report ||:
mkdir report ||:
rm ./*.{rep,svg} test-times.tsv test-dump.tsv unstable.tsv unstable-query-ids.tsv unstable-query-metrics.tsv changed-perf.tsv unstable-tests.tsv unstable-queries.tsv bad-tests.tsv slow-on-client.tsv all-queries.tsv ||:
cat analyze-errors.log >> report/errors.log ||:
cat profile-errors.log >> report/errors.log ||:
clickhouse-local --query "
create table queries engine File(TSVWithNamesAndTypes, 'report/queries.tsv')
as select
-- FIXME Comparison mode doesn't make sense for queries that complete
-- immediately, so for now we pretend they don't exist. We don't want to
-- remove them altogether because we want to be able to detect regressions,
-- but the right way to do this is not yet clear.
(left + right) / 2 < 0.02 as short,
not short and abs(diff) > report_threshold and abs(diff) > stat_threshold as changed_fail,
not short and abs(diff) > report_threshold - 0.05 and abs(diff) > stat_threshold as changed_show,
not short and not changed_fail and stat_threshold > report_threshold + 0.10 as unstable_fail,
not short and not changed_show and stat_threshold > report_threshold - 0.05 as unstable_show,
left, right, diff, stat_threshold,
if(report_threshold > 0, report_threshold, 0.10) as report_threshold,
reports.test,
query
from
(
select *,
replaceAll(_file, '-report.tsv', '') test
from file('*-report.tsv', TSV, 'left float, right float, diff float, stat_threshold float, query text')
) reports
left join file('report-thresholds.tsv', TSV, 'test text, report_threshold float') thresholds
using test
;
-- keep the table in old format so that we can analyze new and old data together
create table queries_old_format engine File(TSVWithNamesAndTypes, 'queries.rep')
as select short, changed_fail, unstable_fail, left, right, diff, stat_threshold, test, query
from queries
;
create table changed_perf_tsv engine File(TSV, 'report/changed-perf.tsv') as
select left, right, diff, stat_threshold, changed_fail, test, query from queries where changed_show
order by abs(diff) desc;
create table unstable_queries_tsv engine File(TSV, 'report/unstable-queries.tsv') as
select left, right, diff, stat_threshold, unstable_fail, test, query from queries where unstable_show
order by stat_threshold desc;
create table queries_for_flamegraph engine File(TSVWithNamesAndTypes, 'report/queries-for-flamegraph.tsv') as
select query, test from queries where unstable_show or changed_show
;
create table unstable_tests_tsv engine File(TSV, 'report/bad-tests.tsv') as
select test, sum(unstable_fail) u, sum(changed_fail) c, u + c s from queries
group by test having s > 0 order by s desc;
create table query_time engine Memory as select *
from file('client-times.tsv', TSV, 'test text, query text, client float, server float');
create table wall_clock engine Memory as select *
from file('wall-clock-times.tsv', TSV, 'test text, real float, user float, system float');
create table slow_on_client_tsv engine File(TSV, 'report/slow-on-client.tsv') as
select client, server, floor(client/server, 3) p, query
from query_time where p > 1.02 order by p desc;
create table test_time engine Memory as
select test, sum(client) total_client_time,
maxIf(client, not short) query_max,
minIf(client, not short) query_min,
count(*) queries,
sum(short) short_queries
from query_time full join queries
on query_time.query = queries.query
group by test;
create table test_times_tsv engine File(TSV, 'report/test-times.tsv') as
select wall_clock.test, real,
floor(total_client_time, 3),
queries,
short_queries,
floor(query_max, 3),
floor(real / queries, 3) avg_real_per_query,
floor(query_min, 3)
from test_time
-- wall clock times are also measured for skipped tests, so don't
-- do full join
left join wall_clock using test
order by avg_real_per_query desc;
create table all_tests_tsv engine File(TSV, 'report/all-queries.tsv') as
select changed_fail, unstable_fail,
left, right, diff,
floor(left > right ? left / right : right / left, 3),
stat_threshold, test, query
from queries order by test, query;
" 2> >(tee -a report/errors.log 1>&2)
for x in {right,left}-{addresses,{query,query-thread,trace,metric}-log}.tsv
do
# FIXME This loop builds column definitions from TSVWithNamesAndTypes in an
@ -219,85 +382,12 @@ do
| tr '\n' ', ' | sed 's/,$//' > "$x.columns"
done
rm ./*.{rep,svg} test-times.tsv test-dump.tsv unstable.tsv unstable-query-ids.tsv unstable-query-metrics.tsv changed-perf.tsv unstable-tests.tsv unstable-queries.tsv bad-tests.tsv slow-on-client.tsv all-queries.tsv ||:
right/clickhouse local --query "
create table queries engine File(TSVWithNamesAndTypes, 'queries.rep')
as select
-- FIXME Comparison mode doesn't make sense for queries that complete
-- immediately, so for now we pretend they don't exist. We don't want to
-- remove them altogether because we want to be able to detect regressions,
-- but the right way to do this is not yet clear.
left + right < 0.05 as short,
not short and abs(diff) < 0.10 and rd[3] > 0.10 as unstable,
-- Do not consider changed the queries with 5% RD below 5% -- e.g., we're
-- likely to observe a difference > 5% in less than 5% cases.
-- Not sure it is correct, but empirically it filters out a lot of noise.
not short and abs(diff) > 0.15 and abs(diff) > rd[3] and rd[1] > 0.05 as changed,
left, right, diff, rd,
replaceAll(_file, '-report.tsv', '') test,
query
from file('*-report.tsv', TSV, 'left float, right float, diff float, rd Array(float), query text');
create table changed_perf_tsv engine File(TSV, 'changed-perf.tsv') as
select left, right, diff, rd, test, query from queries where changed
order by rd[3] desc;
create table unstable_queries_tsv engine File(TSV, 'unstable-queries.tsv') as
select left, right, diff, rd, test, query from queries where unstable
order by rd[3] desc;
create table unstable_tests_tsv engine File(TSV, 'bad-tests.tsv') as
select test, sum(unstable) u, sum(changed) c, u + c s from queries
group by test having s > 0 order by s desc;
create table query_time engine Memory as select *, replaceAll(_file, '-client-time.tsv', '') test
from file('*-client-time.tsv', TSV, 'query text, client float, server float');
create table wall_clock engine Memory as select *
from file('wall-clock-times.tsv', TSV, 'test text, real float, user float, system float');
create table slow_on_client_tsv engine File(TSV, 'slow-on-client.tsv') as
select client, server, floor(client/server, 3) p, query
from query_time where p > 1.02 order by p desc;
create table test_time engine Memory as
select test, sum(client) total_client_time,
maxIf(client, not short) query_max,
minIf(client, not short) query_min,
count(*) queries,
sum(short) short_queries
from query_time, queries
where query_time.query = queries.query
group by test;
create table test_times_tsv engine File(TSV, 'test-times.tsv') as
select wall_clock.test, real,
floor(total_client_time, 3),
queries,
short_queries,
floor(query_max, 3),
floor(real / queries, 3) avg_real_per_query,
floor(query_min, 3)
from test_time join wall_clock using test
order by avg_real_per_query desc;
create table all_tests_tsv engine File(TSV, 'all-queries.tsv') as
select left, right, diff,
floor(left > right ? left / right : right / left, 3),
rd, test, query
from queries order by test, query;
" 2> >(head -2 >> report-errors.rep) ||:
for version in {right,left}
do
right/clickhouse local --query "
create view queries as
select * from file('queries.rep', TSVWithNamesAndTypes,
'short int, unstable int, changed int, left float, right float,
diff float, rd Array(float), test text, query text');
clickhouse-local --query "
create view queries_for_flamegraph as
select * from file('report/queries-for-flamegraph.tsv', TSVWithNamesAndTypes,
'query text, test text');
create view query_log as select *
from file('$version-query-log.tsv', TSVWithNamesAndTypes,
@ -311,14 +401,14 @@ create view addresses_src as select *
from file('$version-addresses.tsv', TSVWithNamesAndTypes,
'$(cat "$version-addresses.tsv.columns")');
create table addresses_join engine Join(any, left, address) as
create table addresses_join_$version engine Join(any, left, address) as
select addr address, name from addresses_src;
create table unstable_query_runs engine File(TSVWithNamesAndTypes,
'unstable-query-runs.$version.rep') as
select query_id, query from query_log
join queries using query
where query_id not like 'prewarm %' and (unstable or changed)
select query, query_id from query_log
where query in (select query from queries_for_flamegraph)
and query_id not like 'prewarm %'
;
create table unstable_query_log engine File(Vertical,
@ -350,7 +440,7 @@ create table unstable_run_traces engine File(TSVWithNamesAndTypes,
'unstable-run-traces.$version.rep') as
select
count() value,
joinGet(addresses_join, 'name', arrayJoin(trace)) metric,
joinGet(addresses_join_$version, 'name', arrayJoin(trace)) metric,
unstable_query_runs.query_id,
any(unstable_query_runs.query) query
from unstable_query_runs
@ -361,22 +451,22 @@ create table unstable_run_traces engine File(TSVWithNamesAndTypes,
create table metric_devation engine File(TSVWithNamesAndTypes,
'metric-deviation.$version.rep') as
select floor((q[3] - q[1])/q[2], 3) d,
quantilesExact(0, 0.5, 1)(value) q, metric, query
select query, floor((q[3] - q[1])/q[2], 3) d,
quantilesExact(0, 0.5, 1)(value) q, metric
from (select * from unstable_run_metrics
union all select * from unstable_run_traces
union all select * from unstable_run_metrics_2) mm
join queries using query
join queries_for_flamegraph using query
group by query, metric
having d > 0.5
order by any(rd[3]) desc, query desc, d desc
order by query desc, d desc
;
create table stacks engine File(TSV, 'stacks.$version.rep') as
select
query,
arrayStringConcat(
arrayMap(x -> joinGet(addresses_join, 'name', x),
arrayMap(x -> joinGet(addresses_join_$version, 'name', x),
arrayReverse(trace)
),
';'
@ -386,7 +476,7 @@ create table stacks engine File(TSV, 'stacks.$version.rep') as
join unstable_query_runs using query_id
group by query, trace
;
" 2> >(head -2 >> report-errors.rep) ||: # do not run in parallel because they use the same data dir for StorageJoins which leads to weird errors.
" 2> >(tee -a report/errors.log 1>&2) # do not run in parallel because they use the same data dir for StorageJoins which leads to weird errors.
done
wait
@ -396,11 +486,17 @@ do
for query in $(cut -d' ' -f1 "stacks.$version.rep" | sort | uniq)
do
query_file=$(echo "$query" | cut -c-120 | sed 's/[/]/_/g')
# Build separate .svg flamegraph for each query.
grep -F "$query " "stacks.$version.rep" \
| cut -d' ' -f 2- \
| sed 's/\t/ /g' \
| tee "$query_file.stacks.$version.rep" \
| ~/fg/flamegraph.pl > "$query_file.$version.svg" &
# Copy metric stats into separate files as well.
grep -F "$query " "metric-deviation.$version.rep" \
| cut -f2- > "$query_file.$version.metrics.rep" &
done
done
wait
@ -408,9 +504,13 @@ unset IFS
# Remember that grep sets error code when nothing is found, hence the bayan
# operator.
grep -H -m2 -i '\(Exception\|Error\):[^:]' ./*-err.log | sed 's/:/\t/' > run-errors.tsv ||:
grep -H -m2 -i '\(Exception\|Error\):[^:]' ./*-err.log | sed 's/:/\t/' >> run-errors.tsv ||:
}
# Check that local and client are in PATH
clickhouse-local --version > /dev/null
clickhouse-client --version > /dev/null
case "$stage" in
"")
;&
@ -425,10 +525,28 @@ case "$stage" in
time run_tests ||:
;&
"get_profiles")
# Getting profiles inexplicably hangs sometimes, so try to save some logs if
# this happens again. Give the servers some time to collect all info, then
# trace and kill. Start in a subshell, so that both functions don't interfere
# with each other's jobs through `wait`. Also make the subshell have its own
# process group, so that we can then kill it with all its child processes.
# Somehow it doesn't kill the children by itself when dying.
set -m
( get_profiles_watchdog ) &
watchdog_pid=$!
set +m
# Check that the watchdog started OK.
kill -0 $watchdog_pid
# If the tests fail with OOM or something, still try to restart the servers
# to collect the logs. Prefer not to restart, because addresses might change
# and we won't be able to process trace_log data.
time get_profiles || restart || get_profiles ||:
# and we won't be able to process trace_log data. Start in a subshell, so that
# it doesn't interfere with the watchdog through `wait`.
( time get_profiles || restart || get_profiles ||: )
# Kill the whole process group, because somehow when the subshell is killed,
# the sleep inside remains alive and orphaned.
while env kill -- -$watchdog_pid ; do sleep 1; done
# Stop the servers to free memory for the subsequent query analysis.
while killall clickhouse; do echo . ; sleep 1 ; done
@ -440,7 +558,11 @@ case "$stage" in
"report")
time report ||:
time "$script_dir/report.py" --report=all-queries > all-queries.html 2> >(head -2 >> report-errors.rep) ||:
time "$script_dir/report.py" --report=all-queries > all-queries.html 2> >(tee -a report/errors.log 1>&2) ||:
time "$script_dir/report.py" > report.html
;&
esac
# Print some final debug info to help debug Weirdness, of which there is plenty.
jobs
pstree -apgT

View File

@ -1,4 +1,9 @@
<yandex>
<http_port remove="remove"/>
<mysql_port remove="remove"/>
<interserver_http_port remove="remove"/>
<listen_host>::</listen_host>
<logger>
<console>true</console>
</logger>

View File

@ -2,9 +2,11 @@
set -ex
set -o pipefail
trap "exit" INT TERM
trap "kill $(jobs -pr) ||:" EXIT
trap 'kill $(jobs -pr) ||:' EXIT
mkdir db0 ||:
mkdir left ||:
mkdir right ||:
left_pr=$1
left_sha=$2
@ -22,24 +24,24 @@ dataset_paths["values"]="https://clickhouse-datasets.s3.yandex.net/values_with_e
function download
{
rm -r left ||:
mkdir left ||:
rm -r right ||:
mkdir right ||:
# might have the same version on left and right
if ! [ "$left_sha" = "$right_sha" ]
then
wget -nv -nd -c "https://clickhouse-builds.s3.yandex.net/$left_pr/$left_sha/performance/performance.tgz" -O- | tar -C left --strip-components=1 -zxv &
wget -nv -nd -c "https://clickhouse-builds.s3.yandex.net/$right_pr/$right_sha/performance/performance.tgz" -O- | tar -C right --strip-components=1 -zxv &
else
wget -nv -nd -c "https://clickhouse-builds.s3.yandex.net/$left_pr/$left_sha/performance/performance.tgz" -O- | tar -C left --strip-components=1 -zxv && cp -a left right &
mkdir right ||:
wget -nv -nd -c "https://clickhouse-builds.s3.yandex.net/$left_pr/$left_sha/performance/performance.tgz" -O- | tar -C left --strip-components=1 -zxv && cp -a left/* right &
fi
for dataset_name in $datasets
do
dataset_path="${dataset_paths[$dataset_name]}"
[ "$dataset_path" != "" ]
if [ "$dataset_path" = "" ]
then
>&2 echo "Unknown dataset '$dataset_name'"
exit 1
fi
cd db0 && wget -nv -nd -c "$dataset_path" -O- | tar -xv &
done

View File

@ -87,9 +87,6 @@ git -C ch diff --name-only "$SHA_TO_TEST" "$(git -C ch merge-base "$SHA_TO_TEST"
# Set python output encoding so that we can print queries with Russian letters.
export PYTHONIOENCODING=utf-8
# Use a default number of runs if not told otherwise
export CHPC_RUNS=${CHPC_RUNS:-7}
# By default, use the main comparison script from the tested package, so that we
# can change it in PRs.
script_path="right/scripts"
@ -101,14 +98,15 @@ fi
# Even if we have some errors, try our best to save the logs.
set +e
# Older version use 'kill 0', so put the script into a separate process group
# FIXME remove set +m in April 2020
set +m
# Use clickhouse-client and clickhouse-local from the right server.
PATH="$(readlink -f right/)":"$PATH"
export PATH
# Start the main comparison script.
{ \
time ../download.sh "$REF_PR" "$REF_SHA" "$PR_TO_TEST" "$SHA_TO_TEST" && \
time stage=configure "$script_path"/compare.sh ; \
} 2>&1 | ts "$(printf '%%Y-%%m-%%d %%H:%%M:%%S\t')" | tee compare.log
set -m
# Stop the servers to free memory. Normally they are restarted before getting
# the profile info, so they shouldn't use much, but if the comparison script
@ -121,5 +119,5 @@ done
dmesg -T > dmesg.log
7z a /output/output.7z ./*.{log,tsv,html,txt,rep,svg} {right,left}/{performance,db/preprocessed_configs}
7z a /output/output.7z ./*.{log,tsv,html,txt,rep,svg} {right,left}/{performance,db/preprocessed_configs,scripts} ./report
cp compare.log /output

View File

@ -1,41 +1,58 @@
-- input is table(query text, run UInt32, version int, time float)
select
-- abs(diff_percent) > rd_quantiles_percent[3] fail,
floor(original_medians_array.time_by_version[1], 4) l,
floor(original_medians_array.time_by_version[2], 4) r,
floor((r - l) / l, 3) diff_percent,
arrayMap(x -> floor(x / l, 3), rd.rd_quantiles) rd_quantiles_percent,
floor(threshold / l, 3) threshold_percent,
query
from
(
select query, quantiles(0.05, 0.5, 0.95, 0.99)(abs(time_by_label[1] - time_by_label[2])) rd_quantiles -- quantiles of randomization distribution
-- quantiles of randomization distributions
select quantileExact(0.999)(abs(time_by_label[1] - time_by_label[2]) as d) threshold
---- uncomment to see what the distribution is really like
--, uniqExact(d) u
--, arraySort(x->x.1,
-- arrayZip(
-- (sumMap([d], [1]) as f).1,
-- f.2)) full_histogram
from
(
select query, virtual_run, groupArrayInsertAt(median_time, random_label) time_by_label -- make array 'random label' -> 'median time'
select virtual_run, groupArrayInsertAt(median_time, random_label) time_by_label -- make array 'random label' -> 'median time'
from (
select query, medianExact(time) median_time, virtual_run, random_label -- get median times, grouping by random label
select medianExact(time) median_time, virtual_run, random_label -- get median times, grouping by random label
from (
select *, toUInt32(rowNumberInBlock() % 2) random_label -- randomly relabel measurements
select *, toUInt32(rowNumberInAllBlocks() % 2) random_label -- randomly relabel measurements
from (
select query, time, number virtual_run
from table, numbers(1, 10000) nn -- duplicate input measurements into many virtual runs
order by query, virtual_run, rand() -- for each virtual run, randomly reorder measurements
select time, number virtual_run
from
-- strip the query away before the join -- it might be several kB long;
(select time, run, version from table) no_query,
-- duplicate input measurements into many virtual runs
numbers(1, 100000) nn
-- for each virtual run, randomly reorder measurements
order by virtual_run, rand()
) virtual_runs
) relabeled
group by query, virtual_run, random_label
group by virtual_run, random_label
) virtual_medians
group by query, virtual_run -- aggregate by random_label
group by virtual_run -- aggregate by random_label
) virtual_medians_array
group by query -- aggregate by virtual_run
-- this select aggregates by virtual_run
) rd,
(
select groupArrayInsertAt(median_time, version) time_by_version, query
select groupArrayInsertAt(median_time, version) time_by_version
from
(
select medianExact(time) median_time, query, version
from table group by query, version
) original_medians
group by query
) original_medians_array
where rd.query = original_medians_array.query
order by rd_quantiles_percent[3] desc;
(
select medianExact(time) median_time, version
from table
group by version
) original_medians
) original_medians_array,
(
select any(query) query from table
) any_query,
(
select throwIf(uniq(query) != 1) from table
) check_single_query -- this subselect checks that there is only one query in the input table;
-- written this way so that it is not optimized away (#10523)
;

View File

@ -0,0 +1,54 @@
#!/bin/bash
set -ex
set -o pipefail
trap "exit" INT TERM
trap 'kill $(jobs -pr) ||:' EXIT
stage=${stage:-}
script_dir="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"
repo_dir=${repo_dir:-$(readlink -f "$script_dir/../../..")}
function download
{
rm -r left right db0 ||:
mkdir left right db0 ||:
"$script_dir/download.sh" ||: &
cp -vP "$repo_dir"/../build-gcc9-rel/programs/clickhouse* right &
cp -vP "$repo_dir"/../build-clang10-rel/programs/clickhouse* left &
wait
}
function configure
{
# Test files
cp -av "$repo_dir/tests/performance" right
cp -av "$repo_dir/tests/performance" left
# Configs
cp -av "$script_dir/config" right
cp -av "$script_dir/config" left
cp -av "$repo_dir"/programs/server/config* right/config
cp -av "$repo_dir"/programs/server/user* right/config
cp -av "$repo_dir"/programs/server/config* left/config
cp -av "$repo_dir"/programs/server/user* left/config
tree left
}
function run
{
left/clickhouse-local --query "select * from system.build_options format PrettySpace" | sed 's/ *$//' | fold -w 80 -s > left-commit.txt
right/clickhouse-local --query "select * from system.build_options format PrettySpace" | sed 's/ *$//' | fold -w 80 -s > right-commit.txt
PATH=right:"$PATH" stage=configure "$script_dir/compare.sh" &> >(tee compare.log)
}
download
configure
run
rm output.7z
7z a output.7z ./*.{log,tsv,html,txt,rep,svg} {right,left}/{performance,db/preprocessed_configs}

View File

@ -25,7 +25,7 @@ parser = argparse.ArgumentParser(description='Run performance test.')
parser.add_argument('file', metavar='FILE', type=argparse.FileType('r', encoding='utf-8'), nargs=1, help='test description file')
parser.add_argument('--host', nargs='*', default=['localhost'], help="Server hostname(s). Corresponds to '--port' options.")
parser.add_argument('--port', nargs='*', default=[9000], help="Server port(s). Corresponds to '--host' options.")
parser.add_argument('--runs', type=int, default=int(os.environ.get('CHPC_RUNS', 7)), help='Number of query runs per server. Defaults to CHPC_RUNS environment variable.')
parser.add_argument('--runs', type=int, default=int(os.environ.get('CHPC_RUNS', 13)), help='Number of query runs per server. Defaults to CHPC_RUNS environment variable.')
parser.add_argument('--no-long', type=bool, default=True, help='Skip the tests tagged as long.')
args = parser.parse_args()
@ -48,6 +48,10 @@ infinite_sign = root.find('.//average_speed_not_changing_for_ms')
if infinite_sign is not None:
raise Exception('Looks like the test is infinite (sign 1)')
# Print report threshold for the test if it is set.
if 'max_ignored_relative_change' in root.attrib:
print(f'report-threshold\t{root.attrib["max_ignored_relative_change"]}')
# Open connections
servers = [{'host': host, 'port': port} for (host, port) in zip(args.host, args.port)]
connections = [clickhouse_driver.Client(**server) for server in servers]
@ -137,12 +141,27 @@ test_queries = substitute_parameters(test_query_templates)
report_stage_end('substitute2')
for q in test_queries:
for i, q in enumerate(test_queries):
# We have some crazy long queries (about 100kB), so trim them to a sane
# length.
query_display_name = q
if len(query_display_name) > 1000:
query_display_name = f'{query_display_name[:1000]}...({i})'
# Prewarm: run once on both servers. Helps to bring the data into memory,
# precompile the queries, etc.
for conn_index, c in enumerate(connections):
res = c.execute(q, query_id = 'prewarm {} {}'.format(0, q))
print('prewarm\t' + tsv_escape(q) + '\t' + str(conn_index) + '\t' + str(c.last_query.elapsed))
try:
for conn_index, c in enumerate(connections):
res = c.execute(q, query_id = f'prewarm {0} {query_display_name}')
print(f'prewarm\t{tsv_escape(query_display_name)}\t{conn_index}\t{c.last_query.elapsed}')
except:
# If prewarm fails for some query -- skip it, and try to test the others.
# This might happen if the new test introduces some function that the
# old server doesn't support. Still, report it as an error.
# FIXME the driver reconnects on error and we lose settings, so this might
# lead to further errors or unexpected behavior.
print(traceback.format_exc(), file=sys.stderr)
continue
# Now, perform measured runs.
# Track the time spent by the client to process this query, so that we can notice
@ -153,11 +172,11 @@ for q in test_queries:
for run in range(0, args.runs):
for conn_index, c in enumerate(connections):
res = c.execute(q)
print('query\t' + tsv_escape(q) + '\t' + str(run) + '\t' + str(conn_index) + '\t' + str(c.last_query.elapsed))
print(f'query\t{tsv_escape(query_display_name)}\t{run}\t{conn_index}\t{c.last_query.elapsed}')
server_seconds += c.last_query.elapsed
client_seconds = time.perf_counter() - start_seconds
print('client-time\t{}\t{}\t{}'.format(tsv_escape(q), client_seconds, server_seconds))
print(f'client-time\t{tsv_escape(query_display_name)}\t{client_seconds}\t{server_seconds}')
report_stage_end('benchmark')

View File

@ -38,8 +38,8 @@ Then see the `report.html` in the `output` directory.
There are some environment variables that influence what the test does:
* `-e CHPC_RUNS` -- the number of runs;
* `-e CHPC_TEST_GLOB` -- the names of the tests (xml files) to run, interpreted
as a shell glob.
* `-e CHPC_TEST_GREP` -- the names of the tests (xml files) to run, interpreted
as a grep pattern.
#### Re-generate report with your tweaks
From the workspace directory (extracted test output archive):
@ -53,5 +53,49 @@ More stages are available, e.g. restart servers or run the tests. See the code.
docker/test/performance-comparison/perf.py --host=localhost --port=9000 --runs=1 tests/performance/logical_functions_small.xml
```
#### Run all tests on some custom configuration
Start two servers manually on ports `9001` (old) and `9002` (new). Change to a
new directory to be used as workspace for tests, and try something like this:
```
$ PATH=$PATH:~/ch4/build-gcc9-rel/programs \
CHPC_TEST_PATH=~/ch3/ch/tests/performance \
CHPC_TEST_GREP=visit_param \
stage=run_tests \
~/ch3/ch/docker/test/performance-comparison/compare.sh
```
* `PATH` must contain `clickhouse-local` and `clickhouse-client`.
* `CHPC_TEST_PATH` -- path to performance test cases, e.g. `tests/performance`.
* `CHPC_TEST_GREP` -- a filter for which tests to run, as a grep pattern.
* `stage` -- from which execution stage to start. To run the tests, use
`run_tests` stage.
The tests will run, and the `report.html` will be generated in the current
directory.
More complex setup is possible, but inconvenient and requires some scripting.
See `manual-run.sh` for inspiration.
#### Statistical considerations
Generating randomization distribution for medians is tricky. Suppose we have N
runs for each version, and then use the combined 2N run results to make a
virtual experiment. In this experiment, we only have N possible values for
median of each version. This becomes very clear if you sort those 2N runs and
imagine where a window of N runs can be -- the N/2 smallest and N/2 largest
values can never be medians. From these N possible values of
medians, you can obtain (N/2)^2 possible values of absolute median difference.
These numbers are +-1; I'm making an off-by-one error somewhere. So if your
number of runs is small, e.g. 7, you'll only get 16 possible differences:
even if you make 100k virtual experiments, the randomization distribution will
have only 16 steps, and you'll get weird effects. So you also have to have
enough runs. You can observe this on real data if you add more output to the
query that calculates randomization distribution, e.g., add a number of unique
median values. Looking even more closely, you can see that the exact
values of medians don't matter, and the randomization distribution for
difference of medians devolves into some kind of rank test. We could probably
skip all these virtual experiments and calculate the resulting distribution
analytically, but I don't know enough math to do it. It would be something
close to Wilcoxon test distribution.
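To make the counting argument above concrete, here is a small, self-contained C++ sketch (not part of the comparison scripts; the measurement values are made up) that pools 2N runs, repeatedly relabels them into two virtual versions, and counts how many distinct absolute median differences the virtual experiments can actually produce:
```
#include <algorithm>
#include <cmath>
#include <iostream>
#include <random>
#include <set>
#include <vector>

int main()
{
    const int n = 7;    // runs per server; the text above uses 7 as an example
    std::mt19937 rng(42);
    std::uniform_real_distribution<double> dist(0.9, 1.1);

    // 2N pooled measurements (made-up values standing in for query times).
    std::vector<double> pooled(2 * n);
    for (double & x : pooled)
        x = dist(rng);

    // Median of an odd-sized sample: the middle order statistic.
    auto median = [](std::vector<double> v)
    {
        std::nth_element(v.begin(), v.begin() + v.size() / 2, v.end());
        return v[v.size() / 2];
    };

    std::set<double> distinct_diffs;
    for (int experiment = 0; experiment < 100000; ++experiment)
    {
        // Randomly relabel the pooled runs into two virtual "versions".
        std::shuffle(pooled.begin(), pooled.end(), rng);
        std::vector<double> a(pooled.begin(), pooled.begin() + n);
        std::vector<double> b(pooled.begin() + n, pooled.end());
        distinct_diffs.insert(std::fabs(median(a) - median(b)));
    }

    // The medians are drawn from the fixed pool of 2N values, so this prints
    // a small count: the randomization distribution has only that many steps,
    // no matter how many virtual experiments we run.
    std::cout << "distinct |median difference| values: " << distinct_diffs.size() << "\n";
}
```
Whatever the exact count, it is tiny compared to the number of virtual experiments, which is the stepping effect described above.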
### References
1\. Box, Hunter, Hunter, "Statistics for Experimenters", p. 78: "A Randomized Design Used in the Comparison of Standard and Modified Fertilizer Mixtures for Tomato Plants."

View File

@ -22,6 +22,9 @@ slower_queries = 0
unstable_queries = 0
very_unstable_queries = 0
# max seconds to run one query by itself, not counting preparation
allowed_single_run_time = 2
header_template = """
<!DOCTYPE html>
<html>
@ -95,7 +98,8 @@ def tableRow(cell_values, cell_attributes = []):
return tr(''.join([td(v, a)
for v, a in itertools.zip_longest(
cell_values, cell_attributes,
fillvalue = '')]))
fillvalue = '')
if a is not None]))
def tableHeader(r):
return tr(''.join([th(f) for f in r]))
@ -139,16 +143,45 @@ def printSimpleTable(caption, columns, rows):
print(tableRow(row))
print(tableEnd())
def print_tested_commits():
global report_errors
try:
printSimpleTable('Tested commits', ['Old', 'New'],
[['<pre>{}</pre>'.format(x) for x in
[open('left-commit.txt').read(),
open('right-commit.txt').read()]]])
except:
# Don't fail if no commit info -- maybe it's a manual run.
report_errors.append(
traceback.format_exception_only(
*sys.exc_info()[:2])[-1])
pass
def print_report_errors():
global report_errors
# Add the errors reported by various steps of comparison script
try:
report_errors += [l.strip() for l in open('report/errors.log')]
except:
report_errors.append(
traceback.format_exception_only(
*sys.exc_info()[:2])[-1])
pass
if len(report_errors):
print(tableStart('Errors while building the report'))
print(tableHeader(['Error']))
for x in report_errors:
print(tableRow([x]))
print(tableEnd())
if args.report == 'main':
print(header_template.format())
printSimpleTable('Tested commits', ['Old', 'New'],
[['<pre>{}</pre>'.format(x) for x in
[open('left-commit.txt').read(),
open('right-commit.txt').read()]]])
print_tested_commits()
def print_changes():
rows = tsvRows('changed-perf.tsv')
rows = tsvRows('report/changed-perf.tsv')
if not rows:
return
@ -156,25 +189,29 @@ if args.report == 'main':
print(tableStart('Changes in performance'))
columns = [
'Old, s', # 0
'New, s', # 1
'Relative difference (new&nbsp;-&nbsp;old)/old', # 2
'Randomization distribution quantiles \
[5%,&nbsp;50%,&nbsp;95%,&nbsp;99%]', # 3
'Test', # 4
'Query', # 5
'Old, s', # 0
'New, s', # 1
'Relative difference (new&nbsp;-&nbsp;old)/old', # 2
'p&nbsp;<&nbsp;0.001 threshold', # 3
# Failed # 4
'Test', # 5
'Query', # 6
]
print(tableHeader(columns))
attrs = ['' for c in columns]
attrs[4] = None
for row in rows:
if float(row[2]) < 0.:
faster_queries += 1
attrs[2] = 'style="background: #adbdff"'
if int(row[4]):
if float(row[2]) < 0.:
faster_queries += 1
attrs[2] = 'style="background: #adbdff"'
else:
slower_queries += 1
attrs[2] = 'style="background: #ffb0a0"'
else:
slower_queries += 1
attrs[2] = 'style="background: #ffb0a0"'
attrs[2] = ''
print(tableRow(row, attrs))
@ -182,7 +219,7 @@ if args.report == 'main':
print_changes()
slow_on_client_rows = tsvRows('slow-on-client.tsv')
slow_on_client_rows = tsvRows('report/slow-on-client.tsv')
error_tests += len(slow_on_client_rows)
printSimpleTable('Slow on client',
['Client time, s', 'Server time, s', 'Ratio', 'Query'],
@ -192,7 +229,7 @@ if args.report == 'main':
global unstable_queries
global very_unstable_queries
unstable_rows = tsvRows('unstable-queries.tsv')
unstable_rows = tsvRows('report/unstable-queries.tsv')
if not unstable_rows:
return
@ -202,19 +239,19 @@ if args.report == 'main':
'Old, s', #0
'New, s', #1
'Relative difference (new&nbsp;-&nbsp;old)/old', #2
'Randomization distribution quantiles [5%,&nbsp;50%,&nbsp;95%,&nbsp;99%]', #3
'Test', #4
'Query' #5
'p&nbsp;<&nbsp;0.001 threshold', #3
# Failed #4
'Test', #5
'Query' #6
]
print(tableStart('Unstable queries'))
print(tableHeader(columns))
attrs = ['' for c in columns]
attrs[4] = None
for r in unstable_rows:
rd = ast.literal_eval(r[3])
# Note the zero-based array index, this is rd[3] in SQL.
if rd[2] > 0.2:
if int(r[4]):
very_unstable_queries += 1
attrs[3] = 'style="background: #ffb0a0"'
else:
@ -235,11 +272,11 @@ if args.report == 'main':
printSimpleTable('Tests with most unstable queries',
['Test', 'Unstable', 'Changed perf', 'Total not OK'],
tsvRows('bad-tests.tsv'))
tsvRows('report/bad-tests.tsv'))
def print_test_times():
global slow_average_tests
rows = tsvRows('test-times.tsv')
rows = tsvRows('report/test-times.tsv')
if not rows:
return
@ -257,16 +294,18 @@ if args.report == 'main':
print(tableStart('Test times'))
print(tableHeader(columns))
nominal_runs = 13 # FIXME pass this as an argument
total_runs = (nominal_runs + 1) * 2 # one prewarm run, two servers
attrs = ['' for c in columns]
for r in rows:
if float(r[6]) > 22:
if float(r[6]) > 1.5 * total_runs:
# FIXME should be 15s max -- investigate parallel_insert
slow_average_tests += 1
attrs[6] = 'style="background: #ffb0a0"'
else:
attrs[6] = ''
if float(r[5]) > 30:
if float(r[5]) > allowed_single_run_time * total_runs:
slow_average_tests += 1
attrs[5] = 'style="background: #ffb0a0"'
else:
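
The hard-coded limits in the test-times table (22 s and 30 s) give way to limits derived from the number of runs. A rough back-of-the-envelope check using the values visible in the diff; `allowed_single_run_time` is defined elsewhere in report.py, so the value below is a placeholder only:

```python
nominal_runs = 13                    # FIXME in the diff: should be passed as an argument
total_runs = (nominal_runs + 1) * 2  # one prewarm run, two servers -> 28 runs

column6_limit = 1.5 * total_runs     # 42 s, replaces the old hard-coded 22 s

allowed_single_run_time = 1.0        # placeholder, NOT taken from the source
column5_limit = allowed_single_run_time * total_runs  # replaces the old 30 s

print(total_runs, column6_limit, column5_limit)
```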
@ -278,15 +317,7 @@ if args.report == 'main':
print_test_times()
# Add the errors reported by various steps of comparison script
report_errors += [l.strip() for l in open('report-errors.rep')]
if len(report_errors):
print(tableStart('Errors while building the report'))
print(tableHeader(['Error']))
for x in report_errors:
print(tableRow([x]))
print(tableEnd())
print_report_errors()
print("""
<p class="links">
@ -340,37 +371,50 @@ elif args.report == 'all-queries':
print(header_template.format())
printSimpleTable('Tested commits', ['Old', 'New'],
[['<pre>{}</pre>'.format(x) for x in
[open('left-commit.txt').read(),
open('right-commit.txt').read()]]])
print_tested_commits()
def print_all_queries():
rows = tsvRows('all-queries.tsv')
rows = tsvRows('report/all-queries.tsv')
if not rows:
return
columns = [
'Old, s', #0
'New, s', #1
'Relative difference (new&nbsp;-&nbsp;old)/old', #2
'Times speedup/slowdown', #3
'Randomization distribution quantiles \
[5%,&nbsp;50%,&nbsp;95%,&nbsp;99%]', #4
'Test', #5
'Query', #6
# Changed #0
# Unstable #1
'Old, s', #2
'New, s', #3
'Relative difference (new&nbsp;-&nbsp;old)/old', #4
'Times speedup/slowdown', #5
'p&nbsp;<&nbsp;0.001 threshold', #6
'Test', #7
'Query', #8
]
print(tableStart('All query times'))
print(tableHeader(columns))
attrs = ['' for c in columns]
attrs[0] = None
attrs[1] = None
for r in rows:
if float(r[2]) > 0.05:
attrs[3] = 'style="background: #ffb0a0"'
elif float(r[2]) < -0.05:
attrs[3] = 'style="background: #adbdff"'
if int(r[1]):
attrs[6] = 'style="background: #ffb0a0"'
else:
attrs[6] = ''
if int(r[0]):
if float(r[4]) > 0.:
attrs[4] = 'style="background: #ffb0a0"'
else:
attrs[4] = 'style="background: #adbdff"'
else:
attrs[4] = ''
if (float(r[2]) + float(r[3])) / 2 > allowed_single_run_time:
attrs[2] = 'style="background: #ffb0a0"'
attrs[3] = 'style="background: #ffb0a0"'
else:
attrs[2] = ''
attrs[3] = ''
print(tableRow(r, attrs))
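
In the all-queries report the table gains two leading flag columns ("Changed" and "Unstable") that are consumed for colouring but not rendered, and a new wall-clock check highlights queries whose average run time exceeds the per-run budget. A condensed sketch of the new per-row attributes, assuming the column order listed above and that `allowed_single_run_time` is defined earlier in the script:

```python
def color_all_queries_row(r, allowed_single_run_time):
    attrs = [''] * 9
    attrs[0] = attrs[1] = None           # raw flag columns are hidden from the table
    if int(r[1]):                        # unstable query -> highlight the threshold cell
        attrs[6] = 'style="background: #ffb0a0"'
    if int(r[0]):                        # performance changed -> colour by direction
        attrs[4] = ('style="background: #ffb0a0"' if float(r[4]) > 0.
                    else 'style="background: #adbdff"')
    if (float(r[2]) + float(r[3])) / 2 > allowed_single_run_time:
        attrs[2] = attrs[3] = 'style="background: #ffb0a0"'   # slow on both servers
    return attrs
```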
@ -379,6 +423,8 @@ elif args.report == 'all-queries':
print_all_queries()
print_report_errors()
print("""
<p class="links">
<a href="output.7z">Test output</a>

View File

@ -20,10 +20,10 @@ RUN apt-get --allow-unauthenticated update -y \
# apt-get --allow-unauthenticated install --yes --no-install-recommends \
# pvs-studio
ENV PKG_VERSION="pvs-studio-7.04.34029.84-amd64.deb"
ENV PKG_VERSION="pvs-studio-7.07.38234.46-amd64.deb"
RUN wget -q http://files.viva64.com/beta/$PKG_VERSION
RUN sudo dpkg -i $PKG_VERSION
RUN wget "http://files.viva64.com/$PKG_VERSION"
RUN sudo dpkg -i "$PKG_VERSION"
CMD cd /repo_folder && pvs-studio-analyzer credentials $LICENCE_NAME $LICENCE_KEY -o ./licence.lic \
&& cmake . && ninja re2_st && \

View File

@ -14,6 +14,11 @@ kill_clickhouse () {
sleep 10
fi
done
echo "Will try to send second kill signal for sure"
kill `pgrep -u clickhouse` 2>/dev/null
sleep 5
echo "clickhouse pids" `ps aux | grep clickhouse` | ts '%Y-%m-%d %H:%M:%S'
}
start_clickhouse () {
@ -50,6 +55,7 @@ ln -s /usr/share/clickhouse-test/config/zookeeper.xml /etc/clickhouse-server/con
ln -s /usr/share/clickhouse-test/config/disks.xml /etc/clickhouse-server/config.d/; \
ln -s /usr/share/clickhouse-test/config/secure_ports.xml /etc/clickhouse-server/config.d/; \
ln -s /usr/share/clickhouse-test/config/clusters.xml /etc/clickhouse-server/config.d/; \
ln -s /usr/share/clickhouse-test/config/graphite.xml /etc/clickhouse-server/config.d/; \
ln -s /usr/share/clickhouse-test/config/server.key /etc/clickhouse-server/; \
ln -s /usr/share/clickhouse-test/config/server.crt /etc/clickhouse-server/; \
ln -s /usr/share/clickhouse-test/config/dhparam.pem /etc/clickhouse-server/; \

View File

@ -36,6 +36,7 @@ CMD dpkg -i package_folder/clickhouse-common-static_*.deb; \
ln -s /usr/lib/llvm-9/bin/llvm-symbolizer /usr/bin/llvm-symbolizer; \
echo "TSAN_OPTIONS='halt_on_error=1 history_size=7 ignore_noninstrumented_modules=1 verbosity=1'" >> /etc/environment; \
echo "UBSAN_OPTIONS='print_stacktrace=1'" >> /etc/environment; \
echo "ASAN_OPTIONS='malloc_context_size=10 verbosity=1 allocator_release_to_os_interval_ms=10000'" >> /etc/environment; \
service clickhouse-server start && sleep 5 \
&& /s3downloader --dataset-names $DATASETS \
&& chmod 777 -R /var/lib/clickhouse \

View File

@ -193,10 +193,10 @@ When writing docs, you can use prepared templates. Copy the code of a template a
Templates:
- [Function](dscr-templates/template-function.md)
- [Setting](dscr-templates/template-setting.md)
- [Table engine](dscr-templates/template-table-engine.md)
- [System table](dscr-templates/template-system-table.md)
- [Function](_description_templates/template-function.md)
- [Setting](_description_templates/template-setting.md)
- [Table engine](_description_templates/template-table-engine.md)
- [System table](_description_templates/template-system-table.md)
<a name="how-to-build-docs"/>

View File

@ -0,0 +1 @@
../../../CHANGELOG.md

View File

@ -0,0 +1,11 @@
sudo apt-get install apt-transport-https ca-certificates dirmngr
sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv E0C56BD4
echo "deb https://repo.clickhouse.tech/deb/stable/ main/" | sudo tee \
/etc/apt/sources.list.d/clickhouse.list
sudo apt-get update
sudo apt-get install -y clickhouse-server clickhouse-client
sudo service clickhouse-server start
clickhouse-client

View File

@ -0,0 +1,8 @@
sudo yum install yum-utils
sudo rpm --import https://repo.clickhouse.tech/CLICKHOUSE-KEY.GPG
sudo yum-config-manager --add-repo https://repo.clickhouse.tech/rpm/stable/x86_64
sudo yum install clickhouse-server clickhouse-client
sudo /etc/init.d/clickhouse-server start
clickhouse-client

View File

@ -0,0 +1,19 @@
export LATEST_VERSION=$(curl -s https://repo.clickhouse.tech/tgz/stable/ | \
grep -Eo '[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+' | sort -V -r | head -n 1)
curl -O https://repo.clickhouse.tech/tgz/stable/clickhouse-common-static-$LATEST_VERSION.tgz
curl -O https://repo.clickhouse.tech/tgz/stable/clickhouse-common-static-dbg-$LATEST_VERSION.tgz
curl -O https://repo.clickhouse.tech/tgz/stable/clickhouse-server-$LATEST_VERSION.tgz
curl -O https://repo.clickhouse.tech/tgz/stable/clickhouse-client-$LATEST_VERSION.tgz
tar -xzvf clickhouse-common-static-$LATEST_VERSION.tgz
sudo clickhouse-common-static-$LATEST_VERSION/install/doinst.sh
tar -xzvf clickhouse-common-static-dbg-$LATEST_VERSION.tgz
sudo clickhouse-common-static-dbg-$LATEST_VERSION/install/doinst.sh
tar -xzvf clickhouse-server-$LATEST_VERSION.tgz
sudo clickhouse-server-$LATEST_VERSION/install/doinst.sh
sudo /etc/init.d/clickhouse-server start
tar -xzvf clickhouse-client-$LATEST_VERSION.tgz
sudo clickhouse-client-$LATEST_VERSION/install/doinst.sh
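
The grep/sort pipeline at the top of this snippet picks the highest release number published in the tgz repository. Not part of the diff, but for readers more comfortable with Python, a rough equivalent of that version selection (URL and regex copied from the snippet above, error handling omitted):

```python
import re
import urllib.request

# fetch the directory listing and collect version-looking strings such as 20.3.7.46
html = urllib.request.urlopen('https://repo.clickhouse.tech/tgz/stable/').read().decode()
versions = re.findall(r'[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+', html)

# numeric tuple comparison gives the same ordering as `sort -V` for these strings
latest = max(versions, key=lambda v: tuple(int(p) for p in v.split('.')))
print(latest)
```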

View File

@ -1,3 +1,8 @@
---
toc_priority: 1
toc_title: Cloud
---
# ClickHouse Cloud Service Providers {#clickhouse-cloud-service-providers}
!!! info "Info"
@ -7,7 +12,7 @@
[Yandex Managed Service for ClickHouse](https://cloud.yandex.com/services/managed-clickhouse?utm_source=referrals&utm_medium=clickhouseofficialsite&utm_campaign=link3) provides the following key features:
- Fully managed ZooKeeper service for [ClickHouse replication](../engines/table_engines/mergetree_family/replication.md)
- Fully managed ZooKeeper service for [ClickHouse replication](../engines/table-engines/mergetree-family/replication.md)
- Multiple storage type choices
- Replicas in different availability zones
- Encryption and isolation

View File

@ -0,0 +1,21 @@
---
toc_priority: 3
toc_title: Support
---
# ClickHouse Commercial Support Service Providers {#clickhouse-commercial-support-service-providers}
!!! info "Info"
If you have launched a ClickHouse commercial support service, feel free to [open a pull-request](https://github.com/ClickHouse/ClickHouse/edit/master/docs/en/commercial/support.md) adding it to the following list.
## Altinity {#altinity}
[Service description](https://www.altinity.com/24x7-support)
## Mafiree {#mafiree}
[Service description](http://mafiree.com/clickhouse-analytics-services.php)
## MinervaDB {#minervadb}
[Service description](https://minervadb.com/index.php/clickhouse-consulting-and-support-by-minervadb/)

Some files were not shown because too many files have changed in this diff.