Merge branch 'master' into complex_map_key

taiyang-li 2024-02-27 17:25:37 +08:00
commit ccd317da7e
1395 changed files with 35125 additions and 12623 deletions

View File

@ -17,7 +17,7 @@ assignees: ''
> A link to reproducer in [https://fiddle.clickhouse.com/](https://fiddle.clickhouse.com/).
**Does it reproduce on recent release?**
**Does it reproduce on the most recent release?**
[The list of releases](https://github.com/ClickHouse/ClickHouse/blob/master/utils/list-versions/version_date.tsv)
@ -34,11 +34,11 @@ assignees: ''
**How to reproduce**
* Which ClickHouse server version to use
* Which interface to use, if matters
* Which interface to use, if it matters
* Non-default settings, if any
* `CREATE TABLE` statements for all tables involved
* Sample data for all these tables, use [clickhouse-obfuscator](https://github.com/ClickHouse/ClickHouse/blob/master/programs/obfuscator/Obfuscator.cpp#L42-L80) if necessary
* Queries to run that lead to unexpected result
* Queries to run that lead to an unexpected result
**Expected behavior**

View File

@ -11,7 +11,7 @@ on: # yamllint disable-line rule:truthy
- 'backport/**'
jobs:
RunConfig:
runs-on: [self-hosted, style-checker]
runs-on: [self-hosted, style-checker-aarch64]
outputs:
data: ${{ steps.runconfig.outputs.CI_DATA }}
steps:

View File

@ -11,7 +11,7 @@ on: # yamllint disable-line rule:truthy
- 'master'
jobs:
RunConfig:
runs-on: [self-hosted, style-checker]
runs-on: [self-hosted, style-checker-aarch64]
outputs:
data: ${{ steps.runconfig.outputs.CI_DATA }}
steps:
@ -35,7 +35,7 @@ jobs:
- name: PrepareRunConfig
id: runconfig
run: |
python3 "$GITHUB_WORKSPACE/tests/ci/ci.py" --configure --rebuild-all-binaries --outfile ${{ runner.temp }}/ci_run_data.json
python3 "$GITHUB_WORKSPACE/tests/ci/ci.py" --configure --outfile ${{ runner.temp }}/ci_run_data.json
echo "::group::CI configuration"
python3 -m json.tool ${{ runner.temp }}/ci_run_data.json
@ -55,7 +55,6 @@ jobs:
uses: ./.github/workflows/reusable_docker.yml
with:
data: ${{ needs.RunConfig.outputs.data }}
set_latest: true
StyleCheck:
needs: [RunConfig, BuildDockers]
if: ${{ !failure() && !cancelled() }}
@ -98,6 +97,14 @@ jobs:
build_name: package_release
checkout_depth: 0
data: ${{ needs.RunConfig.outputs.data }}
BuilderDebReleaseCoverage:
needs: [RunConfig, BuildDockers]
if: ${{ !failure() && !cancelled() }}
uses: ./.github/workflows/reusable_build.yml
with:
build_name: package_release_coverage
checkout_depth: 0
data: ${{ needs.RunConfig.outputs.data }}
BuilderDebAarch64:
needs: [RunConfig, BuildDockers]
if: ${{ !failure() && !cancelled() }}
@ -278,6 +285,7 @@ jobs:
- BuilderDebDebug
- BuilderDebMsan
- BuilderDebRelease
- BuilderDebReleaseCoverage
- BuilderDebTsan
- BuilderDebUBsan
uses: ./.github/workflows/reusable_test.yml
@ -319,7 +327,7 @@ jobs:
run_command: |
python3 build_report_check.py "$CHECK_NAME"
MarkReleaseReady:
if: ${{ !failure() && !cancelled() }}
if: ${{ ! (contains(needs.*.result, 'skipped') || contains(needs.*.result, 'failure')) }}
needs:
- BuilderBinDarwin
- BuilderBinDarwinAarch64
@ -329,8 +337,6 @@ jobs:
steps:
- name: Check out repository code
uses: ClickHouse/checkout@v1
with:
clear-repository: true
- name: Mark Commit Release Ready
run: |
cd "$GITHUB_WORKSPACE/tests/ci"
@ -369,36 +375,28 @@ jobs:
test_name: Stateless tests (release)
runner_type: func-tester
data: ${{ needs.RunConfig.outputs.data }}
FunctionalStatelessTestReleaseDatabaseOrdinary:
FunctionalStatelessTestReleaseAnalyzerS3Replicated:
needs: [RunConfig, BuilderDebRelease]
if: ${{ !failure() && !cancelled() }}
uses: ./.github/workflows/reusable_test.yml
with:
test_name: Stateless tests (release, DatabaseOrdinary)
test_name: Stateless tests (release, analyzer, s3, DatabaseReplicated)
runner_type: func-tester
data: ${{ needs.RunConfig.outputs.data }}
FunctionalStatelessTestReleaseDatabaseReplicated:
needs: [RunConfig, BuilderDebRelease]
FunctionalStatelessTestS3Debug:
needs: [RunConfig, BuilderDebDebug]
if: ${{ !failure() && !cancelled() }}
uses: ./.github/workflows/reusable_test.yml
with:
test_name: Stateless tests (release, DatabaseReplicated)
test_name: Stateless tests (debug, s3 storage)
runner_type: func-tester
data: ${{ needs.RunConfig.outputs.data }}
FunctionalStatelessTestReleaseAnalyzer:
needs: [RunConfig, BuilderDebRelease]
FunctionalStatelessTestS3Tsan:
needs: [RunConfig, BuilderDebTsan]
if: ${{ !failure() && !cancelled() }}
uses: ./.github/workflows/reusable_test.yml
with:
test_name: Stateless tests (release, analyzer)
runner_type: func-tester
data: ${{ needs.RunConfig.outputs.data }}
FunctionalStatelessTestReleaseS3:
needs: [RunConfig, BuilderDebRelease]
if: ${{ !failure() && !cancelled() }}
uses: ./.github/workflows/reusable_test.yml
with:
test_name: Stateless tests (release, s3 storage)
test_name: Stateless tests (tsan, s3 storage)
runner_type: func-tester
data: ${{ needs.RunConfig.outputs.data }}
FunctionalStatelessTestAarch64:
@ -509,6 +507,55 @@ jobs:
test_name: Stateful tests (debug)
runner_type: func-tester
data: ${{ needs.RunConfig.outputs.data }}
# Parallel replicas
FunctionalStatefulTestDebugParallelReplicas:
needs: [RunConfig, BuilderDebDebug]
if: ${{ !failure() && !cancelled() }}
uses: ./.github/workflows/reusable_test.yml
with:
test_name: Stateful tests (debug, ParallelReplicas)
runner_type: func-tester
data: ${{ needs.RunConfig.outputs.data }}
FunctionalStatefulTestUBsanParallelReplicas:
needs: [RunConfig, BuilderDebUBsan]
if: ${{ !failure() && !cancelled() }}
uses: ./.github/workflows/reusable_test.yml
with:
test_name: Stateful tests (ubsan, ParallelReplicas)
runner_type: func-tester
data: ${{ needs.RunConfig.outputs.data }}
FunctionalStatefulTestMsanParallelReplicas:
needs: [RunConfig, BuilderDebMsan]
if: ${{ !failure() && !cancelled() }}
uses: ./.github/workflows/reusable_test.yml
with:
test_name: Stateful tests (msan, ParallelReplicas)
runner_type: func-tester
data: ${{ needs.RunConfig.outputs.data }}
FunctionalStatefulTestTsanParallelReplicas:
needs: [RunConfig, BuilderDebTsan]
if: ${{ !failure() && !cancelled() }}
uses: ./.github/workflows/reusable_test.yml
with:
test_name: Stateful tests (tsan, ParallelReplicas)
runner_type: func-tester
data: ${{ needs.RunConfig.outputs.data }}
FunctionalStatefulTestAsanParallelReplicas:
needs: [RunConfig, BuilderDebAsan]
if: ${{ !failure() && !cancelled() }}
uses: ./.github/workflows/reusable_test.yml
with:
test_name: Stateful tests (asan, ParallelReplicas)
runner_type: func-tester
data: ${{ needs.RunConfig.outputs.data }}
FunctionalStatefulTestReleaseParallelReplicas:
needs: [RunConfig, BuilderDebRelease]
if: ${{ !failure() && !cancelled() }}
uses: ./.github/workflows/reusable_test.yml
with:
test_name: Stateful tests (release, ParallelReplicas)
runner_type: func-tester
data: ${{ needs.RunConfig.outputs.data }}
##############################################################################################
########################### ClickBench #######################################################
##############################################################################################
@ -716,6 +763,28 @@ jobs:
runner_type: func-tester-aarch64
data: ${{ needs.RunConfig.outputs.data }}
##############################################################################################
############################ SQLLOGIC TEST ###################################################
##############################################################################################
SQLLogicTestRelease:
needs: [RunConfig, BuilderDebRelease]
if: ${{ !failure() && !cancelled() }}
uses: ./.github/workflows/reusable_test.yml
with:
test_name: Sqllogic test (release)
runner_type: func-tester
data: ${{ needs.RunConfig.outputs.data }}
##############################################################################################
##################################### SQL TEST ###############################################
##############################################################################################
SQLTest:
needs: [RunConfig, BuilderDebRelease]
if: ${{ !failure() && !cancelled() }}
uses: ./.github/workflows/reusable_test.yml
with:
test_name: SQLTest
runner_type: fuzzer-unit-tester
data: ${{ needs.RunConfig.outputs.data }}
##############################################################################################
###################################### SQLANCER FUZZERS ######################################
##############################################################################################
SQLancerTestRelease:
@ -740,15 +809,14 @@ jobs:
- MarkReleaseReady
- FunctionalStatelessTestDebug
- FunctionalStatelessTestRelease
- FunctionalStatelessTestReleaseDatabaseOrdinary
- FunctionalStatelessTestReleaseDatabaseReplicated
- FunctionalStatelessTestReleaseAnalyzer
- FunctionalStatelessTestReleaseS3
- FunctionalStatelessTestReleaseAnalyzerS3Replicated
- FunctionalStatelessTestAarch64
- FunctionalStatelessTestAsan
- FunctionalStatelessTestTsan
- FunctionalStatelessTestMsan
- FunctionalStatelessTestUBsan
- FunctionalStatelessTestS3Debug
- FunctionalStatelessTestS3Tsan
- FunctionalStatefulTestDebug
- FunctionalStatefulTestRelease
- FunctionalStatefulTestAarch64
@ -756,6 +824,12 @@ jobs:
- FunctionalStatefulTestTsan
- FunctionalStatefulTestMsan
- FunctionalStatefulTestUBsan
- FunctionalStatefulTestDebugParallelReplicas
- FunctionalStatefulTestUBsanParallelReplicas
- FunctionalStatefulTestMsanParallelReplicas
- FunctionalStatefulTestTsanParallelReplicas
- FunctionalStatefulTestAsanParallelReplicas
- FunctionalStatefulTestReleaseParallelReplicas
- StressTestDebug
- StressTestAsan
- StressTestTsan
@ -781,6 +855,8 @@ jobs:
- UnitTestsReleaseClang
- SQLancerTestRelease
- SQLancerTestDebug
- SQLLogicTestRelease
- SQLTest
runs-on: [self-hosted, style-checker]
steps:
- name: Check out repository code

View File

@ -14,7 +14,7 @@ jobs:
# The task for having a preserved ENV and event.json for later investigation
uses: ./.github/workflows/debug.yml
RunConfig:
runs-on: [self-hosted, style-checker]
runs-on: [self-hosted, style-checker-aarch64]
outputs:
data: ${{ steps.runconfig.outputs.CI_DATA }}
steps:
@ -28,7 +28,7 @@ jobs:
id: runconfig
run: |
echo "::group::configure CI run"
python3 "$GITHUB_WORKSPACE/tests/ci/ci.py" --configure --skip-jobs --rebuild-all-docker --outfile ${{ runner.temp }}/ci_run_data.json
python3 "$GITHUB_WORKSPACE/tests/ci/ci.py" --configure --skip-jobs --outfile ${{ runner.temp }}/ci_run_data.json
echo "::endgroup::"
echo "::group::CI run configure results"

View File

@ -18,7 +18,7 @@ on: # yamllint disable-line rule:truthy
##########################################################################################
jobs:
RunConfig:
runs-on: [self-hosted, style-checker]
runs-on: [self-hosted, style-checker-aarch64]
outputs:
data: ${{ steps.runconfig.outputs.CI_DATA }}
steps:
@ -147,6 +147,14 @@ jobs:
build_name: package_release
checkout_depth: 0
data: ${{ needs.RunConfig.outputs.data }}
BuilderDebReleaseCoverage:
needs: [RunConfig, FastTest]
if: ${{ !failure() && !cancelled() }}
uses: ./.github/workflows/reusable_build.yml
with:
build_name: package_release_coverage
checkout_depth: 0
data: ${{ needs.RunConfig.outputs.data }}
BuilderDebAarch64:
needs: [RunConfig, FastTest]
if: ${{ !failure() && !cancelled() }}
@ -309,6 +317,7 @@ jobs:
- BuilderDebDebug
- BuilderDebMsan
- BuilderDebRelease
- BuilderDebReleaseCoverage
- BuilderDebTsan
- BuilderDebUBsan
uses: ./.github/workflows/reusable_test.yml
@ -382,28 +391,12 @@ jobs:
test_name: Stateless tests (release)
runner_type: func-tester
data: ${{ needs.RunConfig.outputs.data }}
FunctionalStatelessTestReleaseDatabaseReplicated:
FunctionalStatelessTestReleaseAnalyzerS3Replicated:
needs: [RunConfig, BuilderDebRelease]
if: ${{ !failure() && !cancelled() }}
uses: ./.github/workflows/reusable_test.yml
with:
test_name: Stateless tests (release, DatabaseReplicated)
runner_type: func-tester
data: ${{ needs.RunConfig.outputs.data }}
FunctionalStatelessTestReleaseAnalyzer:
needs: [RunConfig, BuilderDebRelease]
if: ${{ !failure() && !cancelled() }}
uses: ./.github/workflows/reusable_test.yml
with:
test_name: Stateless tests (release, analyzer)
runner_type: func-tester
data: ${{ needs.RunConfig.outputs.data }}
FunctionalStatelessTestReleaseS3:
needs: [RunConfig, BuilderDebRelease]
if: ${{ !failure() && !cancelled() }}
uses: ./.github/workflows/reusable_test.yml
with:
test_name: Stateless tests (release, s3 storage)
test_name: Stateless tests (release, analyzer, s3, DatabaseReplicated)
runner_type: func-tester
data: ${{ needs.RunConfig.outputs.data }}
FunctionalStatelessTestS3Debug:
@ -483,21 +476,9 @@ jobs:
if: ${{ !failure() && !cancelled() }}
uses: ./.github/workflows/reusable_test.yml
with:
test_name: tests bugfix validate check
test_name: Bugfix validation
runner_type: func-tester
data: ${{ needs.RunConfig.outputs.data }}
additional_envs: |
KILL_TIMEOUT=3600
run_command: |
TEMP_PATH="${TEMP_PATH}/integration" \
python3 integration_test_check.py "Integration $CHECK_NAME" \
--validate-bugfix --post-commit-status=file || echo 'ignore exit code'
TEMP_PATH="${TEMP_PATH}/stateless" \
python3 functional_test_check.py "Stateless $CHECK_NAME" "$KILL_TIMEOUT" \
--validate-bugfix --post-commit-status=file || echo 'ignore exit code'
python3 bugfix_validate_check.py "${TEMP_PATH}/stateless/functional_commit_status.tsv" "${TEMP_PATH}/integration/integration_commit_status.tsv"
##############################################################################################
############################ FUNCTIONAl STATEFUL TESTS #######################################
##############################################################################################
@ -753,14 +734,6 @@ jobs:
#############################################################################################
############################# INTEGRATION TESTS #############################################
#############################################################################################
IntegrationTestsAsan:
needs: [RunConfig, BuilderDebAsan]
if: ${{ !failure() && !cancelled() }}
uses: ./.github/workflows/reusable_test.yml
with:
test_name: Integration tests (asan)
runner_type: stress-tester
data: ${{ needs.RunConfig.outputs.data }}
IntegrationTestsAnalyzerAsan:
needs: [RunConfig, BuilderDebAsan]
if: ${{ !failure() && !cancelled() }}
@ -777,13 +750,14 @@ jobs:
test_name: Integration tests (tsan)
runner_type: stress-tester
data: ${{ needs.RunConfig.outputs.data }}
IntegrationTestsRelease:
needs: [RunConfig, BuilderDebRelease]
IntegrationTestsAarch64:
needs: [RunConfig, BuilderDebAarch64]
if: ${{ !failure() && !cancelled() }}
uses: ./.github/workflows/reusable_test.yml
with:
test_name: Integration tests (release)
runner_type: stress-tester
test_name: Integration tests (aarch64)
# FIXME: there is no stress-tester for aarch64. func-tester-aarch64 is ok?
runner_type: func-tester-aarch64
data: ${{ needs.RunConfig.outputs.data }}
IntegrationTestsFlakyCheck:
needs: [RunConfig, BuilderDebAsan]
@ -881,10 +855,9 @@ jobs:
- BuilderSpecialReport
- DocsCheck
- FastTest
- TestsBugfixCheck
- FunctionalStatelessTestDebug
- FunctionalStatelessTestRelease
- FunctionalStatelessTestReleaseDatabaseReplicated
- FunctionalStatelessTestReleaseAnalyzer
- FunctionalStatelessTestAarch64
- FunctionalStatelessTestAsan
- FunctionalStatelessTestTsan
@ -897,9 +870,9 @@ jobs:
- FunctionalStatefulTestTsan
- FunctionalStatefulTestMsan
- FunctionalStatefulTestUBsan
- FunctionalStatelessTestReleaseS3
- FunctionalStatelessTestS3Debug
- FunctionalStatelessTestS3Tsan
- FunctionalStatelessTestReleaseAnalyzerS3Replicated
- FunctionalStatefulTestReleaseParallelReplicas
- FunctionalStatefulTestAsanParallelReplicas
- FunctionalStatefulTestTsanParallelReplicas
@ -920,10 +893,9 @@ jobs:
- ASTFuzzerTestTsan
- ASTFuzzerTestMSan
- ASTFuzzerTestUBSan
- IntegrationTestsAsan
- IntegrationTestsAnalyzerAsan
- IntegrationTestsTsan
- IntegrationTestsRelease
- IntegrationTestsAarch64
- IntegrationTestsFlakyCheck
- PerformanceComparisonX86
- PerformanceComparisonAarch
@ -992,7 +964,7 @@ jobs:
####################################### libFuzzer ###########################################
#############################################################################################
libFuzzer:
if: ${{ !failure() && !cancelled() && contains(github.event.pull_request.labels.*.name, 'libFuzzer') }}
if: ${{ !failure() && !cancelled() }}
needs: [RunConfig, StyleCheck]
uses: ./.github/workflows/libfuzzer.yml
with:

View File

@ -14,7 +14,7 @@ on: # yamllint disable-line rule:truthy
jobs:
RunConfig:
runs-on: [self-hosted, style-checker]
runs-on: [self-hosted, style-checker-aarch64]
outputs:
data: ${{ steps.runconfig.outputs.CI_DATA }}
steps:
@ -41,7 +41,7 @@ jobs:
id: runconfig
run: |
echo "::group::configure CI run"
python3 "$GITHUB_WORKSPACE/tests/ci/ci.py" --configure --rebuild-all-binaries --outfile ${{ runner.temp }}/ci_run_data.json
python3 "$GITHUB_WORKSPACE/tests/ci/ci.py" --configure --outfile ${{ runner.temp }}/ci_run_data.json
echo "::endgroup::"
echo "::group::CI run configure results"
python3 -m json.tool ${{ runner.temp }}/ci_run_data.json
@ -91,6 +91,8 @@ jobs:
build_name: package_release
checkout_depth: 0
data: ${{ needs.RunConfig.outputs.data }}
# always rebuild on release branches to be able to publish from any commit
force: true
BuilderDebAarch64:
needs: [RunConfig, BuildDockers]
if: ${{ !failure() && !cancelled() }}
@ -99,6 +101,8 @@ jobs:
build_name: package_aarch64
checkout_depth: 0
data: ${{ needs.RunConfig.outputs.data }}
# always rebuild on release branches to be able to publish from any commit
force: true
BuilderDebAsan:
needs: [RunConfig, BuildDockers]
if: ${{ !failure() && !cancelled() }}
@ -142,6 +146,8 @@ jobs:
build_name: binary_darwin
checkout_depth: 0
data: ${{ needs.RunConfig.outputs.data }}
# always rebuild on release branches to be able to publish from any commit
force: true
BuilderBinDarwinAarch64:
needs: [RunConfig, BuildDockers]
if: ${{ !failure() && !cancelled() }}
@ -150,6 +156,8 @@ jobs:
build_name: binary_darwin_aarch64
checkout_depth: 0
data: ${{ needs.RunConfig.outputs.data }}
# always rebuild on release branches to be able to publish from any commit
force: true
############################################################################################
##################################### Docker images #######################################
############################################################################################
@ -206,13 +214,8 @@ jobs:
if: ${{ !cancelled() }}
needs:
- RunConfig
- BuilderDebRelease
- BuilderDebAarch64
- BuilderDebAsan
- BuilderDebTsan
- BuilderDebUBsan
- BuilderDebMsan
- BuilderDebDebug
- BuilderBinDarwin
- BuilderBinDarwinAarch64
uses: ./.github/workflows/reusable_test.yml
with:
test_name: ClickHouse special build check
@ -225,7 +228,7 @@ jobs:
run_command: |
python3 build_report_check.py "$CHECK_NAME"
MarkReleaseReady:
if: ${{ !failure() && !cancelled() }}
if: ${{ ! (contains(needs.*.result, 'skipped') || contains(needs.*.result, 'failure')) }}
needs:
- BuilderBinDarwin
- BuilderBinDarwinAarch64
@ -235,8 +238,6 @@ jobs:
steps:
- name: Check out repository code
uses: ClickHouse/checkout@v1
with:
clear-repository: true
- name: Mark Commit Release Ready
run: |
cd "$GITHUB_WORKSPACE/tests/ci"

View File

@ -26,6 +26,10 @@ name: Build ClickHouse
description: json ci data
type: string
required: true
force:
description: disallow job skipping
type: boolean
default: false
additional_envs:
description: additional ENV variables to setup the job
type: string
@ -33,7 +37,7 @@ name: Build ClickHouse
jobs:
Build:
name: Build-${{inputs.build_name}}
if: contains(fromJson(inputs.data).jobs_data.jobs_to_do, inputs.build_name)
if: ${{ contains(fromJson(inputs.data).jobs_data.jobs_to_do, inputs.build_name) || inputs.force }}
env:
GITHUB_JOB_OVERRIDDEN: Build-${{inputs.build_name}}
runs-on: [self-hosted, '${{inputs.runner_type}}']
@ -78,13 +82,15 @@ jobs:
python3 "$GITHUB_WORKSPACE/tests/ci/ci.py" \
--infile ${{ toJson(inputs.data) }} \
--job-name "$BUILD_NAME" \
--run
--run \
${{ inputs.force && '--force' || '' }}
- name: Post
# there may still be a build report to upload for a failed build job
if: ${{ !cancelled() }}
run: |
python3 "$GITHUB_WORKSPACE/tests/ci/ci.py" --infile ${{ toJson(inputs.data) }} --post --job-name '${{inputs.build_name}}'
- name: Mark as done
if: ${{ !cancelled() }}
run: |
python3 "$GITHUB_WORKSPACE/tests/ci/ci.py" --infile ${{ toJson(inputs.data) }} --mark-success --job-name '${{inputs.build_name}}'
- name: Clean

View File

@ -46,7 +46,7 @@ jobs:
needs: [DockerBuildAmd64, DockerBuildAarch64]
runs-on: [self-hosted, style-checker]
if: |
!failure() && !cancelled() && toJson(fromJson(inputs.data).docker_data.missing_multi) != '[]'
!failure() && !cancelled() && (toJson(fromJson(inputs.data).docker_data.missing_multi) != '[]' || inputs.set_latest)
steps:
- name: Check out repository code
uses: ClickHouse/checkout@v1
@ -55,14 +55,12 @@ jobs:
- name: Build images
run: |
cd "$GITHUB_WORKSPACE/tests/ci"
if [ "${{ inputs.set_latest }}" == "true" ]; then
echo "latest tag will be set for resulting manifests"
python3 docker_manifests_merge.py --suffix amd64 --suffix aarch64 \
--image-tags '${{ toJson(fromJson(inputs.data).docker_data.images) }}' \
--missing-images '${{ toJson(fromJson(inputs.data).docker_data.missing_multi) }}' \
--set-latest
else
python3 docker_manifests_merge.py --suffix amd64 --suffix aarch64 \
--image-tags '${{ toJson(fromJson(inputs.data).docker_data.images) }}' \
--missing-images '${{ toJson(fromJson(inputs.data).docker_data.missing_multi) }}'
fi
FLAG_LATEST=''
if [ "${{ inputs.set_latest }}" == "true" ]; then
FLAG_LATEST='--set-latest'
echo "latest tag will be set for resulting manifests"
fi
python3 docker_manifests_merge.py --suffix amd64 --suffix aarch64 \
--image-tags '${{ toJson(fromJson(inputs.data).docker_data.images) }}' \
--missing-images '${{ toJson(fromJson(inputs.data).docker_data.missing_multi) }}' \
$FLAG_LATEST

View File

@ -107,6 +107,7 @@ jobs:
run: |
python3 "$GITHUB_WORKSPACE/tests/ci/ci.py" --infile ${{ toJson(inputs.data) }} --post --job-name '${{inputs.test_name}}'
- name: Mark as done
if: ${{ !cancelled() }}
run: |
python3 "$GITHUB_WORKSPACE/tests/ci/ci.py" --infile ${{ toJson(inputs.data) }} --mark-success --job-name '${{inputs.test_name}}' --batch ${{matrix.batch}}
- name: Clean

View File

@ -55,7 +55,7 @@ jobs:
python3 ./utils/security-generator/generate_security.py > SECURITY.md
git diff HEAD
- name: Create Pull Request
uses: peter-evans/create-pull-request@v3
uses: peter-evans/create-pull-request@v6
with:
author: "robot-clickhouse <robot-clickhouse@users.noreply.github.com>"
token: ${{ secrets.ROBOT_CLICKHOUSE_COMMIT_TOKEN }}

View File

@ -1,6 +1,6 @@
### CI modificators (add a leading space to apply):
### CI modificators (add a leading space to apply) ###
## To avoid a merge commit in CI:
#no_merge_commit
@ -8,12 +8,21 @@
## To discard CI cache:
#no_ci_cache
## To not test (only style check):
#do_not_test
## To run specified set of tests in CI:
#ci_set_<SET_NAME>
#ci_set_reduced
#ci_set_arm
#ci_set_integration
## To run specified job in CI:
#job_<JOB NAME>
#job_stateless_tests_release
#job_package_debug
#job_integration_tests_asan
## To run only specified batches for multi-batch job(s)
#batch_2
#batch_1_2_3

View File

@ -6,8 +6,6 @@
### <a id="241"></a> ClickHouse release 24.1, 2024-01-30
### ClickHouse release master (b4a5b6060ea) FIXME as compared to v23.12.1.1368-stable (a2faa65b080)
#### Backward Incompatible Change
* The setting `print_pretty_type_names` is turned on by default. You can turn it off to keep the old behavior or `SET compatibility = '23.12'`. [#57726](https://github.com/ClickHouse/ClickHouse/pull/57726) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* The MergeTree setting `clean_deleted_rows` is deprecated, it has no effect anymore. The `CLEANUP` keyword for `OPTIMIZE` is not allowed by default (unless `allow_experimental_replacing_merge_with_cleanup` is enabled). [#58316](https://github.com/ClickHouse/ClickHouse/pull/58316) ([Alexander Tokmakov](https://github.com/tavplubix)).
@ -24,7 +22,6 @@
* Add `quantileDD` aggregate function as well as the corresponding `quantilesDD` and `medianDD`. It is based on the DDSketch https://www.vldb.org/pvldb/vol12/p2195-masson.pdf. ### Documentation entry for user-facing changes. [#56342](https://github.com/ClickHouse/ClickHouse/pull/56342) ([Srikanth Chekuri](https://github.com/srikanthccv)).
* Allow to configure any kind of object storage with any kind of metadata type. [#58357](https://github.com/ClickHouse/ClickHouse/pull/58357) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Added `null_status_on_timeout_only_active` and `throw_only_active` modes for `distributed_ddl_output_mode` that allow to avoid waiting for inactive replicas. [#58350](https://github.com/ClickHouse/ClickHouse/pull/58350) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Allow partitions from tables with different partition expressions to be attached when the destination table partition expression doesn't re-partition/split the part. [#39507](https://github.com/ClickHouse/ClickHouse/pull/39507) ([Arthur Passos](https://github.com/arthurpassos)).
* Add function `arrayShingles` to compute subarrays, e.g. `arrayShingles([1, 2, 3, 4, 5], 3)` returns `[[1,2,3],[2,3,4],[3,4,5]]`. [#58396](https://github.com/ClickHouse/ClickHouse/pull/58396) ([Zheng Miao](https://github.com/zenmiao7)).
* Added functions `punycodeEncode`, `punycodeDecode`, `idnaEncode` and `idnaDecode` which are useful for translating international domain names to an ASCII representation according to the IDNA standard. [#58454](https://github.com/ClickHouse/ClickHouse/pull/58454) ([Robert Schulze](https://github.com/rschu1ze)).
* Added string similarity functions `damerauLevenshteinDistance`, `jaroSimilarity` and `jaroWinklerSimilarity`. [#58531](https://github.com/ClickHouse/ClickHouse/pull/58531) ([Robert Schulze](https://github.com/rschu1ze)).

View File

@ -254,10 +254,17 @@ endif()
include(cmake/cpu_features.cmake)
# Asynchronous unwind tables are needed for Query Profiler.
# They are already by default on some platforms but possibly not on all platforms.
# Enable it explicitly.
set (COMPILER_FLAGS "${COMPILER_FLAGS} -fasynchronous-unwind-tables")
# Query Profiler doesn't work on MacOS for several reasons
# - PHDR cache is not available
# - We use native functionality to get stacktraces which is not async signal safe
# and thus we don't need to generate asynchronous unwind tables
if (NOT OS_DARWIN)
# Asynchronous unwind tables are needed for Query Profiler.
# They are already by default on some platforms but possibly not on all platforms.
# Enable it explicitly.
set (COMPILER_FLAGS "${COMPILER_FLAGS} -fasynchronous-unwind-tables")
endif()
# Reproducible builds.
if (CMAKE_BUILD_TYPE_UC STREQUAL "DEBUG")
@ -348,7 +355,7 @@ if (COMPILER_CLANG)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -fdiagnostics-absolute-paths")
set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -fdiagnostics-absolute-paths")
if (NOT ENABLE_TESTS AND NOT SANITIZE AND OS_LINUX)
if (NOT ENABLE_TESTS AND NOT SANITIZE AND NOT SANITIZE_COVERAGE AND OS_LINUX)
# https://clang.llvm.org/docs/ThinLTO.html
# Applies to clang and linux only.
# Disabled when building with tests or sanitizers.
@ -546,7 +553,7 @@ if (ENABLE_RUST)
endif()
endif()
if (CMAKE_BUILD_TYPE_UC STREQUAL "RELWITHDEBINFO" AND NOT SANITIZE AND OS_LINUX AND (ARCH_AMD64 OR ARCH_AARCH64))
if (CMAKE_BUILD_TYPE_UC STREQUAL "RELWITHDEBINFO" AND NOT SANITIZE AND NOT SANITIZE_COVERAGE AND OS_LINUX AND (ARCH_AMD64 OR ARCH_AARCH64))
set(CHECK_LARGE_OBJECT_SIZES_DEFAULT ON)
else ()
set(CHECK_LARGE_OBJECT_SIZES_DEFAULT OFF)

View File

@ -37,7 +37,7 @@ Keep an eye out for upcoming meetups around the world. Somewhere else you want u
## Recent Recordings
* **Recent Meetup Videos**: [Meetup Playlist](https://www.youtube.com/playlist?list=PL0Z2YDlm0b3iNDUzpY1S3L_iV4nARda_U) Whenever possible, recordings of the ClickHouse Community Meetups are edited and presented as individual talks. Currently featuring "Modern SQL in 2023", "Fast, Concurrent, and Consistent Asynchronous INSERTS in ClickHouse", and "Full-Text Indices: Design and Experiments"
* **Recording available**: [**v23.10 Release Webinar**](https://www.youtube.com/watch?v=PGQS6uPb970) All the features of 23.10, one convenient video! Watch it now!
* **Recording available**: [**v24.1 Release Webinar**](https://www.youtube.com/watch?v=pBF9g0wGAGs) All the features of 24.1, one convenient video! Watch it now!
* **All release webinar recordings**: [YouTube playlist](https://www.youtube.com/playlist?list=PL0Z2YDlm0b3jAlSy1JxyP8zluvXaN3nxU)

View File

@ -10,6 +10,7 @@ set (CMAKE_CXX_STANDARD 20)
set (SRCS
argsToConfig.cpp
cgroupsv2.cpp
coverage.cpp
demangle.cpp
getAvailableMemoryAmount.cpp
@ -17,6 +18,7 @@ set (SRCS
getMemoryAmount.cpp
getPageSize.cpp
getThreadId.cpp
int8_to_string.cpp
JSON.cpp
mremap.cpp
phdr_cache.cpp

View File

@ -1,6 +1,7 @@
#pragma once
#include <base/types.h>
#include <base/extended_types.h>
namespace wide
{
@ -44,3 +45,8 @@ concept is_over_big_int =
|| std::is_same_v<T, Decimal128>
|| std::is_same_v<T, Decimal256>;
}
template <> struct is_signed<DB::Decimal32> { static constexpr bool value = true; };
template <> struct is_signed<DB::Decimal64> { static constexpr bool value = true; };
template <> struct is_signed<DB::Decimal128> { static constexpr bool value = true; };
template <> struct is_signed<DB::Decimal256> { static constexpr bool value = true; };

View File

@ -1,5 +1,6 @@
#pragma once
#include <bit>
#include <cstring>
#include <algorithm>
#include <type_traits>

base/base/cgroupsv2.cpp (new file, 64 lines)
View File

@ -0,0 +1,64 @@
#include <base/cgroupsv2.h>
#include <base/defines.h>
#include <fstream>
#include <sstream>
bool cgroupsV2Enabled()
{
#if defined(OS_LINUX)
/// This file exists iff the host has cgroups v2 enabled.
auto controllers_file = default_cgroups_mount / "cgroup.controllers";
if (!std::filesystem::exists(controllers_file))
return false;
return true;
#else
return false;
#endif
}
bool cgroupsV2MemoryControllerEnabled()
{
#if defined(OS_LINUX)
chassert(cgroupsV2Enabled());
/// According to https://docs.kernel.org/admin-guide/cgroup-v2.html:
/// - file 'cgroup.controllers' defines which controllers *can* be enabled
/// - file 'cgroup.subtree_control' defines which controllers *are* enabled
/// Caveat: nested groups may disable controllers. For simplicity, check only the top-level group.
std::ifstream subtree_control_file(default_cgroups_mount / "cgroup.subtree_control");
if (!subtree_control_file.is_open())
return false;
std::string controllers;
std::getline(subtree_control_file, controllers);
if (controllers.find("memory") == std::string::npos)
return false;
return true;
#else
return false;
#endif
}
std::string cgroupV2OfProcess()
{
#if defined(OS_LINUX)
chassert(cgroupsV2Enabled());
/// All PIDs assigned to a cgroup are in /sys/fs/cgroups/{cgroup_name}/cgroup.procs
/// A simpler way to get the membership is:
std::ifstream cgroup_name_file("/proc/self/cgroup");
if (!cgroup_name_file.is_open())
return "";
/// With cgroups v2, there will be a *single* line with prefix "0::/"
/// (see https://docs.kernel.org/admin-guide/cgroup-v2.html)
std::string cgroup;
std::getline(cgroup_name_file, cgroup);
static const std::string v2_prefix = "0::/";
if (!cgroup.starts_with(v2_prefix))
return "";
cgroup = cgroup.substr(v2_prefix.length());
return cgroup;
#else
return "";
#endif
}

base/base/cgroupsv2.h (new file, 22 lines)
View File

@ -0,0 +1,22 @@
#pragma once
#include <filesystem>
#include <string>
#if defined(OS_LINUX)
/// I think it is possible to mount the cgroups hierarchy somewhere else (e.g. when in containers).
/// /sys/fs/cgroup was still symlinked to the actual mount in the cases that I have seen.
static inline const std::filesystem::path default_cgroups_mount = "/sys/fs/cgroup";
#endif
/// Is cgroups v2 enabled on the system?
bool cgroupsV2Enabled();
/// Is the memory controller of cgroups v2 enabled on the system?
/// Assumes that cgroupsV2Enabled() is enabled.
bool cgroupsV2MemoryControllerEnabled();
/// Which cgroup does the process belong to?
/// Returns an empty string if the cgroup cannot be determined.
/// Assumes that cgroupsV2Enabled() is enabled.
std::string cgroupV2OfProcess();
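Not part of the diff: a minimal sketch of how the three helpers declared above compose, mirroring the cgroup v2 memory-limit lookup added to getMemoryAmount.cpp later in this commit. It assumes Linux (default_cgroups_mount is only defined there); the printing is illustrative only.

    #include <base/cgroupsv2.h>
    #include <fstream>
    #include <iostream>

    int main()
    {
        /// Check availability first: the other helpers assume cgroupsV2Enabled() is true.
        if (!cgroupsV2Enabled() || !cgroupsV2MemoryControllerEnabled())
            return 0;
        std::string cgroup = cgroupV2OfProcess();   /// empty if the membership cannot be determined
        auto dir = cgroup.empty() ? default_cgroups_mount : default_cgroups_mount / cgroup;
        std::ifstream limit(dir / "memory.max");    /// contains a number of bytes, or "max"
        std::string value;
        if (limit.is_open() && std::getline(limit, value))
            std::cout << "memory.max for " << dir << ": " << value << '\n';
    }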

View File

@ -1,4 +1,5 @@
#include "coverage.h"
#include <sys/mman.h>
#pragma GCC diagnostic ignored "-Wreserved-identifier"
@ -52,11 +53,21 @@ namespace
uint32_t * guards_start = nullptr;
uint32_t * guards_end = nullptr;
uintptr_t * coverage_array = nullptr;
uintptr_t * current_coverage_array = nullptr;
uintptr_t * cumulative_coverage_array = nullptr;
size_t coverage_array_size = 0;
uintptr_t * all_addresses_array = nullptr;
size_t all_addresses_array_size = 0;
uintptr_t * allocate(size_t size)
{
/// Note: mmap return zero-initialized memory, and we count on that.
void * map = mmap(nullptr, size, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
if (MAP_FAILED == map)
return nullptr;
return static_cast<uintptr_t*>(map);
}
}
extern "C"
@ -79,7 +90,8 @@ void __sanitizer_cov_trace_pc_guard_init(uint32_t * start, uint32_t * stop)
coverage_array_size = stop - start;
/// Note: we will leak this.
coverage_array = static_cast<uintptr_t*>(malloc(sizeof(uintptr_t) * coverage_array_size));
current_coverage_array = allocate(sizeof(uintptr_t) * coverage_array_size);
cumulative_coverage_array = allocate(sizeof(uintptr_t) * coverage_array_size);
resetCoverage();
}
@ -92,8 +104,8 @@ void __sanitizer_cov_pcs_init(const uintptr_t * pcs_begin, const uintptr_t * pcs
return;
pc_table_initialized = true;
all_addresses_array = static_cast<uintptr_t*>(malloc(sizeof(uintptr_t) * coverage_array_size));
all_addresses_array_size = pcs_end - pcs_begin;
all_addresses_array = allocate(sizeof(uintptr_t) * all_addresses_array_size);
/// They are not real pointers, but also contain a flag in the most significant bit,
/// in which we are not interested for now. Reset it.
@ -115,17 +127,24 @@ void __sanitizer_cov_trace_pc_guard(uint32_t * guard)
/// The values of `*guard` are as you set them in
/// __sanitizer_cov_trace_pc_guard_init and so you can make them consecutive
/// and use them to dereference an array or a bit vector.
void * pc = __builtin_return_address(0);
intptr_t pc = reinterpret_cast<uintptr_t>(__builtin_return_address(0));
coverage_array[guard - guards_start] = reinterpret_cast<uintptr_t>(pc);
current_coverage_array[guard - guards_start] = pc;
cumulative_coverage_array[guard - guards_start] = pc;
}
}
__attribute__((no_sanitize("coverage")))
std::span<const uintptr_t> getCoverage()
std::span<const uintptr_t> getCurrentCoverage()
{
return {coverage_array, coverage_array_size};
return {current_coverage_array, coverage_array_size};
}
__attribute__((no_sanitize("coverage")))
std::span<const uintptr_t> getCumulativeCoverage()
{
return {cumulative_coverage_array, coverage_array_size};
}
__attribute__((no_sanitize("coverage")))
@ -137,7 +156,7 @@ std::span<const uintptr_t> getAllInstrumentedAddresses()
__attribute__((no_sanitize("coverage")))
void resetCoverage()
{
memset(coverage_array, 0, coverage_array_size * sizeof(*coverage_array));
memset(current_coverage_array, 0, coverage_array_size * sizeof(*current_coverage_array));
/// The guard defines whether the __sanitizer_cov_trace_pc_guard should be called.
/// For example, you can unset it after first invocation to prevent excessive work.

View File

@ -15,7 +15,10 @@ void dumpCoverageReportIfPossible();
/// Get accumulated unique program addresses of the instrumented parts of the code,
/// seen so far after program startup or after previous reset.
/// The returned span will be represented as a sparse map, containing mostly zeros, which you should filter away.
std::span<const uintptr_t> getCoverage();
std::span<const uintptr_t> getCurrentCoverage();
/// Similar but not being reset.
std::span<const uintptr_t> getCumulativeCoverage();
/// Get all instrumented addresses that could be in the coverage.
std::span<const uintptr_t> getAllInstrumentedAddresses();
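Not part of the diff: a hedged sketch of how a caller might consume the renamed API; dumpCoverageSinceReset is a hypothetical helper, and it assumes resetCoverage() is declared in this header (as its definition in coverage.cpp suggests) and that the header is included as <base/coverage.h>.

    #include <base/coverage.h>
    #include <cstdint>
    #include <cstdio>

    /// Hypothetical helper: print the instrumented addresses hit since the last reset.
    /// Both spans are sparse (mostly zeros), so the empty slots are skipped.
    static void dumpCoverageSinceReset()
    {
        size_t hits = 0;
        for (uintptr_t addr : getCurrentCoverage())
            if (addr)
            {
                std::printf("%p\n", reinterpret_cast<void *>(addr));
                ++hits;
            }
        std::printf("covered %zu of %zu instrumented addresses\n",
                    hits, getAllInstrumentedAddresses().size());
        resetCoverage();  /// clears only the current window; getCumulativeCoverage() keeps accumulating
    }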

View File

@ -1,16 +1,55 @@
#include <stdexcept>
#include <fstream>
#include <base/getMemoryAmount.h>
#include <base/cgroupsv2.h>
#include <base/getPageSize.h>
#include <fstream>
#include <stdexcept>
#include <unistd.h>
#include <sys/types.h>
#include <sys/param.h>
#if defined(BSD)
#include <sys/sysctl.h>
#endif
namespace
{
std::optional<uint64_t> getCgroupsV2MemoryLimit()
{
#if defined(OS_LINUX)
if (!cgroupsV2Enabled())
return {};
if (!cgroupsV2MemoryControllerEnabled())
return {};
std::string cgroup = cgroupV2OfProcess();
auto current_cgroup = cgroup.empty() ? default_cgroups_mount : (default_cgroups_mount / cgroup);
/// Open the bottom-most nested memory limit setting file. If there is no such file at the current
/// level, try again at the parent level as memory settings are inherited.
while (current_cgroup != default_cgroups_mount.parent_path())
{
std::ifstream setting_file(current_cgroup / "memory.max");
if (setting_file.is_open())
{
uint64_t value;
if (setting_file >> value)
return {value};
else
return {}; /// e.g. the cgroups default "max"
}
current_cgroup = current_cgroup.parent_path();
}
return {};
#else
return {};
#endif
}
}
/** Returns the size of physical memory (RAM) in bytes.
* Returns 0 on unsupported platform
*/
@ -26,34 +65,27 @@ uint64_t getMemoryAmountOrZero()
uint64_t memory_amount = num_pages * page_size;
#if defined(OS_LINUX)
// Try to lookup at the Cgroup limit
// CGroups v2
std::ifstream cgroupv2_limit("/sys/fs/cgroup/memory.max");
if (cgroupv2_limit.is_open())
{
uint64_t memory_limit = 0;
cgroupv2_limit >> memory_limit;
if (memory_limit > 0 && memory_limit < memory_amount)
memory_amount = memory_limit;
}
/// Respect the memory limit set by cgroups v2.
auto limit_v2 = getCgroupsV2MemoryLimit();
if (limit_v2.has_value() && *limit_v2 < memory_amount)
memory_amount = *limit_v2;
else
{
// CGroups v1
std::ifstream cgroup_limit("/sys/fs/cgroup/memory/memory.limit_in_bytes");
if (cgroup_limit.is_open())
/// Cgroups v1 were replaced by v2 in 2015. The only reason we keep supporting v1 is that the transition to v2
/// has been slow. Caveat : Hierarchical groups as in v2 are not supported for v1, the location of the memory
/// limit (virtual) file is hard-coded.
/// TODO: check at the end of 2024 if we can get rid of v1.
std::ifstream limit_file_v1("/sys/fs/cgroup/memory/memory.limit_in_bytes");
if (limit_file_v1.is_open())
{
uint64_t memory_limit = 0; // in case of read error
cgroup_limit >> memory_limit;
if (memory_limit > 0 && memory_limit < memory_amount)
memory_amount = memory_limit;
uint64_t limit_v1;
if (limit_file_v1 >> limit_v1)
if (limit_v1 < memory_amount)
memory_amount = limit_v1;
}
}
#endif
return memory_amount;
}

View File

@ -0,0 +1,9 @@
#include <base/int8_to_string.h>
namespace std
{
std::string to_string(Int8 v) /// NOLINT (cert-dcl58-cpp)
{
return to_string(int8_t{v});
}
}

View File

@ -0,0 +1,17 @@
#pragma once
#include <base/defines.h>
#include <base/types.h>
#include <fmt/format.h>
template <>
struct fmt::formatter<Int8> : fmt::formatter<int8_t>
{
};
namespace std
{
std::string to_string(Int8 v); /// NOLINT (cert-dcl58-cpp)
}

View File

@ -3,14 +3,29 @@
#include <cstdint>
#include <string>
/// This is needed for more strict aliasing. https://godbolt.org/z/xpJBSb https://stackoverflow.com/a/57453713
/// Using char8_t more strict aliasing (https://stackoverflow.com/a/57453713)
using UInt8 = char8_t;
/// Same for using signed _BitInt(8) (there isn't a signed char8_t, which would be more convenient)
/// See https://godbolt.org/z/fafnWEnnf
#pragma clang diagnostic push
#pragma clang diagnostic ignored "-Wbit-int-extension"
using Int8 = signed _BitInt(8);
#pragma clang diagnostic pop
namespace std
{
template <>
struct hash<Int8> /// NOLINT (cert-dcl58-cpp)
{
size_t operator()(const Int8 x) const { return std::hash<int8_t>()(int8_t{x}); }
};
}
using UInt16 = uint16_t;
using UInt32 = uint32_t;
using UInt64 = uint64_t;
using Int8 = int8_t;
using Int16 = int16_t;
using Int32 = int32_t;
using Int64 = int64_t;
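Not part of the diff: a small sketch of why the extra std::hash, fmt::formatter and std::to_string pieces in this commit are needed once Int8 becomes signed _BitInt(8) instead of a plain int8_t alias; it assumes the headers shown above are reachable as <base/types.h> and <base/int8_to_string.h>.

    #include <base/types.h>
    #include <base/int8_to_string.h>
    #include <string>
    #include <unordered_set>

    int main()
    {
        Int8 x = -42;                        /// signed _BitInt(8), no longer the same type as int8_t
        std::string s = std::to_string(x);   /// resolves to the overload declared in int8_to_string.h
        std::unordered_set<Int8> seen;       /// compiles only because std::hash<Int8> is specialized
        seen.insert(x);
        return (s == "-42" && seen.contains(x)) ? 0 : 1;
    }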

View File

@ -6,6 +6,7 @@
#include "throwError.h"
#include <bit>
#include <cmath>
#include <cfloat>
#include <cassert>

View File

@ -166,12 +166,6 @@ set (SRCS
)
add_library (_poco_foundation ${SRCS})
target_link_libraries (_poco_foundation
PUBLIC
boost::headers_only
boost::system
)
add_library (Poco::Foundation ALIAS _poco_foundation)
# TODO: remove these warning exclusions

View File

@ -23,8 +23,6 @@
#include <map>
#include <vector>
#include <boost/smart_ptr/intrusive_ptr.hpp>
#include "Poco/Channel.h"
#include "Poco/Format.h"
#include "Poco/Foundation.h"
@ -37,7 +35,7 @@ namespace Poco
class Exception;
class Logger;
using LoggerPtr = boost::intrusive_ptr<Logger>;
using LoggerPtr = std::shared_ptr<Logger>;
class Foundation_API Logger : public Channel
/// Logger is a special Channel that acts as the main
@ -953,9 +951,6 @@ private:
static std::optional<LoggerMapIterator> find(const std::string & name);
static Logger * findRawPtr(const std::string & name);
friend void intrusive_ptr_add_ref(Logger * ptr);
friend void intrusive_ptr_release(Logger * ptr);
Logger();
Logger(const Logger &);
Logger & operator=(const Logger &);

View File

@ -53,10 +53,11 @@ protected:
virtual ~RefCountedObject();
/// Destroys the RefCountedObject.
mutable std::atomic<size_t> _counter;
private:
RefCountedObject(const RefCountedObject &);
RefCountedObject & operator=(const RefCountedObject &);
mutable std::atomic<size_t> _counter;
};

View File

@ -302,9 +302,40 @@ void Logger::formatDump(std::string& message, const void* buffer, std::size_t le
namespace
{
inline LoggerPtr makeLoggerPtr(Logger & logger)
struct LoggerDeleter
{
return LoggerPtr(&logger, false /*add_ref*/);
void operator()(Poco::Logger * logger)
{
std::lock_guard<std::mutex> lock(getLoggerMutex());
/// If logger infrastructure is destroyed just decrement logger reference count
if (!_pLoggerMap)
{
logger->release();
return;
}
auto it = _pLoggerMap->find(logger->name());
assert(it != _pLoggerMap->end());
/** If reference count is 1, this means this shared pointer owns logger
* and need destroy it.
*/
size_t reference_count_before_release = logger->release();
if (reference_count_before_release == 1)
{
assert(it->second.owned_by_shared_ptr);
_pLoggerMap->erase(it);
}
}
};
inline LoggerPtr makeLoggerPtr(Logger & logger, bool owned_by_shared_ptr)
{
if (owned_by_shared_ptr)
return LoggerPtr(&logger, LoggerDeleter());
return LoggerPtr(std::shared_ptr<void>{}, &logger);
}
}
@ -327,15 +358,10 @@ LoggerPtr Logger::getShared(const std::string & name, bool should_be_owned_by_sh
/** If during `unsafeGet` logger was created, then this shared pointer owns it.
* If logger was already created, then this shared pointer does not own it.
*/
if (inserted)
{
if (should_be_owned_by_shared_ptr_if_created)
if (inserted && should_be_owned_by_shared_ptr_if_created)
it->second.owned_by_shared_ptr = true;
else
it->second.logger->duplicate();
}
return makeLoggerPtr(*it->second.logger);
return makeLoggerPtr(*it->second.logger, it->second.owned_by_shared_ptr);
}
@ -343,29 +369,20 @@ std::pair<Logger::LoggerMapIterator, bool> Logger::unsafeGet(const std::string&
{
std::optional<Logger::LoggerMapIterator> optional_logger_it = find(name);
bool should_recreate_logger = false;
if (optional_logger_it)
{
auto & logger_it = *optional_logger_it;
std::optional<size_t> reference_count_before;
if (get_shared)
if (logger_it->second.owned_by_shared_ptr)
{
reference_count_before = logger_it->second.logger->duplicate();
}
else if (logger_it->second.owned_by_shared_ptr)
{
reference_count_before = logger_it->second.logger->duplicate();
logger_it->second.logger->duplicate();
if (!get_shared)
logger_it->second.owned_by_shared_ptr = false;
}
/// Other thread already decided to delete this logger, but did not yet remove it from map
if (reference_count_before && reference_count_before == 0)
should_recreate_logger = true;
}
if (!optional_logger_it || should_recreate_logger)
if (!optional_logger_it)
{
Logger * logger = nullptr;
@ -379,12 +396,6 @@ std::pair<Logger::LoggerMapIterator, bool> Logger::unsafeGet(const std::string&
logger = new Logger(name, par.getChannel(), par.getLevel());
}
if (should_recreate_logger)
{
(*optional_logger_it)->second.logger = logger;
return std::make_pair(*optional_logger_it, true);
}
return add(logger);
}
@ -412,7 +423,7 @@ LoggerPtr Logger::createShared(const std::string & name, Channel * pChannel, int
auto [it, inserted] = unsafeCreate(name, pChannel, level);
it->second.owned_by_shared_ptr = true;
return makeLoggerPtr(*it->second.logger);
return makeLoggerPtr(*it->second.logger, it->second.owned_by_shared_ptr);
}
Logger& Logger::root()
@ -479,43 +490,6 @@ Logger * Logger::findRawPtr(const std::string & name)
}
void intrusive_ptr_add_ref(Logger * ptr)
{
ptr->duplicate();
}
void intrusive_ptr_release(Logger * ptr)
{
size_t reference_count_before = ptr->_counter.fetch_sub(1, std::memory_order_acq_rel);
if (reference_count_before != 1)
return;
{
std::lock_guard<std::mutex> lock(getLoggerMutex());
if (_pLoggerMap)
{
auto it = _pLoggerMap->find(ptr->name());
/** It is possible that during release other thread created logger and
* updated iterator in map.
*/
if (it != _pLoggerMap->end() && ptr == it->second.logger)
{
/** If reference count is 0, this means this intrusive pointer owns logger
* and need destroy it.
*/
assert(it->second.owned_by_shared_ptr);
_pLoggerMap->erase(it);
}
}
}
delete ptr;
}
void Logger::names(std::vector<std::string>& names)
{
std::lock_guard<std::mutex> lock(getLoggerMutex());
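Not part of the diff: a minimal standalone sketch of the two std::shared_ptr idioms the new makeLoggerPtr relies on, with a hypothetical Widget standing in for Poco::Logger.

    #include <iostream>
    #include <memory>

    struct Widget { ~Widget() { std::cout << "Widget destroyed\n"; } };

    int main()
    {
        Widget registry_entry;  /// lifetime managed elsewhere, like a Logger owned by _pLoggerMap

        /// Aliasing constructor with an empty owner: the pointer is usable but nothing is deleted.
        /// This corresponds to the "not owned_by_shared_ptr" branch of makeLoggerPtr.
        std::shared_ptr<Widget> non_owning(std::shared_ptr<void>{}, &registry_entry);

        /// Custom deleter: arbitrary cleanup when the last reference drops, analogous to
        /// LoggerDeleter erasing the map entry and releasing the logger.
        std::shared_ptr<Widget> owning(new Widget, [](Widget * w)
        {
            std::cout << "custom deleter ran\n";
            delete w;
        });

        non_owning.reset();  /// prints nothing: no ownership was ever taken
        owning.reset();      /// prints "custom deleter ran" then "Widget destroyed"
    }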

View File

@ -63,14 +63,14 @@ endif()
option(WITH_COVERAGE "Instrumentation for code coverage with default implementation" OFF)
if (WITH_COVERAGE)
message (INFORMATION "Enabled instrumentation for code coverage")
message (STATUS "Enabled instrumentation for code coverage")
set(COVERAGE_FLAGS "-fprofile-instr-generate -fcoverage-mapping")
endif()
option (SANITIZE_COVERAGE "Instrumentation for code coverage with custom callbacks" OFF)
if (SANITIZE_COVERAGE)
message (INFORMATION "Enabled instrumentation for code coverage")
message (STATUS "Enabled instrumentation for code coverage")
# We set this define for whole build to indicate that at least some parts are compiled with coverage.
# And to expose it in system.build_options.

contrib/NuRaft vendored

@ -1 +1 @@
Subproject commit 1278e32bb0d5dc489f947e002bdf8c71b0ddaa63
Subproject commit 4a12f99dfc9d47c687ff7700b927cc76856225d1

contrib/aws vendored

@ -1 +1 @@
Subproject commit 4ec215f3607c2111bf2cc91ba842046a6b5eb0c4
Subproject commit 5f0542b3ad7eef25b0540d37d778207e0345ea8f

contrib/curl vendored

@ -1 +1 @@
Subproject commit 7161cb17c01dcff1dc5bf89a18437d9d729f1ecd
Subproject commit 5ce164e0e9290c96eb7d502173426c0a135ec008

contrib/libssh vendored

@ -1 +1 @@
Subproject commit 2c76332ef56d90f55965ab24da6b6dbcbef29c4c
Subproject commit ed4011b91873836713576475a98cd132cd834539

View File

@ -8,24 +8,12 @@ endif()
set(LIB_SOURCE_DIR "${ClickHouse_SOURCE_DIR}/contrib/libssh")
set(LIB_BINARY_DIR "${ClickHouse_BINARY_DIR}/contrib/libssh")
project(libssh VERSION 0.9.7 LANGUAGES C)
# Set CMake variables which are used in libssh_version.h.cmake
project(libssh VERSION 0.9.8 LANGUAGES C)
# global needed variable
set(APPLICATION_NAME ${PROJECT_NAME})
# SOVERSION scheme: CURRENT.AGE.REVISION
# If there was an incompatible interface change:
# Increment CURRENT. Set AGE and REVISION to 0
# If there was a compatible interface change:
# Increment AGE. Set REVISION to 0
# If the source code was changed, but there were no interface changes:
# Increment REVISION.
set(LIBRARY_VERSION "4.8.7")
set(LIBRARY_VERSION "4.8.8")
set(LIBRARY_SOVERSION "4")
# Copy library files to a lib sub-directory
set(CMAKE_LIBRARY_OUTPUT_DIRECTORY "${LIB_BINARY_DIR}/lib")
set(CMAKE_THREAD_PREFER_PTHREADS ON)
set(THREADS_PREFER_PTHREAD_FLAG ON)
@ -33,7 +21,87 @@ set(WITH_ZLIB OFF)
set(WITH_SYMBOL_VERSIONING OFF)
set(WITH_SERVER ON)
include(IncludeSources.cmake)
set(libssh_SRCS
${LIB_SOURCE_DIR}/src/agent.c
${LIB_SOURCE_DIR}/src/auth.c
${LIB_SOURCE_DIR}/src/base64.c
${LIB_SOURCE_DIR}/src/bignum.c
${LIB_SOURCE_DIR}/src/buffer.c
${LIB_SOURCE_DIR}/src/callbacks.c
${LIB_SOURCE_DIR}/src/channels.c
${LIB_SOURCE_DIR}/src/client.c
${LIB_SOURCE_DIR}/src/config.c
${LIB_SOURCE_DIR}/src/connect.c
${LIB_SOURCE_DIR}/src/connector.c
${LIB_SOURCE_DIR}/src/curve25519.c
${LIB_SOURCE_DIR}/src/dh.c
${LIB_SOURCE_DIR}/src/ecdh.c
${LIB_SOURCE_DIR}/src/error.c
${LIB_SOURCE_DIR}/src/getpass.c
${LIB_SOURCE_DIR}/src/init.c
${LIB_SOURCE_DIR}/src/kdf.c
${LIB_SOURCE_DIR}/src/kex.c
${LIB_SOURCE_DIR}/src/known_hosts.c
${LIB_SOURCE_DIR}/src/knownhosts.c
${LIB_SOURCE_DIR}/src/legacy.c
${LIB_SOURCE_DIR}/src/log.c
${LIB_SOURCE_DIR}/src/match.c
${LIB_SOURCE_DIR}/src/messages.c
${LIB_SOURCE_DIR}/src/misc.c
${LIB_SOURCE_DIR}/src/options.c
${LIB_SOURCE_DIR}/src/packet.c
${LIB_SOURCE_DIR}/src/packet_cb.c
${LIB_SOURCE_DIR}/src/packet_crypt.c
${LIB_SOURCE_DIR}/src/pcap.c
${LIB_SOURCE_DIR}/src/pki.c
${LIB_SOURCE_DIR}/src/pki_container_openssh.c
${LIB_SOURCE_DIR}/src/poll.c
${LIB_SOURCE_DIR}/src/session.c
${LIB_SOURCE_DIR}/src/scp.c
${LIB_SOURCE_DIR}/src/socket.c
${LIB_SOURCE_DIR}/src/string.c
${LIB_SOURCE_DIR}/src/threads.c
${LIB_SOURCE_DIR}/src/wrapper.c
${LIB_SOURCE_DIR}/src/external/bcrypt_pbkdf.c
${LIB_SOURCE_DIR}/src/external/blowfish.c
${LIB_SOURCE_DIR}/src/external/chacha.c
${LIB_SOURCE_DIR}/src/external/poly1305.c
${LIB_SOURCE_DIR}/src/chachapoly.c
${LIB_SOURCE_DIR}/src/config_parser.c
${LIB_SOURCE_DIR}/src/token.c
${LIB_SOURCE_DIR}/src/pki_ed25519_common.c
${LIB_SOURCE_DIR}/src/threads/noop.c
${LIB_SOURCE_DIR}/src/threads/pthread.c
# LIBCRYPT specific
${libssh_SRCS}
${LIB_SOURCE_DIR}/src/threads/libcrypto.c
${LIB_SOURCE_DIR}/src/pki_crypto.c
${LIB_SOURCE_DIR}/src/ecdh_crypto.c
${LIB_SOURCE_DIR}/src/libcrypto.c
${LIB_SOURCE_DIR}/src/dh_crypto.c
${LIB_SOURCE_DIR}/src/options.c
${LIB_SOURCE_DIR}/src/server.c
${LIB_SOURCE_DIR}/src/bind.c
${LIB_SOURCE_DIR}/src/bind_config.c
)
if (NOT (ENABLE_OPENSSL OR ENABLE_OPENSSL_DYNAMIC))
add_compile_definitions(USE_BORINGSSL=1)
endif()
configure_file(${LIB_SOURCE_DIR}/include/libssh/libssh_version.h.cmake ${LIB_BINARY_DIR}/include/libssh/libssh_version.h @ONLY)
add_library(_ssh STATIC ${libssh_SRCS})
add_library(ch_contrib::ssh ALIAS _ssh)
target_link_libraries(_ssh PRIVATE OpenSSL::Crypto)
target_include_directories(_ssh PUBLIC "${LIB_SOURCE_DIR}/include" "${LIB_BINARY_DIR}/include")
# These headers need to be generated using the native build system on each platform.
if (OS_LINUX)
if (ARCH_AMD64)
if (USE_MUSL)
@ -63,7 +131,3 @@ elseif (OS_FREEBSD)
else ()
message(FATAL_ERROR "Platform is not supported")
endif()
configure_file(${LIB_SOURCE_DIR}/include/libssh/libssh_version.h.cmake
${LIB_BINARY_DIR}/include/libssh/libssh_version.h
@ONLY)

View File

@ -1,98 +0,0 @@
set(LIBSSH_LINK_LIBRARIES
${LIBSSH_LINK_LIBRARIES}
OpenSSL::Crypto
)
set(libssh_SRCS
${LIB_SOURCE_DIR}/src/agent.c
${LIB_SOURCE_DIR}/src/auth.c
${LIB_SOURCE_DIR}/src/base64.c
${LIB_SOURCE_DIR}/src/bignum.c
${LIB_SOURCE_DIR}/src/buffer.c
${LIB_SOURCE_DIR}/src/callbacks.c
${LIB_SOURCE_DIR}/src/channels.c
${LIB_SOURCE_DIR}/src/client.c
${LIB_SOURCE_DIR}/src/config.c
${LIB_SOURCE_DIR}/src/connect.c
${LIB_SOURCE_DIR}/src/connector.c
${LIB_SOURCE_DIR}/src/curve25519.c
${LIB_SOURCE_DIR}/src/dh.c
${LIB_SOURCE_DIR}/src/ecdh.c
${LIB_SOURCE_DIR}/src/error.c
${LIB_SOURCE_DIR}/src/getpass.c
${LIB_SOURCE_DIR}/src/init.c
${LIB_SOURCE_DIR}/src/kdf.c
${LIB_SOURCE_DIR}/src/kex.c
${LIB_SOURCE_DIR}/src/known_hosts.c
${LIB_SOURCE_DIR}/src/knownhosts.c
${LIB_SOURCE_DIR}/src/legacy.c
${LIB_SOURCE_DIR}/src/log.c
${LIB_SOURCE_DIR}/src/match.c
${LIB_SOURCE_DIR}/src/messages.c
${LIB_SOURCE_DIR}/src/misc.c
${LIB_SOURCE_DIR}/src/options.c
${LIB_SOURCE_DIR}/src/packet.c
${LIB_SOURCE_DIR}/src/packet_cb.c
${LIB_SOURCE_DIR}/src/packet_crypt.c
${LIB_SOURCE_DIR}/src/pcap.c
${LIB_SOURCE_DIR}/src/pki.c
${LIB_SOURCE_DIR}/src/pki_container_openssh.c
${LIB_SOURCE_DIR}/src/poll.c
${LIB_SOURCE_DIR}/src/session.c
${LIB_SOURCE_DIR}/src/scp.c
${LIB_SOURCE_DIR}/src/socket.c
${LIB_SOURCE_DIR}/src/string.c
${LIB_SOURCE_DIR}/src/threads.c
${LIB_SOURCE_DIR}/src/wrapper.c
${LIB_SOURCE_DIR}/src/external/bcrypt_pbkdf.c
${LIB_SOURCE_DIR}/src/external/blowfish.c
${LIB_SOURCE_DIR}/src/external/chacha.c
${LIB_SOURCE_DIR}/src/external/poly1305.c
${LIB_SOURCE_DIR}/src/chachapoly.c
${LIB_SOURCE_DIR}/src/config_parser.c
${LIB_SOURCE_DIR}/src/token.c
${LIB_SOURCE_DIR}/src/pki_ed25519_common.c
)
set(libssh_SRCS
${libssh_SRCS}
${LIB_SOURCE_DIR}/src/threads/noop.c
${LIB_SOURCE_DIR}/src/threads/pthread.c
)
# LIBCRYPT specific
set(libssh_SRCS
${libssh_SRCS}
${LIB_SOURCE_DIR}/src/threads/libcrypto.c
${LIB_SOURCE_DIR}/src/pki_crypto.c
${LIB_SOURCE_DIR}/src/ecdh_crypto.c
${LIB_SOURCE_DIR}/src/libcrypto.c
${LIB_SOURCE_DIR}/src/dh_crypto.c
)
if (NOT (ENABLE_OPENSSL OR ENABLE_OPENSSL_DYNAMIC))
add_compile_definitions(USE_BORINGSSL=1)
endif()
set(libssh_SRCS
${libssh_SRCS}
${LIB_SOURCE_DIR}/src/options.c
${LIB_SOURCE_DIR}/src/server.c
${LIB_SOURCE_DIR}/src/bind.c
${LIB_SOURCE_DIR}/src/bind_config.c
)
add_library(_ssh STATIC ${libssh_SRCS})
target_include_directories(_ssh PRIVATE ${LIB_BINARY_DIR})
target_include_directories(_ssh PUBLIC "${LIB_SOURCE_DIR}/include" "${LIB_BINARY_DIR}/include")
target_link_libraries(_ssh
PRIVATE ${LIBSSH_LINK_LIBRARIES})
add_library(ch_contrib::ssh ALIAS _ssh)
target_compile_options(_ssh
PRIVATE
${DEFAULT_C_COMPILE_FLAGS}
-D_GNU_SOURCE)

View File

@ -1,6 +1,10 @@
#include <libunwind.h>
/// On MacOS this function will be replaced with a dynamic symbol
/// from the system library.
#if !defined(OS_DARWIN)
int backtrace(void ** buffer, int size)
{
return unw_backtrace(buffer, size);
}
#endif

contrib/liburing vendored

@ -1 +1 @@
Subproject commit f5a48392c4ea33f222cbebeb2e2fc31620162949
Subproject commit f4e42a515cd78c8c9cac2be14222834be5f8df2b

contrib/libuv vendored

@ -1 +1 @@
Subproject commit 3a85b2eb3d83f369b8a8cafd329d7e9dc28f60cf
Subproject commit 4482964660c77eec1166cd7d14fb915e3dbd774a

@ -1 +1 @@
Subproject commit 2568a7cd1297c7c3044b0f3cc0c23a6f6444d856
Subproject commit d2142eed98046a47ff7112e3cc1e197c8a5cd80f

contrib/lz4 vendored

@ -1 +1 @@
Subproject commit 92ebf1870b9acbefc0e7970409a181954a10ff40
Subproject commit ce45a9dbdb059511a3e9576b19db3e7f1a4f172e

contrib/qpl vendored

@ -1 +1 @@
Subproject commit a61bdd845fd7ca363b2bcc55454aa520dfcd8298
Subproject commit d4715e0e79896b85612158e135ee1a85f3b3e04d

contrib/rapidjson vendored

@ -1 +1 @@
Subproject commit c4ef90ccdbc21d5d5a628d08316bfd301e32d6fa
Subproject commit 800ca2f38fc3b387271d9e1926fcfc9070222104

View File

@ -62,7 +62,6 @@
"dependent": []
},
"docker/test/integration/runner": {
"only_amd64": true,
"name": "clickhouse/integration-tests-runner",
"dependent": []
},

View File

@ -34,7 +34,7 @@ RUN arch=${TARGETARCH:-amd64} \
# lts / testing / prestable / etc
ARG REPO_CHANNEL="stable"
ARG REPOSITORY="https://packages.clickhouse.com/tgz/${REPO_CHANNEL}"
ARG VERSION="24.1.1.2048"
ARG VERSION="24.1.5.6"
ARG PACKAGES="clickhouse-keeper"
ARG DIRECT_DOWNLOAD_URLS=""

View File

@ -72,7 +72,7 @@ RUN add-apt-repository ppa:ubuntu-toolchain-r/test --yes \
zstd \
zip \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists
&& rm -rf /var/lib/apt/lists/* /var/cache/debconf /tmp/*
# Download toolchain and SDK for Darwin
RUN curl -sL -O https://github.com/phracker/MacOSX-SDKs/releases/download/11.3/MacOSX11.0.sdk.tar.xz

View File

@ -115,12 +115,17 @@ def run_docker_image_with_env(
subprocess.check_call(cmd, shell=True)
def is_release_build(debug_build: bool, package_type: str, sanitizer: str) -> bool:
return not debug_build and package_type == "deb" and sanitizer == ""
def is_release_build(
debug_build: bool, package_type: str, sanitizer: str, coverage: bool
) -> bool:
return (
not debug_build and package_type == "deb" and sanitizer == "" and not coverage
)
def parse_env_variables(
debug_build: bool,
coverage: bool,
compiler: str,
sanitizer: str,
package_type: str,
@ -261,7 +266,7 @@ def parse_env_variables(
build_target = (
f"{build_target} clickhouse-odbc-bridge clickhouse-library-bridge"
)
if is_release_build(debug_build, package_type, sanitizer):
if is_release_build(debug_build, package_type, sanitizer, coverage):
cmake_flags.append("-DSPLIT_DEBUG_SYMBOLS=ON")
result.append("WITH_PERFORMANCE=1")
if is_cross_arm:
@ -287,6 +292,9 @@ def parse_env_variables(
else:
result.append("BUILD_TYPE=None")
if coverage:
cmake_flags.append("-DSANITIZE_COVERAGE=1 -DBUILD_STANDALONE_KEEPER=0")
if not cache:
cmake_flags.append("-DCOMPILER_CACHE=disabled")
@ -415,6 +423,11 @@ def parse_args() -> argparse.Namespace:
choices=("address", "thread", "memory", "undefined", ""),
default="",
)
parser.add_argument(
"--coverage",
action="store_true",
help="enable granular coverage with introspection",
)
parser.add_argument("--clang-tidy", action="store_true")
parser.add_argument(
@ -507,6 +520,7 @@ def main() -> None:
env_prepared = parse_env_variables(
args.debug_build,
args.coverage,
args.compiler,
args.sanitizer,
args.package_type,

View File

@ -32,7 +32,7 @@ RUN arch=${TARGETARCH:-amd64} \
# lts / testing / prestable / etc
ARG REPO_CHANNEL="stable"
ARG REPOSITORY="https://packages.clickhouse.com/tgz/${REPO_CHANNEL}"
ARG VERSION="24.1.1.2048"
ARG VERSION="24.1.5.6"
ARG PACKAGES="clickhouse-client clickhouse-server clickhouse-common-static"
ARG DIRECT_DOWNLOAD_URLS=""

View File

@ -23,14 +23,11 @@ RUN sed -i "s|http://archive.ubuntu.com|${apt_archive}|g" /etc/apt/sources.list
tzdata \
wget \
&& apt-get clean \
&& rm -rf \
/var/lib/apt/lists/* \
/var/cache/debconf \
/tmp/*
&& rm -rf /var/lib/apt/lists/* /var/cache/debconf /tmp/*
ARG REPO_CHANNEL="stable"
ARG REPOSITORY="deb [signed-by=/usr/share/keyrings/clickhouse-keyring.gpg] https://packages.clickhouse.com/deb ${REPO_CHANNEL} main"
ARG VERSION="24.1.1.2048"
ARG VERSION="24.1.5.6"
ARG PACKAGES="clickhouse-client clickhouse-server clickhouse-common-static"
# set non-empty deb_location_url url to create a docker image

View File

@ -118,13 +118,19 @@ if [ -n "$CLICKHOUSE_USER" ] && [ "$CLICKHOUSE_USER" != "default" ] || [ -n "$CL
EOT
fi
CLICKHOUSE_ALWAYS_RUN_INITDB_SCRIPTS="${CLICKHOUSE_ALWAYS_RUN_INITDB_SCRIPTS:-}"
# checking $DATA_DIR for initialization
if [ -d "${DATA_DIR%/}/data" ]; then
DATABASE_ALREADY_EXISTS='true'
fi
# only run initialization on an empty data directory
if [ -z "${DATABASE_ALREADY_EXISTS}" ]; then
# run initialization if flag CLICKHOUSE_ALWAYS_RUN_INITDB_SCRIPTS is not empty or data directory is empty
if [[ -n "${CLICKHOUSE_ALWAYS_RUN_INITDB_SCRIPTS}" || -z "${DATABASE_ALREADY_EXISTS}" ]]; then
RUN_INITDB_SCRIPTS='true'
fi
if [ -n "${RUN_INITDB_SCRIPTS}" ]; then
if [ -n "$(ls /docker-entrypoint-initdb.d/)" ] || [ -n "$CLICKHOUSE_DB" ]; then
# port is needed to check if clickhouse-server is ready for connections
HTTP_PORT="$(clickhouse extract-from-config --config-file "$CLICKHOUSE_CONFIG" --key=http_port --try)"
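
The same initialization decision, restated as a minimal Python sketch (illustration only; the data directory path is an assumption, the real entrypoint derives it from the server config):

```python
import os

data_dir = os.environ.get("CLICKHOUSE_DATA_DIR", "/var/lib/clickhouse")  # assumed default

database_already_exists = os.path.isdir(os.path.join(data_dir, "data"))
always_run = os.environ.get("CLICKHOUSE_ALWAYS_RUN_INITDB_SCRIPTS", "")

# Init scripts run when explicitly forced by the new flag or when the data directory is empty.
run_initdb_scripts = bool(always_run) or not database_already_exists
print("run initdb scripts:", run_initdb_scripts)
```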

View File

@ -13,7 +13,10 @@ RUN apt-get update \
zstd \
locales \
sudo \
--yes --no-install-recommends
--yes --no-install-recommends \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/* /var/cache/debconf /tmp/*
# Sanitizer options for services (clickhouse-server)
# Set resident memory limit for TSAN to 45GiB (46080MiB) to avoid OOMs in Stress tests

View File

@ -17,16 +17,20 @@ CLICKHOUSE_CI_LOGS_CLUSTER=${CLICKHOUSE_CI_LOGS_CLUSTER:-system_logs_export}
EXTRA_COLUMNS=${EXTRA_COLUMNS:-"pull_request_number UInt32, commit_sha String, check_start_time DateTime('UTC'), check_name LowCardinality(String), instance_type LowCardinality(String), instance_id String, INDEX ix_pr (pull_request_number) TYPE set(100), INDEX ix_commit (commit_sha) TYPE set(100), INDEX ix_check_time (check_start_time) TYPE minmax, "}
EXTRA_COLUMNS_EXPRESSION=${EXTRA_COLUMNS_EXPRESSION:-"CAST(0 AS UInt32) AS pull_request_number, '' AS commit_sha, now() AS check_start_time, toLowCardinality('') AS check_name, toLowCardinality('') AS instance_type, '' AS instance_id"}
EXTRA_ORDER_BY_COLUMNS=${EXTRA_ORDER_BY_COLUMNS:-"check_name, "}
EXTRA_ORDER_BY_COLUMNS=${EXTRA_ORDER_BY_COLUMNS:-"check_name"}
# trace_log needs more columns for symbolization
EXTRA_COLUMNS_TRACE_LOG="${EXTRA_COLUMNS} symbols Array(LowCardinality(String)), lines Array(LowCardinality(String)), "
EXTRA_COLUMNS_EXPRESSION_TRACE_LOG="${EXTRA_COLUMNS_EXPRESSION}, arrayMap(x -> demangle(addressToSymbol(x)), trace)::Array(LowCardinality(String)) AS symbols, arrayMap(x -> addressToLine(x), trace)::Array(LowCardinality(String)) AS lines"
# coverage_log needs more columns for symbolization, but only symbol names (the line numbers are too heavy to calculate)
EXTRA_COLUMNS_COVERAGE_LOG="${EXTRA_COLUMNS} symbols Array(LowCardinality(String)), "
EXTRA_COLUMNS_EXPRESSION_COVERAGE_LOG="${EXTRA_COLUMNS_EXPRESSION}, arrayMap(x -> demangle(addressToSymbol(x)), coverage)::Array(LowCardinality(String)) AS symbols"
function __set_connection_args
{
# It's impossible to use generous $CONNECTION_ARGS string, it's unsafe from word splitting perspective.
# It's impossible to use a generic $CONNECTION_ARGS string, it's unsafe from word splitting perspective.
# That's why we must stick to the generated option
CONNECTION_ARGS=(
--receive_timeout=45 --send_timeout=45 --secure
@ -129,6 +133,19 @@ function setup_logs_replication
debug_or_sanitizer_build=$(clickhouse-client -q "WITH ((SELECT value FROM system.build_options WHERE name='BUILD_TYPE') AS build, (SELECT value FROM system.build_options WHERE name='CXX_FLAGS') as flags) SELECT build='Debug' OR flags LIKE '%fsanitize%'")
echo "Build is debug or sanitizer: $debug_or_sanitizer_build"
# We will pre-create a table system.coverage_log.
# It is normally created by clickhouse-test rather than the server,
# so we will create it in advance to make it be picked up by the next commands:
clickhouse-client --query "
CREATE TABLE IF NOT EXISTS system.coverage_log
(
time DateTime COMMENT 'The time of test run',
test_name String COMMENT 'The name of the test',
coverage Array(UInt64) COMMENT 'An array of addresses of the code (a subset of addresses instrumented for coverage) that were encountered during the test run'
) ENGINE = Null COMMENT 'Contains information about per-test coverage from the CI, but used only for exporting to the CI cluster'
"
# For each system log table:
echo 'Create %_log tables'
clickhouse-client --query "SHOW TABLES FROM system LIKE '%\\_log'" | while read -r table
@ -139,11 +156,16 @@ function setup_logs_replication
# Do not try to resolve stack traces in case of debug/sanitizers
# build, since it is too slow (flushing of trace_log can take ~1min
# with such MV attached)
if [[ "$debug_or_sanitizer_build" = 1 ]]; then
if [[ "$debug_or_sanitizer_build" = 1 ]]
then
EXTRA_COLUMNS_EXPRESSION_FOR_TABLE="${EXTRA_COLUMNS_EXPRESSION}"
else
EXTRA_COLUMNS_EXPRESSION_FOR_TABLE="${EXTRA_COLUMNS_EXPRESSION_TRACE_LOG}"
fi
elif [[ "$table" = "coverage_log" ]]
then
EXTRA_COLUMNS_FOR_TABLE="${EXTRA_COLUMNS_COVERAGE_LOG}"
EXTRA_COLUMNS_EXPRESSION_FOR_TABLE="${EXTRA_COLUMNS_EXPRESSION_COVERAGE_LOG}"
else
EXTRA_COLUMNS_FOR_TABLE="${EXTRA_COLUMNS}"
EXTRA_COLUMNS_EXPRESSION_FOR_TABLE="${EXTRA_COLUMNS_EXPRESSION}"
@ -160,7 +182,7 @@ function setup_logs_replication
# Create the destination table with adapted name and structure:
statement=$(clickhouse-client --format TSVRaw --query "SHOW CREATE TABLE system.${table}" | sed -r -e '
s/^\($/('"$EXTRA_COLUMNS_FOR_TABLE"'/;
s/ORDER BY \(/ORDER BY ('"$EXTRA_ORDER_BY_COLUMNS"'/;
s/^ORDER BY (([^\(].+?)|\((.+?)\))$/ORDER BY ('"$EXTRA_ORDER_BY_COLUMNS"', \2\3)/;
s/^CREATE TABLE system\.\w+_log$/CREATE TABLE IF NOT EXISTS '"$table"'_'"$hash"'/;
/^TTL /d
')
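
The new `ORDER BY` rewrite accepts both bare and parenthesized sorting keys; below is a small Python sketch of the same substitution, assuming `EXTRA_ORDER_BY_COLUMNS=check_name` and Python ≥ 3.5 semantics (unmatched groups substitute as empty strings, mirroring `sed -r`):

```python
import re

extra = "check_name"
pattern = r"^ORDER BY (([^\(].+?)|\((.+?)\))$"

for line in ("ORDER BY event_date, event_time", "ORDER BY (event_date, event_time)"):
    print(re.sub(pattern, rf"ORDER BY ({extra}, \2\3)", line))
# Both lines become: ORDER BY (check_name, event_date, event_time)
```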
@ -168,7 +190,7 @@ function setup_logs_replication
echo -e "Creating remote destination table ${table}_${hash} with statement:\n${statement}" >&2
echo "$statement" | clickhouse-client --database_replicated_initial_query_timeout_sec=10 \
--distributed_ddl_task_timeout=30 \
--distributed_ddl_task_timeout=30 --distributed_ddl_output_mode=throw_only_active \
"${CONNECTION_ARGS[@]}" || continue
echo "Creating table system.${table}_sender" >&2

View File

@ -20,7 +20,9 @@ RUN apt-get update \
pv \
jq \
zstd \
--yes --no-install-recommends
--yes --no-install-recommends \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/* /var/cache/debconf /tmp/*
RUN pip3 install numpy==1.26.3 scipy==1.12.0 pandas==1.5.3 Jinja2==3.1.3
@ -31,12 +33,14 @@ RUN mkdir -p /tmp/clickhouse-odbc-tmp \
&& cp /tmp/clickhouse-odbc-tmp/lib64/*.so /usr/local/lib/ \
&& odbcinst -i -d -f /tmp/clickhouse-odbc-tmp/share/doc/clickhouse-odbc/config/odbcinst.ini.sample \
&& odbcinst -i -s -l -f /tmp/clickhouse-odbc-tmp/share/doc/clickhouse-odbc/config/odbc.ini.sample \
&& rm -rf /tmp/clickhouse-odbc-tmp \
&& rm -rf /tmp/clickhouse-odbc-tmp
# Give suid to gdb to grant it attach permissions
# chmod 777 to make the container user independent
RUN chmod u+s /usr/bin/gdb \
&& mkdir -p /var/lib/clickhouse \
&& chmod 777 /var/lib/clickhouse
# chmod 777 to make the container user independent
ENV TZ=Europe/Amsterdam
RUN ln -snf /usr/share/zoneinfo/$TZ /etc/localtime && echo $TZ > /etc/timezone

View File

@ -29,7 +29,7 @@ RUN apt-get update \
wget \
&& apt-get autoremove --yes \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/*
&& rm -rf /var/lib/apt/lists/* /var/cache/debconf /tmp/*
RUN pip3 install Jinja2

View File

@ -86,7 +86,7 @@ function download
chmod +x clickhouse
# clickhouse may be compressed - run once to decompress
./clickhouse ||:
./clickhouse --query "SELECT 1" ||:
ln -s ./clickhouse ./clickhouse-server
ln -s ./clickhouse ./clickhouse-client
ln -s ./clickhouse ./clickhouse-local
@ -387,10 +387,15 @@ if [ -f core.zst ]; then
fi
rg --text -F '<Fatal>' server.log > fatal.log ||:
FATAL_LINK=''
if [ -s fatal.log ]; then
FATAL_LINK='<a href="fatal.log">fatal.log</a>'
fi
dmesg -T > dmesg.log ||:
zstd --threads=0 server.log
zstd --threads=0 fuzzer.log
zstd --threads=0 --rm server.log
zstd --threads=0 --rm fuzzer.log
cat > report.html <<EOF ||:
<!DOCTYPE html>
@ -419,6 +424,7 @@ p.links a { padding: 5px; margin: 3px; background: #FFF; line-height: 2; white-s
<a href="main.log">main.log</a>
<a href="dmesg.log">dmesg.log</a>
${CORE_LINK}
${FATAL_LINK}
</p>
<table>
<tr>

View File

@ -10,13 +10,13 @@ ENV \
init=/lib/systemd/systemd
# install systemd packages
RUN apt-get update && \
apt-get install -y --no-install-recommends \
RUN apt-get update \
&& apt-get install -y --no-install-recommends \
sudo \
systemd \
&& \
apt-get clean && \
rm -rf /var/lib/apt/lists
\
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/* /var/cache/debconf /tmp/*
# configure systemd
# remove systemd 'wants' triggers

View File

@ -1,31 +1,27 @@
FROM ubuntu:20.04
MAINTAINER lgbo-ustc <lgbo.ustc@gmail.com>
RUN apt-get update
RUN apt-get install -y wget openjdk-8-jre
RUN wget https://archive.apache.org/dist/hadoop/common/hadoop-3.1.0/hadoop-3.1.0.tar.gz && \
tar -xf hadoop-3.1.0.tar.gz && rm -rf hadoop-3.1.0.tar.gz
RUN wget https://apache.apache.org/dist/hive/hive-2.3.9/apache-hive-2.3.9-bin.tar.gz && \
tar -xf apache-hive-2.3.9-bin.tar.gz && rm -rf apache-hive-2.3.9-bin.tar.gz
RUN apt install -y vim
RUN apt install -y openssh-server openssh-client
RUN apt install -y mysql-server
RUN mkdir -p /root/.ssh && \
ssh-keygen -t rsa -b 2048 -P '' -f /root/.ssh/id_rsa && \
cat /root/.ssh/id_rsa.pub > /root/.ssh/authorized_keys && \
cp /root/.ssh/id_rsa /etc/ssh/ssh_host_rsa_key && \
cp /root/.ssh/id_rsa.pub /etc/ssh/ssh_host_rsa_key.pub
RUN wget https://dev.mysql.com/get/Downloads/Connector-J/mysql-connector-java-8.0.27.tar.gz &&\
tar -xf mysql-connector-java-8.0.27.tar.gz && \
mv mysql-connector-java-8.0.27/mysql-connector-java-8.0.27.jar /apache-hive-2.3.9-bin/lib/ && \
rm -rf mysql-connector-java-8.0.27.tar.gz mysql-connector-java-8.0.27
RUN apt install -y iputils-ping net-tools
RUN apt-get update \
&& apt-get install -y wget openjdk-8-jre \
&& wget https://archive.apache.org/dist/hadoop/common/hadoop-3.1.0/hadoop-3.1.0.tar.gz \
&& tar -xf hadoop-3.1.0.tar.gz && rm -rf hadoop-3.1.0.tar.gz \
&& wget https://apache.apache.org/dist/hive/hive-2.3.9/apache-hive-2.3.9-bin.tar.gz \
&& tar -xf apache-hive-2.3.9-bin.tar.gz && rm -rf apache-hive-2.3.9-bin.tar.gz \
&& apt install -y vim \
&& apt install -y openssh-server openssh-client \
&& apt install -y mysql-server \
&& mkdir -p /root/.ssh \
&& ssh-keygen -t rsa -b 2048 -P '' -f /root/.ssh/id_rsa \
&& cat /root/.ssh/id_rsa.pub > /root/.ssh/authorized_keys \
&& cp /root/.ssh/id_rsa /etc/ssh/ssh_host_rsa_key \
&& cp /root/.ssh/id_rsa.pub /etc/ssh/ssh_host_rsa_key.pub \
&& wget https://dev.mysql.com/get/Downloads/Connector-J/mysql-connector-java-8.0.27.tar.gz \
&& tar -xf mysql-connector-java-8.0.27.tar.gz \
&& mv mysql-connector-java-8.0.27/mysql-connector-java-8.0.27.jar /apache-hive-2.3.9-bin/lib/ \
&& rm -rf mysql-connector-java-8.0.27.tar.gz mysql-connector-java-8.0.27 \
&& apt install -y iputils-ping net-tools \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/* /var/cache/debconf /tmp/*
ENV JAVA_HOME=/usr
ENV HADOOP_HOME=/hadoop-3.1.0
@ -44,4 +40,3 @@ COPY demo_data.txt /
ENV PATH=/apache-hive-2.3.9-bin/bin:/hadoop-3.1.0/bin:/hadoop-3.1.0/sbin:$PATH
RUN service ssh start && sed s/HOSTNAME/$HOSTNAME/ /hadoop-3.1.0/etc/hadoop/core-site.xml.template > /hadoop-3.1.0/etc/hadoop/core-site.xml && hdfs namenode -format
COPY start.sh /

View File

@ -3,14 +3,10 @@
FROM ubuntu:18.04
RUN apt-get update && \
apt-get install -y software-properties-common build-essential openjdk-8-jdk curl
RUN rm -rf \
/var/lib/apt/lists/* \
/var/cache/debconf \
/tmp/* \
RUN apt-get clean
RUN apt-get update \
&& apt-get install -y software-properties-common build-essential openjdk-8-jdk curl \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/* /var/cache/debconf /tmp/*
ARG ver=42.2.12
RUN curl -L -o /postgresql-java-${ver}.jar https://repo1.maven.org/maven2/org/postgresql/postgresql/${ver}/postgresql-${ver}.jar

View File

@ -37,11 +37,8 @@ RUN apt-get update \
libkrb5-dev \
krb5-user \
g++ \
&& rm -rf \
/var/lib/apt/lists/* \
/var/cache/debconf \
/tmp/* \
&& apt-get clean
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/* /var/cache/debconf /tmp/*
ENV TZ=Etc/UTC
RUN ln -snf /usr/share/zoneinfo/$TZ /etc/localtime && echo $TZ > /etc/timezone
@ -62,47 +59,49 @@ RUN curl -fsSL https://download.docker.com/linux/ubuntu/gpg | apt-key add - \
&& dockerd --version; docker --version
# kazoo 2.10.0 is broken
# https://s3.amazonaws.com/clickhouse-test-reports/59337/524625a1d2f4cc608a3f1059e3df2c30f353a649/integration_tests__asan__analyzer__[5_6].html
RUN python3 -m pip install --no-cache-dir \
PyMySQL \
aerospike==11.1.0 \
asyncio \
PyMySQL==1.1.0 \
asyncio==3.4.3 \
avro==1.10.2 \
azure-storage-blob \
boto3 \
cassandra-driver \
confluent-kafka==1.9.2 \
azure-storage-blob==12.19.0 \
boto3==1.34.24 \
cassandra-driver==3.29.0 \
confluent-kafka==2.3.0 \
delta-spark==2.3.0 \
dict2xml \
dicttoxml \
dict2xml==1.7.4 \
dicttoxml==1.7.16 \
docker==6.1.3 \
docker-compose==1.29.2 \
grpcio \
grpcio-tools \
kafka-python \
kazoo \
lz4 \
minio \
nats-py \
protobuf \
grpcio==1.60.0 \
grpcio-tools==1.60.0 \
kafka-python==2.0.2 \
lz4==4.3.3 \
minio==7.2.3 \
nats-py==2.6.0 \
protobuf==4.25.2 \
kazoo==2.9.0 \
psycopg2-binary==2.9.6 \
pyhdfs \
pyhdfs==0.3.1 \
pymongo==3.11.0 \
pyspark==3.3.2 \
pytest \
pytest==7.4.4 \
pytest-order==1.0.0 \
pytest-random \
pytest-repeat \
pytest-timeout \
pytest-xdist \
pytz \
pytest-random==0.2 \
pytest-repeat==0.9.3 \
pytest-timeout==2.2.0 \
pytest-xdist==3.5.0 \
pytest-reportlog==0.4.0 \
pytz==2023.3.post1 \
pyyaml==5.3.1 \
redis \
requests-kerberos \
redis==5.0.1 \
requests-kerberos==0.14.0 \
tzlocal==2.1 \
retry \
bs4 \
lxml \
urllib3
retry==0.9.2 \
bs4==0.0.2 \
lxml==5.1.0 \
urllib3==2.0.7
# bs4, lxml are for cloud tests, do not delete
# Hudi supports only spark 3.3.*, not 3.4

View File

@ -1,7 +1,7 @@
version: '2.3'
services:
mysql2:
image: mysql:5.7
image: mysql:8.0
restart: always
environment:
MYSQL_ROOT_PASSWORD: clickhouse
@ -23,7 +23,7 @@ services:
source: ${MYSQL_CLUSTER_LOGS:-}
target: /mysql/
mysql3:
image: mysql:5.7
image: mysql:8.0
restart: always
environment:
MYSQL_ROOT_PASSWORD: clickhouse
@ -45,7 +45,7 @@ services:
source: ${MYSQL_CLUSTER_LOGS:-}
target: /mysql/
mysql4:
image: mysql:5.7
image: mysql:8.0
restart: always
environment:
MYSQL_ROOT_PASSWORD: clickhouse

View File

@ -24,7 +24,10 @@ RUN mkdir "/root/.ssh"
RUN touch "/root/.ssh/known_hosts"
# install java
RUN apt-get update && apt-get install default-jre default-jdk libjna-java libjna-jni ssh gnuplot graphviz --yes --no-install-recommends
RUN apt-get update && \
apt-get install default-jre default-jdk libjna-java libjna-jni ssh gnuplot graphviz --yes --no-install-recommends \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/* /var/cache/debconf /tmp/*
# install clojure
RUN curl -O "https://download.clojure.org/install/linux-install-${CLOJURE_VERSION}.sh" && \

View File

@ -27,7 +27,7 @@ RUN apt-get update \
wget \
&& apt-get autoremove --yes \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/*
&& rm -rf /var/lib/apt/lists/* /var/cache/debconf /tmp/*
RUN pip3 install Jinja2

View File

@ -37,7 +37,7 @@ RUN apt-get update \
&& apt-get purge --yes python3-dev g++ \
&& apt-get autoremove --yes \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/*
&& rm -rf /var/lib/apt/lists/* /var/cache/debconf /tmp/*
COPY run.sh /

View File

@ -31,7 +31,9 @@ RUN mkdir "/root/.ssh"
RUN touch "/root/.ssh/known_hosts"
# install java
RUN apt-get update && apt-get install default-jre default-jdk libjna-java libjna-jni ssh gnuplot graphviz --yes --no-install-recommends
RUN apt-get update && apt-get install default-jre default-jdk libjna-java libjna-jni ssh gnuplot graphviz --yes --no-install-recommends \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/* /var/cache/debconf /tmp/*
# install clojure
RUN curl -O "https://download.clojure.org/install/linux-install-${CLOJURE_VERSION}.sh" && \

View File

@ -5,9 +5,10 @@ FROM ubuntu:22.04
ARG apt_archive="http://archive.ubuntu.com"
RUN sed -i "s|http://archive.ubuntu.com|$apt_archive|g" /etc/apt/sources.list
RUN apt-get update --yes && \
env DEBIAN_FRONTEND=noninteractive apt-get install wget git default-jdk maven python3 --yes --no-install-recommends && \
apt-get clean
RUN apt-get update --yes \
&& env DEBIAN_FRONTEND=noninteractive apt-get install wget git default-jdk maven python3 --yes --no-install-recommends \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/* /var/cache/debconf /tmp/*
# We need to get the repository's HEAD each time, so we invalidate the layers' cache
ARG CACHE_INVALIDATOR=0

View File

@ -15,7 +15,8 @@ RUN apt-get update --yes \
unixodbc-dev \
odbcinst \
sudo \
&& apt-get clean
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/* /var/cache/debconf /tmp/*
RUN pip3 install \
numpy \

View File

@ -11,7 +11,8 @@ RUN apt-get update --yes \
python3-dev \
python3-pip \
sudo \
&& apt-get clean
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/* /var/cache/debconf /tmp/*
RUN pip3 install \
pyyaml \

View File

@ -9,7 +9,8 @@ RUN apt-get update -y \
python3-requests \
nodejs \
npm \
&& apt-get clean
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/* /var/cache/debconf /tmp/*
COPY create.sql /
COPY run.sh /

View File

@ -44,9 +44,10 @@ RUN apt-get update -y \
pv \
zip \
p7zip-full \
&& apt-get clean
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/* /var/cache/debconf /tmp/*
RUN pip3 install numpy scipy pandas Jinja2 pyarrow
RUN pip3 install numpy==1.26.3 scipy==1.12.0 pandas==1.5.3 Jinja2==3.1.3 pyarrow==15.0.0
RUN mkdir -p /tmp/clickhouse-odbc-tmp \
&& wget -nv -O - ${odbc_driver_url} | tar --strip-components=1 -xz -C /tmp/clickhouse-odbc-tmp \
@ -73,7 +74,6 @@ RUN arch=${TARGETARCH:-amd64} \
&& wget "https://dl.min.io/client/mc/release/linux-${arch}/archive/mc.RELEASE.${MINIO_CLIENT_VERSION}" -O ./mc \
&& chmod +x ./mc ./minio
RUN wget --no-verbose 'https://archive.apache.org/dist/hadoop/common/hadoop-3.3.1/hadoop-3.3.1.tar.gz' \
&& tar -xvf hadoop-3.3.1.tar.gz \
&& rm -rf hadoop-3.3.1.tar.gz

View File

@ -11,4 +11,6 @@ VOLUME /packages
CMD apt-get update ;\
DEBIAN_FRONTEND=noninteractive \
apt install -y /packages/clickhouse-common-static_*.deb \
/packages/clickhouse-client_*.deb
/packages/clickhouse-client_*.deb \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/* /var/cache/debconf /tmp/*

View File

@ -185,11 +185,15 @@ function run_tests()
if [[ -n "$USE_DATABASE_REPLICATED" ]] && [[ "$USE_DATABASE_REPLICATED" -eq 1 ]]; then
ADDITIONAL_OPTIONS+=('--replicated-database')
# Too many tests fail for DatabaseReplicated in parallel.
ADDITIONAL_OPTIONS+=('--jobs')
ADDITIONAL_OPTIONS+=('2')
elif [[ 1 == $(clickhouse-client --query "SELECT value LIKE '%SANITIZE_COVERAGE%' FROM system.build_options WHERE name = 'CXX_FLAGS'") ]]; then
# Coverage on a per-test basis could only be collected sequentially.
# Do not set the --jobs parameter.
echo "Running tests with coverage collection."
else
# Too many tests fail for DatabaseReplicated in parallel. All other
# configurations are OK.
# All other configurations are OK.
ADDITIONAL_OPTIONS+=('--jobs')
ADDITIONAL_OPTIONS+=('8')
fi
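
Restated as a small Python sketch (the helper name is hypothetical, not part of the script):

```python
def jobs_options(replicated_database: bool, coverage_build: bool) -> list:
    """Mirror of the branching above: which --jobs options get appended."""
    if replicated_database:
        return ["--jobs", "2"]  # DatabaseReplicated: too many tests fail in parallel
    if coverage_build:
        return []               # per-test coverage can only be collected sequentially
    return ["--jobs", "8"]      # all other configurations

print(jobs_options(False, True))   # -> []
print(jobs_options(False, False))  # -> ['--jobs', '8']
```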

View File

@ -214,8 +214,7 @@ function check_server_start()
function check_logs_for_critical_errors()
{
# Sanitizer asserts
rg -Fa "==================" /var/log/clickhouse-server/stderr.log | rg -v "in query:" >> /test_output/tmp
rg -Fa "WARNING" /var/log/clickhouse-server/stderr.log >> /test_output/tmp
sed -n '/WARNING:.*anitizer/,/^$/p' >> /test_output/tmp
rg -Fav -e "ASan doesn't fully support makecontext/swapcontext functions" -e "DB::Exception" /test_output/tmp > /dev/null \
&& echo -e "Sanitizer assert (in stderr.log)$FAIL$(head_escaped /test_output/tmp)" >> /test_output/test_results.tsv \
|| echo -e "No sanitizer asserts$OK" >> /test_output/test_results.tsv
@ -233,8 +232,8 @@ function check_logs_for_critical_errors()
# Remove file logical_errors.txt if it's empty
[ -s /test_output/logical_errors.txt ] || rm /test_output/logical_errors.txt
# No such key errors
rg --text "Code: 499.*The specified key does not exist" /var/log/clickhouse-server/clickhouse-server*.log > /test_output/no_such_key_errors.txt \
# No such key errors (ignore a.myext which is used in 02724_database_s3.sh and does not exist)
rg --text "Code: 499.*The specified key does not exist" /var/log/clickhouse-server/clickhouse-server*.log | grep -v "a.myext" > /test_output/no_such_key_errors.txt \
&& echo -e "S3_ERROR No such key thrown (see clickhouse-server.log or no_such_key_errors.txt)$FAIL$(trim_server_logs no_such_key_errors.txt)" >> /test_output/test_results.tsv \
|| echo -e "No lost s3 keys$OK" >> /test_output/test_results.tsv

View File

@ -19,7 +19,8 @@ RUN apt-get update -y \
openssl \
netcat-openbsd \
brotli \
&& apt-get clean
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/* /var/cache/debconf /tmp/*
COPY run.sh /

View File

@ -21,6 +21,7 @@ RUN apt-get update && env DEBIAN_FRONTEND=noninteractive apt-get install --yes \
locales \
&& pip3 install black==23.1.0 boto3 codespell==2.2.1 mypy==1.3.0 PyGithub unidiff pylint==2.6.2 \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/* /var/cache/debconf /tmp/* \
&& rm -rf /root/.cache/pip
RUN echo "en_US.UTF-8 UTF-8" > /etc/locale.gen && locale-gen en_US.UTF-8

View File

@ -19,7 +19,8 @@ RUN apt-get update -y \
openssl \
netcat-openbsd \
brotli \
&& apt-get clean
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/* /var/cache/debconf /tmp/*
COPY run.sh /

View File

@ -77,11 +77,18 @@ remove_keeper_config "async_replication" "1"
# create_if_not_exists feature flag doesn't exist on some older versions
remove_keeper_config "create_if_not_exists" "[01]"
# latest_logs_cache_size_threshold setting doesn't exist on some older versions
remove_keeper_config "latest_logs_cache_size_threshold" "[[:digit:]]\+"
# commit_logs_cache_size_threshold setting doesn't exist on some older versions
remove_keeper_config "commit_logs_cache_size_threshold" "[[:digit:]]\+"
# it contains some new settings, but we can safely remove it
rm /etc/clickhouse-server/config.d/merge_tree.xml
rm /etc/clickhouse-server/config.d/enable_wait_for_shutdown_replicated_tables.xml
rm /etc/clickhouse-server/config.d/zero_copy_destructive_operations.xml
rm /etc/clickhouse-server/config.d/storage_conf_02963.xml
rm /etc/clickhouse-server/config.d/backoff_failed_mutation.xml
rm /etc/clickhouse-server/users.d/nonconst_timezone.xml
rm /etc/clickhouse-server/users.d/s3_cache_new.xml
rm /etc/clickhouse-server/users.d/replicated_ddl_entry.xml
@ -109,6 +116,12 @@ remove_keeper_config "async_replication" "1"
# create_if_not_exists feature flag doesn't exist on some older versions
remove_keeper_config "create_if_not_exists" "[01]"
# latest_logs_cache_size_threshold setting doesn't exist on some older versions
remove_keeper_config "latest_logs_cache_size_threshold" "[[:digit:]]\+"
# commit_logs_cache_size_threshold setting doesn't exist on some older versions
remove_keeper_config "commit_logs_cache_size_threshold" "[[:digit:]]\+"
# But we still need default disk because some tables loaded only into it
sudo cat /etc/clickhouse-server/config.d/s3_storage_policy_by_default.xml \
| sed "s|<main><disk>s3</disk></main>|<main><disk>s3</disk></main><default><disk>default</disk></default>|" \
@ -122,6 +135,7 @@ rm /etc/clickhouse-server/config.d/merge_tree.xml
rm /etc/clickhouse-server/config.d/enable_wait_for_shutdown_replicated_tables.xml
rm /etc/clickhouse-server/config.d/zero_copy_destructive_operations.xml
rm /etc/clickhouse-server/config.d/storage_conf_02963.xml
rm /etc/clickhouse-server/config.d/backoff_failed_mutation.xml
rm /etc/clickhouse-server/config.d/block_number.xml
rm /etc/clickhouse-server/users.d/nonconst_timezone.xml
rm /etc/clickhouse-server/users.d/s3_cache_new.xml

View File

@ -5,7 +5,6 @@ FROM ubuntu:22.04
ARG apt_archive="http://archive.ubuntu.com"
RUN sed -i "s|http://archive.ubuntu.com|$apt_archive|g" /etc/apt/sources.list
# 15.0.2
ENV DEBIAN_FRONTEND=noninteractive LLVM_VERSION=17
RUN apt-get update \
@ -27,9 +26,10 @@ RUN apt-get update \
&& export CODENAME="$(lsb_release --codename --short | tr 'A-Z' 'a-z')" \
&& echo "deb https://apt.llvm.org/${CODENAME}/ llvm-toolchain-${CODENAME}-${LLVM_VERSION} main" >> \
/etc/apt/sources.list \
&& apt-get clean
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/* /var/cache/debconf /tmp/*
# Install cmake 3.20+ for rust support
# Install cmake 3.20+ for Rust support
# Used https://askubuntu.com/a/1157132 as reference
RUN curl -s https://apt.kitware.com/keys/kitware-archive-latest.asc | \
gpg --dearmor - > /etc/apt/trusted.gpg.d/kitware.gpg && \
@ -60,9 +60,10 @@ RUN apt-get update \
software-properties-common \
tzdata \
--yes --no-install-recommends \
&& apt-get clean
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/* /var/cache/debconf /tmp/*
# This symlink required by gcc to find lld compiler
# This symlink is required by gcc to find the lld linker
RUN ln -s /usr/bin/lld-${LLVM_VERSION} /usr/bin/ld.lld
# for external_symbolizer_path
RUN ln -s /usr/bin/llvm-symbolizer-${LLVM_VERSION} /usr/bin/llvm-symbolizer
@ -107,5 +108,4 @@ RUN arch=${TARGETARCH:-amd64} \
&& mv "/tmp/sccache-$SCCACHE_VERSION-$rarch-unknown-linux-musl/sccache" /usr/bin \
&& rm "/tmp/sccache-$SCCACHE_VERSION-$rarch-unknown-linux-musl" -r
COPY process_functional_tests_result.py /

View File

@ -0,0 +1,31 @@
---
sidebar_position: 1
sidebar_label: 2024
---
# 2024 Changelog
### ClickHouse release v23.11.5.29-stable (d83b108deca) FIXME as compared to v23.11.4.24-stable (e79d840d7fe)
#### Improvement
* Backported in [#58815](https://github.com/ClickHouse/ClickHouse/issues/58815): Add `SYSTEM JEMALLOC PURGE` for purging unused jemalloc pages, `SYSTEM JEMALLOC [ ENABLE | DISABLE | FLUSH ] PROFILE` for controlling jemalloc profile if the profiler is enabled. Add jemalloc-related 4LW command in Keeper: `jmst` for dumping jemalloc stats, `jmfp`, `jmep`, `jmdp` for controlling jemalloc profile if the profiler is enabled. [#58665](https://github.com/ClickHouse/ClickHouse/pull/58665) ([Antonio Andelic](https://github.com/antonio2368)).
* Backported in [#59234](https://github.com/ClickHouse/ClickHouse/issues/59234): Allow to ignore schema evolution in Iceberg table engine and read all data using schema specified by the user on table creation or latest schema parsed from metadata on table creation. This is done under a setting `iceberg_engine_ignore_schema_evolution` that is disabled by default. Note that enabling this setting can lead to incorrect result as in case of evolved schema all data files will be read using the same schema. [#59133](https://github.com/ClickHouse/ClickHouse/pull/59133) ([Kruglov Pavel](https://github.com/Avogar)).
#### Bug Fix (user-visible misbehavior in an official stable release)
* Fix a stupid case of intersecting parts [#58482](https://github.com/ClickHouse/ClickHouse/pull/58482) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Fix stream partitioning in parallel window functions [#58739](https://github.com/ClickHouse/ClickHouse/pull/58739) ([Dmitry Novik](https://github.com/novikd)).
* Fix double destroy call on exception throw in addBatchLookupTable8 [#58745](https://github.com/ClickHouse/ClickHouse/pull/58745) ([Raúl Marín](https://github.com/Algunenano)).
* Fix JSONExtract function for LowCardinality(Nullable) columns [#58808](https://github.com/ClickHouse/ClickHouse/pull/58808) ([vdimir](https://github.com/vdimir)).
* Fix: LIMIT BY and LIMIT in distributed query [#59153](https://github.com/ClickHouse/ClickHouse/pull/59153) ([Igor Nikonov](https://github.com/devcrafter)).
* Fix not-ready set for system.tables [#59351](https://github.com/ClickHouse/ClickHouse/pull/59351) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix translate() with FixedString input [#59356](https://github.com/ClickHouse/ClickHouse/pull/59356) ([Raúl Marín](https://github.com/Algunenano)).
#### NOT FOR CHANGELOG / INSIGNIFICANT
* refine error message [#57991](https://github.com/ClickHouse/ClickHouse/pull/57991) ([Han Fei](https://github.com/hanfei1991)).
* Fix rare race in external sort/aggregation with temporary data in cache [#58013](https://github.com/ClickHouse/ClickHouse/pull/58013) ([Anton Popov](https://github.com/CurtizJ)).
* Follow-up to [#58482](https://github.com/ClickHouse/ClickHouse/issues/58482) [#58574](https://github.com/ClickHouse/ClickHouse/pull/58574) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Fix possible race in ManyAggregatedData dtor. [#58624](https://github.com/ClickHouse/ClickHouse/pull/58624) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Decrease log level for one log message [#59168](https://github.com/ClickHouse/ClickHouse/pull/59168) ([Kseniia Sumarokova](https://github.com/kssenii)).

View File

@ -0,0 +1,36 @@
---
sidebar_position: 1
sidebar_label: 2024
---
# 2024 Changelog
### ClickHouse release v23.12.3.40-stable (a594704ae75) FIXME as compared to v23.12.2.59-stable (17ab210e761)
#### Improvement
* Backported in [#58660](https://github.com/ClickHouse/ClickHouse/issues/58660): When executing some queries, which require a lot of streams for reading data, the error `"Paste JOIN requires sorted tables only"` was previously thrown. Now the numbers of streams resize to 1 in that case. [#58608](https://github.com/ClickHouse/ClickHouse/pull/58608) ([Yarik Briukhovetskyi](https://github.com/yariks5s)).
* Backported in [#58817](https://github.com/ClickHouse/ClickHouse/issues/58817): Add `SYSTEM JEMALLOC PURGE` for purging unused jemalloc pages, `SYSTEM JEMALLOC [ ENABLE | DISABLE | FLUSH ] PROFILE` for controlling jemalloc profile if the profiler is enabled. Add jemalloc-related 4LW command in Keeper: `jmst` for dumping jemalloc stats, `jmfp`, `jmep`, `jmdp` for controlling jemalloc profile if the profiler is enabled. [#58665](https://github.com/ClickHouse/ClickHouse/pull/58665) ([Antonio Andelic](https://github.com/antonio2368)).
* Backported in [#59235](https://github.com/ClickHouse/ClickHouse/issues/59235): Allow to ignore schema evolution in Iceberg table engine and read all data using schema specified by the user on table creation or latest schema parsed from metadata on table creation. This is done under a setting `iceberg_engine_ignore_schema_evolution` that is disabled by default. Note that enabling this setting can lead to incorrect result as in case of evolved schema all data files will be read using the same schema. [#59133](https://github.com/ClickHouse/ClickHouse/pull/59133) ([Kruglov Pavel](https://github.com/Avogar)).
#### Bug Fix (user-visible misbehavior in an official stable release)
* Delay reading from StorageKafka to allow multiple reads in materialized views [#58477](https://github.com/ClickHouse/ClickHouse/pull/58477) ([János Benjamin Antal](https://github.com/antaljanosbenjamin)).
* Fix a stupid case of intersecting parts [#58482](https://github.com/ClickHouse/ClickHouse/pull/58482) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Disable max_joined_block_rows in ConcurrentHashJoin [#58595](https://github.com/ClickHouse/ClickHouse/pull/58595) ([vdimir](https://github.com/vdimir)).
* Fix stream partitioning in parallel window functions [#58739](https://github.com/ClickHouse/ClickHouse/pull/58739) ([Dmitry Novik](https://github.com/novikd)).
* Fix double destroy call on exception throw in addBatchLookupTable8 [#58745](https://github.com/ClickHouse/ClickHouse/pull/58745) ([Raúl Marín](https://github.com/Algunenano)).
* Fix JSONExtract function for LowCardinality(Nullable) columns [#58808](https://github.com/ClickHouse/ClickHouse/pull/58808) ([vdimir](https://github.com/vdimir)).
* Multiple read file log storage in mv [#58877](https://github.com/ClickHouse/ClickHouse/pull/58877) ([János Benjamin Antal](https://github.com/antaljanosbenjamin)).
* Fix: LIMIT BY and LIMIT in distributed query [#59153](https://github.com/ClickHouse/ClickHouse/pull/59153) ([Igor Nikonov](https://github.com/devcrafter)).
* Fix not-ready set for system.tables [#59351](https://github.com/ClickHouse/ClickHouse/pull/59351) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix translate() with FixedString input [#59356](https://github.com/ClickHouse/ClickHouse/pull/59356) ([Raúl Marín](https://github.com/Algunenano)).
#### NOT FOR CHANGELOG / INSIGNIFICANT
* Follow-up to [#58482](https://github.com/ClickHouse/ClickHouse/issues/58482) [#58574](https://github.com/ClickHouse/ClickHouse/pull/58574) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Fix possible race in ManyAggregatedData dtor. [#58624](https://github.com/ClickHouse/ClickHouse/pull/58624) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Change log level for super important message in Keeper [#59010](https://github.com/ClickHouse/ClickHouse/pull/59010) ([alesapin](https://github.com/alesapin)).
* Decrease log level for one log message [#59168](https://github.com/ClickHouse/ClickHouse/pull/59168) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Fix fasttest by pinning pip dependencies [#59256](https://github.com/ClickHouse/ClickHouse/pull/59256) ([Azat Khuzhin](https://github.com/azat)).
* No debug symbols in Rust [#59306](https://github.com/ClickHouse/ClickHouse/pull/59306) ([Alexey Milovidov](https://github.com/alexey-milovidov)).

View File

@ -0,0 +1,21 @@
---
sidebar_position: 1
sidebar_label: 2024
---
# 2024 Changelog
### ClickHouse release v23.12.4.15-stable (4233d111d20) FIXME as compared to v23.12.3.40-stable (a594704ae75)
#### Bug Fix (user-visible misbehavior in an official stable release)
* Fix incorrect result of arrayElement / map[] on empty value [#59594](https://github.com/ClickHouse/ClickHouse/pull/59594) ([Raúl Marín](https://github.com/Algunenano)).
* Fix crash in topK when merging empty states [#59603](https://github.com/ClickHouse/ClickHouse/pull/59603) ([Raúl Marín](https://github.com/Algunenano)).
* Fix distributed table with a constant sharding key [#59606](https://github.com/ClickHouse/ClickHouse/pull/59606) ([Vitaly Baranov](https://github.com/vitlibar)).
* Fix leftPad / rightPad function with FixedString input [#59739](https://github.com/ClickHouse/ClickHouse/pull/59739) ([Raúl Marín](https://github.com/Algunenano)).
#### NOT FOR CHANGELOG / INSIGNIFICANT
* Fix 02720_row_policy_column_with_dots [#59453](https://github.com/ClickHouse/ClickHouse/pull/59453) ([Duc Canh Le](https://github.com/canhld94)).
* Pin python dependencies in stateless tests [#59663](https://github.com/ClickHouse/ClickHouse/pull/59663) ([Raúl Marín](https://github.com/Algunenano)).

View File

@ -0,0 +1,14 @@
---
sidebar_position: 1
sidebar_label: 2024
---
# 2024 Changelog
### ClickHouse release v24.1.2.5-stable (b2605dd4a5a) FIXME as compared to v24.1.1.2048-stable (5a024dfc093)
#### Bug Fix (user-visible misbehavior in an official stable release)
* Fix translate() with FixedString input [#59356](https://github.com/ClickHouse/ClickHouse/pull/59356) ([Raúl Marín](https://github.com/Algunenano)).
* Fix stacktraces for binaries without debug symbols [#59444](https://github.com/ClickHouse/ClickHouse/pull/59444) ([Azat Khuzhin](https://github.com/azat)).

View File

@ -0,0 +1,34 @@
---
sidebar_position: 1
sidebar_label: 2024
---
# 2024 Changelog
### ClickHouse release v24.1.3.31-stable (135b08cbd28) FIXME as compared to v24.1.2.5-stable (b2605dd4a5a)
#### Improvement
* Backported in [#59569](https://github.com/ClickHouse/ClickHouse/issues/59569): Now dashboard understands both compressed and uncompressed state of URL's #hash (backward compatibility). Continuation of [#59124](https://github.com/ClickHouse/ClickHouse/issues/59124) . [#59548](https://github.com/ClickHouse/ClickHouse/pull/59548) ([Amos Bird](https://github.com/amosbird)).
* Backported in [#59776](https://github.com/ClickHouse/ClickHouse/issues/59776): Added settings `split_parts_ranges_into_intersecting_and_non_intersecting_final` and `split_intersecting_parts_ranges_into_layers_final`. These settings are needed to disable optimizations for queries with `FINAL` and are intended for debugging only. [#59705](https://github.com/ClickHouse/ClickHouse/pull/59705) ([Maksim Kita](https://github.com/kitaisreal)).
#### Bug Fix (user-visible misbehavior in an official stable release)
* Fix `ASTAlterCommand::formatImpl` in case of column specific settings… [#59445](https://github.com/ClickHouse/ClickHouse/pull/59445) ([János Benjamin Antal](https://github.com/antaljanosbenjamin)).
* Make MAX use the same rules as permutation for complex types [#59498](https://github.com/ClickHouse/ClickHouse/pull/59498) ([Raúl Marín](https://github.com/Algunenano)).
* Fix corner case when passing `update_insert_deduplication_token_in_dependent_materialized_views` [#59544](https://github.com/ClickHouse/ClickHouse/pull/59544) ([Jordi Villar](https://github.com/jrdi)).
* Fix incorrect result of arrayElement / map[] on empty value [#59594](https://github.com/ClickHouse/ClickHouse/pull/59594) ([Raúl Marín](https://github.com/Algunenano)).
* Fix crash in topK when merging empty states [#59603](https://github.com/ClickHouse/ClickHouse/pull/59603) ([Raúl Marín](https://github.com/Algunenano)).
* Maintain function alias in RewriteSumFunctionWithSumAndCountVisitor [#59658](https://github.com/ClickHouse/ClickHouse/pull/59658) ([Raúl Marín](https://github.com/Algunenano)).
* Fix leftPad / rightPad function with FixedString input [#59739](https://github.com/ClickHouse/ClickHouse/pull/59739) ([Raúl Marín](https://github.com/Algunenano)).
#### NO CL ENTRY
* NO CL ENTRY: 'Revert "Backport [#59650](https://github.com/ClickHouse/ClickHouse/issues/59650) to 24.1: MergeTree FINAL optimization diagnostics and settings"'. [#59701](https://github.com/ClickHouse/ClickHouse/pull/59701) ([Raúl Marín](https://github.com/Algunenano)).
#### NOT FOR CHANGELOG / INSIGNIFICANT
* Fix 02720_row_policy_column_with_dots [#59453](https://github.com/ClickHouse/ClickHouse/pull/59453) ([Duc Canh Le](https://github.com/canhld94)).
* Refactoring of dashboard state encoding [#59554](https://github.com/ClickHouse/ClickHouse/pull/59554) ([Sergei Trifonov](https://github.com/serxa)).
* MergeTree FINAL optimization diagnostics and settings [#59650](https://github.com/ClickHouse/ClickHouse/pull/59650) ([Maksim Kita](https://github.com/kitaisreal)).
* Pin python dependencies in stateless tests [#59663](https://github.com/ClickHouse/ClickHouse/pull/59663) ([Raúl Marín](https://github.com/Algunenano)).

View File

@ -0,0 +1,28 @@
---
sidebar_position: 1
sidebar_label: 2024
---
# 2024 Changelog
### ClickHouse release v24.1.4.20-stable (f59d842b3fa) FIXME as compared to v24.1.3.31-stable (135b08cbd28)
#### Improvement
* Backported in [#59826](https://github.com/ClickHouse/ClickHouse/issues/59826): In case when `merge_max_block_size_bytes` is small enough and tables contain wide rows (strings or tuples), background merges may get stuck in an endless loop. This behaviour is fixed. Follow-up for https://github.com/ClickHouse/ClickHouse/pull/59340. [#59812](https://github.com/ClickHouse/ClickHouse/pull/59812) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
#### Build/Testing/Packaging Improvement
* Backported in [#59885](https://github.com/ClickHouse/ClickHouse/issues/59885): If you want to run initdb scripts every time the ClickHouse container starts, set the environment variable CLICKHOUSE_ALWAYS_RUN_INITDB_SCRIPTS. [#59808](https://github.com/ClickHouse/ClickHouse/pull/59808) ([Alexander Nikolaev](https://github.com/AlexNik)).
#### Bug Fix (user-visible misbehavior in an official stable release)
* Fix digest calculation in Keeper [#59439](https://github.com/ClickHouse/ClickHouse/pull/59439) ([Antonio Andelic](https://github.com/antonio2368)).
* Fix distributed table with a constant sharding key [#59606](https://github.com/ClickHouse/ClickHouse/pull/59606) ([Vitaly Baranov](https://github.com/vitlibar)).
* Fix query start time on non initial queries [#59662](https://github.com/ClickHouse/ClickHouse/pull/59662) ([Raúl Marín](https://github.com/Algunenano)).
* Fix parsing of partition expressions surrounded by parens [#59901](https://github.com/ClickHouse/ClickHouse/pull/59901) ([János Benjamin Antal](https://github.com/antaljanosbenjamin)).
#### NOT FOR CHANGELOG / INSIGNIFICANT
* Temporarily remove a feature that doesn't work [#59688](https://github.com/ClickHouse/ClickHouse/pull/59688) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Make ZooKeeper actually sequentially consistent [#59735](https://github.com/ClickHouse/ClickHouse/pull/59735) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Fix special build reports in release branches [#59797](https://github.com/ClickHouse/ClickHouse/pull/59797) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).

View File

@ -0,0 +1,17 @@
---
sidebar_position: 1
sidebar_label: 2024
---
# 2024 Changelog
### ClickHouse release v24.1.5.6-stable (7f67181ff31) FIXME as compared to v24.1.4.20-stable (f59d842b3fa)
#### Bug Fix (user-visible misbehavior in an official stable release)
* UniqExactSet read crash fix [#59928](https://github.com/ClickHouse/ClickHouse/pull/59928) ([Maksim Kita](https://github.com/kitaisreal)).
#### NOT FOR CHANGELOG / INSIGNIFICANT
* CI: do not reuse builds on release branches [#59798](https://github.com/ClickHouse/ClickHouse/pull/59798) ([Max K.](https://github.com/maxknv)).

View File

@ -166,11 +166,11 @@ For most external applications, we recommend using the HTTP interface because it
## Configuration {#configuration}
ClickHouse Server is based on POCO C++ Libraries and uses `Poco::Util::AbstractConfiguration` to represent it's configuration. Configuration is held by `Poco::Util::ServerApplication` class inherited by `DaemonBase` class, which in turn is inherited by `DB::Server` class, implementing clickhouse-server itself. So config can be accessed by `ServerApplication::config()` method.
ClickHouse Server is based on POCO C++ Libraries and uses `Poco::Util::AbstractConfiguration` to represent its configuration. Configuration is held by `Poco::Util::ServerApplication` class inherited by `DaemonBase` class, which in turn is inherited by `DB::Server` class, implementing clickhouse-server itself. So config can be accessed by `ServerApplication::config()` method.
Config is read from multiple files (in XML or YAML format) and merged into single `AbstractConfiguration` by `ConfigProcessor` class. Configuration is loaded at server startup and can be reloaded later if one of config files is updated, removed or added. `ConfigReloader` class is responsible for periodic monitoring of these changes and reload procedure as well. `SYSTEM RELOAD CONFIG` query also triggers config to be reloaded.
For queries and subsystems other than `Server` config is accessible using `Context::getConfigRef()` method. Every subsystem that is capable of reloading it's config without server restart should register itself in reload callback in `Server::main()` method. Note that if newer config has an error, most subsystems will ignore new config, log warning messages and keep working with previously loaded config. Due to the nature of `AbstractConfiguration` it is not possible to pass reference to specific section, so `String config_prefix` is usually used instead.
For queries and subsystems other than `Server` config is accessible using `Context::getConfigRef()` method. Every subsystem that is capable of reloading its config without server restart should register itself in reload callback in `Server::main()` method. Note that if newer config has an error, most subsystems will ignore new config, log warning messages and keep working with previously loaded config. Due to the nature of `AbstractConfiguration` it is not possible to pass reference to specific section, so `String config_prefix` is usually used instead.
## Threads and jobs {#threads-and-jobs}
@ -255,7 +255,7 @@ When we are going to read something from a part in `MergeTree`, we look at `prim
When you `INSERT` a bunch of data into `MergeTree`, that bunch is sorted by primary key order and forms a new part. There are background threads that periodically select some parts and merge them into a single sorted part to keep the number of parts relatively low. That's why it is called `MergeTree`. Of course, merging leads to “write amplification”. All parts are immutable: they are only created and deleted, but not modified. When SELECT is executed, it holds a snapshot of the table (a set of parts). After merging, we also keep old parts for some time to make a recovery after failure easier, so if we see that some merged part is probably broken, we can replace it with its source parts.
`MergeTree` is not an LSM tree because it does not contain MEMTABLE and LOG: inserted data is written directly to the filesystem. This behavior makes MergeTree much more suitable to insert data in batches. Therefore frequently inserting small amounts of rows is not ideal for MergeTree. For example, a couple of rows per second is OK, but doing it a thousand times a second is not optimal for MergeTree. However, there is an async insert mode for small inserts to overcome this limitation. We did it this way for simplicity's sake, and because we are already inserting data in batches in our applications
`MergeTree` is not an LSM tree because it does not contain MEMTABLE and LOG: inserted data is written directly to the filesystem. This behavior makes MergeTree much more suitable to insert data in batches. Therefore, frequently inserting small amounts of rows is not ideal for MergeTree. For example, a couple of rows per second is OK, but doing it a thousand times a second is not optimal for MergeTree. However, there is an async insert mode for small inserts to overcome this limitation. We did it this way for simplicity's sake, and because we are already inserting data in batches in our applications
There are MergeTree engines that are doing additional work during background merges. Examples are `CollapsingMergeTree` and `AggregatingMergeTree`. This could be treated as special support for updates. Keep in mind that these are not real updates because users usually have no control over the time when background merges are executed, and data in a `MergeTree` table is almost always stored in more than one part, not in completely merged form.

View File

@ -38,7 +38,7 @@ ninja
## Running
Once built, the binary can be run with, eg.:
Once built, the binary can be run with, e.g.:
```bash
qemu-s390x-static -L /usr/s390x-linux-gnu ./clickhouse

View File

@ -37,7 +37,7 @@ sudo xcode-select --install
``` bash
brew update
brew install ccache cmake ninja libtool gettext llvm gcc binutils grep findutils
brew install ccache cmake ninja libtool gettext llvm gcc binutils grep findutils nasm
```
## Checkout ClickHouse Sources {#checkout-clickhouse-sources}

View File

@ -95,7 +95,7 @@ Complete below three steps mentioned in [Star Schema Benchmark](https://clickhou
- Inserting data. Here should use `./benchmark_sample/rawdata_dir/ssb-dbgen/*.tbl` as input data.
- Converting “star schema” to de-normalized “flat schema”
Set up database with with IAA Deflate codec
Set up database with IAA Deflate codec
``` bash
$ cd ./database_dir/deflate
@ -104,7 +104,7 @@ $ [CLICKHOUSE_EXE] client
```
Complete three steps same as lz4 above
Set up database with with ZSTD codec
Set up database with ZSTD codec
``` bash
$ cd ./database_dir/zstd

View File

@ -13,7 +13,7 @@ ClickHouse utilizes third-party libraries for different purposes, e.g., to conne
SELECT library_name, license_type, license_path FROM system.licenses ORDER BY library_name COLLATE 'en';
```
(Note that the listed libraries are the ones located in the `contrib/` directory of the ClickHouse repository. Depending on the build options, some of of the libraries may have not been compiled, and as a result, their functionality may not be available at runtime.
Note that the listed libraries are the ones located in the `contrib/` directory of the ClickHouse repository. Depending on the build options, some of the libraries may have not been compiled, and as a result, their functionality may not be available at runtime.
[Example](https://play.clickhouse.com/play?user=play#U0VMRUNUIGxpYnJhcnlfbmFtZSwgbGljZW5zZV90eXBlLCBsaWNlbnNlX3BhdGggRlJPTSBzeXN0ZW0ubGljZW5zZXMgT1JERVIgQlkgbGlicmFyeV9uYW1lIENPTExBVEUgJ2VuJw==)

View File

@ -7,13 +7,13 @@ description: Prerequisites and an overview of how to build ClickHouse
# Getting Started Guide for Building ClickHouse
ClickHouse can be build on Linux, FreeBSD and macOS. If you use Windows, you can still build ClickHouse in a virtual machine running Linux, e.g. [VirtualBox](https://www.virtualbox.org/) with Ubuntu.
ClickHouse can be built on Linux, FreeBSD and macOS. If you use Windows, you can still build ClickHouse in a virtual machine running Linux, e.g. [VirtualBox](https://www.virtualbox.org/) with Ubuntu.
ClickHouse requires a 64-bit system to compile and run, 32-bit systems do not work.
## Creating a Repository on GitHub {#creating-a-repository-on-github}
To start developing for ClickHouse you will need a [GitHub](https://www.virtualbox.org/) account. Please also generate a SSH key locally (if you don't have one already) and upload the public key to GitHub as this is a prerequisite for contributing patches.
To start developing for ClickHouse you will need a [GitHub](https://www.virtualbox.org/) account. Please also generate an SSH key locally (if you don't have one already) and upload the public key to GitHub as this is a prerequisite for contributing patches.
Next, create a fork of the [ClickHouse repository](https://github.com/ClickHouse/ClickHouse/) in your personal account by clicking the "fork" button in the upper right corner.
@ -37,7 +37,7 @@ git clone git@github.com:your_github_username/ClickHouse.git # replace placehol
cd ClickHouse
```
This command creates a directory `ClickHouse/` containing the source code of ClickHouse. If you specify a custom checkout directory after the URL but it is important that this path does not contain whitespaces as it may lead to problems with the build later on.
This command creates a directory `ClickHouse/` containing the source code of ClickHouse. You can specify a custom checkout directory after the URL, but it is important that this path does not contain whitespace as it may lead to problems with the build later on.
The ClickHouse repository uses Git submodules, i.e. references to external repositories (usually 3rd party libraries used by ClickHouse). These are not checked out by default. To do so, you can either
@ -45,7 +45,7 @@ The ClickHouse repository uses Git submodules, i.e. references to external repos
- if `git clone` did not check out submodules, run `git submodule update --init --jobs <N>` (e.g. `<N> = 12` to parallelize the checkout) to achieve the same as the previous alternative, or
- if `git clone` did not check out submodules and you like to use [sparse](https://github.blog/2020-01-17-bring-your-monorepo-down-to-size-with-sparse-checkout/) and [shallow](https://github.blog/2020-12-21-get-up-to-speed-with-partial-clone-and-shallow-clone/) submodule checkout to omit unneeded files and history in submodules to save space (ca. 5 GB instead of ca. 15 GB), run `./contrib/update-submodules.sh`. Not really recommended as it generally makes working with submodules less convenient and slower.
- if `git clone` did not check out submodules, and you like to use [sparse](https://github.blog/2020-01-17-bring-your-monorepo-down-to-size-with-sparse-checkout/) and [shallow](https://github.blog/2020-12-21-get-up-to-speed-with-partial-clone-and-shallow-clone/) submodule checkout to omit unneeded files and history in submodules to save space (ca. 5 GB instead of ca. 15 GB), run `./contrib/update-submodules.sh`. Not really recommended as it generally makes working with submodules less convenient and slower.
You can check the Git status with the command: `git submodule status`.
@ -91,7 +91,7 @@ If you use Arch or Gentoo, you probably know it yourself how to install CMake.
## C++ Compiler {#c-compiler}
Compilers Clang starting from version 15 is supported for building ClickHouse.
Compilers Clang starting from version 16 is supported for building ClickHouse.
Clang should be used instead of gcc. Though, our continuous integration (CI) platform runs checks for about a dozen of build combinations.
@ -143,7 +143,7 @@ When a large amount of RAM is available on build machine you should limit the nu
On machines with 4GB of RAM, it is recommended to specify 1, for 8GB of RAM `-j 2` is recommended.
If you get the message: `ninja: error: loading 'build.ninja': No such file or directory`, it means that generating a build configuration has failed and you need to inspect the message above.
If you get the message: `ninja: error: loading 'build.ninja': No such file or directory`, it means that generating a build configuration has failed, and you need to inspect the message above.
Upon the successful start of the building process, you'll see the build progress - the number of processed tasks and the total number of tasks.
@ -184,7 +184,7 @@ You can also run your custom-built ClickHouse binary with the config file from t
**CLion (recommended)**
If you do not know which IDE to use, we recommend that you use [CLion](https://www.jetbrains.com/clion/). CLion is commercial software but it offers a 30 day free trial. It is also free of charge for students. CLion can be used on both Linux and macOS.
If you do not know which IDE to use, we recommend that you use [CLion](https://www.jetbrains.com/clion/). CLion is commercial software, but it offers a 30 day free trial. It is also free of charge for students. CLion can be used on both Linux and macOS.
A few things to know when using CLion to develop ClickHouse:

View File

@ -10,7 +10,7 @@ Allows to connect to databases on a remote [PostgreSQL](https://www.postgresql.o
Gives the real-time access to table list and table structure from remote PostgreSQL with the help of `SHOW TABLES` and `DESCRIBE TABLE` queries.
Supports table structure modifications (`ALTER TABLE ... ADD|DROP COLUMN`). If `use_table_cache` parameter (see the Engine Parameters below) it set to `1`, the table structure is cached and not checked for being modified, but can be updated with `DETACH` and `ATTACH` queries.
Supports table structure modifications (`ALTER TABLE ... ADD|DROP COLUMN`). If `use_table_cache` parameter (see the Engine Parameters below) is set to `1`, the table structure is cached and not checked for being modified, but can be updated with `DETACH` and `ATTACH` queries.
## Creating a Database {#creating-a-database}

View File

@ -38,6 +38,7 @@ CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]
[nats_username = 'user',]
[nats_password = 'password',]
[nats_token = 'clickhouse',]
[nats_credential_file = '/var/nats_credentials',]
[nats_startup_connect_tries = '5']
[nats_max_rows_per_message = 1,]
[nats_handle_error_mode = 'default']
@ -63,6 +64,7 @@ Optional parameters:
- `nats_username` - NATS username.
- `nats_password` - NATS password.
- `nats_token` - NATS auth token.
- `nats_credential_file` - Path to a NATS credentials file.
- `nats_startup_connect_tries` - Number of connect tries at startup. Default: `5`.
- `nats_max_rows_per_message` — The maximum number of rows written in one NATS message for row-based formats. (default : `1`).
- `nats_handle_error_mode` — How to handle errors for NATS engine. Possible values: default (the exception will be thrown if we fail to parse a message), stream (the exception message and raw message will be saved in virtual columns `_error` and `_raw_message`).
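
A hedged sketch of how the new `nats_credential_file` setting might appear alongside the required engine settings; `nats_url`, `nats_subjects` and `nats_format` are taken from the rest of this page, and all values are placeholders:

```python
# Illustrative DDL only; run it with clickhouse-client or any client library.
ddl = """
CREATE TABLE nats_queue (key UInt64, value String)
ENGINE = NATS
SETTINGS nats_url = 'localhost:4444',
         nats_subjects = 'subject1',
         nats_format = 'JSONEachRow',
         nats_credential_file = '/var/nats_credentials',
         nats_startup_connect_tries = '5'
"""
print(ddl)
```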

View File

@ -2,7 +2,7 @@
Nearest neighborhood search is the problem of finding the M closest points for a given point in an N-dimensional vector space. The most
straightforward approach to solve this problem is a brute force search where the distance between all points in the vector space and the
reference point is computed. This method guarantees perfect accuracy but it is usually too slow for practical applications. Thus, nearest
reference point is computed. This method guarantees perfect accuracy, but it is usually too slow for practical applications. Thus, nearest
neighborhood search problems are often solved with [approximative algorithms](https://github.com/erikbern/ann-benchmarks). Approximative
nearest neighborhood search techniques, in conjunction with [embedding
methods](https://cloud.google.com/architecture/overview-extracting-and-serving-feature-embeddings-for-machine-learning), allow searching huge
@ -24,7 +24,7 @@ LIMIT N
`vectors` contains N-dimensional values of type [Array](../../../sql-reference/data-types/array.md) or
[Tuple](../../../sql-reference/data-types/tuple.md), for example embeddings. Function `Distance` computes the distance between two vectors.
Often, the the Euclidean (L2) distance is chosen as distance function but [other
Often, the Euclidean (L2) distance is chosen as the distance function, but [other
distance functions](/docs/en/sql-reference/functions/distance-functions.md) are also possible. `Point` is the reference point, e.g. `(0.17,
0.33, ...)`, and `N` limits the number of search results.
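A concrete instance of the pattern above (the table and column names are illustrative; `L2Distance` stands in for `Distance`):

```sql
-- Find the 10 rows whose vectors are closest to the reference point.
SELECT id
FROM table_with_ann_index
ORDER BY L2Distance(vectors, [0.17, 0.33, 0.25])
LIMIT 10;
```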
@ -109,7 +109,7 @@ clickhouse-client --param_vec='hello' --query="SELECT * FROM table_with_ann_inde
**Restrictions**: Queries that contain both a `WHERE Distance(vectors, Point) < MaxDistance` and an `ORDER BY Distance(vectors, Point)`
clause cannot use ANN indexes. Also, the approximate algorithms used to determine the nearest neighbors require a limit, hence queries
without `LIMIT` clause cannot utilize ANN indexes. Also ANN indexes are only used if the query has a `LIMIT` value smaller than setting
without a `LIMIT` clause cannot utilize ANN indexes. Also, ANN indexes are only used if the query has a `LIMIT` value smaller than the setting
`max_limit_for_ann_queries` (default: 1 million rows). This is a safeguard to prevent large memory allocations by external libraries for
approximate neighbor search.
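To make the restrictions concrete, two hedged examples against the same illustrative table as above:

```sql
-- Falls back to a brute-force scan: a WHERE Distance(...) < MaxDistance filter
-- combined with ORDER BY Distance(...) is outside the supported query shapes.
SELECT id
FROM table_with_ann_index
WHERE L2Distance(vectors, [0.17, 0.33, 0.25]) < 1.0
ORDER BY L2Distance(vectors, [0.17, 0.33, 0.25])
LIMIT 10;

-- Also no index use: without LIMIT (or with a LIMIT above max_limit_for_ann_queries)
-- the ANN index is skipped.
SELECT id
FROM table_with_ann_index
ORDER BY L2Distance(vectors, [0.17, 0.33, 0.25]);
```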
@ -120,9 +120,9 @@ then each indexed block will contain 16384 rows. However, data structures and al
provided by external libraries) are inherently row-oriented. They store a compact representation of a set of rows and also return rows for
ANN queries. This causes some rather unintuitive differences in the way ANN indexes behave compared to normal skip indexes.
When a user defines a ANN index on a column, ClickHouse internally creates a ANN "sub-index" for each index block. The sub-index is "local"
When a user defines an ANN index on a column, ClickHouse internally creates an ANN "sub-index" for each index block. The sub-index is "local"
in the sense that it only knows about the rows of its containing index block. In the previous example and assuming that a column has 65536
rows, we obtain four index blocks (spanning eight granules) and a ANN sub-index for each index block. A sub-index is theoretically able to
rows, we obtain four index blocks (spanning eight granules) and an ANN sub-index for each index block. A sub-index is theoretically able to
return the rows with the N closest points within its index block directly. However, since ClickHouse loads data from disk to memory at the
granularity of granules, sub-indexes extrapolate matching rows to granule granularity. This is different from regular skip indexes which
skip data at the granularity of index blocks.
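A sketch of the numbers above, assuming the default `index_granularity` of 8192 and an illustrative Annoy index (a USearch index, described further below, is declared the same way with `TYPE usearch(...)`):

```sql
CREATE TABLE table_with_ann_index
(
    id Int64,
    vectors Array(Float32),
    -- GRANULARITY 2 with index_granularity = 8192 gives index blocks of 16384 rows,
    -- so a column of 65536 rows yields four sub-indexes spanning two granules each.
    INDEX ann_idx vectors TYPE annoy('L2Distance') GRANULARITY 2
)
ENGINE = MergeTree
ORDER BY id;
```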
@ -231,7 +231,7 @@ The Annoy index currently does not work with per-table, non-default `index_granu
## USearch {#usearch}
This type of ANN index is based on the [the USearch library](https://github.com/unum-cloud/usearch), which implements the [HNSW
This type of ANN index is based on the [USearch library](https://github.com/unum-cloud/usearch), which implements the [HNSW
algorithm](https://arxiv.org/abs/1603.09320), i.e., builds a hierarchical graph where each point represents a vector and the edges represent
similarity. Such hierarchical structures can be very efficient on large collections. They may often fetch 0.05% or less data from the
overall dataset, while still providing 99% recall. This is especially useful when working with high-dimensional vectors,

View File

@ -125,7 +125,7 @@ For each resulting data part ClickHouse saves:
3. The first “cancel” row, if there are more “cancel” rows than “state” rows.
4. None of the rows, in all other cases.
Also when there are at least 2 more “state” rows than “cancel” rows, or at least 2 more “cancel” rows then “state” rows, the merge continues, but ClickHouse treats this situation as a logical error and records it in the server log. This error can occur if the same data were inserted more than once.
Also, when there are at least 2 more “state” rows than “cancel” rows, or at least 2 more “cancel” rows than “state” rows, the merge continues, but ClickHouse treats this situation as a logical error and records it in the server log. This error can occur if the same data were inserted more than once.
Thus, collapsing should not change the results of calculating statistics.
Changes are gradually collapsed so that, in the end, only the last state of almost every object is left.
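As a hedged sketch of the kind of aggregation this implies (the table and column names are hypothetical; `Sign` is the collapsing column):

```sql
-- Rows cancelled by a matching "cancel" row contribute zero to the sums;
-- HAVING sum(Sign) > 0 drops objects whose final state has been cancelled.
SELECT
    UserID,
    sum(PageViews * Sign) AS PageViews,
    sum(Duration * Sign)  AS Duration
FROM uact
GROUP BY UserID
HAVING sum(Sign) > 0;
```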
@ -196,7 +196,7 @@ What do we see and where is collapsing?
With two `INSERT` queries, we created 2 data parts. The `SELECT` query was performed in 2 threads, and we got a random order of rows. Collapsing did not occur because there was no merge of the data parts yet. ClickHouse merges data parts at an unknown moment which we cannot predict.
Thus we need aggregation:
Thus, we need aggregation:
``` sql
SELECT

View File

@ -870,6 +870,11 @@ Tags:
- `load_balancing` - Policy for disk balancing, `round_robin` or `least_used`.
- `least_used_ttl_ms` - Configure the timeout (in milliseconds) for updating the available space on all disks (`0` - always update, `-1` - never update, default is `60000`). Note: if the disk can be used by ClickHouse only and is not subject to an online filesystem resize/shrink, you can use `-1`; in all other cases it is not recommended, since eventually it will lead to incorrect space distribution.
- `prefer_not_to_merge` — You should not use this setting. Disables merging of data parts on this volume (this is harmful and leads to performance degradation). When this setting is enabled (don't do it), merging data on this volume is not allowed (which is bad). This allows (but you don't need it) controlling (if you want to control something, you're making a mistake) how ClickHouse works with slow disks (but ClickHouse knows better, so please don't use this setting).
- `volume_priority` — Defines the priority (order) in which volumes are filled. Lower value means higher priority. The parameter values should be natural numbers and collectively cover the range from 1 to N (lowest priority given) without skipping any numbers.
* If _all_ volumes are tagged, they are prioritized in given order.
* If only _some_ volumes are tagged, those without the tag have the lowest priority, and they are prioritized in the order they are defined in the config.
* If _no_ volumes are tagged, their priority corresponds to the order in which they are declared in the configuration.
* Two volumes cannot have the same priority value.
Configuration examples:
@ -919,7 +924,8 @@ In given example, the `hdd_in_order` policy implements the [round-robin](https:/
If there are different kinds of disks available in the system, the `moving_from_ssd_to_hdd` policy can be used instead. The volume `hot` consists of an SSD disk (`fast_ssd`), and the maximum size of a part that can be stored on this volume is 1GB. All the parts with a size larger than 1GB will be stored directly on the `cold` volume, which contains an HDD disk `disk1`.
Also, once the disk `fast_ssd` gets filled by more than 80%, data will be transferred to `disk1` by a background process.
The order of volume enumeration within a storage policy is important. Once a volume is overfilled, data are moved to the next one. The order of disk enumeration is important as well because data are stored on them in turns.
The order of volume enumeration within a storage policy is important in case at least one of the volumes listed has no explicit `volume_priority` parameter.
Once a volume is overfilled, data are moved to the next one. The order of disk enumeration is important as well because data are stored on them in turns.
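One way to verify the resulting order is to inspect `system.storage_policies` (column names as found in recent ClickHouse versions; treat this as a sketch):

```sql
SELECT policy_name, volume_name, volume_priority, disks
FROM system.storage_policies
ORDER BY policy_name, volume_priority;
```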
When creating a table, one can apply one of the configured storage policies to it:
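For example (the table is illustrative; `moving_from_ssd_to_hdd` is the policy described above):

```sql
CREATE TABLE table_with_storage_policy
(
    EventDate Date,
    Value UInt64
)
ENGINE = MergeTree
ORDER BY EventDate
SETTINGS storage_policy = 'moving_from_ssd_to_hdd';
```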

View File

@ -72,7 +72,7 @@ Specifying the `sharding_key` is necessary for the following:
#### fsync_directories
`fsync_directories` - do the `fsync` for directories. Guarantees that the OS refreshed directory metadata after operations related to background inserts on Distributed table (after insert, after sending the data to shard, etc).
`fsync_directories` - do the `fsync` for directories. Guarantees that the OS refreshed directory metadata after operations related to background inserts on the Distributed table (after insert, after sending the data to the shard, etc.).
#### bytes_to_throw_insert
@ -220,7 +220,7 @@ Second, you can perform `INSERT` statements on a `Distributed` table. In this ca
Each shard can have a `<weight>` defined in the config file. By default, the weight is `1`. Data is distributed across shards in the amount proportional to the shard weight. All shard weights are summed up, then each shard's weight is divided by the total to determine each shard's proportion. For example, if there are two shards and the first has a weight of 1 while the second has a weight of 2, the first will be sent one third (1 / 3) of inserted rows and the second will be sent two thirds (2 / 3).
Each shard can have the `internal_replication` parameter defined in the config file. If this parameter is set to `true`, the write operation selects the first healthy replica and writes data to it. Use this if the tables underlying the `Distributed` table are replicated tables (e.g. any of the `Replicated*MergeTree` table engines). One of the table replicas will receive the write and it will be replicated to the other replicas automatically.
Each shard can have the `internal_replication` parameter defined in the config file. If this parameter is set to `true`, the write operation selects the first healthy replica and writes data to it. Use this if the tables underlying the `Distributed` table are replicated tables (e.g. any of the `Replicated*MergeTree` table engines). One of the table replicas will receive the write, and it will be replicated to the other replicas automatically.
If `internal_replication` is set to `false` (the default), data is written to all replicas. In this case, the `Distributed` table replicates data itself. This is worse than using replicated tables because the consistency of replicas is not checked and, over time, they will contain slightly different data.
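A hedged sketch of the replicated-local-plus-distributed setup described above (the cluster, database, and table names are placeholders; `my_cluster` must be defined in `remote_servers`):

```sql
-- Replicated local table on every shard; with internal_replication = true the
-- Distributed table writes to one healthy replica and relies on replication for the rest.
CREATE TABLE hits_local ON CLUSTER my_cluster
(
    EventDate Date,
    UserID UInt64
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/hits_local', '{replica}')
ORDER BY (EventDate, UserID);

-- Distributed "view" over the local tables, sharded by rand().
CREATE TABLE hits_distributed ON CLUSTER my_cluster AS hits_local
ENGINE = Distributed(my_cluster, default, hits_local, rand());
```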

View File

@ -6,6 +6,12 @@ sidebar_label: Memory
# Memory Table Engine
:::note
When using the Memory table engine on ClickHouse Cloud, data is not replicated across all nodes (by design). To guarantee that all queries are routed to the same node and that the Memory table engine works as expected, you can do one of the following:
- Execute all operations in the same session
- Use a client that uses TCP or the native interface (which enables support for sticky connections) such as [clickhouse-client](/en/interfaces/cli)
:::
The Memory engine stores data in RAM, in uncompressed form. Data is stored in exactly the same form as it is received when read. In other words, reading from this table is completely free.
Concurrent data access is synchronized. Locks are short: read and write operations do not block each other.
Indexes are not supported. Reading is parallelized.
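A minimal sketch (names are illustrative); on ClickHouse Cloud, run all three statements in the same session or over a single native/TCP connection, per the note above:

```sql
CREATE TABLE memory_example
(
    key UInt64,
    value String
)
ENGINE = Memory;

INSERT INTO memory_example VALUES (1, 'a'), (2, 'b');

SELECT * FROM memory_example;
```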

View File

@ -12,7 +12,7 @@ The queries below were executed on a **Production** instance of [ClickHouse Clou
:::
1. Without inserting the data into ClickHouse, we can query it in place. Let's grab some rows so we can see what they look like:
1. Without inserting the data into ClickHouse, we can query it in place. Let's grab some rows, so we can see what they look like:
```sql
SELECT *

View File

@ -29,7 +29,7 @@ Here is a preview of the dashboard created in this guide:
This dataset is from [OpenCelliD](https://www.opencellid.org/) - The world's largest Open Database of Cell Towers.
As of 2021, it contains more than 40 million records about cell towers (GSM, LTE, UMTS, etc.) around the world with their geographical coordinates and metadata (country code, network, etc).
As of 2021, it contains more than 40 million records about cell towers (GSM, LTE, UMTS, etc.) around the world with their geographical coordinates and metadata (country code, network, etc.).
OpenCelliD Project is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License, and we redistribute a snapshot of this dataset under the terms of the same license. The up-to-date version of the dataset is available to download after sign in.
@ -355,7 +355,7 @@ Click on **UPDATE CHART** to render the visualization.
### Add the charts to a **dashboard**
This screenshot shows cell tower locations with LTE, UMTS, and GSM radios. The charts are all created in the same way and they are added to a dashboard.
This screenshot shows cell tower locations with LTE, UMTS, and GSM radios. The charts are all created in the same way, and they are added to a dashboard.
![Dashboard of cell towers by radio type in mcc 204](@site/docs/en/getting-started/example-datasets/images/superset-cell-tower-dashboard.png)

Some files were not shown because too many files have changed in this diff.