Commit Graph

129472 Commits

Author SHA1 Message Date
Azat Khuzhin
4a02de4674 Add ability to disable checksums for S3 to avoid excessive input file read
The AWS S3 client can read the input file multiple times. Extra passes are needed to:
- calculate checksums
- calculate the signature (done only for HTTP, since ClickHouse uses
  PayloadSigningPolicy::Never)

This means that sending a file to S3 reads it 3 times over HTTP
and 2 times over HTTPS.

By overriding GetChecksumAlgorithmName() to return an empty string,
checksums can be disabled, and the input file will be read only once.
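
The override described above can be illustrated with a minimal, self-contained sketch. It does not use the real AWS SDK or ClickHouse classes: `RequestBase`, `ExtendedRequest`, `disable_checksum`, and `countBodyReads` are hypothetical names introduced only for illustration, and the read accounting (one pass for the checksum, one for the signature over plain HTTP, one for sending) is an assumption taken from the list in this message.

```cpp
#include <string>

// Hypothetical stand-in for an SDK request base class (not the real AWS SDK type).
struct RequestBase
{
    virtual ~RequestBase() = default;

    // Name of the checksum algorithm the client should apply to the body.
    // An empty string means "do not compute a checksum".
    virtual std::string GetChecksumAlgorithmName() const { return "md5"; }
};

// Hypothetical request wrapper with a flag mirroring s3_disable_checksum.
struct ExtendedRequest : RequestBase
{
    bool disable_checksum = false;

    std::string GetChecksumAlgorithmName() const override
    {
        if (disable_checksum)
            return "";
        return RequestBase::GetChecksumAlgorithmName();
    }
};

// Toy accounting of how many times the request body is read:
// one pass to send it, one more if a checksum is computed,
// and one more if the payload is signed (assumed to happen only for plain HTTP).
int countBodyReads(const RequestBase & request, bool is_https)
{
    int reads = 1; // sending the body
    if (!request.GetChecksumAlgorithmName().empty())
        ++reads; // checksum pass
    if (!is_https)
        ++reads; // payload signature pass
    return reads;
}
```

Under these assumptions, a default `ExtendedRequest` yields 3 reads over HTTP and 2 over HTTPS, while setting `disable_checksum = true` brings the HTTPS case down to a single read.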

Although HTTPS adds an extra integrity layer and the ClickHouse
internal format (for MergeTree) has its own checksums and more,
some may still find disabling checksums too risky.

Here is an example stacktrace of this excessive read:

<details>

<summary>stacktrace</summary>

    (lldb) bt
    * thread 383, name = 'BackupWorker', stop reason = breakpoint 1.1
      * frame 0: 0x00000000103c5fc0 clickhouse`DB::StdStreamBufFromReadBuffer::seekpos() + 32 at StdStreamBufFromReadBuffer.cpp:67
        frame 1: 0x000000001777f7f8 clickhouse`std::__1::basic_istream<char, std::__1::char_traits<char>>::tellg() [inlined] std::__1::basic_streambuf<char, std::__1::char_traits<char>>::pubseekoff[abi:v15000](this=<unavailable>, __off=0, __way=cur, __which=8) + 120 at streambuf:162
        frame 2: 0x000000001777f7e3 clickhouse`std::__1::basic_istream<char, std::__1::char_traits<char>>::tellg() + 99 at istream:1249
        frame 3: 0x00000000152e4979 clickhouse`Aws::Utils::Crypto::MD5OpenSSLImpl::Calculate() + 57 at CryptoImpl.cpp:223
        frame 4: 0x00000000152dedee clickhouse`Aws::Utils::Crypto::MD5::Calculate() + 14 at MD5.cpp:30
        frame 5: 0x00000000152db5ac clickhouse`Aws::Utils::HashingUtils::CalculateMD5() + 44 at HashingUtils.cpp:235
        frame 6: 0x000000001528b97b clickhouse`Aws::Client::AWSClient::AddChecksumToRequest() const + 507 at AWSClient.cpp:772
        frame 7: 0x000000001528ded2 clickhouse`Aws::Client::AWSClient::BuildHttpRequest() const + 1682 at AWSClient.cpp:930
        frame 8: 0x00000000100b864f clickhouse`DB::S3::Client::BuildHttpRequest() const + 15 at Client.cpp:622
        frame 9: 0x0000000015286a41 clickhouse`Aws::Client::AWSClient::AttemptOneRequest(this=0x00007ffde2f8f000, httpRequest=<unavailable>, request=<unavailable>, signerName=<unavailable>, signerRegionOverride=<unavailable>, signerServiceNameOverride="s3") const + 65 at AWSClient.cpp:491
        frame 10: 0x00000000152845b9 clickhouse`Aws::Client::AWSClient::AttemptExhaustively(this=0x00007ffde2f8f000, uri=0x00007ffdd4d44f38, request=0x00007ffdd4d45d10, method=HTTP_PUT, signerName="SignatureV4", signerRegionOverride="us-east-1", signerServiceNameOverride="s3") const + 1337 at AWSClient.cpp:272
        frame 11: 0x0000000015298d0d clickhouse`Aws::Client::AWSXMLClient::MakeRequest() const + 45 at AWSXmlClient.cpp:99
        frame 12: 0x0000000015298cb5 clickhouse`Aws::Client::AWSXMLClient::MakeRequest() const + 309 at AWSXmlClient.cpp:66
        frame 13: 0x0000000015354b23 clickhouse`Aws::S3::S3Client::PutObject(this=0x00007ffde2f8f000, request=0x00007ffdd4d45d10) const + 2659 at S3Client.cpp:1731
        frame 14: 0x00000000100b174f clickhouse`DB::S3::Client::PutObject(DB::S3::ExtendedRequest<Aws::S3::Model::PutObjectRequest> const&) const [inlined]
        frame 15: 0x00000000100b173a clickhouse`DB::S3::Client::PutObject(DB::S3::ExtendedRequest<Aws::S3::Model::PutObjectRequest> const&) const + 41 at Client.cpp:578
        frame 16: 0x00000000100b1711 clickhouse`DB::S3::Client::PutObject(DB::S3::ExtendedRequest<Aws::S3::Model::PutObjectRequest> const&) const + 981 at Client.cpp:508
        frame 17: 0x00000000100b133c clickhouse`DB::S3::Client::PutObject(DB::S3::ExtendedRequest<Aws::S3::Model::PutObjectRequest> const&) const [inlined]
        frame 18: 0x00000000100b133c clickhouse`DB::S3::Client::PutObject() const + 28 at Client.cpp:418
        frame 19: 0x00000000103b96d6 clickhouse`DB::copyDataToS3File()

</details>

This new behaviour can be enabled with `s3_disable_checksum=true`.

Note that I've checked this implementation with GCS/R2/S3/MinIO, and it
works everywhere.
2023-11-26 19:20:19 +01:00
Alexey Milovidov
588fd16518
Merge pull request #57230 from ClickHouse/remove-bad-test
Remove test `01280_ttl_where_group_by`
2023-11-26 04:38:16 +01:00
Alexey Milovidov
1cc33f3430
Merge pull request #56164 from azat/not-byte-identical-message
Add more details to "Data after merge is not byte-identical to data on another replicas"
2023-11-26 04:14:15 +01:00
Alexey Milovidov
f8ebe5134d
Merge pull request #55836 from azat/dist/limit-by-fix
RFC: Fix "Cannot find column X in source stream" for Distributed queries with LIMIT BY
2023-11-26 04:03:41 +01:00
Alexey Milovidov
304d6375be
Merge pull request #56225 from azat/rocksdb-compact
Allow manual compaction of rocksdb via OPTIMIZE query
2023-11-26 03:59:54 +01:00
Alexey Milovidov
36cc857441
Merge pull request #57232 from ClickHouse/revert-57170-tests/01600_parts_types_metrics
Revert "Add debugging info for 01600_parts_types_metrics on failures"
2023-11-26 03:58:09 +01:00
Alexey Milovidov
32da588d5e
Revert "Add debugging info for 01600_parts_types_metrics on failures" 2023-11-26 05:57:54 +03:00
Alexey Milovidov
fde14f0daf
Merge pull request #57191 from azat/client-log_comment-file
[RFC] Set log_comment to the file name while processing files in client
2023-11-26 03:44:40 +01:00
Alexey Milovidov
f636dea879
Merge pull request #54327 from den-crane/background_fetches_pool_size
increase background_fetches_pool_size to 16, background_schedule_pool_size to 512
2023-11-26 02:50:38 +01:00
Alexey Milovidov
9fa112af9a
Merge pull request #53721 from takakawa/possible_wrong_type_conversion_bugfix
[bugfix] possible postgresql logical replication error: wrong type conversion
2023-11-26 02:48:24 +01:00
Alexey Milovidov
b92c416ced Remove test 01280_ttl_where_group_by 2023-11-26 02:34:18 +01:00
Alexey Milovidov
f63048e2d6
Merge pull request #57229 from ClickHouse/revert-57222-update-sentry
Revert "Update Sentry"
2023-11-26 02:30:17 +01:00
Alexey Milovidov
e60941f7c5
Revert "Update Sentry" 2023-11-26 04:30:05 +03:00
robot-ch-test-poll1
a956cec61f
Merge pull request #57228 from rschu1ze/docs-alias
Docs: Mention alias `database` for `name` in `system.databases`
2023-11-26 02:25:32 +01:00
Robert Schulze
e074629749
Docs: Mention alias 'database' for 'name' in system.databases 2023-11-25 22:04:49 +00:00
Alexey Milovidov
d29092f8af
Merge pull request #54909 from canhld94/revert-54893-revert-54819-ch_net_interfaces
Resubmit: Avoid excessive calls to getifaddrs in isLocalAddress
2023-11-25 23:00:18 +01:00
Alexey Milovidov
d7e64fa446
Merge pull request #57150 from azat/ci/no-partial-results
Remove partial results from build matrix for stress tests
2023-11-25 22:55:14 +01:00
Alexey Milovidov
e5dd78a9e8
Merge pull request #57222 from ClickHouse/update-sentry
Update Sentry
2023-11-25 22:53:27 +01:00
Alexey Milovidov
c50f096200
Merge pull request #57227 from azat/client-INS-cursor
Change cursor style for overwrite mode (INS) to blinking in client
2023-11-25 22:53:09 +01:00
Azat Khuzhin
ced0bbd932 Change cursor style for overwrite mode (INS) to blinking in client
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-11-25 22:00:56 +01:00
Alexey Milovidov
b31d60e886
Merge pull request #57226 from ClickHouse/auto/v23.8.8.20-lts
Update version_date.tsv and changelogs after v23.8.8.20-lts
2023-11-25 21:37:04 +01:00
Alexey Milovidov
2cdb13e367
Merge pull request #57225 from ClickHouse/auto/v23.3.18.15-lts
Update version_date.tsv and changelogs after v23.3.18.15-lts
2023-11-25 21:36:53 +01:00
Robert Schulze
4088ec0eac
Merge pull request #57199 from rschu1ze/docs-math-funcs
Docs: Improve math function docs
2023-11-25 21:22:53 +01:00
Alexey Milovidov
4aff43ef7f
Merge pull request #57221 from ClickHouse/faster-libunwind
Fix occasional slowness of stack unwinding.
2023-11-25 21:13:21 +01:00
robot-clickhouse
ff34726e0c Update version_date.tsv and changelogs after v23.8.8.20-lts 2023-11-25 19:48:29 +00:00
Robert Schulze
2e7c16e138
Fix broken links 2023-11-25 19:47:32 +00:00
Robert Schulze
5b7d2a903d
Merge pull request #57152 from rschu1ze/fine-granular-plan-opt-settings
Fine-granular enablement/disabling of plan-level optimizations
2023-11-25 20:44:23 +01:00
Alexey Milovidov
121776783b
Merge pull request #57223 from ClickHouse/auto/v23.10.5.20-stable
Update version_date.tsv and changelogs after v23.10.5.20-stable
2023-11-25 20:21:25 +01:00
Alexey Milovidov
2fe032d47e
Merge pull request #57224 from ClickHouse/auto/v23.9.6.20-stable
Update version_date.tsv and changelogs after v23.9.6.20-stable
2023-11-25 20:20:39 +01:00
robot-clickhouse
f78612f37e Update version_date.tsv and changelogs after v23.3.18.15-lts 2023-11-25 18:40:50 +00:00
robot-clickhouse
2bb12386dd Update version_date.tsv and changelogs after v23.9.6.20-stable 2023-11-25 18:37:17 +00:00
robot-clickhouse
e58d2ae5d6 Update version_date.tsv and changelogs after v23.10.5.20-stable 2023-11-25 18:36:28 +00:00
Alexey Milovidov
1e00048cf7 Update Sentry 2023-11-25 19:11:52 +01:00
Alexey Milovidov
2222d8cbf2 Update Sentry 2023-11-25 18:47:21 +01:00
Alexey Milovidov
5eb3cafb52 libunwind: fix slowness under Musl 2023-11-25 16:21:17 +01:00
Alexey Milovidov
143617e303 Remove garbage 2023-11-25 16:21:17 +01:00
Vitaly Baranov
2e7f314599
Merge pull request #50209 from ilejn/merge_row_policy
Engine Merge obeys row policy
2023-11-25 10:34:22 +01:00
Vitaly Baranov
4fed61e8ca
Merge pull request #57146 from vitlibar/fix-test_replicated_merge_tree_encryption_codec_different_keys_2
Fix test test_replicated_merge_tree_encryption_codec/test.py::test_different_keys
2023-11-25 09:36:33 +01:00
Alexey Milovidov
61c9d304f0
Merge pull request #57125 from ClickHouse/try-fix-57097
Cancel executor in ~CreatingSetsTransform
2023-11-25 03:02:39 +01:00
Alexey Milovidov
fa706b8bfe
Merge pull request #57170 from azat/tests/01600_parts_types_metrics
Add debugging info for 01600_parts_types_metrics on failures
2023-11-25 03:01:56 +01:00
Alexey Milovidov
8877b7ce78
Merge pull request #57198 from ClickHouse/analyzer-fuzz-6
Analyzer fuzzer 6 (arrayJoin)
2023-11-25 03:00:48 +01:00
Alexey Milovidov
ef9670b5de
Merge pull request #57200 from Algunenano/i47366
Add test for #47366
2023-11-25 03:00:18 +01:00
Alexey Milovidov
8325d04313
Merge pull request #57201 from myrrc/feature/elide-functional-stacktraces
Do not demangle stack frames from __functional
2023-11-25 02:59:49 +01:00
Alexey Milovidov
44c44fab10
Merge pull request #57204 from azat/client-skim-fix-crash
Fix possible crash (in Rust) of fuzzy finder in client
2023-11-25 02:58:18 +01:00
Alexey Milovidov
c478acab42
Merge pull request #57206 from azat/tests/test_distributed_storage_configuration
Fix test_distributed_storage_configuration flakiness
2023-11-25 02:57:51 +01:00
Alexey Milovidov
91cc132feb
Merge pull request #56873 from ClickHouse/memory-for-client-in-stress-and-fuzzer
Set limit for memory usage for client in Stress tests and ASTFuzzers
2023-11-24 23:15:27 +01:00
jsc0218
55e0a825b7
Merge pull request #57106 from BetterStackHQ/ah/uniq-id-check-master
Optimize query uniqueness check in ProcessList
2023-11-24 17:04:25 -05:00
Alexey Milovidov
685ad98652
Merge pull request #57122 from ClickHouse/fuzzer-disable-checksums
Disable checksums for builds with fuzzer
2023-11-24 23:03:04 +01:00
robot-clickhouse
106dca221b
Merge pull request #57192 from Algunenano/i5323
Add test for #5323
2023-11-24 22:47:17 +01:00
Alexey Milovidov
289df618f4
Merge pull request #57001 from arthurpassos/aws-s3-sign-any-x-amz-header-clean
Sign all aws headers
2023-11-24 21:12:10 +01:00