484 Commits

Author SHA1 Message Date
Alexey Milovidov
10d65a1ade
Merge pull request #55559 from azat/s3-fix-excessive-reads
Add ability to disable checksums for S3 to avoid excessive input file read
2023-12-05 06:34:21 +01:00
robot-ch-test-poll1
d63b652dae
Merge pull request #57385 from vitlibar/fix-inconsistent-metadata-for-backup-2
Stop using INCONSISTENT_METADATA_FOR_BACKUP that much
2023-12-04 10:29:57 +01:00
Vitaly Baranov
1bc1563e0e Stop using INCONSISTENT_METADATA_FOR_BACKUP that much. If possible prefer to continue scanning instead of stopping and starting the scanning for backup again. 2023-11-30 21:37:42 +01:00
Antonio Andelic
550513d90e Finalize ZK client on fault injection 2023-11-29 12:30:21 +00:00
Azat Khuzhin
4a02de4674 Add ability to disable checksums for S3 to avoid excessive input file read
The AWS S3 client can read the input file multiple times; this is required to:
- calculate checksums
- calculate the signature (done only for HTTP, since ClickHouse uses
  PayloadSigningPolicy::Never)

This means that to send a file to S3 it is read 3 times over HTTP
and 2 times over HTTPS.

By overriding GetChecksumAlgorithmName() to return an empty string,
checksums can be disabled, and the input file will be read only once.
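
A toy model of the read counts described above, as a sketch only. It is not the AWS SDK or the ClickHouse code: `ToyRequest`, `ToyRequestNoChecksum`, `readAll` and `sendToy` are hypothetical names, and the only thing taken from this commit is the idea of overriding `GetChecksumAlgorithmName()` to return an empty string.

```cpp
#include <cassert>
#include <istream>
#include <iterator>
#include <sstream>
#include <string>

// Hypothetical stand-in for an SDK request type (NOT the real AWS class).
struct ToyRequest
{
    virtual ~ToyRequest() = default;
    // Name of the checksum the client should compute; empty means "none".
    virtual std::string GetChecksumAlgorithmName() const { return "md5"; }
};

// Same request, but with checksums switched off via the override.
struct ToyRequestNoChecksum : ToyRequest
{
    std::string GetChecksumAlgorithmName() const override { return ""; }
};

// Rewinds the stream, consumes it to the end and counts one full pass.
static void readAll(std::istream & body, int & passes)
{
    body.clear();
    body.seekg(0);
    std::string sink((std::istreambuf_iterator<char>(body)), std::istreambuf_iterator<char>());
    ++passes;
}

// Toy "upload": returns how many times the body was read in full.
// sign_payload models the plain-HTTP case, where the payload is also signed.
static int sendToy(const ToyRequest & request, std::istream & body, bool sign_payload)
{
    int passes = 0;
    if (!request.GetChecksumAlgorithmName().empty())
        readAll(body, passes); // checksum pass
    if (sign_payload)
        readAll(body, passes); // signature pass
    readAll(body, passes);     // the actual transfer
    return passes;
}
```

In this model, a request with checksums costs three passes over plain HTTP and two over HTTPS, while the overriding request over HTTPS reads the body once.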

And even though HTTPS adds an extra integrity layer, someone may
still find this too risky, I guess, even though the ClickHouse
internal format (for MergeTree) has checksums, and more.

Here is an example stacktrace of this excessive read:

<details>

<summary>stacktrace</summary>

    (lldb) bt
    * thread 383, name = 'BackupWorker', stop reason = breakpoint 1.1
      * frame 0: 0x00000000103c5fc0 clickhouse`DB::StdStreamBufFromReadBuffer::seekpos() + 32 at StdStreamBufFromReadBuffer.cpp:67
        frame 1: 0x000000001777f7f8 clickhouse`std::__1::basic_istream<char, std::__1::char_traits<char>>::tellg() [inlined] std::__1::basic_streambuf<char, std::__1::char_traits<char>>::pubseekoff[abi:v15000](this=<unavailable>, __off=0, __way=cur, __which=8) + 120 at streambuf:162
        frame 2: 0x000000001777f7e3 clickhouse`std::__1::basic_istream<char, std::__1::char_traits<char>>::tellg() + 99 at istream:1249
        frame 3: 0x00000000152e4979 clickhouse`Aws::Utils::Crypto::MD5OpenSSLImpl::Calculate() + 57 at CryptoImpl.cpp:223
        frame 4: 0x00000000152dedee clickhouse`Aws::Utils::Crypto::MD5::Calculate() + 14 at MD5.cpp:30
        frame 5: 0x00000000152db5ac clickhouse`Aws::Utils::HashingUtils::CalculateMD5() + 44 at HashingUtils.cpp:235
        frame 6: 0x000000001528b97b clickhouse`Aws::Client::AWSClient::AddChecksumToRequest() const + 507 at AWSClient.cpp:772
        frame 7: 0x000000001528ded2 clickhouse`Aws::Client::AWSClient::BuildHttpRequest() const + 1682 at AWSClient.cpp:930
        frame 8: 0x00000000100b864f clickhouse`DB::S3::Client::BuildHttpRequest() const + 15 at Client.cpp:622
        frame 9: 0x0000000015286a41 clickhouse`Aws::Client::AWSClient::AttemptOneRequest(this=0x00007ffde2f8f000, httpRequest=<unavailable>, request=<unavailable>, signerName=<unavailable>, signerRegionOverride=<unavailable>, signerServiceNameOverride="s3") const + 65 at AWSClient.cpp:491
        frame 10: 0x00000000152845b9 clickhouse`Aws::Client::AWSClient::AttemptExhaustively(this=0x00007ffde2f8f000, uri=0x00007ffdd4d44f38, request=0x00007ffdd4d45d10, method=HTTP_PUT, signerName="SignatureV4", signerRegionOverride="us-east-1", signerServiceNameOverride="s3") const + 1337 at AWSClient.cpp:272
        frame 11: 0x0000000015298d0d clickhouse`Aws::Client::AWSXMLClient::MakeRequest() const + 45 at AWSXmlClient.cpp:99
        frame 12: 0x0000000015298cb5 clickhouse`Aws::Client::AWSXMLClient::MakeRequest() const + 309 at AWSXmlClient.cpp:66
        frame 13: 0x0000000015354b23 clickhouse`Aws::S3::S3Client::PutObject(this=0x00007ffde2f8f000, request=0x00007ffdd4d45d10) const + 2659 at S3Client.cpp:1731
        frame 14: 0x00000000100b174f clickhouse`DB::S3::Client::PutObject(DB::S3::ExtendedRequest<Aws::S3::Model::PutObjectRequest> const&) const [inlined]
        frame 15: 0x00000000100b173a clickhouse`DB::S3::Client::PutObject(DB::S3::ExtendedRequest<Aws::S3::Model::PutObjectRequest> const&) const + 41 at Client.cpp:578
        frame 16: 0x00000000100b1711 clickhouse`DB::S3::Client::PutObject(DB::S3::ExtendedRequest<Aws::S3::Model::PutObjectRequest> const&) const + 981 at Client.cpp:508
        frame 17: 0x00000000100b133c clickhouse`DB::S3::Client::PutObject(DB::S3::ExtendedRequest<Aws::S3::Model::PutObjectRequest> const&) const [inlined]
        frame 18: 0x00000000100b133c clickhouse`DB::S3::Client::PutObject() const + 28 at Client.cpp:418
        frame 19: 0x00000000103b96d6 clickhouse`DB::copyDataToS3File()

</details>

This new behaviour can be enabled with `s3_disable_checksum=true`.

Note that I've checked this implementation with GCS/R2/S3/MinIO and it
works everywhere.
2023-11-26 19:20:19 +01:00
vdimir
15234474d7
Implement system table blob_storage_log 2023-11-21 09:18:25 +00:00
Alexey Milovidov
62a87665c5 Fix build 2023-11-19 16:31:18 +01:00
Sema Checherinda
a950595c24
Merge pull request #56314 from CheSema/s3-aggressive-timeouts
s3 adaptive timeouts
2023-11-19 14:12:14 +01:00
Alexey Milovidov
edc3b2fe48
Merge pull request #56958 from ClickHouse/metric-queued-jobs
Add metrics for the number of queued jobs, which is useful for the IO thread pool
2023-11-19 10:37:18 +01:00
Alexey Milovidov
d56cbda185 Add metrics for the number of queued jobs, which is useful for the IO thread pool 2023-11-18 19:07:59 +01:00
Antonio Andelic
9bcedf3764 Cleanup 2023-11-17 10:27:19 +00:00
Antonio Andelic
7dda3b2353 Review comments 2023-11-17 10:11:15 +00:00
Antonio Andelic
9d965368a2 Fix build 2023-11-15 08:36:24 +00:00
Antonio Andelic
2f9ac9b49c Address comments 2023-11-14 14:33:34 +00:00
Sema Checherinda
8d36fd6e54 get rid off of client_with_long_timeout_ptr 2023-11-14 11:34:12 +01:00
Sema Checherinda
770a762317 aggressive timeout 2023-11-14 11:34:11 +01:00
Antonio Andelic
9e91e4d671 Define BackupReferenceEntry 2023-11-13 14:43:02 +00:00
Antonio Andelic
59480205d4 Merge branch 'master' into keeper-map-backup-restore 2023-11-13 12:21:26 +00:00
Vitaly Baranov
1711bed63e Load base backups lazily (if a backup is not needed it won't be loaded). 2023-11-09 19:29:04 +01:00
Antonio Andelic
f9895ab37b Small fixes and add test 2023-11-09 15:56:57 +00:00
Vitaly Baranov
e48d9f18a8
Update BackupsWorker.cpp 2023-11-09 15:01:25 +01:00
Kseniia Sumarokova
0760e69e54
Merge pull request #56312 from ClickHouse/parallelize-backup-entries-collector
Parallelize `BackupEntriesCollector`
2023-11-09 13:07:48 +01:00
Antonio Andelic
1f000242a1 Merge branch 'master' into keeper-map-backup-restore 2023-11-08 13:16:47 +00:00
Antonio Andelic
18a5eeec38 Make on cluster backup/restore work 2023-11-08 13:16:38 +00:00
alesapin
4a097cd373
Merge pull request #55216 from vitlibar/backup-use-two-more-thread-pools
Use more thread pools in BACKUP/RESTORE to avoid its hanging in tests
2023-11-08 12:18:10 +01:00
kssenii
080715e598 Fix 2023-11-08 11:29:11 +01:00
Antonio Andelic
86ba6ad1e8 Local backup and restore 2023-11-08 10:22:44 +00:00
Vitaly Baranov
da5f48e44a Use map instead of array. 2023-11-08 08:45:50 +01:00
kssenii
271059dcb8 Fxi 2023-11-07 19:03:45 +01:00
kssenii
15517d04df Fix 2023-11-07 18:02:44 +01:00
kssenii
70048236f3 Add ProfileEvents column to system.backups 2023-11-07 17:32:08 +01:00
kssenii
53423a8f5c Add metrics 2023-11-07 17:10:45 +01:00
kssenii
437711a9df Use separate pool 2023-11-07 16:47:12 +01:00
Kseniia Sumarokova
d89356103c
Update BackupEntriesCollector.cpp 2023-11-06 18:28:18 +01:00
kssenii
8c1aaa51e4 Fxi 2023-11-06 16:37:01 +01:00
kssenii
9b9a6f8afc Parallelize BackupEntriesCollector 2023-11-03 17:34:09 +01:00
kssenii
6fc4c8d332 Add retries 2023-10-30 17:20:17 +01:00
Vitaly Baranov
8966ae4c0e Schedule threads while making a backup faster. 2023-10-03 16:08:00 +02:00
Vitaly Baranov
05f82d48df Use 2 more thread pools to avoid hanging in case BACKUP/RESTORE ON CLUSTER ASYNC. Create thread pools lazily. 2023-10-03 16:07:55 +02:00
alesapin
61a299843b Fix deadlock in backups 2023-09-29 14:02:43 +02:00
robot-ch-test-poll4
0f2d7233d9
Merge pull request #54804 from vitlibar/more-configurable-collecting-metadata-for-backup
More configurable collecting metadata for backup
2023-09-27 22:31:41 +02:00
Robert Schulze
cde10fe7b5
Merge remote-tracking branch 'rschu1ze/master' into clang-tidy-reenable-checks 2023-09-26 18:59:41 +00:00
Vitaly Baranov
fe008c23c4
Merge pull request #54900 from vitlibar/retry-backup-s3-operations-after-conection-reset
Retry backup S3 operations after connection reset failure
2023-09-26 18:36:10 +02:00
Vitaly Baranov
1f198d8659 Add comments. 2023-09-26 18:07:36 +02:00
Robert Schulze
9fff447716
Re-enable clang-tidy checks 2023-09-26 09:34:12 +00:00
Vitaly Baranov
a93d64ab24 More configurable collecting metadata for backup. 2023-09-25 12:22:18 +02:00
Vitaly Baranov
1e567d5008 Retry backup s3 operations after ConnectionResetException. 2023-09-23 01:56:28 +02:00
Victor Krasnov
fea886907d Fix data race during backup_log initialization 2023-09-22 09:54:33 +00:00
Azat Khuzhin
c706101891 Fix throttling of BACKUPs from/to S3 (in case native copy was not used)
In some cases native copy is not possible, and such requests should be
throttled.

v0: copyS3FileNativeWithFallback
v2: revert v0 and pass write_settings
v3: pass read_settings to copyFile()
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-09-20 18:28:43 +02:00
Sema Checherinda
d9e15c00c9 limit the delay before next try in S3 2023-09-14 19:45:07 +02:00