Commit Graph

2265 Commits

Author SHA1 Message Date
Alexey Milovidov
8d984df135
Merge pull request #58237 from azat/build/fwd-decl-exception
Some code refactoring (was an attempt to improve build time, but failed)
2023-12-28 00:21:09 +01:00
Azat Khuzhin
b9233f6d4f Move Allocator code into module part
This should reduce the amount of code that has to be recompiled when
Exception.h changes (along with everything else that was included there).

This will not actually help much, because it is also included in at
least PODArray.h and ThreadPool.h... Sigh.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-12-27 15:42:08 +01:00
Dani Pozo
2be2486e94 Remove retryStrategy assignments overwritten in ClientFactory::create() 2023-12-22 17:28:43 +01:00
Raúl Marín
ced9407cef Improve error messages 2023-12-21 10:29:05 +01:00
Nikita Mikhaylov
6360b76792 Merge branch 'master' of github.com:ClickHouse/ClickHouse into remove-the-limit-for-connections-per-endpoint 2023-12-18 21:49:31 +00:00
Raúl Marín
2639d0715f Merge remote-tracking branch 'blessed/master' into log_message_string 2023-12-18 10:40:18 +01:00
Alexey Milovidov
dbd509417e
Merge pull request #57970 from ClickHouse/nickitat-patch-17
Always use `pread` for reading cache segments
2023-12-17 20:53:05 +01:00
Nikita Taranov
9c2ef4eae5
Add profile event for cache lookup in ThreadPoolRemoteFSReader (#57437) 2023-12-17 19:03:49 +01:00
Nikita Taranov
587f829eb8
Always use pread for reading cache segments 2023-12-17 18:37:07 +01:00
Raúl Marín
b269f87f4c Better text_log with ErrnoException 2023-12-15 19:27:56 +01:00
Sema Checherinda
0dfe530a7f
Merge pull request #56744 from MikhailBurdukov/native_copy_for_s3_disks
Enabled s3 `copyObject` for copy between s3 disks.
2023-12-15 16:05:16 +01:00
Kseniia Sumarokova
79db3c66df
Merge branch 'master' into allow-to-change-some-cache-settings-without-restart 2023-12-13 23:33:59 +01:00
Nikita Mikhaylov
8372c70958 Merge branch 'master' of github.com:ClickHouse/ClickHouse into remove-the-limit-for-connections-per-endpoint 2023-12-13 18:29:56 +00:00
Alexey Milovidov
62b6d1ef5e Merge branch 'master' of github.com:ClickHouse/ClickHouse into clickbench-ci 2023-12-13 01:41:27 +01:00
Kseniia Sumarokova
91d36ad224
Merge pull request #57076 from ClickHouse/slru-for-filesystem-cache
Implement SLRU cache policy for filesystem cache
2023-12-12 10:20:58 +01:00
Alexey Milovidov
9789c2caa2 Review fixes 2023-12-12 05:48:09 +01:00
MikhailBurdukov
119e451967 Merge branch 'master' into native_copy_for_s3_disks 2023-12-11 07:25:20 +00:00
Azat Khuzhin
6ccbc2ea75 Move io_uring reader into the Context from static to make its thread joinable
v2: fix for standalone keeper build
CI: https://s3.amazonaws.com/clickhouse-test-reports/52717/72b1052f7c2d453308262924e767ab8dc2206933/stateless_tests__debug__[4_5].html
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-12-09 22:50:48 +01:00
Nikita Mikhaylov
c979124cfe Merge branch 'master' of github.com:ClickHouse/ClickHouse into remove-the-limit-for-connections-per-endpoint 2023-12-08 16:25:02 +00:00
MikhailBurdukov
d10217af16 style 2023-12-08 13:14:47 +00:00
kssenii
725571461d Merge remote-tracking branch 'origin/master' into slru-for-filesystem-cache 2023-12-07 19:49:03 +01:00
kssenii
8be3c9d218 Merge remote-tracking branch 'origin/master' into allow-to-change-some-cache-settings-without-restart 2023-12-07 12:14:24 +01:00
kssenii
614da21144 Better 2023-12-07 12:12:10 +01:00
kssenii
f44f7c8c28 Allow to change some cache settings without server restart 2023-12-06 19:29:18 +01:00
jsc0218
cdd5280272
Merge pull request #57387 from evillique/better-disks
Initialize only required disks in clickhouse-disks
2023-12-05 13:59:04 -05:00
Nikita Mikhaylov
04d167c6d9 Better 2023-12-05 13:34:37 +01:00
Alexey Milovidov
10d65a1ade
Merge pull request #55559 from azat/s3-fix-excessive-reads
Add ability to disable checksums for S3 to avoid excessive input file read
2023-12-05 06:34:21 +01:00
kssenii
4a28f10c3d Minor cache changes 2023-12-04 19:02:37 +01:00
vdimir
a4ae90de0d
Merge pull request #57275 from ClickHouse/vdimir/merge_task_tmp_data
Background merges correctly use temporary data storage in the cache
2023-12-04 14:52:20 +01:00
robot-ch-test-poll
1b49463bd2
Merge pull request #55841 from nickitat/optimize_reading3
Optimize reading from cache
2023-12-01 17:36:57 +01:00
Nikolay Degterinsky
e8203f8a76 Initialize only required disks 2023-11-30 03:09:55 +00:00
vdimir
b5babe1692
MergeTask uses temporary data storage 2023-11-29 16:18:32 +00:00
Kseniia Sumarokova
a89bb04e9c
Update comment 2023-11-29 13:30:55 +01:00
kssenii
28d54f0027 Better exception code 2023-11-29 10:57:35 +01:00
Nikita Taranov
a81453cafc fix test 2023-11-28 23:48:52 +01:00
Nikita Taranov
03450d5077 merge fixes 2023-11-28 18:24:05 +01:00
kssenii
4d64cd5d11 Fix 2023-11-28 17:13:08 +01:00
Nikita Taranov
52f644c0df Merge branch 'master' into optimize_reading3 2023-11-28 16:36:38 +01:00
Nikita Taranov
bdab4d5944 handle case with multiple blobs in ReadBufferFromRemoteFSGather 2023-11-28 15:40:35 +01:00
Nikita Taranov
0ad796aa99 add profile event
add profile event for set&seek

fix

fix

fix
2023-11-28 15:40:34 +01:00
MikhailBurdukov
e04416b48f Fix 2023-11-27 18:42:06 +00:00
MikhailBurdukov
0b25e5c347 review 2023-11-27 17:45:02 +00:00
MikhailBurdukov
aa3cba1f1c Fix 2023-11-27 13:45:21 +00:00
MikhailBurdukov
6f19e8ebd1
Merge branch 'master' into native_copy_for_s3_disks 2023-11-27 14:25:36 +03:00
MikhailBurdukov
c10c30832c Review fix 2023-11-27 10:58:30 +00:00
Azat Khuzhin
4a02de4674 Add ability to disable checksums for S3 to avoid excessive input file read
The AWS S3 client can read the input file multiple times. This is required to:
- calculate checksums
- calculate the signature (done only for HTTP, since ClickHouse uses
  PayloadSigningPolicy::Never)

So to send a file to S3, it is read 3 times over HTTP and 2 times
over HTTPS.

By overriding GetChecksumAlgorithmName() to return an empty string,
checksums can be disabled, and the input file will be read only once.

And even though HTTPS adds an extra integrity layer, someone may still
find this too risky, I guess, even though the ClickHouse internal format
(for MergeTree) has checksums, and more.

Here is an example stacktrace of this excessive read:

<details>

<summary>stacktrace</summary>

    (lldb) bt
    * thread 383, name = 'BackupWorker', stop reason = breakpoint 1.1
      * frame 0: 0x00000000103c5fc0 clickhouse`DB::StdStreamBufFromReadBuffer::seekpos() + 32 at StdStreamBufFromReadBuffer.cpp:67
        frame 1: 0x000000001777f7f8 clickhouse`std::__1::basic_istream<char, std::__1::char_traits<char>>::tellg() [inlined] std::__1::basic_streambuf<char, std::__1::char_traits<char>>::pubseekoff[abi:v15000](this=<unavailable>, __off=0, __way=cur, __which=8) + 120 at streambuf:162
        frame 2: 0x000000001777f7e3 clickhouse`std::__1::basic_istream<char, std::__1::char_traits<char>>::tellg() + 99 at istream:1249
        frame 3: 0x00000000152e4979 clickhouse`Aws::Utils::Crypto::MD5OpenSSLImpl::Calculate() + 57 at CryptoImpl.cpp:223
        frame 4: 0x00000000152dedee clickhouse`Aws::Utils::Crypto::MD5::Calculate() + 14 at MD5.cpp:30
        frame 5: 0x00000000152db5ac clickhouse`Aws::Utils::HashingUtils::CalculateMD5() + 44 at HashingUtils.cpp:235
        frame 6: 0x000000001528b97b clickhouse`Aws::Client::AWSClient::AddChecksumToRequest() const + 507 at AWSClient.cpp:772
        frame 7: 0x000000001528ded2 clickhouse`Aws::Client::AWSClient::BuildHttpRequest() const + 1682 at AWSClient.cpp:930
        frame 8: 0x00000000100b864f clickhouse`DB::S3::Client::BuildHttpRequest() const + 15 at Client.cpp:622
        frame 9: 0x0000000015286a41 clickhouse`Aws::Client::AWSClient::AttemptOneRequest(this=0x00007ffde2f8f000, httpRequest=<unavailable>, request=<unavailable>, signerName=<unavailable>, signerRegionOverride=<unavailable>, signerServiceNameOverride="s3") const + 65 at AWSClient.cpp:491
        frame 10: 0x00000000152845b9 clickhouse`Aws::Client::AWSClient::AttemptExhaustively(this=0x00007ffde2f8f000, uri=0x00007ffdd4d44f38, request=0x00007ffdd4d45d10, method=HTTP_PUT, signerName="SignatureV4", signerRegionOverride="us-east-1", signerServiceNameOverride="s3") const + 1337 at AWSClient.cpp:272
        frame 11: 0x0000000015298d0d clickhouse`Aws::Client::AWSXMLClient::MakeRequest() const + 45 at AWSXmlClient.cpp:99
        frame 12: 0x0000000015298cb5 clickhouse`Aws::Client::AWSXMLClient::MakeRequest() const + 309 at AWSXmlClient.cpp:66
        frame 13: 0x0000000015354b23 clickhouse`Aws::S3::S3Client::PutObject(this=0x00007ffde2f8f000, request=0x00007ffdd4d45d10) const + 2659 at S3Client.cpp:1731
        frame 14: 0x00000000100b174f clickhouse`DB::S3::Client::PutObject(DB::S3::ExtendedRequest<Aws::S3::Model::PutObjectRequest> const&) const [inlined]
        frame 15: 0x00000000100b173a clickhouse`DB::S3::Client::PutObject(DB::S3::ExtendedRequest<Aws::S3::Model::PutObjectRequest> const&) const + 41 at Client.cpp:578
        frame 16: 0x00000000100b1711 clickhouse`DB::S3::Client::PutObject(DB::S3::ExtendedRequest<Aws::S3::Model::PutObjectRequest> const&) const + 981 at Client.cpp:508
        frame 17: 0x00000000100b133c clickhouse`DB::S3::Client::PutObject(DB::S3::ExtendedRequest<Aws::S3::Model::PutObjectRequest> const&) const [inlined]
        frame 18: 0x00000000100b133c clickhouse`DB::S3::Client::PutObject() const + 28 at Client.cpp:418
        frame 19: 0x00000000103b96d6 clickhouse`DB::copyDataToS3File()

</details>

This new behaviour can be enabled with `s3_disable_checksum=true`.

Note that I've checked this implementation with GCS/R2/S3/MinIO and it
works everywhere.
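
The read counts described in this message can be sketched as a small model. This is only an illustration, not code from the AWS SDK or ClickHouse: `UploadOptions`, `checksumAlgorithmName()` and `inputFilePasses()` are hypothetical names, and "MD5" is used only because the stack trace shows an MD5 calculation. The result for HTTP with checksums disabled is an inference from the list of reasons, not something the message states.

```cpp
#include <string>

// Hypothetical options for one upload; not a real ClickHouse or AWS SDK type.
struct UploadOptions
{
    bool use_https;        // per the message, payload signing reads the body only for HTTP
    bool disable_checksum; // stands in for s3_disable_checksum
};

// Stand-in for the overridden GetChecksumAlgorithmName():
// an empty string means no checksum is computed.
inline std::string checksumAlgorithmName(const UploadOptions & options)
{
    return options.disable_checksum ? "" : "MD5";
}

// Counts how many times the input file is read from start to end.
inline int inputFilePasses(const UploadOptions & options)
{
    int passes = 1; // the pass that actually sends the body

    if (!checksumAlgorithmName(options).empty())
        ++passes; // extra pass to calculate the checksum

    if (!options.use_https)
        ++passes; // extra pass to calculate the signature (HTTP only)

    return passes;
}
```

Under these assumptions, `inputFilePasses(UploadOptions{false, false})` gives 3 for HTTP, `inputFilePasses(UploadOptions{true, false})` gives 2 for HTTPS, and `inputFilePasses(UploadOptions{true, true})` gives 1 once checksums are disabled.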
2023-11-26 19:20:19 +01:00
Aleksei Filatov
1a03f5f7f4 Merge remote-tracking branch 'upstream/master' into add_cancellation_point_for_moving_background_operation 2023-11-23 16:43:33 +03:00
kssenii
00177a8016 Merge remote-tracking branch 'origin/master' into slru-for-filesystem-cache 2023-11-23 12:29:53 +01:00
Kseniia Sumarokova
2880e6437e
Merge pull request #56936 from jrdi/fs-cache-hit-profile-events
Add CachedReadBufferReadFromCache{Hits,Misses} profile events
2023-11-22 11:26:00 +01:00
Kseniia Sumarokova
e4f66b8469
Merge pull request #55158 from kssenii/fs-cache-improvement
fs cache improvement for big reads
2023-11-21 21:50:00 +01:00