Commit Graph

1609 Commits

Author SHA1 Message Date
Alexander Gololobov
31bffdd735
Merge pull request #43460 from ClickHouse/smaller_buffer_for_small_files
Use smaller buffer for small files
2022-11-22 14:00:22 +01:00
Alexander Gololobov
5afd2a1add Pass file size to better choose buffer size 2022-11-21 22:36:45 +01:00
Alexander Gololobov
86a0dd010f Use read_hint and file_size for choosing buffer size 2022-11-21 22:36:28 +01:00
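The two buffer-size commits above amount to capping the read buffer at the file size when it is known, so small files don't get a full-size buffer. A minimal sketch of that idea, assuming an illustrative helper (the function name and the 1 MiB default are assumptions, not ClickHouse's actual code):

    #include <algorithm>
    #include <cstddef>
    #include <optional>

    /// Illustrative only: cap the read buffer at the file size when it is known,
    /// so reading many small files does not allocate the full default buffer each time.
    static constexpr size_t DEFAULT_READ_BUFFER_SIZE = 1024 * 1024;

    size_t chooseBufferSize(std::optional<size_t> read_hint, std::optional<size_t> file_size)
    {
        /// Prefer the caller's hint, fall back to the file size, else the default.
        if (read_hint && *read_hint > 0)
            return std::min(*read_hint, DEFAULT_READ_BUFFER_SIZE);
        if (file_size)
            return std::min(std::max<size_t>(*file_size, 1), DEFAULT_READ_BUFFER_SIZE);
        return DEFAULT_READ_BUFFER_SIZE;
    }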
Azat Khuzhin
c393af2812 Fix is_read_only property of local disks
This wasn't a problem before, since it was previously not possible to have
readonly==true.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-11-20 16:35:41 +01:00
Azat Khuzhin
e6e223adc2 Check if local disk is readonly before access checks 2022-11-20 16:35:32 +01:00
Azat Khuzhin
cb60576221 Change DiskLocal::setup() signature (it always returns true)
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-11-20 16:31:37 +01:00
Azat Khuzhin
38c009214a Change the name pattern for memory disks
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-11-20 16:28:35 +01:00
Azat Khuzhin
11be9b9ad1 Create disk directory before access check for local disk
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-11-20 16:28:35 +01:00
Azat Khuzhin
dddcca5cc1 Fix deadlock in DiskRestartProxy on disk restart
stacktrace:
    contrib/libcxx/src/condition_variable.cpp:47::std::__1::condition_variable::wait(std::__1::unique_lock<std::__1::mutex>&)
    contrib/libcxx/src/shared_mutex.cpp:65::std::__1::shared_timed_mutex::lock_shared()
    src/Disks/DiskRestartProxy.cpp:229::DB::DiskRestartProxy::writeFile(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, unsigned long, DB::WriteMode, DB::WriteSettings const&)
    src/Disks/IDisk.cpp:0::DB::IDisk::checkAccessImpl(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&)
    contrib/libcxx/include/string:1499::DB::IDisk::checkAccess()
    src/Disks/IDisk.cpp:0::DB::IDisk::startup(std::__1::shared_ptr<DB::Context const>, bool)
    src/Disks/DiskRestartProxy.cpp:375::DB::DiskRestartProxy::restart(std::__1::shared_ptr<DB::Context const>)
    contrib/libcxx/include/__memory/shared_ptr.h:701::DB::InterpreterSystemQuery::restartDisk(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>&)
    src/Interpreters/InterpreterSystemQuery.cpp:508::DB::InterpreterSystemQuery::execute()
    src/Interpreters/executeQuery.cpp:0::DB::executeQueryImpl(char const*, char const*, std::__1::shared_ptr<DB::Context>, bool, DB::QueryProcessingStage::Enum, DB::ReadBuffer*)
    src/Interpreters/executeQuery.cpp:1083::DB::executeQuery(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, std::__1::shared_ptr<DB::Context>, bool, DB::QueryProcessingStage::Enum)
    src/Server/TCPHandler.cpp:0::DB::TCPHandler::runImpl()
    src/Server/TCPHandler.cpp:1904::DB::TCPHandler::run()
    contrib/poco/Net/src/TCPServerConnection.cpp:57::Poco::Net::TCPServerConnection::start()
    contrib/libcxx/include/__memory/unique_ptr.h:48::Poco::Net::TCPServerDispatcher::run()
    contrib/poco/Foundation/src/ThreadPool.cpp:213::Poco::PooledThread::run()
    contrib/poco/Foundation/include/Poco/SharedPtr.h:156::Poco::ThreadImpl::runnableEntry(void*)

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-11-20 16:28:35 +01:00
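The stack trace above shows restart() holding the proxy's mutex exclusively while the access check re-enters writeFile(), which then waits on a shared lock of the same mutex. A minimal sketch of that self-deadlock pattern, with simplified placeholder names that do not match ClickHouse's actual classes:

    #include <mutex>
    #include <shared_mutex>
    #include <string>

    /// Simplified sketch of the deadlock from the stack trace above.
    class RestartProxySketch
    {
        std::shared_timed_mutex mutex;

    public:
        void writeFile(const std::string & /*path*/)
        {
            std::shared_lock lock(mutex);   /// frame: DiskRestartProxy::writeFile
            /// ... forward the write to the wrapped disk ...
        }

        void restart()
        {
            std::unique_lock lock(mutex);   /// exclusive lock taken for the restart
            /// startup() runs the access check, which calls writeFile() on *this*:
            writeFile("access_check_file"); /// deadlock: shared lock waits on our own exclusive lock
        }
    };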
Azat Khuzhin
44f23c2568 Make disk checks only for clickhouse-server
This will fix clickhouse-disks

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-11-20 16:28:35 +01:00
Azat Khuzhin
2efd29f49d Implement access (read/read-by-offset/write/delete) check for all disks
Previously we had such access checks (read/read-by-offset/write/delete) only
for the S3 and Azure disks; this patch adds them for all disks.

Also, the disk name has been added to the IDisk interface, since it is
required for the error message in IDisk::checkAccess(). However,
DiskEncrypted::encrypted_name had to be added because DiskEncrypted inherits
from DiskDecorator, not from IDisk, and so cannot set the disk name directly
(DiskEncrypted could pass the disk name to DiskDecorator, but it is not used
anywhere there, and doing so would require duplicating the name for every
user of DiskDecorator).

Also, from now on skip_access_check makes sense for every disk, and startup()
is now called for each disk (previously it was skipped for some of them).

Finally, skip_access_check has been added as a member of DiskRestartProxy,
since it calls startup() on restart().

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-11-20 16:28:35 +01:00
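As a rough illustration of what such a generic read/write/delete check can look like, here is a sketch against a simplified placeholder disk interface; the types and method names are assumptions for the example, not ClickHouse's IDisk:

    #include <stdexcept>
    #include <string>

    /// Placeholder disk interface for the sketch (not ClickHouse's IDisk).
    struct DiskSketch
    {
        virtual ~DiskSketch() = default;
        virtual void writeFile(const std::string & path, const std::string & data) = 0;
        virtual std::string readFile(const std::string & path) = 0;
        virtual void removeFile(const std::string & path) = 0;
    };

    void checkAccess(DiskSketch & disk, const std::string & check_path)
    {
        disk.writeFile(check_path, "test");          /// write check
        if (disk.readFile(check_path) != "test")     /// read check
            throw std::runtime_error("read check failed");
        disk.removeFile(check_path);                 /// delete check
    }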
Azat Khuzhin
324b1a7658 Add server UUID for the S3 disks checks to avoid possible races
Otherwise, if you are unlucky, a race condition is possible and you can get
errors because one server has already removed the check file while another
was still trying to read it.

But this was possible only for:
- the batch-delete check for the "s3" disk
- and all checks for the "s3_plain" disk, since this disk does not use
  random names

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-11-20 16:11:45 +01:00
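A minimal sketch of the idea in the commit above: embed the server UUID in the access-check object key so that two servers sharing the same bucket/prefix never read or delete each other's check files (the helper name is an assumption, not ClickHouse's API):

    #include <string>

    /// Per-server access-check key, illustrative only.
    std::string accessCheckKey(const std::string & disk_prefix, const std::string & server_uuid)
    {
        return disk_prefix + "/clickhouse_access_check_" + server_uuid;
    }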
Kseniia Sumarokova
f0dbfbb0f4
Merge pull request #42800 from azat/disks/web-fix
Do not suppress exceptions in web disk (and fix retries for requests from web disk)
2022-11-20 16:07:45 +01:00
Azat Khuzhin
c029549859 Allow to drop tables from s3_plain disk (as well as from web disk)
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-11-19 10:10:27 +01:00
Azat Khuzhin
e2726e03cc Override DiskDecorator::isReadOnly()
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-11-19 10:10:27 +01:00
xiedeyantu
c258d3ac8b fix S3 support for question mark wildcard 2022-11-18 12:11:22 +08:00
Sergei Trifonov
f2f0676bcc
Revert "Revert "S3 request per second rate throttling"" 2022-11-17 17:35:04 +01:00
Alexander Tokmakov
9011a18234
Revert "S3 request per second rate throttling" 2022-11-16 22:33:48 +03:00
Sergei Trifonov
159743edd6
Merge pull request #43014 from ClickHouse/disk-s3-throttler
S3 request per second rate throttling
2022-11-16 18:51:06 +01:00
Azat Khuzhin
1510adeca9 Fix retries for web disk
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-11-12 12:45:19 +01:00
Azat Khuzhin
4bb7832f63 Handle 404 errors as non-fatal when obtaining the list from web disk
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-11-12 12:37:26 +01:00
Azat Khuzhin
d28411f68b Do not suppress exceptions in web disk
Before this patch the web disk could suppress it, and instead of getting an
exception during read you would get just zero rows:

    2022.10.28 15:18:30.739698 [ 10663 ] {} <Error> ReadWriteBufferFromHTTP: HTTP request to `http://127.0.0.1:8080/store/a3e/a3e65e1f-5fd4-47ed-9dbd-307f2586b52d/detached/.index` failed at try 1/1 with bytes read: 0/unknown. Error: Connection refused. (Current backoff wait is 100/1600 ms)
    2022.10.28 15:18:30.841210 [ 10663 ] {} <Trace> DiskWeb: Cannot load disk metadata. Error: Poco::Exception. Code: 1000, e.code() = 111, Connection refused (version 22.11.1.1...

Here the exception was received during initialization, from the server
context (i.e. without a query context), which is why it was not thrown, and
eventually you just get zero rows:

    SELECT *
    FROM data_from_web

    Query id: ee544a5e-3c67-4fb4-8f14-f8e4a082b237

    Ok.

    0 rows in set. Elapsed: 0.019 sec.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-11-12 12:37:26 +01:00
Azat Khuzhin
ac27bc0193 Add trailing slash (/) in S3ObjectStorage::getDirectoryContents()
Otherwise it returns only the directory itself.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-11-10 18:08:41 +01:00
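A small sketch of why the trailing slash matters: S3 "directories" are just key prefixes, so listing the contents of "a/b" must use the prefix "a/b/"; without the trailing slash the listing matches the key "a/b" itself (and siblings such as "a/bc"). The helper below is illustrative, not the actual S3ObjectStorage code:

    #include <string>

    /// Append a trailing slash so the listing prefix denotes directory contents.
    std::string directoryListingPrefix(std::string path)
    {
        if (!path.empty() && path.back() != '/')
            path += '/';
        return path;
    }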
Azat Khuzhin
d3e0f16873 Remove common root path in MetadataStorageFromPlainObjectStorage::listDirectory()
This path should not leak to users or outside code, since it can later be
passed to other APIs, e.g. exists() and so on.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-11-10 18:08:41 +01:00
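A minimal sketch of the idea, with an illustrative helper (names are assumptions): keys returned by the object storage carry the disk's root prefix ("root/dir/file"), but listDirectory() should return paths relative to the disk root so they can be fed back into exists() and friends:

    #include <string>

    /// Strip the disk root prefix from a listed object key.
    std::string stripRootPrefix(const std::string & key, const std::string & root)
    {
        if (key.compare(0, root.size(), root) == 0)
            return key.substr(root.size());
        return key;
    }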
Azat Khuzhin
9296bfb161 Handle list_object_keys_size for Azure
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-11-10 18:08:41 +01:00
Azat Khuzhin
2cbc61df18 Fix ATTACH FROM s3_plain for wide part
Previously this was broken because of an incorrect
MetadataStorageFromPlainObjectStorage::exists(), which used
S3ObjectStorage::exists(); that works only for existing keys, not for an
intermediate path.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-11-10 18:08:41 +01:00
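A sketch of the idea behind the fix, with an in-memory key list standing in for an S3 listing: on a plain object storage there are no real directories, so exists("table/all_1_1_0") must also succeed when the path is merely a prefix of stored keys ("table/all_1_1_0/data.bin"), not an object itself:

    #include <string>
    #include <vector>

    /// Illustrative exists() for a "plain" object storage.
    bool pathExists(const std::vector<std::string> & keys, const std::string & path)
    {
        for (const auto & key : keys)
        {
            if (key == path)
                return true;    /// exact object (a file)
            if (key.size() > path.size()
                && key.compare(0, path.size(), path) == 0
                && key[path.size()] == '/')
                return true;    /// path is an intermediate "directory" prefix
        }
        return false;
    }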
Azat Khuzhin
41884d3b88 Optimize MetadataStorageFromPlainObjectStorage::getFileSize()
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-11-10 18:08:41 +01:00
Azat Khuzhin
7b7ae175df Add max_keys for IObjectStorage::findAllFiles()
v2: Fix google-default-arguments for IObjectStorage::findAllFiles()
v3: Update max_keys for S3 requests in S3ObjectStorage::findAllFiles() loop
v4: Clarify things about max_keys vs list_object_keys_size in S3ObjectStorage::findAllFiles()
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-11-10 18:08:26 +01:00
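A hypothetical paginated listing loop showing how a max_keys cap interacts with the per-request page size (list_object_keys_size); the types and names are stand-ins for illustration, not ClickHouse's IObjectStorage API:

    #include <algorithm>
    #include <cstddef>
    #include <string>
    #include <vector>

    struct Page
    {
        std::vector<std::string> keys;
        bool has_more = false;
    };

    /// Toy "object storage" backed by a vector, standing in for one ListObjectsV2
    /// round trip that returns at most page_size keys starting at offset.
    Page listPage(const std::vector<std::string> & all_keys, size_t offset, size_t page_size)
    {
        Page page;
        size_t end = std::min(all_keys.size(), offset + page_size);
        page.keys.assign(all_keys.begin() + offset, all_keys.begin() + end);
        page.has_more = end < all_keys.size();
        return page;
    }

    /// max_keys == 0 means "no limit".
    std::vector<std::string> findAllFiles(const std::vector<std::string> & all_keys, size_t max_keys, size_t page_size)
    {
        std::vector<std::string> result;
        size_t offset = 0;
        while (max_keys == 0 || result.size() < max_keys)
        {
            /// When a cap is set, never request more keys than are still needed.
            size_t request = page_size;
            if (max_keys != 0)
                request = std::min(page_size, max_keys - result.size());
            Page page = listPage(all_keys, offset, request);
            result.insert(result.end(), page.keys.begin(), page.keys.end());
            offset += page.keys.size();
            if (!page.has_more)
                break;
        }
        return result;
    }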
Azat Khuzhin
434b9c14d8 Handle all entries for azure blob storage (not only the first 5k)
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-11-10 17:42:32 +01:00
Sergei Trifonov
8eedd1e046
Merge branch 'master' into disk-s3-throttler 2022-11-08 15:00:56 +01:00
serxa
6d5d9ff421 rename ReadWriteSettings -> RequestSettings 2022-11-08 13:48:23 +00:00
serxa
2daec0b45e S3 request per second rate throttling + refactoring 2022-11-07 18:05:40 +00:00
Aleksandr Musorin
5cb69d8a22 changed type name for S3_Plain storage
renamed the disk type for S3PlainObjectStorage in the system.disks table from s3 to s3_plain
2022-11-04 17:35:51 +01:00
Azat Khuzhin
c93262170d Remove ReadOnlyMetadataStorage
Throw exceptions from IMetadataStorage instead to avoid introducing
extra abstractions.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-10-31 12:02:30 +01:00
Azat Khuzhin
51bd0c2ac1 Make ReadOnlyMetadataStorage really readonly
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-10-31 12:02:26 +01:00
Azat Khuzhin
bceca73f6f Throw NOT_IMPLEMENTED from ReadOnlyMetadataStorage::getLastChanged()
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-10-31 12:02:26 +01:00
Azat Khuzhin
3b7abbbff4 Rename s/listPrefix/findAllFiles, s/listPrefixInPath/getDirectoryContents/
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-10-31 12:02:26 +01:00
Azat Khuzhin
fa7535c90d Slightly optimize MetadataStorageFromPlainObjectStorage::isDirectory()
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-10-31 12:02:26 +01:00
Azat Khuzhin
82ea67eb51 Slightly better MetadataStorageFromPlainObjectStorage::isFile()
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-10-31 12:02:26 +01:00
Azat Khuzhin
95fb2ad3cf Implement ATTACH of MergeTree table for s3_plain disk
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-10-31 12:02:26 +01:00
Azat Khuzhin
4e42521f44 Introduce ReadOnlyMetadataStorage
And use it for:
- MetadataStorageFromPlainObjectStorage
- MetadataStorageFromStaticFilesWebServer

This allows removing ~100-200 lines of duplicated code and makes the code
less error-prone.

Note: for now this was done without behaviour changes.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-10-31 12:02:26 +01:00
Azat Khuzhin
f8ba24f040 Implement S3ObjectStorage::listPrefixInPath()
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-10-31 12:02:26 +01:00
Azat Khuzhin
18e4fdf40f Introduce IObjectStorage::listPrefixInPath()
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-10-31 12:02:25 +01:00
Azat Khuzhin
ee18140c48 Remove listPrefix() implementation for disks that do not support send_metadata
The reason for removing them is that they are not compatible with restoring
(with send_metadata set) anyway:
- HDFS - not compatible with send_metadata, and besides its implementation is
  not correct, since it is simply `ls -l`, while the equivalent of
  `find . -maxdepth 1 -type f` is required
- Web - not compatible with send_metadata anyway
- Local - not compatible with send_metadata anyway

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-10-31 12:02:25 +01:00
Azat Khuzhin
88db8ae7fa Add a comment for IObjectStorage::listPrefix()
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-10-31 12:02:25 +01:00
Azat Khuzhin
94d9600fb8 Implement MetadataStorageFromPlainObjectStorage::getLastChanged() (as for web)
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-10-31 12:02:25 +01:00
Azat Khuzhin
b34ffda272 Implement MetadataStorageFromPlainObjectStorage::getLastModified() (used by MergeTree)
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-10-31 12:02:25 +01:00
Azat Khuzhin
f6d0c03ee5 Fix path to files in MetadataStorageFromPlainObjectStorage
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-10-31 12:02:25 +01:00
Azat Khuzhin
7a5432feaa Fix possible SIGSEGV for web disks when a file does not exist
It can be triggered in multiple ways: either when a file does not exist and
you are trying to create a MergeTree table from web (which has special code
for UUID handling), or simply by OPTIMIZE TABLE FINAL for a MergeTree table
located on a web disk. In both cases you will get the following:

<details>

<summary>stacktrace</summary>

    2022.10.28 14:08:40.631226 [ 6043 ] {6165bf5f-e76b-4bca-941c-7c7ff5e3b46b} <Trace> ContextAccess (default): Access granted: OPTIMIZE ON default.data_from_web
    2022.10.28 14:08:40.632017 [ 6043 ] {6165bf5f-e76b-4bca-941c-7c7ff5e3b46b} <Debug> default.data_from_web (a3e65e1f-5fd4-47ed-9dbd-307f2586b52d) (MergerMutator): Selected 1 parts from all_1_1_0 to all_1_1_0
    2022.10.28 14:08:40.632496 [ 6043 ] {6165bf5f-e76b-4bca-941c-7c7ff5e3b46b} <Trace> default.data_from_web (a3e65e1f-5fd4-47ed-9dbd-307f2586b52d): Trying to reserve 1.00 MiB using storage policy from min volume index 0
    2022.10.28 14:08:40.632752 [ 6043 ] {6165bf5f-e76b-4bca-941c-7c7ff5e3b46b} <Trace> DiskObjectStorage(DiskWebServer): Reserved 1.00 MiB on remote disk `web_disk`, having unreserved 16.00 EiB.
    2022.10.28 14:08:40.634155 [ 6043 ] {a3e65e1f-5fd4-47ed-9dbd-307f2586b52d::all_1_1_1} <Debug> MergeTask::PrepareStage: Merging 1 parts: from all_1_1_0 to all_1_1_0 into Compact
    2022.10.28 14:08:40.634498 [ 6043 ] {a3e65e1f-5fd4-47ed-9dbd-307f2586b52d::all_1_1_1} <Trace> WebObjectStorage: Loading metadata for directory: http://127.0.0.1:8080/store/a3e/a3e65e1f-5fd4-47ed-9dbd-307f2586b52d/tmp_merge_all_1_1_1
    2022.10.28 14:08:40.635025 [ 6043 ] {a3e65e1f-5fd4-47ed-9dbd-307f2586b52d::all_1_1_1} <Trace> DiskWeb: Adding directory: store/a3e/a3e65e1f-5fd4-47ed-9dbd-307f2586b52d/tmp_merge_all_1_1_1/
    2022.10.28 14:08:40.635355 [ 6043 ] {a3e65e1f-5fd4-47ed-9dbd-307f2586b52d::all_1_1_1} <Trace> ReadWriteBufferFromHTTP: Sending request to http://127.0.0.1:8080/store/a3e/a3e65e1f-5fd4-47ed-9dbd-307f2586b52d/tmp_merge_all_1_1_1/.index
    2022.10.28 14:08:40.639618 [ 6043 ] {a3e65e1f-5fd4-47ed-9dbd-307f2586b52d::all_1_1_1} <Trace> DiskWeb: Cannot load disk metadata. Error: Code: 86. DB::Exception: Received error from remote server /store/a3e/a3e65e1f-5fd4-47ed-9dbd-307f2586b52d/tmp_merge_all_1_1_1/.index. HTTP status code: 404 Not Found, body: <!doctype html><html><head><title>404 Not Found</title><style>
    body { background-color: #fcfcfc; color: #333333; margin: 0; padding:0; }
    h1 { font-size: 1.5em; font-weight: normal; background-color: #9999cc; min-height:2em; line-height:2em; border-bottom: 1px inset black; margin: 0; }
    h1, p { padding-left: 10px; }
    code.url { background-color: #eeeeee; font-family:monospace; padding:0 2px;}
    </style>
    </head><body><h1>Not Found</h1><p>The requested resource <code class="url">/store/a3e/a3e65e1f-5fd4-47ed-9dbd-307f2586b52d/tmp_merge_all_1_1_1/.index</code> was not found on this server.</p></body></html>: while loading disk metadata. (RECEIVED_ERROR_FROM_REMOTE_IO_SERVER) (version 22.11.1.1)
    2022.10.28 14:08:40.640527 [ 5488 ] {} <Trace> BaseDaemon: Received signal 11
    2022.10.28 14:08:40.641529 [ 9027 ] {} <Fatal> BaseDaemon: ########################################
    2022.10.28 14:08:40.642759 [ 9027 ] {} <Fatal> BaseDaemon: (version 22.11.1.1, build id: 12145DA78CE5E9EBB10A034177FAE5967EF81A4A) (from thread 6043) (query_id: a3e65e1f-5fd4-47ed-9dbd-307f2586b52d::all_1_1_1) (query: optimize table data_from_web final) Received signal Segmentation fault (11)
    2022.10.28 14:08:40.643260 [ 9027 ] {} <Fatal> BaseDaemon: Address: NULL pointer. Access: read. Unknown si_code.
    2022.10.28 14:08:40.643769 [ 9027 ] {} <Fatal> BaseDaemon: Stack trace: 0x7ffff416c0f2 0x7ffff7cd1ca8 0x7ffff679ae5e 0x7fffd52e7906 0x7fffd50c65aa 0x7fffca7a0d42 0x7fffcaee79ec 0x7fffcaf242f8 0x7fffcaf242b5 0x7fffcaf2427d 0x7fffcaf24255 0x7fffcaf2421d 0x7ffff65c3686 0x7ffff65c2295 0x7fffcaeee2a9 0x7fffcaef2c43 0x7fffcaee3c0e 0x7fffcc4a7851 0x7fffcc4a768f 0x7fffcc4abb2d 0x7fffcfdce828 0x7fffd03e3eaa 0x7fffd03dfe3b 0x7fffc8ec42d4 0x7fffc8ed51d2 0x7ffff4bdd839 0x7ffff4bde0a8 0x7ffff48ab261 0x7ffff48a769a 0x7ffff48a6335 0x7ffff409f8fd 0x7ffff4121a60
    2022.10.28 14:08:40.644411 [ 9027 ] {} <Fatal> BaseDaemon: 4. ? @ 0x7ffff416c0f2 in ?
    2022.10.28 14:08:40.676390 [ 9027 ] {} <Fatal> BaseDaemon: 5. /src/ch/clickhouse/src/Common/StringUtils/StringUtils.cpp:9: detail::startsWith(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, char const*, unsigned long) @ 0x1ca8 in /src/ch/clickhouse/.cmake/src/Common/StringUtils/libstring_utilsd.so
    2022.10.28 14:08:40.730727 [ 9027 ] {} <Fatal> BaseDaemon: 6. /src/ch/clickhouse/src/Common/StringUtils/StringUtils.h:19: startsWith(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&) @ 0x59ae5e in /src/ch/clickhouse/.cmake/src/libclickhouse_common_iod.so
    2022.10.28 14:08:40.923955 [ 9027 ] {} <Fatal> BaseDaemon: 7. /src/ch/clickhouse/src/Disks/ObjectStorages/Web/MetadataStorageFromStaticFilesWebServer.cpp:58: DB::MetadataStorageFromStaticFilesWebServer::exists(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&) const @ 0x6e7906 in /src/ch/clickhouse/.cmake/src/libdbmsd.so
    2022.10.28 14:08:41.291996 [ 9027 ] {} <Fatal> BaseDaemon: 8. /src/ch/clickhouse/src/Disks/ObjectStorages/DiskObjectStorage.cpp:181: DB::DiskObjectStorage::exists(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&) const @ 0x4c65aa in /src/ch/clickhouse/.cmake/src/libdbmsd.so
    2022.10.28 14:08:41.704697 [ 9027 ] {} <Fatal> BaseDaemon: 9. /src/ch/clickhouse/src/Storages/MergeTree/DataPartStorageOnDisk.cpp:74: DB::DataPartStorageOnDisk::exists() const @ 0xda0d42 in /src/ch/clickhouse/.cmake/src/libclickhouse_storages_mergetreed.so
    2022.10.28 14:08:43.032459 [ 9027 ] {} <Fatal> BaseDaemon: 10. /src/ch/clickhouse/src/Storages/MergeTree/MergeTask.cpp:147: DB::MergeTask::ExecuteAndFinalizeHorizontalPart::prepare() @ 0x14e79ec in /src/ch/clickhouse/.cmake/src/libclickhouse_storages_mergetreed.so
    ...
    Segmentation fault (core dumped)

</details>

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-10-28 15:15:49 +02:00
Anton Popov
79889198b1 fix style check 2022-10-25 23:03:37 +00:00
Anton Popov
c4d4f2dbed better interface 2022-10-25 22:14:06 +00:00
Anton Popov
c8199bc125
Merge branch 'master' into refactor-data-part 2022-10-25 14:31:05 +02:00
Nikolai Kochetov
5c4444237e
Merge pull request #42617 from ClickHouse/revert-revert-41268-disable-s3-parallel-write-for-part-moves-to-disk-s3
Revert revert 41268 disable s3 parallel write for part moves to disk s3
2022-10-25 14:29:27 +02:00
Nikolai Kochetov
5dabbf89ad Fixing build. 2022-10-24 13:43:24 +02:00
Anton Popov
56e5daba0c remove DataPartStorageBuilder 2022-10-23 00:23:15 +00:00
Azat Khuzhin
4e76629aaf Fixes for -Wshorten-64-to-32
- lots of static_cast
- add safe_cast
- types adjustments
  - config
  - IStorage::read/watch
  - ...
- some TODO's (to convert types in future)

P.S. That was quite a journey...

v2: fixes after rebase
v3: fix conflicts after #42308 merged
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-10-21 13:25:19 +02:00
Nikolai Kochetov
57ffcb7ed0 Revert "Merge pull request #41681 from ClickHouse/revert-41268-disable-s3-parallel-write-for-part-moves-to-disk-s3"
This reverts commit 7956c2becf, reversing
changes made to 57be648984.
2022-10-20 19:51:27 +02:00
Vitaly Baranov
1365105bc4 Implement backup to S3 2022-10-19 00:04:41 +02:00
Vitaly Baranov
eb637a6f81
Merge pull request #42232 from azat/backups/s3
Support BACKUP to S3 with as-is path/data structure
2022-10-18 02:37:07 +02:00
filimonov
cd9afdcb7c
Increase request_timeout_ms for s3 disks. 2022-10-14 17:19:14 +02:00
Azat Khuzhin
8830f0608d Support BACKUP to S3 with as-is path/data structure
Right now backup to S3 does not make a lot of sense, since:
- it uses random object names, and decoding them
- requires metadata from the local disk (/var/lib/disks/DISK/BACKUP_NAME)
- or send_metadata (but that is tricky even when enabled)

So this patch adds a simpler interface for S3. It is only suitable for
BACKUP/RESTORE, so don't try to use it for the MergeTree engine.

It is done by adding a separate disk, `s3_plain`, that:
- does not support any extended features, like renames/hardlinks/attrs/...
  (basically everything that MergeTree requires)
- only writes/reads/unlinks/lists files

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-10-14 12:59:25 +02:00
vdimir
0178307c27 Followup for TemporaryDataOnDisk 2022-10-12 15:25:23 +02:00
Azat Khuzhin
688edafcc6 Disks/ObjectStorages: add comments for some classes
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-10-11 17:21:26 +02:00
Azat Khuzhin
f3833e3a53 Introduce StaticDirectoryIterator
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-10-11 17:19:35 +02:00
Alexander Tokmakov
bd10a9d2d4
Merge pull request #42168 from ClickHouse/unreachable_macro
Abort instead of `__builtin_unreachable` in debug builds
2022-10-08 19:05:57 +03:00
Alexander Tokmakov
4175f8cde6 abort instead of __builtin_unreachable in debug builds 2022-10-07 21:49:08 +02:00
kssenii
53d54ae6c9 Fix 2022-10-07 12:26:54 +02:00
Kseniia Sumarokova
eb6e60753d
Merge pull request #42129 from kssenii/fix-non-async-buffer
Fix buffer resize with `remote_filesystem_read_method=read` + fs cache
2022-10-07 12:05:53 +02:00
kssenii
2d96c81f4c Fix 2022-10-06 17:09:20 +02:00
Robert Schulze
cc24fcd6d5
Merge pull request #41873 from ClickHouse/generated-file-cleanup
Consolidate CMake-generated config headers
2022-10-06 14:31:03 +02:00
Raúl Marín
adbaaca2f5
QOL log improvements (#41947)
* Uniformize disk reservation logs

* Remove log about destroying stuff that appears all the time

* More tweaks on disk reservation logs

* Reorder logs in hash join

* Remove log that provides little information

* Collapse part removal logs

Co-authored-by: Sergei Trifonov <sergei@clickhouse.com>
2022-10-06 14:22:44 +02:00
Robert Schulze
da5a2e2db0
Merge remote-tracking branch 'origin/master' into generated-file-cleanup
Physical merge conflicts:
- src/Common/ZooKeeper/ZooKeeperImpl.cpp
- src/Core/config_core.h.in
- src/Functions/FunctionsAES.h
- src/Functions/config_functions.h.in
- src/configure_config.cmake

Logical merge conflicts:
- Functions/tryDecrypt.cpp
2022-10-06 08:43:25 +00:00
kssenii
2ab884359e More efficient WriteBufferFromAzureBlobStorage 2022-10-03 22:15:04 +02:00
kssenii
c6d7cd5820 Fix write into azure blob storage 2022-10-03 18:54:07 +02:00
Robert Schulze
db5ef7b3cb
Merge branch 'master' into generated-file-cleanup 2022-10-02 23:13:18 +02:00
Vitaly Baranov
7be9744bab Remove confusing warning when inserting with perform_ttl_move_on_insert=false. 2022-09-30 20:51:48 +02:00
Vladimir C
895afdec45
Merge pull request #40893 from ClickHouse/vdimir/track-tmp-disk 2022-09-30 11:27:24 +02:00
Robert Schulze
cc92a2d174
Merge branch 'master' into generated-file-cleanup 2022-09-30 09:56:31 +02:00
Alexander Tokmakov
4e422b8046
Merge pull request #41741 from ClickHouse/fix_intersecting_parts
Fix intersecting parts
2022-09-29 18:00:22 +03:00
vdimir
f495361e28
fixes for TemporaryDataOnDisk 2022-09-29 10:09:29 +00:00
vdimir
1234b70118
Handle decorator checking tmp disk type 2022-09-29 09:51:49 +00:00
Robert Schulze
f24fab7747
Fix some #include atrocities 2022-09-28 13:49:28 +00:00
Robert Schulze
78fc36ca49
Generate config.h into ${CONFIG_INCLUDE_PATH}
This makes the target location consistent with other auto-generated
files like config_formats.h, config_core.h, and config_functions.h and
simplifies the build of clickhouse_common.
2022-09-28 12:48:26 +00:00
Kseniia Sumarokova
6037da1ca8
Update CachedOnDiskReadBufferFromFile.cpp 2022-09-28 14:04:39 +02:00
Alexander Tokmakov
287fe40173 Merge branch 'master' into fix_intersecting_parts 2022-09-27 16:55:39 +02:00
Kseniia Sumarokova
caf6a99f41
Merge pull request #41011 from kssenii/fix-thread-status
Fix incorrect attach query in threadpool readers, get rid of static threadpools for reads/writes, make threadpool size for reads/writes configurable
2022-09-27 11:35:07 +02:00
Kseniia Sumarokova
e4f9d311c3
Merge pull request #41731 from kssenii/fix-azure-tests
Try fix azure tests
2022-09-27 11:01:19 +02:00
Alexander Tokmakov
9501f88f8c
Merge branch 'master' into fix_intersecting_parts 2022-09-26 19:53:59 +03:00
alesapin
31f6636a47 Fix endless remove 2022-09-26 16:33:25 +02:00
kssenii
921776625e Fix integration tests 2022-09-26 16:20:00 +02:00
kssenii
a02354458a Review fixes 2022-09-26 12:27:29 +02:00
Kseniia Sumarokova
c53f463d2d
Update S3ObjectStorage.cpp 2022-09-25 15:15:59 +02:00
Kseniia Sumarokova
50f7ce6107
Merge branch 'master' into fix-thread-status 2022-09-24 17:22:45 +02:00
Kseniia Sumarokova
21e09f3e1f
Merge pull request #41733 from kssenii/cache-logging-level-reduce
Change logging levels in cache
2022-09-24 17:20:46 +02:00
alesapin
34d9794ab7
Merge pull request #41653 from ClickHouse/investigating_more_bugs
Add very explicit logging on disk choice for fetch
2022-09-24 16:15:02 +02:00
Kseniia Sumarokova
307314e1bd
Update CachedOnDiskReadBufferFromFile.cpp 2022-09-24 13:42:58 +02:00
kssenii
0a801dad2a Merge remote-tracking branch 'upstream/master' into fix-thread-status 2022-09-23 19:39:07 +02:00
kssenii
30726721ad Fix threadpool reader (for local fs) 2022-09-23 19:35:16 +02:00
Kseniia Sumarokova
81aa9b9199
Update WriteBufferFromAzureBlobStorage.cpp 2022-09-23 15:34:39 +02:00
kssenii
c122e4dd1f Refactor log levels 2022-09-23 15:32:05 +02:00