Commit Graph

348 Commits

Author SHA1 Message Date
Vitaly Baranov
1cf1ce07fe Use server-side copy during restore from S3 to S3. 2023-03-13 23:50:13 +01:00
SmitaRKulkarni
9a35a434f8
Merge branch 'master' into Backup_Restore_concurrency_check_node 2023-03-13 10:04:32 +01:00
Antonio Andelic
a170a909a4 Add expiration window for S3 credentials 2023-03-10 10:06:32 +00:00
Antonio Andelic
5bc21538e5 Enable use_environment_credentials by default 2023-03-09 10:31:55 +00:00
Mike Kot
9920a52c51 use std::lerp, constexpr hex.h 2023-03-07 22:50:17 +00:00
Antonio Andelic
12525f768c
Add default constructor for MultiReadResponse (#47254)
* Add default constructor for MultiReadResponse
* Remove optional
* Fix style
2023-03-06 14:18:01 +01:00
Alexey Milovidov
4f85b733f1
Use string concatenation for XML serialization (#47251) 2023-03-05 18:19:33 +01:00
Nikita Mikhaylov
099013831a
Added batching for reads and retries for the most heavy function in backups (#47243) 2023-03-05 16:15:03 +01:00
Alexey Milovidov
a70789c0b3 Whitespace 2023-03-04 09:15:33 +01:00
Nikita Mikhaylov
5c4da5aa4a
Use separate thread pool for IO operations for backups (#47174) 2023-03-03 20:05:42 +01:00
Smita Kulkarni
d2dbd5f293 Updated to use tryGet instead of get for checking stage of backups/restores in concurrency check and updated tests by increasing data size to have a longer backup/restore to ensure the overlap and increased timeout correspondingly. 2023-03-03 16:48:14 +01:00
Smita Kulkarni
0506d9289c Updated Backup/Restore Coordination construction and removed coordination_path and added uuid in settings - Use cluster state data to check concurrent backup/restore 2023-02-16 09:30:27 +01:00
Smita Kulkarni
9817c5601b Fixed clang tidy build by updating parameter name to common_backups_path - Use cluster state data to check concurrent backup/restore 2023-02-12 22:25:33 +01:00
Smita Kulkarni
2ce67830c8 Fixed style check by removing trailing whitespaces in BackupsWorker.h - Use cluster state data to check concurrent backup/restore 2023-02-10 14:41:43 +01:00
Smita Kulkarni
94fba0b664 Fixed build issue caused after merge master in BackupsWorker.h - Use cluster state data to check concurrent backup/restore 2023-02-10 13:53:21 +01:00
Smita Kulkarni
a89d208ed7 Merge branch 'master' into Cluster_state_for_disallow_concurrent_backup_restore 2023-02-10 12:17:01 +01:00
Smita Kulkarni
7fee8995d3 Addressed review comments and moved concurrency check to Backup/Restore Coordination - Use cluster state data to check concurrent backup/restore 2023-02-10 12:04:05 +01:00
Azat Khuzhin
a3a5867b07 Fix data race in BACKUP
Fixes the following data race:

<details>

WARNING: ThreadSanitizer: data race (pid=1)
  Write of size 8 at 0x7b580016ff20 by thread T218 (mutexes: write M0):
    0 DB::BackupImpl::writeFile() build_docker/../src/Backups/BackupImpl.cpp:1000:9 (clickhouse+0x1bd0b7a6) (BuildId: 3558ba44526114e01870f02cc410103fa6cb8de3)
    1 DB::writeBackupEntries()::$_0::operator()(bool) const build_docker/../src/Backups/BackupUtils.cpp:109:25 (clickhouse+0x1bc19cda) (BuildId: 3558ba44526114e01870f02cc410103fa6cb8de3)

  Previous read of size 8 at 0x7b580016ff20 by thread T238:
    0 DB::BackupImpl::writeFile() build_docker/../src/Backups/BackupImpl.cpp:956:14 (clickhouse+0x1bd0ae8d) (BuildId: 3558ba44526114e01870f02cc410103fa6cb8de3)
    1 DB::writeBackupEntries()::$_0::operator()(bool) const build_docker/../src/Backups/BackupUtils.cpp:109:25 (clickhouse+0x1bc19cda) (BuildId: 3558ba44526114e01870f02cc410103fa6cb8de3)

</details>

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-02-04 11:09:11 +01:00
Antonio Andelic
d5117f2aa6
Define S3 client with bucket and endpoint resolution (#45783)
* Update aws

* Define S3 client with bucket and endpoint resolution

* Add defines for ErrorCodes

* Use S3Client everywhere

* Remove unused errorcode

* Add DROP S3 CLIENT CACHE query

* Add a comment

* Fix style

* Update aws

* Update reference files

* Add missing include

* Fix unit test

* Remove unneeded declarations

* Correctly use RetryStrategy

* Rename S3Client to Client

* Fix retry count

* fix clang-tidy warnings
2023-02-03 14:30:52 +01:00
Vitaly Baranov
45d2d678ab
Merge pull request #45800 from vitlibar/rename-new-columns-in-system-backups
Rename new columns in system.backups
2023-02-03 14:00:16 +01:00
Smita Kulkarni
ef54683386 Use cluster state data to check concurrent backup/restore
Implementation:
* BackupWorker checks the if any backup/restore which has a path in zookeeper has status not completed, if yes, new backup/restore is stopped.
* For not on cluster only active backup / restore is checked.
* Removed restore_uuid from RestoreSettings, as it is no longer used.
2023-02-02 19:25:14 +01:00
Vitaly Baranov
96b140cc95 Rename columns in system.backups: num_files, num_processed_files, processed_files_size
num_processed_files -> num_files (BACKUP) / files_read (RESTORE)
processed_files_size -> total_size (BACKUP) / bytes_read (RESTORE)
2023-01-31 22:45:41 +01:00
Pradeep Chhetri
deaa70fb14
Merge branch 'master' into pchhetri/fix-45690 2023-01-30 21:35:16 +08:00
Vitaly Baranov
38910412c4
Merge pull request #42244 from AVMusorin/fix_backup_restore_num_files
Added num_processed_files and processed_files_size for backup and restore processes
2023-01-30 09:24:49 +01:00
Vitaly Baranov
326f4d2a4f Fix using mutex for increaseProcessSize 2023-01-29 17:50:53 +01:00
Pradeep Chhetri
8156a6761f Set compression method and level for backup writer
Signed-off-by: Pradeep Chhetri <pradeepchhetri4444@gmail.com>
2023-01-28 21:49:59 +08:00
Alexander Tokmakov
a584ad0eb1 forbid runtime strings 2023-01-26 10:52:47 +01:00
Alexander Tokmakov
6eb557b2ba Merge branch 'master' into exception_message_patterns4 2023-01-25 13:49:17 +01:00
Vitaly Baranov
32efe92199
Merge pull request #45487 from vitlibar/use-new-copy-s3-functions-in-s3-obj-storage
Use new copy s3 functions in S3ObjectStorage
2023-01-25 13:22:04 +01:00
Smita Kulkarni
6be7d1c24a Addressed review comments and renamed function to hasConcurrentBackups/Restores - Updated backup/restore status when concurrent backups & restores are not allowed 2023-01-24 16:20:12 +01:00
Smita Kulkarni
642f9ca549 Merge branch 'master' into 45486_Fix_flaky_test_for_disallowing_concurrent_backups_restores 2023-01-24 09:37:28 +01:00
Smita Kulkarni
9ae5ac2388 Moved concurrency checks inside functions - Updated backup/restore status when concurrent backups & restores are not allowed 2023-01-24 09:31:51 +01:00
Alexander Tokmakov
3f6594f4c6 forbid old ctor of Exception 2023-01-23 22:18:05 +01:00
Alexander Tokmakov
70d1adfe4b
Better formatting for exception messages (#45449)
* save format string for NetException

* format exceptions

* format exceptions 2

* format exceptions 3

* format exceptions 4

* format exceptions 5

* format exceptions 6

* fix

* format exceptions 7

* format exceptions 8

* Update MergeTreeIndexGin.cpp

* Update AggregateFunctionMap.cpp

* Update AggregateFunctionMap.cpp

* fix
2023-01-24 00:13:58 +03:00
AVMusorin
82f194fbc6
added mutex for increaseProcessedSize 2023-01-23 17:15:50 +01:00
Aleksandr
2caeed901b
Merge branch 'master' into fix_backup_restore_num_files 2023-01-23 13:43:20 +01:00
Smita Kulkarni
310ae62d90 Updated backup/restore status when concurrent backups & restores are not allowed
Implementation:
* Moved concurrent backup/restore check inside try-catch block which sets the status so that other nodes in cluster are aware of failures.
* Renamed backup_uuid to restore_uuid in RestoreSettings.
Testing:
* Updated test test_backup_and_restore_on_cluster/test_disallow_concurrency to check for specific backup/restore id.
2023-01-22 19:01:09 +01:00
Azat Khuzhin
2a8f116c18 Forward declaration of ConcurrentBoundedQueue in ThreadStatus
ThreadStatus is the header that recomplies almost all ClickHouse
modules.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-01-21 16:02:09 +01:00
Vitaly Baranov
5ceb64accc Use new copy s3 functions in S3ObjectStorage. 2023-01-21 15:47:58 +01:00
Aleksandr
206eb4d446
removed unused if statement for increaseProcessedSize
Co-authored-by: Vitaly Baranov <vitlibar@hotmail.com>
2023-01-21 15:07:25 +01:00
Vitaly Baranov
f0fda580d0
Merge pull request #45188 from vitlibar/backup-to-s3-memory-optimization
Optimize memory consumption during backup to S3
2023-01-21 12:37:35 +01:00
Alexander Tokmakov
910d6dc0ce
Merge pull request #45342 from ClickHouse/exception_message_patterns
Save message format strings for DB::Exception
2023-01-20 18:46:52 +03:00
Aleksandr Musorin
838acb22b7
added num_processed_files and processed_files_size 2023-01-20 10:20:41 +01:00
SmitaRKulkarni
db03dd1bb9
Merge branch 'master' into 43891_Disallow_concurrent_backups_and_restores 2023-01-19 09:32:50 +01:00
Smita Kulkarni
d7ca742d98 Fixed style check for beginning of if - Added settings to disallow concurrent backups and restores 2023-01-18 08:59:47 +01:00
Smita Kulkarni
ee526ce877 Fix style check - Added settings to disallow concurrent backups and restores 2023-01-17 22:52:55 +01:00
Smita Kulkarni
6e06af1b25 Updated strategy for handling internal backups & restores to avoid concurrent internal backups & restores - Added settings to disallow concurrent backups and restores 2023-01-17 22:27:13 +01:00
Alexander Tokmakov
5cd90c1a3e Merge branch 'master' into exception_message_patterns 2023-01-17 20:04:04 +01:00
Vitaly Baranov
14a7ee8e26 Copy files to S3 during backup directly without using WriteBufferFromS3 to decrease memory consumption. 2023-01-17 09:35:41 +01:00
Alexander Tokmakov
522686f78b less empty patterns 2023-01-17 01:19:44 +01:00
Vitaly Baranov
21b8aaeb8b Stop using HeadObject requests in S3
because they don't work well with endpoints without explicit region.
2023-01-15 20:28:11 +01:00
Alexander Tokmakov
881b17492f Merge branch 'master' into fix_get_part_name 2023-01-10 21:39:35 +01:00
Smita Kulkarni
93530e8d34 Added settings to disallow concurrent backups and restores
Implementation:
* Added server level settings to disallow concurrent backups and restores, which are read and set when BackupWorker is created in Context.
* Settings are set to true by default.
* Before starting backup or restores, added a check to see if any other backups/restores are running (except internal ones).
Testing:
* Added a test test_backup_and_restore_on_cluster/test_disallow_concurrency.
2023-01-09 18:14:39 +01:00
kssenii
67509aa2d5 Merge remote-tracking branch 'upstream/master' into use-new-named-collections-code-2 2023-01-03 16:41:30 +01:00
Azat Khuzhin
c9c590071d Add ability to disable deduplication for BACKUP
Right now BACKUP omit similar files, and will not allow to use this
backup as a regular table, and usually those similar files are quite
small (i.e. columns.txt).

So by using `BACKUP TO S3() deduplicate_files=0` you will be possible to
use `ATTACH TABLE` directly from S3.

P.S. right now it is possible only for the table with one part, since,
usually, there is nothing to deduplicate (if the columns are different).

v2: Add deduplicate_files into metadata
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-12-27 15:56:06 +01:00
Azat Khuzhin
7d81c39207 backups: ignore file not found error for S3 (similar to Disk)
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-12-27 15:56:05 +01:00
Azat Khuzhin
998bf444e6 backups: remove IBackupCoordination::getFileSizeAndChecksum() (in favor of getFileInfo())
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-12-27 15:56:05 +01:00
Alexander Tokmakov
cdc3912743 fix incorrect usages of getPartName() 2022-12-20 22:44:27 +01:00
kssenii
6bd4f8c029 Merge remote-tracking branch 'upstream/master' into use-new-named-collections-code-2 2022-12-20 21:17:28 +01:00
Nikolai Kochetov
b2355a2212 Fixing tests. 2022-12-17 16:02:34 +00:00
Nikolai Kochetov
29c6caaeaf Validate s3 part upload settings. 2022-12-17 14:18:42 +00:00
Nikolai Kochetov
62ff98344e Validate s3 part upload settings. 2022-12-17 14:09:53 +00:00
kssenii
30547d2dcd Replace old named collections code for url 2022-12-17 00:24:05 +01:00
Anton Popov
8b9b8b083c
Merge pull request #43726 from CurtizJ/optimize-storage-s3
Improve performance of storage `S3` with large number of small files
2022-12-16 14:38:10 +01:00
Vitaly Baranov
fb8aca8319
Merge pull request #44158 from vitlibar/improve-referential-deps
Improve referential dependencies
2022-12-14 21:17:02 +01:00
Anton Popov
cce3257f39
Merge branch 'master' into optimize-storage-s3 2022-12-13 21:35:12 +01:00
Vitaly Baranov
d7eccb4581
Merge pull request #43940 from azat/backups/gcs
Fix BACKUP TO S3 for Google Cloud Storage
2022-12-13 19:04:52 +01:00
Anton Popov
0c87031e80 Merge remote-tracking branch 'upstream/master' into HEAD 2022-12-13 16:33:21 +00:00
Vitaly Baranov
4f0d1c5e0f Fix copying of query contexts for async backup/restore. 2022-12-12 18:22:14 +01:00
Vitaly Baranov
0207637f6b Use query context instead of the global context in DDLDependencyVisitor. 2022-12-12 18:22:14 +01:00
Vitaly Baranov
76ba8ab3d4 Add new tests. 2022-12-12 18:22:09 +01:00
Vitaly Baranov
b91af1b650 Fix initialization of s3 request settings. 2022-12-10 05:43:51 +01:00
Vitaly Baranov
0ba4870a18 Fix race in S3 multipart upload. 2022-12-09 03:02:39 +01:00
Azat Khuzhin
3d8ea48103 Fix BACKUP TO S3 for Google Cloud Storage (no batch delete support)
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-12-08 20:46:54 +01:00
Vitaly Baranov
e1f7f04752
Referential dependencies for RESTORE (#43834)
* Rename DDLDependencyVisitor -> DDLLoadingDependencyVisitor.

* Move building a loading graph to TablesLoader.

* Implement referential dependencies for tables and use them
when restoring tables from a backup.

* Remove StorageID::operator < (because of its inconsistency with ==).

* Add new tests.

* Fix test.

* Fix memory leak.

Co-authored-by: Nikita Mikhaylov <mikhaylovnikitka@gmail.com>
2022-12-02 15:05:46 +01:00
Kruglov Pavel
8f22c9b013
Merge pull request #43824 from ianton-ru/ORION-1976
Fix multipart upload for large S3 object
2022-12-01 12:26:50 +01:00
Anton Ivashkin
d6ca97c8d0 Fix multipart upload for large S3 object 2022-11-30 11:58:04 +02:00
Alexander Gololobov
8f49c1ea16 Moved helpers to Common/XMLUtils.* 2022-11-28 22:14:18 +01:00
Anton Popov
65a78bcd91 improve performance of storage S3 2022-11-26 15:24:01 +00:00
Alexander Gololobov
6064f83aca Use XMLDocument instead of XMLConfiguration for faster loading 2022-11-24 15:00:08 +01:00
Kseniia Sumarokova
5c90d5aa7e
Merge pull request #43253 from xiedeyantu/fix-s3-glob
fix s3 support question mark wildcard
2022-11-22 14:26:56 +01:00
Sergei Trifonov
d05223e70b
Merge pull request #43335 from ClickHouse/revert-43306-revert-43014-disk-s3-throttler
Revert "Revert "S3 request per second rate throttling""
2022-11-18 16:22:30 +01:00
Vitaly Baranov
a348332eab
Merge pull request #43227 from vitlibar/improve-masking-sensitive-info
Improve masking sensitive info
2022-11-18 15:37:50 +01:00
xiedeyantu
c258d3ac8b fix s3 support question mark wildcard 2022-11-18 12:11:22 +08:00
Vitaly Baranov
050df6ac7f Move InDepthNodeVisitor.h back to src/Interpreters. 2022-11-17 18:16:32 +01:00
Sergei Trifonov
f2f0676bcc
Revert "Revert "S3 request per second rate throttling"" 2022-11-17 17:35:04 +01:00
Alexander Tokmakov
9011a18234
Revert "S3 request per second rate throttling" 2022-11-16 22:33:48 +03:00
Vitaly Baranov
b280b68333 Fix style. 2022-11-16 15:57:50 +01:00
Vitaly Baranov
ce81166c7e Fix style. 2022-11-16 01:35:11 +01:00
Vitaly Baranov
8d72f75556 Make the password wiping an option of IAST::format(). 2022-11-16 01:35:06 +01:00
Kseniia Sumarokova
59cf5def67
Merge branch 'master' into disk-s3-throttler 2022-11-15 12:13:37 +01:00
Vitaly Baranov
8e99f5fea3 Move maskSensitiveInfoInQueryForLogging() to src/Parsers/ 2022-11-14 18:55:19 +01:00
Vitaly Baranov
e18c97faf7 Remove dependencies maskSensitiveInfo() from Context. 2022-11-14 18:55:19 +01:00
serxa
ad377b357f fix backup tests 2022-11-11 13:24:43 +00:00
Vitaly Baranov
ae19af0015 Fix backup of Lazy databases. 2022-11-10 00:27:00 +01:00
serxa
2de26daa56 fix build 2022-11-08 14:31:29 +00:00
Sergei Trifonov
8eedd1e046
Merge branch 'master' into disk-s3-throttler 2022-11-08 15:00:56 +01:00
serxa
6d5d9ff421 rename ReadWriteSettings -> RequestSettings 2022-11-08 13:48:23 +00:00
serxa
2daec0b45e S3 request per second rate throttling + refactoring 2022-11-07 18:05:40 +00:00
Vitaly Baranov
52b1f4aed9
Merge pull request #42484 from vitlibar/mask-sensitive-info-in-logs
Mask some information in logs
2022-11-04 14:09:38 +01:00