521 Commits

Author SHA1 Message Date
Vitaly Baranov
f4ac4c3f9d Corrections after review. 2023-05-17 03:23:16 +02:00
Vitaly Baranov
b068f0b619 Fix build. 2023-05-16 14:27:27 +02:00
Vitaly Baranov
943707963f Add backup setting "decrypt_files_from_encrypted_disks" 2023-05-16 14:27:27 +02:00
Vitaly Baranov
019493efa3 Fix throttling in backups. 2023-05-16 14:27:27 +02:00
Vitaly Baranov
5198997fd8 Remove ReadSettings from backup entries. 2023-05-16 14:27:27 +02:00
Vitaly Baranov
7cea264230 Fix whitespaces. 2023-05-16 14:27:27 +02:00
Vitaly Baranov
c48c20fac8 Use combined checksums for encrypted immutable files. 2023-05-16 14:27:27 +02:00
Vitaly Baranov
517e119e03 Move checksum calculation to IBackupEntry. 2023-05-16 14:27:27 +02:00
Vitaly Baranov
002fd19cb7 Move the common part of BackupIO_* to BackupIO_Default. 2023-05-16 14:27:23 +02:00
Vitaly Baranov
c92219f01b BACKUP now writes encrypted data for tables on encrypted disks. 2023-05-16 14:26:33 +02:00
Vitaly Baranov
cc50fcc60a Remove the 'temporary_file_' argument from BackupEntryFromImmutableFile's constructor. 2023-05-16 14:25:37 +02:00
Vitaly Baranov
101aa6eff0 Add function copyS3FileFromDisk(). 2023-05-16 14:25:37 +02:00
Vitaly Baranov
69114cb550 Add function getBlobPath() to IDisk interface to allow copying to/from disks which are not built on top of IObjectStorage. 2023-05-16 14:25:36 +02:00
Vitaly Baranov
fd2731845c Simplify interface of IBackupWriter: Remove supportNativeCopy() function. 2023-05-16 14:25:36 +02:00
Smita Kulkarni
9a2645a729 Fixed clang build 2023-05-16 14:09:38 +02:00
Sema Checherinda
03c51208d1
Merge pull request #44869 from CheSema/multi_part_upload
rework WriteBufferFromS3, add tests, add abortion
2023-05-16 10:52:01 +02:00
Alexey Milovidov
5a44dc26e7 Fixes for clang-17 2023-05-13 02:57:31 +02:00
Sema Checherinda
7fbf87be17 rework WriteBufferFromS3, squashed 2023-05-10 18:31:47 +00:00
Smita Kulkarni
49ecba63af Removed setStageForCluster and added option all_hosts to set stage for cluster 2023-05-08 14:51:04 +02:00
Smita Kulkarni
b0c408faa4 Merge branch 'master' into Follow_up_Backup_Restore_concurrency_check_node_2 2023-05-05 17:51:33 +02:00
Antonio Andelic
a68a023ca7
Merge pull request #48724 from johanngan/sse-kms
Support SSE-KMS configuration with S3 client
2023-05-04 13:20:54 +02:00
alesapin
412b161104
Merge pull request #48791 from kssenii/better-local-object-storage
Make local object storage work consistently with s3 object storage, fix problem with append, make it configurable as independent storage
2023-05-04 11:47:43 +02:00
johanngan
731823b873 Add support for SSE-KMS configuration with S3
https://docs.aws.amazon.com/AmazonS3/latest/userguide/UsingKMSEncryption.html

Similar to the server_side_encryption_customer_key_base64 option for
configuring SSE-C with S3, add the following settings to configure
SSE-KMS on a per-endpoint/disk basis:
  - server_side_encryption_kms_key_id
  - server_side_encryption_kms_encryption_context
  - server_side_encryption_kms_bucket_key_enabled
2023-05-03 21:35:38 -05:00
Nikita Mikhaylov
954e3b724c
Speedup outdated parts loading (#49317) 2023-05-03 18:56:45 +02:00
kssenii
35f437ac9c Address review comments 2023-05-03 14:37:18 +02:00
Alexey Milovidov
530b764953 Fix IBM 2023-04-21 12:38:45 +02:00
Smita Kulkarni
93572ab427 Removed parameter from setStage function and added function setStageForCluster 2023-04-15 13:43:04 +02:00
Smita Kulkarni
d4b2297e9f Fixed comment 2023-04-13 09:53:39 +02:00
SmitaRKulkarni
6568c330c5
Merge branch 'master' into Follow_up_Backup_Restore_concurrency_check_node_2 2023-04-13 09:46:36 +02:00
Smita Kulkarni
49c95a535a Updated to add error or completed status in zookeeper for a cluster for backup/restore, to avoid interpreting previously failed backup/restore when zookeeper is unable to remove nodes 2023-04-12 20:26:57 +02:00
Raúl Marín
45ad555c39
Merge branch 'master' into zk_retry_timeout 2023-04-10 10:04:16 +02:00
Azat Khuzhin
011480924a Use forward declaration of ThreadPool
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-04-07 11:25:35 +02:00
robot-ch-test-poll1
9466cec1fc
Merge pull request #48342 from ClickHouse/Backup_Restore_concurrency_check_node_2
Check node for Backup Restore concurrency
2023-04-05 23:49:32 +02:00
Raúl Marín
2276d4feb4 Backups have no context and no process list element 2023-04-05 11:19:04 +02:00
Azat Khuzhin
dd9f0f409b Remove knowledge about throttling from IBackupWriter::supportNativeCopy()
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-04-05 09:40:19 +02:00
Azat Khuzhin
c8597fbb9a Do not throttle S3-S3 backups if native copy is possible
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-04-05 09:40:18 +02:00
Azat Khuzhin
c332d290d8 Keep only one throttler for BACKUPs IO (instead of separate read/write)
There is no need for separate read/write throttling, because you cannot
write faster than you read anyway, and separate throttling makes the code less clean

(and a single throttler also avoids implementing throttling for backups to S3,
since that path does not use the common S3 writer).

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-04-05 09:40:17 +02:00
Azat Khuzhin
218b1f9c29 Add ability to throttle BACKUPs on per-server/backup basis
Server settings:
- backup_read_bandwidth_for_server
- backup_write_bandwidth_for_server

Query settings:
- backup_read_bandwidth
- backup_write_bandwidth

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-04-05 09:39:48 +02:00
Smita Kulkarni
a36b81a22d Fixed comment 2023-04-04 19:07:31 +02:00
Smita Kulkarni
f4e2d45fbc Added check for backup/restore when they fail and status is not COMPLETED 2023-04-04 19:05:27 +02:00
Raúl Marín
3cee537e73 Changes for master 2023-04-04 18:58:28 +02:00
Raúl Marín
0e17b3b147 Merge remote-tracking branch 'blessed/master' into zk_retry_timeout 2023-04-04 18:53:55 +02:00
Smita Kulkarni
beb164dd51 Merge branch 'master' into Backup_Restore_concurrency_check_node_2 2023-04-04 18:51:30 +02:00
Nikita Mikhaylov
fa5b2bd4a5
Added Keeper retries for backups operations (#47224) 2023-04-04 18:12:08 +02:00
Antonio Andelic
a329d80bfa
Merge pull request #47397 from ClickHouse/enable-env-credentials-default
Enable `use_environment_credentials` by default
2023-04-04 10:00:03 +02:00
Smita Kulkarni
bce8eb7468 Updated to use tryGet instead of get for checking stage of backups/restores in concurrency check 2023-04-03 12:21:16 +02:00
Raúl Marín
8fdf87982c Merge remote-tracking branch 'blessed/master' into zk_retry_timeout 2023-04-03 10:26:18 +02:00
Antonio Andelic
da194f082d
Merge branch 'master' into enable-env-credentials-default 2023-04-03 09:45:03 +02:00
Vitaly Baranov
3f4aadfe7d Add logging for concurrency checks for backups. 2023-03-31 23:50:35 +02:00
Alexey Milovidov
070210a02f
Merge pull request #48271 from vchekan/master
In messages, put values into quotes
2023-03-31 15:35:19 +03:00
Antonio Andelic
e982f2a67a Merge branch 'master' into enable-env-credentials-default 2023-03-31 09:11:01 +00:00
Vadym Chekan
0f4c8144a6 In messages, put values into quotes
Configuration values, such as disk names, backup engine names, etc., may give an error message an unintended meaning. For example, when trying to back up to `disk` instead of `Disk`, the error message will be "Not found backup engine disk", which can be interpreted as "disk of backup engine not found". It might not be clear that the word "disk" comes from the query and is not part of the error message.
2023-03-30 22:46:18 -07:00
Antonio Andelic
80cb121d2a
Merge pull request #48092 from ClickHouse/nosign-keyword-for-s3
Add support for `NOSIGN` keyword and `no_sign_request` config for S3
2023-03-30 18:10:56 +02:00
Alexey Milovidov
e982fb9f1c
Merge pull request #47880 from azat/threadpool-introspection
ThreadPool metrics introspection
2023-03-30 01:27:31 +03:00
Vitaly Baranov
481a7a76ac
Simplify backup coordination for file infos (#48095)
* Remove obsolete code for archive suffixes.

* Simplify backup coordination, stop using it for restoring.

* Build all file infos before writing to backup. Decrease number of znodes.

* Split long values before writing to ZooKeeper.

* Use separate mutexes for unrelated activities.

* Make test test_disallow_concurrency less flaky.

* Add comments and test for backup_keeper_value_max_size.
2023-03-29 15:19:40 +02:00
Antonio Andelic
44e95aa65f Merge branch 'master' into nosign-keyword-for-s3 2023-03-29 11:10:03 +00:00
Azat Khuzhin
f38a7aeabe ThreadPool metrics introspection
There are lots of thread pools and simple local-vs-global is not enough
already, it is good to know which one in particular uses threads.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-03-29 10:46:59 +02:00
Antonio Andelic
160aa186bb Add support for NOSIGN keyword and no_sign_request config 2023-03-28 07:05:35 +00:00
Vitaly Baranov
e43fc77a4e
Merge pull request #46989 from AVMusorin/update-system-backups-periodically
Dynamic update `system.backups`
2023-03-27 17:26:47 +02:00
Raúl Marín
83b68caccc Do not continue retrying to connect to ZK if the query is killed or over limits 2023-03-27 16:01:15 +02:00
Vitaly Baranov
1badc3cba0
Move information about current hosts and list of all hosts to BackupCoordination (#47971)
to simplify the code and help implementing other features.

Co-authored-by: Nikita Mikhaylov <mikhaylovnikitka@gmail.com>
2023-03-24 17:38:19 +01:00
Alexander Tokmakov
cd7d1fb990
Revert "Revert "Revert "Backup_Restore_concurrency_check_node""" 2023-03-24 04:35:50 +03:00
SmitaRKulkarni
04822a63e1
Merge pull request #47586 from ClickHouse/revert-47581-revert-47216-Backup_Restore_concurrency_check_node
Revert "Revert "Backup_Restore_concurrency_check_node""
2023-03-23 10:02:00 +01:00
Azat Khuzhin
1ebbfac721 Use restore_threads (not backup_threads) for RESTORE ASYNC
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-03-21 20:35:00 +02:00
AVMusorin
038bfb40ab
dynamic update system.backups 2023-03-21 11:42:54 +01:00
Vitaly Baranov
198409e12a
Merge pull request #46085 from aalexfvk/alexfvk/store_udf_in_zookeeper
Replication of user-defined SQL functions using ZooKeeper
2023-03-20 13:41:29 +01:00
Antonio Andelic
a0582a14b9
Merge pull request #47423 from ClickHouse/add-expiration-window-s3
Add expiration window for S3 credentials
2023-03-18 10:11:29 +01:00
Aleksei Filatov
886b530963 [rev: 1] Fix review remarks 2023-03-17 13:56:05 +03:00
Aleksei Filatov
690d8355ca Add classes for backup/restore UDF 2023-03-17 13:56:04 +03:00
Sema Checherinda
3c6deddd1d work with comments on PR 2023-03-16 19:55:58 +01:00
Vitaly Baranov
25356786ea Simplify the implementation, create new utility function copyS3FileToDisk(). 2023-03-14 23:34:44 +01:00
SmitaRKulkarni
8db4964ebc
Revert "Revert "Backup_Restore_concurrency_check_node"" 2023-03-14 20:23:43 +01:00
Alexander Tokmakov
773cd5a686
Revert "Backup_Restore_concurrency_check_node" 2023-03-14 18:55:08 +03:00
Vitaly Baranov
1cf1ce07fe Use server-side copy during restore from S3 to S3. 2023-03-13 23:50:13 +01:00
SmitaRKulkarni
9a35a434f8
Merge branch 'master' into Backup_Restore_concurrency_check_node 2023-03-13 10:04:32 +01:00
Antonio Andelic
a170a909a4 Add expiration window for S3 credentials 2023-03-10 10:06:32 +00:00
Antonio Andelic
5bc21538e5 Enable use_environment_credentials by default 2023-03-09 10:31:55 +00:00
Mike Kot
9920a52c51 use std::lerp, constexpr hex.h 2023-03-07 22:50:17 +00:00
Antonio Andelic
12525f768c
Add default constructor for MultiReadResponse (#47254)
* Add default constructor for MultiReadResponse
* Remove optional
* Fix style
2023-03-06 14:18:01 +01:00
Alexey Milovidov
4f85b733f1
Use string concatenation for XML serialization (#47251) 2023-03-05 18:19:33 +01:00
Nikita Mikhaylov
099013831a
Added batching for reads and retries for the most heavy function in backups (#47243) 2023-03-05 16:15:03 +01:00
Alexey Milovidov
a70789c0b3 Whitespace 2023-03-04 09:15:33 +01:00
Nikita Mikhaylov
5c4da5aa4a
Use separate thread pool for IO operations for backups (#47174) 2023-03-03 20:05:42 +01:00
Smita Kulkarni
d2dbd5f293 Updated to use tryGet instead of get for checking stage of backups/restores in concurrency check and updated tests by increasing data size to have a longer backup/restore to ensure the overlap and increased timeout correspondingly. 2023-03-03 16:48:14 +01:00
Smita Kulkarni
0506d9289c Updated Backup/Restore Coordination construction and removed coordination_path and added uuid in settings - Use cluster state data to check concurrent backup/restore 2023-02-16 09:30:27 +01:00
Smita Kulkarni
9817c5601b Fixed clang tidy build by updating parameter name to common_backups_path - Use cluster state data to check concurrent backup/restore 2023-02-12 22:25:33 +01:00
Smita Kulkarni
2ce67830c8 Fixed style check by removing trailing whitespaces in BackupsWorker.h - Use cluster state data to check concurrent backup/restore 2023-02-10 14:41:43 +01:00
Smita Kulkarni
94fba0b664 Fixed build issue caused after merge master in BackupsWorker.h - Use cluster state data to check concurrent backup/restore 2023-02-10 13:53:21 +01:00
Smita Kulkarni
a89d208ed7 Merge branch 'master' into Cluster_state_for_disallow_concurrent_backup_restore 2023-02-10 12:17:01 +01:00
Smita Kulkarni
7fee8995d3 Addressed review comments and moved concurrency check to Backup/Restore Coordination - Use cluster state data to check concurrent backup/restore 2023-02-10 12:04:05 +01:00
Azat Khuzhin
a3a5867b07 Fix data race in BACKUP
Fixes the following data race:

<details>

WARNING: ThreadSanitizer: data race (pid=1)
  Write of size 8 at 0x7b580016ff20 by thread T218 (mutexes: write M0):
    0 DB::BackupImpl::writeFile() build_docker/../src/Backups/BackupImpl.cpp:1000:9 (clickhouse+0x1bd0b7a6) (BuildId: 3558ba44526114e01870f02cc410103fa6cb8de3)
    1 DB::writeBackupEntries()::$_0::operator()(bool) const build_docker/../src/Backups/BackupUtils.cpp:109:25 (clickhouse+0x1bc19cda) (BuildId: 3558ba44526114e01870f02cc410103fa6cb8de3)

  Previous read of size 8 at 0x7b580016ff20 by thread T238:
    0 DB::BackupImpl::writeFile() build_docker/../src/Backups/BackupImpl.cpp:956:14 (clickhouse+0x1bd0ae8d) (BuildId: 3558ba44526114e01870f02cc410103fa6cb8de3)
    1 DB::writeBackupEntries()::$_0::operator()(bool) const build_docker/../src/Backups/BackupUtils.cpp:109:25 (clickhouse+0x1bc19cda) (BuildId: 3558ba44526114e01870f02cc410103fa6cb8de3)

</details>

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-02-04 11:09:11 +01:00
Antonio Andelic
d5117f2aa6
Define S3 client with bucket and endpoint resolution (#45783)
* Update aws

* Define S3 client with bucket and endpoint resolution

* Add defines for ErrorCodes

* Use S3Client everywhere

* Remove unused errorcode

* Add DROP S3 CLIENT CACHE query

* Add a comment

* Fix style

* Update aws

* Update reference files

* Add missing include

* Fix unit test

* Remove unneeded declarations

* Correctly use RetryStrategy

* Rename S3Client to Client

* Fix retry count

* fix clang-tidy warnings
2023-02-03 14:30:52 +01:00
Vitaly Baranov
45d2d678ab
Merge pull request #45800 from vitlibar/rename-new-columns-in-system-backups
Rename new columns in system.backups
2023-02-03 14:00:16 +01:00
Smita Kulkarni
ef54683386 Use cluster state data to check concurrent backup/restore
Implementation:
* BackupWorker checks whether any backup/restore that has a path in ZooKeeper has a status other than completed; if so, the new backup/restore is stopped.
* For queries that are not ON CLUSTER, only active backups/restores are checked.
* Removed restore_uuid from RestoreSettings, as it is no longer used.
2023-02-02 19:25:14 +01:00
Vitaly Baranov
96b140cc95 Rename columns in system.backups: num_files, num_processed_files, processed_files_size
num_processed_files -> num_files (BACKUP) / files_read (RESTORE)
processed_files_size -> total_size (BACKUP) / bytes_read (RESTORE)
2023-01-31 22:45:41 +01:00
Pradeep Chhetri
deaa70fb14
Merge branch 'master' into pchhetri/fix-45690 2023-01-30 21:35:16 +08:00
Vitaly Baranov
38910412c4
Merge pull request #42244 from AVMusorin/fix_backup_restore_num_files
Added num_processed_files and processed_files_size for backup and restore processes
2023-01-30 09:24:49 +01:00
Vitaly Baranov
326f4d2a4f Fix using mutex for increaseProcessSize 2023-01-29 17:50:53 +01:00
Pradeep Chhetri
8156a6761f Set compression method and level for backup writer
Signed-off-by: Pradeep Chhetri <pradeepchhetri4444@gmail.com>
2023-01-28 21:49:59 +08:00
Alexander Tokmakov
a584ad0eb1 forbid runtime strings 2023-01-26 10:52:47 +01:00
Alexander Tokmakov
6eb557b2ba Merge branch 'master' into exception_message_patterns4 2023-01-25 13:49:17 +01:00
Vitaly Baranov
32efe92199
Merge pull request #45487 from vitlibar/use-new-copy-s3-functions-in-s3-obj-storage
Use new copy s3 functions in S3ObjectStorage
2023-01-25 13:22:04 +01:00
Smita Kulkarni
6be7d1c24a Addressed review comments and renamed function to hasConcurrentBackups/Restores - Updated backup/restore status when concurrent backups & restores are not allowed 2023-01-24 16:20:12 +01:00
Smita Kulkarni
642f9ca549 Merge branch 'master' into 45486_Fix_flaky_test_for_disallowing_concurrent_backups_restores 2023-01-24 09:37:28 +01:00
Smita Kulkarni
9ae5ac2388 Moved concurrency checks inside functions - Updated backup/restore status when concurrent backups & restores are not allowed 2023-01-24 09:31:51 +01:00
Alexander Tokmakov
3f6594f4c6 forbid old ctor of Exception 2023-01-23 22:18:05 +01:00
Alexander Tokmakov
70d1adfe4b
Better formatting for exception messages (#45449)
* save format string for NetException

* format exceptions

* format exceptions 2

* format exceptions 3

* format exceptions 4

* format exceptions 5

* format exceptions 6

* fix

* format exceptions 7

* format exceptions 8

* Update MergeTreeIndexGin.cpp

* Update AggregateFunctionMap.cpp

* Update AggregateFunctionMap.cpp

* fix
2023-01-24 00:13:58 +03:00
AVMusorin
82f194fbc6
added mutex for increaseProcessedSize 2023-01-23 17:15:50 +01:00
Aleksandr
2caeed901b
Merge branch 'master' into fix_backup_restore_num_files 2023-01-23 13:43:20 +01:00
Smita Kulkarni
310ae62d90 Updated backup/restore status when concurrent backups & restores are not allowed
Implementation:
* Moved concurrent backup/restore check inside try-catch block which sets the status so that other nodes in cluster are aware of failures.
* Renamed backup_uuid to restore_uuid in RestoreSettings.
Testing:
* Updated test test_backup_and_restore_on_cluster/test_disallow_concurrency to check for specific backup/restore id.
2023-01-22 19:01:09 +01:00
Azat Khuzhin
2a8f116c18 Forward declaration of ConcurrentBoundedQueue in ThreadStatus
ThreadStatus is the header that recompiles almost all ClickHouse
modules.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-01-21 16:02:09 +01:00
Vitaly Baranov
5ceb64accc Use new copy s3 functions in S3ObjectStorage. 2023-01-21 15:47:58 +01:00
Aleksandr
206eb4d446
removed unused if statement for increaseProcessedSize
Co-authored-by: Vitaly Baranov <vitlibar@hotmail.com>
2023-01-21 15:07:25 +01:00
Vitaly Baranov
f0fda580d0
Merge pull request #45188 from vitlibar/backup-to-s3-memory-optimization
Optimize memory consumption during backup to S3
2023-01-21 12:37:35 +01:00
Alexander Tokmakov
910d6dc0ce
Merge pull request #45342 from ClickHouse/exception_message_patterns
Save message format strings for DB::Exception
2023-01-20 18:46:52 +03:00
Aleksandr Musorin
838acb22b7
added num_processed_files and processed_files_size 2023-01-20 10:20:41 +01:00
SmitaRKulkarni
db03dd1bb9
Merge branch 'master' into 43891_Disallow_concurrent_backups_and_restores 2023-01-19 09:32:50 +01:00
Smita Kulkarni
d7ca742d98 Fixed style check for beginning of if - Added settings to disallow concurrent backups and restores 2023-01-18 08:59:47 +01:00
Smita Kulkarni
ee526ce877 Fix style check - Added settings to disallow concurrent backups and restores 2023-01-17 22:52:55 +01:00
Smita Kulkarni
6e06af1b25 Updated strategy for handling internal backups & restores to avoid concurrent internal backups & restores - Added settings to disallow concurrent backups and restores 2023-01-17 22:27:13 +01:00
Alexander Tokmakov
5cd90c1a3e Merge branch 'master' into exception_message_patterns 2023-01-17 20:04:04 +01:00
Vitaly Baranov
14a7ee8e26 Copy files to S3 during backup directly without using WriteBufferFromS3 to decrease memory consumption. 2023-01-17 09:35:41 +01:00
Alexander Tokmakov
522686f78b less empty patterns 2023-01-17 01:19:44 +01:00
Vitaly Baranov
21b8aaeb8b Stop using HeadObject requests in S3
because they don't work well with endpoints without explicit region.
2023-01-15 20:28:11 +01:00
Alexander Tokmakov
881b17492f Merge branch 'master' into fix_get_part_name 2023-01-10 21:39:35 +01:00
Smita Kulkarni
93530e8d34 Added settings to disallow concurrent backups and restores
Implementation:
* Added server level settings to disallow concurrent backups and restores, which are read and set when BackupWorker is created in Context.
* Settings are set to true by default.
* Before starting backup or restores, added a check to see if any other backups/restores are running (except internal ones).
Testing:
* Added a test test_backup_and_restore_on_cluster/test_disallow_concurrency.
2023-01-09 18:14:39 +01:00
kssenii
67509aa2d5 Merge remote-tracking branch 'upstream/master' into use-new-named-collections-code-2 2023-01-03 16:41:30 +01:00
Azat Khuzhin
c9c590071d Add ability to disable deduplication for BACKUP
Right now BACKUP omits similar files, which does not allow using this
backup as a regular table, and usually those similar files are quite
small (e.g. columns.txt).

So by using `BACKUP TO S3() deduplicate_files=0` it becomes possible to
use `ATTACH TABLE` directly from S3.

P.S. right now it is possible only for the table with one part, since,
usually, there is nothing to deduplicate (if the columns are different).

v2: Add deduplicate_files into metadata
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-12-27 15:56:06 +01:00
Azat Khuzhin
7d81c39207 backups: ignore file not found error for S3 (similar to Disk)
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-12-27 15:56:05 +01:00
Azat Khuzhin
998bf444e6 backups: remove IBackupCoordination::getFileSizeAndChecksum() (in favor of getFileInfo())
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-12-27 15:56:05 +01:00
Alexander Tokmakov
cdc3912743 fix incorrect usages of getPartName() 2022-12-20 22:44:27 +01:00
kssenii
6bd4f8c029 Merge remote-tracking branch 'upstream/master' into use-new-named-collections-code-2 2022-12-20 21:17:28 +01:00
Nikolai Kochetov
b2355a2212 Fixing tests. 2022-12-17 16:02:34 +00:00
Nikolai Kochetov
29c6caaeaf Validate s3 part upload settings. 2022-12-17 14:18:42 +00:00
Nikolai Kochetov
62ff98344e Validate s3 part upload settings. 2022-12-17 14:09:53 +00:00
kssenii
30547d2dcd Replace old named collections code for url 2022-12-17 00:24:05 +01:00
Anton Popov
8b9b8b083c
Merge pull request #43726 from CurtizJ/optimize-storage-s3
Improve performance of storage `S3` with large number of small files
2022-12-16 14:38:10 +01:00
Vitaly Baranov
fb8aca8319
Merge pull request #44158 from vitlibar/improve-referential-deps
Improve referential dependencies
2022-12-14 21:17:02 +01:00
Anton Popov
cce3257f39
Merge branch 'master' into optimize-storage-s3 2022-12-13 21:35:12 +01:00
Vitaly Baranov
d7eccb4581
Merge pull request #43940 from azat/backups/gcs
Fix BACKUP TO S3 for Google Cloud Storage
2022-12-13 19:04:52 +01:00
Anton Popov
0c87031e80 Merge remote-tracking branch 'upstream/master' into HEAD 2022-12-13 16:33:21 +00:00
Vitaly Baranov
4f0d1c5e0f Fix copying of query contexts for async backup/restore. 2022-12-12 18:22:14 +01:00
Vitaly Baranov
0207637f6b Use query context instead of the global context in DDLDependencyVisitor. 2022-12-12 18:22:14 +01:00
Vitaly Baranov
76ba8ab3d4 Add new tests. 2022-12-12 18:22:09 +01:00
Vitaly Baranov
b91af1b650 Fix initialization of s3 request settings. 2022-12-10 05:43:51 +01:00
Vitaly Baranov
0ba4870a18 Fix race in S3 multipart upload. 2022-12-09 03:02:39 +01:00
Azat Khuzhin
3d8ea48103 Fix BACKUP TO S3 for Google Cloud Storage (no batch delete support)
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-12-08 20:46:54 +01:00
Vitaly Baranov
e1f7f04752
Referential dependencies for RESTORE (#43834)
* Rename DDLDependencyVisitor -> DDLLoadingDependencyVisitor.

* Move building a loading graph to TablesLoader.

* Implement referential dependencies for tables and use them
when restoring tables from a backup.

* Remove StorageID::operator < (because of its inconsistency with ==).

* Add new tests.

* Fix test.

* Fix memory leak.

Co-authored-by: Nikita Mikhaylov <mikhaylovnikitka@gmail.com>
2022-12-02 15:05:46 +01:00
Kruglov Pavel
8f22c9b013
Merge pull request #43824 from ianton-ru/ORION-1976
Fix multipart upload for large S3 object
2022-12-01 12:26:50 +01:00
Anton Ivashkin
d6ca97c8d0 Fix multipart upload for large S3 object 2022-11-30 11:58:04 +02:00
Alexander Gololobov
8f49c1ea16 Moved helpers to Common/XMLUtils.* 2022-11-28 22:14:18 +01:00
Anton Popov
65a78bcd91 improve performance of storage S3 2022-11-26 15:24:01 +00:00
Alexander Gololobov
6064f83aca Use XMLDocument instead of XMLConfiguration for faster loading 2022-11-24 15:00:08 +01:00
Kseniia Sumarokova
5c90d5aa7e
Merge pull request #43253 from xiedeyantu/fix-s3-glob
fix s3 support question mark wildcard
2022-11-22 14:26:56 +01:00
Sergei Trifonov
d05223e70b
Merge pull request #43335 from ClickHouse/revert-43306-revert-43014-disk-s3-throttler
Revert "Revert "S3 request per second rate throttling""
2022-11-18 16:22:30 +01:00
Vitaly Baranov
a348332eab
Merge pull request #43227 from vitlibar/improve-masking-sensitive-info
Improve masking sensitive info
2022-11-18 15:37:50 +01:00
xiedeyantu
c258d3ac8b fix s3 support question mark wildcard 2022-11-18 12:11:22 +08:00
Vitaly Baranov
050df6ac7f Move InDepthNodeVisitor.h back to src/Interpreters. 2022-11-17 18:16:32 +01:00
Sergei Trifonov
f2f0676bcc
Revert "Revert "S3 request per second rate throttling"" 2022-11-17 17:35:04 +01:00
Alexander Tokmakov
9011a18234
Revert "S3 request per second rate throttling" 2022-11-16 22:33:48 +03:00
Vitaly Baranov
b280b68333 Fix style. 2022-11-16 15:57:50 +01:00
Vitaly Baranov
ce81166c7e Fix style. 2022-11-16 01:35:11 +01:00
Vitaly Baranov
8d72f75556 Make the password wiping an option of IAST::format(). 2022-11-16 01:35:06 +01:00
Kseniia Sumarokova
59cf5def67
Merge branch 'master' into disk-s3-throttler 2022-11-15 12:13:37 +01:00
Vitaly Baranov
8e99f5fea3 Move maskSensitiveInfoInQueryForLogging() to src/Parsers/ 2022-11-14 18:55:19 +01:00
Vitaly Baranov
e18c97faf7 Remove dependencies maskSensitiveInfo() from Context. 2022-11-14 18:55:19 +01:00
serxa
ad377b357f fix backup tests 2022-11-11 13:24:43 +00:00
Vitaly Baranov
ae19af0015 Fix backup of Lazy databases. 2022-11-10 00:27:00 +01:00
serxa
2de26daa56 fix build 2022-11-08 14:31:29 +00:00
Sergei Trifonov
8eedd1e046
Merge branch 'master' into disk-s3-throttler 2022-11-08 15:00:56 +01:00
serxa
6d5d9ff421 rename ReadWriteSettings -> RequestSettings 2022-11-08 13:48:23 +00:00
serxa
2daec0b45e S3 request per second rate throttling + refactoring 2022-11-07 18:05:40 +00:00
Vitaly Baranov
52b1f4aed9
Merge pull request #42484 from vitlibar/mask-sensitive-info-in-logs
Mask some information in logs
2022-11-04 14:09:38 +01:00
Vitaly Baranov
32194c1200 Add max limitation for the size of an uploaded part. 2022-11-02 17:53:54 +01:00
Vitaly Baranov
b9f2f17331 Add test and logging. 2022-11-01 12:23:20 +01:00
Vitaly Baranov
914ab51992 Increase the size of upload part exponentially for backup to S3. 2022-10-31 17:54:41 +01:00
Vitaly Baranov
a30bfada63 Wipe passwords from backup logs too. 2022-10-31 10:50:33 +01:00
Azat Khuzhin
4e76629aaf Fixes for -Wshorten-64-to-32
- lots of static_cast
- add safe_cast
- types adjustments
  - config
  - IStorage::read/watch
  - ...
- some TODO's (to convert types in future)

P.S. That was quite a journey...

v2: fixes after rebase
v3: fix conflicts after #42308 merged
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-10-21 13:25:19 +02:00
Vitaly Baranov
1365105bc4 Implement backup to S3 2022-10-19 00:04:41 +02:00
Vitaly Baranov
69ebf12dab
Merge pull request #42146 from azat/backups/metadata-overflow-fix
Fix reusing of files > 4GB from base backup
2022-10-08 00:22:28 +02:00
Azat Khuzhin
dae8d6b316 Convert backup version from UInt64 to int
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-10-07 14:59:38 +02:00
Azat Khuzhin
94566abda9 Fix reusing of files > 4GB from base backup
Previously u64 numbers were truncated to u32 numbers during writing to the
metadata xml file, and a further incremental backup could not reuse them,
since the file in the base backup is smaller.

P.S. There can be other places, I thought about enabling
-Wshorten-64-to-32, but there are lots of warnings right now.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-10-07 14:56:09 +02:00
Azat Khuzhin
2c84ad30ba Fix double "file" in "Writing backup for file" message
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-10-07 14:27:11 +02:00
Alfonso Martinez
65b161341c Replaced changed functions for tryLockForShare 2022-09-28 18:08:10 +02:00
Alfonso Martinez
6bb166b79b exception replaced by nullptr 2022-09-28 17:41:51 +02:00
Azat Khuzhin
b698a4ff65 Apply changes to http handlers on fly without server restart
This has been implemented by simply restarting the http servers when the
http_handlers directive in the configuration xml has changed.

But for this I had to change the handlers interface to accept the
configuration separately, since the configuration stored in the
server is the configuration with which the server was started.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
Co-authored-by: Antonio Andelic
2022-09-12 17:34:51 +02:00
Vitaly Baranov
122009a2bd Use table lock if database is ordinary and zero-copy-replication is enabled. 2022-09-08 13:54:59 +02:00
Vitaly Baranov
9c847ceec9 No hardlinks while making backup of MergeTree in atomic database. 2022-09-07 11:44:50 +02:00
Robert Schulze
c7c00f9002
Merge pull request #40739 from ClickHouse/clang-tidy-for-headers
Enable clang-tidy for headers
2022-09-02 07:54:50 +02:00
alesapin
1ae7e82126
Merge pull request #40819 from vitlibar/fix-locking-when-writing-backup
Fix locking while writing backup in multiple threads
2022-09-01 13:20:21 +02:00
Vitaly Baranov
007ae0e6cc Fix incremental backups for Log family. 2022-08-31 12:57:28 +02:00
Robert Schulze
cedf75ed5e
Enable clang-tidy for headers
clang-tidy now also checks code in header files. Because the analyzer
finds tons of issues, activate the check only for directory "base/" (see
file ".clang-tidy"). All other directories, in particular "src/" are
left to future work.

While many findings were fixed, some were not (and suppressed instead).
Reasons for this include: a) the file is 1:1 copypaste of a 3rd-party
lib (e.g. pcg_extras.h) and fixing stuff would make upgrades/fixes more
difficult b) a fix would have broken lots of using code
2022-08-31 10:48:15 +00:00
Vitaly Baranov
77d741dc25 Add comments. 2022-08-30 18:58:13 +02:00
Vitaly Baranov
86872b2307 Fix locking while writing backup in multiple threads. 2022-08-30 18:10:54 +02:00
Vladimir C
ddde5096ef
Merge branch 'master' into vdimir/tmp-file-metrics 2022-08-25 15:23:35 +02:00
vdimir
fbc35f066b
Remove unused ctors of BackupEntryFromImmutableFile 2022-08-24 16:14:07 +00:00
alesapin
669a48e302 Merge branch 'data_source_description' of github.com:ClickHouse/ClickHouse into data_source_description 2022-08-24 17:48:15 +02:00
alesapin
571778ad25
Update src/Backups/BackupIO_Disk.cpp
Co-authored-by: Kseniia Sumarokova <54203879+kssenii@users.noreply.github.com>
2022-08-24 17:45:26 +02:00
alesapin
814bc37f0e Use DiskPtr 2022-08-24 17:45:20 +02:00
alesapin
354f4e90eb Remove redundant lines 2022-08-21 18:21:01 +02:00