
521 Commits

Author SHA1 Message Date
Antonio Andelic
e982f2a67a Merge branch 'master' into enable-env-credentials-default 2023-03-31 09:11:01 +00:00
Vadym Chekan
0f4c8144a6 In messages, put values into quotes
Configuration values such as disk names or backup engine names can give an error message an unintended meaning. For example, when trying to back up to `disk` instead of `Disk`, the error message is "Not found backup engine disk", which can be read as "disk of backup engine not found". It may not be clear that the word "disk" comes from the query and is not part of the error message.
2023-03-30 22:46:18 -07:00
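The ambiguity described in this commit message can be illustrated with a minimal Python sketch. The function names and the exact quoting style are hypothetical and chosen for illustration only; the actual change was made in ClickHouse's C++ sources.

```python
def unquoted_message(engine_name: str) -> str:
    # The value is spliced in as-is, so it reads like part of the sentence.
    return f"Not found backup engine {engine_name}"


def quoted_message(engine_name: str) -> str:
    # Quoting marks the value as something that came from the query.
    return f"Not found backup engine '{engine_name}'"


if __name__ == "__main__":
    print(unquoted_message("disk"))
    print(quoted_message("disk"))
```

With quotes, a reader can tell at a glance which word is user input and which words belong to the fixed message text.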
Antonio Andelic
80cb121d2a
Merge pull request #48092 from ClickHouse/nosign-keyword-for-s3
Add support for `NOSIGN` keyword and `no_sign_request` config for S3
2023-03-30 18:10:56 +02:00
Alexey Milovidov
e982fb9f1c
Merge pull request #47880 from azat/threadpool-introspection
ThreadPool metrics introspection
2023-03-30 01:27:31 +03:00
Vitaly Baranov
481a7a76ac
Simplify backup coordination for file infos (#48095)
* Remove obsolete code for archive suffixes.

* Simplify backup coordination, stop using it for restoring.

* Build all file infos before writing to backup. Decrease number of znodes.

* Split long values before writing to ZooKeeper.

* Use separate mutexes for unrelated activities.

* Make test test_disallow_concurrency less flaky.

* Add comments and test for backup_keeper_value_max_size.
2023-03-29 15:19:40 +02:00
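The "split long values before writing to ZooKeeper" item above can be sketched roughly as follows. This is an assumption-laden illustration in Python: `split_value` and `join_parts` are hypothetical helpers, and how the real code names or stores the resulting pieces is not stated in the commit message. Only the setting name `backup_keeper_value_max_size` comes from the source.

```python
from typing import List


def split_value(value: str, max_size: int) -> List[str]:
    """Split a value into pieces no longer than max_size (hypothetical helper).

    max_size plays the role of backup_keeper_value_max_size.
    """
    if max_size <= 0:
        raise ValueError("max_size must be positive")
    parts = [value[i:i + max_size] for i in range(0, len(value), max_size)]
    # Keep one empty piece so an empty value still round-trips.
    return parts or [""]


def join_parts(parts: List[str]) -> str:
    """Reassemble the original value from its pieces, in order."""
    return "".join(parts)


if __name__ == "__main__":
    pieces = split_value("abcdefgh", 3)
    print(pieces)
    print(join_parts(pieces))
```

Each piece could then be written to its own node, so that no single write exceeds the size limit.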
Antonio Andelic
44e95aa65f Merge branch 'master' into nosign-keyword-for-s3 2023-03-29 11:10:03 +00:00
Azat Khuzhin
f38a7aeabe ThreadPool metrics introspection
There are lots of thread pools, and a simple local-vs-global split is no
longer enough; it is good to know which pool in particular uses the threads.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-03-29 10:46:59 +02:00
Antonio Andelic
160aa186bb Add support for NOSIGN keyword and no_sign_request config 2023-03-28 07:05:35 +00:00
Vitaly Baranov
e43fc77a4e
Merge pull request #46989 from AVMusorin/update-system-backups-periodically
Dynamic update `system.backups`
2023-03-27 17:26:47 +02:00
Raúl Marín
83b68caccc Do not continue retrying to connect to ZK if the query is killed or over limits 2023-03-27 16:01:15 +02:00
Vitaly Baranov
1badc3cba0
Move information about current hosts and list of all hosts to BackupCoordination (#47971)
to simplify the code and help implementing other features.

Co-authored-by: Nikita Mikhaylov <mikhaylovnikitka@gmail.com>
2023-03-24 17:38:19 +01:00
Alexander Tokmakov
cd7d1fb990
Revert "Revert "Revert "Backup_Restore_concurrency_check_node""" 2023-03-24 04:35:50 +03:00
SmitaRKulkarni
04822a63e1
Merge pull request #47586 from ClickHouse/revert-47581-revert-47216-Backup_Restore_concurrency_check_node
Revert "Revert "Backup_Restore_concurrency_check_node""
2023-03-23 10:02:00 +01:00
Azat Khuzhin
1ebbfac721 Use restore_threads (not backup_threads) for RESTORE ASYNC
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-03-21 20:35:00 +02:00
AVMusorin
038bfb40ab
dynamic update system.backups 2023-03-21 11:42:54 +01:00
Vitaly Baranov
198409e12a
Merge pull request #46085 from aalexfvk/alexfvk/store_udf_in_zookeeper
Replication of user-defined SQL functions using ZooKeeper
2023-03-20 13:41:29 +01:00
Antonio Andelic
a0582a14b9
Merge pull request #47423 from ClickHouse/add-expiration-window-s3
Add expiration window for S3 credentials
2023-03-18 10:11:29 +01:00
Aleksei Filatov
886b530963 [rev: 1] Fix review remarks 2023-03-17 13:56:05 +03:00
Aleksei Filatov
690d8355ca Add classes for backup/restore UDF 2023-03-17 13:56:04 +03:00
Sema Checherinda
3c6deddd1d work with comments on PR 2023-03-16 19:55:58 +01:00
Vitaly Baranov
25356786ea Simplify the implementation, create new utility function copyS3FileToDisk(). 2023-03-14 23:34:44 +01:00
SmitaRKulkarni
8db4964ebc
Revert "Revert "Backup_Restore_concurrency_check_node"" 2023-03-14 20:23:43 +01:00
Alexander Tokmakov
773cd5a686
Revert "Backup_Restore_concurrency_check_node" 2023-03-14 18:55:08 +03:00
Vitaly Baranov
1cf1ce07fe Use server-side copy during restore from S3 to S3. 2023-03-13 23:50:13 +01:00
SmitaRKulkarni
9a35a434f8
Merge branch 'master' into Backup_Restore_concurrency_check_node 2023-03-13 10:04:32 +01:00
Antonio Andelic
a170a909a4 Add expiration window for S3 credentials 2023-03-10 10:06:32 +00:00
Antonio Andelic
5bc21538e5 Enable use_environment_credentials by default 2023-03-09 10:31:55 +00:00
Mike Kot
9920a52c51 use std::lerp, constexpr hex.h 2023-03-07 22:50:17 +00:00
Antonio Andelic
12525f768c
Add default constructor for MultiReadResponse (#47254)
* Add default constructor for MultiReadResponse
* Remove optional
* Fix style
2023-03-06 14:18:01 +01:00
Alexey Milovidov
4f85b733f1
Use string concatenation for XML serialization (#47251) 2023-03-05 18:19:33 +01:00
Nikita Mikhaylov
099013831a
Added batching for reads and retries for the most heavy function in backups (#47243) 2023-03-05 16:15:03 +01:00
Alexey Milovidov
a70789c0b3 Whitespace 2023-03-04 09:15:33 +01:00
Nikita Mikhaylov
5c4da5aa4a
Use separate thread pool for IO operations for backups (#47174) 2023-03-03 20:05:42 +01:00
Smita Kulkarni
d2dbd5f293 Updated to use tryGet instead of get for checking stage of backups/restores in concurrency check and updated tests by increasing data size to have a longer backup/restore to ensure the overlap and increased timeout correspondingly. 2023-03-03 16:48:14 +01:00
Smita Kulkarni
0506d9289c Updated Backup/Restore Coordination construction and removed coordination_path and added uuid in settings - Use cluster state data to check concurrent backup/restore 2023-02-16 09:30:27 +01:00
Smita Kulkarni
9817c5601b Fixed clang tidy build by updating parameter name to common_backups_path - Use cluster state data to check concurrent backup/restore 2023-02-12 22:25:33 +01:00
Smita Kulkarni
2ce67830c8 Fixed style check by removing trailing whitespaces in BackupsWorker.h - Use cluster state data to check concurrent backup/restore 2023-02-10 14:41:43 +01:00
Smita Kulkarni
94fba0b664 Fixed build issue caused after merge master in BackupsWorker.h - Use cluster state data to check concurrent backup/restore 2023-02-10 13:53:21 +01:00
Smita Kulkarni
a89d208ed7 Merge branch 'master' into Cluster_state_for_disallow_concurrent_backup_restore 2023-02-10 12:17:01 +01:00
Smita Kulkarni
7fee8995d3 Addressed review comments and moved concurrency check to Backup/Restore Coordination - Use cluster state data to check concurrent backup/restore 2023-02-10 12:04:05 +01:00
Azat Khuzhin
a3a5867b07 Fix data race in BACKUP
Fixes the following data race:

<details>

WARNING: ThreadSanitizer: data race (pid=1)
  Write of size 8 at 0x7b580016ff20 by thread T218 (mutexes: write M0):
    0 DB::BackupImpl::writeFile() build_docker/../src/Backups/BackupImpl.cpp:1000:9 (clickhouse+0x1bd0b7a6) (BuildId: 3558ba44526114e01870f02cc410103fa6cb8de3)
    1 DB::writeBackupEntries()::$_0::operator()(bool) const build_docker/../src/Backups/BackupUtils.cpp:109:25 (clickhouse+0x1bc19cda) (BuildId: 3558ba44526114e01870f02cc410103fa6cb8de3)

  Previous read of size 8 at 0x7b580016ff20 by thread T238:
    0 DB::BackupImpl::writeFile() build_docker/../src/Backups/BackupImpl.cpp:956:14 (clickhouse+0x1bd0ae8d) (BuildId: 3558ba44526114e01870f02cc410103fa6cb8de3)
    1 DB::writeBackupEntries()::$_0::operator()(bool) const build_docker/../src/Backups/BackupUtils.cpp:109:25 (clickhouse+0x1bc19cda) (BuildId: 3558ba44526114e01870f02cc410103fa6cb8de3)

</details>

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-02-04 11:09:11 +01:00
Antonio Andelic
d5117f2aa6
Define S3 client with bucket and endpoint resolution (#45783)
* Update aws

* Define S3 client with bucket and endpoint resolution

* Add defines for ErrorCodes

* Use S3Client everywhere

* Remove unused errorcode

* Add DROP S3 CLIENT CACHE query

* Add a comment

* Fix style

* Update aws

* Update reference files

* Add missing include

* Fix unit test

* Remove unneeded declarations

* Correctly use RetryStrategy

* Rename S3Client to Client

* Fix retry count

* fix clang-tidy warnings
2023-02-03 14:30:52 +01:00
Vitaly Baranov
45d2d678ab
Merge pull request #45800 from vitlibar/rename-new-columns-in-system-backups
Rename new columns in system.backups
2023-02-03 14:00:16 +01:00
Smita Kulkarni
ef54683386 Use cluster state data to check concurrent backup/restore
Implementation:
* BackupWorker checks whether any backup/restore that has a path in ZooKeeper has a status other than completed; if so, the new backup/restore is stopped.
* For backups/restores that are not ON CLUSTER, only active backups/restores are checked.
* Removed restore_uuid from RestoreSettings, as it is no longer used.
2023-02-02 19:25:14 +01:00
Vitaly Baranov
96b140cc95 Rename columns in system.backups: num_files, num_processed_files, processed_files_size
num_processed_files -> num_files (BACKUP) / files_read (RESTORE)
processed_files_size -> total_size (BACKUP) / bytes_read (RESTORE)
2023-01-31 22:45:41 +01:00
Pradeep Chhetri
deaa70fb14
Merge branch 'master' into pchhetri/fix-45690 2023-01-30 21:35:16 +08:00
Vitaly Baranov
38910412c4
Merge pull request #42244 from AVMusorin/fix_backup_restore_num_files
Added num_processed_files and processed_files_size for backup and restore processes
2023-01-30 09:24:49 +01:00
Vitaly Baranov
326f4d2a4f Fix using mutex for increaseProcessSize 2023-01-29 17:50:53 +01:00
Pradeep Chhetri
8156a6761f Set compression method and level for backup writer
Signed-off-by: Pradeep Chhetri <pradeepchhetri4444@gmail.com>
2023-01-28 21:49:59 +08:00
Alexander Tokmakov
a584ad0eb1 forbid runtime strings 2023-01-26 10:52:47 +01:00
Alexander Tokmakov
6eb557b2ba Merge branch 'master' into exception_message_patterns4 2023-01-25 13:49:17 +01:00
Vitaly Baranov
32efe92199
Merge pull request #45487 from vitlibar/use-new-copy-s3-functions-in-s3-obj-storage
Use new copy s3 functions in S3ObjectStorage
2023-01-25 13:22:04 +01:00
Smita Kulkarni
6be7d1c24a Addressed review comments and renamed function to hasConcurrentBackups/Restores - Updated backup/restore status when concurrent backups & restores are not allowed 2023-01-24 16:20:12 +01:00
Smita Kulkarni
642f9ca549 Merge branch 'master' into 45486_Fix_flaky_test_for_disallowing_concurrent_backups_restores 2023-01-24 09:37:28 +01:00
Smita Kulkarni
9ae5ac2388 Moved concurrency checks inside functions - Updated backup/restore status when concurrent backups & restores are not allowed 2023-01-24 09:31:51 +01:00
Alexander Tokmakov
3f6594f4c6 forbid old ctor of Exception 2023-01-23 22:18:05 +01:00
Alexander Tokmakov
70d1adfe4b
Better formatting for exception messages (#45449)
* save format string for NetException

* format exceptions

* format exceptions 2

* format exceptions 3

* format exceptions 4

* format exceptions 5

* format exceptions 6

* fix

* format exceptions 7

* format exceptions 8

* Update MergeTreeIndexGin.cpp

* Update AggregateFunctionMap.cpp

* Update AggregateFunctionMap.cpp

* fix
2023-01-24 00:13:58 +03:00
AVMusorin
82f194fbc6
added mutex for increaseProcessedSize 2023-01-23 17:15:50 +01:00
Aleksandr
2caeed901b
Merge branch 'master' into fix_backup_restore_num_files 2023-01-23 13:43:20 +01:00
Smita Kulkarni
310ae62d90 Updated backup/restore status when concurrent backups & restores are not allowed
Implementation:
* Moved concurrent backup/restore check inside try-catch block which sets the status so that other nodes in cluster are aware of failures.
* Renamed backup_uuid to restore_uuid in RestoreSettings.
Testing:
* Updated test test_backup_and_restore_on_cluster/test_disallow_concurrency to check for specific backup/restore id.
2023-01-22 19:01:09 +01:00
Azat Khuzhin
2a8f116c18 Forward declaration of ConcurrentBoundedQueue in ThreadStatus
ThreadStatus is a header whose changes recompile almost all ClickHouse
modules.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-01-21 16:02:09 +01:00
Vitaly Baranov
5ceb64accc Use new copy s3 functions in S3ObjectStorage. 2023-01-21 15:47:58 +01:00
Aleksandr
206eb4d446
removed unused if statement for increaseProcessedSize
Co-authored-by: Vitaly Baranov <vitlibar@hotmail.com>
2023-01-21 15:07:25 +01:00
Vitaly Baranov
f0fda580d0
Merge pull request #45188 from vitlibar/backup-to-s3-memory-optimization
Optimize memory consumption during backup to S3
2023-01-21 12:37:35 +01:00
Alexander Tokmakov
910d6dc0ce
Merge pull request #45342 from ClickHouse/exception_message_patterns
Save message format strings for DB::Exception
2023-01-20 18:46:52 +03:00
Aleksandr Musorin
838acb22b7
added num_processed_files and processed_files_size 2023-01-20 10:20:41 +01:00
SmitaRKulkarni
db03dd1bb9
Merge branch 'master' into 43891_Disallow_concurrent_backups_and_restores 2023-01-19 09:32:50 +01:00
Smita Kulkarni
d7ca742d98 Fixed style check for beginning of if - Added settings to disallow concurrent backups and restores 2023-01-18 08:59:47 +01:00
Smita Kulkarni
ee526ce877 Fix style check - Added settings to disallow concurrent backups and restores 2023-01-17 22:52:55 +01:00
Smita Kulkarni
6e06af1b25 Updated strategy for handling internal backups & restores to avoid concurrent internal backups & restores - Added settings to disallow concurrent backups and restores 2023-01-17 22:27:13 +01:00
Alexander Tokmakov
5cd90c1a3e Merge branch 'master' into exception_message_patterns 2023-01-17 20:04:04 +01:00
Vitaly Baranov
14a7ee8e26 Copy files to S3 during backup directly without using WriteBufferFromS3 to decrease memory consumption. 2023-01-17 09:35:41 +01:00
Alexander Tokmakov
522686f78b less empty patterns 2023-01-17 01:19:44 +01:00
Vitaly Baranov
21b8aaeb8b Stop using HeadObject requests in S3
because they don't work well with endpoints without explicit region.
2023-01-15 20:28:11 +01:00
Alexander Tokmakov
881b17492f Merge branch 'master' into fix_get_part_name 2023-01-10 21:39:35 +01:00
Smita Kulkarni
93530e8d34 Added settings to disallow concurrent backups and restores
Implementation:
* Added server level settings to disallow concurrent backups and restores, which are read and set when BackupWorker is created in Context.
* Settings are set to true by default.
* Before starting backup or restores, added a check to see if any other backups/restores are running (except internal ones).
Testing:
* Added a test test_backup_and_restore_on_cluster/test_disallow_concurrency.
2023-01-09 18:14:39 +01:00
kssenii
67509aa2d5 Merge remote-tracking branch 'upstream/master' into use-new-named-collections-code-2 2023-01-03 16:41:30 +01:00
Azat Khuzhin
c9c590071d Add ability to disable deduplication for BACKUP
Right now BACKUP omits similar files, which prevents using the
backup as a regular table, and usually those similar files are quite
small (e.g. columns.txt).

So by using `BACKUP TO S3() deduplicate_files=0` it becomes possible to
use `ATTACH TABLE` directly from S3.

P.S. right now it is possible only for the table with one part, since,
usually, there is nothing to deduplicate (if the columns are different).

v2: Add deduplicate_files into metadata
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-12-27 15:56:06 +01:00
Azat Khuzhin
7d81c39207 backups: ignore file not found error for S3 (similar to Disk)
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-12-27 15:56:05 +01:00
Azat Khuzhin
998bf444e6 backups: remove IBackupCoordination::getFileSizeAndChecksum() (in favor of getFileInfo())
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-12-27 15:56:05 +01:00
Alexander Tokmakov
cdc3912743 fix incorrect usages of getPartName() 2022-12-20 22:44:27 +01:00
kssenii
6bd4f8c029 Merge remote-tracking branch 'upstream/master' into use-new-named-collections-code-2 2022-12-20 21:17:28 +01:00
Nikolai Kochetov
b2355a2212 Fixing tests. 2022-12-17 16:02:34 +00:00
Nikolai Kochetov
29c6caaeaf Validate s3 part upload settings. 2022-12-17 14:18:42 +00:00
Nikolai Kochetov
62ff98344e Validate s3 part upload settings. 2022-12-17 14:09:53 +00:00
kssenii
30547d2dcd Replace old named collections code for url 2022-12-17 00:24:05 +01:00
Anton Popov
8b9b8b083c
Merge pull request #43726 from CurtizJ/optimize-storage-s3
Improve performance of storage `S3` with large number of small files
2022-12-16 14:38:10 +01:00
Vitaly Baranov
fb8aca8319
Merge pull request #44158 from vitlibar/improve-referential-deps
Improve referential dependencies
2022-12-14 21:17:02 +01:00
Anton Popov
cce3257f39
Merge branch 'master' into optimize-storage-s3 2022-12-13 21:35:12 +01:00
Vitaly Baranov
d7eccb4581
Merge pull request #43940 from azat/backups/gcs
Fix BACKUP TO S3 for Google Cloud Storage
2022-12-13 19:04:52 +01:00
Anton Popov
0c87031e80 Merge remote-tracking branch 'upstream/master' into HEAD 2022-12-13 16:33:21 +00:00
Vitaly Baranov
4f0d1c5e0f Fix copying of query contexts for async backup/restore. 2022-12-12 18:22:14 +01:00
Vitaly Baranov
0207637f6b Use query context instead of the global context in DDLDependencyVisitor. 2022-12-12 18:22:14 +01:00
Vitaly Baranov
76ba8ab3d4 Add new tests. 2022-12-12 18:22:09 +01:00
Vitaly Baranov
b91af1b650 Fix initialization of s3 request settings. 2022-12-10 05:43:51 +01:00
Vitaly Baranov
0ba4870a18 Fix race in S3 multipart upload. 2022-12-09 03:02:39 +01:00
Azat Khuzhin
3d8ea48103 Fix BACKUP TO S3 for Google Cloud Storage (no batch delete support)
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-12-08 20:46:54 +01:00
Vitaly Baranov
e1f7f04752
Referential dependencies for RESTORE (#43834)
* Rename DDLDependencyVisitor -> DDLLoadingDependencyVisitor.

* Move building a loading graph to TablesLoader.

* Implement referential dependencies for tables and use them
when restoring tables from a backup.

* Remove StorageID::operator < (because of its inconsistency with ==).

* Add new tests.

* Fix test.

* Fix memory leak.

Co-authored-by: Nikita Mikhaylov <mikhaylovnikitka@gmail.com>
2022-12-02 15:05:46 +01:00
Kruglov Pavel
8f22c9b013
Merge pull request #43824 from ianton-ru/ORION-1976
Fix multipart upload for large S3 object
2022-12-01 12:26:50 +01:00
Anton Ivashkin
d6ca97c8d0 Fix multipart upload for large S3 object 2022-11-30 11:58:04 +02:00
Alexander Gololobov
8f49c1ea16 Moved helpers to Common/XMLUtils.* 2022-11-28 22:14:18 +01:00
Anton Popov
65a78bcd91 improve performance of storage S3 2022-11-26 15:24:01 +00:00
Alexander Gololobov
6064f83aca Use XMLDocument instead of XMLConfiguration for faster loading 2022-11-24 15:00:08 +01:00
Kseniia Sumarokova
5c90d5aa7e
Merge pull request #43253 from xiedeyantu/fix-s3-glob
fix s3 support question mark wildcard
2022-11-22 14:26:56 +01:00
Sergei Trifonov
d05223e70b
Merge pull request #43335 from ClickHouse/revert-43306-revert-43014-disk-s3-throttler
Revert "Revert "S3 request per second rate throttling""
2022-11-18 16:22:30 +01:00
Vitaly Baranov
a348332eab
Merge pull request #43227 from vitlibar/improve-masking-sensitive-info
Improve masking sensitive info
2022-11-18 15:37:50 +01:00
xiedeyantu
c258d3ac8b fix s3 support question mark wildcard 2022-11-18 12:11:22 +08:00
Vitaly Baranov
050df6ac7f Move InDepthNodeVisitor.h back to src/Interpreters. 2022-11-17 18:16:32 +01:00
Sergei Trifonov
f2f0676bcc
Revert "Revert "S3 request per second rate throttling"" 2022-11-17 17:35:04 +01:00
Alexander Tokmakov
9011a18234
Revert "S3 request per second rate throttling" 2022-11-16 22:33:48 +03:00
Vitaly Baranov
b280b68333 Fix style. 2022-11-16 15:57:50 +01:00
Vitaly Baranov
ce81166c7e Fix style. 2022-11-16 01:35:11 +01:00
Vitaly Baranov
8d72f75556 Make the password wiping an option of IAST::format(). 2022-11-16 01:35:06 +01:00
Kseniia Sumarokova
59cf5def67
Merge branch 'master' into disk-s3-throttler 2022-11-15 12:13:37 +01:00
Vitaly Baranov
8e99f5fea3 Move maskSensitiveInfoInQueryForLogging() to src/Parsers/ 2022-11-14 18:55:19 +01:00
Vitaly Baranov
e18c97faf7 Remove dependencies maskSensitiveInfo() from Context. 2022-11-14 18:55:19 +01:00
serxa
ad377b357f fix backup tests 2022-11-11 13:24:43 +00:00
Vitaly Baranov
ae19af0015 Fix backup of Lazy databases. 2022-11-10 00:27:00 +01:00
serxa
2de26daa56 fix build 2022-11-08 14:31:29 +00:00
Sergei Trifonov
8eedd1e046
Merge branch 'master' into disk-s3-throttler 2022-11-08 15:00:56 +01:00
serxa
6d5d9ff421 rename ReadWriteSettings -> RequestSettings 2022-11-08 13:48:23 +00:00
serxa
2daec0b45e S3 request per second rate throttling + refactoring 2022-11-07 18:05:40 +00:00
Vitaly Baranov
52b1f4aed9
Merge pull request #42484 from vitlibar/mask-sensitive-info-in-logs
Mask some information in logs
2022-11-04 14:09:38 +01:00
Vitaly Baranov
32194c1200 Add max limitation for the size of an uploaded part. 2022-11-02 17:53:54 +01:00
Vitaly Baranov
b9f2f17331 Add test and logging. 2022-11-01 12:23:20 +01:00
Vitaly Baranov
914ab51992 Increase the size of upload part exponentially for backup to S3. 2022-10-31 17:54:41 +01:00
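As a rough illustration of growing the upload part size exponentially, with the upper limit added by the later commit 32194c1200, here is a minimal Python sketch. All parameter names and values (`min_size`, `max_size`, `factor`, `parts_per_step`) are hypothetical; the commit messages do not state the actual growth schedule or setting names.

```python
def part_size_for(part_number: int, min_size: int, max_size: int,
                  factor: int = 2, parts_per_step: int = 500) -> int:
    """Size of the given 1-based part under a capped exponential schedule.

    Hypothetical schedule: the size is multiplied by `factor` after every
    `parts_per_step` parts and never exceeds `max_size`.
    """
    if part_number < 1:
        raise ValueError("part_number is 1-based")
    steps = (part_number - 1) // parts_per_step
    return min(min_size * factor ** steps, max_size)


if __name__ == "__main__":
    for n in (1, 3, 5, 7):
        print(n, part_size_for(n, min_size=16, max_size=100, parts_per_step=2))
```

Growing the part size keeps the number of parts bounded for very large files, while the cap prevents any single part from becoming too large to upload.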
Vitaly Baranov
a30bfada63 Wipe passwords from backup logs too. 2022-10-31 10:50:33 +01:00
Azat Khuzhin
4e76629aaf Fixes for -Wshorten-64-to-32
- lots of static_cast
- add safe_cast
- types adjustments
  - config
  - IStorage::read/watch
  - ...
- some TODO's (to convert types in future)

P.S. That was quite a journey...

v2: fixes after rebase
v3: fix conflicts after #42308 merged
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-10-21 13:25:19 +02:00
Vitaly Baranov
1365105bc4 Implement backup to S3 2022-10-19 00:04:41 +02:00
Vitaly Baranov
69ebf12dab
Merge pull request #42146 from azat/backups/metadata-overflow-fix
Fix reusing of files > 4GB from base backup
2022-10-08 00:22:28 +02:00
Azat Khuzhin
dae8d6b316 Convert backup version from UInt64 to int
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-10-07 14:59:38 +02:00
Azat Khuzhin
94566abda9 Fix reusing of files > 4GB from base backup
Previously, u64 numbers were truncated to u32 numbers when written to the
metadata XML file, so a later incremental backup could not reuse them,
since the file in the base backup is smaller.

P.S. There can be other places, I thought about enabling
-Wshorten-64-to-32, but there are lots of warnings right now.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-10-07 14:56:09 +02:00
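The effect of truncating a 64-bit size to 32 bits can be shown with a small Python sketch. The helper names are hypothetical, and the comparison in `can_reuse` is a simplification that assumes reuse depends on the recorded size matching the actual size; the real check in ClickHouse may involve more than the size.

```python
U32_MASK = 0xFFFFFFFF  # keeps only the low 32 bits


def truncate_to_u32(size: int) -> int:
    """Mimic storing a 64-bit size in a 32-bit field (hypothetical helper)."""
    return size & U32_MASK


def can_reuse(recorded_size: int, actual_size: int) -> bool:
    """Simplified reuse check: sizes must match exactly (assumption)."""
    return recorded_size == actual_size


if __name__ == "__main__":
    actual = 5 * 1024 ** 3  # a 5 GiB file
    recorded = truncate_to_u32(actual)
    print(recorded)                     # size as written to the metadata
    print(can_reuse(recorded, actual))  # whether the file could be reused
```

For a 5 GiB file the truncated value equals 1 GiB, so the size recorded in the base backup no longer matches the real file and the file is not reused. Files below 4 GiB are unaffected, which is why the problem only appears for large files.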
Azat Khuzhin
2c84ad30ba Fix double "file" in "Writing backup for file" message
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-10-07 14:27:11 +02:00
Alfonso Martinez
65b161341c Replaced changed functions for tryLockForShare 2022-09-28 18:08:10 +02:00
Alfonso Martinez
6bb166b79b exception replaced by nullptr 2022-09-28 17:41:51 +02:00
Azat Khuzhin
b698a4ff65 Apply changes to http handlers on fly without server restart
This has been implemented by simply restarting the HTTP servers when the
http_handlers directive in the configuration XML has changed.

But for this I had to change the handlers interface to accept the
configuration separately, since the configuration held by the
server is the one with which the server was started.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
Co-authored-by: Antonio Andelic
2022-09-12 17:34:51 +02:00
Vitaly Baranov
122009a2bd Use table lock if database is ordinary and zero-copy-replication is enabled. 2022-09-08 13:54:59 +02:00
Vitaly Baranov
9c847ceec9 No hardlinks while making backup of MergeTree in atomic database. 2022-09-07 11:44:50 +02:00
Robert Schulze
c7c00f9002
Merge pull request #40739 from ClickHouse/clang-tidy-for-headers
Enable clang-tidy for headers
2022-09-02 07:54:50 +02:00
alesapin
1ae7e82126
Merge pull request #40819 from vitlibar/fix-locking-when-writing-backup
Fix locking while writing backup in multiple threads
2022-09-01 13:20:21 +02:00
Vitaly Baranov
007ae0e6cc Fix incremental backups for Log family. 2022-08-31 12:57:28 +02:00
Robert Schulze
cedf75ed5e
Enable clang-tidy for headers
clang-tidy now also checks code in header files. Because the analyzer
finds tons of issues, activate the check only for directory "base/" (see
file ".clang-tidy"). All other directories, in particular "src/" are
left to future work.

While many findings were fixed, some were not (and suppressed instead).
Reasons for this include: a) the file is 1:1 copypaste of a 3rd-party
lib (e.g. pcg_extras.h) and fixing stuff would make upgrades/fixes more
difficult b) a fix would have broken lots of using code
2022-08-31 10:48:15 +00:00
Vitaly Baranov
77d741dc25 Add comments. 2022-08-30 18:58:13 +02:00
Vitaly Baranov
86872b2307 Fix locking while writing backup in multiple threads. 2022-08-30 18:10:54 +02:00
Vladimir C
ddde5096ef
Merge branch 'master' into vdimir/tmp-file-metrics 2022-08-25 15:23:35 +02:00
vdimir
fbc35f066b
Remove unused ctors of BackupEntryFromImmutableFile 2022-08-24 16:14:07 +00:00
alesapin
669a48e302 Merge branch 'data_source_description' of github.com:ClickHouse/ClickHouse into data_source_description 2022-08-24 17:48:15 +02:00
alesapin
571778ad25
Update src/Backups/BackupIO_Disk.cpp
Co-authored-by: Kseniia Sumarokova <54203879+kssenii@users.noreply.github.com>
2022-08-24 17:45:26 +02:00
alesapin
814bc37f0e Use DiskPtr 2022-08-24 17:45:20 +02:00
alesapin
354f4e90eb Remove redundant lines 2022-08-21 18:21:01 +02:00
alesapin
704d7fdc41 Fix copy and use disks 2022-08-21 18:18:35 +02:00
alesapin
8fd3088459 Commit missed files 2022-08-20 17:21:03 +02:00
alesapin
d8664c3227 Add shortcut for backups 2022-08-19 16:58:30 +02:00
alesapin
7b460b5f85 Small refactoring 2022-08-19 13:31:25 +02:00
Vitaly Baranov
32e40e630e Fix removing "internal" column. 2022-07-27 12:24:21 +02:00
Vitaly Baranov
794eeb5d51 Split "total_size" to "uncompressed_size" and "compressed_size". 2022-07-27 10:36:56 +02:00
Vitaly Baranov
e602e01232 Fix style. 2022-07-27 09:04:10 +02:00
Vitaly Baranov
51a2bf33e8 Rename backup statuses to CREATING_BACKUP, BACKUP_CREATED, BACKUP_FAILED, RESTORING, RESTORED, RESTORE_FAILED. 2022-07-27 09:04:10 +02:00
Vitaly Baranov
1cfe0b10f7 Add columns "total_size" and "num_files" to system.backups 2022-07-27 09:04:10 +02:00
Vitaly Baranov
35c267b3b1 Replace column "status_changed_time" with columns "start_time" and "end_time". 2022-07-27 09:04:10 +02:00
Vitaly Baranov
fc16a15ecf Rename column "uuid" -> "id" in system.backups and allow user to set it in a query. 2022-07-27 09:04:10 +02:00
Vitaly Baranov
131019ba49 Rename column "backup_name" -> "name" in system.backups. 2022-07-27 09:04:10 +02:00
Vitaly Baranov
afd0982187 Remove column "internal" from system.backups 2022-07-27 09:04:10 +02:00
Vitaly Baranov
16a60b5e93
Merge pull request #39455 from vitlibar/fix-locks-add-tests
Improve synchronization between hosts in distributed backup and fix locks
2022-07-27 09:02:58 +02:00
Vitaly Baranov
413024b4f4 Add call ZooKeeper::sync(). 2022-07-26 14:14:01 +02:00
Vitaly Baranov
f0cd564648 Changes after review and added comments. 2022-07-26 11:58:05 +02:00
Vitaly Baranov
c0ec6fd913 Use Poco::Event to simplify code. 2022-07-26 09:53:32 +02:00
Vitaly Baranov
76599d1231 Finally fix locking storages for reading during backup. 2022-07-26 08:58:33 +02:00
Vitaly Baranov
6174fe1d72 Fix tests. 2022-07-22 18:33:46 +02:00
Vitaly Baranov
7795b2cec3 Fix system.backups: now it can show duplicate UUIDs with different flag. 2022-07-21 20:30:26 +02:00
Vitaly Baranov
dc392cd4d3 Improve synchronization between hosts in distributed backup.
Use ephemeral zk nodes to check other hosts for termination.
2022-07-21 11:45:26 +02:00
Nikolai Kochetov
91043351aa Fixing build. 2022-07-20 20:30:16 +00:00
Vitaly Baranov
150e058be9 lockTablesForReading() comes back. 2022-07-20 09:04:18 +02:00
Vitaly Baranov
ce233761d7 Fix making a query scope for async backups. 2022-07-15 13:35:04 +02:00
Vitaly Baranov
2f47be5da7 Check that the destination for a backup is not in use. 2022-07-15 13:34:58 +02:00
Vitaly Baranov
847cda87f9 BACKUP/RESTORE ON CLUSTER use async mode on replicas now. 2022-07-08 22:26:01 +02:00
Vitaly Baranov
5dcc271856 More careful destructors. 2022-07-07 11:16:44 +02:00
Vitaly Baranov
7f84cf3968 Fix style. 2022-07-06 16:36:59 +02:00
Vitaly Baranov
5d7ad46f6a Move files and write comments. 2022-07-06 11:09:31 +02:00
Vitaly Baranov
1ac46c5e48 Fix making backups containing multiple ACL tables. 2022-07-05 20:57:01 +02:00
Vitaly Baranov
f9204315b5 Store columns.txt in backups for the Memory table engine too. 2022-07-05 19:03:20 +02:00
Vitaly Baranov
43d35eec1b Write unfinished mutations to backup. 2022-07-05 14:51:09 +02:00
Vitaly Baranov
92e0ee0b6f More detailed error messages. 2022-07-03 14:20:19 +02:00
Vitaly Baranov
e367d96964 Fix style. 2022-06-30 15:10:33 +02:00
Vitaly Baranov
5456bde4a2 Improve gathering metadata for storing ACL in backups. 2022-06-30 09:46:37 +02:00
Vitaly Baranov
031ca28fdc Add test for partition clause. More checks for data compatibility on restore. 2022-06-30 08:37:18 +02:00
Vitaly Baranov
11b51d2878 Implement storing UDF in backups. 2022-06-30 08:37:17 +02:00
Vitaly Baranov
aa97bf5125 Improve handling predefined databases and tables. 2022-06-30 08:37:17 +02:00
Vitaly Baranov
7689e0c36f Improve gathering metadata for backup - part 6. 2022-06-30 08:37:17 +02:00
Vitaly Baranov
6ca400fd89 Improve gathering metadata for backup - part 5. 2022-06-30 08:37:17 +02:00
Vitaly Baranov
aaf7f66549 Improve gathering metadata for backup - part 4. 2022-06-30 08:37:17 +02:00
Vitaly Baranov
44db346fea Improve gathering metadata for backup - part 3. 2022-06-30 08:37:17 +02:00
Vitaly Baranov
461a31f237 Improve gathering metadata for backup - part 2. 2022-06-30 08:37:17 +02:00
Vitaly Baranov
64b51a3772 Improve gathering metadata for backup. 2022-06-30 08:37:17 +02:00
Vitaly Baranov
18b4413df8
Merge pull request #38299 from vitlibar/backup-improvements-7
Backup improvements 7
2022-06-23 11:37:03 +02:00
Vitaly Baranov
5ae8fce1ef Attach threads to thread groups better. 2022-06-22 18:51:41 +02:00
Robert Schulze
55b39e709d
Merge remote-tracking branch 'origin/master' into clang-tsa 2022-06-20 16:39:32 +02:00
Robert Schulze
5a4f21c50f
Support for Clang Thread Safety Analysis (TSA)
- TSA is a static analyzer built by Google which finds race conditions
  and deadlocks at compile time.

- It works by associating a shared member variable with a
  synchronization primitive that protects it. The compiler can then
  check at each access if proper locking happened before. Good
  introductions are [0] and [1].

- TSA requires some help by the programmer via annotations. Luckily,
  LLVM's libcxx already has annotations for std::mutex, std::lock_guard,
  std::shared_mutex and std::scoped_lock. This commit enables them
  (--> contrib/libcxx-cmake/CMakeLists.txt).

- Further, this commit adds convenience macros for the low-level
  annotations for use in ClickHouse (--> base/defines.h). For
  demonstration, they are leveraged in a few places.

- As we compile with "-Wall -Wextra -Weverything", the required compiler
  flag "-Wthread-safety-analysis" was already enabled. Negative checks
  are an experimental feature of TSA and disabled
  (--> cmake/warnings.cmake). Compile times did not increase noticeably.

- TSA is used in a few places with simple locking. I tried TSA also
  where locking is more complex. The problem was usually that it is
  unclear which data is protected by which lock :-(. But there was
  definitely some weird code where locking looked broken. So there is
  some potential to find bugs.

*** Limitations of TSA besides the ones listed in [1]:

- The programmer needs to know which lock protects which piece of shared
  data. This is not always easy for large classes.

- Two synchronization primitives used in ClickHouse are not annotated in
  libcxx:
  (1) std::unique_lock: A releasable lock handle often used together with
      std::condition_variable, e.g. to solve producer-consumer problems.
  (2) std::recursive_mutex: A re-entrant mutex variant. Its usage can be
      considered a design flaw + typically it is slower than a standard
      mutex. In this commit, one std::recursive_mutex was converted to
      std::mutex and annotated with TSA.

- For free-standing functions (e.g. helper functions) which are passed
  shared data members, it can be tricky to specify the associated lock.
  This is because the annotations use the normal C++ rules for symbol
  resolution.

[0] https://clang.llvm.org/docs/ThreadSafetyAnalysis.html
[1] https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/42958.pdf
2022-06-20 16:13:25 +02:00
Vitaly Baranov
a6fc0dea4e Fix clang-tidy more. 2022-06-20 11:04:37 +02:00
Vitaly Baranov
638ea23399 Fix build. 2022-06-20 03:44:59 +02:00
Vitaly Baranov
8a7c970ce0 Fix style. 2022-06-19 15:58:26 +02:00
Vitaly Baranov
01aaaf7395 More accurate access checking for RESTORE. 2022-06-19 11:26:41 +02:00
Vitaly Baranov
de9a07d18d Fix RESTORE ALL for tables without database in backup. 2022-06-18 14:07:01 +02:00
Vitaly Baranov
36475c5b98 Fix handling empty files in backups. 2022-06-18 12:28:32 +02:00
Vitaly Baranov
a0c558a17e Implement backup/restore for ACL system tables (system.users, system.roles, etc.) 2022-06-17 18:14:31 +02:00
Vitaly Baranov
c2c35fad82 Refactoring of the code getting create table queries for backup. 2022-06-15 20:32:35 +02:00
Vitaly Baranov
c0f06c5e16 Require new privilege 'BACKUP' to make a backup. 2022-06-15 20:32:35 +02:00
Vitaly Baranov
0102626532 Disable the 'BACKUP ALL' command (it's not quite clear what to do with predefined databases). 2022-06-15 20:32:35 +02:00
Vitaly Baranov
cb9bf62e77 Change syntax RESTORE ALL DATABASES => RESTORE ALL 2022-06-15 20:32:35 +02:00
Vitaly Baranov
1198e86295 Fix storing temporary tables and skipping system tables while making a backup. 2022-06-15 20:32:34 +02:00
Vitaly Baranov
6877b8f864 Fix renaming visitor. 2022-06-15 20:32:34 +02:00
Vitaly Baranov
d78a2cda72 Restore tables regarding their dependencies. 2022-06-15 20:32:34 +02:00
Vitaly Baranov
cf34883000 Use QualifiedTableName instead of DatabaseAndTableName. Remove mode 'ALL TEMPORARY TABLES' 2022-06-15 20:32:34 +02:00
Vitaly Baranov
21f3bed435 Simplify path calculations in backup. 2022-06-15 20:32:34 +02:00
Vitaly Baranov
592f568f83 Move backup/restore code to storages and databases - part 2. 2022-06-15 20:32:31 +02:00
Vitaly Baranov
724bc4dc57 Move backup/restore code to storages and databases - part 1. 2022-06-15 20:28:43 +02:00
Vitaly Baranov
ce1836f0d2 Lock tables for share before backup and restore. 2022-06-15 20:28:43 +02:00
Vitaly Baranov
73b1894a21 Rework collecting replicated parts. 2022-06-15 20:28:42 +02:00
Vitaly Baranov
d00b4a7fdb Remove obsolete function IRestoreCoordination::getReplicatedTableDataPath() 2022-06-15 20:28:42 +02:00
Vitaly Baranov
e891eba80e Finalize write buffers used in backups. 2022-06-15 20:26:27 +02:00
Maksim Kita
98a89b50ff Use pdqsort instead of standard sort 2022-06-13 15:31:08 +02:00
Alexander Gololobov
8cc41521ad tidy build fix 2022-05-17 14:35:12 +02:00
Vitaly Baranov
c1baad0763 Fix style. 2022-05-15 14:09:42 +02:00
Vitaly Baranov
ecbbfca698 Fix handling timeouts. 2022-05-14 12:38:19 +02:00
Vitaly Baranov
feb2de8542 Fix access checking for BACKUP and RESTORE. 2022-05-14 10:48:35 +02:00
Vitaly Baranov
dfa1053b9f Use query scopes for async backup/restore. 2022-05-13 10:35:02 +02:00
Vitaly Baranov
23322b0bf6 Add async tests. 2022-05-12 19:42:05 +02:00
Vitaly Baranov
2c92fe21a9 Implement restoring to a bigger or smaller cluster.
Remove backup settings allow_storing_multiple_replicas: now it's always allowed.
2022-05-12 14:55:06 +02:00
Vitaly Baranov
30005a1eff BACKUP ON CLUSTER correctly collects data of a replicated table from all replicas now,
and if some part doesn't exist on either replica it's no problem anymore.
2022-05-12 13:33:42 +02:00
Vitaly Baranov
41a41d3a31 Set max_free_threads=0 for BackupsWorker. 2022-05-08 10:43:12 +02:00
Vitaly Baranov
1b2eb4fe27 Use more clear syntax for BACKUP/RESTORE. 2022-05-08 10:37:02 +02:00
Vitaly Baranov
160bc288d3 Fix implementation of totalBytes() & totalRows() for Log family. 2022-05-04 00:15:21 +02:00
Vitaly Baranov
202dd864ed Fix compilation. 2022-05-03 18:34:29 +02:00
Vitaly Baranov
484c2c9c4a Use SYSTEM SYNC DATABASE REPLICA to make code better. 2022-05-03 16:59:41 +02:00
Vitaly Baranov
828f45f078 Add new restore setting 'allow_non_empty_tables'. 2022-05-03 16:18:45 +02:00
Vitaly Baranov
cb9d867f5f Fix restore coordination for creating tables in replicated databases. 2022-05-03 11:03:16 +02:00
Vitaly Baranov
5257ce31f8 Improved using ThreadPool for making backup or restoring, changed columns in system.backups. 2022-05-03 11:03:13 +02:00
Vitaly Baranov
409edfd3fa Rework RestoreCoordination: make restore deterministic. 2022-05-03 11:01:44 +02:00
Vitaly Baranov
2c754f44fc Make calculation of shard_num & replica_num not dependent on match of the cluster's definitions on nodes. 2022-05-03 11:01:44 +02:00
Vitaly Baranov
b1295311c9 Fix crash when BACKUP & RESTORE are called without ON CLUSTER for replicated DB. 2022-05-03 11:01:44 +02:00
Vitaly Baranov
2a645bb187 Fix sending 'create_table' and 'create_database' restore settings to cluster. 2022-05-03 11:01:44 +02:00
Vitaly Baranov
bddec55d35 Added ASTBackupQuery::setDatabase(). 2022-05-03 11:01:44 +02:00
Amos Bird
9d30be2c59
Fix build again 2022-04-29 14:45:27 +08:00
Vitaly Baranov
a8e924caf6 Make BACKUP & RESTORE synchronous by default. 2022-04-26 18:45:39 +02:00
Vitaly Baranov
eb1917f9de Use sequential nodes for counters. Add comments. 2022-04-26 18:45:35 +02:00
Vitaly Baranov
a89ef54c69 Fix tests and compilation. 2022-04-26 13:32:23 +02:00
Vitaly Baranov
16f8c71eb4 Disable usage of archives with backups on clusters. 2022-04-26 10:13:49 +02:00
Vitaly Baranov
1c0b731ea6 Fix compilation. 2022-04-26 10:13:08 +02:00
Vitaly Baranov
78bcb96098 Rename backup & restore setting 'async' -> 'sync', and make backup & restore async by default. 2022-04-26 09:51:19 +02:00
Vitaly Baranov
000b184691 Fix style & compilation. 2022-04-25 23:05:35 +02:00