Commit Graph

569 Commits

Author SHA1 Message Date
Anton Popov
d1fa2148de
Merge branch 'master' into dynamic-columns-14 2022-09-09 19:32:07 +02:00
Alexander Tokmakov
48927ba0ac
Merge branch 'master' into no-hardlinks-while-making-backup-of-mergetree-in-atomic-db 2022-09-09 14:24:44 +03:00
Anton Popov
ba41239ecd Merge remote-tracking branch 'upstream/master' into HEAD 2022-09-08 15:20:29 +00:00
Alexander Tokmakov
507ad0c4d9
Merge pull request #40691 from azat/mergetree/transaction-wal-fix
Move committing InMemory parts to WAL out of NOEXCEPT_SCOPE()
2022-09-08 15:55:30 +03:00
Vitaly Baranov
122009a2bd Use table lock if database is ordinary and zero-copy-replication is enabled. 2022-09-08 13:54:59 +02:00
Vitaly Baranov
9c847ceec9 No hardlinks while making backup of MergeTree in atomic database. 2022-09-07 11:44:50 +02:00
Anton Popov
f0a404e2c8 Merge remote-tracking branch 'upstream/master' into HEAD 2022-09-06 15:51:16 +00:00
Kseniia Sumarokova
3558361a05
Merge branch 'master' into refactor-merge-tree-read 2022-09-06 16:00:43 +02:00
Alexey Milovidov
940a53e519
Merge pull request #40984 from Lucky-Chang/typo_fix
Fix some typos and clang-tidy warnings
2022-09-06 02:37:50 +03:00
kssenii
83514fa2ef Refactor 2022-09-05 20:08:22 +02:00
ianton-ru
39e1fc7a0f
Merge branch 'master' into fix-ordinary-system-unfreeze 2022-09-05 17:10:59 +03:00
Luck-Chang
1ac8e739c9 fix some typos and clang-tidy warnings 2022-09-05 09:50:24 +08:00
Azat Khuzhin
2e85f9f0ad Remove completely processed WAL files
Previously all WAL files had been stored, though with the time of use
this can take too much space on disk, and also the startup time will be
increased.

But it is pretty easy to prune old WAL files (the one parts from which
had been completely written to disk already).

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-09-04 14:18:14 +02:00
Azat Khuzhin
3c478da003 Move committing InMemory parts to WAL out of NOEXCEPT_SCOPE()
Since this commit can definitelly throw (i.e. due to ENOSPC).

Note, that it should be safe, since rollback() will call dropPart() for
those parts.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-08-26 23:37:59 +02:00
alesapin
4398a357c8
Merge pull request #40151 from ClickHouse/fix_cannot_quickly_remove_directory
Fix "Cannot quickly remove directory"
2022-08-17 12:11:18 +02:00
Alexander Tokmakov
e691888267 fix build 2022-08-12 13:03:57 +02:00
Alexander Tokmakov
fe572104aa fix old tmp dirs cleanup 2022-08-09 18:44:51 +02:00
Alexander Gololobov
a64aa00869
Merge pull request #37893 from zhangjmruc/feature/sql-standard-delete
Support SQL standard "delete from ... where ..." syntax and lightweight implementation on merge tree tables
2022-07-26 23:39:07 +02:00
Kruglov Pavel
c683cb252f
Merge pull request #39227 from amosbird/rename-log1
Rename log when rename merge tree tables
2022-07-26 17:12:44 +02:00
Alexander Gololobov
25deba2c1b
Merge branch 'master' into feature/sql-standard-delete 2022-07-25 22:13:20 +02:00
Alexander Gololobov
460950ecdc
Merge branch 'master' into feature/sql-standard-delete 2022-07-24 21:27:22 +02:00
Alexander Gololobov
c8b3c574a4 Disable lightweight delete if table has projections 2022-07-24 12:21:47 +02:00
mergify[bot]
c01ff2d38a
Merge branch 'master' into improve_replicated_merge_logging 2022-07-23 19:10:10 +00:00
Nikolai Kochetov
91043351aa Fixing build. 2022-07-20 20:30:16 +00:00
Alexander Tokmakov
d82f378a9d do not enqueue uneeded parts for check 2022-07-18 23:37:07 +02:00
Raúl Marín
aea045f297 Improve logging around replicated merges 2022-07-15 15:48:35 +02:00
jianmei zhang
b4a37e1e22 Disable optimizations for count() when lightweight delete exists, add hasLightweightDelete() function in IMergeTreeDataPart 2022-07-15 12:32:41 +08:00
Amos Bird
96bb6e0cd2
Rename log when rename merge tree tables 2022-07-14 21:22:46 +08:00
Vitaly Baranov
ed27987646
Merge pull request #38861 from vitlibar/backup-improvements-9
Backup Improvements 9
2022-07-07 02:24:47 +02:00
Vitaly Baranov
7f84cf3968 Fix style. 2022-07-06 16:36:59 +02:00
Vitaly Baranov
43d35eec1b Write unfinished mutations to backup. 2022-07-05 14:51:09 +02:00
alesapin
b2db49dbf1
Merge branch 'master' into better_data_part_storage_builder 2022-07-02 15:39:44 +02:00
alesapin
eb5046ab26 Simplify everything 2022-06-30 22:51:27 +02:00
alesapin
cb90aca2ef Some changes in merge tree 2022-06-30 14:12:45 +02:00
Vitaly Baranov
031ca28fdc Add test for partition clause. More checks for data compatibility on restore. 2022-06-30 08:37:18 +02:00
alesapin
0a3fab1cb6 Some sad changes 2022-06-28 12:51:49 +02:00
alesapin
d88cdcd5c1 AAA 2022-06-27 21:41:29 +02:00
Vadim Volodin
3296ba2532 Fix SYSTEM UNFREEZE for ordinary database 2022-06-24 20:46:16 +03:00
alesapin
d963f262f8 Fix style 2022-06-24 17:43:18 +02:00
alesapin
f685cf2268 Fix comment 2022-06-24 17:33:43 +02:00
alesapin
011d58d7a0 Simplify more 2022-06-24 17:19:59 +02:00
alesapin
612c4571d5 Split method into smaller 2022-06-24 15:41:09 +02:00
alesapin
7517e1f4d5 Remove some complexity 2022-06-24 15:24:02 +02:00
alesapin
37310dc9df Simpler 2022-06-24 14:10:15 +02:00
alesapin
36b810b076 Remove unused parameter 2022-06-24 13:42:36 +02:00
alesapin
af1a9d18ab Remove transaction argument 2022-06-24 13:34:00 +02:00
alesapin
9910395823 Simplify method signature 2022-06-24 13:19:29 +02:00
Nikolai Kochetov
cc6fdfe0eb
Merge pull request #36555 from ClickHouse/refactor-something-in-part-volumes
Separate data storage abstraction for MergeTree
2022-06-22 11:13:36 +02:00
Robert Schulze
0d80874d40
Merge pull request #38068 from ClickHouse/clang-tsa
Support for Clang Thread Safety Analysis (TSA)
2022-06-21 20:19:33 +02:00
Nikolai Kochetov
1e8c9ecd4c Merge branch 'master' into refactor-something-in-part-volumes 2022-06-21 12:37:21 +02:00
mergify[bot]
9bdd9e14a6
Merge branch 'master' into fix_flaky_tests_with_transactions 2022-06-20 18:11:30 +00:00
Robert Schulze
55b39e709d
Merge remote-tracking branch 'origin/master' into clang-tsa 2022-06-20 16:39:32 +02:00
Robert Schulze
5a4f21c50f
Support for Clang Thread Safety Analysis (TSA)
- TSA is a static analyzer build by Google which finds race conditions
  and deadlocks at compile time.

- It works by associating a shared member variable with a
  synchronization primitive that protects it. The compiler can then
  check at each access if proper locking happened before. A good
  introduction are [0] and [1].

- TSA requires some help by the programmer via annotations. Luckily,
  LLVM's libcxx already has annotations for std::mutex, std::lock_guard,
  std::shared_mutex and std::scoped_lock. This commit enables them
  (--> contrib/libcxx-cmake/CMakeLists.txt).

- Further, this commit adds convenience macros for the low-level
  annotations for use in ClickHouse (--> base/defines.h). For
  demonstration, they are leveraged in a few places.

- As we compile with "-Wall -Wextra -Weverything", the required compiler
  flag "-Wthread-safety-analysis" was already enabled. Negative checks
  are an experimental feature of TSA and disabled
  (--> cmake/warnings.cmake). Compile times did not increase noticeably.

- TSA is used in a few places with simple locking. I tried TSA also
  where locking is more complex. The problem was usually that it is
  unclear which data is protected by which lock :-(. But there was
  definitely some weird code where locking looked broken. So there is
  some potential to find bugs.

*** Limitations of TSA besides the ones listed in [1]:

- The programmer needs to know which lock protects which piece of shared
  data. This is not always easy for large classes.

- Two synchronization primitives used in ClickHouse are not annotated in
  libcxx:
  (1) std::unique_lock: A releaseable lock handle often together with
      std::condition_variable, e.g. in solve producer-consumer problems.
  (2) std::recursive_mutex: A re-entrant mutex variant. Its usage can be
      considered a design flaw + typically it is slower than a standard
      mutex. In this commit, one std::recursive_mutex was converted to
      std::mutex and annotated with TSA.

- For free-standing functions (e.g. helper functions) which are passed
  shared data members, it can be tricky to specify the associated lock.
  This is because the annotations use the normal C++ rules for symbol
  resolution.

[0] https://clang.llvm.org/docs/ThreadSafetyAnalysis.html
[1] https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/42958.pdf
2022-06-20 16:13:25 +02:00
Nikolai Kochetov
7452d04e3a Merge branch 'master' into refactor-something-in-part-volumes 2022-06-20 15:31:02 +02:00
alesapin
50801e41c5 Merge branch 'master' into refactor-something-in-part-volumes 2022-06-19 14:05:46 +02:00
Alexander Tokmakov
83adf56383 fix race 2022-06-17 18:13:57 +02:00
Alexander Tokmakov
39c0219c11 fixes 2022-06-16 19:41:32 +02:00
Vitaly Baranov
21f3bed435 Simplify path calculations in backup. 2022-06-15 20:32:34 +02:00
Vitaly Baranov
592f568f83 Move backup/restore code to storages and databases - part 2. 2022-06-15 20:32:31 +02:00
Vitaly Baranov
724bc4dc57 Move backup/restore code to storages and databases - part 1. 2022-06-15 20:28:43 +02:00
Vitaly Baranov
73b1894a21 Rework collecting replicated parts. 2022-06-15 20:28:42 +02:00
Vitaly Baranov
5cabdbd982 Restore parts of MergeTree in correct order. 2022-06-15 20:28:40 +02:00
Alexander Tokmakov
2ac72319bd
Merge pull request #37185 from amosbird/projection-fix-three
Fix possible heap-use-after-free error when reading system.projection_parts and system.projection_parts_columns
2022-06-15 19:00:10 +03:00
Nikolai Kochetov
2a9a63ac7b Merge branch 'master' into refactor-something-in-part-volumes 2022-06-15 15:35:26 +02:00
alesapin
af1cd745e1
Merge pull request #37975 from kssenii/clean-up-broken-detached
Clean up broken detached parts after timeout
2022-06-14 20:53:31 +02:00
Amos Bird
d5a7a5be8e
Fix use-after-free in system.projection_parts 2022-06-14 23:41:42 +08:00
kssenii
7a2676c7ab Clean up broken detached parts with timeout 2022-06-10 12:27:57 +02:00
Vadim Volodin
637d293fbd Add SYSTEM UNFREEZE query 2022-06-08 15:21:14 +03:00
Nikolai Kochetov
5bc9b32025 Merge branch 'master' into refactor-something-in-part-volumes 2022-06-08 11:10:06 +00:00
alesapin
e32d36d790 Proper fix 2022-06-07 17:58:32 +02:00
Nikolai Kochetov
678d978acf Merge branch 'master' into refactor-something-in-part-volumes 2022-06-07 15:23:00 +00:00
Anton Popov
ef6f5a6500
Merge pull request #37570 from azat/column-ttl-expired-fix
Do not write expired columns by TTL after subsequent merges
2022-06-07 13:05:03 +02:00
Nikolai Kochetov
89c5855d20 Merge branch 'master' into refactor-something-in-part-volumes 2022-06-02 12:19:07 +02:00
Azat Khuzhin
0de1a64436 Log empty parts in IMergedBlockOutputStream::removeEmptyColumnsFromPart()
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-05-26 20:14:09 +03:00
Anton Popov
16e839ac71 add profile events for introspection of part types 2022-05-25 14:54:49 +00:00
Anton Popov
aec30c4076
Merge pull request #37053 from CurtizJ/remove-streams-comments
Remove last mentions of data streams
2022-05-10 13:38:13 +02:00
Anton Popov
e911900054 remove last mentions of data streams 2022-05-09 19:15:24 +00:00
alesapin
28e492bc17 Followup fix 2022-05-09 15:21:33 +02:00
alesapin
018ed10684 Add test 2022-05-09 15:21:21 +02:00
alesapin
46712f1d98 Fix forgotten parts in cleanup thread 2022-05-08 00:53:55 +02:00
Anton Popov
515f68eead Merge remote-tracking branch 'upstream/master' into dynamic-columns-14 2022-05-06 16:10:51 +00:00
Anton Popov
566c08086a support Object type inside other types 2022-05-06 14:44:00 +00:00
Alexander Tokmakov
af9b4a5b9c fix intersecting parts 2022-05-04 16:22:06 +02:00
Nikolai Kochetov
35095191eb Merge branch 'master' into refactor-something-in-part-volumes 2022-05-03 17:51:47 +02:00
Nikolai Kochetov
e44af67fee Merge branch 'master' into refactor-something-in-part-volumes 2022-04-26 21:08:00 +02:00
Memo
c38a4b4255
Update src/Storages/MergeTree/MergeTreeData.h
Co-authored-by: alesapin <alesapin@gmail.com>
2022-04-26 10:26:18 +08:00
mergify[bot]
a5aab53b70
Merge branch 'master' into add_query_level_settingss 2022-04-25 21:41:36 +00:00
Nikolai Kochetov
8c00692844 Part 8 2022-04-22 16:58:09 +00:00
alesapin
5465415751 Fix replace/move partition with zero copy replication 2022-04-21 14:39:12 +02:00
alesapin
c14e2e0b96 Fix more 2022-04-20 21:08:26 +02:00
alesapin
40c15222f8 Merge branch 'master' into fix_trash 2022-04-20 12:45:49 +02:00
Nikolai Kochetov
bcbab2ead8 Part 6. 2022-04-19 19:34:41 +00:00
alesapin
7cb7c120cc Less ugly 2022-04-19 15:53:10 +02:00
alesapin
bd7b3847c1 Some code 2022-04-19 01:09:09 +02:00
Memo
b3adf150b5 add_query_level_settings 2022-04-18 12:15:41 +08:00
alesapin
1706ae9e15 Some trash implementation 2022-04-15 18:36:23 +02:00
Anton Popov
305dd57262
Merge branch 'master' into fix_storage_distributed_ttl 2022-04-14 14:51:15 +02:00
Alexander Tokmakov
66fdf35dfd remove outdated parts immediately on drop partition 2022-04-13 18:01:22 +02:00
Nikolai Kochetov
76870ad92a Part 5 2022-04-12 18:59:49 +00:00
Alexander Tokmakov
457a9e9691 fixes for ReplicatedMergeTree 2022-04-12 14:14:26 +02:00
Alexander Tokmakov
8290ffa88d Merge branch 'master' into mvcc_prototype 2022-04-07 13:50:42 +02:00
Nikolai Kochetov
5a1392a8e3 Try refactor something (1) 2022-04-05 19:12:48 +00:00
Amos Bird
35a8bb2a9b
add comment 2022-04-05 15:56:38 +08:00
Amos Bird
163664fad7
Improve minmax_count_projection 2022-04-05 15:56:37 +08:00
Alexander Tokmakov
287d858fda Merge branch 'master' into mvcc_prototype 2022-03-29 16:24:12 +02:00
taiyang-li
8dbf1c60e7 merge master and fix conflict 2022-03-23 11:36:50 +08:00
Alexander Tokmakov
3c762f566d Merge branch 'master' into mvcc_prototype 2022-03-21 20:16:29 +01:00
Vitaly Baranov
ce25afb2e9 Storages and databases are hollow by default now. 2022-03-20 20:02:15 +01:00
Vitaly Baranov
7f89b98308 Rework BackupSettings and RestoreSettings a little, pass StorageRestoreSettings to storages. 2022-03-20 20:02:15 +01:00
Vitaly Baranov
258a472001 Shorter names: rename IRestoreFromBackupTask -> IRestoreTask. 2022-03-20 20:02:15 +01:00
Vitaly Baranov
7a63feb3f7 Make restore tasks explicit. 2022-03-20 20:01:31 +01:00
Alexander Tokmakov
07d952b728 use snapshots for semistructured data, durability fixes 2022-03-17 18:26:18 +01:00
Alexander Tokmakov
d04dc03fa4 Merge branch 'master' into mvcc_prototype 2022-03-17 15:24:32 +01:00
Nikolai Kochetov
ee9c2ec735
Merge pull request #34780 from azat/mt-delayed-part-flush
Do not delay final part writing by default (fixes possible Memory limit exceeded during INSERT)
2022-03-17 12:30:51 +01:00
Anton Popov
0ba78c3c3a Merge remote-tracking branch 'upstream/master' into HEAD 2022-03-16 15:28:09 +00:00
Alexander Tokmakov
9702b5177d Merge branch 'master' into mvcc_prototype 2022-03-14 21:45:38 +01:00
Alexander Tokmakov
278d779a01 log cleanup, more comments 2022-03-14 21:43:34 +01:00
Maksim Kita
2fdcf53a76 Fix clang-tidy warnings in Server, Storages folders 2022-03-14 18:17:35 +00:00
Anton Popov
063917786e minor fixes 2022-03-14 17:29:18 +00:00
Anton Popov
36ec379aeb Merge remote-tracking branch 'upstream/master' into HEAD 2022-03-14 16:28:35 +00:00
Alexander Tokmakov
061fa6a6f2 Merge branch 'master' into mvcc_prototype 2022-03-10 13:13:04 +01:00
Azat Khuzhin
3a5a39a9df Do not delay final part writing by default
For async s3 writes final part flushing was defered until all the INSERT
block was processed, however in case of too many partitions/columns you
may exceed max_memory_usage limit (since each stream has overhead).

Introduce max_insert_delayed_streams_for_parallel_writes (with default
to 1000 for S3, 0 otherwise), to avoid this.

This should "Memory limit exceeded" errors in performance tests.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-03-08 22:17:36 +03:00
Azat Khuzhin
caffc144b5 Fix possible "Part directory doesn't exist" during INSERT
In #33291 final part commit had been defered, and now it can take
significantly more time, that may lead to "Part directory doesn't exist"
error during INSERT:

    2022.02.21 18:18:06.979881 [ 11329 ] {insert} <Debug> executeQuery: (from 127.1:24572, user: default) INSERT INTO db.table (...) VALUES
    2022.02.21 20:58:03.933593 [ 11329 ] {insert} <Trace> db.table: Renaming temporary part tmp_insert_20220214_18044_18044_0 to 20220214_270654_270654_0.
    2022.02.21 21:16:50.961917 [ 11329 ] {insert} <Trace> db.table: Renaming temporary part tmp_insert_20220214_18197_18197_0 to 20220214_270689_270689_0.
    ...
    2022.02.22 21:16:57.632221 [ 64878 ] {} <Warning> db.table: Removing temporary directory /clickhouse/data/db/table/tmp_insert_20220214_18232_18232_0/
    ...
    2022.02.23 12:23:56.277480 [ 11329 ] {insert} <Trace> db.table: Renaming temporary part tmp_insert_20220214_18232_18232_0 to 20220214_273459_273459_0.
    2022.02.23 12:23:56.299218 [ 11329 ] {insert} <Error> executeQuery: Code: 107. DB::Exception: Part directory /clickhouse/data/db/table/tmp_insert_20220214_18232_18232_0/ doesn't exist. Most likely it is a logical error. (FILE_DOESNT_EXIST) (version 22.2.1.1) (from 127.1:24572) (in query: INSERT INTO db.table (...) VALUES), Stack trace (when copying this message, always include the lines below):

Follow-up for: #28760
Refs: #33291

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-03-08 07:44:11 +03:00
taiyang-li
b4174b0bef merge master and fix conflicts 2022-03-08 11:39:25 +08:00
Anton Popov
2758db5341 add more comments 2022-03-01 19:32:55 +03:00
Anton Popov
fcdebea925 Merge remote-tracking branch 'upstream/master' into HEAD 2022-02-25 13:41:30 +03:00
Alexander Tokmakov
dae044f86b Merge branch 'master' into mvcc_prototype 2022-02-17 13:49:37 +03:00
Anton Popov
a661eaf39f better performance of getting storage snapshot 2022-02-16 02:17:22 +03:00
Alexander Tokmakov
ae5aa8c12d write part version before other files 2022-02-15 02:24:51 +03:00
alesapin
b2886a429b Fix lock during fetch 2022-02-14 12:20:27 +03:00
Alexander Tokmakov
07e66e690d Merge branch 'master' into mvcc_prototype 2022-02-11 15:53:32 +03:00
Anton Popov
f012871a7c better caching of common types of object columns 2022-02-11 01:20:30 +03:00
alesapin
3af06b23f8 POC 2022-02-10 22:45:52 +03:00
Anton Popov
dcd7312d75 cache common type on objects in MergeTree 2022-02-09 23:47:53 +03:00
Anton Popov
18940b8637 Merge remote-tracking branch 'upstream/master' into HEAD 2022-02-09 23:38:38 +03:00
taiyang-li
d04ccc0489 Merge branch 'master' into rocksdb_metacache 2022-02-09 11:54:10 +08:00
taiyang-li
b6132d490f merge master and solve conflict 2022-02-08 15:24:59 +08:00
Amos Bird
3fab7af541
Bug fix and improvement of minmax_count_projection 2022-02-06 16:46:11 +08:00
Alexander Tokmakov
fe30e0f162 fixes 2022-02-03 21:57:09 +03:00
Anton Popov
836a348a9c Merge remote-tracking branch 'upstream/master' into HEAD 2022-02-01 15:23:07 +03:00
Alexander Tokmakov
2e4ae37d98 Merge branch 'master' into mvcc_prototype 2022-02-01 13:20:03 +03:00
Amos Bird
ec7d367814
DiskLocal checker
Add DiskLocal checker so that ReplicatedMergeTree can recover data when some of its disks are broken.
2022-02-01 05:55:27 +08:00
Anton Popov
78b9f15abb Merge remote-tracking branch 'upstream/master' into HEAD 2022-01-30 03:24:37 +03:00
李扬
6d50d36405
Merge branch 'master' into rocksdb_metacache 2022-01-28 22:00:31 -06:00
Alexander Tokmakov
fb9b2d5326 Merge branch 'master' into mvcc_prototype 2022-01-28 21:18:36 +03:00
Alexander Tokmakov
e0304c2a58 review fixes, write tid into mutation entry 2022-01-28 20:47:37 +03:00
alexey-milovidov
f6684dbc62
Merge pull request #32304 from devcrafter/deduplication_token_7461
insert_deduplication_token setting for INSERT statement
2022-01-28 13:03:55 +03:00
zhongyuankai
a6254516e0 Fix Alter ttl modification unsupported table engine 2022-01-24 21:48:52 +08:00
taiyang-li
73def8b483 merge master and solve conflict 2022-01-24 11:01:43 +08:00
Anton Popov
e8ce091e68 Merge remote-tracking branch 'upstream/master' into HEAD 2022-01-21 20:11:18 +03:00