Commit Graph

473 Commits

Author SHA1 Message Date
alesapin
36b810b076 Remove unused parameter 2022-06-24 13:42:36 +02:00
alesapin
af1a9d18ab Remove transaction argument 2022-06-24 13:34:00 +02:00
alesapin
9910395823 Simplify method signature 2022-06-24 13:19:29 +02:00
Nikolai Kochetov
cc6fdfe0eb
Merge pull request #36555 from ClickHouse/refactor-something-in-part-volumes
Separate data storage abstraction for MergeTree
2022-06-22 11:13:36 +02:00
Robert Schulze
0d80874d40
Merge pull request #38068 from ClickHouse/clang-tsa
Support for Clang Thread Safety Analysis (TSA)
2022-06-21 20:19:33 +02:00
Nikolai Kochetov
1e8c9ecd4c Merge branch 'master' into refactor-something-in-part-volumes 2022-06-21 12:37:21 +02:00
mergify[bot]
9bdd9e14a6
Merge branch 'master' into fix_flaky_tests_with_transactions 2022-06-20 18:11:30 +00:00
Robert Schulze
55b39e709d
Merge remote-tracking branch 'origin/master' into clang-tsa 2022-06-20 16:39:32 +02:00
Robert Schulze
5a4f21c50f
Support for Clang Thread Safety Analysis (TSA)
- TSA is a static analyzer built by Google which finds race conditions
  and deadlocks at compile time.

- It works by associating a shared member variable with a
  synchronization primitive that protects it. The compiler can then
  check at each access whether the proper lock is held. Good
  introductions are [0] and [1].

- TSA requires some help from the programmer via annotations. Luckily,
  LLVM's libcxx already has annotations for std::mutex, std::lock_guard,
  std::shared_mutex and std::scoped_lock. This commit enables them
  (--> contrib/libcxx-cmake/CMakeLists.txt).

- Further, this commit adds convenience macros for the low-level
  annotations for use in ClickHouse (--> base/defines.h). For
  demonstration, they are leveraged in a few places.

- As we compile with "-Wall -Wextra -Weverything", the required compiler
  flag "-Wthread-safety-analysis" was already enabled. Negative checks
  are an experimental feature of TSA and disabled
  (--> cmake/warnings.cmake). Compile times did not increase noticeably.

- TSA is used in a few places with simple locking. I tried TSA also
  where locking is more complex. The problem was usually that it is
  unclear which data is protected by which lock :-(. But there was
  definitely some weird code where locking looked broken. So there is
  some potential to find bugs.

*** Limitations of TSA besides the ones listed in [1]:

- The programmer needs to know which lock protects which piece of shared
  data. This is not always easy for large classes.

- Two synchronization primitives used in ClickHouse are not annotated in
  libcxx:
  (1) std::unique_lock: A releasable lock handle, often used together
      with std::condition_variable, e.g. to solve producer-consumer
      problems.
  (2) std::recursive_mutex: A re-entrant mutex variant. Its usage can be
      considered a design flaw, and it is typically slower than a
      standard mutex. In this commit, one std::recursive_mutex was
      converted to std::mutex and annotated with TSA.

- For free-standing functions (e.g. helper functions) which are passed
  shared data members, it can be tricky to specify the associated lock.
  This is because the annotations use the normal C++ rules for symbol
  resolution.

[0] https://clang.llvm.org/docs/ThreadSafetyAnalysis.html
[1] https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/42958.pdf
2022-06-20 16:13:25 +02:00
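
The association between a shared member and its lock, as described in commit 5a4f21c50f above, can be sketched as follows. This is a minimal illustration, not code from the commit: the macro names `TSA_GUARDED_BY` and `TSA_REQUIRES` and the `Counter` class are assumptions made for this example and are not quoted from base/defines.h. The attributes only produce diagnostics when compiling with Clang and `-Wthread-safety` against a standard library whose `std::mutex` is annotated; on other compilers the macros expand to nothing.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical convenience macros (names assumed for illustration).
// They wrap Clang's thread safety attributes and vanish elsewhere.
#if defined(__clang__)
#    define TSA_GUARDED_BY(...) __attribute__((guarded_by(__VA_ARGS__)))
#    define TSA_REQUIRES(...) __attribute__((requires_capability(__VA_ARGS__)))
#else
#    define TSA_GUARDED_BY(...)
#    define TSA_REQUIRES(...)
#endif

class Counter
{
public:
    void increment()
    {
        std::lock_guard<std::mutex> lock(mutex);
        addLocked(1); // allowed: the lock is held here
    }

    int get() const
    {
        std::lock_guard<std::mutex> lock(mutex);
        return value;
    }

    // An accessor that returned `value` without taking `mutex` is the kind
    // of access the analysis is designed to flag at compile time.

private:
    // Callers must already hold `mutex`.
    void addLocked(int delta) TSA_REQUIRES(mutex) { value += delta; }

    mutable std::mutex mutex;
    int value TSA_GUARDED_BY(mutex) = 0; // `value` is protected by `mutex`
};

// Small usage helper: increments twice and returns the observed value.
inline int incrementTwice()
{
    Counter counter;
    counter.increment();
    counter.increment();
    int result = counter.get();
    assert(result == 2);
    return result;
}
```

The sketch also hints at the first limitation listed in the message: the annotation is only as good as the programmer's knowledge of which lock protects which member.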
Nikolai Kochetov
7452d04e3a Merge branch 'master' into refactor-something-in-part-volumes 2022-06-20 15:31:02 +02:00
alesapin
50801e41c5 Merge branch 'master' into refactor-something-in-part-volumes 2022-06-19 14:05:46 +02:00
Alexander Tokmakov
83adf56383 fix race 2022-06-17 18:13:57 +02:00
Alexander Tokmakov
39c0219c11 fixes 2022-06-16 19:41:32 +02:00
Vitaly Baranov
21f3bed435 Simplify path calculations in backup. 2022-06-15 20:32:34 +02:00
Vitaly Baranov
592f568f83 Move backup/restore code to storages and databases - part 2. 2022-06-15 20:32:31 +02:00
Vitaly Baranov
724bc4dc57 Move backup/restore code to storages and databases - part 1. 2022-06-15 20:28:43 +02:00
Vitaly Baranov
73b1894a21 Rework collecting replicated parts. 2022-06-15 20:28:42 +02:00
Vitaly Baranov
5cabdbd982 Restore parts of MergeTree in correct order. 2022-06-15 20:28:40 +02:00
Alexander Tokmakov
2ac72319bd
Merge pull request #37185 from amosbird/projection-fix-three
Fix possible heap-use-after-free error when reading system.projection_parts and system.projection_parts_columns
2022-06-15 19:00:10 +03:00
Nikolai Kochetov
2a9a63ac7b Merge branch 'master' into refactor-something-in-part-volumes 2022-06-15 15:35:26 +02:00
alesapin
af1cd745e1
Merge pull request #37975 from kssenii/clean-up-broken-detached
Clean up broken detached parts after timeout
2022-06-14 20:53:31 +02:00
Amos Bird
d5a7a5be8e
Fix use-after-free in system.projection_parts 2022-06-14 23:41:42 +08:00
kssenii
7a2676c7ab Clean up broken detached parts with timeout 2022-06-10 12:27:57 +02:00
Vadim Volodin
637d293fbd Add SYSTEM UNFREEZE query 2022-06-08 15:21:14 +03:00
Nikolai Kochetov
5bc9b32025 Merge branch 'master' into refactor-something-in-part-volumes 2022-06-08 11:10:06 +00:00
alesapin
e32d36d790 Proper fix 2022-06-07 17:58:32 +02:00
Nikolai Kochetov
678d978acf Merge branch 'master' into refactor-something-in-part-volumes 2022-06-07 15:23:00 +00:00
Anton Popov
ef6f5a6500
Merge pull request #37570 from azat/column-ttl-expired-fix
Do not write expired columns by TTL after subsequent merges
2022-06-07 13:05:03 +02:00
Nikolai Kochetov
89c5855d20 Merge branch 'master' into refactor-something-in-part-volumes 2022-06-02 12:19:07 +02:00
Azat Khuzhin
0de1a64436 Log empty parts in IMergedBlockOutputStream::removeEmptyColumnsFromPart()
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-05-26 20:14:09 +03:00
Anton Popov
16e839ac71 add profile events for introspection of part types 2022-05-25 14:54:49 +00:00
Anton Popov
aec30c4076
Merge pull request #37053 from CurtizJ/remove-streams-comments
Remove last mentions of data streams
2022-05-10 13:38:13 +02:00
Anton Popov
e911900054 remove last mentions of data streams 2022-05-09 19:15:24 +00:00
alesapin
28e492bc17 Followup fix 2022-05-09 15:21:33 +02:00
alesapin
018ed10684 Add test 2022-05-09 15:21:21 +02:00
alesapin
46712f1d98 Fix forgotten parts in cleanup thread 2022-05-08 00:53:55 +02:00
Alexander Tokmakov
af9b4a5b9c fix intersecting parts 2022-05-04 16:22:06 +02:00
Nikolai Kochetov
35095191eb Merge branch 'master' into refactor-something-in-part-volumes 2022-05-03 17:51:47 +02:00
Nikolai Kochetov
e44af67fee Merge branch 'master' into refactor-something-in-part-volumes 2022-04-26 21:08:00 +02:00
Memo
c38a4b4255
Update src/Storages/MergeTree/MergeTreeData.h
Co-authored-by: alesapin <alesapin@gmail.com>
2022-04-26 10:26:18 +08:00
mergify[bot]
a5aab53b70
Merge branch 'master' into add_query_level_settingss 2022-04-25 21:41:36 +00:00
Nikolai Kochetov
8c00692844 Part 8 2022-04-22 16:58:09 +00:00
alesapin
5465415751 Fix replace/move partition with zero copy replication 2022-04-21 14:39:12 +02:00
alesapin
c14e2e0b96 Fix more 2022-04-20 21:08:26 +02:00
alesapin
40c15222f8 Merge branch 'master' into fix_trash 2022-04-20 12:45:49 +02:00
Nikolai Kochetov
bcbab2ead8 Part 6. 2022-04-19 19:34:41 +00:00
alesapin
7cb7c120cc Less ugly 2022-04-19 15:53:10 +02:00
alesapin
bd7b3847c1 Some code 2022-04-19 01:09:09 +02:00
Memo
b3adf150b5 add_query_level_settings 2022-04-18 12:15:41 +08:00
alesapin
1706ae9e15 Some trash implementation 2022-04-15 18:36:23 +02:00
Anton Popov
305dd57262
Merge branch 'master' into fix_storage_distributed_ttl 2022-04-14 14:51:15 +02:00
Alexander Tokmakov
66fdf35dfd remove outdated parts immediately on drop partition 2022-04-13 18:01:22 +02:00
Nikolai Kochetov
76870ad92a Part 5 2022-04-12 18:59:49 +00:00
Alexander Tokmakov
457a9e9691 fixes for ReplicatedMergeTree 2022-04-12 14:14:26 +02:00
Alexander Tokmakov
8290ffa88d Merge branch 'master' into mvcc_prototype 2022-04-07 13:50:42 +02:00
Nikolai Kochetov
5a1392a8e3 Try refactor something (1) 2022-04-05 19:12:48 +00:00
Amos Bird
35a8bb2a9b
add comment 2022-04-05 15:56:38 +08:00
Amos Bird
163664fad7
Improve minmax_count_projection 2022-04-05 15:56:37 +08:00
Alexander Tokmakov
287d858fda Merge branch 'master' into mvcc_prototype 2022-03-29 16:24:12 +02:00
taiyang-li
8dbf1c60e7 merge master and fix conflict 2022-03-23 11:36:50 +08:00
Alexander Tokmakov
3c762f566d Merge branch 'master' into mvcc_prototype 2022-03-21 20:16:29 +01:00
Vitaly Baranov
ce25afb2e9 Storages and databases are hollow by default now. 2022-03-20 20:02:15 +01:00
Vitaly Baranov
7f89b98308 Rework BackupSettings and RestoreSettings a little, pass StorageRestoreSettings to storages. 2022-03-20 20:02:15 +01:00
Vitaly Baranov
258a472001 Shorter names: rename IRestoreFromBackupTask -> IRestoreTask. 2022-03-20 20:02:15 +01:00
Vitaly Baranov
7a63feb3f7 Make restore tasks explicit. 2022-03-20 20:01:31 +01:00
Alexander Tokmakov
07d952b728 use snapshots for semistructured data, durability fixes 2022-03-17 18:26:18 +01:00
Alexander Tokmakov
d04dc03fa4 Merge branch 'master' into mvcc_prototype 2022-03-17 15:24:32 +01:00
Nikolai Kochetov
ee9c2ec735
Merge pull request #34780 from azat/mt-delayed-part-flush
Do not delay final part writing by default (fixes possible Memory limit exceeded during INSERT)
2022-03-17 12:30:51 +01:00
Anton Popov
0ba78c3c3a Merge remote-tracking branch 'upstream/master' into HEAD 2022-03-16 15:28:09 +00:00
Alexander Tokmakov
9702b5177d Merge branch 'master' into mvcc_prototype 2022-03-14 21:45:38 +01:00
Alexander Tokmakov
278d779a01 log cleanup, more comments 2022-03-14 21:43:34 +01:00
Maksim Kita
2fdcf53a76 Fix clang-tidy warnings in Server, Storages folders 2022-03-14 18:17:35 +00:00
Anton Popov
063917786e minor fixes 2022-03-14 17:29:18 +00:00
Anton Popov
36ec379aeb Merge remote-tracking branch 'upstream/master' into HEAD 2022-03-14 16:28:35 +00:00
Alexander Tokmakov
061fa6a6f2 Merge branch 'master' into mvcc_prototype 2022-03-10 13:13:04 +01:00
Azat Khuzhin
3a5a39a9df Do not delay final part writing by default
For async S3 writes, final part flushing was deferred until the whole
INSERT block was processed. However, with too many partitions/columns
you may exceed the max_memory_usage limit (since each stream has overhead).

Introduce max_insert_delayed_streams_for_parallel_writes (default:
1000 for S3, 0 otherwise) to avoid this.

This should fix "Memory limit exceeded" errors in performance tests.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-03-08 22:17:36 +03:00
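
The idea behind commit 3a5a39a9df above, capping how many parts may have their final flush postponed, can be sketched roughly as below. This is a hypothetical illustration under stated assumptions, not ClickHouse code: `DelayedPart`, `DelayedPartQueue`, and the choice to finalize the oldest entry first are all invented for this example. Only the notion of a limit (where 0 means no delay at all) comes from the commit message.

```cpp
#include <cstddef>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical stand-in for a part whose final flush has been postponed.
struct DelayedPart
{
    std::function<void()> finalize;
};

// Keeps at most `max_delayed` parts pending; anything beyond the limit is
// finalized right away, which bounds the per-stream memory overhead.
class DelayedPartQueue
{
public:
    explicit DelayedPartQueue(std::size_t max_delayed_) : max_delayed(max_delayed_) {}

    void add(DelayedPart part)
    {
        delayed.push_back(std::move(part));
        while (delayed.size() > max_delayed)
            finalizeOldest();
    }

    // Called once the whole INSERT block has been processed.
    void finalizeAll()
    {
        while (!delayed.empty())
            finalizeOldest();
    }

    std::size_t pending() const { return delayed.size(); }

private:
    void finalizeOldest()
    {
        DelayedPart part = std::move(delayed.front());
        delayed.pop_front();
        part.finalize();
    }

    std::size_t max_delayed;
    std::deque<DelayedPart> delayed;
};
```

With a limit of 0 every part is finalized as soon as it is added, which corresponds to the non-delayed behaviour the message describes as the default outside S3.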
Azat Khuzhin
caffc144b5 Fix possible "Part directory doesn't exist" during INSERT
In #33291 the final part commit was deferred, and now it can take
significantly more time, which may lead to a "Part directory doesn't exist"
error during INSERT:

    2022.02.21 18:18:06.979881 [ 11329 ] {insert} <Debug> executeQuery: (from 127.1:24572, user: default) INSERT INTO db.table (...) VALUES
    2022.02.21 20:58:03.933593 [ 11329 ] {insert} <Trace> db.table: Renaming temporary part tmp_insert_20220214_18044_18044_0 to 20220214_270654_270654_0.
    2022.02.21 21:16:50.961917 [ 11329 ] {insert} <Trace> db.table: Renaming temporary part tmp_insert_20220214_18197_18197_0 to 20220214_270689_270689_0.
    ...
    2022.02.22 21:16:57.632221 [ 64878 ] {} <Warning> db.table: Removing temporary directory /clickhouse/data/db/table/tmp_insert_20220214_18232_18232_0/
    ...
    2022.02.23 12:23:56.277480 [ 11329 ] {insert} <Trace> db.table: Renaming temporary part tmp_insert_20220214_18232_18232_0 to 20220214_273459_273459_0.
    2022.02.23 12:23:56.299218 [ 11329 ] {insert} <Error> executeQuery: Code: 107. DB::Exception: Part directory /clickhouse/data/db/table/tmp_insert_20220214_18232_18232_0/ doesn't exist. Most likely it is a logical error. (FILE_DOESNT_EXIST) (version 22.2.1.1) (from 127.1:24572) (in query: INSERT INTO db.table (...) VALUES), Stack trace (when copying this message, always include the lines below):

Follow-up for: #28760
Refs: #33291

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-03-08 07:44:11 +03:00
taiyang-li
b4174b0bef merge master and fix conflicts 2022-03-08 11:39:25 +08:00
Anton Popov
2758db5341 add more comments 2022-03-01 19:32:55 +03:00
Anton Popov
fcdebea925 Merge remote-tracking branch 'upstream/master' into HEAD 2022-02-25 13:41:30 +03:00
Alexander Tokmakov
dae044f86b Merge branch 'master' into mvcc_prototype 2022-02-17 13:49:37 +03:00
Anton Popov
a661eaf39f better performance of getting storage snapshot 2022-02-16 02:17:22 +03:00
Alexander Tokmakov
ae5aa8c12d write part version before other files 2022-02-15 02:24:51 +03:00
alesapin
b2886a429b Fix lock during fetch 2022-02-14 12:20:27 +03:00
Alexander Tokmakov
07e66e690d Merge branch 'master' into mvcc_prototype 2022-02-11 15:53:32 +03:00
Anton Popov
f012871a7c better caching of common types of object columns 2022-02-11 01:20:30 +03:00
alesapin
3af06b23f8 POC 2022-02-10 22:45:52 +03:00
Anton Popov
dcd7312d75 cache common type on objects in MergeTree 2022-02-09 23:47:53 +03:00
Anton Popov
18940b8637 Merge remote-tracking branch 'upstream/master' into HEAD 2022-02-09 23:38:38 +03:00
taiyang-li
d04ccc0489 Merge branch 'master' into rocksdb_metacache 2022-02-09 11:54:10 +08:00
taiyang-li
b6132d490f merge master and solve conflict 2022-02-08 15:24:59 +08:00
Amos Bird
3fab7af541
Bug fix and improvement of minmax_count_projection 2022-02-06 16:46:11 +08:00
Alexander Tokmakov
fe30e0f162 fixes 2022-02-03 21:57:09 +03:00
Anton Popov
836a348a9c Merge remote-tracking branch 'upstream/master' into HEAD 2022-02-01 15:23:07 +03:00
Alexander Tokmakov
2e4ae37d98 Merge branch 'master' into mvcc_prototype 2022-02-01 13:20:03 +03:00
Amos Bird
ec7d367814
DiskLocal checker
Add DiskLocal checker so that ReplicatedMergeTree can recover data when some of its disks are broken.
2022-02-01 05:55:27 +08:00
Anton Popov
78b9f15abb Merge remote-tracking branch 'upstream/master' into HEAD 2022-01-30 03:24:37 +03:00
李扬
6d50d36405
Merge branch 'master' into rocksdb_metacache 2022-01-28 22:00:31 -06:00
Alexander Tokmakov
fb9b2d5326 Merge branch 'master' into mvcc_prototype 2022-01-28 21:18:36 +03:00
Alexander Tokmakov
e0304c2a58 review fixes, write tid into mutation entry 2022-01-28 20:47:37 +03:00