Commit Graph

155 Commits

Author SHA1 Message Date
Azat Khuzhin
fef5f146e7 Fix ENOENT with fsync_part_directory and Vertical merge
fsync of the temporary part directory is superfluous anyway, and besides
that directory is not exists at that time, that will lead to ENOENT
error:

    2022.02.18 17:02:51.634565 [ 35639 ] {} <Error> void DB::MergeTreeBackgroundExecutor<DB::MergeMutateRuntimeQueue>::routine(DB::TaskRuntimeDataPtr) [Queue = DB::MergeMutateRuntimeQueue]: Code: 107. DB::ErrnoException: Cannot open file /var/lib/clickhouse/data/system/text_log/tmp_merge_202202_1864_3192_14/, errno: 2, strerror: No such file or directory. (FILE_DOESNT_EXIST), Stack trace (when copying this message, always include the lines below):

    0. DB::Exception::Exception() @ 0xb26ecfa in /usr/lib/debug/.build-id/01/8c328bd4858d67.debug
    1. DB::throwFromErrnoWithPath() @ 0xb2700ea in /usr/lib/debug/.build-id/01/8c328bd4858d67.debug
    2. DB::LocalDirectorySyncGuard::LocalDirectorySyncGuard() @ 0x14905531 in /usr/lib/debug/.build-id/01/8c328bd4858d67.debug
    3. DB::DiskLocal::getDirectorySyncGuard() const @ 0x148af3e3 in /usr/lib/debug/.build-id/01/8c328bd4858d67.debug
    4. DB::MergeTask::ExecuteAndFinalizeHorizontalPart::prepare() @ 0x157bef13 in /usr/lib/debug/.build-id/01/8c328bd4858d67.debug

Note, that IMergeTreeDataPart::renameTo() anyway will have fsync for the
directory.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-02-19 07:50:59 +03:00
Azat Khuzhin
65e9b4879d Fix possible memory_tracker use-after-free for merges/mutations
There are two possible cases for execution merges/mutations:
1) from background thread
2) from OPTIMIZE TABLE query

1) is pretty simple, it's memory tracking structure is as follow:

    current_thread::memory_tracker = level=Thread / description="(for thread)" ==
      background_thread_memory_tracker = level=Thread / description="(for thread)"
    current_thread::memory_tracker.parent = level=Global / description="(total)"

  So as you can see it is pretty simple and MemoryTrackerThreadSwitcher
  does not do anything icky for this case.

2) is complex, it's memory tracking structure is as follow:

    current_thread::memory_tracker = level=Thread / description="(for thread)"
    current_thread::memory_tracker.parent = level=Process / description="(for query)" ==
      background_thread_memory_tracker = level=Process / description="(for query)"

  Before this patch to track memory (and related things, like sampling,
  profiling and so on) for OPTIMIZE TABLE query dirty hacks was done to
  do this, since current_thread memory_tracker was of Thread scope, that
  does not have any limits.

  And so if will change parent for it to Merge/Mutate memory tracker
  (which also does not have some of settings) it will not be correctly
  tracked.

  To address this Merge/Mutate was set as parent not to the
  current_thread memory_tracker but to it's parent, since it's scope is
  Process with all settings.

  But that parent's memory_tracker is the memory_tracker of the
  thread_group, and so if you will have nested ThreadPool inside
  merge/mutate (this is the case for s3 async writes, which has been
  added in #33291) you may get use-after-free of memory_tracker.

  Consider the following example:

    MemoryTrackerThreadSwitcher()
      thread_group.memory_tracker.parent = merge_list_entry->memory_tracker
      (see also background_thread_memory_tracker above)

    CurrentThread::attachTo()
      current_thread.memory_tracker.parent = thread_group.memory_tracker

    CurrentThread::detachQuery()
      current_thread.memory_tracker.parent = thread_group.memory_tracker.parent
      # and this is equal to merge_list_entry->memory_tracker

    ~MemoryTrackerThreadSwitcher()
      thread_group.memory_tracker = thread_group.memory_tracker.parent

  So after the following we will get incorrect memory_tracker (from the
  mege_list_entry) when the next job in that ThreadPool will not have
  thread_group, since in this case it will not try to update the
  current_thread.memory_tracker.parent and use-after-free will happens.

So to address the (2) issue, settings from the parent memory_tracker
should be copied to the merge_list_entry->memory_tracker, to avoid
playing with parent memory tracker.

Note, that settings from the query (OPTIMIZE TABLE) is not available at
that time, so it cannot be used (instead of parent's memory tracker
settings).

v2: remove memory_tracker.setOrRaiseHardLimit() from settings

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-02-18 16:23:54 +03:00
Alexander Tokmakov
ae5aa8c12d write part version before other files 2022-02-15 02:24:51 +03:00
Anton Popov
dcd7312d75 cache common type on objects in MergeTree 2022-02-09 23:47:53 +03:00
Anton Popov
18940b8637 Merge remote-tracking branch 'upstream/master' into HEAD 2022-02-09 23:38:38 +03:00
Nikolai Kochetov
2a6eb593be
Revert "Revert "Add pool to WriteBufferFromS3"" 2022-02-01 13:36:51 +03:00
alexey-milovidov
095d9bfa43
Revert "Add pool to WriteBufferFromS3" 2022-02-01 05:49:40 +03:00
Anton Popov
78b9f15abb Merge remote-tracking branch 'upstream/master' into HEAD 2022-01-30 03:24:37 +03:00
Nikolai Kochetov
9b2998c639 Review fixes. 2022-01-26 18:08:01 +00:00
Nikolai Kochetov
506ee8c024 Refactor some code. 2022-01-24 15:55:29 +00:00
Nikolai Kochetov
f74cf1e152 Merge branch 'master' into add-pool-to-s3-write-buffer 2022-01-21 21:55:42 +03:00
Nikolai Kochetov
b3cbb63487 Merge branch 'master' into add-pool-to-s3-write-buffer 2022-01-21 21:41:54 +03:00
Anton Popov
e8ce091e68 Merge remote-tracking branch 'upstream/master' into HEAD 2022-01-21 20:11:18 +03:00
Anton Popov
c2206c0b96 better storing serialization infos 2022-01-21 03:29:30 +03:00
Amos Bird
62441f0a0f
Fix mutation when table contains projections (#33679) 2022-01-19 15:27:11 +03:00
Nikolai Kochetov
6d49a62666 Some more async writes. 2022-01-14 19:53:55 +00:00
Nikolai Kochetov
28f2012b06 Add some more async writing. 2022-01-11 19:02:48 +00:00
Anton Popov
99ebabd822 Merge remote-tracking branch 'upstream/master' into HEAD 2021-12-17 19:02:29 +03:00
Anton Popov
16312e7e4a Merge remote-tracking branch 'upstream/master' into HEAD 2021-12-14 18:58:17 +03:00
Anton Popov
61a5f8a61a add comments 2021-12-08 18:56:30 +03:00
Azat Khuzhin
82e0e9d27c Fix minmax partition index during merge in case of empty part in future parts
It is possible to get empty part in future parts, during merging, that
can be created in case of there is information about this part in
zookeeper, but no replicas has it
(StorageReplicatedMergeTree::createEmptyPartInsteadOfLost())

And this may incorrectly initialize min value, which will be updated
after one more merge (since during that merge there will be no more
information about empty part) or OPTIMIZE.
2021-12-04 21:49:49 +03:00
Anton Popov
54f51444c0 Merge remote-tracking branch 'upstream/master' into HEAD 2021-12-01 15:49:02 +03:00
Anton Popov
ccd78e3838 Merge remote-tracking branch 'upstream/master' into HEAD 2021-11-22 17:19:35 +03:00
Amos Bird
30b06a969b
Cancel vertical merges (#31057)
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
2021-11-15 14:32:53 +03:00
Anton Popov
a20922b2d3 Merge remote-tracking branch 'origin/sparse-serialization' into HEAD 2021-11-09 15:36:25 +03:00
Anton Popov
66973a2a28 Merge remote-tracking branch 'upstream/master' into HEAD 2021-11-08 21:27:45 +03:00
Anton Popov
84e914e05a minor fixes near serializations 2021-11-05 01:46:00 +03:00
Nikita Mikhaylov
59c8ed9b0c
Fixed is_cancelled predicate inside MergeTask (#30996) 2021-11-03 13:25:45 +03:00
Anton Popov
d50137013c Merge remote-tracking branch 'upstream/master' into HEAD 2021-11-01 16:55:53 +03:00
Anton Popov
0099dfd523 refactoring of SerializationInfo 2021-10-29 20:21:02 +03:00
Anton Popov
d71ffc355a Merge remote-tracking branch 'upstream/master' into HEAD 2021-10-18 15:18:22 +03:00
Nikolai Kochetov
6e479b301a Update memory optimisation for MergingSorted. 2021-10-18 12:54:12 +03:00
Nikolai Kochetov
bfcbf5abe0 Merge branch 'master' into removing-data-streams-folder 2021-10-17 10:42:37 +03:00
alexey-milovidov
4f11cfa59d
Merge pull request #28760 from azat/mutator-forbid-cleaner
Forbid cleaning of tmp directories that can be used by an active mutation/merge.
2021-10-17 03:18:01 +03:00
Nikolai Kochetov
067eaadadd Merge branch 'master' into removing-data-streams-folder 2021-10-16 09:46:05 +03:00
Nikolai Kochetov
c668696047
Merge pull request #30171 from ClickHouse/remove-stream-interfaces
Remove stream interfaces
2021-10-16 09:34:01 +03:00
Azat Khuzhin
07e8b2b3c7 Do not try to remove temporary paths that is currently in written by merge/mutation
v2: rebase against MergeTask
v3: rebase due to conflicts in src/Storages/MergeTree/MergeTreeDataMergerMutator.cpp
v4:
- rebase due to conflicts in src/Storages/MergeTree/MergeTask.cpp
- drop common/scope_guard_safe.h (not used)
2021-10-16 00:43:52 +03:00
Nikolai Kochetov
ad8a344b46 Move TTL streams and algo 2021-10-15 13:11:57 +03:00
Anton Popov
7aa6068fb2 Merge remote-tracking branch 'upstream/master' into HEAD 2021-10-14 19:44:08 +03:00
Nikolai Kochetov
2957971ee3 Remove some last streams. 2021-10-13 21:22:02 +03:00
Amos Bird
72bccaa501
Fix path name 2021-10-11 19:12:08 +08:00
Amos Bird
83717b7c3b
Get rid of naming limitation of projections. 2021-10-11 18:32:17 +08:00
Nikolai Kochetov
340b53ef85 Remove some more streams. 2021-10-08 17:03:54 +03:00
Nikolai Kochetov
d0c6f11fcb More. 2021-10-06 20:59:27 +03:00
mergify[bot]
ab55225c8f
Merge branch 'master' into remove-merging-streams 2021-10-05 14:16:34 +00:00
Nikolai Kochetov
2001ebbf9d Fix build. 2021-10-04 21:52:31 +03:00
Azat Khuzhin
117e9e77c8 Apply max_untracked_memory/memory_profiler_{step,sample_probability} during mutate/merge 2021-10-03 17:39:07 +03:00
Alexey Milovidov
fe6b7c77c7 Rename "common" to "base" 2021-10-02 10:13:14 +03:00
Nikolai Kochetov
68f8b9d235 Update ColumnGathererStream 2021-09-29 20:45:01 +03:00
Anton Popov
914781052e Merge remote-tracking branch 'upstream/master' into HEAD 2021-09-29 17:37:07 +03:00
Nikolai Kochetov
236d71ea94
Merge pull request #28582 from ClickHouse/rewrite-pushing-to-views
Rewrite PushingToViews
2021-09-27 21:19:11 +03:00
Amos Bird
23d3d894e6
Fix projection merges and mutations. 2021-09-24 22:45:50 +08:00
Anton Popov
6f9e53197c Merge remote-tracking branch 'upstream/master' into HEAD 2021-09-20 17:17:05 +03:00
Nikolai Kochetov
a8443bef4d Fix build. 2021-09-17 20:52:26 +03:00
Nikita Mikhaylov
c52b8ec083
Introduced MergeTask and MutateTask (#25165)
Introduced MergeTask and MutateTask
2021-09-17 00:19:58 +03:00