45 Commits

Author SHA1 Message Date
Alexey Milovidov
bc77aab2e8 Update MergeList.h 2023-08-16 17:49:28 +00:00
Alexey Milovidov
7376b7ec8c Update MergeList.h 2023-08-16 17:49:28 +00:00
Jianfei Hu
bd4df60df6 fix merges
Signed-off-by: Jianfei Hu <hujianfei258@gmail.com>
2023-08-16 17:49:28 +00:00
Jianfei Hu
59a81b82bc wip the merge partition
Signed-off-by: Jianfei Hu <hujianfei258@gmail.com>
2023-08-16 17:49:28 +00:00
Alexey Milovidov
a8bdb20fc4
Merge pull request #48787 from ClickHouse/background-memory-tracker
Add MemoryTracker for the background tasks [Resubmit]
2023-05-09 07:58:36 +03:00
Nikita Mikhaylov
aa4c5fe958
Enhancements for background merges (#49313) 2023-05-02 13:43:59 +02:00
Dmitry Novik
cf5d9a175a Revert "Merge pull request #48760 from ClickHouse/revert-46089-background-memory-tracker"
This reverts commit a61ed33223, reversing
changes made to 5f01b8a2b5.
2023-04-14 16:34:19 +02:00
Alexander Tokmakov
af1bf08663
Revert "Add MemoryTracker for the background tasks" 2023-04-13 21:05:02 +03:00
Dmitry Novik
06e6794fc0 Merge remote-tracking branch 'origin/master' into background-memory-tracker 2023-04-11 15:29:35 +00:00
Azat Khuzhin
aacf2a0838 Move ThreadGroupSwitcher to ThreadStatus.h (out from MergeTree code)
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-04-07 15:32:08 +02:00
Azat Khuzhin
5b2b20a0b0 Rename ThreadGroupStatus to ThreadGroup
There are methods like getThreadGroup() and a ThreadGroupSwitcher class,
so this name seems more logical.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-04-07 15:31:48 +02:00
Sema Checherinda
1031e2001b fix build 2023-03-29 07:41:31 +02:00
Sema Checherinda
a6ab33a906 no use query, but storage context 2023-03-27 16:30:32 +02:00
Sema Checherinda
aeb8766ad5 adjust after rebase 2023-03-24 19:53:16 +01:00
Sema Checherinda
da3e744405 set context from the master thread 2023-03-24 19:53:16 +01:00
Sema Checherinda
0fcf7c0363 std::optional instead of shared_ptr 2023-03-24 19:53:16 +01:00
Sema Checherinda
bc107c70fa merge and mutation make thread group for setting memory trackers right 2023-03-24 19:53:16 +01:00
Dmitry Novik
1e065b32f3 Add MemoryTracker for the background tasks 2023-02-06 17:25:58 +00:00
Azat Khuzhin
5da2f52722 Use Int64 over UInt64 for prev_untracked_memory* in MemoryTrackerThreadSwitcher
Since those types were originally Int64.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-11-22 19:40:35 +01:00
Raúl Marín
d68b3dfd43 Fix destructor order 2022-07-15 15:48:35 +02:00
Raúl Marín
aea045f297 Improve logging around replicated merges 2022-07-15 15:48:35 +02:00
lthaooo
6632616733
Fix TTL merge scheduling bug (#36387) 2022-06-01 21:09:53 +02:00
Azat Khuzhin
65e9b4879d Fix possible memory_tracker use-after-free for merges/mutations
There are two possible cases for executing merges/mutations:
1) from a background thread
2) from an OPTIMIZE TABLE query

1) is pretty simple; its memory tracking structure is as follows:

    current_thread::memory_tracker = level=Thread / description="(for thread)" ==
      background_thread_memory_tracker = level=Thread / description="(for thread)"
    current_thread::memory_tracker.parent = level=Global / description="(total)"

  So, as you can see, it is pretty simple, and MemoryTrackerThreadSwitcher
  does not do anything icky for this case.

2) is complex; its memory tracking structure is as follows:

    current_thread::memory_tracker = level=Thread / description="(for thread)"
    current_thread::memory_tracker.parent = level=Process / description="(for query)" ==
      background_thread_memory_tracker = level=Process / description="(for query)"

  Before this patch, dirty hacks were used to track memory (and related
  things, like sampling, profiling and so on) for the OPTIMIZE TABLE
  query, since the current_thread memory_tracker was of Thread scope,
  which does not have any limits.

  And so if its parent were changed to the Merge/Mutate memory tracker
  (which also lacks some of the settings), memory would not be tracked
  correctly.

  To address this, Merge/Mutate was set as the parent not of the
  current_thread memory_tracker but of its parent, since that one has
  Process scope with all settings.

  But that parent's memory_tracker is the memory_tracker of the
  thread_group, so if you have a nested ThreadPool inside
  merge/mutate (this is the case for s3 async writes, which were
  added in #33291) you may get a use-after-free of memory_tracker.

  Consider the following example:

    MemoryTrackerThreadSwitcher()
      thread_group.memory_tracker.parent = merge_list_entry->memory_tracker
      (see also background_thread_memory_tracker above)

    CurrentThread::attachTo()
      current_thread.memory_tracker.parent = thread_group.memory_tracker

    CurrentThread::detachQuery()
      current_thread.memory_tracker.parent = thread_group.memory_tracker.parent
      # and this is equal to merge_list_entry->memory_tracker

    ~MemoryTrackerThreadSwitcher()
      thread_group.memory_tracker = thread_group.memory_tracker.parent

  So after this sequence we are left with an incorrect memory_tracker
  (from the merge_list_entry): when the next job in that ThreadPool does
  not have a thread_group, it will not try to update
  current_thread.memory_tracker.parent, and a use-after-free will happen.

So to address issue (2), settings from the parent memory_tracker
should be copied to merge_list_entry->memory_tracker, to avoid
playing with the parent memory tracker.

Note that settings from the query (OPTIMIZE TABLE) are not available at
that time, so they cannot be used (instead of the parent's memory tracker
settings).

v2: remove memory_tracker.setOrRaiseHardLimit() from settings

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-02-18 16:23:54 +03:00
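
The fix described in the commit above (copy settings into the merge list entry's own tracker instead of re-pointing the thread group's parent) can be sketched roughly as follows. This is a toy model, not ClickHouse code: `ToyMemoryTracker`, `ToyMergeListEntry`, `copySettings` and the two setting fields are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical, simplified stand-in for a memory tracker (not ClickHouse's class).
struct ToyMemoryTracker
{
    std::string description;
    int64_t profiler_step = 0;        // assumed setting name, for illustration
    double sample_probability = 0.0;  // assumed setting name, for illustration
    ToyMemoryTracker * parent = nullptr;
};

// Copy settings by value; nobody's parent pointer is modified.
inline void copySettings(const ToyMemoryTracker & from, ToyMemoryTracker & to)
{
    to.profiler_step = from.profiler_step;
    to.sample_probability = from.sample_probability;
}

// Hypothetical merge list entry that owns its tracker.
struct ToyMergeListEntry
{
    ToyMemoryTracker memory_tracker;

    explicit ToyMergeListEntry(const ToyMemoryTracker & settings_source)
    {
        memory_tracker.description = "(for merge)";
        copySettings(settings_source, memory_tracker);
    }
};
```

Because the query-level tracker is only read, destroying the entry cannot leave a longer-lived tracker pointing at freed memory, which is the hazard the commit message walks through.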
taiyang-li
36ca0b296b implement hive table engine 2021-11-05 19:55:30 +08:00
Azat Khuzhin
2a7a1d8df5 Avoid losing any allocations context from merges/mutations 2021-10-15 01:43:28 +03:00
Azat Khuzhin
8a209a78d7 Set query_id for mutations/merges
This makes it possible to distinguish allocations in trace_log.
2021-10-15 01:43:28 +03:00
Azat Khuzhin
8cc45bea7b Avoid accounting memory from another mutation/merge
Before this patch it was possible for one merge/mutation to account
memory from another, due to ThreadStatus::untracked_memory.

This caused flakiness in 01200_mutations_memory_consumption.
2021-10-15 01:43:28 +03:00
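
One way to picture the untracked-memory problem from the commit above is an RAII scope that sets aside the thread's pending byte counter while a merge runs. The sketch below is an assumption about the general technique, not the actual patch; `ToyThreadState`, `ToyMergeTracker` and `ToyUntrackedMemoryScope` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical per-thread state: bytes allocated but not yet reported to any tracker.
struct ToyThreadState
{
    int64_t untracked_memory = 0;
};

// Hypothetical per-merge tracker that only sums reported bytes.
struct ToyMergeTracker
{
    int64_t accounted = 0;
};

// While alive, the thread starts from zero pending bytes, so bytes left over
// from earlier work are not charged to this merge.
class ToyUntrackedMemoryScope
{
public:
    ToyUntrackedMemoryScope(ToyThreadState & thread_, ToyMergeTracker & tracker_)
        : thread(thread_), tracker(tracker_), prev_untracked_memory(thread_.untracked_memory)
    {
        thread.untracked_memory = 0;
    }

    ~ToyUntrackedMemoryScope()
    {
        // Report what this merge accumulated, then restore the saved counter.
        tracker.accounted += thread.untracked_memory;
        thread.untracked_memory = prev_untracked_memory;
    }

    ToyUntrackedMemoryScope(const ToyUntrackedMemoryScope &) = delete;
    ToyUntrackedMemoryScope & operator=(const ToyUntrackedMemoryScope &) = delete;

private:
    ToyThreadState & thread;
    ToyMergeTracker & tracker;
    int64_t prev_untracked_memory;
};
```

With this shape, bytes pending before the scope opened stay with the thread, and only bytes accumulated inside the scope reach the merge's tracker.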
Azat Khuzhin
117e9e77c8 Apply max_untracked_memory/memory_profiler_{step,sample_probability} during mutate/merge 2021-10-03 17:39:07 +03:00
Amos Bird
23d3d894e6
Fix projection merges and mutations. 2021-09-24 22:45:50 +08:00
Nikita Mikhaylov
c52b8ec083
Introduced MergeTask and MutateTask (#25165)
Introduced MergeTask and MutateTask
2021-09-17 00:19:58 +03:00
tavplubix
c958f876b8
Update MergeList.h 2021-06-30 15:55:26 +03:00
Alexander Tokmakov
76156af5cc cancel merges on drop partition 2021-06-24 17:07:43 +03:00
Azat Khuzhin
1062d0ec91 Distinguish KILL MUTATION for different tables.
Before this patch, KILL MUTATION marked a mutation as canceled just by name
(and part numbers), so if you had multiple tables with the same part
name, killing a mutation for one table would mark it as killed for
another too.

Fix this by comparing StorageID too (it is better to use StorageID over
database/table, since comparing UUIDs avoids ambiguity).

Here is a failure of the 01414_freeze_does_not_prevent_alters on CI [1].

  [1]: https://clickhouse-test-reports.s3.yandex.net/24069/9fb69dcf98c71a939d200cad3c8491bf43a44622/functional_stateless_tests_(ubsan).html#fail1
2021-06-08 10:51:22 +03:00
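
The matching rule described in the commit above can be illustrated with a small sketch in which a mutation is cancelled only when both the table identity and the part name match. All names here (`ToyStorageID`, `ToyMutationEntry`, `cancelMutations`) are hypothetical, and the UUID-first comparison is an assumption about how such an identifier might compare, not a copy of ClickHouse's `StorageID`.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical table identifier: prefer the UUID when both sides have one.
struct ToyStorageID
{
    std::string database;
    std::string table;
    std::string uuid;  // may be empty

    bool operator==(const ToyStorageID & other) const
    {
        if (!uuid.empty() && !other.uuid.empty())
            return uuid == other.uuid;
        return database == other.database && table == other.table;
    }
};

// Hypothetical record of a running mutation.
struct ToyMutationEntry
{
    ToyStorageID table_id;
    std::string source_part_name;
    bool is_cancelled;
};

// Cancel only mutations that belong to the given table AND touch the given part.
inline std::size_t cancelMutations(
    std::vector<ToyMutationEntry> & entries,
    const ToyStorageID & table_id,
    const std::string & part_name)
{
    std::size_t cancelled = 0;
    for (auto & entry : entries)
    {
        if (entry.table_id == table_id && entry.source_part_name == part_name)
        {
            entry.is_cancelled = true;
            ++cancelled;
        }
    }
    return cancelled;
}
```

Two tables that both have a part with the same name are then handled independently: cancelling for one leaves the other's entry untouched.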
alesapin
5622e6daa6 Fix rare max_number_of_merges_with_ttl_in_pool limit overrun for non-replicated MergeTree 2021-01-27 14:56:12 +03:00
alesapin
b34960bffa Merge branch 'master' into system_fetches_table 2020-10-30 11:33:37 +03:00
alesapin
60f2d822d7 Fix fake race condition on system.merges merge_algorithm 2020-10-27 18:27:12 +03:00
alesapin
880f4bbd05 System fetches 2020-10-26 19:38:35 +03:00
alesapin
9ed4668dbb Refactor common part of background list 2020-10-26 15:40:55 +03:00
Amos Bird
c2d79bc5cc
Add merge_algorithm to system.merges 2020-09-13 10:00:03 +08:00
alesapin
ea7168580b Fixes 2020-09-04 16:55:07 +03:00
alesapin
e42d0f60da Fix several bugs 2020-09-04 14:27:27 +03:00
alesapin
61ecaebcb1 Simplify settings for TTL merges 2020-09-04 09:55:19 +03:00
alesapin
f4c7ff0376 Add fixed size of Merge TTLS 2020-09-03 16:00:13 +03:00
Azat Khuzhin
d93b9a57f6 Forward declaration for Context as much as possible.
Now after changing Context.h 488 modules will be recompiled instead of 582.
2020-05-21 01:53:18 +03:00
Ivan Lezhankin
06446b4f08 dbms/ → src/ 2020-04-03 18:14:31 +03:00