Fix possible UB in MergeTreeBackgroundExecutor

The erase_from_active() lambda captures the item (TaskRuntimeDataPtr).
Most of the code paths are OK, since they explicitly reset item->task.

However, one path is not: the one that moves the item to the pending
list. That list is cleaned up when the table is DROP/DETACH'ed, from
MergeTreeBackgroundExecutor::removeTasksCorrespondingToStorage(). If
the IStorage has already been removed by the time the lambda is
destroyed, this leads to a use-after-free, since the lambda captures
the item by value and therefore keeps it (and its task) alive until
then.
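
A minimal sketch of the lifetime issue, outside of ClickHouse. The
names Task, TaskPtr and destroyedWhenPendingCleared() are hypothetical,
and a std::vector stands in for the pending queue; the sketch only
illustrates how a by-value capture of a shared_ptr delays destruction,
not the real executor code:

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

/// Hypothetical stand-in for a task; its destructor only records that it ran.
struct Task
{
    explicit Task(bool & destroyed_) : destroyed(destroyed_) {}
    ~Task() { destroyed = true; }
    bool & destroyed;
};

using TaskPtr = std::shared_ptr<Task>;

/// Returns whether the task was destroyed at the moment the pending list
/// was cleared (the analogue of the DROP/DETACH cleanup).
bool destroyedWhenPendingCleared(bool capture_by_value)
{
    bool destroyed = false;
    bool result = false;
    std::vector<TaskPtr> pending;
    TaskPtr item = std::make_shared<Task>(destroyed);

    if (capture_by_value)
    {
        /// The lambda owns its own copy of the shared_ptr.
        auto erase = [item]() { (void)item; };
        pending.push_back(std::move(item));
        pending.clear();
        /// The copy inside the lambda still keeps the task alive here;
        /// the task dies only when `erase` goes out of scope.
        result = destroyed;
        erase();
    }
    else
    {
        /// The lambda only refers to the local variable, it owns nothing.
        auto erase = [&item]() { (void)item; };
        pending.push_back(std::move(item));
        pending.clear();
        /// The pending list held the only owner, so the task is gone now.
        result = destroyed;
        erase();
    }
    return result;
}
```

With the by-value capture the function returns false (the task outlives
the cleanup of the pending list), with the by-reference capture it
returns true. Note that a by-reference capture sees a moved-from item
after std::move(), which is presumably why the patch also moves the
erase_from_active() call before the item is pushed to the pending list.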

And I believe that CI found this issue here [1]:

<details>

<summary>stack trace</summary>

    4 0x268d1354 in DB::ReplicatedMergeTreeQueue::CurrentlyExecuting::~CurrentlyExecuting() build_docker/../src/Storages/MergeTree/ReplicatedMergeTreeQueue.cpp:1510:33
    5 0x268ead1a in std::__1::default_delete<DB::ReplicatedMergeTreeQueue::CurrentlyExecuting>::operator()(DB::ReplicatedMergeTreeQueue::CurrentlyExecuting*) const build_docker/../contrib/libcxx/include/__memory/unique_ptr.h:54:5
    6 0x268ead1a in std::__1::unique_ptr<DB::ReplicatedMergeTreeQueue::CurrentlyExecuting, std::__1::default_delete<DB::ReplicatedMergeTreeQueue::CurrentlyExecuting> >::reset(DB::ReplicatedMergeTreeQueue::CurrentlyExecuting*) build_docker/../contrib/libcxx/include/__memory/unique_ptr.h:315:7
    7 0x268ead1a in std::__1::unique_ptr<DB::ReplicatedMergeTreeQueue::CurrentlyExecuting, std::__1::default_delete<DB::ReplicatedMergeTreeQueue::CurrentlyExecuting> >::~unique_ptr() build_docker/../contrib/libcxx/include/__memory/unique_ptr.h:269:19
    8 0x268ead1a in DB::ReplicatedMergeTreeQueue::SelectedEntry::~SelectedEntry() build_docker/../src/Storages/MergeTree/ReplicatedMergeTreeQueue.h:351:12
    9 0x268ead1a in void std::__1::__destroy_at<DB::ReplicatedMergeTreeQueue::SelectedEntry, 0>(DB::ReplicatedMergeTreeQueue::SelectedEntry*) build_docker/../contrib/libcxx/include/__memory/construct_at.h:56:13
    ...
    16 0x265e9abb in DB::MergeTreeBackgroundExecutor<DB::MergeMutateRuntimeQueue>::routine(std::__1::shared_ptr<DB::TaskRuntimeData>)::'lambda'()::~() build_docker/../src/Storages/MergeTree/MergeTreeBackgroundExecutor.cpp:127:30
    17 0x265e9abb in DB::MergeTreeBackgroundExecutor<DB::MergeMutateRuntimeQueue>::routine(std::__1::shared_ptr<DB::TaskRuntimeData>) build_docker/../src/Storages/MergeTree/MergeTreeBackgroundExecutor.cpp:235:1
    18 0x265ea730 in DB::MergeTreeBackgroundExecutor<DB::MergeMutateRuntimeQueue>::threadFunction() build_docker/../src/Storages/MergeTree/MergeTreeBackgroundExecutor.cpp:261:13

</details>

  [1]: https://s3.amazonaws.com/clickhouse-test-reports/0/533c7888038453c047df816f3f65292cca05a54f/stateless_tests__ubsan__actions_.html

I also think that the following issue will be fixed as well [2]:

<details>

<summary>stack trace</summary>

    [ 680 ] {} <Fatal> : Logical error: 'Memory tracker: allocations not allowed.'.
    ...
    [ 23430 ] {} <Fatal> BaseDaemon: 23. /build/build_docker/../src/Common/formatReadable.cpp:46: formatReadableSizeWithBinarySuffix(double, int) @ 0x1713bb98 in /usr/bin/clickhouse
    ...
    [ 23430 ] {} <Fatal> BaseDaemon: 31. ../src/Common/MemoryTracker.cpp:112: MemoryTracker::logPeakMemoryUsage() @ 0x170e0ab9 in /usr/bin/clickhouse
    [ 23430 ] {} <Fatal> BaseDaemon: 32. /build/build_docker/../src/Common/MemoryTracker.cpp:98: MemoryTracker::~MemoryTracker() @ 0x170e063a in /usr/bin/clickhouse
    [ 23430 ] {} <Fatal> BaseDaemon: 33. /build/build_docker/../src/Storages/MergeTree/MergeList.cpp:144: DB::MergeListElement::~MergeListElement() @ 0x279fb290 in /usr/bin/clickhouse
    ...
    [ 23430 ] {} <Fatal> BaseDaemon: 38. /build/build_docker/../src/Storages/MergeTree/BackgroundProcessList.h:41: DB::BackgroundProcessListEntry<DB::MergeListElement, DB::MergeInfo>::~BackgroundProcessListEntry() @ 0x276ce6c7 in /usr/bin/clickhouse
    [ 23430 ] {} <Fatal> BaseDaemon: 39. /build/build_docker/../contrib/libcxx/include/__memory/unique_ptr.h:54: std::__1::default_delete<DB::BackgroundProcessListEntry<DB::MergeListElement, DB::MergeInfo> >::operator()(DB::BackgroundProcessListEntry<DB::MergeListElement, DB::MergeInfo>*) const @ 0x276ce60b in /usr/bin/clickhouse
    [ 23430 ] {} <Fatal> BaseDaemon: 40. /build/build_docker/../contrib/libcxx/include/__memory/unique_ptr.h:316: std::__1::unique_ptr<DB::BackgroundProcessListEntry<DB::MergeListElement, DB::MergeInfo>, std::__1::default_delete<DB::BackgroundProcessListEntry<DB::MergeListElement, DB::MergeInfo> > >::reset(DB::BackgroundProcessListEntry<DB::MergeListElement, DB::MergeInfo>*) @ 0x276ce57c in /usr/bin/clickhouse
    [ 23430 ] {} <Fatal> BaseDaemon: 41. /build/build_docker/../contrib/libcxx/include/__memory/unique_ptr.h:269: std::__1::unique_ptr<DB::BackgroundProcessListEntry<DB::MergeListElement, DB::MergeInfo>, std::__1::default_delete<DB::BackgroundProcessListEntry<DB::MergeListElement, DB::MergeInfo> > >::~unique_ptr() @ 0x276ce399 in /usr/bin/clickhouse
    [ 23430 ] {} <Fatal> BaseDaemon: 42. /build/build_docker/../src/Storages/MergeTree/MutatePlainMergeTreeTask.h:22: DB::MutatePlainMergeTreeTask::~MutatePlainMergeTreeTask() @ 0x27defceb in /usr/bin/clickhouse
    [ 23430 ] {} <Fatal> BaseDaemon: 43. /build/build_docker/../contrib/libcxx/include/__memory/construct_at.h:57: void std::__1::__destroy_at<DB::MutatePlainMergeTreeTask, 0>(DB::MutatePlainMergeTreeTask*) @ 0x27dd69c1 in /usr/bin/clickhouse
    [ 23430 ] {} <Fatal> BaseDaemon: 44. /build/build_docker/../contrib/libcxx/include/__memory/construct_at.h:82: void std::__1::destroy_at<DB::MutatePlainMergeTreeTask, 0>(DB::MutatePlainMergeTreeTask*) @ 0x27dd6955 in /usr/bin/clickhouse
    [ 23430 ] {} <Fatal> BaseDaemon: Integrity check of the executable skipped because the reference checksum could not be read. (calculated checksum: 91F5937571C11255DFE73230B52CE9C0)
    [ 602 ] {} <Fatal> Application: Child process was terminated by signal 6.

</details>

  [2]: https://s3.amazonaws.com/clickhouse-test-reports/39222/a068c397dfd7943359a8b554566c3c70b78baf8d/stateless_tests__debug__actions__%5B1/3%5D.html

Refs: https://github.com/ClickHouse/ClickHouse/pull/29614#discussion_r720455032 (cc @nikitamikhaylov)
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
Azat Khuzhin 2022-07-18 18:32:18 +03:00
parent 855ad30a5d
commit c6f7698f9a

@@ -124,7 +124,7 @@ void MergeTreeBackgroundExecutor<Queue>::routine(TaskRuntimeDataPtr item)
 /// All operations with queues are considered no to do any allocations
-auto erase_from_active = [this, item]() TSA_REQUIRES(mutex)
+auto erase_from_active = [this, &item]() TSA_REQUIRES(mutex)
 {
 active.erase(std::remove(active.begin(), active.end(), item), active.end());
 };
@@ -157,11 +157,10 @@ void MergeTreeBackgroundExecutor<Queue>::routine(TaskRuntimeDataPtr item)
 if (need_execute_again)
 {
 std::lock_guard guard(mutex);
+erase_from_active();
 if (item->is_currently_deleting)
 {
-erase_from_active();
 /// This is significant to order the destructors.
 {
 NOEXCEPT_SCOPE({
@@ -179,7 +178,6 @@ void MergeTreeBackgroundExecutor<Queue>::routine(TaskRuntimeDataPtr item)
 /// Otherwise the destruction of the task won't be ordered with the destruction of the
 /// storage.
 pending.push(std::move(item));
-erase_from_active();
 has_tasks.notify_one();
 item = nullptr;
 return;