There was serveral issues:
1) there can be events for multiple threads in one block
2) lack of filter for thread_id = 0
3) thread_id was included into comparator
Fixes: #34719
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
CI founds [1]:
WARNING: ThreadSanitizer: data race (pid=99)
Write of size 8 at 0x7b14002192b0 by thread T27:
12 std::__1::map<>
13 DB::OpenedFileCache::~OpenedFileCache() obj-x86_64-linux-gnu/../src/IO/OpenedFileCache.h:27:7 (clickhouse+0xac66de6)
14 cxa_at_exit_wrapper(void*) crtstuff.c (clickhouse+0xaa3646f)
15 decltype(*(std::__1::forward<nuraft::raft_server*&>(fp0)).*fp()) std::__1::__invoke<void ()(), nuraft::raft_server*&, void>()
Previous read of size 8 at 0x7b14002192b0 by thread T37 (mutexes: write M732116415018761216):
4 DB::OpenedFileCache::get() obj-x86_64-linux-gnu/../src/IO/OpenedFileCache.h:47:37 (clickhouse+0xac66784)
Thread T27 'nuraft_commit' (tid=193, running) created by main thread at:
...
Thread T37 'MergeMutate' (tid=204, running) created by main thread at:
...
But it also reports that the mutex was already destroyed:
Mutex M732116415018761216 is already destroyed.
The problem is that [nuraft can call `exit()`](1707a7572a/src/handle_commit.cxx (L157-L158)) which will call atexit handlers:
2022.02.17 22:54:03.495450 [ 193 ] {} <Error> RaftInstance: background committing thread encounter err Memory limit (total) exceeded: would use 56.56 GiB (attempt to allocate chunk of 8192 bytes), maximum: 55.82 GiB, exiting to protect the system
[1]: https://s3.amazonaws.com/clickhouse-test-reports/33057/5a8cf3ac98808dadf125068a33ed9c622998a484/fuzzer_astfuzzertsan,actions//report.html
Let's replace exit() with abort() to avoid this.
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
fsync of the temporary part directory is superfluous anyway, and besides
that directory is not exists at that time, that will lead to ENOENT
error:
2022.02.18 17:02:51.634565 [ 35639 ] {} <Error> void DB::MergeTreeBackgroundExecutor<DB::MergeMutateRuntimeQueue>::routine(DB::TaskRuntimeDataPtr) [Queue = DB::MergeMutateRuntimeQueue]: Code: 107. DB::ErrnoException: Cannot open file /var/lib/clickhouse/data/system/text_log/tmp_merge_202202_1864_3192_14/, errno: 2, strerror: No such file or directory. (FILE_DOESNT_EXIST), Stack trace (when copying this message, always include the lines below):
0. DB::Exception::Exception() @ 0xb26ecfa in /usr/lib/debug/.build-id/01/8c328bd4858d67.debug
1. DB::throwFromErrnoWithPath() @ 0xb2700ea in /usr/lib/debug/.build-id/01/8c328bd4858d67.debug
2. DB::LocalDirectorySyncGuard::LocalDirectorySyncGuard() @ 0x14905531 in /usr/lib/debug/.build-id/01/8c328bd4858d67.debug
3. DB::DiskLocal::getDirectorySyncGuard() const @ 0x148af3e3 in /usr/lib/debug/.build-id/01/8c328bd4858d67.debug
4. DB::MergeTask::ExecuteAndFinalizeHorizontalPart::prepare() @ 0x157bef13 in /usr/lib/debug/.build-id/01/8c328bd4858d67.debug
Note, that IMergeTreeDataPart::renameTo() anyway will have fsync for the
directory.
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>