The problem was that with this patch set renameMergedTemporaryPart() is
called without temporary_directory_lock holded (in MergeTask), since it
is reseted just before calling renameMergedTemporaryPart(), and this can
be seen in logs:
2024.03.29 19:56:42.126919 [ 1341 ] {ea7a3fd2-cf47-4ec7-91a5-51c69fba1b95::-8_0_138_2_2} <Trace> test_btnct5cr.alter_table_0 (ea7a3fd2-cf47-4ec7-91a5-51c69fba1b95) (MergerMutator): Merged 50 parts: [-8_0_0_0_2, -8_138_138_0] -> -8_0_138_2_2
2024.03.29 19:56:42.127034 [ 1341 ] {ea7a3fd2-cf47-4ec7-91a5-51c69fba1b95::-8_0_138_2_2} <Debug> test_btnct5cr.alter_table_0 (ea7a3fd2-cf47-4ec7-91a5-51c69fba1b95): Committing part -8_0_138_2_2 to zookeeper
2024.03.29 19:56:42.128462 [ 884 ] {} <Warning> test_btnct5cr.alter_table_0 (ea7a3fd2-cf47-4ec7-91a5-51c69fba1b95): Removing temporary directory /var/lib/clickhouse/store/ea7/ea7a3fd2-cf47-4ec7-91a5-51c69fba1b95/tmp_merge_-8_0_138_2_2/
2024.03.29 19:56:42.128647 [ 1341 ] {ea7a3fd2-cf47-4ec7-91a5-51c69fba1b95::-8_0_138_2_2} <Debug> test_btnct5cr.alter_table_0 (ea7a3fd2-cf47-4ec7-91a5-51c69fba1b95): Part -8_0_138_2_2 committed to zookeeper
...
2024.03.29 19:56:54.586084 [ 57841 ] {bf240267-0620-4294-afc1-479c58e6be89} <Error> executeQuery: std::exception. Code: 1001, type: std::__1::__fs::filesystem::filesystem_error, e.what() = filesystem error: in file_size: No such file or directory ["/var/lib/clickhouse/store/ea7/ea7a3fd2-cf47-4ec7-91a5-51c69fba1b95/-8_0_138_2_2/data.cmrk3"]
This should fix failures of 00993_system_parts_race_condition_drop_zookeeper in [1].
[1]: https://s3.amazonaws.com/clickhouse-test-reports/61973/f6f826c85dd5b7bb8db16286fd10dcf441a440f7/stateless_tests__coverage__[4_6].html
Though now it looks hackish...
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
Since there is an assertion that does not allows to remove detached
parts during cleanup, which sounds good in general, but breaks this new
code.
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
Under heavy load, or not so heavy but with fsync_part_directory=1,
time that renameTo() holds DataPartsLock will be increased, and this
will affect almost every operation with this table.
On one of production clusters I saw ~60 seconds with
fsync_part_directory=1.
Move the renameTo() out from the critical section.
v2: instead of using DataPartsLock.lock.lock()/unlock() move the renameTo() into MergeTreeData::Transaction::commit()
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>