Fix 02903_rmt_retriable_merge_exception flakiness for replicated database

With a Replicated database, SYSTEM STOP PULLING REPLICATION LOG for rmt2
must be executed on all replicas; otherwise some replica may merge the
part and all other replicas may fetch it.

Also, since SYSTEM STOP PULLING REPLICATION LOG does not wait for the
current pull to finish, trigger a log pull explicitly to provide at least
some guarantee that replication log pulling has been stopped; otherwise
a race is possible [1].

  [1]: https://s3.amazonaws.com/clickhouse-test-reports/57155/f68717ccd0a07a499911c9b0db7537ae8205e41b/stateless_tests_flaky_check__asan_.html

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
Azat Khuzhin 2023-11-23 17:22:39 +01:00
parent 8aaf9a4cb4
commit 81da52bdf4
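
The ordering described in the commit message can be sketched as a small shell helper. This is only an illustration: the function name `stop_pull_statements` and its two parameters are hypothetical and do not appear in the test; the two SQL statements are the ones the commit adds.

```shell
# Hypothetical helper, not part of the commit: print the two statements in
# the order the commit message describes.
#   $1: cluster name used for ON CLUSTER
#   $2: database that contains rmt2
stop_pull_statements() {
    # First trigger a log pull explicitly for rmt2 ...
    printf '%s\n' "system sync replica rmt2;"
    # ... then stop pulling the replication log on all replicas of the cluster.
    printf '%s\n' "system stop pulling replication log on cluster $1 $2.rmt2;"
}
```

Calling `stop_pull_statements default db1` (both arguments are made-up values) prints the sync statement first and the cluster-wide stop second, which is the order the test relies on.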

@@ -10,7 +10,12 @@ CUR_DIR=$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)
 # (i.e. "No active replica has part X or covering part")
 # does not appear as errors (level=Error), only as info message (level=Information).
-$CLICKHOUSE_CLIENT -nm -q "
+cluster=default
+if [[ $($CLICKHOUSE_CLIENT -q "select count()>0 from system.clusters where cluster = 'test_cluster_database_replicated'") = 1 ]]; then
+    cluster=test_cluster_database_replicated
+fi
+$CLICKHOUSE_CLIENT -nm --distributed_ddl_output_mode=none -q "
 drop table if exists rmt1;
 drop table if exists rmt2;
@@ -21,7 +26,12 @@ $CLICKHOUSE_CLIENT -nm -q "
 insert into rmt1 values (2);
 system sync replica rmt1;
-system stop pulling replication log rmt2;
+-- SYSTEM STOP PULLING REPLICATION LOG does not wait for the current pull,
+-- trigger it explicitly to 'avoid race' (though the proper way would be to wait
+-- for the current pull in StorageReplicatedMergeTree::getActionLock())
+system sync replica rmt2;
+-- NOTE: CLICKHOUSE_DATABASE is required
+system stop pulling replication log on cluster $cluster $CLICKHOUSE_DATABASE.rmt2;
 optimize table rmt1 final settings alter_sync=0, optimize_throw_if_noop=1;
 " || exit 1
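
The cluster selection in the first hunk could also be factored into a function, which makes it possible to try the logic without a running server. The sketch below is an assumption-laden illustration, not code from the commit: the name `pick_cluster` is hypothetical, and the fallback to `clickhouse-client` when `CLICKHOUSE_CLIENT` is unset is added here only so the snippet is self-contained (in the test, the variable is expected to come from the test harness).

```shell
# Hypothetical helper, not part of the commit: choose the cluster name to use
# in ON CLUSTER, following the same check as the patched test.
pick_cluster() {
    # Assumption: fall back to plain clickhouse-client if the harness
    # variable is not set.
    local client=${CLICKHOUSE_CLIENT:-clickhouse-client}
    local has_replicated
    # The query prints 1 when the replicated-database test cluster is configured.
    has_replicated=$($client -q "select count()>0 from system.clusters where cluster = 'test_cluster_database_replicated'" 2>/dev/null)
    if [ "$has_replicated" = 1 ]; then
        echo test_cluster_database_replicated
    else
        # Any other output (0, empty, client failure) keeps the default cluster.
        echo default
    fi
}
```

With this helper, the assignment in the test would reduce to `cluster=$(pick_cluster)`. Treating a failed client call as "not replicated" is a design choice of this sketch; the original inline check behaves the same way when the comparison with `1` fails.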