Initial implementation was different and it changed the entire
ReplicatedMergeTreeSink::commitPart() which change history provided by git blame.
Then RetriesControl.retryLoop() was introduced later which significantly reduces
the diff since it's like while() used before.
So, check outing the current version will keep more original history in
git blame, which is useful here
majority_insert_quorum is defined as (number_of_replicas/2)+1. Insert
will be successful only if majority of quorum have applied it. If
insert_quorum and majority_insert_quorum both are specified, max of
both will be used.
It depends on how much parts do you have, but for some workload with
InMemory only parts w/o merges, I got 5% increase.
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
The setting allows a user to provide own deduplication semantic in Replicated*MergeTree
If provided, it's used instead of data digest to generate block ID
So, for example, by providing a unique value for the setting in each INSERT statement,
user can avoid the same inserted data being deduplicated
Inserting data within the same INSERT statement are split into blocks
according to the *insert_block_size* settings
(max_insert_block_size, min_insert_block_size_rows, min_insert_block_size_bytes).
Each block with the same INSERT statement will get an ordinal number.
The ordinal number is added to insert_deduplication_token to get block dedup token
i.e. <token>_0, <token>_1, ... Deduplication is done per block
So, to guarantee deduplication for two same INSERT queries,
dedup token and number of blocks to have to be the same
Issue: #7461