ClickHouse/tests/queries/0_stateless/02124_insert_deduplication_token_replica.reference
Igor Nikonov 100ee92c64 insert_deduplication_token setting for INSERT statement
The setting lets a user provide their own deduplication semantics in Replicated*MergeTree.
If provided, it is used instead of the data digest to generate the block ID.
So, for example, by providing a unique value for the setting in each INSERT statement,
a user can prevent identical inserted data from being deduplicated.

Data inserted within a single INSERT statement is split into blocks
according to the *insert_block_size* settings
(max_insert_block_size, min_insert_block_size_rows, min_insert_block_size_bytes).
Each block within the same INSERT statement gets an ordinal number.
The ordinal number is appended to insert_deduplication_token to form the block dedup token,
i.e. <token>_0, <token>_1, ... Deduplication is done per block.
So, to guarantee deduplication of two identical INSERT queries,
the dedup token and the number of blocks have to be the same.

Issue: #7461
2021-12-19 13:15:45 +00:00
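
The per-block rule described in the commit message can be illustrated with a small toy model. This is only a sketch, not ClickHouse's implementation: the names `split_blocks`, `block_ids`, and `ToyTable`, the fixed `block_size`, the SHA-256 digest, and the token value `dedup1` are all hypothetical, and treating an empty token as "not provided" is an assumption. It replays the replica 1 scenario from the reference output below.

```python
import hashlib


def split_blocks(rows, block_size):
    """Split rows into consecutive blocks of at most block_size rows."""
    return [rows[i:i + block_size] for i in range(0, len(rows), block_size)]


def block_ids(rows, block_size, token=""):
    """Return one deduplication ID per block (toy model).

    With a token, the ID is "<token>_<ordinal>"; without one, it is a
    digest of the block's data. SHA-256 is a stand-in chosen for the sketch.
    """
    ids = []
    for ordinal, block in enumerate(split_blocks(rows, block_size)):
        if token:
            ids.append(f"{token}_{ordinal}")
        else:
            ids.append(hashlib.sha256(repr(block).encode()).hexdigest())
    return ids


class ToyTable:
    """Keeps rows and the set of block IDs already seen."""

    def __init__(self, block_size=2):
        self.block_size = block_size
        self.seen_ids = set()
        self.rows = []

    def insert(self, rows, token=""):
        blocks = split_blocks(rows, self.block_size)
        ids = block_ids(rows, self.block_size, token)
        for block_id, block in zip(ids, blocks):
            if block_id in self.seen_ids:
                continue  # block is deduplicated
            self.seen_ids.add(block_id)
            self.rows.extend(block)


table = ToyTable()
table.insert([(1, 1001)])                  # inserted
table.insert([(1, 1001)])                  # deduplicated by data digest
table.insert([(1, 1001)], token="dedup1")  # inserted: new block ID dedup1_0
table.insert([(1, 1001)], token="dedup1")  # deduplicated by the token
table.insert([(2, 1002)])                  # token reset, new data: inserted
print(table.rows)
```

In this model, two inserts are deduplicated against each other only when every block gets the same ID, which is why both the token and the way the data is split into blocks have to match.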

create replica 1 and check deduplication
two inserts with exact data, one inserted, one deduplicated by data digest
1 1001
two inserts with the same dedup token, one inserted, one deduplicated by the token
1 1001
1 1001
reset deduplication token and insert new row
1 1001
1 1001
2 1002
create replica 2 and check deduplication
inserted value deduplicated by data digest, the same result as before
1 1001
1 1001
2 1002
inserted value deduplicated by dedup token, the same result as before
1 1001
1 1001
2 1002
new record inserted by providing new deduplication token
1 1001
1 1001
2 1002
2 1002