CI found one hung query [1]. The problem was that, because of
allow_experimental_parallel_reading_from_replicas, the Cancel packet was
read by receivePartitionMergeTreeReadTaskResponseAssumeLocked(), so the
executor was not cancelled, while the code in
MergeTreeBaseSelectProcessor was not ready for this:
<details>
{
"is_initial_query": 0,
"elapsed": 1727.714379573,
"is_cancelled": 0,
"read_rows": "196577",
"read_bytes": "1179462",
"written_rows": "0",
"written_bytes": "0",
"query": "SELECT `CounterID`, `EventDate` FROM `test`.`hits` ORDER BY `CounterID` DESC, `EventDate` ASC LIMIT 50",
...
}
In logs:
2021.12.31 12:11:55.384735 [ 101532 ] {7e5f551e-5960-4fda-9447-9bfdae4660c1} <Debug> executeQuery: (from [::ffff:127.0.0.1]:58094, initial_query_id: e2966ca5-e836-44ef-8f8e-d1c1b32a>
2021.12.31 12:11:55.454379 [ 101532 ] {7e5f551e-5960-4fda-9447-9bfdae4660c1} <Trace> ContextAccess (default): Access granted: SELECT(EventDate, CounterID) ON test.hits
2021.12.31 12:11:55.457583 [ 101532 ] {7e5f551e-5960-4fda-9447-9bfdae4660c1} <Trace> datasets.hits_v1: Parallel reading from replicas enabled true
2021.12.31 12:11:55.459739 [ 101532 ] {7e5f551e-5960-4fda-9447-9bfdae4660c1} <Trace> InterpreterSelectQuery: FetchColumns -> WithMergeableStateAfterAggregationAndLimit
2021.12.31 12:11:55.471048 [ 101532 ] {7e5f551e-5960-4fda-9447-9bfdae4660c1} <Debug> datasets.hits_v1 (SelectExecutor): Key condition: unknown
2021.12.31 12:11:55.476514 [ 101532 ] {7e5f551e-5960-4fda-9447-9bfdae4660c1} <Debug> datasets.hits_v1 (SelectExecutor): MinMax index condition: unknown
2021.12.31 12:11:55.488302 [ 101532 ] {7e5f551e-5960-4fda-9447-9bfdae4660c1} <Debug> datasets.hits_v1 (SelectExecutor): Selected 2/2 parts by partition key, 2 parts by primary key, >
2021.12.31 12:11:55.494020 [ 101532 ] {7e5f551e-5960-4fda-9447-9bfdae4660c1} <Debug> MergeTreeReverseSelectProcessor: Reading 1 ranges in reverse order from part 201403_20_20_0, app>
2021.12.31 12:11:55.497644 [ 101532 ] {7e5f551e-5960-4fda-9447-9bfdae4660c1} <Debug> MergeTreeReverseSelectProcessor: Reading 138 ranges in reverse order from part 201403_19_19_2, a>
2021.12.31 12:11:55.536372 [ 170601 ] {7e5f551e-5960-4fda-9447-9bfdae4660c1} <Test> MergeTreeBaseSelectProcessor: Reading from Compact part, average mark size is 83886080
2021.12.31 12:11:55.558783 [ 171701 ] {7e5f551e-5960-4fda-9447-9bfdae4660c1} <Test> MergeTreeBaseSelectProcessor: Reading from Wide part, average mark size is 3069
2021.12.31 12:11:55.563960 [ 171701 ] {7e5f551e-5960-4fda-9447-9bfdae4660c1} <Test> MergeTreeBaseSelectProcessor: Reading from Wide part, average mark size is 3069
2021.12.31 12:11:55.577512 [ 171701 ] {7e5f551e-5960-4fda-9447-9bfdae4660c1} <Test> MergeTreeBaseSelectProcessor: Reading from Wide part, average mark size is 3069
2021.12.31 12:11:55.585660 [ 171701 ] {7e5f551e-5960-4fda-9447-9bfdae4660c1} <Test> MergeTreeBaseSelectProcessor: Reading from Wide part, average mark size is 3069
2021.12.31 12:11:55.613694 [ 170601 ] {7e5f551e-5960-4fda-9447-9bfdae4660c1} <Test> MergeTreeBaseSelectProcessor: Reading from Wide part, average mark size is 3069
2021.12.31 12:11:55.730597 [ 170601 ] {7e5f551e-5960-4fda-9447-9bfdae4660c1} <Test> MergeTreeBaseSelectProcessor: Reading from Wide part, average mark size is 3069
2021.12.31 12:11:55.743554 [ 171701 ] {7e5f551e-5960-4fda-9447-9bfdae4660c1} <Debug> MergingSortedTransform: Merge sorted 3 blocks, 65567 rows in 0.243999671 sec., 268717.5754429603>
2021.12.31 12:11:55.744196 [ 170601 ] {7e5f551e-5960-4fda-9447-9bfdae4660c1} <Test> MergeTreeBaseSelectProcessor: Reading from Wide part, average mark size is 3069
2021.12.31 12:11:55.890923 [ 170601 ] {7e5f551e-5960-4fda-9447-9bfdae4660c1} <Test> MergeTreeBaseSelectProcessor: Reading from Wide part, average mark size is 3069
2021.12.31 12:11:55.891222 [ 170601 ] {7e5f551e-5960-4fda-9447-9bfdae4660c1} <Test> MergeTreeBaseSelectProcessor: Reading from Wide part, average mark size is 3069
</details>
[1]: https://s3.amazonaws.com/clickhouse-test-reports/33341/0685fd99855bacd0bce02507c00a3bd7709eea61/stress_test__address__actions_.html
The setting allows a user to provide their own deduplication semantics in Replicated*MergeTree.
If provided, it is used instead of the data digest to generate the block ID.
So, for example, by providing a unique value for the setting in each INSERT statement,
the user can avoid the same inserted data being deduplicated.
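For illustration, a minimal sketch, assuming an existing Replicated*MergeTree
table t with a single numeric column (the table name and token values here are
hypothetical):

```sql
-- Same data, different tokens: the second INSERT is NOT deduplicated.
INSERT INTO t SETTINGS insert_deduplication_token = 'token-1' VALUES (1);
INSERT INTO t SETTINGS insert_deduplication_token = 'token-2' VALUES (1);

-- Different data, same token: this INSERT IS deduplicated (dropped),
-- since the block ID is derived from the token, not from the data digest.
INSERT INTO t SETTINGS insert_deduplication_token = 'token-1' VALUES (2);
```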
Data inserted within the same INSERT statement is split into blocks
according to the *insert_block_size* settings
(max_insert_block_size, min_insert_block_size_rows, min_insert_block_size_bytes).
Each block within the same INSERT statement gets an ordinal number.
The ordinal number is appended to insert_deduplication_token to get the per-block
dedup token, i.e. <token>_0, <token>_1, ... Deduplication is done per block,
so, to guarantee deduplication for two identical INSERT queries,
both the dedup token and the number of blocks have to be the same.
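A sketch of how a single INSERT can produce several block dedup tokens, using the
same hypothetical table t. The block-size values are arbitrary and chosen only so
that the statement is split into more than one block; the exact split depends on
how the server forms blocks, so this is only illustrative:

```sql
-- With small block-size settings the 4 rows are split into 2 blocks,
-- which get the dedup tokens 'token-3_0' and 'token-3_1'.
INSERT INTO t
SETTINGS insert_deduplication_token = 'token-3',
         max_insert_block_size = 2,
         min_insert_block_size_rows = 2,
         min_insert_block_size_bytes = 0
VALUES (1), (2), (3), (4);
```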
Issue: #7461
It is possible to get an empty part among the future parts during merging;
such a part can be created when there is information about it in ZooKeeper
but no replica has it
(StorageReplicatedMergeTree::createEmptyPartInsteadOfLost()).
This may incorrectly initialize the min value, which will only be updated
after one more merge (since during that merge there will be no more
information about the empty part) or OPTIMIZE.