* Add query data deduplication that excludes duplicated parts in MergeTree family engines.
Query deduplication is based on part UUIDs, which must be enabled first with the merge_tree setting
assign_part_uuids=1.
The allow_experimental_query_deduplication setting enables part deduplication; it defaults to false.
A data part UUID is a mechanism for giving a data part a unique identifier.
Having UUIDs and a deduplication mechanism makes it possible to move parts
between shards while preserving data consistency on the read path:
duplicated UUIDs cause the root executor to retry the query against one of the replicas, explicitly
asking it to exclude the duplicated fingerprints encountered during distributed query execution.
NOTE: this implementation doesn't provide any knobs to lock a part and hence its UUID. Any mutation or merge will
update the part's UUID.
* Add the _part_uuid virtual column, which allows using UUIDs in predicates.
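
The retry-on-duplicate idea described above can be illustrated with a small sketch. This is not the ClickHouse implementation: read_shard, distributed_select and the in-memory shard layout are hypothetical names invented for illustration, and the real code works on query pipelines rather than Python dicts. It only shows a first pass that collects part UUIDs, followed by a retry that asks later shards to exclude UUIDs already seen.

```python
from collections import Counter


def read_shard(parts, excluded=frozenset()):
    """Hypothetical shard read: `parts` maps part UUID -> rows.

    Parts whose UUID is in `excluded` are skipped.
    Returns the set of UUIDs actually read and the rows from them.
    """
    used = {uuid: rows for uuid, rows in parts.items() if uuid not in excluded}
    rows = [row for part_rows in used.values() for row in part_rows]
    return set(used), rows


def distributed_select(shards):
    """Hypothetical root executor: read all shards, retry if part UUIDs repeat."""
    # First pass: read every shard and remember which part UUIDs it used.
    results = [read_shard(parts) for parts in shards]
    counts = Counter(uuid for uuids, _ in results for uuid in uuids)
    duplicates = {uuid for uuid, n in counts.items() if n > 1}

    if duplicates:
        # Retry: the first shard that holds a duplicated part keeps it;
        # every later shard is asked to exclude it.
        seen = set()
        results = []
        for parts in shards:
            uuids, rows = read_shard(parts, excluded=duplicates & seen)
            seen |= uuids
            results.append((uuids, rows))

    return [row for _, rows in results for row in rows]


# Part "b" exists on both shards, e.g. while it is being moved between them.
shards = [
    {"a": [1, 2], "b": [3]},
    {"b": [3], "c": [4]},
]
rows = distributed_select(shards)
```

With the sample data, the row from part "b" is returned once even though two shards hold the part.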
Signed-off-by: Aleksei Semiglazov <asemiglazov@cloudflare.com>
address comments
v2: Increase timeout for 01676_clickhouse_client_autocomplete
https://github.com/ClickHouse/ClickHouse/pull/19584#discussion_r565727175
v3: Disable 01676_clickhouse_client_autocomplete in unbundled build (arcadia)
Autocomplete does not have to work in a fully unbundled build (since it lacks
replxx).
Similar to bd523a0aff
v4: set expect timeout back to 1 and increase total timeout to 20 sec
v4: set expect timeout back to 3 and increase total timeout to 22 (3*X+1) sec
When we remember too many query fragments, just clean the database
and start collecting it anew. Hopefully this should make the fuzzer more
aggressive.
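
A minimal sketch of the reset-on-overflow policy, for illustration only: FragmentPool, max_fragments and remember are hypothetical names, and the actual threshold and data structures used by the fuzzer are not stated in this message.

```python
class FragmentPool:
    """Hypothetical store of remembered query fragments."""

    def __init__(self, max_fragments):
        # Assumed limit; the real threshold is not given in the commit message.
        self.max_fragments = max_fragments
        self.fragments = []

    def remember(self, fragment):
        # When too many fragments are remembered, drop everything
        # and start collecting anew instead of evicting one by one.
        if len(self.fragments) >= self.max_fragments:
            self.fragments.clear()
        self.fragments.append(fragment)


pool = FragmentPool(max_fragments=3)
for fragment in ["a + 1", "b * 2", "c IN (1, 2)", "d = 0"]:
    pool.remember(fragment)
```

After the fourth call the pool has been cleared once and holds only the newest fragment.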