mirror of
https://github.com/ClickHouse/ClickHouse.git
synced 2024-11-26 01:22:04 +00:00
921518db0a
* add the query data deduplication excluding duplicated parts in MergeTree family engines. query deduplication is based on parts' UUID which should be enabled first with merge_tree setting assign_part_uuids=1 allow_experimental_query_deduplication setting is to enable part deduplication, default ot false. data part UUID is a mechanism of giving a data part a unique identifier. Having UUID and deduplication mechanism provides a potential of moving parts between shards preserving data consistency on a read path: duplicated UUIDs will cause root executor to retry query against on of the replica explicitly asking to exclude encountered duplicated fingerprints during a distributed query execution. NOTE: this implementation don't provide any knobs to lock part and hence its UUID. Any mutations/merge will update part's UUID. * add _part_uuid virtual column, allowing to use UUIDs in predicates. Signed-off-by: Aleksei Semiglazov <asemiglazov@cloudflare.com> address comments
25 lines
622 B
XML
25 lines
622 B
XML
<yandex>
|
|
<remote_servers>
|
|
<test_cluster>
|
|
<shard>
|
|
<replica>
|
|
<host>node1</host>
|
|
<port>9000</port>
|
|
</replica>
|
|
</shard>
|
|
<shard>
|
|
<replica>
|
|
<host>node2</host>
|
|
<port>9000</port>
|
|
</replica>
|
|
</shard>
|
|
<shard>
|
|
<replica>
|
|
<host>node3</host>
|
|
<port>9000</port>
|
|
</replica>
|
|
</shard>
|
|
</test_cluster>
|
|
</remote_servers>
|
|
</yandex>
|