mirror of
https://github.com/ClickHouse/ClickHouse.git
synced 2024-11-18 13:42:02 +00:00
921518db0a
* Add query data deduplication that excludes duplicated parts in MergeTree family engines. Query deduplication is based on parts' UUIDs, which must be enabled first with the merge_tree setting assign_part_uuids=1. The allow_experimental_query_deduplication setting enables part deduplication and defaults to false. A data part UUID is a mechanism for giving a data part a unique identifier. Having UUIDs and a deduplication mechanism makes it possible to move parts between shards while preserving data consistency on the read path: duplicated UUIDs cause the root executor to retry the query against one of the replicas, explicitly asking it to exclude the duplicated fingerprints encountered during distributed query execution. NOTE: this implementation doesn't provide any knobs to lock a part and hence its UUID. Any mutation or merge will update the part's UUID. * Add the _part_uuid virtual column, allowing UUIDs to be used in predicates. Signed-off-by: Aleksei Semiglazov <asemiglazov@cloudflare.com> address comments |
||
---|---|---|
tests | ||
CMakeLists.txt | ||
Connection.cpp | ||
Connection.h | ||
ConnectionPool.h | ||
ConnectionPoolWithFailover.cpp | ||
ConnectionPoolWithFailover.h | ||
MultiplexedConnections.cpp | ||
MultiplexedConnections.h | ||
TimeoutSetter.cpp | ||
TimeoutSetter.h | ||
ya.make | ||
ya.make.in |
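
The retry-on-duplicate read path described in the commit message above can be illustrated with a small, language-neutral sketch. This is not ClickHouse code: the `Replica` class, the `distributed_select` function, and the in-memory part maps are hypothetical names invented for illustration, and the real implementation may differ in how and when it retries.

```python
from dataclasses import dataclass, field
from uuid import UUID, uuid4


@dataclass
class Replica:
    """Hypothetical stand-in for one shard replica holding data parts."""
    name: str
    parts: dict[UUID, list[int]] = field(default_factory=dict)

    def read(self, exclude: frozenset[UUID] = frozenset()) -> dict[UUID, list[int]]:
        # Return rows grouped by part UUID, skipping parts the caller excluded.
        return {u: rows for u, rows in self.parts.items() if u not in exclude}


def distributed_select(replicas: list[Replica]) -> list[int]:
    """Read every replica; re-read a replica that reports an already-seen part UUID."""
    seen: set[UUID] = set()
    result: list[int] = []
    for replica in replicas:
        response = replica.read()
        duplicates = frozenset(response.keys() & seen)
        if duplicates:
            # Retry against this replica, explicitly excluding the duplicated UUIDs.
            response = replica.read(exclude=duplicates)
        seen.update(response)
        for rows in response.values():
            result.extend(rows)
    return result


# A part that is present on two shards at once, e.g. in the middle of a move.
moved = uuid4()
shard_a = Replica("a", {uuid4(): [1, 2], moved: [3]})
shard_b = Replica("b", {moved: [3], uuid4(): [4]})
rows = distributed_select([shard_a, shard_b])
```

In this sketch the row from the moved part is returned once, because the second read of `shard_b` excludes the UUID already seen on `shard_a`.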