TODO (suggested by Nikolai)
1. Build the query plan for the current query (inside storage::read) up to WithMergeableState
2. Check that the plan is simple enough: Aggregating - Expression - Filter - ReadFromStorage (or simpler)
3. Check that the filter is the same as the filter in the projection, and that the expression calculates the same aggregation keys as in the projection
4. Return WithMergeableState if the projection applies
Step 3 will be easier to do with ActionsDAG, because it sees all functions and the dependencies are direct (but it is also possible with ExpressionActions)
We also need to figure out how prewhere and row_filter_policies work
for projections.
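
The checks in steps 2-4 can be sketched roughly as below. This is an illustrative Python model only, not ClickHouse code: the plan is reduced to a list of step names, the filter and keys are compared as already-normalized values (the real comparison would go through ActionsDAG), and the names ProjectionDesc, is_simple_plan and stage_for_projection are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

# Hypothetical, simplified model: a plan is a list of step names, top to bottom.
ALLOWED_CHAIN = ("Aggregating", "Expression", "Filter", "ReadFromStorage")


@dataclass(frozen=True)
class ProjectionDesc:
    # Assumption: filter and keys are already normalized so equality is meaningful.
    filter: Optional[str]
    keys: frozenset


def is_simple_plan(steps: Sequence[str]) -> bool:
    """Step 2: the plan is ALLOWED_CHAIN or an ordered subset of it ("or simpler")."""
    if not steps or steps[-1] != "ReadFromStorage":
        return False
    remaining = iter(ALLOWED_CHAIN)
    # `step in remaining` advances the iterator, so order is enforced
    # and a repeated step is rejected.
    return all(step in remaining for step in steps)


def stage_for_projection(steps, query_filter, query_keys, projection):
    """Steps 3-4: return "WithMergeableState" if the projection applies, else None."""
    same_filter = query_filter == projection.filter
    same_keys = frozenset(query_keys) == projection.keys
    if is_simple_plan(steps) and same_filter and same_keys:
        return "WithMergeableState"
    return None


projection = ProjectionDesc(filter="x > 0", keys=frozenset({"a", "b"}))
plan = ["Aggregating", "Expression", "Filter", "ReadFromStorage"]
stage = stage_for_projection(plan, "x > 0", ["b", "a"], projection)
```

Comparing the keys as a set is a simplification made for this sketch; whether key order matters for a real projection is not settled here.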
wip
* master: (759 commits)
Suppress UBSan report in Decimal comparison
Suppress UBSan report in Decimal comparison
Fix UBSan report in arrayDifference
Update README.md
Non significant change in AggregationCommon
Print stack trace on SIGTRAP
Fix dependent test
Fix tests for better parallel run
Add test for already working code
Revert "Fix access control manager destruction order"
Update index.md
Update index.md
Update index.md
Bit more complicated example for isIPv4String - ru
Bit more complicated example for isIPv4String
cleanup
Replace database with ordinary
Added comments
Split tests to make them stable
Fixes
...
# Conflicts:
# src/Storages/MergeTree/MergeTreeRangeReader.cpp
* add query data deduplication that excludes duplicated parts in MergeTree family engines.
query deduplication is based on part UUIDs, which must be enabled first with the merge_tree setting
assign_part_uuids=1
the allow_experimental_query_deduplication setting enables part deduplication; it defaults to false.
a data part UUID is a mechanism for giving a data part a unique identifier.
Having UUIDs and a deduplication mechanism makes it possible to move parts
between shards while preserving data consistency on the read path:
duplicated UUIDs cause the root executor to retry the query against one of the replicas, explicitly
asking it to exclude the duplicated fingerprints encountered during distributed query execution.
NOTE: this implementation doesn't provide any knobs to lock a part and hence its UUID. Any mutation or merge will
update the part's UUID.
* add _part_uuid virtual column, allowing to use UUIDs in predicates.
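
The retry-on-duplicate idea described above can be illustrated with the following minimal Python sketch. It is a toy model, not the actual implementation: replicas are plain dicts mapping part UUID to rows, and the helper names read_replica and distributed_read are hypothetical.

```python
def read_replica(parts, excluded=frozenset()):
    """Return the parts a replica would read, skipping the excluded UUIDs."""
    return {uuid: rows for uuid, rows in parts.items() if uuid not in excluded}


def distributed_read(replicas):
    """Toy root executor: read every replica, then retry those that reported
    a part UUID already seen elsewhere, asking them to exclude it."""
    results = {name: read_replica(parts) for name, parts in replicas.items()}

    owner = {}        # part UUID -> first replica that returned it
    to_exclude = {}   # replica -> UUIDs it must skip on retry
    for name, parts in results.items():
        for uuid in parts:
            if uuid in owner:
                to_exclude.setdefault(name, set()).add(uuid)
            else:
                owner[uuid] = name

    # Retry only the replicas that produced duplicates.
    for name, excluded in to_exclude.items():
        results[name] = read_replica(replicas[name], frozenset(excluded))

    return [row for parts in results.values() for rows in parts.values() for row in rows]


# Part "b" is visible on both shards, e.g. in the middle of being moved.
replicas = {
    "shard1": {"a": [1, 2], "b": [3]},
    "shard2": {"b": [3], "c": [4]},
}
rows = sorted(distributed_read(replicas))
```

In this model the first replica to report a UUID keeps it and later ones are retried; which replica is retried in practice is an assumption of the sketch.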
Signed-off-by: Aleksei Semiglazov <asemiglazov@cloudflare.com>
address comments