TODO (suggested by Nikolai)
1. Build the query plan for the current query (inside storage::read) up to WithMergeableState
2. Check that the plan is simple enough: Aggregating - Expression - Filter - ReadFromStorage (or simpler)
3. Check that the filter is the same as the filter in the projection, and that the expression calculates the same aggregation keys as in the projection
4. Return WithMergeableState if the projection applies
Step 3 will be easier to do with ActionsDAG, because it sees all functions and its dependencies are direct (but it is also possible with ExpressionActions)
Also need to figure out how prewhere and row_filter_policies work for
projections.
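
The checks in steps 2 and 3 could look roughly like the sketch below.
It is only an illustration: PlanStep, Projection, isSimpleEnough and
projectionApplies are hypothetical names, not ClickHouse types, and
filters and keys are compared as plain strings instead of through
ActionsDAG. "Or simpler" is assumed to mean that any of the steps above
ReadFromStorage may be missing.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a query plan step (not a ClickHouse type).
struct PlanStep
{
    std::string kind;               // "Aggregating", "Expression", "Filter" or "ReadFromStorage"
    std::string filter;             // normalized filter text, meaningful for Filter steps only
    std::vector<std::string> keys;  // aggregation keys, meaningful for Aggregating steps only
};

// Hypothetical description of what a projection stores.
struct Projection
{
    std::string filter;
    std::vector<std::string> keys;
};

// Step 2: the plan (listed top-down) must follow the chain
// Aggregating - Expression - Filter - ReadFromStorage, where steps may be
// skipped but not reordered or repeated, and it must end in ReadFromStorage.
bool isSimpleEnough(const std::vector<PlanStep> & plan)
{
    static const std::vector<std::string> shape
        = {"Aggregating", "Expression", "Filter", "ReadFromStorage"};

    std::size_t pos = 0;
    for (const auto & step : plan)
    {
        while (pos < shape.size() && shape[pos] != step.kind)
            ++pos;
        if (pos == shape.size())
            return false;
        ++pos;
    }
    return !plan.empty() && plan.back().kind == "ReadFromStorage";
}

// Step 3: the filter and the aggregation keys must match the projection.
bool projectionApplies(const std::vector<PlanStep> & plan, const Projection & projection)
{
    if (!isSimpleEnough(plan))
        return false;

    std::string filter;
    const std::vector<std::string> * keys = nullptr;
    for (const auto & step : plan)
    {
        if (step.kind == "Filter")
            filter = step.filter;
        if (step.kind == "Aggregating")
            keys = &step.keys;
    }
    return keys != nullptr && *keys == projection.keys && filter == projection.filter;
}
```

If projectionApplies returns true, step 4 would return
WithMergeableState; otherwise the query is processed as usual.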
wip
It was initially implemented in #15454, but was reverted in #21948 (due
to higher memory usage).
This implementation differs from the initial one: there is now a
separate attribute to enable preallocation. Before, it was done
automatically, which had problems with duplicates in the source.
Plus, this implementation does not use dynamic_cast; instead it extends
the IDictionarySource interface.
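
A rough sketch of the two design points (opt-in preallocation, and a
virtual method on the source interface instead of dynamic_cast). All
names here (ISourceSketch, rowCountHint, maybePreallocate) are
hypothetical and do not reflect the actual IDictionarySource API; a
hint of 0 is assumed to mean "unknown".

```cpp
#include <cstddef>
#include <unordered_map>

// Hypothetical source interface; the real IDictionarySource is not shown here.
struct ISourceSketch
{
    virtual ~ISourceSketch() = default;

    // Sources that know their size override this; 0 means "unknown".
    // Callers use this method instead of dynamic_cast-ing to a concrete source.
    virtual std::size_t rowCountHint() const { return 0; }
};

struct CountedSource : ISourceSketch
{
    std::size_t rows;

    explicit CountedSource(std::size_t rows_) : rows(rows_) {}
    std::size_t rowCountHint() const override { return rows; }
};

// Reserve space only when preallocation was explicitly enabled, because the
// hint may overestimate the table size if the source contains duplicates.
// Returns the number of rows reserved for (0 if nothing was reserved).
std::size_t maybePreallocate(
    std::unordered_map<int, int> & table, const ISourceSketch & source, bool preallocate)
{
    if (!preallocate)
        return 0;

    const std::size_t rows = source.rowCountHint();
    if (rows > 0)
        table.reserve(rows);
    return rows;
}
```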
Overflow row is used for GROUP BY if all of the following are true:
- WITH TOTALS is requested
- max_rows_to_group_by > 0
- group_by_overflow_mode = any
- totals_mode != after_having_exclusive
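
The conditions above can be written as a single predicate. This is a
sketch only: GroupBySettingsSketch and usesOverflowRow are hypothetical
names, and the enums merely mirror the documented setting values.

```cpp
#include <cstddef>

// Simplified stand-ins that mirror the values of the corresponding settings.
enum class OverflowMode { Throw, Break, Any };
enum class TotalsMode { BeforeHaving, AfterHavingExclusive, AfterHavingInclusive, AfterHavingAuto };

// Hypothetical bundle of the query properties listed above.
struct GroupBySettingsSketch
{
    bool with_totals;                     // WITH TOTALS is requested
    std::size_t max_rows_to_group_by;     // 0 means "no limit"
    OverflowMode group_by_overflow_mode;
    TotalsMode totals_mode;
};

// True only if every condition from the list holds.
bool usesOverflowRow(const GroupBySettingsSketch & s)
{
    return s.with_totals
        && s.max_rows_to_group_by > 0
        && s.group_by_overflow_mode == OverflowMode::Any
        && s.totals_mode != TotalsMode::AfterHavingExclusive;
}
```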
In case of overflow row and external GROUP BY, once the temporary data
is dumped to disk, the without_key data variant is reset to nullptr, so
any subsequent dump to disk causes SIGSEGV.
Fix this by recreating the without_key data variant after dumping to
disk, instead of resetting it to nullptr.
Also add a sanity check (LOGICAL_ERROR) to make the failure more
deterministic should such an error happen again.
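
A minimal model of the fix, assuming a much simpler aggregator than the
real one. AggregateState and SpillingAggregatorSketch are hypothetical,
a vector stands in for the temporary files, and std::logic_error stands
in for LOGICAL_ERROR.

```cpp
#include <memory>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for the without_key data of the overflow row.
struct AggregateState
{
    long sum = 0;
};

struct SpillingAggregatorSketch
{
    std::unique_ptr<AggregateState> without_key = std::make_unique<AggregateState>();
    std::vector<long> dumped;  // stands in for temporary files on disk

    void addToOverflowRow(long value)
    {
        // Sanity check: fail with a clear error instead of dereferencing nullptr.
        if (!without_key)
            throw std::logic_error("without_key is not initialized");
        without_key->sum += value;
    }

    void dumpToDisk()
    {
        if (!without_key)
            throw std::logic_error("without_key is not initialized");
        dumped.push_back(without_key->sum);

        // The fix: recreate the state instead of leaving it as nullptr,
        // so the next addToOverflowRow() or dumpToDisk() call stays valid.
        without_key = std::make_unique<AggregateState>();
    }
};
```

With the state recreated, a second dumpToDisk() call works on a fresh
AggregateState instead of crashing.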
Found with fuzzer [1].
[1]: https://clickhouse-test-reports.s3.yandex.net/23929/e7027e052998540ee660d186727e20f9555b729d/fuzzer_ubsan/report.html#fail1