* Fix GROUP BY AggregateFunction
finalizeChunk() was unconditionally converting AggregateFunction columns
to the underlying type; however, this should be done only if the
aggregate was applied.
So pass the names of the aggregates as an argument to finalizeChunk().
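A minimal sketch of the idea, not the actual ClickHouse code: Column,
Chunk and finalizeChunkSketch() are hypothetical stand-ins, and
finalization is modeled as clearing a flag. Only columns whose names are
listed as aggregates get finalized, so a GROUP BY key that itself has
the AggregateFunction type is left as is:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a column; not ClickHouse's real IColumn.
struct Column
{
    std::string name;
    bool is_aggregate_state;  // true while the column holds an intermediate state
};

using Chunk = std::vector<Column>;

// Finalize only the columns named in aggregate_names.
// Returns the number of columns that were finalized.
size_t finalizeChunkSketch(Chunk & chunk, const std::unordered_set<std::string> & aggregate_names)
{
    size_t finalized = 0;
    for (auto & column : chunk)
    {
        // A GROUP BY key of type AggregateFunction is not in the list: skip it.
        if (aggregate_names.count(column.name) == 0)
            continue;

        // Sanity check: a listed aggregate must still hold a state.
        assert(column.is_aggregate_state);
        column.is_aggregate_state = false;  // models conversion to the underlying type
        ++finalized;
    }
    return finalized;
}
```

With the columns from the query below, grp_aggreg would keep its state
because it is a key, while groupArrayArray(grp_simple) would be
finalized.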
Fuzzer report [1]:
Logical error: 'Bad cast from type DB::ColumnArray to DB::ColumnAggregateFunction'. Received signal 6 Received signal Aborted (6)
For the following query:
    SELECT
        arraySort(groupArrayArray(grp_simple)),
        grp_aggreg,
        arraySort(groupArrayArray(grp_simple)),
        b,
        arraySort(groupArrayArray(grp_simple)) AS grs
    FROM data_02294
    GROUP BY
        a,
        grp_aggreg,
        b
    SETTINGS optimize_aggregation_in_order = 1
[1]: https://s3.amazonaws.com/clickhouse-test-reports/37050/323ae98202d80fc4b311be1e7308ef2ac39e6063/fuzzer_astfuzzerdebug,actions//fuzzer.log
v2: fix conflicts in src/Interpreters/InterpreterSelectQuery.cpp
v3: Fix header for GROUP BY AggregateFunction WITH TOTALS
v4: Add sanity check into finalizeBlock()
v5: Use typeid_cast<&> to get a more sensible error in case of a bad cast (as suggested by @nickitat)
v6: Fix positions passed to finalizeChunk()
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
* Core/ColumnNumbers.h: remove unused <string>
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
* Optimize finalizeChunk()/finalizeBlock()
v2: s/ByPosition/Mask/ s/by_position/mask/
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
These statistics significantly decrease the performance of
optimize_aggregation_in_order with a prefix key.
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
Suppose you have a table with lots of rows, like:
    create table data_02233 (parent_key Int, child_key Int, value Int) engine=MergeTree() order by parent_key
And you want to do GROUP BY (parent_key, child_key) with optimize_aggregation_in_order:
    select parent_key, child_key, count() from data_02233 group by parent_key, child_key with totals order by parent_key, child_key
Right now this is not possible, because optimize_aggregation_in_order
supports only aggregation without a key, i.e. GROUP BY cannot be done
inside a region of unique parent_key.
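A minimal sketch of what grouping inside each parent_key region could
look like, assuming the rows arrive sorted by parent_key as in the table
above. Row and countInOrderSketch() are hypothetical names for
illustration only, and std::map stands in for the real aggregation state:

```cpp
#include <cstddef>
#include <map>
#include <tuple>
#include <vector>

// Hypothetical row layout mirroring data_02233.
struct Row
{
    int parent_key;
    int child_key;
    int value;
};

// Computes count() per (parent_key, child_key).
// Assumes rows are sorted by parent_key (the prefix of the GROUP BY key).
std::vector<std::tuple<int, int, size_t>> countInOrderSketch(const std::vector<Row> & rows)
{
    std::vector<std::tuple<int, int, size_t>> result;
    size_t begin = 0;
    while (begin < rows.size())
    {
        // Collect one region of equal parent_key and group it by child_key.
        std::map<int, size_t> counts;
        size_t end = begin;
        while (end < rows.size() && rows[end].parent_key == rows[begin].parent_key)
        {
            ++counts[rows[end].child_key];
            ++end;
        }

        // The region is complete, so its groups can be emitted and the state dropped.
        for (const auto & entry : counts)
            result.emplace_back(rows[begin].parent_key, entry.first, entry.second);

        begin = end;
    }
    return result;
}
```

Only the groups of the current parent_key region are held in memory at
any time, which is the point of aggregating in order.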
v2: rebase on top of SortDescriptionWithPositions
v3: disable two-level aggregation
v4: fix merging of aggregates
v5: improve test coverage (add a test with multiple parts, to add merge processor)
v6: add a test for compiled aggregate functions (sum()) explicitly
v7: add missing sortBlock()
v8: remove group_by_description_optimized
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>