ALTER DROP COLUMN of a nested column did not require a mutation before,
so it left the nested column in the part as-is. In case of compact
parts, a subsequent ALTER that requires a mutation will trigger
READ_COLUMN of that nested column (because it exists in the part), but
it will fail because there is no such column in the table anymore.
Here is an example of such a failure on CI - [1].
[1]: https://s3.amazonaws.com/clickhouse-test-reports/35459/52099b23a1cb9a7ff036c5c60aa037c999b333ef/stateless_tests__thread__actions__[1/3].html
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
Before, it did lots of extra work; now it will be significantly more
optimal (thousands of rows -> 1-2 million rows).
v2: s/executeOnBlockSimple/executeOnBlockSmall/
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
These statistics significantly decrease the performance of
optimize_aggregation_in_order with a prefix key.
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
Suppose you have a table with lots of rows, like:
create table data_02233 (parent_key Int, child_key Int, value Int) engine=MergeTree() order by parent_key
And you want to do GROUP BY (parent_key, child_key) with optimize_aggregation_in_order:
select parent_key, child_key, count() from data_02233 group by parent_key, child_key with totals order by parent_key, child_key
Right now it is not possible, because optimize_aggregation_in_order
supports only aggregation without a key, i.e. GROUP BY cannot be done
inside a region with a unique parent_key.
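
To illustrate the idea only (this is not the ClickHouse implementation,
and the function name and sample rows below are hypothetical), here is a
minimal Python sketch: the input is assumed to be sorted by parent_key,
so each run of equal parent_key can be aggregated by child_key
independently and emitted before the next run starts.

```python
from itertools import groupby
from operator import itemgetter


def aggregate_in_order(rows):
    """Count rows per (parent_key, child_key).

    `rows` is an iterable of (parent_key, child_key, value) tuples that is
    assumed to be sorted by parent_key only (the prefix of the GROUP BY key).
    """
    # groupby() splits the sorted input into runs of equal parent_key.
    for parent_key, run in groupby(rows, key=itemgetter(0)):
        # Inside one run the rows are not sorted by child_key, so a small
        # per-run hash table is used for the rest of the key.
        counts = {}
        for _, child_key, _ in run:
            counts[child_key] = counts.get(child_key, 0) + 1
        # The run is complete, so its groups can be emitted right away.
        for child_key in sorted(counts):
            yield parent_key, child_key, counts[child_key]


# Hypothetical sample data, sorted by parent_key but not by child_key.
rows = [(1, 2, 10), (1, 1, 20), (1, 2, 30), (2, 1, 40)]
result = list(aggregate_in_order(rows))
```

Only the hash table for the current parent_key run is kept in memory,
which is the property that makes aggregation in order attractive for the
query above.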
v2: rebase on top of SortDescriptionWithPositions
v3: disable two-level aggregation
v4: fix merging of aggregates
v5: improve test coverage (add a test with multiple parts, to add a merge processor)
v6: add a test for compiled aggregate functions (sum()) explicitly
v7: add missing sortBlock()
v8: remove group_by_description_optimized
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>