Commit Graph

532 Commits

Author SHA1 Message Date
Azat Khuzhin
dea1706d4e
Fix GROUP BY AggregateFunction (#37093)
* Fix GROUP BY AggregateFunction

finalizeChunk() was unconditionally converting AggregateFunction to the
underlying type, however this should be done only if the aggregate was
applied.

So pass names of aggregates as an argument to the finalizeChunk()

Fuzzer report [1]:

    Logical error: 'Bad cast from type DB::ColumnArray to DB::ColumnAggregateFunction'. Received signal 6 Received signal Aborted (6)

For the following query:

    SELECT
        arraySort(groupArrayArray(grp_simple)),
        grp_aggreg,
        arraySort(groupArrayArray(grp_simple)),
        b,
        arraySort(groupArrayArray(grp_simple)) AS grs
    FROM data_02294
    GROUP BY
        a,
        grp_aggreg,
        b
    SETTINGS optimize_aggregation_in_order = 1

  [1]: https://s3.amazonaws.com/clickhouse-test-reports/37050/323ae98202d80fc4b311be1e7308ef2ac39e6063/fuzzer_astfuzzerdebug,actions//fuzzer.log

v2: fix conflicts in src/Interpreters/InterpreterSelectQuery.cpp
v3: Fix header for GROUP BY AggregateFunction WITH TOTALS
v4: Add sanity check into finalizeBlock()
v5: Use typeid_cast<&> to get more sensible error in case of bad cast (as suggested by @nickitat)
v6: Fix positions passed to finalizeChunk()
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>

* Core/ColumnNumbers.h: remove unused <string>

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>

* Optimize finalizeChunk()/finalizeBlock()

v2: s/ByPosition/Mask/ s/by_position/mask/
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-05-18 23:37:43 +02:00
Robert Schulze
d66dcdad79
Fix new occurrences of new clang-tidy warnings 2022-05-16 11:31:36 +02:00
Maksim Kita
437d70d4da Fixed tests 2022-05-11 21:59:51 +02:00
Maksim Kita
75555c436b Fix usage of min_count_to_compile_sort_description setting 2022-05-11 21:59:51 +02:00
Maksim Kita
ea8ce3140a Fixed tests 2022-05-11 21:59:51 +02:00
Maksim Kita
4e7d10297b Fixed style 2022-05-11 21:59:51 +02:00
Maksim Kita
cbfb773b50 Fixed tests 2022-05-11 21:59:51 +02:00
Maksim Kita
8ceb63ee6c Added JIT compilation of SortDescription 2022-05-11 21:59:51 +02:00
Nikolai Kochetov
2d99f0ce13 Simplify code a little bit. 2022-05-11 12:16:15 +00:00
Nikolai Kochetov
4b8a2e2d80 Fix fuzzed queries. 2022-05-11 10:22:34 +00:00
Nikolai Kochetov
b6075031d8 Delete GroupingSetsTransform. 2022-05-10 17:54:36 +00:00
Nikolai Kochetov
f7dbd48ee5 Simplify code a little bit. 2022-05-10 16:12:03 +00:00
Nikolai Kochetov
a02e1d2f4a Simplify code a little bit. 2022-05-10 16:00:00 +00:00
mergify[bot]
55a6d22ad3
Merge branch 'master' into grouping-sets-fix 2022-05-09 14:02:10 +00:00
Alexey Milovidov
6216c1827f
Merge pull request #37020 from ucasfl/remove-code
remove useless code
2022-05-09 00:00:07 +03:00
fenglv
2cd0f2aaed remove useless code 2022-05-08 16:50:13 +00:00
Vladimir C
bd5fab97d9
Merge pull request #36415 from bigo-sg/concurrent_join 2022-05-06 17:11:10 +02:00
Dmitry Novik
9a251e0028 Cleanup code 2022-05-05 18:13:00 +00:00
Dmitry Novik
4cc26aa38b Merge remote-tracking branch 'origin/master' into grouping-sets-fix
And fix execution of the query with only one grouping set
2022-05-05 17:14:52 +00:00
Dmitry Novik
161f52292b Support distributed queries 2022-05-05 13:56:16 +00:00
Dmitry Novik
9be17ef50c
Merge pull request #35111 from azat/optimize_aggregation_in_order-prefix
Implement partial GROUP BY key for optimize_aggregation_in_order
2022-05-02 17:49:48 +02:00
Azat Khuzhin
190ce217bb Disable GROUP BY statistics for optimize_aggregation_in_order
This statistics significantly decrease performance of
optimize_aggregation_in_order with a prefix key.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-04-29 06:58:27 +03:00
Azat Khuzhin
3931dbd848 Implement partial GROUP BY key for optimize_aggregation_in_order
Suppose you have a table with lots of rows, like:

    create table data_02233 (parent_key Int, child_key Int, value Int) engine=MergeTree() order by parent_key

And you want to do GROUP BY (parent_key, child_key) with optimize_aggregation_in_order:

    select parent_key, child_key, count() from data_02233 group by parent_key, child_key with totals order by parent_key, child_key

Right now, it is not possible, because optimize_aggregation_in_order
supports only w/o key aggregation, i.e. GROUP BY cannot be done inside
unique parent_key region.

v2: rebase on top SortDescriptionWithPositions
v3: disable two-level aggregation
v4: fix merging of aggregates
v5: improve tests coverage (add a test with multiple parts, to add merge processor)
v6: add a test for compiled aggregate functions (sum()) explicitly
v7: add missing sortBlock()
v8: remove group_by_description_optimized
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-04-29 06:58:07 +03:00
Amos Bird
4a5e4274f0
base should not depend on Common 2022-04-29 10:26:35 +08:00
fenglv
1b84d59047 fix typo
modify comment
2022-04-27 12:24:49 +00:00
lgbo-ustc
5738871a8b update QueryPipelineBuilder::joinPipelines 2022-04-27 10:24:19 +08:00
lgbo-ustc
520b05b9f1 update test case tests/queries/0_stateless/02236_explain_pipeline_join.sql 2022-04-27 10:08:22 +08:00
lgbo-ustc
6cb7b7888f update test case 02236_explain_pipeline_join 2022-04-26 19:07:07 +08:00
lgbo-ustc
0b0fa8453b fixed bug: resize on left pipeline cause the order by result wrong 2022-04-26 18:06:16 +08:00
lgbo-ustc
74ccc233d2 Merge remote-tracking branch 'ck/master' into concurrent_join 2022-04-26 09:21:02 +08:00
Nikita Taranov
5dc9478bac fix SortingStep::updateOutputStream() 2022-04-25 17:29:14 +00:00
lgbo-ustc
981d560553 Merge remote-tracking branch 'ck/master' into concurrent_join 2022-04-25 13:00:04 +08:00
Amos Bird
a25bb50096
Refactor many exception messages
1. Always use fmt variant
2. Remove redundant period at the end of message
3. Remove useless parenthesis
2022-04-24 19:44:00 +08:00
Nikita Mikhaylov
224f4dc620
Made parallel_reading_from_replicas work with localhost replica (#36281) 2022-04-22 15:52:38 +02:00
Dmitry Novik
94a381e522 Add test for parallel GROUPING SETS processing 2022-04-21 17:17:55 +00:00
Dmitry Novik
b5e2b38529 Optimize pipeline in case of single input 2022-04-21 14:11:58 +00:00
Dmitry Novik
9cefb62341 Cleanup 2022-04-21 10:43:11 +00:00
Dmitry Novik
61deae7105
Merge branch 'master' into grouping-sets-fix 2022-04-21 03:34:42 +02:00
Dmitry Novik
6e73cd6929 Implement parallel grouping sets processing 2022-04-21 01:18:40 +00:00
lgbo-ustc
3d7338581b Improve join
now adding joined blocks from right table can be run parallelly, speedup the join process
2022-04-19 16:07:30 +08:00
Robert Schulze
118e94523c
Activate clang-tidy warning "readability-container-contains"
This check suggests replacing <Container>.count() by
<Container>.contains() which is more speaking and in case of
multimaps/multisets also faster.
2022-04-18 23:53:11 +02:00
Dmitry Novik
a16710c750 Merge remote-tracking branch 'origin/master' into grouping-sets-fix 2022-04-14 17:29:51 +00:00
Nikolai Kochetov
362fcfd2b8
Merge pull request #36075 from ClickHouse/fix-limit-push-down-over-window
Disable LIMIT push down through WINDOW functions.
2022-04-13 11:57:37 +02:00
Nikolai Kochetov
2deec53162 Disable LIMIT push down through WINDOW functions. 2022-04-08 13:39:54 +00:00
Yakov Olkhovskiy
90c4cd3de7
Merge branch 'master' into interpolate-feature 2022-04-05 14:39:07 -04:00
Nickita Taranov
0f94a58f3a use getName() 2022-04-04 14:59:38 +02:00
Nickita Taranov
440e57769a more fizes 2022-04-04 14:33:58 +02:00
Nickita Taranov
ce40d84eef more fixes 2022-04-04 14:33:58 +02:00
Nickita Taranov
a39427f00b clean up 2022-04-04 14:33:57 +02:00
Nickita Taranov
eedcd61479 fix 2022-04-04 14:33:57 +02:00