Igor Nikonov
965f96bd84
DISTINCT in order: perf improvement
...
+ reduce allocations in DistinctSortedChunkTransform
+ use it for final distinct as well
2022-07-20 20:44:47 +00:00
Nikolai Kochetov
91043351aa
Fixing build.
2022-07-20 20:30:16 +00:00
Nikolai Kochetov
4e8cd70b1d
Merge branch 'master' into use-dag-in-key-condition
2022-07-20 17:38:33 +02:00
Nikolai Kochetov
f570cde815
Fixing build.
2022-07-19 20:19:57 +00:00
Igor Nikonov
1fe83cc8d8
optimize_sorting_for_input_stream setting and perf tests
2022-07-19 16:58:15 +00:00
Nikolai Kochetov
eaeb30a71a
Merge branch 'master' into use-dag-in-key-condition
2022-07-19 18:39:52 +02:00
Dmitry Novik
50989bdb68
Merge branch 'master' into group-by-use-nulls
2022-07-19 14:58:01 +02:00
Alexander Gololobov
9de72d995a
POC lightweight delete using __row_exists virtual column and prewhere-like filtering
2022-07-18 20:06:42 +02:00
Igor Nikonov
e607d103cc
Just to trigger CI
2022-07-15 23:16:02 +00:00
Igor Nikonov
6f224b026a
Perf test. Code polishing
2022-07-15 21:54:57 +00:00
Igor Nikonov
d9b312f955
Self-review + revert test
2022-07-15 17:00:25 +00:00
Igor Nikonov
a3a1ccc520
Fix: SortMode::Chunk
2022-07-15 10:26:13 +00:00
jianmei zhang
9d27af7ee2
For mutations of some columns, skip applying the deleted mask when reading some columns. Also add a unit test case
2022-07-15 12:32:41 +08:00
Igor Nikonov
8170f4e33a
Merge branch 'master' into skipping_sorting_step
2022-07-14 23:05:45 +02:00
Igor Nikonov
1efdb4e3e5
Disable finish sort with sorted chunks
2022-07-14 21:02:44 +00:00
Igor Nikonov
7cd12393c2
If sorting type is specified then use it. Otherwise rely on sort description
2022-07-14 16:26:25 +00:00
Igor Nikonov
1d49adad20
Introduce Auto mode for sorting step (replace others for now)
2022-07-14 13:29:39 +00:00
Igor Nikonov
b73aca2a3b
Merge branch 'master' into skipping_sorting_step
2022-07-13 19:06:41 +02:00
Igor Nikonov
159c9428bd
clean up
2022-07-13 17:05:54 +00:00
Igor Nikonov
f0d547993a
Fix: 01655_plan_optimizations_optimize_read_in_window_order_long
...
Basically, I brought the code back. Both plans have Finish sorting; need to check
the sorting prefix
2022-07-13 16:51:51 +00:00
vdimir
4124dc9ac4
Rewrite tryPushDownFilter for join with lambda
2022-07-13 12:06:29 +00:00
vdimir
549a85fee9
Throw logical error on child idx mismatch in tryAddNewFilterStep
2022-07-13 11:53:46 +00:00
vdimir
bddf6c1b32
Pushdown filter to the right side of sorting join
2022-07-13 11:36:25 +00:00
Igor Nikonov
1d6f699a12
Use sort mode Port for reading in order
2022-07-12 21:56:00 +00:00
Amos Bird
982e1a73d3
Better
2022-07-12 22:21:46 +08:00
Amos Bird
d3709c6c26
Avoid redundant join block transformation.
2022-07-12 22:20:10 +08:00
Amos Bird
b9d9ca5194
style fix
2022-07-12 22:20:08 +08:00
Dmitry Novik
aabf5123d6
Fixup
2022-07-12 13:46:06 +00:00
Igor Nikonov
2c8d9080bd
Fix: consider collation in column sort description comparison
2022-07-12 13:14:10 +00:00
Dmitry Novik
cfca3db884
Fix crash with totals
2022-07-12 12:15:43 +00:00
Igor Nikonov
ea5e7793b2
Fix: self-review comments
2022-07-11 21:26:39 +00:00
Igor Nikonov
e0776b1c82
Fix: test for optimize read in window order
...
+ code polishing
2022-07-11 20:59:38 +00:00
Igor Nikonov
0ca8166ab2
Fix: forgot to return sorting type in constructors
2022-07-11 20:59:38 +00:00
Igor Nikonov
47bed7e318
Try to choose sorting transform based on sort description with fallback
2022-07-11 20:59:38 +00:00
Igor Nikonov
2a7e3bd741
Fix + SortMode::None as default value
2022-07-11 20:59:38 +00:00
Igor Nikonov
16d2319a8d
SortingStep: type of sorting is deduced from the input stream sorting description during transformation
...
+ perf test
2022-07-11 20:59:38 +00:00
Igor Nikonov
7d4d92bd61
In case full sort was the wrong choice during plan interpretation
2022-07-11 20:59:38 +00:00
Igor Nikonov
67ce421e38
Skip sorting step if input stream is globally sorted
2022-07-11 20:59:38 +00:00
Dmitry Novik
d1df66687b
Merge branch 'master' into group-by-use-nulls
2022-07-07 20:54:38 +02:00
Dmitry Novik
1587385f7a
Cleanup code
2022-07-07 18:53:20 +00:00
vdimir
7c586a9e7c
Minor updates for full sorting merge join
2022-07-06 14:28:05 +00:00
vdimir
1b429fc1af
wip: any left/right sorting join
2022-07-06 14:23:46 +00:00
vdimir
8dce97123c
wip: any inner full sorting join
2022-07-06 14:23:46 +00:00
vdimir
4a16195964
Calculate output header for full sorting merge join
2022-07-06 14:23:45 +00:00
vdimir
fa8eb35599
Pipeline for full sorting merge join
2022-07-06 14:23:44 +00:00
Maksim Kita
b94489d52c
Merge pull request #38859 from kitaisreal/merge-tree-merge-disable-batch-optimization
...
MergeTree merge disable batch optimization
2022-07-06 15:59:40 +02:00
Maksim Kita
bdc21737d5
MergeTree merge disable batch optimization
2022-07-05 16:15:00 +02:00
Igor Nikonov
9ef8ff5a31
Addressing review comments
2022-07-01 22:50:00 +00:00
Anton Popov
ef87e1207c
better support of read_in_order in case of fixed prefix of sorting key
2022-07-01 16:45:01 +00:00
Dmitry Novik
81dd90893e
Merge remote-tracking branch 'origin/master' into group-by-use-nulls
2022-07-01 16:24:05 +00:00
Dmitry Novik
33f601ec0a
Commit support use_nulls for GS
2022-06-30 15:14:26 +00:00
Igor Nikonov
488ee75fc4
+ use DistinctSorted for final distinct step
...
+ fix performance tests
2022-06-30 13:03:39 +00:00
Dmitry Novik
98e9bc84d5
Refactor ROLLUP and CUBE
2022-06-30 10:13:58 +00:00
Igor Nikonov
d435532c68
Adapt range search algorithm to high cardinality case
...
+ range search is done in steps of some number of rows, controlled by the new
setting `distinct_in_order_range_search_step`. By default 0, i.e. the
whole chunk
+ before starting the binary search, linear probing is done on each step (32
rows currently)
2022-06-29 23:30:35 +00:00
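The range search described in the commit above can be illustrated with a minimal Python sketch. It is not the ClickHouse implementation: the function names `find_range_end` and `distinct_sorted` are hypothetical, and the column is assumed to be a plain ascending list. Only the step setting (0 meaning the whole chunk) and the 32-row linear probe come from the commit message.

```python
from bisect import bisect_right

LINEAR_PROBE_ROWS = 32  # "32 rows currently" per the commit message


def find_range_end(keys, begin, end, step=0):
    """Return the index one past the last row equal to keys[begin] in [begin, end).

    keys must be sorted ascending and begin < end. step <= 0 means the
    whole chunk is searched in a single step (hypothetical helper).
    """
    value = keys[begin]
    if step <= 0:
        step = end - begin
    lo = begin + 1
    while lo < end:
        hi = min(lo + step, end)
        # Linear probing first: cheap when ranges are short (high cardinality).
        probe_hi = min(lo + LINEAR_PROBE_ROWS, hi)
        for i in range(lo, probe_hi):
            if keys[i] != value:
                return i
        # Binary search over the remainder of this step.
        pos = bisect_right(keys, value, probe_hi, hi)
        if pos < hi:
            return pos
        lo = hi
    return end


def distinct_sorted(keys, step=0):
    """Collect distinct values of a sorted column by jumping from range to range."""
    result = []
    i = 0
    while i < len(keys):
        result.append(keys[i])
        i = find_range_end(keys, i, len(keys), step)
    return result
```

For a sorted column such as `[1] * 100 + [2] * 3 + [3] * 50`, `distinct_sorted` visits three ranges regardless of the step value. A smaller step bounds how far each binary search looks ahead, which is the trade-off the setting exposes.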
mergify[bot]
36139eacd7
Merge branch 'master' into dictinct_in_order_optimization
2022-06-29 13:37:16 +00:00
Nikita Taranov
f5d26572df
Quick fix for aggregation pipeline ( #38295 )
2022-06-29 01:16:30 +02:00
Igor Nikonov
4a00e33e6b
Fixes for some review comments
2022-06-28 21:42:46 +00:00
mergify[bot]
a9c1b68034
Merge branch 'master' into dictinct_in_order_optimization
2022-06-27 20:16:00 +00:00
Dmitry Novik
1d15d72211
Support NULLs in ROLLUP
2022-06-27 18:42:26 +00:00
Nikita Taranov
2487ba7f00
Move updateInputStream to ITransformingStep (#37393)
2022-06-27 13:16:52 +02:00
Igor Nikonov
68927dd60c
Adapt distinct for sorted chunks to handle sorted stream, so we can use
...
it for final distinct as well
2022-06-26 14:52:36 +00:00
Igor Nikonov
04ce070da0
Remove unnecessary include
2022-06-24 23:11:52 +00:00
Igor Nikonov
d5c6f5c18f
Fixes
...
+ flaky test with explain pipeline
+ consider sort direction from read order info in sort description
(ReadFromMergeTree step)
2022-06-24 22:49:27 +00:00
mergify[bot]
b5d3fd50d2
Merge branch 'master' into dictinct_in_order_optimization
2022-06-23 09:48:38 +00:00
Igor Nikonov
944c247345
DISTINCT in order optimization
...
+ try use the optimization for final distinct in case of sorted stream
(sorting inside and among chunks)
+ sorting description contains only columns from sorting key which are in
header as well
2022-06-23 09:47:22 +00:00
Nikita Taranov
41ba0118b5
Bring back #36396 ( #38110 )
...
* Revert "Revert "More parallel execution for queries with `FINAL` (#36396 )""
This reverts commit 5bfb15262c.
* fix tests
* fix review suggestions
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
2022-06-22 15:05:07 +02:00
mergify[bot]
f45b4f56d8
Merge branch 'master' into dictinct_in_order_optimization
2022-06-21 21:25:37 +00:00
Igor Nikonov
b0a98bd875
DISTINCT in order optimization
...
+ use SortDescription from input data stream in DistinctStep to decide if the optimization is applicable
2022-06-21 21:23:49 +00:00
Nikolai Kochetov
b8d27aa8dd
Merge pull request #37469 from azat/projections-optimize_aggregation_in_order
...
Implement in order aggregation (optimize_aggregation_in_order) for projections for tables with fully materialized projections
2022-06-21 12:17:35 +02:00
Igor Nikonov
6ac68e8303
DISTINCT in order optimization
...
+ optimization for DISTINCT containing primary key columns
2022-06-20 10:06:15 +00:00
Vladimir Chebotarev
aef6fe6008
Rebase fix.
2022-06-20 05:15:08 +03:00
Vladimir Chebotarev
92a553fb77
Build fix.
2022-06-20 05:15:08 +03:00
Vladimir Chebotarev
6a363b7429
Build fix.
2022-06-20 05:15:08 +03:00
Vladimir Chebotarev
d41c97ea1d
Review fixes.
2022-06-20 05:15:08 +03:00
Vladimir Chebotarev
4f38e01343
Unused code.
2022-06-20 05:15:08 +03:00
Vladimir Chebotarev
cc45f15eae
Build fix.
2022-06-20 05:15:08 +03:00
Vladimir Chebotarev
3c2a63b87a
Fix test.
2022-06-20 05:15:07 +03:00
Vladimir Chebotarev
e50210969f
Style.
2022-06-20 05:15:07 +03:00
Vladimir Chebotarev
7f9557f8a3
Added optimize_read_in_window_order setting.
2022-06-20 05:15:07 +03:00
Vladimir Chebotarev
ec22f6d539
Draft.
2022-06-20 05:15:07 +03:00
Azat Khuzhin
4694929623
Implement merging only for AggregatingStep
...
v2: fill AggregateColumnsConstData only for only_merge
(fixes 01291_aggregation_in_order and some other tests)
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-06-16 09:58:36 +03:00
Azat Khuzhin
3559e35b70
AggregatingStep: remove unused forward decl
...
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-06-16 09:58:36 +03:00
wangdh15
02cce40b3a
When compiling with clang12, the unused field shard_count causes a compile error, so delete it.
2022-06-16 10:43:31 +08:00
Alexander Tokmakov
5bfb15262c
Revert "More parallel execution for queries with FINAL (#36396)"
...
This reverts commit c8afeafe0e.
2022-06-15 17:25:38 +03:00
Nikita Taranov
c8afeafe0e
More parallel execution for queries with FINAL (#36396)
2022-06-15 12:44:20 +02:00
Alexey Milovidov
ab9fc572d5
Merge pull request #37667 from ClickHouse/group-by-enum-fix
...
Support types with non-standard defaults in ROLLUP, CUBE, GROUPING SETS
2022-06-15 05:14:33 +03:00
Yakov Olkhovskiy
11e6b37ea6
preserve filling step position
2022-06-09 13:35:55 -04:00
mergify[bot]
2d01abf871
Merge branch 'master' into revert-37647-Fix-all-CheckTriviallyCopyableMove-Errors
2022-06-07 13:32:30 +00:00
Igor Nikonov
dcad154105
Merge pull request #37866 from ClickHouse/igor_minor_cleanup
...
Minor cleanup
2022-06-07 15:24:56 +02:00
Anton Popov
df6882d2b9
Revert "Fix errors of CheckTriviallyCopyableMove type"
2022-06-07 13:53:10 +02:00
Robert Schulze
2d87af2a15
Merge pull request #37647 from DevTeamBK/Fix-all-CheckTriviallyCopyableMove-Errors
...
Fix errors of CheckTriviallyCopyableMove type
2022-06-05 19:58:47 +02:00
Igor Nikonov
13149dc094
Minor cleanup
2022-06-05 14:31:07 +00:00
HeenaBansal2009
4cb561b070
Fix new warning from BuilderBinTidy
2022-06-03 11:47:36 -07:00
Nikolai Kochetov
468c04ee66
Fix test.
2022-06-02 21:29:29 +00:00
Nikolai Kochetov
176af473c3
Fix build.
2022-06-02 19:38:47 +00:00
Nikolai Kochetov
8991f39412
Merge branch 'master' into refactor-read-metrics-and-callbacks
2022-06-02 17:00:08 +00:00
Nikolai Kochetov
00395e752e
Cleanup
2022-06-02 16:59:14 +00:00
HeenaBansal2009
e3080f2a97
Merge remote-tracking branch 'origin' into Fix-all-CheckTriviallyCopyableMove-Errors
2022-06-02 07:30:08 -07:00
Nikita Mikhaylov
d34e051c69
Support for simultaneous read from local and remote parallel replica ( #37204 )
2022-06-02 11:46:33 +02:00
Nikolai Kochetov
edac3d6714
Merge branch 'master' into refactor-read-metrics-and-callbacks
2022-06-02 09:36:20 +00:00
Anton Popov
6cf9405f09
fix optimize_monotonous_functions_in_order_by in distributed queries
2022-06-01 00:50:28 +00:00
Nikolai Kochetov
86fbb74703
Merge branch 'master' into refactor-read-metrics-and-callbacks
2022-05-31 18:07:47 +00:00
Nikolai Kochetov
147a819221
Refactor a little bit more.
2022-05-31 14:43:38 +00:00
Dmitry Novik
0e63583b8f
Support types with non-standard defaults in ROLLUP, CUBE, GROUPING SETS
2022-05-31 00:11:10 +00:00
Nikolai Kochetov
77b07dd0a8
Merge pull request #37163 from ClickHouse/grouping-function
...
Add GROUPING function
2022-05-30 20:45:04 +02:00
HeenaBansal2009
b7eb6bbd38
Fixed clang-tidy-CheckTriviallyCopyableMove-errors
2022-05-30 11:09:03 -07:00
Nikolai Kochetov
5ef51ed27b
Fix more tests.
2022-05-30 13:10:30 +00:00
Nikolai Kochetov
b80b1940ce
Fix some tests.
2022-05-27 20:47:35 +00:00
Nikolai Kochetov
1b85f2c1d6
Merge branch 'master' into refactor-read-metrics-and-callbacks
2022-05-25 16:27:40 +02:00
Nikolai Kochetov
3d84aae0ab
Better.
2022-05-24 20:06:08 +00:00
Amos Bird
76ddb39d02
refactor format
2022-05-24 12:09:00 +08:00
Amos Bird
983e52cd3f
Aggressive filter pushdown for join
2022-05-24 12:08:42 +08:00
Nikolai Kochetov
fd97a9d885
Move some resources
2022-05-23 19:47:32 +00:00
Nikolai Kochetov
9756b759c6
Move some resources
2022-05-23 13:46:57 +00:00
Nikolai Kochetov
56feef01e7
Move some resources
2022-05-20 19:49:31 +00:00
Dmitry Novik
b3ccf96c81
Merge remote-tracking branch 'origin/master' into grouping-function
2022-05-19 17:58:33 +00:00
Dmitry Novik
d4c66f4a48
Code cleanup & fix GROUPING() with TOTALS
2022-05-19 16:36:51 +00:00
Azat Khuzhin
dea1706d4e
Fix GROUP BY AggregateFunction ( #37093 )
...
* Fix GROUP BY AggregateFunction
finalizeChunk() was unconditionally converting AggregateFunction to the
underlying type, however this should be done only if the aggregate was
applied.
So pass names of aggregates as an argument to the finalizeChunk()
Fuzzer report [1]:
Logical error: 'Bad cast from type DB::ColumnArray to DB::ColumnAggregateFunction'. Received signal 6 Received signal Aborted (6)
For the following query:
SELECT
arraySort(groupArrayArray(grp_simple)),
grp_aggreg,
arraySort(groupArrayArray(grp_simple)),
b,
arraySort(groupArrayArray(grp_simple)) AS grs
FROM data_02294
GROUP BY
a,
grp_aggreg,
b
SETTINGS optimize_aggregation_in_order = 1
[1]: https://s3.amazonaws.com/clickhouse-test-reports/37050/323ae98202d80fc4b311be1e7308ef2ac39e6063/fuzzer_astfuzzerdebug,actions//fuzzer.log
v2: fix conflicts in src/Interpreters/InterpreterSelectQuery.cpp
v3: Fix header for GROUP BY AggregateFunction WITH TOTALS
v4: Add sanity check into finalizeBlock()
v5: Use typeid_cast<&> to get more sensible error in case of bad cast (as suggested by @nickitat)
v6: Fix positions passed to finalizeChunk()
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
* Core/ColumnNumbers.h: remove unused <string>
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
* Optimize finalizeChunk()/finalizeBlock()
v2: s/ByPosition/Mask/ s/by_position/mask/
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-05-18 23:37:43 +02:00
Dmitry Novik
e5b395e054
Support ROLLUP and CUBE in GROUPING function
2022-05-16 17:33:38 +00:00
Robert Schulze
d66dcdad79
Fix new occurrences of new clang-tidy warnings
2022-05-16 11:31:36 +02:00
Dmitry Novik
6fc7dfea80
Support ordinary GROUP BY
2022-05-13 23:04:12 +00:00
Nikolai Kochetov
0a715b26db
Move some resources.
2022-05-13 20:02:28 +00:00
Dmitry Novik
ae81268d4d
Try to compute helper column lazily
2022-05-13 14:55:50 +00:00
Dmitry Novik
c5b40a9c91
WIP on GROUPING function
2022-05-12 16:40:26 +00:00
Maksim Kita
437d70d4da
Fixed tests
2022-05-11 21:59:51 +02:00
Maksim Kita
75555c436b
Fix usage of min_count_to_compile_sort_description setting
2022-05-11 21:59:51 +02:00
Maksim Kita
ea8ce3140a
Fixed tests
2022-05-11 21:59:51 +02:00
Maksim Kita
4e7d10297b
Fixed style
2022-05-11 21:59:51 +02:00
Maksim Kita
cbfb773b50
Fixed tests
2022-05-11 21:59:51 +02:00
Maksim Kita
8ceb63ee6c
Added JIT compilation of SortDescription
2022-05-11 21:59:51 +02:00
Nikolai Kochetov
2d99f0ce13
Simplify code a little bit.
2022-05-11 12:16:15 +00:00
Nikolai Kochetov
4b8a2e2d80
Fix fuzzed queries.
2022-05-11 10:22:34 +00:00
Nikolai Kochetov
b6075031d8
Delete GroupingSetsTransform.
2022-05-10 17:54:36 +00:00
Nikolai Kochetov
f7dbd48ee5
Simplify code a little bit.
2022-05-10 16:12:03 +00:00
Nikolai Kochetov
a02e1d2f4a
Simplify code a little bit.
2022-05-10 16:00:00 +00:00
mergify[bot]
55a6d22ad3
Merge branch 'master' into grouping-sets-fix
2022-05-09 14:02:10 +00:00
Alexey Milovidov
6216c1827f
Merge pull request #37020 from ucasfl/remove-code
...
remove useless code
2022-05-09 00:00:07 +03:00
fenglv
2cd0f2aaed
remove useless code
2022-05-08 16:50:13 +00:00
Vladimir C
bd5fab97d9
Merge pull request #36415 from bigo-sg/concurrent_join
2022-05-06 17:11:10 +02:00
Dmitry Novik
9a251e0028
Cleanup code
2022-05-05 18:13:00 +00:00
Dmitry Novik
4cc26aa38b
Merge remote-tracking branch 'origin/master' into grouping-sets-fix
...
And fix execution of the query with only one grouping set
2022-05-05 17:14:52 +00:00
Dmitry Novik
161f52292b
Support distributed queries
2022-05-05 13:56:16 +00:00
Dmitry Novik
9be17ef50c
Merge pull request #35111 from azat/optimize_aggregation_in_order-prefix
...
Implement partial GROUP BY key for optimize_aggregation_in_order
2022-05-02 17:49:48 +02:00
Azat Khuzhin
190ce217bb
Disable GROUP BY statistics for optimize_aggregation_in_order
...
These statistics significantly decrease the performance of
optimize_aggregation_in_order with a prefix key.
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-04-29 06:58:27 +03:00
Azat Khuzhin
3931dbd848
Implement partial GROUP BY key for optimize_aggregation_in_order
...
Suppose you have a table with lots of rows, like:
create table data_02233 (parent_key Int, child_key Int, value Int) engine=MergeTree() order by parent_key
And you want to do GROUP BY (parent_key, child_key) with optimize_aggregation_in_order:
select parent_key, child_key, count() from data_02233 group by parent_key, child_key with totals order by parent_key, child_key
Right now, it is not possible, because optimize_aggregation_in_order
supports only w/o key aggregation, i.e. GROUP BY cannot be done inside
unique parent_key region.
v2: rebase on top SortDescriptionWithPositions
v3: disable two-level aggregation
v4: fix merging of aggregates
v5: improve tests coverage (add a test with multiple parts, to add merge processor)
v6: add a test for compiled aggregate functions (sum()) explicitly
v7: add missing sortBlock()
v8: remove group_by_description_optimized
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-04-29 06:58:07 +03:00
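The idea in the commit above, aggregating by the full GROUP BY key while the data is ordered only by a prefix of it, can be sketched in Python as follows. This is an illustration under stated assumptions, not the ClickHouse code: the function name `count_in_order` is hypothetical, rows are assumed to be `(parent_key, child_key)` tuples already sorted by `parent_key`, and `count()` stands in for an arbitrary aggregate.

```python
from collections import Counter
from itertools import groupby


def count_in_order(rows):
    """Yield (parent_key, child_key, count) for rows sorted by parent_key.

    Each run of equal parent_key is aggregated independently, so only the
    child keys of the current run are held in memory at a time.
    """
    for parent_key, run in groupby(rows, key=lambda row: row[0]):
        # Inside one parent_key region, fall back to a hash-based GROUP BY.
        counts = Counter(child_key for _, child_key in run)
        # Sort within the region so the output is ordered by the full key.
        for child_key in sorted(counts):
            yield parent_key, child_key, counts[child_key]
```

For example, `list(count_in_order([(1, 'b'), (1, 'a'), (1, 'b'), (2, 'a')]))` produces one output row per `(parent_key, child_key)` pair, and the result for `parent_key = 1` is complete as soon as the first row with `parent_key = 2` is seen.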
Amos Bird
4a5e4274f0
base should not depend on Common
2022-04-29 10:26:35 +08:00
fenglv
1b84d59047
fix typo
...
modify comment
2022-04-27 12:24:49 +00:00
lgbo-ustc
5738871a8b
update QueryPipelineBuilder::joinPipelines
2022-04-27 10:24:19 +08:00
lgbo-ustc
520b05b9f1
update test case tests/queries/0_stateless/02236_explain_pipeline_join.sql
2022-04-27 10:08:22 +08:00
lgbo-ustc
6cb7b7888f
update test case 02236_explain_pipeline_join
2022-04-26 19:07:07 +08:00