Commit Graph

1403 Commits

Author SHA1 Message Date
Nikolai Kochetov
4deaf7cefb
Merge pull request #56134 from yariks5s/force_optimize_projection_name
Implementing force_optimize_projection_name
2023-11-01 13:12:10 +01:00
Nikolai Kochetov
f748f12426
Merge pull request #51746 from ClickHouse/fix-read-in-order-with-array-join
Fix 'Cannot find column' in read-in-order optimization with ARRAY JOIN
2023-10-31 11:51:01 +01:00
Nikolai Kochetov
84f6a243b7 Merge branch 'master' into fix-read-in-order-with-array-join 2023-10-30 16:35:31 +00:00
yariks5s
03236c48ed init 2023-10-30 16:21:50 +00:00
Nikolai Kochetov
554ceb4e1d Merge branch 'master' into planner-prepare-filters-for-analysis-2 2023-10-30 11:56:30 +01:00
frinkr
18c50c11b3
Multithreading after window functions (#50771)
* feat: Preserve the number of streams after evaluating the window functions to allow parallel stream processing

* fix style

* fix style

* fix style

* setting query_plan_preserve_num_streams_after_window_functions default true

* fix tests by SETTINGS query_plan_preserve_num_streams_after_window_functions=0

* fix test references

* Resize the streams after the last window function to keep the order between WindowTransforms (WindowTransform works on a single stream anyway).

* add perf test

* perf: change the dataset from 50M to 5M

* rename query_plan_preserve_num_streams_after_window_functions -> query_plan_enable_multithreading_after_window_functions

* update test reference

* fix clang-tidy

---------

Co-authored-by: Nikita Taranov <nikita.taranov@clickhouse.com>
2023-10-27 12:36:28 +02:00
Alexey Milovidov
bb5a60dc19
Merge pull request #55893 from ClickHouse/revert-partial-result-2
Revert "Revert "Revert "Add settings for real-time updates during query execution"""
2023-10-25 22:20:28 +02:00
robot-ch-test-poll1
ef78889aa2
Merge pull request #55952 from ClickHouse/disable_apply_deleted_mask
Added a setting to allow reading rows marked as deleted
2023-10-25 01:14:07 +02:00
李扬
465962df7f
Support orc filter push down (file + stripe + rowgroup level) (#55330)
* support orc filter push down

* update orc lib version

* replace setqueryinfo with setkeycondition

* fix issue https://github.com/ClickHouse/ClickHouse/issues/53536

* refactor source with key condition

* fix building error

* remove std::cout

* update orc

* update orc version

* fix bugs

* improve code

* upgrade orc lib

* fix code style

* change as requested

* add performance tests for orc filter push down

* add performance tests for orc filter push down

* fix all bugs

* fix default as null issue

* add uts for null as default issues

* upgrade orc lib

* fix failed orc lib uts and fix typo

* fix failed uts

* fix failed uts

* fix ast fuzzer tests

* fix bug of uint64 overflow in https://s3.amazonaws.com/clickhouse-test-reports/55330/de22fdcaea2e12c96f300e95f59beba84401712d/fuzzer_astfuzzerubsan/report.html

* fix asan fatal caused by reused column vector batch in native orc input format. refer to https://s3.amazonaws.com/clickhouse-test-reports/55330/be39d23af2d7e27f5ec7f168947cf75aeaabf674/stateless_tests__asan__[4_4].htm

* fix wrong performance tests

* disable 02892_orc_filter_pushdown on aarch64. https://s3.amazonaws.com/clickhouse-test-reports/55330/be39d23af2d7e27f5ec7f168947cf75aeaabf674/stateless_tests__aarch64_.html

* add some comments

* add some comments

* inline range::equals and range::less

* fix data race of key condition

* trigger ci
2023-10-24 12:08:17 -07:00
Alexander Gololobov
959b8b64bd Added a setting to allow reading rows marked as deleted 2023-10-23 19:59:17 +02:00
Alexey Milovidov
5217d64551 Remove garbage 2023-10-22 01:53:50 +02:00
Alexey Milovidov
7ec4b99e94 Revert partial result 2023-10-21 03:14:22 +02:00
Raúl Marín
3eaf752284 Merge remote-tracking branch 'blessed/master' into parallel_replicas_row_estimation 2023-10-17 11:36:39 +02:00
Jiebin Sun
df17cd467b
Release more num_streams if data is small (#53867)
* Release more num_streams if data is small

Besides sum_marks and min_marks_for_concurrent_read, the number of system cores
can also be used to choose num_streams when the data is small. Increasing num_streams
and decreasing min_marks_for_concurrent_read improves parallel performance
when the system has plenty of cores.

Tested the patch on a 2x80 vCPU system: Q39 of ClickBench improved by 3.3x and
Q36 by 2.6x. The overall geomean gained 9%.

Signed-off-by: Jiebin Sun <jiebin.sun@intel.com>

* Release more num_streams if data is small
Change the min marks from 4 to 8 as the profit is small and 8 granules
is the default block size.

Signed-off-by: Jiebin Sun <jiebin.sun@intel.com>

---------

Signed-off-by: Jiebin Sun <jiebin.sun@intel.com>
2023-10-16 18:41:38 +02:00
Raúl Marín
500dab9569 Merge remote-tracking branch 'blessed/master' into parallel_replicas_row_estimation 2023-10-13 10:12:57 +02:00
Igor Nikonov
62060a0603 Merge remote-tracking branch 'origin/master' into pr-coordinator-usage-cleanup 2023-10-11 15:11:15 +00:00
Igor Nikonov
9d95f4e1b6 Cleanup: parallel replica coordinator usage 2023-10-11 15:04:59 +00:00
Raúl Marín
0b9bd809e7 Merge remote-tracking branch 'blessed/master' into parallel_replicas_row_estimation 2023-10-11 16:50:22 +02:00
Raúl Marín
95d2063e91 Merge remote-tracking branch 'blessed/master' into parallel_replicas_row_estimation 2023-10-10 17:29:45 +02:00
Nikita Mikhaylov
4456fe40f9
Remove the old code for projection analysis (#55112) 2023-10-10 17:13:32 +02:00
Azat Khuzhin
099665478d Fix incorrect merging of Nested for SELECT FINAL FROM SummingMergeTree
The problem was the order of the columns: in case of SELECT FINAL it got
"counters_Map.count", "counters_Map.id".

But in case of OPTIMIZE FINAL it correctly got "counters_Map.id",
"counters_Map.count".

Note that this bug is not new: I've checked 19.x and it was already
there.

P.S. There is a workaround for this problem: use one of the following
patterns for key column names:
- *ID
- *Key
- *Type

That way the column will be explicitly matched as a key and everything will work.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-10-08 07:32:47 +02:00
Nikolai Kochetov
3bd71c937c Merge branch 'master' into fix-read-in-order-with-array-join 2023-10-04 09:17:09 +00:00
Nikolai Kochetov
d944b59902 Merge branch 'master' into planner-prepare-filters-for-analysis-2 2023-10-03 14:28:16 +00:00
vdimir
3a9abde35d
Merge pull request #54514 from ClickHouse/vdimir/allow_experimental_partial_result
Add setting allow_experimental_partial_result
2023-09-29 10:32:57 +02:00
Nikita Taranov
0e506b618e
impl (#54934) 2023-09-28 14:12:19 +02:00
vdimir
3f3feea0b7
Add setting allow_experimental_partial_result 2023-09-28 09:40:56 +00:00
Kruglov Pavel
4d675dbad4
Merge pull request #54825 from azat/fix-virtual-columns-filtering
Fix filtering parts with indexHint for non analyzer (resubmit)
2023-09-22 18:20:16 +02:00
Igor Nikonov
b1cc698477
Merge pull request #54564 from vitlibar/fix-sorting-of-union-of-sorted
Fix sorting of UNION ALL of already sorted results
2023-09-21 22:49:53 +02:00
Azat Khuzhin
d9a634eb0f Fix filtering parts with indexHint for non analyzer
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
(cherry picked from commit ffa82e9297)
2023-09-20 11:29:35 +02:00
Azat Khuzhin
c439c4bca2
Revert "Fix filtering parts with indexHint for non analyzer" 2023-09-19 21:39:21 +02:00
Kruglov Pavel
d555fdffa5
Merge pull request #54449 from azat/parts-prune-indexHint
Fix filtering parts with indexHint for non analyzer
2023-09-19 19:15:41 +02:00
Igor Nikonov
047d214436 Merge remote-tracking branch 'origin/master' into parallel-replicas-not-enough-replicas 2023-09-18 15:29:56 +00:00
Igor Nikonov
e1019ba3c4 Disabling parallel replicas per shard will be done separately 2023-09-18 15:27:55 +00:00
Kruglov Pavel
9c888ea42b
Merge pull request #53549 from Avogar/group-by-constant-keys
Optimize group by constant keys
2023-09-18 12:12:40 +02:00
Amos Bird
0518b64b58
Fix nullable primary key in final (#54164)
* Fix nullable primary key in final

* Real fix

* Address reviews
2023-09-15 22:44:13 +02:00
Kruglov Pavel
2075f9c667
Merge branch 'master' into group-by-constant-keys 2023-09-15 15:10:08 +02:00
Vitaly Baranov
9a0e1ef592 Fix sorting of UNION ALL of already sorted results. 2023-09-14 15:04:37 +02:00
Sema Checherinda
8a9b544a97
Merge branch 'master' into optimize_all_lonely_parts 2023-09-13 16:07:19 +02:00
Igor Nikonov
7b3f32b95a
Merge pull request #54520 from ClickHouse/pr-cleanup
Parallel replicas: cleanup unused params
2023-09-12 19:48:18 +02:00
Igor Nikonov
1287f68745 Handle clusterAllReplicas/remote cases to avoid unnecessary logging 2023-09-12 12:52:29 +00:00
robot-clickhouse
63243fbc03
Merge pull request #54480 from amosbird/fix_54406
Fix aggregate projections with normalized states
2023-09-12 13:43:41 +02:00
Igor Nikonov
d5ea047ab8 Parallel replicas: cleanup unused params 2023-09-11 21:52:40 +00:00
Igor Nikonov
2293923f66 Disable parallel replicas on shards with not enough nodes 2023-09-11 21:46:46 +00:00
Amos Bird
667426f1f2
DataTypeAggregateFunction::strictEquals 2023-09-12 03:54:19 +08:00
Nikolai Kochetov
903c966cc8
Merge branch 'master' into planner-prepare-filters-for-analysis-2 2023-09-11 16:14:03 +02:00
Amos Bird
fb0f9ff565
Fix aggregate projections with normalized states 2023-09-10 03:21:22 +08:00
Nikolai Kochetov
9b936c44db
Revert "Revert "Add settings for real-time updates during query execution"" 2023-09-09 12:29:39 +02:00
Alexey Milovidov
03a755732a
Revert "Add settings for real-time updates during query execution" 2023-09-09 03:10:23 +03:00
Azat Khuzhin
ffa82e9297 Fix filtering parts with indexHint for non analyzer
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-09-08 18:17:06 +02:00
Anton Popov
1aa34c16e3 Merge remote-tracking branch 'upstream/master' into HEAD 2023-09-08 14:00:45 +00:00