Commit Graph

1117 Commits

Author SHA1 Message Date
Alexey Milovidov
88440d4c07
Merge pull request #54568 from JackyWoo/optimize_uniq_to_count2
Resubmit optimization uniq to count
2023-10-30 01:33:36 +01:00
Alexey Milovidov
64b6e68a50
Merge pull request #55683 from amosbird/issue-55653
Reuse granule during skip index reading
2023-10-30 00:51:51 +01:00
frinkr
18c50c11b3
Multithreading after window functions (#50771)
* feat: Preserve number of streams after evaluation the window functions to allow parallel stream processing

* fix style

* fix style

* fix style

* setting query_plan_preserve_num_streams_after_window_functions default true

* fix tests by SETTINGS query_plan_preserve_num_streams_after_window_functions=0

* fix test references

* Resize the streams after the last window function, to keep the order between WindowTransforms (and WindowTransform works on single stream anyway).

* feat: Preserve number of streams after evaluation the window functions to allow parallel stream processing

* fix style

* fix style

* fix style

* setting query_plan_preserve_num_streams_after_window_functions default true

* fix tests by SETTINGS query_plan_preserve_num_streams_after_window_functions=0

* fix test references

* Resize the streams after the last window function, to keep the order between WindowTransforms (and WindowTransform works on single stream anyway).

* add perf test

* perf: change the dataset from 50M to 5M

* rename query_plan_preserve_num_streams_after_window_functions -> query_plan_enable_multithreading_after_window_functions

* update test reference

* fix clang-tidy

---------

Co-authored-by: Nikita Taranov <nikita.taranov@clickhouse.com>
2023-10-27 12:36:28 +02:00
lgbo
489e6d9bdc
Optimization for getting value from map, arrayElement(1/2) (#55929) 2023-10-27 09:54:25 +02:00
Maksim Kita
aa5fc05a55 Revert "Merge pull request #55682 from ClickHouse/revert-35961-decimal-column-improve-get-permutation"
This reverts commit f6dee5fe3c, reversing
changes made to f96bda1deb.
2023-10-25 21:48:13 +03:00
李扬
465962df7f
Support orc filter push down (file + stripe + rowgroup level) (#55330)
* support orc filter push down

* update orc lib version

* replace setqueryinfo with setkeycondition

* fix issue https://github.com/ClickHouse/ClickHouse/issues/53536

* refactor source with key condition

* fix building error

* remove std::cout

* update orc

* update orc version

* fix bugs

* improve code

* upgrade orc lib

* fix code style

* change as requested

* add performance tests for orc filter push down

* add performance tests for orc filter push down

* fix all bugs

* fix default as null issue

* add uts for null as default issues

* upgrade orc lib

* fix failed orc lib uts and fix typo

* fix failed uts

* fix failed uts

* fix ast fuzzer tests

* fix bug of uint64 overflow in https://s3.amazonaws.com/clickhouse-test-reports/55330/de22fdcaea2e12c96f300e95f59beba84401712d/fuzzer_astfuzzerubsan/report.html

* fix asan fatal caused by reused column vector batch in native orc input format. refer to https://s3.amazonaws.com/clickhouse-test-reports/55330/be39d23af2d7e27f5ec7f168947cf75aeaabf674/stateless_tests__asan__[4_4].htm

* fix wrong performance tests

* disable 02892_orc_filter_pushdown on aarch64. https://s3.amazonaws.com/clickhouse-test-reports/55330/be39d23af2d7e27f5ec7f168947cf75aeaabf674/stateless_tests__aarch64_.html

* add some comments

* add some comments

* inline range::equals and range::less

* fix data race of key condition

* trigger ci
2023-10-24 12:08:17 -07:00
Robert Schulze
68c3f41b71
Fix performance tests 2023-10-24 08:56:09 +00:00
Duc Canh Le
5923e1b116
Cache cast function in set during execution (#55712)
* Cache cast function in set during execution

Signed-off-by: Duc Canh Le <duccanh.le@ahrefs.com>

* minor fix for performance test

Signed-off-by: Duc Canh Le <duccanh.le@ahrefs.com>

* Update src/Interpreters/castColumn.cpp

Co-authored-by: Nikita Taranov <nickita.taranov@gmail.com>

* improvement

Signed-off-by: Duc Canh Le <duccanh.le@ahrefs.com>

* fix use-after-free

Signed-off-by: Duc Canh Le <duccanh.le@ahrefs.com>

---------

Signed-off-by: Duc Canh Le <duccanh.le@ahrefs.com>
Co-authored-by: Nikita Taranov <nickita.taranov@gmail.com>
2023-10-23 13:31:44 +02:00
Amos Bird
602f01f651
Reuse granule during skip index reading 2023-10-18 14:40:34 +08:00
Alexey Milovidov
2da1ff4b0d
Revert "Improve ColumnDecimal, ColumnVector getPermutation performance using pdqsort with RadixSort" 2023-10-16 19:07:11 +03:00
Maksim Kita
d4f9e0de12 Added performance test 2023-10-11 19:01:00 +03:00
Robert Schulze
2848548c63
Merge remote-tracking branch 'rschu1ze/master' into arrayFold 2023-10-08 16:32:36 +00:00
JackyWoo
826f7ac7eb Merge branch 'master' into optimize_uniq_to_count2 2023-09-27 09:14:28 +08:00
Maksim Kita
40be8227ea Fixed tests 2023-09-25 17:29:42 +03:00
Maksim Kita
1de95d8c36 Updated implementation 2023-09-25 17:29:42 +03:00
Maksim Kita
0a9835d085 Added performance tests 2023-09-25 17:29:42 +03:00
JackyWoo
231d16040b Merge branch 'master' into optimize_uniq_to_count2 2023-09-19 10:29:03 +08:00
robot-clickhouse
51851ecc21
Merge pull request #54613 from bigo-sg/improve_json_query
Improve json sql functions by serializing json element into column's buffer direclty
2023-09-15 19:35:30 +02:00
lgbo-ustc
fa0f9b0e1f update benckmark test 2023-09-15 18:09:58 +08:00
lgbo-ustc
e8d217634e improve json sql functions by serilizing data into column direclty 2023-09-14 12:41:17 +08:00
Sema Checherinda
8a9b544a97
Merge branch 'master' into optimize_all_lonely_parts 2023-09-13 16:07:19 +02:00
JackyWoo
70a262a775 Add optimization uniq to count 2023-09-13 16:16:11 +08:00
Alexey Milovidov
bd4aec0601
Revert "Optimize uniq to count" 2023-09-13 09:14:06 +03:00
JackyWoo
d065ac32e0 Merge branch 'master' into optimize_uniq_to_count 2023-09-04 10:06:36 +08:00
Duc Canh Le
06afe0c2aa more stable stateless test + add perf. test 2023-08-31 06:27:06 +00:00
Jiebin Sun
7c529e5691
Optimize the merge if all hashSets are singleLevel in UniqExactSet (#52973)
* Optimize the merge if all hashSets are singleLevel

In PR(https://github.com/ClickHouse/ClickHouse/pull/50748), it has added new phase
`parallelizeMergePrepare` before merge if all the hashSets are not all singleLevel
or not all twoLevel. Then it will convert all the singleLevelSet to twoLevelSet in
parallel, which will increase the CPU utilization and QPS.

But if all the hashtables are singleLevel, it could also benefit from the
`parallelizeMergePrepare` optimization in most cases if the hashtable size are not
too small. By tuning the Query `SELECT COUNT(DISTINCT SearchPhase) FROM hits_v1`
in different threads, we have got the mild threshold 6,000.

Test patch with the Query 'SELECT COUNT(DISTINCT Title) FROM hits_v1' on 2x80 vCPUs
server. If the threads are less than 48, the hashSets are all twoLevel or mixed by
singleLevel and twoLevel. If the threads are over 56, all the hashSets are singleLevel.
And the QPS has got at most 2.35x performance gain.

Threads	Opt/Base
8	100.0%
16	99.4%
24	110.3%
32	99.9%
40	99.3%
48	99.8%
56	183.0%
64	234.7%
72	233.1%
80	229.9%
88	224.5%
96	229.6%
104	235.1%
112	229.5%
120	229.1%
128	217.8%
136	222.9%
144	217.8%
152	204.3%
160	203.2%

Signed-off-by: Jiebin Sun <jiebin.sun@intel.com>

* Add the comment and explanation for PR#52973

Signed-off-by: Jiebin Sun <jiebin.sun@intel.com>

---------

Signed-off-by: Jiebin Sun <jiebin.sun@intel.com>
2023-08-30 11:26:16 +02:00
JackyWoo
a963048e1a Merge branch 'master' into optimize_uniq_to_count 2023-08-28 11:10:05 +08:00
avogar
ecb0e9844c Disable cache in perf test 2023-08-23 21:01:18 +00:00
avogar
68e3af56d4 Address comments 2023-08-23 13:19:15 +00:00
Kruglov Pavel
e193aec583
Merge branch 'master' into fast-count-from-files 2023-08-23 12:15:34 +02:00
Kruglov Pavel
67c5c0203b
Merge branch 'master' into fast-count-from-files 2023-08-22 15:03:48 +02:00
Alexey Milovidov
037277c4a2 Remove bad test 2023-08-22 14:21:23 +02:00
Michael Kolupaev
d752611c43 Performance test 2023-08-21 14:15:52 -07:00
Kruglov Pavel
88aee95122
Merge branch 'master' into fast-count-from-files 2023-08-21 14:46:33 +02:00
avogar
47304bf7aa Optimize count from files in most input formats 2023-08-21 12:30:52 +00:00
Alexey Milovidov
125169d9ae Remove useless test 2023-08-20 03:51:30 +02:00
robot-ch-test-poll4
3aa9cb1267
Merge pull request #51399 from liuneng1994/optimize_nullable_aggragate_serialized_method
Optimize aggregation performance of nullable String key when use AggregationMethodSerialized
2023-08-16 19:37:44 +02:00
liuneng
8a83301316 optimize 2023-08-08 13:38:25 +08:00
liuneng
f33367cd8b add more test 2023-08-08 13:38:25 +08:00
liuneng
f96b9b7512 optimize fixed size column 2023-08-08 13:38:25 +08:00
liuneng
035dbdaf22 remove numbers optimization. It will decrease performance 2023-08-08 13:38:25 +08:00
liuneng
4f9920c71c optimize performance of nullable String And Number column serializeValueIntoArena 2023-08-08 13:38:25 +08:00
Duc Canh Le
ad0ac43814 fix performance test 2023-08-07 06:25:46 +00:00
Duc Canh Le
ed2a1d7c9b select required columns when getting join 2023-08-07 03:15:20 +00:00
JackyWoo
43ea21a4ce make default optimize_uniq_to_count to true 2023-08-02 18:28:22 +08:00
JackyWoo
1c930f34de reduce performance time 2023-08-02 18:10:01 +08:00
JackyWoo
162c674d74 remove settings in uniq_to_count 2023-08-02 10:50:04 +08:00
JackyWoo
ef3f5e2a7c fix performance tests error 2023-08-02 10:15:56 +08:00
JackyWoo
93b28903cb
Merge branch 'master' into optimize_uniq_to_count 2023-08-02 10:13:22 +08:00
Jiebin Sun
78f3a575f9
Convert hashSets in parallel before merge (#50748)
* Convert hashSets in parallel before merge

Before merge, if one of the lhs and rhs is singleLevelSet and the other is twoLevelSet,
then the SingleLevelSet will call convertToTwoLevel(). The convert process is not in parallel
and it will cost lots of cycle if it cosume all the singleLevelSet.

The idea of the patch is to convert all the singleLevelSets to twoLevelSets in parallel if
the hashsets are not all singleLevel or not all twoLevel.

I have tested the patch on Intel 2 x 112 vCPUs SPR server with clickbench and latest upstream
ClickHouse.
Q5 has got a big 264% performance improvement and 24 queries have got at least 5% performance
gain. The overall geomean of 43 queries has gained 7.4% more than the base code.

Signed-off-by: Jiebin Sun <jiebin.sun@intel.com>

* add resize() for the data_vec in parallelizeMergePrepare()

Signed-off-by: Jiebin Sun <jiebin.sun@intel.com>

* Add the performance test prepare_hash_before_merge.xml

Signed-off-by: Jiebin Sun <jiebin.sun@intel.com>

* Fit the CI to rename the data set from hits_v1 to test.hits.

Signed-off-by: Jiebin Sun <jiebin.sun@intel.com>

* remove the redundant branch in UniqExactSet

Co-authored-by: Nikita Taranov <nickita.taranov@gmail.com>

* Remove the empty methods and add throw exception in parallelizeMergePrepare()

Signed-off-by: Jiebin Sun <jiebin.sun@intel.com>

---------

Signed-off-by: Jiebin Sun <jiebin.sun@intel.com>
Co-authored-by: Nikita Taranov <nickita.taranov@gmail.com>
2023-07-27 15:06:34 +02:00