ClickHouse

mirror of https://github.com/ClickHouse/ClickHouse.git synced 2024-12-04 05:22:17 +00:00

Author	SHA1	Message	Date
Alexey Milovidov	88440d4c07	Merge pull request #54568 from JackyWoo/optimize_uniq_to_count2 Resubmit optimization uniq to count	2023-10-30 01:33:36 +01:00
Alexey Milovidov	64b6e68a50	Merge pull request #55683 from amosbird/issue-55653 Reuse granule during skip index reading	2023-10-30 00:51:51 +01:00
frinkr	18c50c11b3	Multithreading after window functions (#50771) * feat: Preserve number of streams after evaluating the window functions to allow parallel stream processing * fix style * fix style * fix style * setting query_plan_preserve_num_streams_after_window_functions default true * fix tests by SETTINGS query_plan_preserve_num_streams_after_window_functions=0 * fix test references * Resize the streams after the last window function, to keep the order between WindowTransforms (and WindowTransform works on single stream anyway). * add perf test * perf: change the dataset from 50M to 5M * rename query_plan_preserve_num_streams_after_window_functions -> query_plan_enable_multithreading_after_window_functions * update test reference * fix clang-tidy --------- Co-authored-by: Nikita Taranov <nikita.taranov@clickhouse.com>	2023-10-27 12:36:28 +02:00
lgbo	489e6d9bdc	Optimization for getting value from map, `arrayElement` (1/2) (#55929)	2023-10-27 09:54:25 +02:00
Maksim Kita	aa5fc05a55	Revert "Merge pull request #55682 from ClickHouse/revert-35961-decimal-column-improve-get-permutation" This reverts commit `f6dee5fe3c`, reversing changes made to `f96bda1deb`.	2023-10-25 21:48:13 +03:00
李扬	465962df7f	Support orc filter push down (file + stripe + rowgroup level) (#55330 ) * support orc filter push down * update orc lib version * replace setqueryinfo with setkeycondition * fix issue https://github.com/ClickHouse/ClickHouse/issues/53536 * refactor source with key condition * fix building error * remove std::cout * update orc * update orc version * fix bugs * improve code * upgrade orc lib * fix code style * change as requested * add performance tests for orc filter push down * add performance tests for orc filter push down * fix all bugs * fix default as null issue * add uts for null as default issues * upgrade orc lib * fix failed orc lib uts and fix typo * fix failed uts * fix failed uts * fix ast fuzzer tests * fix bug of uint64 overflow in https://s3.amazonaws.com/clickhouse-test-reports/55330/de22fdcaea2e12c96f300e95f59beba84401712d/fuzzer_astfuzzerubsan/report.html * fix asan fatal caused by reused column vector batch in native orc input format. refer to https://s3.amazonaws.com/clickhouse-test-reports/55330/be39d23af2d7e27f5ec7f168947cf75aeaabf674/stateless_tests__asan__[4_4].htm * fix wrong performance tests * disable 02892_orc_filter_pushdown on aarch64. https://s3.amazonaws.com/clickhouse-test-reports/55330/be39d23af2d7e27f5ec7f168947cf75aeaabf674/stateless_tests__aarch64_.html * add some comments * add some comments * inline range::equals and range::less * fix data race of key condition * trigger ci	2023-10-24 12:08:17 -07:00
Robert Schulze	68c3f41b71	Fix performance tests	2023-10-24 08:56:09 +00:00
Duc Canh Le	5923e1b116	Cache cast function in set during execution (#55712 ) * Cache cast function in set during execution Signed-off-by: Duc Canh Le <duccanh.le@ahrefs.com> * minor fix for performance test Signed-off-by: Duc Canh Le <duccanh.le@ahrefs.com> * Update src/Interpreters/castColumn.cpp Co-authored-by: Nikita Taranov <nickita.taranov@gmail.com> * improvement Signed-off-by: Duc Canh Le <duccanh.le@ahrefs.com> * fix use-after-free Signed-off-by: Duc Canh Le <duccanh.le@ahrefs.com> --------- Signed-off-by: Duc Canh Le <duccanh.le@ahrefs.com> Co-authored-by: Nikita Taranov <nickita.taranov@gmail.com>	2023-10-23 13:31:44 +02:00
Amos Bird	602f01f651	Reuse granule during skip index reading	2023-10-18 14:40:34 +08:00
Alexey Milovidov	2da1ff4b0d	Revert "Improve ColumnDecimal, ColumnVector getPermutation performance using pdqsort with RadixSort"	2023-10-16 19:07:11 +03:00
Maksim Kita	d4f9e0de12	Added performance test	2023-10-11 19:01:00 +03:00
Robert Schulze	2848548c63	Merge remote-tracking branch 'rschu1ze/master' into arrayFold	2023-10-08 16:32:36 +00:00
JackyWoo	826f7ac7eb	Merge branch 'master' into optimize_uniq_to_count2	2023-09-27 09:14:28 +08:00
Maksim Kita	40be8227ea	Fixed tests	2023-09-25 17:29:42 +03:00
Maksim Kita	1de95d8c36	Updated implementation	2023-09-25 17:29:42 +03:00
Maksim Kita	0a9835d085	Added performance tests	2023-09-25 17:29:42 +03:00
JackyWoo	231d16040b	Merge branch 'master' into optimize_uniq_to_count2	2023-09-19 10:29:03 +08:00
robot-clickhouse	51851ecc21	Merge pull request #54613 from bigo-sg/improve_json_query Improve json sql functions by serializing json element into column's buffer directly	2023-09-15 19:35:30 +02:00
lgbo-ustc	fa0f9b0e1f	update benchmark test	2023-09-15 18:09:58 +08:00
lgbo-ustc	e8d217634e	improve json sql functions by serializing data into column directly	2023-09-14 12:41:17 +08:00
Sema Checherinda	8a9b544a97	Merge branch 'master' into optimize_all_lonely_parts	2023-09-13 16:07:19 +02:00
JackyWoo	70a262a775	Add optimization uniq to count	2023-09-13 16:16:11 +08:00
Alexey Milovidov	bd4aec0601	Revert "Optimize uniq to count"	2023-09-13 09:14:06 +03:00
JackyWoo	d065ac32e0	Merge branch 'master' into optimize_uniq_to_count	2023-09-04 10:06:36 +08:00
Duc Canh Le	06afe0c2aa	more stable stateless test + add perf. test	2023-08-31 06:27:06 +00:00
Jiebin Sun	7c529e5691	Optimize the merge if all hashSets are singleLevel in UniqExactSet (#52973) * Optimize the merge if all hashSets are singleLevel PR https://github.com/ClickHouse/ClickHouse/pull/50748 added a new phase, `parallelizeMergePrepare`, that runs before the merge when the hashSets are neither all singleLevel nor all twoLevel. It converts all the singleLevelSets to twoLevelSets in parallel, which increases CPU utilization and QPS. But if all the hash tables are singleLevel, the merge can also benefit from the `parallelizeMergePrepare` optimization in most cases, provided the hash tables are not too small. By tuning the query `SELECT COUNT(DISTINCT SearchPhase) FROM hits_v1` with different thread counts, we arrived at a mild threshold of 6,000. The patch was tested with the query `SELECT COUNT(DISTINCT Title) FROM hits_v1` on a 2x80 vCPUs server. With fewer than 48 threads, the hashSets are all twoLevel or a mix of singleLevel and twoLevel. With more than 56 threads, all the hashSets are singleLevel, and QPS improves by up to 2.35x. Opt/Base by thread count: 8: 100.0%, 16: 99.4%, 24: 110.3%, 32: 99.9%, 40: 99.3%, 48: 99.8%, 56: 183.0%, 64: 234.7%, 72: 233.1%, 80: 229.9%, 88: 224.5%, 96: 229.6%, 104: 235.1%, 112: 229.5%, 120: 229.1%, 128: 217.8%, 136: 222.9%, 144: 217.8%, 152: 204.3%, 160: 203.2%. Signed-off-by: Jiebin Sun <jiebin.sun@intel.com> * Add the comment and explanation for PR#52973 Signed-off-by: Jiebin Sun <jiebin.sun@intel.com> --------- Signed-off-by: Jiebin Sun <jiebin.sun@intel.com>	2023-08-30 11:26:16 +02:00
JackyWoo	a963048e1a	Merge branch 'master' into optimize_uniq_to_count	2023-08-28 11:10:05 +08:00
avogar	ecb0e9844c	Disable cache in perf test	2023-08-23 21:01:18 +00:00
avogar	68e3af56d4	Address comments	2023-08-23 13:19:15 +00:00
Kruglov Pavel	e193aec583	Merge branch 'master' into fast-count-from-files	2023-08-23 12:15:34 +02:00
Kruglov Pavel	67c5c0203b	Merge branch 'master' into fast-count-from-files	2023-08-22 15:03:48 +02:00
Alexey Milovidov	037277c4a2	Remove bad test	2023-08-22 14:21:23 +02:00
Michael Kolupaev	d752611c43	Performance test	2023-08-21 14:15:52 -07:00
Kruglov Pavel	88aee95122	Merge branch 'master' into fast-count-from-files	2023-08-21 14:46:33 +02:00
avogar	47304bf7aa	Optimize count from files in most input formats	2023-08-21 12:30:52 +00:00
Alexey Milovidov	125169d9ae	Remove useless test	2023-08-20 03:51:30 +02:00
robot-ch-test-poll4	3aa9cb1267	Merge pull request #51399 from liuneng1994/optimize_nullable_aggragate_serialized_method Optimize aggregation performance of nullable String key when using AggregationMethodSerialized	2023-08-16 19:37:44 +02:00
liuneng	8a83301316	optimize	2023-08-08 13:38:25 +08:00
liuneng	f33367cd8b	add more test	2023-08-08 13:38:25 +08:00
liuneng	f96b9b7512	optimize fixed size column	2023-08-08 13:38:25 +08:00
liuneng	035dbdaf22	remove numbers optimization. It will decrease performance	2023-08-08 13:38:25 +08:00
liuneng	4f9920c71c	optimize performance of nullable String And Number column serializeValueIntoArena	2023-08-08 13:38:25 +08:00
Duc Canh Le	ad0ac43814	fix performance test	2023-08-07 06:25:46 +00:00
Duc Canh Le	ed2a1d7c9b	select required columns when getting join	2023-08-07 03:15:20 +00:00
JackyWoo	43ea21a4ce	make default optimize_uniq_to_count to true	2023-08-02 18:28:22 +08:00
JackyWoo	1c930f34de	reduce performance time	2023-08-02 18:10:01 +08:00
JackyWoo	162c674d74	remove settings in uniq_to_count	2023-08-02 10:50:04 +08:00
JackyWoo	ef3f5e2a7c	fix performance tests error	2023-08-02 10:15:56 +08:00
JackyWoo	93b28903cb	Merge branch 'master' into optimize_uniq_to_count	2023-08-02 10:13:22 +08:00
Jiebin Sun	78f3a575f9	Convert hashSets in parallel before merge (#50748) * Convert hashSets in parallel before merge Before the merge, if one of lhs and rhs is a singleLevelSet and the other is a twoLevelSet, the singleLevelSet calls convertToTwoLevel(). The conversion is not parallelized and costs many cycles if it consumes all the singleLevelSets. The idea of the patch is to convert all the singleLevelSets to twoLevelSets in parallel if the hashSets are neither all singleLevel nor all twoLevel. I have tested the patch on an Intel 2 x 112 vCPUs SPR server with ClickBench and the latest upstream ClickHouse. Q5 got a 264% performance improvement and 24 queries gained at least 5%. The overall geomean of the 43 queries improved by 7.4% over the base code. Signed-off-by: Jiebin Sun <jiebin.sun@intel.com> * add resize() for the data_vec in parallelizeMergePrepare() Signed-off-by: Jiebin Sun <jiebin.sun@intel.com> * Add the performance test prepare_hash_before_merge.xml Signed-off-by: Jiebin Sun <jiebin.sun@intel.com> * Fit the CI to rename the data set from hits_v1 to test.hits. Signed-off-by: Jiebin Sun <jiebin.sun@intel.com> * remove the redundant branch in UniqExactSet Co-authored-by: Nikita Taranov <nickita.taranov@gmail.com> * Remove the empty methods and add throw exception in parallelizeMergePrepare() Signed-off-by: Jiebin Sun <jiebin.sun@intel.com> --------- Signed-off-by: Jiebin Sun <jiebin.sun@intel.com> Co-authored-by: Nikita Taranov <nickita.taranov@gmail.com>	2023-07-27 15:06:34 +02:00

1117 Commits