ClickHouse

mirror of https://github.com/ClickHouse/ClickHouse.git synced 2024-11-28 10:31:57 +00:00

Author	SHA1	Message	Date
taiyang-li	b142489c3c	fix code style	2023-11-02 10:49:18 +08:00
lgbo-ustc	8334585eaf	improve parquet struct field reading	2023-11-01 15:18:39 +08:00
taiyang-li	c97b2c5be7	fix code style	2023-10-31 12:00:45 +08:00
taiyang-li	b72341e1a8	Merge branch 'master' into orc_tuple_field_prune	2023-10-31 10:07:43 +08:00
taiyang-li	38f24c0455	add performance tests	2023-10-30 20:29:43 +08:00
Alexey Milovidov	88440d4c07	Merge pull request #54568 from JackyWoo/optimize_uniq_to_count2 Resubmit optimization uniq to count	2023-10-30 01:33:36 +01:00
Alexey Milovidov	64b6e68a50	Merge pull request #55683 from amosbird/issue-55653 Reuse granule during skip index reading	2023-10-30 00:51:51 +01:00
frinkr	18c50c11b3	Multithreading after window functions (#50771) * feat: Preserve number of streams after evaluation the window functions to allow parallel stream processing * fix style * fix style * fix style * setting query_plan_preserve_num_streams_after_window_functions default true * fix tests by SETTINGS query_plan_preserve_num_streams_after_window_functions=0 * fix test references * Resize the streams after the last window function, to keep the order between WindowTransforms (and WindowTransform works on single stream anyway). * feat: Preserve number of streams after evaluation the window functions to allow parallel stream processing * fix style * fix style * fix style * setting query_plan_preserve_num_streams_after_window_functions default true * fix tests by SETTINGS query_plan_preserve_num_streams_after_window_functions=0 * fix test references * Resize the streams after the last window function, to keep the order between WindowTransforms (and WindowTransform works on single stream anyway). * add perf test * perf: change the dataset from 50M to 5M * rename query_plan_preserve_num_streams_after_window_functions -> query_plan_enable_multithreading_after_window_functions * update test reference * fix clang-tidy --------- Co-authored-by: Nikita Taranov <nikita.taranov@clickhouse.com>	2023-10-27 12:36:28 +02:00
lgbo	489e6d9bdc	Optimization for getting value from map, `arrayElement`(1/2) (#55929)	2023-10-27 09:54:25 +02:00
Maksim Kita	aa5fc05a55	Revert "Merge pull request #55682 from ClickHouse/revert-35961-decimal-column-improve-get-permutation" This reverts commit `f6dee5fe3c`, reversing changes made to `f96bda1deb`.	2023-10-25 21:48:13 +03:00
李扬	465962df7f	Support orc filter push down (file + stripe + rowgroup level) (#55330) * support orc filter push down * update orc lib version * replace setqueryinfo with setkeycondition * fix issue https://github.com/ClickHouse/ClickHouse/issues/53536 * refactor source with key condition * fix building error * remove std::cout * update orc * update orc version * fix bugs * improve code * upgrade orc lib * fix code style * change as requested * add performance tests for orc filter push down * add performance tests for orc filter push down * fix all bugs * fix default as null issue * add uts for null as default issues * upgrade orc lib * fix failed orc lib uts and fix typo * fix failed uts * fix failed uts * fix ast fuzzer tests * fix bug of uint64 overflow in https://s3.amazonaws.com/clickhouse-test-reports/55330/de22fdcaea2e12c96f300e95f59beba84401712d/fuzzer_astfuzzerubsan/report.html * fix asan fatal caused by reused column vector batch in native orc input format. refer to https://s3.amazonaws.com/clickhouse-test-reports/55330/be39d23af2d7e27f5ec7f168947cf75aeaabf674/stateless_tests__asan__[4_4].htm * fix wrong performance tests * disable 02892_orc_filter_pushdown on aarch64. https://s3.amazonaws.com/clickhouse-test-reports/55330/be39d23af2d7e27f5ec7f168947cf75aeaabf674/stateless_tests__aarch64_.html * add some comments * add some comments * inline range::equals and range::less * fix data race of key condition * trigger ci	2023-10-24 12:08:17 -07:00
Robert Schulze	68c3f41b71	Fix performance tests	2023-10-24 08:56:09 +00:00
Duc Canh Le	5923e1b116	Cache cast function in set during execution (#55712) * Cache cast function in set during execution Signed-off-by: Duc Canh Le <duccanh.le@ahrefs.com> * minor fix for performance test Signed-off-by: Duc Canh Le <duccanh.le@ahrefs.com> * Update src/Interpreters/castColumn.cpp Co-authored-by: Nikita Taranov <nickita.taranov@gmail.com> * improvement Signed-off-by: Duc Canh Le <duccanh.le@ahrefs.com> * fix use-after-free Signed-off-by: Duc Canh Le <duccanh.le@ahrefs.com> --------- Signed-off-by: Duc Canh Le <duccanh.le@ahrefs.com> Co-authored-by: Nikita Taranov <nickita.taranov@gmail.com>	2023-10-23 13:31:44 +02:00
Amos Bird	602f01f651	Reuse granule during skip index reading	2023-10-18 14:40:34 +08:00
Alexey Milovidov	2da1ff4b0d	Revert "Improve ColumnDecimal, ColumnVector getPermutation performance using pdqsort with RadixSort"	2023-10-16 19:07:11 +03:00
Duc Canh Le	16687632da	add format Null to performance tests Signed-off-by: Duc Canh Le <duccanh.le@ahrefs.com>	2023-10-14 08:01:55 +00:00
Duc Canh Le	285ae778e4	Merge branch 'master' into final_no_copy Fix flaky test 02447_drop_database_replica	2023-10-14 03:34:42 +00:00
Duc Canh Le	dcc464b4da	add more performance test Signed-off-by: Duc Canh Le <duccanh.le@ahrefs.com>	2023-10-13 08:16:54 +00:00
Maksim Kita	d4f9e0de12	Added performance test	2023-10-11 19:01:00 +03:00
Robert Schulze	2848548c63	Merge remote-tracking branch 'rschu1ze/master' into arrayFold	2023-10-08 16:32:36 +00:00
JackyWoo	826f7ac7eb	Merge branch 'master' into optimize_uniq_to_count2	2023-09-27 09:14:28 +08:00
Maksim Kita	40be8227ea	Fixed tests	2023-09-25 17:29:42 +03:00
Maksim Kita	1de95d8c36	Updated implementation	2023-09-25 17:29:42 +03:00
Maksim Kita	0a9835d085	Added performance tests	2023-09-25 17:29:42 +03:00
JackyWoo	231d16040b	Merge branch 'master' into optimize_uniq_to_count2	2023-09-19 10:29:03 +08:00
robot-clickhouse	51851ecc21	Merge pull request #54613 from bigo-sg/improve_json_query Improve json sql functions by serializing json element into column's buffer directly	2023-09-15 19:35:30 +02:00
lgbo-ustc	fa0f9b0e1f	update benchmark test	2023-09-15 18:09:58 +08:00
lgbo-ustc	e8d217634e	improve json sql functions by serializing data into column directly	2023-09-14 12:41:17 +08:00
Sema Checherinda	8a9b544a97	Merge branch 'master' into optimize_all_lonely_parts	2023-09-13 16:07:19 +02:00
JackyWoo	70a262a775	Add optimization uniq to count	2023-09-13 16:16:11 +08:00
Alexey Milovidov	bd4aec0601	Revert "Optimize uniq to count"	2023-09-13 09:14:06 +03:00
JackyWoo	d065ac32e0	Merge branch 'master' into optimize_uniq_to_count	2023-09-04 10:06:36 +08:00
Duc Canh Le	06afe0c2aa	more stable stateless test + add perf. test	2023-08-31 06:27:06 +00:00
Jiebin Sun	7c529e5691	Optimize the merge if all hashSets are singleLevel in UniqExactSet (#52973 ) * Optimize the merge if all hashSets are singleLevel In PR(https://github.com/ClickHouse/ClickHouse/pull/50748), it has added new phase `parallelizeMergePrepare` before merge if all the hashSets are not all singleLevel or not all twoLevel. Then it will convert all the singleLevelSet to twoLevelSet in parallel, which will increase the CPU utilization and QPS. But if all the hashtables are singleLevel, it could also benefit from the `parallelizeMergePrepare` optimization in most cases if the hashtable size are not too small. By tuning the Query `SELECT COUNT(DISTINCT SearchPhase) FROM hits_v1` in different threads, we have got the mild threshold 6,000. Test patch with the Query 'SELECT COUNT(DISTINCT Title) FROM hits_v1' on 2x80 vCPUs server. If the threads are less than 48, the hashSets are all twoLevel or mixed by singleLevel and twoLevel. If the threads are over 56, all the hashSets are singleLevel. And the QPS has got at most 2.35x performance gain. Threads Opt/Base 8 100.0% 16 99.4% 24 110.3% 32 99.9% 40 99.3% 48 99.8% 56 183.0% 64 234.7% 72 233.1% 80 229.9% 88 224.5% 96 229.6% 104 235.1% 112 229.5% 120 229.1% 128 217.8% 136 222.9% 144 217.8% 152 204.3% 160 203.2% Signed-off-by: Jiebin Sun <jiebin.sun@intel.com> * Add the comment and explanation for PR#52973 Signed-off-by: Jiebin Sun <jiebin.sun@intel.com> --------- Signed-off-by: Jiebin Sun <jiebin.sun@intel.com>	2023-08-30 11:26:16 +02:00
JackyWoo	a963048e1a	Merge branch 'master' into optimize_uniq_to_count	2023-08-28 11:10:05 +08:00
avogar	ecb0e9844c	Disable cache in perf test	2023-08-23 21:01:18 +00:00
avogar	68e3af56d4	Address comments	2023-08-23 13:19:15 +00:00
Kruglov Pavel	e193aec583	Merge branch 'master' into fast-count-from-files	2023-08-23 12:15:34 +02:00
Kruglov Pavel	67c5c0203b	Merge branch 'master' into fast-count-from-files	2023-08-22 15:03:48 +02:00
Alexey Milovidov	037277c4a2	Remove bad test	2023-08-22 14:21:23 +02:00
Michael Kolupaev	d752611c43	Performance test	2023-08-21 14:15:52 -07:00
Kruglov Pavel	88aee95122	Merge branch 'master' into fast-count-from-files	2023-08-21 14:46:33 +02:00
avogar	47304bf7aa	Optimize count from files in most input formats	2023-08-21 12:30:52 +00:00
Alexey Milovidov	125169d9ae	Remove useless test	2023-08-20 03:51:30 +02:00
robot-ch-test-poll4	3aa9cb1267	Merge pull request #51399 from liuneng1994/optimize_nullable_aggragate_serialized_method Optimize aggregation performance of nullable String key when use AggregationMethodSerialized	2023-08-16 19:37:44 +02:00
liuneng	8a83301316	optimize	2023-08-08 13:38:25 +08:00
liuneng	f33367cd8b	add more test	2023-08-08 13:38:25 +08:00
liuneng	f96b9b7512	optimize fixed size column	2023-08-08 13:38:25 +08:00
liuneng	035dbdaf22	remove numbers optimization. It will decrease performance	2023-08-08 13:38:25 +08:00
liuneng	4f9920c71c	optimize performance of nullable String And Number column serializeValueIntoArena	2023-08-08 13:38:25 +08:00
Duc Canh Le	ad0ac43814	fix performance test	2023-08-07 06:25:46 +00:00
Duc Canh Le	ed2a1d7c9b	select required columns when getting join	2023-08-07 03:15:20 +00:00
JackyWoo	43ea21a4ce	make default optimize_uniq_to_count to true	2023-08-02 18:28:22 +08:00
JackyWoo	1c930f34de	reduce performance time	2023-08-02 18:10:01 +08:00
JackyWoo	162c674d74	remove settings in uniq_to_count	2023-08-02 10:50:04 +08:00
JackyWoo	ef3f5e2a7c	fix performance tests error	2023-08-02 10:15:56 +08:00
JackyWoo	93b28903cb	Merge branch 'master' into optimize_uniq_to_count	2023-08-02 10:13:22 +08:00
Jiebin Sun	78f3a575f9	Convert hashSets in parallel before merge (#50748 ) * Convert hashSets in parallel before merge Before merge, if one of the lhs and rhs is singleLevelSet and the other is twoLevelSet, then the SingleLevelSet will call convertToTwoLevel(). The convert process is not in parallel and it will cost lots of cycle if it cosume all the singleLevelSet. The idea of the patch is to convert all the singleLevelSets to twoLevelSets in parallel if the hashsets are not all singleLevel or not all twoLevel. I have tested the patch on Intel 2 x 112 vCPUs SPR server with clickbench and latest upstream ClickHouse. Q5 has got a big 264% performance improvement and 24 queries have got at least 5% performance gain. The overall geomean of 43 queries has gained 7.4% more than the base code. Signed-off-by: Jiebin Sun <jiebin.sun@intel.com> * add resize() for the data_vec in parallelizeMergePrepare() Signed-off-by: Jiebin Sun <jiebin.sun@intel.com> * Add the performance test prepare_hash_before_merge.xml Signed-off-by: Jiebin Sun <jiebin.sun@intel.com> * Fit the CI to rename the data set from hits_v1 to test.hits. Signed-off-by: Jiebin Sun <jiebin.sun@intel.com> * remove the redundant branch in UniqExactSet Co-authored-by: Nikita Taranov <nickita.taranov@gmail.com> * Remove the empty methods and add throw exception in parallelizeMergePrepare() Signed-off-by: Jiebin Sun <jiebin.sun@intel.com> --------- Signed-off-by: Jiebin Sun <jiebin.sun@intel.com> Co-authored-by: Nikita Taranov <nickita.taranov@gmail.com>	2023-07-27 15:06:34 +02:00
JackyWoo	5f47aacef2	add performance tests	2023-07-27 15:41:16 +08:00
JackyWoo	95c41f49e0	not change projection columns	2023-07-27 15:41:16 +08:00
robot-ch-test-poll4	110500049a	Merge pull request #50532 from nickitat/more_pushdown_for_right_side_of_join Push down to right side of a join in more cases	2023-07-26 14:43:57 +02:00
Nikita Taranov	b2acbe42b7	add perf test	2023-07-24 20:34:01 +02:00
Igor Nikonov	91f7185e8c	Merge branch 'master' into remove-perf-test-duplicate-order-by-and-distinct	2023-07-24 18:47:23 +02:00
Igor Nikonov	90e393ecf6	Merge remote-tracking branch 'origin/master' into remove-perf-test-duplicate-order-by-and-distinct	2023-07-18 14:26:22 +00:00
Alexey Milovidov	62bfa4ed93	Fix performance test for regexp cache	2023-07-09 02:21:48 +02:00
vdimir	737cff7e57	Remove whole join_set_filter.xml, will resubmit	2023-07-03 17:00:20 +02:00
vdimir	9ea5d929a5	Update tests/performance/join_set_filter.xml	2023-07-03 17:00:20 +02:00
vdimir	ebd7ecb230	Remove unstable queries from performance/join_set_filter	2023-07-03 17:00:20 +02:00
Igor Nikonov	35bc97e5f9	Merge remote-tracking branch 'origin/master' into remove-perf-test-duplicate-order-by-and-distinct	2023-06-16 20:56:56 +00:00
Azat Khuzhin	5caa3a9e80	Adjust min_insert_block_size_rows for materialized_view_parallelize_output_from_storages Otherwise it is too slow for perf tests on CI [1]. [1]: https://s3.amazonaws.com/clickhouse-test-reports/50214/e287ec50920c7cadabea6ec19ef14b353345ac93/performance_comparison_[3_4]/report.html Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2023-06-14 19:11:23 +03:00
Azat Khuzhin	3e419730c3	Disable parallelize_output_from_storages for processing MATERIALIZED VIEWs Adding more processors for parallelize_output_from_storages is not a costless operation (I've experienced some issues in production because of this), and it is not easy to fix in a normal way, so let's disable it for now. Before this patch: - INSERT INTO input SELECT * FROM numbers(10e6) SETTINGS parallelize_output_from_storages=1, min_insert_block_size_rows=1000 0 rows in set. Elapsed: 3.648 sec. Processed 20.00 million rows, 120.00 MB (5.48 million rows/s., 32.90 MB/s.) - INSERT INTO input SELECT * FROM numbers(10e6) SETTINGS parallelize_output_from_storages=0, min_insert_block_size_rows=1000 0 rows in set. Elapsed: 1.851 sec. Processed 20.00 million rows, 120.00 MB (10.80 million rows/s., 64.82 MB/s.) Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2023-06-14 19:11:23 +03:00
Igor Nikonov	79f53f428b	Merge branch 'master' into remove-perf-test-duplicate-order-by-and-distinct	2023-06-13 13:45:36 +02:00
flynn	92c87dedad	Add parallel state merge for some other combinator except If (#50413) * Add parallel state merge for some other combinator except If * add test * update test	2023-06-08 00:41:32 +02:00
flynn	f616314f8b	fix typo	2023-05-29 02:22:13 +00:00
flynn	05783f99cd	update test	2023-05-28 14:17:59 +00:00
flynn	ec82c657eb	Parallel merge of uniqExactIf states	2023-05-28 06:04:23 +00:00
Azat Khuzhin	2996b38606	Add ability to configure maximum load factor for the HASHED/SPARSE_HASHED layout As it turns out, HashMap/PackedHashMap works great even with max load factor of 0.99. By "great" I mean it least it works faster then google sparsehash, and not to mention it's friendliness to the memory allocator (it has zero fragmentation since it works with a continuious memory region, in comparison to the sparsehash that doing lots of realloc, which jemalloc does not like, due to it's slabs). Here is a table of different setups: settings \| load (sec) \| read (sec) \| read (million rows/s) \| bytes_allocated \| RSS - \| - \| - \| - \| - \| - HASHED upstream \| - \| - \| - \| - \| 35GiB SPARSE_HASHED upstream \| - \| - \| - \| - \| 26GiB - \| - \| - \| - \| - \| - sparse_hash_map glibc hashbench \| - \| - \| - \| - \| 17.5GiB sparse_hash_map packed allocator \| 101.878 \| 231.48 \| 4.32 \| - \| 17.7GiB PackedHashMap 0.5 \| 15.514 \| 42.35 \| 23.61 \| 20GiB \| 22GiB hashed 0.95 \| 34.903 \| 115.615 \| 8.65 \| 16GiB \| 18.7GiB PackedHashMap 0.95 \| 93.6 \| 19.883 \| 10.68 \| 10GiB \| 12.8GiB PackedHashMap 0.99 \| 26.113 \| 83.6 \| 11.96 \| 10GiB \| 12.3GiB As it shows, PackedHashMap with 0.95 max_load_factor, eats 2.6x less memory then SPARSE_HASHED in upstream, and it also 2x faster for read! v2: fix grower Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2023-05-19 06:07:21 +02:00
Azat Khuzhin	7b5d156cc5	Optimize SPARSE_HASHED layout (by using PackedHashMap) In case you want dictionary optimized for memory, SPARSE_HASHED is not always gives you what you need. Consider the following example <UInt64, UInt16> as <Key, Value>, but this pair will also have a 6 byte padding (on amd64), so this is almost 40% of space wastage. And because of this padding, even google::sparse_hash_map, does not make picture better, in fact, sparse_hash_map is not very friendly to memory allocators (especially jemalloc). Here are some numbers for dictionary with 1e9 elements and UInt64 as key, and UInt16 as value: settings \| load (sec) \| read (sec) \| read (million rows/s) \| bytes_allocated \| RSS HASHED upstream \| - \| - \| - \| - \| 35GiB SPARSE_HASHED upstream \| - \| - \| - \| - \| 26GiB - \| - \| - \| - \| - \| - sparse_hash_map glibc hashbench \| - \| - \| - \| - \| 17.5GiB sparse_hash_map packed allocator \| 101.878 \| 231.48 \| 4.32 \| - \| 17.7GiB PackedHashMap \| 15.514 \| 42.35 \| 23.61 \| 20GiB \| 22GiB As you can see PackedHashMap looks way more better then HASHED, and even better then SPARSE_HASHED, but slightly worse then sparse_hash_map with packed allocator (it is done with a custom patch to google sparse_hash_map). v2: rebase on top of bucket_count fix Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2023-05-19 06:07:21 +02:00
Lirikl	39a505e3f3	init	2023-05-11 22:54:00 +03:00
lgbo-ustc	a07359fbe8	enable used flags's reinit only when the hash table rehash	2023-05-11 11:06:13 +08:00
Alexey Milovidov	8a6e07f0ea	Make projections production-ready	2023-05-10 03:35:13 +02:00
Alexey Milovidov	f449df85b6	Deprecate in-memory parts	2023-05-03 00:31:09 +02:00
Alexey Milovidov	c279516ac1	Merge branch 'master' into parallel-reading-from-file	2023-04-10 08:02:50 +03:00
Igor Nikonov	8fdc2b3326	Perf test	2023-04-07 20:06:11 +00:00
Anton Popov	10d2b1330b	add perf test	2023-04-04 21:29:52 +00:00
Anton Popov	1e79245b94	add tests	2023-03-28 17:20:05 +00:00
Ongkong	d9c7bc1859	Fix ASOF LEFT JOIN performance degradation (#47544)	2023-03-18 23:53:00 +01:00
LiuNeng	d4c5ab9dcd	Optimize one nullable key aggregate performance (#45772)	2023-03-02 21:01:52 +01:00
Igor Nikonov	2f7aa8849b	Merge branch 'master' into remove-perf-test-duplicate-order-by-and-distinct	2023-03-02 20:48:28 +01:00
Igor Nikonov	548d79c2e8	Remove perf test duplicate_order_by_and_distinct.xml	2023-03-02 12:31:09 +00:00
Alexander Gololobov	f64d08bd5c	Enable lightweight delete support by default	2023-03-01 19:35:55 +01:00
Nikita Taranov	ab44740efb	Enable perf tests added in #45364 (#46623)	2023-02-28 00:26:11 +01:00
Alexey Milovidov	17992b178a	Merge pull request #45364 from nickitat/aggr_partitions_independently Add option to aggregate partitions independently	2023-02-19 17:44:18 +03:00
Alexey Milovidov	417158f59f	Merge branch 'master' into lower_upper	2023-02-19 04:05:10 +03:00
Nikita Taranov	f70044f34b	Merge branch 'master' into aggr_partitions_independently	2023-02-18 13:19:05 +00:00
Alexey Milovidov	6e0dab71ed	Merge pull request #46188 from bigo-sg/rewrite_array_exists Rewrite array exists to has	2023-02-12 05:53:22 +03:00
Alexey Milovidov	786aa069e1	Merge pull request #46187 from ClickHouse/speed-up-count-digits Speed up `countDigits`	2023-02-10 07:41:12 +03:00
taiyang-li	b83ad6bb81	add perf test	2023-02-09 12:30:50 +08:00
Alexey Milovidov	9a86d0087c	Add performance test	2023-02-09 04:52:33 +01:00
Alexey Milovidov	66043eec24	Merge branch 'master' into decimal-performance	2023-02-09 04:59:37 +03:00

1175 Commits