Kruglov Pavel
10535132c3
Merge pull request #59385 from Avogar/fix-bad-types-check
...
Fix validating suspicious/experimental types in nested types
2024-02-21 14:38:01 +01:00
Robert Schulze
7d354164a5
Add performance test for dotProduct()
2024-02-20 21:41:10 +00:00
taiyang-li
2d6b4b400c
Merge remote-tracking branch 'origin/master' into opt_sum_decimal
2024-02-19 12:25:22 +08:00
avogar
109720d162
Merge branch 'master' of github.com:ClickHouse/ClickHouse into fix-bad-types-check
2024-02-15 12:10:49 +00:00
Raúl Marín
11519f949b
Merge pull request #59731 from kitaisreal/asof-join-try-sort-with-radix-sort
...
ASOF JOIN use trySort with RadixSort
2024-02-14 15:54:22 +01:00
avogar
64779835fa
Update tests
2024-02-14 12:48:05 +00:00
李扬
90d07ba82c
Trivial optimization of function coalesce. (#59627)
...
* reuse result of functionfactory::get
* add perf test
* Update src/Functions/coalesce.cpp
Co-authored-by: János Benjamin Antal <antaljanosbenjamin@users.noreply.github.com>
* change as requested
---------
Co-authored-by: János Benjamin Antal <antaljanosbenjamin@users.noreply.github.com>
2024-02-14 11:29:45 +01:00
Maksim Kita
2caf3f0fbb
Fixed tests
2024-02-13 15:41:17 +03:00
Maksim Kita
a359ceecb5
ASOF JOIN use trySort with RadixSort
2024-02-13 15:41:17 +03:00
Raúl Marín
d17b0867e6
Merge remote-tracking branch 'blessed/master' into argmin_optimization
2024-02-09 13:08:23 +01:00
taiyang-li
549b77021d
add some perf tests
2024-02-04 15:55:22 +08:00
taiyang-li
ddc6aad8ff
merge master
2024-02-01 10:58:33 +08:00
Raúl Marín
f67bff12b7
Merge pull request #59148 from bigo-sg/improve_if_with_floating
...
Continue optimizing branch miss of if function when result type is float*/decimal*/int*
2024-01-31 12:42:06 +01:00
taiyang-li
2ad7607bad
opt if when input type is map
2024-01-31 17:33:47 +08:00
Maksim Kita
20c1f0c18f
Revert "Revert "Add new aggregation function groupArraySorted()""
2024-01-30 17:15:29 +03:00
Raúl Marín
cda39e64e4
Perf: Only consider XML files
2024-01-29 17:47:50 +01:00
Raúl Marín
6a2fcb778f
Restore comment
2024-01-29 13:07:30 +01:00
Raúl Marín
c79a151cca
Simplify query_run_metric_arrays in perf tests
2024-01-29 13:00:49 +01:00
taiyang-li
49fc8a7099
Merge branch 'master' into improve_if_with_floating
2024-01-29 11:02:05 +08:00
Azat Khuzhin
44e42052b1
Fix perf tests after sumMap starts to filter out -0.
...
Previously, perf tests were relying on the following:
SELECT sumMap(['foo', 'bar'], [-0., -0.])
┌─sumMap(['foo', 'bar'], [-0., -0.])─┐
│ (['bar','foo'],[-0,-0]) │
└────────────────────────────────────┘
This has since changed, and now:
┌─sumMap(['foo', 'bar'], [-0., -0.])─┐
│ ([],[]) │
└────────────────────────────────────┘
But it works for nan:
SELECT sumMap(['foo', 'bar'], [nan, nan])
┌─sumMap(['foo', 'bar'], [nan, nan])─┐
│ (['bar','foo'],[nan,nan]) │
└────────────────────────────────────┘
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2024-01-27 13:14:07 +01:00
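One plausible reading of the behaviour described in the commit above, offered as an assumption rather than a description of the actual sumMap implementation: if the filter drops values that compare equal to zero, then `-0.` is dropped because IEEE 754 negative zero compares equal to zero, while `nan` compares unequal to everything and survives. A minimal Python sketch (the `drop_zero_values` helper is hypothetical and not part of ClickHouse):

```python
import math

def drop_zero_values(keys, values):
    """Hypothetical helper: keep only pairs whose value does not compare equal to zero."""
    pairs = sorted(zip(keys, values), key=lambda kv: kv[0])  # sort by key only
    kept = [(k, v) for k, v in pairs if v != 0]
    return [k for k, _ in kept], [v for _, v in kept]

# -0.0 == 0 is True, so both entries are filtered out.
neg_zero_result = drop_zero_values(["foo", "bar"], [-0.0, -0.0])

# nan != 0 is True, so both entries are kept.
nan_result = drop_zero_values(["foo", "bar"], [math.nan, math.nan])
```

This is why a perf test that needs non-empty sumMap output can use `nan` where it previously used `-0.`.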
taiyang-li
56db02c953
Merge branch 'master' into improve_if_with_floating
2024-01-26 19:21:49 +08:00
taiyang-li
e8629cf4f5
add another perf and tests
2024-01-25 18:02:55 +08:00
taiyang-li
a657a2631f
add perf tests
2024-01-24 19:58:07 +08:00
Robert Schulze
a4c6f87fb9
Further reduce runtime of norm_distance.xml
2024-01-23 10:52:52 +00:00
Raúl Marín
1e00dec997
Merge remote-tracking branch 'blessed/master' into argmin_optimization
2024-01-22 14:56:55 +01:00
Dmitry Novik
a18a8d8ea3
Merge pull request #59009 from nickitat/uniq_optimisation_for_distributed
...
`uniqExact` state parallel merging for distributed queries
2024-01-22 14:05:02 +01:00
Nikita Taranov
2b5482be8c
add perf test
2024-01-19 17:57:11 +01:00
Raúl Marín
3739d46817
Merge remote-tracking branch 'blessed/master' into argmin_optimization
2024-01-19 14:43:44 +01:00
Robert Schulze
aa2d36e598
Reduce memory consumption of norm_distance.xml
...
The test data was stored in in-memory Array and Tuple tables, each
consuming about 80 GB. The population of the Tuple tables exceeded the
maximum memory thresholds (*).
As a fix,
- Reduce the Array cardinality from 200 to 150 and the number of rows
from 10 to 8 million, reducing memory consumption to ca. 50 GB. This
should pass the memory threshold and does not affect the test purpose.
- Don't test tuples. Due to their data layout, vector search is not
efficient anyways. Saves another 80 GB.
2024-01-19 13:09:15 +00:00
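The "ca. 50 GB" figure in the commit above can be sanity-checked with simple scaling, assuming (this is an assumption, not something the commit states) that memory grows linearly with the number of array elements, i.e. dimension times rows. The function name is illustrative only.

```python
def scaled_memory_gb(base_gb, base_dims, base_rows, new_dims, new_rows):
    """Scale a memory figure linearly with the total element count (assumed model)."""
    return base_gb * (new_dims * new_rows) / (base_dims * base_rows)

# 80 GB at dimension 200 x 10 million rows, reduced to dimension 150 x 8 million rows.
estimate_gb = scaled_memory_gb(80, 200, 10_000_000, 150, 8_000_000)
print(f"{estimate_gb:.0f} GB")
```

Under that assumption the reduction factor is 0.6, which gives 48 GB, consistent with the roughly 50 GB quoted.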
Robert Schulze
180a07ee4b
Delete redundant test norm_distance_float.xml
...
- Duplicates the workload in norm_distance.xml
2024-01-19 13:09:14 +00:00
Robert Schulze
25fb44f16d
Extend performance test norm_dist.xml
...
Dimension = 200 (instead of 10) is more realistic for vector search use cases.
2024-01-19 13:09:08 +00:00
Raúl Marín
b92073ed7b
Revert "Extend performance test norm_dist.xml"
2024-01-19 13:21:03 +01:00
Raúl Marín
6aa5ac4cc6
Merge remote-tracking branch 'blessed/master' into argmin_optimization
2024-01-18 12:04:19 +01:00
Robert Schulze
abcb1b5c9f
Extend performance test norm_dist.xml
...
Dimension = 200 (instead of 10) is more realistic for vector search use cases.
2024-01-17 18:16:29 +00:00
Raúl Marín
849ac1fe99
Implement findExtremeMinIndex / findExtremeMaxIndex
2024-01-17 16:22:40 +01:00
Raúl Marín
cb9cc7a4cc
Fix table names
2024-01-17 11:36:07 +01:00
Alexey Milovidov
1afcab35c1
Fix supply chain attack in performance tests
2024-01-14 08:25:12 +01:00
Duc Canh Le
458c8d758d
simplify perf tests and minor code change
...
Signed-off-by: Duc Canh Le <duccanh.le@ahrefs.com>
2024-01-11 08:25:35 +00:00
Duc Canh Le
6331d8a6f2
Merge branch 'master' into final_no_copy
...
Resolve conflicts
Signed-off-by: Duc Canh Le <duccanh.le@ahrefs.com>
2024-01-11 02:55:14 +00:00
Azat Khuzhin
3d88dba0a7
Fix perf tests duration (checks.test_duration_ms)
...
The column in the source was in seconds as Float32; we need to convert it to
milliseconds as UInt64.
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2024-01-10 17:03:53 +03:00
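The unit conversion described in the commit above amounts to multiplying by 1000 and converting to an unsigned integer. A minimal sketch in Python follows; the function name is hypothetical, and rounding to the nearest millisecond is an assumption (the real query may truncate instead).

```python
def seconds_to_duration_ms(seconds: float) -> int:
    """Convert a duration in seconds to whole, non-negative milliseconds."""
    # round() is an assumption here; truncation would also be a valid choice.
    return max(0, round(seconds * 1000))

sample_ms = seconds_to_duration_ms(0.2149)
```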
taiyang-li
231de4ac49
Merge branch 'master' into ch_opt_array_element
2024-01-10 15:49:43 +08:00
李扬
2c76b1789c
Update tests/performance/array_element.xml
...
Co-authored-by: Kruglov Pavel <48961922+Avogar@users.noreply.github.com>
2024-01-10 10:46:29 +08:00
Nikolay Degterinsky
24733700fb
Merge pull request #57745 from KevinyhZou/imporve_multi_if_nullable
...
Improve `MultiIf` function performance while type is nullable
2024-01-09 23:17:58 +01:00
taiyang-li
e5b4bc8f45
Merge branch 'master' into ch_opt_array_element
2024-01-09 17:17:38 +08:00
李扬
f6691ec334
Update tests/performance/array_element.xml
...
Co-authored-by: Kruglov Pavel <48961922+Avogar@users.noreply.github.com>
2024-01-09 10:15:37 +08:00
taiyang-li
5c0ea7db94
Merge branch 'master' into ch_opt_array_element
2024-01-08 10:41:20 +08:00
Raúl Marín
0522d859c2
Merge pull request #58334 from Algunenano/minmax_non_numeric
...
Speedup MIN/MAX for non numeric types
2024-01-04 19:48:18 +01:00
Raúl Marín
641caba5b0
Adapt more tests
2024-01-04 11:36:33 +00:00
Raúl Marín
1d1edd5b57
Reduce sum_map.xml
2024-01-04 11:31:20 +00:00
Raúl Marín
2aa6690f2c
Reduce hashed_dictionary.xml
2024-01-04 11:29:17 +00:00
Raúl Marín
39eaa8dc9c
Halve the size of reinterpret_as.xml
2024-01-04 11:24:36 +00:00
Raúl Marín
3c7ae2f171
Reduce bounding_ratio.xml
2024-01-04 11:20:07 +00:00
Raúl Marín
c195320612
Reduce the size of join_used_flags.xml
2024-01-03 17:31:55 +00:00
Raúl Marín
c223ae56d3
Reduce the size of decimal_parse
2024-01-03 17:29:30 +00:00
Raúl Marín
910b338584
Reduce polymorphic_parts_m
2024-01-03 17:24:15 +00:00
Raúl Marín
b8305e1a6e
Make test more reasonable
2024-01-03 17:19:44 +00:00
Raúl Marín
7ee1697971
Reduce setup time of min_max_index.xml
2024-01-03 17:16:45 +00:00
Raúl Marín
1c40700ea1
Merge remote-tracking branch 'blessed/master' into minmax_non_numeric
2024-01-03 14:09:28 +01:00
Duc Canh Le
d623702378
Merge branch 'master' into final_no_copy
...
Resolve conflicts
Signed-off-by: Duc Canh Le <duccanh.le@ahrefs.com>
2024-01-02 06:07:49 +00:00
Raúl Marín
db97a69989
Add perf tests with tuples
2023-12-29 17:00:01 +01:00
Azat Khuzhin
0d71bf1411
Upload time of the perf tests into artifacts as test_duration_ms
...
Now perf test changes/failures will have two rows: one for the new and one
for the old server.
I thought about uploading only the time of the test on the new server,
but because not all perf tests are uploaded, you cannot always get the time
of the test without the changes (i.e. from a run on the upstream/master
repo/branch).
<details>
Before:
```sql
SELECT
concat(test, ' #', toString(query_index)),
'slower' AS test_status,
0 AS test_duration_ms,
concat('https://s3.amazonaws.com/clickhouse-test-reports/$PR_TO_TEST/$SHA_TO_TEST/${CLICKHOUSE_PERFORMANCE_COMPARISON_CHECK_NAME_PREFIX}/report.html#changes-in-performance.', test, '.', toString(query_index)) AS report_url
FROM queries
WHERE (changed_fail != 0) AND (diff > 0)
UNION ALL
SELECT
concat(test, ' #', toString(query_index)),
'unstable' AS test_status,
0 AS test_duration_ms,
concat('https://s3.amazonaws.com/clickhouse-test-reports/$PR_TO_TEST/$SHA_TO_TEST/${CLICKHOUSE_PERFORMANCE_COMPARISON_CHECK_NAME_PREFIX}/report.html#unstable-queries.', test, '.', toString(query_index)) AS report_url
FROM queries
WHERE unstable_fail != 0
Query id: 49dfdc9a-f549-4499-9a1a-410e5053f6c1
┌─concat(test, ' #', toString(query_index))─┬─test_status─┬─test_duration_ms─┬─report_url─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ hashed_array_dictionary #16 │ slower │ 0 │ https://s3.amazonaws.com/clickhouse-test-reports/$PR_TO_TEST/$SHA_TO_TEST/${CLICKHOUSE_PERFORMANCE_COMPARISON_CHECK_NAME_PREFIX}/report.html#changes-in-performance.hashed_array_dictionary.16 │
│ ngram_distance #2 │ slower │ 0 │ https://s3.amazonaws.com/clickhouse-test-reports/$PR_TO_TEST/$SHA_TO_TEST/${CLICKHOUSE_PERFORMANCE_COMPARISON_CHECK_NAME_PREFIX}/report.html#changes-in-performance.ngram_distance.2 │
│ ngram_distance #3 │ slower │ 0 │ https://s3.amazonaws.com/clickhouse-test-reports/$PR_TO_TEST/$SHA_TO_TEST/${CLICKHOUSE_PERFORMANCE_COMPARISON_CHECK_NAME_PREFIX}/report.html#changes-in-performance.ngram_distance.3 │
│ ngram_distance #4 │ slower │ 0 │ https://s3.amazonaws.com/clickhouse-test-reports/$PR_TO_TEST/$SHA_TO_TEST/${CLICKHOUSE_PERFORMANCE_COMPARISON_CHECK_NAME_PREFIX}/report.html#changes-in-performance.ngram_distance.4 │
└───────────────────────────────────────────┴─────────────┴──────────────────┴────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
```
After:
```sql
SELECT
concat(test, ' #', toString(query_index), '::', test_desc_.1) AS test_name,
'slower' AS test_status,
test_desc_.2 AS test_duration_ms,
concat('https://s3.amazonaws.com/clickhouse-test-reports/$PR_TO_TEST/$SHA_TO_TEST/${CLICKHOUSE_PERFORMANCE_COMPARISON_CHECK_NAME_PREFIX}/report.html#changes-in-performance.', test, '.', toString(query_index)) AS report_url
FROM queries
ARRAY JOIN map('old', left, 'new', right) AS test_desc_
WHERE (changed_fail != 0) AND (diff > 0)
UNION ALL
SELECT
concat(test, ' #', toString(query_index), '::', test_desc_.1) AS test_name,
'unstable' AS test_status,
test_desc_.2 AS test_duration_ms,
concat('https://s3.amazonaws.com/clickhouse-test-reports/$PR_TO_TEST/$SHA_TO_TEST/${CLICKHOUSE_PERFORMANCE_COMPARISON_CHECK_NAME_PREFIX}/report.html#unstable-queries.', test, '.', toString(query_index)) AS report_url
FROM queries
ARRAY JOIN map('old', left, 'new', right) AS test_desc_
WHERE unstable_fail != 0
Query id: 20475bfd-754b-4159-aa16-7798f4720bf8
┌─test_name────────────────────────┬─test_status─┬─test_duration_ms─┬─report_url─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ hashed_array_dictionary #16::old │ slower │ 0.2149 │ https://s3.amazonaws.com/clickhouse-test-reports/$PR_TO_TEST/$SHA_TO_TEST/${CLICKHOUSE_PERFORMANCE_COMPARISON_CHECK_NAME_PREFIX}/report.html#changes-in-performance.hashed_array_dictionary.16 │
│ hashed_array_dictionary #16::new │ slower │ 0.2519 │ https://s3.amazonaws.com/clickhouse-test-reports/$PR_TO_TEST/$SHA_TO_TEST/${CLICKHOUSE_PERFORMANCE_COMPARISON_CHECK_NAME_PREFIX}/report.html#changes-in-performance.hashed_array_dictionary.16 │
│ ngram_distance #2::old │ slower │ 0.3598 │ https://s3.amazonaws.com/clickhouse-test-reports/$PR_TO_TEST/$SHA_TO_TEST/${CLICKHOUSE_PERFORMANCE_COMPARISON_CHECK_NAME_PREFIX}/report.html#changes-in-performance.ngram_distance.2 │
│ ngram_distance #2::new │ slower │ 0.4425 │ https://s3.amazonaws.com/clickhouse-test-reports/$PR_TO_TEST/$SHA_TO_TEST/${CLICKHOUSE_PERFORMANCE_COMPARISON_CHECK_NAME_PREFIX}/report.html#changes-in-performance.ngram_distance.2 │
│ ngram_distance #3::old │ slower │ 0.3644 │ https://s3.amazonaws.com/clickhouse-test-reports/$PR_TO_TEST/$SHA_TO_TEST/${CLICKHOUSE_PERFORMANCE_COMPARISON_CHECK_NAME_PREFIX}/report.html#changes-in-performance.ngram_distance.3 │
│ ngram_distance #3::new │ slower │ 0.4716 │ https://s3.amazonaws.com/clickhouse-test-reports/$PR_TO_TEST/$SHA_TO_TEST/${CLICKHOUSE_PERFORMANCE_COMPARISON_CHECK_NAME_PREFIX}/report.html#changes-in-performance.ngram_distance.3 │
│ ngram_distance #4::old │ slower │ 0.3577 │ https://s3.amazonaws.com/clickhouse-test-reports/$PR_TO_TEST/$SHA_TO_TEST/${CLICKHOUSE_PERFORMANCE_COMPARISON_CHECK_NAME_PREFIX}/report.html#changes-in-performance.ngram_distance.4 │
│ ngram_distance #4::new │ slower │ 0.4577 │ https://s3.amazonaws.com/clickhouse-test-reports/$PR_TO_TEST/$SHA_TO_TEST/${CLICKHOUSE_PERFORMANCE_COMPARISON_CHECK_NAME_PREFIX}/report.html#changes-in-performance.ngram_distance.4 │
└──────────────────────────────────┴─────────────┴──────────────────┴────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
```
</details>
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-12-29 13:48:41 +01:00
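The `ARRAY JOIN map('old', left, 'new', right)` step in the commit above turns each query row into two rows, one per server. The same row expansion can be sketched in Python; `expand_old_new` and the dictionary layout are illustrative assumptions, not part of the perf test tooling.

```python
def expand_old_new(rows):
    """Emit one output row per (label, duration) pair, like ARRAY JOIN over a two-entry map."""
    out = []
    for row in rows:
        for label, duration in (("old", row["left"]), ("new", row["right"])):
            out.append({
                "test_name": f"{row['test']} #{row['query_index']}::{label}",
                "test_duration_ms": duration,
            })
    return out

# Sample values taken from the ngram_distance #2 rows in the output of that commit.
rows = [{"test": "ngram_distance", "query_index": 2, "left": 0.3598, "right": 0.4425}]
expanded = expand_old_new(rows)
```

Each input row yields an `::old` row carrying the `left` timing and a `::new` row carrying the `right` timing.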
Alexey Milovidov
0e678fb6c1
Merge pull request #56996 from ClickHouse/vdimir/hash_join_max_block_size
...
HashJoin respects max_joined_block_size_rows
2023-12-27 15:46:46 +01:00
Alexey Milovidov
f00337e2ba
Merge pull request #57872 from CurtizJ/optimize-aggregation-consecutive-keys
...
Better optimization of consecutive keys in aggregation
2023-12-27 15:44:22 +01:00
Raúl Marín
f8d9a850c7
Fix perf test README
2023-12-27 12:16:17 +00:00
kevinyhzou
d7d1541d81
add performance test xml
2023-12-27 19:43:59 +08:00
Duc Canh Le
476ca4246d
Merge branch 'master' into final_no_copy
...
Resolve conflicts + add some comments
Signed-off-by: Duc Canh Le <duccanh.le@ahrefs.com>
2023-12-27 07:00:58 +00:00
vdimir
afda7f1079
Add tests/performance/hashjoin_with_large_output.xml
...
Co-authored-by: liuneng <1398775315@qq.com>
2023-12-22 15:52:45 +00:00
zhanglistar
408f9ea1ae
Merge branch 'ClickHouse:master' into if-opt
2023-12-21 09:50:23 +08:00
zhanglistar
810305fafe
Fix if.xml.
2023-12-20 10:58:46 +08:00
zhanglistar
f2a02cca2f
if.xml add more perf tests.
2023-12-18 16:55:24 +08:00
zhanglistar
6d73c8b157
code format of if.xml
2023-12-18 16:48:27 +08:00
zhanglistar
ba34b80087
If improvement add comment and performance test.
2023-12-18 16:43:45 +08:00
Max K
84e5870b71
Reapply "improve CI with digest for docker, build and test jobs" (#57904)
...
* Revert "Revert "improve CI with digest for docker, build and test jobs""
* fix: docker manifest merge for missing images only
2023-12-18 09:07:22 +01:00
Anton Popov
5faf5e913e
slightly faster and perf test
2023-12-16 03:35:59 +00:00
Robert Schulze
b36a93a25d
Revert "Remove arrayFold"
...
This reverts commit 15dc0ed610.
2023-12-14 09:52:29 +00:00
Alexey Milovidov
15dc0ed610
Remove arrayFold
2023-12-14 04:34:32 +01:00
Alexey Milovidov
2988f6f92a
Revert "Add new aggregation function groupArraySorted()"
2023-12-05 15:31:17 +03:00
taiyang-li
4bbc0db154
Optimize array element function when input is array(map)/array(array(string))/array(array(number))
2023-12-05 15:57:43 +08:00
Alexey Milovidov
ae476002ef
Remove bad test
2023-12-05 06:56:57 +01:00
Yarik Briukhovetskyi
69205769d0
Merge branch 'ClickHouse:master' into group_sorted_array_function
2023-11-23 20:23:47 +01:00
yariks5s
cb6898b52f
fix due to review
2023-11-23 18:17:47 +00:00
Duc Canh Le
4b0382442f
Merge branch 'master' into final_no_copy
...
Try fixing CI issues
2023-11-22 08:53:38 +00:00
Alexander Tokmakov
28e0c51e3f
Update avg_weighted.xml (#56797)
2023-11-16 20:46:17 +01:00
taiyang-li
819c7e75ff
Merge branch 'master' into orc_tuple_field_prune
2023-11-02 17:52:05 +08:00
taiyang-li
b8665a610c
fix failed perf test
2023-11-02 15:27:48 +08:00
taiyang-li
b142489c3c
fix code style
2023-11-02 10:49:18 +08:00
lgbo-ustc
8334585eaf
improve parquet struct field reading
2023-11-01 15:18:39 +08:00
taiyang-li
c97b2c5be7
fix code style
2023-10-31 12:00:45 +08:00
taiyang-li
b72341e1a8
Merge branch 'master' into orc_tuple_field_prune
2023-10-31 10:07:43 +08:00
taiyang-li
38f24c0455
add performance tests
2023-10-30 20:29:43 +08:00
Alexey Milovidov
88440d4c07
Merge pull request #54568 from JackyWoo/optimize_uniq_to_count2
...
Resubmit optimization uniq to count
2023-10-30 01:33:36 +01:00
Alexey Milovidov
64b6e68a50
Merge pull request #55683 from amosbird/issue-55653
...
Reuse granule during skip index reading
2023-10-30 00:51:51 +01:00
frinkr
18c50c11b3
Multithreading after window functions (#50771)
...
* feat: Preserve number of streams after evaluating the window functions to allow parallel stream processing
* fix style
* fix style
* fix style
* setting query_plan_preserve_num_streams_after_window_functions default true
* fix tests by SETTINGS query_plan_preserve_num_streams_after_window_functions=0
* fix test references
* Resize the streams after the last window function, to keep the order between WindowTransforms (and WindowTransform works on single stream anyway).
* feat: Preserve number of streams after evaluating the window functions to allow parallel stream processing
* fix style
* fix style
* fix style
* setting query_plan_preserve_num_streams_after_window_functions default true
* fix tests by SETTINGS query_plan_preserve_num_streams_after_window_functions=0
* fix test references
* Resize the streams after the last window function, to keep the order between WindowTransforms (and WindowTransform works on single stream anyway).
* add perf test
* perf: change the dataset from 50M to 5M
* rename query_plan_preserve_num_streams_after_window_functions -> query_plan_enable_multithreading_after_window_functions
* update test reference
* fix clang-tidy
---------
Co-authored-by: Nikita Taranov <nikita.taranov@clickhouse.com>
2023-10-27 12:36:28 +02:00
lgbo
489e6d9bdc
Optimization for getting value from map, arrayElement (1/2) (#55929)
2023-10-27 09:54:25 +02:00
Maksim Kita
aa5fc05a55
Revert "Merge pull request #55682 from ClickHouse/revert-35961-decimal-column-improve-get-permutation"
...
This reverts commit f6dee5fe3c, reversing
changes made to f96bda1deb.
2023-10-25 21:48:13 +03:00
李扬
465962df7f
Support orc filter push down (file + stripe + rowgroup level) (#55330)
...
* support orc filter push down
* update orc lib version
* replace setqueryinfo with setkeycondition
* fix issue https://github.com/ClickHouse/ClickHouse/issues/53536
* refactor source with key condition
* fix building error
* remove std::cout
* update orc
* update orc version
* fix bugs
* improve code
* upgrade orc lib
* fix code style
* change as requested
* add performance tests for orc filter push down
* add performance tests for orc filter push down
* fix all bugs
* fix default as null issue
* add uts for null as default issues
* upgrade orc lib
* fix failed orc lib uts and fix typo
* fix failed uts
* fix failed uts
* fix ast fuzzer tests
* fix bug of uint64 overflow in https://s3.amazonaws.com/clickhouse-test-reports/55330/de22fdcaea2e12c96f300e95f59beba84401712d/fuzzer_astfuzzerubsan/report.html
* fix asan fatal caused by reused column vector batch in native orc input format. refer to https://s3.amazonaws.com/clickhouse-test-reports/55330/be39d23af2d7e27f5ec7f168947cf75aeaabf674/stateless_tests__asan__[4_4].htm
* fix wrong performance tests
* disable 02892_orc_filter_pushdown on aarch64. https://s3.amazonaws.com/clickhouse-test-reports/55330/be39d23af2d7e27f5ec7f168947cf75aeaabf674/stateless_tests__aarch64_.html
* add some comments
* add some comments
* inline range::equals and range::less
* fix data race of key condition
* trigger ci
2023-10-24 12:08:17 -07:00
Robert Schulze
68c3f41b71
Fix performance tests
2023-10-24 08:56:09 +00:00
Duc Canh Le
5923e1b116
Cache cast function in set during execution (#55712)
...
* Cache cast function in set during execution
Signed-off-by: Duc Canh Le <duccanh.le@ahrefs.com>
* minor fix for performance test
Signed-off-by: Duc Canh Le <duccanh.le@ahrefs.com>
* Update src/Interpreters/castColumn.cpp
Co-authored-by: Nikita Taranov <nickita.taranov@gmail.com>
* improvement
Signed-off-by: Duc Canh Le <duccanh.le@ahrefs.com>
* fix use-after-free
Signed-off-by: Duc Canh Le <duccanh.le@ahrefs.com>
---------
Signed-off-by: Duc Canh Le <duccanh.le@ahrefs.com>
Co-authored-by: Nikita Taranov <nickita.taranov@gmail.com>
2023-10-23 13:31:44 +02:00
Amos Bird
602f01f651
Reuse granule during skip index reading
2023-10-18 14:40:34 +08:00
Alexey Milovidov
2da1ff4b0d
Revert "Improve ColumnDecimal, ColumnVector getPermutation performance using pdqsort with RadixSort"
2023-10-16 19:07:11 +03:00
Duc Canh Le
16687632da
add format Null to performance tests
...
Signed-off-by: Duc Canh Le <duccanh.le@ahrefs.com>
2023-10-14 08:01:55 +00:00
Duc Canh Le
285ae778e4
Merge branch 'master' into final_no_copy
...
Fix flaky test 02447_drop_database_replica
2023-10-14 03:34:42 +00:00
Duc Canh Le
dcc464b4da
add more performance test
...
Signed-off-by: Duc Canh Le <duccanh.le@ahrefs.com>
2023-10-13 08:16:54 +00:00
Maksim Kita
d4f9e0de12
Added performance test
2023-10-11 19:01:00 +03:00
Robert Schulze
2848548c63
Merge remote-tracking branch 'rschu1ze/master' into arrayFold
2023-10-08 16:32:36 +00:00
JackyWoo
826f7ac7eb
Merge branch 'master' into optimize_uniq_to_count2
2023-09-27 09:14:28 +08:00
Maksim Kita
40be8227ea
Fixed tests
2023-09-25 17:29:42 +03:00
Maksim Kita
1de95d8c36
Updated implementation
2023-09-25 17:29:42 +03:00
Maksim Kita
0a9835d085
Added performance tests
2023-09-25 17:29:42 +03:00
JackyWoo
231d16040b
Merge branch 'master' into optimize_uniq_to_count2
2023-09-19 10:29:03 +08:00
robot-clickhouse
51851ecc21
Merge pull request #54613 from bigo-sg/improve_json_query
...
Improve json sql functions by serializing json element into column's buffer directly
2023-09-15 19:35:30 +02:00
lgbo-ustc
fa0f9b0e1f
update benchmark test
2023-09-15 18:09:58 +08:00
lgbo-ustc
e8d217634e
improve json sql functions by serializing data into column directly
2023-09-14 12:41:17 +08:00
Sema Checherinda
8a9b544a97
Merge branch 'master' into optimize_all_lonely_parts
2023-09-13 16:07:19 +02:00
JackyWoo
70a262a775
Add optimization uniq to count
2023-09-13 16:16:11 +08:00
Alexey Milovidov
bd4aec0601
Revert "Optimize uniq to count"
2023-09-13 09:14:06 +03:00
JackyWoo
d065ac32e0
Merge branch 'master' into optimize_uniq_to_count
2023-09-04 10:06:36 +08:00
Duc Canh Le
06afe0c2aa
more stable stateless test + add perf. test
2023-08-31 06:27:06 +00:00
Jiebin Sun
7c529e5691
Optimize the merge if all hashSets are singleLevel in UniqExactSet (#52973)
...
* Optimize the merge if all hashSets are singleLevel
PR https://github.com/ClickHouse/ClickHouse/pull/50748 added a new phase,
`parallelizeMergePrepare`, which runs before the merge when the hashSets are
neither all singleLevel nor all twoLevel. It converts all the singleLevelSets to
twoLevelSets in parallel, which increases CPU utilization and QPS.
If all the hash tables are singleLevel, the merge can also benefit from the
`parallelizeMergePrepare` optimization in most cases, provided the hash tables are
not too small. By tuning the query `SELECT COUNT(DISTINCT SearchPhase) FROM hits_v1`
with different thread counts, we arrived at a moderate threshold of 6,000.
We tested the patch with the query 'SELECT COUNT(DISTINCT Title) FROM hits_v1' on a 2x80 vCPU
server. With fewer than 48 threads, the hashSets are all twoLevel or a mix of
singleLevel and twoLevel. With over 56 threads, all the hashSets are singleLevel,
and QPS improves by up to 2.35x.
Threads Opt/Base
8 100.0%
16 99.4%
24 110.3%
32 99.9%
40 99.3%
48 99.8%
56 183.0%
64 234.7%
72 233.1%
80 229.9%
88 224.5%
96 229.6%
104 235.1%
112 229.5%
120 229.1%
128 217.8%
136 222.9%
144 217.8%
152 204.3%
160 203.2%
Signed-off-by: Jiebin Sun <jiebin.sun@intel.com>
* Add the comment and explanation for PR#52973
Signed-off-by: Jiebin Sun <jiebin.sun@intel.com>
---------
Signed-off-by: Jiebin Sun <jiebin.sun@intel.com>
2023-08-30 11:26:16 +02:00
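The decision rule described in the commit above can be sketched as a small predicate. This is a reading of the commit message, not the ClickHouse code: the function name and the `(is_single_level, size)` tuple layout are hypothetical, and comparing the threshold of 6,000 against the average set size is an assumption, since the message does not say exactly what the threshold is measured against.

```python
SINGLE_LEVEL_THRESHOLD = 6000  # value quoted in the commit message

def should_convert_in_parallel(sets):
    """sets: list of (is_single_level, size) tuples describing the hash sets to merge."""
    if not sets:
        return False
    single = sum(1 for is_single, _ in sets if is_single)
    if 0 < single < len(sets):
        # Mixed single-level and two-level sets: convert before merging.
        return True
    if single == len(sets):
        # All single-level: convert only if the sets are not too small
        # (assumption: judged by average size).
        average_size = sum(size for _, size in sets) / len(sets)
        return average_size > SINGLE_LEVEL_THRESHOLD
    # All two-level: nothing to convert.
    return False
```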
JackyWoo
a963048e1a
Merge branch 'master' into optimize_uniq_to_count
2023-08-28 11:10:05 +08:00
avogar
ecb0e9844c
Disable cache in perf test
2023-08-23 21:01:18 +00:00
avogar
68e3af56d4
Address comments
2023-08-23 13:19:15 +00:00
Kruglov Pavel
e193aec583
Merge branch 'master' into fast-count-from-files
2023-08-23 12:15:34 +02:00
Kruglov Pavel
67c5c0203b
Merge branch 'master' into fast-count-from-files
2023-08-22 15:03:48 +02:00
Alexey Milovidov
037277c4a2
Remove bad test
2023-08-22 14:21:23 +02:00
Michael Kolupaev
d752611c43
Performance test
2023-08-21 14:15:52 -07:00
Kruglov Pavel
88aee95122
Merge branch 'master' into fast-count-from-files
2023-08-21 14:46:33 +02:00
avogar
47304bf7aa
Optimize count from files in most input formats
2023-08-21 12:30:52 +00:00
Alexey Milovidov
125169d9ae
Remove useless test
2023-08-20 03:51:30 +02:00
robot-ch-test-poll4
3aa9cb1267
Merge pull request #51399 from liuneng1994/optimize_nullable_aggragate_serialized_method
...
Optimize aggregation performance of nullable String key when use AggregationMethodSerialized
2023-08-16 19:37:44 +02:00
liuneng
8a83301316
optimize
2023-08-08 13:38:25 +08:00
liuneng
f33367cd8b
add more test
2023-08-08 13:38:25 +08:00
liuneng
f96b9b7512
optimize fixed size column
2023-08-08 13:38:25 +08:00
liuneng
035dbdaf22
remove numbers optimization. It will decrease performance
2023-08-08 13:38:25 +08:00
liuneng
4f9920c71c
optimize performance of nullable String And Number column serializeValueIntoArena
2023-08-08 13:38:25 +08:00
Duc Canh Le
ad0ac43814
fix performance test
2023-08-07 06:25:46 +00:00
Duc Canh Le
ed2a1d7c9b
select required columns when getting join
2023-08-07 03:15:20 +00:00
JackyWoo
43ea21a4ce
make default optimize_uniq_to_count to true
2023-08-02 18:28:22 +08:00
JackyWoo
1c930f34de
reduce performance time
2023-08-02 18:10:01 +08:00
JackyWoo
162c674d74
remove settings in uniq_to_count
2023-08-02 10:50:04 +08:00
JackyWoo
ef3f5e2a7c
fix performance tests error
2023-08-02 10:15:56 +08:00
JackyWoo
93b28903cb
Merge branch 'master' into optimize_uniq_to_count
2023-08-02 10:13:22 +08:00
Jiebin Sun
78f3a575f9
Convert hashSets in parallel before merge (#50748)
...
* Convert hashSets in parallel before merge
Before the merge, if one of lhs and rhs is a singleLevelSet and the other is a twoLevelSet,
the singleLevelSet calls convertToTwoLevel(). The conversion is not parallel
and costs many cycles if it has to consume all the singleLevelSets.
The idea of the patch is to convert all the singleLevelSets to twoLevelSets in parallel if
the hashSets are neither all singleLevel nor all twoLevel.
I tested the patch on an Intel 2 x 112 vCPU SPR server with ClickBench and the latest upstream
ClickHouse.
Q5 improved by 264%, and 24 queries gained at least 5% in performance.
The overall geomean of the 43 queries is 7.4% better than the base code.
Signed-off-by: Jiebin Sun <jiebin.sun@intel.com>
* add resize() for the data_vec in parallelizeMergePrepare()
Signed-off-by: Jiebin Sun <jiebin.sun@intel.com>
* Add the performance test prepare_hash_before_merge.xml
Signed-off-by: Jiebin Sun <jiebin.sun@intel.com>
* Fit the CI to rename the data set from hits_v1 to test.hits.
Signed-off-by: Jiebin Sun <jiebin.sun@intel.com>
* remove the redundant branch in UniqExactSet
Co-authored-by: Nikita Taranov <nickita.taranov@gmail.com>
* Remove the empty methods and add throw exception in parallelizeMergePrepare()
Signed-off-by: Jiebin Sun <jiebin.sun@intel.com>
---------
Signed-off-by: Jiebin Sun <jiebin.sun@intel.com>
Co-authored-by: Nikita Taranov <nickita.taranov@gmail.com>
2023-07-27 15:06:34 +02:00
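The idea in the commit above, converting single-level sets to a bucketed (two-level) form in parallel so that buckets can then be merged independently, can be illustrated with a toy Python sketch. Everything here is hypothetical (`NUM_BUCKETS`, `to_two_level`, `parallel_merge`); it mirrors the shape of the technique, not the ClickHouse implementation, and CPython threads will not show the real speedup because of the GIL.

```python
from concurrent.futures import ThreadPoolExecutor

NUM_BUCKETS = 256  # hypothetical bucket count for the two-level layout

def to_two_level(single_level_set):
    """Split one flat set into NUM_BUCKETS sub-sets keyed by hash."""
    buckets = [set() for _ in range(NUM_BUCKETS)]
    for item in single_level_set:
        buckets[hash(item) % NUM_BUCKETS].add(item)
    return buckets

def parallel_merge(single_level_sets, max_workers=4):
    """Convert all sets in parallel, then merge each bucket index independently."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        two_level = list(pool.map(to_two_level, single_level_sets))

        def merge_bucket(index):
            merged = set()
            for buckets in two_level:
                merged |= buckets[index]
            return merged

        return list(pool.map(merge_bucket, range(NUM_BUCKETS)))

def count_distinct(merged_buckets):
    """Buckets are disjoint by construction, so their sizes simply add up."""
    return sum(len(bucket) for bucket in merged_buckets)

merged = parallel_merge([set(range(0, 1000)), set(range(500, 1500)), {"a", "b"}])
distinct = count_distinct(merged)
```

Because an item always lands in the same bucket index, no two merge tasks touch the same data, which is what makes the per-bucket merge safe to run concurrently.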
JackyWoo
5f47aacef2
add performance tests
2023-07-27 15:41:16 +08:00
JackyWoo
95c41f49e0
not change projection columns
2023-07-27 15:41:16 +08:00
robot-ch-test-poll4
110500049a
Merge pull request #50532 from nickitat/more_pushdown_for_right_side_of_join
...
Push down to right side of a join in more cases
2023-07-26 14:43:57 +02:00
Nikita Taranov
b2acbe42b7
add perf test
2023-07-24 20:34:01 +02:00
Igor Nikonov
91f7185e8c
Merge branch 'master' into remove-perf-test-duplicate-order-by-and-distinct
2023-07-24 18:47:23 +02:00
Igor Nikonov
90e393ecf6
Merge remote-tracking branch 'origin/master' into remove-perf-test-duplicate-order-by-and-distinct
2023-07-18 14:26:22 +00:00
Alexey Milovidov
62bfa4ed93
Fix performance test for regexp cache
2023-07-09 02:21:48 +02:00
vdimir
737cff7e57
Remove whole join_set_filter.xml, will resubmit
2023-07-03 17:00:20 +02:00
vdimir
9ea5d929a5
Update tests/performance/join_set_filter.xml
2023-07-03 17:00:20 +02:00
vdimir
ebd7ecb230
Remove unstable queries from performance/join_set_filter
2023-07-03 17:00:20 +02:00
Igor Nikonov
35bc97e5f9
Merge remote-tracking branch 'origin/master' into remove-perf-test-duplicate-order-by-and-distinct
2023-06-16 20:56:56 +00:00
Azat Khuzhin
5caa3a9e80
Adjust min_insert_block_size_rows for materialized_view_parallelize_output_from_storages
...
Otherwise it is too slow for perf tests on CI [1].
[1]: https://s3.amazonaws.com/clickhouse-test-reports/50214/e287ec50920c7cadabea6ec19ef14b353345ac93/performance_comparison_[3_4]/report.html
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-06-14 19:11:23 +03:00
Azat Khuzhin
3e419730c3
Disable parallelize_output_from_storages for processing MATERIALIZED VIEWs
...
Adding more processors for parallelize_output_from_storages is not a
costless operation (I've experienced some issues in production because
of this), and it is not easy to fix in a normal way, so let's disable it
for now.
Before this patch:
- INSERT INTO input SELECT * FROM numbers(10e6) SETTINGS parallelize_output_from_storages=1, min_insert_block_size_rows=1000
0 rows in set. Elapsed: 3.648 sec. Processed 20.00 million rows, 120.00 MB (5.48 million rows/s., 32.90 MB/s.)
- INSERT INTO input SELECT * FROM numbers(10e6) SETTINGS parallelize_output_from_storages=0, min_insert_block_size_rows=1000
0 rows in set. Elapsed: 1.851 sec. Processed 20.00 million rows, 120.00 MB (10.80 million rows/s., 64.82 MB/s.)
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-06-14 19:11:23 +03:00
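The throughput figures quoted in the commit above follow directly from rows processed divided by elapsed time, and the two runs differ by roughly a factor of two. A short check in Python (the function name is illustrative):

```python
def throughput_mrows_per_s(rows_millions, elapsed_s):
    """Throughput in million rows per second."""
    return rows_millions / elapsed_s

with_parallel = throughput_mrows_per_s(20.00, 3.648)     # parallelize_output_from_storages=1
without_parallel = throughput_mrows_per_s(20.00, 1.851)  # parallelize_output_from_storages=0
slowdown = 3.648 / 1.851  # how much longer the parallelized run took

print(f"{with_parallel:.2f} vs {without_parallel:.2f} million rows/s, {slowdown:.2f}x slower")
```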
Igor Nikonov
79f53f428b
Merge branch 'master' into remove-perf-test-duplicate-order-by-and-distinct
2023-06-13 13:45:36 +02:00
flynn
92c87dedad
Add parallel state merge for some other combinator except If (#50413)
...
* Add parallel state merge for some other combinator except If
* add test
* update test
2023-06-08 00:41:32 +02:00
flynn
f616314f8b
fix typo
2023-05-29 02:22:13 +00:00
flynn
05783f99cd
update test
2023-05-28 14:17:59 +00:00
flynn
ec82c657eb
Parallel merge of uniqExactIf states
2023-05-28 06:04:23 +00:00
Azat Khuzhin
2996b38606
Add ability to configure maximum load factor for the HASHED/SPARSE_HASHED layout
...
As it turns out, HashMap/PackedHashMap works great even with a max load
factor of 0.99. By "great" I mean that at least it works faster than
google sparsehash, not to mention its friendliness to the memory
allocator (it has zero fragmentation since it works with a contiguous
memory region, in comparison to sparsehash, which does lots of
realloc, which jemalloc does not like due to its slabs).
Here is a table of different setups:
settings | load (sec) | read (sec) | read (million rows/s) | bytes_allocated | RSS
- | - | - | - | - | -
HASHED upstream | - | - | - | - | 35GiB
SPARSE_HASHED upstream | - | - | - | - | 26GiB
- | - | - | - | - | -
sparse_hash_map glibc hashbench | - | - | - | - | 17.5GiB
sparse_hash_map packed allocator | 101.878 | 231.48 | 4.32 | - | 17.7GiB
PackedHashMap 0.5 | 15.514 | 42.35 | 23.61 | 20GiB | 22GiB
hashed 0.95 | 34.903 | 115.615 | 8.65 | 16GiB | 18.7GiB
**PackedHashMap 0.95** | **19.883** | **93.6** | **10.68** | **10GiB** | **12.8GiB**
PackedHashMap 0.99 | 26.113 | 83.6 | 11.96 | 10GiB | 12.3GiB
As the table shows, PackedHashMap with a max_load_factor of 0.95 uses 2.6x
less memory than SPARSE_HASHED in upstream, and it is also 2x faster for reads!
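The throughput column of the table above can be cross-checked with a small sketch. It assumes the dictionary holds 1e9 elements (an assumption; this commit message does not state the size) and that "read (million rows/s)" is simply rows divided by read time. The 2.6x figure appears to compare the 26GiB RSS of upstream SPARSE_HASHED with the 10GiB bytes_allocated of PackedHashMap 0.95, which is also an assumption. The helper names are illustrative only.

```python
# Sanity check of the benchmark table: throughput = rows / read time.
# ASSUMPTION: the dictionary holds 1e9 elements (not stated in this message).
ROWS = 1_000_000_000

# (setup, read seconds, reported read throughput in million rows/s)
REPORTED = [
    ("sparse_hash_map packed allocator", 231.48, 4.32),
    ("PackedHashMap 0.5", 42.35, 23.61),
    ("hashed 0.95", 115.615, 8.65),
    ("PackedHashMap 0.99", 83.6, 11.96),
]


def million_rows_per_sec(read_sec: float, rows: int = ROWS) -> float:
    """Read throughput in million rows per second."""
    return rows / read_sec / 1e6


def memory_ratio(baseline_gib: float, candidate_gib: float) -> float:
    """How many times less memory the candidate uses than the baseline."""
    return baseline_gib / candidate_gib


for name, read_sec, reported in REPORTED:
    computed = million_rows_per_sec(read_sec)
    print(f"{name}: computed {computed:.2f}, reported {reported}")

# ASSUMPTION: 26 GiB (SPARSE_HASHED upstream RSS) vs 10 GiB (bytes_allocated).
print(f"memory ratio: {memory_ratio(26, 10):.1f}x")
```

Under these assumptions the computed throughput matches the reported column to two decimal places for every row listed.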
v2: fix grower
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-05-19 06:07:21 +02:00
Azat Khuzhin
7b5d156cc5
Optimize SPARSE_HASHED layout (by using PackedHashMap)
...
If you want a dictionary optimized for memory, SPARSE_HASHED does not
always give you what you need.
Consider the following example: <UInt64, UInt16> as <Key, Value>. This
pair also has 6 bytes of padding (on amd64), which wastes almost
40% of the space.
Because of this padding, even google::sparse_hash_map does not improve
the picture; in fact, sparse_hash_map is not very friendly to memory
allocators (especially jemalloc).
Here are some numbers for a dictionary with 1e9 elements, with UInt64 as
the key and UInt16 as the value:
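The padding arithmetic above can be reproduced with a minimal sketch using Python's `struct` module. It is not the PackedHashMap implementation; it only models the layout of a UInt64 key followed by a UInt16 value, once with native alignment and once packed. The sizes depend on the platform ABI; on a typical 64-bit platform such as amd64 the padded cell should be 16 bytes and the packed one 10 bytes.

```python
import struct

KEY_VALUE = "QH"  # UInt64 key followed by UInt16 value

# Native alignment ("@"); the trailing "0Q" pads the end of the record to the
# alignment of an 8-byte integer, as a C/C++ struct in an array would be.
padded_size = struct.calcsize("@" + KEY_VALUE + "0Q")

# Standard sizes without alignment ("="), comparable to a packed struct.
packed_size = struct.calcsize("=" + KEY_VALUE)

padding = padded_size - packed_size
wasted_fraction = padding / padded_size

print(f"padded cell: {padded_size} bytes")
print(f"packed cell: {packed_size} bytes")
print(f"padding: {padding} bytes ({wasted_fraction:.1%} of the padded cell)")
```

With 6 of 16 bytes spent on padding, the wasted share is 37.5%, which is the "almost 40%" mentioned above.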
settings | load (sec) | read (sec) | read (million rows/s) | bytes_allocated | RSS
- | - | - | - | - | -
HASHED upstream | - | - | - | - | 35GiB
SPARSE_HASHED upstream | - | - | - | - | 26GiB
- | - | - | - | - | -
sparse_hash_map glibc hashbench | - | - | - | - | 17.5GiB
sparse_hash_map packed allocator | 101.878 | 231.48 | 4.32 | - | 17.7GiB
PackedHashMap | 15.514 | 42.35 | 23.61 | 20GiB | 22GiB
As you can see, PackedHashMap looks much better than HASHED, and
even better than SPARSE_HASHED, but slightly worse than sparse_hash_map
with the packed allocator (which is done with a custom patch to google
sparse_hash_map).
v2: rebase on top of bucket_count fix
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-05-19 06:07:21 +02:00
Lirikl
39a505e3f3
init
2023-05-11 22:54:00 +03:00
lgbo-ustc
a07359fbe8
enable used flags' reinit only when the hash table rehashes
2023-05-11 11:06:13 +08:00
Alexey Milovidov
8a6e07f0ea
Make projections production-ready
2023-05-10 03:35:13 +02:00
Alexey Milovidov
f449df85b6
Deprecate in-memory parts
2023-05-03 00:31:09 +02:00
Alexey Milovidov
c279516ac1
Merge branch 'master' into parallel-reading-from-file
2023-04-10 08:02:50 +03:00
Igor Nikonov
8fdc2b3326
Perf test
2023-04-07 20:06:11 +00:00
Anton Popov
10d2b1330b
add perf test
2023-04-04 21:29:52 +00:00
Anton Popov
1e79245b94
add tests
2023-03-28 17:20:05 +00:00
Ongkong
d9c7bc1859
Fix ASOF LEFT JOIN performance degradation ( #47544 )
2023-03-18 23:53:00 +01:00
LiuNeng
d4c5ab9dcd
Optimize one nullable key aggregate performance ( #45772 )
2023-03-02 21:01:52 +01:00
Igor Nikonov
2f7aa8849b
Merge branch 'master' into remove-perf-test-duplicate-order-by-and-distinct
2023-03-02 20:48:28 +01:00
Igor Nikonov
548d79c2e8
Remove perf test duplicate_order_by_and_distinct.xml
2023-03-02 12:31:09 +00:00
Alexander Gololobov
f64d08bd5c
Enable lightweight delete support by default
2023-03-01 19:35:55 +01:00
Nikita Taranov
ab44740efb
Enable perf tests added in #45364 ( #46623 )
2023-02-28 00:26:11 +01:00
Alexey Milovidov
17992b178a
Merge pull request #45364 from nickitat/aggr_partitions_independently
...
Add option to aggregate partitions independently
2023-02-19 17:44:18 +03:00
Alexey Milovidov
417158f59f
Merge branch 'master' into lower_upper
2023-02-19 04:05:10 +03:00
Nikita Taranov
f70044f34b
Merge branch 'master' into aggr_partitions_independently
2023-02-18 13:19:05 +00:00
Alexey Milovidov
6e0dab71ed
Merge pull request #46188 from bigo-sg/rewrite_array_exists
...
Rewrite array exists to has
2023-02-12 05:53:22 +03:00
Alexey Milovidov
786aa069e1
Merge pull request #46187 from ClickHouse/speed-up-count-digits
...
Speed up `countDigits`
2023-02-10 07:41:12 +03:00
taiyang-li
b83ad6bb81
add perf test
2023-02-09 12:30:50 +08:00
Alexey Milovidov
9a86d0087c
Add performance test
2023-02-09 04:52:33 +01:00
Alexey Milovidov
66043eec24
Merge branch 'master' into decimal-performance
2023-02-09 04:59:37 +03:00
Igor Nikonov
72c393e7c4
Merge pull request #46014 from ClickHouse/inorder-optimization-update-sorting-properties
...
Update sorting properties after reading in order applied
2023-02-08 10:19:47 +01:00
Alexey Milovidov
a2df6e950e
Whitespace
2023-02-08 03:38:23 +01:00
Alexey Milovidov
168fbc9d7b
Add a test
2023-02-08 02:17:23 +01:00
李扬
444373679a
Merge branch 'master' into improve_decimal
2023-02-06 13:08:51 +08:00
Igor Nikonov
089a0009ad
Polishing
...
+ try to stabilize distinct in order perf test
2023-02-05 13:38:20 +00:00
Nikita Taranov
b983b363f8
Merge branch 'master' into aggr_partitions_independently
2023-02-04 18:24:31 +00:00
李扬
ad6f39389d
Update tests/performance/column_array_filter.xml
...
Co-authored-by: Alexander Gololobov <440544+davenger@users.noreply.github.com>
2023-02-04 18:49:13 +08:00
Nikita Mikhaylov
33877b5e00
Parallel replicas. Part [2] ( #43772 )
2023-02-03 14:34:18 +01:00
taiyang-li
36a98a1628
add performance tests
2023-02-02 20:16:16 +08:00
Nikita Taranov
e7ca90adab
fix perf test
2023-01-30 17:11:56 +00:00
Nikita Taranov
ac77808133
fix perf test
2023-01-30 17:11:56 +00:00
Nikita Taranov
52fe7edbd9
better key analysis
2023-01-30 17:11:56 +00:00
Nikita Taranov
2057db68a2
cosmetics
2023-01-30 17:10:45 +00:00
Nikita Taranov
1d45cce03c
support for aggr in order
2023-01-30 17:10:45 +00:00
Nikita Taranov
a2c9aeb7c9
stash
2023-01-30 17:10:45 +00:00