Commit Graph

1312 Commits

Author SHA1 Message Date
Raúl Marín
1d1edd5b57 Reduce sum_map.xml 2024-01-04 11:31:20 +00:00
Raúl Marín
2aa6690f2c Reduce hashed_dictionary.xml 2024-01-04 11:29:17 +00:00
Raúl Marín
39eaa8dc9c Halve the size of reinterpret_as.xml 2024-01-04 11:24:36 +00:00
Raúl Marín
3c7ae2f171 Reduce bounding_ratio.xml 2024-01-04 11:20:07 +00:00
Raúl Marín
c195320612 Reduce the size of join_used_flags.xml 2024-01-03 17:31:55 +00:00
Raúl Marín
c223ae56d3 Reduce the size of decimal_parse 2024-01-03 17:29:30 +00:00
Raúl Marín
910b338584 Reduce polymorphic_parts_m 2024-01-03 17:24:15 +00:00
Raúl Marín
b8305e1a6e Make test more reasonable 2024-01-03 17:19:44 +00:00
Raúl Marín
7ee1697971 Reduce setup time of min_max_index.xml 2024-01-03 17:16:45 +00:00
Raúl Marín
1c40700ea1 Merge remote-tracking branch 'blessed/master' into minmax_non_numeric 2024-01-03 14:09:28 +01:00
Duc Canh Le
d623702378 Merge branch 'master' into final_no_copy
Resolve conflicts

Signed-off-by: Duc Canh Le <duccanh.le@ahrefs.com>
2024-01-02 06:07:49 +00:00
Raúl Marín
db97a69989 Add perf tests with tuples 2023-12-29 17:00:01 +01:00
Azat Khuzhin
0d71bf1411 Upload time of the perf tests into artifacts as test_duration_ms
Now perf test changes/failures will have two rows: one for the new server
and one for the old server.

I thought about uploading only the time of the test on the new server,
but because not all perf tests are uploaded, you cannot always get the time
of the test without the changes (i.e. from a run on the upstream/master
repo/branch).

<details>

Before:

```sql
SELECT
    concat(test, ' #', toString(query_index)),
    'slower' AS test_status,
    0 AS test_duration_ms,
    concat('https://s3.amazonaws.com/clickhouse-test-reports/$PR_TO_TEST/$SHA_TO_TEST/${CLICKHOUSE_PERFORMANCE_COMPARISON_CHECK_NAME_PREFIX}/report.html#changes-in-performance.', test, '.', toString(query_index)) AS report_url
FROM queries
WHERE (changed_fail != 0) AND (diff > 0)
UNION ALL
SELECT
    concat(test, ' #', toString(query_index)),
    'unstable' AS test_status,
    0 AS test_duration_ms,
    concat('https://s3.amazonaws.com/clickhouse-test-reports/$PR_TO_TEST/$SHA_TO_TEST/${CLICKHOUSE_PERFORMANCE_COMPARISON_CHECK_NAME_PREFIX}/report.html#unstable-queries.', test, '.', toString(query_index)) AS report_url
FROM queries
WHERE unstable_fail != 0

Query id: 49dfdc9a-f549-4499-9a1a-410e5053f6c1

┌─concat(test, ' #', toString(query_index))─┬─test_status─┬─test_duration_ms─┬─report_url─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ hashed_array_dictionary #16               │ slower      │                0 │ https://s3.amazonaws.com/clickhouse-test-reports/$PR_TO_TEST/$SHA_TO_TEST/${CLICKHOUSE_PERFORMANCE_COMPARISON_CHECK_NAME_PREFIX}/report.html#changes-in-performance.hashed_array_dictionary.16 │
│ ngram_distance #2                         │ slower      │                0 │ https://s3.amazonaws.com/clickhouse-test-reports/$PR_TO_TEST/$SHA_TO_TEST/${CLICKHOUSE_PERFORMANCE_COMPARISON_CHECK_NAME_PREFIX}/report.html#changes-in-performance.ngram_distance.2           │
│ ngram_distance #3                         │ slower      │                0 │ https://s3.amazonaws.com/clickhouse-test-reports/$PR_TO_TEST/$SHA_TO_TEST/${CLICKHOUSE_PERFORMANCE_COMPARISON_CHECK_NAME_PREFIX}/report.html#changes-in-performance.ngram_distance.3           │
│ ngram_distance #4                         │ slower      │                0 │ https://s3.amazonaws.com/clickhouse-test-reports/$PR_TO_TEST/$SHA_TO_TEST/${CLICKHOUSE_PERFORMANCE_COMPARISON_CHECK_NAME_PREFIX}/report.html#changes-in-performance.ngram_distance.4           │
└───────────────────────────────────────────┴─────────────┴──────────────────┴────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
```

After:

```sql
SELECT
    concat(test, ' #', toString(query_index), '::', test_desc_.1) AS test_name,
    'slower' AS test_status,
    test_desc_.2 AS test_duration_ms,
    concat('https://s3.amazonaws.com/clickhouse-test-reports/$PR_TO_TEST/$SHA_TO_TEST/${CLICKHOUSE_PERFORMANCE_COMPARISON_CHECK_NAME_PREFIX}/report.html#changes-in-performance.', test, '.', toString(query_index)) AS report_url
FROM queries
ARRAY JOIN map('old', left, 'new', right) AS test_desc_
WHERE (changed_fail != 0) AND (diff > 0)
UNION ALL
SELECT
    concat(test, ' #', toString(query_index), '::', test_desc_.1) AS test_name,
    'unstable' AS test_status,
    test_desc_.2 AS test_duration_ms,
    concat('https://s3.amazonaws.com/clickhouse-test-reports/$PR_TO_TEST/$SHA_TO_TEST/${CLICKHOUSE_PERFORMANCE_COMPARISON_CHECK_NAME_PREFIX}/report.html#unstable-queries.', test, '.', toString(query_index)) AS report_url
FROM queries
ARRAY JOIN map('old', left, 'new', right) AS test_desc_
WHERE unstable_fail != 0

Query id: 20475bfd-754b-4159-aa16-7798f4720bf8

┌─test_name────────────────────────┬─test_status─┬─test_duration_ms─┬─report_url─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ hashed_array_dictionary #16::old │ slower      │           0.2149 │ https://s3.amazonaws.com/clickhouse-test-reports/$PR_TO_TEST/$SHA_TO_TEST/${CLICKHOUSE_PERFORMANCE_COMPARISON_CHECK_NAME_PREFIX}/report.html#changes-in-performance.hashed_array_dictionary.16 │
│ hashed_array_dictionary #16::new │ slower      │           0.2519 │ https://s3.amazonaws.com/clickhouse-test-reports/$PR_TO_TEST/$SHA_TO_TEST/${CLICKHOUSE_PERFORMANCE_COMPARISON_CHECK_NAME_PREFIX}/report.html#changes-in-performance.hashed_array_dictionary.16 │
│ ngram_distance #2::old           │ slower      │           0.3598 │ https://s3.amazonaws.com/clickhouse-test-reports/$PR_TO_TEST/$SHA_TO_TEST/${CLICKHOUSE_PERFORMANCE_COMPARISON_CHECK_NAME_PREFIX}/report.html#changes-in-performance.ngram_distance.2           │
│ ngram_distance #2::new           │ slower      │           0.4425 │ https://s3.amazonaws.com/clickhouse-test-reports/$PR_TO_TEST/$SHA_TO_TEST/${CLICKHOUSE_PERFORMANCE_COMPARISON_CHECK_NAME_PREFIX}/report.html#changes-in-performance.ngram_distance.2           │
│ ngram_distance #3::old           │ slower      │           0.3644 │ https://s3.amazonaws.com/clickhouse-test-reports/$PR_TO_TEST/$SHA_TO_TEST/${CLICKHOUSE_PERFORMANCE_COMPARISON_CHECK_NAME_PREFIX}/report.html#changes-in-performance.ngram_distance.3           │
│ ngram_distance #3::new           │ slower      │           0.4716 │ https://s3.amazonaws.com/clickhouse-test-reports/$PR_TO_TEST/$SHA_TO_TEST/${CLICKHOUSE_PERFORMANCE_COMPARISON_CHECK_NAME_PREFIX}/report.html#changes-in-performance.ngram_distance.3           │
│ ngram_distance #4::old           │ slower      │           0.3577 │ https://s3.amazonaws.com/clickhouse-test-reports/$PR_TO_TEST/$SHA_TO_TEST/${CLICKHOUSE_PERFORMANCE_COMPARISON_CHECK_NAME_PREFIX}/report.html#changes-in-performance.ngram_distance.4           │
│ ngram_distance #4::new           │ slower      │           0.4577 │ https://s3.amazonaws.com/clickhouse-test-reports/$PR_TO_TEST/$SHA_TO_TEST/${CLICKHOUSE_PERFORMANCE_COMPARISON_CHECK_NAME_PREFIX}/report.html#changes-in-performance.ngram_distance.4           │
└──────────────────────────────────┴─────────────┴──────────────────┴────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
```

</details>

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-12-29 13:48:41 +01:00
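The row doubling in 0d71bf1411 comes from `ARRAY JOIN map('old', left, 'new', right)`, which emits one output row per map entry. A minimal Python sketch of that expansion for the 'slower' branch is shown here. The function name, the dict layout, and the `changed_fail`/`diff` values in the sample are illustrative assumptions; the `left`/`right` timings are taken from the sample output in the commit message.

```python
def slower_rows(queries):
    """Mimic the 'slower' SELECT: one row per server for each flagged query."""
    rows = []
    for q in queries:
        # Same filter as the SQL: (changed_fail != 0) AND (diff > 0)
        if q["changed_fail"] != 0 and q["diff"] > 0:
            # ARRAY JOIN over map('old', left, 'new', right): one row per entry
            for side, duration in (("old", q["left"]), ("new", q["right"])):
                name = f"{q['test']} #{q['query_index']}::{side}"
                rows.append((name, "slower", duration))
    return rows


sample = [
    # left/right come from the commit's sample output;
    # changed_fail and diff are placeholder values for illustration only
    {"test": "hashed_array_dictionary", "query_index": 16,
     "left": 0.2149, "right": 0.2519, "changed_fail": 1, "diff": 0.1},
]
for row in slower_rows(sample):
    print(row)
```

Each flagged query yields an `::old` row followed by a `::new` row, which is why the "After" result has twice as many rows as the "Before" result.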
Alexey Milovidov
0e678fb6c1
Merge pull request #56996 from ClickHouse/vdimir/hash_join_max_block_size
HashJoin respects max_joined_block_size_rows
2023-12-27 15:46:46 +01:00
Alexey Milovidov
f00337e2ba
Merge pull request #57872 from CurtizJ/optimize-aggregation-consecutive-keys
Better optimization of consecutive keys in aggregation
2023-12-27 15:44:22 +01:00
Raúl Marín
f8d9a850c7 Fix perf test README 2023-12-27 12:16:17 +00:00
kevinyhzou
d7d1541d81 add performance test xml 2023-12-27 19:43:59 +08:00
Duc Canh Le
476ca4246d Merge branch 'master' into final_no_copy
Resolve conflicts + add some comments

Signed-off-by: Duc Canh Le <duccanh.le@ahrefs.com>
2023-12-27 07:00:58 +00:00
vdimir
afda7f1079
Add tests/performance/hashjoin_with_large_output.xml
Co-authored-by: liuneng <1398775315@qq.com>
2023-12-22 15:52:45 +00:00
zhanglistar
408f9ea1ae
Merge branch 'ClickHouse:master' into if-opt 2023-12-21 09:50:23 +08:00
zhanglistar
810305fafe Fix if.xml. 2023-12-20 10:58:46 +08:00
zhanglistar
f2a02cca2f if.xml add more perf tests. 2023-12-18 16:55:24 +08:00
zhanglistar
6d73c8b157 code format of if.xml 2023-12-18 16:48:27 +08:00
zhanglistar
ba34b80087 If improvement add comment and performance test. 2023-12-18 16:43:45 +08:00
Max K
84e5870b71
Reapply "improve CI with digest for docker, build and test jobs" (#57904)
* Revert "Revert "improve CI with digest for docker, build and test jobs""

* fix: docker manifest merge for missing images only
2023-12-18 09:07:22 +01:00
Anton Popov
5faf5e913e slightly faster and perf test 2023-12-16 03:35:59 +00:00
Robert Schulze
b36a93a25d
Revert "Remove arrayFold"
This reverts commit 15dc0ed610.
2023-12-14 09:52:29 +00:00
Alexey Milovidov
15dc0ed610 Remove arrayFold 2023-12-14 04:34:32 +01:00
Alexey Milovidov
2988f6f92a
Revert "Add new aggregation function groupArraySorted()" 2023-12-05 15:31:17 +03:00
taiyang-li
4bbc0db154 Optimize array element function when input is array(map)/array(array(string))/array(array(number)) 2023-12-05 15:57:43 +08:00
Alexey Milovidov
ae476002ef Remove bad test 2023-12-05 06:56:57 +01:00
Yarik Briukhovetskyi
69205769d0
Merge branch 'ClickHouse:master' into group_sorted_array_function 2023-11-23 20:23:47 +01:00
yariks5s
cb6898b52f fix due to review 2023-11-23 18:17:47 +00:00
Duc Canh Le
4b0382442f Merge branch 'master' into final_no_copy
Try fixing CI issues
2023-11-22 08:53:38 +00:00
Alexander Tokmakov
28e0c51e3f
Update avg_weighted.xml (#56797) 2023-11-16 20:46:17 +01:00
taiyang-li
819c7e75ff Merge branch 'master' into orc_tuple_field_prune 2023-11-02 17:52:05 +08:00
taiyang-li
b8665a610c fix failed perf test 2023-11-02 15:27:48 +08:00
taiyang-li
b142489c3c fix code style 2023-11-02 10:49:18 +08:00
lgbo-ustc
8334585eaf improve parquet struct field reading 2023-11-01 15:18:39 +08:00
taiyang-li
c97b2c5be7 fix code style 2023-10-31 12:00:45 +08:00
taiyang-li
b72341e1a8 Merge branch 'master' into orc_tuple_field_prune 2023-10-31 10:07:43 +08:00
taiyang-li
38f24c0455 add performance tests 2023-10-30 20:29:43 +08:00
Alexey Milovidov
88440d4c07
Merge pull request #54568 from JackyWoo/optimize_uniq_to_count2
Resubmit optimization uniq to count
2023-10-30 01:33:36 +01:00
Alexey Milovidov
64b6e68a50
Merge pull request #55683 from amosbird/issue-55653
Reuse granule during skip index reading
2023-10-30 00:51:51 +01:00
frinkr
18c50c11b3
Multithreading after window functions (#50771)
* feat: Preserve number of streams after evaluating the window functions to allow parallel stream processing

* fix style

* fix style

* fix style

* setting query_plan_preserve_num_streams_after_window_functions default true

* fix tests by SETTINGS query_plan_preserve_num_streams_after_window_functions=0

* fix test references

* Resize the streams after the last window function, to keep the order between WindowTransforms (and WindowTransform works on single stream anyway).

* add perf test

* perf: change the dataset from 50M to 5M

* rename query_plan_preserve_num_streams_after_window_functions -> query_plan_enable_multithreading_after_window_functions

* update test reference

* fix clang-tidy

---------

Co-authored-by: Nikita Taranov <nikita.taranov@clickhouse.com>
2023-10-27 12:36:28 +02:00
lgbo
489e6d9bdc
Optimization for getting value from map, arrayElement(1/2) (#55929) 2023-10-27 09:54:25 +02:00
Maksim Kita
aa5fc05a55 Revert "Merge pull request #55682 from ClickHouse/revert-35961-decimal-column-improve-get-permutation"
This reverts commit f6dee5fe3c, reversing
changes made to f96bda1deb.
2023-10-25 21:48:13 +03:00
李扬
465962df7f
Support orc filter push down (file + stripe + rowgroup level) (#55330)
* support orc filter push down

* update orc lib version

* replace setqueryinfo with setkeycondition

* fix issue https://github.com/ClickHouse/ClickHouse/issues/53536

* refactor source with key condition

* fix building error

* remove std::cout

* update orc

* update orc version

* fix bugs

* improve code

* upgrade orc lib

* fix code style

* change as requested

* add performance tests for orc filter push down

* add performance tests for orc filter push down

* fix all bugs

* fix default as null issue

* add uts for null as default issues

* upgrade orc lib

* fix failed orc lib uts and fix typo

* fix failed uts

* fix failed uts

* fix ast fuzzer tests

* fix bug of uint64 overflow in https://s3.amazonaws.com/clickhouse-test-reports/55330/de22fdcaea2e12c96f300e95f59beba84401712d/fuzzer_astfuzzerubsan/report.html

* fix asan fatal caused by reused column vector batch in native orc input format. refer to https://s3.amazonaws.com/clickhouse-test-reports/55330/be39d23af2d7e27f5ec7f168947cf75aeaabf674/stateless_tests__asan__[4_4].htm

* fix wrong performance tests

* disable 02892_orc_filter_pushdown on aarch64. https://s3.amazonaws.com/clickhouse-test-reports/55330/be39d23af2d7e27f5ec7f168947cf75aeaabf674/stateless_tests__aarch64_.html

* add some comments

* add some comments

* inline range::equals and range::less

* fix data race of key condition

* trigger ci
2023-10-24 12:08:17 -07:00
Robert Schulze
68c3f41b71
Fix performance tests 2023-10-24 08:56:09 +00:00
Duc Canh Le
5923e1b116
Cache cast function in set during execution (#55712)
* Cache cast function in set during execution

Signed-off-by: Duc Canh Le <duccanh.le@ahrefs.com>

* minor fix for performance test

Signed-off-by: Duc Canh Le <duccanh.le@ahrefs.com>

* Update src/Interpreters/castColumn.cpp

Co-authored-by: Nikita Taranov <nickita.taranov@gmail.com>

* improvement

Signed-off-by: Duc Canh Le <duccanh.le@ahrefs.com>

* fix use-after-free

Signed-off-by: Duc Canh Le <duccanh.le@ahrefs.com>

---------

Signed-off-by: Duc Canh Le <duccanh.le@ahrefs.com>
Co-authored-by: Nikita Taranov <nickita.taranov@gmail.com>
2023-10-23 13:31:44 +02:00
Amos Bird
602f01f651
Reuse granule during skip index reading 2023-10-18 14:40:34 +08:00
Alexey Milovidov
2da1ff4b0d
Revert "Improve ColumnDecimal, ColumnVector getPermutation performance using pdqsort with RadixSort" 2023-10-16 19:07:11 +03:00
Duc Canh Le
16687632da add format Null to performance tests
Signed-off-by: Duc Canh Le <duccanh.le@ahrefs.com>
2023-10-14 08:01:55 +00:00
Duc Canh Le
285ae778e4 Merge branch 'master' into final_no_copy
Fix flaky test 02447_drop_database_replica
2023-10-14 03:34:42 +00:00
Duc Canh Le
dcc464b4da add more performance test
Signed-off-by: Duc Canh Le <duccanh.le@ahrefs.com>
2023-10-13 08:16:54 +00:00
Maksim Kita
d4f9e0de12 Added performance test 2023-10-11 19:01:00 +03:00
Robert Schulze
2848548c63
Merge remote-tracking branch 'rschu1ze/master' into arrayFold 2023-10-08 16:32:36 +00:00
JackyWoo
826f7ac7eb Merge branch 'master' into optimize_uniq_to_count2 2023-09-27 09:14:28 +08:00
Maksim Kita
40be8227ea Fixed tests 2023-09-25 17:29:42 +03:00
Maksim Kita
1de95d8c36 Updated implementation 2023-09-25 17:29:42 +03:00
Maksim Kita
0a9835d085 Added performance tests 2023-09-25 17:29:42 +03:00
JackyWoo
231d16040b Merge branch 'master' into optimize_uniq_to_count2 2023-09-19 10:29:03 +08:00
robot-clickhouse
51851ecc21
Merge pull request #54613 from bigo-sg/improve_json_query
Improve json sql functions by serializing json element into column's buffer directly
2023-09-15 19:35:30 +02:00
lgbo-ustc
fa0f9b0e1f update benchmark test 2023-09-15 18:09:58 +08:00
lgbo-ustc
e8d217634e improve json sql functions by serializing data into column directly 2023-09-14 12:41:17 +08:00
Sema Checherinda
8a9b544a97
Merge branch 'master' into optimize_all_lonely_parts 2023-09-13 16:07:19 +02:00
JackyWoo
70a262a775 Add optimization uniq to count 2023-09-13 16:16:11 +08:00
Alexey Milovidov
bd4aec0601
Revert "Optimize uniq to count" 2023-09-13 09:14:06 +03:00
JackyWoo
d065ac32e0 Merge branch 'master' into optimize_uniq_to_count 2023-09-04 10:06:36 +08:00
Duc Canh Le
06afe0c2aa more stable stateless test + add perf. test 2023-08-31 06:27:06 +00:00
Jiebin Sun
7c529e5691
Optimize the merge if all hashSets are singleLevel in UniqExactSet (#52973)
* Optimize the merge if all hashSets are singleLevel

PR https://github.com/ClickHouse/ClickHouse/pull/50748 added a new phase,
`parallelizeMergePrepare`, that runs before the merge when the hashSets are neither
all singleLevel nor all twoLevel. It converts all the singleLevelSets to twoLevelSets
in parallel, which increases CPU utilization and QPS.

When all the hash tables are singleLevel, the merge can also benefit from the
`parallelizeMergePrepare` optimization in most cases, provided the hash tables are not
too small. By tuning the query `SELECT COUNT(DISTINCT SearchPhase) FROM hits_v1`
with different thread counts, we arrived at a moderate threshold of 6,000.

The patch was tested with the query `SELECT COUNT(DISTINCT Title) FROM hits_v1` on a
2x80 vCPU server. With fewer than 48 threads, the hashSets are all twoLevel or a mix of
singleLevel and twoLevel. With more than 56 threads, all the hashSets are singleLevel.
The QPS gain is at most 2.35x.

Threads	Opt/Base
8	100.0%
16	99.4%
24	110.3%
32	99.9%
40	99.3%
48	99.8%
56	183.0%
64	234.7%
72	233.1%
80	229.9%
88	224.5%
96	229.6%
104	235.1%
112	229.5%
120	229.1%
128	217.8%
136	222.9%
144	217.8%
152	204.3%
160	203.2%

Signed-off-by: Jiebin Sun <jiebin.sun@intel.com>

* Add the comment and explanation for PR#52973

Signed-off-by: Jiebin Sun <jiebin.sun@intel.com>

---------

Signed-off-by: Jiebin Sun <jiebin.sun@intel.com>
2023-08-30 11:26:16 +02:00
JackyWoo
a963048e1a Merge branch 'master' into optimize_uniq_to_count 2023-08-28 11:10:05 +08:00
avogar
ecb0e9844c Disable cache in perf test 2023-08-23 21:01:18 +00:00
avogar
68e3af56d4 Address comments 2023-08-23 13:19:15 +00:00
Kruglov Pavel
e193aec583
Merge branch 'master' into fast-count-from-files 2023-08-23 12:15:34 +02:00
Kruglov Pavel
67c5c0203b
Merge branch 'master' into fast-count-from-files 2023-08-22 15:03:48 +02:00
Alexey Milovidov
037277c4a2 Remove bad test 2023-08-22 14:21:23 +02:00
Michael Kolupaev
d752611c43 Performance test 2023-08-21 14:15:52 -07:00
Kruglov Pavel
88aee95122
Merge branch 'master' into fast-count-from-files 2023-08-21 14:46:33 +02:00
avogar
47304bf7aa Optimize count from files in most input formats 2023-08-21 12:30:52 +00:00
Alexey Milovidov
125169d9ae Remove useless test 2023-08-20 03:51:30 +02:00
robot-ch-test-poll4
3aa9cb1267
Merge pull request #51399 from liuneng1994/optimize_nullable_aggragate_serialized_method
Optimize aggregation performance of nullable String key when use AggregationMethodSerialized
2023-08-16 19:37:44 +02:00
liuneng
8a83301316 optimize 2023-08-08 13:38:25 +08:00
liuneng
f33367cd8b add more test 2023-08-08 13:38:25 +08:00
liuneng
f96b9b7512 optimize fixed size column 2023-08-08 13:38:25 +08:00
liuneng
035dbdaf22 remove numbers optimization. It will decrease performance 2023-08-08 13:38:25 +08:00
liuneng
4f9920c71c optimize performance of nullable String And Number column serializeValueIntoArena 2023-08-08 13:38:25 +08:00
Duc Canh Le
ad0ac43814 fix performance test 2023-08-07 06:25:46 +00:00
Duc Canh Le
ed2a1d7c9b select required columns when getting join 2023-08-07 03:15:20 +00:00
JackyWoo
43ea21a4ce make default optimize_uniq_to_count to true 2023-08-02 18:28:22 +08:00
JackyWoo
1c930f34de reduce performance time 2023-08-02 18:10:01 +08:00
JackyWoo
162c674d74 remove settings in uniq_to_count 2023-08-02 10:50:04 +08:00
JackyWoo
ef3f5e2a7c fix performance tests error 2023-08-02 10:15:56 +08:00
JackyWoo
93b28903cb
Merge branch 'master' into optimize_uniq_to_count 2023-08-02 10:13:22 +08:00
Jiebin Sun
78f3a575f9
Convert hashSets in parallel before merge (#50748)
* Convert hashSets in parallel before merge

Before the merge, if one of lhs and rhs is a singleLevelSet and the other is a twoLevelSet,
the singleLevelSet calls convertToTwoLevel(). The conversion does not run in parallel,
and it costs many cycles when it has to process all the singleLevelSets.

The idea of the patch is to convert all the singleLevelSets to twoLevelSets in parallel if
the hashSets are neither all singleLevel nor all twoLevel.

I have tested the patch on an Intel 2 x 112 vCPU SPR server with ClickBench and the latest
upstream ClickHouse.
Q5 got a large 264% performance improvement, and 24 queries gained at least 5%.
The overall geomean of the 43 queries is 7.4% better than the base code.

Signed-off-by: Jiebin Sun <jiebin.sun@intel.com>

* add resize() for the data_vec in parallelizeMergePrepare()

Signed-off-by: Jiebin Sun <jiebin.sun@intel.com>

* Add the performance test prepare_hash_before_merge.xml

Signed-off-by: Jiebin Sun <jiebin.sun@intel.com>

* Fit the CI to rename the data set from hits_v1 to test.hits.

Signed-off-by: Jiebin Sun <jiebin.sun@intel.com>

* remove the redundant branch in UniqExactSet

Co-authored-by: Nikita Taranov <nickita.taranov@gmail.com>

* Remove the empty methods and add throw exception in parallelizeMergePrepare()

Signed-off-by: Jiebin Sun <jiebin.sun@intel.com>

---------

Signed-off-by: Jiebin Sun <jiebin.sun@intel.com>
Co-authored-by: Nikita Taranov <nickita.taranov@gmail.com>
2023-07-27 15:06:34 +02:00
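The idea described in 78f3a575f9 (convert the single-level sets in parallel only when the input mixes single-level and two-level sets, then merge bucket by bucket) can be sketched as a toy model in Python. This is not the ClickHouse implementation: the set representation, `NUM_BUCKETS`, and all function names other than the borrowed `parallelizeMergePrepare` idea are assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

NUM_BUCKETS = 16  # illustrative; not the bucket count ClickHouse uses


def to_two_level(single_level):
    """Split a flat set into NUM_BUCKETS sub-sets keyed by hash."""
    buckets = [set() for _ in range(NUM_BUCKETS)]
    for key in single_level:
        buckets[hash(key) % NUM_BUCKETS].add(key)
    return buckets


def is_two_level(s):
    # In this toy model a two-level set is a list of bucket sets.
    return isinstance(s, list)


def parallelize_merge_prepare(sets, max_workers=4):
    """Convert single-level sets in parallel, but only for a mixed input."""
    kinds = {is_two_level(s) for s in sets}
    if len(kinds) < 2:
        return list(sets)  # all single-level or all two-level: nothing to do
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(
            lambda s: s if is_two_level(s) else to_two_level(s), sets))


def merge_two_level(sets):
    """Merge two-level sets bucket by bucket."""
    merged = [set() for _ in range(NUM_BUCKETS)]
    for buckets in sets:
        for i, bucket in enumerate(buckets):
            merged[i] |= bucket
    return merged


mixed = [{1, 2, 3}, to_two_level({3, 4}), {5}]
prepared = parallelize_merge_prepare(mixed)
distinct = sum(len(b) for b in merge_two_level(prepared))
print(distinct)
```

The sketch only mirrors the control flow. Python threads do not give CPU parallelism for this kind of work, so it says nothing about the speedups reported in the commit message.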
JackyWoo
5f47aacef2 add performance tests 2023-07-27 15:41:16 +08:00
JackyWoo
95c41f49e0 not change projection columns 2023-07-27 15:41:16 +08:00
robot-ch-test-poll4
110500049a
Merge pull request #50532 from nickitat/more_pushdown_for_right_side_of_join
Push down to right side of a join in more cases
2023-07-26 14:43:57 +02:00
Nikita Taranov
b2acbe42b7 add perf test 2023-07-24 20:34:01 +02:00
Igor Nikonov
91f7185e8c
Merge branch 'master' into remove-perf-test-duplicate-order-by-and-distinct 2023-07-24 18:47:23 +02:00
Igor Nikonov
90e393ecf6 Merge remote-tracking branch 'origin/master' into remove-perf-test-duplicate-order-by-and-distinct 2023-07-18 14:26:22 +00:00
Alexey Milovidov
62bfa4ed93 Fix performance test for regexp cache 2023-07-09 02:21:48 +02:00
vdimir
737cff7e57 Remove whole join_set_filter.xml, will resubmit 2023-07-03 17:00:20 +02:00
vdimir
9ea5d929a5 Update tests/performance/join_set_filter.xml 2023-07-03 17:00:20 +02:00
vdimir
ebd7ecb230 Remove unstable queries from performance/join_set_filter 2023-07-03 17:00:20 +02:00
Igor Nikonov
35bc97e5f9 Merge remote-tracking branch 'origin/master' into remove-perf-test-duplicate-order-by-and-distinct 2023-06-16 20:56:56 +00:00
Azat Khuzhin
5caa3a9e80 Adjust min_insert_block_size_rows for materialized_view_parallelize_output_from_storages
Otherwise it is too slow for perf tests on CI [1].

  [1]: https://s3.amazonaws.com/clickhouse-test-reports/50214/e287ec50920c7cadabea6ec19ef14b353345ac93/performance_comparison_[3_4]/report.html

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-06-14 19:11:23 +03:00
Azat Khuzhin
3e419730c3 Disable parallelize_output_from_storages for processing MATERIALIZED VIEWs
Adding more processors for parallelize_output_from_storages is not a
costless operation (I've experienced some issues in production because
of this), and it is not easy to fix in a normal way, so let's disable it
for now.

Before this patch:
- INSERT INTO input SELECT * FROM numbers(10e6) SETTINGS parallelize_output_from_storages=1, min_insert_block_size_rows=1000
  0 rows in set. Elapsed: 3.648 sec. Processed 20.00 million rows, 120.00 MB (5.48 million rows/s., 32.90 MB/s.)

- INSERT INTO input SELECT * FROM numbers(10e6) SETTINGS parallelize_output_from_storages=0, min_insert_block_size_rows=1000
  0 rows in set. Elapsed: 1.851 sec. Processed 20.00 million rows, 120.00 MB (10.80 million rows/s., 64.82 MB/s.)

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-06-14 19:11:23 +03:00
Igor Nikonov
79f53f428b
Merge branch 'master' into remove-perf-test-duplicate-order-by-and-distinct 2023-06-13 13:45:36 +02:00
flynn
92c87dedad
Add parallel state merge for some other combinator except If (#50413)
* Add parallel state merge for some other combinator except If

* add test

* update test
2023-06-08 00:41:32 +02:00
flynn
f616314f8b fix typo 2023-05-29 02:22:13 +00:00
flynn
05783f99cd update test 2023-05-28 14:17:59 +00:00
flynn
ec82c657eb Parallel merge of uniqExactIf states 2023-05-28 06:04:23 +00:00
Azat Khuzhin
2996b38606 Add ability to configure maximum load factor for the HASHED/SPARSE_HASHED layout
As it turns out, HashMap/PackedHashMap works great even with a max load
factor of 0.99. By "great" I mean that at least it works faster than
google sparsehash, not to mention its friendliness to the memory
allocator (it has zero fragmentation since it works with a contiguous
memory region, in comparison to sparsehash, which does lots of
realloc, which jemalloc does not like due to its slabs).

Here is a table of different setups:

settings                         | load (sec) | read (sec) | read (million rows/s) | bytes_allocated | RSS
-                                | -          | -          | -                     | -               | -
HASHED upstream                  | -          | -          | -                     | -               | 35GiB
SPARSE_HASHED upstream           | -          | -          | -                     | -               | 26GiB
-                                | -          | -          | -                     | -               | -
sparse_hash_map glibc hashbench  | -          | -          | -                     | -               | 17.5GiB
sparse_hash_map packed allocator | 101.878    | 231.48     | 4.32                  | -               | 17.7GiB
PackedHashMap 0.5                | 15.514     | 42.35      | 23.61                 | 20GiB           | 22GiB
hashed 0.95                      | 34.903     | 115.615    | 8.65                  | 16GiB           | 18.7GiB
**PackedHashMap 0.95**           | **19.883** | **93.6**   | **10.68**             | **10GiB**       | **12.8GiB**
PackedHashMap 0.99               | 26.113     | 83.6       | 11.96                 | 10GiB           | 12.3GiB

As the table shows, PackedHashMap with a max_load_factor of 0.95 uses 2.6x less
memory than SPARSE_HASHED in upstream, and it is also 2x faster for reads!

v2: fix grower
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-05-19 06:07:21 +02:00
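The bytes_allocated column in 2996b38606 is consistent with a simple sizing model, sketched here in Python. The model rests on assumptions that the commit message does not state: the bucket count is a power of two, a packed <UInt64, UInt16> cell takes 10 bytes, and a padded cell takes 16 bytes. The function names are hypothetical.

```python
GIB = 2 ** 30


def buckets_needed(elements, max_load_factor):
    """Smallest power-of-two bucket count that keeps the load factor in bounds."""
    buckets = 1
    while elements > buckets * max_load_factor:
        buckets *= 2
    return buckets


def table_gib(elements, max_load_factor, cell_bytes):
    """Size of the flat bucket array in GiB under the assumptions above."""
    return buckets_needed(elements, max_load_factor) * cell_bytes / GIB


ELEMENTS = 10 ** 9  # dictionary size used in the commit message

for label, load_factor, cell_bytes in [
    ("PackedHashMap 0.5", 0.5, 10),
    ("PackedHashMap 0.95", 0.95, 10),
    ("PackedHashMap 0.99", 0.99, 10),
    ("hashed 0.95", 0.95, 16),
]:
    print(label, table_gib(ELEMENTS, load_factor, cell_bytes), "GiB")
```

Under these assumptions a load factor of 0.5 needs 2^31 buckets while 0.95 and 0.99 both fit in 2^30, which lines up with the 20GiB, 10GiB, 10GiB, and 16GiB figures in the table. The quoted 2.6x memory saving corresponds to 26GiB (SPARSE_HASHED upstream RSS) divided by 10GiB (bytes_allocated).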
Azat Khuzhin
7b5d156cc5 Optimize SPARSE_HASHED layout (by using PackedHashMap)
If you want a dictionary optimized for memory, SPARSE_HASHED does not
always give you what you need.

Consider the following example: <UInt64, UInt16> as <Key, Value>. This
pair also has 6 bytes of padding (on amd64), so almost 40% of the space
is wasted.

Because of this padding, even google::sparse_hash_map does not improve
the picture; in fact, sparse_hash_map is not very friendly to memory
allocators (especially jemalloc).

Here are some numbers for dictionary with 1e9 elements and UInt64 as
key, and UInt16 as value:

settings                         | load (sec) | read (sec) | read (million rows/s) | bytes_allocated | RSS
HASHED upstream                  | -          | -          | -                     | -               | 35GiB
SPARSE_HASHED upstream           | -          | -          | -                     | -               | 26GiB
-                                | -          | -          | -                     | -               | -
sparse_hash_map glibc hashbench  | -          | -          | -                     | -               | 17.5GiB
sparse_hash_map packed allocator | 101.878    | 231.48     | 4.32                  | -               | 17.7GiB
PackedHashMap                    | 15.514     | 42.35      | 23.61                 | 20GiB           | 22GiB

As you can see, PackedHashMap looks much better than HASHED, and
even better than SPARSE_HASHED, but slightly worse than sparse_hash_map
with the packed allocator (which is done with a custom patch to google
sparse_hash_map).

v2: rebase on top of bucket_count fix
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-05-19 06:07:21 +02:00
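The padding claim in 7b5d156cc5 can be checked with Python's `struct` module, which can compute both native-aligned and packed layouts. This is a sketch of the arithmetic only, not of PackedHashMap itself; the variable names are illustrative. The format `@QH0Q` asks for a native-aligned UInt64 followed by a UInt16, with the end of the structure padded to UInt64 alignment, and `=QH` asks for the same fields with no padding.

```python
import struct

# Native layout: UInt64 key, UInt16 value, tail padded to the key's alignment.
padded_size = struct.calcsize("@QH0Q")

# Packed layout: the same two fields with no padding at all.
packed_size = struct.calcsize("=QH")

padding_bytes = padded_size - packed_size
wasted_fraction = padding_bytes / padded_size

print("padded cell:", padded_size, "bytes")
print("packed cell:", packed_size, "bytes")
print("padding:", padding_bytes, "bytes,", f"{wasted_fraction:.1%} of the padded cell")
```

On a typical amd64 build this reports a 16-byte padded cell and a 10-byte packed cell, so 6 of every 16 bytes (37.5%) are padding, which matches the "almost 40%" in the commit message. Other platforms may align 64-bit integers differently.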
Lirikl
39a505e3f3 init 2023-05-11 22:54:00 +03:00
lgbo-ustc
a07359fbe8 enable used flags' reinit only when the hash table rehashes 2023-05-11 11:06:13 +08:00
Alexey Milovidov
8a6e07f0ea Make projections production-ready 2023-05-10 03:35:13 +02:00
Alexey Milovidov
f449df85b6 Deprecate in-memory parts 2023-05-03 00:31:09 +02:00
Alexey Milovidov
c279516ac1
Merge branch 'master' into parallel-reading-from-file 2023-04-10 08:02:50 +03:00
Igor Nikonov
8fdc2b3326 Perf test 2023-04-07 20:06:11 +00:00
Anton Popov
10d2b1330b add perf test 2023-04-04 21:29:52 +00:00
Anton Popov
1e79245b94 add tests 2023-03-28 17:20:05 +00:00
Ongkong
d9c7bc1859
Fix ASOF LEFT JOIN performance degradation (#47544) 2023-03-18 23:53:00 +01:00
LiuNeng
d4c5ab9dcd
Optimize one nullable key aggregate performance (#45772) 2023-03-02 21:01:52 +01:00
Igor Nikonov
2f7aa8849b
Merge branch 'master' into remove-perf-test-duplicate-order-by-and-distinct 2023-03-02 20:48:28 +01:00
Igor Nikonov
548d79c2e8 Remove perf test duplicate_order_by_and_distinct.xml 2023-03-02 12:31:09 +00:00
Alexander Gololobov
f64d08bd5c Enable lightweight delete support by default 2023-03-01 19:35:55 +01:00
Nikita Taranov
ab44740efb
Enable perf tests added in #45364 (#46623) 2023-02-28 00:26:11 +01:00
Alexey Milovidov
17992b178a
Merge pull request #45364 from nickitat/aggr_partitions_independently
Add option to aggregate partitions independently
2023-02-19 17:44:18 +03:00
Alexey Milovidov
417158f59f
Merge branch 'master' into lower_upper 2023-02-19 04:05:10 +03:00
Nikita Taranov
f70044f34b Merge branch 'master' into aggr_partitions_independently 2023-02-18 13:19:05 +00:00
Alexey Milovidov
6e0dab71ed
Merge pull request #46188 from bigo-sg/rewrite_array_exists
Rewrite array exists to has
2023-02-12 05:53:22 +03:00
Alexey Milovidov
786aa069e1
Merge pull request #46187 from ClickHouse/speed-up-count-digits
Speed up `countDigits`
2023-02-10 07:41:12 +03:00
taiyang-li
b83ad6bb81 add perf test 2023-02-09 12:30:50 +08:00
Alexey Milovidov
9a86d0087c Add performance test 2023-02-09 04:52:33 +01:00
Alexey Milovidov
66043eec24
Merge branch 'master' into decimal-performance 2023-02-09 04:59:37 +03:00
Igor Nikonov
72c393e7c4
Merge pull request #46014 from ClickHouse/inorder-optimization-update-sorting-properties
Update sorting properties after reading in order applied
2023-02-08 10:19:47 +01:00
Alexey Milovidov
a2df6e950e Whitespace 2023-02-08 03:38:23 +01:00
Alexey Milovidov
168fbc9d7b Add a test 2023-02-08 02:17:23 +01:00
李扬
444373679a
Merge branch 'master' into improve_decimal 2023-02-06 13:08:51 +08:00
Igor Nikonov
089a0009ad Polishing
+ try to stabilize distinct in order perf test
2023-02-05 13:38:20 +00:00
Nikita Taranov
b983b363f8 Merge branch 'master' into aggr_partitions_independently 2023-02-04 18:24:31 +00:00
李扬
ad6f39389d
Update tests/performance/column_array_filter.xml
Co-authored-by: Alexander Gololobov <440544+davenger@users.noreply.github.com>
2023-02-04 18:49:13 +08:00
Nikita Mikhaylov
33877b5e00
Parallel replicas. Part [2] (#43772) 2023-02-03 14:34:18 +01:00
taiyang-li
36a98a1628 add performance tests 2023-02-02 20:16:16 +08:00
Nikita Taranov
e7ca90adab fix perf test 2023-01-30 17:11:56 +00:00
Nikita Taranov
ac77808133 fix perf test 2023-01-30 17:11:56 +00:00
Nikita Taranov
52fe7edbd9 better key analysis 2023-01-30 17:11:56 +00:00
Nikita Taranov
2057db68a2 cosmetics 2023-01-30 17:10:45 +00:00
Nikita Taranov
1d45cce03c support for aggr in order 2023-01-30 17:10:45 +00:00
Nikita Taranov
a2c9aeb7c9 stash 2023-01-30 17:10:45 +00:00
taiyang-li
d25740da83 change as requested 2023-01-30 16:13:12 +08:00
Alexey Milovidov
bc2f454522
Merge branch 'master' into block-non-float-gorilla-v2 2023-01-28 03:30:12 +03:00
Igor Nikonov
300f78df96
Merge pull request #45567 from ClickHouse/enable-remove-redundant-sorting
Enable query_plan_remove_redundant_sorting optimization by default
2023-01-27 19:14:36 +01:00
Igor Nikonov
41b94b4954 Enable query_plan_remove_redundant_sorting optimization by default 2023-01-24 13:38:21 +00:00
Robert Schulze
97d1bed114
Merge branch 'master' into improve_week_day 2023-01-21 20:40:33 +01:00
Robert Schulze
e6167d6b36
Deprecate Gorilla compression of non-float columns
Reasons:

1. The original Gorilla paper proposed a compression schema for pairs of
   time stamps and double-precision FP values. ClickHouse's Gorilla
   codec only implements compression of the latter and it does not
   impose any data type restrictions.
   - Data types != Float* or (U)Int* (e.g. Decimal, Point etc.) are
     definitely not supposed to be used with Gorilla.
   - (U)Int* types are debatable. The paper only considers
     integers-stored-as-FP-values, a practical use case for which
     Gorilla works well. Standalone integers are not considered which
     makes them at least suspicious.

2. Achieve consistency with FPC, another specialized floating-point
   timeseries codec, which rejects non-float data.

3. On practical datasets, ZSTD is often "good enough" (**) so it should
   be okay to disincentivize non-ZSTD codecs a little bit. If needed,
   Delta and DoubleDelta codecs are viable alternatives for slowly
   changing (time-series-like) integer sequences.

Since on-prem and hosted users may still have Gorilla-compressed
non-float data, this combination is only deprecated for now. No warning
or error will be emitted. Users are encouraged to migrate
Gorilla-compressed non-float data to an alternative codec. It is planned
to treat Gorilla-compressed non-float columns as "suspicious" six months
after this commit (i.e. in v23.6). Even then, it will still be possible
to set "allow_suspicious_codecs = true" and read and write
Gorilla-compressed non-float data.

(*) Sec. 4.1.2, "Gorilla restricts the value element in its tuple to a
    double floating point type.", https://doi.org/10.14778/2824032.2824078

(**) https://clickhouse.com/blog/optimize-clickhouse-codecs-compression-schema
2023-01-20 17:31:16 +00:00
Igor Nikonov
7ed8fec94f
Revert "Remove redundant sorting" 2023-01-18 18:38:25 +01:00
Igor Nikonov
72066846cf
Merge pull request #43905 from ClickHouse/igor/remove_redundant_order_by
Remove redundant sorting
2023-01-18 13:25:03 +01:00
Igor Nikonov
0cfa08df7a Merge remote-tracking branch 'origin/master' into igor/remove_redundant_order_by 2023-01-17 16:28:17 +00:00
Alexander Tokmakov
df75c24f01
Revert "Disallow Gorilla codec on non-float columns" 2023-01-16 19:14:28 +03:00
Igor Nikonov
a34991cb65 Merge remote-tracking branch 'origin/master' into igor/remove_redundant_order_by 2023-01-16 12:14:02 +00:00
Robert Schulze
bd41c74ddf
Various test, code and docs fixups 2023-01-15 13:47:34 +00:00
Robert Schulze
7023d68536
Fix codecs_int_*.xml 2023-01-15 13:31:45 +00:00
Azat Khuzhin
925fd2c33a tests/performance: do not use scientific notation in hashed_dictionary_sharded
v2: fix a few mistakes
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-01-13 13:39:26 +01:00
Azat Khuzhin
345c422e28 Add ability to load hashed dictionaries using multiple threads
Right now dictionaries (here I will talk only about
HASHED/SPARSE_HASHED/COMPLEX_KEY_HASHED/COMPLEX_KEY_SPARSE_HASHED)
can load data in only one thread, since each uses one hash table that
cannot be filled from multiple threads.

And in case you have a very big dictionary (i.e. 10e9 elements), it can
take a while to load them, especially for SPARSE_HASHED variants (and
if you have such an amount of elements there, you are likely to use
SPARSE_HASHED, since it requires less memory). In my env it takes ~4
hours, which is an enormous amount of time.

So this patch adds support for shards in dictionaries. The number of
shards determines how many hash tables the dictionary will use and,
more importantly, how many threads it can use to load the data.

And with 16 threads this works 2x faster, not perfect though, see the
follow up patches in this series.
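
The idea of sharding one hash table into several, so that each can be
filled by its own thread, can be sketched roughly as below. This is only
an illustration in Python, not the ClickHouse implementation: the names
`load_sharded` and `lookup` are hypothetical, and plain dicts stand in
for the dictionary's hash tables. In CPython the GIL limits the actual
speedup, so the sketch shows the structure rather than the performance.

```python
from concurrent.futures import ThreadPoolExecutor


def load_sharded(rows, shards):
    """Load (key, value) rows into `shards` independent hash tables.

    Hypothetical helper for illustration only.
    """
    # Route each row to a shard by key hash, so every key lives in
    # exactly one table and lookups know where to go.
    buckets = [[] for _ in range(shards)]
    for key, value in rows:
        buckets[hash(key) % shards].append((key, value))

    # Each worker builds its own dict from its own bucket; no table is
    # shared between threads, so no locking is needed while filling.
    with ThreadPoolExecutor(max_workers=shards) as pool:
        return list(pool.map(dict, buckets))


def lookup(tables, key, default=None):
    """Find a key by recomputing which shard it was routed to."""
    return tables[hash(key) % len(tables)].get(key, default)


tables = load_sharded([(i, i * i) for i in range(1000)], shards=4)
print(lookup(tables, 7))
```

The number of shards fixes both the number of tables and the maximum
number of loader threads, which matches the trade-off described above.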

v0: PARTITION BY
v1: SHARDS 1
v2: SHARDS(1)
v3: tried optimized mod - logical and, but it does not gain even 10%
v4: tried squashing more (max_block_size * shards), but it does not gain even 10% either
v5: move SHARDS into layout parameters (unknown simply ignored)
v6: tune params for perf tests (to avoid too long queries)
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-01-13 13:39:25 +01:00
Nikolai Kochetov
30310df5be
Merge branch 'master' into logical-optimizer-lowcardinality 2023-01-12 18:51:05 +01:00
Nikita Taranov
006fdd32d4
Apply preallocation optimisation more carefully (#44455)
* impl

* add perf test

* fix

* review fixes
2023-01-09 13:30:48 +01:00
Igor Nikonov
2187bdd4cc Disable diagnostics
+ cleanup
+ disable optimization in sort performance test since it removes sorting
  altogether
2023-01-06 17:00:05 +00:00
Nikolay Degterinsky
dfe93b5d82
Merge pull request #42284 from Algunenano/perf_experiment
Performance experiment
2022-12-30 03:14:22 +01:00
Alexey Milovidov
79f2e747e4 Remove QuestDB (flaky test) 2022-12-28 12:42:14 +01:00
Raúl Marín
fc1fa82a39
Merge branch 'master' into perf_experiment 2022-12-27 10:51:58 +01:00
Raúl Marín
45d27f461b
Merge branch 'master' into perf_experiment 2022-12-20 09:07:48 +00:00
Kruglov Pavel
37df9b9990
Merge branch 'master' into refactor-schema-inference 2022-12-16 19:13:15 +01:00
Azat Khuzhin
53bac4de71 tests/perf: fix dependency check during DROP
CI [1]:

    DB::Exception: Cannot drop or rename default.hierarchical_dictionary_source_table, because some tables depend on it: default.hierarchical_hashed_array_dictionary, default.hierarchical_flat_dictionary, default.hierarchical_hashed_dictionary. Stack trace:

  [1]: https://s3.amazonaws.com/clickhouse-test-reports/44256/8e67a361a8f14abec6717af09ee997eb25151685/performance_comparison_[1/4]/report.html

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-12-16 15:15:15 +01:00
Nikolay Degterinsky
9b6d31b95d
Merge branch 'master' into perf_experiment 2022-12-13 17:15:07 +01:00
avogar
7375a7d429 Refactor and improve schema inference for text formats 2022-12-07 21:19:27 +00:00
Guo Wangyang
b86686b3f8
Merge branch 'master' into logical-optimizer-lowcardinality 2022-12-07 13:33:25 +08:00
Maksim Kita
1cdc7ab62a
Merge pull request #43556 from Algunenano/interpretation_benchmark
Add benchmark for query interpretation with JOINs
2022-12-01 22:53:02 +03:00
Vladimir C
53dc70a2d0
Merge pull request #38191 from BigRedEye/grace_hash_join
Closes https://github.com/ClickHouse/ClickHouse/issues/11596
2022-11-30 17:01:00 +01:00
Nikolai Kochetov
51439e2c19
Merge pull request #43260 from ClickHouse/read-from-mt-in-io-pool
Read from MergeTree in I/O pool
2022-11-29 12:09:03 +01:00
Nikolai Kochetov
d9fc13b230
Update async_remote_read.xml 2022-11-28 14:00:49 +01:00
Nikita Taranov
8ed5cfc265
Memory bound merging for distributed aggregation in order (#40879)
* impl

* fix style

* make executeQueryWithParallelReplicas similar to executeQuery

* impl for parallel replicas

* cleaner code for remote sorting properties

* update test

* fix

* handle when nodes of old versions participate

* small fixes

* temporary enable for testing

* fix after merge

* Revert "temporary enable for testing"

This reverts commit cce7f8884c.

* review fixes

* add bc test

* Update src/Core/Settings.h
2022-11-28 00:41:31 +01:00
Nikita Taranov
d1c258cf20
Add xxh3 hash function (#43411)
* impl

* try fix

* add docs

* add test

* rm unused file

* excellent
2022-11-26 00:14:08 +01:00
Nikolai Kochetov
4632e7c644 Add max_streams_for_merge_tree_reading setting. 2022-11-25 17:14:22 +00:00
Nikolai Kochetov
dfd3976040
Update async_remote_read.xml 2022-11-25 14:53:45 +01:00
Igor Nikonov
236e7e3989 Small fixes 2022-11-25 12:04:12 +00:00
Igor Nikonov
20e67b7140 Merge remote-tracking branch 'origin/master' into HEAD 2022-11-24 13:10:37 +00:00
Nikolai Kochetov
e79c91947a
Update async_remote_read.xml 2022-11-24 12:35:02 +01:00
Raúl Marín
e910648c5d Add benchmark for query interpretation with JOINs 2022-11-23 13:15:35 +01:00
Raúl Marín
ed0c174c0c Merge remote-tracking branch 'blessed/master' into perf_experiment 2022-11-21 11:02:31 +01:00
Guo Wangyang
7d6ff90e34
Merge branch 'master' into logical-optimizer-lowcardinality 2022-11-20 09:56:50 +08:00
Nikolai Kochetov
5da1d893fd
Merge branch 'master' into read-from-mt-in-io-pool 2022-11-18 21:10:45 +01:00
Nikita Taranov
7beb58b0cf
Optimize merge of uniqExact without_key (#43072)
* impl for uniqExact

* rm unused (read|write)Text methods

* fix style

* small fixes

* impl for variadic uniqExact

* refactor

* fix style

* more aggressive inlining

* disable if max_threads=1

* small improvements

* review fixes

* Revert "rm unused (read|write)Text methods"

This reverts commit a7e7480584.

* encapsulate is_able_to_parallelize_merge in Data

* encapsulate is_exact & argument_is_tuple in Data
2022-11-17 13:19:02 +01:00
Kruglov Pavel
1b68f605a2
Merge pull request #42761 from AlfVII/fix-slow-json-extract-with-low-cardinality
Fixed slowness in JSONExtract with LowCardinality(String) tuples
2022-11-17 12:49:18 +01:00
Raúl Marín
97d6fc3071 Merge remote-tracking branch 'blessed/master' into perf_experiment 2022-11-17 11:48:46 +01:00
Nikolai Kochetov
10f449c6c1 Add a query to perftest. 2022-11-15 18:08:03 +00:00
李扬
1de5bb2392
Add function canonicalRand (#43124)
* add function canonicalRand

* add perf test

* revert rand.xml
2022-11-15 00:27:19 +01:00
Wangyang Guo
887779e8d8 Add perftest: low_cardinality_query 2022-11-08 17:19:18 +08:00