Commit Graph

1062 Commits

Author SHA1 Message Date
Igor Nikonov
91f7185e8c
Merge branch 'master' into remove-perf-test-duplicate-order-by-and-distinct 2023-07-24 18:47:23 +02:00
Igor Nikonov
90e393ecf6 Merge remote-tracking branch 'origin/master' into remove-perf-test-duplicate-order-by-and-distinct 2023-07-18 14:26:22 +00:00
Alexey Milovidov
62bfa4ed93 Fix performance test for regexp cache 2023-07-09 02:21:48 +02:00
vdimir
737cff7e57 Remove whole join_set_filter.xml, will resubmit 2023-07-03 17:00:20 +02:00
vdimir
9ea5d929a5 Update tests/performance/join_set_filter.xml 2023-07-03 17:00:20 +02:00
vdimir
ebd7ecb230 Remove unstable queries from performance/join_set_filter 2023-07-03 17:00:20 +02:00
Igor Nikonov
35bc97e5f9 Merge remote-tracking branch 'origin/master' into remove-perf-test-duplicate-order-by-and-distinct 2023-06-16 20:56:56 +00:00
Azat Khuzhin
5caa3a9e80 Adjust min_insert_block_size_rows for materialized_view_parallelize_output_from_storages
Otherwise it is too slow for perf tests on CI [1].

  [1]: https://s3.amazonaws.com/clickhouse-test-reports/50214/e287ec50920c7cadabea6ec19ef14b353345ac93/performance_comparison_[3_4]/report.html

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-06-14 19:11:23 +03:00
Azat Khuzhin
3e419730c3 Disable parallelize_output_from_storages for processing MATERIALIZED VIEWs
Adding more processors for parallelize_output_from_storages is not a
costless operation (I've experienced some issues in production because
of this), and it is not easy to fix in a normal way, so let's disable it
for now.

Before this patch:
- INSERT INTO input SELECT * FROM numbers(10e6) SETTINGS parallelize_output_from_storages=1, min_insert_block_size_rows=1000
  0 rows in set. Elapsed: 3.648 sec. Processed 20.00 million rows, 120.00 MB (5.48 million rows/s., 32.90 MB/s.)

- INSERT INTO input SELECT * FROM numbers(10e6) SETTINGS parallelize_output_from_storages=0, min_insert_block_size_rows=1000
  0 rows in set. Elapsed: 1.851 sec. Processed 20.00 million rows, 120.00 MB (10.80 million rows/s., 64.82 MB/s.)

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-06-14 19:11:23 +03:00
Igor Nikonov
79f53f428b
Merge branch 'master' into remove-perf-test-duplicate-order-by-and-distinct 2023-06-13 13:45:36 +02:00
flynn
92c87dedad
Add parallel state merge for some other combinator except If (#50413)
* Add parallel state merge for some other combinator except If

* add test

* update test
2023-06-08 00:41:32 +02:00
flynn
f616314f8b fix typo 2023-05-29 02:22:13 +00:00
flynn
05783f99cd update test 2023-05-28 14:17:59 +00:00
flynn
ec82c657eb Parallel merge of uniqExactIf states 2023-05-28 06:04:23 +00:00
Azat Khuzhin
2996b38606 Add ability to configure maximum load factor for the HASHED/SPARSE_HASHED layout
As it turns out, HashMap/PackedHashMap works great even with max load
factor of 0.99. By "great" I mean it least it works faster then
google sparsehash, and not to mention it's friendliness to the memory
allocator (it has zero fragmentation since it works with a continuious
memory region, in comparison to the sparsehash that doing lots of
realloc, which jemalloc does not like, due to it's slabs).

Here is a table of different setups:

settings                         | load (sec) | read (sec) | read (million rows/s) | bytes_allocated | RSS
-                                | -          | -          | -                     | -               | -
HASHED upstream                  | -          | -          | -                     | -               | 35GiB
SPARSE_HASHED upstream           | -          | -          | -                     | -               | 26GiB
-                                | -          | -          | -                     | -               | -
sparse_hash_map glibc hashbench  | -          | -          | -                     | -               | 17.5GiB
sparse_hash_map packed allocator | 101.878    | 231.48     | 4.32                  | -               | 17.7GiB
PackedHashMap 0.5                | 15.514     | 42.35      | 23.61                 | 20GiB           | 22GiB
hashed 0.95                      | 34.903     | 115.615    | 8.65                  | 16GiB           | 18.7GiB
**PackedHashMap 0.95**           | **93.6**   | **19.883** | **10.68**             | **10GiB**       | **12.8GiB**
PackedHashMap 0.99               | 26.113     | 83.6       | 11.96                 | 10GiB           | 12.3GiB

As it shows, PackedHashMap with 0.95 max_load_factor, eats 2.6x less
memory then SPARSE_HASHED in upstream, and it also 2x faster for read!

v2: fix grower
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-05-19 06:07:21 +02:00
Azat Khuzhin
7b5d156cc5 Optimize SPARSE_HASHED layout (by using PackedHashMap)
In case you want dictionary optimized for memory, SPARSE_HASHED is not
always gives you what you need.

Consider the following example <UInt64, UInt16> as <Key, Value>, but
this pair will also have a 6 byte padding (on amd64), so this is almost
40% of space wastage.

And because of this padding, even google::sparse_hash_map, does not make
picture better, in fact, sparse_hash_map is not very friendly to memory
allocators (especially jemalloc).

Here are some numbers for dictionary with 1e9 elements and UInt64 as
key, and UInt16 as value:

settings                         | load (sec) | read (sec) | read (million rows/s) | bytes_allocated | RSS
HASHED upstream                  | -          | -          | -                     | -               | 35GiB
SPARSE_HASHED upstream           | -          | -          | -                     | -               | 26GiB
-                                | -          | -          | -                     | -               | -
sparse_hash_map glibc hashbench  | -          | -          | -                     | -               | 17.5GiB
sparse_hash_map packed allocator | 101.878    | 231.48     | 4.32                  | -               | 17.7GiB
PackedHashMap                    | 15.514     | 42.35      | 23.61                 | 20GiB           | 22GiB

As you can see PackedHashMap looks way more better then HASHED, and
even better then SPARSE_HASHED, but slightly worse then sparse_hash_map
with packed allocator (it is done with a custom patch to google
sparse_hash_map).

v2: rebase on top of bucket_count fix
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-05-19 06:07:21 +02:00
lgbo-ustc
a07359fbe8 enable used flags's reinit only when the hash talbe rehash 2023-05-11 11:06:13 +08:00
Alexey Milovidov
8a6e07f0ea Make projections production-ready 2023-05-10 03:35:13 +02:00
Alexey Milovidov
f449df85b6 Deprecate in-memory parts 2023-05-03 00:31:09 +02:00
Alexey Milovidov
c279516ac1
Merge branch 'master' into parallel-reading-from-file 2023-04-10 08:02:50 +03:00
Igor Nikonov
8fdc2b3326 Perf test 2023-04-07 20:06:11 +00:00
Anton Popov
10d2b1330b add perf test 2023-04-04 21:29:52 +00:00
Anton Popov
1e79245b94 add tests 2023-03-28 17:20:05 +00:00
Ongkong
d9c7bc1859
Fix ASOF LEFT JOIN performance degradation (#47544) 2023-03-18 23:53:00 +01:00
LiuNeng
d4c5ab9dcd
Optimize one nullable key aggregate performance (#45772) 2023-03-02 21:01:52 +01:00
Igor Nikonov
2f7aa8849b
Merge branch 'master' into remove-perf-test-duplicate-order-by-and-distinct 2023-03-02 20:48:28 +01:00
Igor Nikonov
548d79c2e8 Remove perf test duplicate_order_by_and_distinct.xml 2023-03-02 12:31:09 +00:00
Alexander Gololobov
f64d08bd5c Enable lightweight delete support by default 2023-03-01 19:35:55 +01:00
Nikita Taranov
ab44740efb
Enable perf tests added in #45364 (#46623) 2023-02-28 00:26:11 +01:00
Alexey Milovidov
17992b178a
Merge pull request #45364 from nickitat/aggr_partitions_independently
Add option to aggregate partitions independently
2023-02-19 17:44:18 +03:00
Alexey Milovidov
417158f59f
Merge branch 'master' into lower_upper 2023-02-19 04:05:10 +03:00
Nikita Taranov
f70044f34b Merge branch 'master' into aggr_partitions_independently 2023-02-18 13:19:05 +00:00
Alexey Milovidov
6e0dab71ed
Merge pull request #46188 from bigo-sg/rewrite_array_exists
Rewrite array exists to has
2023-02-12 05:53:22 +03:00
Alexey Milovidov
786aa069e1
Merge pull request #46187 from ClickHouse/speed-up-count-digits
Speed up `countDigits`
2023-02-10 07:41:12 +03:00
taiyang-li
b83ad6bb81 add perf test 2023-02-09 12:30:50 +08:00
Alexey Milovidov
9a86d0087c Add performance test 2023-02-09 04:52:33 +01:00
Alexey Milovidov
66043eec24
Merge branch 'master' into decimal-performance 2023-02-09 04:59:37 +03:00
Igor Nikonov
72c393e7c4
Merge pull request #46014 from ClickHouse/inorder-optimization-update-sorting-properties
Update sorting properties after reading in order applied
2023-02-08 10:19:47 +01:00
Alexey Milovidov
a2df6e950e Whitespace 2023-02-08 03:38:23 +01:00
Alexey Milovidov
168fbc9d7b Add a test 2023-02-08 02:17:23 +01:00
李扬
444373679a
Merge branch 'master' into improve_decimal 2023-02-06 13:08:51 +08:00
Igor Nikonov
089a0009ad Polishing
+ try to stabilize distinct in order perf test
2023-02-05 13:38:20 +00:00
Nikita Taranov
b983b363f8 Merge branch 'master' into aggr_partitions_independently 2023-02-04 18:24:31 +00:00
李扬
ad6f39389d
Update tests/performance/column_array_filter.xml
Co-authored-by: Alexander Gololobov <440544+davenger@users.noreply.github.com>
2023-02-04 18:49:13 +08:00
Nikita Mikhaylov
33877b5e00
Parallel replicas. Part [2] (#43772) 2023-02-03 14:34:18 +01:00
taiyang-li
36a98a1628 add performance tests 2023-02-02 20:16:16 +08:00
Nikita Taranov
e7ca90adab fix perf test 2023-01-30 17:11:56 +00:00
Nikita Taranov
ac77808133 fix perf test 2023-01-30 17:11:56 +00:00
Nikita Taranov
52fe7edbd9 better key analysis 2023-01-30 17:11:56 +00:00
Nikita Taranov
2057db68a2 cosmetics 2023-01-30 17:10:45 +00:00