Commit Graph

1036 Commits

Author SHA1 Message Date
LiuNeng
d4c5ab9dcd
Optimize one nullable key aggregate performance (#45772) 2023-03-02 21:01:52 +01:00
Alexander Gololobov
f64d08bd5c Enable lightweight delete support by default 2023-03-01 19:35:55 +01:00
Nikita Taranov
ab44740efb
Enable perf tests added in #45364 (#46623) 2023-02-28 00:26:11 +01:00
Alexey Milovidov
17992b178a
Merge pull request #45364 from nickitat/aggr_partitions_independently
Add option to aggregate partitions independently
2023-02-19 17:44:18 +03:00
Alexey Milovidov
417158f59f
Merge branch 'master' into lower_upper 2023-02-19 04:05:10 +03:00
Nikita Taranov
f70044f34b Merge branch 'master' into aggr_partitions_independently 2023-02-18 13:19:05 +00:00
Alexey Milovidov
6e0dab71ed
Merge pull request #46188 from bigo-sg/rewrite_array_exists
Rewrite array exists to has
2023-02-12 05:53:22 +03:00
Alexey Milovidov
786aa069e1
Merge pull request #46187 from ClickHouse/speed-up-count-digits
Speed up `countDigits`
2023-02-10 07:41:12 +03:00
taiyang-li
b83ad6bb81 add perf test 2023-02-09 12:30:50 +08:00
Alexey Milovidov
9a86d0087c Add performance test 2023-02-09 04:52:33 +01:00
Alexey Milovidov
66043eec24
Merge branch 'master' into decimal-performance 2023-02-09 04:59:37 +03:00
Igor Nikonov
72c393e7c4
Merge pull request #46014 from ClickHouse/inorder-optimization-update-sorting-properties
Update sorting properties after reading in order applied
2023-02-08 10:19:47 +01:00
Alexey Milovidov
a2df6e950e Whitespace 2023-02-08 03:38:23 +01:00
Alexey Milovidov
168fbc9d7b Add a test 2023-02-08 02:17:23 +01:00
李扬
444373679a
Merge branch 'master' into improve_decimal 2023-02-06 13:08:51 +08:00
Igor Nikonov
089a0009ad Polishing
+ try to stabilize distinct in order perf test
2023-02-05 13:38:20 +00:00
Nikita Taranov
b983b363f8 Merge branch 'master' into aggr_partitions_independently 2023-02-04 18:24:31 +00:00
李扬
ad6f39389d
Update tests/performance/column_array_filter.xml
Co-authored-by: Alexander Gololobov <440544+davenger@users.noreply.github.com>
2023-02-04 18:49:13 +08:00
Nikita Mikhaylov
33877b5e00
Parallel replicas. Part [2] (#43772) 2023-02-03 14:34:18 +01:00
taiyang-li
36a98a1628 add performance tests 2023-02-02 20:16:16 +08:00
Nikita Taranov
e7ca90adab fix perf test 2023-01-30 17:11:56 +00:00
Nikita Taranov
ac77808133 fix perf test 2023-01-30 17:11:56 +00:00
Nikita Taranov
52fe7edbd9 better key analysis 2023-01-30 17:11:56 +00:00
Nikita Taranov
2057db68a2 cosmetics 2023-01-30 17:10:45 +00:00
Nikita Taranov
1d45cce03c support for aggr in order 2023-01-30 17:10:45 +00:00
Nikita Taranov
a2c9aeb7c9 stash 2023-01-30 17:10:45 +00:00
taiyang-li
d25740da83 change as request 2023-01-30 16:13:12 +08:00
Alexey Milovidov
bc2f454522
Merge branch 'master' into block-non-float-gorilla-v2 2023-01-28 03:30:12 +03:00
Igor Nikonov
300f78df96
Merge pull request #45567 from ClickHouse/enable-remove-redundant-sorting
Enable query_plan_remove_redundant_sorting optimization by default
2023-01-27 19:14:36 +01:00
Igor Nikonov
41b94b4954 Enable query_plan_remove_redundant_sorting optimization by default 2023-01-24 13:38:21 +00:00
Robert Schulze
97d1bed114
Merge branch 'master' into improve_week_day 2023-01-21 20:40:33 +01:00
Robert Schulze
e6167d6b36
Deprecate Gorilla compression of non-float columns
Reasons:

1. The original Gorilla paper proposed a compression schema for pairs of
   time stamps and double-precision FP values. ClickHouse's Gorilla
   codec only implements compression of the latter and it does not
   impose any data type restrictions.
   - Data types != Float* or (U)Int* (e.g. Decimal, Point etc.) are
     definitely not supposed to be used with Gorilla.
   - (U)Int* types are debatable. The paper only considers
     integers-stored-as-FP-values, a practical use case for which
     Gorilla works well. Standalone integers are not considered which
     makes them at least suspicious.

2. Achieve consistency with FPC, another specialized floating-point
   timeseries codec, which rejects non-float data.

3. On practical datasets, ZSTD is often "good enough" (**) so it should
   be okay to disincentive non-ZSTD codecs a little bit. If needed,
   Delta and DoubleDelta codecs are viable alternative for slowly
   changing (time-series-like) integer sequences.

Since on-prem and hosted users may still have Gorilla-compressed
non-float data, this combination is only deprecated for now. No warning
or error will be emitted. Users are encouraged to migrate
Gorilla-compressed non-float data to an alternative codec. It is planned
to treat Gorilla-compressed non-float columns as "suspicious" six months
after this commit (i.e. in v23.6). Even then, it will still be possible
to set "allow_suspicious_codecs = true" and read and write
Gorilla-compressed non-float data.

(*) Sec. 4.1.2, "Gorilla restricts the value element in its tuple to a
    double floating point type.", https://doi.org/10.14778/2824032.2824078

(**) https://clickhouse.com/blog/optimize-clickhouse-codecs-compression-schema
2023-01-20 17:31:16 +00:00
Igor Nikonov
7ed8fec94f
Revert "Remove redundant sorting" 2023-01-18 18:38:25 +01:00
Igor Nikonov
72066846cf
Merge pull request #43905 from ClickHouse/igor/remove_redundant_order_by
Remove redundant sorting
2023-01-18 13:25:03 +01:00
Igor Nikonov
0cfa08df7a Merge remote-tracking branch 'origin/master' into igor/remove_redundant_order_by 2023-01-17 16:28:17 +00:00
Alexander Tokmakov
df75c24f01
Revert "Disallow Gorilla codec on non-float columns" 2023-01-16 19:14:28 +03:00
Igor Nikonov
a34991cb65 Merge remote-tracking branch 'origin/master' into igor/remove_redundant_order_by 2023-01-16 12:14:02 +00:00
Robert Schulze
bd41c74ddf
Various test, code and docs fixups 2023-01-15 13:47:34 +00:00
Robert Schulze
7023d68536
Fix codecs_int_*.xml 2023-01-15 13:31:45 +00:00
Azat Khuzhin
925fd2c33a tests/performance: do not use scientific notation in hashed_dictionary_sharded
v2: fix few mistakes
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-01-13 13:39:26 +01:00
Azat Khuzhin
345c422e28 Add ability to load hashed dictionaries using multiple threads
Right now dictionaries (here I will talk about only
HASHED/SPARSE_HASHED/COMPLEX_KEY_HASHED/COMPLEX_KEY_SPARSE_HASHED)
can load data only in one thread, since it uses one hash table that
cannot be filled from multiple threads.

And in case you have very big dictionary (i.e. 10e9 elements), it can
take a awhile to load them, especially for SPARSE_HASHED variants (and
if you have such amount of elements there, you are likely use
SPARSE_HASHED, since it requires less memory), in my env it takes ~4
hours, which is enormous amount of time.

So this patch add support of shards for dictionaries, number of shards
determine how much hash tables will use this dictionary, also, and which
is more important, how much threads it can use to load the data.

And with 16 threads this works 2x faster, not perfect though, see the
follow up patches in this series.

v0: PARTITION BY
v1: SHARDS 1
v2: SHARDS(1)
v3: tried optimized mod - logical and, but it does not gain even 10%
v4: tried squashing more (max_block_size * shards), but it does not gain even 10% either
v5: move SHARDS into layout parameters (unknown simply ignored)
v6: tune params for perf tests (to avoid too long queries)
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-01-13 13:39:25 +01:00
Nikolai Kochetov
30310df5be
Merge branch 'master' into logical-optimizer-lowcardinality 2023-01-12 18:51:05 +01:00
Nikita Taranov
006fdd32d4
Apply preallocation optimisation more carefully (#44455)
* impl

* add perf test

* fix

* review fixes
2023-01-09 13:30:48 +01:00
Igor Nikonov
2187bdd4cc Disable diagnostics
+ cleanup
+ disable optimization in sort performance test since it removes sorting
  at all
2023-01-06 17:00:05 +00:00
Nikolay Degterinsky
dfe93b5d82
Merge pull request #42284 from Algunenano/perf_experiment
Performance experiment
2022-12-30 03:14:22 +01:00
Alexey Milovidov
79f2e747e4 Remove QuestDB (flaky test) 2022-12-28 12:42:14 +01:00
Raúl Marín
fc1fa82a39
Merge branch 'master' into perf_experiment 2022-12-27 10:51:58 +01:00
Raúl Marín
45d27f461b
Merge branch 'master' into perf_experiment 2022-12-20 09:07:48 +00:00
Kruglov Pavel
37df9b9990
Merge branch 'master' into refactor-schema-inference 2022-12-16 19:13:15 +01:00
Azat Khuzhin
53bac4de71 tests/perf: fix dependency check during DROP
CI [1]:

    DB::Exception: Cannot drop or rename default.hierarchical_dictionary_source_table, because some tables depend on it: default.hierarchical_hashed_array_dictionary, default.hierarchical_flat_dictionary, default.hierarchical_hashed_dictionary. Stack trace:

  [1]: https://s3.amazonaws.com/clickhouse-test-reports/44256/8e67a361a8f14abec6717af09ee997eb25151685/performance_comparison_[1/4]/report.html

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-12-16 15:15:15 +01:00