1049 Commits

Author SHA1 Message Date
Robert Schulze
bd41c74ddf
Various test, code and docs fixups 2023-01-15 13:47:34 +00:00
Robert Schulze
7023d68536
Fix codecs_int_*.xml 2023-01-15 13:31:45 +00:00
Azat Khuzhin
925fd2c33a tests/performance: do not use scientific notation in hashed_dictionary_sharded
v2: fix a few mistakes
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-01-13 13:39:26 +01:00
Azat Khuzhin
345c422e28 Add ability to load hashed dictionaries using multiple threads
Right now dictionaries (here I will talk only about
HASHED/SPARSE_HASHED/COMPLEX_KEY_HASHED/COMPLEX_KEY_SPARSE_HASHED)
can load data in only one thread, since they use a single hash table that
cannot be filled from multiple threads.

If you have a very big dictionary (e.g. 10e9 elements), it can
take a while to load, especially for the SPARSE_HASHED variants (and
with that many elements you are likely to use
SPARSE_HASHED, since it requires less memory). In my environment it takes ~4
hours, which is an enormous amount of time.

So this patch adds support for shards in dictionaries. The number of shards
determines how many hash tables the dictionary uses and, more
importantly, how many threads it can use to load the data.

With 16 threads this works 2x faster; not perfect, though, see the
follow-up patches in this series.

v0: PARTITION BY
v1: SHARDS 1
v2: SHARDS(1)
v3: tried optimized mod - logical and, but it does not gain even 10%
v4: tried squashing more (max_block_size * shards), but it does not gain even 10% either
v5: move SHARDS into layout parameters (unknown simply ignored)
v6: tune params for perf tests (to avoid too long queries)
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-01-13 13:39:25 +01:00
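
The sharding idea described in the commit message above can be illustrated with a minimal sketch. This is not the ClickHouse implementation: the names `shard_of`, `load_sharded` and `lookup` are hypothetical, and the choice of `hash(key) % shards` as the routing function is an assumption made for illustration. Each shard owns its own hash table, so each table is filled by exactly one thread and no locking is needed.

```python
from concurrent.futures import ThreadPoolExecutor


def shard_of(key, shards):
    """Route a key to a shard (hypothetical routing: hash modulo shard count)."""
    return hash(key) % shards


def load_sharded(rows, shards):
    """Fill one hash table per shard, with one worker thread per shard."""
    # Partition the input so that every key lands in exactly one bucket.
    buckets = [[] for _ in range(shards)]
    for key, value in rows:
        buckets[shard_of(key, shards)].append((key, value))

    tables = [dict() for _ in range(shards)]

    def fill(i):
        # Only this worker ever touches tables[i], so no lock is required.
        tables[i].update(buckets[i])

    with ThreadPoolExecutor(max_workers=shards) as pool:
        list(pool.map(fill, range(shards)))
    return tables


def lookup(tables, key, default=None):
    """Find a key by consulting only the shard that owns it."""
    return tables[shard_of(key, len(tables))].get(key, default)


rows = [(i, i * i) for i in range(1000)]
tables = load_sharded(rows, 4)
print(lookup(tables, 31))
```

In CPython the global interpreter lock prevents a real speedup here; the sketch only shows the structure (partition by key, one table per shard, one loader per table), not the performance gain reported in the commit.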
Nikolai Kochetov
30310df5be
Merge branch 'master' into logical-optimizer-lowcardinality 2023-01-12 18:51:05 +01:00
Nikita Taranov
006fdd32d4
Apply preallocation optimisation more carefully (#44455)
* impl

* add perf test

* fix

* review fixes
2023-01-09 13:30:48 +01:00
Igor Nikonov
2187bdd4cc Disable diagnostics
+ cleanup
+ disable optimization in sort performance test since it removes sorting
  entirely
2023-01-06 17:00:05 +00:00
Nikolay Degterinsky
dfe93b5d82
Merge pull request #42284 from Algunenano/perf_experiment
Performance experiment
2022-12-30 03:14:22 +01:00
Alexey Milovidov
79f2e747e4 Remove QuestDB (flaky test) 2022-12-28 12:42:14 +01:00
Raúl Marín
fc1fa82a39
Merge branch 'master' into perf_experiment 2022-12-27 10:51:58 +01:00
Raúl Marín
45d27f461b
Merge branch 'master' into perf_experiment 2022-12-20 09:07:48 +00:00
Kruglov Pavel
37df9b9990
Merge branch 'master' into refactor-schema-inference 2022-12-16 19:13:15 +01:00
Azat Khuzhin
53bac4de71 tests/perf: fix dependency check during DROP
CI [1]:

    DB::Exception: Cannot drop or rename default.hierarchical_dictionary_source_table, because some tables depend on it: default.hierarchical_hashed_array_dictionary, default.hierarchical_flat_dictionary, default.hierarchical_hashed_dictionary. Stack trace:

  [1]: https://s3.amazonaws.com/clickhouse-test-reports/44256/8e67a361a8f14abec6717af09ee997eb25151685/performance_comparison_[1/4]/report.html

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-12-16 15:15:15 +01:00
Nikolay Degterinsky
9b6d31b95d
Merge branch 'master' into perf_experiment 2022-12-13 17:15:07 +01:00
avogar
7375a7d429 Refactor and improve schema inference for text formats 2022-12-07 21:19:27 +00:00
Guo Wangyang
b86686b3f8
Merge branch 'master' into logical-optimizer-lowcardinality 2022-12-07 13:33:25 +08:00
Maksim Kita
1cdc7ab62a
Merge pull request #43556 from Algunenano/interpretation_benchmark
Add benchmark for query interpretation with JOINs
2022-12-01 22:53:02 +03:00
Vladimir C
53dc70a2d0
Merge pull request #38191 from BigRedEye/grace_hash_join
Closes https://github.com/ClickHouse/ClickHouse/issues/11596
2022-11-30 17:01:00 +01:00
Nikolai Kochetov
51439e2c19
Merge pull request #43260 from ClickHouse/read-from-mt-in-io-pool
Read from MergeTree in I/O pool
2022-11-29 12:09:03 +01:00
Nikolai Kochetov
d9fc13b230
Update async_remote_read.xml 2022-11-28 14:00:49 +01:00
Nikita Taranov
8ed5cfc265
Memory bound merging for distributed aggregation in order (#40879)
* impl

* fix style

* make executeQueryWithParallelReplicas similar to executeQuery

* impl for parallel replicas

* cleaner code for remote sorting properties

* update test

* fix

* handle when nodes of old versions participate

* small fixes

* temporary enable for testing

* fix after merge

* Revert "temporary enable for testing"

This reverts commit cce7f8884c.

* review fixes

* add bc test

* Update src/Core/Settings.h
2022-11-28 00:41:31 +01:00
Nikita Taranov
d1c258cf20
Add xxh3 hash function (#43411)
* impl

* try fix

* add docs

* add test

* rm unused file

* excellent
2022-11-26 00:14:08 +01:00
Nikolai Kochetov
4632e7c644 Add max_streams_for_merge_tree_reading setting. 2022-11-25 17:14:22 +00:00
Nikolai Kochetov
dfd3976040
Update async_remote_read.xml 2022-11-25 14:53:45 +01:00
Igor Nikonov
236e7e3989 Small fixes 2022-11-25 12:04:12 +00:00
Igor Nikonov
20e67b7140 Merge remote-tracking branch 'origin/master' into HEAD 2022-11-24 13:10:37 +00:00
Nikolai Kochetov
e79c91947a
Update async_remote_read.xml 2022-11-24 12:35:02 +01:00
Raúl Marín
e910648c5d Add benchmark for query interpretation with JOINs 2022-11-23 13:15:35 +01:00
Raúl Marín
ed0c174c0c Merge remote-tracking branch 'blessed/master' into perf_experiment 2022-11-21 11:02:31 +01:00
Guo Wangyang
7d6ff90e34
Merge branch 'master' into logical-optimizer-lowcardinality 2022-11-20 09:56:50 +08:00
Nikolai Kochetov
5da1d893fd
Merge branch 'master' into read-from-mt-in-io-pool 2022-11-18 21:10:45 +01:00
Nikita Taranov
7beb58b0cf
Optimize merge of uniqExact without_key (#43072)
* impl for uniqExact

* rm unused (read|write)Text methods

* fix style

* small fixes

* impl for variadic uniqExact

* refactor

* fix style

* more aggressive inlining

* disable if max_threads=1

* small improvements

* review fixes

* Revert "rm unused (read|write)Text methods"

This reverts commit a7e7480584.

* encapsulate is_able_to_parallelize_merge in Data

* encapsulate is_exact & argument_is_tuple in Data
2022-11-17 13:19:02 +01:00
Kruglov Pavel
1b68f605a2
Merge pull request #42761 from AlfVII/fix-slow-json-extract-with-low-cardinality
Fixed slowness in JSONExtract with LowCardinality(String) tuples
2022-11-17 12:49:18 +01:00
Raúl Marín
97d6fc3071 Merge remote-tracking branch 'blessed/master' into perf_experiment 2022-11-17 11:48:46 +01:00
Nikolai Kochetov
10f449c6c1 Add a query to perftest. 2022-11-15 18:08:03 +00:00
李扬
1de5bb2392
Add function canonicalRand (#43124)
* add function canonicalRand

* add perf test

* revert rand.xml
2022-11-15 00:27:19 +01:00
Wangyang Guo
887779e8d8 Add perftest: low_cardinality_query 2022-11-08 17:19:18 +08:00
Kruglov Pavel
e9a01a1946
Merge branch 'master' into fix-slow-json-extract-with-low-cardinality 2022-11-04 11:13:46 +01:00
Nikolay Degterinsky
30ad1a6826
Merge branch 'master' into perf_experiment 2022-11-03 02:18:21 +03:00
vdimir
6a4247ca32
Merge branch 'master' into grace_hash_join 2022-10-31 09:54:37 +00:00
Alfonso Martinez
9e33b13737 Merge remote-tracking branch 'upstream/master' into fix-slow-json-extract-with-low-cardinality 2022-10-31 08:46:55 +01:00
avogar
fe0aea2e3a Support parallel parsing for LineAsString input format 2022-10-28 21:56:09 +00:00
Alfonso Martinez
c37b154254 Added reverted files and fixes for failing fuzzer tests 2022-10-28 12:37:59 +02:00
Raúl Marín
891484b462 Merge remote-tracking branch 'blessed/master' into perf_experiment 2022-10-27 13:17:07 +02:00
Vladimir C
31e8f92cd9
Merge pull request #42664 from ClickHouse/vdimir/followup-42274 2022-10-27 12:20:46 +02:00
vdimir
14d0f6457b
Add tests and doc for some url-related functions 2022-10-26 10:52:57 +00:00
Raúl Marín
9395f77421 Merge remote-tracking branch 'blessed/master' into perf_experiment 2022-10-26 11:46:17 +02:00
Raúl Marín
6e0a9452e7 Merge remote-tracking branch 'blessed/master' into perf_experiment 2022-10-25 15:25:06 +02:00
Anton Popov
eed21ad4ca
Revert "Low cardinality cases moved to the function for its corresponding type" 2022-10-25 01:30:32 +02:00
vdimir
adb63a5583
Merge branch 'master' into grace_hash_join 2022-10-17 12:32:56 +00:00
Raúl Marín
46616d341c Make explain_ast even larger 2022-10-14 14:06:44 +02:00
AlfVII
5b2703c412
Merge branch 'master' into fix-slow-json-extract-with-low-cardinality 2022-10-11 13:28:07 +02:00
vdimir
ff55c369bc
Merge branch 'tmp-data-followup' 2022-10-05 18:10:05 +00:00
BoloniniD
9dd15998c7 Print nicer exception if BLAKE3 is unavailable 2022-10-05 00:11:41 +03:00
Sergei Trifonov
a592150ae7
Merge branch 'master' into fix-slow-json-extract-with-low-cardinality 2022-10-03 18:10:07 +02:00
Vitaly Baranov
65c61877c7
Merge pull request #33435 from BoloniniD/BLAKE3
Integrating Rust code into ClickHouse
2022-10-03 15:25:06 +02:00
Anton Popov
77eacfbbe0
Update bitmap_array_element.xml 2022-10-03 14:56:34 +02:00
BoloniniD
f5c57cd4a8 Fix test queries 2022-10-03 00:20:44 +03:00
flynn
7109aff2f0 fix style 2022-09-30 18:04:51 +08:00
flynn
1f51a86285 add test 2022-09-30 18:04:51 +08:00
vdimir
7ebc297f4c
Merge branch 'master' into pr/BigRedEye/38191 2022-09-30 09:40:47 +00:00
BoloniniD
55c79230b3 Merge branch 'master' of github.com:ClickHouse/ClickHouse into BLAKE3 2022-09-29 23:53:25 +03:00
Sergei Trifonov
976804b5db
Merge branch 'master' into fix-slow-json-extract-with-low-cardinality 2022-09-26 12:48:49 +02:00
Alfonso Martinez
9cb74c7807 Low cardinality cases moved to the function for its corresponding type 2022-09-23 14:12:37 +02:00
Igor Nikonov
8c93a9adda Merge remote-tracking branch 'origin/master' into distinct_in_order_wo_order_by 2022-09-22 07:40:14 +00:00
Nikita Taranov
930d050b55
fix (#41648) 2022-09-21 19:04:03 +02:00
Nikita Taranov
100c055510
Prefetching in aggregation (#39304)
* impl

* stash

* clean up

* do not apply when HT is small

* make branch static

* also in merge

* do not hardcode look ahead value

* fix

* apply to methods with cheap key calculation

* more tests

* silence tidy

* fix build

* support HashMethodKeysFixed

* apply during merge only for cheap

* stash

* fixes

* rename method

* add feature flag

* cache prefetch threshold value

* fix

* fix

* Update HashMap.h

* fix typo

* 256KB as default l2 size

Co-authored-by: Alexey Milovidov <milovidov@clickhouse.com>
2022-09-21 18:59:07 +02:00
BoloniniD
55fcb98f29 Merge branch 'master' of github.com:ClickHouse/ClickHouse into BLAKE3 2022-09-19 21:53:14 +03:00
Igor Nikonov
aca810ba62 Merge remote-tracking branch 'origin/master' into distinct_in_order_wo_order_by 2022-09-19 18:34:38 +00:00
Kruglov Pavel
4c3194eefe
Merge pull request #41286 from azat/utf8-fix
Do not allow invalid sequences to influence other rows in lowerUTF8/upperUTF8
2022-09-19 14:07:10 +02:00
Azat Khuzhin
bd54a6c45d tests: add perf test for lowerUTF8()/upperUTF8()
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-09-17 11:16:45 +02:00
BoloniniD
452ef4435b Merge branch 'master' of github.com:ClickHouse/ClickHouse into BLAKE3 2022-09-16 20:05:56 +03:00
Nikita Taranov
ee31be4286 impl 2022-09-16 15:41:15 +02:00
Igor Nikonov
eeecaf7a31 Merge remote-tracking branch 'origin/master' into distinct_in_order_wo_order_by 2022-09-16 10:30:52 +00:00
Raúl Marín
c3ff66bd9d
Implement batch processing for aggregate functions with multiple nullable arguments (#41058)
* Implement batch processing for aggregate functions with multiple nullable arguments

* Fix broken perf test

* Improve filter handling in addBatchSinglePlace with nullable arguments

* Fix detecting the Null filter usage
2022-09-15 23:51:38 +02:00
Raúl Marín
6dac509739
Speed up reading uniqState (#41089)
* Speed up reading UniquesHashSet

* Improve uniq serialization tests
2022-09-15 23:41:15 +02:00
Igor Nikonov
8a4806e8c0 Fix test
- remove performance queries which can be unstable
2022-09-15 10:53:42 +00:00
BoloniniD
e8bcbcd016
Merge branch 'master' into BLAKE3 2022-09-09 11:48:31 +03:00
vdimir
6d4b6c452a
Merge branch 'master' into grace_hash_join 2022-09-07 08:00:14 +00:00
Nikita Taranov
7c4f42d014
Skip empty literals in lz4 decompression (#40142) 2022-09-06 13:58:26 +02:00
Alexey Milovidov
193cd1b3b2
Merge pull request #39138 from nickitat/control_block_size_in_aggregator
Control block size in aggregator
2022-09-04 04:51:00 +03:00
vdimir
e21763e759
remove new setting from join_set_filter.xml 2022-08-29 09:49:13 +00:00
vdimir
470dcff89c
Add tests/performance/join_set_filter.xml 2022-08-29 09:49:11 +00:00
Alexey Milovidov
ab91c99495
Merge branch 'master' into control_block_size_in_aggregator 2022-08-20 21:28:27 +03:00
Kruglov Pavel
b67cb9e378
Merge pull request #40173 from Avogar/arrow-dict
Improve and fix dictionaries in Arrow format
2022-08-18 20:54:55 +02:00
Igor Nikonov
46ed4f6cdf
Merge pull request #38719 from ClickHouse/skipping_sorting_step
SortingStep: deduce way to sort based on input stream sort description
2022-08-17 12:58:11 +02:00
Nikita Taranov
63bc894a42 more parallelism 2022-08-16 18:56:22 +02:00
Alexander Tokmakov
6fd4d2cfb3 Revert "tests/performance: cover sparse_hashed dictionary (#40027)"
This reverts commit 6a30c23252.
2022-08-16 15:32:50 +03:00
avogar
c8571f82f9 Fix performance test 2022-08-15 11:41:03 +00:00
Kruglov Pavel
ac85676d84
Update arrow_format.xml 2022-08-15 00:10:08 +02:00
avogar
398576e9c9 Improve and fix dictionaries in Arrow format 2022-08-12 18:56:21 +00:00
Igor Nikonov
75f6fcfa70 Merge remote-tracking branch 'origin/master' into skipping_sorting_step 2022-08-11 12:35:55 +00:00
Azat Khuzhin
6a30c23252
tests/performance: cover sparse_hashed dictionary (#40027)
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-08-10 21:48:00 +02:00
BoloniniD
b161773f71 Merge branch 'master' of github.com:ClickHouse/ClickHouse into BLAKE3 2022-08-02 20:25:25 +03:00
Igor Nikonov
7f0adb5eb0 Merge remote-tracking branch 'origin/master' into skipping_sorting_step 2022-07-31 07:07:36 +00:00
Alexey Milovidov
c9e6850306
Merge pull request #39325 from azat/perf-parallel_mv-fix
tests/performance: improve parallel_mv test
2022-07-31 02:51:38 +03:00
Alexey Milovidov
36e6500e54
Merge branch 'master' into BLAKE3 2022-07-30 23:14:05 +03:00
Anton Popov
1547c010b9
Merge pull request #39432 from ClickHouse/distinct_sorted_chunk_perf_impr
DISTINCT in order: perf improvement
2022-07-27 14:17:58 +02:00
Alexander Gololobov
460950ecdc
Merge branch 'master' into feature/sql-standard-delete 2022-07-24 21:27:22 +02:00
Alexander Gololobov
594195451e Cleanups 2022-07-24 12:21:18 +02:00