106088 Commits

Author SHA1 Message Date
Alexey Milovidov
6698f3ccf0
Merge pull request #45008 from ClickHouse/add-dmesg-log-to-fuzzer
Add `dmesg.log` to Fuzzer
2023-01-14 14:43:30 +03:00
Dmitry Novik
3d23654720 Skip validation of function IN 2023-01-13 23:10:16 +00:00
Alexey Milovidov
ec8c38c174 Add review suggestion 2023-01-13 22:53:17 +01:00
Alexey Milovidov
167d980fe2 Fix GitHub 2023-01-13 22:52:05 +01:00
Alexey Milovidov
e6b3f54842
Merge pull request #41110 from ClickHouse/automerge
Automatically merge green backport PRs and green approved PRs
2023-01-13 23:16:35 +03:00
Alexander Tokmakov
d857d62a03 remove another set of macros 2023-01-13 20:34:31 +01:00
Alexander Tokmakov
2d7773fccc Merge branch 'master' into text_log_add_pattern 2023-01-13 20:33:46 +01:00
Han Fei
ed49ebf01a update setting explain 2023-01-13 20:26:08 +01:00
Han Fei
2fb2f503e3 Update src/Storages/MergeTree/MergeTreeSettings.h
Co-authored-by: Sergei Trifonov <svtrifonov@gmail.com>
2023-01-13 20:20:08 +01:00
Han Fei
bcf813fedc Update src/Storages/StorageReplicatedMergeTree.cpp
Co-authored-by: Sergei Trifonov <svtrifonov@gmail.com>
2023-01-13 20:19:30 +01:00
Han Fei
9e99c7e116 Update src/Storages/MergeTree/ReplicatedMergeTreeSink.cpp
Co-authored-by: Sergei Trifonov <svtrifonov@gmail.com>
2023-01-13 20:19:13 +01:00
Han Fei
a258a39eb1 Merge branch 'master' into hanfei/async-cache 2023-01-13 20:17:58 +01:00
Nikolay Degterinsky
36c20bf293 Merge remote-tracking branch 'upstream/master' into better_asterisk_parser 2023-01-13 19:15:55 +00:00
DanRoscigno
d0a55f6dc9 doc grace_hash algorithm for join 2023-01-13 13:17:03 -05:00
Kseniia Sumarokova
75318e4cee
Merge pull request #45180 from kssenii/fix-flacky-test-multiple-disks
Fix flaky test test_multiple_disks/test.py::test_rename
2023-01-13 19:05:03 +01:00
Rich Raposa
c7aad8e48b
Merge pull request #45207 from ClickHouse/add-maxintersections-to-docs
Add maxIntersections to docs
2023-01-13 10:27:59 -07:00
Anton Popov
487de70d01 fix locking at loading outdated data parts 2023-01-13 17:05:32 +00:00
avogar
e2470dd670 Fix tests 2023-01-13 17:03:53 +00:00
Robert Schulze
5d3f0ec4a0
Disallow Gorilla codec on non-float columns
Cf. #45195
2023-01-13 16:53:28 +00:00
avogar
120986da52 Add tags to test 2023-01-13 16:34:38 +00:00
avogar
6cb7c4d175 Better commit, mark noexcept 2023-01-13 16:33:11 +00:00
avogar
76c89c6d20 Fix heap-use-after-free in reading from s3 2023-01-13 16:31:30 +00:00
Smita Kulkarni
d132d30707 Addressed review comments - 42648 Support scalar subqueries cache 2023-01-13 17:28:35 +01:00
Alexander Tokmakov
50bb1db9cc
Merge pull request #45251 from ClickHouse/tavplubix-patch-1
Update clickhouse-test
2023-01-13 18:39:00 +03:00
Alexander Tokmakov
6de4837580 fix 2023-01-13 16:07:20 +01:00
Maksim Kita
dc24d831cf
Merge pull request #42970 from ClickHouse/optimize-redundant-function
Implement optimize_redundant_functions_in_order_by on top of QueryTree.
2023-01-13 17:36:56 +03:00
Maksim Kita
05b1b78104
Merge pull request #44013 from kitaisreal/analyzer-aggregate-functions-passes-small-fixes
Analyzer aggregate functions passes small fixes
2023-01-13 17:31:53 +03:00
avogar
abfb6b096f Better exception message 2023-01-13 14:23:30 +00:00
Smita Kulkarni
a0fe26f506 Addressed review comments and updated name to ServerStartupMilliseconds - Record server startup time in ProfileEvents 2023-01-13 14:38:54 +01:00
Alexander Tokmakov
9d5ec474a3
Merge pull request #43998 from evillique/make_system_replicas_parallel
Make `system.replicas` parallel
2023-01-13 16:33:36 +03:00
Alexander Tokmakov
36c282e48e
Update clickhouse-test 2023-01-13 16:29:08 +03:00
Alexander Tokmakov
b88aae9d5c Merge branch 'master' into fix_44496 2023-01-13 14:05:57 +01:00
Smita Kulkarni
cf5cb0da97 Record server startup time in ProfileEvents
Implementation:
* Added ProfileEvents::ServerStartupTime.
* Recorded time from start of main till listening to sockets.
Testing:
* Added a test 02532_profileevents_server_startup_time.sql
2023-01-13 13:47:54 +01:00
Azat Khuzhin
64e3677961 Avoid double hash calculation in HashedDictionary::getShard(StringRef)
Previously it was written this way because getShard() was a simple
modulo operation.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-01-13 13:39:26 +01:00
Azat Khuzhin
2783850f08 Minor review fixes in HashedDictionary
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-01-13 13:39:26 +01:00
Azat Khuzhin
6e0a7add93 Completely exception-safe HashedDictionary dtor
Previously there was one (albeit very unlikely) case in which the dtor
could throw: the logging code or ThreadPool::wait.

Just guard the dtor with try/catch and be done with it.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-01-13 13:39:26 +01:00
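
A minimal sketch of the "guard the dtor with try/catch" idea from the commit above. This is not the ClickHouse code: the Dictionary struct, its cleanup() method, and the errors sink are hypothetical stand-ins for the real logging and ThreadPool::wait calls.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical resource whose cleanup step may throw
// (stands in for logging code or ThreadPool::wait).
struct Dictionary
{
    std::vector<std::string> * errors; // illustration-only sink for swallowed errors
    bool cleanup_throws;

    Dictionary(std::vector<std::string> * errors_, bool cleanup_throws_)
        : errors(errors_), cleanup_throws(cleanup_throws_)
    {
        assert(errors_ != nullptr);
    }

    void cleanup()
    {
        if (cleanup_throws)
            throw std::runtime_error("cleanup failed");
    }

    // Destructors are implicitly noexcept, so an escaping exception would
    // call std::terminate. Catch everything and record it instead.
    ~Dictionary()
    {
        try
        {
            cleanup();
        }
        catch (const std::exception & e)
        {
            // Recording the error may itself throw (allocation), so guard it too.
            try { errors->push_back(e.what()); } catch (...) {}
        }
        catch (...)
        {
        }
    }
};

// Destroys one Dictionary and returns how many errors were swallowed.
inline size_t destroyAndCountErrors(bool cleanup_throws)
{
    std::vector<std::string> errors;
    {
        Dictionary dictionary(&errors, cleanup_throws);
    }
    return errors.size();
}
```

The trade-off is that a failed cleanup is only reported, never propagated, which is usually the only safe option inside a destructor.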
Azat Khuzhin
74def83c5d Destroy hashtables for hashed dictionary in parallel only for sharded dict
There can be multiple hashtables, since each attribute uses its own
hashtable.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-01-13 13:39:26 +01:00
Azat Khuzhin
1c0e0ea1e4 Disable sharded dictionaries with updatable sources
Support of sharded dictionaries for updatable sources is questionable,
since:
- sharded dictionaries were developed for hashed dictionaries with a
  huge number of keys
- an updatable source requires storing the whole table in memory (due to
  how reload works)
- it is also an open question whether a sharded dictionary would benefit
  from an updatable source at all: an updatable source with a huge
  number of changes in the source does not look optimal, and on the
  other side, if there is a small number of changes, then you don't need
  a sharded dictionary at all

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-01-13 13:39:26 +01:00
Azat Khuzhin
0506792790 tests: cover sharded hashed dictionary with update_field
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-01-13 13:39:26 +01:00
Azat Khuzhin
c97991fce1 Use shared arena for HashedDictionary::blockToAttributes()
This should decrease number of allocations.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-01-13 13:39:26 +01:00
Azat Khuzhin
01b100da61 Use shared arena in ParallelDictionaryLoader::createShardSelector() (and add missing rollback)
This should decrease number of allocations.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-01-13 13:39:26 +01:00
Azat Khuzhin
64874824b4 Minor review fixes in HashedDictionary
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-01-13 13:39:26 +01:00
Azat Khuzhin
77c1f07636 Make HashedDictionary::~HashedDictionary exception safe
Previously the destructor could throw if thread allocation failed.
Rewrite it to use trySchedule() and fall back to sequential destruction
in that case.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-01-13 13:39:26 +01:00
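
A rough sketch of the fallback described in the commit above: try to schedule each destruction job, and run it inline when scheduling is refused. The TrySchedule callback is a hypothetical stand-in for a thread pool's non-throwing scheduling call, and the tables are plain vectors rather than real hash tables.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

using Job = std::function<void()>;

// Hypothetical scheduler interface: returns false instead of throwing
// when it cannot accept the job (e.g. no thread could be allocated).
using TrySchedule = std::function<bool(Job)>;

// Frees every table, preferring the scheduler and falling back to
// sequential destruction. Returns how many tables were freed inline.
// Assumption: accepted jobs finish before `tables` goes out of scope.
inline size_t destroyAll(std::vector<std::vector<int>> & tables, const TrySchedule & try_schedule)
{
    assert(static_cast<bool>(try_schedule));

    size_t sequential = 0;
    for (auto & table : tables)
    {
        // Swapping with an empty vector releases the table's memory.
        Job destroy = [&table] { std::vector<int>().swap(table); };

        if (!try_schedule(destroy))
        {
            destroy();
            ++sequential;
        }
    }
    return sequential;
}
```

With a real pool the caller would still have to wait for the accepted jobs before returning from the destructor.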
Azat Khuzhin
925fd2c33a tests/performance: do not use scientific notation in hashed_dictionary_sharded
v2: fix a few mistakes
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-01-13 13:39:26 +01:00
Azat Khuzhin
a3f189e191 Optimize sharded dictionaries with skewed distribution
In case of a skewed distribution, a simple modulo will not give you a
good distribution between shards, and eventually this can lead to the
same performance as a non-sharded dictionary (except that it will
occupy +1 thread for Block::scatter).

But if HashedDictionary::blockToAttributes() does not call
HashedDictionary::getShard(), this can be fixed by using a more complex
key-to-shard (getShard()) mapping. And actually you do not need to call
getShard() in blockToAttributes(): you can simply use the passed shard,
and that's it.

And by wrapping the key with intHash64() in getShard(), the skewed
distribution can be fixed.

Note that previously I tried a similar approach but did not remove
getShard() from blockToAttributes(), that's why it failed.

And now it works almost as fast as with simple createBlockSelector(),
just 13.6% slower (18.75min vs 16.5min, with 16 threads).

Note that I've also tried to add libdivide for this, but it does not
improve the performance.

I've also tried the approach without scatter, and it works 20% slower
than this one (22.5min vs 18.75min, with 16 threads).

v2: Use intHashCRC32() over intHash64() for HashedDictionary::getShard()
    (with intHash64() it is much slower, almost 2x slower; it took
    18min with 32 threads)

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-01-13 13:39:26 +01:00
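
To illustrate the skew problem from the commit above, here is a small sketch, not the ClickHouse implementation. The mixer is a splitmix64-style finalizer used as a stand-in for intHashCRC32(), and the names shardByModulo, shardByHash and usedShards are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>

// Stand-in bit mixer (splitmix64-style finalizer). The commit itself ends up
// using intHashCRC32(); this is only an assumption-free substitute for a demo.
inline uint64_t mix64(uint64_t x)
{
    x ^= x >> 30;
    x *= 0xbf58476d1ce4e5b9ULL;
    x ^= x >> 27;
    x *= 0x94d049bb133111ebULL;
    x ^= x >> 31;
    return x;
}

// Plain modulo: keys sharing a common factor with `shards` pile up.
inline size_t shardByModulo(uint64_t key, size_t shards)
{
    assert(shards > 0);
    return key % shards;
}

// Hash first, then modulo: the low bits no longer mirror the key's pattern.
inline size_t shardByHash(uint64_t key, size_t shards)
{
    assert(shards > 0);
    return mix64(key) % shards;
}

// Counts how many distinct shards receive keys 0, stride, 2*stride, ...
template <typename ShardFn>
size_t usedShards(ShardFn shard_fn, size_t shards, uint64_t stride, size_t count)
{
    std::set<size_t> used;
    for (uint64_t i = 0; i < count; ++i)
        used.insert(shard_fn(i * stride, shards));
    return used.size();
}
```

With 16 shards and keys that are all multiples of 16, the modulo mapping sends every key to shard 0, while the hashed mapping spreads them over several shards.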
Azat Khuzhin
655a564280 Parallel hash tables destroy for hashed dictionaries
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-01-13 13:39:26 +01:00
Azat Khuzhin
99063b152f Allow to configure queue backlog of the parallel hashed dictionary loader
v2: Decrease default parallel_queue_backlog to 10000 (same speed)
v3: Rename parallel_queue_backlog to per_shard_load_backlog
v3: Rename per_shard_load_backlog to shard_load_queue_backlog
v4: Fix documentation
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-01-13 13:39:26 +01:00
Azat Khuzhin
79ad81dfdf Implement separate queue for parallel loader of hashed dictionaries
Previous patches in this series have a bottleneck in rehash(). This is
the slowest operation when inserting lots of rows into the hashtable,
and eventually the whole thread pool sometimes works only as fast as its
slowest thread, since we did not have any queue of blocks.

This patch adds such a queue and now it scales linearly: initially, with
1 thread, it took ~4 hours for 10e9 elements (UInt64 key, UInt16 value);
after this patch it works in 16 minutes with 16 threads (well, actually
I have to use 32 threads because of the distribution of data in the
source table).

And now with 16 threads it works 16 times faster.

Also this patch adds more optimal block splitting for the non-complex
dictionaries, and the usual block splitting for complex dictionaries.
But anyway this moves the overhead out of the threads that load into the
hashtable and into the reader thread, and this is better, since the
reader does not use that much CPU.

v2: fix use-after-free on failed load (add missing wait in dtor)
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-01-13 13:39:26 +01:00
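
The queue idea in the commit above can be sketched roughly as follows: a reader splits keys by shard and pushes blocks into a bounded per-shard queue, and one worker per shard drains its queue into its own hash table. This is an illustrative sketch only; BlockQueue, loadSharded, and the backlog parameter are hypothetical names, and std::unordered_map stands in for the real hash tables.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <memory>
#include <mutex>
#include <thread>
#include <unordered_map>
#include <utility>
#include <vector>

using Block = std::vector<uint64_t>;

// Bounded blocking queue: push() waits while `backlog` blocks are pending.
class BlockQueue
{
public:
    explicit BlockQueue(size_t backlog_) : backlog(backlog_) { assert(backlog > 0); }

    void push(Block block)
    {
        std::unique_lock<std::mutex> lock(mutex);
        not_full.wait(lock, [this] { return queue.size() < backlog; });
        queue.push_back(std::move(block));
        not_empty.notify_one();
    }

    void finish()
    {
        std::lock_guard<std::mutex> lock(mutex);
        finished = true;
        not_empty.notify_all();
    }

    // Returns false once the queue is finished and fully drained.
    bool pop(Block & out)
    {
        std::unique_lock<std::mutex> lock(mutex);
        not_empty.wait(lock, [this] { return !queue.empty() || finished; });
        if (queue.empty())
            return false;
        out = std::move(queue.front());
        queue.pop_front();
        not_full.notify_one();
        return true;
    }

private:
    size_t backlog;
    std::mutex mutex;
    std::condition_variable not_full;
    std::condition_variable not_empty;
    std::deque<Block> queue;
    bool finished = false;
};

// Loads keys [0, total_keys) into `shards` tables and returns the total size.
inline size_t loadSharded(uint64_t total_keys, size_t shards, size_t block_size, size_t backlog)
{
    assert(shards > 0 && block_size > 0);
    std::vector<std::unordered_map<uint64_t, uint16_t>> tables(shards);
    std::vector<std::unique_ptr<BlockQueue>> queues;
    for (size_t s = 0; s < shards; ++s)
        queues.emplace_back(new BlockQueue(backlog));

    // One worker per shard; each worker touches only its own table.
    std::vector<std::thread> workers;
    for (size_t s = 0; s < shards; ++s)
        workers.emplace_back([&tables, &queues, s] {
            Block block;
            while (queues[s]->pop(block))
                for (uint64_t key : block)
                    tables[s][key] = static_cast<uint16_t>(key);
        });

    // The calling thread acts as the reader and does the block splitting.
    std::vector<Block> pending(shards);
    for (uint64_t key = 0; key < total_keys; ++key)
    {
        size_t s = key % shards;
        pending[s].push_back(key);
        if (pending[s].size() == block_size)
        {
            queues[s]->push(std::move(pending[s]));
            pending[s].clear();
        }
    }
    for (size_t s = 0; s < shards; ++s)
    {
        if (!pending[s].empty())
            queues[s]->push(std::move(pending[s]));
        queues[s]->finish();
    }
    for (auto & worker : workers)
        worker.join();

    size_t total = 0;
    for (const auto & table : tables)
        total += table.size();
    return total;
}
```

The bounded backlog keeps a fast reader from buffering the whole source in memory while a shard is busy rehashing.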
Azat Khuzhin
5d0fd3cdc4 Remove sharded overhead for non-sharded hashed dictionaries
By adding one more template parameter - HashedDictionary<sharded> (yes,
there are already too many of them for a template class that has
explicit instantiation).

Since perf tests [1] show a 20% slowdown.

  [1]: https://s3.amazonaws.com/clickhouse-test-reports/40003/8f0cf2d6b8a7df511afe901331d5e2c7b06c0b4d/performance_comparison_[1/4]/report.html

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-01-13 13:39:26 +01:00
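
A rough illustration of how a boolean template parameter can remove sharding overhead at compile time, as the commit above describes. ShardedCounter is a hypothetical toy class, not HashedDictionary, and the sketch assumes C++17 for `if constexpr`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy container that only counts keys per shard.
template <bool sharded>
class ShardedCounter
{
public:
    explicit ShardedCounter(size_t shards_)
        : shards(sharded ? shards_ : 1), counts(shards, 0)
    {
        assert(shards > 0);
    }

    size_t getShard(uint64_t key) const
    {
        // In the non-sharded instantiation the modulo is compiled out entirely.
        if constexpr (sharded)
            return key % shards;
        else
            return 0;
    }

    void insert(uint64_t key) { ++counts[getShard(key)]; }

    size_t shardCount() const { return counts.size(); }
    size_t sizeOfShard(size_t shard) const { return counts[shard]; }

private:
    size_t shards;
    std::vector<size_t> counts;
};
```

The cost is one more instantiation per existing template combination, which is the drawback the commit message acknowledges.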
Azat Khuzhin
345c422e28 Add ability to load hashed dictionaries using multiple threads
Right now dictionaries (here I will talk only about
HASHED/SPARSE_HASHED/COMPLEX_KEY_HASHED/COMPLEX_KEY_SPARSE_HASHED)
can load data only in one thread, since they use one hash table that
cannot be filled from multiple threads.

And in case you have a very big dictionary (i.e. 10e9 elements), it can
take a while to load them, especially for SPARSE_HASHED variants (and
if you have that many elements there, you are likely to use
SPARSE_HASHED, since it requires less memory); in my env it takes ~4
hours, which is an enormous amount of time.

So this patch adds support for shards for dictionaries. The number of
shards determines how many hash tables the dictionary will use and,
more importantly, how many threads it can use to load the data.

And with 16 threads this works 2x faster, not perfect though, see the
follow up patches in this series.

v0: PARTITION BY
v1: SHARDS 1
v2: SHARDS(1)
v3: tried optimized mod - logical and, but it does not gain even 10%
v4: tried squashing more (max_block_size * shards), but it does not gain even 10% either
v5: move SHARDS into layout parameters (unknown simply ignored)
v6: tune params for perf tests (to avoid too long queries)
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-01-13 13:39:25 +01:00