ClickHouse/docs/en/sql-reference
Azat Khuzhin 345c422e28 Add ability to load hashed dictionaries using multiple threads
Right now dictionaries (here I will talk only about
HASHED/SPARSE_HASHED/COMPLEX_KEY_HASHED/COMPLEX_KEY_SPARSE_HASHED)
can load data in only one thread, since they use a single hash table
that cannot be filled from multiple threads.

And if you have a very big dictionary (e.g. 10e9 elements), it can
take a while to load, especially for the SPARSE_HASHED variants (and
with that many elements you are likely to use SPARSE_HASHED, since it
requires less memory). In my env it takes ~4 hours, which is an
enormous amount of time.

So this patch adds support for shards in dictionaries. The number of
shards determines how many hash tables the dictionary uses and, more
importantly, how many threads it can use to load the data.
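
The idea of sharding can be sketched outside of ClickHouse. The Python
below is only an illustration of the structure, not the patch's C++
implementation: the names load_sharded, shard_of and lookup are
hypothetical, the routing rule (hash modulo shard count) is an
assumption, and Python threads will not reproduce the speedup because
of the interpreter lock. Each shard owns its own hash table, so no two
loader threads ever write to the same table.

```python
from concurrent.futures import ThreadPoolExecutor


def shard_of(key, shards):
    # Assumed routing rule: hash of the key modulo the number of shards.
    return hash(key) % shards


def load_sharded(rows, shards):
    # Split the incoming rows into one bucket per shard.
    buckets = [[] for _ in range(shards)]
    for key, value in rows:
        buckets[shard_of(key, shards)].append((key, value))

    # One independent hash table per shard.
    tables = [dict() for _ in range(shards)]

    def fill(index):
        # Each worker fills only its own table, so no locking is needed.
        for key, value in buckets[index]:
            tables[index][key] = value

    # One loader thread per shard.
    with ThreadPoolExecutor(max_workers=shards) as pool:
        list(pool.map(fill, range(shards)))
    return tables


def lookup(tables, key):
    # A lookup goes to the single shard that owns the key.
    return tables[shard_of(key, len(tables))].get(key)
```

A lookup only touches one table, so reads stay as cheap as in the
unsharded case apart from the extra modulo.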

With 16 threads this works 2x faster; not perfect though, see the
follow-up patches in this series.

v0: PARTITION BY
v1: SHARDS 1
v2: SHARDS(1)
v3: tried optimized mod - logical and, but it does not gain even 10%
v4: tried squashing more (max_block_size * shards), but it does not gain even 10% either
v5: move SHARDS into layout parameters (unknown simply ignored)
v6: tune params for perf tests (to avoid too long queries)
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-01-13 13:39:25 +01:00
aggregate-functions Update theilsu.md 2023-01-10 08:40:48 -07:00
data-types Remove leftover empty lines at the end of markdown files 2023-01-09 15:15:18 +01:00
dictionaries Add ability to load hashed dictionaries using multiple threads 2023-01-13 13:39:25 +01:00
functions Revert "update function DAYOFWEEK and add new function WEEKDAY for mysql/spark compatiability" 2023-01-12 15:01:36 +03:00
operators
statements Merge pull request #41687 from ClickHouse/40907_Parameterized_views_as_table_functions 2023-01-12 14:24:32 +01:00
table-functions Merge pull request #45174 from ClickHouse/make-queries-copyable-from-docs 2023-01-11 16:59:52 +01:00
window-functions cross link docs to blogs 2022-12-05 17:28:03 +00:00
_category_.yml
ansi.md
distributed-ddl.md
formats.mdx
syntax.md Update syntax.md 2023-01-11 12:09:23 -07:00