Commit Graph

3248 Commits

Author SHA1 Message Date
Maksim Kita
8225d2814c
Merge pull request #40003 from azat/dict-shards
Add ability to load hashed dictionaries using multiple threads
2023-01-18 13:37:10 +03:00
Nikolay Degterinsky
70e79de69b
Merge pull request #38252 from bharatnc/ncb/weighted-quantile-approx
add quantileInterpolatedWeighted function
2023-01-16 13:41:13 +01:00
Sema Checherinda
d746a3c4ff
Merge pull request #44480 from wineternity/issue_43333_doc
[DOC] Add support for signed arguments in range() #43333
2023-01-16 10:26:49 +01:00
Ilya Yatsishin
cf5052c77e
Merge pull request #45291 from den-crane/patch-57 2023-01-16 02:33:54 +01:00
Dan Roscigno
adca0b64d3
use markdown file instead of URL to enforce 404 checks 2023-01-15 19:31:58 -05:00
Peignon Melvyn
674a1d1877
Update json.md 2023-01-16 01:27:08 +01:00
Denny Crane
6cf603e05f
Update index.md 2023-01-15 18:40:59 -04:00
Ilya Yatsishin
96987b7cd8
Merge pull request #45239 from Avogar/generate-random 2023-01-15 00:37:34 +01:00
Rich Raposa
c7aad8e48b
Merge pull request #45207 from ClickHouse/add-maxintersections-to-docs
Add maxIntersections to docs
2023-01-13 10:27:59 -07:00
Azat Khuzhin
99063b152f Allow to configure queue backlog of the parallel hashed dictionary loader
v2: Decrease default parallel_queue_backlog to 10000 (same speed)
v3: Rename parallel_queue_backlog to per_shard_load_backlog
v3: Rename per_shard_load_backlog to shard_load_queue_backlog
v4: Fix documentation
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-01-13 13:39:26 +01:00
Azat Khuzhin
345c422e28 Add ability to load hashed dictionaries using multiple threads
Right now dictionaries (here I will talk only about
HASHED/SPARSE_HASHED/COMPLEX_KEY_HASHED/COMPLEX_KEY_SPARSE_HASHED)
can load data only in one thread, since each of them uses one hash
table that cannot be filled from multiple threads.

And if you have a very big dictionary (e.g. 10e9 elements), it can
take a while to load, especially for the SPARSE_HASHED variants (and
if you have that many elements, you are likely to use SPARSE_HASHED,
since it requires less memory). In my env it takes ~4 hours, which is
an enormous amount of time.

So this patch adds support for shards in dictionaries. The number of
shards determines how many hash tables the dictionary uses and, more
importantly, how many threads it can use to load the data.

With 16 threads this works 2x faster. That is not perfect, though; see
the follow-up patches in this series.

v0: PARTITION BY
v1: SHARDS 1
v2: SHARDS(1)
v3: tried optimized mod - logical and, but it does not gain even 10%
v4: tried squashing more (max_block_size * shards), but it does not gain even 10% either
v5: move SHARDS into layout parameters (unknown simply ignored)
v6: tune params for perf tests (to avoid too long queries)
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-01-13 13:39:25 +01:00
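Note on 345c422e28: the sharding idea described in that commit message can be sketched roughly as follows. This is a minimal Python illustration, not the ClickHouse implementation. The names `shard_of`, `load_sharded` and `lookup` are hypothetical, and routing keys by hash modulo the shard count is an assumption made for this sketch. Each shard is an independent hash table filled by exactly one worker, so no table is ever written from two threads.

```python
from concurrent.futures import ThreadPoolExecutor


def shard_of(key, num_shards):
    """Pick the shard (hash table) that owns a key. Hypothetical routing rule."""
    return hash(key) % num_shards


def load_sharded(rows, num_shards):
    """Load (key, value) rows into num_shards independent dicts, one worker per shard."""
    # Route every row to the bucket of the shard that owns its key.
    buckets = [[] for _ in range(num_shards)]
    for key, value in rows:
        buckets[shard_of(key, num_shards)].append((key, value))

    # Each worker builds exactly one table, so no table is written concurrently.
    with ThreadPoolExecutor(max_workers=num_shards) as pool:
        return list(pool.map(dict, buckets))


def lookup(shards, key):
    """Find a key by consulting only the shard that owns it."""
    return shards[shard_of(key, len(shards))].get(key)
```

A call such as `load_sharded(rows, 16)` returns a list of 16 dicts, and `lookup` touches only one of them per key. In CPython the global interpreter lock prevents a real speedup from this sketch; it only shows the data layout that makes parallel loading safe.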
avogar
82ff1fd343 Add tests and docs 2023-01-12 22:29:23 +00:00
Dan Roscigno
8c94ed9597
Update docs/en/sql-reference/aggregate-functions/reference/maxintersections.md 2023-01-12 11:01:03 -05:00
Rich Raposa
759a4c0940
Update docs/en/sql-reference/aggregate-functions/reference/maxintersections.md
Co-authored-by: Dan Roscigno <dan@roscigno.com>
2023-01-12 08:53:22 -07:00
rfraposa
69a11574d2 Update maxintersections.md 2023-01-12 08:30:54 -07:00
Kseniia Sumarokova
db3e0219fc
Merge pull request #41687 from ClickHouse/40907_Parameterized_views_as_table_functions
40907 Parameterized views as table functions
2023-01-12 14:24:32 +01:00
Alexander Tokmakov
e37f572c34
Revert "update function DAYOFWEEK and add new function WEEKDAY for mysql/spark compatibility" 2023-01-12 15:01:36 +03:00
Dan Roscigno
7a651d749c
Update docs/en/sql-reference/aggregate-functions/reference/maxintersections.md 2023-01-11 19:20:37 -05:00
rfraposa
2e44ad9d0f Add maxIntersections to docs 2023-01-11 17:10:51 -07:00
Ilya Yatsishin
3a98f2bc12
Merge pull request #45190 from ClickHouse/add-query-params-to-docs
Add query parameters to the docs
2023-01-11 23:47:50 +01:00
Sergei Trifonov
1b94c839d5
Add docs for SYSTEM RELOAD USERS 2023-01-11 21:16:22 +01:00
Rich Raposa
f8ac49bb86
Update syntax.md 2023-01-11 12:09:23 -07:00
Rich Raposa
a389180f42
Update syntax.md 2023-01-11 12:05:35 -07:00
rfraposa
8b9d99e2e2 Update syntax.md 2023-01-11 11:51:53 -07:00
Sergei Trifonov
ec9f10e934
Merge pull request #45174 from ClickHouse/make-queries-copyable-from-docs
make more SQL queries copyable from docs in one click
2023-01-11 16:59:52 +01:00
DanRoscigno
7168c217b0 switch text to response for query blocks 2023-01-11 10:08:11 -05:00
serxa
8d099a4417 make more SQL queries copyable from docs in one click 2023-01-11 13:43:51 +00:00
Dan Roscigno
0ad969171e
Merge pull request #45127 from DanRoscigno/add-deltalake-docs
Add deltalake docs
2023-01-11 08:07:42 -05:00
Robert Schulze
9bb1e31369
Merge pull request #44860 from bigo-sg/improve_week_day
update function DAYOFWEEK and add new function WEEKDAY for mysql/spark compatibility
2023-01-11 13:58:45 +01:00
Dan Roscigno
6e9669cfae
Apply suggestions from code review 2023-01-11 07:53:37 -05:00
Dan Roscigno
d4c4f84161
Update docs/en/sql-reference/table-functions/hudi.md 2023-01-11 07:41:36 -05:00
Dan Roscigno
367d4fc4bf
Update docs/en/sql-reference/table-functions/hudi.md 2023-01-11 07:40:52 -05:00
Bharat Nallan Chakravarthy
e18a95719d review fix 2023-01-10 21:16:16 -08:00
DanRoscigno
563e0e76f9 init 2023-01-10 16:59:34 -05:00
DanRoscigno
75c04945bd spelling 2023-01-10 16:18:50 -05:00
rfraposa
57ab2a4110 Update nlp-functions.md
Added the detectLanguage functions
2023-01-10 12:26:51 -07:00
DanRoscigno
879ee05218 fix case of names 2023-01-10 11:18:33 -05:00
DanRoscigno
ee86afb125 add deltalake 2023-01-10 11:14:12 -05:00
Rich Raposa
f28b04fc59
Merge pull request #45124 from ClickHouse/rfraposa-patch-2
Update theilsu.md
2023-01-10 09:04:22 -07:00
Han Fei
91226abbfe
Merge pull request #43858 from hanfei1991/regexp-tree-dictionary
Merge #40878: Add regexp tree dictionary
2023-01-10 16:41:15 +01:00
Rich Raposa
df3cca6c35
Update theilsu.md
Typo
2023-01-10 08:40:48 -07:00
Ilya Yatsishin
4b3a9862cc
Merge pull request #45091 from ClickHouse/add-cramersv-agg-functions
Add missing agg functions to docs
2023-01-10 16:35:04 +01:00
Dan Roscigno
707ff65b4f
Update docs/en/sql-reference/aggregate-functions/reference/contingency.md 2023-01-10 09:55:59 -05:00
Han Fei
6ed4570f73
Merge branch 'master' into regexp-tree-dictionary 2023-01-10 15:36:30 +01:00
Han Fei
5f8296b719
Update docs/en/sql-reference/dictionaries/external-dictionaries/regexp-tree.md
Co-authored-by: Vladimir C <vdimir@clickhouse.com>
2023-01-10 14:41:06 +01:00
Robert Schulze
cbc18318ff
Merge pull request #45114 from ClickHouse/less-awkward-comment-about-debugging
Docs: Rewrite awkwardly phrased sentence about flush interval
2023-01-10 14:15:43 +01:00
Robert Schulze
6497ddd882
Docs: Rewrite awkwardly phrased sentence about flush interval 2023-01-10 13:13:35 +00:00
rfraposa
c3dcbb2671 Add cramersV function
And 3 similar functions t
2023-01-09 23:42:39 -07:00
Dan Roscigno
f8a5d5cb18
Merge branch 'ClickHouse:master' into add-deltalake-docs 2023-01-09 15:45:08 -05:00
Dan Roscigno
bfcb6eb0cf
Merge pull request #44995 from ClickHouse/qoega-patch-5
Add documentation for xxh3
2023-01-09 15:00:27 -05:00