Commit Graph

116783 Commits

Author SHA1 Message Date
Alexander Sapin
a67dd6e479 Readuntilend 2023-06-07 17:25:48 +02:00
Alexander Gololobov
bf6900f64c Write 1 part and do not use OPTIMIZE FINAL 2023-06-07 17:08:18 +02:00
avogar
cf947e6e01 Fix typo 2023-06-07 12:50:16 +00:00
avogar
87ac6b8b63 Fix reading negative decimals in avro format 2023-06-07 12:49:28 +00:00
Robert Schulze
b83f0fff7d
Merge branch 'master' into default-granularity 2023-06-07 14:39:56 +02:00
Robert Schulze
81cd3defd7
Fix expected results 2023-06-07 12:29:09 +00:00
Rich Raposa
06b05cf2aa
Merge pull request #50664 from ClickHouse/fix-docs-tuple-with-aggregates
Some minor fixes about using `Tuple` in aggregate functions
2023-06-07 06:07:52 -06:00
Azat Khuzhin
036ddcd47b
Fix excessive memory usage for FINAL (due to too much streams usage) (#50429)
Previously it could create a MergeTreeInOrder for each mark; however, this
could be very suboptimal, because each MergeTreeInOrder has some memory
overhead.

Now, collapsing all marks for one part together makes it more memory
efficient.
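
The idea of collapsing marks per part can be sketched as follows. This is
only an illustration, not the ClickHouse implementation: the function names,
the part names, and the (part, mark range) tuples are all hypothetical, and a
"stream" here is just a tuple standing in for one MergeTreeInOrder reader.

```python
from collections import defaultdict

# Hypothetical input: (part name, (first mark, last mark)) pairs to read.
ranges = [
    ("part_1", (0, 1)),
    ("part_1", (5, 6)),
    ("part_1", (9, 10)),
    ("part_2", (2, 3)),
]


def streams_per_mark_range(ranges):
    """Old behaviour (sketch): one reader per mark range."""
    return [(part, [mark_range]) for part, mark_range in ranges]


def streams_per_part(ranges):
    """New behaviour (sketch): one reader per part, holding all its ranges."""
    grouped = defaultdict(list)
    for part, mark_range in ranges:
        grouped[part].append(mark_range)
    return list(grouped.items())


print(len(streams_per_mark_range(ranges)), "readers before")
print(len(streams_per_part(ranges)), "readers after")
```

Since each reader carries some fixed overhead, fewer readers means less
memory; how much is saved depends on how many separate ranges each part has.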

I've tried the query from the altinity wiki [1] and it halves memory
usage:

    SELECT * FROM repl_tbl FINAL WHERE key IN (SELECT toUInt32(number) FROM numbers(1000000) WHERE number % 50000 = 0) FORMAT Null

- upstream: MemoryTracker: Peak memory usage (for query): 520.27 MiB.
- patched:  MemoryTracker: Peak memory usage (for query): 260.95 MiB.

  [1]: https://kb.altinity.com/engines/mergetree-table-engine-family/replacingmergetree/#multiple-keys

The improvement is not necessarily 2x; it can be more or less, depending on
the gaps in marks for reading (for example, in my setup the memory usage
increased a lot, from ~16GiB of RAM to >64GiB, due to lots of marks and gaps).

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-06-07 13:48:08 +02:00
Alexander Sapin
71ae54f089 Fix args 2023-06-07 13:34:01 +02:00
Alexander Sapin
5d52c41689 Merge branch 'master' into azure_table_function 2023-06-07 12:33:26 +02:00
Kseniia Sumarokova
d25ea9b0cf
Merge pull request #50470 from kssenii/add-some-assertions-2
Add some assertions
2023-06-07 12:28:28 +02:00
János Benjamin Antal
35ef14482d Fix keyword capitalization 2023-06-07 10:13:13 +00:00
János Benjamin Antal
1a3dc7c3ed
Merge branch 'master' into fix-docs-tuple-with-aggregates 2023-06-07 12:11:03 +02:00
János Benjamin Antal
e0bc695e2d Use correct link format 2023-06-07 10:07:35 +00:00
Alexey Milovidov
6e9c08bbf4
Merge pull request #50642 from johanngan/regexptree-bad-opt
Revert invalid RegExpTreeDictionary optimization
2023-06-07 13:00:20 +03:00
Robert Schulze
7c80046834
Revert "Remove clang-tidy exclude"
This reverts commit 42c0547895.
2023-06-07 09:47:54 +00:00
Robert Schulze
c795eb0329
Temporarily disable a test 2023-06-07 09:46:47 +00:00
robot-ch-test-poll3
9c72449a34
Merge pull request #50607 from den-crane/patch-25
Doc. Clarification about ArgMax/Min behavior
2023-06-07 11:34:12 +02:00
Anton Popov
1e6b84c59c
Merge pull request #50660 from CurtizJ/merging-50329
Merging #50329
2023-06-07 11:18:42 +02:00
Anton Popov
3c2a6200e5 Merge branch 'ignore_index' of https://github.com/ClibMouse/ClickHouse into merging-50329 2023-06-07 09:15:57 +00:00
Robert Schulze
52e265badd
Merge remote-tracking branch 'rschu1ze/master' into annoy_cleanup 2023-06-07 09:13:41 +00:00
Robert Schulze
4050b637f1
ALTER TABLE ADD INDEX: Add default GRANULARITY argument for secondary indexes
- Related to #45451, which provides a default GRANULARITY when the
  skipping index is created in CREATE TABLE.
2023-06-07 09:04:24 +00:00
Antonio Andelic
26fb80b540
Merge pull request #50615 from ClickHouse/fix-jepsen-check
Fix Jepsen runs in PRs
2023-06-07 08:43:51 +02:00
Alexey Milovidov
9b49469e54
Merge pull request #50637 from Avogar/values-lc-null
Fix converting Null to LowCardinality(Nullable) in values table function
2023-06-07 07:38:11 +03:00
Alexey Milovidov
c02da1320f
Merge branch 'master' into regexptree-bad-opt 2023-06-07 07:37:17 +03:00
Derek Chia
f3959aa9e1
Update settings.md
`max_final_threads` is now set to the number of cores by default. See https://github.com/ClickHouse/ClickHouse/pull/47915
2023-06-07 11:07:16 +08:00
Alexey Milovidov
a61c8e246d
Merge pull request #50629 from Algunenano/revert_incorrect_optimizations
Revert incorrect optimizations
2023-06-07 05:34:07 +03:00
robot-ch-test-poll4
1be026d33e
Merge pull request #50644 from ClickHouse/rfraposa-patch-3
Update nyc-taxi.md
2023-06-07 04:32:54 +02:00
robot-ch-test-poll
1b1e3fbdd4
Merge pull request #50636 from ClickHouse/nickitat-patch-11
Disable 01676_clickhouse_client_autocomplete under UBSan
2023-06-07 01:36:58 +02:00
Boris Kuschel
45d000b717 Turn off analyzer for test 2023-06-06 19:08:42 -04:00
Boris Kuschel
1fa1215d15 Avoid UB 2023-06-06 19:08:42 -04:00
Boris Kuschel
7c2b88a00e Make test invariant 2023-06-06 19:08:42 -04:00
Boris Kuschel
689e0cabe0 Add space to if 2023-06-06 19:08:42 -04:00
Boris Kuschel
f552b96451 Add docs for ignore index 2023-06-06 19:08:42 -04:00
Boris Kuschel
068b1fbbcc Add ability to ignore index 2023-06-06 19:08:42 -04:00
robot-clickhouse
707abc85f4
Merge pull request #50608 from Misz606/patch-1
Update aggregatingmergetree.md
2023-06-07 01:07:51 +02:00
robot-ch-test-poll1
9783f8c746
Merge pull request #50643 from ClickHouse/rfraposa-patch-2
Style fix
2023-06-07 00:42:54 +02:00
Dmitry Novik
280e80fcd4
Merge branch 'master' into analyzer-distr-query 2023-06-07 00:32:09 +02:00
Rich Raposa
5f48f02023
Update index.md 2023-06-06 16:10:22 -06:00
Rich Raposa
a89c129c49
Update nyc-taxi.md
Use gcs function (instead of s3) for the GCS files
2023-06-06 15:54:57 -06:00
Rich Raposa
195cc51c43
Style fix 2023-06-06 15:51:03 -06:00
johanngan
be8e048799 Revert invalid RegExpTreeDictionary optimization
This reverts the following commits:
- e77dd81036
- e8527e720b

Additionally, functional tests are added.

When scanning complex regexp nodes sequentially with RE2, the old code
has an optimization to break out of the loop early upon finding a leaf
node that matches. This is an invalid optimization because there's no
guarantee that it's actually a VALID match, because its parents might
NOT have matched. Semantically, a user would expect this match to be
discarded and for the search to continue. Instead, since we skipped
matching after the first false positive, subsequent nodes that would
have matched are missing from the output value. This affects both
dictGet and dictGetAll.

It's difficult to distinguish a true positive from a false positive
while looping through complex_regexp_nodes because we would have to scan
all the parents of a matching node to confirm a true positive. Trying to
do this might actually end up being slower than just scanning every
complex regexp node, because complex_regexp_nodes is only a subset of
all the tree nodes; we may end up duplicating scanning work
that Vectorscan has already done, depending on whether the parent nodes
are "simple" or "complex". So instead of trying to fix this
optimization, just remove it entirely.
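
The failure mode described above can be sketched in a few lines. This is a
simplified model under stated assumptions, not the RegExpTreeDictionary code:
Python's `re` stands in for RE2, every node is treated as "complex", and the
`Node` class, the function names, and the sample patterns are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Node:
    pattern: str
    parent: Optional[int]  # None for a root node


def is_leaf(nodes: Dict[int, Node], node_id: int) -> bool:
    return all(n.parent != node_id for n in nodes.values())


def valid_match(nodes: Dict[int, Node], node_id: int, text: str) -> bool:
    """A node counts only if it and all of its ancestors match."""
    current = node_id
    while current is not None:
        if re.search(nodes[current].pattern, text) is None:
            return False
        current = nodes[current].parent
    return True


def scan_with_early_exit(nodes: Dict[int, Node], text: str) -> List[int]:
    """Reverted behaviour (sketch): stop at the first matching leaf."""
    hits = []
    for node_id, node in nodes.items():
        if re.search(node.pattern, text):
            hits.append(node_id)
            if is_leaf(nodes, node_id):
                break  # may be a false positive: parents not checked yet
    return [h for h in hits if valid_match(nodes, h, text)]


def scan_all(nodes: Dict[int, Node], text: str) -> List[int]:
    """Behaviour after the revert (sketch): scan every node."""
    hits = [i for i, n in nodes.items() if re.search(n.pattern, text)]
    return [h for h in hits if valid_match(nodes, h, text)]


nodes = {
    1: Node("Android", None),
    2: Node(r"Chrome/\d+", 1),  # leaf; matches, but its parent does not
    3: Node("Linux", None),
    4: Node("x86_64", 3),       # leaf; a true positive
}
text = "Linux x86_64 Chrome/114"

print(scan_with_early_exit(nodes, text))  # nodes 3 and 4 are lost
print(scan_all(nodes, text))
```

In this toy tree the early exit fires on node 2, whose parent does not match,
so the valid matches further along are never reached.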
2023-06-06 16:28:44 -05:00
alesapin
6ab2a50c39 Fix two tests and build 2023-06-06 22:48:53 +02:00
Smita Kulkarni
99f0be8ef5 Refactored to StorageAzureBlob 2023-06-06 21:58:54 +02:00
Han Fei
4130e1e9ac
Merge branch 'master' into revert_incorrect_optimizations 2023-06-06 21:44:39 +02:00
Robert Schulze
42c0547895
Remove clang-tidy exclude 2023-06-06 19:25:43 +00:00
alesapin
5637858182 Fix the most important check in the world 2023-06-06 21:06:45 +02:00
alesapin
8eaa32e89d Merge branch 'azure_table_function' of github.com:ClickHouse/ClickHouse into azure_table_function 2023-06-06 20:41:13 +02:00
alesapin
ceab5117a9 Fxi style 2023-06-06 20:39:54 +02:00
Smita Kulkarni
49b019b26d Refactored TableFunction name to TableFunctionAzureBlobStorage 2023-06-06 20:23:20 +02:00