Commit Graph

936 Commits

Author SHA1 Message Date
Robert Schulze
fd86829824
Consolidate config_core.h into config.h
Less duplication, less confusion ...
2022-09-28 13:31:57 +00:00
Alexey Milovidov
45afacdae4
Merge pull request #41186 from ClickHouse/fix-three-fourth-of-trash
Fix more than half of the trash
2022-09-22 07:28:26 +03:00
Nikolai Kochetov
2b46735c42 Fix a bug with missing rows after partial sort optimisation #41182 2022-09-20 14:08:39 +00:00
Alexey Milovidov
91baedf03a Fix 6/7 of trash 2022-09-19 08:53:20 +02:00
Alexey Milovidov
84f42e0874 Fix 3/4 of trash 2022-09-19 08:50:53 +02:00
Alexey Milovidov
ada7a44ae4 Remove -WithTerminatingZero methods 2022-09-17 05:34:18 +02:00
Nikita Taranov
6f186d3dd2
Do not return empty blocks from ConvertingAggregatedToChunksTransform (#41152)
* impl

* add test

* update test
2022-09-16 21:54:36 +02:00
Alexey Milovidov
fd235919aa Remove some methods 2022-09-10 05:04:40 +02:00
Alexey Milovidov
193cd1b3b2
Merge pull request #39138 from nickitat/control_block_size_in_aggregator
Control block size in aggregator
2022-09-04 04:51:00 +03:00
Vladimir C
963c0111bf
Merge pull request #39418 from vdimir/join_and_sets
Filter joined streams for `full_sorting_join` by each other before sorting
2022-09-02 13:57:06 +02:00
Antonio Andelic
e64436fef3 Fix typos with new codespell 2022-09-02 08:54:48 +00:00
Vladimir C
12e6fc4182
Merge branch 'master' into join_and_sets 2022-09-01 14:56:14 +02:00
Kseniia Sumarokova
c6c67a248d
Merge pull request #40792 from canhld94/ch_canh_intersect_distinct
Implement intersect + except distinct
2022-09-01 14:35:26 +02:00
Anton Popov
3504781529
Merge branch 'master' into fix-read-in-order-fixed-prefix 2022-08-30 23:32:43 +02:00
vdimir
0f6f3c73b0
Minor fix 2022-08-30 11:57:28 +00:00
Duc Canh Le
8590cc46c4 implement intersect + except distinct 2022-08-30 18:09:01 +08:00
vdimir
24f62e8486
Throw an error in CreatingSetsOnTheFlyTransform in case of input for finished 2022-08-29 11:27:08 +00:00
vdimir
b0e2616aa9
Style fixes in CreateSetAndFilterOnTheFlyTransform and related 2022-08-29 11:26:21 +00:00
Anton Popov
2a3e012931
Merge branch 'master' into fix-read-in-order-fixed-prefix 2022-08-29 13:17:26 +02:00
vdimir
714c53ab24
fix typos 2022-08-29 09:49:09 +00:00
vdimir
8e1632f824
Create sets for joins: better code 2022-08-29 09:49:08 +00:00
vdimir
7228091ff1
rename CreateSetAndFilterOnTheFlyTransform 2022-08-29 09:49:07 +00:00
vdimir
c778bba13f
Create sets for joins: wip 2022-08-29 09:47:00 +00:00
vdimir
31a167848d
Fix set finish condition in CreatingSetsOnTheFlyTransform 2022-08-29 09:46:59 +00:00
vdimir
8f06430ebd
Create sets for joins: upd 2022-08-29 09:46:58 +00:00
vdimir
3292566603
Format bytes in CreatingSetsOnTheFlyTransform logs 2022-08-29 09:46:57 +00:00
vdimir
031aaf3a45
Add Creating/FilterBySetsOnTheFlyStep for full sorting join 2022-08-29 09:46:57 +00:00
Azat Khuzhin
f9812d9917 Fix memory leak while pushing to MVs w/o query context (from Kafka/...)
While pushing to MVs, there is low-level code that creates
ThreadGroupStatus/ThreadStatus; it is required to gather some metrics
for system.query_views_log.

However, one should not use the ThreadGroupStatus of the
MainThreadStatus, since this structure can hold some state that may not
be cleaned up, and this may also be racy. It is better to create a new
ThreadGroupStatus and attach it instead.

Also, this place is missing a detachQuery() call, and because of this it
leaks ThreadGroupStatus::finished_threads_counters_memory. This is only
a problem when pushing to MVs is done w/o query context (i.e. from
Kafka/...), since with a query context detachQuery() will be called
eventually.

Before this patch series, when I tried the reproducer with
500 MVs attached to the Kafka engine (as @den-crane suggested), the
jemalloc report looked like this:

    $ ../jeprof --text ~/ch/tmp/upstream/clickhouse-binary --base jeprof.44384.0.i0.heap jeprof.44384.167.i167.heap
    Using local file /home/azat/ch/tmp/upstream/clickhouse-binary.
    Using local file jeprof.44384.167.i167.heap.
    Total: 915.6 MB
       910.7  99.5%  99.5%    910.7  99.5% Snapshot (inline)
         9.5   1.0% 100.5%      9.5   1.0% std::__1::__libcpp_operator_new (inline)
         0.5   0.1% 100.6%      0.5   0.1% DB::TasksStatsCounters::create

And with focus on this place:

    $ ../jeprof --focus Snapshot --text ~/ch/tmp/upstream/clickhouse-binary --base jeprof.44384.0.i0.heap jeprof.44384.167.i167.heap
    Using local file /home/azat/ch/tmp/upstream/clickhouse-binary.
    Using local file jeprof.44384.167.i167.heap.
    Total: 915.6 MB
       910.7 100.0% 100.0%    910.7 100.0% Snapshot (inline)
         0.0   0.0% 100.0%    910.7 100.0% DB::QueryPipeline::reset
         0.0   0.0% 100.0%    910.7 100.0% DB::StorageKafka::streamToViews
         0.0   0.0% 100.0%    910.7 100.0% DB::StorageKafka::threadFunc
         0.0   0.0% 100.0%    910.7 100.0% ProfileEvents::Counters::getPartiallyAtomicSnapshot
         0.0   0.0% 100.0%    910.7 100.0% ~ThreadStatus
         0.0   0.0% 100.0%    910.7 100.0% ~ViewRuntimeData
         0.0   0.0% 100.0%    910.7 100.0% ~ViewRuntimeStats (inline)

Actually, this report does not look great (it is readable only because
I stripped it), because --text is not that smart, but if you use --pdf
for the report you will see the stacktrace (I will attach the pdf to the
pull request).

After this patch series, the process RSS does not go beyond
~700MiB.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-08-29 11:36:33 +02:00
Vladimir C
e067629e0d
Merge pull request #40239 from vdimir/vdimir/tmp-file-metrics
More metrics for on-disk temporary data
2022-08-26 11:28:01 +02:00
vdimir
91788f29e8
Upd TemporaryFileOnDisk 2022-08-24 16:15:54 +00:00
vdimir
7194df1184
Move back TemporaryFile -> TemporaryFileOnDisk 2022-08-24 16:14:11 +00:00
vdimir
0349c85017
Use getCompressedBytes in BufferingToFileTransform and TemporaryFileStream 2022-08-24 16:14:10 +00:00
vdimir
51c44424cc
More metrics for temp files 2022-08-24 16:14:09 +00:00
vdimir
1321ac87b5
Minor fixes 2022-08-24 16:14:07 +00:00
vdimir
7e0c9062c7
Add ProfileEvents::ExternalSort(Un)CompressedBytes 2022-08-24 16:14:07 +00:00
Alexander Gololobov
1c2dd50ca5 Fix vertical merge of parts with lightweight deleted rows 2022-08-24 15:18:33 +02:00
Alexey Milovidov
ab91c99495
Merge branch 'master' into control_block_size_in_aggregator 2022-08-20 21:28:27 +03:00
Nikita Taranov
f650b23ee3 generate many blocks 2022-08-16 18:56:22 +02:00
Nikita Taranov
db0110fd7a more accurate crutch 2022-08-16 18:56:22 +02:00
Nikita Taranov
e5e0a24ab3 return chunks from prepareBlockAndFillWithoutKey 2022-08-16 18:56:22 +02:00
Vladimir Chebotaryov
3cc03b141e Fixed tests on Debug build type. 2022-08-16 15:43:37 +02:00
Vladimir Chebotaryov
66f9bfca61 Fixed point of origin for exponential decay window functions to the last value in window. 2022-08-16 15:43:37 +02:00
Anton Popov
4bd50bb06c
Merge branch 'master' into distinct_sorted_simplify 2022-08-12 17:11:18 +02:00
Kruglov Pavel
4c7222d938
Merge pull request #40020 from canhld94/ch_canh_fix_hash
fix HashMethodOneNumber with const column
2022-08-12 14:40:24 +02:00
Maksim Kita
6bec0f5854
Merge pull request #38956 from vdimir/dict-join-refactoring
Join with dictionary refactoring
2022-08-11 11:54:11 +02:00
Duc Canh Le
84cd867aa8 materialize column instead of handling column in hash method 2022-08-11 10:46:06 +08:00
vdimir
ad91c16ba0
Rename join_common -> JoinUtils 2022-08-10 14:20:28 +00:00
vdimir
708747ca0b
Merge branch 'master' into refactor-prepared-sets 2022-08-08 14:27:18 +02:00
Igor Nikonov
8278da6475 Fix: read row counts before move columns out of chunk 2022-08-05 21:29:57 +00:00
Igor Nikonov
9fddf6efde Merge remote-tracking branch 'origin/master' into ordinary_distinct_small_refact 2022-08-05 19:23:44 +00:00