Commit Graph

106 Commits

Author SHA1 Message Date
Dmitry Novik
6e73cd6929 Implement parallel grouping sets processing 2022-04-21 01:18:40 +00:00
Dmitry Novik
a16710c750 Merge remote-tracking branch 'origin/master' into grouping-sets-fix 2022-04-14 17:29:51 +00:00
Nikita Taranov
30f2a942c5
Predict size of hash table for GROUP BY (#33439)
* use AggregationMethod ctor with reserve

* add new settings

* add HashTablesStatistics

* support queries with limit

* support distributed and with external aggregation

* add new profile events

* add some tests

* add perf test

* export cache stats through AsynchronousMetrics

* rm redundant trace

* fix style

* fix 02122_parallel_formatting test

* review fixes

* fix 02122_parallel_formatting test

* apply also to two-level HTs

* try simpler strategy

* increase max_size_to_preallocate_for_aggregation for experiment

* fixes

* Revert "increase max_size_to_preallocate_for_aggregation for experiment"

This reverts commit 6cf6f75704.

* fix test

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
2022-03-30 22:47:51 +02:00
Maksim Kita
e30117a3d6 Fix clang-tidy warnings in Interpreters, IO folders 2022-03-14 18:17:35 +00:00
Dmitry Novik
67df01d025
Merge branch 'master' into grouping-sets-fix 2022-02-22 04:18:03 -08:00
feng lv
07280e0ab1 Add name hints for data skipping indices
fix test
2022-02-20 11:48:22 +00:00
Azat Khuzhin
4fa2ae76bc Fix memory leak in AggregatingInOrderTransform
Reproducer:

    # NOTE: we need clickhouse from 33957 since right now LSan is broken due to getauxval().
    $ url=https://s3.amazonaws.com/clickhouse-builds/33957/e04b862673644d313712607a0078f5d1c48b5377/package_asan/clickhouse
    $ wget $url -O clickhouse-asan
    $ chmod +x clickhouse-asan
    $ ./clickhouse-asan server &

    $ ./clickhouse-asan client
    :) create table data (key Int, value String) engine=MergeTree() order by key
    :) insert into data select number%5, toString(number) from numbers(10e6)

    # usually one query is enough; benchmark is used only to make the results stable
    # note: if the exception does not come from AggregatingInOrderTransform, add --continue_on_errors and wait
    $ ./clickhouse-asan benchmark --query 'select key, uniqCombined64(value), groupArray(value) from data group by key' --optimize_aggregation_in_order=1 --memory_tracker_fault_probability=0.01 --max_untracked_memory='2Mi'

LSan report:

    ==24595==ERROR: LeakSanitizer: detected memory leaks

    Direct leak of 3932160 byte(s) in 6 object(s) allocated from:
        0 0xcadba93 in realloc ()
        1 0xcc108d9 in Allocator<false, false>::realloc() obj-x86_64-linux-gnu/../src/Common/Allocator.h:134:30
        2 0xde19eae in void DB::PODArrayBase<>::realloc<DB::Arena*&>(unsigned long, DB::Arena*&) obj-x86_64-linux-gnu/../src/Common/PODArray.h:161:25
        3 0xde5f039 in void DB::PODArrayBase<>::reserveForNextSize<DB::Arena*&>(DB::Arena*&) obj-x86_64-linux-gnu/../src/Common/PODArray.h
        4 0xde5f039 in void DB::PODArray<>::push_back<>(DB::GroupArrayNodeString*&, DB::Arena*&) obj-x86_64-linux-gnu/../src/Common/PODArray.h:432:19
        5 0xde5f039 in DB::GroupArrayGeneralImpl<>::add() const obj-x86_64-linux-gnu/../src/AggregateFunctions/AggregateFunctionGroupArray.h:465:31
        6 0xde5f039 in DB::IAggregateFunctionHelper<>::addBatchSinglePlaceFromInterval() const obj-x86_64-linux-gnu/../src/AggregateFunctions/IAggregateFunction.h:481:53
        7 0x299df134 in DB::Aggregator::executeOnIntervalWithoutKeyImpl() obj-x86_64-linux-gnu/../src/Interpreters/Aggregator.cpp:869:31
        8 0x2ca75f7d in DB::AggregatingInOrderTransform::consume() obj-x86_64-linux-gnu/../src/Processors/Transforms/AggregatingInOrderTransform.cpp:124:13

    ...

    SUMMARY: AddressSanitizer: 4523184 byte(s) leaked in 12 allocation(s).

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-02-09 09:23:56 +03:00
Dmitry Novik
1095814bbf Cleanup code 2022-01-11 11:26:13 +00:00
fanzhou
83e9e5d0e5 some changes 2022-01-11 11:26:12 +00:00
MaxTheHuman
e6bd807f60 grouping sets development 2022-01-11 11:26:10 +00:00
MaxTheHuman
4d1b354b5f grouping sets development 2022-01-11 11:26:10 +00:00
MaxTheHuman
abe09324c1 grouping sets development 2022-01-11 11:26:10 +00:00
MaxTheHuman
3195d600c5 feat grouping-sets: initial changes 2022-01-11 11:26:10 +00:00
alexey-milovidov
0a55fa3dc2
Revert "Grouping sets dev" 2021-12-25 20:30:31 +03:00
alexey-milovidov
6b97af4c63
Merge pull request #26869 from taylor12805/grouping-sets-dev
Grouping sets dev
2021-12-17 20:50:15 +03:00
Dmitry Novik
56a3f4a000 Cleanup code 2021-12-14 22:15:14 +03:00
fanzhou
43db4594ba some changes 2021-11-29 19:35:33 +03:00
MaxTheHuman
ddd1799743 grouping sets development 2021-11-26 22:11:34 +03:00
MaxTheHuman
d2258decf5 grouping sets development 2021-11-26 21:50:03 +03:00
MaxTheHuman
e60d1dd818 grouping sets development 2021-11-26 21:38:44 +03:00
MaxTheHuman
2bd07ef338 feat grouping-sets: initial changes 2021-11-26 20:24:35 +03:00
Anton Popov
d50137013c Merge remote-tracking branch 'upstream/master' into HEAD 2021-11-01 16:55:53 +03:00
Nikolai Kochetov
a92dc0a826 Update obsolete comments. 2021-10-19 12:58:10 +03:00
Anton Popov
d71ffc355a Merge remote-tracking branch 'upstream/master' into HEAD 2021-10-18 15:18:22 +03:00
Nikolai Kochetov
fd14faeae2 Remove DataStreams folder. 2021-10-15 23:18:20 +03:00
Anton Popov
7aa6068fb2 Merge remote-tracking branch 'upstream/master' into HEAD 2021-10-14 19:44:08 +03:00
Nikolai Kochetov
ab28c6c855 Remove BlockInputStream interfaces. 2021-10-14 13:25:43 +03:00
Nikolai Kochetov
c6bce1a4cf Update Native. 2021-10-08 20:21:19 +03:00
Alexey Milovidov
fe6b7c77c7 Rename "common" to "base" 2021-10-02 10:13:14 +03:00
Anton Popov
eef436fe22 Merge remote-tracking branch 'upstream/master' into HEAD 2021-09-16 18:07:42 +03:00
Amos Bird
91293c7449
Fix crash on exception with projection aggregate 2021-09-09 10:43:56 +08:00
Anton Popov
c3c3a06078 Merge remote-tracking branch 'upstream/master' into HEAD 2021-08-20 01:45:38 +03:00
Anton Popov
16ed0f6ed4 Merge remote-tracking branch 'upstream/master' into HEAD 2021-08-02 17:55:17 +03:00
Maksim Kita
3a6b37691a Compile aggregate functions without key 2021-07-27 19:50:57 +03:00
Maksim Kita
1fea19846b Compile aggregate functions profile events fix 2021-07-23 00:43:31 +03:00
Anton Popov
2b0ec88f5a disable jit aggregation for sparse columns 2021-07-20 20:02:41 +03:00
Anton Popov
14168b11f2 Merge remote-tracking branch 'upstream/master' into HEAD 2021-07-07 17:05:11 +03:00
Maksim Kita
325b54f623 Aggregator compile only part of aggregate functions 2021-07-01 22:56:36 +03:00
Maksim Kita
da8c957167 Aggregator added CompiledExpressionCache 2021-07-01 22:56:36 +03:00
Maksim Kita
d24d3ae992 Added second variant of compilation 2021-07-01 22:56:36 +03:00
Maksim Kita
1e2f22a183 Aggregator compile part of aggregate functions 2021-07-01 22:56:36 +03:00
Maksim Kita
a5ef0067b8 Compile AggregateFunctionIf 2021-07-01 22:56:35 +03:00
Maksim Kita
9b71b1040a Aggregate functions update compile interface 2021-07-01 22:56:35 +03:00
Maksim Kita
3fe559b31f Compile aggregate functions 2021-07-01 22:56:35 +03:00
Anton Popov
567043113c Merge remote-tracking branch 'upstream/master' into HEAD 2021-06-21 01:36:06 +03:00
Alexey Milovidov
885ce194e0 Making fundamentals correct 2021-06-07 00:49:55 +03:00
Anton Popov
3e92c7f61a Merge remote-tracking branch 'upstream/master' into HEAD 2021-05-25 21:45:19 +03:00
Amos Bird
07b1be5a76
Fix distributed processing when using projection 2021-05-16 22:40:06 +08:00
Anton Popov
d8df0903b9 Merge remote-tracking branch 'upstream/master' into HEAD 2021-05-14 23:38:16 +03:00
Nikolai Kochetov
3296c9292f
Try to merge projections faster. 2021-05-11 18:12:26 +08:00