Commit Graph

264 Commits

Author SHA1 Message Date
ltrk2
27a2d4d1c7
Merge branch 'master' into feature/mergetree-checksum-big-endian-support 2023-08-02 11:36:43 -04:00
Nikita Taranov
2cbe79b529
Fix memory consumption when max_block_size is huge in Aggregator + more memory checks (#51566)
* impl

* remove checks from without_key methods

* maybe will improve smth

* add test

* Update 02797_aggregator_huge_mem_usage_bug.sql

* Update 02797_aggregator_huge_mem_usage_bug.sql

---------

Co-authored-by: Alexey Milovidov <milovidov@clickhouse.com>
2023-08-02 15:11:52 +02:00
ltrk2
6c9a1b14ef
Merge branch 'master' into feature/mergetree-checksum-big-endian-support 2023-07-28 16:18:46 -04:00
Jiebin Sun
78f3a575f9
Convert hashSets in parallel before merge (#50748)
* Convert hashSets in parallel before merge

Before the merge, if one of lhs and rhs is a singleLevelSet and the other is a twoLevelSet,
then the singleLevelSet calls convertToTwoLevel(). This conversion does not run in parallel,
and it costs many cycles when it has to consume all of the singleLevelSets.

The idea of the patch is to convert all the singleLevelSets to twoLevelSets in parallel when
the hash sets are a mix of singleLevel and twoLevel.

I tested the patch on an Intel SPR server with 2 x 112 vCPUs, using ClickBench and the latest
upstream ClickHouse.
Q5 improved by 264%, and 24 queries gained at least 5%.
The overall geomean across 43 queries is 7.4% better than the base code.

Signed-off-by: Jiebin Sun <jiebin.sun@intel.com>

* add resize() for the data_vec in parallelizeMergePrepare()

Signed-off-by: Jiebin Sun <jiebin.sun@intel.com>

* Add the performance test prepare_hash_before_merge.xml

Signed-off-by: Jiebin Sun <jiebin.sun@intel.com>

* Rename the data set from hits_v1 to test.hits to fit the CI.

Signed-off-by: Jiebin Sun <jiebin.sun@intel.com>

* remove the redundant branch in UniqExactSet

Co-authored-by: Nikita Taranov <nickita.taranov@gmail.com>

* Remove the empty methods and throw an exception in parallelizeMergePrepare()

Signed-off-by: Jiebin Sun <jiebin.sun@intel.com>

---------

Signed-off-by: Jiebin Sun <jiebin.sun@intel.com>
Co-authored-by: Nikita Taranov <nickita.taranov@gmail.com>
2023-07-27 15:06:34 +02:00
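
The idea described in the commit message above (#50748) can be illustrated with a minimal Python sketch. This is not the ClickHouse implementation. The classes `SingleLevelSet` and `TwoLevelSet`, the constant `NUM_BUCKETS`, and the `max_threads` parameter are assumptions made for illustration, and `parallelize_merge_prepare` only borrows its name from the `parallelizeMergePrepare()` method mentioned in the commit. The sketch converts single-level sets to two-level sets concurrently, and only when the input mixes both kinds.

```python
from concurrent.futures import ThreadPoolExecutor

NUM_BUCKETS = 16  # assumption: illustrative bucket count, not ClickHouse's value


class SingleLevelSet:
    """Hypothetical stand-in for a single-level hash set."""

    def __init__(self, items=()):
        self.items = set(items)


class TwoLevelSet:
    """Hypothetical stand-in for a two-level hash set (a set split into buckets)."""

    def __init__(self):
        self.buckets = [set() for _ in range(NUM_BUCKETS)]

    @classmethod
    def from_single_level(cls, single):
        # Distribute every element into a bucket chosen by its hash.
        result = cls()
        for item in single.items:
            result.buckets[hash(item) % NUM_BUCKETS].add(item)
        return result

    def merge(self, other):
        # Buckets with the same index can be merged independently.
        for mine, theirs in zip(self.buckets, other.buckets):
            mine |= theirs

    def __len__(self):
        return sum(len(bucket) for bucket in self.buckets)


def parallelize_merge_prepare(sets, max_threads=4):
    """Convert single-level sets to two-level ones, but only for mixed input."""
    has_single = any(isinstance(s, SingleLevelSet) for s in sets)
    has_two = any(isinstance(s, TwoLevelSet) for s in sets)
    if not (has_single and has_two):
        return list(sets)  # all one kind: nothing to convert

    def convert(s):
        if isinstance(s, SingleLevelSet):
            return TwoLevelSet.from_single_level(s)
        return s

    # Each conversion is independent, so the conversions can run concurrently.
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        return list(pool.map(convert, sets))
```

In CPython the threads here mainly show the structure of the approach; the speedup reported in the commit comes from native threads in the C++ code.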
Nikita Taranov
04180549b0
Fix possible double-free in Aggregator (#52439) 2023-07-26 13:15:58 +02:00
ltrk2
6b96a3943d Update further uses of SipHash 2023-07-19 10:01:58 -07:00
Maksim Kita
e9840bc6e1 JIT aggregation nullable key fix 2023-05-28 21:05:17 +03:00
Li Shuai
279970337a Fix wrong answer when all key values are null and GROUP BY uses ROLLUP 2023-05-04 11:07:17 +08:00
Sema Checherinda
4dd86a406a
Merge pull request #48543 from azat/mv-uniq-thread-group
Use one ThreadGroup while pushing to materialized views (and some refactoring for ThreadGroup)
2023-04-11 11:47:46 +02:00
Azat Khuzhin
79b83c4fd2 Remove superfluous includes of logger_useful.h from headers
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-04-10 17:59:30 +02:00
Azat Khuzhin
5b2b20a0b0 Rename ThreadGroupStatus to ThreadGroup
There are methods like getThreadGroup() and ThreadGroupSwitcher class,
so this rename seems logical.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-04-07 15:31:48 +02:00
Azat Khuzhin
f38a7aeabe ThreadPool metrics introspection
There are lots of thread pools and simple local-vs-global is not enough
already, it is good to know which one in particular uses threads.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-03-29 10:46:59 +02:00
Nikolai Kochetov
73e98de46d Merge branch 'master' into aggregate-projections-analysis-query-plan 2023-03-23 21:28:36 +01:00
Sema Checherinda
3c6deddd1d work with comments on PR 2023-03-16 19:55:58 +01:00
Anton Popov
d2a8cd3ed4 fix performance regression 2023-03-14 14:51:28 +00:00
Nikolai Kochetov
669a92bae0 Merge branch 'master' into aggregate-projections-analysis-query-plan 2023-03-13 19:55:49 +01:00
Anton Popov
6f3e4d4137
Merge pull request #46118 from CurtizJ/fix-issues-with-sparse
Randomize setting `ratio_of_defaults_for_sparse_serialization`
2023-03-05 22:28:18 +01:00
LiuNeng
d4c5ab9dcd
Optimize one nullable key aggregate performance (#45772) 2023-03-02 21:01:52 +01:00
Anton Popov
d926713cf5 Merge remote-tracking branch 'upstream/master' into HEAD 2023-02-23 23:04:22 +00:00
Nikolai Kochetov
c5f93eb108 Fix more tests. 2023-02-21 15:44:50 +00:00
Nikolai Kochetov
413a8d38fa Fix totals row for projections. 2023-02-20 16:40:35 +00:00
Anton Popov
3730ea388f fix issues with sparse columns 2023-02-15 21:46:26 +00:00
Nikita Taranov
581f31ad3d better 2023-01-30 17:11:56 +00:00
Nikita Taranov
a18343773f improve perf 2023-01-30 17:11:56 +00:00
Alexander Tokmakov
70d1adfe4b
Better formatting for exception messages (#45449)
* save format string for NetException

* format exceptions

* format exceptions 2

* format exceptions 3

* format exceptions 4

* format exceptions 5

* format exceptions 6

* fix

* format exceptions 7

* format exceptions 8

* Update MergeTreeIndexGin.cpp

* Update AggregateFunctionMap.cpp

* Update AggregateFunctionMap.cpp

* fix
2023-01-24 00:13:58 +03:00
Nikita Taranov
006fdd32d4
Apply preallocation optimisation more carefully (#44455)
* impl

* add perf test

* fix

* review fixes
2023-01-09 13:30:48 +01:00
Vladimir C
7482ea54ab
Merge pull request #43972 from ClickHouse/vdimir/tmp-data-in-fs-cache-2 2022-12-23 11:59:27 +01:00
vdimir
182b34c11e
Fixes 2022-12-22 10:22:57 +00:00
Dmitry Novik
3d2fccab87
Merge branch 'master' into refector-function-node 2022-12-12 21:36:39 +01:00
Nikita Taranov
9e2265a6ed
Improve hash table preallocation optimisation (#43945)
* do not preallocate if max_size_to_preallocate_for_aggregation is too small

* skip optimisation for aggr without key

* increase default for max_size_to_preallocate_for_aggregation
2022-12-08 00:05:15 +01:00
Dmitry Novik
25ecb75ca8 Merge remote-tracking branch 'origin/master' into refector-function-node 2022-12-07 18:36:50 +00:00
Nikolai Kochetov
0ed82f3cc0
Merge branch 'master' into aggregating-in-order-from-query-plan 2022-12-06 16:36:49 +01:00
Dmitry Novik
2c70dbc76a Refactor FunctionNode 2022-12-02 19:15:26 +00:00
Alexander Tokmakov
e45105bf44 detach threads from thread group 2022-11-28 21:31:55 +01:00
Nikolai Kochetov
6d0646ed8f
Merge branch 'master' into aggregating-in-order-from-query-plan 2022-11-28 16:53:29 +01:00
Nikolai Kochetov
1dfa188c7a Add order info for aggregating step in plan. Added test. 2022-11-28 15:15:36 +00:00
Nikita Taranov
7beb58b0cf
Optimize merge of uniqExact without_key (#43072)
* impl for uniqExact

* rm unused (read|write)Text methods

* fix style

* small fixes

* impl for variadic uniqExact

* refactor

* fix style

* more aggressive inlining

* disable if max_threads=1

* small improvements

* review fixes

* Revert "rm unused (read|write)Text methods"

This reverts commit a7e7480584.

* encapsulate is_able_to_parallelize_merge in Data

* encapsulate is_exact & argument_is_tuple in Data
2022-11-17 13:19:02 +01:00
taojiatao
721e85a03e minor fix to error msg, replace outdated func name 2022-11-04 11:27:10 +08:00
Azat Khuzhin
4e76629aaf Fixes for -Wshorten-64-to-32
- lots of static_cast
- add safe_cast
- types adjustments
  - config
  - IStorage::read/watch
  - ...
- some TODO's (to convert types in future)

P.S. That was quite a journey...

v2: fixes after rebase
v3: fix conflicts after #42308 merged
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-10-21 13:25:19 +02:00
Duc Canh Le
c05429574d add exception 2022-10-17 08:59:39 +08:00
Duc Canh Le
5526e05aac remove junk log 2022-10-16 17:19:40 +08:00
Duc Canh Le
9e9e967f1f choose correct aggregation method for lc128 and lc512 2022-10-16 16:57:15 +08:00
vdimir
0178307c27 Followup for TemporaryDataOnDisk 2022-10-12 15:25:23 +02:00
Alexander Tokmakov
4175f8cde6 abort instead of __builtin_unreachable in debug builds 2022-10-07 21:49:08 +02:00
Raúl Marín
adbaaca2f5
QOL log improvements (#41947)
* Uniformize disk reservation logs

* Remove log about destroying stuff that appears all the time

* More tweaks on disk reservation logs

* Reorder logs in hash join

* Remove log that provides little information

* Collapse part removal logs

Co-authored-by: Sergei Trifonov <sergei@clickhouse.com>
2022-10-06 14:22:44 +02:00
vdimir
0605f6ed7f
fix after rebase 2022-09-29 09:51:48 +00:00
vdimir
6f8e8b979d
Revert "wip"
This reverts commit 46e4f0236df9a6f7b03d40278e583bc93b96559a.
2022-09-29 09:51:47 +00:00
vdimir
74d45325b3
wip 2022-09-29 09:51:46 +00:00
vdimir
9f3f34548c
Allow creating temporary streams on leaf TemporaryDataOnDisk 2022-09-29 09:51:45 +00:00
vdimir
a56a10f089
Do not require tmp_data in Aggregator 2022-09-29 09:51:42 +00:00