After detachQueryIfNotDetached() was removed, it is no longer enough to
use attachTo() for ThreadPool (scheduleOrThrowOnError()), since the query
may already be attached if the thread runs multiple jobs, so
CurrentThread::attachToIfDetached() should be used instead.
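
The difference between the two calls can be sketched with a toy model. Everything below (ThreadGroup, current_group, runTwoJobs and the bodies of the two attach functions) is a hypothetical stand-in written for illustration, not ClickHouse's actual code. It assumes that the strict attach rejects a thread that is already attached, which is why reusing a pool thread for a second job is a problem.

```cpp
#include <memory>
#include <stdexcept>

// Toy model only: hypothetical stand-ins, not ClickHouse's classes.
struct ThreadGroup {};
using ThreadGroupPtr = std::shared_ptr<ThreadGroup>;

// Per-thread attachment state (a stand-in for what a thread status object keeps).
thread_local ThreadGroupPtr current_group;

// Strict attach: assumes the thread is detached and fails otherwise.
void attachTo(const ThreadGroupPtr & group)
{
    if (current_group)
        throw std::logic_error("thread is already attached to a query");
    current_group = group;
}

// Tolerant attach: a no-op when the thread is already attached.
void attachToIfDetached(const ThreadGroupPtr & group)
{
    if (!current_group)
        current_group = group;
}

// Runs two jobs back to back on the calling thread, as a pool thread may do,
// with no explicit detach in between. Returns true if both attaches succeeded.
template <typename Attach>
bool runTwoJobs(Attach attach, const ThreadGroupPtr & group)
{
    current_group.reset();
    try
    {
        attach(group); // first job
        attach(group); // second job scheduled on the same thread
        return true;
    }
    catch (const std::logic_error &)
    {
        return false;
    }
}
```

In this model runTwoJobs(attachTo, group) returns false, while runTwoJobs(attachToIfDetached, group) returns true.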
This should fix all the places that failed on CI [1]:
$ fgrep DB::CurrentThread::attachTo -A1 ~/Downloads/47.txt | fgrep -v attachTo | cut -d' ' -f5,6 | sort | uniq -c
92 --
2 /fasttest-workspace/build/../../ClickHouse/contrib/libcxx/include/deque:1393: DB::ParallelParsingInputFormat::parserThreadFunction(std::__1::shared_ptr<DB::ThreadGroupStatus>,
4 /fasttest-workspace/build/../../ClickHouse/src/Storages/MergeTree/MergeTreeData.cpp:1595: void
87 /fasttest-workspace/build/../../ClickHouse/src/Storages/MergeTree/MergeTreeDataSelectExecutor.cpp:993: void
[1]: https://github.com/ClickHouse/ClickHouse/runs/4954466034?check_suite_focus=true
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
MemoryTracker starts accounting memory directly only after per-thread
allocations exceed max_untracked_memory (or memory_profiler_step).
But memory under this limit should be accounted too, and there is
code to do this in the ThreadStatus dtor. However, because
PullingAsyncPipelineExecutor detached the query from the thread group,
that memory was not accounted.
So remove CurrentThread::detachQueryIfNotDetached() from threads that
use ThreadFromGlobalPool, since it has a ThreadStatus, and the query will
be detached by CurrentThread::defaultThreadDeleter.
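
The effect of detaching too early can be sketched with a toy model. QueryMemory, ToyThreadStatus and accountedBytes are hypothetical names invented for this illustration, and the 4096-byte limit is an arbitrary value playing the role of max_untracked_memory. This is a sketch of the idea under those assumptions, not the real MemoryTracker or ThreadStatus logic.

```cpp
#include <cstdint>

// Toy model only: hypothetical stand-ins, not ClickHouse's classes.
struct QueryMemory
{
    int64_t accounted = 0; // bytes the query-level tracker knows about
};

struct ToyThreadStatus
{
    QueryMemory * query = nullptr; // non-null while attached to a query
    int64_t untracked = 0;         // bytes allocated but not yet reported
    int64_t limit;                 // plays the role of max_untracked_memory

    ToyThreadStatus(QueryMemory * query_, int64_t limit_) : query(query_), limit(limit_) {}

    // Small allocations only bump the per-thread counter;
    // they are reported once the counter exceeds the limit.
    void alloc(int64_t bytes)
    {
        untracked += bytes;
        if (untracked > limit)
            flush();
    }

    // Report the pending bytes to the query, if still attached.
    void flush()
    {
        if (query)
            query->accounted += untracked;
        untracked = 0;
    }

    void detachQuery() { query = nullptr; }

    // The destructor reports whatever stayed under the limit.
    ~ToyThreadStatus() { flush(); }
};

// Allocates 1000 bytes (below the limit) and returns what the query saw.
int64_t accountedBytes(bool detach_early)
{
    QueryMemory query;
    {
        ToyThreadStatus thread(&query, 4096);
        thread.alloc(1000);
        if (detach_early)
            thread.detachQuery(); // the destructor can no longer report the bytes
    }
    return query.accounted;
}
```

In this model accountedBytes(false) returns 1000, while accountedBytes(true) returns 0 because the pending bytes are dropped.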
Note that before this patch memory accounting worked for HTTP queries,
because the memory was accounted from ParallelFormattingOutputFormat, but
not for TCP.
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
Note that a simple "SELECT range(100)" will execute
ColumnArray::operator[] 14 times (most of them from
DB::checkColumnStructure())
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
We have special optimizations for multi-column ORDER BY: https://github.com/ClickHouse/ClickHouse/pull/10831 . It is beneficial to apply them to tuple columns as well.
Before:
select * from numbers(300000000) order by (1 - number , number + 1 , number) limit 10;
2.613 sec.
After:
select * from numbers(300000000) order by (1 - number , number + 1 , number) limit 10;
0.755 sec
No tuple:
select * from numbers(300000000) order by 1 - number , number + 1 , number limit 10;
0.755 sec
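
The reason a tuple key can reuse the multi-column path is that tuples compare lexicographically, element by element. The sketch below only illustrates that equivalence on the key from the queries above. topByTuple and topByColumns are hypothetical helpers written for this example and say nothing about how the optimization is actually implemented.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <tuple>
#include <vector>

using Key = std::tuple<int64_t, int64_t, int64_t>;

// The sort key from the example query: (1 - number, number + 1, number).
Key makeKey(int64_t n) { return Key(1 - n, n + 1, n); }

// Rows 0..count-1, like numbers(count).
std::vector<int64_t> makeRows(int64_t count)
{
    std::vector<int64_t> rows(static_cast<std::size_t>(count));
    for (int64_t i = 0; i < count; ++i)
        rows[static_cast<std::size_t>(i)] = i;
    return rows;
}

// ORDER BY (a, b, c) LIMIT n: compare whole tuples.
std::vector<int64_t> topByTuple(int64_t count, std::size_t limit)
{
    auto rows = makeRows(count);
    limit = std::min(limit, rows.size());
    auto middle = rows.begin() + static_cast<std::ptrdiff_t>(limit);
    std::partial_sort(rows.begin(), middle, rows.end(),
        [](int64_t a, int64_t b) { return makeKey(a) < makeKey(b); });
    rows.resize(limit);
    return rows;
}

// ORDER BY a, b, c LIMIT n: compare column by column.
std::vector<int64_t> topByColumns(int64_t count, std::size_t limit)
{
    auto rows = makeRows(count);
    limit = std::min(limit, rows.size());
    auto middle = rows.begin() + static_cast<std::ptrdiff_t>(limit);
    std::partial_sort(rows.begin(), middle, rows.end(),
        [](int64_t a, int64_t b)
        {
            if (1 - a != 1 - b)
                return 1 - a < 1 - b;
            if (a + 1 != b + 1)
                return a + 1 < b + 1;
            return a < b;
        });
    rows.resize(limit);
    return rows;
}
```

Both helpers return the same rows in the same order, for example topByTuple(1000, 10) equals topByColumns(1000, 10).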
CI reports [1]:
Indirect leak of 648 byte(s) in 9 object(s) allocated from:
...
2 0x12b96503 in DB::AggregateFunctionSimpleState::getReturnType() const obj-x86_64-linux-gnu/../src/AggregateFunctions/AggregateFunctionSimpleState.h:47:15
...
[1]: https://s3.amazonaws.com/clickhouse-test-reports/33957/08f4f45fd9da923ae3e3fdd8a527c297d35247eb/stress_test__address__actions_.html
Then we can find the query using the query_log artifact:
$ wget https://s3.amazonaws.com/clickhouse-test-reports/33957/08f4f45fd9da923ae3e3fdd8a527c297d35247eb/stress_test__address__actions_/query_log_dump.tar
$ tar -xf query_log_dump.tar
$ clickhouse-local --path var/lib/clickhouse/
SELECT query
FROM system.query_log
ARRAY JOIN used_aggregate_function_combinators AS func
WHERE has(used_aggregate_functions, 'groupBitOr') AND has(used_aggregate_function_combinators, 'SimpleState') AND (type != 'QueryStart')
Query id: 5b7722b3-f77e-4e7e-bd0b-586d6d32a899
┌─query────────────────────────────────────────────────────────────────────────────┐
│ with groupBitOrSimpleState(number) as c select toTypeName(c), c from numbers(1); │
└──────────────────────────────────────────────────────────────────────────────────┘
Fixes: 01570_aggregator_combinator_simple_state.sql
Fixes: #16853
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>