Releasing ThreadGroupStatus::finished_threads_counters_memory
via the getProfileEventsCountersAndMemoryForThreads method causes
heavy lock contention on ThreadGroupStatus::mutex and lowers
overall performance. This commit fixes the performance issue by
replacing the method call with an equivalent but more
lightweight code block.
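A minimal sketch of the idea (with hypothetical, simplified types; not the
actual ClickHouse implementation): instead of a helper that does all the work
under ThreadGroupStatus::mutex, move the finished counters out under a short
critical section and process them after the lock is released.
```
#include <mutex>
#include <vector>

// Hypothetical, simplified stand-ins for the real ClickHouse types.
struct ProfileEventsCountersAndMemory {};

struct ThreadGroupStatus
{
    std::mutex mutex;
    std::vector<ProfileEventsCountersAndMemory> finished_threads_counters_memory;
};

// Sketch: take only what is needed under a short lock and do the rest of the
// work (copying, aggregation, destruction) without holding the mutex, which
// reduces contention with other threads needing ThreadGroupStatus::mutex.
std::vector<ProfileEventsCountersAndMemory> takeFinishedCounters(ThreadGroupStatus & group)
{
    std::vector<ProfileEventsCountersAndMemory> result;
    {
        std::lock_guard lock(group.mutex);
        result.swap(group.finished_threads_counters_memory);  // O(1) under the lock
    }
    return result;  // further processing happens without the lock
}
```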
* impl
* stash
* clean up
* do not apply when HT is small
* make branch static
* also in merge
* do not hardcode look ahead value
* fix
* apply to methods with cheap key calculation
* more tests
* silence tidy
* fix build
* support HashMethodKeysFixed
* apply during merge only for cheap
* stash
* fixes
* rename method
* add feature flag
* cache prefetch threshold value
* fix
* fix
* Update HashMap.h
* fix typo
* 256KB as default l2 size
Co-authored-by: Alexey Milovidov <milovidov@clickhouse.com>
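The prefetch-related items above (look-ahead value, prefetch threshold, L2
size) refer to issuing software prefetches a fixed distance ahead while
probing a hash table. A minimal sketch of the technique; the member functions
and the look-ahead distance are illustrative assumptions, not the real API:
```
#include <cstddef>
#include <vector>

// Assumed look-ahead distance; the commits above make this configurable
// rather than hardcoded.
constexpr size_t look_ahead = 16;

template <typename HashTable, typename Key>
void processKeysWithPrefetch(HashTable & table, const std::vector<Key> & keys)
{
    const size_t size = keys.size();
    for (size_t i = 0; i < size; ++i)
    {
        // Start pulling the bucket of a future key into cache so the memory
        // access is already in flight by the time that key is probed.
        // Real code would enable this only when the table is larger than the
        // L2 cache (the commits above mention 256KB as a default L2 size).
        if (i + look_ahead < size)
            __builtin_prefetch(table.bucketAddress(keys[i + look_ahead]));  // hypothetical accessor

        table.insertOrUpdate(keys[i]);  // hypothetical insert/update call
    }
}
```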
* Purge jemalloc arenas in case of high memory usage.
* Get RSS before jemalloc counters. Try to avoid negative RSS.
* Try to avoid negative RSS.
* muzzy -> dirty
* Another fix.
* Update MemoryTracker.cpp
* Wait for purged memory.
* Revert "Wait for purged memory."
This reverts commit 53a2621a2d.
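A hedged sketch of what purging jemalloc arenas on high memory usage can look
like, using jemalloc's public mallctl() API (assuming an unprefixed jemalloc
build); the surrounding decision of when to purge lives in MemoryTracker and
is not shown here:
```
#include <jemalloc/jemalloc.h>
#include <cstdio>

// Ask jemalloc to return dirty (unused but still mapped) pages to the OS for
// all arenas; "dirty" matches the muzzy -> dirty change mentioned above.
void purgeJemallocArenas()
{
    // MALLCTL_ARENAS_ALL is a pseudo arena index that addresses every arena.
    char command[64];
    std::snprintf(command, sizeof(command), "arena.%u.purge",
                  static_cast<unsigned>(MALLCTL_ARENAS_ALL));
    if (mallctl(command, nullptr, nullptr, nullptr, 0) != 0)
        std::fprintf(stderr, "jemalloc purge failed\n");
}
```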
Right now, due to graceful cancellation of the query, it is possible to
hang while accepting new queries or even to deadlock, because the
cancellation is done while ProcessListBase::mutex is held.
So this patch makes query cancellation lock-free: now the lock will be
acquired only for preparing the query and after the cancel is done.
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
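A minimal sketch of the pattern with hypothetical, simplified types: do not
run cancellation while ProcessListBase::mutex is held; instead, collect what
needs to be cancelled under the lock and call cancel after releasing it.
```
#include <memory>
#include <mutex>
#include <vector>

// Hypothetical, simplified stand-ins for the real types.
struct QueryStatus
{
    void cancelQuery() { /* may block, resume fibers, take other locks */ }
};

struct ProcessList
{
    std::mutex mutex;  // plays the role of ProcessListBase::mutex
    std::vector<std::shared_ptr<QueryStatus>> queries;

    void cancelAll()
    {
        std::vector<std::shared_ptr<QueryStatus>> to_cancel;
        {
            std::lock_guard lock(mutex);
            to_cancel = queries;  // only copy shared_ptrs under the lock
        }
        // The potentially slow/blocking part runs without the lock, so other
        // threads calling ProcessList::insert() are not blocked behind it.
        for (auto & query : to_cancel)
            query->cancelQuery();
    }
};
```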
Right now it is possible to call QueryStatus::addPipelineExecutor() while
executors_mutex is already acquired; this can happen when the query is
cancelled via KILL QUERY.
Here are some debugger traces from a real example in which lots of
ProcessList::insert() calls got deadlocked.
Let's look at the lock owner for one of the threads that was deadlocked
in ProcessList::insert():
(gdb) p *mutex
$2 = {
__data = {
__owner = 46899,
},
}
And now let's see the stack trace of thread 46899:
#0 __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:103
#1 0x00007fb65569b714 in __GI___pthread_mutex_lock (mutex=0x7fb4a9d15298) at ../nptl/pthread_mutex_lock.c:80
#2 0x000000001b6edd91 in pthread_mutex_lock (arg=0x7fb4a9d15298) at ../src/Common/ThreadFuzzer.cpp:317
#3 std::__1::__libcpp_mutex_lock (__m=0x7fb4a9d15298) at ../contrib/libcxx/include/__threading_support:303
#4 std::__1::mutex::lock (this=0x7fb4a9d15298) at ../contrib/libcxx/src/mutex.cpp:33
#5 0x0000000014c7ae63 in std::__1::lock_guard<std::__1::mutex>::lock_guard (__m=..., this=<optimized out>) at ../contrib/libcxx/include/__mutex_base:91
#6 DB::QueryStatus::addPipelineExecutor (this=0x7fb4a9d14f90, e=0x80) at ../src/Interpreters/ProcessList.cpp:372
#7 0x0000000015bee4a7 in DB::PipelineExecutor::PipelineExecutor (this=0x7fb4b1e53618, processors=..., elem=<optimized out>) at ../src/Processors/Executors/PipelineExecutor.cpp:54
#12 std::__1::make_shared<DB::PipelineExecutor, std::__1::vector<std::__1::shared_ptr<DB::IProcessor>, std::__1::allocator<std::__1::shared_ptr<DB::IProcessor> > >&, DB::QueryStatus*&, void> (__args=@0x7fb63095b9b0: 0x7fb4a9d14f90, __args=@0x7fb63095b9b0: 0x7fb4a9d14f90) at ../contrib/libcxx/include/__memory/shared_ptr.h:963
#13 DB::QueryPipelineBuilder::execute (this=0x7fb63095b8b0) at ../src/QueryPipeline/QueryPipelineBuilder.cpp:552
#14 0x00000000158c6c27 in DB::Connection::sendExternalTablesData (this=0x7fb6545e9d98, data=...) at ../src/Client/Connection.cpp:797
#27 0x0000000014043a81 in DB::RemoteQueryExecutorRoutine::operator() (this=0x7fb63095bf20, sink=...) at ../src/QueryPipeline/RemoteQueryExecutorReadContext.cpp:46
#32 0x000000000a16dd4f in make_fcontext () at ../contrib/boost/libs/context/src/asm/make_x86_64_sysv_elf_gas.S:71
The logs also show something very strange for this thread:
2022.09.13 14:14:51.228979 [ 51145 ] {1712D4E914EC7C99} <Debug> Connection (localhost:9000): Sent data for 1 external tables, total 11 rows in 0.00046389 sec., 23688 rows/sec., 3.84 KiB (8.07 MiB/sec.), compressed 1.1070121092649958 times to 3.47 KiB (7.29 MiB/sec.)
...
2022.09.13 14:14:51.719402 [ 46899 ] {7c90ffa4-1dc8-42fd-938c-4e307c244394} <Debug> executeQuery: (from 10.101.15.181:42478) KILL QUERY WHERE query_id = '1712D4E914EC7C99' (stage: Complete)
2022.09.13 14:14:51.719488 [ 46899 ] {7c90ffa4-1dc8-42fd-938c-4e307c244394} <Debug> executeQuery: (internal) SELECT query_id, user, query FROM system.processes WHERE query_id = '1712D4E914EC7C99' (stage: Complete)
2022.09.13 14:14:51.719754 [ 46899 ] {7c90ffa4-1dc8-42fd-938c-4e307c244394} <Trace> ContextAccess (default): Access granted: SELECT(user, query_id, query) ON system.processes
2022.09.13 14:14:51.720544 [ 46899 ] {7c90ffa4-1dc8-42fd-938c-4e307c244394} <Trace> InterpreterSelectQuery: FetchColumns -> Complete
2022.09.13 14:14:53.228964 [ 46899 ] {7c90ffa4-1dc8-42fd-938c-4e307c244394} <Debug> Connection (localhost:9000): Sent data for 2 scalars, total 2 rows in 2.6838e-05 sec., 73461 rows/sec., 68.00 B (2.38 MiB/sec.), compressed 0.4594594594594595 times to 148.00 B (5.16 MiB/sec.)
How is this possible? The answer is fibers and the query cancellation
routine. During cancellation of async queries it goes into fibers again
and tries to do this gracefully. Because of this, while cancelling the
query it may call QueryStatus::addPipelineExecutor() from
QueryStatus::cancelQuery().
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
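A minimal illustration (not the real ClickHouse code) of why this deadlocks:
cancelQuery() holds the non-recursive executors_mutex while cancellation
resumes a fiber on the same thread, and that fiber constructs a new
PipelineExecutor whose constructor calls addPipelineExecutor(), which then
tries to lock the same mutex again.
```
#include <mutex>
#include <vector>

struct PipelineExecutor;

// Simplified sketch of the problematic pattern.
struct QueryStatus
{
    std::mutex executors_mutex;                 // non-recursive
    std::vector<PipelineExecutor *> executors;

    void addPipelineExecutor(PipelineExecutor * e)
    {
        std::lock_guard lock(executors_mutex);  // 2) same thread, same mutex -> deadlock
        executors.push_back(e);
    }

    void cancelQuery()
    {
        std::lock_guard lock(executors_mutex);  // 1) lock taken here
        for (auto * executor : executors)
        {
            // Cancelling an async query resumes its fiber on this very thread
            // (see make_fcontext in the stack trace above). If that fiber
            // builds a new PipelineExecutor, e.g. to send external tables
            // data, the constructor calls addPipelineExecutor() while
            // executors_mutex is still held in this frame.
            resumeFiberAndCancel(executor);     // hypothetical helper
        }
    }

    void resumeFiberAndCancel(PipelineExecutor *) { /* fiber switch elided */ }
};
```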
- This commit restores the statements "SYSTEM RELOAD MODEL(S)", which provide
a mechanism to update a model explicitly. It also avoids potentially
unnecessary reloads of a model from disk after its initial load.
To keep the complexity low, the semantics of "SYSTEM RELOAD MODEL(S)"
were changed from eager to lazy. This means that both statements
previously reloaded the specified/all models immediately, whereas now
the statements only trigger an unload and the first call to
catboostEvaluate() does the actual load.
- Monitoring view SYSTEM.MODELS is also restored but with some obsolete
fields removed. The view was not documented in the past and for now it
remains undocumented. The commit is thus not considered a breach of
ClickHouse's public interface.
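A hedged sketch of the eager-to-lazy change, with invented names and a
simplified cache: SYSTEM RELOAD MODEL only drops the cached instance, and the
next catboostEvaluate() call performs the actual load from disk.
```
#include <map>
#include <memory>
#include <mutex>
#include <string>

// Hypothetical stand-in for a loaded CatBoost model.
struct CatBoostModel
{
    explicit CatBoostModel(const std::string & /*path*/) { /* expensive load from disk */ }
    double evaluate() const { return 0.0; }
};

class ModelCache
{
public:
    // Lazy semantics: "reload" only forgets the cached instance ...
    void reloadModel(const std::string & path)
    {
        std::lock_guard lock(mutex);
        models.erase(path);
    }

    // ... and the first evaluation afterwards does the actual load.
    double evaluate(const std::string & path)
    {
        std::shared_ptr<const CatBoostModel> model;
        {
            std::lock_guard lock(mutex);
            auto & slot = models[path];
            if (!slot)
                slot = std::make_shared<const CatBoostModel>(path);
            model = slot;
        }
        return model->evaluate();
    }

private:
    std::mutex mutex;
    std::map<std::string, std::shared_ptr<const CatBoostModel>> models;
};
```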
Previously it simply slept for the async_insert_busy_timeout_ms delay on
shutdown, which is not great and produced false-positive hung checks on
CI [1].
[1]: https://pastila.nl/?0087a100/db80915eac58e67eb05516bf459e5510
Refs: #29390 (cc @tavplubix, @CurtizJ)
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
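A minimal sketch of the alternative to a fixed sleep, with assumed names and
structure: wait on a condition variable with the same timeout, so shutdown()
can wake the flusher immediately instead of waiting out the whole delay.
```
#include <chrono>
#include <condition_variable>
#include <mutex>

class AsyncInsertFlusher
{
public:
    void backgroundLoop(std::chrono::milliseconds busy_timeout)
    {
        std::unique_lock lock(mutex);
        while (!shutdown_requested)
        {
            // Unlike sleep_for(busy_timeout), this wait keeps the periodic
            // flush but can be interrupted immediately by shutdown().
            cv.wait_for(lock, busy_timeout, [this] { return shutdown_requested; });
            flushPending();
        }
        flushPending();  // final flush on shutdown
    }

    void shutdown()
    {
        {
            std::lock_guard lock(mutex);
            shutdown_requested = true;
        }
        cv.notify_all();
    }

private:
    void flushPending() { /* flush queued async inserts */ }

    std::mutex mutex;
    std::condition_variable cv;
    bool shutdown_requested = false;
};
```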
CI found one issue [1].
Here is the stack trace for the invalid read:
<details>
<summary>stack trace</summary>
```
0: DB::TemporaryFileLazySource::TemporaryFileLazySource(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, DB::Block const&) [inlined] std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >::__is_long(this="") const at string:1445:22
1: DB::TemporaryFileLazySource::TemporaryFileLazySource(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, DB::Block const&) [inlined] std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >::basic_string(this="", __str="") at string:1927
2: DB::TemporaryFileLazySource::TemporaryFileLazySource(this=0x00007f3aec105f58, path_="", header_=0x00007f38ffd93b40) at TemporaryFileLazySource.cpp:11
3: DB::SortedBlocksWriter::streamFromFile(std::__1::unique_ptr<Poco::TemporaryFile, std::__1::default_delete<Poco::TemporaryFile> > const&) const [inlined] DB::TemporaryFileLazySource* std::__1::construct_at<DB::TemporaryFileLazySource, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, DB::Block, DB::TemporaryFileLazySource*>(__args=0x00007f38ffd91560) at construct_at.h:38:50
4: DB::SortedBlocksWriter::streamFromFile(std::__1::unique_ptr<Poco::TemporaryFile, std::__1::default_delete<Poco::TemporaryFile> > const&) const [inlined] void std::__1::allocator_traits<std::__1::allocator<DB::TemporaryFileLazySource> >::construct<DB::TemporaryFileLazySource, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, DB::Block, void, void>(__args=0x00007f38ffd91560) at allocator_traits.h:298
5: DB::SortedBlocksWriter::streamFromFile(std::__1::unique_ptr<Poco::TemporaryFile, std::__1::default_delete<Poco::TemporaryFile> > const&) const [inlined] std::__1::__shared_ptr_emplace<DB::TemporaryFileLazySource, std::__1::allocator<DB::TemporaryFileLazySource> >::__shared_ptr_emplace<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, DB::Block>(this=0x00007f3aec105f40, __args=0x00007f38ffd91560) at shared_ptr.h:293
6: DB::SortedBlocksWriter::streamFromFile(std::__1::unique_ptr<Poco::TemporaryFile, std::__1::default_delete<Poco::TemporaryFile> > const&) const [inlined] std::__1::shared_ptr<DB::TemporaryFileLazySource> std::__1::allocate_shared<DB::TemporaryFileLazySource, std::__1::allocator<DB::TemporaryFileLazySource>, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, DB::Block, void>(__args=<unavailable>, __args=<unavailable>) at shared_ptr.h:954
7: DB::SortedBlocksWriter::streamFromFile(std::__1::unique_ptr<Poco::TemporaryFile, std::__1::default_delete<Poco::TemporaryFile> > const&) const [inlined] std::__1::shared_ptr<DB::TemporaryFileLazySource> std::__1::make_shared<DB::TemporaryFileLazySource, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, DB::Block, void>(__args=<unavailable>, __args=<unavailable>) at shared_ptr.h:963
8: DB::SortedBlocksWriter::streamFromFile(this=<unavailable>, file=<unavailable>) const at SortedBlocksWriter.cpp:238
9: DB::SortedBlocksWriter::premerge(this=<unavailable>) at SortedBlocksWriter.cpp:209:32
```
</details>
[1]: https://s3.amazonaws.com/clickhouse-test-reports/41046/adea92f847373d1fcfd733d8979c63024f9b80bf/stress_test__asan_.html
So the problem here is that there was an empty unique_ptr<> reference to
the temporary file, because of an empty block accepted by
SortedBlocksWriter::insert(). But insert() is not the problem; the
problem is premerge(), which steals blocks from insert() and does not
check that there are any rows. This check does exist in
SortedBlocksWriter::flush(), and in that case the temporary file is not
created.
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
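A minimal sketch of the kind of guard involved, with hypothetical simplified
types: skip blocks that contain no rows before anything tries to create or
read a temporary file for them.
```
#include <cstddef>
#include <list>

// Hypothetical, simplified stand-in for DB::Block.
struct Block
{
    size_t rows = 0;
    size_t rowCount() const { return rows; }
};

// Sketch of premerge-style logic: only create/flush a temporary file when the
// stolen blocks actually contain rows, mirroring the check that already
// existed in SortedBlocksWriter::flush().
void premergeSketch(std::list<Block> & stolen_blocks)
{
    size_t total_rows = 0;
    for (const auto & block : stolen_blocks)
        total_rows += block.rowCount();

    if (total_rows == 0)
        return;  // nothing to write: do not create an empty temporary file

    // ... write stolen_blocks to a temporary file and record its path ...
}
```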
- The deleted function modelEvaluate() was superseded by
catboostEvaluate().
- Also delete the external model repository, as modelEvaluate() was its
last user. Additionally, remove the system view SYSTEM.MODELS for
inspecting the repository.
- SYSTEM RELOAD MODELS is also obsolete. HOWEVER, it was retained and
made a no-op instead of being deleted.
Why?
The reason is that RBAC in distributed setups works by storing
privileges (granted and revoked) as plain SQL statements in Keeper.
Nodes read these statements at startup and parse them. If a privilege
for SYSTEM RELOAD MODELS exists but the parser doesn't recognize it,
nodes would fail to come up.
Considered but rejected alternatives:
- Ignore SYSTEM RELOAD MODELS during parsing RBAC privileges and
return an error for regular SYSTEM RELOAD MODELS SQL. Special-case
of no-op behavior, too brittle.
- Remove SYSTEM RELOAD MODELS manually from Keeper via command-line
manipulation of Keeper nodes or via SQL by dropping the privileges.
Needs user intervention during upgrade.
Fixed the `Unknown identifier (aggregate-function)` exception which appears when a user tries to calculate WINDOW ORDER BY/PARTITION BY expressions over aggregate functions.