Fix dictionary hang in case of CANNOT_SCHEDULE_TASK while loading

On CI, 01747_executable_pool_dictionary_implicit_key can hang [1]. This
is possible because the async loading hangs after a
CANNOT_SCHEDULE_TASK error:

    2024.07.18 03:56:32.365226 [ 6138 ] {6206a18f-668c-4a5c-a5ad-07f577220762} <Trace> ExternalDictionariesLoader: Will load the object 'executable_pool_simple_implicit_key' in background, force = false, loading_id = 2
    2024.07.18 03:56:32.368005 [ 6138 ] {6206a18f-668c-4a5c-a5ad-07f577220762} <Error> executeQuery: Code: 439. DB::Exception: Cannot schedule a task: fault injected (threads=766, jobs=746): In scope SELECT dictGet('executable_pool_simple_implicit_key', 'a', toUInt64(1)). (CANNOT_SCHEDULE_TASK) (version 24.7.1.2241) (from [::1]:56446) (comment: 01747_executable_pool_dictionary_implicit_key.sql) (in query: SELECT dictGet('executable_pool_simple_implicit_key', 'a', toUInt64(1));), Stack trace (when copying this message, always include the lines below):
    0. /build/contrib/llvm-project/libcxx/include/exception:141: Poco::Exception::Exception(String const&, int) @ 0x0000000015f8a292
    1. /build/src/Common/Exception.cpp:110: DB::Exception::Exception(DB::Exception::MessageMasked&&, int, bool) @ 0x000000000c3df6b9
    2. /build/contrib/llvm-project/libcxx/include/string:1499: DB::Exception::Exception(PreformattedMessage&&, int) @ 0x0000000006de714c
    3. /build/contrib/llvm-project/libcxx/include/vector:438: DB::Exception::Exception<String const&, unsigned long, unsigned long&>(int, FormatStringHelperImpl<std::type_identity<String const&>::type, std::type_identity<unsigned long>::type, std::type_identity<unsigned long&>::type>, String const&, unsigned long&&, unsigned long&) @ 0x000000000c4838eb
    4. /build/src/Common/ThreadPool.cpp:0: void ThreadPoolImpl<std::thread>::scheduleImpl<void>(std::function<void ()>, Priority, std::optional<unsigned long>, bool)::'lambda'(String const&)::operator()(String const&) const @ 0x000000000c4832d3
    5. /build/src/Common/ThreadPool.cpp:186: void ThreadPoolImpl<std::thread>::scheduleImpl<void>(std::function<void ()>, Priority, std::optional<unsigned long>, bool) @ 0x000000000c47e7db
    6. /build/contrib/llvm-project/libcxx/include/__functional/function.h:818: ? @ 0x000000000c47ec8d
    7. /build/contrib/llvm-project/libcxx/include/__functional/function.h:818: ? @ 0x000000001114b16e
    8. /build/contrib/llvm-project/libcxx/include/__memory/shared_ptr.h:701: DB::ExternalLoader::LoadingDispatcher::startLoading(DB::ExternalLoader::LoadingDispatcher::Info&, bool, unsigned long) @ 0x0000000011147733
    9. /build/src/Interpreters/ExternalLoader.cpp:837: DB::ExternalLoader::LoadingDispatcher::loadImpl(String const&, std::chrono::duration<long long, std::ratio<1l, 1000l>>, bool, std::unique_lock<std::mutex>&)::'lambda'()::operator()() const @ 0x0000000011158bf9
    10. /build/contrib/llvm-project/libcxx/include/__mutex_base:397: DB::ExternalLoader::LoadingDispatcher::loadImpl(String const&, std::chrono::duration<long long, std::ratio<1l, 1000l>>, bool, std::unique_lock<std::mutex>&) @ 0x00000000111588bc
    11. /build/src/Interpreters/ExternalLoader.cpp:604: DB::ExternalLoader::LoadResult DB::ExternalLoader::LoadingDispatcher::tryLoad<DB::ExternalLoader::LoadResult>(String const&, std::chrono::duration<long long, std::ratio<1l, 1000l>>) @ 0x00000000111440bf
    12. /build/src/Interpreters/ExternalLoader.cpp:1381: std::shared_ptr<DB::IExternalLoadable const> DB::ExternalLoader::load<std::shared_ptr<DB::IExternalLoadable const>, void>(String const&) const @ 0x00000000111442f5
    13. /build/contrib/llvm-project/libcxx/include/__memory/shared_ptr.h:587: DB::ExternalDictionariesLoader::getDictionary(String const&, std::shared_ptr<DB::Context const>) const @ 0x0000000011141028
    14. /build/src/Functions/FunctionsExternalDictionaries.h:76: DB::FunctionDictHelper::getDictionary(String const&) @ 0x00000000071d28ec
    ...
    2024.07.18 03:58:29.000900 [ 48468 ] {8cf63d7e-dcbf-4af6-bd7c-0e1789ddce3b} <Debug> executeQuery: (from [::1]:40410) (comment: 01747_executable_pool_dictionary_implicit_key.sql) SELECT dictGet('executable_pool_simple_implicit_key', 'a', toUInt64(1)); (stage: Complete)
    # and no more rows for 8cf63d7e-dcbf-4af6-bd7c-0e1789ddce3b

  [1]: https://s3.amazonaws.com/clickhouse-test-reports/66495/bc029ed8207ac75e96e9cb48cb79d27a9ffa4e2f/stress_test__debug_.html

The problem is that the loading must be properly cancelled when the task
cannot be scheduled; otherwise the object will not be loaded in
loadImpl(), but will still be waited for.
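
A minimal, self-contained sketch of the same pattern, not the actual
ExternalLoader code. Info, Scheduler, Dispatcher and isLoading() here are
hypothetical stand-ins; only the try/catch shape with cancelLoading() and
a rethrow mirrors the change in this commit. It assumes that a non-zero
loading id is what makes later callers wait instead of starting a new
load.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical per-object state: a non-zero loading_id means "load in progress".
struct Info
{
    std::string name;
    std::size_t loading_id = 0;

    bool isLoading() const { return loading_id != 0; }
};

// Hypothetical scheduler; it may throw, imitating CANNOT_SCHEDULE_TASK.
using Scheduler = std::function<void(std::function<void()>)>;

class Dispatcher
{
public:
    explicit Dispatcher(Scheduler scheduler_) : scheduler(std::move(scheduler_)) {}

    void startLoading(Info & info)
    {
        // From this point on, callers see the object as "loading".
        info.loading_id = ++next_loading_id;

        try
        {
            // The job clears the flag once the load has finished.
            scheduler([&info] { info.loading_id = 0; });
        }
        catch (...)
        {
            // Without this reset the flag stays set forever, so later
            // callers would wait for a load that was never started.
            cancelLoading(info);
            throw;
        }
    }

    void cancelLoading(Info & info) { info.loading_id = 0; }

private:
    Scheduler scheduler;
    std::size_t next_loading_id = 0;
};
```

With a scheduler that always throws, startLoading() propagates the
exception and leaves the object in the "not loading" state, so a retry
can start a fresh load.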

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
Azat Khuzhin 2024-08-04 15:43:18 +02:00
parent 3905dde3d9
commit 9f31488e50

@@ -922,7 +922,16 @@ private:
         if (enable_async_loading)
         {
             /// Put a job to the thread pool for the loading.
-            auto thread = ThreadFromGlobalPool{&LoadingDispatcher::doLoading, this, info.name, loading_id, forced_to_reload, min_id_to_finish_loading_dependencies_, true, CurrentThread::getGroup()};
+            ThreadFromGlobalPool thread;
+            try
+            {
+                thread = ThreadFromGlobalPool{&LoadingDispatcher::doLoading, this, info.name, loading_id, forced_to_reload, min_id_to_finish_loading_dependencies_, true, CurrentThread::getGroup()};
+            }
+            catch (...)
+            {
+                cancelLoading(info);
+                throw;
+            }
             loading_threads.try_emplace(loading_id, std::move(thread));
         }
         else