In case of Buffer table has columns of AggregateFunction type,
aggregate states for such columns will be allocated from the query
context but those states can be destroyed from the server context (in
case of background flush), and thus memory will be leaked from the query
since aggregate states can be shared, and eventually this will lead to
MEMORY_LIMIT_EXCEEDED error.
To avoid this, prohibit sharing the aggregate states.
But note, that this problem only about memory accounting, not memory
usage itself.
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
* Use INITIAL_QUERY for clickhouse-benchmark
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
* Fix parallel_reading_from_replicas with clickhouse-bechmark
Before it produces the following error:
$ clickhouse-benchmark --stacktrace -i1 --query "select * from remote('127.1', default.data_mt) limit 10" --allow_experimental_parallel_reading_from_replicas=1 --max_parallel_replicas=3
Loaded 1 queries.
Logical error: 'Coordinator for parallel reading from replicas is not initialized'.
Aborted (core dumped)
Since it uses the same code, i.e RemoteQueryExecutor ->
MultiplexedConnections, which enables coordinator if it was requested
from settings, but it should be done only for non-initial queries, i.e.
when server send connection to another server.
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
* Fix 02226_parallel_reading_from_replicas_benchmark for older shellcheck
By shellcheck 0.8 does not complains, while on CI shellcheck 0.7.0 and
it does complains [1]:
In 02226_parallel_reading_from_replicas_benchmark.sh line 17:
--allow_experimental_parallel_reading_from_replicas=1
^-- SC2191: The = here is literal. To assign by index, use ( [index]=value ) with no spaces. To keep as literal, quote it.
Did you mean:
"--allow_experimental_parallel_reading_from_replicas=1"
[1]: https://s3.amazonaws.com/clickhouse-test-reports/34751/d883af711822faf294c876b017cbf745b1cda1b3/style_check__actions_/shellcheck_output.txt
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
* Add a warning if parallel_distributed_insert_select was ignored
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
* Respect max_distributed_depth for parallel_distributed_insert_select
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
* Print warning for non applied parallel_distributed_insert_select only for initial query
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
* Remove Cluster::getHashOfAddresses()
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
* Forbid parallel_distributed_insert_select for remote()/cluster() with different addresses
Before it uses empty cluster name (getClusterName()) which is not
correct, compare all addresses instead.
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
* Fix max_distributed_depth check
max_distributed_depth=1 must mean not more then one distributed query,
not two, since max_distributed_depth=0 means no limit, and
distribute_depth is 0 for the first query.
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
* Fix INSERT INTO remote()/cluster() with parallel_distributed_insert_select
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
* Add a test for parallel_distributed_insert_select with cluster()/remote()
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
* Return <remote> instead of empty cluster name in Distributed engine
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
* Make user with sharding_key and w/o in remote()/cluster() identical
Before with sharding_key the user was "default", while w/o it it was
empty.
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
In #33291 final part commit had been defered, and now it can take
significantly more time, that may lead to "Part directory doesn't exist"
error during INSERT:
2022.02.21 18:18:06.979881 [ 11329 ] {insert} <Debug> executeQuery: (from 127.1:24572, user: default) INSERT INTO db.table (...) VALUES
2022.02.21 20:58:03.933593 [ 11329 ] {insert} <Trace> db.table: Renaming temporary part tmp_insert_20220214_18044_18044_0 to 20220214_270654_270654_0.
2022.02.21 21:16:50.961917 [ 11329 ] {insert} <Trace> db.table: Renaming temporary part tmp_insert_20220214_18197_18197_0 to 20220214_270689_270689_0.
...
2022.02.22 21:16:57.632221 [ 64878 ] {} <Warning> db.table: Removing temporary directory /clickhouse/data/db/table/tmp_insert_20220214_18232_18232_0/
...
2022.02.23 12:23:56.277480 [ 11329 ] {insert} <Trace> db.table: Renaming temporary part tmp_insert_20220214_18232_18232_0 to 20220214_273459_273459_0.
2022.02.23 12:23:56.299218 [ 11329 ] {insert} <Error> executeQuery: Code: 107. DB::Exception: Part directory /clickhouse/data/db/table/tmp_insert_20220214_18232_18232_0/ doesn't exist. Most likely it is a logical error. (FILE_DOESNT_EXIST) (version 22.2.1.1) (from 127.1:24572) (in query: INSERT INTO db.table (...) VALUES), Stack trace (when copying this message, always include the lines below):
Follow-up for: #28760
Refs: #33291
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
In external dictionary providers, the allowed keys for configuration seemed to have a typo
of "update_lag" as "update_tag", preventing the use of "update_lag". This change fixes that.
Here is an example for deadlock during shutting down DDLWorker:
Server main thread:
6 ThreadFromGlobalPool::join () at ../src/Common/ThreadPool.h:217
7 DB::DDLWorker::shutdown () at ../src/Interpreters/DDLWorker.cpp:123
8 DB::DDLWorker::~DDLWorker () at ../src/Interpreters/DDLWorker.cpp:131
9 DB::DDLWorker::~DDLWorker () at ../src/Interpreters/DDLWorker.cpp:130
10 std::__1::default_delete<DB::DDLWorker>::operator() () at ../contrib/libcxx/include/memory:1397
11 std::__1::unique_ptr<>::reset (this=0x7f7521d44fd0, __p=0x0) at ../contrib/libcxx/include/memory:1658
12 DB::ContextSharedPart::shutdown () at ../src/Interpreters/Context.cpp:380 <!-- holds mutex
13 DB::Context::shutdown () at ../src/Interpreters/Context.cpp:2677
DDLWorker thread:
7 DB::Context::getLock () at ../src/Interpreters/Context.cpp:472 <-- trying to acquire shared.mutex
8 DB::Context::getTCPPortSecure () at ../src/Interpreters/Context.cpp:2120
9 DB::DDLTask::findCurrentHostID () at ../src/Interpreters/DDLTask.cpp:169
10 DB::DDLWorker::initAndCheckTask () at ../src/Interpreters/DDLWorker.cpp:191
11 DB::DDLWorker::scheduleTasks () at ../src/Interpreters/DDLWorker.cpp:367
12 DB::DDLWorker::runMainThread () at ../src/Interpreters/DDLWorker.cpp:1063
v2: replace getTryLockTimed() with getTryLock() since std::recursive_mutex does not have try_lock_for()
v3: stole objects from context
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
Fixes: #21318
system.mutations includes only the message, but not stacktrace, and it
is not always obvious to understand the culprit w/o stacktrace.
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
* Add columns description to metadata in case of schema inference
* Make better
* Remove unnecessary code
* Fix tests
* More tests
* Add tag no-fasttest
* Fix test
* Fix test
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
It was set only bu utils/release/release_lib.sh, and seems that this
script is not used anymore, at least that part of it.
Also note, that GIT_DATE is the same, and it is date time, not only
date.
Plus VERSION_DATE is not installed for releases anyway.
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
CMAKE_LIBRARY_ARCHITECTURE and it is useless, since it is reported only
if the compiler reports subdir arch triplet [1]
[1]: https://bugzilla.redhat.com/show_bug.cgi?id=1531678
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>