If a Buffer table has columns of AggregateFunction type, the aggregate
states for those columns are allocated from the query context, but they
can be destroyed from the server context (in the case of a background
flush). Since aggregate states can be shared, the memory is then leaked
from the query, and eventually this leads to a MEMORY_LIMIT_EXCEEDED
error.
To avoid this, prohibit sharing the aggregate states.
Note that this problem is only about memory accounting, not memory
usage itself.
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
* Use INITIAL_QUERY for clickhouse-benchmark
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
* Fix parallel_reading_from_replicas with clickhouse-benchmark
Before, it produced the following error:
    $ clickhouse-benchmark --stacktrace -i1 --query "select * from remote('127.1', default.data_mt) limit 10" --allow_experimental_parallel_reading_from_replicas=1 --max_parallel_replicas=3
    Loaded 1 queries.
    Logical error: 'Coordinator for parallel reading from replicas is not initialized'.
    Aborted (core dumped)
This happens because it uses the same code, i.e. RemoteQueryExecutor ->
MultiplexedConnections, which enables the coordinator if it was
requested in the settings, but this should be done only for non-initial
queries, i.e. when a server connects to another server.
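The condition described above can be sketched roughly as follows. This
is a minimal illustration, not the actual ClickHouse code: QueryKind,
ParallelReadSettings and shouldEnableCoordinator are hypothetical names,
and treating "requested from settings" as "the flag is set and
max_parallel_replicas > 1" is an assumption based on the command line
above.

```cpp
#include <cstdint>

// Hypothetical stand-in for the query kind carried with the query.
enum class QueryKind { InitialQuery, SecondaryQuery };

// Hypothetical subset of the settings that request parallel reading.
struct ParallelReadSettings
{
    bool allow_experimental_parallel_reading_from_replicas = false;
    std::uint64_t max_parallel_replicas = 1;
};

// Enable the coordinator only when the settings request it AND the query
// being sent is not an initial one, i.e. it is sent by a server to another
// server and not by a client tool such as clickhouse-benchmark.
bool shouldEnableCoordinator(const ParallelReadSettings & settings, QueryKind kind)
{
    // Assumption: "requested" means the flag is on and more than one replica is allowed.
    const bool requested = settings.allow_experimental_parallel_reading_from_replicas
        && settings.max_parallel_replicas > 1;
    return requested && kind != QueryKind::InitialQuery;
}
```

With this shape, a query sent by clickhouse-benchmark as an initial query
never tries to use a coordinator, even if both settings are passed.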
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
* Fix 02226_parallel_reading_from_replicas_benchmark for older shellcheck
shellcheck 0.8 does not complain, while CI uses shellcheck 0.7.0, which
does complain [1]:
    In 02226_parallel_reading_from_replicas_benchmark.sh line 17:
        --allow_experimental_parallel_reading_from_replicas=1
        ^-- SC2191: The = here is literal. To assign by index, use ( [index]=value ) with no spaces. To keep as literal, quote it.

    Did you mean:
        "--allow_experimental_parallel_reading_from_replicas=1"
[1]: https://s3.amazonaws.com/clickhouse-test-reports/34751/d883af711822faf294c876b017cbf745b1cda1b3/style_check__actions_/shellcheck_output.txt
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
* Add a warning if parallel_distributed_insert_select was ignored
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
* Respect max_distributed_depth for parallel_distributed_insert_select
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
* Print a warning for non-applied parallel_distributed_insert_select only for the initial query
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
* Remove Cluster::getHashOfAddresses()
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
* Forbid parallel_distributed_insert_select for remote()/cluster() with different addresses
Before, it used the empty cluster name (getClusterName()), which is not
correct; compare all addresses instead.
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
* Fix max_distributed_depth check
max_distributed_depth=1 must mean no more than one distributed query,
not two, since max_distributed_depth=0 means no limit, and
distributed_depth is 0 for the first query.
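The intended semantics can be sketched as a small predicate. This is an
illustration under assumptions, not the actual ClickHouse code: the
function name distributedDepthExceeded is hypothetical, and the ">="
comparison is one reading of the rule stated above.

```cpp
#include <cstdint>

// Returns true when a further distributed query must be rejected.
// max_depth == 0 means "no limit"; current_depth is 0 for the first query
// and grows by one for each hop to another server.
bool distributedDepthExceeded(std::uint64_t max_depth, std::uint64_t current_depth)
{
    return max_depth != 0 && current_depth >= max_depth;
}
```

With max_depth = 1 the first query (depth 0) passes, while a query that
already arrived with depth 1 is rejected, so only one distributed query
is allowed. A strict ">" comparison would let the depth-1 query through
as well, which gives two.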
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
* Fix INSERT INTO remote()/cluster() with parallel_distributed_insert_select
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
* Add a test for parallel_distributed_insert_select with cluster()/remote()
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
* Return <remote> instead of empty cluster name in Distributed engine
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
* Make the user identical with and without sharding_key in remote()/cluster()
Before, with sharding_key the user was "default", while without it the
user was empty.
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
* Add columns description to metadata in case of schema inference
* Make better
* Remove unnecessary code
* Fix tests
* More tests
* Add tag no-fasttest
* Fix test
* Fix test
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>