Actually now we can create consumer object in the ctor, no need to do
this in startup(), since consumer now do not connects to kafka.
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
Since pool may exceed threads, while we need to run this thread always
to avoid memory leaking.
And this should not be a problem since librdkafka has multiple threads
for each consumer (5!) anyway.
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
This will make system.kafka_consumers more useful, since after TTL
consumer object will be removed prio this patch, but after, all
information will be preserved.
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
Pool of consumers created a problem for librdkafka internal statistics,
you need to read from the queue always, while in ClickHouse consumers
created regardless are there any readers or not (attached materialized
views or direct SELECTs).
Otherwise, this statistics messages got queued and never released,
which:
- creates live memory leak
- and also makes destroy very slow, due to librdkafka internals (it
moves entries from this queue into another linked list, but in a
with sorting, which is incredibly slow for linked lists)
So the idea is simple, let's create a pool of consumers only when they
are required, and destroy them after some timeout (right now it is 60
seconds) if nobody uses them, that way this problem should gone.
This should also reduce number of internal librdkafka threads, when
nobody reads from Kafka tables.
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
This may cause troubles, like forgetting to pass arguments, and there
are few places in the code (see the upcomming patch).
I doubt that this will make any performance changes, since the check
should be compile time.
And anyway Exception is an exceptional situation which should be rare
(there is no such code with single argument for logging, while logging
is more common).
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
The problem with --foreground option is that it send the signal only to
the process that had been spawned by timeout(1), while it can create
lots of children, and when you killing parent you are closing pipes and
childrens will get EPIPE, like in [1].
[1]: https://s3.amazonaws.com/clickhouse-test-reports/0/069f8bbb2f48541cc736903e1da5459fa2c27da0/stateless_tests__debug__%5B2_5%5D.html
Another problem is that now child process will finish correctly, which
may also print some errors like QUERY_WAS_CANCELLED (see [2]).
[2]: https://s3.amazonaws.com/clickhouse-test-reports/0/ef66714bf20042ba9cb5d59b7839befe26110b93/stateless_tests__release__analyzer_.html
In general this is not required actually, since all timeout invocations
uses timeout value less then the default test limit (10min). But it may
leave some processes in case of overriding this limit, i.e.
`clickhouse-test --timeout 1`
So to workaround this at least somehow, let's send SIGTERM and only
after some timeout (here I use 0.1), SIGKILL. This will give at least
some ability to terminate all childrens that had been spawned by
timeout(1).
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>