mirror of
https://github.com/ClickHouse/ClickHouse.git
synced 2024-12-15 02:41:59 +00:00
055c231438
It is not safe to use statistics because of how KafkaEngine works: it pre-creates consumers, so statistics entries (RD_KAFKA_OP_STATS) are generated but never consumed. This creates a live memory leak on a server that has Kafka tables with no materialized view attached (and no SELECT). Another problem is that it makes shutdown very slow because of how pending queue entries are handled in librdkafka: it uses TAILQ_INSERT_SORTED, a sorted insert into a linked list, which is incredibly slow (you will likely never wait until it finishes and will kill the server instead). For instance, in my production setup the server had been running for ~67 days with such a table and had accumulated 1'942'233 `TAILQ_INSERT_SORTED` entries (which closely matches the expected count, by the way: `67*86400/3` = 1'929'600). It moved only 289'806 entries in a few hours, though I'm not sure how much of that time the process was actually running, since most of the time it had a debugger attached. So for now let's disable statistics, to keep this patch easy to backport, and I will think about a long-term fix: do not pre-create consumers in the Kafka engine. Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com> |
||
---|---|---|
KafkaConsumer.cpp | ||
KafkaConsumer.h | ||
KafkaProducer.cpp | ||
KafkaProducer.h | ||
KafkaSettings.cpp | ||
KafkaSettings.h | ||
KafkaSource.cpp | ||
KafkaSource.h | ||
parseSyslogLevel.cpp | ||
parseSyslogLevel.h | ||
StorageKafka.cpp | ||
StorageKafka.h |
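
The arithmetic and the slow-shutdown claim in the commit message above can be checked with a small sketch. It is not librdkafka code: `expectedStatsEvents`, `sortedInsert`, and `totalVisited` are hypothetical names, the 3-second statistics interval is an assumption inferred from the `/3` in the message, and `sortedInsert` only imitates a head-to-tail sorted insert into a linked list in its worst case (every new entry belongs at the tail).

```cpp
#include <cassert>
#include <cstdint>
#include <list>

// Number of statistics events emitted over an uptime, one per interval.
// The interval is a parameter; 3 seconds is an assumption taken from the
// "/3" in the commit message.
std::uint64_t expectedStatsEvents(std::uint64_t uptime_days, std::uint64_t interval_seconds)
{
    assert(interval_seconds > 0);
    return uptime_days * 86400 / interval_seconds;
}

// Hypothetical stand-in for a sorted insert into a linked list: walk from
// the head until the insert position is found, then link the new node in.
// Returns how many existing nodes were visited.
std::uint64_t sortedInsert(std::list<std::uint64_t> & queue, std::uint64_t key)
{
    std::uint64_t visited = 0;
    auto it = queue.begin();
    while (it != queue.end() && *it <= key)
    {
        ++it;
        ++visited;
    }
    queue.insert(it, key);
    return visited;
}

// Total nodes visited when n entries with increasing keys are moved into an
// initially empty sorted list (worst case: each insert walks the whole list).
std::uint64_t totalVisited(std::uint64_t n)
{
    std::list<std::uint64_t> queue;
    std::uint64_t total = 0;
    for (std::uint64_t key = 0; key < n; ++key)
        total += sortedInsert(queue, key);
    return total;
}
```

Under these assumptions `expectedStatsEvents(67, 3)` gives 1'929'600, and `totalVisited(n)` grows as `n*(n-1)/2`, so moving roughly 1.9 million entries in the worst case costs on the order of 10^12 node visits, which is consistent with a shutdown that does not finish in hours.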