ReadBufferFromKafkaConsumer does not handle the case when a message with
an error sits at a non-first position in the current batch, since after
poll it walks through the messages in the batch and stops at the first
valid one.
But later it can try to use an errored message as if it were valid:
- while storing the offset
- when getting the topic name
- ...
And besides, the message itself is invalid (you can see this in the
gdb traces below).
So just filter out messages with an error after poll.
The SIGSEGV occurred with the following stack trace:
(gdb) bt
3 0x0000000010f05b4d in rd_kafka_offset_store (app_rkt=0x0, partition=0, offset=0) at ../contrib/librdkafka/src/rdkafka_offset.c:656
4 0x0000000010e69657 in cppkafka::Consumer::store_offset (this=0x7f2015210820, msg=...) at ../contrib/cppkafka/include/cppkafka/message.h:225
5 0x000000000e68f208 in DB::ReadBufferFromKafkaConsumer::storeLastReadMessageOffset (this=0x7f206a136618) at ../contrib/libcxx/include/iterator:1508
6 0x000000000e68b207 in DB::KafkaBlockInputStream::readImpl (this=0x7f202c689020) at ../src/Storages/Kafka/KafkaBlockInputStream.cpp:150
7 0x000000000dd1178d in DB::IBlockInputStream::read (this=this@entry=0x7f202c689020) at ../src/DataStreams/IBlockInputStream.cpp:60
8 0x000000000dd34c0a in DB::copyDataImpl<> () at ../src/DataStreams/copyData.cpp:21
9 DB::copyData () at ../src/DataStreams/copyData.cpp:62
10 0x000000000e67c8f2 in DB::StorageKafka::streamToViews () at ../contrib/libcxx/include/memory:3823
11 0x000000000e67d218 in DB::StorageKafka::threadFunc () at ../src/Storages/Kafka/StorageKafka.cpp:488
And some information from it:
(gdb) p this.current.__i
$14 = (std::__1::__wrap_iter<cppkafka::Message const*>::iterator_type) 0x7f1ca8f58660
# current-1
(gdb) p $14-1
$15 = (const cppkafka::Message *) 0x7f1ca8f58600
(gdb) p $16.handle_
$17 = {__ptr_ = {<std::__1::__compressed_pair_elem<rd_kafka_message_s*, 0, false>> = { __value_ = 0x7f203577f938}, ...}
(gdb) p *(rd_kafka_message_s*)0x7f203577f938
$24 = {err = RD_KAFKA_RESP_ERR__TRANSPORT, rkt = 0x0, partition = 0, payload = 0x7f202f0339c0, len = 63, key = 0x0, key_len = 0, offset = 0, _private = 0x7f203577f8c0}
# current
(gdb) p $14-0
$28 = (const cppkafka::Message *) 0x7f1ca8f58660
(gdb) p $28.handle_.__ptr_
$29 = {<std::__1::__compressed_pair_elem<rd_kafka_message_s*, 0, false>> = { __value_ = 0x7f184f129bf0}, ...}
(gdb) p *(rd_kafka_message_s*)0x7f184f129bf0
$30 = {err = RD_KAFKA_RESP_ERR_NO_ERROR, rkt = 0x7f1ed44fe000, partition = 1, payload = 0x7f1fc9bc6036, len = 242, key = 0x0, key_len = 0, offset = 2394853582209,
# current+1
(gdb) p (*($14+1)).handle_.__ptr_
$44 = {<std::__1::__compressed_pair_elem<rd_kafka_message_s*, 0, false>> = { __value_ = 0x7f184f129d30}, ...}
(gdb) p *(rd_kafka_message_s*)0x7f184f129d30
$45 = {err = RD_KAFKA_RESP_ERR_NO_ERROR, rkt = 0x7f1ed44fe000, partition = 1, payload = 0x7f1fc9bc612f, len = 31, key = 0x0, key_len = 0, offset = 2394853582210,
_private = 0x7f184f129cc0}
# distance from the beginning
(gdb) p messages.__end_-messages.__begin_
$34 = 65536
(gdb) p ($14-0)-messages.__begin_
$37 = 8965
(gdb) p ($14-1)-messages.__begin_
$38 = 8964
# parsing info
(gdb) p allowed
$39 = false
(gdb) p new_rows
$40 = 1
(gdb) p total_rows
$41 = 8964
# current buffer is invalid
(gdb) p *buffer.__ptr_
$50 = {<DB::ReadBuffer> = {<DB::BufferBase> = {pos = 0x7f202f0339c0 "FindCoordinator response error: Local: Broker transport failure", bytes = 47904863385, working_buffer = {
begin_pos = 0x7f202f0339c0 "FindCoordinator response error: Local: Broker transport failure",
end_pos = 0x7f202f0339c0 "FindCoordinator response error: Local: Broker transport failure"}, internal_buffer = {
v0: check message errors in ReadBufferFromKafkaConsumer::nextImpl() (but
this may lead to using those messages afterwards, and SIGSEGV again, doh).
v2: skip messages with an error after poll.