We would update the set of seen GTIDs as soon as we saw a GTID_EVENT,
which arrives before a transaction. This mostly worked fine, but if we
lost the connection to MySQL in the middle of a large transaction, we
had already persisted the transaction as processed the moment it
started. When the connection was reestablished, we would not process
the transaction again, which meant that we only applied parts of it.
Fix this by updating the seen GTIDs at the end of the transaction
instead.
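To illustrate, a minimal sketch of the corrected bookkeeping; GTIDEvent, RowEvent, XIDEvent and GTIDSet are illustrative stand-ins, not the real binlog reader types:

```cpp
#include <iostream>
#include <optional>
#include <set>
#include <string>
#include <variant>
#include <vector>

// Illustrative stand-ins for binlog events; not ClickHouse's real types.
struct GTIDEvent { std::string gtid; };   // announces the upcoming transaction
struct RowEvent  { std::string data; };   // part of the transaction body
struct XIDEvent  {};                      // marks the transaction commit

using BinlogEvent = std::variant<GTIDEvent, RowEvent, XIDEvent>;

struct GTIDSet
{
    std::set<std::string> seen;
    void persist(const std::string & gtid) { seen.insert(gtid); }  // pretend this is durable
};

void processBinlog(const std::vector<BinlogEvent> & events, GTIDSet & gtid_set)
{
    std::optional<std::string> current;  // GTID of the in-flight transaction

    for (const auto & event : events)
    {
        if (const auto * gtid = std::get_if<GTIDEvent>(&event))
        {
            /// Before the fix the GTID was persisted here, so a connection
            /// loss mid-transaction left it marked as already processed.
            current = gtid->gtid;
        }
        else if (const auto * row = std::get_if<RowEvent>(&event))
        {
            std::cout << "apply " << row->data << '\n';
        }
        else if (std::get_if<XIDEvent>(&event))
        {
            /// After the fix: only a fully applied transaction is recorded,
            /// so a reconnect replays anything that was cut short.
            if (current)
                gtid_set.persist(*current);
            current.reset();
        }
    }
}

int main()
{
    GTIDSet gtid_set;
    processBinlog({GTIDEvent{"uuid:1"}, RowEvent{"row 1"}, XIDEvent{}}, gtid_set);
}
```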
The problem is that GROUP BY can update `saved_hash`, which can also be
updated by a subsequent update of the dictionary, and this leads to a
use-after-free.
You will find the ASan report in `details` below.
<details>
==24679==ERROR: AddressSanitizer: heap-use-after-free on address 0x604000615d20 at pc 0x000022cc8684 bp 0x7ffea6b5f850 sp 0x7ffea6b5f848
READ of size 8 at 0x604000615d20 thread T223 (QueryPipelineEx)
0 0x22cc8683 in DB::ReverseIndex<>::insert(StringRef const&) obj-x86_64-linux-gnu/../src/Columns/ReverseIndex.h
1 0x22cc0de1 in COW<DB::IColumn>::mutable_ptr<DB::IColumn> DB::ColumnUnique<DB::ColumnString>::uniqueInsertRangeImpl<char8_t>()::'lambda'()::operator()() const obj-x86_64-linux-gnu/../src/Columns/ColumnUnique.h:540:39
2 0x22cc0de1 in COW<DB::IColumn>::mutable_ptr<DB::IColumn> DB::ColumnUnique<DB::ColumnString>::uniqueInsertRangeImpl<char8_t>() obj-x86_64-linux-gnu/../src/Columns/ColumnUnique.h:570:23
3 0x22cb9c66 in COW<DB::IColumn>::mutable_ptr<DB::IColumn> DB::ColumnUnique<DB::ColumnString>::uniqueInsertRangeFrom()::'lambda'(auto)::operator()<char8_t>(auto) const obj-x86_64-linux-gnu/../src/Columns/ColumnUnique.h:592:26
4 0x22cb9c66 in DB::ColumnUnique<DB::ColumnString>::uniqueInsertRangeFrom() obj-x86_64-linux-gnu/../src/Columns/ColumnUnique.h:600:28
5 0x2500b897 in DB::ColumnLowCardinality::insertRangeFrom() obj-x86_64-linux-gnu/../src/Columns/ColumnLowCardinality.cpp:205:62
6 0x25a182f4 in DB::appendBlock(DB::Block const&, DB::Block&) obj-x86_64-linux-gnu/../src/Storages/StorageBuffer.cpp:470:23
7 0x25a182f4 in DB::BufferSink::insertIntoBuffer(DB::Block const&, DB::StorageBuffer::Buffer&) obj-x86_64-linux-gnu/../src/Storages/StorageBuffer.cpp:634:9
8 0x25a173cc in DB::BufferSink::consume(DB::Chunk) obj-x86_64-linux-gnu/../src/Storages/StorageBuffer.cpp:595:9
9 0x26d1c997 in DB::SinkToStorage::transform(DB::Chunk&) obj-x86_64-linux-gnu/../src/Processors/Sinks/SinkToStorage.cpp:18:5
0x604000615d20 is located 16 bytes inside of 40-byte region [0x604000615d10,0x604000615d38)
freed by thread T37 (QueryPipelineEx) here:
2 0x22cb9392 in boost::intrusive_ptr<DB::ColumnVector<unsigned long> >::~intrusive_ptr() obj-x86_64-linux-gnu/../contrib/boost/boost/smart_ptr/intrusive_ptr.hpp:98:23
4 0x22cb9392 in COW<DB::IColumn>::mutable_ptr<DB::ColumnVector<unsigned long> >::operator=() obj-x86_64-linux-gnu/../src/Common/COW.h:100:57
5 0x22cb9392 in DB::ReverseIndex<>::tryGetSavedHash() const obj-x86_64-linux-gnu/../src/Columns/ReverseIndex.h:362:28
6 0x22cb9392 in DB::ColumnUnique<DB::ColumnString>::tryGetSavedHash() const obj-x86_64-linux-gnu/../src/Columns/ColumnUnique.h:125:76
7 0x242eaed3 in DB::ColumnsHashing::HashMethodSingleLowCardinalityColumn<>::HashMethodSingleLowCardinalityColumn() obj-x86_64-linux-gnu/../src/Common/ColumnsHashing.h:287:50
8 0x242206c6 in void DB::Aggregator::executeImpl<>() const obj-x86_64-linux-gnu/../src/Interpreters/Aggregator.cpp:596:28
9 0x24148e99 in DB::Aggregator::executeOnBlock() const obj-x86_64-linux-gnu/../src/Interpreters/Aggregator.cpp:1004:9
10 0x26c24f3a in DB::AggregatingTransform::consume(DB::Chunk) obj-x86_64-linux-gnu/../src/Processors/Transforms/AggregatingTransform.cpp:539:33
11 0x26c2054e in DB::AggregatingTransform::work() obj-x86_64-linux-gnu/../src/Processors/Transforms/AggregatingTransform.cpp:500:9
previously allocated by thread T37 (QueryPipelineEx) here:
0 0xb6d44fd in operator new(unsigned long) (/src/ch/tmp/upstream/clickhouse-asan+0xb6d44fd)
1 0x11b78580 in COW<DB::IColumn>::mutable_ptr<> COWHelper<>::create<unsigned long&>(unsigned long&) (/src/ch/tmp/upstream/clickhouse-asan+0x11b78580)
2 0x22cbf7b1 in DB::ReverseIndex<>::calcHashes() const obj-x86_64-linux-gnu/../src/Columns/ReverseIndex.h:472:17
3 0x22cc2307 in DB::ReverseIndex<>::buildIndex() obj-x86_64-linux-gnu/../src/Columns/ReverseIndex.h:438:22
4 0x22cc658c in DB::ReverseIndex<>::insert(StringRef const&) obj-x86_64-linux-gnu/../src/Columns/ReverseIndex.h:484:9
5 0x22cc0de1 in COW<DB::IColumn>::mutable_ptr<DB::IColumn> DB::ColumnUnique<DB::ColumnString>::uniqueInsertRangeImpl<char8_t>()::'lambda'()::operator()() const obj-x86_64-linux-gnu/../src/Columns/ColumnUnique.h:540:39
6 0x22cc0de1 in COW<DB::IColumn>::mutable_ptr<DB::IColumn> DB::ColumnUnique<DB::ColumnString>::uniqueInsertRangeImpl<char8_t>() obj-x86_64-linux-gnu/../src/Columns/ColumnUnique.h:570:23
7 0x22cb9c66 in COW<DB::IColumn>::mutable_ptr<DB::IColumn> DB::ColumnUnique<DB::ColumnString>::uniqueInsertRangeFrom()::'lambda'(auto)::operator()<char8_t>(auto) const obj-x86_64-linux-gnu/../src/Columns/ColumnUnique.h:592:26
8 0x22cb9c66 in DB::ColumnUnique<DB::ColumnString>::uniqueInsertRangeFrom() obj-x86_64-linux-gnu/../src/Columns/ColumnUnique.h:600:28
9 0x2500b897 in DB::ColumnLowCardinality::insertRangeFrom() obj-x86_64-linux-gnu/../src/Columns/ColumnLowCardinality.cpp:205:62
10 0x25a182f4 in DB::appendBlock(DB::Block const&, DB::Block&) obj-x86_64-linux-gnu/../src/Storages/StorageBuffer.cpp:470:23
11 0x25a182f4 in DB::BufferSink::insertIntoBuffer(DB::Block const&, DB::StorageBuffer::Buffer&) obj-x86_64-linux-gnu/../src/Storages/StorageBuffer.cpp:634:9
12 0x25a173cc in DB::BufferSink::consume(DB::Chunk) obj-x86_64-linux-gnu/../src/Storages/StorageBuffer.cpp:595:9
13 0x26d1c997 in DB::SinkToStorage::transform(DB::Chunk&) obj-x86_64-linux-gnu/../src/Processors/Sinks/SinkToStorage.cpp:18:5
SUMMARY: AddressSanitizer: heap-use-after-free obj-x86_64-linux-gnu/../src/Columns/ReverseIndex.h in DB::ReverseIndex<unsigned long, DB::ColumnString>::insert(StringRef const&)
Shadow bytes around the buggy address:
0x0c08800bab50: fa fa fd fd fd fd fd fa fa fa 00 00 00 00 00 00
0x0c08800bab60: fa fa fd fd fd fd fd fa fa fa 00 00 00 00 00 fa
0x0c08800bab70: fa fa 00 00 00 00 00 00 fa fa 00 00 00 00 00 00
0x0c08800bab80: fa fa fd fd fd fd fd fa fa fa fd fd fd fd fd fa
0x0c08800bab90: fa fa fd fd fd fd fd fa fa fa fd fd fd fd fd fd
=>0x0c08800baba0: fa fa fd fd[fd]fd fd fa fa fa fd fd fd fd fd fa
0x0c08800babb0: fa fa fd fd fd fd fd fd fa fa fd fd fd fd fd fa
0x0c08800babc0: fa fa fd fd fd fd fd fa fa fa fd fd fd fd fd fd
0x0c08800babd0: fa fa fd fd fd fd fd fd fa fa fd fd fd fd fd fa
0x0c08800babe0: fa fa fd fd fd fd fd fa fa fa fd fd fd fd fd fd
0x0c08800babf0: fa fa fd fd fd fd fd fd fa fa fd fd fd fd fd fd
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
Left alloca redzone: ca
Right alloca redzone: cb
==24679==ABORTING
</details>
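The underlying pattern is a lazily rebuilt cache shared between a reader (GROUP BY) and a writer (the dictionary update). A minimal sketch of the hazard and one conventional remedy follows; SavedHashCache is an illustration, not the actual ReverseIndex code:

```cpp
#include <cstdint>
#include <iostream>
#include <memory>
#include <mutex>
#include <vector>

/// Illustrative stand-in for a lazily computed hash column shared between a
/// GROUP BY reader and a dictionary writer; not the actual ReverseIndex code.
class SavedHashCache
{
public:
    /// Rebuilding the cache replaces the column. Without synchronization a
    /// reader that grabbed a raw pointer to the old column keeps using freed
    /// memory once the last owner drops it -- the heap-use-after-free above.
    void rebuild(size_t size)
    {
        auto fresh = std::make_shared<std::vector<std::uint64_t>>(size, 0);
        std::lock_guard lock(mutex);
        hashes = std::move(fresh);  // old column dies when readers release it
    }

    /// Readers take shared ownership under the lock, so the column they see
    /// stays alive even if a concurrent rebuild swaps in a new one.
    std::shared_ptr<const std::vector<std::uint64_t>> get() const
    {
        std::lock_guard lock(mutex);
        return hashes;
    }

private:
    mutable std::mutex mutex;
    std::shared_ptr<std::vector<std::uint64_t>> hashes
        = std::make_shared<std::vector<std::uint64_t>>();
};

int main()
{
    SavedHashCache cache;
    auto snapshot = cache.get();  // reader holds its own reference
    cache.rebuild(8);             // concurrent "dictionary update"
    std::cout << "reader still sees " << snapshot->size() << " hashes\n";
}
```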
This is required to support queries (like `foo IS NULL`) against
external databases (e.g. MySQL).
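For illustration, a sketch of the kind of rewrite this enables when building the query sent to the external database; the predicate types and toExternalSQL here are hypothetical, not ClickHouse's actual query transformation code:

```cpp
#include <iostream>
#include <string>
#include <variant>

/// Hypothetical predicate representation; the real AST is far richer.
struct IsNull { std::string column; };
struct Equals { std::string column; std::string literal; };
using Predicate = std::variant<IsNull, Equals>;

/// Render a predicate as the WHERE clause sent to the external database.
/// `foo IS NULL` must be emitted as-is: there is no `foo = NULL` equivalent,
/// since `= NULL` evaluates to NULL rather than TRUE in SQL.
std::string toExternalSQL(const Predicate & pred)
{
    if (const auto * is_null = std::get_if<IsNull>(&pred))
        return is_null->column + " IS NULL";
    const auto & eq = std::get<Equals>(pred);
    return eq.column + " = '" + eq.literal + "'";
}

int main()
{
    std::cout << "SELECT * FROM t WHERE " << toExternalSQL(IsNull{"foo"}) << '\n';
}
```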
This patch is submitted separately, in order to:
- avoid mixing patches
- run CI (and update test references)
Right now it is possible to get the following error:
Having 20 errors! 0 tests passed. 0 tests skipped. 57.37 s elapsed (MainProcess).
Won't run stateful tests because test data wasn't loaded.
Traceback (most recent call last):
File "/usr/lib/python3.9/multiprocessing/managers.py", line 802, in _callmethod
conn = self._tls.connection
AttributeError: 'ForkAwareLocal' object has no attribute 'connection'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/src/ch/clickhouse/.cmake/../tests/clickhouse-test", line 1462, in <module>
main(args)
File "/src/ch/clickhouse/.cmake/../tests/clickhouse-test", line 1261, in main
if len(restarted_tests) > 0:
File "<string>", line 2, in __len__
File "/usr/lib/python3.9/multiprocessing/managers.py", line 806, in _callmethod
self._connect()
File "/usr/lib/python3.9/multiprocessing/managers.py", line 793, in _connect
conn = self._Client(self._token.address, authkey=self._authkey)
File "/usr/lib/python3.9/multiprocessing/connection.py", line 507, in Client
c = SocketClient(address)
File "/usr/lib/python3.9/multiprocessing/connection.py", line 635, in SocketClient
s.connect(address)
ConnectionRefusedError: [Errno 111] Connection refused
The reason behind this is that the manager's process got terminated:
ipdb> p restarted_tests._manager._process
<ForkProcess name='SyncManager-1' pid=25125 parent=24939 stopped exitcode=-SIGTERM>
Refs: #29259 (cc: @vdimir)
Follow-up for: #29197 (cc: @tavplubix)
It failed a few times on CI [1], and it seems that 200ms is not enough:
2021.09.26 02:43:16.063992 [ 706 ] {d5144a51-f217-48f9-8630-29492f81aae5} <Trace> ContextAccess (default): Access granted: INSERT(id, s) ON test_fptde5.async_inserts
2021.09.26 02:43:16.069188 [ 706 ] {d5144a51-f217-48f9-8630-29492f81aae5} <Trace> AsynchronousInsertQueue: Have 1 pending inserts with total 11 bytes of data for query 'INSERT INTO test_fptde5.async_inserts FORMAT CSV'
2021.09.26 02:43:16.209661 [ 1940 ] {} <Trace> DynamicQueryHandler: Request URI: /?database=test_fptde5&log_comment=/usr/share/clickhouse-test/queries/0_stateless/02015_async_inserts_2.sh&async_insert=1&wait_for_async_insert=1
2021.09.26 02:43:16.340496 [ 1938 ] {e01d640f-90a8-4306-808f-d732fb144a11} <Trace> ContextAccess (default): Access granted: INSERT(id, s) ON test_fptde5.async_inserts
2021.09.26 02:43:16.401966 [ 1938 ] {e01d640f-90a8-4306-808f-d732fb144a11} <Trace> AsynchronousInsertQueue: Have 1 pending inserts with total 11 bytes of data for query 'INSERT INTO test_fptde5.async_inserts FORMAT CSV'
2021.09.26 02:43:16.667513 [ 1940 ] {5c059e03-397f-4f9a-95ca-0fcb3c479235} <Trace> ContextAccess (default): Access granted: INSERT(id, s) ON test_fptde5.async_inserts
2021.09.26 02:43:16.707577 [ 1392 ] {} <Error> AsynchronousInsertQueue: Failed parsing for query 'INSERT INTO test_fptde5.async_inserts FORMAT CSV' with query id d5144a51-f217-48f9-8630-29492f81aae5. DB::ParsingException: Cannot parse input: expected ','
before: 'qqqqqqqqqqq':
2021.09.26 02:43:16.789697 [ 1392 ] {} <Information> AsynchronousInsertQueue: Flushed 0 rows, 0 bytes for query 'INSERT INTO test_fptde5.async_inserts FORMAT CSV'
2021.09.26 02:43:16.872731 [ 1940 ] {5c059e03-397f-4f9a-95ca-0fcb3c479235} <Trace> AsynchronousInsertQueue: Have 1 pending inserts with total 11 bytes of data for query 'INSERT INTO test_fptde5.async_inserts FORMAT CSV'
2021.09.26 02:43:17.472223 [ 1392 ] {} <Trace> test_fptde5.async_inserts (b877377f-9706-488b-b877-377f9706f88b): Renaming temporary part tmp_insert_all_1_1_0 to all_1_1_0.
2021.09.26 02:43:17.527828 [ 1392 ] {} <Information> AsynchronousInsertQueue: Flushed 2 rows, 28 bytes for query 'INSERT INTO test_fptde5.async_inserts FORMAT CSV'
2021.09.26 02:43:18.344135 [ 1859 ] {} <Trace> test_fptde5.async_inserts (b877377f-9706-488b-b877-377f9706f88b): Renaming temporary part tmp_insert_all_2_2_0 to all_2_2_0.
2021.09.26 02:43:18.373230 [ 1859 ] {} <Information> AsynchronousInsertQueue: Flushed 2 rows, 28 bytes for query 'INSERT INTO test_fptde5.async_inserts FORMAT CSV'
[1]: https://clickhouse-test-reports.s3.yandex.net/29364/fb584715e175fef3a4c4373d1bbcebecab1db6d8/functional_stateless_tests_(thread).html#fail1
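As a toy model (not the actual AsynchronousInsertQueue), the sketch below shows why a tight 200ms window is flaky on a loaded CI machine: two inserts that arrive more than 200ms apart land in separate batches, while a larger timeout keeps them together:

```cpp
#include <chrono>
#include <iostream>
#include <string>
#include <vector>

using Clock = std::chrono::steady_clock;

/// Toy model of time-based insert batching: entries accumulate until
/// busy_timeout has elapsed since the first pending insert, then flush.
struct BatchingQueue
{
    std::chrono::milliseconds busy_timeout;
    std::vector<std::string> pending;
    Clock::time_point first_arrival;

    void insert(const std::string & data, Clock::time_point now)
    {
        /// If the busy timeout already expired, the background flusher would
        /// have written the pending batch out before this insert arrived.
        if (!pending.empty() && now - first_arrival >= busy_timeout)
            flush();
        if (pending.empty())
            first_arrival = now;
        pending.push_back(data);
    }

    void flush()
    {
        std::cout << "flushed " << pending.size() << " inserts\n";
        pending.clear();
    }
};

int main()
{
    using namespace std::chrono_literals;
    const auto t0 = Clock::now();

    BatchingQueue tight{200ms};
    tight.insert("a", t0);
    tight.insert("b", t0 + 300ms);   // outside the 200ms window: new batch
    tight.flush();                   // -> two batches of one insert each

    BatchingQueue relaxed{1000ms};
    relaxed.insert("a", t0);
    relaxed.insert("b", t0 + 300ms); // still inside the window: same batch
    relaxed.flush();                 // -> one batch of two inserts
}
```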