When I tried to add cool new clang-tidy 14 warnings, I noticed that the
current clang-tidy settings already produce a ton of warnings. This
commit addresses many of these. Almost all of them were non-critical,
i.e. C vs. C++ style casts.
ASan founds [1]:
==553==
ERROR: AddressSanitizer: heap-use-after-free on address 0x61e004694080 at pc 0x000029150af2 bp 0x7f70b3f8ef10 sp 0x7f70b3f8ef08
READ of size 8 at 0x61e004694080 thread T477 (QueryPipelineEx)
0 0x29150af1 in DB::MergeTreeDataWriter::writeTempPart() >
1 0x293b8e43 in DB::MergeTreeSink::consume(DB::Chunk) obj-x86_64-linux-gnu/../src/Storages/MergeTree/MergeTreeSink.cpp:27:65
2 0x29dac73b in DB::SinkToStorage::onConsume(DB::Chunk) obj-x86_64-linux-gnu/../src/Processors/Sinks/SinkToStorage.cpp:18:5
3 0x29c72dd2 in DB::ExceptionKeepingTransform::work()::$_1::operator()() const obj-x86_64-linux-gnu/../src/Processors/Transforms/ExceptionKeepingTransform.cpp:151:51
0x61e004694080 is located 2048 bytes inside of 2480-byte region [0x61e004693880,0x61e004694230)
freed by thread T199 (BgSchPool) here:
...
4 0x26220f20 in DB::DatabaseCatalog::TableMarkedAsDropped::~TableMarkedAsDropped() obj-x86_64-linux-gnu/../src/Interpreters/DatabaseCatalog.h:248:12
5 0x26220f20 in DB::DatabaseCatalog::dropTableDataTask() obj-x86_64-linux-gnu/../src/Interpreters/DatabaseCatalog.cpp:908:1
[1]: https://s3.amazonaws.com/clickhouse-test-reports/33201/4f04d6af61eabf4899eb8188150dc862aaab80fc/stress_test__address__actions_.html
There was a fix in #32572, but it was not complete (yes it reduced the
race window a lot, but not completely), since the inner table still can
go away after the INSERT chain was built, to fix this obtain the
reference earlier.
Follow-up for: #32572 (cc @tavplubix)
Stress tests founds [1]:
==527==WARNING: MemorySanitizer: use-of-uninitialized-value
0 0x37078ffd in unsigned long std::__1::__cxx_atomic_fetch_add<unsigned long>(std::__1::__cxx_atomic_base_impl<unsigned long>*, unsigned long, std::__1::memory_order) obj-x86_64-linux-gnu/../contrib/libcxx/include/atomic:1050:12
1 0x37078ffd in std::__1::__atomic_base<unsigned long, true>::fetch_add(unsigned long, std::__1::memory_order) obj-x86_64-linux-gnu/../contrib/libcxx/include/atomic:1719:17
2 0x37078ffd in std::__1::__atomic_base<unsigned long, true>::operator++() obj-x86_64-linux-gnu/../contrib/libcxx/include/atomic:1756:57
3 0x37078ffd in SimpleIncrement::get() obj-x86_64-linux-gnu/../src/Common/SimpleIncrement.h:20:16
4 0x37078ffd in DB::MergeTreeDataWriter::writeTempPart(DB::BlockWithPartition&, std::__1::shared_ptr<DB::StorageInMemoryMetadata const> const&, std::__1::shared_ptr<DB::Context const>) obj-x86_64-linux-gnu/../src/Storages/MergeTree/MergeTreeDataWriter.cpp:276:46
5 0x373c446c in DB::MergeTreeSink::consume(DB::Chunk) obj-x86_64-linux-gnu/../src/Storages/MergeTree/MergeTreeSink.cpp:27:65
Uninitialized value was created by a heap deallocation
6 0x32d481e8 in DB::DatabaseCatalog::TableMarkedAsDropped::~TableMarkedAsDropped() obj-x86_64-linux-gnu/../src/Interpreters/DatabaseCatalog.h:248:12
7 0x32d3c134 in DB::DatabaseCatalog::dropTableDataTask() obj-x86_64-linux-gnu/../src/Interpreters/DatabaseCatalog.cpp:908:1
[1]: https://s3.amazonaws.com/clickhouse-test-reports/32534/fa6090f588dbf4cbb5f28bd2210847b070bb8218/stress_test__memory__actions_.html
The query was CREATE MATERIALIZED VIEW ... POPULATE AS SELECT ... from
00040_aggregating_materialized_view test.
* initial commit: add setting and stub
* typo
* added test stub
* fix
* wip merging new integration test and code proto
* adding steps interpreters
* adding firstly proposed solution (moving parts etc)
* added checking zookeeper path existence
* fixing the include
* fixing and sorting includes
* fixing outdated struct
* fix the name
* added ast ptr as level of indirection
* fix ref
* updating the changes
* working on test stub
* fix iterator -> reference
* revert rocksdb submodule update
* fixed show privileges test
* updated the test stub
* replaced rand() with thread_local_rng(), updated the tests
updated the test
fixed test config path
test fix
removed error messages
fixed the test
updated the test
fixed string literal
fixed literal
typo: =
* fixed the empty replica error message
* updated the test and the code with logs
* updated the possible test cases, updated
* added the code/test milestone comments
* updated the test (added more testcases)
* replaced native assert with CH one
* individual replicas recursive delete fix
* updated the AS db.name AST
* two small logging fixes
* manually generated AST fixes
* Updated the test, added the possible algo change
* Some thoughts about optimizing the solution:
ALTER MOVE PARTITION .. TO TABLE -> move to detached/ + ALTER ... ATTACH
* fix
* Removed the replica sync in test as it's invalid
* Some test tweaks
* tmp
* Rewrote the algo by using the executeQuery instead of
hand-crafting the ASTPtr.
Two questions still active.
* tr: logging active parts
* Extracted the parts moving algo into a separate helper function
* Fixed the test data and the queries slightly
* Replaced query to system.parts to direct invocation,
started building the test that breaks on various parts.
* Added the case for tables when at least one replica is alive
* Updated the test to test replicas restoration by detaching/attaching
* Altered the test to check restoration without replica restart
* Added the tables swap in the start if the server failed last time
* Hotfix when only /replicas/replica... path was deleted
* Restore ZK paths while creating a replicated MergeTree table
* Updated the docs, fixed the algo for individual replicas restoration case
* Initial parts table storage fix, tests sync fix
* Reverted individual replica restoration to general algo
* Slightly optimised getDataParts
* Trying another solution with parts detaching
* Rewrote algo without any steps, added ON CLUSTER support
* Attaching parts from other replica on restoration
* Getting part checksums from ZK
* Removed ON CLUSTER, finished working solution
* Multiple small changes after review
* Fixing parallel test
* Supporting rewritten form on cluster
* Test fix
* Moar logging
* Using source replica as checksum provider
* improve test, remove some code from parser
* Trying solution with move to detached + forget
* Moving all parts (not only Committed) to detached
* Edited docs for RESTORE REPLICA
* Re-merging
* minor fixes
Co-authored-by: Alexander Tokmakov <avtokmakov@yandex-team.ru>