* initial commit: add setting and stub
* typo
* added test stub
* fix
* wip merging new integration test and code proto
* adding steps interpreters
* adding firstly proposed solution (moving parts etc)
* added checking zookeeper path existence
* fixing the include
* fixing and sorting includes
* fixing outdated struct
* fix the name
* added ast ptr as level of indirection
* fix ref
* updating the changes
* working on test stub
* fix iterator -> reference
* revert rocksdb submodule update
* fixed show privileges test
* updated the test stub
* replaced rand() with thread_local_rng(), updated the tests
updated the test
fixed test config path
test fix
removed error messages
fixed the test
updated the test
fixed string literal
fixed literal
typo: =
* fixed the empty replica error message
* updated the test and the code with logs
* updated the possible test cases, updated
* added the code/test milestone comments
* updated the test (added more testcases)
* replaced native assert with CH one
* individual replicas recursive delete fix
* updated the AS db.name AST
* two small logging fixes
* manually generated AST fixes
* Updated the test, added the possible algo change
* Some thoughts about optimizing the solution:
ALTER MOVE PARTITION .. TO TABLE -> move to detached/ + ALTER ... ATTACH
* fix
* Removed the replica sync in test as it's invalid
* Some test tweaks
* tmp
* Rewrote the algo by using the executeQuery instead of
hand-crafting the ASTPtr.
Two questions still active.
* tr: logging active parts
* Extracted the parts moving algo into a separate helper function
* Fixed the test data and the queries slightly
* Replaced query to system.parts to direct invocation,
started building the test that breaks on various parts.
* Added the case for tables when at least one replica is alive
* Updated the test to test replicas restoration by detaching/attaching
* Altered the test to check restoration without replica restart
* Added the tables swap in the start if the server failed last time
* Hotfix when only /replicas/replica... path was deleted
* Restore ZK paths while creating a replicated MergeTree table
* Updated the docs, fixed the algo for individual replicas restoration case
* Initial parts table storage fix, tests sync fix
* Reverted individual replica restoration to general algo
* Slightly optimised getDataParts
* Trying another solution with parts detaching
* Rewrote algo without any steps, added ON CLUSTER support
* Attaching parts from other replica on restoration
* Getting part checksums from ZK
* Removed ON CLUSTER, finished working solution
* Multiple small changes after review
* Fixing parallel test
* Supporting rewritten form on cluster
* Test fix
* Moar logging
* Using source replica as checksum provider
* improve test, remove some code from parser
* Trying solution with move to detached + forget
* Moving all parts (not only Committed) to detached
* Edited docs for RESTORE REPLICA
* Re-merging
* minor fixes
Co-authored-by: Alexander Tokmakov <avtokmakov@yandex-team.ru>
ASAN report:
=================================================================
==7686==ERROR: AddressSanitizer: container-overflow on address 0x6200000bf080 at pc 0x00002a787e79 bp 0x7fffffffa2f0 sp 0x7fffffffa2e8
READ of size 4 at 0x6200000bf080 thread T0
0 0x2a787e78 in replxx::calculate_displayed_length(char32_t const*, int) obj-x86_64-linux-gnu/../contrib/replxx/src/util.cxx:66:15
1 0x2a75786c in replxx::Replxx::ReplxxImpl::dynamicRefresh(replxx::Prompt&, char32_t*, int, int) obj-x86_64-linux-gnu/../contrib/replxx/src/replxx_impl.cxx:2201:3
2 0x2a7453f0 in replxx::Replxx::ReplxxImpl::incremental_history_search(char32_t) obj-x86_64-linux-gnu/../contrib/replxx/src/replxx_impl.cxx:2008:3
3 0x2a73eecc in replxx::Replxx::ReplxxImpl::action(unsigned long long, replxx::Replxx::ACTION_RESULT (replxx::Replxx::ReplxxImpl::* const&)(char32_t), char32_t) obj-x86_64-linux-gnu/../contrib/replxx/src/replxx_impl.cxx:1246:29
4 0x2a73eecc in replxx::Replxx::ReplxxImpl::invoke(replxx::Replxx::ACTION, char32_t) obj-x86_64-linux-gnu/../contrib/replxx/src/replxx_impl.cxx:318:70
5 0x2a74ed29 in std::__1::__function::__policy_func<replxx::Replxx::ACTION_RESULT (char32_t)>::operator()(char32_t&&) const obj-x86_64-linux-gnu/../contrib/libcxx/include/functional:2221:16
6 0x2a74ed29 in std::__1::function<replxx::Replxx::ACTION_RESULT (char32_t)>::operator()(char32_t) const obj-x86_64-linux-gnu/../contrib/libcxx/include/functional:2560:12
7 0x2a74ed29 in replxx::Replxx::ReplxxImpl::get_input_line() obj-x86_64-linux-gnu/../contrib/replxx/src/replxx_impl.cxx🔢11
8 0x2a74dd3c in replxx::Replxx::ReplxxImpl::input(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) obj-x86_64-linux-gnu/../contrib/replxx/src/replxx_impl.cxx:580:8
9 0x2a2a4075 in ReplxxLineReader::readOneLine(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) obj-x86_64-linux-gnu/../base/common/ReplxxLineReader.cpp:112:29
10 0x2a29b499 in LineReader::readLine(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) obj-x86_64-linux-gnu/../base/common/LineReader.cpp:81:26
11 0xb580f02 in DB::Client::mainImpl() obj-x86_64-linux-gnu/../programs/client/Client.cpp:665:33
12 0xb575825 in DB::Client::main(std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > const&) obj-x86_64-linux-gnu/../programs/client/Client.cpp:300:20
13 0x2a3aff25 in Poco::Util::Application::run() obj-x86_64-linux-gnu/../contrib/poco/Util/src/Application.cpp:334:8
14 0xb54c810 in mainEntryClickHouseClient(int, char**) obj-x86_64-linux-gnu/../programs/client/Client.cpp:2702:23
15 0xb326d8a in main obj-x86_64-linux-gnu/../programs/main.cpp:360:12
16 0x7ffff7dcbb24 in __libc_start_main (/usr/lib/libc.so.6+0x27b24)
17 0xb2794ad in _start (/src/ch/tmp/upstream/clickhouse-asan+0xb2794ad)
0x6200000bf080 is located 0 bytes inside of 3672-byte region [0x6200000bf080,0x6200000bfed8)
allocated by thread T0 here:
0 0xb3231dd in operator new(unsigned long) (/src/ch/tmp/upstream/clickhouse-asan+0xb3231dd)
1 0x2a75fb15 in void* std::__1::__libcpp_operator_new<unsigned long>(unsigned long) obj-x86_64-linux-gnu/../contrib/libcxx/include/new:235:10
2 0x2a75fb15 in std::__1::__libcpp_allocate(unsigned long, unsigned long) obj-x86_64-linux-gnu/../contrib/libcxx/include/new:261:10
3 0x2a75fb15 in std::__1::allocator<char32_t>::allocate(unsigned long) obj-x86_64-linux-gnu/../contrib/libcxx/include/memory:840:38
4 0x2a75fb15 in std::__1::allocator_traits<std::__1::allocator<char32_t> >::allocate(std::__1::allocator<char32_t>&, unsigned long) obj-x86_64-linux-gnu/../contrib/libcxx/include/__memory/allocator_traits.h:468:21
5 0x2a75fb15 in std::__1::vector<char32_t, std::__1::allocator<char32_t> >::__vallocate(unsigned long) obj-x86_64-linux-gnu/../contrib/libcxx/include/vector:993:37
6 0x2a75fb15 in std::__1::enable_if<(__is_cpp17_forward_iterator<char32_t*>::value) && (is_constructible<char32_t, std::__1::iterator_traits<char32_t*>::reference>::value), void>::type std::__1::vector<char32_t, std::__1::allocator<char32_t> >::assign<char32_t*>(char32_t*, char32_t*) obj-x86_64-linux-gnu/../contrib/libcxx/include/vector:1460:9
7 0x2a745242 in std::__1::vector<char32_t, std::__1::allocator<char32_t> >::operator=(std::__1::vector<char32_t, std::__1::allocator<char32_t> > const&) obj-x86_64-linux-gnu/../contrib/libcxx/include/vector:1405:9
8 0x2a745242 in replxx::UnicodeString::assign(replxx::UnicodeString const&) obj-x86_64-linux-gnu/../contrib/replxx/src/unicodestring.hxx:83:9
9 0x2a745242 in replxx::Replxx::ReplxxImpl::incremental_history_search(char32_t) obj-x86_64-linux-gnu/../contrib/replxx/src/replxx_impl.cxx:1993:24
10 0x2a73eecc in replxx::Replxx::ReplxxImpl::action(unsigned long long, replxx::Replxx::ACTION_RESULT (replxx::Replxx::ReplxxImpl::* const&)(char32_t), char32_t) obj-x86_64-linux-gnu/../contrib/replxx/src/replxx_impl.cxx:1246:29
11 0x2a73eecc in replxx::Replxx::ReplxxImpl::invoke(replxx::Replxx::ACTION, char32_t) obj-x86_64-linux-gnu/../contrib/replxx/src/replxx_impl.cxx:318:70
12 0x2a74ed29 in std::__1::__function::__policy_func<replxx::Replxx::ACTION_RESULT (char32_t)>::operator()(char32_t&&) const obj-x86_64-linux-gnu/../contrib/libcxx/include/functional:2221:16
13 0x2a74ed29 in std::__1::function<replxx::Replxx::ACTION_RESULT (char32_t)>::operator()(char32_t) const obj-x86_64-linux-gnu/../contrib/libcxx/include/functional:2560:12
14 0x2a74ed29 in replxx::Replxx::ReplxxImpl::get_input_line() obj-x86_64-linux-gnu/../contrib/replxx/src/replxx_impl.cxx🔢11
15 0x2a74dd3c in replxx::Replxx::ReplxxImpl::input(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) obj-x86_64-linux-gnu/../contrib/replxx/src/replxx_impl.cxx:580:8
16 0x2a2a4075 in ReplxxLineReader::readOneLine(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) obj-x86_64-linux-gnu/../base/common/ReplxxLineReader.cpp:112:29
17 0x2a29b499 in LineReader::readLine(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) obj-x86_64-linux-gnu/../base/common/LineReader.cpp:81:26
18 0xb580f02 in DB::Client::mainImpl() obj-x86_64-linux-gnu/../programs/client/Client.cpp:665:33
19 0xb575825 in DB::Client::main(std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > const&) obj-x86_64-linux-gnu/../programs/client/Client.cpp:300:20
20 0x2a3aff25 in Poco::Util::Application::run() obj-x86_64-linux-gnu/../contrib/poco/Util/src/Application.cpp:334:8
21 0xb54c810 in mainEntryClickHouseClient(int, char**) obj-x86_64-linux-gnu/../programs/client/Client.cpp:2702:23
22 0xb326d8a in main obj-x86_64-linux-gnu/../programs/main.cpp:360:12
23 0x7ffff7dcbb24 in __libc_start_main (/usr/lib/libc.so.6+0x27b24)
HINT: if you don't care about these errors you may set ASAN_OPTIONS=detect_container_overflow=0.
If you suspect a false positive see also: https://github.com/google/sanitizers/wiki/AddressSanitizerContainerOverflow.
SUMMARY: AddressSanitizer: container-overflow obj-x86_64-linux-gnu/../contrib/replxx/src/util.cxx:66:15 in replxx::calculate_displayed_length(char32_t const*, int)
Shadow bytes around the buggy address:
0x0c408000fdc0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
0x0c408000fdd0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c408000fde0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c408000fdf0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c408000fe00: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
=>0x0c408000fe10:[fc]fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
0x0c408000fe20: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
0x0c408000fe30: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
0x0c408000fe40: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
0x0c408000fe50: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
0x0c408000fe60: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
Left alloca redzone: ca
Right alloca redzone: cb
Shadow gap: cc
==7686==ABORTING
Refs: https://github.com/ClickHouse-Extras/replxx/pull/16
v2: fix test, do not use /dev/null since it client will lock it
This way the remote nodes will not need to send all the rows, so this
will decrease network io and also this will make queries w/
optimize_aggregation_in_order=1/LIMIT X and w/o ORDER BY faster since it
initiator will not need to read all the rows, only first X (but note
that for this you need to your data to be sharded correctly or you may
get inaccurate results).
Note, that having lots of processing stages will increase the complexity
of interpreter (it is already not that clean and simple right now).
Although using separate QueryProcessingStage looks pretty natural.
Another option is to make WithMergeableStateAfterAggregation always, but
in this case you will not be able to disable only this optimization,
i.e. if there will be some issue with it.
v2: fix OFFSET
v3: convert 01814_distributed_push_down_limit test to .sh and add retries
v4: add test with OFFSET
v5: add new query stage into the bash completion
v6/tests: use LIMIT O,L syntax over LIMIT L OFFSET O since it is broken in ANTLR parser
https://clickhouse-test-reports.s3.yandex.net/23027/a18a06399b7aeacba7c50b5d1e981ada5df19745/functional_stateless_tests_(antlr_debug).html#fail1
v7/tests: set use_hedged_requests to 0, to avoid excessive log entries on retries
https://clickhouse-test-reports.s3.yandex.net/23027/a18a06399b7aeacba7c50b5d1e981ada5df19745/functional_stateless_tests_flaky_check_(address).html#fail1
Before this patch KILL MUTATION marks mutation as canceled just by name
(and part numbers) so if you have multiple tables with the same part
name, then killing mutation for one table, will mark it as killed for
another too.
Fix this by comparing StorageID too (it is better to use StorageID over
database/table to avoid ambiguity by using UUIDs for comparing).
Here is a failure of the 01414_freeze_does_not_prevent_alters on CI [1].
[1]: https://clickhouse-test-reports.s3.yandex.net/24069/9fb69dcf98c71a939d200cad3c8491bf43a44622/functional_stateless_tests_(ubsan).html#fail1
Will fix possible issues in CI [1].
2021-06-07 03:42:56 01520_client_print_query_id: [ FAIL ] 8.24 sec. - return code 1
2021-06-07 03:42:56 , result:
And in server logs:
2021.06.07 03:42:48.732289 [ 18471 ] {7df001e8-f9bb-488f-9fe1-6dc538f24808} <Debug> executeQuery: (from [::1]:45680, using production parser) (comment: /usr/share/clickhouse-test/queries/0_stateless/01520_client_print_query_id.expect) CREATE DATABASE test
...
2021.06.07 03:42:52.048064 [ 18703 ] {8526892a-f99a-4949-9f7d-3c8e82969439} <Debug> executeQuery: (from [::1]:45702, using production parser) SELECT DISTINCT arrayJoin(extractAll(name, '[\\w_]{2,}')) AS res FROM (SELECT name FROM system.functions UNION ALL SELECT name FROM system.table_engines UNION ALL SELECT name FROM system.formats UNION ALL SELECT name FROM system.table_functions UNION ALL SELECT name FROM system.data_type_families UNION ALL SELECT name FROM system.merge_tree_settings UNION ALL SELECT name FROM system.settings UNION ALL SELECT cluster FROM system.clusters UNION ALL SELECT macro FROM system.macros UNION ALL SELECT policy_name FROM system.storage_policies UNION ALL SELECT concat(func.name, comb.name) FROM system.functions AS func CROSS JOIN system.aggregate_function_combinators AS comb WHERE is_aggregate UNION ALL SELECT name FROM system.databases LIMIT 10000 UNION ALL SELECT DISTINCT name FROM system.tables LIMIT 10000 UNION ALL SELECT DISTINCT name FROM system.dictionaries LIMIT 10000 UNION ALL SELECT DISTINCT name FROM system.columns LIMIT 10000) WHERE notEmpty(res)
...
2021.06.07 03:42:54.523976 [ 5575 ] {d97df847-0feb-4e8d-92bd-3236e1f7f711} <Debug> executeQuery: (from [::1]:45710, using production parser) (comment: /usr/share/clickhouse-test/queries/0_stateless/01520_client_print_query_id.expect) DROP DATABASE test_3bce13
...
2021.06.07 03:42:54.722391 [ 18703 ] {8526892a-f99a-4949-9f7d-3c8e82969439} <Error> executeQuery: Code: 210, e.displayText() = DB::NetException: I/O error: Broken pipe, while writing to socket ([::1]:45702) (version 21.7.1.7071) (from [::1]:45702) (in query: SELECT DISTINCT arrayJoin(extractAll(name, '[\\w_]{2,}')) AS res FROM (SELECT name FROM system.functions UNION ALL SELECT name FROM system.table_engines UNION ALL SELECT name FROM system.formats UNION ALL SELECT name FROM system.table_functions UNION ALL SELECT name FROM system.data_type_families UNION ALL SELECT name FROM system.merge_tree_settings UNION ALL SELECT name FROM system.settings UNION ALL SELECT cluster FROM system.clusters UNION ALL SELECT macro FROM system.macros UNION ALL SELECT policy_name FROM system.storage_policies UNION ALL SELECT concat(func.name, comb.name) FROM system.functions AS func CROSS JOIN system.aggregate_function_combinators AS comb WHERE is_aggregate UNION ALL SELECT name FROM system.databases LIMIT 10000 UNION ALL SELECT DISTINCT name FROM system.tables LIMIT 10000 UNION ALL SELECT DISTINCT name FROM system.dictionaries LIMIT 10000 UNION ALL SELECT DISTINCT name FROM system.columns LIMIT 10000) WHERE notEmpty(res)), Stack trace (when copying this message, always include the lines below):
[1]: https://clickhouse-test-reports.s3.yandex.net/24069/263c06137cda16213c14ca60a89cfdd7fb511c81/functional_stateless_tests_(thread).html
After #24483 had been merged the only place where the allocation may
fail is the insert into PODArray in DB::OwnSplitChannel::log, but this
MR ignores the errors in this function, so to check new behaviour
separate server is required.
Before this patch clickhouse-client interprets the whole queries and if
"{ echo }" found, it starts echoing queries, but this will make it
impossible to skip some of lines.