Commit Graph

649 Commits

Author SHA1 Message Date
Alexander Tokmakov
09ff66da0e fix a couple of bugs that may cause replicas to diverge 2021-08-18 12:50:46 +03:00
Maksim Kita
7fdf3cc263
Merge pull request #27180 from kitaisreal/storage-system-replicas-added-column-replica-is-active
Storage system replicas added column replica is active
2021-08-05 12:46:53 +03:00
Maksim Kita
3f48c85722 StorageSystemReplicas added column replica_is_active 2021-08-04 16:19:42 +03:00
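For reference, a minimal query sketch against the column this commit adds (the column name comes from the commit title; the rest is standard `system.replicas` usage):

```
-- Sketch: list replicas of each table together with the per-replica activity
-- map exposed by the new replica_is_active column (name taken from the commit above).
SELECT database, table, replica_name, replica_is_active
FROM system.replicas;
```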
Alexander Tokmakov
42a8bb6872 fix assertions in Replicated database 2021-08-02 16:19:11 +03:00
alesapin
71169d7937
Merge pull request #26716 from nvartolomei/nv/part-cleanup-sequence
Avoid deleting old parts from FS on shutdown for replicated engine
2021-07-28 18:29:37 +03:00
Nikolai Kochetov
61d8f880cd Rename some files. 2021-07-26 19:48:25 +03:00
Nikolai Kochetov
9c92f43359 Update storages. 2021-07-23 22:33:59 +03:00
Nicolae Vartolomei
f35e6eee19 Avoid deleting old parts from FS on shutdown for replicated engine
This behavior was introduced in https://github.com/ClickHouse/ClickHouse/pull/8602.
The idea was to avoid data re-appearing in ClickHouse after DROP/DETACH
PARTITION. That problem was only present in the plain MergeTree engine, and I
don't understand why we need to do the same in ReplicatedMergeTree.

For ReplicatedMergeTree the source of truth is stored in ZK; deleting
things from the filesystem just introduces inconsistencies, and this is the
main source of errors like "No active replica has part X or covering
part".

The resulting problem is fixed by
https://github.com/ClickHouse/ClickHouse/pull/25820, but in my opinion
we would be better off not introducing the ZK/FS inconsistency in the first
place.

When does this inconsistency appear? Often the sequence is like this:

0. Write 2 parts to ZK [all_0_0_0, all_1_1_0].
1. A merge gets scheduled.
2. The new part replaces the old parts [new: all_0_1_1, old: all_0_0_0, all_1_1_0].
3. The replica gets shut down and the old parts are removed from the filesystem.
4. The replica comes back online; metadata about all parts is still stored in ZK for this replica.
5. The other replica, after its cleanup thread runs, will have only [all_0_1_1] in
   ZK.
6. The user triggers a DROP_RANGE after a while (the drop range is for all_0_1_9999*).
7. Each replica deletes only [all_0_1_1] from ZK. The replica that got
   restarted uses its in-memory state to choose which nodes to delete from ZK.
8. Restart the replica again. It will now think that there are 2 parts
   that it lost and needs to fetch [all_0_0_0, all_1_1_0].

`clearOldPartsAndRemoveFromZK`, which is triggered from the cleanup thread,
runs the cleanup sequence correctly: it first removes things from ZK and
then from the filesystem. I don't see much benefit in triggering it on
shutdown as well and would rather have it called from a single place only.

---

This is a very rare edge case, but it proves that the current
"fix" (https://github.com/ClickHouse/ClickHouse/pull/25820) isn't
complete.

```
create table test(
    v UInt64
)
engine=ReplicatedMergeTree('/clickhouse/test', 'one')
order by v
settings old_parts_lifetime = 30;

create table test2(
    v UInt64
)
engine=ReplicatedMergeTree('/clickhouse/test', 'two')
order by v
settings old_parts_lifetime = 30;

create table test3(
    v UInt64
)
engine=ReplicatedMergeTree('/clickhouse/test', 'three')
order by v
settings old_parts_lifetime = 30;

insert into table test values (1), (2), (3);
insert into table test values (4);

optimize table test final;

detach table test;
detach table test2;

alter table test3 drop partition tuple();

attach table test;
attach table test2;
```

```
(CONNECTED [localhost:9181]) /> ls /clickhouse/test/replicas/one/parts
all_0_0_0
all_1_1_0
(CONNECTED [localhost:9181]) /> ls /clickhouse/test/replicas/two/parts
all_0_0_0
all_1_1_0
(CONNECTED [localhost:9181]) /> ls /clickhouse/test/replicas/three/parts
```
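
The same listings can be obtained from ClickHouse itself via the `system.zookeeper` table (a sketch; the path matches the reproduction above):

```
-- Sketch: the SQL equivalent of the zkCli listings above, for replica 'one'.
SELECT name
FROM system.zookeeper
WHERE path = '/clickhouse/test/replicas/one/parts';
```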

```
detach table test;
attach table test;
```

`test` will now figure out that these parts exist only in ZK, first remove them
from ZK, and then issue `GET_PART` entries for them.

`test2` will receive a fetch request for unknown parts and will trigger part checks itself.
Because `test` no longer has the parts in ZK, `test2` will mark them as LostForever.
It will also not insert empty parts, because the partition is empty.

`test` is left with `GET_PART` entries in its queue and is stuck.

```
SELECT
    table,
    type,
    replica_name,
    new_part_name,
    last_exception
FROM system.replication_queue

Query id: 74c5aa00-048d-4bc1-a2ea-6f69501c11a0

Row 1:
──────
table:          test
type:           GET_PART
replica_name:   one
new_part_name:  all_0_0_0
last_exception: Code: 234. DB::Exception: No active replica has part all_0_0_0 or covering part. (NO_REPLICA_HAS_PART) (version 21.9.1.1)

Row 2:
──────
table:          test
type:           GET_PART
replica_name:   one
new_part_name:  all_1_1_0
last_exception: Code: 234. DB::Exception: No active replica has part all_1_1_0 or covering part. (NO_REPLICA_HAS_PART) (version 21.9.1.1)
```
2021-07-22 17:48:16 +01:00
tavplubix
41bb8acbb5
Update StorageReplicatedMergeTree.cpp 2021-07-14 20:05:50 +03:00
Filatenkov Artur
f1702d356e
Merge pull request #26120 from lthaooo/settings-merge_selecting_sleep_ms
add merge_selecting_sleep_ms setting
2021-07-14 18:28:47 +03:00
Zhichang Yu
b4e6689bf9 fix test_hdfs_zero_copy_replication_move[tiered_copy-2] 2021-07-13 07:20:23 +00:00
Zhichang Yu
5047c758f4 fix per review 2021-07-13 07:20:20 +00:00
Zhichang Yu
fbd5eee8a1 hdfs zero copy 2021-07-13 07:19:12 +00:00
terrylin
faed6263bb add merge_selecting_sleep_ms setting 2021-07-09 17:29:17 +08:00
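A minimal usage sketch for the setting added here (the setting name comes from the commit title; the table name and value are hypothetical):

```
-- Sketch: adjust the pause between merge-selection attempts for one table.
ALTER TABLE t MODIFY SETTING merge_selecting_sleep_ms = 1000;
```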
alesapin
4c85dae572
Merge pull request #25743 from ClickHouse/fix_aggregation_ttl
Fix bug in execution of TTL GROUP BY
2021-07-07 10:49:16 +03:00
alesapin
0d8844c510
Merge pull request #25884 from ClickHouse/fix_drop_part_in_queue
Relax `DROP PART` guarantees and turn on checks in ReplicationQueue.
2021-07-07 10:48:48 +03:00
alesapin
f7e1cfdb24 Some partially working code 2021-07-05 22:58:55 +03:00
Anton Popov
9071ecd428 fix alter of settings in MergeTree 2021-07-05 15:44:58 +03:00
mergify[bot]
9da1c98998
Merge branch 'master' into fix_aggregation_ttl 2021-07-03 15:07:44 +00:00
alesapin
13c008c7a8 Change exception type 2021-07-02 16:38:46 +03:00
alesapin
2e29dc2975 More safe empty parts creation 2021-07-02 12:30:17 +03:00
mergify[bot]
98fa9f7951
Merge branch 'master' into better_remove_empty_parts 2021-07-01 11:32:43 +00:00
tavplubix
afbc6bf17d
Merge pull request #25684 from ClickHouse/cancel_merges_on_drop_partition
Cancel merges on DROP PARTITION
2021-06-30 23:41:04 +03:00
alesapin
0193a9d087 Fix locking 2021-06-30 22:41:25 +03:00
alesapin
6a73c8b49e Review fixes 2021-06-30 18:24:51 +03:00
mergify[bot]
1799804243
Merge branch 'master' into fix_aggregation_ttl 2021-06-30 09:15:46 +00:00
alesapin
a6834213a1 Add tests 2021-06-29 22:47:54 +03:00
alesapin
5822f0ba29 Replace lost parts with empty parts instead of hacking replication queue 2021-06-29 18:14:44 +03:00
Raúl Marín
bfc122df64 Fix some typos in Storage classes 2021-06-28 19:03:56 +02:00
alesapin
b11254d191
Merge branch 'master' into fix_aggregation_ttl 2021-06-28 13:31:29 +03:00
alesapin
71603a7d13
Merge pull request #25772 from ClickHouse/fix_mutation_test
Fix flaky test and wrong message
2021-06-28 13:30:36 +03:00
alesapin
4c213b639f
Merge pull request #25548 from nikitamikhaylov/background-processing
A little improvement in BackgroundJobsExecutor
2021-06-28 13:24:21 +03:00
alesapin
7e73762b48 Fix flaky test and wrong message 2021-06-28 11:28:45 +03:00
alesapin
c977c33d6d Fix bug in execution of TTL GROUP BY 2021-06-27 19:18:15 +03:00
Alexander Tokmakov
76156af5cc cancel merges on drop partition 2021-06-24 17:07:43 +03:00
alesapin
4be4bc21e2 Fix for fix 2021-06-23 23:57:49 +03:00
alesapin
81c74435a3 Fix drop part bug 2021-06-23 22:25:30 +03:00
Nikita Mikhaylov
c66a3b22b5 done 2021-06-22 23:24:47 +00:00
Mike Kot
4c391f8e99
SYSTEM RESTORE REPLICA replica [ON CLUSTER cluster] (#13652)
* initial commit: add setting and stub

* typo

* added test stub

* fix

* wip merging new integration test and code proto

* adding steps interpreters

* adding firstly proposed solution (moving parts etc)

* added checking zookeeper path existence

* fixing the include

* fixing and sorting includes

* fixing outdated struct

* fix the name

* added ast ptr as level of indirection

* fix ref

* updating the changes

* working on test stub

* fix iterator -> reference

* revert rocksdb submodule update

* fixed show privileges test

* updated the test stub

* replaced rand() with thread_local_rng(), updated the tests

updated the test

fixed test config path

test fix

removed error messages

fixed the test

updated the test

fixed string literal

fixed literal

typo: =

* fixed the empty replica error message

* updated the test and the code with logs

* updated the possible test cases, updated

* added the code/test milestone comments

* updated the test (added more testcases)

* replaced native assert with CH one

* individual replicas recursive delete fix

* updated the AS db.name AST

* two small logging fixes

* manually generated AST fixes

* Updated the test, added the possible algo change

* Some thoughts about optimizing the solution:

ALTER MOVE PARTITION .. TO TABLE -> move to detached/ + ALTER ... ATTACH

* fix

* Removed the replica sync in test as it's invalid

* Some test tweaks

* tmp

* Rewrote the algo by using the executeQuery instead of

hand-crafting the ASTPtr.

Two questions still active.

* tr: logging active parts

* Extracted the parts moving algo into a separate helper function

* Fixed the test data and the queries slightly

* Replaced query to system.parts to direct invocation,

started building the test that breaks on various parts.

* Added the case for tables when at least one replica is alive

* Updated the test to test replicas restoration by detaching/attaching

* Altered the test to check restoration without replica restart

* Added the tables swap in the start if the server failed last time

* Hotfix when only /replicas/replica... path was deleted

* Restore ZK paths while creating a replicated MergeTree table

* Updated the docs, fixed the algo for individual replicas restoration case

* Initial parts table storage fix, tests sync fix

* Reverted individual replica restoration to general algo

* Slightly optimised getDataParts

* Trying another solution with parts detaching

* Rewrote algo without any steps, added ON CLUSTER support

* Attaching parts from other replica on restoration

* Getting part checksums from ZK

* Removed ON CLUSTER, finished working solution

* Multiple small changes after review

* Fixing parallel test

* Supporting rewritten form on cluster

* Test fix

* Moar logging

* Using source replica as checksum provider

* improve test, remove some code from parser

* Trying solution with move to detached + forget

* Moving all parts (not only Committed) to detached

* Edited docs for RESTORE REPLICA

* Re-merging

* minor fixes

Co-authored-by: Alexander Tokmakov <avtokmakov@yandex-team.ru>
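
For reference, a minimal usage sketch of the statement this PR introduces (syntax as in the title; the table and cluster names are hypothetical):

```
-- Sketch: recreate lost ZooKeeper metadata for a replicated table from its local parts.
SYSTEM RESTORE REPLICA test;
-- The ON CLUSTER form from the title:
SYSTEM RESTORE REPLICA test ON CLUSTER cluster;
```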
2021-06-20 11:24:43 +03:00
Maksim Kita
67e9b85951 Merge ext into common 2021-06-16 23:28:41 +03:00
tavplubix
e2ecc51a1f
Merge pull request #25087 from ClickHouse/srmt_remove_copypaste
Remove copypaste from StorageReplicatedMergeTree
2021-06-15 12:43:55 +03:00
alexey-milovidov
abe206f4fc
Merge pull request #25037 from nvartolomei/nv/queue-entry-wait-dead-code
Update waitForTableReplicaToProcessLogEntry comments
2021-06-12 03:11:13 +03:00
Alexander Tokmakov
3ade38df82 remove copypaste 2021-06-08 22:17:45 +03:00
Azat Khuzhin
1062d0ec91 Distinguish KILL MUTATION for different tables.
Before this patch, KILL MUTATION marked a mutation as canceled just by name
(and part numbers), so if you had multiple tables with the same part name,
killing a mutation for one table would mark it as killed for another too.

Fix this by comparing the StorageID as well (it is better to use StorageID
than database/table names, because comparing by UUID avoids ambiguity).

Here is a failure of the 01414_freeze_does_not_prevent_alters test on CI [1].

  [1]: https://clickhouse-test-reports.s3.yandex.net/24069/9fb69dcf98c71a939d200cad3c8491bf43a44622/functional_stateless_tests_(ubsan).html#fail1
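
A hedged illustration of the scope this change relies on (the names below are hypothetical): a mutation is addressed through its database and table, not only through the mutated part names:

```
-- Sketch: kill a mutation on one specific table; after this change, another table
-- that happens to contain parts with the same names is not affected.
KILL MUTATION WHERE database = 'db1' AND table = 't1' AND mutation_id = 'mutation_3.txt';
```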
2021-06-08 10:51:22 +03:00
Nicolae Vartolomei
19f64b3f25 Update waitForTableReplicaToProcessLogEntry comments 2021-06-07 11:01:57 +01:00
alesapin
2ea9d998e8
Merge pull request #24960 from nvartolomei/nv/queue-entry-wait-dead-code
Delete support for waiting on queue- entries, is this dead code?
2021-06-07 12:40:34 +03:00
Nicolae Vartolomei
a9d108fc5f Delete support for waiting on queue- entries, is this dead code? 2021-06-04 12:43:46 +01:00
alesapin
4127bbbb9c Remove compatibility code with 18.12 2021-06-04 13:20:15 +03:00
alesapin
4a3fd34a9d Fix typo 2021-06-04 11:33:07 +03:00
alesapin
f9c2f6925d Don't try if source replica was dropped 2021-06-04 11:32:33 +03:00