Commit Graph

675 Commits

Author SHA1 Message Date
Nikita Mikhaylov
4db5062d6b
Merge pull request #28374 from nikitamikhaylov/global-merge-executor
Introduced global executor for background MergeTree-related operations
2021-09-09 11:30:21 +03:00
Nikita Mikhaylov
6062dd0021 Better 2021-09-08 00:21:21 +00:00
Azat Khuzhin
db0767a194 Implement detach_not_byte_identical_parts
Maybe useful for further analysis of non byte-identical parts.
2021-09-07 23:29:57 +03:00
Nikita Mikhaylov
ea0fbf81af Renaming 2021-09-06 12:01:16 +00:00
Nikita Mikhaylov
292a24abe8 Merge upstream/master into global-merge-executor (using imerge) 2021-09-03 00:34:24 +00:00
Nikita Mikhaylov
cc7c221fad Own PriorityQueue + prettifying the code 2021-09-02 21:31:32 +00:00
mergify[bot]
5d299fbdee
Merge branch 'master' into remove_outdated_settings 2021-09-01 14:07:48 +00:00
alesapin
fd1581aee1 Fix style 2021-09-01 10:52:33 +03:00
Nikita Mikhaylov
dbc950caa4 added a test 2021-08-31 14:54:24 +00:00
alesapin
921e51e061 Remove some obsolete settings for replicated fetches 2021-08-31 15:22:56 +03:00
Nikita Mikhaylov
f8d4f04294 Merge upstream/master into global-merge-executor (using imerge) 2021-08-31 11:52:11 +00:00
Nikita Mikhaylov
1adb9bfe23 better 2021-08-31 11:02:39 +00:00
Nikita Mikhaylov
c4416906c8 done 2021-08-30 19:37:03 +00:00
Alexey Milovidov
79e0433ba7 Merge branch 'master' of github.com:yandex/ClickHouse into async-reads 2021-08-28 01:19:16 +03:00
tavplubix
703101fe4d
Merge pull request #27931 from ClickHouse/wait_for_all_replicas_timeouts
Avoid too long waiting for inactive replicas
2021-08-27 14:31:36 +03:00
Amos Bird
0169fce78e
Projection bug fixes and refactoring. 2021-08-26 19:09:31 +08:00
Alexey Milovidov
7c1d0a3baf Progress on development 2021-08-25 01:24:47 +03:00
tavplubix
0602d74a11
Merge pull request #28035 from ClickHouse/fix_replace_ranges_may_stuck
Fix race between REPLACE PARTITION and MOVE PARTITION
2021-08-24 17:49:57 +03:00
Alexander Tokmakov
d95131dd4a better check for alter partition version 2021-08-23 20:51:20 +03:00
Alexander Tokmakov
cc9c2fd63b make code better 2021-08-23 15:57:50 +03:00
Alexander Tokmakov
59eb3aa9a9 avoid too long waiting for inactive replicas 2021-08-20 15:59:57 +03:00
nvartolomei
c09c90125f
Merge branch 'master' into nv/last-queue-update-exception 2021-08-19 10:29:16 +01:00
Alexander Tokmakov
0ed046eb7b remove irrelevant comments 2021-08-18 15:33:11 +03:00
Alexander Tokmakov
09ff66da0e fix a couple of bugs that may cause replicas to diverge 2021-08-18 12:50:46 +03:00
mergify[bot]
f11e396151
Merge branch 'master' into nv/last-queue-update-exception 2021-08-18 07:00:50 +00:00
Nicolae Vartolomei
3f291b024a
Use plain mutex instead of MultiVersion 2021-08-09 13:58:23 +01:00
Maksim Kita
7fdf3cc263
Merge pull request #27180 from kitaisreal/storage-system-replicas-added-column-replica-is-active
Storage system replicas added column replica is active
2021-08-05 12:46:53 +03:00
Maksim Kita
3f48c85722 StorageSystemReplicas added column replica_is_active 2021-08-04 16:19:42 +03:00
Alexander Tokmakov
42a8bb6872 fix assertions in Replicated database 2021-08-02 16:19:11 +03:00
alesapin
71169d7937
Merge pull request #26716 from nvartolomei/nv/part-cleanup-sequence
Avoid deleting old parts from FS on shutdown for replicated engine
2021-07-28 18:29:37 +03:00
Nicolae Vartolomei
8b07a7f180 Store exception generated when we tried to update the queue last time
The use case is to alert when the queue contains broken entries. This is
especially important when ClickHouse breaks backwards compatibility
between versions and log entries written by newer versions aren't
parseable by old versions.

```
Code: 27, e.displayText() = DB::Exception: Cannot parse input: expected 'quorum: ' before: 'merge_type: 2\n'
```
2021-07-27 15:42:40 +01:00
Nikolai Kochetov
61d8f880cd Rename some files. 2021-07-26 19:48:25 +03:00
Nikolai Kochetov
9c92f43359 Update storages. 2021-07-23 22:33:59 +03:00
Nicolae Vartolomei
f35e6eee19 Avoid deleting old parts from FS on shutdown for replicated engine
This was introduced in https://github.com/ClickHouse/ClickHouse/pull/8602.
The idea was to avoid data re-appearing in ClickHouse after DROP/DETACH
PARTITION. This problem was only present in the MergeTree engine and I
don't understand why we need to do the same in ReplicatedMergeTree.

For ReplicatedMergeTree the source of truth is stored in ZK; deleting
things from the filesystem just introduces inconsistencies and is the
main source of errors like "No active replica has part X or covering
part".

The resulting problem is fixed by
https://github.com/ClickHouse/ClickHouse/pull/25820, but in my opinion
it would be better to avoid introducing the ZK/FS inconsistency in the
first place.

When does this inconsistency appear? Often the sequence is like this:

0. Write 2 parts to ZK [all_0_0_0, all_1_1_0]
1. A merge gets scheduled
2. New part replaces old parts [new: all_0_1_1, old: all_0_0_0, all_1_1_0]
3. Replica gets shut down and old parts are removed from the filesystem
4. Replica comes back online; metadata about all parts is still stored
   in ZK for this restarted replica.
5. After its cleanup thread runs, the other replica will have only
   [all_0_1_1] in ZK
6. User triggers a DROP_RANGE after a while (drop range is for all_0_1_9999*)
7. Each replica deletes from ZK only [all_0_1_1]. The replica that got
   restarted uses its in-memory state to choose nodes to delete from ZK.
8. Restart the replica again. It will now think that there are 2 parts
   that it lost and needs to fetch them [all_0_0_0, all_1_1_0].

`clearOldPartsAndRemoveFromZK`, which is triggered from the cleanup
thread, runs the cleanup sequence correctly: it first removes things
from ZK and then from the filesystem. I don't see much benefit in
triggering it on shutdown and would rather have it called from a
single place.
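
The sequence above can be sketched as a toy model. This is only an
illustration under simplifying assumptions: the `Replica` class, its
methods, the covering rule and the single-replica setup are all
hypothetical and are not ClickHouse code. It compares a shutdown that
removes covered parts from the filesystem only with one that leaves
them for the cleanup thread.

```python
# Toy model of the ZK/filesystem inconsistency. All names here are
# hypothetical simplifications, not ClickHouse internals.


def parse(part):
    # "all_0_1_1" -> (min_block, max_block, level)
    _, lo, hi, level = part.split("_")
    return int(lo), int(hi), int(level)


def covers(a, b):
    # Part a covers part b if it spans b's block range at a higher level.
    a_lo, a_hi, a_lvl = parse(a)
    b_lo, b_hi, b_lvl = parse(b)
    return a_lo <= b_lo and b_hi <= a_hi and a_lvl > b_lvl


class Replica:
    def __init__(self):
        self.zk = set()      # part names registered in ZK for this replica
        self.fs = set()      # part directories on the local filesystem
        self.memory = set()  # parts known to the running server

    def add_part(self, part):
        for state in (self.zk, self.fs, self.memory):
            state.add(part)

    def shutdown(self, remove_old_parts_from_fs):
        if remove_old_parts_from_fs:
            # Covered parts leave the filesystem, but their ZK nodes stay.
            self.fs = {p for p in self.fs
                       if not any(covers(q, p) for q in self.fs)}
        self.memory = set()

    def startup(self):
        # In-memory state is rebuilt from disk; ZK-only parts look lost.
        self.memory = set(self.fs)
        return self.zk - self.fs

    def drop_range(self):
        # Only parts known in memory are removed from ZK and from disk.
        self.zk -= self.memory
        self.fs -= self.memory
        self.memory = set()


def simulate(remove_old_parts_from_fs):
    r = Replica()
    r.add_part("all_0_0_0")
    r.add_part("all_1_1_0")
    r.add_part("all_0_1_1")   # merge result covers both old parts
    r.shutdown(remove_old_parts_from_fs)
    r.startup()
    r.drop_range()            # user drops the whole range
    r.shutdown(remove_old_parts_from_fs)
    return r.startup()        # parts the replica believes it lost


print(sorted(simulate(True)))   # old parts removed from disk on shutdown
print(sorted(simulate(False)))  # old parts left for the cleanup thread
```

In this model the first call reports [all_0_0_0, all_1_1_0] as lost,
because their ZK nodes outlive the drop, while the second call reports
nothing.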

---

This is an extreme edge case, but it shows that the current "fix"
(https://github.com/ClickHouse/ClickHouse/pull/25820) isn't complete.

```
create table test(
    v UInt64
)
engine=ReplicatedMergeTree('/clickhouse/test', 'one')
order by v
settings old_parts_lifetime = 30;

create table test2(
    v UInt64
)
engine=ReplicatedMergeTree('/clickhouse/test', 'two')
order by v
settings old_parts_lifetime = 30;

create table test3(
    v UInt64
)
engine=ReplicatedMergeTree('/clickhouse/test', 'three')
order by v
settings old_parts_lifetime = 30;

insert into table test values (1), (2), (3);
insert into table test values (4);

optimize table test final;

detach table test;
detach table test2;

alter table test3 drop partition tuple();

attach table test;
attach table test2;
```

```
(CONNECTED [localhost:9181]) /> ls /clickhouse/test/replicas/one/parts
all_0_0_0
all_1_1_0
(CONNECTED [localhost:9181]) /> ls /clickhouse/test/replicas/two/parts
all_0_0_0
all_1_1_0
(CONNECTED [localhost:9181]) /> ls /clickhouse/test/replicas/three/parts
```

```
detach table test;
attach table test;
```

`test` will now figure out that the parts exist only in ZK and will issue `GET_PART`
after first removing the parts from ZK.

`test2` will receive a fetch for unknown parts and will trigger part checks itself.
Because `test` no longer has the parts in ZK, `test2` will mark them as LostForever.
It will also not insert empty parts, because the partition is empty.

`test` is left with `GET_PART` entries stuck in the queue.

```
SELECT
    table,
    type,
    replica_name,
    new_part_name,
    last_exception
FROM system.replication_queue

Query id: 74c5aa00-048d-4bc1-a2ea-6f69501c11a0

Row 1:
──────
table:          test
type:           GET_PART
replica_name:   one
new_part_name:  all_0_0_0
last_exception: Code: 234. DB::Exception: No active replica has part all_0_0_0 or covering part. (NO_REPLICA_HAS_PART) (version 21.9.1.1)

Row 2:
──────
table:          test
type:           GET_PART
replica_name:   one
new_part_name:  all_1_1_0
last_exception: Code: 234. DB::Exception: No active replica has part all_1_1_0 or covering part. (NO_REPLICA_HAS_PART) (version 21.9.1.1)
```
2021-07-22 17:48:16 +01:00
tavplubix
41bb8acbb5
Update StorageReplicatedMergeTree.cpp 2021-07-14 20:05:50 +03:00
Filatenkov Artur
f1702d356e
Merge pull request #26120 from lthaooo/settings-merge_selecting_sleep_ms
add merge_selecting_sleep_ms setting
2021-07-14 18:28:47 +03:00
Zhichang Yu
b4e6689bf9 fix test_hdfs_zero_copy_replication_move[tiered_copy-2] 2021-07-13 07:20:23 +00:00
Zhichang Yu
5047c758f4 fix per review 2021-07-13 07:20:20 +00:00
Zhichang Yu
fbd5eee8a1 hdfs zero copy 2021-07-13 07:19:12 +00:00
terrylin
faed6263bb add merge_selecting_sleep_ms setting 2021-07-09 17:29:17 +08:00
alesapin
4c85dae572
Merge pull request #25743 from ClickHouse/fix_aggregation_ttl
Fix bug in execution of TTL GROUP BY
2021-07-07 10:49:16 +03:00
alesapin
0d8844c510
Merge pull request #25884 from ClickHouse/fix_drop_part_in_queue
Relax `DROP PART` guarantees and turn on checks in ReplicationQueue.
2021-07-07 10:48:48 +03:00
alesapin
f7e1cfdb24 Some partially working code 2021-07-05 22:58:55 +03:00
Anton Popov
9071ecd428 fix alter of settings in MergeTree 2021-07-05 15:44:58 +03:00
mergify[bot]
9da1c98998
Merge branch 'master' into fix_aggregation_ttl 2021-07-03 15:07:44 +00:00
alesapin
13c008c7a8 Change exception type 2021-07-02 16:38:46 +03:00
alesapin
2e29dc2975 More safe empty parts creation 2021-07-02 12:30:17 +03:00
mergify[bot]
98fa9f7951
Merge branch 'master' into better_remove_empty_parts 2021-07-01 11:32:43 +00:00
tavplubix
afbc6bf17d
Merge pull request #25684 from ClickHouse/cancel_merges_on_drop_partition
Cancel merges on DROP PARTITION
2021-06-30 23:41:04 +03:00
alesapin
0193a9d087 Fix locking 2021-06-30 22:41:25 +03:00