mirror of
https://github.com/ClickHouse/ClickHouse.git
synced 2024-11-25 00:52:02 +00:00
156 lines
7.6 KiB
Markdown
156 lines
7.6 KiB
Markdown
# Jepsen tests ClickHouse Keeper
|
|
|
|
A Clojure library designed to test ZooKeeper-like implementation inside ClickHouse.
|
|
|
|
## Test scenarios (workloads)
|
|
|
|
### CAS register
|
|
|
|
CAS Register has three operations: read number, write number, compare-and-swap number. This register is simulated as a single ZooKeeper node. Read transforms to ZooKeeper's `getData` request. Write transforms to the `set` request. Compare-and-swap implemented via `getData` + compare in code + `set` new value with `version` from `getData`.
|
|
|
|
In this test, we use a linearizable checker, so Jepsen validates that history was linearizable. One of the heaviest workloads.
|
|
|
|
Strictly requires `quorum_reads` to be true.
|
|
|
|
### Set
|
|
|
|
Set has two operations: add a number to set and read all values from set. This workload is simulated on a single ZooKeeper node with a string value that represents Clojure set data structure. Add operation very similar to compare-and-swap. We read string value from ZooKeeper node with `getData`, parse it to Clojure's set, add new value to the set and try to write it with the received version.
|
|
|
|
In this test, Jepsen validates that all successfully added values can be read. Generator for this workload performs only add operations until a timeout and after that tries to read set once.
|
|
|
|
### Unique IDs
|
|
|
|
In the Unique IDs workload we have only one operation: generate a new unique number. It's implemented using ZooKeeper's sequential nodes. For each generates request client just creates a new sequential node in ZooKeeper with a fixed prefix. After that cuts the prefix off from the returned path and parses the number from the rest part.
|
|
|
|
Jepsen checks that all returned IDs were unique.
|
|
|
|
### Counter
|
|
|
|
Counter workload has two operations: read counter value and add some number to the counter. Its implementation is quite weird. We add number `N` to the counter creating `N` sequential nodes in a single ZooKeeper transaction. Counter read implemented as `getChildren` ZooKeeper request and count of all returned nodes.
|
|
|
|
Jepsen checks that counter value lies in the interval of possible value. Strictly requires `quorum_reads` to be true.
|
|
|
|
### Total queue
|
|
|
|
Simulates an unordered queue with three operations: enqueue number, dequeue, and drain. Enqueue operation uses `create` request with node name equals to number. `Dequeue` operation is more interesting. We list (`getChildren`) all nodes and remember the parent node version. After that we choose the smallest one and prepare the transaction: `check` parent node version + set an empty value to parent node + delete smalled child node. Drain operation is just `getChildren` on the parent path.
|
|
|
|
Jepsen checks that all enqueued values were dequeued or drained. Duplicates are allowed because Jepsen doesn't know the value of the unknown-status (`:info`) dequeue operation. So when we try to `dequeue` some element we should return it even if our delete transaction failed with `Connection loss` error.
|
|
|
|
### Linear queue
|
|
|
|
Same with the total queue, but without drain operation. Checks linearizability between enqueue and dequeue. Sometimes consume more than 10GB during validation even for very short histories.
|
|
|
|
|
|
## Nemesis
|
|
|
|
We use almost all standard nemeses with small changes for our storage.
|
|
|
|
### Random node killer (random-node-killer)
|
|
|
|
Sleep 5 seconds, kills random node, sleep for 5 seconds, and starts it back.
|
|
|
|
### All nodes killer (all-nodes-killer)
|
|
|
|
Kill all nodes at once, sleep for 5 seconds, and starts them back.
|
|
|
|
### Simple partitioner (simple-partitioner)
|
|
|
|
Partition one node from others using iptables. No one can see the victim and the victim cannot see anybody.
|
|
|
|
### Random node stop (random-node-hammer-time)
|
|
|
|
Send `SIGSTOP` to the random node. Sleep 5 seconds. Send `SIGCONT`.
|
|
|
|
### All nodes stop (all-nodes-hammer-time)
|
|
|
|
Send `SIGSTOP` to all nodes. Sleep 5 seconds. Send `SIGCONT`.
|
|
|
|
### Logs corruptor (logs-corruptor)
|
|
|
|
Corrupts latest log (change one random byte) in `clickhouse_path/coordination/logs`. Restarts nodes.
|
|
|
|
### Snapshots corruptor (snapshots-corruptor)
|
|
|
|
Corrupts latest snapshot (change one random byte) in `clickhouse_path/coordination/snapshots`. Restarts nodes.
|
|
|
|
### Logs and snapshots corruptor (logs-and-snapshots-corruptor)
|
|
|
|
Corrupts both the latest log and snapshot. Restarts node.
|
|
|
|
### Drop data corruptor (drop-data-corruptor)
|
|
|
|
Drop all data from `clickhouse_path/coordinator`. Restarts node.
|
|
|
|
### Bridge partitioner (bridge-partitioner)
|
|
|
|
Two nodes don't see each other but can see another node. The last node can see both.
|
|
|
|
### Blind node partitioner (blind-node-partitioner)
|
|
|
|
One of the nodes cannot see another, but they can see it.
|
|
|
|
### Blind others partitioner (blind-others-partitioner)
|
|
|
|
Two nodes don't see one node but it can see both.
|
|
|
|
## Usage
|
|
|
|
### Dependencies
|
|
|
|
- leiningen (https://leiningen.org/)
|
|
- clojure (https://clojure.org/)
|
|
- jvm
|
|
|
|
### Options for `lein run`
|
|
|
|
- `test` Run a single test.
|
|
- `test-all` Run all available tests from tests-set.
|
|
- `-w (--workload)` One of the workloads. Option for a single `test`.
|
|
- `--nemesis` One of nemeses. Option for a single `test`.
|
|
- `-q (--quorum)` Run test with quorum reads.
|
|
- `-r (--rate)` How many operations per second Jepsen will generate in a single thread.
|
|
- `-s (--snapshot-distance)` ClickHouse Keeper setting. How often we will create a new snapshot.
|
|
- `--stale-log-gap` ClickHosue Keeper setting. A leader will send a snapshot instead of a log to this node if it's committed index less than leaders - this setting value.
|
|
- `--reserved-log-items` ClickHouse Keeper setting. How many log items to keep after the snapshot.
|
|
- `--ops-per-key` Option for CAS register workload. Total ops that will be generated for a single register.
|
|
- `--lightweight-run` Run some lightweight tests without linearizability checks. Option for `tests-all` run.
|
|
- `--reuse-binary` Don't download clickhouse binary if it already exists on the node.
|
|
- `--clickhouse-source` URL to clickhouse `.deb`, `.tgz` or binary.
|
|
- `--time-limit` (in seconds) How long Jepsen will generate new operations.
|
|
- `--nodes-file` File with nodes for SSH. Newline separated.
|
|
- `--username` SSH username for nodes.
|
|
- `--password` SSH password for nodes.
|
|
- `--concurrency` How many threads Jepsen will use for concurrent requests.
|
|
- `--test-count` How many times to run a single test or how many tests to run from the tests set.
|
|
|
|
|
|
### Examples:
|
|
|
|
1. Run `Set` workload with `logs-and-snapshots-corruptor` ten times:
|
|
|
|
```sh
|
|
$ lein run test --nodes-file nodes.txt --username root --password '' --time-limit 30 --concurrency 50 -r 50 --workload set --nemesis logs-and-snapshots-corruptor --clickhouse-source 'https://clickhouse-builds.s3.yandex.net/someurl/clickhouse-common-static_21.4.1.6321_amd64.deb' -q --test-count 10 --reuse-binary
|
|
```
|
|
|
|
2. Run ten random tests from `lightweight-run` with some custom Keeper settings:
|
|
|
|
``` sh
|
|
$ lein run test-all --nodes-file nodes.txt --username root --password '' --time-limit 30 --concurrency 50 -r 50 --snapshot-distance 100 --stale-log-gap 100 --reserved-log-items 10 --lightweight-run --clickhouse-source 'someurl' -q --reuse-binary --test-count 10
|
|
```
|
|
|
|
|
|
## License
|
|
|
|
Copyright © 2021 FIXME
|
|
|
|
This program and the accompanying materials are made available under the
|
|
terms of the Eclipse Public License 2.0 which is available at
|
|
http://www.eclipse.org/legal/epl-2.0.
|
|
|
|
This Source Code may also be made available under the following Secondary
|
|
Licenses when the conditions for such availability set forth in the Eclipse
|
|
Public License, v. 2.0 are satisfied: GNU General Public License as published by
|
|
the Free Software Foundation, either version 2 of the License, or (at your
|
|
option) any later version, with the GNU Classpath Exception which is available
|
|
at https://www.gnu.org/software/classpath/license.html.
|