ClickHouse/tests/jepsen.clickhouse-keeper
2022-08-29 10:50:18 +00:00
..
doc rename 2021-03-30 10:45:51 +03:00
resources Use default expiry for jepsen 2022-08-23 13:18:17 +00:00
src/jepsen/clickhouse_keeper Add bunch of comments for request processing 2022-08-29 10:50:18 +00:00
test/jepsen rename 2021-03-30 10:45:51 +03:00
.gitignore
LICENSE
project.clj Update jepsen and scp 2022-05-31 13:53:45 +00:00
README.md

Jepsen tests ClickHouse Keeper

A Clojure library designed to test ZooKeeper-like implementation inside ClickHouse.

Test scenarios (workloads)

CAS register

CAS Register has three operations: read number, write number, compare-and-swap number. This register is simulated as a single ZooKeeper node. Read transforms to ZooKeeper's getData request. Write transforms to the set request. Compare-and-swap implemented via getData + compare in code + set new value with version from getData.

In this test, we use a linearizable checker, so Jepsen validates that history was linearizable. One of the heaviest workloads.

Strictly requires quorum_reads to be true.

Set

Set has two operations: add a number to set and read all values from set. This workload is simulated on a single ZooKeeper node with a string value that represents Clojure set data structure. Add operation very similar to compare-and-swap. We read string value from ZooKeeper node with getData, parse it to Clojure's set, add new value to the set and try to write it with the received version.

In this test, Jepsen validates that all successfully added values can be read. Generator for this workload performs only add operations until a timeout and after that tries to read set once.

Unique IDs

In the Unique IDs workload we have only one operation: generate a new unique number. It's implemented using ZooKeeper's sequential nodes. For each generates request client just creates a new sequential node in ZooKeeper with a fixed prefix. After that cuts the prefix off from the returned path and parses the number from the rest part.

Jepsen checks that all returned IDs were unique.

Counter

Counter workload has two operations: read counter value and add some number to the counter. Its implementation is quite weird. We add number N to the counter creating N sequential nodes in a single ZooKeeper transaction. Counter read implemented as getChildren ZooKeeper request and count of all returned nodes.

Jepsen checks that counter value lies in the interval of possible value. Strictly requires quorum_reads to be true.

Total queue

Simulates an unordered queue with three operations: enqueue number, dequeue, and drain. Enqueue operation uses create request with node name equals to number. Dequeue operation is more interesting. We list (getChildren) all nodes and remember the parent node version. After that we choose the smallest one and prepare the transaction: check parent node version + set an empty value to parent node + delete smalled child node. Drain operation is just getChildren on the parent path.

Jepsen checks that all enqueued values were dequeued or drained. Duplicates are allowed because Jepsen doesn't know the value of the unknown-status (:info) dequeue operation. So when we try to dequeue some element we should return it even if our delete transaction failed with Connection loss error.

Linear queue

Same with the total queue, but without drain operation. Checks linearizability between enqueue and dequeue. Sometimes consume more than 10GB during validation even for very short histories.

Nemesis

We use almost all standard nemeses with small changes for our storage.

Random node killer (random-node-killer)

Sleep 5 seconds, kills random node, sleep for 5 seconds, and starts it back.

All nodes killer (all-nodes-killer)

Kill all nodes at once, sleep for 5 seconds, and starts them back.

Simple partitioner (simple-partitioner)

Partition one node from others using iptables. No one can see the victim and the victim cannot see anybody.

Random node stop (random-node-hammer-time)

Send SIGSTOP to the random node. Sleep 5 seconds. Send SIGCONT.

All nodes stop (all-nodes-hammer-time)

Send SIGSTOP to all nodes. Sleep 5 seconds. Send SIGCONT.

Logs corruptor (logs-corruptor)

Corrupts latest log (change one random byte) in clickhouse_path/coordination/logs. Restarts nodes.

Snapshots corruptor (snapshots-corruptor)

Corrupts latest snapshot (change one random byte) in clickhouse_path/coordination/snapshots. Restarts nodes.

Logs and snapshots corruptor (logs-and-snapshots-corruptor)

Corrupts both the latest log and snapshot. Restarts node.

Drop data corruptor (drop-data-corruptor)

Drop all data from clickhouse_path/coordinator. Restarts node.

Bridge partitioner (bridge-partitioner)

Two nodes don't see each other but can see another node. The last node can see both.

Blind node partitioner (blind-node-partitioner)

One of the nodes cannot see another, but they can see it.

Blind others partitioner (blind-others-partitioner)

Two nodes don't see one node but it can see both.

Usage

Dependencies

Options for lein run

  • test Run a single test.
  • test-all Run all available tests from tests-set.
  • -w (--workload) One of the workloads. Option for a single test.
  • --nemesis One of nemeses. Option for a single test.
  • -q (--quorum) Run test with quorum reads.
  • -r (--rate) How many operations per second Jepsen will generate in a single thread.
  • -s (--snapshot-distance) ClickHouse Keeper setting. How often we will create a new snapshot.
  • --stale-log-gap ClickHosue Keeper setting. A leader will send a snapshot instead of a log to this node if it's committed index less than leaders - this setting value.
  • --reserved-log-items ClickHouse Keeper setting. How many log items to keep after the snapshot.
  • --ops-per-key Option for CAS register workload. Total ops that will be generated for a single register.
  • --lightweight-run Run some lightweight tests without linearizability checks. Option for tests-all run.
  • --reuse-binary Don't download clickhouse binary if it already exists on the node.
  • --clickhouse-source URL to clickhouse .deb, .tgz or binary.
  • --time-limit (in seconds) How long Jepsen will generate new operations.
  • --nodes-file File with nodes for SSH. Newline separated.
  • --username SSH username for nodes.
  • --password SSH password for nodes.
  • --concurrency How many threads Jepsen will use for concurrent requests.
  • --test-count How many times to run a single test or how many tests to run from the tests set.

Examples:

  1. Run Set workload with logs-and-snapshots-corruptor ten times:
$ lein run test --nodes-file nodes.txt --username root --password '' --time-limit 30 --concurrency 50 -r 50 --workload set --nemesis logs-and-snapshots-corruptor  --clickhouse-source 'https://clickhouse-builds.s3.yandex.net/someurl/clickhouse-common-static_21.4.1.6321_amd64.deb' -q --test-count 10 --reuse-binary
  1. Run ten random tests from lightweight-run with some custom Keeper settings:
$ lein run test-all --nodes-file nodes.txt --username root --password '' --time-limit 30 --concurrency 50 -r 50 --snapshot-distance 100 --stale-log-gap 100 --reserved-log-items 10 --lightweight-run  --clickhouse-source 'someurl' -q --reuse-binary --test-count 10

License

Copyright © 2021 FIXME

This program and the accompanying materials are made available under the terms of the Eclipse Public License 2.0 which is available at http://www.eclipse.org/legal/epl-2.0.

This Source Code may also be made available under the following Secondary Licenses when the conditions for such availability set forth in the Eclipse Public License, v. 2.0 are satisfied: GNU General Public License as published by the Free Software Foundation, either version 2 of the License, or (at your option) any later version, with the GNU Classpath Exception which is available at https://www.gnu.org/software/classpath/license.html.