Commit Graph

2 Commits

Author SHA1 Message Date
Azat Khuzhin
be6777bc86 Reduce overhead of the mutations for SELECTs (v2)
SELECTs are affected by mutations, since a query tries to apply them on
the fly, and scanning over the existing mutations can take a significant
amount of time (noticeable for simple queries, e.g. count()).

Even after a mutation has finished, it is still a problem, because
mutations are not removed instantly.

So instead introduce an atomic counter alter_conversions_mutations, which
is incremented for each new mutation and decremented once the mutation is
finished or killed. That way, once a mutation has finished, it no longer
affects queries.
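
The counter idea can be sketched roughly as follows. This is a minimal
illustration, not the actual patch: only the name
alter_conversions_mutations comes from this commit, while StorageSketch,
addMutation(), finishMutation(), getCommandsForSelect() and slow_scans are
hypothetical names made up for the example.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical stand-in for a storage that keeps a list of mutations.
struct StorageSketch
{
    // Number of mutations that are still pending (not finished or killed).
    std::atomic<int64_t> alter_conversions_mutations{0};

    std::mutex mutex;
    std::vector<std::string> mutations; // guarded by mutex
    size_t slow_scans = 0;              // guarded by mutex, counts slow-path calls

    void addMutation(const std::string & name)
    {
        std::lock_guard<std::mutex> lock(mutex);
        mutations.push_back(name);
        alter_conversions_mutations.fetch_add(1, std::memory_order_relaxed);
    }

    // The entry deliberately stays in the list (mutations are not removed
    // instantly); only the counter is decremented.
    void finishMutation()
    {
        [[maybe_unused]] auto prev = alter_conversions_mutations.fetch_sub(1, std::memory_order_relaxed);
        assert(prev > 0);
    }

    std::vector<std::string> getCommandsForSelect()
    {
        // Fast path: a single atomic load, no lock and no scan.
        if (alter_conversions_mutations.load(std::memory_order_relaxed) == 0)
            return {};

        // Slow path: take the lock and copy the list.
        std::lock_guard<std::mutex> lock(mutex);
        ++slow_scans;
        return mutations;
    }
};
```

In this sketch a SELECT that runs while no mutation is pending pays for
one atomic load only, even if finished mutations are still in the list.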

Here are some numbers for non-RENAME mutations:

    rmt vanilla w/o mutations | queries: 3693, QPS: 494.813
    rmt vanilla w/ mutations  | queries: 2190, QPS: 388.256
    rmt patched w/o mutations | queries: 3168, QPS: 620.061
    rmt patched w/ mutations  | queries: 3155, QPS: 614.424
    mt vanilla w/o mutations  | queries: 3498, QPS: 656.399
    mt vanilla w/ mutations   | queries: 3821, QPS: 600.425
    mt patched w/o mutations  | queries: 5732, QPS: 745.585
    mt patched w/ mutations   | queries: 4719, QPS: 715.034

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2024-04-25 14:35:21 +02:00
Azat Khuzhin
a4f765cae7
Improve performance of SELECTs with active mutations (#59531)
* Configure keeper for perf tests

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>

* Improve performance of SELECTs with active mutations

getAlterMutationCommandsForPart() can be a hot path for query execution
when there are pending mutations.

- LOG_TEST - it checks not just one bool, but a bunch of atomics as
  well.

- Return std::vector instead of std::map (a map is not required there) -
  no changes in performance.

- Copy only RENAME_COLUMN (since only this mutation is required by
  AlterConversions).
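
A rough sketch of the last two points (flat vector, copy only
RENAME_COLUMN). CommandSketch, supportsCommandType() and
collectForAlterConversions() are hypothetical simplifications for
illustration; they are not the real MutationCommand or AlterConversions
interfaces.

```cpp
#include <string>
#include <vector>

// Hypothetical, heavily simplified mutation command.
struct CommandSketch
{
    enum class Type { Delete, Update, RenameColumn };
    Type type;
    std::string column;
    std::string rename_to;
};

// Only RENAME_COLUMN matters for reading the part with the new schema.
bool supportsCommandType(CommandSketch::Type type)
{
    return type == CommandSketch::Type::RenameColumn;
}

// Flatten the per-mutation command lists into one vector and copy only
// the commands that are actually needed on the SELECT path.
std::vector<CommandSketch> collectForAlterConversions(
    const std::vector<std::vector<CommandSketch>> & pending)
{
    std::vector<CommandSketch> result;
    for (const auto & commands : pending)
        for (const auto & command : commands)
            if (supportsCommandType(command.type))
                result.push_back(command);
    return result;
}
```

With such a filter, DELETE/UPDATE commands are skipped without being
copied, so the per-query cost scales with the number of pending renames
rather than with the size of every pending mutation.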

And here are results:

run|result
-|-
SELECT w/o ALTER|queries: 1565, QPS: 355.259, RPS: 355.259
SELECT w/ ALTER unpatched|queries: 2099, QPS: 220.623, RPS: 220.623
SELECT w/ ALTER and w/o LOG_TEST|queries: 2730, QPS: 235.859, RPS: 235.859
SELECT w/ ALTER and w/o LOG_TEST and w/ RENAME_COLUMN only|queries: 2995, QPS: 290.982, RPS: 290.982

But there is still room for improvement: at least MergeTree engines
could implement getStorageSnapshotForQuery().

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>

* Add AlterConversions::supportsMutationCommandType(), flatten vector<vector<MutationCommand>>

* Work around what appears to be a clang static analysis bug

---------

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
Co-authored-by: Michael Kolupaev <michael.kolupaev@clickhouse.com>
2024-02-22 08:51:10 +00:00