Release more num_streams if data is small (#53867)

* Release more num_streams if data is small

  Besides sum_marks and min_marks_for_concurrent_read, we can also take the number of system cores into account when computing num_streams for small data. Increasing num_streams and decreasing min_marks_for_concurrent_read improves parallel performance when the system has plenty of cores.

  Tested on a 2x80 vCPU system. ClickBench Q39 improved 3.3x and Q36 improved 2.6x. The overall geomean improved by 9%.

  Signed-off-by: Jiebin Sun <jiebin.sun@intel.com>

* Release more num_streams if data is small

  Change the min marks from 4 to 8, as the profit is small and 8 granules is the default block size.

  Signed-off-by: Jiebin Sun <jiebin.sun@intel.com>

---------

Signed-off-by: Jiebin Sun <jiebin.sun@intel.com>
2024-11-27 01:51:59 +00:00 · 2023-10-17 00:41:38 +08:00 · 2023-10-17 00:41:38 +08:00 · df17cd467b
commit df17cd467b
parent cbdb62d389
1 changed file with 23 additions and 1 deletion
--- a/src/Processors/QueryPlan/ReadFromMergeTree.cpp
+++ b/src/Processors/QueryPlan/ReadFromMergeTree.cpp
@@ -718,7 +718,29 @@ Pipe ReadFromMergeTree::spreadMarkRangesAmongStreams(RangesInDataParts && parts_
    {
        /// Reduce the number of num_streams if the data is small.
        if (info.sum_marks < num_streams * info.min_marks_for_concurrent_read && parts_with_ranges.size() < num_streams)
-            num_streams = std::max((info.sum_marks + info.min_marks_for_concurrent_read - 1) / info.min_marks_for_concurrent_read, parts_with_ranges.size());
+        {
+            /*
+            If the data is fragmented, set num_streams to the number of parts. If the data is not fragmented, then besides sum_marks and
+            min_marks_for_concurrent_read, also take the system cores into account: increase num_streams and decrease min_marks_for_concurrent_read
+            if the data is small but the system has plenty of cores. This significantly improves the parallel performance of `MergeTreeRead`.
+            The new num_streams, `num_streams * increase_num_streams_ratio`, must not exceed the previously calculated prev_num_streams.
+            The ratio is also capped at `info.min_marks_for_concurrent_read / 8` to keep the per-stream mark count from dropping far below 8.
+            https://github.com/ClickHouse/ClickHouse/pull/53867
+            */
+            if ((info.sum_marks + info.min_marks_for_concurrent_read - 1) / info.min_marks_for_concurrent_read > parts_with_ranges.size())
+            {
+                const size_t prev_num_streams = num_streams;
+                num_streams = (info.sum_marks + info.min_marks_for_concurrent_read - 1) / info.min_marks_for_concurrent_read;
+                const size_t increase_num_streams_ratio = std::min(prev_num_streams / num_streams, info.min_marks_for_concurrent_read / 8);
+                if (increase_num_streams_ratio > 1)
+                {
+                    num_streams = num_streams * increase_num_streams_ratio;
+                    info.min_marks_for_concurrent_read = (info.sum_marks + num_streams - 1) / num_streams;
+                }
+            }
+            else
+                num_streams = parts_with_ranges.size();
+        }
    }

    auto read_type = is_parallel_reading_from_replicas ? ReadType::ParallelReplicas : ReadType::Default;
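
The arithmetic in the hunk above can be tried in isolation. The following is a minimal standalone sketch, not the ClickHouse code itself: `StreamPlan` and `adjustStreams` are hypothetical names, the function models only the lines visible in this hunk (any enclosing condition outside the hunk is not reproduced), and plain `size_t` arguments stand in for `info.sum_marks`, `info.min_marks_for_concurrent_read`, and `parts_with_ranges.size()`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical holder for the two values the hunk may modify.
struct StreamPlan
{
    size_t num_streams;
    size_t min_marks_for_concurrent_read;
};

// Sketch of the stream-count adjustment shown in the diff.
// All parameter names are stand-ins for the fields used in the original code.
inline StreamPlan adjustStreams(size_t sum_marks, size_t min_marks, size_t num_parts, size_t num_streams)
{
    StreamPlan plan{num_streams, min_marks};

    // Only adjust when the data is small relative to the requested streams.
    if (!(sum_marks < num_streams * min_marks && num_parts < num_streams))
        return plan;

    // Streams needed if each one reads at least min_marks marks (ceiling division).
    const size_t needed = (sum_marks + min_marks - 1) / min_marks;

    if (needed > num_parts)
    {
        const size_t prev_num_streams = num_streams;
        plan.num_streams = needed;

        // Scale up by a ratio bounded by the originally requested streams
        // and by min_marks / 8.
        const size_t ratio = std::min(prev_num_streams / needed, min_marks / 8);
        if (ratio > 1)
        {
            plan.num_streams = needed * ratio;
            plan.min_marks_for_concurrent_read = (sum_marks + plan.num_streams - 1) / plan.num_streams;
        }
    }
    else
    {
        // Fragmented data: one stream per part.
        plan.num_streams = num_parts;
    }
    return plan;
}
```

With illustrative inputs (assumed values, not measurements from the patch) of 100 marks, a minimum of 20 marks per stream, 1 part, and 160 requested streams, the pre-patch formula would give 5 streams. The sketch computes a ratio of min(160 / 5, 20 / 8) = 2, so it returns 10 streams with a new minimum of 10 marks each. With 12 parts instead of 1, the data counts as fragmented and the result is 12 streams with the minimum left at 20.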