ClickHouse/workload-scheduling.md at 77697211cd2cf565fd248db9842360bde40d5e7f

mirror of https://github.com/ClickHouse/ClickHouse.git synced 2024-09-20 16:50:48 +00:00

serxa 77697211cd add docs for system.scheduler + small fix

2023-09-06 19:30:39 +00:00

slug	sidebar_position	sidebar_label	title
/en/operations/workload-scheduling	69	Workload scheduling	Workload scheduling

When ClickHouse executes multiple queries simultaneously, they may share resources (e.g. disks). Scheduling constraints and policies can be applied to regulate how resources are utilized and shared between different workloads. A scheduling hierarchy can be configured for every resource. The hierarchy root represents the resource, while the leaves are queues that hold requests exceeding the resource capacity.

:::note Currently only remote disk IO can be scheduled using the described method. For CPU scheduling, see the settings for thread pools and concurrent_threads_soft_limit_num. For flexible memory limits, see Memory overcommit. :::

Disk configuration

To enable IO scheduling for a specific disk, specify read_resource and/or write_resource in the storage configuration. These settings tell ClickHouse which resource to use for every read and write request on the given disk. The read and write resources can refer to the same resource name, which is useful for local SSDs or HDDs. Multiple disks can also refer to the same resource, which is useful for remote disks, for example when you want a fair division of network bandwidth between "production" and "development" workloads.

Example:

<clickhouse>
    <storage_configuration>
        ...
        <disks>
            <s3>
                <type>s3</type>
                <endpoint>https://clickhouse-public-datasets.s3.amazonaws.com/my-bucket/root-path/</endpoint>
                <access_key_id>your_access_key_id</access_key_id>
                <secret_access_key>your_secret_access_key</secret_access_key>
                <read_resource>network_read</read_resource>
                <write_resource>network_write</write_resource>
            </s3>
        </disks>
        <policies>
            <s3_main>
                <volumes>
                    <main>
                        <disk>s3</disk>
                    </main>
                </volumes>
            </s3_main>
        </policies>
    </storage_configuration>
</clickhouse>
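As a minimal sketch of several disks sharing one resource, the fragment below points two S3 disks at the same network_read and network_write resources, so requests to both disks are scheduled together. The disk names s3_a and s3_b and both endpoints are hypothetical placeholders, not values from a real deployment.

```xml
<clickhouse>
    <storage_configuration>
        <disks>
            <!-- Hypothetical disk name and placeholder endpoint -->
            <s3_a>
                <type>s3</type>
                <endpoint>https://my-bucket-a.s3.amazonaws.com/root-path/</endpoint>
                <access_key_id>your_access_key_id</access_key_id>
                <secret_access_key>your_secret_access_key</secret_access_key>
                <read_resource>network_read</read_resource>
                <write_resource>network_write</write_resource>
            </s3_a>
            <!-- Second hypothetical disk referring to the same resource names -->
            <s3_b>
                <type>s3</type>
                <endpoint>https://my-bucket-b.s3.amazonaws.com/root-path/</endpoint>
                <access_key_id>your_access_key_id</access_key_id>
                <secret_access_key>your_secret_access_key</secret_access_key>
                <read_resource>network_read</read_resource>
                <write_resource>network_write</write_resource>
            </s3_b>
        </disks>
    </storage_configuration>
</clickhouse>
```

Because both disks name the same resources, reads from s3_a and s3_b would compete within a single scheduling hierarchy rather than in two independent ones.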

Workload markup

Queries can be marked with the workload setting to distinguish different workloads. If workload is not set, then the value "default" is used. Note that you can specify another value using settings profiles. Setting constraints can be used to make workload constant if you want all queries from a user to be marked with a fixed value of the workload setting.

Let's consider an example of a system with two different workloads: "production" and "development".

SELECT count() FROM my_table WHERE value = 42 SETTINGS workload = 'production'
SELECT count() FROM my_table WHERE value = 13 SETTINGS workload = 'development'
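The settings profile and constraint approach mentioned above could look roughly like the following users configuration fragment. This is a sketch under assumptions: the profile name production_profile is hypothetical, and it relies on the standard readonly settings constraint to prevent queries from overriding the value.

```xml
<clickhouse>
    <profiles>
        <!-- Hypothetical profile name; assign it to the users whose queries are production -->
        <production_profile>
            <!-- Default workload for every query run under this profile -->
            <workload>production</workload>
            <constraints>
                <workload>
                    <!-- Forbid changing the setting, e.g. via a SETTINGS clause -->
                    <readonly/>
                </workload>
            </constraints>
        </production_profile>
    </profiles>
</clickhouse>
```

With such a profile, users would not need to add SETTINGS workload = 'production' to each query, and an attempt to set a different workload should be rejected by the constraint.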

Resource scheduling hierarchy

From the standpoint of the scheduling subsystem, a resource is represented as a hierarchy of scheduling nodes.

graph TD
    subgraph network_read
    nr_root(("/"))
    -->|100 concurrent requests| nr_fair("fair")
    -->|75% bandwidth| nr_prod["prod"]
    nr_fair
    -->|25% bandwidth| nr_dev["dev"]
    end

    subgraph network_write
    nw_root(("/"))
    -->|100 concurrent requests| nw_fair("fair")
    -->|75% bandwidth| nw_prod["prod"]
    nw_fair
    -->|25% bandwidth| nw_dev["dev"]
    end

Possible node types:

inflight_limit (constraint) - blocks if either the number of concurrent in-flight requests exceeds max_requests or their total cost exceeds max_cost; must have a single child.
fair (policy) - selects the next request to serve from one of its child nodes according to max-min fairness; child nodes can specify weight (default is 1).
priority (policy) - selects the next request to serve from one of its child nodes according to static priorities (lower value means higher priority); child nodes can specify priority (default is 0).
fifo (queue) - a leaf of the hierarchy capable of holding requests that exceed resource capacity.

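The following example shows how to define the IO scheduling hierarchies shown in the picture (prod has weight 3 and dev keeps the default weight 1, which gives the 75%/25% bandwidth split):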

<clickhouse>
    <resources>
        <network_read>
            <node path="/">
                <type>inflight_limit</type>
                <max_requests>100</max_requests>
            </node>
            <node path="/fair">
                <type>fair</type>
            </node>
            <node path="/fair/prod">
                <type>fifo</type>
                <weight>3</weight>
            </node>
            <node path="/fair/dev">
                <type>fifo</type>
            </node>
        </network_read>
        <network_write>
            <node path="/">
                <type>inflight_limit</type>
                <max_requests>100</max_requests>
            </node>
            <node path="/fair">
                <type>fair</type>
            </node>
            <node path="/fair/prod">
                <type>fifo</type>
                <weight>3</weight>
            </node>
            <node path="/fair/dev">
                <type>fifo</type>
            </node>
        </network_write>
    </resources>
</clickhouse>
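The priority node type from the list above can be sketched in the same way. The fragment below is an illustration, not a configuration taken from the original text: the node name prio is hypothetical, and it assumes that priority is set on the child node in the same position where weight is set in the fair example.

```xml
<clickhouse>
    <resources>
        <network_read>
            <!-- Constraint: at most 100 concurrent in-flight requests -->
            <node path="/">
                <type>inflight_limit</type>
                <max_requests>100</max_requests>
            </node>
            <!-- Policy: children are served by static priority (hypothetical node name) -->
            <node path="/prio">
                <type>priority</type>
            </node>
            <!-- Lower value means higher priority, so prod is served first -->
            <node path="/prio/prod">
                <type>fifo</type>
                <priority>0</priority>
            </node>
            <node path="/prio/dev">
                <type>fifo</type>
                <priority>1</priority>
            </node>
        </network_read>
    </resources>
</clickhouse>
```

Unlike the fair policy, which splits capacity by weight, this arrangement would serve dev requests only when the prod queue has nothing to serve, so dev can be starved under sustained production load.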

Workload classifiers

Workload classifiers define the mapping from the workload specified by a query to the leaf queues that should be used for specific resources. At the moment, workload classification is simple: only static mapping is available.