--- slug: /en/sql-reference/window-functions/ sidebar_position: 62 sidebar_label: Window Functions title: Window Functions --- ClickHouse supports the standard grammar for defining windows and window functions. The following features are currently supported: | Feature | Support or workaround | |------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| | ad hoc window specification (`count(*) over (partition by id order by time desc)`) | supported | | expressions involving window functions, e.g. `(count(*) over ()) / 2)` | supported | | `WINDOW` clause (`select ... from table window w as (partition by id)`) | supported | | `ROWS` frame | supported | | `RANGE` frame | supported, the default | | `INTERVAL` syntax for `DateTime` `RANGE OFFSET` frame | not supported, specify the number of seconds instead (`RANGE` works with any numeric type). | | `GROUPS` frame | not supported | | Calculating aggregate functions over a frame (`sum(value) over (order by time)`) | all aggregate functions are supported | | `rank()`, `dense_rank()`, `row_number()` | supported | | `lag/lead(value, offset)` | Not supported. Workarounds: | | | 1) replace with `any(value) over (.... rows between preceding and preceding)`, or `following` for `lead` | | | 2) use `lagInFrame/leadInFrame`, which are analogous, but respect the window frame. To get behavior identical to `lag/lead`, use `rows between unbounded preceding and unbounded following` | | ntile(buckets) | Supported. Specify window like, (partition by x order by y rows between unbounded preceding and unounded following). | ## ClickHouse-specific Window Functions ### nonNegativeDerivative(metric_column, timestamp_column[, INTERVAL X UNITS]) Finds non-negative derivative for given `metric_column` by `timestamp_column`. `INTERVAL` can be omitted, default is `INTERVAL 1 SECOND`. The computed value is the following for each row: - `0` for 1st row, - ${metric_i - metric_{i-1} \over timestamp_i - timestamp_{i-1}} * interval$ for $i_th$ row. ## References ### GitHub Issues The roadmap for the initial support of window functions is [in this issue](https://github.com/ClickHouse/ClickHouse/issues/18097). All GitHub issues related to window funtions have the [comp-window-functions](https://github.com/ClickHouse/ClickHouse/labels/comp-window-functions) tag. ### Tests These tests contain the examples of the currently supported grammar: https://github.com/ClickHouse/ClickHouse/blob/master/tests/performance/window_functions.xml https://github.com/ClickHouse/ClickHouse/blob/master/tests/queries/0_stateless/01591_window_functions.sql ### Postgres Docs https://www.postgresql.org/docs/current/sql-select.html#SQL-WINDOW https://www.postgresql.org/docs/devel/sql-expressions.html#SYNTAX-WINDOW-FUNCTIONS https://www.postgresql.org/docs/devel/functions-window.html https://www.postgresql.org/docs/devel/tutorial-window.html ### MySQL Docs https://dev.mysql.com/doc/refman/8.0/en/window-function-descriptions.html https://dev.mysql.com/doc/refman/8.0/en/window-functions-usage.html https://dev.mysql.com/doc/refman/8.0/en/window-functions-frames.html ## Syntax ```text aggregate_function (column_name) OVER ([[PARTITION BY grouping_column] [ORDER BY sorting_column] [ROWS or RANGE expression_to_bound_rows_withing_the_group]] | [window_name]) FROM table_name WINDOW window_name as ([[PARTITION BY grouping_column] [ORDER BY sorting_column]) ``` - `PARTITION BY` - defines how to break a resultset into groups. - `ORDER BY` - defines how to order rows inside the group during calculation aggregate_function. - `ROWS or RANGE` - defines bounds of a frame, aggregate_function is calculated within a frame. - `WINDOW` - allows to reuse a window definition with multiple expressions. ### Functions These functions can be used only as a window function. `row_number()` - Number the current row within its partition starting from 1. `first_value(x)` - Return the first non-NULL value evaluated within its ordered frame. `last_value(x)` - Return the last non-NULL value evaluated within its ordered frame. `nth_value(x, offset)` - Return the first non-NULL value evaluated against the nth row (offset) in its ordered frame. `rank()` - Rank the current row within its partition with gaps. `dense_rank()` - Rank the current row within its partition without gaps. `lagInFrame(x)` - Return a value evaluated at the row that is at a specified physical offset row before the current row within the ordered frame. `leadInFrame(x)` - Return a value evaluated at the row that is offset rows after the current row within the ordered frame. ```text PARTITION ┌─────────────────┐ <-- UNBOUNDED PRECEDING (BEGINNING of the PARTITION) │ │ │ │ │=================│ <-- N PRECEDING <─┐ │ N ROWS │ │ F │ Before CURRENT │ │ R │~~~~~~~~~~~~~~~~~│ <-- CURRENT ROW │ A │ M ROWS │ │ M │ After CURRENT │ │ E │=================│ <-- M FOLLOWING <─┘ │ │ │ │ └─────────────────┘ <--- UNBOUNDED FOLLOWING (END of the PARTITION) ``` ## Examples ```sql CREATE TABLE wf_partition ( `part_key` UInt64, `value` UInt64, `order` UInt64 ) ENGINE = Memory; INSERT INTO wf_partition FORMAT Values (1,1,1), (1,2,2), (1,3,3), (2,0,0), (3,0,0); SELECT part_key, value, order, groupArray(value) OVER (PARTITION BY part_key) AS frame_values FROM wf_partition ORDER BY part_key ASC, value ASC; ┌─part_key─┬─value─┬─order─┬─frame_values─┐ │ 1 │ 1 │ 1 │ [1,2,3] │ <┐ │ 1 │ 2 │ 2 │ [1,2,3] │ │ 1-st group │ 1 │ 3 │ 3 │ [1,2,3] │ <┘ │ 2 │ 0 │ 0 │ [0] │ <- 2-nd group │ 3 │ 0 │ 0 │ [0] │ <- 3-d group └──────────┴───────┴───────┴──────────────┘ ``` ```sql CREATE TABLE wf_frame ( `part_key` UInt64, `value` UInt64, `order` UInt64 ) ENGINE = Memory; INSERT INTO wf_frame FORMAT Values (1,1,1), (1,2,2), (1,3,3), (1,4,4), (1,5,5); -- frame is bounded by bounds of a partition (BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) SELECT part_key, value, order, groupArray(value) OVER (PARTITION BY part_key ORDER BY order ASC Rows BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS frame_values FROM wf_frame ORDER BY part_key ASC, value ASC; ┌─part_key─┬─value─┬─order─┬─frame_values─┐ │ 1 │ 1 │ 1 │ [1,2,3,4,5] │ │ 1 │ 2 │ 2 │ [1,2,3,4,5] │ │ 1 │ 3 │ 3 │ [1,2,3,4,5] │ │ 1 │ 4 │ 4 │ [1,2,3,4,5] │ │ 1 │ 5 │ 5 │ [1,2,3,4,5] │ └──────────┴───────┴───────┴──────────────┘ -- short form - no bound expression, no order by SELECT part_key, value, order, groupArray(value) OVER (PARTITION BY part_key) AS frame_values FROM wf_frame ORDER BY part_key ASC, value ASC; ┌─part_key─┬─value─┬─order─┬─frame_values─┐ │ 1 │ 1 │ 1 │ [1,2,3,4,5] │ │ 1 │ 2 │ 2 │ [1,2,3,4,5] │ │ 1 │ 3 │ 3 │ [1,2,3,4,5] │ │ 1 │ 4 │ 4 │ [1,2,3,4,5] │ │ 1 │ 5 │ 5 │ [1,2,3,4,5] │ └──────────┴───────┴───────┴──────────────┘ -- frame is bounded by the beggining of a partition and the current row SELECT part_key, value, order, groupArray(value) OVER (PARTITION BY part_key ORDER BY order ASC Rows BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS frame_values FROM wf_frame ORDER BY part_key ASC, value ASC; ┌─part_key─┬─value─┬─order─┬─frame_values─┐ │ 1 │ 1 │ 1 │ [1] │ │ 1 │ 2 │ 2 │ [1,2] │ │ 1 │ 3 │ 3 │ [1,2,3] │ │ 1 │ 4 │ 4 │ [1,2,3,4] │ │ 1 │ 5 │ 5 │ [1,2,3,4,5] │ └──────────┴───────┴───────┴──────────────┘ -- short form (frame is bounded by the beggining of a partition and the current row) SELECT part_key, value, order, groupArray(value) OVER (PARTITION BY part_key ORDER BY order ASC) AS frame_values FROM wf_frame ORDER BY part_key ASC, value ASC; ┌─part_key─┬─value─┬─order─┬─frame_values─┐ │ 1 │ 1 │ 1 │ [1] │ │ 1 │ 2 │ 2 │ [1,2] │ │ 1 │ 3 │ 3 │ [1,2,3] │ │ 1 │ 4 │ 4 │ [1,2,3,4] │ │ 1 │ 5 │ 5 │ [1,2,3,4,5] │ └──────────┴───────┴───────┴──────────────┘ -- frame is bounded by the beggining of a partition and the current row, but order is backward SELECT part_key, value, order, groupArray(value) OVER (PARTITION BY part_key ORDER BY order DESC) AS frame_values FROM wf_frame ORDER BY part_key ASC, value ASC; ┌─part_key─┬─value─┬─order─┬─frame_values─┐ │ 1 │ 1 │ 1 │ [5,4,3,2,1] │ │ 1 │ 2 │ 2 │ [5,4,3,2] │ │ 1 │ 3 │ 3 │ [5,4,3] │ │ 1 │ 4 │ 4 │ [5,4] │ │ 1 │ 5 │ 5 │ [5] │ └──────────┴───────┴───────┴──────────────┘ -- sliding frame - 1 PRECEDING ROW AND CURRENT ROW SELECT part_key, value, order, groupArray(value) OVER (PARTITION BY part_key ORDER BY order ASC Rows BETWEEN 1 PRECEDING AND CURRENT ROW) AS frame_values FROM wf_frame ORDER BY part_key ASC, value ASC; ┌─part_key─┬─value─┬─order─┬─frame_values─┐ │ 1 │ 1 │ 1 │ [1] │ │ 1 │ 2 │ 2 │ [1,2] │ │ 1 │ 3 │ 3 │ [2,3] │ │ 1 │ 4 │ 4 │ [3,4] │ │ 1 │ 5 │ 5 │ [4,5] │ └──────────┴───────┴───────┴──────────────┘ -- sliding frame - Rows BETWEEN 1 PRECEDING AND UNBOUNDED FOLLOWING SELECT part_key, value, order, groupArray(value) OVER (PARTITION BY part_key ORDER BY order ASC Rows BETWEEN 1 PRECEDING AND UNBOUNDED FOLLOWING) AS frame_values FROM wf_frame ORDER BY part_key ASC, value ASC; ┌─part_key─┬─value─┬─order─┬─frame_values─┐ │ 1 │ 1 │ 1 │ [1,2,3,4,5] │ │ 1 │ 2 │ 2 │ [1,2,3,4,5] │ │ 1 │ 3 │ 3 │ [2,3,4,5] │ │ 1 │ 4 │ 4 │ [3,4,5] │ │ 1 │ 5 │ 5 │ [4,5] │ └──────────┴───────┴───────┴──────────────┘ -- row_number does not respect the frame, so rn_1 = rn_2 = rn_3 != rn_4 SELECT part_key, value, order, groupArray(value) OVER w1 AS frame_values, row_number() OVER w1 AS rn_1, sum(1) OVER w1 AS rn_2, row_number() OVER w2 AS rn_3, sum(1) OVER w2 AS rn_4 FROM wf_frame WINDOW w1 AS (PARTITION BY part_key ORDER BY order DESC), w2 AS (PARTITION BY part_key ORDER BY order DESC Rows BETWEEN 1 PRECEDING AND CURRENT ROW) ORDER BY part_key ASC, value ASC; ┌─part_key─┬─value─┬─order─┬─frame_values─┬─rn_1─┬─rn_2─┬─rn_3─┬─rn_4─┐ │ 1 │ 1 │ 1 │ [5,4,3,2,1] │ 5 │ 5 │ 5 │ 2 │ │ 1 │ 2 │ 2 │ [5,4,3,2] │ 4 │ 4 │ 4 │ 2 │ │ 1 │ 3 │ 3 │ [5,4,3] │ 3 │ 3 │ 3 │ 2 │ │ 1 │ 4 │ 4 │ [5,4] │ 2 │ 2 │ 2 │ 2 │ │ 1 │ 5 │ 5 │ [5] │ 1 │ 1 │ 1 │ 1 │ └──────────┴───────┴───────┴──────────────┴──────┴──────┴──────┴──────┘ -- first_value and last_value respect the frame SELECT groupArray(value) OVER w1 AS frame_values_1, first_value(value) OVER w1 AS first_value_1, last_value(value) OVER w1 AS last_value_1, groupArray(value) OVER w2 AS frame_values_2, first_value(value) OVER w2 AS first_value_2, last_value(value) OVER w2 AS last_value_2 FROM wf_frame WINDOW w1 AS (PARTITION BY part_key ORDER BY order ASC), w2 AS (PARTITION BY part_key ORDER BY order ASC Rows BETWEEN 1 PRECEDING AND CURRENT ROW) ORDER BY part_key ASC, value ASC; ┌─frame_values_1─┬─first_value_1─┬─last_value_1─┬─frame_values_2─┬─first_value_2─┬─last_value_2─┐ │ [1] │ 1 │ 1 │ [1] │ 1 │ 1 │ │ [1,2] │ 1 │ 2 │ [1,2] │ 1 │ 2 │ │ [1,2,3] │ 1 │ 3 │ [2,3] │ 2 │ 3 │ │ [1,2,3,4] │ 1 │ 4 │ [3,4] │ 3 │ 4 │ │ [1,2,3,4,5] │ 1 │ 5 │ [4,5] │ 4 │ 5 │ └────────────────┴───────────────┴──────────────┴────────────────┴───────────────┴──────────────┘ -- second value within the frame SELECT groupArray(value) OVER w1 AS frame_values_1, nth_value(value, 2) OVER w1 AS second_value FROM wf_frame WINDOW w1 AS (PARTITION BY part_key ORDER BY order ASC Rows BETWEEN 3 PRECEDING AND CURRENT ROW) ORDER BY part_key ASC, value ASC ┌─frame_values_1─┬─second_value─┐ │ [1] │ 0 │ │ [1,2] │ 2 │ │ [1,2,3] │ 2 │ │ [1,2,3,4] │ 2 │ │ [2,3,4,5] │ 3 │ └────────────────┴──────────────┘ -- second value within the frame + Null for missing values SELECT groupArray(value) OVER w1 AS frame_values_1, nth_value(toNullable(value), 2) OVER w1 AS second_value FROM wf_frame WINDOW w1 AS (PARTITION BY part_key ORDER BY order ASC Rows BETWEEN 3 PRECEDING AND CURRENT ROW) ORDER BY part_key ASC, value ASC ┌─frame_values_1─┬─second_value─┐ │ [1] │ ᴺᵁᴸᴸ │ │ [1,2] │ 2 │ │ [1,2,3] │ 2 │ │ [1,2,3,4] │ 2 │ │ [2,3,4,5] │ 3 │ └────────────────┴──────────────┘ ``` ## Real world examples ### Maximum/total salary per department. ```sql CREATE TABLE employees ( `department` String, `employee_name` String, `salary` Float ) ENGINE = Memory; INSERT INTO employees FORMAT Values ('Finance', 'Jonh', 200), ('Finance', 'Joan', 210), ('Finance', 'Jean', 505), ('IT', 'Tim', 200), ('IT', 'Anna', 300), ('IT', 'Elen', 500); SELECT department, employee_name AS emp, salary, max_salary_per_dep, total_salary_per_dep, round((salary / total_salary_per_dep) * 100, 2) AS `share_per_dep(%)` FROM ( SELECT department, employee_name, salary, max(salary) OVER wndw AS max_salary_per_dep, sum(salary) OVER wndw AS total_salary_per_dep FROM employees WINDOW wndw AS (PARTITION BY department rows BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) ORDER BY department ASC, employee_name ASC ); ┌─department─┬─emp──┬─salary─┬─max_salary_per_dep─┬─total_salary_per_dep─┬─share_per_dep(%)─┐ │ Finance │ Jean │ 505 │ 505 │ 915 │ 55.19 │ │ Finance │ Joan │ 210 │ 505 │ 915 │ 22.95 │ │ Finance │ Jonh │ 200 │ 505 │ 915 │ 21.86 │ │ IT │ Anna │ 300 │ 500 │ 1000 │ 30 │ │ IT │ Elen │ 500 │ 500 │ 1000 │ 50 │ │ IT │ Tim │ 200 │ 500 │ 1000 │ 20 │ └────────────┴──────┴────────┴────────────────────┴──────────────────────┴──────────────────┘ ``` ### Cumulative sum. ```sql CREATE TABLE warehouse ( `item` String, `ts` DateTime, `value` Float ) ENGINE = Memory INSERT INTO warehouse VALUES ('sku38', '2020-01-01', 9), ('sku38', '2020-02-01', 1), ('sku38', '2020-03-01', -4), ('sku1', '2020-01-01', 1), ('sku1', '2020-02-01', 1), ('sku1', '2020-03-01', 1); SELECT item, ts, value, sum(value) OVER (PARTITION BY item ORDER BY ts ASC) AS stock_balance FROM warehouse ORDER BY item ASC, ts ASC; ┌─item──┬──────────────────ts─┬─value─┬─stock_balance─┐ │ sku1 │ 2020-01-01 00:00:00 │ 1 │ 1 │ │ sku1 │ 2020-02-01 00:00:00 │ 1 │ 2 │ │ sku1 │ 2020-03-01 00:00:00 │ 1 │ 3 │ │ sku38 │ 2020-01-01 00:00:00 │ 9 │ 9 │ │ sku38 │ 2020-02-01 00:00:00 │ 1 │ 10 │ │ sku38 │ 2020-03-01 00:00:00 │ -4 │ 6 │ └───────┴─────────────────────┴───────┴───────────────┘ ``` ### Moving / Sliding Average (per 3 rows) ```sql CREATE TABLE sensors ( `metric` String, `ts` DateTime, `value` Float ) ENGINE = Memory; insert into sensors values('cpu_temp', '2020-01-01 00:00:00', 87), ('cpu_temp', '2020-01-01 00:00:01', 77), ('cpu_temp', '2020-01-01 00:00:02', 93), ('cpu_temp', '2020-01-01 00:00:03', 87), ('cpu_temp', '2020-01-01 00:00:04', 87), ('cpu_temp', '2020-01-01 00:00:05', 87), ('cpu_temp', '2020-01-01 00:00:06', 87), ('cpu_temp', '2020-01-01 00:00:07', 87); SELECT metric, ts, value, avg(value) OVER (PARTITION BY metric ORDER BY ts ASC Rows BETWEEN 2 PRECEDING AND CURRENT ROW) AS moving_avg_temp FROM sensors ORDER BY metric ASC, ts ASC; ┌─metric───┬──────────────────ts─┬─value─┬───moving_avg_temp─┐ │ cpu_temp │ 2020-01-01 00:00:00 │ 87 │ 87 │ │ cpu_temp │ 2020-01-01 00:00:01 │ 77 │ 82 │ │ cpu_temp │ 2020-01-01 00:00:02 │ 93 │ 85.66666666666667 │ │ cpu_temp │ 2020-01-01 00:00:03 │ 87 │ 85.66666666666667 │ │ cpu_temp │ 2020-01-01 00:00:04 │ 87 │ 89 │ │ cpu_temp │ 2020-01-01 00:00:05 │ 87 │ 87 │ │ cpu_temp │ 2020-01-01 00:00:06 │ 87 │ 87 │ │ cpu_temp │ 2020-01-01 00:00:07 │ 87 │ 87 │ └──────────┴─────────────────────┴───────┴───────────────────┘ ``` ### Moving / Sliding Average (per 10 seconds) ```sql SELECT metric, ts, value, avg(value) OVER (PARTITION BY metric ORDER BY ts Range BETWEEN 10 PRECEDING AND CURRENT ROW) AS moving_avg_10_seconds_temp FROM sensors ORDER BY metric ASC, ts ASC; ┌─metric───┬──────────────────ts─┬─value─┬─moving_avg_10_seconds_temp─┐ │ cpu_temp │ 2020-01-01 00:00:00 │ 87 │ 87 │ │ cpu_temp │ 2020-01-01 00:01:10 │ 77 │ 77 │ │ cpu_temp │ 2020-01-01 00:02:20 │ 93 │ 93 │ │ cpu_temp │ 2020-01-01 00:03:30 │ 87 │ 87 │ │ cpu_temp │ 2020-01-01 00:04:40 │ 87 │ 87 │ │ cpu_temp │ 2020-01-01 00:05:50 │ 87 │ 87 │ │ cpu_temp │ 2020-01-01 00:06:00 │ 87 │ 87 │ │ cpu_temp │ 2020-01-01 00:07:10 │ 87 │ 87 │ └──────────┴─────────────────────┴───────┴────────────────────────────┘ ``` ### Moving / Sliding Average (per 10 days) Temperature is stored with second precision, but using `Range` and `ORDER BY toDate(ts)` we form a frame with the size of 10 units, and because of `toDate(ts)` the unit is a day. ```sql CREATE TABLE sensors ( `metric` String, `ts` DateTime, `value` Float ) ENGINE = Memory; insert into sensors values('ambient_temp', '2020-01-01 00:00:00', 16), ('ambient_temp', '2020-01-01 12:00:00', 16), ('ambient_temp', '2020-01-02 11:00:00', 9), ('ambient_temp', '2020-01-02 12:00:00', 9), ('ambient_temp', '2020-02-01 10:00:00', 10), ('ambient_temp', '2020-02-01 12:00:00', 10), ('ambient_temp', '2020-02-10 12:00:00', 12), ('ambient_temp', '2020-02-10 13:00:00', 12), ('ambient_temp', '2020-02-20 12:00:01', 16), ('ambient_temp', '2020-03-01 12:00:00', 16), ('ambient_temp', '2020-03-01 12:00:00', 16), ('ambient_temp', '2020-03-01 12:00:00', 16); SELECT metric, ts, value, round(avg(value) OVER (PARTITION BY metric ORDER BY toDate(ts) Range BETWEEN 10 PRECEDING AND CURRENT ROW),2) AS moving_avg_10_days_temp FROM sensors ORDER BY metric ASC, ts ASC; ┌─metric───────┬──────────────────ts─┬─value─┬─moving_avg_10_days_temp─┐ │ ambient_temp │ 2020-01-01 00:00:00 │ 16 │ 16 │ │ ambient_temp │ 2020-01-01 12:00:00 │ 16 │ 16 │ │ ambient_temp │ 2020-01-02 11:00:00 │ 9 │ 12.5 │ │ ambient_temp │ 2020-01-02 12:00:00 │ 9 │ 12.5 │ │ ambient_temp │ 2020-02-01 10:00:00 │ 10 │ 10 │ │ ambient_temp │ 2020-02-01 12:00:00 │ 10 │ 10 │ │ ambient_temp │ 2020-02-10 12:00:00 │ 12 │ 11 │ │ ambient_temp │ 2020-02-10 13:00:00 │ 12 │ 11 │ │ ambient_temp │ 2020-02-20 12:00:01 │ 16 │ 13.33 │ │ ambient_temp │ 2020-03-01 12:00:00 │ 16 │ 16 │ │ ambient_temp │ 2020-03-01 12:00:00 │ 16 │ 16 │ │ ambient_temp │ 2020-03-01 12:00:00 │ 16 │ 16 │ └──────────────┴─────────────────────┴───────┴─────────────────────────┘ ``` ## Related Content - Blog: [Working with time series data in ClickHouse](https://clickhouse.com/blog/working-with-time-series-data-and-functions-ClickHouse) - Blog: [Window and array functions for Git commit sequences](https://clickhouse.com/blog/clickhouse-window-array-functions-git-commits) - Blog: [Getting Data Into ClickHouse - Part 3 - Using S3](https://clickhouse.com/blog/getting-data-into-clickhouse-part-3-s3)