DOCS-510: runningAccumulate (#12061)

* asiana21-DOCSUP-797 (#117)

* docs(runningAccumulate): the function description is added

* docs(runningAccumulate): the function description is modified

* Update docs/en/sql-reference/functions/other-functions.md

Co-authored-by: BayoNet <da-daos@yandex.ru>

* Update docs/en/sql-reference/functions/other-functions.md

Co-authored-by: BayoNet <da-daos@yandex.ru>

* Update docs/en/sql-reference/functions/other-functions.md

Co-authored-by: BayoNet <da-daos@yandex.ru>

* Update docs/en/sql-reference/functions/other-functions.md

Co-authored-by: BayoNet <da-daos@yandex.ru>

* Update docs/en/sql-reference/functions/other-functions.md

Co-authored-by: BayoNet <da-daos@yandex.ru>

* Update docs/en/sql-reference/functions/other-functions.md

Co-authored-by: BayoNet <da-daos@yandex.ru>

* Update docs/en/sql-reference/functions/other-functions.md

Co-authored-by: BayoNet <da-daos@yandex.ru>

* Update docs/en/sql-reference/functions/other-functions.md

Co-authored-by: BayoNet <da-daos@yandex.ru>

* Update docs/en/sql-reference/functions/other-functions.md

Co-authored-by: BayoNet <da-daos@yandex.ru>

* Update docs/en/sql-reference/functions/other-functions.md

Co-authored-by: BayoNet <da-daos@yandex.ru>

* docs(runningAccumulate): some changes

* Update docs/en/sql-reference/functions/other-functions.md

Co-authored-by: BayoNet <da-daos@yandex.ru>

* Update docs/en/sql-reference/functions/other-functions.md

Co-authored-by: BayoNet <da-daos@yandex.ru>

* Update docs/en/sql-reference/functions/other-functions.md

Co-authored-by: BayoNet <da-daos@yandex.ru>

* Update docs/en/sql-reference/functions/other-functions.md

Co-authored-by: BayoNet <da-daos@yandex.ru>

* docs(runningAccumulate): added ru translation

Co-authored-by: asiana21 <asiana21@yandex-team.ru>
Co-authored-by: BayoNet <da-daos@yandex.ru>

* CLICKHOUSEDOCS-510: Minor fix.

* Update docs/en/sql-reference/functions/other-functions.md

Co-authored-by: Ivan Blinkov <github@blinkov.ru>

* CLICKHOUSEDOCS-510: Fixed links.

Co-authored-by: AsiaKorushkina <43650329+AsiaKorushkina@users.noreply.github.com>
Co-authored-by: asiana21 <asiana21@yandex-team.ru>
Co-authored-by: Sergei Shtykov <bayonet@yandex-team.ru>
Co-authored-by: Ivan Blinkov <github@blinkov.ru>
This commit is contained in:
BayoNet 2020-07-01 16:36:41 +03:00 committed by GitHub
parent 03a643e9d3
commit 41fed28403
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
5 changed files with 210 additions and 10 deletions

View File

@ -120,7 +120,7 @@ There are ordinary functions and aggregate functions. For aggregate functions, s
Ordinary functions dont change the number of rows they work as if they are processing each row independently. In fact, functions are not called for individual rows, but for `Block`s of data to implement vectorized query execution.
There are some miscellaneous functions, like [blockSize](../sql-reference/functions/other-functions.md#function-blocksize), [rowNumberInBlock](../sql-reference/functions/other-functions.md#function-rownumberinblock), and [runningAccumulate](../sql-reference/functions/other-functions.md#function-runningaccumulate), that exploit block processing and violate the independence of rows.
There are some miscellaneous functions, like [blockSize](../sql-reference/functions/other-functions.md#function-blocksize), [rowNumberInBlock](../sql-reference/functions/other-functions.md#function-rownumberinblock), and [runningAccumulate](../sql-reference/functions/other-functions.md#runningaccumulatexploit block processing and violate the independence of rows.
ClickHouse has strong typing, so theres no implicit type conversion. If a function doesn't support a specific combination of types, it throws an exception. But functions can work (be overloaded) for many different combinations of types. For example, the `plus` function (to implement the `+` operator) works for any combination of numeric types: `UInt8` + `Float32`, `UInt16` + `Int8`, and so on. Also, some variadic functions can accept any number of arguments, such as the `concat` function.

View File

@ -33,7 +33,7 @@ To work with these states, use:
- [AggregatingMergeTree](../../engines/table-engines/mergetree-family/aggregatingmergetree.md) table engine.
- [finalizeAggregation](../../sql-reference/functions/other-functions.md#function-finalizeaggregation) function.
- [runningAccumulate](../../sql-reference/functions/other-functions.md#function-runningaccumulate) function.
- [runningAccumulate](../../sql-reference/functions/other-functions.md#runningaccumulate) function.
- [-Merge](#aggregate_functions_combinators-merge) combinator.
- [-MergeState](#aggregate_functions_combinators-mergestate) combinator.

View File

@ -1054,11 +1054,110 @@ Result:
Takes state of aggregate function. Returns result of aggregation (finalized state).
## runningAccumulate {#function-runningaccumulate}
## runningAccumulate {#runningaccumulate}
Takes the states of the aggregate function and returns a column with values, are the result of the accumulation of these states for a set of block lines, from the first to the current line.
For example, takes state of aggregate function (example runningAccumulate(uniqState(UserID))), and for each row of block, return result of aggregate function on merge of states of all previous rows and current row.
So, result of function depends on partition of data to blocks and on order of data in block.
Accumulates states of an aggregate function for each row of a data block.
!!! warning "Warning"
The state is reset for each new data block.
**Syntax**
```sql
runningAccumulate(agg_state[, grouping]);
```
**Parameters**
- `agg_state` — State of the aggregate function. [AggregateFunction](../../sql-reference/data-types/aggregatefunction.md#data-type-aggregatefunction).
- `grouping` — Grouping key. Optional. The state of the function is reset if the `grouping` value is changed. It can be any of the [supported data types](../../sql-reference/data-types/index.md) for which the equality operator is defined.
**Returned value**
- Each resulting row contains a result of the aggregate function, accumulated for all the input rows from 0 to the current position. `runningAccumulate` resets states for each new data block or when the `grouping` value changes.
Type depends on the aggregate function used.
**Examples**
Consider how you can use `runningAccumulate` to find the cumulative sum of numbers without and with grouping.
Query:
```sql
SELECT k, runningAccumulate(sum_k) AS res FROM (SELECT number as k, sumState(k) AS sum_k FROM numbers(10) GROUP BY k ORDER BY k);
```
Result:
```text
┌─k─┬─res─┐
│ 0 │ 0 │
│ 1 │ 1 │
│ 2 │ 3 │
│ 3 │ 6 │
│ 4 │ 10 │
│ 5 │ 15 │
│ 6 │ 21 │
│ 7 │ 28 │
│ 8 │ 36 │
│ 9 │ 45 │
└───┴─────┘
```
The subquery generates `sumState` for every number from `0` to `9`. `sumState` returns the state of the [sum](../aggregate-functions/reference/sum.md) function that contains the sum of a single number.
The whole query does the following:
1. For the first row, `runningAccumulate` takes `sumState(0)` and returns `0`.
2. For the second row, the function merges `sumState(0)` and `sumState(1)` resulting in `sumState(0 + 1)`, and returns `1` as a result.
3. For the third row, the function merges `sumState(0 + 1)` and `sumState(2)` resulting in `sumState(0 + 1 + 2)`, and returns `3` as a result.
4. The actions are repeated until the block ends.
The following example shows the `groupping` parameter usage:
Query:
```sql
SELECT
grouping,
item,
runningAccumulate(state, grouping) AS res
FROM
(
SELECT
toInt8(number / 4) AS grouping,
number AS item,
sumState(number) AS state
FROM numbers(15)
GROUP BY item
ORDER BY item ASC
);
```
Result:
```text
┌─grouping─┬─item─┬─res─┐
│ 0 │ 0 │ 0 │
│ 0 │ 1 │ 1 │
│ 0 │ 2 │ 3 │
│ 0 │ 3 │ 6 │
│ 1 │ 4 │ 4 │
│ 1 │ 5 │ 9 │
│ 1 │ 6 │ 15 │
│ 1 │ 7 │ 22 │
│ 2 │ 8 │ 8 │
│ 2 │ 9 │ 17 │
│ 2 │ 10 │ 27 │
│ 2 │ 11 │ 38 │
│ 3 │ 12 │ 12 │
│ 3 │ 13 │ 25 │
│ 3 │ 14 │ 39 │
└──────────┴──────┴─────┘
```
As you can see, `runningAccumulate` merges states for each group of rows separately.
## joinGet {#joinget}

View File

@ -29,7 +29,7 @@
- Движок таблиц [AggregatingMergeTree](../../engines/table-engines/mergetree-family/aggregatingmergetree.md).
- Функция [finalizeAggregation](../../sql-reference/aggregate-functions/combinators.md#function-finalizeaggregation).
- Функция [runningAccumulate](../../sql-reference/aggregate-functions/combinators.md#function-runningaccumulate).
- Функция [runningAccumulate](../../sql-reference/aggregate-functions/combinators.md#runningaccumulate).
- Комбинатор [-Merge](#aggregate_functions_combinators-merge).
- Комбинатор [-MergeState](#aggregate_functions_combinators-mergestate).

View File

@ -1,4 +1,4 @@
# Прочие функции {#prochie-funktsii}
# Прочие функции {#other-functions}
## hostName() {#hostname}
@ -1036,9 +1036,110 @@ SELECT formatReadableSize(filesystemCapacity()) AS "Capacity", toTypeName(filesy
Принимает состояние агрегатной функции. Возвращает результат агрегирования.
## runningAccumulate {#function-runningaccumulate}
## runningAccumulate {#runningaccumulate}
Принимает на вход состояния агрегатной функции и возвращает столбец со значениями, которые представляют собой результат мёржа этих состояний для выборки строк из блока от первой до текущей строки. Например, принимает состояние агрегатной функции (например, `runningAccumulate(uniqState(UserID))`), и для каждой строки блока возвращает результат агрегатной функции после мёржа состояний функции для всех предыдущих строк и текущей. Таким образом, результат зависит от разбиения данных по блокам и от порядка данных в блоке.
Накапливает состояния агрегатной функции для каждой строки блока данных.
!!! warning "Warning"
Функция обнуляет состояние для каждого нового блока.
**Синтаксис**
```sql
runningAccumulate(agg_state[, grouping]);
```
**Параметры**
- `agg_state` — Состояние агрегатной функции. [AggregateFunction](../../sql-reference/data-types/aggregatefunction.md#data-type-aggregatefunction).
- `grouping` — Ключ группировки. Опциональный параметр. Состояние функции обнуляется, если значение `grouping` меняется. Параметр может быть любого [поддерживаемого типа данных](../../sql-reference/data-types/index.md), для которого определен оператор равенства.
**Возвращаемое значение**
- Каждая результирующая строка содержит результат агрегатной функции, накопленный для всех входных строк от 0 до текущей позиции. `runningAccumulate` обнуляет состояния для каждого нового блока данных или при изменении значения `grouping`.
Тип зависит от используемой агрегатной функции.
**Примеры**
Рассмотрим примеры использования `runningAccumulate` для нахождения кумулятивной суммы чисел без и с группировкой.
Запрос:
```sql
SELECT k, runningAccumulate(sum_k) AS res FROM (SELECT number as k, sumState(k) AS sum_k FROM numbers(10) GROUP BY k ORDER BY k);
```
Результат:
```text
┌─k─┬─res─┐
│ 0 │ 0 │
│ 1 │ 1 │
│ 2 │ 3 │
│ 3 │ 6 │
│ 4 │ 10 │
│ 5 │ 15 │
│ 6 │ 21 │
│ 7 │ 28 │
│ 8 │ 36 │
│ 9 │ 45 │
└───┴─────┘
```
Подзапрос формирует `sumState` для каждого числа от `0` до `9`. `sumState` возвращает состояние функции [sum](../../sql-reference/aggregate-functions/reference.md#agg_function-sum), содержащее сумму одного числа.
Весь запрос делает следующее:
1. Для первой строки `runningAccumulate` берет `sumState(0)` и возвращает `0`.
2. Для второй строки функция объединяет `sumState (0)` и `sumState (1)`, что приводит к `sumState (0 + 1)`, и возвращает в результате `1`.
3. Для третьей строки функция объединяет `sumState (0 + 1)` и `sumState (2)`, что приводит к `sumState (0 + 1 + 2)`, и в результате возвращает `3`.
4. Действия повторяются до тех пор, пока не закончится блок.
В следующем примере показано использование параметра `grouping`:
Запрос:
```sql
SELECT
grouping,
item,
runningAccumulate(state, grouping) AS res
FROM
(
SELECT
toInt8(number / 4) AS grouping,
number AS item,
sumState(number) AS state
FROM numbers(15)
GROUP BY item
ORDER BY item ASC
);
```
Результат:
```text
┌─grouping─┬─item─┬─res─┐
│ 0 │ 0 │ 0 │
│ 0 │ 1 │ 1 │
│ 0 │ 2 │ 3 │
│ 0 │ 3 │ 6 │
│ 1 │ 4 │ 4 │
│ 1 │ 5 │ 9 │
│ 1 │ 6 │ 15 │
│ 1 │ 7 │ 22 │
│ 2 │ 8 │ 8 │
│ 2 │ 9 │ 17 │
│ 2 │ 10 │ 27 │
│ 2 │ 11 │ 38 │
│ 3 │ 12 │ 12 │
│ 3 │ 13 │ 25 │
│ 3 │ 14 │ 39 │
└──────────┴──────┴─────┘
```
Как вы можете видеть, `runningAccumulate` объединяет состояния для каждой группы строк отдельно.
## joinGet {#joinget}