---
slug: /en/sql-reference/aggregate-functions/reference/quantiles
sidebar_position: 201
---
# quantiles Functions
## quantiles
Syntax: `quantiles(level1, level2, …)(x)`

Every quantile function has a corresponding quantiles function: `quantiles`, `quantilesDeterministic`, `quantilesTiming`, `quantilesTimingWeighted`, `quantilesExact`, `quantilesExactWeighted`, `quantilesInterpolatedWeighted`, `quantilesTDigest`, `quantilesBFloat16`, `quantilesDD`. These functions calculate the quantiles for all listed levels in one pass and return an array of the resulting values.
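
As a minimal sketch of the one-pass behaviour described above, the queries below use `quantilesExact` on the built-in `numbers` table function. The aliases `q` and `median` are illustrative names only.

``` sql
-- One pass over the data; the result is a single Array with one element per level.
SELECT quantilesExact(0.25, 0.5, 0.75)(number) AS q
FROM numbers(1000);

-- Array elements follow the order of the listed levels. ClickHouse arrays are
-- 1-based, so q[2] is the value for level 0.5.
SELECT q[2] AS median
FROM
(
    SELECT quantilesExact(0.25, 0.5, 0.75)(number) AS q
    FROM numbers(1000)
);
```

Reading individual levels out of the array this way avoids computing a separate aggregate per level.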
## quantilesExactExclusive
Exactly computes the [quantiles](https://en.wikipedia.org/wiki/Quantile) of a numeric data sequence.

To get the exact value, all passed values are combined into an array, which is then partially sorted. The function therefore consumes `O(n)` memory, where `n` is the number of values passed. However, for a small number of values, the function is very efficient.

This function is equivalent to the Excel [PERCENTILE.EXC](https://support.microsoft.com/en-us/office/percentile-exc-function-bbaa7204-e9e1-4010-85bf-c31dc5dce4ba) function ([type R6](https://en.wikipedia.org/wiki/Quantile#Estimating_quantiles_from_a_sample)).

When several levels are needed, this function works more efficiently than calling [quantileExactExclusive](../../../sql-reference/aggregate-functions/reference/quantileexact.md#quantileexactexclusive) once per level.

**Syntax**
``` sql
quantilesExactExclusive(level1, level2, ...)(expr)
```
**Arguments**

- `expr` — Expression over the column values resulting in numeric [data types](../../../sql-reference/data-types/index.md#data_types), [Date](../../../sql-reference/data-types/date.md) or [DateTime](../../../sql-reference/data-types/datetime.md).

**Parameters**

- `level` — Levels of quantiles. Possible values: (0, 1) — bounds not included. [Float](../../../sql-reference/data-types/float.md).

**Returned value**

- [Array](../../../sql-reference/data-types/array.md) of quantiles of the specified levels.

Type of array values:

- [Float64](../../../sql-reference/data-types/float.md) for numeric data type input.
- [Date](../../../sql-reference/data-types/date.md) if input values have the `Date` type.
- [DateTime](../../../sql-reference/data-types/datetime.md) if input values have the `DateTime` type.

**Example**

Query:
``` sql
CREATE TABLE num AS numbers(1000);
SELECT quantilesExactExclusive(0.25, 0.5, 0.75, 0.9, 0.95, 0.99, 0.999)(x) FROM (SELECT number AS x FROM num);
```
Result:
``` text
┌─quantilesExactExclusive(0.25, 0.5, 0.75, 0.9, 0.95, 0.99, 0.999)(x)─┐
│ [249.25,499.5,749.75,899.9,949.9499999999999,989.99,998.999] │
└─────────────────────────────────────────────────────────────────────┘
```
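
The type R6 rule can be checked by hand. The sketch below is an illustration under stated assumptions, not a description of how ClickHouse implements the function: it assumes the linked R6 definition, where the 1-based fractional rank in the sorted data is `h = (n + 1) * p` and the result is a linear interpolation between the two neighbouring values. The aliases `p`, `xs`, `h` and `k` are hypothetical names, and the arithmetic is only valid while `1 <= h < n`.

``` sql
WITH
    0.25 AS p,                               -- quantile level to check
    arraySort(groupArray(number)) AS xs,     -- all values, fully sorted
    (count() + 1) * p AS h,                  -- fractional 1-based rank (R6)
    toUInt32(floor(h)) AS k                  -- index of the lower neighbour
SELECT
    -- interpolate between the k-th and (k + 1)-th sorted values
    xs[k] + (h - floor(h)) * (xs[k + 1] - xs[k]) AS manual_r6,
    arrayElement(quantilesExactExclusive(0.25)(number), 1) AS builtin
FROM numbers(1000);
```

If the assumed formula holds, both columns should agree with the first element of the result shown above.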
## quantilesExactInclusive
Exactly computes the [quantiles](https://en.wikipedia.org/wiki/Quantile) of a numeric data sequence.

To get the exact value, all passed values are combined into an array, which is then partially sorted. The function therefore consumes `O(n)` memory, where `n` is the number of values passed. However, for a small number of values, the function is very efficient.

This function is equivalent to the Excel [PERCENTILE.INC](https://support.microsoft.com/en-us/office/percentile-inc-function-680f9539-45eb-410b-9a5e-c1355e5fe2ed) function ([type R7](https://en.wikipedia.org/wiki/Quantile#Estimating_quantiles_from_a_sample)).

When several levels are needed, this function works more efficiently than calling [quantileExactInclusive](../../../sql-reference/aggregate-functions/reference/quantileexact.md#quantileexactinclusive) once per level.

**Syntax**
``` sql
quantilesExactInclusive(level1, level2, ...)(expr)
```
**Arguments**

- `expr` — Expression over the column values resulting in numeric [data types](../../../sql-reference/data-types/index.md#data_types), [Date](../../../sql-reference/data-types/date.md) or [DateTime](../../../sql-reference/data-types/datetime.md).

**Parameters**

- `level` — Levels of quantiles. Possible values: [0, 1] — bounds included. [Float](../../../sql-reference/data-types/float.md).

**Returned value**

- [Array](../../../sql-reference/data-types/array.md) of quantiles of the specified levels.

Type of array values:

- [Float64](../../../sql-reference/data-types/float.md) for numeric data type input.
- [Date](../../../sql-reference/data-types/date.md) if input values have the `Date` type.
- [DateTime](../../../sql-reference/data-types/datetime.md) if input values have the `DateTime` type.

**Example**

Query:
``` sql
CREATE TABLE num AS numbers(1000);
SELECT quantilesExactInclusive(0.25, 0.5, 0.75, 0.9, 0.95, 0.99, 0.999)(x) FROM (SELECT number AS x FROM num);
```
Result:
``` text
┌─quantilesExactInclusive(0.25, 0.5, 0.75, 0.9, 0.95, 0.99, 0.999)(x)─┐
│ [249.75,499.5,749.25,899.1,949.05,989.01,998.001] │
└─────────────────────────────────────────────────────────────────────┘
```
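
To see how the two interpolation rules differ on the same data, a query along the following lines may help. It is only a sketch: the aliases `r6`, `r7` and `min_max` are illustrative, and the last column relies on the documented fact that the inclusive variant accepts the bounds 0 and 1.

``` sql
SELECT
    quantilesExactExclusive(0.25, 0.5, 0.75)(number) AS r6,   -- levels must lie in (0, 1)
    quantilesExactInclusive(0.25, 0.5, 0.75)(number) AS r7,   -- levels may lie in [0, 1]
    quantilesExactInclusive(0, 1)(number) AS min_max          -- bounds are allowed here
FROM numbers(1000);
```

Per the results shown in the two examples above, the rules give different values at levels 0.25 and 0.75 but the same value at 0.5.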
## quantilesGK
`quantilesGK` works similarly to `quantileGK`, but calculates quantiles at several levels simultaneously and returns an array. The first parameter, `accuracy`, controls the precision of the approximation: in the example below, the results move closer to the exact quantiles as `accuracy` grows.

**Syntax**
``` sql
quantilesGK(accuracy, level1, level2, ...)(expr)
```
**Returned value**

- [Array](../../../sql-reference/data-types/array.md) of quantiles of the specified levels.

Type of array values:

- [Float64](../../../sql-reference/data-types/float.md) for numeric data type input.
- [Date](../../../sql-reference/data-types/date.md) if input values have the `Date` type.
- [DateTime](../../../sql-reference/data-types/datetime.md) if input values have the `DateTime` type.

**Example**

Query:
``` sql
SELECT quantilesGK(1, 0.25, 0.5, 0.75)(number + 1)
FROM numbers(1000)

┌─quantilesGK(1, 0.25, 0.5, 0.75)(plus(number, 1))─┐
│ [1,1,1] │
└──────────────────────────────────────────────────┘

SELECT quantilesGK(10, 0.25, 0.5, 0.75)(number + 1)
FROM numbers(1000)

┌─quantilesGK(10, 0.25, 0.5, 0.75)(plus(number, 1))─┐
│ [156,413,659] │
└───────────────────────────────────────────────────┘

SELECT quantilesGK(100, 0.25, 0.5, 0.75)(number + 1)
FROM numbers(1000)

┌─quantilesGK(100, 0.25, 0.5, 0.75)(plus(number, 1))─┐
│ [251,498,741] │
└────────────────────────────────────────────────────┘

SELECT quantilesGK(1000, 0.25, 0.5, 0.75)(number + 1)
FROM numbers(1000)

┌─quantilesGK(1000, 0.25, 0.5, 0.75)(plus(number, 1))─┐
│ [249,499,749] │
└─────────────────────────────────────────────────────┘
```
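
One way to judge whether a given `accuracy` is sufficient is to compare the approximation with an exact function on a sample of the data. The query below is a sketch of that idea; the aliases `approx`, `exact` and `abs_error` are illustrative, and the choice of `quantilesExact` as the baseline is an assumption made for this comparison.

``` sql
SELECT
    quantilesGK(100, 0.25, 0.5, 0.75)(number + 1) AS approx,
    quantilesExact(0.25, 0.5, 0.75)(number + 1) AS exact,
    -- element-wise absolute difference between the two arrays
    arrayMap((a, e) -> abs(toFloat64(a) - toFloat64(e)), approx, exact) AS abs_error
FROM numbers(1000);
```

Raising `accuracy` should shrink `abs_error`, at the cost of a larger aggregation state.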