Update sumMapWithOverflow function

This commit is contained in:
Blargian 2024-04-27 20:58:34 +02:00
parent 6881f1a3aa
commit 30a0385bf0
6 changed files with 95 additions and 17 deletions

View File

@ -505,10 +505,14 @@ HAVING uniqUpTo(4)(UserID) >= 5
`uniqUpTo(4)(UserID)` calculates the number of unique `UserID` values for each `SearchPhrase`, but it only counts up to 4 unique values. If there are more than 4 unique `UserID` values for a `SearchPhrase`, the function returns 5 (4 + 1). The `HAVING` clause then filters out the `SearchPhrase` values for which the number of unique `UserID` values is less than 5. This will give you a list of search keywords that were used by at least 5 unique users.
## sumMapFiltered(keys_to_keep)(keys, values)
## sumMapFiltered
Same behavior as [sumMap](../../sql-reference/aggregate-functions/reference/summap.md#agg_functions-summap) except that an array of keys is passed as a parameter. This can be especially useful when working with a high cardinality of keys.
**Syntax**
`sumMapFiltered(keys_to_keep)(keys, values)`
## sumMapFilteredWithOverflow
## sequenceNextNode

View File

@ -66,8 +66,8 @@ ClickHouse-specific aggregate functions:
- [sumWithOverflow](/docs/en/sql-reference/aggregate-functions/reference/sumwithoverflow.md)
- [sumMap](/docs/en/sql-reference/aggregate-functions/reference/summap.md)
- [sumMapWithOverflow](/docs/en/sql-reference/aggregate-functions/reference/summapwithoverflow.md)
- [sumMapFiltered](/docs/en/sql-reference/aggregate-functions/reference/summapfiltered.md)
- [sumMapFilteredWithOverflow](/docs/en/sql-reference/aggregate-functions/reference/summapfilteredwithoverflow.md)
- [sumMapFiltered](/docs/en/sql-reference/aggregate-functions/parametric-functions.md/#summapfiltered)
- [sumMapFilteredWithOverflow](/docs/en/sql-reference/aggregate-functions/parametric-functions.md/#summapfilteredwithoverflow)
- [minMap](/docs/en/sql-reference/aggregate-functions/reference/minmap.md)
- [maxMap](/docs/en/sql-reference/aggregate-functions/reference/maxmap.md)
- [skewSamp](/docs/en/sql-reference/aggregate-functions/reference/skewsamp.md)

View File

@ -5,7 +5,7 @@ sidebar_position: 141
# sumMap
Totals a `value` array according to the keys specified in the `key` array. Returns a tuple of two arrays: keys in sorted order, and values summed for the corresponding keys.
Totals a `value` array according to the keys specified in the `key` array. Returns a tuple of two arrays: keys in sorted order, and values summed for the corresponding keys without overflow.
**Syntax**
@ -25,6 +25,10 @@ Passing a tuple of key and value arrays is a synonym to passing seperately an ar
The number of elements in `key` and `value` must be the same for each row that is totaled.
:::
**Returned Value**
- Returns a tuple of two arrays: keys in sorted order, and values summed for the corresponding keys.
**Example**
First we create a table called `sum_map`, and insert some data into it. Arrays of keys and values are stored separately as a column called `statusMap` of [Nested](../../data-types/nested-data-structures/index.md) type, and together as a column called `statusMapTuple` of [tuple](../../data-types/tuple.md) type to illustrate the use of the two different syntaxes of this function described above.
@ -73,5 +77,6 @@ Result:
```
**See Also**
- [Map combinator for Map datatype](../combinators.md#-map)
- [sumMapWithOverflow](../reference/summapwithoverflow.md)

View File

@ -1,6 +0,0 @@
---
slug: /en/sql-reference/aggregate-functions/reference/summapfiltered
sidebar_position: 141
---
# sumMapFiltered

View File

@ -1,6 +0,0 @@
---
slug: /en/sql-reference/aggregate-functions/reference/summapfilteredwithoverflow
sidebar_position: 141
---
# sumMapFilteredWithOverflow

View File

@ -5,3 +5,84 @@ sidebar_position: 141
# sumMapWithOverflow
Totals a `value` array according to the keys specified in the `key` array. Returns a tuple of two arrays: keys in sorted order, and values summed for the corresponding keys.
It differs from the [sumMap](../reference/summap.md) function in that it does summation with overflow - i.e. returns the same data type for the summation as the argument type.
**Syntax**
- `sumMapWithOverflow(key <Array>, value <Array>)` [Array type](../../data-types/array.md).
- `sumMapWithOverflow(Tuple(key <Array>, value <Array>))` [Tuple type](../../data-types/tuple.md).
**Arguments**
- `key`. [Array](../../data-types/array.md) of keys.
- `value`. [Array](../../data-types/array.md) of values.
Passing a tuple of key and value arrays is a synonym to passing seperately an array of keys and an array of values.
:::note
The number of elements in `key` and `value` must be the same for each row that is totaled.
:::
**Example**
First we create a table called `sum_map`, and insert some data into it. Arrays of keys and values are stored separately as a column called `statusMap` of [Nested](../../data-types/nested-data-structures/index.md) type, and together as a column called `statusMapTuple` of [tuple](../../data-types/tuple.md) type to illustrate the use of the two different syntaxes of this function described above.
Query:
``` sql
CREATE TABLE sum_map(
date Date,
timeslot DateTime,
statusMap Nested(
status UInt8,
requests UInt8
),
statusMapTuple Tuple(Array(Int8), Array(Int8))
) ENGINE = Log;
```
```sql
INSERT INTO sum_map VALUES
('2000-01-01', '2000-01-01 00:00:00', [1, 2, 3], [10, 10, 10], ([1, 2, 3], [10, 10, 10])),
('2000-01-01', '2000-01-01 00:00:00', [3, 4, 5], [10, 10, 10], ([3, 4, 5], [10, 10, 10])),
('2000-01-01', '2000-01-01 00:01:00', [4, 5, 6], [10, 10, 10], ([4, 5, 6], [10, 10, 10])),
('2000-01-01', '2000-01-01 00:01:00', [6, 7, 8], [10, 10, 10], ([6, 7, 8], [10, 10, 10]));
```
If we query the table using the `sumMap`, `sumMapWithOverflow` with the array type syntax, and `toTypeName` functions then we can see that
for the `sumMapWithOverflow` function, the data type of the summed values array is the same as the argument type, both `UInt8` (i.e. summation was done with overflow). For `sumMap` the data type of the summed values arrays has changed from `UInt8` to `UInt64` such that overflow does not occur.
Query:
``` sql
SELECT
timeslot,
toTypeName(sumMap(statusMap.status, statusMap.requests)),
toTypeName(sumMapWithOverflow(statusMap.status, statusMap.requests)),
FROM sum_map
GROUP BY timeslot
```
Equivalently we could have used the tuple syntax with for the same result.
``` sql
SELECT
timeslot,
toTypeName(sumMap(statusMapTuple)),
toTypeName(sumMapWithOverflow(statusMapTuple)),
FROM sum_map
GROUP BY timeslot
```
Result:
``` text
┌────────────timeslot─┬─toTypeName(sumMap(statusMap.status, statusMap.requests))─┬─toTypeName(sumMapWithOverflow(statusMap.status, statusMap.requests))─┐
1. │ 2000-01-01 00:01:00 │ Tuple(Array(UInt8), Array(UInt64)) │ Tuple(Array(UInt8), Array(UInt8)) │
2. │ 2000-01-01 00:00:00 │ Tuple(Array(UInt8), Array(UInt64)) │ Tuple(Array(UInt8), Array(UInt8)) │
└─────────────────────┴──────────────────────────────────────────────────────────┴──────────────────────────────────────────────────────────────────────┘
```
**See Also**
- [sumMap](../reference/summap.md)