Add sumMapFiltered and sumMapFilteredWithOverflow

This commit is contained in:
Blargian 2024-04-28 19:30:00 +02:00
parent 4a2facb844
commit 81412c8f6d
3 changed files with 112 additions and 6 deletions

View File

@ -507,14 +507,116 @@ HAVING uniqUpTo(4)(UserID) >= 5
## sumMapFiltered ## sumMapFiltered
Same behavior as [sumMap](../../sql-reference/aggregate-functions/reference/summap.md#agg_functions-summap) except that an array of keys is passed as a parameter. This can be especially useful when working with a high cardinality of keys. This function behaves the same as [sumMap](../../sql-reference/aggregate-functions/reference/summap.md#agg_functions-summap) except that it also accepts an array of keys to filter with as a parameter. This can be especially useful when working with a high cardinality of keys.
**Syntax** **Syntax**
`sumMapFiltered(keys_to_keep)(keys, values)` `sumMapFiltered(keys_to_keep)(keys, values)`
**Parameters**
- `keys_to_keep`: [Array](../data-types/array.md) of keys to filter with.
- `keys`: [Array](../data-types/array.md) of keys.
- `values`: [Array](../data-types/array.md) of values.
**Returned Value**
- Returns a tuple of two arrays: keys in sorted order, and values summed for the corresponding keys.
**Example**
Query:
```sql
CREATE TABLE sum_map
(
`date` Date,
`timeslot` DateTime,
`statusMap` Nested(status UInt16, requests UInt64)
)
ENGINE = Log
INSERT INTO sum_map VALUES
('2000-01-01', '2000-01-01 00:00:00', [1, 2, 3], [10, 10, 10]),
('2000-01-01', '2000-01-01 00:00:00', [3, 4, 5], [10, 10, 10]),
('2000-01-01', '2000-01-01 00:01:00', [4, 5, 6], [10, 10, 10]),
('2000-01-01', '2000-01-01 00:01:00', [6, 7, 8], [10, 10, 10]);
```
```sql
SELECT sumMapFiltered([1, 4, 8])(statusMap.status, statusMap.requests) FROM sum_map;
```
Result:
```response
┌─sumMapFiltered([1, 4, 8])(statusMap.status, statusMap.requests)─┐
1. │ ([1,4,8],[10,20,10]) │
└─────────────────────────────────────────────────────────────────┘
```
## sumMapFilteredWithOverflow ## sumMapFilteredWithOverflow
This function behaves the same as [sumMap](../../sql-reference/aggregate-functions/reference/summap.md#agg_functions-summap) except that it also accepts an array of keys to filter with as a parameter. This can be especially useful when working with a high cardinality of keys. It differs from the [sumMapFiltered](#summapfiltered) function in that it does summation with overflow - i.e. returns the same data type for the summation as the argument data type.
**Syntax**
`sumMapFilteredWithOverflow(keys_to_keep)(keys, values)`
**Parameters**
- `keys_to_keep`: [Array](../data-types/array.md) of keys to filter with.
- `keys`: [Array](../data-types/array.md) of keys.
- `values`: [Array](../data-types/array.md) of values.
**Returned Value**
- Returns a tuple of two arrays: keys in sorted order, and values summed for the corresponding keys.
**Example**
In this example we create a table `sum_map`, insert some data into it and then use both `sumMapFilteredWithOverflow` and `sumMapFiltered` and the `toTypeName` function for comparison of the result. Where `requests` was of type `UInt8` in the created table, `sumMapFiltered` has promoted the type of the summed values to `UInt64` to avoid overflow whereas `sumMapFilteredWithOverflow` has kept the type as `UInt8` which is not large enough to store the result - i.e. overflow has occured.
Query:
```sql
CREATE TABLE sum_map
(
`date` Date,
`timeslot` DateTime,
`statusMap` Nested(status UInt8, requests UInt8)
)
ENGINE = Log
INSERT INTO sum_map VALUES
('2000-01-01', '2000-01-01 00:00:00', [1, 2, 3], [10, 10, 10]),
('2000-01-01', '2000-01-01 00:00:00', [3, 4, 5], [10, 10, 10]),
('2000-01-01', '2000-01-01 00:01:00', [4, 5, 6], [10, 10, 10]),
('2000-01-01', '2000-01-01 00:01:00', [6, 7, 8], [10, 10, 10]);
```
```sql
SELECT sumMapFilteredWithOverflow([1, 4, 8])(statusMap.status, statusMap.requests) as summap_overflow, toTypeName(summap_overflow) FROM sum_map;
```
```sql
SELECT sumMapFiltered([1, 4, 8])(statusMap.status, statusMap.requests) as summap, toTypeName(summap) FROM sum_map;
```
Result:
```response
┌─sum──────────────────┬─toTypeName(sum)───────────────────┐
1. │ ([1,4,8],[10,20,10]) │ Tuple(Array(UInt8), Array(UInt8)) │
└──────────────────────┴───────────────────────────────────┘
```
```response
┌─summap───────────────┬─toTypeName(summap)─────────────────┐
1. │ ([1,4,8],[10,20,10]) │ Tuple(Array(UInt8), Array(UInt64)) │
└──────────────────────┴────────────────────────────────────┘
```
## sequenceNextNode ## sequenceNextNode
Returns a value of the next event that matched an event chain. Returns a value of the next event that matched an event chain.

View File

@ -16,8 +16,8 @@ Alias: `sumMappedArrays`.
**Arguments** **Arguments**
- `key`. [Array](../../data-types/array.md) of keys. - `key`: [Array](../../data-types/array.md) of keys.
- `value`. [Array](../../data-types/array.md) of values. - `value`: [Array](../../data-types/array.md) of values.
Passing a tuple of key and value arrays is a synonym to passing seperately an array of keys and an array of values. Passing a tuple of key and value arrays is a synonym to passing seperately an array of keys and an array of values.

View File

@ -6,7 +6,7 @@ sidebar_position: 141
# sumMapWithOverflow # sumMapWithOverflow
Totals a `value` array according to the keys specified in the `key` array. Returns a tuple of two arrays: keys in sorted order, and values summed for the corresponding keys. Totals a `value` array according to the keys specified in the `key` array. Returns a tuple of two arrays: keys in sorted order, and values summed for the corresponding keys.
It differs from the [sumMap](../reference/summap.md) function in that it does summation with overflow - i.e. returns the same data type for the summation as the argument type. It differs from the [sumMap](../reference/summap.md) function in that it does summation with overflow - i.e. returns the same data type for the summation as the argument data type.
**Syntax** **Syntax**
@ -15,8 +15,8 @@ It differs from the [sumMap](../reference/summap.md) function in that it does su
**Arguments** **Arguments**
- `key`. [Array](../../data-types/array.md) of keys. - `key`: [Array](../../data-types/array.md) of keys.
- `value`. [Array](../../data-types/array.md) of values. - `value`: [Array](../../data-types/array.md) of values.
Passing a tuple of key and value arrays is a synonym to passing seperately an array of keys and an array of values. Passing a tuple of key and value arrays is a synonym to passing seperately an array of keys and an array of values.
@ -24,6 +24,10 @@ Passing a tuple of key and value arrays is a synonym to passing seperately an ar
The number of elements in `key` and `value` must be the same for each row that is totaled. The number of elements in `key` and `value` must be the same for each row that is totaled.
::: :::
**Returned Value**
- Returns a tuple of two arrays: keys in sorted order, and values summed for the corresponding keys.
**Example** **Example**
First we create a table called `sum_map`, and insert some data into it. Arrays of keys and values are stored separately as a column called `statusMap` of [Nested](../../data-types/nested-data-structures/index.md) type, and together as a column called `statusMapTuple` of [tuple](../../data-types/tuple.md) type to illustrate the use of the two different syntaxes of this function described above. First we create a table called `sum_map`, and insert some data into it. Arrays of keys and values are stored separately as a column called `statusMap` of [Nested](../../data-types/nested-data-structures/index.md) type, and together as a column called `statusMapTuple` of [tuple](../../data-types/tuple.md) type to illustrate the use of the two different syntaxes of this function described above.