---
toc_priority: 49
toc_title: Bitmap
---
# Bitmap functions {#bitmap-functions}
Bitmap functions perform calculations on bitmap objects, typically two at a time, and return either a new bitmap or a cardinality, using operations such as and, or, xor, and not.
There are two ways to construct a bitmap object: with the aggregate function `groupBitmap` combined with the `-State` suffix, or from an array object. A bitmap object can also be converted back to an array object.
Bitmap objects are stored in a data structure that wraps RoaringBitmap. When the cardinality is less than or equal to 32, a Set object is used. When the cardinality is greater than 32, a RoaringBitmap object is used. That is why storing low-cardinality sets is faster.
For more information on RoaringBitmap, see: [CRoaring](https://github.com/RoaringBitmap/CRoaring).
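As a minimal sketch of the aggregation route, the query below builds a bitmap with `groupBitmapState` (the `groupBitmap` aggregate function with the `-State` suffix) and converts it back to an array. The `numbers(5)` table function and the `toUInt32` cast are assumptions chosen only to supply some input values; any column of unsigned integers should work the same way.

``` sql
-- groupBitmapState keeps the bitmap object instead of finalizing it to a count.
SELECT bitmapToArray(groupBitmapState(toUInt32(number))) AS res
FROM numbers(5)
```

If it behaves as described above, `res` should contain the values 0 through 4.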
## bitmapBuild {#bitmap_functions-bitmapbuild}
Builds a bitmap from an unsigned integer array.
``` sql
bitmapBuild(array)
```
**Parameters**
- `array` – unsigned integer array.
**Example**
``` sql
SELECT bitmapBuild([1, 2, 3, 4, 5]) AS res, toTypeName(res)
```
``` text
┌─res─┬─toTypeName(bitmapBuild([1, 2, 3, 4, 5]))─────┐
│     │ AggregateFunction(groupBitmap, UInt8)        │
└─────┴──────────────────────────────────────────────┘
```
## bitmapToArray {#bitmaptoarray}
Converts a bitmap to an integer array.
``` sql
bitmapToArray(bitmap)
```
**Parameters**
- `bitmap` – bitmap object.
**Example**
``` sql
SELECT bitmapToArray(bitmapBuild([1, 2, 3, 4, 5])) AS res
```
``` text
┌─res─────────┐
│ [1,2,3,4,5] │
└─────────────┘
```
## bitmapSubsetInRange {#bitmap-functions-bitmapsubsetinrange}
Returns the subset of a bitmap within the specified range (`range_end` is excluded).
``` sql
bitmapSubsetInRange(bitmap, range_start, range_end)
```
**Parameters**
- `bitmap` – [Bitmap object](#bitmap_functions-bitmapbuild).
- `range_start` – range start point. Type: [UInt32](../../sql_reference/data_types/int_uint.md).
- `range_end` – range end point (excluded). Type: [UInt32](../../sql_reference/data_types/int_uint.md).
**Example**
``` sql
SELECT bitmapToArray(bitmapSubsetInRange(bitmapBuild([0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,100,200,500]), toUInt32(30), toUInt32(200))) AS res
```
``` text
┌─res───────────────┐
│ [30,31,32,33,100] │
└───────────────────┘
```
## bitmapSubsetLimit {#bitmapsubsetlimit}
Creates a subset of a bitmap that starts at `range_start` and contains at most `cardinality_limit` elements.
**Syntax**
``` sql
bitmapSubsetLimit(bitmap, range_start, cardinality_limit)
```
**Parameters**
- `bitmap` – [Bitmap object](#bitmap_functions-bitmapbuild).
- `range_start` – The subset starting point. Type: [UInt32](../../sql_reference/data_types/int_uint.md).
- `cardinality_limit` – The subset cardinality upper limit. Type: [UInt32](../../sql_reference/data_types/int_uint.md).
**Returned value**
The subset.

Type: `Bitmap object`.
**Example**
Query:
``` sql
SELECT bitmapToArray(bitmapSubsetLimit(bitmapBuild([0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,100,200,500]), toUInt32(30), toUInt32(200))) AS res
```
Result:
``` text
┌─res───────────────────────┐
│ [30,31,32,33,100,200,500] │
└───────────────────────────┘
```
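In the example above the limit (200) is larger than the number of matching elements, so it never takes effect. The following sketch uses a limit smaller than the number of elements at or above `range_start`; the input values are illustrative.

``` sql
-- Elements at or above 30 are 30, 31, 32, 33 and 100; the limit keeps at most 3 of them.
SELECT bitmapToArray(bitmapSubsetLimit(bitmapBuild([28,29,30,31,32,33,100]), toUInt32(30), toUInt32(3))) AS res
```

Assuming elements are taken in ascending order starting at `range_start`, the expected result is `[30,31,32]`.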
## bitmapContains {#bitmap_functions-bitmapcontains}
Checks whether the bitmap contains an element.
``` sql
bitmapContains(haystack, needle)
```
**Parameters**
- `haystack` – [Bitmap object](#bitmap_functions-bitmapbuild), where the function searches.
- `needle` – Value that the function searches for. Type: [UInt32](../../sql_reference/data_types/int_uint.md).
**Returned values**
- 0 – If `haystack` doesn't contain `needle`.
- 1 – If `haystack` contains `needle`.

Type: `UInt8`.
**Example**
``` sql
SELECT bitmapContains(bitmapBuild([1,5,7,9]), toUInt32(9)) AS res
```
``` text
┌─res─┐
│   1 │
└─────┘
```
## bitmapHasAny {#bitmaphasany}
Checks whether two bitmaps have at least one element in common.
``` sql
bitmapHasAny(bitmap1, bitmap2)
```
If you are sure that `bitmap2` contains exactly one element, consider using the [bitmapContains](#bitmap_functions-bitmapcontains) function. It works more efficiently.
**Parameters**
- `bitmap*` – bitmap object.
**Returned values**
- `1`, if `bitmap1` and `bitmap2` have at least one element in common.
- `0`, otherwise.
**Example**
``` sql
SELECT bitmapHasAny(bitmapBuild([1,2,3]),bitmapBuild([3,4,5])) AS res
```
``` text
┌─res─┐
│   1 │
└─────┘
```
## bitmapHasAll {#bitmaphasall}
Analogous to `hasAll(array, array)`: returns 1 if the first bitmap contains all the elements of the second one, and 0 otherwise.
If the second argument is an empty bitmap, returns 1.
``` sql
bitmapHasAll(bitmap,bitmap)
```
**Parameters**
- `bitmap` – bitmap object.
**Example**
``` sql
SELECT bitmapHasAll(bitmapBuild([1,2,3]),bitmapBuild([3,4,5])) AS res
```
``` text
┌─res─┐
│   0 │
└─────┘
```
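The empty-bitmap rule can be checked with a sketch like the following. It assumes that `bitmapBuild` accepts `emptyArrayUInt8()` as a way to create an empty bitmap with the same element type as the first argument.

``` sql
-- An empty second bitmap is contained in any bitmap, so this is expected to return 1.
SELECT bitmapHasAll(bitmapBuild([1,2,3]), bitmapBuild(emptyArrayUInt8())) AS res
```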
## bitmapCardinality {#bitmapcardinality}
Returns the cardinality of a bitmap, as a value of type UInt64.
``` sql
bitmapCardinality(bitmap)
```
**Parameters**
- `bitmap` – bitmap object.
**Example**
``` sql
SELECT bitmapCardinality(bitmapBuild([1, 2, 3, 4, 5])) AS res
```
``` text
┌─res─┐
│   5 │
└─────┘
```
## bitmapMin {#bitmapmin}
Returns the smallest value in the set, as a value of type UInt64, or UINT32\_MAX if the set is empty.
``` sql
bitmapMin(bitmap)
```
**Parameters**
- `bitmap` – bitmap object.
**Example**
``` sql
SELECT bitmapMin(bitmapBuild([1, 2, 3, 4, 5])) AS res
```
``` text
┌─res─┐
│   1 │
└─────┘
```
## bitmapMax {#bitmapmax}
Returns the greatest value in the set, as a value of type UInt64, or 0 if the set is empty.
``` sql
bitmapMax(bitmap)
```
**Parameters**
- `bitmap` – bitmap object.
**Example**
``` sql
SELECT bitmapMax(bitmapBuild([1, 2, 3, 4, 5])) AS res
```
``` text
┌─res─┐
│   5 │
└─────┘
```
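The empty-set defaults of `bitmapMin` and `bitmapMax` can be observed with a sketch like this one. It assumes that `bitmapBuild(emptyArrayUInt32())` yields an empty bitmap; the alias `empty_bitmap` is only illustrative. Per the descriptions of the two functions, the minimum should be UINT32\_MAX (4294967295) and the maximum 0.

``` sql
-- Both functions are applied to the same empty bitmap.
WITH bitmapBuild(emptyArrayUInt32()) AS empty_bitmap
SELECT bitmapMin(empty_bitmap) AS min_of_empty, bitmapMax(empty_bitmap) AS max_of_empty
```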
## bitmapTransform {#bitmaptransform}
Replaces the values in a bitmap that appear in one array with the corresponding values from another array. The result is a new bitmap.
``` sql
bitmapTransform(bitmap, from_array, to_array)
```
**Parameters**
- `bitmap` – bitmap object.
- `from_array` – UInt32 array. For each idx in the range \[0, from\_array.size()), if the bitmap contains from\_array\[idx\], it is replaced with to\_array\[idx\]. Note that the result depends on array ordering if there are common elements between from\_array and to\_array.
- `to_array` – UInt32 array. Its size must be the same as that of from\_array.
**Example**
``` sql
SELECT bitmapToArray(bitmapTransform(bitmapBuild([1, 2, 3, 4, 5, 6, 7, 8, 9, 10]), cast([5,999,2] as Array(UInt32)), cast([2,888,20] as Array(UInt32)))) AS res
```
``` text
┌─res───────────────────┐
│ [1,3,4,6,7,8,9,10,20] │
└───────────────────────┘
```
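Because replacements are applied in array order, reordering the same pairs can change the result. The sketch below reuses the pairs from the example above (5 to 2, 999 to 888, 2 to 20) but lists the 2 to 20 pair first; the reordering is an illustration and not part of the original example.

``` sql
-- 2 is replaced by 20 first, then 5 is replaced by 2, so 2 should remain in the result.
SELECT bitmapToArray(bitmapTransform(bitmapBuild([1, 2, 3, 4, 5, 6, 7, 8, 9, 10]), cast([2,5,999] as Array(UInt32)), cast([20,2,888] as Array(UInt32)))) AS res
```

Following the rule in the `from_array` description, the result would be expected to be `[1,2,3,4,6,7,8,9,10,20]`, whereas the original ordering yields `[1,3,4,6,7,8,9,10,20]`.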
## bitmapAnd {#bitmapand}
Computes the logical AND (intersection) of two bitmaps. The result is a new bitmap.
``` sql
bitmapAnd(bitmap,bitmap)
```
**Parameters**
- `bitmap` – bitmap object.
**Example**
``` sql
SELECT bitmapToArray(bitmapAnd(bitmapBuild([1,2,3]),bitmapBuild([3,4,5]))) AS res
```
``` text
┌─res─┐
│ [3] │
└─────┘
```
## bitmapOr {#bitmapor}
Computes the logical OR (union) of two bitmaps. The result is a new bitmap.
``` sql
bitmapOr(bitmap,bitmap)
```
**Parameters**
- `bitmap` – bitmap object.
**Example**
``` sql
SELECT bitmapToArray(bitmapOr(bitmapBuild([1,2,3]),bitmapBuild([3,4,5]))) AS res
```
``` text
┌─res─────────┐
│ [1,2,3,4,5] │
└─────────────┘
```
## bitmapXor {#bitmapxor}
Computes the logical XOR (symmetric difference) of two bitmaps. The result is a new bitmap.
``` sql
bitmapXor(bitmap,bitmap)
```
**Parameters**
- `bitmap` – bitmap object.
**Example**
``` sql
SELECT bitmapToArray(bitmapXor(bitmapBuild([1,2,3]),bitmapBuild([3,4,5]))) AS res
```
``` text
┌─res───────┐
│ [1,2,4,5] │
└───────────┘
```
## bitmapAndnot {#bitmapandnot}
Computes the AND-NOT (difference) of two bitmaps: the elements of the first bitmap that are not in the second. The result is a new bitmap.
``` sql
bitmapAndnot(bitmap,bitmap)
```
**Parameters**
- `bitmap` – bitmap object.
**Example**
``` sql
SELECT bitmapToArray(bitmapAndnot(bitmapBuild([1,2,3]),bitmapBuild([3,4,5]))) AS res
```
``` text
┌─res───┐
│ [1,2] │
└───────┘
```
## bitmapAndCardinality {#bitmapandcardinality}
Computes the logical AND of two bitmaps and returns the cardinality of the result, as a value of type UInt64.
``` sql
bitmapAndCardinality(bitmap,bitmap)
```
**Parameters**
- `bitmap` – bitmap object.
**Example**
``` sql
SELECT bitmapAndCardinality(bitmapBuild([1,2,3]),bitmapBuild([3,4,5])) AS res;
```
``` text
┌─res─┐
│   1 │
└─────┘
```
## bitmapOrCardinality {#bitmaporcardinality}
Computes the logical OR of two bitmaps and returns the cardinality of the result, as a value of type UInt64.
``` sql
bitmapOrCardinality(bitmap,bitmap)
```
**Parameters**
- `bitmap` – bitmap object.
**Example**
``` sql
SELECT bitmapOrCardinality(bitmapBuild([1,2,3]),bitmapBuild([3,4,5])) AS res;
```
``` text
┌─res─┐
│   5 │
└─────┘
```
## bitmapXorCardinality {#bitmapxorcardinality}
Computes the logical XOR of two bitmaps and returns the cardinality of the result, as a value of type UInt64.
``` sql
bitmapXorCardinality(bitmap,bitmap)
```
**Parameters**
- `bitmap` – bitmap object.
**Example**
``` sql
SELECT bitmapXorCardinality(bitmapBuild([1,2,3]),bitmapBuild([3,4,5])) AS res;
```
``` text
┌─res─┐
│   4 │
└─────┘
```
## bitmapAndnotCardinality {#bitmapandnotcardinality}
Computes the AND-NOT (difference) of two bitmaps and returns the cardinality of the result, as a value of type UInt64.
``` sql
bitmapAndnotCardinality(bitmap,bitmap)
```
**Parameters**
- `bitmap` – bitmap object.
**Example**
``` sql
SELECT bitmapAndnotCardinality(bitmapBuild([1,2,3]),bitmapBuild([3,4,5])) AS res;
```
``` text
┌─res─┐
│   2 │
└─────┘
```
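Each `*Cardinality` function should agree with `bitmapCardinality` applied to the corresponding bitmap-returning function. The sketch below compares the two forms for all four operations. It assumes the `WITH <expression> AS <alias>` syntax is available for naming the inputs; the aliases `a` and `b` are illustrative.

``` sql
-- Each column compares the one-step function with the nested two-step form.
WITH bitmapBuild([1,2,3]) AS a, bitmapBuild([3,4,5]) AS b
SELECT
    bitmapAndCardinality(a, b) = bitmapCardinality(bitmapAnd(a, b)) AS and_matches,
    bitmapOrCardinality(a, b) = bitmapCardinality(bitmapOr(a, b)) AS or_matches,
    bitmapXorCardinality(a, b) = bitmapCardinality(bitmapXor(a, b)) AS xor_matches,
    bitmapAndnotCardinality(a, b) = bitmapCardinality(bitmapAndnot(a, b)) AS andnot_matches
```

Each column is expected to be 1. The one-step form also spares writing the nested call.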
[Original article](https://clickhouse.tech/docs/en/query_language/functions/bitmap_functions/) <!--hide-->