---
slug: /en/sql-reference/functions/bit-functions
sidebar_position: 20
sidebar_label: Bit
---
# Bit Functions
Bit functions work for any pair of types from `UInt8`, `UInt16`, `UInt32`, `UInt64`, `Int8`, `Int16`, `Int32`, `Int64`, `Float32`, or `Float64`. Some functions support `String` and `FixedString` types.

The result type is an integer whose bit width equals the largest bit width among its arguments. If at least one of the arguments is signed, the result is a signed number. If an argument is a floating-point number, it is cast to `Int64`.
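
As a rough illustration of the rule above, the following Python sketch derives the name of the result type from the names of the argument types. The helper `bit_op_result_type` is hypothetical and only encodes the rule as it is worded here; it is not ClickHouse's implementation.

```python
import re


def bit_op_result_type(*arg_types: str) -> str:
    """Return the result type name for argument type names such as 'UInt8'."""
    widths = []
    signed = False
    for name in arg_types:
        if name.startswith("Float"):
            name = "Int64"  # floating-point arguments are cast to Int64 first
        match = re.fullmatch(r"(U?)Int(8|16|32|64)", name)
        if match is None:
            raise ValueError(f"unsupported type name: {name}")
        signed = signed or match.group(1) == ""  # no 'U' prefix means signed
        widths.append(int(match.group(2)))
    return f"{'Int' if signed else 'UInt'}{max(widths)}"
```

Under these assumptions, `bit_op_result_type("UInt8", "Int32")` gives `"Int32"`, and any call with a `Float32` or `Float64` argument gives `"Int64"`.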
## bitAnd(a, b)
## bitOr(a, b)
## bitXor(a, b)
## bitNot(a)
## bitShiftLeft(a, b)
Shifts the binary representation of a value to the left by a specified number of bit positions.
A `FixedString` or a `String` is treated as a single multibyte value.
Bits of a `FixedString` value are lost as they are shifted out. In contrast, a `String` value is extended with additional bytes, so no bits are lost.
**Syntax**
``` sql
bitShiftLeft(a, b)
```
**Arguments**
- `a` — A value to shift. [Integer types](../data-types/int-uint.md), [String](../data-types/string.md) or [FixedString](../data-types/fixedstring.md).
- `b` — The number of shift positions. [Unsigned integer types](../data-types/int-uint.md) of 64 bits or less are allowed.
**Returned value**
- Shifted value.

The type of the returned value is the same as the type of the input value.
**Example**
The following queries use the [bin](encoding-functions.md#bin) and [hex](encoding-functions.md#hex) functions to show the bits of the shifted values.
``` sql
SELECT 99 AS a, bin(a), bitShiftLeft(a, 2) AS a_shifted, bin(a_shifted);
SELECT 'abc' AS a, hex(a), bitShiftLeft(a, 4) AS a_shifted, hex(a_shifted);
SELECT toFixedString('abc', 3) AS a, hex(a), bitShiftLeft(a, 4) AS a_shifted, hex(a_shifted);
```
Result:
``` text
┌──a─┬─bin(99)──┬─a_shifted─┬─bin(bitShiftLeft(99, 2))─┐
│ 99 │ 01100011 │       140 │ 10001100                 │
└────┴──────────┴───────────┴──────────────────────────┘
┌─a───┬─hex('abc')─┬─a_shifted─┬─hex(bitShiftLeft('abc', 4))─┐
│ abc │ 616263     │ &0        │ 06162630                    │
└─────┴────────────┴───────────┴─────────────────────────────┘
┌─a───┬─hex(toFixedString('abc', 3))─┬─a_shifted─┬─hex(bitShiftLeft(toFixedString('abc', 3), 4))─┐
│ abc │ 616263                       │ &0        │ 162630                                        │
└─────┴──────────────────────────────┴───────────┴───────────────────────────────────────────────┘
```
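The difference between the three input kinds can be modeled in a few lines of Python. This is only a sketch: the helper names are hypothetical, byte strings are treated as big-endian multibyte values, and the assumption that a `String` grows by `ceil(b / 8)` bytes is inferred from the example above rather than taken from the ClickHouse source.

```python
def shift_left_uint(value: int, bits: int, width: int = 8) -> int:
    """Fixed-width unsigned integer: bits shifted past `width` are dropped."""
    return (value << bits) & ((1 << width) - 1)


def shift_left_fixed(value: bytes, bits: int) -> bytes:
    """FixedString-like: the length is preserved, overflowing bits are lost."""
    size = len(value)
    shifted = (int.from_bytes(value, "big") << bits) & ((1 << (8 * size)) - 1)
    return shifted.to_bytes(size, "big")


def shift_left_string(value: bytes, bits: int) -> bytes:
    """String-like: the value grows (assumed: by ceil(bits / 8) bytes)."""
    size = len(value) + (bits + 7) // 8
    return (int.from_bytes(value, "big") << bits).to_bytes(size, "big")
```

With these helpers, `shift_left_uint(99, 2)` is `140`, `shift_left_string(b"abc", 4).hex()` is `"06162630"`, and `shift_left_fixed(b"abc", 4).hex()` is `"162630"`, which matches the result tables above.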
## bitShiftRight(a, b)
Shifts the binary representation of a value to the right by a specified number of bit positions.
A `FixedString` or a `String` is treated as a single multibyte value. Note that the length of a `String` value is reduced as bits are shifted out.
**Syntax**
``` sql
bitShiftRight(a, b)
```
**Arguments**
- `a` — A value to shift. [Integer types](../data-types/int-uint.md), [String](../data-types/string.md) or [FixedString](../data-types/fixedstring.md).
- `b` — The number of shift positions. [Unsigned integer types](../data-types/int-uint.md) of 64 bits or less are allowed.
**Returned value**
- Shifted value.

The type of the returned value is the same as the type of the input value.
**Example**
Query:
``` sql
SELECT 101 AS a, bin(a), bitShiftRight(a, 2) AS a_shifted, bin(a_shifted);
SELECT 'abc' AS a, hex(a), bitShiftRight(a, 12) AS a_shifted, hex(a_shifted);
SELECT toFixedString('abc', 3) AS a, hex(a), bitShiftRight(a, 12) AS a_shifted, hex(a_shifted);
```
Result:
``` text
┌───a─┬─bin(101)─┬─a_shifted─┬─bin(bitShiftRight(101, 2))─┐
│ 101 │ 01100101 │        25 │ 00011001                   │
└─────┴──────────┴───────────┴────────────────────────────┘
┌─a───┬─hex('abc')─┬─a_shifted─┬─hex(bitShiftRight('abc', 12))─┐
│ abc │ 616263     │           │ 0616                          │
└─────┴────────────┴───────────┴───────────────────────────────┘
┌─a───┬─hex(toFixedString('abc', 3))─┬─a_shifted─┬─hex(bitShiftRight(toFixedString('abc', 3), 12))─┐
│ abc │ 616263                       │           │ 000616                                          │
└─────┴──────────────────────────────┴───────────┴─────────────────────────────────────────────────┘
```
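A small Python model may help to see why the two string results differ in length. It is a sketch with hypothetical helper names; byte strings are treated as big-endian multibyte values, and the assumption that a `String` shrinks by `floor(b / 8)` bytes is inferred from the example above.

```python
def shift_right_fixed(value: bytes, bits: int) -> bytes:
    """FixedString-like: the length is preserved, zeros come in from the left."""
    return (int.from_bytes(value, "big") >> bits).to_bytes(len(value), "big")


def shift_right_string(value: bytes, bits: int) -> bytes:
    """String-like: whole bytes that were shifted out are removed (assumed)."""
    size = max(len(value) - bits // 8, 0)
    return (int.from_bytes(value, "big") >> bits).to_bytes(size, "big")
```

Here `shift_right_string(b"abc", 12).hex()` is `"0616"` and `shift_right_fixed(b"abc", 12).hex()` is `"000616"`, as in the result tables above.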
## bitRotateLeft(a, b)
## bitRotateRight(a, b)
## bitSlice(s, offset, length)
Returns a substring that starts at the bit with index `offset` and is `length` bits long. Bit indexing starts from 1.
**Syntax**
``` sql
bitSlice(s, offset[, length])
```
**Arguments**
- `s` — The input value. [String](../data-types/string.md) or [FixedString](../data-types/fixedstring.md).
- `offset` — The start index, in bits. A positive value counts from the left, and a negative value counts from the right. Bit numbering starts from 1.
- `length` — The length of the substring, in bits. If you specify a negative value, the function returns an open substring \[offset, array_length - length\]. If you omit the value, the function returns the substring \[offset, the_end_string\]. If `length` extends past the end of `s`, the result is truncated. If `length` is not a multiple of 8, the result is padded with zeros on the right.
**Returned value**
- The substring. [String](../data-types/string.md)
**Example**
Query:
``` sql
SELECT bin('Hello'), bin(bitSlice('Hello', 1, 8));
SELECT bin('Hello'), bin(bitSlice('Hello', 1, 2));
SELECT bin('Hello'), bin(bitSlice('Hello', 1, 9));
SELECT bin('Hello'), bin(bitSlice('Hello', -4, 8));
```
Result:
``` text
┌─bin('Hello')─────────────────────────────┬─bin(bitSlice('Hello', 1, 8))─┐
│ 0100100001100101011011000110110001101111 │ 01001000                     │
└──────────────────────────────────────────┴──────────────────────────────┘
┌─bin('Hello')─────────────────────────────┬─bin(bitSlice('Hello', 1, 2))─┐
│ 0100100001100101011011000110110001101111 │ 01000000                     │
└──────────────────────────────────────────┴──────────────────────────────┘
┌─bin('Hello')─────────────────────────────┬─bin(bitSlice('Hello', 1, 9))─┐
│ 0100100001100101011011000110110001101111 │ 0100100000000000             │
└──────────────────────────────────────────┴──────────────────────────────┘
┌─bin('Hello')─────────────────────────────┬─bin(bitSlice('Hello', -4, 8))─┐
│ 0100100001100101011011000110110001101111 │ 11110000                      │
└──────────────────────────────────────────┴───────────────────────────────┘
```
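The offset, truncation, and padding rules can be traced with a short Python model. This is a sketch under stated assumptions: `bit_slice` is a hypothetical helper, it covers only an omitted or non-negative `length`, and it rejects an `offset` of 0 or one that falls outside the value, because the behavior for those cases is not described here.

```python
from typing import Optional


def bit_slice(s: bytes, offset: int, length: Optional[int] = None) -> bytes:
    """Model of bitSlice for in-range offsets and non-negative lengths."""
    bits = "".join(f"{byte:08b}" for byte in s)
    if offset > 0:
        start = offset - 1            # 1-based index counted from the left
    elif offset < 0:
        start = len(bits) + offset    # negative index counted from the right
    else:
        raise ValueError("bit indexing starts from 1")
    if not 0 <= start <= len(bits):
        raise ValueError("offset outside the value is not modeled here")
    if length is None:
        end = len(bits)               # omitted length: slice to the end
    elif length < 0:
        raise NotImplementedError("negative length is not modeled here")
    else:
        end = min(start + length, len(bits))  # truncate at the end of s
    piece = bits[start:end]
    piece += "0" * (-len(piece) % 8)  # pad with zeros on the right
    return bytes(int(piece[i:i + 8], 2) for i in range(0, len(piece), 8))
```

For instance, `bit_slice(b"Hello", -4, 8)` keeps the last four bits `1111`, truncates the requested eight bits to those four, and pads the result to `11110000`.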
## byteSlice(s, offset, length)
See function [substring](string-functions.md#substring).
## bitTest
Takes any integer, converts it to [binary form](https://en.wikipedia.org/wiki/Binary_number), and returns the value of the bit at the specified position. Bits are counted from right to left, starting at 0.
**Syntax**
``` sql
SELECT bitTest(number, index)
```
**Arguments**
- `number` – Integer number.
- `index` – Position of the bit.
**Returned value**
- Value of the bit at the specified position. [UInt8](../data-types/int-uint.md).
**Example**
For example, the number 43 is 101011 in the base-2 (binary) numeral system.
Query:
``` sql
SELECT bitTest(43, 1);
```
Result:
``` text
┌─bitTest(43, 1)─┐
│              1 │
└────────────────┘
```
Another example:
Query:
``` sql
SELECT bitTest(43, 2);
```
Result:
``` text
┌─bitTest(43, 2)─┐
│              0 │
└────────────────┘
```
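The same check can be written with a shift and a mask. The Python function below is an illustrative model with a hypothetical name, not the ClickHouse implementation.

```python
def bit_test(number: int, index: int) -> int:
    """Return the bit of `number` at position `index`, counted from the right."""
    return (number >> index) & 1


# 43 is 101011 in binary: positions 0, 1, 3 and 5 are set.
set_positions = [i for i in range(6) if bit_test(43, i)]
```

After running this, `set_positions` holds `[0, 1, 3, 5]`, which agrees with both query results above.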
## bitTestAll
Returns the result of the [logical conjunction](https://en.wikipedia.org/wiki/Logical_conjunction) (AND operator) of all bits at the given positions. Bits are counted from right to left, starting at 0.

The conjunction for bitwise operations:

0 AND 0 = 0

0 AND 1 = 0

1 AND 0 = 0

1 AND 1 = 1
**Syntax**
``` sql
SELECT bitTestAll(number, index1, index2, index3, index4, ...)
```
**Arguments**
- `number` – Integer number.
- `index1`, `index2`, `index3`, `index4` – Positions of bits. For example, for the set of positions (`index1`, `index2`, `index3`, `index4`), the result is true if and only if the bits at all of these positions are true (`index1` ⋀ `index2` ⋀ `index3` ⋀ `index4`).
**Returned value**
- Result of the logical conjunction. [UInt8](../data-types/int-uint.md).
**Example**
For example, the number 43 is 101011 in the base-2 (binary) numeral system.
Query:
``` sql
SELECT bitTestAll(43, 0, 1, 3, 5);
```
Result:
``` text
┌─bitTestAll(43, 0, 1, 3, 5)─┐
│                          1 │
└────────────────────────────┘
```
Another example:
Query:
``` sql
SELECT bitTestAll(43, 0, 1, 3, 5, 2);
```
Result:
``` text
┌─bitTestAll(43, 0, 1, 3, 5, 2)─┐
│                             0 │
└───────────────────────────────┘
```
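One way to picture the conjunction is to combine the positions into a single mask and check that every masked bit is set. The Python function below is a sketch with a hypothetical name.

```python
def bit_test_all(number: int, *indexes: int) -> int:
    """Return 1 if the bits at all given positions are set, otherwise 0."""
    mask = 0
    for index in indexes:
        mask |= 1 << index
    return int((number & mask) == mask)
```

For 43 (101011), the positions 0, 1, 3 and 5 build the mask 101011, so the result is 1. Adding position 2 builds the mask 101111, which 43 does not cover, so the result is 0.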
## bitTestAny
Returns the result of the [logical disjunction](https://en.wikipedia.org/wiki/Logical_disjunction) (OR operator) of all bits at the given positions. Bits are counted from right to left, starting at 0.

The disjunction for bitwise operations:

0 OR 0 = 0

0 OR 1 = 1

1 OR 0 = 1

1 OR 1 = 1
**Syntax**
``` sql
SELECT bitTestAny(number, index1, index2, index3, index4, ...)
```
**Arguments**
- `number` – Integer number.
- `index1`, `index2`, `index3`, `index4` – Positions of bits.
**Returned value**
- Result of the logical disjunction. [UInt8](../data-types/int-uint.md).
**Example**
For example, the number 43 is 101011 in the base-2 (binary) numeral system.
Query:
``` sql
SELECT bitTestAny(43, 0, 2);
```
Result:
``` text
┌─bitTestAny(43, 0, 2)─┐
│                    1 │
└──────────────────────┘
```
Another example:
Query:
``` sql
SELECT bitTestAny(43, 4, 2);
```
Result:
``` text
┌─bitTestAny(43, 4, 2)─┐
│                    0 │
└──────────────────────┘
```
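The disjunction can be pictured the same way: build a mask from the positions and check whether any masked bit is set. The Python function below is a sketch with a hypothetical name.

```python
def bit_test_any(number: int, *indexes: int) -> int:
    """Return 1 if the bit at any of the given positions is set, otherwise 0."""
    mask = 0
    for index in indexes:
        mask |= 1 << index
    return int((number & mask) != 0)
```

For 43 (101011), position 0 is set, so positions (0, 2) give 1. Positions 4 and 2 are both clear, so (4, 2) gives 0.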
## bitCount
Calculates the number of bits set to one in the binary representation of a number.
**Syntax**
``` sql
bitCount(x)
```
**Arguments**
- `x` — [Integer](../data-types/int-uint.md) or [floating-point](../data-types/float.md) number. The function uses the in-memory representation of the value, which is why floating-point numbers are supported.
**Returned value**
- Number of bits set to one in the input number. [UInt8](../data-types/int-uint.md).
:::note
The function does not convert the input value to a larger type ([sign extension](https://en.wikipedia.org/wiki/Sign_extension)). So, for example, `bitCount(toUInt8(-1)) = 8`.
:::
**Example**
Take the number 333 as an example. Its binary representation is 0000000101001101.
Query:
``` sql
SELECT bitCount(333);
```
Result:
``` text
┌─bitCount(333)─┐
│             5 │
└───────────────┘
```
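Counting bits of the in-memory representation can be modeled in Python as follows. This is a sketch with hypothetical helper names: the integer width is passed explicitly because Python integers have no fixed width, and the floating-point helper assumes an IEEE 754 `Float64` layout.

```python
import struct


def bit_count_int(x: int, width: int) -> int:
    """Count set bits of `x` viewed as a two's-complement value of `width` bits."""
    return bin(x & ((1 << width) - 1)).count("1")


def bit_count_float64(x: float) -> int:
    """Count set bits in the 8-byte IEEE 754 representation of `x`."""
    (raw,) = struct.unpack("<Q", struct.pack("<d", x))
    return bin(raw).count("1")
```

`bit_count_int(333, 16)` is 5, and `bit_count_int(-1, 8)` is 8, which mirrors the note about `bitCount(toUInt8(-1))`. `bit_count_float64(1.0)` is 10 because the bit pattern of 1.0 is `0x3FF0000000000000`.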
## bitHammingDistance
Returns the [Hamming Distance](https://en.wikipedia.org/wiki/Hamming_distance) between the bit representations of two integer values. Can be used with [SimHash](../../sql-reference/functions/hash-functions.md#ngramsimhash) functions to detect semi-duplicate strings. The smaller the distance, the more likely the strings are the same.
**Syntax**
``` sql
bitHammingDistance(int1, int2)
```
**Arguments**
- `int1` — First integer value. [Int64](../data-types/int-uint.md).
- `int2` — Second integer value. [Int64](../data-types/int-uint.md).
**Returned value**
- The Hamming distance. [UInt8](../data-types/int-uint.md).
**Examples**
Query:
``` sql
SELECT bitHammingDistance(111, 121);
```
Result:
``` text
┌─bitHammingDistance(111, 121)─┐
│                            3 │
└──────────────────────────────┘
```
With [SimHash](../../sql-reference/functions/hash-functions.md#ngramsimhash):
``` sql
SELECT bitHammingDistance(ngramSimHash('cat ate rat'), ngramSimHash('rat ate cat'));
```
Result:
``` text
┌─bitHammingDistance(ngramSimHash('cat ate rat'), ngramSimHash('rat ate cat'))─┐
│                                                                            5 │
└──────────────────────────────────────────────────────────────────────────────┘
```
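The distance itself is the number of set bits in the XOR of the two values. The Python function below is a sketch with a hypothetical name; the 64-bit width is an assumption taken from the `Int64` argument types listed above.

```python
def bit_hamming_distance(a: int, b: int, width: int = 64) -> int:
    """Count the bit positions in which `a` and `b` differ."""
    mask = (1 << width) - 1  # view both values as `width`-bit patterns
    return bin((a ^ b) & mask).count("1")
```

For 111 (1101111) and 121 (1111001) the XOR is 0010110, which has three set bits, in line with the first query result above.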