ClickHouse/docs/en/sql-reference/functions/tuple-functions.md

244 lines
6.6 KiB
Markdown
Raw Normal View History

---
toc_priority: 66
toc_title: Tuples
---
# Functions for Working with Tuples {#tuple-functions}
## tuple {#tuple}
A function that allows grouping multiple columns.
For columns with the types T1, T2, …, it returns a Tuple(T1, T2, …) type tuple containing these columns. There is no cost to execute the function.
Tuples are normally used as intermediate values for an argument of IN operators, or for creating a list of formal parameters of lambda functions. Tuples cant be written to a table.
The function implements the operator `(x, y, …)`.
**Syntax**
``` sql
tuple(x, y, …)
```
## tupleElement {#tupleelement}
A function that allows getting a column from a tuple.
N is the column index, starting from 1. N must be a constant. N must be a constant. N must be a strict postive integer no greater than the size of the tuple.
There is no cost to execute the function.
The function implements the operator `x.N`.
**Syntax**
``` sql
tupleElement(tuple, n)
```
## untuple {#untuple}
Performs syntactic substitution of [tuple](../../sql-reference/data-types/tuple.md#tuplet1-t2) elements in the call location.
**Syntax**
``` sql
untuple(x)
```
You can use the `EXCEPT` expression to skip columns as a result of the query.
**Arguments**
- `x` — A `tuple` function, column, or tuple of elements. [Tuple](../../sql-reference/data-types/tuple.md).
**Returned value**
- None.
**Examples**
Input table:
``` text
┌─key─┬─v1─┬─v2─┬─v3─┬─v4─┬─v5─┬─v6────────┐
│ 1 │ 10 │ 20 │ 40 │ 30 │ 15 │ (33,'ab') │
│ 2 │ 25 │ 65 │ 70 │ 40 │ 6 │ (44,'cd') │
│ 3 │ 57 │ 30 │ 20 │ 10 │ 5 │ (55,'ef') │
│ 4 │ 55 │ 12 │ 7 │ 80 │ 90 │ (66,'gh') │
│ 5 │ 30 │ 50 │ 70 │ 25 │ 55 │ (77,'kl') │
└─────┴────┴────┴────┴────┴────┴───────────┘
```
Example of using a `Tuple`-type column as the `untuple` function parameter:
Query:
``` sql
SELECT untuple(v6) FROM kv;
```
Result:
``` text
┌─_ut_1─┬─_ut_2─┐
│ 33 │ ab │
│ 44 │ cd │
│ 55 │ ef │
│ 66 │ gh │
│ 77 │ kl │
└───────┴───────┘
```
2021-07-10 20:12:32 +00:00
Note: the names are implementation specific and are subject to change. You should not assume specific names of the columns after application of the `untuple`.
Example of using an `EXCEPT` expression:
Query:
``` sql
SELECT untuple((* EXCEPT (v2, v3),)) FROM kv;
```
Result:
``` text
┌─key─┬─v1─┬─v4─┬─v5─┬─v6────────┐
│ 1 │ 10 │ 30 │ 15 │ (33,'ab') │
│ 2 │ 25 │ 40 │ 6 │ (44,'cd') │
│ 3 │ 57 │ 10 │ 5 │ (55,'ef') │
│ 4 │ 55 │ 80 │ 90 │ (66,'gh') │
│ 5 │ 30 │ 25 │ 55 │ (77,'kl') │
└─────┴────┴────┴────┴───────────┘
```
**See Also**
- [Tuple](../../sql-reference/data-types/tuple.md)
## tupleHammingDistance {#tuplehammingdistance}
Returns the [Hamming Distance](https://en.wikipedia.org/wiki/Hamming_distance) between two tuples of the same size.
**Syntax**
``` sql
tupleHammingDistance(tuple1, tuple2)
```
**Arguments**
- `tuple1` — First tuple. [Tuple](../../sql-reference/data-types/tuple.md).
- `tuple2` — Second tuple. [Tuple](../../sql-reference/data-types/tuple.md).
Tuples should have the same type of the elements.
**Returned value**
- The Hamming distance.
Type: [UInt8](../../sql-reference/data-types/int-uint.md).
**Examples**
Query:
``` sql
SELECT tupleHammingDistance((1, 2, 3), (3, 2, 1)) AS HammingDistance;
```
Result:
``` text
┌─HammingDistance─┐
│ 2 │
└─────────────────┘
```
Can be used with [MinHash](../../sql-reference/functions/hash-functions.md#ngramminhash) functions for detection of semi-duplicate strings:
``` sql
2021-05-27 19:49:41 +00:00
SELECT tupleHammingDistance(wordShingleMinHash(string), wordShingleMinHashCaseInsensitive(string)) as HammingDistance FROM (SELECT 'ClickHouse is a column-oriented database management system for online analytical processing of queries.' AS string);
```
Result:
``` text
┌─HammingDistance─┐
│ 2 │
└─────────────────┘
```
2021-10-25 11:18:51 +00:00
## tupleToNameValuePairs {#tupletonamevaluepairs}
2021-10-25 21:01:31 +00:00
Turns a named tuple into an array of (name, value) pairs. For a `Tuple(a T, b T, ..., c T)` returns `Array(Tuple(String, T), ...)`
in which the `Strings` represents the named fields of the tuple and the `T`s are the values associated with those names. All values in the tuple should be of the same type.
2021-10-25 11:18:51 +00:00
**Syntax**
``` sql
2021-10-25 21:01:31 +00:00
tupleToNameValuePairs(tuple)
2021-10-25 11:18:51 +00:00
```
**Arguments**
2021-10-25 21:01:31 +00:00
- `tuple` — Named tuple. [Tuple](../../sql-reference/data-types/tuple.md) with any type of values.
2021-10-25 11:18:51 +00:00
**Returned value**
2021-10-25 21:01:31 +00:00
- An array with (name, value) pairs.
2021-10-25 11:18:51 +00:00
2021-10-25 21:01:31 +00:00
Type: [Array](../../sql-reference/data-types/array.md)([Tuple](../../sql-reference/data-types/tuple.md)([String](../../sql-reference/data-types/string.md), ...)).
2021-10-25 11:18:51 +00:00
**Example**
2021-10-25 21:01:31 +00:00
Query:
``` sql
CREATE TABLE tupletest (`col` Tuple(user_ID UInt64, session_ID UInt64) ENGINE = Memory;
INSERT INTO tupletest VALUES (tuple( 100, 2502)), (tuple(1,100));
SELECT tupleToNameValuePairs(col) FROM tupletest;
```
Result:
``` text
┌─tupleToNameValuePairs(col)────────────┐
│ [('user_ID',100),('session_ID',2502)] │
│ [('user_ID',1),('session_ID',100)] │
└───────────────────────────────────────┘
```
It is possible to transform colums to rows using this function:
``` sql
CREATE TABLE tupletest (`col` Tuple(CPU Float64, Memory Float64, Disk Float64)) ENGINE = Memory;
INSERT INTO tupletest VALUES(tuple(3.3, 5.5, 6.6));
SELECT arrayJoin(tupleToNameValuePairs(col))FROM tupletest;
```
Result:
``` text
┌─arrayJoin(tupleToNameValuePairs(col))─┐
│ ('CPU',3.3) │
│ ('Memory',5.5) │
│ ('Disk',6.6) │
└───────────────────────────────────────┘
```
If you pass a simple tuple to the function, ClickHouse uses the indexes of the values as their names:
``` sql
SELECT tupleToNameValuePairs(tuple(3, 2, 1));
```
Result:
``` text
┌─tupleToNameValuePairs(tuple(3, 2, 1))─┐
│ [('1',3),('2',2),('3',1)] │
└───────────────────────────────────────┘
```