23 KiB
sidebar_position | sidebar_label |
---|---|
66 | Tuples |
Functions for Working with Tuples
tuple
A function that allows grouping multiple columns. For columns with the types T1, T2, …, it returns a Tuple(T1, T2, …) type tuple containing these columns. There is no cost to execute the function. Tuples are normally used as intermediate values for an argument of IN operators, or for creating a list of formal parameters of lambda functions. Tuples can’t be written to a table.
The function implements the operator (x, y, …)
.
Syntax
tuple(x, y, …)
tupleElement
A function that allows getting a column from a tuple. ‘N’ is the column index, starting from 1. ‘N’ must be a constant. ‘N’ must be a strict postive integer no greater than the size of the tuple. There is no cost to execute the function.
The function implements the operator x.N
.
Syntax
tupleElement(tuple, n)
untuple
Performs syntactic substitution of tuple elements in the call location.
Syntax
untuple(x)
You can use the EXCEPT
expression to skip columns as a result of the query.
Arguments
x
— Atuple
function, column, or tuple of elements. Tuple.
Returned value
- None.
Examples
Input table:
┌─key─┬─v1─┬─v2─┬─v3─┬─v4─┬─v5─┬─v6────────┐
│ 1 │ 10 │ 20 │ 40 │ 30 │ 15 │ (33,'ab') │
│ 2 │ 25 │ 65 │ 70 │ 40 │ 6 │ (44,'cd') │
│ 3 │ 57 │ 30 │ 20 │ 10 │ 5 │ (55,'ef') │
│ 4 │ 55 │ 12 │ 7 │ 80 │ 90 │ (66,'gh') │
│ 5 │ 30 │ 50 │ 70 │ 25 │ 55 │ (77,'kl') │
└─────┴────┴────┴────┴────┴────┴───────────┘
Example of using a Tuple
-type column as the untuple
function parameter:
Query:
SELECT untuple(v6) FROM kv;
Result:
┌─_ut_1─┬─_ut_2─┐
│ 33 │ ab │
│ 44 │ cd │
│ 55 │ ef │
│ 66 │ gh │
│ 77 │ kl │
└───────┴───────┘
Note: the names are implementation specific and are subject to change. You should not assume specific names of the columns after application of the untuple
.
Example of using an EXCEPT
expression:
Query:
SELECT untuple((* EXCEPT (v2, v3),)) FROM kv;
Result:
┌─key─┬─v1─┬─v4─┬─v5─┬─v6────────┐
│ 1 │ 10 │ 30 │ 15 │ (33,'ab') │
│ 2 │ 25 │ 40 │ 6 │ (44,'cd') │
│ 3 │ 57 │ 10 │ 5 │ (55,'ef') │
│ 4 │ 55 │ 80 │ 90 │ (66,'gh') │
│ 5 │ 30 │ 25 │ 55 │ (77,'kl') │
└─────┴────┴────┴────┴───────────┘
See Also
tupleHammingDistance
Returns the Hamming Distance between two tuples of the same size.
Syntax
tupleHammingDistance(tuple1, tuple2)
Arguments
Tuples should have the same type of the elements.
Returned value
- The Hamming distance.
Type: The result type is calculed the same way it is for Arithmetic functions, based on the number of elements in the input tuples.
SELECT
toTypeName(tupleHammingDistance(tuple(0), tuple(0))) AS t1,
toTypeName(tupleHammingDistance((0, 0), (0, 0))) AS t2,
toTypeName(tupleHammingDistance((0, 0, 0), (0, 0, 0))) AS t3,
toTypeName(tupleHammingDistance((0, 0, 0, 0), (0, 0, 0, 0))) AS t4,
toTypeName(tupleHammingDistance((0, 0, 0, 0, 0), (0, 0, 0, 0, 0))) AS t5
┌─t1────┬─t2─────┬─t3─────┬─t4─────┬─t5─────┐
│ UInt8 │ UInt16 │ UInt32 │ UInt64 │ UInt64 │
└───────┴────────┴────────┴────────┴────────┘
Examples
Query:
SELECT tupleHammingDistance((1, 2, 3), (3, 2, 1)) AS HammingDistance;
Result:
┌─HammingDistance─┐
│ 2 │
└─────────────────┘
Can be used with MinHash functions for detection of semi-duplicate strings:
SELECT tupleHammingDistance(wordShingleMinHash(string), wordShingleMinHashCaseInsensitive(string)) as HammingDistance FROM (SELECT 'ClickHouse is a column-oriented database management system for online analytical processing of queries.' AS string);
Result:
┌─HammingDistance─┐
│ 2 │
└─────────────────┘
tupleToNameValuePairs
Turns a named tuple into an array of (name, value) pairs. For a Tuple(a T, b T, ..., c T)
returns Array(Tuple(String, T), ...)
in which the Strings
represents the named fields of the tuple and T
are the values associated with those names. All values in the tuple should be of the same type.
Syntax
tupleToNameValuePairs(tuple)
Arguments
tuple
— Named tuple. Tuple with any types of values.
Returned value
- An array with (name, value) pairs.
Type: Array(Tuple(String, ...)).
Example
Query:
CREATE TABLE tupletest (`col` Tuple(user_ID UInt64, session_ID UInt64) ENGINE = Memory;
INSERT INTO tupletest VALUES (tuple( 100, 2502)), (tuple(1,100));
SELECT tupleToNameValuePairs(col) FROM tupletest;
Result:
┌─tupleToNameValuePairs(col)────────────┐
│ [('user_ID',100),('session_ID',2502)] │
│ [('user_ID',1),('session_ID',100)] │
└───────────────────────────────────────┘
It is possible to transform colums to rows using this function:
CREATE TABLE tupletest (`col` Tuple(CPU Float64, Memory Float64, Disk Float64)) ENGINE = Memory;
INSERT INTO tupletest VALUES(tuple(3.3, 5.5, 6.6));
SELECT arrayJoin(tupleToNameValuePairs(col))FROM tupletest;
Result:
┌─arrayJoin(tupleToNameValuePairs(col))─┐
│ ('CPU',3.3) │
│ ('Memory',5.5) │
│ ('Disk',6.6) │
└───────────────────────────────────────┘
If you pass a simple tuple to the function, ClickHouse uses the indexes of the values as their names:
SELECT tupleToNameValuePairs(tuple(3, 2, 1));
Result:
┌─tupleToNameValuePairs(tuple(3, 2, 1))─┐
│ [('1',3),('2',2),('3',1)] │
└───────────────────────────────────────┘
tuplePlus
Calculates the sum of corresponding values of two tuples of the same size.
Syntax
tuplePlus(tuple1, tuple2)
Alias: vectorSum
.
Arguments
Returned value
- Tuple with the sum.
Type: Tuple.
Example
Query:
SELECT tuplePlus((1, 2), (2, 3));
Result:
┌─tuplePlus((1, 2), (2, 3))─┐
│ (3,5) │
└───────────────────────────┘
tupleMinus
Calculates the subtraction of corresponding values of two tuples of the same size.
Syntax
tupleMinus(tuple1, tuple2)
Alias: vectorDifference
.
Arguments
Returned value
- Tuple with the result of subtraction.
Type: Tuple.
Example
Query:
SELECT tupleMinus((1, 2), (2, 3));
Result:
┌─tupleMinus((1, 2), (2, 3))─┐
│ (-1,-1) │
└────────────────────────────┘
tupleMultiply
Calculates the multiplication of corresponding values of two tuples of the same size.
Syntax
tupleMultiply(tuple1, tuple2)
Arguments
Returned value
- Tuple with the multiplication.
Type: Tuple.
Example
Query:
SELECT tupleMultiply((1, 2), (2, 3));
Result:
┌─tupleMultiply((1, 2), (2, 3))─┐
│ (2,6) │
└───────────────────────────────┘
tupleDivide
Calculates the division of corresponding values of two tuples of the same size. Note that division by zero will return inf
.
Syntax
tupleDivide(tuple1, tuple2)
Arguments
Returned value
- Tuple with the result of division.
Type: Tuple.
Example
Query:
SELECT tupleDivide((1, 2), (2, 3));
Result:
┌─tupleDivide((1, 2), (2, 3))─┐
│ (0.5,0.6666666666666666) │
└─────────────────────────────┘
tupleNegate
Calculates the negation of the tuple values.
Syntax
tupleNegate(tuple)
Arguments
tuple
— Tuple.
Returned value
- Tuple with the result of negation.
Type: Tuple.
Example
Query:
SELECT tupleNegate((1, 2));
Result:
┌─tupleNegate((1, 2))─┐
│ (-1,-2) │
└─────────────────────┘
tupleMultiplyByNumber
Returns a tuple with all values multiplied by a number.
Syntax
tupleMultiplyByNumber(tuple, number)
Arguments
Returned value
- Tuple with multiplied values.
Type: Tuple.
Example
Query:
SELECT tupleMultiplyByNumber((1, 2), -2.1);
Result:
┌─tupleMultiplyByNumber((1, 2), -2.1)─┐
│ (-2.1,-4.2) │
└─────────────────────────────────────┘
tupleDivideByNumber
Returns a tuple with all values divided by a number. Note that division by zero will return inf
.
Syntax
tupleDivideByNumber(tuple, number)
Arguments
Returned value
- Tuple with divided values.
Type: Tuple.
Example
Query:
SELECT tupleDivideByNumber((1, 2), 0.5);
Result:
┌─tupleDivideByNumber((1, 2), 0.5)─┐
│ (2,4) │
└──────────────────────────────────┘
dotProduct
Calculates the scalar product of two tuples of the same size.
Syntax
dotProduct(tuple1, tuple2)
Alias: scalarProduct
.
Arguments
Returned value
- Scalar product.
Type: Int/UInt, Float or Decimal.
Example
Query:
SELECT dotProduct((1, 2), (2, 3));
Result:
┌─dotProduct((1, 2), (2, 3))─┐
│ 8 │
└────────────────────────────┘
L1Norm
Calculates the sum of absolute values of a tuple.
Syntax
L1Norm(tuple)
Alias: normL1
.
Arguments
tuple
— Tuple.
Returned value
- L1-norm or taxicab geometry distance.
Example
Query:
SELECT L1Norm((1, 2));
Result:
┌─L1Norm((1, 2))─┐
│ 3 │
└────────────────┘
L2Norm
Calculates the square root of the sum of the squares of the tuple values.
Syntax
L2Norm(tuple)
Alias: normL2
.
Arguments
tuple
— Tuple.
Returned value
- L2-norm or Euclidean distance.
Type: Float.
Example
Query:
SELECT L2Norm((1, 2));
Result:
┌───L2Norm((1, 2))─┐
│ 2.23606797749979 │
└──────────────────┘
LinfNorm
Calculates the maximum of absolute values of a tuple.
Syntax
LinfNorm(tuple)
Alias: normLinf
.
Arguments
tuple
— Tuple.
Returned value
- Linf-norm or the maximum absolute value.
Type: Float.
Example
Query:
SELECT LinfNorm((1, -2));
Result:
┌─LinfNorm((1, -2))─┐
│ 2 │
└───────────────────┘
LpNorm
Calculates the root of p
-th power of the sum of the absolute values of a tuple in the power of p
.
Syntax
LpNorm(tuple, p)
Alias: normLp
.
Arguments
Returned value
Type: Float.
Example
Query:
SELECT LpNorm((1, -2), 2);
Result:
┌─LpNorm((1, -2), 2)─┐
│ 2.23606797749979 │
└────────────────────┘
L1Distance
Calculates the distance between two points (the values of the tuples are the coordinates) in L1
space (1-norm (taxicab geometry distance)).
Syntax
L1Distance(tuple1, tuple2)
Alias: distanceL1
.
Arguments
Returned value
- 1-norm distance.
Type: Float.
Example
Query:
SELECT L1Distance((1, 2), (2, 3));
Result:
┌─L1Distance((1, 2), (2, 3))─┐
│ 2 │
└────────────────────────────┘
L2Distance
Calculates the distance between two points (the values of the tuples are the coordinates) in Euclidean space (Euclidean distance).
Syntax
L2Distance(tuple1, tuple2)
Alias: distanceL2
.
Arguments
Returned value
- 2-norm distance.
Type: Float.
Example
Query:
SELECT L2Distance((1, 2), (2, 3));
Result:
┌─L2Distance((1, 2), (2, 3))─┐
│ 1.4142135623730951 │
└────────────────────────────┘
LinfDistance
Calculates the distance between two points (the values of the tuples are the coordinates) in L_{inf}
space (maximum norm).
Syntax
LinfDistance(tuple1, tuple2)
Alias: distanceLinf
.
Arguments
Returned value
- Infinity-norm distance.
Type: Float.
Example
Query:
SELECT LinfDistance((1, 2), (2, 3));
Result:
┌─LinfDistance((1, 2), (2, 3))─┐
│ 1 │
└──────────────────────────────┘
LpDistance
Calculates the distance between two points (the values of the tuples are the coordinates) in Lp
space (p-norm distance).
Syntax
LpDistance(tuple1, tuple2, p)
Alias: distanceLp
.
Arguments
tuple1
— First tuple. Tuple.tuple1
— Second tuple. Tuple.p
— The power. Possible values: real number from[1; inf)
. UInt or Float.
Returned value
- p-norm distance.
Type: Float.
Example
Query:
SELECT LpDistance((1, 2), (2, 3), 3);
Result:
┌─LpDistance((1, 2), (2, 3), 3)─┐
│ 1.2599210498948732 │
└───────────────────────────────┘
L1Normalize
Calculates the unit vector of a given vector (the values of the tuple are the coordinates) in L1
space (taxicab geometry).
Syntax
L1Normalize(tuple)
Alias: normalizeL1
.
Arguments
tuple
— Tuple.
Returned value
- Unit vector.
Example
Query:
SELECT L1Normalize((1, 2));
Result:
┌─L1Normalize((1, 2))─────────────────────┐
│ (0.3333333333333333,0.6666666666666666) │
└─────────────────────────────────────────┘
L2Normalize
Calculates the unit vector of a given vector (the values of the tuple are the coordinates) in Euclidean space (using Euclidean distance).
Syntax
L2Normalize(tuple)
Alias: normalizeL1
.
Arguments
tuple
— Tuple.
Returned value
- Unit vector.
Example
Query:
SELECT L2Normalize((3, 4));
Result:
┌─L2Normalize((3, 4))─┐
│ (0.6,0.8) │
└─────────────────────┘
LinfNormalize
Calculates the unit vector of a given vector (the values of the tuple are the coordinates) in L_{inf}
space (using maximum norm).
Syntax
LinfNormalize(tuple)
Alias: normalizeLinf
.
Arguments
tuple
— Tuple.
Returned value
- Unit vector.
Example
Query:
SELECT LinfNormalize((3, 4));
Result:
┌─LinfNormalize((3, 4))─┐
│ (0.75,1) │
└───────────────────────┘
LpNormalize
Calculates the unit vector of a given vector (the values of the tuple are the coordinates) in Lp
space (using p-norm).
Syntax
LpNormalize(tuple, p)
Alias: normalizeLp
.
Arguments
Returned value
- Unit vector.
Example
Query:
SELECT LpNormalize((3, 4),5);
Result:
┌─LpNormalize((3, 4), 5)──────────────────┐
│ (0.7187302630182624,0.9583070173576831) │
└─────────────────────────────────────────┘
cosineDistance
Calculates the cosine distance between two vectors (the values of the tuples are the coordinates). The less the returned value is, the more similar are the vectors.
Syntax
cosineDistance(tuple1, tuple2)
Arguments
Returned value
- Cosine of the angle between two vectors substracted from one.
Type: Float.
Example
Query:
SELECT cosineDistance((1, 2), (2, 3));
Result:
┌─cosineDistance((1, 2), (2, 3))─┐
│ 0.007722123286332261 │
└────────────────────────────────┘