mirror of
https://github.com/ClickHouse/ClickHouse.git
synced 2024-11-14 03:25:15 +00:00
302 lines
10 KiB
Markdown
302 lines
10 KiB
Markdown
---
|
||
slug: /en/sql-reference/statements/select/array-join
|
||
sidebar_label: ARRAY JOIN
|
||
---
|
||
|
||
# ARRAY JOIN Clause
|
||
|
||
It is a common operation for tables that contain an array column to produce a new table that has a column with each individual array element of that initial column, while values of other columns are duplicated. This is the basic case of what `ARRAY JOIN` clause does.
|
||
|
||
Its name comes from the fact that it can be looked at as executing `JOIN` with an array or nested data structure. The intent is similar to the [arrayJoin](../../../sql-reference/functions/array-join.md#functions_arrayjoin) function, but the clause functionality is broader.
|
||
|
||
Syntax:
|
||
|
||
``` sql
|
||
SELECT <expr_list>
|
||
FROM <left_subquery>
|
||
[LEFT] ARRAY JOIN <array>
|
||
[WHERE|PREWHERE <expr>]
|
||
...
|
||
```
|
||
|
||
You can specify only one `ARRAY JOIN` clause in a `SELECT` query.
|
||
|
||
Supported types of `ARRAY JOIN` are listed below:
|
||
|
||
- `ARRAY JOIN` - In base case, empty arrays are not included in the result of `JOIN`.
|
||
- `LEFT ARRAY JOIN` - The result of `JOIN` contains rows with empty arrays. The value for an empty array is set to the default value for the array element type (usually 0, empty string or NULL).
|
||
|
||
## Basic ARRAY JOIN Examples
|
||
|
||
The examples below demonstrate the usage of the `ARRAY JOIN` and `LEFT ARRAY JOIN` clauses. Let’s create a table with an [Array](../../../sql-reference/data-types/array.md) type column and insert values into it:
|
||
|
||
``` sql
|
||
CREATE TABLE arrays_test
|
||
(
|
||
s String,
|
||
arr Array(UInt8)
|
||
) ENGINE = Memory;
|
||
|
||
INSERT INTO arrays_test
|
||
VALUES ('Hello', [1,2]), ('World', [3,4,5]), ('Goodbye', []);
|
||
```
|
||
|
||
``` text
|
||
┌─s───────────┬─arr─────┐
|
||
│ Hello │ [1,2] │
|
||
│ World │ [3,4,5] │
|
||
│ Goodbye │ [] │
|
||
└─────────────┴─────────┘
|
||
```
|
||
|
||
The example below uses the `ARRAY JOIN` clause:
|
||
|
||
``` sql
|
||
SELECT s, arr
|
||
FROM arrays_test
|
||
ARRAY JOIN arr;
|
||
```
|
||
|
||
``` text
|
||
┌─s─────┬─arr─┐
|
||
│ Hello │ 1 │
|
||
│ Hello │ 2 │
|
||
│ World │ 3 │
|
||
│ World │ 4 │
|
||
│ World │ 5 │
|
||
└───────┴─────┘
|
||
```
|
||
|
||
The next example uses the `LEFT ARRAY JOIN` clause:
|
||
|
||
``` sql
|
||
SELECT s, arr
|
||
FROM arrays_test
|
||
LEFT ARRAY JOIN arr;
|
||
```
|
||
|
||
``` text
|
||
┌─s───────────┬─arr─┐
|
||
│ Hello │ 1 │
|
||
│ Hello │ 2 │
|
||
│ World │ 3 │
|
||
│ World │ 4 │
|
||
│ World │ 5 │
|
||
│ Goodbye │ 0 │
|
||
└─────────────┴─────┘
|
||
```
|
||
|
||
## Using Aliases
|
||
|
||
An alias can be specified for an array in the `ARRAY JOIN` clause. In this case, an array item can be accessed by this alias, but the array itself is accessed by the original name. Example:
|
||
|
||
``` sql
|
||
SELECT s, arr, a
|
||
FROM arrays_test
|
||
ARRAY JOIN arr AS a;
|
||
```
|
||
|
||
``` text
|
||
┌─s─────┬─arr─────┬─a─┐
|
||
│ Hello │ [1,2] │ 1 │
|
||
│ Hello │ [1,2] │ 2 │
|
||
│ World │ [3,4,5] │ 3 │
|
||
│ World │ [3,4,5] │ 4 │
|
||
│ World │ [3,4,5] │ 5 │
|
||
└───────┴─────────┴───┘
|
||
```
|
||
|
||
Using aliases, you can perform `ARRAY JOIN` with an external array. For example:
|
||
|
||
``` sql
|
||
SELECT s, arr_external
|
||
FROM arrays_test
|
||
ARRAY JOIN [1, 2, 3] AS arr_external;
|
||
```
|
||
|
||
``` text
|
||
┌─s───────────┬─arr_external─┐
|
||
│ Hello │ 1 │
|
||
│ Hello │ 2 │
|
||
│ Hello │ 3 │
|
||
│ World │ 1 │
|
||
│ World │ 2 │
|
||
│ World │ 3 │
|
||
│ Goodbye │ 1 │
|
||
│ Goodbye │ 2 │
|
||
│ Goodbye │ 3 │
|
||
└─────────────┴──────────────┘
|
||
```
|
||
|
||
Multiple arrays can be comma-separated in the `ARRAY JOIN` clause. In this case, `JOIN` is performed with them simultaneously (the direct sum, not the cartesian product). Note that all the arrays must have the same size by default. Example:
|
||
|
||
``` sql
|
||
SELECT s, arr, a, num, mapped
|
||
FROM arrays_test
|
||
ARRAY JOIN arr AS a, arrayEnumerate(arr) AS num, arrayMap(x -> x + 1, arr) AS mapped;
|
||
```
|
||
|
||
``` text
|
||
┌─s─────┬─arr─────┬─a─┬─num─┬─mapped─┐
|
||
│ Hello │ [1,2] │ 1 │ 1 │ 2 │
|
||
│ Hello │ [1,2] │ 2 │ 2 │ 3 │
|
||
│ World │ [3,4,5] │ 3 │ 1 │ 4 │
|
||
│ World │ [3,4,5] │ 4 │ 2 │ 5 │
|
||
│ World │ [3,4,5] │ 5 │ 3 │ 6 │
|
||
└───────┴─────────┴───┴─────┴────────┘
|
||
```
|
||
|
||
The example below uses the [arrayEnumerate](../../../sql-reference/functions/array-functions#array_functions-arrayenumerate) function:
|
||
|
||
``` sql
|
||
SELECT s, arr, a, num, arrayEnumerate(arr)
|
||
FROM arrays_test
|
||
ARRAY JOIN arr AS a, arrayEnumerate(arr) AS num;
|
||
```
|
||
|
||
``` text
|
||
┌─s─────┬─arr─────┬─a─┬─num─┬─arrayEnumerate(arr)─┐
|
||
│ Hello │ [1,2] │ 1 │ 1 │ [1,2] │
|
||
│ Hello │ [1,2] │ 2 │ 2 │ [1,2] │
|
||
│ World │ [3,4,5] │ 3 │ 1 │ [1,2,3] │
|
||
│ World │ [3,4,5] │ 4 │ 2 │ [1,2,3] │
|
||
│ World │ [3,4,5] │ 5 │ 3 │ [1,2,3] │
|
||
└───────┴─────────┴───┴─────┴─────────────────────┘
|
||
```
|
||
Multiple arrays with different sizes can be joined by using: `SETTINGS enable_unaligned_array_join = 1`. Example:
|
||
|
||
```sql
|
||
SELECT s, arr, a, b
|
||
FROM arrays_test ARRAY JOIN arr as a, [['a','b'],['c']] as b
|
||
SETTINGS enable_unaligned_array_join = 1;
|
||
```
|
||
|
||
```text
|
||
┌─s───────┬─arr─────┬─a─┬─b─────────┐
|
||
│ Hello │ [1,2] │ 1 │ ['a','b'] │
|
||
│ Hello │ [1,2] │ 2 │ ['c'] │
|
||
│ World │ [3,4,5] │ 3 │ ['a','b'] │
|
||
│ World │ [3,4,5] │ 4 │ ['c'] │
|
||
│ World │ [3,4,5] │ 5 │ [] │
|
||
│ Goodbye │ [] │ 0 │ ['a','b'] │
|
||
│ Goodbye │ [] │ 0 │ ['c'] │
|
||
└─────────┴─────────┴───┴───────────┘
|
||
```
|
||
|
||
## ARRAY JOIN with Nested Data Structure
|
||
|
||
`ARRAY JOIN` also works with [nested data structures](../../../sql-reference/data-types/nested-data-structures/nested.md):
|
||
|
||
``` sql
|
||
CREATE TABLE nested_test
|
||
(
|
||
s String,
|
||
nest Nested(
|
||
x UInt8,
|
||
y UInt32)
|
||
) ENGINE = Memory;
|
||
|
||
INSERT INTO nested_test
|
||
VALUES ('Hello', [1,2], [10,20]), ('World', [3,4,5], [30,40,50]), ('Goodbye', [], []);
|
||
```
|
||
|
||
``` text
|
||
┌─s───────┬─nest.x──┬─nest.y─────┐
|
||
│ Hello │ [1,2] │ [10,20] │
|
||
│ World │ [3,4,5] │ [30,40,50] │
|
||
│ Goodbye │ [] │ [] │
|
||
└─────────┴─────────┴────────────┘
|
||
```
|
||
|
||
``` sql
|
||
SELECT s, `nest.x`, `nest.y`
|
||
FROM nested_test
|
||
ARRAY JOIN nest;
|
||
```
|
||
|
||
``` text
|
||
┌─s─────┬─nest.x─┬─nest.y─┐
|
||
│ Hello │ 1 │ 10 │
|
||
│ Hello │ 2 │ 20 │
|
||
│ World │ 3 │ 30 │
|
||
│ World │ 4 │ 40 │
|
||
│ World │ 5 │ 50 │
|
||
└───────┴────────┴────────┘
|
||
```
|
||
|
||
When specifying names of nested data structures in `ARRAY JOIN`, the meaning is the same as `ARRAY JOIN` with all the array elements that it consists of. Examples are listed below:
|
||
|
||
``` sql
|
||
SELECT s, `nest.x`, `nest.y`
|
||
FROM nested_test
|
||
ARRAY JOIN `nest.x`, `nest.y`;
|
||
```
|
||
|
||
``` text
|
||
┌─s─────┬─nest.x─┬─nest.y─┐
|
||
│ Hello │ 1 │ 10 │
|
||
│ Hello │ 2 │ 20 │
|
||
│ World │ 3 │ 30 │
|
||
│ World │ 4 │ 40 │
|
||
│ World │ 5 │ 50 │
|
||
└───────┴────────┴────────┘
|
||
```
|
||
|
||
This variation also makes sense:
|
||
|
||
``` sql
|
||
SELECT s, `nest.x`, `nest.y`
|
||
FROM nested_test
|
||
ARRAY JOIN `nest.x`;
|
||
```
|
||
|
||
``` text
|
||
┌─s─────┬─nest.x─┬─nest.y─────┐
|
||
│ Hello │ 1 │ [10,20] │
|
||
│ Hello │ 2 │ [10,20] │
|
||
│ World │ 3 │ [30,40,50] │
|
||
│ World │ 4 │ [30,40,50] │
|
||
│ World │ 5 │ [30,40,50] │
|
||
└───────┴────────┴────────────┘
|
||
```
|
||
|
||
An alias may be used for a nested data structure, in order to select either the `JOIN` result or the source array. Example:
|
||
|
||
``` sql
|
||
SELECT s, `n.x`, `n.y`, `nest.x`, `nest.y`
|
||
FROM nested_test
|
||
ARRAY JOIN nest AS n;
|
||
```
|
||
|
||
``` text
|
||
┌─s─────┬─n.x─┬─n.y─┬─nest.x──┬─nest.y─────┐
|
||
│ Hello │ 1 │ 10 │ [1,2] │ [10,20] │
|
||
│ Hello │ 2 │ 20 │ [1,2] │ [10,20] │
|
||
│ World │ 3 │ 30 │ [3,4,5] │ [30,40,50] │
|
||
│ World │ 4 │ 40 │ [3,4,5] │ [30,40,50] │
|
||
│ World │ 5 │ 50 │ [3,4,5] │ [30,40,50] │
|
||
└───────┴─────┴─────┴─────────┴────────────┘
|
||
```
|
||
|
||
Example of using the [arrayEnumerate](../../../sql-reference/functions/array-functions#array_functions-arrayenumerate) function:
|
||
|
||
``` sql
|
||
SELECT s, `n.x`, `n.y`, `nest.x`, `nest.y`, num
|
||
FROM nested_test
|
||
ARRAY JOIN nest AS n, arrayEnumerate(`nest.x`) AS num;
|
||
```
|
||
|
||
``` text
|
||
┌─s─────┬─n.x─┬─n.y─┬─nest.x──┬─nest.y─────┬─num─┐
|
||
│ Hello │ 1 │ 10 │ [1,2] │ [10,20] │ 1 │
|
||
│ Hello │ 2 │ 20 │ [1,2] │ [10,20] │ 2 │
|
||
│ World │ 3 │ 30 │ [3,4,5] │ [30,40,50] │ 1 │
|
||
│ World │ 4 │ 40 │ [3,4,5] │ [30,40,50] │ 2 │
|
||
│ World │ 5 │ 50 │ [3,4,5] │ [30,40,50] │ 3 │
|
||
└───────┴─────┴─────┴─────────┴────────────┴─────┘
|
||
```
|
||
|
||
## Implementation Details
|
||
|
||
The query execution order is optimized when running `ARRAY JOIN`. Although `ARRAY JOIN` must always be specified before the [WHERE](../../../sql-reference/statements/select/where.md)/[PREWHERE](../../../sql-reference/statements/select/prewhere.md) clause in a query, technically they can be performed in any order, unless result of `ARRAY JOIN` is used for filtering. The processing order is controlled by the query optimizer.
|