---
slug: /en/sql-reference/functions/array-join
sidebar_position: 61
sidebar_label: arrayJoin
---
# arrayJoin function
This is a very unusual function.
Normal functions do not change a set of rows, but just change the values in each row (map).
Aggregate functions compress a set of rows (fold or reduce).
The `arrayJoin` function takes each row and generates a set of rows (unfold).
This function takes an array as an argument and expands the source row into multiple rows, one for each element of the array.
Values in all other columns are copied unchanged; the value in the column where the function is applied is replaced with the corresponding array element.
Example:
``` sql
SELECT arrayJoin([1, 2, 3] AS src) AS dst, 'Hello', src
```
``` text
┌─dst─┬─\'Hello\'─┬─src─────┐
│   1 │ Hello     │ [1,2,3] │
│   2 │ Hello     │ [1,2,3] │
│   3 │ Hello     │ [1,2,3] │
└─────┴───────────┴─────────┘
```
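To see the same unfolding on an input with several rows, the sketch below applies `arrayJoin` to a different array in each row. It assumes the built-in `numbers` table function and the `range` function; the alias `x` is only illustrative.

```sql
-- For each source row, range(number) builds the array [0, 1, ..., number - 1];
-- arrayJoin then emits one output row per element of that array.
SELECT
    number,
    arrayJoin(range(number)) AS x
FROM numbers(4)
```

The source row with `number = 3` should yield three output rows, while the row with `number = 0` has an empty array and so should contribute no rows at all.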
The `arrayJoin` function affects all sections of the query, including the `WHERE` section. In the example below, the result is 2 even though the subquery returns only one row.
Example:
```sql
SELECT sum(1) AS impressions
FROM
(
    SELECT ['Istanbul', 'Berlin', 'Bobruisk'] AS cities
)
WHERE arrayJoin(cities) IN ['Istanbul', 'Berlin'];
```
``` text
┌─impressions─┐
│           2 │
└─────────────┘
```
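One way to read the query above is as a filter applied after the rows are unfolded. The following sketch, which should be equivalent under that reading, makes the unfolding an explicit subquery step; the alias `city` is introduced here for illustration.

```sql
SELECT count() AS impressions
FROM
(
    -- Unfold first: one row per city.
    SELECT arrayJoin(['Istanbul', 'Berlin', 'Bobruisk']) AS city
)
-- Then filter the unfolded rows; two of the three cities match.
WHERE city IN ('Istanbul', 'Berlin')
```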
A query can use multiple `arrayJoin` functions. In this case, the transformation is performed multiple times and the rows are multiplied.
Example:
```sql
SELECT
    sum(1) AS impressions,
    arrayJoin(cities) AS city,
    arrayJoin(browsers) AS browser
FROM
(
    SELECT
        ['Istanbul', 'Berlin', 'Bobruisk'] AS cities,
        ['Firefox', 'Chrome', 'Chrome'] AS browsers
)
GROUP BY
    2,
    3
```
``` text
┌─impressions─┬─city─────┬─browser─┐
│           2 │ Istanbul │ Chrome  │
│           1 │ Istanbul │ Firefox │
│           2 │ Berlin   │ Chrome  │
│           1 │ Berlin   │ Firefox │
│           2 │ Bobruisk │ Chrome  │
│           1 │ Bobruisk │ Firefox │
└─────────────┴──────────┴─────────┘
```
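The multiplication is easiest to see without aggregation. As a minimal sketch with illustrative literals and aliases, two different `arrayJoin` calls over arrays of 3 and 2 elements should produce 3 × 2 = 6 rows, one per combination:

```sql
SELECT
    arrayJoin([1, 2, 3]) AS n,       -- 3 elements
    arrayJoin(['a', 'b']) AS letter  -- 2 elements
```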
See also the [ARRAY JOIN](../statements/select/array-join.md) syntax in the `SELECT` query, which provides broader possibilities.
`ARRAY JOIN` allows you to convert multiple arrays with the same number of elements at a time, pairing their elements by position instead of multiplying the rows.
Example:
```sql
SELECT
    sum(1) AS impressions,
    city,
    browser
FROM
(
    SELECT
        ['Istanbul', 'Berlin', 'Bobruisk'] AS cities,
        ['Firefox', 'Chrome', 'Chrome'] AS browsers
)
ARRAY JOIN
    cities AS city,
    browsers AS browser
GROUP BY
    2,
    3
```
``` text
┌─impressions─┬─city─────┬─browser─┐
│           1 │ Istanbul │ Firefox │
│           1 │ Berlin   │ Chrome  │
│           1 │ Bobruisk │ Chrome  │
└─────────────┴──────────┴─────────┘
```
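Because `ARRAY JOIN` walks its arrays in parallel, it can also carry a positional index next to each element. The sketch below assumes the `arrayEnumerate` function, which returns the array of positions `[1, 2, ..., length]`; the alias `idx` is illustrative.

```sql
SELECT
    city,
    idx
FROM
(
    SELECT ['Istanbul', 'Berlin', 'Bobruisk'] AS cities
)
ARRAY JOIN
    cities AS city,
    -- Same length as cities, so each city is paired with its position.
    arrayEnumerate(cities) AS idx
```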
Alternatively, you can zip the arrays into an array of [Tuple](../data-types/tuple.md) values and apply a single `arrayJoin` to it.
Example:
```sql
SELECT
    sum(1) AS impressions,
    (arrayJoin(arrayZip(cities, browsers)) AS t).1 AS city,
    t.2 AS browser
FROM
(
    SELECT
        ['Istanbul', 'Berlin', 'Bobruisk'] AS cities,
        ['Firefox', 'Chrome', 'Chrome'] AS browsers
)
GROUP BY
    2,
    3
```
``` text
┌─impressions─┬─city─────┬─browser─┐
│           1 │ Istanbul │ Firefox │
│           1 │ Berlin   │ Chrome  │
│           1 │ Bobruisk │ Chrome  │
└─────────────┴──────────┴─────────┘
```