---
toc_priority: 47
toc_title: Splitting and Merging Strings and Arrays
---

# Functions for Splitting and Merging Strings and Arrays {#functions-for-splitting-and-merging-strings-and-arrays}

## splitByChar(separator, s) {#splitbycharseparator-s}

Splits a string into substrings separated by a specified character. It uses a constant string `separator` consisting of exactly one character.
Returns an array of selected substrings. Empty substrings may be selected if the separator occurs at the beginning or end of the string, or if there are multiple consecutive separators.

**Syntax**

``` sql
splitByChar(<separator>, <s>)
```

**Arguments**

-   `separator` — The separator which should contain exactly one character. [String](../../sql-reference/data-types/string.md).
-   `s` — The string to split. [String](../../sql-reference/data-types/string.md).

**Returned value(s)**

Returns an array of selected substrings. Empty substrings may be selected when:

-   A separator occurs at the beginning or end of the string;
-   There are multiple consecutive separators;
-   The original string `s` is empty.

Type: [Array](../../sql-reference/data-types/array.md) of [String](../../sql-reference/data-types/string.md).

**Example**

``` sql
SELECT splitByChar(',', '1,2,3,abcde')
```

``` text
┌─splitByChar(',', '1,2,3,abcde')─┐
│ ['1','2','3','abcde']           │
└─────────────────────────────────┘
```
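
The empty-substring cases listed above can be illustrated with a separator at both ends of the string and a doubled separator in the middle. This is an illustrative query (the `parts` alias is chosen here only for readability), and the result shown is what those rules imply:

``` sql
SELECT splitByChar(',', ',a,,b,') AS parts
```

``` text
┌─parts──────────────┐
│ ['','a','','b',''] │
└────────────────────┘
```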

## splitByString(separator, s) {#splitbystringseparator-s}

Splits a string into substrings separated by a string. It uses a constant string `separator` of multiple characters as the separator. If the string `separator` is empty, it will split the string `s` into an array of single characters.

**Syntax**

``` sql
splitByString(<separator>, <s>)
```

**Arguments**

-   `separator` — The separator. [String](../../sql-reference/data-types/string.md).
-   `s` — The string to split. [String](../../sql-reference/data-types/string.md).

**Returned value(s)**

Returns an array of selected substrings. Empty substrings may be selected when:

-   A non-empty separator occurs at the beginning or end of the string;
-   There are multiple consecutive non-empty separators;
-   The original string `s` is empty while the separator is not empty.

Type: [Array](../../sql-reference/data-types/array.md) of [String](../../sql-reference/data-types/string.md).

**Example**

``` sql
SELECT splitByString(', ', '1, 2 3, 4,5, abcde')
```

``` text
┌─splitByString(', ', '1, 2 3, 4,5, abcde')─┐
│ ['1','2 3','4,5','abcde']                 │
└───────────────────────────────────────────┘
```

``` sql
SELECT splitByString('', 'abcde')
```

``` text
┌─splitByString('', 'abcde')─┐
│ ['a','b','c','d','e']      │
└────────────────────────────┘
```
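
A multi-character separator produces empty substrings in the same way. The sketch below places the separator `', '` at the start of the string and twice in a row in the middle. The `parts` alias is illustrative only:

``` sql
SELECT splitByString(', ', ', a, , b') AS parts
```

``` text
┌─parts───────────┐
│ ['','a','','b'] │
└─────────────────┘
```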

## splitByRegexp(regexp, s) {#splitbyregexpseparator-s}

Splits a string into substrings separated by a regular expression. It uses a regular expression string `regexp` as the separator. If `regexp` is empty, it splits the string `s` into an array of single characters. If no match is found for the regular expression, the string `s` is not split.

**Syntax**

``` sql
splitByRegexp(<regexp>, <s>)
```

**Arguments**

-   `regexp` — Regular expression. Constant. [String](../data-types/string.md) or [FixedString](../data-types/fixedstring.md).
-   `s` — The string to split. [String](../../sql-reference/data-types/string.md).

**Returned value(s)**

Returns an array of selected substrings. Empty substrings may be selected when:

-   A non-empty regular expression match occurs at the beginning or end of the string;
-   There are multiple consecutive non-empty regular expression matches;
-   The original string `s` is empty while the regular expression is not empty.

Type: [Array](../../sql-reference/data-types/array.md) of [String](../../sql-reference/data-types/string.md).

**Example**

``` sql
SELECT splitByRegexp('\\d+', 'a1bc2de3f')
```

``` text
┌─splitByRegexp('\\d+', 'a1bc2de3f')─┐
│ ['a','bc','de','f']                │
└────────────────────────────────────┘
```

``` sql
SELECT splitByRegexp('', 'abcde')
```

``` text
┌─splitByRegexp('', 'abcde')─┐
│ ['a','b','c','d','e']      │
└────────────────────────────┘
```
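
The no-match case can be checked the same way. In this sketch the pattern `\\d+` finds no digits in the input, so the result should be a single-element array holding the unsplit string. The `parts` alias is illustrative only:

``` sql
SELECT splitByRegexp('\\d+', 'abcde') AS parts
```

``` text
┌─parts─────┐
│ ['abcde'] │
└───────────┘
```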

## arrayStringConcat(arr\[, separator\]) {#arraystringconcatarr-separator}

Concatenates the strings listed in the array with the separator. `separator` is an optional parameter: a constant string, set to an empty string by default.
Returns the string.
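
**Example**

A minimal sketch of both call forms, one with an explicit separator and one relying on the default empty string. The column aliases `joined` and `joined_default` are illustrative only:

``` sql
SELECT
    arrayStringConcat(['a', 'b', 'c'], '-') AS joined,
    arrayStringConcat(['a', 'b', 'c']) AS joined_default
```

``` text
┌─joined─┬─joined_default─┐
│ a-b-c  │ abc            │
└────────┴────────────────┘
```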

## alphaTokens(s) {#alphatokenss}

Selects substrings of consecutive bytes from the ranges a-z and A-Z. Returns an array of substrings.

**Example**

``` sql
SELECT alphaTokens('abca1abc')
```

``` text
┌─alphaTokens('abca1abc')─┐
│ ['abca','abc']          │
└─────────────────────────┘
```

## extractAllGroups(text, regexp) {#extractallgroups}

Extracts all groups from non-overlapping substrings matched by a regular expression.

**Syntax** 

``` sql
extractAllGroups(text, regexp) 
```

**Arguments** 

-   `text` — [String](../data-types/string.md) or [FixedString](../data-types/fixedstring.md).
-   `regexp` — Regular expression. Constant. [String](../data-types/string.md) or [FixedString](../data-types/fixedstring.md).

**Returned values**

-   If the function finds at least one matching group, it returns an `Array(Array(String))` column, clustered by group_id (1 to N, where N is the number of capturing groups in `regexp`).

-   If there is no matching group, returns an empty array.

Type: [Array](../data-types/array.md).

**Example**

Query:

``` sql
SELECT extractAllGroups('abc=123, 8="hkl"', '("[^"]+"|\\w+)=("[^"]+"|\\w+)');
```

Result:

``` text
┌─extractAllGroups('abc=123, 8="hkl"', '("[^"]+"|\\w+)=("[^"]+"|\\w+)')─┐
│ [['abc','123'],['8','"hkl"']]                                         │
└───────────────────────────────────────────────────────────────────────┘
```
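
The no-match case described under the returned values can be sketched with a pattern that cannot match the input. Here the text contains no digits, so the expected result is an empty array. The `result` alias is illustrative only:

``` sql
SELECT extractAllGroups('abc', '(\\d+)=(\\d+)') AS result;
```

``` text
┌─result─┐
│ []     │
└────────┘
```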