---
slug: /en/sql-reference/functions/string-replace-functions
sidebar_position: 150
sidebar_label: Replacing in Strings
---
# Functions for Replacing in Strings

[General string functions](string-functions.md) and [functions for searching in strings](string-search-functions.md) are described separately.

## replaceOne

Replaces the first occurrence of the substring `pattern` in `haystack` with the `replacement` string.

**Syntax**

```sql
replaceOne(haystack, pattern, replacement)
```
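
**Example**

A minimal illustration, using a made-up input string: only the first `Hello` is replaced, and the second one is left as-is.

```sql
SELECT replaceOne('Hello, Hello world', 'Hello', 'Hi') AS res
```

Result:

```text
┌─res─────────────┐
│ Hi, Hello world │
└─────────────────┘
```
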
## replaceAll

Replaces all occurrences of the substring `pattern` in `haystack` with the `replacement` string.

**Syntax**

```sql
replaceAll(haystack, pattern, replacement)
```
Alias: `replace`.
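
**Example**

A minimal illustration, using a made-up input string: every occurrence of `Hello` is replaced.

```sql
SELECT replaceAll('Hello, Hello world', 'Hello', 'Hi') AS res
```

Result:

```text
┌─res──────────┐
│ Hi, Hi world │
└──────────────┘
```
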
## replaceRegexpOne

Replaces the first occurrence of the substring matching the regular expression `pattern` (in [re2 syntax](https://github.com/google/re2/wiki/Syntax)) in `haystack` with the `replacement` string.

`replacement` can contain the substitutions `\0-\9`.
Substitutions `\1-\9` correspond to the 1st to 9th capturing group (submatch); substitution `\0` corresponds to the entire match.

To use a verbatim `\` character in the `pattern` or `replacement` strings, escape it with another `\`.
Also keep in mind that string literals require extra escaping: each backslash meant for the regular expression or the replacement is written as `\\` inside a quoted literal.

**Syntax**

```sql
replaceRegexpOne(haystack, pattern, replacement)
```
**Example**
Converting ISO dates to American format:
``` sql
SELECT DISTINCT
    EventDate,
    replaceRegexpOne(toString(EventDate), '(\\d{4})-(\\d{2})-(\\d{2})', '\\2/\\3/\\1') AS res
FROM test.hits
LIMIT 7
FORMAT TabSeparated
```
Result:
``` text
2014-03-17 03/17/2014
2014-03-18 03/18/2014
2014-03-19 03/19/2014
2014-03-20 03/20/2014
2014-03-21 03/21/2014
2014-03-22 03/22/2014
2014-03-23 03/23/2014
```
Copying a string ten times:
``` sql
SELECT replaceRegexpOne('Hello, World!', '.*', '\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0') AS res
```
Result:
``` text
┌─res────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ Hello, World!Hello, World!Hello, World!Hello, World!Hello, World!Hello, World!Hello, World!Hello, World!Hello, World!Hello, World! │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
```
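
Replacing a verbatim backslash (a small sketch of the escaping rule described above; the input value is made up for illustration). The haystack literal `'a\\b'` denotes the three characters `a\b`, and the pattern literal `'\\\\'` denotes the regular expression `\\`, which matches a single backslash:

```sql
SELECT replaceRegexpOne('a\\b', '\\\\', '/') AS res
```

Result:

```text
┌─res─┐
│ a/b │
└─────┘
```
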
## replaceRegexpAll

Like `replaceRegexpOne` but replaces all occurrences of the pattern.

Alias: `REGEXP_REPLACE`.

**Example**

``` sql
SELECT replaceRegexpAll('Hello, World!', '.', '\\0\\0') AS res
```
Result:
``` text
┌─res────────────────────────┐
│ HHeelllloo,,  WWoorrlldd!! │
└────────────────────────────┘
```

As an exception, if the regular expression matches an empty substring, the replacement is made no more than once, for example:

``` sql
SELECT replaceRegexpAll('Hello, World!', '^', 'here: ') AS res
```
Result:
``` text
┌─res─────────────────┐
│ here: Hello, World! │
└─────────────────────┘
```
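
A further sketch of replacing every match, here collapsing runs of whitespace into a single space (the input string is made up for illustration):

```sql
SELECT replaceRegexpAll('a   b    c', '\\s+', ' ') AS res
```

Result:

```text
┌─res───┐
│ a b c │
└───────┘
```
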
## regexpQuoteMeta
Adds a backslash before these characters with special meaning in regular expressions: `\0`, `\\`, `|`, `(`, `)`, `^`, `$`, `.`, `[`, `]`, `?`, `*`, `+`, `{`, `:`, `-`.
This implementation differs slightly from `re2::RE2::QuoteMeta`: it escapes the zero byte as `\0` instead of `\x00`, and it escapes only the characters that require it.

For more information, see [RE2](https://github.com/google/re2/blob/master/re2/re2.cc#L473).

**Syntax**
```sql
regexpQuoteMeta(s)
```
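
**Example**

A hedged sketch of one typical use: quoting a string so that it is matched literally inside a regular expression function. The input values are made up for illustration. Without quoting, the `+` in `1+1` would act as a quantifier and the pattern would not match the text `1+1`.

```sql
SELECT replaceRegexpAll('1+1=2', regexpQuoteMeta('1+1'), 'two') AS res
```

Result:

```text
┌─res───┐
│ two=2 │
└───────┘
```
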
## format

Formats the `pattern` string with the values (strings, integers, etc.) listed in the arguments, similar to formatting in Python. The pattern string can contain replacement fields surrounded by curly braces `{}`. Anything not contained in braces is considered literal text and is copied verbatim into the output. A literal brace character can be escaped by doubling it: `{{` and `}}`. Field names can be numbers (starting from zero) or empty (in which case they are implicitly assigned monotonically increasing numbers).

**Syntax**

```sql
format(pattern, s0, s1, …)
```
**Example**
``` sql
SELECT format('{1} {0} {1}', 'World', 'Hello')
```
```result
┌─format('{1} {0} {1}', 'World', 'Hello')─┐
│ Hello World Hello                       │
└─────────────────────────────────────────┘
```
With implicit numbers:
``` sql
SELECT format('{} {}', 'Hello', 'World')
```
```result
┌─format('{} {}', 'Hello', 'World')─┐
│ Hello World                       │
└───────────────────────────────────┘
```
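
With escaped braces (a small sketch of the doubling rule described above; the argument value is made up for illustration). The outer `{{` and `}}` produce literal braces, and the inner `{0}` is a replacement field:

```sql
SELECT format('{{{0}}}', 'key') AS res
```

```result
┌─res───┐
│ {key} │
└───────┘
```
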
## translate
Replaces characters in the string `s` using a one-to-one character mapping defined by `from` and `to` strings. `from` and `to` must be constant ASCII strings of the same size. Non-ASCII characters in the original string are not modified.
**Syntax**
```sql
translate(s, from, to)
```
**Example**
``` sql
SELECT translate('Hello, World!', 'delor', 'DELOR') AS res
```
Result:
``` text
┌─res───────────┐
│ HELLO, WORLD! │
└───────────────┘
```
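
A one-character mapping works the same way. As a small sketch (the input value is made up for illustration), this replaces every `-` with `/`:

```sql
SELECT translate('2023-04-20', '-', '/') AS res
```

Result:

```text
┌─res────────┐
│ 2023/04/20 │
└────────────┘
```
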
## translateUTF8
Like [translate](#translate) but assumes `s`, `from` and `to` are UTF-8 encoded strings.
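
**Example**

A hedged sketch, assuming `from` and `to` contain the same number of characters as required for `translate`; the input string is made up for illustration. The non-ASCII characters `ü` and `ß` are mapped to `u` and `s`:

```sql
SELECT translateUTF8('Münchener Straße', 'üß', 'us') AS res
```

Result:

```text
┌─res──────────────┐
│ Munchener Strase │
└──────────────────┘
```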