Merge pull request #30467 from gyuton/gyuton-DOCSUP-16362-Document-the-ngram-function

DOCSUP-16362: Documented the ngrams function
Maksim Kita 2021-10-22 10:22:12 +03:00 committed by GitHub
commit 54fed3ae0e
2 changed files with 75 additions and 0 deletions

@@ -270,3 +270,40 @@ Result:
│ [['abc','123'],['8','"hkl"']] │
└───────────────────────────────────────────────────────────────────────┘
```
## ngrams {#ngrams}
Splits the UTF-8 string into n-grams of `ngramsize` symbols.
**Syntax**
``` sql
ngrams(string, ngramsize)
```
**Arguments**
- `string` — String. [String](../../sql-reference/data-types/string.md) or [FixedString](../../sql-reference/data-types/fixedstring.md).
- `ngramsize` — The size of an n-gram. [UInt](../../sql-reference/data-types/int-uint.md).
**Returned value**
- Array with n-grams.
Type: [Array](../../sql-reference/data-types/array.md)([FixedString](../../sql-reference/data-types/fixedstring.md)).
**Example**
Query:
``` sql
SELECT ngrams('ClickHouse', 3);
```
Result:
``` text
┌─ngrams('ClickHouse', 3)───────────────────────────┐
│ ['Cli','lic','ick','ckH','kHo','Hou','ous','use'] │
└───────────────────────────────────────────────────┘
```
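The sliding-window behaviour shown in the example can be sketched outside the database. The Python function below is a hypothetical stand-in, not the actual implementation: it assumes the window advances one symbol (Unicode code point) at a time, as the example output suggests, and that an input shorter than `ngramsize` yields an empty array, which the documentation above does not state.

```python
from typing import List


def ngrams(text: str, ngramsize: int) -> List[str]:
    """Hypothetical Python stand-in for the documented SQL function."""
    if ngramsize <= 0:
        raise ValueError("ngramsize must be positive")
    # Python str is indexed by Unicode code point, so a slice never
    # splits a multi-byte UTF-8 character.
    # Assumption: inputs shorter than ngramsize produce an empty list
    # (range() over a non-positive count is empty).
    return [text[i:i + ngramsize] for i in range(len(text) - ngramsize + 1)]


print(ngrams("ClickHouse", 3))
# ['Cli', 'lic', 'ick', 'ckH', 'kHo', 'Hou', 'ous', 'use']
```

A string of N symbols therefore gives N - `ngramsize` + 1 n-grams, which is 8 for `'ClickHouse'` with `ngramsize` 3.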

@@ -232,3 +232,41 @@ SELECT alphaTokens('abca1abc');
│ ['abca','abc'] │
└─────────────────────────┘
```
## ngrams {#ngrams}
Splits a UTF-8 string into segments (n-grams) of `ngramsize` symbols.
**Syntax**
``` sql
ngrams(string, ngramsize)
```
**Arguments**
- `string` — String. [String](../../sql-reference/data-types/string.md) or [FixedString](../../sql-reference/data-types/fixedstring.md).
- `ngramsize` — The size of an n-gram. [UInt](../../sql-reference/data-types/int-uint.md).
**Returned value**
- Array with n-grams.
Type: [Array](../../sql-reference/data-types/array.md)([FixedString](../../sql-reference/data-types/fixedstring.md)).
**Example**
Query:
``` sql
SELECT ngrams('ClickHouse', 3);
```
Result:
``` text
┌─ngrams('ClickHouse', 3)───────────────────────────┐
│ ['Cli','lic','ick','ckH','kHo','Hou','ous','use'] │
└───────────────────────────────────────────────────┘
```