Merge pull request #62923 from Blargian/document_lengthUTF8

[Docs] update `length` and `lengthUTF8`
Nikita Mikhaylov 2024-04-25 11:37:32 +00:00 committed by GitHub
commit c98b9c350a


@@ -88,20 +88,93 @@ Result:
## length
Returns the length of a string in bytes rather than in characters or Unicode code points. The function also works for arrays, for which it returns the number of elements.
Alias: `OCTET_LENGTH`
**Syntax**
```sql
length(s)
```
**Parameters**
- `s`: An input string or array. [String](../data-types/string)/[Array](../data-types/array).
**Returned value**
- Length of the string `s` in bytes, or the number of elements if `s` is an array. [UInt64](../data-types/int-uint).
**Example**
Query:
```sql
SELECT length('Hello, world!');
```
Result:
```response
┌─length('Hello, world!')─┐
│                      13 │
└─────────────────────────┘
```
Query:
```sql
SELECT length([1, 2, 3, 4]);
```
Result:
```response
┌─length([1, 2, 3, 4])─┐
│                    4 │
└──────────────────────┘
```
## lengthUTF8
Returns the length of a string in Unicode code points rather than in bytes or characters. It assumes that the string contains valid UTF-8 encoded text. If this assumption is violated, no exception is thrown and the result is undefined.
Aliases:
- `CHAR_LENGTH`
- `CHARACTER_LENGTH`
**Syntax**
```sql
lengthUTF8(s)
```
**Parameters**
- `s`: String containing valid UTF-8 encoded text. [String](../data-types/string).
**Returned value**
- Length of the string `s` in Unicode code points. [UInt64](../data-types/int-uint.md).
**Example**
Query:
```sql
SELECT lengthUTF8('Здравствуй, мир!');
```
Result:
```response
┌─lengthUTF8('Здравствуй, мир!')─┐
│                             16 │
└────────────────────────────────┘
```
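The difference between the byte count reported by `length` and the code point count reported by `lengthUTF8` can be checked outside ClickHouse. The following is a minimal Python sketch, not ClickHouse's implementation; the helper names `byte_length` and `code_point_length` are hypothetical and exist only for illustration.

```python
# Illustrative only: mirrors the byte vs. code point distinction in plain Python.
# These helpers are hypothetical and are not part of ClickHouse.

def byte_length(s: str) -> int:
    """Number of bytes in the UTF-8 encoding of s (what `length` counts for a string)."""
    return len(s.encode("utf-8"))


def code_point_length(s: str) -> int:
    """Number of Unicode code points in s (what `lengthUTF8` counts for valid UTF-8)."""
    return len(s)


text = "Здравствуй, мир!"
print(byte_length(text))        # 29: each Cyrillic letter takes 2 bytes in UTF-8
print(code_point_length(text))  # 16
print(byte_length("Hello, world!"), code_point_length("Hello, world!"))  # 13 13
```

For ASCII-only strings the two counts coincide, which is why `length('Hello, world!')` returns 13 while the Cyrillic example needs `lengthUTF8` to get 16.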
## left
Returns a substring of string `s` with a specified `offset` starting from the left.