---
slug: /en/sql-reference/functions/other-functions
sidebar_position: 140
sidebar_label: Other
---
# Other Functions
## hostName
Returns the name of the host on which this function was executed. If the function executes on a remote server (distributed processing), the remote server name is returned.
If the function executes in the context of a distributed table, it generates a normal column with values relevant to each shard. Otherwise it produces a constant value.
**Syntax**
```sql
hostName()
```
**Returned value**
- Host name. [String](../data-types/string.md).
## getMacro {#getMacro}
Returns a named value from the [macros](../../operations/server-configuration-parameters/settings.md#macros) section of the server configuration.
**Syntax**
```sql
getMacro(name);
```
**Arguments**
- `name` — Macro name to retrieve from the `<macros>` section. [String](../data-types/string.md#string).
**Returned value**
- Value of the specified macro. [String](../data-types/string.md).
**Example**
Example `<macros>` section in the server configuration file:
```xml
<macros>
    <test>Value</test>
</macros>
```
Query:
```sql
SELECT getMacro('test');
```
Result:
```text
┌─getMacro('test')─┐
│ Value            │
└──────────────────┘
```
The same value can be retrieved as follows:
```sql
SELECT * FROM system.macros
WHERE macro = 'test';
```
```text
┌─macro─┬─substitution─┐
│ test  │ Value        │
└───────┴──────────────┘
```
## FQDN
Returns the fully qualified domain name of the ClickHouse server.
**Syntax**
```sql
fqdn();
```
This function is case-insensitive.
**Returned value**
- String with the fully qualified domain name. [String](../data-types/string.md).
**Example**
```sql
SELECT FQDN();
```
Result:
```text
┌─FQDN()──────────────────────────┐
│ clickhouse.ru-central1.internal │
└─────────────────────────────────┘
```
## basename
Extracts the tail of a string following its last slash or backslash. This function is often used to extract the filename from a path.
```sql
basename(expr)
```
**Arguments**
- `expr` — A value of type [String](../data-types/string.md). Backslashes must be escaped.
**Returned value**
A string that contains:
- The tail of the input string after its last slash or backslash. If the input string ends with a slash or backslash (e.g. `/` or `c:\`), the function returns an empty string.
- The original string if there are no slashes or backslashes.
**Example**
Query:
```sql
SELECT 'some/long/path/to/file' AS a, basename(a)
```
Result:
```text
┌─a──────────────────────┬─basename('some/long/path/to/file')─┐
│ some/long/path/to/file │ file                               │
└────────────────────────┴────────────────────────────────────┘
```
Query:
```sql
SELECT 'some\\long\\path\\to\\file' AS a, basename(a)
```
Result:
```text
┌─a──────────────────────┬─basename('some\\long\\path\\to\\file')─┐
│ some\long\path\to\file │ file                                   │
└────────────────────────┴────────────────────────────────────────┘
```
Query:
```sql
SELECT 'some-file-name' AS a, basename(a)
```
Result:
```text
┌─a──────────────┬─basename('some-file-name')─┐
│ some-file-name │ some-file-name             │
└────────────────┴────────────────────────────┘
```
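The rule described under the returned value can be modelled in a few lines of Python. This is only a sketch of the documented behaviour, not ClickHouse's implementation; the Python function name is chosen here for illustration.

```python
def basename(path: str) -> str:
    """Return the part of `path` after its last slash or backslash."""
    # Index of the last separator of either kind; -1 if neither occurs.
    cut = max(path.rfind("/"), path.rfind("\\"))
    # With cut == -1 the slice starts at 0, so the whole string is returned.
    return path[cut + 1:]


print(basename("some/long/path/to/file"))
print(basename("some-file-name"))
```

A trailing separator leaves nothing after the cut, which is why inputs such as `/` or `c:\` yield an empty string.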
## visibleWidth
Calculates the approximate width when outputting values to the console in text format (tab-separated).
This function is used by the system to implement [Pretty formats](../../interfaces/formats.md).
`NULL` is represented as a string corresponding to `NULL` in `Pretty` formats.
**Syntax**
```sql
visibleWidth(x)
```
**Example**
Query:
```sql
SELECT visibleWidth(NULL)
```
Result:
```text
┌─visibleWidth(NULL)─┐
│                  4 │
└────────────────────┘
```
## toTypeName
Returns the type name of the passed argument.
If `NULL` is passed, the function returns type `Nullable(Nothing)`, which corresponds to ClickHouse's internal `NULL` representation.
**Syntax**
```sql
toTypeName(x)
```
## blockSize {#blockSize}
In ClickHouse, queries are processed in blocks (chunks).
This function returns the size (row count) of the block the function is called on.
**Syntax**
```sql
blockSize()
```
## byteSize
Returns an estimate of the uncompressed byte size of its arguments in memory.
**Syntax**
```sql
byteSize(argument [, ...])
```
**Arguments**
- `argument` — Value.
**Returned value**
- Estimate of the byte size of the arguments in memory. [UInt64](../data-types/int-uint.md).
**Examples**
For [String](../data-types/string.md) arguments, the function returns the string length + 9 (terminating zero + length).
Query:
```sql
SELECT byteSize('string');
```
Result:
```text
┌─byteSize('string')─┐
│                 15 │
└────────────────────┘
```
Query:
```sql
CREATE TABLE test
(
`key` Int32,
`u8` UInt8,
`u16` UInt16,
`u32` UInt32,
`u64` UInt64,
`i8` Int8,
`i16` Int16,
`i32` Int32,
`i64` Int64,
`f32` Float32,
`f64` Float64
)
ENGINE = MergeTree
ORDER BY key;
INSERT INTO test VALUES(1, 8, 16, 32, 64, -8, -16, -32, -64, 32.32, 64.64);
SELECT
    key,
    byteSize(u8) AS `byteSize(UInt8)`,
    byteSize(u16) AS `byteSize(UInt16)`,
    byteSize(u32) AS `byteSize(UInt32)`,
    byteSize(u64) AS `byteSize(UInt64)`,
    byteSize(i8) AS `byteSize(Int8)`,
    byteSize(i16) AS `byteSize(Int16)`,
    byteSize(i32) AS `byteSize(Int32)`,
    byteSize(i64) AS `byteSize(Int64)`,
    byteSize(f32) AS `byteSize(Float32)`,
    byteSize(f64) AS `byteSize(Float64)`
FROM test
ORDER BY key ASC
FORMAT Vertical;
```
Result:
```text
Row 1:
──────
key:               1
byteSize(UInt8):   1
byteSize(UInt16):  2
byteSize(UInt32):  4
byteSize(UInt64):  8
byteSize(Int8):    1
byteSize(Int16):   2
byteSize(Int32):   4
byteSize(Int64):   8
byteSize(Float32): 4
byteSize(Float64): 8
```
If the function has multiple arguments, it sums their byte sizes.
Query:
```sql
SELECT byteSize(NULL, 1, 0.3, '');
```
Result:
```text
┌─byteSize(NULL, 1, 0.3, '')─┐
│ 19 │
└────────────────────────────┘
```
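The arithmetic behind these results can be sketched in Python. The fixed widths below are taken from the examples above; the 1-byte size for `NULL` is inferred from the total of 19 rather than stated by the documentation, and the `(type name, value)` pair representation is purely an illustration device.

```python
# Per-value widths in bytes, as shown in the examples above.
# "Nullable(Nothing)": 1 is inferred from byteSize(NULL, 1, 0.3, '') = 19.
FIXED_WIDTH = {
    "UInt8": 1, "UInt16": 2, "UInt32": 4, "UInt64": 8,
    "Int8": 1, "Int16": 2, "Int32": 4, "Int64": 8,
    "Float32": 4, "Float64": 8,
    "Nullable(Nothing)": 1,
}


def byte_size(*typed_values) -> int:
    """Sum the estimated sizes of (type_name, value) pairs."""
    total = 0
    for type_name, value in typed_values:
        if type_name == "String":
            # String length plus 9 (terminating zero + length).
            total += len(value.encode("utf-8")) + 9
        else:
            total += FIXED_WIDTH[type_name]
    return total


print(byte_size(("String", "string")))
print(byte_size(("Nullable(Nothing)", None), ("UInt8", 1),
                ("Float64", 0.3), ("String", "")))
```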
## materialize
Turns a constant into a full column containing a single value.
Full columns and constants are represented differently in memory. Functions usually execute different code for normal and constant arguments, although the result should typically be the same. This function can be used to debug this behavior.
**Syntax**
```sql
materialize(x)
```
## ignore
Accepts any arguments, including `NULL`, and does nothing. Always returns 0.
The argument is still evaluated internally, which makes the function useful, for example, in benchmarks.
**Syntax**
```sql
ignore(x)
```
## sleep
Used to introduce a delay or pause in the execution of a query. It is primarily used for testing and debugging purposes.
**Syntax**
```sql
sleep(seconds)
```
**Arguments**
- `seconds`: The number of seconds to pause the query execution, up to a maximum of 3 seconds. It can be a floating-point value to specify fractional seconds. [UInt*](../data-types/int-uint.md) or [Float*](../data-types/float.md).
**Returned value**
- `0`, as the example output shows. The value carries no information.
**Example**
```sql
SELECT sleep(2);
```
The function returns no meaningful value. If you run the query with `clickhouse client`, you will see something similar to:
```response
SELECT sleep(2)
Query id: 8aa9943e-a686-45e1-8317-6e8e3a5596ac
┌─sleep(2)─┐
│ 0 │
└──────────┘
1 row in set. Elapsed: 2.012 sec.
```
This query will pause for 2 seconds before completing. During this time, no results will be returned, and the query will appear to be hanging or unresponsive.
**Implementation details**
The `sleep()` function is generally not used in production environments, as it can negatively impact query performance and system responsiveness. However, it can be useful in the following scenarios:
1. **Testing**: When testing or benchmarking ClickHouse, you may want to simulate delays or introduce pauses to observe how the system behaves under certain conditions.
2. **Debugging**: If you need to examine the state of the system or the execution of a query at a specific point in time, you can use `sleep()` to introduce a pause, allowing you to inspect or collect relevant information.
3. **Simulation**: In some cases, you may want to simulate real-world scenarios where delays or pauses occur, such as network latency or external system dependencies.
It's important to use the `sleep()` function judiciously and only when necessary, as it can potentially impact the overall performance and responsiveness of your ClickHouse system.
## sleepEachRow
Pauses the execution of a query for a specified number of seconds for each row in the result set.
**Syntax**
```sql
sleepEachRow(seconds)
```
**Arguments**
- `seconds`: The number of seconds to pause the query execution for each row in the result set, up to a maximum of 3 seconds. It can be a floating-point value to specify fractional seconds. [UInt*](../data-types/int-uint.md) or [Float*](../data-types/float.md).
**Returned value**
- `0` for each row, as the example output shows.
**Example**
```sql
SELECT number, sleepEachRow(0.5) FROM system.numbers LIMIT 5;
```
```response
┌─number─┬─sleepEachRow(0.5)─┐
│ 0 │ 0 │
│ 1 │ 0 │
│ 2 │ 0 │
│ 3 │ 0 │
│ 4 │ 0 │
└────────┴───────────────────┘
```
The output is delayed by a 0.5-second pause for each row.
The `sleepEachRow()` function is primarily used for testing and debugging purposes, similar to the `sleep()` function. It allows you to simulate delays or introduce pauses in the processing of each row, which can be useful in scenarios such as:
1. **Testing**: When testing or benchmarking ClickHouse's performance under specific conditions, you can use `sleepEachRow()` to simulate delays or introduce pauses for each row processed.
2. **Debugging**: If you need to examine the state of the system or the execution of a query for each row processed, you can use `sleepEachRow()` to introduce pauses, allowing you to inspect or collect relevant information.
3. **Simulation**: In some cases, you may want to simulate real-world scenarios where delays or pauses occur for each row processed, such as when dealing with external systems or network latencies.
Like the [`sleep()` function](#sleep), it's important to use `sleepEachRow()` judiciously and only when necessary, as it can significantly impact the overall performance and responsiveness of your ClickHouse system, especially when dealing with large result sets.
## currentDatabase
Returns the name of the current database.
Useful in table engine parameters of `CREATE TABLE` queries where you need to specify the database.
**Syntax**
```sql
currentDatabase()
```
## currentUser {#currentUser}
Returns the name of the current user. In case of a distributed query, the name of the user who initiated the query is returned.
**Syntax**
```sql
currentUser()
```
Aliases: `user()`, `USER()`, `current_user()`. Aliases are case insensitive.
**Returned values**
- The name of the current user. [String](../data-types/string.md).
- In distributed queries, the login of the user who initiated the query. [String](../data-types/string.md).
**Example**
```sql
SELECT currentUser();
```
Result:
```text
┌─currentUser()─┐
│ default       │
└───────────────┘
```
## isConstant
Returns whether the argument is a constant expression.
A constant expression is an expression whose result is known during query analysis, i.e. before execution. For example, expressions over [literals](../../sql-reference/syntax.md#literals) are constant expressions.
This function is mostly intended for development, debugging and demonstration.
**Syntax**
```sql
isConstant(x)
```
**Arguments**
- `x` — Expression to check.
**Returned values**
- `1` if `x` is constant. [UInt8](../data-types/int-uint.md).
- `0` if `x` is non-constant. [UInt8](../data-types/int-uint.md).
**Examples**
Query:
```sql
SELECT isConstant(x + 1) FROM (SELECT 43 AS x)
```
Result:
```text
┌─isConstant(plus(x, 1))─┐
│                      1 │
└────────────────────────┘
```
Query:
```sql
WITH 3.14 AS pi SELECT isConstant(cos(pi))
```
Result:
```text
┌─isConstant(cos(pi))─┐
│                   1 │
└─────────────────────┘
```
Query:
```sql
SELECT isConstant(number) FROM numbers(1)
```
Result:
```text
┌─isConstant(number)─┐
│                  0 │
└────────────────────┘
```
## hasColumnInTable
Given the database name, the table name, and the column name as constant strings, returns 1 if the given column exists, otherwise 0.
**Syntax**
```sql
hasColumnInTable(['hostname'[, 'username'[, 'password']],] 'database', 'table', 'column')
```
**Parameters**
- `database`: name of the database. [String literal](../syntax#syntax-string-literal)
- `table`: name of the table. [String literal](../syntax#syntax-string-literal)
- `column`: name of the column. [String literal](../syntax#syntax-string-literal)
- `hostname`: remote server name to perform the check on. [String literal](../syntax#syntax-string-literal)
- `username`: username for remote server. [String literal](../syntax#syntax-string-literal)
- `password`: password for remote server. [String literal](../syntax#syntax-string-literal)
**Returned value**
- `1` if the given column exists.
- `0` otherwise.
**Implementation details**
For elements in a nested data structure, the function checks for the existence of a column. For the nested data structure itself, the function returns 0.
**Example**
Query:
```sql
SELECT hasColumnInTable('system','metrics','metric')
```
```response
1
```
```sql
SELECT hasColumnInTable('system','metrics','non-existing_column')
```
```response
0
```
## hasThreadFuzzer
Returns whether Thread Fuzzer is effective. It can be used in tests to prevent runs from being too long.
**Syntax**
```sql
hasThreadFuzzer();
```
## bar
Builds a bar chart.
`bar(x, min, max, width)` draws a band with width proportional to `(x - min)` and equal to `width` characters when `x = max`.
**Arguments**
- `x` — Size to display.
- `min, max` — Integer constants. The value must fit in `Int64`.
- `width` — Constant, positive integer, can be fractional.
The band is drawn with accuracy to one eighth of a symbol.
Example:
```sql
SELECT
toHour(EventTime) AS h,
count() AS c,
bar(c, 0, 600000, 20) AS bar
FROM test.hits
GROUP BY h
ORDER BY h ASC
```
```text
┌──h─┬──────c─┬─bar────────────────┐
│ 0 │ 292907 │ █████████▋ │
│ 1 │ 180563 │ ██████ │
│ 2 │ 114861 │ ███▋ │
│ 3 │ 85069 │ ██▋ │
│ 4 │ 68543 │ ██▎ │
│ 5 │ 78116 │ ██▌ │
│ 6 │ 113474 │ ███▋ │
│ 7 │ 170678 │ █████▋ │
│ 8 │ 278380 │ █████████▎ │
│ 9 │ 391053 │ █████████████ │
│ 10 │ 457681 │ ███████████████▎ │
│ 11 │ 493667 │ ████████████████▍ │
│ 12 │ 509641 │ ████████████████▊ │
│ 13 │ 522947 │ █████████████████▍ │
│ 14 │ 539954 │ █████████████████▊ │
│ 15 │ 528460 │ █████████████████▌ │
│ 16 │ 539201 │ █████████████████▊ │
│ 17 │ 523539 │ █████████████████▍ │
│ 18 │ 506467 │ ████████████████▊ │
│ 19 │ 520915 │ █████████████████▎ │
│ 20 │ 521665 │ █████████████████▍ │
│ 21 │ 542078 │ ██████████████████ │
│ 22 │ 493642 │ ████████████████▍ │
│ 23 │ 400397 │ █████████████▎ │
└────┴────────┴────────────────────┘
```
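How a value is turned into full and partial block characters can be sketched in Python. This is an illustrative model only: clamping `x` to `[min, max]` and rounding the fractional cell down are assumptions made here, so individual glyphs may differ from what ClickHouse prints.

```python
import math

# Left-aligned partial blocks: index k draws k/8 of one character cell.
EIGHTHS = ["", "▏", "▎", "▍", "▌", "▋", "▊", "▉"]


def bar(x, lo, hi, width):
    """Render a band whose length is proportional to (x - lo)."""
    if hi <= lo:
        raise ValueError("max must be greater than min")
    # Assumption: values outside [lo, hi] are clamped to the range.
    x = min(max(x, lo), hi)
    cells = (x - lo) / (hi - lo) * width
    full = math.floor(cells)
    # Assumption: the fractional part is rounded down to whole eighths.
    eighths = math.floor((cells - full) * 8)
    return "█" * full + EIGHTHS[eighths]


print(bar(68543, 0, 600000, 20))
print(bar(600000, 0, 600000, 20))
```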
## transform
Transforms a value according to the explicitly defined mapping of some elements to other ones.
There are two variations of this function:
### transform(x, array_from, array_to, default)
- `x` – What to transform.
- `array_from` – Constant array of values to convert.
- `array_to` – Constant array of values to convert the values in `array_from` to.
- `default` – Which value to use if `x` is not equal to any of the values in `array_from`.
`array_from` and `array_to` must have equally many elements.
For `x` equal to one of the elements in `array_from`, the function returns the corresponding element in `array_to`, i.e. the one at the same array index. Otherwise, it returns `default`. If multiple matching elements exist in `array_from`, it returns the element corresponding to the first of them.
Signature:
`transform(T, Array(T), Array(U), U) -> U`
`T` and `U` can be numeric, string, or Date or DateTime types.
The same letter (T or U) means that types must be mutually compatible and not necessarily equal.
For example, the first argument could have type `Int64`, while the second argument could have type `Array(UInt16)`.
Example:
```sql
SELECT
transform(SearchEngineID, [2, 3], ['Yandex', 'Google'], 'Other') AS title,
count() AS c
FROM test.hits
WHERE SearchEngineID != 0
GROUP BY title
ORDER BY c DESC
```
```text
┌─title─────┬──────c─┐
│ Yandex │ 498635 │
│ Google │ 229872 │
│ Other │ 104472 │
└───────────┴────────┘
```
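The lookup rule for the four-argument form can be written out as a short Python sketch. It models the documented semantics (first match wins, `default` otherwise) and is not how ClickHouse implements the function.

```python
def transform(x, array_from, array_to, default):
    """Map x through parallel arrays, falling back to `default`."""
    if len(array_from) != len(array_to):
        raise ValueError("array_from and array_to must have equal length")
    # Scan in order so that the first matching element wins.
    for source, target in zip(array_from, array_to):
        if x == source:
            return target
    return default


print(transform(2, [2, 3], ["Yandex", "Google"], "Other"))
print(transform(7, [2, 3], ["Yandex", "Google"], "Other"))
```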
### transform(x, array_from, array_to)
Similar to the other variation but has no `default` argument. In case no match can be found, `x` is returned.
Example:
```sql
SELECT
transform(domain(Referer), ['yandex.ru', 'google.ru', 'vkontakte.ru'], ['www.yandex', 'example.com', 'vk.com']) AS s,
count() AS c
FROM test.hits
GROUP BY domain(Referer)
ORDER BY count() DESC
LIMIT 10
```
```text
┌─s──────────────┬───────c─┐
│ │ 2906259 │
│ www.yandex │ 867767 │
│ ███████.ru │ 313599 │
│ mail.yandex.ru │ 107147 │
│ ██████.ru │ 100355 │
│ █████████.ru │ 65040 │
│ news.yandex.ru │ 64515 │
│ ██████.net │ 59141 │
│ example.com │ 57316 │
└────────────────┴─────────┘
```
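The pass-through behaviour of the three-argument form can be sketched in Python as a dictionary lookup that falls back to the input. The name `transform_or_self` is hypothetical, and keeping the first of several duplicate keys assumes that the first-match rule of the four-argument form carries over.

```python
def transform_or_self(x, array_from, array_to):
    """Map x through parallel arrays; return x unchanged if nothing matches."""
    if len(array_from) != len(array_to):
        raise ValueError("array_from and array_to must have equal length")
    mapping = {}
    for source, target in zip(array_from, array_to):
        # setdefault keeps the first target seen for a repeated source.
        mapping.setdefault(source, target)
    return mapping.get(x, x)


domains = ["yandex.ru", "google.ru", "vkontakte.ru"]
labels = ["www.yandex", "example.com", "vk.com"]
print(transform_or_self("yandex.ru", domains, labels))
print(transform_or_self("mail.yandex.ru", domains, labels))
```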
## formatReadableDecimalSize
Given a size (number of bytes), this function returns a readable, rounded size with suffix (KB, MB, etc.) as string.
The inverse operations of this function are [parseReadableSize](#parseReadableSize), [parseReadableSizeOrZero](#parseReadableSizeOrZero), and [parseReadableSizeOrNull](#parseReadableSizeOrNull).
**Syntax**
```sql
formatReadableDecimalSize(x)
```
**Example**
Query:
```sql
SELECT
arrayJoin([1, 1024, 1024*1024, 192851925]) AS filesize_bytes,
formatReadableDecimalSize(filesize_bytes) AS filesize
```
Result:
```text
┌─filesize_bytes─┬─filesize───┐
│              1 │ 1.00 B     │
│           1024 │ 1.02 KB    │
│        1048576 │ 1.05 MB    │
│      192851925 │ 192.85 MB  │
└────────────────┴────────────┘
```
## formatReadableSize
Given a size (number of bytes), this function returns a readable, rounded size with suffix (KiB, MiB, etc.) as string.
The inverse operations of this function are [parseReadableSize](#parseReadableSize), [parseReadableSizeOrZero](#parseReadableSizeOrZero), and [parseReadableSizeOrNull](#parseReadableSizeOrNull).
**Syntax**
```sql
formatReadableSize(x)
```
Alias: `FORMAT_BYTES`.
**Example**
Query:
```sql
SELECT
arrayJoin([1, 1024, 1024*1024, 192851925]) AS filesize_bytes,
formatReadableSize(filesize_bytes) AS filesize
```
Result:
```text
┌─filesize_bytes─┬─filesize───┐
│              1 │ 1.00 B     │
│           1024 │ 1.00 KiB   │
│        1048576 │ 1.00 MiB   │
│      192851925 │ 183.92 MiB │
└────────────────┴────────────┘
```
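The binary formatting here and the decimal formatting of `formatReadableDecimalSize` differ only in the divisor (1024 versus 1000) and the suffix set. A Python sketch of that shared arithmetic follows; the helper name and the unit lists beyond the suffixes shown above are assumptions, and the code is not ClickHouse's implementation.

```python
BINARY_UNITS = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB", "ZiB", "YiB"]
DECIMAL_UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]


def format_readable(size, base, units):
    """Divide by `base` until the value fits, then print two decimals."""
    value = float(size)
    index = 0
    while abs(value) >= base and index < len(units) - 1:
        value /= base
        index += 1
    return f"{value:.2f} {units[index]}"


for n in (1, 1024, 1024 * 1024, 192851925):
    print(n, format_readable(n, 1024, BINARY_UNITS),
          format_readable(n, 1000, DECIMAL_UNITS))
```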
## formatReadableQuantity
Given a number, this function returns a rounded number with suffix (thousand, million, billion, etc.) as string.
**Syntax**
```sql
formatReadableQuantity(x)
```
**Example**
Query:
```sql
SELECT
arrayJoin([1024, 1234 * 1000, (4567 * 1000) * 1000, 98765432101234]) AS number,
formatReadableQuantity(number) AS number_for_humans
```
Result:
```text
┌─────────number─┬─number_for_humans─┐
│ 1024 │ 1.02 thousand │
│ 1234000 │ 1.23 million │
│ 4567000000 │ 4.57 billion │
│ 98765432101234 │ 98.77 trillion │
└────────────────┴───────────────────┘
```
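The same scaling idea, with powers of 1000 and word suffixes, reproduces the table above. In this Python sketch the suffix list is deliberately limited to the words shown in the example, and the output for values below one thousand is an assumption.

```python
# Only the suffixes that appear in the example above; larger magnitudes
# would need further entries.
SUFFIXES = ["", " thousand", " million", " billion", " trillion"]


def format_readable_quantity(x):
    """Scale x by powers of 1000 and append the matching word."""
    value = float(x)
    index = 0
    while abs(value) >= 1000 and index < len(SUFFIXES) - 1:
        value /= 1000
        index += 1
    return f"{value:.2f}{SUFFIXES[index]}"


for n in (1024, 1234 * 1000, 4567 * 1000 * 1000, 98765432101234):
    print(n, format_readable_quantity(n))
```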
## formatReadableTimeDelta
Given a time interval (delta) in seconds, this function returns a time delta with year/month/day/hour/minute/second/millisecond/microsecond/nanosecond as string.
**Syntax**
```sql
formatReadableTimeDelta(column[, maximum_unit, minimum_unit])
```
**Arguments**
- `column` — A column with a numeric time delta.
- `maximum_unit` — Optional. Maximum unit to show.
    - Acceptable values: `nanoseconds`, `microseconds`, `milliseconds`, `seconds`, `minutes`, `hours`, `days`, `months`, `years`.
    - Default value: `years`.
- `minimum_unit` — Optional. Minimum unit to show. All smaller units are truncated.
    - Acceptable values: `nanoseconds`, `microseconds`, `milliseconds`, `seconds`, `minutes`, `hours`, `days`, `months`, `years`.
    - If the explicitly specified value is bigger than `maximum_unit`, an exception will be thrown.
    - Default value: `seconds` if `maximum_unit` is `seconds` or bigger, `nanoseconds` otherwise.
**Example**
```sql
SELECT
arrayJoin([100, 12345, 432546534]) AS elapsed,
formatReadableTimeDelta(elapsed) AS time_delta
```
```text
┌────elapsed─┬─time_delta──────────────────────────────────────────────────────┐
│ 100 │ 1 minute and 40 seconds │
│ 12345 │ 3 hours, 25 minutes and 45 seconds │
│ 432546534 │ 13 years, 8 months, 17 days, 7 hours, 48 minutes and 54 seconds │
└────────────┴─────────────────────────────────────────────────────────────────┘
```
```sql
SELECT
arrayJoin([100, 12345, 432546534]) AS elapsed,
formatReadableTimeDelta(elapsed, 'minutes') AS time_delta
```
```text
┌────elapsed─┬─time_delta──────────────────────────────────────────────────────┐
│ 100 │ 1 minute and 40 seconds │
│ 12345 │ 205 minutes and 45 seconds │
│ 432546534 │ 7209108 minutes and 54 seconds │
└────────────┴─────────────────────────────────────────────────────────────────┘
```
```sql
SELECT
arrayJoin([100, 12345, 432546534.00000006]) AS elapsed,
formatReadableTimeDelta(elapsed, 'minutes', 'nanoseconds') AS time_delta
```
```text
┌────────────elapsed─┬─time_delta─────────────────────────────────────┐
│ 100 │ 1 minute and 40 seconds │
│ 12345 │ 205 minutes and 45 seconds │
│ 432546534.00000006 │ 7209108 minutes, 54 seconds and 60 nanoseconds │
└────────────────────┴────────────────────────────────────────────────┘
```
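The decomposition in the first two examples can be reproduced with repeated integer division. The Python sketch below handles whole seconds only; the unit lengths (a year of 365 days, a month of 30.5 days) are inferred from the example output rather than stated in the text, and skipping zero-valued units is an assumption.

```python
# (singular name, length in seconds); lengths inferred from the examples.
UNITS = [
    ("year", 365 * 86400),
    ("month", 2635200),  # 30.5 days
    ("day", 86400),
    ("hour", 3600),
    ("minute", 60),
    ("second", 1),
]


def format_readable_time_delta(seconds, maximum_unit="years"):
    """Split whole seconds into units no larger than `maximum_unit`."""
    plural_names = [name + "s" for name, _ in UNITS]
    start = plural_names.index(maximum_unit)
    remaining = int(seconds)
    parts = []
    for name, length in UNITS[start:]:
        count, remaining = divmod(remaining, length)
        if count:  # assumption: units with a zero count are omitted
            parts.append(f"{count} {name}" + ("" if count == 1 else "s"))
    if not parts:
        return "0 seconds"
    if len(parts) == 1:
        return parts[0]
    return ", ".join(parts[:-1]) + " and " + parts[-1]


print(format_readable_time_delta(432546534))
print(format_readable_time_delta(12345, "minutes"))
```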
## parseReadableSize
Given a string containing a byte size and `B`, `KiB`, `KB`, `MiB`, `MB`, etc. as a unit (i.e. [ISO/IEC 80000-13](https://en.wikipedia.org/wiki/ISO/IEC_80000) or decimal byte unit), this function returns the corresponding number of bytes.
If the function is unable to parse the input value, it throws an exception.
The inverse operations of this function are [formatReadableSize](#formatReadableSize) and [formatReadableDecimalSize](#formatReadableDecimalSize).
**Syntax**
```sql
parseReadableSize(x)
```
**Arguments**
- `x` : Readable size with ISO/IEC 80000-13 or decimal byte unit ([String](../../sql-reference/data-types/string.md)).
**Returned value**
- Number of bytes, rounded up to the nearest integer ([UInt64](../../sql-reference/data-types/int-uint.md)).
**Example**
```sql
SELECT
arrayJoin(['1 B', '1 KiB', '3 MB', '5.314 KiB']) AS readable_sizes,
parseReadableSize(readable_sizes) AS sizes;
```
```text
┌─readable_sizes─┬───sizes─┐
│ 1 B │ 1 │
│ 1 KiB │ 1024 │
│ 3 MB │ 3000000 │
│ 5.314 KiB │ 5442 │
└────────────────┴─────────┘
```
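The conversion amounts to multiplying the number by the unit's factor and rounding up. A Python sketch under stated assumptions: the unit table is limited to a few common suffixes, matching is case-sensitive, and `ValueError` stands in for the exception ClickHouse throws.

```python
import math
import re
from decimal import Decimal

# Assumed subset of units: decimal suffixes use powers of 1000,
# binary (ISO/IEC 80000-13) suffixes use powers of 1024.
UNIT_FACTORS = {
    "B": 1,
    "KB": 1000, "MB": 1000**2, "GB": 1000**3, "TB": 1000**4,
    "KiB": 1024, "MiB": 1024**2, "GiB": 1024**3, "TiB": 1024**4,
}
PATTERN = re.compile(r"\s*(\d+(?:\.\d+)?)\s*([A-Za-z]+)\s*")


def parse_readable_size(text):
    """Return the byte count for strings such as '5.314 KiB'."""
    match = PATTERN.fullmatch(text)
    if match is None or match.group(2) not in UNIT_FACTORS:
        raise ValueError(f"cannot parse readable size: {text!r}")
    # Decimal avoids binary floating-point error before rounding up.
    exact = Decimal(match.group(1)) * UNIT_FACTORS[match.group(2)]
    return math.ceil(exact)


for s in ("1 B", "1 KiB", "3 MB", "5.314 KiB"):
    print(s, parse_readable_size(s))
```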
## parseReadableSizeOrNull
Given a string containing a byte size and `B`, `KiB`, `KB`, `MiB`, `MB`, etc. as a unit (i.e. [ISO/IEC 80000-13](https://en.wikipedia.org/wiki/ISO/IEC_80000) or decimal byte unit), this function returns the corresponding number of bytes.
If the function is unable to parse the input value, it returns `NULL`.
The inverse operations of this function are [formatReadableSize](#formatReadableSize) and [formatReadableDecimalSize](#formatReadableDecimalSize).
**Syntax**
```sql
parseReadableSizeOrNull(x)
```
**Arguments**
- `x` : Readable size with ISO/IEC 80000-13 or decimal byte unit ([String](../../sql-reference/data-types/string.md)).
**Returned value**
- Number of bytes, rounded up to the nearest integer, or NULL if unable to parse the input (Nullable([UInt64](../../sql-reference/data-types/int-uint.md))).
**Example**
```sql
SELECT
arrayJoin(['1 B', '1 KiB', '3 MB', '5.314 KiB', 'invalid']) AS readable_sizes,
parseReadableSizeOrNull(readable_sizes) AS sizes;
```
```text
┌─readable_sizes─┬───sizes─┐
│ 1 B │ 1 │
│ 1 KiB │ 1024 │
│ 3 MB │ 3000000 │
│ 5.314 KiB │ 5442 │
│ invalid │ ᴺᵁᴸᴸ │
└────────────────┴─────────┘
```
## parseReadableSizeOrZero
Given a string containing a byte size and `B`, `KiB`, `KB`, `MiB`, `MB`, etc. as a unit (i.e. [ISO/IEC 80000-13](https://en.wikipedia.org/wiki/ISO/IEC_80000) or decimal byte unit), this function returns the corresponding number of bytes. If the function is unable to parse the input value, it returns `0`.

The inverse operations of this function are [formatReadableSize](#formatReadableSize) and [formatReadableDecimalSize](#formatReadableDecimalSize).

**Syntax**
```sql
parseReadableSizeOrZero(x)
```
**Arguments**
- `x` : Readable size with ISO/IEC 80000-13 or decimal byte unit ([String](../../sql-reference/data-types/string.md)).
**Returned value**
- Number of bytes, rounded up to the nearest integer, or 0 if unable to parse the input ([UInt64](../../sql-reference/data-types/int-uint.md)).
**Example**
```sql
SELECT
arrayJoin(['1 B', '1 KiB', '3 MB', '5.314 KiB', 'invalid']) AS readable_sizes,
parseReadableSizeOrZero(readable_sizes) AS sizes;
```
```text
┌─readable_sizes─┬───sizes─┐
│ 1 B │ 1 │
│ 1 KiB │ 1024 │
│ 3 MB │ 3000000 │
│ 5.314 KiB │ 5442 │
│ invalid │ 0 │
└────────────────┴─────────┘
```
2022-07-10 11:39:45 +00:00
## parseTimeDelta
Parses a sequence of numbers, each followed by something resembling a time unit, and returns the total as a number of seconds.
**Syntax**
```sql
parseTimeDelta(timestr)
```
**Arguments**
2023-04-19 15:55:29 +00:00
- `timestr` — A sequence of numbers followed by something resembling a time unit.
2022-07-10 11:39:45 +00:00
**Returned value**
2023-04-19 15:55:29 +00:00
- A floating-point number with the number of seconds.
2022-07-10 11:39:45 +00:00
**Example**
```sql
SELECT parseTimeDelta('11s+22min')
```
```text
┌─parseTimeDelta('11s+22min')─┐
│ 1331 │
└─────────────────────────────┘
```
```sql
SELECT parseTimeDelta('1yr2mo')
```
```text
┌─parseTimeDelta('1yr2mo')─┐
│ 36806400 │
└──────────────────────────┘
```
2024-05-17 06:33:08 +00:00
## least
2017-12-28 15:13:23 +00:00
2023-06-01 18:27:34 +00:00
Returns the smaller value of a and b.
2017-12-28 15:13:23 +00:00
2024-05-17 06:33:08 +00:00
**Syntax**
```sql
least(a, b)
```
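A minimal usage sketch (the aliases are illustrative): the first query compares two constants, and the second uses `least` to cap a column at an upper bound.

```sql
SELECT least(3, 7) AS smaller; -- 3

-- Cap each value at 3: yields 0, 1, 2, 3, 3, 3
SELECT number, least(number, 3) AS capped FROM numbers(6);
```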
## greatest
2017-12-28 15:13:23 +00:00
2023-06-01 18:27:34 +00:00
Returns the larger value of a and b.
2017-12-28 15:13:23 +00:00
2024-05-17 06:33:08 +00:00
**Syntax**
```sql
greatest(a, b)
```
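A minimal usage sketch (the aliases are illustrative): the first query compares two constants, and the second uses `greatest` to enforce a lower bound on a column.

```sql
SELECT greatest(3, 7) AS larger; -- 7

-- Raise each value to at least 3: yields 3, 3, 3, 3, 4, 5
SELECT number, greatest(number, 3) AS floored FROM numbers(6);
```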
## uptime
2017-12-28 15:13:23 +00:00
2020-03-20 10:10:48 +00:00
Returns the server's uptime in seconds.
2023-06-01 18:27:34 +00:00
If executed in the context of a distributed table, this function generates a normal column with values relevant to each shard. Otherwise it produces a constant value.
2017-12-28 15:13:23 +00:00
2024-04-10 02:40:14 +00:00
**Syntax**
``` sql
uptime()
```
**Returned value**
2024-05-24 03:54:16 +00:00
- Server uptime in seconds. [UInt32](../data-types/int-uint.md).
2024-04-10 02:40:14 +00:00
**Example**
Query:
``` sql
SELECT uptime() as Uptime;
```
Result:
``` response
┌─Uptime─┐
│ 55867 │
└────────┘
```
2024-05-17 06:33:08 +00:00
## version
2017-12-28 15:13:23 +00:00
2024-02-23 16:26:29 +00:00
Returns the current version of ClickHouse as a string in the form:

- Major version
- Minor version
- Patch version
- Number of commits since the previous stable release.

```plaintext
major_version.minor_version.patch_version.number_of_commits_since_the_previous_stable_release
```
If executed in the context of a distributed table, this function generates a normal column with values relevant to each shard. Otherwise, it produces a constant value.
**Syntax**
```sql
version()
```
**Arguments**
None.
**Returned value**
2024-05-23 13:48:20 +00:00
- Current version of ClickHouse. [String](../data-types/string.md).
2024-02-23 16:26:29 +00:00
**Implementation details**
None.
**Example**
Query:
```sql
SELECT version()
```
Result:
```response
┌─version()─┐
│ 24.2.1.1 │
└───────────┘
```
2017-12-28 15:13:23 +00:00
2024-05-17 06:33:08 +00:00
## buildId
2021-10-08 05:05:12 +00:00
2021-10-09 18:37:28 +00:00
Returns the build ID generated by a compiler for the running ClickHouse server binary.
2023-06-01 18:27:34 +00:00
If executed in the context of a distributed table, this function generates a normal column with values relevant to each shard. Otherwise it produces a constant value.
2021-10-08 05:05:12 +00:00
2024-05-17 06:33:08 +00:00
**Syntax**
```sql
buildId()
```
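A minimal usage sketch. The returned identifier depends on the binary that is running, so no sample output is shown. Pairing it with `version()` is just one illustrative way to record which build served a query, and the aliases are assumptions.

```sql
SELECT
    version() AS server_version,
    buildId() AS build_id;
```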
2024-05-15 13:42:40 +00:00
## blockNumber
2019-01-30 10:39:46 +00:00
2024-05-21 20:39:35 +00:00
Returns a monotonically increasing sequence number of the [block ](../../development/architecture.md#block ) containing the row.
2024-05-21 20:41:20 +00:00
The returned block number is updated on a best-effort basis, i.e. it may not be fully accurate.
2019-01-30 10:39:46 +00:00
2024-05-15 13:42:40 +00:00
**Syntax**
```sql
blockNumber()
```
**Returned value**
- Sequence number of the data block where the row is located. [UInt64 ](../data-types/int-uint.md ).
**Example**
Query:
```sql
SELECT blockNumber()
FROM
(
    SELECT *
    FROM system.numbers
    LIMIT 10
) SETTINGS max_block_size = 2
```
Result:
```response
┌─blockNumber()─┐
│ 7 │
│ 7 │
└───────────────┘
┌─blockNumber()─┐
│ 8 │
│ 8 │
└───────────────┘
┌─blockNumber()─┐
│ 9 │
│ 9 │
└───────────────┘
┌─blockNumber()─┐
│ 10 │
│ 10 │
└───────────────┘
┌─blockNumber()─┐
│ 11 │
│ 11 │
└───────────────┘
```
## rowNumberInBlock {#rowNumberInBlock}
Returns, for each [block](../../development/architecture.md#block) processed by `rowNumberInBlock`, the number of the current row.
Numbering restarts at 0 for each block.

**Syntax**
```sql
rowNumberInBlock()
```
**Returned value**
- Ordinal number of the row in the data block starting from 0. [UInt64 ](../data-types/int-uint.md ).
**Example**
Query:
```sql
SELECT rowNumberInBlock()
FROM
(
    SELECT *
    FROM system.numbers_mt
    LIMIT 10
) SETTINGS max_block_size = 2
```
Result:
```response
┌─rowNumberInBlock()─┐
│ 0 │
│ 1 │
└────────────────────┘
┌─rowNumberInBlock()─┐
│ 0 │
│ 1 │
└────────────────────┘
┌─rowNumberInBlock()─┐
│ 0 │
│ 1 │
└────────────────────┘
┌─rowNumberInBlock()─┐
│ 0 │
│ 1 │
└────────────────────┘
┌─rowNumberInBlock()─┐
│ 0 │
│ 1 │
└────────────────────┘
```
## rowNumberInAllBlocks
Returns a unique row number for each row processed by `rowNumberInAllBlocks`. The returned numbers start at 0.

**Syntax**
```sql
rowNumberInAllBlocks()
```
**Returned value**
- Ordinal number of the row across all processed data blocks, starting from 0. [UInt64](../data-types/int-uint.md).
**Example**
Query:
```sql
SELECT rowNumberInAllBlocks()
FROM
(
    SELECT *
    FROM system.numbers_mt
    LIMIT 10
)
SETTINGS max_block_size = 2
```
Result:
```response
┌─rowNumberInAllBlocks()─┐
│ 0 │
│ 1 │
└────────────────────────┘
┌─rowNumberInAllBlocks()─┐
│ 4 │
│ 5 │
└────────────────────────┘
┌─rowNumberInAllBlocks()─┐
│ 2 │
│ 3 │
└────────────────────────┘
┌─rowNumberInAllBlocks()─┐
│ 6 │
│ 7 │
└────────────────────────┘
┌─rowNumberInAllBlocks()─┐
│ 8 │
│ 9 │
└────────────────────────┘
```
2019-01-30 10:39:46 +00:00
2022-06-02 10:55:18 +00:00
## neighbor
2019-08-13 13:11:24 +00:00
2023-06-01 18:27:34 +00:00
The window function that provides access to a row at a specified offset before or after the current row of a given column.
2019-08-13 13:11:24 +00:00
2019-11-08 13:15:45 +00:00
**Syntax**
2024-03-12 16:57:34 +00:00
```sql
neighbor(column, offset[, default_value])
```
2019-08-13 13:11:24 +00:00
The result of the function depends on the affected data blocks and the order of data in the block.
2020-10-28 14:13:27 +00:00
2023-06-01 18:27:34 +00:00
:::note
Only returns neighbor inside the currently processed data block.
Because of this error-prone behavior, the function is DEPRECATED. Please use proper window functions instead.
:::
2020-10-28 14:13:27 +00:00
2023-06-01 18:27:34 +00:00
The order of rows during calculation of `neighbor()` can differ from the order of rows returned to the user.
To prevent that, you can create a subquery with [ORDER BY](../../sql-reference/statements/select/order-by.md) and call the function from outside the subquery.
2019-08-13 13:11:24 +00:00
2021-02-15 21:22:10 +00:00
**Arguments**
2019-11-08 13:15:45 +00:00
- `column` — A column name or scalar expression.
- `offset` — The number of rows to look before or ahead of the current row in `column`. [Int64](../data-types/int-uint.md).
- `default_value` — Optional. The returned value if offset is beyond the block boundaries. Has the type of the data blocks affected.
2019-11-08 13:15:45 +00:00
2019-11-12 08:01:46 +00:00
**Returned values**
2019-11-08 13:15:45 +00:00
- Value of `column` with `offset` distance from current row, if `offset` is not outside the block boundaries.
- The default value of `column` or `default_value` (if given), if `offset` is outside the block boundaries.
2019-11-08 13:15:45 +00:00
2024-05-23 13:48:20 +00:00
:::note
The return type will be that of the data blocks affected or the default value type.
:::
2019-11-08 13:15:45 +00:00
**Example**
Query:
2024-03-12 16:57:34 +00:00
```sql
SELECT number, neighbor(number, 2) FROM system.numbers LIMIT 10;
```
Result:
2024-03-12 16:57:34 +00:00
```text
┌─number─┬─neighbor(number, 2)─┐
│ 0 │ 2 │
│ 1 │ 3 │
│ 2 │ 4 │
│ 3 │ 5 │
│ 4 │ 6 │
│ 5 │ 7 │
│ 6 │ 8 │
│ 7 │ 9 │
│ 8 │ 0 │
│ 9 │ 0 │
└────────┴─────────────────────┘
```
Query:
2024-03-12 16:57:34 +00:00
```sql
SELECT number, neighbor(number, 2, 999) FROM system.numbers LIMIT 10;
```
Result:
2024-03-12 16:57:34 +00:00
```text
┌─number─┬─neighbor(number, 2, 999)─┐
│ 0 │ 2 │
│ 1 │ 3 │
│ 2 │ 4 │
│ 3 │ 5 │
│ 4 │ 6 │
│ 5 │ 7 │
│ 6 │ 8 │
│ 7 │ 9 │
│ 8 │ 999 │
│ 9 │ 999 │
└────────┴──────────────────────────┘
```
2019-08-13 13:11:24 +00:00
This function can be used to compute a year-over-year metric value:
2019-11-08 13:15:45 +00:00
Query:
2024-03-12 16:57:34 +00:00
```sql
WITH toDate('2018-01-01') AS start_date
SELECT
    toStartOfMonth(start_date + (number * 32)) AS month,
    toInt32(month) % 100 AS money,
    neighbor(money, -12) AS prev_year,
    round(prev_year / money, 2) AS year_over_year
FROM numbers(16)
```
2019-11-08 13:15:45 +00:00
Result:
2024-03-12 16:57:34 +00:00
```text
┌──────month─┬─money─┬─prev_year─┬─year_over_year─┐
│ 2018-01-01 │ 32 │ 0 │ 0 │
│ 2018-02-01 │ 63 │ 0 │ 0 │
│ 2018-03-01 │ 91 │ 0 │ 0 │
│ 2018-04-01 │ 22 │ 0 │ 0 │
│ 2018-05-01 │ 52 │ 0 │ 0 │
│ 2018-06-01 │ 83 │ 0 │ 0 │
│ 2018-07-01 │ 13 │ 0 │ 0 │
│ 2018-08-01 │ 44 │ 0 │ 0 │
│ 2018-09-01 │ 75 │ 0 │ 0 │
│ 2018-10-01 │ 5 │ 0 │ 0 │
│ 2018-11-01 │ 36 │ 0 │ 0 │
│ 2018-12-01 │ 66 │ 0 │ 0 │
│ 2019-01-01 │ 97 │ 32 │ 0.33 │
│ 2019-02-01 │ 28 │ 63 │ 2.25 │
│ 2019-03-01 │ 56 │ 91 │ 1.62 │
│ 2019-04-01 │ 87 │ 22 │ 0.25 │
└────────────┴───────┴───────────┴────────────────┘
```
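Since `neighbor` is deprecated in favor of window functions, the lookup from the second example can be sketched with `leadInFrame` instead. This assumes a ClickHouse version with window function support; the frame spans the whole result, so the outcome should not depend on block boundaries. The alias `next_but_one` is illustrative.

```sql
SELECT
    number,
    -- Value two rows ahead; 999 is returned when no such row exists.
    leadInFrame(number, 2, 999) OVER (
        ORDER BY number ASC
        ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
    ) AS next_but_one
FROM numbers(10);
```

For a backward lookup such as `neighbor(money, -12)`, `lagInFrame(money, 12)` with the same frame would be the analogous form.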
2024-05-17 06:33:08 +00:00
## runningDifference {#runningDifference}
2017-12-28 15:13:23 +00:00
2023-06-01 18:27:34 +00:00
Calculates the difference between two consecutive row values in the data block.
Returns 0 for the first row, and for subsequent rows the difference to the previous row.
2017-12-28 15:13:23 +00:00
2023-06-01 18:27:34 +00:00
:::note
Only returns differences inside the currently processed data block.
Because of this error-prone behavior, the function is DEPRECATED. Please use proper window functions instead.
:::
2021-06-22 10:14:24 +00:00
2017-12-28 15:13:23 +00:00
The result of the function depends on the affected data blocks and the order of data in the block.
2020-10-28 14:13:27 +00:00
2023-06-01 18:27:34 +00:00
The order of rows during calculation of `runningDifference()` can differ from the order of rows returned to the user.
To prevent that, you can create a subquery with [ORDER BY](../../sql-reference/statements/select/order-by.md) and call the function from outside the subquery.
2017-12-28 15:13:23 +00:00
2024-05-17 06:33:08 +00:00
**Syntax**
```sql
runningDifference(x)
```
**Example**
Query:
2017-12-28 15:13:23 +00:00
2024-03-12 16:57:34 +00:00
```sql
SELECT
EventID,
EventTime,
runningDifference(EventTime) AS delta
FROM
(
SELECT
EventID,
EventTime
FROM events
WHERE EventDate = '2016-11-24'
ORDER BY EventTime ASC
LIMIT 5
)
```
2024-05-17 06:33:08 +00:00
Result:
2024-03-12 16:57:34 +00:00
```text
┌─EventID─┬───────────EventTime─┬─delta─┐
│ 1106 │ 2016-11-24 00:00:04 │ 0 │
│ 1107 │ 2016-11-24 00:00:05 │ 1 │
│ 1108 │ 2016-11-24 00:00:05 │ 0 │
│ 1109 │ 2016-11-24 00:00:09 │ 4 │
│ 1110 │ 2016-11-24 00:00:10 │ 1 │
└─────────┴─────────────────────┴───────┘
```
2023-06-01 18:27:34 +00:00
Please note that the block size affects the result. The internal state of `runningDifference` is reset for each new block.
2019-09-02 20:15:40 +00:00
2024-05-17 06:33:08 +00:00
Query:
2024-03-12 16:57:34 +00:00
```sql
SELECT
    number,
    runningDifference(number + 1) AS diff
FROM numbers(100000)
WHERE diff != 1
```
2020-03-20 10:10:48 +00:00
2024-05-17 06:33:08 +00:00
Result:
2024-03-12 16:57:34 +00:00
```text
┌─number─┬─diff─┐
│ 0 │ 0 │
└────────┴──────┘
┌─number─┬─diff─┐
│ 65536 │ 0 │
└────────┴──────┘
```
2020-03-20 10:10:48 +00:00
2024-05-17 06:33:08 +00:00
Query:
2024-03-12 16:57:34 +00:00
```sql
SET max_block_size = 100000; -- default value is 65536!

SELECT
    number,
    runningDifference(number + 1) AS diff
FROM numbers(100000)
WHERE diff != 1
```
2020-03-20 10:10:48 +00:00
2024-05-17 06:33:08 +00:00
Result:
2024-03-12 16:57:34 +00:00
```text
┌─number─┬─diff─┐
│ 0 │ 0 │
└────────┴──────┘
```
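Because the state reset at block boundaries is the reason for the deprecation, a block-independent variant can be sketched with a window function. This is an assumption-laden sketch rather than an official replacement: it uses `lagInFrame` with the current value as the default, so the first row yields 0 just as `runningDifference` does.

```sql
SELECT number, diff
FROM
(
    SELECT
        number,
        -- Previous value within the frame; falls back to the current value on the first row.
        number - lagInFrame(number, 1, number) OVER (
            ORDER BY number ASC
            ROWS BETWEEN 1 PRECEDING AND CURRENT ROW
        ) AS diff
    FROM numbers(100000)
)
WHERE diff != 1;
```

Under these assumptions, only the first row (`number = 0`) should pass the filter, regardless of `max_block_size`.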
2022-06-02 10:55:18 +00:00
## runningDifferenceStartingWithFirstValue
2019-01-30 10:39:46 +00:00
2024-04-29 20:34:23 +00:00
:::note
This function is DEPRECATED (see the note for `runningDifference`).
:::

Same as [runningDifference](./other-functions.md#runningDifference), but returns the value of the first row as the result for the first row instead of 0.
2019-01-30 10:39:46 +00:00
2022-06-02 10:55:18 +00:00
## runningConcurrency
2020-12-21 03:08:37 +00:00
2021-03-14 16:27:58 +00:00
Calculates the number of concurrent events.
2021-06-22 10:14:24 +00:00
Each event has a start time and an end time. The start time is included in the event, while the end time is excluded. Columns with a start time and an end time must be of the same data type.
2021-03-14 16:27:58 +00:00
The function calculates the total number of active (concurrent) events for each event start time.
2020-12-21 03:08:37 +00:00
2023-03-27 18:54:05 +00:00
:::tip
Events must be ordered by the start time in ascending order. If this requirement is violated, the function raises an exception. Every data block is processed separately. If events from different data blocks overlap, they cannot be processed correctly.
:::
2021-03-14 17:33:12 +00:00
2020-12-21 03:08:37 +00:00
**Syntax**
2024-03-12 16:57:34 +00:00
```sql
runningConcurrency(start, end)
```
2021-03-08 19:44:45 +00:00
**Arguments**
2020-12-21 03:08:37 +00:00
- `start` — A column with the start time of events. [Date](../data-types/date.md), [DateTime](../data-types/datetime.md), or [DateTime64](../data-types/datetime64.md).
- `end` — A column with the end time of events. [Date](../data-types/date.md), [DateTime](../data-types/datetime.md), or [DateTime64](../data-types/datetime64.md).

**Returned values**

- The number of concurrent events at each event start time. [UInt32](../data-types/int-uint.md).
2020-12-21 03:08:37 +00:00
**Example**
2021-03-10 20:46:29 +00:00
Consider the table:
2020-12-21 03:08:37 +00:00
2024-03-12 16:57:34 +00:00
```text
┌──────start─┬────────end─┐
│ 2021-03-03 │ 2021-03-11 │
│ 2021-03-06 │ 2021-03-12 │
│ 2021-03-07 │ 2021-03-08 │
│ 2021-03-11 │ 2021-03-12 │
└────────────┴────────────┘
```
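The table above could be set up as follows. This is a hypothetical sketch: the `Memory` engine and the `Date` column types are assumptions chosen for brevity, and the column names are quoted with backticks as a precaution.

```sql
-- Hypothetical setup for the example table (engine choice is an assumption).
CREATE TABLE example_table
(
    `start` Date,
    `end` Date
)
ENGINE = Memory;

-- Rows are inserted in ascending order of start time, as the function requires.
INSERT INTO example_table VALUES
    ('2021-03-03', '2021-03-11'),
    ('2021-03-06', '2021-03-12'),
    ('2021-03-07', '2021-03-08'),
    ('2021-03-11', '2021-03-12');
```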
Query:
2024-03-12 16:57:34 +00:00
```sql
SELECT start, runningConcurrency(start, end) FROM example_table;
```
Result:
2024-03-12 16:57:34 +00:00
```text
┌──────start─┬─runningConcurrency(start, end)─┐
│ 2021-03-03 │ 1 │
│ 2021-03-06 │ 2 │
│ 2021-03-07 │ 3 │
│ 2021-03-11 │ 2 │
└────────────┴────────────────────────────────┘
```
2024-05-17 06:33:08 +00:00
## MACNumToString
2017-12-28 15:13:23 +00:00
2023-06-01 18:27:34 +00:00
Interprets a UInt64 number as a MAC address in big endian format. Returns the corresponding MAC address in format AA:BB:CC:DD:EE:FF (colon-separated numbers in hexadecimal form) as a string.
2017-12-28 15:13:23 +00:00
2024-05-17 06:33:08 +00:00
**Syntax**
```sql
MACNumToString(num)
```
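A minimal usage sketch. The input is the integer `0x010203040506` written in decimal and cast explicitly to `UInt64`; the alias is illustrative.

```sql
-- 1108152157446 = 0x010203040506
SELECT MACNumToString(toUInt64(1108152157446)) AS mac; -- 01:02:03:04:05:06
```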
## MACStringToNum
2017-12-28 15:13:23 +00:00
The inverse function of MACNumToString. If the MAC address has an invalid format, it returns 0.
2024-05-17 06:33:08 +00:00
**Syntax**
```sql
MACStringToNum(s)
```
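A minimal round-trip sketch (the aliases are illustrative). Going by the descriptions above, `round_trip` should reproduce the input address and `invalid` should be 0.

```sql
SELECT
    MACStringToNum('01:02:03:04:05:06') AS num,   -- numeric form of the address
    MACNumToString(num) AS round_trip,             -- converts the number back to text
    MACStringToNum('not a mac') AS invalid;        -- malformed input
```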
## MACStringToOUI
2017-12-28 15:13:23 +00:00
2023-06-01 18:27:34 +00:00
Given a MAC address in format AA:BB:CC:DD:EE:FF (colon-separated numbers in hexadecimal form), returns the first three octets as a UInt64 number. If the MAC address has an invalid format, it returns 0.
2018-09-04 11:18:59 +00:00
2024-05-17 06:33:08 +00:00
**Syntax**
```sql
MACStringToOUI(s)
```
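A minimal usage sketch (the aliases are illustrative). For the address below, the first three octets are `01:02:03`, so the result should correspond to `0x010203`; `hex` is used only to make that easier to see.

```sql
SELECT
    MACStringToOUI('01:02:03:04:05:06') AS oui,
    hex(oui) AS oui_hex;
```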
2022-06-02 10:55:18 +00:00
## getSizeOfEnumType
2018-09-04 11:18:59 +00:00
2024-05-24 03:54:16 +00:00
Returns the number of fields in [Enum](../data-types/enum.md).
An exception is thrown if the type is not `Enum`.
2018-09-04 11:18:59 +00:00
2024-05-17 06:33:08 +00:00
**Syntax**
2024-03-12 16:57:34 +00:00
```sql
getSizeOfEnumType(value)
```
2021-02-15 21:22:10 +00:00
**Arguments**

- `value` — Value of type `Enum`.

**Returned values**

- The number of fields in the `Enum` type of the input value.
2018-09-04 11:18:59 +00:00
**Example**
2024-03-12 16:57:34 +00:00
```sql
SELECT getSizeOfEnumType( CAST('a' AS Enum8('a' = 1, 'b' = 2) ) ) AS x
```
2020-03-20 10:10:48 +00:00
2024-03-12 16:57:34 +00:00
```text
┌─x─┐
│ 2 │
└───┘
```
2022-06-02 10:55:18 +00:00
## blockSerializedSize
2020-02-01 19:41:35 +00:00
2023-06-01 18:27:34 +00:00
Returns the size on disk without considering compression.

**Syntax**

```sql
blockSerializedSize(value[, value[, ...]])
```
2021-02-15 21:22:10 +00:00
**Arguments**
2020-02-01 19:41:35 +00:00
2023-04-19 15:55:29 +00:00
- `value` — Any value.
2020-02-01 19:41:35 +00:00
**Returned values**
2023-06-01 18:27:34 +00:00
- The number of bytes that will be written to disk for a block of values without compression.
2020-02-01 19:41:35 +00:00
**Example**
2020-06-29 09:48:18 +00:00
Query:
2024-03-12 16:57:34 +00:00
```sql
SELECT blockSerializedSize(maxState(1)) as x
```
2020-03-20 10:10:48 +00:00
2020-07-09 15:10:35 +00:00
Result:
2020-06-29 09:48:18 +00:00
2024-03-12 16:57:34 +00:00
```text
┌─x─┐
│ 2 │
└───┘
```
2022-06-02 10:55:18 +00:00
## toColumnTypeName
2018-09-04 11:18:59 +00:00
2023-06-01 18:27:34 +00:00
Returns the internal name of the data type that represents the value.
2018-09-04 11:18:59 +00:00
2024-05-17 06:33:08 +00:00
**Syntax**
2024-03-12 16:57:34 +00:00
```sql
toColumnTypeName(value)
```
2021-02-15 21:22:10 +00:00
**Arguments**
2018-09-04 11:18:59 +00:00
2023-04-19 15:55:29 +00:00
- `value` — Any type of value.
2018-09-04 11:18:59 +00:00
**Returned values**
2023-06-01 18:27:34 +00:00
- The internal data type name used to represent `value` .
2018-09-04 11:18:59 +00:00
2023-06-01 18:27:34 +00:00
**Example**
Difference between `toTypeName` and `toColumnTypeName`:
2018-09-04 11:18:59 +00:00
2024-03-12 16:57:34 +00:00
```sql
SELECT toTypeName(CAST('2018-01-01 01:02:03' AS DateTime))
2018-09-04 11:18:59 +00:00
```
2020-03-20 10:10:48 +00:00
2023-06-01 18:27:34 +00:00
Result:
2024-03-12 16:57:34 +00:00
```text
┌─toTypeName(CAST('2018-01-01 01:02:03', 'DateTime'))─┐
│ DateTime │
└─────────────────────────────────────────────────────┘
```
2020-03-20 10:10:48 +00:00
2023-06-01 18:27:34 +00:00
Query:
2024-03-12 16:57:34 +00:00
```sql
SELECT toColumnTypeName(CAST('2018-01-01 01:02:03' AS DateTime))
```
2020-03-20 10:10:48 +00:00
2023-06-01 18:27:34 +00:00
Result:
2024-03-12 16:57:34 +00:00
```text
┌─toColumnTypeName(CAST('2018-01-01 01:02:03', 'DateTime'))─┐
│ Const(UInt32) │
└───────────────────────────────────────────────────────────┘
```
2023-06-01 18:27:34 +00:00
The example shows that the `DateTime` data type is internally stored as `Const(UInt32)`.
2018-09-04 11:18:59 +00:00
2022-06-02 10:55:18 +00:00
## dumpColumnStructure
2018-09-04 11:18:59 +00:00
Outputs a detailed description of data structures in RAM.

**Syntax**

```sql
dumpColumnStructure(value)
```
2021-02-15 21:22:10 +00:00
**Arguments**
2018-09-04 11:18:59 +00:00
2023-04-19 15:55:29 +00:00
- `value` — Any type of value.
2018-09-04 11:18:59 +00:00
**Returned values**
2023-06-01 18:27:34 +00:00
- A description of the column structure used for representing `value` .
2018-09-04 11:18:59 +00:00
**Example**
2024-03-12 16:57:34 +00:00
```sql
SELECT dumpColumnStructure(CAST('2018-01-01 01:02:03', 'DateTime'))
```
2020-03-20 10:10:48 +00:00
2024-03-12 16:57:34 +00:00
```text
┌─dumpColumnStructure(CAST('2018-01-01 01:02:03', 'DateTime'))─┐
│ DateTime, Const(size = 1, UInt32(size = 1)) │
└──────────────────────────────────────────────────────────────┘
```
2022-06-02 10:55:18 +00:00
## defaultValueOfArgumentType
2018-09-04 11:18:59 +00:00
2023-06-01 18:27:34 +00:00
Returns the default value for the given data type.
2018-09-04 11:18:59 +00:00
Does not include default values for custom columns set by the user.
2024-05-17 06:33:08 +00:00
**Syntax**
2024-03-12 16:57:34 +00:00
```sql
defaultValueOfArgumentType(expression)
```
2021-02-15 21:22:10 +00:00
**Arguments**
2018-09-04 11:18:59 +00:00
2023-04-19 15:55:29 +00:00
- `expression` — Arbitrary type of value or an expression that results in a value of an arbitrary type.
2018-09-04 11:18:59 +00:00
**Returned values**
2023-04-19 15:55:29 +00:00
- `0` for numbers.
- Empty string for strings.
- `ᴺᵁᴸᴸ` for [Nullable](../data-types/nullable.md).
2018-09-04 11:18:59 +00:00
**Example**
2023-06-01 18:27:34 +00:00
Query:
2024-03-12 16:57:34 +00:00
```sql
SELECT defaultValueOfArgumentType( CAST(1 AS Int8) )
```
2020-03-20 10:10:48 +00:00
2023-06-01 18:27:34 +00:00
Result:
2024-03-12 16:57:34 +00:00
```text
┌─defaultValueOfArgumentType(CAST(1, 'Int8'))─┐
│ 0 │
└─────────────────────────────────────────────┘
```
2020-03-20 10:10:48 +00:00
2023-06-01 18:27:34 +00:00
Query:
2024-03-12 16:57:34 +00:00
```sql
SELECT defaultValueOfArgumentType( CAST(1 AS Nullable(Int8) ) )
```
2020-03-20 10:10:48 +00:00
2023-06-01 18:27:34 +00:00
Result:
2024-03-12 16:57:34 +00:00
```text
┌─defaultValueOfArgumentType(CAST(1, 'Nullable(Int8)'))─┐
│ ᴺᵁᴸᴸ │
└───────────────────────────────────────────────────────┘
```
2022-06-02 10:55:18 +00:00
## defaultValueOfTypeName
2020-08-19 07:52:33 +00:00
2023-06-01 18:27:34 +00:00
Returns the default value for the given type name.
Does not include default values for custom columns set by the user.

**Syntax**

```sql
defaultValueOfTypeName(type)
```
2021-02-15 21:22:10 +00:00
**Arguments**
2020-08-19 07:52:33 +00:00
2023-04-19 15:55:29 +00:00
- `type` — A string representing a type name.
2020-08-19 07:52:33 +00:00
**Returned values**
2023-04-19 15:55:29 +00:00
- `0` for numbers.
- Empty string for strings.
- `ᴺᵁᴸᴸ` for [Nullable](../data-types/nullable.md).
2020-08-19 07:52:33 +00:00
**Example**
2023-06-01 18:27:34 +00:00
Query:
2024-03-12 16:57:34 +00:00
```sql
SELECT defaultValueOfTypeName('Int8')
```
2023-06-01 18:27:34 +00:00
Result:
2024-03-12 16:57:34 +00:00
```text
┌─defaultValueOfTypeName('Int8')─┐
│ 0 │
└────────────────────────────────┘
```
2023-06-01 18:27:34 +00:00
Query:
2024-03-12 16:57:34 +00:00
```sql
SELECT defaultValueOfTypeName('Nullable(Int8)')
```
2023-06-01 18:27:34 +00:00
Result:
2024-03-12 16:57:34 +00:00
```text
┌─defaultValueOfTypeName('Nullable(Int8)')─┐
│ ᴺᵁᴸᴸ │
└──────────────────────────────────────────┘
```
2022-06-02 10:55:18 +00:00
## indexHint
2021-02-28 07:25:56 +00:00
2023-06-01 18:27:34 +00:00
This function is intended for debugging and introspection. It ignores its argument and always returns 1. The arguments are not evaluated.
But during index analysis, the argument of this function is assumed to be not wrapped in `indexHint`. This allows selecting data in index ranges by the corresponding condition but without further filtering by this condition. The index in ClickHouse is sparse and using `indexHint` will yield more data than specifying the same condition directly.
2021-02-28 07:25:56 +00:00
**Syntax**
```sql
SELECT * FROM table WHERE indexHint(<expression>)
```
**Returned value**
2024-05-23 13:48:20 +00:00
- `1`. [UInt8](../data-types/int-uint.md).
2021-02-28 07:25:56 +00:00
**Example**
2022-04-11 05:01:34 +00:00
Here is an example using test data from the table [ontime](../../getting-started/example-datasets/ontime.md).
2021-02-28 07:25:56 +00:00
2023-06-01 18:27:34 +00:00
Table:
2021-02-28 07:25:56 +00:00
```sql
SELECT count() FROM ontime
```
```text
┌─count()─┐
│ 4276457 │
└─────────┘
```
The table has indexes on the fields `(FlightDate, (Year, FlightDate))`.
2023-06-01 18:27:34 +00:00
Create a query which does not use the index:
2021-02-28 07:25:56 +00:00
```sql
SELECT FlightDate AS k, count() FROM ontime GROUP BY k ORDER BY k
```
ClickHouse processed the entire table (`Processed 4.28 million rows`).
Result:
```text
┌──────────k─┬─count()─┐
│ 2017-01-01 │ 13970 │
│ 2017-01-02 │ 15882 │
........................
│ 2017-09-28 │ 16411 │
│ 2017-09-29 │ 16384 │
│ 2017-09-30 │ 12520 │
└────────────┴─────────┘
```
2023-06-01 18:27:34 +00:00
To apply the index, select a specific date:
2021-02-28 07:25:56 +00:00
```sql
SELECT FlightDate AS k, count() FROM ontime WHERE k = '2017-09-15' GROUP BY k ORDER BY k
```
2023-06-01 18:27:34 +00:00
ClickHouse now uses the index to process a significantly smaller number of rows (`Processed 32.74 thousand rows`).
2021-02-28 07:25:56 +00:00
Result:
```text
┌──────────k─┬─count()─┐
│ 2017-09-15 │ 16428 │
└────────────┴─────────┘
```
2023-06-01 18:27:34 +00:00
Now wrap the expression `k = '2017-09-15'` in the `indexHint` function:
2021-02-28 07:25:56 +00:00
Query:
```sql
SELECT
FlightDate AS k,
count()
FROM ontime
WHERE indexHint(k = '2017-09-15')
GROUP BY k
ORDER BY k ASC
```
2023-06-01 18:27:34 +00:00
ClickHouse used the index the same way as previously (`Processed 32.74 thousand rows`).
The expression `k = '2017-09-15'` was not used when generating the result.
In this example, the `indexHint` function makes rows for adjacent dates visible.
2021-02-28 07:25:56 +00:00
Result:
```text
┌──────────k─┬─count()─┐
│ 2017-09-14 │ 7071 │
│ 2017-09-15 │ 16428 │
│ 2017-09-16 │ 1077 │
│ 2017-09-30 │ 8167 │
└────────────┴─────────┘
```
2022-06-02 10:55:18 +00:00
## replicate
2018-09-04 11:18:59 +00:00
Creates an array with a single value.
2024-05-17 06:33:08 +00:00
:::note
This function is used for the internal implementation of [arrayJoin ](../../sql-reference/functions/array-join.md#functions_arrayjoin ).
:::
**Syntax**
2018-09-04 11:18:59 +00:00
2024-03-12 16:57:34 +00:00
```sql
replicate(x, arr)
```
2024-05-17 06:33:08 +00:00
**Arguments**
2018-09-04 11:18:59 +00:00
2023-06-01 18:27:34 +00:00
- `x` — The value to fill the result array with.
- `arr` — An array. [Array](../data-types/array.md).
2018-09-04 11:18:59 +00:00
2019-09-30 09:17:55 +00:00
**Returned value**
2024-05-23 13:48:20 +00:00
- An array of the same length as `arr` filled with value `x`. [Array](../data-types/array.md).
2018-09-04 11:18:59 +00:00
**Example**
2019-09-30 09:17:55 +00:00
Query:
2024-03-12 16:57:34 +00:00
```sql
SELECT replicate(1, ['a', 'b', 'c']);
```
2019-09-30 09:17:55 +00:00
Result:
2024-03-12 16:57:34 +00:00
```text
┌─replicate(1, ['a', 'b', 'c'])─┐
│ [1,1,1] │
└───────────────────────────────┘
```
2024-05-17 06:33:08 +00:00
## revision
Returns the current ClickHouse [server revision](../../operations/system-tables/metrics.md#revision).
**Syntax**
```sql
revision()
```
**Returned value**
- The current ClickHouse server revision. [UInt32 ](../data-types/int-uint.md ).
**Example**
Query:
```sql
SELECT revision();
```
Result:
```response
┌─revision()─┐
│ 54485 │
└────────────┘
```
2022-06-02 10:55:18 +00:00
## filesystemAvailable
2019-01-30 10:39:46 +00:00
2023-06-01 18:27:34 +00:00
Returns the amount of free space in the filesystem hosting the database persistence. The returned value is always smaller than total free space ([filesystemFree](#filesystemfree)) because some space is reserved for the operating system.
2019-01-30 10:39:46 +00:00
2019-10-07 19:32:18 +00:00
**Syntax**
2019-07-18 11:04:45 +00:00
2024-03-12 16:57:34 +00:00
```sql
filesystemAvailable()
```
2019-10-07 19:32:18 +00:00
**Returned value**
2019-07-18 11:04:45 +00:00
2024-05-24 03:54:16 +00:00
- The amount of remaining space available in bytes. [UInt64 ](../data-types/int-uint.md ).
2019-07-18 11:04:45 +00:00
**Example**
2019-10-07 19:32:18 +00:00
Query:
2024-03-12 16:57:34 +00:00
```sql
SELECT formatReadableSize(filesystemAvailable()) AS "Available space";
2019-07-18 11:04:45 +00:00
```
2019-10-07 19:32:18 +00:00
Result:
2024-03-12 16:57:34 +00:00
```text
┌─Available space─┐
│ 30.75 GiB │
└─────────────────┘
```
2022-06-02 10:55:18 +00:00
## filesystemFree
2019-10-07 19:32:18 +00:00
2023-06-01 18:27:34 +00:00
Returns the total amount of free space on the filesystem hosting the database persistence. See also `filesystemAvailable`.
2019-10-07 19:32:18 +00:00
**Syntax**
2024-03-12 16:57:34 +00:00
```sql
2019-11-11 11:41:33 +00:00
filesystemFree()
2019-10-07 19:32:18 +00:00
```
**Returned value**
2019-07-18 11:04:45 +00:00
2024-05-24 03:54:16 +00:00
- The amount of free space in bytes. [UInt64 ](../data-types/int-uint.md ).
2019-07-18 11:04:45 +00:00
**Example**
2019-10-07 19:32:18 +00:00
Query:
2024-03-12 16:57:34 +00:00
```sql
2023-06-01 18:27:34 +00:00
SELECT formatReadableSize(filesystemFree()) AS "Free space";
2019-07-18 11:04:45 +00:00
```
2019-01-30 10:39:46 +00:00
2019-10-07 19:32:18 +00:00
Result:
2019-01-30 10:39:46 +00:00
2024-03-12 16:57:34 +00:00
```text
2023-06-01 18:27:34 +00:00
┌─Free space─┐
│ 32.39 GiB │
└────────────┘
2019-07-18 11:04:45 +00:00
```
2019-01-30 10:39:46 +00:00
2022-06-02 10:55:18 +00:00
## filesystemCapacity
2019-10-07 19:32:18 +00:00
2023-06-01 18:27:34 +00:00
Returns the capacity of the filesystem in bytes. Needs the [path ](../../operations/server-configuration-parameters/settings.md#server_configuration_parameters-path ) to the data directory to be configured.
2019-10-07 19:32:18 +00:00
**Syntax**
2024-03-12 16:57:34 +00:00
```sql
2019-11-11 11:42:53 +00:00
filesystemCapacity()
2019-10-07 19:32:18 +00:00
```
**Returned value**
2024-05-24 03:54:16 +00:00
- Capacity of the filesystem in bytes. [UInt64 ](../data-types/int-uint.md ).
2019-10-07 19:32:18 +00:00
**Example**
Query:
2024-03-12 16:57:34 +00:00
```sql
2023-06-01 18:27:34 +00:00
SELECT formatReadableSize(filesystemCapacity()) AS "Capacity";
2019-07-18 11:04:45 +00:00
```
2019-01-30 10:39:46 +00:00
2019-10-07 19:32:18 +00:00
Result:
2019-01-30 10:39:46 +00:00
2024-03-12 16:57:34 +00:00
```text
2023-06-01 18:27:34 +00:00
┌─Capacity──┐
│ 39.32 GiB │
└───────────┘
2019-10-07 19:32:18 +00:00
```
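
The three filesystem functions can be combined in one query to get a quick overview of disk usage. The following is a minimal sketch; the column aliases and the `used_percent` calculation are illustrative assumptions, not part of the function definitions.

```sql
SELECT
    formatReadableSize(filesystemCapacity()) AS capacity,
    formatReadableSize(filesystemFree()) AS free,
    formatReadableSize(filesystemAvailable()) AS available,
    -- share of the filesystem that is not free, in percent
    round(100 * (1 - filesystemFree() / filesystemCapacity()), 2) AS used_percent;
```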
2019-01-30 10:39:46 +00:00
2022-06-02 10:55:18 +00:00
## initializeAggregation
2021-06-22 10:14:24 +00:00
2024-05-24 03:54:16 +00:00
Calculates the result of an aggregate function based on a single value. This function can be used to initialize aggregate functions with the combinator [-State](../../sql-reference/aggregate-functions/combinators.md#agg-functions-combinator-state). You can create states of aggregate functions and insert them into columns of type [AggregateFunction](../data-types/aggregatefunction.md#data-type-aggregatefunction) or use initialized aggregates as default values.
2021-06-22 10:14:24 +00:00
**Syntax**
2024-03-12 16:57:34 +00:00
```sql
2021-06-22 10:14:24 +00:00
initializeAggregation(aggregate_function, arg1, arg2, ..., argN)
```
**Arguments**
2024-05-24 03:54:16 +00:00
- `aggregate_function` — Name of the aggregation function to initialize. [String ](../data-types/string.md ).
2023-04-19 15:55:29 +00:00
- `arg` — Arguments of aggregate function.
2021-06-22 10:14:24 +00:00
**Returned value(s)**
- Result of aggregation for every row passed to the function.
2023-11-23 14:57:19 +00:00
The return type is the same as the return type of the function that `initializeAggregation` takes as its first argument.
2021-06-22 10:14:24 +00:00
**Example**
Query:
```sql
SELECT uniqMerge(state) FROM (SELECT initializeAggregation('uniqState', number % 3) AS state FROM numbers(10000));
```
2024-03-12 16:57:34 +00:00
2021-06-22 10:14:24 +00:00
Result:
```text
┌─uniqMerge(state)─┐
│ 3 │
└──────────────────┘
```
Query:
```sql
SELECT finalizeAggregation(state), toTypeName(state) FROM (SELECT initializeAggregation('sumState', number % 3) AS state FROM numbers(5));
```
2023-06-01 18:27:34 +00:00
2021-06-22 10:14:24 +00:00
Result:
```text
┌─finalizeAggregation(state)─┬─toTypeName(state)─────────────┐
│ 0 │ AggregateFunction(sum, UInt8) │
│ 1 │ AggregateFunction(sum, UInt8) │
│ 2 │ AggregateFunction(sum, UInt8) │
│ 0 │ AggregateFunction(sum, UInt8) │
│ 1 │ AggregateFunction(sum, UInt8) │
└────────────────────────────┴───────────────────────────────┘
```
Example with `AggregatingMergeTree` table engine and `AggregateFunction` column:
```sql
CREATE TABLE metrics
(
key UInt64,
value AggregateFunction(sum, UInt64) DEFAULT initializeAggregation('sumState', toUInt64(0))
)
ENGINE = AggregatingMergeTree
ORDER BY key
```
```sql
INSERT INTO metrics VALUES (0, initializeAggregation('sumState', toUInt64(42)))
```
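
To read the stored states back, they can be merged with the matching `-Merge` combinator. This is a minimal sketch that assumes the `metrics` table and the row inserted above; the alias `total` is illustrative.

```sql
SELECT
    key,
    -- sumMerge combines the stored sumState values for each key
    sumMerge(value) AS total
FROM metrics
GROUP BY key;
```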
**See Also**
2023-06-01 18:27:34 +00:00
2023-04-19 15:55:29 +00:00
- [arrayReduce ](../../sql-reference/functions/array-functions.md#arrayreduce )
2021-06-22 10:14:24 +00:00
2022-06-02 10:55:18 +00:00
## finalizeAggregation
2019-01-30 10:39:46 +00:00
2023-06-01 18:27:34 +00:00
Given the state of an aggregate function, this function returns the result of aggregation (or the finalized state when using a [-State](../../sql-reference/aggregate-functions/combinators.md#agg-functions-combinator-state) combinator).
2020-12-23 01:24:05 +00:00
2021-06-22 10:14:24 +00:00
**Syntax**
2020-12-23 01:24:05 +00:00
2024-03-12 16:57:34 +00:00
```sql
2020-12-24 08:25:47 +00:00
finalizeAggregation(state)
2020-12-23 01:24:05 +00:00
```
2021-02-15 21:22:10 +00:00
**Arguments**
2020-12-23 01:24:05 +00:00
2024-05-24 03:54:16 +00:00
- `state` — State of aggregation. [AggregateFunction ](../data-types/aggregatefunction.md#data-type-aggregatefunction ).
2020-12-23 01:24:05 +00:00
**Returned value(s)**
2023-04-19 15:55:29 +00:00
- Value or values that were aggregated.
2020-12-23 01:24:05 +00:00
2024-05-23 13:48:20 +00:00
:::note
The return type matches the result type of the aggregate function whose state is finalized.
:::
2020-12-23 01:24:05 +00:00
**Examples**
Query:
```sql
SELECT finalizeAggregation(( SELECT countState(number) FROM numbers(10)));
```
Result:
```text
┌─finalizeAggregation(_subquery16)─┐
│ 10 │
└──────────────────────────────────┘
```
Query:
```sql
SELECT finalizeAggregation(( SELECT sumState(number) FROM numbers(10)));
```
Result:
```text
┌─finalizeAggregation(_subquery20)─┐
│ 45 │
└──────────────────────────────────┘
```
2021-06-22 10:14:24 +00:00
Note that `NULL` values are ignored.
2020-12-23 01:24:05 +00:00
Query:
```sql
SELECT finalizeAggregation(arrayReduce('anyState', [NULL, 2, 3]));
```
Result:
```text
┌─finalizeAggregation(arrayReduce('anyState', [NULL, 2, 3]))─┐
│ 2 │
└────────────────────────────────────────────────────────────┘
```
Combined example:
Query:
```sql
WITH initializeAggregation('sumState', number) AS one_row_sum_state
SELECT
number,
finalizeAggregation(one_row_sum_state) AS one_row_sum,
runningAccumulate(one_row_sum_state) AS cumulative_sum
2020-12-24 08:25:47 +00:00
FROM numbers(10);
2020-12-23 01:24:05 +00:00
```
Result:
```text
┌─number─┬─one_row_sum─┬─cumulative_sum─┐
│ 0 │ 0 │ 0 │
│ 1 │ 1 │ 1 │
│ 2 │ 2 │ 3 │
│ 3 │ 3 │ 6 │
│ 4 │ 4 │ 10 │
│ 5 │ 5 │ 15 │
│ 6 │ 6 │ 21 │
│ 7 │ 7 │ 28 │
│ 8 │ 8 │ 36 │
│ 9 │ 9 │ 45 │
└────────┴─────────────┴────────────────┘
```
2019-01-30 10:39:46 +00:00
2021-06-22 10:14:24 +00:00
**See Also**
2023-06-01 18:27:34 +00:00
2023-04-19 15:55:29 +00:00
- [arrayReduce ](../../sql-reference/functions/array-functions.md#arrayreduce )
- [initializeAggregation ](#initializeaggregation )
2020-12-23 08:08:57 +00:00
2022-06-02 10:55:18 +00:00
## runningAccumulate
2019-01-30 10:39:46 +00:00
2023-06-01 18:27:34 +00:00
Accumulates the states of an aggregate function for each row of a data block.
2020-07-01 13:36:41 +00:00
2023-06-01 18:27:34 +00:00
:::note
The state is reset for each new block of data.
2024-04-29 21:00:56 +00:00
Because of this error-prone behavior, the function is DEPRECATED. Use proper window functions instead.
2022-04-09 13:29:05 +00:00
:::
2020-07-01 13:36:41 +00:00
**Syntax**
2024-03-12 16:57:34 +00:00
```sql
2020-07-01 13:36:41 +00:00
runningAccumulate(agg_state[, grouping]);
```
2021-02-15 21:22:10 +00:00
**Arguments**
2020-07-01 13:36:41 +00:00
2024-05-24 03:54:16 +00:00
- `agg_state` — State of the aggregate function. [AggregateFunction ](../data-types/aggregatefunction.md#data-type-aggregatefunction ).
- `grouping` — Grouping key. Optional. The state of the function is reset if the `grouping` value is changed. It can be any of the [supported data types ](../data-types/index.md ) for which the equality operator is defined.
2020-07-01 13:36:41 +00:00
**Returned value**
2023-04-19 15:55:29 +00:00
- Each resulting row contains a result of the aggregate function, accumulated for all the input rows from 0 to the current position. `runningAccumulate` resets states for each new data block or when the `grouping` value changes.
2020-07-01 13:36:41 +00:00
Type depends on the aggregate function used.
**Examples**
Consider how you can use `runningAccumulate` to find the cumulative sum of numbers without and with grouping.
Query:
2024-03-12 16:57:34 +00:00
```sql
2020-07-01 13:36:41 +00:00
SELECT k, runningAccumulate(sum_k) AS res FROM (SELECT number as k, sumState(k) AS sum_k FROM numbers(10) GROUP BY k ORDER BY k);
```
Result:
2024-03-12 16:57:34 +00:00
```text
2020-07-01 13:36:41 +00:00
┌─k─┬─res─┐
│ 0 │ 0 │
│ 1 │ 1 │
│ 2 │ 3 │
│ 3 │ 6 │
│ 4 │ 10 │
│ 5 │ 15 │
│ 6 │ 21 │
│ 7 │ 28 │
│ 8 │ 36 │
│ 9 │ 45 │
└───┴─────┘
```
2020-07-09 15:10:35 +00:00
The subquery generates `sumState` for every number from `0` to `9`. `sumState` returns the state of the [sum](../../sql-reference/aggregate-functions/reference/sum.md) function that contains the sum of a single number.
2019-01-30 10:39:46 +00:00
2020-07-01 13:36:41 +00:00
The whole query does the following:
2023-06-01 18:27:34 +00:00
1. For the first row, `runningAccumulate` takes `sumState(0)` and returns `0`.
2. For the second row, the function merges `sumState(0)` and `sumState(1)` resulting in `sumState(0 + 1)`, and returns `1` as a result.
3. For the third row, the function merges `sumState(0 + 1)` and `sumState(2)` resulting in `sumState(0 + 1 + 2)`, and returns `3` as a result.
4. The actions are repeated until the block ends.
2020-07-01 13:36:41 +00:00
The following example shows the `grouping` parameter usage:
Query:
2024-03-12 16:57:34 +00:00
```sql
2020-07-09 15:10:35 +00:00
SELECT
2020-07-01 13:36:41 +00:00
grouping,
item,
runningAccumulate(state, grouping) AS res
2020-07-09 15:10:35 +00:00
FROM
2020-07-01 13:36:41 +00:00
(
2020-07-09 15:10:35 +00:00
SELECT
2020-07-01 13:36:41 +00:00
toInt8(number / 4) AS grouping,
number AS item,
sumState(number) AS state
FROM numbers(15)
GROUP BY item
ORDER BY item ASC
);
```
Result:
2024-03-12 16:57:34 +00:00
```text
2020-07-01 13:36:41 +00:00
┌─grouping─┬─item─┬─res─┐
│ 0 │ 0 │ 0 │
│ 0 │ 1 │ 1 │
│ 0 │ 2 │ 3 │
│ 0 │ 3 │ 6 │
│ 1 │ 4 │ 4 │
│ 1 │ 5 │ 9 │
│ 1 │ 6 │ 15 │
│ 1 │ 7 │ 22 │
│ 2 │ 8 │ 8 │
│ 2 │ 9 │ 17 │
│ 2 │ 10 │ 27 │
│ 2 │ 11 │ 38 │
│ 3 │ 12 │ 12 │
│ 3 │ 13 │ 25 │
│ 3 │ 14 │ 39 │
└──────────┴──────┴─────┘
```
As you can see, `runningAccumulate` merges states for each group of rows separately.
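
Because `runningAccumulate` is deprecated, the same per-group cumulative sum can be written with a window function. The following is a sketch under the assumption that window functions are available in your ClickHouse version; the alias `grp` is used here instead of `grouping` purely for illustration. It should yield the same `res` values as the query above, without depending on block boundaries.

```sql
SELECT
    grp,
    item,
    -- running sum inside each group, ordered by item
    sum(item) OVER (PARTITION BY grp ORDER BY item ASC) AS res
FROM
(
    SELECT
        toInt8(number / 4) AS grp,
        number AS item
    FROM numbers(15)
)
ORDER BY item ASC;
```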
2019-01-30 10:39:46 +00:00
2022-06-02 10:55:18 +00:00
## joinGet
2019-01-30 10:39:46 +00:00
2020-04-30 18:19:18 +00:00
The function lets you extract data from the table the same way as from a [dictionary ](../../sql-reference/dictionaries/index.md ).
2019-01-30 10:39:46 +00:00
2023-06-01 18:27:34 +00:00
Gets the data from [Join ](../../engines/table-engines/special/join.md#creating-a-table ) tables using the specified join key.
2019-06-21 07:58:15 +00:00
2019-07-31 14:49:16 +00:00
Only supports tables created with the `ENGINE = Join(ANY, LEFT, <join_keys>)` statement.
2019-01-30 10:39:46 +00:00
2020-01-24 10:52:26 +00:00
**Syntax**
2019-10-27 16:33:47 +00:00
2024-03-12 16:57:34 +00:00
```sql
2019-10-27 16:33:47 +00:00
joinGet(join_storage_table_name, `value_column`, join_keys)
```
2021-02-15 21:22:10 +00:00
**Arguments**
2019-10-27 16:33:47 +00:00
2023-06-01 18:27:34 +00:00
- `join_storage_table_name` — an [identifier ](../../sql-reference/syntax.md#syntax-identifiers ) indicating where the search is performed. The identifier is searched in the default database (see setting `default_database` in the config file). To override the default database, use `USE db_name` or specify the database and the table through the separator `db_name.db_table` as in the example.
2023-04-19 15:55:29 +00:00
- `value_column` — name of the column of the table that contains required data.
- `join_keys` — list of keys.
2019-10-27 16:33:47 +00:00
**Returned value**
2023-06-01 18:27:34 +00:00
Returns a list of values corresponding to the list of keys.
2019-10-27 16:33:47 +00:00
2021-05-27 19:44:11 +00:00
If a certain key does not exist in the source table, then `0` or `null` is returned, depending on the [join_use_nulls](../../operations/settings/settings.md#join_use_nulls) setting.
2019-10-27 16:33:47 +00:00
2020-04-30 18:19:18 +00:00
More info about `join_use_nulls` in [Join operation ](../../engines/table-engines/special/join.md ).
2020-01-24 10:52:26 +00:00
2019-10-27 16:33:47 +00:00
**Example**
Input table:
2020-01-20 09:48:34 +00:00
2024-03-12 16:57:34 +00:00
```sql
2020-01-20 09:48:34 +00:00
CREATE DATABASE db_test;
CREATE TABLE db_test.id_val(`id` UInt32, `val` UInt32) ENGINE = Join(ANY, LEFT, id) SETTINGS join_use_nulls = 1;
INSERT INTO db_test.id_val VALUES (1,11)(2,12)(4,13);
2019-10-27 16:33:47 +00:00
```
2024-03-12 16:57:34 +00:00
```text
2019-10-27 16:33:47 +00:00
┌─id─┬─val─┐
│ 4 │ 13 │
│ 2 │ 12 │
│ 1 │ 11 │
└────┴─────┘
```
Query:
2024-03-12 16:57:34 +00:00
```sql
2023-06-01 18:27:34 +00:00
SELECT joinGet(db_test.id_val, 'val', toUInt32(number)) FROM numbers(4) SETTINGS join_use_nulls = 1;
2019-10-27 16:33:47 +00:00
```
Result:
2024-03-12 16:57:34 +00:00
```text
2020-01-20 09:48:34 +00:00
┌─joinGet(db_test.id_val, 'val', toUInt32(number))─┐
│ 0 │
│ 11 │
│ 12 │
│ 0 │
└──────────────────────────────────────────────────┘
2019-10-27 16:33:47 +00:00
```
2024-05-17 06:33:08 +00:00
## catboostEvaluate
2022-08-05 07:53:06 +00:00
2023-02-16 19:19:25 +00:00
:::note
This function is not available in ClickHouse Cloud.
:::
2023-06-02 13:41:01 +00:00
Evaluate an external catboost model. [CatBoost ](https://catboost.ai ) is an open-source gradient boosting library developed by Yandex for machine learning.
2022-08-05 07:53:06 +00:00
Accepts a path to a catboost model and model arguments (features). Returns Float64.
2024-05-17 06:33:08 +00:00
**Syntax**
```sql
2024-05-23 11:54:45 +00:00
catboostEvaluate(path_to_model, feature_1, feature_2, ..., feature_n)
2024-05-17 06:33:08 +00:00
```
**Example**
2024-03-12 16:57:34 +00:00
```sql
2022-08-05 07:53:06 +00:00
SELECT feat_1, ..., feat_n, catboostEvaluate('/path/to/model.bin', feat_1, ..., feat_n) AS prediction
FROM data_table
```
**Prerequisites**
1. Build the catboost evaluation library
Before evaluating catboost models, the `libcatboostmodel.<so|dylib>` library must be made available. See the [CatBoost documentation](https://catboost.ai/docs/concepts/c-plus-plus-api_dynamic-c-pluplus-wrapper.html) for how to compile it.
Next, specify the path to `libcatboostmodel.<so|dylib>` in the ClickHouse configuration:
2024-03-12 16:57:34 +00:00
```xml
2022-08-05 07:53:06 +00:00
<clickhouse>
...
    <catboost_lib_path>/path/to/libcatboostmodel.so</catboost_lib_path>
...
</clickhouse>
```
2022-12-02 10:52:01 +00:00
For security and isolation reasons, the model evaluation does not run in the server process but in the clickhouse-library-bridge process.
At the first execution of `catboostEvaluate()`, the server starts the library bridge process if it is not already running. Both processes
communicate using an HTTP interface. By default, port `9012` is used. A different port can be specified as follows, which is useful if port
`9012` is already assigned to a different service.
2024-03-12 16:57:34 +00:00
```xml
2022-12-02 10:52:01 +00:00
<library_bridge>
    <port>9019</port>
</library_bridge>
```
2022-08-05 07:53:06 +00:00
2. Train a catboost model using libcatboost
See [Training and applying models ](https://catboost.ai/docs/features/training.html#training ) for how to train catboost models from a training data set.
2024-05-17 06:33:08 +00:00
## throwIf
2019-01-30 10:39:46 +00:00
2023-06-01 18:27:34 +00:00
Throws an exception if argument `x` is true.
2024-05-17 06:33:08 +00:00
**Syntax**
```sql
2024-05-23 11:54:45 +00:00
throwIf(x[, message[, error_code]])
2024-05-17 06:33:08 +00:00
```
2023-06-01 18:27:34 +00:00
**Arguments**
- `x` - The condition to check.
- `message` - A constant string providing a custom error message. Optional.
- `error_code` - A constant integer providing a custom error code. Optional.
2022-08-17 20:13:23 +00:00
To use the `error_code` argument, the configuration parameter `allow_custom_error_code_in_throwif` must be enabled.
2019-09-03 01:27:48 +00:00
2023-06-01 18:27:34 +00:00
**Example**
2024-03-12 16:57:34 +00:00
```sql
2019-09-03 01:27:48 +00:00
SELECT throwIf(number = 3, 'Too many') FROM numbers(10);
2019-09-23 15:31:46 +00:00
```
2020-03-20 10:10:48 +00:00
2023-06-01 18:27:34 +00:00
Result:
2024-03-12 16:57:34 +00:00
```text
2019-09-03 01:27:48 +00:00
↙ Progress: 0.00 rows, 0.00 B (0.00 rows/s., 0.00 B/s.) Received exception from server (version 19.14.1):
Code: 395. DB::Exception: Received from localhost:9000. DB::Exception: Too many.
```
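
The optional `error_code` argument can be sketched as follows. This assumes that `allow_custom_error_code_in_throwif` can be enabled for the session with `SET`; the code `1234` is an arbitrary value chosen for illustration.

```sql
-- assumption: the parameter can be enabled at session level
SET allow_custom_error_code_in_throwif = 1;

-- raises an exception with the custom message and the custom code 1234
SELECT throwIf(number = 3, 'Too many', 1234) FROM numbers(10);
```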
2018-10-16 10:47:17 +00:00
2022-06-02 10:55:18 +00:00
## identity
2019-10-22 19:14:56 +00:00
2023-06-01 18:27:34 +00:00
Returns its argument. Intended for debugging and testing. It lets you bypass the index and measure the query performance of a full scan. When the query is analyzed for possible use of an index, the analyzer ignores everything inside `identity` functions. It also disables constant folding.
2019-09-03 00:18:44 +00:00
2019-10-22 19:14:56 +00:00
**Syntax**
2024-03-12 16:57:34 +00:00
```sql
2020-02-02 21:25:51 +00:00
identity(x)
2019-10-22 19:14:56 +00:00
```
**Example**
Query:
2019-09-03 00:18:44 +00:00
2024-03-12 16:57:34 +00:00
```sql
2023-06-01 18:27:34 +00:00
SELECT identity(42);
2019-09-23 15:31:46 +00:00
```
2019-10-22 19:14:56 +00:00
Result:
2024-03-12 16:57:34 +00:00
```text
2019-09-03 00:18:44 +00:00
┌─identity(42)─┐
│ 42 │
└──────────────┘
```
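
To compare indexed and full-scan performance, the filter column can be wrapped in `identity`. The sketch below assumes a hypothetical table `hits` whose primary key starts with the column `id`; neither name comes from this document.

```sql
-- the index analysis can use the primary key for this condition
SELECT count() FROM hits WHERE id = 42;

-- the analyzer ignores the expression inside identity(), so this runs as a full scan
SELECT count() FROM hits WHERE identity(id) = 42;
```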
2022-06-02 10:55:18 +00:00
## getSetting
2020-09-28 02:59:01 +00:00
Returns the current value of a [custom setting ](../../operations/settings/index.md#custom_settings ).
2020-10-09 19:29:42 +00:00
**Syntax**
2020-09-28 02:59:01 +00:00
```sql
2020-10-09 19:29:42 +00:00
getSetting('custom_setting');
2020-09-28 02:59:01 +00:00
```
2020-10-09 19:29:42 +00:00
**Arguments**
2020-09-28 02:59:01 +00:00
2024-05-24 03:54:16 +00:00
- `custom_setting` — The setting name. [String ](../data-types/string.md ).
2020-09-28 02:59:01 +00:00
**Returned value**
2023-06-01 18:27:34 +00:00
- The setting's current value.
2020-09-28 02:59:01 +00:00
**Example**
```sql
SET custom_a = 123;
2020-10-09 19:29:42 +00:00
SELECT getSetting('custom_a');
2020-09-28 02:59:01 +00:00
```
2023-06-01 18:27:34 +00:00
Result:
2020-09-28 02:59:01 +00:00
```
123
```
2020-10-09 19:29:42 +00:00
**See Also**
2020-09-28 02:59:01 +00:00
2023-04-19 15:55:29 +00:00
- [Custom Settings ](../../operations/settings/index.md#custom_settings )
2020-09-28 02:59:01 +00:00
2022-06-02 10:55:18 +00:00
## isDecimalOverflow
2020-10-07 18:13:01 +00:00
2024-05-24 03:54:16 +00:00
Checks whether the [Decimal ](../data-types/decimal.md ) value is outside its precision or outside the specified precision.
2020-10-07 18:13:01 +00:00
**Syntax**
2024-03-12 16:57:34 +00:00
```sql
2020-10-07 18:13:01 +00:00
isDecimalOverflow(d, [p])
```
2021-02-15 21:22:10 +00:00
**Arguments**
2020-10-07 18:13:01 +00:00
2024-05-24 03:54:16 +00:00
- `d` — value. [Decimal ](../data-types/decimal.md ).
- `p` — precision. Optional. If omitted, the initial precision of the first argument is used. This parameter can be helpful to migrate data from/to another database or file. [UInt8 ](../data-types/int-uint.md#uint-ranges ).
2020-10-07 18:13:01 +00:00
**Returned values**
2023-06-01 18:27:34 +00:00
- `1` — Decimal value has more digits than allowed by its precision.
2023-04-19 15:55:29 +00:00
- `0` — Decimal value satisfies the specified precision.
2020-10-07 18:13:01 +00:00
**Example**
Query:
2024-03-12 16:57:34 +00:00
```sql
2020-10-07 18:13:01 +00:00
SELECT isDecimalOverflow(toDecimal32(1000000000, 0), 9),
isDecimalOverflow(toDecimal32(1000000000, 0)),
isDecimalOverflow(toDecimal32(-1000000000, 0), 9),
isDecimalOverflow(toDecimal32(-1000000000, 0));
```
Result:
2024-03-12 16:57:34 +00:00
```text
2020-10-07 18:13:01 +00:00
1 1 1 1
```
2022-06-02 10:55:18 +00:00
## countDigits
2020-10-07 18:13:01 +00:00
2023-06-01 18:27:34 +00:00
Returns the number of decimal digits needed to represent a value.
2020-10-07 18:13:01 +00:00
**Syntax**
2024-03-12 16:57:34 +00:00
```sql
2020-10-07 18:13:01 +00:00
countDigits(x)
```
2021-02-15 21:22:10 +00:00
**Arguments**
2020-10-07 18:13:01 +00:00
2024-05-24 03:54:16 +00:00
- `x` — [Int ](../data-types/int-uint.md ) or [Decimal ](../data-types/decimal.md ) value.
2020-10-07 18:13:01 +00:00
**Returned value**
2024-05-24 03:54:16 +00:00
- Number of digits. [UInt8 ](../data-types/int-uint.md#uint-ranges ).
2020-10-07 18:13:01 +00:00
2023-03-18 02:45:43 +00:00
:::note
2022-04-09 13:29:05 +00:00
For `Decimal` values, the function takes their scale into account: it calculates the result over the underlying integer type, which is `(value * scale)`. For example: `countDigits(42) = 2`, `countDigits(42.000) = 5`, `countDigits(0.04200) = 4`. That is, you can check for decimal overflow of a `Decimal64` value with `countDigits(x) > 18`. It's a slow variant of [isDecimalOverflow](#isdecimaloverflow).
:::
2020-10-07 18:13:01 +00:00
**Example**
Query:
2024-03-12 16:57:34 +00:00
```sql
2020-10-07 18:13:01 +00:00
SELECT countDigits(toDecimal32(1, 9)), countDigits(toDecimal32(-1, 9)),
countDigits(toDecimal64(1, 18)), countDigits(toDecimal64(-1, 18)),
countDigits(toDecimal128(1, 38)), countDigits(toDecimal128(-1, 38));
```
Result:
2024-03-12 16:57:34 +00:00
```text
2020-10-07 18:13:01 +00:00
10 10 19 19 39 39
```
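
The overflow check mentioned in the note can be sketched as a query. The value `toDecimal64(42, 3)` and the aliases are chosen only for illustration; the threshold of `18` digits is the one given in the note for `Decimal64`.

```sql
SELECT
    toDecimal64(42, 3) AS x,
    -- digits of the underlying integer, which includes the scale
    countDigits(x) AS digits,
    -- true if the value needs more digits than Decimal64 can hold
    countDigits(x) > 18 AS overflows;
```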
2020-09-28 02:59:01 +00:00
2022-06-02 10:55:18 +00:00
## errorCodeToName
2020-10-12 18:22:09 +00:00
2024-05-24 03:54:16 +00:00
Returns the textual name of an error code. The returned value has type [LowCardinality(String)](../data-types/lowcardinality.md).
2020-10-12 18:22:09 +00:00
**Syntax**
2024-03-12 16:57:34 +00:00
```sql
errorCodeToName(1)
```
Result:
2024-03-12 16:57:34 +00:00
```text
UNSUPPORTED_METHOD
```
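One place where this is handy is summarizing failed queries by error name. The sketch below assumes that query logging is enabled, so that `system.query_log` is populated with an `exception_code` column; the aliases are illustrative.

```sql
-- Assumes query logging is enabled so that system.query_log is populated.
SELECT
    errorCodeToName(exception_code) AS error_name,
    count() AS failed_queries
FROM system.query_log
WHERE exception_code != 0
GROUP BY error_name
ORDER BY failed_queries DESC
LIMIT 10;
```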
2022-06-02 10:55:18 +00:00
## tcpPort
2020-12-21 20:13:26 +00:00
Returns the [native interface](../../interfaces/tcp.md) TCP port number that this server listens on.
If executed in the context of a distributed table, this function generates a normal column with values relevant to each shard. Otherwise it produces a constant value.
2020-12-21 20:13:26 +00:00
**Syntax**
2024-03-12 16:57:34 +00:00
```sql
tcpPort()
```
2021-02-15 21:22:10 +00:00
**Arguments**
2020-12-21 20:13:26 +00:00
2023-04-19 15:55:29 +00:00
- None.
2020-12-21 20:13:26 +00:00
**Returned value**
2024-05-24 03:54:16 +00:00
- The TCP port number. [UInt16 ](../data-types/int-uint.md ).
2020-12-21 20:13:26 +00:00
**Example**
Query:
2024-03-12 16:57:34 +00:00
```sql
SELECT tcpPort();
```
Result:
2024-03-12 16:57:34 +00:00
```text
┌─tcpPort()─┐
│      9000 │
└───────────┘
```
**See Also**
2023-04-19 15:55:29 +00:00
- [tcp_port ](../../operations/server-configuration-parameters/settings.md#server_configuration_parameters-tcp_port )
2020-12-21 20:13:26 +00:00
2022-06-02 10:55:18 +00:00
## currentProfiles
2021-08-03 12:03:10 +00:00
2023-03-18 02:45:43 +00:00
Returns a list of the current [settings profiles ](../../guides/sre/user-management/index.md#settings-profiles-management ) for the current user.
2021-08-05 06:25:52 +00:00
2021-08-06 05:15:55 +00:00
The [SET PROFILE](../../sql-reference/statements/set.md#query-set) command can be used to change the current settings profile. If `SET PROFILE` was not used, the function returns the profiles specified in the current user's definition (see [CREATE USER](../../sql-reference/statements/create/user.md#create-user-statement)).
2021-08-03 12:03:10 +00:00
**Syntax**
2024-03-12 16:57:34 +00:00
```sql
currentProfiles()
```
2021-08-04 10:15:50 +00:00
**Returned value**
2021-08-03 12:03:10 +00:00
2024-05-24 03:54:16 +00:00
- List of the current user settings profiles. [Array ](../data-types/array.md )([String](../data-types/string.md)).
2021-08-03 12:03:10 +00:00
2022-06-02 10:55:18 +00:00
## enabledProfiles
2021-08-03 12:03:10 +00:00
2024-03-12 16:57:34 +00:00
Returns the settings profiles assigned to the current user, both explicitly and implicitly. Explicitly assigned profiles are the same as those returned by the [currentProfiles](#current-profiles) function. Implicitly assigned profiles include parent profiles of other assigned profiles, profiles assigned via granted roles, profiles assigned via their own settings, and the main default profile (see the `default_profile` section in the main server configuration file).
2021-08-04 11:26:24 +00:00
2021-08-03 12:03:10 +00:00
**Syntax**
2024-03-12 16:57:34 +00:00
```sql
enabledProfiles()
```
2021-08-04 10:15:50 +00:00
**Returned value**
2021-08-03 12:03:10 +00:00
2024-05-24 03:54:16 +00:00
- List of the enabled settings profiles. [Array ](../data-types/array.md )([String](../data-types/string.md)).
2021-08-03 12:03:10 +00:00
2022-06-02 10:55:18 +00:00
## defaultProfiles
2021-08-03 12:03:10 +00:00
2021-08-05 06:29:12 +00:00
Returns all the profiles specified in the current user's definition (see the [CREATE USER](../../sql-reference/statements/create/user.md#create-user-statement) statement).
2021-08-03 12:03:10 +00:00
**Syntax**
2024-03-12 16:57:34 +00:00
```sql
defaultProfiles()
```
2021-08-04 10:15:50 +00:00
**Returned value**
2021-08-03 12:03:10 +00:00
2024-05-24 03:54:16 +00:00
- List of the default settings profiles. [Array ](../data-types/array.md )([String](../data-types/string.md)).
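To see how the three profile functions relate for the current session, they can be selected side by side. This is a minimal sketch; the column aliases are illustrative and the output depends entirely on the user's configuration.

```sql
SELECT
    currentProfiles() AS current_profiles,
    enabledProfiles() AS enabled_profiles,
    defaultProfiles() AS default_profiles;
```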
2021-08-07 15:07:37 +00:00
2022-06-02 10:55:18 +00:00
## currentRoles
2021-08-03 12:03:10 +00:00
2023-06-01 18:27:34 +00:00
Returns the roles assigned to the current user. The roles can be changed with the [SET ROLE](../../sql-reference/statements/set-role.md#set-role-statement) statement. If no `SET ROLE` statement was executed, `currentRoles` returns the same as `defaultRoles`.
2021-08-03 12:03:10 +00:00
**Syntax**
2024-03-12 16:57:34 +00:00
```sql
currentRoles()
```
2021-08-04 09:31:24 +00:00
**Returned value**
2021-08-03 12:03:10 +00:00
2024-05-24 03:54:16 +00:00
- A list of the current roles for the current user. [Array ](../data-types/array.md )([String](../data-types/string.md)).
2021-08-03 12:03:10 +00:00
2022-06-02 10:55:18 +00:00
## enabledRoles
2021-08-03 12:03:10 +00:00
2021-08-06 05:24:13 +00:00
Returns the names of the current roles and of the roles granted to any of the current roles.
2021-08-03 12:03:10 +00:00
**Syntax**
2024-03-12 16:57:34 +00:00
```sql
enabledRoles()
```
2021-08-04 09:31:24 +00:00
**Returned value**
2021-08-03 12:03:10 +00:00
2024-05-24 03:54:16 +00:00
- List of the enabled roles for the current user. [Array ](../data-types/array.md )([String](../data-types/string.md)).
2021-08-03 12:03:10 +00:00
2022-06-02 10:55:18 +00:00
## defaultRoles
2021-08-03 12:03:10 +00:00
2024-05-02 07:00:40 +00:00
Returns the roles which are enabled by default for the current user at login. Initially these are all roles granted to the current user (see [GRANT](../../sql-reference/statements/grant.md#select)), but that can be changed with the [SET DEFAULT ROLE](../../sql-reference/statements/set-role.md#set-default-role-statement) statement.
2021-08-03 12:03:10 +00:00
**Syntax**
2024-03-12 16:57:34 +00:00
```sql
defaultRoles()
```
2021-08-04 09:31:24 +00:00
**Returned value**
2021-08-03 12:03:10 +00:00
2024-05-24 03:54:16 +00:00
- List of the default roles for the current user. [Array ](../data-types/array.md )([String](../data-types/string.md)).
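The interplay between `currentRoles`, `enabledRoles` and `defaultRoles` can be observed by switching roles within a session. The following is a sketch only; it assumes the current user has at least one granted role, and the aliases are illustrative.

```sql
-- Roles active right now, roles reachable through them, and the login defaults.
SELECT
    currentRoles() AS current_roles,
    enabledRoles() AS enabled_roles,
    defaultRoles() AS default_roles;

-- Deactivate all roles for this session, then look again.
SET ROLE NONE;
SELECT currentRoles() AS current_roles, defaultRoles() AS default_roles;

-- Restore the default roles.
SET ROLE DEFAULT;
```

After `SET ROLE NONE`, `currentRoles()` is expected to differ from `defaultRoles()`, while `defaultRoles()` itself should be unaffected by `SET ROLE`.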
2021-08-07 15:07:37 +00:00
2022-06-02 10:55:18 +00:00
## getServerPort
2021-08-22 10:38:43 +00:00
2023-06-01 18:27:34 +00:00
Returns the server port number. When the port is not used by the server, throws an exception.
2021-08-22 10:38:43 +00:00
**Syntax**
2024-03-12 16:57:34 +00:00
```sql
getServerPort(port_name)
```
**Arguments**
2024-05-24 03:54:16 +00:00
- `port_name` — The name of the server port. [String ](../data-types/string.md#string ). Possible values:
2021-08-22 16:02:40 +00:00
2024-03-12 16:57:34 +00:00
- 'tcp_port'
- 'tcp_port_secure'
- 'http_port'
- 'https_port'
- 'interserver_http_port'
- 'interserver_https_port'
- 'mysql_port'
- 'postgresql_port'
- 'grpc_port'
- 'prometheus.port'
2021-08-22 10:38:43 +00:00
**Returned value**
2024-05-24 03:54:16 +00:00
- The number of the server port. [UInt16 ](../data-types/int-uint.md ).
2021-08-22 10:38:43 +00:00
**Example**
Query:
2024-03-12 16:57:34 +00:00
```sql
SELECT getServerPort('tcp_port');
```
Result:
2024-03-12 16:57:34 +00:00
```text
┌─getServerPort('tcp_port')─┐
│                      9000 │
└───────────────────────────┘
```
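Because `'tcp_port'` names the native-protocol port, `getServerPort('tcp_port')` and `tcpPort()` are expected to agree on a server where that port is configured. A small sketch with illustrative aliases:

```sql
-- Both expressions read the native-protocol port, so the comparison is expected to hold.
SELECT
    tcpPort() AS native_port,
    getServerPort('tcp_port') AS named_port,
    native_port = named_port AS same_port;
```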
2021-08-29 16:00:29 +00:00
2023-09-18 17:34:40 +00:00
## queryID {#queryID}
2021-08-21 10:47:06 +00:00
2021-08-24 16:51:20 +00:00
Returns the ID of the current query. Other parameters of a query can be extracted from the [system.query_log ](../../operations/system-tables/query_log.md ) table via `query_id` .
2021-08-21 10:47:06 +00:00
2023-06-01 18:27:34 +00:00
Unlike the [initialQueryID](#initial-query-id) function, `queryID` can return different results on different shards (see the example).
2021-08-21 20:26:27 +00:00
2021-08-21 10:47:06 +00:00
**Syntax**
2024-03-12 16:57:34 +00:00
```sql
queryID()
```
**Returned value**
2024-05-24 03:54:16 +00:00
- The ID of the current query. [String ](../data-types/string.md )
2021-08-21 10:47:06 +00:00
2021-08-21 20:26:27 +00:00
**Example**
Query:
2024-03-12 16:57:34 +00:00
```sql
CREATE TABLE tmp (str String) ENGINE = Log;
INSERT INTO tmp (*) VALUES ('a');
SELECT count(DISTINCT t) FROM (SELECT queryID() AS t FROM remote('127.0.0.{1..3}', currentDatabase(), 'tmp') GROUP BY queryID());
```
Result:
2024-03-12 16:57:34 +00:00
```text
┌─count()─┐
│       3 │
└─────────┘
```
2022-06-02 10:55:18 +00:00
## initialQueryID
2021-08-21 10:47:06 +00:00
2021-08-24 16:51:29 +00:00
Returns the ID of the initial current query. Other parameters of a query can be extracted from the [system.query_log ](../../operations/system-tables/query_log.md ) table via `initial_query_id` .
2021-08-21 10:47:06 +00:00
2021-08-24 16:51:56 +00:00
Unlike the [queryID](#queryID) function, `initialQueryID` returns the same result on different shards (see the example).
2021-08-21 20:26:27 +00:00
2021-08-21 10:47:06 +00:00
**Syntax**
2024-03-12 16:57:34 +00:00
```sql
initialQueryID()
```
**Returned value**
2024-05-24 03:54:16 +00:00
- The ID of the initial current query. [String ](../data-types/string.md )
2021-08-21 20:26:27 +00:00
**Example**
Query:
2024-03-12 16:57:34 +00:00
```sql
CREATE TABLE tmp (str String) ENGINE = Log;
INSERT INTO tmp (*) VALUES ('a');
SELECT count(DISTINCT t) FROM (SELECT initialQueryID() AS t FROM remote('127.0.0.{1..3}', currentDatabase(), 'tmp') GROUP BY queryID());
```
Result:
2024-03-12 16:57:34 +00:00
```text
┌─count()─┐
│       1 │
└─────────┘
```
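To follow up on a distributed query, the initial query ID can be used to find all of its parts in `system.query_log`. The sketch below assumes query logging is enabled; `'<initial-query-id>'` is a placeholder, not a real ID, and should be replaced with a value returned by `initialQueryID()`. On a real cluster each server keeps its own log, so the sub-queries appear on the servers that executed them.

```sql
-- '<initial-query-id>' is a placeholder; substitute a value returned by initialQueryID().
SELECT query_id, is_initial_query, query_duration_ms
FROM system.query_log
WHERE initial_query_id = '<initial-query-id>'
  AND type = 'QueryFinish';
```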
2021-09-20 05:37:18 +00:00
2022-06-02 10:55:18 +00:00
## shardNum
2021-09-20 05:37:18 +00:00
2023-06-01 18:27:34 +00:00
Returns the index of the shard that processes a part of the data in a distributed query. Indices start at `1`.
If a query is not distributed, the constant value `0` is returned.
2021-09-20 05:37:18 +00:00
**Syntax**
2024-03-12 16:57:34 +00:00
```sql
shardNum()
```
**Returned value**
2024-05-24 03:54:16 +00:00
- Shard index or constant `0` . [UInt32 ](../data-types/int-uint.md ).
2021-09-20 05:37:18 +00:00
2021-10-06 19:50:05 +00:00
**Example**
In the following example a configuration with two shards is used. The query is executed on the [system.one ](../../operations/system-tables/one.md ) table on every shard.
Query:
2024-03-12 16:57:34 +00:00
```sql
CREATE TABLE shard_num_example (dummy UInt8)
    ENGINE=Distributed(test_cluster_two_shards_localhost, system, one, dummy);
SELECT dummy, shardNum(), shardCount() FROM shard_num_example;
```
Result:
2024-03-12 16:57:34 +00:00
```text
┌─dummy─┬─shardNum()─┬─shardCount()─┐
│     0 │          2 │            2 │
│     0 │          1 │            2 │
└───────┴────────────┴──────────────┘
```
**See Also**
2023-04-19 15:55:29 +00:00
- [Distributed Table Engine ](../../engines/table-engines/special/distributed.md )
2021-10-06 19:50:05 +00:00
2022-06-02 10:55:18 +00:00
## shardCount
2021-09-20 05:37:18 +00:00
2021-10-06 19:50:05 +00:00
Returns the total number of shards for a distributed query.
If a query is not distributed, the constant value `0` is returned.
2021-09-20 05:37:18 +00:00
**Syntax**
2024-03-12 16:57:34 +00:00
```sql
shardCount()
```
**Returned value**
2024-05-24 03:54:16 +00:00
- Total number of shards or `0` . [UInt32 ](../data-types/int-uint.md ).
2021-09-20 05:37:18 +00:00
2021-10-06 19:50:05 +00:00
**See Also**
2021-09-20 05:37:18 +00:00
2021-10-06 19:50:05 +00:00
- [shardNum() ](#shard-num ) function example also contains `shardCount()` function call.
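For contrast with the distributed example, a query that touches no `Distributed` table and no `remote()` source illustrates the non-distributed case described above, where both functions return the constant `0`:

```sql
-- No Distributed table or remote() source involved, so both calls return the constant 0.
SELECT shardNum(), shardCount();
```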
2021-10-20 17:20:14 +00:00
2022-06-02 10:55:18 +00:00
## getOSKernelVersion
2021-10-20 17:20:14 +00:00
2021-10-21 18:33:47 +00:00
Returns a string with the current OS kernel version.
2021-10-20 17:20:14 +00:00
**Syntax**
2024-03-12 16:57:34 +00:00
```sql
getOSKernelVersion()
```
**Arguments**
2023-04-19 15:55:29 +00:00
- None.
2021-10-20 17:20:14 +00:00
**Returned value**
2024-05-24 03:54:16 +00:00
- The current OS kernel version. [String ](../data-types/string.md ).
2021-10-20 17:20:14 +00:00
**Example**
Query:
2024-03-12 16:57:34 +00:00
```sql
SELECT getOSKernelVersion();
```
Result:
2024-03-12 16:57:34 +00:00
```text
┌─getOSKernelVersion()────┐
│ Linux 4.15.0-55-generic │
└─────────────────────────┘
```
2021-11-07 21:42:57 +00:00
2022-06-02 10:55:18 +00:00
## zookeeperSessionUptime
2021-11-07 21:42:57 +00:00
2021-11-09 10:02:06 +00:00
Returns the uptime of the current ZooKeeper session in seconds.
2021-11-07 21:42:57 +00:00
**Syntax**
2024-03-12 16:57:34 +00:00
```sql
zookeeperSessionUptime()
```
**Arguments**
2023-04-19 15:55:29 +00:00
- None.
2021-11-07 21:42:57 +00:00
**Returned value**
2024-05-24 03:54:16 +00:00
- Uptime of the current ZooKeeper session in seconds. [UInt32 ](../data-types/int-uint.md ).
2021-11-07 21:42:57 +00:00
**Example**
Query:
2024-03-12 16:57:34 +00:00
```sql
SELECT zookeeperSessionUptime();
```
Result:
2024-03-12 16:57:34 +00:00
```text
┌─zookeeperSessionUptime()─┐
│                      286 │
└──────────────────────────┘
```
2023-03-09 17:47:14 +00:00
## generateRandomStructure
Generates a random table structure in the format `column1_name column1_type, column2_name column2_type, ...`.
**Syntax**
2024-03-12 16:57:34 +00:00
```sql
generateRandomStructure([number_of_columns, seed])
```
**Arguments**
- `number_of_columns` — The desired number of columns in the resulting table structure. If set to 0 or `Null`, the number of columns will be random from 1 to 128. Default value: `Null`.
- `seed` - Random seed to produce stable results. If the seed is not specified or is set to `Null`, it is randomly generated.
2023-03-09 17:47:14 +00:00
All arguments must be constant.
**Returned value**
2024-05-24 03:54:16 +00:00
- Randomly generated table structure. [String ](../data-types/string.md ).
2023-03-09 17:47:14 +00:00
**Examples**
Query:
2024-03-12 16:57:34 +00:00
```sql
SELECT generateRandomStructure()
```
Result:
2024-03-12 16:57:34 +00:00
```text
┌─generateRandomStructure()─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ c1 Decimal32(5), c2 Date, c3 Tuple(LowCardinality(String), Int128, UInt64, UInt16, UInt8, IPv6), c4 Array(UInt128), c5 UInt32, c6 IPv4, c7 Decimal256(64), c8 Decimal128(3), c9 UInt256, c10 UInt64, c11 DateTime │
└───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
```
Query:
2024-03-12 16:57:34 +00:00
```sql
SELECT generateRandomStructure(1)
```
Result:
2024-03-12 16:57:34 +00:00
```text
┌─generateRandomStructure(1)─┐
│ c1 Map(UInt256, UInt16)    │
└────────────────────────────┘
```
Query:
2024-03-12 16:57:34 +00:00
```sql
SELECT generateRandomStructure(NULL, 33)
```
Result:
2024-03-12 16:57:34 +00:00
```text
┌─generateRandomStructure(NULL, 33)─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ c1 DateTime, c2 Enum8('c2V0' = 0, 'c2V1' = 1, 'c2V2' = 2, 'c2V3' = 3), c3 LowCardinality(Nullable(FixedString(30))), c4 Int16, c5 Enum8('c5V0' = 0, 'c5V1' = 1, 'c5V2' = 2, 'c5V3' = 3), c6 Nullable(UInt8), c7 String, c8 Nested(e1 IPv4, e2 UInt8, e3 UInt16, e4 UInt16, e5 Int32, e6 Map(Date, Decimal256(70))) │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
```
2023-05-15 11:20:03 +00:00
**Note**: the maximum nesting depth of complex types (Array, Tuple, Map, Nested) is limited to 16.
2023-03-09 17:47:14 +00:00
This function can be used together with [generateRandom ](../../sql-reference/table-functions/generate.md ) to generate completely random tables.
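A minimal sketch of that combination: the generated structure string is passed straight to the `generateRandom` table function. The column count (4), both seeds (101) and the row limit are arbitrary illustrative values.

```sql
-- Four random columns with a fixed structure seed; the second 101 seeds the generated data.
SELECT * FROM generateRandom(generateRandomStructure(4, 101), 101) LIMIT 3;
```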
2023-07-27 18:54:41 +00:00
## structureToCapnProtoSchema {#structure_to_capn_proto_schema}
Converts ClickHouse table structure to CapnProto schema.
**Syntax**
2024-03-12 16:57:34 +00:00
```sql
structureToCapnProtoSchema(structure[, root_struct_name])
```
**Arguments**
- `structure` — Table structure in the format `column1_name column1_type, column2_name column2_type, ...`.
- `root_struct_name` — Name of the root struct in the CapnProto schema. Optional. Default value: `Message`.
**Returned value**
2024-05-24 03:54:16 +00:00
- CapnProto schema. [String ](../data-types/string.md ).
2023-07-27 18:54:41 +00:00
**Examples**
Query:
2024-03-12 16:57:34 +00:00
```sql
SELECT structureToCapnProtoSchema('column1 String, column2 UInt32, column3 Array(String)') FORMAT RawBLOB
```
Result:
2024-03-12 16:57:34 +00:00
```text
@0xf96402dd754d0eb7;
struct Message
{
    column1 @0 : Data;
    column2 @1 : UInt32;
    column3 @2 : List(Data);
}
```
Query:
2024-03-12 16:57:34 +00:00
```sql
SELECT structureToCapnProtoSchema('column1 Nullable(String), column2 Tuple(element1 UInt32, element2 Array(String)), column3 Map(String, String)') FORMAT RawBLOB
```
Result:
2024-03-12 16:57:34 +00:00
```text
@0xd1c8320fecad2b7f;
struct Message
{
    struct Column1
    {
        union
        {
            value @0 : Data;
            null @1 : Void;
        }
    }
    column1 @0 : Column1;
    struct Column2
    {
        element1 @0 : UInt32;
        element2 @1 : List(Data);
    }
    column2 @1 : Column2;
    struct Column3
    {
        struct Entry
        {
            key @0 : Data;
            value @1 : Data;
        }
        entries @0 : List(Entry);
    }
    column3 @2 : Column3;
}
```
Query:
2024-03-12 16:57:34 +00:00
```sql
SELECT structureToCapnProtoSchema('column1 String, column2 UInt32', 'Root') FORMAT RawBLOB
```
Result:
2024-03-12 16:57:34 +00:00
```text
@0x96ab2d4ab133c6e1;
struct Root
{
    column1 @0 : Data;
    column2 @1 : UInt32;
}
```
## structureToProtobufSchema {#structure_to_protobuf_schema}
Converts ClickHouse table structure to Protobuf schema.
**Syntax**
2024-03-12 16:57:34 +00:00
```sql
structureToProtobufSchema(structure[, root_message_name])
```
**Arguments**
- `structure` — Table structure in the format `column1_name column1_type, column2_name column2_type, ...`.
- `root_message_name` — Name of the root message in the Protobuf schema. Optional. Default value: `Message`.
**Returned value**
2024-05-24 03:54:16 +00:00
- Protobuf schema. [String ](../data-types/string.md ).
2023-07-27 18:54:41 +00:00
**Examples**
Query:
2024-03-12 16:57:34 +00:00
```sql
SELECT structureToProtobufSchema('column1 String, column2 UInt32, column3 Array(String)') FORMAT RawBLOB
```
Result:
2024-03-12 16:57:34 +00:00
```text
syntax = "proto3";
message Message
{
    bytes column1 = 1;
    uint32 column2 = 2;
    repeated bytes column3 = 3;
}
```
Query:
2024-03-12 16:57:34 +00:00
```sql
SELECT structureToProtobufSchema('column1 Nullable(String), column2 Tuple(element1 UInt32, element2 Array(String)), column3 Map(String, String)') FORMAT RawBLOB
```
Result:
2024-03-12 16:57:34 +00:00
```text
syntax = "proto3";
message Message
{
    bytes column1 = 1;
    message Column2
    {
        uint32 element1 = 1;
        repeated bytes element2 = 2;
    }
    Column2 column2 = 2;
    map<string, bytes> column3 = 3;
}
```
Query:
2024-03-12 16:57:34 +00:00
```sql
SELECT structureToProtobufSchema('column1 String, column2 UInt32', 'Root') FORMAT RawBLOB
```
Result:
2024-03-12 16:57:34 +00:00
```text
syntax = "proto3";
message Root
{
    bytes column1 = 1;
    uint32 column2 = 2;
}
```
2023-10-24 14:13:15 +00:00
## formatQuery
2023-10-26 11:08:11 +00:00
Returns a formatted, possibly multi-line, version of the given SQL query.
2023-10-24 14:13:15 +00:00
2023-11-03 20:19:39 +00:00
Throws an exception if the query is not well-formed. To return `NULL` instead, function `formatQueryOrNull()` may be used.
2023-10-24 14:13:15 +00:00
**Syntax**
```sql
formatQuery(query)
formatQueryOrNull(query)
```
**Arguments**
2024-05-24 03:54:16 +00:00
- `query` - The SQL query to be formatted. [String ](../data-types/string.md )
2023-10-24 14:13:15 +00:00
**Returned value**
2024-05-24 03:54:16 +00:00
- The formatted query. [String ](../data-types/string.md ).
2023-10-24 14:13:15 +00:00
**Example**
```sql
SELECT formatQuery('select a, b FRom tab WHERE a > 3 and b < 3');
```
Result:
```result
┌─formatQuery('select a, b FRom tab WHERE a > 3 and b < 3')─┐
│ SELECT
    a,
    b
FROM tab
WHERE (a > 3) AND (b < 3)                                 │
└───────────────────────────────────────────────────────────┘
```
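The difference between the throwing and the `OrNull` variant can be sketched with one well-formed and one deliberately malformed input. The string `'selec 1'` is just an illustrative syntax error; according to the description above, the second column is expected to be `NULL`, whereas `formatQuery('selec 1')` would throw.

```sql
SELECT
    formatQueryOrNull('select 1') AS well_formed,
    formatQueryOrNull('selec 1') AS malformed;
```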
2023-10-26 11:09:20 +00:00
## formatQuerySingleLine
2023-10-25 08:10:04 +00:00
2023-10-26 11:08:11 +00:00
Like formatQuery() but the returned formatted string contains no line breaks.
2023-10-25 08:10:04 +00:00
2023-11-03 20:19:39 +00:00
Throws an exception if the query is not well-formed. To return `NULL` instead, function `formatQuerySingleLineOrNull()` may be used.
2023-10-25 08:10:04 +00:00
**Syntax**
```sql
formatQuerySingleLine(query)
formatQuerySingleLineOrNull(query)
```
**Arguments**
2024-05-24 03:54:16 +00:00
- `query` - The SQL query to be formatted. [String ](../data-types/string.md )
2023-10-25 08:10:04 +00:00
**Returned value**
2024-05-24 03:54:16 +00:00
- The formatted query. [String ](../data-types/string.md ).
2023-10-25 08:10:04 +00:00
**Example**
```sql
SELECT formatQuerySingleLine('select a, b FRom tab WHERE a > 3 and b < 3');
```
Result:
```result
┌─formatQuerySingleLine('select a, b FRom tab WHERE a > 3 and b < 3')─┐
│ SELECT a, b FROM tab WHERE (a > 3) AND (b < 3)                      │
└─────────────────────────────────────────────────────────────────────┘
```
2023-12-19 16:43:30 +00:00
## variantElement
Extracts a column with specified type from a `Variant` column.
**Syntax**
2023-12-19 23:40:18 +00:00
```sql
variantElement(variant, type_name[, default_value])
2023-12-19 16:43:30 +00:00
```
2023-12-19 23:40:18 +00:00
**Arguments**
2024-05-24 03:54:16 +00:00
- `variant` — Variant column. [Variant ](../data-types/variant.md ).
- `type_name` — The name of the variant type to extract. [String ](../data-types/string.md ).
2023-12-19 16:43:30 +00:00
- `default_value` - The default value that is used if the variant does not contain a value of the specified type. Can be any type. Optional.
**Returned value**
- Subcolumn of a `Variant` column with specified type.
**Example**
```sql
CREATE TABLE test (v Variant(UInt64, String, Array(UInt64))) ENGINE = Memory;
INSERT INTO test VALUES (NULL), (42), ('Hello, World!'), ([1, 2, 3]);
SELECT v, variantElement(v, 'String'), variantElement(v, 'UInt64'), variantElement(v, 'Array(UInt64)') FROM test;
```
```text
┌─v─────────────┬─variantElement(v, 'String')─┬─variantElement(v, 'UInt64')─┬─variantElement(v, 'Array(UInt64)')─┐
│ ᴺᵁᴸᴸ          │ ᴺᵁᴸᴸ                        │                        ᴺᵁᴸᴸ │ []                                 │
│ 42            │ ᴺᵁᴸᴸ                        │                          42 │ []                                 │
│ Hello, World! │ Hello, World!               │                        ᴺᵁᴸᴸ │ []                                 │
│ [1,2,3]       │ ᴺᵁᴸᴸ                        │                        ᴺᵁᴸᴸ │ [1,2,3]                            │
└───────────────┴─────────────────────────────┴─────────────────────────────┴────────────────────────────────────┘
```
2024-01-30 18:01:12 +00:00
## variantType
Returns the variant type name for each row of a `Variant` column. If a row contains NULL, the function returns `'None'` for it.
**Syntax**
```sql
variantType(variant)
```
**Arguments**
2024-05-24 03:54:16 +00:00
- `variant` — Variant column. [Variant ](../data-types/variant.md ).
2024-01-30 18:01:12 +00:00
**Returned value**
- Enum8 column with variant type name for each row.
**Example**
```sql
CREATE TABLE test (v Variant(UInt64, String, Array(UInt64))) ENGINE = Memory;
INSERT INTO test VALUES (NULL), (42), ('Hello, World!'), ([1, 2, 3]);
SELECT variantType(v) FROM test;
```
```text
┌─variantType(v)─┐
│ None           │
│ UInt64         │
│ String         │
│ Array(UInt64)  │
└────────────────┘
```
```sql
SELECT toTypeName(variantType(v)) FROM test LIMIT 1;
```
```text
┌─toTypeName(variantType(v))──────────────────────────────────────────┐
│ Enum8('None' = -1, 'Array(UInt64)' = 0, 'String' = 1, 'UInt64' = 2) │
└─────────────────────────────────────────────────────────────────────┘
```
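Since `variantType` yields one type name per row, it can also serve as a grouping key, for example to count how many rows hold each variant. The sketch below reuses the `test` table from the example above; the aliases are illustrative.

```sql
SELECT
    variantType(v) AS type_name,
    count() AS rows_of_type
FROM test
GROUP BY type_name
ORDER BY type_name;
```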
2023-10-30 19:54:51 +00:00
## minSampleSizeConversion
Calculates minimum required sample size for an A/B test comparing conversions (proportions) in two samples.
**Syntax**
2024-03-12 16:57:34 +00:00
```sql
minSampleSizeConversion(baseline, mde, power, alpha)
```
Uses the formula described in [this article ](https://towardsdatascience.com/required-sample-size-for-a-b-testing-6f6608dd330a ). Assumes equal sizes of treatment and control groups. Returns the sample size required for one group (i.e. the sample size required for the whole experiment is twice the returned value).
**Arguments**
- `baseline` — Baseline conversion. [Float ](../data-types/float.md ).
- `mde` — Minimum detectable effect (MDE) as percentage points (e.g. for a baseline conversion 0.25 the MDE 0.03 means an expected change to 0.25 ± 0.03). [Float ](../data-types/float.md ).
- `power` — Required statistical power of a test (1 - probability of Type II error). [Float ](../data-types/float.md ).
- `alpha` — Required significance level of a test (probability of Type I error). [Float ](../data-types/float.md ).
**Returned value**
A named [Tuple ](../data-types/tuple.md ) with 3 elements:
- `"minimum_sample_size"` — Required sample size. [Float64 ](../data-types/float.md ).
- `"detect_range_lower"` — Lower bound of the range of values not detectable with the returned required sample size (i.e. all values less than or equal to `"detect_range_lower"` are detectable with the provided `alpha` and `power` ). Calculated as `baseline - mde` . [Float64 ](../data-types/float.md ).
- `"detect_range_upper"` — Upper bound of the range of values not detectable with the returned required sample size (i.e. all values greater than or equal to `"detect_range_upper"` are detectable with the provided `alpha` and `power` ). Calculated as `baseline + mde` . [Float64 ](../data-types/float.md ).
**Example**
The following query calculates the required sample size for an A/B test with baseline conversion of 25%, MDE of 3%, significance level of 5%, and the desired statistical power of 80%:
2024-03-12 16:57:34 +00:00
```sql
SELECT minSampleSizeConversion(0.25, 0.03, 0.80, 0.05) AS sample_size;
```
Result:
2024-03-12 16:57:34 +00:00
```text
┌─sample_size───────────────────┐
│ (3396.077603219163,0.22,0.28) │
└───────────────────────────────┘
```
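Because the result is a named tuple holding the per-group size, the total for the whole experiment has to be derived by the caller. A minimal sketch, assuming the element name `minimum_sample_size` listed under the returned value; the aliases `s`, `per_group` and `whole_experiment` are illustrative.

```sql
WITH minSampleSizeConversion(0.25, 0.03, 0.80, 0.05) AS s
SELECT
    tupleElement(s, 'minimum_sample_size') AS per_group,
    -- Treatment and control groups are assumed equal in size, hence the factor of two.
    ceil(2 * per_group) AS whole_experiment;
```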
2023-10-31 12:05:41 +00:00
## minSampleSizeContinuous
2023-10-30 19:54:51 +00:00
Calculates minimum required sample size for an A/B test comparing means of a continuous metric in two samples.
**Syntax**
2024-03-12 16:57:34 +00:00
```sql
minSampleSizeContinuous(baseline, sigma, mde, power, alpha)
```
2023-10-31 12:05:41 +00:00
Alias: `minSampleSizeContinous`
2023-10-30 19:54:51 +00:00
Uses the formula described in [this article ](https://towardsdatascience.com/required-sample-size-for-a-b-testing-6f6608dd330a ). Assumes equal sizes of treatment and control groups. Returns the required sample size for one group (i.e. the sample size required for the whole experiment is twice the returned value). Also assumes equal variance of the test metric in treatment and control groups.
**Arguments**
- `baseline` — Baseline value of a metric. [Integer ](../data-types/int-uint.md ) or [Float ](../data-types/float.md ).
- `sigma` — Baseline standard deviation of a metric. [Integer ](../data-types/int-uint.md ) or [Float ](../data-types/float.md ).
2024-03-12 16:57:34 +00:00
- `mde` — Minimum detectable effect (MDE) as percentage of the baseline value (e.g. for a baseline value 112.25 the MDE 0.03 means an expected change to 112.25 ± 112.25\*0.03). [Integer ](../data-types/int-uint.md ) or [Float ](../data-types/float.md ).
2023-10-30 19:54:51 +00:00
- `power` — Required statistical power of a test (1 - probability of Type II error). [Integer ](../data-types/int-uint.md ) or [Float ](../data-types/float.md ).
- `alpha` — Required significance level of a test (probability of Type I error). [Integer ](../data-types/int-uint.md ) or [Float ](../data-types/float.md ).
**Returned value**
A named [Tuple ](../data-types/tuple.md ) with 3 elements:
- `"minimum_sample_size"` — Required sample size. [Float64 ](../data-types/float.md ).
- `"detect_range_lower"` — Lower bound of the range of values not detectable with the returned required sample size (i.e. all values less than or equal to `"detect_range_lower"` are detectable with the provided `alpha` and `power` ). Calculated as `baseline * (1 - mde)` . [Float64 ](../data-types/float.md ).
- `"detect_range_upper"` — Upper bound of the range of values not detectable with the returned required sample size (i.e. all values greater than or equal to `"detect_range_upper"` are detectable with the provided `alpha` and `power` ). Calculated as `baseline * (1 + mde)` . [Float64 ](../data-types/float.md ).
**Example**
The following query calculates the required sample size for an A/B test on a metric with baseline value of 112.25, standard deviation of 21.1, MDE of 3%, significance level of 5%, and the desired statistical power of 80%:
2024-03-12 16:57:34 +00:00
```sql
SELECT minSampleSizeContinous(112.25, 21.1, 0.03, 0.80, 0.05) AS sample_size;
```
Result:
2024-03-12 16:57:34 +00:00
```text
┌─sample_size───────────────────────────┐
│ (616.2931945826209,108.8825,115.6175) │
└───────────────────────────────────────┘
```
2024-03-23 20:48:28 +00:00
2024-03-22 01:14:41 +00:00
## connectionId
Retrieves the connection ID of the client that submitted the current query and returns it as a UInt64 integer.
**Syntax**
```sql
connectionId()
```
**Parameters**
None.
**Returned value**
Returns an integer of type UInt64.
**Implementation details**
This function is most useful in debugging scenarios or for internal purposes within the MySQL handler. It was created for compatibility with [MySQL's `CONNECTION_ID` function](https://dev.mysql.com/doc/refman/8.0/en/information-functions.html#function_connection-id). It is not typically used in production queries.
**Example**
Query:
```sql
SELECT connectionId();
```
```response
0
```
## connection_id
An alias of `connectionId` . Retrieves the connection ID of the client that submitted the current query and returns it as a UInt64 integer.
**Syntax**
```sql
connection_id()
```
**Parameters**
None.
**Returned value**
Returns an integer of type UInt64.
**Implementation details**
This function is most useful in debugging scenarios or for internal purposes within the MySQL handler. It was created for compatibility with [MySQL's `CONNECTION_ID` function](https://dev.mysql.com/doc/refman/8.0/en/information-functions.html#function_connection-id). It is not typically used in production queries.
**Example**
Query:
```sql
SELECT connection_id();
```
```response
0
```
## getClientHTTPHeader
Get the value of an HTTP header.
If there is no such header or the current request is not performed via the HTTP interface, the function returns an empty string.
Certain HTTP headers (e.g., `Authentication` and `X-ClickHouse-*`) are restricted.
The function requires the setting `allow_get_client_http_header` to be enabled.
The setting is disabled by default for security reasons, because some headers, such as `Cookie`, could contain sensitive information.
HTTP header names are case-sensitive for this function.
If the function is used in the context of a distributed query, it returns a non-empty result only on the initiator node.
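A minimal usage sketch, assuming the query is sent over the HTTP interface and that the setting can be enabled per query with a `SETTINGS` clause. The header name `User-Agent` and the alias `user_agent` are illustrative choices only:

```sql
-- Read the User-Agent header of the current HTTP request.
-- Per the description above, an empty string is returned if the header
-- is absent or the query did not arrive over the HTTP interface.
SELECT getClientHTTPHeader('User-Agent') AS user_agent
SETTINGS allow_get_client_http_header = 1;
```

Because header names are matched case-sensitively, the argument should be spelled exactly as the client sent it.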
## showCertificate
Shows information about the current server's Secure Sockets Layer (SSL) certificate if it has been configured. See [Configuring SSL-TLS](https://clickhouse.com/docs/en/guides/sre/configuring-ssl) for more information on how to configure ClickHouse to use OpenSSL certificates to validate connections.
**Syntax**
```sql
showCertificate()
```
**Returned value**
- Map of key-value pairs relating to the configured SSL certificate. [Map](../data-types/map.md)([String](../data-types/string.md), [String](../data-types/string.md)).
**Example**
Query:
```sql
SELECT showCertificate() FORMAT LineAsString;
```
Result:
```response
{'version':'1','serial_number':'2D9071D64530052D48308473922C7ADAFA85D6C5','signature_algo':'sha256WithRSAEncryption','issuer':'/CN=marsnet.local CA','not_before':'May 7 17:01:21 2024 GMT','not_after':'May 7 17:01:21 2025 GMT','subject':'/CN=chnode1','pkey_algo':'rsaEncryption'}
```
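Because the returned value is a `Map`, individual fields can be read with the subscript operator. A brief sketch, assuming the keys shown in the example output (`subject`, `not_after`) are present for the configured certificate:

```sql
-- Extract single fields from the certificate map.
SELECT
    showCertificate()['subject'] AS subject,
    showCertificate()['not_after'] AS not_after;
```

This form may be convenient for monitoring queries that only need, for example, the expiry date.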