DOCAPI-8530: Code blocks markup fix (#7060)

* Typo fix.

* Links fix.

* Fixed links in docs.

* More fixes.

* docs/en: cleaning some files

* docs/en: cleaning data_types

* docs/en: cleaning database_engines

* docs/en: cleaning development

* docs/en: cleaning getting_started

* docs/en: cleaning interfaces

* docs/en: cleaning operations

* docs/en: cleaning query_language

* docs/en: cleaning en

* docs/ru: cleaning data_types

* docs/ru: cleaning index

* docs/ru: cleaning database_engines

* docs/ru: cleaning development

* docs/ru: cleaning general

* docs/ru: cleaning getting_started

* docs/ru: cleaning interfaces

* docs/ru: cleaning operations

* docs/ru: cleaning query_language

* docs: cleaning interfaces/http

* Update docs/en/data_types/array.md

decorated ```

Co-Authored-By: BayoNet <da-daos@yandex.ru>

* Update docs/en/getting_started/example_datasets/nyc_taxi.md

fixed typo

Co-Authored-By: BayoNet <da-daos@yandex.ru>

* Update docs/en/getting_started/example_datasets/ontime.md

fixed typo

Co-Authored-By: BayoNet <da-daos@yandex.ru>

* Update docs/en/interfaces/formats.md

fixed error

Co-Authored-By: BayoNet <da-daos@yandex.ru>

* Update docs/en/operations/table_engines/custom_partitioning_key.md

Co-Authored-By: BayoNet <da-daos@yandex.ru>

* Update docs/en/operations/utils/clickhouse-local.md

Co-Authored-By: BayoNet <da-daos@yandex.ru>

* Update docs/en/query_language/dicts/external_dicts_dict_sources.md

Co-Authored-By: BayoNet <da-daos@yandex.ru>

* Update docs/en/operations/utils/clickhouse-local.md

Co-Authored-By: BayoNet <da-daos@yandex.ru>

* Update docs/en/query_language/functions/json_functions.md

Co-Authored-By: BayoNet <da-daos@yandex.ru>

* Update docs/en/query_language/functions/json_functions.md

Co-Authored-By: BayoNet <da-daos@yandex.ru>

* Update docs/en/query_language/functions/other_functions.md

Co-Authored-By: BayoNet <da-daos@yandex.ru>

* Update docs/en/query_language/functions/other_functions.md

Co-Authored-By: BayoNet <da-daos@yandex.ru>

* Update docs/en/query_language/functions/date_time_functions.md

Co-Authored-By: BayoNet <da-daos@yandex.ru>

* Update docs/en/operations/table_engines/jdbc.md

Co-Authored-By: BayoNet <da-daos@yandex.ru>

* docs: fixed error

* docs: fixed error
BayoNet 2019-09-23 18:31:46 +03:00 committed by GitHub
parent 016f3b0a45
commit 2d2bc052e1
211 changed files with 2254 additions and 2373 deletions


@ -8,42 +8,34 @@ Array of `T`-type items.
You can use a function to create an array:
```sql
array(T)
```
You can also use square brackets.
```sql
[]
```
Example of creating an array:
```sql
SELECT array(1, 2) AS x, toTypeName(x)
```
```text
┌─x─────┬─toTypeName(array(1, 2))─┐
│ [1,2] │ Array(UInt8) │
└───────┴─────────────────────────┘
```
```sql
SELECT [1, 2] AS x, toTypeName(x)
```
```text
┌─x─────┬─toTypeName([1, 2])─┐
│ [1,2] │ Array(UInt8) │
└───────┴────────────────────┘
```
## Working with data types
@ -54,31 +46,23 @@ If ClickHouse couldn't determine the data type, it will generate an exception. F
Examples of automatic data type detection:
```sql
SELECT array(1, 2, NULL) AS x, toTypeName(x)
```
```text
┌─x──────────┬─toTypeName(array(1, 2, NULL))─┐
│ [1,2,NULL] │ Array(Nullable(UInt8)) │
└────────────┴───────────────────────────────┘
```
If you try to create an array of incompatible data types, ClickHouse throws an exception:
```sql
SELECT array(1, 'a')
```
```text
Received exception from server (version 1.1.54388):
Code: 386. DB::Exception: Received from localhost:9000, 127.0.0.1. DB::Exception: There is no supertype for types UInt8, String because some of them are String/FixedString and some of them are not.
```


@ -51,36 +51,36 @@ Some functions on Decimal return result as Float64 (for example, var or stddev).
During calculations on Decimal, integer overflows might happen. Excessive digits in the fractional part are discarded (not rounded). Excessive digits in the integer part lead to an exception.
```sql
SELECT toDecimal32(2, 4) AS x, x / 3
```
```text
┌──────x─┬─divide(toDecimal32(2, 4), 3)─┐
│ 2.0000 │ 0.6666 │
└────────┴──────────────────────────────┘
```
```sql
SELECT toDecimal32(4.2, 8) AS x, x * x
```
```text
DB::Exception: Scale is out of bounds.
```
```sql
SELECT toDecimal32(4.2, 8) AS x, 6 * x
```
```text
DB::Exception: Decimal math overflow.
```
Overflow checks slow down operations. If it is known that overflows are not possible, it makes sense to disable checks using the `decimal_check_overflow` setting. When checks are disabled and an overflow happens, the result will be incorrect:
```sql
SET decimal_check_overflow = 0;
SELECT toDecimal32(4.2, 8) AS x, 6 * x
```
```text
┌──────────x─┬─multiply(6, toDecimal32(4.2, 8))─┐
│ 4.20000000 │ -17.74967296 │
└────────────┴──────────────────────────────────┘
@ -88,10 +88,10 @@ SELECT toDecimal32(4.2, 8) AS x, 6 * x
Overflow checks happen not only on arithmetic operations, but also on value comparison:
```sql
SELECT toDecimal32(1, 8) < 100
```
```text
DB::Exception: Can't compare.
```


@ -4,13 +4,13 @@
### Basic Usage
```sql
CREATE TABLE hits (url String, from IPv4) ENGINE = MergeTree() ORDER BY url;
DESCRIBE TABLE hits;
```
```text
┌─name─┬─type───┬─default_type─┬─default_expression─┬─comment─┬─codec_expression─┐
│ url │ String │ │ │ │ │
│ from │ IPv4 │ │ │ │ │
@ -19,19 +19,19 @@ DESCRIBE TABLE hits;
OR you can use IPv4 domain as a key:
```sql
CREATE TABLE hits (url String, from IPv4) ENGINE = MergeTree() ORDER BY from;
```
`IPv4` domain supports custom input format as IPv4-strings:
```sql
INSERT INTO hits (url, from) VALUES ('https://wikipedia.org', '116.253.40.133')('https://clickhouse.yandex', '183.247.232.58')('https://clickhouse.yandex/docs/en/', '116.106.34.242');
SELECT * FROM hits;
```
```text
┌─url────────────────────────────────┬───────────from─┐
│ https://clickhouse.yandex/docs/en/ │ 116.106.34.242 │
│ https://wikipedia.org │ 116.253.40.133 │
@ -41,11 +41,11 @@ SELECT * FROM hits;
Values are stored in compact binary form:
```sql
SELECT toTypeName(from), hex(from) FROM hits LIMIT 1;
```
```text
┌─toTypeName(from)─┬─hex(from)─┐
│ IPv4 │ B7F7E83A │
└──────────────────┴───────────┘
@ -54,7 +54,7 @@ SELECT toTypeName(from), hex(from) FROM hits LIMIT 1;
Domain values are not implicitly convertible to types other than `UInt32`.
If you want to convert `IPv4` value to a string, you have to do that explicitly with `IPv4NumToString()` function:
```sql
SELECT toTypeName(s), IPv4NumToString(from) as s FROM hits LIMIT 1;
```
@ -66,11 +66,11 @@ SELECT toTypeName(s), IPv4NumToString(from) as s FROM hits LIMIT 1;
Or cast to a `UInt32` value:
```sql
SELECT toTypeName(i), CAST(from as UInt32) as i FROM hits LIMIT 1;
```
```text
┌─toTypeName(CAST(from, 'UInt32'))─┬──────────i─┐
│ UInt32 │ 3086477370 │
└──────────────────────────────────┴────────────┘


@ -4,13 +4,13 @@
### Basic Usage
```sql
CREATE TABLE hits (url String, from IPv6) ENGINE = MergeTree() ORDER BY url;
DESCRIBE TABLE hits;
```
```text
┌─name─┬─type───┬─default_type─┬─default_expression─┬─comment─┬─codec_expression─┐
│ url │ String │ │ │ │ │
│ from │ IPv6 │ │ │ │ │
@ -19,19 +19,19 @@ DESCRIBE TABLE hits;
OR you can use `IPv6` domain as a key:
```sql
CREATE TABLE hits (url String, from IPv6) ENGINE = MergeTree() ORDER BY from;
```
`IPv6` domain supports custom input as IPv6-strings:
```sql
INSERT INTO hits (url, from) VALUES ('https://wikipedia.org', '2a02:aa08:e000:3100::2')('https://clickhouse.yandex', '2001:44c8:129:2632:33:0:252:2')('https://clickhouse.yandex/docs/en/', '2a02:e980:1e::1');
SELECT * FROM hits;
```
```text
┌─url────────────────────────────────┬─from──────────────────────────┐
│ https://clickhouse.yandex │ 2001:44c8:129:2632:33:0:252:2 │
│ https://clickhouse.yandex/docs/en/ │ 2a02:e980:1e::1 │
@ -41,11 +41,11 @@ SELECT * FROM hits;
Values are stored in compact binary form:
```sql
SELECT toTypeName(from), hex(from) FROM hits LIMIT 1;
```
```text
┌─toTypeName(from)─┬─hex(from)────────────────────────┐
│ IPv6 │ 200144C8012926320033000002520002 │
└──────────────────┴──────────────────────────────────┘
@ -54,11 +54,11 @@ SELECT toTypeName(from), hex(from) FROM hits LIMIT 1;
Domain values are not implicitly convertible to types other than `FixedString(16)`.
If you want to convert `IPv6` value to a string, you have to do that explicitly with `IPv6NumToString()` function:
```sql
SELECT toTypeName(s), IPv6NumToString(from) as s FROM hits LIMIT 1;
```
```text
┌─toTypeName(IPv6NumToString(from))─┬─s─────────────────────────────┐
│ String │ 2001:44c8:129:2632:33:0:252:2 │
└───────────────────────────────────┴───────────────────────────────┘
@ -66,11 +66,11 @@ SELECT toTypeName(s), IPv6NumToString(from) as s FROM hits LIMIT 1;
Or cast to a `FixedString(16)` value:
```sql
SELECT toTypeName(i), CAST(from as FixedString(16)) as i FROM hits LIMIT 1;
```
```text
┌─toTypeName(CAST(from, 'FixedString(16)'))─┬─i───────┐
│ FixedString(16) │ ��� │
└───────────────────────────────────────────┴─────────┘


@ -26,19 +26,15 @@ ENGINE = TinyLog
Column `x` can only store values that are listed in the type definition: `'hello'` or `'world'`. If you try to save any other value, ClickHouse will raise an exception. The 8-bit size for this `Enum` is chosen automatically.
```sql
INSERT INTO t_enum VALUES ('hello'), ('world'), ('hello')
```
```text
Ok.
```
```sql
INSERT INTO t_enum values('a')
```
```text
Exception on client:
Code: 49. DB::Exception: Unknown element 'a' for type Enum('hello' = 1, 'world' = 2)
```
@ -47,7 +43,8 @@ When you query data from the table, ClickHouse outputs the string values from `E
```sql
SELECT * FROM t_enum
```
```text
┌─x─────┐
│ hello │
│ world │
@ -59,7 +56,8 @@ If you need to see the numeric equivalents of the rows, you must cast the `Enum`
```sql
SELECT CAST(x, 'Int8') FROM t_enum
```
```text
┌─CAST(x, 'Int8')─┐
│ 1 │
│ 2 │
@ -71,7 +69,8 @@ To create an Enum value in a query, you also need to use `CAST`.
```sql
SELECT toTypeName(CAST('a', 'Enum(\'a\' = 1, \'b\' = 2)'))
```
```text
┌─toTypeName(CAST('a', 'Enum(\'a\' = 1, \'b\' = 2)'))─┐
│ Enum8('a' = 1, 'b' = 2) │
└─────────────────────────────────────────────────────┘
@ -85,7 +84,7 @@ Neither the string nor the numeric value in an `Enum` can be [NULL](../query_lan
An `Enum` can be contained in [Nullable](nullable.md) type. So if you create a table using the query
```sql
CREATE TABLE t_enum_nullable
(
x Nullable( Enum8('hello' = 1, 'world' = 2) )
@ -95,7 +94,7 @@ ENGINE = TinyLog
it can store not only `'hello'` and `'world'`, but `NULL`, as well.
```sql
INSERT INTO t_enum_nullable Values('hello'),('world'),(NULL)
```


@ -4,7 +4,7 @@ A fixed-length string of `N` bytes (neither characters nor code points).
To declare a column of `FixedString` type, use the following syntax:
```sql
<column_name> FixedString(N)
```
@ -30,7 +30,7 @@ When selecting the data, ClickHouse does not remove the null bytes at the end of
Let's consider the following table with the single `FixedString(2)` column:
```text
┌─name──┐
│ b │
└───────┘
@ -38,15 +38,14 @@ Let's consider the following table with the single `FixedString(2)` column:
The query `SELECT * FROM FixedStringTable WHERE a = 'b'` does not return any data. The filter pattern must be complemented with null bytes.
```sql
SELECT * FROM FixedStringTable
WHERE a = 'b\0'
```
```text
┌─a─┐
│ b │
└───┘
```
This behavior differs from MySQL behavior for the `CHAR` type (where strings are padded with spaces, and the spaces are removed for output).
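A quick way to see this difference is to check the stored length directly (a minimal sketch; the alias `a` is illustrative):
```sql
SELECT toFixedString('b', 2) AS a, length(a), a = 'b\0'
```
Here `length(a)` returns 2 and the comparison with `'b\0'` returns 1, because the trailing null byte is kept as part of the value.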


@ -13,11 +13,11 @@ We recommend that you store data in integer form whenever possible. For example,
- Computations with floating-point numbers might produce a rounding error.
```sql
SELECT 1 - 0.9
```
```text
┌───────minus(1, 0.9)─┐
│ 0.09999999999999998 │
└─────────────────────┘
@ -33,11 +33,11 @@ In contrast to standard SQL, ClickHouse supports the following categories of flo
- `Inf` Infinity.
```sql
SELECT 0.5 / 0
```
```text
┌─divide(0.5, 0)─┐
│ inf │
└────────────────┘
@ -45,11 +45,11 @@ SELECT 0.5 / 0
- `-Inf` Negative infinity.
```sql
SELECT -0.5 / 0
```
```text
┌─divide(-0.5, 0)─┐
│ -inf │
└─────────────────┘
@ -57,11 +57,11 @@ SELECT -0.5 / 0
- `NaN` Not a number.
```sql
SELECT 0 / 0
```
```text
┌─divide(0, 0)─┐
│ nan │
└──────────────┘


To insert data, use `INSERT SELECT` with aggregate `-State` functions.
**Function examples**
```sql
uniqState(UserID)
quantilesState(0.5, 0.9)(SendTiming)
```
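A minimal end-to-end sketch of how such states are typically written and read back (the `agg_visits` table and the `test.visits` source used here are illustrative, not part of the original text):
```sql
CREATE TABLE agg_visits
(
    StartDate Date,
    Visits AggregateFunction(uniq, UInt64)
)
ENGINE = AggregatingMergeTree() ORDER BY StartDate;

-- The -State combinator stores the intermediate aggregation state
INSERT INTO agg_visits
SELECT StartDate, uniqState(UserID)
FROM test.visits
GROUP BY StartDate;

-- The matching -Merge combinator finalizes the states on read
SELECT StartDate, uniqMerge(Visits) AS u
FROM agg_visits
GROUP BY StartDate;
```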


@ -4,7 +4,7 @@ A nested data structure is like a nested table. The parameters of a nested data
Example:
```sql
CREATE TABLE test.visits
(
CounterID UInt32,
@ -35,7 +35,7 @@ In most cases, when working with a nested data structure, its individual columns
Example:
```sql
SELECT
Goals.ID,
Goals.EventTime
@ -44,7 +44,7 @@ WHERE CounterID = 101500 AND length(Goals.ID) < 5
LIMIT 10
```
```text
┌─Goals.ID───────────────────────┬─Goals.EventTime───────────────────────────────────────────────────────────────────────────┐
│ [1073752,591325,591325] │ ['2014-03-17 16:38:10','2014-03-17 16:38:48','2014-03-17 16:42:27'] │
│ [1073752] │ ['2014-03-17 00:28:25'] │
@ -63,7 +63,7 @@ It is easiest to think of a nested data structure as a set of multiple column ar
The only place where a SELECT query can specify the name of an entire nested data structure instead of individual columns is the ARRAY JOIN clause. For more information, see "ARRAY JOIN clause". Example:
```sql
SELECT
Goal.ID,
Goal.EventTime
@ -73,7 +73,7 @@ WHERE CounterID = 101500 AND length(Goals.ID) < 5
LIMIT 10
```
```text
┌─Goal.ID─┬──────Goal.EventTime─┐
│ 1073752 │ 2014-03-17 16:38:10 │
│ 591325 │ 2014-03-17 16:38:48 │


@ -17,39 +17,20 @@ To store `Nullable` type values in table column, ClickHouse uses a separate file
## Usage example
```sql
CREATE TABLE t_null(x Int8, y Nullable(Int8)) ENGINE TinyLog
```
```sql
INSERT INTO t_null VALUES (1, NULL), (2, 3)
```
```sql
SELECT x + y FROM t_null
```
```text
┌─plus(x, y)─┐
│ ᴺᵁᴸᴸ │
│ 5 │
└────────────┘
```
[Original article](https://clickhouse.yandex/docs/en/data_types/nullable/) <!--hide-->


@ -7,16 +7,13 @@ For example, literal [NULL](../../query_language/syntax.md#null-literal) has typ
The `Nothing` type can also be used to denote empty arrays:
```sql
SELECT toTypeName(array())
```
```text
┌─toTypeName(array())─┐
│ Array(Nothing) │
└─────────────────────┘
```
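A related one-liner (a minimal sketch) shows the type of the `NULL` literal itself:
```sql
SELECT toTypeName(NULL)
```
```text
┌─toTypeName(NULL)──┐
│ Nullable(Nothing) │
└───────────────────┘
```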


@ -11,24 +11,19 @@ Tuples can be the result of a query. In this case, for text formats other than J
You can use a function to create a tuple:
```sql
tuple(T1, T2, ...)
```
Example of creating a tuple:
```sql
SELECT tuple(1,'a') AS x, toTypeName(x)
```
```text
┌─x───────┬─toTypeName(tuple(1, 'a'))─┐
│ (1,'a') │ Tuple(UInt8, String) │
└─────────┴───────────────────────────┘
```
## Working with data types
@ -37,18 +32,13 @@ When creating a tuple on the fly, ClickHouse automatically detects the type of e
Example of automatic data type detection:
```sql
SELECT tuple(1, NULL) AS x, toTypeName(x)
```
```text
┌─x────────┬─toTypeName(tuple(1, NULL))──────┐
│ (1,NULL) │ Tuple(UInt8, Nullable(Nothing)) │
└──────────┴─────────────────────────────────┘
```


@ -4,13 +4,13 @@ A universally unique identifier (UUID) is a 16-byte number used to identify reco
An example of a UUID type value is shown below:
```text
61f0c404-5cb3-11e7-907b-a6006ad3dba0
```
If you do not specify the UUID column value when inserting a new record, the UUID value is filled with zeros:
```text
00000000-0000-0000-0000-000000000000
```
@ -24,13 +24,16 @@ To generate the UUID value, ClickHouse provides the [generateUUIDv4](../query_la
This example demonstrates creating a table with the UUID type column and inserting a value into the table.
```sql
CREATE TABLE t_uuid (x UUID, y String) ENGINE=TinyLog
```
```sql
INSERT INTO t_uuid SELECT generateUUIDv4(), 'Example 1'
```
```sql
SELECT * FROM t_uuid
```
```text
┌────────────────────────────────────x─┬─y─────────┐
│ 417ddc5d-e556-4d27-95dd-a34d84e46a50 │ Example 1 │
└──────────────────────────────────────┴───────────┘
@ -40,11 +43,13 @@ This example demonstrates creating a table with the UUID type column and inserti
In this example, the UUID column value is not specified when inserting a new record.
```sql
INSERT INTO t_uuid (y) VALUES ('Example 2')
```
```sql
SELECT * FROM t_uuid
```
```text
┌────────────────────────────────────x─┬─y─────────┐
│ 417ddc5d-e556-4d27-95dd-a34d84e46a50 │ Example 1 │
│ 00000000-0000-0000-0000-000000000000 │ Example 2 │


@ -55,7 +55,7 @@ All other MySQL data types are converted into [String](../data_types/string.md).
Table in MySQL:
```text
mysql> USE test;
Database changed


@ -3,21 +3,21 @@
## Install Git and Pbuilder
```bash
$ sudo apt-get update
$ sudo apt-get install git pbuilder debhelper lsb-release fakeroot sudo debian-archive-keyring debian-keyring
```
## Checkout ClickHouse Sources
```bash
$ git clone --recursive --branch stable https://github.com/yandex/ClickHouse.git
$ cd ClickHouse
```
## Run Release Script
```bash
$ ./release
```
# How to Build ClickHouse for Development
@ -29,13 +29,13 @@ Only x86_64 with SSE 4.2 is supported. Support for AArch64 is experimental.
To test for SSE 4.2, do
```bash
$ grep -q sse4_2 /proc/cpuinfo && echo "SSE 4.2 supported" || echo "SSE 4.2 not supported"
```
## Install Git and CMake
```bash
$ sudo apt-get install git cmake ninja-build
```
Or use cmake3 instead of cmake on older systems.
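On such systems the install line might look like this (a sketch; package availability depends on the distribution):
```bash
$ sudo apt-get install git cmake3 ninja-build
```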
@ -47,10 +47,10 @@ There are several ways to do this.
### Install from a PPA Package
```bash
$ sudo apt-get install software-properties-common
$ sudo apt-add-repository ppa:ubuntu-toolchain-r/test
$ sudo apt-get update
$ sudo apt-get install gcc-9 g++-9
```
### Install from Sources
@ -60,23 +60,25 @@ Look at [utils/ci/build-gcc-from-sources.sh](https://github.com/yandex/ClickHous
## Use GCC 9 for Builds
```bash
$ export CC=gcc-9
$ export CXX=g++-9
```
## Install Required Libraries from Packages
```bash
$ sudo apt-get install libicu-dev libreadline-dev gperf
```
## Checkout ClickHouse Sources
```bash
$ git clone --recursive git@github.com:yandex/ClickHouse.git
```
or
```bash
$ git clone --recursive https://github.com/yandex/ClickHouse.git
$ cd ClickHouse
```
For the latest stable version, switch to the `stable` branch.
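For example (a minimal sketch):
```bash
$ git checkout stable
```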
@ -84,11 +86,11 @@ For the latest stable version, switch to the `stable` branch.
## Build ClickHouse
```bash
$ mkdir build
$ cd build
$ cmake ..
$ ninja
$ cd ..
```
To create an executable, run `ninja clickhouse`.
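For example (a sketch; the binary path is an assumption based on the source layout of that time):
```bash
$ cd build
$ ninja clickhouse
$ ./dbms/programs/clickhouse --help
```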


@ -5,22 +5,25 @@ Build should work on Mac OS X 10.12.
## Install Homebrew
```bash
$ /usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"
```
## Install Required Compilers, Tools, and Libraries
```bash
$ brew install cmake ninja gcc icu4c openssl libtool gettext readline gperf
```
## Checkout ClickHouse Sources
```bash
$ git clone --recursive git@github.com:yandex/ClickHouse.git
```
or
```bash
$ git clone --recursive https://github.com/yandex/ClickHouse.git
$ cd ClickHouse
```
For the latest stable version, switch to the `stable` branch.
@ -28,11 +31,11 @@ For the latest stable version, switch to the `stable` branch.
## Build ClickHouse
```bash
$ mkdir build
$ cd build
$ cmake .. -DCMAKE_CXX_COMPILER=`which g++-8` -DCMAKE_C_COMPILER=`which gcc-8`
$ ninja
$ cd ..
```
## Caveats
@ -45,7 +48,7 @@ If you intend to run clickhouse-server, make sure to increase the system's maxfi
To do so, create the following file:
/Library/LaunchDaemons/limit.maxfiles.plist:
```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN"
"http://www.apple.com/DTDs/PropertyList-1.0.dtd">
@ -70,7 +73,7 @@ To do so, create the following file:
```
Execute the following command:
```bash
$ sudo chown root:wheel /Library/LaunchDaemons/limit.maxfiles.plist
```


@ -86,21 +86,21 @@ Note that all clickhouse tools (server, client, etc) are just symlinks to a sing
Alternatively you can install the ClickHouse package: either a stable release from the Yandex repository, or you can build a package yourself with `./release` in the ClickHouse sources root. Then start the server with `sudo service clickhouse-server start` (or stop it with `sudo service clickhouse-server stop`). Look for logs at `/var/log/clickhouse-server/clickhouse-server.log`.
When ClickHouse is already installed on your system, you can build a new `clickhouse` binary and replace the existing binary:
```bash
$ sudo service clickhouse-server stop
$ sudo cp ./clickhouse /usr/bin/
$ sudo service clickhouse-server start
```
You can also stop the system clickhouse-server and run your own with the same configuration but with logging to the terminal:
```bash
$ sudo service clickhouse-server stop
$ sudo -u clickhouse /usr/bin/clickhouse server --config-file /etc/clickhouse-server/config.xml
```
Example with gdb:
```bash
$ sudo -u clickhouse gdb --args /usr/bin/clickhouse server --config-file /etc/clickhouse-server/config.xml
```
If the system clickhouse-server is already running and you don't want to stop it, you can change port numbers in your `config.xml` (or override them in a file in `config.d` directory), provide appropriate data path, and run it.
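For example, a hypothetical override file dropped into `config.d` might look like this (the file name, port values, and data path are illustrative):
```xml
<!-- /etc/clickhouse-server/config.d/dev_instance.xml (hypothetical) -->
<yandex>
    <tcp_port>9001</tcp_port>
    <http_port>8124</http_port>
    <path>/var/lib/clickhouse-dev/</path>
</yandex>
```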
@ -112,7 +112,7 @@ If the system clickhouse-server is already running and you don't want to stop it
Before publishing a release as stable, we deploy it to a testing environment. The testing environment is a cluster that processes 1/39 of the [Yandex.Metrica](https://metrica.yandex.com/) data. We share our testing environment with the Yandex.Metrica team. ClickHouse is upgraded without downtime on top of the existing data. We look first at whether data is processed successfully without lagging behind realtime, whether replication continues to work, and whether any issues are visible to the Yandex.Metrica team. The first check can be done in the following way:
```sql
SELECT hostName() AS h, any(version()), any(uptime()), max(UTCEventTime), count() FROM remote('example01-01-{1..3}t', merge, hits) WHERE EventDate >= today() - 2 GROUP BY h ORDER BY h;
```
@ -126,16 +126,16 @@ After deploying to testing environment we run load testing with queries from pro
Make sure you have enabled `query_log` on your production cluster.
Collect query log for a day or more:
```bash
$ clickhouse-client --query="SELECT DISTINCT query FROM system.query_log WHERE event_date = today() AND query LIKE '%ym:%' AND query NOT LIKE '%system.query_log%' AND type = 2 AND is_initial_query" > queries.tsv
```
This is a somewhat complicated example. `type = 2` filters queries that were executed successfully. `query LIKE '%ym:%'` selects relevant queries from Yandex.Metrica. `is_initial_query` selects only queries that were initiated by the client, not by ClickHouse itself (as parts of distributed query processing).
`scp` this log to your testing cluster and run it as follows:
```bash
$ clickhouse benchmark --concurrency 16 < queries.tsv
```
(You probably also want to specify `--user`.)


@ -17,7 +17,7 @@ If you use Oracle through the ODBC driver as a source of external dictionaries,
**Example**
```sql
NLS_LANG=RUSSIAN_RUSSIA.UTF8
```


@ -7,21 +7,21 @@ Sign up for a free account at <https://aws.amazon.com>. You will need a credit c
Run the following in the console:
```bash
$ sudo apt-get install s3cmd
$ mkdir tiny; cd tiny;
$ s3cmd sync s3://big-data-benchmark/pavlo/text-deflate/tiny/ .
$ cd ..
$ mkdir 1node; cd 1node;
$ s3cmd sync s3://big-data-benchmark/pavlo/text-deflate/1node/ .
$ cd ..
$ mkdir 5nodes; cd 5nodes;
$ s3cmd sync s3://big-data-benchmark/pavlo/text-deflate/5nodes/ .
$ cd ..
```
Run the following ClickHouse queries:
```sql
CREATE TABLE rankings_tiny
(
pageURL String,
@ -86,12 +86,12 @@ CREATE TABLE uservisits_5nodes_on_single
Go back to the console:
```bash
$ for i in tiny/rankings/*.deflate; do echo $i; zlib-flate -uncompress < $i | clickhouse-client --host=example-perftest01j --query="INSERT INTO rankings_tiny FORMAT CSV"; done
$ for i in tiny/uservisits/*.deflate; do echo $i; zlib-flate -uncompress < $i | clickhouse-client --host=example-perftest01j --query="INSERT INTO uservisits_tiny FORMAT CSV"; done
$ for i in 1node/rankings/*.deflate; do echo $i; zlib-flate -uncompress < $i | clickhouse-client --host=example-perftest01j --query="INSERT INTO rankings_1node FORMAT CSV"; done
$ for i in 1node/uservisits/*.deflate; do echo $i; zlib-flate -uncompress < $i | clickhouse-client --host=example-perftest01j --query="INSERT INTO uservisits_1node FORMAT CSV"; done
$ for i in 5nodes/rankings/*.deflate; do echo $i; zlib-flate -uncompress < $i | clickhouse-client --host=example-perftest01j --query="INSERT INTO rankings_5nodes_on_single FORMAT CSV"; done
$ for i in 5nodes/uservisits/*.deflate; do echo $i; zlib-flate -uncompress < $i | clickhouse-client --host=example-perftest01j --query="INSERT INTO uservisits_5nodes_on_single FORMAT CSV"; done
```
Queries for obtaining data samples:


@ -4,14 +4,14 @@ Download the data from <http://labs.criteo.com/downloads/download-terabyte-click
Create a table to import the log to:
```sql
CREATE TABLE criteo_log (date Date, clicked UInt8, int1 Int32, int2 Int32, int3 Int32, int4 Int32, int5 Int32, int6 Int32, int7 Int32, int8 Int32, int9 Int32, int10 Int32, int11 Int32, int12 Int32, int13 Int32, cat1 String, cat2 String, cat3 String, cat4 String, cat5 String, cat6 String, cat7 String, cat8 String, cat9 String, cat10 String, cat11 String, cat12 String, cat13 String, cat14 String, cat15 String, cat16 String, cat17 String, cat18 String, cat19 String, cat20 String, cat21 String, cat22 String, cat23 String, cat24 String, cat25 String, cat26 String) ENGINE = Log
```
Download the data:
```bash
$ for i in {00..23}; do echo $i; zcat datasets/criteo/day_${i#0}.gz | sed -r 's/^/2000-01-'${i/00/24}'\t/' | clickhouse-client --host=example-perftest01j --query="INSERT INTO criteo_log FORMAT TabSeparated"; done
```
Create a table for the converted data:
@ -65,7 +65,7 @@ CREATE TABLE criteo
Transform data from the raw log and put it in the second table:
```sql
INSERT INTO criteo SELECT date, clicked, int1, int2, int3, int4, int5, int6, int7, int8, int9, int10, int11, int12, int13, reinterpretAsUInt32(unhex(cat1)) AS icat1, reinterpretAsUInt32(unhex(cat2)) AS icat2, reinterpretAsUInt32(unhex(cat3)) AS icat3, reinterpretAsUInt32(unhex(cat4)) AS icat4, reinterpretAsUInt32(unhex(cat5)) AS icat5, reinterpretAsUInt32(unhex(cat6)) AS icat6, reinterpretAsUInt32(unhex(cat7)) AS icat7, reinterpretAsUInt32(unhex(cat8)) AS icat8, reinterpretAsUInt32(unhex(cat9)) AS icat9, reinterpretAsUInt32(unhex(cat10)) AS icat10, reinterpretAsUInt32(unhex(cat11)) AS icat11, reinterpretAsUInt32(unhex(cat12)) AS icat12, reinterpretAsUInt32(unhex(cat13)) AS icat13, reinterpretAsUInt32(unhex(cat14)) AS icat14, reinterpretAsUInt32(unhex(cat15)) AS icat15, reinterpretAsUInt32(unhex(cat16)) AS icat16, reinterpretAsUInt32(unhex(cat17)) AS icat17, reinterpretAsUInt32(unhex(cat18)) AS icat18, reinterpretAsUInt32(unhex(cat19)) AS icat19, reinterpretAsUInt32(unhex(cat20)) AS icat20, reinterpretAsUInt32(unhex(cat21)) AS icat21, reinterpretAsUInt32(unhex(cat22)) AS icat22, reinterpretAsUInt32(unhex(cat23)) AS icat23, reinterpretAsUInt32(unhex(cat24)) AS icat24, reinterpretAsUInt32(unhex(cat25)) AS icat25, reinterpretAsUInt32(unhex(cat26)) AS icat26 FROM criteo_log;
DROP TABLE criteo_log;


@ -4,47 +4,47 @@ Dataset consists of two tables containing anonymized data about hits (`hits_v1`)
## Obtaining Tables from Prepared Partitions
**Download and import hits:**
```bash
$ curl -O https://clickhouse-datasets.s3.yandex.net/hits/partitions/hits_v1.tar
$ tar xvf hits_v1.tar -C /var/lib/clickhouse # path to ClickHouse data directory
$ # check permissions on unpacked data, fix if required
$ sudo service clickhouse-server restart
$ clickhouse-client --query "SELECT COUNT(*) FROM datasets.hits_v1"
```
**Download and import visits:**
```bash
$ curl -O https://clickhouse-datasets.s3.yandex.net/visits/partitions/visits_v1.tar
$ tar xvf visits_v1.tar -C /var/lib/clickhouse # path to ClickHouse data directory
$ # check permissions on unpacked data, fix if required
$ sudo service clickhouse-server restart
$ clickhouse-client --query "SELECT COUNT(*) FROM datasets.visits_v1"
```
## Obtaining Tables from Compressed tsv-file
**Download and import hits from compressed tsv-file**
```bash
$ curl https://clickhouse-datasets.s3.yandex.net/hits/tsv/hits_v1.tsv.xz | unxz --threads=`nproc` > hits_v1.tsv
$ # now create table
$ clickhouse-client --query "CREATE DATABASE IF NOT EXISTS datasets"
$ clickhouse-client --query "CREATE TABLE datasets.hits_v1 ( WatchID UInt64, JavaEnable UInt8, Title String, GoodEvent Int16, EventTime DateTime, EventDate Date, CounterID UInt32, ClientIP UInt32, ClientIP6 FixedString(16), RegionID UInt32, UserID UInt64, CounterClass Int8, OS UInt8, UserAgent UInt8, URL String, Referer String, URLDomain String, RefererDomain String, Refresh UInt8, IsRobot UInt8, RefererCategories Array(UInt16), URLCategories Array(UInt16), URLRegions Array(UInt32), RefererRegions Array(UInt32), ResolutionWidth UInt16, ResolutionHeight UInt16, ResolutionDepth UInt8, FlashMajor UInt8, FlashMinor UInt8, FlashMinor2 String, NetMajor UInt8, NetMinor UInt8, UserAgentMajor UInt16, UserAgentMinor FixedString(2), CookieEnable UInt8, JavascriptEnable UInt8, IsMobile UInt8, MobilePhone UInt8, MobilePhoneModel String, Params String, IPNetworkID UInt32, TraficSourceID Int8, SearchEngineID UInt16, SearchPhrase String, AdvEngineID UInt8, IsArtifical UInt8, WindowClientWidth UInt16, WindowClientHeight UInt16, ClientTimeZone Int16, ClientEventTime DateTime, SilverlightVersion1 UInt8, SilverlightVersion2 UInt8, SilverlightVersion3 UInt32, SilverlightVersion4 UInt16, PageCharset String, CodeVersion UInt32, IsLink UInt8, IsDownload UInt8, IsNotBounce UInt8, FUniqID UInt64, HID UInt32, IsOldCounter UInt8, IsEvent UInt8, IsParameter UInt8, DontCountHits UInt8, WithHash UInt8, HitColor FixedString(1), UTCEventTime DateTime, Age UInt8, Sex UInt8, Income UInt8, Interests UInt16, Robotness UInt8, GeneralInterests Array(UInt16), RemoteIP UInt32, RemoteIP6 FixedString(16), WindowName Int32, OpenerName Int32, HistoryLength Int16, BrowserLanguage FixedString(2), BrowserCountry FixedString(2), SocialNetwork String, SocialAction String, HTTPError UInt16, SendTiming Int32, DNSTiming Int32, ConnectTiming Int32, ResponseStartTiming Int32, ResponseEndTiming Int32, FetchTiming Int32, RedirectTiming Int32, DOMInteractiveTiming Int32, DOMContentLoadedTiming Int32, DOMCompleteTiming Int32, LoadEventStartTiming Int32, LoadEventEndTiming Int32, NSToDOMContentLoadedTiming Int32, FirstPaintTiming Int32, RedirectCount Int8, SocialSourceNetworkID UInt8, SocialSourcePage String, ParamPrice Int64, ParamOrderID String, ParamCurrency FixedString(3), ParamCurrencyID UInt16, GoalsReached Array(UInt32), OpenstatServiceName String, OpenstatCampaignID String, OpenstatAdID String, OpenstatSourceID String, UTMSource String, UTMMedium String, UTMCampaign String, UTMContent String, UTMTerm String, FromTag String, HasGCLID UInt8, RefererHash UInt64, URLHash UInt64, CLID UInt32, YCLID UInt64, ShareService String, ShareURL String, ShareTitle String, ParsedParams Nested(Key1 String, Key2 String, Key3 String, Key4 String, Key5 String, ValueDouble Float64), IslandID FixedString(16), RequestNum UInt32, RequestTry UInt8) ENGINE = MergeTree() PARTITION BY toYYYYMM(EventDate) ORDER BY (CounterID, EventDate, intHash32(UserID)) SAMPLE BY intHash32(UserID) SETTINGS index_granularity = 8192"
$ # import data
$ cat hits_v1.tsv | clickhouse-client --query "INSERT INTO datasets.hits_v1 FORMAT TSV" --max_insert_block_size=100000
$ # optionally you can optimize table
$ clickhouse-client --query "OPTIMIZE TABLE datasets.hits_v1 FINAL"
$ clickhouse-client --query "SELECT COUNT(*) FROM datasets.hits_v1"
```
**Download and import visits from compressed tsv-file**
```bash
$ curl https://clickhouse-datasets.s3.yandex.net/visits/tsv/visits_v1.tsv.xz | unxz --threads=`nproc` > visits_v1.tsv
$ # now create table
$ clickhouse-client --query "CREATE DATABASE IF NOT EXISTS datasets"
$ clickhouse-client --query "CREATE TABLE datasets.visits_v1 ( CounterID UInt32, StartDate Date, Sign Int8, IsNew UInt8, VisitID UInt64, UserID UInt64, StartTime DateTime, Duration UInt32, UTCStartTime DateTime, PageViews Int32, Hits Int32, IsBounce UInt8, Referer String, StartURL String, RefererDomain String, StartURLDomain String, EndURL String, LinkURL String, IsDownload UInt8, TraficSourceID Int8, SearchEngineID UInt16, SearchPhrase String, AdvEngineID UInt8, PlaceID Int32, RefererCategories Array(UInt16), URLCategories Array(UInt16), URLRegions Array(UInt32), RefererRegions Array(UInt32), IsYandex UInt8, GoalReachesDepth Int32, GoalReachesURL Int32, GoalReachesAny Int32, SocialSourceNetworkID UInt8, SocialSourcePage String, MobilePhoneModel String, ClientEventTime DateTime, RegionID UInt32, ClientIP UInt32, ClientIP6 FixedString(16), RemoteIP UInt32, RemoteIP6 FixedString(16), IPNetworkID UInt32, SilverlightVersion3 UInt32, CodeVersion UInt32, ResolutionWidth UInt16, ResolutionHeight UInt16, UserAgentMajor UInt16, UserAgentMinor UInt16, WindowClientWidth UInt16, WindowClientHeight UInt16, SilverlightVersion2 UInt8, SilverlightVersion4 UInt16, FlashVersion3 UInt16, FlashVersion4 UInt16, ClientTimeZone Int16, OS UInt8, UserAgent UInt8, ResolutionDepth UInt8, FlashMajor UInt8, FlashMinor UInt8, NetMajor UInt8, NetMinor UInt8, MobilePhone UInt8, SilverlightVersion1 UInt8, Age UInt8, Sex UInt8, Income UInt8, JavaEnable UInt8, CookieEnable UInt8, JavascriptEnable UInt8, IsMobile UInt8, BrowserLanguage UInt16, BrowserCountry UInt16, Interests UInt16, Robotness UInt8, GeneralInterests Array(UInt16), Params Array(String), Goals Nested(ID UInt32, Serial UInt32, EventTime DateTime, Price Int64, OrderID String, CurrencyID UInt32), WatchIDs Array(UInt64), ParamSumPrice Int64, ParamCurrency FixedString(3), ParamCurrencyID UInt16, ClickLogID UInt64, ClickEventID Int32, ClickGoodEvent Int32, ClickEventTime DateTime, ClickPriorityID Int32, ClickPhraseID Int32, ClickPageID Int32, ClickPlaceID Int32, ClickTypeID Int32, ClickResourceID Int32, ClickCost UInt32, ClickClientIP UInt32, ClickDomainID UInt32, ClickURL String, ClickAttempt UInt8, ClickOrderID UInt32, ClickBannerID UInt32, ClickMarketCategoryID UInt32, ClickMarketPP UInt32, ClickMarketCategoryName String, ClickMarketPPName String, ClickAWAPSCampaignName String, ClickPageName String, ClickTargetType UInt16, ClickTargetPhraseID UInt64, ClickContextType UInt8, ClickSelectType Int8, ClickOptions String, ClickGroupBannerID Int32, OpenstatServiceName String, OpenstatCampaignID String, OpenstatAdID String, OpenstatSourceID String, UTMSource String, UTMMedium String, UTMCampaign String, UTMContent String, UTMTerm String, FromTag String, HasGCLID UInt8, FirstVisit DateTime, PredLastVisit Date, LastVisit Date, TotalVisits UInt32, TraficSource Nested(ID Int8, SearchEngineID UInt16, AdvEngineID UInt8, PlaceID UInt16, SocialSourceNetworkID UInt8, Domain String, SearchPhrase String, SocialSourcePage String), Attendance FixedString(16), CLID UInt32, YCLID UInt64, NormalizedRefererHash UInt64, SearchPhraseHash UInt64, RefererDomainHash UInt64, NormalizedStartURLHash UInt64, StartURLDomainHash UInt64, NormalizedEndURLHash UInt64, TopLevelDomain UInt64, URLScheme UInt64, OpenstatServiceNameHash UInt64, OpenstatCampaignIDHash UInt64, OpenstatAdIDHash UInt64, OpenstatSourceIDHash UInt64, UTMSourceHash UInt64, UTMMediumHash UInt64, UTMCampaignHash UInt64, UTMContentHash UInt64, UTMTermHash UInt64, FromHash UInt64, WebVisorEnabled UInt8, WebVisorActivity UInt32, 
ParsedParams Nested(Key1 String, Key2 String, Key3 String, Key4 String, Key5 String, ValueDouble Float64), Market Nested(Type UInt8, GoalID UInt32, OrderID String, OrderPrice Int64, PP UInt32, DirectPlaceID UInt32, DirectOrderID UInt32, DirectBannerID UInt32, GoodID String, GoodName String, GoodQuantity Int32, GoodPrice Int64), IslandID FixedString(16)) ENGINE = CollapsingMergeTree(StartDate, intHash32(UserID), (CounterID, StartDate, intHash32(UserID), VisitID), 8192, Sign)"
$ # import data
$ cat visits_v1.tsv | clickhouse-client --query "INSERT INTO datasets.visits_v1 FORMAT TSV" --max_insert_block_size=100000
$ # optionally you can optimize table
$ clickhouse-client --query "OPTIMIZE TABLE datasets.visits_v1 FINAL"
$ clickhouse-client --query "SELECT COUNT(*) FROM datasets.visits_v1"
```
## Queries



@ -24,7 +24,7 @@ done
Creating a table:
```sql
CREATE TABLE `ontime` (
`Year` UInt16,
`Quarter` UInt8,
@ -141,17 +141,17 @@ CREATE TABLE `ontime` (
Loading data:
```bash
$ for i in *.zip; do echo $i; unzip -cq $i '*.csv' | sed 's/\.00//g' | clickhouse-client --host=example-perftest01j --query="INSERT INTO ontime FORMAT CSVWithNames"; done
```
## Download of Prepared Partitions
```bash
$ curl -O https://clickhouse-datasets.s3.yandex.net/ontime/partitions/ontime.tar
$ tar xvf ontime.tar -C /var/lib/clickhouse # path to ClickHouse data directory
$ # check permissions of unpacked data, fix if required
$ sudo service clickhouse-server restart
$ clickhouse-client --query "select count(*) from datasets.ontime"
```
!!! info
@ -162,7 +162,7 @@ clickhouse-client --query "select count(*) from datasets.ontime"
Q0.
```sql
SELECT avg(c1)
FROM
(
@ -174,7 +174,7 @@ FROM
Q1. The number of flights per day from the year 2000 to 2008
```sql
SELECT DayOfWeek, count(*) AS c
FROM ontime
WHERE Year>=2000 AND Year<=2008
@ -184,7 +184,7 @@ ORDER BY c DESC;
Q2. The number of flights delayed by more than 10 minutes, grouped by the day of the week, for 2000-2008
```sql
SELECT DayOfWeek, count(*) AS c
FROM ontime
WHERE DepDelay>10 AND Year>=2000 AND Year<=2008
@ -194,7 +194,7 @@ ORDER BY c DESC;
Q3. The number of delays by airport for 2000-2008
```sql
SELECT Origin, count(*) AS c
FROM ontime
WHERE DepDelay>10 AND Year>=2000 AND Year<=2008
@ -205,7 +205,7 @@ LIMIT 10;
Q4. The number of delays by carrier for 2007
```sql
SELECT Carrier, count(*)
FROM ontime
WHERE DepDelay>10 AND Year=2007
@ -215,7 +215,7 @@ ORDER BY count(*) DESC;
Q5. The percentage of delays by carrier for 2007
```sql
SELECT Carrier, c, c2, c*100/c2 as c3
FROM
(
@ -241,7 +241,7 @@ ORDER BY c3 DESC;
Better version of the same query:
```sql
SELECT Carrier, avg(DepDelay>10)*100 AS c3
FROM ontime
WHERE Year=2007
@ -251,7 +251,7 @@ ORDER BY Carrier
Q6. The previous request for a broader range of years, 2000-2008
```sql
SELECT Carrier, c, c2, c*100/c2 as c3
FROM
(
@ -277,7 +277,7 @@ ORDER BY c3 DESC;
Better version of the same query:
```sql
SELECT Carrier, avg(DepDelay>10)*100 AS c3
FROM ontime
WHERE Year>=2000 AND Year<=2008
@ -287,7 +287,7 @@ ORDER BY Carrier;
Q7. Percentage of flights delayed for more than 10 minutes, by year
```sql
SELECT Year, c1/c2
FROM
(
@ -311,7 +311,7 @@ ORDER BY Year;
Better version of the same query:
```sql
SELECT Year, avg(DepDelay>10)
FROM ontime
GROUP BY Year
@ -320,7 +320,7 @@ ORDER BY Year;
Q8. The most popular destinations by the number of directly connected cities for various year ranges
```sql
SELECT DestCityName, uniqExact(OriginCityName) AS u
FROM ontime
WHERE Year>=2000 and Year<=2010
@ -331,7 +331,7 @@ LIMIT 10;
Q9.
```sql
SELECT Year, count(*) AS c1
FROM ontime
GROUP BY Year;
@ -339,7 +339,7 @@ GROUP BY Year;
Q10.
```sql
SELECT
min(Year), max(Year), Carrier, count(*) AS cnt,
sum(ArrDelayMinutes>30) AS flights_delayed,
@ -357,7 +357,7 @@ LIMIT 1000;
Bonus:
```sql
SELECT avg(cnt)
FROM
(


@ -2,25 +2,25 @@
Compiling dbgen:
```bash
$ git clone git@github.com:vadimtk/ssb-dbgen.git
$ cd ssb-dbgen
$ make
```
Generating data:
```bash
$ ./dbgen -s 1000 -T c
$ ./dbgen -s 1000 -T l
$ ./dbgen -s 1000 -T p
$ ./dbgen -s 1000 -T s
$ ./dbgen -s 1000 -T d
```
Creating tables in ClickHouse:
```sql
CREATE TABLE customer
(
C_CUSTKEY UInt32,
@ -85,16 +85,16 @@ ENGINE = MergeTree ORDER BY S_SUPPKEY;
Inserting data:
```bash
$ clickhouse-client --query "INSERT INTO customer FORMAT CSV" < customer.tbl
$ clickhouse-client --query "INSERT INTO part FORMAT CSV" < part.tbl
$ clickhouse-client --query "INSERT INTO supplier FORMAT CSV" < supplier.tbl
$ clickhouse-client --query "INSERT INTO lineorder FORMAT CSV" < lineorder.tbl
```
Converting "star schema" to denormalized "flat schema":
```sql
SET max_memory_usage = 20000000000, allow_experimental_multiple_joins_emulation = 1;
CREATE TABLE lineorder_flat
@ -112,44 +112,56 @@ ALTER TABLE lineorder_flat DROP COLUMN C_CUSTKEY, DROP COLUMN S_SUPPKEY, DROP CO
Running the queries:
Q1.1
```sql
SELECT sum(LO_EXTENDEDPRICE * LO_DISCOUNT) AS revenue FROM lineorder_flat WHERE toYear(LO_ORDERDATE) = 1993 AND LO_DISCOUNT BETWEEN 1 AND 3 AND LO_QUANTITY < 25;
```
Q1.2
```sql
SELECT sum(LO_EXTENDEDPRICE * LO_DISCOUNT) AS revenue FROM lineorder_flat WHERE toYYYYMM(LO_ORDERDATE) = 199401 AND LO_DISCOUNT BETWEEN 4 AND 6 AND LO_QUANTITY BETWEEN 26 AND 35;
```
Q1.3
```sql
SELECT sum(LO_EXTENDEDPRICE * LO_DISCOUNT) AS revenue FROM lineorder_flat WHERE toISOWeek(LO_ORDERDATE) = 6 AND toYear(LO_ORDERDATE) = 1994 AND LO_DISCOUNT BETWEEN 5 AND 7 AND LO_QUANTITY BETWEEN 26 AND 35;
```
Q2.1
```sql
SELECT sum(LO_REVENUE), toYear(LO_ORDERDATE) AS year, P_BRAND FROM lineorder_flat WHERE P_CATEGORY = 'MFGR#12' AND S_REGION = 'AMERICA' GROUP BY year, P_BRAND ORDER BY year, P_BRAND;
```
Q2.2
```sql
SELECT sum(LO_REVENUE), toYear(LO_ORDERDATE) AS year, P_BRAND FROM lineorder_flat WHERE P_BRAND BETWEEN 'MFGR#2221' AND 'MFGR#2228' AND S_REGION = 'ASIA' GROUP BY year, P_BRAND ORDER BY year, P_BRAND;
```
Q2.3
```sql
SELECT sum(LO_REVENUE), toYear(LO_ORDERDATE) AS year, P_BRAND FROM lineorder_flat WHERE P_BRAND = 'MFGR#2239' AND S_REGION = 'EUROPE' GROUP BY year, P_BRAND ORDER BY year, P_BRAND;
```
Q3.1
```sql
SELECT C_NATION, S_NATION, toYear(LO_ORDERDATE) AS year, sum(LO_REVENUE) AS revenue FROM lineorder_flat WHERE C_REGION = 'ASIA' AND S_REGION = 'ASIA' AND year >= 1992 AND year <= 1997 GROUP BY C_NATION, S_NATION, year ORDER BY year asc, revenue desc;
```
Q3.2
```sql
SELECT C_CITY, S_CITY, toYear(LO_ORDERDATE) AS year, sum(LO_REVENUE) AS revenue FROM lineorder_flat WHERE C_NATION = 'UNITED STATES' AND S_NATION = 'UNITED STATES' AND year >= 1992 AND year <= 1997 GROUP BY C_CITY, S_CITY, year ORDER BY year asc, revenue desc;
```
Q3.3
```sql
SELECT C_CITY, S_CITY, toYear(LO_ORDERDATE) AS year, sum(LO_REVENUE) AS revenue FROM lineorder_flat WHERE (C_CITY = 'UNITED KI1' OR C_CITY = 'UNITED KI5') AND (S_CITY = 'UNITED KI1' OR S_CITY = 'UNITED KI5') AND year >= 1992 AND year <= 1997 GROUP BY C_CITY, S_CITY, year ORDER BY year asc, revenue desc;
```
Q3.4
```sql
SELECT C_CITY, S_CITY, toYear(LO_ORDERDATE) AS year, sum(LO_REVENUE) AS revenue FROM lineorder_flat WHERE (C_CITY = 'UNITED KI1' OR C_CITY = 'UNITED KI5') AND (S_CITY = 'UNITED KI1' OR S_CITY = 'UNITED KI5') AND toYYYYMM(LO_ORDERDATE) = '199712' GROUP BY C_CITY, S_CITY, year ORDER BY year asc, revenue desc;
```
Q4.1
```sql
SELECT toYear(LO_ORDERDATE) AS year, C_NATION, sum(LO_REVENUE - LO_SUPPLYCOST) AS profit FROM lineorder_flat WHERE C_REGION = 'AMERICA' AND S_REGION = 'AMERICA' AND (P_MFGR = 'MFGR#1' OR P_MFGR = 'MFGR#2') GROUP BY year, C_NATION ORDER BY year, C_NATION;
```
Q4.2
```sql
SELECT toYear(LO_ORDERDATE) AS year, S_NATION, P_CATEGORY, sum(LO_REVENUE - LO_SUPPLYCOST) AS profit FROM lineorder_flat WHERE C_REGION = 'AMERICA' AND S_REGION = 'AMERICA' AND (year = 1997 OR year = 1998) AND (P_MFGR = 'MFGR#1' OR P_MFGR = 'MFGR#2') GROUP BY year, S_NATION, P_CATEGORY ORDER BY year, S_NATION, P_CATEGORY;
```
Q4.3
```sql
SELECT toYear(LO_ORDERDATE) AS year, S_CITY, P_BRAND, sum(LO_REVENUE - LO_SUPPLYCOST) AS profit FROM lineorder_flat WHERE S_NATION = 'UNITED STATES' AND (year = 1997 OR year = 1998) AND P_CATEGORY = 'MFGR#14' GROUP BY year, S_CITY, P_BRAND ORDER BY year, S_CITY, P_BRAND;
```
@ -4,7 +4,7 @@ See: <http://dumps.wikimedia.org/other/pagecounts-raw/>
Creating a table:
```sql
CREATE TABLE wikistat
(
date Date,
@ -20,9 +20,9 @@ CREATE TABLE wikistat
Loading data:
```bash
$ for i in {2007..2016}; do for j in {01..12}; do echo $i-$j >&2; curl -sSL "http://dumps.wikimedia.org/other/pagecounts-raw/$i/$i-$j/" | grep -oE 'pagecounts-[0-9]+-[0-9]+\.gz'; done; done | sort | uniq | tee links.txt
$ cat links.txt | while read link; do wget http://dumps.wikimedia.org/other/pagecounts-raw/$(echo $link | sed -r 's/pagecounts-([0-9]{4})([0-9]{2})[0-9]{2}-[0-9]+\.gz/\1/')/$(echo $link | sed -r 's/pagecounts-([0-9]{4})([0-9]{2})[0-9]{2}-[0-9]+\.gz/\1-\2/')/$link; done
$ ls -1 /opt/wikistat/ | grep gz | while read i; do echo $i; gzip -cd /opt/wikistat/$i | ./wikistat-loader --time="$(echo -n $i | sed -r 's/pagecounts-([0-9]{4})([0-9]{2})([0-9]{2})-([0-9]{2})([0-9]{2})([0-9]{2})\.gz/\1-\2-\3 \4-00-00/')" | clickhouse-client --query="INSERT INTO wikistat FORMAT TabSeparated"; done
```
@ -18,8 +18,8 @@ Yandex ClickHouse team recommends using official pre-compiled `deb` packages for
To install official packages add the Yandex repository in `/etc/apt/sources.list` or in a separate `/etc/apt/sources.list.d/clickhouse.list` file:
```bash
$ deb http://repo.yandex.ru/clickhouse/deb/stable/ main/
```
If you want to use the most recent version, replace `stable` with `testing` (this is recommended for your testing environments).
@ -27,10 +27,10 @@ If you want to use the most recent version, replace `stable` with `testing` (thi
Then run these commands to actually install packages:
```bash
$ sudo apt-get install dirmngr # optional
$ sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv E0C56BD4 # optional
$ sudo apt-get update
$ sudo apt-get install clickhouse-client clickhouse-server
```
You can also download and install packages manually from here: <https://repo.yandex.ru/clickhouse/deb/stable/main/>.
@ -42,9 +42,9 @@ Yandex ClickHouse team recommends using official pre-compiled `rpm` packages for
First you need to add the official repository:
```bash
$ sudo yum install yum-utils
$ sudo rpm --import https://repo.yandex.ru/clickhouse/CLICKHOUSE-KEY.GPG
$ sudo yum-config-manager --add-repo https://repo.yandex.ru/clickhouse/rpm/stable/x86_64
```
If you want to use the most recent version, replace `stable` with `testing` (this is recommended for your testing environments).
@ -52,7 +52,7 @@ If you want to use the most recent version, replace `stable` with `testing` (thi
Then run these commands to actually install packages:
```bash
$ sudo yum install clickhouse-server clickhouse-client
```
You can also download and install packages manually from here: <https://repo.yandex.ru/clickhouse/rpm/stable/x86_64>.
@ -67,13 +67,13 @@ To manually compile ClickHouse, follow the instructions for [Linux](../developme
You can compile packages and install them or use programs without installing packages. Also by building manually you can disable the SSE 4.2 requirement or build for AArch64 CPUs.
```text
Client: dbms/programs/clickhouse-client
Server: dbms/programs/clickhouse-server
```
You'll need to create data and metadata folders and `chown` them for the desired user. Their paths can be changed in the server config (src/dbms/programs/server/config.xml); by default they are:
```text
/opt/clickhouse/data/default/
/opt/clickhouse/metadata/default/
```
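For example, a minimal sketch assuming the default paths above and that the server will run as the `clickhouse` user:

```bash
$ sudo mkdir -p /opt/clickhouse/data/default /opt/clickhouse/metadata/default
$ sudo chown -R clickhouse:clickhouse /opt/clickhouse
```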
@ -129,18 +129,14 @@ $ ./clickhouse-client
ClickHouse client version 0.0.18749.
Connecting to localhost:9000.
Connected to ClickHouse server version 0.0.18749.
```
```sql
SELECT 1
```
```text
┌─1─┐
│ 1 │
└───┘
```
**Congratulations, the system works!**
@ -78,22 +78,16 @@ See the difference?
For example, the query "count the number of records for each advertising platform" requires reading one "advertising platform ID" column, which takes up 1 byte uncompressed. If most of the traffic was not from advertising platforms, you can expect at least 10-fold compression of this column. When using a quick compression algorithm, data decompression is possible at a speed of at least several gigabytes of uncompressed data per second. In other words, this query can be processed at a speed of approximately several billion rows per second on a single server. This speed is actually achieved in practice.
<details markdown="1"><summary>Example</summary>
```bash
$ clickhouse-client
ClickHouse client version 0.0.52053.
Connecting to localhost:9000.
Connected to ClickHouse server version 0.0.52053.
```
```sql
SELECT CounterID, count() FROM hits GROUP BY CounterID ORDER BY count() DESC LIMIT 20
```
```text
┌─CounterID─┬──count()─┐
│ 114208 │ 56057344 │
│ 115080 │ 51619590 │
@ -116,10 +110,6 @@ LIMIT 20
│ 115079 │ 8837972 │
│ 337234 │ 8205961 │
└───────────┴──────────┘
```
</details>
@ -22,14 +22,14 @@ Similar to the HTTP interface, when using the 'query' parameter and sending data
Example of using the client to insert data:
```bash
$ echo -ne "1, 'some text', '2016-08-14 00:00:00'\n2, 'some more text', '2016-08-14 00:00:01'" | clickhouse-client --database=test --query="INSERT INTO test FORMAT CSV";
$ cat <<_EOF | clickhouse-client --database=test --query="INSERT INTO test FORMAT CSV";
3, 'some text', '2016-08-14 00:00:00'
4, 'some more text', '2016-08-14 00:00:01'
_EOF
$ cat file.csv | clickhouse-client --database=test --query="INSERT INTO test FORMAT CSV";
```
In batch mode, the default data format is TabSeparated. You can set the format in the FORMAT clause of the query.
@ -70,14 +70,14 @@ The command-line client allows passing external data (external temporary tables)
You can create a query with parameters and pass values to them from the client application. This allows you to avoid formatting a query with specific dynamic values on the client side. For example:
```bash
$ clickhouse-client --param_parName="[1, 2]" -q "SELECT * FROM table WHERE a = {parName:Array(UInt16)}"
```
#### Query Syntax {#cli-queries-with-parameters-syntax}
Format a query as usual, then place the values that you want to pass from the app parameters to the query in braces in the following format:
```sql
{<name>:<data type>}
```
@ -87,7 +87,7 @@ Format a query as usual, then place the values that you want to pass from the ap
#### Example
```bash
$ clickhouse-client --param_tuple_in_tuple="(10, ('dt', 10))" -q "SELECT * FROM table WHERE val = {tuple_in_tuple:Tuple(UInt8, Tuple(String, UInt8))}"
```
## Configuring {#interfaces_cli_configuration}
@ -47,11 +47,11 @@ The `TabSeparated` format is convenient for processing data using custom program
The `TabSeparated` format supports outputting total values (when using WITH TOTALS) and extreme values (when 'extremes' is set to 1). In these cases, the total values and extremes are output after the main data. The main result, total values, and extremes are separated from each other by an empty line. Example:
```sql
SELECT EventDate, count() AS c FROM test.hits GROUP BY EventDate WITH TOTALS ORDER BY EventDate FORMAT TabSeparated
```
```text
2014-03-17 1406958
2014-03-18 1383658
2014-03-19 1405797
@ -83,7 +83,7 @@ As an exception, parsing dates with times is also supported in Unix timestamp fo
Strings are output with backslash-escaped special characters. The following escape sequences are used for output: `\b`, `\f`, `\r`, `\n`, `\t`, `\0`, `\'`, `\\`. Parsing also supports the sequences `\a`, `\v`, and `\xHH` (hex escape sequences) and any `\c` sequences, where `c` is any character (these sequences are converted to `c`). Thus, reading data supports formats where a line feed can be written as `\n` or `\`, or as a line feed. For example, the string `Hello world` with a line feed between the words instead of a space can be parsed in any of the following variations:
```text
Hello\nworld
Hello\
@ -211,7 +211,7 @@ format_schema_rows_between_delimiter = '\n '
```
`Insert` example:
```text
Some header
Page views: 5, User id: 4324182021466249494, Useless field: hello, Duration: 146, Sign: -1
Page views: 6, User id: 4324182021466249494, Useless field: world, Duration: 185, Sign: 1
@ -241,7 +241,7 @@ format_schema_rows_between_delimiter = ','
Similar to TabSeparated, but outputs a value in name=value format. Names are escaped the same way as in TabSeparated format, and the = symbol is also escaped.
```text
SearchPhrase= count()=8267016
SearchPhrase=bathroom interior design count()=2166
SearchPhrase=yandex count()=1655
@ -260,7 +260,7 @@ SearchPhrase=baku count()=1000
SELECT * FROM t_null FORMAT TSKV
```
```text
x=1 y=\N
```
@ -276,8 +276,8 @@ Comma Separated Values format ([RFC](https://tools.ietf.org/html/rfc4180)).
When formatting, rows are enclosed in double quotes. A double quote inside a string is output as two double quotes in a row. There are no other rules for escaping characters. Date and date-time are enclosed in double quotes. Numbers are output without quotes. Values are separated by a delimiter character, which is `,` by default. The delimiter character is defined in the setting [format_csv_delimiter](../operations/settings/settings.md#settings-format_csv_delimiter). Rows are separated using the Unix line feed (LF). Arrays are serialized in CSV as follows: first the array is serialized to a string as in TabSeparated format, and then the resulting string is output to CSV in double quotes. Tuples in CSV format are serialized as separate columns (that is, their nesting in the tuple is lost).
```bash
$ clickhouse-client --format_csv_delimiter="|" --query="INSERT INTO test.csv FORMAT CSV" < data.csv
```
&ast;By default, the delimiter is `,`. See the [format_csv_delimiter](../operations/settings/settings.md#settings-format_csv_delimiter) setting for more information.
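As a rough sketch of the array and tuple rules above (the output is shown as expected from the rules, assuming default settings):

```sql
SELECT [1, 2, 3] AS arr, (4, 'five') AS t FORMAT CSV
-- expected: the array is quoted as a single field, the tuple is flattened into two columns:
-- "[1,2,3]",4,"five"
```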
@ -300,7 +300,7 @@ Also prints the header row, similar to `TabSeparatedWithNames`.
Outputs data in JSON format. Besides data tables, it also outputs column names and types, along with some additional information: the total number of output rows, and the number of rows that could have been output if there weren't a LIMIT. Example:
```sql
SELECT SearchPhrase, count() AS c FROM test.hits GROUP BY SearchPhrase WITH TOTALS ORDER BY c DESC LIMIT 5 FORMAT JSON
```
@ -445,7 +445,7 @@ When inserting the data, you should provide a separate JSON object for each row.
### Inserting Data
```sql
INSERT INTO UserActivity FORMAT JSONEachRow {"PageViews":5, "UserID":"4324182021466249494", "Duration":146,"Sign":-1} {"UserID":"4324182021466249494","PageViews":6,"Duration":185,"Sign":1}
```
@ -464,7 +464,7 @@ If `DEFAULT expr` is specified, ClickHouse uses different substitution rules dep
Consider the following table:
```sql
CREATE TABLE IF NOT EXISTS example_table
(
x UInt32,
@ -482,7 +482,7 @@ CREATE TABLE IF NOT EXISTS example_table
Consider the `UserActivity` table as an example:
```text
┌──────────────UserID─┬─PageViews─┬─Duration─┬─Sign─┐
│ 4324182021466249494 │ 5 │ 146 │ -1 │
│ 4324182021466249494 │ 6 │ 185 │ 1 │
@ -491,7 +491,7 @@ Consider the `UserActivity` table as an example:
The query `SELECT * FROM UserActivity FORMAT JSONEachRow` returns:
```text
{"UserID":"4324182021466249494","PageViews":5,"Duration":146,"Sign":-1}
{"UserID":"4324182021466249494","PageViews":6,"Duration":185,"Sign":1}
```
@ -576,11 +576,11 @@ Each result block is output as a separate table. This is necessary so that block
Example (shown for the [PrettyCompact](#prettycompact) format):
```sql
SELECT * FROM t_null
```
```text
┌─x─┬────y─┐
│ 1 │ ᴺᵁᴸᴸ │
└───┴──────┘
@ -588,11 +588,11 @@ SELECT * FROM t_null
Rows are not escaped in Pretty* formats. Example is shown for the [PrettyCompact](#prettycompact) format:
```sql
SELECT 'String with \'quotes\' and \t character' AS Escaping_test
```
```text
┌─Escaping_test────────────────────────┐
│ String with 'quotes' and character │
└──────────────────────────────────────┘
@ -603,11 +603,11 @@ This format is only appropriate for outputting a query result, but not for parsi
The Pretty format supports outputting total values (when using WITH TOTALS) and extremes (when 'extremes' is set to 1). In these cases, total values and extreme values are output after the main data, in separate tables. Example (shown for the [PrettyCompact](#prettycompact) format):
```sql
SELECT EventDate, count() AS c FROM test.hits GROUP BY EventDate WITH TOTALS ORDER BY EventDate FORMAT PrettyCompact
```
```text
┌──EventDate─┬───────c─┐
│ 2014-03-17 │ 1406958 │
│ 2014-03-18 │ 1383658 │
@ -646,7 +646,7 @@ Differs from Pretty in that ANSI-escape sequences aren't used. This is necessary
Example:
```bash
$ watch -n1 "clickhouse-client --query='SELECT event, value FROM system.events FORMAT PrettyCompactNoEscapes'"
```
You can use the HTTP interface for displaying in the browser.
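For instance, a sketch of the same idea over the HTTP interface (query text URL-encoded):

```bash
$ curl 'http://localhost:8123/?query=SELECT%20event,%20value%20FROM%20system.events%20FORMAT%20PrettyCompactNoEscapes'
```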
@ -702,11 +702,11 @@ Prints each value on a separate line with the column name specified. This format
Example:
```sql
SELECT * FROM t_null FORMAT Vertical
```
```text
Row 1:
──────
x: 1
@ -714,11 +714,11 @@ y: ᴺᵁᴸᴸ
```
Rows are not escaped in Vertical format:
```sql
SELECT 'string with \'quotes\' and \t with some special \n characters' AS test FORMAT Vertical
```
```text
Row 1:
──────
test: string with 'quotes' and with some special
@ -807,12 +807,12 @@ Cap'n Proto is a binary message format similar to Protocol Buffers and Thrift, b
Cap'n Proto messages are strictly typed and not self-describing, meaning they need an external schema description. The schema is applied on the fly and cached for each query.
```bash
$ cat capnproto_messages.bin | clickhouse-client --query "INSERT INTO test.hits FORMAT CapnProto SETTINGS format_schema='schema:Message'"
```
Where `schema.capnp` looks like this:
```capnp
struct Message {
SearchPhrase @0 :Text;
c @1 :UInt64;
@ -842,7 +842,7 @@ cat protobuf_messages.bin | clickhouse-client --query "INSERT INTO test.table FO
where the file `schemafile.proto` looks like this:
```capnp
syntax = "proto3";
message MessageType {
@ -859,7 +859,7 @@ If types of a column and a field of Protocol Buffers' message are different the
Nested messages are supported. For example, for the field `z` in the following message type
```capnp
message MessageType {
message XType {
message YType {
@ -876,7 +876,7 @@ Nested messages are suitable to input or output a [nested data structures](../da
Default values defined in a protobuf schema like this
```capnp
syntax = "proto2";
message MessageType {
@ -75,31 +75,31 @@ The POST method of transmitting data is necessary for INSERT queries. In this ca
Examples: Creating a table:
```bash
$ echo 'CREATE TABLE t (a UInt8) ENGINE = Memory' | curl 'http://localhost:8123/' --data-binary @-
```
Using the familiar INSERT query for data insertion:
```bash
$ echo 'INSERT INTO t VALUES (1),(2),(3)' | curl 'http://localhost:8123/' --data-binary @-
```
Data can be sent separately from the query:
```bash
$ echo '(4),(5),(6)' | curl 'http://localhost:8123/?query=INSERT%20INTO%20t%20VALUES' --data-binary @-
```
You can specify any data format. The 'Values' format is the same as what is used when writing INSERT INTO t VALUES:
```bash
$ echo '(7),(8),(9)' | curl 'http://localhost:8123/?query=INSERT%20INTO%20t%20FORMAT%20Values' --data-binary @-
```
To insert data from a tab-separated dump, specify the corresponding format:
```bash
$ echo -ne '10\n11\n12\n' | curl 'http://localhost:8123/?query=INSERT%20INTO%20t%20FORMAT%20TabSeparated' --data-binary @-
```
Reading the table contents. Data is output in random order due to parallel query processing:
@ -123,7 +123,7 @@ $ curl 'http://localhost:8123/?query=SELECT%20a%20FROM%20t'
Deleting the table.
```bash
$ echo 'DROP TABLE t' | curl 'http://localhost:8123/' --data-binary @-
```
For successful requests that don't return a data table, an empty response body is returned.
@ -141,10 +141,10 @@ Examples of sending data with compression:
```bash
# Receiving compressed data from the server:
$ curl -vsS "http://localhost:8123/?enable_http_compression=1" -d 'SELECT number FROM system.numbers LIMIT 10' -H 'Accept-Encoding: gzip'
# Sending compressed data to the server:
$ echo "SELECT 1" | gzip -c | curl -sS --data-binary @- -H 'Content-Encoding: gzip' 'http://localhost:8123/'
```
!!! note "Note"
@ -173,13 +173,13 @@ The username and password can be indicated in one of two ways:
1. Using HTTP Basic Authentication. Example:
```bash
$ echo 'SELECT 1' | curl 'http://user:password@localhost:8123/' -d @-
```
2. In the 'user' and 'password' URL parameters. Example:
```bash
$ echo 'SELECT 1' | curl 'http://localhost:8123/?user=user&password=password' -d @-
```
If the user name is not specified, the `default` name is used. If the password is not specified, the empty password is used.
@ -207,7 +207,7 @@ Similarly, you can use ClickHouse sessions in the HTTP protocol. To do this, you
You can receive information about the progress of a query in `X-ClickHouse-Progress` response headers. To do this, enable [send_progress_in_http_headers](../operations/settings/settings.md#settings-send_progress_in_http_headers). Example of the header sequence:
```text
X-ClickHouse-Progress: {"read_rows":"2752512","read_bytes":"240570816","total_rows_to_read":"8880128"}
X-ClickHouse-Progress: {"read_rows":"5439488","read_bytes":"482285394","total_rows_to_read":"8880128"}
X-ClickHouse-Progress: {"read_rows":"8783786","read_bytes":"819092887","total_rows_to_read":"8880128"}
@ -239,7 +239,7 @@ To ensure that the entire response is buffered, set `wait_end_of_query=1`. In th
Example:
```bash
$ curl -sS 'http://localhost:8123/?max_result_bytes=4000000&buffer_size=3000000&wait_end_of_query=1' -d 'SELECT toUInt8(number) FROM system.numbers LIMIT 9000000 FORMAT RowBinary'
```
Use buffering to avoid situations where a query processing error occurred after the response code and HTTP headers were sent to the client. In this situation, an error message is written at the end of the response body, and on the client side, the error can only be detected at the parsing stage.
@ -251,7 +251,7 @@ You can create a query with parameters and pass values for them from the corresp
### Example
```bash
$ curl -sS "<address>?param_id=2&param_phrase=test" -d "SELECT * FROM table WHERE int_column = {id:UInt8} and string_column = {phrase:String}"
```
[Original article](https://clickhouse.yandex/docs/en/interfaces/http_interface/) <!--hide-->
@ -20,8 +20,10 @@ Substitutions can also be performed from ZooKeeper. To do this, specify the attr
The `config.xml` file can specify a separate config with user settings, profiles, and quotas. The relative path to this config is set in the 'users_config' element. By default, it is `users.xml`. If `users_config` is omitted, the user settings, profiles, and quotas are specified directly in `config.xml`.
In addition, `users_config` may have overrides in files from the `users_config.d` directory (for example, `users.d`) and substitutions. For example, you can have separate config file for each user like this:
```bash
$ cat /etc/clickhouse-server/users.d/alice.xml
```
```xml
<yandex>
<users>
<alice>
@ -3,7 +3,7 @@
The constraints on settings can be defined in the `users` section of the `user.xml` configuration file and prohibit users from changing some of the settings with the `SET` query.
The constraints are defined as follows:
```xml
<profiles>
<user_name>
<constraints>
@ -30,7 +30,7 @@ There are supported three types of constraints: `min`, `max`, `readonly`. The `m
**Example:** Let `users.xml` include the following lines:
```xml
<profiles>
<default>
<max_memory_usage>10000000000</max_memory_usage>
@ -51,13 +51,13 @@ There are supported three types of constraints: `min`, `max`, `readonly`. The `m
The following queries all throw exceptions:
```sql
SET max_memory_usage=20000000001;
SET max_memory_usage=4999999999;
SET force_index_by_date=1;
```
```text
Code: 452, e.displayText() = DB::Exception: Setting max_memory_usage should not be greater than 20000000000.
Code: 452, e.displayText() = DB::Exception: Setting max_memory_usage should not be less than 5000000000.
Code: 452, e.displayText() = DB::Exception: Setting force_index_by_date should not be changed.
@ -179,7 +179,8 @@ Insert the [DateTime](../../data_types/datetime.md) type value with the differen
```sql
SET input_format_values_interpret_expressions = 0;
INSERT INTO datetime_t VALUES (now())
```
```text
Exception on client:
Code: 27. DB::Exception: Cannot parse input: expected ) before: now()): (at row 1)
```
@ -187,7 +188,8 @@ Code: 27. DB::Exception: Cannot parse input: expected ) before: now()): (at row
```sql
SET input_format_values_interpret_expressions = 1;
INSERT INTO datetime_t VALUES (now())
```
```text
Ok.
```
@ -196,7 +198,8 @@ The last query is equivalent to the following:
```sql
SET input_format_values_interpret_expressions = 0;
INSERT INTO datetime_t SELECT now()
```
```text
Ok.
```
@ -599,7 +602,7 @@ ClickHouse supports the following algorithms of choosing replicas:
### Random (by default) {#load_balancing-random}
```sql
load_balancing = random
```
@ -608,7 +611,7 @@ Disadvantages: Server proximity is not accounted for; if the replicas have diffe
### Nearest Hostname {#load_balancing-nearest_hostname}
```sql
load_balancing = nearest_hostname
```
@ -622,7 +625,7 @@ We can also assume that when sending a query to the same server, in the absence
### In Order {#load_balancing-in_order}
```sql
load_balancing = in_order
```
@ -632,7 +635,7 @@ This method is appropriate when you know exactly which replica is preferable.
### First or Random {#load_balancing-first_or_random}
```sql
load_balancing = first_or_random
```
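As with any setting, the algorithm can also be chosen per session; a minimal sketch:

```sql
SET load_balancing = 'first_or_random';
```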
@ -4,7 +4,7 @@ The `users` section of the `user.xml` configuration file contains user settings.
Structure of the `users` section:
```xml
<users>
<!-- If user name was not specified, 'default' user is used. -->
<user_name>
@ -80,7 +80,7 @@ All results of DNS requests are cached until the server restarts.
To open access for a user from any network, specify:
```xml
<ip>::/0</ip>
```
@ -90,7 +90,7 @@ To open access for user from any network, specify:
To open access only from localhost, specify:
```xml
<ip>::1</ip>
<ip>127.0.0.1</ip>
```
@ -114,7 +114,7 @@ In this section, you can limit rows that are returned by ClickHouse for
The following configuration forces user `user1` to only see the rows of `table1` as the result of `SELECT` queries, where the value of the `id` field is 1000.
```xml
<user1>
<databases>
<database_name>
@ -494,7 +494,7 @@ WHERE table = 'visits'
FORMAT Vertical
```
```text
Row 1:
──────
database: merge
@ -520,7 +520,7 @@ active_replicas: 2
Columns:
```text
database: Database name
table: Table name
engine: Table engine name
@ -573,7 +573,7 @@ If you don't request the last 4 columns (log_max_index, log_pointer, total_repli
For example, you can check that everything is working correctly like this:
```sql
SELECT
database,
table,
@ -619,13 +619,13 @@ Columns:
Example:
```sql
SELECT *
FROM system.settings
WHERE changed
```
```text
┌─name───────────────────┬─value───────┬─changed─┐
│ max_threads │ 8 │ 1 │
│ use_uncompressed_cache │ 0 │ 1 │
@ -686,14 +686,14 @@ Columns:
Example:
```sql
SELECT *
FROM system.zookeeper
WHERE path = '/clickhouse/tables/01-08/visits/replicas'
FORMAT Vertical
```
```text
Row 1:
──────
name: example01-08-1.yandex.ru
@ -11,7 +11,7 @@ It is appropriate to use `AggregatingMergeTree` if it reduces the number of rows
## Creating a Table
```sql
CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]
(
name1 [type1] [DEFAULT|MATERIALIZED|ALIAS expr1],
@ -59,7 +59,7 @@ In the results of `SELECT` query the values of `AggregateFunction` type have imp
`AggregatingMergeTree` materialized view that watches the `test.visits` table:
```sql
CREATE MATERIALIZED VIEW test.basic
ENGINE = AggregatingMergeTree() PARTITION BY toYYYYMM(StartDate) ORDER BY (CounterID, StartDate)
AS SELECT
@ -73,7 +73,7 @@ GROUP BY CounterID, StartDate;
Inserting data into the `test.visits` table:
```sql
INSERT INTO test.visits ...
```
@ -81,7 +81,7 @@ The data are inserted in both the table and view `test.basic` that will perform
To get the aggregated data, we need to execute a query such as `SELECT ... GROUP BY ...` from the view `test.basic`:
```sql
SELECT
StartDate,
sumMerge(Visits) AS Visits,
@ -25,7 +25,7 @@ The conditions for flushing the data are calculated separately for each of the `
Example:
```sql
CREATE TABLE merge.hits_buffer AS merge.hits ENGINE = Buffer(merge, hits, 16, 10, 100, 10000, 1000000, 10000000, 100000000)
```
@ -65,7 +65,7 @@ Use the particular column `Sign`. If `Sign = 1` it means that the row is a state
For example, we want to calculate how many pages users visited on some site and how long they stayed there. At some point in time we write the following row with the state of user activity:
```text
┌──────────────UserID─┬─PageViews─┬─Duration─┬─Sign─┐
│ 4324182021466249494 │ 5 │ 146 │ 1 │
└─────────────────────┴───────────┴──────────┴──────┘
@ -73,7 +73,7 @@ For example, we want to calculate how many pages users visited on some site and
At some point later we register the change of user activity and write it with the following two rows.
```text
┌──────────────UserID─┬─PageViews─┬─Duration─┬─Sign─┐
│ 4324182021466249494 │ 5 │ 146 │ -1 │
│ 4324182021466249494 │ 6 │ 185 │ 1 │
@ -86,7 +86,7 @@ The second row contains the current state.
As we need only the last state of user activity, the rows
```text
┌──────────────UserID─┬─PageViews─┬─Duration─┬─Sign─┐
│ 4324182021466249494 │ 5 │ 146 │ 1 │
│ 4324182021466249494 │ 5 │ 146 │ -1 │
@ -131,7 +131,7 @@ If you need to extract data without aggregation (for example, to check whether r
Example data:
```text
┌──────────────UserID─┬─PageViews─┬─Duration─┬─Sign─┐
│ 4324182021466249494 │ 5 │ 146 │ 1 │
│ 4324182021466249494 │ 5 │ 146 │ -1 │
@ -166,11 +166,11 @@ We use two `INSERT` queries to create two different data parts. If we insert the
Getting the data:
```sql
SELECT * FROM UAct
```
```text
┌──────────────UserID─┬─PageViews─┬─Duration─┬─Sign─┐
│ 4324182021466249494 │ 5 │ 146 │ -1 │
│ 4324182021466249494 │ 6 │ 185 │ 1 │
@ -195,7 +195,7 @@ FROM UAct
GROUP BY UserID
HAVING sum(Sign) > 0
```
```text
┌──────────────UserID─┬─PageViews─┬─Duration─┐
│ 4324182021466249494 │ 6 │ 185 │
└─────────────────────┴───────────┴──────────┘
@ -206,7 +206,7 @@ If we do not need aggregation and want to force collapsing, we can use `FINAL` m
```sql
SELECT * FROM UAct FINAL
```
```text
┌──────────────UserID─┬─PageViews─┬─Duration─┬─Sign─┐
│ 4324182021466249494 │ 6 │ 185 │ 1 │
└─────────────────────┴───────────┴──────────┴──────┘
@ -218,7 +218,7 @@ This way of selecting the data is very inefficient. Don't use it for big tables.
Example data:
```text
┌──────────────UserID─┬─PageViews─┬─Duration─┬─Sign─┐
│ 4324182021466249494 │ 5 │ 146 │ 1 │
│ 4324182021466249494 │ -5 │ -146 │ -1 │
@ -247,28 +247,38 @@ insert into UAct values(4324182021466249494, -5, -146, -1);
insert into UAct values(4324182021466249494, 6, 185, 1);
select * from UAct final; -- avoid using final in production (just for a test or small tables)
```
```text
┌──────────────UserID─┬─PageViews─┬─Duration─┬─Sign─┐
│ 4324182021466249494 │ 6 │ 185 │ 1 │
└─────────────────────┴───────────┴──────────┴──────┘
```
```sql
SELECT
UserID,
sum(PageViews) AS PageViews,
sum(Duration) AS Duration
FROM UAct
GROUP BY UserID
```
```text
┌──────────────UserID─┬─PageViews─┬─Duration─┐
│ 4324182021466249494 │ 6 │ 185 │
└─────────────────────┴───────────┴──────────┘
```
```sql
select count() FROM UAct
```
```text
┌─count()─┐
│ 3 │
└─────────┘
```
```sql
optimize table UAct final;
select * FROM UAct
```
```text
┌──────────────UserID─┬─PageViews─┬─Duration─┬─Sign─┐
│ 4324182021466249494 │ 6 │ 185 │ 1 │
└─────────────────────┴───────────┴──────────┴──────┘
@ -6,7 +6,7 @@ A partition is a logical combination of records in a table by a specified criter
The partition is specified in the `PARTITION BY expr` clause when [creating a table](mergetree.md#table_engine-mergetree-creating-a-table). The partition key can be any expression from the table columns. For example, to specify partitioning by month, use the expression `toYYYYMM(date_column)`:
```sql
CREATE TABLE visits
(
VisitDate Date,
@ -20,7 +20,7 @@ ORDER BY Hour;
The partition key can also be a tuple of expressions (similar to the [primary key](mergetree.md#primary-keys-and-indexes-in-queries)). For example:
```sql
ENGINE = ReplicatedCollapsingMergeTree('/clickhouse/tables/name', 'replica1', Sign)
PARTITION BY (toMonday(StartDate), EventType)
ORDER BY (CounterID, StartDate, intHash32(UserID));
@ -35,7 +35,7 @@ When inserting new data to a table, this data is stored as a separate part (chun
Use the [system.parts](../system_tables.md#system_tables-parts) table to view the table parts and partitions. For example, let's assume that we have a `visits` table with partitioning by month. Let's perform the `SELECT` query for the `system.parts` table:
```sql
SELECT
partition,
name,
@ -44,7 +44,7 @@ FROM system.parts
WHERE table = 'visits'
```
```text
┌─partition─┬─name───────────┬─active─┐
│ 201901 │ 201901_1_3_1 │ 0 │
│ 201901 │ 201901_1_9_2 │ 1 │
@ -74,11 +74,11 @@ The `active` column shows the status of the part. `1` is active; `0` is inactive
As you can see in the example, there are several separated parts of the same partition (for example, `201901_1_3_1` and `201901_1_9_2`). This means that these parts are not merged yet. ClickHouse merges the inserted parts of data periodically, approximately 15 minutes after inserting. In addition, you can perform a non-scheduled merge using the [OPTIMIZE](../../query_language/misc.md#misc_operations-optimize) query. Example:
```sql
OPTIMIZE TABLE visits PARTITION 201902;
```
```text
┌─partition─┬─name───────────┬─active─┐
│ 201901 │ 201901_1_3_1 │ 0 │
│ 201901 │ 201901_1_9_2 │ 1 │
@ -96,7 +96,7 @@ Inactive parts will be deleted approximately 10 minutes after merging.
Another way to view a set of parts and partitions is to go into the directory of the table: `/var/lib/clickhouse/data/<database>/<table>/`. For example:
```bash
/var/lib/clickhouse/data/default/visits$ ls -l
total 40
drwxr-xr-x 2 clickhouse clickhouse 4096 Feb 1 16:48 201901_1_3_1
drwxr-xr-x 2 clickhouse clickhouse 4096 Feb 5 16:17 201901_1_9_2
@ -38,9 +38,7 @@ As an example, consider a dictionary of `products` with the following configurat
Query the dictionary data:
```sql
SELECT
name,
type,
@ -54,7 +52,7 @@ FROM system.dictionaries
WHERE name = 'products'
```
```text
┌─name─────┬─type─┬─key────┬─attribute.names─┬─attribute.types─┬─bytes_allocated─┬─element_count─┬─source──────────┐
│ products │ Flat │ UInt64 │ ['title'] │ ['String'] │ 23065376 │ 175032 │ ODBC: .products │
└──────────┴──────┴────────┴─────────────────┴─────────────────┴─────────────────┴───────────────┴─────────────────┘
@ -66,45 +64,29 @@ This view isn't helpful when you need to get raw data, or when performing a `JOI
Syntax:
```sql
CREATE TABLE %table_name% (%fields%) engine = Dictionary(%dictionary_name%)
```
Usage example:
```sql
CREATE TABLE products
(
product_id UInt64,
title String
)
ENGINE = Dictionary(products)
```
```text
Ok
```
Take a look at what's in the table.
```sql
SELECT *
FROM products
LIMIT 1
```
```text
┌────product_id─┬─title───────────┐
│ 152689 │ Some item │
└───────────────┴─────────────────┘
```
@ -6,7 +6,7 @@ Reading is automatically parallelized. During a read, the table indexes on remot
The Distributed engine accepts parameters: the cluster name in the server's config file, the name of a remote database, the name of a remote table, and (optionally) a sharding key.
Example:
```sql
Distributed(logs, default, hits[, sharding_key])
```
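As a sketch, assuming a `logs` cluster in the server config and an existing `default.hits` table on each shard:

```sql
CREATE TABLE hits_all AS default.hits
ENGINE = Distributed(logs, default, hits, rand())
```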
@ -32,9 +32,9 @@ The files specified in 'file' will be parsed by the format specified in 'format'
Examples:
```bash
$ echo -ne "1\n2\n3\n" | clickhouse-client --query="SELECT count() FROM test.visits WHERE TraficSourceID IN _data" --external --file=- --types=Int8
849897
$ cat /etc/passwd | sed 's/:/\t/g' | clickhouse-client --query="SELECT shell, count() AS c FROM passwd GROUP BY shell ORDER BY c DESC" --external --file=- --name=passwd --structure='login String, unused String, uid UInt16, gid UInt16, comment String, home String, shell String'
/bin/sh 20
/bin/false 5
/bin/bash 4
@ -47,9 +47,9 @@ When using the HTTP interface, external data is passed in the multipart/form-dat
Example:
```bash
$ cat /etc/passwd | sed 's/:/\t/g' > passwd.tsv
$ curl -F 'passwd=@passwd.tsv;' 'http://localhost:8123/?query=SELECT+shell,+count()+AS+c+FROM+passwd+GROUP+BY+shell+ORDER+BY+c+DESC&passwd_structure=login+String,+unused+String,+uid+UInt16,+gid+UInt16,+comment+String,+home+String,+shell+String'
/bin/sh 20
/bin/false 5
/bin/bash 4
@ -11,7 +11,7 @@ Usage examples:
## Usage in ClickHouse Server
```sql
File(Format)
```
@ -33,7 +33,7 @@ You may manually create this subfolder and file in server filesystem and then [A
**1.** Set up the `file_engine_table` table:
```sql
CREATE TABLE file_engine_table (name String, value UInt32) ENGINE=File(TabSeparated)
```
@ -49,11 +49,11 @@ two 2
**3.** Query the data:
```sql
SELECT * FROM file_engine_table
```
```text
┌─name─┬─value─┐
│ one │ 1 │
│ two │ 2 │
@ -89,7 +89,7 @@ patterns
Structure of the `patterns` section:
```text
pattern
regexp
function
@ -5,7 +5,7 @@ to the [File](file.md) and [URL](url.md) engines, but provides Hadoop-specific f
## Usage
```sql
ENGINE = HDFS(URI, format)
```
The `URI` parameter is the whole file URI in HDFS.
@ -18,22 +18,22 @@ The `format` parameter specifies one of the available file formats. To perform
**1.** Set up the `hdfs_engine_table` table:
```sql
CREATE TABLE hdfs_engine_table (name String, value UInt32) ENGINE=HDFS('hdfs://hdfs1:9000/other_storage', 'TSV')
```
**2.** Fill file:
```sql
INSERT INTO hdfs_engine_table VALUES ('one', 1), ('two', 2), ('three', 3)
```
**3.** Query the data:
```sql
SELECT * FROM hdfs_engine_table LIMIT 2
```
```text
┌─name─┬─value─┐
│ one │ 1 │
│ two │ 2 │
@ -27,7 +27,7 @@ ENGINE = JDBC(dbms_uri, external_database, external_table)
Creating a table in MySQL server by connecting directly with its console client:
```text
mysql> CREATE TABLE `test`.`test` (
-> `int_id` INT NOT NULL AUTO_INCREMENT,
-> `int_nullable` INT NULL DEFAULT NULL,
@ -50,30 +50,29 @@ mysql> select * from test;
Creating a table in ClickHouse server and selecting data from it:
```sql
CREATE TABLE jdbc_table ENGINE JDBC('jdbc:mysql://localhost:3306/?user=root&password=root', 'test', 'test')
```
```sql
DESCRIBE TABLE jdbc_table
```
```text
┌─name───────────────┬─type───────────────┬─default_type─┬─default_expression─┐
│ int_id │ Int32 │ │ │
│ int_nullable │ Nullable(Int32) │ │ │
│ float │ Float32 │ │ │
│ float_nullable │ Nullable(Float32) │ │ │
└────────────────────┴────────────────────┴──────────────┴────────────────────┘
```
```sql
SELECT *
FROM jdbc_table
```
```text
┌─int_id─┬─int_nullable─┬─float─┬─float_nullable─┐
│ 1 │ ᴺᵁᴸᴸ │ 2 │ ᴺᵁᴸᴸ │
└────────┴──────────────┴───────┴────────────────┘
```
## See Also
@ -4,7 +4,7 @@ Prepared data structure for using in [JOIN](../../query_language/select.md#selec
## Creating a Table
```sql
CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]
(
name1 [type1] [DEFAULT|MATERIALIZED|ALIAS expr1] [TTL expr1],
@ -11,7 +11,7 @@ Kafka lets you:
## Creating a Table {#table_engine-kafka-creating-a-table}
```sql
CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]
(
name1 [type1] [DEFAULT|MATERIALIZED|ALIAS expr1],
@ -44,7 +44,7 @@ Optional parameters:
Examples:
```sql
CREATE TABLE queue (
timestamp UInt64,
level String,
@ -79,7 +79,7 @@ Examples:
Do not use this method in new projects. If possible, switch old projects to the method described above.
```sql
Kafka(kafka_broker_list, kafka_topic_list, kafka_group_name, kafka_format
[, kafka_row_delimiter, kafka_schema, kafka_num_consumers, kafka_skip_broken_messages])
```
@ -104,7 +104,7 @@ One kafka table can have as many materialized views as you like, they do not rea
Example:
```sql
CREATE TABLE queue (
timestamp UInt64,
level String,
@ -128,7 +128,7 @@ To improve performance, received messages are grouped into blocks the size of [m
To stop receiving topic data or to change the conversion logic, detach the materialized view:
```sql
DETACH TABLE consumer;
ATTACH MATERIALIZED VIEW consumer;
```
@ -101,7 +101,7 @@ The `index_granularity` setting can be omitted because 8192 is the default value
!!! attention
Do not use this method in new projects. If possible, switch old projects to the method described above.
```sql
CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]
(
name1 [type1] [DEFAULT|MATERIALIZED|ALIAS expr1],
@ -119,7 +119,7 @@ CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]
**Example**
```sql
MergeTree(EventDate, intHash32(UserID), (CounterID, EventDate, intHash32(UserID)), 8192)
```
@ -370,14 +370,14 @@ The `TTL` clause can be set for the whole table and for each individual column.
The table must have the column in the [Date](../../data_types/date.md) or [DateTime](../../data_types/datetime.md) data type. To define the lifetime of data, use operations on this time column, for example:
```sql
TTL time_column
TTL time_column + interval
```
To define `interval`, use [time interval](../../query_language/operators.md#operators-datetime) operators.
```sql
TTL date_time + INTERVAL 1 MONTH
TTL date_time + INTERVAL 15 HOUR
```
@ -45,7 +45,7 @@ The rest of the conditions and the `LIMIT` sampling constraint are executed in C
Table in MySQL:
```text
mysql> CREATE TABLE `test`.`test` (
-> `int_id` INT NOT NULL AUTO_INCREMENT,
-> `int_nullable` INT NULL DEFAULT NULL,
@ -8,7 +8,7 @@ This engine supports the [Nullable](../../data_types/nullable.md) data type.
## Creating a Table
```sql
CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]
(
name1 [type1],
@ -41,15 +41,17 @@ Ensure that unixODBC and MySQL Connector are installed.
By default (if installed from packages), ClickHouse starts as user `clickhouse`. Thus, you need to create and configure this user in the MySQL server.
```bash
$ sudo mysql
```
```sql
mysql> CREATE USER 'clickhouse'@'localhost' IDENTIFIED BY 'clickhouse';
mysql> GRANT ALL PRIVILEGES ON *.* TO 'clickhouse'@'clickhouse' WITH GRANT OPTION;
```
Then configure the connection in `/etc/odbc.ini`.
```bash
$ cat /etc/odbc.ini
[mysqlconn]
DRIVER = /usr/local/lib/libmyodbc5w.so
@ -62,8 +64,8 @@ PASSWORD = clickhouse
You can check the connection using the `isql` utility from the unixODBC installation.
```bash
$ isql -v mysqlconn
+---------------------------------------+
| Connected! |
| |
@ -72,7 +74,7 @@ isql -v mysqlconn
Table in MySQL:
```text
mysql> CREATE TABLE `test`.`test` (
-> `int_id` INT NOT NULL AUTO_INCREMENT,
-> `int_nullable` INT NULL DEFAULT NULL,
@ -6,7 +6,7 @@ Use this engine in scenarios when you need to write many tables with a small amo
## Creating a Table {#table_engines-stripelog-creating-a-table}
```sql
CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]
(
column1_name [type1] [DEFAULT|MATERIALIZED|ALIAS expr1],
@ -60,7 +60,7 @@ ClickHouse uses multiple threads when selecting data. Each thread reads a separa
```sql
SELECT * FROM stripe_log_table
```
```text
┌───────────timestamp─┬─message_type─┬─message────────────────────┐
│ 2019-01-18 14:27:32 │ REGULAR │ The second regular message │
│ 2019-01-18 14:34:53 │ WARNING │ The first warning message │
@ -75,7 +75,7 @@ Sorting the results (ascending order by default):
```sql
SELECT * FROM stripe_log_table ORDER BY timestamp
```
```text
┌───────────timestamp─┬─message_type─┬─message────────────────────┐
│ 2019-01-18 14:23:43 │ REGULAR │ The first regular message │
│ 2019-01-18 14:27:32 │ REGULAR │ The second regular message │
@ -7,7 +7,7 @@ We recommend to use the engine together with `MergeTree`. Store complete data in
## Creating a Table
```sql
CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]
(
name1 [type1] [DEFAULT|MATERIALIZED|ALIAS expr1],
@ -38,7 +38,7 @@ When creating a `SummingMergeTree` table the same [clauses](mergetree.md) are re
!!! attention
Do not use this method in new projects and, if possible, switch the old projects to the method described above.
```sql
CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]
(
name1 [type1] [DEFAULT|MATERIALIZED|ALIAS expr1],
@ -68,8 +68,8 @@ ORDER BY key
Insert data into it:
```sql
INSERT INTO summtt Values(1,1),(1,2),(2,1)
```
ClickHouse may not sum all the rows completely ([see below](#data-processing)), so we use the aggregate function `sum` and a `GROUP BY` clause in the query.
@ -78,7 +78,7 @@ ClickHouse may sum all the rows not completely ([see below](#data-processing)),
SELECT key, sum(value) FROM summtt GROUP BY key
```
```text
┌─key─┬─sum(value)─┐
│ 2 │ 1 │
│ 1 │ 3 │
@ -119,7 +119,7 @@ then this nested table is interpreted as a mapping of `key => (values...)`, and
Examples:
```text
[(1, 100)] + [(2, 150)] -> [(1, 100), (2, 150)]
[(1, 100)] + [(1, 150)] -> [(1, 250)]
[(1, 100)] + [(1, 150), (2, 150)] -> [(1, 250), (2, 150)]
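```

The same merge rule can be illustrated with the `sumMap` aggregate function; a sketch (the table-free `SELECT` below is just for illustration):

```sql
SELECT sumMap(k, v)
FROM
(
    SELECT [1] AS k, [100] AS v
    UNION ALL
    SELECT [1, 2] AS k, [150, 150] AS v
)
-- expected: ([1,2],[250,150])
```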
@ -21,7 +21,7 @@ respectively. For processing `POST` requests, the remote server must support
**1.** Create a `url_engine_table` table on the server:
```sql
CREATE TABLE url_engine_table (word String, value UInt64)
ENGINE=URL('http://127.0.0.1:12345/', CSV)
```
@ -46,16 +46,16 @@ if __name__ == "__main__":
```
```bash
$ python3 server.py
```
**3.** Request data:
```sql
SELECT * FROM url_engine_table
```
```text
┌─word──┬─value─┐
│ Hello │ 1 │
│ World │ 2 │
@ -29,7 +29,7 @@ For a description of query parameters, see the [query description](../../query_l
**Engine Parameters**
```sql
VersionedCollapsingMergeTree(sign, version)
```
@ -81,7 +81,7 @@ Use the `Sign` column when writing the row. If `Sign = 1` it means that the row
For example, we want to calculate how many pages users visited on some site and how long they were there. At some point in time we write the following row with the state of user activity:
```text
┌──────────────UserID─┬─PageViews─┬─Duration─┬─Sign─┬─Version─┐
│ 4324182021466249494 │ 5 │ 146 │ 1 │ 1 |
└─────────────────────┴───────────┴──────────┴──────┴─────────┘
@ -89,7 +89,7 @@ For example, we want to calculate how many pages users visited on some site and
At some point later we register the change of user activity and write it with the following two rows.
```text
┌──────────────UserID─┬─PageViews─┬─Duration─┬─Sign─┬─Version─┐
│ 4324182021466249494 │ 5 │ 146 │ -1 │ 1 |
│ 4324182021466249494 │ 6 │ 185 │ 1 │ 2 |
@ -102,7 +102,7 @@ The second row contains the current state.
Because we need only the last state of user activity, the rows
```text
┌──────────────UserID─┬─PageViews─┬─Duration─┬─Sign─┬─Version─┐
│ 4324182021466249494 │ 5 │ 146 │ 1 │ 1 |
│ 4324182021466249494 │ 5 │ 146 │ -1 │ 1 |
@ -139,7 +139,7 @@ If you need to extract the data with "collapsing" but without aggregation (for e
Example data:
```text
┌──────────────UserID─┬─PageViews─┬─Duration─┬─Sign─┬─Version─┐
│ 4324182021466249494 │ 5 │ 146 │ 1 │ 1 |
│ 4324182021466249494 │ 5 │ 146 │ -1 │ 1 |
@ -175,11 +175,11 @@ We use two `INSERT` queries to create two different data parts. If we insert the
Getting the data:
```sql
SELECT * FROM UAct
```
```text
┌──────────────UserID─┬─PageViews─┬─Duration─┬─Sign─┬─Version─┐
│ 4324182021466249494 │ 5 │ 146 │ 1 │ 1 │
└─────────────────────┴───────────┴──────────┴──────┴─────────┘
@ -205,7 +205,7 @@ FROM UAct
GROUP BY UserID, Version
HAVING sum(Sign) > 0
```
```text
┌──────────────UserID─┬─PageViews─┬─Duration─┬─Version─┐
│ 4324182021466249494 │ 6 │ 185 │ 2 │
└─────────────────────┴───────────┴──────────┴─────────┘
@ -216,7 +216,7 @@ If we don't need aggregation and want to force collapsing, we can use the `FINAL
```sql
SELECT * FROM UAct FINAL
```
```text
┌──────────────UserID─┬─PageViews─┬─Duration─┬─Sign─┬─Version─┐
│ 4324182021466249494 │ 6 │ 185 │ 1 │ 2 │
└─────────────────────┴───────────┴──────────┴──────┴─────────┘
@ -5,7 +5,7 @@
Always use the `performance` scaling governor. The `on-demand` scaling governor works much worse with constantly high demand.
```bash
$ echo 'performance' | sudo tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
```
## CPU Limitations
@ -20,8 +20,8 @@ For large amounts of data and when processing interactive (online) queries, you
Even for data volumes of ~50 TB per server, using 128 GB of RAM significantly improves query performance compared to 64 GB.
Do not disable overcommit. The value `cat /proc/sys/vm/overcommit_memory` should be 0 or 1. Run
```bash
$ echo 0 | sudo tee /proc/sys/vm/overcommit_memory
```
## Huge Pages
@ -29,7 +29,7 @@ echo 0 | sudo tee /proc/sys/vm/overcommit_memory
Always disable transparent huge pages. It interferes with memory allocators, which leads to significant performance degradation.
```bash
$ echo 'never' | sudo tee /sys/kernel/mm/transparent_hugepage/enabled
```
Use `perf top` to watch the time spent in the kernel for memory management.
@ -54,7 +54,7 @@ If you have more than 4 disks, use RAID-6 (preferred) or RAID-50, instead of RAI
When using RAID-5, RAID-6 or RAID-50, always increase stripe_cache_size, since the default value is usually not the best choice.
```bash
$ echo 4096 | sudo tee /sys/block/md2/md/stripe_cache_size
```
Calculate the exact number from the number of devices and the block size, using the formula: `2 * num_devices * chunk_size_in_bytes / 4096`.
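For example, a hypothetical array of 8 devices with a 1 MiB (1048576-byte) chunk size:

```bash
$ echo $((2 * 8 * 1048576 / 4096))
4096
```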
@ -163,7 +163,7 @@ dynamicConfigFile=/etc/zookeeper-{{ cluster['name'] }}/conf/zoo.cfg.dynamic
Java version:
```text
Java(TM) SE Runtime Environment (build 1.8.0_25-b17)
Java HotSpot(TM) 64-Bit Server VM (build 25.25-b02, mixed mode)
```
@ -211,7 +211,7 @@ JAVA_OPTS="-Xms{{ cluster.get('xms','128M') }} \
Salt init:
```text
description "zookeeper-{{ cluster['name'] }} centralized coordination service"
start on runlevel [2345]
@ -26,14 +26,14 @@ Possible issues:
Command:
```bash
$ sudo service clickhouse-server status
```
If the server is not running, start it with the command:
```bash
$ sudo service clickhouse-server start
```
**Check logs**
@ -47,19 +47,19 @@ If the server started successfully, you should see the strings:
If `clickhouse-server` start failed with a configuration error, you should see the `<Error>` string with an error description. For example:
```text
2019.01.11 15:23:25.549505 [ 45 ] {} <Error> ExternalDictionaries: Failed reloading 'event2id' external dictionary: Poco::Exception. Code: 1000, e.code() = 111, e.displayText() = Connection refused, e.what() = Connection refused
```
If you don't see an error at the end of the file, look through the entire file starting from the string:
```text
<Information> Application: starting up.
```
If you try to start a second instance of `clickhouse-server` on the server, you see the following log:
```text
2019.01.11 15:25:11.151730 [ 1 ] {} <Information> : Starting ClickHouse 19.1.0 with revision 54413
2019.01.11 15:25:11.154578 [ 1 ] {} <Information> Application: starting up
2019.01.11 15:25:11.156361 [ 1 ] {} <Information> StatusFile: Status file ./status already exists - unclean restart. Contents:
@ -77,14 +77,14 @@ Revision: 54413
If you don't find any useful information in `clickhouse-server` logs or there aren't any logs, you can view `systemd` logs using the command:
```bash
$ sudo journalctl -u clickhouse-server
```
**Start clickhouse-server in interactive mode**
```bash
$ sudo -u clickhouse /usr/bin/clickhouse-server --config-file /etc/clickhouse-server/config.xml
```
This command starts the server as an interactive app with standard parameters of the autostart script. In this mode `clickhouse-server` prints all the event messages in the console.
@ -2,10 +2,10 @@
If ClickHouse was installed from deb packages, execute the following commands on the server:
```bash
$ sudo apt-get update
$ sudo apt-get install clickhouse-client clickhouse-server
$ sudo service clickhouse-server restart
```
If you installed ClickHouse using something other than the recommended deb packages, use the appropriate update method.
@ -24,7 +24,7 @@ To reduce network traffic, we recommend running `clickhouse-copier` on the same
The utility should be run manually:
```bash
$ clickhouse-copier copier --daemon --config zookeeper.xml --task-path /task/path --base-dir /path/to/dir
```
Parameters:


@ -17,8 +17,8 @@ By default `clickhouse-local` does not have access to data on the same host, but
Basic usage:
```bash
$ clickhouse-local --structure "table_structure" --input-format "format_of_incoming_data" -q "query"
```
Arguments:
@ -40,8 +40,8 @@ Also there are arguments for each ClickHouse configuration variable which are mo
## Examples
```bash
$ echo -e "1,2\n3,4" | clickhouse-local -S "a Int64, b Int64" -if "CSV" -q "SELECT * FROM table"
Read 2 rows, 32.00 B in 0.000 sec., 5182 rows/sec., 80.97 KiB/sec.
1 2
3 4
@ -49,7 +49,7 @@ Read 2 rows, 32.00 B in 0.000 sec., 5182 rows/sec., 80.97 KiB/sec.
Previous example is the same as:
```bash
$ echo -e "1,2\n3,4" | clickhouse-local -q "CREATE TABLE table (a Int64, b Int64) ENGINE = File(CSV, stdin); SELECT a, b FROM table; DROP TABLE table"
Read 2 rows, 32.00 B in 0.000 sec., 4987 rows/sec., 77.93 KiB/sec.
1 2
@ -58,8 +58,10 @@ Read 2 rows, 32.00 B in 0.000 sec., 4987 rows/sec., 77.93 KiB/sec.
Now let's output memory usage for each Unix user:
```bash
$ ps aux | tail -n +2 | awk '{ printf("%s\t%s\n", $1, $4) }' | clickhouse-local -S "user String, mem Float64" -q "SELECT user, round(sum(mem), 2) as memTotal FROM table GROUP BY user ORDER BY memTotal DESC FORMAT Pretty"
```
```text
Read 186 rows, 4.15 KiB in 0.035 sec., 5302 rows/sec., 118.34 KiB/sec.
┏━━━━━━━━━━┳━━━━━━━━━━┓
┃ user ┃ memTotal ┃


@ -48,7 +48,7 @@ Converts an aggregate function for tables into an aggregate function for arrays
Lets you divide data into groups, and then separately aggregate the data in those groups. Groups are created by splitting the values of one of the columns into intervals.
```sql
<aggFunction>Resample(start, end, step)(<aggFunction_params>, resampling_key)
```
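As an illustrative sketch (the table `t` with a numeric column `val` and a resampling key `key` is hypothetical), this sums `val` separately for keys falling into [0,2), [2,4) and [4,6), returning an array with one sum per interval:

```sql
SELECT sumResample(0, 6, 2)(val, key) FROM t
```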


@ -15,7 +15,7 @@ During aggregation, all `NULL`s are skipped.
Consider this table:
```text
┌─x─┬────y─┐
│ 1 │ 2 │
│ 2 │ ᴺᵁᴸᴸ │
@ -27,34 +27,27 @@ Consider this table:
Let's say you need to total the values in the `y` column:
```sql
SELECT sum(y) FROM t_null_big
```
```text
┌─sum(y)─┐
│ 7 │
└────────┘
```
The `sum` function interprets `NULL` as `0`. In particular, this means that if the function receives input of a selection where all the values are `NULL`, then the result will be `0`, not `NULL`.
Now you can use the `groupArray` function to create an array from the `y` column:
```sql
SELECT groupArray(y) FROM t_null_big
```
```text
┌─groupArray(y)─┐
│ [2,2,3] │
└───────────────┘
```
`groupArray` does not include `NULL` in the resulting array.


@ -6,7 +6,7 @@ Some aggregate functions can accept not only argument columns (used for compress
Calculates an adaptive histogram. It doesn't guarantee precise results.
```sql
histogram(number_of_bins)(values)
```
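For instance, a query of this shape buckets the numbers 1–20 into five adaptive bins (exact bin borders may vary between runs):

```sql
SELECT histogram(5)(number + 1) FROM (SELECT * FROM system.numbers LIMIT 20)
```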
@ -90,7 +90,7 @@ Example: `sequenceMatch ('(?1).*(?2)')(EventTime, URL LIKE '%company%', URL LIKE
This is a singular example. You could write it using other aggregate functions:
```sql
minIf(EventTime, URL LIKE '%company%') < maxIf(EventTime, URL LIKE '%cart%')
```
@ -153,7 +153,7 @@ Set the following chain of events:
To find out how far the user `user_id` could get through the chain in an hour in January of 2017, make the query:
```sql
SELECT
level,
count() AS c
@ -184,7 +184,7 @@ Consider you are doing a website analytics, intend to calculate the retention of
This can easily be calculated with `retention`:
```sql
SELECT
sum(r[1]) AS r1,
sum(r[2]) AS r2,
@ -218,7 +218,7 @@ It works as fast as possible, except for cases when a large N value is used and
Usage example:
```text
Problem: Generate a report that shows only keywords that produced at least 5 unique users.
Solution: Write in the GROUP BY query SearchPhrase HAVING uniqUpTo(4)(UserID) >= 5
```


@ -79,7 +79,7 @@ When a `SELECT` query has the `GROUP BY` clause or at least one aggregate functi
Selects a frequently occurring value using the [heavy hitters](http://www.cs.umd.edu/~samir/498/karp.pdf) algorithm. If there is a value that occurs more than in half the cases in each of the query's execution threads, this value is returned. Normally, the result is nondeterministic.
```sql
anyHeavy(column)
```
@ -91,12 +91,12 @@ anyHeavy(column)
Take the [OnTime](../../getting_started/example_datasets/ontime.md) data set and select any frequently occurring value in the `AirlineID` column.
```sql
SELECT anyHeavy(AirlineID) AS res
FROM ontime
```
```text
┌───res─┐
│ 19690 │
└───────┘
@ -111,7 +111,7 @@ The result is just as indeterminate as for the `any` function.
Applies bitwise `AND` to a series of numbers.
```sql
groupBitAnd(expr)
```
@ -127,7 +127,7 @@ Value of the `UInt*` type.
Test data:
```text
binary decimal
00101100 = 44
00011100 = 28
@ -137,7 +137,7 @@ binary decimal
Query:
```sql
SELECT groupBitAnd(num) FROM t
```
@ -145,7 +145,7 @@ Where `num` is the column with the test data.
Result:
```text
binary decimal
00000100 = 4
```
@ -154,7 +154,7 @@ binary decimal
Applies bitwise `OR` to a series of numbers.
```sql
groupBitOr(expr)
```
@ -170,7 +170,7 @@ Value of the `UInt*` type.
Test data:
```text
binary decimal
00101100 = 44
00011100 = 28
@ -180,7 +180,7 @@ binary decimal
Query:
```sql
SELECT groupBitOr(num) FROM t
```
@ -188,7 +188,7 @@ Where `num` is the column with the test data.
Result:
```text
binary decimal
01111101 = 125
```
@ -197,7 +197,7 @@ binary decimal
Applies bitwise `XOR` to a series of numbers.
```sql
groupBitXor(expr)
```
@ -213,7 +213,7 @@ Value of the `UInt*` type.
Test data:
```text
binary decimal
00101100 = 44
00011100 = 28
@ -223,7 +223,7 @@ binary decimal
Query:
```sql
SELECT groupBitXor(num) FROM t
```
@ -231,7 +231,7 @@ Where `num` is the column with the test data.
Result:
```text
binary decimal
01101000 = 104
```
@ -241,7 +241,7 @@ binary decimal
Performs bitmap or aggregate calculations on an unsigned integer column and returns the cardinality as UInt64. If the `-State` suffix is added, it returns a [bitmap object](../functions/bitmap_functions.md).
```sql
groupBitmap(expr)
```
@ -257,7 +257,7 @@ Value of the `UInt64` type.
Test data:
```text
UserID
1
1
@ -267,13 +267,13 @@ UserID
Query:
```sql
SELECT groupBitmap(UserID) as num FROM t
```
Result:
```text
num
3
```
@ -291,15 +291,17 @@ Calculates the maximum.
Calculates the 'arg' value for a minimal 'val' value. If there are several different values of 'arg' for minimal values of 'val', the first of these values encountered is output.
**Example:**
```text
┌─user─────┬─salary─┐
│ director │ 5000 │
│ manager │ 3000 │
│ worker │ 1000 │
└──────────┴────────┘
```
```sql
SELECT argMin(user, salary) FROM salary
```
```text
┌─argMin(user, salary)─┐
│ worker │
└──────────────────────┘
@ -330,7 +332,7 @@ Returns a tuple of two arrays: keys in sorted order, and values summed for
Example:
```sql
CREATE TABLE sum_map(
date Date,
timeslot DateTime,
@ -351,7 +353,7 @@ FROM sum_map
GROUP BY timeslot
```
```text
┌────────────timeslot─┬─sumMap(statusMap.status, statusMap.requests)─┐
│ 2000-01-01 00:00:00 │ ([1,2,3,4,5],[10,10,20,10,10]) │
│ 2000-01-01 00:01:00 │ ([4,5,6,7,8],[10,10,20,10,10]) │
@ -362,7 +364,7 @@ GROUP BY timeslot
Computes the [skewness](https://en.wikipedia.org/wiki/Skewness) of a sequence.
```sql
skewPop(expr)
```
@ -386,7 +388,7 @@ Computes the [sample skewness](https://en.wikipedia.org/wiki/Skewness) of a sequ
It represents an unbiased estimate of the skewness of a random variable if passed values form its sample.
```sql
skewSamp(expr)
```
@ -408,7 +410,7 @@ SELECT skewSamp(value) FROM series_with_value_column
Computes the [kurtosis](https://en.wikipedia.org/wiki/Kurtosis) of a sequence.
```sql
kurtPop(expr)
```
@ -432,7 +434,7 @@ Computes the [sample kurtosis](https://en.wikipedia.org/wiki/Kurtosis) of a sequ
It represents an unbiased estimate of the kurtosis of a random variable if passed values form its sample.
```sql
kurtSamp(expr)
```
@ -463,7 +465,7 @@ The function returns array of tuples with `(timestamp, aggregated_value)` pairs.
Before using this function make sure `timestamp` is in ascending order.
Example:
```text
┌─uid─┬─timestamp─┬─value─┐
│ 1 │ 2 │ 0.2 │
│ 1 │ 7 │ 0.7 │
@ -477,7 +479,7 @@ Example:
│ 2 │ 24 │ 4.8 │
└─────┴───────────┴───────┘
```
```sql
CREATE TABLE time_series(
uid UInt64,
timestamp Int64,
@ -493,7 +495,7 @@ FROM (
);
```
And the result will be:
```text
[(2,0.2),(3,0.9),(7,2.1),(8,2.4),(12,3.6),(17,5.1),(18,5.4),(24,7.2),(25,2.5)]
```
@ -502,7 +504,7 @@ Similarly timeSeriesGroupRateSum, timeSeriesGroupRateSum will Calculate the rate
Also, the timestamp should be in ascending order before using this function.
Applying this function to the case above, the result will be:
```text
[(2,0),(3,0.1),(7,0.3),(8,0.3),(12,0.3),(17,0.3),(18,0.3),(24,0.3),(25,0.1)]
```
@ -516,7 +518,7 @@ The result is always Float64.
Calculates the approximate number of different values of the argument.
```sql
uniq(x[, ...])
```
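For example, assuming a hypothetical `hits` table with a `UserID` column:

```sql
SELECT uniq(UserID) FROM hits
```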
@ -551,7 +553,7 @@ We recommend using this function in almost all scenarios.
Calculates the approximate number of different argument values.
```sql
uniqCombined(HLL_precision)(x[, ...])
```
@ -595,7 +597,7 @@ Compared to the [uniq](#agg_function-uniq) function, the `uniqCombined`:
Calculates the approximate number of different argument values, using the [HyperLogLog](https://en.wikipedia.org/wiki/HyperLogLog) algorithm.
```sql
uniqHLL12(x[, ...])
```
@ -631,7 +633,7 @@ We don't recommend using this function. In most cases, use the [uniq](#agg_funct
Calculates the exact number of different argument values.
```sql
uniqExact(x[, ...])
```
@ -676,7 +678,7 @@ Optional parameters:
Calculates the moving sum of input values.
```sql
groupArrayMovingSum(numbers_for_summing)
groupArrayMovingSum(window_size)(numbers_for_summing)
```
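A sketch with a hypothetical table `t` that has a numeric column `value`, using a window size of 3:

```sql
SELECT groupArrayMovingSum(3)(value) FROM t
```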
@ -745,7 +747,7 @@ FROM t
Calculates the moving average of input values.
```sql
groupArrayMovingAvg(numbers_for_summing)
groupArrayMovingAvg(window_size)(numbers_for_summing)
```
@ -850,7 +852,7 @@ Don't use this function for calculating timings. There is a more suitable functi
Computes the quantile of the specified level with determined precision. The function is intended for calculating quantiles of page loading time in milliseconds.
```sql
quantileTiming(level)(expr)
```
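For example, assuming a hypothetical `hits` table with a `LoadTime` column in milliseconds:

```sql
SELECT quantileTiming(0.95)(LoadTime) FROM hits
```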
@ -955,7 +957,7 @@ Returns an array of the most frequent values in the specified column. The result
Implements the [Filtered Space-Saving](http://www.l2f.inesc-id.pt/~fmmb/wiki/uploads/Work/misnis.ref0a.pdf) algorithm for analyzing TopK, based on the reduce-and-combine algorithm from [Parallel Space Saving](https://arxiv.org/pdf/1401.0702.pdf).
```sql
topK(N)(column)
```
@ -972,12 +974,12 @@ We recommend using the `N < 10 ` value; performance is reduced with large `N` va
Take the [OnTime](../../getting_started/example_datasets/ontime.md) data set and select the three most frequently occurring values in the `AirlineID` column.
```sql
SELECT topK(3)(AirlineID) AS res
FROM ontime
```
```text
┌─res─────────────────┐
│ [19393,19790,19805] │
└─────────────────────┘
@ -1001,7 +1003,7 @@ Calculates the Pearson correlation coefficient: `Σ((x - x̅)(y - y̅)) / sqrt(
Performs simple (unidimensional) linear regression.
```sql
simpleLinearRegression(x, y)
```
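For example, applied over perfectly linear data it returns the slope and intercept `(k, b)` of the fitted line `y = k*x + b`, here `(1, 0)`:

```sql
SELECT arrayReduce('simpleLinearRegression', [0, 1, 2, 3], [0, 1, 2, 3])
```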


@ -6,7 +6,7 @@ The `ALTER` query is only supported for `*MergeTree` tables, as well as `Merge`a
Changing the table structure.
```sql
ALTER TABLE [db].name [ON CLUSTER cluster] ADD|DROP|CLEAR|COMMENT|MODIFY COLUMN ...
```
@ -25,7 +25,7 @@ These actions are described in detail below.
#### ADD COLUMN {#alter_add-column}
```sql
ADD COLUMN [IF NOT EXISTS] name [type] [default_expr] [AFTER name_after]
```
@ -39,13 +39,13 @@ This approach allows us to complete the `ALTER` query instantly, without increas
Example:
```sql
ALTER TABLE visits ADD COLUMN browser String AFTER user_id
```
#### DROP COLUMN {#alter_drop-column}
```sql
DROP COLUMN [IF EXISTS] name
```
@ -55,13 +55,13 @@ Deletes data from the file system. Since this deletes entire files, the query is
Example:
```sql
ALTER TABLE visits DROP COLUMN browser
```
#### CLEAR COLUMN {#alter_clear-column}
```sql
CLEAR COLUMN [IF EXISTS] name IN PARTITION partition_name
```
@ -71,13 +71,13 @@ If the `IF EXISTS` clause is specified, the query won't return an error if the c
Example:
```sql
ALTER TABLE visits CLEAR COLUMN browser IN PARTITION tuple()
```
#### COMMENT COLUMN {#alter_comment-column}
```sql
COMMENT COLUMN [IF EXISTS] name 'comment'
```
@ -89,13 +89,13 @@ Comments are stored in the `comment_expression` column returned by the [DESCRIBE
Example:
```sql
ALTER TABLE visits COMMENT COLUMN browser 'The table shows the browser used for accessing the site.'
```
#### MODIFY COLUMN {#alter_modify-column}
```sql
MODIFY COLUMN [IF EXISTS] name [type] [default_expr]
```
@ -105,7 +105,7 @@ When changing the type, values are converted as if the [toType](functions/type_c
Example:
```sql
ALTER TABLE visits MODIFY COLUMN browser Array(String)
```
@ -139,7 +139,7 @@ For tables that don't store data themselves (such as `Merge` and `Distributed`),
The following command is supported:
```sql
MODIFY ORDER BY new_expression
```
@ -171,7 +171,7 @@ Also, they are replicated (syncing indices metadata through ZooKeeper).
See more on [constraints](create.md#constraints)
Constraints could be added or deleted using following syntax:
```sql
ALTER TABLE [db].name ADD CONSTRAINT constraint_name CHECK expression;
ALTER TABLE [db].name DROP CONSTRAINT constraint_name;
```
@ -197,7 +197,7 @@ The following operations with [partitions](../operations/table_engines/custom_pa
#### DETACH PARTITION {#alter_detach-partition}
```sql
ALTER TABLE table_name DETACH PARTITION partition_expr
```
@ -205,7 +205,7 @@ Moves all data for the specified partition to the `detached` directory. The serv
Example:
```sql
ALTER TABLE visits DETACH PARTITION 201901
```
@ -217,7 +217,7 @@ This query is replicated it moves the data to the `detached` directory on al
#### DROP PARTITION {#alter_drop-partition}
```sql
ALTER TABLE table_name DROP PARTITION partition_expr
```
@ -245,7 +245,7 @@ ALTER TABLE table_name ATTACH PARTITION|PART partition_expr
Adds data to the table from the `detached` directory. It is possible to add data for an entire partition or for a separate part. Examples:
```sql
ALTER TABLE visits ATTACH PARTITION 201901;
ALTER TABLE visits ATTACH PART 201901_2_2_0;
```
@ -258,7 +258,7 @@ So you can put data to the `detached` directory on one replica, and use the `ALT
#### REPLACE PARTITION {#alter_replace-partition}
```sql
ALTER TABLE table2 REPLACE PARTITION partition_expr FROM table1
```
@ -271,7 +271,7 @@ For the query to run successfully, the following conditions must be met:
#### CLEAR COLUMN IN PARTITION {#alter_clear-column-partition}
```sql
ALTER TABLE table_name CLEAR COLUMN column_name IN PARTITION partition_expr
```
@ -279,13 +279,13 @@ Resets all values in the specified column in a partition. If the `DEFAULT` claus
Example:
```sql
ALTER TABLE visits CLEAR COLUMN hour in PARTITION 201902
```
#### FREEZE PARTITION {#alter_freeze-partition}
```sql
ALTER TABLE table_name FREEZE [PARTITION partition_expr]
```
@ -321,7 +321,7 @@ For more information about backups and restoring data, see the [Data Backup](../
#### CLEAR INDEX IN PARTITION {#alter_clear-index-partition}
```sql
ALTER TABLE table_name CLEAR INDEX index_name IN PARTITION partition_expr
```
@ -329,7 +329,7 @@ The query works similar to `CLEAR COLUMN`, but it resets an index instead of a c
#### FETCH PARTITION {#alter_fetch-partition}
```sql
ALTER TABLE table_name FETCH PARTITION partition_expr FROM 'path-in-zookeeper'
```
@ -342,7 +342,7 @@ The query does the following:
For example:
```sql
ALTER TABLE users FETCH PARTITION 201902 FROM '/clickhouse/tables/01-01/visits';
ALTER TABLE users ATTACH PARTITION 201902;
```
@ -370,7 +370,7 @@ For old-style tables, you can specify the partition either as a number `201901`
All the rules above are also true for the [OPTIMIZE](misc.md#misc_operations-optimize) query. If you need to specify the only partition when optimizing a non-partitioned table, set the expression `PARTITION tuple()`. For example:
```sql
OPTIMIZE TABLE table_not_partitioned PARTITION tuple() FINAL;
```
@ -393,19 +393,19 @@ Existing tables are ready for mutations as-is (no conversion necessary), but aft
Currently available commands:
```sql
ALTER TABLE [db.]table DELETE WHERE filter_expr
```
The `filter_expr` must be of type UInt8. The query deletes rows in the table for which this expression takes a non-zero value.
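For example, a sketch that deletes rows matching a condition (the table and column names are illustrative):

```sql
ALTER TABLE visits DELETE WHERE duration < 5
```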
```sql
ALTER TABLE [db.]table UPDATE column1 = expr1 [, ...] WHERE filter_expr
```
The command is available starting from version 18.12.14. The `filter_expr` must be of type UInt8. This query updates values of the specified columns to the values of the corresponding expressions in rows for which `filter_expr` takes a non-zero value. Values are cast to the column type using the `CAST` operator. Updating columns that are used in the calculation of the primary or the partition key is not supported.
```sql
ALTER TABLE [db.]table MATERIALIZE INDEX name IN PARTITION partition_name
```


@ -4,7 +4,7 @@
Creates a database.
```sql
CREATE DATABASE [IF NOT EXISTS] db_name [ON CLUSTER cluster] [ENGINE = engine(...)]
```
@ -48,19 +48,19 @@ The structure of the table is a list of column descriptions. If indexes are supp
A column description is `name type` in the simplest case. Example: `RegionID UInt32`.
Expressions can also be defined for default values (see below).
```sql
CREATE TABLE [IF NOT EXISTS] [db.]table_name AS [db2.]name2 [ENGINE = engine]
```
Creates a table with the same structure as another table. You can specify a different engine for the table. If the engine is not specified, the same engine will be used as for the `db2.name2` table.
```sql
CREATE TABLE [IF NOT EXISTS] [db.]table_name AS table_function()
```
Creates a table with the structure and data returned by a [table function](table_functions/index.md).
```sql
CREATE TABLE [IF NOT EXISTS] [db.]table_name ENGINE = engine AS SELECT ...
```
@ -221,7 +221,7 @@ In most cases, temporary tables are not created manually, but when using externa
The `CREATE`, `DROP`, `ALTER`, and `RENAME` queries support distributed execution on a cluster.
For example, the following query creates the `all_hits` `Distributed` table on each host in `cluster`:
```sql
CREATE TABLE IF NOT EXISTS all_hits ON CLUSTER cluster (p Date, i Int32) ENGINE = Distributed(cluster, default, hits)
```
@ -231,7 +231,7 @@ The local version of the query will eventually be implemented on each host in th
## CREATE VIEW
```sql
CREATE [MATERIALIZED] VIEW [IF NOT EXISTS] [db.]table_name [TO[db.]name] [ENGINE = engine] [POPULATE] AS SELECT ...
```
@ -241,19 +241,19 @@ Normal views don't store any data, but just perform a read from another table. I
As an example, assume you've created a view:
```sql
CREATE VIEW view AS SELECT ...
```
and written a query:
```sql
SELECT a, b, c FROM view
```
This query is fully equivalent to using the subquery:
```sql
SELECT a, b, c FROM (SELECT ...)
```


@ -112,7 +112,7 @@ This storage method works the same way as hashed and allows using date/time (arb
Example: The table contains discounts for each advertiser in the format:
```text
+---------------+---------------------+-------------------+--------+
| advertiser id | discount start date | discount end date | amount |
+===============+=====================+===================+========+
@ -146,7 +146,7 @@ Example:
To work with these dictionaries, you need to pass an additional argument to the `dictGetT` function, for which a range is selected:
```sql
dictGetT('dict_name', 'attr_name', id, date)
```
@ -240,7 +240,7 @@ This type of storage is for mapping network prefixes (IP addresses) to metadata
Example: The table contains network prefixes and their corresponding AS number and country code:
```text
+-----------------+-------+--------+
| prefix | asn | cca2 |
+=================+=======+========+
@ -283,13 +283,13 @@ The key must have only one String type attribute that contains an allowed IP pre
For queries, you must use the same functions (`dictGetT` with a tuple) as for dictionaries with composite keys:
```sql
dictGetT('dict_name', 'attr_name', tuple(ip))
```
The function takes either `UInt32` for IPv4, or `FixedString(16)` for IPv6:
```sql
dictGetString('prefix', 'asn', tuple(IPv6StringToNum('2001:db8::1')))
```


@ -131,7 +131,7 @@ If you have a problems with encodings when using Oracle, see the corresponding [
Let's configure unixODBC for PostgreSQL. Content of `/etc/odbc.ini`:
```text
[gregtest]
Driver = /usr/lib/psqlodbca.so
Servername = localhost
@ -144,7 +144,7 @@ PASSWORD = test
If you then make a query such as
```sql
SELECT * FROM odbc('DSN=gregtest;Servername=some-server.com', 'test_db');
```
@ -155,12 +155,13 @@ ODBC driver will send values of `USERNAME` and `PASSWORD` from `odbc.ini` to `so
Ubuntu OS.
Installing unixODBC and the ODBC driver for PostgreSQL:
```bash
$ sudo apt-get install -y unixodbc odbcinst odbc-postgresql
```
Configuring `/etc/odbc.ini` (or `~/.odbc.ini`):
```text
[DEFAULT]
Driver = myconnection
@ -222,13 +223,13 @@ Ubuntu OS.
Installing the driver:
```bash
$ sudo apt-get install tdsodbc freetds-bin sqsh
```
Configuring the driver:
```bash
$ cat /etc/freetds/freetds.conf
...


@ -4,11 +4,11 @@ For all arithmetic functions, the result type is calculated as the smallest numb
Example:
```sql
SELECT toTypeName(0), toTypeName(0 + 0), toTypeName(0 + 0 + 0), toTypeName(0 + 0 + 0 + 0)
```
```text
┌─toTypeName(0)─┬─toTypeName(plus(0, 0))─┬─toTypeName(plus(plus(0, 0), 0))─┬─toTypeName(plus(plus(plus(0, 0), 0), 0))─┐
│ UInt8 │ UInt16 │ UInt32 │ UInt64 │
└───────────────┴────────────────────────┴─────────────────────────────────┴──────────────────────────────────────────┘


@ -49,7 +49,7 @@ Returns an 'Array(T)' type result, where 'T' is the smallest common type out of
Combines arrays passed as arguments.
```sql
arrayConcat(arrays)
```
@ -82,9 +82,10 @@ Returns 0 if the the element is not in the array, or 1 if it is.
`NULL` is processed as a value.
```sql
SELECT has([1, 2, NULL], NULL)
```
```text
┌─has([1, 2, NULL], NULL)─┐
│ 1 │
└─────────────────────────┘
@ -94,7 +95,7 @@ SELECT has([1, 2, NULL], NULL)
Checks whether one array is a subset of another.
```sql
hasAll(set, subset)
```
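For example, checking that all elements of the second array occur in the first:

```sql
SELECT hasAll([1, 2, 3, 4], [2, 4])
```

This returns `1`, while `hasAll([1, 2, 3, 4], [5])` returns `0`.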
@ -132,7 +133,7 @@ hasAll(set, subset)
Checks whether two arrays have any elements in common.
```sql
hasAny(array1, array2)
```
@ -169,10 +170,10 @@ Returns the index of the first 'x' element (starting from 1) if it is in the arr
Example:
```sql
SELECT indexOf([1, 3, NULL, NULL], NULL)
```
```text
┌─indexOf([1, 3, NULL, NULL], NULL)─┐
│ 3 │
@ -189,9 +190,10 @@ Returns the number of elements in the array equal to x. Equivalent to arrayCount
Example:
```sql
SELECT countEqual([1, 2, NULL, NULL], NULL)
```
```text
┌─countEqual([1, 2, NULL, NULL], NULL)─┐
│ 2 │
└──────────────────────────────────────┘
@ -293,7 +295,7 @@ This is necessary when using ARRAY JOIN with a nested data structure and further
Removes the last item from the array.
```sql
arrayPopBack(array)
```
@ -316,7 +318,7 @@ SELECT arrayPopBack([1, 2, 3]) AS res
Removes the first item from the array.
```sql
arrayPopFront(array)
```
@ -339,7 +341,7 @@ SELECT arrayPopFront([1, 2, 3]) AS res
Adds one item to the end of the array.
```sql
arrayPushBack(array, single_value)
```
@ -363,7 +365,7 @@ SELECT arrayPushBack(['a'], 'b') AS res
Adds one element to the beginning of the array.
```sql
arrayPushFront(array, single_value)
```
@ -387,7 +389,7 @@ SELECT arrayPushBack(['b'], 'a') AS res
Changes the length of the array.
```sql
arrayResize(array, size[, extender])
```
@ -405,17 +407,19 @@ An array of length `size`.
**Examples of calls**
```sql
SELECT arrayResize([1], 3)
```
```text
┌─arrayResize([1], 3)─┐
│ [1,0,0] │
└─────────────────────┘
```
```sql
SELECT arrayResize([1], 3, NULL)
```
```text
┌─arrayResize([1], 3, NULL)─┐
│ [1,NULL,NULL] │
└───────────────────────────┘
@ -425,7 +429,7 @@ SELECT arrayResize([1], 3, NULL)
Returns a slice of the array.
```sql
arraySlice(array, offset[, length])
```
@ -653,7 +657,7 @@ Takes an array, returns an array with the difference between all pairs of neighb
SELECT arrayDifference([1, 2, 3, 4])
```
```text
┌─arrayDifference([1, 2, 3, 4])─┐
│ [0,1,1,1] │
└───────────────────────────────┘
@ -667,7 +671,7 @@ Takes an array, returns an array containing the different elements in all the ar
SELECT arrayDistinct([1, 2, 2, 3, 1])
```
```text
┌─arrayDistinct([1, 2, 2, 3, 1])─┐
│ [1,2,3] │
└────────────────────────────────┘
@ -687,7 +691,7 @@ SELECT
arrayIntersect([1, 2], [1, 3], [1, 4]) AS intersect
```
```text
┌─no_intersect─┬─intersect─┐
│ [] │ [1] │
└──────────────┴───────────┘


@ -19,7 +19,7 @@ Example:
SELECT arrayJoin([1, 2, 3] AS src) AS dst, 'Hello', src
```
```text
┌─dst─┬─\'Hello\'─┬─src─────┐
│ 1 │ Hello │ [1,2,3] │
│ 2 │ Hello │ [1,2,3] │


@ -13,7 +13,7 @@ For more information on RoaringBitmap, see: [CRoaring](https://github.com/Roarin
Build a bitmap from unsigned integer array.
```sql
bitmapBuild(array)
```
@ -36,7 +36,7 @@ SELECT bitmapBuild([1, 2, 3, 4, 5]) AS res, toTypeName(res)
Convert bitmap to integer array.
```sql
bitmapToArray(bitmap)
```
@ -50,7 +50,7 @@ bitmapToArray(bitmap)
SELECT bitmapToArray(bitmapBuild([1, 2, 3, 4, 5])) AS res
```
```text
┌─res─────────┐
│ [1,2,3,4,5] │
└─────────────┘
@ -60,7 +60,7 @@ SELECT bitmapToArray(bitmapBuild([1, 2, 3, 4, 5])) AS res
Returns a subset of the bitmap in the specified range (`range_end` is not included).
```sql
bitmapSubsetInRange(bitmap, range_start, range_end)
```
@ -72,11 +72,11 @@ bitmapSubsetInRange(bitmap, range_start, range_end)
**Example**
```sql
SELECT bitmapToArray(bitmapSubsetInRange(bitmapBuild([0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,100,200,500]), toUInt32(30), toUInt32(200))) AS res
```
```text
┌─res───────────────┐
│ [30,31,32,33,100] │
└───────────────────┘
@ -112,7 +112,7 @@ SELECT bitmapToArray(bitmapSubsetLimit(bitmapBuild([0,1,2,3,4,5,6,7,8,9,10,11,12
Checks whether the bitmap contains an element.
```sql
bitmapContains(haystack, needle)
```
@ -143,7 +143,7 @@ SELECT bitmapContains(bitmapBuild([1,5,7,9]), toUInt32(9)) AS res
Checks whether two bitmaps have any elements in common.
```sql
bitmapHasAny(bitmap1, bitmap2)
```
@ -160,11 +160,11 @@ If you are sure that `bitmap2` contains strictly one element, consider using the
**Example**
```sql
SELECT bitmapHasAny(bitmapBuild([1,2,3]),bitmapBuild([3,4,5])) AS res
```
```text
┌─res─┐
│ 1 │
└─────┘
@ -175,7 +175,7 @@ SELECT bitmapHasAny(bitmapBuild([1,2,3]),bitmapBuild([3,4,5])) AS res
Analogous to `hasAll(array, array)` returns 1 if the first bitmap contains all the elements of the second one, 0 otherwise.
If the second argument is an empty bitmap then returns 1.
```sql
bitmapHasAll(bitmap,bitmap)
```
@ -185,11 +185,11 @@ bitmapHasAll(bitmap,bitmap)
**Example**
```sql
SELECT bitmapHasAll(bitmapBuild([1,2,3]),bitmapBuild([3,4,5])) AS res
```
```text
┌─res─┐
│ 0 │
└─────┘
@ -200,7 +200,7 @@ SELECT bitmapHasAll(bitmapBuild([1,2,3]),bitmapBuild([3,4,5])) AS res
Computes the AND of two bitmaps; the result is a new bitmap.
```sql
bitmapAnd(bitmap,bitmap)
```
@ -214,7 +214,7 @@ bitmapAnd(bitmap,bitmap)
SELECT bitmapToArray(bitmapAnd(bitmapBuild([1,2,3]),bitmapBuild([3,4,5]))) AS res
```
```text
┌─res─┐
│ [3] │
└─────┘
@ -225,7 +225,7 @@ SELECT bitmapToArray(bitmapAnd(bitmapBuild([1,2,3]),bitmapBuild([3,4,5]))) AS re
Computes the OR of two bitmaps; the result is a new bitmap.
```sql
bitmapOr(bitmap,bitmap)
```
@ -235,11 +235,11 @@ bitmapOr(bitmap,bitmap)
**Example**
```sql
SELECT bitmapToArray(bitmapOr(bitmapBuild([1,2,3]),bitmapBuild([3,4,5]))) AS res
```
```text
┌─res─────────┐
│ [1,2,3,4,5] │
└─────────────┘
@ -249,7 +249,7 @@ SELECT bitmapToArray(bitmapOr(bitmapBuild([1,2,3]),bitmapBuild([3,4,5]))) AS res
Computes the XOR of two bitmaps; the result is a new bitmap.
```sql
bitmapXor(bitmap,bitmap)
```
@ -259,11 +259,11 @@ bitmapXor(bitmap,bitmap)
**Example**
```sql
SELECT bitmapToArray(bitmapXor(bitmapBuild([1,2,3]),bitmapBuild([3,4,5]))) AS res
```
```text
┌─res───────┐
│ [1,2,4,5] │
└───────────┘
@ -273,7 +273,7 @@ SELECT bitmapToArray(bitmapXor(bitmapBuild([1,2,3]),bitmapBuild([3,4,5]))) AS re
Computes the AND-NOT of two bitmaps; the result is a new bitmap.
```sql
bitmapAndnot(bitmap,bitmap)
```
@ -283,11 +283,11 @@ bitmapAndnot(bitmap,bitmap)
**Example**
```sql
SELECT bitmapToArray(bitmapAndnot(bitmapBuild([1,2,3]),bitmapBuild([3,4,5]))) AS res
```
```text
┌─res───┐
│ [1,2] │
└───────┘
@ -298,7 +298,7 @@ SELECT bitmapToArray(bitmapAndnot(bitmapBuild([1,2,3]),bitmapBuild([3,4,5]))) AS
Returns the cardinality of a bitmap as a value of type UInt64.
```sql
bitmapCardinality(bitmap)
```
@ -308,11 +308,11 @@ bitmapCardinality(bitmap)
**Example**
```sql
SELECT bitmapCardinality(bitmapBuild([1, 2, 3, 4, 5])) AS res
```
```text
┌─res─┐
│ 5 │
└─────┘
@ -373,7 +373,7 @@ SELECT bitmapMax(bitmapBuild([1, 2, 3, 4, 5])) AS res
Computes the AND of two bitmaps and returns the cardinality as a value of type UInt64.
```sql
bitmapAndCardinality(bitmap,bitmap)
```
@ -383,11 +383,11 @@ bitmapAndCardinality(bitmap,bitmap)
**Example**
```sql
SELECT bitmapAndCardinality(bitmapBuild([1,2,3]),bitmapBuild([3,4,5])) AS res;
```
```text
┌─res─┐
│ 1 │
└─────┘
@ -398,7 +398,7 @@ SELECT bitmapAndCardinality(bitmapBuild([1,2,3]),bitmapBuild([3,4,5])) AS res;
Computes the OR of two bitmaps and returns the cardinality as a value of type UInt64.
```sql
bitmapOrCardinality(bitmap,bitmap)
```
@ -408,11 +408,11 @@ bitmapOrCardinality(bitmap,bitmap)
**Example**
```sql
SELECT bitmapOrCardinality(bitmapBuild([1,2,3]),bitmapBuild([3,4,5])) AS res;
```
```text
┌─res─┐
│ 5 │
└─────┘
@ -422,7 +422,7 @@ SELECT bitmapOrCardinality(bitmapBuild([1,2,3]),bitmapBuild([3,4,5])) AS res;
Computes the XOR of two bitmaps and returns the cardinality as a value of type UInt64.
```sql
bitmapXorCardinality(bitmap,bitmap)
```
@ -432,11 +432,11 @@ bitmapXorCardinality(bitmap,bitmap)
**Example**
```sql
SELECT bitmapXorCardinality(bitmapBuild([1,2,3]),bitmapBuild([3,4,5])) AS res;
```
```text
┌─res─┐
│ 4 │
└─────┘
@ -447,7 +447,7 @@ SELECT bitmapXorCardinality(bitmapBuild([1,2,3]),bitmapBuild([3,4,5])) AS res;
Computes the AND-NOT of two bitmaps and returns the cardinality as a value of type UInt64.
```sql
bitmapAndnotCardinality(bitmap,bitmap)
```
@ -457,11 +457,11 @@ bitmapAndnotCardinality(bitmap,bitmap)
**Example**
```sql
SELECT bitmapAndnotCardinality(bitmapBuild([1,2,3]),bitmapBuild([3,4,5])) AS res;
```
```text
┌─res─┐
│ 2 │
└─────┘


@ -11,7 +11,7 @@ Returns `then` if `cond != 0`, or `else` if `cond = 0`.
Allows you to write the [CASE](../operators.md#operator_case) operator more compactly in the query.
```sql
multiIf(cond_1, then_1, cond_2, then_2...else)
```
@ -31,7 +31,7 @@ The function returns one of the values `then_N` or `else`, depending on the cond
Take the table
```text
┌─x─┬────y─┐
│ 1 │ ᴺᵁᴸᴸ │
│ 2 │ 3 │
@ -40,7 +40,7 @@ Take the table
Run the query `SELECT multiIf(isNull(y), x, y < 3, y, NULL) FROM t_null`. Result:
```text
┌─multiIf(isNull(y), x, less(y, 3), y, NULL)─┐
│ 1 │
│ ᴺᵁᴸᴸ │


@ -4,7 +4,7 @@ Support for time zones
All functions for working with the date and time that have a logical use for the time zone can accept a second optional time zone argument. Example: Asia/Yekaterinburg. In this case, they use the specified time zone instead of the local (default) one.
```sql
SELECT
toDateTime('2016-06-15 23:00:00') AS time,
toDate(time) AS date_local,
@ -12,7 +12,7 @@ SELECT
toString(time, 'US/Samoa') AS time_samoa
```
```text
┌────────────────time─┬─date_local─┬─date_yekat─┬─time_samoa──────────┐
│ 2016-06-15 23:00:00 │ 2016-06-15 │ 2016-06-16 │ 2016-06-15 09:00:00 │
└─────────────────────┴────────────┴────────────┴─────────────────────┘
@ -201,7 +201,7 @@ For mode values with a meaning of “with 4 or more days this year,” weeks are
For mode values with a meaning of “contains January 1”, the week that contains January 1 is week 1. It doesn't matter how many days of the new year the week contained, even if it contained only one day.
```sql
toWeek(date, [, mode][, Timezone])
```
**Parameters**
@ -212,11 +212,11 @@ toWeek(date, [, mode][, Timezone])
**Example**
```sql
SELECT toDate('2016-12-27') AS date, toWeek(date) AS week0, toWeek(date,1) AS week1, toWeek(date,9) AS week9;
```
```text
┌───────date─┬─week0─┬─week1─┬─week9─┐
│ 2016-12-27 │ 52 │ 52 │ 1 │
└────────────┴───────┴───────┴───────┘
@ -231,11 +231,11 @@ The mode argument works exactly like the mode argument to toWeek(). For the sing
**Example**
```sql
SELECT toDate('2016-12-27') AS date, toYearWeek(date) AS yearWeek0, toYearWeek(date,1) AS yearWeek1, toYearWeek(date,9) AS yearWeek9;
```
```text
┌───────date─┬─yearWeek0─┬─yearWeek1─┬─yearWeek9─┐
│ 2016-12-27 │ 201652 │ 201652 │ 201701 │
└────────────┴───────────┴───────────┴───────────┘
@ -286,7 +286,7 @@ SELECT
addYears(date_time, 1) AS add_years_with_date_time
```
```text
┌─add_years_with_date─┬─add_years_with_date_time─┐
│ 2019-01-01 │ 2019-01-01 00:00:00 │
└─────────────────────┴──────────────────────────┘
@ -305,7 +305,7 @@ SELECT
subtractYears(date_time, 1) AS subtract_years_with_date_time
```
```text
┌─subtract_years_with_date─┬─subtract_years_with_date_time─┐
│ 2018-01-01 │ 2018-01-01 00:00:00 │
└──────────────────────────┴───────────────────────────────┘


@ -6,7 +6,7 @@ For information on connecting and configuring external dictionaries, see [Extern
Retrieves a value from an external dictionary.
```sql
dictGet('dict_name', 'attr_name', id_expr)
dictGetOrDefault('dict_name', 'attr_name', id_expr, default_value_expr)
```
@ -95,7 +95,7 @@ LIMIT 3
Checks whether a key is present in a dictionary.
```sql
dictHas('dict_name', id_expr)
```
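For example, with a hypothetical dictionary named `ext_dict`, checking whether the key `42` is present:

```sql
SELECT dictHas('ext_dict', toUInt64(42))
```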
@ -115,7 +115,7 @@ Type: `UInt8`.
For the hierarchical dictionary, returns an array of dictionary keys starting from the passed `id_expr` and continuing along the chain of parent elements.
```sql
dictGetHierarchy('dict_name', id_expr)
```
@ -134,7 +134,7 @@ Type: Array(UInt64).
Checks the ancestor of a key through the whole hierarchical chain in the dictionary.
```sql
dictIsIn('dict_name', child_id_expr, ancestor_id_expr)
```
@ -169,7 +169,7 @@ All these functions have the `OrDefault` modification. For example, `dictGetDate
Syntax:
```sql
dictGet[Type]('dict_name', 'attr_name', id_expr)
dictGet[Type]OrDefault('dict_name', 'attr_name', id_expr, default_value_expr)
```


@ -4,7 +4,7 @@
Checks whether the argument is [NULL](../syntax.md#null).
```sql
isNull(x)
```
@ -21,7 +21,7 @@ isNull(x)
Input table
```text
┌─x─┬────y─┐
│ 1 │ ᴺᵁᴸᴸ │
│ 2 │ 3 │
@ -30,25 +30,21 @@ Input table
Query
```sql
SELECT x FROM t_null WHERE isNull(y)
```
```text
┌─x─┐
│ 1 │
└───┘
```
## isNotNull
Checks whether the argument is not [NULL](../syntax.md#null).
```sql
isNotNull(x)
```
@ -65,7 +61,7 @@ isNotNull(x)
Input table
```text
┌─x─┬────y─┐
│ 1 │ ᴺᵁᴸᴸ │
│ 2 │ 3 │
@ -74,25 +70,21 @@ Input table
Query
```sql
SELECT x FROM t_null WHERE isNotNull(y)
```
```text
┌─x─┐
│ 2 │
└───┘
```
## coalesce
Checks from left to right whether `NULL` arguments were passed and returns the first non-`NULL` argument.
```sql
coalesce(x,...)
```
@ -109,7 +101,7 @@ coalesce(x,...)
Consider a list of contacts that may specify multiple ways to contact a customer.
```text
┌─name─────┬─mail─┬─phone─────┬──icq─┐
│ client 1 │ ᴺᵁᴸᴸ │ 123-45-67 │ 123 │
│ client 2 │ ᴺᵁᴸᴸ │ ᴺᵁᴸᴸ │ ᴺᵁᴸᴸ │
@ -120,25 +112,22 @@ The `mail` and `phone` fields are of type String, but the `icq` field is `UInt32
Get the first available contact method for the customer from the contact list:
```sql
SELECT coalesce(mail, phone, CAST(icq,'Nullable(String)')) FROM aBook
```
```text
┌─name─────┬─coalesce(mail, phone, CAST(icq, 'Nullable(String)'))─┐
│ client 1 │ 123-45-67 │
│ client 2 │ ᴺᵁᴸᴸ │
└──────────┴──────────────────────────────────────────────────────┘
```
## ifNull
Returns an alternative value if the main argument is `NULL`.
```sql
ifNull(x,alt)
```
@ -154,17 +143,19 @@ ifNull(x,alt)
**Example**
```sql
SELECT ifNull('a', 'b')
```
```text
┌─ifNull('a', 'b')─┐
│ a │
└──────────────────┘
```
```sql
SELECT ifNull(NULL, 'b')
```
```text
┌─ifNull(NULL, 'b')─┐
│ b │
└───────────────────┘
@ -174,7 +165,7 @@ SELECT ifNull(NULL, 'b')
Returns `NULL` if the arguments are equal.
```sql
nullIf(x, y)
```
@ -189,17 +180,19 @@ nullIf(x, y)
**Example**
```sql
SELECT nullIf(1, 1)
```
```text
┌─nullIf(1, 1)─┐
│ ᴺᵁᴸᴸ │
└──────────────┘
```
```sql
SELECT nullIf(1, 2)
```
```text
┌─nullIf(1, 2)─┐
│ 1 │
└──────────────┘
@ -209,7 +202,7 @@ SELECT nullIf(1, 2)
Converts a value of [Nullable](../../data_types/nullable.md) type to the corresponding non-`Nullable` value, assuming the value is not `NULL`.
```sql
assumeNotNull(x)
```
@ -226,15 +219,16 @@ assumeNotNull(x)
Consider the `t_null` table.
```sql
SHOW CREATE TABLE t_null
```
```text
┌─statement─────────────────────────────────────────────────────────────────┐
│ CREATE TABLE default.t_null ( x Int8, y Nullable(Int8)) ENGINE = TinyLog │
└───────────────────────────────────────────────────────────────────────────┘
```
```text
┌─x─┬────y─┐
│ 1 │ ᴺᵁᴸᴸ │
│ 2 │ 3 │
@ -243,18 +237,20 @@ SHOW CREATE TABLE t_null
Apply the `assumeNotNull` function to the `y` column.
```sql
SELECT assumeNotNull(y) FROM t_null
```
```text
┌─assumeNotNull(y)─┐
│ 0 │
│ 3 │
└──────────────────┘
```
```sql
SELECT toTypeName(assumeNotNull(y)) FROM t_null
```
```text
┌─toTypeName(assumeNotNull(y))─┐
│ Int8 │
│ Int8 │
@ -265,7 +261,7 @@ SELECT toTypeName(assumeNotNull(y)) FROM t_null
Converts the argument type to `Nullable`.
```sql
toNullable(x)
```
@ -279,15 +275,18 @@ toNullable(x)
**Example**
```sql
SELECT toTypeName(10)
```
```text
┌─toTypeName(10)─┐
│ UInt8 │
└────────────────┘
```
```sql
SELECT toTypeName(toNullable(10))
```
```text
┌─toTypeName(toNullable(10))─┐
│ Nullable(UInt8) │
└────────────────────────────┘


@ -4,7 +4,7 @@
Calculate the distance between two points on the Earth's surface using [the great-circle formula](https://en.wikipedia.org/wiki/Great-circle_distance).
```sql
greatCircleDistance(lon1Deg, lat1Deg, lon2Deg, lat2Deg)
```
@ -25,11 +25,11 @@ Generates an exception when the input parameter values fall outside of the range
**Example**
```sql
SELECT greatCircleDistance(55.755831, 37.617673, -55.755831, -37.617673)
```
```text
┌─greatCircleDistance(55.755831, 37.617673, -55.755831, -37.617673)─┐
│ 14132374.194975413 │
└───────────────────────────────────────────────────────────────────┘
@ -40,7 +40,7 @@ SELECT greatCircleDistance(55.755831, 37.617673, -55.755831, -37.617673)
Checks whether the point belongs to at least one of the ellipses.
Coordinates are geometric in the Cartesian coordinate system.
```sql
pointInEllipses(x, y, x₀, y₀, a₀, b₀,...,xₙ, yₙ, aₙ, bₙ)
```
@ -58,11 +58,11 @@ The input parameters must be `2+4⋅n`, where `n` is the number of ellipses.
**Example**
```sql
SELECT pointInEllipses(10., 10., 10., 9.1, 1., 0.9999)
```
```text
┌─pointInEllipses(10., 10., 10., 9.1, 1., 0.9999)─┐
│ 1 │
└─────────────────────────────────────────────────┘
@ -72,7 +72,7 @@ SELECT pointInEllipses(10., 10., 10., 9.1, 1., 0.9999)
Checks whether the point belongs to the polygon on the plane.
```sql
pointInPolygon((x, y), [(a, b), (c, d) ...], ...)
```
@ -89,11 +89,11 @@ If the point is on the polygon boundary, the function may return either 0 or 1.
**Example**
```sql
SELECT pointInPolygon((3., 3.), [(6, 0), (8, 4), (5, 8), (0, 2)]) AS res
```
```text
┌─res─┐
│ 1 │
└─────┘
@ -102,7 +102,7 @@ SELECT pointInPolygon((3., 3.), [(6, 0), (8, 4), (5, 8), (0, 2)]) AS res
## geohashEncode
Encodes latitude and longitude as a geohash string; see http://geohash.org/ and https://en.wikipedia.org/wiki/Geohash.
```sql
geohashEncode(longitude, latitude, [precision])
```
@ -118,11 +118,11 @@ geohashEncode(longitude, latitude, [precision])
**Example**
```sql
SELECT geohashEncode(-5.60302734375, 42.593994140625, 0) AS res
```
```text
┌─res──────────┐
│ ezs42d000000 │
└──────────────┘
@ -142,11 +142,11 @@ Decodes any geohash-encoded string into longitude and latitude.
**Example**
```sql
SELECT geohashDecode('ezs42') AS res
```
```text
┌─res─────────────────────────────┐
│ (-5.60302734375,42.60498046875) │
└─────────────────────────────────┘
@ -156,7 +156,7 @@ SELECT geohashDecode('ezs42') AS res
Calculates [H3](https://uber.github.io/h3/#/documentation/overview/introduction) point index `(lon, lat)` with specified resolution.
```sql
geoToH3(lon, lat, resolution)
```
@ -175,10 +175,10 @@ Type: [UInt64](../../data_types/int_uint.md).
**Example**
```sql
SELECT geoToH3(37.79506683, 55.71290588, 15) as h3Index
```
```text
┌────────────h3Index─┐
│ 644325524701193974 │
└────────────────────┘
@ -207,10 +207,10 @@ Please note that function will throw an exception if resulting array is over 10'
**Example**
```sql
SELECT geohashesInBox(24.48, 40.56, 24.785, 40.81, 4) AS thasos
```
```text
┌─thasos──────────────────────────────────────┐
│ ['sx1q','sx1r','sx32','sx1w','sx1x','sx38'] │
└─────────────────────────────────────────────┘


@ -6,7 +6,7 @@ Hash functions can be used for the deterministic pseudo-random shuffling of elem
[Interprets](../../query_language/functions/type_conversion_functions.md#type_conversion_functions-reinterpretAsString) all the input parameters as strings and calculates the [MD5](https://en.wikipedia.org/wiki/MD5) hash value for each of them. Then combines hashes, takes the first 8 bytes of the hash of the resulting string, and interprets them as `UInt64` in big-endian byte order.
```sql
halfMD5(par1, ...)
```
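A usage sketch (the argument values are arbitrary illustrative data):

```sql
SELECT halfMD5(array('e','x','a'), 'mple', 10, toDateTime('2019-06-15 23:00:00')) AS halfMD5hash, toTypeName(halfMD5hash) AS type
```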
@ -42,7 +42,7 @@ If you want to get the same result as output by the md5sum utility, use lower(he
Produces a 64-bit [SipHash](https://131002.net/siphash/) hash value.
```sql
sipHash64(par1,...)
```
@ -68,7 +68,7 @@ A [UInt64](../../data_types/int_uint.md) data type hash value.
```sql
SELECT sipHash64(array('e','x','a'), 'mple', 10, toDateTime('2019-06-15 23:00:00')) AS SipHash, toTypeName(SipHash) AS type
```
```text
┌──────────────SipHash─┬─type───┐
│ 13726873534472839665 │ UInt64 │
└──────────────────────┴────────┘
@ -84,7 +84,7 @@ Differs from sipHash64 in that the final xor-folding state is only done up to 12
Produces a 64-bit [CityHash](https://github.com/google/cityhash) hash value.
```sql
cityHash64(par1,...)
```
@ -150,7 +150,7 @@ Levels are the same as in URLHierarchy. This function is specific to Yandex.Metr
Produces a 64-bit [FarmHash](https://github.com/google/farmhash) hash value.
```sql
farmHash64(par1, ...)
```
@ -191,7 +191,7 @@ This is just [JavaHash](#hash_functions-javahash) with zeroed out sign bit. This
Produces a 64-bit [MetroHash](http://www.jandrewrogers.com/2015/05/27/metrohash/) hash value.
```sql
metroHash64(par1, ...)
```
@ -224,7 +224,7 @@ For more information, see the link: [JumpConsistentHash](https://arxiv.org/pdf/1
Produces a [MurmurHash2](https://github.com/aappleby/smhasher) hash value.
```sql
murmurHash2_32(par1, ...)
murmurHash2_64(par1, ...)
```
@ -253,7 +253,7 @@ SELECT murmurHash2_64(array('e','x','a'), 'mple', 10, toDateTime('2019-06-15 23:
Produces a [MurmurHash3](https://github.com/aappleby/smhasher) hash value.
```sql
murmurHash3_32(par1, ...)
murmurHash3_64(par1, ...)
```
@ -282,7 +282,7 @@ SELECT murmurHash3_32(array('e','x','a'), 'mple', 10, toDateTime('2019-06-15 23:
Produces a 128-bit [MurmurHash3](https://github.com/aappleby/smhasher) hash value.
```sql
murmurHash3_128( expr )
```


@ -25,18 +25,20 @@ Returns an array obtained from the original application of the `func` function t
Examples:
```sql
SELECT arrayMap(x -> (x + 2), [1, 2, 3]) as res;
```
```text
┌─res─────┐
│ [3,4,5] │
└─────────┘
```
The following example shows how to create a tuple of elements from different arrays:
```sql
SELECT arrayMap((x, y) -> (x, y), [1, 2, 3], [4, 5, 6]) AS res
```
```text
┌─res─────────────────┐
│ [(1,4),(2,5),(3,6)] │
└─────────────────────┘
@ -50,17 +52,17 @@ Returns an array containing only the elements in `arr1` for which `func` returns
Examples:
```sql
SELECT arrayFilter(x -> x LIKE '%World%', ['Hello', 'abc World']) AS res
```
```text
┌─res───────────┐
│ ['abc World'] │
└───────────────┘
```
```sql
SELECT
arrayFilter(
(i, x) -> x LIKE '%World%',
@ -69,7 +71,7 @@ SELECT
AS res
```
```text
┌─res─┐
│ [2] │
└─────┘
@ -111,11 +113,11 @@ Returns an array of partial sums of elements in the source array (a running sum)
Example:
```sql
SELECT arrayCumSum([1, 1, 1, 1]) AS res
```
```text
┌─res──────────┐
│ [1, 2, 3, 4] │
└──────────────┘
@ -125,11 +127,11 @@ SELECT arrayCumSum([1, 1, 1, 1]) AS res
Same as `arrayCumSum`: returns an array of partial sums of the elements in the source array (a running sum). Unlike `arrayCumSum`, whenever the running sum drops below zero, it is replaced with zero and the subsequent calculation continues from zero. For example:
```sql
SELECT arrayCumSumNonNegative([1, 1, -4, 1]) AS res
```
```text
┌─res───────┐
│ [1,2,0,1] │
└───────────┘
@ -143,11 +145,11 @@ The [Schwartzian transform](https://en.wikipedia.org/wiki/Schwartzian_transform)
Example:
```sql
SELECT arraySort((x, y) -> y, ['hello', 'world'], [2, 1]);
```
```text
┌─res────────────────┐
│ ['world', 'hello'] │
└────────────────────┘
@ -161,10 +163,10 @@ Returns an array as result of sorting the elements of `arr1` in descending order
Example:
```sql
SELECT arrayReverseSort((x, y) -> y, ['hello', 'world'], [2, 1]) as res;
```
```text
┌─res───────────────┐
│ ['hello','world'] │
└───────────────────┘


@ -14,7 +14,7 @@ Similar to IPv4NumToString, but using xxx instead of the last octet.
Example:
```sql
SELECT
IPv4NumToStringClassC(ClientIP) AS k,
count() AS c
@ -24,7 +24,7 @@ ORDER BY c DESC
LIMIT 10
```
```text
┌─k──────────────┬─────c─┐
│ 83.149.9.xxx │ 26238 │
│ 217.118.81.xxx │ 26074 │
@ -46,17 +46,17 @@ Since using 'xxx' is highly unusual, this may be changed in the future. We recom
Accepts a FixedString(16) value containing the IPv6 address in binary format. Returns a string containing this address in text format.
IPv6-mapped IPv4 addresses are output in the format ::ffff:111.222.33.44. Examples:
```sql
SELECT IPv6NumToString(toFixedString(unhex('2A0206B8000000000000000000000011'), 16)) AS addr
```
```text
┌─addr─────────┐
│ 2a02:6b8::11 │
└──────────────┘
```
```sql
SELECT
IPv6NumToString(ClientIP6 AS k),
count() AS c
@ -67,7 +67,7 @@ ORDER BY c DESC
LIMIT 10
```
```text
┌─IPv6NumToString(ClientIP6)──────────────┬─────c─┐
│ 2a02:2168:aaa:bbbb::2 │ 24695 │
│ 2a02:2698:abcd:abcd:abcd:abcd:8888:5555 │ 22408 │
@ -82,7 +82,7 @@ LIMIT 10
└─────────────────────────────────────────┴───────┘
```
```sql
SELECT
IPv6NumToString(ClientIP6 AS k),
count() AS c
@ -93,7 +93,7 @@ ORDER BY c DESC
LIMIT 10
```
```text
┌─IPv6NumToString(ClientIP6)─┬──────c─┐
│ ::ffff:94.26.111.111 │ 747440 │
│ ::ffff:37.143.222.4 │ 529483 │
@ -117,11 +117,11 @@ HEX can be uppercase or lowercase.
Takes a `UInt32` number. Interprets it as an IPv4 address in [big endian](https://en.wikipedia.org/wiki/Endianness). Returns a `FixedString(16)` value containing the IPv6 address in binary format. Examples:
```sql
SELECT IPv6NumToString(IPv4ToIPv6(IPv4StringToNum('192.168.0.1'))) AS addr
```
```text
┌─addr───────────────┐
│ ::ffff:192.168.0.1 │
└────────────────────┘
@ -131,7 +131,7 @@ SELECT IPv6NumToString(IPv4ToIPv6(IPv4StringToNum('192.168.0.1'))) AS addr
Accepts a FixedString(16) value containing the IPv6 address in binary format. Returns a string containing the address of the specified number of bits removed in text format. For example:
```sql
WITH
IPv6StringToNum('2001:0DB8:AC10:FE01:FEED:BABE:CAFE:F00D') AS ipv6,
IPv4ToIPv6(IPv4StringToNum('192.168.0.1')) AS ipv4
@ -141,7 +141,7 @@ SELECT
```
```text
┌─cutIPv6(ipv6, 2, 0)─────────────────┬─cutIPv6(ipv4, 0, 2)─┐
│ 2001:db8:ac10:fe01:feed:babe:cafe:0 │ ::ffff:192.168.0.0 │
└─────────────────────────────────────┴─────────────────────┘
@ -155,7 +155,7 @@ Accepts an IPv4 and an UInt8 value containing the [CIDR](https://en.wikipedia.or
```sql
SELECT IPv4CIDRToRange(toIPv4('192.168.5.2'), 16)
```
```text
┌─IPv4CIDRToRange(toIPv4('192.168.5.2'), 16)─┐
│ ('192.168.0.0','192.168.255.255') │
└────────────────────────────────────────────┘
@ -171,7 +171,7 @@ Accepts an IPv6 and an UInt8 value containing the CIDR. Return a tuple with two
SELECT IPv6CIDRToRange(toIPv6('2001:0db8:0000:85a3:0000:0000:ac1f:8001'), 32);
```
```text
┌─IPv6CIDRToRange(toIPv6('2001:0db8:0000:85a3:0000:0000:ac1f:8001'), 32)─┐
│ ('2001:db8::','2001:db8:ffff:ffff:ffff:ffff:ffff:ffff') │
└────────────────────────────────────────────────────────────────────────┘
@ -181,7 +181,7 @@ SELECT IPv6CIDRToRange(toIPv6('2001:0db8:0000:85a3:0000:0000:ac1f:8001'), 32);
An alias to `IPv4StringToNum()` that takes a string form of IPv4 address and returns value of [IPv4](../../data_types/domains/ipv4.md) type, which is binary equal to value returned by `IPv4StringToNum()`.
```sql
WITH
'171.225.130.45' as IPv4_string
SELECT
@ -189,13 +189,13 @@ SELECT
toTypeName(toIPv4(IPv4_string))
```
```text
┌─toTypeName(IPv4StringToNum(IPv4_string))─┬─toTypeName(toIPv4(IPv4_string))─┐
│ UInt32 │ IPv4 │
└──────────────────────────────────────────┴─────────────────────────────────┘
```
```sql
WITH
'171.225.130.45' as IPv4_string
SELECT
@ -203,7 +203,7 @@ SELECT
hex(toIPv4(IPv4_string))
```
```text
┌─hex(IPv4StringToNum(IPv4_string))─┬─hex(toIPv4(IPv4_string))─┐
│ ABE1822D │ ABE1822D │
└───────────────────────────────────┴──────────────────────────┘
@ -213,7 +213,7 @@ SELECT
An alias to `IPv6StringToNum()` that takes a string form of IPv6 address and returns value of [IPv6](../../data_types/domains/ipv6.md) type, which is binary equal to value returned by `IPv6StringToNum()`.
```sql
WITH
'2001:438:ffff::407d:1bc1' as IPv6_string
SELECT
@ -221,13 +221,13 @@ SELECT
toTypeName(toIPv6(IPv6_string))
```
```text
┌─toTypeName(IPv6StringToNum(IPv6_string))─┬─toTypeName(toIPv6(IPv6_string))─┐
│ FixedString(16) │ IPv6 │
└──────────────────────────────────────────┴─────────────────────────────────┘
```
```sql
WITH
'2001:438:ffff::407d:1bc1' as IPv6_string
SELECT
@ -235,7 +235,7 @@ SELECT
hex(toIPv6(IPv6_string))
```
```text
┌─hex(IPv6StringToNum(IPv6_string))─┬─hex(toIPv6(IPv6_string))─────────┐
│ 20010438FFFF000000000000407D1BC1 │ 20010438FFFF000000000000407D1BC1 │
└───────────────────────────────────┴──────────────────────────────────┘


@ -35,7 +35,7 @@ Returns the value of a field, including separators.
Examples:
```sql
visitParamExtractRaw('{"abc":"\\n\\u0000"}', 'abc') = '"\\n\\u0000"'
visitParamExtractRaw('{"abc":{"def":[1,2,3]}}', 'abc') = '{"def":[1,2,3]}'
```
@ -46,7 +46,7 @@ Parses the string in double quotes. The value is unescaped. If unescaping failed
Examples:
```sql
visitParamExtractString('{"abc":"\\n\\u0000"}', 'abc') = '\n\0'
visitParamExtractString('{"abc":"\\u263a"}', 'abc') = '☺'
visitParamExtractString('{"abc":"\\u263"}', 'abc') = ''
@ -65,9 +65,9 @@ If the value does not exist, `0` will be returned.
Examples:
```sql
SELECT JSONHas('{"a": "hello", "b": [-100, 200.0, 300]}', 'b') = 1
SELECT JSONHas('{"a": "hello", "b": [-100, 200.0, 300]}', 'b', 4) = 0
```
`indices_or_keys` is a list of zero or more arguments each of them can be either string or integer.
@ -82,12 +82,12 @@ You may use integers to access both JSON arrays and JSON objects.
So, for example:
```sql
SELECT JSONExtractKey('{"a": "hello", "b": [-100, 200.0, 300]}', 1) = 'a'
SELECT JSONExtractKey('{"a": "hello", "b": [-100, 200.0, 300]}', 2) = 'b'
SELECT JSONExtractKey('{"a": "hello", "b": [-100, 200.0, 300]}', -1) = 'b'
SELECT JSONExtractKey('{"a": "hello", "b": [-100, 200.0, 300]}', -2) = 'a'
SELECT JSONExtractString('{"a": "hello", "b": [-100, 200.0, 300]}', 1) = 'hello'
```
## JSONLength(json[, indices_or_keys]...)
@ -98,9 +98,9 @@ If the value does not exist or has a wrong type, `0` will be returned.
Examples:
```sql
SELECT JSONLength('{"a": "hello", "b": [-100, 200.0, 300]}', 'b') = 3
SELECT JSONLength('{"a": "hello", "b": [-100, 200.0, 300]}') = 2
```
## JSONType(json[, indices_or_keys]...)
@ -111,10 +111,10 @@ If the value does not exist, `Null` will be returned.
Examples:
```sql
SELECT JSONType('{"a": "hello", "b": [-100, 200.0, 300]}') = 'Object'
SELECT JSONType('{"a": "hello", "b": [-100, 200.0, 300]}', 'a') = 'String'
SELECT JSONType('{"a": "hello", "b": [-100, 200.0, 300]}', 'b') = 'Array'
```
## JSONExtractUInt(json[, indices_or_keys]...)
@ -128,10 +128,10 @@ If the value does not exist or has a wrong type, `0` will be returned.
Examples:
```sql
SELECT JSONExtractInt('{"a": "hello", "b": [-100, 200.0, 300]}', 'b', 1) = -100
SELECT JSONExtractFloat('{"a": "hello", "b": [-100, 200.0, 300]}', 'b', 2) = 200.0
SELECT JSONExtractUInt('{"a": "hello", "b": [-100, 200.0, 300]}', 'b', -1) = 300
```
## JSONExtractString(json[, indices_or_keys]...)
@ -144,12 +144,12 @@ The value is unescaped. If unescaping failed, it returns an empty string.
Examples:
```sql
SELECT JSONExtractString('{"a": "hello", "b": [-100, 200.0, 300]}', 'a') = 'hello'
SELECT JSONExtractString('{"abc":"\\n\\u0000"}', 'abc') = '\n\0'
SELECT JSONExtractString('{"abc":"\\u263a"}', 'abc') = '☺'
SELECT JSONExtractString('{"abc":"\\u263"}', 'abc') = ''
SELECT JSONExtractString('{"abc":"hello}', 'abc') = ''
```
## JSONExtract(json[, indices_or_keys...], return_type)
@ -163,7 +163,7 @@ This means
Examples:
```sql
SELECT JSONExtract('{"a": "hello", "b": [-100, 200.0, 300]}', 'Tuple(String, Array(Float64))') = ('hello',[-100,200,300])
SELECT JSONExtract('{"a": "hello", "b": [-100, 200.0, 300]}', 'Tuple(b Array(Float64), a String)') = ([-100,200,300],'hello')
SELECT JSONExtract('{"a": "hello", "b": [-100, 200.0, 300]}', 'b', 'Array(Nullable(Int8))') = [-100, NULL, NULL]
@ -179,7 +179,7 @@ Parse key-value pairs from a JSON where the values are of the given ClickHouse d
Example:
```sql
SELECT JSONExtractKeysAndValues('{"x": {"a": 5, "b": 7, "c": 11}}', 'x', 'Int8') = [('a',5),('b',7),('c',11)];
```
@ -191,8 +191,8 @@ If the part does not exist or has a wrong type, an empty string will be returned
Example:
```sql
SELECT JSONExtractRaw('{"a": "hello", "b": [-100, 200.0, 300]}', 'b') = '[-100, 200.0, 300]'
```
[Original article](https://clickhouse.yandex/docs/en/query_language/functions/json_functions/) <!--hide-->
View File
@ -48,11 +48,11 @@ If 'x' is non-negative, then erf(x / σ√2) is the probability that a random
Example (three sigma rule):
```sql
SELECT erf(3 / sqrt(2))
```
```text
┌─erf(divide(3, sqrt(2)))─┐
│ 0.9973002039367398 │
└─────────────────────────┘
View File
@ -8,7 +8,7 @@ Returns a string with the name of the host that this function was performed on.
Extracts the trailing part of a string after the last slash or backslash. This function is often used to extract the filename from a path.
```sql
basename( expr )
```
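For instance, a minimal usage sketch (the sample paths are illustrative; expected results are shown as comments):

```sql
SELECT basename('/home/user/some_file.txt')   -- 'some_file.txt'
SELECT basename('some\\path\\some_file.txt')  -- 'some_file.txt' (backslashes count as separators too)
```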
@ -60,9 +60,10 @@ This function is used by the system for implementing Pretty formats.
`NULL` is represented as a string corresponding to `NULL` in `Pretty` formats.
```sql
SELECT visibleWidth(NULL)
```
```text
┌─visibleWidth(NULL)─┐
│ 4 │
└────────────────────┘
@ -139,7 +140,7 @@ The band is drawn with accuracy to one eighth of a symbol.
Example:
```sql
SELECT
toHour(EventTime) AS h,
count() AS c,
@ -149,7 +150,7 @@ GROUP BY h
ORDER BY h ASC
```
```text
┌──h─┬──────c─┬─bar────────────────┐
│ 0 │ 292907 │ █████████▋ │
│ 1 │ 180563 │ ██████ │
@ -208,7 +209,7 @@ If the 'x' value is equal to one of the elements in the 'array_from' array, it r
Example:
```sql
SELECT
transform(SearchEngineID, [2, 3], ['Yandex', 'Google'], 'Other') AS title,
count() AS c
@ -218,7 +219,7 @@ GROUP BY title
ORDER BY c DESC
```
```text
┌─title─────┬──────c─┐
│ Yandex │ 498635 │
│ Google │ 229872 │
@ -237,7 +238,7 @@ Types:
Example:
```sql
SELECT
transform(domain(Referer), ['yandex.ru', 'google.ru', 'vk.com'], ['www.yandex', 'example.com']) AS s,
count() AS c
@ -247,7 +248,7 @@ ORDER BY count() DESC
LIMIT 10
```
```text
┌─s──────────────┬───────c─┐
│ │ 2906259 │
│ www.yandex │ 867767 │
@ -267,13 +268,13 @@ Accepts the size (number of bytes). Returns a rounded size with a suffix (KiB, M
Example:
```sql
SELECT
arrayJoin([1, 1024, 1024*1024, 192851925]) AS filesize_bytes,
formatReadableSize(filesize_bytes) AS filesize
```
```text
┌─filesize_bytes─┬─filesize───┐
│ 1 │ 1.00 B │
│ 1024 │ 1.00 KiB │
@ -325,7 +326,7 @@ If you make a subquery with ORDER BY and call the function from outside the subq
If the `offset` value is outside the block bounds, a default value for `column` is returned. If the `default_value` argument is given, it is used instead.
This function can be used to compute a year-over-year metric value:
```sql
WITH toDate('2018-01-01') AS start_date
SELECT
toStartOfMonth(start_date + (number * 32)) AS month,
@ -335,7 +336,7 @@ SELECT
FROM numbers(16)
```
```text
┌──────month─┬─money─┬─prev_year─┬─year_over_year─┐
│ 2018-01-01 │ 32 │ 0 │ 0 │
│ 2018-02-01 │ 63 │ 0 │ 0 │
@ -367,7 +368,7 @@ If you make a subquery with ORDER BY and call the function from outside the subq
Example:
```sql
SELECT
EventID,
EventTime,
@ -384,7 +385,7 @@ FROM
)
```
```text
┌─EventID─┬───────────EventTime─┬─delta─┐
│ 1106 │ 2016-11-24 00:00:04 │ 0 │
│ 1107 │ 2016-11-24 00:00:05 │ 1 │
@ -396,19 +397,22 @@ FROM
Please note that the block size affects the result. With each new block, the `runningDifference` state is reset.
```sql
SELECT
number,
runningDifference(number + 1) AS diff
FROM numbers(100000)
WHERE diff != 1
```
```
```text
┌─number─┬─diff─┐
│ 0 │ 0 │
└────────┴──────┘
┌─number─┬─diff─┐
│ 65536 │ 0 │
└────────┴──────┘
```
```sql
SET max_block_size = 100000 -- default value is 65536!
SELECT
@ -416,6 +420,8 @@ SELECT
runningDifference(number + 1) AS diff
FROM numbers(100000)
WHERE diff != 1
```
```text
┌─number─┬─diff─┐
│ 0 │ 0 │
└────────┴──────┘
@ -441,7 +447,7 @@ Accepts a MAC address in the format AA:BB:CC:DD:EE:FF (colon-separated numbers i
Returns the number of fields in [Enum](../../data_types/enum.md).
```sql
getSizeOfEnumType(value)
```
@ -456,9 +462,10 @@ getSizeOfEnumType(value)
**Example**
```sql
SELECT getSizeOfEnumType( CAST('a' AS Enum8('a' = 1, 'b' = 2) ) ) AS x
```
```text
┌─x─┐
│ 2 │
└───┘
@ -468,7 +475,7 @@ SELECT getSizeOfEnumType( CAST('a' AS Enum8('a' = 1, 'b' = 2) ) ) AS x
Returns the name of the class that represents the data type of the column in RAM.
```sql
toColumnTypeName(value)
```
@ -482,21 +489,18 @@ toColumnTypeName(value)
**Example of the difference between `toTypeName` and `toColumnTypeName`**
```sql
SELECT toTypeName(CAST('2018-01-01 01:02:03' AS DateTime))
```
```text
┌─toTypeName(CAST('2018-01-01 01:02:03', 'DateTime'))─┐
│ DateTime │
└─────────────────────────────────────────────────────┘
```
```sql
SELECT toColumnTypeName(CAST('2018-01-01 01:02:03' AS DateTime))
```
```text
┌─toColumnTypeName(CAST('2018-01-01 01:02:03', 'DateTime'))─┐
│ Const(UInt32) │
└───────────────────────────────────────────────────────────┘
@ -508,7 +512,7 @@ The example shows that the `DateTime` data type is stored in memory as `Const(UI
Outputs a detailed description of data structures in RAM.
```sql
dumpColumnStructure(value)
```
@ -522,9 +526,10 @@ dumpColumnStructure(value)
**Example**
```sql
SELECT dumpColumnStructure(CAST('2018-01-01 01:02:03', 'DateTime'))
```
```text
┌─dumpColumnStructure(CAST('2018-01-01 01:02:03', 'DateTime'))─┐
│ DateTime, Const(size = 1, UInt32(size = 1)) │
└──────────────────────────────────────────────────────────────┘
@ -536,7 +541,7 @@ Outputs the default value for the data type.
Does not include default values for custom columns set by the user.
```sql
defaultValueOfArgumentType(expression)
```
@ -552,26 +557,21 @@ defaultValueOfArgumentType(expression)
**Example**
```sql
SELECT defaultValueOfArgumentType( CAST(1 AS Int8) )
```
```text
┌─defaultValueOfArgumentType(CAST(1, 'Int8'))─┐
│ 0 │
└─────────────────────────────────────────────┘
```
```sql
SELECT defaultValueOfArgumentType( CAST(1 AS Nullable(Int8) ) )
```
```text
┌─defaultValueOfArgumentType(CAST(1, 'Nullable(Int8)'))─┐
│ ᴺᵁᴸᴸ │
└───────────────────────────────────────────────────────┘
```
## indexHint
@ -588,9 +588,10 @@ The expression passed to the function is not calculated, but ClickHouse applies
Here is a table with the test data for [ontime](../../getting_started/example_datasets/ontime.md).
```sql
SELECT count() FROM ontime
```
```text
┌─count()─┐
│ 4276457 │
└─────────┘
@ -600,15 +601,10 @@ The table has indexes for the fields `(FlightDate, (Year, FlightDate))`.
Create a selection by date like this:
```sql
SELECT FlightDate AS k, count() FROM ontime GROUP BY k ORDER BY k
```
```text
┌──────────k─┬─count()─┐
│ 2017-01-01 │ 13970 │
@ -618,37 +614,24 @@ ORDER BY k ASC
│ 2017-09-29 │ 16384 │
│ 2017-09-30 │ 12520 │
└────────────┴─────────┘
273 rows in set. Elapsed: 0.072 sec. Processed 4.28 million rows, 8.55 MB (59.00 million rows/s., 118.01 MB/s.)
```
In this selection, the index is not used and ClickHouse processed the entire table (`Processed 4.28 million rows`). To apply the index, select a specific date and run the following query:
```sql
SELECT FlightDate AS k, count() FROM ontime WHERE k = '2017-09-15' GROUP BY k ORDER BY k
```
```text
┌──────────k─┬─count()─┐
│ 2017-09-15 │ 16428 │
└────────────┴─────────┘
1 rows in set. Elapsed: 0.014 sec. Processed 32.74 thousand rows, 65.49 KB (2.31 million rows/s., 4.63 MB/s.)
```
The last line of output shows that by using the index, ClickHouse processed a significantly smaller number of rows (`Processed 32.74 thousand rows`).
Now pass the expression `k = '2017-09-15'` to the `indexHint` function:
```sql
SELECT
FlightDate AS k,
count()
@ -656,15 +639,14 @@ FROM ontime
WHERE indexHint(k = '2017-09-15')
GROUP BY k
ORDER BY k ASC
```
```text
┌──────────k─┬─count()─┐
│ 2017-09-14 │ 7071 │
│ 2017-09-15 │ 16428 │
│ 2017-09-16 │ 1077 │
│ 2017-09-30 │ 8167 │
└────────────┴─────────┘
4 rows in set. Elapsed: 0.004 sec. Processed 32.74 thousand rows, 65.49 KB (8.97 million rows/s., 17.94 MB/s.)
```
The query response shows that ClickHouse applied the index in the same way as the previous time (`Processed 32.74 thousand rows`). However, the resulting set of rows shows that the expression `k = '2017-09-15'` was not used when generating the result.
@ -677,7 +659,7 @@ Creates an array with a single value.
Used for internal implementation of [arrayJoin](array_join.md#functions_arrayjoin).
```sql
replicate(x, arr)
```
@ -692,9 +674,10 @@ replicate(x, arr)
**Example**
```sql
SELECT replicate(1, ['a', 'b', 'c'])
```
```text
┌─replicate(1, ['a', 'b', 'c'])─┐
│ [1,1,1] │
└───────────────────────────────┘
@ -704,7 +687,7 @@ SELECT replicate(1, ['a', 'b', 'c'])
Returns the amount of remaining space in the filesystem where the files of the databases are located. See the [path](../../operations/server_settings/settings.md#server_settings-path) server setting description.
```sql
filesystemAvailable()
```
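As a hedged usage sketch, it can be combined with `formatReadableSize` (described earlier in this document) to get a human-readable value:

```sql
SELECT formatReadableSize(filesystemAvailable()) AS available_space
```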
@ -756,7 +739,8 @@ custom_message - is an optional parameter: a constant string, provides an error
```sql
SELECT throwIf(number = 3, 'Too many') FROM numbers(10);
```
```text
↙ Progress: 0.00 rows, 0.00 B (0.00 rows/s., 0.00 B/s.) Received exception from server (version 19.14.1):
Code: 395. DB::Exception: Received from localhost:9000. DB::Exception: Too many.
```
@ -767,7 +751,8 @@ Returns the same value that was used as its argument.
```sql
SELECT identity(42)
```
```text
┌─identity(42)─┐
│ 42 │
└──────────────┘
View File
@ -22,7 +22,7 @@ Rounds a value to a specified number of decimal places.
The function returns the nearest number of the specified order. When the given number is equidistant from the two surrounding numbers, the function returns the one having the nearest even digit (banker's rounding).
```sql
round(expression [, decimal_places])
```
@ -42,10 +42,10 @@ The rounded number of the same type as the input number.
**Example of use**
```sql
SELECT number / 2 AS x, round(x) FROM system.numbers LIMIT 3
```
```text
┌───x─┬─round(divide(number, 2))─┐
│ 0 │ 0 │
│ 0.5 │ 0 │
@ -57,7 +57,7 @@ SELECT number / 2 AS x, round(x) FROM system.numbers LIMIT 3
Rounding to the nearest number.
```text
round(3.2, 0) = 3
round(4.1267, 2) = 4.13
round(22,-1) = 20
@ -67,7 +67,7 @@ round(-467,-2) = -500
Banker's rounding.
```text
round(3.5) = 4
round(4.5) = 4
round(3.55, 1) = 3.6
View File
@ -20,9 +20,10 @@ Selects substrings of consecutive bytes from the ranges a-z and A-Z. Returns an a
**Example:**
```sql
SELECT alphaTokens('abca1abc')
```
```text
┌─alphaTokens('abca1abc')─┐
│ ['abca','abc'] │
└─────────────────────────┘
View File
@ -64,7 +64,7 @@ Returns 1, if the set of bytes is valid UTF-8 encoded, otherwise 0.
Replaces invalid UTF-8 characters with the `�` (U+FFFD) character. All consecutive invalid characters are collapsed into a single replacement character.
```sql
toValidUTF8( input_string )
```
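A minimal sketch of the behavior described above; the `\xF0\x80\x80\x80` byte sequence is intentionally invalid UTF-8:

```sql
SELECT toValidUTF8('\x61\xF0\x80\x80\x80b')
-- the whole run of invalid bytes collapses into a single replacement character: 'a�b'
```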
@ -100,13 +100,16 @@ Formatting constant pattern with the string listed in the arguments. `pattern` i
```sql
SELECT format('{1} {0} {1}', 'World', 'Hello')
```
```text
┌─format('{1} {0} {1}', 'World', 'Hello')─┐
│ Hello World Hello │
└─────────────────────────────────────────┘
```
```sql
SELECT format('{} {}', 'Hello', 'World')
```
```text
┌─format('{} {}', 'Hello', 'World')─┐
│ Hello World │
└───────────────────────────────────┘
View File
@ -19,7 +19,7 @@ Also keep in mind that a string literal requires an extra escape.
Example 1. Converting the date to American format:
```sql
SELECT DISTINCT
EventDate,
replaceRegexpOne(toString(EventDate), '(\\d{4})-(\\d{2})-(\\d{2})', '\\2/\\3/\\1') AS res
@ -28,7 +28,7 @@ LIMIT 7
FORMAT TabSeparated
```
```text
2014-03-17 03/17/2014
2014-03-18 03/18/2014
2014-03-19 03/19/2014
@ -40,11 +40,11 @@ FORMAT TabSeparated
Example 2. Copying a string ten times:
```sql
SELECT replaceRegexpOne('Hello, World!', '.*', '\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0') AS res
```
```text
┌─res────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ Hello, World!Hello, World!Hello, World!Hello, World!Hello, World!Hello, World!Hello, World!Hello, World!Hello, World!Hello, World! │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
@ -54,11 +54,11 @@ SELECT replaceRegexpOne('Hello, World!', '.*', '\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0')
This does the same thing, but replaces all the occurrences. Example:
```sql
SELECT replaceRegexpAll('Hello, World!', '.', '\\0\\0') AS res
```
```text
┌─res────────────────────────┐
│ HHeelllloo,, WWoorrlldd!! │
└────────────────────────────┘
@ -67,11 +67,11 @@ SELECT replaceRegexpAll('Hello, World!', '.', '\\0\\0') AS res
As an exception, if a regular expression worked on an empty substring, the replacement is not made more than once.
Example:
```sql
SELECT replaceRegexpAll('Hello, World!', '^', 'here: ') AS res
```
```text
┌─res─────────────────┐
│ here: Hello, World! │
└─────────────────────┘
View File
@ -194,7 +194,7 @@ When converting dates with times to numbers or vice versa, the date with time co
The date and date-with-time formats for the toDate/toDateTime functions are defined as follows:
```text
YYYY-MM-DD
YYYY-MM-DD hh:mm:ss
```
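For example, a small sketch of parsing both formats:

```sql
SELECT
    toDate('2019-09-23') AS date,
    toDateTime('2019-09-23 18:31:46') AS date_time
```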
@ -207,13 +207,13 @@ Conversion between numeric types uses the same rules as assignments between diff
Additionally, the toString function of the DateTime argument can take a second String argument containing the name of the time zone, for example `Asia/Yekaterinburg`. In this case, the time is formatted according to the specified time zone.
```sql
SELECT
now() AS now_local,
toString(now(), 'Asia/Yekaterinburg') AS now_yekat
```
```text
┌───────────now_local─┬─now_yekat───────────┐
│ 2016-06-15 00:11:21 │ 2016-06-15 02:11:21 │
└─────────────────────┴─────────────────────┘
@ -232,21 +232,21 @@ Accepts a String or FixedString argument. Returns the String with the content tr
Example:
```sql
SELECT toFixedString('foo', 8) AS s, toStringCutToZero(s) AS s_cut
```
```text
┌─s─────────────┬─s_cut─┐
│ foo\0\0\0\0\0 │ foo │
└───────────────┴───────┘
```
```sql
SELECT toFixedString('foo\0bar', 8) AS s, toStringCutToZero(s) AS s_cut
```
```text
┌─s──────────┬─s_cut─┐
│ foo\0bar\0 │ foo │
└────────────┴───────┘
@ -278,7 +278,7 @@ Converts 'x' to the 't' data type. The syntax CAST(x AS t) is also supported.
Example:
```sql
SELECT
'2016-06-15 23:00:00' AS timestamp,
CAST(timestamp AS DateTime) AS datetime,
@ -287,7 +287,7 @@ SELECT
CAST(timestamp, 'FixedString(22)') AS fixed_string
```
```text
┌─timestamp───────────┬────────────datetime─┬───────date─┬─string──────────────┬─fixed_string──────────────┐
│ 2016-06-15 23:00:00 │ 2016-06-15 23:00:00 │ 2016-06-15 │ 2016-06-15 23:00:00 │ 2016-06-15 23:00:00\0\0\0 │
└─────────────────────┴─────────────────────┴────────────┴─────────────────────┴───────────────────────────┘
@ -297,16 +297,19 @@ Conversion to FixedString(N) only works for arguments of type String or FixedStr
Type conversion to [Nullable](../../data_types/nullable.md) and back is supported. Example:
```sql
SELECT toTypeName(x) FROM t_null
```
```text
┌─toTypeName(x)─┐
│ Int8 │
│ Int8 │
└───────────────┘
```
```sql
SELECT toTypeName(CAST(x, 'Nullable(UInt16)')) FROM t_null
```
```text
┌─toTypeName(CAST(x, 'Nullable(UInt16)'))─┐
│ Nullable(UInt16) │
│ Nullable(UInt16) │
@ -328,7 +331,7 @@ SELECT
date + interval_to_week
```
```text
┌─plus(date, interval_week)─┬─plus(date, interval_to_week)─┐
│ 2019-01-08 │ 2019-01-08 │
└───────────────────────────┴──────────────────────────────┘
View File
@ -16,7 +16,7 @@ Examples of typical returned values: http, https, ftp, mailto, tel, magnet...
Extracts the hostname from a URL.
```sql
domain(url)
```
@ -27,7 +27,7 @@ domain(url)
The URL can be specified with or without a scheme. Examples:
```text
svn+ssh://some.svn-hosting.com:80/repo/trunk
some.svn-hosting.com:80/repo/trunk
https://yandex.com/time/
@ -35,7 +35,7 @@ https://yandex.com/time/
For these examples, the `domain` function returns the following results:
```text
some.svn-hosting.com
some.svn-hosting.com
yandex.com
@ -67,7 +67,7 @@ Returns the domain and removes no more than one 'www.' from the beginning of it,
Extracts the top-level domain from a URL.
```sql
topLevelDomain(url)
```
@ -77,7 +77,7 @@ topLevelDomain(url)
The URL can be specified with or without a scheme. Examples:
```text
svn+ssh://some.svn-hosting.com:80/repo/trunk
some.svn-hosting.com:80/repo/trunk
https://yandex.com/time/
@ -151,7 +151,7 @@ Returns an array containing the URL, truncated at the end by the symbols /,? in
The same as above, but without the protocol and host in the result. The / element (root) is not included. Example: this function is used for implementing tree reports of URLs in Yandex.Metrica.
```text
URLPathHierarchy('https://example.com/browse/CONV-6788') =
[
'/browse/',
@ -164,11 +164,11 @@ URLPathHierarchy('https://example.com/browse/CONV-6788') =
Returns the decoded URL.
Example:
```sql
SELECT decodeURLComponent('http://127.0.0.1:8123/?query=SELECT%201%3B') AS DecodedURL;
```
```text
┌─DecodedURL─────────────────────────────┐
│ http://127.0.0.1:8123/?query=SELECT 1; │
└────────────────────────────────────────┘
View File
@ -18,13 +18,14 @@ The UUID type value.
This example demonstrates creating a table with the UUID type column and inserting a value into the table.
```sql
CREATE TABLE t_uuid (x UUID) ENGINE=TinyLog
INSERT INTO t_uuid SELECT generateUUIDv4()
SELECT * FROM t_uuid
```
```text
┌────────────────────────────────────x─┐
│ f4bf890f-f9dc-4332-ad5c-0c18e73f28e9 │
└──────────────────────────────────────┘
@ -44,9 +45,10 @@ The UUID type value.
**Usage example**
```sql
SELECT toUUID('61f0c404-5cb3-11e7-907b-a6006ad3dba0') AS uuid
```
```text
┌─────────────────────────────────uuid─┐
│ 61f0c404-5cb3-11e7-907b-a6006ad3dba0 │
└──────────────────────────────────────┘
@ -56,7 +58,7 @@ The UUID type value.
Accepts a string containing 36 characters in the format `xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx`, and returns it as a set of bytes in a [FixedString(16)](../../data_types/fixedstring.md).
```sql
UUIDStringToNum(String)
```
@ -66,10 +68,12 @@ FixedString(16)
**Usage examples**
```sql
SELECT
'612f3c40-5d3b-217e-707b-6a546a3d7b29' AS uuid,
UUIDStringToNum(uuid) AS bytes
```
```text
┌─uuid─────────────────────────────────┬─bytes────────────┐
│ 612f3c40-5d3b-217e-707b-6a546a3d7b29 │ a/<@];!~p{jTj={) │
@ -80,7 +84,7 @@ FixedString(16)
Accepts a [FixedString(16)](../../data_types/fixedstring.md) value, and returns a string containing 36 characters in text format.
```sql
UUIDNumToString(FixedString(16))
```
@ -90,11 +94,12 @@ String.
**Usage example**
```sql
SELECT
'a/<@];!~p{jTj={)' AS bytes,
UUIDNumToString(toFixedString(bytes, 16)) AS uuid
```
```text
┌─bytes────────────┬─uuid─────────────────────────────────┐
│ a/<@];!~p{jTj={) │ 612f3c40-5d3b-217e-707b-6a546a3d7b29 │
└──────────────────┴──────────────────────────────────────┘
View File
@ -20,7 +20,7 @@ All the dictionaries are re-loaded in runtime (once every certain number of seco
All functions for working with regions have an optional argument at the end: the dictionary key. It is referred to as the geobase.
Example:
```sql
regionToCountry(RegionID) -- Uses the default dictionary: /opt/geo/regions_hierarchy.txt
regionToCountry(RegionID, '') -- Uses the default dictionary: /opt/geo/regions_hierarchy.txt
regionToCountry(RegionID, 'ua') -- Uses the dictionary for the 'ua' key: /opt/geo/regions_hierarchy_ua.txt
@ -34,13 +34,13 @@ Accepts a UInt32 number, the region ID from the Yandex geobase. If this regio
Converts a region to an area (type 5 in the geobase). In every other way, this function is the same as 'regionToCity'.
```sql
SELECT DISTINCT regionToName(regionToArea(toUInt32(number), 'ua'))
FROM system.numbers
LIMIT 15
```
```text
┌─regionToName(regionToArea(toUInt32(number), \'ua\'))─┐
│ │
│ Moscow and Moscow region │
@ -64,13 +64,13 @@ LIMIT 15
Converts a region to a federal district (type 4 in the geobase). In every other way, this function is the same as 'regionToCity'.
```sql
SELECT DISTINCT regionToName(regionToDistrict(toUInt32(number), 'ua'))
FROM system.numbers
LIMIT 15
```
```text
┌─regionToName(regionToDistrict(toUInt32(number), \'ua\'))─┐
│ │
│ Central federal district │
View File
@ -5,7 +5,7 @@ Adding data.
Basic query format:
```sql
INSERT INTO [db.]table [(c1, c2, c3)] VALUES (v11, v12, v13), (v21, v22, v23), ...
```
@ -18,13 +18,13 @@ If [strict_insert_defaults=1](../operations/settings/settings.md), columns that
Data can be passed to the INSERT in any [format](../interfaces/formats.md#formats) supported by ClickHouse. The format must be specified explicitly in the query:
```sql
INSERT INTO [db.]table [(c1, c2, c3)] FORMAT format_name data_set
```
For example, the following query format is identical to the basic version of INSERT ... VALUES:
```sql
INSERT INTO [db.]table [(c1, c2, c3)] FORMAT Values (v11, v12, v13), (v21, v22, v23), ...
```
@ -32,7 +32,7 @@ ClickHouse removes all spaces and one line feed (if there is one) before the dat
Example:
```sql
INSERT INTO t FORMAT TabSeparated
11 Hello, world!
22 Qwerty
@ -46,7 +46,7 @@ If table has [constraints](create.md#constraints), their expressions will be che
### Inserting The Results of `SELECT` {#insert_query_insert-select}
```sql
INSERT INTO [db.]table [(c1, c2, c3)] SELECT ...
```
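For illustration, a hedged sketch; the `hits` and `hits_daily` table names are assumptions, not part of the original:

```sql
-- aggregate raw events into a hypothetical daily rollup table
INSERT INTO hits_daily (day, views)
SELECT toDate(EventTime) AS day, count() AS views
FROM hits
GROUP BY day
```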
View File
@ -10,7 +10,7 @@ After executing an ATTACH query, the server will know about the existence of the
If the table was previously detached (`DETACH`), meaning that its structure is known, you can use shorthand without defining the structure.
```sql
ATTACH TABLE [IF NOT EXISTS] [db.]name [ON CLUSTER cluster]
```
@ -20,7 +20,7 @@ This query is used when starting the server. The server stores table metadata as
Checks if the data in the table is corrupted.
```sql
CHECK TABLE [db.]name
```
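A minimal usage sketch (the table name is hypothetical):

```sql
CHECK TABLE my_log_table
-- returns a single row with 1 if the data is intact, 0 if it is corrupted
```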
@ -56,7 +56,7 @@ If the table is corrupted, you can copy the non-corrupted data to another table.
## DESCRIBE TABLE {#misc-describe-table}
```sql
DESC|DESCRIBE TABLE [db.]table [INTO OUTFILE filename] [FORMAT format]
```
@ -74,7 +74,7 @@ Nested data structures are output in "expanded" format. Each column is shown sep
Deletes information about the 'name' table from the server. After this query, the server no longer knows about the table's existence.
```sql
DETACH TABLE [IF EXISTS] [db.]name [ON CLUSTER cluster]
```
@ -87,14 +87,14 @@ There is no `DETACH DATABASE` query.
This query has two types: `DROP DATABASE` and `DROP TABLE`.
```sql
DROP DATABASE [IF EXISTS] db [ON CLUSTER cluster]
```
Deletes all tables inside the 'db' database, then deletes the 'db' database itself.
If `IF EXISTS` is specified, it doesn't return an error if the database doesn't exist.
```sql
DROP [TEMPORARY] TABLE [IF EXISTS] [db.]name [ON CLUSTER cluster]
```
@ -103,7 +103,7 @@ If `IF EXISTS` is specified, it doesn't return an error if the table doesn't exi
## EXISTS
```sql
EXISTS [TEMPORARY] TABLE [db.]name [INTO OUTFILE filename] [FORMAT format]
```
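A hedged usage sketch (the table name is hypothetical):

```sql
EXISTS TABLE db.my_table
-- returns 0 if the table does not exist, 1 if it does
```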
@ -111,7 +111,7 @@ Returns a single `UInt8`-type column, which contains the single value `0` if the
## KILL QUERY
```sql
KILL QUERY [ON CLUSTER cluster]
WHERE <where expression to SELECT FROM system.processes query>
[SYNC|ASYNC|TEST]
@ -123,7 +123,7 @@ The queries to terminate are selected from the system.processes table using the
Examples:
```sql
-- Forcibly terminates all queries with the specified query_id:
KILL QUERY WHERE query_id='2-857d-4a57-9ee0-327da5d60a90'
@ -173,7 +173,7 @@ Changes already made by the mutation are not rolled back.
## OPTIMIZE {#misc_operations-optimize}
```sql
OPTIMIZE TABLE [db.]name [ON CLUSTER cluster] [PARTITION partition] [FINAL]
```
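For instance, a minimal sketch (the table name is hypothetical); `FINAL` forces a merge even when all the data is already in one part:

```sql
OPTIMIZE TABLE visits_v1 FINAL
```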
@ -192,7 +192,7 @@ When `OPTIMIZE` is used with [ReplicatedMergeTree](../operations/table_engines/r
Renames one or more tables.
```sql
RENAME TABLE [db11.]name11 TO [db12.]name12, [db21.]name21 TO [db22.]name22, ... [ON CLUSTER cluster]
```
@ -216,7 +216,7 @@ For more information, see [Settings](../operations/settings/settings.md).
## SHOW CREATE TABLE
```sql
SHOW CREATE [TEMPORARY] TABLE [db.]table [INTO OUTFILE filename] [FORMAT format]
```
@ -224,7 +224,7 @@ Returns a single `String`-type 'statement' column, which contains a single value
## SHOW DATABASES {#show-databases}
```sql
SHOW DATABASES [INTO OUTFILE filename] [FORMAT format]
```
@ -235,7 +235,7 @@ See also the section "Formats".
## SHOW PROCESSLIST
```sql
SHOW PROCESSLIST [INTO OUTFILE filename] [FORMAT format]
```
@ -262,12 +262,12 @@ This query is nearly identical to: `SELECT * FROM system.processes`. The differe
Tip (execute in the console):
```bash
$ watch -n1 "clickhouse-client --query='SHOW PROCESSLIST'"
```
## SHOW TABLES
```sql
SHOW [TEMPORARY] TABLES [FROM db] [LIKE 'pattern'] [INTO OUTFILE filename] [FORMAT format]
```
@ -282,7 +282,7 @@ See also the section "LIKE operator".
## TRUNCATE
```sql
TRUNCATE TABLE [IF EXISTS] [db.]name [ON CLUSTER cluster]
```
@ -292,7 +292,7 @@ The `TRUNCATE` query is not supported for [View](../operations/table_engines/vie
## USE
```sql
USE db
```
View File
@ -67,7 +67,7 @@ Groups of operators are listed in order of priority (the higher it is in the lis
## Operator for Working With Dates and Times {#operators-datetime}
```sql
EXTRACT(part FROM date);
```
@ -88,7 +88,7 @@ The `date` parameter specifies the date or the time to process. Either [Date](..
Examples:
```sql
SELECT EXTRACT(DAY FROM toDate('2017-06-15'));
SELECT EXTRACT(MONTH FROM toDate('2017-06-15'));
SELECT EXTRACT(YEAR FROM toDate('2017-06-15'));
@ -96,7 +96,7 @@ SELECT EXTRACT(YEAR FROM toDate('2017-06-15'));
In the following example, we create a table and insert a value of the `DateTime` type into it.
```sql
CREATE TABLE test.Orders
(
OrderId UInt64,
@ -106,10 +106,10 @@ CREATE TABLE test.Orders
ENGINE = Log;
```
```sql
INSERT INTO test.Orders VALUES (1, 'Jarlsberg Cheese', toDateTime('2008-10-11 13:23:44'));
```
```sql
SELECT
toYear(OrderDate) AS OrderYear,
toMonth(OrderDate) AS OrderMonth,
@ -118,6 +118,8 @@ SELECT
toMinute(OrderDate) AS OrderMinute,
toSecond(OrderDate) AS OrderSecond
FROM test.Orders;
```
```text
┌─OrderYear─┬─OrderMonth─┬─OrderDay─┬─OrderHour─┬─OrderMinute─┬─OrderSecond─┐
│ 2008 │ 10 │ 11 │ 13 │ 23 │ 44 │
@ -148,7 +150,7 @@ The conditional operator calculates the values of b and c, then checks whether c
## Conditional Expression {#operator_case}
``` sql
CASE [x]
WHEN a THEN b
[WHEN ... THEN ...]
@ -198,18 +200,13 @@ ClickHouse supports the `IS NULL` and `IS NOT NULL` operators.
- `0` otherwise.
- For other values, the `IS NULL` operator always returns `0`.
```sql
SELECT x+100 FROM t_null WHERE y IS NULL
```
```text
┌─plus(x, 100)─┐
│ 101 │
└──────────────┘
```
@ -220,18 +217,13 @@ WHERE isNull(y)
- `1` otherwise.
- For other values, the `IS NOT NULL` operator always returns `1`.
```sql
SELECT * FROM t_null WHERE y IS NOT NULL
```
```text
┌─x─┬─y─┐
│ 2 │ 3 │
└───┴───┘
```
[Original article](https://clickhouse.yandex/docs/en/query_language/operators/) <!--hide-->
View File
@ -2,7 +2,7 @@
`SELECT` performs data retrieval.
```sql
[WITH expr_list|(subquery)]
SELECT [DISTINCT] expr_list
[FROM [db.]table | (subquery) | table_function] [FINAL]
@ -35,7 +35,7 @@ This section provides support for Common Table Expressions ([CTE](https://en.wik
Results of WITH clause expressions can be used inside the SELECT clause.
Example 1: Using a constant expression as a "variable"
```sql
WITH '2019-08-01 15:23:00' as ts_upper_bound
SELECT *
FROM hits
@ -45,7 +45,7 @@ WHERE
```
Example 2: Evicting the sum(bytes) expression result from the SELECT clause column list
```sql
WITH sum(bytes) as s
SELECT
formatReadableSize(s),
@ -56,7 +56,7 @@ ORDER BY s
```
Example 3: Using results of scalar subquery
```sql
/* this example would return TOP 10 of the biggest tables */
WITH
(
@ -75,7 +75,7 @@ LIMIT 10
Example 4: Re-using expression in subquery
As a workaround for the current limitation on expression usage in subqueries, you may duplicate it.
```sql
WITH ['hello'] AS hello
SELECT
hello,
@ -85,7 +85,8 @@ FROM
WITH ['hello'] AS hello
SELECT hello
)
```
```text
┌─hello─────┬─hello─────┐
│ ['hello'] │ ['hello'] │
└───────────┴───────────┘
@ -149,7 +150,7 @@ Here `k` is the number from 0 to 1 (both fractional and decimal notations are su
In a `SAMPLE k` clause, the sample is taken from the `k` fraction of data. The example is shown below:
```sql
SELECT
Title,
count() * 10 AS PageViews
@ -177,27 +178,27 @@ The `_sample_factor` column contains relative coefficients that are calculated d
Let's consider the table `visits`, which contains the statistics about site visits. The first example shows how to calculate the number of page views:
```sql
SELECT sum(PageViews * _sample_factor)
FROM visits
SAMPLE 10000000
```
The next example shows how to calculate the total number of visits:
```sql
SELECT sum(_sample_factor)
FROM visits
SAMPLE 10000000
```
The example below shows how to calculate the average session duration. Note that you don't need to use the relative coefficient to calculate the average values.
```sql
SELECT avg(Duration)
FROM visits
SAMPLE 10000000
```
#### SAMPLE k OFFSET m {#select-sample-offset}
@ -205,7 +206,7 @@ Here `k` and `m` are numbers from 0 to 1. Examples are shown below.
**Example 1**
```sql
SAMPLE 1/10
```
@ -215,7 +216,7 @@ In this example, the sample is 1/10th of all data:
**Example 2**
```sql
SAMPLE 1/10 OFFSET 1/2
```
@ -227,7 +228,7 @@ Here, a sample of 10% is taken from the second half of the data.
Allows executing `JOIN` with an array or nested data structure. The intent is similar to the [arrayJoin](functions/array_join.md#functions_arrayjoin) function, but its functionality is broader.
```sql
SELECT <expr_list>
FROM <left_subquery>
[LEFT] ARRAY JOIN <array>
@ -246,7 +247,7 @@ Supported types of `ARRAY JOIN` are listed below:
The examples below demonstrate the usage of the `ARRAY JOIN` and `LEFT ARRAY JOIN` clauses. Let's create a table with an [Array](../data_types/array.md) type column and insert values into it:
```sql
CREATE TABLE arrays_test
(
s String,
@ -256,7 +257,7 @@ CREATE TABLE arrays_test
INSERT INTO arrays_test
VALUES ('Hello', [1,2]), ('World', [3,4,5]), ('Goodbye', []);
```
```text
┌─s───────────┬─arr─────┐
│ Hello │ [1,2] │
│ World │ [3,4,5] │
@ -266,12 +267,12 @@ VALUES ('Hello', [1,2]), ('World', [3,4,5]), ('Goodbye', []);
The example below uses the `ARRAY JOIN` clause:
```sql
SELECT s, arr
FROM arrays_test
ARRAY JOIN arr;
```
```text
┌─s─────┬─arr─┐
│ Hello │ 1 │
│ Hello │ 2 │
@ -283,12 +284,12 @@ ARRAY JOIN arr;
The next example uses the `LEFT ARRAY JOIN` clause:
```sql
SELECT s, arr
FROM arrays_test
LEFT ARRAY JOIN arr;
```
```text
┌─s───────────┬─arr─┐
│ Hello │ 1 │
│ Hello │ 2 │
@ -303,13 +304,13 @@ LEFT ARRAY JOIN arr;
An alias can be specified for an array in the `ARRAY JOIN` clause. In this case, an array item can be accessed by this alias, but the array itself is accessed by the original name. Example:
```sql
SELECT s, arr, a
FROM arrays_test
ARRAY JOIN arr AS a;
```
```text
┌─s─────┬─arr─────┬─a─┐
│ Hello │ [1,2] │ 1 │
│ Hello │ [1,2] │ 2 │
@ -321,13 +322,13 @@ ARRAY JOIN arr AS a;
Using aliases, you can perform `ARRAY JOIN` with an external array. For example:
```sql
SELECT s, arr_external
FROM arrays_test
ARRAY JOIN [1, 2, 3] AS arr_external;
```
```text
┌─s───────────┬─arr_external─┐
│ Hello │ 1 │
│ Hello │ 2 │
@ -343,13 +344,13 @@ ARRAY JOIN [1, 2, 3] AS arr_external;
Multiple arrays can be comma-separated in the `ARRAY JOIN` clause. In this case, `JOIN` is performed with them simultaneously (the direct sum, not the cartesian product). Note that all the arrays must have the same size. Example:
``` sql
SELECT s, arr, a, num, mapped
FROM arrays_test
ARRAY JOIN arr AS a, arrayEnumerate(arr) AS num, arrayMap(x -> x + 1, arr) AS mapped;
```
```
┌─s─────┬─arr─────┬─a─┬─num─┬─mapped─┐
│ Hello │ [1,2] │ 1 │ 1 │ 2 │
│ Hello │ [1,2] │ 2 │ 2 │ 3 │
@ -361,13 +362,13 @@ ARRAY JOIN arr AS a, arrayEnumerate(arr) AS num, arrayMap(x -> x + 1, arr) AS ma
The example below uses the [arrayEnumerate](functions/array_functions.md#array_functions-arrayenumerate) function:
``` sql
SELECT s, arr, a, num, arrayEnumerate(arr)
FROM arrays_test
ARRAY JOIN arr AS a, arrayEnumerate(arr) AS num;
```
```
┌─s─────┬─arr─────┬─a─┬─num─┬─arrayEnumerate(arr)─┐
│ Hello │ [1,2] │ 1 │ 1 │ [1,2] │
│ Hello │ [1,2] │ 2 │ 2 │ [1,2] │
@ -381,7 +382,7 @@ ARRAY JOIN arr AS a, arrayEnumerate(arr) AS num;
`ARRAY `JOIN`` also works with [nested data structures](../data_types/nested_data_structures/nested.md). Example:
``` sql
```sql
CREATE TABLE nested_test
(
s String,
@ -394,7 +395,7 @@ INSERT INTO nested_test
VALUES ('Hello', [1,2], [10,20]), ('World', [3,4,5], [30,40,50]), ('Goodbye', [], []);
```
```
┌─s───────┬─nest.x──┬─nest.y─────┐
│ Hello │ [1,2] │ [10,20] │
│ World │ [3,4,5] │ [30,40,50] │
@ -402,13 +403,13 @@ VALUES ('Hello', [1,2], [10,20]), ('World', [3,4,5], [30,40,50]), ('Goodbye', []
└─────────┴─────────┴────────────┘
```
``` sql
SELECT s, `nest.x`, `nest.y`
FROM nested_test
ARRAY JOIN nest;
```
```
┌─s─────┬─nest.x─┬─nest.y─┐
│ Hello │ 1 │ 10 │
│ Hello │ 2 │ 20 │
@ -420,13 +421,13 @@ ARRAY JOIN nest;
When specifying names of nested data structures in `ARRAY JOIN`, the meaning is the same as `ARRAY JOIN` with all the array elements that it consists of. Examples are listed below:
``` sql
SELECT s, `nest.x`, `nest.y`
FROM nested_test
ARRAY JOIN `nest.x`, `nest.y`;
```
```
┌─s─────┬─nest.x─┬─nest.y─┐
│ Hello │ 1 │ 10 │
│ Hello │ 2 │ 20 │
@ -438,13 +439,13 @@ ARRAY JOIN `nest.x`, `nest.y`;
This variation also makes sense:
``` sql
SELECT s, `nest.x`, `nest.y`
FROM nested_test
ARRAY JOIN `nest.x`;
```
```
┌─s─────┬─nest.x─┬─nest.y─────┐
│ Hello │ 1 │ [10,20] │
│ Hello │ 2 │ [10,20] │
@ -456,13 +457,13 @@ ARRAY JOIN `nest.x`;
An alias may be used for a nested data structure, in order to select either the `JOIN` result or the source array. Example:
``` sql
SELECT s, `n.x`, `n.y`, `nest.x`, `nest.y`
FROM nested_test
ARRAY JOIN nest AS n;
```
```
┌─s─────┬─n.x─┬─n.y─┬─nest.x──┬─nest.y─────┐
│ Hello │ 1 │ 10 │ [1,2] │ [10,20] │
│ Hello │ 2 │ 20 │ [1,2] │ [10,20] │
@ -474,13 +475,13 @@ ARRAY JOIN nest AS n;
Example of using the [arrayEnumerate](functions/array_functions.md#array_functions-arrayenumerate) function:
``` sql
SELECT s, `n.x`, `n.y`, `nest.x`, `nest.y`, num
FROM nested_test
ARRAY JOIN nest AS n, arrayEnumerate(`nest.x`) AS num;
```
```
┌─s─────┬─n.x─┬─n.y─┬─nest.x──┬─nest.y─────┬─num─┐
│ Hello │ 1 │ 10 │ [1,2] │ [10,20] │ 1 │
│ Hello │ 2 │ 20 │ [1,2] │ [10,20] │ 2 │
@ -497,7 +498,7 @@ Joins the data in the normal [SQL JOIN](https://en.wikipedia.org/wiki/Join_(SQL)
!!! info "Note"
Not related to [ARRAY JOIN](#select-array-join-clause).
``` sql
SELECT <expr_list>
FROM <left_subquery>
[GLOBAL] [ANY|ALL] [INNER|LEFT|RIGHT|FULL|CROSS] [OUTER] JOIN <right_subquery>
@ -524,13 +525,13 @@ If a query contains the `WHERE` clause, ClickHouse tries to pushdown filters fro
We recommend the `JOIN ON` or `JOIN USING` syntax for creating queries. For example:
```
SELECT * FROM t1 JOIN t2 ON t1.a = t2.a JOIN t3 ON t1.a = t3.a
```
You can use comma-separated lists of tables in the `FROM` clause. This works only with the [allow_experimental_cross_to_join_conversion = 1](../operations/settings/settings.md#settings-allow_experimental_cross_to_join_conversion) setting. For example:
```
SELECT * FROM t1, t2, t3 WHERE t1.a = t2.a AND t1.a = t3.a
```
@ -576,7 +577,7 @@ You can use the following types of syntax:
For example, consider the following tables:
```
table_1 table_2
event | ev_time | user_id event | ev_time | user_id
@ -610,7 +611,7 @@ When running a `JOIN`, there is no optimization of the order of execution in rel
Example:
``` sql
SELECT
CounterID,
hits,
@ -634,7 +635,7 @@ ORDER BY hits DESC
LIMIT 10
```
```
┌─CounterID─┬───hits─┬─visits─┐
│ 1143050 │ 523264 │ 13665 │
│ 731962 │ 475698 │ 102716 │
@ -724,7 +725,7 @@ If a query contains only table columns inside aggregate functions, the GROUP BY
Example:
``` sql
SELECT
count(),
median(FetchTiming > 60 ? 60 : FetchTiming),
@ -738,7 +739,7 @@ As opposed to MySQL (and conforming to standard SQL), you can't get some value o
Example:
``` sql
SELECT
domainWithoutWWW(URL) AS domain,
count(),
@ -761,7 +762,7 @@ Here's an example to show what this means.
Assume you have this table:
```
┌─x─┬────y─┐
│ 1 │ 2 │
│ 2 │ ᴺᵁᴸᴸ │
@ -773,7 +774,7 @@ Assume you have this table:
The query `SELECT sum(x), y FROM t_null_big GROUP BY y` results in:
```
┌─sum(x)─┬────y─┐
│ 4 │ 2 │
│ 3 │ 3 │
@ -877,7 +878,7 @@ The `SELECT * FROM limit_by ORDER BY id, val LIMIT 2 OFFSET 1 BY id` query retur
The following query returns the top 5 referrers for each `domain, device_type` pair with a maximum of 100 rows in total (`LIMIT n BY + LIMIT`).
``` sql
SELECT
domainWithoutWWW(URL) AS domain,
domainWithoutWWW(REFERRER_URL) AS referrer,
@ -918,7 +919,7 @@ Example:
For the table
```
┌─x─┬────y─┐
│ 1 │ ᴺᵁᴸᴸ │
│ 2 │ 2 │
@ -935,7 +936,7 @@ For the table
Run the query `SELECT * FROM t_null_nan ORDER BY y NULLS FIRST` to get:
```
┌─x─┬────y─┐
│ 1 │ ᴺᵁᴸᴸ │
│ 7 │ ᴺᵁᴸᴸ │
@ -1031,7 +1032,7 @@ If there isn't an `ORDER BY` clause that explicitly sorts results, the result ma
You can use UNION ALL to combine any number of queries. Example:
``` sql
SELECT CounterID, 1 AS table, toInt64(count()) AS c
FROM test.hits
GROUP BY CounterID
@ -1078,7 +1079,7 @@ The left side of the operator is either a single column or a tuple.
Examples:
```sql
SELECT UserID IN (123, 456) FROM ...
SELECT (CounterID, UserID) IN ((34, 123), (101500, 456)) FROM ...
```
@ -1096,7 +1097,7 @@ If the right side of the operator is a table name that has the Set engine (a pre
The subquery may specify more than one column for filtering tuples.
Example:
```sql
SELECT (CounterID, UserID) IN (SELECT CounterID, UserID FROM ...) FROM ...
```
@ -1105,7 +1106,7 @@ The columns to the left and right of the IN operator should have the same type.
The IN operator and subquery may occur in any part of the query, including in aggregate functions and lambda functions.
Example:
```sql
SELECT
EventDate,
avg(UserID IN
@ -1119,7 +1120,7 @@ GROUP BY EventDate
ORDER BY EventDate ASC
```
```text
┌──EventDate─┬────ratio─┐
│ 2014-03-17 │ 1 │
│ 2014-03-18 │ 0.807696 │
@ -1140,7 +1141,7 @@ During request processing, the IN operator assumes that the result of an operati
Here is an example with the `t_null` table:
```text
┌─x─┬────y─┐
│ 1 │ ᴺᵁᴸᴸ │
│ 2 │ 3 │
@ -1149,7 +1150,7 @@ Here is an example with the `t_null` table:
Running the query `SELECT x FROM t_null WHERE y IN (NULL,3)` gives you the following result:
```text
┌─x─┐
│ 2 │
└───┘
@ -1157,10 +1158,11 @@ Running the query `SELECT x FROM t_null WHERE y IN (NULL,3)` gives you the follo
You can see that the row in which `y = NULL` is thrown out of the query results. This is because ClickHouse can't decide whether `NULL` is included in the `(NULL,3)` set, returns `0` as the result of the operation, and `SELECT` excludes this row from the final output.
```sql
SELECT y IN (NULL, 3)
FROM t_null
```
```text
┌─in(y, tuple(NULL, 3))─┐
│ 0 │
│ 1 │
@ -1189,13 +1191,13 @@ For a query to the **distributed_table**, the query will be sent to all the remo
For example, the query
```sql
SELECT uniq(UserID) FROM distributed_table
```
will be sent to all remote servers as
```sql
SELECT uniq(UserID) FROM local_table
```
@ -1203,7 +1205,7 @@ and run on each of them in parallel, until it reaches the stage where intermedia
Now let's examine a query with IN:
```sql
SELECT uniq(UserID) FROM distributed_table WHERE CounterID = 101500 AND UserID IN (SELECT UserID FROM local_table WHERE CounterID = 34)
```
@ -1211,7 +1213,7 @@ SELECT uniq(UserID) FROM distributed_table WHERE CounterID = 101500 AND UserID I
This query will be sent to all remote servers as
```sql
SELECT uniq(UserID) FROM local_table WHERE CounterID = 101500 AND UserID IN (SELECT UserID FROM local_table WHERE CounterID = 34)
```
@ -1221,19 +1223,19 @@ This will work correctly and optimally if you are prepared for this case and hav
To correct how the query works when data is spread randomly across the cluster servers, you could specify **distributed_table** inside a subquery. The query would look like this:
```sql
SELECT uniq(UserID) FROM distributed_table WHERE CounterID = 101500 AND UserID IN (SELECT UserID FROM distributed_table WHERE CounterID = 34)
```
This query will be sent to all remote servers as
```sql
SELECT uniq(UserID) FROM local_table WHERE CounterID = 101500 AND UserID IN (SELECT UserID FROM distributed_table WHERE CounterID = 34)
```
The subquery will begin running on each remote server. Since the subquery uses a distributed table, the subquery that is on each remote server will be resent to every remote server as
```sql
SELECT UserID FROM local_table WHERE CounterID = 34
```
@ -1241,19 +1243,19 @@ For example, if you have a cluster of 100 servers, executing the entire query wi
In such cases, you should always use GLOBAL IN instead of IN. Let's look at how it works for the query
```sql
SELECT uniq(UserID) FROM distributed_table WHERE CounterID = 101500 AND UserID GLOBAL IN (SELECT UserID FROM distributed_table WHERE CounterID = 34)
```
The requestor server will run the subquery
```sql
SELECT UserID FROM distributed_table WHERE CounterID = 34
```
and the result will be put in a temporary table in RAM. Then the request will be sent to each remote server as
```sql
SELECT uniq(UserID) FROM local_table WHERE CounterID = 101500 AND UserID GLOBAL IN _data1
```
View File
@ -4,7 +4,7 @@ There are two types of parsers in the system: the full SQL parser (a recursive d
In all cases except the `INSERT` query, only the full SQL parser is used.
The `INSERT` query uses both parsers:
```sql
INSERT INTO t VALUES (1, 'Hello, world'), (2, 'abc'), (3, 'def')
```
@ -112,7 +112,7 @@ Data types and table engines in the `CREATE` query are written the same way as i
An alias is a user-defined name for an expression in a query.
```sql
expr AS alias
```
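For example, a minimal sketch; note that in ClickHouse an alias may be reused later in the same expression list:

```sql
SELECT
    toDate('2019-09-23') AS release_day,
    toDayOfWeek(release_day) AS weekday
```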
@ -140,7 +140,7 @@ If an alias is defined for the result columns in the `SELECT` clause of a subque
Be careful with aliases that are the same as column or table names. Let's consider the following example:
```sql
CREATE TABLE t
(
a Int,
@ -149,12 +149,13 @@ CREATE TABLE t
ENGINE = TinyLog()
```
```sql
SELECT
argMax(a, b),
sum(b) AS b
FROM t
```
```text
Received exception from server (version 18.14.17):
Code: 184. DB::Exception: Received from localhost:9000, 127.0.0.1. DB::Exception: Aggregate function sum(b) is found inside another aggregate function in query.
```
View File
@ -3,7 +3,7 @@
Creates a table from a file.
```sql
file(path, format, structure)
```
@ -39,14 +39,14 @@ FROM file('test.csv', 'CSV', 'column1 UInt32, column2 UInt32, column3 UInt32')
LIMIT 2
```
```text
┌─column1─┬─column2─┬─column3─┐
│ 1 │ 2 │ 3 │
│ 3 │ 2 │ 1 │
└─────────┴─────────┴─────────┘
```
```sql
-- getting the first 10 lines of a table that contains 3 columns of UInt32 type from a CSV file
SELECT * FROM file('test.csv', 'CSV', 'column1 UInt32, column2 UInt32, column3 UInt32') LIMIT 10
```
View File
@ -3,7 +3,7 @@
Creates a table from a file in HDFS.
```sql
hdfs(URI, format, structure)
```
@ -27,7 +27,7 @@ FROM hdfs('hdfs://hdfs1:9000/test', 'TSV', 'column1 UInt32, column2 UInt32, colu
LIMIT 2
```
```text
┌─column1─┬─column2─┬─column3─┐
│ 1 │ 2 │ 3 │
│ 3 │ 2 │ 1 │
View File
@ -23,13 +23,13 @@ and data in `data.csv` has a different structure `(col1 String, col2 Date, col3
data from the `data.csv` into the `test` table with simultaneous conversion looks like this:
```bash
$ cat data.csv | clickhouse-client --query="INSERT INTO test SELECT lower(col1), col3 * col3 FROM input('col1 String, col2 Date, col3 Int32') FORMAT CSV";
```
- If `data.csv` contains data of the same structure `test_structure` as the table `test` then these two queries are equal:
```bash
$ cat data.csv | clickhouse-client --query="INSERT INTO test FORMAT CSV"
$ cat data.csv | clickhouse-client --query="INSERT INTO test SELECT * FROM input('test_structure') FORMAT CSV"
```
[Original article](https://clickhouse.yandex/docs/en/query_language/table_functions/input/) <!--hide-->
View File
@ -8,15 +8,15 @@ It supports Nullable types (based on DDL of remote table that is queried).
**Examples**
```sql
SELECT * FROM jdbc('jdbc:mysql://localhost:3306/?user=root&password=root', 'schema', 'table')
```
```sql
SELECT * FROM jdbc('mysql://localhost:3306/?user=root&password=root', 'schema', 'table')
```
```sql
SELECT * FROM jdbc('datasource://mysql-local', 'schema', 'table')
```
Some files were not shown because too many files have changed in this diff.